Type-Driven Lambda Extractors

The problem

We run many backend workloads on AWS Lambda, including handlers for HTTP requests (via API Gateway), SQS events, and EventBridge events. In early iterations of these services, most handlers followed the same structure: each began by parsing a JSON payload into a domain type and pulling headers, query parameters, path parameters, and so on out of the event as needed. This led to a lot of duplicated, imperative code, because every lambda started with the same steps and each author re-implemented them by hand. A lambda processing an HTTP request might look something like this:

async fn handler(event: Request) -> Result<Response<Body>> {
  let user: User = serde_json::from_slice::<User>(event.body().as_ref())?;
  let path_params = &event.path_parameters;
  let query_string_params = &event.query_string_parameters;
  let headers = &event.headers;

  /* ... */
}

Handling an SQS event looked like this:

async fn handler(event: SqsEvent) -> Result<Response<Body>> {
  let user: User = convert_full_sqs_event_to::<User>(event.payload.clone());

  /* ... */
}

Because this boilerplate was repeated imperatively in every handler, we set ourselves two requirements. First, every handler should explicitly declare what it needs in its type signature, which would let us centralise parsing and extraction logic rather than re-implement it per handler. Second, the approach should be extensible, so that new cross-cutting concerns can be introduced without invasive changes to existing code.

Design overview

The design is inspired by two existing models in Rust. The first is Bevy’s system execution model, in which function signatures describe the data a system requires and the runtime is responsible for supplying it. The second is Axum’s extractor pattern, where handler parameters are derived from incoming requests through trait-based extraction.

The solution we went with was to create a trait, HandlerParam, implemented by types that each represent one part of an event a handler might be interested in: WithHttpPayload<T>, WithSqsPayload<T>, WithHeader, and WithPathParam, to name a few.

// The trait
pub trait HandlerParam<E> {
  fn fetch(ctx: &Context<E>) -> Self
  where
    Self: Sized;
}

// A type representing an SQS event payload
pub struct WithSqsPayload<T>(pub Vec<crate::io::Result<T>>);

In the HandlerParam<E> trait, the type parameter E is the event type (HTTP event, SQS event, and so on). All an implementing type has to do is define fetch, which describes how to pull its data out of the event and what to do with it. For example, to parse an API Gateway event body into a payload, we created the WithHttpPayload<T> type and implemented fetch as follows:

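// `Result` is the crate's result alias; its error type is assumed to
// implement From<serde_json::Error>.
pub struct WithHttpPayload<T>(pub Result<T>);

impl<T: DeserializeOwned> HandlerParam<Request> for WithHttpPayload<T> {
  fn fetch(ctx: &Context<Request>) -> Self {
    // Parse the raw request body as JSON, converting the serde error.
    Self(serde_json::from_slice::<T>(ctx.event.body().as_ref()).map_err(Into::into))
  }
}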

This fetch implementation knows which part of the event to read and parses it with serde. All the user has to do is supply the target type T. Because the extractor wraps a Result, the handler receives the outcome of the parse and decides for itself how to treat a failure. For example, to parse the body of an HTTP request into a User, the following signature is all that is needed:

async fn handler(WithHttpPayload(maybe_user): WithHttpPayload<User>) -> Result<Response<Body>>

And if the handler needs the payload, path parameters, and headers, all we need to write is:

async fn handler(
  (WithHttpPayload(maybe_user), WithPath(path), WithHeaders(headers)): (
    WithHttpPayload<User>,
    WithPath,
    WithHeaders
  )
) -> Result<Response<Body>> {
  /* ... */
}

This is declarative and extensible, and it reduces the mental load on the application developer.
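To make the extensibility point concrete, here is a minimal sketch of adding a new extractor. It is not taken from our codebase: MockHttpEvent, the shape of Context, WithRequestId, and the x-request-id header name are all assumptions chosen so that the example compiles with the standard library alone.

```rust
use std::collections::HashMap;

// Hypothetical stand-in for an HTTP event; a real handler would use the
// event type supplied by the Lambda runtime crate.
pub struct MockHttpEvent {
    pub headers: HashMap<String, String>,
}

// Assumed shape of Context: a thin wrapper around the incoming event.
pub struct Context<E> {
    pub event: E,
}

// Same trait as in the article.
pub trait HandlerParam<E> {
    fn fetch(ctx: &Context<E>) -> Self
    where
        Self: Sized;
}

// Hypothetical new extractor: the request ID header, if the caller sent one.
pub struct WithRequestId(pub Option<String>);

impl HandlerParam<MockHttpEvent> for WithRequestId {
    fn fetch(ctx: &Context<MockHttpEvent>) -> Self {
        // Only this impl knows where the value lives in the event.
        Self(ctx.event.headers.get("x-request-id").cloned())
    }
}
```

Adding the extractor touches no existing handler or extractor; a handler opts in simply by naming WithRequestId in its signature.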

How it works

Our codebase currently contains 10 types implementing HandlerParam. With these in place, we can write a function that runs the fetches, which we called run_handler. It looks at the parameter type of the handler it is given, calls fetch to build a value of that type from the event, and passes that value to the handler.

use std::future::Future;

pub async fn run_handler<E, P, F, Fut, T>(event: E, handler: F) -> crate::io::Result<T>
where
  P: HandlerParam<E>,
  F: Fn(P) -> Fut,
  Fut: Future<Output = crate::io::Result<T>>,
{
  handler(P::fetch(&Context { event })).await
}

Consider again the earlier handler:

async fn handler(WithHttpPayload(maybe_user): WithHttpPayload<User>) -> Result<Response<Body>>

run_handler takes an argument handler of type F, where F is a function that takes a P and returns a Future. P is bound by HandlerParam<E>, so any function passed to run_handler must take a parameter whose type implements HandlerParam, as our HTTP handler does with WithHttpPayload<User>. Because the compiler knows that P implements HandlerParam, the body of run_handler can call P::fetch: it wraps the event in a Context, calls fetch for whatever concrete type P is, and passes the result to the handler. This is what injects whatever we’ve specified in the type signature into our handler.
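The flow described above can be sketched end to end without any AWS dependencies. Everything here other than HandlerParam and run_handler is an assumption made for illustration: MockEvent, WithBody, the shout handler, the String error type standing in for crate::io::Result, and the tiny block_on executor, which a real Lambda would replace with the runtime's own executor.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context as TaskContext, Poll, Wake, Waker};

// Assumed shape of Context: a thin wrapper around the event.
pub struct Context<E> {
    pub event: E,
}

pub trait HandlerParam<E> {
    fn fetch(ctx: &Context<E>) -> Self
    where
        Self: Sized;
}

// Hypothetical event and extractor used only for this sketch.
pub struct MockEvent {
    pub body: String,
}

pub struct WithBody(pub String);

impl HandlerParam<MockEvent> for WithBody {
    fn fetch(ctx: &Context<MockEvent>) -> Self {
        Self(ctx.event.body.clone())
    }
}

// Same structure as the article's run_handler, with a plain String error.
pub async fn run_handler<E, P, F, Fut, T>(event: E, handler: F) -> Result<T, String>
where
    P: HandlerParam<E>,
    F: Fn(P) -> Fut,
    Fut: Future<Output = Result<T, String>>,
{
    handler(P::fetch(&Context { event })).await
}

// The handler only names what it needs; P is inferred as WithBody.
pub async fn shout(WithBody(body): WithBody) -> Result<String, String> {
    Ok(body.to_uppercase())
}

// Minimal busy-polling executor; adequate here because the future never
// waits on real I/O.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

pub fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = TaskContext::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}
```

Calling block_on(run_handler(MockEvent { body: "hello".to_string() }, shout)) drives the whole chain: the compiler infers P from shout's signature, run_handler calls WithBody::fetch, and the handler receives the extracted body.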

Tuples

One important constraint for us was being able to request several parameters at once, each derived independently from the same event, with the order of parameters not mattering. In the example above, where we ask for an HTTP payload, path params, and headers, the handler actually takes a single tuple of items rather than three separate arguments.

This works because P in run_handler need not be a single extractor type; it can also be a tuple of them. We implemented HandlerParam for tuples of two, three, and four elements:

impl<E, A: HandlerParam<E>, B: HandlerParam<E>> HandlerParam<E> for (A, B) {
  fn fetch(ctx: &Context<E>) -> Self {
    (A::fetch(ctx), B::fetch(ctx))
  }
}

impl<E, A: HandlerParam<E>, B: HandlerParam<E>, C: HandlerParam<E>> HandlerParam<E> for (A, B, C) {
  fn fetch(ctx: &Context<E>) -> Self {
    (A::fetch(ctx), B::fetch(ctx), C::fetch(ctx))
  }
}

impl<E, A: HandlerParam<E>, B: HandlerParam<E>, C: HandlerParam<E>, D: HandlerParam<E>>
  HandlerParam<E> for (A, B, C, D)
{
  fn fetch(ctx: &Context<E>) -> Self {
    (A::fetch(ctx), B::fetch(ctx), C::fetch(ctx), D::fetch(ctx))
  }
}

This is a small implementation detail with a significant effect on usability. By relying on tuple composition, handlers do not need to care about the order in which parameters are extracted or constructed. Each element of the tuple is resolved independently of the others, and the tuple simply groups the results. As the implementation shows, the two-element tuple of A and B simply calls A::fetch and B::fetch. Since each element handles its own requirement, a handler can list any combination of extractors in any order, so both of the following are valid:

async fn handler(
  (WithHttpPayload(maybe_user), WithPath(path), WithHeaders(headers)): (
    WithHttpPayload<User>,
    WithPath,
    WithHeaders
  )
) -> Result<Response<Body>> {
  /* ... */
}

async fn handler(
  (WithHeaders(headers), WithPath(path)): (
    WithHeaders,
    WithPath
  )
) -> Result<Response<Body>> {
  /* ... */
}
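The three tuple impls follow one pattern, so they could also be generated with a declarative macro. The sketch below shows one possible way to do that and checks that the order of tuple elements does not affect what each extractor returns. It is not how our implementation is written; the macro name, MockEvent, WithId, and WithName are hypothetical.

```rust
// Assumed shape of Context: a thin wrapper around the event.
pub struct Context<E> {
    pub event: E,
}

pub trait HandlerParam<E> {
    fn fetch(ctx: &Context<E>) -> Self
    where
        Self: Sized;
}

// Generates `impl HandlerParam<E> for (A, B, ...)` for the listed names.
macro_rules! impl_handler_param_for_tuple {
    ($($name:ident),+) => {
        impl<E, $($name: HandlerParam<E>),+> HandlerParam<E> for ($($name,)+) {
            fn fetch(ctx: &Context<E>) -> Self {
                // Each element is fetched independently from the same context.
                ($($name::fetch(ctx),)+)
            }
        }
    };
}

impl_handler_param_for_tuple!(A, B);
impl_handler_param_for_tuple!(A, B, C);
impl_handler_param_for_tuple!(A, B, C, D);

// Hypothetical event and extractors used only for this sketch.
pub struct MockEvent {
    pub id: u32,
    pub name: String,
}

pub struct WithId(pub u32);
pub struct WithName(pub String);

impl HandlerParam<MockEvent> for WithId {
    fn fetch(ctx: &Context<MockEvent>) -> Self {
        Self(ctx.event.id)
    }
}

impl HandlerParam<MockEvent> for WithName {
    fn fetch(ctx: &Context<MockEvent>) -> Self {
        Self(ctx.event.name.clone())
    }
}
```

Fetching (WithId, WithName) and (WithName, WithId) from the same context yields the same id and name in both cases, because neither extractor depends on the other. Supporting a larger arity becomes a one-line macro invocation.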

Conclusion

With this design we feel we have achieved our goals: type signatures declaratively state what each handler needs, handler functions are cleaner than before, and extraction lives behind a centralised, standardised, and extensible API. All of this took just 174 lines of Rust code.