Photo by Edgar Chaparro on Unsplash
Building Resilient Webhooks with AWS API Gateway direct SQS Integration
Webhooks are a great asynchronous way to listen for events from external systems over HTTP. You provide an endpoint, and the external system sends it a request (usually a POST) whenever an event of interest takes place.
External services like Stripe, Checkout.com, and similar platforms let you supply a webhook URL for specific events that are often crucial to your workflows, such as the capture of a payment. Typically, you acknowledge that an event was received and processed by returning a 2xx status code (≥ 200 and ≤ 299) to the external service. A common approach is to put a Lambda function behind the API Gateway path to handle these requests, but several pitfalls can arise depending on your requirements and implementation:
Throttling on Downstream: The downstream service (Lambda) can be overloaded during peak hours, with no way to exert back-pressure on incoming requests.
Trust Issues: The external system's retries cannot be fully trusted. For example, the external system may make a configuration change that breaks the signature/hash check, making all requests invalid and challenging to replay.
Lack of Retries: Once the webhook request has been acknowledged, there is no mechanism to retry it if an error occurs on our end.
Timeout Challenges: Event processing logic that runs longer than 29 seconds (the default integration timeout for API Gateway REST APIs) causes API Gateway to terminate the request.
In the past few years, AWS has released a steady stream of updates that improve service-to-service integration, making it possible to connect AWS services without an intermediary Lambda function. Among the integrations available for API Gateway, SQS is a particularly effective fit for the challenges listed above.
We start by creating the queue that API Gateway will send requests to, along with a dead-letter queue for messages that repeatedly fail processing:
Resources:
  WebhookQueue:
    Type: AWS::SQS::Queue
    Properties:
      VisibilityTimeout: 90
      RedrivePolicy:
        deadLetterTargetArn:
          Fn::GetAtt:
            - WebhookDLQ
            - Arn
        maxReceiveCount: 3
  WebhookDLQ:
    Type: AWS::SQS::Queue
In order to route requests on certain API Gateway endpoints to SQS, a few resources are required:
1. A role attached to the API Gateway method resource that gives it permission to send messages to the SQS queue.
2. The API Gateway resource (path). In this example it will be /webhook.
3. Finally, the API Gateway method for the path defined in step 2:
- Set IntegrationResponses to status code 200, which API Gateway returns after sending the message to the queue.
- Set the RequestParameters header Content-Type to application/x-www-form-urlencoded, because that is the content type accepted by the SQS service.
- In RequestTemplates, define a template that sends a message to the queue containing both the body and the headers, in the following structure: { body: { ... }, headers: { ... } }
API Gateway CloudFormation Template
Resources:
  WebhookAPIGatewayToSQSRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Statement:
          - Action:
              - sts:AssumeRole
            Effect: Allow
            Principal:
              Service:
                - apigateway.amazonaws.com
        Version: '2012-10-17'
      Policies:
        - PolicyDocument:
            Statement:
              - Action: sqs:SendMessage
                Effect: Allow
                Resource:
                  Fn::GetAtt:
                    - WebhookQueue
                    - Arn
              - Action:
                  - logs:CreateLogGroup
                  - logs:CreateLogStream
                  - logs:DescribeLogGroups
                  - logs:DescribeLogStreams
                  - logs:PutLogEvents
                  - logs:GetLogEvents
                  - logs:FilterLogEvents
                Effect: Allow
                Resource: '*' # Bad practice, make sure your security is tight
            Version: '2012-10-17'
          PolicyName: apig-sqs-send-msg-policy
      RoleName: webhook-apig-sqs-send-msg-role
  ApiGatewayResourceWebhook:
    Type: AWS::ApiGateway::Resource
    Properties:
      ParentId:
        Fn::GetAtt:
          - ApiGatewayRestApi # This is a CFN resource that serverless framework automatically creates
          - RootResourceId
      PathPart: webhook
      RestApiId:
        Ref: ApiGatewayRestApi # This is a CFN resource that serverless framework automatically creates
  ApiGatewayMethodWebhookPost:
    Type: AWS::ApiGateway::Method
    Properties:
      AuthorizationType: NONE
      HttpMethod: POST
      Integration:
        Type: AWS
        Credentials:
          Fn::GetAtt:
            - WebhookAPIGatewayToSQSRole
            - Arn
        IntegrationHttpMethod: POST
        IntegrationResponses:
          - StatusCode: '200'
        PassthroughBehavior: WHEN_NO_TEMPLATES
        RequestParameters:
          integration.request.header.Content-Type: '''application/x-www-form-urlencoded'''
        RequestTemplates:
          "application/json": "Action=SendMessage&MessageBody={
            \"body\": $input.json('$'),
            \"headers\": {
            #foreach($param in $input.params().header.keySet())
            \"$param\": \"$util.escapeJavaScript($input.params().header.get($param))\" #if($foreach.hasNext), #end
            #end
            }
            }"
        Uri:
          Fn::Join:
            - ''
            - - 'arn:aws:apigateway:'
              - Ref: AWS::Region
              - :sqs:path/
              - Ref: AWS::AccountId
              - /
              - Fn::GetAtt:
                  - WebhookQueue
                  - QueueName
      MethodResponses:
        - ResponseModels:
            application/json: Empty
          StatusCode: '200'
      ResourceId:
        Ref: ApiGatewayResourceWebhook
      RestApiId:
        Ref: ApiGatewayRestApi # This is a CFN resource that serverless framework automatically creates
When a request is sent to the /webhook endpoint, API Gateway uses the request template to extract both the body and the headers, combines them into a single object, and sends that object as a message to the queue. This setup gives us control over how many messages are processed at any given moment. By configuring a maximum concurrency on the Lambda event source mapping, we can apply back-pressure when there is a surge in incoming requests. Just as importantly, because every request is stored in the queue (and moved to the dead-letter queue after repeated failures), we can retry requests if anything goes wrong on our end.
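The maximum concurrency mentioned above is set on the event source mapping between the queue and the consumer function. The following is a minimal sketch rather than part of the original template: WebhookProcessorLambdaFunction is a hypothetical name for a consumer Lambda function defined elsewhere in your stack, and the BatchSize and MaximumConcurrency values are placeholders to tune for your workload. The consumer's execution role would also need permission to receive and delete messages from WebhookQueue.

```yaml
Resources:
  WebhookQueueEventSourceMapping:
    Type: AWS::Lambda::EventSourceMapping
    Properties:
      EventSourceArn:
        Fn::GetAtt:
          - WebhookQueue
          - Arn
      FunctionName:
        Ref: WebhookProcessorLambdaFunction # Hypothetical consumer function, not defined in this article
      BatchSize: 10 # Placeholder: number of messages handed to each invocation
      ScalingConfig:
        MaximumConcurrency: 5 # Placeholder: cap on concurrent invocations for this queue (allowed range is 2 to 1000)
```

With a cap like this in place, a burst of webhook calls accumulates in the queue instead of overwhelming the consumer. Messages that keep failing are moved to WebhookDLQ after three receives, as set by maxReceiveCount on the queue. AWS also recommends a queue visibility timeout of at least six times the function timeout, so the VisibilityTimeout of 90 used earlier pairs well with a function timeout of about 15 seconds.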
I’d like to conclude with a crucial point: because SQS can deliver the same message more than once and failed messages are retried, the downstream process must be idempotent. I’ve written a recent article that explores various ways to incorporate idempotency into your workflows. If you’re using Powertools for AWS Lambda, you’re in luck, as it offers a dedicated utility for this purpose. You can find more information about it in the documentation: https://docs.powertools.aws.dev/lambda/typescript/latest/utilities/idempotency/
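If you go with the Powertools idempotency utility and its DynamoDB persistence layer, you also need a table to hold the idempotency records. Below is a minimal sketch of such a table. WebhookIdempotencyTable is a hypothetical resource name, and the attribute names id and expiration are assumed to match the utility's documented defaults, so check them against the version you use.

```yaml
Resources:
  WebhookIdempotencyTable: # Hypothetical resource name, not part of the original template
    Type: AWS::DynamoDB::Table
    Properties:
      BillingMode: PAY_PER_REQUEST
      AttributeDefinitions:
        - AttributeName: id # Assumed default partition key used by the idempotency utility
          AttributeType: S
      KeySchema:
        - AttributeName: id
          KeyType: HASH
      TimeToLiveSpecification:
        AttributeName: expiration # Assumed default expiry attribute; lets DynamoDB purge old records
        Enabled: true
```

The consumer function's role would additionally need read and write access to this table (GetItem, PutItem, UpdateItem, and DeleteItem).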