AWS Inter-Service Communication: Simplifying Integration and Enhancing Efficiency

Service-to-service communication (also known as inter-service communication) refers to the exchange of data or messages between different services or components that collectively form a larger application or system (think of how different microservices communicate with each other). Some sort of contract needs to be established between services to keep their integration reliable and easy.

Historically at Calo, this was done by manually publishing packages (kept away from the source code) that held the types (TypeScript interfaces, enums, etc.) of the requests and responses of the exposed API endpoints. Unfortunately, this also exposed internal data types and naming standards, which leaked into other services and coupled them together. We identified several problems with this approach:

  • Documenting Changes Is an Afterthought: Teams introduced numerous breaking changes that were not properly communicated or documented.

  • High Maintenance Costs: Types were costly to maintain, especially when many APIs were introduced and changed, and they were often not updated promptly to match what was deployed.

  • Ownership Confusion: The consuming teams, rather than the team that owned the service, were creating the interfaces, despite the consumer not being the expert. This happened because service owners created APIs for their own use cases, and other teams later wanted to integrate with them.

A recurring issue is that teams do not document their APIs for consumption. If there is any documentation, it is usually outdated. Additionally, the publisher/owner of the service does not provide an easy way for consumers to integrate. The question that comes to mind is:

How can we establish a reliable and efficient contract for service-to-service communication at Calo to address the issues of undocumented changes, high maintenance costs, and ownership confusion, but also help in making teams as independent as possible from each other?

As the system and teams grew over time, it became quite hectic to enforce a standard for documentation and a process that teams could follow to streamline communication between their services without bumping into each other. To developers, the API-first approach felt a bit unnatural. They are used to coding their APIs and then "documenting" them, so it was quite a mind shift, even though they understood the benefits. The only way we were able to break through was to bring the documentation closer to what feels natural to developers, which is coding.

These issues can be mitigated by generating an SDK that is owned and published by the service that exposes its functionality externally. The SDK exposes functions that exactly match the requests the server has defined. It also provides types that are intended only for external services, so that internal types stay decoupled from other services.

An SDK alone does not solve the problems mentioned above. What comes before it matters as well: versioned documentation for the API endpoints that are exposed for external integration. To avoid outdated request/response interfaces and types, we want to bring them as close as possible to the source code (the actual endpoints), so that developers do not have to maintain two different sources.

From Code To OpenAPI

As a first step, the aim is to tackle the lack of API documentation. The industry standard for documenting APIs is the OpenAPI specification. Since it is a well-known standard, tooling around it is widely available, and that tooling can automate the solutions to the rest of the problems we mentioned earlier.

The universal language for describing a set of API endpoints is the OpenAPI Specification.

The OpenAPI Specification (OAS) defines a standard, language-agnostic interface to HTTP APIs which allows both humans and computers to discover and understand the capabilities of the service without access to source code, documentation, or through network traffic inspection. When properly defined, a consumer can understand and interact with the remote service with a minimal amount of implementation logic.
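To make the definition above more concrete, here is a small hand-written sketch of what an OpenAPI 3.0 document for a single endpoint can look like, written as a plain TypeScript object. It is not output from any generator; the path, operationId, and schema contents are assumptions chosen to resemble the user-creation example used later in this article. The hypothetical listOperations helper shows how a tool can discover the available operations from the document alone, without reading any service code.

```typescript
// Hand-written illustration of an OpenAPI 3.0 document; values are assumptions.
const openApiDocument = {
  openapi: '3.0.1',
  info: { title: 'User Service', version: '1.0.0' },
  paths: {
    '/v1/users': {
      post: {
        operationId: 'createUser',
        summary: 'Create a user',
        requestBody: {
          required: true,
          content: {
            'application/json': {
              schema: { $ref: '#/components/schemas/CreateUserRequest' }
            }
          }
        },
        responses: {
          '201': {
            description: 'The created user',
            content: {
              'application/json': {
                schema: { $ref: '#/components/schemas/CreateUserResponse' }
              }
            }
          }
        }
      }
    }
  },
  components: {
    schemas: {
      CreateUserRequest: {
        type: 'object',
        required: ['name', 'email'],
        properties: {
          name: { type: 'string' },
          email: { type: 'string', format: 'email' }
        }
      },
      CreateUserResponse: {
        type: 'object',
        required: ['name', 'email'],
        properties: {
          name: { type: 'string' },
          email: { type: 'string', format: 'email' }
        }
      }
    }
  }
};

type Operation = { operationId: string; summary?: string };
type PathItem = Record<string, Operation>;

// Hypothetical helper: lists every operation as "METHOD path -> operationId".
function listOperations(paths: Record<string, PathItem>): string[] {
  const lines: string[] = [];
  for (const [path, item] of Object.entries(paths)) {
    for (const [method, operation] of Object.entries(item)) {
      lines.push(`${method.toUpperCase()} ${path} -> ${operation.operationId}`);
    }
  }
  return lines;
}

console.log(listOperations(openApiDocument.paths));
// prints [ 'POST /v1/users -> createUser' ]
```

Code generators work from the same information: the operationId typically becomes a method name in the SDK, and the referenced schemas become the request and response types.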

Writing OpenAPI documentation feels unfamiliar to developers. We want to start the documentation process as quickly as possible and in a way that feels natural to them.

We introduced zod as the schema validation library. The beauty of it is that it is TypeScript-first. This made it easier for all the teams to move towards zod, since the existing TypeScript types that were defined for requests and responses could be expressed as zod objects. The second reason for choosing zod is that there is a well-maintained project, zod-to-openapi, that can convert zod schemas into an OpenAPI document.

It is worth mentioning that Powertools for AWS Lambda (TypeScript) recently added a parsing feature that makes use of zod, so having zod in our projects won't feel foreign, since we already use Powertools in our codebase.

💻
You can find a complete example with the OpenAPI spec & SDK outputs in the following repository https://github.com/teamcalo/sdk-generation-example

Defining the request & response schemas

Request and response validation was not covered in all of our endpoints across all of our projects, which was a critical gap. Making a schema validation library part of the documentation process forces validation to be implemented across all of the projects, which in turn improves the security of our endpoints. Defining the request and response of the API endpoints is the starting point of the API documentation.

Even though these endpoints are for internal use and not for public consumption, it is very important to validate the requests coming into the service, both to protect the service from malicious activity and to communicate to the requester what is expected. We start by defining the schemas:

// schemas.ts
import { z } from 'zod';
import { extendZodWithOpenApi } from '@asteasolutions/zod-to-openapi';

// Adds the .openapi() method to zod schemas
extendZodWithOpenApi(z);

// Exported so that the handler and the route definition can import them
export const RequestSchema = z.object({
  name: z.string(),
  phoneNumber: z.string(),
  email: z.string().email(),
  address: z.object({
    city: z.string(),
    street: z.string(),
    block: z.string()
  })
}).openapi('CreateUserRequest');

export const ResponseSchema = z.object({
  name: z.string(),
  phoneNumber: z.string(),
  email: z.string().email(),
  address: z.object({
    city: z.string(),
    street: z.string(),
    block: z.string()
  })
}).openapi('CreateUserResponse');

// Static TypeScript types derived from the schemas
export type CreateUserRequestSchema = z.infer<typeof RequestSchema>;
export type CreateUserResponseSchema = z.infer<typeof ResponseSchema>;

We then validate the request in the request handler using the schema that was defined:

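// handler.ts
// Assumes an API Gateway HTTP API (payload v2) event
import type { APIGatewayProxyEventV2 } from 'aws-lambda';

import { UserRepository } from './repository';
import { RequestSchema, type CreateUserResponseSchema } from './schemas';

const userRepository = new UserRepository();

export default async (event: APIGatewayProxyEventV2) => {
  try {
    // event.body arrives as a JSON string, so parse it before validating
    RequestSchema.parse(JSON.parse(event.body ?? '{}'));
  } catch (error) {
    // Reached for malformed JSON as well as for schema validation failures
    return {
      statusCode: 422,
      body: JSON.stringify(error)
    };
  }

  // User and makeCreateUserResponse are assumed to be defined elsewhere in the service
  const user: User = await userRepository.create(....);
  const response: CreateUserResponseSchema = makeCreateUserResponse(user);

  return {
    statusCode: 201,
    body: JSON.stringify(response)
  };
};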
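The handler above calls makeCreateUserResponse, which is not shown in this article. A minimal sketch of what such a mapper could look like follows. The internal User shape (including the passwordHash and createdAt fields) is purely hypothetical, and CreateUserResponse is written out by hand here to keep the example self-contained, whereas in the service it would be the type inferred from ResponseSchema. The point is that the mapper copies only the fields that belong to the external contract, so internal fields never leak to consumers.

```typescript
// Hypothetical internal model: contains fields other services should never see.
interface User {
  id: string;
  name: string;
  phoneNumber: string;
  email: string;
  address: { city: string; street: string; block: string };
  passwordHash: string; // internal only (assumption for illustration)
  createdAt: Date; // internal only (assumption for illustration)
}

// Mirrors the fields of ResponseSchema; written by hand for this sketch.
interface CreateUserResponse {
  name: string;
  phoneNumber: string;
  email: string;
  address: { city: string; street: string; block: string };
}

// Copies only the externally exposed fields into a fresh object.
function makeCreateUserResponse(user: User): CreateUserResponse {
  return {
    name: user.name,
    phoneNumber: user.phoneNumber,
    email: user.email,
    address: {
      city: user.address.city,
      street: user.address.street,
      block: user.address.block
    }
  };
}

const exampleUser: User = {
  id: 'user-1',
  name: 'Jane Doe',
  phoneNumber: '+97300000000',
  email: 'jane@example.com',
  address: { city: 'Manama', street: 'Road 1', block: '301' },
  passwordHash: 'not-a-real-hash',
  createdAt: new Date(0)
};

console.log(Object.keys(makeCreateUserResponse(exampleUser)));
// prints [ 'name', 'phoneNumber', 'email', 'address' ]
```

Mapping explicitly, instead of spreading the internal object, means that a field added to the internal model later does not silently become part of the public response.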

Defining the route

With the request and response defined for an endpoint, we're ready to define the actual route details. We're going to use zod-to-openapi, which extends zod with extra functionality, namely the .openapi() call we saw on the request and response definitions in the last example. It lets us assign the schema a name that differs from the one used in our code, as in z.object({...}).openapi('CreateUserResponse'), and it also generates the OpenAPI schema from our zod definitions.

// route.ts
import type { OpenAPIDefinitions } from '@asteasolutions/zod-to-openapi/dist/openapi-registry';

import { RequestSchema, ResponseSchema } from './schemas';

export const route: OpenAPIDefinitions = {
  type: 'route',
  route: {
    method: 'post',
    path: '/v1/users',
    // Typically used by code generators as the method name in the SDK
    operationId: 'createUser',
    summary: 'Create a user',
    request: {
      body: {
        content: {
          'application/json': {
            schema: RequestSchema
          }
        }
      }
    },
    responses: {
      // Matches the 201 status code returned by the handler
      '201': {
        description: 'The created user',
        content: {
          'application/json': {
            schema: ResponseSchema
          }
        }
      }
    }
  }
};

Service Authentication

There are multiple ways to implement service-to-service authentication on AWS, but for this article, we will focus on the simpler method: IAM (role-based access control) authentication and authorization. It may seem cumbersome at first, but it works well. By using AWS Signature V4 as our authentication method, we can sign the request using the awsv4-axios request interceptor with the credentials available in our Lambda functions, such as accessKeyId and secretAccessKey.
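To give a feel for what "signing the request" involves, here is a simplified sketch of the core Signature V4 steps using only Node's crypto module: deriving a signing key from the secret access key, then using it to sign a "string to sign". The interceptor does this for you, so this code is for illustration only and is not how the interceptor is implemented. The credential, dates, and region are placeholders, and the canonical request is reduced to a placeholder string, whereas a real one is built from the HTTP method, path, query string, headers, and payload hash.

```typescript
import { createHash, createHmac } from 'node:crypto';

// HMAC-SHA256 helper returning raw bytes.
const hmac = (key: string | Buffer, data: string): Buffer =>
  createHmac('sha256', key).update(data, 'utf8').digest();

// Signing key derivation: each step keys the next HMAC with the previous result.
function deriveSigningKey(
  secretAccessKey: string,
  dateStamp: string, // YYYYMMDD
  region: string,
  service: string
): Buffer {
  const kDate = hmac(`AWS4${secretAccessKey}`, dateStamp);
  const kRegion = hmac(kDate, region);
  const kService = hmac(kRegion, service);
  return hmac(kService, 'aws4_request');
}

// Combines the algorithm, timestamp, credential scope, and request hash.
function buildStringToSign(amzDate: string, scope: string, canonicalRequest: string): string {
  const hashedRequest = createHash('sha256').update(canonicalRequest, 'utf8').digest('hex');
  return ['AWS4-HMAC-SHA256', amzDate, scope, hashedRequest].join('\n');
}

// Placeholder values; in a Lambda the credentials come from the execution role.
const secretAccessKey = 'wJalrXUtnFEMI/K7MDENG+bPxRfiCYEXAMPLEKEY';
const dateStamp = '20240825';
const amzDate = '20240825T120000Z';
const region = 'eu-west-1';
const service = 'execute-api'; // the service name used for API Gateway invocations
const scope = `${dateStamp}/${region}/${service}/aws4_request`;

// Simplified placeholder, not a valid canonical request.
const canonicalRequest = 'POST\n/v1/users\n...';

const signingKey = deriveSigningKey(secretAccessKey, dateStamp, region, service);
const signature = hmac(signingKey, buildStringToSign(amzDate, scope, canonicalRequest)).toString('hex');

console.log(`Signature: ${signature}`);
```

Because the region and service are part of both the key derivation and the credential scope, a signature produced for one region or service is not valid for another, which is why the interceptor needs to know the region.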

The openapi-generator typescript-axios template doesn't support the awsv4 authentication header, but there is an open pull request that adds support, which can be forked while it goes through the review process.

We need to mark the endpoints with the awsv4 scheme to signal to users of the SDK that this authentication method is required. We can do this by registering a security scheme:

// registry.ts
import { OpenAPIRegistry } from '@asteasolutions/zod-to-openapi';

const registry = new OpenAPIRegistry();

export const sigv4 = registry.registerComponent('securitySchemes', 'sigv4', {
  type: 'apiKey',
  name: 'Authorization',
  in: 'header',
  'x-amazon-apigateway-authtype': 'awsSigv4'
});

export default registry;

By default, Lambda functions don't have permission to call other resources (think of calling another stack's API Gateway), so we need to grant the required permissions.

The code snippet below is an example of a Lambda in another stack (service) that makes an API call to the API Gateway of a service that we generated an SDK for. It does the following:

  • Allow the Lambda to invoke the API Gateway endpoint with the path /v1/users via the POST method

  • Rely on the AWS_REGION environment variable, which is later read by the awsv4 axios interceptor through the generated SDK. The Lambda runtime sets this variable itself and treats it as a reserved key, so it is not declared in the function configuration

# serverless.yml
dashboardCreateUser:
  handler: src/dashboard/user/create/endpoint.handler
  iamRoleStatements:
    - Effect: Allow
      Action:
        - execute-api:Invoke
      Resource:
        - Fn::Join:
            - ""
            - - "arn:aws:execute-api:"
              - Ref: AWS::Region
              - ":"
              - Ref: AWS::AccountId
              - ":"
              - Fn::ImportValue: sls-user-service-${sls:stage}-HttpApiId
              # <api-id>/<stage>/<method>/<path>; "*" allows any stage
              - "/*/POST/v1/users"
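The Fn::Join in the snippet above assembles an execute-api ARN of the form arn:aws:execute-api:<region>:<account-id>:<api-id>/<stage>/<method>/<path>. The sketch below builds the same kind of string in TypeScript, which can help when reasoning about what a policy statement will and will not match. The buildExecuteApiArn helper and all of the example values (account ID, API ID, region) are hypothetical.

```typescript
interface ExecuteApiArnParts {
  region: string;
  accountId: string;
  apiId: string;
  stage: string; // "*" matches any stage
  method: string; // HTTP verb, or "*" for any
  path: string; // resource path, with or without a leading slash
}

// Hypothetical helper that mirrors the Fn::Join from the serverless.yml snippet.
function buildExecuteApiArn(parts: ExecuteApiArnParts): string {
  // Drop leading slashes so the path is not joined with a double slash
  const path = parts.path.replace(/^\/+/, '');
  return (
    `arn:aws:execute-api:${parts.region}:${parts.accountId}:` +
    `${parts.apiId}/${parts.stage}/${parts.method.toUpperCase()}/${path}`
  );
}

const arn = buildExecuteApiArn({
  region: 'eu-west-1',
  accountId: '123456789012',
  apiId: 'a1b2c3d4e5',
  stage: '*',
  method: 'post',
  path: '/v1/users'
});

console.log(arn);
// prints arn:aws:execute-api:eu-west-1:123456789012:a1b2c3d4e5/*/POST/v1/users
```

Keeping the method and path as specific as possible in the policy means the calling Lambda can only reach the endpoints it actually needs.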

Generating the SDK from an OpenAPI spec

Once the route and the associated schemas have been defined, we can now make use of them to generate an OpenAPI spec file for the service.

// generate.ts
import fs from 'node:fs';

import { OpenApiGeneratorV3 } from '@asteasolutions/zod-to-openapi';
import yaml from 'yaml';

import registry from './registry';

import { route as CreateUserRoute } from './route';

function getOpenApiDocumentation() {
  // Combine the registered components (such as the sigv4 security scheme)
  // with the route definitions; the route is a single definition, not an array
  const generator = new OpenApiGeneratorV3([
    ...registry.definitions,
    CreateUserRoute
  ]);

  return generator.generateDocument({
    openapi: '3.0.1',
    info: {
      version: '1.0.0',
      title: 'User Service'
    }
  });
}

function writeDocumentation() {
  const docs = getOpenApiDocumentation();
  const fileContent = yaml.stringify(docs);

  // Write the spec next to this script
  fs.writeFileSync(`${__dirname}/openapi.yml`, fileContent, {
    encoding: 'utf8'
  });
}

writeDocumentation();

Running the above script outputs an openapi.yml file that we can feed to openapi-generator to generate an SDK. The quickest way to do this is via the Docker image. Note that the snippet below uses the official Docker image, which as of today (25th Aug 2024) doesn't support the awsv4 security scheme for the typescript-axios template, so you might need to build your own Docker image from the PR referenced above to get the awsv4 security scheme and the awsv4-axios interceptor.

docker run \
    --rm -v "${PWD}:/local" \
    openapitools/openapi-generator-cli generate \
    -i /local/src/schemas/openapi.yml \
    -g typescript-axios \
    -o /local/sdk

Conclusion

As teams and services expand, effective communication becomes crucial. This article highlighted one of the many approaches to address this challenge. Introducing new processes and tools is rarely straightforward; it requires effort to ensure smooth adoption. In our case, instead of introducing the OpenAPI specification directly, we leveraged Zod—a technology familiar to our engineers and easily integrated into their workflow.

Adopting an OpenAPI specification alongside an SDK greatly improves team efficiency and collaboration. It establishes a clear, standardized contract for services, enabling teams to work independently with minimal communication overhead. This approach ensures that changes and capabilities are transparently communicated, allowing stakeholders to stay aligned and adapt quickly. Ultimately, it accelerates development while enhancing the reliability and scalability of the entire system.


If you’re looking to work on similar problems, we’re always looking for great engineers at Calo! Visit https://calo.jobs/ to find any openings that might fit you