The key serverless component in AWS is called AWS Lambda. A Lambda is basically some business logic code that can be triggered by an event source:
A data event source could be the put or get of an object to an S3 bucket. Streaming event sources could be new records that have been to a DynamoDB table that trigger a Lambda function. Other streaming event sources include Kinesis Streams and SQS.
One example of requests to endpoints are Alexa skills, from Alexa echo devices. Another popular one is Amazon API Gateway, when you call an endpoint that would invoke a Lambda function. In addition, you can use changes in AWS CodeCommit or Amazon Cloud Watch.
Finally, you can trigger different events and messages based on SNS or different cron events. These would be regular events or they could be notification events.
The main idea is that the integration between the event source and the Lambda is managed fully by AWS, so all you need to do is write the business logic code, which is the function itself. Once you have the function running, you can run either a transformation or some business logic code to actually write to other services on the right of the diagram. These could be data stores or invoke other endpoints.
In the serverless world, you can implement sync/asyc requests, messaging or event stream processing much more easily using AWS Lambdas. This includes the microservice communication style and data-management patterns we just talked about.
Lambda has two types of event sources types, non-stream event sources and stream event sources:
- Non-stream event sources: Lambdas can be invoked asynchronously or synchronously. For example, SNS/S3 are asynchronous but API Gateway is sync. For sync invocations, the client is responsible for retries, but for async it will retry many times before sending it to a Dead Letter Queue (DLQ) if configured. It's great to have this retry logic and integration built in and supported by AWS event sources, as it means less code and a simpler architecture:
- Stream event sources: The Lambda is invoked with micro-batches of data. In terms of concurrency, there is one Lambda invoked in parallel per shard for Kinesis Streams or one Lambda per partition for DynamoDB Stream. Within the lambda, you just need to iterate over the Kinesis Streams, DynamoDB, or SQS data passed in as JSON records. In addition, you benefit from the AWS built-in streams integration where the Lambda will poll the stream and retrieve the data in order, and will retry upon failure until the data expires, which can be up to seven days for Kinesis Streams. It's also great to have that retry logic built in without having to write a line of code. It is much more effort if you had to build it as a fleet of EC2 or containers using the AWS Consumer or Kinesis SDK yourself:
In essence, AWS is responsible for the invocation and passing in the event data to the Lambda, you are responsible for the processing and the response of the Lambda.