Deploying an Agent to AWS Lambda
We start from the FastAPI service we deployed to Railway in the previous workshop, strip out FastAPI, and swap it for a custom AWS Lambda runtime. The runtime handles both the static frontend and the streaming agent API. We deploy one container image as a Lambda Function URL with SSE streaming.
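To make the swap concrete, here is a minimal sketch of what a custom runtime loop looks like against the documented Lambda Runtime API (the `2018-06-01` endpoints). The routing in `handle` is a hypothetical placeholder, not the workshop's actual `backend/lambda_runtime.py`:

```python
import json
import os
import urllib.request

RUNTIME_API_VERSION = "2018-06-01"


def invocation_urls(api_host: str) -> dict:
    """Build the Runtime API URLs for fetching and answering invocations."""
    base = f"http://{api_host}/{RUNTIME_API_VERSION}/runtime"
    return {
        "next": f"{base}/invocation/next",
        "response": lambda request_id: f"{base}/invocation/{request_id}/response",
    }


def handle(event: dict) -> dict:
    """Hypothetical router: agent API under /api/, static files elsewhere."""
    path = event.get("rawPath", "/")
    if path.startswith("/api/"):
        return {"statusCode": 200, "body": json.dumps({"route": "agent"})}
    return {"statusCode": 200, "body": "<html>static</html>"}


def run_loop() -> None:
    """Poll for invocations forever; Lambda freezes the process between them."""
    urls = invocation_urls(os.environ["AWS_LAMBDA_RUNTIME_API"])
    while True:
        # Long-poll for the next event; the request id comes back in a header.
        with urllib.request.urlopen(urls["next"]) as resp:
            request_id = resp.headers["Lambda-Runtime-Aws-Request-Id"]
            event = json.loads(resp.read())
        # POST the handler result back to the per-invocation response endpoint.
        body = json.dumps(handle(event)).encode()
        req = urllib.request.Request(
            urls["response"](request_id), data=body, method="POST"
        )
        urllib.request.urlopen(req).close()


if __name__ == "__main__":
    run_loop()
```

The key point is that there is no web framework at all: the process long-polls the Runtime API for events and posts results back, and Lambda suspends it in between.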
Most of the code-writing work is delegated to a coding agent (Codex). The exact prompts I used are quoted verbatim.
This was a freestyle session, so it also surfaces a fair amount of meta-discussion: how to work with agents, when to trust them, and when to slow down and read the code.
Links
Related material:
The shift versus the previous workshop
The diagram below shows where Lambda fits in the request path:
The agent loop, the search tool, the renderer abstraction, and the frontend are unchanged from the previous workshop.
What changes is the web layer and the deployment pipeline:
- FastAPI is gone. A custom Lambda runtime (`backend/lambda_runtime.py`) handles routing, static file serving, and SSE streaming directly against the Lambda Runtime API.
- The Dockerfile is rebased on `public.ecr.aws/lambda/python:3.14` instead of `python:3.14-slim`.
- Deployment is done with `./deploy.sh`, which builds a container image, pushes it to ECR, and deploys a CloudFormation stack that creates the Lambda function and a Function URL with `RESPONSE_STREAM` invoke mode.
- Railway and the GitHub Actions promotion workflow are gone.
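The `RESPONSE_STREAM` mode changes the shape of what the runtime sends back. As I understand the Runtime API's streaming convention, the runtime POSTs the response with a `Lambda-Runtime-Function-Response-Mode: streaming` header and chunked transfer encoding, and the body carries a JSON prelude describing the HTTP response, a delimiter of eight NUL bytes, then the raw stream. A sketch of assembling that payload for an SSE body (the chunk values are hypothetical):

```python
import json

# Delimiter between the JSON response prelude and the streamed body.
PRELUDE_DELIMITER = b"\x00" * 8


def build_stream_payload(chunks) -> bytes:
    """Assemble prelude + delimiter + SSE body for a streaming response POST."""
    prelude = json.dumps({
        "statusCode": 200,
        "headers": {"Content-Type": "text/event-stream"},
    }).encode()
    # Each chunk becomes one SSE "data:" event.
    sse_body = b"".join(f"data: {chunk}\n\n".encode() for chunk in chunks)
    return prelude + PRELUDE_DELIMITER + sse_body
```

In practice the body would be written chunk by chunk as the agent produces tokens rather than assembled up front; the function above just shows the wire format.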
The benefit of moving to Lambda, in plain words: with Railway or Render you pay for a server that has to be up all the time, while with a Lambda Function URL you pay only per invocation, which fits tools and small agents that are used occasionally.
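A back-of-envelope calculation makes the point. The rates below are illustrative assumptions in the ballpark of Lambda's published request and GB-second pricing, not current list prices:

```python
def lambda_monthly_cost(
    invocations: int,
    avg_seconds: float,
    memory_gb: float,
    per_million_requests: float = 0.20,   # assumed request price, USD
    per_gb_second: float = 0.0000166667,  # assumed compute price, USD
) -> float:
    """Rough monthly Lambda bill for a given usage pattern."""
    gb_seconds = invocations * avg_seconds * memory_gb
    return invocations / 1e6 * per_million_requests + gb_seconds * per_gb_second


# A toy agent used a few hundred times a day costs cents, while an
# always-on hobby server is a flat monthly fee regardless of traffic.
occasional = lambda_monthly_cost(invocations=10_000, avg_seconds=2.0, memory_gb=0.5)
```

Under these assumptions, ten thousand two-second invocations a month come to well under a dollar.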
Extra material
- Appendix: isolating the AWS environment - how to spin up an isolated AWS sub-account, mint short-lived credentials, and ship them to the box that runs the agent. Links to the reproducible scripts in `aws-account/`.
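The credential-minting step can be sketched with STS. This is a generic `assume_role` sketch, not the appendix's actual scripts; the role ARN and session name are placeholders:

```python
def mint_credentials(role_arn: str, session_name: str = "agent-box",
                     hours: int = 1) -> dict:
    """Assume a role in the sub-account and return temporary credentials."""
    # boto3 is imported lazily so the rest of the sketch works without it.
    import boto3

    sts = boto3.client("sts")
    resp = sts.assume_role(
        RoleArn=role_arn,
        RoleSessionName=session_name,
        DurationSeconds=hours * 3600,
    )
    return resp["Credentials"]


def as_env_exports(creds: dict) -> str:
    """Render STS credentials as shell exports to ship to the agent's box."""
    mapping = {
        "AWS_ACCESS_KEY_ID": creds["AccessKeyId"],
        "AWS_SECRET_ACCESS_KEY": creds["SecretAccessKey"],
        "AWS_SESSION_TOKEN": creds["SessionToken"],
    }
    return "\n".join(f"export {key}={value}" for key, value in mapping.items())
```

The short expiry is the safety property: even if the box running the agent leaks its environment, the credentials die on their own.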