Series Overview
AI agents are powerful, but running them in production is a different game. You need infrastructure that’s reliable, scalable, and secure — and that’s where AWS comes in.
This series teaches you how to build production-grade infrastructure for AI agents using AWS services. Each day covers one piece of the puzzle: deploying models, managing state, caching, routing traffic, and automating deployments.
The Big Picture — What We’re Building
┌─────────────────────────────────────────────────────────────────────┐
│                  Production AI Agent Architecture                   │
├─────────────────────────────────────────────────────────────────────┤
│                                                                     │
│    ┌──────────┐    ┌──────────┐    ┌──────────┐    ┌──────────┐     │
│    │  Agent   │    │  Agent   │    │  Agent   │    │  Agent   │     │
│    │ (Team A) │    │ (Team B) │    │ (Team C) │    │ (Team D) │     │
│    └────┬─────┘    └────┬─────┘    └────┬─────┘    └────┬─────┘     │
│         │               │               │               │           │
│         └───────────────┼───────────────┼───────────────┘           │
│                         │               │                           │
│                ┌────────▼───────────────▼────────┐                  │
│                │      Route53 + CloudFront       │ ◄── Day 5        │
│                │ (Global traffic routing + CDN)  │                  │
│                └────────┬───────────────┬────────┘                  │
│                         │               │                           │
│                ┌────────▼───────────────▼────────┐                  │
│                │       ALB (Load Balancer)       │ ◄── Day 1        │
│                └────────┬───────────────┬────────┘                  │
│                         │               │                           │
│    ┌────────────────────┼───────────────┼────────────────────┐      │
│    │         ┌──────────▼───────┐  ┌────▼───────────┐        │      │
│    │         │ ECS Fargate      │  │ Lambda +       │        │      │
│    │         │ (Containerized)  │  │ Bedrock        │ ◄── Day 1,4   │
│    │         │ MCP Server       │  │ (Serverless)   │        │      │
│    │         └──────────┬───────┘  └────┬───────────┘        │      │
│    │                    │               │                    │      │
│    │  ┌─────────────────┼───────────────┼─────────────────┐  │      │
│    │  │         ┌───────▼──────┐   ┌────▼────────┐        │  │      │
│    │  │         │ DynamoDB     │   │ ElastiCache │        │  │      │
│    │  │         │ (State,      │   │ (Cache,     │        │  │      │
│    │  │         │ Sessions)    │   │ Bedrock)    │        │  │      │
│    │  │         └──────────────┘   └─────────────┘        │  │      │
│    │  └───────────────────────────────────────────────────┘  │      │
│    └─────────────────────────────────────────────────────────┘      │
│                                                                     │
│  ┌──────────────────────────────────────────────────────────────┐   │
│  │ CI/CD Pipeline (CodePipeline + CodeBuild)          ◄── Day 6 │   │
│  │ Git push → Build Docker → Push ECR → Deploy ECS              │   │
│  └──────────────────────────────────────────────────────────────┘   │
│                                                                     │
└─────────────────────────────────────────────────────────────────────┘

Series Roadmap
| Day | Topic | AWS Services | What you learn |
|---|---|---|---|
| 1 | Deploy MCP Server on ECS Fargate | ECS, ECR, ALB, Secrets Manager | Containerize + deploy your first agent server with HTTPS, secrets, and auto-scaling |
| 2 | Agent State with DynamoDB | DynamoDB Global Tables, DAX | Store conversation history, session state, and handle multi-region replication |
| 3 | LLM Caching with ElastiCache + Bedrock | ElastiCache (Redis), Bedrock | Semantic caching, prompt caching with Bedrock, reduce latency and cost |
| 4 | Serverless Agent with Lambda + Bedrock | Lambda, API Gateway, Bedrock, Step Functions | Build agents without managing servers — Lambda orchestrates Bedrock calls |
| 5 | Multi-Region Routing with Route53 | Route53, CloudFront, Global Accelerator | Global traffic routing, failover, latency-based routing for agents |
| 6 | CI/CD for AI Agents | CodePipeline, CodeBuild, ECR, ECS | Automated deployment pipeline — ship agent updates with zero downtime |
Each day builds on the previous one. By day 6, you’ll have a complete production infrastructure for any AI agent.
Day 1: Deploy an MCP Server on ECS Fargate
Your MCP server works on localhost. Now make it accessible to the internet — and to every agent that needs it.
ECS Fargate is the sweet spot: no EC2 instances to manage, service auto-scaling through Application Auto Scaling, and native integration with an Application Load Balancer. You ship a Docker image, and Fargate runs it.
What we deploy today:
┌──────────────┐     ┌──────────────┐     ┌──────────────┐
│    Agent     │────▶│     ALB      │────▶│     ECS      │
│  (anywhere)  │     │   (HTTPS)    │     │   Fargate    │
├──────────────┤     ├──────────────┤     ├──────────────┤
│ MCP Client   │     │  ┌────────┐  │     │ ┌──────────┐ │
│ (SSE)        │     │  │ :443   │  │     │ │ MCP      │ │
│              │     │  │ ─────▶ │  │     │ │ Server   │ │
└──────────────┘     │  │ :3001  │  │     │ │ (Docker) │ │
                     │  └────────┘  │     │ └──────────┘ │
                     └──────────────┘     └──────────────┘

Step by step:
- Package the MCP server as a Docker container
- Push it to ECR (private Docker registry)
- Store secrets (GitHub tokens) in AWS Secrets Manager
- Create an ECS Fargate cluster and task definition
- Set up an ALB with HTTPS to route traffic
- Configure auto-scaling
- Wire up CI/CD so future deployments are automatic
Prerequisites
# AWS CLI (logged in)
aws configure

# Docker
docker --version

# Node.js 18+
node --version

# An MCP server project. Any server with SSE transport works.

Step 1: Dockerize the MCP Server
Dockerfile
FROM node:20-alpine AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev

FROM node:20-alpine AS runtime
WORKDIR /app
RUN addgroup -S mcp && adduser -S mcp -G mcp

COPY --from=builder /app/node_modules ./node_modules
COPY dist/ ./dist/
COPY package.json ./

USER mcp
EXPOSE 3001

HEALTHCHECK --interval=30s --timeout=3s --start-period=10s --retries=3 \
  CMD wget --no-verbose --tries=1 --spider http://localhost:3001/health || exit 1

ENV NODE_ENV=production
ENV PORT=3001

CMD ["node", "dist/index.js"]

Key points:
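- Multi-stage build: the builder stage installs production dependencies only (npm ci --omit=dev), and the runtime stage copies those node_modules plus the precompiled dist/. TypeScript is therefore compiled before docker build runs (as buildspec.yml does in Step 7), and the final image stays minimal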
- Non-root user: security best practice for containers
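- Health check: lets Docker report container health when you run the image locally. ECS ignores a HEALTHCHECK baked into the image and only evaluates a healthCheck declared in the task definition; in this setup the ALB target group's check on /health decides whether a task receives traffic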
- No hardcoded tokens: secrets are injected at runtime via Secrets Manager
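Both the Dockerfile health check and the ALB target group probe /health, so the server has to answer that path. The article does not show the server code; the following is a minimal sketch of such a handler, assuming a Node HTTP server. The names handleHealth, HealthRequest, and HealthResponse are hypothetical, and the JSON body is an arbitrary choice (the probes only care about the status code).

```typescript
// Minimal structural types so the sketch needs no imports. Node's
// http.IncomingMessage and http.ServerResponse provide these members.
interface HealthRequest {
  method?: string;
  url?: string;
}

interface HealthResponse {
  writeHead(statusCode: number, headers: Record<string, string>): unknown;
  end(body?: string): unknown;
}

// Answers GET or HEAD on /health with 200 and a small JSON body.
// Returns true when the request was handled and false otherwise, so the
// caller can pass every other request on to the MCP transport.
function handleHealth(req: HealthRequest, res: HealthResponse): boolean {
  const path = (req.url ?? "").split("?")[0];
  const isProbeMethod = req.method === "GET" || req.method === "HEAD";
  if (!isProbeMethod || path !== "/health") {
    return false;
  }
  res.writeHead(200, { "Content-Type": "application/json" });
  res.end(JSON.stringify({ status: "ok" }));
  return true;
}
```

In a real server this would typically be the first check inside the request listener, with anything it does not handle falling through to the SSE endpoints.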
Build and test locally:
# The Dockerfile copies a prebuilt dist/, so compile first
npm ci && npm run build

docker build -t github-issue-mcp .
docker run -p 3001:3001 \
  -e GITHUB_TOKEN=your_token_here \
  -e AWS_REGION=us-east-1 \
  github-issue-mcp

# Verify
curl http://localhost:3001/health

Step 2: Set Up ECR Repository
ECR is Docker Hub on AWS — private, fast, and integrated with ECS.
# Create repository with vulnerability scanning
aws ecr create-repository \
  --repository-name github-issue-mcp \
  --image-scanning-configuration scanOnPush=true

# Authenticate Docker
aws ecr get-login-password --region us-east-1 | \
  docker login --username AWS --password-stdin <account>.dkr.ecr.us-east-1.amazonaws.com

# Tag and push
docker tag github-issue-mcp:latest <account>.dkr.ecr.us-east-1.amazonaws.com/github-issue-mcp:latest
docker push <account>.dkr.ecr.us-east-1.amazonaws.com/github-issue-mcp:latest

scanOnPush=true scans every image for known vulnerabilities as it is pushed. The scan only reports findings; it does not stop a vulnerable image from being deployed, so review the results before promoting an image to production.
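The registry host and image URI in the commands above follow a fixed pattern: account ID, region, repository, tag. A small sketch of that pattern, with hypothetical helper names (ecrRegistryHost, ecrImageUri) and a placeholder account ID, may help when scripting the same steps:

```typescript
// Registry host used by `docker login` for a private ECR registry
// in the standard AWS partition.
function ecrRegistryHost(accountId: string, region: string): string {
  if (!/^\d{12}$/.test(accountId)) {
    throw new Error(`expected a 12-digit AWS account ID, got "${accountId}"`);
  }
  return `${accountId}.dkr.ecr.${region}.amazonaws.com`;
}

// Full image reference used by `docker tag` and `docker push`.
function ecrImageUri(
  accountId: string,
  region: string,
  repository: string,
  tag: string = "latest",
): string {
  return `${ecrRegistryHost(accountId, region)}/${repository}:${tag}`;
}

// 123456789012 is a placeholder, not a real account.
const exampleImageUri = ecrImageUri("123456789012", "us-east-1", "github-issue-mcp");
// exampleImageUri is
// "123456789012.dkr.ecr.us-east-1.amazonaws.com/github-issue-mcp:latest"
```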
Step 3: Store Secrets in AWS Secrets Manager
Never bake tokens into images. Never commit them to Git.
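On the server side, a secret injected at runtime arrives as an ordinary environment variable such as GITHUB_TOKEN. A minimal sketch of reading it defensively at startup follows; requireEnv is a hypothetical helper, not part of any SDK, and a plain object stands in for process.env so the snippet runs anywhere.

```typescript
// Reads a required setting from an environment-like map and fails fast when
// it is missing or blank. The error names the variable but never prints a value.
function requireEnv(env: Record<string, string | undefined>, name: string): string {
  const value = env[name];
  if (value === undefined || value.trim() === "") {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// In the server this would be called once at startup, for example:
//   const githubToken = requireEnv(process.env, "GITHUB_TOKEN");
// The values below are dummies for illustration only.
const exampleEnv: Record<string, string | undefined> = {
  GITHUB_TOKEN: "ghp_example_only",
  PORT: "3001",
};
const exampleToken = requireEnv(exampleEnv, "GITHUB_TOKEN");
```

Failing at startup means a task with a missing secret never passes its health check, so a misconfigured deployment is caught before it serves traffic.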
aws secretsmanager create-secret \
  --name "github-issue-mcp/github-token" \
  --description "GitHub Personal Access Token for MCP server" \
  --secret-string "ghp_your_token_here"

Also store the SSE shared secret if you implemented authentication (from the MCP security series):

aws secretsmanager create-secret \
  --name "github-issue-mcp/sse-shared-secret" \
  --secret-string "your-sse-secret-here"

Step 4: Create ECS Cluster + Task Definition
Cluster
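aws ecs create-cluster \
  --cluster-name mcp-server-cluster \
  --capacity-providers FARGATE FARGATE_SPOT

Registering FARGATE_SPOT as secondary capacity can save 30-50% on compute costs for the tasks that run on it. A service only lands on Spot when it is created with a capacity provider strategy that includes FARGATE_SPOT; a service created with --launch-type FARGATE always runs on regular Fargate.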
Task Definition
The task definition tells ECS what container to run, what ports to expose, and which secrets to inject.
GITHUB_TOKEN_ARN=$(aws secretsmanager describe-secret \
  --secret-id "github-issue-mcp/github-token" --query ARN --output text)

aws ecs register-task-definition \
  --family github-issue-mcp \
  --network-mode awsvpc \
  --requires-compatibilities FARGATE \
  --cpu 256 \
  --memory 512 \
  --execution-role-arn "arn:aws:iam::<account>:role/ecsTaskExecutionRole" \
  --container-definitions '[
    {
      "name": "mcp-server",
      "image": "<account>.dkr.ecr.us-east-1.amazonaws.com/github-issue-mcp:latest",
      "essential": true,
      "portMappings": [{"containerPort": 3001, "protocol": "tcp"}],
      "environment": [
        {"name": "NODE_ENV", "value": "production"},
        {"name": "AWS_REGION", "value": "us-east-1"}
      ],
      "secrets": [
        {"name": "GITHUB_TOKEN", "valueFrom": "'"$GITHUB_TOKEN_ARN"'"}
      ],
      "logConfiguration": {
        "logDriver": "awslogs",
        "options": {
          "awslogs-group": "/ecs/github-issue-mcp",
          "awslogs-region": "us-east-1",
          "awslogs-stream-prefix": "ecs"
        }
      }
    }
  ]'

The execution-role-arn references an IAM role that lets ECS pull images from ECR and write logs to CloudWatch. Because this task definition injects a secret, the same role also needs secretsmanager:GetSecretValue on that secret. The log group /ecs/github-issue-mcp must exist before the first task starts, since this configuration does not set awslogs-create-group.
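Splicing a shell variable into single-quoted JSON is easy to get wrong. As an alternative sketch, the same container definition can be generated as a typed object and serialized; the function name buildContainerDefinitions and the example ARN (with a placeholder account ID and suffix) are assumptions for illustration, while the field values mirror the command above.

```typescript
interface ContainerDefinition {
  name: string;
  image: string;
  essential: boolean;
  portMappings: { containerPort: number; protocol: "tcp" | "udp" }[];
  environment: { name: string; value: string }[];
  secrets: { name: string; valueFrom: string }[];
  logConfiguration: { logDriver: string; options: Record<string, string> };
}

// Builds the array passed to --container-definitions.
function buildContainerDefinitions(imageUri: string, githubTokenArn: string): ContainerDefinition[] {
  // Guard against passing a secret name or value where an ARN is expected.
  if (!/^arn:aws[\w-]*:secretsmanager:/.test(githubTokenArn)) {
    throw new Error("githubTokenArn must be a Secrets Manager ARN");
  }
  return [
    {
      name: "mcp-server",
      image: imageUri,
      essential: true,
      portMappings: [{ containerPort: 3001, protocol: "tcp" }],
      environment: [
        { name: "NODE_ENV", value: "production" },
        { name: "AWS_REGION", value: "us-east-1" },
      ],
      secrets: [{ name: "GITHUB_TOKEN", valueFrom: githubTokenArn }],
      logConfiguration: {
        logDriver: "awslogs",
        options: {
          "awslogs-group": "/ecs/github-issue-mcp",
          "awslogs-region": "us-east-1",
          "awslogs-stream-prefix": "ecs",
        },
      },
    },
  ];
}

// Placeholder account ID and secret suffix, for illustration only.
const exampleSecretArn =
  "arn:aws:secretsmanager:us-east-1:123456789012:secret:github-issue-mcp/github-token-AbCdEf";
const containerDefinitionsJson = JSON.stringify(
  buildContainerDefinitions(
    "123456789012.dkr.ecr.us-east-1.amazonaws.com/github-issue-mcp:latest",
    exampleSecretArn,
  ),
  null,
  2,
);
```

Written to a file, the result can be passed to the CLI as --container-definitions file://container-definitions.json, which avoids the nested quoting entirely.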
Step 5: Create ALB + Service
Security groups
# ALB: receive HTTPS from anywhere
aws ec2 create-security-group --group-name mcp-alb-sg \
  --description "ALB for MCP server" --vpc-id <vpc-id>
aws ec2 authorize-security-group-ingress --group-id <alb-sg-id> \
  --protocol tcp --port 443 --cidr 0.0.0.0/0

# Tasks: receive traffic only from the ALB
aws ec2 create-security-group --group-name mcp-task-sg \
  --description "MCP server tasks" --vpc-id <vpc-id>
aws ec2 authorize-security-group-ingress --group-id <task-sg-id> \
  --protocol tcp --port 3001 --source-group <alb-sg-id>

Target group and ALB
# Target group: health check on /health
aws elbv2 create-target-group --name mcp-server-tg --protocol HTTP --port 3001 \
  --target-type ip --vpc-id <vpc-id> --health-check-path /health

# ALB
aws elbv2 create-load-balancer --name mcp-server-alb \
  --subnets subnet-<public-a> subnet-<public-b> --security-groups <alb-sg-id>

# HTTPS listener (requires ACM certificate)
aws elbv2 create-listener --load-balancer-arn <alb-arn> \
  --protocol HTTPS --port 443 \
  --certificates CertificateArn=<acm-cert-arn> \
  --default-actions Type=forward,TargetGroupArn=<tg-arn>

ECS Service
aws ecs create-service \
  --cluster mcp-server-cluster \
  --service-name github-issue-mcp \
  --task-definition github-issue-mcp \
  --desired-count 2 \
  --launch-type FARGATE \
  --network-configuration "awsvpcConfiguration={subnets=[subnet-<private-a>,subnet-<private-b>],securityGroups=[<task-sg-id>],assignPublicIp=DISABLED}" \
  --load-balancers "targetGroupArn=<tg-arn>,containerName=mcp-server,containerPort=3001" \
  --deployment-configuration "maximumPercent=200,minimumHealthyPercent=100"

Private subnets + no public IP: the ALB handles all inbound traffic. The tasks still need outbound access, through a NAT gateway or VPC endpoints, to reach the GitHub API as well as ECR, Secrets Manager, and CloudWatch Logs.
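The deployment-configuration values in the create-service call decide how many tasks may run during a rolling update. As a worked sketch of that arithmetic (the function and field names are hypothetical): ECS rounds the minimum healthy count up and the maximum count down, both as percentages of the desired count.

```typescript
interface RollingUpdateBounds {
  minRunning: number; // tasks that must stay running during the update
  maxRunning: number; // upper limit on running plus pending tasks
}

// minimumHealthyPercent is rounded up, maximumPercent is rounded down.
function rollingUpdateBounds(
  desiredCount: number,
  minimumHealthyPercent: number,
  maximumPercent: number,
): RollingUpdateBounds {
  return {
    minRunning: Math.ceil((desiredCount * minimumHealthyPercent) / 100),
    maxRunning: Math.floor((desiredCount * maximumPercent) / 100),
  };
}

// Values from the service above: desired count 2, 100% / 200%.
const serviceBounds = rollingUpdateBounds(2, 100, 200);
// serviceBounds is { minRunning: 2, maxRunning: 4 }
```

With these settings ECS can start two new tasks next to the two old ones and only stops the old tasks once the new ones are healthy, which is what makes the deployment zero-downtime at the cost of briefly paying for four tasks.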
Step 6: Auto-Scaling
Scale on the metric that matters: request count per ALB target.
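The target-tracking policy for this metric takes a ResourceLabel that identifies the ALB and target group. It is not a full ARN: it joins the tail of the load balancer ARN (app/name/id) with the tail of the target group ARN (targetgroup/name/id). A minimal sketch of deriving it, with a hypothetical helper name and placeholder account ID and resource IDs:

```typescript
// Builds the ResourceLabel for the ALBRequestCountPerTarget predefined metric:
//   app/<alb-name>/<alb-id>/targetgroup/<tg-name>/<tg-id>
function albResourceLabel(loadBalancerArn: string, targetGroupArn: string): string {
  const lbMarker = ":loadbalancer/";
  const tgMarker = ":targetgroup/";
  const lbIndex = loadBalancerArn.indexOf(lbMarker);
  const tgIndex = targetGroupArn.indexOf(tgMarker);
  if (lbIndex === -1 || tgIndex === -1) {
    throw new Error("expected an ALB ARN and a target group ARN");
  }
  // Everything after "loadbalancer/": app/<name>/<id>
  const lbPart = loadBalancerArn.slice(lbIndex + lbMarker.length);
  // Everything after the last colon: targetgroup/<name>/<id>
  const tgPart = targetGroupArn.slice(tgIndex + 1);
  return `${lbPart}/${tgPart}`;
}

// Placeholder ARNs for illustration only.
const exampleLabel = albResourceLabel(
  "arn:aws:elasticloadbalancing:us-east-1:123456789012:loadbalancer/app/mcp-server-alb/50dc6c495c0c9188",
  "arn:aws:elasticloadbalancing:us-east-1:123456789012:targetgroup/mcp-server-tg/73e2d6bc24d8a067",
);
// exampleLabel is
// "app/mcp-server-alb/50dc6c495c0c9188/targetgroup/mcp-server-tg/73e2d6bc24d8a067"
```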
aws application-autoscaling register-scalable-target \
  --service-namespace ecs \
  --resource-id service/mcp-server-cluster/github-issue-mcp \
  --scalable-dimension ecs:service:DesiredCount \
  --min-capacity 2 --max-capacity 20

aws application-autoscaling put-scaling-policy \
  --service-namespace ecs \
  --resource-id service/mcp-server-cluster/github-issue-mcp \
  --scalable-dimension ecs:service:DesiredCount \
  --policy-name request-count-target \
  --policy-type TargetTrackingScaling \
  --target-tracking-scaling-policy-configuration '{
    "TargetValue": 100.0,
    "PredefinedMetricSpecification": {
      "PredefinedMetricType": "ALBRequestCountPerTarget",
      "ResourceLabel": "app/<alb-name>/<alb-id>/targetgroup/<tg-name>/<tg-id>"
    },
    "ScaleOutCooldown": 60,
    "ScaleInCooldown": 120
  }'

Step 7: CI/CD with CodePipeline
buildspec.yml
version: 0.2
phases:
  install:
    commands:
      - npm ci
  pre_build:
    commands:
      - npm run build
      - aws ecr get-login-password --region $AWS_DEFAULT_REGION | docker login --username AWS --password-stdin $ECR_REPOSITORY_URI
  build:
    commands:
      - docker build -t $ECR_REPOSITORY_URI:$CODEBUILD_RESOLVED_SOURCE_VERSION .
      - docker tag $ECR_REPOSITORY_URI:$CODEBUILD_RESOLVED_SOURCE_VERSION $ECR_REPOSITORY_URI:latest
  post_build:
    commands:
      - docker push $ECR_REPOSITORY_URI:$CODEBUILD_RESOLVED_SOURCE_VERSION
      - docker push $ECR_REPOSITORY_URI:latest
      - printf '[{"name":"mcp-server","imageUri":"%s"}]' $ECR_REPOSITORY_URI:$CODEBUILD_RESOLVED_SOURCE_VERSION > imagedefinitions.json
artifacts:
  files: imagedefinitions.json

Now every git push to main triggers:
- CodeBuild compiles TypeScript and builds the Docker image
- Pushes to ECR
- ECS deploys a new task definition with the updated image
- ALB gradually drains old connections and routes to new tasks
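The hand-off between the build and the deployment is imagedefinitions.json, which the printf line in buildspec.yml writes. A small sketch of the same artifact as typed data shows its shape; the function name is hypothetical and the repository URI and commit hash are placeholders. The name field has to match the container name in the task definition (mcp-server here), otherwise the deploy stage cannot tell which container to update.

```typescript
interface ImageDefinition {
  name: string;     // container name from the task definition
  imageUri: string; // image to deploy, pinned to a specific tag
}

// Produces the same single-line JSON that the buildspec's printf writes.
function imageDefinitionsJson(containerName: string, repositoryUri: string, tag: string): string {
  const entries: ImageDefinition[] = [
    { name: containerName, imageUri: `${repositoryUri}:${tag}` },
  ];
  return JSON.stringify(entries);
}

// Placeholder repository URI and commit hash for illustration.
const exampleArtifact = imageDefinitionsJson(
  "mcp-server",
  "123456789012.dkr.ecr.us-east-1.amazonaws.com/github-issue-mcp",
  "0a1b2c3",
);
```

Pinning the commit hash rather than latest means each deployment points at an immutable image, so rolling back is a matter of redeploying an earlier tag.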
Step 8: Connecting an Agent
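const client = new McpClient({
  transport: new SSEClientTransport({
    url: "https://mcp-server-<alb-dns>.us-east-1.elb.amazonaws.com/sse",
    headers: {
      "Authorization": "Bearer <sse-shared-secret>",
    },
  }),
});

The ACM certificate on the listener must cover the hostname the agent connects to, so in practice you point your own domain at the ALB rather than using the raw elb.amazonaws.com name. For SSE transport, enable stickiness on the ALB target group, or implement an external session store: the client's follow-up requests must reach the task that holds its open event stream.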
Monitoring
Dashboard
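aws cloudwatch put-dashboard --dashboard-name MCP-Server --dashboard-body '{
  "widgets": [
    {
      "type": "metric",
      "properties": {
        "metrics": [
          ["AWS/ECS", "CPUUtilization", "ClusterName", "mcp-server-cluster", "ServiceName", "github-issue-mcp", {"stat": "Average"}],
          ["AWS/ECS", "MemoryUtilization", "ClusterName", "mcp-server-cluster", "ServiceName", "github-issue-mcp", {"stat": "Average"}]
        ],
        "period": 300,
        "stat": "Average",
        "region": "us-east-1",
        "title": "MCP Server Resource Usage"
      }
    },
    {
      "type": "metric",
      "properties": {
        "metrics": [
          ["AWS/ApplicationELB", "RequestCount", "LoadBalancer", "app/mcp-server-alb/<alb-id>", {"stat": "Sum"}],
          ["AWS/ApplicationELB", "TargetResponseTime", "LoadBalancer", "app/mcp-server-alb/<alb-id>", {"stat": "p95"}],
          ["AWS/ApplicationELB", "HTTPCode_Target_5XX_Count", "LoadBalancer", "app/mcp-server-alb/<alb-id>", {"stat": "Sum"}]
        ],
        "period": 300,
        "region": "us-east-1",
        "title": "ALB Metrics"
      }
    }
  ]
}'

CloudWatch only returns data for a metric when its dimensions match, so each entry names the cluster and service (ECS) or the load balancer (ALB). Replace <alb-id> with the ID at the end of the ALB ARN.

Cost Breakdown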
| Component | Configuration | Monthly cost (USD) |
|---|---|---|
| ECS Fargate | 2 tasks × 0.25 vCPU / 512 MiB | ~$30 |
| ALB | 1 ALB | ~$22 |
| ECR | < 5GB storage | ~$1 |
| Secrets Manager | 2 secrets | ~$1 |
| CloudWatch | Logs + metrics | ~$5 |
| CodePipeline | 50+ builds | ~$10 |
| Total | | ~$69 |
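With FARGATE_SPOT for 50% of tasks, only the ~$30 Fargate line shrinks. At the 30-50% Spot savings cited earlier, that is roughly $5-8 off, for a total of about $61-65/mo.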
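The estimate is simple enough to recompute when your own numbers differ. The sketch below sums the table's line items and applies a Spot discount to a share of the Fargate line only; the figures are the article's rough estimates, and the function names, spotShare, and spotDiscount parameters are assumptions to replace with your own values.

```typescript
interface CostLine {
  component: string;
  monthlyUsd: number;
}

// Line items copied from the cost table (rough estimates, not AWS quotes).
const costLines: CostLine[] = [
  { component: "ECS Fargate", monthlyUsd: 30 },
  { component: "ALB", monthlyUsd: 22 },
  { component: "ECR", monthlyUsd: 1 },
  { component: "Secrets Manager", monthlyUsd: 1 },
  { component: "CloudWatch", monthlyUsd: 5 },
  { component: "CodePipeline", monthlyUsd: 10 },
];

function totalMonthly(lines: CostLine[]): number {
  return lines.reduce((sum, line) => sum + line.monthlyUsd, 0);
}

// spotShare: fraction of Fargate tasks on Spot (0..1).
// spotDiscount: fraction saved on those tasks (0..1); an assumption,
// since Spot pricing varies by region and over time.
function totalWithSpot(lines: CostLine[], spotShare: number, spotDiscount: number): number {
  const fargate = lines.find((line) => line.component === "ECS Fargate");
  const fargateCost = fargate ? fargate.monthlyUsd : 0;
  return totalMonthly(lines) - fargateCost * spotShare * spotDiscount;
}

const baselineTotal = totalMonthly(costLines);
// baselineTotal is 69
```

Because the discount touches only the compute line, the fixed costs (ALB, pipeline, logging) dominate a small deployment like this one; Spot matters more as the task count grows.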
What We Used
| AWS Service | Purpose |
|---|---|
| ECR | Private Docker registry |
| Secrets Manager | GitHub tokens, SSE shared secret |
| ECS Fargate | Serverless container runtime |
| ALB | HTTPS termination + routing + health checks |
| Application Auto Scaling | Scale on request count |
| CodePipeline + CodeBuild | CI/CD from git push |
| CloudWatch | Logs, metrics, alarms |
Checklist
- Dockerfile with multi-stage build
- ECR repository with scanOnPush
- Secrets in Secrets Manager
- ECS task definition with secret references
- ALB + HTTPS + health check
- Auto-scaling policy configured
- CodePipeline from git → build → deploy
- CloudWatch dashboard + alarms
| Day | Topic |
|---|---|
| 1 | Deploy MCP Server on ECS Fargate ✅ |
| 2 | Agent State with DynamoDB Global Tables |
| 3 | LLM Caching with ElastiCache + Bedrock |
| 4 | Serverless Agent with Lambda + Bedrock |
| 5 | Multi-Region Agent Routing with Route53 |
| 6 | CI/CD for AI Agents with CodePipeline |
Series: AWS for AI/Agent Developers. Day 1: Deploy an MCP server on ECS Fargate with ALB, Secrets Manager, auto-scaling, and CI/CD pipeline. Full AWS CLI commands included.