Serverless Scaling: Deploying Strands + MCP on AWS

This content originally appeared on DEV Community and was authored by Om Shree

In this article, we’ll explore how to deploy a Strands Agent connected to an MCP server using serverless AWS services. We’ll cover three deployment models (native Lambda, Lambda with the Web Adapter, and Fargate) and compare their benefits, limitations, and recommended scenarios.

1. Introduction

Strands Agents SDK provides a convenient model-driven loop, while MCP enables dynamic tool invocation. Deploying them on AWS serverless platforms allows you to build scalable, maintainable agents without managing servers¹.

2. Deployment Options Overview

Option	Benefits	Limitations
AWS Lambda (Native)	Fast startup, easy CI/CD, unified observability	Max 15-minute execution, no streaming support²
Lambda with Web Adapter	Preserve web frameworks, serverless pay-per-use	Slower cold start (1–3 s), added complexity³
AWS Fargate (ECS/EKS)	Long-running containers, streaming support	Higher cost, container lifecycle management⁴

3. Native AWS Lambda (Stateless MCP)

Approach: Package your MCP server as a Lambda function using FastMCP with HTTP transport³.

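# lambda_mcp.py
from mcp.server.fastmcp import FastMCP

# stateless_http=True keeps no session state between requests,
# which matches Lambda's execution model.
mcp = FastMCP("lambda-mcp", stateless_http=True)

@mcp.tool()
def echo(message: str) -> str:
    """Return the message unchanged."""
    return message

def lambda_handler(event, context):
    # Assumption: handle_lambda_event stands for an event-to-server bridge and
    # is not part of the documented FastMCP API in the official mcp SDK.
    # The SDK exposes an ASGI app through mcp.streamable_http_app(), so an
    # adapter between the Lambda event and that app is needed in practice.
    return mcp.handle_lambda_event(event, context)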

How to Deploy:

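# Bundle the handler with its dependencies; the zip must contain the mcp package.
# Build on Linux x86_64, or add: --platform manylinux2014_x86_64 --only-binary=:all:
pip install mcp --target package/
cp lambda_mcp.py package/
(cd package && zip -r ../function.zip .)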

aws lambda create-function \
  --function-name lambdaMcp \
  --runtime python3.12 \
  --handler lambda_mcp.lambda_handler \
  --zip-file fileb://function.zip \
  --role <LAMBDA_IAM_ROLE_ARN> \
  --timeout 900

Optionally, expose it via API Gateway:

aws apigateway create-rest-api --name mcpAPI
# Configure /mcp POST integration with the Lambda function
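
To make the request path concrete, here is a minimal sketch of what a stateless handler sees when API Gateway forwards a POST to /mcp. It uses only the standard library and hand-rolls a tiny subset of JSON-RPC 2.0 (the `tools/call` method for the echo tool defined earlier). It illustrates the stateless request/response shape and is not a replacement for FastMCP; the names `TOOLS` and `handle_rpc` are hypothetical.

```python
import json

# Hypothetical tool registry; mirrors the echo tool defined earlier.
TOOLS = {"echo": lambda message: message}


def handle_rpc(request: dict) -> dict:
    """Answer a single JSON-RPC 2.0 request (simplified subset of MCP)."""
    req_id = request.get("id")
    if request.get("method") != "tools/call":
        return {"jsonrpc": "2.0", "id": req_id,
                "error": {"code": -32601, "message": "Method not found"}}
    params = request.get("params", {})
    tool = TOOLS.get(params.get("name"))
    if tool is None:
        return {"jsonrpc": "2.0", "id": req_id,
                "error": {"code": -32602, "message": "Unknown tool"}}
    text = tool(**params.get("arguments", {}))
    return {"jsonrpc": "2.0", "id": req_id,
            "result": {"content": [{"type": "text", "text": text}]}}


def lambda_handler(event, context):
    """API Gateway proxy integration: the body arrives as a JSON string."""
    body = json.loads(event.get("body") or "{}")
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(handle_rpc(body)),
    }
```

Every invocation rebuilds its answer from the event alone, which is what makes this style a natural fit for Lambda.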

Benefits:

Fast cold starts
Simplified deployment for stateless tools
Integrated with AWS native monitoring

Limitations:

No streaming support
15-minute execution timeout
No persistent state between invocations
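
Because of the 15-minute cap, long-running tool calls are usually written to check the remaining time and stop cleanly. The following is a minimal sketch, assuming the work can be split into independent items. `get_remaining_time_in_millis()` is the method provided by the Lambda context object; `process_items`, `SAFETY_MARGIN_MS`, and `FakeContext` are hypothetical names used for illustration.

```python
SAFETY_MARGIN_MS = 5_000  # assumed buffer so the function returns before the timeout


def process_items(items, context, work=lambda item: item):
    """Process items until done or until the time budget is nearly spent.

    Returns (results, remaining_items) so the caller can re-queue leftovers.
    """
    results = []
    for index, item in enumerate(items):
        if context.get_remaining_time_in_millis() < SAFETY_MARGIN_MS:
            return results, list(items[index:])
        results.append(work(item))
    return results, []


class FakeContext:
    """Local stand-in for the Lambda context; each check consumes 2 seconds."""

    def __init__(self, budget_ms):
        self.budget_ms = budget_ms

    def get_remaining_time_in_millis(self):
        remaining = self.budget_ms
        self.budget_ms -= 2_000
        return remaining
```

Since no state persists between invocations, the leftover items would have to be handed to something external, such as a queue, to be picked up by a later invocation.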

4. Lambda + Web Adapter (Containerized MCP)

Approach: Package MCP within a web framework (FastAPI, Flask, or Express) inside a Lambda Web Adapter container. This enables web-like behavior within Lambda.

Dockerfile:

FROM public.ecr.aws/lambda/python:3.12
COPY app.py requirements.txt ./
RUN pip install -r requirements.txt
CMD ["app.lambda_handler"]

app.py Example:

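from fastmcp import FastMCP
# Assumption: aws_lambda_adapter.api_gateway_handler stands for whatever
# event-to-ASGI bridge you use. The AWS Lambda Web Adapter project itself is
# installed as a Lambda extension and forwards requests to a web server
# listening on a local port.
from aws_lambda_adapter import api_gateway_handler

mcp = FastMCP("web-mcp", stateless_http=True)
# Assumption: mcp.app stands for the server's ASGI application; the exact
# attribute or method name depends on the FastMCP version.
app = mcp.app

def lambda_handler(event, context):
    return api_gateway_handler(app, event, context)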

Deploy via AWS CDK Example:

from aws_cdk import (
    aws_lambda as _lambda,
    aws_apigateway as apigw,
    Stack
)
from constructs import Construct

class WebAdapterStack(Stack):
    def __init__(self, scope, id, **kwargs):
        super().__init__(scope, id, **kwargs)

        fn = _lambda.DockerImageFunction(self, "WebMCPFn",
            code=_lambda.DockerImageCode.from_image_asset("path/to/dockerfile")
        )

        apigw.LambdaRestApi(self, "ApiGateway", handler=fn)

Benefits:

Allows existing web frameworks
Flexible HTTP routing via API Gateway
Serverless, pay-per-use

Limitations:

Added container and adapter complexity
Cold start delays (1–3 seconds)
Still no native streaming support
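
Conceptually, an adapter translates the Lambda event into an ordinary HTTP request for the web app and translates the response back. The sketch below shows that idea with the standard library only, mapping an API Gateway proxy event onto a WSGI call. It is not how the AWS Lambda Web Adapter is implemented; `wsgi_adapter` and `demo_app` are hypothetical names.

```python
import io
import sys


def demo_app(environ, start_response):
    """A tiny WSGI app standing in for a Flask-style framework."""
    body = f"{environ['REQUEST_METHOD']} {environ['PATH_INFO']}".encode()
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [body]


def wsgi_adapter(app, event, context=None):
    """Translate an API Gateway proxy event into a WSGI call and back."""
    raw_body = (event.get("body") or "").encode()
    environ = {
        "REQUEST_METHOD": event.get("httpMethod", "GET"),
        "PATH_INFO": event.get("path", "/"),
        "QUERY_STRING": "",
        "SERVER_NAME": "lambda",
        "SERVER_PORT": "443",
        "SERVER_PROTOCOL": "HTTP/1.1",
        "CONTENT_LENGTH": str(len(raw_body)),
        "wsgi.version": (1, 0),
        "wsgi.url_scheme": "https",
        "wsgi.input": io.BytesIO(raw_body),
        "wsgi.errors": sys.stderr,
        "wsgi.multithread": False,
        "wsgi.multiprocess": False,
        "wsgi.run_once": False,
    }
    captured = {}

    def start_response(status, headers, exc_info=None):
        # Record what the app wants to send so it can go into the Lambda response.
        captured["status"] = status
        captured["headers"] = dict(headers)

    chunks = app(environ, start_response)
    return {
        "statusCode": int(captured["status"].split()[0]),
        "headers": captured["headers"],
        "body": b"".join(chunks).decode(),
    }
```

In this sketch the response is fully buffered before it is returned, which illustrates the streaming limitation listed above.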

5. AWS Fargate (Containerized MCP)

Approach: Fully containerize the MCP server and deploy on AWS Fargate via ECS or EKS. Suitable for agents requiring persistent sessions and streaming².

Dockerfile:

FROM python:3.12-slim
WORKDIR /app
COPY requirements.txt ./
RUN pip install -r requirements.txt
COPY mcp_server.py ./
CMD ["python", "mcp_server.py"]

mcp_server.py Example:

from mcp.server.fastmcp import FastMCP

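# Bind to 0.0.0.0 so the load balancer can reach the server inside the container
# (the default host is 127.0.0.1). Set stateless_http=False if clients need
# persistent sessions.
mcp = FastMCP("fargate-mcp", stateless_http=True, host="0.0.0.0", port=8080)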

@mcp.tool()
def echo(message: str) -> str:
    return message

if __name__ == "__main__":
    mcp.run(transport="streamable-http")

CDK Deployment Example:

from aws_cdk import (
    aws_ecs as ecs,
    aws_ecs_patterns as patterns,
    aws_ecr_assets as assets,
    Stack
)
from constructs import Construct

class FargateStack(Stack):
    def __init__(self, scope, id, **kwargs):
        super().__init__(scope, id, **kwargs)

        docker_image = assets.DockerImageAsset(self, "McpImage",
            directory="path/to/dockerfile"
        )

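        patterns.ApplicationLoadBalancedFargateService(
            self, "FargateMCPService",
            task_image_options=patterns.ApplicationLoadBalancedTaskImageOptions(
                image=ecs.ContainerImage.from_docker_image_asset(docker_image),
                # Must match the port mcp_server.py listens on (the default is 80).
                container_port=8080,
            ),
            desired_count=2,
            public_load_balancer=True
        )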

Benefits:

Full streaming and persistent workloads supported
Scalability with ECS or EKS
Suitable for production-grade deployments

Limitations:

More costly than Lambda for low-usage patterns
Slightly longer deploy cycles
Requires container orchestration setup
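
Streaming over HTTP typically means a long-lived response that emits events as they become available, which is the kind of workload a long-running container handles naturally. Below is a minimal sketch of Server-Sent Events framing, assuming tool output arrives as an iterable of text chunks. `sse_frames` is a hypothetical helper and does not reflect FastMCP's internal implementation.

```python
import json
from typing import Iterable, Iterator


def sse_frames(chunks: Iterable[str], event: str = "message") -> Iterator[str]:
    """Wrap each chunk in a Server-Sent Events frame.

    Each frame carries an id, an event name, and a JSON data line, and is
    terminated by a blank line as the SSE format requires.
    """
    for index, chunk in enumerate(chunks):
        payload = json.dumps({"text": chunk})
        yield f"id: {index}\nevent: {event}\ndata: {payload}\n\n"


if __name__ == "__main__":
    # Frames are produced lazily, so a server can flush each one as soon as
    # the corresponding chunk is ready.
    for frame in sse_frames(["partial", "result"]):
        print(frame, end="")
```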

6. Choosing the Right Model

Use native Lambda for testing, short-lived tasks, and low traffic.
Add Web Adapter when integrating with web apps or frameworks.
Choose Fargate for streaming, persistent workloads, or higher performance needs⁴³.
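
The guidance above can be written down as a small decision function. This is only a sketch of the article's rules of thumb; the function name, its parameters, and the returned labels are illustrative, and the threshold comes from Lambda's 15-minute execution limit.

```python
LAMBDA_MAX_SECONDS = 15 * 60  # Lambda's maximum execution time


def choose_deployment(needs_streaming: bool,
                      needs_persistent_state: bool,
                      max_task_seconds: int,
                      uses_web_framework: bool) -> str:
    """Map workload traits to one of the three deployment models."""
    if (needs_streaming or needs_persistent_state
            or max_task_seconds > LAMBDA_MAX_SECONDS):
        return "fargate"
    if uses_web_framework:
        return "lambda-web-adapter"
    return "lambda-native"
```

For example, a short stateless tool behind an existing FastAPI app maps to `"lambda-web-adapter"`, while anything that streams or runs past 15 minutes maps to `"fargate"`.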

7. Key Considerations

Security & Observability: Lambda and Fargate integrate with X-Ray, CloudWatch, IAM, and OpenTelemetry²³.
Cost & Scaling: Lambda is cost-effective for burst workloads; Fargate favors steady or stream-heavy usage⁴.
Developer Experience: Native Lambda offers the fastest development loop; Fargate supports production parity and long-lived workflows³.
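
To see why burst workloads tend to favor Lambda and steady ones tend to favor Fargate, a rough monthly cost comparison helps. The prices below are placeholder assumptions for illustration only (check the current AWS pricing pages for your region), and the function names are hypothetical.

```python
# Assumed example prices in USD; replace with current figures for your region.
LAMBDA_PER_REQUEST = 0.20 / 1_000_000
LAMBDA_PER_GB_SECOND = 0.0000166667
FARGATE_PER_VCPU_HOUR = 0.04048
FARGATE_PER_GB_HOUR = 0.004445
HOURS_PER_MONTH = 730


def lambda_monthly_cost(requests: int, avg_seconds: float, memory_gb: float) -> float:
    """Pay per request plus per GB-second of compute actually used."""
    compute = requests * avg_seconds * memory_gb * LAMBDA_PER_GB_SECOND
    return requests * LAMBDA_PER_REQUEST + compute


def fargate_monthly_cost(tasks: int, vcpu: float, memory_gb: float) -> float:
    """Pay for every hour the tasks run, whether or not they serve traffic."""
    hourly = vcpu * FARGATE_PER_VCPU_HOUR + memory_gb * FARGATE_PER_GB_HOUR
    return tasks * hourly * HOURS_PER_MONTH
```

Under these assumed prices, 100,000 one-second requests a month at 0.5 GB cost under one dollar on Lambda, while two always-on Fargate tasks with 0.25 vCPU and 0.5 GB cost about 18 USD. At 10 million such requests the Lambda bill exceeds the Fargate one, provided the two tasks can actually absorb that load.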

8. Next Steps

Start with a proof-of-concept using native Lambda + FastMCP.
Expand to include frameworks via Web Adapter for structured web API support.
Move to a containerized MCP + agent deployment on Fargate via Strands’ sample projects¹.

References
