Commit b5cc19a

docs: clarify Streamable HTTP stateless mode semantics and usage
1 parent 5983a65 commit b5cc19a

1 file changed: +146 -1 lines changed

README.md

Lines changed: 146 additions & 1 deletion
@@ -1234,7 +1234,7 @@ Note that `uv run mcp run` or `uv run mcp dev` only supports server using FastMC

### Streamable HTTP Transport

1237-
> **Note**: Streamable HTTP transport is the recommended transport for production deployments. Use `stateless_http=True` and `json_response=True` for optimal scalability.
1237+
> **Note**: Streamable HTTP transport is the recommended transport for production deployments. For serverless and load-balanced environments, consider using `stateless_http=True` and `json_response=True`. See [Understanding Stateless Mode](#understanding-stateless-mode) for guidance on choosing between stateful and stateless operation.

<!-- snippet-source examples/snippets/servers/streamable_config.py -->
```python
# ... snippet continues beyond this diff hunk
```
@@ -1346,6 +1346,151 @@ The streamable HTTP transport supports:
- JSON or SSE response formats
- Better scalability for multi-node deployments

#### Understanding Stateless Mode

The Streamable HTTP transport can operate in two modes: **stateful** (default) and **stateless**. Understanding the difference is important for choosing the right deployment model.

##### What "Stateless" Means

In **stateless mode** (`stateless_http=True`), each HTTP request creates a completely independent MCP session that exists only for the duration of that single request:

- **No session tracking**: No `Mcp-Session-Id` header is used or required
- **Per-request lifecycle**: Each request creates a fresh transport and server session, processes the request, and terminates
- **No state persistence**: No information is retained between requests
- **No event store**: Resumability features are disabled

This is fundamentally different from **stateful mode** (default), where:

- A session persists across multiple requests
- The `Mcp-Session-Id` header links requests to an existing session
- Server state (e.g., subscriptions, context) is maintained between calls
- Event stores can provide resumability if the connection drops

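The contrast between the two modes can be illustrated with a toy model that does not use the MCP SDK at all. The names `SESSIONS`, `handle_stateful`, and `handle_stateless` are hypothetical and exist only for this sketch; the only idea taken from the description above is that stateful handling keeps data keyed by a session ID, while stateless handling retains nothing.

```python
import uuid
from typing import Optional

# Hypothetical in-memory session table; a real server tracks far more than this.
SESSIONS: dict = {}


def handle_stateful(message: str, session_id: Optional[str] = None) -> tuple:
    """Record the message in the session history; return (session_id, history length)."""
    if session_id is None:
        session_id = uuid.uuid4().hex  # stands in for a newly issued Mcp-Session-Id
    history = SESSIONS.setdefault(session_id, [])
    history.append(message)
    return session_id, len(history)


def handle_stateless(message: str) -> int:
    """Build a history for this request only; nothing survives the call."""
    history = [message]
    return len(history)


sid, _ = handle_stateful("first")
_, seen = handle_stateful("second", session_id=sid)
print(seen)  # 2: the session remembers the first message
print(handle_stateless("first"), handle_stateless("second"))  # 1 1: no memory
```

In the stateless handler the second call cannot see the first, which is why any context a tool needs has to arrive with the request itself.
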
##### MCP Features Impacted by Stateless Mode

When running in stateless mode, certain MCP features are unavailable or behave differently:

| Feature | Stateful Mode | Stateless Mode |
|---------|---------------|----------------|
| **Server Notifications** | ✅ Supported | ❌ Not available<sup>1</sup> |
| **Resource Subscriptions** | ✅ Supported | ❌ Not available<sup>1</sup> |
| **Multi-turn Context** | ✅ Maintained | ❌ Lost between requests<sup>2</sup> |
| **Long-running Tools** | ✅ Can use notifications for progress | ⚠️ Must complete within request timeout |
| **Event Resumability** | ✅ With event store | ❌ Not applicable |
| **Tools/Resources/Prompts** | ✅ Fully supported | ✅ Fully supported |
| **Concurrent Requests** | ⚠️ One per session | ✅ Unlimited<sup>3</sup> |

<sup>1</sup> Server-initiated notifications require a persistent connection to deliver updates.
<sup>2</sup> Each request starts fresh; the client must provide all necessary context.
<sup>3</sup> Each request is independent, enabling horizontal scaling.

##### When to Use Stateless Mode

**Stateless mode is ideal for:**

- **Serverless Deployments**: AWS Lambda, Cloud Functions, or similar FaaS platforms where instances are ephemeral
- **Load-Balanced Multi-Node**: Deploying across multiple servers without sticky sessions
- **Stateless APIs**: Services where each request is self-contained (e.g., data lookups, calculations)
- **High Concurrency**: Scenarios requiring many simultaneous independent operations
- **Simplified Operations**: Avoiding session management complexity

**Use stateful mode when:**

- Server needs to push notifications to clients (e.g., progress updates, real-time events)
- Resources require subscriptions with change notifications
- Tools maintain conversation state across multiple turns
- Long-running operations need to report progress asynchronously
- Connection resumability is required

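The criteria above can be written down as a small decision helper. This is only a sketch: `Requirements` and `needs_stateful_mode` are hypothetical names, not part of the SDK, and the fields simply mirror the "use stateful mode when" list.

```python
from dataclasses import dataclass


@dataclass
class Requirements:
    """Hypothetical summary of what a deployment needs from the server."""

    server_notifications: bool = False
    resource_subscriptions: bool = False
    multi_turn_state: bool = False
    resumability: bool = False


def needs_stateful_mode(req: Requirements) -> bool:
    """Return True if any requirement depends on a persistent session."""
    return any(
        (
            req.server_notifications,
            req.resource_subscriptions,
            req.multi_turn_state,
            req.resumability,
        )
    )


# A plain lookup service needs none of these, so stateless mode is a fit.
stateless_http = not needs_stateful_mode(Requirements())
print(stateless_http)
```

The resulting boolean could then be passed as the `stateless_http` argument when constructing the server.
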
##### Example: Stateless Configuration

```python
from mcp.server.fastmcp import FastMCP

# Stateless server - each request is independent
mcp = FastMCP(
    "StatelessAPI",
    stateless_http=True,  # Enable stateless mode
    json_response=True,  # Recommended for stateless
)


@mcp.tool()
def calculate(a: int, b: int, operation: str) -> int:
    """Stateless calculation tool."""
    operations = {"add": a + b, "multiply": a * b}
    return operations[operation]


# Each request will:
# 1. Create a new transport and server session
# 2. Process the calculate tool call
# 3. Return the result
# 4. Tear down the transport and session
```

##### Deployment Patterns

**Pattern 1: Pure Stateless (Recommended)**

```python
from mcp.server.fastmcp import FastMCP

# Best for: Serverless, auto-scaling environments
mcp = FastMCP("MyServer", stateless_http=True, json_response=True)

# Clients can connect to any instance
# Load balancer doesn't need session affinity
```

**Pattern 2: Stateful with Sticky Sessions**

```python
from mcp.server.fastmcp import FastMCP

# Best for: When you need notifications but have load balancing
mcp = FastMCP("MyServer", stateless_http=False)  # Default

# Load balancer must keep each session on one instance (session affinity),
# for example by routing on the Mcp-Session-Id header value.
# Header-based affinity is not available on every load balancer, so check yours.
```

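Session affinity on a header boils down to mapping each session ID to the same backend every time. The sketch below shows that mapping with a plain hash; `BACKENDS` and `pick_backend` are hypothetical names, the addresses are placeholders, and a production load balancer would implement this in its own configuration rather than in application code.

```python
import hashlib
from typing import Sequence

# Placeholder backend addresses for illustration only.
BACKENDS = ("10.0.0.1:8000", "10.0.0.2:8000", "10.0.0.3:8000")


def pick_backend(session_id: str, backends: Sequence[str] = BACKENDS) -> str:
    """Deterministically map an Mcp-Session-Id value to one backend."""
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]


# The same session ID always lands on the same backend.
print(pick_backend("session-abc") == pick_backend("session-abc"))
```

Plain modulo hashing reassigns most sessions when the backend list changes; consistent hashing is the usual way to limit that churn.
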
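**Pattern 3: Hybrid Approach**

```python
import contextlib

from mcp.server.fastmcp import FastMCP
from starlette.applications import Starlette
from starlette.routing import Mount

# Deploy both modes side-by-side
stateless_mcp = FastMCP("StatelessAPI", stateless_http=True)
stateful_mcp = FastMCP("StatefulAPI", stateless_http=False)


@contextlib.asynccontextmanager
async def lifespan(app: Starlette):
    # Mounted servers need their session managers running for the app's lifetime
    async with contextlib.AsyncExitStack() as stack:
        await stack.enter_async_context(stateless_mcp.session_manager.run())
        await stack.enter_async_context(stateful_mcp.session_manager.run())
        yield


app = Starlette(
    routes=[
        Mount("/api/stateless", app=stateless_mcp.streamable_http_app()),
        Mount("/api/stateful", app=stateful_mcp.streamable_http_app()),
    ],
    lifespan=lifespan,
)
```
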
##### Technical Details

**Session Lifecycle in Stateless Mode:**

1. Client sends HTTP POST request to `/mcp` endpoint
2. Server creates ephemeral `StreamableHTTPServerTransport` (no session ID)
3. Server starts a fresh server session on that transport with the `stateless=True` flag
4. Request is processed using the ephemeral transport
5. Response is sent back to client
6. Transport and server session are immediately terminated

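The ordering of these steps, and the guarantee that teardown always runs, can be sketched with a context manager. This is a stand-in, not the SDK's implementation: `ephemeral_transport` and `handle_request` are hypothetical names, and the "processing" is just upper-casing a string.

```python
from contextlib import contextmanager
from typing import Iterator, List


@contextmanager
def ephemeral_transport(log: List[str]) -> Iterator[None]:
    """Hypothetical per-request transport: created on entry, torn down on exit."""
    log.append("transport created")
    try:
        yield
    finally:
        # Runs even if processing raises, so nothing outlives the request.
        log.append("transport terminated")


def handle_request(payload: str, log: List[str]) -> str:
    """Process one request inside its own short-lived transport."""
    with ephemeral_transport(log):
        log.append("request processed")
        response = payload.upper()
        log.append("response sent")
    return response


steps: List[str] = []
handle_request("ping", steps)
print(steps)
```

Because the teardown sits in a `finally` block, a failing request still releases its transport, which is what keeps per-request resources from accumulating.
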
**Performance Characteristics:**

- **Initialization overhead**: Each request pays the cost of server initialization
- **Memory efficiency**: No long-lived sessions consuming memory
- **Scalability**: Excellent horizontal scaling with no state synchronization
- **Latency**: Slightly higher per-request latency due to initialization

**Stateless Mode Checklist:**

When designing for stateless mode, ensure:

- ✅ Tools are self-contained and don't rely on previous calls
- ✅ All required context is passed in each request
- ✅ Tools complete within the request timeout
- ✅ No server notifications or subscriptions are needed
- ✅ Client handles any necessary state management
- ✅ Operations are idempotent where possible

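The first, second, and fifth checklist items amount to moving state from the server to the client. A minimal sketch, using hypothetical names (`add_to_total`, `Client`) and a plain function in place of a registered tool:

```python
def add_to_total(current_total: int, amount: int) -> int:
    """Self-contained tool logic: everything it needs arrives as arguments."""
    return current_total + amount


class Client:
    """Hypothetical client that owns the running total between requests."""

    def __init__(self) -> None:
        self.total = 0

    def add(self, amount: int) -> int:
        # The full context (the current total) is sent with every call.
        self.total = add_to_total(self.total, amount)
        return self.total


client = Client()
client.add(2)
print(client.add(3))  # 5: the client, not the server, carried the total forward
```

Since `add_to_total` depends only on its arguments, repeating a call with the same inputs gives the same result, and any server instance can answer it.
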
#### CORS Configuration for Browser-Based Clients

If you'd like your server to be accessible by browser-based MCP clients, you'll need to configure CORS headers. The `Mcp-Session-Id` header must be exposed for browser clients to access it:
