
FastAPI WebSockets — Real-time Communication

Master full-duplex communication in FastAPI.
🔥 Advanced — solid Python foundation required
In this tutorial, you'll learn
  • Always call await websocket.accept() before sending or receiving — and never call it before validating auth.
  • Wrap the receive loop in try/except WebSocketDisconnect to handle clean client closures — but know that network-level drops never trigger this exception.
  • The while True receive loop is intentional — it keeps the coroutine alive for the connection's lifetime. This is the correct pattern, not a bug.
Quick Answer
  • WebSockets upgrade an HTTP connection to a persistent, full-duplex TCP stream via the WS protocol
  • Use @app.websocket('/ws') and await websocket.accept() to establish the handshake
  • A ConnectionManager pattern maps user IDs to WebSocket objects for targeted routing and broadcast
  • Each open socket consumes a coroutine slot and a file descriptor; 10K connections require tuned ulimits
  • Without try/except WebSocketDisconnect, crashed clients leak server-side file descriptors silently
  • Browsers cannot send custom headers on WS handshake — use signed query parameters or cookies for auth
🚨 START HERE
WebSocket Quick Debug Reference
Rapid diagnostics for WebSocket issues in production FastAPI services
🟡 EMFILE errors — too many open files
Immediate Action: Count open file descriptors for the Uvicorn worker process.
Commands
ls /proc/$(pgrep -f uvicorn)/fd | wc -l
cat /proc/$(pgrep -f uvicorn)/limits | grep 'open files'
Fix Now: Add ws_ping_interval=30 and ws_ping_timeout=10 to uvicorn.run(). Set LimitNOFILE=1000000 in the systemd unit file and restart the service. Verify the new limit took effect by re-running the second command.
🟡 Clients disconnect after exactly 60 seconds of inactivity
Immediate Action: Confirm the reverse proxy read timeout is the culprit.
Commands
grep -r 'proxy_read_timeout' /etc/nginx/
curl -s -o /dev/null -w '%{time_total}' http://localhost:8000/ws/test
Fix Now: Add proxy_read_timeout 3600s and proxy_set_header Upgrade $http_upgrade to the nginx location block serving WebSocket traffic. Enable Uvicorn-level heartbeats so idle connections generate periodic ping/pong traffic that keeps the proxy timer reset.
🟠 High CPU on broadcast to many clients
Immediate Action: Confirm the event loop is blocked in sequential sends.
Commands
py-spy top --pid $(pgrep -f uvicorn)
strace -c -p $(pgrep -f uvicorn) -e trace=write 2>&1 | head -20
Fix Now: Replace the sequential broadcast loop with await asyncio.gather(*[conn.send_text(msg) for conn in connections.values()], return_exceptions=True). The return_exceptions=True prevents a single failed send from cancelling the rest of the gather.
🟡 1006 Abnormal Closure behind load balancer
Immediate Action: Verify WebSocket upgrade headers are being forwarded through the proxy.
Commands
tcpdump -i any -A port 8000 | grep -i upgrade
nginx -T | grep -A5 'location /ws'
Fix Now: Add proxy_set_header Upgrade $http_upgrade and proxy_set_header Connection upgrade to the nginx location block. If using AWS ALB, ensure the target group listener protocol is set to HTTP, and enable sticky sessions if you are not using Redis pub/sub for cross-instance state.
Production Incident
The Silent Leak: 47,000 Zombie WebSockets Exhausted Our File Descriptors
A real-time notification service ran cleanly for 14 hours before workers stopped accepting new connections entirely. The culprit was not a bug in application logic — it was mobile clients silently dropping TCP connections when switching between Wi-Fi and cellular, leaving 47,000 orphaned sockets sitting open on the server with no cleanup path.
Symptom: After roughly 14 hours of uptime, Uvicorn workers stopped accepting both new HTTP and WebSocket connections. Server metrics looked normal — CPU at 12%, memory at 40%, no application exceptions in logs. The first signal was EMFILE (Too many open files) errors appearing in kernel logs, followed by Uvicorn logging 'accept failed' on every new connection attempt. Existing connections were still alive and functioning, which made the diagnosis non-obvious.
Assumption: The team's mental model was straightforward: when a client disconnects, WebSocketDisconnect fires, the except block runs, and the manager removes the entry. That assumption holds perfectly for clients that send a proper WebSocket close frame. It completely breaks for network-level drops where no TCP FIN or RST packet ever reaches the server — in that case, from the server's perspective, the connection is still alive and the event loop never wakes up for it.
Root cause: Mobile clients transitioning between Wi-Fi and cellular networks drop the underlying TCP connection at the radio layer. No FIN, no RST, no WebSocket close frame — the packets simply stop arriving. The server's asyncio event loop, built on epoll, only wakes up for a socket when data or a close event arrives on the file descriptor. With no event to trigger, the coroutine sits parked on await websocket.receive_text() indefinitely. The ConnectionManager kept every one of these WebSocket objects in its active_connections dict. Each held an open file descriptor. Over 14 hours across thousands of mobile users doing normal things — opening the app, locking their phone, switching networks — 47,000 of these zombie entries accumulated. The default per-process file descriptor limit on the deployment was 65,535. Once that ceiling was hit, the OS refused to open new file descriptors for incoming connections, making the server appear completely unresponsive to new traffic while still serving existing live connections.
Fix: The immediate fix was a server-side heartbeat: every 30 seconds the server sends a WebSocket ping frame to each connection. If no pong arrives within 10 seconds, the connection is explicitly closed and removed from the manager. This bounds the zombie accumulation window to at most 40 seconds per dead connection regardless of network behavior. The broader fixes were: raising LimitNOFILE to 1,000,000 in the systemd unit file; adding open file descriptor count (reading /proc/<pid>/fd) to the health endpoint so the monitoring dashboard would catch growth trends before they became incidents; and setting nginx proxy_read_timeout 3600s as a safety net at the reverse proxy layer.
Key Lesson
  • WebSocketDisconnect only fires on a clean close frame — network-level TCP drops are completely invisible to the application layer and leave sockets open indefinitely
  • Implement server-side ping/pong heartbeats to detect dead connections within a bounded, predictable time window regardless of client behavior
  • Monitor open file descriptor count with ls /proc/<pid>/fd | wc -l as a leading indicator — it trends upward hours before the process becomes unresponsive
  • Set proxy_read_timeout at the reverse proxy level as a defense-in-depth safety net — it catches anything your application-level heartbeat missed
  • Raise LimitNOFILE to at least 1,000,000 in systemd before the service ever goes to production — the default is inadequate for any real WebSocket workload
Production Debug Guide
From symptom to resolution for common WebSocket production issues
Workers stop accepting new connections after hours of uptime
Check open file descriptors immediately: ls /proc/<pid>/fd | wc -l. If the count is near the ulimit ceiling, zombie WebSocket connections are accumulating — the application has no heartbeat and dead sockets are never cleaned up. Implement Uvicorn-level ping/pong with ws_ping_interval=30 and ws_ping_timeout=10. Raise LimitNOFILE in the systemd unit. Add fd count to your health endpoint so you see the trend before it becomes an outage.
Clients report random 1006 (Abnormal Closure) disconnects
This close code means the connection was terminated at the TCP layer without a WebSocket close frame — almost always a proxy timeout. Check nginx: grep -r 'proxy_read_timeout' /etc/nginx/. The default is 60 seconds — any connection idle for 60 seconds gets cut. Set proxy_read_timeout 3600s for WebSocket locations and enable server-side heartbeats so idle connections generate periodic traffic that resets the proxy timer.
Broadcast latency increases linearly with connection count
The broadcast loop is awaiting each send serially. With 1,000 connections each taking ~100 microseconds per send, a single broadcast takes 100ms and blocks the event loop for everything else. Replace the for loop with asyncio.gather(*[conn.send_text(msg) for conn in connections.values()], return_exceptions=True). For more than 5,000 connections or cross-instance delivery requirements, move to Redis pub/sub where each worker only fans out to its own local connection set.
Memory usage grows steadily even with a stable connection count
The ConnectionManager is almost certainly accumulating per-connection state — message history, event logs, or unbounded buffers attached to each WebSocket entry. Profile with tracemalloc: import tracemalloc; tracemalloc.start() then snapshot periodically. The manager should store only the WebSocket reference, nothing else. Message history belongs in Redis or a database, not in-process memory.
WebSocket handshake fails with 403 behind a load balancer
The load balancer is not forwarding the Upgrade and Connection headers, so the backend sees an ordinary HTTP request and rejects it. On nginx, add proxy_set_header Upgrade $http_upgrade and proxy_set_header Connection upgrade inside the location block. On AWS ALB, verify the target group protocol is HTTP (not HTTPS) and that the listener rule supports WebSocket — ALB supports WebSocket natively but only when the target group is configured correctly.
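The tracemalloc workflow from the memory-growth entry above can be sketched in isolation; the `history` list here is a stand-in for the kind of unbounded per-connection state a leaky manager accumulates (the names and sizes are illustrative, not from the incident):

```python
import tracemalloc

tracemalloc.start()

# Simulate a manager that wrongly keeps message history in process memory
history: list[str] = []
for i in range(50_000):
    history.append(f"message {i}")

# Snapshot and rank allocations by source line; the history strings dominate
snapshot = tracemalloc.take_snapshot()
top_stat = snapshot.statistics("lineno")[0]
print(top_stat)
```

Taking a snapshot like this periodically from a debug endpoint and diffing with `snapshot.compare_to()` shows which line is growing between snapshots.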

Stateful Connection Management

In a production environment you rarely deal with a single socket in isolation. The moment you have more than one user you need a centralized structure that tracks active sessions, routes messages to specific clients, and handles lifecycle transitions cleanly. The ConnectionManager pattern is that structure.

The ForgeSocketManager below stores WebSocket objects in a dictionary keyed by user ID. This makes targeted messaging (send_personal_message) a dictionary lookup, and broadcast a single pass over the values. The critical invariant is that every connect() call must have a corresponding disconnect() call — otherwise the dictionary grows unbounded and you accumulate the exact kind of zombie entries described in the incident above.

The try/except WebSocketDisconnect block in the endpoint is what guarantees the disconnect() call happens for clean client closures. For network-level drops where no close frame arrives, you need the heartbeat mechanism covered later in this guide — the try/except block alone is not sufficient.

io/thecodeforge/realtime/connection_manager.py · PYTHON
from fastapi import FastAPI, WebSocket, WebSocketDisconnect
from typing import Dict
import asyncio

app = FastAPI()


class ForgeSocketManager:
    """Manages active WebSocket connections keyed by user ID.

    The dictionary is the single source of truth for who is currently
    connected. Every code path that touches it must respect the
    connect/disconnect contract — no exceptions.
    """

    def __init__(self):
        # Keyed by user_id so we can route messages without iterating everything
        self.active_connections: Dict[str, WebSocket] = {}

    async def connect(self, user_id: str, websocket: WebSocket) -> None:
        await websocket.accept()
        self.active_connections[user_id] = websocket

    def disconnect(self, user_id: str) -> None:
        # .pop() with a default is safer than del — avoids KeyError on double-disconnect
        self.active_connections.pop(user_id, None)

    async def send_personal_message(self, message: str, user_id: str) -> None:
        ws = self.active_connections.get(user_id)
        if ws is not None:
            await ws.send_text(message)

    async def broadcast(self, message: str) -> None:
        # asyncio.gather fans out all sends concurrently instead of awaiting each one
        # return_exceptions=True prevents one failed send from cancelling the rest
        await asyncio.gather(
            *[ws.send_text(message) for ws in self.active_connections.values()],
            return_exceptions=True,
        )


manager = ForgeSocketManager()


@app.websocket("/ws/{user_id}")
async def websocket_endpoint(websocket: WebSocket, user_id: str):
    await manager.connect(user_id, websocket)
    try:
        while True:
            data = await websocket.receive_text()
            await manager.broadcast(f"User {user_id}: {data}")
    except WebSocketDisconnect:
        # Clean close frame received — remove from manager and notify others
        manager.disconnect(user_id)
        await manager.broadcast(f"User {user_id} has left the forge.")
▶ Output
Broadcasting enabled: messages are fanned out concurrently to all connected users via asyncio.gather.
Mental Model
Connection Manager as a Phone Book
The ConnectionManager is a phone book — it maps an identity (user_id) to a live connection (WebSocket). Every lookup must handle the case where the entry no longer exists, and every registration must eventually be removed.
  • connect() = add a new entry after the handshake completes — not before
  • disconnect() = remove the entry when the call ends — skip this and you have ghost listings that accumulate forever
  • send_personal_message() = look up a number and call it directly — .get() with a None check handles the case where the user already left
  • broadcast() with asyncio.gather() = conference call to everyone simultaneously — sequential await is O(n) latency which blocks the event loop
  • The try/except WebSocketDisconnect is the hang-up detector for clean closes — heartbeats handle the network-level drops that never fire this exception
📊 Production Insight
The original broadcast() pattern in most tutorials uses a sequential for loop — await each send one at a time. With 100 connections that is imperceptible. With 1,000 connections at 100 microseconds per send you are blocking the event loop for 100ms on every broadcast. asyncio.gather() makes all the sends concurrent within the same event loop tick, which keeps broadcast latency flat regardless of connection count. Use return_exceptions=True or a single failed send will cancel the entire gather and silence everyone else.
🎯 Key Takeaway
Every connect() must pair with a disconnect() — the try/except block covers clean closes, heartbeats cover silent drops.
asyncio.gather() for broadcast is not a micro-optimisation — it is the difference between broadcast latency that grows linearly with connection count and latency that stays roughly flat at scale.
The manager dict is the single source of truth for active sessions — never bypass it.

Securing the Handshake

WebSockets start life as an HTTP request, but the browser's WebSocket API does not allow you to set custom HTTP headers like Authorization: Bearer <token> on that initial request. This is a deliberate browser security restriction, not a FastAPI limitation. The two workable alternatives are signed query parameters and HttpOnly cookies set during a prior HTTP login flow.

Query parameters are the most common choice for API clients and native apps. Cookies are preferable for browser-based applications because they are never visible in access logs or browser history. The code below uses a query parameter because it is easier to demonstrate, but the authentication logic is identical for cookies — you just read from request.cookies instead of a query string.

The single rule that matters most: validate before you call accept(). Once accept() is called the handshake is complete, the connection is established, and the client is inside your system. Closing immediately after an invalid accept() is not equivalent to rejecting it — between accept() and close() the client may have already received a broadcast message or had its user ID added to the manager.

io/thecodeforge/security/socket_auth.py · PYTHON
from fastapi import FastAPI, WebSocket, WebSocketDisconnect, Query, status
from jose import JWTError, jwt
from datetime import datetime, timezone

app = FastAPI()

SECRET_KEY = "your-secret-key"  # Load from environment in production
ALGORITHM = "HS256"


def decode_ws_token(token: str) -> dict | None:
    """Returns the decoded payload or None if the token is invalid or expired."""
    try:
        payload = jwt.decode(token, SECRET_KEY, algorithms=[ALGORITHM])
        # Verify expiry explicitly — some JWT libraries are lenient about this
        if payload.get("exp") and datetime.now(timezone.utc).timestamp() > payload["exp"]:
            return None
        return payload
    except JWTError:
        return None


@app.websocket("/secure-ws")
async def secure_socket(websocket: WebSocket, token: str = Query(...)):
    # Validate BEFORE accept() — this is the hard rule
    payload = decode_ws_token(token)
    if payload is None:
        # 1008 = Policy Violation — the correct code for auth failure
        # 1000 = Normal Closure — wrong signal; implies success
        await websocket.close(code=status.WS_1008_POLICY_VIOLATION)
        return

    user_id: str = payload.get("sub")
    await websocket.accept()
    await websocket.send_text(f"Authenticated. Welcome, {user_id}.")

    try:
        while True:
            data = await websocket.receive_text()
            await websocket.send_text(f"Echo: {data}")
    except WebSocketDisconnect:
        pass  # Clean up handled by ConnectionManager in the full implementation
▶ Output
Unauthorized connections are rejected before the accept() handshake. Authenticated users receive a welcome message with their user ID from the JWT payload.
⚠ Never Accept Before Validating
Calling websocket.accept() before token validation creates a window where an unauthenticated client is fully connected. Even if you close immediately after, they have already been registered in the event loop as a live connection, may have received a broadcast message that went out between accept() and close(), and their connection attempt has consumed a file descriptor. For JWT tokens, decode and verify the signature, expiry, and audience claim before accepting. For cookies, read and validate before accepting. The order is non-negotiable.
📊 Production Insight
Query parameters containing tokens appear verbatim in nginx access logs, application logs, and any proxy sitting between the client and server. A long-lived API key in a query parameter is a credential leak waiting to happen. In production, use short-lived JWTs with a 60-second expiry — the client obtains a fresh token via a normal HTTP endpoint before opening each WebSocket connection. The token is useless by the time it appears in a log. Alternatively, use HttpOnly cookies set during the HTTP login flow — they are sent automatically with the WebSocket upgrade request and never appear in logs.
🎯 Key Takeaway
Browsers cannot send custom Authorization headers on WebSocket handshake — query params or cookies are your only options.
Validate the token BEFORE calling accept(): accept() is the point of no return.
Close with code 1008 (Policy Violation) for auth failures — not 1000 (Normal Closure) which signals success.
Use short-lived tokens (60s expiry) or HttpOnly cookies to keep credentials out of access logs.

Scaling Broadcasts with Redis Pub/Sub

A WebSocket connection is bound to the specific OS process that accepted it. The ConnectionManager on that process has no visibility into connections on any other process. When you run multiple Uvicorn workers (--workers 4) or deploy multiple instances behind a load balancer, each worker has its own isolated ConnectionManager. Calling broadcast() on Worker A silently drops all messages destined for clients connected to Workers B, C, and D.

The standard fix is a pub/sub broker as a shared nervous system. Each FastAPI instance subscribes to a Redis channel on startup. When any instance needs to broadcast, it publishes to Redis. Redis delivers the message to every subscribed instance simultaneously, and each instance fans it out to its own local connections. The originating instance doesn't need to know where any given client is connected — Redis handles the routing.

Two things to understand about Redis pub/sub before you commit to it: first, it is fire-and-forget. If an instance is temporarily disconnected from Redis when a message is published, that message is gone — no replay, no delivery guarantee. Second, if a client is not currently connected (offline), the message is lost entirely because there is no storage layer. For systems where offline clients must receive missed messages on reconnect, replace pub/sub with Redis Streams (XADD/XREAD with consumer groups) or a proper message queue. The latency trade-off is real — Streams add microseconds of persistence overhead, but for notification systems that matters less than you might think.

io/thecodeforge/realtime/redis_broadcast.py · PYTHON
import asyncio
import redis.asyncio as aioredis
from fastapi import FastAPI, WebSocket, WebSocketDisconnect
from contextlib import asynccontextmanager


local_connections: dict[str, WebSocket] = {}
CHANNEL = "forge:broadcast"
redis: aioredis.Redis | None = None


async def redis_listener(r: aioredis.Redis) -> None:
    """Long-running coroutine: subscribes to Redis and forwards to local clients."""
    pubsub = r.pubsub()
    await pubsub.subscribe(CHANNEL)
    async for message in pubsub.listen():
        if message["type"] != "message":
            continue
        data: str = message["data"].decode()
        # Snapshot the dict so results can be paired back to user IDs without
        # mutating it mid-iteration (which raises RuntimeError)
        snapshot = list(local_connections.items())
        results = await asyncio.gather(
            *[ws.send_text(data) for _, ws in snapshot],
            return_exceptions=True,
        )
        # A send that raised means that connection is dead; drop it
        stale = [
            uid
            for (uid, _), res in zip(snapshot, results)
            if isinstance(res, Exception)
        ]
        for uid in stale:
            local_connections.pop(uid, None)


@asynccontextmanager
async def lifespan(app: FastAPI):
    global redis
    redis = aioredis.from_url("redis://localhost:6379", decode_responses=False)
    # Start the listener as a background task tied to the worker's lifespan
    listener_task = asyncio.create_task(redis_listener(redis))
    yield
    # Graceful shutdown: cancel the listener, wait for it to exit, then
    # close the Redis connection
    listener_task.cancel()
    try:
        await listener_task
    except asyncio.CancelledError:
        pass
    await redis.aclose()


app = FastAPI(lifespan=lifespan)


@app.websocket("/ws/{user_id}")
async def ws_endpoint(websocket: WebSocket, user_id: str):
    await websocket.accept()
    local_connections[user_id] = websocket
    try:
        while True:
            data = await websocket.receive_text()
            # Publish to Redis — all instances (including this one) receive it
            await redis.publish(CHANNEL, f"{user_id}: {data}")
    except WebSocketDisconnect:
        local_connections.pop(user_id, None)
▶ Output
Messages published by any worker instance are delivered to all local connection sets across the cluster via Redis pub/sub.
Mental Model
Redis as the Shared Nervous System
Each FastAPI instance is an independent limb — it can move and act on its own, but it only feels what touches it directly. Redis pub/sub is the spinal cord that carries the same signal to every limb simultaneously.
  • Each worker subscribes to the same Redis channel on startup via the lifespan context manager
  • Publishing to Redis delivers to every subscribed instance in microseconds regardless of where the publisher is
  • Each instance fans out to its own local connection set — it never touches another instance's connections directly
  • Pub/sub is ephemeral — an instance that loses Redis connectivity during a publish misses that message entirely
  • For offline delivery or message replay, use Redis Streams (XADD/XREAD) with a consumer group instead of pub/sub
  • Note: the lifespan context manager replaces the deprecated @app.on_event('startup') pattern as of FastAPI 0.93
📊 Production Insight
A common mistake is running four Uvicorn workers and testing locally with a single client — everything appears to work perfectly because all connections happen to land on the same worker. The bug only surfaces when you scale out and clients on different workers stop seeing each other's messages. The rule is simple: if you have more than one worker process or more than one server instance, Redis pub/sub is not optional. Add it before you scale, not after you debug the symptom.
🎯 Key Takeaway
WebSocket connections are local to a single process — broadcast without Redis only reaches that process's clients.
Redis pub/sub decouples broadcast from server topology — publish once, every instance delivers to its own connections.
Pub/sub is fire-and-forget — use Redis Streams with consumer groups for durable, replayable delivery.
Use the lifespan context manager for Redis setup and teardown — @app.on_event is deprecated.

Heartbeat and Liveness Detection

TCP connections can become half-open when the underlying network disappears without sending a FIN or RST packet. This is not an edge case — it is routine. Mobile clients switching between Wi-Fi and cellular, corporate firewalls with idle connection timeouts, cloud NAT gateways that evict long-lived sessions, and VPNs reconnecting all produce this behavior. The network layer drops the connection; the application layer never finds out.

The WebSocket protocol addresses this with ping and pong control frames. The server sends a ping frame; the client must respond with a pong. If the pong never arrives, the connection is dead and should be closed and removed from the manager.

In FastAPI with Uvicorn, you have two options. The first is Uvicorn-level heartbeat: set ws_ping_interval and ws_ping_timeout in uvicorn.run() and Uvicorn handles everything transparently using native WebSocket ping frames. This is the preferred approach — it requires no application code and covers every endpoint automatically. The second option is application-level heartbeat: your server sends a JSON message like {"type": "ping"} on a timer and expects {"type": "pong"} back. This is useful when you need application-aware liveness logic (for example, detecting an authenticated session that has gone stale vs. a dead TCP connection).
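A hedged sketch of the Uvicorn-level option — ws_ping_interval and ws_ping_timeout are real Uvicorn settings, but the module path "app:app" is a placeholder for your own application:

```python
import uvicorn

# Native WebSocket ping frames: send a ping every 30 s and close the
# connection if no pong arrives within 10 s of the ping
uvicorn.run(
    "app:app",
    host="0.0.0.0",
    port=8000,
    ws_ping_interval=30,
    ws_ping_timeout=10,
)
```

The same settings are available as CLI flags (--ws-ping-interval, --ws-ping-timeout) when launching Uvicorn from the shell.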

The heartbeat interval is a real trade-off. A 30-second interval with a 10-second response window means dead connections are detected within 30-40 seconds. The overhead is roughly 20 bytes per connection per 30 seconds — for 10,000 connections that is about 6.5 KB/s of heartbeat traffic, which is negligible. Going shorter than 15 seconds starts to add up at very high connection counts and increases pressure on the Redis pub/sub channel if you are propagating liveness events.
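The back-of-the-envelope figure above checks out:

```python
connections = 10_000
ping_frame_bytes = 20   # rough wire size of a ping control frame
interval_s = 30

# Total heartbeat traffic across all connections
bytes_per_second = connections * ping_frame_bytes / interval_s
print(f"{bytes_per_second / 1024:.1f} KiB/s")  # ≈ 6.5 KiB/s
```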

io/thecodeforge/realtime/heartbeat.py · PYTHON
import asyncio
from fastapi import FastAPI, WebSocket, WebSocketDisconnect
from typing import Dict

app = FastAPI()


class HeartbeatManager:
    """Connection manager with per-connection heartbeat tasks.

    Each connection gets an independent heartbeat coroutine. If the client
    fails to respond to a ping within the timeout window, the connection is
    closed and removed from the active set.
    """

    def __init__(self, interval: int = 30, timeout: int = 10):
        self.connections: Dict[str, WebSocket] = {}
        self._last_pong: Dict[str, float] = {}
        self._tasks: Dict[str, asyncio.Task] = {}
        self.interval = interval
        self.timeout = timeout

    async def connect(self, user_id: str, websocket: WebSocket) -> None:
        await websocket.accept()
        self.connections[user_id] = websocket
        self._last_pong[user_id] = asyncio.get_running_loop().time()
        # Hold a reference to the task; bare create_task() results can be
        # garbage-collected before they finish
        self._tasks[user_id] = asyncio.create_task(
            self._heartbeat_loop(user_id, websocket)
        )

    def record_pong(self, user_id: str) -> None:
        """Called when the client sends a pong-type message."""
        self._last_pong[user_id] = asyncio.get_running_loop().time()

    def disconnect(self, user_id: str) -> None:
        self.connections.pop(user_id, None)
        self._last_pong.pop(user_id, None)
        task = self._tasks.pop(user_id, None)
        if task is not None and task is not asyncio.current_task():
            task.cancel()

    async def _heartbeat_loop(self, user_id: str, ws: WebSocket) -> None:
        while user_id in self.connections:
            await asyncio.sleep(self.interval)
            if user_id not in self.connections:
                break
            # Check when we last heard from this client
            elapsed = asyncio.get_running_loop().time() - self._last_pong.get(user_id, 0)
            if elapsed > self.interval + self.timeout:
                # Client has not responded within the timeout window — treat as dead
                try:
                    await ws.close(code=1001)  # 1001 = Going Away
                except Exception:
                    pass  # Socket may already be gone at the network layer
                self.disconnect(user_id)
                break
            try:
                await ws.send_json({"type": "ping"})
            except Exception:
                # Send failed — connection is already dead at the network layer
                self.disconnect(user_id)
                break


manager = HeartbeatManager(interval=30, timeout=10)


@app.websocket("/ws/{user_id}")
async def ws_endpoint(websocket: WebSocket, user_id: str):
    await manager.connect(user_id, websocket)
    try:
        while True:
            data = await websocket.receive_json()
            if data.get("type") == "pong":
                manager.record_pong(user_id)
                continue  # Heartbeat response — no further processing needed
            # Handle actual application messages here
            await websocket.send_json({"echo": data})
    except WebSocketDisconnect:
        manager.disconnect(user_id)
▶ Output
Dead connections are detected within 30-40 seconds and cleaned up. The last_pong timestamp approach handles slow or intermittent clients gracefully without false-positive disconnects.
⚠ Zombie Connections Drain Resources Without Any Warning
A half-open WebSocket holds a file descriptor, keeps a coroutine parked in the event loop, and occupies a slot in the ConnectionManager dictionary. None of these resources are released until something explicitly closes the connection. No error is logged. No metric spikes. The process just slowly accumulates dead weight until it hits the file descriptor ceiling and stops accepting any new connections — HTTP or WebSocket. By the time the symptom is visible, the damage has been building for hours. Heartbeats are the only reliable way to detect and reclaim these resources within a bounded time window.
📊 Production Insight
If you are using Uvicorn directly, ws_ping_interval and ws_ping_timeout in uvicorn.run() give you native WebSocket-level ping frames with no application code required. This is preferable to an application-level heartbeat because it uses the actual WebSocket control frame mechanism, works transparently with browser WebSocket implementations, and doesn't require the client to implement any JSON message protocol. Reserve application-level heartbeats for cases where you need to carry additional liveness metadata — for example, sending the server's current timestamp so the client can detect clock drift.
🎯 Key Takeaway
TCP drops without FIN/RST create half-open sockets that never fire WebSocketDisconnect — they are completely invisible to the application layer.
Uvicorn ws_ping_interval + ws_ping_timeout is the simplest heartbeat implementation — prefer it over application-level ping/pong unless you need custom liveness metadata.
30-second interval with 10-second timeout means dead sockets are detected and reclaimed in at most 40 seconds.

Concurrency Limits and Worker Tuning

A WebSocket coroutine lives for the entire duration of the connection — minutes, hours, sometimes days. This is fundamentally different from HTTP request handlers, which complete in milliseconds and release all their resources. You need to think about capacity differently.

The good news is that asyncio coroutines are cheap. Each one uses roughly 4-8 KB of memory on the stack. 10,000 concurrent WebSocket connections consume about 40-80 MB of coroutine stack space — manageable on any modern server. The limiting factor is almost never memory.

The actual bottleneck is file descriptors. Every open WebSocket holds one OS file descriptor for its TCP socket. The default ulimit on most Linux distributions is 1,024. On some distributions and container environments it is 65,535. Neither is adequate for a production WebSocket server expecting more than a few thousand concurrent connections. You need to raise this limit before the service goes live, not after the first EMFILE incident.

For CPU-bound work inside WebSocket handlers — JWT verification, JSON schema validation, image processing — asyncio's single-threaded model becomes the bottleneck. The event loop cannot proceed with other coroutines while a synchronous CPU-bound function is running. The solution is either run_in_executor to offload to a thread pool, or multiple Uvicorn workers (--workers N, one per CPU core) so each worker has its own event loop. With multiple workers you must have Redis pub/sub in place — without it, broadcast stops working correctly the moment you have more than one worker.
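A sketch of the run_in_executor pattern described above. The CPU-bound work is simulated here with repeated hashing, standing in for real JWT verification; the function names and workload are illustrative only:

```python
import asyncio
import hashlib

def verify_token_cpu_bound(token: str) -> bool:
    # Placeholder for CPU-heavy work such as JWT signature verification,
    # simulated with repeated SHA-256 hashing.
    digest = token.encode()
    for _ in range(100_000):
        digest = hashlib.sha256(digest).digest()
    return len(digest) == 32

async def handle_message(token: str) -> bool:
    loop = asyncio.get_running_loop()
    # Offload to the default ThreadPoolExecutor so other coroutines
    # (heartbeats, other sockets) keep running while this computes.
    return await loop.run_in_executor(None, verify_token_cpu_bound, token)

print(asyncio.run(handle_message("example-token")))  # True
```

Inside a WebSocket handler, the `await loop.run_in_executor(...)` call yields control back to the event loop for the duration of the computation, which is exactly what a plain synchronous call would prevent.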

uvloop is a drop-in replacement for Python's built-in asyncio event loop implemented in Cython on top of libuv. Benchmarks consistently show 2-4x throughput improvement for I/O-bound workloads, which covers the overwhelming majority of WebSocket use cases. It is a one-line change and there is no reason not to use it in production.

io/thecodeforge/config/uvicorn_ws.py · PYTHON
import uvicorn

# Production WebSocket server configuration
# Run this with: python -m io.thecodeforge.config.uvicorn_ws
# Or directly: uvicorn app:app --workers 4 --loop uvloop

if __name__ == "__main__":
    uvicorn.run(
        "io.thecodeforge.realtime.connection_manager:app",
        host="0.0.0.0",
        port=8000,
        workers=4,                # One per physical CPU core for CPU-bound handler work
        loop="uvloop",            # 2-4x faster than default asyncio on I/O-bound workloads
        ws="websockets",          # Pin the websockets library explicitly (faster than wsproto)
        ws_ping_interval=30,      # Send native WS ping frame every 30 seconds
        ws_ping_timeout=10,       # Close connection if no pong within 10 seconds
        limit_concurrency=10000,  # Reject new connections beyond this per worker
        limit_max_requests=50000, # Restart worker after N requests — guards against slow leaks
        timeout_keep_alive=5,     # HTTP keep-alive timeout (not WebSocket — separate setting)
        access_log=True,          # Log WS upgrade requests alongside HTTP — useful for debugging
    )
▶ Output
Server starts with 4 workers, native 30s/10s ping/pong heartbeat, 10K per-worker connection limit, and uvloop for maximum throughput.
Mental Model
Capacity Planning: File Descriptors First, Memory Second
When planning WebSocket server capacity, flip the instinct to think about RAM first. File descriptors run out before memory does. Plan for the file descriptor ceiling and everything else follows.
  • Each WebSocket coroutine uses ~4-8 KB of stack memory — 10K connections is roughly 40-80 MB, well within budget
  • Each connection holds exactly one OS file descriptor — this is the real ceiling and it defaults to 1,024 or 65,535
  • Set LimitNOFILE=1000000 in the systemd unit file — not ulimit in a shell script, which doesn't persist across restarts
  • In Docker, set --ulimit nofile=1000000:1000000 on the container or adjust in docker-compose.yml
  • ws_ping_interval and ws_ping_timeout at the Uvicorn level handle heartbeat transparently — you don't need application-level ping/pong unless you need custom liveness metadata
  • limit_max_requests is a safety valve against slow memory leaks — the worker restarts after N requests, Uvicorn handles graceful handoff
📊 Production Insight
A common misconfiguration is setting ulimit in a shell script that runs before the service starts. Shell ulimit changes only apply to the shell and its child processes in that session. If systemd starts the service, the shell's ulimit has no effect — the process inherits systemd's default limits. Always set LimitNOFILE in the [Service] section of the systemd unit file, or in the container's runtime configuration. Verify it actually took effect with cat /proc/<pid>/limits | grep 'open files' after the service starts.
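As a concrete sketch of that guidance (the unit file path, service name, and image name are placeholders, not values from this article):

```shell
# systemd: add to the [Service] section of the unit file, e.g.
# /etc/systemd/system/ws-gateway.service, then reload and restart:
#   [Service]
#   LimitNOFILE=1000000
sudo systemctl daemon-reload && sudo systemctl restart ws-gateway

# Docker: set the same limit on the container itself
docker run --ulimit nofile=1000000:1000000 -p 8000:8000 my-ws-image

# Verify the running Uvicorn process actually picked up the new limit
cat /proc/$(pgrep -f uvicorn)/limits | grep 'open files'
```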
🎯 Key Takeaway
Each WebSocket holds a coroutine and a file descriptor for its entire lifetime — unlike HTTP which releases both in milliseconds.
File descriptors are the capacity bottleneck — set LimitNOFILE=1000000 in systemd or container config before going to production.
Uvicorn's ws_ping_interval + ws_ping_timeout replaces manual heartbeat code — it is the simplest and most reliable approach.
Multiple workers require Redis pub/sub — without it, broadcast is silently broken for any client not on the same worker.
🗂 Real-time Communication Protocols in FastAPI
Choosing the right protocol for your use case — the choice matters more than most tutorials suggest
  • WebSocket (WS/WSS) — Direction: full-duplex bidirectional. Connection model: persistent TCP connection after HTTP upgrade handshake. Best for: chat, live dashboards, collaborative editing, multiplayer gaming — anything requiring low-latency bidirectional communication.
  • Server-Sent Events (SSE) — Direction: server to client only. Connection model: persistent HTTP connection with text/event-stream MIME type — client cannot send back over the same connection. Best for: push notifications, live feeds, stock tickers, build log streaming — one-way server push with automatic reconnect built into the browser.
  • HTTP Long Polling — Direction: client-initiated, server-deferred response. Connection model: repeated HTTP requests held open until data is available or timeout, then immediately re-issued by the client. Best for: compatibility fallback for environments where WebSocket or SSE is blocked (some corporate proxies, older CDNs).
  • HTTP Short Polling — Direction: client-initiated request/response. Connection model: periodic HTTP GET at fixed intervals — standard request/response, connection closes after each response. Best for: low-frequency updates where latency above 5 seconds is acceptable and implementation simplicity outweighs efficiency.
  • gRPC Streaming — Direction: bidirectional (client, server, or both directions independently). Connection model: persistent HTTP/2 connection with binary protobuf framing — supports client streaming, server streaming, and bidirectional streaming. Best for: microservice-to-microservice communication, high-throughput binary streams, mobile clients where bandwidth efficiency matters.

🎯 Key Takeaways

  • Always call await websocket.accept() before sending or receiving — and never call it before validating auth.
  • Wrap the receive loop in try/except WebSocketDisconnect to handle clean client closures — but know that network-level drops never trigger this exception.
  • The while True receive loop is intentional — it keeps the coroutine alive for the connection's lifetime. This is the correct pattern, not a bug.
  • Server-side heartbeat via ws_ping_interval and ws_ping_timeout in Uvicorn is mandatory — without it, half-open sockets accumulate until EMFILE kills the worker.
  • Fan out broadcast with asyncio.gather() and return_exceptions=True — sequential await on 1K+ connections blocks the event loop and causes delivery jitter.
  • File descriptors, not memory, are the capacity bottleneck — set LimitNOFILE=1000000 in systemd or container config before going to production.
  • WebSocket connections are local to a single process — any multi-worker or multi-instance deployment requires Redis pub/sub for broadcast to work correctly.
  • Use the lifespan context manager for Redis and background task setup — @app.on_event('startup') is deprecated as of FastAPI 0.93.

⚠ Common Mistakes to Avoid

    Calling websocket.accept() before validating the auth token
    Symptom

    Unauthenticated clients briefly connect, get registered in the ConnectionManager, and may receive broadcast messages sent between accept() and close(). Security audits flag the endpoint as having an authentication bypass window. The issue is non-deterministic and hard to reproduce in testing because the window is small.

    Fix

    Validate the token (from query parameter or cookie) before any call to websocket.accept(). If validation fails, call await websocket.close(code=1008) and return immediately. The 1008 close code signals Policy Violation to the client — use it instead of 1000 (Normal Closure) which implies success.

    Using a sequential for loop in broadcast() without asyncio.gather()
    Symptom

    Broadcast latency scales linearly with connection count. The event loop is blocked for the entire duration of the broadcast, which delays all other coroutines including incoming messages and heartbeats. With 1,000 connections the block is ~100ms; with 10,000 connections it stretches into seconds, causing visible message delivery jitter and missed heartbeat responses.

    Fix

    Replace the sequential loop with: await asyncio.gather(*[conn.send_text(msg) for conn in connections.values()], return_exceptions=True). The return_exceptions=True is critical — without it, a single failed send (dead connection) raises an exception that cancels the entire gather, silencing everyone else.
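A runnable sketch of that fix, using a stand-in connection class so the behavior is visible without a live server (FakeConnection and its fields are illustrative only):

```python
import asyncio

class FakeConnection:
    """Stand-in for a WebSocket — real code would hold Starlette WebSocket objects."""
    def __init__(self, name: str, fail: bool = False):
        self.name, self.fail, self.sent = name, fail, []
    async def send_text(self, msg: str):
        if self.fail:
            raise RuntimeError(f"{self.name} is dead")
        self.sent.append(msg)

async def broadcast(connections: dict, msg: str) -> list:
    # Fan out concurrently; return_exceptions=True keeps one dead
    # connection from cancelling delivery to everyone else.
    results = await asyncio.gather(
        *[conn.send_text(msg) for conn in connections.values()],
        return_exceptions=True,
    )
    # Dead connections surface as exception objects — return their IDs
    # so the caller can prune them from the manager.
    return [
        uid for uid, result in zip(connections, results)
        if isinstance(result, Exception)
    ]

conns = {"a": FakeConnection("a"), "b": FakeConnection("b", fail=True)}
dead = asyncio.run(broadcast(conns, "hello"))
print(dead)  # ['b'] — 'a' still received the message
```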

    No server-side heartbeat to detect network-level disconnections
    Symptom

    File descriptor count grows at a slow, steady rate over hours. The metric looks benign in isolation until the process hits the ulimit ceiling and stops accepting new connections entirely. Application logs show nothing — no exceptions, no errors, no warnings. The only signal is the fd count trend and eventually EMFILE in kernel logs.

    Fix

    Set ws_ping_interval=30 and ws_ping_timeout=10 in uvicorn.run(). This is the lowest-effort, highest-reliability fix — Uvicorn handles the ping/pong at the WebSocket protocol level with no application code required. If you need application-level liveness metadata, implement a HeartbeatManager as shown in the Heartbeat section.

    Broadcasting without Redis pub/sub in a multi-worker or multi-instance deployment
    Symptom

    Users connected to different workers or server instances cannot see each other's messages. The behavior appears random — some users see all messages, others miss some, depending on which worker their connection landed on. The bug is invisible in local development where a single worker handles all test clients.

    Fix

    Add Redis pub/sub: each worker subscribes to a shared channel via the lifespan context manager, publishes outgoing messages to Redis, and forwards received messages to its local connection set. Every worker receives every published message and delivers it to its own connected clients.
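The pattern can be sketched without a live Redis server by substituting an in-process broker for the shared channel; real code would replace InProcessBroker with redis.asyncio publish/subscribe calls, and the client lists with actual WebSocket objects:

```python
import asyncio

class InProcessBroker:
    """Stand-in for Redis pub/sub: every subscriber receives every message."""
    def __init__(self):
        self.subscribers = []
    def subscribe(self) -> asyncio.Queue:
        q = asyncio.Queue()
        self.subscribers.append(q)
        return q
    async def publish(self, msg: str):
        for q in self.subscribers:
            await q.put(msg)

class Worker:
    """One Uvicorn worker: a local connection set plus a broker listener."""
    def __init__(self, broker: InProcessBroker):
        self.local_clients = []          # would hold WebSocket objects
        self.queue = broker.subscribe()  # done once, in the lifespan startup
    async def listen_once(self):
        # Forward one broker message to every locally connected client.
        msg = await self.queue.get()
        for client in self.local_clients:
            client.append(msg)           # real code: await ws.send_text(msg)

async def main():
    broker = InProcessBroker()
    w1, w2 = Worker(broker), Worker(broker)
    c1, c2 = [], []
    w1.local_clients.append(c1)   # client connected to worker 1
    w2.local_clients.append(c2)   # client connected to worker 2
    await broker.publish("hello")  # any worker publishes once...
    await asyncio.gather(w1.listen_once(), w2.listen_once())
    return c1, c2

print(asyncio.run(main()))  # (['hello'], ['hello'])
```

The point of the sketch: a message published once reaches clients on both workers, which is exactly what a direct local broadcast cannot do.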

    Not setting ulimit/file descriptor limits before deploying the WebSocket service
    Symptom

    The service appears healthy in staging (low connection count) and works fine in the first hours of production. After enough users connect, the process hits the OS default ulimit and all new connection attempts fail with EMFILE. Existing connections continue working, making the issue look like an admission control problem rather than a resource limit.

    Fix

    Set LimitNOFILE=1000000 in the [Service] section of the systemd unit file. For Docker, add --ulimit nofile=1000000:1000000 to the run command or ulimits in docker-compose.yml. Verify with cat /proc/<pid>/limits | grep 'open files' after the service starts. Do this before the service handles any production traffic — raising ulimit requires a service restart.

Interview Questions on This Topic

  • Q: What is the difference between WSGI and ASGI, and why is the latter required for FastAPI WebSockets? (Mid-level)
    WSGI (Web Server Gateway Interface) is a synchronous, request-response protocol defined in PEP 3333. A WSGI server calls your application with a request, your application processes it synchronously, returns a response, and the call stack unwinds. There is no mechanism to hold a connection open, push data asynchronously, or handle multiple requests concurrently on a single thread. One thread handles one request — long-lived connections would block that thread indefinitely.

    ASGI (Asynchronous Server Gateway Interface) is the async successor defined in the ASGI specification. Instead of a synchronous callable, your application is an async callable that receives a scope (describing the connection type — http, websocket, lifespan), a receive coroutine (to await incoming events), and a send coroutine (to push outgoing events). This three-component interface supports long-lived connections natively because the event loop can switch between thousands of coroutines while each one is waiting for I/O.

    For WebSockets specifically: the HTTP-to-WebSocket upgrade handshake must be handled in the same connection, after which the protocol switches. ASGI handles this by keeping the same scope alive across the upgrade, with receive() yielding websocket.connect, websocket.receive, and websocket.disconnect events as they arrive. A WSGI server has no model for this — the call stack returns after the HTTP response, and there is nowhere to attach the subsequent WebSocket events.

    FastAPI is built on Starlette, which is an ASGI framework. Uvicorn is the ASGI server. Together they give you an asyncio event loop that multiplexes thousands of WebSocket coroutines on a single OS thread with no threading overhead.
  • Q: Explain the 'C10k problem' and how Python's asyncio library helps FastAPI handle thousands of concurrent WebSocket connections. (Senior)
    The C10k problem was framed by Dan Kegel in 1999: how do you design a server to handle 10,000 concurrent client connections efficiently? At the time, the dominant model was one thread per connection. OS threads on Linux consume 1-8 MB of stack space each and require context switches — the kernel must save and restore CPU registers and memory mappings when switching between threads. 10,000 threads would need 10-80 GB of RAM for stacks alone, plus constant context-switch overhead that degrades performance at high thread counts.

    The alternative is non-blocking I/O multiplexing. Instead of blocking a thread waiting for data to arrive on a socket, you register all sockets with the OS (via epoll on Linux, kqueue on macOS) and ask to be notified when any of them have data ready. A single thread handles all I/O by processing whichever sockets are ready, in order.

    Python's asyncio implements cooperative multitasking on top of this model. An asyncio coroutine is a Python generator that runs until it hits an await expression — at that point, control returns to the event loop, which runs other coroutines until I/O becomes available on the original coroutine's socket. A coroutine uses roughly 4-8 KB of memory on the Python heap (not an OS thread stack), so 10,000 coroutines use about 40-80 MB — trivial.

    In FastAPI, every WebSocket endpoint is an async function. The event loop parks each one at its await websocket.receive_text() call and wakes it up only when the socket has data. epoll monitors all open sockets simultaneously with a single OS call. The result is that a single Uvicorn worker process can multiplex thousands of WebSocket connections with CPU usage proportional to actual message volume, not connection count.

    The remaining bottleneck after solving the concurrency model is file descriptors — one per open socket — which the OS limits by default. That requires tuning ulimit, not the concurrency model.
  • Q: How would you implement a distributed 'Presence' system (showing who is online) using FastAPI and Redis? (Senior)
    A distributed presence system tracks which users are currently connected across multiple server instances. It has three components: local tracking, shared state in Redis, and real-time presence change events.

    Local tracking: each FastAPI instance maintains a set of locally connected user IDs in memory. On WebSocket connect, add the user_id to the local set. On disconnect (either WebSocketDisconnect or failed heartbeat), remove it.

    Shared state in Redis: use a Redis SET named presence:online with SADD on connect and SREM on disconnect. This gives any service a global view of all online users with O(1) membership queries. Add a per-user key presence:heartbeat:{user_id} with a short TTL — say 45 seconds — and refresh it on every client heartbeat message. If a server crashes without cleanly disconnecting users, these keys expire automatically and the user drops out of the online set within one TTL window. This TTL key is your defense against phantom online users from crashed instances.

    Real-time presence events: publish connect and disconnect events to a Redis pub/sub channel named presence:events. Every instance subscribes to this channel and forwards events to any local clients that have opted into presence updates (for example, clients that have called a subscribe_to_presence API). The payload should be minimal: {user_id, status: online|offline, timestamp}.

    Reconnect handling: when a client reconnects after a network drop, send the current online set on the initial WebSocket accept — query the Redis SET and send it as a single payload. This ensures the client is fully synchronized regardless of what happened during the disconnect.

    A subtle issue: if you SREM from presence:online and publish the offline event in two separate operations, there is a race where another instance reads the SET before SREM completes and sees the user as still online. Use a Redis pipeline or Lua script to make SREM and publish atomic if presence accuracy under load is critical.
  • Q: Describe the lifecycle of a WebSocket handshake. What happens at the HTTP level during the Upgrade request? (Mid-level)
    The WebSocket connection begins as an ordinary HTTP/1.1 GET request. The client sends specific headers that signal intent to upgrade: Upgrade: websocket, Connection: Upgrade, Sec-WebSocket-Key: <random 16-byte value encoded as base64>, and Sec-WebSocket-Version: 13. The Origin header is also included by browsers as a CSRF-style signal.

    The server validates the request. If it supports WebSockets and accepts the connection, it responds with HTTP 101 Switching Protocols. The response includes Upgrade: websocket, Connection: Upgrade, and Sec-WebSocket-Accept: <derived value>. The derived value is the SHA-1 hash of the Sec-WebSocket-Key concatenated with the fixed magic string 258EAFA5-E914-47DA-95CA-C5AB0DC85B11, then base64-encoded. This prevents non-WebSocket-aware proxies from accepting the upgrade accidentally.

    After the 101 response, the HTTP protocol is gone. The TCP connection is kept open and both sides switch to the WebSocket binary frame protocol. Frames have a 2-14 byte header containing an opcode (0x1 text, 0x2 binary, 0x8 close, 0x9 ping, 0xA pong), a payload length field, and a masking key for client-to-server frames. All frames from client to server must be masked — frames from server to client must not be masked. This asymmetry is a security measure against cross-protocol attacks.

    In FastAPI/Starlette: Starlette's ASGI machinery intercepts the HTTP upgrade request and transitions the connection to a WebSocket scope. The @app.websocket() handler receives the connection. await websocket.accept() sends the 101 response and completes the handshake. After that, receive_text(), receive_bytes(), send_text(), and send_json() all operate on WebSocket frames. The developer never touches raw HTTP after the accept() call.
  • Q: In a System Design context: how would you design a scalable notification service using WebSockets for 10 million users? (Senior)
    A 10 million user notification service requires four layers: connection gateway, session registry, message routing, and offline delivery.

    Connection Gateway Layer: deploy stateless WebSocket gateway servers behind a Layer 4 load balancer (not Layer 7 — WebSocket upgrade headers can confuse some L7 configurations). Use consistent hashing by user_id to route a given user's connections to the same gateway cluster, which simplifies presence tracking. Each gateway runs Uvicorn with uvloop, LimitNOFILE=1000000, ws_ping_interval=30, ws_ping_timeout=10. At 50,000-100,000 connections per gateway instance, 10M active users need 100-200 gateway instances. These are cheap horizontally — you scale by adding instances, not by making each instance larger.

    Session Registry: when user 12345 connects to gateway-node-7, write session:{user_id} = node-7 to Redis with a TTL of 60 seconds. The gateway refreshes this key on every client heartbeat. Any service that needs to send a notification to user 12345 looks up this key to find the right gateway. Use Redis Cluster for the registry — at 10M users doing connect/disconnect events, you need the write throughput that single-node Redis cannot sustain.

    Message Routing Layer: notification producers (your application services) publish to Kafka rather than directly to Redis. A router tier consumes from Kafka, looks up the target user's gateway in the session registry, and publishes to that gateway's dedicated Redis pub/sub channel (gateway:{node_id}:messages). The gateway receives the message and delivers it to the local WebSocket connection. This two-hop design means the notification producer doesn't need to know anything about gateway topology — it just publishes to Kafka with a user_id, and the router handles the rest.

    Offline Delivery: maintain a notifications table in PostgreSQL or Cassandra with columns for user_id, message payload, status (pending/delivered/expired), and TTL timestamp. When a producer publishes a notification, write it to this table in the same transaction. When the gateway delivers it successfully, update status to delivered. When a user reconnects, the gateway queries for pending notifications with status=pending and timestamp not expired, sends them in order, and marks them delivered. This handles the cases where the user was offline, the gateway crashed, or the Redis pub/sub message was dropped.

    Estimated infrastructure: 200 gateway instances (4 vCPU, 8 GB RAM), Redis Cluster with 6 nodes for the session registry, Kafka with 6 brokers for the routing tier, PostgreSQL with read replicas for offline storage. The gateways are the largest cost — 200 instances sounds like a lot, but WebSocket servers are I/O-bound and run efficiently on small VMs.
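The Sec-WebSocket-Accept derivation described in the handshake answer can be verified in a few lines, using the worked example from RFC 6455:

```python
import base64
import hashlib

MAGIC = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"  # fixed GUID from RFC 6455

def sec_websocket_accept(key: str) -> str:
    # SHA-1 of the client's Sec-WebSocket-Key plus the magic string,
    # base64-encoded — computed by the server, checked by the client to
    # confirm the peer is actually WebSocket-aware.
    digest = hashlib.sha1((key + MAGIC).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")

# Example key/accept pair from RFC 6455 section 1.3:
print(sec_websocket_accept("dGhlIHNhbXBsZSBub25jZQ=="))
# → s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```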

Frequently Asked Questions

How do I authenticate a WebSocket connection?

The browser's WebSocket API does not allow setting custom HTTP headers like Authorization: Bearer on the initial handshake request. Your two options are query parameters and cookies.

Query parameters are straightforward for API clients and native apps: @app.websocket('/ws') async def ws(websocket: WebSocket, token: str = Query(...)). Validate the token before calling websocket.accept(), and close with websocket.close(code=1008) if it is invalid. The problem with query parameters is that they appear verbatim in server access logs and proxy logs — a long-lived API key in a query parameter is a permanent credential leak. Use short-lived JWTs with a 60-second expiry: the client obtains a fresh token via a normal HTTP endpoint, opens the WebSocket connection with it, and the token is worthless by the time it appears in a log.

Cookies are the better choice for browser-based applications. HttpOnly cookies set during a prior HTTP login flow are sent automatically with the WebSocket upgrade request, are never visible in JavaScript, and do not appear in browser history. Read the cookie in the WebSocket handler via websocket.cookies.get('session') and validate it before accepting.

How do WebSockets scale across multiple FastAPI instances?

A WebSocket connection is permanently bound to the specific process that accepted it. The ConnectionManager on that process has no visibility into connections on any other process. If you broadcast on Instance A, clients on Instances B, C, and D receive nothing.

The solution is Redis pub/sub: each instance subscribes to a shared Redis channel in the lifespan startup hook. When any instance needs to broadcast, it publishes to Redis. Redis delivers the message to every subscribed instance simultaneously, and each instance fans it out to its own local connections. The originating instance doesn't need to know which instance any given client is connected to.

For durable delivery — where clients that are temporarily offline should receive missed messages when they reconnect — replace pub/sub with Redis Streams. Use XADD to publish and XREAD with consumer groups to deliver, with the stream acting as a persistent log. Pub/sub is fire-and-forget; Streams are durable and replayable.

Can I use FastAPI Middleware with WebSockets?

Standard HTTP middleware defined with @app.middleware('http') does not execute for WebSocket connections. The middleware chain is built around the HTTP request/response model — it sees the initial upgrade request, but after the 101 handshake the connection transitions to the WebSocket protocol and the middleware stack no longer intercepts it.

For cross-cutting concerns on WebSocket endpoints — authentication, rate limiting, structured logging, request tracing — the practical approaches are: a dependency function injected into the endpoint (the cleanest option for auth and validation), a decorator wrapping the endpoint function, or a class-based wrapper that implements the same connect/receive/send lifecycle. For rate limiting specifically, implement it in the dependency using Redis counters keyed by IP or user_id — the same Redis instance you are already using for pub/sub.

What happens to WebSocket connections during a server deploy or restart?

When Uvicorn receives SIGTERM (the standard shutdown signal from systemd, Kubernetes, or a container orchestrator), it begins graceful shutdown. Active WebSocket connections receive a close frame with code 1001 (Going Away). The client's WebSocket implementation fires the onclose event, and well-written clients use this as the trigger to implement exponential-backoff reconnect logic.

For zero-downtime deploys, use rolling deployments with connection draining: stop routing new connections to the instance being replaced (remove it from the load balancer's target group or upstream), then wait for existing connections to close naturally or hit the drain timeout, then send SIGTERM. Configure the drain timeout based on your expected connection lifetime — for notification services where connections are typically short-lived, 30 seconds is usually sufficient; for long-lived collaborative editing sessions, you may need minutes.

In Kubernetes, preStop hooks give you a window to drain before SIGTERM arrives. Set terminationGracePeriodSeconds to at least 2x your expected drain window. With Redis pub/sub, a client that reconnects to a new pod immediately rejoins the same broadcast graph — the user experience is a brief reconnect rather than a visible outage.

Naren · Founder & Author

Developer and founder of TheCodeForge. I built this site because I was tired of tutorials that explain what to type without explaining why it works. Every article here is written to make concepts actually click.
