Senior 7 min · March 05, 2026

FastAPI Background Tasks and Async Endpoints

Master FastAPI concurrency.

Naren Founder & Principal Engineer

20+ years shipping production Python across data and backend systems. Everything here is grounded in real deployments.

Production tested · Last updated: May 24, 2026
Quick Answer
  • FastAPI handles concurrency via two paths: BackgroundTasks for post-response work and async/await for non-blocking I/O
  • Use async def only when calling async libraries (httpx, asyncpg, motor) — never for synchronous blocking code
  • Use regular def for synchronous or CPU-bound work — FastAPI auto-offloads it to a thread pool via AnyIO
  • BackgroundTasks execute AFTER the response is sent — the client never waits for them
  • Calling time.sleep() inside async def blocks the entire event loop and freezes every concurrent request on the server
  • BackgroundTasks are lost on server restart — use Celery or ARQ for mission-critical jobs that must survive crashes
  • The thread pool that runs sync endpoints is capped by AnyIO's default limiter at 40 threads per process. Exhaust it and sync endpoint requests start queuing silently
✦ Definition~90s read
What is FastAPI Background Tasks and Async Endpoints?

FastAPI background tasks let you defer work — like sending emails, writing logs, or triggering webhooks — until after your endpoint returns a response. The catch is that mixing synchronous blocking code with FastAPI's async event loop can silently kill performance.

A common misreading is that a synchronous background task added from an async def endpoint blocks the event loop. It does not: BackgroundTasks looks at each task function, not at the endpoint. A regular def task is run in the thread pool, and an async def task is awaited on the event loop, whether the endpoint itself is def or async def.

The real hazards are blocking calls inside async def code (an endpoint or a background task), which stall every concurrent request, and slow sync tasks that saturate the shared thread pool. So match each function's signature to the work it does: use async def only when the body awaits truly async libraries (e.g., httpx.AsyncClient), and use def when it calls synchronous ones (e.g., smtplib or requests). Misunderstanding this leads to production incidents where a single slow background job stalls all concurrent requests, a classic footgun that separates senior engineers from beginners.

Plain-English First

Think of a restaurant kitchen. BackgroundTasks is like telling the waiter 'bring the check now, I'll clear the table after the customer leaves' — the customer walks out happy, and cleanup happens without holding up the next seating. The async/await choice is like deciding whether a chef can tend multiple dishes simultaneously (async — the chef stirs one pot, sets a timer, and moves to the next) or needs to stand in front of one dish with full attention until it is done (sync — the chef is occupied, and everyone else waits). Put a synchronous chef in an async kitchen and the entire service grinds to a halt the moment one dish needs three minutes of uninterrupted attention.

FastAPI is built on Starlette and asyncio, which makes it one of the fastest Python web frameworks available. That speed is not free — it is entirely conditional on the developer understanding how the event loop works and respecting its rules. Unlike traditional synchronous frameworks where every request gets its own thread and isolation is automatic, FastAPI can handle thousands of concurrent connections on a single thread. The moment you introduce a blocking call into that single thread, every one of those connections stalls together.

The core decision point is deceptively simple on the surface: async def or def. Get it right and you get the throughput numbers in the benchmarks. Get it wrong and your production server handles 10 requests per second instead of 10,000 — with no obvious error, no alarm, just a wall of slow responses and a CPU that looks inexplicably idle.

This guide covers both concurrency paths — BackgroundTasks for fire-and-forget post-response logic, and async/await for non-blocking I/O — with the kind of specificity that only comes from seeing both patterns succeed and fail in production. The failure modes are predictable. The fixes are mechanical once you understand the model.

Why FastAPI Background Tasks Need Async Awareness

FastAPI's BackgroundTasks is a lightweight mechanism to schedule work after returning an HTTP response, without blocking the client. It runs tasks in the same process as your request handlers, sharing their event loop and thread pool. The core mechanic: you declare a BackgroundTasks parameter in your endpoint, add callables to it, and FastAPI executes them after the response is sent. This is not a queue system; it is a fire-and-forget pattern for short-lived operations.

Key property: how a task runs depends on the task function's own signature, not on the endpoint's. A regular def task is run in the thread pool, so blocking I/O inside it (sending an email, writing a file) does not block the event loop. That thread pool is shared with sync endpoints, though, and it starves if tasks are slow. An async def task is awaited on the event loop, so any blocking call inside it stalls every concurrent request; wrap such calls in asyncio.to_thread or use an async library.
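The dispatch rule can be modeled with the standard library alone. The sketch below is a simplified stand-in, not Starlette's actual source: `run_background_task` is a hypothetical helper, and `asyncio.to_thread` stands in for the AnyIO-managed thread pool that FastAPI really uses. Each task reports the thread it ran on so the two paths can be told apart.

```python
import asyncio
import inspect
import threading
import time


def sync_task() -> int:
    # Blocking work: safe only because it runs off the event loop thread.
    time.sleep(0.05)
    return threading.get_ident()


async def async_task() -> int:
    # Cooperative work: runs on the event loop thread and must only await.
    await asyncio.sleep(0.05)
    return threading.get_ident()


async def run_background_task(func, *args):
    """Hypothetical helper: dispatch on the task function's own signature."""
    if inspect.iscoroutinefunction(func):
        return await func(*args)                 # awaited on the event loop
    return await asyncio.to_thread(func, *args)  # pushed to a worker thread


async def main() -> dict:
    loop_thread = threading.get_ident()
    sync_thread = await run_background_task(sync_task)
    async_thread = await run_background_task(async_task)
    return {
        "sync_ran_on_loop_thread": sync_thread == loop_thread,
        "async_ran_on_loop_thread": async_thread == loop_thread,
    }
```

Running `asyncio.run(main())` shows that the plain function never touches the loop thread while the coroutine function never leaves it, which is why a blocking call is harmless in the first and dangerous in the second.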

Use BackgroundTasks for operations that must happen after the response but can tolerate a few seconds of delay: logging, cache invalidation, sending notifications. Do not use it for critical work that requires durability, retries, or monitoring — that's what Celery or RQ are for. The sweet spot is sub-second, non-critical side effects where you'd otherwise block the response.

Blocking the Event Loop
A sync BackgroundTask that does blocking I/O (e.g., smtplib) runs in the shared thread pool. If slow tasks exhaust that pool, sync endpoints and other thread-pool work queue behind them and stall.
Production Insight
A team used BackgroundTasks to resize uploaded images (5-10 seconds each). Under load, the thread pool saturated, causing all other endpoints to time out.
Symptom: p95 latency spikes from 200ms to 30s, no errors logged because the task silently queued.
Rule: If a task takes >1 second or does blocking I/O, move it to a dedicated worker queue with backpressure.
Key Takeaway
BackgroundTasks is for fire-and-forget, not fire-and-hope.
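Sync tasks run in the shared thread pool and async tasks run on the event loop, so keep sync tasks short and never make blocking calls inside async tasks (use asyncio.to_thread or an async library).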
Never use BackgroundTasks for durability-critical work; it has no retry, no persistence, no monitoring.
[Figure: FastAPI Background Tasks & Async Endpoints. Flow of background tasks with async awareness and common pitfalls: endpoint handler (async def or def), BackgroundTasks run after the response is returned, async vs sync task, task execution in a background thread or on the event loop. Warning: BackgroundTasks are lost on server restart; use an external queue (Redis/Celery) for durability.]

BackgroundTasks — Run After Response

The standard HTTP request-response cycle forces the client to wait for every operation inside the endpoint to complete before receiving a response. For operations that the user does not need to wait for — sending a welcome email, writing an audit log entry, invalidating a CDN cache — this is pure latency overhead with no user-facing benefit.

FastAPI's BackgroundTasks class solves this with a clean hook into the ASGI response lifecycle. You register a function (and its arguments) with the task manager inside your endpoint. FastAPI sends the HTTP response to the client first, then executes the registered functions afterward. The client's round-trip time reflects only the endpoint logic, not the background work.

The execution model has a few characteristics worth understanding clearly. Tasks are executed sequentially, not in parallel — if you register three background tasks, they run one after another in registration order, not concurrently. Regular def background functions run in FastAPI's thread pool, keeping the event loop free. Async def background functions run on the event loop itself — which means a slow async background function can delay other requests if it does not yield control frequently.

The hardest thing to accept about BackgroundTasks is what it explicitly is not. It is not a task queue. It has no retry mechanism. It has no persistence layer. It has no scheduling capability. It has no visibility into task status after the response is sent. If the server process dies while a background task is running or queued, the task is gone with no record of it. This is not a limitation to work around — it is a design boundary that defines where BackgroundTasks is appropriate and where a durable task queue is required.

The appropriate use cases are narrow but common: sending transactional emails, writing audit or activity log entries, cache invalidation, incrementing counters in analytics systems where occasional data loss is acceptable. The inappropriate use cases are equally clear: payment processing, order fulfillment, document generation, any operation where losing the task in a crash would cause a business impact.

io/thecodeforge/tasks/registration.py (Python)
from fastapi import FastAPI, BackgroundTasks, status
from pydantic import BaseModel, EmailStr
import logging
import time

app = FastAPI()
log = logging.getLogger(__name__)


class RegistrationRequest(BaseModel):
    email: EmailStr
    username: str


def send_welcome_email(email: str, username: str) -> None:
    """
    Regular def: FastAPI runs this in the thread pool after the response is sent.
    The event loop is not involved. time.sleep() is safe here because this
    runs in a worker thread, not on the event loop thread.

    Critical: wrap everything in try/except. The response has already been
    sent, so the client never sees an exception raised here. Without this
    block, a failed SMTP connection yields no structured log entry or metric,
    and any task registered after this one is skipped.
    """
    try:
        log.info(
            "welcome_email_started",
            extra={"email": email, "username": username}
        )
        # Simulate SMTP handshake — safe in a thread pool, catastrophic on event loop
        time.sleep(2)
        # In production: use smtplib here, or aiosmtplib in an async background task
        log.info(
            "welcome_email_sent",
            extra={"email": email, "username": username}
        )
    except Exception:
        # Log with exc_info=True to capture the full traceback
        # Also increment a metric counter here in production:
        # metrics.counter("forge.bg.email.failure").increment()
        log.error(
            "welcome_email_failed",
            extra={"email": email},
            exc_info=True
        )


def write_registration_audit_log(email: str, username: str) -> None:
    """
    Second background task — runs sequentially after send_welcome_email.
    Tasks registered with add_task() execute in registration order, not in parallel.
    """
    try:
        log.info(
            "registration_audit_written",
            extra={"email": email, "username": username, "event": "user_registered"}
        )
    except Exception:
        log.error("audit_log_failed", extra={"email": email}, exc_info=True)


@app.post('/forge/register', status_code=status.HTTP_201_CREATED)
async def register_user(payload: RegistrationRequest, tasks: BackgroundTasks):
    """
    Registers a user and enqueues post-response work.

    Execution order:
      1. Validate payload (Pydantic)
      2. Write user to database (not shown)
      3. Send HTTP 201 to client  <-- client connection returns here
      4. send_welcome_email() runs in thread pool
      5. write_registration_audit_log() runs in thread pool

    The client's measured latency includes only steps 1-3.
    Steps 4-5 are invisible to the client and independent of client connectivity.
    """
    # Step 1: In production, write to database here
    # await db.users.insert({"email": payload.email, "username": payload.username})

    # Step 2: Register background tasks — they do not run yet
    tasks.add_task(send_welcome_email, payload.email, payload.username)
    tasks.add_task(write_registration_audit_log, payload.email, payload.username)

    # Step 3: Return response — background tasks run after this is sent
    return {
        "status": "created",
        "detail": "Registration complete. Welcome email is on its way.",
    }


# POST /forge/register {"email": "dev@thecodeforge.io", "username": "forge-dev"}
# Client receives immediately:
# -> {"status": "created", "detail": "Registration complete. Welcome email is on its way."}
#
# Server logs after response (invisible to client):
# INFO  welcome_email_started    email=dev@thecodeforge.io
# INFO  welcome_email_sent       email=dev@thecodeforge.io
# INFO  registration_audit_written email=dev@thecodeforge.io
Output
{"status": "created", "detail": "Registration complete. Welcome email is on its way."}
BackgroundTasks as a Lightweight Post-Response Hook
  • Tasks run AFTER the response is fully sent — the client's round-trip time excludes background work entirely.
  • Tasks run sequentially in registration order — a slow first task delays all subsequent tasks. There is no parallelism between background tasks.
  • Regular def background functions run in the thread pool — synchronous I/O is safe. Async def background functions run on the event loop — blocking calls inside them affect all concurrent requests.
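  • Exceptions inside background functions never reach the client, because the response has already been sent. An unhandled exception surfaces at most as a generic traceback in the server log, and tasks registered after the failing one do not run. Without explicit try/except and logging inside the background function, failures are easy to miss. This is the most common operational mistake with BackgroundTasks.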
  • No retries, no persistence, no scheduling — if the process dies, tasks die with it. Use for: welcome emails, audit logs, cache invalidation. Do NOT use for: payment processing, order fulfillment, document generation, or any operation where losing the task causes a business impact.
Production Insight
FastAPI cannot report an exception raised inside a background function to the client: the HTTP response has already been sent by the time the exception occurs, so there is nowhere to propagate it. The exception goes up to the ASGI server, which typically logs a generic traceback, and any background tasks registered after the failing one are skipped.
The consequence: if your email function crashes due to an SMTP timeout, an invalid recipient address, or an uncaught import error, the API returns 201 and the user never receives the email. No structured error log appears unless you wrote one. No alert fires unless you instrumented one. The failure is invisible to the user and easy for the operator to miss.
Rule: every background function must have a top-level try/except block that logs the exception with exc_info=True and increments a failure counter metric. This is not optional defensive programming. It is the minimum viable error observability for a function that has no other error reporting path.
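One way to apply this rule without repeating the try/except in every function is a small decorator. This is a minimal sketch under stated assumptions: `observed` and `failure_counts` are hypothetical names, and the `Counter` is only a stand-in for whatever metrics client you actually run. It covers regular def background functions only.

```python
import functools
import logging
from collections import Counter

log = logging.getLogger("background")
failure_counts: Counter = Counter()  # stand-in for a real metrics client


def observed(task):
    """Wrap a sync background function so failures are logged and counted."""
    @functools.wraps(task)
    def wrapper(*args, **kwargs):
        try:
            return task(*args, **kwargs)
        except Exception:
            # Count first, then log the full traceback via exc_info=True.
            failure_counts[task.__name__] += 1
            log.error(
                "background_task_failed",
                extra={"bg_task": task.__name__},
                exc_info=True,
            )
            return None  # never re-raise: later tasks in the chain still run
    return wrapper


@observed
def send_welcome_email(email: str) -> str:
    # Illustrative body only; a real implementation would talk to SMTP.
    if "@" not in email:
        raise ValueError(f"invalid recipient: {email!r}")
    return f"sent to {email}"
```

The decorated function is registered exactly like an undecorated one, for example `tasks.add_task(send_welcome_email, payload.email)`. Because the wrapper swallows the exception after recording it, one failing task no longer prevents the tasks registered after it from running.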
Key Takeaway
BackgroundTasks is a post-response hook — not a queue, not a retry system, not a scheduler. It runs registered functions sequentially after the HTTP response is sent, in the same process, with no persistence layer.
Exceptions never reach the client, and one unhandled exception skips the tasks queued after it. Always wrap background function bodies in try/except with structured logging and a failure metric.
Use it for fire-and-forget work where occasional data loss is acceptable. The moment a task must complete reliably or survive a server restart, you are past what BackgroundTasks can offer — reach for a durable task queue.
Choosing BackgroundTasks vs a Durable Task Queue
If: the task is non-critical and losing it on server restart has no business impact (welcome email, cache bust, analytics event)
Use: BackgroundTasks. Zero infrastructure overhead, no broker to operate, fast to implement.
If: the task must complete even if the server restarts, crashes, or is killed mid-execution
Use: Celery with Redis or RabbitMQ, or ARQ for async-native workloads. A durable broker is required.
If: the task needs automatic retries on failure, exponential backoff, or dead-letter handling
Use: Celery with retry policies. BackgroundTasks has no retry mechanism whatsoever.
If: the task is CPU-intensive (image resizing, PDF generation, video transcoding, ML inference)
Use: Celery with dedicated workers in separate processes. Background tasks in the API process compete for CPU with request handling and will degrade both.

async def vs def — When to Use Which

This is the single most consequential decision in any FastAPI codebase, and it is the most frequently misunderstood. The surface-level explanation — 'use async def for async code and def for sync code' — is correct but incomplete. Understanding why FastAPI behaves differently based on the function signature is what separates code that performs from code that silently degrades under load.

When FastAPI receives a request for a regular def endpoint, it does not execute the function on the event loop. It submits the function to AnyIO's thread pool executor and awaits the result. The event loop is free to handle other requests while the sync function runs in a worker thread. This is why synchronous code inside a regular def endpoint is completely safe — it runs in isolation from the event loop.

When FastAPI receives a request for an async def endpoint, it awaits the coroutine directly on the event loop thread. No thread is involved. The coroutine runs cooperatively — it progresses until it hits an await expression, yields control back to the event loop, and resumes when the awaited operation completes. This cooperative yielding is what allows thousands of concurrent requests to share a single thread efficiently. The contract is simple: every operation inside an async def function must yield at every I/O boundary.

Break that contract — call time.sleep(), call requests.get(), open a synchronous database connection — and the cooperative model collapses. The event loop thread is occupied for the duration of the blocking call, and every other coroutine waits. There is no error, no warning, no exception. The server simply becomes unresponsive in proportion to how long and how frequently the blocking call occurs.
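The collapse is easy to reproduce outside FastAPI. The following sketch uses only asyncio; the handler names and the 0.2-second delay are illustrative assumptions, not part of any real endpoint. It runs five "requests" at once, first with a blocking sleep and then with a cooperative one, and times each batch.

```python
import asyncio
import time

DELAY = 0.2  # seconds each simulated handler waits


async def blocking_handler() -> None:
    time.sleep(DELAY)           # holds the event loop thread; nothing else runs


async def cooperative_handler() -> None:
    await asyncio.sleep(DELAY)  # yields; other coroutines run during the wait


async def time_concurrent(handler, n: int = 5) -> float:
    """Start n handlers together and return total wall-clock seconds."""
    start = time.perf_counter()
    await asyncio.gather(*(handler() for _ in range(n)))
    return time.perf_counter() - start


async def main() -> tuple[float, float]:
    blocked = await time_concurrent(blocking_handler)
    cooperative = await time_concurrent(cooperative_handler)
    return blocked, cooperative
```

On an idle machine, `asyncio.run(main())` should report roughly five times DELAY for the blocking batch and roughly one DELAY for the cooperative batch: the blocking handlers are served strictly one after another even though they were started together.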

The thread pool that handles regular def endpoints is capped by AnyIO's default thread limiter at 40 threads per process, regardless of core count. (The min(32, os.cpu_count() + 4) figure often quoted here is the default for concurrent.futures.ThreadPoolExecutor, which asyncio.to_thread() uses, not the pool FastAPI uses for def endpoints.) If 40 sync endpoints are all executing simultaneously, each holding a thread for a database query, a file read, or an external HTTP call, request 41 must wait for a thread to become available. This is thread pool exhaustion, and it produces the same symptom as event loop blockage: slow responses with low CPU usage. Monitor active thread count and tune the pool size or migrate high-traffic sync endpoints to async libraries accordingly.
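Thread pool exhaustion can be illustrated with a deliberately tiny pool. The sketch below is an assumption-laden model, not FastAPI's machinery: a `ThreadPoolExecutor` with two workers stands in for the real thread limit, and `slow_sync_endpoint` is a hypothetical handler that reports how long it sat in the queue before a thread picked it up.

```python
import time
from concurrent.futures import ThreadPoolExecutor

POOL_SIZE = 2        # deliberately tiny; stands in for the real thread limit
TASK_SECONDS = 0.2   # simulated blocking database query


def slow_sync_endpoint(submitted_at: float) -> float:
    """Return how long this call waited in the queue before it started."""
    waited = time.perf_counter() - submitted_at
    time.sleep(TASK_SECONDS)
    return waited


def queue_delays(requests: int) -> list[float]:
    """Submit all requests at once and collect each one's queueing delay."""
    with ThreadPoolExecutor(max_workers=POOL_SIZE) as pool:
        submitted = time.perf_counter()
        futures = [
            pool.submit(slow_sync_endpoint, submitted) for _ in range(requests)
        ]
        return [future.result() for future in futures]
```

Calling `queue_delays(3)` shows the pattern: the first two requests start almost immediately, while the third waits for about TASK_SECONDS because no thread is free. Nothing fails and no CPU is burned during that wait, which matches the "slow but idle" symptom described above.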

The practical decision framework is mechanical: does the code call an async library? Use async def with await. Does the code call a synchronous library? Use regular def. Does the code mix both? Use async def and wrap every synchronous call with await asyncio.to_thread(). Follow this consistently and the framework handles the rest.

io/thecodeforge/concurrency/endpoint_types.py (Python)
from fastapi import FastAPI
import asyncio
import httpx
import hashlib
import time

app = FastAPI()


# ─────────────────────────────────────────────────────────────────────────────
# CASE 1: async def — the correct choice for async I/O
# ─────────────────────────────────────────────────────────────────────────────
@app.get('/forge/weather/{city}')
async def get_weather(city: str):
    """
    Uses httpx.AsyncClient — a truly async HTTP client.
    Every network operation yields to the event loop via await.
    While waiting for the external API response, FastAPI handles other requests.
    This endpoint scales to thousands of concurrent requests on a single worker.
    """
    async with httpx.AsyncClient(timeout=10.0) as client:
        response = await client.get(
            f'https://api.thecodeforge.io/v1/weather/{city}'
        )
        response.raise_for_status()
        return response.json()


# ─────────────────────────────────────────────────────────────────────────────
# CASE 2: regular def — the correct choice for synchronous or CPU-bound work
# ─────────────────────────────────────────────────────────────────────────────
@app.get('/forge/hash')
def compute_password_hash(password: str):
    """
    CPU-bound operation using hashlib — synchronous and compute-intensive.
    FastAPI detects regular def and submits this to the thread pool.
    The event loop is free to handle other requests while this runs.
    AnyIO's default limit for this pool is 40 threads.
    """
    # Simulate bcrypt-level work with high iteration SHA-256
    digest = hashlib.pbkdf2_hmac(
        'sha256',
        password.encode(),
        b'forge-salt',
        iterations=100_000
    )
    return {"hash": digest.hex()}


# ─────────────────────────────────────────────────────────────────────────────
# CASE 3: async def with mixed sync/async — use asyncio.to_thread()
# ─────────────────────────────────────────────────────────────────────────────
@app.post('/forge/report')
async def generate_report(report_id: str):
    """
    Needs both async (fetch data from async DB) and sync (write to filesystem).
    Wrapping the synchronous file write in asyncio.to_thread() delegates it
    to the thread pool, keeping the event loop free during file I/O.
    """
    # Async database call — yields to event loop
    # data = await db.reports.find_one({"id": report_id})
    data = {"report_id": report_id, "rows": 1500}  # Simulated

    # Synchronous file I/O — must NOT run on event loop directly
    # Wrap with asyncio.to_thread() to delegate to thread pool
    def write_to_disk(payload: dict) -> str:
        path = f"/tmp/forge-report-{payload['report_id']}.json"
        import json
        with open(path, 'w') as f:
            json.dump(payload, f)
        return path

    output_path = await asyncio.to_thread(write_to_disk, data)
    return {"report_id": report_id, "output": output_path}


# ─────────────────────────────────────────────────────────────────────────────
# CASE 4: THE LOOP KILLER — async def with synchronous blocking call
# This is the pattern behind most FastAPI production incidents.
# ─────────────────────────────────────────────────────────────────────────────
#
# @app.get('/forge/disaster')
# async def event_loop_hostage():
#     time.sleep(10)  # BLOCKS THE ENTIRE SERVER FOR 10 SECONDS
#                     # Every concurrent request queues behind this.
#                     # CPU shows nearly 0% — the server appears idle.
#                     # Health checks time out. Kubernetes kills the pod.
#                     # There is no error. There is no warning.
#                     # There is just silence and then a wall of timeouts.


# ─────────────────────────────────────────────────────────────────────────────
# CASE 5: asyncio.sleep() — the correct non-blocking delay in async context
# ─────────────────────────────────────────────────────────────────────────────
@app.get('/forge/delayed-response')
async def delayed_response():
    """
    asyncio.sleep() yields control to the event loop for the duration.
    During this 2-second wait, FastAPI processes thousands of other requests.
    time.sleep(2) here would freeze all of them.
    """
    await asyncio.sleep(2)
    return {"status": "ready", "source": "thecodeforge"}


# Summary of behavior:
# Case 1: async def + httpx.AsyncClient -> event loop, non-blocking, scales to 10k+ concurrent
# Case 2: regular def + hashlib        -> thread pool, safe for sync/CPU work
# Case 3: async def + to_thread()      -> event loop for async parts, thread pool for sync parts
# Case 4: async def + time.sleep()     -> event loop BLOCKED, all other requests frozen
# Case 5: async def + asyncio.sleep()  -> event loop free, correct non-blocking delay
Output
Case 1: {"city": "london", "temp_c": 14}
Case 2: {"hash": "a3f2c1..."}
Case 3: {"report_id": "RPT-001", "output": "/tmp/forge-report-RPT-001.json"}
Case 5: {"status": "ready", "source": "thecodeforge"}
The async def Trap — Blocking the Event Loop Has No Error Signal
  • If your async def endpoint contains time.sleep(), requests.get(), smtplib, synchronous file I/O, or any blocking database driver — you are blocking every concurrent request on the server for the duration of that call. The server does not error. It does not warn. It silently queues everything.
  • A single 3-second blocking call in an async def endpoint under 10 concurrent requests serializes them: the event loop is blocked for 30 seconds straight, and the last caller waits the full 30 seconds. Health checks fail. Kubernetes restarts the pod. The restart clears the queue and the cycle repeats.
  • The fix options in priority order: (1) Use a native async library instead (httpx instead of requests, asyncpg instead of psycopg2, aiosmtplib instead of smtplib). (2) Wrap the blocking call: result = await asyncio.to_thread(blocking_function, arg1). (3) Convert the endpoint to regular def if it has no legitimate async operations.
  • Enable ruff's ASYNC (flake8-async) lint rules in CI by selecting the whole ASYNC family rather than individual codes, since the codes have been renumbered between ruff releases. These rules detect time.sleep, open(), and common synchronous I/O calls inside async functions at review time. Catching this class of bug before it merges costs seconds. Catching it in production at 2 AM costs hours.
  • Add event loop lag monitoring to production. A lag metric above 100ms is a reliable early warning of a blocking call somewhere in the codebase. CPU and memory metrics will not show it.
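Event loop lag can be measured with a small watchdog coroutine. The version below is a minimal sketch: `monitor_loop_lag` is a hypothetical helper, the samples go into a plain list where a production service would export them to its metrics system, and the `time.sleep(0.3)` in `main` is there only to simulate a blocking call.

```python
import asyncio
import time


async def monitor_loop_lag(samples: list, interval: float = 0.05) -> None:
    """Append the observed scheduling delay (seconds) once per interval."""
    loop = asyncio.get_running_loop()
    while True:
        expected = loop.time() + interval
        await asyncio.sleep(interval)
        # If the loop was blocked, we wake up later than expected.
        samples.append(max(0.0, loop.time() - expected))


async def main() -> float:
    samples: list = []
    monitor = asyncio.create_task(monitor_loop_lag(samples))
    await asyncio.sleep(0.12)  # healthy period: lag stays near zero
    time.sleep(0.3)            # simulated blocking call on the event loop
    await asyncio.sleep(0.12)  # give the monitor a chance to observe the stall
    monitor.cancel()
    try:
        await monitor
    except asyncio.CancelledError:
        pass
    return max(samples)        # worst lag seen, in seconds
```

The worst sample returned by `asyncio.run(main())` is close to the length of the blocking call, well above the 100ms threshold mentioned above, while the samples taken during the healthy period stay near zero. In a real service the monitor would be started once at application startup and left running.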
Production Insight
The thread pool that handles regular def endpoints is managed by AnyIO in current FastAPI versions. The default capacity is 40 threads per process, independent of how many cores the container has.
Forty threads sounds limiting, but for I/O-bound sync work (database queries, external HTTP calls) it handles substantial concurrency: each thread spends most of its time waiting for I/O, not executing CPU instructions. The real ceiling appears when all 40 threads are occupied with long-running operations simultaneously. Request 41 waits. This produces the same symptom as event loop blockage (slow responses, low CPU), but the cause is different and the fix is different.
For I/O-bound sync endpoints under heavy load: consider migrating to async libraries. For CPU-bound sync endpoints that are legitimately compute-heavy: increase workers at the Uvicorn level to scale across cores, or offload to Celery workers. You can also raise the AnyIO limit at application startup by setting total_tokens on the limiter returned by anyio.to_thread.current_default_thread_limiter(), but increasing thread count does not solve the underlying bottleneck. It only defers it.
Key Takeaway
async def means you own the concurrency contract — every operation inside must yield at every I/O boundary via await. Regular def means FastAPI owns it — synchronous code is submitted to the thread pool and the event loop stays free.
The most dangerous code in a FastAPI codebase is synchronous I/O inside async def. It produces no error, no warning, and no stack trace — just a progressively unresponsive server that looks idle on every resource dashboard. Prevent it with linting. Detect it in production with event loop lag monitoring.
Choosing async def vs def for Your Endpoint
If: calling async-native libraries (httpx, asyncpg, motor, aioredis, aiosmtplib)
Use: async def with await. This is the intended fast path; the event loop handles cooperative scheduling.
If: calling synchronous libraries (requests, psycopg2, smtplib, synchronous file I/O)
Use: regular def. FastAPI submits it to the thread pool automatically, so sync code is safe and does not block the event loop.
If: CPU-bound work (cryptographic hashing, image processing, data transformation, ML inference)
Use: regular def for thread pool isolation. For very heavy CPU work, use Celery with dedicated workers in separate processes to avoid GIL contention.
If: mixing async and sync calls within a single endpoint
Use: async def, and wrap every synchronous call with await asyncio.to_thread(sync_call, args). Never call a blocking sync function directly from async def.

BackgroundTasks Drops Tasks on Restart — Horizontally Scale or Lose Work

The built-in BackgroundTasks lives in memory. No persistence. No retry. If your process crashes or you deploy a new version, every queued task vanishes. In production, this is not a bug — it's a design limitation you must work around.

Use BackgroundTasks only for fire-and-forget work you can afford to lose. Welcome emails? Fine. Audit logs? Fine. Payment processing? Absolutely not.

For durable workloads, push tasks into a proper queue (Redis RQ, Celery, or SQS) with a worker pool. This gives you retry logic, task persistence, and the ability to scale workers independently from your API servers.

The pattern is simple: the API route enqueues a message, a separate worker dequeues and executes it. Your API stays responsive, your tasks survive restarts, and you can monitor failures without digging through logs.

durable_tasks.py (Python)
# io.thecodeforge.durable_tasks
import time

from fastapi import BackgroundTasks, FastAPI
from redis import Redis
from rq import Queue

app = FastAPI()
redis = Redis()  # Connects to localhost:6379 by default
task_queue = Queue(connection=redis)


def send_email_persistent(user_id: str) -> None:
    # Runs in a separate RQ worker process, not in the API process
    print(f"Sending email to user {user_id} at {time.time()}")


@app.post("/signup")
def signup(user_id: str, background_tasks: BackgroundTasks):
    # Regular def on purpose: redis-py and RQ are synchronous clients, so this
    # endpoint runs in the thread pool and enqueue() never blocks the event loop.

    # Use BackgroundTasks for disposable logging
    background_tasks.add_task(lambda: print(f"Audit: {user_id} signed up"))

    # Use a proper queue for durable work
    task_queue.enqueue(send_email_persistent, user_id)
    return {"status": "queued", "user_id": user_id}


# Run the worker in a separate process: rq worker --with-scheduler
Output
API returns immediately. Worker processes email asynchronously. Restart API without losing queued emails.
Production Trap:
If you deploy with Docker, every restart kills BackgroundTasks. Always separate durable work into an external queue — your users won't wait, and your SREs won't page you at 3 AM.
Key Takeaway
BackgroundTasks is a scratchpad, not a database. Use real queues for work you can't afford to lose.

Monkey-Patch BackgroundTasks for Async I/O Without Blocking the Event Loop

FastAPI runs sync BackgroundTasks functions in a thread pool. Each sync task blocks a thread until completion. If you have 40 concurrent tasks doing file writes or HTTP calls, you exhaust your thread pool and degrade response times.

Async tasks are different. They run on the same event loop as your request handlers. But here's the gotcha: if an async task performs blocking I/O (like synchronous requests.get or file.write), it blocks the entire event loop — not just a thread.

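Solution: Use async def for any task whose I/O goes through async libraries. If you must call a synchronous library from an async task, wrap the call in asyncio.to_thread() (or loop.run_in_executor()) at the task boundary, so the blocking happens in a worker thread instead of on the event loop.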

This pattern keeps your event loop responsive while background work runs. The trick is being explicit about where the blocking happens — don't let Starlette decide for you.

async_io_tasks.py (Python)
# io.thecodeforge.async_io_tasks
import asyncio
import time

from fastapi import BackgroundTasks, FastAPI

app = FastAPI()

def blocking_file_write(data: str):
    time.sleep(1)  # Simulates blocking I/O
    print(f"Written: {data}")

async def async_http_call(url: str):
    await asyncio.sleep(1)  # Non-blocking
    print(f"Called: {url}")

@app.post("/process")
async def process(background_tasks: BackgroundTasks):
    # This blocks a thread pool worker
    background_tasks.add_task(blocking_file_write, "data")
    
    # This runs on the event loop — don't block inside it
    background_tasks.add_task(async_http_call, "https://example.com")
    
    return {"status": "ok"}
Output
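Response returns immediately. blocking_file_write runs in a thread pool worker; async_http_call runs on the event loop. Tasks added to the same BackgroundTasks instance run one after another in the order they were added, and neither one blocks request handling.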
Async Task Gotcha:
Don't put blocking I/O (requests.get, time.sleep) in async tasks. Wrap it in await asyncio.to_thread(...) or asyncio.get_running_loop().run_in_executor(None, ...), or migrate to async libraries (httpx, aiofiles). Your event loop will thank you.
Key Takeaway
Async tasks are not magic — blocking I/O in an async function blocks everything. Be explicit about thread vs. event loop boundaries.
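One way to make the thread vs. event loop boundary explicit is asyncio.to_thread. The sketch below is a minimal illustration, assuming Python 3.9+; legacy_write and heartbeat are hypothetical names standing in for a synchronous library call and for "other requests" on the loop. It uses only the standard library, so it shows the mechanism rather than a FastAPI-specific API.

```python
import asyncio
import time


def legacy_write(data: str) -> str:
    """Hypothetical stand-in for a synchronous library call."""
    time.sleep(0.3)  # blocks whichever thread runs it
    return f"written: {data}"


async def background_job(data: str) -> str:
    # Offload the blocking call to a worker thread; the event loop stays free.
    return await asyncio.to_thread(legacy_write, data)


async def heartbeat(ticks: int, interval: float = 0.05) -> int:
    """Counts how often the loop got a chance to run while the job was busy."""
    count = 0
    for _ in range(ticks):
        await asyncio.sleep(interval)
        count += 1
    return count


async def main() -> tuple[str, int, float]:
    start = time.perf_counter()
    result, beats = await asyncio.gather(background_job("data"), heartbeat(3))
    return result, beats, time.perf_counter() - start


if __name__ == "__main__":
    print(asyncio.run(main()))
```

If legacy_write were called directly inside background_job, the heartbeat could not tick until the write finished, and total time would be the sum of both waits instead of the longer one.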
● Production incident · POST-MORTEM · severity: high

Registration Endpoint Frozen — async def with Synchronous SMTP Killed the Event Loop

Symptom
All API endpoints became sluggish simultaneously during registration spikes — not just the registration endpoint. Health check endpoints started timing out. Kubernetes liveness probes began failing. No CPU spike was visible, no memory leak, no database errors — the server appeared nearly idle by resource metrics but refused new connections and returned responses with multi-second delays.
Assumption
The team assumed the PostgreSQL database was the bottleneck, since registration is a write-heavy operation. They spent two hours profiling slow queries, checking connection pool saturation, and reviewing the EXPLAIN ANALYZE output for the INSERT statement. The database was completely healthy — 200 idle connections, sub-5ms query times, no lock contention. The investigation was looking in entirely the wrong layer.
Root cause
The registration endpoint was declared as async def and called smtplib.SMTP().sendmail(), a synchronous blocking call that holds the executing thread for the duration of the SMTP handshake and data transfer. In asyncio, there is only one thread running the event loop. A blocking call on that thread does not just slow down the current request; it prevents every other coroutine from advancing for the entire duration of the block. With a 3-second SMTP handshake, every registration request held the event loop hostage for 3 seconds. Under modest load, this stacked: 10 concurrent registrations were served one at a time, so the event loop stayed blocked for a full 30 seconds. Health checks, order lookups, search requests: everything queued behind SMTP.
Fix
Moved email sending to BackgroundTasks using a regular synchronous function, which FastAPI automatically offloads to its internal thread pool. The event loop is no longer involved in the SMTP handshake. For services where email delivery reliability mattered more, switched to aiosmtplib for a truly async SMTP implementation that works correctly inside async def. Added a ruff lint rule (ASYNC210) to the CI pipeline to flag synchronous I/O calls inside async functions at the pull request stage, before they reach production. Added event loop lag monitoring via a middleware that records the delta between when a request callback is scheduled and when it actually executes — a lag above 100ms now triggers an alert.
Key lesson
  • Never call synchronous I/O inside async def. The event loop is single-threaded and cooperative — a blocking call does not yield, it occupies. Every other request on the server waits.
  • If you must use a synchronous library inside an async endpoint, wrap it: result = await asyncio.to_thread(sync_function, arg1, arg2). This delegates execution to the thread pool without blocking the event loop.
  • BackgroundTasks with a regular def function is the correct pattern for fire-and-forget sync work like email sending. The task runs in the thread pool after the response is sent, and the event loop is never involved.
  • Monitor event loop lag as a first-class production metric. CPU and memory metrics will look normal during an event loop blockage — the server appears idle while being completely unresponsive. Lag monitoring is the only reliable way to catch blocking calls before users notice.
  • Add async-aware linting to CI. Ruff's ASYNC rule set detects common blocking patterns inside async functions — time.sleep, requests.get, synchronous file I/O. Catching these at review time costs nothing; catching them in production costs hours.
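The lag metric from the lessons above can be approximated with a few lines of asyncio. This is a minimal sketch, not the middleware from the incident: measure_loop_lag and offender are hypothetical names, and the probe simply asks the loop to wake it after a fixed interval and records how late the wake-up arrives.

```python
import asyncio
import time


async def measure_loop_lag(samples: int, interval: float = 0.05) -> float:
    """Return the worst observed lag (seconds) over `samples` wake-ups.

    Lag is how much later than requested the loop resumed this coroutine.
    """
    loop = asyncio.get_running_loop()
    worst = 0.0
    for _ in range(samples):
        scheduled = loop.time() + interval
        await asyncio.sleep(interval)
        worst = max(worst, loop.time() - scheduled)
    return worst


async def offender() -> None:
    """Deliberately blocks the loop, as the sync SMTP call did."""
    await asyncio.sleep(0.01)
    time.sleep(0.3)  # blocking call inside async def


async def demo() -> tuple[float, float]:
    healthy = await measure_loop_lag(samples=3)
    blocked, _ = await asyncio.gather(measure_loop_lag(samples=3), offender())
    return healthy, blocked


if __name__ == "__main__":
    healthy, blocked = asyncio.run(demo())
    print(f"healthy lag: {healthy:.3f}s, blocked lag: {blocked:.3f}s")
```

In a real service the probe would run as a long-lived task and export the value to your metrics platform; the 100ms alert threshold from the incident applies to that exported number.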
Production debug guide · Symptom → Action mapping for async/blocking problems · 5 entries
Symptom · 01
All endpoints slow during traffic spikes, but CPU usage is low and the database looks healthy
Fix
This pattern — low CPU, healthy database, slow responses — is almost always an event loop blockage. Enable PYTHONASYNCIODEBUG=1 to activate asyncio's slow callback detector, which logs any callback that takes longer than 100ms to complete. Search your async endpoints for time.sleep(), requests.get(), synchronous file operations, or any database driver that is not async-native. The blocking call is almost always in an async def function that looks correct at a glance. Once found, either convert the endpoint to regular def or wrap the blocking call with await asyncio.to_thread().
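To see what the slow callback detector reports before turning it on in a real service, you can reproduce it in isolation. The sketch below assumes nothing beyond the standard library: asyncio.run(..., debug=True) enables the same debug mode as PYTHONASYNCIODEBUG=1, and the warning is emitted on the "asyncio" logger. ListHandler, blocking_endpoint and find_slow_callbacks are hypothetical names for illustration.

```python
import asyncio
import logging
import time


class ListHandler(logging.Handler):
    """Collects log messages in memory so the demo can inspect them."""

    def __init__(self) -> None:
        super().__init__()
        self.messages: list[str] = []

    def emit(self, record: logging.LogRecord) -> None:
        self.messages.append(record.getMessage())


async def blocking_endpoint() -> None:
    time.sleep(0.2)  # longer than the default 0.1 s slow-callback threshold


def find_slow_callbacks() -> list[str]:
    handler = ListHandler()
    logger = logging.getLogger("asyncio")
    logger.addHandler(handler)
    try:
        # debug=True is the in-code equivalent of PYTHONASYNCIODEBUG=1
        asyncio.run(blocking_endpoint(), debug=True)
    finally:
        logger.removeHandler(handler)
    return [message for message in handler.messages if "took" in message]


if __name__ == "__main__":
    for line in find_slow_callbacks():
        print(line)
```

The logged message names the task and coroutine that held the loop, which is usually enough to locate the offending endpoint.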
Symptom · 02
BackgroundTask email or notification never arrives, but the API returns 201 every time
Fix
FastAPI silently swallows exceptions raised inside background functions — the HTTP response has already been sent by the time the exception occurs, so there is nowhere to propagate it. Check server logs for unhandled exceptions in background functions, but if you have not added explicit try/except with logging inside the background function, there will be nothing to find. Add structured error logging as an immediate fix. Add a Prometheus counter or Datadog metric for background task failures so silent failures produce an observable signal going forward.
Symptom · 03
Background tasks execute slowly and subsequent requests are noticeably delayed
Fix
Check whether the background function is declared as async def. Async background functions run on the event loop, not in the thread pool, which means a slow async background task can starve other event loop operations. For CPU-bound or slow I/O background work, the background function should be a regular def — FastAPI runs it in the thread pool, leaving the event loop free. If the function genuinely needs to be async (it calls async libraries), ensure it properly awaits all I/O and does not contain any blocking calls.
Symptom · 04
Server handles only 10–20 concurrent requests despite low CPU and memory usage
Fix
Check the ASGI server worker count first: Uvicorn defaults to a single worker process, which means a single event loop. Increase with --workers to match available CPU cores: uvicorn main:app --workers 4. If you are already running multiple workers, check for thread pool exhaustion on sync endpoints. FastAPI runs sync endpoints on AnyIO worker threads, which are capped by a default limiter of 40 concurrent threads per process (the min(32, cpu_count + 4) figure is the size of asyncio's own default executor, used by asyncio.to_thread and run_in_executor). If all threads are occupied by long-running synchronous operations, subsequent sync requests queue. Monitor active thread count and consider converting high-traffic sync endpoints to async with native async libraries.
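The queuing effect is easy to reproduce outside FastAPI. The sketch below is an illustration only: it uses a plain concurrent.futures.ThreadPoolExecutor with a deliberately tiny pool rather than FastAPI's actual worker threads, and slow_sync_handler is a hypothetical endpoint body that holds a thread for 0.2 seconds.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_sync_handler(request_id: int) -> int:
    """Hypothetical sync endpoint body that holds a thread for 0.2 s."""
    time.sleep(0.2)
    return request_id


def run_requests(pool_size: int, request_count: int) -> float:
    """Return wall-clock seconds to finish all requests on a fixed-size pool."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        list(pool.map(slow_sync_handler, range(request_count)))
    return time.perf_counter() - start


if __name__ == "__main__":
    print(f"4 requests, 4 threads: {run_requests(4, 4):.2f}s")
    print(f"4 requests, 2 threads: {run_requests(2, 4):.2f}s")
```

With enough threads the four requests overlap; with two threads the second pair waits for the first pair to finish, so total time roughly doubles even though CPU usage stays near zero.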
Symptom · 05
Background task data is lost after server restart or rolling deploy
Fix
This is expected behavior, not a bug. BackgroundTasks are in-memory and tied to the process lifecycle. There is no persistence layer, no broker, no checkpointing. Tasks that are queued or in-flight when the process receives SIGTERM are dropped. For tasks that must survive restarts — user notifications, order confirmation emails, data pipeline steps — migrate to Celery with a Redis or RabbitMQ broker, or ARQ if you prefer a lighter async-native alternative. The migration is straightforward: the function body stays the same; you change how you enqueue the task.
★ FastAPI Concurrency Debug Cheat Sheet
When your FastAPI server is slow or background tasks are failing, run these checks in order. Start with the event loop before touching infrastructure.
Event loop blocked — all endpoints frozen, CPU low, health checks timing out
Immediate action
Find which async endpoint is making a synchronous blocking call — this is the cause in 90% of cases
Commands
PYTHONASYNCIODEBUG=1 uvicorn main:app --log-level debug 2>&1 | grep -i 'Executing.*took\|slow callback\|blocked'
grep -rl 'async def' app/ | xargs grep -n 'time\.sleep\|requests\.get\|smtplib\|open(' 2>/dev/null
Fix now
Wrap the blocking call with await asyncio.to_thread(sync_function, args) or convert the endpoint to regular def so FastAPI offloads it to the thread pool automatically
Background task silently failing: API returns success but side effects never happen
Immediate action
Check server logs for unhandled exceptions in the background function scope — if none exist, the function has no error handling
Commands
docker compose logs api --tail=200 | grep -i 'error\|exception\|traceback\|background'
grep -rn 'add_task' app/ | head -20
Fix now
Add try/except with structured logging inside every background function body. Add a metric counter for failures so silent errors produce an observable signal in your monitoring platform
Low throughput despite low CPU: requests queuing with no obvious bottleneck
Immediate action
Check Uvicorn worker count and thread pool saturation before looking at the application code
Commands
ps aux | grep uvicorn | grep -v grep
curl -s http://localhost:8000/metrics | grep -i 'thread\|worker\|active'
Fix now
Increase workers: uvicorn main:app --workers $(nproc). If thread pool saturation is the issue, convert high-traffic sync endpoints to async with native async libraries, or raise the AnyIO thread limit at application startup by assigning a larger value to anyio.to_thread.current_default_thread_limiter().total_tokens
BackgroundTasks vs Celery vs Direct Async
| Feature | BackgroundTasks | Celery | Direct Async Endpoint |
| --- | --- | --- | --- |
| Execution Timing | After response is sent; client never waits | On worker pickup, independent of API process | During request; client waits for completion |
| Persistence | None: in-memory, lost on process death | Yes: written to broker (Redis/RabbitMQ) before execution | N/A: completes or fails within the request lifecycle |
| Retries | No: one attempt, silent failure on exception | Yes: configurable retry policies, exponential backoff, dead-letter queues | No: handle failures in endpoint logic manually |
| Infrastructure | None: built into FastAPI, zero dependencies | Broker (Redis or RabbitMQ) + separate worker processes | None: runs in the same ASGI worker as the request |
| Survives Restart | No: queued and in-progress tasks are dropped | Yes: broker holds tasks until a worker picks them up | N/A |
| Best For | Welcome emails, audit logs, cache invalidation, analytics events | Video processing, payment workflows, PDF generation, data pipelines | Database queries, external API calls, any work the user needs to wait for |
| Performance Impact | Minimal: sync tasks run in thread pool, async tasks on event loop | Scales independently from API: dedicated worker fleet | Blocks event loop if sync code is used in async def; safe in regular def via thread pool |

Key takeaways

1. BackgroundTasks is ideal for fire-and-forget logic that does not require a durable distributed system, but know its limits precisely: no retries, no persistence, no survival on restart. The boundary between BackgroundTasks and a durable queue is the question 'does losing this task cause a business impact?'
2. Endpoints declared with async def must use only non-blocking calls at every I/O boundary. time.sleep(), requests.get(), synchronous database drivers, and synchronous file I/O inside async def are production incidents in waiting: they block the entire event loop and freeze all concurrent requests with no error signal.
3. Regular def endpoints are safe for synchronous and CPU-bound code because FastAPI automatically submits them to the AnyIO thread pool. That pool is capped by AnyIO's default limiter of 40 concurrent threads (min(32, os.cpu_count() + 4) is the size of asyncio's own default executor, used by asyncio.to_thread); exhaust it and sync endpoint requests queue silently.
4. The client receives the HTTP response before any code in BackgroundTasks starts executing. The client's measured latency excludes all background work. This is the correct pattern for any side effect the user does not need to wait for.
5. Exceptions inside BackgroundTasks are silently swallowed: there is no error path to the client after the response has been sent. Every background function must have a top-level try/except block with structured logging and a failure metric counter. Without it, failures are completely invisible.
6. For complex, long-running, or mission-critical jobs that must survive server restarts, use a durable task queue. Celery with Redis or RabbitMQ for most teams; ARQ for async-native workflows where Celery's operational weight is not justified.

Common mistakes to avoid

5 patterns

Using async def with synchronous I/O calls — the #1 FastAPI production incident

Symptom
The entire server freezes for the duration of every blocking call. Under load, P99 latency spikes from milliseconds to seconds. CPU usage stays near zero — the server appears idle by every resource metric. Other endpoints, including health checks, become unresponsive. Kubernetes starts restarting pods due to liveness probe failures. The blocking endpoint itself looks fine in isolation; the problem only surfaces under concurrent load.
Fix
Option 1 (preferred): Replace the synchronous library with an async-native equivalent — httpx instead of requests, asyncpg instead of psycopg2, aiosmtplib instead of smtplib. Option 2: Wrap the blocking call with result = await asyncio.to_thread(sync_function, arg1, arg2) to delegate it to the thread pool. Option 3: Convert the endpoint from async def to regular def if it has no legitimate async operations. Enable ruff ASYNC lint rules in CI to catch this class of mistake before it reaches production.

Expecting BackgroundTasks to survive server restarts, rolling deploys, or OOM kills

Symptom
After a rolling deploy during a traffic spike, a batch of users registered in the 30 seconds before the old pods were terminated never receive welcome emails. No errors appear in logs — the tasks were queued but the process was killed before they executed. Support receives user complaints hours later with no record of the failure.
Fix
BackgroundTasks are in-memory and tied entirely to the process lifecycle. There is no persistence, no handoff, no graceful drain on shutdown. For tasks that must complete regardless of server state, use Celery with Redis or RabbitMQ as the broker. Tasks written to the broker survive process death — a new worker picks them up when it starts. ARQ is a lighter async-native alternative if you are already using Redis and want to avoid Celery's operational complexity.

Not handling exceptions inside background functions

Symptom
API returns 201 Created consistently. Users report never receiving welcome emails. Server logs contain no errors. The failure is completely invisible — there is no error path from a background function exception to any observable output unless you explicitly create one. The team discovers the problem via user complaints, not monitoring.
Fix
Wrap every background function body in a top-level try/except block. Log the exception with exc_info=True so the full traceback is captured. In production, also increment a failure metric: a Prometheus counter or Datadog metric named something like forge_bg_task_failures_total with a task_name label. This converts a silent failure into an observable signal that your alerting can detect.
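The try/except, traceback logging and failure counter described above can be packaged once as a decorator. This is a minimal sketch for synchronous background functions only: observed_task, bg_task_failures and send_welcome_email are hypothetical names, and the Counter is a stand-in for whatever Prometheus or Datadog client you actually use.

```python
import functools
import logging
from collections import Counter
from typing import Any, Callable

logger = logging.getLogger("background_tasks")

# Stand-in for a Prometheus/Datadog counter; swap in your metrics client.
bg_task_failures: Counter = Counter()


def observed_task(func: Callable[..., Any]) -> Callable[..., None]:
    """Wrap a sync background function so failures are logged and counted."""

    @functools.wraps(func)
    def wrapper(*args: Any, **kwargs: Any) -> None:
        try:
            func(*args, **kwargs)
        except Exception:
            # exc_info=True attaches the full traceback to the log record
            logger.error("background task %s failed", func.__name__, exc_info=True)
            bg_task_failures[func.__name__] += 1

    return wrapper


@observed_task
def send_welcome_email(user_id: str) -> None:
    """Hypothetical task that always fails, to exercise the failure path."""
    raise ConnectionError(f"SMTP unreachable for {user_id}")


if __name__ == "__main__":
    send_welcome_email("user-1")
    print(dict(bg_task_failures))
```

The decorated function is registered the usual way, for example background_tasks.add_task(send_welcome_email, user_id). An async background function would need an async wrapper that awaits the call instead.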

Running CPU-bound work inside async def without offloading to a separate executor

Symptom
Single CPU core spikes to 100% while all other cores remain idle. The event loop is occupied by CPU computation, which does not yield between iterations. All concurrent requests queue behind the CPU work. Scaling to more Uvicorn workers partially helps (each worker gets its own CPU core) but the GIL still prevents true parallelism within a single worker process for pure Python code.
Fix
For moderate CPU work that needs to stay in the API process: wrap with await asyncio.to_thread(cpu_function, args) to run in the thread pool. For heavy CPU work: use a ProcessPoolExecutor explicitly (await asyncio.get_running_loop().run_in_executor(process_pool, cpu_function, args)) to escape the GIL and use multiple cores. For production-scale CPU-intensive work: use Celery workers running in dedicated processes, scaled independently from the API fleet.

Using time.sleep() instead of asyncio.sleep() inside async endpoints

Symptom
An endpoint with a time.sleep(2) call for a polling delay or rate limit freezes the entire server for 2 seconds under concurrent load. Health check liveness probes time out at 1-second intervals. Kubernetes marks the pod as unhealthy and restarts it. After restart, the same behavior recurs immediately because the code was not changed — only the infrastructure responded to the symptom.
Fix
Replace every instance of time.sleep(n) inside async def with await asyncio.sleep(n). The asyncio version yields control to the event loop for the duration, so other requests are processed normally during the wait. Add a pre-commit hook or a Ruff ASYNC rule (ASYNC251, blocking-sleep-in-async-function, in recent Ruff releases; older releases reported it under ASYNC101) to flag time.sleep calls inside async def functions. This is a one-line mistake with severe production consequences and a trivially detectable pattern: catch it in tooling, not in postmortems.
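The difference between the two sleeps is easiest to see by timing them side by side. A minimal sketch using only the standard library; polite_wait, rude_wait and timed are hypothetical names, and the three concurrent workers stand in for three concurrent requests.

```python
import asyncio
import time


async def polite_wait(delay: float) -> None:
    await asyncio.sleep(delay)  # yields to the event loop while waiting


async def rude_wait(delay: float) -> None:
    time.sleep(delay)  # holds the event loop; nothing else can run


async def timed(worker, count: int, delay: float) -> float:
    """Run `count` workers concurrently and return elapsed seconds."""
    start = time.perf_counter()
    await asyncio.gather(*(worker(delay) for _ in range(count)))
    return time.perf_counter() - start


if __name__ == "__main__":
    print(f"asyncio.sleep: {asyncio.run(timed(polite_wait, 3, 0.1)):.2f}s")
    print(f"time.sleep:    {asyncio.run(timed(rude_wait, 3, 0.1)):.2f}s")
```

With asyncio.sleep the three waits overlap and finish in roughly one delay; with time.sleep they run back to back, which is exactly how one sleeping endpoint stalls every other request.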
INTERVIEW PREP · PRACTICE MODE

Interview Questions on This Topic

Q01 · SENIOR: Explain how FastAPI's internal thread pool handles synchronous def funct...
Q02 · SENIOR: What is the event loop, and how does calling time.sleep() inside an asyn...
Q03 · SENIOR: Scenario: You need to process an image upload and generate a thumbnail. ...
Q04 · SENIOR: How does FastAPI's BackgroundTasks differ from a raw threading.Thread im...
Q05 · SENIOR: If a FastAPI server restarts while a BackgroundTask is executing, what h...

Q01 of 05 · SENIOR

Explain how FastAPI's internal thread pool handles synchronous def functions versus async def functions — and what happens when a developer gets the choice wrong.

ANSWER
FastAPI's behavior is fundamentally different depending on function signature, and the difference is not cosmetic; it is architectural.

For regular def endpoints, FastAPI submits the function to AnyIO's worker thread pool and awaits the result asynchronously. The event loop posts the work item and immediately returns to handling other requests. When the thread pool completes the function, the event loop resumes the request coroutine and sends the response. The event loop is never blocked, even if the function takes several seconds.

For async def endpoints, FastAPI awaits the coroutine directly on the event loop thread. No thread is allocated. The function runs cooperatively: it progresses until it hits an await expression, yields control, and resumes when the awaited operation completes. The assumption is that every I/O operation is async-native and will yield quickly.

When a developer writes synchronous blocking code inside async def (time.sleep(), requests.get(), a synchronous database driver), the cooperative model breaks. The blocking call does not yield. The event loop thread is occupied for the duration of the call. No other coroutine can run: no other request can be processed, no health checks can respond, no background tasks can execute. The server appears idle by CPU and memory metrics while being completely unresponsive to new requests.

The thread pool has its own limit. AnyIO's default thread limiter allows 40 concurrent worker threads per process (min(32, os.cpu_count() + 4) is the size of asyncio's own default executor, which asyncio.to_thread uses). If 40 sync endpoints hold threads simultaneously, request 41 queues. This is thread pool exhaustion: same symptom as event loop blockage, different root cause, different fix.

The diagnostic tells them apart: event loop blockage shows zero CPU with all requests slow. Thread pool exhaustion shows moderate CPU with sync endpoint requests queuing while async endpoints respond normally.
FAQ · 3 QUESTIONS

Frequently Asked Questions

01. What is the difference between BackgroundTasks and Celery?
02. Can I use async def with a database driver that has only a synchronous API?
03. Is it possible to return data from a BackgroundTask to the user?