FastAPI Background Tasks and Async Endpoints
Master FastAPI concurrency.
20+ years shipping production Python across data and backend systems. Everything here is grounded in real deployments.
- FastAPI handles concurrency via two paths: BackgroundTasks for post-response work and async/await for non-blocking I/O
- Use async def only when calling async libraries (httpx, asyncpg, motor), never for synchronous blocking code
- Use regular def for synchronous or CPU-bound work; FastAPI auto-offloads it to a thread pool via AnyIO
- BackgroundTasks execute AFTER the response is sent, so the client never waits for them
- Calling time.sleep() inside async def blocks the entire event loop and freezes every concurrent request on the server
- BackgroundTasks are lost on server restart; use Celery or ARQ for mission-critical jobs that must survive crashes
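- The AnyIO thread pool that runs def endpoints is capped at 40 threads by default (min(32, os.cpu_count() + 4) is the separate default for the executor behind asyncio.to_thread); exhaust it and sync endpoint requests start queuing silently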
Think of a restaurant kitchen. BackgroundTasks is like telling the waiter 'bring the check now, I'll clear the table after the customer leaves' — the customer walks out happy, and cleanup happens without holding up the next seating. The async/await choice is like deciding whether a chef can tend multiple dishes simultaneously (async — the chef stirs one pot, sets a timer, and moves to the next) or needs to stand in front of one dish with full attention until it is done (sync — the chef is occupied, and everyone else waits). Put a synchronous chef in an async kitchen and the entire service grinds to a halt the moment one dish needs three minutes of uninterrupted attention.
FastAPI is built on Starlette and asyncio, which makes it one of the fastest Python web frameworks available. That speed is not free — it is entirely conditional on the developer understanding how the event loop works and respecting its rules. Unlike traditional synchronous frameworks where every request gets its own thread and isolation is automatic, FastAPI can handle thousands of concurrent connections on a single thread. The moment you introduce a blocking call into that single thread, every one of those connections stalls together.
The core decision point is deceptively simple on the surface: async def or def. Get it right and you get the throughput numbers in the benchmarks. Get it wrong and your production server handles 10 requests per second instead of 10,000 — with no obvious error, no alarm, just a wall of slow responses and a CPU that looks inexplicably idle.
This guide covers both concurrency paths — BackgroundTasks for fire-and-forget post-response logic, and async/await for non-blocking I/O — with the kind of specificity that only comes from seeing both patterns succeed and fail in production. The failure modes are predictable. The fixes are mechanical once you understand the model.
Why FastAPI Background Tasks Need Async Awareness
FastAPI's BackgroundTasks is a lightweight mechanism to schedule work after returning an HTTP response, without blocking the client. It runs tasks in the same process, reusing the event loop thread pool. The core mechanic: you declare a BackgroundTasks parameter in your endpoint, add callables to it, and FastAPI executes them after the response is sent. This is not a queue system — it's a fire-and-forget pattern for short-lived operations.
Key property: how a background task runs depends on the task's own signature, not on the endpoint's. FastAPI runs a regular def task in the thread pool, so blocking I/O inside it (sending an email, writing to a file) does not block the event loop. That pool is shared with sync endpoints, however, and can starve if tasks are slow. An async def task is awaited on the event loop itself, so any blocking call inside it stalls every concurrent request unless you wrap it in asyncio.to_thread or use an async library.
Use BackgroundTasks for operations that must happen after the response but can tolerate a few seconds of delay: logging, cache invalidation, sending notifications. Do not use it for critical work that requires durability, retries, or monitoring — that's what Celery or RQ are for. The sweet spot is sub-second, non-critical side effects where you'd otherwise block the response.
BackgroundTasks — Run After Response
The standard HTTP request-response cycle forces the client to wait for every operation inside the endpoint to complete before receiving a response. For operations that the user does not need to wait for — sending a welcome email, writing an audit log entry, invalidating a CDN cache — this is pure latency overhead with no user-facing benefit.
FastAPI's BackgroundTasks class solves this with a clean hook into the ASGI response lifecycle. You register a function (and its arguments) with the task manager inside your endpoint. FastAPI sends the HTTP response to the client first, then executes the registered functions afterward. The client's round-trip time reflects only the endpoint logic, not the background work.
The execution model has a few characteristics worth understanding clearly. Tasks are executed sequentially, not in parallel — if you register three background tasks, they run one after another in registration order, not concurrently. Regular def background functions run in FastAPI's thread pool, keeping the event loop free. Async def background functions run on the event loop itself — which means a slow async background function can delay other requests if it does not yield control frequently.
The hardest thing to accept about BackgroundTasks is what it explicitly is not. It is not a task queue. It has no retry mechanism. It has no persistence layer. It has no scheduling capability. It has no visibility into task status after the response is sent. If the server process dies while a background task is running or queued, the task is gone with no record of it. This is not a limitation to work around — it is a design boundary that defines where BackgroundTasks is appropriate and where a durable task queue is required.
The appropriate use cases are narrow but common: sending transactional emails, writing audit or activity log entries, cache invalidation, incrementing counters in analytics systems where occasional data loss is acceptable. The inappropriate use cases are equally clear: payment processing, order fulfillment, document generation, any operation where losing the task in a crash would cause a business impact.
- Tasks run AFTER the response is fully sent — the client's round-trip time excludes background work entirely.
- Tasks run sequentially in registration order — a slow first task delays all subsequent tasks. There is no parallelism between background tasks.
- Regular def background functions run in the thread pool — synchronous I/O is safe. Async def background functions run on the event loop — blocking calls inside them affect all concurrent requests.
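- Exceptions inside background functions never reach the client, because the response has already been sent. At best they surface as a traceback in the server log, and an unhandled exception also stops any later tasks registered on the same request from running. Without explicit try/except and logging inside the background function, failures are effectively invisible. This is the most common operational mistake with BackgroundTasks.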
- No retries, no persistence, no scheduling — if the process dies, tasks die with it. Use for: welcome emails, audit logs, cache invalidation. Do NOT use for: payment processing, order fulfillment, document generation, or any operation where losing the task causes a business impact.
async def vs def — When to Use Which
This is the single most consequential decision in any FastAPI codebase, and it is the most frequently misunderstood. The surface-level explanation — 'use async def for async code and def for sync code' — is correct but incomplete. Understanding why FastAPI behaves differently based on the function signature is what separates code that performs from code that silently degrades under load.
When FastAPI receives a request for a regular def endpoint, it does not execute the function on the event loop. It submits the function to AnyIO's thread pool executor and awaits the result. The event loop is free to handle other requests while the sync function runs in a worker thread. This is why synchronous code inside a regular def endpoint is completely safe — it runs in isolation from the event loop.
When FastAPI receives a request for an async def endpoint, it awaits the coroutine directly on the event loop thread. No thread is involved. The coroutine runs cooperatively — it progresses until it hits an await expression, yields control back to the event loop, and resumes when the awaited operation completes. This cooperative yielding is what allows thousands of concurrent requests to share a single thread efficiently. The contract is simple: every operation inside an async def function must yield at every I/O boundary.
Break that contract — call time.sleep(), call requests.get(), open a synchronous database connection — and the cooperative model collapses. The event loop thread is occupied for the duration of the blocking call, and every other coroutine waits. There is no error, no warning, no exception. The server simply becomes unresponsive in proportion to how long and how frequently the blocking call occurs.
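The collapse is easy to reproduce outside FastAPI. The sketch below uses only the standard library; blocking_handler and cooperative_handler are hypothetical stand-ins for two async def endpoints, and the 0.2-second delay is an arbitrary value chosen to keep the demo short.

```python
import asyncio
import time


async def blocking_handler() -> None:
    time.sleep(0.2)  # never yields: the event loop thread is stuck here


async def cooperative_handler() -> None:
    await asyncio.sleep(0.2)  # yields to the loop while waiting


async def time_concurrent(handler, n: int = 5) -> float:
    """Run n copies of handler concurrently and return the wall-clock time."""
    start = time.perf_counter()
    await asyncio.gather(*(handler() for _ in range(n)))
    return time.perf_counter() - start


blocking_elapsed = asyncio.run(time_concurrent(blocking_handler))
cooperative_elapsed = asyncio.run(time_concurrent(cooperative_handler))
print(f"time.sleep:    {blocking_elapsed:.2f}s")     # the five calls run back to back
print(f"asyncio.sleep: {cooperative_elapsed:.2f}s")  # the five waits overlap
```

Both handlers are declared async def and both "wait" for the same duration. Only the one that awaits lets the other four make progress in the meantime.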
The thread pool that handles regular def endpoints is AnyIO's worker thread pool, which is capped at 40 threads by default. The min(32, os.cpu_count() + 4) formula often quoted in this context is the default for concurrent.futures.ThreadPoolExecutor, which backs asyncio.to_thread and run_in_executor; on a 4-core machine that is 8 threads. If 40 sync endpoints are all executing simultaneously, each holding a thread for a database query, a file read, or an external HTTP call, request 41 must wait for a thread to become available. This is thread pool exhaustion, and it produces the same symptom as event loop blockage: slow responses with low CPU usage. Monitor active thread count, then tune the pool size or migrate high-traffic sync endpoints to async libraries.
The practical decision framework is mechanical: does the code call an async library? Use async def with await. Does the code call a synchronous library? Use regular def. Does the code mix both? Use async def and wrap every synchronous call with await asyncio.to_thread(). Follow this consistently and the framework handles the rest.
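The mixed case is the one that most often goes wrong, so here is a small standard-library sketch of it. legacy_lookup and fetch_flags are hypothetical placeholders for a synchronous library call and a native-async call; handler plays the role of an async def endpoint.

```python
import asyncio
import time


def legacy_lookup(user_id: int) -> dict:
    """Hypothetical synchronous library call (blocking driver, SDK, ...)."""
    time.sleep(0.2)
    return {"id": user_id, "plan": "pro"}


async def fetch_flags(user_id: int) -> list[str]:
    """Hypothetical native-async call; awaiting it yields to the loop."""
    await asyncio.sleep(0.2)
    return ["beta"]


async def handler(user_id: int) -> dict:
    # Mixed case: await the async call directly, push the sync one to a thread.
    flags, profile = await asyncio.gather(
        fetch_flags(user_id),
        asyncio.to_thread(legacy_lookup, user_id),
    )
    return {**profile, "flags": flags}


async def main() -> tuple[list[dict], float]:
    start = time.perf_counter()
    results = await asyncio.gather(*(handler(i) for i in range(3)))
    return results, time.perf_counter() - start


results, elapsed = asyncio.run(main())
print(results[0], f"{elapsed:.2f}s")
```

Three concurrent handlers finish in roughly the time of one, because neither the async call nor the thread-wrapped sync call occupies the event loop while waiting.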
- If your async def endpoint contains time.sleep(), requests.get(), smtplib, synchronous file I/O, or any blocking database driver, you are blocking every concurrent request on the server for the duration of that call. The server does not error. It does not warn. It silently queues everything.
- Ten concurrent requests that each make a 3-second blocking call in an async def endpoint run one after another, so the event loop is blocked for a full 30 seconds. Health checks fail. Kubernetes restarts the pod. The restart clears the queue and the cycle repeats.
- The fix options in priority order: (1) Use a native async library instead (httpx instead of requests, asyncpg instead of psycopg2, aiosmtplib instead of smtplib). (2) Wrap the blocking call: result = await asyncio.to_thread(blocking_function, arg1). (3) Convert the endpoint to regular def if it has no legitimate async operations.
- Enable ruff's ASYNC lint rules in CI. Rule codes have been renumbered across ruff releases, so select the whole ASYNC family rather than individual codes such as ASYNC101 through ASYNC220. These rules detect time.sleep, open(), and common synchronous I/O calls inside async functions at review time. Catching this class of bug before it merges costs seconds. Catching it in production at 2 AM costs hours.
- Add event loop lag monitoring to production. A lag metric above 100ms is a reliable early warning of a blocking call somewhere in the codebase. CPU and memory metrics will not show it.
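Event loop lag can be measured with a small watchdog coroutine. The sketch below is one possible approach using only the standard library: monitor_lag is a hypothetical helper that sleeps for a fixed interval and records how late it wakes up. In a real service the samples would be exported to a metrics system instead of kept in a list.

```python
import asyncio
import time


async def monitor_lag(samples: list[float], interval: float = 0.05) -> None:
    """Record how late each wake-up is compared with the requested interval."""
    loop = asyncio.get_running_loop()
    while True:
        expected = loop.time() + interval
        await asyncio.sleep(interval)
        samples.append(max(0.0, loop.time() - expected))


async def main() -> float:
    samples: list[float] = []
    monitor = asyncio.create_task(monitor_lag(samples))
    await asyncio.sleep(0.12)  # healthy period: wake-ups arrive on time
    time.sleep(0.3)            # simulated blocking call on the loop thread
    await asyncio.sleep(0.12)  # give the monitor a chance to observe the stall
    monitor.cancel()
    await asyncio.gather(monitor, return_exceptions=True)
    return max(samples)


worst_lag = asyncio.run(main())
print(f"worst lag: {worst_lag * 1000:.0f} ms")
```

The watchdog cannot wake up while the loop thread is stuck in time.sleep, so the stall shows up as one very late sample. Alerting on that value catches blocking calls that CPU and memory graphs miss.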
In an async def endpoint, wrap every blocking synchronous call: await asyncio.to_thread(sync_call, args). Never call a blocking sync function directly from async def.
BackgroundTasks Drops Tasks on Restart: Horizontally Scale or Lose Work
The built-in BackgroundTasks lives in memory. No persistence. No retry. If your process crashes or you deploy a new version, every queued task vanishes. In production, this is not a bug — it's a design limitation you must work around.
Use BackgroundTasks only for fire-and-forget work you can afford to lose. Welcome emails? Fine. Audit logs? Fine. Payment processing? Absolutely not.
For durable workloads, push tasks into a proper queue (Redis RQ, Celery, or SQS) with a worker pool. This gives you retry logic, task persistence, and the ability to scale workers independently from your API servers.
The pattern is simple: the API route enqueues a message, a separate worker dequeues and executes it. Your API stays responsive, your tasks survive restarts, and you can monitor failures without digging through logs.
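The enqueue/dequeue split can be illustrated without installing a broker. The sketch below swaps in a SQLite table as a stand-in for Redis, Celery, or SQS, purely so the example runs with the standard library. The jobs table, the enqueue and work_once functions, and the single-worker assumption are all choices made for this illustration; a real queue also handles locking between competing workers.

```python
import json
import os
import sqlite3
import tempfile


def connect(path: str) -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS jobs ("
        " id INTEGER PRIMARY KEY,"
        " kind TEXT NOT NULL,"
        " payload TEXT NOT NULL,"
        " status TEXT NOT NULL DEFAULT 'pending',"
        " attempts INTEGER NOT NULL DEFAULT 0)"
    )
    conn.commit()
    return conn


def enqueue(conn: sqlite3.Connection, kind: str, payload: dict) -> int:
    """API side: persist the job and return immediately."""
    cur = conn.execute(
        "INSERT INTO jobs (kind, payload) VALUES (?, ?)", (kind, json.dumps(payload))
    )
    conn.commit()
    return cur.lastrowid


def work_once(conn: sqlite3.Connection, handlers: dict, max_attempts: int = 3) -> bool:
    """Worker side: run the oldest pending job. Returns False when idle."""
    row = conn.execute(
        "SELECT id, kind, payload, attempts FROM jobs"
        " WHERE status = 'pending' ORDER BY id LIMIT 1"
    ).fetchone()
    if row is None:
        return False
    job_id, kind, payload, attempts = row
    try:
        handlers[kind](json.loads(payload))
        status = "done"
    except Exception:
        # Leave the job pending for a retry until the attempt budget runs out.
        status = "failed" if attempts + 1 >= max_attempts else "pending"
    conn.execute(
        "UPDATE jobs SET status = ?, attempts = ? WHERE id = ?",
        (status, attempts + 1, job_id),
    )
    conn.commit()
    return True


db_path = os.path.join(tempfile.mkdtemp(), "jobs.db")
api_conn = connect(db_path)
enqueue(api_conn, "charge", {"order_id": 42})
api_conn.close()  # simulate the API process exiting before the job runs

processed: list[dict] = []
worker_conn = connect(db_path)  # a separate worker process would do this
work_once(worker_conn, {"charge": processed.append})
print(processed)
```

The job outlives the connection that created it, which is exactly the property BackgroundTasks lacks. The status and attempts columns are also what make failures queryable instead of buried in logs.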
Monkey-Patch BackgroundTasks for Async I/O Without Blocking the Event Loop
FastAPI runs sync BackgroundTasks functions in a thread pool. Each sync task blocks a thread until completion. If you have 40 concurrent tasks doing file writes or HTTP calls, you exhaust your thread pool and degrade response times.
Async tasks are different. They run on the same event loop as your request handlers. But here's the gotcha: if an async task performs blocking I/O (like synchronous requests.get or file.write), it blocks the entire event loop — not just a thread.
Solution: Use async def for any task that does I/O. If you must call a synchronous library, wrap it in run_in_executor. Or better, monkey-patch your sync calls at the task boundary.
This pattern keeps your event loop responsive while background work runs. The trick is being explicit about where the blocking happens — don't let Starlette decide for you.
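One way to make the blocking boundary explicit is an async task that hands its synchronous part to a dedicated executor. The sketch below is an assumption-laden illustration rather than a prescribed pattern: background_pool, write_report, and write_report_task are hypothetical names, and the pool size of 4 is arbitrary. It uses run_in_executor with a separate ThreadPoolExecutor so that background work does not compete with the threads serving sync endpoints.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor

# Dedicated, bounded pool for background work (size is an arbitrary example).
background_pool = ThreadPoolExecutor(max_workers=4, thread_name_prefix="bg")


def write_report(name: str) -> str:
    """Hypothetical blocking task body (file write, sync HTTP call, ...)."""
    time.sleep(0.2)
    return f"{name}: done"


async def write_report_task(name: str) -> str:
    # Async wrapper: the blocking part runs in background_pool, so the
    # event loop stays free while this coroutine is suspended.
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(background_pool, write_report, name)


async def main() -> tuple[list[str], int]:
    ticks = 0

    async def heartbeat() -> None:
        # Stands in for other requests that need the loop to stay responsive.
        nonlocal ticks
        while True:
            await asyncio.sleep(0.02)
            ticks += 1

    beat = asyncio.create_task(heartbeat())
    results = await asyncio.gather(*(write_report_task(f"r{i}") for i in range(4)))
    beat.cancel()
    await asyncio.gather(beat, return_exceptions=True)
    return results, ticks


results, ticks = asyncio.run(main())
background_pool.shutdown()
print(results, f"heartbeat ticks during background work: {ticks}")
```

A wrapper like write_report_task could be registered with background_tasks.add_task. Because it is async def, it is awaited on the event loop, but the only thing it does there is wait on the executor.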
Wrap blocking calls inside async tasks with asyncio.get_running_loop().run_in_executor(), or migrate to async libraries (httpx, aiofiles). Your event loop will thank you.
Registration Endpoint Frozen: async def with Synchronous SMTP Killed the Event Loop
The registration endpoint was declared async def and called smtplib.SMTP().sendmail(), a synchronous blocking call that holds the executing thread for the duration of the SMTP handshake and data transfer. In asyncio, there is only one thread running the event loop. A blocking call on that thread does not just slow down the current request; it prevents every other coroutine from advancing for the entire duration of the block. With a 3-second SMTP handshake, every registration request held the event loop hostage for 3 seconds. Under modest load, this stacked: 10 concurrent registrations meant the event loop was blocked for 30 seconds of every 30-second window. Health checks, order lookups, search requests: everything queued behind SMTP.
… aiosmtplib for a truly async SMTP implementation that works correctly inside async def. Added a ruff lint rule (ASYNC210) to the CI pipeline to flag synchronous I/O calls inside async functions at the pull request stage, before they reach production. Added event loop lag monitoring via a middleware that records the delta between when a request callback is scheduled and when it actually executes; a lag above 100ms now triggers an alert.
- Never call synchronous I/O inside async def. The event loop is single-threaded and cooperative: a blocking call does not yield, it occupies. Every other request on the server waits.
- If you must use a synchronous library inside an async endpoint, wrap it: result = await asyncio.to_thread(sync_function, arg1, arg2). This delegates execution to the thread pool without blocking the event loop.
- BackgroundTasks with a regular def function is the correct pattern for fire-and-forget sync work like email sending. The task runs in the thread pool after the response is sent, and the event loop is never involved.
- Monitor event loop lag as a first-class production metric. CPU and memory metrics will look normal during an event loop blockage: the server appears idle while being completely unresponsive. Lag monitoring is the only reliable way to catch blocking calls before users notice.
- Add async-aware linting to CI. Ruff's ASYNC rule set detects common blocking patterns inside async functions, such as time.sleep, requests.get, and synchronous file I/O. Catching these at review time costs nothing; catching them in production costs hours.
Search async def functions for time.sleep(), requests.get(), synchronous file operations, or any database driver that is not async-native. The blocking call is almost always in an async def function that looks correct at a glance. Once found, either convert the endpoint to regular def or wrap the blocking call with await asyncio.to_thread().
Run multiple worker processes with uvicorn main:app --workers 4. If you are already running multiple workers, check for thread pool exhaustion on sync endpoints. The default thread pool for sync endpoints is capped at 40 threads (AnyIO's default limit). If all threads are occupied by long-running synchronous operations, subsequent sync requests queue. Monitor active thread count and consider converting high-traffic sync endpoints to async with native async libraries.
Surface slow callbacks with asyncio debug mode:
PYTHONASYNCIODEBUG=1 uvicorn main:app --log-level debug 2>&1 | grep -i 'Executing.*took\|slow callback\|blocked'
List candidate blocking calls in files that define async functions:
grep -rl 'async def' app/ | xargs grep -n 'time.sleep\|requests.get\|smtplib\|open(' 2>/dev/null
Fix: wrap the call with await asyncio.to_thread(sync_function, args) or convert the endpoint to regular def so FastAPI offloads it to the thread pool automatically.
Key takeaways
- async def must use only non-blocking calls at every I/O boundary. time.sleep(), requests.get(), synchronous database drivers, and synchronous file I/O inside async def are production incidents in waiting.
- def endpoints are safe for synchronous and CPU-bound code because FastAPI automatically submits them to the AnyIO thread pool. That pool is capped at 40 threads by default; min(32, os.cpu_count() + 4) is the default for the executor behind asyncio.to_thread.
Common mistakes to avoid
Using async def with synchronous I/O calls: the #1 FastAPI production incident
Option 1: Use a native async library instead of the synchronous one. Option 2: Wrap the blocking call with result = await asyncio.to_thread(sync_function, arg1, arg2) to delegate it to the thread pool. Option 3: Convert the endpoint from async def to regular def if it has no legitimate async operations. Enable ruff ASYNC lint rules in CI to catch this class of mistake before it reaches production.
Expecting BackgroundTasks to survive server restarts, rolling deploys, or OOM kills
Not handling exceptions inside background functions
Running CPU-bound work inside async def without offloading to a separate executor
For lighter CPU work: use await asyncio.to_thread(cpu_function, args) to run in the thread pool. For heavy CPU work: use a ProcessPoolExecutor explicitly (await asyncio.get_running_loop().run_in_executor(process_pool, cpu_function, args)) to escape the GIL and use multiple cores. For production-scale CPU-intensive work: use Celery workers running in dedicated processes, scaled independently from the API fleet.
Using time.sleep() instead of asyncio.sleep() inside async endpoints
Replace time.sleep(n) inside async def with await asyncio.sleep(n). The asyncio version yields control to the event loop for the duration, so other requests are processed normally during the wait. Add a pre-commit hook or ruff rule (ASYNC101) to flag time.sleep calls in files that contain async def functions. This is a one-line mistake with severe production consequences and a trivially detectable pattern. Catch it in tooling, not in postmortems.
Interview Questions on This Topic
Explain how FastAPI's internal thread pool handles synchronous def functions versus async def functions — and what happens when a developer gets the choice wrong.
If an async def function calls time.sleep(), requests.get(), or a synchronous database driver, the cooperative model breaks. The blocking call does not yield. The event loop thread is occupied for the duration of the call. No other coroutine can run: no other request can be processed, no health checks can respond, no background tasks can execute. The server appears idle by CPU and memory metrics while being completely unresponsive to new requests.
The thread pool for def endpoints is capped at 40 threads by default (AnyIO's limit; min(32, os.cpu_count() + 4) is the default for the executor behind asyncio.to_thread). If 40 sync endpoints hold threads simultaneously, request 41 queues. This is thread pool exhaustion: same symptom as event loop blockage, different root cause, different fix.
The diagnostic tells them apart: event loop blockage shows near-zero CPU with all requests slow. Thread pool exhaustion shows moderate CPU with sync endpoint requests queuing while async endpoints respond normally.