FastAPI CLI: Dev, Run, and Deploy Without the Bloat
Master the official FastAPI CLI for development, production serving, and deployment.
20+ years shipping production Python across data and backend systems. Drawn from code that ran under real load.
- ✓Basic Python syntax
- ✓Familiarity with FastAPI basics
- ✓Understanding of HTTP and REST APIs
Use fastapi dev main.py for development with auto-reload, and fastapi run main.py for production. The CLI handles Uvicorn configuration, reload, and environment variables. For deployment, run fastapi run under a process supervisor such as systemd or Supervisor, or in a container with a restart policy.
Think of FastAPI CLI as a remote control for your API. Instead of manually wiring up Uvicorn, setting reload flags, and managing environment variables, you press one button: fastapi dev for testing (it auto-restarts when you change code) and fastapi run for the real show. It's like having a co-pilot who handles the pre-flight checklist so you focus on flying.
You've built a FastAPI app. Now what? If you're still typing uvicorn main:app --reload like it's 2019, you're wasting time and missing production-ready defaults. I've seen teams deploy with --reload still on because they copy-pasted the dev command. That's a memory leak waiting to happen.
The FastAPI CLI exists because the community needed a single, opinionated tool that handles dev, run, and deploy without the guesswork. It's not just a wrapper—it sets sane defaults for workers, reload, and environment detection. No more if DEBUG hacks in your entry point.
By the end of this, you'll be able to scaffold a new project, run it in dev with hot reload, deploy it in production with correct worker counts, and debug the most common failures—all from the CLI. No Dockerfile required for basic deployments.
Why FastAPI CLI Exists: The Problem with Raw Uvicorn
Before FastAPI CLI, every project had a run.py that looked like a Frankenstein of environment checks, reload flags, and port bindings. You'd see uvicorn.run('main:app', host='0.0.0.0', port=int(os.getenv('PORT', 8000)), reload=os.getenv('ENV')=='dev'). (The app has to be passed as an import string, because Uvicorn refuses to enable reload when handed the app object itself.) That's boilerplate that shouldn't exist.
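To make the fragility of that one-liner concrete, here is a minimal stdlib-only sketch of the same logic pulled into a function. The name legacy_run_kwargs is hypothetical, and the dictionary only mirrors the keyword arguments from the call above; it does not start a server.

```python
import os


def legacy_run_kwargs(environ=None):
    """Build the keyword arguments a hand-written run.py would pass to uvicorn.run()."""
    env = os.environ if environ is None else environ
    return {
        "host": "0.0.0.0",
        "port": int(env.get("PORT", "8000")),
        # Reload is on only when ENV is exactly "dev": a typo silently disables it,
        # and a stray ENV=dev in production silently enables it.
        "reload": env.get("ENV") == "dev",
    }


if __name__ == "__main__":
    print(legacy_run_kwargs({"PORT": "9000", "ENV": "dev"}))
```

Every project that copies this carries its own variant of the same three decisions, which is the duplication the CLI's dev/run split removes.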
The CLI eliminates this. It knows that fastapi dev means reload on, debug on, single worker. fastapi run means reload off, a worker count you set with --workers, and production logging. No more accidental --reload in prod. No more forgetting to set --host 0.0.0.0 and wondering why Docker port mapping fails.
But here's the catch: the CLI is opinionated. It assumes your app is in a file called main.py with a variable app. If you named your file server.py or your variable api, you need to specify: fastapi dev server.py or fastapi dev main.py --app api. The CLI will try to auto-detect, but don't rely on it.
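If you want to know what a file actually exposes before pointing the CLI at it, a rough pre-flight check is easy to write. The sketch below is not the CLI's own discovery logic; module_level_names is a hypothetical helper that parses the source with ast (so nothing is imported or executed) and lists the names assigned at module level.

```python
import ast


def module_level_names(source):
    """Return the names assigned at the top level of a Python source string."""
    names = set()
    for node in ast.parse(source).body:
        if isinstance(node, ast.Assign):
            for target in node.targets:
                if isinstance(target, ast.Name):
                    names.add(target.id)
        elif isinstance(node, ast.AnnAssign) and isinstance(node.target, ast.Name):
            names.add(node.target.id)
    return names


# A file whose app variable is called "api" rather than "app".
SAMPLE = "from fastapi import FastAPI\n\napi = FastAPI()\n"

if __name__ == "__main__":
    names = module_level_names(SAMPLE)
    print("has app:", "app" in names, "| has api:", "api" in names)
```

If app is missing from the result, name the variable explicitly on the command line instead of hoping detection picks the right one.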
Never run fastapi dev in production. The reload flag keeps file watchers alive, consuming memory. Also, dev mode sets log level to debug, which can flood your log aggregator.
Development Workflow: Hot Reload That Actually Works
The dev command is where FastAPI CLI shines. Run fastapi dev main.py and you get:
- Auto-reload on any .py file change (not just the entry point)
- Debug mode enabled (full tracebacks in responses)
- Single worker (no concurrency surprises)
- Host bound to 127.0.0.1 by default (safe for local dev)
But there's a nuance: the reload watches the entire directory tree. If you have a node_modules or a giant data/ folder, the watcher will thrash. I've seen this bring down a dev server on a monorepo because the file watcher exhausted inotify watches. Fix: exclude directories with --reload-dir or better, keep your project structure clean.
Another gotcha: if you're using Docker for local dev, fastapi dev inside a container won't reload because file changes on the host don't propagate to the container's filesystem unless you mount volumes correctly. Use docker compose with a volume mount and set --reload-dir /app.
Use fastapi dev --reload-dir src if your code lives in a subdirectory. This reduces watcher overhead and speeds up reloads significantly.
Production Serving: Workers, Graceful Shutdown, and Logging
When you run fastapi run main.py, you get:
- Worker processes (default 1; set the count with --workers)
- No reload
- Log level set to info (not debug)
- Host bound to 0.0.0.0 (accessible from outside)
- Graceful shutdown on SIGTERM (workers finish current requests before dying)
The worker count is critical. The classic rookie mistake: setting --workers to the number of CPU cores. That's fine for CPU-bound tasks, but FastAPI apps are I/O-bound (waiting on databases, APIs). You want more workers to overlap I/O. Rule of thumb: 2 * CPU cores + 1. But memory is the real constraint. Each worker is a separate process with its own memory. On a 1GB container, 4 workers might OOM under load.
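The rule of thumb and the memory ceiling can be combined into one calculation. The sketch below is an illustration only: suggest_workers, per_worker_mb, and the 128 MB reserve are assumptions, and the per-worker figure has to come from measuring your own app under load.

```python
def suggest_workers(cpu_cores, memory_mb, per_worker_mb, reserve_mb=128):
    """Pick a worker count bounded by both the CPU rule of thumb and available memory."""
    cpu_bound = 2 * cpu_cores + 1
    # Leave some headroom for the master process and the OS before dividing.
    memory_bound = (memory_mb - reserve_mb) // per_worker_mb
    return max(1, min(cpu_bound, memory_bound))


if __name__ == "__main__":
    # 4 cores but only 1 GB of RAM: memory, not CPU, decides.
    print(suggest_workers(cpu_cores=4, memory_mb=1024, per_worker_mb=300))
    # 2 cores with plenty of RAM: the CPU rule of thumb decides.
    print(suggest_workers(cpu_cores=2, memory_mb=8192, per_worker_mb=200))
```

With 4 cores and a 1 GB container at roughly 300 MB per worker, the CPU rule alone would suggest 9 workers, while memory allows only 2.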
Graceful shutdown is another hidden gem. When you send SIGTERM to the master process, it stops accepting new connections and waits for existing requests to finish (up to a timeout). This prevents dropped requests during deploys. But if your workers are stuck on long-running tasks, they'll be killed after the timeout. Note that --timeout-keep-alive is not that timeout: it only controls how long idle keep-alive connections stay open. In Uvicorn, the shutdown window is set by --timeout-graceful-shutdown.
Never use fastapi run with --reload in production. I've seen it in the wild. The reloader creates a subprocess for each worker, and when a file changes, it kills and restarts all workers simultaneously. That's a full service outage for every deploy.
Environment Management: .env Files and Configuration
FastAPI CLI automatically loads .env files if they exist. This is huge. No more python-dotenv boilerplate. The CLI reads .env from the current directory and exports variables before starting the app. This means os.getenv('DATABASE_URL') just works.
But there's a subtlety: the CLI loads .env only if you don't set the environment variable yourself. So if you have DATABASE_URL in both .env and the shell environment, the shell wins. This is correct behavior—it allows Docker or CI to override local settings.
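The precedence rule described above (file values fill gaps, shell values win) is simple to express in code. The sketch below illustrates the rule only and is not the CLI's loader; parse_dotenv and merge_env are hypothetical helpers, and the parser handles just plain KEY=VALUE lines.

```python
def parse_dotenv(text):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return values


def merge_env(dotenv_text, shell_env):
    """Return the effective environment: .env fills gaps, the shell always wins."""
    merged = dict(shell_env)
    for key, value in parse_dotenv(dotenv_text).items():
        merged.setdefault(key, value)  # keeps any value the shell already set
    return merged


if __name__ == "__main__":
    dotenv = "# local defaults\nDATABASE_URL=sqlite:///dev.db\nDEBUG=1\n"
    shell = {"DATABASE_URL": "postgresql://ci-host/db"}
    print(merge_env(dotenv, shell))
```

Here DATABASE_URL keeps the value from the shell and DEBUG is filled in from the file, which is the behavior that lets Docker or CI override local settings.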
Another trap: the CLI does NOT load .env for fastapi dev if you're using a different working directory. Always run the command from the project root. Or use --env-file to specify a path: fastapi dev main.py --env-file config/.env.prod.
If you rename .env to .env.local, the CLI won't load it. The default file must be named .env. Use --env-file for any other name.
Scaffolding: fastapi new and Project Structure
The fastapi new command generates a project skeleton. It's not just a hello world—it creates a proper structure with main.py, routers/, models/, schemas/, and a requirements.txt. This is gold for teams that want consistency.
But here's the honest truth: I rarely use it. The generated structure is fine for a microservice, but if you're building a monolith or a library, it's overkill. Also, it pins FastAPI to the latest version, which can cause dependency conflicts if you're in a monorepo. Use it for greenfield projects, but don't feel bad about rolling your own.
One thing it does well: it creates a Dockerfile that uses fastapi run as the entry point. That's a solid default. But it uses python:3.11-slim which might be too heavy for some deployments. Swap to python:3.11-alpine if you're size-conscious.
Run pip freeze > requirements.txt to lock versions. The generated requirements.txt uses unpinned versions like fastapi>=0.100.0, which will break your build when a major version drops.
Deployment: Docker, Kubernetes, and Process Managers
The CLI is not a deployment tool—it's a server runner. For production, you still need a process manager to keep it alive. The simplest: use fastapi run inside a Docker container with CMD ["fastapi", "run", "main.py", "--workers", "4"]. Then use Docker's restart policy or Kubernetes liveness probes.
But if you're not using containers, you need something like Supervisor or systemd. The CLI doesn't daemonize—it runs in the foreground. That's correct for containers, but for bare metal, you'll want a service file. Here's a systemd unit that works:
```
[Unit]
Description=FastAPI app
After=network.target

[Service]
User=www-data
WorkingDirectory=/opt/myapp
ExecStart=/usr/local/bin/fastapi run main.py --workers 4
Restart=always

[Install]
WantedBy=multi-user.target
```
One more thing: never run fastapi run as root. Create a dedicated user. The CLI doesn't enforce this, but it's a security risk.
On Kubernetes, don't set --workers to a high number. Each pod should handle a modest load. Scale horizontally instead. Setting --workers 8 in a 256MB pod will OOM immediately.
Debugging Common CLI Failures
Let's talk about the errors that will hit you at 2am.
Error: ModuleNotFoundError: No module named 'app'
You ran fastapi dev from the wrong directory. The CLI looks for main.py in the current directory. If your app is in src/main.py, run fastapi dev src/main.py.
Error: Address already in use
Another process is on that port. Use lsof -i :8000 to find it, then kill -9 <PID>. Or change the port with --port.
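If lsof is not available (slim containers often lack it), the same question can be answered from Python by trying to bind the port. A minimal sketch, with port_is_free as a hypothetical helper name:

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if nothing is currently bound to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            # Typically "Address already in use": another process holds the port.
            return False
        return True


if __name__ == "__main__":
    print("port 8000 free:", port_is_free(8000))
```

This tells you whether the port is taken, not who took it. You still need lsof or ss on the host to find the owning process.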
Error: Workers dying with Segmentation fault
This is almost always a C extension incompatibility. Check if you're using orjson or uvloop with an incompatible Python version. Downgrade or upgrade the library.
Error: [Errno 24] Too many open files
Your app is opening too many file descriptors (database connections, file handles). Increase the system limit with ulimit -n 65536 before starting the CLI. Or fix the leak in your code.
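Before raising anything, it helps to see what limit the process actually runs with, since a container or systemd unit may not inherit your shell's ulimit. The sketch below uses the Unix-only resource module; nofile_limits and raise_soft_nofile are hypothetical helper names.

```python
import resource


def nofile_limits():
    """Return the (soft, hard) open-file limits for the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def raise_soft_nofile(target):
    """Raise the soft open-file limit toward target, capped at the hard limit.

    Returns the soft limit in effect afterwards.
    """
    soft, hard = nofile_limits()
    # An unprivileged process may raise its soft limit only up to the hard limit.
    ceiling = target if hard == resource.RLIM_INFINITY else min(target, hard)
    if soft != resource.RLIM_INFINITY and soft < ceiling:
        resource.setrlimit(resource.RLIMIT_NOFILE, (ceiling, hard))
    return nofile_limits()[0]


if __name__ == "__main__":
    print("soft/hard limits:", nofile_limits())
```

Raising the limit only buys time. If the descriptor count keeps climbing under steady traffic, the leak is in the application.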
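On SIGTERM, the master process stops accepting new connections and waits for in-flight requests to finish (up to the graceful-shutdown timeout), then kills remaining workers. The master process exits after all workers are done.
When Not to Use FastAPI CLI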
The CLI is great for standard FastAPI apps. But there are cases where it's the wrong tool:
- You need custom Uvicorn config: The CLI exposes only a subset of Uvicorn options. If you need --limit-concurrency, --backlog, or --ssl-keyfile, you're better off running Uvicorn directly.
- You're using Gunicorn as a process manager: The CLI doesn't integrate with Gunicorn. Use gunicorn -k uvicorn.workers.UvicornWorker main:app instead.
- You have multiple apps in one process: The CLI runs one app per process. If you need to serve multiple ASGI apps (e.g., FastAPI + Admin + Static files), mount them on a single parent app with app.mount() and point the server at that parent.
- You're on Windows in production: The CLI uses uvloop on Unix for performance. On Windows, it falls back to asyncio's event loop, which is slower. Consider using Hypercorn instead.
In those cases, drop down to raw Uvicorn or Gunicorn. The CLI is a convenience layer, not a silver bullet.
The Dev Server That Was Accidentally Running in Production for Three Weeks
The team believed the container was running fastapi run main.py in production mode. They assumed this was correct because the deployment passed health checks on startup.
In fact, the Dockerfile's command was fastapi dev main.py --host 0.0.0.0 --port 80. The developer used fastapi dev, which enables auto-reload, debug mode, and single-worker mode, and committed it. The CI/CD pipeline copied the Dockerfile verbatim. fastapi dev runs a single reload-capable worker that was never intended for production. It lacks the process management and isolation that fastapi run or gunicorn provides. The auto-reload file watcher leaked file descriptors over time, eventually hitting the container's limit.
The fix was to change the command to fastapi run main.py (production mode). The team added a CI check that greps the Dockerfile for fastapi dev and fails the build, and set up a pre-commit hook to catch fastapi dev in any Dockerfile or compose file.
- Never use fastapi dev in Dockerfiles or CI/CD scripts; it's designed for local iteration only.
- Add automated checks: grep -r 'fastapi dev' Dockerfile docker-compose*.yml in the CI pipeline.
- fastapi run is the production command: it enables proper worker management, no reload overhead, and production-grade defaults.
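One caveat with a plain grep for 'fastapi dev': an exec-form Dockerfile command such as CMD ["fastapi", "dev", "main.py"] does not contain that literal string, so the check passes silently. The sketch below is one way to catch both forms; find_dev_commands and the sample Dockerfile are hypothetical, and the pattern is an assumption about how the command is usually written.

```python
import re

# Matches shell form (fastapi dev ...) and exec form ("fastapi", "dev", ...).
DEV_COMMAND = re.compile(r"""fastapi["',\s]+dev\b""")


def find_dev_commands(text):
    """Return (line_number, line) pairs that invoke the dev server."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        if line.lstrip().startswith("#"):
            continue  # comments mentioning the command are fine
        if DEV_COMMAND.search(line):
            hits.append((number, line.strip()))
    return hits


SAMPLE_DOCKERFILE = '''FROM python:3.11-slim
# fastapi dev is for laptops only
CMD ["fastapi", "dev", "main.py", "--host", "0.0.0.0"]
'''

if __name__ == "__main__":
    for number, line in find_dev_commands(SAMPLE_DOCKERFILE):
        print(f"line {number}: {line}")
```

In a pipeline, read each Dockerfile and compose file, call find_dev_commands on its contents, and exit non-zero when the list is not empty.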
Killed (OOM): 1. Check dmesg | tail -20 for OOM killer messages. 2. Reduce --workers count. 3. Increase container memory. 4. Add memory limits to Docker/K8s.
App unreachable from outside the container: 1. Check that --host is 0.0.0.0. 2. Verify Docker port mapping -p 8000:8000. 3. Check firewall rules.
Address already in use: 1. lsof -i :8000 2. kill -9 <PID> 3. Or switch ports with --port 8001.

| File | Command / Code | Purpose |
|---|---|---|
| main.py | from fastapi import FastAPI | Why FastAPI CLI Exists |
| dev.sh | fastapi dev main.py | Development Workflow |
| run_prod.sh | fastapi run main.py --workers 4 | Production Serving |
| env_example.sh | DATABASE_URL=postgresql://user:pass@localhost/db | Environment Management |
| scaffold.sh | fastapi new my_project | Scaffolding |
| Dockerfile | FROM python:3.11-slim | Deployment |
| debug.sh | lsof -i :8000 | Debugging Common CLI Failures |
| custom_uvicorn.sh | uvicorn main:app --host 0.0.0.0 --port 8000 --workers 4 --limit-concurrency 100 ... | When Not to Use FastAPI CLI |
Key takeaways
- Use fastapi dev for local development with hot reload; never in production.
- Use fastapi run for production; set workers based on memory, not CPU cores.
- The CLI loads .env automatically; use --env-file for custom paths.
Interview Questions on This Topic
How does FastAPI CLI handle graceful shutdown under `fastapi run`?
On SIGTERM, workers stop accepting new connections and finish in-flight requests (up to the graceful-shutdown timeout), then exit. The master waits for all workers before exiting. This prevents dropped requests during deploys.