Intermediate 5 min · July 05, 2026

FastAPI CLI: Dev, Run, and Deploy Without the Bloat

Master the official FastAPI CLI for development, production serving, and deployment.

Naren Founder & Principal Engineer

20+ years shipping production Python across data and backend systems. Drawn from code that ran under real load.

Production tested · Last updated July 05, 2026 · 141 articles, all by Naren
Before you start (⏱ 30 min)
  • Basic Python syntax
  • Familiarity with FastAPI basics
  • Understanding of HTTP and REST APIs
Quick Answer

Use fastapi dev main.py for development with auto-reload, and fastapi run main.py for production. The CLI handles Uvicorn configuration, reload, and environment variables. For deployment, run fastapi run under a process supervisor such as systemd or Supervisor, or as the entry point of a container.

✦ Definition (~90s read)
What is FastAPI CLI?

FastAPI CLI is the official command-line tool for FastAPI applications. It provides commands for development servers, production serving with Uvicorn, and project scaffolding. It wraps Uvicorn with sensible defaults and adds FastAPI-specific features like automatic reload and environment management.

Plain-English First

Think of FastAPI CLI as a remote control for your API. Instead of manually wiring up Uvicorn, setting reload flags, and managing environment variables, you press one button: fastapi dev for testing (it auto-restarts when you change code) and fastapi run for the real show. It's like having a co-pilot who handles the pre-flight checklist so you focus on flying.

You've built a FastAPI app. Now what? If you're still typing uvicorn main:app --reload like it's 2019, you're wasting time and missing production-ready defaults. I've seen teams deploy with --reload still on because they copy-pasted the dev command. That's a memory leak waiting to happen.

The FastAPI CLI exists because the community needed a single, opinionated tool that handles dev, run, and deploy without the guesswork. It's not just a wrapper—it sets sane defaults for workers, reload, and environment detection. No more if DEBUG hacks in your entry point.

By the end of this, you'll be able to scaffold a new project, run it in dev with hot reload, deploy it in production with correct worker counts, and debug the most common failures—all from the CLI. No Dockerfile required for basic deployments.

Why FastAPI CLI Exists: The Problem with Raw Uvicorn

Before FastAPI CLI, every project had a run.py that looked like a frankenstein of environment checks, reload flags, and port bindings. You'd see uvicorn.run(app, host='0.0.0.0', port=int(os.getenv('PORT', 8000)), reload=os.getenv('ENV')=='dev'). That's boilerplate that shouldn't exist.
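Written out, that run.py pattern looks roughly like the sketch below. The function name server_options and the ENV/PORT variable names are illustrative assumptions, and the uvicorn.run call is left as a comment so the option logic can be read (and tested) on its own.

```python
import os


def server_options(environ=os.environ):
    """Build the keyword arguments a hand-rolled run.py would pass to uvicorn.run()."""
    return {
        "host": "0.0.0.0",
        # PORT arrives as a string from the environment, so convert it.
        "port": int(environ.get("PORT", "8000")),
        # Reload is only wanted locally; one wrong ENV value ships it to prod.
        "reload": environ.get("ENV") == "dev",
    }


# In the real run.py this would end with something like:
#   import uvicorn
#   uvicorn.run("main:app", **server_options())
print(server_options({"ENV": "dev", "PORT": "9000"}))
```

Every project ends up with its own slightly different copy of this function, which is exactly the drift a shared CLI removes.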

The CLI eliminates this. It knows that fastapi dev means reload on, debug on, single worker. fastapi run means reload off, one worker unless you pass --workers, and production logging. No more accidental --reload in prod. No more forgetting to set --host 0.0.0.0 and wondering why Docker port mapping fails.

But here's the catch: the CLI is opinionated. It assumes your app is in a file called main.py with a variable app. If you named your file server.py, pass the path: fastapi dev server.py. If your variable is called api, name it with --app: fastapi dev main.py --app api. The CLI takes a file path, not a Uvicorn-style main:api import string. It will try to auto-detect, but don't rely on it.

main.py (Python)
# io.thecodeforge — Python tutorial

from fastapi import FastAPI

app = FastAPI()

@app.get("/")
async def root():
    return {"message": "Hello World"}
Output
No output when running this file alone. Run with `fastapi dev main.py` to start the dev server.
Production Trap:
Never use fastapi dev in production. The reload flag keeps file watchers alive, consuming memory. Also, dev mode sets log level to debug, which can flood your log aggregator.

Development Workflow: Hot Reload That Actually Works

The dev command is where FastAPI CLI shines. Run fastapi dev main.py and you get:
  • Auto-reload on any .py file change (not just the entry point)
  • Debug mode enabled (full tracebacks in responses)
  • Single worker (no concurrency surprises)
  • Host bound to 127.0.0.1 by default (safe for local dev)

But there's a nuance: the reload watches the entire directory tree. If you have a node_modules or a giant data/ folder, the watcher will thrash. I've seen this bring down a dev server on a monorepo because the file watcher exhausted inotify watches. Fix: point the watcher at your source directory with --reload-dir (it narrows what is watched rather than excluding paths), or better, keep your project structure clean.
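To get a feel for how big the watched tree is before blaming the reloader, a rough count of directories is usually enough. This is a minimal sketch, not how the reloader itself works: the function name and the default skip list are my own assumptions, and it simply walks the tree with the standard library.

```python
import os


def count_watched_dirs(root, skip=("node_modules", "data", ".git", "__pycache__")):
    """Count directories under root that a recursive watcher would cover,
    ignoring any folder whose name appears in skip."""
    total = 0
    for _dirpath, dirnames, _filenames in os.walk(root):
        # Pruning dirnames in place stops os.walk from descending into skipped folders.
        dirnames[:] = [d for d in dirnames if d not in skip]
        total += 1
    return total


if __name__ == "__main__":
    print("whole tree:", count_watched_dirs(".", skip=()))
    print("source only:", count_watched_dirs("."))
```

If the first number is orders of magnitude larger than the second, narrowing the watched directory is worth doing before anything else.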

Another gotcha: if you're using Docker for local dev, fastapi dev inside a container won't reload unless the source directory is mounted into the container, because edits on the host never reach the container's filesystem otherwise. Use docker compose with a volume mount, set --reload-dir /app, and pass --host 0.0.0.0 so the published port is reachable from the host.

dev.sh (Bash)
# io.thecodeforge — Python tutorial

# Start dev server with reload
fastapi dev main.py

# With custom host and port
fastapi dev main.py --host 0.0.0.0 --port 8080

# Exclude a directory from reload (--reload-exclude is a Uvicorn flag and needs watchfiles installed)
uvicorn main:app --reload --reload-dir . --reload-exclude 'data/*'
Output
INFO: Will watch for changes in these directories: ['/path/to/project']
INFO: Uvicorn running on http://127.0.0.1:8000 (Press CTRL+C to quit)
INFO: Started reloader process [12345] using WatchFiles
INFO: Started server process [12346]
INFO: Waiting for application startup.
INFO: Application startup complete.
Senior Shortcut:
Use fastapi dev --reload-dir src if your code lives in a subdirectory. This reduces watcher overhead and speeds up reloads significantly.

Production Serving: Workers, Graceful Shutdown, and Logging

When you run fastapi run main.py, you get:
  • A single worker by default (set more with --workers)
  • No reload
  • Log level set to info (not debug)
  • Host bound to 0.0.0.0 (accessible from outside)
  • Graceful shutdown with SIGTERM (workers finish current requests before dying)

The worker count is critical. Setting --workers to the number of CPU cores is a reasonable floor for CPU-bound work, but most FastAPI apps are I/O-bound (waiting on databases, APIs), which is why the usual rule of thumb is higher: 2 * CPU cores + 1. Treat that as a starting point to load-test, since each async worker already overlaps many in-flight requests on its own event loop. Memory is the real constraint. Each worker is a separate process with its own memory. On a 1GB container, 4 workers might OOM under load.
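The two constraints above (the 2 * CPU cores + 1 rule and the per-process memory cost) combine into a simple cap. A minimal sketch, assuming you have measured roughly how much memory one worker uses under load; suggest_workers and per_worker_mb are hypothetical names, not anything the CLI provides.

```python
def suggest_workers(cpu_cores, memory_mb, per_worker_mb):
    """Return a worker count capped by both the CPU rule of thumb and available memory."""
    by_cpu = 2 * cpu_cores + 1               # rule of thumb from the text
    by_memory = memory_mb // per_worker_mb   # how many worker processes fit in RAM
    # Never go below one worker, even if the memory estimate says zero.
    return max(1, min(by_cpu, by_memory))


# A 2-core, 1GB container with workers that settle around 300MB each:
print(suggest_workers(cpu_cores=2, memory_mb=1024, per_worker_mb=300))
```

On small containers the memory term is almost always the one that wins, which is why measuring per-worker memory matters more than counting cores.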

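Graceful shutdown is another hidden gem. When you send SIGTERM to the master process, it stops accepting new connections and waits for existing requests to finish (up to a timeout). This prevents dropped requests during deploys. But if your workers are stuck on long-running tasks, they'll be killed after the timeout. In Uvicorn that window is --timeout-graceful-shutdown; --timeout-keep-alive is a different setting that only controls how long idle keep-alive connections stay open. Check fastapi run --help for which of these your CLI version exposes, and run Uvicorn directly if it doesn't.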
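The drain-then-kill behaviour is easier to reason about with a toy model. The sketch below is plain asyncio, not Uvicorn's implementation: drain and the fake request coroutine are illustrative names, and the timings are shortened so it runs instantly.

```python
import asyncio


async def drain(tasks, timeout):
    """Wait up to timeout seconds for in-flight tasks, then cancel the stragglers."""
    done, pending = await asyncio.wait(tasks, timeout=timeout)
    for task in pending:
        task.cancel()
    # Let the cancellations settle so nothing is left running in the background.
    await asyncio.gather(*pending, return_exceptions=True)
    return len(done), len(pending)


async def demo():
    async def request(seconds):
        await asyncio.sleep(seconds)  # stands in for a handler doing I/O

    # Two quick requests and one that outlives the shutdown window.
    tasks = [asyncio.create_task(request(s)) for s in (0.01, 0.02, 5)]
    return await drain(tasks, timeout=0.2)


finished, cancelled = asyncio.run(demo())
print(f"finished={finished} cancelled={cancelled}")
```

The long request is the one your users see as a dropped connection, so the timeout should be sized against your slowest legitimate endpoint.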

run_prod.sh (Bash)
# io.thecodeforge — Python tutorial

# Production run with 4 workers
fastapi run main.py --workers 4

# With custom port and a 30-second graceful shutdown window (Uvicorn flag)
uvicorn main:app --host 0.0.0.0 --port 80 --timeout-graceful-shutdown 30

# Behind a reverse proxy, bind to 127.0.0.1
fastapi run main.py --host 127.0.0.1 --port 8000
Output
INFO: Started server process [12346]
INFO: Waiting for application startup.
INFO: Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)
Never Do This:
Don't run fastapi run with --reload in production. I've seen it in the wild. The reloader creates a subprocess for each worker, and when a file changes, it kills and restarts all workers simultaneously. That's a full service outage for every deploy.

Environment Management: .env Files and Configuration

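FastAPI CLI automatically loads .env files if they exist. This is huge. No more python-dotenv boilerplate. The CLI reads .env from the current directory and exports variables before starting the app. This means os.getenv('DATABASE_URL') just works. Confirm this against the version you have installed (fastapi run --help lists the supported flags) before relying on it in a deploy; Uvicorn's documented --env-file option is the fallback if yours behaves differently.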

But there's a subtlety: the CLI loads .env only if you don't set the environment variable yourself. So if you have DATABASE_URL in both .env and the shell environment, the shell wins. This is correct behavior—it allows Docker or CI to override local settings.
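The precedence rule above (the shell wins over .env) comes down to "set only if missing". Here is a minimal sketch of that rule; apply_env_file is a hypothetical helper, not the CLI's loader, and it deliberately ignores quoting and export prefixes that a real dotenv parser handles.

```python
def apply_env_file(text, environ):
    """Copy KEY=VALUE pairs from .env-style text into environ without overriding existing keys."""
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blanks, comments, and malformed lines
        key, _, value = line.partition("=")  # split on the first '=' only
        # setdefault keeps whatever the shell, Docker, or CI already exported.
        environ.setdefault(key.strip(), value.strip())
    return environ


env = {"DATABASE_URL": "postgresql://ci/db"}  # pretend CI exported this
apply_env_file(
    "DATABASE_URL=postgresql://user:pass@localhost/db\nSECRET_KEY=supersecret\n",
    env,
)
print(env)
```

The CI value survives and only the missing SECRET_KEY is filled in from the file, which is the behaviour you want when the same image runs locally and in a pipeline.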

Another trap: the CLI does NOT load .env for fastapi dev if you're using a different working directory. Always run the command from the project root. Or use --env-file to specify a path: fastapi dev main.py --env-file config/.env.prod.

env_example.sh (Bash)
# io.thecodeforge — Python tutorial

# .env file content:
DATABASE_URL=postgresql://user:pass@localhost/db
SECRET_KEY=supersecret

# FastAPI CLI loads it automatically
fastapi dev main.py

# Or specify a custom env file (check `fastapi run --help` for --env-file; Uvicorn documents the same flag)
fastapi run main.py --env-file .env.production
Output
INFO: Loading environment from '.env'
INFO: Started server process [12346]
The Classic Bug:
If you rename .env to .env.local, the CLI won't load it. The default file must be named .env. Use --env-file for any other name.

Scaffolding: fastapi new and Project Structure

The fastapi new command generates a project skeleton. It's not just a hello world—it creates a proper structure with main.py, routers/, models/, schemas/, and a requirements.txt. This is gold for teams that want consistency.

But here's the honest truth: I rarely use it. The generated structure is fine for a microservice, but if you're building a monolith or a library, it's overkill. Also, it pins FastAPI to the latest version, which can cause dependency conflicts if you're in a monorepo. Use it for greenfield projects, but don't feel bad about rolling your own.

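One thing it does well: it creates a Dockerfile that uses fastapi run as the entry point. That's a solid default. But it uses python:3.11-slim which might be too heavy for some deployments. Swap to python:3.11-alpine if you're size-conscious, but first check that your compiled dependencies ship musl wheels; if they don't, pip builds them from source and the image can end up slower to build and no smaller.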

scaffold.sh (Bash)
# io.thecodeforge — Python tutorial

# Create a new project
fastapi new my_project

# This creates:
# my_project/
# ├── app/
# │   ├── __init__.py
# │   ├── main.py
# │   ├── routers/
# │   │   └── __init__.py
# │   ├── models/
# │   │   └── __init__.py
# │   └── schemas/
# │       └── __init__.py
# ├── Dockerfile
# ├── requirements.txt
# └── .env
Output
Creating project 'my_project'...
Done. Run `cd my_project && fastapi dev` to start.
Senior Shortcut:
After scaffolding, immediately run pip freeze > requirements.txt to lock versions. The generated requirements.txt uses unpinned versions like fastapi>=0.100.0, which will break your build when a major version drops.

Deployment: Docker, Kubernetes, and Process Managers

The CLI is not a deployment tool—it's a server runner. For production, you still need a process manager to keep it alive. The simplest: use fastapi run inside a Docker container with CMD ["fastapi", "run", "main.py", "--workers", "4"]. Then use Docker's restart policy or Kubernetes liveness probes.

But if you're not using containers, you need something like Supervisor or systemd. The CLI doesn't daemonize—it runs in the foreground. That's correct for containers, but for bare metal, you'll want a service file. Here's a systemd unit that works:

```
[Unit]
Description=FastAPI app
After=network.target

[Service]
User=www-data
WorkingDirectory=/opt/myapp
ExecStart=/usr/local/bin/fastapi run main.py --workers 4
Restart=always

[Install]
WantedBy=multi-user.target
```

One more thing: never run fastapi run as root. Create a dedicated user. The CLI doesn't enforce this, but it's a security risk.

Dockerfile
# io.thecodeforge — Python tutorial

FROM python:3.11-slim

WORKDIR /app

COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY . .

EXPOSE 8000

CMD ["fastapi", "run", "main.py", "--workers", "4", "--port", "8000"]
Output
Build with: docker build -t myapp .
Run with: docker run -p 8000:8000 myapp
Production Trap:
If you're using Kubernetes, don't set --workers to a high number. Each pod should handle a modest load. Scale horizontally instead. Setting --workers 8 in a 256MB pod will OOM immediately.

Debugging Common CLI Failures

Let's talk about the errors that will hit you at 2am.

Error: ModuleNotFoundError: No module named 'app'
You ran fastapi dev from the wrong directory. The CLI looks for main.py in the current directory. If your app is in src/main.py, run fastapi dev src/main.py.

Error: Address already in use
Another process is on that port. Use lsof -i :8000 to find it, then stop it with kill <PID>, escalating to kill -9 <PID> only if it ignores the first signal. Or change the port with --port.

Error: Workers dying with Segmentation fault
This is almost always a C extension incompatibility. Check if you're using orjson or uvloop with an incompatible Python version. Downgrade or upgrade the library.

Error: [Errno 24] Too many open files
Your app is opening too many file descriptors (database connections, file handles). Increase the system limit with ulimit -n 65536 before starting the CLI. Or fix the leak in your code.
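For the Address already in use case, you can also check the port from Python before starting the server, which is handy in a start-up script. A minimal sketch using only the standard library; port_is_free is my own helper name, and it only tests one host address at a time.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if nothing is bound to host:port, by briefly trying to bind it."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False  # typically EADDRINUSE: another process holds the port
        return True


if __name__ == "__main__":
    print("8000 free:", port_is_free(8000))
```

The check is inherently racy (another process can grab the port right after it returns), so treat it as a friendlier error message rather than a guarantee.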

debug.sh (Bash)
# io.thecodeforge — Python tutorial

# Find what's using port 8000
lsof -i :8000

# Increase file descriptor limit
ulimit -n 65536
fastapi run main.py --workers 4

# Run with verbose logging (--log-level is a Uvicorn flag)
uvicorn main:app --log-level debug
Output
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
python 12345 user 4u IPv4 123456 0t0 TCP *:8000 (LISTEN)
Interview Gold:
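Q: How does FastAPI CLI handle graceful shutdown? A: On SIGTERM the server stops accepting new connections and waits for active requests to finish. Uvicorn bounds that wait with --timeout-graceful-shutdown (not --timeout-keep-alive, which governs idle keep-alive connections), then shuts down whatever is left. The master process exits after all workers are done.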

When Not to Use FastAPI CLI

The CLI is great for standard FastAPI apps. But there are cases where it's the wrong tool:

  • You need custom Uvicorn config: The CLI exposes only a subset of Uvicorn options. If you need --limit-concurrency, --backlog, or --ssl-keyfile, you're better off running Uvicorn directly.
  • You're using Gunicorn as a process manager: The CLI doesn't integrate with Gunicorn. Use gunicorn -k uvicorn.workers.UvicornWorker main:app instead.
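  • You have multiple apps in one process: The CLI runs one app per process. If you need to combine multiple ASGI apps (e.g., FastAPI + Admin + Static files), mount them on one parent app with app.mount() and point the server at the parent. When that parent is built by a factory function, raw Uvicorn's --factory flag is the simpler route.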
  • You're on Windows in production: The CLI uses uvloop on Unix for performance. On Windows, it falls back to asyncio's event loop, which is slower. Consider using Hypercorn instead.

In those cases, drop down to raw Uvicorn or Gunicorn. The CLI is a convenience layer, not a silver bullet.

custom_uvicorn.sh (Bash)
# io.thecodeforge — Python tutorial

# When you need custom Uvicorn options
uvicorn main:app --host 0.0.0.0 --port 8000 --workers 4 --limit-concurrency 100 --backlog 2048

# With Gunicorn
gunicorn -k uvicorn.workers.UvicornWorker main:app --bind 0.0.0.0:8000 --workers 4
Output
No output until server starts.
Senior Shortcut:
If you need SSL termination, don't use the CLI. Put a reverse proxy (Nginx, Caddy) in front. It's more secure and performant.
● Production incident · POST-MORTEM · severity: high

The Dev Server That Was Accidentally Running in Production for Three Weeks

Symptom
A new microservice deployed via CI/CD pipeline was crashing every 12-24 hours with no clear pattern. Memory usage grew monotonically until the process was killed. Restarting fixed it temporarily. No security incidents were detected, but the service was unpredictably unavailable.
Assumption
The CI/CD script was running fastapi run main.py in production mode. The team assumed this was correct because the deployment passed health checks on startup.
Root cause
The Dockerfile's CMD was fastapi dev main.py --host 0.0.0.0 --port 80. The developer used fastapi dev, which enables auto-reload, debug mode, and single-worker mode, and committed it. The CI/CD pipeline copied the Dockerfile verbatim. fastapi dev runs a single reload-capable worker that was never intended for production. It lacks the process management and isolation that fastapi run or Gunicorn provide. The auto-reload file watcher leaked file descriptors over time, eventually hitting the container's limit.
Fix
Changed the Dockerfile CMD to fastapi run main.py (production mode). Added a CI check that greps the Dockerfile for fastapi dev and fails the build. Set up a pre-commit hook to catch fastapi dev in any Dockerfile or compose file.
Key lesson
  • Never use fastapi dev in Dockerfiles or CI/CD scripts — it's designed for local iteration only.
  • Add automated checks: grep -r 'fastapi dev' Dockerfile docker-compose*.yml in CI pipeline.
  • fastapi run is the production command — it enables proper worker management, no reload overhead, and production-grade defaults.
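If you would rather run that check from Python than from grep, a small scanner does the job and can live in a pre-commit hook. This is a sketch under my own assumptions: find_dev_commands is a hypothetical helper, and the regex is written to also catch the exec form CMD ["fastapi", "dev", ...], which a plain grep for 'fastapi dev' would miss.

```python
import re
from pathlib import Path

# Matches `fastapi dev` as well as the exec form `"fastapi", "dev"`.
DEV_COMMAND = re.compile(r"""fastapi["',\s]+dev\b""")


def find_dev_commands(paths):
    """Return (path, line_number, line) for every line that invokes fastapi dev."""
    hits = []
    for path in map(Path, paths):
        for number, line in enumerate(path.read_text().splitlines(), start=1):
            if DEV_COMMAND.search(line):
                hits.append((str(path), number, line.strip()))
    return hits


if __name__ == "__main__":
    import sys

    found = find_dev_commands(sys.argv[1:])
    for path, number, line in found:
        print(f"{path}:{number}: {line}")
    sys.exit(1 if found else 0)  # non-zero exit fails the CI step
```

Pass it the Dockerfile and compose files as arguments; the non-zero exit code is what makes the pipeline stop.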
Production debug guide
Systematic recovery paths for the failure modes engineers actually hit (3 entries).
Symptom · 01
Workers crash with Killed (OOM)
Fix
1. Check dmesg | tail -20 for OOM killer messages. 2. Reduce --workers count. 3. Increase container memory. 4. Add memory limits to Docker/K8s.
Symptom · 02
App not accessible from outside Docker
Fix
1. Check --host is 0.0.0.0. 2. Verify Docker port mapping -p 8000:8000. 3. Check firewall rules.
Symptom · 03
Slow startup with many workers
Fix
1. Check if each worker is connecting to the database (connection pool exhaustion). 2. Use a connection pooler like PgBouncer. 3. Reduce workers.
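The connection arithmetic behind Symptom 03 is worth doing before you raise --workers. A minimal sketch: the function names are hypothetical, and the pool numbers in the example are illustrative values you should replace with your own pool settings and your database's connection limit.

```python
def connections_at_peak(workers, pool_size, max_overflow=0):
    """Each worker process owns its own pool, so the totals multiply."""
    return workers * (pool_size + max_overflow)


def fits_database(workers, pool_size, max_overflow, server_limit, reserved=0):
    """True if all workers at full pool usage stay within the database's connection limit."""
    return connections_at_peak(workers, pool_size, max_overflow) <= server_limit - reserved


# 8 workers, each with a pool of 5 plus 10 overflow, against a 100-connection limit:
print(connections_at_peak(8, 5, 10))
print(fits_database(8, 5, 10, server_limit=100))
```

When the total exceeds the limit, either shrink the per-worker pool, reduce workers, or put a pooler such as PgBouncer in between.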
★ FastAPI CLI Triage Cheat Sheet
First-response commands for when things go wrong, copy-paste ready.
`Address already in use`
Immediate action
Find the process using the port
Commands
lsof -i :8000
kill -9 <PID>
Fix now
Change port with --port 8001
`ModuleNotFoundError`
Immediate action
Check current directory and file name
Commands
pwd && ls *.py
fastapi dev <correct_file>.py
Fix now
Run from project root or specify full path
Workers dying silently
Immediate action
Check system logs for OOM
Commands
dmesg | tail -20
free -m
Fix now
Reduce --workers or increase memory
Slow responses under load
Immediate action
Check worker count and CPU
Commands
htop
fastapi run --workers <2*cores+1>
Fix now
Increase workers, add connection pooling
Feature | fastapi dev | fastapi run
Auto-reload | Yes | No
Default host | 127.0.0.1 | 0.0.0.0
Default log level | debug | info
Workers | 1 | 1 (configurable)
Graceful shutdown | No (reload kills workers) | Yes
Load .env | Yes | Yes
⚙ Quick Reference
8 commands from this guide
File | Command / Code | Purpose
main.py | from fastapi import FastAPI | Why FastAPI CLI Exists
dev.sh | fastapi dev main.py | Development Workflow
run_prod.sh | fastapi run main.py --workers 4 | Production Serving
env_example.sh | DATABASE_URL=postgresql://user:pass@localhost/db | Environment Management
scaffold.sh | fastapi new my_project | Scaffolding
Dockerfile | FROM python:3.11-slim | Deployment
debug.sh | lsof -i :8000 | Debugging Common CLI Failures
custom_uvicorn.sh | uvicorn main:app --host 0.0.0.0 --port 8000 --workers 4 --limit-concurrency 100 ... | When Not to Use FastAPI CLI

Key takeaways

1. Use fastapi dev for local development with hot reload; never in production.
2. Use fastapi run for production; set workers based on memory, not CPU cores.
3. The CLI loads .env automatically; use --env-file for custom paths.
4. When you need advanced Uvicorn options, skip the CLI and use raw Uvicorn.
INTERVIEW PREP · PRACTICE MODE

Interview Questions on This Topic

Q01 (SENIOR): How does FastAPI CLI handle graceful shutdown under `fastapi run`?
Q02 (SENIOR): When would you choose `fastapi run` over raw Uvicorn in production?
Q03 (SENIOR): What happens if you run `fastapi dev` with `--workers 4`?
Q04 (JUNIOR): What is the default host for `fastapi dev` and `fastapi run`?
Q05 (SENIOR): You deploy a FastAPI app with `fastapi run --workers 8` on a 512MB conta...
Q06 (SENIOR): How would you design a zero-downtime deployment for a FastAPI service us...

Q01 of 06 · SENIOR

How does FastAPI CLI handle graceful shutdown under `fastapi run`?

ANSWER
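It sends SIGTERM to each worker. Workers stop accepting new connections, finish in-flight requests within the graceful shutdown window (Uvicorn's --timeout-graceful-shutdown, not --timeout-keep-alive, which only governs idle keep-alive connections), then exit. The master waits for all workers before exiting. This prevents dropped requests during deploys.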
FAQ · 4 QUESTIONS

Frequently Asked Questions

01. What's the difference between `fastapi dev` and `fastapi run`?
02. How do I set the number of workers with FastAPI CLI?
03. How do I use a custom .env file with FastAPI CLI?
04. Can I use FastAPI CLI with Gunicorn?