
FastAPI File Uploads and Form Data

Master multipart/form-data in FastAPI.
⚙️ Intermediate — basic Python knowledge assumed
In this tutorial, you'll learn
  • UploadFile is superior to raw bytes as it uses a spooling temporary file, preventing RAM overflow for large assets.
  • File metadata like filename and content_type are accessible immediately without reading the file body.
  • Synchronous vs Asynchronous: file.read() is an awaitable method; however, for truly massive files, use a streaming chunk approach to keep the memory footprint constant.
Quick Answer
  • Use UploadFile over bytes for streaming, memory efficiency, and metadata access
  • FastAPI parses multipart data into individual Form() parameters — no single Pydantic model for the whole body
  • UploadFile wraps a SpooledTemporaryFile: small files stay in memory, large files hit disk automatically
  • Always validate content_type and enforce file size before writing to disk
  • The biggest production trap: trusting client-provided MIME type — verify with python-magic or inspect file headers
🚨 START HERE
File Upload Quick Debug Cheat Sheet
Five-command workflow for diagnosing file upload issues in FastAPI
🟡 Upload returns 400 Bad Request
Immediate Action: Check that the client sent the correct Content-Type: multipart/form-data
Commands
curl -v -F 'file=@test.pdf' http://localhost:8000/upload
Check server logs: run uvicorn with log level DEBUG
Fix Now: Ensure the route parameter is declared as `UploadFile = File(...)`
🟡 File saved but corrupted
Immediate Action: Check whether you called `file.read()` before writing; you may have consumed the stream.
Commands
Add debugging: print(f'File size after read: {len(content)}')
Use `await file.seek(0)` before saving
Fix Now: Always use a chunked loop that reads and writes without buffering the entire file
🟡 High memory usage
Immediate Action: Check the endpoint for an unbounded `await file.read()`
Commands
kubectl top pods (or docker stats)
Add a size-limit check: if file.size > MAX_SIZE: raise HTTPException(413)
Fix Now: Make the endpoint read in chunks of 1MB
Production Incident: OOM Kill in Production — The Upload That Brought Down the API
A single 2GB file upload consumed all available RAM, triggering the OOM killer and taking down the entire FastAPI service.
Symptom: The API became unresponsive; kubectl top pods showed memory usage growing linearly with file size. Eventually the pod was OOMKilled and restarted.
Assumption: The team assumed FastAPI's UploadFile would handle large files by streaming them to disk automatically, so no explicit chunked reading was implemented in the endpoint.
Root cause: The endpoint used await file.read() without a size limit, reading the entire file into memory before processing. Starlette does spool large uploads to a temporary file, but read() with no argument loads all of those bytes back into memory at once.
Fix: Implemented chunked reading with a chunk size of 1MB and a cumulative size check. If a file exceeds 100MB, reject it immediately with a 413 error. Use file.file for streaming writes with shutil.copyfileobj().
Key Lesson
  • Never call await file.read() without a size argument in production — you're asking for an OOM.
  • Always enforce a maximum file size upfront, before processing any content.
  • Use file.file (the SpooledTemporaryFile) for streaming writes rather than read() for large files.
Production Debug Guide — common symptoms and immediate actions when file uploads fail in production
  • Upload fails with 413 Request Entity Too Large — Check the reverse proxy (nginx, ALB) client_max_body_size or similar setting. FastAPI itself doesn't impose a limit unless you code one, but proxies do.
  • Upload works but the file is empty or truncated — Check that you're not consuming the file twice (e.g., calling read() for validation and then again for saving). Use await file.seek(0) after the first read.
  • File parameter is always None even though a file was sent — Confirm the client is sending the field as multipart/form-data and that the field name matches the endpoint parameter. Use curl -F 'file=@/path' to test.
  • Form fields are received as empty strings or missing — Verify that all form fields are declared with Form(...) (not Body()). Mixing JSON and multipart is not allowed; use separate Form() parameters.
  • Memory usage spikes during multiple concurrent uploads — Ensure you are streaming files to disk in chunks. Also set FILE_MAX_SIZE and reject early. Use asyncio.Semaphore to limit concurrent uploads (see the sketch below).
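A minimal way to cap concurrent uploads with asyncio.Semaphore — the limit of 5, the /upload-limited route, and the /tmp destination are illustrative assumptions, not from the original:

import asyncio
from fastapi import FastAPI, UploadFile, File

app = FastAPI()
# Hypothetical cap: at most 5 uploads stream to disk at once
upload_slots = asyncio.Semaphore(5)

@app.post("/upload-limited")
async def upload_limited(file: UploadFile = File(...)):
    async with upload_slots:  # extra requests wait here instead of piling into RAM
        size = 0
        # NOTE: file.filename should be sanitised first (see the later section)
        with open(f"/tmp/{file.filename}", "wb") as out:
            while chunk := await file.read(1024 * 1024):  # 1MB chunks
                size += len(chunk)
                out.write(chunk)
    return {"filename": file.filename, "size": size}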

Uploading a Single File with Validation

The UploadFile class is a wrapper around a Python SpooledTemporaryFile. This is critical for performance: it keeps small files in RAM for speed but offloads larger files to the temporary directory of your OS to prevent memory exhaustion. In production, always validate the content_type and enforce a maximum file size.

io/thecodeforge/files/upload_handler.py · PYTHON
from fastapi import FastAPI, UploadFile, File, HTTPException, status
import shutil
from pathlib import Path

app = FastAPI()
UPLOAD_DIR = Path("uploads")
UPLOAD_DIR.mkdir(exist_ok=True)

@app.post('/upload')
async def upload_document(file: UploadFile = File(...)):
    # 1. MIME Type Validation (Don't trust the extension!)
    if file.content_type not in ['application/pdf', 'image/jpeg']:
        raise HTTPException(
            status_code=status.HTTP_400_BAD_REQUEST, 
            detail="Invalid file type. Only PDF and JPEG are allowed."
        )

    # 2. Size Validation using file object metadata
    # Note: .size is available in newer FastAPI/Starlette versions
    MAX_SIZE = 10 * 1024 * 1024  # 10MB
    real_file_size = 0
    
    # Stream content to disk to avoid memory spikes
    # NOTE: file.filename is used as-is for brevity — sanitise it in
    # production (see the path-traversal section later in this article)
    save_path = UPLOAD_DIR / file.filename
    too_large = False
    with open(save_path, "wb") as buffer:
        while chunk := await file.read(1024 * 1024):  # Read in 1MB chunks
            real_file_size += len(chunk)
            if real_file_size > MAX_SIZE:
                too_large = True
                break
            buffer.write(chunk)

    if too_large:
        save_path.unlink(missing_ok=True)  # Don't leave a partial file behind
        raise HTTPException(status_code=413, detail="File too large")

    return {
        'filename': file.filename,
        'saved_at': str(save_path),
        'final_size': real_file_size
    }
▶ Output
{"filename": "contract.pdf", "saved_at": "uploads/contract.pdf", "final_size": 1048576}
📊 Production Insight
Chunked reading with a cumulative size check prevents OOM, and writing as you read avoids the latency of buffering the entire file before the first write.
The file.size attribute is only available in newer Starlette releases — always implement a manual size check as a fallback.
Rule: Always read in chunks, never await file.read() without arguments.
🎯 Key Takeaway
Use UploadFile for streaming, but never trust content_type alone.
Validate file size during streaming, not after.
Chunk size of 1MB balances memory and I/O overhead.

Mixing Form Fields with File Uploads

A frequent pain point for developers is trying to send a Pydantic JSON body alongside a file. Due to the way HTTP works, you cannot easily mix application/json and multipart/form-data. Instead, you must declare each form field using Form(). FastAPI will then parse the multipart body and map the keys to your arguments.

io/thecodeforge/files/profile_form.py · PYTHON
from fastapi import FastAPI, UploadFile, File, Form, status
from typing import Annotated

app = FastAPI()

@app.post('/forge/profile-update', status_code=status.HTTP_201_CREATED)
async def update_profile(
    username: Annotated[str, Form()],
    bio: Annotated[str, Form(min_length=10)],
    # Optional file: defaults to None if not in the request
    avatar: UploadFile | None = File(None) 
):
    response = {
        "user_id": 1024, 
        "username": username, 
        "bio": bio,
        "avatar_received": False
    }
    
    if avatar:
        # Process avatar (e.g., upload to S3 or resize)
        response["avatar_received"] = True
        response["avatar_name"] = avatar.filename
        
    return response
▶ Output
{"username": "forge_admin", "bio": "Senior Editor at TheCodeForge", "avatar_received": true}
📊 Production Insight
Developers often try to embed JSON inside a form field (e.g., metadata field) — this breaks type safety.
Use separate form fields for each scalar value; for complex structures, serialize to JSON string and parse inside endpoint.
Rule: No Pydantic model for multipart — declare each field explicitly.
🎯 Key Takeaway
You cannot mix JSON body and multipart — use Form() for each field.
Optional files: default to None and use UploadFile | None = File(None).
For complex data, encode as JSON string in a form field.
Form or File? How to Choose Between Form and File
If the field is a simple scalar (string, int, bool)
→ Use Form(...) — FastAPI will parse it from the multipart body.
If the field is a file (image, PDF, video)
→ Use UploadFile = File(...) — gives streaming, metadata, and efficient handling.
If you need to send a nested JSON object alongside a file
→ Serialize the JSON to a string field using Form(), then deserialize in the endpoint (see the sketch below).
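A minimal sketch of that last pattern, assuming Pydantic v2 — the Metadata model, field names, and route are illustrative, not part of FastAPI's API:

from typing import Annotated
from fastapi import FastAPI, UploadFile, File, Form, HTTPException
from pydantic import BaseModel, ValidationError

app = FastAPI()

class Metadata(BaseModel):  # hypothetical nested structure
    title: str
    tags: list[str] = []

@app.post("/upload-with-metadata")
async def upload_with_metadata(
    metadata: Annotated[str, Form()],  # the JSON arrives as a plain string
    file: UploadFile = File(...),
):
    try:
        meta = Metadata.model_validate_json(metadata)  # parse + validate manually
    except ValidationError as exc:
        raise HTTPException(status_code=422, detail=str(exc))
    return {"title": meta.title, "tags": meta.tags, "filename": file.filename}

The client then sends the JSON as an ordinary form field, e.g. curl -F 'metadata={"title": "Q3 report", "tags": ["finance"]}' -F 'file=@report.pdf' http://localhost:8000/upload-with-metadata.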

Handling Multiple File Uploads

FastAPI natively supports multiple files under the same field name by declaring the parameter as a list of UploadFile. The client sends each file with the same key, and FastAPI collects them into a Python list. This is common for batch uploads, image galleries, or attachment-heavy workflows.

io/thecodeforge/files/multi_upload.py · PYTHON
from fastapi import FastAPI, UploadFile, File, HTTPException

app = FastAPI()
MAX_FILES = 10
MAX_TOTAL_SIZE = 100 * 1024 * 1024  # 100MB

@app.post('/upload-multiple')
async def upload_multiple(files: list[UploadFile] = File(...)):
    if len(files) > MAX_FILES:
        raise HTTPException(400, detail=f"Maximum {MAX_FILES} files allowed")
    
    total_size = 0
    saved_files = []
    
    for file in files:
        # ⚠ Anti-pattern kept for demonstration: this accumulates the whole
        # file in memory before writing (see the warning below)
        content = b''
        while chunk := await file.read(1024 * 1024):
            content += chunk
            total_size += len(chunk)
            if total_size > MAX_TOTAL_SIZE:
                raise HTTPException(413, detail="Total upload size exceeds limit")
        
        with open(f"uploads/{file.filename}", "wb") as f:
            f.write(content)
        saved_files.append(file.filename)
    
    return {"saved_files": saved_files, "total_size": total_size}
▶ Output
{"saved_files": ["photo1.jpg", "photo2.jpg"], "total_size": 5242880}
⚠ Memory Risk with Multiple Files
The above code accumulates each file's content in memory (content += chunk) before writing. For large files, this defeats the purpose of streaming. Use proper streaming by writing each chunk directly to a file descriptor instead of accumulating in a bytes object.
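A corrected sketch that writes each chunk straight to disk — the /upload-multiple-streamed route name is illustrative; the limits and uploads/ directory are carried over from the example above:

from pathlib import Path
from fastapi import FastAPI, UploadFile, File, HTTPException

app = FastAPI()
UPLOAD_DIR = Path("uploads")
UPLOAD_DIR.mkdir(exist_ok=True)
MAX_FILES = 10
MAX_TOTAL_SIZE = 100 * 1024 * 1024  # 100MB

@app.post('/upload-multiple-streamed')
async def upload_multiple_streamed(files: list[UploadFile] = File(...)):
    if len(files) > MAX_FILES:
        raise HTTPException(400, detail=f"Maximum {MAX_FILES} files allowed")

    total_size = 0
    saved_files = []
    for file in files:
        dest = UPLOAD_DIR / file.filename  # sanitise in production (see below)
        with open(dest, "wb") as out:
            while chunk := await file.read(1024 * 1024):
                total_size += len(chunk)
                if total_size > MAX_TOTAL_SIZE:
                    out.close()
                    dest.unlink(missing_ok=True)  # drop the partial file
                    raise HTTPException(413, detail="Total upload size exceeds limit")
                out.write(chunk)  # constant memory: one chunk at a time
        saved_files.append(file.filename)

    return {"saved_files": saved_files, "total_size": total_size}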
📊 Production Insight
Multiple file endpoints are a prime target for resource exhaustion — attackers send many large files to fill memory or disk.
Always enforce both per-file and total size limits, and cap the number of files.
Use asyncio.gather for concurrent file processing, but watch for I/O contention on disk.
🎯 Key Takeaway
Declare list[UploadFile] for multiple files.
Stream each file independently to disk — never accumulate all chunks in one bytes object.
Enforce MAX_FILES and MAX_TOTAL_SIZE before processing.

Validating File Type Beyond Content-Type

The content_type attribute from the client is trivial to spoof. A malicious client can upload a .exe with Content-Type: image/jpeg. To validate file contents, you need to inspect file signatures (magic bytes). Use python-magic which wraps libmagic, or manually check the first few bytes. This is critical for security-sensitive uploads like profile pictures or document scans.

io/thecodeforge/files/validate_type.py · PYTHON
import magic  # python-magic package; requires the libmagic system library
from fastapi import UploadFile, HTTPException

ALLOWED_MIME_TYPES = {
    'image/jpeg', 'image/png', 'application/pdf'
}

async def validate_file_type(file: UploadFile):
    # Read first 2048 bytes for magic detection
    chunk = await file.read(2048)
    await file.seek(0)  # Reset stream position
    
    mime = magic.from_buffer(chunk, mime=True)
    if mime not in ALLOWED_MIME_TYPES:
        raise HTTPException(400, detail=f"File type {mime} is not allowed. Only {ALLOWED_MIME_TYPES}")
    return True
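A hypothetical endpoint wiring the helper in — assuming validate_file_type from the snippet above is in scope; the route name is illustrative:

from fastapi import FastAPI, File

app = FastAPI()

@app.post("/secure-upload")
async def secure_upload(file: UploadFile = File(...)):
    await validate_file_type(file)  # raises HTTPException(400) on a bad type
    # The stream position is already back at 0, so the file can be saved normally
    return {"filename": file.filename, "verified": True}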
Mental Model
File Type Detection: Trust the Bytes, Not the Header
A file's content_type is just a label; the first bytes tell the real story.
  • Magic bytes are the first n bytes of a file — they identify the format regardless of extension or MIME type.
  • libmagic (via python-magic) reads these bytes and returns the actual MIME type.
  • Always seek back to 0 after reading the magic chunk — you consumed part of the stream.
  • For images, you can also use PIL (Pillow) to verify the image can be decoded — this catches truncated or corrupted files.
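Putting that last bullet into practice — a minimal Pillow check, assuming the Pillow package is installed; the helper name is illustrative, and you should enforce a size limit before buffering the whole image:

from io import BytesIO
from PIL import Image
from fastapi import UploadFile, HTTPException

async def validate_image_decodes(file: UploadFile) -> None:
    data = await file.read()   # buffer the image; apply a size cap first
    await file.seek(0)         # rewind so the file can still be saved
    try:
        Image.open(BytesIO(data)).verify()  # raises on truncated/corrupt data
    except Exception:
        raise HTTPException(400, detail="File is not a decodable image")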
📊 Production Insight
Trusting content_type directly leads to security vulnerabilities. Attackers can upload executable files disguised as images.
python-magic adds latency per upload (~5ms). For high-throughput endpoints, cache allowed MIME signatures or use a file extension whitelist as a pre-filter.
Rule: Never rely solely on client-provided content type — always verify with magic bytes.
🎯 Key Takeaway
content_type is client-controlled — never trust it for security decisions.
python-magic reads actual file content to determine MIME type.
Always seek(0) after reading magic bytes so you don't lose the file data.
When to Use Magic Byte Validation
If it's a high-security endpoint (profile photos, document uploads)
→ Use python-magic or a manual header check. Verify before any processing.
If it's a low-risk, internal API with trusted clients
→ Rely on content_type plus an extension check — but still log mismatches for audit.
If you need to handle both images and PDFs with different processing
→ Use magic bytes to route to the correct handler (e.g., resize an image vs. extract text from a PDF) — see the sketch below.
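A minimal routing sketch for that last case — resize_image and extract_pdf_text are hypothetical placeholders, and the /process route is illustrative:

import magic
from fastapi import FastAPI, UploadFile, File, HTTPException

app = FastAPI()

async def resize_image(file: UploadFile) -> dict:  # placeholder handler
    return {"action": "resized", "filename": file.filename}

async def extract_pdf_text(file: UploadFile) -> dict:  # placeholder handler
    return {"action": "text-extracted", "filename": file.filename}

@app.post("/process")
async def process(file: UploadFile = File(...)):
    header = await file.read(2048)  # enough bytes for libmagic
    await file.seek(0)
    mime = magic.from_buffer(header, mime=True)
    if mime in ("image/jpeg", "image/png"):
        return await resize_image(file)
    if mime == "application/pdf":
        return await extract_pdf_text(file)
    raise HTTPException(400, detail=f"Unsupported type: {mime}")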

Sanitising Filenames to Prevent Path Traversal

Saving uploaded files with the client-provided filename is dangerous. A malicious user can supply a name like ../../etc/passwd to overwrite system files or create files outside the intended directory. Always sanitize filenames: strip directory separators, use a whitelist of allowed characters, or generate a UUID-based name and store the original in a database.

io/thecodeforge/files/sanitize_filename.py · PYTHON
import re
import uuid
from pathlib import Path

def sanitize_filename(original: str) -> str:
    # Remove directory separators, then replace disallowed characters
    safe = re.sub(r'[\\/]', '', original)
    safe = re.sub(r'[^a-zA-Z0-9._-]', '_', safe)
    # Strip leading dots so the result can't be hidden, '.', or '..'
    safe = safe.lstrip('.')
    # Truncate to avoid filesystem limits (255 chars is a typical max)
    return safe[:255] or "unnamed"

def generate_unique_filename(original: str) -> str:
    ext = Path(original).suffix
    return f"{uuid.uuid4().hex}{ext}"

# Usage:
# sanitize_filename("../../etc/passwd")   -> "etcpasswd"
# generate_unique_filename("report.pdf")  -> "a1b2c3d4....pdf" (hex UUID + extension)
📊 Production Insight
Path traversal is a classic vulnerability class; OWASP consistently flags it in the Top 10 (under Broken Access Control).
Even with sanitization, an attacker could use null bytes or encoding tricks — always take the basename and join against a safe base directory, verifying the result stays inside it (see the sketch after the takeaways below).
Best practice: never use the original filename on disk — store under a UUID and keep the original name in a database or metadata field.
Rule: Always generate a unique name for storage; store the original name separately for display.
🎯 Key Takeaway
Never trust user-supplied filenames — they can contain path traversal sequences.
Use UUID-based filenames for storage, map back to original via database.
If you must keep original name, whitelist characters and strip directory separators.
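A minimal safe-join sketch, assuming Python 3.9+ for Path.is_relative_to — the safe_join name is illustrative:

from pathlib import Path
from fastapi import HTTPException

UPLOAD_DIR = Path("uploads").resolve()

def safe_join(name: str) -> Path:
    # Keep only the final path component (basename-style defence)
    candidate = (UPLOAD_DIR / Path(name).name).resolve()
    # Belt and braces: the resolved path must stay inside the upload dir
    if not candidate.is_relative_to(UPLOAD_DIR):
        raise HTTPException(400, detail="Invalid filename")
    return candidate

# safe_join("../../etc/passwd") -> <cwd>/uploads/passwd (traversal stripped)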
🗂 bytes vs UploadFile: What to Use and When
Choosing the right type for file parameters in FastAPI
Aspect · bytes · UploadFile
Memory usage · Entire file loaded into memory · Streamed; spooled to disk if large
Metadata access · None — filename and content type are not exposed · Immediate: filename, content_type, size
Async support · None — the whole body is read before your handler runs · Fully async: awaitable read/seek/write methods
Best for · Small files (<1MB), simple use cases · All production cases, especially large files
Size validation · len() on the full in-memory payload · Incremental checks while reading with read(chunk_size)

🎯 Key Takeaways

  • UploadFile is superior to raw bytes as it uses a spooling temporary file, preventing RAM overflow for large assets.
  • File metadata like filename and content_type are accessible immediately without reading the file body.
  • Synchronous vs Asynchronous: file.read() is an awaitable method; however, for truly massive files, use a streaming chunk approach to keep the memory footprint constant.
  • The 'No Pydantic' Rule: when using multipart/form-data, you must declare fields individually using Form(); the python-multipart package must be installed for FastAPI to parse the body at all.
  • Always use a library like python-magic or check the file header if you need deep validation of file types beyond the client-provided MIME type.

⚠ Common Mistakes to Avoid

    Calling `await file.read()` without a size argument
    Symptom

    Large files cause OOM and application crash. Memory usage spikes linearly with file size.

    Fix

    Always pass a chunk size: while chunk := await file.read(1024 * 1024):

    Using `File()` for form fields that are not files (e.g., strings)
    Symptom

    FastAPI raises a validation error because it expects a file upload for that field.

    Fix

    Use Form(...) for non-file form fields. Only use File(...) for actual file upload parameters.

    Saving file directly with client-provided filename
    Symptom

    Path traversal vulnerability — files can be written outside the intended directory.

    Fix

    Sanitize filename or generate a UUID-based name. Never join user input directly to a path.

    Assuming `content_type` is accurate
    Symptom

    Malicious files with wrong extension bypass validation, leading to security compromise.

    Fix

    Use python-magic to detect actual MIME type from file bytes.

    Not resetting file stream position after reading
    Symptom

    Subsequent await file.read() or file.file.read() returns empty bytes.

    Fix

    Call await file.seek(0) after any read call that you expect to re-read.

Interview Questions on This Topic

  • Q (Junior): Explain the difference between bytes and UploadFile in a FastAPI endpoint. Which one would you use for a 2GB video upload and why?
    bytes reads the entire file into memory; UploadFile wraps a SpooledTemporaryFile that streams content. For a 2GB video, use UploadFile because it allows chunked reading (e.g., in 1MB blocks), avoiding OOM. Also gives immediate access to filename and content_type without reading the body.
  • Q (Mid-level): How does the python-multipart library interact with FastAPI under the hood? Why is it a required dependency for form handling?
    FastAPI relies on Starlette, which uses python-multipart to parse incoming multipart/form-data requests. It decodes the MIME multipart stream, separating file parts (as UploadFile objects) and form fields (as strings). Without it, FastAPI cannot parse file uploads; FastAPI will raise an import error if it's missing and you declare a File() parameter.
  • Q (Mid-level): Describe the security risks associated with allowing user-defined filenames. How would you sanitize a filename before saving it to a local filesystem?
    Risks include path traversal (../../etc/passwd), null-byte injection, and special characters that break file system operations. Sanitization: strip directory separators, whitelist allowed characters (letters, digits, underscores, hyphens, dots), truncate to 255 characters, and generate a UUID-based name for storage. Always store the original name in a database if needed.
  • Q (Senior): How would you implement a progress bar or status tracker for a large file upload in a purely asynchronous FastAPI environment?
    FastAPI's request handling is not designed for streaming partial progress back to the client during upload (the entire body is read before the handler runs). Workarounds: split the upload into multiple chunks via a custom endpoint (client sends chunks with index), or use WebSocket to report progress while a background task processes the file. Alternatively, use a third-party service like tus.io for resumable uploads with progress.
  • Q (Senior): What is a SpooledTemporaryFile, and how does it prevent the Out-of-Memory (OOM) killer from terminating your API process during large transfers?
    SpooledTemporaryFile (from Python's tempfile module) starts as an io.BytesIO in memory. When the data exceeds a threshold (default 64KB), it is rolled over to a temporary file on disk. This prevents large files from consuming RAM — the bulk of the data goes to disk. FastAPI's UploadFile wraps this, so even if you call await file.read(), the underlying file is already spooled, but it's still loaded into memory at that point. For true OOM prevention, you must stream chunks rather than calling read() completely.

Frequently Asked Questions

How do I save an uploaded file to disk in FastAPI?

While you can use file.read(), the most professional approach for production is to stream the file in chunks to a permanent location. This prevents your server from buffering gigabytes of data. Use shutil.copyfileobj(file.file, destination) for synchronous writes, or an async chunk loop for better concurrency: while chunk := await file.read(size): buffer.write(chunk).
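A minimal sketch of the synchronous shutil pattern mentioned above, assuming an uploads/ directory (the /save route is illustrative):

import shutil
from pathlib import Path
from fastapi import FastAPI, UploadFile, File

app = FastAPI()
Path("uploads").mkdir(exist_ok=True)

@app.post("/save")
async def save(file: UploadFile = File(...)):
    dest = Path("uploads") / Path(file.filename).name  # strip any path components
    with dest.open("wb") as out:
        shutil.copyfileobj(file.file, out)  # streams from the spooled temp file
    return {"saved": str(dest)}

Note that shutil.copyfileobj is blocking; under heavy concurrency it is better to offload the copy to a thread (e.g., starlette.concurrency.run_in_threadpool) or use the async chunk loop shown earlier.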

Why can't I use a Pydantic model for form data?

This is fundamentally due to how browsers and HTTP handle the multipart/form-data encoding. It does not map natively to a single JSON object. While you can hack this by receiving a JSON string as a Form field and parsing it manually inside the endpoint, the standard FastAPI pattern is to define individual Form() parameters.

Can I upload multiple files at once?

Yes. Simply define the type hint as a list: files: list[UploadFile] = File(...). This allows the client to send multiple files under the same key. You can then iterate through the list and process each file individually.

How do I set a maximum file size in FastAPI?

FastAPI doesn't have a built-in limit. You must implement it in your endpoint by reading chunks and enforcing a cumulative size. Additionally, configure your reverse proxy (nginx, AWS ALB) to reject oversized requests early with client_max_body_size.

What is the difference between `File(...)` and `File(None)`?

File(...) makes the file parameter required. File(None) makes it optional — the parameter defaults to None if no file is sent. Use UploadFile | None = File(None) for optional file uploads.
