
FastAPI File Uploads and Form Data

Master multipart/form-data in FastAPI.
⚙️ Intermediate — basic Python knowledge assumed
In this tutorial, you'll learn
  • UploadFile is superior to raw bytes as it uses a spooling temporary file, preventing RAM overflow for large assets.
  • File metadata like filename and content_type are accessible immediately without reading the file body.
  • Synchronous vs Asynchronous: file.read() is an awaitable method; however, for truly massive files, use a streaming chunk approach to keep the memory footprint constant.
Quick Answer
  • Use UploadFile over bytes for streaming, memory efficiency, and metadata access
  • FastAPI parses multipart data into individual Form() parameters — no single Pydantic model for the whole body
  • UploadFile wraps a SpooledTemporaryFile: small files stay in memory, large files hit disk automatically
  • Always validate content_type and enforce file size before writing to disk
  • The biggest production trap: trusting client-provided MIME type — verify with python-magic or inspect file headers
🚨 START HERE
File Upload Quick Debug Cheat Sheet
Five-command workflow for diagnosing file upload issues in FastAPI
🟡 Upload returns 400 Bad Request
Immediate Action: Check that the client sent the correct Content-Type: multipart/form-data
Commands
curl -v -F 'file=@test.pdf' http://localhost:8000/upload
Check server logs: run uvicorn with log level DEBUG
Fix Now: Ensure the route parameter is declared as `UploadFile = File(...)`
🟡 File saved but corrupted
Immediate Action: Check whether you called `file.read()` before writing; you may have consumed the stream.
Commands
Add debugging: print(f'File size after read: {len(content)}')
Use `await file.seek(0)` before saving
Fix Now: Always use a chunked loop that reads and writes without buffering the entire file
🟡 High memory usage
Immediate Action: Check the endpoint for an unbounded `await file.read()`
Commands
kubectl top pods (or docker stats)
Add a size-limit check: if file.size > MAX_SIZE: raise HTTPException(413)
Fix Now: Make the endpoint read in chunks of 1MB
Production Incident: OOM Kill in Production — The Upload That Brought Down the API
A single 2GB file upload consumed all available RAM, triggering the OOM killer and taking down the entire FastAPI service.
Symptom: The API became unresponsive; kubectl top pods showed memory usage growing linearly with file size. Eventually the pod was OOMKilled and restarted.
Assumption: The team assumed FastAPI's UploadFile would handle large files by streaming them to disk automatically, so no explicit chunked reading was implemented in the endpoint.
Root cause: The endpoint used await file.read() without a size limit, reading the entire file into memory before processing. Starlette does spool large uploads to a temporary file, but read() with no argument loads all of those bytes back into memory at once.
Fix: Implemented chunked reading with a chunk size of 1MB and a cumulative size check. If a file exceeds 100MB, reject it immediately with a 413 error. Use file.file for streaming writes with shutil.copyfileobj().
Key Lesson
  • Never call await file.read() without a size argument in production — you're asking for an OOM.
  • Always enforce a maximum file size upfront, before processing any content.
  • Use file.file (the SpooledTemporaryFile) for streaming writes rather than read() for large files.
Production Debug Guide — common symptoms and immediate actions when file uploads fail in production
  • Upload fails with 413 Request Entity Too Large — Check the reverse proxy (nginx, ALB) client_max_body_size or similar setting. FastAPI itself doesn't impose a limit unless you code one, but proxies do.
  • Upload works but the file is empty or truncated — Check that you're not consuming the file twice (e.g., calling read() for validation and then again for saving). Use await file.seek(0) after the first read.
  • File parameter is always None even though a file was sent — Confirm the client is sending the field as multipart/form-data and that the field name matches the endpoint parameter. Use curl -F 'file=@/path' to test.
  • Form fields are received as empty strings or missing — Verify that all form fields are declared with Form(...) (not Body()). Mixing JSON and multipart is not allowed; use separate Form() parameters.
  • Memory usage spikes during multiple concurrent uploads — Ensure you are streaming files to disk in chunks. Also set FILE_MAX_SIZE and reject early. Use asyncio.Semaphore to limit concurrent uploads (see the sketch below).
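A minimal way to cap concurrent uploads with asyncio.Semaphore — the limit of 5, the /upload-limited route, and the /tmp destination are illustrative assumptions, not from the original:

import asyncio
from fastapi import FastAPI, UploadFile, File

app = FastAPI()
# Hypothetical cap: at most 5 uploads stream to disk at once
upload_slots = asyncio.Semaphore(5)

@app.post("/upload-limited")
async def upload_limited(file: UploadFile = File(...)):
    async with upload_slots:  # extra requests wait here instead of piling into RAM
        size = 0
        # NOTE: file.filename should be sanitised first (see the later section)
        with open(f"/tmp/{file.filename}", "wb") as out:
            while chunk := await file.read(1024 * 1024):  # 1MB chunks
                size += len(chunk)
                out.write(chunk)
    return {"filename": file.filename, "size": size}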

Uploading a Single File with Validation

The UploadFile class is a wrapper around a Python SpooledTemporaryFile. This is critical for performance: it keeps small files in RAM for speed but offloads larger files to the temporary directory of your OS to prevent memory exhaustion. In production, always validate the content_type and enforce a maximum file size.

io/thecodeforge/files/upload_handler.py · PYTHON
from fastapi import FastAPI, UploadFile, File, HTTPException, status
import shutil
from pathlib import Path

app = FastAPI()
UPLOAD_DIR = Path("uploads")
UPLOAD_DIR.mkdir(exist_ok=True)

@app.post('/upload')
async def upload_document(file: UploadFile = File(...)):
    # 1. MIME Type Validation (Don't trust the extension!)
    if file.content_type not in ['application/pdf', 'image/jpeg']:
        raise HTTPException(
            status_code=status.HTTP_400_BAD_REQUEST, 
            detail="Invalid file type. Only PDF and JPEG are allowed."
        )

    # 2. Size Validation using file object metadata
    # Note: .size is available in newer FastAPI/Starlette versions
    MAX_SIZE = 10 * 1024 * 1024  # 10MB
    real_file_size = 0
    
    # Stream content to disk to avoid memory spikes
    # NOTE: file.filename is used as-is for brevity — sanitise it in
    # production (see the path-traversal section later in this article)
    save_path = UPLOAD_DIR / file.filename
    too_large = False
    with open(save_path, "wb") as buffer:
        while chunk := await file.read(1024 * 1024):  # Read in 1MB chunks
            real_file_size += len(chunk)
            if real_file_size > MAX_SIZE:
                too_large = True
                break
            buffer.write(chunk)

    if too_large:
        save_path.unlink(missing_ok=True)  # Don't leave a partial file behind
        raise HTTPException(status_code=413, detail="File too large")

    return {
        'filename': file.filename,
        'saved_at': str(save_path),
        'final_size': real_file_size
    }
▶ Output
{"filename": "contract.pdf", "saved_at": "uploads/contract.pdf", "final_size": 1048576}
📊 Production Insight
Chunked reading with a cumulative size check prevents OOM, and writing as you read avoids the latency of buffering the entire file before the first write.
The file.size attribute is only available in newer Starlette releases — always implement a manual size check as a fallback.
Rule: Always read in chunks, never await file.read() without arguments.
🎯 Key Takeaway
Use UploadFile for streaming, but never trust content_type alone.
Validate file size during streaming, not after.
Chunk size of 1MB balances memory and I/O overhead.

Mixing Form Fields with File Uploads

A frequent pain point for developers is trying to send a Pydantic JSON body alongside a file. Due to the way HTTP works, you cannot easily mix application/json and multipart/form-data. Instead, you must declare each form field using Form(). FastAPI will then parse the multipart body and map the keys to your arguments.

io/thecodeforge/files/profile_form.py · PYTHON
from fastapi import FastAPI, UploadFile, File, Form, status
from typing import Annotated

app = FastAPI()

@app.post('/forge/profile-update', status_code=status.HTTP_201_CREATED)
async def update_profile(
    username: Annotated[str, Form()],
    bio: Annotated[str, Form(min_length=10)],
    # Optional file: defaults to None if not in the request
    avatar: UploadFile | None = File(None) 
):
    response = {
        "user_id": 1024, 
        "username": username, 
        "bio": bio,
        "avatar_received": False
    }
    
    if avatar:
        # Process avatar (e.g., upload to S3 or resize)
        response["avatar_received"] = True
        response["avatar_name"] = avatar.filename
        
    return response
▶ Output
{"username": "forge_admin", "bio": "Senior Editor at TheCodeForge", "avatar_received": true}
📊 Production Insight
Developers often try to embed JSON inside a form field (e.g., metadata field) — this breaks type safety.
Use separate form fields for each scalar value; for complex structures, serialize to JSON string and parse inside endpoint.
Rule: No Pydantic model for multipart — declare each field explicitly.
🎯 Key Takeaway
You cannot mix JSON body and multipart — use Form() for each field.
Optional files: default to None and use UploadFile | None = File(None).
For complex data, encode as JSON string in a form field.
Form or File? How to Choose Between Form and File
If the field is a simple scalar (string, int, bool)
→ Use Form(...) — FastAPI will parse it from the multipart body.
If the field is a file (image, PDF, video)
→ Use UploadFile = File(...) — gives streaming, metadata, and efficient handling.
If you need to send a nested JSON object alongside a file
→ Serialize the JSON to a string field using Form(), then deserialize in the endpoint (see the sketch below).
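A minimal sketch of that last pattern, assuming Pydantic v2 — the Metadata model, field names, and route are illustrative, not part of FastAPI's API:

from typing import Annotated
from fastapi import FastAPI, UploadFile, File, Form, HTTPException
from pydantic import BaseModel, ValidationError

app = FastAPI()

class Metadata(BaseModel):  # hypothetical nested structure
    title: str
    tags: list[str] = []

@app.post("/upload-with-metadata")
async def upload_with_metadata(
    metadata: Annotated[str, Form()],  # the JSON arrives as a plain string
    file: UploadFile = File(...),
):
    try:
        meta = Metadata.model_validate_json(metadata)  # parse + validate manually
    except ValidationError as exc:
        raise HTTPException(status_code=422, detail=str(exc))
    return {"title": meta.title, "tags": meta.tags, "filename": file.filename}

The client then sends the JSON as an ordinary form field, e.g. curl -F 'metadata={"title": "Q3 report", "tags": ["finance"]}' -F 'file=@report.pdf' http://localhost:8000/upload-with-metadata.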

Handling Multiple File Uploads

FastAPI natively supports multiple files under the same field name by declaring the parameter as a list of UploadFile. The client sends each file with the same key, and FastAPI collects them into a Python list. This is common for batch uploads, image galleries, or attachment-heavy workflows.

io/thecodeforge/files/multi_upload.py · PYTHON
from fastapi import FastAPI, UploadFile, File, HTTPException

app = FastAPI()
MAX_FILES = 10
MAX_TOTAL_SIZE = 100 * 1024 * 1024  # 100MB

@app.post('/upload-multiple')
async def upload_multiple(files: list[UploadFile] = File(...)):
    if len(files) > MAX_FILES:
        raise HTTPException(400, detail=f"Maximum {MAX_FILES} files allowed")
    
    total_size = 0
    saved_files = []
    
    for file in files:
        # ⚠ Anti-pattern kept for demonstration: this accumulates the whole
        # file in memory before writing (see the warning below)
        content = b''
        while chunk := await file.read(1024 * 1024):
            content += chunk
            total_size += len(chunk)
            if total_size > MAX_TOTAL_SIZE:
                raise HTTPException(413, detail="Total upload size exceeds limit")
        
        with open(f"uploads/{file.filename}", "wb") as f:
            f.write(content)
        saved_files.append(file.filename)
    
    return {"saved_files": saved_files, "total_size": total_size}
▶ Output
{"saved_files": ["photo1.jpg", "photo2.jpg"], "total_size": 5242880}
⚠ Memory Risk with Multiple Files
The above code accumulates each file's content in memory (content += chunk) before writing. For large files, this defeats the purpose of streaming. Use proper streaming by writing each chunk directly to a file descriptor instead of accumulating in a bytes object.
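A corrected sketch that writes each chunk straight to disk — the /upload-multiple-streamed route name is illustrative; the limits and uploads/ directory are carried over from the example above:

from pathlib import Path
from fastapi import FastAPI, UploadFile, File, HTTPException

app = FastAPI()
UPLOAD_DIR = Path("uploads")
UPLOAD_DIR.mkdir(exist_ok=True)
MAX_FILES = 10
MAX_TOTAL_SIZE = 100 * 1024 * 1024  # 100MB

@app.post('/upload-multiple-streamed')
async def upload_multiple_streamed(files: list[UploadFile] = File(...)):
    if len(files) > MAX_FILES:
        raise HTTPException(400, detail=f"Maximum {MAX_FILES} files allowed")

    total_size = 0
    saved_files = []
    for file in files:
        dest = UPLOAD_DIR / file.filename  # sanitise in production (see below)
        with open(dest, "wb") as out:
            while chunk := await file.read(1024 * 1024):
                total_size += len(chunk)
                if total_size > MAX_TOTAL_SIZE:
                    out.close()
                    dest.unlink(missing_ok=True)  # drop the partial file
                    raise HTTPException(413, detail="Total upload size exceeds limit")
                out.write(chunk)  # constant memory: one chunk at a time
        saved_files.append(file.filename)

    return {"saved_files": saved_files, "total_size": total_size}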
📊 Production Insight
Multiple file endpoints are a prime target for resource exhaustion — attackers send many large files to fill memory or disk.
Always enforce both per-file and total size limits, and cap the number of files.
Use asyncio.gather for concurrent file processing, but watch for I/O contention on disk.
🎯 Key Takeaway
Declare list[UploadFile] for multiple files.
Stream each file independently to disk — never accumulate all chunks in one bytes object.
Enforce MAX_FILES and MAX_TOTAL_SIZE before processing.

Validating File Type Beyond Content-Type

The content_type attribute from the client is trivial to spoof. A malicious client can upload a .exe with Content-Type: image/jpeg. To validate file contents, you need to inspect file signatures (magic bytes). Use python-magic which wraps libmagic, or manually check the first few bytes. This is critical for security-sensitive uploads like profile pictures or document scans.

io/thecodeforge/files/validate_type.py · PYTHON
import magic  # python-magic package; requires the libmagic system library
from fastapi import UploadFile, HTTPException

ALLOWED_MIME_TYPES = {
    'image/jpeg', 'image/png', 'application/pdf'
}

async def validate_file_type(file: UploadFile):
    # Read first 2048 bytes for magic detection
    chunk = await file.read(2048)
    await file.seek(0)  # Reset stream position
    
    mime = magic.from_buffer(chunk, mime=True)
    if mime not in ALLOWED_MIME_TYPES:
        raise HTTPException(400, detail=f"File type {mime} is not allowed. Only {ALLOWED_MIME_TYPES}")
    return True
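A hypothetical endpoint wiring the helper in — assuming validate_file_type from the snippet above is in scope; the route name is illustrative:

from fastapi import FastAPI, File

app = FastAPI()

@app.post("/secure-upload")
async def secure_upload(file: UploadFile = File(...)):
    await validate_file_type(file)  # raises HTTPException(400) on a bad type
    # The stream position is already back at 0, so the file can be saved normally
    return {"filename": file.filename, "verified": True}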
Mental Model
File Type Detection: Trust the Bytes, Not the Header
A file's content_type is just a label; the first bytes tell the real story.
  • Magic bytes are the first n bytes of a file — they identify the format regardless of extension or MIME type.
  • libmagic (via python-magic) reads these bytes and returns the actual MIME type.
  • Always seek back to 0 after reading the magic chunk — you consumed part of the stream.
  • For images, you can also use PIL (Pillow) to verify the image can be decoded — this catches truncated or corrupted files.
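Putting that last bullet into practice — a minimal Pillow check, assuming the Pillow package is installed; the helper name is illustrative, and you should enforce a size limit before buffering the whole image:

from io import BytesIO
from PIL import Image
from fastapi import UploadFile, HTTPException

async def validate_image_decodes(file: UploadFile) -> None:
    data = await file.read()   # buffer the image; apply a size cap first
    await file.seek(0)         # rewind so the file can still be saved
    try:
        Image.open(BytesIO(data)).verify()  # raises on truncated/corrupt data
    except Exception:
        raise HTTPException(400, detail="File is not a decodable image")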
📊 Production Insight
Trusting content_type directly leads to security vulnerabilities. Attackers can upload executable files disguised as images.
python-magic adds latency per upload (~5ms). For high-throughput endpoints, cache allowed MIME signatures or use a file extension whitelist as a pre-filter.
Rule: Never rely solely on client-provided content type — always verify with magic bytes.
🎯 Key Takeaway
content_type is client-controlled — never trust it for security decisions.
python-magic reads actual file content to determine MIME type.
Always seek(0) after reading magic bytes so you don't lose the file data.
When to Use Magic Byte Validation
If it's a high-security endpoint (profile photos, document uploads)
→ Use python-magic or a manual header check. Verify before any processing.
If it's a low-risk, internal API with trusted clients
→ Rely on content_type plus an extension check — but still log mismatches for audit.
If you need to handle both images and PDFs with different processing
→ Use magic bytes to route to the correct handler (e.g., resize an image vs. extract text from a PDF) — see the sketch below.
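A minimal routing sketch for that last case — resize_image and extract_pdf_text are hypothetical placeholders, and the /process route is illustrative:

import magic
from fastapi import FastAPI, UploadFile, File, HTTPException

app = FastAPI()

async def resize_image(file: UploadFile) -> dict:  # placeholder handler
    return {"action": "resized", "filename": file.filename}

async def extract_pdf_text(file: UploadFile) -> dict:  # placeholder handler
    return {"action": "text-extracted", "filename": file.filename}

@app.post("/process")
async def process(file: UploadFile = File(...)):
    header = await file.read(2048)  # enough bytes for libmagic
    await file.seek(0)
    mime = magic.from_buffer(header, mime=True)
    if mime in ("image/jpeg", "image/png"):
        return await resize_image(file)
    if mime == "application/pdf":
        return await extract_pdf_text(file)
    raise HTTPException(400, detail=f"Unsupported type: {mime}")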

Sanitising Filenames to Prevent Path Traversal

Saving uploaded files with the client-provided filename is dangerous. A malicious user can supply a name like ../../etc/passwd to overwrite system files or create files outside the intended directory. Always sanitize filenames: strip directory separators, use a whitelist of allowed characters, or generate a UUID-based name and store the original in a database.

io/thecodeforge/files/sanitize_filename.py · PYTHON
import re
import uuid
from pathlib import Path

def sanitize_filename(original: str) -> str:
    # Remove directory separators, then replace disallowed characters
    safe = re.sub(r'[\\/]', '', original)
    safe = re.sub(r'[^a-zA-Z0-9._-]', '_', safe)
    # Strip leading dots so the result can't be hidden, '.', or '..'
    safe = safe.lstrip('.')
    # Truncate to avoid filesystem limits (255 chars is a typical max)
    return safe[:255] or "unnamed"

def generate_unique_filename(original: str) -> str:
    ext = Path(original).suffix
    return f"{uuid.uuid4().hex}{ext}"

# Usage:
# sanitize_filename("../../etc/passwd")   -> "etcpasswd"
# generate_unique_filename("report.pdf")  -> "a1b2c3d4....pdf" (hex UUID + extension)
📊 Production Insight
Path traversal is a classic vulnerability class; OWASP consistently flags it in the Top 10 (under Broken Access Control).
Even with sanitization, an attacker could use null bytes or encoding tricks — always take the basename and join against a safe base directory, verifying the result stays inside it (see the sketch after the takeaways below).
Best practice: never use the original filename on disk — store under a UUID and keep the original name in a database or metadata field.
Rule: Always generate a unique name for storage; store the original name separately for display.
🎯 Key Takeaway
Never trust user-supplied filenames — they can contain path traversal sequences.
Use UUID-based filenames for storage, map back to original via database.
If you must keep original name, whitelist characters and strip directory separators.
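A minimal safe-join sketch, assuming Python 3.9+ for Path.is_relative_to — the safe_join name is illustrative:

from pathlib import Path
from fastapi import HTTPException

UPLOAD_DIR = Path("uploads").resolve()

def safe_join(name: str) -> Path:
    # Keep only the final path component (basename-style defence)
    candidate = (UPLOAD_DIR / Path(name).name).resolve()
    # Belt and braces: the resolved path must stay inside the upload dir
    if not candidate.is_relative_to(UPLOAD_DIR):
        raise HTTPException(400, detail="Invalid filename")
    return candidate

# safe_join("../../etc/passwd") -> <cwd>/uploads/passwd (traversal stripped)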
🗂 bytes vs UploadFile: What to Use and When
Choosing the right type for file parameters in FastAPI
Aspect · bytes · UploadFile
Memory usage · Entire file loaded into memory · Streamed; spooled to disk if large
Metadata access · None — filename and content type are not exposed · Immediate: filename, content_type, size
Async support · None — the whole body is read before your handler runs · Fully async: awaitable read/seek/write methods
Best for · Small files (<1MB), simple use cases · All production cases, especially large files
Size validation · len() on the full in-memory payload · Incremental checks while reading with read(chunk_size)

🎯 Key Takeaways

  • UploadFile is superior to raw bytes as it uses a spooling temporary file, preventing RAM overflow for large assets.
  • File metadata like filename and content_type are accessible immediately without reading the file body.
  • Synchronous vs Asynchronous: file.read() is an awaitable method; however, for truly massive files, use a streaming chunk approach to keep the memory footprint constant.
  • The 'No Pydantic' Rule: when using multipart/form-data, you must declare fields individually using Form(); the python-multipart package must be installed for FastAPI to parse the body at all.
  • Always use a library like python-magic or check the file header if you need deep validation of file types beyond the client-provided MIME type.

⚠ Common Mistakes to Avoid

    Calling `await file.read()` without a size argument
    Symptom

    Large files cause OOM and application crash. Memory usage spikes linearly with file size.

    Fix

    Always pass a chunk size: while chunk := await file.read(1024 * 1024):

    Using `File()` for form fields that are not files (e.g., strings)
    Symptom

    FastAPI raises a validation error because it expects a file upload for that field.

    Fix

    Use Form(...) for non-file form fields. Only use File(...) for actual file upload parameters.

    Saving file directly with client-provided filename
    Symptom

    Path traversal vulnerability — files can be written outside the intended directory.

    Fix

    Sanitize filename or generate a UUID-based name. Never join user input directly to a path.

    Assuming `content_type` is accurate
    Symptom

    Malicious files with wrong extension bypass validation, leading to security compromise.

    Fix

    Use python-magic to detect actual MIME type from file bytes.

    Not resetting file stream position after reading
    Symptom

    Subsequent await file.read() or file.file.read() returns empty bytes.

    Fix

    Call await file.seek(0) after any read call that you expect to re-read.

Interview Questions on This Topic

  • Q (Junior): Explain the difference between bytes and UploadFile in a FastAPI endpoint. Which one would you use for a 2GB video upload and why?
    bytes reads the entire file into memory; UploadFile wraps a SpooledTemporaryFile that streams content. For a 2GB video, use UploadFile because it allows chunked reading (e.g., in 1MB blocks), avoiding OOM. Also gives immediate access to filename and content_type without reading the body.
  • Q (Mid-level): How does the python-multipart library interact with FastAPI under the hood? Why is it a required dependency for form handling?
    FastAPI relies on Starlette, which uses python-multipart to parse incoming multipart/form-data requests. It decodes the MIME multipart stream, separating file parts (as UploadFile objects) and form fields (as strings). Without it, FastAPI cannot parse file uploads; FastAPI will raise an import error if it's missing and you declare a File() parameter.
  • Q (Mid-level): Describe the security risks associated with allowing user-defined filenames. How would you sanitize a filename before saving it to a local filesystem?
    Risks include path traversal (../../etc/passwd), null-byte injection, and special characters that break file system operations. Sanitization: strip directory separators, whitelist allowed characters (letters, digits, underscores, hyphens, dots), truncate to 255 characters, and generate a UUID-based name for storage. Always store the original name in a database if needed.
  • Q (Senior): How would you implement a progress bar or status tracker for a large file upload in a purely asynchronous FastAPI environment?
    FastAPI's request handling is not designed for streaming partial progress back to the client during upload (the entire body is read before the handler runs). Workarounds: split the upload into multiple chunks via a custom endpoint (client sends chunks with index), or use WebSocket to report progress while a background task processes the file. Alternatively, use a third-party service like tus.io for resumable uploads with progress.
  • Q (Senior): What is a SpooledTemporaryFile, and how does it prevent the Out-of-Memory (OOM) killer from terminating your API process during large transfers?
    SpooledTemporaryFile (from Python's tempfile module) starts as an io.BytesIO in memory. When the data exceeds a threshold (default 64KB), it is rolled over to a temporary file on disk. This prevents large files from consuming RAM — the bulk of the data goes to disk. FastAPI's UploadFile wraps this, so even if you call await file.read(), the underlying file is already spooled, but it's still loaded into memory at that point. For true OOM prevention, you must stream chunks rather than calling read() completely.

Frequently Asked Questions

How do I save an uploaded file to disk in FastAPI?

While you can use file.read(), the most professional approach for production is to stream the file in chunks to a permanent location. This prevents your server from buffering gigabytes of data. Use shutil.copyfileobj(file.file, destination) for synchronous writes, or an async chunk loop for better concurrency: while chunk := await file.read(size): buffer.write(chunk).
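A minimal sketch of the synchronous shutil pattern mentioned above, assuming an uploads/ directory (the /save route is illustrative):

import shutil
from pathlib import Path
from fastapi import FastAPI, UploadFile, File

app = FastAPI()
Path("uploads").mkdir(exist_ok=True)

@app.post("/save")
async def save(file: UploadFile = File(...)):
    dest = Path("uploads") / Path(file.filename).name  # strip any path components
    with dest.open("wb") as out:
        shutil.copyfileobj(file.file, out)  # streams from the spooled temp file
    return {"saved": str(dest)}

Note that shutil.copyfileobj is blocking; under heavy concurrency it is better to offload the copy to a thread (e.g., starlette.concurrency.run_in_threadpool) or use the async chunk loop shown earlier.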

Why can't I use a Pydantic model for form data?

This is fundamentally due to how browsers and HTTP handle the multipart/form-data encoding. It does not map natively to a single JSON object. While you can hack this by receiving a JSON string as a Form field and parsing it manually inside the endpoint, the standard FastAPI pattern is to define individual Form() parameters.

Can I upload multiple files at once?

Yes. Simply define the type hint as a list: files: list[UploadFile] = File(...). This allows the client to send multiple files under the same key. You can then iterate through the list and process each file individually.

How do I set a maximum file size in FastAPI?

FastAPI doesn't have a built-in limit. You must implement it in your endpoint by reading chunks and enforcing a cumulative size. Additionally, configure your reverse proxy (nginx, AWS ALB) to reject oversized requests early with client_max_body_size.

What is the difference between `File(...)` and `File(None)`?

File(...) makes the file parameter required. File(None) makes it optional — the parameter defaults to None if no file is sent. Use UploadFile | None = File(None) for optional file uploads.
