FastAPI File Uploads and Form Data
- Use UploadFile over bytes for streaming, memory efficiency, and metadata access
- FastAPI parses multipart data into individual Form() parameters — no single Pydantic model for the whole body
- UploadFile wraps a SpooledTemporaryFile: small files stay in memory, large files hit disk automatically
- Always validate content_type and enforce file size before writing to disk
- The biggest production trap: trusting client-provided MIME type — verify with python-magic or inspect file headers
Production Debug Guide
Common symptoms and immediate actions when file uploads fail in production.

| Symptom | Immediate action |
|---|---|
| Upload returns 400 Bad Request | Test with `curl -v -F 'file=@test.pdf' http://localhost:8000/upload`; check server logs with uvicorn log level DEBUG |
| File saved but corrupted | Add debugging: `print(f'File size after read: {len(content)}')`; use `await file.seek(0)` before saving |
| High memory usage | Check `kubectl top pods` (or `docker stats`); add a size limit check: `if file.size > MAX_SIZE: raise HTTPException(413)` |
| Request rejected before it reaches FastAPI | Check your reverse proxy's `client_max_body_size` or similar setting. FastAPI itself doesn't impose a limit unless you code it, but proxies do |
| Saved file is empty after a validation read | You consumed the stream twice (`read()` for validation and then again for saving). Use `file.file.seek(0)` after the first read |
| File parameter is always None even though file was sent | Confirm the client is sending the field as multipart/form-data and the field name matches the endpoint parameter. Use `curl -F 'file=@/path'` to test |
| 422 errors when mixing JSON and files | Declare extra fields with `Form(...)` (not `Body()`). Mixing JSON and multipart is not allowed; use separate `Form()` parameters |
| Memory climbs under concurrent uploads | Enforce `FILE_MAX_SIZE` and reject early. Use `asyncio.Semaphore` to limit concurrent uploads |

Production Incident
`kubectl top pods` showed memory usage grew linearly with file size. Eventually the pod was OOMKilled and restarted. The assumption was that UploadFile would handle large files by streaming them to disk automatically, so no explicit chunked reading was implemented in the endpoint. The actual cause: the endpoint called `await file.read()` without a size limit, reading the entire file into memory before processing. UploadFile's spooling only kicks in if you write to disk via `file.file`; `read()` returns all bytes at once. The fix was to use `file.file` for streaming writes with `shutil.copyfileobj()`.

Lessons learned:
- Never call `await file.read()` without a size argument in production: you're asking for an OOM.
- Always enforce a maximum file size upfront, before processing any content.
- Use `file.file` (the SpooledTemporaryFile) for streaming writes, not `read()`, for large files.

Uploading a Single File with Validation
The UploadFile class is a wrapper around a Python SpooledTemporaryFile. This is critical for performance: it keeps small files in RAM for speed but offloads larger files to the temporary directory of your OS to prevent memory exhaustion. In production, always validate the content_type and enforce a maximum file size.
```python
from fastapi import FastAPI, UploadFile, File, HTTPException, status
from pathlib import Path

app = FastAPI()
UPLOAD_DIR = Path("uploads")
UPLOAD_DIR.mkdir(exist_ok=True)

@app.post('/upload')
async def upload_document(file: UploadFile = File(...)):
    # 1. MIME Type Validation (Don't trust the extension!)
    if file.content_type not in ['application/pdf', 'image/jpeg']:
        raise HTTPException(
            status_code=status.HTTP_400_BAD_REQUEST,
            detail="Invalid file type. Only PDF and JPEG are allowed."
        )

    # 2. Size Validation via chunked streaming
    # Note: .size is available in newer FastAPI/Starlette versions
    MAX_SIZE = 10 * 1024 * 1024  # 10MB
    real_file_size = 0

    # Stream content to disk to avoid memory spikes
    save_path = UPLOAD_DIR / file.filename
    with open(save_path, "wb") as buffer:
        while chunk := await file.read(1024 * 1024):  # Read in 1MB chunks
            real_file_size += len(chunk)
            if real_file_size > MAX_SIZE:
                raise HTTPException(status_code=413, detail="File too large")
            buffer.write(chunk)

    return {
        'filename': file.filename,
        'saved_at': str(save_path),
        'final_size': real_file_size
    }
```
- The `file.size` attribute is only available in Starlette 0.20+ — always implement a manual size check as a fallback.
- Never call `await file.read()` without arguments on large uploads.
- Use `UploadFile` for streaming, but never trust `content_type` alone.

Mixing Form Fields with File Uploads
A frequent pain point for developers is trying to send a Pydantic JSON body alongside a file. Due to the way HTTP works, you cannot easily mix application/json and multipart/form-data. Instead, you must declare each form field using Form(). FastAPI will then parse the multipart body and map the keys to your arguments.
```python
from fastapi import FastAPI, UploadFile, File, Form, status
from typing import Annotated

app = FastAPI()

@app.post('/forge/profile-update', status_code=status.HTTP_201_CREATED)
async def update_profile(
    username: Annotated[str, Form(...)],
    bio: Annotated[str, Form(min_length=10)],
    # Optional file: defaults to None if not in the request
    avatar: UploadFile | None = File(None)
):
    response = {
        "user_id": 1024,
        "username": username,
        "bio": bio,
        "avatar_received": False
    }
    if avatar:
        # Process avatar (e.g., upload to S3 or resize)
        response["avatar_received"] = True
        response["avatar_name"] = avatar.filename
    return response
```
- Don't smuggle a JSON payload as a string inside a single form field (e.g., a `metadata` field) unless you must — this breaks type safety.
- Declare each form field individually with `Form()`.
- For optional files, default to `None` and use `UploadFile | None = File(None)`.
- Declare text fields as `Form(...)` — FastAPI will parse them from the multipart body.
- Declare files as `UploadFile = File(...)` — this gives streaming, metadata, and efficient handling.
- If you must accept structured data, receive it as a JSON string via `Form()`, then deserialize in the endpoint.

Handling Multiple File Uploads
FastAPI natively supports multiple files under the same field name by declaring the parameter as a list of UploadFile. The client sends each file with the same key, and FastAPI collects them into a Python list. This is common for batch uploads, image galleries, or attachment-heavy workflows.
```python
from fastapi import FastAPI, UploadFile, File, HTTPException

app = FastAPI()
MAX_FILES = 10
MAX_TOTAL_SIZE = 100 * 1024 * 1024  # 100MB

@app.post('/upload-multiple')
async def upload_multiple(files: list[UploadFile] = File(...)):
    if len(files) > MAX_FILES:
        raise HTTPException(400, detail=f"Maximum {MAX_FILES} files allowed")

    total_size = 0
    saved_files = []
    for file in files:
        # Accumulate file contents in memory (see the warning below)
        content = b''
        while chunk := await file.read(1024 * 1024):
            content += chunk
            total_size += len(chunk)
            if total_size > MAX_TOTAL_SIZE:
                raise HTTPException(413, detail="Total upload size exceeds limit")
        with open(f"uploads/{file.filename}", "wb") as f:
            f.write(content)
        saved_files.append(file.filename)

    return {"saved_files": saved_files, "total_size": total_size}
```
- Warning: accumulating chunks in memory (`content += chunk`) before writing defeats the purpose of streaming for large files. Write each chunk directly to a file descriptor instead of building up a bytes object.
- You can use `asyncio.gather` for concurrent file processing, but watch for I/O contention on disk.
- Use `list[UploadFile]` to accept multiple files under a single field name.
- Enforce `MAX_FILES` and `MAX_TOTAL_SIZE` before processing.

Validating File Type Beyond Content-Type
The content_type attribute from the client is trivial to spoof. A malicious client can upload a .exe with Content-Type: image/jpeg. To validate file contents, you need to inspect file signatures (magic bytes). Use python-magic which wraps libmagic, or manually check the first few bytes. This is critical for security-sensitive uploads like profile pictures or document scans.
```python
import magic
from fastapi import UploadFile, HTTPException

ALLOWED_MIME_TYPES = {'image/jpeg', 'image/png', 'application/pdf'}

async def validate_file_type(file: UploadFile):
    # Read first 2048 bytes for magic detection
    chunk = await file.read(2048)
    await file.seek(0)  # Reset stream position
    mime = magic.from_buffer(chunk, mime=True)
    if mime not in ALLOWED_MIME_TYPES:
        raise HTTPException(
            400,
            detail=f"File type {mime} is not allowed. Allowed: {ALLOWED_MIME_TYPES}"
        )
    return True
```
- Magic bytes are the first n bytes of a file — they identify the format regardless of extension or MIME type.
- libmagic (via python-magic) reads these bytes and returns the actual MIME type.
- Always seek back to 0 after reading the magic chunk — you consumed part of the stream.
- For images, you can also use PIL (Pillow) to verify the image can be decoded — this catches truncated or corrupted files.
- Trusting `content_type` directly leads to security vulnerabilities: attackers can upload executable files disguised as images.
- `python-magic` adds latency per upload (~5ms). For high-throughput endpoints, cache allowed MIME signatures or use a file extension whitelist as a pre-filter.
- `content_type` is client-controlled — never trust it for security decisions.
- `python-magic` reads actual file content to determine the MIME type.
- Call `seek(0)` after reading the magic bytes so you don't lose the file data.
- For security-sensitive uploads, verify with `python-magic` or a manual header check before any processing.
- For low-risk uploads, `content_type` plus an extension check can be acceptable — but still log mismatches for audit.

Sanitising Filenames to Prevent Path Traversal
Saving uploaded files with the client-provided filename is dangerous. A malicious user can supply a name like ../../etc/passwd to overwrite system files or create files outside the intended directory. Always sanitize filenames: strip directory separators, use a whitelist of allowed characters, or generate a UUID-based name and store the original in a database.
```python
import re
import uuid
from pathlib import Path

def sanitize_filename(original: str) -> str:
    # Remove directory separators, then replace unsafe characters
    safe = re.sub(r'[\\/]', '', original)
    safe = re.sub(r'[^a-zA-Z0-9._-]', '_', safe)
    # Truncate to avoid filesystem limits (255 chars typical)
    return safe[:255]

def generate_unique_filename(original: str) -> str:
    ext = Path(original).suffix
    return f"{uuid.uuid4().hex}{ext}"

# Usage:
# sanitize_filename("../../etc/passwd")   -> "....etcpasswd"
# generate_unique_filename("report.pdf")  -> "<32 hex chars>.pdf"
```
Alternatively, strip directory components with `os.path.basename` and join the result with a safe base directory.

bytes vs UploadFile
| Aspect | bytes | UploadFile |
|---|---|---|
| Memory usage | Entire file loaded into memory | Streamed; spooled to disk if large |
| Metadata access | None — need to parse headers separately | Immediate: filename, content_type, size |
| Async support | N/A — the whole body is read before your endpoint runs | Fully async; awaitable read/chunk methods |
| Best for | Small files (<1MB), simple use cases | All production cases, especially large files |
| Chunked reading | Not possible — the body arrives as one object | Built-in via read(chunk_size) |
🎯 Key Takeaways
- UploadFile is superior to raw bytes as it uses a spooling temporary file, preventing RAM overflow for large assets.
- File metadata like filename and content_type are accessible immediately without reading the file body.
- Synchronous vs Asynchronous: `file.read()` is an awaitable method; however, for truly massive files, use a streaming chunk approach to keep the memory footprint constant.
- The 'No Pydantic' Rule: When using multipart/form-data, you must declare fields individually using `Form()`; FastAPI relies on the `python-multipart` library to parse the body.
- Always use a library like `python-magic` or check the file header if you need deep validation of file types beyond the client-provided MIME type.
Interview Questions on This Topic
- Q (Junior): Explain the difference between `bytes` and `UploadFile` in a FastAPI endpoint. Which one would you use for a 2GB video upload and why?
- Q (Mid-level): How does the `python-multipart` library interact with FastAPI under the hood? Why is it a required dependency for form handling?
- Q (Mid-level): Describe the security risks associated with allowing user-defined filenames. How would you sanitize a filename before saving it to a local filesystem?
- Q (Senior): How would you implement a progress bar or status tracker for a large file upload in a purely asynchronous FastAPI environment?
- Q (Senior): What is a 'SpooledTemporaryFile', and how does it prevent the 'Out of Memory' (OOM) killer from terminating your API process during large transfers?
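The spooling behaviour behind that last question can be observed with the standard library alone. A sketch (note that `_rolled` is a private CPython attribute, used here only for demonstration):

```python
from tempfile import SpooledTemporaryFile

# Keep up to 1 KiB in memory; roll over to a real temp file beyond that
with SpooledTemporaryFile(max_size=1024) as f:
    f.write(b"a" * 100)
    print(f._rolled)   # False: still an in-memory buffer
    f.write(b"b" * 2000)
    print(f._rolled)   # True: exceeded max_size, now backed by disk
```

Because the rollover happens transparently on write, an endpoint never holds more than `max_size` bytes of a large upload in RAM on the parsing side, which is exactly what keeps the OOM killer away.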
Frequently Asked Questions
How do I save an uploaded file to disk in FastAPI?
While you can use `file.read()`, the most professional approach for production is to stream the file in chunks to a permanent location. This prevents your server from buffering gigabytes of data. Use `shutil.copyfileobj(file.file, destination)` for synchronous writes, or an async chunk loop for better concurrency: `while chunk := await file.read(size): buffer.write(chunk)`.
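A minimal sketch of the synchronous approach (the helper name is mine; `upload.file` is the underlying spooled file object that UploadFile exposes):

```python
import shutil

def save_upload_sync(upload, dest_path: str) -> None:
    # Rewind in case the stream was already read (e.g., for validation),
    # then copy the spooled temp file to its final location in chunks
    upload.file.seek(0)
    with open(dest_path, "wb") as dest:
        shutil.copyfileobj(upload.file, dest)
```

Run it in a thread pool (`fastapi.concurrency.run_in_threadpool`) if you call it from an async endpoint, so the blocking disk I/O doesn't stall the event loop.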
Why can't I use a Pydantic model for form data?
This is fundamentally due to how browsers and HTTP handle the multipart/form-data encoding. It does not map natively to a single JSON object. While you can hack this by receiving a JSON string as a Form field and parsing it manually inside the endpoint, the standard FastAPI pattern is to define individual Form() parameters.
Can I upload multiple files at once?
Yes. Simply define the type hint as a list: files: list[UploadFile] = File(...). This allows the client to send multiple files under the same key. You can then iterate through the list and process each file individually.
How do I set a maximum file size in FastAPI?
FastAPI doesn't have a built-in limit. You must implement it in your endpoint by reading chunks and enforcing a cumulative size. Additionally, configure your reverse proxy (nginx, AWS ALB) to reject oversized requests early with client_max_body_size.
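The cumulative check can be expressed as a small framework-agnostic helper (a sketch; the names are mine, and in a FastAPI endpoint you would raise `HTTPException(413)` instead of `ValueError`):

```python
def copy_with_limit(src, dest, max_size: int, chunk_size: int = 1024 * 1024) -> int:
    """Copy src to dest in chunks, aborting once max_size bytes is exceeded."""
    total = 0
    while chunk := src.read(chunk_size):
        total += len(chunk)
        if total > max_size:
            raise ValueError("upload exceeds size limit")
        dest.write(chunk)
    return total
```

The key property: the check runs per chunk, so an oversized upload is rejected after at most `max_size + chunk_size` bytes rather than after buffering the whole body.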
What is the difference between `File(...)` and `File(None)`?
File(...) makes the file parameter required. File(None) makes it optional — the parameter defaults to None if no file is sent. Use UploadFile | None = File(None) for optional file uploads.
Developer and founder of TheCodeForge. I built this site because I was tired of tutorials that explain what to type without explaining why it works. Every article here is written to make concepts actually click.