Docker Containerization Explained (2026 Guide)
- Containers share the host kernel; they are not VMs. Isolation comes from Linux namespaces and cgroups. Containers start in milliseconds and add under 1MB of overhead, versus hundreds of MB for VMs.
- Layer caching is the key to fast builds: copy requirements.txt and install dependencies BEFORE copying application code, so code changes won't invalidate the dependency layer.
- Multi-stage builds separate build-time and runtime dependencies. Go from 1.2GB images to 180MB by not shipping your compiler to production.
In 2013, Solomon Hykes showed a demo at PyCon that changed how software gets deployed. In five minutes, he ran a container on a stage laptop using a tool called Docker, and that same container would run identically on any Linux host in the world. The audience response was immediate: this solved a problem that had plagued developers for decades.
The 'works on my machine' problem is as old as software. Your application runs perfectly in development. You deploy it and it crashes. The Python version is different. A library has a different version. The OS has a different timezone setting. Debug sessions that should take minutes take hours because the environments are subtly different.
Docker solves this by packaging your application alongside everything it needs (runtime, libraries, environment variables, configuration files) into a self-contained unit called a container. The container runs identically everywhere that Docker is installed. This guide explains how containers actually work, how to write production-grade Dockerfiles, and the patterns that distinguish beginner Docker usage from senior engineer usage.
What Containers Actually Are (Not VMs)
The most common misunderstanding about Docker: containers are not virtual machines. A virtual machine emulates hardware; it has its own kernel, its own memory management, its own full OS. Booting a VM takes seconds to minutes and uses hundreds of MB of RAM just for the OS overhead.
Containers share the host kernel. They use two Linux kernel features to create isolated environments without the overhead of a separate OS: namespaces (for isolation: each container sees its own filesystem, network, and processes) and cgroups (for resource limits: CPU and memory caps).
Practical difference: a VM running Ubuntu might use 512MB RAM just for the OS. A Docker container running Ubuntu uses under 1MB of overhead for isolation; the processes inside see an Ubuntu-like environment but share the host's Linux kernel. This is why you can run 50 containers on a machine that could only run 3 VMs.
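You can poke at the namespace half of this machinery directly on any Linux host, with no Docker involved. A quick sketch, assuming a Linux shell (/proc is Linux-specific, which is also why Docker needs a Linux kernel):

```shell
# Every Linux process runs inside a set of namespaces; Docker's
# "isolation" is just launching a process in fresh ones.
ls /proc/self/ns
# Each entry (pid, net, mnt, uts, ipc, user, ...) is one axis of
# isolation; processes that share an entry share that view of the system.
```

The cgroup side lives next door: on a cgroup-v2 host, a container's memory cap ends up written to a `memory.max` file under /sys/fs/cgroup.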
Your First Dockerfile
A Dockerfile is a recipe for building a container image. Instructions like FROM, COPY, and RUN each add a layer; Docker caches layers and only rebuilds what changed. Understanding layer caching is the difference between 30-second builds and 5-minute builds.
# ── Stage 1: Base image ──────────────────────────────────────────
# Always pin exact versions - 'python:latest' will break your
# build when a new Python version releases
FROM python:3.12-slim

# ── Metadata ─────────────────────────────────────────────────────
LABEL maintainer="your@email.com"
LABEL version="1.0.0"

# ── Set working directory ────────────────────────────────────────
WORKDIR /app

# ── Copy requirements FIRST (layer caching trick) ────────────────
# If requirements.txt doesn't change, Docker caches this layer.
# If you copied all files first, every code change would
# invalidate the pip install layer and rebuilds would take forever.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# ── Now copy the rest of your code ───────────────────────────────
COPY . .

# ── Create non-root user (security best practice) ────────────────
# Running as root inside a container is a security risk.
# If the container is compromised, the attacker has root.
RUN useradd --create-home appuser
USER appuser

# ── Tell Docker which port the app uses ──────────────────────────
# EXPOSE is documentation only; it doesn't actually publish the port
EXPOSE 8000

# ── Default command ──────────────────────────────────────────────
CMD ["python", "-m", "uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
# Build it:
docker build -t myapp:1.0.0 .
# Run it:
docker run -p 8000:8000 myapp:1.0.0
# Check it:
curl http://localhost:8000
Multi-Stage Builds: Shrinking Production Images
The single biggest mistake in beginner Dockerfiles: shipping build tools to production. A Python app that compiles some C extensions needs gcc, make, and build headers during the build, but not at runtime. A Go app needs the entire Go toolchain to compile, but the final binary needs nothing.
Multi-stage builds solve this: use a heavy 'builder' image with all your build tools, then copy only the finished artifact into a minimal 'runtime' image. Production images shrink from gigabytes to tens of megabytes.
# ── Stage 1: Builder ─────────────────────────────────────────────
FROM python:3.12 AS builder
WORKDIR /build

# Install build dependencies (won't be in final image)
RUN apt-get update && apt-get install -y gcc libpq-dev
COPY requirements.txt .
RUN pip install --user --no-cache-dir -r requirements.txt

# ── Stage 2: Runtime (minimal) ───────────────────────────────────
FROM python:3.12-slim AS runtime
WORKDIR /app

# Runtime still needs the libpq shared library (but not the -dev
# headers or gcc that the builder stage used)
RUN apt-get update && apt-get install -y --no-install-recommends libpq5 \
    && rm -rf /var/lib/apt/lists/*

RUN useradd --create-home appuser

# Copy only the installed packages from the builder, into the
# non-root user's home - copying them to /root would leave them
# unreadable once we drop privileges below
COPY --from=builder /root/.local /home/appuser/.local
COPY . .
RUN chown -R appuser:appuser /app /home/appuser/.local

USER appuser
ENV PATH=/home/appuser/.local/bin:$PATH
EXPOSE 8000
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]

# Result: builder image ~1.2GB vs runtime image ~180MB
# Only the runtime image is pushed to the registry and deployed
docker build -f Dockerfile.multistage -t myapp:slim .
docker images myapp
# REPOSITORY   TAG    SIZE
# myapp        slim   178MB   <- runtime only
# vs single-stage: 1.24GB
Volumes: Persisting Data Beyond Container Lifetime
Containers are ephemeral by design: when a container stops, everything written to its filesystem is lost. For stateful applications (databases, file uploads, logs), you need volumes.
Named volumes: Docker manages the storage location. Survives container restarts and removals. Best for databases.
Bind mounts: Mount a host directory into the container. Great for development (code changes reflect immediately without rebuilding). Not recommended for production, since it ties the container to host filesystem paths.
tmpfs mounts: Stored in host memory only. Useful for sensitive temporary data that must not persist to disk.
# ── Named volume (production databases) ──────────────────────────
docker volume create postgres_data

docker run -d \
  --name postgres \
  -e POSTGRES_PASSWORD=secret \
  -v postgres_data:/var/lib/postgresql/data \
  -p 5432:5432 \
  postgres:16

# Data persists even if the container is removed
# (-f stops and removes the running container in one step):
docker rm -f postgres
docker run -d \
  --name postgres_new \
  -e POSTGRES_PASSWORD=secret \
  -v postgres_data:/var/lib/postgresql/data \
  postgres:16
# Same data is there

# ── Bind mount (development) ─────────────────────────────────────
docker run -d \
  --name dev_app \
  -v "$(pwd)":/app \
  -p 8000:8000 \
  myapp:latest
# Edit code on host; changes are reflected in the container immediately
# No rebuild required
docker volume ls
# DRIVER VOLUME NAME
# local postgres_data
# Inspect:
docker volume inspect postgres_data
# Mountpoint: /var/lib/docker/volumes/postgres_data/_data
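The third mount type, tmpfs, was described above but not demonstrated. A minimal sketch: the flags are real Docker options, while the container name, image, and path are placeholders.

```shell
# tmpfs: backed by host RAM only - nothing is written to disk,
# and the data vanishes when the container stops
docker run -d \
  --name scratch_app \
  --tmpfs /app/tmp:rw,size=64m \
  myapp:latest
```

Because tmpfs mounts never persist, they suit session tokens, decrypted temp files, and other data that must not survive the container.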
Docker Compose: Multi-Container Applications
Real applications are never one container. You have a web app, a database, a cache, maybe a background worker. Docker Compose defines and runs multi-container applications with a single YAML file and a single command.
# The top-level 'version' key is obsolete in Compose v2 and omitted here
services:
  # ── Web application ────────────────────────────────────────────
  web:
    build: .
    ports:
      - "8000:8000"
    environment:
      - DATABASE_URL=postgresql://user:password@db:5432/myapp
      - REDIS_URL=redis://redis:6379
    depends_on:
      db:
        condition: service_healthy
      redis:
        condition: service_started
    restart: unless-stopped

  # ── PostgreSQL database ────────────────────────────────────────
  db:
    image: postgres:16-alpine
    environment:
      POSTGRES_USER: user
      POSTGRES_PASSWORD: password
      POSTGRES_DB: myapp
    volumes:
      - postgres_data:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U user -d myapp"]
      interval: 10s
      timeout: 5s
      retries: 5

  # ── Redis cache ────────────────────────────────────────────────
  redis:
    image: redis:7-alpine
    command: redis-server --maxmemory 256mb --maxmemory-policy allkeys-lru

volumes:
  postgres_data:
docker compose up -d
# Check status:
docker compose ps
# View logs:
docker compose logs -f web
# Stop everything:
docker compose down
# Stop and delete volumes (wipes database):
docker compose down -v
Production Best Practices: What Separates Senior from Junior Docker Usage
1. Never use :latest in production. FROM python:latest or image: postgres:latest will silently upgrade on your next deploy and potentially break your application. Always pin exact versions: python:3.12.3-slim, postgres:16.2-alpine.
2. Scan images for vulnerabilities. docker scout quickview myimage or integrate Trivy into your CI pipeline. Container images accumulate CVEs as base OS packages age.
3. Use .dockerignore. Excluding node_modules, .git, __pycache__, .env from build context prevents accidentally shipping secrets and dramatically speeds up builds.
4. Set resource limits. A container with no limits can consume all host resources and crash other services. Always set --memory and --cpus in production.
5. Implement health checks. Docker and Kubernetes use health checks to know when a container is ready to receive traffic and when it needs to be restarted.
6. Store secrets in secrets managers, not env vars or images. ENV SECRET_KEY=abc123 in a Dockerfile bakes the secret into every layer of the image; it appears in docker history. Use Docker secrets, AWS Secrets Manager, or Vault.
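Items 4 and 5 in concrete form. This is a sketch: HEALTHCHECK and the resource flags are real Docker features, but the /health endpoint is a placeholder your app would have to expose, and curl must exist in the image (wget or a tiny healthcheck script work too).

```dockerfile
# In the Dockerfile: let Docker probe the app itself.
# start-period gives the app time to boot before failures count.
HEALTHCHECK --interval=30s --timeout=3s --start-period=10s --retries=3 \
    CMD curl -f http://localhost:8000/health || exit 1
```

At run time, cap resources so one container can't starve the host: `docker run -d --memory=512m --cpus=1.5 myapp:1.0.0`. With the probe in place, docker ps reports the container as healthy or unhealthy, and orchestrators use that status to route traffic and restart failed containers.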
# .dockerignore - what NOT to send to the Docker daemon during build
.git
.gitignore
__pycache__
*.pyc
*.pyo
.pytest_cache
.coverage
htmlcov/
.env
.env.*
*.env
node_modules/
npm-debug.log
.DS_Store
docker-compose*.yml
Dockerfile*
README.md
docs/
tests/
*.test.py
Coverage/
# Faster builds, no accidental secret leaks, smaller attack surface
Key Takeaways
- Containers share the host kernel; they are not VMs. Isolation comes from Linux namespaces and cgroups. Containers start in milliseconds and add under 1MB of overhead, versus hundreds of MB for VMs.
- Layer caching is the key to fast builds: copy requirements.txt and install dependencies BEFORE copying application code, so code changes won't invalidate the dependency layer.
- Multi-stage builds separate build-time and runtime dependencies. Go from 1.2GB images to 180MB by not shipping your compiler to production.
- Never use :latest in production; pin exact versions. Never put secrets in Dockerfiles or environment variables; use a secrets manager.
- Docker Compose orchestrates multi-container development environments. Kubernetes extends this to production at scale.
Interview Questions on This Topic
- What is the difference between a Docker container and a virtual machine? How does container isolation actually work at the kernel level?
- Why should you copy requirements.txt and run pip install before copying your application code?
- What is a multi-stage build and when would you use it?
- How do Docker volumes differ from bind mounts? When would you use each?
- What are three security best practices for production Docker images?
Frequently Asked Questions
Does Docker work on Windows and macOS?
Docker containers require Linux (they share a Linux kernel). On macOS and Windows, Docker Desktop transparently runs a lightweight Linux VM, and your containers run inside that VM. This is why container startup time on macOS/Windows is slightly slower than native Linux, and why some kernel-specific features (like certain network modes) behave differently in development vs Linux production.
What is the difference between Docker and Kubernetes?
Docker packages and runs containers on a single machine. Kubernetes orchestrates containers across a cluster of machines, handling scheduling, scaling, rolling deployments, service discovery, and self-healing. A common pattern: develop and test locally with Docker Compose, deploy to production on Kubernetes (using the same container images). Docker is the container runtime, Kubernetes is the orchestrator.
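To make the distinction concrete, here is a minimal Kubernetes Deployment for the same image. A sketch only; the name, labels, and replica count are illustrative:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
spec:
  replicas: 3                  # Kubernetes keeps 3 copies running, cluster-wide
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
        - name: myapp
          image: myapp:1.0.0   # the same image you built with docker build
          ports:
            - containerPort: 8000
```

The replicas line is the whole point: Compose has no notion of spreading copies across machines, and that scheduling is exactly what Kubernetes adds.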
How do I reduce my Docker image size?
Four main techniques: (1) Use slim or alpine base images (python:3.12-slim vs python:3.12 saves ~600MB). (2) Multi-stage builds, so you don't ship build tools to production. (3) Combine RUN commands with && to reduce layers. (4) Use .dockerignore to exclude large directories. A well-optimised Python image is typically 100-200MB vs 1GB+ naively.
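Technique (3) matters because a layer, once written, is never shrunk by later layers; deleting files in a separate RUN hides them but still ships them. A sketch (libpq5 is just an example package):

```dockerfile
# BAD: three layers - the apt cache from the first two is baked
# into the image forever, even though the third "deletes" it
# RUN apt-get update
# RUN apt-get install -y libpq5
# RUN rm -rf /var/lib/apt/lists/*

# GOOD: one layer, cleaned up before the layer is committed
RUN apt-get update \
    && apt-get install -y --no-install-recommends libpq5 \
    && rm -rf /var/lib/apt/lists/*
```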
Should I run multiple processes in one container?
Generally no: the container philosophy is one process per container. Multiple processes require a process supervisor (supervisord), make the container harder to monitor and scale, and blur the boundaries between services. Exceptions: sidecar patterns in Kubernetes, or tightly coupled processes like nginx + PHP-FPM where they are functionally one unit.
Developer and founder of TheCodeForge. I built this site because I was tired of tutorials that explain what to type without explaining why it works. Every article here is written to make concepts actually click.