
Introduction to Docker: Containers, Images and Real-World Usage Explained

📍 Part of: Docker → Topic 1 of 18
Docker explained for intermediate developers — understand containers vs VMs, write real Dockerfiles, and avoid the mistakes that break production deployments.
⚙️ Intermediate — basic DevOps knowledge assumed
In this tutorial, you'll learn
  • Containers share the host OS kernel — they're not mini VMs. This is why they start in milliseconds and use megabytes of memory, making them economically practical for microservices at scale.
  • Docker image layers are cached from top to bottom. Copy dependency manifests and run installs BEFORE copying source code, or every git commit will trigger a full package reinstall.
  • Multi-stage builds are not optional in production — they separate build-time tooling from the runtime image, cutting image sizes by 50-70% and removing attack surface from your deployed artifact.
Quick Answer
  • Container: a running instance of an image — isolated filesystem, network, and process tree sharing the host kernel
  • Image: a read-only blueprint built from layers — each Dockerfile instruction creates one cached layer
  • Dockerfile: the build script that defines the image — instructions are executed top-to-bottom
  • Volume: persistent storage that survives container deletion — named volumes for production, bind mounts for development
  • Containers share the host OS kernel (no guest OS overhead)
  • VMs run a full guest OS per instance (stronger isolation, much heavier)
  • Layer caching: changing one layer invalidates all layers after it — order from least-to-most frequently changing
  • Multi-stage builds: use heavy toolchains during compilation, ship only the output to production
🚨 START HERE
Docker Container Triage Cheat Sheet
First-response commands when containers crash, builds are slow, or services cannot communicate.
🟡 Container exits immediately on startup.
Immediate Action: Check container logs and exit code.
Commands
docker compose logs <service> --tail 50
docker inspect <container> --format '{{.State.ExitCode}} {{.State.Error}}'
Fix Now: Exit code 1 = app error (check logs). Exit code 137 = OOM (increase memory limit). Exit code 143 = SIGTERM (check healthcheck or depends_on).
🟡 Container A cannot reach Container B by hostname.
Immediate Action: Verify network connectivity and DNS resolution.
Commands
docker network inspect <network> --format '{{range .Containers}}{{.Name}} {{end}}'
docker exec <container-a> nslookup <container-b-service-name>
Fix Now: If container-b is missing from the network, add it to the same network in docker-compose.yml. If using the default bridge, create a user-defined network.
🟠 Docker build is slow — every rebuild takes minutes.
Immediate Action: Check which layers are being rebuilt vs cached.
Commands
docker build --progress=plain -t test . 2>&1 | grep -E 'CACHED|RUN|COPY'
docker history <image> --format '{{.Size}} {{.CreatedBy}}' | sort -hr
Fix Now: If RUN npm install re-runs on every change, move COPY package.json above COPY . . and keep dependency installation in a separate layer from source-code copying.
🟡 Port already allocated — container cannot start.
Immediate Action: Find what is using the port.
Commands
docker ps --format '{{.Names}} {{.Ports}}' | grep <port>
ss -tlnp | grep <port>
Fix Now: Stop the conflicting container or change the host-side port in docker-compose.yml (e.g., 3001:3000 instead of 3000:3000).
🟡 Container data lost after docker compose down.
Immediate Action: Check if volumes were deleted or never created.
Commands
docker volume ls | grep <project>
docker compose config | grep -A2 volumes
Fix Now: If docker compose down -v was run, the data is gone (check backups). If no volumes are defined in docker-compose.yml, add named volumes for stateful services.
🟡 Image is unexpectedly large (>500MB).
Immediate Action: Inspect layer sizes and check for .dockerignore.
Commands
docker history <image> --format '{{.Size}} {{.CreatedBy}}' | sort -hr
cat .dockerignore 2>/dev/null || echo 'NO .dockerignore FILE'
Fix Now: Create .dockerignore. Use multi-stage builds. Chain cleanup in the same RUN: RUN apt-get update && apt-get install -y curl && rm -rf /var/lib/apt/lists/*.
Production Incident: Production Deployment Crashes on Boot — API Connects to Database Before Postgres Is Ready
A team migrated from a VM-based deployment to Docker Compose. The API container started before Postgres finished initialization, crashed with 'connection refused', and entered a restart loop that delayed the deployment by 15 minutes.
Symptom: After running docker compose up -d, the API container exited immediately with code 1. docker compose logs api showed: 'Error: connect ECONNREFUSED 172.18.0.2:5432'. The container restarted automatically (restart: unless-stopped) but crashed again with the same error. After 4-5 restarts over 60 seconds, Postgres finished initializing, and the API finally started successfully on the 6th attempt.
Assumptions: The team first suspected a network configuration issue — maybe the containers were on different networks. They checked docker network inspect and confirmed both containers were on the same network. They then suspected a DNS resolution issue — they exec'd into the API container and ran nslookup postgres, which resolved correctly. Finally they suspected the database was misconfigured — they exec'd into the Postgres container and ran psql, which connected successfully.
Root cause: The docker-compose.yml had depends_on: [postgres] without a condition. depends_on without condition: service_healthy only waits for the container to START (docker start returns), not for the process inside to be READY. Postgres takes 5-15 seconds to initialize after the container starts — creating the default database, running init scripts, and opening the port. The API container started immediately after the Postgres container started, before Postgres was accepting connections. The API crashed, Docker restarted it, and this repeated until Postgres was finally ready.
Fix:
1. Added a healthcheck to the Postgres service using pg_isready.
2. Changed depends_on to use condition: service_healthy so the API waits until Postgres passes its health check.
3. Added a start_period: 30s to the healthcheck to prevent false failures during Postgres initialization.
4. Added a retry loop in the API's startup code as a defense-in-depth measure (connect with exponential backoff for 30 seconds before giving up).
5. Documented the pattern in the team's Docker Compose style guide.
Key Lesson
  • depends_on without condition: service_healthy only waits for container start — not process readiness. Always use service_healthy for databases.
  • Postgres, MySQL, Redis, and any service with initialization time needs a healthcheck. Without it, dependent services will crash on boot.
  • A restart loop (container crashing and restarting repeatedly) is a symptom of a race condition, not a network issue. Check startup ordering first.
  • Defense-in-depth: add a connection retry loop in your application code as a second layer of protection, even with healthchecks in place.
  • The start_period flag on healthcheck prevents false failures during slow startup. Set it to the expected maximum initialization time.
Production Debug Guide
From startup crashes to slow builds — systematic debugging paths.
Container exits immediately with code 1 or 137 on startup.
Check logs: docker compose logs <service> or docker logs <container>. Exit code 1 is an application error — check the stack trace. Exit code 137 is OOM-killed — check memory limits with docker stats. Exit code 143 is SIGTERM — check if the container is being stopped by another process or healthcheck.

Container cannot connect to another container — 'ECONNREFUSED' or 'could not translate host name'.
Verify both containers are on the same network: docker network inspect <network>. Check whether the target container is running: docker compose ps. Check whether the target service is healthy (if it has a healthcheck). Verify DNS resolution: docker exec <container> nslookup <service-name>.

Docker build is slow — every rebuild takes 3-5 minutes.
Check layer ordering. Run docker history <image> to see which layers were rebuilt. If the dependency-install layer rebuilds on every code change, the Dockerfile copies source code before dependency manifests. Fix: copy package.json/requirements.txt before COPY . . and run the install in a separate layer.

Container data disappears after restart.
Check if a volume is mounted: docker inspect <container> --format '{{.Mounts}}'. If no volume is mounted, data lives in the container's writable layer and is lost on docker rm. Fix: add a named volume to docker-compose.yml and restart.

docker compose up fails with 'port is already allocated'.
Check what is using the port: docker ps --format '{{.Ports}}' or ss -tlnp | grep <port>. Either stop the conflicting container or change the host-side port mapping in docker-compose.yml.

Image is unexpectedly large — 1GB+ for a simple application.
Run docker history --no-trunc <image> to see layer sizes. Check whether .dockerignore exists — without it, COPY . . includes node_modules and .git. Check whether multi-stage builds are used — the final image may contain build tools. Check whether the package-manager cache is cleaned in the same RUN layer.

Environment drift is the root cause of most 'works on my machine' failures. A different Node version, a missing library, an environment variable pointing nowhere — these are not skill problems, they are infrastructure problems. Docker eliminates this class of issue by packaging the entire runtime environment into a portable, immutable container.

Containers are not VMs. They share the host OS kernel and use Linux namespaces and cgroups for isolation. This means containers start in milliseconds and use megabytes of memory, making microservices architectures economically viable. On the same machine that runs three VMs, you can run thirty containers.

Common misconceptions: containers are not inherently insecure (misconfiguration is the problem, not the technology), data inside containers is not persistent by default (you need volumes), and Docker Compose is not just for development (it works in production for single-host deployments).

Containers vs Virtual Machines: Why Docker Is a Fundamentally Different Idea

Most people learn Docker by running commands without understanding the architectural shift underneath. That's fine for getting started, but it bites you the moment something breaks.

A virtual machine (VM) runs a full guest operating system — its own kernel, drivers, system processes — on top of a hypervisor. Your app sits at the top of this tower. Booting a VM can take minutes. It consumes gigabytes of RAM even before your app starts. Scaling ten microservices with VMs means ten full operating systems running simultaneously.

Docker containers take a different path. They share the host machine's kernel directly. Each container gets its own isolated view of the filesystem (via union file system layers), its own network namespace, and its own process tree — but there's no duplicated OS. A container starts in milliseconds. It uses megabytes of overhead instead of gigabytes.

The practical implication: on the same machine where you could run three VMs, you can run thirty containers. That's not a minor efficiency gain — it's the reason microservices architectures became economically viable. When AWS charges you per second of compute, that difference compounds fast.

Containers are not inherently less secure than VMs — they're just differently isolated. A misconfigured container is dangerous, just as a misconfigured VM is. The security story depends on your configuration, not the technology itself.

Kernel sharing trade-off: Because containers share the host kernel, a kernel vulnerability (like CVE-2022-0185 or Dirty Pipe) affects all containers on that host. VMs have a separate kernel per instance, so a kernel vulnerability in one VM does not affect others. For high-security multi-tenant environments (running untrusted code), VMs provide stronger isolation. For single-tenant application workloads, container isolation is sufficient.

io/thecodeforge/container_vs_vm_demo.sh · BASH
# Compare startup time and resource footprint — run these and watch the difference

# Pull a minimal Linux image (only ~5MB compressed)
docker pull alpine:3.19

# Start a container, run a command, and exit — time the whole thing
time docker run --rm alpine:3.19 echo "Container is alive"
# --rm tells Docker to delete the container after it exits (no cleanup needed)
# alpine:3.19 is the image — think of it as the blueprint
# 'echo ...' is the command to run inside the container

# Now check how much memory the container used at peak
# Run it in the background with resource stats
docker run -d --name resource-demo alpine:3.19 sleep 30
# -d runs in detached (background) mode
# --name gives it a human-readable name instead of a random hash

docker stats resource-demo --no-stream
# --no-stream prints one snapshot instead of a live feed
# Look at the MEM USAGE column — typically under 1MB for alpine doing nothing

# Clean up
docker stop resource-demo && docker rm resource-demo
▶ Output
Container is alive

real 0m0.387s
user 0m0.021s
sys 0m0.018s

NAME CPU % MEM USAGE / LIMIT MEM % NET I/O BLOCK I/O
resource-demo 0.00% 632KiB / 15.55GiB 0.00% 796B / 0B 0B / 0B
Mental Model
Containers as Apartments, VMs as Houses: apartments share the building's foundation, plumbing, and wiring (the host kernel) while keeping locked doors and separate interiors (namespaces and cgroups); each house duplicates all of that infrastructure for full independence.
When would you choose a VM over a container?
  • Multi-tenant environments running untrusted code — the shared kernel is a risk.
  • Workloads requiring a different kernel version than the host.
  • Compliance requirements that mandate full OS isolation.
  • For everything else — single-tenant application workloads — containers are the right choice.
📊 Production Insight
The kernel sharing trade-off has real security implications. In 2022, the Dirty Pipe vulnerability (CVE-2022-0847) allowed an unprivileged process to overwrite data in files it could only read, via a flaw in the host kernel's pipe handling. Every container on an affected host was vulnerable simultaneously. VMs were not affected because each VM has its own kernel. For high-security environments, consider gVisor (user-space kernel) or Kata Containers (lightweight VMs) as alternatives that provide VM-level isolation with container-like startup speed.
🎯 Key Takeaway
Containers share the host kernel — they start in milliseconds and use megabytes of memory. VMs run a full guest OS — stronger isolation but much heavier. For single-tenant workloads, containers are the right choice. For multi-tenant or high-security environments, consider gVisor or Kata Containers.
Container vs VM Selection
If: Single-tenant application workload (API, web server, worker)
Use: Container. Faster startup, lower cost, sufficient isolation.
If: Multi-tenant environment running untrusted code
Use: VM or gVisor/Kata Containers. Stronger isolation required.
If: Workload requires a specific kernel version
Use: VM. Containers share the host kernel.
If: CI/CD pipeline, developer environment
Use: Container. Fast spin-up, disposable, reproducible.

Images, Layers and Dockerfiles: How Docker Actually Builds Your App

A Docker image is a read-only blueprint for creating containers. A container is a running instance of an image — the same relationship as a class and an object in OOP, or a recipe and a meal.

Images are built in layers. Every instruction in a Dockerfile creates a new layer on top of the previous one. Docker caches these layers aggressively. This is the single most important thing to understand about Dockerfile efficiency: if layer 3 changes, Docker rebuilds from layer 3 downward. Layers 1 and 2 are served from cache instantly.

This is why experienced engineers always copy dependency manifests (package.json, requirements.txt, go.mod) and install dependencies BEFORE copying application source code. Source code changes every commit; dependencies change rarely. Put the slow, stable work near the top of your Dockerfile so it stays cached.

Multi-stage builds are the other major pattern worth knowing early. You use one image (with compilers, build tools, dev dependencies) to build your app, then copy only the compiled output into a minimal runtime image. Your final image contains zero build tooling — smaller, faster, and with a dramatically reduced attack surface.

Let's build a realistic Node.js API with both patterns applied — this is what a production-ready Dockerfile actually looks like, not the toy examples you usually see.

Layer cleanup in the same RUN: Each RUN creates a new layer. If you download a 200MB package in one RUN and delete it in the next RUN, the 200MB still exists in the first layer — layers are additive. Always chain download and cleanup in the same RUN with &&.
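A quick sketch of the difference (curl stands in as an illustrative package):

```dockerfile
# BAD: cleanup in a separate RUN. Layers are additive, so the apt cache
# downloaded in the first RUN still exists in that layer forever.
RUN apt-get update && apt-get install -y curl
RUN rm -rf /var/lib/apt/lists/*

# GOOD: download, install, and cleanup chained in ONE RUN.
# The cache never lands in any committed layer.
RUN apt-get update \
 && apt-get install -y --no-install-recommends curl \
 && rm -rf /var/lib/apt/lists/*
```

The second form produces one layer whose final state contains no cache, which is what ends up in the image.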

io/thecodeforge/Dockerfile · DOCKERFILE
# ── STAGE 1: Build Stage ──────────────────────────────────────────────────────
# Build stage: npm is available here for installing dependencies
FROM node:20-alpine AS builder
# 'AS builder' names this stage so we can reference it later
# node:20-alpine uses Alpine Linux — much smaller than node:20-bullseye

# Set the working directory inside the container
WORKDIR /app

# COPY dependency files FIRST — before application code
# Docker caches this layer. If package.json hasn't changed, npm install
# won't re-run even if your source code changed. This saves minutes per build.
COPY package.json package-lock.json ./

# Install only production dependencies (saves ~200MB vs installing devDependencies)
RUN npm ci --omit=dev
# npm ci is faster and stricter than npm install — it respects package-lock.json exactly

# NOW copy application source code
# Changing any source file only invalidates from this line forward
COPY src/ ./src/

# ── STAGE 2: Production Runtime Stage ─────────────────────────────────────────
# Start fresh from a minimal image — no build tools, no npm, no package manager cruft
FROM node:20-alpine AS production

# Run as a non-root user — critical for production security
# node:alpine ships with a 'node' user built in
USER node

WORKDIR /app

# Copy only what we need from the builder stage — not the entire filesystem
COPY --from=builder --chown=node:node /app/node_modules ./node_modules
COPY --from=builder --chown=node:node /app/src ./src
COPY --chown=node:node package.json ./
# --chown ensures the node user owns these files, not root

# Document which port the app listens on (informational — doesn't actually publish it)
EXPOSE 3000

# Define the command to run when a container starts from this image
# Use array form (exec form) — NOT string form — to ensure signals are handled correctly
CMD ["node", "src/server.js"]
▶ Output
# Build the image — run from the directory containing your Dockerfile
$ docker build -t my-node-api:1.0.0 .

Sending build context to Docker daemon 48.13kB
Step 1/11 : FROM node:20-alpine AS builder
---> 3f4d90098f5b
Step 2/11 : WORKDIR /app
---> Using cache
Step 3/11 : COPY package.json package-lock.json ./
---> Using cache <- dependencies layer served from cache!
Step 4/11 : RUN npm ci --omit=dev
---> Using cache <- install step also cached — build is fast
Step 5/11 : COPY src/ ./src/
---> 8c3a1b2d4e5f <- only this layer rebuilt (source changed)
...
Successfully built a7b3c9d1e2f4
Successfully tagged my-node-api:1.0.0

# Check the final image size
$ docker image ls my-node-api
REPOSITORY TAG IMAGE ID CREATED SIZE
my-node-api 1.0.0 a7b3c9d1e2f4 12 seconds ago 142MB
# Compare: the builder stage alone would be ~380MB with all dev tooling
Mental Model
Layers as Transparent Slides: each Dockerfile instruction adds one transparent slide to the stack, and the image is what you see looking down through all of them; replace a slide in the middle and every slide stacked after it must be redrawn.
Why does changing one layer invalidate all layers after it?
  • Each layer is a diff on top of the previous layer. If the base changes, the diff no longer applies.
  • Docker cannot know if a later instruction depends on the changed content in an earlier layer.
  • The cache is sequential, not selective — Docker rebuilds from the first invalidated layer onward.
  • This is why layer ordering (least-change to most-change) is the single most impactful Dockerfile optimization.
📊 Production Insight
The cleanup-in-same-layer rule is the most common cause of bloated images. A team's image was 1.2GB because they ran apt-get install in one RUN and apt-get clean in the next. The 800MB apt cache persisted in the first layer. Fix: chain with && and clean up in the same RUN. This alone reduced their image from 1.2GB to 340MB.
🎯 Key Takeaway
Docker builds images as a stack of cached layers. Order instructions from least-to-most frequently changing. Copy dependency manifests before source code. Chain cleanup in the same RUN as the operation. This single optimization can turn 5-minute rebuilds into 10-second rebuilds.
Layer Ordering Strategy
If: Base image (FROM)
Use: First layer. Changes rarely. Cached indefinitely until the tag is updated.
If: System dependencies (apt-get install, apk add)
Use: Second layer. Changes occasionally. Chain with && and clean up in the same RUN.
If: Dependency manifests (package.json, requirements.txt)
Use: Third layer. Changes when dependencies change. Copy BEFORE source code.
If: Dependency installation (npm ci, pip install)
Use: Fourth layer. Changes when dependencies change. Cached until manifests change.
If: Source code (COPY . . or COPY src/)
Use: Last layer. Changes on every code edit. Must be the final COPY to maximize cache.

Volumes and Docker Compose: Persistence and Multi-Container Orchestration

Containers are ephemeral by design. When a container stops, any data written inside it vanishes. That's perfect for stateless services, but databases, file uploads, and logs need to survive container restarts. Docker volumes solve this by mounting a storage location from the host (or a managed volume) into the container's filesystem.

There are three storage mechanisms: bind mounts (link a specific host directory into the container — great for local development where you want live code reloading), named volumes (Docker manages the storage location — best for databases in production), and tmpfs mounts (in-memory only — useful for sensitive data you never want written to disk).
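In Compose syntax, the three mechanisms can be sketched like this (service, image, and volume names are illustrative):

```yaml
services:
  app:
    image: my-app:1.0.0          # illustrative image
    volumes:
      - app_data:/var/lib/app    # named volume: Docker manages the host location
      - ./src:/app/src           # bind mount: a specific host directory, live reload in dev
    tmpfs:
      - /app/cache               # tmpfs: in-memory only, gone when the container stops

volumes:
  app_data:                      # named volumes are declared at the top level
```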

Real applications are never a single container. You have an API, a database, a cache, maybe a background worker. Running and networking these manually with individual docker run commands is error-prone and impossible to reproduce reliably. Docker Compose lets you define your entire multi-container application in one YAML file and bring it all up with a single command.

Here's a complete, realistic Compose setup for a Node.js API backed by PostgreSQL and Redis — the stack you'll encounter in most backend roles.

The depends_on trap: depends_on without condition: service_healthy only waits for the container to START — not for the process inside to be READY. Postgres takes 5-15 seconds to initialize. Without service_healthy, your API will crash on boot trying to connect to a database that is not accepting connections yet. This is the single most common cause of flaky Docker Compose environments.

io/thecodeforge/docker-compose.yml · YAML
# Docker Compose V2 format (no 'version' key needed with modern Docker Desktop)
services:

  # ── The API service ───────────────────────────────────────────────────────
  api:
    build:
      context: .           # Build from the Dockerfile in the current directory
      target: production   # Use the 'production' stage from our multi-stage Dockerfile
    container_name: my-api
    ports:
      - "3000:3000"        # Map host port 3000 -> container port 3000
    environment:
      # Reference values from a .env file — never hardcode secrets in Compose files
      NODE_ENV: production
      DATABASE_URL: postgresql://api_user:${DB_PASSWORD}@postgres:5432/app_db
      REDIS_URL: redis://redis:6379
      # 'postgres' and 'redis' are the service names below — Docker's internal
      # DNS resolves them automatically within the shared network
    depends_on:
      postgres:
        condition: service_healthy   # Wait until postgres passes its health check
      redis:
        condition: service_started
    restart: unless-stopped          # Restart on crash, but not if manually stopped

  # ── PostgreSQL database ───────────────────────────────────────────────────
  postgres:
    image: postgres:16-alpine        # Always pin a specific version — never use 'latest'
    container_name: my-postgres
    environment:
      POSTGRES_DB: app_db
      POSTGRES_USER: api_user
      POSTGRES_PASSWORD: ${DB_PASSWORD}   # Pulled from .env file
    volumes:
      - postgres_data:/var/lib/postgresql/data
      # Named volume — Docker manages where this lives on the host.
      # Database files survive 'docker compose down' and container rebuilds.
      - ./db/init.sql:/docker-entrypoint-initdb.d/init.sql:ro
      # Bind mount an init script — runs once when the DB is first created.
      # :ro makes it read-only inside the container (good security habit)
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U api_user -d app_db"]
      interval: 5s     # Check every 5 seconds
      timeout: 5s      # Fail if no response in 5 seconds
      retries: 5       # Mark unhealthy after 5 consecutive failures
      start_period: 30s  # Grace period before health checks start

  # ── Redis cache ───────────────────────────────────────────────────────────
  redis:
    image: redis:7-alpine
    container_name: my-redis
    command: redis-server --appendonly yes
    # --appendonly yes enables AOF persistence — data survives Redis restarts
    volumes:
      - redis_data:/data

# Named volumes must be declared at the top level
# Docker creates and manages these — they persist across 'docker compose down'
volumes:
  postgres_data:
  redis_data:
▶ Output
# Start everything (add -d for detached/background mode)
$ docker compose up -d

[+] Running 6/6
✔ Network my-app_default Created
✔ Volume "postgres_data" Created
✔ Volume "redis_data" Created
✔ Container my-postgres Healthy
✔ Container my-redis Started
✔ Container my-api Started

# Check all services are running
$ docker compose ps
NAME IMAGE COMMAND STATUS PORTS
my-api my-app-api "docker-entrypoint.s…" Up 12 seconds 0.0.0.0:3000->3000/tcp
my-postgres postgres:16-alpine "docker-entrypoint.s…" Up 18 seconds 5432/tcp
my-redis redis:7-alpine "docker-entrypoint.s…" Up 18 seconds 6379/tcp

# Tail logs from a specific service
$ docker compose logs -f api
my-api | Server listening on port 3000
my-api | Database connection established
my-api | Redis connection established

# Tear down (volumes are preserved by default)
$ docker compose down
# Add --volumes to also delete the named volumes (WARNING: deletes all DB data)
Mental Model
docker compose down vs docker compose down -v
Why is docker compose down -v the most dangerous Docker command?
  • It deletes all named volumes for the project — including databases with days of data.
  • There is no undo. Once volumes are deleted, data is gone unless backed up.
  • Developers often use it thinking it is a 'clean restart' — it is a destructive operation.
  • Always back up volumes before running down -v. Use: docker run --rm -v vol:/data -v $(pwd):/backup alpine tar czf /backup/backup.tar.gz -C /data .
📊 Production Insight
The depends_on with service_healthy pattern is not just for databases. Any service with initialization time — Redis, Elasticsearch, Kafka, message queues — needs a healthcheck and a depends_on condition. Without it, dependent services will crash on boot and enter a restart loop, delaying deployments and creating flaky CI pipelines.
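For instance, the Redis service from the Compose file above could be promoted from condition: service_started to a real health check; a sketch with illustrative intervals:

```yaml
  redis:
    image: redis:7-alpine
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]  # healthy once Redis answers PONG
      interval: 5s
      timeout: 3s
      retries: 5

  api:
    depends_on:
      redis:
        condition: service_healthy        # wait for PONG, not just container start
```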
🎯 Key Takeaway
Named volumes persist data across container restarts — they are the production default. depends_on with condition: service_healthy prevents race conditions between services. docker compose down preserves volumes; down -v deletes them. Always pin image versions — never use 'latest' in production.
Volume Type Selection
If: Production database or stateful service
Use: Named volume (postgres_data:/var/lib/postgresql/data). Docker manages the path. Portable.
If: Development — live code reload, config files
Use: Bind mount (-v ./src:/app/src). Direct access to host files. Fast iteration.
If: Sensitive data that should never touch disk
Use: tmpfs mount (--tmpfs /secrets:size=1m). In-memory only. Deleted on container stop.
If: Shared config files across multiple containers
Use: Named volume with :ro (read-only) flag. Prevents accidental modification by any container.
🗂 Virtual Machines vs Docker Containers
Architecture, resource usage, and trade-offs for each approach.
Aspect · Virtual Machines · Docker Containers
Startup time · 30 seconds – 5 minutes · Milliseconds to 2 seconds
Memory overhead · 512MB – 2GB per instance · 1MB – 50MB per instance
OS isolation · Full guest OS per VM · Shared host kernel, isolated namespaces
Disk footprint · 5GB – 50GB per image · 5MB – 500MB per image
Portability · Hypervisor-dependent (.vmdk, .vhd) · Runs on any Docker host (Linux, Mac, Windows, Cloud)
Security isolation · Strong (separate kernel) · Good (namespaces + cgroups, but shared kernel)
Best for · Full OS control, strong isolation needs · Microservices, CI/CD pipelines, developer environments
Scaling speed · Minutes (VM provisioning) · Seconds (container spin-up)

🎯 Key Takeaways

  • Containers share the host OS kernel — they're not mini VMs. This is why they start in milliseconds and use megabytes of memory, making them economically practical for microservices at scale.
  • Docker image layers are cached from top to bottom. Copy dependency manifests and run installs BEFORE copying source code, or every git commit will trigger a full package reinstall.
  • Multi-stage builds are not optional in production — they separate build-time tooling from the runtime image, cutting image sizes by 50-70% and removing attack surface from your deployed artifact.
  • Named volumes persist data across container restarts and rebuilds; depends_on with service_healthy prevents race conditions — both are non-negotiable for any database-backed service.
  • docker compose down preserves volumes. docker compose down -v deletes them. Always back up volumes before any destructive operation.
  • Always use exec-form CMD (CMD ["node", "server.js"]) — shell form silently breaks graceful shutdown in Kubernetes.

⚠ Common Mistakes to Avoid

    Copying all source files before installing dependencies
    Symptom

    every code change triggers a full npm install or pip install, making builds take 3-5 minutes instead of 10 seconds.

    Fix

    always COPY your dependency manifest (package.json, requirements.txt) and run your install command BEFORE copying the rest of your source code. Only the layers below a changed file get rebuilt.

    Using 'latest' as your image tag in production
    Symptom

    docker compose pull silently pulls a new major version of postgres or redis that has breaking changes, and your app crashes in production with confusing errors.

    Fix

    always pin to a specific version tag like postgres:16.2-alpine. Treat image versions the same way you treat library versions in a lockfile.

    Running containers as the root user
    Symptom

    a vulnerability in your app gives an attacker root access to the container filesystem, and depending on your setup, a path to the host.

    Fix

    add USER node (or create a dedicated low-privilege user) in your Dockerfile before the CMD instruction. Most official images ship with a built-in non-root user for exactly this reason.

    Using depends_on without condition: service_healthy
    Symptom

    API container crashes on boot with 'ECONNREFUSED' to the database, enters a restart loop, and eventually starts after 30-60 seconds when the database is finally ready.

    Fix

    add a healthcheck to the database service and use depends_on with condition: service_healthy in the API service.
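A sketch of that wiring in docker-compose.yml (the healthcheck command and timings are illustrative; tune them for your database):

```yaml
services:
  db:
    image: postgres:16.2-alpine
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 5s
      timeout: 3s
      retries: 5
  api:
    build: .
    depends_on:
      db:
        # Waits for the healthcheck to PASS, not merely
        # for the db container to start
        condition: service_healthy
```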

    No .dockerignore file
    Symptom

    build context includes node_modules (500MB+), .git history (100MB+), and .env files with real secrets — builds are slow and secrets are baked into the image.

    Fix

    create .dockerignore with at minimum: node_modules/, .git/, .env, *.log, coverage/.
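As a starting point, a minimal .dockerignore covering those entries (extend it for your stack):

```
# Keep the build context small and secret-free
node_modules/
.git/
.env
*.log
coverage/
```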

    Using shell-form CMD
    Symptom

    container takes 30 seconds to stop in Kubernetes instead of shutting down gracefully.

    Fix

    change CMD node server.js to CMD ["node", "server.js"]. Shell form wraps your process in /bin/sh -c, so the shell becomes PID 1 and your app runs as a child. SIGTERM goes to the shell, which does not forward it, so your app never sees the signal and is eventually SIGKILLed.

Interview Questions on This Topic

  • Q: What's the difference between a Docker image and a Docker container, and how does the layer caching system affect your Dockerfile design decisions?
  • Q: If your API container starts before your database is ready and crashes on boot, how would you fix that in a Docker Compose file without adding a sleep command?
  • Q: What's the practical difference between a bind mount and a named volume, and when would you choose one over the other in a production environment?
  • Q: Explain the difference between containers and VMs at the kernel level. When would you choose one over the other for security reasons?
  • Q: Your Docker image is 1.5GB for a simple Node.js API. Walk me through how you would diagnose and reduce the size.
  • Q: Your container takes 30 seconds to stop in Kubernetes. What is the most likely cause and how do you fix it?

Frequently Asked Questions

What is Docker used for in real-world software development?

Docker is used to package applications and their dependencies into portable containers that run identically across development, staging, and production environments. In practice it's used for local development environments, CI/CD pipelines, microservices deployment, and running databases or third-party services locally without installing them on your machine.

Is Docker the same as a virtual machine?

No — they solve a similar problem (environment isolation) but in fundamentally different ways. A VM runs a complete guest operating system with its own kernel, which costs gigabytes of memory and minutes to start. A Docker container shares the host OS kernel and uses Linux namespaces and cgroups for isolation, starting in milliseconds and using megabytes. For most application workloads, containers are faster, cheaper, and just as reliable.

Does data inside a Docker container get deleted when the container stops?

Yes — by default, any data written inside a container's writable layer is lost when the container is removed. To persist data you need to use volumes: named volumes (Docker manages the storage location, best for databases) or bind mounts (maps a specific host directory into the container, best for local development). Named volumes survive docker compose down but are deleted if you pass the -v/--volumes flag; bind mounts live on the host filesystem and are never deleted by Docker.

What is the difference between depends_on and depends_on with condition: service_healthy?

depends_on without a condition only waits for the container to START (docker start returns), not for the process inside to be READY. depends_on with condition: service_healthy waits for the healthcheck to PASS — the process inside must be fully ready and accepting connections. For databases, message queues, and any service with startup time, always use service_healthy.

How do I reduce the size of my Docker image?

The three highest-impact changes are: (1) use a minimal base image like Alpine instead of full Debian — this alone drops your base from ~180MB to ~7MB; (2) use multi-stage builds so your build tools and compiler never ship to production; (3) chain RUN commands with && and clean up package manager caches in the same RUN instruction so intermediate files don't persist in a layer.
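Combining all three for a Node.js API might look like this sketch (stage names, file paths like dist/, and the build script are assumptions about your project layout):

```dockerfile
# Stage 1: build with the full toolchain, including devDependencies
FROM node:20-alpine AS build
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci
COPY . .
RUN npm run build

# Stage 2: runtime image ships only production deps and built output
FROM node:20-alpine
WORKDIR /app
COPY package.json package-lock.json ./
# Install prod deps and drop the npm cache in the same layer
RUN npm ci --omit=dev && npm cache clean --force
COPY --from=build /app/dist ./dist
USER node
CMD ["node", "dist/server.js"]
```

Nothing from the build stage — compilers, devDependencies, source files — reaches the final image unless it is explicitly copied with COPY --from.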

Naren, Founder & Author

Developer and founder of TheCodeForge. I built this site because I was tired of tutorials that explain what to type without explaining why it works. Every article here is written to make concepts actually click.
