
Introduction to Docker: Containers, Images and Real-World Usage Explained

In Plain English 🔥
Imagine you're moving house. Instead of dismantling every piece of furniture and hoping it fits in the new place, you pack everything — sofa, TV, cables, instruction manuals — into a perfectly sized shipping container. That container can be loaded onto any truck, ship, or train and delivered anywhere. Docker does exactly this for software: it bundles your app, its dependencies, its config, and its runtime into one portable 'container' that runs identically on your laptop, your colleague's machine, or a cloud server in Singapore. No more 'but it works on my machine'.

Every developer has lived through the nightmare: you spend three days building a feature locally, push it to staging, and everything explodes. Different Node version, missing environment variable, wrong Python path — the list goes on. This is not a skill problem; it's an infrastructure problem. And it costs real companies millions of dollars in debugging time every year. Docker was built specifically to kill this class of problem dead.

Before Docker, the standard solution was virtual machines — spinning up a full copy of an operating system just to run one app. That's like hiring a full construction crew to hang a single picture frame. Docker introduced containers: lightweight, isolated processes that share the host OS kernel but keep their own filesystem, network, and process space completely separate. The result is an environment that's fast to spin up, tiny in memory, and — most importantly — identical wherever it runs.

By the end of this article you'll understand exactly how Docker containers and images relate to each other, how to write a production-worthy Dockerfile from scratch, how to persist data with volumes, and how to wire multiple services together with Docker Compose. You'll also know the gotchas that trip up experienced developers — not just beginners — in real deployments.

Containers vs Virtual Machines: Why Docker Is a Fundamentally Different Idea

Most people learn Docker by running commands without understanding the architectural shift underneath. That's fine for getting started, but it bites you the moment something breaks.

A virtual machine (VM) runs a full guest operating system — its own kernel, drivers, system processes — on top of a hypervisor. Your app sits at the top of this tower. Booting a VM can take minutes. It consumes gigabytes of RAM even before your app starts. Scaling ten microservices with VMs means ten full operating systems running simultaneously.

Docker containers take a different path. They share the host machine's kernel directly. Each container gets its own isolated view of the filesystem (via union file system layers), its own network namespace, and its own process tree — but there's no duplicated OS. A container starts in milliseconds. It uses megabytes of overhead instead of gigabytes.

The practical implication: on the same machine where you could run three VMs, you can run thirty containers. That's not a minor efficiency gain — it's the reason microservices architectures became economically viable. When AWS charges you per second of compute, that difference compounds fast.

Containers are not inherently less secure than VMs — they're just differently isolated. A misconfigured container is dangerous, just as a misconfigured VM is. The security story depends on your configuration, not the technology itself.

container_vs_vm_demo.sh · BASH
# Compare startup time and resource footprint — run these and watch the difference

# Pull a minimal Linux image (only ~5MB compressed)
docker pull alpine:3.19

# Start a container, run a command, and exit — time the whole thing
time docker run --rm alpine:3.19 echo "Container is alive"
# --rm tells Docker to delete the container after it exits (no cleanup needed)
# alpine:3.19 is the image — think of it as the blueprint
# 'echo ...' is the command to run inside the container

# Now check how much memory the container used at peak
# Run it in the background with resource stats
docker run -d --name resource-demo alpine:3.19 sleep 30
# -d runs in detached (background) mode
# --name gives it a human-readable name instead of a random hash

docker stats resource-demo --no-stream
# --no-stream prints one snapshot instead of a live feed
# Look at the MEM USAGE column — typically under 1MB for alpine doing nothing

# Clean up
docker stop resource-demo && docker rm resource-demo
▶ Output
Container is alive

real 0m0.387s
user 0m0.021s
sys 0m0.018s

NAME CPU % MEM USAGE / LIMIT MEM % NET I/O BLOCK I/O
resource-demo 0.00% 632KiB / 15.55GiB 0.00% 796B / 0B 0B / 0B
🔥
Why This Matters in Production: A Node.js app in an Alpine-based Docker image typically weighs 80–120MB. The equivalent EC2 instance OS overhead is 1–2GB. At scale, that difference determines whether your architecture is cost-effective or not.

Images, Layers and Dockerfiles: How Docker Actually Builds Your App

A Docker image is a read-only blueprint for creating containers. A container is a running instance of an image — the same relationship as a class and an object in OOP, or a recipe and a meal.
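The class/object analogy can be made concrete with a few lines of JavaScript (purely illustrative: the class and property names are invented, and nothing here touches Docker itself):

```javascript
// Purely illustrative: an image is like a class (read-only blueprint),
// a container is like an object (a live instance with its own state).
class NodeApiImage {
  constructor() {
    this.baseLayer = "node:20-alpine"; // shared, immutable blueprint data
  }
  run(name) {
    // each "container" gets its own writable state on top of the blueprint
    return { name, image: this.baseLayer, writableLayer: {} };
  }
}

const image = new NodeApiImage();
const c1 = image.run("api-1");
const c2 = image.run("api-2");

c1.writableLayer["/tmp/cache"] = "only visible in c1"; // isolated state
console.log(c2.writableLayer["/tmp/cache"]); // undefined: containers don't share writes
```

One image, many containers: they all share the read-only blueprint, but each writes to its own layer, which is exactly why removing a container discards its writes while the image stays untouched.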

Images are built in layers. Every instruction in a Dockerfile creates a new layer on top of the previous one. Docker caches these layers aggressively. This is the single most important thing to understand about Dockerfile efficiency: if layer 3 changes, Docker rebuilds from layer 3 downward. Layers 1 and 2 are served from cache instantly.

This is why experienced engineers always copy dependency manifests (package.json, requirements.txt, go.mod) and install dependencies BEFORE copying application source code. Source code changes every commit; dependencies change rarely. Put the slow, stable work near the top of your Dockerfile so it stays cached.
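As a mental model, the caching rule fits in one function. This is a hypothetical sketch, not Docker's actual implementation: given the ordered layers and the index of the first instruction whose input changed, everything before that index comes from cache and everything from it onward is rebuilt.

```javascript
// Hypothetical model of Dockerfile layer caching (not Docker's real code).
// layers: ordered instruction names; firstChanged: index of the first
// instruction whose input (file contents or command) changed since last build.
function buildPlan(layers, firstChanged) {
  return layers.map((layer, i) => ({
    layer,
    action: i < firstChanged ? "CACHED" : "REBUILD",
  }));
}

const layers = ["FROM", "WORKDIR", "COPY package.json", "RUN npm ci", "COPY src/"];
// Only source code changed, so index 4 is the first invalidated layer:
// everything through "RUN npm ci" stays cached, only "COPY src/" rebuilds.
console.log(buildPlan(layers, 4));
```

Swap the COPY order (source first, manifests second) and `firstChanged` drops to the source-copy index on every commit, dragging the expensive install step with it.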

Multi-stage builds are the other major pattern worth knowing early. You use one image (with compilers, build tools, dev dependencies) to build your app, then copy only the compiled output into a minimal runtime image. Your final image contains zero build tooling — smaller, faster, and with a dramatically reduced attack surface.

Let's build a realistic Node.js API with both patterns applied — this is what a production-ready Dockerfile actually looks like, not the toy examples you usually see.

Dockerfile · DOCKERFILE
# ── STAGE 1: Build Stage ──────────────────────────────────────────────────────
# Use a Node image with npm available for installing dependencies
FROM node:20-alpine AS builder
# 'AS builder' names this stage so we can reference it later
# node:20-alpine uses Alpine Linux — much smaller than node:20-bullseye

# Set the working directory inside the container
WORKDIR /app

# COPY dependency files FIRST — before application code
# Docker caches this layer. If package.json hasn't changed, npm install
# won't re-run even if your source code changed. This saves minutes per build.
COPY package.json package-lock.json ./

# Install only production dependencies (saves ~200MB vs installing devDependencies)
RUN npm ci --omit=dev
# npm ci is faster and stricter than npm install — it respects package-lock.json exactly

# NOW copy application source code
# Changing any source file only invalidates from this line forward
COPY src/ ./src/

# ── STAGE 2: Production Runtime Stage ─────────────────────────────────────────
# Start fresh from a minimal image — no build tools, no npm, no package manager cruft
FROM node:20-alpine AS production

# Run as a non-root user — critical for production security
# node:alpine ships with a 'node' user built in
USER node

WORKDIR /app

# Copy only what we need from the builder stage — not the entire filesystem
COPY --from=builder --chown=node:node /app/node_modules ./node_modules
COPY --from=builder --chown=node:node /app/src ./src
COPY --chown=node:node package.json ./
# --chown ensures the node user owns these files, not root

# Document which port the app listens on (informational — doesn't actually publish it)
EXPOSE 3000

# Define the command to run when a container starts from this image
# Use array form (exec form) — NOT string form — to ensure signals are handled correctly
CMD ["node", "src/server.js"]
▶ Output
# Build the image — run from the directory containing your Dockerfile
$ docker build -t my-node-api:1.0.0 .

Sending build context to Docker daemon 48.13kB
Step 1/11 : FROM node:20-alpine AS builder
---> 3f4d90098f5b
Step 2/11 : WORKDIR /app
---> Using cache
Step 3/11 : COPY package.json package-lock.json ./
---> Using cache ← dependencies layer served from cache!
Step 4/11 : RUN npm ci --omit=dev
---> Using cache ← install step also cached — build is fast
Step 5/11 : COPY src/ ./src/
---> 8c3a1b2d4e5f ← only this layer rebuilt (source changed)
...
Successfully built a7b3c9d1e2f4
Successfully tagged my-node-api:1.0.0

# Check the final image size
$ docker image ls my-node-api
REPOSITORY TAG IMAGE ID CREATED SIZE
my-node-api 1.0.0 a7b3c9d1e2f4 12 seconds ago 142MB
# Compare: the builder stage alone would be ~380MB with all dev tooling
⚠️
Watch Out: CMD String Form vs Array Form. CMD "node src/server.js" (string form) wraps your process in a shell, meaning Docker's SIGTERM signal hits the shell, not your Node process. Your app won't shut down gracefully. Always use CMD ["node", "src/server.js"] (array/exec form) so signals go directly to your process.

Volumes and Docker Compose: Persistence and Multi-Container Orchestration

Containers are ephemeral by design. When a container stops, any data written inside it vanishes. That's perfect for stateless services, but databases, file uploads, and logs need to survive container restarts. Docker volumes solve this by mounting a storage location from the host (or a managed volume) into the container's filesystem.

There are three storage mechanisms: bind mounts (link a specific host directory into the container — great for local development where you want live code reloading), named volumes (Docker manages the storage location — best for databases in production), and tmpfs mounts (in-memory only — useful for sensitive data you never want written to disk).
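The Compose file in this section demonstrates named volumes and bind mounts; a tmpfs mount, by contrast, is declared per service. A minimal fragment (service name and paths are illustrative):

```yaml
# Sketch: per-service tmpfs mounts. Contents live only in RAM,
# are never written to disk, and vanish when the container stops.
services:
  api:
    tmpfs:
      - /app/tmp           # scratch space for transient files
      - /run/secret-cache  # e.g. decrypted material you never want persisted
```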

Real applications are never a single container. You have an API, a database, a cache, maybe a background worker. Running and networking these manually with individual docker run commands is error-prone and impossible to reproduce reliably. Docker Compose lets you define your entire multi-container application in one YAML file and bring it all up with a single command.

Here's a complete, realistic Compose setup for a Node.js API backed by PostgreSQL and Redis — the stack you'll encounter in most backend roles.

docker-compose.yml · YAML
# Docker Compose V2 format (no 'version' key needed with modern Docker Desktop)
services:

  # ── The API service ───────────────────────────────────────────────────────
  api:
    build:
      context: .           # Build from the Dockerfile in the current directory
      target: production   # Use the 'production' stage from our multi-stage Dockerfile
    container_name: my-api
    ports:
      - "3000:3000"        # Map host port 3000 → container port 3000
    environment:
      # Reference values from a .env file — never hardcode secrets in Compose files
      NODE_ENV: production
      DATABASE_URL: postgresql://api_user:${DB_PASSWORD}@postgres:5432/app_db
      REDIS_URL: redis://redis:6379
      # 'postgres' and 'redis' are the service names below — Docker's internal
      # DNS resolves them automatically within the shared network
    depends_on:
      postgres:
        condition: service_healthy   # Wait until postgres passes its health check
      redis:
        condition: service_started
    restart: unless-stopped          # Restart on crash, but not if manually stopped

  # ── PostgreSQL database ───────────────────────────────────────────────────
  postgres:
    image: postgres:16-alpine        # Always pin a specific version — never use 'latest'
    container_name: my-postgres
    environment:
      POSTGRES_DB: app_db
      POSTGRES_USER: api_user
      POSTGRES_PASSWORD: ${DB_PASSWORD}   # Pulled from .env file
    volumes:
      - postgres_data:/var/lib/postgresql/data
      # Named volume — Docker manages where this lives on the host.
      # Database files survive 'docker compose down' and container rebuilds.
      - ./db/init.sql:/docker-entrypoint-initdb.d/init.sql:ro
      # Bind mount an init script — runs once when the DB is first created.
      # :ro makes it read-only inside the container (good security habit)
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U api_user -d app_db"]
      interval: 5s     # Check every 5 seconds
      timeout: 5s      # Fail if no response in 5 seconds
      retries: 5       # Mark unhealthy after 5 consecutive failures

  # ── Redis cache ───────────────────────────────────────────────────────────
  redis:
    image: redis:7-alpine
    container_name: my-redis
    command: redis-server --appendonly yes
    # --appendonly yes enables AOF persistence — data survives Redis restarts
    volumes:
      - redis_data:/data

# Named volumes must be declared at the top level
# Docker creates and manages these — they persist across 'docker compose down'
volumes:
  postgres_data:
  redis_data:
▶ Output
# Start everything (add -d for detached/background mode)
$ docker compose up -d

[+] Running 6/6
✔ Network my-app_default Created
✔ Volume "my-app_postgres_data" Created
✔ Volume "my-app_redis_data" Created
✔ Container my-postgres Healthy
✔ Container my-redis Started
✔ Container my-api Started

# Check all services are running
$ docker compose ps
NAME IMAGE COMMAND STATUS PORTS
my-api my-app-api "docker-entrypoint.s…" Up 12 seconds 0.0.0.0:3000->3000/tcp
my-postgres postgres:16-alpine "docker-entrypoint.s…" Up 18 seconds 5432/tcp
my-redis redis:7-alpine "docker-entrypoint.s…" Up 18 seconds 6379/tcp

# Tail logs from a specific service
$ docker compose logs -f api
my-api | Server listening on port 3000
my-api | Database connection established
my-api | Redis connection established

# Tear down (volumes are preserved by default)
$ docker compose down
# Add --volumes to also delete the named volumes (WARNING: deletes all DB data)
⚠️
Pro Tip: Always Use Health Checks with depends_on. depends_on without a condition only waits for the container to START, not for the service inside it to be READY. PostgreSQL takes a few seconds to initialize after the container starts. Without service_healthy, your API will crash on boot trying to connect to a postgres that isn't ready yet. This is one of the most common causes of flaky local environments.
Aspect             | Virtual Machines                         | Docker Containers
-------------------|------------------------------------------|-------------------------------------------------------
Startup time       | 30 seconds – 5 minutes                   | Milliseconds to 2 seconds
Memory overhead    | 512MB – 2GB per instance                 | 1MB – 50MB per instance
OS isolation       | Full guest OS per VM                     | Shared host kernel, isolated namespaces
Disk footprint     | 5GB – 50GB per image                     | 5MB – 500MB per image
Portability        | Hypervisor-dependent (.vmdk, .vhd)       | Runs on any Docker host (Linux, Mac, Windows, cloud)
Security isolation | Strong (separate kernel)                 | Good (namespaces + cgroups, but shared kernel)
Best for           | Full OS control, strong isolation needs  | Microservices, CI/CD pipelines, developer environments
Scaling speed      | Minutes (VM provisioning)                | Seconds (container spin-up)

🎯 Key Takeaways

  • Containers share the host OS kernel — they're not mini VMs. This is why they start in milliseconds and use megabytes of memory, making them economically practical for microservices at scale.
  • Docker image layers are cached from top to bottom. Copy dependency manifests and run installs BEFORE copying source code, or every git commit will trigger a full package reinstall.
  • Multi-stage builds are not optional in production — they separate build-time tooling from the runtime image, cutting image sizes by 50-70% and removing attack surface from your deployed artifact.
  • Named volumes persist data across container restarts and rebuilds; depends_on with service_healthy prevents race conditions — both are non-negotiable for any database-backed service.

⚠ Common Mistakes to Avoid

  • Mistake 1: Copying all source files before installing dependencies — Symptom: every code change triggers a full npm install or pip install, making builds take 3-5 minutes instead of 10 seconds — Fix: always COPY your dependency manifest (package.json, requirements.txt) and run your install command BEFORE copying the rest of your source code. Only the layers below a changed file get rebuilt.
  • Mistake 2: Using 'latest' as your image tag in production — Symptom: docker compose pull silently pulls a new major version of postgres or redis that has breaking changes, and your app crashes in production with confusing errors — Fix: always pin to a specific version tag like postgres:16.2-alpine. Treat image versions the same way you treat library versions in a lockfile.
  • Mistake 3: Running containers as the root user — Symptom: a vulnerability in your app gives an attacker root access to the container filesystem, and depending on your setup, a path to the host — Fix: add USER node (or create a dedicated low-privilege user) in your Dockerfile before the CMD instruction. Most official images ship with a built-in non-root user for exactly this reason.

Interview Questions on This Topic

  • Q: What's the difference between a Docker image and a Docker container, and how does the layer caching system affect your Dockerfile design decisions?
  • Q: If your API container starts before your database is ready and crashes on boot, how would you fix that in a Docker Compose file without adding a sleep command?
  • Q: What's the practical difference between a bind mount and a named volume, and when would you choose one over the other in a production environment?

Frequently Asked Questions

What is Docker used for in real-world software development?

Docker is used to package applications and their dependencies into portable containers that run identically across development, staging, and production environments. In practice it's used for local development environments, CI/CD pipelines, microservices deployment, and running databases or third-party services locally without installing them on your machine.

Is Docker the same as a virtual machine?

No — they solve a similar problem (environment isolation) but in fundamentally different ways. A VM runs a complete guest operating system with its own kernel, which costs gigabytes of memory and minutes to start. A Docker container shares the host OS kernel and uses Linux namespaces and cgroups for isolation, starting in milliseconds and using megabytes. For most application workloads, containers are faster, cheaper, and just as reliable.

Does data inside a Docker container get deleted when the container stops?

Yes — by default, any data written inside a container's writable layer is lost when the container is removed. To persist data you need to use volumes: named volumes (Docker manages the storage location, best for databases) or bind mounts (maps a specific host directory into the container, best for local development). Named volumes survive docker compose down unless you explicitly pass the --volumes flag; bind-mounted host directories are never deleted by Docker at all.

🔥
TheCodeForge Editorial Team Verified Author

Written and reviewed by senior developers with real-world experience across enterprise, startup and open-source projects. Every article on TheCodeForge is written to be clear, accurate and genuinely useful — not just SEO filler.
