GitLab CI/CD Tutorial: Pipelines, Jobs & Real-World Workflows
Every software team eventually hits the same wall: code that works on one developer's machine fails on another's, deployments are manual and nerve-wracking, and nobody remembers the exact steps to release a new version. This isn't a skill problem — it's a process problem. GitLab CI/CD exists to solve exactly this. It turns your deployment process from a tribal ritual into a documented, automated, repeatable system that runs identically every single time.
The problem CI/CD solves is feedback delay. Without it, a developer might push broken code on Monday and not find out until Thursday when QA picks it up. With a properly configured GitLab pipeline, that same developer gets a red build notification within minutes of pushing. The pipeline catches the bug, runs the tests, and blocks the broken code from ever reaching production. The cost of fixing a bug drops dramatically when you catch it 10 minutes after writing it versus 3 days later.
By the end of this article, you'll understand how GitLab pipelines actually work under the hood and how to write a .gitlab-ci.yml file that handles real-world scenarios: caching dependencies, running parallel jobs, and deploying only from specific branches. You'll also know the most common mistakes teams make, and how to avoid them before they cost you a failed production deployment.
How GitLab Pipelines Actually Work (The Mental Model You Need)
Before writing a single line of YAML, you need the right mental model. A GitLab pipeline is a directed acyclic graph (DAG) of work. That's a fancy way of saying it's a series of jobs organized into stages, where each stage waits for the previous one to pass before running.
Here's the key hierarchy: a Pipeline contains Stages, and each Stage contains one or more Jobs. Jobs within the same stage run in parallel by default. Jobs in different stages run sequentially. A GitLab Runner — a separate process that can live on your own server or GitLab's shared infrastructure — picks up each job and executes it in an isolated environment, usually a Docker container.
Why does this matter? Because the isolation is what makes CI/CD trustworthy. Each job starts fresh, with no leftover state from previous jobs unless you explicitly pass artifacts or use caching. This means your tests can't accidentally pass because of something that only exists on one developer's machine. The pipeline environment is the single source of truth.
Every pipeline is triggered by an event: a git push, a merge request, a schedule, or a manual trigger. GitLab reads your .gitlab-ci.yml from the root of your repository and constructs the pipeline graph from it. If the file doesn't exist, no pipeline runs. If it has a syntax error, GitLab tells you immediately in the UI before anything executes.
```yaml
# Define the order of stages — jobs in the same stage run in parallel
# Jobs in later stages only run if all jobs in prior stages pass
stages:
  - install   # Stage 1: Get dependencies ready
  - test      # Stage 2: Run all automated tests
  - build     # Stage 3: Compile/bundle the application
  - deploy    # Stage 4: Ship to the target environment

# A default block applies settings to ALL jobs unless a job overrides them
default:
  image: node:20-alpine   # Every job runs inside this Docker container
  before_script:
    - echo "Pipeline started for branch: $CI_COMMIT_BRANCH"

# ── STAGE: install ──────────────────────────────────────────────────────────
install_dependencies:
  stage: install
  script:
    - npm ci   # 'ci' is stricter than 'install' — uses package-lock.json exactly
  # Cache node_modules so later stages (and future pipelines) don't re-download
  cache:
    key:
      files:
        - package-lock.json   # Cache is invalidated only when lock file changes
    paths:
      - node_modules/
  artifacts:
    paths:
      - node_modules/   # Pass node_modules to downstream jobs in this pipeline
    expire_in: 1 hour   # Don't keep artifacts forever — saves storage

# ── STAGE: test ─────────────────────────────────────────────────────────────
run_unit_tests:
  stage: test
  script:
    - npm run test:unit -- --coverage   # Run unit tests and generate coverage report
  coverage: '/Lines\s*:\s*(\d+\.?\d*)%/'   # Regex to parse coverage % from output
  artifacts:
    reports:
      coverage_report:
        coverage_format: cobertura
        path: coverage/cobertura-coverage.xml   # Shown as coverage badge in GitLab UI

run_lint:
  stage: test   # Runs in PARALLEL with run_unit_tests — same stage
  script:
    - npm run lint   # Code style check — runs at the same time as unit tests

# ── STAGE: build ────────────────────────────────────────────────────────────
build_production_bundle:
  stage: build
  script:
    - npm run build   # Creates optimised production assets in /dist
  artifacts:
    paths:
      - dist/   # Pass the compiled app to the deploy stage
    expire_in: 1 week

# ── STAGE: deploy ───────────────────────────────────────────────────────────
deploy_to_production:
  stage: deploy
  script:
    - echo "Deploying commit $CI_COMMIT_SHA to production..."
    - ./scripts/deploy.sh   # Your actual deployment script
  environment:
    name: production
    url: https://myapp.example.com
  # CRITICAL: Only deploy automatically from the main branch
  # All other branches can run tests and build, but NOT deploy
  rules:
    - if: '$CI_COMMIT_BRANCH == "main"'
      when: on_success   # Deploy automatically if all prior stages passed
    - when: never   # For any other branch, skip this job entirely
```
```
Stage: install
  ✅ install_dependencies (42s)
Stage: test
  ✅ run_unit_tests (1m 12s) — Coverage: 87.4%
  ✅ run_lint (18s)
Stage: build
  ✅ build_production_bundle (34s)
Stage: deploy
  ✅ deploy_to_production (1m 03s)

Pipeline passed in 3m 49s
```
Caching vs Artifacts: The Distinction That Changes Pipeline Speed
This is the most misunderstood concept in GitLab CI/CD, and getting it wrong will either break your pipeline or make it painfully slow. Caching and artifacts look similar but serve completely different purposes.
Artifacts are files that jobs pass downstream within the same pipeline. When your build job creates a /dist folder, the deploy job needs that folder. You declare it as an artifact and GitLab uploads it to its object storage, then downloads it automatically for any downstream job that needs it. Artifacts are precise, pipeline-scoped, and short-lived.
Cache is a performance optimisation that persists across multiple pipelines. Your node_modules folder takes 45 seconds to download every run. Cache it with a key tied to your package-lock.json, and subsequent pipelines skip the download entirely unless your dependencies change. Cache is best-effort — GitLab can evict it, and you should never rely on it for correctness.
The mental model: artifacts are for passing work between jobs in a pipeline (correctness), cache is for skipping repeated work across pipelines (speed). If your deploy job can't find the built files, you have an artifact problem. If your pipeline is unnecessarily slow, you have a cache problem. These are never interchangeable.
```yaml
# ── DEMONSTRATING THE DIFFERENCE BETWEEN CACHE AND ARTIFACTS ────────────────
stages:
  - dependencies
  - test
  - package

install_python_packages:
  stage: dependencies
  image: python:3.12-slim
  script:
    - pip install -r requirements.txt --target=.packages
  cache:
    # Cache key is a hash of requirements.txt
    # The cache is ONLY invalidated when requirements.txt changes
    # This saves ~30-60s on pipelines where nothing changed
    key:
      files:
        - requirements.txt
    paths:
      - .packages/   # Cached across pipelines for speed
  artifacts:
    paths:
      - .packages/   # Also an artifact so test stage can USE these packages
    expire_in: 2 hours

run_pytest:
  stage: test
  image: python:3.12-slim
  script:
    # PYTHONPATH tells Python where to find the packages installed above
    - export PYTHONPATH="$CI_PROJECT_DIR/.packages:$PYTHONPATH"
    - python -m pytest tests/ -v --junitxml=report.xml
  artifacts:
    # Test reports are artifacts — GitLab reads them to display pass/fail in MR UI
    reports:
      junit: report.xml   # Displays individual test results in the merge request
    when: always   # Upload report EVEN if tests fail — you need the evidence

create_deployment_package:
  stage: package
  image: python:3.12-slim
  script:
    - zip -r deployment.zip src/ .packages/ config/
    - echo "Package size: $(du -sh deployment.zip | cut -f1)"
  artifacts:
    name: "app-package-$CI_COMMIT_SHORT_SHA"   # Dynamic name includes commit hash
    paths:
      - deployment.zip
    expire_in: 1 week   # Keep for a week so you can re-deploy without re-building
  # Note: NO cache here — the zip file is a one-off per pipeline, not reusable
```
```
install_python_packages:
  Checking cache... HIT (key: abc123def456)
  Restoring cache from .packages/ (saved 34s)
  Running: pip install -r requirements.txt --target=.packages
  Requirements already satisfied (cache hit)
  Uploading artifacts: .packages/ (12.4 MB)

run_pytest:
  Downloading artifacts from install_python_packages...
  Running: python -m pytest tests/ -v
  ========================= 47 passed in 8.31s =========================
  Uploading test report: report.xml

create_deployment_package:
  Package size: 14M
  Uploading artifact: app-package-f3a9c21.zip
```
Environment-Based Deployments with Review Apps and Protected Branches
A mature CI/CD pipeline doesn't just have one deployment target. Real projects deploy to multiple environments: feature branches might spin up temporary 'review apps', merges to develop deploy to staging, and only merges to main reach production. This isn't complexity for its own sake — it's the safety net that lets teams ship fast without breaking things.
GitLab's environment keyword is what makes this elegant. When you define an environment in a job, GitLab tracks which pipeline version is running where. You can see at a glance in the GitLab UI that production is running commit f3a9c21 while staging has b7e1d04. You can also roll back to a previous deployment with one click directly from the Environments page.
Review Apps take this further. For every merge request, GitLab can automatically spin up a live, isolated environment just for that feature branch — complete with a unique URL. Product managers and designers can preview changes before they're merged. No more 'can you deploy this branch so I can see it?' conversations.
Protected branches add the security layer. When main is a protected branch, only Maintainers can push to it directly, and only pipelines triggered from protected branches can access protected CI/CD variables (like production API keys). This prevents a developer from accidentally deploying untested code to production.
```yaml
# ── MULTI-ENVIRONMENT DEPLOYMENT PIPELINE ───────────────────────────────────
# This demonstrates: review apps, staging, and production with proper guards
stages:
  - test
  - deploy

variables:
  # These non-sensitive defaults can live in the YAML
  STAGING_URL: "https://staging.myapp.example.com"
  PRODUCTION_URL: "https://myapp.example.com"
  # DEPLOY_SSH_KEY and PRODUCTION_API_KEY are set in
  # GitLab Settings > CI/CD > Variables (masked + protected)

# ── Shared test job (runs for ALL branches) ─────────────────────────────────
run_all_tests:
  stage: test
  image: node:20-alpine
  script:
    - npm ci
    - npm test

# ── REVIEW APP: Deploys for every Merge Request ─────────────────────────────
deploy_review_app:
  stage: deploy
  image: alpine:latest
  script:
    # CI_ENVIRONMENT_SLUG is auto-generated from the environment name
    # e.g., environment name "review/fix-login-bug" becomes slug "review-fix-login-bug"
    - echo "Deploying review app for MR: $CI_MERGE_REQUEST_IID"
    - apk add --no-cache openssh-client rsync
    - ./scripts/deploy-review.sh $CI_ENVIRONMENT_SLUG   # Your deploy script
  environment:
    name: review/$CI_COMMIT_REF_SLUG   # Creates a unique environment per branch
    url: https://$CI_ENVIRONMENT_SLUG.review.myapp.example.com
    on_stop: teardown_review_app   # Tell GitLab which job cleans this up
  rules:
    - if: '$CI_PIPELINE_SOURCE == "merge_request_event"'   # Only runs for MRs

# ── Teardown job: Runs when MR is closed or merged ──────────────────────────
teardown_review_app:
  stage: deploy
  image: alpine:latest
  script:
    - echo "Tearing down review app: $CI_ENVIRONMENT_SLUG"
    - ./scripts/teardown-review.sh $CI_ENVIRONMENT_SLUG
  environment:
    name: review/$CI_COMMIT_REF_SLUG
    action: stop   # This is what links it to the on_stop in deploy_review_app
  rules:
    - if: '$CI_PIPELINE_SOURCE == "merge_request_event"'
      when: manual   # Triggered manually OR automatically when MR closes

# ── STAGING: Deploys automatically when develop branch is updated ───────────
deploy_to_staging:
  stage: deploy
  image: alpine:latest
  script:
    - echo "Deploying $CI_COMMIT_SHORT_SHA to staging..."
    - ./scripts/deploy.sh staging
  environment:
    name: staging
    url: $STAGING_URL
  rules:
    - if: '$CI_COMMIT_BRANCH == "develop"'
      when: on_success

# ── PRODUCTION: Requires manual approval — never deploys automatically ──────
deploy_to_production:
  stage: deploy
  image: alpine:latest
  script:
    - echo "Deploying $CI_COMMIT_SHORT_SHA to PRODUCTION"
    - ./scripts/deploy.sh production
  environment:
    name: production
    url: $PRODUCTION_URL
  rules:
    - if: '$CI_COMMIT_BRANCH == "main"'
      when: manual   # A human must click 'Run' in the GitLab UI to proceed
  allow_failure: false   # If this job fails, the pipeline is marked failed
```
```
Stage: test
  ✅ run_all_tests (55s)
Stage: deploy
  ✅ deploy_review_app (28s)
     Environment: review/feature-login-redesign
     URL: https://feature-login-redesign.review.myapp.example.com
  ⏸ teardown_review_app (manual — runs when MR closes)

---
Pipeline #5108 — Branch: main

Stage: test
  ✅ run_all_tests (55s)
Stage: deploy
  ⏸ deploy_to_production (manual approval required)
     Click ▶ in GitLab UI to deploy to production
```
Pipeline Optimization: Parallelism, DAG, and Cutting Run Times in Half
Once your pipeline is working correctly, the next battle is speed. A 20-minute pipeline that runs on every commit destroys developer flow. The good news is that most slow pipelines have structural problems, not hardware problems, and they're fixable in YAML.
The first tool is the needs keyword, which unlocks GitLab's DAG (Directed Acyclic Graph) mode. By default, all jobs in stage 2 wait for ALL jobs in stage 1 to finish. With needs, a specific job can start the moment its direct dependencies finish — regardless of what stage it's in. If your build_api job doesn't depend on run_e2e_tests, why should it wait for it?
The second tool is parallel:matrix, which lets you run the same job multiple times with different variables simultaneously. Instead of running tests for Node 18, then Node 20, then Node 22 sequentially, you run all three at the same time. What was a 9-minute sequential test suite becomes a 3-minute parallel one.
The third tool is job-level rules with changes. If a push only touches markdown files in /docs, there's no reason to rebuild your entire application. The changes rule checks which files changed and skips jobs that don't need to run. Used aggressively, this can skip 70% of your pipeline on documentation-only commits.
```yaml
# ── OPTIMISED PIPELINE USING DAG + PARALLEL MATRIX + CHANGE DETECTION ───────
stages:
  - install
  - test    # In default mode, everything here waits for install to finish
  - build   # In default mode, everything here waits for ALL tests to pass
  - deploy

install_node_modules:
  stage: install
  image: node:20-alpine
  script:
    - npm ci
  cache:
    key:
      files: [package-lock.json]
    paths: [node_modules/]
  artifacts:
    paths: [node_modules/]
    expire_in: 1 hour

# ── PARALLEL MATRIX: Runs 3 simultaneous jobs instead of 3 sequential ones ──
test_across_node_versions:
  stage: test
  # 'needs' tells GitLab: start me as soon as install_node_modules passes
  # Don't wait for other jobs in the install stage that don't affect me
  needs: ["install_node_modules"]
  parallel:
    matrix:
      # GitLab spins up 3 separate jobs, one per entry — all run at the same time
      - NODE_VERSION: "18"
      - NODE_VERSION: "20"
      - NODE_VERSION: "22"
  image: node:${NODE_VERSION}-alpine   # Each job uses its own Node version
  script:
    - echo "Testing on Node $NODE_VERSION"
    - npm test
    - echo "Node $NODE_VERSION — PASSED"

# ── CHANGE-BASED SKIPPING: Only rebuild if source code actually changed ─────
build_docker_image:
  stage: build
  image: docker:24
  services:
    - docker:24-dind   # Docker-in-Docker: lets you build Docker images inside CI
  needs:
    # DAG: start building as soon as tests pass — don't wait for other build jobs
    - job: test_across_node_versions
  script:
    - docker build -t myapp:$CI_COMMIT_SHORT_SHA .
    - docker push myregistry.example.com/myapp:$CI_COMMIT_SHORT_SHA
  rules:
    - if: '$CI_COMMIT_BRANCH == "main"'
      changes:
        # Only run this job if one of these paths changed
        # A docs-only push? This job is SKIPPED entirely
        - src/**/*
        - Dockerfile
        - package.json
        - package-lock.json

# ── VISUALISING THE DAG EFFECT ──────────────────────────────────────────────
# WITHOUT needs (default stage ordering):
#   install(42s) → test_node18(3m) → test_node20(3m) → test_node22(3m) → build(2m)
#   Total: ~11 minutes sequential
#
# WITH needs + parallel matrix:
#   install(42s) → [test_node18 + test_node20 + test_node22](3m parallel) → build(2m)
#   Total: ~6 minutes ← Almost 2x faster with zero new hardware
```
```
Stage: install
  ✅ install_node_modules (42s)
Stage: test [parallel — all started immediately after install]
  ✅ test_across_node_versions: Node 18 (2m 48s)
  ✅ test_across_node_versions: Node 20 (2m 55s)
  ✅ test_across_node_versions: Node 22 (3m 02s)
  ← All 3 ran simultaneously. Wall-clock time: 3m 02s
Stage: build
  ✅ build_docker_image (1m 54s)

Total pipeline time: 6m 18s
(vs ~11 minutes with default sequential execution)
```
| Aspect | GitLab CI/CD (Self-Managed or SaaS) | GitHub Actions |
|---|---|---|
| Pipeline config file | .gitlab-ci.yml in repo root | .github/workflows/*.yml |
| Runner infrastructure | Shared runners (free tier) or self-hosted | GitHub-hosted or self-hosted |
| DAG / job dependencies | Native `needs` keyword | Native `needs` keyword |
| Review Apps | Built-in environment tracking + auto URLs | Requires third-party actions |
| Container registry | Built into every GitLab project | GitHub Packages (separate service) |
| Secret management | Protected + masked CI/CD Variables per group/project | Encrypted secrets per repo/org |
| Merge request integration | Pipeline status, coverage, test reports native in MR UI | Status checks via PR checks API |
| Free CI minutes (SaaS) | 400 min/month on free tier | 2,000 min/month on free tier |
| Best for | Teams already on GitLab, needing integrated DevOps | Teams on GitHub, wanting large marketplace of actions |
🎯 Key Takeaways
- The `rules` keyword replaces `only/except` for all modern pipelines — it handles branch conditions, file changes, pipeline sources, and scheduling in a single unified block.
- Artifacts pass work between jobs within a pipeline (correctness); cache skips repeated work across pipelines (speed). Mixing them up causes either stale deployments or unnecessarily slow pipelines.
- The `needs` keyword unlocks DAG mode — jobs start the moment their direct dependencies finish, not when their entire stage finishes. This alone can cut pipeline times in half on real projects.
- Protected CI/CD variables only appear in pipelines triggered from protected branches — this is the architectural reason production secrets can't leak even if a developer pushes to a feature branch.
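To make the first takeaway concrete, here is a minimal before/after sketch of migrating a job from the legacy `only` keyword to `rules` (the `deploy` job name is just a placeholder):

```yaml
# BEFORE — legacy syntax, cannot be mixed with rules in the same job
deploy:
  script:
    - ./scripts/deploy.sh
  only:
    - main

# AFTER — equivalent behaviour expressed with rules
deploy:
  script:
    - ./scripts/deploy.sh
  rules:
    - if: '$CI_COMMIT_BRANCH == "main"'
```

The `rules` version behaves the same here, but can be extended with `changes:`, `when:`, and `$CI_PIPELINE_SOURCE` conditions that `only/except` cannot express.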
⚠ Common Mistakes to Avoid
- ✕ Mistake 1: Using `only: [main]` alongside `rules` in the same job — GitLab throws a validation error (`rules` cannot be used with `only/except`) and the pipeline fails to create. Fix: pick one. If you need any conditional logic beyond simple branch names, migrate fully to `rules`. The `rules` approach is strictly more powerful.
- ✕ Mistake 2: Declaring an artifact in a job but forgetting to add `needs` in the downstream job that requires it — the downstream job starts before the artifact is uploaded and fails with 'file not found'. Fix: explicitly list the producing job in `needs` on the consuming job. Never assume artifact availability based on stage order alone when using DAG pipelines.
- ✕ Mistake 3: Storing sensitive credentials (API keys, SSH private keys) directly in `.gitlab-ci.yml` — these are committed to the repository and visible to anyone with read access, including in the git history even after deletion. Fix: always add secrets through GitLab Settings > CI/CD > Variables with the 'Masked' flag enabled. For production secrets, also enable 'Protected' so they only appear in pipelines running from protected branches.
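As a sketch of what the fix for Mistake 3 looks like in practice: the secret never appears in the YAML, only a reference to an environment variable the runner injects. (The `deploy.sh` script and its `--api-key` flag are hypothetical placeholders.)

```yaml
deploy_to_production:
  stage: deploy
  script:
    # PRODUCTION_API_KEY is defined in Settings > CI/CD > Variables
    # with 'Masked' and 'Protected' enabled, never in this file.
    # The runner exposes it as an ordinary environment variable.
    - ./scripts/deploy.sh --api-key "$PRODUCTION_API_KEY"
```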
Interview Questions on This Topic
- Q: What's the difference between Continuous Delivery and Continuous Deployment, and how would you configure GitLab CI/CD to implement each one?
- Q: You have a GitLab pipeline that takes 25 minutes to run. Walk me through how you'd diagnose and reduce that time without adding more runner hardware.
- Q: A developer says 'our production deploy job is accessing a CI variable but getting an empty value'. What are the three most likely causes and how do you investigate each?
Frequently Asked Questions
What is a GitLab Runner and do I need to set one up?
A GitLab Runner is the agent that actually executes your pipeline jobs. GitLab.com provides shared runners for free (up to 400 minutes/month on the free tier) so you don't need to set anything up to get started. For self-managed GitLab instances or teams that need more minutes, custom compute, or private network access, you register your own runner on any machine using the gitlab-runner register command.
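For reference, registering a self-hosted runner non-interactively looks roughly like the following. All values here are placeholders (get the real authentication token from your project's Settings > CI/CD > Runners page), and exact flags may vary by gitlab-runner version:

```
gitlab-runner register \
  --non-interactive \
  --url "https://gitlab.com/" \
  --token "YOUR_RUNNER_AUTH_TOKEN" \
  --executor "docker" \
  --docker-image "alpine:latest" \
  --description "my-docker-runner"
```

Once registered, the runner polls GitLab for jobs and executes each one inside the specified Docker image.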
How do I stop a GitLab pipeline from running on every single commit to every branch?
Use rules with if conditions on each job. For example, if: '$CI_COMMIT_BRANCH == "main" || $CI_PIPELINE_SOURCE == "merge_request_event"' limits a job to only main branch pushes and merge requests. You can also use workflow:rules at the top level to control whether a pipeline is created at all, which is more efficient than per-job rules for blanket filtering.
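A minimal top-level sketch of that blanket filtering (adapt the branch names to your own workflow):

```yaml
workflow:
  rules:
    # Create a pipeline only for pushes to main and for merge requests;
    # any other event creates no pipeline at all
    - if: '$CI_COMMIT_BRANCH == "main"'
    - if: '$CI_PIPELINE_SOURCE == "merge_request_event"'
    - when: never
```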
What's the difference between a pipeline triggered by a push and one triggered by a merge request event?
A push pipeline runs when code is pushed to a branch and uses the state of that branch's code. A merge request pipeline runs in the context of the MR and can access merge-request-specific variables like CI_MERGE_REQUEST_IID and the merged result (the code as it would look after merging). To avoid running duplicate pipelines for both events on the same commit, use workflow:rules to allow only one type, or use CI_OPEN_MERGE_REQUESTS to skip push pipelines when an MR already exists.
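A sketch of the standard deduplication pattern, placed at the top level of .gitlab-ci.yml:

```yaml
workflow:
  rules:
    # Prefer the merge request pipeline when one applies
    - if: '$CI_PIPELINE_SOURCE == "merge_request_event"'
    # Suppress the branch (push) pipeline if an open MR already covers this branch
    - if: '$CI_COMMIT_BRANCH && $CI_OPEN_MERGE_REQUESTS'
      when: never
    # Otherwise, run a normal branch pipeline
    - if: '$CI_COMMIT_BRANCH'
```

With this in place, pushing to a branch with an open MR produces exactly one pipeline (the merge request pipeline) instead of two.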
Written and reviewed by senior developers with real-world experience across enterprise, startup and open-source projects. Every article on TheCodeForge is written to be clear, accurate and genuinely useful — not just SEO filler.