GitHub Actions CI/CD Tutorial: Workflows, Jobs, and Real-World Pipelines
- The hierarchy is Workflow → Job → Step — jobs are parallel by default, steps within a job are sequential and share a filesystem. Getting this model wrong is the root of most pipeline bugs.
- Use environment secrets with required reviewers for production deployments — repository-level secrets are accessible to every workflow and every job, which is a credential leak waiting to happen.
- Pin third-party Actions to a commit SHA, not a branch or floating tag — branch-pinning means someone else's commit can break your deploy pipeline without you touching a single file.
- Workflow: YAML file triggered by an event — one event can trigger many workflows
- Job: parallel unit of work — runs simultaneously by default, use 'needs' for sequencing
- Step: sequential within a job — either a shell command (run) or pre-built Action (uses)
- Secrets: scoped by level — org, repo, or environment (most secure)
- Concurrency: prevents deployment race conditions with cancel-in-progress
Pipeline fails with no code change: Action input or version issue
- gh run view <run-id> --log-failed (see exact failure in logs)
- gh api repos/{owner}/{repo}/actions/runs/<run-id>/jobs (see which job failed)
Two deployments raced: server has mixed versions
- gh run list --workflow=deploy.yml --limit=5 (see recent deploy runs)
- gh run cancel <run-id> (cancel the racing deploy)
Secret exposed in logs: credential leak detected
- gh secret list (verify which secrets exist at which level)
- gh secret set <NAME> (re-set the rotated secret)
Cache miss every run: builds are slow
- git diff HEAD~1 package-lock.json (is the lock file changing?)
- gh run view <run-id> --log (check for 'Cache not found' messages)
Fork PR cannot access secrets: integration tests fail
- Check workflow trigger: on: pull_request (fork-safe) vs on: pull_request_target (has secrets)
- Use OIDC for cloud credentials (short-lived tokens, no stored secrets)
Production Incident
1. The workflow used uses: actions/checkout@main, pinned to the main branch of the checkout Action.
2. The checkout Action maintainer pushed a commit that renamed the ref input to repository-ref.
3. The team's workflow still passed ref: ${{ github.sha }}, which no longer existed as an input.
4. The checkout Action failed with 'Input required and not supplied: ref' because the input was renamed.
5. The team's workflow YAML had not changed; the upstream Action changed under them.
6. The scheduled nightly build picked up the new Action version automatically.
7. The team spent 14 hours debugging before checking the Action's changelog.

1. Pin the Action to a commit SHA: uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683.
2. Update the input name from ref to repository-ref to match the new Action version.
3. Team rule: all third-party Actions must be pinned to commit SHAs, not branches or floating version tags.
4. Added a linting step that checks workflow YAML for non-SHA-pinned Actions, with the linter itself pinned to a commit SHA rather than @latest.
5. Set up Dependabot alerts for Action version updates so the team can review and update SHA pins deliberately.

Production Debug Guide
Systematic recovery paths for broken pipelines, deployment races, secret leaks, and cache issues.
concurrency: { group: deploy-${{ github.ref }}, cancel-in-progress: true }.
4. Prevention: all deployment workflows must have concurrency groups.

key: ${{ runner.os }}-node-${{ hashFiles('package-lock.json') }}.
2. If package-lock.json changes on every run (e.g., version bumping scripts), the cache key changes every time.
3. Check: git diff HEAD~1 package-lock.json (is the lock file changing when it should not?)
4. Fix: use restore-keys as a fallback: ${{ runner.os }}-node- to get partial cache hits.

on: pull_request_target for fork PR workflows that need secrets, but read the security implications first.
3. Better: use OIDC for cloud credentials (short-lived tokens, no stored secrets).
4. Alternative: skip integration tests on fork PRs, run them after merge.

GitHub Actions is a CI/CD platform that runs workflows defined as YAML files in your repository. Workflows are triggered by events (push, pull request, schedule, manual) and execute jobs that contain sequential steps. The platform provides hosted runners, a marketplace of 16,000+ pre-built Actions, and built-in secrets management.
The key architectural decisions: jobs run in parallel by default (use needs for sequencing), steps within a job share a filesystem (install in step 1, use in step 2), and runners are ephemeral (clean every run unless you explicitly cache or upload artifacts). Secrets are scoped at three levels — org, repo, and environment — with environment secrets being the most secure for production credentials.
Common misconceptions: that secrets are automatically redacted in all contexts (they are not: encoding or decoding a secret produces a string the redactor does not recognize), that on: push and on: pull_request have the same permissions (fork PRs get read-only access and no secrets), and that pinning Actions to a branch or floating tag is safe (it is not: a maintainer's breaking change breaks your pipeline without any code change from you).
How GitHub Actions Actually Works: Events, Workflows, Jobs, and Steps
The mental model is a clean hierarchy, and getting it right changes everything. At the top is a Workflow — a YAML file in .github/workflows/. A workflow is triggered by an Event: a push, a pull request, a schedule, or even a manual button click in the GitHub UI. One event can trigger many workflows.
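As a rough sketch of the event types listed above, a single on: block can combine all four. The workflow name, the cron schedule, and the job are placeholders chosen for illustration, not taken from a real project.

```yaml
# Hypothetical file: .github/workflows/triggers-demo.yml
name: Trigger Demo

on:
  push:
    branches:
      - main               # runs when commits land on main
  pull_request:
    branches:
      - main               # runs for PRs targeting main
  schedule:
    - cron: '0 3 * * *'    # runs daily at 03:00 UTC (example schedule)
  workflow_dispatch:       # adds a manual "Run workflow" button in the GitHub UI

jobs:
  show-event:
    runs-on: ubuntu-latest
    steps:
      - name: Print which event started this run
        run: echo "Triggered by ${{ github.event_name }}"
```

Each trigger starts its own independent run of the workflow, so the same file can serve as a push check, a PR check, a nightly job, and a manual tool.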
Inside a workflow are Jobs. Jobs are the parallel units of work. By default they run simultaneously — so your 'run tests' job and your 'lint code' job can race each other. That's a huge speed win. If you need sequencing (don't deploy until tests pass), you declare explicit dependencies with needs.
Inside each job are Steps. Steps are sequential within a job — they share the same runner machine and filesystem, which is why you can install Node in step 1 and use it in step 2. Each step is either a shell command (run) or a pre-built Action (uses). Those pre-built Actions are the real superpower: the community has published Actions for deploying to AWS, sending Slack messages, caching npm dependencies — thousands of them on the GitHub Marketplace.
The runner is just a virtual machine spun up on demand by GitHub. It's clean every run — nothing carries over between workflow runs unless you explicitly cache it or upload an artifact.
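Because each job gets its own fresh runner, files produced in one job are not visible in another unless you move them explicitly. A minimal sketch of that hand-off, assuming a hypothetical artifact name (dist-output) and a made-up build step:

```yaml
# Hypothetical two-job workflow: build once, then read the output in a second job.
name: Build Then Verify

on:
  push:

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683
      - name: Produce a build output
        run: |
          mkdir -p dist
          echo "built at $(date -u)" > dist/build-info.txt
      - name: Upload dist/ so later jobs can fetch it
        # Version tag shown for readability; pin to a full commit SHA in real pipelines.
        uses: actions/upload-artifact@v4
        with:
          name: dist-output          # artifact name is an assumption
          path: dist/

  verify:
    runs-on: ubuntu-latest
    needs: build                     # without this, verify would start in parallel with build
    steps:
      - name: Download the artifact onto this job's fresh runner
        uses: actions/download-artifact@v4
        with:
          name: dist-output
          path: dist/
      - name: Confirm the file crossed the job boundary
        run: cat dist/build-info.txt
```

The needs key handles ordering and the artifact handles data. Dropping either one breaks the hand-off.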
# io.thecodeforge: GitHub Actions CI Pipeline
#
# This workflow runs on every push to any branch and on every pull request targeting main.
# It has two jobs, one for testing and one for linting. They run in parallel to save time.
name: CI Pipeline

on:
  push:
    branches:
      - '**'            # Trigger on every branch push
  pull_request:
    branches:
      - main            # Extra scrutiny on PRs targeting main

jobs:
  # ── JOB 1: Run the test suite ───────────────────────────────────────────────
  run-tests:
    name: Run Unit & Integration Tests
    runs-on: ubuntu-latest    # GitHub-hosted runner, fresh VM every time
    steps:
      - name: Check out repository code
        uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683     # SHA-pinned for security
      - name: Set up Node.js 20
        uses: actions/setup-node@49933ea5288caeca8642d1e84afbd3f7d6820020   # SHA-pinned
        with:
          node-version: '20'
          cache: 'npm'        # Caches npm's download cache (not node_modules) between runs
      - name: Install dependencies
        run: npm ci           # Stricter than 'install': uses package-lock.json exactly
      - name: Run tests with coverage
        run: npm test -- --coverage
        env:
          NODE_ENV: test      # Set environment variables inline per step

  # ── JOB 2: Lint the codebase (runs in PARALLEL with run-tests) ───────────────
  lint-code:
    name: ESLint Code Quality Check
    runs-on: ubuntu-latest
    steps:
      - name: Check out repository code
        uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683     # SHA-pinned
      - name: Set up Node.js 20
        uses: actions/setup-node@49933ea5288caeca8642d1e84afbd3f7d6820020   # SHA-pinned (same commit as in run-tests)
        with:
          node-version: '20'
          cache: 'npm'
      - name: Install dependencies
        run: npm ci
      - name: Run ESLint
        run: npm run lint     # Fails the job (and blocks the PR) if lint errors exist
✓ Check out repository code
✓ Set up Node.js 20 [cache hit]
✓ Install dependencies
✓ Run tests with coverage — 48 passed, 0 failed
✓ ESLint Code Quality Check (38s)
✓ Check out repository code
✓ Set up Node.js 20 [cache hit]
✓ Install dependencies
✓ Run ESLint — No lint errors found
All checks passed. PR is ready to merge.
- npm ci reads package-lock.json exactly — no resolution, no surprises
- npm install resolves dependencies fresh — lock file may change
- CI should test the exact tree your teammates agreed on, not a fresh resolution
- npm ci also deletes node_modules first for a clean install — stricter by design
Handling Secrets, Environment Variables, and Multi-Environment Deployments
Here's where most tutorials fail you: they show you how to reference a secret but not how to think about secrets architecture for a real project. Let's fix that.
GitHub has three levels of secrets: Organization secrets (shared across repos), Repository secrets (just this repo), and Environment secrets (scoped to a named deployment environment like 'staging' or 'production'). Environment secrets are the most powerful for CI/CD because GitHub won't hand them to a workflow unless it's deploying to that specific named environment — and you can add required reviewers, meaning a human must approve before prod secrets are ever exposed.
The environment key on a job is what unlocks this. When you add environment: production to a deployment job, GitHub checks if that environment exists, applies its protection rules (required reviewers, wait timers), and only then injects its secrets into the job's environment variables.
Never log secrets. GitHub automatically redacts known secret values from logs, but if you base64-encode a secret and then decode it in a run step and echo it, GitHub has no idea that string is sensitive. The redaction is string-match based, not magic.
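One way to handle a value derived from a secret is to register it with the runner's add-mask workflow command before anything else touches it. The sketch below assumes a hypothetical secret named ENCODED_API_TOKEN that holds a single-line, base64-encoded token; the secret name and the workflow itself are illustrative only.

```yaml
# Hypothetical workflow showing how to mask a value the redactor has never seen.
name: Mask Derived Secret

on:
  workflow_dispatch:

jobs:
  use-derived-secret:
    runs-on: ubuntu-latest
    steps:
      - name: Decode a secret and mask the decoded value
        env:
          # Pass the secret through env instead of interpolating it into the script text.
          ENCODED_TOKEN: ${{ secrets.ENCODED_API_TOKEN }}   # hypothetical secret name
        run: |
          decoded="$(printf '%s' "$ENCODED_TOKEN" | base64 --decode)"
          # Tell the runner to redact this derived string in all later log output.
          echo "::add-mask::$decoded"
          # From here on, an accidental echo of the value prints *** instead.
          echo "Token length: ${#decoded}"
```

Masking is a safety net, not a license to print. The safer habit is still to keep secret values out of echo and debug output entirely.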
# io.thecodeforge: GitHub Actions Deploy Pipeline
#
# This workflow deploys to staging on every merge to main,
# then requires a manual approval before deploying to production.
# Secrets are scoped per environment so prod credentials are never
# exposed during a staging deploy.
name: Deploy Pipeline

on:
  push:
    branches:
      - main    # Only deploys on merges to main, not on feature branches

jobs:
  # ── JOB 1: Tests must pass before anything deploys ──────────────────────────
  run-tests:
    name: Test Gate
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683
      - uses: actions/setup-node@49933ea5288caeca8642d1e84afbd3f7d6820020
        with:
          node-version: '20'
          cache: 'npm'
      - run: npm ci
      - run: npm test

  # ── JOB 2: Deploy to Staging (runs after tests pass) ────────────────────────
  deploy-staging:
    name: Deploy to Staging
    runs-on: ubuntu-latest
    needs: run-tests          # Will not start until run-tests job succeeds
    environment: staging      # Unlocks staging environment secrets + protection rules
    steps:
      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683
      - uses: actions/setup-node@49933ea5288caeca8642d1e84afbd3f7d6820020
        with:
          node-version: '20'
          cache: 'npm'
      - run: npm ci           # Fresh runner per job, so dependencies must be installed again
      - name: Build production bundle
        run: npm run build
        env:
          VITE_API_URL: ${{ vars.API_URL }}   # 'vars' = non-secret config variables (visible in logs)
      # secrets.STAGING_SSH_PRIVATE_KEY is ONLY available because environment: staging is set above
      - name: Deploy to staging server via SSH
        run: |
          # Write the SSH private key from secrets to a temp file
          echo "${{ secrets.STAGING_SSH_PRIVATE_KEY }}" > /tmp/deploy_key
          chmod 600 /tmp/deploy_key
          # Sync build output to the staging server
          rsync -avz --delete \
            -e "ssh -i /tmp/deploy_key -o StrictHostKeyChecking=no" \
            ./dist/ \
            ${{ secrets.STAGING_USER }}@${{ secrets.STAGING_HOST }}:/var/www/app/
          # Clean up the key file immediately after use
          rm /tmp/deploy_key

  # ── JOB 3: Deploy to Production (requires a human to approve in GitHub UI) ──
  deploy-production:
    name: Deploy to Production
    runs-on: ubuntu-latest
    needs: deploy-staging     # Staging must succeed before prod is even offered
    environment: production   # 'production' environment has required reviewers set in GitHub settings
    # The workflow PAUSES here until a reviewer approves in the GitHub UI
    steps:
      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683
      - uses: actions/setup-node@49933ea5288caeca8642d1e84afbd3f7d6820020
        with:
          node-version: '20'
          cache: 'npm'
      - run: npm ci
      - name: Build production bundle
        run: npm run build
        env:
          VITE_API_URL: ${{ vars.API_URL }}
      - name: Deploy to production server via SSH
        run: |
          echo "${{ secrets.PROD_SSH_PRIVATE_KEY }}" > /tmp/deploy_key
          chmod 600 /tmp/deploy_key
          rsync -avz --delete \
            -e "ssh -i /tmp/deploy_key -o StrictHostKeyChecking=no" \
            ./dist/ \
            ${{ secrets.PROD_USER }}@${{ secrets.PROD_HOST }}:/var/www/app/
          rm /tmp/deploy_key
✓ Test Gate (45s)
✓ Run tests — 48 passed
✓ Deploy to Staging (1m 12s)
✓ Build production bundle
✓ Deploy to staging server via SSH — 23 files transferred
⏸ Deploy to Production — Waiting for review
Reviewer '@alice' approved (3m later)
✓ Deploy to Production (1m 08s)
✓ Build production bundle
✓ Deploy to production server via SSH — 23 files transferred
All deployments complete.
If you store PROD_SSH_PRIVATE_KEY as a repository secret instead of an environment secret, it's accessible to every job in every workflow that runs with the repository's secrets. Anyone who can push a branch can add or edit a workflow that reads it, and a pull_request_target workflow that checks out and runs fork code can leak it to an outside contributor. Use environment secrets with protection rules for anything that touches production.
Caching, Build Matrices, and Reusable Workflows: Scaling Without Pain
Once your pipeline works, the next battle is speed and maintainability. Three features change the game at scale.
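Caching is the fastest win. Without it, npm ci downloads every package fresh on every run. With actions/cache (or the built-in cache on actions/setup-node), previously downloaded packages are restored using a cache key built from your package-lock.json hash. If the lock file hasn't changed, npm ci installs from the restored local cache instead of the network. Note that npm ci deletes node_modules before installing, so the directory worth caching is npm's download cache, not node_modules. Same principle works for pip, Maven, Gradle, and Cargo.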
Build matrices let you run the same job across multiple configurations in parallel without duplicating YAML. Testing against Node 18 and 20? Two browsers? Three operating systems? A matrix expands one job definition into N parallel jobs automatically. Failed combinations are clearly labeled, and with fail-fast: false the passing ones run to completion regardless of failures elsewhere.
Reusable workflows solve the DRY problem at the organization level. Instead of copy-pasting a 'deploy via SSH' job across 12 microservice repos, you define it once in a central repo and call it with uses: your-org/devops-workflows/.github/workflows/ssh-deploy.yml@main. Update the template once, every repo benefits. This is the pattern that separates organizations that maintain CI/CD well from those that have 12 slightly-different-and-all-broken pipelines.
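For the central side of that pattern, here is a minimal sketch of what a callable workflow might look like. The file contents are an assumption: the input names (environment, app-name) simply mirror the caller used in this article, and the deploy step is a placeholder rather than a real deployment.

```yaml
# Hypothetical contents of your-org/devops-workflows/.github/workflows/ssh-deploy.yml
name: SSH Deploy (Reusable)

on:
  workflow_call:              # makes this workflow callable from other workflows
    inputs:
      environment:
        description: Deployment environment name
        required: true
        type: string
      app-name:
        description: Name of the service being deployed
        required: true
        type: string

jobs:
  deploy:
    runs-on: ubuntu-latest
    environment: ${{ inputs.environment }}   # applies that environment's protection rules
    steps:
      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683
      - name: Announce the deploy target (placeholder for the real deploy steps)
        env:
          APP_NAME: ${{ inputs.app-name }}
          TARGET_ENV: ${{ inputs.environment }}
        run: echo "Deploying $APP_NAME to $TARGET_ENV"
```

Inputs are passed to the shell through env rather than interpolated into the script, which keeps a caller-supplied string from being executed as shell code.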
# io.thecodeforge: GitHub Actions Matrix and Cache
#
# This workflow demonstrates a build matrix: running tests across multiple
# Node.js versions and OS combinations simultaneously.
# It also shows manual cache control for fine-grained cache invalidation.
name: Cross-Platform Test Matrix

on:
  pull_request:
    branches:
      - main

jobs:
  test-matrix:
    name: "Node ${{ matrix.node-version }} / ${{ matrix.os }}"
    # ↑ GitHub uses this as the job label in the UI, which makes failures obvious at a glance
    strategy:
      matrix:
        os: [ubuntu-latest, windows-latest]   # Run on both Linux and Windows
        node-version: ['18', '20']            # And on both Node 18 and 20
        # This creates 2 × 2 = 4 parallel jobs automatically
      fail-fast: false
      # ↑ IMPORTANT: Without this, if Node 18/Linux fails, GitHub cancels
      # the other 3 jobs immediately. Set fail-fast: false to see ALL results.
    runs-on: ${{ matrix.os }}   # Each job uses the OS from its matrix slot
    steps:
      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683
      - name: Set up Node.js ${{ matrix.node-version }}
        uses: actions/setup-node@49933ea5288caeca8642d1e84afbd3f7d6820020
        with:
          node-version: ${{ matrix.node-version }}
          # We're NOT using the built-in cache here. We'll manage it manually
          # to show you exactly what's happening under the hood
      - name: Get npm cache directory
        id: npm-cache-dir
        shell: bash
        # The npm cache lives in a different place on Linux and Windows, so ask npm for it
        run: echo "dir=$(npm config get cache)" >> "$GITHUB_OUTPUT"
      - name: Cache npm downloads
        uses: actions/cache@0c45773b623bea8c8e75f6c82b208c3cf94ea4f9   # SHA-pinned
        with:
          # Cache npm's download cache, not node_modules: npm ci deletes node_modules anyway
          path: ${{ steps.npm-cache-dir.outputs.dir }}
          # Cache key = OS + Node version + hash of package-lock.json
          # If ANY of those change, the cache is invalidated and rebuilt
          key: ${{ runner.os }}-node-${{ matrix.node-version }}-${{ hashFiles('package-lock.json') }}
          # Fallback: if exact key not found, try a key from the same OS+version
          # This restores a slightly stale cache and npm ci tops it up, faster than a cold install
          restore-keys: |
            ${{ runner.os }}-node-${{ matrix.node-version }}-
      - name: Install dependencies
        run: npm ci
        # On an exact cache hit, npm ci installs from the local cache without downloading
        # On a partial hit or a miss, it downloads what is missing and the cache is saved after the job
      - name: Run tests
        run: npm test

  # ── Reusable Workflow Call: deploy using a shared template ──────────────────
  # Instead of writing the deploy steps here, we call a workflow defined
  # in a central devops repo. All 12 microservices call this same template.
  deploy-via-shared-template:
    name: Deploy (Shared Workflow)
    needs: test-matrix
    uses: your-org/devops-workflows/.github/workflows/ssh-deploy.yml@main
    # ↑ References a reusable workflow in another repo, pinned to the main branch
    with:
      environment: staging
      app-name: 'user-service'
    secrets: inherit
    # ↑ 'inherit' passes all secrets from the calling workflow to the reusable one
    # Without this (or an explicit secrets map), the reusable workflow has no access to any secrets
Running 4 parallel jobs:
✓ Node 18 / ubuntu-latest (52s) — 48 passed
✓ Node 20 / ubuntu-latest (49s) — 48 passed
✓ Node 18 / windows-latest (1m 4s) — 48 passed
✗ Node 20 / windows-latest (58s) — 47 passed, 1 FAILED
✗ test/fileUtils.test.js — path separator mismatch (\ vs /)
Note: fail-fast: false allowed the other 3 jobs to complete.
Without it, all 4 would have been cancelled on first failure.
Deploy (Shared Workflow): skipped — test-matrix did not fully pass.
- strategy.matrix expands one job definition into N parallel jobs
- fail-fast: true (default) cancels all jobs when one fails — you lose visibility
- fail-fast: false lets all jobs complete — see which configs are broken
- Use fail-fast: true for CI speed. Use fail-fast: false for debugging.
| Feature / Aspect | GitHub Actions | Jenkins |
|---|---|---|
| Setup time | Zero — lives in your repo, GitHub hosts it | Hours — install, configure, maintain a server |
| Config language | YAML in .github/workflows/ | Groovy (Jenkinsfile) or GUI-based |
| Marketplace / plugins | 16,000+ community Actions | 1,800+ plugins (older ecosystem) |
| Cost model | Free tier: 2,000 min/month; then per-minute | Self-hosted = server costs only, no per-minute fee |
| Secrets management | Built-in, org/repo/env scoped with protection rules | Credentials plugin — works but more manual wiring |
| Parallel jobs | Native matrix strategy, simple syntax | Parallel stages in Jenkinsfile — more verbose |
| Audit trail | Workflow run logs tied to git SHA and PR | Build logs separate from code history |
| Best for | Teams already on GitHub wanting zero ops overhead | Large orgs needing on-premise or highly custom pipelines |
🎯 Key Takeaways
- The hierarchy is Workflow → Job → Step — jobs are parallel by default, steps within a job are sequential and share a filesystem. Getting this model wrong is the root of most pipeline bugs.
- Use environment secrets with required reviewers for production deployments — repository-level secrets are accessible to every workflow and every job, which is a credential leak waiting to happen.
- Pin third-party Actions to a commit SHA, not a branch or floating tag — branch-pinning means someone else's commit can break your deploy pipeline without you touching a single file.
- The concurrency key with cancel-in-progress: true is a one-liner that prevents deployment race conditions. Skip it and you'll eventually get two deploys colliding on the same server.
- Caching is the single highest-impact CI speed optimization. Exact cache hit = 3 seconds. Cold install = 2-3 minutes. Key = OS + version + lock file hash.
- Reusable workflows solve the DRY problem at the org level. Define once in a central repo, call from all repos with secrets: inherit.
⚠ Common Mistakes to Avoid
Interview Questions on This Topic
- Q: What's the difference between a job and a step in GitHub Actions, and why does it matter for sharing data between tasks?
- Q: How would you prevent two simultaneous deploys from racing each other in a GitHub Actions workflow?
- Q: A pull request from a forked repository can't access repository secrets. Why is that, and how do you safely run integration tests that need credentials on fork PRs?
- Q: Why should you pin third-party Actions to commit SHAs instead of branches or floating tags? What happens if you don't?
- Q: Explain the three levels of secrets in GitHub Actions. When would you use each level?
- Q: How does caching work in GitHub Actions? What happens if the cache key changes on every run?
Frequently Asked Questions
How much does GitHub Actions cost for private repositories?
GitHub gives every account 2,000 free minutes per month for private repos on the Free plan (3,000 on Team, 50,000 on Enterprise Cloud). Minutes on macOS runners are billed at 10× the Linux rate, and Windows at 2×. Public repositories get free minutes on standard GitHub-hosted runners, which is why most open-source projects use GitHub Actions without a second thought about cost.
Can GitHub Actions deploy to AWS, GCP, or Azure?
Yes — and the recommended approach for cloud providers is OIDC (OpenID Connect) rather than storing long-lived cloud credentials as secrets. With OIDC, your workflow requests a short-lived token directly from the cloud provider for each run. AWS, GCP, and Azure all support this natively. Search the GitHub Marketplace for 'aws-actions/configure-aws-credentials' or 'google-github-actions/auth' for ready-made OIDC Actions.
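As an illustration of the OIDC flow for AWS, the sketch below requests a token and exchanges it for short-lived credentials. The role ARN and region are placeholders, and the example assumes an IAM role that already trusts GitHub's OIDC provider; none of these values come from a real account.

```yaml
# Hypothetical workflow: authenticate to AWS without any stored cloud secret.
name: Deploy With OIDC

on:
  push:
    branches:
      - main

permissions:
  id-token: write    # allows the job to request an OIDC token
  contents: read     # needed for checkout

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683
      - name: Exchange the OIDC token for short-lived AWS credentials
        # Version tag shown for readability; pin to a full commit SHA in real pipelines.
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::123456789012:role/example-deploy-role   # placeholder ARN
          aws-region: us-east-1                                                # placeholder region
      - name: Confirm which identity the job is running as
        run: aws sts get-caller-identity
```

The id-token: write permission is the piece most often forgotten. Without it the job cannot request a token and the credentials step fails.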
What's the difference between `on: push` and `on: pull_request` triggers?
Both fire on code changes, but they run with different permissions. on: push fires after code lands on a branch, and it has full access to repository secrets. on: pull_request fires when a PR is opened or updated. For security, workflows triggered by a fork's PR run with read-only permissions and no access to secrets by default. Use on: pull_request_target only if you genuinely need secrets in a fork PR context, and read the security implications carefully first: it runs with the base repository's secrets, so checking out and executing the fork's code in that workflow can leak them.
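One common way to live with the fork restriction is to skip secret-dependent jobs when the PR comes from a fork and run them after merge instead. A sketch of that guard, assuming a hypothetical test:integration npm script and a hypothetical INTEGRATION_API_TOKEN secret:

```yaml
# Hypothetical PR workflow: unit tests always run, integration tests only for same-repo PRs.
name: PR Checks

on:
  pull_request:

jobs:
  unit-tests:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683
      - run: npm ci
      - run: npm test

  integration-tests:
    # True only when the PR branch lives in this repository, i.e. not a fork.
    if: github.event.pull_request.head.repo.full_name == github.repository
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683
      - run: npm ci
      - run: npm run test:integration                       # hypothetical script name
        env:
          API_TOKEN: ${{ secrets.INTEGRATION_API_TOKEN }}   # hypothetical secret name
```

Fork PRs then show the integration job as skipped rather than failed, which keeps the signal clean for outside contributors.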
How do I pin a GitHub Action to a commit SHA?
Find the commit SHA on the Action's repository (e.g., the commit behind the latest release tag on actions/checkout). Use it in your workflow: uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683. You can add a comment with the version for readability: uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2. To enforce SHA pinning across all workflow files, add a lint check in CI that flags any uses: reference not pinned to a full commit SHA.
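SHA pins go stale unless something proposes updates. A minimal Dependabot configuration for that job might look like the following; the weekly interval is an arbitrary example, not a recommendation from the original text.

```yaml
# Hypothetical file: .github/dependabot.yml
version: 2
updates:
  - package-ecosystem: "github-actions"   # watch Actions referenced in workflow files
    directory: "/"                        # "/" covers the workflows under .github/workflows
    schedule:
      interval: "weekly"                  # example cadence
```

Each update arrives as a pull request, so a pin change gets reviewed like any other code change instead of landing silently.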
What is the concurrency key and when should I use it?
The concurrency key groups workflow runs and optionally cancels in-progress runs when a new one starts. Use it on deployment workflows to prevent race conditions: concurrency: { group: deploy-${{ github.ref }}, cancel-in-progress: true }. Without it, two pushes to main within seconds trigger two simultaneous deploys that race to overwrite the same server.
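Written out as a full workflow, the inline form above could look like this. The deploy step is a placeholder and the workflow name is an assumption.

```yaml
# Hypothetical deploy workflow with a concurrency group at the workflow level.
name: Deploy

on:
  push:
    branches:
      - main

concurrency:
  group: deploy-${{ github.ref }}   # one group per branch ref
  cancel-in-progress: true          # a newer run cancels the one still in progress

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683
      - name: Placeholder deploy step
        run: echo "Deploying commit ${{ github.sha }}"
```

Setting cancel-in-progress: false instead makes a new run wait for the in-progress one to finish. That can be the safer choice when an interrupted deploy would leave the server half-updated.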
Developer and founder of TheCodeForge. I built this site because I was tired of tutorials that explain what to type without explaining why it works. Every article here is written to make concepts actually click.