CI/CD Silent Failure — Expired Docker Credentials
Docker registry credentials expired: the push step exited with code 0, but the image was never uploaded.
- CI automatically builds and tests code on every push
- Continuous Delivery produces a deployable artifact after every successful CI run
- Continuous Deployment auto-deploys to production
- Pipelines catch bugs early, reduce deployment risk
- Aim for a pipeline under 10 minutes; slower feedback loses most of its value
- A failing pipeline that doesn't alert is a silent time bomb
CI/CD is like having an automated quality check for your code. Every time you make a change, the system automatically checks if it works and prepares it for release, so you don't have to remember all the steps.
CI/CD is the backbone of modern software delivery. It replaces the old model of big-bang releases with a continuous flow of small, validated changes. The core idea: every code change goes through an automated pipeline that builds, tests, and — optionally — deploys it. This isn't a luxury; teams that skip CI/CD spend 2x to 3x more time resolving integration conflicts and debugging production failures. Here's the catch: a poorly designed pipeline can be worse than no pipeline — it can give false confidence. This guide covers what CI/CD actually means, how to build one that works, and the production failures you'll face if you don't.
A GitHub Actions CI Pipeline
This GitHub Actions workflow runs on every push to main/develop and on pull requests. It sets up Python, caches pip, installs dependencies, runs ruff linter, runs pytest with coverage, and uploads results. The cache step is critical: without it, each pipeline run downloads all packages from scratch, turning a 2-minute install into a 10-minute one.
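The caching idea can be sketched in a few lines. CI caches are usually keyed on a hash of the dependency files (GitHub Actions exposes this as the `hashFiles` expression), so the cache is reused until a dependency actually changes. The function below is a conceptual illustration only, not GitHub's implementation; the name `cache_key` and the `pip` prefix are assumptions made for the example.

```python
import hashlib
from pathlib import Path


def cache_key(prefix: str, *dependency_files: Path) -> str:
    """Build a cache key that changes only when a dependency file changes.

    Hypothetical helper for illustration; real CI systems compute this for you.
    """
    digest = hashlib.sha256()
    for path in sorted(dependency_files):
        # Include the file name so renaming a file also invalidates the key.
        digest.update(path.name.encode())
        digest.update(path.read_bytes())
    return f"{prefix}-{digest.hexdigest()[:16]}"
```

Calling `cache_key("pip", Path("requirements.txt"))` returns the same key on every run until the file's contents change, which is what lets the install step be skipped on most runs.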
Adding CD — Automatic Deployment
The deploy job runs after the test job passes, and only on main. It builds a Docker image, tags it with the commit SHA, pushes it to a registry, then SSHes to a staging server and redeploys using docker compose. This is Continuous Deployment to staging; a human still gates production deployments.
CI/CD Pipeline Stages and Their Purpose
Every CI/CD pipeline follows a set of stages that gate each other. The typical flow: lint → unit test → build → integration test → deploy → smoke test. Each stage acts as a safety net. Lint catches formatting and logic errors fast. Unit tests validate individual functions. Build produces the artifact. Integration tests confirm the artifact works in a real environment. Deploy publishes it. Smoke test verifies the service is alive.
The order matters — you want the fastest checks first so failures are caught early, without wasting time on slower stages.
- Fastest stages first — fail early, waste less compute
- Each stage should be deterministic — same commit always produces same result
- Stages that depend on external services (DB, API) should run integration tests, not just unit tests
- A stage that takes more than 5 minutes is a candidate for parallelisation
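The fail-early ordering described above can be illustrated with a small runner. This is a minimal sketch, not how any particular CI system works internally; `run_stages`, `StageResult`, and the placeholder commands are names invented for the example.

```python
import subprocess
import sys
import time
from dataclasses import dataclass


@dataclass
class StageResult:
    name: str
    returncode: int
    seconds: float


def run_stages(stages: list[tuple[str, list[str]]]) -> list[StageResult]:
    """Run stages in the given order and stop at the first non-zero exit code."""
    results = []
    for name, command in stages:
        started = time.monotonic()
        completed = subprocess.run(command)
        elapsed = time.monotonic() - started
        results.append(StageResult(name, completed.returncode, elapsed))
        if completed.returncode != 0:
            break  # later, slower stages never start
    return results


if __name__ == "__main__":
    # Placeholder commands; a real pipeline would call the linter, tests, build.
    demo = [
        ("lint", [sys.executable, "-c", "print('lint ok')"]),
        ("unit-test", [sys.executable, "-c", "raise SystemExit(1)"]),
        ("build", [sys.executable, "-c", "print('never runs')"]),
    ]
    for result in run_stages(demo):
        print(result.name, result.returncode)
```

Recording the elapsed time per stage also makes it easy to spot stages that have crept past the 5-minute mark.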
Continuous Delivery vs Continuous Deployment — The Trade-off
Continuous Delivery means every successful build produces a deployable artifact, but a human decides when to push it to production. Continuous Deployment goes all the way — each successful build is automatically deployed to production. The choice depends on your risk tolerance and deployment processes.
Continuous Delivery is safer for regulated industries or when you need a manual QA sign-off. Continuous Deployment is faster for teams with strong automated testing and rollback capability. The real question: can you detect and fix a bad deploy in under 5 minutes? If not, start with Delivery, not Deployment.
CI/CD Best Practices for Production Teams
Over years of building and debugging pipelines, these practices separate effective CI/CD from pipelines that cause more harm than good:
- Fast feedback — Keep the pipeline under 10 minutes. Long pipelines discourage frequent pushes. Split long tests into separate workflows or parallelise.
- Idempotent steps — Every step should produce the same result given the same input. Avoid steps that depend on global state or mutable external resources.
- Secrets management — Never hardcode secrets. Use encrypted environment variables (GitHub Actions secrets, GitLab CI variables, etc.) and rotate them regularly.
- Health checks after deploy — Deploy is not complete until the new version responds correctly. Add a curl or similar check in the deploy job.
- Rollback capability — Every deploy should be rollback-able. Tag Docker images with commit SHA so you can redeploy a known-good version.
- Pipeline as code — Version your pipeline definitions alongside your code. This makes changes reviewable and traceable.
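The health-check practice above might look like the following sketch. It assumes the service exposes an HTTP endpoint that returns 200 when healthy; the function names, retry count, and delay are illustrative choices, not values from the original pipeline.

```python
import time
import urllib.request
from typing import Callable


def http_probe(url: str, timeout: float = 3.0) -> Callable[[], bool]:
    """Return a probe that reports True when the URL answers with HTTP 200."""
    def probe() -> bool:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    return probe


def wait_until_healthy(
    probe: Callable[[], bool],
    attempts: int = 5,
    delay_seconds: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Retry the probe a few times; return False if it never succeeds."""
    for attempt in range(1, attempts + 1):
        try:
            if probe():
                return True
        except OSError:
            # Connection refused, timeouts, and HTTP errors from urllib all
            # derive from OSError: the service may still be starting.
            pass
        if attempt < attempts:
            sleep(delay_seconds)
    return False
```

A deploy job could call `wait_until_healthy(http_probe(url))` and, on `False`, redeploy the previous commit-SHA tag and exit non-zero so the failure is visible.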
Common CI/CD Pipeline Failures and How to Debug Them
Even well-designed pipelines fail. The most common failures fall into a few categories:
- Environment drift: Your pipeline uses a Docker image that is updated upstream, breaking your build. Pin base image versions.
- Cache poisoning: An old cache contains corrupt or outdated dependencies. Clear cache periodically.
- Flaky tests: Tests that pass locally but fail in CI due to timing or order dependence. Use --reruns or retry strategy.
- Secret expiration: Tokens or passwords expire. Automate rotation and alert on failure.
- Resource exhaustion: Disk space or memory runs out during build. Add cleanup steps and monitor usage.
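For secret expiration in particular, a scheduled job can warn before a credential lapses instead of after. The sketch below assumes you can obtain each secret's expiry date from wherever it is issued; the secret names, the 14-day window, and the function name are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def expiring_secrets(
    expiry_by_name: dict[str, datetime],
    now: datetime,
    warn_within: timedelta = timedelta(days=14),
) -> list[str]:
    """Return names of secrets that are expired or expire within the window."""
    return sorted(
        name
        for name, expires_at in expiry_by_name.items()
        if expires_at - now <= warn_within
    )


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    # Hypothetical inventory; real expiry dates come from your secret issuer.
    inventory = {
        "registry-token": now + timedelta(days=3),
        "deploy-key": now + timedelta(days=90),
    }
    for name in expiring_secrets(inventory, now):
        print(f"WARNING: {name} expires within 14 days")
```

Running this on a daily schedule and failing the job when the list is non-empty turns an eventual silent failure into an early, visible alert.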
Silent Pipeline Failure: Image Not Found in Registry
- Never assume a push succeeded — verify by pulling and running the artifact in a test container.
- All pipeline steps should have proper exit code handling — don't rely on default behavior.
- Rotate secrets proactively; do not wait for them to expire at 2 AM on a Sunday.
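The first lesson above, verifying the push instead of trusting its exit code, can be sketched as a post-push step. This assumes the `docker` CLI is available on the runner; `image_is_pullable` and `verify_push` are hypothetical helper names. The `run` parameter exists only so the check can be exercised without Docker.

```python
import subprocess
from typing import Callable


def image_is_pullable(
    image: str,
    run: Callable[..., subprocess.CompletedProcess] = subprocess.run,
) -> bool:
    """Return True only if `docker pull` of the exact tag exits with 0."""
    completed = run(["docker", "pull", image], capture_output=True, text=True)
    return completed.returncode == 0


def verify_push(
    image: str,
    run: Callable[..., subprocess.CompletedProcess] = subprocess.run,
) -> None:
    """Fail loudly when the tag that was just pushed cannot be pulled back."""
    if not image_is_pullable(image, run):
        raise RuntimeError(f"{image} was not found in the registry after push")
```

Pulling from a clean runner, or after removing the local copy of the image, gives a stronger guarantee than pulling on the machine that built it. Starting a test container from the pulled image, as the lesson suggests, goes one step further.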
- Run docker system prune -af to free space. Also check whether the build cache is too large, and consider multi-stage builds.
- Run ssh -v user@host. Verify the remote Docker daemon is running. Check that the compose file is valid.
- Add --reruns=2 to the test command. For order-dependent tests, use --shuffle to reproduce. Check for timing issues with external services.
- Check the if condition. Verify the branch name matches. Check that the artifact was actually pushed by looking for registry tags.
Common mistakes to avoid
- Hardcoding secrets in pipeline YAML
- Using depends_on without a healthcheck in Docker Compose for pipelines: use condition: service_healthy in the depends_on in your CI pipeline deployment steps.
- Not pinning base image versions in Dockerfile: an upstream change to python:3.12 breaks your build, and the pipeline fails unpredictably. Pin to a digest such as python:3.12-slim@sha256:abc.... Update intentionally and test.
- Ignoring flaky tests in CI: use --seeds to reproduce order-dependent failures, and add wait strategies for async code.
- Manually managing pipeline deployment steps without rollback: use kubectl rollout undo, or docker-compose pull && docker-compose up -d with the previous version.
Interview Questions on This Topic
What is the difference between CI and CD?