Senior 5 min · March 17, 2026

Feature Flags: Stale Flag Causes 15-Minute Outage

A stale flag at 100% for 3 months caused a NullPointerException, taking down checkout for 10% of users.

Naren Founder & Principal Engineer

20+ years shipping production infrastructure and CI/CD at scale. Everything here is grounded in real deployments.

Production tested · last updated June 10, 2026 · 1,554 articles, all by Naren
Quick Answer
  • A feature flag is a conditional in your code that controls whether a feature is active.
  • Deploy code with the feature off, then turn it on without a new deployment.
  • Flags enable canary releases, kill switches, A/B tests, and trunk-based development.
  • Use percentage rollouts with consistent hashing to ensure the same user gets the same experience.
  • Flag evaluation latency adds ~1-5ms per check; batch evaluations or use SDK caching to stay under 2ms.
  • Flag debt from unused conditionals is a real maintenance trap — set TTLs and schedule removals.
Definition · ~90s read
What Are Feature Flags?

Feature flags (also called feature toggles) are conditional branches in code that let you turn functionality on or off at runtime without deploying. They exist to decouple deployment from release — you can ship code to production that's dark, then enable it when you're ready.

This solves the fundamental problem that deploying and releasing are not the same thing. Without flags, you're stuck doing big-bang releases or relying on environment-specific branches, both of which increase risk and coordination overhead. The tradeoff is that every flag you add is a permanent complexity tax on your codebase, and stale flags left in production are a leading cause of outages — exactly the scenario this article covers.

In practice, feature flags range from simple boolean checks in an if-statement (backed by environment variables or a config file) to full-blown managed services like LaunchDarkly, Split, or Unleash that provide targeting, gradual rollouts, and real-time toggling. The basic pattern is always the same: read a flag value from some source, then branch on it.

The sophistication comes from how you manage flag state, targeting rules, and — critically — flag lifecycle. A flag that's been on for everyone for six months is not a feature flag anymore; it's dead code waiting to cause a production incident when someone removes the wrong branch.

The ecosystem includes open-source libraries (FFLAGS, Unleash client SDKs) and SaaS platforms that handle flag evaluation at scale. You should NOT use feature flags for every minor UI tweak — that creates flag debt. They're best reserved for risky changes, gradual rollouts, kill switches, and permission gating.

The canonical failure mode is the one described in this article: a stale flag that was supposed to be temporary is left in production long after its rollout finished, then breaks when an unrelated change touches its code path, causing a 15-minute outage. The fix isn't better flag technology; it's disciplined cleanup and treating flags as ephemeral infrastructure.

Plain-English First

Think of a feature flag like a temporary detour sign on a road. It's useful while construction is happening, but if crews forget to remove it, drivers get confused and accidents happen. In software, those forgotten 'signs' are leftover code branches that can crash your application when someone flips the wrong switch.

A stale feature flag left at 100% for three months caused a NullPointerException that took down checkout for 10% of users for 15 minutes. This incident illustrates why feature flags are ephemeral infrastructure, not permanent configuration. The problem isn't the technology—it's the discipline of removing flags once they outlive their purpose.

Why Feature Flags Are a Double-Edged Sword

Feature flags (also called toggles) are conditional branches in code that allow you to turn functionality on or off without deploying new code. The core mechanic is simple: a boolean check at runtime reads a flag value from a configuration source — environment variable, database, or dedicated service — and gates execution accordingly. This decouples deployment from release, letting you ship incomplete features to production safely.

In practice, flags introduce a persistent state dependency into your application. Every flag evaluation adds latency (typically <1ms for local config, 5-50ms for remote evaluation) and, more critically, creates combinatorial complexity. With N flags, you have 2^N possible system states. Teams often forget to remove flags after a rollout completes, leaving dead code paths that accumulate technical debt and obscure the actual control flow.
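To make the combinatorial claim concrete, here is a small sketch that enumerates every on/off combination for a handful of flags. The flag names are hypothetical and chosen only for illustration.

```python
from itertools import product

def flag_states(flag_names):
    """Return every on/off combination for the given flags as a list of dicts."""
    return [dict(zip(flag_names, values))
            for values in product([False, True], repeat=len(flag_names))]

# Hypothetical flag names, for illustration only
flags = ["new_checkout", "ml_recommendations", "promo_banner"]
states = flag_states(flags)

print(len(states))   # 3 flags -> 2**3 = 8 combinations
print(2 ** 20)       # 20 flags -> 1048576 combinations
```

Nobody tests all of these combinations in practice, which is why each flag you remove shrinks the space of untested states by half.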

Use feature flags for canary releases, A/B testing, and kill switches — not for permanent configuration. The real value is in reducing deployment risk: you can roll back a bad feature by flipping a flag instead of reverting a deploy. But every flag you add is a liability. The industry rule of thumb: if a flag has been in production for more than two release cycles without being removed, it's already stale.

Stale Flags Are Silent Killers
A flag left 'on' for six months is not a feature flag; it's dead code with a configuration knob that can be accidentally flipped during an incident.
Production Insight
A team used a feature flag to gate a new payment provider integration. After the rollout, the flag was left enabled but never cleaned up. During a routine config push, the flag was accidentally toggled off, causing all payment requests to fall back to the old provider — which had been decommissioned. The symptom was a 15-minute outage with 100% payment failures and no clear root cause because the flag wasn't in any runbook.
The rule: every feature flag must have an expiration date and an owner. If it's not removed within two releases, it's a liability.
Key Takeaway
Feature flags decouple deployment from release but add persistent state that can fail silently.
Stale flags are the #1 cause of production incidents related to feature flags — remove them aggressively.
Treat every flag as a temporary control, not permanent configuration: set expiration dates and automate cleanup.
[Figure: Feature Flag Lifecycle and Risks, from implementation to stale flag outage. Panels: Feature Flag Service (LaunchDarkly SDK pattern); Flag Types (canary, gradual rollout, kill switch); Stale Flag Accumulation (flag debt from incomplete cleanup); 15-Minute Outage (stale flag causes production failure); Flag Cleanup Process (remove expired flags regularly). Source: thecodeforge.io]

Basic Flag Implementation

The simplest feature flag is just an if statement controlled by an environment variable or a config value. This pattern works for small teams and simple rollouts. For a production grade approach, you need consistent user bucketing — the same user must always see the same experience. A common way is to hash the flag name with the user ID and take modulo 100 to assign a bucket.

Here's the minimal pattern in Python, using the io.thecodeforge namespace for all production packages.

PYTHON
# Package: io.thecodeforge.python.devops

# Simplest possible feature flag — environment variable
import os

def get_recommendations(user_id: int):
    if os.getenv('ENABLE_ML_RECOMMENDATIONS', 'false') == 'true':
        return ml_recommendations(user_id)   # new ML-based system
    else:
        return rule_based_recommendations(user_id)  # old system

# Better: percentage rollout — test on a fraction of users
import hashlib

def is_flag_enabled(flag_name: str, user_id: int, percentage: float) -> bool:
    """Consistently assign users to buckets using hash — same user always gets same result."""
    hash_input = f'{flag_name}:{user_id}'.encode()
    hash_val   = int(hashlib.md5(hash_input).hexdigest(), 16)
    bucket     = (hash_val % 100) + 1  # 1-100
    return bucket <= percentage

# Roll out to 5% of users
def get_checkout_flow(user_id: int):
    if is_flag_enabled('new_checkout', user_id, 5.0):
        return new_checkout_flow(user_id)
    return old_checkout_flow(user_id)
Environment Variable Flags Are Fragile
Using environment variables per flag works for a handful of toggles, but as you scale to hundreds of flags, you need a dedicated flag service with targeting rules and audit trails. Environment variables are also hard to change at runtime without a restart.
Production Insight
Bucket skew is rare but possible: use a well-distributed hash (MD5 or SHA-256) and validate bucket distribution on a sample of users.
A common mistake is hashing the user ID alone without the flag name, so the same users land in the low buckets of every flag and are always first to receive every new feature, which also correlates your experiments.
Rule: always include the flag name in the hash input.
Key Takeaway
Start simple, but plan to migrate to a service before you hit 10 flags.
Consistent hashing is non-negotiable for percentage rollouts.
Test bucket distribution — a biased hash can ruin A/B tests.
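One way to check bucket distribution is to run the same hashing scheme over a sample of user IDs and compare the enabled share with the target percentage. This is a minimal sketch: `bucket_for` and `enabled_share` are illustrative helper names, and the synthetic ID range stands in for a real user sample.

```python
import hashlib
from collections import Counter

def bucket_for(flag_name: str, user_id: int) -> int:
    """Same scheme as is_flag_enabled: hash flag name plus user ID into bucket 1..100."""
    digest = hashlib.md5(f"{flag_name}:{user_id}".encode()).hexdigest()
    return (int(digest, 16) % 100) + 1

def enabled_share(flag_name: str, user_ids, percentage: float) -> float:
    """Fraction of the sample whose bucket falls inside the rollout percentage."""
    user_ids = list(user_ids)
    hits = sum(1 for uid in user_ids if bucket_for(flag_name, uid) <= percentage)
    return hits / len(user_ids)

sample = range(1, 100_001)   # synthetic IDs standing in for a real user sample
share = enabled_share("new_checkout", sample, 5.0)
print(f"{share:.2%} of sampled users enabled (target 5%)")

# Per-bucket counts reveal skew that a single aggregate share can hide
counts = Counter(bucket_for("new_checkout", uid) for uid in sample)
print(f"smallest bucket: {min(counts.values())}, largest bucket: {max(counts.values())}")
```

If the enabled share drifts far from the target, or one bucket is much larger than the others, the rollout and any A/B comparison built on it will be biased.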
When to use a simple flag vs. a flag service
If: Fewer than 5 flags, single environment, small team
Use: Environment variable flags are fine. Keep a checklist to track removal.
If: More than 5 flags or multiple environments (staging, prod)
Use: A dedicated flag service (LaunchDarkly, Unleash) for targeting, audit, and easy management.
If: Need real-time changes (e.g., kill switch)
Use: Flag service with streaming evaluation is necessary. Polling every 30 seconds is too slow for a kill switch.

Feature Flag Service — LaunchDarkly SDK Pattern

When your team needs targeting by user attributes (plan, country, beta group), a dedicated flag service is the way to go. The SDK handles evaluation, caching, and streaming updates. This example shows how to use the LaunchDarkly SDK in Python, evaluating a flag with a rich user context.

PYTHON
# Package: io.thecodeforge.python.devops

# Using a feature flag service (LaunchDarkly, Unleash, Flagsmith)
import ldclient
from ldclient import Context
from ldclient.config import Config

ldclient.set_config(Config(sdk_key='your-sdk-key'))
client = ldclient.get()

# Evaluate a flag for a specific user
def get_dashboard(user):
    # SDK 8+ expects a Context object; earlier SDK versions accepted a plain user dict
    context = (
        Context.builder(str(user.id))
        .name(user.name)
        .set('email', user.email)
        .set('plan', user.subscription_plan)   # target premium users
        .set('country', user.country)          # GDPR rollout by country
        .build()
    )

    # Flag evaluated with user context; targeting rules live in the dashboard
    if client.variation('new-dashboard-v2', context, default=False):
        return render_new_dashboard(user)
    return render_old_dashboard(user)
Use Contextual Defaults
The default parameter in client.variation() is critical. If the flag service is unreachable (network partition, service down), the SDK falls back to this default. Always default to the old/safe behavior — never default to enabling a new feature.
Production Insight
SDK caching can mask stale flag values for up to the cache TTL (often 30 seconds). If you need instant rollback, use a CDN or feature flag proxy that streams updates.
Evaluation context size matters: large user objects (100+ attributes) can add 5-10ms to evaluation time.
Rule: keep context attributes under 20 and use streaming for time-sensitive flags.
Key Takeaway
Always provide a safe fallback default.
Streaming beats polling for kill switches.
Evaluate flags early, pass results down — don't eval inside loops.
Flag SDK evaluation strategy
If: Network latency to flag service > 10ms
Use: A local cache with a short TTL (5-10 seconds) to avoid synchronous network calls on every request.
If: Flag changes need to propagate within seconds
Use: Enable streaming (WebSocket or Server-Sent Events) to push changes, not poll.
If: Single flag evaluated hundreds of times per request (e.g., in a loop)
Use: Evaluate the flag once at the start of the request and pass the result as a parameter. Avoid repeated evaluations.
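The local-cache strategy can be sketched as a thin wrapper around whatever performs the slow lookup. This is an illustration under stated assumptions, not any SDK's built-in cache: `CachedFlagReader` and the `fetch` callable are hypothetical names, and the 5-second TTL is only an example value.

```python
import time
from typing import Callable, Dict, Tuple

class CachedFlagReader:
    """Wraps a slow flag lookup with a per-flag TTL cache."""

    def __init__(self, fetch: Callable[[str], bool], ttl_seconds: float = 5.0,
                 clock: Callable[[], float] = time.monotonic):
        self._fetch = fetch      # hypothetical remote lookup, e.g. an HTTP call
        self._ttl = ttl_seconds
        self._clock = clock      # injectable so tests can control time
        self._cache: Dict[str, Tuple[float, bool]] = {}

    def is_enabled(self, flag_name: str, default: bool = False) -> bool:
        now = self._clock()
        cached = self._cache.get(flag_name)
        if cached is not None and now - cached[0] < self._ttl:
            return cached[1]     # fresh enough: no network call
        try:
            value = bool(self._fetch(flag_name))
        except Exception:
            # Flag service unreachable: serve the last known value, else the safe default
            return cached[1] if cached is not None else default
        self._cache[flag_name] = (now, value)
        return value

calls = []
def fake_fetch(name: str) -> bool:
    calls.append(name)
    return True

reader = CachedFlagReader(fake_fetch, ttl_seconds=5.0)
reader.is_enabled("new-dashboard-v2")
reader.is_enabled("new-dashboard-v2")
print(len(calls))   # 1: the second call is served from the cache
```

The trade-off is visible in the TTL: a longer value saves more network calls but delays how quickly a flag change, including a rollback, reaches this process.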

Types of Feature Flags

Not all feature flags are the same. Pete Hodgson's taxonomy (from his article on martinfowler.com) defines four types: release toggles, experiment toggles, ops toggles, and permission toggles. Release toggles are short-lived: they control rollout of a new feature. Experiment toggles are for A/B tests and should be removed after the experiment ends. Ops toggles are kill switches and circuit breakers, so they must be fast and reliable. Permission toggles (entitlement flags) enable features for specific user segments (e.g., premium plan users) and can live long-term.

Mixing these types leads to confusion. Use naming conventions to distinguish: release_, exp_, ops_, perm_.

PYTHON
# Package: io.thecodeforge.python.devops

# Naming convention for flag types
release_flag_variation = client.variation('release_new_checkout_v3', context, default=False)
experiment_flag_variation = client.variation('exp_checkout_button_color', context, default='blue')
ops_flag_variation = client.variation('ops_disable_payment_gateway', context, default=False)
perm_flag_variation = client.variation('perm_premium_dashboard', context, default=False)
Flag Types as Lifecycle Stages
  • Release flags: live 1 day – 2 weeks. Remove once rollout reaches 100%.
  • Experiment flags: live for the duration of the experiment (days to months). Remove after analysis.
  • Ops flags: live indefinitely but must be easy to toggle and have monitoring.
  • Permission flags: live indefinitely, but should be managed by a product config system, not a feature flag tool.
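The naming convention and the lifetimes above can be combined into a simple policy check. In this sketch, the 14-day limit follows the release-flag guidance in this section, while the 90-day experiment limit and the helper names are assumptions for illustration.

```python
from datetime import date, timedelta

# Maximum age per prefix; None means "no fixed expiry, review periodically".
# The 90-day experiment limit is an illustrative assumption.
MAX_AGE_BY_PREFIX = {
    "release_": timedelta(days=14),
    "exp_": timedelta(days=90),
    "ops_": None,
    "perm_": None,
}

def flag_type(flag_name: str) -> str:
    """Derive the flag type from its name prefix, rejecting unprefixed names."""
    for prefix in MAX_AGE_BY_PREFIX:
        if flag_name.startswith(prefix):
            return prefix.rstrip("_")
    raise ValueError(f"flag {flag_name!r} has no recognised type prefix")

def is_overdue(flag_name: str, created_on: date, today: date) -> bool:
    """True if the flag has outlived the maximum age for its type."""
    max_age = MAX_AGE_BY_PREFIX[flag_type(flag_name) + "_"]
    return max_age is not None and today - created_on > max_age

print(flag_type("release_new_checkout_v3"))                                           # release
print(is_overdue("release_new_checkout_v3", date(2026, 1, 1), date(2026, 3, 1)))      # True
print(is_overdue("ops_disable_payment_gateway", date(2025, 1, 1), date(2026, 3, 1)))  # False
```

Rejecting names without a prefix is a deliberate choice here: a flag nobody classified is a flag nobody has a removal plan for.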
Production Insight
Permission flags in a feature flag service create a hidden dependency — if the service goes down, all premium users lose access.
Ops flags must have a dashboard button for emergency toggling, not a CLI command that takes 5 minutes to find.
Rule: never use a feature flag service for permanent permissions — use a role-based access control (RBAC) system instead.
Key Takeaway
Name flags by type to avoid confusion.
Permanent permissions don't belong in feature flag tools.
Ops flags need monitoring and a dashboard toggle.
Which flag type should you use?
If: Rolling out a new feature to all users gradually
Use: A release flag. Plan to remove it within 2 weeks of reaching 100% rollout.
If: Testing two versions of a UI element to measure engagement
Use: An experiment flag. Ensure proper sample size calculation and statistical rigor.
If: Need to instantly disable a misbehaving API call
Use: An ops flag. Make sure the flag evaluation is fast (<1ms) and the toggle is available in a dashboard.
If: Show a feature only to paying users
Use: A permission flag, but implement it via a user attribute lookup (database or auth token) rather than a feature flag SDK.

Canary Releases and Gradual Rollout with Flags

Canary releases are about routing a percentage of traffic to a new version of the service at the infrastructure level (e.g., Kubernetes canary deployments). But feature flags can enhance canaries by allowing you to target specific user segments within the canary pod. For example, you deploy the new version to 5% of pods, then use a feature flag to only enable the new feature for 10% of users hitting those pods. This gives you fine-grained control.

This pattern is common at large scale: you canary the deployment at the pod level, and inside the pod, use a flag to limit exposure further. This reduces blast radius if the new version has a bug — only a subset of the canary group sees the broken code.

PYTHON
# Package: io.thecodeforge.python.devops

# Canary with feature flag: even if the pod receives traffic, only a fraction of users get the new feature
import hashlib

def compute_bucket(user_id, flag_name, total_percent):
    """Return True if the user's bucket (1-100) falls within total_percent."""
    hash_val = int(hashlib.md5(f'{flag_name}:{user_id}'.encode()).hexdigest(), 16)
    return (hash_val % 100) + 1 <= total_percent

# Canary: 5% of pods run new code, but only 20% of users on those pods get the feature
# That's effectively 1% of total users
def get_recommendations(user_id):
    if compute_bucket(user_id, 'new_recommendation_v2', 20):
        # This branch only runs in the canary pods, the only ones carrying this code
        return new_recommendation_system(user_id)
    return old_system(user_id)
Hybrid Canary vs Pure Flag Canary
Pure flag canary: deploy the new code to all pods but turn the flag off. Then gradually increase the flag percentage. This is simpler but uses more resources (both old and new code paths are always loaded). Hybrid is safer for risky changes because the new code is only present in a subset of pods.
Production Insight
If you only use flags for canary, you must ensure the flag evaluation does not add noticeable latency. Use a local cache or a fast evaluation path.
Monitoring the canary: you need separate dashboards for the canary group vs the control group. Use the flag context to tag traces and metrics.
Rule: always run a canary for at least 10 minutes before ramping up. Watch error rates, latency, and business metrics.
Key Takeaway
Hybrid canary = pod-level + flag-level control for maximum safety.
Monitor the canary group separately — don't mix metrics with the control group.
Have a kill switch ops flag ready before starting the canary.
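Monitoring the canary group separately implies some rule for deciding when it is safe to ramp up. The following is a rough sketch of such a gate, assuming you can pull request and error counts per group from your metrics system; the thresholds and names are illustrative, and a simple rate comparison like this is not a substitute for a proper statistical test.

```python
from dataclasses import dataclass

@dataclass
class GroupStats:
    requests: int
    errors: int

    @property
    def error_rate(self) -> float:
        return self.errors / self.requests if self.requests else 0.0

def should_ramp_up(canary: GroupStats, control: GroupStats,
                   min_requests: int = 1000, max_excess: float = 0.005) -> bool:
    """Allow the next rollout step only if the canary has seen enough traffic
    and its error rate is not meaningfully worse than the control group's."""
    if canary.requests < min_requests:
        return False   # not enough data yet; keep waiting
    return canary.error_rate - control.error_rate <= max_excess

print(should_ramp_up(GroupStats(5000, 10), GroupStats(95000, 180)))    # True
print(should_ramp_up(GroupStats(5000, 150), GroupStats(95000, 180)))   # False
print(should_ramp_up(GroupStats(200, 0), GroupStats(95000, 180)))      # False
```

The last case matters as much as the second: a canary with almost no traffic tells you nothing, so the gate refuses to ramp even though the error count is zero.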
Canary strategy: flag only vs hybrid
If: Low-risk change (UI change, non-critical path)
Use: Pure flag canary: deploy to all pods, enable flag for 1% of users first.
If: High-risk change (database schema migration, payment logic)
Use: Hybrid canary: deploy to 5% of pods, then enable flag for 10% of users within those pods.
If: Need to roll back instantly for a critical bug
Use: An ops flag alongside the canary. Turn the ops flag on to immediately disable the new code path, even if the flag percentage is high.

Managing Flag Debt and Cleanup

Flag debt is the accumulation of stale conditionals in your code. Every flag that is no longer needed but still present forces your team to maintain two paths. Over time, the old path can break silently because it's rarely tested. The solution is to make flags ephemeral: set a removal date when you create the flag, automate reminders, and schedule cleanup as part of your sprint cycle.

A good rule: if a release flag has been at 100% for more than two weeks, it must be removed. For experiment flags, remove after the experiment analysis is complete — don't keep them 'just in case'. Ops flags and permission flags are exceptions, but they should be reviewed quarterly.

PYTHON
# Package: io.thecodeforge.python.devops

# Example: automate flag cleanup detection in CI
# This would be a script that checks git for old flag references

import subprocess
import re

FLAG_PATTERN = r'client\.variation\([\'"]([\w-]+)[\'"]'

def find_old_flags(months: int = 3):
    # Get all flags used in codebase
    result = subprocess.run(['grep', '-roPh', FLAG_PATTERN, 'src/'], capture_output=True, text=True)
    flags = set(re.findall(FLAG_PATTERN, result.stdout))
    # Check each flag's metadata (would use API in real life)
    # For now, just list them
    return flags

# In CI, warn if a release flag is older than 2 weeks
# This helps reduce flag debt
Flag Debt Causes Real Outages
A stale flag with a code path that is never exercised can break when a refactoring touches the old code. The outage in this article's production incident happened exactly this way. Treat flag cleanup as a security practice.
Production Insight
Automation is key: add a lint rule that flags any client.variation() call for a flag that has been at 100% rollout for more than 2 weeks.
Manual audits every quarter are better than nothing but often get skipped.
Rule: when you create a flag, create a corresponding JIRA ticket with a due date for removal.
Key Takeaway
Create flags with an expiry date.
Automate flag debt detection in CI.
If a flag is at 100% for more than 2 weeks, schedule its removal now.
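The "100% for more than 2 weeks" rule can be checked mechanically once you have rollout metadata for each flag. The sketch below assumes that metadata has already been exported from your flag service; the `FlagRecord` shape and the sample records are hypothetical.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional

@dataclass
class FlagRecord:
    name: str
    rollout_percent: int
    fully_rolled_out_on: Optional[date]   # date the flag first reached 100%, if ever

def stale_release_flags(records: List[FlagRecord], today: date,
                        max_days_at_100: int = 14) -> List[str]:
    """Names of release flags that have sat at 100% longer than the limit."""
    limit = timedelta(days=max_days_at_100)
    return sorted(
        r.name for r in records
        if r.name.startswith("release_")
        and r.rollout_percent == 100
        and r.fully_rolled_out_on is not None
        and today - r.fully_rolled_out_on > limit
    )

# Hypothetical metadata export
records = [
    FlagRecord("release_new_checkout_v3", 100, date(2026, 1, 5)),
    FlagRecord("release_search_v2", 100, date(2026, 3, 10)),
    FlagRecord("release_profile_page", 25, None),
]
print(stale_release_flags(records, today=date(2026, 3, 17)))
# ['release_new_checkout_v3']
```

In CI, a non-empty result could fail the build or post a warning, depending on how strictly the team wants to enforce cleanup.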
Flag cleanup priority
If: Release flag at 100% for > 2 weeks
Use: High priority: remove within next sprint. The old code path is dead and should be deleted.
If: Experiment flag ended > 1 month ago
Use: Medium priority: remove after analysis report is finalized. Keep the winning variant, delete the rest.
If: Ops flag never toggled in 6 months
Use: Low priority but review: is this ops flag still needed? If not, remove it to simplify the codebase.

Flag-Driven Development Is a Testing Trap Without Kill Switches

Most teams add feature flags for gradual rollouts but forget the most critical flag: the kill switch. A kill switch is a flag that disables an entire feature category instantly — no dashboard login, no targeting rule tweak, just off. Without one, a broken feature that passes canary at 5% might kill your p99 latency at 25%. I've seen teams scramble to redeploy because their feature flags only controlled visibility, not execution. Kill switches live at the infrastructure level — environment variables or static configs loaded at startup — not in your feature management SDK. They should be toggleable from your CI/CD pipeline or a simple file change, not a slow API call. Netflix calls this the 'circuit breaker for features.' You need it. Because when your new checkout flow accidentally charges customers twice, you don't want to debug targeting rules — you want that code path dead in 10 seconds.

kill_switch.py · PYTHON
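# io.thecodeforge.feature-flags.kill-switch
import logging
import os

logger = logging.getLogger(__name__)

# Read once at import time: changing the kill switch requires a process restart
KILL_SWITCH_CHECKOUT = os.getenv("KILL_SWITCH_CHECKOUT", "false").lower() == "true"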

def process_checkout(user, cart):
    if KILL_SWITCH_CHECKOUT:
        logger.warning("Checkout kill switch active — falling back to legacy path")
        return legacy_checkout(user, cart)
    
    if feature_flags.is_enabled("new-checkout"):
        return new_checkout(user, cart)
    return legacy_checkout(user, cart)
Output
SET KILL_SWITCH_CHECKOUT=true → all checkout traffic routed to legacy path regardless of feature flag state
Production Trap:
I've debugged outages where a 'gradual rollout' flag had complex targeting rules that failed under load — but the kill switch (a simple env var) would have saved us. Always deploy kill switches at the infrastructure layer, not inside your feature flag SDK. SDKs can timeout or fail to evaluate under pressure.
Key Takeaway
Every feature flag needs a FUBAR-switch — a boolean kill switch at the config level that doesn't depend on your feature management provider.
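An environment variable read at import time only changes when the process restarts. If you want the "simple file change" variant instead, one possible sketch is a marker file whose mere existence activates the switch. `FileKillSwitch` and the path mentioned in the comment are hypothetical names used for illustration.

```python
import tempfile
from pathlib import Path

class FileKillSwitch:
    """Kill switch that is active whenever a marker file exists."""

    def __init__(self, marker: Path):
        self._marker = marker

    def is_active(self) -> bool:
        # One filesystem check per call; no network and no SDK involved
        return self._marker.exists()

# Demo in a temporary directory. In production the marker would live at a fixed,
# agreed location (hypothetical example: /etc/myapp/kill_checkout).
with tempfile.TemporaryDirectory() as tmp:
    marker = Path(tmp) / "kill_checkout"
    switch = FileKillSwitch(marker)
    print(switch.is_active())   # False: marker absent
    marker.touch()
    print(switch.is_active())   # True: marker present, no restart needed
```

The design trade-off is that the file must be created on every host or mounted from shared configuration, so this suits setups where a pipeline or config volume can write to all instances quickly.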

Feature Flags Don't Replace Contract Tests — They Expose Missing Ones

Teams often think feature flags let them skip contract testing because they can 'just turn it off' if something breaks. That's dangerously wrong. A flag hides the UI or the code path, but your services still need to handle the new data shapes, the new API responses, the new database schema. I've watched a team spend two hours rolling back a flag-enabled feature because the old checkout service started receiving new payloads from a misconfigured flag that targeted the wrong user segment. The fix isn't more flags — it's consumer-driven contract tests (CDCTs) that validate both branches of every flag. Write Pact tests that verify the old path behaves correctly when the flag is off AND the new path when the flag is on. Every time you add a flag, you double your testing surface. Cover it with contracts, not hope.

contract_test.py · PYTHON
# io.thecodeforge.feature-flags.contract-tests
# Sketch against the pact-python v1 API. request_checkout is assumed to be the
# consumer's own HTTP client helper, pointed at the Pact mock service.
from pact import Consumer, Provider

pact = Consumer('CheckoutFrontend').has_pact_with(Provider('CheckoutService'))

# Test both flag states; the provider state names the flag state under test
def test_legacy_contract():
    expected = {'total': 29.99, 'items': [...]}
    (pact
     .given('new-checkout flag is off')
     .upon_receiving('legacy checkout request')
     .with_request('POST', '/checkout', body={'cart_id': 'abc'})
     .will_respond_with(200, body=expected))
    with pact:
        result = request_checkout({'cart_id': 'abc', 'flag_override': False})
        assert result == expected

def test_new_contract():
    expected = {'total': 29.99, 'items': [...], 'promo_applied': True}
    (pact
     .given('new-checkout flag is on')
     .upon_receiving('new checkout request')
     .with_request('POST', '/checkout', body={'cart_id': 'abc'})
     .will_respond_with(200, body=expected))
    with pact:
        result = request_checkout({'cart_id': 'abc', 'flag_override': True})
        assert result == expected
Output
Running 2 consumer tests... PASS - legacy contract | PASS - new contract → both flag states validated at CI
Senior Dev Insight:
The microservices teams I consult with now require a Pact test for every new feature flag. If your flag introduces a new API field, that contract test is your canary. When a downstream service doesn't get the memo, the test fails before production does.
Key Takeaway
A feature flag without a contract test for both the on and off states is just technical debt waiting to derail a release.
Production incident · POST-MORTEM · severity: high

Flag That Never Died: A Stale Flag Causes a 15-Minute Production Outage

Symptom
Starting at 14:32 UTC, 10% of users received HTTP 500 errors on the checkout page. The error rate climbed to 50% within 4 minutes. The team was not deploying — the system degraded on its own.
Assumption
The on-call engineer assumed a recent deployment caused the issue and initiated a rollback. The rollback did not fix the problem because the underlying flag evaluation code was months old.
Root cause
A feature flag from a previous quarter's experiment was never removed. After marking the flag as '100% rollout', the team stopped tracking it. Three months later, a refactor of the recommendation service broke the flag's evaluation path — the flag was still evaluated for every request, and the missing method threw a NullPointerException.
Fix
1. Identified the failing flag via exception stack traces pointing to flag evaluation code.
2. Turned the flag OFF globally to restore the stable fallback path.
3. Removed the flag from the codebase permanently.
4. Added a monitoring alert for any flag that remains at 100% rollout for more than 2 weeks.
Key lesson
  • Short-lived flags must die on a schedule — never let a rollout flag live past 2 weeks at 100%.
  • Flag evaluation should never throw: always provide a safe default and catch evaluation errors gracefully.
  • Monitor flag usage: alert when a flag has been at 100% for more than 14 days.
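The second lesson, that flag evaluation should never throw, can be sketched as a small wrapper around any evaluation call. `safe_variation` and the flag name are hypothetical, and the failing callable merely simulates the kind of breakage described in the root cause.

```python
import logging
from typing import Any, Callable

logger = logging.getLogger("flags")

def safe_variation(evaluate: Callable[[], Any], default: Any, flag_name: str) -> Any:
    """Run a flag evaluation and fall back to `default` if it raises."""
    try:
        return evaluate()
    except Exception:
        # Log with the stack trace so the broken flag is still visible to on-call
        logger.exception("flag %s failed to evaluate; using default %r", flag_name, default)
        return default

def broken_evaluation():
    # Simulates an evaluation path broken by an unrelated refactor
    raise AttributeError("recommendation service method was removed")

print(safe_variation(broken_evaluation, default=False, flag_name="exp_recs_v2"))   # False
print(safe_variation(lambda: True, default=False, flag_name="exp_recs_v2"))        # True
```

Swallowing the exception is only safe because it is logged: the wrapper keeps checkout alive, while the log entry is what tells the team a stale flag needs removing.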
Production debug guide
Common symptoms and actions to identify and fix flag-related problems in production
Symptom · 01
A/B test shows no significant difference — or worse, both groups show the same behaviour.
Fix
Verify flag assignment is consistent: check the hash function and user identifier. Use a deterministic hash (e.g., MD5 of flag_name + user_id). Test that the same user always gets the same variant across restarts.
Symptom · 02
New feature suddenly visible to all users, even though rollout percentage is set to 5%.
Fix
Check for a default value override. Many SDKs use a default of 'false', but if the default was inadvertently set to 'true' or if the flag service is unreachable and the SDK falls back to 'true', all users see it. Inspect the SDK configuration and fallback logic.
Symptom · 03
Flag evaluation is slow (~50ms+ per check), causing API latency spikes.
Fix
Check if flag evaluation is making a network call per request. Most SDKs cache flag results locally. Ensure caching is enabled and the TTL is appropriate (1-30 seconds). If using a custom service, add a local cache with a short TTL to absorb load.
Symptom · 04
Rolling back a flag does not immediately fix a production issue — users still see the broken behaviour.
Fix
Verify that the flag's state change propagated to all application instances. Some SDKs poll the flag service with a delay (e.g., 30 seconds). Use streaming or webhooks for near-instant propagation. Check the application logs to confirm the new flag value was fetched.
Feature Flag Debug Cheat Sheet
Quick commands and checks to diagnose flag-related problems in production.
User sees wrong experience
Immediate action
Check the flag evaluation result for that user in the SDK logs or dashboard.
Commands
`curl -X GET "https://flags.example.com/eval?flag=my-feature&user=user123"`
`kubectl logs pod/my-app-pod | grep "flag_eval" | tail -50`
Fix now
If the flag service is down, override the default in your application environment variable: export MY_FEATURE_FLAG=false and restart the pod.
Rollback not taking effect
Immediate action
Force a flag refresh by bouncing the pod or hitting the SDK's refresh endpoint.
Commands
`kill -HUP $(pgrep my-app)` (if the app reloads flags on SIGHUP)
`kubectl rollout restart deployment/my-app`
Fix now
As a fallback, remove the flag code path from the repository and redeploy. This is heavy but guarantees the broken code is gone.
Latency spikes on page load
Immediate action
Check if flag evaluations are blocking the main thread (synchronous SDK calls).
Commands
`curl -w "%{time_total}\n" -o /dev/null -s "https://myapp.com/api/checkout"`
`jstack $(pgrep -f my-app) | grep "FlagClient"`
Fix now
Switch to an async SDK or batch flag evaluations. Set a cache with a short TTL (e.g., 5 seconds) and use a local fallback.
Feature Flag Types Comparison
Type | Lifecycle | Example | Removal Policy
Release Toggle | Short-lived (days to weeks) | Deploy new checkout flow | Remove at 100% rollout + 2 weeks
Experiment Toggle | Medium-lived (days to months) | A/B test button color | Remove after experiment analysis
Ops Toggle | Long-lived (indefinite) | Kill switch for payment gateway | Review quarterly, monitor usage
Permission Toggle | Long-lived (indefinite) | Show premium feature | Use RBAC instead if possible

Key takeaways

1. Feature flags decouple deployment from release: ship code dark, turn it on when ready.
2. Use consistent hashing for percentage rollouts: the same user always gets the same experience.
3. Kill switches are flags with an immediate-off capability: essential for production safety.
4. Short-lived flags for releases and experiments; long-lived flags for operational controls and permissions.
5. Flag debt is real: remove flags after rollout is complete or the experiment ends.
6. Always set a safe default in the evaluation call (old behavior).
7. Plan flag removal at creation time: set a TTL and schedule a cleanup ticket.

Common mistakes to avoid

Using environment variables for hundreds of flags

Symptom
Flag management becomes a nightmare: no audit trail, no targeting, no gradual rollout control. A stale env var can linger forever.
Fix
Migrate to a dedicated feature flag service once you have more than 5 flags. Start with a SaaS tool like LaunchDarkly or open-source Unleash.
Not providing a safe default in the evaluation call

Symptom
If the flag service goes down, the SDK returns the default. If the default is True, your new feature becomes enabled for everyone — potentially exposing unstable code or causing a crash.
Fix
Always set the default parameter to the old/stable behavior. In LaunchDarkly: client.variation('flag-key', context, default=False) where False means old code path.
Evaluating flags inside loops or hot code paths

Symptom
Even with caching, evaluating a flag 1,000 times in a single request adds up: at roughly 0.5ms per evaluation, that is 500ms or more of added latency. This kills page load times and increases CPU usage.
Fix
Evaluate the flag once per request at the entry point (e.g., middleware or controller) and pass the result as a parameter to downstream functions.
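A minimal sketch of the evaluate-once pattern: resolve the flag at the request entry point, freeze the result, and hand it to the code that loops. The flag name, the `RequestFlags` container, and the pricing logic are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass(frozen=True)
class RequestFlags:
    """Flag values resolved once at the request entry point."""
    new_pricing: bool

def resolve_flags(is_enabled: Callable[[str], bool]) -> RequestFlags:
    # The only place in the request where the flag lookup runs
    return RequestFlags(new_pricing=is_enabled("release_new_pricing"))

def price_items(prices: List[float], flags: RequestFlags) -> List[float]:
    # The loop reads a plain boolean; no flag evaluation happens per item
    factor = 0.9 if flags.new_pricing else 1.0
    return [round(p * factor, 2) for p in prices]

evaluations = []
def counting_lookup(name: str) -> bool:
    evaluations.append(name)
    return True

flags = resolve_flags(counting_lookup)
print(price_items([10.0, 20.0, 30.0], flags))   # [9.0, 18.0, 27.0]
print(len(evaluations))                         # 1
```

Besides saving latency, this keeps a request internally consistent: a flag that flips mid-request cannot send half the items down one path and half down the other.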
Keeping experiment flags after analysis is complete

Symptom
Dead code branches accumulate, making the codebase harder to navigate and increasing the risk of bugs in untested paths.
Fix
Set a TTL on the flag in the dashboard. When the experiment ends, schedule a cleanup ticket. Run a regular (e.g., monthly) flag audit.
INTERVIEW PREP · PRACTICE MODE

Interview Questions on This Topic

Q01 · JUNIOR
What is a feature flag and what problems does it solve?
Q02 · SENIOR
How do you ensure a user consistently gets the same experience with a pe...
Q03 · SENIOR
What is flag debt?
Q04 · SENIOR
Describe the four types of feature flags and when to use each.
Q01 of 04 · JUNIOR

What is a feature flag and what problems does it solve?

ANSWER
A feature flag is a conditional toggle that controls whether a feature is active at runtime. It decouples deployment from release, so you can ship code to production without making it visible to users. This solves: 1) risk — you can roll back instantly by flipping a flag instead of redeploying, 2) gradual rollout — you can expose a feature to 1% of users first, 3) A/B testing — you can run experiments, and 4) trunk-based development — developers can merge incomplete features without breaking main.
FAQ · 4 QUESTIONS

Frequently Asked Questions

01. What is the difference between a canary release and a feature flag?
02. What is flag debt and how do you manage it?
03. Do feature flags add latency to requests?
04. Should I use environment variables or a dedicated service for feature flags?