Mid-level 6 min · March 06, 2026

Sidecar Pattern — Probe Loop Took Down Payment Cluster

Payment cluster crashed every 30s: pods went into CrashLoopBackOff because health probes hit Envoy's port before the proxy was ready.

Naren · Founder
Plain-English first. Then code. Then the interview question.
Quick Answer
  • The sidecar pattern co-locates a helper process alongside your main app, sharing the same network namespace
  • Transparent traffic interception via iptables rules redirects all calls through the sidecar without app changes
  • Service meshes (Istio/Envoy) use init containers to install iptables rules before the main app starts
  • Latency cost: roughly 1-10ms added per proxy hop, and each call crosses two proxies (the caller's and the receiver's); CPU overhead is roughly 0.5 vCPU per 1000 RPS per proxy
  • Production gotcha: probes routed through the sidecar create CrashLoopBackOff during startup
  • Biggest mistake: assuming sidecar availability equals application readiness — they have independent lifecycles
Plain-English First

Imagine you're riding a motorcycle and you attach a sidecar to it — a little pod that sits beside you, shares your wheels and road, but does its own job (carries luggage, a passenger, a machine gun if you're in a movie). Your motorcycle doesn't need to know anything about the sidecar. The sidecar just comes along for the ride. In microservices, your main application is the motorcycle. The sidecar is a second process that runs right next to it, handling cross-cutting concerns like logging, security, and traffic management — so your app doesn't have to.

Every production microservices platform eventually hits the same wall: you've got 40 services written in Go, Java, Python, and Node.js, and now someone says 'we need mutual TLS, distributed tracing, and circuit breaking — on all of them, by Friday.' Rewriting cross-cutting infrastructure logic into every service is a nightmare that scales linearly with your team's misery. The sidecar pattern is the architectural answer that lets you bolt that infrastructure onto any service without touching its source code.

The core problem the sidecar solves is language and team heterogeneity. In a polyglot microservices environment, you can't just ship a shared library: different runtimes, different release cycles, different teams that don't want your library's transitive dependencies polluting their build. The sidecar runs as a separate process in the same network namespace as your service, intercepting and augmenting traffic. The application either talks to the sidecar on localhost or, with transparent interception, keeps calling its usual destinations while the kernel redirects the packets. The sidecar handles the rest.

By the end of this article you'll understand exactly how a sidecar process intercepts network traffic using iptables rules (as Istio/Envoy does), how to design your own minimal sidecar in Go, when the pattern pays off versus when it's expensive overkill, and the production gotchas that have caused real outages. You'll also be able to defend architectural decisions involving sidecars in any staff-level system design interview.

How the Sidecar Pattern Actually Works Internally — Network Namespaces and Traffic Interception

The sidecar pattern is fundamentally a process co-location strategy. In Kubernetes, both the main container and the sidecar container share the same Pod, which means they share the same network namespace, the same loopback interface, and the same IP address. This is the key insight that makes transparent interception possible — they're neighbors on the same tiny private network.

Service meshes like Istio take this further. Before your main container starts, an init container runs and installs iptables rules that redirect ALL inbound and outbound TCP traffic through the Envoy sidecar proxy (typically on port 15001 for outbound and 15006 for inbound). Your application doesn't know this is happening. It calls http://payments-service:8080 as normal, but the kernel silently reroutes the packet to Envoy first.

Envoy then applies your configured policies — retries, circuit breaking, mTLS — and forwards the (now possibly encrypted and annotated) request to the real destination. On the receiving end, the destination's Envoy sidecar intercepts the inbound packet, verifies the TLS certificate, extracts trace headers, and only then delivers it to the application on localhost.

This interception model means zero code changes to your application. But it also means every single network call now passes through two additional userspace processes — a cost we'll quantify shortly.
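The redirect decision described above can be modeled in a few lines of Go. This is a minimal sketch, not Istio's implementation: the `packet` struct and the `decide` function are hypothetical simplifications, and only the values 15001, 1337, and 127.0.0.6 are taken from the Istio configuration shown in this article.

```go
package main

import "fmt"

// packet is a hypothetical, simplified view of an outbound TCP packet
// as the nat OUTPUT chain sees it inside the pod's network namespace.
type packet struct {
	srcIP    string // source address of the packet
	ownerUID int    // UID of the process that owns the socket
	dstPort  int    // port the sender asked for
}

const (
	envoyUID           = 1337        // UID the proxy runs as
	envoyPassthroughIP = "127.0.0.6" // address Envoy uses when forwarding to the local app
	envoyOutboundPort  = 15001       // Envoy's outbound listener
)

// decide returns the port the kernel would deliver the packet to
// under this simplified rule set.
func decide(p packet) int {
	// RETURN rules: traffic produced by Envoy itself is left alone,
	// otherwise it would be redirected back into Envoy forever.
	if p.srcIP == envoyPassthroughIP || p.ownerUID == envoyUID {
		return p.dstPort
	}
	// REDIRECT rule: everything else goes to the proxy first.
	return envoyOutboundPort
}

func main() {
	app := packet{srcIP: "10.0.0.12", ownerUID: 1000, dstPort: 8080}
	proxy := packet{srcIP: "10.0.0.12", ownerUID: envoyUID, dstPort: 8080}
	fmt.Println("app traffic delivered to port:", decide(app))
	fmt.Println("envoy traffic delivered to port:", decide(proxy))
}
```

The order matters: the exemptions are checked before the catch-all redirect, which mirrors how the RETURN rules sit ahead of the REDIRECT rule in the chain.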

inspect_sidecar_iptables.sh · BASH
#!/usr/bin/env bash
# ─────────────────────────────────────────────────────────────────────────────
# inspect_sidecar_iptables.sh
# Run this as root inside the pod's network namespace (for example from a
# privileged debug container, or via nsenter on the node) to see how Istio
# redirects traffic through Envoy.
# The chains inspected here are installed by Istio's istio-init container.
# ─────────────────────────────────────────────────────────────────────────────

# Show the ISTIO_OUTPUT chain — handles outbound traffic FROM the application
echo "=== OUTBOUND RULES (ISTIO_OUTPUT chain) ==="
iptables -t nat -L ISTIO_OUTPUT -n --line-numbers -v

# Expected output will show rules like:
#   REDIRECT  tcp  --  anywhere  anywhere  redir ports 15001
# meaning all outbound TCP from non-Envoy processes goes to port 15001 (Envoy)

echo ""
echo "=== INBOUND RULES (ISTIO_INBOUND chain) ==="
iptables -t nat -L ISTIO_INBOUND -n --line-numbers -v

# Expected output will show:
#   REDIRECT  tcp  --  anywhere  anywhere  tcp dpt:8080  redir ports 15006
# meaning inbound traffic to your app's port gets redirected to port 15006

echo ""
echo "=== Envoy's listening ports ==="
# Envoy listens on these ports inside the pod's shared network namespace
ss -tlnp | grep -E '15000|15001|15006|15090'
# 15000 = Envoy admin API (bound to localhost in Istio)
# 15001 = outbound listener
# 15006 = inbound listener (virtual inbound)
# 15090 = Prometheus metrics endpoint
Output
=== OUTBOUND RULES (ISTIO_OUTPUT chain) ===
num pkts bytes target prot opt in out source destination
1 0 0 RETURN all -- * lo 127.0.0.6 0.0.0.0/0
2 0 0 REDIRECT tcp -- * * 0.0.0.0/0 0.0.0.0/0 redir ports 15001
=== INBOUND RULES (ISTIO_INBOUND chain) ===
num pkts bytes target prot opt in out source destination
1 0 0 REDIRECT tcp -- * * 0.0.0.0/0 0.0.0.0/0 tcp dpt:8080 redir ports 15006
=== Envoy's listening ports ===
LISTEN 0 128 127.0.0.1:15000 0.0.0.0:* users:(("envoy",pid=42,fd=21))
LISTEN 0 128 0.0.0.0:15001 0.0.0.0:* users:(("envoy",pid=42,fd=18))
LISTEN 0 128 0.0.0.0:15006 0.0.0.0:* users:(("envoy",pid=42,fd=19))
LISTEN 0 128 0.0.0.0:15090 0.0.0.0:* users:(("envoy",pid=42,fd=20))
Watch Out: The 127.0.0.6 RETURN rule is critical
Istio's iptables rules include a RETURN rule for traffic originating from 127.0.0.6 — the address Envoy itself uses when forwarding to the local application. Without this escape hatch, you'd get infinite redirect loops as Envoy's own forwarded packets get intercepted and redirected back to Envoy. If you're rolling a custom sidecar injection solution, you must replicate this loop-prevention rule or you will get a traffic black hole.
Production Insight
The iptables RETURN rule for 127.0.0.6 is an easy one to miss, and a common reason custom sidecar implementations fail in production.
Teams who replicate Istio's init container logic often miss this rule, causing a redirect loop that brings down all traffic within seconds.
Rule: always exempt the sidecar's own loopback address from redirection.
Key Takeaway
Sidecar interception works by sharing the network namespace and installing iptables rules in an init container.
The application is oblivious: it addresses its peers as usual and the kernel redirects the packets to the proxy.
Without the return rule for the sidecar's own address, you get infinite redirects.

Building a Minimal Sidecar Proxy in Go — Logging and Header Injection Without Touching the App

Understanding a pattern means being able to implement a stripped-down version yourself. Let's build a sidecar that does two things: injects an X-Request-ID trace header into every outbound request, and logs the request/response metadata. The main application talks to this sidecar on localhost:7000, and the sidecar forwards to the real upstream.

This mirrors exactly what a service mesh does, minus the TLS and control plane. Writing this yourself makes the production system legible — you stop treating Envoy as a magic black box.

Note how the sidecar has zero awareness of business logic. It doesn't know whether the upstream is a payments service or a user profile service. It just intercepts, enriches, and forwards. This is the contract the pattern enforces: the sidecar is infrastructure, not application logic. If you find yourself putting business rules into a sidecar, stop — you've broken the pattern.

sidecar_proxy.go · GO
// sidecar_proxy.go
// A minimal sidecar proxy in Go that:
//   1. Listens on localhost:7000 (the port your app talks to)
//   2. Injects an X-Request-ID header if one isn't present
//   3. Logs method, path, upstream status, and latency
//   4. Forwards the request to the real upstream (configured via env var)
//
// Run: UPSTREAM_URL=http://httpbin.org go run sidecar_proxy.go
// Then: curl http://localhost:7000/get

package main

import (
	"fmt"
	"io"
	"log"
	"net/http"
	"os"
	"time"

	"github.com/google/uuid" // go get github.com/google/uuid
)

// upstreamBaseURL is where we actually forward requests to.
// In a real sidecar this comes from service discovery / control plane config.
var upstreamBaseURL string

func main() {
	upstreamBaseURL = os.Getenv("UPSTREAM_URL")
	if upstreamBaseURL == "" {
		log.Fatal("UPSTREAM_URL environment variable is required")
	}

	// The sidecar listens on 7000. The main application is configured to
	// send ALL outbound HTTP through http://localhost:7000.
	// In a real deployment, iptables rules do this transparently.
	mux := http.NewServeMux()
	mux.HandleFunc("/", handleProxyRequest)

	listenAddr := "127.0.0.1:7000"
	log.Printf("[sidecar] proxy listening on %s → forwarding to %s", listenAddr, upstreamBaseURL)

	if err := http.ListenAndServe(listenAddr, mux); err != nil {
		log.Fatalf("[sidecar] failed to start: %v", err)
	}
}

func handleProxyRequest(responseWriter http.ResponseWriter, incomingRequest *http.Request) {
	startTime := time.Now()

	// ── Step 1: Ensure a trace ID exists ─────────────────────────────────────
	// If the application (or an upstream caller) didn't set X-Request-ID,
	// we generate one here. This is a classic sidecar responsibility:
	// the app never needs to know about tracing infrastructure.
	requestID := incomingRequest.Header.Get("X-Request-ID")
	if requestID == "" {
		requestID = uuid.NewString() // e.g. "3f2504e0-4f89-11d3-9a0c-0305e82c3301"
		incomingRequest.Header.Set("X-Request-ID", requestID)
	}

	// ── Step 2: Build the upstream request ───────────────────────────────────
	// We reconstruct the full upstream URL by prepending the configured base.
	// incomingRequest.RequestURI includes path + query string.
	upstreamURL := upstreamBaseURL + incomingRequest.RequestURI

	upstreamRequest, err := http.NewRequestWithContext(
		incomingRequest.Context(), // cancel the upstream call if the caller disconnects
		incomingRequest.Method,
		upstreamURL,
		incomingRequest.Body, // stream the body directly; don't buffer it in memory
	)
	if err != nil {
		log.Printf("[sidecar] ERROR building upstream request: %v", err)
		http.Error(responseWriter, "sidecar: failed to build upstream request", http.StatusBadGateway)
		return
	}

	// Copy all original headers to the upstream request (including our new X-Request-ID)
	for headerName, headerValues := range incomingRequest.Header {
		for _, value := range headerValues {
			upstreamRequest.Header.Add(headerName, value)
		}
	}

	// Identify ourselves in the Via header — helpful for debugging proxy chains
	upstreamRequest.Header.Set("Via", "1.1 sidecar-proxy")

	// ── Step 3: Execute the upstream call ────────────────────────────────────
	httpClient := &http.Client{Timeout: 10 * time.Second}
	upstreamResponse, err := httpClient.Do(upstreamRequest)
	if err != nil {
		log.Printf("[sidecar] ERROR calling upstream: %v", err)
		http.Error(responseWriter, "sidecar: upstream unreachable", http.StatusBadGateway)
		return
	}
	defer upstreamResponse.Body.Close()

	// ── Step 4: Stream the response back to the caller ───────────────────────
	// Copy upstream response headers back to our response
	for headerName, headerValues := range upstreamResponse.Header {
		for _, value := range headerValues {
			responseWriter.Header().Add(headerName, value)
		}
	}
	// Echo the request ID back so the caller can correlate logs
	responseWriter.Header().Set("X-Request-ID", requestID)
	responseWriter.WriteHeader(upstreamResponse.StatusCode)

	bytesWritten, _ := io.Copy(responseWriter, upstreamResponse.Body)

	// ── Step 5: Emit a structured access log ─────────────────────────────────
	// In production you'd encode this as JSON and ship to your log aggregator.
	// The app itself emits zero log lines for this request — the sidecar owns telemetry.
	latencyMs := time.Since(startTime).Milliseconds()
	fmt.Printf(
		"[sidecar] request_id=%s method=%s path=%s status=%d bytes=%d latency_ms=%d\n",
		requestID,
		incomingRequest.Method,
		incomingRequest.URL.Path,
		upstreamResponse.StatusCode,
		bytesWritten,
		latencyMs,
	)
}
Output
[sidecar] proxy listening on 127.0.0.1:7000 → forwarding to http://httpbin.org
[sidecar] request_id=3f2504e0-4f89-11d3-9a0c-0305e82c3301 method=GET path=/get status=200 bytes=412 latency_ms=143
[sidecar] request_id=9b7d2c11-8e01-4a23-bf44-12acde7890ef method=POST path=/post status=200 bytes=638 latency_ms=201
Pro Tip: Never buffer the body in a sidecar proxy
Notice we stream incomingRequest.Body directly into the upstream request rather than reading it all into a []byte first. Buffering kills your memory profile at scale — a 50MB file upload through a sidecar that buffers would double peak memory usage per request. Always use io.Copy or pipe the body reader directly. The only time you need to buffer is when the sidecar must inspect the payload (e.g., for WAF logic), and even then you should enforce a strict size cap.
Production Insight
Streaming the body is essential, but some sidecars need to inspect content (e.g., WAF).
When you must buffer, enforce a strict cap — say 10MB — and reject larger payloads immediately.
Without a cap, a single large upload can OOM the sidecar and take down the whole pod.
Key Takeaway
A sidecar is infrastructure — it should never contain business logic.
Stream the body to avoid memory pressure.
If you must buffer, set a hard size limit.
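When a sidecar does have to buffer a body for inspection, the size cap can be enforced with the standard library's `http.MaxBytesReader`. The following is a minimal sketch under stated assumptions: `inspectHandler` and `postStatus` are hypothetical names, the 16-byte cap is deliberately tiny for demonstration (a real cap might be something like 10MB), and the inspection step itself is a placeholder.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

// inspectHandler buffers the request body for inspection, but never reads
// more than limit bytes. Larger payloads are rejected with HTTP 413.
func inspectHandler(limit int64) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		r.Body = http.MaxBytesReader(w, r.Body, limit)
		body, err := io.ReadAll(r.Body)
		if err != nil {
			var tooBig *http.MaxBytesError
			if errors.As(err, &tooBig) {
				http.Error(w, "payload too large", http.StatusRequestEntityTooLarge)
				return
			}
			http.Error(w, "failed to read body", http.StatusBadRequest)
			return
		}
		// Placeholder for real inspection logic (for example WAF rules).
		fmt.Fprintf(w, "inspected %d bytes", len(body))
	}
}

// postStatus sends a body of n bytes through the handler in-process
// and returns the resulting HTTP status code.
func postStatus(h http.Handler, n int) int {
	body := strings.NewReader(strings.Repeat("a", n))
	req := httptest.NewRequest(http.MethodPost, "/upload", body)
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	h := inspectHandler(16) // tiny cap so the demo is easy to follow
	fmt.Println("10-byte body  ->", postStatus(h, 10))
	fmt.Println("100-byte body ->", postStatus(h, 100))
}
```

The key design choice is that the limit is applied to the reader before any buffering happens, so an oversized upload is cut off at the cap instead of being held in memory in full.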

Performance Implications — Measuring the Real Cost of the Sidecar Tax

Nothing in architecture is free. The sidecar pattern adds latency on every network call, because each request now traverses two extra userspace proxies (one outbound through your sidecar, one inbound through the destination's sidecar). Published Istio/Envoy benchmarks have reported tail-latency overhead ranging from a few milliseconds to roughly 10ms per call under normal load, climbing higher under CPU pressure; the exact figures depend on the release, the load, and the features enabled.

The CPU overhead is often more significant than the latency. Envoy handles TLS termination, header parsing, and xDS config reconciliation. A commonly cited planning figure, which Istio's performance documentation has reported for some releases, is roughly 0.5 vCPU per Envoy sidecar at 1000 RPS. If each of 200 pods serves 1000 RPS, that's 100 vCPUs just for infrastructure. This isn't a reason to avoid the pattern; it's a reason to measure your own workload and resource-plan honestly.

Memory is the third dimension. Each Envoy process running Istio's full xDS config (with a large service registry) can hold 50–150MB of memory just for the service mesh configuration state. In a cluster with 500 services, every sidecar knows the routing rules for all 500, even if a given pod only ever talks to 3 of them. This is a known scalability ceiling in flat-mesh architectures, which is why Istio provides the Sidecar resource for scoping the configuration each proxy receives.

The pragmatic rule: if your service handles fewer than 500 RPS and your team has fewer than 5 services, a full service mesh sidecar is likely over-engineered. The pattern earns its cost at scale.
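The back-of-the-envelope arithmetic behind these numbers can be written down as a small calculator. This is a sketch for planning purposes only: the function names are hypothetical, and the 0.5 vCPU per 1000 RPS and 150MB per proxy inputs are the rough figures quoted in this section, which you should replace with your own measurements.

```go
package main

import "fmt"

// fleetProxyCPU estimates the total vCPUs consumed by sidecar proxies,
// assuming CPU scales linearly with request rate.
func fleetProxyCPU(pods int, rpsPerPod, vcpuPer1000RPS float64) float64 {
	return float64(pods) * (rpsPerPod / 1000.0) * vcpuPer1000RPS
}

// fleetProxyMemoryGB estimates total proxy memory in GB (1 GB = 1024 MB here).
func fleetProxyMemoryGB(pods int, mbPerProxy float64) float64 {
	return float64(pods) * mbPerProxy / 1024.0
}

func main() {
	// 200 pods, each serving 1000 RPS, at 0.5 vCPU per 1000 RPS.
	fmt.Printf("proxy CPU:    %.0f vCPU\n", fleetProxyCPU(200, 1000, 0.5))
	// 200 pods, each proxy holding 150MB of mesh configuration.
	fmt.Printf("proxy memory: %.1f GB\n", fleetProxyMemoryGB(200, 150))
}
```

With these inputs the CPU estimate works out to 100 vCPUs, matching the figure above. The linear model is a simplification, so treat the result as an order-of-magnitude check rather than a capacity plan.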

sidecar_resource_limits.yaml · YAML
# sidecar_resource_limits.yaml
# Production-grade Kubernetes pod spec showing how to:
#   1. Co-locate a sidecar with your main application container
#   2. Set SEPARATE resource limits for app vs sidecar (critical — most teams forget this)
#   3. Probe the app on its own port and drain the sidecar gracefully on shutdown
#   4. Use Istio's Sidecar CR to scope which services the sidecar needs to know about

apiVersion: v1
kind: Pod
metadata:
  name: payments-service-pod
  annotations:
    # Tell Istio to inject Envoy automatically when this pod is created
    sidecar.istio.io/inject: "true"
    # Override default Envoy resource limits — don't let the sidecar starve your app
    sidecar.istio.io/proxyCPU: "200m"         # 0.2 vCPU — tune per observed usage
    sidecar.istio.io/proxyMemory: "128Mi"     # baseline Envoy footprint
    sidecar.istio.io/proxyCPULimit: "1000m"   # allow burst to 1 vCPU under load
    sidecar.istio.io/proxyMemoryLimit: "256Mi"
spec:
  # initContainers run before any regular containers.
  # Istio injects istio-init here automatically to install iptables rules.
  # We show it explicitly so you understand what's happening.
  initContainers:
    - name: istio-init
      image: docker.io/istio/proxyv2:1.20.0
      args: ["istio-iptables", "-p", "15001", "-z", "15006", "-u", "1337"]
      # 1337 is the UID Envoy runs as — traffic from UID 1337 is exempted from
      # iptables redirect to prevent infinite loops
      securityContext:
        capabilities:
          add: ["NET_ADMIN", "NET_RAW"] # required to modify iptables rules
        runAsNonRoot: false
        runAsUser: 0 # init container runs as root only to set iptables

  containers:
    # ── Main application container ────────────────────────────────────────────
    - name: payments-service
      image: myregistry/payments-service:2.4.1
      ports:
        - containerPort: 8080
      resources:
        requests:
          cpu: "500m"
          memory: "512Mi"
        limits:
          cpu: "2000m"
          memory: "1Gi"
      # Health check targets the app's own port, NOT an Envoy port.
      # If probes depend on Envoy and Envoy is slow to start, a readiness probe
      # keeps the pod out of rotation and a liveness probe restarts the container
      # in a loop before the app has had a chance to become ready.
      readinessProbe:
        httpGet:
          path: /healthz
          port: 8080
        initialDelaySeconds: 5
        periodSeconds: 10

    # ── Sidecar container (shown explicitly; normally injected automatically) ─
    # In production Istio injects this — we show it here for educational clarity.
    - name: istio-proxy
      image: docker.io/istio/proxyv2:1.20.0
      args:
        - proxy
        - sidecar
        - --serviceCluster
        - payments-service
        - --proxyLogLevel
        - warning  # Don't log at 'info' in prod — it's extremely verbose
      ports:
        - containerPort: 15090  # Prometheus scrape port for Envoy metrics
        - containerPort: 15000  # Envoy admin API in Istio (localhost only); useful for debugging
      # Sidecar gets its OWN resource envelope, completely separate from the app.
      # This is the single most important production config most teams skip.
      resources:
        requests:
          cpu: "200m"
          memory: "128Mi"
        limits:
          cpu: "1000m"
          memory: "256Mi"
      # Lifecycle hook: drain connections gracefully before the pod terminates.
      # Without this, in-flight requests get hard-killed during rolling deploys.
      lifecycle:
        preStop:
          exec:
            command:
              - "/bin/sh"
              - "-c"
              - "sleep 5 && curl -sf -X POST http://127.0.0.1:15000/healthcheck/fail"
---
# Istio Sidecar CR: Scope what this sidecar needs to know about.
# By default, Envoy loads routing config for EVERY service in the mesh.
# This scopes it to only the services payments-service actually calls,
# reducing memory from ~150MB to ~30MB in large clusters.
apiVersion: networking.istio.io/v1beta1
kind: Sidecar
metadata:
  name: payments-service-sidecar-scope
  namespace: production
spec:
  workloadSelector:
    labels:
      app: payments-service
  egress:
    - hosts:
        - "production/user-service.production.svc.cluster.local"    # only services we actually call
        - "production/fraud-service.production.svc.cluster.local"
        - "istio-system/*"             # always include the control plane
  ingress:
    - port:
        number: 8080
        protocol: HTTP
        name: http-payments
      defaultEndpoint: 127.0.0.1:8080 # deliver to app on loopback
Output
# After applying this config, verify sidecar memory usage dropped:
kubectl exec -n production payments-service-pod -c istio-proxy -- \
curl -s http://127.0.0.1:15000/memory | grep allocated
# Before scoping (full mesh config):
# "allocated": "142606912",   (~136MB)
# After applying Sidecar CR scoping to 2 upstreams:
# "allocated": "31457280",    (~30MB)
# 78% memory reduction — in a 300-pod cluster that's ~30GB of cluster RAM freed.
Interview Gold: The 'double-proxy' latency question
Interviewers love asking 'how much latency does a service mesh add?' The honest answer has three parts: (1) Envoy's own processing typically adds on the order of a millisecond or less per proxy under normal load, (2) this doubles because BOTH the caller's sidecar AND the receiver's sidecar are in the path, and (3) the dominant factor often isn't Envoy's processing but TLS handshake overhead on new connections, which is why connection pooling and keep-alives (or HTTP/2 multiplexing) are non-negotiable in any service mesh deployment. Back it with numbers: published Istio benchmarks have reported overhead of roughly 1ms at the median and several milliseconds at P99 per service-to-service call, but the figures vary by release and load, so quote the ones for the version you run.
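The effect of keep-alives is easy to observe with Go's `net/http/httptrace` package. The sketch below is illustrative only: it uses a local plain-HTTP test server rather than a mesh with mTLS, and `countReused` and `demo` are hypothetical helper names. It counts how many requests ran over an already-established connection, which is exactly the case where no new TCP or TLS handshake is paid.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"net/http/httptrace"
)

// countReused issues n sequential GETs with the given client and reports
// how many of them ran over an already-established connection.
func countReused(client *http.Client, url string, n int) (int, error) {
	reused := 0
	for i := 0; i < n; i++ {
		trace := &httptrace.ClientTrace{
			GotConn: func(info httptrace.GotConnInfo) {
				if info.Reused {
					reused++
				}
			},
		}
		req, err := http.NewRequest(http.MethodGet, url, nil)
		if err != nil {
			return reused, err
		}
		req = req.WithContext(httptrace.WithClientTrace(req.Context(), trace))
		resp, err := client.Do(req)
		if err != nil {
			return reused, err
		}
		// Drain and close the body so the connection can return to the idle pool.
		io.Copy(io.Discard, resp.Body)
		resp.Body.Close()
	}
	return reused, nil
}

// demo compares a pooled client with one that has keep-alives disabled.
func demo() (withKeepAlive, withoutKeepAlive int) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "ok")
	}))
	defer srv.Close()

	pooled := &http.Client{Transport: &http.Transport{}}
	unpooled := &http.Client{Transport: &http.Transport{DisableKeepAlives: true}}

	withKeepAlive, _ = countReused(pooled, srv.URL, 5)
	withoutKeepAlive, _ = countReused(unpooled, srv.URL, 5)
	return withKeepAlive, withoutKeepAlive
}

func main() {
	a, b := demo()
	fmt.Printf("reused connections with keep-alive:    %d of 5\n", a)
	fmt.Printf("reused connections without keep-alive: %d of 5\n", b)
}
```

With keep-alives disabled every request opens a fresh connection, and in a mesh with mTLS each of those would also pay a new TLS handshake.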
Production Insight
Memory overhead is the silent killer in large meshes — default Envoy config loads all services.
Scoping with the Sidecar CR can cut memory substantially (about 78% in the example above).
Without scoping, memory grows linearly with cluster size, not with actual dependencies.
Key Takeaway
Sidecar tax is real: 1-10ms latency, roughly 0.5 vCPU per 1000 RPS, and 50-150MB of memory per pod.
Plan resources separately for sidecar and app.
Use Sidecar CR scoping to avoid loading unnecessary config.

Sidecar vs Ambassador vs Adapter — Knowing Which Variant to Reach For

The sidecar is one of three single-node container patterns described in Brendan Burns and David Oppenheimer's paper "Design patterns for container-based distributed systems", and they're frequently confused in interviews. Understanding the distinction helps you pick the right tool and communicate precisely with your team.

The Sidecar augments or extends the main container's behavior: the proxy, log shipper, and secret reloader all fall here. The sidecar and the main container cooperate and are scheduled and torn down together as one Pod, although each container starts and restarts on its own.

The Ambassador is a specific sidecar that acts as a proxy for outbound connections. Your application always talks to localhost, and the ambassador translates that into environment-specific upstream URLs, handles service discovery, and manages connection pooling. It's a specialization of the sidecar pattern focused purely on outbound egress. Think of a Twilio ambassador that your app talks to on localhost:5000, which handles authentication, rate limit backoff, and regional endpoint selection.

The Adapter normalizes the output of the main container so it conforms to a standard interface expected by the outside world. Classic example: your legacy app emits logs in a proprietary format, but your log aggregator expects JSON. The adapter container reads the legacy log file and re-emits it as structured JSON. The outside world only ever sees the adapter's normalized output.

In practice, a production pod might have all three: an Envoy sidecar (service mesh), a Fluent Bit log adapter, and an ambassador to an external secret manager. Each serves a distinct concern.
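To make the adapter idea concrete, here is a minimal Go sketch of the normalization step: it turns a legacy plain-text log line into structured JSON. It is not how Fluent Bit works internally; `adaptLine` is a hypothetical function, and the assumed input layout (`date time LEVEL key=value ...`, timestamps in UTC) is an illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
	"time"
)

// adaptLine converts one legacy log line of the assumed form
//
//	2024-01-15 14:32:01 INFO order_id=ORD-9921 status=FULFILLED
//
// into a JSON document with @timestamp and level fields.
func adaptLine(line string) (string, error) {
	fields := strings.Fields(line)
	if len(fields) < 3 {
		return "", fmt.Errorf("malformed log line: %q", line)
	}
	// Assumption: legacy timestamps are UTC in "YYYY-MM-DD HH:MM:SS" layout.
	ts, err := time.Parse("2006-01-02 15:04:05", fields[0]+" "+fields[1])
	if err != nil {
		return "", fmt.Errorf("bad timestamp: %w", err)
	}
	out := map[string]string{
		"@timestamp": ts.UTC().Format(time.RFC3339),
		"level":      fields[2],
	}
	for _, kv := range fields[3:] {
		key, value, ok := strings.Cut(kv, "=")
		if !ok {
			continue // skip tokens that are not key=value pairs
		}
		out[key] = value
	}
	encoded, err := json.Marshal(out) // map keys are emitted in sorted order
	if err != nil {
		return "", err
	}
	return string(encoded), nil
}

func main() {
	normalized, err := adaptLine("2024-01-15 14:32:01 INFO order_id=ORD-9921 status=FULFILLED")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(normalized)
}
```

The application keeps writing its legacy format; only the adapter knows both formats, which is the whole point of the pattern.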

three_container_patterns.yaml · YAML
# three_container_patterns.yaml
# A single Kubernetes pod demonstrating all three container helper patterns:
#   - Sidecar:    Envoy proxy (traffic management, mTLS, observability)
#   - Adapter:    Fluent Bit (normalize app logs to structured JSON for Elasticsearch)
#   - Ambassador: Vault Agent (fetch secrets from HashiCorp Vault and expose on localhost)
#
# The main app (order-processor) does NONE of this itself. It:
#   - Writes plain-text logs to /var/log/app/orders.log
#   - Reads its DB password from /vault/secrets/db-password (written by Vault Agent)
#   - Makes HTTP calls to localhost:8200 when it needs additional secrets at runtime
#
# This is the sidecar pattern at full production maturity.

apiVersion: v1
kind: Pod
metadata:
  name: order-processor-pod
  labels:
    app: order-processor
  annotations:
    sidecar.istio.io/inject: "true"
spec:
  serviceAccountName: order-processor-sa  # needs Vault + Kubernetes auth

  volumes:
    # Shared volume between app and Fluent Bit adapter
    - name: app-log-volume
      emptyDir: {}
    # Shared volume where Vault Agent writes decrypted secrets
    - name: vault-secrets-volume
      emptyDir:
        medium: Memory  # NEVER write secrets to disk — use tmpfs (in-memory volume)

  containers:
    # ══════════════════════════════════════════════════════════════════════════
    # MAIN CONTAINER: The application itself. Blissfully ignorant of
    # infrastructure concerns. Reads secrets from files, writes plain logs.
    # ══════════════════════════════════════════════════════════════════════════
    - name: order-processor
      image: myregistry/order-processor:3.1.0
      env:
        # App reads DB password from a file. Vault Agent keeps this file fresh.
        - name: DB_PASSWORD_FILE
          value: /vault/secrets/db-password
        # App sends logs to this path. Fluent Bit tails this file.
        - name: LOG_FILE_PATH
          value: /var/log/app/orders.log
      volumeMounts:
        - name: app-log-volume
          mountPath: /var/log/app
        - name: vault-secrets-volume
          mountPath: /vault/secrets
          readOnly: true  # app can only READ secrets — cannot pollute the volume
      resources:
        requests: { cpu: "500m", memory: "256Mi" }
        limits:   { cpu: "2",    memory: "512Mi" }

    # ══════════════════════════════════════════════════════════════════════════
    # ADAPTER CONTAINER: Fluent Bit
    # Problem: app writes unstructured text logs like:
    #   "2024-01-15 14:32:01 INFO order_id=ORD-9921 status=FULFILLED"
    # Elasticsearch expects JSON with @timestamp and level fields.
    # Fluent Bit parses and re-emits as:
    #   {"@timestamp":"2024-01-15T14:32:01Z","level":"INFO","order_id":"ORD-9921",...}
    # The outside world (Elasticsearch) only sees normalized output.
    # ══════════════════════════════════════════════════════════════════════════
    - name: fluent-bit-adapter
      image: fluent/fluent-bit:3.0
      args:
        - /fluent-bit/bin/fluent-bit
        - --config=/fluent-bit/etc/fluent-bit.conf
      volumeMounts:
        - name: app-log-volume
          mountPath: /var/log/app
          readOnly: true  # adapter only READS logs — cannot write back to app's log dir
      resources:
        requests: { cpu: "50m",  memory: "32Mi" }
        limits:   { cpu: "200m", memory: "64Mi" }

    # ══════════════════════════════════════════════════════════════════════════
    # AMBASSADOR CONTAINER: HashiCorp Vault Agent
    # The app needs a DB password and a Stripe API key.
    # Without this ambassador, the app would need:
    #   - Vault SDK dependency
    #   - Token renewal logic
    #   - Secret lease management
    # With the ambassador, the app just reads a file. Vault Agent handles
    # auth, token refresh, secret rotation, and writes the fresh value.
    # The app calls localhost:8200 for dynamic secrets at runtime.
    # ══════════════════════════════════════════════════════════════════════════
    - name: vault-agent-ambassador
      image: hashicorp/vault:1.15
      args: ["agent", "-config=/vault/config/agent-config.hcl"]
      env:
        - name: VAULT_ADDR
          value: "https://vault.internal.mycompany.com:8200"
      ports:
        # Vault Agent exposes a local proxy on 8200 — app calls http://localhost:8200
        # Ambassador translates this into authenticated calls to the real Vault cluster
        - containerPort: 8200
          name: vault-proxy
      volumeMounts:
        - name: vault-secrets-volume
          mountPath: /vault/secrets  # writes decrypted secrets here
      resources:
        requests: { cpu: "50m",  memory: "64Mi" }
        limits:   { cpu: "200m", memory: "128Mi" }
Output
# Verify all three helper containers are running alongside the main app:
kubectl get pod order-processor-pod -o jsonpath='{.spec.containers[*].name}'
# Output:
order-processor fluent-bit-adapter vault-agent-ambassador istio-proxy
# Check the adapter is shipping logs to Elasticsearch:
kubectl logs order-processor-pod -c fluent-bit-adapter --tail=5
# [2024/01/15 14:32:05] [ info] [output:es:es.0] 12 records successfully flushed
# Check the ambassador wrote the latest secret:
kubectl exec order-processor-pod -c order-processor -- cat /vault/secrets/db-password
# postgres://orders_user:xK9#mQ2$vR@db.internal:5432/orders_prod
# (Vault Agent rotated this 4 minutes ago — the app read the new value automatically)
Pro Tip: Give every helper container its own resource limits
In Kubernetes, CPU and memory limits are set per container, not per pod. If you define limits only on your main container and leave the sidecar/adapter/ambassador unlimited, a misbehaving Fluent Bit can consume all available node memory and trigger an OOMKill on your main application. Always set explicit requests AND limits on every container in the pod, sized by profiling actual usage. Start with requests=actual P90 usage, limits=2x requests.
Production Insight
Unlimited helper containers are a common cause of pod OOM kills in production.
Profile each container's baseline usage under load and enforce limits.
Rule: sidecar, adapter, and ambassador each need their own resource envelope.
Key Takeaway
Sidecar augments behavior, Ambassador proxies outbound, Adapter normalizes output.
All three patterns decouple infrastructure from application.
Always set resource limits on every container — not just the main app.

Sidecar Lifecycle and Startup Ordering: The Init Container Problem

One of the most overlooked aspects of the sidecar pattern is the startup ordering between the init container, the sidecar proxy, and the main application. In Istio, the init container runs and installs iptables rules, but it does NOT wait for Envoy to be ready. The application container and the sidecar start as soon as the init container completes, and the kubelet begins probing. If those probes target the sidecar's port, or depend on Envoy before it can accept traffic, they fail: a failing readiness probe keeps the pod out of service, and a failing liveness or startup probe restarts the container until the pod lands in CrashLoopBackOff.

This is exactly what happened in the production incident described earlier. But there's another subtlety: even if probes target the app port correctly, the application itself may start before Envoy is fully initialized. The application initiates outbound connections to other services, but Envoy isn't listening yet. Those connections fail. Retry logic in the app might mask this, but the first few requests always fail.

To solve this, Istio provides the holdApplicationUntilProxyStarts setting, a ProxyConfig field that is off by default and must be enabled explicitly. When active, the sidecar injector places the istio-proxy container first in the pod and gives it a postStart lifecycle hook that blocks until Envoy's readiness endpoint returns 200. Because the kubelet starts containers in order and waits for each postStart hook to finish, the application's entry point is delayed until the sidecar is fully operational.

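For non-Istio sidecars, you need to replicate this behavior. A common pattern is to add a startup script that polls the sidecar's admin endpoint before launching the app. A regular init container cannot do this job, because init containers must finish before the sidecar starts. On clusters that support native sidecars (init containers with restartPolicy: Always), the kubelet starts the sidecar before the application containers, which gives you the ordering without a wrapper script.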

hold_app_until_proxy_ready.yaml (YAML)
# Enabling holdApplicationUntilProxyStarts in the Istio control plane
# This is a MeshConfig global setting (a ProxyConfig default).
# The injector gives the istio-proxy container a postStart hook that blocks until
# the proxy's readiness endpoint (port 15021, /healthz/ready) returns HTTP 200,
# so the application container is not started before Envoy is ready.

apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
metadata:
  name: mesh-config
spec:
  meshConfig:
    defaultConfig:
      holdApplicationUntilProxyStarts: true

start-with-sidecar.sh (Bash)
#!/bin/bash
# For custom sidecars (non-Istio), you can emulate this with a startup script
# that polls the sidecar before starting the main application.
#
# Place this script as the entrypoint of your main container:
#   command: ["/start-with-sidecar.sh"]
#
# Assumes an Envoy sidecar with its admin interface on localhost:9901.
# /ready returns 200 only once Envoy is live; /server_info can answer earlier.
TIMEOUT=30   # seconds to wait before giving up
INTERVAL=1   # seconds between polls
ELAPSED=0

until curl -sf http://127.0.0.1:9901/ready > /dev/null 2>&1; do
  if [ "$ELAPSED" -ge "$TIMEOUT" ]; then
    echo "Timeout waiting for sidecar" >&2
    exit 1
  fi
  sleep "$INTERVAL"
  ELAPSED=$((ELAPSED + INTERVAL))
done
echo "Sidecar ready after ${ELAPSED}s"

# Now start the actual application; exec replaces the shell so the app
# receives signals (e.g. SIGTERM) directly
exec /app/my-app
Output
# Verify holdApplicationUntilProxyStarts is working:
# the injected pod spec should contain a postStart hook that delays application startup.
# (kubectl describe does not print lifecycle hooks, so inspect the pod YAML.)
kubectl get pod <pod_name> -o yaml | grep -A8 "postStart"
# Expected output: a lifecycle hook that waits for Envoy
# With the custom startup script, check application logs for the delay:
kubectl logs <pod_name> -c <app_container> --tail=5
# 2026/01/15 14:32:01 Sidecar ready after 3s
# 2026/01/15 14:32:01 Starting app...
Warning: holdApplicationUntilProxyStarts can increase pod startup time
Enabling this feature means every pod startup waits for Envoy to initialize (typically 3-10 seconds). For deployments that start many pods simultaneously (e.g., after a rolling update), this can slow the rollout. Profile your Envoy startup time and decide if the first-request reliability gain is worth the deployment speed cost. For critical services, it almost always is.
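If the mesh-wide default is too costly for rollout speed, the setting can also be enabled per workload through the proxy.istio.io/config pod annotation. The sketch below only prints a merge patch for that annotation; the helper name `hold_patch` is hypothetical, and the kubectl command in the comment assumes a deployment you substitute yourself.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a merge patch that opts one workload in to
# holdApplicationUntilProxyStarts via the proxy.istio.io/config annotation
# on the pod template.
hold_patch() {
  cat <<'EOF'
{"spec":{"template":{"metadata":{"annotations":{"proxy.istio.io/config":"holdApplicationUntilProxyStarts: true"}}}}}
EOF
}

# Print the patch. To apply it to a single deployment, something like:
#   kubectl patch deployment <deployment_name> --type merge -p "$(hold_patch)"
hold_patch
```

Changing the pod template annotation triggers a rolling restart, so the new pods are the ones that pick up the hook.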
Production Insight
Without holdApplicationUntilProxyStarts, the first few outbound requests from a fresh pod will fail.
Retries might mask it, but they add latency and can cause cascading failures.
Rule: enable it for all services where startup order matters — which is most of them.
Key Takeaway
Sidecar and app have independent lifecycles — the init container isn't sufficient.
Use holdApplicationUntilProxyStarts or a custom startup script to wait.
The cost: 3-10 seconds added to pod startup time.
● Production incident · POST-MORTEM · severity: high

The Readiness Probe Loop That Took Down a Payment Cluster

Symptom
Every pod in the payments-service deployment entered CrashLoopBackOff within 30 seconds of startup. The application container started successfully (logs showed 'listening on :8080') but was killed repeatedly. kubectl describe showed readiness probe failures.
Assumption
The team assumed the readiness probe targeting the sidecar's port (15006) would work because 'Envoy is part of the pod'. They also believed the init container would complete before the probe started, which it did, but Envoy itself had not yet finished initializing its listeners.
Root cause
Istio's init container installs iptables rules that redirect inbound traffic on port 8080 to Envoy's inbound listener (15006). When the readiness probe targeted port 15006 directly, Kubernetes sent the probe before Envoy was ready to accept connections (Envoy takes ~5-10 seconds to initialize its listeners after the init container finishes). The probe failed, Kubernetes killed the pod, and the cycle repeated.
Fix
Configure readiness and liveness probes to target the application's original port (8080), not an Envoy listener port. With Istio's default HTTP probe rewriting, kubelet health checks are then served through the pilot-agent (port 15020) and forwarded to the app, so they do not depend on Envoy's inbound listener being up. Alternatively, set Istio's holdApplicationUntilProxyStarts: true to delay the application container until Envoy is ready.
Key lesson
  • Always set readiness/liveness probes to the application's port — not the sidecar's port.
  • Never assume sidecar startup completes before the main container's probe deadline.
  • Use holdApplicationUntilProxyStarts for critical services where startup order matters.
  • Test sidecar injection in a staging environment before rolling to production — the startup timing difference is subtle and hard to reproduce locally.
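The first lesson can be turned into a pre-deploy check. The sketch below is a minimal, hypothetical lint (the function name `check_probe_port` is an assumption) that rejects a probe port if it belongs to the Istio sidecar; the port list covers the commonly documented Envoy and pilot-agent ports and may need adjusting for your mesh version.

```shell
#!/usr/bin/env bash
# Ports used by the Istio sidecar (Envoy and pilot-agent), not by your app.
ISTIO_PORTS="15000 15001 15004 15006 15008 15020 15021 15053 15090"

# Hypothetical lint: fail if a probe is aimed at a sidecar port.
check_probe_port() {
  local port=$1 reserved
  for reserved in $ISTIO_PORTS; do
    if [ "$port" = "$reserved" ]; then
      echo "FAIL: probe port $port belongs to the sidecar"
      return 1
    fi
  done
  echo "OK: probe port $port"
}

check_probe_port 8080            # the application's own port
check_probe_port 15006 || true   # Envoy's inbound listener, as in the incident
```

Run against the ports extracted from a manifest in CI, a check like this would have flagged the payments-service probe before it reached production.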
Production debug guide
Common sidecar-related problems and the commands to diagnose them fast (4 entries)
Symptom · 01
Pod stuck in Init:0/1 or CrashLoopBackOff after enabling sidecar injection
Fix
Check init container logs: kubectl logs <pod> -c istio-init. Verify iptables rules: kubectl exec <pod> -c istio-proxy -- iptables -t nat -L -n
Symptom · 02
Traffic between services failing with connection refused
Fix
Confirm both sidecars are healthy (in Istio, Envoy's admin interface listens on port 15000): kubectl exec <pod> -c istio-proxy -- curl -s http://127.0.0.1:15000/server_info. Check Envoy listeners: kubectl exec <pod> -c istio-proxy -- curl -s http://127.0.0.1:15000/listeners
Symptom · 03
High latency and retries after mesh enablement
Fix
Check the proxy's request-latency metrics: kubectl exec <pod> -c istio-proxy -- curl -s http://127.0.0.1:15090/stats/prometheus | grep 'istio_request_duration_milliseconds'. Also verify mTLS is not causing additional handshakes: istioctl authz check <pod>
Symptom · 04
Application receives requests but sidecar logs show nothing
Fix
Application might be bypassing iptables rules. Check whether the app runs as UID 1337 (traffic from Envoy's UID is excluded from redirection): kubectl exec <pod> -- id. Verify the iptables RETURN rules for 127.0.0.6 loopback traffic.
★ Sidecar Debugging Quick Reference
Run these commands in order when a sidecar-proxied service misbehaves. Each command narrows the possibility space.
Pod won't start (CrashLoopBackOff)
Immediate action
Check init container logs to see if iptables installation failed.
Commands
kubectl logs <pod_name> -c istio-init --tail=50
kubectl describe pod <pod_name> | grep -A5 Init
Fix now
If the init container fails due to a missing NET_ADMIN capability, add securityContext.capabilities.add: ['NET_ADMIN'] to the istio-init container (capabilities are a container-level field, not a pod-level one).
Outbound traffic blocked or slow
Immediate action
Verify Envoy outbound listener is up and iptables rules are redirecting correctly.
Commands
kubectl exec <pod_name> -c istio-proxy -- curl -s http://127.0.0.1:15000/config_dump | jq '.configs[] | select(.dynamic_listeners != null) | .dynamic_listeners'
kubectl exec <pod_name> -c istio-proxy -- iptables -t nat -L ISTIO_OUTPUT -n
Fix now
If outbound rules missing, re-apply sidecar injection: kubectl rollout restart deployment/<deploy> — the init container re-runs and installs rules.
High memory usage in sidecar container
Immediate action
Check if Envoy is loading config for all services (default).
Commands
kubectl exec <pod_name> -c istio-proxy -- curl -s http://127.0.0.1:15000/memory | grep 'allocated'
istioctl analyze -n <namespace>
Fix now
Apply Istio Sidecar CR to scope the sidecar's config to only needed services.
mTLS errors between services
Immediate action
Check destination rule and peer authentication settings.
Commands
kubectl exec <source_pod> -c istio-proxy -- curl -s http://127.0.0.1:15000/certs
istioctl authz check <source_pod> -t <target_pod>
Fix now
If certificate mismatch, verify workload entries and that both namespaces have sidecar injection enabled.
Aspect | Sidecar Pattern | Shared Library Approach
Language independence | Complete: sidecar runs as a separate process in any language | None: library must be ported to every runtime your teams use
Upgrade path | Roll out new sidecar version independently via redeployment | Every service must update dependency version and redeploy
Latency overhead | 1–10ms per hop (two extra process context switches) | Near zero: in-process function calls
Memory overhead per pod | 50–150MB for a full Envoy config | Library heap overhead only, typically 5–20MB
Blast radius of a bug | Sidecar crash can disrupt all traffic for that pod | Library bug affects only services that called the faulty code path
Configuration centralisation | Yes: control plane (Istio/Consul) pushes config to all sidecars | No: each service owns its library config; config drift is common
Debugging complexity | High: must trace through two extra processes; requires mesh observability tooling | Lower: standard in-process debugger works
Suitable scale | 50+ services, polyglot teams, compliance requirements | 1–10 services, single language, small team moving fast
Secret/cert rotation | Sidecar handles rotation transparently; app never restarts | App must implement reload logic or restart on rotation
Traffic shaping (retries, timeouts) | Declarative YAML/CRD, no code changes to the app | Must be coded into every service; easily inconsistent across teams

Key takeaways

1. Sidecar pattern co-locates infrastructure alongside your app, enabling cross-cutting concerns without code changes.
2. Traffic interception relies on iptables rules installed by an init container; the 127.0.0.6 RETURN rule is critical to avoid redirect loops.
3. Sidecar tax: 1-10ms latency per hop, 0.5 vCPU and 50-150MB per pod. Plan resources separately.
4. Startup ordering matters: sidecar must be ready before the app starts or first requests will fail.
5. Use Sidecar CR scoping to reduce memory overhead by up to 80% in large clusters.
6. The pattern is overkill for <5 services; use static sidecar configs for small teams.
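To make the sidecar tax concrete, here is a back-of-the-envelope calculation as a shell sketch. It uses the article's own figures (50-150MB per pod, and roughly 0.5 vCPU per 1000 RPS from the quick answer); the function name `sidecar_tax`, the pod count, and the per-pod request rate are made-up inputs for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical estimator for cluster-wide sidecar overhead.
# Assumptions taken from the article: 50-150 MB of memory per sidecar,
# and ~0.5 vCPU (500 millicores) per 1000 requests per second.
sidecar_tax() {
  local pods=$1 rps_per_pod=$2
  local mem_low=$(( pods * 50 ))
  local mem_high=$(( pods * 150 ))
  # Work in millicores to stay in integer arithmetic.
  local cpu_m=$(( pods * rps_per_pod * 500 / 1000 ))
  echo "memory=${mem_low}-${mem_high}MB cpu=${cpu_m}m"
}

# Example: 200 pods, each handling about 100 RPS.
sidecar_tax 200 100
# prints: memory=10000-30000MB cpu=10000m
```

In this example the mesh alone accounts for roughly 10-30GB of memory and 10 vCPU across the cluster, which is why the sidecar needs its own line in capacity planning.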

Common mistakes to avoid


Not setting separate resource limits for the sidecar container

Symptom
Your main application pod is OOMKilled even though your app's memory usage looks normal in dashboards. The culprit is Envoy or Fluent Bit consuming unbounded memory, which counts against the pod's node allocation.
Fix
Always define resources.requests and resources.limits on EVERY container in the pod spec. Profile sidecar memory independently using kubectl exec <pod> -c istio-proxy -- curl -s localhost:15000/stats | grep heap, and set limits at ~2x observed P99 usage.

Routing Kubernetes liveness and readiness probes through the sidecar

Symptom
Your pod enters a CrashLoopBackOff restart loop during initial deployment even though the app itself starts fine. Kubernetes sends the probes before Envoy's listeners are ready, the probes fail, and the container is restarted.
Fix
Configure readiness/liveness probes to target the application's own port rather than an Envoy listener port. With Istio's default HTTP probe rewriting, kubelet health checks are served through the pilot-agent (port 15020) and do not depend on Envoy's inbound listener. Alternatively, set Istio's holdApplicationUntilProxyStarts: true.

Deploying a full service mesh sidecar for a monolith-to-microservices migration with 3 services

Symptom
The team spends 3 sprints debugging Istio CRDs, Envoy xDS errors, and mTLS certificate rotation instead of shipping features. The operational overhead of a full service mesh only pays off at meaningful scale.
Fix
Use the sidecar pattern without a full service mesh for small deployments — a single Nginx or Envoy sidecar configured with static config files gives you 80% of the value (TLS termination, access logging, header injection) with 10% of the complexity. Graduate to a control-plane-managed mesh (Istio/Consul Connect/Linkerd) when you have 15+ services and a dedicated platform team.

Assuming sidecar startup completes before the main application starts

Symptom
First few outbound requests from a freshly started pod fail with connection refused or timeout, even though the application is healthy. The sidecar process is not yet ready to accept traffic.
Fix
Use Istio's holdApplicationUntilProxyStarts: true or implement a startup script that polls the sidecar's health endpoint before launching the app.
INTERVIEW PREP · PRACTICE MODE

Interview Questions on This Topic

Q01 · SENIOR
Explain how a service mesh like Istio achieves transparent traffic inter...
Q02 · SENIOR
A team reports that after enabling Istio on their cluster, P99 latency d...
Q03 · SENIOR
What's the difference between the Sidecar, Ambassador, and Adapter conta...
Q01 of 03 · SENIOR

Explain how a service mesh like Istio achieves transparent traffic interception without any code changes to the application. Walk me through what happens at the kernel level from the moment your app calls `http.Get('http://payments-service:8080')` until the response arrives back.

ANSWER
When the app calls http.Get, the resolver looks up 'payments-service' via cluster DNS and gets the Service's ClusterIP. As the connection leaves the app's socket, the kernel consults the iptables NAT table rules installed by the istio-init container. The ISTIO_OUTPUT chain redirects outbound TCP packets to port 15001 (Envoy's outbound listener), except for traffic Envoy itself originates (UID 1337, or the 127.0.0.6 loopback source it uses when forwarding to the app), which is returned untouched to avoid a redirect loop. Envoy receives the connection, applies mesh policies (mTLS, retries, headers), then forwards to the actual destination. On the receiving side, the inbound iptables rules (ISTIO_INBOUND) redirect packets arriving on the app's port (e.g., 8080) to Envoy's inbound listener on port 15006. Envoy verifies TLS, extracts trace headers, and forwards to the app on port 8080. The app receives the request as if it came directly, and the response retraces the same path. The entire process involves extra context switches at each proxy (kernel→Envoy→kernel) and two Envoy proxy passes.
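The kernel-level part of this answer can be sketched as the NAT rules involved. The script below only prints a heavily simplified rule set rather than applying it (real istio-init output contains more chains and exclusions), and the function name `print_mesh_rules` is hypothetical.

```shell
#!/usr/bin/env bash
# Dry-run sketch: print a simplified version of the NAT rules described above.
# Nothing is applied; real istio-init rules include further exclusions.
print_mesh_rules() {
  local app_port=$1
  cat <<EOF
iptables -t nat -N ISTIO_REDIRECT
iptables -t nat -A ISTIO_REDIRECT -p tcp -j REDIRECT --to-ports 15001
iptables -t nat -N ISTIO_IN_REDIRECT
iptables -t nat -A ISTIO_IN_REDIRECT -p tcp -j REDIRECT --to-ports 15006
iptables -t nat -N ISTIO_INBOUND
iptables -t nat -A PREROUTING -p tcp -j ISTIO_INBOUND
iptables -t nat -A ISTIO_INBOUND -p tcp --dport ${app_port} -j ISTIO_IN_REDIRECT
iptables -t nat -N ISTIO_OUTPUT
iptables -t nat -A OUTPUT -p tcp -j ISTIO_OUTPUT
iptables -t nat -A ISTIO_OUTPUT -o lo -s 127.0.0.6/32 -j RETURN
iptables -t nat -A ISTIO_OUTPUT -m owner --uid-owner 1337 -j RETURN
iptables -t nat -A ISTIO_OUTPUT -j ISTIO_REDIRECT
EOF
}

print_mesh_rules 8080
```

Reading the output top to bottom mirrors the answer: inbound traffic to the app port is sent to 15006, outbound traffic is sent to 15001, and the two RETURN rules keep Envoy's own traffic from being redirected back into itself.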
FAQ · 4 QUESTIONS

Frequently Asked Questions

01. Does the sidecar pattern require Kubernetes?
02. Can I use a sidecar for non-HTTP traffic?
03. How do I debug a sidecar that's not intercepting traffic?
04. What's the difference between Istio sidecar injection and manual sidecar deployment?