Senior 6 min · June 25, 2026

Backpressure: Stop Your System From Drowning in Requests

Backpressure explained with real production patterns.

Naren Founder & Principal Engineer

20+ years shipping large-scale distributed systems. Notes here come from systems that actually shipped.

Last updated: June 25, 2026
Quick Answer

Backpressure is how a system tells its upstream, 'I'm full, stop sending.' It's implemented via bounded queues, reactive streams (e.g., the Java Flow API, RxJava), or explicit throttling signals (e.g., HTTP 429, TCP's receive window). Without it, you get OOM errors, connection pool exhaustion, and silent data loss.

Definition
What is Backpressure?

Backpressure is a flow control mechanism where a downstream component signals upstream to slow down or stop sending data when it can't keep up. It prevents buffer overflows, resource exhaustion, and cascading failures in distributed systems.

Plain-English First

Imagine you're a bartender pouring shots as fast as you can. If customers order faster than you can pour, you either spill drinks everywhere (crash) or you tell them to wait. Backpressure is that 'wait' signal. It's the bartender saying, 'Hold up, I'm still on the last round.' In software, it's the downstream service telling upstream, 'I'm at capacity, back off.'

You've seen it happen. A sudden traffic spike hits your service. Latency climbs. Then throughput flatlines. Then the whole thing falls over with an OOM at 3am. The root cause? No backpressure. Your system kept accepting work it couldn't handle, buffers grew without bound, and the JVM ran out of heap. This isn't a theoretical problem: it's one of the most common causes of cascading failures in microservices.

Backpressure is the mechanism that prevents this. It's not optional in any system that processes async data streams — message queues, event pipelines, HTTP servers, database connection pools. Without it, you're gambling that your peak load never exceeds your capacity. That's a bet you'll lose.

By the end of this, you'll know how to implement backpressure in your async pipelines, what patterns actually work in production, and — more importantly — which ones will burn you. You'll be able to diagnose backpressure failures from logs and metrics, and you'll have a mental model for designing systems that degrade gracefully instead of catastrophically.

Why Backpressure Exists: The Problem of Unbounded Buffers

Every async system has a buffer somewhere — a queue, a channel, a buffer in memory. Buffers smooth out load spikes. But they also hide the fact that downstream can't keep up. When the buffer is unbounded, it grows until it eats all memory. Then the process OOMs. Then the load shifts to the next service, which OOMs too. That's a cascading failure.

Without backpressure, you have two failure modes: either you drop data silently (if you have a bounded buffer with no backpressure signal) or you crash (if the buffer is unbounded). Both are bad. Backpressure gives you a third option: slow down the producer so the system stays stable.

In production, the most common symptom of missing backpressure is the 'hockey stick' latency graph. Latency stays flat until some threshold, then shoots to infinity. That's the buffer filling up. The fix isn't more memory — it's backpressure.

UnboundedQueueExample.java (Java)
// io.thecodeforge — System Design tutorial

import java.util.concurrent.*;

public class UnboundedQueueExample {
    // BAD: unbounded queue — will OOM under load
    private static final ExecutorService executor = Executors.newFixedThreadPool(10);
    
    public static void main(String[] args) {
        // Simulate fast producer, slow consumer
        for (int i = 0; ; i++) {
            final int taskId = i;
            executor.submit(() -> {
                try {
                    Thread.sleep(1000); // slow consumer
                } catch (InterruptedException e) {}
                System.out.println("Processed " + taskId);
            });
        }
    }
}
// Output: eventually java.lang.OutOfMemoryError: Java heap space
Output
Eventually: Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
Production Trap: Unbounded Queues
Executors.newFixedThreadPool() uses an unbounded LinkedBlockingQueue. Under load, it will OOM. Always use a bounded queue with a rejection policy. The error message is 'Java heap space' — not 'queue full'.
[Figure: Backpressure Flow: From Unbounded Load to System Stability. How backpressure prevents overload across queues, streams, HTTP, and distributed systems. Stages: Unbounded Requests (incoming load exceeds processing capacity) → Bounded Queue (fixed-size buffer absorbs spikes) → Reactive Streams Demand (consumer signals how many items it can handle) → HTTP 429 Response (server rejects excess requests with a status code) → Circuit Breaker (opens when the failure threshold is exceeded) → Stable Throughput (system maintains predictable performance). Note: unbounded queues hide backpressure until memory runs out; always set a maximum queue size and monitor queue depth.]

Bounded Queues: The First Line of Defense

The simplest backpressure mechanism is a bounded queue. You set a maximum capacity. When the queue is full, the producer must either block, drop, or throw. This forces the producer to slow down.

In Java, ArrayBlockingQueue is your friend. It's a fixed-size array-based queue. When full, the put() method blocks until space is available. That blocking is the backpressure signal — it propagates upstream, eventually slowing the source.

But blocking has trade-offs. If the producer is a network I/O thread (an event loop, say), blocking it can starve every other connection that thread serves. That's why you need to think about where the backpressure propagates. In a thread-per-request web server, blocking the request thread is usually fine: the client waits. In a Kafka consumer, blocking the poll loop for longer than max.poll.interval.ms gets the consumer removed from the group and triggers a rebalance. Know your context.
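The block, drop, or throw choice described above maps onto three standard BlockingQueue methods. The sketch below is only an illustration: the class and method names are made up for this example, and the 50 ms timeout is an arbitrary assumption.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

// Illustrative sketch: class and method names are hypothetical.
class FullQueueBehaviors {
    /** Throw: add() raises IllegalStateException when the queue is full. */
    static boolean addThrowsWhenFull(BlockingQueue<String> queue) {
        try {
            queue.add("overflow");
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    /** Drop: offer() returns false immediately and the item is not enqueued. */
    static boolean offerDrops(BlockingQueue<String> queue) {
        return !queue.offer("overflow");
    }

    /** Bounded block: wait up to the timeout for space, then give up. */
    static boolean timedOfferGivesUp(BlockingQueue<String> queue, long millis)
            throws InterruptedException {
        return !queue.offer("overflow", millis, TimeUnit.MILLISECONDS);
    }

    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(1);
        queue.put("first"); // the queue is now full; a second put() would block
        System.out.println("add() threw: " + addThrowsWhenFull(queue));
        System.out.println("offer() dropped: " + offerDrops(queue));
        System.out.println("timed offer gave up: " + timedOfferGivesUp(queue, 50));
        System.out.println("queue size: " + queue.size());
    }
}
```

The timed offer() is a reasonable middle ground when the producer is a thread you cannot afford to block indefinitely: it applies backpressure for a bounded time, then lets the caller decide whether to drop or reject.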

BoundedQueueExample.java (Java)
// io.thecodeforge — System Design tutorial

import java.util.concurrent.*;

public class BoundedQueueExample {
    // GOOD: bounded queue with blocking backpressure
    private static final BlockingQueue<Runnable> queue = new ArrayBlockingQueue<>(100);
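    private static final ThreadPoolExecutor executor = new ThreadPoolExecutor(
            10, 10, 0L, TimeUnit.MILLISECONDS, queue);
    
    public static void main(String[] args) throws InterruptedException {
        // Worker threads are created lazily by execute(); tasks here go straight
        // into the queue, so start the core threads up front or nothing drains it
        executor.prestartAllCoreThreads();
        // Producer blocks when queue is full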
        for (int i = 0; ; i++) {
            final int taskId = i;
            // put() blocks until space available
            queue.put(() -> {
                try {
                    Thread.sleep(1000);
                } catch (InterruptedException e) {}
                System.out.println("Processed " + taskId);
            });
        }
    }
}
// Output: tasks processed at rate of consumer, producer blocks
Output
Processed 0
Processed 1
... (steady rate, no OOM)
Senior Shortcut: ThreadPoolExecutor with Bounded Queue
Use the ThreadPoolExecutor constructor directly instead of the Executors factory methods. You control the queue type and the rejection policy. CallerRunsPolicy is often the best choice: it runs the rejected task on the submitting thread, which naturally throttles the producer.
[Figure: Bounded Queue Backpressure Flow. The producer generates requests at its own pace into a bounded queue of fixed capacity (e.g. ArrayBlockingQueue). When the queue is full, the producer must block, drop, or throw, which forces it to slow down; the consumer processes at a sustainable rate. Note: unbounded queues hide backpressure until OOM kills the process.]
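To make the CallerRunsPolicy behavior concrete, here is a minimal sketch. The pool size (2 threads), queue capacity (2), and 50 ms task duration are arbitrary assumptions chosen so that rejection happens quickly; the class and method names are hypothetical.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative sketch: names and sizes are assumptions, not recommendations.
class CallerRunsSketch {
    /** Submits `tasks` short jobs and returns how many ran on the submitting thread. */
    static int runAndCountCallerRuns(int tasks) throws InterruptedException {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                2, 2, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(2),
                new ThreadPoolExecutor.CallerRunsPolicy());
        Thread producer = Thread.currentThread();
        AtomicInteger ranOnProducer = new AtomicInteger();
        AtomicInteger completed = new AtomicInteger();
        for (int i = 0; i < tasks; i++) {
            // When both threads are busy and the queue is full, execute() runs the
            // task right here on the producer thread, so the loop cannot race ahead.
            pool.execute(() -> {
                if (Thread.currentThread() == producer) {
                    ranOnProducer.incrementAndGet();
                }
                try {
                    Thread.sleep(50); // simulate work
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                completed.incrementAndGet();
            });
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        if (completed.get() != tasks) {
            throw new IllegalStateException("expected every task to complete");
        }
        return ranOnProducer.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("Tasks run by the producer itself: " + runAndCountCallerRuns(20));
    }
}
```

No task is dropped and nothing throws; the cost is that the producer thread is occupied while it runs the overflow task, which is exactly the throttling effect.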

Reactive Streams: The Demand-Driven Approach

Bounded queues push back only once they're full. Reactive Streams (the specification behind the Java Flow API, RxJava's Flowable, and Project Reactor) flips that around: the consumer tells the producer how much it can handle. This is called 'demand signaling'.

In Reactive Streams, the Subscriber calls request(n) on the Subscription to indicate it's ready for n items. The Publisher must not send more than requested. This is backpressure built into the protocol.

This pattern shines in data pipelines where you have multiple stages. Each stage requests only what it can process. The backpressure propagates all the way to the source. No buffers overflow because no stage sends more than the next stage can consume.

The downside? Complexity. Reactive code is harder to debug. Stack traces tend to show scheduler and operator internals rather than your call chain. And if any stage forgets to call request(), the pipeline stalls silently. I've seen production incidents where a misconfigured buffer caused a 30-minute data delay because a downstream subscriber requested too few items.

ReactiveBackpressureExample.java (Java)
// io.thecodeforge — System Design tutorial

import java.util.concurrent.Flow.*;
import java.util.concurrent.SubmissionPublisher;

public class ReactiveBackpressureExample {
    public static void main(String[] args) throws InterruptedException {
        SubmissionPublisher<Integer> publisher = new SubmissionPublisher<>();
        
        Subscriber<Integer> subscriber = new Subscriber<>() {
            private Subscription subscription;
            private static final int REQUEST_SIZE = 5;
            
            @Override
            public void onSubscribe(Subscription subscription) {
                this.subscription = subscription;
                subscription.request(REQUEST_SIZE); // demand signal
            }
            
            @Override
            public void onNext(Integer item) {
                System.out.println("Processing " + item);
                try { Thread.sleep(200); } catch (InterruptedException e) {}
                subscription.request(1); // request one more after processing
            }
            
            @Override
            public void onError(Throwable t) { t.printStackTrace(); }
            
            @Override
            public void onComplete() { System.out.println("Done"); }
        };
        
        publisher.subscribe(subscriber);
        for (int i = 0; i < 20; i++) {
            publisher.submit(i);
        }
        publisher.close();
        Thread.sleep(5000);
    }
}
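// Output: items arrive at the consumer's rate. request(n) caps what is delivered;
// the publisher buffers the rest (256 items per subscriber by default) and submit() blocks once that buffer is full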
Output
Processing 0
Processing 1
... (steady rate, no overflow)
The Classic Bug: Forgetting to request()
[Figure: Reactive Streams vs Bounded Queues (demand-driven vs pushback backpressure).
Bounded Queue: producer pushes until the queue is full; backpressure is reactive (block/drop); imperative, the producer must handle a full queue; risk of deadlock with shared threads; simple to implement and reason about.
Reactive Streams: consumer requests N items via request(); backpressure is proactive (demand); declarative, the framework manages flow; no blocking, uses async signaling; steeper learning curve, more flexible.
Takeaway: demand signaling prevents buffer bloat without blocking threads.]
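The silent stall caused by a missing request() call is easy to reproduce. The sketch below uses the JDK's SubmissionPublisher; the class name, the keepRequesting flag, and the 500 ms settling delay are assumptions made for the demonstration.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.Flow;
import java.util.concurrent.SubmissionPublisher;
import java.util.concurrent.TimeUnit;

// Illustrative sketch: names and timings are hypothetical.
class StalledSubscriberSketch {
    /** Publishes `items` integers to a subscriber that requests `initial` items up front. */
    static List<Integer> consume(int items, int initial, boolean keepRequesting)
            throws InterruptedException {
        List<Integer> received = new CopyOnWriteArrayList<>();
        try (SubmissionPublisher<Integer> publisher = new SubmissionPublisher<>()) {
            publisher.subscribe(new Flow.Subscriber<Integer>() {
                private Flow.Subscription subscription;

                @Override
                public void onSubscribe(Flow.Subscription s) {
                    subscription = s;
                    s.request(initial); // initial demand
                }

                @Override
                public void onNext(Integer item) {
                    received.add(item);
                    if (keepRequesting) {
                        subscription.request(1); // without this, delivery stops for good
                    }
                }

                @Override
                public void onError(Throwable t) { t.printStackTrace(); }

                @Override
                public void onComplete() { }
            });
            for (int i = 0; i < items; i++) {
                publisher.submit(i); // buffered by the publisher, no error raised
            }
            TimeUnit.MILLISECONDS.sleep(500); // give asynchronous delivery time to settle
        }
        return received;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("healthy subscriber received: " + consume(10, 2, true).size());
        System.out.println("stalled subscriber received: " + consume(10, 2, false).size());
    }
}
```

The stalled subscriber gets only its initial demand and then nothing: no exception, no log line. The remaining items sit in the publisher's buffer, which is why a watchdog timeout on "time since last onNext" is worth having.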

Backpressure in HTTP: 429 Too Many Requests

In HTTP services, backpressure is often implemented as rate limiting with the 429 (Too Many Requests) status code. The server tells the client to slow down and can say for how long with a Retry-After header. This is explicit backpressure at the application layer.

But 429 is a blunt instrument. It works well for external APIs. For internal microservices, you want something more nuanced — like circuit breakers or bulkheads. 429 can cause clients to retry aggressively, making things worse. Always implement exponential backoff with jitter on the client side.
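Client-side exponential backoff with jitter can be sketched as below. This uses the "full jitter" variant (a uniformly random delay between zero and an exponentially growing ceiling); the 100 ms base, 10 s cap, and all names are assumptions, and the Retry-After handling assumes the header carries a number of seconds rather than an HTTP date.

```java
import java.util.concurrent.ThreadLocalRandom;

// Illustrative sketch: names and constants are hypothetical.
class BackoffSketch {
    /** Ceiling for the given attempt: base * 2^attempt, capped at maxMillis. */
    static long ceilingMillis(int attempt, long baseMillis, long maxMillis) {
        int shift = Math.min(attempt, 30); // keep the shift small enough to avoid overflow
        return Math.min(maxMillis, baseMillis << shift);
    }

    /** Full jitter: a uniformly random delay in [0, ceiling]. */
    static long delayMillis(int attempt, long baseMillis, long maxMillis) {
        long ceiling = ceilingMillis(attempt, baseMillis, maxMillis);
        return ThreadLocalRandom.current().nextLong(ceiling + 1);
    }

    /** Prefer the server's Retry-After (in seconds) when present, else jittered backoff. */
    static long nextDelayMillis(int attempt, Long retryAfterSeconds) {
        if (retryAfterSeconds != null) {
            return retryAfterSeconds * 1000;
        }
        return delayMillis(attempt, 100, 10_000);
    }

    public static void main(String[] args) {
        for (int attempt = 0; attempt < 8; attempt++) {
            System.out.println("attempt " + attempt
                    + ": ceiling=" + ceilingMillis(attempt, 100, 10_000) + "ms"
                    + ", jittered delay=" + delayMillis(attempt, 100, 10_000) + "ms");
        }
    }
}
```

The jitter matters as much as the exponent: without it, every client that was rejected at the same moment retries at the same moment, and the spike simply repeats.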

A better pattern for internal services is to use a bounded queue with a rejection policy that returns 503 (Service Unavailable) when the queue is full. This signals the caller to back off, and the load balancer can route to another instance. Combine with circuit breakers to prevent cascading.

HttpBackpressureExample.java (Java)
// io.thecodeforge — System Design tutorial

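import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.*;

public class HttpBackpressureExample {
    // Bounded worker pool: 10 threads plus 100 queued requests, then submit() throws
    private static final ExecutorService workers = new ThreadPoolExecutor(
            10, 10, 0L, TimeUnit.MILLISECONDS, new ArrayBlockingQueue<>(100),
            new ThreadPoolExecutor.AbortPolicy()); // throws RejectedExecutionException
    
    public static void main(String[] args) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(8080), 0);
        // The handler runs on the server's default executor and only hands work to
        // the bounded pool. Using that pool as the server executor as well would
        // mean tasks submitting to their own pool.
        server.createContext("/process", exchange -> {
            try {
                workers.submit(() -> {
                    try {
                        Thread.sleep(1000); // simulate work
                        respond(exchange, 200, "Processed");
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    } catch (IOException e) {
                        // client disconnected; nothing left to send
                    } finally {
                        exchange.close();
                    }
                });
            } catch (RejectedExecutionException e) {
                respond(exchange, 503, "Too many requests"); // queue full: shed load
            }
        });
        server.start();
        System.out.println("Server started on port 8080");
    }

    // Writes the status line and body, then closes the stream to complete the exchange
    private static void respond(HttpExchange exchange, int status, String body) throws IOException {
        byte[] bytes = body.getBytes(StandardCharsets.UTF_8);
        exchange.sendResponseHeaders(status, bytes.length);
        try (OutputStream out = exchange.getResponseBody()) {
            out.write(bytes);
        }
    }
}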
// Output: returns 503 when queue full
Output
HTTP/1.1 503 Service Unavailable
Content-Length: 17
Too many requests
Interview Gold: 429 vs 503 for Backpressure

Backpressure in Message Queues: Kafka, RabbitMQ, SQS

Message brokers handle backpressure differently. Kafka consumers control their own pace — they poll at their own rate. The broker doesn't push. So backpressure is implicit: if you poll slowly, you consume slowly. The problem is that the consumer's internal processing pipeline might still overflow if it buffers messages internally.

RabbitMQ uses consumer prefetch. Set a prefetch count to limit how many unacknowledged messages a consumer can have. This is explicit backpressure. If your consumer processes slowly, RabbitMQ stops sending more. But if your consumer crashes, messages can be redelivered, causing duplicates.

SQS has no built-in backpressure. You must implement it yourself. The consumer polls messages, processes them, and deletes them. If processing is slow, you can reduce the polling frequency or use a circuit breaker. But SQS will keep delivering messages as long as you poll. This is a common source of unbounded growth in serverless architectures.
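For pull-based consumers where you have to build the backpressure yourself, one common approach is to cap the number of messages in flight with a semaphore, so the polling thread simply waits before fetching more. The sketch below is broker-agnostic and does not call any SQS or Kafka API; InFlightLimiter, dispatch, and the limits used in main are hypothetical.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

// Illustrative sketch: a poll loop would call dispatch() for each received message.
class InFlightLimiter {
    private final Semaphore permits;
    private final ExecutorService workers;
    private final AtomicInteger inFlight = new AtomicInteger();
    private final AtomicInteger peakInFlight = new AtomicInteger();

    InFlightLimiter(int maxInFlight, int workerThreads) {
        this.permits = new Semaphore(maxInFlight);
        // The pool's own queue is unbounded, but the semaphore keeps it at or
        // below maxInFlight entries.
        this.workers = Executors.newFixedThreadPool(workerThreads);
    }

    /** Hands one message to the workers, blocking the caller while the limit is reached. */
    void dispatch(String message, Consumer<String> handler) throws InterruptedException {
        permits.acquire(); // backpressure: the polling thread waits here
        peakInFlight.accumulateAndGet(inFlight.incrementAndGet(), Math::max);
        workers.execute(() -> {
            try {
                handler.accept(message);
            } finally {
                inFlight.decrementAndGet();
                permits.release();
            }
        });
    }

    int peakInFlight() {
        return peakInFlight.get();
    }

    void shutdownAndWait() throws InterruptedException {
        workers.shutdown();
        workers.awaitTermination(30, TimeUnit.SECONDS);
    }

    public static void main(String[] args) throws InterruptedException {
        InFlightLimiter limiter = new InFlightLimiter(3, 2);
        AtomicInteger processed = new AtomicInteger();
        for (int i = 0; i < 20; i++) { // stand-in for messages returned by a poll
            limiter.dispatch("message-" + i, msg -> {
                try {
                    Thread.sleep(10); // simulate slow processing
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                processed.incrementAndGet();
            });
        }
        limiter.shutdownAndWait();
        System.out.println("processed=" + processed.get()
                + ", peak in flight=" + limiter.peakInFlight());
    }
}
```

Because the polling thread blocks in dispatch(), it cannot fetch the next batch until workers free up, so memory use stays proportional to the in-flight limit rather than to the backlog in the broker.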

KafkaConsumerBackpressure.java (Java)
// io.thecodeforge — System Design tutorial

import org.apache.kafka.clients.consumer.*;
import java.time.Duration;
import java.util.*;

public class KafkaConsumerBackpressure {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "my-group");
        props.put("enable.auto.commit", "false");
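        // Deserializers are required; the consumer fails at construction without them
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("max.poll.records", "10"); // backpressure: limit per poll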
        
        KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
        consumer.subscribe(Arrays.asList("my-topic"));
        
        while (true) {
            ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(100));
            for (ConsumerRecord<String, String> record : records) {
                process(record);
            }
            consumer.commitSync(); // commit after processing
        }
    }
    
    private static void process(ConsumerRecord<String, String> record) {
        // simulate processing
        try { Thread.sleep(100); } catch (InterruptedException e) {}
        System.out.println("Processed: " + record.value());
    }
}
// Output: processes at consumer pace; each poll() hands the application at most 10 records
Output
Processed: message-0
Processed: message-1
... (steady rate)
Never Do This: Unbounded Internal Buffer in Kafka Consumer

When Backpressure Breaks: Anti-Patterns and Gotchas

Backpressure isn't a silver bullet. Here are the ways it fails in production.

Deadlock with blocking queues. If your producer and consumer share the same thread pool, blocking can cause deadlock. Example: a web server thread submits a task to a bounded queue, and the task tries to submit another task to the same queue. If the queue is full, the first task blocks, consuming a thread, and the second task never runs. Fix: use separate thread pools or non-blocking patterns.

Starvation with priority inversion. If a low-priority task holds a resource that a high-priority task needs, and the low-priority task is blocked by backpressure, the high-priority task starves. This is rare but nasty.

Backpressure amplification. If every service in a chain applies backpressure independently, the system can become overly conservative. A transient slowdown at the tail can cause the head to stall completely. Use circuit breakers with timeouts to break the chain.

Monitoring blind spots. Backpressure hides problems. If the queue is always full, you might think the system is healthy because it's not crashing. But latency is high. Monitor queue depth and processing latency, not just throughput.

DeadlockExample.java (Java)
// io.thecodeforge — System Design tutorial

import java.util.concurrent.*;

public class DeadlockExample {
    private static final BlockingQueue<Runnable> queue = new ArrayBlockingQueue<>(1);
    private static final ExecutorService executor = new ThreadPoolExecutor(
            1, 1, 0L, TimeUnit.MILLISECONDS, queue);
    
    public static void main(String[] args) throws InterruptedException {
        executor.submit(() -> {
            System.out.println("Task 1 started");
            try {
                // submit() succeeds (the queue has room), but get() waits forever:
                // the pool's only thread is busy running Task 1, so Task 2 never starts
                executor.submit(() -> System.out.println("Task 2")).get();
            } catch (Exception e) {
                System.out.println("Deadlock: " + e.getMessage());
            }
        });
        executor.shutdown();
    }
}
// Output: deadlock — Task 1 blocks forever
Output
Task 1 started
(program hangs)
Never Do This: Submitting to Same Executor Inside Task
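The separate-pools fix mentioned earlier can be sketched as follows. The nested task goes to its own executor, so the outer task waiting on it cannot starve it of a thread. Pool sizes, queue capacities, timeouts, and names are assumptions for the example.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Illustrative sketch: names and sizes are hypothetical.
class SeparatePoolsSketch {
    private static ExecutorService boundedPool(int threads, int queueCapacity) {
        return new ThreadPoolExecutor(threads, threads, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(queueCapacity));
    }

    static String runNested() throws InterruptedException, ExecutionException, TimeoutException {
        ExecutorService outer = boundedPool(1, 1);
        ExecutorService inner = boundedPool(1, 1); // dedicated pool for nested work
        try {
            Future<String> result = outer.submit(() -> {
                Future<String> nested = inner.submit(() -> "Task 2 done");
                // A timeout turns any remaining stall into a visible error
                return "Task 1 saw: " + nested.get(1, TimeUnit.SECONDS);
            });
            return result.get(2, TimeUnit.SECONDS);
        } finally {
            outer.shutdown();
            inner.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(runNested()); // prints "Task 1 saw: Task 2 done"
    }
}
```

Even with separate pools, keeping a timeout on the inner get() is a cheap safeguard: if someone later points both stages at the same executor, the hang becomes a TimeoutException instead of a silent stall.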

Backpressure in Distributed Systems: Circuit Breakers and Bulkheads

In distributed systems, backpressure must be combined with circuit breakers and bulkheads. A circuit breaker monitors failure rates and opens when too many requests fail, preventing calls to a downstream that's already struggling. This is a form of backpressure — it stops the flow of requests to a failing service.

Bulkheads isolate resources. If one service's thread pool is exhausted, it doesn't affect others. This limits the blast radius of backpressure. For example, separate thread pools for different downstream services.

Together, these patterns create a system that degrades gracefully. When a downstream service slows down, the circuit breaker opens, requests are rejected fast (fail-fast), and the upstream doesn't accumulate work. This is better than letting backpressure propagate and stall everything.

The trade-off is complexity. You need to tune timeouts, thresholds, and pool sizes. Get it wrong and you'll have false positives (circuit breaker opens when it shouldn't) or false negatives (doesn't open when it should). Monitor and adjust.

CircuitBreakerExample.java (Java)
// io.thecodeforge — System Design tutorial

import java.time.*;
import java.util.concurrent.atomic.*;

public class CircuitBreakerExample {
    enum State { CLOSED, OPEN, HALF_OPEN }
    private final AtomicReference<State> state = new AtomicReference<>(State.CLOSED);
    private final AtomicInteger failureCount = new AtomicInteger(0);
    private final int threshold = 5;
    private final long timeoutMillis = 10000;
    private volatile long lastFailureTime;
    
    public boolean call(Runnable operation) {
        if (state.get() == State.OPEN) {
            if (System.currentTimeMillis() - lastFailureTime > timeoutMillis) {
                state.compareAndSet(State.OPEN, State.HALF_OPEN);
            } else {
                return false; // fast fail
            }
        }
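        try {
            operation.run();
            // Reset on every success so the threshold counts consecutive failures,
            // not every failure since the process started
            failureCount.set(0);
            state.compareAndSet(State.HALF_OPEN, State.CLOSED);
            return true;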
        } catch (Exception e) {
            failureCount.incrementAndGet();
            lastFailureTime = System.currentTimeMillis();
            if (failureCount.get() >= threshold) {
                state.set(State.OPEN);
            }
            return false;
        }
    }
}
// Usage: circuitBreaker.call(() -> downstreamService.process(data));
Output
Returns false when circuit is open, preventing calls to downstream
Senior Shortcut: Use Resilience4j, Don't Roll Your Own
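The bulkhead idea from this section is described in terms of separate thread pools; a lighter-weight variant caps concurrent calls per downstream with a semaphore and rejects the rest immediately. The sketch below shows that variant. It is a hand-rolled illustration with hypothetical names, not the API of any library.

```java
import java.util.Optional;
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

// Illustrative sketch: one instance per downstream dependency.
class BulkheadSketch {
    private final Semaphore slots;

    BulkheadSketch(int maxConcurrentCalls) {
        this.slots = new Semaphore(maxConcurrentCalls);
    }

    /** Runs the call if a slot is free; otherwise fails fast with an empty result. */
    <T> Optional<T> tryCall(Supplier<T> call) {
        if (!slots.tryAcquire()) {
            return Optional.empty(); // bulkhead full: reject instead of queueing
        }
        try {
            return Optional.ofNullable(call.get());
        } finally {
            slots.release();
        }
    }

    int availableSlots() {
        return slots.availablePermits();
    }

    public static void main(String[] args) {
        BulkheadSketch paymentsBulkhead = new BulkheadSketch(1);
        // While the outer call holds the only slot, a second call is rejected.
        Optional<String> outcome = paymentsBulkhead.tryCall(
                () -> paymentsBulkhead.tryCall(() -> "inner ran").orElse("inner rejected"));
        System.out.println(outcome.orElse("outer rejected")); // prints "inner rejected"
        System.out.println("slots free afterwards: " + paymentsBulkhead.availableSlots());
    }
}
```

A slow downstream can then occupy at most its own slots, so callers of other dependencies keep their threads. The thread-pool form gives stronger isolation (including timeouts on the calling side) at the cost of extra threads and context switches.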

Monitoring Backpressure: What to Watch For

You can't fix what you don't measure. Here are the key metrics for backpressure:

Queue depth. How many items are waiting? If it's consistently near capacity, you're at risk. Alert on queue depth > 80% of capacity.

Processing latency. The time from item arrival to processing start. If this grows, backpressure is building.

Rejection rate. How many requests are rejected due to full queues? A non-zero rate is okay — it means backpressure is working. But if it's high, you need more capacity.

Thread pool utilization. Are all threads busy? If yes, and queue is growing, you need more threads or better backpressure.

Circuit breaker state. Monitor how often circuits open and close. Frequent toggling indicates instability.

In production, I've seen teams ignore queue depth until it hits the limit and starts rejecting. By then, latency is already terrible. Set proactive alerts.

MonitoringExample.java (Java)
// io.thecodeforge — System Design tutorial

import java.util.concurrent.*;

public class MonitoringExample {
    private static final BlockingQueue<Runnable> queue = new ArrayBlockingQueue<>(100);
    private static final ThreadPoolExecutor executor = new ThreadPoolExecutor(
            10, 10, 0L, TimeUnit.MILLISECONDS, queue);
    
    public static void main(String[] args) {
        int rejected = 0;
        // Simulate load: 200 submissions against 10 threads + 100 queue slots
        for (int i = 0; i < 200; i++) {
            try {
                executor.submit(() -> {
                    try { Thread.sleep(1000); } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            } catch (RejectedExecutionException e) {
                // The default AbortPolicy throws once threads and queue are full;
                // the executor keeps no rejection counter, so count it here
                rejected++;
            }
        }
        
        // Monitor
        System.out.println("Queue size: " + executor.getQueue().size());
        System.out.println("Active threads: " + executor.getActiveCount());
        System.out.println("Completed tasks: " + executor.getCompletedTaskCount());
        System.out.println("Rejected tasks: " + rejected);
        
        executor.shutdown();
    }
}
// Output: shows queue depth, active threads, completed and rejected counts
Output
Queue size: 100
Active threads: 10
Completed tasks: 0
Rejected tasks: 90
Production Trap: Ignoring Queue Depth
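The 80% alert rule from the metrics list above can be expressed as a small check that a scheduled task or metrics exporter could call. This is a sketch only: QueueDepthCheck and its method names are hypothetical, and how the alert is delivered is left out.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Illustrative sketch: names are hypothetical.
class QueueDepthCheck {
    /** Fraction of the queue in use, from 0.0 (empty) to 1.0 (full). */
    static double utilization(BlockingQueue<?> queue) {
        long used = queue.size();
        long capacity = used + queue.remainingCapacity();
        return capacity == 0 ? 0.0 : (double) used / capacity;
    }

    /** True when utilization is strictly above the threshold (e.g. 0.8 for 80%). */
    static boolean shouldAlert(BlockingQueue<?> queue, double threshold) {
        return utilization(queue) > threshold;
    }

    public static void main(String[] args) {
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(10);
        for (int i = 0; i < 9; i++) {
            queue.offer(i);
        }
        System.out.println("utilization: " + utilization(queue));
        System.out.println("alert at 80%: " + shouldAlert(queue, 0.8));
    }
}
```

For a ThreadPoolExecutor, pass executor.getQueue(). Note that an unbounded LinkedBlockingQueue reports a remaining capacity near Integer.MAX_VALUE, so its utilization stays close to zero until memory is already gone, which is one more reason to bound the queue.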
Production incident · Post-mortem · Severity: high

The 4GB Container That Kept Dying

Symptom
A payment processing service would run fine for hours, then suddenly OOM. Heap dumps showed a LinkedBlockingQueue with millions of pending transaction objects. CPU was low, memory was gone.
Assumption
Team assumed a memory leak in the transaction object. Spent weeks profiling object allocations.
Root cause
The upstream Kafka consumer had no backpressure. It polled messages as fast as possible and offered them to an unbounded ExecutorService queue. When the downstream payment gateway slowed down (latency spikes), the queue grew unbounded until heap exhausted. The thread pool's work queue was a LinkedBlockingQueue with Integer.MAX_VALUE capacity.
Fix
Switched to a bounded queue with ArrayBlockingQueue(1000). Set RejectedExecutionHandler to CallerRunsPolicy, which throttles the Kafka consumer: when the queue is full, the poll thread runs the task itself and cannot poll again until it finishes. Also added a circuit breaker on the payment gateway client.
Key lesson
  • Unbounded queues are a ticking time bomb.
  • Always bound your queues and decide what happens when they're full — blocking, dropping, or rejecting.
Production debug guide: systematic recovery paths for the failure modes engineers actually hit (3 entries).
Symptom · 01
java.lang.OutOfMemoryError: Java heap space in async processor
Fix
1. Check the thread pool queue type: is it bounded?
2. Dump the heap and look for large collections (LinkedBlockingQueue, ArrayList).
3. Set a hard limit on queue size.
4. Add a rejection policy (CallerRunsPolicy or AbortPolicy).
5. Restart with the new config.
Symptom · 02
Latency spikes then throughput drops to zero
Fix
1. Check thread pool queue depth via JMX or /actuator/metrics.
2. Check if threads are blocked (jstack).
3. Look for deadlock between producer and consumer threads.
4. Increase queue capacity or add more threads temporarily.
5. Implement backpressure with a blocking put.
Symptom · 03
Circuit breaker toggling open/closed frequently
Fix
1. Check downstream latency and error rate.
2. Increase the circuit breaker threshold or timeout.
3. Add a bulkhead (separate thread pool) for that downstream.
4. If the downstream is healthy, check for false positives due to transient spikes.
5. Tune the sliding window size.
Backpressure Triage Cheat Sheet: first-response commands for when things go wrong, copy-paste ready.
OOM: `java.lang.OutOfMemoryError: Java heap space`
Immediate action
Check if thread pool queue is unbounded
Commands
jcmd <pid> GC.class_histogram | head -20
jstack <pid> | grep -A 20 'pool-'
Fix now
Set queue capacity: new ArrayBlockingQueue<>(1000). Set rejection policy: new ThreadPoolExecutor.CallerRunsPolicy()
High latency, low throughput
Immediate action
Check queue depth
Commands
curl localhost:8080/actuator/metrics/executor.queued   # assumes the pool is instrumented with Micrometer's ExecutorServiceMetrics
jstack <pid> | grep 'BLOCKED'
Fix now
Increase queue capacity or add threads. If blocked, check for deadlock.
Circuit breaker open
Immediate action
Check downstream health
Commands
curl -I http://downstream/health
grep 'circuit' /var/log/app.log
Fix now
If downstream is healthy, increase threshold. If unhealthy, wait for recovery or scale downstream.
Reactive stream stall
Immediate action
Check if request() is called
Commands
grep 'onNext' /var/log/app.log | tail -20
jstack <pid> | grep 'Subscriber'
Fix now
Add subscription.request(1) in onNext(). Add a timeout to detect stalls.
| Feature / Aspect | Bounded Queue (Blocking) | Reactive Streams | HTTP 429/503 |
| --- | --- | --- | --- |
| Backpressure mechanism | Block producer when full | Demand signal (request(n)) | Status code + Retry-After |
| Complexity | Low | High | Medium |
| Latency impact | Increases with queue depth | Controlled by demand | Immediate rejection |
| Best for | In-process async, thread pools | Data pipelines, streaming | HTTP APIs, external clients |
| Failure mode | Deadlock if not careful | Stall if request() forgotten | Client retry storms if no backoff |

Key takeaways

1. Backpressure is not optional in async systems: unbounded buffers are a ticking time bomb that will OOM under load.
2. Always bound your queues and decide what happens when they're full: block, drop, or reject. Blocking is the simplest backpressure signal.
3. Reactive streams give fine-grained demand control but add complexity and debugging difficulty. Use them only when you need non-blocking pipelines.
4. Combine backpressure with circuit breakers and bulkheads to prevent cascading failures in distributed systems.
INTERVIEW PREP · PRACTICE MODE

Interview Questions on This Topic

Q01 (Senior): How does backpressure prevent cascading failures in a microservices architecture?
Q02 (Senior): When would you choose a bounded blocking queue over reactive streams for...
Q03 (Senior): What happens when a Kafka consumer's internal processing queue is unboun...
Q04 (Junior): What is backpressure and why is it important in async systems?
Q05 (Senior): You notice a service's latency is spiking and throughput drops to zero. ...
Q06 (Senior): Design a system that processes a high-volume event stream with backpress...

Q01 of 06 · Senior

How does backpressure prevent cascading failures in a microservices architecture?

ANSWER
Backpressure limits the amount of work in flight. When a downstream service slows down, backpressure propagates upstream, causing the upstream to also slow down or reject requests. This prevents buffers from growing unbounded and avoids OOM. Combined with circuit breakers, it isolates failures so they don't cascade.
FAQ · 4 QUESTIONS

Frequently Asked Questions

01. What is backpressure in system design?
02. What's the difference between backpressure and rate limiting?
03. How do I implement backpressure in a Kafka consumer?
04. Can backpressure cause deadlocks?