Spring Boot Async Messaging Patterns: Kafka, RabbitMQ & Beyond
Master async messaging in Spring Boot: request-reply with correlation IDs, fire-and-forget, pub-sub, Kafka vs RabbitMQ trade-offs, and backpressure handling.
- Fire-and-forget sends a message and moves on — no response expected, highest throughput
- Request-reply adds a correlation ID and a reply topic/queue so the sender can match the response
- Pub-sub decouples publishers from subscribers — multiple consumers receive the same event
- Kafka excels at ordered, high-throughput, replayable event streams; RabbitMQ excels at flexible routing and low-latency RPC
- Backpressure must be handled explicitly: bounded queues, consumer rate limiting, and dead-letter queues
Async messaging is like dropping a letter in a post box instead of making a phone call. Fire-and-forget is a flyer in the letterbox — you walk away immediately. Request-reply is a registered letter with a return address — you wait for the postman to bring an answer. Pub-sub is a newspaper — one publisher, many readers, each getting their own copy on their own schedule.
Synchronous HTTP calls are the instinctive choice when building microservices APIs. They are easy to reason about, easy to test, and return results immediately. But in production, synchronous calls become fragility vectors: a slow downstream service holds a thread, a timeout cascade can take down an entire call chain, and a service restart loses all in-flight requests. Async messaging inverts this dynamic — the caller deposits work into a broker and continues processing immediately, decoupled from the receiver's availability.
Spring Boot's messaging ecosystem is mature and rich. Spring Kafka wraps the Kafka client with @KafkaListener, @KafkaTemplate, and deep integration with Spring transactions and the application context lifecycle. Spring AMQP provides the same for RabbitMQ with additional routing primitives that Kafka's topic model does not offer. Spring Cloud Stream adds an abstraction layer that lets you switch brokers by changing a dependency.
The choice of pattern — fire-and-forget, request-reply, or pub-sub — shapes the entire architecture of a feature. Fire-and-forget is appropriate when the caller does not need a result: sending a welcome email, logging an audit event, triggering a background report. Request-reply is appropriate when the caller does need a result but can tolerate higher latency than a synchronous call: an order enrichment service that calls a pricing service asynchronously while computing other fields in parallel. Pub-sub is appropriate when multiple downstream systems need to react to the same event: an OrderPlaced event consumed by inventory, billing, analytics, and fulfilment.
The choice of broker shapes throughput, ordering guarantees, and routing capabilities. Kafka's partitioned log is the choice for high-throughput event streaming (millions of events per second), ordered processing within a partition, and event replay (new consumers can read from the beginning of a topic). RabbitMQ's exchange/queue model is the choice for flexible routing (topic exchanges, fanout exchanges, header routing), low-latency RPC patterns, and priority queues. Both are mature, production-proven, and have excellent Spring Boot support.
Backpressure is the often-overlooked consequence of async messaging. If producers publish faster than consumers can process, the broker queue grows without bound until it exhausts memory or disk. Explicit backpressure mechanisms — bounded queues, consumer concurrency limits, rate limiting, and dead-letter queues — are essential production requirements, not optional enhancements.
Fire-and-Forget: Maximising Throughput
Fire-and-forget is the simplest and highest-throughput async pattern. The producer publishes a message to the broker and immediately continues execution without waiting for processing confirmation beyond the broker's acknowledgement (acks=all for Kafka). It is appropriate when the business operation is complete once the event is durably recorded, regardless of downstream processing.
Common use cases: audit logging, welcome emails, analytics events, cache invalidation signals, search index updates. The key property these share is that the producer does not need to know whether or when the consumer processes the event, and a processing failure in the consumer can be handled independently (DLQ + alerting) without affecting the producer's success path.
In Spring Boot with Kafka, fire-and-forget uses KafkaTemplate.send(), which returns a CompletableFuture. For true fire-and-forget, you can ignore the future — the message is delivered to the broker regardless. For production resilience, add a callback to log send failures: listenableFuture.whenComplete((result, ex) -> { if (ex != null) log.error(...); }). This does not block the producer but gives visibility into broker-level failures.
With RabbitMQ, fire-and-forget uses RabbitTemplate.convertAndSend(). Publisher confirms should be enabled to detect broker-level rejections. Without publisher confirms, a message can be silently dropped if the exchange does not have a matching queue.
Critical producer configuration for Kafka reliability: acks=all (wait for all in-sync replicas), retries=Integer.MAX_VALUE with idempotent producer enabled, and max.in.flight.requests.per.connection=5 (safe with idempotent producer). This prevents message loss and duplicate production without sacrificing throughput.
Request-Reply with Correlation IDs
Request-reply over a message broker gives you the asynchronous decoupling benefit while still returning a result to the caller. The pattern works by attaching a correlation ID to the outgoing request message and a reply-to topic or queue. The consumer processes the request and publishes its response to the reply-to destination, including the original correlation ID. The caller matches the incoming response to the outstanding request using the correlation ID.
Spring Kafka provides ReplyingKafkaTemplate, which automates this pattern. It creates a reply consumer, generates correlation IDs, serialises/deserialises request and reply, and blocks (or uses a CompletableFuture) until the response arrives or a timeout expires. It is conceptually clean but carries a production warning: blocking on .get() consumes a thread for the duration of the wait. Under load, this can exhaust thread pools and cause cascading timeouts.
The production-safe approach is to use non-blocking callbacks: replyingTemplate.sendAndReceive(record).whenComplete((reply, ex) -> { ... }). This releases the calling thread immediately and processes the reply on a separate callback thread.
RabbitMQ's RabbitTemplate.convertSendAndReceive() implements the same pattern for AMQP. RabbitMQ is often preferred for request-reply because its exchange/queue model supports per-request reply queues naturally, and its low-latency routing makes round trips shorter.
When to use request-reply over Kafka vs direct HTTP: use async request-reply when the caller can proceed with other work while waiting, or when the downstream service needs to batch requests for efficiency. Use HTTP when you need the response synchronously in the same request context and latency must be minimised.
ReplyingKafkaTemplate.sendAndReceive().get() blocks a thread for up to the timeout duration. With 200 concurrent requests and a 10s timeout, you need 200 threads minimum. Use .thenApply() callbacks instead, or wrap in a virtual thread (Java 21+) to avoid blocking platform threads.Pub-Sub: Event-Driven Fan-Out
Pub-sub (publish-subscribe) is the pattern where a producer publishes an event to a topic or exchange, and any number of independent consumers receive and process their own copy of the event. The producer has no knowledge of its consumers — it simply publishes the fact that something happened. This is the core pattern of event-driven microservices.
In Kafka, pub-sub is implemented with consumer groups. Each consumer group gets its own copy of every message in the topic. Within a consumer group, each partition is processed by exactly one consumer — this provides horizontal scalability up to the number of partitions. To add a new subscriber (e.g., an analytics service wants to know about orders), simply create a new consumer group with a new groupId — no changes to the producer or other consumers required.
In RabbitMQ, pub-sub is implemented with fanout exchanges or topic exchanges. A fanout exchange routes every message to all bound queues. A topic exchange routes based on routing key patterns (orders.# routes all order events). Adding a new subscriber means binding a new queue to the exchange — again, no producer changes required.
The critical operational difference: Kafka retains messages for a configurable period (7 days by default), so a new consumer group can replay historical events. RabbitMQ does not — messages are deleted after acknowledgement, so a new queue only receives messages published after it was bound. This makes Kafka the choice for event sourcing and audit trails; RabbitMQ the choice for transient notifications.
Backpressure in pub-sub: if one consumer group is slow, it accumulates lag independently without affecting other consumer groups (Kafka) or other queues (RabbitMQ). Monitor per-consumer-group lag independently and scale slow consumers separately.
Kafka vs RabbitMQ: Choosing the Right Broker
Kafka and RabbitMQ are both battle-tested message brokers, but their architectures suit different use cases. Choosing the wrong one creates friction that compounds over time.
Kafka's core abstraction is the immutable, partitioned, replicated log. Producers append to the end of the log. Consumers read at their own pace and track their position (offset). Messages are retained regardless of consumption — this enables replay, event sourcing, and adding new consumers without data loss. Kafka is the choice when: (1) throughput exceeds 100K messages/second, (2) message ordering within an entity is required, (3) event replay is needed (new service onboarding, disaster recovery, debugging), (4) the use case maps to event streaming (CDC, audit logs, ML feature pipelines).
RabbitMQ's core abstraction is the exchange/queue routing model. Messages are routed from exchanges to queues based on routing keys and bindings. Messages are deleted after consumer acknowledgement. RabbitMQ is the choice when: (1) flexible routing is needed (route order events to different queues based on country, priority, or product category), (2) per-message TTL or priority is needed, (3) request-reply RPC patterns dominate (low-latency round trips), (4) the team is more familiar with AMQP semantics.
Kafka disadvantages: operational complexity (ZooKeeper or KRaft mode, partition rebalancing, offset management), high storage cost if topics are large and retention is long, no per-message TTL or priority, minimum useful cluster size is 3 brokers. RabbitMQ disadvantages: no message replay after acknowledgement, throughput ceiling around 50K messages/second per queue, complex cluster topologies for high availability.
Hybrid architectures are common: use Kafka for domain events and event streaming, use RabbitMQ for task queues and low-latency RPC within a bounded context.
Backpressure: Preventing Consumer Overload
Backpressure is the mechanism by which a slow consumer signals to the producer to slow down or by which the system limits the rate of message production to match the rate of consumption. Without backpressure, an unconstrained producer can fill the broker queue until disk or memory is exhausted, causing either message loss (if the broker drops messages) or broker failure.
In Kafka, consumer-side backpressure is controlled by max.poll.records (how many messages the consumer fetches per poll), the processing time per message, and consumer concurrency. If the consumer takes 100ms per message and fetches 500 records per poll, it can process 5000 records/second per thread. If the producer publishes at 10K/second, you need at least 2 consumer threads (up to the number of partitions). Monitor consumer lag as the primary backpressure indicator.
In RabbitMQ, backpressure is controlled by prefetchCount (how many unacknowledged messages the broker delivers to a consumer). Setting prefetchCount=1 means the broker waits for an ack before sending the next message — maximum backpressure, minimum throughput. Setting prefetchCount=50 allows the consumer to buffer 50 messages locally — higher throughput but more messages in flight if the consumer crashes.
Producer-side rate limiting with Spring Cloud Gateway or Resilience4j RateLimiter prevents the producer from overwhelming the broker in the first place. For internal services, use a Semaphore or a bounded BlockingQueue at the producer to limit in-flight requests.
WebFlux (Project Reactor) provides reactive backpressure natively: a Flux<Message> pulled from the broker propagates backpressure signals upstream — if the consumer's flatMap is slow, the Flux stops pulling new messages until demand exists. This is the most elegant backpressure solution for high-throughput streaming pipelines.
Choreography vs Orchestration Sagas with Messaging
Sagas implemented via messaging are one of the most important applications of async messaging patterns in microservices. A choreography saga uses domain events to coordinate between services with no central controller. An orchestration saga uses a central saga service that sends commands and waits for responses.
Choreography with Kafka: each service publishes domain events to well-known topics. Other services subscribe and react. The saga 'flow' is encoded in the event subscriptions. Example: OrderService publishes OrderPlaced → InventoryService listens, reserves stock, publishes StockReserved → PaymentService listens, charges card, publishes PaymentProcessed → FulfillmentService listens, ships. Compensation flows in reverse: PaymentFailed → InventoryService listens to PaymentFailed, publishes StockReleased.
Orchestration with Kafka: the saga orchestrator publishes commands to each service's command topic. Each service processes the command and publishes a result event. The orchestrator listens for result events and advances the saga state machine. Spring State Machine or Axon Framework can implement the state machine; Temporal provides durable workflow execution with retries and timeouts built in.
For messaging-based sagas, correlation ID is essential — every command and event in the saga chain must carry the saga ID so the orchestrator (or any debugging tool) can correlate the entire flow. Include the saga ID in the Kafka message key for partitioned ordering of saga events.
JMS Transaction Boundaries: Don't Lose Messages on a Crash
You think async makes you safe? Think again. Without proper transaction management, a message can be consumed and then vanish when your database write fails. That's a data loss incident waiting to happen.
Spring's JMS integration gives you two knobs: sessionTransacted (local JMS transaction) and a JtaTransactionManager (XA two-phase commit). For most services, sessionTransacted is enough — it ties the message acknowledgment to your listener's success. If your listener throws a RuntimeException, the message goes back on the queue.
But here's the gotcha: unless your database write is also in the same transaction, you can still lose data. If you're updating a PostgreSQL row and the commit before JMS ACK, a crash between commits sends your customer's order to the void.
For serious guarantees, use a JtaTransactionManager and a broker like ActiveMQ Artemis or IBM MQ that supports XA. Your platform team will thank you when you don't wake them at 3 AM.
Message Selectors: Stop Wasting CPU on Irrelevant Messages
Every message your consumer ignores is still a message you paid to deserialize, validate, and discard. That's throughput you're burning.
JMS message selectors let you filter messages at the broker level. The broker only delivers messages that match your expression. You consume zero CPU for messages you don't care about.
The selector is a SQL-like expression on the JMS message headers (JMSType, JMSDeliveryMode) and your custom String properties. You cannot filter on the message body — that ship sailed in the JMS 1.1 spec.
When does this matter? When you have a shared queue for multiple event types (e.g., 'order.created', 'order.cancelled') and your listener only processes one type. Instead of a generic listener with a switch statement, use selectors to partition the work. Your team gets smaller, faster consumers that never touch the wrong event.
But careful: selectors add broker-side overhead. If 90% of your messages match, the broker is doing pointless work. Profile before you over-optimize.
ReplyingKafkaTemplate Timeout Storm During Kafka Broker Restart
- Request-reply over Kafka is not truly async if you block on the response with .get().
- Use non-blocking callbacks.
- Add circuit breakers around all broker-dependent operations.
- Never assume that because your application is healthy, its dependencies are.
rabbitmqctl list_consumers. If consumers are zero, the Spring listener container has stopped — check for UncaughtException in the listener that caused the container to shut down (RabbitMQ listener containers stop on unhandled exceptions by default). Check application logs for 'SimpleMessageListenerContainer' errors. If consumers exist but lag grows, check prefetchCount — if it is 1 (default), throughput is limited to one message per consumer per round trip. Increase prefetchCount to 10-50 for batch workloads. Also check if messages are being NACK'd and re-queued in a tight loop (poison pill).kafka-consumer-groups.sh --bootstrap-server kafka:9092 --describe --group my-consumer-groupkubectl logs -l app=my-consumer --since=5m | grep -c 'ERROR'Key takeaways
ReplyingKafkaTemplate.get() under loadCommon mistakes to avoid
6 patternsUsing auto.commit=true in Kafka consumers
Blocking on ReplyingKafkaTemplate.get() in a web request thread
Sharing a Kafka consumer groupId between two different services
Not setting a dead-letter queue for failed messages
Publishing events with null keys in Kafka when ordering matters
Not enabling publisher confirms in RabbitMQ
Interview Questions on This Topic
What is the difference between fire-and-forget, request-reply, and pub-sub messaging patterns?
Frequently Asked Questions
That's Messaging. Mark it forged?
10 min read · try the examples if you haven't