Spring Boot API Timeout Handling: RestTemplate, WebClient, Resilience4j & More
Master API timeout handling in Spring Boot 3.
- Set connectTimeout and readTimeout on SimpleClientHttpRequestFactory for RestTemplate
- Use responseTimeout() and ChannelOption.CONNECT_TIMEOUT_MILLIS on the Reactor Netty HttpClient behind WebClient for reactive clients
- Add timeout=30 to @Transactional to kill long-running database transactions
- Wrap async calls with Resilience4j TimeLimiter or CompletableFuture.orTimeout()
- Return 503 (Service Unavailable) for internal timeouts, 504 (Gateway Timeout) when a downstream dependency times out
Think of timeouts like a restaurant kitchen rule: if a dish isn't ready in 30 minutes, the waiter stops waiting and apologizes to the customer rather than making them sit there forever. In Spring Boot, timeouts are those rules — they tell your application exactly how long to wait for a database query, a remote API call, or an async job before giving up and returning a clean error instead of hanging indefinitely.
It starts with a single slow third-party API. On Monday morning, response times creep from 200 ms to 12 seconds. Within minutes, your connection pool is exhausted, your thread pool is saturated, and your entire service is unresponsive — not because your code is broken, but because you never told it how long to wait. This is the cascade that takes down production systems every week, and yet timeout configuration remains one of the most neglected areas in Spring Boot services.
Timeout handling is not a single knob. Every I/O boundary in a Spring Boot application — HTTP client calls, database transactions, reactive streams, asynchronous tasks, and Feign clients in microservice chains — has its own timeout mechanism with its own defaults (often none). A service that sets timeouts only on RestTemplate but forgets its @Transactional methods or its WebClient reactive pipeline is still a ticking clock.
The HTTP status codes you return on timeout failures carry real operational meaning. A 503 Service Unavailable signals that your own service cannot fulfill the request right now, while a 504 Gateway Timeout tells the caller that your service was acting as a proxy and a downstream dependency didn't respond in time. Conflating these makes incident diagnosis dramatically harder for consumers and on-call engineers alike.
Modern Spring Boot 3.x and Java 17+ give you a rich toolkit: the classic SimpleClientHttpRequestFactory for RestTemplate, the reactive responseTimeout on WebClient, declarative transaction timeouts via @Transactional, and Resilience4j's TimeLimiter for wrapping any CompletableFuture or Mono. Feign clients in microservice chains introduce their own timeout layer that interacts with — and can override — the underlying HTTP client settings.
This guide walks through every timeout surface area with production-grade configuration, real incident patterns, and copy-paste-ready code for Spring Boot 3.x with Java 17+. By the end you will have a systematic approach to auditing your service for timeout gaps and a reference cheat sheet for diagnosing timeout failures in production.
RestTemplate Timeout Configuration with SimpleClientHttpRequestFactory
RestTemplate is Spring's classic synchronous HTTP client, still widely used in Spring Boot 3.x legacy codebases. Despite being in maintenance mode (replaced by WebClient for new code), it remains common in enterprise services and understanding its timeout configuration is essential for anyone maintaining or migrating such a service.
The timeout configuration lives in the ClientHttpRequestFactory implementation. The default factory used by a no-arg new RestTemplate() is SimpleClientHttpRequestFactory, which wraps Java's HttpURLConnection. Critically, SimpleClientHttpRequestFactory applies no timeout by default: both connectTimeout and readTimeout are set to -1, which leaves HttpURLConnection on its own default of waiting indefinitely. This is the most common source of thread-starvation incidents in Spring Boot services.
For production use, always construct RestTemplate with an explicit factory configuration. The two critical properties are connectTimeout (how long to wait for the TCP connection to be established) and readTimeout (how long to wait for data to arrive on an established connection). Both are specified in milliseconds.
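As a minimal sketch of the configuration described above, the bean below builds a RestTemplate on a SimpleClientHttpRequestFactory with both timeouts set. The class name and the 2-second and 5-second values are illustrative assumptions, not recommendations from the original text; tune them against the upstream's measured latency.

```java
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.http.client.SimpleClientHttpRequestFactory;
import org.springframework.web.client.RestTemplate;

// Hypothetical configuration class; the name and timeout values are assumptions.
@Configuration
class RestTemplateTimeoutConfig {

    @Bean
    RestTemplate restTemplate() {
        SimpleClientHttpRequestFactory factory = new SimpleClientHttpRequestFactory();
        // Maximum time to establish the TCP connection, in milliseconds.
        factory.setConnectTimeout(2_000);
        // Maximum time to wait for data on an established connection, in milliseconds.
        factory.setReadTimeout(5_000);
        return new RestTemplate(factory);
    }
}
```

Service classes can then inject this bean instead of constructing their own RestTemplate, so every call goes through the configured factory.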
For Apache HttpClient-backed RestTemplate (recommended for connection pooling), use HttpComponentsClientHttpRequestFactory. This gives you full control over connection pool size, connection TTL, and per-request timeouts. The RequestConfig builder lets you set connectionRequestTimeout (how long to wait to acquire a connection from the pool, a third timeout surface often missed) alongside the connect and socket-read timeouts. On Apache HttpClient 5, which Spring Boot 3 uses, the socket-read setting is named responseTimeout rather than HttpClient 4's socketTimeout.
RestTemplate does not natively support reactive timeout propagation. If your service uses a reactive gateway layer (Spring Cloud Gateway) that sets a deadline on the incoming request, RestTemplate will not respect that deadline. This is the primary architectural reason to migrate to WebClient for new services — WebClient integrates with Project Reactor's context propagation and can participate in deadline-aware pipelines.
new RestTemplate() constructed without a factory will wait indefinitely for both connection and socket read. Always inject a configured RestTemplate bean; never call new RestTemplate() directly in service code.
WebClient Timeout Configuration for Reactive Services
WebClient is the modern, non-blocking HTTP client in Spring WebFlux and Spring Boot 3.x. Its timeout configuration is more nuanced than RestTemplate because it operates on a reactive pipeline with multiple observable points. A common mistake is configuring the HTTP-level responseTimeout but forgetting the TCP-level connection timeout, or vice versa.
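responseTimeout is the HTTP-level timeout on the Reactor Netty HttpClient that backs WebClient. It bounds how long the client waits for the response once the request has been sent, measured between network-level reads, so a response that keeps trickling in can take longer overall than the configured value. This is the most commonly configured option and covers the majority of latency scenarios. It is set with .responseTimeout(Duration.ofSeconds(5)) on the HttpClient, which is then handed to WebClient.Builder through a ReactorClientHttpConnector.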
The TCP-level timeout is configured on the Reactor Netty HttpClient: through tcpConfiguration() before Reactor Netty 1.0, and through option() and doOnConnected() from 1.0 onward, where tcpConfiguration() is deprecated. The connection timeout (TCP handshake) is set via ChannelOption.CONNECT_TIMEOUT_MILLIS. The read and write idle timeouts are set via ReadTimeoutHandler and WriteTimeoutHandler in the channel pipeline; these fire if no data is read or written within the specified duration, which is different from responseTimeout.
For production services, you typically want both levels configured: responseTimeout for the application-level SLA, and CONNECT_TIMEOUT_MILLIS for network-level fast-fail. If the target host is unreachable, CONNECT_TIMEOUT_MILLIS determines how quickly you fail; if it's reachable but slow to respond, responseTimeout determines how long you wait.
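The two levels can be combined in one bean, sketched below under a few assumptions: the class name, the bean name, the base URL, and the timeout values are all hypothetical. The calls themselves (option, responseTimeout, doOnConnected, ReactorClientHttpConnector) are the Reactor Netty 1.0+ and Spring WebFlux APIs.

```java
import io.netty.channel.ChannelOption;
import io.netty.handler.timeout.ReadTimeoutHandler;
import io.netty.handler.timeout.WriteTimeoutHandler;
import java.time.Duration;
import java.util.concurrent.TimeUnit;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.http.client.reactive.ReactorClientHttpConnector;
import org.springframework.web.reactive.function.client.WebClient;
import reactor.netty.http.client.HttpClient;

// Hypothetical configuration class; names, URL, and values are assumptions.
@Configuration
class WebClientTimeoutConfig {

    @Bean
    WebClient paymentWebClient(WebClient.Builder builder) {
        HttpClient httpClient = HttpClient.create()
                // Network-level fast-fail: give up on the TCP handshake after 3 s.
                .option(ChannelOption.CONNECT_TIMEOUT_MILLIS, 3_000)
                // Application-level budget for receiving the response.
                .responseTimeout(Duration.ofSeconds(5))
                // Idle timeouts on the channel: fire when nothing is read or written.
                .doOnConnected(conn -> conn
                        .addHandlerLast(new ReadTimeoutHandler(5, TimeUnit.SECONDS))
                        .addHandlerLast(new WriteTimeoutHandler(5, TimeUnit.SECONDS)));

        return builder
                .baseUrl("https://payments.example.com") // placeholder host
                .clientConnector(new ReactorClientHttpConnector(httpClient))
                .build();
    }
}
```

Injecting WebClient.Builder rather than calling WebClient.builder() keeps any customizers that Spring Boot auto-configuration applies.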
Per-request timeout overrides are supported via the httpRequest() callback on the request spec, which exposes the native Reactor Netty HttpClientRequest and its responseTimeout() setter, or more cleanly, by using the .timeout(Duration) operator on the resulting Mono/Flux at the reactive layer. This allows different endpoints of the same service to have different timeout budgets without needing separate WebClient instances.
Note that WebClient's non-blocking nature means thread exhaustion looks different: it's the Netty event-loop threads (typically one per CPU core) that saturate rather than a servlet thread pool. Netty event-loop threads are designed to handle thousands of concurrent connections, but each connection that's waiting for a slow response still occupies an in-flight request slot in the event loop queue. High concurrency to a slow upstream will eventually exhaust available memory for queued requests even if threads appear free.
@Transactional Timeout and Database Transaction Management
Database transactions are a frequently overlooked timeout surface. A @Transactional method that runs a slow query or waits on a row lock can hold a database connection for minutes, exhausting the HikariCP connection pool and starving the rest of the application of DB access. The timeout attribute on @Transactional directly addresses this: it instructs the transaction manager to set a deadline, and if the transaction has not committed by that deadline, it is rolled back.
The timeout value is in seconds (not milliseconds, unlike most other Spring timeout configurations — a common mistake). The timeout countdown begins when the transaction is opened, not when the query starts executing. This means for methods that do pre-processing before the first database call, the effective database query budget is timeout minus setup time.
Propagation interacts with timeout in important ways. A @Transactional(timeout=30) method that calls another @Transactional method with the default REQUIRED propagation will share the same transaction and the same timeout budget. The inner method does not reset or extend the timeout. However, calling a @Transactional(propagation=REQUIRES_NEW, timeout=10) method creates a new transaction with its own 10-second budget, independent of the outer transaction.
At the JDBC level, timeout is translated to Statement.setQueryTimeout() by the JPA provider (Hibernate, in most Spring Boot applications). The database then cancels the query server-side when the timeout fires, which is more efficient than client-side cancellation — the database terminates the query and releases server resources immediately. This is why transaction timeout is superior to simply wrapping a service call in a Resilience4j TimeLimiter for database-heavy operations.
For read-only queries, @Transactional(readOnly=true, timeout=10) is the recommended pattern. The readOnly hint allows Hibernate to skip dirty checking, and the timeout provides the safety net. For write operations, keep the transaction timeout tight and push long-running work outside the transaction boundary.
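The annotations discussed in this section might be combined as in the sketch below. ReportService and AuditService are hypothetical names, and the method bodies are placeholders where repository calls would go. The audit method lives in a separate bean because a call between two methods of the same class bypasses the Spring proxy, so its REQUIRES_NEW setting would not take effect.

```java
import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Propagation;
import org.springframework.transaction.annotation.Transactional;

// Hypothetical services illustrating timeout and propagation settings.
@Service
class ReportService {

    private final AuditService auditService;

    ReportService(AuditService auditService) {
        this.auditService = auditService;
    }

    // Read-only query with a 10-second budget (the unit is seconds).
    @Transactional(readOnly = true, timeout = 10)
    public String loadReport(long id) {
        return "report-" + id; // placeholder for a repository read
    }

    // Write path with a 30-second budget for the whole transaction.
    @Transactional(timeout = 30)
    public void updateReport(long id, String body) {
        // placeholder for a repository write
        auditService.record("report " + id + " updated");
    }
}

@Service
class AuditService {

    // Runs in its own transaction with an independent 10-second budget.
    @Transactional(propagation = Propagation.REQUIRES_NEW, timeout = 10)
    public void record(String message) {
        // placeholder for an audit insert
    }
}
```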
Transaction timeouts are enforced through Statement.setQueryTimeout(). However, this only applies to queries executed within the transaction; it does not prevent a statement from waiting indefinitely on a row lock. For lock-wait timeouts, set spring.jpa.properties.jakarta.persistence.lock.timeout=5000 (milliseconds; the key is javax.persistence.lock.timeout on Spring Boot 2.x) or use database-level lock_timeout settings.
Resilience4j TimeLimiter and CompletableFuture.orTimeout()
Resilience4j TimeLimiter provides a declarative timeout mechanism that works at the Java Future/Reactive layer, independent of the underlying I/O implementation. It is particularly useful for wrapping CompletableFuture-based async operations, third-party SDK calls that don't expose timeout configuration, or complex orchestration flows where multiple I/O calls need a single shared deadline.
TimeLimiter works by scheduling a cancellation task after the configured timeoutDuration. If the CompletableFuture or Mono completes before the deadline, the cancellation is discarded. If the deadline fires first, a TimeoutException is propagated and, when cancelRunningFuture is enabled, the future is cancelled. The important nuance is that cancellation does not guarantee the underlying work stops: CompletableFuture.cancel() never interrupts the thread running the task, and even where an interrupt is delivered, code that ignores interruption (such as JDBC blocking reads) will continue running in the background. This is a key operational concern: you may return a timeout response to the caller while the background work continues consuming resources.
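The gap between "the caller saw a timeout" and "the work stopped" can be observed with the JDK alone. The sketch below uses CompletableFuture.orTimeout() rather than Resilience4j, and the class and record names are made up for illustration; it reports whether the caller timed out and whether the background task still ran to completion.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical demo class: shows that a timed-out future does not stop its task.
class OrphanedWorkDemo {

    record Outcome(boolean callerTimedOut, boolean workFinishedAnyway) {}

    static Outcome run(long workMillis, long timeoutMillis) {
        AtomicBoolean workFinished = new AtomicBoolean(false);
        CountDownLatch taskExited = new CountDownLatch(1);

        CompletableFuture<String> work = CompletableFuture.supplyAsync(() -> {
            try {
                Thread.sleep(workMillis); // stands in for a slow blocking call
                workFinished.set(true);
                return "result";
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return "interrupted";
            } finally {
                taskExited.countDown();
            }
        });

        boolean callerTimedOut = false;
        try {
            work.orTimeout(timeoutMillis, TimeUnit.MILLISECONDS).join();
        } catch (CompletionException e) {
            callerTimedOut = e.getCause() instanceof TimeoutException;
        }

        try {
            // Wait for the background task so its fate can be reported.
            taskExited.await(workMillis + 2_000, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new Outcome(callerTimedOut, workFinished.get());
    }

    public static void main(String[] args) {
        System.out.println(run(300, 50));
    }
}
```

With a 300 ms task and a 50 ms timeout, both flags come back true: the caller is released early while the task keeps its thread until it finishes on its own.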
For pure Java 9+ code without Resilience4j, CompletableFuture.orTimeout(5, TimeUnit.SECONDS) is a lightweight alternative that schedules a TimeoutException if the future doesn't complete in time. CompletableFuture.completeOnTimeout(defaultValue, 5, TimeUnit.SECONDS) goes further and completes the future with a fallback value instead of an exception — useful for non-critical data fetches where a default is acceptable.
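A small JDK-only sketch of the two methods side by side. slowFetch is a stand-in for a remote call, simulated here with CompletableFuture.delayedExecutor; the class name, method names, and the "TIMEOUT" and "cached-default" strings are illustrative assumptions.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper class comparing orTimeout and completeOnTimeout.
class FutureTimeouts {

    /** Simulates a remote fetch that produces "live" after delayMillis. */
    static CompletableFuture<String> slowFetch(long delayMillis) {
        return CompletableFuture.supplyAsync(
                () -> "live",
                CompletableFuture.delayedExecutor(delayMillis, TimeUnit.MILLISECONDS));
    }

    /** orTimeout: the future fails with TimeoutException when the budget is exceeded. */
    static String fetchOrTimeout(long delayMillis, long budgetMillis) {
        try {
            return slowFetch(delayMillis)
                    .orTimeout(budgetMillis, TimeUnit.MILLISECONDS)
                    .join();
        } catch (CompletionException e) {
            if (e.getCause() instanceof TimeoutException) {
                return "TIMEOUT"; // a real caller would map this to an error response
            }
            throw e;
        }
    }

    /** completeOnTimeout: the future completes with a fallback value instead. */
    static String fetchOrDefault(long delayMillis, long budgetMillis) {
        return slowFetch(delayMillis)
                .completeOnTimeout("cached-default", budgetMillis, TimeUnit.MILLISECONDS)
                .join();
    }

    public static void main(String[] args) {
        System.out.println(fetchOrTimeout(500, 50));
        System.out.println(fetchOrDefault(500, 50));
    }
}
```

The fallback variant suits data the page can live without; the exception variant suits calls where a stale or default value would be misleading.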
In Resilience4j, TimeLimiter integrates cleanly with CircuitBreaker: a timeout counts as a failure for circuit breaker state machine purposes. This means a sustained stream of timeouts will open the circuit breaker and fail fast, preventing thread/resource exhaustion. This integration is one of the key reasons to prefer Resilience4j over raw orTimeout() for production services.
Annotation-based Resilience4j configuration via @TimeLimiter requires the method to return CompletableFuture or Mono/Flux. For blocking service methods, wrap them in CompletableFuture.supplyAsync() within a bounded thread pool — but be aware this shifts the blocking to the thread pool's threads rather than the caller's thread.
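One way to wrap blocking calls, sketched with JDK classes only: a fixed-size pool with a bounded queue, so that when every thread is stuck on a slow call the next submission is rejected immediately instead of piling up. The class name and the sizing parameters are assumptions; the same future could be returned from a method annotated with @TimeLimiter.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical wrapper that runs blocking calls on a bounded pool with a timeout.
class BoundedBlockingCalls {

    private final ThreadPoolExecutor pool;

    BoundedBlockingCalls(int threads, int queueCapacity) {
        this.pool = new ThreadPoolExecutor(
                threads, threads, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(queueCapacity),
                new ThreadPoolExecutor.AbortPolicy()); // reject when saturated
    }

    <T> CompletableFuture<T> submit(Supplier<T> blockingCall, long timeoutMillis) {
        try {
            return CompletableFuture.supplyAsync(blockingCall, pool)
                    .orTimeout(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (RejectedExecutionException e) {
            // Pool and queue are full: fail fast rather than queue without bound.
            return CompletableFuture.failedFuture(e);
        }
    }

    void shutdown() {
        pool.shutdownNow();
    }
}
```

The blocking still happens; it has only moved to the pool's threads. The bounded queue is what turns saturation into a fast, visible failure.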
Monitor the resilience4j.timelimiter.calls metric, tagged by kind (successful, timeout, failed). Alert when the timeout rate for any instance exceeds 5% of total calls, which indicates that the timeoutDuration may be too tight for current p99 latency or that the upstream is degrading.
503 vs 504: Returning the Right HTTP Status on Timeout
The HTTP status code returned when a timeout occurs is not cosmetic — it carries semantic meaning that affects how upstream services, API gateways, load balancers, and monitoring systems respond. Conflating 503 and 504 makes incident diagnosis significantly harder and can lead to incorrect retry behavior from clients.
503 Service Unavailable means your service itself is currently unable to handle the request. Use this when the timeout is internal: your own thread pool is exhausted, your circuit breaker is open, your database connection pool is drained, or a resource your service owns is unavailable. The Retry-After header should be included to give clients a hint about when to retry. Load balancers and API gateways typically remove a 503-returning instance from their upstream pool.
504 Gateway Timeout means your service was acting as a proxy or gateway and a downstream dependency failed to respond in time. Your service received the request, forwarded it downstream, and that downstream service did not respond within the timeout window. Use this when the timeout occurs in a client call to another service — a payment gateway, an inventory service, a shipping provider API. The consumer knows the failure is not in your service but in something your service depends on.
The practical implication: if your order service times out calling the inventory service, return 504 to the API gateway. The API gateway (or the caller) can then make an informed decision — retry the call, use cached data, or surface a specific error to the user. If you return 503, the caller may conclude your order service is down and stop sending traffic to it, when in fact only the inventory dependency is degraded.
In Spring Boot, the mapping is implemented via @ExceptionHandler or a global @ControllerAdvice. Different exception types from different timeout mechanisms map to different HTTP statuses. TimeLimiter throws TimeoutException, WebClient throws WebClientRequestException wrapping various IO exceptions, and @Transactional timeout throws TransactionTimedOutException.
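A minimal sketch of such a mapping with @RestControllerAdvice follows. The class name, the Retry-After value, and the detail messages are assumptions. Treating every WebClientRequestException and TimeoutException as a downstream timeout is also an assumption made for brevity: WebClientRequestException covers other request failures too, so a production handler would inspect the cause before choosing 504.

```java
import java.util.concurrent.TimeoutException;
import org.springframework.http.HttpHeaders;
import org.springframework.http.HttpStatus;
import org.springframework.http.ProblemDetail;
import org.springframework.http.ResponseEntity;
import org.springframework.transaction.TransactionTimedOutException;
import org.springframework.web.bind.annotation.ExceptionHandler;
import org.springframework.web.bind.annotation.RestControllerAdvice;
import org.springframework.web.reactive.function.client.WebClientRequestException;

// Hypothetical advice class mapping timeout exceptions to 503 or 504.
@RestControllerAdvice
class TimeoutExceptionAdvice {

    // Internal resource (the database) timed out: this service is the problem.
    @ExceptionHandler(TransactionTimedOutException.class)
    public ResponseEntity<ProblemDetail> onTransactionTimeout(TransactionTimedOutException ex) {
        return ResponseEntity.status(HttpStatus.SERVICE_UNAVAILABLE)
                .header(HttpHeaders.RETRY_AFTER, "5") // seconds; illustrative value
                .body(ProblemDetail.forStatusAndDetail(
                        HttpStatus.SERVICE_UNAVAILABLE, "Database operation timed out"));
    }

    // A call to a downstream dependency did not complete in time.
    @ExceptionHandler({WebClientRequestException.class, TimeoutException.class})
    public ResponseEntity<ProblemDetail> onDownstreamTimeout(Exception ex) {
        return ResponseEntity.status(HttpStatus.GATEWAY_TIMEOUT)
                .body(ProblemDetail.forStatusAndDetail(
                        HttpStatus.GATEWAY_TIMEOUT, "Downstream dependency timed out"));
    }
}
```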
Feign Client Timeout Propagation in Microservice Chains
Feign clients in a microservice architecture introduce a layered timeout problem. Each Feign client has its own timeout configuration, which interacts with — and can be overridden by — the underlying HTTP client (OkHttp or Apache HttpClient). Additionally, the calling service's own response timeout budget must be considered holistically across the chain.
Feign's timeout configuration has two levels: the Feign-level default (set via Request.Options) and the HTTP client-level timeout (set on the underlying OkHttpClient or HttpClient bean). The Feign-level default takes precedence for most scenarios, but some HTTP client implementations may enforce their own limits more strictly. The safest approach is to configure both consistently.
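In a nested microservice chain (A → B → C → D), each caller waits on the hop below it, so the binding limit at every level is that caller's own client timeout. If service B's Feign client has a 10-second read timeout calling C, B gives up after 10 seconds no matter how long C waits on D, so A's timeout calling B must exceed 10 seconds plus B's own processing time. If C also allows 10 seconds for D, B may time out just as C is about to return, so timeouts should shrink as you move down the chain. Timeouts add up only for sequential calls made by the same service: a service that calls two dependencies one after another with 10 seconds each needs a caller timeout above 20 seconds. In practice, timeouts should be derived from the end-to-end latency budget with appropriate margins at each hop.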
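One way to turn an end-to-end budget into per-hop client timeouts is to work inward from the outermost deadline and subtract a margin at each hop, so an inner call always gives up before its caller does. The sketch below is only arithmetic; the class name, the fixed margin, and the example figures are assumptions rather than values from any Feign configuration.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: derives decreasing per-hop timeouts from one edge budget.
class TimeoutBudget {

    /**
     * Returns one client timeout per hop, outermost hop first.
     * Each hop gets its caller's budget minus a fixed margin.
     */
    static List<Duration> hopTimeouts(Duration edgeBudget, Duration marginPerHop, int hops) {
        List<Duration> timeouts = new ArrayList<>();
        Duration remaining = edgeBudget;
        for (int hop = 0; hop < hops; hop++) {
            remaining = remaining.minus(marginPerHop);
            if (remaining.isZero() || remaining.isNegative()) {
                throw new IllegalArgumentException("Budget exhausted at hop " + hop);
            }
            timeouts.add(remaining);
        }
        return timeouts;
    }

    public static void main(String[] args) {
        // A must answer within 10 s; each hop reserves 500 ms for its own work.
        // Prints the timeouts for A->B, B->C, C->D in that order.
        System.out.println(hopTimeouts(Duration.ofSeconds(10), Duration.ofMillis(500), 3));
    }
}
```

With a 10-second edge budget and a 500 ms margin, the three hops get 9.5 s, 9 s, and 8.5 s.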
Idempotency is critical when retrying across Feign clients. In Spring Cloud OpenFeign the default Retryer bean is Retryer.NEVER_RETRY, which disables retries and is the safe default for non-idempotent operations (plain Feign without Spring defaults to Retryer.Default, which does retry). For idempotent operations (GET, PUT with same data), configure a Retryer with maxAttempts and period. For POST operations that may create resources, idempotency keys (sent as a request header, stored with Redis SETNX by the receiver) prevent duplicate resource creation on retry after a timeout.
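The receiver side of an idempotency key can be sketched without Redis. Below, ConcurrentHashMap.computeIfAbsent plays the role of SETNX: the first request carrying a key creates the resource, and any retry with the same key gets the original result back. The class, method names, and order-id format are hypothetical, and a real store would also need key expiry.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical in-memory receiver; the map stands in for Redis SETNX.
class IdempotentReceiver {

    // Idempotency key -> id of the resource created for that key.
    private final Map<String, String> processed = new ConcurrentHashMap<>();
    private final AtomicInteger created = new AtomicInteger();

    /** Creates an order once per key; retries with the same key return the same id. */
    String createOrder(String idempotencyKey, String payload) {
        // computeIfAbsent runs the creation function at most once per key.
        // The payload is unused here; a real handler would persist it.
        return processed.computeIfAbsent(
                idempotencyKey,
                key -> "order-" + created.incrementAndGet());
    }

    int createdCount() {
        return created.get();
    }
}
```

A client that timed out on the first POST can resend with the same key and receive the id of the order that was already created, rather than a duplicate.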
Spring Cloud LoadBalancer integrates with Feign and adds retry logic at the load-balancer layer. If a Feign call fails with an IOException (including timeout), Spring Cloud can retry on a different instance. Configure this carefully — retrying a write operation on timeout to a different instance is only safe with idempotency key support on the target service.
Spring MVC Request Timeout: The Async Request You Didn't Configure
You can set a timeout on Spring MVC requests without touching RestTemplate or WebClient. The spring.mvc.async.request-timeout property controls how long the container waits for a deferred result before sending a 503 back to the client. This is not a database timeout or an HTTP client timeout; it is the timeout for the entire request processing pipeline when you use Callable or DeferredResult.
Why does this matter? If your controller returns a Callable, Spring hands off execution to a separate thread pool. If that thread hangs on a slow database query, the client sits waiting. Without this property, the servlet container's own default async timeout applies, which varies by container and version, and you lose control over how long clients wait. Setting spring.mvc.async.request-timeout=5000 tells Spring to raise an AsyncRequestTimeoutException after 5 seconds. Catch it with @ExceptionHandler to return a 503 with a meaningful body.
Common trap: developers set timeouts on the HTTP client but forget the async timeout. The client times out and retries, but the server thread is still tied up with the original request. Always set the async request timeout when using deferred results.
RestClient Timeout: The Modern Non-Reactive Client
Spring Boot 3.2 introduced RestClient as the synchronous successor to RestTemplate. It offers the same fluent API style as WebClient, but blocking. With RestClient, you configure timeouts through the underlying ClientHttpRequestFactory, the same approach as RestTemplate with a cleaner builder pattern.
Why use RestClient over RestTemplate? RestClient is where new development happens; RestTemplate is in maintenance mode, though it is not deprecated in Spring Boot 3.2. RestClient accepts the same request factories: SimpleClientHttpRequestFactory covers connectTimeout and readTimeout, while HttpComponentsClientHttpRequestFactory is the one that adds connectionRequestTimeout for its connection pool.
The critical mistake: setting connect timeout to 30 seconds when you mean read timeout. Connect timeout covers the TCP handshake and should be 1–3 seconds. Read timeout covers waiting for response data and should match your SLA. Example: an external payment API guarantees a 2-second response, so set the read timeout to 3 seconds. Any longer and you are masking upstream failures. Also, always set connectionRequestTimeout when using connection pools. Without it, a thread can stall waiting for a pooled connection if the pool is exhausted.
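A minimal RestClient bean along these lines might look as follows. The class name, bean name, and base URL are placeholders; the 3-second read timeout mirrors the payment API example in the text, and the 2-second connect timeout is an assumed value within the 1–3 second range mentioned.

```java
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.http.client.SimpleClientHttpRequestFactory;
import org.springframework.web.client.RestClient;

// Hypothetical configuration class; names, URL, and values are assumptions.
@Configuration
class RestClientTimeoutConfig {

    @Bean
    RestClient paymentRestClient() {
        SimpleClientHttpRequestFactory factory = new SimpleClientHttpRequestFactory();
        factory.setConnectTimeout(2_000); // TCP handshake budget, in milliseconds
        factory.setReadTimeout(3_000);    // response wait budget, in milliseconds

        return RestClient.builder()
                .baseUrl("https://payments.example.com") // placeholder host
                .requestFactory(factory)
                .build();
    }
}
```

SimpleClientHttpRequestFactory does not pool connections, so there is no connectionRequestTimeout to set here; that setting applies once the factory is swapped for HttpComponentsClientHttpRequestFactory.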
The Black Friday Cascade: How a Missing WebClient Timeout Took Down Checkout
Added .responseTimeout(Duration.ofSeconds(5)) to the WebClient builder. Added Resilience4j CircuitBreaker wrapping the payment client with a 60% failure rate threshold. Added fallback to a queued async payment flow. Redeployed in 11 minutes; recovery in under 2 minutes after deployment.
- WebClient is non-blocking but Netty event-loop threads are still finite.
- A slow upstream with no timeout exhausts the reactive thread pool just as a blocking RestTemplate exhausts the servlet thread pool.
- Every WebClient instance that calls an external service must have responseTimeout set.
- Add circuit breakers so a single slow dependency cannot monopolize the thread pool.
Take a thread dump (jstack <pid>) and look for threads blocked on socket read. Identify which upstream host is involved, then add connectTimeout and readTimeout (RestTemplate) or responseTimeout (WebClient) for that client. Start with a 5 s read timeout and tune based on the p99 latency of the upstream.
Run SHOW PROCESSLIST (MySQL) or SELECT * FROM pg_stat_activity (Postgres) to find long-running queries. Add transaction timeouts and optimize slow queries; short-term mitigation is to increase pool size cautiously.
Have long-running tasks check Thread.currentThread().isInterrupted() periodically. For blocking JDBC calls, set @Transactional(timeout=N) so the database itself cancels the query. Use ThreadPoolTaskExecutor with a bounded queue so runaway background tasks don't exhaust executor threads.
Check the http.server.requests histogram, the executor.active gauge, and hikaricp.connections.active. The timeout may be firing due to queue wait time, not actual I/O latency. Increase corePoolSize or reduce the work done before the downstream call.
jstack $(pgrep -f 'java.*myapp') | grep -A 20 'BLOCKED\|socket read'
curl -s http://localhost:8080/actuator/metrics/jvm.threads.states | jq '.measurements'
Key takeaways
@Transactional(timeout=N) is applied through Statement.setQueryTimeout(), which cancels the query server-side for efficient resource release
Common mistakes to avoid
Using new RestTemplate() without a factory
Setting @Transactional(timeout=5000) thinking the unit is milliseconds
Configuring responseTimeout on WebClient but not CONNECT_TIMEOUT_MILLIS
Set both .option(ChannelOption.CONNECT_TIMEOUT_MILLIS, 3_000) and .responseTimeout(Duration.ofSeconds(5)) on the HttpClient builder
Returning 500 for all timeout scenarios
Retrying non-idempotent operations after a timeout without idempotency keys
Setting Feign retry to Retryer.Default without considering non-idempotent calls
Assuming Resilience4j TimeLimiter stops the underlying thread
Interview Questions on This Topic
What is the difference between connectTimeout and readTimeout in RestTemplate?
Frequently Asked Questions