Intermediate 8 min · May 23, 2026

Spring Cloud Gateway: The Complete Production Guide

Q: Can Spring Cloud Gateway replace an Nginx reverse proxy?

For microservice routing within a Spring Cloud architecture, yes. Gateway handles routing, load balancing, auth, rate limiting, and circuit breaking with better Spring integration than Nginx. For static file serving, SSL termination at the infrastructure edge, or very high-throughput scenarios, Nginx (or a cloud load balancer) at the outer edge combined with Gateway for internal routing is a common pattern.

Q: Is Spring Cloud Gateway compatible with Spring Boot 3.x?

Yes. Spring Cloud Gateway 4.x (part of Spring Cloud 2022.0+ / Kilburn) supports Spring Boot 3.x with Java 17+. It includes native AOT compilation support for GraalVM native image builds. Use the Spring Cloud BOM matching your Spring Boot version to ensure compatible dependency versions.

Q: How do I add a new route without restarting the Gateway?

Use the Gateway Actuator API: POST to /actuator/gateway/routes/{id} with a RouteDefinition JSON body, then apply the change with a POST to /actuator/gateway/refresh. Routes created this way live in the default in-memory repository and are lost on restart; for persistent dynamic routes, implement a custom RouteDefinitionRepository backed by Redis or a database. Note: the actuator endpoint should be secured in production.

Q: How does Gateway handle WebSocket connections?

Spring Cloud Gateway supports WebSocket proxying out of the box. Define routes with ws:// or wss:// URI schemes (or lb:// which works for both HTTP and WebSocket). The connection upgrade handshake is handled automatically. Ensure your WebSocket route predicate matches the upgrade request path, and configure appropriate timeout values since WebSocket connections are long-lived.

Q: What is the performance difference between Spring Cloud Gateway and Zuul?

Spring Cloud Gateway (built on Reactor Netty, fully non-blocking) typically achieves 3-5x higher throughput than Zuul 1.x (Servlet-based, one blocking thread per request) on the same hardware. For CPU-intensive filter logic, the difference narrows. Exact figures depend on payload size, filter chain, and downstream latency, so benchmark with your own workload rather than relying on published per-pod numbers. Zuul 1.x is not recommended for new projects.

Q: How do I test Spring Cloud Gateway routes in unit tests?

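Use @SpringBootTest(webEnvironment = RANDOM_PORT) with a WebTestClient, and mock the downstream services using WireMock or MockWebServer. A full application context is needed because route definitions, predicates, and global filters are not loaded by the @WebFluxTest slice. Spring Boot can inject a WebTestClient bound to the running gateway (see @AutoConfigureWebTestClient); to build one manually, use WebTestClient.bindToServer().baseUrl("http://localhost:" + port).build(). Testcontainers is useful when routes depend on real infrastructure such as Redis for rate limiting.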

Master Spring Cloud Gateway: RouteLocator, predicates, filters, rate limiting with Redis, JWT auth filters, and lb:// load balancing for production microservices.

Naren Founder & Principal Engineer

20+ years shipping production Java in banking & fintech. Everything here is grounded in real deployments.

✓ Production tested · last updated July 04, 2026

Before you start⏱ 25 min

✓Solid grasp of fundamentals
✓Comfortable reading code examples
✓Basic production concepts


⚡Quick Answer

Define routes via RouteLocator bean or YAML with predicates (Path, Header, Method) and filters (AddRequestHeader, RewritePath)
Rate limit with RequestRateLimiter filter backed by Redis using KeyResolver beans
JWT authentication via a GlobalFilter that validates tokens before routing
Use lb://service-name URI scheme for automatic Eureka/Consul load balancing
Circuit breaker integration via CircuitBreaker filter with fallback URIs for resilience

✦ Definition~90s read

What is API Gateway with Spring Cloud Gateway?

Spring Cloud Gateway is a reactive API gateway built on Spring WebFlux, providing routing, filtering, and cross-cutting concern management for microservice architectures. It replaces Netflix Zuul as the Spring Cloud recommended gateway solution, offering better performance through its non-blocking Netty-based runtime.


The core processing model consists of three components: Route (destination + predicates + filters), Predicate (java.util.function.Predicate<ServerWebExchange> that determines if a route matches), and Filter (GatewayFilter that modifies request/response in pre- and post-processing phases). When a request arrives, Gateway evaluates all routes in order, applies the first matching route's filters, proxies the request, and runs post-filters on the response.

Gateway integrates with Spring Cloud LoadBalancer for service-discovery-based routing via the lb:// URI scheme, with Resilience4j for circuit breaking, with Redis for rate limiting, and with Micrometer for metrics and distributed tracing. It supports both Java-based DSL configuration (RouteLocatorBuilder) and YAML configuration, with the Java DSL being more expressive for complex routing logic.

Plain-English First

Spring Cloud Gateway is the front door of your microservices house. Every request from the outside world passes through this door, which checks credentials, rate-limits abusive visitors, rewrites messy URLs, and then directs each visitor to the right room inside the house. The rooms (microservices) never talk directly to the outside world.

Before API gateways became standard, microservice teams faced a brutal choice: either expose every service directly to the internet (an operational and security nightmare) or build a bespoke routing layer in Nginx that nobody wanted to maintain. Spring Cloud Gateway was created to solve this by providing a programmable, reactive routing engine that integrates natively with the Spring ecosystem.

The production pain point that drives teams to Spring Cloud Gateway is the proliferation of cross-cutting concerns. Authentication, rate limiting, request tracing, CORS, circuit breaking, and response caching all need to happen before a request reaches its target service. Without a gateway, every service implements these independently — and inconsistently. A single missed Authorization header check in one service becomes a security incident.

Spring Cloud Gateway is built on Spring WebFlux and Project Reactor, making it fully non-blocking and reactive. This architectural choice means a single gateway instance can handle tens of thousands of concurrent connections without the thread-per-request overhead of a Servlet-based gateway. It typically handles roughly 3-5x more requests per second than Zuul 1.x on the same hardware, though the ratio depends on payload size and filter logic.

The routing model is predicate-based: each route has a set of predicates that must match the incoming request (path pattern, header values, HTTP method, query parameters, time of day, cookie values), and a set of filters that transform the request before forwarding and transform the response before returning. This declarative model makes complex routing logic readable and testable.

Rate limiting is one of the most critical production concerns. Without it, a single misbehaving client or a DDoS attack can exhaust backend service capacity. Gateway's Redis-backed RequestRateLimiter implements the token bucket algorithm with per-key limits, where the key can be the authenticated user ID, API key, or client IP. This is far superior to per-service rate limiting because it enforces limits at the edge.

This guide walks through every major Gateway capability with production-grade configuration, real incident analysis, and the exact patterns used in high-traffic Spring Boot microservice deployments.

RouteLocator: Java DSL vs YAML Configuration

Spring Cloud Gateway supports two configuration styles: Java DSL via RouteLocatorBuilder and YAML/properties configuration. Both produce identical runtime behavior, but the Java DSL is more expressive for complex routing logic and provides compile-time type checking. YAML is more readable for simple routing tables and better for environments where configuration is managed separately from code.

The Java DSL uses a fluent builder API where you define each route with an ID, URI, predicates, and filters. The RouteLocatorBuilder.routes() method chains route definitions, and the resulting RouteLocator bean supplements YAML-defined routes rather than replacing them: both sources are merged into a single route list at startup.

A critical detail often missed: route order matters. Gateway sorts routes by their order value (0 by default), routes with equal order keep their definition order, and the first route whose predicates match is selected. Put more specific routes before more general ones. In YAML, ties are decided by position in the list. In Java DSL, they are decided by definition order within the routes() builder.
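To make the first-match rule concrete, here is a minimal plain-Java sketch. It is a toy model, not the Gateway's RouteLocator: the FirstMatchRouter class, the route IDs, and the lb:// service names are all hypothetical, and predicates are reduced to simple path-prefix checks.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

/** Toy model of first-match routing; not the Gateway's real RouteLocator. */
final class FirstMatchRouter {

    /** Hypothetical stand-in for a route: an ID, a path predicate, and a target URI. */
    static final class Route {
        final String id;
        final Predicate<String> pathPredicate;
        final String uri;

        Route(String id, Predicate<String> pathPredicate, String uri) {
            this.id = id;
            this.pathPredicate = pathPredicate;
            this.uri = uri;
        }
    }

    private final List<Route> routes = new ArrayList<>();

    FirstMatchRouter add(String id, Predicate<String> pathPredicate, String uri) {
        routes.add(new Route(id, pathPredicate, uri));
        return this;
    }

    /** Returns the ID of the first route whose predicate matches, or null if none does. */
    String match(String path) {
        for (Route route : routes) {
            if (route.pathPredicate.test(path)) {
                return route.id;
            }
        }
        return null;
    }

    /** Most specific route first, catch-all last. */
    static FirstMatchRouter correctOrder() {
        return new FirstMatchRouter()
            .add("orders-export", p -> p.startsWith("/api/orders/export"), "lb://export-service")
            .add("orders", p -> p.startsWith("/api/orders/"), "lb://order-service")
            .add("catch-all", p -> true, "lb://legacy-service");
    }

    /** Same routes with the catch-all first: it shadows every route after it. */
    static FirstMatchRouter wrongOrder() {
        return new FirstMatchRouter()
            .add("catch-all", p -> true, "lb://legacy-service")
            .add("orders-export", p -> p.startsWith("/api/orders/export"), "lb://export-service")
            .add("orders", p -> p.startsWith("/api/orders/"), "lb://order-service");
    }
}
```

With correctOrder(), a request for /api/orders/export/2024 reaches the export route; with wrongOrder(), every request lands on the catch-all and the other two routes are dead configuration.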

For dynamic routing that changes at runtime without redeployment, implement a custom RouteDefinitionRepository backed by a database or Redis. This enables admin APIs that add or remove routes without restarting the Gateway. The InMemoryRouteDefinitionRepository (the default) supports this via the Gateway Actuator endpoints (POST /actuator/gateway/routes/{id}, DELETE /actuator/gateway/routes/{id}), but its contents are lost when the Gateway restarts.

Route ID Is Required for Actuator Management

Always set explicit route IDs in your configuration. The Gateway Actuator endpoints (GET /actuator/gateway/routes/{id}, DELETE /actuator/gateway/routes/{id}) require the route ID. Auto-generated IDs are UUIDs that change between restarts, making runtime management impossible.

Production Insight

Define routes in order from most specific to least specific; Gateway takes the first matching route and never evaluates subsequent ones, so a catch-all path must be last.

Key Takeaway

Use Java DSL for complex routing logic with compile-time safety; YAML for simple routing tables. Always set explicit route IDs and define routes from most specific to least specific.

thecodeforge.io

Spring Cloud Api Gateway

Rate Limiting with Redis RequestRateLimiter

Rate limiting at the API gateway level is the most effective defense against API abuse, DDoS attacks, and accidental client bugs that cause thundering herds. Spring Cloud Gateway's RequestRateLimiter filter implements the token bucket algorithm backed by Redis Lua scripts, providing accurate, distributed rate limiting that works correctly across multiple gateway instances.

The token bucket algorithm maintains a bucket with a maximum capacity (burst-capacity) that refills at a fixed rate (replenish-rate tokens per second). Each request consumes one or more tokens. If the bucket is empty, the request is rejected with 429 Too Many Requests. This allows short bursts of traffic (up to burst-capacity) while enforcing a long-term average rate (replenish-rate).
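The refill-and-consume arithmetic described above can be sketched in a few lines of plain Java. This is a single-JVM illustration under stated assumptions, not the Gateway's implementation (which runs equivalent logic in a Redis Lua script so that all instances share state); the TokenBucket class and its method names are hypothetical, and time is passed in explicitly to keep the behavior deterministic.

```java
/** Single-JVM token bucket sketch; class and method names are illustrative only. */
final class TokenBucket {
    private final long replenishRate;  // tokens added per second
    private final long burstCapacity;  // maximum tokens the bucket can hold
    private long tokens;
    private long lastRefillSeconds;

    TokenBucket(long replenishRate, long burstCapacity, long nowSeconds) {
        this.replenishRate = replenishRate;
        this.burstCapacity = burstCapacity;
        this.tokens = burstCapacity;   // start full so an initial burst is allowed
        this.lastRefillSeconds = nowSeconds;
    }

    /** Returns true if the request may proceed; false corresponds to HTTP 429. */
    synchronized boolean tryConsume(long requestedTokens, long nowSeconds) {
        // Refill for the elapsed time, never exceeding the burst capacity.
        long elapsed = Math.max(0, nowSeconds - lastRefillSeconds);
        tokens = Math.min(burstCapacity, tokens + elapsed * replenishRate);
        lastRefillSeconds = Math.max(lastRefillSeconds, nowSeconds);

        if (tokens < requestedTokens) {
            return false;
        }
        tokens -= requestedTokens;
        return true;
    }

    synchronized long remaining() {
        return tokens;
    }
}
```

With a replenish rate of 2 and a burst capacity of 5, a client can send five requests at once, is then rejected, and regains two requests for every second it waits.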

The key resolver is the most important configuration decision. It determines the granularity of rate limiting. Common strategies: by authenticated user ID (prevents power users from starving others), by API key (for quota-based monetization), by client IP (for unauthenticated endpoints), or by request path (to protect expensive endpoints). A single resolver can also combine several of these into one composite key.
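One way to picture the key strategies is as a fallback chain: user ID if known, otherwise API key, otherwise client IP. The sketch below models that decision in plain Java. It is not the Gateway's KeyResolver interface; the RateLimitKeys class and the header names x-user-id and x-api-key are assumptions for illustration, and the user ID header is assumed to have been set by a trusted authentication step, not by the client.

```java
import java.util.Map;

/** Illustrative key-selection logic; header names are assumptions, keys are lowercase. */
final class RateLimitKeys {

    /** Picks the most specific identity available, prefixed so key spaces never collide. */
    static String resolve(Map<String, String> headers, String remoteAddress) {
        String userId = headers.get("x-user-id");
        if (userId != null && !userId.isEmpty()) {
            return "user:" + userId;
        }
        String apiKey = headers.get("x-api-key");
        if (apiKey != null && !apiKey.isEmpty()) {
            return "apikey:" + apiKey;
        }
        // Last resort for anonymous traffic; behind a proxy this may be the proxy's address.
        return "ip:" + remoteAddress;
    }
}
```

The prefixes matter: without them, a user ID that happens to equal an API key string would share a bucket with it.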

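Redis connectivity is critical, but the failure mode is easy to misread. When the Redis call itself fails (for example, Redis is unreachable), RedisRateLimiter logs the error and lets the request through, so an outage silently disables rate limiting instead of rejecting traffic. The deny-empty-key setting (true by default) covers a different case: when the KeyResolver produces no key, the request is rejected with the status from empty-key-status-code (403 by default). In production, use Redis Sentinel or Redis Cluster for HA, configure appropriate connection pool settings, and alert on Redis errors so you know when rate limiting is degraded. Set deny-empty-key=false only if requests without a resolvable key should pass unthrottled.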

The rate limiter headers in the response (X-RateLimit-Remaining, X-RateLimit-Replenish-Rate, X-RateLimit-Burst-Capacity) are valuable for clients and should be preserved. They allow clients to implement backoff before hitting the limit rather than polling until they get a 429.

Redis Is a Hard Dependency for Rate Limiting

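A Redis outage does not return errors to clients: the limiter fails open and requests pass without limits, so backends lose their protection exactly when they may need it. Alert on Redis connection errors from the gateway and never use the rate limiter without Redis HA. Separately, with deny-empty-key=true (the default), any request for which the KeyResolver returns no key is rejected with the status configured in spring.cloud.gateway.filter.request-rate-limiter.empty-key-status-code; set deny-empty-key=false only if unkeyed requests should bypass limiting.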

Production Insight

Use separate Redis instances for rate limiting and caching; rate limiting Lua scripts are write-heavy and should not compete with cache read workloads.

Key Takeaway

Redis-backed token bucket rate limiting works correctly across multiple gateway instances; the limiter fails open on Redis outages, so alert on Redis errors, and treat deny-empty-key as a separate decision about requests with no resolvable key.

JWT Authentication Global Filter

Authentication at the gateway level enforces a single, consistent security boundary across all microservices. A Global Filter that validates JWT tokens runs before any route filter, ensuring unauthenticated requests never reach downstream services regardless of which route is matched.

The Global Filter implements GlobalFilter and Ordered. The order value determines priority — lower numbers run first. Authentication should run at a very low order number (high priority) so it runs before any other filter. The filter receives a ServerWebExchange (containing the request and response) and a GatewayFilterChain, and it either calls chain.filter(exchange) to proceed or completes the exchange with a 401/403 response.

After validating the JWT, the filter should extract user claims and forward them to downstream services as request headers. This allows downstream services to trust the user identity without performing their own JWT validation. Common headers include X-User-ID, X-User-Roles, X-User-Email. Use a prefix like X-Auth- to distinguish gateway-injected headers from client-provided ones, and strip any X-Auth- headers from incoming requests before validation to prevent header spoofing.

Whitelist public endpoints (health checks, Swagger UI, auth endpoints themselves) by path pattern. Use a configurable list of patterns stored in configuration, not hardcoded in the filter class, so new public endpoints can be added without code changes. AntPathMatcher works for pattern matching in WebFlux contexts.
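As a rough illustration of a configurable whitelist, the sketch below checks a request path against a list of patterns supplied from outside the class. It deliberately supports only exact paths and a trailing /** wildcard, which is a small subset of what AntPathMatcher handles; the PublicPaths class and the example patterns are hypothetical.

```java
import java.util.List;

/** Minimal whitelist matcher: exact paths plus patterns ending in "/**". */
final class PublicPaths {
    private final List<String> patterns;

    /** Patterns would typically come from configuration rather than being hardcoded. */
    PublicPaths(List<String> patterns) {
        this.patterns = patterns;
    }

    boolean isPublic(String path) {
        for (String pattern : patterns) {
            if (pattern.endsWith("/**")) {
                String prefix = pattern.substring(0, pattern.length() - 3);
                // Match the prefix itself or anything below it, but not "/authx" for "/auth/**".
                if (path.equals(prefix) || path.startsWith(prefix + "/")) {
                    return true;
                }
            } else if (pattern.equals(path)) {
                return true;
            }
        }
        return false;
    }
}
```

The prefix check appends a slash before comparing so that a pattern such as /auth/** cannot be satisfied by an unrelated path like /authx/login.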

Always Strip Injected Headers from Client Requests

If your filter injects X-User-ID headers for downstream services, you must strip those same headers from the incoming client request before validation. A malicious client can forge X-User-ID: admin-user and bypass authorization if you don't strip first. The mutate().headers(h -> h.remove(...)) call must happen before you add the validated values.
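The strip-then-inject order can be sketched with plain maps standing in for HTTP headers. This is a simplified model, not WebFlux code: the AuthHeaderSanitizer class is hypothetical, headers are reduced to single string values, and the X-Auth- prefix follows the convention suggested earlier in this section.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

/** Models header sanitizing with maps; a real filter would mutate the ServerHttpRequest. */
final class AuthHeaderSanitizer {
    private static final String RESERVED_PREFIX = "x-auth-";

    /**
     * Step 1: drop every client-supplied header in the reserved namespace.
     * Step 2: add headers derived from the validated token claims.
     */
    static Map<String, String> sanitize(Map<String, String> incoming,
                                        Map<String, String> validatedClaims) {
        Map<String, String> outgoing = new LinkedHashMap<>();
        for (Map.Entry<String, String> header : incoming.entrySet()) {
            // Header names are case-insensitive, so compare in lowercase.
            String name = header.getKey().toLowerCase(Locale.ROOT);
            if (!name.startsWith(RESERVED_PREFIX)) {
                outgoing.put(header.getKey(), header.getValue());
            }
        }
        for (Map.Entry<String, String> claim : validatedClaims.entrySet()) {
            outgoing.put("X-Auth-" + claim.getKey(), claim.getValue());
        }
        return outgoing;
    }
}
```

A forged x-AUTH-user-id: admin-user header from the client is discarded in step 1, so the only X-Auth- values a downstream service ever sees are the ones written in step 2.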

Production Insight

Run JWT cryptographic operations on Schedulers.boundedElastic() to avoid blocking the Netty event loop, which causes latency spikes under load.

Key Takeaway

JWT auth Global Filters must strip spoofable headers first, run at low order numbers for highest priority, and offload crypto to a bounded elastic scheduler.


Circuit Breaker Filter and Resilience Patterns

The CircuitBreaker filter integrates Resilience4j circuit breaker logic at the gateway level. When a downstream service begins failing or responding slowly, the circuit opens and requests are immediately routed to a fallback URI, preventing slow responses from piling up and exhausting the gateway's connections and memory.

The fallback URI can be a local gateway endpoint (forward:/fallback/orders) that returns a cached response, a default error response, or even a redirect to a maintenance page. For read-heavy endpoints, the fallback can serve stale cached data from Redis, giving users a degraded but functional experience instead of an error.

Timeout configuration is a separate concern from circuit breaking but works alongside it. The gateway's HttpClient timeout (spring.cloud.gateway.httpclient.response-timeout) applies to all routes. Per-route timeouts override this through route metadata (the response-timeout and connect-timeout keys). Set timeouts aggressively: a 30-second timeout means a slow downstream can hold gateway connections for 30 seconds per request, quickly exhausting the connection pool.

Resilience4j's sliding window configuration deserves careful tuning. COUNT_BASED uses the last N calls; TIME_BASED uses calls in the last N seconds. For low-traffic services, COUNT_BASED is more responsive because TIME_BASED windows may not have enough samples to make accurate decisions. The failure rate threshold (default 50%) means half your traffic must fail before the circuit opens — in production, lower this to 30-40% for critical services.
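To show how a COUNT_BASED window turns recent outcomes into an open/closed decision, here is a small ring-buffer sketch. It illustrates the idea only and is not Resilience4j's implementation; the CountBasedWindow class and its parameters are hypothetical, though the rule it applies (open when the failure rate is at or above the threshold, once a minimum number of calls has been recorded) mirrors the behavior described above.

```java
/** Illustrative count-based sliding window; not Resilience4j's actual data structure. */
final class CountBasedWindow {
    private final boolean[] failed;           // ring buffer of the last N outcomes
    private final int minimumCalls;
    private final double failureRateThreshold; // percent, e.g. 50.0
    private int recorded;
    private int next;
    private int failureCount;

    CountBasedWindow(int windowSize, int minimumCalls, double failureRateThreshold) {
        this.failed = new boolean[windowSize];
        this.minimumCalls = minimumCalls;
        this.failureRateThreshold = failureRateThreshold;
    }

    void record(boolean callFailed) {
        if (recorded == failed.length) {
            // Window is full: the slot being overwritten holds the oldest outcome.
            if (failed[next]) {
                failureCount--;
            }
        } else {
            recorded++;
        }
        failed[next] = callFailed;
        if (callFailed) {
            failureCount++;
        }
        next = (next + 1) % failed.length;
    }

    double failureRate() {
        return recorded == 0 ? 0.0 : 100.0 * failureCount / recorded;
    }

    /** The circuit should open only once enough calls have been observed. */
    boolean shouldOpen() {
        return recorded >= minimumCalls && failureRate() >= failureRateThreshold;
    }
}
```

With a window of 10 and a 50% threshold, five failures out of ten calls open the circuit; lowering the threshold to 30% makes three failures enough.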

Never Use Retry Filter on Non-Idempotent Methods

The Retry filter should only be configured for GET and HEAD requests. Retrying POST, PUT, or DELETE requests on network failure risks duplicate operations — a payment POST retried 3 times could result in 3 charges. Always explicitly specify methods: GET in Retry filter configuration.
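The rule can be expressed as a tiny guard around any retry loop. The sketch below is a blocking, plain-Java illustration of the policy, not the Gateway's reactive Retry filter; the IdempotentRetry class is hypothetical and treats only GET and HEAD as retryable.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;
import java.util.concurrent.Callable;

/** Retries an action only when the HTTP method is safe to repeat. */
final class IdempotentRetry {
    private static final Set<String> RETRYABLE_METHODS =
        new HashSet<>(Arrays.asList("GET", "HEAD"));

    static <T> T call(String httpMethod, int maxAttempts, Callable<T> action) throws Exception {
        boolean retryable = RETRYABLE_METHODS.contains(httpMethod.toUpperCase(Locale.ROOT));
        // Non-idempotent methods get exactly one attempt, whatever maxAttempts says.
        int attempts = retryable ? Math.max(1, maxAttempts) : 1;

        Exception lastFailure = null;
        for (int attempt = 0; attempt < attempts; attempt++) {
            try {
                return action.call();
            } catch (Exception e) {
                lastFailure = e;
            }
        }
        throw lastFailure;
    }
}
```

A POST that fails is attempted once and the failure is surfaced to the caller, who can decide whether a retry is safe (for example, using an idempotency key).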

Production Insight

Set slow-call-duration-threshold to half your P99 SLA — if orders must respond in 2 seconds, mark 1-second gateway responses as slow calls to open the circuit before your SLA is breached.

Key Takeaway

CircuitBreaker + Retry + Timeout in combination provides defense-in-depth; tune failure thresholds for each service's criticality and never retry non-idempotent methods.

Load Balancing with lb:// URI Scheme

The lb:// URI scheme in Spring Cloud Gateway integrates with Spring Cloud LoadBalancer to automatically resolve service names to physical instance addresses. When Gateway sees lb://order-service, it queries the service registry (Eureka, Consul, or Kubernetes) for available instances, applies the configured load balancing strategy, and forwards the request to the selected instance.

The default load balancer is RoundRobinLoadBalancer. For sticky sessions (sending requests from the same client to the same instance), use a custom ServiceInstanceListSupplier. For canary deployments, use a WeightedServiceInstanceListSupplier that routes a percentage of traffic to new instances.

The connection pool to downstream services is managed by Reactor Netty's connection provider. Each unique host:port combination gets its own pool. Key settings under spring.cloud.gateway.httpclient are pool.type (elastic by default, which does not cap connections), pool.max-connections and pool.acquire-timeout (both apply to a fixed pool), and connect-timeout. With a fixed pool, requests wait for a free connection once max-connections is reached and fail with a pool acquire timeout error if none is released in time.

Healthy instance filtering is crucial: without it, the load balancer may route to instances that are running but unhealthy (DOWN in Actuator). Use ServiceInstanceListSupplier.builder().withDiscoveryClient().withHealthChecks().build(context), where context is the ConfigurableApplicationContext passed to the supplier bean method, to filter unhealthy instances before the load balancing algorithm selects one. This requires that your services expose /actuator/health and are registered in the service registry with accurate health status.
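The combination of health filtering and round-robin selection can be sketched as follows. This is a simplified model of the idea, not Spring Cloud LoadBalancer code: the HealthAwareRoundRobin class, the Instance type, and the addresses are hypothetical, and health is a plain boolean rather than the result of an Actuator probe.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/** Filters out unhealthy instances, then rotates through the rest. */
final class HealthAwareRoundRobin {

    /** Hypothetical stand-in for a registered service instance. */
    static final class Instance {
        final String address;
        final boolean healthy;

        Instance(String address, boolean healthy) {
            this.address = address;
            this.healthy = healthy;
        }
    }

    private final AtomicInteger position = new AtomicInteger();

    /** Returns the next healthy instance address, or null if none is available. */
    String choose(List<Instance> instances) {
        List<Instance> healthy = new ArrayList<>();
        for (Instance instance : instances) {
            if (instance.healthy) {
                healthy.add(instance);
            }
        }
        if (healthy.isEmpty()) {
            return null;
        }
        // floorMod keeps the index valid even after the counter overflows.
        int index = Math.floorMod(position.getAndIncrement(), healthy.size());
        return healthy.get(index).address;
    }
}
```

Filtering before selection is the important ordering: if the rotation ran over all instances and health were checked afterwards, an unhealthy instance would still consume its turn and cause a failed or retried request.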

The weight() Predicate Enables Canary Deployments

Routes with the same weight group name compete for traffic based on their weight values. This enables canary deployments without external traffic management tools: deploy v2 with weight 5 and v1 with weight 95, then gradually shift weight as confidence in v2 grows. Update weights via dynamic route configuration without restarting Gateway.
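The 95/5 split works like a weighted draw: each request gets a number in the range of the total weight, and the route whose cumulative range contains it wins. The sketch below shows that selection in plain Java with the draw passed in explicitly so the result is deterministic. It is an illustration of the idea, not the Gateway's Weight predicate; the WeightedChoice class and the route IDs orders-v1 and orders-v2 are hypothetical.

```java
/** Cumulative-weight selection; the caller supplies the random draw. */
final class WeightedChoice {

    /**
     * @param routeIds route IDs in the same weight group
     * @param weights  weight of each route, in the same order
     * @param roll     a value in [0, sum of weights), normally drawn at random per request
     */
    static String pick(String[] routeIds, int[] weights, int roll) {
        int cumulative = 0;
        for (int i = 0; i < routeIds.length; i++) {
            cumulative += weights[i];
            if (roll < cumulative) {
                return routeIds[i];
            }
        }
        throw new IllegalArgumentException("roll must be below the total weight");
    }

    /** Counts how many of all possible rolls select the given route. */
    static int shareOf(String routeId, String[] routeIds, int[] weights) {
        int total = 0;
        for (int weight : weights) {
            total += weight;
        }
        int hits = 0;
        for (int roll = 0; roll < total; roll++) {
            if (routeId.equals(pick(routeIds, weights, roll))) {
                hits++;
            }
        }
        return hits;
    }
}
```

With weights 95 and 5, rolls 0 through 94 select v1 and rolls 95 through 99 select v2; shifting traffic is only a matter of changing the two numbers.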

Production Insight

Monitor the reactor.netty.connection.provider.pending.connections metric (available when connection pool metrics are enabled); sustained values above 0 indicate connection pool exhaustion and predict imminent request failures.

Key Takeaway

The lb:// scheme integrates with service discovery and LoadBalancer; tune the Netty connection pool and combine with health check filtering to avoid routing to unhealthy instances.

Global CORS, Logging, and Observability

CORS configuration at the Gateway eliminates the need for CORS config in every downstream service. Configure it once globally or per-route, and ensure all downstream services remove their CORS configuration to prevent duplicate headers. Duplicate CORS headers (Access-Control-Allow-Origin appearing twice) cause browsers to reject all responses from that origin.

Request logging for debugging and audit trails should be implemented as a Global Filter that logs before and after each proxied request. Include: request ID (generate one if not present), path, method, user ID (from JWT), response status, and duration. Structured JSON logging with these fields makes log aggregation and querying in Elasticsearch or CloudWatch straightforward.

Micrometer integration provides metrics for every route: spring.cloud.gateway.requests with tags including routeId, routeUri, outcome, and status. Export to Prometheus and create dashboards for: P50/P95/P99 latency per route, error rate per route, rate limiter rejection rate, and circuit breaker state changes. These four dashboards cover the most common failure modes without any custom instrumentation.

Distributed tracing with Micrometer Tracing (Spring Boot 3.x) automatically propagates trace IDs through the gateway to downstream services via HTTP headers. Configure a 100% sampling rate for development and 1-5% for production, or use a tail-based sampler that keeps 100% of traces for requests that return 5xx; the decision depends on the response, so a head-based sampler cannot make it.

Expose /actuator/gateway/routes in Development Only

The Gateway Actuator endpoints (routes, filters, globalfilters) expose your routing configuration, which can aid attackers in mapping your internal service architecture. Include the 'gateway' actuator endpoint only in development profiles (via management.endpoints.web.exposure.include), or secure it with Spring Security behind admin role authentication.

Production Insight

Create a Prometheus alert on spring.cloud.gateway.requests{outcome='SERVER_ERROR'} exceeding 1% for 5 minutes; this catches downstream failures before they escalate to user-visible incidents.

Key Takeaway

Configure CORS globally at the gateway (remove it from all downstream services), implement structured request logging with request IDs, and export Prometheus metrics for all routes.

The Hardest Part: Custom Gateway Filters That Don't Crash Your Cluster

You've got rate limiting, circuit breakers, and auth wired up. Now you need to add a header, rewrite a path, or strip a cookie. The instinct is to slap a global filter on everything. Don't. Spring Cloud Gateway filters run on the Netty event loop: one blocking call can stall every route.

The WHY: filters are reactive for a reason. Your custom filter must return a Mono<Void> and never block.

The HOW: for route-specific behavior, implement GatewayFilter (not WebFilter) and put side-effect logic inside Mono.fromRunnable or Mono.defer(() -> ...). If you need to call a database or external service, use a reactive client wrapped in a circuit breaker, not a raw synchronous client.

The production trap is thinking 'this is just a small DB lookup': that small lookup blocks an event loop thread shared by many connections. Write your filters to log, mutate, or redirect. Never to wait.

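CustomHeaderFilter.java (Java)

// io.thecodeforge - java tutorial
import org.springframework.cloud.gateway.filter.GatewayFilter;
import org.springframework.cloud.gateway.filter.GatewayFilterChain;
import org.springframework.core.Ordered;
import org.springframework.http.server.reactive.ServerHttpRequest;
import org.springframework.stereotype.Component;
import org.springframework.web.server.ServerWebExchange;
import reactor.core.publisher.Mono;

// A GatewayFilter is route-scoped: attach it to a route, for example
// with .filters(f -> f.filter(customHeaderFilter)) in the Java DSL.
@Component
public class CustomHeaderFilter implements GatewayFilter, Ordered {

    @Override
    public Mono<Void> filter(ServerWebExchange exchange, GatewayFilterChain chain) {
        // Header mutation is in-memory work, so nothing here blocks the event loop.
        ServerHttpRequest request = exchange.getRequest().mutate()
            .header("X-Origin-Service", "gateway")
            .build();
        ServerWebExchange mutated = exchange.mutate().request(request).build();
        return chain.filter(mutated);
    }

    @Override
    public int getOrder() {
        return -1; // lower values run earlier in the filter chain
    }
}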

Output

Adds the header X-Origin-Service=gateway to every request on routes that attach this filter.

Production Trap:

Never call Thread.sleep(), JDBC, or synchronous HTTP inside a filter. The event loop is one shared thread pool. Block it and every route stalls.

Key Takeaway

Custom filters must be non-blocking. If it can block, it doesn't belong in a filter — move it to a separate async service.

Dynamic Routes Without a Restart (The No-Downtime Move)

Static routes in application.yml work for demos. In production, you need to add, remove, or reroute services without redeploying the gateway. The WHY: your microservices will be renamed, moved to new clusters, or scaled down. You can't redeploy the gateway every time. The HOW: implement a RouteDefinitionRepository backed by Redis or your config server. Spring Cloud Gateway picks up changes via events. Store RouteDefinition objects in a remote store, then trigger a RefreshRoutesEvent. The gateway will re-read routes without downtime. Baeldung's 'Dynamic Routing' section points to this but doesn't show you how to wire the repository. Here's the pattern: create a RedisRouteDefinitionRepository, then expose a REST endpoint that writes to it. Call that endpoint from your deployment pipeline. Your ops team will thank you.

Real-World Pattern:

Store routes in Redis with a TTL and have each live service refresh its entry on a heartbeat. When a service is decommissioned, its route stops being refreshed and auto-expires. No stale routes, no manual cleanup.
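The TTL pattern can be modeled without Redis to see how expiry behaves. The sketch below is an in-memory stand-in under stated assumptions: the ExpiringRouteStore class is hypothetical, a route is reduced to an ID and a URI string, and the current time is passed in explicitly. A real implementation would store RouteDefinition objects in Redis with a key expiry and publish a RefreshRoutesEvent after changes.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** In-memory model of a route store whose entries expire unless re-registered. */
final class ExpiringRouteStore {

    private static final class Entry {
        final String uri;
        final long expiresAtMillis;

        Entry(String uri, long expiresAtMillis) {
            this.uri = uri;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<String, Entry> routes = new ConcurrentHashMap<>();

    /** Registers or refreshes a route; calling it again extends the expiry. */
    void register(String routeId, String uri, long ttlMillis, long nowMillis) {
        routes.put(routeId, new Entry(uri, nowMillis + ttlMillis));
    }

    /** Returns the target URI, or null if the route is unknown or has expired. */
    String lookup(String routeId, long nowMillis) {
        Entry entry = routes.get(routeId);
        if (entry == null) {
            return null;
        }
        if (nowMillis >= entry.expiresAtMillis) {
            routes.remove(routeId, entry); // lazy cleanup of the stale entry
            return null;
        }
        return entry.uri;
    }
}
```

The TTL has to be longer than the refresh interval with some margin; otherwise a single delayed refresh would drop a healthy service's route.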

Key Takeaway

Dynamic routing isn't optional beyond 3 services. Wire a RouteDefinitionRepository to a config server or database. Restarts are for weekends.

Monitoring That Actually Shows You the Bottleneck

Your gateway is handling 10k requests per second. You see 503s but no circuit breaker tripped. You're staring at logs like it's 2015. Don't. The WHY: without observability, you can't tell if the bottleneck is your filter, the downstream service, or the network. The HOW: enable Spring Cloud Gateway metrics via Actuator and Micrometer. The built-in spring.cloud.gateway.requests timer gives per-route counts and latency, and spring.cloud.gateway.routes.count reports the number of active routes; add custom meters such as gateway.requests (total count) and gateway.response (count by status) where you need more. Wire these to Prometheus and set up alerts on 99th percentile latency. The production trap: thinking that a circuit breaker metric is enough. It's not. The circuit breaker only tells you something failed; it doesn't tell you where. Add distributed tracing (for example with Zipkin or Jaeger) and propagate trace headers through every filter you write. Then, when a request hangs, you can follow the breadcrumbs. Competitors skip this because it's not 'code.' But in a real incident, it's the only thing that saves you.

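MonitoringConfig.java (Java)

// io.thecodeforge - java tutorial
import io.micrometer.core.instrument.Counter;
import io.micrometer.core.instrument.MeterRegistry;
import java.util.Optional;
import org.springframework.cloud.gateway.filter.GlobalFilter;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import reactor.core.publisher.Mono;

@Configuration
public class MonitoringConfig {

    @Bean
    public GlobalFilter customMetricsFilter(MeterRegistry registry) {
        // Registered once; incremented for every request that enters the gateway.
        Counter requestCounter = Counter.builder("gateway.requests")
            .description("Total gateway requests")
            .register(registry);

        return (exchange, chain) -> {
            requestCounter.increment();
            // The runnable executes after the filter chain completes without error.
            return chain.filter(exchange).then(Mono.fromRunnable(() -> {
                // getStatusCode() may return null when no status has been set.
                String status = Optional.ofNullable(exchange.getResponse().getStatusCode())
                    .map(code -> String.valueOf(code.value()))
                    .orElse("UNKNOWN");
                registry.counter("gateway.response", "status", status).increment();
            }));
        };
    }
}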

Output

Prometheus metrics exposed at /actuator/prometheus with gateway_requests_total and gateway_response_total.

Production Trap:

The built-in metrics are tagged per route but say nothing about time spent inside individual filters or downstream hops. Add tracing headers and custom counters where you need that detail to find real bottlenecks.

Key Takeaway

Metrics without tracing is guessing. Always combine gateway counters with distributed tracing for root-cause visibility.

● Production incident · Post-mortem · Severity: high

Gateway OOM Crash During Traffic Spike Due to Response Caching Filter Misconfiguration

Symptom

Gateway pods crashed with OutOfMemoryError during a flash sale traffic spike. All API endpoints returned 502 for 4 minutes until Kubernetes restarted the pods. Error rate hit 100%.

Assumption

The team assumed the caching filter would reduce load on backend services by serving cached responses, improving gateway throughput.

Root cause

The ModifyResponseBody filter was configured to buffer response bodies for transformation, so large product catalog responses (up to 2MB) were held fully in memory per request. At 5000 concurrent requests, that is up to 10GB of buffered data, far beyond the 512MB pod limit. Additionally, the filter was applied globally instead of only to the specific route that needed transformation.

Fix

Removed global filter application; applied the ModifyResponseBody filter only to routes that needed response transformation. Added a size limit on in-memory body buffering (for example, spring.codec.max-in-memory-size). Increased pod memory limits from 512MB to 2GB. Implemented streaming response handling for large payloads instead of full buffering.

Key lesson

Global filters in Spring Cloud Gateway apply to every request.
Never apply body-buffering or transformation filters globally — scope them to specific routes.
Always load test the gateway with realistic payload sizes and concurrency levels before traffic spikes.
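The memory arithmetic behind this incident is worth running before any load test. The sketch below uses the figures from the post-mortem (2MB bodies, 5000 concurrent requests, a 512MB pod); the BufferingMemoryEstimate class is hypothetical, and the calculation is a rough upper bound that ignores every other consumer of heap.

```java
/** Back-of-the-envelope estimate for full in-memory body buffering. */
final class BufferingMemoryEstimate {

    /** Worst case: every in-flight request holds one fully buffered body. */
    static long worstCaseMegabytes(long concurrentRequests, long maxBodyMegabytes) {
        return concurrentRequests * maxBodyMegabytes;
    }

    /** How many maximum-size bodies fit in the given memory budget. */
    static long maxConcurrentBodies(long memoryMegabytes, long maxBodyMegabytes) {
        return memoryMegabytes / maxBodyMegabytes;
    }
}
```

Calling worstCaseMegabytes(5000, 2) gives 10000MB, while maxConcurrentBodies(512, 2) gives 256: the pod could hold at most a few hundred large responses at once, well below the concurrency seen during the spike.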

Production debug guide: symptom → root cause → fix (5 entries)

Symptom · 01

Requests return 404 from Gateway even though the downstream service is running

→

Fix

Enable DEBUG logging for org.springframework.cloud.gateway and org.springframework.web.reactive. Check that the route predicate matches your request path exactly: Path predicates are case-sensitive and the pattern syntax differs from classic Spring MVC (it uses Spring's PathPattern matching, not AntPathMatcher). Use the actuator route endpoint (GET /actuator/gateway/routes) to list all configured routes and their predicates. Verify the lb://service-name matches exactly what's registered in your service registry (case-insensitive for Eureka, but check for typos).

Symptom · 02

Rate limiting returning 429 for all requests, even under low load

→

Fix

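Check how keys are resolved first. If every client maps to the same key (for example, a KeyResolver based on remote address behind a load balancer that hides the client IP), all traffic shares one bucket and trips the limit quickly. Add a log statement to the KeyResolver bean to verify the keys. A KeyResolver that returns an empty Mono causes rejection with empty-key-status-code (403 by default, governed by deny-empty-key) rather than a true rate-limit 429. An unreachable Redis is unlikely to be the cause, because the limiter allows requests when the Redis call fails; still, run redis-cli ping from the Gateway pod to rule out partial failures. Finally, check the replenish-rate and burst-capacity values: burst-capacity should be greater than or equal to replenish-rate, and a burst-capacity of 0 blocks every request.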

Symptom · 03

Circuit breaker always in OPEN state, requests always going to fallback

→

Fix

Check the CircuitBreaker filter configuration: the fallbackUri must be a valid local endpoint (forward:/fallback or a full URL). Verify the Resilience4j circuit breaker name in the filter config matches a name in your resilience4j.circuitbreaker.instances configuration. Check the failure rate threshold: the default is 50%, evaluated only after the minimum number of calls has been recorded in the sliding window. Use GET /actuator/circuitbreakers to see current state and metrics. Check if the downstream service is actually failing or if the timeout is too low (the Resilience4j TimeLimiter used by the filter defaults to 1 second).

Symptom · 04

RewritePath filter not rewriting correctly; downstream service receives original path

→

Fix

The RewritePath filter uses Java regex syntax for the pattern and replacement. In YAML, the replacement uses $\{group} syntax (the backslash stops the placeholder from being resolved as a property). In Java config, use rewritePath("/api/v1/(?<segment>.*)", "/${segment}"). Test the regex independently. Also check filter ordering: RewritePath must run before routing occurs (it's a pre-filter). Enable TRACE logging for the Gateway RoutePredicateHandlerMapping to see which route and filters are selected for each request.
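Testing the regex independently takes only a few lines, because the rewrite boils down to a regex replacement with a named group. The sketch below uses String.replaceAll from the JDK; the RewritePathDemo class is hypothetical, and the normalize step mirrors, as an assumption about the filter's behavior, the conversion of the YAML-escaped $\{segment} form into the ${segment} form that Java's regex engine understands.

```java
/** Standalone check of a RewritePath-style pattern and replacement. */
final class RewritePathDemo {

    /** Converts the YAML-escaped form "$\{name}" into the regex form "${name}". */
    static String normalize(String yamlReplacement) {
        return yamlReplacement.replace("$\\", "$");
    }

    /** Applies the rewrite; a path that does not match the regex is returned unchanged. */
    static String rewrite(String path, String regex, String replacement) {
        return path.replaceAll(regex, normalize(replacement));
    }
}
```

For example, rewrite("/api/v1/orders/42", "/api/v1/(?<segment>.*)", "/${segment}") yields /orders/42, and a path outside /api/v1/ passes through untouched, which is a quick way to spot a pattern that never matches.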

Symptom · 05

CORS errors in browser even with CORS filter configured on Gateway

→

Fix

Spring Cloud Gateway CORS config (spring.cloud.gateway.globalcors or per-route cors) does not override CORS config on the downstream service: if both are configured, both add headers and you get duplicate CORS headers that cause browser rejection. Disable CORS on all downstream services and configure it only at the Gateway; where a downstream service cannot be changed, the DedupeResponseHeader filter can remove the duplicates. Also verify the allowedOrigins list includes the exact origin (scheme + domain + port) that the browser sends in the Origin header, because wildcards don't work with credentials.

★ Debug Cheat Sheet: fast diagnosis commands for Spring Cloud Gateway in production

Route not matching incoming request

Immediate action

List all configured routes via Actuator

Commands

curl -s http://gateway:8080/actuator/gateway/routes | python3 -m json.tool

curl -s http://gateway:8080/actuator/gateway/routefilters

Fix now

Enable TRACE logging: logging.level.org.springframework.cloud.gateway=TRACE and check route predicate evaluation in logs


Spring Cloud Gateway vs Nginx vs Kong vs AWS API Gateway

Feature	Spring Cloud Gateway	Nginx	Kong	AWS API Gateway
Spring integration	Native	Plugin/sidecar	Plugin	Lambda integration
Service discovery	Eureka/Consul/K8s native	Manual upstream config	Service discovery plugin	Not built-in
Rate limiting backend	Redis (built-in)	limit_req module (ngx_http_limit_req_module)	Redis (built-in)	AWS WAF + Usage plans
Custom filter language	Java/Kotlin	Lua	Lua/Go/Python	Not supported
Circuit breaking	Resilience4j native	Not built-in	Plugin	Not built-in
Performance	High (reactive Netty)	Very high	High	High (managed)
Operational complexity	Low (Spring Boot app)	Medium	Medium	Low (managed)
Cost	Free (open source)	Free/Nginx Plus	Free/Enterprise	Pay per request

Key takeaways

Spring Cloud Gateway is fully reactive (Reactor Netty); never perform blocking operations on event loop threads, and use Schedulers.boundedElastic() for any synchronous I/O

Global Filters apply to all requests (authentication, logging); GatewayFilters apply per-route (rewrite, circuit breaker, rate limiting). Separate these concerns cleanly

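Redis-backed rate limiting requires Redis HA: the built-in RedisRateLimiter lets requests through when its Redis call fails, so alert separately when rate limiting is bypassed; deny-empty-key only decides whether requests with no resolved key are rejected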

Always strip client-provided versions of your injected auth headers before adding validated values; failing to do so enables header spoofing attacks

Configure CORS only at the Gateway and remove CORS configuration from all downstream services to avoid duplicate headers that browsers reject

Common mistakes to avoid

7 patterns

Applying body-buffering filters (ModifyResponseBodyFilter) globally

Symptom

Gateway OOM errors under high load; large response payloads consume all heap

Fix

Apply body-transformation filters only to specific routes that need them, never globally; add response size limits and monitor heap usage under load

Forgetting to strip injected auth headers from incoming requests

Symptom

Malicious clients can forge X-User-ID or X-User-Roles headers and bypass authorization in downstream services

Fix

Always call request.mutate().headers(h -> h.remove("X-User-ID")) before adding validated values from the JWT; strip first, then add
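The strip-then-add ordering can be sketched without any Gateway types by modelling headers as a case-insensitive map. This is an illustration of the ordering only: in a real filter the same steps would run inside request.mutate().headers(...), and the class TrustedHeaderSanitizer, its sanitize method, and the user values in the example are hypothetical.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative model of "strip first, then add"; headers are a plain map here,
// not Spring's HttpHeaders.
final class TrustedHeaderSanitizer {

    // Headers that only the gateway is allowed to set.
    static final List<String> INJECTED = List.of("X-User-ID", "X-User-Roles");

    /**
     * Removes client-supplied copies of the injected headers, then adds the
     * validated values. Pass userId == null for unauthenticated requests.
     */
    static Map<String, String> sanitize(Map<String, String> incoming, String userId, String roles) {
        // HTTP header names are case-insensitive, so "x-user-id" must be stripped too.
        Map<String, String> out = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        out.putAll(incoming);

        INJECTED.forEach(out::remove);               // 1. strip whatever the client sent

        if (userId != null) {                        // 2. add values taken from the validated JWT
            out.put("X-User-ID", userId);
            out.put("X-User-Roles", roles == null ? "" : roles);
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> forged = Map.of("x-user-id", "admin", "Accept", "application/json");
        System.out.println(sanitize(forged, "u-123", "USER"));
        System.out.println(sanitize(forged, null, null)); // forged header is dropped, nothing added
    }
}
```

Stripping unconditionally matters for routes that skip authentication: if the headers are only overwritten when a token is present, an anonymous request can still carry forged values downstream.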

Configuring Retry filter for POST/PUT/DELETE endpoints

Symptom

Duplicate data creation or double charges during retry storms; idempotency violations

Fix

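Retry only methods whose handlers are idempotent. The Retry filter retries GET only by default, so keep methods: GET unless an endpoint is explicitly designed to be safe to repeat (for example via idempotency keys); never add POST for operations that create data or charge money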

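Relying on deny-empty-key for Redis outages instead of running Redis with HA

Symptom

deny-empty-key only covers requests for which the KeyResolver returns no key (rejected by default, with 403 unless empty-key-status-code is changed); it does not control behaviour when Redis is down. The built-in RedisRateLimiter lets requests through when its Redis call fails, so an outage silently disables rate limiting

Fix

Make the KeyResolver return a key for every request you intend to limit, and set deny-empty-key=false only if unkeyed requests should pass; use Redis Sentinel or Cluster for HA; add separate alerting for rate limiter bypass events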

Configuring CORS on both Gateway and downstream services

Symptom

Duplicate Access-Control-Allow-Origin headers cause browser to reject all responses from the API

Fix

Configure CORS only at the Gateway; remove all CORS configuration from downstream services; validate by checking response headers in browser DevTools

Not setting explicit route IDs

Symptom

Dynamic route management via Actuator is impossible; auto-generated UUID IDs change between restarts

Fix

Always set a stable, descriptive route ID like 'order-service' or 'inventory-read'; this is also used in metrics tags for per-route observability

Blocking operations in Gateway filters on the event loop thread

Symptom

Gateway latency spikes under load; event loop threads blocked cause request queuing

Fix

Offload any blocking operations (database calls, crypto, file I/O) to Schedulers.boundedElastic() using subscribeOn(); keep event loop threads non-blocking

INTERVIEW PREP · PRACTICE MODE

Interview Questions on This Topic

Q01 · SENIOR

What is the difference between a GlobalFilter and a GatewayFilter in Spr...

Q02 · SENIOR

How does the token bucket algorithm work in the Redis RequestRateLimiter...

Q03 · SENIOR

How do you implement canary deployments using Spring Cloud Gateway?

Q04 · SENIOR

Why should you use Schedulers.boundedElastic() for blocking operations i...

Q05 · SENIOR

What happens when the Redis instance used for rate limiting becomes unav...

Q06 · JUNIOR

How does the lb:// URI scheme work in Spring Cloud Gateway?

Q07 · SENIOR

How do you prevent header spoofing in a Gateway JWT authentication filte...

Q08 · SENIOR

What is the correct filter order for authentication, logging, and rate l...

Q01 of 08 · SENIOR

What is the difference between a GlobalFilter and a GatewayFilter in Spring Cloud Gateway?

ANSWER

A GlobalFilter applies to every request that passes through the gateway, regardless of which route is matched. Examples: authentication, request logging, correlation ID injection. A GatewayFilter applies only to the specific routes it's configured on. Examples: RewritePath (only needed for routes where the path needs transformation), CircuitBreaker (different settings per downstream service), RequestRateLimiter (different limits per route). Implement GlobalFilter for cross-cutting concerns and GatewayFilter for route-specific behavior.

FAQ · 6 QUESTIONS

Frequently Asked Questions

Can Spring Cloud Gateway replace an Nginx reverse proxy?

Is Spring Cloud Gateway compatible with Spring Boot 3.x?

How do I add a new route without restarting the Gateway?

How does Gateway handle WebSocket connections?

What is the performance difference between Spring Cloud Gateway and Zuul?

How do I test Spring Cloud Gateway routes in unit tests?

Naren · Founder & Principal Engineer

20+ years shipping production Java in banking & fintech. Everything here is grounded in real deployments.

Last updated: July 04, 2026
