Senior 8 min · March 09, 2026

Eureka UP, Gateway 502 Drops — Spring Boot Microservices

Intermittent 502 from Gateway despite Eureka UP.

Naren Founder & Principal Engineer

20+ years shipping production Java in banking & fintech. Written from production experience, not tutorials.

May 23, 2026
last updated
Quick Answer
  • Core concept: decomposing applications into independently deployable services with Spring Boot (individual services) and Spring Cloud (coordination layer)
  • Service Discovery (Eureka): services register and discover each other by logical name, not hardcoded IPs
  • API Gateway (Spring Cloud Gateway): single entry point for routing, auth, rate limiting
  • Circuit Breakers (Resilience4j): isolate failures to prevent cascading crashes
  • Performance insight: Eureka heartbeats add ~2ms per instance; gateway adds ~5-10ms per request
  • Production insight: without healthchecks on depends_on, your gateway routes to dead containers silently

Plain-English First

Imagine a giant restaurant: in a 'monolith,' one person takes orders, cooks, cleans, and manages the books. If they get sick, the whole place closes. In a 'microservices' model, you have a dedicated host, a head chef, a cleaning crew, and an accountant. They all work in their own spaces and talk to each other through intercoms. Spring Boot builds the individual workers, and Spring Cloud provides the intercoms, maps, and managers that keep them all synchronized.

Microservices with Spring Boot and Spring Cloud is a fundamental concept in modern Java development. Moving away from the 'Monolithic' architecture—where every feature lives in a single code base—to a distributed system allows for independent scaling, faster deployment cycles, and technological flexibility. However, it introduces the 'Distributed System Tax': complexity in networking, security, and data consistency.

In this guide, we'll break down exactly what Microservices with Spring Boot and Spring Cloud is, why it was designed this way, and how to use it correctly in real projects by leveraging Service Discovery, Configuration Management, and API Gateways. We focus on the Spring Cloud 2023.0.x release train (codename Leyton), so the examples match a current stack.

By the end, you'll have both the conceptual understanding and practical code examples to use Microservices with Spring Boot and Spring Cloud with confidence, moving from a single JAR to a resilient, interconnected ecosystem.

What Is Microservices with Spring Boot and Spring Cloud and Why Does It Exist?

Microservices with Spring Boot and Spring Cloud is a core feature-set for building resilient distributed systems. While Spring Boot makes it trivial to create a standalone 'service' (like an Order Service or User Service), Spring Cloud provides the glue. It solves the problem of 'Service Discovery' (how Service A finds Service B without hardcoding IP addresses) and 'Circuit Breaking' (how to stop a failure in one service from cascading and crashing your entire ecosystem).

It exists because managing 50+ independent services manually is impossible; you need an automated infrastructure to handle the overhead. By using declarative tools like OpenFeign, Service A can call Service B just by using its name, while Eureka handles the dynamic IP mapping in the background.

io/thecodeforge/orderservice/OrderServiceApplication.java (Java)
package io.thecodeforge.orderservice;

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.cloud.client.discovery.EnableDiscoveryClient;
import org.springframework.cloud.openfeign.EnableFeignClients;

/**
 * io.thecodeforge Standard: High-Availability Microservice Setup
 * @EnableDiscoveryClient: Registers this service with Eureka/Consul so others can find it.
 * @EnableFeignClients: Enables declarative REST clients to call other services via name.
 */
@SpringBootApplication
@EnableDiscoveryClient
@EnableFeignClients
public class OrderServiceApplication {

    public static void main(String[] args) {
        SpringApplication.run(OrderServiceApplication.class, args);
    }
}

// --- Declarative Client Example (separate file: client/InventoryClient.java) ---
package io.thecodeforge.orderservice.client;

import org.springframework.cloud.openfeign.FeignClient;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;

/**
 * Using Feign for type-safe inter-service communication.
 * Hardcoded URLs are replaced by logical service names.
 */
@FeignClient(name = "inventory-service")
public interface InventoryClient {
    @GetMapping("/api/inventory/{skuCode}")
    boolean isInStock(@PathVariable("skuCode") String skuCode);
}
Output
INFO: Registering application ORDER-SERVICE with eureka with status UP
INFO: Feign Client 'inventory-service' initialized for OrderService.
Key Insight:
The most important thing to understand about Microservices with Spring Boot and Spring Cloud is the problem it was designed to solve. Always ask 'why does this exist?' before asking 'how do I use it?' In this case, it's about managing distribution complexity and ensuring high availability through abstraction.
[Figure: Spring Boot Microservices with Eureka & Gateway. Architecture flow from service discovery to API gateway and resilience: Eureka Service Discovery (register and discover microservices), API Gateway / Spring Cloud Gateway (route requests to services), Circuit Breaker / Resilience4j (prevent cascading failures), Event-Driven Messaging (async communication via Kafka/RabbitMQ), Reactive WebFlux Endpoints (non-blocking I/O for high throughput), Schema & Config Management (external config, schema versioning). Warning in figure: Eureka self-preservation mode can hide service failures; disable in dev with eureka.server.enableSelfPreservation=false.]

Common Mistakes and How to Avoid Them

When learning Microservices with Spring Boot and Spring Cloud, most developers hit the same set of gotchas. A major one is the 'Distributed Monolith,' where services are so tightly coupled that you can't update one without updating them all.

Another is neglecting 'Observability'—if a request fails across four different services, how do you trace it? Using tools like Micrometer Tracing (the successor to Sleuth) and Zipkin is non-negotiable for production debugging. Furthermore, failing to implement Circuit Breakers means that if your Payment Service hangs, your entire Checkout process will also hang until the connection times out.

src/main/resources/application.yml (YAML)
# Centralized Config Example via Spring Cloud Config
spring:
  application:
    name: order-service
  config:
    import: "optional:configserver:http://forge-config-server:8888"
  cloud:
    gateway:
      routes:
        - id: order-route
          uri: lb://order-service
          predicates:
            - Path=/api/order/**

# Production-Grade Resilience4j Circuit Breaker Config
resilience4j.circuitbreaker:
  instances:
    inventoryService:
      registerHealthIndicator: true
      slidingWindowSize: 10
      minimumNumberOfCalls: 5
      permittedNumberOfCallsInHalfOpenState: 3
      waitDurationInOpenState: 10s
      failureRateThreshold: 50
      eventConsumerBufferSize: 10
Output
// Route configured: Requests to /api/order/** reach order-service instances via Load Balancer.
// Resilience4j: Circuit breaker 'inventoryService' is monitoring service health.
Watch Out:
The most common mistake with Microservices with Spring Boot and Spring Cloud is using it when a simpler alternative would work better. Always consider whether the added complexity is justified. Don't build microservices for a 2-person startup unless you expect massive, immediate scale. The operational cost is high.

Configuring Service Discovery with Eureka and Load Balancer

Service Discovery is the backbone of microservices communication. Without it, you're hardcoding IPs and ports, which breaks the moment you scale or redeploy. Netflix Eureka is the battle-tested service registry. Each service registers with its instance ID, hostname, port, and health status. The Spring Cloud LoadBalancer (replacement for Netflix Ribbon) picks a healthy instance using round-robin or custom rules.

Here's the catch: Eureka uses a heartbeat mechanism (30s by default). If a service misses three heartbeats, Eureka removes it. But a service can be 'UP' in Eureka while being completely broken — that's why custom health indicators are critical. Always implement a health endpoint that checks database, message queue, and external API availability.

Another pitfall: self-preservation mode. If Eureka loses too many heartbeats network-wide (e.g., due to a transient network partition), it stops evicting services to protect against false removals. That means your gateway might route to dead instances for minutes. Tune eureka.server.renewalPercentThreshold based on your cluster size.

io/thecodeforge/gateway/GatewayConfig.java (Java)
package io.thecodeforge.gateway;

import org.springframework.cloud.client.loadbalancer.LoadBalanced;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.web.client.RestTemplate;

@Configuration
public class GatewayConfig {

    @Bean
    @LoadBalanced
    public RestTemplate loadBalancedRestTemplate() {
        return new RestTemplate();
    }
}

// In application.yml:
// spring:
//   cloud:
//     gateway:
//       routes:
//         - id: inventory-route
//           uri: lb://inventory-service
//           predicates:
//             - Path=/api/inventory/**
Output
// LoadBalanced RestTemplate resolves service names to actual instances.
// Gateway routes to 'lb://inventory-service' using client-side load balancing.
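To make "client-side load balancing" concrete, here is a minimal round-robin sketch in plain Java. It is an illustration under stated assumptions, not Spring Cloud LoadBalancer's actual code: the class name `RoundRobinChooser` and the instance addresses are hypothetical, and the instance list stands in for whatever the registry currently reports.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical round-robin chooser; not Spring Cloud LoadBalancer's implementation. */
final class RoundRobinChooser {
    private final AtomicInteger position = new AtomicInteger();

    /** Picks the next instance in rotation from the instances the registry reports. */
    String choose(List<String> instances) {
        if (instances.isEmpty()) {
            throw new IllegalStateException("no instances registered for service");
        }
        // floorMod keeps the index non-negative even after the counter overflows.
        int index = Math.floorMod(position.getAndIncrement(), instances.size());
        return instances.get(index);
    }
}
```

Calling `choose` repeatedly with two instances alternates between them. Note that rotation alone knows nothing about health: if the registry still lists a dead instance, it keeps receiving its share of traffic.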
How Eureka Works Under the Hood
  • Registration: Service sends POST /eureka/apps/APP with instance metadata (host, port, status).
  • Renewal: Service sends PUT /eureka/apps/APP/INSTANCE every 30s (heartbeat).
  • Eviction: Eureka removes instances that miss 3 consecutive heartbeats (90s window).
  • Self-preservation: If heartbeats drop >15% network-wide, Eureka stops evictions to avoid false negatives during network partitions.
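The eviction and self-preservation numbers above can be checked with a little arithmetic. The sketch below is illustrative only: the class and method names are hypothetical, and Eureka's real bookkeeping differs in detail. It assumes the 30s heartbeat, the 3-heartbeat (90s) lease, and a renewal threshold expressed as a fraction of expected heartbeats (0.85 corresponds to the ">15% drop" rule).

```java
import java.time.Duration;

/** Illustrative arithmetic only; names and structure are hypothetical, not Eureka's source. */
final class EurekaLeaseMath {
    static final Duration RENEWAL_INTERVAL = Duration.ofSeconds(30); // heartbeat period
    static final int MISSED_HEARTBEATS_BEFORE_EVICTION = 3;          // 3 x 30s = 90s lease

    /** An instance becomes an eviction candidate once its last heartbeat is older than 90s. */
    static boolean leaseExpired(Duration sinceLastHeartbeat) {
        Duration lease = RENEWAL_INTERVAL.multipliedBy(MISSED_HEARTBEATS_BEFORE_EVICTION);
        return sinceLastHeartbeat.compareTo(lease) > 0;
    }

    /** At a 30s interval each instance is expected to renew twice per minute. */
    static int expectedRenewalsPerMinute(int instances) {
        return instances * 2;
    }

    /** Self-preservation engages when actual renewals fall below threshold x expected. */
    static boolean selfPreservationActive(int instances, int renewalsLastMinute, double threshold) {
        return renewalsLastMinute < expectedRenewalsPerMinute(instances) * threshold;
    }
}
```

With 10 instances, 20 renewals per minute are expected. A 20% loss leaves 16: that is below 17 (threshold 0.85), so evictions stop, but it is above 15 (threshold 0.75), so evictions continue.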
Production Insight
Eureka self-preservation can mask real outages for minutes.
In a production incident, a network hiccup caused 20% heartbeat loss. Eureka entered self-preservation and stopped evicting, so the gateway kept routing to dead instances. The fix: tune eureka.server.renewalPercentThreshold to 0.75 and implement proactive health checks.
Rule: never rely solely on Eureka status — combine with client-side circuit breakers.
Key Takeaway
Service Discovery removes hardcoded URLs but introduces heartbeat latency (30s) and self-preservation risks.
Always pair Eureka with custom health indicators and circuit breakers.
Failure to do so = routing traffic to dead services.

API Gateway: The Front Door to Your Microservices

An API Gateway is a single entry point that handles cross-cutting concerns: authentication, rate limiting, request/response transformation, and routing. Spring Cloud Gateway is the current standard — it's reactive (built on WebFlux) and non-blocking, meaning it can handle thousands of concurrent requests with minimal threads.

Why not a plain load balancer? A load balancer distributes traffic but doesn't understand HTTP semantics. A gateway can inspect paths, rewrite URLs, add headers, enforce rate limits per client, and handle CORS. It's also the perfect place to implement token validation (JWT) before requests reach your services, reducing duplicate auth logic.

Key configuration pitfall: route order matters. The gateway evaluates routes in the order they're defined. A generic catch-all route before a specific one will shadow all requests. Always declare specific routes first, then a fallback.
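The shadowing problem is easy to reproduce outside the gateway. The following is a simplified first-match router in plain Java, not Spring Cloud Gateway's matching code: `FirstMatchRouter`, the prefixes, and the route ids are all made up for illustration, and real predicates are richer than a prefix check.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical first-match router; route ids and prefixes are invented for illustration. */
final class FirstMatchRouter {
    // LinkedHashMap preserves declaration order, like routes listed in application.yml.
    private final Map<String, String> routesByPrefix = new LinkedHashMap<>();

    FirstMatchRouter route(String pathPrefix, String routeId) {
        routesByPrefix.put(pathPrefix, routeId);
        return this;
    }

    /** Returns the id of the first declared route whose prefix matches the path. */
    String match(String path) {
        for (Map.Entry<String, String> route : routesByPrefix.entrySet()) {
            if (path.startsWith(route.getKey())) {
                return route.getValue();
            }
        }
        return "no-route";
    }
}
```

Declaring the catch-all `/api/` before `/api/orders/` sends every order request to the fallback; swapping the two declarations restores the intended routing.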

Performance note: Spring Cloud Gateway adds ~5-10ms per request in most cases. If you need sub-millisecond latency, consider running a dedicated sidecar proxy like Envoy instead. But for 95% of applications, the gateway is fine.

io/thecodeforge/gateway/src/main/resources/application.yml (YAML)
# Spring Cloud Gateway Configuration
spring:
  cloud:
    gateway:
      routes:
        - id: user-service
          uri: lb://user-service
          predicates:
            - Path=/api/users/**
          filters:
            - StripPrefix=1
            - RemoveRequestHeader=Cookie
        - id: order-service
          uri: lb://order-service
          predicates:
            - Path=/api/orders/**
          filters:
            - name: CircuitBreaker
              args:
                name: orderCircuitBreaker
                fallbackUri: forward:/fallback/orders
            - name: RequestRateLimiter
              args:
                redis-rate-limiter.replenishRate: 10
                redis-rate-limiter.burstCapacity: 20
Output
// Routes defined in order of priority.
// Circuit breaker with fallback for order routes.
// Rate limiter: 10 requests/second per user (Redis-backed).
Gateway Anti-Pattern
Don't put business logic in the gateway. It should only route, filter, and transform. Business logic belongs in the services. Violating this makes the gateway a distributed monolith bottleneck and impossible to maintain.
Production Insight
In production, a misconfigured rate limiter can silently drop legitimate traffic. We had a rate limiter set to 5 req/s per client, but the frontend sent 6 requests in one second on page load, so 1 in 6 of those requests (about 17%) failed with 429. The fix: use a burst capacity 2x the replenish rate and add a Retry-After header.
Rule: always test rate limit scenarios with real user traffic patterns.
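The interplay between replenish rate and burst capacity can be sketched with a small token bucket. This is a plain-Java approximation of the idea, not the Redis-backed limiter that Spring Cloud Gateway ships; the class name `TokenBucket` and the helper `rejectedOutOf` are hypothetical, and time is passed in explicitly so the behaviour is deterministic.

```java
/** Minimal token bucket; a sketch of the idea, not Spring Cloud Gateway's Redis-backed limiter. */
final class TokenBucket {
    private final int replenishRate;   // tokens added per second
    private final int burstCapacity;   // maximum tokens the bucket can hold
    private double tokens;
    private long lastRefillNanos;

    TokenBucket(int replenishRate, int burstCapacity, long nowNanos) {
        this.replenishRate = replenishRate;
        this.burstCapacity = burstCapacity;
        this.tokens = burstCapacity;   // start full so an initial burst is allowed
        this.lastRefillNanos = nowNanos;
    }

    /** Returns true if the request is allowed, false if it should receive a 429. */
    synchronized boolean tryAcquire(long nowNanos) {
        double elapsedSeconds = (nowNanos - lastRefillNanos) / 1_000_000_000.0;
        tokens = Math.min(burstCapacity, tokens + elapsedSeconds * replenishRate);
        lastRefillNanos = nowNanos;
        if (tokens >= 1.0) {
            tokens -= 1.0;
            return true;
        }
        return false;
    }

    /** Counts how many of n simultaneous requests at time zero are rejected. */
    static int rejectedOutOf(int requests, int replenishRate, int burstCapacity) {
        TokenBucket bucket = new TokenBucket(replenishRate, burstCapacity, 0L);
        int rejected = 0;
        for (int i = 0; i < requests; i++) {
            if (!bucket.tryAcquire(0L)) {
                rejected++;
            }
        }
        return rejected;
    }
}
```

With a rate of 5 and a burst of 5, a page load that fires 6 requests at once loses one of them. Raising the burst capacity to 10 (2x the rate) lets all 6 through while the sustained rate stays at 5 per second.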
Key Takeaway
An API Gateway centralises auth, rate limiting, and routing — but adds 5-10ms latency and requires careful route ordering.
Without it, every service duplicates cross-cutting logic.
With it, you risk overloading the gateway if not tuned for throughput.

Resilience: Circuit Breakers, Retries, and Bulkheads

In a distributed system, failures are not exceptions; they're the default. Resilience4j is the de facto library for Spring Boot applications. It provides three core patterns:
  • Circuit Breaker: monitors failures and opens the circuit when a threshold is hit, preventing further calls to a failing service. After a wait duration, it half-opens and tests the water.
  • Retry: automatically retries failed calls, optionally with exponential backoff, but must only be used with idempotent operations (e.g., GET, PUT, or POST with an idempotency key).
  • Bulkhead: limits concurrent calls to a service, preventing resource exhaustion from spilling over.
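The circuit breaker's closed, open, and half-open cycle can be sketched as a small state machine. This is a deliberately simplified, count-based version in plain Java, not Resilience4j's internals: the class `SimpleCircuitBreaker` is hypothetical, time is passed in explicitly, and a single trial call decides the half-open state, whereas Resilience4j permits a configurable number.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Simplified count-based breaker: shows the state machine only, not Resilience4j's internals. */
final class SimpleCircuitBreaker {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final int slidingWindowSize;
    private final int minimumNumberOfCalls;
    private final double failureRateThreshold; // percent, e.g. 50.0
    private final long waitInOpenMillis;
    private final Deque<Boolean> outcomes = new ArrayDeque<>(); // true = failed call
    private State state = State.CLOSED;
    private long openedAtMillis;

    SimpleCircuitBreaker(int slidingWindowSize, int minimumNumberOfCalls,
                         double failureRateThreshold, long waitInOpenMillis) {
        this.slidingWindowSize = slidingWindowSize;
        this.minimumNumberOfCalls = minimumNumberOfCalls;
        this.failureRateThreshold = failureRateThreshold;
        this.waitInOpenMillis = waitInOpenMillis;
    }

    /** Whether a call may proceed now; moves OPEN to HALF_OPEN once the wait has elapsed. */
    synchronized boolean allowCall(long nowMillis) {
        if (state == State.OPEN && nowMillis - openedAtMillis >= waitInOpenMillis) {
            state = State.HALF_OPEN;
        }
        return state != State.OPEN;
    }

    /** Records the outcome of a call that was allowed to proceed. */
    synchronized void record(boolean failed, long nowMillis) {
        if (state == State.OPEN) {
            return; // calls are rejected while open, so there is nothing to record
        }
        if (state == State.HALF_OPEN) {
            if (failed) {
                open(nowMillis);
            } else {
                state = State.CLOSED;
                outcomes.clear();
            }
            return;
        }
        outcomes.addLast(failed);
        if (outcomes.size() > slidingWindowSize) {
            outcomes.removeFirst(); // keep only the most recent calls
        }
        long failures = outcomes.stream().filter(outcome -> outcome).count();
        boolean enoughCalls = outcomes.size() >= minimumNumberOfCalls;
        if (enoughCalls && failures * 100.0 / outcomes.size() >= failureRateThreshold) {
            open(nowMillis);
        }
    }

    private void open(long nowMillis) {
        state = State.OPEN;
        openedAtMillis = nowMillis;
        outcomes.clear();
    }

    synchronized State state() {
        return state;
    }
}
```

The minimum-calls guard matters: without it, a single failed call right after startup would count as a 100% failure rate and open the circuit.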

The biggest production mistake: configuring retry without a circuit breaker. If the downstream service is down, retries just amplify the load and delay the failure. Retries are for transient errors (timeouts, 503s), not for persistent failures.
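A retry loop that separates transient from persistent failures might look like the sketch below. It is plain Java written for illustration, not Resilience4j's retry module: `RetrySketch`, `TransientException`, and the doubling backoff are assumptions chosen to show the shape of the logic.

```java
import java.util.function.Supplier;

/** Hypothetical retry helper; illustrates the policy, not Resilience4j's Retry. */
final class RetrySketch {
    /** Marker for errors worth retrying (timeouts, 503s); invented for this sketch. */
    static final class TransientException extends RuntimeException {
        TransientException(String message) {
            super(message);
        }
    }

    /** Delay before the given retry (1-based): base, 2 x base, 4 x base, ... */
    static long backoffMillis(long baseMillis, int retry) {
        return baseMillis << (retry - 1);
    }

    /** Retries only TransientException; any other exception propagates on the first failure. */
    static <T> T callWithRetry(Supplier<T> call, int maxAttempts, long baseMillis) {
        for (int attempt = 1; ; attempt++) {
            try {
                return call.get();
            } catch (TransientException e) {
                if (attempt >= maxAttempts) {
                    throw e; // give up so a surrounding circuit breaker sees the failure
                }
                try {
                    Thread.sleep(backoffMillis(baseMillis, attempt));
                } catch (InterruptedException interrupted) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted during backoff", interrupted);
                }
            }
        }
    }
}
```

The final failure is rethrown rather than swallowed on purpose: a circuit breaker wrapped around this call can only trip if it sees the errors.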

Another trap: thread pool vs semaphore isolation. Resilience4j's bulkhead offers both. The thread pool bulkhead runs calls on a separate, bounded thread pool per bulkhead instance (isolated but resource-heavy). The semaphore bulkhead is lightweight but runs calls on the caller's thread, so a blocking downstream can still starve your application. Prefer thread pool isolation for critical paths.
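The semaphore variant is small enough to sketch in full. The class below is a hypothetical plain-Java illustration, not Resilience4j's bulkhead: it caps concurrency with permits, but the call still runs on the caller's thread, which is exactly the limitation described above.

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

/** Hypothetical semaphore bulkhead; caps concurrency but runs on the caller's thread. */
final class SemaphoreBulkhead {
    private final Semaphore permits;

    SemaphoreBulkhead(int maxConcurrentCalls) {
        this.permits = new Semaphore(maxConcurrentCalls);
    }

    /** Runs the call if a permit is free, otherwise rejects immediately instead of queueing. */
    <T> T execute(Supplier<T> call) {
        if (!permits.tryAcquire()) {
            throw new IllegalStateException("bulkhead full");
        }
        try {
            return call.get(); // still the caller's thread: a slow call blocks the caller
        } finally {
            permits.release();
        }
    }

    int availablePermits() {
        return permits.availablePermits();
    }
}
```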

Metrics to monitor: resilience4j.circuitbreaker.state and resilience4j.circuitbreaker.calls in Micrometer, filtered by the name tag of each circuit breaker. Alert on OPEN state lasting more than 5 minutes.

src/main/resources/application.yml (YAML)
# Resilience4j circuit breaker with retry and bulkhead
resilience4j.circuitbreaker:
  instances:
    inventoryService:
      registerHealthIndicator: true
      slidingWindowSize: 10
      minimumNumberOfCalls: 5
      permittedNumberOfCallsInHalfOpenState: 3
      waitDurationInOpenState: 10s
      failureRateThreshold: 50
      eventConsumerBufferSize: 10

resilience4j.retry:
  instances:
    inventoryService:
      maxAttempts: 3
      waitDuration: 500ms
      retryExceptions:
        - org.springframework.web.client.HttpServerErrorException

resilience4j.bulkhead:
  instances:
    inventoryService:
      maxConcurrentCalls: 5
      maxWaitDuration: 100ms
Output
// Circuit breaker opens at a 50% failure rate over the last 10 calls (evaluated once 5 calls are recorded).
// Retries only on 5xx errors, not 4xx.
// Bulkhead limits concurrent calls to 5, with 100ms max wait.
Resilience4j vs Hystrix
Hystrix has been in maintenance mode since 2018. Resilience4j is the active successor: no Netflix dependencies, modular, works with Spring Cloud 2023.x, and supports reactive types out of the box. Migrate away from Hystrix if you haven't already.
Production Insight
We had a payment service that was slow but not failing (2s average response). The circuit breaker didn't open because failure rate was <50%. But it caused thread pool exhaustion across three upstream services. The fix: add a time limiter (resilience4j.timelimiter) to cap the response time at 500ms and treat timeout as a failure in the circuit breaker.
Rule: always pair circuit breakers with time limiters to handle slow responses, not just errors.
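The "slow but not failing" case can be turned into a detectable failure with a timeout. The sketch below uses only the JDK's `CompletableFuture.orTimeout` (Java 9+) rather than Resilience4j's time limiter; `TimeLimiterSketch` and `slowCall` are hypothetical names, and the returned boolean stands in for what would be reported to a circuit breaker.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

/** JDK-only illustration of a time limit; not Resilience4j's TimeLimiter. */
final class TimeLimiterSketch {
    /** Returns true if the call finished within the limit, false if it timed out. */
    static boolean completesWithin(Supplier<String> call, long limitMillis) {
        try {
            CompletableFuture.supplyAsync(call)
                    .orTimeout(limitMillis, TimeUnit.MILLISECONDS)
                    .join();
            return true;
        } catch (CompletionException e) {
            if (e.getCause() instanceof TimeoutException) {
                return false; // treat as a failure so the circuit breaker can count it
            }
            throw e;
        }
    }

    /** Stand-in for a slow downstream call. */
    static String slowCall(long delayMillis) {
        try {
            Thread.sleep(delayMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "paid";
    }
}
```

One caveat of this JDK approach: `orTimeout` fails the future but does not interrupt the underlying task, so the slow call keeps occupying its worker thread until it returns.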
Key Takeaway
Resilience4j offers circuit breaker, retry, bulkhead, and time limiter. They must be combined, not used in isolation.
Retry without circuit breaker amplifies failure load.
Time limiters convert slow responses into fast failures the circuit breaker can detect.

Event-Driven Microservices: Why Your Synchronous Calls Are Burning Money

Stop pretending every service needs an instant reply. Synchronous HTTP calls chain services together, turning one failure into a cascade, and every caller holds a thread idle while it waits. Event-driven architecture decouples services with a message broker (RabbitMQ, Kafka, or Redis Streams). Service A publishes an event and moves on. Service B picks it up when it's ready. No waiting. No cascading failures.

Some production systems go further and use event sourcing to rebuild state from a log of events. Spring Cloud Stream wraps the broker with binders, so your code talks to bindings, not broker-specific queues or topics.

The WHY: resilience, scale, and audit trails. The HOW: declare Supplier, Function, or Consumer beans (or send imperatively with StreamBridge), bind them to destinations in application.yml, and let Spring Cloud Stream handle serialization, retries, and dead-letter queues. This pattern saves your ass when a downstream service goes down at 3 AM.

OrderEventPublisher.java (Java)
// io.thecodeforge — java tutorial

import org.springframework.cloud.stream.function.StreamBridge;
import org.springframework.stereotype.Component;

@Component
public class OrderEventPublisher {

    private final StreamBridge streamBridge;

    public OrderEventPublisher(StreamBridge streamBridge) {
        this.streamBridge = streamBridge;
    }

    public void publishOrderCreated(String orderId) {
        // Sends an event to the 'order-created' binding — no waiting
        streamBridge.send("order-created-out-0", orderId);
    }
}

// application.yml:
// spring:
//   cloud:
//     stream:
//       bindings:
//         order-created-out-0:
//           destination: orders.topic
//           content-type: application/json
//       binders:
//         defaultRabbit:
//           type: rabbit
Output
Event 'orderId-abc' published to RabbitMQ exchange 'orders.topic' — no HTTP reply
Production Trap: Losing Events on Restart
Never assume the broker remembers your messages. Configure durable queues and persistent delivery mode. Without them, a broker restart wipes unprocessed events and your audit trail goes silent.
Key Takeaway
Async messaging isn't optional — it's the difference between a brittle monolith and a resilient microservice ecosystem.

Reactive Microservices with Spring WebFlux: Stop Blocking, Start Scaling

Thread-per-request doesn't scale for I/O-bound load. Every blocked HTTP call ties up a thread from a finite pool. Under load, your services stall, and you throw hardware at the symptom. Reactive programming flips the model: a small number of event-loop threads handle thousands of concurrent requests with non-blocking I/O. Spring WebFlux runs on Netty by default and supports annotated, MVC-style controllers alongside reactive drivers for MongoDB and R2DBC and the non-blocking WebClient.

The WHY: more concurrent requests on the same hardware when the workload is I/O-bound (the often-quoted 10x depends entirely on the workload, so measure it). The HOW: write controllers that return Mono (single value) or Flux (stream). Offload unavoidable blocking calls with subscribeOn(Schedulers.boundedElastic()) so they never run on the event loop. Use WebClient for non-blocking HTTP calls. The code inside a chain still reads sequentially; it just doesn't hold a thread while waiting. Start small: replace one blocking REST client with reactive WebClient. Measure the thread savings. You'll convert the whole team.

ReactiveOrderController.java (Java)
// io.thecodeforge — java tutorial

import org.springframework.web.bind.annotation.*;
import org.springframework.web.reactive.function.client.WebClient;
import reactor.core.publisher.Mono;

@RestController
@RequestMapping("/orders")
public class ReactiveOrderController {

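    private final WebClient webClient;

    // Assumes a @LoadBalanced WebClient.Builder bean is defined elsewhere, so the
    // logical name "inventory-service" is resolved through service discovery.
    public ReactiveOrderController(WebClient.Builder webClientBuilder) {
        this.webClient = webClientBuilder.baseUrl("http://inventory-service").build();
    }

    @GetMapping("/{orderId}/status")
    public Mono<String> getOrderStatus(@PathVariable String orderId) {
        // Non-blocking call: returns immediately, result arrives later
        return webClient.get()
                // URI template so the path variable is encoded safely
                .uri("/inventory/{orderId}", orderId)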
                .retrieve()
                .bodyToMono(String.class)
                .map(inventoryStatus -> "Order " + orderId + ": " + inventoryStatus);
    }
}
Output
GET /orders/123/status -> "Order 123: in stock" (plain-text body; no thread blocked waiting for inventory-service)
Senior Shortcut: Start with WebClient, Not Full WebFlux
Replacing RestTemplate with reactive WebClient in a blocking controller gives immediate gains. You don't need to rewrite everything to reactive — just the I/O bottlenecks.
Key Takeaway
Thread isolation is a myth under load. Reactive streams let one thread do the work of fifty.

Stop Restarting Containers: Schema & Config Like a Pro

You don't 'create a schema in MySQL Workbench' for every microservice. You script it, version it, and automate it. Why? Because your dev, staging, and prod environments must match. Clicking around Workbench creates drift. Drift kills deployments.

Write a Flyway or Liquibase migration script and keep it in your microservice repo. For our employee service, a V1__create_employee_table.sql creates the schema. Then point application.properties at a config server or environment variables for the datasource URL, username, and password. Hard-coding these is amateur hour.

You set spring.datasource.url and spring.jpa.hibernate.ddl-auto=validate in production. Validate means your app won't even start if the schema doesn't match your entities. That's the safety net your on-call rotation will thank you for.

EmployeeServiceApplication.java (Java)
// io.thecodeforge — java tutorial

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.cloud.client.discovery.EnableDiscoveryClient;

@SpringBootApplication
@EnableDiscoveryClient
public class EmployeeServiceApplication {

    public static void main(String[] args) {
        SpringApplication.run(EmployeeServiceApplication.class, args);
    }
}
Output
2025-02-22 10:30:15.123 INFO 12345 --- [ main] i.t.EmployeeServiceApplication : Started EmployeeServiceApplication in 3.45 seconds (JVM running for 3.8)
Production Trap: Flyway Checksums
Never modify a committed migration file. Flyway checksums the content — change it and your app fails on startup. Always create a new V2__ file instead.
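Why an edited migration is detected can be shown with a checksum comparison. The sketch below uses the JDK's CRC32 over the script text; treat it as an approximation of the idea rather than Flyway's exact algorithm. `MigrationChecksum` is a hypothetical helper, and the recorded value stands in for the checksum stored in the schema history table.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

/** Hypothetical helper approximating migration checksum validation; not Flyway's code. */
final class MigrationChecksum {
    /** Computes a CRC32 checksum over the migration script's UTF-8 bytes. */
    static long checksum(String sql) {
        CRC32 crc = new CRC32();
        crc.update(sql.getBytes(StandardCharsets.UTF_8));
        return crc.getValue();
    }

    /** True if the script on disk still matches the checksum recorded when it was applied. */
    static boolean matchesRecorded(String sqlOnDisk, long recordedChecksum) {
        return checksum(sqlOnDisk) == recordedChecksum;
    }
}
```

Even a one-character edit to an applied V1 script changes the checksum, which is why the safe path is always a new V2 file.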
Key Takeaway
Schema changes are code changes. Script them, version them, automate them in CI/CD.

Run the Damn Thing: Microservice Startup in One Shot

Don't run your employee microservice from an IDE like a junior. Run it as a fat JAR or a container. Why? Because your IDE doesn't exist in production. Your microservice must boot, register with Eureka, and serve traffic, all without human fingers touching a run button.

First, mvn clean package -DskipTests. Then run java -jar target/employee-service-0.0.1-SNAPSHOT.jar. Watch the logs: you should see a Eureka registration message followed by 'Started EmployeeServiceApplication'. If you don't see registration, check eureka.client.serviceUrl.defaultZone in application.properties (or in bootstrap.properties if you still use the legacy bootstrap context).

Containerize it immediately. The Dockerfile needs three lines: FROM eclipse-temurin:17-jre-alpine, COPY target/*.jar app.jar, and ENTRYPOINT ["java","-jar","/app.jar"]. Build the image with docker build -t employee-service . and then run docker run -p 8081:8081 --network microservices-net employee-service. This is your production-identical local run. Master this before you touch Kubernetes.

EmployeeController.java (Java)
// io.thecodeforge — java tutorial

import org.springframework.web.bind.annotation.*;

@RestController
@RequestMapping("/api/employees")
public class EmployeeController {

    private final EmployeeService service;

    public EmployeeController(EmployeeService service) {
        this.service = service;
    }

    @GetMapping("/{id}")
    public Employee getEmployee(@PathVariable Long id) {
        return service.findById(id);
    }

    @PostMapping
    public Employee createEmployee(@RequestBody Employee employee) {
        return service.save(employee);
    }
}
Output
{
"id": 1,
"name": "Ada Lovelace",
"department": "Engineering"
}
Senior Shortcut: Health Check First
Key Takeaway
Run locally exactly as prod runs: from a JAR or container, not an IDE play button.

Testing Microservices: Contracts Beat Integration Hell

Synchronous integration tests across services are slow, brittle, and burn CI minutes. Instead, use consumer-driven contract tests with Spring Cloud Contract. Each service publishes a contract (Groovy or YAML) defining expected request/response behavior. The provider verifies against these contracts; the consumer stubs them. This catches breaking changes before deployment, without spinning up databases or message brokers.

WireMock or Testcontainers handle external dependencies. For real isolation, pair contracts with in-memory test slices: @WebMvcTest for controllers, @DataJpaTest for repositories.

The why: contracts decouple tests from infrastructure, making pipelines fast and deterministic. The how: start small. Contract-test one inter-service API, stub everything else with Spring Cloud Contract's automatic stub generation. Run provider-side verification in your build, then let consumers test against generated stubs locally.

UserServiceContractTest.java (Java)
// io.thecodeforge — java tutorial
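// Contract test for GET /users/{id}

import org.junit.jupiter.api.Test;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.autoconfigure.web.servlet.AutoConfigureMockMvc;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.boot.test.context.SpringBootTest.WebEnvironment;
import org.springframework.cloud.contract.stubrunner.spring.AutoConfigureStubRunner;
import org.springframework.cloud.contract.stubrunner.spring.StubRunnerProperties;
import org.springframework.test.web.servlet.MockMvc;

import static org.springframework.test.web.servlet.request.MockMvcRequestBuilders.get;
import static org.springframework.test.web.servlet.result.MockMvcResultMatchers.jsonPath;
import static org.springframework.test.web.servlet.result.MockMvcResultMatchers.status;

@SpringBootTest(webEnvironment = WebEnvironment.MOCK)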
@AutoConfigureMockMvc
@AutoConfigureStubRunner(
  ids = "com.example:user-service:+:stubs:8090",
  stubsMode = StubRunnerProperties.StubsMode.LOCAL
)
public class UserServiceContractTest {

    @Autowired
    private MockMvc mockMvc;

    @Test
    void shouldReturnUserWhenExists() throws Exception {
        mockMvc.perform(get("/users/1"))
               .andExpect(status().isOk())
               .andExpect(jsonPath("$.name").value("John"));
    }
}
Output
MockMvc passes — stub returns expected JSON. No real service needed.
Production Trap:
Never let developers skip contracts by testing against a running 'dev' cluster. That creates hidden coupling that breaks on deploy.
Key Takeaway
Contract tests catch integration bugs faster than full end-to-end suites.

Best Practices: Profile Configuration Over Environment Variables

Environment variables are global and hard to audit. Spring Boot profiles let you group configs by environment in profile-specific files: application-dev.yml, application-prod.yml (@Profile does the same for beans). Avoid scattering @Value annotations everywhere; bundle related config into @ConfigurationProperties classes with @Validated and Bean Validation (JSR-303) annotations. For secrets, externalize to a vault (HashiCorp Vault, AWS Secrets Manager) and inject them via Spring Cloud Vault or the equivalent integration. Never hardcode URLs or credentials. Use spring.config.import to compose configs: spring.config.import=configserver:http://config-server:8888.

Why: profiles enforce environment parity, reduce human error during deployments, and make configs testable. The how: refactor a single application.yml into three files (dev, staging, prod). Move all @Value into a single AppProperties class. Then select the profile at launch with one setting, for example SPRING_PROFILES_ACTIVE=dev or --spring.profiles.active=dev. You'll thank yourself during audits.

AppProperties.java (Java)
// io.thecodeforge — java tutorial
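// Typed config: no scattered @Value

import jakarta.validation.constraints.Max;
import jakarta.validation.constraints.Min;
import jakarta.validation.constraints.NotEmpty;
import java.util.List;
import org.springframework.boot.context.properties.ConfigurationProperties;
import org.springframework.stereotype.Component;
import org.springframework.validation.annotation.Validated;

@ConfigurationProperties(prefix = "app")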
@Validated
@Component
public class AppProperties {

    @NotEmpty
    private String apiKey;
    @Min(1) @Max(65535)
    private int port;
    private List<String> allowedOrigins;

    // getters & setters
}
Output
Spring auto-binds app.api-key, app.port, app.allowed-origins from YAML.
Production Trap:
If you use @RefreshScope on @ConfigurationProperties, scoped beans get proxy overhead. Keep config POJOs plain and inject them as singletons where possible.
Key Takeaway
One @ConfigurationProperties class replaces dozens of @Values and makes configs auditable.
● Production incident · POST-MORTEM · severity: high

The Silent Gateway Drop: When Eureka Says UP but Traffic Says DOWN

Symptom
Intermittent HTTP 502 errors from the API Gateway to the payment service. Clients saw random payment failures. Eureka dashboard showed both payment-service instances as UP (green).
Assumption
The load balancer was misconfigured or the gateway had a bug.
Root cause
The payment service's /actuator/health endpoint returned 200 OK even when the database connection pool was exhausted. Eureka's healthcheck only polls the default health endpoint; it doesn't validate that the service can process requests. The two instances were alive but unable to handle any real transactions.
Fix
Configured a custom health indicator in the payment service that checked both DB connectivity and pool availability. Then set Eureka's healthcheck to use that custom endpoint. Also added a gateway-level circuit breaker for the payment route (Resilience4j time limiter + circuit breaker) so the gateway would stop routing after timeouts.
Key lesson
  • Eureka health ≠ service health. Always use custom health indicators that test real dependencies.
  • Gateway-level circuit breakers are cheaper than service-level ones: they protect the entry point first.
  • Monitor actual success rates, not just UP/DOWN status.
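
The decision logic behind the custom health indicator described in the fix can be sketched in plain Java. This is a minimal illustration, not the incident's actual code: PoolAwareHealth, PoolStats, and the "all connections busy and callers waiting" rule are assumptions chosen for the example. In a real service the same check would sit inside a Spring Boot HealthIndicator and read the numbers from the connection pool's own metrics.

```java
import java.util.function.BooleanSupplier;

// Plain-Java sketch of a dependency-aware health check.
// All names here (PoolAwareHealth, PoolStats) are hypothetical.
class PoolAwareHealth {

    enum Status { UP, DOWN }

    // Snapshot of connection-pool gauges, as a pool's metrics API might report them.
    record PoolStats(int active, int max, int waitingThreads) {}

    // DOWN when the DB probe fails, or when every connection is busy
    // and callers are already queuing for one.
    static Status check(BooleanSupplier dbProbe, PoolStats pool) {
        boolean reachable;
        try {
            reachable = dbProbe.getAsBoolean();
        } catch (RuntimeException e) {
            reachable = false; // a throwing probe counts as unreachable
        }
        if (!reachable) {
            return Status.DOWN;
        }
        boolean exhausted = pool.active() >= pool.max() && pool.waitingThreads() > 0;
        return exhausted ? Status.DOWN : Status.UP;
    }

    public static void main(String[] args) {
        System.out.println(check(() -> true, new PoolStats(3, 10, 0)));   // healthy pool
        System.out.println(check(() -> true, new PoolStats(10, 10, 4)));  // exhausted pool
        System.out.println(check(() -> false, new PoolStats(0, 10, 0)));  // DB unreachable
    }
}
```

The second case is the one from the incident: the process is alive and the database answers, yet the instance cannot serve a transaction, so it should report DOWN and drop out of rotation.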
Production debug guide
Symptom → Action matrix for common microservices problems · 4 entries
Symptom · 01
Service A cannot resolve service B by name (UnknownHostException)
Fix
Check Eureka dashboard: is service B registered? Verify both services are on the same Eureka server. Also confirm the spring.application.name matches the name used in @FeignClient.
Symptom · 02
Gateway returns 503 Service Unavailable intermittently
Fix
Check the gateway's routing configuration (application.yml). Run curl against the service directly (bypass gateway) to isolate the issue. Look at the gateway logs for routing failures.
Symptom · 03
Circuit breaker opens permanently
Fix
Inspect the downstream service's response times and error rates. Check the circuit breaker's state via /actuator/circuitbreakers, or via /actuator/health when the Resilience4j health indicator is enabled. Usually the root cause is a slow or crashing dependency; fix that first.
Symptom · 04
Distributed tracing shows a request takes 10s across services but each service reports <100ms
Fix
Look at the gaps between spans. The delay is likely serialization, network congestion, or thread pool exhaustion at the client side. Enable Zipkin Kafka collector to reduce span reporting overhead.
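
For Symptom 04, the "gaps between spans" can be quantified once span timings are exported from the tracing backend. The sketch below is a plain-Java illustration under assumptions: SpanGaps and Span are hypothetical names, and timestamps are taken to be milliseconds relative to the start of the trace. It sums the time covered by service spans and reports the remainder.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper: how much of a trace is NOT covered by any service span?
class SpanGaps {

    record Span(String name, long startMs, long endMs) {}

    // Returns the time inside [traceStartMs, traceEndMs] that no span accounts for.
    static long uncoveredMs(long traceStartMs, long traceEndMs, List<Span> spans) {
        List<Span> sorted = new ArrayList<>(spans);
        sorted.sort(Comparator.comparingLong(Span::startMs));
        long covered = 0;
        long cursor = traceStartMs; // everything before the cursor is already counted
        for (Span s : sorted) {
            long start = Math.max(s.startMs(), cursor);
            long end = Math.min(s.endMs(), traceEndMs);
            if (end > start) {
                covered += end - start;
                cursor = end;
            }
        }
        return (traceEndMs - traceStartMs) - covered;
    }

    public static void main(String[] args) {
        List<Span> spans = List.of(
                new Span("order-service", 0, 90),
                new Span("payment-service", 5_000, 5_080),
                new Span("inventory-service", 9_900, 9_990));
        // A 10 s trace where each service reports under 100 ms of work.
        System.out.println(uncoveredMs(0, 10_000, spans) + " ms unaccounted for");
    }
}
```

A large uncovered figure points away from the services' own handlers and toward what happens between them: queuing for a client thread or connection, serialization, or the network.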
★ 5-Minute Microservices Debug Cheat Sheet
Commands and actions for the most common production microservices failures.
Service not registering with Eureka
Immediate action
Check if eureka.client.serviceUrl.defaultZone is correct and reachable.
Commands
curl -s -H 'Accept: application/json' http://<eureka-host>:8761/eureka/apps/<service-name> | jq .
docker logs <eureka-container> --tail 50
Fix now
Restart the service. If still fails, verify the service can reach Eureka's host/port (firewall rules).
Gateway routes but returns 500 or timeout
Immediate action
Bypass the gateway: call the service directly on its server:port.
Commands
curl -w '%{http_code}' http://<service-ip>:<port>/actuator/health
docker compose logs gateway-service --tail 100 | grep ERROR
Fix now
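If the direct call works, raise the gateway's timeouts: spring.cloud.gateway.httpclient.connect-timeout and response-timeout globally, or the connect-timeout and response-timeout metadata keys on that route. Newer Gateway releases move these under a different prefix, so check the property names for your version.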
Feign client call fails with RetryableException
Immediate action
Check the target service's load and thread pool. Likely the service is overwhelmed.
Commands
curl http://<service-ip>:<port>/actuator/metrics/jvm.threads.live
curl http://<service-ip>:<port>/actuator/health | jq .components.db.status
Fix now
Increase server.tomcat.threads.max on the downstream service. Add a circuit breaker around the Feign call.
Monolith vs Microservices
Aspect | Monolithic Architecture | Microservices (Spring Cloud)
Deployment | Single unit; all-or-nothing deployment. | Independent units; deploy features separately.
Scaling | Scale the whole app (Vertical/Horizontal). | Scale only the bottleneck service (Granular).
Fault Tolerance | One bug can crash the entire process. | Circuit breakers isolate failures to one service.
Tech Stack | Locked into one language/framework. | Polyglot-friendly; use the best tool for each service.
Data Management | Single shared Database (Strong Consistency). | Database per Service (Eventual Consistency).

Key takeaways

1. Spring Boot with Spring Cloud is a practical stack for building distributed systems that are resilient and scalable.
2. Spring Boot builds the 'what' (the logic), while Spring Cloud manages the 'how' (the communication and coordination).
3. Service Discovery (Eureka) and API Gateways (Spring Cloud Gateway) are the entry-level requirements for any microservices architecture.
4. Never skip observability; distributed tracing (Micrometer Tracing), centralized logging (ELK), and metrics (Prometheus) are your only hope when debugging a request across multiple service boundaries.
5. Adopt the 'Database per Service' rule to ensure truly independent deployments.

Common mistakes to avoid

4 patterns

Overusing Microservices when a monolith suffices

Symptom
The team spends months building a distributed system for a simple CRUD app, which adds operational overhead and slows iteration.
Fix
Start with a monolith. Extract microservices only when you hit clear scaling bottlenecks or team coordination issues.

Hardcoding Service URLs

Symptom
After auto-scaling, new instances get new IPs. Hardcoded URLs cause connection failures and manual configuration updates.
Fix
Use a Service Registry (Eureka) and client-side load balancing. Define Feign clients with service names, not IPs.

Ignoring Cascading Failures — No Circuit Breakers

Symptom
A slow payment service holds up all upstream threads. The entire checkout flow becomes unresponsive under load.
Fix
Implement circuit breakers on all inter-service calls. Configure time limiters to convert slow responses into fast failures.

Sharing a Single Database Across Services

Symptom
Changes to the shared schema require coordinated deployments across multiple services, negating independent deployment.
Fix
Adopt Database per Service pattern. Use event-driven communication (Kafka/RabbitMQ) for eventual consistency.
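
The circuit-breaker fix from the cascading-failures pattern above follows a three-state machine (Closed, Open, Half-Open). The sketch below shows that state machine in plain Java so the behavior is easy to trace. It is not Resilience4j's implementation or API: SimpleCircuitBreaker is a hypothetical class, it counts consecutive failures where Resilience4j uses a sliding-window failure rate, it lets every call through while half-open, and it leaves out the time limiter that turns slow calls into failures.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// Hypothetical, simplified circuit breaker (consecutive-failure counting).
class SimpleCircuitBreaker {

    enum State { CLOSED, OPEN, HALF_OPEN }

    private final int failureThreshold; // failures in a row before opening
    private final long openMillis;      // how long to stay open before probing
    private final LongSupplier clock;   // injectable clock, e.g. System::currentTimeMillis
    private State state = State.CLOSED;
    private int consecutiveFailures = 0;
    private long openedAt = 0;

    SimpleCircuitBreaker(int failureThreshold, long openMillis, LongSupplier clock) {
        this.failureThreshold = failureThreshold;
        this.openMillis = openMillis;
        this.clock = clock;
    }

    // Current state; an OPEN breaker moves to HALF_OPEN once the wait has elapsed.
    synchronized State state() {
        if (state == State.OPEN && clock.getAsLong() - openedAt >= openMillis) {
            state = State.HALF_OPEN;
        }
        return state;
    }

    // Runs the action unless the breaker is open; failures return the fallback.
    <T> T call(Supplier<T> action, Supplier<T> fallback) {
        if (state() == State.OPEN) {
            return fallback.get(); // fail fast, the downstream service is not touched
        }
        try {
            T result = action.get();
            onSuccess();
            return result;
        } catch (RuntimeException e) {
            onFailure();
            return fallback.get();
        }
    }

    private synchronized void onSuccess() {
        consecutiveFailures = 0;
        state = State.CLOSED;
    }

    private synchronized void onFailure() {
        consecutiveFailures++;
        // A failed probe in HALF_OPEN reopens immediately.
        if (state == State.HALF_OPEN || consecutiveFailures >= failureThreshold) {
            state = State.OPEN;
            openedAt = clock.getAsLong();
        }
    }
}
```

Usage would look like `new SimpleCircuitBreaker(5, 10_000, System::currentTimeMillis)` wrapped around each inter-service call. The sketch also shows why a breaker can appear to open "permanently": if the dependency is still failing, every half-open probe fails and the breaker reopens for another full wait period.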
INTERVIEW PREP · PRACTICE MODE

Interview Questions on This Topic

Q01 · SENIOR
Explain the 'Service Registry' pattern. How does Eureka distinguish betw...
Q02 · SENIOR
What is the role of an API Gateway (like Spring Cloud Gateway) regarding...
Q03 · SENIOR
How do you ensure data consistency across multiple microservices without...
Q04 · SENIOR
Explain the 'Circuit Breaker' states (Closed, Open, Half-Open) in Resili...
Q05 · SENIOR
What is 'Client-Side Load Balancing' (LoadBalancer library) and how does...
Q01 of 05 · SENIOR

Explain the 'Service Registry' pattern. How does Eureka distinguish between a service being 'Down' vs a 'Network Partition'?

ANSWER
The Service Registry pattern decouples service consumers from providers. Services register with their IP/port on startup and periodically send heartbeats. Eureka uses a 'self-preservation' mode: if heartbeats drop below a threshold network-wide, it stops evicting instances, assuming a network partition rather than actual failures. This prevents false positives during network issues. To distinguish: check if only one service instance stops heartbeating (likely down) vs many instances simultaneously (likely partition). Also use custom health indicators that test real dependencies, not just the default health endpoint.
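
The threshold arithmetic behind self-preservation can be worked through in a few lines. This is a sketch, not Eureka's source: SelfPreservation and its method names are hypothetical, and the inputs assume the commonly cited defaults of one heartbeat every 30 seconds per instance and a renewal threshold of 85%.

```java
// Hypothetical sketch of the self-preservation arithmetic.
class SelfPreservation {

    // Minimum heartbeats per minute the registry expects before it suspects a partition.
    static int renewalThresholdPerMin(int instances, int renewalIntervalSec, double percent) {
        return (int) (instances * (60.0 / renewalIntervalSec) * percent);
    }

    // Eviction continues only while observed heartbeats stay above the threshold.
    static boolean evictionAllowed(int renewsLastMin, int thresholdPerMin) {
        return renewsLastMin > thresholdPerMin;
    }

    public static void main(String[] args) {
        // 10 instances, 30 s interval, 85% threshold (assumed defaults).
        int threshold = renewalThresholdPerMin(10, 30, 0.85);
        System.out.println("threshold per minute: " + threshold);

        // One instance dies: 9 instances x 2 heartbeats = 18 per minute.
        System.out.println("one instance down, evict? " + evictionAllowed(18, threshold));

        // Partition hides 5 instances: 5 x 2 = 10 per minute.
        System.out.println("partition, evict? " + evictionAllowed(10, threshold));
    }
}
```

With these numbers, a single dead instance leaves heartbeats above the threshold, so it is evicted normally, while a sudden loss of half the fleet drops below it and the registry keeps every entry. That is the "one instance vs many at once" distinction from the answer, expressed as arithmetic.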
FAQ · 5 QUESTIONS

Frequently Asked Questions

01
Do I need both Eureka and an API Gateway?
02
Can I use Consul instead of Eureka?
03
How many services is too many for Spring Cloud?
04
Should I share configuration across services via Spring Cloud Config?
05
What's the best way to handle distributed tracing?