Distributed Tracing with Jaeger
Distributed tracing tracks a request as it flows through multiple microservices. A trace is the complete journey; a span is one operation within a single service. Jaeger is an open-source tracing backend that stores and visualises traces. Instrument your code with OpenTelemetry, configure it to export to Jaeger, and you can see which service and which operation adds latency as the request crosses service boundaries.
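To make the trace/span relationship concrete, here is a minimal pure-Python sketch of the data model. It is not how Jaeger or OpenTelemetry represent spans internally: the `Span` class, its field names, the service names, and the timings are all invented for illustration, and it assumes child spans run one after another.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Span:
    """Illustrative span record; field names are simplified assumptions."""
    trace_id: str
    span_id: str
    parent_id: Optional[str]  # None marks the root span of the trace
    service: str
    name: str
    start_ms: int
    end_ms: int

    @property
    def duration_ms(self) -> int:
        return self.end_ms - self.start_ms


def self_time_ms(span: Span, spans: List[Span]) -> int:
    """Duration of a span minus the durations of its direct children.

    Assumes children do not overlap in time, which keeps the sketch simple.
    """
    children = [s for s in spans if s.parent_id == span.span_id]
    return span.duration_ms - sum(c.duration_ms for c in children)


# One trace: every span shares the same trace_id and links to its parent.
trace = [
    Span("t1", "a", None, "order-service", "GET /orders/{order_id}", 0, 120),
    Span("t1", "b", "a", "order-service", "db-query", 5, 25),
    Span("t1", "c", "a", "user-service", "GET /users/{user_id}", 30, 115),
]

# The span with the largest self time is where most of the latency is spent.
slowest = max(trace, key=lambda s: self_time_ms(s, trace))
print(slowest.service, slowest.name, self_time_ms(slowest, trace))
```

In this made-up trace the root span takes 120 ms in total, but most of that time belongs to the call into the second service. A span timeline in a tracing UI lets you read off the same kind of breakdown visually.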
Instrumenting a FastAPI App with OpenTelemetry
# pip install opentelemetry-distro opentelemetry-exporter-otlp
# pip install opentelemetry-instrumentation-fastapi
from fastapi import FastAPI
from opentelemetry import trace
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
from opentelemetry.instrumentation.fastapi import FastAPIInstrumentor

# Configure tracer. service.name is the name Jaeger shows in its service list;
# 'order-service' is only an example name, pick your own.
provider = TracerProvider(resource=Resource.create({'service.name': 'order-service'}))
exporter = OTLPSpanExporter(endpoint='http://jaeger:4317')  # Jaeger OTLP gRPC endpoint
provider.add_span_processor(BatchSpanProcessor(exporter))
trace.set_tracer_provider(provider)

app = FastAPI()
FastAPIInstrumentor.instrument_app(app)  # auto-instruments all routes

tracer = trace.get_tracer(__name__)

# db and user_service stand in for your application's own async clients (not shown).

@app.get('/orders/{order_id}')
async def get_order(order_id: int):
    with tracer.start_as_current_span('fetch-order') as span:
        span.set_attribute('order.id', order_id)
        # Manual spans for specific sub-operations
        with tracer.start_as_current_span('db-query'):
            order = await db.get_order(order_id)
        with tracer.start_as_current_span('enrich-order'):
            user = await user_service.get_user(order.user_id)  # cross-service call
        return {'order': order, 'user': user}
Running Jaeger with Docker
# Run Jaeger all-in-one (development setup)
docker run -d \
  --name jaeger \
  -p 16686:16686 \
  -p 4317:4317 \
  -p 4318:4318 \
  jaegertracing/all-in-one:latest

# Ports:
#   16686: Jaeger UI
#   4317:  OTLP gRPC receiver
#   4318:  OTLP HTTP receiver

# Open Jaeger UI: http://localhost:16686
# Search by service name → see all traces
# Click a trace → see full span timeline
# Click a span → see attributes, events, errors
🎯 Key Takeaways
- A trace = complete request journey across services. A span = one operation within a service.
- OpenTelemetry is the vendor-neutral instrumentation API — use it to avoid lock-in.
- FastAPIInstrumentor auto-instruments all routes — you only need manual spans for important sub-operations.
- Trace context (trace ID, span ID) propagates via HTTP headers (traceparent) between services.
- Use span attributes to add business context (order.id, user.id) so you can search and filter traces by them.
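The traceparent header mentioned above follows the W3C Trace Context format: `version-traceid-parentid-flags`, all lowercase hex. OpenTelemetry instrumentation normally reads and writes this header for you, so the following is only a pure-Python sketch of what travels between services. The helper names `parse_traceparent` and `child_traceparent` are hypothetical, not part of any library.

```python
import re
import secrets

# Version 00 of the header: 32 hex chars of trace ID, 16 of span ID, 2 of flags.
_TRACEPARENT_RE = re.compile(r"00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})")


def parse_traceparent(header: str):
    """Return (trace_id, parent_span_id, sampled), or None if the header is invalid."""
    match = _TRACEPARENT_RE.fullmatch(header.strip())
    if match is None:
        return None
    trace_id, parent_id, flags = match.groups()
    # All-zero trace or span IDs are invalid per the specification.
    if set(trace_id) == {"0"} or set(parent_id) == {"0"}:
        return None
    sampled = bool(int(flags, 16) & 0x01)  # lowest bit is the "sampled" flag
    return trace_id, parent_id, sampled


def child_traceparent(incoming: str) -> str:
    """Build the header for an outgoing call: same trace ID, fresh span ID."""
    parsed = parse_traceparent(incoming)
    if parsed is None:
        # No valid context: start a new trace. Marking it sampled is an
        # arbitrary choice made for this sketch.
        trace_id, sampled = secrets.token_hex(16), True
    else:
        trace_id, _, sampled = parsed
    new_span_id = secrets.token_hex(8)
    return f"00-{trace_id}-{new_span_id}-{'01' if sampled else '00'}"


incoming = "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
outgoing = child_traceparent(incoming)
print(outgoing)
```

The trace ID stays constant across every hop while the span ID changes, which is what lets the backend stitch spans from different services into one trace.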
Interview Questions on This Topic
- Q: What is distributed tracing and when would you use it?
- Q: What is the difference between a trace and a span?
- Q: What is sampling in tracing and why is it needed?
Frequently Asked Questions
What is the difference between distributed tracing, logging, and metrics?
Logs are time-stamped text events from a single service. Metrics are aggregated numerical measurements (request rate, error rate, latency percentiles). Distributed traces show the causal chain of events across services for a single request. Observability requires all three: metrics to know something is wrong, logs to see what happened, traces to find where.
What is sampling in distributed tracing?
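Recording every trace at high traffic volumes is expensive, so sampling records only a fraction of them. Head-based sampling decides at the start of a request: it is simple, but it cannot know in advance which requests will be slow or fail, so it misses many of them. Tail-based sampling decides after the trace completes and keeps slow or error traces; it is more accurate but requires buffering spans until the decision is made. Jaeger supports head-based sampling directly, while tail-based sampling depends on the version and deployment and is commonly handled by an OpenTelemetry Collector in front of the backend. Common approach: sample 1-5% of normal traces and always keep errors, which in practice needs a tail-based decision.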
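The two decisions can be sketched in a few lines of plain Python. This is an illustration of the idea, not Jaeger's or OpenTelemetry's implementation: the function names, the dictionary layout of a span, and the 500 ms threshold are assumptions made for the example. The head-based check is similar in spirit to trace-ID ratio samplers, which derive the decision from the trace ID so that it is deterministic for a given trace.

```python
from typing import Dict, List


def head_sample(trace_id_hex: str, ratio: float) -> bool:
    """Head-based: decide from the trace ID alone, before the request runs."""
    low_64_bits = int(trace_id_hex, 16) & ((1 << 64) - 1)
    return low_64_bits < round(ratio * (1 << 64))


def tail_sample(spans: List[Dict], latency_threshold_ms: int = 500) -> bool:
    """Tail-based: decide once all spans are in; keep error or slow traces.

    The 500 ms default is an arbitrary threshold chosen for this sketch.
    """
    has_error = any(s["error"] for s in spans)
    total_ms = max(s["end_ms"] for s in spans) - min(s["start_ms"] for s in spans)
    return has_error or total_ms >= latency_threshold_ms


fast_ok = [{"start_ms": 0, "end_ms": 40, "error": False}]
fast_error = [{"start_ms": 0, "end_ms": 40, "error": True}]
slow_ok = [
    {"start_ms": 0, "end_ms": 900, "error": False},
    {"start_ms": 10, "end_ms": 850, "error": False},
]

print(head_sample("4bf92f3577b34da6a3ce929d0e0e4736", 0.05))
print(tail_sample(fast_ok), tail_sample(fast_error), tail_sample(slow_ok))
```

Note the trade-off: `head_sample` needs nothing but the trace ID, while `tail_sample` needs every span of the trace in hand, which is where the buffering cost comes from.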