Distributed Tracing with OpenTelemetry: Following Requests Through Microservices
In a microservices architecture, a single request can traverse dozens of services. OpenTelemetry makes the entire path visible.
The Problem: Invisible Request Paths
A user clicks, and the page is slow. But where is the problem? In the API gateway? In the auth service? In the database? Without distributed tracing, debugging becomes a guessing game.
OpenTelemetry: The Standard
OpenTelemetry (OTel) is a CNCF project and the de facto open standard for observability data. It replaces backend-specific instrumentation such as the Jaeger clients and Zipkin libraries with a unified, vendor-neutral API.
- Auto-Instrumentation: OTel can automatically instrument HTTP requests, database queries, and message queue operations, in many languages without code changes.
- Context Propagation: The trace context (trace ID, parent span ID, and sampling flag) is propagated automatically via HTTP headers, by default the W3C traceparent header, and via message queue metadata.
- Vendor-neutral: Send traces to Jaeger, Tempo, Datadog, or any other OTel-compatible backend.
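To make context propagation concrete, here is a minimal sketch of what an instrumentation library does with the W3C traceparent header when a service receives a request and calls the next service. It uses only the Python standard library and is not the OpenTelemetry API. The function names (new_traceparent, parse_traceparent, outgoing_headers) are hypothetical, and header keys are assumed to be lowercase already.

```python
import re
import secrets

# W3C Trace Context, version 00: "00-<32 hex trace id>-<16 hex parent id>-<2 hex flags>"
_TRACEPARENT = re.compile(r"00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})")


def new_traceparent(sampled=True):
    """Start a new trace: random trace ID and span ID (hypothetical helper)."""
    flags = "01" if sampled else "00"
    return f"00-{secrets.token_hex(16)}-{secrets.token_hex(8)}-{flags}"


def parse_traceparent(header):
    """Return the trace context as a dict, or None if the header is invalid."""
    match = _TRACEPARENT.fullmatch(header.strip())
    if match is None:
        return None
    trace_id, parent_id, flags = match.groups()
    # All-zero IDs are defined as invalid by the W3C specification.
    if trace_id == "0" * 32 or parent_id == "0" * 16:
        return None
    return {
        "trace_id": trace_id,
        "parent_id": parent_id,
        "sampled": bool(int(flags, 16) & 0x01),
    }


def outgoing_headers(incoming_headers):
    """Continue the caller's trace if there is one, otherwise start a new trace.

    Assumes incoming_headers is a dict with lowercase keys.
    """
    ctx = parse_traceparent(incoming_headers.get("traceparent", ""))
    if ctx is None:
        return {"traceparent": new_traceparent()}
    # Same trace ID, new span ID: the downstream service sees this span as its parent.
    flags = "01" if ctx["sampled"] else "00"
    return {"traceparent": f"00-{ctx['trace_id']}-{secrets.token_hex(8)}-{flags}"}
```

The trace ID stays constant across every hop, and only the span ID changes. That is what lets a backend stitch the spans from different services into one trace.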
Traces in Practice
- Span Attributes: Enrich spans with business context such as user ID, tenant, and feature flag status.
- Sampling: In production, you rarely need to trace every request. Head-based sampling (e.g., keeping 10% of traces, decided when the trace starts) or tail-based sampling (keeping only errors and slow requests, decided once the trace is complete) reduces costs.
- Correlation: Link trace IDs with logs and metrics for complete visibility.
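The difference between the two sampling strategies can be sketched in a few lines of plain Python. This is an illustration of the idea, not the OTel SDK or Collector implementation. The span dictionaries with start_ms, end_ms, and error fields, the function names, and the 500 ms threshold are assumptions made for the example.

```python
def head_sample(trace_id_hex, ratio):
    """Head-based: decide at trace start, using only the trace ID.

    Deriving the decision from the ID (instead of a random draw) means every
    service that sees the same trace ID reaches the same decision.
    """
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must be between 0.0 and 1.0")
    bound = int(ratio * (1 << 64))
    # Compare the low 64 bits of the 128-bit trace ID against the bound.
    return int(trace_id_hex[-16:], 16) < bound


def tail_keep(spans, slow_ms=500.0):
    """Tail-based: decide after the trace is complete.

    Keeps the trace if any span reported an error or if the whole trace
    took at least slow_ms (hypothetical threshold).
    """
    has_error = any(span["error"] for span in spans)
    duration_ms = max(s["end_ms"] for s in spans) - min(s["start_ms"] for s in spans)
    return has_error or duration_ms >= slow_ms
```

Head-based sampling is cheap because unsampled traces are never recorded, but it cannot know in advance which requests will fail. Tail-based sampling sees the outcome, at the price of buffering all spans of a trace until the decision is made.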
Our Architecture
At devRocks, we use OTel with Grafana Tempo as the tracing backend. The OTel Collector runs as a DaemonSet on every node and forwards traces, metrics, and logs to their respective backends: a single pipeline for everything.
Questions About This Topic?
We are happy to advise you on the technologies and solutions described in this article.