Distributed Tracing
Track requests across microservices to debug latency and failures
Implement distributed tracing with Jaeger and Tempo to track requests across microservices and identify performance bottlenecks. Use when debugging microservices, analyzing request flows, or implementing observability for distributed systems.
Example
User Prompt
I have a user API that's responding slowly. Help me set up distributed tracing to find the bottleneck across my authentication service, user service, and database layers.
Agent Response
A complete Jaeger setup with instrumented services, showing the exact timing of each component in your request flow.
Quick Start (3 Steps)
Get up and running in minutes
Install
```shell
claude-code skill install distributed-tracing
```
Config
First Trigger
```shell
@distributed-tracing help
```
Commands
| Command | Description | Required Args |
|---|---|---|
| @distributed-tracing debug-service-latency | Identify which service in your request chain is causing slow response times | None |
| @distributed-tracing analyze-service-dependencies | Understand how your microservices communicate and depend on each other | None |
| @distributed-tracing implement-production-tracing | Set up comprehensive tracing infrastructure for a production environment | None |
Typical Use Cases
Debug Service Latency
Identify which service in your request chain is causing slow response times
Analyze Service Dependencies
Understand how your microservices communicate and depend on each other
Implement Production Tracing
Set up comprehensive tracing infrastructure for a production environment
Overview
Distributed Tracing
Implement distributed tracing with Jaeger and Tempo for request flow visibility across microservices.
Purpose
Track requests across distributed systems to understand latency, dependencies, and failure points.
When to Use
- Debug latency issues
- Understand service dependencies
- Identify bottlenecks
- Trace error propagation
- Analyze request paths
Distributed Tracing Concepts
Trace Structure
```
Trace (Request ID: abc123)
  ↓
Span (frontend) [100ms]
  ↓
Span (api-gateway) [80ms]
  ├→ Span (auth-service) [10ms]
  └→ Span (user-service) [60ms]
       └→ Span (database) [40ms]
```
Key Components
- Trace - End-to-end request journey
- Span - Single operation within a trace
- Context - Metadata propagated between services
- Tags - Key-value pairs for filtering
- Logs - Timestamped events within a span
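These components can be made concrete with a minimal, stdlib-only sketch of the trace/span model — an illustrative toy mirroring the diagram above, not a real tracing SDK:

```python
import time
import uuid

class Span:
    """A single timed operation; tags are key-value pairs for filtering."""
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent  # context link to the enclosing span
        self.tags = {}
        self.start = time.monotonic()
        self.end = None

    def finish(self):
        self.end = time.monotonic()

    @property
    def duration_ms(self):
        return (self.end - self.start) * 1000

class Trace:
    """An end-to-end request journey: a tree of spans sharing one trace id."""
    def __init__(self):
        self.trace_id = uuid.uuid4().hex  # 128-bit id, 32 hex chars
        self.spans = []

    def span(self, name, parent=None):
        s = Span(name, parent)
        self.spans.append(s)
        return s

# Model the request flow from the diagram above
t = Trace()
frontend = t.span("frontend")
gateway = t.span("api-gateway", parent=frontend)
auth = t.span("auth-service", parent=gateway)
auth.finish()
user = t.span("user-service", parent=gateway)
db = t.span("database", parent=user)
db.finish()
user.finish()
gateway.finish()
frontend.finish()
```

A real SDK adds the missing pieces on top of this shape: context propagation across processes, sampling, and export to a collector.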
Jaeger Setup
Kubernetes Deployment
```shell
# Deploy Jaeger Operator
kubectl create namespace observability
kubectl create -f https://github.com/jaegertracing/jaeger-operator/releases/download/v1.51.0/jaeger-operator.yaml -n observability

# Deploy Jaeger instance
kubectl apply -f - <<EOF
apiVersion: jaegertracing.io/v1
kind: Jaeger
metadata:
  name: jaeger
  namespace: observability
spec:
  strategy: production
  storage:
    type: elasticsearch
    options:
      es:
        server-urls: http://elasticsearch:9200
  ingress:
    enabled: true
EOF
```
Docker Compose
```yaml
version: "3.8"
services:
  jaeger:
    image: jaegertracing/all-in-one:latest
    ports:
      - "5775:5775/udp"
      - "6831:6831/udp"
      - "6832:6832/udp"
      - "5778:5778"
      - "16686:16686" # UI
      - "14268:14268" # Collector
      - "14250:14250" # gRPC
      - "9411:9411"   # Zipkin
    environment:
      - COLLECTOR_ZIPKIN_HOST_PORT=:9411
```
Reference: See references/jaeger-setup.md
Application Instrumentation
OpenTelemetry (Recommended)
Python (Flask)
```python
from opentelemetry import trace
from opentelemetry.exporter.jaeger.thrift import JaegerExporter
from opentelemetry.sdk.resources import SERVICE_NAME, Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.instrumentation.flask import FlaskInstrumentor
from flask import Flask

# Initialize tracer
resource = Resource(attributes={SERVICE_NAME: "my-service"})
provider = TracerProvider(resource=resource)
processor = BatchSpanProcessor(JaegerExporter(
    agent_host_name="jaeger",
    agent_port=6831,
))
provider.add_span_processor(processor)
trace.set_tracer_provider(provider)

# Instrument Flask
app = Flask(__name__)
FlaskInstrumentor().instrument_app(app)

@app.route('/api/users')
def get_users():
    tracer = trace.get_tracer(__name__)

    with tracer.start_as_current_span("get_users") as span:
        span.set_attribute("user.count", 100)
        # Business logic
        users = fetch_users_from_db()
        return {"users": users}

def fetch_users_from_db():
    tracer = trace.get_tracer(__name__)

    with tracer.start_as_current_span("database_query") as span:
        span.set_attribute("db.system", "postgresql")
        span.set_attribute("db.statement", "SELECT * FROM users")
        # Database query
        return query_database()
```
Node.js (Express)
```javascript
const { trace } = require("@opentelemetry/api");
const { Resource } = require("@opentelemetry/resources");
const { NodeTracerProvider } = require("@opentelemetry/sdk-trace-node");
const { JaegerExporter } = require("@opentelemetry/exporter-jaeger");
const { BatchSpanProcessor } = require("@opentelemetry/sdk-trace-base");
const { registerInstrumentations } = require("@opentelemetry/instrumentation");
const { HttpInstrumentation } = require("@opentelemetry/instrumentation-http");
const {
  ExpressInstrumentation,
} = require("@opentelemetry/instrumentation-express");

// Initialize tracer
const provider = new NodeTracerProvider({
  resource: new Resource({ "service.name": "my-service" }),
});

const exporter = new JaegerExporter({
  endpoint: "http://jaeger:14268/api/traces",
});

provider.addSpanProcessor(new BatchSpanProcessor(exporter));
provider.register();

// Instrument libraries
registerInstrumentations({
  instrumentations: [new HttpInstrumentation(), new ExpressInstrumentation()],
});

const express = require("express");
const app = express();

app.get("/api/users", async (req, res) => {
  const tracer = trace.getTracer("my-service");
  const span = tracer.startSpan("get_users");

  try {
    const users = await fetchUsers();
    span.setAttributes({ "user.count": users.length });
    res.json({ users });
  } finally {
    span.end();
  }
});
```
Go
```go
package main

import (
	"context"

	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/attribute"
	"go.opentelemetry.io/otel/exporters/jaeger"
	"go.opentelemetry.io/otel/sdk/resource"
	sdktrace "go.opentelemetry.io/otel/sdk/trace"
	semconv "go.opentelemetry.io/otel/semconv/v1.4.0"
)

func initTracer() (*sdktrace.TracerProvider, error) {
	exporter, err := jaeger.New(jaeger.WithCollectorEndpoint(
		jaeger.WithEndpoint("http://jaeger:14268/api/traces"),
	))
	if err != nil {
		return nil, err
	}

	tp := sdktrace.NewTracerProvider(
		sdktrace.WithBatcher(exporter),
		sdktrace.WithResource(resource.NewWithAttributes(
			semconv.SchemaURL,
			semconv.ServiceNameKey.String("my-service"),
		)),
	)

	otel.SetTracerProvider(tp)
	return tp, nil
}

func getUsers(ctx context.Context) ([]User, error) {
	tracer := otel.Tracer("my-service")
	ctx, span := tracer.Start(ctx, "get_users")
	defer span.End()

	span.SetAttributes(attribute.String("user.filter", "active"))

	users, err := fetchUsersFromDB(ctx)
	if err != nil {
		span.RecordError(err)
		return nil, err
	}

	span.SetAttributes(attribute.Int("user.count", len(users)))
	return users, nil
}
```
Reference: See references/instrumentation.md
Context Propagation
HTTP Headers
```
traceparent: 00-0af7651916cd43dd8448eb211c80319c-b7ad6b7169203331-01
tracestate: congo=t61rcWkgMzE
```
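The `traceparent` header packs four hyphen-separated fields defined by the W3C Trace Context spec. A small stdlib sketch of how those fields decompose (illustrative, not the OpenTelemetry propagator):

```python
def parse_traceparent(header):
    """Split a W3C traceparent header into its four fields."""
    version, trace_id, span_id, flags = header.split("-")
    return {
        "version": version,           # "00" is the current spec version
        "trace_id": trace_id,         # 16 bytes, hex-encoded (32 chars)
        "parent_span_id": span_id,    # 8 bytes, hex-encoded (16 chars)
        "sampled": int(flags, 16) & 0x01 == 1,  # trace-flags, bit 0
    }

ctx = parse_traceparent(
    "00-0af7651916cd43dd8448eb211c80319c-b7ad6b7169203331-01"
)
# ctx["sampled"] → True: upstream decided to record this trace
```

Because the sampling decision rides in the flags byte, downstream services can honor the upstream decision instead of re-sampling.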
Propagation in HTTP Requests
Python
```python
import requests
from opentelemetry.propagate import inject

headers = {}
inject(headers)  # Injects trace context

response = requests.get('http://downstream-service/api', headers=headers)
```
Node.js
```javascript
const axios = require("axios");
const { context, propagation } = require("@opentelemetry/api");

const headers = {};
propagation.inject(context.active(), headers);

axios.get("http://downstream-service/api", { headers });
```
Tempo Setup (Grafana)
Kubernetes Deployment
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: tempo-config
data:
  tempo.yaml: |
    server:
      http_listen_port: 3200

    distributor:
      receivers:
        jaeger:
          protocols:
            thrift_http:
            grpc:
        otlp:
          protocols:
            http:
            grpc:

    storage:
      trace:
        backend: s3
        s3:
          bucket: tempo-traces
          endpoint: s3.amazonaws.com

    querier:
      frontend_worker:
        frontend_address: tempo-query-frontend:9095
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: tempo
spec:
  replicas: 1
  selector:
    matchLabels:
      app: tempo
  template:
    metadata:
      labels:
        app: tempo
    spec:
      containers:
        - name: tempo
          image: grafana/tempo:latest
          args:
            - -config.file=/etc/tempo/tempo.yaml
          volumeMounts:
            - name: config
              mountPath: /etc/tempo
      volumes:
        - name: config
          configMap:
            name: tempo-config
```
Reference: See assets/jaeger-config.yaml.template
Sampling Strategies
Probabilistic Sampling
```yaml
# Sample 1% of traces
sampler:
  type: probabilistic
  param: 0.01
```
Rate Limiting Sampling
```yaml
# Sample max 100 traces per second
sampler:
  type: ratelimiting
  param: 100
```
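The ratelimiting sampler caps trace volume regardless of traffic level. Its core is a token bucket; here is a stdlib-only sketch of the idea (an illustration of the mechanism, not Jaeger's actual implementation):

```python
import time

class RateLimitingSampler:
    """Admit at most max_per_second traces via a simple token bucket."""
    def __init__(self, max_per_second):
        self.rate = float(max_per_second)
        self.tokens = float(max_per_second)  # start with a full bucket
        self.last = time.monotonic()

    def should_sample(self):
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at the bucket size
        self.tokens = min(self.rate, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False

sampler = RateLimitingSampler(max_per_second=100)
decisions = [sampler.should_sample() for _ in range(200)]
```

A burst of 200 back-to-back requests admits roughly the first 100 and drops the rest, which is the behavior probabilistic sampling cannot guarantee under traffic spikes.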
Adaptive Sampling
```python
from opentelemetry.sdk.trace.sampling import ParentBased, TraceIdRatioBased

# Sample based on trace ID (deterministic)
sampler = ParentBased(root=TraceIdRatioBased(0.01))
```
Trace Analysis
Finding Slow Requests
Jaeger Query:
```
service=my-service
duration > 1s
```
Finding Errors
Jaeger Query:
```
service=my-service
error=true
tags.http.status_code >= 500
```
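The same filters can be applied programmatically to exported trace data. Assuming trace records as plain dicts with `service`, `duration_ms`, and `tags` fields (an illustrative shape, not Jaeger's API response format):

```python
traces = [
    {"service": "my-service", "duration_ms": 1500, "tags": {"http.status_code": 200}},
    {"service": "my-service", "duration_ms": 200,  "tags": {"http.status_code": 503, "error": True}},
    {"service": "other",      "duration_ms": 3000, "tags": {"http.status_code": 200}},
]

# service=my-service AND duration > 1s
slow = [t for t in traces
        if t["service"] == "my-service" and t["duration_ms"] > 1000]

# service=my-service AND error=true AND http.status_code >= 500
errors = [t for t in traces
          if t["service"] == "my-service"
          and t["tags"].get("error")
          and t["tags"].get("http.status_code", 0) >= 500]
```

This pattern is useful in CI checks or cron jobs that pull traces from the query API and alert on regressions.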
Service Dependency Graph
Jaeger automatically generates service dependency graphs showing:
- Service relationships
- Request rates
- Error rates
- Average latencies
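The graph is derived from parent/child span pairs: whenever a child span's service differs from its parent's, that pair contributes an edge. A stdlib sketch over illustrative span records (hypothetical data, not Jaeger's internal format):

```python
from collections import Counter

# Illustrative span records: (span_id, parent_span_id, service)
spans = [
    ("a", None, "frontend"),
    ("b", "a",  "api-gateway"),
    ("c", "b",  "auth-service"),
    ("d", "b",  "user-service"),
    ("e", "d",  "database"),
]

# Resolve each span id to the service that emitted it
by_id = {span_id: service for span_id, _, service in spans}

# Count caller -> callee edges between distinct services
edges = Counter(
    (by_id[parent], service)
    for _, parent, service in spans
    if parent is not None and by_id[parent] != service
)
```

Summing edge counts over a time window gives the request rates shown on the graph; dividing error-tagged edges by totals gives the error rates.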
Best Practices
- Sample appropriately (1-10% in production)
- Add meaningful tags (user_id, request_id)
- Propagate context across all service boundaries
- Log exceptions in spans
- Use consistent naming for operations
- Monitor tracing overhead (<1% CPU impact)
- Set up alerts for trace errors
- Implement distributed context (baggage)
- Use span events for important milestones
- Document instrumentation standards
Integration with Logging
Correlated Logs
```python
import logging
from opentelemetry import trace

logger = logging.getLogger(__name__)

def process_request():
    span = trace.get_current_span()
    trace_id = span.get_span_context().trace_id

    logger.info(
        "Processing request",
        extra={"trace_id": format(trace_id, '032x')}
    )
```
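Passing `extra={...}` at every call site gets tedious; a `logging.Filter` can stamp the trace id on every record instead. A stdlib sketch, using a contextvar to stand in for the active span context (in real code you would read `trace.get_current_span()` as above):

```python
import contextvars
import logging

# Stand-in for the OpenTelemetry context; set per-request in middleware
current_trace_id = contextvars.ContextVar("trace_id", default="0" * 32)

class TraceIdFilter(logging.Filter):
    """Stamp every record with the active trace id."""
    def filter(self, record):
        record.trace_id = current_trace_id.get()
        return True  # never suppress the record

handler = logging.StreamHandler()
handler.setFormatter(logging.Formatter("%(trace_id)s %(message)s"))
logger = logging.getLogger("traced")
logger.addHandler(handler)
logger.addFilter(TraceIdFilter())
logger.setLevel(logging.INFO)

current_trace_id.set("0af7651916cd43dd8448eb211c80319c")
logger.info("Processing request")  # log line is prefixed with the trace id
```

With the trace id on every log line, a log aggregator can pivot from any log entry straight to the corresponding trace in Jaeger.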
Troubleshooting
No traces appearing:
- Check collector endpoint
- Verify network connectivity
- Check sampling configuration
- Review application logs
High latency overhead:
- Reduce sampling rate
- Use batch span processor
- Check exporter configuration
Reference Files
- references/jaeger-setup.md - Jaeger installation
- references/instrumentation.md - Instrumentation patterns
- assets/jaeger-config.yaml.template - Jaeger configuration
Related Skills
- prometheus-configuration - For metrics
- grafana-dashboards - For visualization
- slo-implementation - For latency SLOs
Information
- Author: wshobson
- Updated: 2026-01-30
- Category: debugging
Related Skills
Error Tracking
Add Sentry v8 error tracking and performance monitoring to your project services. Use this skill …