If you're trying to quickly get up to speed with distributed tracing and want to try it out in a Kubernetes environment, this post will help you set up the architectural pieces and see tracing in action.
Architecture
We will run a Jaeger collector back end that collects all the traces. It could run outside Kubernetes too, as long as its ports are reachable from within the Kubernetes pods. Workloads generating traces will be simulated using pods running otel-cli. Each Kubernetes node will also run an OTel agent. The pods send their traces to the agent on the local node, which in turn forwards them to the Jaeger collector.
Deployment
Deploy a recent version (>=1.35) of the Jaeger all-in-one collector for the back end, on a machine that is reachable from all the Kubernetes clusters that will be producing traces. The version is important because we want Jaeger to accept the OTLP payloads that OTel libraries and agents emit. By default, it uses in-memory storage for traces, so they are not persistent.
docker run --name jaeger -e COLLECTOR_OTLP_ENABLED=true -p 16686:16686 -p 4317:4317 -p 4318:4318 jaegertracing/all-in-one:1.35
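To confirm the collector came up with OTLP ingestion enabled, check the container logs; the OTLP receivers should be listening on ports 4317 (gRPC) and 4318 (HTTP):

docker logs jaeger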
On the Kubernetes clusters where you want to run applications that generate traces, deploy an OTel agent DaemonSet using the following manifest.
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: otel-agent-conf
  labels:
    app: opentelemetry
    component: otel-agent-conf
data:
  otel-agent-config: |
    receivers:
      otlp:
        protocols:
          grpc:
          http:
    exporters:
      otlp:
        endpoint: "192.168.219.1:4317"
        tls:
          insecure: true
        sending_queue:
          num_consumers: 4
          queue_size: 100
        retry_on_failure:
          enabled: true
    processors:
      batch:
      memory_limiter:
        # 80% of maximum memory up to 2G
        limit_mib: 400
        # 25% of limit up to 2G
        spike_limit_mib: 100
        check_interval: 5s
    extensions:
      zpages: {}
      memory_ballast:
        # Memory Ballast size should be max 1/3 to 1/2 of memory.
        size_mib: 165
    service:
      extensions: [zpages, memory_ballast]
      pipelines:
        traces:
          receivers: [otlp]
          processors: [memory_limiter, batch]
          exporters: [otlp]
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: otel-agent
  labels:
    app: opentelemetry
    component: otel-agent
spec:
  selector:
    matchLabels:
      app: opentelemetry
      component: otel-agent
  template:
    metadata:
      labels:
        app: opentelemetry
        component: otel-agent
    spec:
      containers:
      - command:
          - "/otelcol"
          - "--config=/conf/otel-agent-config.yaml"
        image: otel/opentelemetry-collector:0.75.0
        name: otel-agent
        resources:
          limits:
            cpu: 500m
            memory: 500Mi
          requests:
            cpu: 100m
            memory: 100Mi
        ports:
        - containerPort: 55679 # ZPages endpoint.
        - containerPort: 4317  # Default OpenTelemetry receiver port.
          hostPort: 4317
        - containerPort: 8888  # Metrics.
        volumeMounts:
        - name: otel-agent-config-vol
          mountPath: /conf
      volumes:
        - configMap:
            name: otel-agent-conf
            items:
              - key: otel-agent-config
                path: otel-agent-config.yaml
          name: otel-agent-config-vol
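Assuming you saved the manifest above as otel-agent.yaml, apply it and verify that an agent pod is running on each node:

kubectl apply -f otel-agent.yaml
kubectl get pods -l component=otel-agent -o wide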
In the above, 192.168.219.1 is the address where the Jaeger all-in-one collector is running in my setup; replace it with yours.
Finally, deploy your application that produces traces using the OTel libraries, and configure it to send the traces to the local node IP on port 4317; this delivers them to the OTel agent DaemonSet. A minimal Go sample using the OTel SDK is sketched below; if you'd rather test trace generation with a CLI tool first, skip to the next section.
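The following is an illustrative sketch, not a production setup. It assumes the agent's gRPC port is reachable through a NODE_IP environment variable (which you can inject via the Kubernetes downward API, as shown in the pod manifest in the next section); the service and span names are placeholders.

// main.go - a minimal sketch of emitting a single trace with the OTel Go SDK.
package main

import (
    "context"
    "log"
    "os"
    "time"

    "go.opentelemetry.io/otel"
    "go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc"
    "go.opentelemetry.io/otel/sdk/resource"
    sdktrace "go.opentelemetry.io/otel/sdk/trace"
    semconv "go.opentelemetry.io/otel/semconv/v1.17.0"
)

func main() {
    ctx := context.Background()

    // Export OTLP over gRPC to the agent on the local node (insecure, matching
    // the agent configuration above).
    exporter, err := otlptracegrpc.New(ctx,
        otlptracegrpc.WithEndpoint(os.Getenv("NODE_IP")+":4317"),
        otlptracegrpc.WithInsecure(),
    )
    if err != nil {
        log.Fatalf("creating OTLP exporter: %v", err)
    }

    // The service.name resource attribute is what shows up as the service in Jaeger.
    tp := sdktrace.NewTracerProvider(
        sdktrace.WithBatcher(exporter),
        sdktrace.WithResource(resource.NewWithAttributes(
            semconv.SchemaURL,
            semconv.ServiceNameKey.String("my-service"),
        )),
    )
    defer func() { _ = tp.Shutdown(ctx) }() // flushes any buffered spans on exit
    otel.SetTracerProvider(tp)

    // Emit a single span so something shows up in the Jaeger UI.
    _, span := otel.Tracer("example").Start(ctx, "hello-trace")
    time.Sleep(100 * time.Millisecond) // stand-in for real work
    span.End()
}

In a real service you would create spans around your request handlers and propagate the context, but this is enough to verify the path from pod to agent to Jaeger.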
Trying it out
Install the otel-cli in your Go build environment:
go install github.com/equinix-labs/otel-cli@latest
This puts the otel-cli binary under your $GOPATH/bin. Put this binary inside a container image, and create a pod from that image that periodically runs the following commands, where <IP> is the IP of the node the pod is scheduled on (a sketch of such a pod follows below):
$ export OTEL_EXPORTER_OTLP_ENDPOINT=<IP>:4317
$ otel-cli exec --service my-service --name "curl google" curl https://google.com
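As a rough sketch of what such a pod could look like, the manifest below assumes a hypothetical image named otel-cli-demo:latest that contains the otel-cli binary, a shell, and curl; the node IP is injected via the Kubernetes downward API so the endpoint always points at the agent running on the same node:

apiVersion: v1
kind: Pod
metadata:
  name: otel-cli-demo
spec:
  containers:
  - name: otel-cli-demo
    image: otel-cli-demo:latest   # hypothetical image with otel-cli, sh and curl
    env:
    - name: NODE_IP               # IP of the node this pod is scheduled on
      valueFrom:
        fieldRef:
          fieldPath: status.hostIP
    - name: OTEL_EXPORTER_OTLP_ENDPOINT
      value: "$(NODE_IP):4317"    # the OTel agent's hostPort on the local node
    command: ["/bin/sh", "-c"]
    args:
    - |
      while true; do
        otel-cli exec --service my-service --name "curl google" curl https://google.com
        sleep 30
      done

The 30-second sleep is arbitrary; each iteration produces one trace that should show up in Jaeger.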
If you point your browser to <IP>:16686, where <IP> is the address of the machine running the Jaeger all-in-one collector, you should see the Jaeger UI and be able to look up the traces generated by your service.
Read more!