OpenTelemetry GenAI Semantic Conventions

> OpenTelemetry's GenAI SIG (launched April 2024) defines the standard schema for agent telemetry. Span names, attributes, and content-capture rules converge across vendors so agent traces mean the same thing in Datadog, Grafana, Jaeger, and Honeycomb.

Type: Learn + Build

Languages: Python (stdlib)

Prerequisites: Phase 14 · 13 (LangGraph), Phase 14 · 24 (Observability Platforms)

Time: ~60 minutes

Learning Objectives

The Problem

Every vendor invents their own span names. Ops teams end up building per-framework dashboards. OpenTelemetry's GenAI SIG fixes this by defining one standard the whole ecosystem targets.

The Concept

Span categories

  1. Model / client spans. Cover raw LLM calls. Emitted by provider SDKs (Anthropic, OpenAI, Bedrock) and framework model adapters.
  2. Agent spans. create_agent (when the agent is constructed) and invoke_agent (when it runs).
  3. Tool spans. One per tool invocation; connected to the agent span by parent-child relation.

Agent span naming

- CLIENT — for remote agent services (OpenAI Assistants API, Bedrock Agents).

- INTERNAL — for in-process agent frameworks (LangChain, CrewAI, local ReAct).

Key attributes

Technology-specific conventions exist for Anthropic, Azure AI Inference, AWS Bedrock, OpenAI.

Content capture

The default rule: instrumentations SHOULD NOT capture inputs/outputs by default. Capture is opt-in via:

Recommended production pattern: store content externally (S3, your log store), record references on spans (pointer IDs, not prose). This is the Lesson 27 content-poisoning defense wired into observability.

Stability

Most conventions are experimental as of March 2026. Opt in to the stable preview with:

OTEL_SEMCONV_STABILITY_OPT_IN=gen_ai_latest_experimental

Datadog v1.37+ maps GenAI attributes natively into its LLM Observability schema. Other backends (Grafana, Honeycomb, Jaeger) support the raw attributes.

Where this pattern goes wrong

Build It

code/main.py implements a stdlib span emitter matching GenAI conventions:

Run it:

python3 code/main.py

Output: a span tree with all required GenAI attributes, and an "external store" showing the opt-in content references.

Use It

Ship It

outputs/skill-otel-genai.md wires OTel GenAI spans into an existing agent with content-capture defaults and external-reference storage.

Exercises

  1. Instrument your Lesson 01 ReAct loop with invoke_agent (INTERNAL) + per-tool spans. Send to a Jaeger instance.
  2. Add content capture in "references only" mode: prompts to SQLite, span attributes carry only row IDs.
  3. Read the spec for gen_ai.data_source.id. Wire it into your Lesson 09 Mem0 search.
  4. Set OTEL_SEMCONV_STABILITY_OPT_IN=gen_ai_latest_experimental and verify your attributes don't get renamed by the collector.
  5. Build a dashboard: "which tool errors correlate with which models" from GenAI attributes alone.

Key Terms

Term What people say What it actually means
GenAI SIG "OpenTelemetry GenAI group" OTel working group defining the schema
invoke_agent "Agent span" Name of the span representing an agent run
CLIENT span "Remote call" Span for a call to a remote agent service
INTERNAL span "In-process" Span for an in-process agent run
gen_ai.provider.name "Provider" anthropic / openai / aws.bedrock / google.vertex
gen_ai.data_source.id "RAG source" Which corpus/store a retrieval hit
Content capture "Prompt logging" Opt-in capture of messages; store externally in prod
Stability opt-in "Preview mode" Env var to pin experimental conventions

Further Reading