This file provides context for AI coding agents (OpenAI Codex, GitHub Copilot, Cursor, Windsurf, Claude).
Splunk OpenTelemetry Python Contrib - GenAI instrumentation packages for Python providing AI/ML observability through OpenTelemetry.
Status: Alpha preview - breaking changes expected.
Core Purpose of Splunk Distro for OpenTelemetry (SDOT): Provides utility libraries that separate instrumentation capture from telemetry emission, so that:
- Instrumentation authors create neutral GenAI data objects once
- Pluggable emitters produce different telemetry flavors (spans, metrics, events)
- Evaluations (LLM-as-a-judge) run asynchronously through the same pipeline
- Configuration is environment-variable driven
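
The separation described above can be pictured with a small, self-contained sketch. The names used here (`Invocation`, `SpanLikeEmitter`, `MetricLikeEmitter`, `fan_out`) are hypothetical stand-ins for illustration only, not the package's actual classes; the real types and emitters live in `util/opentelemetry-util-genai`.

```python
from dataclasses import dataclass, field


@dataclass
class Invocation:
    """Hypothetical neutral data object, created once by the instrumentation."""
    model: str
    input_tokens: int = 0
    output_tokens: int = 0


@dataclass
class SpanLikeEmitter:
    """Hypothetical emitter that records a span-style name."""
    records: list = field(default_factory=list)

    def on_end(self, obj: Invocation) -> None:
        self.records.append(f"span:chat {obj.model}")


@dataclass
class MetricLikeEmitter:
    """Hypothetical emitter that records a token total."""
    records: list = field(default_factory=list)

    def on_end(self, obj: Invocation) -> None:
        self.records.append(obj.input_tokens + obj.output_tokens)


def fan_out(obj: Invocation, emitters: list) -> None:
    """Hand the same neutral object to every configured emitter."""
    for emitter in emitters:
        emitter.on_end(obj)


spans, metrics = SpanLikeEmitter(), MetricLikeEmitter()
invocation = Invocation(model="example-model", input_tokens=12, output_tokens=30)
fan_out(invocation, [spans, metrics])
```

The instrumentation only builds `invocation`; which telemetry flavors come out depends entirely on the emitters passed to `fan_out`.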
├── util/ # Core utility packages
│ ├── opentelemetry-util-genai/ # Core: TelemetryHandler, emitters, types
│ ├── opentelemetry-util-genai-evals/ # Async evaluation manager & registry
│ ├── opentelemetry-util-genai-evals-deepeval/ # Deepeval metrics integration
│ ├── opentelemetry-util-genai-emitters-splunk/ # Splunk-specific emitters
│ └── opentelemetry-util-genai-traceloop-translator/ # Traceloop span translation
│
├── instrumentation-genai/ # Framework instrumentations
│ ├── opentelemetry-instrumentation-langchain/ # LangChain/LangGraph
│ ├── opentelemetry-instrumentation-crewai/ # CrewAI
│ ├── opentelemetry-instrumentation-openai-v2/ # OpenAI SDK
│ ├── opentelemetry-instrumentation-openai-agents-v2/ # OpenAI Agents
│ ├── opentelemetry-instrumentation-llamaindex/ # LlamaIndex
│ └── opentelemetry-instrumentation-aidefense/ # AI Defense
# Setup virtual environment (macOS)
python -m venv .venv && source .venv/bin/activate
# Install dev dependencies
pip install -r dev-requirements.txt
pip install -r dev-genai-requirements.txt
# Install a package in editable mode
pip install -e ./util/opentelemetry-util-genai
pip install -e "./instrumentation-genai/opentelemetry-instrumentation-langchain[instruments,test]"
# Run tests for a specific package
pytest ./util/opentelemetry-util-genai/tests/ -v
pytest ./instrumentation-genai/opentelemetry-instrumentation-langchain/tests/ -v
# Run linting (REQUIRED before commits)
make lint
# Or manually:
ruff check --fix . && ruff format .

- Lint (`ci-lint.yaml`): `ruff check .` and `ruff format --check .`
- Tests (`ci-main.yaml`): pytest across Python 3.10-3.13 on Linux/macOS/Windows
| File | Purpose |
|---|---|
| `types.py` | Core dataclasses: GenAI, LLMInvocation, AgentInvocation, Workflow, ToolCall, EvaluationResult |
| `handler.py` | TelemetryHandler - lifecycle facade (start_llm, stop_llm, fail_llm, etc.) |
| `interfaces.py` | EmitterProtocol, CompletionCallback, Evaluator protocols |
| `emitters/composite.py` | CompositeEmitter - chains emitters by category |
| `emitters/span.py` | Semantic convention span emitter |
| `emitters/metrics.py` | Metrics emitter (duration, tokens) |
| `attributes.py` | Semantic attribute extraction |
class EmitterProtocol(Protocol):
    def on_start(self, obj: GenAI) -> None: ...
    def on_end(self, obj: GenAI) -> None: ...
    def on_error(self, error: Exception, obj: GenAI) -> None: ...
    def on_evaluation_results(self, results: list[EvaluationResult], obj: GenAI | None) -> None: ...

[project.entry-points.opentelemetry_util_genai_emitters]
my_emitter = "my_package:load_emitters"

[project.entry-points.opentelemetry_util_genai_evaluators]
my_evaluator = "my_package:register"

| Variable | Purpose | Default |
|---|---|---|
| `OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT` | Capture message content | false |
| `OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT_MODE` | SPAN, EVENT, SPAN_AND_EVENT | SPAN |
| `OTEL_INSTRUMENTATION_GENAI_EMITTERS` | Emitter selection | span |
| `OTEL_INSTRUMENTATION_GENAI_EVALS_EVALUATORS` | Evaluator configuration | "default value" |
| `OTEL_INSTRUMENTATION_GENAI_EVALUATION_SAMPLE_RATE` | Evaluation sampling (0.0-1.0) | 1.0 |
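
As a rough illustration of environment-variable driven configuration, the sketch below reads four of the variables listed above. The variable names and defaults come from that list; the function name `read_genai_config`, the returned dictionary keys, and the parsing rules (only `"true"` counts as truthy, unknown modes fall back to `SPAN`, the sample rate is clamped to 0.0-1.0) are assumptions and may not match how the library actually parses these values.

```python
import os

_PREFIX = "OTEL_INSTRUMENTATION_GENAI_"
_VALID_MODES = {"SPAN", "EVENT", "SPAN_AND_EVENT"}


def read_genai_config(env=None) -> dict:
    """Hypothetical helper: read GenAI settings from a mapping (default: os.environ)."""
    env = os.environ if env is None else env

    # Assumption: only the literal string "true" (case-insensitive) enables capture.
    capture = env.get(_PREFIX + "CAPTURE_MESSAGE_CONTENT", "false").strip().lower() == "true"

    # Assumption: unrecognised modes fall back to the documented default.
    mode = env.get(_PREFIX + "CAPTURE_MESSAGE_CONTENT_MODE", "SPAN").strip().upper()
    if mode not in _VALID_MODES:
        mode = "SPAN"

    emitters = env.get(_PREFIX + "EMITTERS", "span").strip()

    # Assumption: unparsable rates fall back to 1.0, and values are clamped to [0.0, 1.0].
    try:
        rate = float(env.get(_PREFIX + "EVALUATION_SAMPLE_RATE", "1.0"))
    except ValueError:
        rate = 1.0
    rate = min(1.0, max(0.0, rate))

    return {
        "capture_content": capture,
        "content_mode": mode,
        "emitters": emitters,
        "sample_rate": rate,
    }


defaults = read_genai_config({})
custom = read_genai_config({
    "OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT": "true",
    "OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT_MODE": "span_and_event",
    "OTEL_INSTRUMENTATION_GENAI_EVALUATION_SAMPLE_RATE": "2.5",
})
```

Passing a plain dictionary instead of `os.environ` keeps the helper easy to exercise in tests without touching the process environment.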
- Python: 3.10+ required, type hints for all public APIs
- Linting: Ruff (config in `pyproject.toml`)
- Docstrings: Google style
- Dataclasses: Use for structured data (see `types.py`)
- Protocols: Use `typing.Protocol` for interfaces
- Async: Use `async`/`await` for I/O operations
- Tests live in the `tests/` subdirectory of each package and require that package to be installed locally in development mode, e.g. `pip install -e ./util/opentelemetry-util-genai`
- Use pytest with fixtures
- Mock external services (LLM providers, external APIs, etc.). Do not mock instrumented frameworks unless absolutely necessary.
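
The mocking guideline above can be illustrated with a minimal sketch. `TinyFramework` and its `complete` client method are hypothetical stand-ins for a real instrumented framework and provider SDK; the point is only that the framework code runs for real while the external provider call is replaced by a mock.

```python
from unittest.mock import Mock


class TinyFramework:
    """Hypothetical stand-in for an instrumented framework; it is NOT mocked."""

    def __init__(self, provider_client):
        self._client = provider_client

    def ask(self, question: str) -> str:
        prompt = f"Answer briefly: {question}"
        return self._client.complete(prompt).strip()


def test_framework_with_mocked_provider():
    # Only the external LLM provider is mocked.
    provider = Mock()
    provider.complete.return_value = "  4 \n"

    framework = TinyFramework(provider)

    # The framework's own prompt building and post-processing run unmodified.
    assert framework.ask("What is 2 + 2?") == "4"
    provider.complete.assert_called_once_with("Answer briefly: What is 2 + 2?")
```

In a real test suite the mocked provider would typically be supplied by a pytest fixture so several tests can share it.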
# Run with coverage
pytest ./util/opentelemetry-util-genai/tests/ -v --cov=opentelemetry.util.genai

For detailed information, see these files in the repository:
| Document | Description |
|---|---|
| README.md | Core concepts, emitter architecture, evaluation system |
| README.packages.architecture.md | Package architecture, interfaces, lifecycle diagrams |
| CONTRIBUTING.md | Contribution guidelines, PR process |
| DEVELOPMENT.md | Detailed development setup for macOS |
| docs/semconv-reference.md | Complete GenAI semantic conventions reference (SDOT vs upstream) |
To add a new instrumentation:
- Create `instrumentation-genai/opentelemetry-instrumentation-{name}/`
- Copy structure from an existing instrumentation (e.g., langchain)
- Implement demo apps for the target instrumented framework, e.g. `instrumentation-genai/opentelemetry-instrumentation-langchain/examples/sre_incident_copilot`
- Implement callback handler/wrappers for the instrumented functions
- Don't create telemetry in the instrumentation library; use `util/opentelemetry-util-genai` APIs to create GenAI types and to manage their lifecycle
- Validate functionality with a real app
- Add tests with mocked LLM provider calls (test all the way through the framework, don't mock the framework)
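
The wrapper step above can be sketched as follows. The method names `start_llm`, `stop_llm`, and `fail_llm` are the ones listed for `handler.py`; everything else (`FakeInvocation`, `FakeHandler`, `wrap_completion`, and the exact method signatures) is a hypothetical stand-in so the example runs without the real package installed. Check the actual `TelemetryHandler` API before copying this shape.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class FakeInvocation:
    """Hypothetical stand-in for a GenAI type such as LLMInvocation."""
    model: str
    output: Optional[str] = None


@dataclass
class FakeHandler:
    """Hypothetical stand-in for TelemetryHandler; signatures are assumed."""
    calls: list = field(default_factory=list)

    def start_llm(self, invocation: FakeInvocation) -> None:
        self.calls.append("start")

    def stop_llm(self, invocation: FakeInvocation) -> None:
        self.calls.append("stop")

    def fail_llm(self, invocation: FakeInvocation, error: Exception) -> None:
        self.calls.append("fail")


def wrap_completion(handler, call_provider: Callable[[str], str], model: str, prompt: str) -> str:
    """Wrap an instrumented call: build the data object, delegate lifecycle to the handler."""
    invocation = FakeInvocation(model=model)
    handler.start_llm(invocation)
    try:
        invocation.output = call_provider(prompt)
    except Exception as exc:
        # The wrapper never creates spans or metrics itself; it only reports the failure.
        handler.fail_llm(invocation, exc)
        raise
    handler.stop_llm(invocation)
    return invocation.output


handler = FakeHandler()
result = wrap_completion(handler, lambda p: p.upper(), "example-model", "hi")
```

Note that the wrapper contains no telemetry code of its own, which is the intent of the "don't create telemetry in the instrumentation library" rule.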
To add a new emitter:
- Create a class implementing `EmitterProtocol`
- Add a `load_emitters()` function returning `list[EmitterSpec]`
- Register via entry point in `pyproject.toml`
- Document configuration environment variables
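
A minimal sketch of the emitter steps above. The four method names follow the `EmitterProtocol` definition shown earlier in this file; `EmitterSpecSketch` is a hypothetical stand-in for the real `EmitterSpec`, whose fields may differ, and `RecordingEmitter` is an illustrative class that only records what it receives.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Optional


@dataclass
class EmitterSpecSketch:
    """Hypothetical stand-in for EmitterSpec; the real fields may differ."""
    name: str
    category: str
    factory: Callable[[], Any]


class RecordingEmitter:
    """Implements the four EmitterProtocol methods by recording each call."""

    def __init__(self) -> None:
        self.seen: List[tuple] = []

    def on_start(self, obj: Any) -> None:
        self.seen.append(("start", obj))

    def on_end(self, obj: Any) -> None:
        self.seen.append(("end", obj))

    def on_error(self, error: Exception, obj: Any) -> None:
        self.seen.append(("error", obj))

    def on_evaluation_results(self, results: list, obj: Optional[Any] = None) -> None:
        self.seen.append(("evaluation", obj))


def load_emitters() -> List[EmitterSpecSketch]:
    """Entry-point target, e.g. my_emitter = "my_package:load_emitters"."""
    return [EmitterSpecSketch(name="recording", category="span", factory=RecordingEmitter)]


specs = load_emitters()
emitter = specs[0].factory()
emitter.on_start("invocation-1")
emitter.on_end("invocation-1")
```

Returning a factory rather than an instance lets the caller decide when, and how many times, the emitter is constructed; whether the real `EmitterSpec` works this way is an assumption.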
To add a new evaluator:
- Create a class implementing the `Evaluator` protocol in a new package (see `util/opentelemetry-util-genai-evals-deepeval` for an example)
- Add a `register()` function for entry point discovery
- Handle evaluation results via `TelemetryHandler.evaluation_results()`
VS Code launch configurations are in .vscode/launch.json for debugging examples.
{
"version": "0.2.0",
"configurations": [
{
"name": "LangChain Demo S0 Current (span_metric_event)",
"type": "debugpy",
"request": "launch",
"program": "${workspaceFolder}/instrumentation-genai/opentelemetry-instrumentation-langchain/examples/manual/main.py",
"python": "${workspaceFolder}/.venv/bin/python",
"console": "integratedTerminal",
"justMyCode": false,
"env": {
"OTEL_EXPORTER_OTLP_ENDPOINT": "http://localhost:4317",
"OTEL_EXPORTER_OTLP_PROTOCOL": "grpc",
"OTEL_EXPORTER_OTLP_METRICS_TEMPORALITY_PREFERENCE": "DELTA",
"OTEL_LOGS_EXPORTER": "otlp",
"OTEL_PYTHON_LOGGING_AUTO_INSTRUMENTATION_ENABLED": "true",
"OTEL_RESOURCE_ATTRIBUTES": "deployment.environment=o11y-for-ai-dev-sergey,scenario=current",
"OTEL_SERVICE_NAME": "demo-app-util-langchain-dev-sergey",
"OTEL_INSTRUMENTATION_LANGCHAIN_CAPTURE_MESSAGE_CONTENT": "true",
"OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT": "true",
"OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT_MODE": "SPAN_AND_EVENT",
"OTEL_INSTRUMENTATION_GENAI_EMITTERS": "span_metric_event",
"OTEL_INSTRUMENTATION_GENAI_EVALS_EVALUATORS": "deepeval(LLMInvocation(bias,toxicity),AgentInvocation(hallucination))"
}
}
]
}

- Always start work by creating `.local/plan-<jira-ticket>-<feature-name>.md` (skip jira if not present)
  - start the plan by researching industry standards (other instrumentation libraries, AI agentic frameworks, AI-focused observability products)
  - plan sections: project description, research, open questions that need human clarification, implementation plan for AI Coder.
- After implementing a change, maintain a PR description for the change in `.local/pr-<jira-ticket>-<feature-name>.md` (skip jira if not present). Highlight the changes and identify major parts requiring special attention. Update the file on code changes.
- Maintain repo-level `README.md`, `README.packages.architecture.md`, and package-level README to reflect the changes. Ensure user documentation is added when needed.
- Stage changes; do not commit without human review.
- Always run `make lint` after changes
- Tests are required for all new features
- Update `CHANGELOG.md` to document changes
- Update `version.py`; only update the minor version unless backward-incompatible changes are introduced. Always communicate with a human when breaking changes are introduced.
When making changes to semantic conventions in this repo (e.g., adding/removing/renaming attributes in attributes.py, types.py, or semconv_ai.py, changing span/metric/event schemas, or updating emitter behavior), you must update docs/semconv-reference.md to reflect those changes. This includes:
- New or removed attributes in the attribute reference tables
- Changes to span types, metrics, or events
- Changes to propagation behavior (agent name, conversation ID, association properties)
- Updates to the SDOT vs upstream comparison
- Changes to legacy attribute mappings
- Check existing patterns in similar packages before implementing
- Ask the user to provide a file that exports `OPENAI_API_KEY` or other credentials needed to run the demo app for validation
- Environment variables should have the `OTEL_INSTRUMENTATION_GENAI_` prefix
- Use semantic conventions from the OpenTelemetry GenAI spec where applicable
- Always keep backward compatibility in mind when refactoring existing code
- Follow DRY and SOLID software engineering principles when readability and maintainability are not compromised.
- Never add `Co-Authored-By` trailers (or any similar attribution trailers such as `Co-authored-by`, `Signed-off-by`, etc.) that reference AI assistants, bots, or automated tools in commit messages. This includes but is not limited to Claude, Copilot, ChatGPT, or any `noreply@` addresses from AI vendors.
- Commit messages should only attribute human contributors.
- Do not try to mock libraries if imports fail in the current environment. If in doubt, clearly communicate the problem to the user.
- Always refer to `README.md` and `README.packages.architecture.md` for context.
- Avoid creating multiple copies of example apps when parameters can be introduced to reuse the same demo app.
- Do not use `try/except` guards around test imports. Test dependencies must be declared in `pyproject.toml` under `[project.optional-dependencies] test` and are expected to be present at test time. If they are missing, the test should fail with an `ImportError`, not silently skip.
- Do not use `sys.path` hacks (e.g., inserting `src/` into `sys.path`) in test files. Packages should be installed in editable mode (`pip install -e .`) and imports should work without path manipulation.
- Do not use `pytest.mark.skipif` to guard against missing test dependencies. If a dependency is required for a test, add it to the test requirements and import it directly.