This file provides context for AI coding agents (OpenAI Codex, GitHub Copilot, Cursor, Windsurf, Claude).
Splunk OpenTelemetry Python Contrib - GenAI instrumentation packages for Python providing AI/ML observability through OpenTelemetry.
Status: Alpha preview - breaking changes expected.
Core purpose of the Splunk Distribution of OpenTelemetry (SDOT): provides utility libraries that separate instrumentation capture from telemetry emission so that:
- Instrumentation authors create neutral GenAI data objects once
- Pluggable emitters produce different telemetry flavors (spans, metrics, events)
- Evaluations (LLM-as-a-judge) run asynchronously through the same pipeline
- Configuration is environment-variable driven
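The capture/emission split above can be sketched with a toy pipeline. The class names echo the real ones in `opentelemetry-util-genai` (`LLMInvocation`, `CompositeEmitter`), but the fields and return values here are simplified illustrations, not the library's actual API:

```python
from dataclasses import dataclass

@dataclass
class LLMInvocation:
    """Neutral GenAI data object, created once by the instrumentation."""
    model: str
    input_tokens: int = 0
    output_tokens: int = 0

class SpanEmitter:
    """One pluggable telemetry flavor: renders the object as a 'span'."""
    def on_end(self, obj):
        return {"span": f"chat {obj.model}"}

class MetricsEmitter:
    """Another flavor: derives metrics from the same neutral object."""
    def on_end(self, obj):
        return {"tokens": obj.input_tokens + obj.output_tokens}

class CompositeEmitter:
    """Fans a single neutral object out to every configured emitter."""
    def __init__(self, emitters):
        self.emitters = emitters

    def on_end(self, obj):
        return [e.on_end(obj) for e in self.emitters]

pipeline = CompositeEmitter([SpanEmitter(), MetricsEmitter()])
results = pipeline.on_end(LLMInvocation(model="gpt-x", input_tokens=10, output_tokens=5))
```

The instrumentation author only builds the `LLMInvocation`; which telemetry flavors come out the other end is decided by the emitter configuration.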
```
├── util/                                               # Core utility packages
│   ├── opentelemetry-util-genai/                       # Core: TelemetryHandler, emitters, types
│   ├── opentelemetry-util-genai-evals/                 # Async evaluation manager & registry
│   ├── opentelemetry-util-genai-evals-deepeval/        # Deepeval metrics integration
│   ├── opentelemetry-util-genai-emitters-splunk/       # Splunk-specific emitters
│   └── opentelemetry-util-genai-traceloop-translator/  # Traceloop span translation
│
├── instrumentation-genai/                              # Framework instrumentations
│   ├── opentelemetry-instrumentation-langchain/        # LangChain/LangGraph
│   ├── opentelemetry-instrumentation-crewai/           # CrewAI
│   ├── opentelemetry-instrumentation-openai-v2/        # OpenAI SDK
│   ├── opentelemetry-instrumentation-openai-agents-v2/ # OpenAI Agents
│   ├── opentelemetry-instrumentation-llamaindex/       # LlamaIndex
│   └── opentelemetry-instrumentation-aidefense/        # AI Defense
```
```shell
# Set up a virtual environment (macOS)
python -m venv .venv && source .venv/bin/activate

# Install dev dependencies
pip install -r dev-requirements.txt
pip install -r dev-genai-requirements.txt

# Install a package in editable mode
pip install -e ./util/opentelemetry-util-genai
pip install -e "./instrumentation-genai/opentelemetry-instrumentation-langchain[instruments,test]"

# Run tests for a specific package
pytest ./util/opentelemetry-util-genai/tests/ -v
pytest ./instrumentation-genai/opentelemetry-instrumentation-langchain/tests/ -v

# Run linting (REQUIRED before commits)
make lint
# Or manually:
ruff check --fix . && ruff format .
```

CI workflows:
- Lint (`ci-lint.yaml`): `ruff check .` and `ruff format --check .`
- Tests (`ci-main.yaml`): pytest across Python 3.10-3.13 on Linux/macOS/Windows
| File | Purpose |
|---|---|
| `types.py` | Core dataclasses: `GenAI`, `LLMInvocation`, `AgentInvocation`, `Workflow`, `ToolCall`, `EvaluationResult` |
| `handler.py` | `TelemetryHandler` - lifecycle facade (`start_llm`, `stop_llm`, `fail_llm`, etc.) |
| `interfaces.py` | `EmitterProtocol`, `CompletionCallback`, `Evaluator` protocols |
| `emitters/composite.py` | `CompositeEmitter` - chains emitters by category |
| `emitters/span.py` | Semantic-convention span emitter |
| `emitters/metrics.py` | Metrics emitter (duration, tokens) |
| `attributes.py` | Semantic attribute extraction |
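The lifecycle facade pattern behind `handler.py` can be sketched with a self-contained toy. The method names (`start_llm`, `stop_llm`, `fail_llm`) come from the table above, but the dict-based invocation and the `RecordingEmitter` are stand-ins invented for this illustration:

```python
import time

class TelemetryHandler:
    """Toy lifecycle facade: starts, stops, and fails invocations via one emitter."""
    def __init__(self, emitter):
        self.emitter = emitter

    def start_llm(self, invocation):
        invocation["start"] = time.monotonic()
        self.emitter.on_start(invocation)
        return invocation

    def stop_llm(self, invocation):
        invocation["duration"] = time.monotonic() - invocation["start"]
        self.emitter.on_end(invocation)

    def fail_llm(self, invocation, error):
        invocation["error"] = repr(error)
        self.emitter.on_error(error, invocation)

class RecordingEmitter:
    """Test double that records which lifecycle hooks fired."""
    def __init__(self):
        self.events = []
    def on_start(self, obj):
        self.events.append("start")
    def on_end(self, obj):
        self.events.append("end")
    def on_error(self, error, obj):
        self.events.append("error")

emitter = RecordingEmitter()
handler = TelemetryHandler(emitter)
inv = handler.start_llm({"model": "gpt-x"})
handler.stop_llm(inv)
```

Instrumentations call only the facade; everything downstream (spans, metrics, events) is the emitters' concern.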
```python
class EmitterProtocol(Protocol):
    def on_start(self, obj: GenAI) -> None: ...
    def on_end(self, obj: GenAI) -> None: ...
    def on_error(self, error: Exception, obj: GenAI) -> None: ...
    def on_evaluation_results(self, results: list[EvaluationResult], obj: GenAI | None) -> None: ...
```

Emitters and evaluators are registered via entry points in `pyproject.toml`:

```toml
[project.entry-points.opentelemetry_util_genai_emitters]
my_emitter = "my_package:load_emitters"

[project.entry-points.opentelemetry_util_genai_evaluators]
my_evaluator = "my_package:register"
```

| Variable | Purpose | Default |
|---|---|---|
| `OTEL_INSTRUMENTATION_GENAI_ENABLE` | Enable/disable instrumentation | `true` |
| `OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT` | Capture message content | `false` |
| `OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT_MODE` | `SPAN`, `EVENT`, or `SPAN_AND_EVENT` | `SPAN` |
| `OTEL_INSTRUMENTATION_GENAI_EMITTERS` | Emitter selection | `span` |
| `OTEL_INSTRUMENTATION_GENAI_EVALS_EVALUATORS` | Evaluator configuration | "default value" |
| `OTEL_INSTRUMENTATION_GENAI_EVALUATION_SAMPLE_RATE` | Evaluation sampling (0.0-1.0) | `1.0` |
- Python: 3.10+ required, type hints for all public APIs
- Linting: Ruff (config in `pyproject.toml`)
- Docstrings: Google style
- Dataclasses: use for structured data (see `types.py`)
- Protocols: use `typing.Protocol` for interfaces
- Async: use `async`/`await` for I/O operations
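The dataclass-plus-Protocol convention can be shown in one small sketch. `ToolCall` appears in `types.py`, but its fields here and the `Sink` interface are hypothetical names for illustration only:

```python
from dataclasses import dataclass
from typing import Protocol

@dataclass
class ToolCall:
    """Structured data as a dataclass, documented Google-style.

    Attributes:
        name: Tool name invoked by the model (illustrative field).
        arguments: JSON-encoded arguments string (illustrative field).
    """
    name: str
    arguments: str

class Sink(Protocol):
    """Interfaces use typing.Protocol rather than ABCs; Sink is a made-up example."""
    def write(self, call: ToolCall) -> None: ...

class ListSink:
    """Any class with a matching write() satisfies Sink structurally."""
    def __init__(self) -> None:
        self.items: list[ToolCall] = []
    def write(self, call: ToolCall) -> None:
        self.items.append(call)

sink = ListSink()
sink.write(ToolCall(name="search", arguments="{}"))
```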
- Tests live in the `tests/` subdirectory of each package and require that package installed locally in editable mode, e.g. `pip install -e ./util/opentelemetry-util-genai`
- Use pytest with fixtures
- Mock external services (LLM providers, external APIs, etc.). Do not mock instrumented frameworks unless absolutely necessary.
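The "mock the provider, not the framework" rule can be illustrated with a minimal pytest-style test. `call_llm` and the `complete()` response shape are hypothetical stand-ins for code that talks to a real LLM provider:

```python
from unittest import mock

def call_llm(client, prompt: str) -> str:
    """Code under test: delegates to an external LLM provider (hypothetical shape)."""
    return client.complete(prompt)["text"]

def test_call_llm_with_mocked_provider():
    # Mock only the external service boundary; the code path above it runs for real.
    client = mock.Mock()
    client.complete.return_value = {"text": "hello"}
    assert call_llm(client, "hi") == "hello"
    client.complete.assert_called_once_with("hi")

test_call_llm_with_mocked_provider()
```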
```shell
# Run with coverage
pytest ./util/opentelemetry-util-genai/tests/ -v --cov=opentelemetry.util.genai
```

For detailed information, see these files in the repository:
| Document | Description |
|---|---|
| README.md | Core concepts, emitter architecture, evaluation system |
| README.packages.architecture.md | Package architecture, interfaces, lifecycle diagrams |
| CONTRIBUTING.md | Contribution guidelines, PR process |
| DEVELOPMENT.md | Detailed development setup for macOS |
Adding a new instrumentation:
- Create `instrumentation-genai/opentelemetry-instrumentation-{name}/`
- Copy the structure from an existing instrumentation (e.g., langchain)
- Implement a demo app for the target framework, e.g. `instrumentation-genai/opentelemetry-instrumentation-langchain/examples/sre_incident_copilot`
- Implement callback handlers/wrappers for the instrumented functions
- Do not create telemetry in the instrumentation library; use the `util/opentelemetry-util-genai` APIs to create GenAI types and manage their lifecycle
- Validate functionality with a real app
- Add tests with mocked LLM provider calls (test all the way through the framework; do not mock the framework)
Adding a new emitter:
- Create a class implementing `EmitterProtocol`
- Add a `load_emitters()` function returning `list[EmitterSpec]`
- Register it via an entry point in `pyproject.toml`
- Document its configuration environment variables
Adding a new evaluator:
- Create a class implementing the `Evaluator` protocol in a new package (see `util/opentelemetry-util-genai-evals-deepeval` for an example)
- Add a `register()` function for entry-point discovery
- Handle evaluation results via `TelemetryHandler.evaluation_results()`
VS Code launch configurations are in .vscode/launch.json for debugging examples.
```json
{
    "version": "0.2.0",
    "configurations": [
        {
            "name": "LangChain Demo S0 Current (span_metric_event)",
            "type": "debugpy",
            "request": "launch",
            "program": "${workspaceFolder}/instrumentation-genai/opentelemetry-instrumentation-langchain/examples/manual/main.py",
            "python": "${workspaceFolder}/.venv/bin/python",
            "console": "integratedTerminal",
            "justMyCode": false,
            "env": {
                "OTEL_EXPORTER_OTLP_ENDPOINT": "http://localhost:4317",
                "OTEL_EXPORTER_OTLP_PROTOCOL": "grpc",
                "OTEL_EXPORTER_OTLP_METRICS_TEMPORALITY_PREFERENCE": "DELTA",
                "OTEL_LOGS_EXPORTER": "otlp",
                "OTEL_PYTHON_LOGGING_AUTO_INSTRUMENTATION_ENABLED": "true",
                "OTEL_RESOURCE_ATTRIBUTES": "deployment.environment=o11y-for-ai-dev-sergey,scenario=current",
                "OTEL_SERVICE_NAME": "demo-app-util-langchain-dev-sergey",
                "OTEL_INSTRUMENTATION_LANGCHAIN_CAPTURE_MESSAGE_CONTENT": "true",
                "OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT": "true",
                "OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT_MODE": "SPAN_AND_EVENT",
                "OTEL_INSTRUMENTATION_GENAI_EMITTERS": "span_metric_event",
                "OTEL_INSTRUMENTATION_GENAI_EVALS_EVALUATORS": "deepeval(LLMInvocation(bias,toxicity),AgentInvocation(hallucination))"
            }
        }
    ]
}
```

General guidelines:
- Always run `make lint` before committing
- Ask the user to provide a file exporting `OPENAI_API_KEY` or other credentials needed to run the demo app for validation
- Check existing patterns in similar packages before implementing
- Tests are required for all new features
- Environment variables should use the `OTEL_INSTRUMENTATION_GENAI_` prefix
- Use semantic conventions from the OpenTelemetry GenAI spec where applicable
- Update README.md and README.packages.architecture.md if needed
- Update CHANGELOG.md to document changes
- Update version.py; bump only the minor version unless backward-incompatible changes are introduced. Always communicate with a human when breaking changes are introduced.
- Always keep backward compatibility in mind when refactoring existing code
- Follow DRY and SOLID software engineering principles when readability and maintainability are not compromised
- Do not try to mock libraries whose import fails in the current environment. If in doubt, clearly communicate the problem to the user.
- Always refer to README.md and README.packages.architecture.md
- Avoid creating multiple copies of example apps when you can introduce parameters and reuse the same demo app