Agent Instructions

This file provides context for AI coding agents (OpenAI Codex, GitHub Copilot, Cursor, Windsurf, Claude).

Project Overview

Splunk OpenTelemetry Python Contrib - GenAI instrumentation packages for Python providing AI/ML observability through OpenTelemetry.

Status: Alpha preview - breaking changes expected.

Core purpose of the Splunk Distro for OpenTelemetry (SDOT): it provides utility libraries that separate instrumentation capture from telemetry emission, so that:

Instrumentation authors create neutral GenAI data objects once
Pluggable emitters produce different telemetry flavors (spans, metrics, events)
Evaluations (LLM-as-a-judge) run asynchronously through the same pipeline
Configuration is environment-variable driven

Repository Structure

├── util/                                    # Core utility packages
│   ├── opentelemetry-util-genai/           # Core: TelemetryHandler, emitters, types
│   ├── opentelemetry-util-genai-evals/     # Async evaluation manager & registry
│   ├── opentelemetry-util-genai-evals-deepeval/  # Deepeval metrics integration
│   ├── opentelemetry-util-genai-emitters-splunk/ # Splunk-specific emitters
│   └── opentelemetry-util-genai-traceloop-translator/ # Traceloop span translation
│
├── instrumentation-genai/                   # Framework instrumentations
│   ├── opentelemetry-instrumentation-langchain/   # LangChain/LangGraph
│   ├── opentelemetry-instrumentation-crewai/      # CrewAI
│   ├── opentelemetry-instrumentation-openai-v2/   # OpenAI SDK
│   ├── opentelemetry-instrumentation-openai-agents-v2/ # OpenAI Agents
│   ├── opentelemetry-instrumentation-llamaindex/  # LlamaIndex
│   └── opentelemetry-instrumentation-aidefense/   # AI Defense

Quick Reference

Development Commands

# Setup virtual environment (macOS)
python -m venv .venv && source .venv/bin/activate

# Install dev dependencies
pip install -r dev-requirements.txt
pip install -r dev-genai-requirements.txt

# Install a package in editable mode
pip install -e ./util/opentelemetry-util-genai
pip install -e "./instrumentation-genai/opentelemetry-instrumentation-langchain[instruments,test]"

# Run tests for a specific package
pytest ./util/opentelemetry-util-genai/tests/ -v
pytest ./instrumentation-genai/opentelemetry-instrumentation-langchain/tests/ -v

# Run linting (REQUIRED before commits)
make lint
# Or manually:
ruff check --fix . && ruff format .

CI Checks (must pass)

Lint (ci-lint.yaml): ruff check . and ruff format --check .
Tests (ci-main.yaml): pytest across Python 3.10-3.13 on Linux/macOS/Windows

Code Patterns

Key Types (util/opentelemetry-util-genai/src/opentelemetry/util/genai/)

File	Purpose
`types.py`	Core dataclasses: `GenAI`, `LLMInvocation`, `AgentInvocation`, `Workflow`, `ToolCall`, `EvaluationResult`
`handler.py`	`TelemetryHandler` - lifecycle facade (`start_llm`, `stop_llm`, `fail_llm`, etc.)
`interfaces.py`	`EmitterProtocol`, `CompletionCallback`, `Evaluator` protocols
`emitters/composite.py`	`CompositeEmitter` - chains emitters by category
`emitters/span.py`	Semantic convention span emitter
`emitters/metrics.py`	Metrics emitter (duration, tokens)
`attributes.py`	Semantic attribute extraction

Emitter Protocol

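from typing import Protocol

# GenAI and EvaluationResult are defined in types.py (see Key Types above)
from opentelemetry.util.genai.types import EvaluationResult, GenAI


class EmitterProtocol(Protocol):
    def on_start(self, obj: GenAI) -> None: ...
    def on_end(self, obj: GenAI) -> None: ...
    def on_error(self, error: Exception, obj: GenAI) -> None: ...
    def on_evaluation_results(self, results: list[EvaluationResult], obj: GenAI | None) -> None: ...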
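
As a rough illustration of the protocol above, the sketch below shows a class that satisfies it structurally by recording lifecycle calls in memory. To stay runnable on its own it uses stand-in `GenAI` and `EvaluationResult` dataclasses whose fields are hypothetical; the real types in types.py will differ. `RecordingEmitter` is also a made-up name, not part of the library.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class GenAI:
    """Stand-in for the real base type in types.py (fields are hypothetical)."""

    name: str = "invocation"


@dataclass
class EvaluationResult:
    """Stand-in for the real EvaluationResult (fields are hypothetical)."""

    metric_name: str = "bias"
    score: float = 0.0


@dataclass
class RecordingEmitter:
    """Hypothetical emitter that records calls instead of exporting telemetry."""

    calls: list[tuple[str, str]] = field(default_factory=list)

    def on_start(self, obj: GenAI) -> None:
        self.calls.append(("start", obj.name))

    def on_end(self, obj: GenAI) -> None:
        self.calls.append(("end", obj.name))

    def on_error(self, error: Exception, obj: GenAI) -> None:
        # Record the exception type so a failure is distinguishable from a normal end.
        self.calls.append((f"error:{type(error).__name__}", obj.name))

    def on_evaluation_results(
        self, results: list[EvaluationResult], obj: GenAI | None
    ) -> None:
        # obj may be None, so fall back to a placeholder label.
        target = obj.name if obj is not None else "<none>"
        for result in results:
            self.calls.append((f"eval:{result.metric_name}", target))
```

Because `typing.Protocol` uses structural typing, a class like this does not need to inherit from `EmitterProtocol`; having the four methods with matching signatures is enough for a type checker.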

Plugin Registration (pyproject.toml)

[project.entry-points.opentelemetry_util_genai_emitters]
my_emitter = "my_package:load_emitters"

[project.entry-points.opentelemetry_util_genai_evaluators]
my_evaluator = "my_package:register"

Environment Variables

Variable	Purpose	Default
`OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT`	Capture message content	`false`
`OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT_MODE`	`SPAN`, `EVENT`, `SPAN_AND_EVENT`	`SPAN`
`OTEL_INSTRUMENTATION_GENAI_EMITTERS`	Emitter selection	`span`
`OTEL_INSTRUMENTATION_GENAI_EVALS_EVALUATORS`	Evaluator configuration	"default value"
`OTEL_INSTRUMENTATION_GENAI_EVALUATION_SAMPLE_RATE`	Evaluation sampling (0.0-1.0)	`1.0`
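
The sketch below shows one way these variables and their documented defaults could be read. It is illustrative only: `GenAISettings` and `read_settings` are hypothetical names, and the parsing rules (only the string "true" enables capture, out-of-range or unparsable sample rates fall back or are clamped) are assumptions, not the library's actual behavior.

```python
from __future__ import annotations

import os
from dataclasses import dataclass
from typing import Mapping

PREFIX = "OTEL_INSTRUMENTATION_GENAI_"


@dataclass(frozen=True)
class GenAISettings:
    """Hypothetical container for a few of the variables in the table above."""

    capture_content: bool
    content_mode: str
    emitters: str
    sample_rate: float


def read_settings(env: Mapping[str, str] | None = None) -> GenAISettings:
    """Read settings from `env` (defaults to os.environ), applying table defaults."""
    env = os.environ if env is None else env
    try:
        rate = float(env.get(PREFIX + "EVALUATION_SAMPLE_RATE", "1.0"))
    except ValueError:
        rate = 1.0  # assumption: fall back to the default on bad input
    capture = env.get(PREFIX + "CAPTURE_MESSAGE_CONTENT", "false")
    return GenAISettings(
        # assumption: only "true" (case-insensitive) turns capture on
        capture_content=capture.strip().lower() == "true",
        content_mode=env.get(PREFIX + "CAPTURE_MESSAGE_CONTENT_MODE", "SPAN").strip().upper(),
        emitters=env.get(PREFIX + "EMITTERS", "span").strip(),
        # assumption: clamp into the documented 0.0-1.0 range
        sample_rate=min(1.0, max(0.0, rate)),
    )
```

Passing an explicit mapping instead of relying on `os.environ` keeps a helper like this easy to test without mutating the process environment.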

Code Style

Python: 3.10+ required, type hints for all public APIs
Linting: Ruff (config in pyproject.toml)
Docstrings: Google style
Dataclasses: Use for structured data (see types.py)
Protocols: Use typing.Protocol for interfaces
Async: Use async/await for I/O operations

Testing

Tests live in the tests/ subdirectory of each package and require that package to be installed locally in development mode, e.g. pip install -e ./util/opentelemetry-util-genai
Use pytest with fixtures
Mock external services (LLM providers, external APIs, etc.). Do not mock instrumented frameworks unless absolutely necessary.

# Run with coverage
pytest ./util/opentelemetry-util-genai/tests/ -v --cov=opentelemetry.util.genai
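
The "mock the provider, not the framework" guideline can be sketched as follows. Everything here is a stand-in: `summarize` plays the role of real framework code, and `client.complete` is a hypothetical provider call, not an actual SDK method. The point is only that the code under test runs for real while the network boundary is replaced.

```python
from unittest.mock import Mock


def summarize(client, text: str) -> str:
    """Hypothetical stand-in for framework code that calls an LLM provider."""
    reply = client.complete(prompt=f"Summarize: {text}")
    return reply.strip()


def test_summarize_uses_provider_reply():
    # Only the provider client is mocked; summarize() itself runs unmodified.
    client = Mock()
    client.complete.return_value = "  short summary \n"

    assert summarize(client, "long text") == "short summary"
    client.complete.assert_called_once_with(prompt="Summarize: long text")
```

In a real instrumentation test the same idea applies one level down: the framework (LangChain, CrewAI, and so on) executes normally and only the outbound provider request is faked.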

Documentation References

For detailed information, see these files in the repository:

Document	Description
README.md	Core concepts, emitter architecture, evaluation system
README.packages.architecture.md	Package architecture, interfaces, lifecycle diagrams
CONTRIBUTING.md	Contribution guidelines, PR process
DEVELOPMENT.md	Detailed development setup for macOS
docs/semconv-reference.md	Complete GenAI semantic conventions reference (SDOT vs upstream)

Common Tasks

Adding a New Instrumentation

Create instrumentation-genai/opentelemetry-instrumentation-{name}/
Copy structure from existing instrumentation (e.g., langchain)
Implement a demo app for the target framework, e.g. instrumentation-genai/opentelemetry-instrumentation-langchain/examples/sre_incident_copilot
Implement callback handler/wrappers for the instrumented functions
Don't create telemetry in the instrumentation library; use the util/opentelemetry-util-genai APIs to create GenAI types and manage their lifecycle
Validate functionality with a real app
Add tests with mocked LLM provider calls (test all the way through the framework, don't mock the framework)

Adding a New Emitter

Create class implementing EmitterProtocol
Add load_emitters() function returning list[EmitterSpec]
Register via entry point in pyproject.toml
Document configuration environment variables

Adding a New Evaluator

Create class implementing Evaluator protocol in a new package (see util/opentelemetry-util-genai-evals-deepeval for example)
Add register() function for entry point discovery
Handle evaluation results via TelemetryHandler.evaluation_results()

Debugging

VS Code launch configurations are in .vscode/launch.json for debugging examples.

{
  "version": "0.2.0",
  "configurations": [
    {
      "name": "LangChain Demo S0 Current (span_metric_event)",
      "type": "debugpy",
      "request": "launch",
      "program": "${workspaceFolder}/instrumentation-genai/opentelemetry-instrumentation-langchain/examples/manual/main.py",
      "python": "${workspaceFolder}/.venv/bin/python",
      "console": "integratedTerminal",
      "justMyCode": false,
      "env": {
        "OTEL_EXPORTER_OTLP_ENDPOINT": "http://localhost:4317",
        "OTEL_EXPORTER_OTLP_PROTOCOL": "grpc",
        "OTEL_EXPORTER_OTLP_METRICS_TEMPORALITY_PREFERENCE": "DELTA",
        "OTEL_LOGS_EXPORTER": "otlp",
        "OTEL_PYTHON_LOGGING_AUTO_INSTRUMENTATION_ENABLED": "true",
        "OTEL_RESOURCE_ATTRIBUTES": "deployment.environment=o11y-for-ai-dev-sergey,scenario=current",
        "OTEL_SERVICE_NAME": "demo-app-util-langchain-dev-sergey",
        "OTEL_INSTRUMENTATION_LANGCHAIN_CAPTURE_MESSAGE_CONTENT": "true",
        "OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT": "true",
        "OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT_MODE": "SPAN_AND_EVENT",
        "OTEL_INSTRUMENTATION_GENAI_EMITTERS": "span_metric_event",
        "OTEL_INSTRUMENTATION_GENAI_EVALS_EVALUATORS": "deepeval(LLMInvocation(bias,toxicity),AgentInvocation(hallucination))"
      }
    }
  ]
}
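
The `OTEL_INSTRUMENTATION_GENAI_EVALS_EVALUATORS` value in the configuration above nests metric names inside invocation types inside an evaluator name. The sketch below parses that one shape to make the nesting explicit. The grammar is inferred from this single example only; `parse_evaluator_spec` is a hypothetical helper, and the real parser in the evals package may accept more forms.

```python
from __future__ import annotations

import re

# Matches one "TypeName(metric_a,metric_b)" group inside the outer parentheses.
_TYPE_GROUP = re.compile(r"(\w+)\(([^()]*)\)")


def parse_evaluator_spec(spec: str) -> tuple[str, dict[str, list[str]]]:
    """Split "evaluator(Type(m1,m2),Other(m3))" into (evaluator, {Type: [metrics]})."""
    outer = re.fullmatch(r"(\w+)\((.*)\)", spec.strip())
    if outer is None:
        raise ValueError(f"unrecognized evaluator spec: {spec!r}")
    evaluator, body = outer.groups()
    per_type = {
        type_name: [m.strip() for m in metrics.split(",") if m.strip()]
        for type_name, metrics in _TYPE_GROUP.findall(body)
    }
    return evaluator, per_type


if __name__ == "__main__":
    print(
        parse_evaluator_spec(
            "deepeval(LLMInvocation(bias,toxicity),AgentInvocation(hallucination))"
        )
    )
```

Read this way, the launch configuration asks the deepeval evaluator to run bias and toxicity on `LLMInvocation` objects and hallucination on `AgentInvocation` objects.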

Workflow for AI Agents

Always start work by creating .local/plan-<jira-ticket>-<feature-name>.md (omit the Jira ticket if there is none)
- Start the plan by researching industry standards (other instrumentation libraries, AI agentic frameworks, AI-focused observability products)
- Plan sections: project description, research, open questions that need human clarification, implementation plan for the AI coder
After implementing a change, maintain a PR description for it in .local/pr-<jira-ticket>-<feature-name>.md (omit the Jira ticket if there is none). Highlight the changes and identify the major parts that require special attention. Update the file whenever the code changes.
Maintain the repo-level README.md, README.packages.architecture.md, and the package-level README to reflect the changes. Ensure user documentation is added when needed.
Stage changes, do not commit without human review.
Always run make lint after changes
Tests are required for all new features
Update CHANGELOG.md to document changes
Update version.py; update only the minor version unless backward-incompatible changes are introduced. Always tell the human when breaking changes are introduced.

Semantic Convention Reference Maintenance

When making changes to semantic conventions in this repo (e.g., adding/removing/renaming attributes in attributes.py, types.py, or semconv_ai.py, changing span/metric/event schemas, or updating emitter behavior), you must update docs/semconv-reference.md to reflect those changes. This includes:

New or removed attributes in the attribute reference tables
Changes to span types, metrics, or events
Changes to propagation behavior (agent name, conversation ID, association properties)
Updates to the SDOT vs upstream comparison
Changes to legacy attribute mappings

Notes for AI Agents

Check existing patterns in similar packages before implementing
Ask the user to provide a file that exports OPENAI_API_KEY or other credentials needed to run the demo app for validation
Environment variables should have OTEL_INSTRUMENTATION_GENAI_ prefix
Use semantic conventions from OpenTelemetry GenAI spec where applicable
Always keep backward compatibility in mind when refactoring existing code
Follow DRY and SOLID software engineering principles as long as readability and maintainability are not compromised.

Git Commit Rules

Never add Co-Authored-By trailers (or any similar attribution trailers such as Co-authored-by, Signed-off-by, etc.) that reference AI assistants, bots, or automated tools in commit messages. This includes but is not limited to Claude, Copilot, ChatGPT, or any noreply@ addresses from AI vendors.
Commit messages should only attribute human contributors.

Common Pitfalls to Avoid

Do not try to mock libraries when their import fails in the current environment. If in doubt, clearly communicate the problem to the user.
Always refer to README.md and README.packages.architecture.md for context.
Avoid creating multiple copies of example apps when you can introduce parameters and reuse the same demo app.
Do not use try/except guards around test imports. Test dependencies must be declared in pyproject.toml [project.optional-dependencies] test and are expected to be present at test time. If they are missing, the test should fail with an ImportError, not silently skip.
Do not use sys.path hacks (e.g., inserting src/ into sys.path) in test files. Packages should be installed in editable mode (pip install -e .) and imports should work without path manipulation.
Do not use pytest.mark.skipif to guard against missing test dependencies. If a dependency is required for a test, add it to the test requirements and import it directly.
