Add conformance testing based on Weaver live check #14
lmolkova commented on May 14, 2026
- add helpers
- plumb into openai, anthropic (fails), langchain (partial coverage)
        },
    }
],
tool_choice={"type": "tool", "name": "get_weather"},
TODO (follow up, create issue): do multi-turn conversation
@pytest.mark.parametrize(
    "scenario",
    [
        pytest.param(InferenceScenario(), marks=_LEGACY_SYSTEM_SKIP),
TODO (follow up, create issue): bug in Anthropic
    marks=pytest.mark.skip(
        reason="openai-v2 embeddings emit legacy gen_ai.system in experimental mode"
    ),
),
TODO (follow up, create issue): bug
return target

cache_root.mkdir(parents=True, exist_ok=True)
url = (
Need to check out the repo to get the JSON schemas and run the custom genai_content_validation.rego policy.
cache_root.mkdir(parents=True, exist_ok=True)
url = (
    "https://github.com/open-telemetry/semantic-conventions/"
Not using the GenAI repo yet due to minor issues. TODO: create bug.
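A minimal sketch of the download-and-cache pattern visible in the diff context above (create the cache directory, build a URL, return the cached target if it already exists). The URL in the diff is truncated after the repository path, so the `archive/refs/tags/...` suffix below is only GitHub's generic tag-archive layout, used as an assumption. The function names, the `fetch` callable, and the cache directory naming are hypothetical and are not taken from `_setup_weaver.py`.

```python
from collections.abc import Callable
from pathlib import Path

# Repository prefix as it appears in the diff; everything after it is an assumption.
SEMCONV_REPO = "https://github.com/open-telemetry/semantic-conventions/"


def archive_url(version: str) -> str:
    """Build a tag-archive URL (hypothetical suffix, GitHub's generic layout)."""
    return f"{SEMCONV_REPO}archive/refs/tags/v{version}.tar.gz"


def ensure_registry(
    cache_root: Path,
    version: str,
    fetch: Callable[[str, Path], None],
) -> Path:
    """Return the cached registry directory, fetching it only on a cache miss.

    `fetch(url, target)` is a hypothetical callable that downloads and
    unpacks the archive into `target`; injecting it keeps the sketch
    free of network access.
    """
    cache_root.mkdir(parents=True, exist_ok=True)
    target = cache_root / f"semantic-conventions-{version}"
    if target.exists():
        # Cache hit: reuse the checkout from a previous run.
        return target
    fetch(archive_url(version), target)
    return target
```

Keying the cache directory on the pinned version means a version bump naturally triggers a fresh download, while repeated test runs against the same pin stay offline.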
Pull request overview
Adds GenAI semantic-conventions conformance testing driven by Weaver live-check, including shared test utilities, Rego advice policies, new per-instrumentation conformance scenarios, and CI/tox plumbing to run them in dedicated *-conformance environments.
Changes:
- Introduces conformance infrastructure in opentelemetry-test-util-genai (Weaver setup, weaver_live_check fixture, scenario runner).
- Adds Weaver advice policies for span-level invariants and JSON-schema validation of GenAI content attributes.
- Wires new conformance suites into OpenAI v2, Anthropic, and LangChain packages, plus tox/CI/jobs, renovate pins, and ignores for generated artifacts.
Reviewed changes
Copilot reviewed 39 out of 41 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| versions.env | Adds pinned Weaver + semantic-conventions versions for conformance tooling. |
| uv.lock | Updates lockfile to include new deps and git-sourced OpenTelemetry components. |
| util/opentelemetry-test-util-genai/src/opentelemetry/test_util_genai/vcr.py | Shared pytest-vcr utilities; removes cassette-name override. |
| util/opentelemetry-test-util-genai/src/opentelemetry/test_util_genai/fixtures.py | Documents fixtures and adds weaver_live_check fixture. |
| util/opentelemetry-test-util-genai/src/opentelemetry/test_util_genai/conformance.py | Adds scenario base class + run_conformance runner using Weaver live-check. |
| util/opentelemetry-test-util-genai/src/opentelemetry/test_util_genai/_setup_weaver.py | Downloads/presents pinned semconv registry + generates schema rego for Weaver policies. |
| util/opentelemetry-test-util-genai/pyproject.toml | Adds conformance-related runtime deps to the shared test util package. |
| tox.ini | Adds *-conformance envs and marker-based selection/deselection; installs conformance deps. |
| pytest.ini | Registers conformance marker. |
| pyproject.toml | Adds global uv git sources for OpenTelemetry components (incl. exporters/proto). |
| policies/genai_span_validation.rego | New Weaver policy enforcing GenAI span naming + expected attributes per operation. |
| policies/genai_content_validation.rego | New Weaver policy validating GenAI content attributes against semconv JSON schemas. |
| instrumentation/opentelemetry-instrumentation-openai-v2/tests/test_conformance.py | Adds OpenAI v2 conformance test driver. |
| instrumentation/opentelemetry-instrumentation-openai-v2/tests/conftest.py | Loads shared vcr plugin alongside shared fixtures. |
| instrumentation/opentelemetry-instrumentation-openai-v2/tests/conformance/__init__.py | New conformance package init. |
| instrumentation/opentelemetry-instrumentation-openai-v2/tests/conformance/inference.py | Adds OpenAI v2 inference conformance scenario. |
| instrumentation/opentelemetry-instrumentation-openai-v2/tests/conformance/embedding.py | Adds OpenAI v2 embeddings conformance scenario (currently skipped in driver). |
| instrumentation/opentelemetry-instrumentation-openai-v2/tests/conformance/tool_calling.py | Adds OpenAI v2 tool-calling conformance scenario. |
| instrumentation/opentelemetry-instrumentation-openai-v2/tests/cassettes/inference-conformance.yaml | New VCR cassette for inference conformance. |
| instrumentation/opentelemetry-instrumentation-openai-v2/tests/cassettes/embedding-conformance.yaml | New VCR cassette for embeddings conformance. |
| instrumentation/opentelemetry-instrumentation-openai-v2/tests/cassettes/tool_calling-conformance.yaml | New VCR cassette for tool-calling conformance. |
| instrumentation/opentelemetry-instrumentation-langchain/tests/test_conformance.py | Adds LangChain conformance test driver. |
| instrumentation/opentelemetry-instrumentation-langchain/tests/conformance/__init__.py | New conformance package init. |
| instrumentation/opentelemetry-instrumentation-langchain/tests/conformance/inference.py | Adds LangChain inference conformance scenario via ChatOpenAI. |
| instrumentation/opentelemetry-instrumentation-langchain/tests/cassettes/inference-conformance.yaml | New VCR cassette for LangChain conformance. |
| instrumentation/opentelemetry-instrumentation-anthropic/tests/test_conformance.py | Adds Anthropic conformance test driver (currently all scenarios skipped). |
| instrumentation/opentelemetry-instrumentation-anthropic/tests/conftest.py | Loads shared vcr plugin alongside shared fixtures. |
| instrumentation/opentelemetry-instrumentation-anthropic/tests/conformance/__init__.py | New conformance package init. |
| instrumentation/opentelemetry-instrumentation-anthropic/tests/conformance/inference.py | Adds Anthropic inference conformance scenario. |
| instrumentation/opentelemetry-instrumentation-anthropic/tests/conformance/tool_calling.py | Adds Anthropic tool-calling conformance scenario. |
| instrumentation/opentelemetry-instrumentation-anthropic/tests/cassettes/inference-conformance.yaml | New VCR cassette for Anthropic conformance. |
| instrumentation/opentelemetry-instrumentation-anthropic/tests/cassettes/tool_calling-conformance.yaml | New VCR cassette for Anthropic conformance. |
| instrumentation/opentelemetry-instrumentation-anthropic/pyproject.toml | Registers conformance marker for package tests. |
| dev-requirements-conformance.txt | Adds git-based OpenTelemetry pins for conformance envs. |
| AGENTS.md | Documents repo conventions for conformance tests (marker + tox env usage). |
| .gitignore | Ignores generated schema rego and weaver reports directory. |
| .github/workflows/test.yml | Adds conformance CI jobs and Weaver installation steps. |
| .github/workflows/generate_workflows_lib/src/generate_workflows_lib/test.yml.j2 | Adds conditional Weaver install step for *-conformance jobs. |
| .github/workflows/generate_workflows_lib/src/generate_workflows_lib/__init__.py | Marks *-conformance jobs as needs_weaver. |
| .github/renovate.json5 | Adds custom manager for versions.env pins. |
| .github/instructions/instrumentation.instructions.md | Documents conformance test expectations for instrumentation packages. |
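
The table lists versions.env as the place where the Weaver and semantic-conventions versions are pinned. As a rough illustration of how such a file could be consumed from Python, here is a minimal sketch that parses KEY=VALUE lines. The key names in the usage example (`WEAVER_VERSION`, `SEMCONV_VERSION`) and the version values are hypothetical; the actual file contents are not shown in this PR conversation.

```python
def parse_env_pins(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    pins: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"expected KEY=VALUE, got: {raw!r}")
        # Tolerate optional double quotes around the value.
        pins[key.strip()] = value.strip().strip('"')
    return pins


# Hypothetical file contents, for illustration only.
example = """
# tool pins
WEAVER_VERSION=0.0.1
SEMCONV_VERSION="0.0.2"
"""
pins = parse_env_pins(example)
```

Keeping the pins in one flat file lets a single source of truth feed both CI install steps and the test helpers, which is also what makes a Renovate custom manager practical.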
MikeGoldsmith left a comment
This looks cool, thanks @lmolkova.
Can you update CONTRIBUTING.md to cover how to manage these cassettes? I've always found HTTP replay to be a bit painful for contributors, especially when the cassettes require remote service setup.