diff --git a/.github/instructions/instrumentation-genai.instructions.md b/.github/instructions/instrumentation-genai.instructions.md new file mode 100644 index 0000000000..b62bea94b4 --- /dev/null +++ b/.github/instructions/instrumentation-genai.instructions.md @@ -0,0 +1,66 @@ +--- +applyTo: "instrumentation-genai/**" +--- + +Review rules for PRs touching `instrumentation-genai/**`. Flag violations with a link to the rule. + +These rules are additive to +[`instrumentation.instructions.md`](instrumentation.instructions.md), which applies to all +instrumentation packages. + +## 1. Scope of `instrumentation-genai/` + +Only for: + +- Generative AI inference providers, +- Agentic frameworks, +- Libraries directly supporting the above (e.g., MCP, GenAI protocols). + +Database clients (including vector DBs used outside a GenAI-specific client) and CLI libs belong +in `instrumentation/`, not here. + +## 2. GenAI component ownership + +See [`CONTRIBUTING.md#guideline-for-genai-instrumentations`](../../CONTRIBUTING.md#guideline-for-genai-instrumentations) +for GenAI-specific maintenance expectations on top of the general +[instrumentation checklist](../../CONTRIBUTING.md#guideline-for-instrumentations). + +## 3. Telemetry and configuration via `opentelemetry-util-genai` + +- Spans, logs, metrics, and events must go through `opentelemetry-util-genai`. Direct use of + `Tracer`, `Meter`, `Logger`, or event APIs is not allowed. +- Content capture, hooks, and other cross-cutting configuration are owned by the util. + Instrumentations must not introduce their own env vars, settings, or hook interfaces. +- Message content, prompts, and tool call arguments must only be set through the util's content + capture path — never as unconditional span/log attributes. +- Adding attributes to invocations produced by the util is fine. +- If a capability is missing in `opentelemetry-util-genai`, land it in the util first. + +## 4. GenAI semantic conventions + +- Attributes, spans, events, and metrics must match the + [GenAI semantic conventions](https://github.com/open-telemetry/semantic-conventions/tree/main/docs/gen-ai). +- `gen_ai.*` attribute names must come from + `opentelemetry.semconv._incubating.attributes.gen_ai_attributes`. +- For attributes with a well-known value set, use the generated enum from the same module + (e.g. `GenAiOutputTypeValues` for `gen_ai.output.type`) instead of string literals. + +## 5. Tests + +- Use recorded VCR cassettes for provider calls. No live-key-only tests; skipping on missing key + is not acceptable. +- Cover streaming and non-streaming variants when both exist. +- For error scenarios, at minimum include: provider error / endpoint unavailable, stream + interrupted by network, stream closed early by the caller. + +## 6. Examples + +New instrumentations must ship a minimal example under the package's `examples/`, with both a +`manual/` and a `zero-code/` (auto-instrumentation) variant. + +## 7. PR description + +- Cover which part of the GenAI semconv the change implements or follows (when applicable) and + how instrumentations should consume it. + +See also [AGENTS.md](../../AGENTS.md) for general repo rules. diff --git a/.github/instructions/instrumentation.instructions.md b/.github/instructions/instrumentation.instructions.md new file mode 100644 index 0000000000..360f1cc797 --- /dev/null +++ b/.github/instructions/instrumentation.instructions.md @@ -0,0 +1,52 @@ +--- +applyTo: "{instrumentation,instrumentation-genai}/**" +--- + +Review rules for PRs touching `instrumentation/**` and `instrumentation-genai/**`. Flag violations +with a link to the rule. + +## 0. Reviewer mindset + +Review as long-term maintainer. + +For new instrumentations, consult upstream library docs and judge: + +- Does the library already emit its own telemetry, making this instrumentation redundant? +- Is the library used widely enough to warrant a package in this repo? +- Does it avoid unbounded in-memory accumulation or other side-effects? + +For changes to existing instrumentations: prefer back-compat. Break users only for a real reason; +prefer opt-in or additive. Breaking changes need explicit justification in the PR. + +## 1. Component ownership & maintenance commitment + +- New instrumentations must add an entry under the correct folder in + [`component_owners.yml`](../component_owners.yml) in the same PR. Contributor must commit to + long-term maintenance. See + [Expectations from contributors](../../CONTRIBUTING.md#expectations-from-contributors) and the + general [instrumentation checklist](../../CONTRIBUTING.md#guideline-for-instrumentations). + +## 2. Semantic conventions + +- Attribute names must come from the semconv attribute modules, not hardcoded strings. Use the + module matching the namespace under `opentelemetry.semconv` (e.g. `server_attributes`, + `error_attributes`, `http_attributes`, `db_attributes`, …). +- For attributes with a well-known value set in semconv, use the generated enum from the same + modules instead of string literals. +- If a signal is not in semconv, wait until semconv lands. + +## 3. Exception handling + +- When catching exceptions from the underlying library to record telemetry, always re-raise the + original exception unmodified. +- Do not raise **new** exceptions in instrumentation/telemetry code. + +## 4. Tests + +- For every public API instrumented, cover sync/async variants when both exist. +- Cover happy path and error scenarios. +- Tests must verify exact attribute names **and value types**, checked against the semconv spec. +- Test against oldest and latest supported library versions via `tests/requirements.{oldest,latest}.txt` + and `{oldest,latest}` `tox.ini` factors. + +See also [AGENTS.md](../../AGENTS.md) for general repo rules. diff --git a/.github/instructions/util-genai.instructions.md b/.github/instructions/util-genai.instructions.md new file mode 100644 index 0000000000..5c64be974b --- /dev/null +++ b/.github/instructions/util-genai.instructions.md @@ -0,0 +1,86 @@ +--- +applyTo: "util/opentelemetry-util-genai/**" +--- + +Review rules for PRs touching `util/opentelemetry-util-genai/**`. Flag violations with a link to +the rule. + +## 0. Reviewer mindset + +Review as long-term maintainer. Every GenAI instrumentation in the repo depends on this package; +churn breaks them. Default to back-compat changes. Every public addition is a long-term +commitment — limit public API. + +## 1. Semconv first + +No code without semconv. If a signal, attribute, or operation is not in the +[GenAI semconv](https://github.com/open-telemetry/semantic-conventions/blob/main/docs/gen-ai/), +land the semconv change first. + +## 2. Semantic conventions + +- Names (attributes, operations, spans) must match semconv exactly. +- Use matching `opentelemetry.semconv` modules per namespace (`gen_ai`, `server`, `error`, …). + Do not hardcode name strings if a constant exists. +- Shared attributes must behave consistently across invocation types (same conditions, same + defaults). If semconv marks an attribute for multiple invocations, set it on all. + +## 3. API backwards compatibility + +- Do not remove or rename public objects. Deprecate first via a docstring note pointing to the + replacement (not `@deprecated` — unreliable). +- Private modules and module-private objects start with `_`. +- Default to internal (`_`-prefixed) unless instrumentations need it public. + +## 4. Invocation shape + +- `start_*()` factories must accept all sampling-relevant semconv attributes as parameters. + Attributes also marked required by semconv must be required parameters (no default value). +- `start_*()` factories must map 1:1 to distinct semconv operation types (inference, embeddings, + tool execution, agent invocation, workflow invocation). Names must match the operation + unambiguously — e.g., `create_agent` vs `invoke_agent` are distinct ops; `start_agent()` alone + is ambiguous. +- Each operation exposes both a factory (`start_inference(...)`) and a context-manager + (`inference(...)`) form. +- Never construct invocation types directly (`InferenceInvocation(...)`) — skips span creation, + silent no-ops. Always use `handler.start_*()` or the context manager. + +## 5. Exception handling + +- When catching exceptions from the underlying library to record telemetry, always re-raise the + original exception unmodified. +- Do not raise **new** exceptions in utils code. + +## 6. Performance + +Keep the hot path tight: + +- Avoid per-invocation allocations; do not accumulate state unboundedly. +- Skip content capture when content capture is disabled. +- Skip setting span attributes when the span is not recording. +- Still record attributes that feed metrics — metric recording is independent of span sampling. + +## 7. DRY + +Do not copy-paste logic across invocation types. Extract shared helpers. + +## 8. Tests + +- Every new operation type or attribute change needs tests asserting exact attribute names **and + value types**, checked against semconv. +- Cover success (`invocation.stop()`), failure (`invocation.fail(exc)`), and conditional + attribute logic. +- Prefer public API in tests. + +## 9. Documentation + +- Docstrings for invocation types and span/event helpers must link to the corresponding semconv + op. +- When adding/changing attributes, update the docstring with what is set and when. + +## 10. PR description + +- Cover which part of the GenAI semconv the change implements or follows (when applicable) and + how instrumentations should consume it. + +See also [AGENTS.md](../../AGENTS.md) for general repo rules. diff --git a/AGENTS.md b/AGENTS.md index 7467033607..09ab2575fb 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -14,10 +14,22 @@ Follow the PR scoping guidance in [CONTRIBUTING.md](CONTRIBUTING.md). Keep AI-as isolated to the requested change and never include unrelated cleanup or opportunistic improvements unless they are strictly necessary for correctness. +- One logical change per PR. Do not bundle multiple fixes, refactors, or features into the same + PR - split them up so each can be reviewed and reverted independently. +- Run the linter and the relevant tests locally and make sure they pass. See [Commands](#commands). + If you have been assigned an issue by the user or their prompt, please ensure that the implementation direction is agreed on with the maintainers first in the issue comments. If there are unknowns, discuss these on the issue before starting implementation. Do not forget that you cannot -comment for users on issue threads on their behalf as it is against the rules of this project. +comment for users on issue or pull request threads on their behalf as it is against the rules of this project. + +## PR description + +Follow the repo's [PR template](.github/pull_request_template.md) and fill applicable sections. +Keep description short and focus on what is being changed and any gaps or concerns. + +AI-generated analyses, long reports, or design dumps go in a relevant issue or a separate PR +comment - not in the PR description. ## Structure @@ -55,6 +67,35 @@ uv run tox -e typecheck - Do not add `type: ignore` comments. If a type error arises, solve it properly or write a follow-up plan to address it in another PR. - Whenever applicable, all code changes should have tests that actually validate the changes. +## Instrumentation rules + +Apply to packages under `instrumentation/` and `instrumentation-genai/`. + +### Exception handling + +- When catching exceptions from the underlying library to record telemetry, always re-raise the + original exception unmodified. +- Do not raise new exceptions in instrumentation/telemetry code. + +### Semantic conventions + +- Use the semconv attribute and metrics modules under `opentelemetry.semconv` — do not hardcode + attribute or metric name strings. +- For attributes with a well-known value set, use the generated enum from the same module instead + of string literals. + +### Tests + +- For every public API instrumented, cover sync/async variants when both exist. +- Cover happy path and error scenarios. +- Tests must verify exact attribute names **and value types**, checked against the semconv spec. +- Test against oldest and latest supported library versions via `tests/requirements.{oldest,latest}.txt` + and `{oldest,latest}` `tox.ini` factors. + +The parallel PR-review rules live in +[`.github/instructions/instrumentation.instructions.md`](.github/instructions/instrumentation.instructions.md) +and should be kept in sync with this section. + ## Commit formatting We appreciate it if users disclose the use of AI tools when the significant part of a commit is diff --git a/CHANGELOG.md b/CHANGELOG.md index 07db28fe5c..a48239ef1d 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -17,6 +17,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 ([#4244](https://github.com/open-telemetry/opentelemetry-python-contrib/pull/4244)) - `opentelemetry-instrumentation-sqlite3`: Add uninstrument, error status, suppress, and no-op tests ([#4335](https://github.com/open-telemetry/opentelemetry-python-contrib/pull/4335)) +- Expand `AGENTS.md` with instrumentation/GenAI guidance and add PR review instructions. + ([#4457](https://github.com/open-telemetry/opentelemetry-python-contrib/pull/4457)) ### Fixed diff --git a/instrumentation-genai/AGENTS.md b/instrumentation-genai/AGENTS.md index 6e8dd1eca9..5e602be269 100644 --- a/instrumentation-genai/AGENTS.md +++ b/instrumentation-genai/AGENTS.md @@ -3,6 +3,9 @@ Instrumentation packages here wrap specific libraries (OpenAI, Anthropic, etc.) and bridge them to the shared telemetry layer in `util/opentelemetry-util-genai`. +These rules are additive to the shared instrumentation rules in the repo-root +[AGENTS.md](../AGENTS.md). + ## 1. Instrumentation Layer Boundary Do not call OpenTelemetry APIs (`tracer`, `meter`, `span`, event APIs) directly. @@ -33,9 +36,23 @@ except Exception as exc: raise ``` -## 3. Exception Handling +## 3. Semantic conventions + +Attributes, spans, events, and metrics follow the +[GenAI semantic conventions](https://github.com/open-telemetry/semantic-conventions/tree/main/docs/gen-ai). +Do not emit signals that are not covered by semconv. + +`gen_ai.*` attribute names and the enums for well-known values (e.g. `GenAiOutputTypeValues` for +`gen_ai.output.type`) live in `opentelemetry.semconv._incubating.attributes.gen_ai_attributes`. + +## 4. Tests + +- Use VCR cassettes for provider calls. Do not skip tests when an API key is missing. +- Cover streaming and non-streaming variants when both exist. +- Cover error scenarios, at minimum: provider error / endpoint unavailable, stream interrupted by + network, stream closed early by the caller. + +## 5. Examples -- Do not add `raise {Error}` statements in instrumentation/telemetry code — validation belongs in - tests and callers, not in the instrumentation layer. -- When catching exceptions from the underlying library to record telemetry, always re-raise - the original exception unmodified. +New instrumentations ship a minimal example under the package's `examples/` directory, with +both a `manual/` setup and a `zero-code/` (auto-instrumentation) variant. diff --git a/util/opentelemetry-util-genai/AGENTS.md b/util/opentelemetry-util-genai/AGENTS.md index 355d8b7c08..77cc5cf20b 100644 --- a/util/opentelemetry-util-genai/AGENTS.md +++ b/util/opentelemetry-util-genai/AGENTS.md @@ -4,17 +4,22 @@ This package provides shared telemetry utilities for OpenTelemetry GenAI instrum ## 1. Semantic Convention Compliance -All attributes, operation names, and span names must match the -[OpenTelemetry GenAI semantic conventions](https://github.com/open-telemetry/semantic-conventions/blob/main/docs/gen-ai/) -exactly. +No new telemetry without semconv. If a signal, attribute, or operation is not in the +[OpenTelemetry GenAI semantic conventions](https://github.com/open-telemetry/semantic-conventions/blob/main/docs/gen-ai/), land the semconv change first. -Use the appropriate semconv attribute modules — do not hardcode attribute name strings: +All attributes, operation names, and span names must match semconv. + +Use the semconv attribute modules — do not hardcode attribute name strings: - `gen_ai.*` attributes: `opentelemetry.semconv._incubating.attributes.gen_ai_attributes` - `server.*` attributes: `opentelemetry.semconv.attributes.server_attributes` - `error.*` attributes: `opentelemetry.semconv.attributes.error_attributes` - Other namespaces: use the corresponding module from `opentelemetry.semconv` +Shared attributes should behave consistently across invocation types (same conditions, same +defaults). If an attribute applies to more than one invocation per semconv, set it on all +applicable ones. + ## 2. Invocation Lifecycle Pattern Every new operation type must follow this pattern: @@ -42,6 +47,20 @@ Factory methods on `TelemetryHandler` (`handler.py`): Context manager equivalents (`handler.inference()`, `handler.embedding()`, `handler.tool()`, `handler.workflow()`) are available when the span lifetime maps cleanly to a `with` block. +`start_*()` factories must map 1:1 to distinct semconv operation types (inference, embeddings, +tool execution, agent invocation, workflow invocation). Names must match the operation +unambiguously — for example, `create_agent` and `invoke_agent` are different operations, so a +single `start_agent()` would be ambiguous and is not acceptable. Add a new factory per operation +instead. + +Factory names are Python-style singular verbs (`start_embedding`, `start_tool`); the op names +they map to follow semconv (`embeddings`, `tool execution`, future operations). + +`start_*()` factories must accept all attributes that semconv marks as important for sampling +decisions as parameters, so they are on the span at creation time. Attributes that are also +marked required by semconv must be required parameters (no default value). Operation name +is usually hardcoded in specific invocation and does not need to be passed. + ### Anti-patterns **Never construct invocation types directly** (`InferenceInvocation(...)`, `ToolInvocation(...)`, @@ -50,31 +69,45 @@ propagation, so all telemetry calls become no-ops. Always use `handler.start_*() ## 3. Exception Handling -- Do not add `raise {Error}` statements to `handler.py` or `types.py` — validation belongs in - tests and callers, not telemetry internals. - When catching exceptions from the underlying library to record telemetry, always re-raise the original exception unmodified. +- Do not raise new exceptions in telemetry code. + +## 4. Performance + +Keep the hot path tight: + +- Avoid per-invocation allocations; do not accumulate state unboundedly. +- Skip content capture when content capture is disabled. +- Skip setting span-only attributes when the span is not recording. +- Still record attributes that feed metrics — metric recording is independent of span sampling. + +## 5. DRY + +Do not copy-paste logic across invocation types. Extract shared helpers. -## 4. Documentation +## 6. Documentation - Docstrings for invocation types and span/event helpers must include a link to the corresponding operation in the semconv spec. - When adding or changing attributes, update the docstring to describe what is set and under what conditions (e.g., "set only when `server_address` is provided"). -## 5. Tests +## 7. Tests - Every new operation type or attribute change must have tests verifying the exact attribute - names and values produced, checked against the semconv spec. + names **and value types**, checked against the semconv spec. - Cover all paths: success (`invocation.stop()`), failure (`invocation.fail(exc)`), and any conditional attribute logic (e.g., attributes set only when optional fields are populated). - Tests live in `tests/` — follow existing patterns there. - Don't call internal API in tests when the public API is available. -## 6. Python API Conventions +## 8. Python API Conventions -- Mark private modules with an underscore. -- Objects inside of a private module should be prefixed with underscopre if they - are not used outside the that module. -- Before removing or renaming an object exposed publicly, deprecate it first with - `@deprecated("... Use X instead.")` pointing to the replacement; +- Mark private modules with an underscore. Objects inside a private module should be prefixed + with an underscore if they are not used outside that module. +- When adding fields or methods on invocation types (or anywhere in the public surface), push + back hard: does this need to be public? If instrumentations don't need it, keep it internal + (`_`-prefixed). Every public addition becomes a back-compat commitment. +- Before removing or renaming an object exposed publicly, deprecate it first with a note in the + docstring pointing to the replacement.