fix(semconv): attach spec-mandated explicit bucket boundaries to GenAI histogram helpers #5084

Open
alliasgher wants to merge 3 commits into open-telemetry:main from alliasgher:fix-gen-ai-histogram-buckets

@alliasgher alliasgher commented Apr 13, 2026

Description

The four GenAI histogram helpers in opentelemetry-semantic-conventions called meter.create_histogram without passing explicit_bucket_boundaries_advisory. The SDK therefore fell back to _DEFAULT_EXPLICIT_BUCKET_HISTOGRAM_AGGREGATION_BOUNDARIES (0, 5, 10, 25, … 10000), which suits millisecond-scale values. For latency-per-token and TTFT metrics recorded in seconds, nearly every measurement lands in the first bucket or two, so the histograms are unusable. This is the exact problem flagged in the semconv spec, which says these metrics SHOULD be specified with ExplicitBucketBoundaries.

Pass the semconv-prescribed boundaries for all four helpers:

  • gen_ai.client.operation.duration, gen_ai.server.request.duration, gen_ai.server.time_to_first_token share the latency boundary set [0.01 … 81.92] seconds.
  • gen_ai.server.time_per_output_token uses the per-token boundary set [0.01 … 2.5] seconds.
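
A minimal sketch of what one patched helper might look like. The function name, the description string, and the full boundary list are assumptions for illustration: the description above elides the middle values, and the doubling sequence below is only a guess at them. `MagicMock` stands in for a real `Meter` so the snippet runs without the SDK installed.

```python
from unittest.mock import MagicMock

# Assumption: the elided latency set doubles from 0.01 s up to 81.92 s.
LATENCY_BOUNDARIES = [round(0.01 * 2**i, 2) for i in range(14)]


def create_gen_ai_client_operation_duration(meter):
    """Hypothetical helper: create the histogram with advisory boundaries."""
    return meter.create_histogram(
        name="gen_ai.client.operation.duration",
        description="GenAI operation duration",
        unit="s",
        explicit_bucket_boundaries_advisory=LATENCY_BOUNDARIES,
    )


meter = MagicMock()  # stand-in for opentelemetry.metrics.Meter
create_gen_ai_client_operation_duration(meter)

# Inspect what the helper handed to create_histogram.
passed = meter.create_histogram.call_args.kwargs[
    "explicit_bucket_boundaries_advisory"
]
print(passed[0], passed[-1])
```

The boundaries are only advisory: a View configured in the SDK can still override them.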

Fixes #4946

Checklist

  • pytest opentelemetry-semantic-conventions/tests/test_gen_ai_metrics.py
  • New test file asserts each factory passes the correct explicit_bucket_boundaries_advisory to Meter.create_histogram
  • CHANGELOG entry added
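
The kind of assertion the new test file makes could be sketched roughly as follows. This is not the PR's actual test: the stand-in factory, the test class name, and the two-element placeholder boundary tuple (just the endpoints named in the description) are all hypothetical.

```python
import unittest
from unittest.mock import MagicMock

# Placeholder: only the endpoints named in the description; the real list is longer.
PLACEHOLDER_BOUNDARIES = (0.01, 81.92)


def create_duration_histogram(meter):
    """Hypothetical stand-in for one of the four GenAI helpers."""
    return meter.create_histogram(
        name="gen_ai.server.request.duration",
        unit="s",
        explicit_bucket_boundaries_advisory=list(PLACEHOLDER_BOUNDARIES),
    )


class TestGenAiHistogramBuckets(unittest.TestCase):
    def test_factory_passes_advisory_boundaries(self):
        meter = MagicMock()  # records the call instead of creating an instrument
        create_duration_histogram(meter)
        meter.create_histogram.assert_called_once()
        kwargs = meter.create_histogram.call_args.kwargs
        self.assertEqual(
            kwargs["explicit_bucket_boundaries_advisory"],
            list(PLACEHOLDER_BOUNDARIES),
        )


# Run the single test case explicitly so the sketch works outside pytest too.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestGenAiHistogramBuckets)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Mocking the meter keeps the test independent of the SDK: it checks only that the advisory argument reaches `create_histogram`, not how the SDK aggregates.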

…I histogram helpers

The four GenAI histogram helpers in opentelemetry-semantic-conventions
called meter.create_histogram without passing
explicit_bucket_boundaries_advisory. The SDK therefore fell back to
_DEFAULT_EXPLICIT_BUCKET_HISTOGRAM_AGGREGATION_BOUNDARIES, which suits
millisecond-scale values and produces unusable histograms for
latency-per-token and TTFT metrics recorded in seconds. This is the
exact problem flagged in the semconv spec, which says these metrics
SHOULD be specified with ExplicitBucketBoundaries.

Pass the semconv-prescribed boundaries for all four helpers:

* gen_ai.client.operation.duration / gen_ai.server.request.duration /
  gen_ai.server.time_to_first_token share the latency boundary set
  [0.01 .. 81.92] seconds.
* gen_ai.server.time_per_output_token uses the per-token boundary set
  [0.01 .. 2.5] seconds.

Add tests asserting each factory passes the correct
explicit_bucket_boundaries_advisory to Meter.create_histogram.

Fixes open-telemetry#4946

Signed-off-by: Ali <alliasgher123@gmail.com>
The link pointed to opentelemetry-python-contrib/pull/5076 (404) but
should reference opentelemetry-python/pull/5076, matching the main branch.

Signed-off-by: Ali <alliasgher123@gmail.com>
@alliasgher
Author

Fixed the two CI failures:

check-links: The CHANGELOG had a stale link pointing to opentelemetry-python-contrib/pull/5076 (404) — the correct URL is opentelemetry-python/pull/5076 (matching main). Fixed in the latest commit.

pypy-3.10 Windows: The failing test TestConcurrentMultiSpanProcessor::test_force_flush_late_by_timeout is in opentelemetry-sdk/tests/trace/test_span_processor.py, which is unrelated to this PR's changes (this PR only touches opentelemetry-semantic-conventions/src/opentelemetry/semconv/metrics/gen_ai_metrics.py and its test file). The failure appears to be a pre-existing race-condition flake specific to the pypy-3.10 Windows environment.

Contributor

@tammy-baylis-swi tammy-baylis-swi left a comment

Overall lgtm, and this dir isn't generally tested.

It would be good if @open-telemetry/python-approvers who make regular GenAI instrumentation changes could also look at this please.

@github-project-automation github-project-automation bot moved this from Ready for review to Approved PRs in Python PR digest Apr 15, 2026
Member

@MikeGoldsmith MikeGoldsmith left a comment

Looks good, thanks @alliasgher - I've left a suggestion to update the changelog to use the PR number instead of issue number.

Comment thread CHANGELOG.md Outdated
Comment on lines +23 to +24
- `opentelemetry-semantic-conventions`: Attach spec-mandated explicit bucket boundaries to the GenAI histogram helpers (`gen_ai.client.operation.duration`, `gen_ai.server.request.duration`, `gen_ai.server.time_to_first_token`, `gen_ai.server.time_per_output_token`); without them the default SDK buckets produced unusable histograms for latency-per-token metrics
([#4946](https://github.com/open-telemetry/opentelemetry-python/issues/4946))
Member

Suggested change

Before:
- `opentelemetry-semantic-conventions`: Attach spec-mandated explicit bucket boundaries to the GenAI histogram helpers (`gen_ai.client.operation.duration`, `gen_ai.server.request.duration`, `gen_ai.server.time_to_first_token`, `gen_ai.server.time_per_output_token`); without them the default SDK buckets produced unusable histograms for latency-per-token metrics
([#4946](https://github.com/open-telemetry/opentelemetry-python/issues/4946))

After:
- `opentelemetry-semantic-conventions`: Attach spec-mandated explicit bucket boundaries to the GenAI histogram helpers
([#5084](https://github.com/open-telemetry/opentelemetry-python/issues/5084))

Author

Done.

Member

@lmolkova lmolkova left a comment

I support the cause, but this code is auto-generated and changes will disappear with the next update. This is currently blocked on open-telemetry/semantic-conventions#1225.

@github-project-automation github-project-automation bot moved this from Approved PRs to Reviewed PRs that need fixes in Python PR digest Apr 16, 2026
Signed-off-by: Ali <alliasgher123@gmail.com>
Projects

Status: Reviewed PRs that need fixes

Development

Successfully merging this pull request may close these issues.

GenAi metrics TTFT/TPOT histogram SHOULD be specified with ExplicitBucketBoundaries

4 participants