Implement gen_ai.client.operation.time_to_first_chunk for OpenAI v2 streaming

## Problem

The OpenTelemetry GenAI Semantic Conventions define a `gen_ai.client.operation.time_to_first_chunk` histogram metric that records the time (in seconds) from request start to the first output chunk during streaming operations. No Python instrumentation currently implements this metric.

This was originally tracked in opentelemetry-python-contrib#3932, with a previous implementation in opentelemetry-python-contrib#4415. That PR was closed when the GenAI instrumentations moved to this repository.

## Context

In the old repository, #4415 was designed to build on top of opentelemetry-python-contrib#4500 (the streaming ABC refactor by @eternalcuriouslearner), which introduced the generic stream wrapper in `opentelemetry-util-genai`. That refactor is already present here at `util/opentelemetry-util-genai/src/opentelemetry/util/genai/stream.py`, so the time-to-first-chunk metric can now be built directly on top of it.

cc @lmolkova @eternalcuriouslearner

## Proposed solution

1. Add a `gen_ai.client.operation.time_to_first_chunk` histogram to the OpenAI v2 instrumentation and the generic streaming ABC.
2. Record `time_of_first_chunk - start_time` in seconds when the first streaming chunk arrives.
3. Attach standard GenAI metric attributes: `gen_ai.operation.name`, `gen_ai.request.model`, `gen_ai.response.model`, `gen_ai.system`, `server.address`, `server.port`.
4. Use the semconv-defined explicit bucket boundaries.
5. Record only for streaming calls; do not emit when no chunk is ever received.
6. Cover both sync and async streaming paths.
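
To make steps 2, 5, and 6 concrete, here is a minimal, stdlib-only sketch of the timing logic. It is not the code from #4415 and does not use the stream wrapper in `stream.py`: the names `timed_stream`, `timed_async_stream`, and the `record` callback are hypothetical. In a real implementation, `record` would presumably be the `record` method of the histogram instrument, and `start_time` would be captured by the caller just before the request is sent.

```python
import time
from typing import AsyncIterator, Callable, Dict, Iterator, TypeVar

T = TypeVar("T")

# Hypothetical stand-in for a histogram's record(value, attributes) method.
Recorder = Callable[[float, Dict[str, object]], None]


def timed_stream(
    stream: Iterator[T],
    record: Recorder,
    attributes: Dict[str, object],
    start_time: float,
    clock: Callable[[], float] = time.monotonic,
) -> Iterator[T]:
    """Yield chunks unchanged, recording time-to-first-chunk at most once.

    start_time must come from the same clock, read before the request starts.
    If the stream raises or ends before any chunk arrives, nothing is recorded.
    """
    first_seen = False
    for chunk in stream:
        if not first_seen:
            first_seen = True
            # Elapsed seconds from request start to the first chunk.
            record(clock() - start_time, attributes)
        yield chunk


async def timed_async_stream(
    stream: AsyncIterator[T],
    record: Recorder,
    attributes: Dict[str, object],
    start_time: float,
    clock: Callable[[], float] = time.monotonic,
) -> AsyncIterator[T]:
    """Async counterpart of timed_stream with the same recording rules."""
    first_seen = False
    async for chunk in stream:
        if not first_seen:
            first_seen = True
            record(clock() - start_time, attributes)
        yield chunk
```

Passing `start_time` in explicitly, rather than reading the clock inside the generator, matters because a generator body does not run until the first `next()` call, which could be well after the request was issued. The injectable `clock` is only there to make the behavior easy to test deterministically.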

## Acceptance criteria

- Metric recorded only on streaming completions (not non-streaming)
- No data point emitted if the stream errors before the first chunk
- Sync and async streaming paths both instrumented
- Test coverage ported from opentelemetry-python-contrib#4415

## References

- Semantic convention: https://opentelemetry.io/docs/specs/semconv/gen-ai/gen-ai-metrics/
- Original issue: https://github.com/open-telemetry/opentelemetry-python-contrib/issues/3932
- Previous implementation: https://github.com/open-telemetry/opentelemetry-python-contrib/pull/4415
- Streaming ABC refactor: https://github.com/open-telemetry/opentelemetry-python-contrib/pull/4500

I have a previous implementation from opentelemetry-python-contrib#4415 ready to port.
