feat: Send GenAI spans as V2 envelope items #6079

7 issues

find-bugs: Found 7 issues (2 high, 4 medium, 1 low)

High

test_multiple_providers captures only transactions but asserts on spans - `tests/integrations/litellm/test_litellm.py:945`

Line 945 calls `capture_items("transaction")`, which captures only transactions. However, lines 1020-1023 (outside the hunk, but testing items set up in it) filter for `item.type == "span"` and assert on span attributes. Because spans are never captured, the spans list is empty and the for-loop assertions pass trivially, so the test verifies nothing about span attributes.

Also found at:

tests/integrations/litellm/test_litellm.py:1020-1023
tests/integrations/litellm/test_litellm.py:866-868
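
The failure mode can be reproduced in isolation. The following is a minimal sketch, not the test's actual code: the item dictionaries, the `example.key` attribute, and the helper names `filter_spans`, `vacuous_check`, and `guarded_check` are hypothetical stand-ins for what `capture_items` returns and for the assertions in the test.

```python
# Hypothetical captured items: capturing only "transaction" yields no span items.
items = [{"type": "transaction", "payload": {"transaction": "demo"}}]


def filter_spans(captured):
    """Keep only items whose type is "span"."""
    return [item for item in captured if item["type"] == "span"]


def vacuous_check(captured):
    """Mirrors the pattern in the finding: the loop body never runs on an empty list."""
    for span in filter_spans(captured):
        assert span["payload"]["attributes"]["example.key"] == "expected"
    return True


def guarded_check(captured):
    """Same check, but fails loudly when no spans were captured."""
    spans = filter_spans(captured)
    assert len(spans) > 0, "no span items captured"
    for span in spans:
        assert span["payload"]["attributes"]["example.key"] == "expected"
    return True
```

Here `vacuous_check(items)` returns `True` although nothing was verified, while `guarded_check(items)` raises `AssertionError`. Capturing span items in addition to transactions, together with a non-empty guard, would likely make the test meaningful.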

Test expects V2 span envelope for non-gen_ai op span, will fail - `tests/tracing/test_misc.py:618-629`

The test `test_conversation_id_propagates_to_span_with_gen_ai_operation_name` was changed to use `capture_items("span")`, which captures V2 envelope span items. However, the span it creates has `op="http.client"`, and `_split_gen_ai_spans()` in client.py only splits out spans whose op starts with `gen_ai.`. This span will therefore not be sent as a V2 envelope item; it stays in the transaction event. The test will fail because the spans list will be empty or will not contain the expected span.

Also found at:

tests/tracing/test_misc.py:636-647
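
To illustrate the prefix rule described above, here is a rough sketch. `split_by_gen_ai_prefix` is a hypothetical stand-in written for this example; it is not the body of `_split_gen_ai_spans()`, and the span dictionaries are invented.

```python
def split_by_gen_ai_prefix(spans):
    """Hypothetical stand-in for the split the finding describes; not the SDK's code."""
    gen_ai_spans, remaining = [], []
    for span in spans:
        op = span.get("op") or ""
        if op.startswith("gen_ai."):
            gen_ai_spans.append(span)
        else:
            remaining.append(span)
    return gen_ai_spans, remaining


# Invented spans: one ordinary HTTP span and one gen_ai span.
spans = [
    {"op": "http.client", "description": "GET /models"},
    {"op": "gen_ai.chat", "description": "chat demo"},
]
v2_items, in_transaction = split_by_gen_ai_prefix(spans)
```

Under this rule the `http.client` span ends up in `in_transaction`, so a test that looks for it among captured span items would find nothing.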

Medium

V2 GenAI spans may be missing release, environment, and SDK metadata - `sentry_sdk/client.py:1130`

Line 1130 passes event instead of event_opt to _serialized_v1_span_to_serialized_v2_span. The _prepare_event function (lines 811-817) populates release, environment, and sdk from options into the event, but those values only exist in the returned event_opt. The original event parameter may not contain these fields, causing V2 GenAI spans to be missing sentry.release, sentry.environment, sentry.sdk.name, and sentry.sdk.version attributes.

Also found at:

tests/integrations/google_genai/test_google_genai.py:2153
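
The sketch below shows the shape of the problem, assuming (as the finding states) that the prepared event is a separate object from the original. `prepare_event` and `span_attributes` are hypothetical simplifications, not the SDK's `_prepare_event` or `_serialized_v1_span_to_serialized_v2_span`, and the option values are invented.

```python
def prepare_event(event, options):
    """Hypothetical: return a copy of the event with defaults filled in from options."""
    event_opt = dict(event)
    for key in ("release", "environment"):
        if event_opt.get(key) is None and options.get(key) is not None:
            event_opt[key] = options[key]
    return event_opt


def span_attributes(source_event):
    """Hypothetical: derive span attributes from whichever event is passed in."""
    attributes = {}
    for key in ("release", "environment"):
        if source_event.get(key) is not None:
            attributes["sentry." + key] = source_event[key]
    return attributes


options = {"release": "demo@1.0.0", "environment": "production"}
event = {"type": "transaction"}
event_opt = prepare_event(event, options)

from_original = span_attributes(event)      # built from the unprepared event
from_prepared = span_attributes(event_opt)  # built from the prepared event
```

In this simplified model `from_original` is empty while `from_prepared` carries `sentry.release` and `sentry.environment`, which is the gap the finding warns about.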

Hardcoded SDK version '2.58.0' will cause test failure on version change - `tests/integrations/huggingface_hub/test_huggingface_hub.py:523`

In test_text_generation, the expected sentry.sdk.version attribute is hardcoded as "2.58.0" (line 523) instead of using mock.ANY like all other similar tests in this file. This test will fail when the SDK version changes, unlike test_text_generation_streaming, test_chat_completion, and other tests which correctly use mock.ANY for version comparison.

Also found at:

tests/integrations/openai_agents/test_openai_agents.py:1097
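
A small standalone illustration of why `mock.ANY` is the safer expectation follows. The attribute dictionaries are invented for this example and are not copied from the tests.

```python
from unittest import mock

# Hypothetical attribute values; the version string changes from release to release.
attributes = {"sentry.sdk.name": "sentry.python", "sentry.sdk.version": "2.59.0"}

pinned = {"sentry.sdk.name": "sentry.python", "sentry.sdk.version": "2.58.0"}
tolerant = {"sentry.sdk.name": "sentry.python", "sentry.sdk.version": mock.ANY}

matches_pinned = attributes == pinned      # False once the version moves on
matches_tolerant = attributes == tolerant  # True for any version value
```

`mock.ANY` compares equal to every value, so the tolerant expectation still checks that the key is present without pinning the version.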

Test accesses orphaned _meta after gen_ai span is removed from transaction - `tests/integrations/openai/test_openai.py:3758-3760`

After gen_ai spans are split from the transaction and sent as V2 envelope items, the transaction's spans list no longer contains the gen_ai span. However, the test still accesses event["_meta"]["spans"]["0"]["data"] expecting truncation metadata. Since the span at index 0 has been moved to the V2 envelope, _meta["spans"]["0"] now references metadata for a span that no longer exists in the transaction's spans array. This test will likely fail or assert against orphaned/stale metadata.

Also found at:

tests/integrations/pydantic_ai/test_pydantic_ai.py:830-833
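
One way to detect this situation is to compare the `_meta` span keys with the length of the spans list. This is a sketch under assumptions: `orphaned_meta_indices` is a hypothetical helper, and the event below is an invented, simplified payload.

```python
def orphaned_meta_indices(event):
    """Return _meta span keys that point past the end of event["spans"]."""
    span_count = len(event.get("spans", []))
    meta_spans = event.get("_meta", {}).get("spans", {})
    return sorted(key for key in meta_spans if int(key) >= span_count)


# Hypothetical transaction after its only gen_ai span was moved to a V2 item.
event = {
    "spans": [],
    "_meta": {"spans": {"0": {"data": {"example.key": {"": {"len": 5000}}}}}},
}
orphans = orphaned_meta_indices(event)
```

For this event `orphans` is `["0"]`. The helper only catches keys past the end of the list; if other spans remain in the transaction, a stale key could instead point at a different span, which this check would not detect.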

Test checks wrong field 'attributes' instead of 'data' for transaction trace context - `tests/integrations/openai_agents/test_openai_agents.py:3560-3562`

At lines 3560-3561, the test checks `transaction["contexts"]["trace"].get("attributes", {})` to verify that conversation_id is not set. However, all other tests in this file (lines 3359, 3497) and throughout the test suite access transaction trace data via `transaction["contexts"]["trace"]["data"]`. Because of this inconsistency the test always passes: it checks a non-existent 'attributes' field, while the 'data' field might still contain the conversation_id.
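
The effect of checking the wrong field can be shown with a tiny example. The key name `gen_ai.conversation.id` and the transaction payload are assumptions made for illustration only.

```python
CONVERSATION_KEY = "gen_ai.conversation.id"  # hypothetical key name

# Hypothetical transaction whose trace context still carries the id under "data".
transaction = {"contexts": {"trace": {"data": {CONVERSATION_KEY: "conv-1"}}}}
trace = transaction["contexts"]["trace"]

# Looking in a field that does not exist always reports "absent".
absent_in_attributes = CONVERSATION_KEY not in trace.get("attributes", {})
# Looking in the field that is actually populated finds the id.
absent_in_data = CONVERSATION_KEY not in trace.get("data", {})
```

`absent_in_attributes` is `True` even though the id is present, whereas `absent_in_data` is `False`, so only the second check can catch a regression.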

Low

Duplicated sort key uses 'name' twice instead of 'name' and 'description' - `tests/integrations/google_genai/test_google_genai.py:330`

The sorting lambda for tools on line 330 was changed from (t.get("name", ""), t.get("description", "")) to (t.get("name", ""), t.get("name", "")). This duplicates 'name' as both primary and secondary sort keys, making the secondary sort redundant. While this works for the current test data (since tool names are distinct), it loses the intended secondary sort by description and appears to be a copy-paste error.

Also found at:

tests/integrations/langchain/test_langchain.py:1840-1844
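
The difference between the two sort keys only shows up when names collide. The tool list below is invented for this sketch; it is not the test's data.

```python
# Hypothetical tools that share a name but differ in description.
tools = [
    {"name": "search", "description": "b"},
    {"name": "search", "description": "a"},
]

# Key from the changed line: "name" twice, so ties keep their input order.
by_name_twice = sorted(tools, key=lambda t: (t.get("name", ""), t.get("name", "")))

# Original key: "name", then "description" as the tie-breaker.
by_name_then_description = sorted(
    tools, key=lambda t: (t.get("name", ""), t.get("description", ""))
)
```

Because Python's sort is stable, `by_name_twice` leaves the two entries in input order, while `by_name_then_description` orders them by description, which gives a deterministic result regardless of input order.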

Duration: 23m 34s · Tokens: 18.5M in / 212.2k out · Cost: $26.27 (+extraction: $0.03, +merge: $0.01, +fix_gate: $0.02)
