feat: Send GenAI spans as V2 envelope items #6079
6 issues
code-review: Found 6 issues (1 high, 4 medium, 1 low)
High
`tx["_meta"]["spans"]` will cause KeyError - spans are no longer nested in transaction - `tests/integrations/langchain/test_langchain.py:1370`
Line 1370 accesses `tx["_meta"]["spans"]["0"]["data"]` to verify message truncation metadata. However, the test was migrated to use V2 envelope items where spans are separate items (accessed via `[item.payload for item in items if item.type == "span"]`) rather than being nested within the transaction. Since spans are no longer embedded in `tx`, the path `tx["_meta"]["spans"]` will raise a KeyError at runtime, causing this test to fail.
Also found at:
tests/integrations/openai_agents/test_openai_agents.py:3039
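A minimal sketch of the fix, using hypothetical item and payload shapes (the real structures come from the test envelope fixtures): with V2 envelopes, spans are separate items, so truncation `_meta` must be read from the span payload rather than from `tx["_meta"]["spans"]`.

```python
# Hypothetical V2 item shapes for illustration only.
from types import SimpleNamespace

items = [
    SimpleNamespace(type="transaction", payload={"_meta": {}}),
    SimpleNamespace(
        type="span",
        payload={
            "data": {"gen_ai.request.messages": "[...]"},
            "_meta": {"data": {"gen_ai.request.messages": {"": {"len": 2}}}},
        },
    ),
]

tx = next(i.payload for i in items if i.type == "transaction")
spans = [i.payload for i in items if i.type == "span"]

# Old path: tx["_meta"]["spans"] raises KeyError, since spans are no longer nested.
assert "spans" not in tx["_meta"]

# New path: read the truncation metadata from the span item itself.
meta = spans[0]["_meta"]["data"]["gen_ai.request.messages"]
```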
Medium
V2 span conversion uses unprocessed event instead of processed event_opt - `sentry_sdk/client.py:1138`
At line 1138, `_serialized_v1_span_to_serialized_v2_span(span, event)` passes the original `event` parameter instead of `event_opt`. The `_serialized_v1_span_to_serialized_v2_span` function reads `user`, `release`, `environment`, `transaction`, trace context, and SDK info from the event (lines 224-251 in client.py). These fields are populated during `_prepare_event` via `scope.apply_to_event`, which can return a different object via event processors (lines 1817-1823 in scope.py). When event processors return a new object, V2 spans will be missing user, release, environment, and other scope-applied attributes.
Also found at:
tests/integrations/litellm/test_litellm.py:945
tests/integrations/huggingface_hub/test_huggingface_hub.py:710
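A simplified sketch of the fix with stand-in functions (the real `_prepare_event` and `_serialized_v1_span_to_serialized_v2_span` live in `sentry_sdk/client.py`; the shapes below are assumptions for illustration): the V2 conversion must consume the processed `event_opt`, not the raw `event`.

```python
# Stand-in for the real converter: reads scope-applied fields from the event.
def _serialized_v1_span_to_serialized_v2_span(span, event):
    return {**span, "release": event.get("release")}

# Stand-in for _prepare_event: event processors may return a *new* object
# with scope data (user, release, environment, ...) applied.
def _prepare_event(event):
    processed = dict(event)
    processed["release"] = "1.2.3"
    return processed

event = {"spans": [{"op": "gen_ai.chat"}]}
event_opt = _prepare_event(event)

# Buggy call passed the raw `event`, dropping scope-applied attributes:
#     _serialized_v1_span_to_serialized_v2_span(span, event)
# Fixed call uses the processed event_opt:
v2_spans = [
    _serialized_v1_span_to_serialized_v2_span(s, event_opt)
    for s in event_opt["spans"]
]
```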
List comprehension result is discarded without any assertion or assignment - `tests/integrations/langchain/test_langchain.py:1842-1846`
The list comprehension at lines 1842-1846 computes error events but does not assign the result to a variable or make any assertions on it. This makes the test ineffective - it doesn't actually verify that errors are captured or handled correctly. Similar tests in litellm/test_litellm.py assign the result to `error_events` and assert on its length.
Also found at:
tests/integrations/pydantic_ai/test_pydantic_ai.py:490-496
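A sketch of the fix on hypothetical event dicts: bind the comprehension to `error_events` and assert on it, as the litellm tests already do, so a regression actually fails the test.

```python
# Hypothetical captured events for illustration only.
events = [
    {"type": "transaction"},
    {"level": "error", "exception": {"values": [{"type": "ValueError"}]}},
]

# Buggy: the comprehension result was computed and immediately discarded.
# Fixed: assign and assert.
error_events = [e for e in events if e.get("level") == "error"]
assert len(error_events) == 1
```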
Test assertion uses wrong key 'attributes' instead of 'data' for transaction contexts - `tests/integrations/openai_agents/test_openai_agents.py:3540-3542`
The test at line 3540 checks for `gen_ai.conversation.id` in `transaction["contexts"]["trace"].get("attributes", {})`, but transactions store this data under `data`, not `attributes`. Looking at the `capture_items` fixture in conftest.py (lines 361-370), only span items have their attributes transformed; transactions use the raw payload which contains `data`. Other assertions in the same file (lines 3341 and 3479) correctly use `data` for transaction context access.
Also found at:
tests/integrations/openai_agents/test_openai_agents.py:2257
tests/integrations/langchain/test_langchain.py:1953-1954
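The corrected assertion, sketched on a hypothetical transaction payload: transaction trace contexts keep their key/value pairs under `data`; only span items get an `attributes` view from the `capture_items` fixture.

```python
# Hypothetical transaction payload for illustration only.
transaction = {
    "contexts": {"trace": {"data": {"gen_ai.conversation.id": "conv-123"}}}
}
trace = transaction["contexts"]["trace"]

# Buggy: "attributes" is always absent on transactions, so this check can
# never find the key.
assert "gen_ai.conversation.id" not in trace.get("attributes", {})

# Fixed: read from "data", as the assertions at lines 3341 and 3479 do.
assert "gen_ai.conversation.id" in trace.get("data", {})
```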
test_message_history accesses transaction-nested spans using wrong attribute format, causing silent test pass - `tests/integrations/pydantic_ai/test_pydantic_ai.py:831-841`
At line 831, spans are extracted from `second_transaction["spans"]` and then filtered using `s["attributes"].get("sentry.op", "")`. However, spans nested inside transaction payloads use the legacy format with `s["op"]`, not `s["attributes"]["sentry.op"]`. This mismatch causes the filter to find zero `chat_spans`, and since the test uses `if chat_spans:` at line 836, it silently passes without verifying the message history feature.
Also found at:
tests/integrations/langchain/test_langchain.py:262
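A sketch of the corrected filter on hypothetical legacy-format spans: spans nested inside a transaction payload carry `op` directly rather than `attributes["sentry.op"]`, and the filter result should be asserted non-empty so the test cannot silently pass.

```python
# Hypothetical transaction-nested spans for illustration only.
second_transaction = {
    "spans": [{"op": "gen_ai.chat"}, {"op": "gen_ai.invoke_agent"}]
}
spans = second_transaction["spans"]

# Buggy: filtering on s["attributes"].get("sentry.op", "") matches nothing,
# and a guard like `if chat_spans:` then skips every assertion.
# Fixed: filter on the legacy "op" key and require a match.
chat_spans = [s for s in spans if s.get("op", "").startswith("gen_ai.chat")]
assert chat_spans, "expected at least one chat span"
```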
Low
Sorting key uses 'name' twice instead of 'name' and 'description' - `tests/integrations/google_genai/test_google_genai.py:330`
The sorting lambda at line 330 uses `t.get("name", "")` twice instead of sorting by name and description as the comment on line 328 states. The original code used `(t.get("name", ""), t.get("description", ""))`. While this doesn't cause a runtime error since the tool names are unique, it makes the second tuple element redundant and contradicts the documented intent.
Also found at:
tests/integrations/pydantic_ai/test_pydantic_ai.py:493
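The corrected sorting key, demonstrated on hypothetical tool dicts: tie-break on description as the comment states, instead of repeating the name.

```python
# Hypothetical tool list for illustration only.
tools = [
    {"name": "calc", "description": "z"},
    {"name": "calc", "description": "a"},
    {"name": "search", "description": "b"},
]

# Buggy: key=lambda t: (t.get("name", ""), t.get("name", "")) -- the second
# tuple element is redundant and never breaks ties.
# Fixed: sort by (name, description) as documented.
tools_sorted = sorted(
    tools, key=lambda t: (t.get("name", ""), t.get("description", ""))
)
```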
Duration: 21m 41s · Tokens: 15.9M in / 195.5k out · Cost: $22.30 (+extraction: $0.03, +merge: $0.01, +fix_gate: $0.01)