feat: Send GenAI spans as V2 envelope items #6079

9 issues

find-bugs: Found 9 issues (1 high, 7 medium, 1 low)

High

GenAI span conversion uses unprocessed `event` instead of `event_opt` - `sentry_sdk/client.py:1130`

On line 1130, _serialized_v1_span_to_serialized_v2_span(span, event) passes event (the original input parameter) instead of event_opt (the processed event). The _prepare_event method enriches the event with release, environment, SDK info, user data, and trace context. V2 spans converted from GenAI spans will therefore be missing this enriched data, causing inconsistency between the transaction and its extracted spans.

Also found at:

sentry_sdk/client.py:230
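
A minimal sketch of why the choice of argument matters. The `prepare_event` and `to_v2_span` helpers below are hypothetical stand-ins, not the SDK's `_prepare_event` or `_serialized_v1_span_to_serialized_v2_span`. They only assume that preparation returns an enriched event and that conversion copies event-level fields onto the span.

```python
from typing import Any, Dict

Event = Dict[str, Any]


def prepare_event(event: Event) -> Event:
    """Hypothetical stand-in: return an enriched copy of the event."""
    enriched = dict(event)
    enriched.setdefault("release", "my-app@1.0.0")
    enriched.setdefault("environment", "production")
    return enriched


def to_v2_span(span: Dict[str, Any], event: Event) -> Dict[str, Any]:
    """Hypothetical stand-in: copy event-level fields into span attributes."""
    attributes = dict(span.get("data", {}))
    for key in ("release", "environment"):
        if key in event:
            attributes["sentry." + key] = event[key]
    return {"name": span.get("description"), "attributes": attributes}


event = {"type": "transaction"}
span = {"description": "chat", "data": {"gen_ai.system": "openai"}}
event_opt = prepare_event(event)

# Converting with the raw event loses the enrichment; event_opt keeps it.
from_raw = to_v2_span(span, event)
from_processed = to_v2_span(span, event_opt)
```

Under these assumptions, `from_raw["attributes"]` carries no release or environment, while `from_processed["attributes"]` does.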

Medium

Test uses incorrect 'attributes' field instead of 'data' for inline_data - `tests/integrations/google_genai/test_google_genai.py:2153`

The test input was changed from {"inline_data": {"data": b"binary_data", ...}} to {"inline_data": {"attributes": b"binary_data", ...}}. However, the transform_google_content_part function in sentry_sdk/ai/utils.py (line 286) expects inline_data.get("data", ""), not attributes. This means the test will no longer properly validate the inline_data handling, as the actual binary data will be ignored and an empty string will be used instead. Other tests in this file (lines 1765, 1806, 1851) correctly use "data".

Also found at:

tests/integrations/google_genai/test_google_genai.py:330
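
The effect of the renamed key can be sketched as follows. `read_inline_payload` is a hypothetical stand-in that mirrors only the `inline_data.get("data", "")` lookup described above; it is not the real `transform_google_content_part`.

```python
from typing import Any, Dict


def read_inline_payload(part: Dict[str, Any]) -> Any:
    """Hypothetical stand-in for the `inline_data.get("data", "")` lookup."""
    inline_data = part.get("inline_data", {})
    return inline_data.get("data", "")


correct_part = {"inline_data": {"data": b"binary_data"}}
renamed_part = {"inline_data": {"attributes": b"binary_data"}}

correct_payload = read_inline_payload(correct_part)
# With the key renamed, the lookup silently falls back to the default.
renamed_payload = read_inline_payload(renamed_part)
```

Because the fallback is silent, a test fed `renamed_part` exercises the empty-string path rather than the binary-data path.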

Hardcoded SDK version will cause test failures on version bump - `tests/integrations/huggingface_hub/test_huggingface_hub.py:523`

In test_text_generation, the expected value of sentry.sdk.version is hardcoded to "2.58.0" instead of using mock.ANY. All other tests in this file and similar tests in other integration test files use mock.ANY for this field. This test will fail when the SDK version is bumped.
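
A short illustration of the difference, using `unittest.mock.ANY` from the standard library. The attribute dictionaries and the "2.59.0" value are hypothetical and stand in for a span captured after a version bump.

```python
from unittest import mock

# Hypothetical attributes as they might look after the next version bump.
span_attributes = {
    "sentry.sdk.name": "sentry.python",
    "sentry.sdk.version": "2.59.0",
}

pinned = {"sentry.sdk.name": "sentry.python", "sentry.sdk.version": "2.58.0"}
tolerant = {"sentry.sdk.name": "sentry.python", "sentry.sdk.version": mock.ANY}

# mock.ANY compares equal to any value, so only the pinned form breaks.
matches_pinned = span_attributes == pinned
matches_tolerant = span_attributes == tolerant
```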

test_multiple_providers fails to capture spans, making provider detection assertion ineffective - `tests/integrations/litellm/test_litellm.py:945`

On line 945, capture_items("transaction") only captures transaction items, but the test later (lines 1020-1023) filters for spans and asserts that each span has SPANDATA.GEN_AI_SYSTEM in its attributes. Since spans are not captured, the spans list will always be empty, and the for span in spans: loop never executes any assertions. The provider detection check is therefore ineffective: the test passes regardless of whether spans are correctly captured.

Also found at:

tests/integrations/litellm/test_litellm.py:1020-1023
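
The vacuous loop can be reproduced in isolation. The item dictionaries below are a hypothetical shape for what the fixture returns when only "transaction" is requested, and the literal "gen_ai.system" is assumed as a placeholder for SPANDATA.GEN_AI_SYSTEM.

```python
from typing import Any, Dict, List

# Hypothetical captured items: no "span" entries because none were requested.
items: List[Dict[str, Any]] = [
    {"type": "transaction", "payload": {"transaction": "demo"}},
]

spans = [item["payload"] for item in items if item["type"] == "span"]

checked = 0
for span in spans:
    # Never reached: the list is empty, so nothing is verified.
    assert "gen_ai.system" in span["attributes"]
    checked += 1
```

A guard such as `assert len(spans) > 0` before the loop would turn this silent pass into a visible failure.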

Unsafe dict access could cause KeyError in test_no_integration when spans lack sentry.op attribute - `tests/integrations/litellm/test_litellm.py:1283-1284`

The test_no_integration and test_async_no_integration tests filter spans using direct dict access x["attributes"]["sentry.op"] which will raise KeyError if any captured span doesn't have the sentry.op attribute. When LiteLLM integration is not enabled, other default integrations might produce spans without this attribute. Other tests in this codebase use the safer .get() pattern (e.g., span["attributes"].get("sentry.op")). The tests would fail with KeyError instead of asserting that no LiteLLM chat spans exist.

Also found at:

tests/integrations/litellm/test_litellm.py:1330-1331
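
A small sketch contrasting the two access patterns. The span dictionaries and the "gen_ai.chat" op value are assumptions chosen for illustration.

```python
from typing import Any, Dict, List

Span = Dict[str, Any]

spans: List[Span] = [
    {"attributes": {"sentry.op": "http.client"}},
    {"attributes": {}},  # hypothetical span without "sentry.op"
]


def chat_spans_unsafe(candidates: List[Span]) -> List[Span]:
    """Direct indexing: raises KeyError on a span lacking "sentry.op"."""
    return [x for x in candidates if x["attributes"]["sentry.op"] == "gen_ai.chat"]


def chat_spans_safe(candidates: List[Span]) -> List[Span]:
    """.get() returns None for a missing key, so the filter simply skips it."""
    return [x for x in candidates if x["attributes"].get("sentry.op") == "gen_ai.chat"]


safe_result = chat_spans_safe(spans)
```

With the safe variant the test can assert that `safe_result` is empty. The unsafe variant aborts with KeyError before reaching any assertion.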

Test checks wrong key 'attributes' instead of 'data' for transaction trace context - `tests/integrations/openai_agents/test_openai_agents.py:3560-3562`

The assertion at lines 3560-3561 checks transaction["contexts"]["trace"].get("attributes", {}), but transactions store span data in data, not attributes. Other tests in the same file (lines 3359, 3497) correctly use ["data"] to access gen_ai.conversation.id on transactions. As a result, the test always passes, even if gen_ai.conversation.id is incorrectly set in the transaction's data field.
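
A sketch of why the check cannot fail, assuming the test asserts that the key is absent. The transaction payload and the "conv-123" value are hypothetical.

```python
# Hypothetical transaction where the conversation id was (incorrectly) set.
transaction = {
    "contexts": {
        "trace": {"data": {"gen_ai.conversation.id": "conv-123"}},
    }
}

trace = transaction["contexts"]["trace"]

# Looking under "attributes" always sees the empty default dict.
vacuous_check = "gen_ai.conversation.id" not in trace.get("attributes", {})
# Looking under "data" inspects what the transaction actually carries.
real_check = "gen_ai.conversation.id" not in trace["data"]
```

Here `vacuous_check` is true even though the id is present, while `real_check` reports the problem.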

Missing else branch drops validation for handled_tool_call_exceptions=False case - `tests/integrations/pydantic_ai/test_pydantic_ai.py:490-496`

In test_agent_with_tool_validation_error, the old code had an else branch that validated model_behaviour_error existed when handled_tool_call_exceptions=False. The new code removed this branch entirely, meaning the test no longer validates that the unhandled UnexpectedModelBehavior exception is captured when handled_tool_call_exceptions=False. This allows the test to pass even if Sentry fails to capture the expected error event.

test_message_history accesses spans from transaction instead of captured items, causing test to fail or pass vacuously - `tests/integrations/pydantic_ai/test_pydantic_ai.py:830-840`

The test_message_history function was incompletely migrated to V2 envelope format. At line 830, it retrieves spans via second_transaction["spans"] (old format), but then at line 832 accesses s["attributes"].get("sentry.op", "") (V2 format). In V2, spans are sent as separate envelope items and should be accessed via [item.payload for item in items if item.type == "span"]. The transaction object may not contain a "spans" key at all, causing a KeyError, or the nested spans may have the old format (using s["op"] and s["data"] instead of s["attributes"]), causing the filter to find no matches. All other tests in this diff correctly use the pattern spans = [item.payload for item in items if item.type == "span"].
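
The V2 access pattern quoted above can be sketched with a stand-in item type. The `Item` dataclass, the payload contents, and the "gen_ai.chat" op value are hypothetical; only the `item.type` / `item.payload` filter comes from the report.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class Item:
    """Hypothetical stand-in for a captured envelope item."""

    type: str
    payload: Dict[str, Any]


items: List[Item] = [
    Item("transaction", {"transaction": "agent run"}),
    Item("span", {"attributes": {"sentry.op": "gen_ai.chat"}}),
]

# V2: spans arrive as their own items, not nested under the transaction.
spans = [item.payload for item in items if item.type == "span"]
chat_spans = [s for s in spans if s["attributes"].get("sentry.op", "") == "gen_ai.chat"]

transaction = next(item.payload for item in items if item.type == "transaction")
```

In this sketch `transaction` has no "spans" key at all, so indexing `transaction["spans"]` would raise KeyError, which is one of the two failure modes described.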

Low

Unused list comprehension result - dead code in test - `tests/integrations/langchain/test_langchain.py:1840-1844`

The list comprehension at lines 1840-1844 builds a list of error events, but the result is never assigned to a variable or used. This appears to be leftover code from a refactor. Additionally, the capture_items("transaction", "span") call at line 1821 does not include the "event" type, so this comprehension would always produce an empty list anyway.

Duration: 41m 58s · Tokens: 19.2M in / 211.4k out · Cost: $27.87 (+extraction: $0.02, +merge: $0.00, +fix_gate: $0.02)
