feat: Send GenAI spans as V2 envelope items #6079

9 issues

find-bugs: Found 9 issues (1 high, 6 medium, 2 low)

High

Test accesses wrong payload key 'data' instead of 'attributes', causing KeyError - `tests/tracing/test_misc.py:628`

The test was refactored to use capture_items('span') instead of capture_events(), but continues to access spans[0]['data']. The capture_items fixture transforms span payloads to have an 'attributes' key (see conftest.py lines 361-367), not 'data'. This will cause a KeyError when the test runs, making the test fail.

Also found at:

tests/integrations/pydantic_ai/test_pydantic_ai.py:830-832
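
A minimal sketch of the failure mode, assuming the span payload has the shape described above. The payload contents and the `read_span_attribute` helper are hypothetical and only illustrate the key mismatch; they are not taken from conftest.py.

```python
# Hypothetical span payload, shaped the way the finding says
# capture_items("span") yields it: values live under "attributes".
span_payload = {
    "name": "chat",
    "attributes": {"gen_ai.operation.name": "chat"},
}


def read_span_attribute(payload, key):
    """Read one value from the 'attributes' mapping of a span payload."""
    return payload["attributes"][key]


# The stale access pattern from the old capture_events() style raises KeyError.
try:
    span_payload["data"]
    stale_access_failed = False
except KeyError:
    stale_access_failed = True

operation = read_span_attribute(span_payload, "gen_ai.operation.name")
```

Switching the assertions to read from `attributes` is one plausible fix; whether the expected values also change depends on the conversion, which this sketch does not model.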

Medium

V2 GenAI spans are missing metadata because `event` is used instead of `event_opt` - `sentry_sdk/client.py:230`

At line 1130, _serialized_v1_span_to_serialized_v2_span(span, event) passes the original event instead of the processed event_opt. The function extracts metadata like user info, release, environment, SDK info, and transaction name from the event parameter. Since event has not been through _prepare_event(), these fields may be missing or unpopulated (e.g., release, environment, sdk are set in _prepare_event at lines 811-817). This causes V2 GenAI span attributes to be incomplete compared to what the V1 spans in the same transaction would have.

Also found at:

sentry_sdk/client.py:1130
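
The difference between passing the raw and the prepared event can be sketched as follows. `prepare_event` and `span_to_v2` are simplified stand-ins for `_prepare_event` and `_serialized_v1_span_to_serialized_v2_span`, and the `sentry.release` / `sentry.environment` attribute names are assumptions made for illustration, not the SDK's confirmed output.

```python
import copy


def prepare_event(event, options):
    """Stand-in for the preparation step: fills in fields the raw event lacks."""
    prepared = copy.deepcopy(event)
    prepared.setdefault("release", options.get("release"))
    prepared.setdefault("environment", options.get("environment"))
    return prepared


def span_to_v2(span, event):
    """Stand-in for the V1-to-V2 conversion: copies event metadata onto the span."""
    attributes = dict(span.get("data", {}))
    for field in ("release", "environment"):
        if event.get(field) is not None:
            # Attribute naming here is hypothetical.
            attributes["sentry." + field] = event[field]
    return {"name": span.get("description"), "attributes": attributes}


options = {"release": "my-app@1.0.0", "environment": "production"}
event = {"type": "transaction", "transaction": "/chat"}
event_opt = prepare_event(event, options)
span = {"description": "chat", "data": {"gen_ai.operation.name": "chat"}}

from_raw = span_to_v2(span, event)           # metadata missing
from_prepared = span_to_v2(span, event_opt)  # metadata present
```

In this sketch only the span built from `event_opt` carries release and environment, which mirrors the gap the finding describes.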

Test uses incorrect 'attributes' key instead of 'data' for inline_data format - `tests/integrations/google_genai/test_google_genai.py:2153`

The test test_extract_contents_messages_dict_inline_data was changed to use "attributes" as the key for binary data, but the Google GenAI API and the implementation in sentry_sdk/ai/utils.py:286 expect "data". The documented input format is {"inline_data": {"mime_type": "...", "data": "..."}}. All other tests in this file and in test_ai_monitoring.py consistently use "data". The test now passes for the wrong reason: the code defaults to an empty string when "data" is missing, so the test no longer exercises inline_data parsing.

Also found at:

tests/integrations/google_genai/test_google_genai.py:330
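
A small sketch of why the wrong key goes unnoticed. `extract_inline_data` is a hypothetical parser that only mimics the default-to-empty-string behaviour described above; it is not the code in sentry_sdk/ai/utils.py.

```python
def extract_inline_data(part):
    """Hypothetical parser: returns (mime_type, data), defaulting data to ''."""
    inline = part.get("inline_data", {})
    return inline.get("mime_type"), inline.get("data", "")


correct_part = {"inline_data": {"mime_type": "image/png", "data": "aGVsbG8="}}
wrong_key_part = {"inline_data": {"mime_type": "image/png", "attributes": "aGVsbG8="}}

correct_result = extract_inline_data(correct_part)
# No exception is raised for the wrong key; the payload is silently dropped.
wrong_key_result = extract_inline_data(wrong_key_part)
```

Because the wrong key produces an empty string rather than an error, a test that does not assert on the extracted data can still pass.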

test_multiple_providers does not capture spans but later asserts on them - `tests/integrations/litellm/test_litellm.py:945`

The capture_items("transaction") call on line 945 captures only transaction items, but the test later (lines 1020-1023) filters for span items with [item.payload for item in items if item.type == "span"] and asserts on them. Since spans are never captured into the items list, the filter yields an empty list and the assertions never run, so the test cannot validate span attributes. The async version, test_async_multiple_providers, correctly uses capture_items("transaction", "span").

Also found at:

tests/integrations/langchain/test_langchain.py:1840-1844
tests/integrations/litellm/test_litellm.py:1020-1023
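
The effect can be reproduced with a toy stand-in for the fixture. `Item` and `capture` below are hypothetical simplifications of the real capture_items fixture, and the attribute key is illustrative.

```python
from collections import namedtuple

# Hypothetical envelope item with the two fields the test filters on.
Item = namedtuple("Item", ["type", "payload"])


def capture(sent_items, *types):
    """Stand-in for a capture fixture: keeps only items of the requested types."""
    return [item for item in sent_items if item.type in types]


sent = [
    Item("transaction", {"transaction": "/chat"}),
    Item("span", {"attributes": {"gen_ai.system": "openai"}}),
]

# Only "transaction" is requested, so the span filter below finds nothing.
items = capture(sent, "transaction")
spans = [item.payload for item in items if item.type == "span"]

checked = 0
for span in spans:  # loop body never runs
    assert span["attributes"]["gen_ai.system"] == "openai"
    checked += 1

# Requesting both types makes the span assertions reachable.
items_with_spans = capture(sent, "transaction", "span")
spans_fixed = [item.payload for item in items_with_spans if item.type == "span"]
```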

Inconsistent transaction assertion uses 'attributes' instead of 'data' - `tests/integrations/openai_agents/test_openai_agents.py:3560-3562`

Lines 3560-3561 check transaction["contexts"]["trace"].get("attributes", {}), but all other transaction assertions in this file (lines 3359 and 3497) use ["data"] instead of ["attributes"]. If transactions use the data format, "attributes" is missing and .get() returns the empty default, so the assertion always passes and cannot detect bugs.
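
A sketch of how the empty default hides the problem, assuming the transaction keeps its values under "data". The key name `gen_ai.request.messages` and the negative assertion are illustrative guesses, since the finding does not quote the assertion itself.

```python
# Hypothetical transaction whose trace context uses the "data" format.
transaction = {
    "contexts": {"trace": {"data": {"gen_ai.request.messages": "[...]"}}}
}

trace = transaction["contexts"]["trace"]

# "attributes" is absent, so .get() falls back to {} and any
# "key not in ..." check succeeds regardless of what "data" holds.
via_attributes = trace.get("attributes", {})
vacuous_check_passes = "gen_ai.request.messages" not in via_attributes

# Reading the field the rest of the file uses exposes the value.
real_check_passes = "gen_ai.request.messages" not in trace["data"]
```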

Missing validation of model_behaviour_error when handled_tool_call_exceptions=False - `tests/integrations/pydantic_ai/test_pydantic_ai.py:490-496`

In test_agent_with_tool_validation_error, the original code validated that a model_behaviour_error event was captured in both the handled_tool_call_exceptions=True and False cases. The new code only unpacks and validates events when handled_tool_call_exceptions=True, completely skipping any verification when False. This means the test no longer verifies that the UnexpectedModelBehavior exception is being captured as an error event when the flag is false, reducing test coverage.

Test can pass vacuously if no tool spans are captured - `tests/integrations/pydantic_ai/test_pydantic_ai.py:964-966`

The test_include_prompts_false_with_tools test iterates over tool_spans with assertions but never validates that tool_spans is non-empty. If no tool spans are captured (e.g., due to a bug in the integration or the tool not being executed), the for loop on line 964 will simply not execute, and the test passes without actually verifying anything. Other similar tests in this file (e.g., lines 351-353, 426-428) include assert len(tool_spans) >= 1 before iterating.

Also found at:

tests/integrations/pydantic_ai/test_pydantic_ai.py:993-995
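
The guard used elsewhere in the file can be sketched like this. `check_tool_spans` and the `gen_ai.tool.input` key are hypothetical and stand in for the real per-span assertions.

```python
def check_tool_spans(tool_spans, require_non_empty):
    """Run per-span assertions, optionally requiring at least one span."""
    if require_non_empty:
        assert len(tool_spans) >= 1, "expected at least one tool span"
    for span in tool_spans:
        # Illustrative check: prompts must not leak into span attributes.
        assert "gen_ai.tool.input" not in span["attributes"]
    return True


# Without the guard, an empty list passes without checking anything.
unguarded_result = check_tool_spans([], require_non_empty=False)

# With the guard, the same empty list fails loudly.
try:
    check_tool_spans([], require_non_empty=True)
    guard_raised = False
except AssertionError:
    guard_raised = True
```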

Low

Hardcoded SDK version in test will cause test failure on version bump - `tests/integrations/huggingface_hub/test_huggingface_hub.py:523`

Line 523 hardcodes "sentry.sdk.version": "2.58.0" while all other test functions in this file use mock.ANY for this field (lines 599, 676, 753, 828, 942, 1038). This test will fail whenever the SDK version is incremented, requiring manual updates to the test file.
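
A short sketch of the difference between pinning the version and using mock.ANY. `build_attributes` and the `sentry.sdk.name` entry are hypothetical; only the `sentry.sdk.version` key and the "2.58.0" value come from the finding.

```python
from unittest import mock


def build_attributes(sdk_version):
    """Hypothetical producer of the attributes a test would compare against."""
    return {"sentry.sdk.name": "sentry.python", "sentry.sdk.version": sdk_version}


pinned_expectation = {
    "sentry.sdk.name": "sentry.python",
    "sentry.sdk.version": "2.58.0",
}
flexible_expectation = {
    "sentry.sdk.name": "sentry.python",
    "sentry.sdk.version": mock.ANY,  # compares equal to any value
}

after_bump = build_attributes("2.59.0")
pinned_still_matches = after_bump == pinned_expectation
flexible_still_matches = after_bump == flexible_expectation
```

The pinned expectation stops matching as soon as the version changes, while the mock.ANY form keeps checking that the key is present.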

Unused fixture parameter in test function - `tests/integrations/openai_agents/test_openai_agents.py:3057`

The function signature for test_openai_agents_message_truncation was changed from capture_events to capture_items, but the test body never calls or uses the capture_items fixture. This appears to be a mechanical find-replace during refactoring that left an unused fixture parameter. While not a runtime bug (pytest will simply inject an unused fixture), it adds unnecessary test setup overhead and may confuse future maintainers.

Duration: 23m 6s · Tokens: 17.9M in / 201.5k out · Cost: $26.67 (+extraction: $0.03, +merge: $0.01, +fix_gate: $0.02)
