feat: Send GenAI spans as V2 envelope items #6079

10 issues

code-review: Found 10 issues (4 high, 5 medium, 1 low)

High

Test uses invalid 'attributes' key instead of 'data' for inline_data - `tests/integrations/google_genai/test_google_genai.py:2190`

The test input was changed from "data": b"binary_data" to "attributes": b"binary_data". The Google GenAI API and the transform_google_content_part function expect inline_data to have a "data" key (line 286 of ai/utils.py: inline_data.get("data", "")). With "attributes", the transform will return an empty string for content instead of the binary data, making this test ineffective at validating the intended functionality.

Also found at:

tests/integrations/openai_agents/test_openai_agents.py:3528-3530
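
A minimal sketch of why the key name matters, assuming only the lookup quoted in the finding. `extract_inline_content` is a hypothetical stand-in, not the SDK's `transform_google_content_part`:

```python
def extract_inline_content(part):
    # Hypothetical stand-in that mirrors the quoted lookup:
    # inline_data.get("data", "")
    inline_data = part.get("inline_data", {})
    return inline_data.get("data", "")


# Shape the transform expects, per the finding.
correct_part = {"inline_data": {"data": b"binary_data"}}
# Shape the changed test now sends.
wrong_part = {"inline_data": {"attributes": b"binary_data"}}

correct_content = extract_inline_content(correct_part)
wrong_content = extract_inline_content(wrong_part)

print(repr(correct_content))
print(repr(wrong_content))
```

With the `"attributes"` key, the lookup falls through to its default, so any assertion on the binary payload no longer exercises the transform.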

_experiments variable is assigned but never passed to sentry_init(), test won't enable V2 spans - `tests/integrations/huggingface_hub/test_huggingface_hub.py:787-788`

In test_chat_completion_api_error, the _experiments variable is assigned on line 787 but never passed to sentry_init() on line 786, so the gen_ai_as_v2_spans: True experiment is not enabled. Later in the test (lines 805, 815-817), the code accesses span["attributes"], which exists only in the V2 span format; this will fail at runtime because no V2 spans are generated. Additionally, the value is incorrectly wrapped in a tuple (dict,) instead of being a plain dict as in the other test functions.

Also found at:

tests/integrations/huggingface_hub/test_huggingface_hub.py:847
tests/integrations/langchain/test_langchain.py:997
tests/integrations/langchain/test_langchain.py:1758
tests/integrations/openai/test_openai.py:1434-1435
tests/integrations/openai/test_openai.py:1450-1451
tests/integrations/openai/test_openai.py:1474-1475
tests/integrations/pydantic_ai/test_pydantic_ai.py:2978
tests/integrations/langchain/test_langchain.py:1745
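
The tuple wrapping described above can be reproduced in isolation. This sketch assumes that a trailing comma after the assignment is the cause, which the finding does not confirm. `fake_sentry_init` is a hypothetical stub that only records its keyword arguments:

```python
def fake_sentry_init(**options):
    # Hypothetical stand-in for sentry_init(); it only returns what it was given.
    return options


# A trailing comma turns the right-hand side into a one-element tuple.
experiments_as_tuple = {"gen_ai_as_v2_spans": True},
# The intended value is a plain dict.
experiments_as_dict = {"gen_ai_as_v2_spans": True}

# Assigning the variable is not enough; it has to be passed as a keyword argument.
options_without = fake_sentry_init()
options_with = fake_sentry_init(_experiments=experiments_as_dict)

v2_enabled_without = options_without.get("_experiments", {}).get("gen_ai_as_v2_spans", False)
v2_enabled_with = options_with.get("_experiments", {}).get("gen_ai_as_v2_spans", False)

print(type(experiments_as_tuple).__name__, v2_enabled_without, v2_enabled_with)
```

Both problems have to be fixed together: the value must be a dict, and it must actually reach the init call.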

Incomplete V2 span migration leaves assertion accessing non-existent transaction meta structure - `tests/integrations/langchain/test_langchain.py:1381`

The test test_langchain_message_truncation was only partially migrated to the V2 span format: spans are now extracted as separate envelope items via items and accessed via span["attributes"]. However, line 1381 still asserts tx["_meta"]["spans"]["0"]["data"]["gen_ai.request.messages"][""]["len"] == 5, which references the old structure where spans are nested within the transaction. With V2 spans sent as separate envelope items, the tx["_meta"]["spans"] structure may not exist or may contain different data, causing a KeyError or incorrect test behavior.

Also found at:

tests/integrations/pydantic_ai/test_pydantic_ai.py:500-506

test_multiple_providers captures no spans, making assertions vacuously pass - `tests/integrations/litellm/test_litellm.py:1034-1037`

In test_multiple_providers, capture_items("transaction") (line 959) only captures transaction items, but the test at lines 1034-1037 filters for item.type == "span" and iterates over the result. Since no span items are captured, the spans list is always empty, and the for-loop never executes any assertions. This means the test silently passes without verifying that SPANDATA.GEN_AI_SYSTEM is set on any span. The async version correctly uses capture_items("transaction", "span").

Also found at:

tests/integrations/litellm/test_litellm.py:959
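
The effect of the capture filter can be sketched with a toy model. `Item` and `capture` are hypothetical stand-ins for the envelope items and the capture_items fixture, and the `"gen_ai.system"` attribute key is assumed for illustration:

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    type: str
    payload: dict = field(default_factory=dict)


def capture(sent_items, *types):
    # Hypothetical model of the fixture: keep only the requested item types.
    return [item for item in sent_items if item.type in types]


sent = [
    Item("transaction"),
    Item("span", {"attributes": {"gen_ai.system": "openai"}}),
]

# Capturing only transactions leaves nothing for the span filter to find.
spans_missing = [i for i in capture(sent, "transaction") if i.type == "span"]
checked = 0
for span in spans_missing:
    assert "gen_ai.system" in span.payload["attributes"]
    checked += 1  # never reached: the loop body does not run

# Requesting both types, as the async test does, yields the span.
spans_found = [i for i in capture(sent, "transaction", "span") if i.type == "span"]

print(checked, len(spans_found))
```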

Medium

GenAI spans converted with unprepared event, missing release/environment/SDK attributes - `sentry_sdk/client.py:1130`

Line 1130 passes event (the original input) to _serialized_v1_span_to_serialized_v2_span, but should pass event_opt (the prepared event). The _prepare_event method populates release, environment, and sdk fields (lines 811-817) which _serialized_v1_span_to_serialized_v2_span relies on to populate span attributes. Using the unprepared event results in converted V2 GenAI spans missing these attributes.
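
A simplified sketch of the difference between the two arguments. Both functions are hypothetical reductions of `_prepare_event` and `_serialized_v1_span_to_serialized_v2_span`; the real attribute names and the handling of the sdk field are not shown in the finding and are omitted here:

```python
def prepare_event(event, options):
    # Hypothetical reduction of _prepare_event: copy the event and fill in
    # release/environment from the client options when they are missing.
    prepared = dict(event)
    for key in ("release", "environment"):
        if prepared.get(key) is None and options.get(key) is not None:
            prepared[key] = options[key]
    return prepared


def span_attributes_from_event(event):
    # Hypothetical reduction of the V1-to-V2 conversion: copy whichever
    # event-level fields are present onto the span attributes.
    return {
        key: event[key]
        for key in ("release", "environment")
        if event.get(key) is not None
    }


options = {"release": "1.0.0", "environment": "production"}
event = {"type": "transaction"}
event_opt = prepare_event(event, options)

from_raw = span_attributes_from_event(event)
from_prepared = span_attributes_from_event(event_opt)

print(from_raw)
print(from_prepared)
```

Under these assumptions, converting from the unprepared event yields no attributes at all, which matches the symptom described in the finding.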

Sorting key uses 'name' twice instead of 'name' and 'description' - `tests/integrations/google_genai/test_google_genai.py:333`

The sorting lambda was changed from (t.get("name", ""), t.get("description", "")) to (t.get("name", ""), t.get("name", "")), duplicating the name field. The comment on line 331 explicitly states 'sort by name and description for comparison', but the code now sorts by name twice, which makes the secondary sort criterion useless and could produce incorrect ordering when tools have the same name but different descriptions.

Also found at:

tests/integrations/litellm/test_litellm.py:1301-1302
tests/integrations/litellm/test_litellm.py:1348-1350
tests/integrations/langchain/test_langchain.py:264
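
The two sort keys can be compared directly on tools that share a name. The sample data is illustrative only:

```python
tools = [
    {"name": "search", "description": "web search"},
    {"name": "search", "description": "code search"},
    {"name": "calc", "description": "arithmetic"},
]

# Key from the changed test: the second element never breaks a tie.
by_name_twice = sorted(tools, key=lambda t: (t.get("name", ""), t.get("name", "")))

# Key described by the comment: ties on name are resolved by description.
by_name_and_description = sorted(
    tools, key=lambda t: (t.get("name", ""), t.get("description", ""))
)

descriptions_twice = [t["description"] for t in by_name_twice]
descriptions_full = [t["description"] for t in by_name_and_description]

print(descriptions_twice)
print(descriptions_full)
```

Because Python's sort is stable, the duplicated key leaves same-named tools in input order, so a comparison against an expected list depends on the order in which the tools happened to arrive.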

Unused list comprehension performs no validation in error handling test - `tests/integrations/langchain/test_langchain.py:1861-1865`

The list comprehension on lines 1861-1865 computes a list of error events but the result is not assigned to any variable or used in any assertion. The previous code had the same issue ([e for e in events if ...]), but this was an opportunity to fix it. As written, the test for error handling doesn't actually verify anything about captured errors.

Also found at:

tests/integrations/pydantic_ai/test_pydantic_ai.py:800-802
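
A sketch of the pattern and one possible fix, assuming events are dicts with a `"level"` key (the actual filter condition is elided in the finding):

```python
events = [{"level": "info"}, {"level": "error"}]

# As described: the list is built and immediately discarded, so nothing is checked.
[e for e in events if e.get("level") == "error"]

# One possible fix: bind the result and assert on it.
error_events = [e for e in events if e.get("level") == "error"]
assert len(error_events) == 1
```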

ai_client_span1 is unpacked but never tested - `tests/integrations/openai_agents/test_openai_agents.py:1712-1713`

The variable ai_client_span1 is unpacked from the spans list but no assertions are made against it. The previous test version verified ai_client_span properties including description, origin, status, and tags. This reduces test coverage and may miss regressions in the AI client span behavior.

Also found at:

tests/integrations/huggingface_hub/test_huggingface_hub.py:524

Test may pass vacuously without verifying any spans - `tests/integrations/pydantic_ai/test_pydantic_ai.py:1011-1018`

In test_include_prompts_requires_pii, the test iterates over chat_spans to verify messages aren't captured, but there's no assertion ensuring chat_spans is non-empty. If no chat spans are produced (due to a bug or configuration issue), the for-loop executes zero iterations and the test passes without verifying anything. Other similar tests in this file use assert len(chat_spans) >= 1 before the verification loop.

Also found at:

tests/integrations/pydantic_ai/test_pydantic_ai.py:848-851
tests/integrations/langchain/test_langchain.py:946
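
The guard mentioned above can be sketched as a small helper. `check_no_messages` is a hypothetical name, and the span shape (a dict with an `"attributes"` mapping) is assumed from the V2 format described in this review:

```python
def check_no_messages(chat_spans):
    # Fail loudly when there is nothing to verify, instead of looping zero times.
    assert len(chat_spans) >= 1, "expected at least one chat span"
    for span in chat_spans:
        assert "gen_ai.request.messages" not in span["attributes"]


# An empty list now trips the guard rather than passing silently.
try:
    check_no_messages([])
    guard_fired = False
except AssertionError:
    guard_fired = True

# A span without captured messages still passes.
check_no_messages([{"attributes": {}}])

print(guard_fired)
```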

Low

Unused capture_items fixture parameter in test function - `tests/integrations/openai_agents/test_openai_agents.py:3083`

The test function test_openai_agents_message_truncation accepts capture_items as a parameter (changed from capture_events), but the fixture is never called or otherwise used in the test body. The test accesses span data directly via span._data without capturing any items. Either the unused parameter should be removed, or the test should be updated to use the fixture to verify captured items.

Duration: 47m 14s · Tokens: 15.0M in / 197.7k out · Cost: $19.51 (+extraction: $0.04, +merge: $0.01, +fix_gate: $0.02)
