test huggingface_hub

b92ae36
Draft

feat: Send GenAI spans as V2 envelope items #6079

@sentry/warden / warden: code-review completed Apr 17, 2026 in 26m 7s

6 issues

code-review: Found 6 issues (2 high, 3 medium, 1 low)

High

Wrong variable `event` used instead of `event_opt` causes missing attributes in GenAI V2 spans - `sentry_sdk/client.py:1124`

The code passes `event` (the original input) to `_serialized_v1_span_to_serialized_v2_span` at line 1124, but should pass `event_opt` (the prepared event). The `_serialized_v1_span_to_serialized_v2_span` function extracts `user`, `release`, `environment`, `transaction`, `trace_context`, and `sdk_info` from the event parameter. These attributes are populated by `_prepare_event()` and exist on `event_opt`, not the original `event`. This will cause V2 GenAI spans to be missing user information, release/environment metadata, and SDK info.
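The failure mode can be illustrated with a minimal sketch. It assumes, as the finding states, that the enrichment is only visible on the returned event. `prepare_event` and `to_v2_span` are hypothetical stand-ins for `_prepare_event()` and `_serialized_v1_span_to_serialized_v2_span`, and the attribute keys and values are illustrative; the real SDK functions have different signatures and do considerably more.

```python
from typing import Any, Dict

Event = Dict[str, Any]


def prepare_event(event: Event) -> Event:
    """Hypothetical stand-in for _prepare_event(): returns an enriched copy."""
    prepared = dict(event)
    prepared.setdefault("release", "my-app@1.0.0")      # illustrative value
    prepared.setdefault("environment", "production")    # illustrative value
    return prepared


def to_v2_span(span: Dict[str, Any], event: Event) -> Dict[str, Any]:
    """Hypothetical stand-in for the V1 -> V2 conversion.

    Copies event-level metadata onto the span's attributes, so it can only
    see what is present on the event it is given.
    """
    attributes = dict(span.get("data", {}))
    for key in ("release", "environment"):
        if key in event:
            attributes[f"sentry.{key}"] = event[key]
    return {"name": span.get("description"), "attributes": attributes}


event = {"type": "transaction", "spans": [{"description": "chat", "data": {}}]}
event_opt = prepare_event(event)

span = event["spans"][0]
buggy = to_v2_span(span, event)      # unprepared event: metadata is missing
fixed = to_v2_span(span, event_opt)  # prepared event: metadata is present
```

Under these assumptions, `buggy["attributes"]` is empty while `fixed["attributes"]` carries the release and environment.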

Test never verifies span attributes because spans list is always empty - `tests/integrations/litellm/test_litellm.py:1020-1023`

The code at line 1020 filters for `item.type == "span"` but `items` was captured on line 945 with only `capture_items("transaction")`, not including "span" type. This means `spans` will always be an empty list, and the for-loop assertion at lines 1021-1023 will never execute, silently passing without verifying any span attributes. The companion `test_async_multiple_providers` function correctly uses `capture_items("transaction", "span")` at line 1040.

Also found at:

  • tests/integrations/litellm/test_litellm.py:945
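A short sketch of why a loop over an empty list passes vacuously, and of one possible guard. The `Item` class and the attribute value are hypothetical; they only mimic the shape of the captured items described above.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Item:
    """Hypothetical stand-in for a captured envelope item."""
    type: str
    payload: Dict[str, Any] = field(default_factory=dict)


# Only transaction items were captured, mirroring capture_items("transaction").
items: List[Item] = [Item(type="transaction", payload={"transaction": "demo"})]

spans = [item for item in items if item.type == "span"]

checked = 0
for span in spans:
    # Never reached: spans is empty, so no assertion is ever evaluated.
    assert span.payload["attributes"]["sentry.op"] == "gen_ai.chat"
    checked += 1


def assert_spans_present(candidates: List[Item]) -> None:
    """Guard that makes the test fail loudly instead of passing vacuously."""
    assert len(candidates) > 0, "expected at least one span item to be captured"
```

Calling `assert_spans_present(spans)` before the loop would turn the silent pass into a failure until "span" is added to the captured item types.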

Medium

Sorting key uses 'name' twice instead of 'name' and 'description' as documented - `tests/integrations/google_genai/test_google_genai.py:330`

The sorting lambda was changed from `t.get("description", "")` to `t.get("name", "")` for the second element of the tuple key. This contradicts the comment on line 328 which states "sort by name and description for comparison". Using `name` twice is redundant and removes the secondary sort by description, which could cause test flakiness if tools have the same name but different descriptions.
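The effect of the redundant key can be reproduced in isolation. The tool dictionaries below are made up for illustration; because Python's `sorted` is stable, items with equal keys keep their input order, so the result depends on the order in which the tools arrive.

```python
from typing import Dict, List

Tool = Dict[str, str]

# Two illustrative tools sharing a name but differing in description.
tools: List[Tool] = [
    {"name": "search", "description": "web search"},
    {"name": "search", "description": "code search"},
]


def sort_by_name_twice(ts: List[Tool]) -> List[Tool]:
    """The flagged key: the second element repeats the first."""
    return sorted(ts, key=lambda t: (t.get("name", ""), t.get("name", "")))


def sort_by_name_and_description(ts: List[Tool]) -> List[Tool]:
    """The documented key: name first, description as tie-breaker."""
    return sorted(ts, key=lambda t: (t.get("name", ""), t.get("description", "")))


forward = tools
backward = list(reversed(tools))

# With the flagged key, ties are not broken, so input order leaks through.
order_dependent = sort_by_name_twice(forward) != sort_by_name_twice(backward)

# With the documented key, both input orders yield the same result.
order_independent = (
    sort_by_name_and_description(forward) == sort_by_name_and_description(backward)
)
```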

Test uses incorrect field name 'attributes' instead of 'data' for inline_data - `tests/integrations/google_genai/test_google_genai.py:2153`

The test was changed to use `"attributes"` instead of `"data"` as the field name within `inline_data`. However, the implementation code in `sentry_sdk/ai/utils.py` (line 286) and `sentry_sdk/integrations/google_genai/utils.py` (line 378) both expect the field to be named `"data"`. This test will not properly validate the inline_data handling logic since the code looks for `.get("data", "")` which will return an empty string for input with `"attributes"`.
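A minimal sketch of the lookup behavior described above. `extract_inline_data` is a hypothetical helper that only mirrors the `.get("data", "")` call mentioned in the finding; it is not the SDK's implementation, and the payload values are made up.

```python
from typing import Any, Dict


def extract_inline_data(part: Dict[str, Any]) -> str:
    """Hypothetical helper: reads the payload the way the finding describes."""
    inline = part.get("inline_data", {})
    return inline.get("data", "")


# Input shaped like the changed test: the payload sits under "attributes".
renamed_part = {"inline_data": {"mime_type": "image/png", "attributes": "aGVsbG8="}}

# Input shaped like what the implementation expects: payload under "data".
expected_part = {"inline_data": {"mime_type": "image/png", "data": "aGVsbG8="}}

from_renamed = extract_inline_data(renamed_part)    # falls back to the default
from_expected = extract_inline_data(expected_part)  # finds the payload
```

With the renamed key, the helper never sees the payload, so any downstream handling of that payload would go unexercised by the test.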

Hardcoded SDK version will break tests on version bump - `tests/integrations/huggingface_hub/test_huggingface_hub.py:523`

The test hardcodes `"sentry.sdk.version": "2.58.0"` instead of using `mock.ANY` (like the openai tests in this same PR) or importing `VERSION` from `sentry_sdk.consts`. When the SDK version is incremented, these tests will fail. This pattern appears in all 7 occurrences in this file.

Also found at:

  • tests/integrations/huggingface_hub/test_huggingface_hub.py:676
  • tests/integrations/huggingface_hub/test_huggingface_hub.py:942
  • tests/integrations/huggingface_hub/test_huggingface_hub.py:1038
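The two alternatives named in the finding can be sketched as follows. To keep the example self-contained, `CURRENT_VERSION` is a hypothetical stand-in for `VERSION` imported from `sentry_sdk.consts`, set to an imagined post-bump value; the `sentry.op` value is illustrative.

```python
from unittest import mock

# Hypothetical stand-in for sentry_sdk.consts.VERSION after a version bump.
CURRENT_VERSION = "2.59.0"

# What the SDK would emit after the bump (illustrative attributes).
actual = {"sentry.op": "gen_ai.chat", "sentry.sdk.version": CURRENT_VERSION}

# Expectation as written in the flagged tests: pinned to a literal.
hardcoded = {"sentry.op": "gen_ai.chat", "sentry.sdk.version": "2.58.0"}

# Option 1: ignore the version entirely; mock.ANY compares equal to anything.
version_agnostic = {"sentry.op": "gen_ai.chat", "sentry.sdk.version": mock.ANY}

# Option 2: follow the constant, so the expectation tracks every bump.
pinned_to_constant = {
    "sentry.op": "gen_ai.chat",
    "sentry.sdk.version": CURRENT_VERSION,
}

hardcoded_matches = actual == hardcoded
agnostic_matches = actual == version_agnostic
constant_matches = actual == pinned_to_constant
```

`mock.ANY` is the looser check; comparing against the imported constant still verifies that the attribute carries the real version string.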

Low

Direct dictionary access may raise KeyError if span lacks expected attributes - `tests/integrations/litellm/test_litellm.py:1283-1285`

In `test_no_integration` and `test_async_no_integration`, the code filters spans using `x["attributes"]["sentry.op"]` and `x["attributes"]["sentry.origin"]`. If a captured span lacks these attributes (e.g., one emitted by other instrumentation), the lookup raises a `KeyError`. The same file uses the safer `.get()` pattern at line 1427, suggesting awareness of this issue.
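The difference between the two access patterns can be shown on made-up span dictionaries. The attribute values (`"gen_ai.chat"`, `"auto.ai.litellm"`, `"http.method"`) are illustrative and not taken from the test file.

```python
from typing import Any, Dict, List

Span = Dict[str, Any]

spans: List[Span] = [
    {"attributes": {"sentry.op": "gen_ai.chat", "sentry.origin": "auto.ai.litellm"}},
    {"attributes": {"http.method": "GET"}},  # other instrumentation: no sentry.op
    {},                                      # no attributes at all
]


def filter_strict(candidates: List[Span]) -> List[Span]:
    """Direct indexing: raises KeyError on the first span lacking the keys."""
    return [x for x in candidates if x["attributes"]["sentry.op"] == "gen_ai.chat"]


def filter_safe(candidates: List[Span]) -> List[Span]:
    """Chained .get(): spans without the keys are simply skipped."""
    return [
        x
        for x in candidates
        if x.get("attributes", {}).get("sentry.op") == "gen_ai.chat"
    ]


try:
    filter_strict(spans)
    strict_raised = False
except KeyError:
    strict_raised = True

matching = filter_safe(spans)
```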


Duration: 25m 53s · Tokens: 7.0M in / 84.2k out · Cost: $9.66 (+extraction: $0.00, +merge: $0.00, +fix_gate: $0.01)

Annotations

Check failure on line 1124 in sentry_sdk/client.py

@sentry-warden sentry-warden / warden: code-review

Wrong variable `event` used instead of `event_opt` causes missing attributes in GenAI V2 spans

The code passes `event` (the original input) to `_serialized_v1_span_to_serialized_v2_span` at line 1124, but should pass `event_opt` (the prepared event). The `_serialized_v1_span_to_serialized_v2_span` function extracts `user`, `release`, `environment`, `transaction`, `trace_context`, and `sdk_info` from the event parameter. These attributes are populated by `_prepare_event()` and exist on `event_opt`, not the original `event`. This will cause V2 GenAI spans to be missing user information, release/environment metadata, and SDK info.

Check failure on line 1023 in tests/integrations/litellm/test_litellm.py

@sentry-warden sentry-warden / warden: code-review

Test never verifies span attributes because spans list is always empty

The code at line 1020 filters for `item.type == "span"` but `items` was captured on line 945 with only `capture_items("transaction")`, not including "span" type. This means `spans` will always be an empty list, and the for-loop assertion at lines 1021-1023 will never execute, silently passing without verifying any span attributes. The companion `test_async_multiple_providers` function correctly uses `capture_items("transaction", "span")` at line 1040.

Check failure on line 945 in tests/integrations/litellm/test_litellm.py

@sentry-warden sentry-warden / warden: code-review

[7HE-V7N] Test never verifies span attributes because spans list is always empty (additional location)

Check warning on line 330 in tests/integrations/google_genai/test_google_genai.py

@sentry-warden sentry-warden / warden: code-review

Sorting key uses 'name' twice instead of 'name' and 'description' as documented

The sorting lambda was changed from `t.get("description", "")` to `t.get("name", "")` for the second element of the tuple key. This contradicts the comment on line 328 which states "sort by name and description for comparison". Using `name` twice is redundant and removes the secondary sort by description, which could cause test flakiness if tools have the same name but different descriptions.

Check warning on line 2153 in tests/integrations/google_genai/test_google_genai.py

@sentry-warden sentry-warden / warden: code-review

Test uses incorrect field name 'attributes' instead of 'data' for inline_data

The test was changed to use `"attributes"` instead of `"data"` as the field name within `inline_data`. However, the implementation code in `sentry_sdk/ai/utils.py` (line 286) and `sentry_sdk/integrations/google_genai/utils.py` (line 378) both expect the field to be named `"data"`. This test will not properly validate the inline_data handling logic since the code looks for `.get("data", "")` which will return an empty string for input with `"attributes"`.

Check warning on line 523 in tests/integrations/huggingface_hub/test_huggingface_hub.py

@sentry-warden sentry-warden / warden: code-review

Hardcoded SDK version will break tests on version bump

The test hardcodes `"sentry.sdk.version": "2.58.0"` instead of using `mock.ANY` (like the openai tests in this same PR) or importing `VERSION` from `sentry_sdk.consts`. When the SDK version is incremented, these tests will fail. This pattern appears in all 7 occurrences in this file.

Check warning on line 676 in tests/integrations/huggingface_hub/test_huggingface_hub.py

@sentry-warden sentry-warden / warden: code-review

[XGB-JDN] Hardcoded SDK version will break tests on version bump (additional location)

Check warning on line 942 in tests/integrations/huggingface_hub/test_huggingface_hub.py

@sentry-warden sentry-warden / warden: code-review

[XGB-JDN] Hardcoded SDK version will break tests on version bump (additional location)

Check warning on line 1038 in tests/integrations/huggingface_hub/test_huggingface_hub.py

@sentry-warden sentry-warden / warden: code-review

[XGB-JDN] Hardcoded SDK version will break tests on version bump (additional location)