Skip to content

Commit d0e0aa2

Browse files
committed
docs: Add comprehensive analysis of issue #3281 and PR #3280 review
This commit contains detailed analysis materials for investigating and reviewing the nullable fields bug with Google GenAI bump. Analysis Materials: - issue_3281_analysis.md: Root cause analysis and detailed bug flow - pr_3280_review.md: Comprehensive review of the proposed fix Test Scripts (for reproduction and validation): - test_nullable_issue.py: Basic reproduction of the Pydantic schema issue - test_sanitize_issue.py: Demonstrates the problematic transformation - test_exact_adk_flow.py: Traces the complete ADK processing flow - test_anyof_unwrap.py: Shows google-genai can unwrap correctly - test_unwrap_with_description.py: Tests unwrapping with metadata - test_gemini_api_error.py: Analyzes the schema structure problem - test_fix_proposal.py: Initial fix approach (tracking anyOf context) - test_pr_3280_fix.py: Validates the PR #3280 approach - test_pr_edge_cases.py: Comprehensive edge case testing Key Findings: - Root cause: _sanitize_schema_type transforms {"type": "null"} to {"type": ["object", "null"]} which breaks google-genai unwrapping - PR #3280 correctly fixes this by removing _sanitize_schema_type entirely and deferring to google-genai>=1.45.0 - All edge cases are properly handled by the new approach - The fix is minimal, clean, and improves type handling overall Recommendation: APPROVE PR #3280 Related: #3281, PR #3280
1 parent e6f7363 commit d0e0aa2

11 files changed

Lines changed: 748 additions & 0 deletions

issue_3281_analysis.md

Lines changed: 105 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,105 @@
1+
# Issue #3281: Nullable Fields Broken with Google GenAI Bump
2+
3+
## Root Cause Analysis
4+
5+
### The Problem
6+
7+
When defining MCP tools with optional parameters using Pydantic's `int | None` or `str | None` type hints (OpenAPI 3.1 style), the Gemini API returns a 400 error:
8+
9+
```
10+
Unable to submit request because `create_payee` functionDeclaration `parameters.person.age` schema specified other fields alongside any_of. When using any_of, it must be the only field set.
11+
```
12+
13+
### The Bug
14+
15+
The issue is in **`_sanitize_schema_type` function** in [`src/google/adk/tools/_gemini_schema_util.py:96-97`](src/google/adk/tools/_gemini_schema_util.py#L96-L97):
16+
17+
```python
18+
elif schema.get("type") == "null":
19+
schema["type"] = ["object", "null"]
20+
```
21+
22+
This transformation breaks the `anyOf` unwrapping logic in `google-genai`'s `Schema.from_json_schema`.
23+
24+
### Detailed Flow
25+
26+
1. **Pydantic generates** (OpenAPI 3.1 style):
27+
```json
28+
{
29+
"anyOf": [
30+
{"type": "integer", "minimum": 0},
31+
{"type": "null"}
32+
],
33+
"description": "Person's age"
34+
}
35+
```
36+
37+
2. **ADK's `_sanitize_schema_formats_for_gemini` recursively processes** the `anyOf` elements
38+
- First element `{"type": "integer", "minimum": 0}` → stays as-is
39+
- Second element `{"type": "null"}`**transforms to `{"type": ["object", "null"]}`** (LINE 97)
40+
41+
3. **Result after sanitization**:
42+
```json
43+
{
44+
"any_of": [
45+
{"type": "integer", "minimum": 0},
46+
{"type": ["object", "null"]} ← PROBLEM!
47+
],
48+
"description": "Person's age"
49+
}
50+
```
51+
52+
4. **google-genai's unwrapping logic** (in `Schema.from_json_schema`, lines 2232-2250):
53+
- Expects: `{"type": "null"}` or `{"nullable": True}`
54+
- Gets: `{"type": ["object", "null"]}`
55+
- **Fails to unwrap** because it doesn't match the expected pattern
56+
57+
5. **Result**: Schema sent to Vertex AI has:
58+
```
59+
any_of: [Schema(type=INTEGER), Schema(type=OBJECT, nullable=True)]
60+
description: "Person's age"
61+
type: OBJECT
62+
```
63+
This violates Vertex AI's constraint: "when using any_of, it must be the only field set"
64+
65+
### Why This Line Exists
66+
67+
The `_sanitize_schema_type` function line 96-97 was likely added to handle standalone `{"type": "null"}` schemas, converting them to something Gemini can understand. However, it should **NOT** be applied to schemas that are part of an `anyOf` array used for representing nullable types.
68+
69+
### The Fix
70+
71+
The `_sanitize_schema_type` function should NOT transform `{"type": "null"}` when it's inside an `anyOf` array that's being used to represent nullable types. The google-genai library is designed to handle this unwrapping automatically.
72+
73+
**Recommended Solution**: Modify `_sanitize_schema_formats_for_gemini` to track when it's processing elements inside `any_of`, and skip the `{"type": "null"}``{"type": ["object", "null"]}` transformation in that context.
74+
75+
**Code Changes Required**:
76+
1. Add an `in_any_of` parameter to `_sanitize_schema_formats_for_gemini` and `_sanitize_schema_type`
77+
2. Set `in_any_of=True` when recursively processing elements of `any_of` lists
78+
3. Skip the line 96-97 transformation when `in_any_of=True`
79+
4. Update the test `test_sanitize_schema_formats_for_gemini_nullable` to expect `{"type": "null"}` instead of `{"type": ["object", "null"]}`
80+
5. Add an end-to-end test using `_to_gemini_schema` to verify the anyOf unwrapping works correctly
81+
82+
**Alternative Solutions** (not recommended):
83+
- **Option 2**: Remove all anyOf pre-processing and let google-genai handle everything (may break other cases)
84+
- **Option 3**: Remove the line 96-97 transformation entirely (would break standalone `{"type": "null"}` schemas)
85+
86+
### Impact
87+
88+
- Affects all tools with optional parameters using Python 3.10+ union syntax (`Type | None`)
89+
- Affects MCP tools particularly, as they often use optional parameters
90+
- Regression introduced with google-genai v1.45.0+ which added stricter anyOf validation
91+
92+
### Workarounds
93+
94+
Until fixed, users can:
95+
1. Avoid using optional parameters (not practical)
96+
2. Downgrade google-genai to v1.44.0 (may have other implications)
97+
3. Use OpenAPI 3.0 style with `required` field instead of nullable (breaks Pydantic patterns)
98+
99+
## Test Cases
100+
101+
See:
102+
- `test_nullable_issue.py` - Demonstrates the original Pydantic schema
103+
- `test_sanitize_issue.py` - Shows the problematic transformation
104+
- `test_exact_adk_flow.py` - Confirms the unwrapping failure
105+
- `test_anyof_unwrap.py` - Shows google-genai can unwrap when given correct input

pr_3280_review.md

Lines changed: 99 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,99 @@
1+
# PR #3280 Review: Fix for Nullable Types with GenAI Bump
2+
3+
## Summary
4+
**Status: ✅ CORRECT FIX**
5+
6+
PR #3280 correctly fixes issue #3281 by removing the problematic `_sanitize_schema_type` function and deferring to google-genai>=1.45.0 for proper nullable type handling.
7+
8+
## Root Cause (Confirmed)
9+
The bug was in `_sanitize_schema_type` function at lines 96-97:
10+
```python
11+
elif schema.get("type") == "null":
12+
schema["type"] = ["object", "null"]
13+
```
14+
15+
This transformation broke google-genai's anyOf unwrapping logic, causing Vertex AI to reject schemas with the error:
16+
> "schema specified other fields alongside any_of. When using any_of, it must be the only field set"
17+
18+
## The Fix
19+
20+
### Code Changes ✅
21+
1. **Removes entire `_sanitize_schema_type` function** (lines 77-99)
22+
- This was causing the conflicting type transformations
23+
- google-genai>=1.45.0 handles all these cases properly
24+
25+
2. **Simplifies `_sanitize_schema_formats_for_gemini`**
26+
- Removes call to `_sanitize_schema_type`
27+
- Adds simple empty schema handling: `if not snake_case_schema: snake_case_schema["type"] = "object"`
28+
- Passes schemas to google-genai without mangling the type field
29+
30+
3. **Bumps google-genai requirement** to `>=1.45.0`
31+
- Version 1.45.0 has improved nullable type handling
32+
- Includes the anyOf unwrapping logic that ADK was breaking
33+
34+
### Test Changes ✅
35+
36+
1. **Removes `test_sanitize_schema_formats_for_gemini_nullable`**
37+
- This test was asserting INCORRECT behavior
38+
- Expected `{"type": ["object", "null"]}` which breaks unwrapping
39+
- Correct to remove it
40+
41+
2. **Adds `test_to_gemini_schema_any_of_nullable`**
42+
- Tests the end-to-end anyOf unwrapping
43+
- Verifies `{"anyOf": [{"type": "string"}, {"type": "null"}]}``{type: STRING, nullable: True}`
44+
- This is the critical test for the fix ✅
45+
46+
3. **Updates `test_to_gemini_schema_array_string_types`**
47+
- Changes `"object_nullable"` to `"nullable_object"` with proper schema
48+
- Adds `"only_null"` test case for standalone null type
49+
- Updates `"multi_types_nullable"` expectation to use anyOf (more correct)
50+
- All changes reflect the new, more correct behavior ✅
51+
52+
## Edge Case Analysis ✅
53+
54+
All edge cases previously handled by `_sanitize_schema_type` are now handled by google-genai>=1.45.0:
55+
56+
| Case | Old Behavior | New Behavior | Status |
57+
|------|--------------|--------------|--------|
58+
| Empty schema `{}` | Adds `type: object` | Still adds `type: object` | ✅ Same |
59+
| Standalone `{"type": "null"}` |`["object", "null"]` |`nullable: True` | ✅ Better |
60+
| `["string", "null"]` | Keeps as-is |`{type: STRING, nullable: True}` | ✅ Better |
61+
| `["null", "integer"]` |`["integer", "null"]` |`{type: INTEGER, nullable: True}` | ✅ Better |
62+
| `["string", "integer", "null"]` |`["string", "null"]` (loses integer!) |`{any_of: [STRING, INTEGER], nullable: True}` | ✅ Much Better |
63+
| anyOf with null |`{type: ["object", "null"]}` (broken!) | → unwraps correctly |**FIXES THE BUG** |
64+
65+
## Concerns Addressed
66+
67+
### ❓ Why was `_sanitize_schema_type` added originally?
68+
- Added in commit b1a74d09 to "tolerate more cases"
69+
- Was needed when google-genai had less sophisticated type handling
70+
- With google-genai>=1.45.0, it's now redundant and harmful
71+
72+
### ❓ Will this break existing code?
73+
- **No breaking changes for valid use cases**
74+
- Minor behavior improvement for multi-type unions (more correct now)
75+
- The bug being fixed was preventing nullable types from working at all
76+
77+
### ❓ Are the test changes safe?
78+
- Yes, the removed test was testing INCORRECT behavior
79+
- New test verifies the actual desired outcome
80+
- Updated tests reflect more correct type handling
81+
82+
## Recommendation
83+
84+
**✅ APPROVE AND MERGE**
85+
86+
This PR:
87+
1. ✅ Correctly identifies and fixes the root cause
88+
2. ✅ Properly defers to google-genai's improved handling
89+
3. ✅ Includes appropriate test changes
90+
4. ✅ Handles all edge cases correctly
91+
5. ✅ Has minimal risk of regressions
92+
6. ✅ Actually IMPROVES handling of complex union types
93+
94+
## Additional Notes
95+
96+
- The PR author did excellent debugging work (see screenshots in PR description)
97+
- The fix aligns with how google-genai was designed to work in v1.45.0+
98+
- This is the cleanest solution - removing conflicting logic rather than adding workarounds
99+
- The requirement bump to google-genai>=1.45.0 is necessary and correct

test_anyof_unwrap.py

Lines changed: 43 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,43 @@
1+
"""Test if google-genai properly unwraps anyOf for nullable types."""
2+
3+
from google.genai.types import JSONSchema, Schema
4+
from pydantic import BaseModel, Field
5+
import json
6+
7+
class Person(BaseModel):
8+
name: str = Field(description="Person's name")
9+
age: int | None = Field(description="Person's age", ge=0)
10+
11+
# Generate the Pydantic schema
12+
pydantic_schema = Person.model_json_schema()
13+
14+
print("=== Original Pydantic Schema ===")
15+
print(json.dumps(pydantic_schema, indent=2))
16+
17+
# Test: Call google-genai's Schema.from_json_schema directly
18+
json_schema_obj = JSONSchema(**pydantic_schema)
19+
gemini_schema = Schema.from_json_schema(
20+
json_schema=json_schema_obj,
21+
api_option='VERTEX_AI'
22+
)
23+
24+
print("\n=== Gemini Schema (via google-genai directly) ===")
25+
print(gemini_schema)
26+
27+
print("\n=== Analyzing 'age' field ===")
28+
age_schema = gemini_schema.properties['age']
29+
print(f"Type: {age_schema.type}")
30+
print(f"Nullable: {age_schema.nullable}")
31+
print(f"Has any_of: {age_schema.any_of is not None}")
32+
if age_schema.any_of:
33+
print(f"Number of any_of options: {len(age_schema.any_of)}")
34+
for i, option in enumerate(age_schema.any_of):
35+
option_dict = option.model_dump(exclude_unset=True)
36+
print(f" Option {i}: {option_dict}")
37+
38+
print("\n=== Test Summary ===")
39+
if age_schema.any_of:
40+
print("❌ PROBLEM: anyOf was NOT unwrapped. This will cause Vertex AI error.")
41+
print("The schema has both any_of and other fields (type, description, title).")
42+
else:
43+
print("✅ GOOD: anyOf was successfully unwrapped to nullable field.")

test_exact_adk_flow.py

Lines changed: 75 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,75 @@
1+
"""Test the EXACT flow that ADK uses, including the problematic transformation."""
2+
3+
from google.genai.types import JSONSchema, Schema
4+
from google.adk.tools._gemini_schema_util import (
5+
_sanitize_schema_formats_for_gemini,
6+
_dereference_schema,
7+
_ExtendedJSONSchema
8+
)
9+
from google.adk.utils.variant_utils import get_google_llm_variant
10+
import json
11+
12+
# This is what Pydantic generates for int | None
13+
pydantic_schema = {
14+
"properties": {
15+
"age": {
16+
"anyOf": [
17+
{"minimum": 0, "type": "integer"},
18+
{"type": "null"}
19+
],
20+
"description": "Person's age",
21+
"title": "Age"
22+
}
23+
},
24+
"type": "object"
25+
}
26+
27+
print("=== Step 1: Original Pydantic Schema ===")
28+
print(json.dumps(pydantic_schema, indent=2))
29+
30+
# ADK processing
31+
dereferenced = _dereference_schema(pydantic_schema)
32+
print("\n=== Step 2: After _dereference_schema ===")
33+
print(json.dumps(dereferenced, indent=2))
34+
35+
sanitized = _sanitize_schema_formats_for_gemini(dereferenced)
36+
print("\n=== Step 3: After _sanitize_schema_formats_for_gemini ===")
37+
print(json.dumps(sanitized, indent=2))
38+
39+
# The key problem: note the 'age' field after sanitization
40+
print("\n=== Critical Issue in 'age' field ===")
41+
age_sanitized = sanitized['properties']['age']
42+
print(json.dumps(age_sanitized, indent=2))
43+
print(f"\nNotice: The second element of any_of has type={age_sanitized['any_of'][1]['type']}")
44+
print("It's ['object', 'null'] instead of just 'null'!")
45+
print("This is due to _sanitize_schema_type being called on {'type': 'null'}")
46+
47+
# Now convert to Gemini schema
48+
print("\n=== Step 4: Converting to Gemini Schema ===")
49+
try:
50+
json_schema_obj = _ExtendedJSONSchema.model_validate(sanitized)
51+
gemini_schema = Schema.from_json_schema(
52+
json_schema=json_schema_obj,
53+
api_option=get_google_llm_variant()
54+
)
55+
print("Gemini Schema for 'age':")
56+
age_schema = gemini_schema.properties['age']
57+
print(f" Type: {age_schema.type}")
58+
print(f" Nullable: {age_schema.nullable}")
59+
print(f" any_of: {age_schema.any_of}")
60+
61+
if age_schema.any_of:
62+
print("\n❌ PROBLEM CONFIRMED!")
63+
print("The schema was NOT unwrapped because:")
64+
print(" The second element is {'type': ['object', 'null']} instead of {'type': 'null'}")
65+
print(" So it doesn't match the unwrapping condition")
66+
for i, option in enumerate(age_schema.any_of):
67+
option_dict = option.model_dump(exclude_unset=True)
68+
print(f" Option {i}: {option_dict}")
69+
else:
70+
print("\n✅ Schema was unwrapped successfully")
71+
72+
except Exception as e:
73+
print(f"Error: {e}")
74+
import traceback
75+
traceback.print_exc()

0 commit comments

Comments
 (0)