SCOPE-PY-008: Propose New Rule pattern-pydantic-model-dump-mode-json for Python + azure-cosmos
Repository: AzureCosmosDB/cosmosdb-agent-kit
Labels: SCOPE, enhancement, rule:sdk, rule:pattern
Affected Rule: NEW rule (proposed) — category pattern- or sdk-
Severity: CRITICAL
## Summary
The Python `azure-cosmos` SDK serializes request bodies with stdlib `json.dumps(data, separators=(",", ":"))` and no custom encoder. Pydantic v2's default `model_dump()` returns Python-native `datetime` / `UUID` / `Decimal` / `date` / `time` values, which raise `TypeError: Object of type X is not JSON serializable` when passed to `create_item` / `upsert_item` / `replace_item`. Every runtime-verified vulnerable run fails on its first write endpoint call (POST score). Across 25 SCOPE V-B Python Gaming Leaderboard L5 runs the anti-pattern rate is 11/25 = 44%, with the AK-only profile at 40% and the Control profile at 80%. No existing `cosmosdb-best-practices` rule covers this Python-specific pattern.

Scope of this proposal: non-Enum, non-JSON-primitive Pydantic fields (`datetime`, `UUID`, `Decimal`, `date`, `time`). `Enum` serialization is already covered by `sdk-serialization-enums`; `mode="json"` is compatible with that rule's `class X(str, Enum)` guidance and does not conflict.
SDK source (from GitHub): `azure/cosmos/_synchronized_request.py`, function `_request_body_from_data`:

    if isinstance(data, (dict, list, tuple)):
        json_dumped = json.dumps(data, separators=(",", ":"))
        return json_dumped
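
The effect of that call can be reproduced with the standard library alone. The following is a minimal sketch, not SDK code: `request_body_from_data` is a hypothetical stand-in that copies only the branch quoted above, and the sample values are made up for illustration.

```python
import json
from datetime import datetime, timezone
from decimal import Decimal
from uuid import UUID


def request_body_from_data(data):
    """Hypothetical stand-in mirroring the quoted json.dumps call (no default=, no cls=)."""
    if isinstance(data, (dict, list, tuple)):
        return json.dumps(data, separators=(",", ":"))
    return None


# Illustrative values for the non-JSON-primitive types named in this proposal.
samples = {
    "datetime": datetime(2024, 1, 2, 3, 4, 5, tzinfo=timezone.utc),
    "UUID": UUID("12345678-1234-5678-1234-567812345678"),
    "Decimal": Decimal("12.50"),
}

# Collect the error raised for each type when it appears inside a document dict.
errors = {}
for name, value in samples.items():
    try:
        request_body_from_data({"id": "1", "value": value})
    except TypeError as exc:
        errors[name] = str(exc)

# A pre-stringified value passes through without error.
primitive_body = request_body_from_data({"id": "1", "value": "2024-01-02T03:04:05+00:00"})
```

Each entry in `errors` carries the same `Object of type X is not JSON serializable` message reported in the runtime findings, while the string-valued document serializes normally.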
## Observed Behavior
Refined 25-run static scan (pattern: `model_dump(...)` piped to `create_item` / `upsert_item` / `replace_item`, AND a `BaseModel` with a `: datetime`-typed field, AND no `mode="json"` anywhere in the file):

| Profile | Config | Vuln Runs | Rate |
|---|---|---|---|
| P1 | Control (Sonnet 4.5) | 4/5 | 80% |
| P2 | AK + Azure MCP | 4/5 | 80% |
| P3 | AK only | 2/5 | 40% |
| P4 | MS Learn MCP | 0/5 | 0% |
| P5 | No extensions | 1/5 | 20% |
| Total | | 11/25 | 44% |
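
The three scan conditions can be approximated with a few regular expressions. This is a rough sketch of the heuristic as described, not the tooling actually used for the scan; the function name `looks_vulnerable`, the regexes, and the sample sources are all assumptions made for illustration.

```python
import re

# Condition 1: a Pydantic dump and a Cosmos write call both appear in the file.
MODEL_DUMP = re.compile(r"\.model_dump\s*\(")
WRITE_CALL = re.compile(r"\b(?:create_item|upsert_item|replace_item)\s*\(")
# Condition 2: a class-level annotation typed as datetime (e.g. `created_at: datetime`).
DATETIME_FIELD = re.compile(r"^\s*\w+\s*:\s*datetime\b", re.MULTILINE)
# Condition 3: mode="json" is absent from the whole file.
JSON_MODE = re.compile(r"mode\s*=\s*[\"']json[\"']")


def looks_vulnerable(source: str) -> bool:
    """Return True when a source file matches all three scan conditions."""
    return (
        MODEL_DUMP.search(source) is not None
        and WRITE_CALL.search(source) is not None
        and "BaseModel" in source
        and DATETIME_FIELD.search(source) is not None
        and JSON_MODE.search(source) is None
    )


# Hypothetical sample file contents, modeled on the anti-pattern in this issue.
VULNERABLE = '''
class ScoreDocument(BaseModel):
    id: str
    created_at: datetime = Field(alias="createdAt")

async def create(container, doc):
    payload = doc.model_dump(by_alias=True, exclude_none=True)
    return await container.create_item(body=payload)
'''
SAFE = VULNERABLE.replace("exclude_none=True", 'exclude_none=True, mode="json"')

vulnerable_flag = looks_vulnerable(VULNERABLE)
safe_flag = looks_vulnerable(SAFE)
```

A text-level check like this cannot tell whether the dumped dict actually reaches the write call, which is why the static counts are paired with runtime verification.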
Runtime-verified failures (Phase 4): P01 R04, P02 R01, and P03 R04 all fail with the `TypeError` traceback at `azure/cosmos/_synchronized_request.py:66` on the first score submission. P04 R04 succeeds because the author used `mode="json"`. P04 R02 and P05 R04 succeed only because their timestamp fields are typed `str` (pre-stringified via `.isoformat()` in classmethods), which is an accidental dodge, not a deliberate defense.
Anti-pattern (`versionb/profile03/run04/workspace/workspace/app/repositories/score_repository.py`):

    class ScoreDocument(BaseModel):
        id: str
        player_id: str = Field(alias="playerId")
        created_at: datetime = Field(alias="createdAt")  # ← datetime field

    # Repository method; the enclosing class is not shown in this excerpt.
    async def create(self, doc: ScoreDocument) -> dict:
        payload = doc.model_dump(by_alias=True, exclude_none=True)  # ← no mode="json"
        return await self._container.create_item(body=payload)  # 💥 TypeError at runtime
## Expected Behavior
When an agent writes Pydantic payloads containing non-JSON-primitive typed fields to `azure-cosmos`, it should always pass `mode="json"` to `model_dump()`, letting Pydantic convert `datetime` → ISO-8601 str, `UUID` → canonical hyphenated str, `Decimal` → str, and `date`/`time` → ISO str before the SDK sees the dict.
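
The difference between the two dump modes can be checked without a Cosmos DB account. The sketch below assumes Pydantic v2 is installed. The model is a hypothetical variant of `ScoreDocument` with extra `UUID` and `Decimal` fields added for illustration, and `json.dumps` stands in for the SDK's serialization step, so no `create_item` call is made.

```python
import json
from datetime import datetime, timezone
from decimal import Decimal
from uuid import UUID

from pydantic import BaseModel, Field


class ScoreDocument(BaseModel):
    # Hypothetical model for illustration; field names and types are assumptions.
    id: str
    player_id: UUID = Field(alias="playerId")
    score: Decimal
    created_at: datetime = Field(alias="createdAt")


doc = ScoreDocument(
    id="score-1",
    playerId=UUID("12345678-1234-5678-1234-567812345678"),
    score=Decimal("12.50"),
    createdAt=datetime(2024, 1, 2, 3, 4, 5, tzinfo=timezone.utc),
)

# Default (python) mode keeps native objects, so an SDK-style json.dumps fails.
native_payload = doc.model_dump(by_alias=True)
try:
    json.dumps(native_payload, separators=(",", ":"))
    native_error = None
except TypeError as exc:
    native_error = str(exc)

# JSON mode converts every field to a JSON-safe primitive before serialization.
safe_payload = doc.model_dump(by_alias=True, mode="json")
body = json.dumps(safe_payload, separators=(",", ":"))
```

Here `safe_payload` is the dict that would be passed as `body=` to `create_item`, `upsert_item`, or `replace_item`; `native_payload` is the one that triggers the `TypeError`.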
## Proposed Fix
Add a new rule file at `skills/cosmosdb-best-practices/rules/pattern-pydantic-model-dump-mode-json.md`:
---
title: Always pass mode="json" when dumping Pydantic models for Cosmos DB writes
impact: CRITICAL
impactDescription: prevents TypeError on first write when Pydantic models contain datetime, UUID, Decimal, date, or time fields
tags: [sdk, python, serialization, pydantic, datetime, bug-prevention]
---
## Always pass `mode="json"` when dumping Pydantic models for Cosmos DB writes
When serializing a Pydantic v2 model for `azure.cosmos` `create_item`,
`upsert_item`, or `replace_item`, always call `.model_dump(..., mode="json")`.
### Why
The Python `azure-cosmos` SDK serializes request bodies with
`json.dumps(data, separators=(",", ":"))` and no custom encoder
(`azure/cosmos/_synchronized_request.py`, `_request_body_from_data`). Any
`datetime`, `UUID`, `Decimal`, `date`, or `time` value in `data` raises
`TypeError: Object of type X is not JSON serializable` at runtime on the
first write.
Pydantic's default `model_dump()` returns native Python objects. `mode="json"`
converts them to JSON-safe primitives before the SDK sees the dict.
### Incorrect

```python
from datetime import datetime
from pydantic import BaseModel, Field

class ScoreDoc(BaseModel):
    id: str
    submitted_at: datetime = Field(alias="submittedAt")

# `doc` is a ScoreDoc instance; `container` is an async azure-cosmos container client.
# ❌ raises TypeError: Object of type datetime is not JSON serializable
await container.create_item(body=doc.model_dump(by_alias=True))
```

### Correct

```python
# ✅ datetime → ISO-8601 string before azure-cosmos sees it
await container.create_item(body=doc.model_dump(by_alias=True, mode="json"))
```

### Applies to

- `datetime`, `date`, `time`: most common trigger
- `UUID`: second most common
- `Decimal`: financial/precision fields
- Any field with a Pydantic custom serializer returning non-primitive objects

For `Enum` fields, see the related rule `sdk-serialization-enums`
(inherit from `str`, `int`, or use a Pydantic serializer). `mode="json"`
is compatible with that rule and covers `Enum` correctly as well.

## Evidence
- **SDK source:** [`_synchronized_request.py` L60-70](https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/cosmos/azure-cosmos/azure/cosmos/_synchronized_request.py) — `json.dumps` with no `default=` or `cls=`
- **Phase 4a proof-of-concept:** [`versionb/_scratch/`](../../versionb/_scratch/README.md): `poc_vuln.py` (reproduces TypeError), `poc_safe.py` (succeeds with `mode="json"`), `run_poc.py` (runner, exits 0)

      [vuln] create_item -> ✓ Object of type datetime is not JSON serializable
      [safe] create_item -> ✓ created id=poc-safe-a6858ddf
      PoC PASSED — rule pattern-pydantic-model-dump-mode-json is evidence-backed.

- **Phase 4 runtime findings:** [`versionb/docs/runtime-findings.md`](../../versionb/docs/runtime-findings.md) — 6 runs runtime-verified, full 25-run refined matrix
- **Official Microsoft docs (gap):**
  - [Azure Cosmos DB NoSQL Python quickstart](https://learn.microsoft.com/en-us/azure/cosmos-db/nosql/quickstart-python) (updated 2026-03-25): `create_item` examples use plain `dict` literals; no Pydantic / datetime guidance
  - [azure-cosmos 4.15.0 README](https://learn.microsoft.com/en-us/python/api/overview/azure/cosmos-readme): has a "Boolean Data Type" Python-vs-JSON callout, but no equivalent for datetime/UUID/Decimal
- **Related AK rules checked:**
  - `sdk-serialization-enums`: covers Enum only (scope overlap resolved by narrowing this rule to non-Enum types)
  - `model-json-serialization`: Jackson/Java only, does not apply to Python
  - `sdk-python-async-deps`: aiohttp dependency only
## Documentation Gap
**Yes.** Companion issue recommended on `MicrosoftDocs/azure-databases-docs-pr` to add a Pydantic-integration callout in the Python quickstart, mirroring the existing "Boolean Data Type" callout in the SDK README.
## Related Out-of-Scope Bugs
Found during Phase 4 runtime validation but **do not belong to this rule**; each warrants a separate issue proposal:
| Bug | Observed in | Root cause |
|---|---|---|
| Custom `to_dict()` returning raw `datetime` | P03 R05, P05 R02 | Same SDK `json.dumps` limitation; the pattern is a custom serializer, not `model_dump` |
| URL-unsafe `#` in document `id` causes 401 auth-signing mismatch | P04 R02 | Cosmos REST signature is computed over ResourceLink; the HTTP client strips the fragment, the server sees a truncated id, and the signatures don't match |
| `match_condition="IfMatch"` passed as raw string instead of `MatchConditions.IfNotModified` enum | P05 R04 | `azure.core.MatchConditions` enum required; string rejected with `TypeError: Invalid match condition`. Note: .NET `IfMatch` maps to Python `IfNotModified` |