fix: Image Pydantic schema to properly handle Union[str, Image] deserialization #7195

Open
Sean-Kenneth-Doherty wants to merge 1 commit into microsoft:main from Sean-Kenneth-Doherty:fix-image-mixed-content-deserialization

Sean-Kenneth-Doherty commented Jan 31, 2026

Why are these changes needed?

Fixes #7170 - UserMessage JSON deserialization fails when content contains both text and Image items.

Problem

When UserMessage.content contains mixed content such as [image, "describe this"], deserialization fails with:

Expected dict or Image instance, got <class 'str'>

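The Image Pydantic schema used core_schema.any_schema(), so every candidate value from Union[str, Image], including plain strings, reached the Image validator. For a string, that validator raised TypeError. Pydantic does not treat TypeError as a validation failure, so the error propagated instead of letting the union fall back to the str branch.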

Solution

Make Image validation narrow and Union-friendly:

  1. pass through existing Image instances
  2. deserialize dict payloads with a data key through the existing base64 path

This keeps strings out of the Image validator and lets Union[str, Image] choose the string branch normally.

Related issue number

Closes #7170

Checks

Testing

Local checks:

  • uv run --package autogen-core pytest packages/autogen-core/tests/test_serialization.py - 15 passed
  • uv run --package autogen-core ruff check packages/autogen-core/src/autogen_core/_image.py packages/autogen-core/tests/test_serialization.py - passed
  • uv run --package autogen-core mypy --config-file ../../pyproject.toml src/autogen_core/_image.py tests/test_serialization.py - passed

Hosted checks:

  • GitGuardian Security Checks - passed
  • license/cla - passed

Sean-Kenneth-Doherty (Author) commented

@microsoft-github-policy-service agree

Sean-Kenneth-Doherty force-pushed the fix-image-mixed-content-deserialization branch 2 times, most recently from 86240bc to b5ffb62, on May 16, 2026 21:06

When UserMessage.content contains mixed Image and str items, Pydantic tries Image validation before the string branch. The previous any_schema validator raised TypeError for strings, preventing Union[str, Image] fallback.

Use a narrow Image schema that accepts only Image instances or serialized dicts, and add a regression test to ensure mixed text/image UserMessage content round-trips through JSON in both item orders.

Fixes microsoft#7170.

Sean-Kenneth-Doherty force-pushed the fix-image-mixed-content-deserialization branch from b5ffb62 to a28d655 on May 16, 2026 21:06

Sean-Kenneth-Doherty (Author) commented

Refreshed this PR today and tightened the patch:

  • rebuilt the branch with a clean single commit
  • replaced the standalone 5-test file with a focused regression in test_serialization.py
  • kept coverage for mixed Image/str content in both item orders
  • confirmed the PR is mergeable after the update

Validation run locally:

  • uv run --package autogen-core pytest packages/autogen-core/tests/test_serialization.py - 15 passed
  • uv run --package autogen-core ruff check packages/autogen-core/src/autogen_core/_image.py packages/autogen-core/tests/test_serialization.py - passed
  • uv run --package autogen-core mypy --config-file ../../pyproject.toml src/autogen_core/_image.py tests/test_serialization.py - passed

GitHub checks are also passing: GitGuardian Security Checks and license/cla.

Development

Successfully merging this pull request may close these issues.

when UserMessage have both string and Image data,JSON deserialization cause error
