Discriminated-union tool parameters lose their schema and fail validation #989

@planetf1

Problem

A tool parameter typed as a Pydantic discriminated union (Annotated[A | B, Field(discriminator="...")], with or without | None) produces an Ollama tool schema of {"type": "string"}. The $defs block is retained but unreferenced, and any dict payload is rejected by validate_tool_arguments.

Reproducer

from typing import Annotated, Literal
from pydantic import BaseModel, Field

from mellea.backends.tools import (
    MelleaTool,
    convert_function_to_ollama_tool,
    validate_tool_arguments,
)


class Cat(BaseModel):
    kind: Literal["cat"]
    name: str


class Dog(BaseModel):
    kind: Literal["dog"]
    name: str
    breed: str


def act(pet: Annotated[Cat | Dog, Field(discriminator="kind")]) -> str:
    """Act on a pet.

    Args:
        pet: the pet
    """
    return "ok"


tool = convert_function_to_ollama_tool(act)
print(tool.function.parameters.model_dump(exclude_none=True)["properties"])
# {'pet': {'type': 'string', 'description': 'the pet'}}

mt = MelleaTool.from_callable(act)
validate_tool_arguments(
    mt,
    {"pet": {"kind": "dog", "name": "Rex", "breed": "lab"}},
    strict=True,
)
# ValidationError: Input should be a valid string

Expected behaviour

The pet schema should preserve the union structure (the two object schemas plus the discriminator) so an LLM can emit a valid payload, and validate_tool_arguments should accept a correctly-shaped dict and reject an incorrectly-shaped one.

Actual behaviour

| Input | Expected | Actual |
| --- | --- | --- |
| {"pet": {"kind": "dog", "name": "Rex", "breed": "lab"}} | accepted | rejected (Input should be a valid string) |
| {"pet": "just a string"} | rejected | accepted |
| {"pet": {"name": "Rex"}} (missing discriminator) | rejected (wrong shape) | rejected, but for the wrong reason (not a string) |
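The Expected column can be illustrated without mellea or Pydantic. The following is a minimal standard-library sketch, not the project's validator: PET_SCHEMA and DEFS are hand-written stand-ins for roughly the schema shape Pydantic v2 emits for the reproducer's union, and check_pet is a hypothetical helper that dispatches on the discriminator and checks only the required keys.

```python
from typing import Any, Optional

# Hand-written stand-in for the union schema the issue expects to survive.
PET_SCHEMA = {
    "discriminator": {
        "propertyName": "kind",
        "mapping": {"cat": "#/$defs/Cat", "dog": "#/$defs/Dog"},
    },
    "oneOf": [{"$ref": "#/$defs/Cat"}, {"$ref": "#/$defs/Dog"}],
}
DEFS = {
    "Cat": {"type": "object", "required": ["kind", "name"]},
    "Dog": {"type": "object", "required": ["kind", "name", "breed"]},
}


def check_pet(value: Any, schema: dict, defs: dict) -> Optional[str]:
    """Return None if the payload is accepted, else a short rejection reason."""
    if not isinstance(value, dict):
        return "not an object"
    disc = schema["discriminator"]
    tag = value.get(disc["propertyName"])
    # Only string tags can appear in the mapping; anything else is rejected.
    ref = disc["mapping"].get(tag) if isinstance(tag, str) else None
    if ref is None:
        return f"missing or unknown discriminator {disc['propertyName']!r}"
    branch = defs[ref.rsplit("/", 1)[-1]]
    missing = [key for key in branch["required"] if key not in value]
    if missing:
        return f"missing required fields: {missing}"
    return None


# The three inputs from the table, in order.
ROWS = [
    {"kind": "dog", "name": "Rex", "breed": "lab"},
    "just a string",
    {"name": "Rex"},
]
RESULTS = [check_pet(row, PET_SCHEMA, DEFS) for row in ROWS]
for row, reason in zip(ROWS, RESULTS):
    print(row, "->", reason or "accepted")
```

A real validator would also check property types and nested models; the point here is only that each row is accepted or rejected for the reason the Expected column gives.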

Root cause

Pydantic emits the discriminated union as:

{
  "anyOf": [
    {"discriminator": {...}, "oneOf": [{"$ref": "..."}, {"$ref": "..."}]},
    {"type": "null"}
  ]
}

Both the $ref-inlining pass and _is_complex_anyof in mellea/backends/tools.py only inspect $ref and properties directly on the anyOf sub-schemas. Neither descends into oneOf, so the parameter falls through to the flat {"type": "string"} fallback and the $defs entries are never referenced from this property.

This gap is structurally similar to the allOf case discussed on PR #896 and overlaps with the recursive-resolution work needed for #911. All three are variants of "unresolved structure one level deeper than the inliner looks".
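One way to close all three variants at once is to make both the detection and the inlining recursive. The sketch below is a standard-library illustration under stated assumptions, not the code in mellea/backends/tools.py: is_complex_union and inline_refs are hypothetical names, refs are assumed to have the form #/$defs/Name, and SCHEMA and DEFS are hand-written to match the shape shown above.

```python
from typing import Any

PREFIX = "#/$defs/"
COMBINATORS = ("anyOf", "oneOf", "allOf")


def is_complex_union(schema: dict) -> bool:
    """True if any branch, at any combinator depth, is an object or a $ref."""
    for key in COMBINATORS:
        for branch in schema.get(key, []):
            if "$ref" in branch or "properties" in branch or is_complex_union(branch):
                return True
    return False


def inline_refs(node: Any, defs: dict, _active: frozenset = frozenset()) -> Any:
    """Return a copy of node with every resolvable $ref replaced by its definition."""
    if isinstance(node, list):
        return [inline_refs(item, defs, _active) for item in node]
    if not isinstance(node, dict):
        return node
    ref = node.get("$ref")
    if isinstance(ref, str) and ref.startswith(PREFIX):
        name = ref[len(PREFIX):]
        # Leave unknown or self-referential refs alone instead of recursing forever.
        if name in _active or name not in defs:
            return dict(node)
        siblings = {k: v for k, v in node.items() if k != "$ref"}
        return inline_refs({**defs[name], **siblings}, defs, _active | {name})
    # Plain dict: recurse into every value, which covers anyOf, oneOf, allOf,
    # properties, and items without listing them individually.
    return {k: inline_refs(v, defs, _active) for k, v in node.items()}


DEFS = {
    "Cat": {"type": "object", "required": ["kind", "name"]},
    "Dog": {"type": "object", "required": ["kind", "name", "breed"]},
}
SCHEMA = {
    "anyOf": [
        {
            "discriminator": {"propertyName": "kind"},
            "oneOf": [{"$ref": "#/$defs/Cat"}, {"$ref": "#/$defs/Dog"}],
        },
        {"type": "null"},
    ]
}
INLINED = inline_refs(SCHEMA, DEFS)
print(is_complex_union(SCHEMA), INLINED["anyOf"][0]["oneOf"][1])
```

Note that a discriminator mapping, if present, holds ref strings rather than {"$ref": ...} objects, so this sketch leaves it untouched; a real fix would need to decide whether to drop or rewrite the mapping once the definitions are inlined.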

Scope / impact

  • Affected use cases: command-pattern tools (op: Annotated[CreateUser | DeleteUser | UpdateUser, Field(discriminator="action")]), polymorphic message/event parameters, and any tool signature that mirrors an existing discriminated-union domain model.
  • Severity: medium. Not a crash or data-loss bug, but makes an idiomatic Pydantic pattern functionally unusable as a tool parameter.
  • Silent failure: no error at MelleaTool.from_callable(...) or schema generation. Only surfaces as validation failures at tool-call time or as LLM hallucination when the model sees {"type": "string"} and has to guess the payload.
  • Pre-existing: not introduced by PR #896 (fix: tool call arguments); the pre-PR flattening logic produced the same {"type": "string"}. PR #896 does not expand the detection to cover oneOf.
  • Workaround: wrap the union in a container BaseModel. That currently runs into #911 (Support recursive nested-model $ref resolution in Pydantic tool parameter schemas), because nested-model refs aren't recursively inlined, so there is no clean workaround today.

Environment

Related
