Embedding calls crash with 400 when stored memory exceeds model context window

## Description

When Agent Zero stores or searches memory documents that exceed the embedding model's context window (e.g. BAAI/bge-m3: 8192 tokens), the embedding call crashes with a 400 error:

```
litellm.BadRequestError: Error code: 400 - You passed 8193 input tokens and requested 0 output tokens.
However, the model's context length is only 8192 tokens.
```

This is a hard crash: the agent errors out and every memory operation that depends on the embedding call fails.

## Root Cause

Two separate issues in `LiteLLMEmbeddingWrapper`:

**Issue 1**: LiteLLM ≥1.80.11 sends `encoding_format: null` in embedding requests when the parameter is not set. Strict OpenAI-compatible validators (DeepInfra, vLLM, HuggingFace TEI) reject `null` with 422. (Upstream LiteLLM issue: [BerriAI/litellm#19174](https://github.com/BerriAI/litellm/issues/19174))
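To illustrate Issue 1, here is a minimal sketch of how an unset parameter ends up as JSON `null`, and how pinning a default avoids it. The helper name `build_embedding_kwargs` is hypothetical and is not taken from `models.py`; treating an explicit `None` the same as "unset" is also an assumption.

```python
import json


def build_embedding_kwargs(user_kwargs=None):
    """Merge caller kwargs over a default encoding_format of "float".

    Hypothetical helper for illustration; not the actual wrapper code.
    """
    merged = {"encoding_format": "float", **(user_kwargs or {})}
    # Assumption: an explicit None is treated like "unset", so it cannot
    # reintroduce a null value into the request body.
    if merged.get("encoding_format") is None:
        merged["encoding_format"] = "float"
    return merged


# An unset value serialized as-is becomes JSON null in the request body.
bad_payload = json.dumps({"input": ["hello"], "encoding_format": None})

# With the default in place, the field is always a concrete string.
good_payload = json.dumps({"input": ["hello"], **build_embedding_kwargs()})

print(bad_payload)
print(good_payload)
```

A caller that sets `encoding_format` explicitly (for example to `"base64"`) still overrides the default in this sketch.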

**Issue 2**: There is no input truncation before embedding calls. If a memory document exceeds the model's context window, the API returns 400. Truncating to exactly the context length is not enough either: counting tokens with `cl100k_base` (the GPT tokenizer) can undercount relative to the model's own tokenizer (e.g. the SentencePiece tokenizer used by bge-m3), so an input that appears to sit at the limit can still trigger a 400.

## Steps to Reproduce

1. Configure an OpenAI-compatible embedding provider via `api_base` (e.g. DeepInfra with `BAAI/bge-m3`)
2. Ask the agent to memorize or search a long document (>8192 tokens)
3. The embedding call raises `BadRequestError: 400`

## Fix

In `models.py`, apply two changes to `LiteLLMEmbeddingWrapper.embed_documents` and `embed_query`:
1. Default `encoding_format` to `"float"` before merging kwargs, so that `null` is never sent.
2. Truncate input to `ctx_length - 500` tokens before embedding. The 500-token margin accounts for divergence between `cl100k_base`, which is used for counting, and the model's actual tokenizer.
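The truncation step could be sketched roughly as follows. This is an illustration under assumptions, not the code from the PR: `truncate_to_budget`, `ws_encode`, and `ws_decode` are hypothetical names, and a whitespace tokenizer stands in for `cl100k_base` so the example runs without extra dependencies. In practice the `encode` and `decode` methods of a `tiktoken` encoding could be passed in instead.

```python
from typing import Callable, List

SAFETY_MARGIN = 500  # margin from the issue, covers tokenizer divergence


def truncate_to_budget(
    text: str,
    ctx_length: int,
    encode: Callable[[str], List],
    decode: Callable[[List], str],
    margin: int = SAFETY_MARGIN,
) -> str:
    """Return text cut down to at most ctx_length - margin tokens."""
    budget = max(ctx_length - margin, 1)  # never let the budget drop to zero
    tokens = encode(text)
    if len(tokens) <= budget:
        return text  # already fits, leave untouched
    return decode(tokens[:budget])


# Stand-in tokenizer (assumption): one token per whitespace-separated word.
def ws_encode(text: str) -> List[str]:
    return text.split()


def ws_decode(tokens: List[str]) -> str:
    return " ".join(tokens)


long_doc = " ".join(["word"] * 9000)  # longer than an 8192-token window
truncated = truncate_to_budget(long_doc, 8192, ws_encode, ws_decode)
print(len(ws_encode(truncated)))  # 7692, i.e. 8192 - 500
```

With an 8192-token window the budget is 7692 counted tokens, so the margin tolerates the model's tokenizer producing about 6.5% more tokens than the counting tokenizer (8192 / 7692 ≈ 1.065) before the limit is hit again.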

PR: #PLACEHOLDER

Embedding calls crash with 400 when stored memory exceeds model context window #1436

Description

Root Cause

Steps to Reproduce

Fix

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Uh oh!

Embedding calls crash with 400 when stored memory exceeds model context window #1436

Description

Description

Root Cause

Steps to Reproduce

Fix

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions