
fix: bypass LiteLLM for Ollama embeddings to resolve 400 Bad Request (#1425) #1438

Open

GratefulDave wants to merge 1 commit into agent0ai:main from GratefulDave:fix/ollama-embedding-400-bad-request

Conversation


GratefulDave commented on Apr 4, 2026

Fixes #1425 — persistent `httpx.HTTPStatusError: 400 Bad Request` on Ollama `/api/embed` during memory similarity search.

Root Cause

Two distinct causes, both fixed (a payload sketch follows the list):

  1. LiteLLM's Ollama handler sends a malformed request — leaks the `ollama/` prefix into the model name field and forwards unsupported kwargs (e.g. `encoding_format: null`) that Ollama 0.18.x+ rejects with 400.

  2. `None` values in the embedding input array — when a `None` ends up in the texts list (e.g. from a failed upstream LLM call or race condition), it serialises to JSON `null`, which Ollama rejects with `{"error": "invalid input type"}` → 400.
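
For concreteness, a minimal sketch of the rejected vs. accepted request bodies (field names per Ollama's `/api/embed`; the exact kwargs LiteLLM forwards vary by version, so the bad payload is illustrative):

```python
# Illustrative only: the precise fields LiteLLM sends depend on its version.
bad_payload = {
    "model": "ollama/nomic-embed-text",  # "ollama/" prefix leaked into the model name
    "input": ["recall similar memories", None],  # None -> JSON null -> "invalid input type"
    "encoding_format": None,  # unsupported kwarg that Ollama 0.18.x+ rejects with 400
}

# What the new helper sends instead:
good_payload = {
    "model": "nomic-embed-text",  # prefix stripped
    "input": ["recall similar memories", ""],  # None coerced to "" (logged with a warning)
}
```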

Changes

`models.py`

  • `_ollama_embed()` — new helper on `LiteLLMEmbeddingWrapper` that calls Ollama's `/api/embed` directly via `httpx` (already a transitive dependency), bypassing LiteLLM entirely (condensed sketch after this list)
    • Strips the `ollama/` prefix from the model name
    • Sanitises input: converts `None` → `""` and any non-str → `str()` before sending, with a `logging.warning` that records which inputs were coerced
    • Retries only on transient errors (429, 503) with exponential backoff; raises immediately on 400 (retrying a bad payload is pointless)
    • Logs the HTTP status, Ollama response body, and first 100 chars of each input on non-200 responses
  • `embed_query` / `embed_documents` — route through `_ollama_embed()` when `provider == "ollama"`
  • `_is_ollama()` — helper predicate
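
A condensed sketch of the helper, written as a standalone function for readability; the real method lives on `LiteLLMEmbeddingWrapper`, and `OLLAMA_BASE_URL` here is a stand-in for whatever base URL the wrapper reads from config:

```python
import logging
import time

import httpx

OLLAMA_BASE_URL = "http://localhost:11434"  # stand-in; the real helper reads this from config


def ollama_embed(model: str, texts: list, max_retries: int = 3) -> list[list[float]]:
    model = model.removeprefix("ollama/")  # the leaked prefix was the malformed-request root cause

    # Sanitise input: None -> "", non-str -> str(), recording which indices were coerced.
    coerced = [i for i, t in enumerate(texts) if not isinstance(t, str)]
    if coerced:
        logging.warning("coerced non-str embedding inputs at indices %s", coerced)
    clean = ["" if t is None else t if isinstance(t, str) else str(t) for t in texts]

    for attempt in range(max_retries):
        resp = httpx.post(
            f"{OLLAMA_BASE_URL}/api/embed",
            json={"model": model, "input": clean},
            timeout=30.0,
        )
        if resp.status_code in (429, 503) and attempt < max_retries - 1:
            time.sleep(2**attempt)  # exponential backoff, transient errors only
            continue
        if resp.status_code != 200:
            # Log status, response body, and the first 100 chars of each input.
            logging.error(
                "Ollama /api/embed failed: status=%s body=%s inputs=%s",
                resp.status_code, resp.text, [t[:100] for t in clean],
            )
        resp.raise_for_status()  # raises immediately on 400: retrying a bad payload is pointless
        return resp.json()["embeddings"]
```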

`plugins/_memory/helpers/memory.py`

  • `search_similarity_threshold` — wraps `asearch` in `try/except` so any embedding failure returns `[]` (no memories this turn) instead of propagating and crashing the agent's monologue loop (see the sketch below)
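
A minimal sketch of the guard, assuming a LangChain-style `asearch` on the vector store (argument names are illustrative):

```python
import logging


async def search_similarity_threshold(db, query: str, limit: int, threshold: float):
    try:
        return await db.asearch(
            query,
            search_type="similarity_score_threshold",
            k=limit,
            score_threshold=threshold,
        )
    except Exception:
        # An embedding failure now means "no memories this turn",
        # not a crashed monologue loop.
        logging.exception("similarity search failed; returning no memories")
        return []
```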

Test Plan

  • Start agent-zero with `ollama/nomic-embed-text` as the embedding model
  • Trigger a memory operation — confirm no 400 errors
  • Verify memory similarity search returns results
  • Verify `embed_documents` path works (store and retrieve a memory)
  • Verify agent continues normally when Ollama is temporarily unreachable (returns empty memories, does not crash)

Tested locally on Docker with `nomic-embed-text` (768-dim), Ollama 0.18.3, `host.docker.internal:11434`.
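
A quick manual check of the endpoint from inside the container (host, port, and model per the setup above):

```python
import httpx

resp = httpx.post(
    "http://host.docker.internal:11434/api/embed",
    json={"model": "nomic-embed-text", "input": ["hello world"]},
)
resp.raise_for_status()
print(len(resp.json()["embeddings"][0]))  # expect 768 for nomic-embed-text
```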

Commit bf6954f:

LiteLLM's Ollama embedding handler sends a malformed request to Ollama's
`/api/embed` endpoint, causing a 400 Bad Request error on Ollama 0.18.x.

- Add `_ollama_embed()` to `LiteLLMEmbeddingWrapper` that calls Ollama's
  `/api/embed` directly via httpx, stripping the "ollama/" prefix from
  the model name (the root cause of the malformed request)
- Route `embed_query` and `embed_documents` through this helper when
  provider == "ollama", bypassing LiteLLM entirely
- Wrap `search_similarity_threshold` in try/except so an embedding
  failure returns [] instead of crashing the agent

Fixes agent0ai#1425

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
GratefulDave force-pushed the fix/ollama-embedding-400-bad-request branch from eca2212 to bf6954f on April 9, 2026 at 12:06
