fix(deriver): add explicit JSON schema example to minimal prompt by mrlufepines · Pull Request #662 · plastic-labs/honcho

mrlufepines · 2026-05-07T22:28:23Z

Add schema-aware JSON output instruction to minimal deriver prompt

Target repo: plastic-labs/honcho
Suggested branch name: fix/deriver-prompt-json-schema

Problem

The minimal deriver prompt does not specify the exact JSON schema expected in the output.
Some models, when asked for JSON, return the observations as a list of bare strings (e.g.
["fact one", "fact two"]). The deriver then fails to parse the response into a
PromptRepresentation, which expects a list of objects with a content field
([{"content": "..."}]).

This manifests as a validation error during the PromptRepresentation parse step, causing
the task to fail and no observations to be saved.
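
The failure can be reproduced without Honcho installed. The sketch below is a stdlib-only illustration: `parse_representation` and `Observation` are hypothetical stand-ins for the PromptRepresentation parse step, whose real fields and error type are not shown in this PR. It models only the shape check described above.

```python
import json
from dataclasses import dataclass


@dataclass
class Observation:
    content: str


def parse_representation(raw: str) -> list[Observation]:
    """Hypothetical stand-in for the PromptRepresentation parse step."""
    data = json.loads(raw)
    observations = []
    for item in data["explicit"]:
        # Each item must be an object with a string "content" field.
        if not isinstance(item, dict) or not isinstance(item.get("content"), str):
            raise ValueError(f"expected an object with a 'content' string, got {item!r}")
        observations.append(Observation(content=item["content"]))
    return observations


good = '{"explicit": [{"content": "fact one"}, {"content": "fact two"}]}'
bad = '{"explicit": ["fact one", "fact two"]}'

parsed = parse_representation(good)

try:
    parse_representation(bad)
    bad_rejected = False
except ValueError:
    bad_rejected = True
```

With the bare-string response, the stand-in raises on the first item, which mirrors the reported behavior: the task fails and nothing is saved.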

Root cause

The prompt says "Return a JSON object" with a generic description but does not include a
concrete schema with CORRECT and WRONG examples. Models that default to the simplest
possible JSON representation of a list return bare strings. Without an explicit
counter-example, models cannot self-correct.

Fix

Updates minimal_deriver_prompt in src/deriver/prompts.py to include:

- An explicit schema block showing the exact required structure:

  {"explicit": [{"content": "one atomic fact"}, {"content": "another atomic fact"}]}

- A clarifying note that each item inside "explicit" MUST be an object with a "content" string field, not a bare string.
- An empty-result example: {"explicit": []} when no observations are found.
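
As a rough illustration of the three additions, the sketch below builds a schema block that could be appended to a prompt. The function name `json_output_instructions` and the surrounding wording are assumptions, since the actual text added to minimal_deriver_prompt is not reproduced in this description. Only the two JSON examples are taken from the PR.

```python
import json

# The two examples below come from the PR description; everything else
# (function name, instruction wording) is hypothetical.
SCHEMA_EXAMPLE = {
    "explicit": [
        {"content": "one atomic fact"},
        {"content": "another atomic fact"},
    ]
}
EMPTY_EXAMPLE = {"explicit": []}


def json_output_instructions() -> str:
    """Build an illustrative schema block to append to a prompt."""
    return "\n".join(
        [
            "Return a JSON object with exactly this structure:",
            json.dumps(SCHEMA_EXAMPLE),
            'Each item inside "explicit" MUST be an object with a "content" '
            "string field, not a bare string.",
            "If no observations are found, return: " + json.dumps(EMPTY_EXAMPLE),
        ]
    )


instructions = json_output_instructions()
print(instructions)
```

Generating the examples with `json.dumps` rather than hand-writing them keeps the literal text in the prompt valid JSON, and avoids brace-escaping mistakes if the prompt is built with an f-string.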

The word "JSON" is now present in the prompt body, which also satisfies the Alibaba Qwen
json-keyword guardrail (complementing the clients.py patch) without requiring the
clients-side injection.

Files touched

src/deriver/prompts.py

Test plan

- Call minimal_deriver_prompt(peer_id="Alice", messages="...") and confirm the output contains the literal schema example including {"content": ...}.
- Feed the updated prompt to a model that previously returned bare strings; confirm the response now uses {"content": "..."} objects.
- Confirm that PromptRepresentation.parse_raw(response) succeeds on the new output format.
- Confirm that estimate_minimal_deriver_prompt_tokens() still returns a plausible integer (cache invalidation check after prompt length change).
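
For the second check, comparing raw responses before and after the prompt change, a small classifier can make the difference easy to log. This is a sketch only: `explicit_item_shape` is a hypothetical helper, not part of Honcho, and it inspects only the "explicit" array discussed in this PR.

```python
import json


def explicit_item_shape(raw_response: str) -> str:
    """Classify the items in the "explicit" array of a raw model response."""
    try:
        items = json.loads(raw_response)["explicit"]
    except (json.JSONDecodeError, KeyError, TypeError):
        # Not JSON, not an object, or no "explicit" key.
        return "invalid"
    if not isinstance(items, list):
        return "invalid"
    if all(isinstance(i, dict) and isinstance(i.get("content"), str) for i in items):
        return "objects"  # also covers the empty-result case
    if all(isinstance(i, str) for i in items):
        return "bare_strings"
    return "mixed"


before = explicit_item_shape('{"explicit": ["fact one", "fact two"]}')
after = explicit_item_shape('{"explicit": [{"content": "fact one"}]}')
```

A response classified as "objects" is the shape the deriver expects. Whether it also passes the real PromptRepresentation validation still has to be confirmed with the third check.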

Notes

- Self-healing in Lufe's deployment via cleo ensure-patches until merged. See full context in cleo's STATE.md and patches dir.
- This change also incidentally resolves the Qwen json-keyword guardrail for the deriver prompt specifically, complementing the clients.py fix (honcho-01) which covers all call sites generically.
- The prompt change is additive and does not alter the extraction logic or the set of observations extracted.

Summary by CodeRabbit

Bug Fixes
- Enhanced output schema for extracting explicit observations with clarified JSON formatting, including proper handling of empty result cases.

Some json-mode capable LLM providers (notably Alibaba Qwen via DashScope) emit observations as bare strings inside the "explicit" array instead of objects with a "content" field, when the prompt doesn't show the schema explicitly. This violates PromptRepresentation and the deriver discards the response.

Add a literal schema example at the end of the minimal deriver prompt showing the object-with-content shape, plus an explicit empty-result example. This pushes the LLM toward the right output shape regardless of provider quirks.

Costs ~80 tokens per call. Verified end-to-end with Qwen 3.6 Plus on Alibaba DashScope.

coderabbitai · 2026-05-07T22:28:35Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: c1220a24-6def-4b45-8ead-494fd649a6df

📥 Commits

Reviewing files that changed from the base of the PR and between a4ae372 and c56ca45.

📒 Files selected for processing (1)

src/deriver/prompts.py

Walkthrough

The minimal deriver prompt template is extended with explicit JSON-instruction text that mandates structured output: an "explicit" array of objects containing "content" strings, and defines the empty-result format as {"explicit": []} when no observations are extractable.

Changes

Prompt Output Schema Specification

| Layer / File(s) | Summary |
| --- | --- |
| Prompt Output Schema Definition `src/deriver/prompts.py` | Minimal deriver prompt now includes strict JSON output format instructions defining an "explicit" array schema, object shape (`{ "content": ... }`), and empty-response format. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

Poem

A prompt now speaks with clarity,
JSON shapes its words with care,
Explicit arrays, content strings—
Instructions neat, the rabbit sings,
Structure brings the answers true! 🐰✨

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title directly addresses the main change: adding explicit JSON schema example to the minimal prompt in the deriver module, which is the core fix described in the PR objectives. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

mrlufepines wants to merge 1 commit into plastic-labs:main from mrlufepines:fix/deriver-prompt-json-schema
