fix(deriver): add explicit JSON schema example to minimal prompt#662

Open
mrlufepines wants to merge 1 commit into
plastic-labs:mainfrom
mrlufepines:fix/deriver-prompt-json-schema

@mrlufepines mrlufepines commented May 7, 2026

Add schema-aware JSON output instruction to minimal deriver prompt

Target repo: plastic-labs/honcho
Suggested branch name: fix/deriver-prompt-json-schema

Problem

The minimal deriver prompt does not specify the exact JSON schema expected in the output.
Some models, when asked for JSON, return a bare list of strings (e.g.
["fact one", "fact two"]). The deriver then fails to parse the response into a
PromptRepresentation, which expects a list of objects with a content field
([{"content": "..."}]), not bare strings.

This manifests as a validation error during the PromptRepresentation parse step, causing
the task to fail and no observations to be saved.
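The failure mode can be reproduced with a minimal sketch. The Observation class and parse_explicit function below are hypothetical stand-ins written with the standard library only; they are not Honcho's actual PromptRepresentation model, and they assume the response is a JSON object with an "explicit" array.

```python
import json
from dataclasses import dataclass


@dataclass
class Observation:
    """Hypothetical stand-in for one parsed observation."""
    content: str


def parse_explicit(raw: str) -> list[Observation]:
    """Parse a model response, rejecting anything but objects with a 'content' string.

    This only mimics the shape check described in the PR; the real parse step
    in Honcho may differ.
    """
    data = json.loads(raw)
    if not isinstance(data, dict) or not isinstance(data.get("explicit"), list):
        raise ValueError("expected a JSON object with an 'explicit' array")

    observations = []
    for index, item in enumerate(data["explicit"]):
        if not isinstance(item, dict) or not isinstance(item.get("content"), str):
            raise ValueError(
                f"explicit[{index}] must be an object with a 'content' string, got {item!r}"
            )
        observations.append(Observation(content=item["content"]))
    return observations


if __name__ == "__main__":
    # Object-shaped items parse cleanly.
    print(parse_explicit('{"explicit": [{"content": "fact one"}]}'))

    # Bare strings are rejected, which is the failure this PR targets.
    try:
        parse_explicit('{"explicit": ["fact one", "fact two"]}')
    except ValueError as exc:
        print(f"rejected: {exc}")
```

In this sketch a single malformed item rejects the whole response, which matches the observed behavior of the task failing and saving no observations.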

Root cause

The prompt says "Return a JSON object" with a generic description but does not include a
concrete schema with CORRECT and WRONG examples. Models that default to the simplest
possible JSON representation of a list return bare strings, and without an explicit
counter-example the prompt gives them nothing to correct against.

Fix

Updates minimal_deriver_prompt in src/deriver/prompts.py to include:

  1. An explicit schema block showing the exact required structure:
    {"explicit": [{"content": "one atomic fact"}, {"content": "another atomic fact"}]}
  2. A clarifying note that each item inside "explicit" MUST be an object with a
    "content" string field, not a bare string.
  3. An empty-result example: {"explicit": []} when no observations are found.
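The three additions above could be assembled roughly as follows. This is a sketch only: json_output_instructions is a hypothetical helper name and the surrounding sentences are assumed wording, not the exact text added to minimal_deriver_prompt. Only the two JSON examples are taken from this PR description.

```python
import json

# The two examples quoted in the PR description.
SCHEMA_EXAMPLE = {
    "explicit": [
        {"content": "one atomic fact"},
        {"content": "another atomic fact"},
    ]
}
EMPTY_EXAMPLE = {"explicit": []}


def json_output_instructions() -> str:
    """Build an output-format block to append to a prompt (hypothetical wording)."""
    return "\n".join(
        [
            "Return ONLY a JSON object with this exact structure:",
            json.dumps(SCHEMA_EXAMPLE),
            'Each item inside "explicit" MUST be an object with a "content" '
            "string field, not a bare string.",
            f"If no observations are found, return {json.dumps(EMPTY_EXAMPLE)}",
        ]
    )


if __name__ == "__main__":
    print(json_output_instructions())
```

Generating the examples with json.dumps keeps the literal text in the prompt valid JSON, so the model is never shown a shape that the parser itself would reject.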

The word "JSON" is now present in the prompt body, which also satisfies the Alibaba Qwen
json-keyword guardrail (complementing the clients.py patch) without requiring the
clients-side injection.

Files touched

  • src/deriver/prompts.py

Test plan

  • Call minimal_deriver_prompt(peer_id="Alice", messages="...") and confirm the
    output contains the literal schema example including {"content": ...}.
  • Feed the updated prompt to a model that previously returned bare strings; confirm
    the response now uses {"content": "..."} objects.
  • PromptRepresentation.parse_raw(response) succeeds on the new output format.
  • estimate_minimal_deriver_prompt_tokens() still returns a plausible integer
    (cache invalidation check after prompt length change).
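The first test-plan item could be automated with a check along these lines. The helper prompt_schema_problems and the stub prompt are hypothetical; in a real test the string would come from minimal_deriver_prompt(peer_id="Alice", messages="..."), whose exact output is not reproduced here.

```python
# Substrings the rendered prompt is expected to contain, per the test plan.
REQUIRED_EXAMPLE_PREFIX = '{"explicit": [{"content":'
EMPTY_RESULT_EXAMPLE = '{"explicit": []}'


def prompt_schema_problems(prompt: str) -> list[str]:
    """Return a list of human-readable problems; an empty list means the check passed."""
    problems = []
    if "JSON" not in prompt:
        problems.append('prompt does not mention "JSON"')
    if REQUIRED_EXAMPLE_PREFIX not in prompt:
        problems.append("prompt lacks the object-with-content schema example")
    if EMPTY_RESULT_EXAMPLE not in prompt:
        problems.append("prompt lacks the empty-result example")
    return problems


if __name__ == "__main__":
    # Stub standing in for the rendered prompt (assumed text, for illustration only).
    stub_prompt = (
        "Extract observations about Alice.\n"
        "Return a JSON object:\n"
        '{"explicit": [{"content": "one atomic fact"}]}\n'
        'If nothing is found, return {"explicit": []}'
    )
    print(prompt_schema_problems(stub_prompt))
    print(prompt_schema_problems("Return a JSON object"))
```

Returning a list of problems rather than a single boolean makes a failing assertion say which part of the schema block went missing.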

Notes

  • Until this is merged, Lufe's deployment re-applies the patch automatically via cleo ensure-patches.
  • See full context in cleo's STATE.md and patches dir.
  • This change also incidentally resolves the Qwen json-keyword guardrail for the deriver
    prompt specifically, complementing the clients.py fix (honcho-01) which covers all
    call sites generically.
  • The prompt change is additive and does not alter the extraction logic or the set of
    observations extracted.

Summary by CodeRabbit

  • Bug Fixes
    • Enhanced output schema for extracting explicit observations with clarified JSON formatting, including proper handling of empty result cases.

Some JSON-mode-capable LLM providers (notably Alibaba Qwen via
DashScope) emit observations as bare strings inside the "explicit"
array instead of objects with a "content" field when the prompt
doesn't show the schema explicitly. Such output fails
PromptRepresentation validation, so the deriver discards the response.

Add a literal schema example at the end of the minimal deriver prompt
showing the object-with-content shape, plus an explicit empty-result
example. This pushes the LLM toward the right output shape regardless
of provider quirks. Costs ~80 tokens per call.

Verified end-to-end with Qwen 3.6 Plus on Alibaba DashScope.

coderabbitai Bot commented May 7, 2026

Review Change Stack
No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: c1220a24-6def-4b45-8ead-494fd649a6df

📥 Commits

Reviewing files that changed from the base of the PR and between a4ae372 and c56ca45.

📒 Files selected for processing (1)
  • src/deriver/prompts.py

Walkthrough

The minimal deriver prompt template is extended with explicit JSON-instruction text that mandates structured output: an "explicit" array of objects containing "content" strings, and defines the empty-result format as {"explicit": []} when no observations are extractable.

Changes

Prompt Output Schema Specification

Layer / File(s) | Summary
Prompt Output Schema Definition (src/deriver/prompts.py) | Minimal deriver prompt now includes strict JSON output format instructions defining an "explicit" array schema, object shape ({ "content": ... }), and empty-response format.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

Poem

A prompt now speaks with clarity,
JSON shapes its words with care,
Explicit arrays, content strings—
Instructions neat, the rabbit sings,
Structure brings the answers true! 🐰✨

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name | Status | Explanation
Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled.
Title check | ✅ Passed | The title directly addresses the main change: adding explicit JSON schema example to the minimal prompt in the deriver module, which is the core fix described in the PR objectives.
Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request.
