Skip to content

Lorenze/imp/prompt layering#5774

Merged
lorenzejay merged 7 commits into
mainfrom
lorenze/imp/prompt-layering
May 12, 2026
Merged

Lorenze/imp/prompt layering#5774
lorenzejay merged 7 commits into
mainfrom
lorenze/imp/prompt-layering

Conversation

@lorenzejay

@lorenzejay lorenzejay commented May 11, 2026

Copy link
Copy Markdown
Collaborator

Note

Medium Risk
Touches core prompt/message formatting and Anthropic provider payload shaping; mistakes could change prompts sent to LLMs or degrade caching behavior across tool-use loops.

Overview
Introduces a provider-agnostic cache_breakpoint marker (crewai.llms.cache) and updates both CrewAgentExecutor and experimental.AgentExecutor to mark the stable system/user prompt prefix for reuse across ReAct/tool iterations.

Updates BaseLLM._format_messages to strip breakpoint flags without mutating the caller’s message buffer, and extends the Anthropic provider to translate marked breakpoints into cache_control (including converting cached system prompts to content blocks and stamping only the intended stable user block).

Moves skills prompt injection out of Agent runtime prompt concatenation by removing append_skill_context, emitting skill context as stable XML blocks (<skills>, <skill ...>) during prompt construction, and updates/adding tests to cover caching semantics and the new skill formatting.

Reviewed by Cursor Bugbot for commit 6f94c9b. Bugbot is set up for automated code reviews on this repo. Configure here.

Summary by CodeRabbit

  • New Features

    • Introduced provider-agnostic prompt cache breakpoints and provider handling to mark stable prompt regions.
    • Skill context formatting now emits stable XML-like blocks and these are appended to constructed prompts.
  • Bug Fixes

    • Avoided mutating original message buffers and ensured cache-breakpoint markers are removed before sending to providers.
  • Tests

    • Added tests validating cache-breakpoint behavior, provider formatting, and updated skill-format integration tests.

Review Change Stack

@cursor cursor Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 2 potential issues.

Fix All in Cursor

❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

Reviewed by Cursor Bugbot for commit b2488ec. Configure here.

Comment thread lib/crewai/src/crewai/llms/cache.py Outdated
Comment thread lib/crewai/src/crewai/llms/providers/anthropic/completion.py Outdated
@coderabbitai

coderabbitai Bot commented May 11, 2026

Copy link
Copy Markdown
📝 Walkthrough

Walkthrough

Adds provider-agnostic prompt cache-breakpoint markers, integrates them into executors and base LLM message prep (markers stripped before provider submission), implements Anthropic ephemeral stamping, replaces Markdown skill headers with XML <skill> blocks in prompts, and removes the old append_skill_context helper.

Changes

Prompt Caching and Skill Context Refactor

Layer / File(s) Summary
Cache Breakpoint Infrastructure
lib/crewai/src/crewai/llms/cache.py
Adds CACHE_BREAKPOINT_KEY, mark_cache_breakpoint() (returns new dict with marker) and strip_cache_breakpoint() (removes marker in-place).
LLM Base Message Preparation
lib/crewai/src/crewai/llms/base_llm.py
Validates messages and creates cleaned per-message copies excluding CACHE_BREAKPOINT_KEY, then passes cleaned list to file-processing to avoid mutating caller buffers.
Anthropic Ephemeral Cache Support
lib/crewai/src/crewai/llms/providers/anthropic/completion.py
Detects cache-breakpoint markers pre-format, records original contents, and stamps matching formatted user/system content blocks with cache_control: {"type":"ephemeral"}; system-marked messages may be returned as content-block payloads.
Cache Integration in Executors
lib/crewai/src/crewai/agents/crew_agent_executor.py, lib/crewai/src/crewai/experimental/agent_executor.py
Executors wrap formatted system/user (or prompt-only) messages with mark_cache_breakpoint() before appending to message state.
Skill Format Changes
lib/crewai/src/crewai/skills/loader.py
Skill formatter now emits <skill name="...">...</skill> XML-like wrappers instead of Markdown ## Skill: headings.
Skill Integration in Prompts
lib/crewai/src/crewai/utilities/prompts.py
Prompts.task_execution() appends a <skills>...</skills> block via _build_skill_block() when skills are present.
Skill Context Cleanup
lib/crewai/src/crewai/agent/core.py, lib/crewai/src/crewai/agent/utils.py
Removes append_skill_context helper and its two call sites in agent core; prompts now include skill blocks.
Tests: Cache Marker & Skill Format
lib/crewai/tests/llms/test_prompt_cache.py, lib/crewai/tests/skills/*
New tests for cache marker behavior and provider formatting; skill tests updated to expect XML skill tags and to use Prompts.task_execution().

Sequence Diagram

sequenceDiagram
  participant AgentExec as Agent Executor
  participant Prompts as Prompts Builder
  participant ExecutorState as Message State
  participant LLMBase as BaseLLM
  participant Anthropic as Anthropic Provider
  AgentExec->>Prompts: task_execution() -> render prompt + _build_skill_block()
  Prompts-->>AgentExec: formatted system/user prompt (+ <skills> block)
  AgentExec->>ExecutorState: append mark_cache_breakpoint(formatted_message)
  ExecutorState->>LLMBase: prepare_messages(marked_messages)
  LLMBase->>LLMBase: create cleaned copies (strip CACHE_BREAKPOINT_KEY)
  LLMBase->>Anthropic: _format_messages_for_anthropic(cleaned_messages)
  Anthropic->>Anthropic: detect original markers and stamp cache_control ephemeral
  Anthropic-->>AgentExec: formatted payload (with ephemeral stamps where applicable)
Loading

🎯 3 (Moderate) | ⏱️ ~25 minutes

Suggested labels: size/M, enhancement

Suggested reviewers:

  • greysonlalonde

🐰 A prompt wrapped in XML, marked for cache, so swift!
Skill tags stable, ephemeral drift,
Breaking agents free from context drift—
Caching leaps on, the rabbit's gift! 🎁

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 43.48% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Title check ❓ Inconclusive The title 'Lorenze/imp/prompt layering' is overly vague and does not clearly convey the main changes; it reads like a branch name rather than a descriptive PR summary. Use a more descriptive title that captures the primary change, such as 'Add provider-agnostic cache markers and move skill context to prompt construction' or 'Implement prompt caching with cache breakpoints and XML-formatted skills'.
✅ Passed checks (3 passed)
Check name Status Explanation
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch lorenze/imp/prompt-layering

Tip

💬 Introducing Slack Agent: The best way for teams to turn conversations into code.

Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.

  • Generate code and open pull requests
  • Plan features and break down work
  • Investigate incidents and troubleshoot customer tickets together
  • Automate recurring tasks and respond to alerts with triggers
  • Summarize progress and report instantly

Built for teams:

  • Shared memory across your entire org—no repeating context
  • Per-thread sandboxes to safely plan and execute work
  • Governance built-in—scoped access, auditability, and budget controls

One agent for your entire SDLC. Right inside Slack.

👉 Get started


Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
lib/crewai/src/crewai/skills/loader.py (1)

175-188: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Escape XML-sensitive content before building <skill> blocks

Line 175 and Line 188 inject skill.name directly into an XML attribute, and Lines 176/178 inject raw text into the wrapped body. If any field contains ", <, &, or </skill>, it can break block boundaries and destabilize the prompt/cache anchor format.

Suggested fix
+from html import escape
+
 def format_skill(skill: Skill) -> str:
+    safe_name = escape(skill.name or "", quote=True)
+    safe_description = escape(skill.description or "")
+    safe_instructions = escape(skill.instructions or "")
+
     if skill.disclosure_level >= INSTRUCTIONS and skill.instructions:
         parts = [
-            f'<skill name="{skill.name}">',
-            skill.description,
+            f'<skill name="{safe_name}">',
+            safe_description,
             "",
-            skill.instructions,
+            safe_instructions,
         ]
         ...
         parts.append("</skill>")
         return "\n".join(parts)
-    return f'<skill name="{skill.name}">\n{skill.description}\n</skill>'
+    return f'<skill name="{safe_name}">\n{safe_description}\n</skill>'
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@lib/crewai/src/crewai/skills/loader.py` around lines 175 - 188, Escape
XML-sensitive characters before injecting skill fields into the generated
<skill> blocks: locate the code in lib/crewai/src/crewai/skills/loader.py that
constructs the skill XML (the function that concatenates or formats skill.name
and the skill body/text into the "<skill ...>...</skill>" block) and replace raw
inserts with an XML-escaping step (e.g., escape &, <, >, " and apostrophe) for
attribute values and either escape or wrap body text in a safe container (CDATA)
for the element content; ensure the code that references skill.name and the
wrapped body text uses the escaped values so that quotes, angle brackets,
ampersands, or literal "</skill>" cannot break the block boundaries.
🧹 Nitpick comments (2)
lib/crewai/src/crewai/utilities/prompts.py (1)

89-89: ⚡ Quick win

Stabilize and reuse the skill block once per prompt build.

_build_skill_block() is called repeatedly, and the block order currently depends on incoming agent.skills order. Sorting by skill name and computing once in task_execution() makes cache-prefix text deterministic and avoids duplicate formatting work.

♻️ Proposed refactor
@@
-        system: str = self._build_prompt(slices) + self._build_skill_block()
+        skill_block = self._build_skill_block()
+        system: str = self._build_prompt(slices) + skill_block
@@
                 system=system,
                 user=self._build_prompt([task_slice]),
-                prompt=self._build_prompt(slices) + self._build_skill_block(),
+                prompt=self._build_prompt(slices) + skill_block,
             )
         return StandardPromptResult(
             prompt=self._build_prompt(
@@
                 self.prompt_template,
                 self.response_template,
             )
-            + self._build_skill_block()
+            + skill_block
         )
@@
-        sections = [format_skill_context(s) for s in skills if isinstance(s, Skill)]
+        stable_skills = sorted(
+            (s for s in skills if isinstance(s, Skill)),
+            key=lambda s: s.name,
+        )
+        sections = [format_skill_context(s) for s in stable_skills]
         if not sections:
             return ""

Also applies to: 109-109, 118-118, 121-137

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@lib/crewai/src/crewai/utilities/prompts.py` at line 89, The code repeatedly
calls _build_skill_block() and relies on agent.skills' incoming order, causing
nondeterministic cache prefixes and wasted formatting; compute the skill block
once (e.g., in task_execution()) by sorting agent.skills by skill.name to
produce a deterministic ordered list, build the skill_block string there, store
it in a local variable (e.g., skill_block) and replace all direct calls to
_build_skill_block() (including where system is set and other uses of
_build_skill_block) to reuse that stored skill_block so formatting is done only
once per prompt build.
lib/crewai/tests/llms/test_prompt_cache.py (1)

67-69: ⚡ Quick win

Avoid position-dependent block selection in Anthropic assertion.
formatted[0]["content"][-1] can become brittle if content/message ordering shifts. Prefer locating the specific text block by role/type/content, like you already do in the next test.

Proposed test hardening
-        # First user block carries cache_control too
-        last_block = formatted[0]["content"][-1]
-        assert last_block["cache_control"] == {"type": "ephemeral"}
+        # Stable user text block carries cache_control too
+        user_msg = next(fm for fm in formatted if fm["role"] == "user")
+        text_block = next(
+            b
+            for b in user_msg["content"]
+            if isinstance(b, dict)
+            and b.get("type") == "text"
+            and b.get("text") == "ping"
+        )
+        assert text_block.get("cache_control") == {"type": "ephemeral"}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@lib/crewai/tests/llms/test_prompt_cache.py` around lines 67 - 69, The
assertion is brittle because it picks the last content entry via
formatted[0]["content"][-1]; instead, locate the specific user/text block
deterministically (e.g., iterate formatted[0]["content"] and find the dict whose
"type" or "text"/"role" matches the expected user message) and then assert that
its "cache_control" equals {"type": "ephemeral"}—update the test in
test_prompt_cache.py to search for the block by its identifying fields rather
than by position.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@lib/crewai/src/crewai/llms/base_llm.py`:
- Around line 722-729: The code that converts incoming dicts into LLMMessage
instances (in BaseLLM, the method that currently casts messages to LLMMessage)
only checks for key presence and should also validate value types: ensure "role"
is a str and "content" is a str (or the expected type), otherwise raise a
ValueError immediately instead of casting; update the message-casting routine
(the BaseLLM message conversion helper that produces LLMMessage) to perform
these type checks before constructing LLMMessage and include a clear ValueError
when types are invalid.

In `@lib/crewai/src/crewai/llms/providers/anthropic/completion.py`:
- Around line 704-707: The code that collects message content in the Anthropic
completion path is currently only grabbing string content from "assistant"
messages and the later stamping loop (in the same module) only applies
cache_breakpoint handling to "user" messages, so assistant cache_breakpoint
flags and structured list-form content are ignored; update the collection step
(where message content is aggregated) to (1) stop collecting/depending on
"assistant" roles if they are not stamped, or better remove the unused assistant
collection, and (2) when extracting content from a message (look for the place
that reads message["content"] and sets collected content), handle both string
and list-form content by detecting if content is a list and
concatenating/extracting text fields (preserving original ordering) so messages
with {"content": [...], "cache_breakpoint": True} are honored; ensure the
stamping loop that sets cache_breakpoint (the loop that checks message["role"]
== "user") either also checks "assistant" roles or you remove assistant
collection so only user roles are processed, and keep the cache_breakpoint flag
propagation logic consistent with these changes.

---

Outside diff comments:
In `@lib/crewai/src/crewai/skills/loader.py`:
- Around line 175-188: Escape XML-sensitive characters before injecting skill
fields into the generated <skill> blocks: locate the code in
lib/crewai/src/crewai/skills/loader.py that constructs the skill XML (the
function that concatenates or formats skill.name and the skill body/text into
the "<skill ...>...</skill>" block) and replace raw inserts with an XML-escaping
step (e.g., escape &, <, >, " and apostrophe) for attribute values and either
escape or wrap body text in a safe container (CDATA) for the element content;
ensure the code that references skill.name and the wrapped body text uses the
escaped values so that quotes, angle brackets, ampersands, or literal "</skill>"
cannot break the block boundaries.

---

Nitpick comments:
In `@lib/crewai/src/crewai/utilities/prompts.py`:
- Line 89: The code repeatedly calls _build_skill_block() and relies on
agent.skills' incoming order, causing nondeterministic cache prefixes and wasted
formatting; compute the skill block once (e.g., in task_execution()) by sorting
agent.skills by skill.name to produce a deterministic ordered list, build the
skill_block string there, store it in a local variable (e.g., skill_block) and
replace all direct calls to _build_skill_block() (including where system is set
and other uses of _build_skill_block) to reuse that stored skill_block so
formatting is done only once per prompt build.

In `@lib/crewai/tests/llms/test_prompt_cache.py`:
- Around line 67-69: The assertion is brittle because it picks the last content
entry via formatted[0]["content"][-1]; instead, locate the specific user/text
block deterministically (e.g., iterate formatted[0]["content"] and find the dict
whose "type" or "text"/"role" matches the expected user message) and then assert
that its "cache_control" equals {"type": "ephemeral"}—update the test in
test_prompt_cache.py to search for the block by its identifying fields rather
than by position.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: dd35940f-39d2-441b-b2b2-d6780d4e8c39

📥 Commits

Reviewing files that changed from the base of the PR and between 63a9e7e and b2488ec.

📒 Files selected for processing (12)
  • lib/crewai/src/crewai/agent/core.py
  • lib/crewai/src/crewai/agent/utils.py
  • lib/crewai/src/crewai/agents/crew_agent_executor.py
  • lib/crewai/src/crewai/experimental/agent_executor.py
  • lib/crewai/src/crewai/llms/base_llm.py
  • lib/crewai/src/crewai/llms/cache.py
  • lib/crewai/src/crewai/llms/providers/anthropic/completion.py
  • lib/crewai/src/crewai/skills/loader.py
  • lib/crewai/src/crewai/utilities/prompts.py
  • lib/crewai/tests/llms/test_prompt_cache.py
  • lib/crewai/tests/skills/test_integration.py
  • lib/crewai/tests/skills/test_loader.py
💤 Files with no reviewable changes (2)
  • lib/crewai/src/crewai/agent/core.py
  • lib/crewai/src/crewai/agent/utils.py

Comment thread lib/crewai/src/crewai/llms/base_llm.py
Comment thread lib/crewai/src/crewai/llms/providers/anthropic/completion.py Outdated
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant