Lorenze/imp/prompt layering by lorenzejay · Pull Request #5774 · crewAIInc/crewAI

lorenzejay · 2026-05-11T18:03:23Z

Note

Medium Risk
Touches core prompt/message formatting and Anthropic provider payload shaping; mistakes could change prompts sent to LLMs or degrade caching behavior across tool-use loops.

Overview
Introduces a provider-agnostic cache_breakpoint marker (crewai.llms.cache) and updates both CrewAgentExecutor and experimental.AgentExecutor to mark the stable system/user prompt prefix for reuse across ReAct/tool iterations.

Updates BaseLLM._format_messages to strip breakpoint flags without mutating the caller’s message buffer, and extends the Anthropic provider to translate marked breakpoints into cache_control (including converting cached system prompts to content blocks and stamping only the intended stable user block).

Moves skill prompt injection out of the Agent's runtime prompt concatenation: removes append_skill_context, emits skill context as stable XML blocks (<skills>, <skill ...>) during prompt construction, and updates and adds tests covering caching semantics and the new skill formatting.

Reviewed by Cursor Bugbot for commit 6f94c9b. Bugbot is set up for automated code reviews on this repo.

Summary by CodeRabbit

New Features
- Introduced provider-agnostic prompt cache breakpoints and provider handling to mark stable prompt regions.
- Skill context formatting now emits stable XML-like blocks and these are appended to constructed prompts.
Bug Fixes
- Avoided mutating original message buffers and ensured cache-breakpoint markers are removed before sending to providers.
Tests
- Added tests validating cache-breakpoint behavior, provider formatting, and updated skill-format integration tests.

cursor

Cursor Bugbot has reviewed your changes and found 2 potential issues.

Reviewed by Cursor Bugbot for commit b2488ec.

coderabbitai · 2026-05-11T18:12:16Z

📝 Walkthrough

Adds provider-agnostic prompt cache-breakpoint markers, integrates them into executors and base LLM message prep (markers stripped before provider submission), implements Anthropic ephemeral stamping, replaces Markdown skill headers with XML <skill> blocks in prompts, and removes the old append_skill_context helper.

Changes

Prompt Caching and Skill Context Refactor

| Layer / File(s) | Summary |
| --- | --- |
| Cache Breakpoint Infrastructure `lib/crewai/src/crewai/llms/cache.py` | Adds `CACHE_BREAKPOINT_KEY`, `mark_cache_breakpoint()` (returns new dict with marker) and `strip_cache_breakpoint()` (removes marker in-place). |
| LLM Base Message Preparation `lib/crewai/src/crewai/llms/base_llm.py` | Validates messages and creates cleaned per-message copies excluding `CACHE_BREAKPOINT_KEY`, then passes cleaned list to file-processing to avoid mutating caller buffers. |
| Anthropic Ephemeral Cache Support `lib/crewai/src/crewai/llms/providers/anthropic/completion.py` | Detects cache-breakpoint markers pre-format, records original contents, and stamps matching formatted user/system content blocks with `cache_control: {"type":"ephemeral"}`; system-marked messages may be returned as content-block payloads. |
| Cache Integration in Executors `lib/crewai/src/crewai/agents/crew_agent_executor.py`, `lib/crewai/src/crewai/experimental/agent_executor.py` | Executors wrap formatted system/user (or prompt-only) messages with `mark_cache_breakpoint()` before appending to message state. |
| Skill Format Changes `lib/crewai/src/crewai/skills/loader.py` | Skill formatter now emits `<skill name="...">...</skill>` XML-like wrappers instead of Markdown `## Skill:` headings. |
| Skill Integration in Prompts `lib/crewai/src/crewai/utilities/prompts.py` | `Prompts.task_execution()` appends a `<skills>...</skills>` block via `_build_skill_block()` when skills are present. |
| Skill Context Cleanup `lib/crewai/src/crewai/agent/core.py`, `lib/crewai/src/crewai/agent/utils.py` | Removes `append_skill_context` helper and its two call sites in agent core; prompts now include skill blocks. |
| Tests: Cache Marker & Skill Format `lib/crewai/tests/llms/test_prompt_cache.py`, `lib/crewai/tests/skills/*` | New tests for cache marker behavior and provider formatting; skill tests updated to expect XML skill tags and to use `Prompts.task_execution()`. |
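The marker helpers summarized in the table can be sketched roughly as follows. This is a minimal illustration, not the code from `crewai/llms/cache.py`: the key value `"cache_breakpoint"` is an assumption (the PR only names the constant), and `clean_copies` is a hypothetical stand-in for the copy step described for `BaseLLM`.

```python
from __future__ import annotations

from typing import Any

# Assumed value; the PR confirms only that a CACHE_BREAKPOINT_KEY constant exists.
CACHE_BREAKPOINT_KEY = "cache_breakpoint"


def mark_cache_breakpoint(message: dict[str, Any]) -> dict[str, Any]:
    """Return a new dict carrying the marker; the input dict is not modified."""
    return {**message, CACHE_BREAKPOINT_KEY: True}


def strip_cache_breakpoint(message: dict[str, Any]) -> None:
    """Remove the marker in place if it is present."""
    message.pop(CACHE_BREAKPOINT_KEY, None)


def clean_copies(messages: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Hypothetical helper: per-message shallow copies without the marker.

    Building new dicts (rather than popping keys) keeps the caller's
    message buffer intact across tool-use iterations.
    """
    return [
        {key: value for key, value in message.items() if key != CACHE_BREAKPOINT_KEY}
        for message in messages
    ]
```

Copying instead of stripping in place matters here because the executor reuses the same message list on every iteration; if the marker were popped on the first call, later calls would lose the breakpoint.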

Sequence Diagram

sequenceDiagram
  participant AgentExec as Agent Executor
  participant Prompts as Prompts Builder
  participant ExecutorState as Message State
  participant LLMBase as BaseLLM
  participant Anthropic as Anthropic Provider
  AgentExec->>Prompts: task_execution() -> render prompt + _build_skill_block()
  Prompts-->>AgentExec: formatted system/user prompt (+ <skills> block)
  AgentExec->>ExecutorState: append mark_cache_breakpoint(formatted_message)
  ExecutorState->>LLMBase: prepare_messages(marked_messages)
  LLMBase->>LLMBase: create cleaned copies (strip CACHE_BREAKPOINT_KEY)
  LLMBase->>Anthropic: _format_messages_for_anthropic(cleaned_messages)
  Anthropic->>Anthropic: detect original markers and stamp cache_control ephemeral
  Anthropic-->>AgentExec: formatted payload (with ephemeral stamps where applicable)
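As a rough illustration of the final stamping step in the diagram, the sketch below turns marked messages into Anthropic-style content blocks carrying `cache_control`. The function name `shape_for_anthropic`, the marker key, and the choice to always emit the system prompt as a block list are assumptions made for this example; the provider code in the PR may differ.

```python
from __future__ import annotations

from typing import Any

EPHEMERAL = {"type": "ephemeral"}


def shape_for_anthropic(
    messages: list[dict[str, Any]],
    marker_key: str = "cache_breakpoint",  # assumed marker key
) -> tuple[list[dict[str, Any]], list[dict[str, Any]]]:
    """Return (system_blocks, formatted_messages) with cache stamps applied."""
    system_blocks: list[dict[str, Any]] = []
    formatted: list[dict[str, Any]] = []
    for msg in messages:
        marked = bool(msg.get(marker_key))
        content = msg["content"]
        if msg["role"] == "system":
            # System prompts become text blocks so they can carry cache_control.
            block: dict[str, Any] = {"type": "text", "text": content}
            if marked:
                block["cache_control"] = dict(EPHEMERAL)
            system_blocks.append(block)
            continue
        if marked and msg["role"] == "user" and isinstance(content, str):
            # Only the marked, stable user block is stamped.
            content = [
                {"type": "text", "text": content, "cache_control": dict(EPHEMERAL)}
            ]
        # The marker key itself is never copied into the outgoing payload.
        formatted.append({"role": msg["role"], "content": content})
    return system_blocks, formatted
```

Unmarked messages pass through unchanged, so later tool-loop turns do not shift the cached prefix.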

Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes

Suggested labels: size/M, enhancement

Suggested reviewers:

greysonlalonde

🐰 A prompt wrapped in XML, marked for cache, so swift!
Skill tags stable, ephemeral drift,
Breaking agents free from context drift—
Caching leaps on, the rabbit's gift! 🎁

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 43.48% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
| Title check | ❓ Inconclusive | The title 'Lorenze/imp/prompt layering' is overly vague and does not clearly convey the main changes; it reads like a branch name rather than a descriptive PR summary. | Use a more descriptive title that captures the primary change, such as 'Add provider-agnostic cache markers and move skill context to prompt construction' or 'Implement prompt caching with cache breakpoints and XML-formatted skills'. |

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |

coderabbitai

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

lib/crewai/src/crewai/skills/loader.py (1)

175-188: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Escape XML-sensitive content before building <skill> blocks

Line 175 and Line 188 inject skill.name directly into an XML attribute, and Lines 176/178 inject raw text into the wrapped body. If any field contains ", <, &, or </skill>, it can break block boundaries and destabilize the prompt/cache anchor format.

Suggested fix

+from html import escape
+
 def format_skill(skill: Skill) -> str:
+    safe_name = escape(skill.name or "", quote=True)
+    safe_description = escape(skill.description or "")
+    safe_instructions = escape(skill.instructions or "")
+
     if skill.disclosure_level >= INSTRUCTIONS and skill.instructions:
         parts = [
-            f'<skill name="{skill.name}">',
-            skill.description,
+            f'<skill name="{safe_name}">',
+            safe_description,
             "",
-            skill.instructions,
+            safe_instructions,
         ]
         ...
         parts.append("</skill>")
         return "\n".join(parts)
-    return f'<skill name="{skill.name}">\n{skill.description}\n</skill>'
+    return f'<skill name="{safe_name}">\n{safe_description}\n</skill>'

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@lib/crewai/src/crewai/skills/loader.py` around lines 175 - 188, Escape
XML-sensitive characters before injecting skill fields into the generated
<skill> blocks: locate the code in lib/crewai/src/crewai/skills/loader.py that
constructs the skill XML (the function that concatenates or formats skill.name
and the skill body/text into the "<skill ...>...</skill>" block) and replace raw
inserts with an XML-escaping step (e.g., escape &, <, >, " and apostrophe) for
attribute values and either escape or wrap body text in a safe container (CDATA)
for the element content; ensure the code that references skill.name and the
wrapped body text uses the escaped values so that quotes, angle brackets,
ampersands, or literal "</skill>" cannot break the block boundaries.
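To see why escaping matters for the block boundaries, here is a small self-contained sketch. `format_skill_block` is a hypothetical simplification of the formatter (it takes plain strings rather than a `Skill` object); only `html.escape` is a real standard-library call.

```python
from html import escape


def format_skill_block(name: str, description: str) -> str:
    """Hypothetical simplified formatter that escapes XML-sensitive characters."""
    safe_name = escape(name, quote=True)  # also escapes " and ' for attribute use
    safe_description = escape(description)
    return f'<skill name="{safe_name}">\n{safe_description}\n</skill>'


# A description containing a literal closing tag no longer ends the block early.
block = format_skill_block('a"b', "use </skill> & <tags>")
print(block)
```

With escaping in place, the only literal `</skill>` in the output is the real closing tag, so the cached prefix keeps a stable shape whatever the skill text contains.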

🧹 Nitpick comments (2)

lib/crewai/src/crewai/utilities/prompts.py (1)

89-89: ⚡ Quick win

Stabilize and reuse the skill block once per prompt build.

_build_skill_block() is called repeatedly, and the block order currently depends on incoming agent.skills order. Sorting by skill name and computing once in task_execution() makes cache-prefix text deterministic and avoids duplicate formatting work.

♻️ Proposed refactor

@@
-        system: str = self._build_prompt(slices) + self._build_skill_block()
+        skill_block = self._build_skill_block()
+        system: str = self._build_prompt(slices) + skill_block
@@
                 system=system,
                 user=self._build_prompt([task_slice]),
-                prompt=self._build_prompt(slices) + self._build_skill_block(),
+                prompt=self._build_prompt(slices) + skill_block,
             )
         return StandardPromptResult(
             prompt=self._build_prompt(
@@
                 self.prompt_template,
                 self.response_template,
             )
-            + self._build_skill_block()
+            + skill_block
         )
@@
-        sections = [format_skill_context(s) for s in skills if isinstance(s, Skill)]
+        stable_skills = sorted(
+            (s for s in skills if isinstance(s, Skill)),
+            key=lambda s: s.name,
+        )
+        sections = [format_skill_context(s) for s in stable_skills]
         if not sections:
             return ""

Also applies to: 109-109, 118-118, 121-137

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@lib/crewai/src/crewai/utilities/prompts.py` at line 89, The code repeatedly
calls _build_skill_block() and relies on agent.skills' incoming order, causing
nondeterministic cache prefixes and wasted formatting; compute the skill block
once (e.g., in task_execution()) by sorting agent.skills by skill.name to
produce a deterministic ordered list, build the skill_block string there, store
it in a local variable (e.g., skill_block) and replace all direct calls to
_build_skill_block() (including where system is set and other uses of
_build_skill_block) to reuse that stored skill_block so formatting is done only
once per prompt build.
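The ordering concern can be illustrated with a minimal sketch. The `Skill` dataclass and `build_skill_block` below are stand-ins invented for this example, not the classes in the repository; the point is only that sorting by name makes the rendered block independent of input order.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Skill:
    """Hypothetical stand-in for the real Skill model."""

    name: str
    description: str


def build_skill_block(skills: list[object]) -> str:
    """Render a <skills> block whose text does not depend on input order."""
    stable = sorted(
        (s for s in skills if isinstance(s, Skill)),
        key=lambda s: s.name,
    )
    if not stable:
        return ""
    sections = [
        f'<skill name="{s.name}">\n{s.description}\n</skill>' for s in stable
    ]
    return "\n<skills>\n" + "\n".join(sections) + "\n</skills>"
```

Because prompt caching matches on an exact prefix, two runs that list the same skills in a different order would otherwise produce different cache keys.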

lib/crewai/tests/llms/test_prompt_cache.py (1)

67-69: ⚡ Quick win

Avoid position-dependent block selection in Anthropic assertion.
formatted[0]["content"][-1] can become brittle if content/message ordering shifts. Prefer locating the specific text block by role/type/content, like you already do in the next test.

Proposed test hardening

-        # First user block carries cache_control too
-        last_block = formatted[0]["content"][-1]
-        assert last_block["cache_control"] == {"type": "ephemeral"}
+        # Stable user text block carries cache_control too
+        user_msg = next(fm for fm in formatted if fm["role"] == "user")
+        text_block = next(
+            b
+            for b in user_msg["content"]
+            if isinstance(b, dict)
+            and b.get("type") == "text"
+            and b.get("text") == "ping"
+        )
+        assert text_block.get("cache_control") == {"type": "ephemeral"}

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@lib/crewai/tests/llms/test_prompt_cache.py` around lines 67 - 69, The
assertion is brittle because it picks the last content entry via
formatted[0]["content"][-1]; instead, locate the specific user/text block
deterministically (e.g., iterate formatted[0]["content"] and find the dict whose
"type" or "text"/"role" matches the expected user message) and then assert that
its "cache_control" equals {"type": "ephemeral"}—update the test in
test_prompt_cache.py to search for the block by its identifying fields rather
than by position.
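One way to express the position-independent lookup is a small test helper. `find_text_block` is a hypothetical name, and the payload shape follows the `role`/`content`/`type`/`text` fields used in the proposed test change above.

```python
from __future__ import annotations

from typing import Any


def find_text_block(
    formatted: list[dict[str, Any]], role: str, text: str
) -> dict[str, Any]:
    """Return the first text block with the given text in a message of `role`."""
    for message in formatted:
        if message.get("role") != role:
            continue
        content = message.get("content")
        if not isinstance(content, list):
            continue
        for block in content:
            if (
                isinstance(block, dict)
                and block.get("type") == "text"
                and block.get("text") == text
            ):
                return block
    raise AssertionError(f"no {role!r} text block with text {text!r}")
```

A test can then assert on `find_text_block(formatted, "user", "ping").get("cache_control")` and stay valid even if extra blocks or messages are added around the stable one.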

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@lib/crewai/src/crewai/llms/base_llm.py`:
- Around line 722-729: The code that converts incoming dicts into LLMMessage
instances (in BaseLLM, the method that currently casts messages to LLMMessage)
only checks for key presence and should also validate value types: ensure "role"
is a str and "content" is a str (or the expected type), otherwise raise a
ValueError immediately instead of casting; update the message-casting routine
(the BaseLLM message conversion helper that produces LLMMessage) to perform
these type checks before constructing LLMMessage and include a clear ValueError
when types are invalid.

In `@lib/crewai/src/crewai/llms/providers/anthropic/completion.py`:
- Around line 704-707: The code that collects message content in the Anthropic
completion path is currently only grabbing string content from "assistant"
messages and the later stamping loop (in the same module) only applies
cache_breakpoint handling to "user" messages, so assistant cache_breakpoint
flags and structured list-form content are ignored; update the collection step
(where message content is aggregated) to (1) stop collecting/depending on
"assistant" roles if they are not stamped, or better remove the unused assistant
collection, and (2) when extracting content from a message (look for the place
that reads message["content"] and sets collected content), handle both string
and list-form content by detecting if content is a list and
concatenating/extracting text fields (preserving original ordering) so messages
with {"content": [...], "cache_breakpoint": True} are honored; ensure the
stamping loop that sets cache_breakpoint (the loop that checks message["role"]
== "user") either also checks "assistant" roles or you remove assistant
collection so only user roles are processed, and keep the cache_breakpoint flag
propagation logic consistent with these changes.
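The two inline comments can be sketched together as follows. Both function names are hypothetical, and accepting either `str` or `list` content is an assumption for this illustration (the real `LLMMessage` type may allow other shapes, such as empty content on tool-call turns).

```python
from __future__ import annotations

from typing import Any


def validate_message(message: Any) -> None:
    """Raise ValueError unless role is a str and content is a str or list."""
    if not isinstance(message, dict):
        raise ValueError("message must be a dict")
    if not isinstance(message.get("role"), str):
        raise ValueError("message 'role' must be a string")
    if not isinstance(message.get("content"), (str, list)):
        raise ValueError("message 'content' must be a string or a list of blocks")


def extract_text(content: str | list[Any]) -> str:
    """Collect text from string or list-form content, preserving block order."""
    if isinstance(content, str):
        return content
    parts: list[str] = []
    for block in content:
        if (
            isinstance(block, dict)
            and block.get("type") == "text"
            and isinstance(block.get("text"), str)
        ):
            parts.append(block["text"])
    return "".join(parts)
```

Handling list-form content this way means a message such as `{"content": [...], "cache_breakpoint": True}` can still be matched against its formatted counterpart when deciding where to stamp the breakpoint.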

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: dd35940f-39d2-441b-b2b2-d6780d4e8c39

📥 Commits

Reviewing files that changed from the base of the PR and between 63a9e7e and b2488ec.

📒 Files selected for processing (12)

lib/crewai/src/crewai/agent/core.py
lib/crewai/src/crewai/agent/utils.py
lib/crewai/src/crewai/agents/crew_agent_executor.py
lib/crewai/src/crewai/experimental/agent_executor.py
lib/crewai/src/crewai/llms/base_llm.py
lib/crewai/src/crewai/llms/cache.py
lib/crewai/src/crewai/llms/providers/anthropic/completion.py
lib/crewai/src/crewai/skills/loader.py
lib/crewai/src/crewai/utilities/prompts.py
lib/crewai/tests/llms/test_prompt_cache.py
lib/crewai/tests/skills/test_integration.py
lib/crewai/tests/skills/test_loader.py

💤 Files with no reviewable changes (2)

lib/crewai/src/crewai/agent/core.py
lib/crewai/src/crewai/agent/utils.py

lorenzejay added 2 commits May 11, 2026 10:36

improving prompt structure especially for prompt caching

f2bae77

Merge branch 'main' of github.com:crewAIInc/crewAI into lorenze/imp/p…

b2488ec

…rompt-layering

github-actions Bot added the size/L label May 11, 2026

cursor Bot reviewed May 11, 2026

Comment thread lib/crewai/src/crewai/llms/cache.py Outdated

Comment thread lib/crewai/src/crewai/llms/providers/anthropic/completion.py Outdated

coderabbitai Bot requested changes May 11, 2026

Comment thread lib/crewai/src/crewai/llms/base_llm.py

Comment thread lib/crewai/src/crewai/llms/providers/anthropic/completion.py Outdated

lorenzejay and others added 4 commits May 11, 2026 11:36

addressing comments

98eaee0

Merge branch 'main' into lorenze/imp/prompt-layering

4be1c10

Merge branch 'main' of github.com:crewAIInc/crewAI into lorenze/imp/p…

4186f66

…rompt-layering

Merge branch 'lorenze/imp/prompt-layering' of github.com:crewAIInc/cr…

98ba391

…ewAI into lorenze/imp/prompt-layering

coderabbitai Bot approved these changes May 12, 2026

Merge branch 'main' into lorenze/imp/prompt-layering

6f94c9b

lorenzejay merged commit 264da82 into main May 12, 2026
55 of 88 checks passed

lorenzejay deleted the lorenze/imp/prompt-layering branch May 12, 2026 19:39

This was referenced May 29, 2026

fix: guard cache_breakpoint marking behind is_anthropic check #5971

Open

fix: strip cache_breakpoint in LiteLLM path and expand Anthropic prefixes #5914

Open

Lorenze/imp/prompt layering #5774: lorenzejay merged 7 commits into main from lorenze/imp/prompt-layering
