45 changes: 45 additions & 0 deletions docs/content/docs/(core)/prompts.mdx
@@ -124,6 +124,22 @@ The channel system prompt is the most complex, assembled from multiple dynamic c

{{ worker_capabilities }}

{{ system_prompt_cache_boundary }}

{%- if available_channels %}
{{ available_channels }}
{%- endif %}

{%- if working_memory %}
{{ working_memory }}
{%- endif %}

{%- if knowledge_synthesis %}
## Knowledge Context

{{ knowledge_synthesis }}
{%- endif %}

{%- if conversation_context %}
## Conversation Context

@@ -137,6 +153,35 @@ The channel system prompt is the most complex, assembled from multiple dynamic c
{%- endif %}
```

## Prompt Cache Boundary

The channel prompt includes a cache boundary after the stable instruction prefix and before volatile runtime context. Anthropic requests split the system prompt at that marker: the stable prefix receives `cache_control`, while status, working memory, knowledge context, channel activity, and conversation context do not.

Other providers strip the marker before sending instructions.
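The two marker-handling strategies can be sketched as follows. This is a minimal standalone sketch, not the project's code: the real `SYSTEM_PROMPT_CACHE_BOUNDARY` constant lives in `src/prompts/engine.rs` and its actual value is not shown here, so the HTML-comment value below is a placeholder assumption.

```rust
// Placeholder marker value; the real constant is defined in src/prompts/engine.rs.
const SYSTEM_PROMPT_CACHE_BOUNDARY: &str = "<!-- cache-boundary -->";

/// Anthropic path: split at the first marker so the stable prefix can carry
/// `cache_control` while the volatile suffix does not.
fn split_system_prompt_cache_boundary(preamble: &str) -> Option<(&str, &str)> {
    preamble.split_once(SYSTEM_PROMPT_CACHE_BOUNDARY)
}

/// Other providers: remove every occurrence of the marker before sending
/// the preamble as a plain system message or `instructions` string.
fn strip_system_prompt_cache_boundary(preamble: &str) -> String {
    preamble.replace(SYSTEM_PROMPT_CACHE_BOUNDARY, "")
}

fn main() {
    let preamble = format!("stable\n{}\nvolatile", SYSTEM_PROMPT_CACHE_BOUNDARY);
    // Split keeps the text on both sides of the marker, marker removed.
    let (stable, volatile) = split_system_prompt_cache_boundary(&preamble).unwrap();
    assert_eq!(stable, "stable\n");
    assert_eq!(volatile, "\nvolatile");
    // Strip removes all occurrences in one pass.
    assert_eq!(strip_system_prompt_cache_boundary(&preamble), "stable\n\nvolatile");
}
```

Note the asymmetry the review comment below the diff calls out: `split_once` only consumes the first marker, while `replace` removes them all.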

Keep stable sections above the boundary:

- identity context
- base channel rules
- adapter guidance
- skills
- worker capabilities

Keep volatile sections below it:

- available channels
- org and project context
- working memory
- channel activity
- participant context
- knowledge context
- conversation context
- current status
- message coalescing hints
- backfilled transcript data

The `token_usage` table records `cache_read_tokens` and `cache_write_tokens`. Use those fields to check whether prompt-cache changes are paying off.
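One way to read those fields is as a cache hit ratio per request. The helper below is purely illustrative — only the two column names come from the docs; the function and its threshold interpretation are assumptions.

```rust
/// Hypothetical helper: given per-request counts from the `token_usage`
/// table, estimate what share of the cached prefix was read back rather
/// than rewritten. Reads should dominate writes once the cache is warm.
fn cache_hit_ratio(cache_read_tokens: u64, cache_write_tokens: u64) -> f64 {
    let total = cache_read_tokens + cache_write_tokens;
    if total == 0 {
        return 0.0; // no cache activity recorded for this request
    }
    cache_read_tokens as f64 / total as f64
}

fn main() {
    // A warm request: most cached-prefix tokens were served from cache.
    assert!((cache_hit_ratio(9000, 1000) - 0.9).abs() < 1e-9);
    // No cache activity at all.
    assert_eq!(cache_hit_ratio(0, 0), 0.0);
}
```

A ratio that stays near zero after the first request suggests the stable prefix is changing between requests and the boundary placement needs revisiting.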

## Adding a New Language

1. Create language directory:
2 changes: 2 additions & 0 deletions prompts/en/channel.md.j2
@@ -164,6 +164,8 @@ When in doubt, skip. Being a lurker who speaks when it matters is better than be

{{ worker_capabilities }}

{{ system_prompt_cache_boundary }}

{%- if available_channels %}
{{ available_channels }}
{%- endif %}
143 changes: 136 additions & 7 deletions src/llm/anthropic/params.rs
@@ -132,21 +132,40 @@ fn build_system_prompt(
}

if let Some(preamble) = &request.preamble {
let mut preamble_block = serde_json::json!({
"type": "text",
"text": preamble,
});
if let Some(cc) = cache_control {
preamble_block["cache_control"] = cc.clone();
if let Some((stable_prefix, volatile_suffix)) =
crate::prompts::engine::split_system_prompt_cache_boundary(preamble)
{
push_system_text_block(&mut system_blocks, stable_prefix, cache_control);
push_system_text_block(&mut system_blocks, volatile_suffix, &None);
} else {
push_system_text_block(&mut system_blocks, preamble, cache_control);
}
system_blocks.push(preamble_block);
}
Comment on lines 134 to 143
⚠️ Potential issue | 🟡 Minor

Multiple boundary markers: inconsistent handling between providers.

split_system_prompt_cache_boundary uses split_once, so if the preamble contains more than one SYSTEM_PROMPT_CACHE_BOUNDARY marker, any subsequent markers remain verbatim inside the volatile suffix text block sent to Anthropic. Meanwhile the non-Anthropic path in src/llm/model.rs uses strip_system_prompt_cache_boundary (a replace), which removes all occurrences. Today there's only one marker in the channel template, so this is latent rather than exploitable, but if another template or fragment ever adds the marker Anthropic users would see the raw HTML comment leak into the system prompt while other providers would not.

Consider either splitting on first and stripping the marker from the remaining suffix, or asserting/documenting single-marker invariant.

🔧 Proposed tweak
         if let Some((stable_prefix, volatile_suffix)) =
             crate::prompts::engine::split_system_prompt_cache_boundary(preamble)
         {
+            let volatile_suffix =
+                crate::prompts::strip_system_prompt_cache_boundary(volatile_suffix);
             push_system_text_block(&mut system_blocks, stable_prefix, cache_control);
-            push_system_text_block(&mut system_blocks, volatile_suffix, &None);
+            push_system_text_block(&mut system_blocks, &volatile_suffix, &None);
         } else {
             push_system_text_block(&mut system_blocks, preamble, cache_control);
         }


if !system_blocks.is_empty() {
body["system"] = serde_json::json!(system_blocks);
}
}

fn push_system_text_block(
system_blocks: &mut Vec<serde_json::Value>,
text: &str,
cache_control: &Option<serde_json::Value>,
) {
if text.trim().is_empty() {
return;
}

let mut block = serde_json::json!({
"type": "text",
"text": text,
});
if let Some(cache_control) = cache_control {
block["cache_control"] = cache_control.clone();
}
system_blocks.push(block);
}

/// Build tool definitions, optionally normalizing names. Returns the original
/// tool (name, description) pairs for reverse-mapping on response.
fn build_tools(
@@ -201,6 +220,23 @@ fn build_tools(
#[cfg(test)]
mod tests {
use super::*;
use rig::completion::{Message, ToolDefinition};
use rig::one_or_many::OneOrMany;

fn completion_request_with_preamble(preamble: &str) -> CompletionRequest {
CompletionRequest {
model: None,
preamble: Some(preamble.to_string()),
chat_history: OneOrMany::one(Message::user("hello")),
documents: Vec::new(),
tools: Vec::new(),
temperature: None,
max_tokens: None,
tool_choice: None,
additional_params: None,
output_schema: None,
}
}

#[test]
fn adaptive_thinking_detected_for_4_6_models() {
@@ -218,4 +254,97 @@ mod tests {
assert!(!supports_adaptive_thinking("claude-opus-4-0"));
assert!(!supports_adaptive_thinking("gpt-4o"));
}

#[test]
fn system_prompt_cache_boundary_splits_preamble_cache_control() {
let request = completion_request_with_preamble(&format!(
"stable prefix\n{}\nvolatile suffix",
crate::prompts::engine::SYSTEM_PROMPT_CACHE_BOUNDARY
));
let expected_cache_control = serde_json::json!({"type": "ephemeral"});
let cache_control = Some(expected_cache_control.clone());
let mut body = serde_json::json!({});

build_system_prompt(&mut body, &request, false, &cache_control);

let system_blocks = body["system"]
.as_array()
.expect("system prompt should be an array");
assert_eq!(system_blocks.len(), 2);
assert_eq!(system_blocks[0]["text"], "stable prefix\n");
assert_eq!(system_blocks[0]["cache_control"], expected_cache_control);
assert_eq!(system_blocks[1]["text"], "\nvolatile suffix");
assert!(system_blocks[1].get("cache_control").is_none());
}

#[test]
fn system_prompt_without_cache_boundary_preserves_existing_cache_behavior() {
let request = completion_request_with_preamble("stable prompt");
let expected_cache_control = serde_json::json!({"type": "ephemeral"});
let cache_control = Some(expected_cache_control.clone());
let mut body = serde_json::json!({});

build_system_prompt(&mut body, &request, false, &cache_control);

let system_blocks = body["system"]
.as_array()
.expect("system prompt should be an array");
assert_eq!(system_blocks.len(), 1);
assert_eq!(system_blocks[0]["text"], "stable prompt");
assert_eq!(system_blocks[0]["cache_control"], expected_cache_control);
}

#[test]
fn build_anthropic_request_keeps_cache_boundary_out_of_volatile_system_block() {
let client = reqwest::Client::new();
let mut request = completion_request_with_preamble(&format!(
"stable prefix\n{}\nvolatile suffix",
crate::prompts::engine::SYSTEM_PROMPT_CACHE_BOUNDARY
));
request.tools = vec![ToolDefinition {
name: "reply".to_string(),
description: "Send a reply".to_string(),
parameters: serde_json::json!({
"type": "object",
"properties": {
"text": {"type": "string"}
}
}),
}];

let anthropic_request = build_anthropic_request(
&client,
"sk-ant-test",
"https://api.anthropic.com",
"claude-sonnet-4-5",
&request,
"auto",
false,
);
let http_request = anthropic_request
.builder
.build()
.expect("request should build");
let body = http_request
.body()
.and_then(reqwest::Body::as_bytes)
.expect("request body should be buffered JSON");
let body: serde_json::Value =
serde_json::from_slice(body).expect("request body should be JSON");

let system_blocks = body["system"]
.as_array()
.expect("system prompt should be an array");
assert_eq!(system_blocks.len(), 2);
assert!(system_blocks[0]["cache_control"].is_object());
assert!(system_blocks[1].get("cache_control").is_none());
assert_eq!(system_blocks[0]["text"], "stable prefix\n");
assert_eq!(system_blocks[1]["text"], "\nvolatile suffix");

let tools = body["tools"]
.as_array()
.expect("tool definitions should be an array");
assert_eq!(tools.len(), 1);
assert!(tools[0]["cache_control"].is_object());
}
}
5 changes: 5 additions & 0 deletions src/llm/model.rs
@@ -833,6 +833,7 @@ impl SpacebotModel {
let mut messages = Vec::new();

if let Some(preamble) = &request.preamble {
let preamble = crate::prompts::strip_system_prompt_cache_boundary(preamble);
messages.push(serde_json::json!({
"role": "system",
"content": preamble,
@@ -945,6 +946,7 @@ impl SpacebotModel {
});

if let Some(preamble) = &request.preamble {
let preamble = crate::prompts::strip_system_prompt_cache_boundary(preamble);
body["instructions"] = serde_json::json!(preamble);
} else if is_chatgpt_codex {
body["instructions"] = serde_json::json!(
@@ -1071,6 +1073,7 @@ impl SpacebotModel {
});

if let Some(preamble) = &request.preamble {
let preamble = crate::prompts::strip_system_prompt_cache_boundary(preamble);
body["instructions"] = serde_json::json!(preamble);
} else if is_chatgpt_codex {
body["instructions"] = serde_json::json!(
@@ -1380,6 +1383,7 @@ impl SpacebotModel {
let mut messages = Vec::new();

if let Some(preamble) = &request.preamble {
let preamble = crate::prompts::strip_system_prompt_cache_boundary(preamble);
messages.push(serde_json::json!({
"role": "system",
"content": preamble,
@@ -1472,6 +1476,7 @@ impl SpacebotModel {
let mut messages = Vec::new();

if let Some(preamble) = &request.preamble {
let preamble = crate::prompts::strip_system_prompt_cache_boundary(preamble);
messages.push(serde_json::json!({
"role": "system",
"content": preamble,
63 changes: 63 additions & 0 deletions src/mcp.rs
@@ -605,6 +605,7 @@ impl McpManager {
}
}

names.sort();
names
}

@@ -853,6 +854,68 @@ fn interpolate_env_placeholders(value: &str) -> String {
mod tests {
use super::*;

fn test_mcp_config(name: &str) -> McpServerConfig {
McpServerConfig {
name: name.to_string(),
enabled: true,
transport: McpTransport::Stdio {
command: "test".to_string(),
args: Vec::new(),
env: HashMap::new(),
},
}
}

fn test_tool(name: &str, description: Option<&str>) -> rmcp::model::Tool {
let mut tool = rmcp::model::Tool::default();
tool.name = Cow::Owned(name.to_string());
tool.description = description.map(|description| Cow::Owned(description.to_string()));
tool
}

#[tokio::test]
async fn get_tool_names_returns_deterministic_sorted_names() {
let manager = McpManager::new(Vec::new());

let later_connection = Arc::new(McpConnection::new(test_mcp_config("z_server")));
{
let mut tools = later_connection.tools.write().await;
*tools = vec![test_tool("z_tool", Some("z desc"))];
}
{
let mut state = later_connection.state.write().await;
*state = McpConnectionState::Connected;
}

let earlier_connection = Arc::new(McpConnection::new(test_mcp_config("a_server")));
{
let mut tools = earlier_connection.tools.write().await;
*tools = vec![
test_tool("b_tool", None),
test_tool("a_tool", Some("a desc")),
];
}
{
let mut state = earlier_connection.state.write().await;
*state = McpConnectionState::Connected;
}

{
let mut connections = manager.connections.write().await;
connections.insert("z_server".to_string(), later_connection);
connections.insert("a_server".to_string(), earlier_connection);
}

assert_eq!(
manager.get_tool_names().await,
vec![
"a_tool — a desc",
"b_tool — from a_server",
"z_tool — z desc"
]
);
}

#[test]
fn parse_bearer_token_strips_bearer_prefix() {
let token = parse_bearer_token("Bearer abc123", "test").unwrap();
2 changes: 1 addition & 1 deletion src/prompts.rs
@@ -1,5 +1,5 @@
pub mod engine;
pub mod text;

pub use engine::{PromptEngine, SkillInfo};
pub use engine::{PromptEngine, SkillInfo, strip_system_prompt_cache_boundary};
pub use text::{get as get_text, init as init_language};