fix(agent): escape XML values, CDATA content, generic corrective message

0xhis · 0xhis · commit f19c2a9e88f3 · 2026-03-21T01:32:27.000-07:00
- Add html.escape() to target values in <scan_task> (URLs, paths, IPs)
- Escape sender_name/sender_id in <agent_message> attributes
- CDATA-wrap message content in <agent_message> to handle any text
- Make corrective message generic (no StrixAgent-specific tool names)
diff --git a/.sisyphus/plans/pr-review-fixes.md b/.sisyphus/plans/pr-review-fixes.md
@@ -0,0 +1,213 @@
+# PR Review Feedback Fixes
+
+## TL;DR
+
+> **Quick Summary**: Address reviewer feedback from Greptile and Copilot across PRs 381-384. Fix a real bug (thinking blocks dropped on interrupted messages), improve code quality (XML escaping, private symbol export), and add clarifying comments (Qwen compliance block).
+>
+> **Deliverables**:
+> - PR 381: `get_message_tokens` (dropped leading underscore)
+> - PR 382: Split WSTG category codes + compliance block comment
+> - PR 383: Thinking blocks fix in interrupted path
+> - PR 384: XML escaping + generic corrective message
+>
+> **Estimated Effort**: Quick
+> **Parallel Execution**: YES - 4 PRs, each fix is independent
+
+---
+
+## Context
+
+### Original Request
+User asked to read reviewer feedback on the 4 split PRs and implement fixes.
+
+### Reviewer Findings
+
+**PR 381 (Memory Fix)** - Greptile: `_get_message_tokens` is a private symbol imported across module boundaries. Rename to `get_message_tokens`.
+
+**PR 382 (WSTG Prompts)** - Greptile: `IDNT/ATHN` and `ATHZ/SESS` are combined codes inconsistent with single codes used elsewhere. `<compliance>` block wording is too aggressive for prompt injection resistance. User clarified: keep the aggressive wording (needed for Qwen-series models), add a comment explaining why.
+
+**PR 383 (TUI Status)** - Greptile: **BUG** - thinking blocks silently dropped when `metadata["interrupted"]` is true. Early return bypasses `renderables` list.
+
+**PR 384 (Agent Workflow)** - Greptile: (1) URLs with `&` produce invalid XML in `<scan_task>`, (2) corrective message hardcodes StrixAgent-specific tool names in BaseAgent, (3) message content not escaped in `<agent_message>` XML. User approved: HTML escape for all targets, generic corrective message.
+
+---
+
+## Work Objectives
+
+### Core Objective
+Apply all non-blocking review feedback to make PRs cleaner and more robust.
+
+### Concrete Deliverables
+- `_get_message_tokens` renamed to `get_message_tokens` in both files
+- WSTG category codes split into separate phases
+- Compliance block comment added (Qwen rationale)
+- Thinking blocks included in interrupted message path
+- `html.escape()` on all XML-interpolated values
+- Generic corrective message in BaseAgent
+- CDATA-wrapping for `<agent_message>` content
+
+---
+
+## TODOs
+
+- [ ] 1. PR 381: Rename `_get_message_tokens` → `get_message_tokens`
+
+  **What to do**:
+  - Rename function in `strix/llm/memory_compressor.py` (definition)
+  - Update import in `strix/llm/llm.py` (consumption)
+  - Commit on `fix/memory-compressor-token-budget` branch
+
+  **Must NOT do**:
+  - Change function behavior or signature
+  - Touch unrelated files
+
+  **Acceptance Criteria**:
+  - [ ] `grep -r "_get_message_tokens"` returns no results in source files
+  - [ ] `python -c "from strix.llm.memory_compressor import get_message_tokens"` succeeds
+  - [ ] `git diff` shows only the rename, no behavioral changes
+
+  **Commit**: `refactor(llm): rename _get_message_tokens to public API name`
+
+- [ ] 2. PR 382: Split WSTG category codes in `<methodology>` phases
+
+  **What to do**:
+  - In `system_prompt.jinja`, split `IDNT/ATHN` into separate `IDNT` and `ATHN` phases
+  - Split `ATHZ/SESS` into separate `ATHZ` and `SESS` phases
+  - Renumber subsequent phases accordingly
+  - Ensure codes match what `<skill_triggers>` and `<phase2>` use
+
+  **Must NOT do**:
+  - Change the `<skill_triggers>` or `<phase2>` sections (they already use correct single codes)
+  - Alter non-phase content in the methodology section
+
+  **Acceptance Criteria**:
+  - [ ] No combined codes like `IDNT/ATHN` remain in system_prompt.jinja
+  - [ ] All methodology phases use single WSTG codes matching `<skill_triggers>`
+
+  **Commit**: `fix(prompts): split combined WSTG category codes for consistency`
+
+- [ ] 3. PR 382: Add comment explaining aggressive compliance wording
+
+  **What to do**:
+  - Add a Jinja comment or XML comment above the `<compliance>` block explaining that the aggressive wording is intentional and was needed when testing with Qwen-series models (e.g., Qwen3.5-Plus)
+  - Note that softer language caused unnecessary refusals during authorized scans
+
+  **Must NOT do**:
+  - Change the actual compliance wording
+  - Remove or weaken the `<compliance>` block
+
+  **Acceptance Criteria**:
+  - [ ] Comment exists above `<compliance>` block referencing Qwen-series models
+  - [ ] Functional output of the Jinja template is unchanged
+
+  **Commit**: `docs: add comment explaining aggressive compliance block rationale`
+
+- [ ] 4. PR 383: Fix thinking blocks dropped on interrupted messages
+
+  **What to do**:
+  - In `_render_chat_content` in `strix/interface/tui.py`, fix the `metadata["interrupted"]` branch
+  - Change `self._merge_renderables([streaming_result, interrupted_text])` to include `renderables`:
+    `self._merge_renderables([*renderables, streaming_result, interrupted_text])`
+
+  **Must NOT do**:
+  - Change the interrupted message rendering logic beyond including renderables
+  - Touch other branches of `_render_chat_content`
+
+  **Acceptance Criteria**:
+  - [ ] Code shows `[*renderables, streaming_result, interrupted_text]` in interrupted branch
+  - [ ] No other changes in the diff
+
+  **Commit**: `fix(ui): include thinking blocks in interrupted message render`
+
+- [ ] 5. PR 384: Add `html.escape()` to XML-interpolated values in strix_agent.py
+
+  **What to do**:
+  - `import html` at top of `strix/agents/StrixAgent/strix_agent.py`
+  - Wrap all target values in `html.escape()` before embedding in XML:
+    - `url` values
+    - `repo["url"]` values
+    - `code["path"]` values
+    - IP address entries
+  - Same approach for `base_agent.py` `<agent_message>` attributes (`sender_name`, `sender_id`)
+
+  **Must NOT do**:
+  - Change the XML structure or element names
+  - Escape already-safe static strings (escape only user- or target-derived values)
+
+  **Acceptance Criteria**:
+  - [ ] `html` is imported in both files
+  - [ ] All interpolated target values wrapped in `html.escape()`
+  - [ ] `<agent_message>` attributes `from="{html.escape(sender_name)}"` and `id="{html.escape(sender_id)}"` escape properly
+
+  **Commit**: `fix(agent): escape XML special characters in target values`
+
+- [ ] 6. PR 384: Escape message content in `<agent_message>` XML
+
+  **What to do**:
+  - In `base_agent.py`, wrap `message.get("content", "")` in CDATA or `html.escape()` before embedding in `<agent_message>` element content
+  - CDATA is preferred here since content is free-form text that shouldn't be interpreted as XML
+
+  **Must NOT do**:
+  - Change the XML element structure
+  - Break existing `clean_content()` regex patterns
+
+  **Acceptance Criteria**:
+  - [ ] Message content is CDATA-wrapped or HTML-escaped in the `<agent_message>` element
+  - [ ] Content containing `</agent_message>` or `]]>` does not break the XML structure
+
+  **Commit**: `fix(agent): escape message content in compact agent_message format`
+
+- [ ] 7. PR 384: Make corrective message generic in BaseAgent
+
+  **What to do**:
+  - In `base_agent.py`, change the corrective message from mentioning specific tool names to generic guidance:
+    ```
+    "You responded with plain text instead of a tool call. 
+    While the agent loop is running, EVERY response MUST be a tool call. 
+    Do NOT send plain text messages. Act via your available tools.
+    Review your task and take action now."
+    ```
+  - Remove references to `create_agent`, `terminal_execute`, `wait_for_message`
+
+  **Must NOT do**:
+  - Change the `add_message("user", corrective_message)` call structure
+  - Change the `return None` behavior
+
+  **Acceptance Criteria**:
+  - [ ] Corrective message contains no StrixAgent-specific tool names
+  - [ ] Message still conveys "use tools, not plain text"
+
+  **Commit**: `fix(agent): use generic corrective message in BaseAgent`
+
+---
+
+## Execution Strategy
+
+Each fix is independent and lives on a separate branch. Apply fixes to the appropriate branch, commit, and force-push.
+
+```
+Wave 1 (all parallel):
+├── Task 1: PR 381 branch - rename function
+├── Task 2: PR 382 branch - split WSTG codes
+├── Task 3: PR 382 branch - compliance comment
+├── Task 4: PR 383 branch - thinking blocks fix
+├── Task 5: PR 384 branch - XML escaping
+├── Task 6: PR 384 branch - CDATA wrapping
+└── Task 7: PR 384 branch - generic corrective message
+```
+
+Tasks 2 and 3 share a branch (PR 382), as do Tasks 5, 6, and 7 (PR 384). Work sequentially within a branch, but all 4 branches can be updated in parallel.
+
+**Branch workflow per PR:**
+1. `git checkout <branch>`
+2. Apply edits
+3. `git add -A && git commit -m "message"`
+4. `git push origin <branch> --force-with-lease`
+
+---
+
+## Success Criteria
+
+- [ ] All 4 PRs still `MERGEABLE` after fixes
+- [ ] No reviewer comments remain unaddressed
+- [ ] Each branch gains only the fix commits listed for its tasks above
diff --git a/strix/agents/StrixAgent/strix_agent.py b/strix/agents/StrixAgent/strix_agent.py
@@ -1,5 +1,6 @@
+import html
 from typing import Any
 
 from strix.agents.base_agent import BaseAgent
 from strix.llm.config import LLMConfig
 
@@ -59,24 +61,24 @@ async def execute_scan(self, scan_config: dict[str, Any]) -> dict[str, Any]:  #
 
         target_lines = []
 
         if repositories:
             for repo in repositories:
                 if repo["workspace_path"]:
-                    target_lines.append(f'  <target type="repository">{repo["url"]} (code at: {repo["workspace_path"]})</target>')
+                    target_lines.append(f'  <target type="repository">{html.escape(repo["url"])} (code at: {html.escape(repo["workspace_path"])})</target>')
                 else:
-                    target_lines.append(f'  <target type="repository">{repo["url"]}</target>')
+                    target_lines.append(f'  <target type="repository">{html.escape(repo["url"])}</target>')
 
         if local_code:
             for code in local_code:
-                target_lines.append(f'  <target type="local_code">{code["path"]} (code at: {code["workspace_path"]})</target>')
+                target_lines.append(f'  <target type="local_code">{html.escape(code["path"])} (code at: {html.escape(code["workspace_path"])})</target>')
 
         if urls:
             for url in urls:
-                target_lines.append(f'  <target type="url">{url}</target>')
+                target_lines.append(f'  <target type="url">{html.escape(url)}</target>')
 
         if ip_addresses:
             for ip in ip_addresses:
-                target_lines.append(f'  <target type="ip">{ip}</target>')
+                target_lines.append(f'  <target type="ip">{html.escape(ip)}</target>')
 
         targets_block = "\n".join(target_lines)
 
diff --git a/strix/agents/base_agent.py b/strix/agents/base_agent.py
@@ -1,5 +1,6 @@
 import asyncio
 import contextlib
+import html
 import logging
 from typing import TYPE_CHECKING, Any, Optional
 
@@ -413,11 +414,7 @@ async def _process_iteration(self, tracer: Optional["Tracer"]) -> bool | None:
         corrective_message = (
             "You responded with plain text instead of a tool call. "
             "While the agent loop is running, EVERY response MUST be a tool call. "
-            "Do NOT send plain text messages. Act via tools:\n"
-            "- Use the think tool to reason through problems\n"
-            "- Use create_agent to spawn subagents for testing\n"
-            "- Use terminal_execute to run commands\n"
-            "- Use wait_for_message ONLY when waiting for subagent results\n"
+            "Do NOT send plain text messages. Act via your available tools. "
             "Review your task and take action now."
         )
         self.state.add_message("user", corrective_message)
@@ -499,12 +496,13 @@ def _check_agent_messages(self, state: AgentState) -> None:  # noqa: PLR0912
                             if sender_id and sender_id in _agent_graph.get("nodes", {}):
                                 sender_name = _agent_graph["nodes"][sender_id]["name"]
 
+                            content = message.get("content", "").replace("]]>", "]]]]><![CDATA[>")
                             message_content = f"""<agent_message
-from="{sender_name}"
-id="{sender_id}"
-type="{message.get("message_type", "information")}"
-priority="{message.get("priority", "normal")}">
-{message.get("content", "")}
+from="{html.escape(sender_name)}"
+id="{html.escape(str(sender_id))}"
+type="{html.escape(message.get("message_type", "information"))}"
+priority="{html.escape(message.get("priority", "normal"))}">
+<![CDATA[{content}]]>
 </agent_message>"""
                             state.add_message("user", message_content.strip())
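One caveat when CDATA-wrapping free-form content: a CDATA section ends at the first `]]>`, so content that contains that sequence would close the section early. The sketch below shows the usual workaround (splitting `]]>` across two adjacent CDATA sections) together with attribute escaping. `cdata_wrap` and `build_agent_message` are hypothetical helpers that only mirror the shape of the `<agent_message>` block, and the sample values are made up.

```python
import html
import xml.etree.ElementTree as ET


def cdata_wrap(text: str) -> str:
    """Wrap text in CDATA, splitting any ']]>' so it cannot end the section early."""
    return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>"


def build_agent_message(sender_name: str, sender_id: str, content: str) -> str:
    """Hypothetical helper mirroring the shape of the <agent_message> block."""
    return (
        f'<agent_message from="{html.escape(sender_name)}" '
        f'id="{html.escape(sender_id)}">\n'
        f"{cdata_wrap(content)}\n"
        "</agent_message>"
    )


# Illustrative values chosen to stress both the attribute and the content path.
name = 'recon "alpha" <1>'
payload = "found </agent_message> & a stray ]]> marker"

root = ET.fromstring(build_agent_message(name, "agent-42", payload))

print(root.get("from") == name)      # True: quotes and angle brackets survive
print(root.text.strip() == payload)  # True: content round-trips unchanged
```

A plain `<![CDATA[...]]>` wrapper is sufficient whenever the content cannot contain `]]>`; the split only matters for arbitrary text such as tool output or messages relayed from other agents.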