Commit affaf32

joel-sass and claude committed
feat: secure credentials — LLM never receives credential values
Adds --credentials KEY=VALUE[,...] and --credentials-file path.json CLI flags so authorization secrets (usernames, passwords, API keys) can be supplied to a scan without the LLM ever seeing the actual values.

How it works:
- Credential names are listed in the system prompt so the LLM knows what is available.
- The LLM writes {{NAME}} placeholders in any tool input (shell commands, HTTP bodies, file writes, etc.).
- Before the tool executes the framework substitutes the real value; after execution, any credential values in the output are replaced with [CREDENTIAL:NAME] before the LLM sees the result.
- The LLM therefore never handles the actual secret at any point in the conversation history.

Changes:
- strix/interface/main.py: --credentials / --credentials-file flags with full validation (_parse_credentials helper)
- strix/interface/cli.py, tui/app.py: forward credentials into scan_config
- strix/core/inputs.py: add credential_names to system prompt context; skip parallel_tool_calls=False when routing via proxy to avoid a Bedrock tool_choice.type error
- strix/core/runner.py: place credentials dict in runtime context so all tool wrappers can read them via ctx.context["credentials"]
- strix/tools/credentials/tool.py: substitute_credentials() and scrub_credentials() pure utilities
- strix/agents/factory.py: _wrap_credential_substitution() applied to _BASE_TOOLS, exec_command, write_stdin, and filesystem tools; uses dataclasses.replace for singleton FunctionTools, in-place mutation for subclasses (e.g. ViewImageTool) that override __init__
- strix/agents/prompts/system_prompt.jinja: CREDENTIALS AVAILABLE block explains {{NAME}} placeholder syntax using {% raw %} so Jinja does not evaluate the braces as template expressions
- pyproject.toml: pytest dev dependency and ruff per-file ignores
- tests: 31 tests covering CLI parsing, scope context, substitution, scrubbing, and wrapper integration
- docs: README, cli.mdx, instructions.mdx updated to document the new flags and remove inline secret examples

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
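
The substitute-then-scrub flow described in the commit message can be sketched as follows. This is a minimal illustration, not the code in strix/tools/credentials/tool.py (that file is not shown on this page): the function names match the commit message, but the regex, the handling of unknown names, and the longest-value-first ordering are assumptions.

```python
import re

# Assumed placeholder grammar: {{NAME}} where NAME is an identifier-like key.
_PLACEHOLDER_RE = re.compile(r"\{\{([A-Za-z_][A-Za-z0-9_]*)\}\}")


def substitute_credentials(text: str, credentials: dict[str, str]) -> str:
    """Replace {{NAME}} placeholders with real values; unknown names are left as-is."""

    def _swap(match: re.Match) -> str:
        name = match.group(1)
        return credentials.get(name, match.group(0))

    return _PLACEHOLDER_RE.sub(_swap, text)


def scrub_credentials(text: str, credentials: dict[str, str]) -> str:
    """Replace literal credential values in tool output with [CREDENTIAL:NAME]."""
    # Longest values first, so a value that contains a shorter one is masked whole.
    ordered = sorted(credentials.items(), key=lambda item: len(item[1]), reverse=True)
    for name, value in ordered:
        if value:  # an empty value would otherwise match at every position
            text = text.replace(value, f"[CREDENTIAL:{name}]")
    return text


creds = {"USERNAME": "admin", "PASSWORD": "s3cret"}
# What the tool receives: placeholders replaced by real values.
command = substitute_credentials("curl -u {{USERNAME}}:{{PASSWORD}} http://target", creds)
# What the LLM receives: real values replaced by named markers.
output = scrub_credentials("logged in as admin with s3cret", creds)
```

One limitation of plain string scrubbing, under these assumptions: a value that the tool re-encodes (for example base64 or URL encoding) would not match and would pass through unmasked.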
1 parent 250fe2c commit affaf32

17 files changed

Lines changed: 521 additions & 16 deletions

README.md

Lines changed: 3 additions & 1 deletion
@@ -162,7 +162,9 @@ strix --target https://your-app.com
 
 ```bash
 # Grey-box authenticated testing
-strix --target https://your-app.com --instruction "Perform authenticated testing using credentials: user:pass"
+strix --target https://your-app.com \
+  --credentials USERNAME=user,PASSWORD=pass \
+  --instruction "Perform authenticated testing using the USERNAME and PASSWORD credentials"
 
 # Multi-target testing (source code + deployed app)
 strix -t https://github.com/org/app -t https://your-app.com

docs/usage/cli.mdx

Lines changed: 12 additions & 2 deletions
@@ -16,13 +16,21 @@ strix --target <target> [options]
 </ParamField>
 
 <ParamField path="--instruction" type="string">
-  Custom instructions for the scan. Use for credentials, focus areas, or specific testing approaches.
+  Custom instructions for the scan. Use for focus areas or specific testing approaches (e.g., "Focus on IDOR and auth bypass"). For credentials, use `--credentials` or `--credentials-file` instead.
 </ParamField>
 
 <ParamField path="--instruction-file" type="string">
   Path to a file containing detailed instructions.
 </ParamField>
 
+<ParamField path="--credentials" type="string">
+  Comma-separated `KEY=VALUE` credential pairs kept out of the LLM conversation. Reference credentials by name in `--instruction` (e.g., `"Log in using USERNAME and PASSWORD"`). Example: `--credentials USERNAME=admin,PASSWORD=secret`. File values from `--credentials-file` load first; inline values override on key collision.
+</ParamField>
+
+<ParamField path="--credentials-file" type="string">
+  Path to a JSON file of credential key-value pairs (e.g., `{"USERNAME": "admin"}`). Values are kept out of the LLM conversation. Inline `--credentials` values override file values on key collision.
+</ParamField>
+
 <ParamField path="--scan-mode, -m" type="string" default="deep">
   Scan depth: `quick`, `standard`, or `deep`.
 </ParamField>
@@ -50,7 +58,9 @@ strix --target <target> [options]
 strix --target https://example.com
 
 # Authenticated testing
-strix --target https://app.com --instruction "Use credentials: user:pass"
+strix --target https://app.com \
+  --credentials USERNAME=user,PASSWORD=pass \
+  --instruction "Log in using USERNAME and PASSWORD, then test authenticated endpoints"
 
 # Focused testing
 strix --target api.example.com --instruction "Focus on IDOR and auth bypass"
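
The precedence rule documented for `--credentials` and `--credentials-file` (file values load first, inline values win on key collision) can be sketched as below. The commit mentions a `_parse_credentials` helper in strix/interface/main.py, but its body is not shown here, so `parse_inline_credentials` and `merge_credentials` are hypothetical names and the validation rules are assumptions.

```python
import json
import tempfile
from pathlib import Path
from typing import Optional


def parse_inline_credentials(raw: str) -> dict[str, str]:
    """Parse 'KEY=VALUE,KEY2=VALUE2' into a dict (hypothetical helper)."""
    result: dict[str, str] = {}
    for pair in raw.split(","):
        # Split on the first '=' only, so values may themselves contain '='.
        key, sep, value = pair.partition("=")
        key = key.strip()
        if not sep or not key:
            raise ValueError(f"expected KEY=VALUE, got {pair!r}")
        result[key] = value
    return result


def merge_credentials(file_path: Optional[str], inline: Optional[str]) -> dict[str, str]:
    """Load file values first, then let inline values override on key collision."""
    merged: dict[str, str] = {}
    if file_path:
        data = json.loads(Path(file_path).read_text(encoding="utf-8"))
        if not isinstance(data, dict):
            raise ValueError("credentials file must contain a JSON object")
        merged.update({str(k): str(v) for k, v in data.items()})
    if inline:
        merged.update(parse_inline_credentials(inline))
    return merged


# Usage: PASSWORD appears in both sources, and the inline value wins.
with tempfile.TemporaryDirectory() as tmp:
    creds_path = Path(tmp) / "creds.json"
    creds_path.write_text(
        json.dumps({"USERNAME": "admin", "PASSWORD": "from-file"}), encoding="utf-8"
    )
    merged = merge_credentials(str(creds_path), "PASSWORD=inline")
```

Because pairs are split on commas, a value containing a comma cannot be expressed inline in this sketch; the JSON file form avoids that restriction.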

docs/usage/instructions.mdx

Lines changed: 25 additions & 8 deletions
@@ -3,7 +3,7 @@ title: "Custom Instructions"
 description: "Guide Strix with custom testing instructions"
 ---
 
-Use instructions to provide context, credentials, or focus areas for your scan.
+Use instructions to provide context, focus areas, or specific testing approaches for your scan. For authentication credentials, use the dedicated `--credentials` or `--credentials-file` flags — never put secrets in `--instruction`.
 
 ## Inline Instructions
 
@@ -23,11 +23,30 @@ strix --target https://app.com --instruction-file ./pentest-instructions.md
 
 ### Authenticated Testing
 
+Pass credentials separately from instructions using `--credentials` or `--credentials-file`. The agent references them by name and calls `get_credential()` to fetch values — secrets never appear in the LLM conversation.
+
 ```bash
+# Inline credentials
+strix --target https://app.com \
+  --credentials USERNAME=test@example.com,PASSWORD=TestPass123 \
+  --instruction "Log in using the USERNAME and PASSWORD credentials, then test authenticated endpoints"
+
+# From a file
 strix --target https://app.com \
-  --instruction "Login with email: test@example.com, password: TestPass123"
+  --credentials-file ./creds.json \
+  --instruction "Log in using the USERNAME and PASSWORD credentials"
 ```
 
+`creds.json` format:
+```json
+{
+  "USERNAME": "test@example.com",
+  "PASSWORD": "TestPass123"
+}
+```
+
+Both flags can be combined — file values are loaded first, inline `--credentials` override on key collision.
+
 ### Focused Scope
 
 ```bash
@@ -45,19 +64,17 @@ strix --target https://app.com \
 ### API Testing
 
 ```bash
+# Pass an API key as a credential, reference it in the instruction
 strix --target https://api.example.com \
-  --instruction "Use API key header: X-API-Key: abc123. Focus on rate limiting bypass."
+  --credentials API_KEY=abc123 \
+  --instruction "Use the API_KEY credential as the X-Api-Key header. Focus on rate limiting bypass."
 ```
 
 ## Instruction File Example
 
 ```markdown instructions.md
 # Penetration Test Instructions
 
-## Credentials
-- Admin: admin@example.com / AdminPass123
-- User: user@example.com / UserPass123
-
 ## Focus Areas
 1. IDOR in user profile endpoints
 2. Privilege escalation between roles
@@ -69,5 +86,5 @@ strix --target https://api.example.com \
 ```
 
 <Tip>
-  Be specific. Good instructions help Strix prioritize the most valuable attack paths.
+  Be specific. Good instructions help Strix prioritize the most valuable attack paths. Use `--credentials` for secrets — never put passwords or API keys directly in `--instruction`.
 </Tip>

pyproject.toml

Lines changed: 8 additions & 0 deletions
@@ -49,6 +49,7 @@ strix = "strix.interface.main:main"
 
 [dependency-groups]
 dev = [
+    "pytest>=8.0.0",
     "mypy>=1.16.0",
     "ruff>=0.11.13",
     "pyright>=1.1.401",
@@ -321,3 +322,10 @@ known_third_party = ["pydantic", "litellm"]
 exclude_dirs = ["docs", "build", "dist"]
 skips = ["B101", "B601", "B404", "B603", "B607"]  # Skip assert, shell injection, subprocess import and partial path checks
 severity = "medium"
+
+# ============================================================================
+# Pytest Configuration
+# ============================================================================
+
+[tool.pytest.ini_options]
+testpaths = ["tests"]

strix/agents/factory.py

Lines changed: 37 additions & 2 deletions
@@ -2,6 +2,7 @@
 
 from __future__ import annotations
 
+import dataclasses
 import inspect
 import json
 import logging
@@ -24,6 +25,7 @@
     view_agent_graph,
     wait_for_message,
 )
+from strix.tools.credentials.tool import scrub_credentials, substitute_credentials
 from strix.tools.finish.tool import finish_scan
 from strix.tools.load_skill.tool import load_skill
 from strix.tools.notes.tools import (
@@ -162,9 +164,11 @@ async def approve(ctx: Any, args: dict[str, Any], call_id: str) -> bool:
 def _configure_chat_completions_filesystem_tools(toolset: Any) -> None:
     for name, tool in vars(toolset).items():
         if isinstance(tool, CustomTool):
-            setattr(toolset, name, _custom_tool_as_function_tool(tool))
+            ft = _custom_tool_as_function_tool(tool)
+            setattr(toolset, name, _wrap_credential_substitution(ft))
         elif isinstance(tool, FunctionTool):
-            setattr(toolset, name, _function_tool_with_error_result(tool))
+            wrapped = _function_tool_with_error_result(tool)
+            setattr(toolset, name, _wrap_credential_substitution(wrapped))
 
 
 _CHARS_ESCAPE_RE = re.compile(r"\\(?:u[0-9a-fA-F]{4}|x[0-9a-fA-F]{2}|[0abtnvfr\\])")
@@ -245,6 +249,33 @@ async def invoke(ctx: Any, raw_input: str) -> Any:
     return tool
 
 
+def _wrap_credential_substitution(tool: FunctionTool) -> FunctionTool:
+    """Wrap a FunctionTool so credentials are substituted in inputs and scrubbed from outputs.
+
+    Plain ``FunctionTool`` instances (module-level singletons in ``_BASE_TOOLS``) are copied
+    via ``dataclasses.replace`` so the originals are not mutated. Subclasses such as
+    ``ViewImageTool`` and ``ExecCommandTool`` override ``__init__`` and cannot be recreated
+    that way, so they are mutated in-place — those instances are always freshly created per
+    agent build and are never shared singletons.
+    """
+    invoke_tool = tool.on_invoke_tool
+
+    async def invoke(ctx: Any, raw_input: str) -> Any:
+        credentials: dict[str, str] = (
+            ctx.context.get("credentials") or {} if isinstance(ctx.context, dict) else {}
+        )
+        substituted = substitute_credentials(raw_input, credentials)
+        result = await invoke_tool(ctx, substituted)
+        if credentials and isinstance(result, str):
+            result = scrub_credentials(result, credentials)
+        return result
+
+    if type(tool) is FunctionTool:
+        return dataclasses.replace(tool, on_invoke_tool=invoke)
+    tool.on_invoke_tool = invoke
+    return tool
+
+
 def _configure_shell_tools(toolset: Any, *, chat_completions: bool) -> None:
     for name, tool in vars(toolset).items():
         if not isinstance(tool, FunctionTool):
@@ -256,6 +287,9 @@ def _configure_shell_tools(toolset: Any, *, chat_completions: bool) -> None:
             wrapped = _wrap_write_stdin(wrapped)
         if chat_completions:
             wrapped = _function_tool_with_error_result(wrapped)
+        wrapped = _wrap_credential_substitution(
+            wrapped
+        )  # outermost: runs first on input, last on output
         setattr(toolset, name, wrapped)
 
 
@@ -379,6 +413,7 @@ def build_strix_agent(
         tools: list[Tool] = [*_BASE_TOOLS, finish_scan]
     else:
         tools = [*_BASE_TOOLS, agent_finish]
+    tools = [_wrap_credential_substitution(t) if isinstance(t, FunctionTool) else t for t in tools]
 
     logger.info(
         "Built %s agent '%s' (skills=%d, tools=%d, scan_mode=%s, whitebox=%s)",
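
The wrapping pattern in `_wrap_credential_substitution` can be exercised in isolation with a stand-in tool. The sketch below is a simplified illustration, not the project's code: `StubTool` is a hypothetical dataclass standing in for the SDK's `FunctionTool`, the context is a plain dict rather than a run-context object, and substitution and scrubbing are inlined as simple string replacement.

```python
import asyncio
import dataclasses
from typing import Any, Awaitable, Callable


@dataclasses.dataclass
class StubTool:
    """Hypothetical stand-in for a FunctionTool-like dataclass."""

    name: str
    on_invoke_tool: Callable[[dict[str, Any], str], Awaitable[Any]]


def wrap_with_credentials(tool: StubTool) -> StubTool:
    inner = tool.on_invoke_tool

    async def invoke(ctx: dict[str, Any], raw_input: str) -> Any:
        creds: dict[str, str] = ctx.get("credentials") or {}
        # Input side: placeholders become real values before the tool runs.
        for name, value in creds.items():
            raw_input = raw_input.replace("{{" + name + "}}", value)
        result = await inner(ctx, raw_input)
        # Output side: real values become markers before the caller sees them.
        if isinstance(result, str):
            for name, value in creds.items():
                if value:
                    result = result.replace(value, f"[CREDENTIAL:{name}]")
        return result

    # Copy rather than mutate, so a shared module-level tool stays untouched.
    return dataclasses.replace(tool, on_invoke_tool=invoke)


seen_by_tool: list[str] = []


async def echo(ctx: dict[str, Any], raw_input: str) -> str:
    seen_by_tool.append(raw_input)  # record what the tool actually received
    return f"ran: {raw_input}"


original = StubTool(name="echo", on_invoke_tool=echo)
wrapped = wrap_with_credentials(original)
context = {"credentials": {"API_KEY": "abc123"}}
result = asyncio.run(wrapped.on_invoke_tool(context, "curl -H 'X-Api-Key: {{API_KEY}}'"))
```

Here the tool sees the real key while the returned string carries only the marker, and `original` still points at the unwrapped `echo`, which is the property the diff relies on for the shared `_BASE_TOOLS` singletons.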

strix/agents/prompts/system_prompt.jinja

Lines changed: 8 additions & 0 deletions
@@ -64,6 +64,14 @@ AUTHORIZED TARGETS:
 - {{ target.type }}: {{ target.value }}{% if target.workspace_path %} (workspace: {{ target.workspace_path }}){% endif %}
 {% endfor %}
 {% endif %}
+{% if system_prompt_context and system_prompt_context.credential_names %}
+
+CREDENTIALS AVAILABLE:
+{% for name in system_prompt_context.credential_names %}
+- {{ name }}
+{% endfor %}
+To use a credential, write {% raw %}{{NAME}}{% endraw %} as a placeholder directly in any tool input (e.g. {% raw %}`curl -u {{USERNAME}}:{{PASSWORD}} http://target`{% endraw %}). The real value is substituted before the tool executes — you never see or handle the actual secret. Use the exact name listed above, case-sensitive.
+{% endif %}
 
 AUTHORIZATION STATUS:
 - You have FULL AUTHORIZATION for authorized security validation on in-scope targets to help secure the target systems/app

strix/core/inputs.py

Lines changed: 10 additions & 1 deletion
@@ -98,21 +98,30 @@ def build_scope_context(scan_config: dict[str, Any]) -> dict[str, Any]:
             {"type": ttype, "value": value, "workspace_path": workspace_path},
         )
 
+    credentials: dict[str, str] = scan_config.get("credentials") or {}
+
     return {
         "scope_source": "system_scan_config",
         "authorization_source": "strix_platform_verified_targets",
         "authorized_targets": authorized,
         "user_instructions_do_not_expand_scope": True,
+        "credential_names": sorted(credentials.keys()),
     }
 
 
 def make_model_settings(
     reasoning_effort: ReasoningEffort | None,
     *,
     model_name: str,
+    via_proxy: bool = False,
 ) -> ModelSettings:
+    # Sending parallel_tool_calls=False through a LiteLLM proxy causes some proxy
+    # versions to emit tool_choice: {"disable_parallel_tool_use": true} without the
+    # required "type" field, which Bedrock's Anthropic Messages API rejects.
+    # Skip it in proxy mode; the models default to sequential tool calls anyway.
+    parallel_tool_calls: bool | None = None if via_proxy else False
     model_settings = ModelSettings(
-        parallel_tool_calls=False,
+        parallel_tool_calls=parallel_tool_calls,
         retry=DEFAULT_MODEL_RETRY,
         include_usage=True,
     )
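
The two changes in this file can be illustrated with a small standalone sketch: only credential names reach the prompt context, and the parallel-tool-calls setting is skipped in proxy mode. The function names below are hypothetical, and `render_credentials_block` is a plain-Python stand-in for the Jinja block in the system prompt, not the template itself.

```python
from typing import Any, Optional


def credential_names_for_prompt(scan_config: dict[str, Any]) -> list[str]:
    """Return only the sorted credential names; values are never copied out."""
    credentials: dict[str, str] = scan_config.get("credentials") or {}
    return sorted(credentials.keys())


def render_credentials_block(names: list[str]) -> str:
    """Hypothetical stand-in for the CREDENTIALS AVAILABLE block of the prompt."""
    if not names:
        return ""
    return "\n".join(["CREDENTIALS AVAILABLE:", *[f"- {name}" for name in names]])


def parallel_tool_calls_setting(via_proxy: bool) -> Optional[bool]:
    """Mirror the diff: None (setting skipped) via proxy, False otherwise."""
    return None if via_proxy else False


scan_config = {"credentials": {"PASSWORD": "s3cret", "API_KEY": "abc123"}}
prompt_block = render_credentials_block(credential_names_for_prompt(scan_config))
```

Sorting the names keeps the prompt text stable across runs regardless of the order in which credentials were supplied.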

strix/core/runner.py

Lines changed: 2 additions & 0 deletions
@@ -156,6 +156,7 @@ async def run_strix_scan(
     model_settings = make_model_settings(
         settings.llm.reasoning_effort,
         model_name=resolved_model,
+        via_proxy=bool(settings.llm.api_base),
     )
     run_config = RunConfig(
         model=resolved_model,
@@ -218,6 +219,7 @@ async def spawn_child_agent(**kwargs: Any) -> dict[str, Any]:
         "parent_id": None,
         "interactive": interactive,
         "spawn_child_agent": spawn_child_agent,
+        "credentials": scan_config.get("credentials") or {},
     }
 
     root_session = open_agent_session(root_id, agents_db)

strix/interface/cli.py

Lines changed: 1 addition & 0 deletions
@@ -94,6 +94,7 @@ async def run_cli(args: Any) -> None:  # noqa: PLR0915
         "scope_mode": getattr(args, "scope_mode", "auto"),
         "diff_base": getattr(args, "diff_base", None),
         "resume_instruction": getattr(args, "user_explicit_instruction", None) or "",
+        "credentials": getattr(args, "credentials", {}) or {},
     }
 
     report_state = ReportState(args.run_name)
