Commit d6b8ef3

Tighten lead shell allowlist guidance
1 parent 0fdd731 commit d6b8ef3

6 files changed

Lines changed: 117 additions & 0 deletions

opencode/agents/lead.md

Lines changed: 8 additions & 0 deletions
@@ -55,6 +55,14 @@ When the user did not invoke an explicit command such as `/feature`, `/scope`, `
 
 Use lightweight inspection if you need to check the repo or context. Decide quickly. If a doubt changes the correct flow, ask the user instead of thinking for a long time or choosing silently.
 
+For lightweight shell inspection, respect the exact allowlist boundary:
+
+- if the user already named allowlisted shell primitives, reuse those exact forms before inventing nearby variants;
+- avoid adjacent substitutes such as `pwd` when the user already named an allowlisted option such as `cd .` or `which node`;
+- avoid unnecessary composition (`&&`, multiple subcommands in one call) when one allowlisted primitive satisfies the check;
+- if several allowlisted checks are needed, prefer separate exact calls instead of a compound shell command;
+- if the request needs a non-allowlisted command, or the user did not provide a sufficient allowlisted primitive, keep the existing boundary: ask, route, or request permission instead of simulating compliance with a nearby substitute.
+
 Routing decision:
 
 - `developer`: small, clear, localized, verifiable change.
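
The boundary described in the added guidance can be pictured with a small sketch. It assumes a simplified model in which the allowlist is a set of exact command strings (`cd .` and `which node`, the two forms this commit reports as passing) and everything else falls through to `ask`. The names `ALLOWLIST`, `COMPOSITION_TOKENS`, and `classify` are hypothetical, and the real permission layer may match commands differently.

```python
# Hypothetical model of the boundary; not the tool layer's actual matcher.
ALLOWLIST = frozenset({"cd .", "which node"})  # exact forms reported as passing

# Tokens that turn a single call into a composed shell command.
COMPOSITION_TOKENS = ("&&", "||", ";", "|")


def classify(command: str) -> str:
    """Return "allow" for an exact allowlisted primitive, otherwise "ask"."""
    candidate = command.strip()
    if any(token in candidate for token in COMPOSITION_TOKENS):
        return "ask"  # a composed form never counts as one exact primitive
    return "allow" if candidate in ALLOWLIST else "ask"


if __name__ == "__main__":
    for cmd in ("which node", "cd .", "pwd", "cd . && which node"):
        print(f"{cmd!r}: {classify(cmd)}")
```

In this model `pwd` and `cd . && which node` both land on `ask`, which is the outcome the guidance asks `lead` to anticipate rather than trigger.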
Lines changed: 25 additions & 0 deletions
@@ -0,0 +1,25 @@
+# Analysis: iteration-012-lead-shell-allowlist-boundaries
+
+The remaining issue after iteration 011 was not a missing shell permission.
+Exact allowlisted commands worked, and unallowlisted commands still asked for
+permission as designed.
+
+The failure pattern was agent-level command selection drift:
+
+- exact `which node` passed;
+- exact `cd .` passed;
+- exact `pwd` stayed blocked by `bash "*": ask`;
+- a natural inspection prompt that already named `cd` and `which node` drifted
+to a nearby composed form instead of using the named allowlisted primitives.
+
+The fix belongs in `lead` guidance rather than the tool layer:
+
+- prefer exact allowlisted primitives already named by the user;
+- avoid nearby substitutes such as `pwd` when an allowlisted primitive already
+satisfies the check;
+- avoid compound shell calls when separate exact calls preserve the boundary;
+- keep `bash "*": ask` unchanged.
+
+The post-change replay confirmed the targeted behavior: the natural inspection
+case used separate allowlisted calls and no permission request, while `pwd`
+remained outside the allowlist.
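
The drift pattern and its fix can be sketched as follows, under stated assumptions: the allowlist is modelled as the two exact strings named in the analysis, and `split_compound` and `plan_calls` are hypothetical helper names. The regex split ignores quoting and subshells, so it illustrates the idea only and is not a shell parser.

```python
import re

ALLOWLIST = frozenset({"cd .", "which node"})  # assumed exact allowlisted forms

# Naive separator match; ignores quoting and subshells (illustration only).
_SEPARATORS = re.compile(r"\s*(?:&&|\|\||;)\s*")


def split_compound(command: str) -> list[str]:
    """Split a composed command into its individual parts."""
    return [part for part in _SEPARATORS.split(command.strip()) if part]


def plan_calls(command: str) -> tuple[list[str], list[str]]:
    """Return (parts usable as separate exact calls, parts needing permission)."""
    parts = split_compound(command)
    allowed = [part for part in parts if part in ALLOWLIST]
    blocked = [part for part in parts if part not in ALLOWLIST]
    return allowed, blocked


if __name__ == "__main__":
    print(plan_calls("cd . && which node"))
    print(plan_calls("pwd && which node"))
```

Separate calls may not share shell state. With `cd .` nothing is lost, but a real directory change might not carry over between calls, which is one reason the guidance limits itself to lightweight inspection.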
Lines changed: 28 additions & 0 deletions
@@ -0,0 +1,28 @@
+{
+  "iteration": 12,
+  "evaluates_iteration": 12,
+  "results": [
+    {
+      "change_id": "chg-1-lead-allowlisted-shell-preference",
+      "predicted_fixes_confirmed": [
+        "lead-reuses-user-named-allowlisted-shell-primitives",
+        "lead-avoids-unneeded-compound-shell-inspection",
+        "unknown-bash-commands-still-ask"
+      ],
+      "predicted_fixes_not_confirmed": [],
+      "risk_tasks_regressed": [],
+      "risk_tasks_not_regressed": [
+        "do-not-broaden-bash-allowlist",
+        "preserve-lead-edit-deny"
+      ],
+      "unpredicted_regressions": [],
+      "decision": "keep",
+      "evidence": [
+        "docs/ai/evolution/runs/iteration-012-lead-shell-allowlist-boundaries/evaluation.md",
+        "docs/ai/evolution/runs/iteration-012-lead-shell-allowlist-boundaries/analysis/overview.md",
+        "docs/ai/evolution/runs/iteration-011-lead-cd-which-permissions/change_evaluation.json"
+      ],
+      "notes": "The public record keeps the behavior summary and omits raw transcripts with machine-local paths. The change remains scoped to lead guidance; permissions and provider configuration are unchanged."
+    }
+  ]
+}
Lines changed: 31 additions & 0 deletions
@@ -0,0 +1,31 @@
+{
+  "iteration": 12,
+  "changes": [
+    {
+      "id": "chg-1-lead-allowlisted-shell-preference",
+      "type": "improvement",
+      "description": "Tighten the lead agent contract so natural lightweight inspection prefers exact allowlisted shell primitives already named by the user and avoids adjacent substitutes or compound shell composition when one allowlisted command is sufficient.",
+      "files": [
+        "agents/lead.md",
+        "docs/ai/harness/agents.md"
+      ],
+      "failure_pattern": "Natural inspection prompts that fit the narrow shell allowlist can still drift to adjacent or compound shell forms, causing avoidable permission requests.",
+      "evidence": [
+        "docs/ai/evolution/runs/iteration-012-lead-shell-allowlist-boundaries/evaluation.md",
+        "docs/ai/evolution/runs/iteration-012-lead-shell-allowlist-boundaries/analysis/overview.md"
+      ],
+      "predicted_fixes": [
+        "lead-reuses-user-named-allowlisted-shell-primitives",
+        "lead-avoids-unneeded-compound-shell-inspection",
+        "unknown-bash-commands-still-ask"
+      ],
+      "risk_tasks": [
+        "do-not-broaden-bash-allowlist",
+        "preserve-lead-edit-deny",
+        "avoid-benchmark-specific-wording"
+      ],
+      "constraint_level": "agent",
+      "why_this_component": "The tool layer already allows exact cd and which commands and still blocks unknown bash commands. The observed failure happens earlier, in lead command selection."
+    }
+  ]
+}
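
The change record and the evaluation record shown above can be cross-checked mechanically. The sketch below is one possible check, not part of the commit: it copies only the relevant fields from the two JSON documents into dict literals, and `unaccounted` is a hypothetical helper name.

```python
# Relevant fields copied from the two JSON records in this commit.
change = {
    "id": "chg-1-lead-allowlisted-shell-preference",
    "predicted_fixes": [
        "lead-reuses-user-named-allowlisted-shell-primitives",
        "lead-avoids-unneeded-compound-shell-inspection",
        "unknown-bash-commands-still-ask",
    ],
    "risk_tasks": [
        "do-not-broaden-bash-allowlist",
        "preserve-lead-edit-deny",
        "avoid-benchmark-specific-wording",
    ],
}
result = {
    "change_id": "chg-1-lead-allowlisted-shell-preference",
    "predicted_fixes_confirmed": [
        "lead-reuses-user-named-allowlisted-shell-primitives",
        "lead-avoids-unneeded-compound-shell-inspection",
        "unknown-bash-commands-still-ask",
    ],
    "predicted_fixes_not_confirmed": [],
    "risk_tasks_regressed": [],
    "risk_tasks_not_regressed": [
        "do-not-broaden-bash-allowlist",
        "preserve-lead-edit-deny",
    ],
}


def unaccounted(declared, *reported):
    """Return declared items that appear in none of the reported lists."""
    seen = set().union(*reported)
    return sorted(set(declared) - seen)


fixes_missing = unaccounted(
    change["predicted_fixes"],
    result["predicted_fixes_confirmed"],
    result["predicted_fixes_not_confirmed"],
)
risks_missing = unaccounted(
    change["risk_tasks"],
    result["risk_tasks_regressed"],
    result["risk_tasks_not_regressed"],
)

if __name__ == "__main__":
    print("fixes without a verdict:", fixes_missing)
    print("risk tasks without a verdict:", risks_missing)
```

With the values copied above, every predicted fix has a verdict, while the risk task `avoid-benchmark-specific-wording` is declared in the change record but appears in neither risk list of the evaluation. The commit does not say whether that omission is intentional.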
Lines changed: 22 additions & 0 deletions
@@ -0,0 +1,22 @@
+# Evaluation: iteration-012-lead-shell-allowlist-boundaries
+
+## Objective
+
+Reduce approval drift after iteration 011 by guiding `lead` to reuse exact
+allowlisted shell primitives during lightweight repo inspection.
+
+## Evidence
+
+- `transcript_replay`: exact `which node` and `cd .` runs succeeded without a
+permission request.
+- `transcript_replay`: exact `pwd` still requested permission, preserving the
+fallback boundary.
+- `transcript_replay`: a natural inspection prompt that named `cd` and
+`which node` originally drifted to a nearby compound command form, causing an
+avoidable permission request.
+
+## Result
+
+Keep the tool allowlist unchanged and tighten the `lead` contract at the agent
+level. The intended behavior is selection discipline, not broader shell
+authority.

opencode/docs/ai/harness/agents.md

Lines changed: 3 additions & 0 deletions
@@ -21,6 +21,9 @@
 - `lead` must not force the full flow for small free-form messages.
 - `lead` does not edit files; if implementation or correction requires repo
 changes, delegate to `developer`.
+- During lightweight shell inspection, `lead` should prefer exact allowlisted
+primitives already named by the user, without drifting to nearby substitutes
+or compound shell commands when a single call is enough.
 - `developer` executes direct mode when `lead` delegates a small, clear, verifiable task.
 - Once `developer` receives an implementation task, later adjustments for that
 same free-form request go back to `developer`; `lead` only consolidates,
