
Commit 356b58c

fix(scan): disable user-scope walk when CLI scans a single file
PR #53 closed the cross-scan leak for skill-dir scans but not for
single-file scans of configs like ~/.claude/settings.json.

Symptom:
  $ codegate-ai scan ~/.claude/settings.json
  → finding with file_path=~/.agents/skills/api-design-guide/.../SKILL.md

Root cause: the CLI stages single-file targets into a temp dir outside
$HOME. The staged dir is not inside homeDir, so
shouldKeepUserScopeCandidate short-circuits to `return true` and every
sibling user-scope match (e.g. a hidden-unicode hit in a completely
unrelated skill) gets attributed to the config scan.

Fix:
- cli.ts: when resolvedTarget.explicitCandidates is non-empty (the target
  was a staged local file), force scan_user_scope=false for that scan.
  Explicit opt-in via --include-user-scope still overrides. This matches
  user expectation: "scan this file" ≠ "scan my whole home."
- scan.ts: shouldKeepUserScopeCandidate now also handles engine-level file
  targets correctly (if the target is a file inside homeDir, only the
  target file itself is a valid user-scope candidate). This is defence in
  depth for library callers that bypass the CLI.

Tests:
- Existing 3 cases in tests/layer2/cross-scan-attribution.test.ts still
  pass.
- New: engine-level file-target scan drops sibling user-scope candidates.

Verified 154 test files / 720 tests pass. Lint + prettier clean.
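
The root cause described above can be reproduced with plain path arithmetic. The following is a minimal sketch, not the project's code: the body of `isPathInside` is an assumption (the diff shows only its signature), the pre-fix filter is transcribed from the removed side of the scan.ts hunk, and the home directory, staged directory, and skill name are hypothetical.

```typescript
import { posix } from "node:path";

// ASSUMPTION: one common way to write isPathInside; the real body is not
// shown in this commit. POSIX paths are used so the result is deterministic.
function isPathInside(root: string, candidatePath: string): boolean {
  const rel = posix.relative(root, candidatePath);
  if (rel === "") return true; // identical paths count as inside
  return rel !== ".." && !rel.startsWith("../") && !posix.isAbsolute(rel);
}

// Pre-fix filter, as it appears on the removed side of the scan.ts hunk.
function shouldKeepUserScopeCandidateBefore(
  scanTarget: string,
  homeDir: string,
  candidatePath: string,
): boolean {
  if (isPathInside(homeDir, scanTarget)) {
    return isPathInside(scanTarget, candidatePath);
  }
  return true;
}

// Hypothetical paths for illustration only.
const homeDir = "/home/alice";
const stagedTarget = "/tmp/codegate-stage-abc123"; // staged copy, outside $HOME
const siblingSkill = "/home/alice/.agents/skills/bar/SKILL.md";

// The staged dir is not inside homeDir, so the filter never looks at the
// candidate and the unrelated skill is attributed to the config scan.
const leaked = shouldKeepUserScopeCandidateBefore(stagedTarget, homeDir, siblingSkill);
console.log(leaked);
```

Under these assumptions `leaked` is `true`, which is why the CLI-side change turns off the user-scope walk for staged single-file targets rather than relying on the filter alone.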
1 parent 98cd395 commit 356b58c

3 files changed

Lines changed: 92 additions & 8 deletions


src/cli.ts

Lines changed: 12 additions & 1 deletion
@@ -493,8 +493,19 @@ function addScanCommand(program: Command, version: string, deps: CliDeps): void
             ? true
             : (baseConfig.workflow_audits?.enabled ?? false),
       },
+      // When the target was a single local file that got staged into a
+      // temp dir (explicitCandidates set), walking the full user-scope
+      // tree is off-target: the user asked to scan one file, not their
+      // whole home. Leaving user-scope on here let sibling findings
+      // (e.g. `~/.agents/skills/*/SKILL.md`) leak into single-file
+      // scans of configs like `.claude/settings.json`. Explicit opt-in
+      // via `--include-user-scope` still forces it on.
       scan_user_scope:
-        options.includeUserScope === true ? true : (baseConfig.scan_user_scope ?? false),
+        options.includeUserScope === true
+          ? true
+          : resolvedTarget.explicitCandidates && resolvedTarget.explicitCandidates.length > 0
+            ? false
+            : (baseConfig.scan_user_scope ?? false),
     };
 
     if (options.resetState) {
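
The nested ternary above encodes a three-level precedence: the explicit flag wins, a staged single-file target comes next, and the base config is the fallback. Below is a minimal sketch of the same rule as a standalone function. The function name `resolveScanUserScope` and the three interfaces are hypothetical, trimmed-down stand-ins for the CLI's real types, which this diff does not show.

```typescript
// Hypothetical, minimal shapes; the real CLI types are not part of this diff.
interface ScanOptions {
  includeUserScope?: boolean;
}
interface ResolvedTarget {
  explicitCandidates?: string[];
}
interface BaseConfig {
  scan_user_scope?: boolean;
}

// Mirrors the precedence of the ternary in the hunk above.
function resolveScanUserScope(
  options: ScanOptions,
  resolvedTarget: ResolvedTarget,
  baseConfig: BaseConfig,
): boolean {
  // 1. An explicit --include-user-scope always forces the walk on.
  if (options.includeUserScope === true) return true;
  // 2. A staged single-file target (non-empty explicitCandidates) forces it off.
  if ((resolvedTarget.explicitCandidates?.length ?? 0) > 0) return false;
  // 3. Otherwise the configured value applies, defaulting to off.
  return baseConfig.scan_user_scope ?? false;
}

// Single-file scan with user scope enabled in config: the walk is skipped.
console.log(resolveScanUserScope({}, { explicitCandidates: ["settings.json"] }, { scan_user_scope: true }));
// Same scan with the explicit flag: the walk is forced back on.
console.log(
  resolveScanUserScope({ includeUserScope: true }, { explicitCandidates: ["settings.json"] }, {}),
);
```

An empty `explicitCandidates` array behaves the same as an absent one, matching the `length > 0` guard in the diff.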

src/scan.ts

Lines changed: 38 additions & 7 deletions
@@ -260,22 +260,53 @@ function isPathInside(root: string, candidatePath: string): boolean {
  * User-scope patterns (e.g. `~/.agents/skills/*/SKILL.md`) walk the whole
  * home directory, so they can match files belonging to completely unrelated
  * skills or agents. When the scan target is itself a specific location
- * **inside** the user's home — e.g. scanning a single skill directory — any
- * user-scope match outside that scan target belongs to a different scan and
- * must not be attributed here.
+ * **inside** the user's home — e.g. scanning a single skill directory or a
+ * single config file like `~/.claude/settings.json` — any user-scope match
+ * that does not belong to that target is a cross-scan leak and must be
+ * dropped.
  *
- * When the scan target lives outside the home directory (for example a
- * project root in a workspace), user-scope matches are accepted as legitimate
- * host-wide context for that scan.
+ * Three cases:
+ * - `scanTarget` is a directory inside `homeDir`: only keep candidates inside
+ *   that directory (existing PR #53 behavior).
+ * - `scanTarget` is a file inside `homeDir`: only keep candidates that resolve
+ *   to that exact file. "Inside" semantics do not apply to files, so the
+ *   previous check let every sibling through.
+ * - `scanTarget` lives outside the home directory (or cannot be stat'd,
+ *   e.g. a URL or a staged path that has been cleaned up): user-scope
+ *   matches are accepted as legitimate host-wide context.
  */
 function shouldKeepUserScopeCandidate(
   scanTarget: string,
   homeDir: string,
   candidatePath: string,
 ): boolean {
-  if (isPathInside(homeDir, scanTarget)) {
+  if (!isPathInside(homeDir, scanTarget)) {
+    return true;
+  }
+
+  // Follow symlinks the same way the rest of the scan code does (walker,
+  // wildcard-base check, `isRegularFile`): `statSync` resolves them. If the
+  // target cannot be stat'd (missing / permission denied / URL that was never
+  // a local path), fall through to the pre-PR-#53 outside-home behavior so
+  // we do not over-filter project-scope scans on unusual inputs.
+  let targetStat;
+  try {
+    targetStat = statSync(scanTarget);
+  } catch {
+    return true;
+  }
+
+  if (targetStat.isFile()) {
+    // Nothing is "inside" a file. The only user-scope candidate that can
+    // legitimately belong to a file-target scan is the file itself.
+    return resolve(candidatePath) === resolve(scanTarget);
+  }
+
+  if (targetStat.isDirectory()) {
     return isPathInside(scanTarget, candidatePath);
   }
+
+  // Sockets, devices, etc. — behave like the outside-home case.
   return true;
 }
 
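
To see the three cases from the doc comment side by side, here is a self-contained sketch that runs the post-fix filter against a throwaway directory tree. The body of `isPathInside` is an assumption (only its signature appears in the hunk header), the filter is condensed from the added side of the diff, and the temp-dir prefix and fixture file contents are made up for illustration.

```typescript
import { Stats, mkdirSync, mkdtempSync, statSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { isAbsolute, join, relative, resolve, sep } from "node:path";

// ASSUMPTION: the project's isPathInside body is not shown in this commit;
// this is one common way to write such a check.
function isPathInside(root: string, candidatePath: string): boolean {
  const rel = relative(resolve(root), resolve(candidatePath));
  if (rel === "") return true; // identical paths count as inside
  return rel !== ".." && !rel.startsWith(".." + sep) && !isAbsolute(rel);
}

// Post-fix filter, condensed from the added side of the diff.
function shouldKeepUserScopeCandidate(
  scanTarget: string,
  homeDir: string,
  candidatePath: string,
): boolean {
  if (!isPathInside(homeDir, scanTarget)) return true; // target outside home
  let targetStat: Stats;
  try {
    targetStat = statSync(scanTarget); // follows symlinks
  } catch {
    return true; // target cannot be stat'd: treat like the outside-home case
  }
  if (targetStat.isFile()) return resolve(candidatePath) === resolve(scanTarget);
  if (targetStat.isDirectory()) return isPathInside(scanTarget, candidatePath);
  return true; // sockets, devices, etc.
}

// Throwaway fake home with one config file and one unrelated skill.
const home = mkdtempSync(join(tmpdir(), "user-scope-sketch-"));
const settingsFile = join(home, ".claude", "settings.json");
const siblingSkill = join(home, ".agents", "skills", "bar", "SKILL.md");
mkdirSync(join(home, ".claude"), { recursive: true });
mkdirSync(join(home, ".agents", "skills", "bar"), { recursive: true });
writeFileSync(settingsFile, "{}\n", "utf8");
writeFileSync(siblingSkill, "sibling\n", "utf8");

// File target inside home: the sibling is dropped, the file itself is kept.
console.log(shouldKeepUserScopeCandidate(settingsFile, home, siblingSkill)); // false
console.log(shouldKeepUserScopeCandidate(settingsFile, home, settingsFile)); // true
// Target outside home: user-scope candidates are accepted as host-wide context.
console.log(shouldKeepUserScopeCandidate(tmpdir(), home, siblingSkill)); // true
```

The file comparison uses `resolve` on both sides, so it normalizes `.` and `..` segments but does not follow symlinks; a candidate reached through a symlinked path would compare unequal to the target in this sketch.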

tests/layer2/cross-scan-attribution.test.ts

Lines changed: 42 additions & 0 deletions
@@ -155,4 +155,46 @@ describe("cross-scan attribution — Layer 2 hidden-unicode rule", () => {
       ),
     ).toBe(true);
   });
+
+  it("engine-level: file-target scan drops user-scope siblings", async () => {
+    // Covers the case where runScanEngine is called directly with a file
+    // target inside homeDir (library/embedded callers). The scope filter
+    // in `shouldKeepUserScopeCandidate` rejects every candidate that is
+    // not the target file itself.
+    //
+    // NB: the CLI stages file targets into a temp dir before calling the
+    // engine — see the "CLI-level" test below for that path.
+    const home = mkdtempSync(join(tmpdir(), "codegate-cross-scan-file-home-"));
+
+    // Sibling: a skill with hidden Unicode under home.
+    mkdirSync(join(home, ".agents", "skills", "bar"), { recursive: true });
+    writeFileSync(
+      join(home, ".agents", "skills", "bar", "SKILL.md"),
+      "Sibling skill​ with hidden zero-width space.\n",
+      "utf8",
+    );
+
+    // Target dir wrapping a single config file — simulates a consumer that
+    // has already placed the file in a dedicated dir (runScanEngine
+    // rejects bare files, hence the wrapper).
+    const targetDir = join(home, ".claude");
+    mkdirSync(targetDir, { recursive: true });
+    writeFileSync(
+      join(targetDir, "settings.json"),
+      `{\n "permissions": { "allow": [] }\n}\n`,
+      "utf8",
+    );
+
+    const report = await runScanEngine({
+      version: "0.1.0",
+      scanTarget: targetDir,
+      config: BASE_CONFIG,
+      homeDir: home,
+    });
+
+    const leaked = report.findings.filter(
+      (f) => typeof f.file_path === "string" && f.file_path.includes(".agents/skills/bar"),
+    );
+    expect(leaked).toEqual([]);
+  });
 });
