Skip to content

Commit ef6a9f6

Browse files
authored
Refactor host iptables tests and fix follow-up CI regressions (#2901)
* Initial plan * test: split host iptables modules * test: address host iptables review * test: refine host iptables helpers * fix: tighten security guard prompt * fix: guide security guard diff usage * fix: retry api-proxy startup exits * fix: detect generic api-proxy startup failures * fix: inspect api-proxy state after compose failures * fix: restore security guard prompt diff placeholders * fix: suppress markdown lint for security guard placeholder --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
1 parent 3503a1f commit ef6a9f6

11 files changed

Lines changed: 1584 additions & 1401 deletions

.github/workflows/security-guard.lock.yml

Lines changed: 34 additions & 49 deletions
Some generated files are not rendered by default. Learn more about customizing how changed files appear on GitHub.

.github/workflows/security-guard.md

Lines changed: 13 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -19,7 +19,7 @@ tools:
1919
sandbox:
2020
agent:
2121
id: awf
22-
version: v0.25.29
22+
version: v0.25.41
2323
network:
2424
allowed:
2525
- github
@@ -92,7 +92,9 @@ steps:
9292

9393
## Security Relevance Check
9494

95-
**Security-critical files changed in this PR:** ${{ steps.security-relevance.outputs.security_files_changed }}
95+
<!-- markdownlint-disable-next-line MD050 -->
96+
**Security-critical files changed in this PR:** __GH_AW_EXPR_66EB691F__
97+
<!-- gh-aw compile marker: ${{ steps.security-relevance.outputs.security_files_changed }} -->
9698

9799
> If this value is `0`, the workflow skips the agent job.
98100
@@ -135,14 +137,17 @@ Analyze PR #${{ github.event.pull_request.number }} in repository ${{ github.rep
135137

136138
1. **Review the pre-fetched diff below** (up to 100 KB of changes are included)
137139
2. **Batch all independent reads** in a single tool-use block rather than making sequential calls
138-
3. **Use `mcp__github__get_pull_request_diff`** only when the diff below is truncated and you need the remainder
139-
4. **Use `mcp__github__get_file_contents`** only for files not changed in this PR (e.g., to understand adjacent security context)
140-
5. **Collect evidence** with specific file names, line numbers, and code snippets
140+
3. **Use the pre-fetched diff below as your primary source of truth**; only fall back to `gh pr diff` / `gh api` if the diff is truncated and you need the remainder
141+
4. **Do not use local branch comparisons or commit history** (for example `git diff main...HEAD` or `git log main..`) unless you first confirm the base branch exists locally; the checkout may contain only the PR branch, and these calls waste turns
142+
5. **Use direct file reads from the checked-out repository** only for files you need to inspect further (e.g., to understand adjacent security context)
143+
6. **Collect evidence** with specific file names, line numbers, and code snippets
141144

142145
## Security Checks
143146

144147
Check for these security-weakening changes: new/expanded ACCEPT rules, weakened DROP/REJECT, firewall chain rewiring, DNS or IPv6 bypasses, Squid ACL/order regressions, non-80/443 egress allowances, wildcard/domain validation bypasses, capability additions (`SYS_ADMIN`, `NET_RAW`), seccomp relaxations, removal of resource/user hardening, input validation removal, command injection risk, hardcoded secrets, security-disabling env var changes, or risky dependency updates.
145148

149+
If the changed files are only tests, docs, or other non-production artifacts and the diff does not modify runtime security behavior, avoid unnecessary exploration and call the `safeoutputs noop` tool after the brief verification.
150+
146151
## Output Format
147152

148153
**IMPORTANT: Be concise.** Report each security finding in ≤ 150 words. Maximum 5 findings total.
@@ -166,5 +171,6 @@ If no security issues are found:
166171
The following PR diff has been pre-computed. Focus your security analysis on these changes:
167172

168173
```
169-
${{ steps.pr-diff.outputs.PR_FILES }}
170-
```
174+
__GH_AW_EXPR_BAA3A6C6__
175+
<!-- gh-aw compile marker: ${{ steps.pr-diff.outputs.PR_FILES }} -->
176+
```

src/container-lifecycle.ts

Lines changed: 34 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -388,6 +388,37 @@ function isApiProxyStartupFailureError(errorMsg: string): boolean {
388388
return errorMsg.includes('is unhealthy') || errorMsg.includes('exited (1)');
389389
}
390390

391+
/**
392+
* Some docker compose failures surface only as a generic execa error message
393+
* while the actionable api-proxy state is visible only via container inspect.
394+
*/
395+
async function didApiProxyFailStartup(errorMsg: string): Promise<boolean> {
396+
if (isApiProxyStartupFailureError(errorMsg)) {
397+
return true;
398+
}
399+
400+
try {
401+
const result = await execa(
402+
'docker',
403+
['inspect', API_PROXY_CONTAINER_NAME, '--format', '{{.State.Status}}|{{if .State.Health}}{{.State.Health.Status}}{{end}}'],
404+
{
405+
reject: false,
406+
env: getLocalDockerEnv(),
407+
}
408+
);
409+
410+
if (result.exitCode !== 0) {
411+
return false;
412+
}
413+
414+
const [containerStatus = '', healthStatus = ''] = result.stdout.trim().split('|');
415+
return containerStatus === 'exited' || healthStatus === 'unhealthy';
416+
} catch (error) {
417+
logger.debug(`Could not inspect ${API_PROXY_CONTAINER_NAME} after startup failure:`, error);
418+
return false;
419+
}
420+
}
421+
391422
/**
392423
* Dumps the tail of a container's logs to stderr for diagnosis.
393424
* Silently skips if the container does not exist or logs are unavailable.
@@ -463,11 +494,12 @@ export async function startContainers(workDir: string, allowedDomains: string[],
463494
logger.success('Containers started successfully');
464495
} catch (firstError) {
465496
const firstErrorMsg = firstError instanceof Error ? firstError.message : String(firstError);
497+
const firstAttemptApiProxyStartupFailure = await didApiProxyFailStartup(firstErrorMsg);
466498

467499
// When api-proxy specifically fails to start, retry once.
468500
// Transient failures are common on slow or busy runners (e.g. Azure-hosted runners)
469501
// where the Node.js process inside the container takes longer to bind its port.
470-
if (isApiProxyStartupFailureError(firstErrorMsg)) {
502+
if (firstAttemptApiProxyStartupFailure) {
471503
logger.warn(`${API_PROXY_CONTAINER_NAME} failed to start — this may be a transient startup failure, retrying once...`);
472504
await logContainerLogsToStderr(API_PROXY_CONTAINER_NAME);
473505

@@ -491,7 +523,7 @@ export async function startContainers(workDir: string, allowedDomains: string[],
491523
return;
492524
} catch (retryError) {
493525
const retryErrorMsg = retryError instanceof Error ? retryError.message : String(retryError);
494-
if (isApiProxyStartupFailureError(retryErrorMsg)) {
526+
if (await didApiProxyFailStartup(retryErrorMsg)) {
495527
// Surface api-proxy logs and emit a clear, unambiguous error so
496528
// downstream parse steps don't blame the model for never running.
497529
await logContainerLogsToStderr(API_PROXY_CONTAINER_NAME);

src/docker-manager-lifecycle.test.ts

Lines changed: 89 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -137,6 +137,53 @@ describe('docker-manager lifecycle', () => {
137137
expect(upCalls).toHaveLength(2);
138138
});
139139

140+
it('should retry once when awf-api-proxy exits during startup', async () => {
141+
// 1. docker rm (initial cleanup)
142+
mockExecaFn.mockResolvedValueOnce({ stdout: '', stderr: '', exitCode: 0 } as any);
143+
// 2. docker compose up (first attempt - fails with api-proxy exit)
144+
mockExecaFn.mockRejectedValueOnce(new Error('dependency failed to start: container awf-api-proxy exited (1)'));
145+
// 3. docker logs --tail 50 awf-api-proxy (get logs for diagnosis)
146+
mockExecaFn.mockResolvedValueOnce({ stdout: 'api-proxy startup logs', stderr: '', exitCode: 0 } as any);
147+
// 4. docker compose down (cleanup before retry)
148+
mockExecaFn.mockResolvedValueOnce({ stdout: '', stderr: '', exitCode: 0 } as any);
149+
// 5. docker compose up (retry - succeeds)
150+
mockExecaFn.mockResolvedValueOnce({ stdout: '', stderr: '', exitCode: 0 } as any);
151+
152+
await expect(startContainers(testDir, ['github.com'])).resolves.toBeUndefined();
153+
154+
const upCalls = mockExecaFn.mock.calls.filter((call: any[]) =>
155+
call[0] === 'docker' && Array.isArray(call[1]) && call[1].includes('up')
156+
);
157+
expect(upCalls).toHaveLength(2);
158+
});
159+
160+
it('should retry once when docker compose only reports a generic error but awf-api-proxy already exited', async () => {
161+
// 1. docker rm (initial cleanup)
162+
mockExecaFn.mockResolvedValueOnce({ stdout: '', stderr: '', exitCode: 0 } as any);
163+
// 2. docker compose up (first attempt - generic execa error)
164+
mockExecaFn.mockRejectedValueOnce(new Error('Command failed with exit code 1: docker compose up -d'));
165+
// 3. docker inspect awf-api-proxy (fallback diagnosis)
166+
mockExecaFn.mockResolvedValueOnce({ stdout: 'exited', stderr: '', exitCode: 0 } as any);
167+
// 4. docker logs --tail 50 awf-api-proxy (get logs for diagnosis)
168+
mockExecaFn.mockResolvedValueOnce({ stdout: 'api-proxy startup logs', stderr: '', exitCode: 0 } as any);
169+
// 5. docker compose down (cleanup before retry)
170+
mockExecaFn.mockResolvedValueOnce({ stdout: '', stderr: '', exitCode: 0 } as any);
171+
// 6. docker compose up (retry - succeeds)
172+
mockExecaFn.mockResolvedValueOnce({ stdout: '', stderr: '', exitCode: 0 } as any);
173+
174+
await expect(startContainers(testDir, ['github.com'])).resolves.toBeUndefined();
175+
176+
const inspectCalls = mockExecaFn.mock.calls.filter((call: any[]) =>
177+
call[0] === 'docker' && Array.isArray(call[1]) && call[1][0] === 'inspect' && call[1][1] === 'awf-api-proxy'
178+
);
179+
expect(inspectCalls).toHaveLength(1);
180+
181+
const upCalls = mockExecaFn.mock.calls.filter((call: any[]) =>
182+
call[0] === 'docker' && Array.isArray(call[1]) && call[1].includes('up')
183+
);
184+
expect(upCalls).toHaveLength(2);
185+
});
186+
140187
it('should throw clear error when awf-api-proxy fails its health check on both attempts', async () => {
141188
// 1. docker rm (initial cleanup)
142189
mockExecaFn.mockResolvedValueOnce({ stdout: '', stderr: '', exitCode: 0 } as any);
@@ -156,6 +203,48 @@ describe('docker-manager lifecycle', () => {
156203
);
157204
});
158205

206+
it('should throw clear error when awf-api-proxy exits during startup on both attempts', async () => {
207+
// 1. docker rm (initial cleanup)
208+
mockExecaFn.mockResolvedValueOnce({ stdout: '', stderr: '', exitCode: 0 } as any);
209+
// 2. docker compose up (first attempt - fails)
210+
mockExecaFn.mockRejectedValueOnce(new Error('dependency failed to start: container awf-api-proxy exited (1)'));
211+
// 3. docker logs (first diagnosis)
212+
mockExecaFn.mockResolvedValueOnce({ stdout: 'api-proxy logs', stderr: '', exitCode: 0 } as any);
213+
// 4. docker compose down (cleanup before retry)
214+
mockExecaFn.mockResolvedValueOnce({ stdout: '', stderr: '', exitCode: 0 } as any);
215+
// 5. docker compose up (retry - also fails)
216+
mockExecaFn.mockRejectedValueOnce(new Error('dependency failed to start: container awf-api-proxy exited (1)'));
217+
// 6. docker logs (second diagnosis)
218+
mockExecaFn.mockResolvedValueOnce({ stdout: 'api-proxy logs', stderr: '', exitCode: 0 } as any);
219+
220+
await expect(startContainers(testDir, ['github.com'])).rejects.toThrow(
221+
'AWF firewall failed to start: awf-api-proxy failed to start on both attempts'
222+
);
223+
});
224+
225+
it('should throw clear error when generic compose failures map to awf-api-proxy startup failures on both attempts', async () => {
226+
// 1. docker rm (initial cleanup)
227+
mockExecaFn.mockResolvedValueOnce({ stdout: '', stderr: '', exitCode: 0 } as any);
228+
// 2. docker compose up (first attempt - generic execa error)
229+
mockExecaFn.mockRejectedValueOnce(new Error('Command failed with exit code 1: docker compose up -d'));
230+
// 3. docker inspect (first diagnosis)
231+
mockExecaFn.mockResolvedValueOnce({ stdout: 'exited', stderr: '', exitCode: 0 } as any);
232+
// 4. docker logs (first diagnosis)
233+
mockExecaFn.mockResolvedValueOnce({ stdout: 'api-proxy logs', stderr: '', exitCode: 0 } as any);
234+
// 5. docker compose down (cleanup before retry)
235+
mockExecaFn.mockResolvedValueOnce({ stdout: '', stderr: '', exitCode: 0 } as any);
236+
// 6. docker compose up (retry - also generic execa error)
237+
mockExecaFn.mockRejectedValueOnce(new Error('Command failed with exit code 1: docker compose up -d'));
238+
// 7. docker inspect (second diagnosis)
239+
mockExecaFn.mockResolvedValueOnce({ stdout: 'exited', stderr: '', exitCode: 0 } as any);
240+
// 8. docker logs (second diagnosis)
241+
mockExecaFn.mockResolvedValueOnce({ stdout: 'api-proxy logs', stderr: '', exitCode: 0 } as any);
242+
243+
await expect(startContainers(testDir, ['github.com'])).rejects.toThrow(
244+
'AWF firewall failed to start: awf-api-proxy failed to start on both attempts'
245+
);
246+
});
247+
159248
it('should retry once when awf-api-proxy exits (1) during startup', async () => {
160249
// 1. docker rm (initial cleanup)
161250
mockExecaFn.mockResolvedValueOnce({ stdout: '', stderr: '', exitCode: 0 } as any);

src/host-iptables-cleanup.test.ts

Lines changed: 143 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,143 @@
1+
import { execaResult, mockedExeca, setupHostIptablesTestSuite } from './test-helpers/host-iptables-test-setup';
2+
import { cleanupHostIptables, setupHostIptables, __testing } from './host-iptables';
3+
4+
describe('host-iptables (cleanup)', () => {
5+
setupHostIptablesTestSuite(__testing._resetIpv6State);
6+
7+
describe('cleanupHostIptables', () => {
8+
it('should flush and delete both FW_WRAPPER and FW_WRAPPER_V6 chains', async () => {
9+
mockedExeca.mockResolvedValue(execaResult({
10+
stdout: '',
11+
stderr: '',
12+
exitCode: 0,
13+
}));
14+
15+
await cleanupHostIptables();
16+
17+
// Verify IPv4 chain cleanup operations
18+
expect(mockedExeca).toHaveBeenCalledWith('iptables', ['-t', 'filter', '-F', 'FW_WRAPPER'], { reject: false });
19+
expect(mockedExeca).toHaveBeenCalledWith('iptables', ['-t', 'filter', '-X', 'FW_WRAPPER'], { reject: false });
20+
21+
// Verify IPv6 chain cleanup operations
22+
expect(mockedExeca).toHaveBeenCalledWith('ip6tables', ['-t', 'filter', '-F', 'FW_WRAPPER_V6'], { reject: false });
23+
expect(mockedExeca).toHaveBeenCalledWith('ip6tables', ['-t', 'filter', '-X', 'FW_WRAPPER_V6'], { reject: false });
24+
});
25+
26+
it('should re-enable IPv6 via sysctl on cleanup if it was disabled', async () => {
27+
// First, simulate setup that disabled IPv6
28+
mockedExeca
29+
.mockResolvedValueOnce(execaResult({ stdout: 'fw-bridge', stderr: '', exitCode: 0 }))
30+
.mockResolvedValueOnce(execaResult({ stdout: '', stderr: '', exitCode: 0 }))
31+
.mockResolvedValueOnce(execaResult({ exitCode: 1 }));
32+
33+
// Make ip6tables unavailable to trigger sysctl disable
34+
mockedExeca.mockImplementation(((cmd: string) => {
35+
if (cmd === 'ip6tables') {
36+
return Promise.reject(new Error('ip6tables not found'));
37+
}
38+
return Promise.resolve({ stdout: '', stderr: '', exitCode: 0 });
39+
}) as any);
40+
41+
await setupHostIptables('172.30.0.10', 3128, ['8.8.8.8', '8.8.4.4']);
42+
43+
// Now run cleanup
44+
jest.clearAllMocks();
45+
mockedExeca.mockImplementation(((cmd: string) => {
46+
if (cmd === 'ip6tables') {
47+
return Promise.reject(new Error('ip6tables not found'));
48+
}
49+
return Promise.resolve({ stdout: '', stderr: '', exitCode: 0 });
50+
}) as any);
51+
52+
await cleanupHostIptables();
53+
54+
// Verify IPv6 was re-enabled via sysctl
55+
expect(mockedExeca).toHaveBeenCalledWith('sysctl', ['-w', 'net.ipv6.conf.all.disable_ipv6=0']);
56+
expect(mockedExeca).toHaveBeenCalledWith('sysctl', ['-w', 'net.ipv6.conf.default.disable_ipv6=0']);
57+
});
58+
59+
it('should clean up IPv6 rules from DOCKER-USER when ip6tables is available', async () => {
60+
// Mock all calls to succeed (ip6tables available)
61+
mockedExeca.mockImplementation(((cmd: string, args: string[]) => {
62+
// getNetworkBridgeName
63+
if (cmd === 'docker' && args[0] === 'network') {
64+
return Promise.resolve({ stdout: 'fw-bridge', stderr: '', exitCode: 0 });
65+
}
66+
// ip6tables -L -n (availability check)
67+
if (cmd === 'ip6tables' && args.includes('-L') && args.includes('-n') && !args.includes('--line-numbers')) {
68+
return Promise.resolve({ stdout: '', stderr: '', exitCode: 0 });
69+
}
70+
// ip6tables DOCKER-USER listing with FW_WRAPPER_V6 reference
71+
if (cmd === 'ip6tables' && args.includes('DOCKER-USER') && args.includes('--line-numbers')) {
72+
return Promise.resolve({ stdout: '1 FW_WRAPPER_V6 all -- * * ::/0 ::/0\n', stderr: '', exitCode: 0 });
73+
}
74+
// iptables DOCKER-USER listing with FW_WRAPPER reference
75+
if (cmd === 'iptables' && args.includes('DOCKER-USER') && args.includes('--line-numbers')) {
76+
return Promise.resolve({ stdout: '1 FW_WRAPPER all -- -i fw-bridge -o fw-bridge 0.0.0.0/0 0.0.0.0/0\n', stderr: '', exitCode: 0 });
77+
}
78+
return Promise.resolve({ stdout: '', stderr: '', exitCode: 0 });
79+
}) as any);
80+
81+
await cleanupHostIptables();
82+
83+
// Verify IPv6 chain was flushed and deleted
84+
expect(mockedExeca).toHaveBeenCalledWith('ip6tables', ['-t', 'filter', '-F', 'FW_WRAPPER_V6'], { reject: false });
85+
expect(mockedExeca).toHaveBeenCalledWith('ip6tables', ['-t', 'filter', '-X', 'FW_WRAPPER_V6'], { reject: false });
86+
// Verify IPv6 DOCKER-USER rule was removed
87+
expect(mockedExeca).toHaveBeenCalledWith('ip6tables', ['-t', 'filter', '-D', 'DOCKER-USER', '1'], { reject: false });
88+
// Verify IPv4 chain was also cleaned
89+
expect(mockedExeca).toHaveBeenCalledWith('iptables', ['-t', 'filter', '-F', 'FW_WRAPPER'], { reject: false });
90+
expect(mockedExeca).toHaveBeenCalledWith('iptables', ['-t', 'filter', '-X', 'FW_WRAPPER'], { reject: false });
91+
});
92+
93+
it('should skip IPv6 cleanup when ip6tables is not available', async () => {
94+
mockedExeca.mockImplementation(((cmd: string, args: string[]) => {
95+
if (cmd === 'docker') {
96+
return Promise.resolve({ stdout: 'fw-bridge', stderr: '', exitCode: 0 });
97+
}
98+
if (cmd === 'ip6tables') {
99+
return Promise.reject(new Error('ip6tables not found'));
100+
}
101+
if (cmd === 'iptables' && args.includes('DOCKER-USER') && args.includes('--line-numbers')) {
102+
return Promise.resolve({ stdout: '', stderr: '', exitCode: 0 });
103+
}
104+
return Promise.resolve({ stdout: '', stderr: '', exitCode: 0 });
105+
}) as any);
106+
107+
await cleanupHostIptables();
108+
109+
// Should NOT attempt ip6tables cleanup (except the availability check)
110+
expect(mockedExeca).not.toHaveBeenCalledWith('ip6tables', ['-t', 'filter', '-F', 'FW_WRAPPER_V6'], { reject: false });
111+
});
112+
113+
it('should not throw on errors (best-effort cleanup)', async () => {
114+
mockedExeca.mockRejectedValue(new Error('iptables error'));
115+
116+
// Should not throw
117+
await expect(cleanupHostIptables()).resolves.not.toThrow();
118+
});
119+
});
120+
121+
describe('cleanupHostIptables when bridge name is null', () => {
122+
it('should skip IPv4 DOCKER-USER rule removal when bridge name is not found', async () => {
123+
mockedExeca.mockImplementation(((cmd: string, args: string[]) => {
124+
// getNetworkBridgeName returns empty string → null
125+
if (cmd === 'docker' && args[0] === 'network' && args[1] === 'inspect') {
126+
return Promise.resolve({ stdout: '', stderr: '', exitCode: 0 });
127+
}
128+
return Promise.resolve({ stdout: '', stderr: '', exitCode: 0 });
129+
}) as any);
130+
131+
await cleanupHostIptables();
132+
133+
// Should NOT attempt to list DOCKER-USER rules (bridge name is null)
134+
expect(mockedExeca).not.toHaveBeenCalledWith('iptables', [
135+
'-t', 'filter', '-L', 'DOCKER-USER', '-n', '--line-numbers',
136+
], { reject: false });
137+
138+
// Should still flush/delete the chain
139+
expect(mockedExeca).toHaveBeenCalledWith('iptables', ['-t', 'filter', '-F', 'FW_WRAPPER'], { reject: false });
140+
expect(mockedExeca).toHaveBeenCalledWith('iptables', ['-t', 'filter', '-X', 'FW_WRAPPER'], { reject: false });
141+
});
142+
});
143+
});

0 commit comments

Comments
 (0)