Skip to content

Commit 6264227

Browse files
maunilmclaude
andcommitted
fix(security): tighten watchdog overshoot, guard Windows path, assert live termination [DEVA11Y-484]
Addresses gaps found by stress-testing the guard rather than just asserting the happy path: - Measured overshoot: at a 200ms poll, bsdtar could write ~270-380MB past the cap on a fast disk before the watchdog tripped (the cap was far softer than the "200 MB" message implied). Tightened the poll to 50ms — a 10MB cap now peaks at ~34MB and a 2GB bomb is killed at ~224MB. Documented the cap as an explicit SOFT ceiling whose purpose is preventing disk *exhaustion*, not exact byte enforcement. - Windows Expand-Archive path was completely unguarded. Added a platform-agnostic post-extraction footprint backstop in the common path (typecheckable on macOS) so Windows rejects + cleans up a bomb before the binary is used. - Strengthened tests to assert the LIVE watchdog fires (bsdtar SIGTERM, status 15) and that peak disk stays bounded below the bomb size — previously the bomb tests would have passed even if only the post-extraction check worked (which would let a multi-GB bomb fill the disk). - Added test_large_bomb.sh (opt-in via DEVA11Y_DEEP=1): proves a 2GB bomb is bounded to ~224MB. Kept out of the default CI run to keep it fast/bounded. - README now documents the real limitations: soft cap + overshoot, Windows is post-hoc only, the Swift suite tests a mirror (not the compiled plugin) with the call sites typecheck-only, and locateExecutable's cap is defense-in-depth. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
1 parent 0cc2b6b commit 6264227

6 files changed

Lines changed: 105 additions & 9 deletions

File tree

Plugins/BrowserStackAccessibilityLint/BrowserStackAccessibilityLint.swift

Lines changed: 18 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -225,6 +225,15 @@ private struct BrowserStackCLIDownloader {
225225
try extractWithBsdtar(from: info.resolvedURL, into: versionDirectory)
226226
#endif
227227

228+
// Platform-agnostic backstop (DEVA11Y-484). The bsdtar paths already abort mid-stream
229+
// via the watchdog; this also covers the Windows Expand-Archive path, which has no
230+
// streaming guard — it can't bound peak disk during extraction, but it rejects and
231+
// cleans up a bomb before the binary is ever used.
232+
if let reason = footprintExceeded(at: versionDirectory, maxBytes: Self.maxDecompressedBytes, maxEntries: Self.maxArchiveEntries) {
233+
try? fileManager.removeItem(at: versionDirectory)
234+
forwardExit(code: 1, message: "BrowserStack CLI archive rejected: \(reason). Aborting to prevent disk exhaustion.")
235+
}
236+
228237
let locatedBinary = try locateExecutable(in: versionDirectory, preferredName: executableName)
229238
let finalBinaryURL: URL
230239
if locatedBinary.lastPathComponent == executableName {
@@ -714,9 +723,14 @@ func footprintExceeded(at directory: URL, maxBytes: Int64, maxEntries: Int) -> S
714723
}
715724

716725
/// Starts a background watchdog that terminates `process` (bsdtar) if the decompressed
717-
/// footprint in `directory` exceeds the byte or entry ceiling. The watchdog bounds peak
718-
/// disk use during a large/slow bomb; callers MUST also run `footprintExceeded` once the
719-
/// process exits, to catch a fast bomb that finished within a single poll interval.
726+
/// footprint in `directory` exceeds the byte or entry ceiling.
727+
///
728+
/// This is a SOFT ceiling: bsdtar can write up to one poll interval's worth of data past
729+
/// the limit before it is killed, so peak disk use is roughly `maxBytes + (pollInterval ×
730+
/// disk write rate)`. The goal is to prevent disk *exhaustion* by a multi-GB/TB bomb, not
731+
/// to enforce an exact byte count. The interval is kept short to bound the overshoot.
732+
/// Callers MUST also run `footprintExceeded` once the process exits, to catch a fast bomb
733+
/// that finished within a single poll interval.
720734
func startExtractionWatchdog(on process: Process, directory: URL, maxBytes: Int64, maxEntries: Int) -> ExtractionLimitState {
721735
let state = ExtractionLimitState()
722736
let watchdog = Thread {
@@ -726,7 +740,7 @@ func startExtractionWatchdog(on process: Process, directory: URL, maxBytes: Int6
726740
process.terminate()
727741
break
728742
}
729-
Thread.sleep(forTimeInterval: 0.2)
743+
Thread.sleep(forTimeInterval: 0.05)
730744
}
731745
}
732746
watchdog.start()

scripts/test/README.md

Lines changed: 23 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -45,6 +45,29 @@ directly. Instead:
4545
- `check_drift.sh` diffs the two and **fails if they diverge**, so the mirror can never
4646
silently rot. If you edit the guard, copy the block into both — the drift check enforces it.
4747

48+
## Known limitations (read before trusting this blindly)
49+
50+
- **The cap is soft, not exact.** The Swift watchdog polls the extraction directory
51+
(every 50 ms) and kills `bsdtar` once the footprint crosses the limit, so peak disk use
52+
is roughly `cap + (poll interval × disk write rate)`. Measured: a 200 MB cap peaks around
53+
~230–300 MB on a fast NVMe; a 2 GB bomb is killed at ~224 MB (see `test_large_bomb.sh`).
54+
The goal is preventing disk *exhaustion* by a multi-GB/TB bomb — not enforcing an exact
55+
byte count. The shell `-O | head -c` path, by contrast, is a hard byte cap.
56+
- **The Swift tests run a mirror, not the compiled plugin.** `check_drift.sh` guarantees the
57+
guard *block* matches, but the harness has its own copies of the `extractRemote/extractLocal`
58+
call sites, which are NOT drift-checked. A bug in how the plugin wires the guard into those
59+
call sites would not be caught here (the plugin edits are typecheck-only). Eliminating this
60+
needs the larger refactor of extracting the logic into an importable target.
61+
- **`locateExecutable`'s 10k-entry cap is not exercised by the harness** — the watchdog's
62+
entry ceiling (which IS tested) is the primary defense; the `locateExecutable` cap is
63+
secondary/defense-in-depth and currently typecheck-only.
64+
- **Windows protection is post-hoc only.** The macOS/Linux `bsdtar` paths bound peak disk
65+
mid-stream via the watchdog. The Windows `Expand-Archive` path has no streaming guard; it
66+
gets only the platform-agnostic post-extraction backstop (it rejects + cleans up a bomb
67+
before the binary is used, but the bomb can momentarily expand to its full size on disk
68+
first). Windows also can't run on the macOS CI image, so it is verified by typecheck only.
69+
A streaming guard for Windows is a follow-up.
70+
4871
## Files
4972

5073
| File | Purpose |

scripts/test/run_tests.sh

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -25,6 +25,9 @@ bash "$HERE/test_shell_extraction.sh" || rc=1
2525
echo; echo "▶ Swift plugin guard tests (via mirror harness)"
2626
bash "$HERE/test_swift_extraction.sh" || rc=1
2727

28+
echo; echo "▶ Large-bomb deep test (opt-in: DEVA11Y_DEEP=1)"
29+
bash "$HERE/test_large_bomb.sh" || rc=1
30+
2831
echo
2932
if [ "$rc" -eq 0 ]; then
3033
echo "✅ DEVA11Y-484 suite: ALL GREEN"

scripts/test/swift-harness/Sources/ExtractionHarness/Guard.swift

Lines changed: 9 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -71,9 +71,14 @@ func footprintExceeded(at directory: URL, maxBytes: Int64, maxEntries: Int) -> S
7171
}
7272

7373
/// Starts a background watchdog that terminates `process` (bsdtar) if the decompressed
74-
/// footprint in `directory` exceeds the byte or entry ceiling. The watchdog bounds peak
75-
/// disk use during a large/slow bomb; callers MUST also run `footprintExceeded` once the
76-
/// process exits, to catch a fast bomb that finished within a single poll interval.
74+
/// footprint in `directory` exceeds the byte or entry ceiling.
75+
///
76+
/// This is a SOFT ceiling: bsdtar can write up to one poll interval's worth of data past
77+
/// the limit before it is killed, so peak disk use is roughly `maxBytes + (pollInterval ×
78+
/// disk write rate)`. The goal is to prevent disk *exhaustion* by a multi-GB/TB bomb, not
79+
/// to enforce an exact byte count. The interval is kept short to bound the overshoot.
80+
/// Callers MUST also run `footprintExceeded` once the process exits, to catch a fast bomb
81+
/// that finished within a single poll interval.
7782
func startExtractionWatchdog(on process: Process, directory: URL, maxBytes: Int64, maxEntries: Int) -> ExtractionLimitState {
7883
let state = ExtractionLimitState()
7984
let watchdog = Thread {
@@ -83,7 +88,7 @@ func startExtractionWatchdog(on process: Process, directory: URL, maxBytes: Int6
8388
process.terminate()
8489
break
8590
}
86-
Thread.sleep(forTimeInterval: 0.2)
91+
Thread.sleep(forTimeInterval: 0.05)
8792
}
8893
}
8994
watchdog.start()

scripts/test/test_large_bomb.sh

Lines changed: 45 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,45 @@
1+
#!/usr/bin/env bash
2+
# DEEP test (opt-in): proves the watchdog bounds peak disk for a bomb FAR larger than
3+
# the cap — the real disk-exhaustion scenario. Skipped by default because it writes/reads
4+
# a multi-GB archive; enable with DEVA11Y_DEEP=1. Blast radius is bounded: if the guard
5+
# regressed entirely, it writes the bomb's full size once (a few GB) then cleans up.
6+
set -uo pipefail
7+
8+
HERE="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
9+
# shellcheck source=lib/assert.sh
10+
source "$HERE/lib/assert.sh"
11+
12+
if [ "${DEVA11Y_DEEP:-0}" != "1" ]; then
13+
echo "test_large_bomb: skipped (set DEVA11Y_DEEP=1 to run the multi-GB bomb test)"
14+
exit 0
15+
fi
16+
17+
MB=$((1024*1024))
18+
BOMB_MB=${BOMB_MB:-2048} # bomb decompressed size
19+
CAP_MB=${CAP_MB:-200} # production byte cap
20+
PEAK_BUDGET_MB=${PEAK_BUDGET_MB:-700} # generous ceiling proving we stop far below the bomb
21+
22+
echo "Building Swift harness..."
23+
( cd "$HERE/swift-harness" && swift build ) >/dev/null 2>&1 || { echo "harness build failed"; exit 1; }
24+
HARNESS="$(cd "$HERE/swift-harness" && swift build --show-bin-path)/ExtractionHarness"
25+
26+
WORK="$(mktemp -d)"
27+
trap 'stop_server; rm -rf "$WORK"' EXIT
28+
echo "Generating ${BOMB_MB} MB bomb..."
29+
( cd "$WORK" && dd if=/dev/zero bs=1048576 count="$BOMB_MB" of=cli 2>/dev/null && bsdtar -czf bomb-big.tar.gz cli && rm cli )
30+
31+
start_server "$WORK" || exit 1
32+
33+
echo "── Large-bomb watchdog test (${BOMB_MB} MB bomb, ${CAP_MB} MB cap) ──"
34+
dest="$(mktemp -d "$WORK/dest.XXXX")"; rm -rf "$dest"
35+
OUT=$("$HARNESS" remote "$(url_for bomb-big.tar.gz)" "$dest" "$((CAP_MB*MB))" 10000 "$((100*MB))" 2>/dev/null)
36+
echo " outcome: $OUT"
37+
peak_mb=$(( $(python3 -c 'import sys,json;print(json.load(sys.stdin)["bytes"])' <<<"$OUT") / MB ))
38+
status=$(python3 -c 'import sys,json;print(json.load(sys.stdin)["bsdtarStatus"])' <<<"$OUT")
39+
40+
assert_eq "$(python3 -c 'import sys,json;print(json.load(sys.stdin)["exceeded"])' <<<"$OUT")" "True" "large bomb: flagged as exceeded"
41+
assert_eq "$status" "15" "large bomb: bsdtar SIGTERM'd mid-stream"
42+
assert_le "$peak_mb" "$PEAK_BUDGET_MB" "large bomb: peak disk (${peak_mb}MB) bounded far below bomb size (${BOMB_MB}MB)"
43+
assert_absent "$dest" "large bomb: extraction dir removed"
44+
45+
summary

scripts/test/test_swift_extraction.sh

Lines changed: 7 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -47,10 +47,16 @@ assert_eq "$(jget "$OUT" bsdtarStatus)" "0" "legit: bsdtar exits 0"
4747
if [ -f "$DEST/browserstack-cli" ]; then ok "legit: binary extracted"; else bad "legit: binary missing"; fi
4848
"$DEST/browserstack-cli" >/dev/null 2>&1; assert_status $? 0 "legit: extracted binary runs"
4949

50-
# 2. Decompression bomb (remote), small byte cap: flagged, terminated, dir removed.
50+
# 2. Decompression bomb (remote), small byte cap: flagged, terminated mid-stream, removed.
51+
# The 400 MB fixture is ~8x the 50 MB cap, so a working LIVE watchdog must SIGTERM
52+
# bsdtar (status 15) well before it finishes — asserting that catches a regression
53+
# where only the post-extraction check works (which would let a huge bomb fill the disk).
5154
run_remote "bomb.tar.gz" "$CAP_SMALL" "$ENTRIES"
5255
assert_eq "$(jget "$OUT" exceeded)" "True" "bomb: flagged as exceeded"
5356
assert_contains "$(jget "$OUT" reason)" "decompressed size" "bomb: reason cites size"
57+
assert_eq "$(jget "$OUT" bsdtarStatus)" "15" "bomb: bsdtar SIGTERM'd mid-stream (live watchdog fired)"
58+
bomb_peak_mb=$(( $(jget "$OUT" bytes) / MB ))
59+
assert_le "$bomb_peak_mb" 350 "bomb: peak disk bounded below the 400 MB fixture (=${bomb_peak_mb}MB)"
5460
assert_absent "$DEST" "bomb: extraction dir removed"
5561

5662
# 3. Entry-count bomb (20k tiny files), generous byte cap: flagged on entries.

0 commit comments

Comments
 (0)