refactor: biome warnings sweep + cognitive-complexity extract-method (63→1)#115
Conversation
Adds `.claude-recovery.md` and `.workflow-state.json` to .gitignore. Both are per-session orchestration artifacts — should never be committed. The `/finalize` step deletes `.workflow-state.json` on successful workflow completion; without gitignore, an accidental `git add -A` (anti-pattern per CLAUDE.md) leaks them.
Phase 2 of REFACTOR-BIOME-2026-04-29: applied `biome check --write` across the workspace. Strictly behavior-preserving auto-fixes only. Categories addressed: - useTemplate: 2 string-concat → template literal sites - noNonNullAssertion (FIXABLE): 5 `gc!()` calls → optional chain `gc?.()` - noUnusedImports: removed orphan import in security.spec.ts - noUnusedVariables: removed orphan local in xz-helpers.spec.ts Files: 7 changed, +24/-29. Verification (post-fix): - pnpm test: 671 pass / 0 fail / 3 skip (identical to baseline) - pnpm type-check: 0 errors x3 packages (root + tar-xz + nxz-cli) - rtk proxy biome lint: 63 → 40 warnings (-23, -36.5%) Remaining 40 warnings (Phases 3-5): - 23 noNonNullAssertion (manual replacements via type narrowing) - 12 noExcessiveCognitiveComplexity (extract-method refactors) - 2 noImportCycles (intentional lzma↔pool, will be biome-ignore'd) - 1 useForOf (manual: index-aware loop)
Phase 3+4 of REFACTOR-BIOME-2026-04-29: manual replacements for
warnings biome --write cannot fix automatically.
Categories addressed:
- 23 noNonNullAssertion sites — manual narrowing per site:
- errors.ts:176 — undefined-check + early fallback (no `messages[errno]!`)
- pool.ts:166 — undefined-check on shift() with invariant comment
- tar/checksum.ts:32,92 — `?? 0` for fixed-size header reads + early-break narrows byte type
- tar/format.ts:100 — undefined-break in parseOctal loop
- tar/format.ts:168 — `?? 0` for header[OFFSETS.typeflag]
- nxz.ts:114 — destructure presetMatch[1] with explicit fallback
- nxz.ts:710,713,717,755,756,759 — explicit undefined guards in
findCommonParent / resolveArchiveCwd; preserves existing
process.cwd() fallback semantics
- nxz.ts:736,995 — `?? ''` for files[0] indexing
- test/coverage.spec.ts:917,965,1045 — `?? 0` in checksum loops
- test/security.spec.ts:456,492,541 — `?? 0` in checksum loops
- test/wasm/decompress-memlimit.test.ts:141 — explicit
`if (result === undefined) throw` (Vitest toBeDefined doesn't
narrow TS types per js-typescript GOTCHAS)
- examples/browser/main.ts:24 — getElementById guarded
- 2 noImportCycles (lzma↔pool re-export): biome-ignore comments
citing intentional ESM-resolved-at-runtime pattern from MEMORY.md.
- 1 useForOf (format.ts:125 isEmptyBlock): for-of over Uint8Array
yields number directly, simultaneously kills the warning AND the
block[i]! bypass it covered up.
Files: 10 changed, +53/-28.
Verification:
- pnpm test: 671 pass / 0 fail / 3 skip (identical to baseline)
- pnpm type-check: 0 errors x3 packages
- rtk proxy biome lint: 40 → 14 warnings (-26)
Remaining 14 (Phase 5 + out of scope):
- 12 noExcessiveCognitiveComplexity (extract-method refactors next)
- 2 suppressions/unused (pre-existing, out of refactor scope)
Phase 5a of REFACTOR-BIOME-2026-04-29: kill 7 noExcessiveCognitiveComplexity warnings in test files. Pure helper extraction; test names + ordering unchanged. Helpers extracted (kept inside the spec file, not promoted to a shared util): - `buildEntryBlocks(entry, blocks)` — per-entry tar block construction loop body. Used by `buildTar` in all 3 spec files. - `buildGlobalPaxBlock(attrs, blocks)` — global PAX construction branch (tar-parser-stream.spec.ts only). - `recalculateChecksum(header)` — inline checksum loop pattern shared across multiple `V6b`/`V12` it blocks. After extraction, top-level `buildTar` reduces to a 3-line dispatch (global PAX vs regular entries). Cognitive complexity scores drop below biome threshold. The 7 it() bodies that previously inlined the checksum recalculation now call `recalculateChecksum(header)` — identical math, less noise. Bonus cleanup: tar-parser-stream.spec.ts had a `// biome-ignore lint/complexity/noExcessiveCognitiveComplexity` suppression on the old `buildTar`; refactor makes it unnecessary, suppression removed. Files: 3 changed, +192/-175. Verification: - pnpm test: 671 pass / 0 fail / 3 skip (identical to baseline) - pnpm --filter tar-xz type-check: 0 errors - biome warnings (3 test files): 7 → 0 - biome warnings (full project): 14 → 6 Remaining 6 warnings (Phase 5b + out of scope): - 5 noExcessiveCognitiveComplexity in src files: nxz.ts:777, browser/extract.ts:28, node/extract.ts:198, node/file.ts:180, tar/format.ts:227 (Phase 5b — high-risk territory: just-hardened Win32 TOCTOU code + fresh streaming refactor + core tar format). - 1 pre-existing noEmptyBlockStatements biome-ignore in a test file unrelated to this sweep (out of refactor scope).
Phase 5b-1 of REFACTOR-BIOME-2026-04-29: drops 3 noExcessiveCognitiveComplexity
warnings via pure extract-method on the lower-risk source files.
Helpers extracted (private, same-file scope):
- `packages/tar-xz/src/browser/extract.ts` parseTar:
- `parsePaxHeaderBlock(data, offset)` → `{offset, paxAttrs}` for the
PAX_HEADER branch (parse + advance offset)
- `extractEntryContent(data, entry, offset)` → `{offset, contentData}`
for the content-slice + 512-byte padding advance
Pure helpers, no captured mutable state. Loop semantics preserved.
- `packages/tar-xz/src/tar/format.ts` createHeader:
- `splitLongName(fullName)` → `{prefix, name}` for the legacy 100-byte
field validation + 155/100 prefix/name split. Error message
"File name too long for TAR format: ${fullName}" preserved
byte-for-byte (consumers may pattern-match).
- `packages/nxz/src/nxz.ts` collectArchiveFiles:
- `buildEntryPath(entry, dirAbsPath, dirRelative)` → relative path
construction for archive entries. Lazy `import('node:fs')` retained
inline.
Files: 3 changed, +74/-25.
Verification:
- pnpm test: 671 pass / 0 fail / 3 skip (identical to baseline)
- pnpm type-check: 0 errors x3 packages
- rtk proxy biome lint: 6 → 3 warnings (-3)
Remaining 3 warnings:
- 2 noExcessiveCognitiveComplexity in node/file.ts (extractFile,
PR #114 hardened) and node/extract.ts (streaming, PR #113) →
Phase 5b-2 with pre-push opus senior-review
- 1 noNonNullAssertion suppression in test/node-api.spec.ts:249
(pre-existing, unrelated to this sweep)
Phase 5b-2 of REFACTOR-BIOME-2026-04-29: dropped the last 2 noExcessiveCognitiveComplexity warnings via pure extract-method on the highest-risk src files (PR #114 Win32 TOCTOU + PR #113 streaming). ZERO behavior change; all security/streaming contracts preserved verbatim. `packages/tar-xz/src/node/file.ts` extractFile (HIGH RISK — recently hardened by PR #114): 7 private helpers extracted, original extractFile becomes a thin dispatcher: - `ensureSafeLinkname(linkname, opts, name)` — leaf+ancestor symlink validation for hardlink/symlink targets - `extractSymlinkEntry(target, entry, opts)` — SYMLINK branch - `extractHardlinkEntry(target, entry, opts)` — HARDLINK branch - `openFileExclusive(target, fileMode, entryName)` — Win32 'wx' + EEXIST→unlink+retry→security-throw fail-closed gate. Security error message preserved BYTE-IDENTICAL: "Security error: target still exists on retry for '${entryName}' — possible symlink/junction injection or concurrent creation at the target path between unlink and open". The unlink ENOENT pass-through and second-EEXIST throw remain in the same try/catch nesting. - `writeFileEntryPosix(target, entry, fileMode, mtime)` — POSIX path with O_NOFOLLOW + strict chmod/utimes propagation - `writeFileEntryWin32(handle, entry, fileMode, mtime)` — Win32 fd ops with best-effort chmod/utimes wrap (FAT32/cloud share semantics) - `writeFileEntry(target, entry, fileMode, mtime)` — platform dispatcher (POSIX vs Win32) `packages/tar-xz/src/node/extract.ts` extract async generator: 4 top-level helpers extracted; lookaheadRef ref-object pattern replaces inline closure mutation: - `nextParseEvent(parser, lookaheadRef)` — pulls next event with lookahead consumption - `drainEntryChunks(parser, lookaheadRef)` — drains chunk events until next entry boundary; preserves D-2 swallow policy - `drainSkippedEntry(parser, lookaheadRef)` — drains chunks for a filtered-out entry without yielding - `createEntryDataPull(parser, lookaheadRef)` — replaces makeDataGen + dataGenInFlight closure; preserves D-5 TAR_PARSER_INVARIANT detection on concurrent iteration Removed dead biome-ignore comment from extract.ts (no longer needed post-extraction). Replaced `filter && !filter(strippedEntry)` with `filter?.(strippedEntry) === false` — equivalent semantics, lower nesting cost. Files: 2 changed, +417/-250 net (mostly helper extraction). Verification (BLOCKING — all green): - pnpm test: 671 pass / 0 fail / 3 skip (identical to baseline) - pnpm type-check: 0 errors x3 packages - packages/tar-xz/test/security-win32.spec.ts: 4/4 ✅ - packages/tar-xz/test/security.spec.ts: 22/22 ✅ - rtk proxy biome lint: 3 → 1 warnings (-2 cognitive-complexity) Final state across the full REFACTOR-BIOME-2026-04-29 (Phases 1-5b-2): - 63 → 1 biome warnings (-62 = -98.4%) - 1 remaining = pre-existing `suppressions/unused` in test/node-api.spec.ts:249 (unrelated to this sweep, not addressed) - 671 tests pass throughout, 0 regression - 0 TS errors x3 packages throughout - 8 commits on branch fix/biome-warnings-sweep
Pre-push opus senior-review on REFACTOR-BIOME-2026-04-29 surfaced two LOW-severity informational findings: - F-001: spec § 1 cited file.ts:328-346 (line refs valid at PR #114 merge); post-refactor the fail-closed path lives in private helpers openFileExclusive + writeFileEntryWin32 + writeFileEntry. Spec amended to keep both the historical reference (audit trail) AND the current helper names. Contract semantics unchanged. - F-002: pre-existing latent HARDLINK validation gap. resolve(cwd, strippedLinkname) throws TypeError on undefined linkname when strip=0. Filed in TODO as defensive-hardening item; not introduced by this refactor. Both findings non-blocking for the refactor PR.
There was a problem hiding this comment.
Pull request overview
Workspace-wide refactor focused on reducing Biome warnings and lowering cognitive complexity via extract-method, while preserving existing runtime behavior across Node, browser, WASM, and tar-xz security/streaming paths.
Changes:
- Refactors (extract-method + narrowing) across src and tests to eliminate non-null assertions / complexity warnings.
- Splits several tar-xz extraction and security-critical branches into private helpers (Win32
'wx'path, entry draining, PAX parsing, checksum recalculation). - Small lint-motivated cleanups in examples, CLI, docs, and ignore rules.
Reviewed changes
Copilot reviewed 19 out of 20 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
| test/wasm/decompress-memlimit.test.ts | Replaces concat with template literal and removes result! via explicit guard. |
| src/pool.ts | Adds Biome import-cycle suppression comment; removes non-null assertion on queue.shift(). |
| src/lzma.ts | Adds Biome import-cycle suppression comment on pool re-export. |
| src/errors.ts | Removes non-null assertion when indexing error message table; adds safe fallback. |
| packages/tar-xz/test/xz-helpers.spec.ts | Renames unused counter to satisfy lint. |
| packages/tar-xz/test/tar-parser-stream.spec.ts | Extracts TAR-building helpers to reduce cognitive complexity. |
| packages/tar-xz/test/security.spec.ts | Extracts checksum + TAR-building helpers; removes non-null assertions in error checks. |
| packages/tar-xz/test/node-api.spec.ts | Removes non-null assertion via optional chaining in assertion path. |
| packages/tar-xz/test/memory-shape.spec.ts | Uses optional chaining for gc calls under runIf(hasGc). |
| packages/tar-xz/test/coverage.spec.ts | Extracts checksum + TAR-building helpers; removes non-null assertions. |
| packages/tar-xz/src/tar/format.ts | Safer indexing under noUncheckedIndexedAccess; extracts splitLongName. |
| packages/tar-xz/src/tar/checksum.ts | Removes non-null assertions; handles undefined bytes defensively. |
| packages/tar-xz/src/node/file.ts | Extracts Win32/POSIX file-write and link handling into helpers; keeps TOCTOU hardening structure. |
| packages/tar-xz/src/node/extract.ts | Extracts parsing/draining/data-pull helpers to reduce complexity. |
| packages/tar-xz/src/browser/extract.ts | Extracts PAX/content parsing helpers to reduce complexity. |
| packages/nxz/src/nxz.ts | Replaces non-null assertions with explicit guards/helpers in CLI/tar-create path building. |
| examples/browser/main.ts | Replaces DOM non-null assertion with explicit runtime error. |
| docs/plans/WIN32-TOCTOU-2026-04-29.md | Updates plan text to match refactored helper locations. |
| TODO.md | Updates tracking items for the refactor sweep + residual warning note. |
| .gitignore | Ignores additional local tool state files. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| const strippedEntry = { ...rawEntry, name: strippedName }; | ||
| if (filter && !filter(strippedEntry)) { | ||
| await drainChunks(); | ||
| if (filter?.(strippedEntry) === false) { |
There was a problem hiding this comment.
ExtractOptions.filter is documented/typed to return boolean (“return false to skip an entry”). Changing the check to filter?.(strippedEntry) === false makes any other falsy return value (e.g. 0, '', null, undefined) not skip the entry, which is a behavior change and also diverges from the Browser implementation (if (filter && !filter(...))). Use a standard falsy check (filter && !filter(strippedEntry)) or explicitly coerce to boolean to preserve the existing contract.
| if (filter?.(strippedEntry) === false) { | |
| if (filter && !filter(strippedEntry)) { |
Round-1 Copilot review on PR #115 surfaced one M-class finding: Phase 5b-2 replaced the original `filter && !filter(strippedEntry)` with `filter?.(strippedEntry) === false`. The strict `=== false` check skipped entries ONLY when the predicate returned literal `false`; the original `&&` short-circuit skipped on ANY falsy return (false / 0 / '' / null / undefined). Real behavior drift, not equivalent semantics — and divergent from the browser-side `extract()` which kept `filter && !filter(...)`. Fix: revert to `filter && !filter(...)` semantics, but extract the conditional into a private helper `isExcludedByFilter(entry, filter)` so the inline `&&` short-circuit doesn't push extract()'s cognitive complexity score back over biome's 15 threshold. The helper preserves the exact branching of the original idiom: - filter undefined → returns false (don't skip) - filter returns false → returns true (skip) - filter returns 0 / '' / null / undefined → returns true (skip) - filter returns truthy → returns false (don't skip) Verification: - pnpm --filter tar-xz test: 155 pass / 0 fail / 3 skip - pnpm --filter tar-xz type-check: 0 errors - rtk proxy biome lint extract.ts: 0 warnings (cognitive complexity back under threshold thanks to the helper)
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 19 out of 20 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| let dataGenInFlight = false; | ||
| return function makeDataGen(): AsyncGenerator<Uint8Array> { | ||
| if (dataGenInFlight) { | ||
| throw new Error('concurrent entry.data iteration is not supported'); |
There was a problem hiding this comment.
The D-5 “concurrent entry.data iteration” invariant error is thrown as a plain Error without setting err.code. Elsewhere in this module invariant violations attach code: 'TAR_PARSER_INVARIANT' (e.g., the stray-chunk check), and downstream tests/consumers may rely on the code to distinguish corruption/invariant failures from ordinary runtime errors. Consider assigning code = 'TAR_PARSER_INVARIANT' on this error (while keeping the same message) for consistency.
| throw new Error('concurrent entry.data iteration is not supported'); | |
| const err = new Error('concurrent entry.data iteration is not supported') as Error & { | |
| code?: string; | |
| }; | |
| err.code = 'TAR_PARSER_INVARIANT'; | |
| throw err; |
| * Only SYMLINK and HARDLINK entries are required to have non-empty linknames; | ||
| * other entry types may legitimately have empty or undefined linkname fields. | ||
| * Delegates to `ensureSafeName` for the actual rejection logic. | ||
| */ | ||
| function ensureSafeLinkname(entry: TarEntry): void { | ||
| if ( | ||
| (entry.type === TarEntryType.SYMLINK || entry.type === TarEntryType.HARDLINK) && | ||
| entry.linkname !== undefined | ||
| ) { |
There was a problem hiding this comment.
TarEntry.linkname is typed as a required string in packages/tar-xz/src/types.ts, so the entry.linkname !== undefined guard and the doc text about “empty or undefined linkname fields” are misleading here. Consider simplifying the condition (or updating the comment) to reflect that linkname is always present and that the only invalid case being rejected is the empty/NUL-containing string.
| * Only SYMLINK and HARDLINK entries are required to have non-empty linknames; | |
| * other entry types may legitimately have empty or undefined linkname fields. | |
| * Delegates to `ensureSafeName` for the actual rejection logic. | |
| */ | |
| function ensureSafeLinkname(entry: TarEntry): void { | |
| if ( | |
| (entry.type === TarEntryType.SYMLINK || entry.type === TarEntryType.HARDLINK) && | |
| entry.linkname !== undefined | |
| ) { | |
| * `TarEntry.linkname` is always present as a string, but only SYMLINK and | |
| * HARDLINK entries are validated here. Delegates to `ensureSafeName` for the | |
| * actual rejection logic, including empty or NUL-containing linknames. | |
| */ | |
| function ensureSafeLinkname(entry: TarEntry): void { | |
| if (entry.type === TarEntryType.SYMLINK || entry.type === TarEntryType.HARDLINK) { |
|
|
||
| - [ ] [Lint] Biome warnings sweep (6 total, all `pnpm exec biome lint` exits 0 — warnings only, ran via `rtk proxy biome lint` workaround on 2026-04-28). Two from PR #111: `test/wasm/decompress-memlimit.test.ts:30` (`useTemplate` — string concat → template literal, biome FIXABLE) + `test/wasm/decompress-memlimit.test.ts:141` (`noNonNullAssertion` on `result!` in callback success test, replace with assertion guard). Four pre-existing on master: `src/errors.ts:176` `noNonNullAssertion` (`messages[errno]!`), `src/lzma.ts:63` + `src/pool.ts:20` `noImportCycles` (lzma↔pool re-export cycle, already noted in MEMORY.md as "benign — ESM resolves at runtime"), `src/pool.ts:166` `noNonNullAssertion` (`this.queue.shift()!` after empty-check). Priority: L (cosmetic; can batch with another pass). Note: lint pipeline is silently broken until RTK biome bug fixes (workaround: `rtk proxy biome ...`). | ||
| - [ ] [Lint] Single residual biome warning: `test/node-api.spec.ts:249` (`suppressions/unused` — pre-existing biome-ignore that no longer suppresses anything). Cosmetic 1-line cleanup for a future PR. | ||
| - [ ] [tar-xz] Pre-existing latent HARDLINK validation gap (F-002 from REFACTOR-BIOME-2026-04-29 opus review): `extractHardlinkEntry` calls `resolve(cwd, strippedLinkname)` where `strippedLinkname` may be `undefined` if `entry.linkname === undefined` AND `strip === 0`. `resolve(undefined)` throws `TypeError` instead of a structured error. Defensive validation; not introduced by refactor; HARDLINK entries with no linkname are already malformed. Priority: L (rare path). |
There was a problem hiding this comment.
This TODO item claims entry.linkname can be undefined (leading to resolve(undefined)), but TarEntry.linkname is currently a required string in packages/tar-xz/src/types.ts and the TAR header parser returns '' for an empty linkname field. Unless there’s another entry source that can produce undefined, this follow-up looks stale/mischaracterized and should be reworded (e.g., to refer to empty linknames) or removed to avoid tracking a non-issue.
Three findings folded:
- **C2-1 (M)**: `createEntryDataPull` threw the D-5 invariant error
("concurrent entry.data iteration is not supported") as a plain
Error without `code = 'TAR_PARSER_INVARIANT'`. The stray-chunk
invariant elsewhere in the module DOES set `code`, and
`drainSkippedEntry` filters errors by that exact code to decide
swallow vs re-throw. Without the code attribute, a concurrent-
iteration error inside a skipped entry would be silently swallowed,
breaking the D-5 contract. Pre-existing latent bug surfaced by the
Phase 5b-2 extraction. Fixed by setting `err.code =
'TAR_PARSER_INVARIANT'` on the dataGenInFlight error.
- **C2-2 (L)**: `ensureSafeLinkname` JSDoc said "empty or undefined
linkname fields" but `TarEntry.linkname` is typed as required
`string` (parser returns `''` for empty fields). Removed the dead
`entry.linkname !== undefined` check (always true given the type)
and clarified the docstring. Behavior preserved: empty linkname on
SYMLINK/HARDLINK still rejected by `ensureSafeName`'s empty-string
guard.
- **C2-3 (L)**: F-002 TODO entry I added in the previous round
claimed `extractHardlinkEntry` could pass `undefined` to `resolve`.
Mischaracterized — `linkname` is required string, parser sets `''`
for empty fields, and `ensureSafeLinkname` rejects '' upstream.
Replaced the entry with an HTML comment explaining why it was
dropped (preserves audit trail).
Verification:
- pnpm --filter tar-xz test: 155 pass / 0 fail / 3 skip
- pnpm --filter tar-xz type-check: 0 errors
- rtk proxy biome lint: 1 warning (pre-existing unrelated
suppressions/unused — not introduced by this PR)
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 19 out of 20 changed files in this pull request and generated 1 comment.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| let outputFile = options.output; | ||
| if (!outputFile) { | ||
| const base = pathModule.basename(files[0]!).replace(/\/$/, ''); | ||
| const firstFile = files[0] ?? ''; |
There was a problem hiding this comment.
resolveTarOutput() now falls back to files[0] ?? '', which can silently produce an output name of .tar.xz if it’s ever called with an empty files array. Previously this would have thrown (via basename(undefined)), which is a much clearer failure mode for an invariant violation. Consider making the precondition explicit (e.g., throw when files.length === 0) instead of defaulting to an empty string.
| const firstFile = files[0] ?? ''; | |
| if (files.length === 0) { | |
| throw new Error('resolveTarOutput requires at least one input file when no output path is provided'); | |
| } | |
| const firstFile = files[0]; | |
| if (firstFile === undefined) { | |
| throw new Error('resolveTarOutput requires at least one input file when no output path is provided'); | |
| } |
Round-3 Copilot finding (M): Phase 3+4's `files[0] ?? ''` replacement in `resolveTarOutput()` swallowed the original `basename(undefined)` TypeError that signaled an invariant violation (empty input files array when no output path provided). With `?? ''`, an empty array silently produces an output named `.tar.xz` (just the suffix) — worse than a clear error. Fix: explicit `firstFile === undefined` check + descriptive throw, preserving the invariant-violation fail-fast semantics. The narrowing also avoids re-introducing the `noNonNullAssertion` warning that Phase 3+4 fought to eliminate. The other `files[0] ?? ''` site at line 1020 (in the dispatcher) is gated by an upstream `if (options.files.length === 0) process.exit` guard at line 1013 — the `??` there is unreachable defensive code, documented in the comment. Left unchanged. Verification: - pnpm --filter nxz-cli test: 27/27 pass - pnpm --filter nxz-cli type-check: 0 errors - rtk proxy biome lint nxz.ts: 0 warnings
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 19 out of 20 changed files in this pull request and generated 1 comment.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| entry: TarEntry, | ||
| filter: ((e: TarEntry) => boolean) | undefined | ||
| ): boolean { | ||
| return filter !== undefined && !filter(entry); |
There was a problem hiding this comment.
isExcludedByFilter uses filter !== undefined to decide whether to call the predicate, but the comment says this preserves the historical filter && !filter(entry) semantics. These are not equivalent: a null (or other falsy non-undefined) filter would previously be treated as “no filter”, but will now be invoked and throw at runtime. To truly preserve prior behavior (and match the browser/node create implementations), gate on typeof filter === 'function' (or filter &&) before calling it.
| return filter !== undefined && !filter(entry); | |
| return typeof filter === 'function' && !filter(entry); |
Round-4 Copilot finding (M): the helper used `filter !== undefined` to gate the predicate call. The original `filter && !filter(entry)` short-circuited on ANY falsy filter (false / 0 / '' / null / undefined); `!== undefined` lets `null` through, which would crash at `filter(null)` runtime. Browser-side `extract()` correctly uses `filter && ...`. Fix: `typeof filter === 'function' && !filter(entry)` — null-safe, matches browser-side semantics, preserves the documented contract. Note on macOS CI failure on the previous push: the smoke test crashed with `Error: async hook stack has become corrupted (actual: 6939, expected: 6939)` — a known Node.js v22 internal AsyncHooks race during pnpm install postinstall, NOT a refactor regression. Stack trace is 100% inside Node native code (uv__work_done → AfterStat → InternalCallbackScope::Close → AsyncHooks::pop). This push will re-trigger CI; if macOS passes on retry, confirmed flaky. Verification: - pnpm --filter tar-xz test: 155/0/3 - pnpm --filter tar-xz type-check: 0 errors - rtk proxy biome lint extract.ts: 0 warnings
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 19 out of 20 changed files in this pull request and generated 2 comments.
Comments suppressed due to low confidence (1)
packages/nxz/src/nxz.ts:728
findCommonParent()splits on'/'and joins with'/', but it is computing a common parent for filesystem paths (path.resolve()output). On Windows these paths use\\, so this will typically return'/'(or otherwise compute the wrong cwd), breaking tar-create behavior. UsepathModule.sep/pathModule.parse()orpathModule.relative()-based logic instead of hardcoding POSIX separators for OS paths.
const parts = paths.map((p) => p.split('/'));
const common: string[] = [];
const first = parts[0];
if (first === undefined) return process.cwd();
for (let i = 0; i < first.length; i++) {
const segment = first[i];
if (segment !== undefined && parts.every((p) => p[i] === segment)) {
common.push(segment);
} else {
break;
}
}
return common.join('/') || '/';
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
|
||
| const item = this.queue.shift()!; | ||
| const item = this.queue.shift(); | ||
| if (item === undefined) return; // invariant: queue was non-empty above |
There was a problem hiding this comment.
processQueue() already guards this.queue.length === 0, so shift() returning undefined would indicate an internal invariant violation. Returning silently here can leave queued tasks unprocessed and makes the pool fail “quietly” if the invariant is ever broken (e.g., future refactors/re-entrancy). Prefer keeping the non-null assertion, or throw an explicit invariant error (and/or emit an error event) rather than returning.
| if (item === undefined) return; // invariant: queue was non-empty above | |
| if (item === undefined) { | |
| throw new Error('Invariant violation: queue was non-empty but shift() returned undefined'); | |
| } |
| // Check for tar-create mode: -T with files that aren't .tar.xz archives | ||
| const mode = determineMode(options, options.files[0]!); | ||
| // options.files.length > 0 is guaranteed by the stdin check above | ||
| const firstFile = options.files[0] ?? ''; |
There was a problem hiding this comment.
options.files.length > 0 is asserted by the preceding stdin check, but the fallback options.files[0] ?? '' will silently change behavior if that invariant is ever violated (e.g., future argument parsing changes): determineMode() would run with an empty filename. Prefer an explicit invariant check/throw (or keep a non-null assertion) so failures are fail-fast and easier to diagnose.
| const firstFile = options.files[0] ?? ''; | |
| const firstFile = options.files[0]; | |
| if (firstFile === undefined) { | |
| throw new Error('Internal error: expected at least one input file after stdin handling'); | |
| } |
Round-5 Copilot findings (2 M, same fail-fast vs fail-silent class
as round-3 C3-1):
- **C5-1 (M)** `src/pool.ts:168` — `processQueue()` guards
`this.queue.length === 0` immediately above the `shift()` call,
so `shift()` returning undefined indicates an INVARIANT VIOLATION
(re-entrancy, future refactor, etc.). The previous Phase 3+4
replacement of `!` with `if (item === undefined) return` silenced
this breach, leading to queued tasks stalling silently. Fixed:
throw `Invariant violation: queue was non-empty but shift()
returned undefined`.
- **C5-2 (M)** `packages/nxz/src/nxz.ts:1029` — same pattern.
After the stdin guard at line 1013 exits on `files.length === 0`,
the next reference `options.files[0] ?? ''` (added in Phase 3+4)
silenced an invariant breach. Fixed: explicit `if (firstFile ===
undefined) throw new Error('Internal error: expected at least one
input file after stdin handling')`. Mirrors the C3-1 fix in
resolveTarOutput from round-3.
Both findings predate this refactor in spirit (master had `!`
non-null assertions that hid the same invariant assumption); Phase
3+4 replaced the `!` with silent fallbacks. Round-3+5 walked the
project back to fail-fast for those specific sites.
Verification:
- pnpm test: full suite green (155 tar-xz / 27 nxz / 489 root = 671/0/3)
- pnpm type-check: 0 errors x3 packages
- rtk proxy biome lint: 1 warning (pre-existing unrelated suppressions/unused)
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 19 out of 20 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| // The `processQueue.length > 0` guard above means shift() must return a | ||
| // value. Returning silently here would let queued tasks stall on a | ||
| // future invariant breach (re-entrancy, refactor, etc.). Throw so the | ||
| // bug surfaces at the breach instead of as a hung pool. | ||
| throw new Error('Invariant violation: queue was non-empty but shift() returned undefined'); |
There was a problem hiding this comment.
The comment refers to processQueue.length > 0 as the guard ensuring shift() returns a value, but the actual guard is this.queue.length === 0 above. processQueue.length is the function arity and unrelated here, so this comment is misleading—update it to reference this.queue.length (or the "queue empty" guard) instead.
| // We do NOT validate entry.linkname here because symlinks can legitimately | ||
| // point to relative paths inside the archive; the TOCTOU risk comes from | ||
| // follow-up entries — those are guarded by ensureSafeTarget() on each entry. |
There was a problem hiding this comment.
This comment says "We do NOT validate entry.linkname here", but extractFile() now always calls ensureSafeLinkname(entry) before branching into extractSymlinkEntry(), so linkname is validated (non-empty / no NUL / not '.' or '..') earlier. Clarify that the only thing not validated here is whether the symlink target stays within cwd (if that's the intent), or remove the statement to avoid contradicting the actual behavior.
| // We do NOT validate entry.linkname here because symlinks can legitimately | |
| // point to relative paths inside the archive; the TOCTOU risk comes from | |
| // follow-up entries — those are guarded by ensureSafeTarget() on each entry. | |
| // `entry.linkname` has already been syntactically validated before reaching | |
| // this function; what we do NOT enforce here is that the symlink target stays | |
| // within `cwd`. The TOCTOU risk comes from follow-up entries, which are | |
| // guarded by ensureSafeTarget() on each extraction target. |
| if (dataGenInFlight) { | ||
| // D-5 invariant violation: must carry TAR_PARSER_INVARIANT code so that | ||
| // drainSkippedEntry's filter (and downstream consumers) can distinguish | ||
| // it from runtime/IO errors and re-throw rather than swallow. | ||
| const err = new Error('concurrent entry.data iteration is not supported') as Error & { | ||
| code?: string; | ||
| }; | ||
| err.code = 'TAR_PARSER_INVARIANT'; | ||
| throw err; |
There was a problem hiding this comment.
The rationale comment for setting err.code = 'TAR_PARSER_INVARIANT' mentions drainSkippedEntry needing to distinguish this error from runtime/IO errors, but this error is thrown on creating a second data generator (i.e. calling makeDataGen() twice). In the current design makeTarEntryWithData() calls dataPull() exactly once and never exposes makeDataGen, so this path is effectively unreachable; the comment (and possibly the guard) reads misleadingly as "concurrent entry.data iteration" protection. Consider adjusting the comment to describe what is actually guarded (multiple generator creation) or remove the drainSkippedEntry justification.
Three L-class comment-accuracy findings from Copilot round-6: - pool.ts:169 — round-5 commit cited `processQueue.length > 0` but that's the function arity (=0), not the queue. Reworded to reference the actual `this.queue.length === 0` early-return guard above. - file.ts:158 — `extractSymlinkEntry` comment said "We do NOT validate entry.linkname here" but `ensureSafeLinkname()` runs upstream now (since Phase 5b-2). Reworded to clarify what's validated upstream (syntactic: non-empty + no NUL + not dot placeholder) vs what's intentionally NOT enforced here (cwd- containment of the symlink target). - extract.ts:193 — round-2 commit's rationale for setting `err.code = 'TAR_PARSER_INVARIANT'` cited `drainSkippedEntry` filtering, but the makeDataGen() branch is effectively unreachable in production (`makeTarEntryWithData` calls dataPull() exactly once and never exposes makeDataGen). Reworded to describe what the guard actually catches (double makeDataGen() invocation) and why the `code` attribute is set (consistency with other invariant errors in the module). Pure comment-precision; zero behavior change. Push uses `# NO-COPILOT-REVIEW` per workflow §2.10 termination rule (a) for L-only on a refactor PR — saves Copilot premium quota; opus SAFE-TO-PUSH already cleared the actual code. Verification: - pnpm test: 671/0/3 (full suite) - pnpm type-check: 0 errors x3 packages - rtk proxy biome lint: 1 warning (pre-existing unrelated)
Future workspace releases (`tar-xz`, `nxz-cli`) now auto-scope their CHANGELOG entries to commits since the LAST per-package release commit (matching `chore(<pkg>): release v*`), instead of since the last GPG-signed root tag. Eliminates the need for post-release manual CHANGELOG curation that nxz-cli@6.1.0 required to drop entries from #108 (already in 6.0.0), #115 (biome refactor body fragments), and `adfbc99` (changesets adoption, already removed). Empirical dry-run from `packages/nxz/` with `GIT_CHANGELOG_PATH=.` produces zero diff: the new `resolveSinceBaseline` correctly stops at `ecff028 chore(nxz-cli): release v6.1.0`. The new `GIT_CHANGELOG_SINCE` env var also provides an explicit override escape hatch for projects whose release commits do not follow the `chore(<pkg>): release v*` pattern.
Summary
file.tsextractFile post-PR fix(tar-xz): close Win32 symlink-swap TOCTOU with JS-pure 'wx'+retry fail-closed #114 Win32 TOCTOU +extract.tsstreaming post-PR feat(tar-xz): true streaming for Node extract()/list() — O(largest entry) #113) refactored with surgical preservation; pre-push opus senior-review verdict: SAFE-TO-PUSH with byte-equivalent security-error string + per-entrydataGenInFlightisolation confirmed.Metrics
Phases
biome check --writebuildEntryBlocks,recalculateChecksum,buildGlobalPaxBlockhelpersparsePaxHeaderBlock,extractEntryContent,splitLongName,buildEntryPathopenFileExclusive,writeFileEntryWin32,writeFileEntryPosix,extractSymlinkEntry,extractHardlinkEntry,nextParseEvent,drainEntryChunks,drainSkippedEntry,createEntryDataPull(11 helpers)Security & streaming contract preservation (pre-push opus verdict)
'wx'atomic createopenFileExclusiveline 311O_NOFOLLOWstrict propagationfor await (const chunk of entry.data)consumptionensureSafeLinknamemirrors inline checkdataGenInFlightper-entry closure isolationcreateEntryDataPullconstructs fresh closure per entrydrainSkippedEntrymirrors original try/catchasync function*outer + helpers)Test plan
pnpm test: 671 pass / 0 fail / 3 skip (identical to baseline at every phase boundary)pnpm type-checkx3 packages: 0 errors throughoutrtk proxy biome lint: 63 → 1 warningspackages/tar-xz/test/security-win32.spec.ts: 4/4 pass (Win32 TOCTOU contract)packages/tar-xz/test/security.spec.ts: 22/22 pass (general security regression)Out-of-scope follow-ups (filed in TODO)
test/node-api.spec.ts:249suppressions/unused(pre-existing, unrelated to this sweep)TypeErrorinstead of structured error. Defensive hardening; not introduced by this refactor.Story
REFACTOR-BIOME-2026-04-29 — branch
fix/biome-warnings-sweep, 10 commits.