feat(tar-xz)!: redesign for v6 — universal stream-first API#108
BREAKING CHANGE: tar-xz v6 — universal stream-first API.
The Node and Browser APIs now share identical signatures. All three
core functions (create, extract, list) are async generators returning
or accepting AsyncIterable<Uint8Array>.
Removed:
- extractToMemory() — replaced by extract() + entry.bytes()
- createTarXz / extractTarXz / listTarXz (browser-prefixed names)
- BrowserCreateOptions / BrowserExtractOptions (unified into one type)
- ExtractedFile interface (replaced by TarEntryWithData)
New shape:
create(options): AsyncIterable<Uint8Array>
extract(input, options?): AsyncIterable<TarEntryWithData>
list(input): AsyncIterable<TarEntry>
type TarInput = AsyncIterable<Uint8Array> | Iterable<Uint8Array>
| Uint8Array | ArrayBuffer
| ReadableStream<Uint8Array> // Web
| NodeJS.ReadableStream; // Node
Source files for create() use the new TarSourceFile shape:
{ name, source: AsyncIterable<Uint8Array> | Uint8Array | string }
(string only valid in Node; interpreted as fs path)
Phase 1 of v6 redesign. nxz-cli and tests will be updated next.
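For readers migrating away from extractToMemory(), the "extract() + entry.bytes()" replacement can be sketched as below. This is only an illustration under stated assumptions: EntryLike is a structural stand-in declaring just the two members used here (it is not the library's TarEntryWithData type), and collectToMemory and fakeEntries are hypothetical names that do not exist in tar-xz.

```typescript
// Structural stand-in for the entries yielded by extract(); only the two
// members used here are declared. Not the library's TarEntryWithData type.
interface EntryLike {
  name: string;
  bytes(): Promise<Uint8Array>;
}

interface MemoryFile {
  name: string;
  data: Uint8Array;
}

// Hypothetical helper: drains a stream of entries into memory.
async function collectToMemory(entries: AsyncIterable<EntryLike>): Promise<MemoryFile[]> {
  const files: MemoryFile[] = [];
  for await (const entry of entries) {
    // Read each entry's bytes before advancing to the next entry.
    files.push({ name: entry.name, data: await entry.bytes() });
  }
  return files;
}

// Fake entry stream so the sketch runs without the library installed.
async function* fakeEntries(): AsyncGenerator<EntryLike> {
  const encoder = new TextEncoder();
  const samples: Array<[string, string]> = [
    ['a.txt', 'hello'],
    ['b.txt', 'v6'],
  ];
  for (const [name, text] of samples) {
    yield { name, bytes: async () => encoder.encode(text) };
  }
}
```

With the real package, the call would presumably be collectToMemory(extract(input)), since the entries described above expose name and bytes().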
…0.0)
- New 'tar-xz/file' subpath export (Node only) with createFile/extractFile/listFile
- tar-xz package.json: bump to 6.0.0, add ./file subpath export
- nxz-cli: rewired to use tar-xz/file helpers, bump to 6.0.0
- Tests rewritten for v6 API (80/80 tar-xz, 27/27 nxz)
Bugfix: toAsyncIterable mis-dispatched Uint8Array via Symbol.iterator (yielding bytes as numbers). Reordered checks so instanceof Uint8Array takes precedence over Symbol.iterator.
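The dispatch-order bugfix is easier to see in code. The sketch below is not the package's toAsyncIterable; it is a minimal reconstruction (toAsyncIterableSketch and ByteInput are made-up names, and Web/Node stream inputs are left out) showing why the Uint8Array check has to come before the Symbol.iterator check.

```typescript
// Reduced input union for illustration; the real TarInput also covers streams.
type ByteInput =
  | Uint8Array
  | ArrayBuffer
  | Iterable<Uint8Array>
  | AsyncIterable<Uint8Array>;

async function* toAsyncIterableSketch(input: ByteInput): AsyncIterable<Uint8Array> {
  // Concrete byte containers first: a Uint8Array is itself iterable, but it
  // iterates over numbers, so the generic iterable branch would yield bytes
  // one at a time as numbers instead of yielding the array as a chunk.
  if (input instanceof Uint8Array) {
    yield input;
    return;
  }
  if (input instanceof ArrayBuffer) {
    yield new Uint8Array(input);
    return;
  }
  if (Symbol.asyncIterator in input) {
    yield* input as AsyncIterable<Uint8Array>;
    return;
  }
  if (Symbol.iterator in input) {
    yield* input as Iterable<Uint8Array>;
    return;
  }
  throw new TypeError('Unsupported input type');
}
```

A single Uint8Array therefore comes out as exactly one chunk, and an array of chunks comes out chunk by chunk.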
- packages/tar-xz/README.md: full v6 rewrite with unified Node/Browser
examples, file helpers ('tar-xz/file'), streaming patterns (HTTP, hash
during creation, fetch+extract), API reference, v5->v6 migration guide
- README.md (root): tar-xz section updated for v6 unified API
- packages/tar-xz/demo/main.ts: rewired to v6 (create/extract/list,
AsyncIterable consumption, Blob collection)
Demo build has a pre-existing Vite alias issue (node-liblzma/wasm
subpath resolution) — separate follow-up.
The previous record-style alias replaced 'node-liblzma' as a prefix, producing 'lib/lzma.browser.js/wasm' for subpath imports — not a valid path. Switching to array form lets us register the more specific /wasm subpath first.
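The ordering problem can be reproduced with a simplified model of string-alias matching. This is not Vite's actual implementation: resolveAlias is a hypothetical function that assumes an entry matches an exact import or a prefix followed by '/', and that the first matching entry wins. The 'lib/wasm/index.js' replacement path is also made up for illustration.

```typescript
interface AliasEntry {
  find: string;
  replacement: string;
}

// Simplified model: exact match or "prefix + '/'" match, first entry wins.
function resolveAlias(importee: string, entries: AliasEntry[]): string {
  for (const { find, replacement } of entries) {
    if (importee === find || importee.startsWith(find + '/')) {
      // Only the matched prefix is replaced; the rest of the path is kept.
      return replacement + importee.slice(find.length);
    }
  }
  return importee;
}

// Prefix-only alias: the subpath import is rewritten into an invalid path.
const prefixOnly: AliasEntry[] = [
  { find: 'node-liblzma', replacement: 'lib/lzma.browser.js' },
];

// Array form with the more specific subpath registered first.
const specificFirst: AliasEntry[] = [
  { find: 'node-liblzma/wasm', replacement: 'lib/wasm/index.js' }, // hypothetical path
  { find: 'node-liblzma', replacement: 'lib/lzma.browser.js' },
];

const broken = resolveAlias('node-liblzma/wasm', prefixOnly);
const fixed = resolveAlias('node-liblzma/wasm', specificFirst);
```

Here broken reproduces the 'lib/lzma.browser.js/wasm' path from the description, while fixed resolves to the dedicated subpath entry.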
Pull request overview
Major v6 redesign of tar-xz to provide a universal, stream-first API (Node + Browser) centered on AsyncIterable<Uint8Array>, plus Node-only filesystem convenience helpers via tar-xz/file. Updates nxz-cli to use the new API and refreshes docs/demo/tests accordingly.
Changes:
- Introduce v6 create()/extract()/list() APIs returning/accepting async-iterable stream-shaped data across Node and browser entry points.
- Add Node-only disk I/O wrappers (createFile, extractFile, listFile) under the tar-xz/file subpath export.
- Update tests, demo, and READMEs for the new API and add v5→v6 migration guidance; update nxz-cli to use tar-xz/file.
Reviewed changes
Copilot reviewed 23 out of 23 changed files in this pull request and generated 11 comments.
| File | Description |
|---|---|
| packages/tar-xz/test/node-api.spec.ts | Updates Node tests to use tar-xz/file helpers and adds in-memory streaming tests. |
| packages/tar-xz/test/coverage.spec.ts | Refreshes coverage/integration tests for v6 APIs and adds helper to collect streamed extract results. |
| packages/tar-xz/src/types.ts | Introduces v6 universal types (TarInput, TarSourceFile, TarEntryWithData streaming shape). |
| packages/tar-xz/src/node/list.ts | Reworks Node listing to accept stream-shaped input and yield entries via async-iterable API. |
| packages/tar-xz/src/node/index.ts | Updates Node exports for v6 (removes extractToMemory, exports TarInputNode). |
| packages/tar-xz/src/node/file.ts | Adds Node-only filesystem convenience wrappers for create/extract/list. |
| packages/tar-xz/src/node/extract.ts | Reworks Node extraction to async-iterable entries with bytes()/text() helpers. |
| packages/tar-xz/src/node/create.ts | Reworks Node creation to async-iterable compressed output and v6 TarSourceFile inputs. |
| packages/tar-xz/src/internal/to-async-iterable.ts | Adds Node input normalization to AsyncIterable<Uint8Array> (supports Node + Web streams). |
| packages/tar-xz/src/internal/to-async-iterable.browser.ts | Adds browser input normalization to AsyncIterable<Uint8Array> for Web Streams and buffers. |
| packages/tar-xz/src/index.ts | Updates main Node entrypoint exports for v6 and re-exports new types. |
| packages/tar-xz/src/index.browser.ts | Updates browser entrypoint to export v6 create/extract/list and new types. |
| packages/tar-xz/src/browser/list.ts | Reworks browser listing to async-iterable input + WASM decompression path. |
| packages/tar-xz/src/browser/index.ts | Updates browser exports to v6 function names. |
| packages/tar-xz/src/browser/extract.ts | Reworks browser extraction to async-iterable TarEntryWithData output. |
| packages/tar-xz/src/browser/create.ts | Reworks browser creation to v6 inputs and async-iterable compressed output. |
| packages/tar-xz/package.json | Bumps tar-xz to 6.0.0 and adds ./file subpath export. |
| packages/tar-xz/demo/main.ts | Updates demo UI to use v6 async-iterable APIs and new TarSourceFile shape. |
| packages/tar-xz/README.md | Rewrites docs for v6 API, adds file-helper docs and v5→v6 migration guide. |
| packages/nxz/src/nxz.ts | Switches CLI tar handling to tar-xz/file helpers and updates verbose behavior. |
| packages/nxz/package.json | Bumps nxz-cli to 6.0.0. |
| README.md | Updates repo-level docs/examples to reflect tar-xz v6 API and tar-xz/file. |
| const cwd = resolve(options.cwd ?? process.cwd()); | ||
| const archiveStream = createReadStream(archivePath); | ||
| | ||
| for await (const entry of extract(archiveStream, options)) { | ||
| const target = resolve(cwd, entry.name); | ||
| const normalized = normalize(target); | ||
| | ||
| // Path safety: prevent directory traversal | ||
| if (!normalized.startsWith(cwd + '/') && normalized !== cwd) { | ||
| throw new Error(`Refusing to extract entry outside cwd: ${entry.name}`); | ||
| } |
| if (entry.type === TarEntryType.HARDLINK) { | ||
| await mkdir(dirname(target), { recursive: true }); | ||
| const linkSource = resolve(cwd, entry.linkname); | ||
| // Remove existing file if present (allow re-extract) | ||
| try { | ||
| await unlink(target); | ||
| } catch { | ||
| // Ignore | ||
| } | ||
| await link(linkSource, target); | ||
| continue; |
| // Build archive with explicit symlink entry | ||
| await createFile(archive, { | ||
| files: [ | ||
| { name: 'target.txt', source: Buffer.from('content') }, | ||
| { | ||
| name: 'link.txt', | ||
| source: new Uint8Array(0), | ||
| mode: TarEntryType.SYMLINK as unknown as number, | ||
| }, | ||
| ], | ||
| }); |
| @@ -83,14 +147,27 @@ class TarList extends Writable { | |||
| * } | |||
| * ``` | |||
| */ | |||
| - **Stream-first** — all functions return `AsyncIterable<…>`; no whole-file buffering required | ||
| - **Flexible input** — `extract()` and `list()` accept `AsyncIterable`, `Uint8Array`, | ||
| `ArrayBuffer`, Web `ReadableStream`, or Node `ReadableStream` |
| if (entry.type === TarEntryType.SYMLINK) { | ||
| await mkdir(dirname(target), { recursive: true }); | ||
| // Remove existing symlink if present (allow re-extract) | ||
| try { | ||
| await unlink(target); | ||
| } catch { | ||
| // Ignore — file may not exist | ||
| } | ||
| await symlink(entry.linkname, target); | ||
| continue; | ||
| } |
| /** Helper: collect an extract() async iterable to memory entries (using ReadableStream) */ | ||
| async function collectExtract( | ||
| archive: string, | ||
| options: { strip?: number; filter?: (e: TarEntry) => boolean } = {} | ||
| ): Promise<Array<{ name: string; type: string; content: Buffer; linkname: string }>> { | ||
| const results: Array<{ name: string; type: string; content: Buffer; linkname: string }> = []; | ||
| for await (const entry of extract(createReadStream(archive), options)) { | ||
| const bytes = await entry.bytes(); |
| // extractFile with no options — extracts to cwd (tempDir not applicable here, but API works) | ||
| // Just verify it doesn't throw when called with default options shape | ||
| await expect(listFile(archivePath)).resolves.toHaveLength(1); |
| |----------|-----------|---------| | ||
| | `create` | `(options: CreateOptions) => AsyncIterable<Uint8Array>` | Compressed archive chunks | | ||
| | `extract` | `(input: TarInput, options?: ExtractOptions) => AsyncIterable<TarEntryWithData>` | Entries with data | | ||
| | `list` | `(input: TarInput, options?: ExtractOptions) => AsyncIterable<TarEntry>` | Metadata only | |
| const mode = file.mode ?? 0o644; | ||
| | ||
| const isDir = name.endsWith('/') && size === 0; | ||
| const type: TarEntryTypeValue = isDir ? TarEntryType.DIRECTORY : TarEntryType.FILE; |
Copilot round 2:
- S1/S2 path traversal guard: cross-platform via path.relative + isAbsolute
- S3 symlink TOCTOU: hasSymlinkAncestor check before file write
- M1-M5: dir mode 0o755 default (Node + browser), README claims, test cast
- L1-L3: outdated JSDoc + comment cleanup
Senior-review caught 2 missed bugs:
- F-001: rel.startsWith('..') falsely rejects '..gitignore' etc.
Fixed: use rel === '..' || rel.startsWith('..' + sep) || isAbsolute(rel)
- F-002: hasSymlinkAncestor check was FILE-only — DIRECTORY/SYMLINK/
HARDLINK branches all called mkdir/link/symlink without ancestor check,
allowing escape via 'link → ../external' + 'link/subdir/'.
Extracted ensureSafeTarget(target, cwd, name) helper called for ALL
entry types before any fs operation.
- F-003: regression tests added for DIRECTORY/HARDLINK/SYMLINK-through-
symlink. Sanity-checked: temporary F-002 revert makes them fail.
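The F-001 predicate can be sketched as a standalone function. escapesCwd is a hypothetical name, not the helper in node/file.ts; the body only combines the three conditions quoted in the fix with Node's path.resolve and path.relative.

```typescript
import { isAbsolute, relative, resolve, sep } from 'node:path';

// Hypothetical helper: true when entryName would resolve outside cwd.
function escapesCwd(cwd: string, entryName: string): boolean {
  const root = resolve(cwd);
  const target = resolve(root, entryName);
  const rel = relative(root, target);
  // rel === '..'              : target is the parent directory of cwd
  // rel.startsWith('..' + sep): target sits somewhere above cwd
  // isAbsolute(rel)           : no relative route exists (e.g. another Windows drive)
  // A bare rel.startsWith('..') would also reject names such as '..gitignore'.
  return rel === '..' || rel.startsWith('..' + sep) || isAbsolute(rel);
}
```

Under this check, '..gitignore' and 'a/b.txt' are accepted while '../etc/passwd' and '/etc/passwd' are refused.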
Pull request overview
Major v6 redesign of tar-xz (and nxz-cli) to provide a universal, stream-first API across Node and browsers, centered on AsyncIterable<Uint8Array>, with Node-only filesystem helpers exposed via tar-xz/file.
Changes:
- Replaces v5 Node + browser APIs with unified create(), extract(), list() AsyncIterable-based interfaces.
- Adds Node-only file convenience wrappers createFile, extractFile, listFile under the tar-xz/file subpath export.
- Updates tests, demo, and documentation (including a v5→v6 migration guide) and rewires nxz-cli to the new APIs.
Reviewed changes
Copilot reviewed 23 out of 23 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| packages/tar-xz/test/node-api.spec.ts | Updates Node tests to use tar-xz/file helpers and adds new security/behavior cases. |
| packages/tar-xz/test/coverage.spec.ts | Adjusts coverage/integration tests for v6 API + adds additional security regression scenarios. |
| packages/tar-xz/src/types.ts | Introduces v6 unified types (TarInput, TarSourceFile, streaming TarEntryWithData). |
| packages/tar-xz/src/node/list.ts | Implements v6 list(input) as an async generator over TAR metadata. |
| packages/tar-xz/src/node/index.ts | Updates Node entry exports (removes extractToMemory, adds TarInputNode type export). |
| packages/tar-xz/src/node/file.ts | Adds Node-only filesystem helper API (createFile/extractFile/listFile) + path safety checks. |
| packages/tar-xz/src/node/extract.ts | Implements v6 extract(input, options) async generator yielding TarEntryWithData. |
| packages/tar-xz/src/node/create.ts | Implements v6 create(options) as an async generator of compressed chunks. |
| packages/tar-xz/src/internal/to-async-iterable.ts | Adds Node input normalization helper to accept multiple “stream-shaped” inputs. |
| packages/tar-xz/src/internal/to-async-iterable.browser.ts | Adds browser input normalization helper. |
| packages/tar-xz/src/index.ts | Updates package root Node entry exports for v6. |
| packages/tar-xz/src/index.browser.ts | Updates package root browser entry exports for v6. |
| packages/tar-xz/src/browser/list.ts | Implements browser v6 list(input) with WASM decompression. |
| packages/tar-xz/src/browser/index.ts | Updates browser entry exports to v6 create/extract/list. |
| packages/tar-xz/src/browser/extract.ts | Implements browser v6 extract(input, options) async generator + TarEntryWithData. |
| packages/tar-xz/src/browser/create.ts | Implements browser v6 create(options) async generator + unified TarSourceFile. |
| packages/tar-xz/package.json | Bumps to 6.0.0 and adds ./file subpath export. |
| packages/tar-xz/demo/vite.config.ts | Fixes Vite alias resolution ordering for node-liblzma/wasm. |
| packages/tar-xz/demo/main.ts | Updates demo to v6 AsyncIterable API and new types. |
| packages/tar-xz/README.md | Rewrites docs for v6 API, adds file helpers docs and migration guide. |
| packages/nxz/src/nxz.ts | Switches CLI tar handling to tar-xz/file APIs and removes optional dynamic import. |
| packages/nxz/package.json | Bumps nxz-cli to 6.0.0. |
| README.md | Updates top-level repo docs/examples to reflect tar-xz v6 API + file helpers. |
| } catch { | ||
| // Directory doesn't exist yet — no symlink risk. | ||
| return false; |
| if (typeof TextDecoder !== 'undefined') { | ||
| return new TextDecoder(encoding ?? 'utf-8').decode(bytes); | ||
| } | ||
| return Buffer.from(bytes).toString((encoding ?? 'utf-8') as BufferEncoding); |
R3-1 (M/security): hasSymlinkAncestor stopped walking on ENOENT, missing the case where 'link → ../external' exists but the intermediate path (e.g. 'link/subdir') is not yet on disk. mkdir(recursive) would then follow the symlink and write outside cwd. Fix: ENOENT continues the walk; only non-ENOENT errors re-throw. Regression test added and sanity-checked: it fails on revert and passes on fix.
R3-2 (M/correctness): browser collectText() had a Buffer.from() fallback that throws a ReferenceError in modern bundlers without a Buffer polyfill. Dropped the fallback; TextDecoder is universally available in target environments.
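The R3-1 walk can be illustrated with a synchronous sketch. It is not the code in node/file.ts: hasSymlinkAncestorSketch is a hypothetical name, it uses lstatSync for brevity where the real helper is presumably async, and it assumes the target has already passed the cwd containment check. The demo at the bottom builds a throwaway directory and needs a platform that permits symlink creation.

```typescript
import { lstatSync, mkdirSync, mkdtempSync, symlinkSync } from 'node:fs';
import { tmpdir } from 'node:os';
import { dirname, join, resolve } from 'node:path';

// Walk from the target's parent up to cwd, looking for a symlinked directory.
function hasSymlinkAncestorSketch(target: string, cwd: string): boolean {
  const root = resolve(cwd);
  let current = dirname(resolve(target));
  while (current !== root) {
    try {
      if (lstatSync(current).isSymbolicLink()) return true;
    } catch (err) {
      // ENOENT only means this level is not on disk yet; a symlink may still
      // sit higher up, so keep walking. Any other error is re-thrown.
      if ((err as NodeJS.ErrnoException).code !== 'ENOENT') throw err;
    }
    const parent = dirname(current);
    if (parent === current) break; // filesystem root reached; target was not under cwd
    current = parent;
  }
  return false;
}

// Demo: cwd/link points outside cwd, and cwd/link/subdir does not exist yet.
const sandbox = mkdtempSync(join(tmpdir(), 'symlink-sketch-'));
const cwd = join(sandbox, 'cwd');
const external = join(sandbox, 'external');
mkdirSync(cwd);
mkdirSync(external);
symlinkSync(external, join(cwd, 'link'));

const throughLink = hasSymlinkAncestorSketch(join(cwd, 'link', 'subdir', 'file.txt'), cwd);
const plainPath = hasSymlinkAncestorSketch(join(cwd, 'docs', 'file.txt'), cwd);
```

Returning false at the first ENOENT would miss the cwd/link case, because the missing 'subdir' level is checked before the symlink itself.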
Pull request overview
Breaking v6 redesign of tar-xz to provide a universal, stream-first API (AsyncIterable<Uint8Array>) that works consistently in both Node.js and browsers, with Node-only disk I/O helpers exposed via tar-xz/file. Updates nxz-cli and documentation/tests to match the new API.
Changes:
- Introduce unified create(), extract(), list() AsyncIterable-based APIs for Node + browser, plus shared input normalization (TarInput).
- Add Node-only filesystem convenience helpers (createFile, extractFile, listFile) via the tar-xz/file export.
- Update tests, demo, and READMEs; rewire nxz-cli to use tar-xz/file; bump package versions to 6.0.0.
Reviewed changes
Copilot reviewed 23 out of 23 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| packages/tar-xz/test/node-api.spec.ts | Updates Node tests to exercise v6 file helpers and in-memory streaming usage. |
| packages/tar-xz/test/coverage.spec.ts | Updates coverage suite for v6 APIs; adds several security/TOCTOU regression tests and in-memory extract helper. |
| packages/tar-xz/src/types.ts | Redefines public types around universal stream-first API (TarInput, TarSourceFile, TarEntryWithData). |
| packages/tar-xz/src/node/list.ts | Implements Node list(input) as AsyncIterable<TarEntry> (currently buffers/decompresses internally). |
| packages/tar-xz/src/node/index.ts | Adjusts Node entry exports to v6 (extractToMemory removed; exports TarInputNode). |
| packages/tar-xz/src/node/file.ts | Adds Node-only disk I/O wrappers (tar-xz/file) with traversal + symlink-ancestor guards. |
| packages/tar-xz/src/node/extract.ts | Implements Node extract(input) as AsyncIterable<TarEntryWithData> (currently buffers/decompresses internally). |
| packages/tar-xz/src/node/create.ts | Implements Node create(options) yielding compressed chunks; builds TAR blocks and streams through XZ. |
| packages/tar-xz/src/internal/to-async-iterable.ts | Adds Node-side TarInputNode normalization to AsyncIterable<Uint8Array>. |
| packages/tar-xz/src/internal/to-async-iterable.browser.ts | Adds browser-side TarInput normalization to AsyncIterable<Uint8Array>. |
| packages/tar-xz/src/index.ts | Updates Node package entrypoint exports/types for v6 and removes v5-only exports. |
| packages/tar-xz/src/index.browser.ts | Updates browser entrypoint to export v6 create/extract/list and shared types. |
| packages/tar-xz/src/browser/list.ts | Implements browser list(input) as AsyncIterable<TarEntry> using WASM subpath. |
| packages/tar-xz/src/browser/index.ts | Renames browser exports to v6 (create/extract/list). |
| packages/tar-xz/src/browser/extract.ts | Implements browser extract(input) yielding TarEntryWithData with bytes()/text() helpers. |
| packages/tar-xz/src/browser/create.ts | Implements browser create(options) as AsyncIterable<Uint8Array> (currently yields one compressed chunk). |
| packages/tar-xz/package.json | Bumps tar-xz to 6.0.0 and adds ./file export for Node-only helpers. |
| packages/tar-xz/demo/vite.config.ts | Fixes Vite aliasing for node-liblzma/wasm subpath resolution order. |
| packages/tar-xz/demo/main.ts | Updates demo to consume v6 AsyncIterable APIs and new source file shape. |
| packages/tar-xz/README.md | Rewrites docs for v6 API, adds migration guide, and documents tar-xz/file. |
| packages/nxz/src/nxz.ts | Rewires CLI tar integration to use tar-xz/file helpers (no more dynamic import). |
| packages/nxz/package.json | Bumps nxz-cli to 6.0.0. |
| README.md | Updates monorepo README examples/docs to reflect tar-xz v6 and the new API patterns. |
| // Ensure directory is traversable: always set execute bits (x) for user/group/other. | ||
| // A directory with mode 0o644 (no execute) cannot be descended into. | ||
| const dirMode = (entry.mode || 0o755) | 0o111; | ||
| await mkdir(target, { recursive: true, mode: dirMode }); |
| | Preset | WASM Memory | Speed | Ratio | Recommendation | | ||
| |--------|------------|-------|-------|----------------| | ||
| | 1 | ~10 MB | Fastest | Lowest | Batch of many small files | | ||
| | 3 | ~20 MB | Fast | Good | Browser default | | ||
| | 6 | ~100 MB | Medium | Very good | Node default | |
| async function collectAllChunks(input: TarInputNode): Promise<Uint8Array> { | ||
| const iterable = toAsyncIterable(input); | ||
| const chunks: Uint8Array[] = []; | ||
| for await (const chunk of iterable) { | ||
| chunks.push(chunk); | ||
| } | ||
| const total = chunks.reduce((n, c) => n + c.length, 0); | ||
| const out = new Uint8Array(total); | ||
| let offset = 0; | ||
| for (const chunk of chunks) { | ||
| out.set(chunk, offset); | ||
| offset += chunk.length; | ||
| } | ||
| return out; | ||
| } | ||
| | ||
| async function decompressXz(data: Uint8Array): Promise<Uint8Array> { | ||
| const unxzStream = createUnxz(); | ||
| const readable = Readable.from( | ||
| (async function* () { | ||
| yield data; | ||
| })() | ||
| ); | ||
| | ||
| const output: Uint8Array[] = []; | ||
| let resolveFlush!: () => void; | ||
| let rejectFlush!: (e: unknown) => void; | ||
| const done = new Promise<void>((res, rej) => { | ||
| resolveFlush = res; | ||
| rejectFlush = rej; | ||
| }); | ||
| | ||
| unxzStream.on('data', (...args: unknown[]) => { | ||
| const chunk = args[0] as Buffer; | ||
| output.push(new Uint8Array(chunk.buffer, chunk.byteOffset, chunk.byteLength)); | ||
| }); | ||
| unxzStream.on('end', resolveFlush); | ||
| unxzStream.on('error', rejectFlush); | ||
| readable.pipe(unxzStream); | ||
| | ||
| await done; | ||
| | ||
| const total = output.reduce((n, c) => n + c.length, 0); | ||
| const result = new Uint8Array(total); | ||
| let offset = 0; | ||
| for (const chunk of output) { | ||
| result.set(chunk, offset); | ||
| offset += chunk.length; | ||
| } | ||
| return result; | ||
| } | ||
| | ||
| async function runWritable(writable: Writable, data: Uint8Array): Promise<void> { | ||
| await new Promise<void>((resolve, reject) => { | ||
| writable.on('finish', resolve); | ||
| writable.on('error', reject); | ||
| writable.write(Buffer.from(data.buffer, data.byteOffset, data.byteLength)); | ||
| writable.end(); | ||
| }); |
| if (entry.type === TarEntryType.HARDLINK) { | ||
| // S2: validate linkname — it must not escape cwd (absolute paths or ".." segments). | ||
| const linkSource = resolve(cwd, entry.linkname); | ||
| const linkRel = relative(cwd, linkSource); | ||
| if (linkRel === '..' || linkRel.startsWith('..' + sep) || isAbsolute(linkRel)) { | ||
| throw new Error(`Refusing hardlink outside cwd: ${entry.linkname}`); | ||
| } |
R4-1 (M/correctness): mode defaults used '||', which treats mode 0 (a legitimate TAR entry mode) as falsy and silently replaces it with the default. Use '??' so only undefined/null triggers the default. 3 call sites in node/file.ts; node/create.ts and browser/create.ts already used '??'.
R4-2 (M/correctness): extractFile applied 'strip' to entry.name but not to entry.linkname for HARDLINK entries. With strip:1, archive 'dir/link → dir/a.txt' would fail because the 'dir/a.txt' linkname still includes the stripped 'dir/' prefix. Apply strip to linkname before resolve() and re-run the path safety check. Regression test added.
R4-3 (L/doc): README claimed preset 3 was 'Browser default' but the actual default is 6 (uniform across Node and Browser). Doc updated.
R4-4 (L/dry): collectAllChunks/decompressXz/runWritable were duplicated between list.ts and extract.ts. Extracted to internal node/xz-helpers.ts to prevent drift.
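R4-1 and R4-2 both come down to small expressions, sketched below. The function names (modeWithOr, modeWithNullish, stripComponents) are hypothetical and only mirror the behaviour described above; stripComponents in particular is an assumed implementation of 'strip', not the package's own.

```typescript
const DEFAULT_FILE_MODE = 0o644;

// '||' falls back on every falsy value, so a legitimate mode of 0 is lost.
function modeWithOr(mode: number | undefined): number {
  return mode || DEFAULT_FILE_MODE;
}

// '??' falls back only on undefined/null, so mode 0 is preserved.
function modeWithNullish(mode: number | undefined): number {
  return mode ?? DEFAULT_FILE_MODE;
}

// Hypothetical strip helper: drops the first `strip` path components.
function stripComponents(path: string, strip: number): string {
  return path
    .split('/')
    .filter((part) => part !== '')
    .slice(strip)
    .join('/');
}

// With strip = 1, both the entry name and its hardlink target must lose the
// 'dir/' prefix, otherwise the link would point at a path that was never written.
const strippedName = stripComponents('dir/link', 1);
const strippedLinkname = stripComponents('dir/a.txt', 1);
```

After stripping, the hardlink 'link' points at 'a.txt', which is where the stripped file entry ends up on disk.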
Pull request overview
Major v6 breaking redesign of tar-xz to provide a universal, stream-first API across Node.js and browsers, centered on AsyncIterable<Uint8Array>, plus a Node-only tar-xz/file subpath for filesystem I/O. nxz-cli is updated to consume the new file helpers and v6 package versions.
Changes:
- Replace v5 buffer-/path-centric APIs with universal create()/extract()/list() AsyncIterable-based APIs for Node + Browser.
- Add Node-only file helper module (tar-xz/file) implementing createFile/extractFile/listFile with path-safety/TOCTOU guards.
- Update tests, docs, demo, and nxz-cli wiring to use the new v6 APIs and exports.
Reviewed changes
Copilot reviewed 24 out of 24 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| packages/tar-xz/test/node-api.spec.ts | Updates Node API tests to exercise tar-xz/file helpers and in-memory stream piping. |
| packages/tar-xz/test/coverage.spec.ts | Refactors coverage tests to use new stream API + file helpers; adds security regression cases. |
| packages/tar-xz/src/types.ts | Unifies Node/Browser types; introduces TarInput, TarSourceFile, and streaming TarEntryWithData. |
| packages/tar-xz/src/node/xz-helpers.ts | Adds shared Node helpers for collecting input, XZ decompression, and driving writables. |
| packages/tar-xz/src/node/list.ts | Reworks Node list to accept stream-shaped input and yield entries via AsyncIterable. |
| packages/tar-xz/src/node/index.ts | Updates Node entrypoint exports for v6 (extractToMemory removed; exports TarInputNode). |
| packages/tar-xz/src/node/file.ts | Introduces Node-only filesystem convenience API (createFile/extractFile/listFile) with safety checks. |
| packages/tar-xz/src/node/extract.ts | Reworks Node extract into AsyncIterable yielding TarEntryWithData (buffering implementation). |
| packages/tar-xz/src/node/create.ts | Reworks Node create into AsyncIterable of compressed chunks; adds TarSourceFile support. |
| packages/tar-xz/src/internal/to-async-iterable.ts | Adds Node normalizer for accepted tar inputs into AsyncIterable<Uint8Array>. |
| packages/tar-xz/src/internal/to-async-iterable.browser.ts | Adds browser normalizer for accepted tar inputs into AsyncIterable<Uint8Array>. |
| packages/tar-xz/src/index.ts | Updates main (Node) package entry exports for v6 and re-exports new types. |
| packages/tar-xz/src/index.browser.ts | Updates browser package entry exports to v6 function names and unified types. |
| packages/tar-xz/src/browser/list.ts | Reworks browser list to accept stream-shaped input and yield entries via AsyncIterable. |
| packages/tar-xz/src/browser/index.ts | Updates browser internal entrypoint exports to v6 names. |
| packages/tar-xz/src/browser/extract.ts | Reworks browser extract to AsyncIterable yielding TarEntryWithData. |
| packages/tar-xz/src/browser/create.ts | Reworks browser create to AsyncIterable; enforces no-fs-path sources in browser. |
| packages/tar-xz/package.json | Bumps tar-xz to 6.0.0 and adds ./file subpath export. |
| packages/tar-xz/demo/vite.config.ts | Fixes Vite aliasing for node-liblzma/wasm subpath resolution. |
| packages/tar-xz/demo/main.ts | Updates demo to new v6 async-iterable APIs and TarSourceFile usage. |
| packages/tar-xz/README.md | Rewrites docs for v6 universal stream-first API + migration guide and file helpers. |
| packages/nxz/src/nxz.ts | Updates CLI tar commands to use tar-xz/file helpers and new behavior. |
| packages/nxz/package.json | Bumps nxz-cli to 6.0.0. |
| README.md | Updates monorepo README to describe tar-xz v6 and show new usage patterns. |
| // S2: validate linkname — it must not escape cwd (absolute paths or ".." segments). | ||
| const linkSource = resolve(cwd, strippedLinkname); | ||
| const linkRel = relative(cwd, linkSource); | ||
| if (linkRel === '..' || linkRel.startsWith('..' + sep) || isAbsolute(linkRel)) { | ||
| throw new Error(`Refusing hardlink outside cwd: ${entry.linkname}`); | ||
| } | ||
| await mkdir(dirname(target), { recursive: true }); | ||
| // Remove existing file if present (allow re-extract) | ||
| try { | ||
| await unlink(target); | ||
| } catch { | ||
| // Ignore | ||
| } | ||
| await link(linkSource, target); | ||
| continue; |
| @@ -116,15 +143,6 @@ class TarUnpack extends Writable { | |||
| * Validate that a path doesn't escape the target directory (path traversal protection) | |||
| * @throws Error if path traversal is detected | |||
| */ | |||
| function validatePath(destPath: string, cwd: string): void { | |||
| const resolvedDest = path.resolve(destPath); | |||
| const resolvedCwd = path.resolve(cwd); | |||
| | |||
| // Ensure the destination path starts with the cwd (no escape via ../) | |||
| if (!resolvedDest.startsWith(resolvedCwd + path.sep) && resolvedDest !== resolvedCwd) { | |||
| throw new Error(`Path traversal detected: ${destPath} escapes ${cwd}`); | |||
| } | |||
| } | |||
| | |||
| /** | |||
| * Extract a tar.xz archive | |||
| @@ -141,114 +159,53 @@ function validatePath(destPath: string, cwd: string): void { | |||
| * }); | |||
| * ``` | |||
| */ | |||
| export async function extract(options: ExtractOptions): Promise<TarEntry[]> { | |||
| const { file, cwd = process.cwd(), strip = 0, filter, preserveOwner = false } = options; | |||
| | |||
| // Create decompression and unpacking streams | |||
| const inputStream = createReadStream(file); | |||
| const unxzStream = createUnxz(); | |||
| const tarUnpack = new TarUnpack(); | |||
| /** | |||
| @@ -90,14 +91,38 @@ function listTarEntries(data: Uint8Array): TarEntry[] { | |||
| * } | |||
| * ``` | |||
| */ | |||
| @@ -100,43 +102,85 @@ function parseTar(data: Uint8Array): Array<TarEntry & { data: Uint8Array }> { | |||
| * } | |||
| * ``` | |||
| */ | |||
Round 5 findings:
- R5-1 (M/security): hardlink linkSource not validated against symlinks. Attack: archive plants 's → /etc/passwd' (symlink), then 'link → s' (hardlink with linkname='s'). The previous linkRel check passes (s resolves inside cwd), but link(cwd/s, ...) follows the kernel symlink and creates a hardlink to /etc/passwd. Fix: lstat(linkSource) + hasSymlinkAncestor(linkSource) before link(). Regression test added.
- R5-2/3/4 (L/doc): stale v5 JSDoc removed from node/extract.ts, browser/extract.ts, browser/list.ts.
Senior-review hardening (folded in):
- L-1: regression test now also asserts external/secret nlink === 1 (proves no successful hardlink, even if dest cleaned up).
- L-2: lstat error handling tightened: only swallow ENOENT, rethrow EACCES/ELOOP/etc. (consistency with hasSymlinkAncestor pattern).
Pull request overview
Breaking v6 redesign of tar-xz to provide a universal, stream-first API (Node + Browser) centered on AsyncIterable<Uint8Array>, with Node-only filesystem helpers exposed via tar-xz/file, and corresponding updates to nxz-cli, tests, demo, and documentation.
Changes:
- Replace v5 Node/browser split APIs with unified create()/extract()/list() AsyncIterable-based interfaces and new shared types.
- Add Node-only tar-xz/file helpers (createFile, extractFile, listFile) and refactor nxz-cli to use them.
- Update tests, demo, and READMEs to match the new v6 API + migration guidance.
Reviewed changes
Copilot reviewed 24 out of 24 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| packages/tar-xz/test/node-api.spec.ts | Updates Node tests to exercise the new tar-xz/file API and in-memory streaming flows. |
| packages/tar-xz/test/coverage.spec.ts | Updates coverage tests for v6 behavior; adds extensive security regression coverage around extraction. |
| packages/tar-xz/src/types.ts | Introduces v6 universal types (TarInput, TarSourceFile, streaming TarEntryWithData). |
| packages/tar-xz/src/node/xz-helpers.ts | Adds shared Node XZ buffering/decompression helpers for list/extract. |
| packages/tar-xz/src/node/list.ts | Refactors Node list to AsyncIterable<TarEntry> API and uses shared XZ helpers. |
| packages/tar-xz/src/node/index.ts | Updates Node entry exports for v6 (drops extractToMemory, exports TarInputNode). |
| packages/tar-xz/src/node/file.ts | Adds Node-only disk I/O wrappers (createFile/extractFile/listFile) with path-safety checks. |
| packages/tar-xz/src/node/extract.ts | Refactors Node extract to AsyncIterable<TarEntryWithData> and removes extractToMemory. |
| packages/tar-xz/src/node/create.ts | Refactors Node create to return compressed chunks as an AsyncIterable<Uint8Array>. |
| packages/tar-xz/src/internal/to-async-iterable.ts | Adds Node input normalization (TarInputNode → AsyncIterable<Uint8Array>). |
| packages/tar-xz/src/internal/to-async-iterable.browser.ts | Adds browser input normalization (TarInput → AsyncIterable<Uint8Array>). |
| packages/tar-xz/src/index.ts | Updates Node main entry exports/types for v6. |
| packages/tar-xz/src/index.browser.ts | Updates browser entry to export create/extract/list and v6 types. |
| packages/tar-xz/src/browser/list.ts | Refactors browser list to the v6 AsyncIterable API and WASM import path. |
| packages/tar-xz/src/browser/index.ts | Updates browser sub-entry exports to v6 names. |
| packages/tar-xz/src/browser/extract.ts | Refactors browser extract to AsyncIterable<TarEntryWithData> and WASM import path. |
| packages/tar-xz/src/browser/create.ts | Refactors browser create to AsyncIterable output and v6 source types; enforces no-fs-path in browser. |
| packages/tar-xz/package.json | Bumps tar-xz to 6.0.0 and adds the ./file subpath export. |
| packages/tar-xz/demo/vite.config.ts | Adjusts Vite aliasing to correctly resolve node-liblzma/wasm vs bare node-liblzma. |
| packages/tar-xz/demo/main.ts | Updates demo to the v6 AsyncIterable API (create/extract/list). |
| packages/tar-xz/README.md | Rewrites docs for v6 API + adds migration guide and file-helper documentation. |
| packages/nxz/src/nxz.ts | Refactors CLI to use tar-xz/file helpers and new file-source mapping. |
| packages/nxz/package.json | Bumps nxz-cli to 6.0.0. |
| README.md | Updates root README to reflect tar-xz v6 API and file-helper subpath. |
| await mkdir(dirname(target), { recursive: true }); | ||
| await pipeline( | ||
| Readable.from(entry.data), | ||
| createWriteStream(target, { mode: entry.mode ?? 0o644 }) | ||
| ); |
| * @returns Promise with list of entries | ||
| * Returns an `AsyncIterable<TarEntry>` yielding each entry's metadata. | ||
| * Entry content is skipped — use `extract()` if you need the data. | ||
| * |
| * Extract a tar.xz archive. | ||
| * | ||
| * @param options - Extraction options | ||
| * @returns Promise with list of extracted entries | ||
| * Returns an `AsyncIterable<TarEntryWithData>`. Each yielded entry includes: | ||
| * - Full metadata (`TarEntry` fields) | ||
| * - `data` — `AsyncIterable<Uint8Array>` for the entry's content (consume in order) |
| const chunks: Uint8Array[] = []; | ||
| for await (const chunk of iterable) { | ||
| chunks.push(chunk); | ||
| } | ||
| const total = chunks.reduce((n, c) => n + c.length, 0); |
…k vectors
Stop the round-by-round whack-a-mole. Senior audit identified 18 attack vectors (V1-V18); 7 consolidated fixes close them all in one commit, with regression tests for each.
Fix 1 (V1, V4, R6-1): Leaf symlink check. ensureSafeTarget now lstat's the leaf 'target' itself, not just its ancestors. Catches archives that plant a symlink then overwrite it via FILE/DIRECTORY/HARDLINK. SYMLINK entries skip the check (legitimate re-extract via unlink+symlink).
Fix 2 (V6a, V6b): Reject empty / NUL-bearing names + linknames. New ensureSafeName() helper. Empty strings and embedded NUL bytes are rejected for both entry.name and entry.linkname before any path math runs.
Fix 3 (V6c, V14): Apply strip to SYMLINK linkname. Mirrors the existing HARDLINK strip handling. Consistency + correctness for archives extracted with --strip.
Fix 4 (V12): Strip setuid/setgid/sticky bits by default. SAFE_MODE_MASK = 0o0777 applied to every extracted file/dir mode. Mirrors GNU tar --no-same-permissions default. No --preserve-special opt-in for now (add later if needed).
Fix 5 (V2, V3): fd-based FILE extraction with O_NOFOLLOW. POSIX path uses fs.open(O_WRONLY | O_CREAT | O_TRUNC | O_NOFOLLOW) + handle.write() loop + handle.chmod() + handle.utimes(), all fd-bound, with no path resolution after the open. Eliminates the chmod/utimes TOCTOU window. Windows fallback uses the by-path version (gated by process.platform); leaf check (Fix 1) is the primary defense there.
Fix 6 (V8): Narrow unlink catches to ENOENT. Two pre-clean unlink sites (SYMLINK, HARDLINK) now rethrow non-ENOENT errors instead of swallowing all of them.
Fix 7 (V11): Threat model documentation. extractFile JSDoc now states that concurrent attacker process swaps during extraction are out of scope (POSIX openat2 not exposed by Node).
8 regression tests added covering each vector. Sanity-checked: reverting Fix 1, Fix 3, or Fix 4 individually causes the corresponding test(s) to fail. 95/95 tar-xz tests pass.
Senior-review verdict: Ready, 3 L findings folded (duplicate JSDoc) or deferred (NUL strictness, Windows-mock test) per scope tradeoff.
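As a rough illustration of Fix 4 and Fix 5 together, the sketch below opens the target with O_NOFOLLOW, writes through the file handle, and applies a masked mode on the same handle. The name `writeFileNoFollow` and its signature are hypothetical and not taken from the PR; `handle.utimes()` is omitted for brevity, and the sketch assumes a POSIX platform (on Windows `O_NOFOLLOW` is not defined).

```typescript
import { constants, promises as fsp } from 'node:fs';

// Hypothetical stand-in for the PR's SAFE_MODE_MASK: keep rwx bits only,
// dropping setuid/setgid/sticky.
const SAFE_MODE_MASK = 0o777;

// Illustrative helper (not the PR's actual function).
async function writeFileNoFollow(
  target: string,
  chunks: AsyncIterable<Uint8Array> | Iterable<Uint8Array>,
  mode = 0o644,
): Promise<void> {
  const safeMode = mode & SAFE_MODE_MASK;
  // O_NOFOLLOW makes open() fail if the final path component is a symlink.
  const flags =
    constants.O_WRONLY | constants.O_CREAT | constants.O_TRUNC | constants.O_NOFOLLOW;
  const handle = await fsp.open(target, flags, safeMode);
  try {
    for await (const chunk of chunks) {
      // write() may accept fewer bytes than requested, so loop until done.
      let offset = 0;
      while (offset < chunk.length) {
        const { bytesWritten } = await handle.write(chunk, offset, chunk.length - offset);
        offset += bytesWritten;
      }
    }
    // fd-bound chmod: no second path lookup, so nothing can be swapped in between.
    await handle.chmod(safeMode);
  } finally {
    await handle.close();
  }
}
```

Because every operation after `open()` goes through the handle, a symlink planted at `target` after the open cannot redirect the write or the chmod.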
Pull request overview
Major v6 redesign of tar-xz (and nxz-cli) to provide a universal, stream-first API based on AsyncIterable<Uint8Array>, with Node-only filesystem helpers moved into a dedicated tar-xz/file subpath export.
Changes:
- Replaced v5 APIs with unified `create`/`extract`/`list` AsyncIterable-based APIs for both Node and browser builds.
- Added Node-only file helper wrappers (`createFile`, `extractFile`, `listFile`) plus significant extraction hardening (path traversal + symlink/TOCTOU defenses).
- Updated tests, demo, and documentation (including a v5→v6 migration guide); bumped `tar-xz` and `nxz-cli` to 6.0.0.
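The input normalization behind the unified API can be sketched roughly as follows. This is a minimal illustration, not the package's actual `toAsyncIterable`; the `ByteInput` alias is a hypothetical stand-in for `TarInput`. It shows why the `Uint8Array` check has to come first: a `Uint8Array` is itself iterable, and iterating it yields numbers rather than chunks.

```typescript
// Hypothetical alias mirroring the TarInput shape described in this PR.
type ByteInput =
  | AsyncIterable<Uint8Array>
  | Iterable<Uint8Array>
  | Uint8Array
  | ArrayBuffer
  | ReadableStream<Uint8Array>;

async function* toAsyncIterable(input: ByteInput): AsyncGenerator<Uint8Array> {
  // Must precede any Symbol.iterator dispatch (see note above).
  if (input instanceof Uint8Array) {
    yield input;
    return;
  }
  if (input instanceof ArrayBuffer) {
    yield new Uint8Array(input);
    return;
  }
  if (typeof ReadableStream !== 'undefined' && input instanceof ReadableStream) {
    const reader = input.getReader();
    try {
      let result = await reader.read();
      while (!result.done) {
        yield result.value;
        result = await reader.read();
      }
    } finally {
      reader.releaseLock();
    }
    return;
  }
  // Remaining cases: AsyncIterable or plain Iterable of chunks.
  for await (const chunk of input as AsyncIterable<Uint8Array> | Iterable<Uint8Array>) {
    yield chunk;
  }
}
```

Node's `Readable` streams are async-iterable, so they fall through to the final branch without a dedicated check.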
Reviewed changes
Copilot reviewed 24 out of 24 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| packages/tar-xz/test/node-api.spec.ts | Updates tests to v6 Node file-helper API and adds in-memory streaming tests |
| packages/tar-xz/test/coverage.spec.ts | Expands coverage and adds extensive security regression tests for extractFile |
| packages/tar-xz/src/types.ts | Introduces v6 universal types (TarInput, TarSourceFile, TarEntryWithData streaming helpers) |
| packages/tar-xz/src/node/xz-helpers.ts | Adds shared Node XZ helper utilities for extract/list pipelines |
| packages/tar-xz/src/node/list.ts | Converts Node list() to AsyncIterable API and refactors decompression via helpers |
| packages/tar-xz/src/node/index.ts | Updates Node entry exports (removes extractToMemory, exports TarInputNode type) |
| packages/tar-xz/src/node/file.ts | Adds Node-only disk I/O convenience API + extraction safety hardening |
| packages/tar-xz/src/node/extract.ts | Converts Node extract() to AsyncIterable API |
| packages/tar-xz/src/node/create.ts | Converts Node create() to AsyncIterable output and new TarSourceFile input model |
| packages/tar-xz/src/internal/to-async-iterable.ts | Adds Node TarInput normalization to AsyncIterable |
| packages/tar-xz/src/internal/to-async-iterable.browser.ts | Adds browser TarInput normalization to AsyncIterable |
| packages/tar-xz/src/index.ts | Updates Node package entrypoint exports/types for v6 |
| packages/tar-xz/src/index.browser.ts | Updates browser package entrypoint exports/types for v6 |
| packages/tar-xz/src/browser/list.ts | Converts browser list() to AsyncIterable API and switches to wasm subpath |
| packages/tar-xz/src/browser/index.ts | Updates browser API surface to create/extract/list names |
| packages/tar-xz/src/browser/extract.ts | Converts browser extract() to AsyncIterable API |
| packages/tar-xz/src/browser/create.ts | Converts browser create() to AsyncIterable output and TarSourceFile source model |
| packages/tar-xz/package.json | Bumps to 6.0.0 and adds ./file subpath export |
| packages/tar-xz/demo/vite.config.ts | Fixes Vite aliasing for node-liblzma/wasm subpath resolution |
| packages/tar-xz/demo/main.ts | Updates demo to v6 AsyncIterable APIs and TarSourceFile usage |
| packages/tar-xz/README.md | Rewrites docs for v6 API + migration guide |
| packages/nxz/src/nxz.ts | Rewires CLI tar operations to use tar-xz/file helpers |
| packages/nxz/package.json | Bumps nxz-cli to 6.0.0 |
| README.md | Updates root README to reflect tar-xz v6 redesign and usage |
| function ensureSafeName(s: string | undefined, label: string): void { | ||
| if (s === undefined) return; | ||
| if (s.length === 0) throw new Error(`Refusing entry: empty ${label}`); | ||
| if (s.includes('\x00')) throw new Error(`Refusing entry: ${label} contains NUL byte`); | ||
| } |
| export async function createFile(path: string, options: CreateOptions): Promise<void> { | ||
| await pipeline(Readable.from(create(options)), createWriteStream(path)); |
| "./file": { | ||
| "types": "./lib/node/file.d.ts", | ||
| "import": "./lib/node/file.js", | ||
| "default": "./lib/node/file.js" | ||
| } |
| import { createFile, extractFile, listFile } from '../src/node/file.js'; | ||
| import { TarEntryType } from '../src/types.js'; |
| // Pipe TAR builder → XZ compressor; yield each compressed chunk as it arrives. | ||
| // Node's Readable streams are themselves AsyncIterable, so we can `for await` | ||
| // directly without buffering everything in memory. | ||
| const xzStream = createXz({ preset }); | ||
| const outputFile = await fs.open(file, 'w'); | ||
| const outputStream = outputFile.createWriteStream(); | ||
| // Process files | ||
| const processFiles = async (): Promise<void> => { | ||
| for (const { path: filePath, relativePath, stats } of allFiles) { | ||
| let content: Buffer | null = null; | ||
| let linkTarget: string | undefined; | ||
| if (stats.isSymbolicLink()) { | ||
| linkTarget = await fs.readlink(filePath); | ||
| } else if (stats.isFile()) { | ||
| content = await fs.readFile(filePath); | ||
| } | ||
| await tarPack.addEntry(relativePath, content, stats, linkTarget); | ||
| } | ||
| tarPack.finalize(); | ||
| tarPack.push(null); // End the stream | ||
| }; | ||
| Readable.from(buildTar(files, filter)).pipe(xzStream); | ||
R7-5 (M/correctness): create() used Readable.pipe() which doesn't
propagate source errors — fs.readFile failures inside buildTar would
emit unhandled error events instead of rejecting the iteration.
Switched to pipeline() from node:stream/promises with explicit error
forwarding. Regression test added; sanity-check: reverting to pipe()
causes test to time out.
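The pipe() versus pipeline() difference in R7-5 can be sketched as below. This is an illustrative reduction, not the code in create.ts: `pipeThrough` is a hypothetical name, and a `PassThrough` stands in for the XZ compressor stream.

```typescript
import { PassThrough, Readable, type Transform } from 'node:stream';
import { pipeline } from 'node:stream/promises';

// Hypothetical helper: feed `source` into `transform` and yield its output.
async function* pipeThrough(
  source: AsyncIterable<Uint8Array>,
  transform: Transform,
): AsyncGenerator<Uint8Array> {
  // Unlike source.pipe(transform), pipeline() destroys every stream in the
  // chain when one fails, so a source error ends the loop below with a throw.
  const done = pipeline(Readable.from(source), transform);
  // Mark the promise as handled; failures are observed through the iteration.
  done.catch(() => {});
  try {
    for await (const chunk of transform) {
      yield chunk as Uint8Array;
    }
    await done;
  } finally {
    // Also covers consumers that break out of the loop early.
    transform.destroy();
  }
}

// Usage sketch: PassThrough is only a placeholder for a real compressor.
async function collectAll(source: AsyncIterable<Uint8Array>): Promise<Buffer> {
  const parts: Uint8Array[] = [];
  for await (const chunk of pipeThrough(source, new PassThrough())) parts.push(chunk);
  return Buffer.concat(parts);
}
```

With plain pipe(), the transform would never learn that the source failed, which matches the hang the regression test guards against.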
R7-1 (M/correctness, light): ensureSafeName now rejects dot-segment
placeholder names ('.', './', '..') after separator normalization.
Legitimate dotfiles like '.gitignore' are unaffected — regression
guard test included.
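A minimal sketch of the extended check, built on the ensureSafeName body quoted earlier in this review. The normalization and the dot-segment branch are assumptions about how R7-1 could be implemented, not a copy of the actual code; absolute paths and embedded '..' components are assumed to be handled by other guards.

```typescript
function ensureSafeName(s: string | undefined, label: string): void {
  if (s === undefined) return;
  if (s.length === 0) throw new Error(`Refusing entry: empty ${label}`);
  if (s.includes('\x00')) throw new Error(`Refusing entry: ${label} contains NUL byte`);
  // Assumed normalization: unify separators, then drop trailing slashes so
  // that './' and '.' compare equal.
  const normalized = s.replace(/\\/g, '/').replace(/\/+$/, '');
  if (normalized === '.' || normalized === '..') {
    throw new Error(`Refusing entry: ${label} is a dot-segment placeholder`);
  }
}
```

Only names that consist entirely of a dot segment are rejected, so dotfiles such as '.gitignore' pass through unchanged.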
R7-3 (L/DX): added 'browser: null' to ./file subpath exports so
bundlers fail fast instead of pulling in node:fs silently.
Skipped (deferred to follow-up):
- R7-2: cosmetic param rename (path → archivePath)
- R7-4: public-entry re-export smoke test
Pull request overview
This PR delivers the v6 breaking redesign of tar-xz into a universal, stream-first API built around AsyncIterable<Uint8Array>, with Node-only filesystem helpers moved behind the tar-xz/file subpath export. It also rewires nxz-cli and updates docs/demo/tests to match the new model.
Changes:
- Replace v5 Node/Browser-specific APIs with unified `create()`/`extract()`/`list()` AsyncIterable-based APIs.
- Add Node-only disk I/O wrappers (`createFile`/`extractFile`/`listFile`) under `tar-xz/file` and update consumers/tests accordingly.
- Refresh demo + READMEs and expand Node coverage/security tests for extraction hardening.
Reviewed changes
Copilot reviewed 24 out of 24 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| packages/tar-xz/test/node-api.spec.ts | Updates Node tests to validate the new Node file helpers and in-memory stream API. |
| packages/tar-xz/test/coverage.spec.ts | Updates coverage tests for v6 APIs and adds extensive extractor security regression tests. |
| packages/tar-xz/src/types.ts | Introduces unified v6 public types (TarInput, TarSourceFile, streaming TarEntryWithData). |
| packages/tar-xz/src/node/xz-helpers.ts | Adds shared Node helpers to normalize/buffer input and run XZ decompression pipelines. |
| packages/tar-xz/src/node/list.ts | Reworks Node list() to accept stream-shaped input and yield TarEntry via AsyncIterable. |
| packages/tar-xz/src/node/index.ts | Adjusts Node entry exports for v6 (create, extract, list, and TarInputNode). |
| packages/tar-xz/src/node/file.ts | Adds Node-only file helpers (tar-xz/file) with path-safety and TOCTOU hardening. |
| packages/tar-xz/src/node/extract.ts | Reworks Node extract() to accept stream-shaped input and yield TarEntryWithData. |
| packages/tar-xz/src/node/create.ts | Reworks Node create() to return compressed chunks as AsyncIterable<Uint8Array>. |
| packages/tar-xz/src/internal/to-async-iterable.ts | Adds Node input normalization to AsyncIterable<Uint8Array> (TarInputNode). |
| packages/tar-xz/src/internal/to-async-iterable.browser.ts | Adds browser input normalization to AsyncIterable<Uint8Array>. |
| packages/tar-xz/src/index.ts | Updates top-level Node entrypoint exports/types for v6. |
| packages/tar-xz/src/index.browser.ts | Updates browser entrypoint exports/types for v6. |
| packages/tar-xz/src/browser/list.ts | Reworks browser list() to accept TarInput and yield entries as AsyncIterable. |
| packages/tar-xz/src/browser/index.ts | Updates browser entry exports to v6 names. |
| packages/tar-xz/src/browser/extract.ts | Reworks browser extract() to yield TarEntryWithData with bytes()/text(). |
| packages/tar-xz/src/browser/create.ts | Reworks browser create() to return AsyncIterable<Uint8Array> output chunks. |
| packages/tar-xz/package.json | Bumps tar-xz to 6.0.0 and adds the ./file subpath export (Node-only). |
| packages/tar-xz/demo/vite.config.ts | Fixes Vite alias ordering to correctly resolve node-liblzma/wasm. |
| packages/tar-xz/demo/main.ts | Updates the demo UI to use v6 AsyncIterable APIs and new source types. |
| packages/nxz/src/nxz.ts | Rewires CLI tar operations to use tar-xz/file helpers and v6 behavior. |
| packages/nxz/package.json | Bumps nxz-cli to 6.0.0. |
| packages/tar-xz/README.md | Rewrites package README for v6 API + migration guide + Node/file helper docs. |
| README.md | Updates repo-level README snippets to reflect the v6 API and tar-xz/file helpers. |
| /** | ||
| * Transform stream that packs files into TAR format | ||
| */ | ||
| class TarPack extends Transform { | ||
| constructor() { | ||
| super({ objectMode: false }); | ||
| /** | ||
| * Build a single TAR entry (header + content blocks) into an array of Uint8Array chunks. | ||
| * Does not write to disk; caller decides what to do with the chunks. |
| * Recursively collect all files in a directory | ||
| */ | ||
| /** |
| // R7-5: use pipeline() instead of pipe() so that errors from buildTar | ||
| // (e.g. missing source file) propagate and reject the iteration rather than | ||
| // hanging or emitting an unhandled error event. |
| */ | ||
| /** | ||
| * Validate a tar entry name or linkname for safety. | ||
| * | ||
| * Rejects: | ||
| * - Empty strings (would cause target === cwd or ambiguous hardlink resolution) | ||
| * - Strings containing the NUL byte (U+0000) |
| /** | ||
| * S3 (TOCTOU guard): Check whether any ancestor directory of `filePath` (up to | ||
| * and including `root`) is a symlink. If so, a malicious archive could first | ||
| * plant a symlink pointing outside root, then write a file through it. | ||
| * | ||
| * Returns true if a symlink ancestor is found (caller should reject the entry). | ||
| */ | ||
| async function hasSymlinkAncestor(filePath: string, root: string): Promise<boolean> { | ||
| // Walk each ancestor from filePath up to (but not including) root. | ||
| let dir = dirname(filePath); | ||
| while (dir !== root && dir.length >= root.length) { |
| import { createReadStream } from 'node:fs'; | ||
| const archiveStream = create({ | ||
| files: [ | ||
| { name: 'hello.txt', source: Buffer.from('Hello, world!') }, | ||
| { name: 'data.json', source: Buffer.from(JSON.stringify({ ok: true })) }, | ||
| ], | ||
| preset: 6, // XZ compression level 0–9 (default: 6) | ||
| filter: (file) => !file.name.endsWith('.tmp'), // optional | ||
| }); |
…EADME
R8-1/R8-2: remove stale 'Transform stream' / 'Recursively collect' comments in create.ts (implementation is AsyncIterable-based, no Transform class, no directory recursion).
R8-3: collapse duplicate R7-5 pipeline comment.
R8-4: remove duplicate ensureSafeName JSDoc block.
R8-5: hasSymlinkAncestor JSDoc says 'up to but not including root' to match the exclusive loop condition.
R8-6: hoist mid-file createReadStream import to top of test file.
R8-7: README browser example uses TextEncoder.encode() instead of Buffer.from() (Buffer is Node-only).
All Copilot round 8 findings (L-only) addressed. Ready to merge.
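For reference, the ancestor walk that R8-5 documents can be sketched as follows. The signature and loop condition come from the snippet quoted in the review; the loop body, including the ENOENT handling, is an assumption about how such a walk might look rather than the actual file.ts code.

```typescript
import { promises as fsp } from 'node:fs';
import { dirname } from 'node:path';

async function hasSymlinkAncestor(filePath: string, root: string): Promise<boolean> {
  // Walk each ancestor from filePath up to (but not including) root.
  let dir = dirname(filePath);
  while (dir !== root && dir.length >= root.length) {
    try {
      const st = await fsp.lstat(dir); // lstat does not follow symlinks
      if (st.isSymbolicLink()) return true;
    } catch (err) {
      // Assumed policy: a missing ancestor cannot be a symlink; keep walking.
      if ((err as NodeJS.ErrnoException).code !== 'ENOENT') throw err;
    }
    const parent = dirname(dir);
    if (parent === dir) break; // reached the filesystem root
    dir = parent;
  }
  return false;
}
```

Because the loop stops when `dir === root`, the extraction root itself is never inspected, which is what 'up to but not including root' describes.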
…try) (#113)
* feat(tar-xz): add streamXz() - Block 1 of TAR-XZ-STREAMING-2026-04-28
Adds the streaming-XZ pipeline foundation that subsequent blocks in this story will build on. No callers yet: extract.ts and list.ts still use the buffered helpers; their migration is Block 3 / Block 4.
- streamXz(input: TarInputNode): AsyncIterable<Uint8Array>. Pipes any TarInputNode through createUnxz() and yields decompressed chunks via the Transform's native Symbol.asyncIterator. No internal Buffer.concat, no Readable.from() wrap (per spec §12.5).
- @deprecated tags on collectAllChunks / decompressXz / runWritable. These remain functional until Blocks 3+4 migrate their callers.
- 9 unit tests in packages/tar-xz/test/xz-helpers.spec.ts (flat layout, matching coverage.spec.ts / node-api.spec.ts / tar-format.spec.ts). Covers: byte-equality, multi-chunk yield, all four input forms (Uint8Array / Buffer / Readable / AsyncIterable), error propagation on corrupt input, memory-shape no-accumulation proof, deprecation tags. Memory test correctly distinguishes XZ preset-6 dictionary (~8 MB working memory, inherent) from accumulation (would grow with archive size).
Spec doc TAR-XZ-STREAMING-2026-04-28.md captured in this commit (adversarial + llm-spec hardened, 5 design decisions locked, §10/§11 review ledgers filled).
108 / 108 tar-xz tests pass; 27 / 27 nxz tests pass; tsc + lint clean.
* feat(tar-xz): streaming parseTar generator + extract/list rewrites
Blocks 2+3+4 of TAR-XZ-STREAMING-2026-04-28. Replaces the buffered TarUnpack/TarList Writable classes with a coroutine-style AsyncGenerator parser that yields entries while the XZ stream is still being consumed.
- tar-parser.ts: new ParseEvent discriminated union + parseTar(source, mode) AsyncGenerator. State machine: HEADER / CONTENT / SKIP / PADDING. Reuses parseNextHeader() unchanged.
  Removes all four `v8 ignore` blocks that were marking the chunk-split / PAX-split paths (lines 86-89, 148-153, 169-174; now hot paths exercised by 8 new unit tests in test/tar-parser-stream.spec.ts). Adds MAX_PAX_HEADER_BYTES = 1 MB DoS guard (per spec A-07/A-08); exceeds → throws Error with code 'TAR_PARSER_INVARIANT' (per D-5/L-S-02).
- extract.ts: Writable class replaced by lean async generator with a lookahead-buffer pattern that handles cooperative coroutine flow without losing parse events. makeTarEntryWithData rewired around a pull-callback; bytes()/text() memoize via a shared cachedBytes field (D-3). +5 new tests cover S-08 (consumer skips entry.data), memoization, single-use data after bytes(), and S-08b (consumer-break silent stop).
- list.ts: Writable class replaced by 4-line generator wrapping parseTar(xzStream, 'list'). Mode 'list' never yields chunk events so memory stays O(BLOCK_SIZE).
- xz-helpers.ts: deleted the deprecated collectAllChunks / decompressXz / runWritable. Zero callers verified via search_structural before removal.
Spec drift flagged + accepted: D-5's "TAR_PARSER_INVARIANT always re-throws even on consumer-break" cannot be implemented literally because Biome's noUnsafeFinally rule prohibits `throw` in `finally`. After consumer-break (parser.return()) the iterator is dead; no subsequent .next() can observe the corrupted state, so surfacing the invariant via finally would only produce an unhandled rejection. Resolution: finally swallows cleanup errors on consumer-break, consistent with D-2 silent-stop semantics. Documented for opus review.
Test counts post-blocks-2-3-4: tar-xz 120, root 489, nxz 27, all green. tsc clean, lint clean (rtk proxy biome).
* test(tar-xz): security regression + memory shape gates
Block 5 of TAR-XZ-STREAMING-2026-04-28. Adds the regression-lock that PR #113 needs to ship: proves the streaming refactor preserves the 18 TOCTOU vectors closed by PR #108 and meets the O(largest entry) memory target.
- vitest.config.ts: pool='forks' + execArgv=['--expose-gc'] for the global.gc handle the memory tests need (Vitest 4 moved execArgv to top-level test config, not poolOptions).
- package.json: new test:memory script.
- test/security.spec.ts (new, 22 tests): consolidated TOCTOU coverage for V1/R6-1 (leaf symlink), F-001 (traversal), F-002/R3-1 (TOCTOU ancestor + ENOENT walk), S3 (per entry-type), R4-2 (hardlink strip), R5-1 (hardlink symlink source), S2 (hardlink escape), V6a/V6b (NUL/empty name), V12 (setuid mask), V2/V3 (O_NOFOLLOW POSIX, skipped with note on win32). Plus S-14 (Win32 policy doc test) and S-15a/S-15b (PAX bomb DoS; both safe outcomes verified: truncated bomb hits "Unexpected end" before guard, > 1 MB actual hits the TAR_PARSER_INVARIANT).
- test/memory-shape.spec.ts (new, 4 tests): in-loop high-water sampling per spec §12.3. extract 1 × 50 MB ≤ 116 MB peak; list 100 × 1 MB metadata ≤ 16 MB peak; extract 5 × 10 MB ≤ 36 MB peak. Tests SKIP cleanly when global.gc is unavailable.
- file.ts: ~17 lines of @security TSDoc on extractFile distinguishing POSIX (fd-based, minimal window) from Windows (by-path, wallclock window scales with entry size in streaming mode). Replaces the prior "OUT OF SCOPE" comment.
- README.md: streaming claim updated (now O(largest entry) as of v6.1.0); new "Security model" subsection (25 lines) documenting the POSIX/Windows split and recommending exclusive directories on Windows until Win32 fd-based extraction lands (separate TODO).
Quality gates: build=0 tsc=0 lint=0 security-test=0 memory-test=0 full-test=0. 661 total tests pass across all packages (33 spec files).
* chore: changeset for tar-xz 6.1.0 streaming refactor
* fix(tar-xz): address PR #113 review findings - round 1
Round-1 findings from opus + Copilot post-restart consolidated review (3M + 8L). Closes resource leaks, restores BufferEncoding contract on text(), and bundles 8 polish items to converge in one round.
M-class (correctness):
- streamXz()/parseTar() now propagate cleanup on early termination. streamXz wraps the pipeline in finally{ unxzStream.destroy(); await pipelinePromise.catch() }; destroying the Transform on consumer-break suppresses the resulting pipeline rejection. parseTar wraps its main loop in try/finally{ await iter.return?.() } so upstream cleanup propagates back through the chain. No more pending unhandled rejections after `for await ... break` (F-1).
- entry.bytes() now throws explicitly when entry.data has been iterated first: "entry.data already iterated; bytes() cannot recover full content; consume one or call bytes() first." Internal flag set on the first .next() of the data wrapper. Clean contract; surprising silent partial caches eliminated. +2 tests (F-2).
- entry.text(encoding?) reverted to Buffer.from(...).toString(encoding) semantics. The TextDecoder rewrite shipped in Block 3 was a breaking change vs the previous Buffer.toString() contract; base64, hex, latin1 etc. now work again. Tests cover utf8 default + base64 + hex (F-3).
L-class (polish, doc, layout):
- Concurrent entry.data iteration guard (boolean dataGenInFlight): spec §2.4 forbids it; runtime now enforces (F-4).
- Stray-chunk silent fallthrough in extract() outer loop now throws err.code='TAR_PARSER_INVARIANT' instead of consuming-and-continuing (F-5).
- memory-shape Test 1 raised to preset:6 to match the 16 MB slack budget rationale (XZ preset-6 dictionary ~8 MB). Threshold unchanged (F-6).
- New packages/tar-xz/vitest.memory.config.ts (pool: 'forks', --expose-gc) keeps the default vitest config on threads. test:memory script updated to use the dedicated config. Default `pnpm test` runtime no longer regressed by forks pool (F-7).
- xz-helpers.spec.ts:146 "64 KB = 655,360" → "64 KB = 65,536" math fix (F-8).
- README streaming claim reworded: "v6.0.0 introduced the stream-first API contract; v6.1.0 delivers the planned optimization that fulfills it." (F-9).
- Changeset bullet 4 softened: removed "previously the buffered model could surface the error" overclaim; replaced with accurate JS AsyncGenerator convention reference (F-10).
- docs/plans/TAR-XZ-STREAMING-2026-04-28.md duplicate `## §12 Locked Design Decisions` header collapsed to one (F-11).
Tests: 150+3-skip pass; memory 3+1-skip pass. tsc + lint + build green.
* fix(tar-xz): lazy streamXz pipeline + alloc-once bytes() (PR #113)
Round-2 Copilot findings on PR #113 (1M + 1L):
- streamXz() now creates the createUnxz() Transform and the pipeline() call INSIDE the async generator body. Pipeline starts only on the first .next(), so it is truly lazy. Consumers that call streamXz(input) but never iterate (e.g. early break before any read) trigger zero I/O and zero resources. T-07 proves it via a counting Readable mock (readCount=0 after 20ms when not iterated). The cleanup `finally{ unxzStream.destroy(); await pipelinePromise.catch(...) }` is reached on every termination path (CR2-1).
- bytes() in makeTarEntryWithData() now allocates `new Uint8Array(entry.size)` once and `set()`s each chunk at running offset. Halves peak memory for large entries vs the old chunks-array + concat. entry.size===0 short-circuits to Uint8Array(0). Defensive overrun guard throws Error with code='TAR_PARSER_INVARIANT' if offset+chunk would exceed entry.size; mathematically unreachable given parseTar's Math.min(bytesRemaining,…) but locks the invariant (CR2-2).
Pre-push opus senior review (§2.10 round-3 gate): SAFE-TO-PUSH. Lazy semantics confirmed, cleanup paths sound, T-07 race-free with positive control. Zero new findings.
151 tar-xz tests pass + 489 full suite + 3 memory-shape pass. tsc + lint + build all green.
* fix(tar-xz): retype dataWrapper as AsyncIterable to satisfy strict tsc
CI build failed on `packages/tar-xz` with TS2741: TypeScript 6's `lib.esnext` adds `[Symbol.asyncDispose]` to `AsyncGenerator`, and the fix-round-1 dataWrapper proxy didn't implement it.
The wrapper only needs to satisfy `AsyncIterable<Uint8Array>`, which is the public type on `TarEntryWithData.data`. `AsyncGenerator` was over-typed.
- Restructured the wrapper into a pure AsyncIterable whose `[Symbol.asyncIterator]()` sets `dataIterStarted = true` and returns the underlying `dataGen` directly. No more next/return/throw on the literal: strict mode rejected those without explicit method types, and they were redundant once the wrapper just delegates.
- Surface is strictly smaller and matches the public type contract exactly. No behavior change: bytes()-after-iter still throws because the flag is still set on first `[Symbol.asyncIterator]()` call (which fires the moment the consumer starts a `for await`).
All gates green: tar-xz build, tsc, vitest (151 + 3 skipped), root build, full suite, biome lint.
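The alloc-once bytes() strategy from CR2-2 can be illustrated with a small standalone helper. `collectExact` is a hypothetical name used only for this sketch, and the error message is invented; the single allocation, running offset, zero-size short-circuit and overrun guard follow the commit description above.

```typescript
// Hypothetical helper illustrating the single-allocation collection strategy.
async function collectExact(
  chunks: AsyncIterable<Uint8Array>,
  size: number,
): Promise<Uint8Array> {
  if (size === 0) return new Uint8Array(0);
  // One allocation up front instead of a chunks array plus a final concat.
  const out = new Uint8Array(size);
  let offset = 0;
  for await (const chunk of chunks) {
    if (offset + chunk.length > size) {
      // Defensive guard: the producer must never exceed the declared size.
      throw Object.assign(new Error('chunk overruns declared entry size'), {
        code: 'TAR_PARSER_INVARIANT',
      });
    }
    out.set(chunk, offset);
    offset += chunk.length;
  }
  return out;
}
```

Compared with pushing chunks into an array and concatenating, this avoids holding both the chunk list and the joined buffer in memory at the same time.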
Future workspace releases (`tar-xz`, `nxz-cli`) now auto-scope their CHANGELOG entries to commits since the LAST per-package release commit (matching `chore(<pkg>): release v*`), instead of since the last GPG-signed root tag. Eliminates the need for post-release manual CHANGELOG curation that nxz-cli@6.1.0 required to drop entries from #108 (already in 6.0.0), #115 (biome refactor body fragments), and `adfbc99` (changesets adoption, already removed). Empirical dry-run from `packages/nxz/` with `GIT_CHANGELOG_PATH=.` produces zero diff: the new `resolveSinceBaseline` correctly stops at `ecff028 chore(nxz-cli): release v6.1.0`. The new `GIT_CHANGELOG_SINCE` env var also provides an explicit override escape hatch for projects whose release commits do not follow the `chore(<pkg>): release v*` pattern.
Summary
Breaking redesign of `tar-xz` (and `nxz-cli`) for v6.0.0. Same API in Node and Browser, built around `AsyncIterable<Uint8Array>`. SRP-clean: the core does no filesystem I/O; file helpers are an opt-in, Node-only subpath export (`tar-xz/file`).
`node-liblzma` (root) is not affected and stays at v5.0.x. Only `tar-xz` and `nxz-cli` bump to 6.0.0.
What changed
- `create()`, `extract()`, `list()`: same names, same signatures, identical mental model in Node and Browser.
- `create()` returns `AsyncIterable<Uint8Array>`; `extract()` and `list()` accept any stream-shaped input and yield entries lazily.
- `tar-xz/file` (Node only) provides `createFile`, `extractFile`, `listFile` for path-based I/O.
- `nxz-cli`: rewired to use the new tar-xz API + file helpers; tests updated.
Removed
| Removed | Replacement |
|---|---|
| `extractToMemory()` | `extract()` + `entry.bytes()` |
| `createTarXz` / `extractTarXz` / `listTarXz` | `create` / `extract` / `list` |
| `BrowserCreateOptions` / `BrowserExtractOptions` | `CreateOptions` / `ExtractOptions` |
| `ExtractedFile` | `TarEntryWithData` |
Validation
- `tsc --noEmit` clean (root + tar-xz + nxz)
- `pnpm build` + `pnpm -r --filter './packages/*' run build` clean
Known follow-ups (separate PRs)
- `extract()`/`list()` currently load-then-parse; not yet true streaming. Functional but not memory-optimal for huge archives. → Phase 1.5 optimization.
- Demo build has a pre-existing Vite alias issue (`node-liblzma/wasm` subpath). Not introduced by this PR.
Test plan
- `release.yml` with `target_package=tar-xz`, `increment=major` after merge: first end-to-end validation of the independent versioning infra on a major bump.