
feat(tar-xz)!: redesign for v6 — universal stream-first API #108

Merged
oorabona merged 11 commits into master from refactor/tar-xz-v6-streams
Apr 27, 2026

Conversation

@oorabona
Owner

Summary

Breaking redesign of tar-xz (and nxz-cli) for v6.0.0. The API is the same in Node and Browser and is built around AsyncIterable<Uint8Array>. SRP-clean: the core does no filesystem I/O; file helpers are an opt-in, Node-only subpath export (tar-xz/file).

node-liblzma (root) is not affected — stays at v5.0.x. Only tar-xz and nxz-cli bump to 6.0.0.

What changed

  • Universal API: create(), extract(), list() — same names, same signatures, identical mental model in Node and Browser.
  • Stream-first: create() returns AsyncIterable<Uint8Array>; extract() and list() accept any stream-shaped input and yield entries lazily.
  • File helpers: tar-xz/file (Node only) provides createFile, extractFile, listFile for path-based I/O.
  • nxz-cli: rewired to use the new tar-xz API + file helpers; tests updated.
  • README + demo: rewritten for the new API with side-by-side Node/Browser examples and a v5→v6 migration guide.
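
Because create() returns AsyncIterable<Uint8Array>, a consumer can wrap the output in its own generator without buffering. The sketch below shows one way to hash chunks as they pass through. `hashWhileStreaming`, `onDigest`, and `fakeArchive` are hypothetical names used for illustration; `fakeArchive` only stands in for the real create() so the example runs without the package.

```typescript
import { createHash } from 'node:crypto';

// Pass-through generator: yields every chunk unchanged while feeding a hash.
// `onDigest` is a hypothetical callback, not part of tar-xz.
async function* hashWhileStreaming(
  chunks: AsyncIterable<Uint8Array>,
  onDigest: (hex: string) => void,
): AsyncIterable<Uint8Array> {
  const hash = createHash('sha256');
  for await (const chunk of chunks) {
    hash.update(chunk);
    yield chunk;
  }
  // Runs only after the consumer has drained the whole stream.
  onDigest(hash.digest('hex'));
}

// Stand-in for create(): any AsyncIterable<Uint8Array> works here.
async function* fakeArchive(): AsyncIterable<Uint8Array> {
  yield new Uint8Array([1, 2, 3]);
  yield new Uint8Array([4, 5]);
}
```

The digest is reported only once iteration completes, so a consumer that stops early never receives it.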

Removed

| v5 | v6 replacement |
|----|----------------|
| extractToMemory() | extract() + entry.bytes() |
| createTarXz / extractTarXz / listTarXz | create / extract / list |
| BrowserCreateOptions / BrowserExtractOptions | unified CreateOptions / ExtractOptions |
| ExtractedFile | TarEntryWithData |
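
For callers that relied on extractToMemory(), the replacement pattern could look roughly like the sketch below. Only `name` and `bytes()` are taken from this PR; `EntryLike`, `collectToMemory`, and `fakeExtract` are hypothetical names, and `fakeExtract` merely imitates the shape of extract() so the example is self-contained.

```typescript
// Structural stand-in for the v6 entry shape; only `name` and `bytes()`
// come from the PR description, the rest is illustrative.
interface EntryLike {
  name: string;
  bytes(): Promise<Uint8Array>;
}

// Roughly what v5 extractToMemory() offered, rebuilt on extract() + entry.bytes().
async function collectToMemory(
  entries: AsyncIterable<EntryLike>,
): Promise<Map<string, Uint8Array>> {
  const out = new Map<string, Uint8Array>();
  for await (const entry of entries) {
    out.set(entry.name, await entry.bytes());
  }
  return out;
}

// Fake entry source so the sketch runs without the real package.
async function* fakeExtract(): AsyncIterable<EntryLike> {
  const enc = new TextEncoder();
  yield { name: 'a.txt', bytes: async () => enc.encode('hello') };
  yield { name: 'b.txt', bytes: async () => enc.encode('world!') };
}
```

With the real package the call would presumably be `collectToMemory(extract(input))`, assuming TarEntryWithData satisfies the `EntryLike` shape.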

Validation

  • tsc --noEmit clean (root + tar-xz + nxz)
  • ✅ tar-xz tests: 80/80
  • ✅ nxz tests: 27/27
  • ✅ Build: pnpm build + pnpm -r --filter './packages/*' run build clean

Known follow-ups (separate PRs)

  • Node extract() / list() currently load-then-parse; not yet true streaming. Functional but not memory-optimal for huge archives. → Phase 1.5 optimization.
  • Demo build has a pre-existing Vite alias issue with the node-liblzma/wasm subpath. Not introduced by this PR.
  • Conventional-commit scope filter for sub-package CHANGELOGs (currently includes all repo commits).

Test plan

  • tar-xz unit tests
  • nxz unit tests
  • TypeScript build
  • Manual browser demo verification (after Vite alias fix)
  • Real release flow: trigger release.yml with target_package=tar-xz, increment=major after merge — first end-to-end validation of the independent versioning infra on a major bump.

BREAKING CHANGE: tar-xz v6 — universal stream-first API.

The Node and Browser APIs now share identical signatures. All three
core functions (create, extract, list) are async generators returning
or accepting AsyncIterable<Uint8Array>.

Removed:
- extractToMemory() — replaced by extract() + entry.bytes()
- createTarXz / extractTarXz / listTarXz (browser-prefixed names)
- BrowserCreateOptions / BrowserExtractOptions (unified into one type)
- ExtractedFile interface (replaced by TarEntryWithData)

New shape:
  create(options): AsyncIterable<Uint8Array>
  extract(input, options?): AsyncIterable<TarEntryWithData>
  list(input): AsyncIterable<TarEntry>

  type TarInput = AsyncIterable<Uint8Array> | Iterable<Uint8Array>
                | Uint8Array | ArrayBuffer
                | ReadableStream<Uint8Array>     // Web
                | NodeJS.ReadableStream;          // Node

Source files for create() use the new TarSourceFile shape:
  { name, source: AsyncIterable<Uint8Array> | Uint8Array | string }
  (string only valid in Node; interpreted as fs path)

Phase 1 of v6 redesign. nxz-cli and tests will be updated next.
…0.0)

- New 'tar-xz/file' subpath export (Node only) with createFile/extractFile/listFile
- tar-xz package.json: bump to 6.0.0, add ./file subpath export
- nxz-cli: rewired to use tar-xz/file helpers, bump to 6.0.0
- Tests rewritten for v6 API (80/80 tar-xz, 27/27 nxz)

Bugfix: toAsyncIterable mis-dispatched Uint8Array via Symbol.iterator
(yielding bytes as numbers). Reordered checks so instanceof Uint8Array
takes precedence over Symbol.iterator.
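
The ordering problem behind this bugfix can be illustrated with a minimal normalizer. This is a sketch, not the code in to-async-iterable.ts: `InputLike` and `toAsyncIterableSketch` are hypothetical names, and the Web/Node stream branches of TarInput are left out.

```typescript
type InputLike =
  | AsyncIterable<Uint8Array>
  | Iterable<Uint8Array>
  | Uint8Array
  | ArrayBuffer;

async function* toAsyncIterableSketch(input: InputLike): AsyncIterable<Uint8Array> {
  // Check concrete binary types first: a Uint8Array also has Symbol.iterator,
  // and iterating it would yield individual byte values (numbers).
  if (input instanceof Uint8Array) {
    yield input;
  } else if (input instanceof ArrayBuffer) {
    yield new Uint8Array(input);
  } else if (Symbol.asyncIterator in input) {
    yield* input as AsyncIterable<Uint8Array>;
  } else {
    // Remaining case: a synchronous iterable of chunks, e.g. an array.
    yield* input as Iterable<Uint8Array>;
  }
}
```

Putting the Symbol.iterator branch first would route a plain Uint8Array into the last case, which is the mis-dispatch described above.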
- packages/tar-xz/README.md: full v6 rewrite with unified Node/Browser
  examples, file helpers ('tar-xz/file'), streaming patterns (HTTP, hash
  during creation, fetch+extract), API reference, v5->v6 migration guide
- README.md (root): tar-xz section updated for v6 unified API
- packages/tar-xz/demo/main.ts: rewired to v6 (create/extract/list,
  AsyncIterable consumption, Blob collection)

Demo build has a pre-existing Vite alias issue (node-liblzma/wasm
subpath resolution) — separate follow-up.
Copilot AI review requested due to automatic review settings April 27, 2026 10:34
The previous record-style alias replaced 'node-liblzma' as a prefix,
producing 'lib/lzma.browser.js/wasm' for subpath imports — not a valid
path. Switching to array form lets us register the more specific
/wasm subpath first.
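
The effect of alias ordering can be modeled without Vite. The sketch below is a simplified stand-in for prefix aliasing, not Vite's actual resolver; `resolveAlias` is a hypothetical function and `lib/wasm/index.js` is a placeholder path, since the real targets live in the demo's vite.config.ts.

```typescript
interface Alias {
  find: string;
  replacement: string;
}

// Simplified model: the first alias whose `find` equals the id or is a
// path prefix of it wins, and only the matched prefix is replaced.
function resolveAlias(id: string, aliases: Alias[]): string {
  for (const { find, replacement } of aliases) {
    if (id === find || id.startsWith(find + '/')) {
      return replacement + id.slice(find.length);
    }
  }
  return id;
}

// Record-style behaviour: a single prefix entry catches the subpath too.
const prefixOnly: Alias[] = [
  { find: 'node-liblzma', replacement: 'lib/lzma.browser.js' },
];

// Array form: the more specific subpath is registered first.
const specificFirst: Alias[] = [
  { find: 'node-liblzma/wasm', replacement: 'lib/wasm/index.js' },
  { find: 'node-liblzma', replacement: 'lib/lzma.browser.js' },
];

console.log(resolveAlias('node-liblzma/wasm', prefixOnly));    // lib/lzma.browser.js/wasm
console.log(resolveAlias('node-liblzma/wasm', specificFirst)); // lib/wasm/index.js
```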

Copilot AI left a comment


Pull request overview

Major v6 redesign of tar-xz to provide a universal, stream-first API (Node + Browser) centered on AsyncIterable<Uint8Array>, plus Node-only filesystem convenience helpers via tar-xz/file. Updates nxz-cli to use the new API and refreshes docs/demo/tests accordingly.

Changes:

  • Introduce v6 create() / extract() / list() APIs returning/accepting async-iterable stream-shaped data across Node and browser entry points.
  • Add Node-only disk I/O wrappers (createFile, extractFile, listFile) under the tar-xz/file subpath export.
  • Update tests, demo, and READMEs for the new API and add v5→v6 migration guidance; update nxz-cli to use tar-xz/file.

Reviewed changes

Copilot reviewed 23 out of 23 changed files in this pull request and generated 11 comments.

Show a summary per file
File Description
packages/tar-xz/test/node-api.spec.ts Updates Node tests to use tar-xz/file helpers and adds in-memory streaming tests.
packages/tar-xz/test/coverage.spec.ts Refreshes coverage/integration tests for v6 APIs and adds helper to collect streamed extract results.
packages/tar-xz/src/types.ts Introduces v6 universal types (TarInput, TarSourceFile, TarEntryWithData streaming shape).
packages/tar-xz/src/node/list.ts Reworks Node listing to accept stream-shaped input and yield entries via async-iterable API.
packages/tar-xz/src/node/index.ts Updates Node exports for v6 (removes extractToMemory, exports TarInputNode).
packages/tar-xz/src/node/file.ts Adds Node-only filesystem convenience wrappers for create/extract/list.
packages/tar-xz/src/node/extract.ts Reworks Node extraction to async-iterable entries with bytes()/text() helpers.
packages/tar-xz/src/node/create.ts Reworks Node creation to async-iterable compressed output and v6 TarSourceFile inputs.
packages/tar-xz/src/internal/to-async-iterable.ts Adds Node input normalization to AsyncIterable<Uint8Array> (supports Node + Web streams).
packages/tar-xz/src/internal/to-async-iterable.browser.ts Adds browser input normalization to AsyncIterable<Uint8Array> for Web Streams and buffers.
packages/tar-xz/src/index.ts Updates main Node entrypoint exports for v6 and re-exports new types.
packages/tar-xz/src/index.browser.ts Updates browser entrypoint to export v6 create/extract/list and new types.
packages/tar-xz/src/browser/list.ts Reworks browser listing to async-iterable input + WASM decompression path.
packages/tar-xz/src/browser/index.ts Updates browser exports to v6 function names.
packages/tar-xz/src/browser/extract.ts Reworks browser extraction to async-iterable TarEntryWithData output.
packages/tar-xz/src/browser/create.ts Reworks browser creation to v6 inputs and async-iterable compressed output.
packages/tar-xz/package.json Bumps tar-xz to 6.0.0 and adds ./file subpath export.
packages/tar-xz/demo/main.ts Updates demo UI to use v6 async-iterable APIs and new TarSourceFile shape.
packages/tar-xz/README.md Rewrites docs for v6 API, adds file-helper docs and v5→v6 migration guide.
packages/nxz/src/nxz.ts Switches CLI tar handling to tar-xz/file helpers and updates verbose behavior.
packages/nxz/package.json Bumps nxz-cli to 6.0.0.
README.md Updates repo-level docs/examples to reflect tar-xz v6 API and tar-xz/file.


Comment thread packages/tar-xz/src/node/file.ts Outdated
Comment on lines +57 to +67
const cwd = resolve(options.cwd ?? process.cwd());
const archiveStream = createReadStream(archivePath);

for await (const entry of extract(archiveStream, options)) {
  const target = resolve(cwd, entry.name);
  const normalized = normalize(target);

  // Path safety: prevent directory traversal
  if (!normalized.startsWith(cwd + '/') && normalized !== cwd) {
    throw new Error(`Refusing to extract entry outside cwd: ${entry.name}`);
  }
Comment on lines +89 to +99
if (entry.type === TarEntryType.HARDLINK) {
  await mkdir(dirname(target), { recursive: true });
  const linkSource = resolve(cwd, entry.linkname);
  // Remove existing file if present (allow re-extract)
  try {
    await unlink(target);
  } catch {
    // Ignore
  }
  await link(linkSource, target);
  continue;
Comment thread packages/tar-xz/test/coverage.spec.ts Outdated
Comment on lines +353 to +363
// Build archive with explicit symlink entry
await createFile(archive, {
  files: [
    { name: 'target.txt', source: Buffer.from('content') },
    {
      name: 'link.txt',
      source: new Uint8Array(0),
      mode: TarEntryType.SYMLINK as unknown as number,
    },
  ],
});
Comment thread packages/tar-xz/src/node/list.ts Outdated
Comment on lines 136 to 149
@@ -83,14 +147,27 @@ class TarList extends Writable {
* }
* ```
*/
Comment thread packages/tar-xz/README.md Outdated
Comment on lines +15 to +17
- **Stream-first** — all functions return `AsyncIterable<…>`; no whole-file buffering required
- **Flexible input** — `extract()` and `list()` accept `AsyncIterable`, `Uint8Array`,
`ArrayBuffer`, Web `ReadableStream`, or Node `ReadableStream`
Comment on lines +77 to +87
if (entry.type === TarEntryType.SYMLINK) {
  await mkdir(dirname(target), { recursive: true });
  // Remove existing symlink if present (allow re-extract)
  try {
    await unlink(target);
  } catch {
    // Ignore — file may not exist
  }
  await symlink(entry.linkname, target);
  continue;
}
Comment thread packages/tar-xz/test/coverage.spec.ts Outdated
Comment on lines +319 to +326
/** Helper: collect an extract() async iterable to memory entries (using ReadableStream) */
async function collectExtract(
  archive: string,
  options: { strip?: number; filter?: (e: TarEntry) => boolean } = {}
): Promise<Array<{ name: string; type: string; content: Buffer; linkname: string }>> {
  const results: Array<{ name: string; type: string; content: Buffer; linkname: string }> = [];
  for await (const entry of extract(createReadStream(archive), options)) {
    const bytes = await entry.bytes();
Comment thread packages/tar-xz/test/node-api.spec.ts Outdated
Comment on lines +191 to +193
// extractFile with no options — extracts to cwd (tempDir not applicable here, but API works)
// Just verify it doesn't throw when called with default options shape
await expect(listFile(archivePath)).resolves.toHaveLength(1);
Comment thread packages/tar-xz/README.md Outdated
|----------|-----------|---------|
| `create` | `(options: CreateOptions) => AsyncIterable<Uint8Array>` | Compressed archive chunks |
| `extract` | `(input: TarInput, options?: ExtractOptions) => AsyncIterable<TarEntryWithData>` | Entries with data |
| `list` | `(input: TarInput, options?: ExtractOptions) => AsyncIterable<TarEntry>` | Metadata only |
Comment thread packages/tar-xz/src/node/create.ts Outdated
Comment on lines +130 to +133
const mode = file.mode ?? 0o644;

const isDir = name.endsWith('/') && size === 0;
const type: TarEntryTypeValue = isDir ? TarEntryType.DIRECTORY : TarEntryType.FILE;
Copilot round 2:
- S1/S2 path traversal guard: cross-platform via path.relative + isAbsolute
- S3 symlink TOCTOU: hasSymlinkAncestor check before file write
- M1-M5: dir mode 0o755 default (Node + browser), README claims, test cast
- L1-L3: outdated JSDoc + comment cleanup

Senior-review caught 2 missed bugs:
- F-001: rel.startsWith('..') falsely rejects '..gitignore' etc.
  Fixed: use rel === '..' || rel.startsWith('..' + sep) || isAbsolute(rel)
- F-002: hasSymlinkAncestor check was FILE-only — DIRECTORY/SYMLINK/
  HARDLINK branches all called mkdir/link/symlink without ancestor check,
  allowing escape via 'link → ../external' + 'link/subdir/'.
  Extracted ensureSafeTarget(target, cwd, name) helper called for ALL
  entry types before any fs operation.
- F-003: regression tests added for DIRECTORY/HARDLINK/SYMLINK-through-
  symlink. Sanity-checked: temporary F-002 revert makes them fail.
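
The F-001 condition can be exercised in isolation. The sketch below covers only the lexical check built on path.relative; it does not attempt the symlink-ancestor part of ensureSafeTarget, and `staysInside` is a hypothetical name.

```typescript
import { isAbsolute, relative, resolve, sep } from 'node:path';

// Returns true when `name`, resolved against `cwd`, stays inside `cwd`.
function staysInside(cwd: string, name: string): boolean {
  const root = resolve(cwd);
  const rel = relative(root, resolve(root, name));
  // Compare whole path segments: a bare startsWith('..') would also
  // reject legitimate names such as '..gitignore'.
  const escapes = rel === '..' || rel.startsWith('..' + sep) || isAbsolute(rel);
  return !escapes;
}

console.log(staysInside('/tmp/out', 'a/b.txt'));     // true
console.log(staysInside('/tmp/out', '..gitignore')); // true
console.log(staysInside('/tmp/out', '../evil'));     // false
```

An entry that resolves to the root itself (empty relative path) is treated as inside, matching the `normalized !== cwd` allowance in the original guard.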

Copilot AI left a comment


Pull request overview

Major v6 redesign of tar-xz (and nxz-cli) to provide a universal, stream-first API across Node and browsers, centered on AsyncIterable<Uint8Array>, with Node-only filesystem helpers exposed via tar-xz/file.

Changes:

  • Replaces v5 Node + browser APIs with unified create(), extract(), list() AsyncIterable-based interfaces.
  • Adds Node-only file convenience wrappers createFile, extractFile, listFile under the tar-xz/file subpath export.
  • Updates tests, demo, and documentation (including a v5→v6 migration guide) and rewires nxz-cli to the new APIs.

Reviewed changes

Copilot reviewed 23 out of 23 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
packages/tar-xz/test/node-api.spec.ts Updates Node tests to use tar-xz/file helpers and adds new security/behavior cases.
packages/tar-xz/test/coverage.spec.ts Adjusts coverage/integration tests for v6 API + adds additional security regression scenarios.
packages/tar-xz/src/types.ts Introduces v6 unified types (TarInput, TarSourceFile, streaming TarEntryWithData).
packages/tar-xz/src/node/list.ts Implements v6 list(input) as an async generator over TAR metadata.
packages/tar-xz/src/node/index.ts Updates Node entry exports (removes extractToMemory, adds TarInputNode type export).
packages/tar-xz/src/node/file.ts Adds Node-only filesystem helper API (createFile/extractFile/listFile) + path safety checks.
packages/tar-xz/src/node/extract.ts Implements v6 extract(input, options) async generator yielding TarEntryWithData.
packages/tar-xz/src/node/create.ts Implements v6 create(options) as an async generator of compressed chunks.
packages/tar-xz/src/internal/to-async-iterable.ts Adds Node input normalization helper to accept multiple “stream-shaped” inputs.
packages/tar-xz/src/internal/to-async-iterable.browser.ts Adds browser input normalization helper.
packages/tar-xz/src/index.ts Updates package root Node entry exports for v6.
packages/tar-xz/src/index.browser.ts Updates package root browser entry exports for v6.
packages/tar-xz/src/browser/list.ts Implements browser v6 list(input) with WASM decompression.
packages/tar-xz/src/browser/index.ts Updates browser entry exports to v6 create/extract/list.
packages/tar-xz/src/browser/extract.ts Implements browser v6 extract(input, options) async generator + TarEntryWithData.
packages/tar-xz/src/browser/create.ts Implements browser v6 create(options) async generator + unified TarSourceFile.
packages/tar-xz/package.json Bumps to 6.0.0 and adds ./file subpath export.
packages/tar-xz/demo/vite.config.ts Fixes Vite alias resolution ordering for node-liblzma/wasm.
packages/tar-xz/demo/main.ts Updates demo to v6 AsyncIterable API and new types.
packages/tar-xz/README.md Rewrites docs for v6 API, adds file helpers docs and migration guide.
packages/nxz/src/nxz.ts Switches CLI tar handling to tar-xz/file APIs and removes optional dynamic import.
packages/nxz/package.json Bumps nxz-cli to 6.0.0.
README.md Updates top-level repo docs/examples to reflect tar-xz v6 API + file helpers.


Comment thread packages/tar-xz/src/node/file.ts Outdated
Comment on lines +36 to +38
} catch {
  // Directory doesn't exist yet — no symlink risk.
  return false;
Comment thread packages/tar-xz/src/browser/extract.ts Outdated
if (typeof TextDecoder !== 'undefined') {
  return new TextDecoder(encoding ?? 'utf-8').decode(bytes);
}
return Buffer.from(bytes).toString((encoding ?? 'utf-8') as BufferEncoding);
R3-1 (M/security): hasSymlinkAncestor stopped walking on ENOENT, missing
the case where 'link → ../external' exists but the intermediate path
(e.g. 'link/subdir') is not yet on disk. mkdir(recursive) would then
follow the symlink and write outside cwd. Fix: ENOENT continues the
walk; only non-ENOENT errors re-throw. Regression test added — sanity
checked: test fails on revert, passes on fix.
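
A minimal sketch of the corrected walk, assuming the target has already passed the lexical traversal check. `hasSymlinkAncestorSketch` and `demo` are hypothetical names; the real helper in node/file.ts may differ in signature and details.

```typescript
import { lstat, mkdir, mkdtemp, symlink } from 'node:fs/promises';
import { tmpdir } from 'node:os';
import { dirname, join, resolve } from 'node:path';

// Walks from the parent of `target` up to (but not including) `cwd` and
// reports whether any existing component is a symlink.
async function hasSymlinkAncestorSketch(target: string, cwd: string): Promise<boolean> {
  const root = resolve(cwd);
  let current = dirname(resolve(target));
  while (current !== root && current !== dirname(current)) {
    try {
      if ((await lstat(current)).isSymbolicLink()) return true;
    } catch (err) {
      // A missing component is not proof of safety: keep walking upward,
      // because a symlink may still sit higher in the chain.
      if ((err as NodeJS.ErrnoException).code !== 'ENOENT') throw err;
    }
    current = dirname(current);
  }
  return false;
}

// Demo: 'link' points outside cwd and 'link/subdir' does not exist yet.
async function demo(): Promise<{ throughLink: boolean; plain: boolean }> {
  const base = await mkdtemp(join(tmpdir(), 'symlink-walk-'));
  const cwd = join(base, 'cwd');
  await mkdir(cwd);
  await mkdir(join(base, 'external'));
  await symlink('../external', join(cwd, 'link'));
  return {
    throughLink: await hasSymlinkAncestorSketch(join(cwd, 'link', 'subdir', 'file.txt'), cwd),
    plain: await hasSymlinkAncestorSketch(join(cwd, 'plain', 'file.txt'), cwd),
  };
}
```

In the demo, the lookup of `link/subdir` fails with ENOENT, and the walk then reaches `link` itself, which is the case the earlier early-return missed.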

R3-2 (M/correctness): browser collectText() had a Buffer.from() fallback
that throws a ReferenceError in modern bundlers without a Buffer polyfill.
Dropped the fallback; TextDecoder is universally available in target
environments.

Copilot AI left a comment


Pull request overview

Breaking v6 redesign of tar-xz to provide a universal, stream-first API (AsyncIterable<Uint8Array>) that works consistently in both Node.js and browsers, with Node-only disk I/O helpers exposed via tar-xz/file. Updates nxz-cli and documentation/tests to match the new API.

Changes:

  • Introduce unified create(), extract(), list() AsyncIterable-based APIs for Node + browser, plus shared input normalization (TarInput).
  • Add Node-only filesystem convenience helpers (createFile, extractFile, listFile) via the tar-xz/file export.
  • Update tests, demo, and READMEs; rewire nxz-cli to use tar-xz/file; bump package versions to 6.0.0.

Reviewed changes

Copilot reviewed 23 out of 23 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
packages/tar-xz/test/node-api.spec.ts Updates Node tests to exercise v6 file helpers and in-memory streaming usage.
packages/tar-xz/test/coverage.spec.ts Updates coverage suite for v6 APIs; adds several security/TOCTOU regression tests and in-memory extract helper.
packages/tar-xz/src/types.ts Redefines public types around universal stream-first API (TarInput, TarSourceFile, TarEntryWithData).
packages/tar-xz/src/node/list.ts Implements Node list(input) as AsyncIterable<TarEntry> (currently buffers/decompresses internally).
packages/tar-xz/src/node/index.ts Adjusts Node entry exports to v6 (extractToMemory removed; exports TarInputNode).
packages/tar-xz/src/node/file.ts Adds Node-only disk I/O wrappers (tar-xz/file) with traversal + symlink-ancestor guards.
packages/tar-xz/src/node/extract.ts Implements Node extract(input) as AsyncIterable<TarEntryWithData> (currently buffers/decompresses internally).
packages/tar-xz/src/node/create.ts Implements Node create(options) yielding compressed chunks; builds TAR blocks and streams through XZ.
packages/tar-xz/src/internal/to-async-iterable.ts Adds Node-side TarInputNode normalization to AsyncIterable<Uint8Array>.
packages/tar-xz/src/internal/to-async-iterable.browser.ts Adds browser-side TarInput normalization to AsyncIterable<Uint8Array>.
packages/tar-xz/src/index.ts Updates Node package entrypoint exports/types for v6 and removes v5-only exports.
packages/tar-xz/src/index.browser.ts Updates browser entrypoint to export v6 create/extract/list and shared types.
packages/tar-xz/src/browser/list.ts Implements browser list(input) as AsyncIterable<TarEntry> using WASM subpath.
packages/tar-xz/src/browser/index.ts Renames browser exports to v6 (create/extract/list).
packages/tar-xz/src/browser/extract.ts Implements browser extract(input) yielding TarEntryWithData with bytes()/text() helpers.
packages/tar-xz/src/browser/create.ts Implements browser create(options) as AsyncIterable<Uint8Array> (currently yields one compressed chunk).
packages/tar-xz/package.json Bumps tar-xz to 6.0.0 and adds ./file export for Node-only helpers.
packages/tar-xz/demo/vite.config.ts Fixes Vite aliasing for node-liblzma/wasm subpath resolution order.
packages/tar-xz/demo/main.ts Updates demo to consume v6 AsyncIterable APIs and new source file shape.
packages/tar-xz/README.md Rewrites docs for v6 API, adds migration guide, and documents tar-xz/file.
packages/nxz/src/nxz.ts Rewires CLI tar integration to use tar-xz/file helpers (no more dynamic import).
packages/nxz/package.json Bumps nxz-cli to 6.0.0.
README.md Updates monorepo README examples/docs to reflect tar-xz v6 and the new API patterns.


Comment on lines +120 to +123
// Ensure directory is traversable: always set execute bits (x) for user/group/other.
// A directory with mode 0o644 (no execute) cannot be descended into.
const dirMode = (entry.mode || 0o755) | 0o111;
await mkdir(target, { recursive: true, mode: dirMode });
Comment thread packages/tar-xz/README.md Outdated
Comment on lines +312 to +316
| Preset | WASM Memory | Speed | Ratio | Recommendation |
|--------|------------|-------|-------|----------------|
| 1 | ~10 MB | Fastest | Lowest | Batch of many small files |
| 3 | ~20 MB | Fast | Good | Browser default |
| 6 | ~100 MB | Medium | Very good | Node default |
Comment thread packages/tar-xz/src/node/list.ts Outdated
Comment on lines +16 to +74
async function collectAllChunks(input: TarInputNode): Promise<Uint8Array> {
  const iterable = toAsyncIterable(input);
  const chunks: Uint8Array[] = [];
  for await (const chunk of iterable) {
    chunks.push(chunk);
  }
  const total = chunks.reduce((n, c) => n + c.length, 0);
  const out = new Uint8Array(total);
  let offset = 0;
  for (const chunk of chunks) {
    out.set(chunk, offset);
    offset += chunk.length;
  }
  return out;
}

async function decompressXz(data: Uint8Array): Promise<Uint8Array> {
  const unxzStream = createUnxz();
  const readable = Readable.from(
    (async function* () {
      yield data;
    })()
  );

  const output: Uint8Array[] = [];
  let resolveFlush!: () => void;
  let rejectFlush!: (e: unknown) => void;
  const done = new Promise<void>((res, rej) => {
    resolveFlush = res;
    rejectFlush = rej;
  });

  unxzStream.on('data', (...args: unknown[]) => {
    const chunk = args[0] as Buffer;
    output.push(new Uint8Array(chunk.buffer, chunk.byteOffset, chunk.byteLength));
  });
  unxzStream.on('end', resolveFlush);
  unxzStream.on('error', rejectFlush);
  readable.pipe(unxzStream);

  await done;

  const total = output.reduce((n, c) => n + c.length, 0);
  const result = new Uint8Array(total);
  let offset = 0;
  for (const chunk of output) {
    result.set(chunk, offset);
    offset += chunk.length;
  }
  return result;
}

async function runWritable(writable: Writable, data: Uint8Array): Promise<void> {
  await new Promise<void>((resolve, reject) => {
    writable.on('finish', resolve);
    writable.on('error', reject);
    writable.write(Buffer.from(data.buffer, data.byteOffset, data.byteLength));
    writable.end();
  });
Comment on lines +144 to +150
if (entry.type === TarEntryType.HARDLINK) {
  // S2: validate linkname — it must not escape cwd (absolute paths or ".." segments).
  const linkSource = resolve(cwd, entry.linkname);
  const linkRel = relative(cwd, linkSource);
  if (linkRel === '..' || linkRel.startsWith('..' + sep) || isAbsolute(linkRel)) {
    throw new Error(`Refusing hardlink outside cwd: ${entry.linkname}`);
  }
R4-1 (M/correctness): mode defaults used '||' which treats mode 0
(legitimate TAR entry) as falsy, silently replacing with default. Use
'??' so only undefined/null triggers the default. 3 call sites in
node/file.ts; node/create.ts and browser/create.ts already used '??'.
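
The difference between the two operators for a mode of 0 can be seen in a few lines. `modeWithOr` and `modeWithNullish` are hypothetical helpers; the 0o644 default matches the file default shown in node/create.ts.

```typescript
const DEFAULT_FILE_MODE = 0o644;

// '||' replaces every falsy value, including a legitimate mode of 0.
function modeWithOr(mode: number | undefined): number {
  return mode || DEFAULT_FILE_MODE;
}

// '??' only falls back when the value is undefined or null.
function modeWithNullish(mode: number | undefined): number {
  return mode ?? DEFAULT_FILE_MODE;
}

console.log(modeWithOr(0).toString(8));              // 644
console.log(modeWithNullish(0).toString(8));         // 0
console.log(modeWithNullish(undefined).toString(8)); // 644
```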

R4-2 (M/correctness): extractFile applied 'strip' to entry.name but
not to entry.linkname for HARDLINK entries. With strip:1, archive
'dir/link → dir/a.txt' would fail because 'dir/a.txt' linkname still
includes the stripped 'dir/' prefix. Apply strip to linkname before
resolve() and re-run path safety check. Regression test added.
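
A sketch of why both fields need the same treatment. `stripComponents` is a hypothetical helper and may not match how extractFile implements strip internally, for example around trailing slashes on directory names.

```typescript
// Hypothetical helper: drop the first `count` path components, returning
// null when nothing is left (such an entry would be skipped).
function stripComponents(path: string, count: number): string | null {
  const parts = path.split('/').filter((p) => p.length > 0);
  if (parts.length <= count) return null;
  return parts.slice(count).join('/');
}

// With strip: 1, the entry name and the hardlink target must be stripped
// the same way, otherwise the link points at a path that was never written.
const entryName = stripComponents('dir/link', 1);      // 'link'
const entryLinkname = stripComponents('dir/a.txt', 1); // 'a.txt'
console.log(entryName, entryLinkname);
```

After stripping, the linkname still has to go through the same path safety check as any other target, as the commit message notes.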

R4-3 (L/doc): README claimed preset 3 was 'Browser default' but actual
default is 6 (uniform across Node and Browser). Doc updated.

R4-4 (L/dry): collectAllChunks/decompressXz/runWritable were duplicated
between list.ts and extract.ts. Extracted to internal node/xz-helpers.ts
to prevent drift.

Copilot AI left a comment


Pull request overview

Major v6 breaking redesign of tar-xz to provide a universal, stream-first API across Node.js and browsers, centered on AsyncIterable<Uint8Array>, plus a Node-only tar-xz/file subpath for filesystem I/O. nxz-cli is updated to consume the new file helpers and v6 package versions.

Changes:

  • Replace v5 buffer-/path-centric APIs with universal create()/extract()/list() AsyncIterable-based APIs for Node + Browser.
  • Add Node-only file helper module (tar-xz/file) implementing createFile/extractFile/listFile with path-safety/TOCTOU guards.
  • Update tests, docs, demo, and nxz-cli wiring to use the new v6 APIs and exports.

Reviewed changes

Copilot reviewed 24 out of 24 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
packages/tar-xz/test/node-api.spec.ts Updates Node API tests to exercise tar-xz/file helpers and in-memory stream piping.
packages/tar-xz/test/coverage.spec.ts Refactors coverage tests to use new stream API + file helpers; adds security regression cases.
packages/tar-xz/src/types.ts Unifies Node/Browser types; introduces TarInput, TarSourceFile, and streaming TarEntryWithData.
packages/tar-xz/src/node/xz-helpers.ts Adds shared Node helpers for collecting input, XZ decompression, and driving writables.
packages/tar-xz/src/node/list.ts Reworks Node list to accept stream-shaped input and yield entries via AsyncIterable.
packages/tar-xz/src/node/index.ts Updates Node entrypoint exports for v6 (extractToMemory removed; exports TarInputNode).
packages/tar-xz/src/node/file.ts Introduces Node-only filesystem convenience API (createFile/extractFile/listFile) with safety checks.
packages/tar-xz/src/node/extract.ts Reworks Node extract into AsyncIterable yielding TarEntryWithData (buffering implementation).
packages/tar-xz/src/node/create.ts Reworks Node create into AsyncIterable of compressed chunks; adds TarSourceFile support.
packages/tar-xz/src/internal/to-async-iterable.ts Adds Node normalizer for accepted tar inputs into AsyncIterable<Uint8Array>.
packages/tar-xz/src/internal/to-async-iterable.browser.ts Adds browser normalizer for accepted tar inputs into AsyncIterable<Uint8Array>.
packages/tar-xz/src/index.ts Updates main (Node) package entry exports for v6 and re-exports new types.
packages/tar-xz/src/index.browser.ts Updates browser package entry exports to v6 function names and unified types.
packages/tar-xz/src/browser/list.ts Reworks browser list to accept stream-shaped input and yield entries via AsyncIterable.
packages/tar-xz/src/browser/index.ts Updates browser internal entrypoint exports to v6 names.
packages/tar-xz/src/browser/extract.ts Reworks browser extract to AsyncIterable yielding TarEntryWithData.
packages/tar-xz/src/browser/create.ts Reworks browser create to AsyncIterable; enforces no-fs-path sources in browser.
packages/tar-xz/package.json Bumps tar-xz to 6.0.0 and adds ./file subpath export.
packages/tar-xz/demo/vite.config.ts Fixes Vite aliasing for node-liblzma/wasm subpath resolution.
packages/tar-xz/demo/main.ts Updates demo to new v6 async-iterable APIs and TarSourceFile usage.
packages/tar-xz/README.md Rewrites docs for v6 universal stream-first API + migration guide and file helpers.
packages/nxz/src/nxz.ts Updates CLI tar commands to use tar-xz/file helpers and new behavior.
packages/nxz/package.json Bumps nxz-cli to 6.0.0.
README.md Updates monorepo README to describe tar-xz v6 and show new usage patterns.


Comment on lines +153 to +167
// S2: validate linkname — it must not escape cwd (absolute paths or ".." segments).
const linkSource = resolve(cwd, strippedLinkname);
const linkRel = relative(cwd, linkSource);
if (linkRel === '..' || linkRel.startsWith('..' + sep) || isAbsolute(linkRel)) {
  throw new Error(`Refusing hardlink outside cwd: ${entry.linkname}`);
}
await mkdir(dirname(target), { recursive: true });
// Remove existing file if present (allow re-extract)
try {
  await unlink(target);
} catch {
  // Ignore
}
await link(linkSource, target);
continue;
Comment thread packages/tar-xz/src/node/extract.ts Outdated
Comment on lines +142 to +162
@@ -116,15 +143,6 @@ class TarUnpack extends Writable {
* Validate that a path doesn't escape the target directory (path traversal protection)
* @throws Error if path traversal is detected
*/
function validatePath(destPath: string, cwd: string): void {
  const resolvedDest = path.resolve(destPath);
  const resolvedCwd = path.resolve(cwd);

  // Ensure the destination path starts with the cwd (no escape via ../)
  if (!resolvedDest.startsWith(resolvedCwd + path.sep) && resolvedDest !== resolvedCwd) {
    throw new Error(`Path traversal detected: ${destPath} escapes ${cwd}`);
  }
}

/**
* Extract a tar.xz archive
@@ -141,114 +159,53 @@ function validatePath(destPath: string, cwd: string): void {
* });
* ```
*/
export async function extract(options: ExtractOptions): Promise<TarEntry[]> {
  const { file, cwd = process.cwd(), strip = 0, filter, preserveOwner = false } = options;

  // Create decompression and unpacking streams
  const inputStream = createReadStream(file);
  const unxzStream = createUnxz();
  const tarUnpack = new TarUnpack();
/**
Comment thread packages/tar-xz/src/browser/list.ts Outdated
Comment on lines 77 to 93
@@ -90,14 +91,38 @@ function listTarEntries(data: Uint8Array): TarEntry[] {
* }
* ```
*/
Comment thread packages/tar-xz/src/browser/extract.ts Outdated
Comment on lines 87 to 104
@@ -100,43 +102,85 @@ function parseTar(data: Uint8Array): Array<TarEntry & { data: Uint8Array }> {
* }
* ```
*/
Round 5 findings:
- R5-1 (M/security): hardlink linkSource not validated against symlinks.
  Attack: archive plants 's → /etc/passwd' (symlink), then 'link → s'
  (hardlink with linkname='s'). The previous linkRel check passes
  (s resolves inside cwd), but link(cwd/s, ...) follows the kernel
  symlink and creates a hardlink to /etc/passwd. Fix: lstat(linkSource)
  + hasSymlinkAncestor(linkSource) before link(). Regression test added.
- R5-2/3/4 (L/doc): stale v5 JSDoc removed from node/extract.ts,
  browser/extract.ts, browser/list.ts.

Senior-review hardening (folded in):
- L-1: regression test now also asserts external/secret nlink === 1
  (proves no successful hardlink, even if dest cleaned up).
- L-2: lstat error handling tightened — only swallow ENOENT, rethrow
  EACCES/ELOOP/etc. (consistency with hasSymlinkAncestor pattern).
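
The lstat step of the R5-1 fix, with the L-2 error handling, could be sketched as follows. `assertLinkSourceIsNotSymlink` and `demo` are hypothetical names; the ancestor check on linkSource is assumed to run separately and is not shown.

```typescript
import { lstat, mkdtemp, symlink, writeFile } from 'node:fs/promises';
import { tmpdir } from 'node:os';
import { join } from 'node:path';

// Throws when the hardlink source is itself a symlink. A missing source is
// tolerated here and left for link() to report.
async function assertLinkSourceIsNotSymlink(linkSource: string): Promise<void> {
  let isLink = false;
  try {
    isLink = (await lstat(linkSource)).isSymbolicLink();
  } catch (err) {
    // Only a missing path is swallowed; EACCES, ELOOP, etc. propagate.
    if ((err as NodeJS.ErrnoException).code !== 'ENOENT') throw err;
  }
  if (isLink) {
    throw new Error(`Refusing hardlink through symlink: ${linkSource}`);
  }
}

// Demo: 's' is a symlink planted by an earlier entry, 'real.txt' is a regular file.
async function demo(): Promise<{ symlinkRejected: boolean; regularAccepted: boolean }> {
  const cwd = await mkdtemp(join(tmpdir(), 'hardlink-src-'));
  await writeFile(join(cwd, 'real.txt'), 'data');
  await symlink('real.txt', join(cwd, 's'));

  let symlinkRejected = false;
  try {
    await assertLinkSourceIsNotSymlink(join(cwd, 's'));
  } catch {
    symlinkRejected = true;
  }

  let regularAccepted = true;
  try {
    await assertLinkSourceIsNotSymlink(join(cwd, 'real.txt'));
  } catch {
    regularAccepted = false;
  }
  return { symlinkRejected, regularAccepted };
}
```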

Copilot AI left a comment


Pull request overview

Breaking v6 redesign of tar-xz to provide a universal, stream-first API (Node + Browser) centered on AsyncIterable<Uint8Array>, with Node-only filesystem helpers exposed via tar-xz/file, and corresponding updates to nxz-cli, tests, demo, and documentation.

Changes:

  • Replace v5 Node/browser split APIs with unified create()/extract()/list() AsyncIterable-based interfaces and new shared types.
  • Add Node-only tar-xz/file helpers (createFile, extractFile, listFile) and refactor nxz-cli to use them.
  • Update tests, demo, and READMEs to match the new v6 API + migration guidance.

Reviewed changes

Copilot reviewed 24 out of 24 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| packages/tar-xz/test/node-api.spec.ts | Updates Node tests to exercise the new tar-xz/file API and in-memory streaming flows. |
| packages/tar-xz/test/coverage.spec.ts | Updates coverage tests for v6 behavior; adds extensive security regression coverage around extraction. |
| packages/tar-xz/src/types.ts | Introduces v6 universal types (TarInput, TarSourceFile, streaming TarEntryWithData). |
| packages/tar-xz/src/node/xz-helpers.ts | Adds shared Node XZ buffering/decompression helpers for list/extract. |
| packages/tar-xz/src/node/list.ts | Refactors Node list to AsyncIterable<TarEntry> API and uses shared XZ helpers. |
| packages/tar-xz/src/node/index.ts | Updates Node entry exports for v6 (drops extractToMemory, exports TarInputNode). |
| packages/tar-xz/src/node/file.ts | Adds Node-only disk I/O wrappers (createFile/extractFile/listFile) with path-safety checks. |
| packages/tar-xz/src/node/extract.ts | Refactors Node extract to AsyncIterable<TarEntryWithData> and removes extractToMemory. |
| packages/tar-xz/src/node/create.ts | Refactors Node create to return compressed chunks as an AsyncIterable<Uint8Array>. |
| packages/tar-xz/src/internal/to-async-iterable.ts | Adds Node input normalization (TarInputNode → AsyncIterable<Uint8Array>). |
| packages/tar-xz/src/internal/to-async-iterable.browser.ts | Adds browser input normalization (TarInput → AsyncIterable<Uint8Array>). |
| packages/tar-xz/src/index.ts | Updates Node main entry exports/types for v6. |
| packages/tar-xz/src/index.browser.ts | Updates browser entry to export create/extract/list and v6 types. |
| packages/tar-xz/src/browser/list.ts | Refactors browser list to the v6 AsyncIterable API and WASM import path. |
| packages/tar-xz/src/browser/index.ts | Updates browser sub-entry exports to v6 names. |
| packages/tar-xz/src/browser/extract.ts | Refactors browser extract to AsyncIterable<TarEntryWithData> and WASM import path. |
| packages/tar-xz/src/browser/create.ts | Refactors browser create to AsyncIterable output and v6 source types; enforces no-fs-path in browser. |
| packages/tar-xz/package.json | Bumps tar-xz to 6.0.0 and adds the ./file subpath export. |
| packages/tar-xz/demo/vite.config.ts | Adjusts Vite aliasing to correctly resolve node-liblzma/wasm vs bare node-liblzma. |
| packages/tar-xz/demo/main.ts | Updates demo to the v6 AsyncIterable API (create/extract/list). |
| packages/tar-xz/README.md | Rewrites docs for v6 API + adds migration guide and file-helper documentation. |
| packages/nxz/src/nxz.ts | Refactors CLI to use tar-xz/file helpers and new file-source mapping. |
| packages/nxz/package.json | Bumps nxz-cli to 6.0.0. |
| README.md | Updates root README to reflect tar-xz v6 API and file-helper subpath. |

Comment thread packages/tar-xz/src/node/file.ts Outdated
Comment on lines +191 to +195
await mkdir(dirname(target), { recursive: true });
await pipeline(
  Readable.from(entry.data),
  createWriteStream(target, { mode: entry.mode ?? 0o644 })
);
* @returns Promise with list of entries
* Returns an `AsyncIterable<TarEntry>` yielding each entry's metadata.
* Entry content is skipped — use `extract()` if you need the data.
*
Comment on lines +143 to +147
* Extract a tar.xz archive.
*
* @param options - Extraction options
* @returns Promise with list of extracted entries
* Returns an `AsyncIterable<TarEntryWithData>`. Each yielded entry includes:
* - Full metadata (`TarEntry` fields)
* - `data` — `AsyncIterable<Uint8Array>` for the entry's content (consume in order)
Comment on lines +15 to +19
const chunks: Uint8Array[] = [];
for await (const chunk of iterable) {
  chunks.push(chunk);
}
const total = chunks.reduce((n, c) => n + c.length, 0);
…k vectors

Stop the round-by-round whack-a-mole. Senior audit identified 18
attack vectors (V1-V18); 7 consolidated fixes close them all in one
commit, with regression tests for each.

Fix 1 (V1, V4, R6-1) — Leaf symlink check
  ensureSafeTarget now lstat's the leaf 'target' itself, not just its
  ancestors. Catches archives that plant a symlink then overwrite it
  via FILE/DIRECTORY/HARDLINK. SYMLINK entries skip the check (legitimate
  re-extract via unlink+symlink).

Fix 2 (V6a, V6b) — Reject empty / NUL-bearing names + linknames
  New ensureSafeName() helper. Empty strings and embedded NUL bytes
  are rejected for both entry.name and entry.linkname before any path
  math runs.

Fix 3 (V6c, V14) — Apply strip to SYMLINK linkname
  Mirrors the existing HARDLINK strip handling. Consistency + correctness
  for archives extracted with --strip.

Fix 4 (V12) — Strip setuid/setgid/sticky bits by default
  SAFE_MODE_MASK = 0o0777 applied to every extracted file/dir mode.
  Mirrors GNU tar --no-same-permissions default. No --preserve-special
  opt-in for now (add later if needed).
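
As a rough illustration of Fix 4, the mask keeps only the nine permission bits. `safeMode` is a hypothetical wrapper name; the `0o644` fallback is borrowed from the `entry.mode ?? 0o644` snippet quoted earlier in this thread and may not match the final code.

```typescript
/** Permission bits only: rwx for user, group, and other. */
const SAFE_MODE_MASK = 0o0777;

/**
 * Drop setuid (0o4000), setgid (0o2000), and sticky (0o1000) bits
 * from a mode read out of a tar header.
 */
function safeMode(mode: number | undefined, fallback = 0o644): number {
  return (mode ?? fallback) & SAFE_MODE_MASK;
}

// A setuid-root binary in the archive (0o4755) lands on disk as 0o755.
console.log(safeMode(0o4755).toString(8));
```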

Fix 5 (V2, V3) — fd-based FILE extraction with O_NOFOLLOW
  POSIX path uses fs.open(O_WRONLY | O_CREAT | O_TRUNC | O_NOFOLLOW) +
  handle.write() loop + handle.chmod() + handle.utimes() — all fd-bound,
  no path-resolution after the open. Eliminates the chmod/utimes TOCTOU
  window. Windows fallback uses the by-path version (gated by
  process.platform); leaf check (Fix 1) is the primary defense there.
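
A minimal sketch of the fd-bound POSIX path described in Fix 5, assuming Node's `fs/promises` `FileHandle` API. `writeFileNoFollow` and `demo` are hypothetical names, and the real extractor may order or guard these calls differently.

```typescript
import { constants } from 'node:fs';
import { mkdtemp, open, readFile, symlink } from 'node:fs/promises';
import { tmpdir } from 'node:os';
import { join } from 'node:path';

/**
 * POSIX-only sketch (O_NOFOLLOW is not defined on Windows).
 * After open(), every operation goes through the handle, so the path is never re-resolved.
 */
async function writeFileNoFollow(
  target: string,
  chunks: AsyncIterable<Uint8Array> | Iterable<Uint8Array>,
  mode: number,
  mtime: Date
): Promise<void> {
  const flags =
    constants.O_WRONLY | constants.O_CREAT | constants.O_TRUNC | constants.O_NOFOLLOW;
  const handle = await open(target, flags, mode & 0o777);
  try {
    for await (const chunk of chunks) {
      // write() may be partial; loop until the whole chunk is written.
      let off = 0;
      while (off < chunk.length) {
        const { bytesWritten } = await handle.write(chunk, off, chunk.length - off);
        off += bytesWritten;
      }
    }
    await handle.chmod(mode & 0o777); // the mode passed to open() is subject to umask
    await handle.utimes(mtime, mtime);
  } finally {
    await handle.close();
  }
}

/** Write a regular file, then try to write through a symlink leaf. */
async function demo(): Promise<{ content: string; symlinkError: string | undefined }> {
  const dir = await mkdtemp(join(tmpdir(), 'nofollow-'));
  const target = join(dir, 'out.txt');
  await writeFileNoFollow(target, [new TextEncoder().encode('hello')], 0o644, new Date(0));
  const content = await readFile(target, 'utf8');

  const link = join(dir, 'link');
  await symlink(target, link);
  let symlinkError: string | undefined;
  try {
    await writeFileNoFollow(link, [], 0o644, new Date(0));
  } catch (err) {
    symlinkError = (err as { code?: string }).code; // typically ELOOP on Linux
  }
  return { content, symlinkError };
}
```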

Fix 6 (V8) — Narrow unlink catches to ENOENT
  Two pre-clean unlink sites (SYMLINK, HARDLINK) now rethrow non-ENOENT
  errors instead of swallowing all of them.

Fix 7 (V11) — Threat model documentation
  extractFile JSDoc now states that concurrent attacker process swaps
  during extraction are out of scope (POSIX openat2 not exposed by Node).

8 regression tests added covering each vector. Sanity-checked: reverting
Fix 1, Fix 3, or Fix 4 individually causes the corresponding test(s) to
fail. 95/95 tar-xz tests pass.

Senior-review verdict: Ready, 3 L findings folded (duplicate JSDoc) or
deferred (NUL strictness, Windows-mock test) per scope tradeoff.

Copilot AI left a comment

Pull request overview

Major v6 redesign of tar-xz (and nxz-cli) to provide a universal, stream-first API based on AsyncIterable<Uint8Array>, with Node-only filesystem helpers moved into a dedicated tar-xz/file subpath export.

Changes:

  • Replaced v5 APIs with unified create / extract / list AsyncIterable-based APIs for both Node and browser builds.
  • Added Node-only file helper wrappers (createFile, extractFile, listFile) plus significant extraction hardening (path traversal + symlink/TOCTOU defenses).
  • Updated tests, demo, and documentation (including a v5→v6 migration guide); bumped tar-xz and nxz-cli to 6.0.0.

Reviewed changes

Copilot reviewed 24 out of 24 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| packages/tar-xz/test/node-api.spec.ts | Updates tests to v6 Node file-helper API and adds in-memory streaming tests |
| packages/tar-xz/test/coverage.spec.ts | Expands coverage and adds extensive security regression tests for extractFile |
| packages/tar-xz/src/types.ts | Introduces v6 universal types (TarInput, TarSourceFile, TarEntryWithData streaming helpers) |
| packages/tar-xz/src/node/xz-helpers.ts | Adds shared Node XZ helper utilities for extract/list pipelines |
| packages/tar-xz/src/node/list.ts | Converts Node list() to AsyncIterable API and refactors decompression via helpers |
| packages/tar-xz/src/node/index.ts | Updates Node entry exports (removes extractToMemory, exports TarInputNode type) |
| packages/tar-xz/src/node/file.ts | Adds Node-only disk I/O convenience API + extraction safety hardening |
| packages/tar-xz/src/node/extract.ts | Converts Node extract() to AsyncIterable API |
| packages/tar-xz/src/node/create.ts | Converts Node create() to AsyncIterable output and new TarSourceFile input model |
| packages/tar-xz/src/internal/to-async-iterable.ts | Adds Node TarInput normalization to AsyncIterable |
| packages/tar-xz/src/internal/to-async-iterable.browser.ts | Adds browser TarInput normalization to AsyncIterable |
| packages/tar-xz/src/index.ts | Updates Node package entrypoint exports/types for v6 |
| packages/tar-xz/src/index.browser.ts | Updates browser package entrypoint exports/types for v6 |
| packages/tar-xz/src/browser/list.ts | Converts browser list() to AsyncIterable API and switches to wasm subpath |
| packages/tar-xz/src/browser/index.ts | Updates browser API surface to create/extract/list names |
| packages/tar-xz/src/browser/extract.ts | Converts browser extract() to AsyncIterable API |
| packages/tar-xz/src/browser/create.ts | Converts browser create() to AsyncIterable output and TarSourceFile source model |
| packages/tar-xz/package.json | Bumps to 6.0.0 and adds ./file subpath export |
| packages/tar-xz/demo/vite.config.ts | Fixes Vite aliasing for node-liblzma/wasm subpath resolution |
| packages/tar-xz/demo/main.ts | Updates demo to v6 AsyncIterable APIs and TarSourceFile usage |
| packages/tar-xz/README.md | Rewrites docs for v6 API + migration guide |
| packages/nxz/src/nxz.ts | Rewires CLI tar operations to use tar-xz/file helpers |
| packages/nxz/package.json | Bumps nxz-cli to 6.0.0 |
| README.md | Updates root README to reflect tar-xz v6 redesign and usage |

Comment on lines +36 to +40
function ensureSafeName(s: string | undefined, label: string): void {
  if (s === undefined) return;
  if (s.length === 0) throw new Error(`Refusing entry: empty ${label}`);
  if (s.includes('\x00')) throw new Error(`Refusing entry: ${label} contains NUL byte`);
}
Comment on lines +130 to +131
export async function createFile(path: string, options: CreateOptions): Promise<void> {
  await pipeline(Readable.from(create(options)), createWriteStream(path));
Comment on lines +18 to 22
"./file": {
  "types": "./lib/node/file.d.ts",
  "import": "./lib/node/file.js",
  "default": "./lib/node/file.js"
}
Comment on lines +5 to 6
import { createFile, extractFile, listFile } from '../src/node/file.js';
import { TarEntryType } from '../src/types.js';
Comment thread packages/tar-xz/src/node/create.ts Outdated
Comment on lines 160 to 165
// Pipe TAR builder → XZ compressor; yield each compressed chunk as it arrives.
// Node's Readable streams are themselves AsyncIterable, so we can `for await`
// directly without buffering everything in memory.
const xzStream = createXz({ preset });
const outputFile = await fs.open(file, 'w');
const outputStream = outputFile.createWriteStream();

// Process files
const processFiles = async (): Promise<void> => {
  for (const { path: filePath, relativePath, stats } of allFiles) {
    let content: Buffer | null = null;
    let linkTarget: string | undefined;

    if (stats.isSymbolicLink()) {
      linkTarget = await fs.readlink(filePath);
    } else if (stats.isFile()) {
      content = await fs.readFile(filePath);
    }

    await tarPack.addEntry(relativePath, content, stats, linkTarget);
  }

  tarPack.finalize();
  tarPack.push(null); // End the stream
};
Readable.from(buildTar(files, filter)).pipe(xzStream);

R7-5 (M/correctness): create() used Readable.pipe() which doesn't
propagate source errors — fs.readFile failures inside buildTar would
emit unhandled error events instead of rejecting the iteration.
Switched to pipeline() from node:stream/promises with explicit error
forwarding. Regression test added; sanity-check: reverting to pipe()
causes test to time out.
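
The difference R7-5 describes can be sketched as follows, assuming Node's `node:stream/promises`. `failingSource` stands in for `buildTar` and a `PassThrough` stands in for the XZ compressor; both, along with `throughPipeline` and `demo`, are illustrative names rather than the code in `create.ts`.

```typescript
import { PassThrough, Readable } from 'node:stream';
import { pipeline } from 'node:stream/promises';

/** Stand-in for buildTar(): yields one chunk, then fails like a missing source file would. */
async function* failingSource(): AsyncGenerator<Uint8Array> {
  yield new Uint8Array([1, 2, 3]);
  throw new Error('simulated read failure');
}

/**
 * Yield chunks from `source` after they pass through `transform`.
 * pipeline() destroys `transform` when the source errors, so the
 * `for await` below rejects instead of waiting forever.
 */
async function* throughPipeline(
  source: AsyncIterable<Uint8Array>,
  transform: PassThrough
): AsyncGenerator<Uint8Array> {
  const done = pipeline(Readable.from(source), transform);
  // The failure reaches the consumer through the destroyed transform;
  // this handler only keeps the promise from becoming an unhandled rejection.
  done.catch(() => {});
  for await (const chunk of transform) yield chunk as Uint8Array;
  await done;
}

/** Returns 'completed' if iteration ends normally, else the error message. */
async function demo(): Promise<string> {
  try {
    for await (const _chunk of throughPipeline(failingSource(), new PassThrough())) {
      // consume
    }
    return 'completed';
  } catch (err) {
    return (err as Error).message;
  }
}
```

With a plain `.pipe()`, the source error would not destroy the downstream stream, which is consistent with the timeout the commit message reports when the fix is reverted.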

R7-1 (M/correctness, light): ensureSafeName now rejects dot-segment
placeholder names ('.', './', '..') after separator normalization.
Legitimate dotfiles like '.gitignore' are unaffected — regression
guard test included.
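
A sketch of what the extended check might look like, building on the `ensureSafeName` snippet quoted earlier in this review. The exact separator normalization is an assumption here (backslashes to `/`, trailing separators dropped); the actual rule in `node/file.ts` may differ.

```typescript
function ensureSafeName(s: string | undefined, label: string): void {
  if (s === undefined) return;
  if (s.length === 0) throw new Error(`Refusing entry: empty ${label}`);
  if (s.includes('\x00')) throw new Error(`Refusing entry: ${label} contains NUL byte`);

  // Assumed normalization: unify separators, then drop trailing ones.
  const normalized = s.replace(/\\/g, '/').replace(/\/+$/, '');
  if (normalized === '.' || normalized === '..') {
    throw new Error(`Refusing entry: ${label} is a dot-segment placeholder`);
  }
}

// Dotfiles are ordinary names and pass through untouched.
ensureSafeName('.gitignore', 'name');
```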

R7-3 (L/DX): added 'browser: null' to ./file subpath exports so
bundlers fail fast instead of pulling in node:fs silently.

Skipped (deferred to follow-up):
- R7-2: cosmetic param rename (path → archivePath)
- R7-4: public-entry re-export smoke test

Copilot AI left a comment

Pull request overview

This PR delivers the v6 breaking redesign of tar-xz into a universal, stream-first API built around AsyncIterable<Uint8Array>, with Node-only filesystem helpers moved behind the tar-xz/file subpath export. It also rewires nxz-cli and updates docs/demo/tests to match the new model.

Changes:

  • Replace v5 Node/Browser-specific APIs with unified create() / extract() / list() AsyncIterable-based APIs.
  • Add Node-only disk I/O wrappers (createFile / extractFile / listFile) under tar-xz/file and update consumers/tests accordingly.
  • Refresh demo + READMEs and expand Node coverage/security tests for extraction hardening.

Reviewed changes

Copilot reviewed 24 out of 24 changed files in this pull request and generated 7 comments.

| File | Description |
| --- | --- |
| packages/tar-xz/test/node-api.spec.ts | Updates Node tests to validate the new Node file helpers and in-memory stream API. |
| packages/tar-xz/test/coverage.spec.ts | Updates coverage tests for v6 APIs and adds extensive extractor security regression tests. |
| packages/tar-xz/src/types.ts | Introduces unified v6 public types (TarInput, TarSourceFile, streaming TarEntryWithData). |
| packages/tar-xz/src/node/xz-helpers.ts | Adds shared Node helpers to normalize/buffer input and run XZ decompression pipelines. |
| packages/tar-xz/src/node/list.ts | Reworks Node list() to accept stream-shaped input and yield TarEntry via AsyncIterable. |
| packages/tar-xz/src/node/index.ts | Adjusts Node entry exports for v6 (create, extract, list, and TarInputNode). |
| packages/tar-xz/src/node/file.ts | Adds Node-only file helpers (tar-xz/file) with path-safety and TOCTOU hardening. |
| packages/tar-xz/src/node/extract.ts | Reworks Node extract() to accept stream-shaped input and yield TarEntryWithData. |
| packages/tar-xz/src/node/create.ts | Reworks Node create() to return compressed chunks as AsyncIterable<Uint8Array>. |
| packages/tar-xz/src/internal/to-async-iterable.ts | Adds Node input normalization to AsyncIterable<Uint8Array> (TarInputNode). |
| packages/tar-xz/src/internal/to-async-iterable.browser.ts | Adds browser input normalization to AsyncIterable<Uint8Array>. |
| packages/tar-xz/src/index.ts | Updates top-level Node entrypoint exports/types for v6. |
| packages/tar-xz/src/index.browser.ts | Updates browser entrypoint exports/types for v6. |
| packages/tar-xz/src/browser/list.ts | Reworks browser list() to accept TarInput and yield entries as AsyncIterable. |
| packages/tar-xz/src/browser/index.ts | Updates browser entry exports to v6 names. |
| packages/tar-xz/src/browser/extract.ts | Reworks browser extract() to yield TarEntryWithData with bytes()/text(). |
| packages/tar-xz/src/browser/create.ts | Reworks browser create() to return AsyncIterable<Uint8Array> output chunks. |
| packages/tar-xz/package.json | Bumps tar-xz to 6.0.0 and adds the ./file subpath export (Node-only). |
| packages/tar-xz/demo/vite.config.ts | Fixes Vite alias ordering to correctly resolve node-liblzma/wasm. |
| packages/tar-xz/demo/main.ts | Updates the demo UI to use v6 AsyncIterable APIs and new source types. |
| packages/nxz/src/nxz.ts | Rewires CLI tar operations to use tar-xz/file helpers and v6 behavior. |
| packages/nxz/package.json | Bumps nxz-cli to 6.0.0. |
| packages/tar-xz/README.md | Rewrites package README for v6 API + migration guide + Node/file helper docs. |
| README.md | Updates repo-level README snippets to reflect the v6 API and tar-xz/file helpers. |

Comment thread packages/tar-xz/src/node/create.ts Outdated
Comment on lines +19 to +24
/**
* Transform stream that packs files into TAR format
*/
class TarPack extends Transform {
  constructor() {
    super({ objectMode: false });
/**
* Build a single TAR entry (header + content blocks) into an array of Uint8Array chunks.
* Does not write to disk; caller decides what to do with the chunks.
Comment thread packages/tar-xz/src/node/create.ts Outdated
Comment on lines +73 to +75
* Recursively collect all files in a directory
*/
/**
Comment thread packages/tar-xz/src/node/create.ts Outdated
Comment on lines +167 to +169
// R7-5: use pipeline() instead of pipe() so that errors from buildTar
// (e.g. missing source file) propagate and reject the iteration rather than
// hanging or emitting an unhandled error event.
Comment thread packages/tar-xz/src/node/file.ts Outdated
Comment on lines +35 to +41
*/
/**
* Validate a tar entry name or linkname for safety.
*
* Rejects:
* - Empty strings (would cause target === cwd or ambiguous hardlink resolution)
* - Strings containing the NUL byte (U+0000)
Comment on lines +56 to +66
/**
* S3 (TOCTOU guard): Check whether any ancestor directory of `filePath` (up to
* and including `root`) is a symlink. If so, a malicious archive could first
* plant a symlink pointing outside root, then write a file through it.
*
* Returns true if a symlink ancestor is found (caller should reject the entry).
*/
async function hasSymlinkAncestor(filePath: string, root: string): Promise<boolean> {
  // Walk each ancestor from filePath up to (but not including) root.
  let dir = dirname(filePath);
  while (dir !== root && dir.length >= root.length) {
Comment thread packages/tar-xz/test/coverage.spec.ts Outdated
Comment on lines +318 to +319
import { createReadStream } from 'node:fs';

Comment thread packages/tar-xz/README.md
Comment on lines +53 to 60
const archiveStream = create({
  files: [
    { name: 'hello.txt', source: Buffer.from('Hello, world!') },
    { name: 'data.json', source: Buffer.from(JSON.stringify({ ok: true })) },
  ],
  preset: 6, // XZ compression level 0–9 (default: 6)
  filter: (file) => !file.name.endsWith('.tmp'), // optional
});
…EADME

R8-1/R8-2: remove stale 'Transform stream' / 'Recursively collect' comments
in create.ts (implementation is AsyncIterable-based, no Transform class,
no directory recursion).
R8-3: collapse duplicate R7-5 pipeline comment.
R8-4: remove duplicate ensureSafeName JSDoc block.
R8-5: hasSymlinkAncestor JSDoc says 'up to but not including root' to
match the exclusive loop condition.
R8-6: hoist mid-file createReadStream import to top of test file.
R8-7: README browser example uses TextEncoder.encode() instead of
Buffer.from() (Buffer is Node-only).

All Copilot round 8 findings (L-only) addressed. Ready to merge.
@oorabona oorabona merged commit b2c8a8c into master Apr 27, 2026
10 checks passed
@oorabona oorabona deleted the refactor/tar-xz-v6-streams branch April 27, 2026 13:41
oorabona added a commit that referenced this pull request Apr 28, 2026
…try) (#113)

* feat(tar-xz): add streamXz() — Block 1 of TAR-XZ-STREAMING-2026-04-28

Adds the streaming-XZ pipeline foundation that subsequent blocks in this
story will build on. No callers yet — extract.ts and list.ts still use
the buffered helpers; their migration is Block 3 / Block 4.

- streamXz(input: TarInputNode): AsyncIterable<Uint8Array>
  Pipes any TarInputNode through createUnxz() and yields decompressed
  chunks via the Transform's native Symbol.asyncIterator. No internal
  Buffer.concat, no Readable.from() wrap (per spec §12.5).
- @deprecated tags on collectAllChunks / decompressXz / runWritable.
  These remain functional until Blocks 3+4 migrate their callers.
- 9 unit tests in packages/tar-xz/test/xz-helpers.spec.ts (flat layout,
  matching coverage.spec.ts / node-api.spec.ts / tar-format.spec.ts).
  Covers: byte-equality, multi-chunk yield, all four input forms
  (Uint8Array / Buffer / Readable / AsyncIterable), error propagation
  on corrupt input, memory-shape no-accumulation proof, deprecation
  tags. Memory test correctly distinguishes XZ preset-6 dictionary
  (~8 MB working memory, inherent) from accumulation (would grow with
  archive size).

Spec doc TAR-XZ-STREAMING-2026-04-28.md captured in this commit
(adversarial + llm-spec hardened, 5 design decisions locked, §10/§11
review ledgers filled).

108 / 108 tar-xz tests pass; 27 / 27 nxz tests pass; tsc + lint clean.

* feat(tar-xz): streaming parseTar generator + extract/list rewrites

Blocks 2+3+4 of TAR-XZ-STREAMING-2026-04-28. Replaces the buffered
TarUnpack/TarList Writable classes with a coroutine-style AsyncGenerator
parser that yields entries while the XZ stream is still being consumed.

- tar-parser.ts: new ParseEvent discriminated union + parseTar(source, mode)
  AsyncGenerator. State machine: HEADER / CONTENT / SKIP / PADDING.
  Reuses parseNextHeader() unchanged. Removes all four `v8 ignore`
  blocks that were marking the chunk-split / PAX-split paths
  (lines 86-89, 148-153, 169-174 — now hot paths exercised by 8 new
  unit tests in test/tar-parser-stream.spec.ts).
  Adds MAX_PAX_HEADER_BYTES = 1 MB DoS guard (per spec A-07/A-08);
  exceeds → throws Error with code 'TAR_PARSER_INVARIANT' (per
  D-5/L-S-02).
- extract.ts: Writable class replaced by lean async generator with a
  lookahead-buffer pattern that handles cooperative coroutine flow
  without losing parse events. makeTarEntryWithData rewired around a
  pull-callback; bytes()/text() memoize via a shared cachedBytes field
  (D-3). +5 new tests cover S-08 (consumer skips entry.data),
  memoization, single-use data after bytes(), and S-08b (consumer-break
  silent stop).
- list.ts: Writable class replaced by 4-line generator wrapping
  parseTar(xzStream, 'list'). Mode 'list' never yields chunk events
  so memory stays O(BLOCK_SIZE).
- xz-helpers.ts: deleted the deprecated collectAllChunks /
  decompressXz / runWritable. Zero callers verified via
  search_structural before removal.
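
The PAX size guard and its coded error might look roughly like this. `assertPaxSize` and `invariantError` are hypothetical names, and reading "1 MB" as 1024 × 1024 bytes is an assumption; only the constant name and the error code come from the commit message.

```typescript
// Assumption: "1 MB" is taken as 1024 * 1024 bytes.
const MAX_PAX_HEADER_BYTES = 1024 * 1024;

/** Build an Error carrying a machine-readable `code`, in the style of Node's system errors. */
function invariantError(message: string): Error & { code: string } {
  return Object.assign(new Error(message), { code: 'TAR_PARSER_INVARIANT' });
}

/** Hypothetical guard: check the size a PAX header declares before buffering its body. */
function assertPaxSize(declaredSize: number): void {
  if (declaredSize > MAX_PAX_HEADER_BYTES) {
    throw invariantError(
      `PAX header declares ${declaredSize} bytes; limit is ${MAX_PAX_HEADER_BYTES}`
    );
  }
}
```

Callers can then branch on `err.code` rather than matching message text.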

Spec drift flagged + accepted: D-5's "TAR_PARSER_INVARIANT always
re-throws even on consumer-break" cannot be implemented literally
because Biome's noUnsafeFinally rule prohibits `throw` in `finally`.
After consumer-break (parser.return()) the iterator is dead; no
subsequent .next() can observe the corrupted state — surfacing the
invariant via finally would only produce an unhandled rejection.
Resolution: finally swallows cleanup errors on consumer-break,
consistent with D-2 silent-stop semantics. Documented for opus review.

Test counts post-blocks-2-3-4: tar-xz 120, root 489, nxz 27 — all green.
tsc clean, lint clean (rtk proxy biome).

* test(tar-xz): security regression + memory shape gates

Block 5 of TAR-XZ-STREAMING-2026-04-28. Adds the regression-lock that
PR #113 needs to ship — proves the streaming refactor preserves the 18
TOCTOU vectors closed by PR #108 and meets the O(largest entry) memory
target.

- vitest.config.ts: pool='forks' + execArgv=['--expose-gc'] for the
  global.gc handle the memory tests need (Vitest 4 moved execArgv to
  top-level test config, not poolOptions).
- package.json: new test:memory script.
- test/security.spec.ts (new, 22 tests): consolidated TOCTOU coverage
  for V1/R6-1 (leaf symlink), F-001 (traversal), F-002/R3-1 (TOCTOU
  ancestor + ENOENT walk), S3 (per entry-type), R4-2 (hardlink strip),
  R5-1 (hardlink symlink source), S2 (hardlink escape), V6a/V6b
  (NUL/empty name), V12 (setuid mask), V2/V3 (O_NOFOLLOW POSIX —
  skipped with note on win32). Plus S-14 (Win32 policy doc test) and
  S-15a/S-15b (PAX bomb DoS — both safe outcomes verified: truncated
  bomb hits "Unexpected end" before guard, > 1 MB actual hits the
  TAR_PARSER_INVARIANT).
- test/memory-shape.spec.ts (new, 4 tests): in-loop high-water sampling
  per spec §12.3. extract 1 × 50 MB ≤ 116 MB peak; list 100 × 1 MB
  metadata ≤ 16 MB peak; extract 5 × 10 MB ≤ 36 MB peak. Tests SKIP
  cleanly when global.gc is unavailable.
- file.ts: ~17 lines of @security TSDoc on extractFile distinguishing
  POSIX (fd-based, minimal window) from Windows (by-path, wallclock
  window scales with entry size in streaming mode). Replaces the prior
  "OUT OF SCOPE" comment.
- README.md: streaming claim updated (now O(largest entry) as of
  v6.1.0); new "Security model" subsection (25 lines) documenting the
  POSIX/Windows split and recommending exclusive directories on
  Windows until Win32 fd-based extraction lands (separate TODO).

Quality gates: build=0 tsc=0 lint=0 security-test=0 memory-test=0
full-test=0. 661 total tests pass across all packages (33 spec files).

* chore: changeset for tar-xz 6.1.0 streaming refactor

* fix(tar-xz): address PR #113 review findings — round 1

Round-1 findings from opus + Copilot post-restart consolidated review
(3M + 8L). Closes resource leaks, restores BufferEncoding contract on
text(), and bundles 8 polish items to converge in one round.

M-class (correctness):
- streamXz()/parseTar() now propagate cleanup on early termination.
  streamXz wraps the pipeline in finally{ unxzStream.destroy(); await
  pipelinePromise.catch() } — destroying the Transform on consumer-break
  suppresses the resulting pipeline rejection. parseTar wraps its main
  loop in try/finally{ await iter.return?.() } so upstream cleanup
  propagates back through the chain. No more pending unhandled
  rejections after `for await ... break` (F-1).
- entry.bytes() now throws explicitly when entry.data has been iterated
  first: "entry.data already iterated; bytes() cannot recover full
  content; consume one or call bytes() first." Internal flag set on the
  first .next() of the data wrapper. Clean contract; surprising silent
  partial caches eliminated. +2 tests (F-2).
- entry.text(encoding?) reverted to Buffer.from(...).toString(encoding)
  semantics. The TextDecoder rewrite shipped in Block 3 was a breaking
  change vs the previous Buffer.toString() contract — base64, hex,
  latin1 etc. now work again. Tests cover utf8 default + base64 + hex
  (F-3).
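
The cleanup propagation in F-1 relies on standard async-generator behavior: a `break` in `for await` calls `return()` on the generator, which runs its `finally` block. A self-contained sketch follows, with `makeSource` and `stage` as illustrative stand-ins for `streamXz` and `parseTar`.

```typescript
/** Endless source that records when its cleanup runs. */
function makeSource(log: string[]): AsyncGenerator<number> {
  return (async function* () {
    try {
      for (let i = 0; ; i++) yield i;
    } finally {
      log.push('source closed');
    }
  })();
}

/** Stage that iterates manually and forwards early termination upstream. */
async function* stage(source: AsyncIterable<number>, log: string[]): AsyncGenerator<number> {
  const iter = source[Symbol.asyncIterator]();
  try {
    for (;;) {
      const { value, done } = await iter.next();
      if (done) return;
      yield value * 2;
    }
  } finally {
    // Runs on normal end, on throw, and when the consumer breaks.
    await iter.return?.();
    log.push('stage closed');
  }
}

async function demo(): Promise<string[]> {
  const log: string[] = [];
  for await (const v of stage(makeSource(log), log)) {
    if (v >= 4) break; // early exit after 0, 2, 4
  }
  return log; // ['source closed', 'stage closed']
}
```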

L-class (polish, doc, layout):
- Concurrent entry.data iteration guard (boolean dataGenInFlight) —
  spec §2.4 forbids it; runtime now enforces (F-4).
- Stray-chunk silent fallthrough in extract() outer loop now throws
  err.code='TAR_PARSER_INVARIANT' instead of consuming-and-continuing
  (F-5).
- memory-shape Test 1 raised to preset:6 to match the 16 MB slack
  budget rationale (XZ preset-6 dictionary ~8 MB). Threshold unchanged
  (F-6).
- New packages/tar-xz/vitest.memory.config.ts (pool: 'forks',
  --expose-gc) keeps the default vitest config on threads. test:memory
  script updated to use the dedicated config. Default `pnpm test`
  runtime no longer regressed by forks pool (F-7).
- xz-helpers.spec.ts:146 "64 KB = 655,360" → "64 KB = 65,536" math fix
  (F-8).
- README streaming claim reworded: "v6.0.0 introduced the stream-first
  API contract; v6.1.0 delivers the planned optimization that fulfills
  it." (F-9).
- Changeset bullet 4 softened: removed "previously the buffered model
  could surface the error" overclaim; replaced with accurate JS
  AsyncGenerator convention reference (F-10).
- docs/plans/TAR-XZ-STREAMING-2026-04-28.md duplicate `## §12 Locked
  Design Decisions` header collapsed to one (F-11).

Tests: 150+3-skip pass; memory 3+1-skip pass. tsc + lint + build green.

* fix(tar-xz): lazy streamXz pipeline + alloc-once bytes() (PR #113)

Round-2 Copilot findings on PR #113 (1M + 1L):

- streamXz() now creates the createUnxz() Transform and the pipeline()
  call INSIDE the async generator body. Pipeline starts only on the
  first .next() — truly lazy. Consumers that call streamXz(input) but
  never iterate (e.g. early break before any read) trigger zero I/O
  and zero resources. T-07 proves it via a counting Readable mock
  (readCount=0 after 20ms when not iterated). The cleanup `finally{
  unxzStream.destroy(); await pipelinePromise.catch(...) }` is reached
  on every termination path (CR2-1).
- bytes() in makeTarEntryWithData() now allocates `new Uint8Array(
  entry.size)` once and `set()`s each chunk at running offset.
  Halves peak memory for large entries vs the old chunks-array +
  concat. entry.size===0 short-circuits to Uint8Array(0). Defensive
  overrun guard throws Error with code='TAR_PARSER_INVARIANT' if
  offset+chunk would exceed entry.size — mathematically unreachable
  given parseTar's Math.min(bytesRemaining,…) but locks the
  invariant (CR2-2).
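
The alloc-once strategy from CR2-2 can be illustrated in isolation. `collectExact` and `fromParts` are hypothetical names, and the sketch does not handle a short read (fewer bytes than `size`), which the commit message does not discuss.

```typescript
/** Copy chunks into one preallocated buffer of exactly `size` bytes. */
async function collectExact(
  size: number,
  chunks: AsyncIterable<Uint8Array>
): Promise<Uint8Array> {
  if (size === 0) return new Uint8Array(0);
  const out = new Uint8Array(size); // single allocation, no chunks array + concat
  let offset = 0;
  for await (const chunk of chunks) {
    if (offset + chunk.length > size) {
      throw Object.assign(new Error('entry data exceeds declared size'), {
        code: 'TAR_PARSER_INVARIANT',
      });
    }
    out.set(chunk, offset);
    offset += chunk.length;
  }
  return out;
}

/** Helper for the example: turn an array of chunks into an AsyncIterable. */
async function* fromParts(parts: Uint8Array[]): AsyncGenerator<Uint8Array> {
  for (const p of parts) yield p;
}
```

The chunks-array approach briefly holds both the collected chunks and the concatenated copy, which is where the roughly halved peak comes from.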

Pre-push opus senior review (§2.10 round-3 gate): SAFE-TO-PUSH —
lazy semantics confirmed, cleanup paths sound, T-07 race-free with
positive control. Zero new findings.

151 tar-xz tests pass + 489 full suite + 3 memory-shape pass.
tsc + lint + build all green.

* fix(tar-xz): retype dataWrapper as AsyncIterable to satisfy strict tsc

CI build failed on `packages/tar-xz` with TS2741: TypeScript 6's
`lib.esnext` adds `[Symbol.asyncDispose]` to `AsyncGenerator`, and the
fix-round-1 dataWrapper proxy didn't implement it. The wrapper only
needs to satisfy `AsyncIterable<Uint8Array>` — that is the public type
on `TarEntryWithData.data`. `AsyncGenerator` was over-typed.

- Restructured the wrapper into a pure AsyncIterable whose
  `[Symbol.asyncIterator]()` sets `dataIterStarted = true` and returns
  the underlying `dataGen` directly. No more next/return/throw on the
  literal — strict mode rejected those without explicit method types,
  and they were redundant once the wrapper just delegates.
- Surface is strictly smaller and matches the public type contract
  exactly. No behavior change: bytes()-after-iter still throws because
  the flag is still set on first `[Symbol.asyncIterator]()` call (which
  fires the moment the consumer starts a `for await`).
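
A reduced sketch of the wrapper shape described above, assuming only what the commit message states: `data` is a plain `AsyncIterable`, the flag flips inside `[Symbol.asyncIterator]()`, and `bytes()` is memoized. `makeEntry` and `EntryLike` are illustrative names, not the real `makeTarEntryWithData`.

```typescript
interface EntryLike {
  data: AsyncIterable<Uint8Array>;
  bytes(): Promise<Uint8Array>;
}

function makeEntry(source: () => AsyncGenerator<Uint8Array>): EntryLike {
  let dataIterStarted = false;
  let cachedBytes: Uint8Array | undefined;
  const dataGen = source();

  // Pure AsyncIterable: no next/return/throw on the literal itself.
  const data: AsyncIterable<Uint8Array> = {
    [Symbol.asyncIterator]() {
      dataIterStarted = true; // set as soon as a `for await` begins
      return dataGen;
    },
  };

  return {
    data,
    async bytes() {
      if (cachedBytes) return cachedBytes;
      if (dataIterStarted) {
        throw new Error('entry.data already iterated; bytes() cannot recover full content');
      }
      const chunks: Uint8Array[] = [];
      let total = 0;
      for await (const c of dataGen) {
        chunks.push(c);
        total += c.length;
      }
      const out = new Uint8Array(total);
      let off = 0;
      for (const c of chunks) {
        out.set(c, off);
        off += c.length;
      }
      cachedBytes = out;
      return out;
    },
  };
}
```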

All gates green: tar-xz build, tsc, vitest (151 + 3 skipped), root
build, full suite, biome lint.
oorabona added a commit that referenced this pull request Apr 30, 2026
Future workspace releases (`tar-xz`, `nxz-cli`) now auto-scope their
CHANGELOG entries to commits since the LAST per-package release commit
(matching `chore(<pkg>): release v*`), instead of since the last
GPG-signed root tag.

Eliminates the need for post-release manual CHANGELOG curation that
nxz-cli@6.1.0 required to drop entries from #108 (already in 6.0.0),
#115 (biome refactor body fragments), and `adfbc99` (changesets adoption,
already removed). Empirical dry-run from `packages/nxz/` with
`GIT_CHANGELOG_PATH=.` produces zero diff: the new `resolveSinceBaseline`
correctly stops at `ecff028 chore(nxz-cli): release v6.1.0`.

The new `GIT_CHANGELOG_SINCE` env var also provides an explicit override
escape hatch for projects whose release commits do not follow the
`chore(<pkg>): release v*` pattern.
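
The baseline selection can be pictured as a search over commit subjects. This is a hypothetical sketch: only the name `resolveSinceBaseline`, the subject pattern, and the `GIT_CHANGELOG_SINCE` override come from the message above; the signature, the `Commit` shape, and the newest-first ordering are assumptions.

```typescript
interface Commit {
  hash: string;
  subject: string;
}

/**
 * Pick the changelog baseline for `pkg`.
 * `commits` is assumed to be newest-first, as `git log` prints them.
 */
function resolveSinceBaseline(
  commits: Commit[],
  pkg: string,
  override?: string
): string | undefined {
  if (override) return override; // stands in for GIT_CHANGELOG_SINCE
  const prefix = `chore(${pkg}): release v`;
  return commits.find((c) => c.subject.startsWith(prefix))?.hash;
}
```

Returning `undefined` when no release commit matches leaves the fallback policy to the caller.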