Skip to content

feat(wasm): wire memlimit option through unxzAsync/unxz#111

Merged
oorabona merged 4 commits into
masterfrom
feat/wasm-memlimit
Apr 28, 2026
Merged

feat(wasm): wire memlimit option through unxzAsync/unxz#111
oorabona merged 4 commits into
masterfrom
feat/wasm-memlimit

Conversation

@oorabona
Copy link
Copy Markdown
Owner

Summary

Closes the silent-ignore footgun where LZMAOptions.memlimit was advertised but never read by the WASM decoder. Callers can now actually constrain decoder memory:

await unxzAsync(buf, { memlimit: 64 * 1024 * 1024 }); // 64 MiB
// → throws LZMAMemoryLimitError if archive needs more

Changes

  • src/types.ts — add memlimit?: number | bigint to LZMAOptions (units: bytes; default 256 MiB; coerced to bigint for the WASM C ABI)
  • src/wasm/decompress.ts — drop _opts underscore; pass opts?.memlimit to streamBufferDecode (already accepted it). Both promise + callback variants wired.
  • src/lzma.tsXzStream._opts switched to Omit<Required<LZMAOptions>, 'memlimit'> & Pick<LZMAOptions, 'memlimit'> so adding the new optional field doesn't break the existing Required<> constraint elsewhere
  • test/wasm/decompress-memlimit.test.ts — 8 new tests (number + bigint inputs, callback path, default behaviour, MEMLIMIT_ERROR surface)
  • .changeset/wasm-memlimit-option.md — minor bump for node-liblzma

Native parity (out of scope)

The N-API binding (InitializeDecoder in src/bindings/node-liblzma.cpp) still hardcodes UINT64_MAX. Adding native support is tracked separately as a MEDIUM TODO — keeping this PR tight.

Test plan

  • pnpm build — exit 0
  • pnpm type-check — exit 0
  • pnpm exec vitest run test/wasm/ — all WASM tests pass (incl. 8 new)
  • pnpm test — 580/580 pass
  • pnpm lint — biome currently OOM-crashes on --version itself (pre-existing infra issue, unrelated to this diff; verified against master)

LZMAOptions previously had no memlimit field, so WASM decoders silently
ignored any caller-provided limit (TODO at src/wasm/decompress.ts:26).

- Add `memlimit?: number | bigint` to LZMAOptions with full TSDoc
- Thread through unxzAsync + callback variant to streamBufferDecode
- Adjust XzStream._opts type to keep memlimit optional in the stream
  class (Required<LZMAOptions> would break otherwise)
- 8 new tests in test/wasm/decompress-memlimit.test.ts covering both
  number and bigint inputs, sync/callback variants, and the default

Native parity: the N-API binding (InitializeDecoder) still hardcodes
UINT64_MAX and ignores opts.memlimit. Tracked as a separate MEDIUM
TODO; out of scope here to keep the change tight.
Copilot AI review requested due to automatic review settings April 28, 2026 13:26
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Wires the memlimit decompression option through the WASM unxzAsync/unxz APIs so callers can enforce decoder memory limits (surfacing LZMAMemoryLimitError on LZMA_MEMLIMIT_ERROR).

Changes:

  • Add memlimit?: number | bigint to LZMAOptions with WASM-focused TSDoc.
  • Pass opts?.memlimit into the WASM streamBufferDecode path for unxzAsync (and thus unxz).
  • Add a new WASM test suite covering number/bigint inputs, default behavior, and callback behavior; add a changeset + update TODO tracking.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
src/types.ts Introduces memlimit option on LZMAOptions with documentation and parity notes.
src/wasm/decompress.ts Threads memlimit into WASM buffer decode via unxzAsync (and callback wrapper).
src/lzma.ts Adjusts internal _opts typing to keep memlimit optional without breaking Required<> usage.
test/wasm/decompress-memlimit.test.ts Adds tests validating memlimit error/success cases for Promise + callback APIs.
TODO.md Updates tracking items/status for memlimit wiring and related native follow-up.
.changeset/wasm-memlimit-option.md Declares a minor bump documenting the new WASM memlimit option and behavior.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread test/wasm/decompress-memlimit.test.ts Outdated
Comment on lines +78 to +103
makeFixture().then(({ compressed }) => {
unxz(compressed, { memlimit: 1024 }, (err) => {
try {
expect(err).toBeInstanceOf(LZMAMemoryLimitError);
expect((err as LZMAMemoryLimitError).code).toBe(LZMA_MEMLIMIT_ERROR);
resolve();
} catch (e) {
reject(e);
}
});
});
}));

it('succeeds via callback with sufficient memlimit', () =>
new Promise<void>((resolve, reject) => {
makeFixture().then(({ original, compressed }) => {
unxz(compressed, { memlimit: 256 * 1024 * 1024 }, (err, result) => {
try {
expect(err).toBeNull();
expect(result).toBeDefined();
resolve();
} catch (e) {
reject(e);
}
});
});
Copy link

Copilot AI Apr 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Promise wrapper can hang if makeFixture() rejects (e.g., WASM init/encode failure): the rejection isn’t forwarded to reject, leaving the outer Promise pending and the test timing out. Please catch(reject) on makeFixture() (or rewrite the test as async and await makeFixture() inside the Promise).

Suggested change
makeFixture().then(({ compressed }) => {
unxz(compressed, { memlimit: 1024 }, (err) => {
try {
expect(err).toBeInstanceOf(LZMAMemoryLimitError);
expect((err as LZMAMemoryLimitError).code).toBe(LZMA_MEMLIMIT_ERROR);
resolve();
} catch (e) {
reject(e);
}
});
});
}));
it('succeeds via callback with sufficient memlimit', () =>
new Promise<void>((resolve, reject) => {
makeFixture().then(({ original, compressed }) => {
unxz(compressed, { memlimit: 256 * 1024 * 1024 }, (err, result) => {
try {
expect(err).toBeNull();
expect(result).toBeDefined();
resolve();
} catch (e) {
reject(e);
}
});
});
makeFixture()
.then(({ compressed }) => {
unxz(compressed, { memlimit: 1024 }, (err) => {
try {
expect(err).toBeInstanceOf(LZMAMemoryLimitError);
expect((err as LZMAMemoryLimitError).code).toBe(LZMA_MEMLIMIT_ERROR);
resolve();
} catch (e) {
reject(e);
}
});
})
.catch(reject);
}));
it('succeeds via callback with sufficient memlimit', () =>
new Promise<void>((resolve, reject) => {
makeFixture()
.then(({ original, compressed }) => {
unxz(compressed, { memlimit: 256 * 1024 * 1024 }, (err, result) => {
try {
expect(err).toBeNull();
expect(result).toBeDefined();
resolve();
} catch (e) {
reject(e);
}
});
})
.catch(reject);

Copilot uses AI. Check for mistakes.
Comment thread test/wasm/decompress-memlimit.test.ts Outdated
Comment on lines +76 to +104
it('passes LZMAMemoryLimitError to callback when memlimit too small', () =>
new Promise<void>((resolve, reject) => {
makeFixture().then(({ compressed }) => {
unxz(compressed, { memlimit: 1024 }, (err) => {
try {
expect(err).toBeInstanceOf(LZMAMemoryLimitError);
expect((err as LZMAMemoryLimitError).code).toBe(LZMA_MEMLIMIT_ERROR);
resolve();
} catch (e) {
reject(e);
}
});
});
}));

it('succeeds via callback with sufficient memlimit', () =>
new Promise<void>((resolve, reject) => {
makeFixture().then(({ original, compressed }) => {
unxz(compressed, { memlimit: 256 * 1024 * 1024 }, (err, result) => {
try {
expect(err).toBeNull();
expect(result).toBeDefined();
resolve();
} catch (e) {
reject(e);
}
});
});
}));
Copy link

Copilot AI Apr 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same issue here: makeFixture().then(...) doesn’t handle rejection, so a failure in fixture creation will leave the returned Promise pending and the test will hang. Forward errors via .catch(reject) or use an async test and await makeFixture() before calling unxz.

Suggested change
it('passes LZMAMemoryLimitError to callback when memlimit too small', () =>
new Promise<void>((resolve, reject) => {
makeFixture().then(({ compressed }) => {
unxz(compressed, { memlimit: 1024 }, (err) => {
try {
expect(err).toBeInstanceOf(LZMAMemoryLimitError);
expect((err as LZMAMemoryLimitError).code).toBe(LZMA_MEMLIMIT_ERROR);
resolve();
} catch (e) {
reject(e);
}
});
});
}));
it('succeeds via callback with sufficient memlimit', () =>
new Promise<void>((resolve, reject) => {
makeFixture().then(({ original, compressed }) => {
unxz(compressed, { memlimit: 256 * 1024 * 1024 }, (err, result) => {
try {
expect(err).toBeNull();
expect(result).toBeDefined();
resolve();
} catch (e) {
reject(e);
}
});
});
}));
it('passes LZMAMemoryLimitError to callback when memlimit too small', async () => {
const { compressed } = await makeFixture();
await new Promise<void>((resolve, reject) => {
unxz(compressed, { memlimit: 1024 }, (err) => {
try {
expect(err).toBeInstanceOf(LZMAMemoryLimitError);
expect((err as LZMAMemoryLimitError).code).toBe(LZMA_MEMLIMIT_ERROR);
resolve();
} catch (e) {
reject(e);
}
});
});
});
it('succeeds via callback with sufficient memlimit', async () => {
const { original, compressed } = await makeFixture();
await new Promise<void>((resolve, reject) => {
unxz(compressed, { memlimit: 256 * 1024 * 1024 }, (err, result) => {
try {
expect(err).toBeNull();
expect(result).toBeDefined();
resolve();
} catch (e) {
reject(e);
}
});
});
});

Copilot uses AI. Check for mistakes.
Comment thread test/wasm/decompress-memlimit.test.ts Outdated
unxz(compressed, { memlimit: 256 * 1024 * 1024 }, (err, result) => {
try {
expect(err).toBeNull();
expect(result).toBeDefined();
Copy link

Copilot AI Apr 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the “succeeds via callback” case, the test only asserts result is defined, so it would still pass if decompression returns the wrong bytes. Consider asserting result decodes/equates to original (as in other callback tests) to actually validate successful decompression semantics.

Suggested change
expect(result).toBeDefined();
expect(result).toEqual(original);

Copilot uses AI. Check for mistakes.
Comment thread src/lzma.ts Outdated
Comment on lines +347 to +348
// memlimit is WASM-only; native binding ignores it (hardcodes UINT64_MAX).
// Kept in Required<LZMAOptions> for type completeness; default undefined = no limit.
Copy link

Copilot AI Apr 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The inline comment is now inaccurate/misleading: _opts is no longer Required<LZMAOptions> (it explicitly omits memlimit from Required<>), and “default undefined = no limit” conflates native vs WASM behavior. Suggest updating the comment to reflect that memlimit is currently only honored by the WASM decoder (defaulting to 256 MiB when omitted) and ignored by the native binding.

Suggested change
// memlimit is WASM-only; native binding ignores it (hardcodes UINT64_MAX).
// Kept in Required<LZMAOptions> for type completeness; default undefined = no limit.
// memlimit is currently only honored by the WASM decoder.
// When omitted there, it defaults to 256 MiB; the native binding ignores it.

Copilot uses AI. Check for mistakes.
Comment thread src/types.ts Outdated
Comment on lines +63 to +73
* Default: `BigInt(256 * 1024 * 1024)` (256 MiB).
*
* When the compressed stream requires more memory than this limit,
* decompression throws `LZMAMemoryLimitError` with
* `code === LZMA_MEMLIMIT_ERROR` (numeric constant `6`, re-exported from
* `src/errors.ts`).
*
* **Parity note (WASM only for now):** The native Node.js binding
* (`src/bindings/node-liblzma.cpp`) currently hardcodes `UINT64_MAX`
* and silently ignores this option. Native support is tracked in
* TODO.md: "[Native] Wire memlimit in src/bindings/node-liblzma.cpp".
Copy link

Copilot AI Apr 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The TSDoc says the default memlimit is 256 MiB, but the same block also notes native currently ignores the option (effectively making the “default” different on native vs WASM). To avoid confusing Node consumers, consider wording this as “WASM default: 256 MiB; native: ignored/unlimited until wired” (or similar).

Copilot uses AI. Check for mistakes.
Applies opus + Copilot review findings (4M + 4S):

- Validate `number` memlimit before BigInt() coercion. Reject NaN,
  Infinity, fractional, and negative values with LZMAOptionsError
  instead of letting RangeError leak from BigInt() (F-001).
- Replace `Omit<Required<LZMAOptions>, 'memlimit'> & Pick<...>`
  smell on XzStream._opts with an internal ResolvedLZMAOptions
  interface that lists the genuinely-optional memlimit explicitly
  (F-002).
- Rewrite callback-variant memlimit tests as async so makeFixture()
  rejections propagate to vitest instead of hanging the Promise
  (C-001 / C-002).
- Assert byte-equality vs original buffer in callback success test
  (was only `toBeDefined()`) (C-003).
- Rework memlimit TSDoc: WASM/native asymmetry warning moved to top,
  Default split into "WASM default: 256 MiB" and "Native: ignored"
  (F-003 / C-005).
- Refresh stale "Required<LZMAOptions>" comment in lzma.ts to match
  the new ResolvedLZMAOptions type (F-004 / C-004).
- Fix fixture comment "~4 MB" -> "~8 MiB" (preset 6 dict size) and
  explain why the AC holds regardless of exact size (F-005).
- Document the default-path test coverage gap in a 3-line code
  comment — proves no-throw, not that default IS specifically
  256 MiB (F-006).

584/584 tests pass (12 memlimit, +4 vs round 1).
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 5 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/types.ts Outdated
/**
* Memory usage limit for decompression, in bytes.
*
* **Honored only by `xzAsync`/`unxz`/`unxzAsync` (WASM Buffer API).**
Copy link

Copilot AI Apr 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The TSDoc says memlimit is honored by xzAsync, but src/wasm/compress.ts doesn’t read opts.memlimit (it only uses preset/check). This is a documentation mismatch that may mislead callers—please remove xzAsync from the “honored only by …” list (or actually wire memlimit into compression if that’s intended).

Suggested change
* **Honored only by `xzAsync`/`unxz`/`unxzAsync` (WASM Buffer API).**
* **Honored only by `unxz`/`unxzAsync` (WASM Buffer API).**

Copilot uses AI. Check for mistakes.
Comment thread src/lzma.ts Outdated
* All fields are required (defaults applied in constructor) EXCEPT memlimit,
* which is genuinely optional: the native binding ignores it (UINT64_MAX
* hardcoded; see TODO "[Native] Wire memlimit in src/bindings/node-liblzma.cpp").
* Only the WASM Buffer API (xzAsync/unxz/unxzAsync) honours memlimit.
Copy link

Copilot AI Apr 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This internal comment claims the WASM Buffer API xzAsync/unxz/unxzAsync honors memlimit, but xzAsync (compression) doesn’t use a decoder memlimit. Consider updating the comment to mention only the decompression entry points that actually pass memlimit through (e.g., unxz/unxzAsync/streamBufferDecode).

Suggested change
* Only the WASM Buffer API (xzAsync/unxz/unxzAsync) honours memlimit.
* Only the WASM decompression Buffer APIs (for example, unxz/unxzAsync and
* streamBufferDecode) honour memlimit.

Copilot uses AI. Check for mistakes.
Comment thread src/wasm/bindings.ts
Comment on lines +260 to +268
if (memlimit < 0n) {
throw new LZMAOptionsError(8, 'memlimit must be a non-negative value');
}
return;
}
if (!Number.isFinite(memlimit)) {
throw new LZMAOptionsError(
8,
'memlimit must be a finite number (NaN and Infinity are not allowed)'
Copy link

Copilot AI Apr 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

validateMemlimit hardcodes errno 8 when throwing LZMAOptionsError. Since src/errors.ts already exports LZMA_OPTIONS_ERROR, using the constant here would avoid a magic number and keep this aligned if the mapping ever changes.

Copilot uses AI. Check for mistakes.
Comment thread src/wasm/bindings.ts
Comment on lines +265 to +276
if (!Number.isFinite(memlimit)) {
throw new LZMAOptionsError(
8,
'memlimit must be a finite number (NaN and Infinity are not allowed)'
);
}
if (!Number.isInteger(memlimit)) {
throw new LZMAOptionsError(
8,
'memlimit must be an integer (fractional values are not allowed)'
);
}
Copy link

Copilot AI Apr 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

validateMemlimit accepts any finite integer number, but values above Number.MAX_SAFE_INTEGER can’t be represented precisely and may be coerced to an unintended BigInt. Consider either rejecting non-safe integers (and requiring callers to use bigint for large limits) or explicitly documenting the safe-integer requirement/precision caveat.

Copilot uses AI. Check for mistakes.
Comment thread src/wasm/bindings.ts Outdated
Comment on lines 248 to 300
@@ -261,6 +296,7 @@ export function streamBufferDecode(
let outPtr = wasmAlloc(module, outSize);

try {
validateMemlimit(memlimit);
const limit = typeof memlimit === 'number' ? BigInt(memlimit) : memlimit;
Copy link

Copilot AI Apr 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

validateMemlimit is applied in streamBufferDecode, but the exported decoderInit/autoDecoderInit functions in this same module still coerce memlimit with BigInt(...) directly. That leaves those call sites able to throw a raw RangeError (NaN/Infinity/fractional) and still permits negative values to wrap at the C ABI boundary. Consider reusing validateMemlimit in the other decoder init helpers as well (or narrowing their parameter types if they’re not meant to accept arbitrary user input).

Copilot uses AI. Check for mistakes.
5 findings from Copilot review of round-1 fixes (3M + 1S + 1L):

- Remove `xzAsync` from `LZMAOptions.memlimit` honored-by list and
  refresh internal lzma.ts comment — xzAsync is compression-only and
  doesn't read memlimit (C-2-001 / C-2-002).
- Lift `validateMemlimit` to run as the first statement of every
  decoder ingestion path: `decoderInit`, `autoDecoderInit`, and
  `streamBufferDecode` — round 1 only guarded `streamBufferDecode`,
  leaving the streaming decoder entry points able to throw raw
  RangeError on NaN/Infinity/fractional or accept negative values
  (C-2-005). +12 tests prove rejection on each path.
- Reject `number > Number.MAX_SAFE_INTEGER` in `validateMemlimit` —
  values above 2^53-1 lose precision when coerced to bigint and
  produce an unintended limit. TSDoc directs callers to use bigint
  for large limits (C-2-004). +4 tests cover both sides of the
  boundary.
- Replace hardcoded `errno: 8` with `LZMA_OPTIONS_ERROR` constant
  imported from `src/errors.ts` (C-2-003).

Pre-push opus senior review (§2.10 round-3 gate): SAFE-TO-PUSH —
all five claims verified against working tree, zero new S/M/L.

600/600 tests pass (16 added vs round 1).

Polish item deferred to TODO.md (LOW): bigint branch lacks UINT64_MAX
upper-bound guard; benign while native ignores memlimit; becomes
load-bearing once native parity is wired.
@oorabona oorabona requested a review from Copilot April 28, 2026 14:03
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 4 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/wasm/bindings.ts Outdated
Comment on lines +119 to +125
* @throws LZMAError on initialization failure
*/
/**
* Initialize a stream decoder on the given lzma_stream.
*
* @param stream - Allocated WasmLzmaStream
* @param memlimit - Memory limit in bytes (default: 256MB)
Copy link

Copilot AI Apr 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are two consecutive JSDoc blocks above decoderInit. This duplicates rendered docs and makes it unclear which @throws list is authoritative. Please remove the redundant block and keep a single combined JSDoc that includes both LZMAOptionsError and LZMAError throws.

Suggested change
* @throws LZMAError on initialization failure
*/
/**
* Initialize a stream decoder on the given lzma_stream.
*
* @param stream - Allocated WasmLzmaStream
* @param memlimit - Memory limit in bytes (default: 256MB)

Copilot uses AI. Check for mistakes.
Comment thread src/wasm/bindings.ts Outdated
Comment on lines +144 to +152
* @throws LZMAError on initialization failure
*/
/**
* Initialize an auto decoder (detects format) on the given lzma_stream.
Copy link

Copilot AI Apr 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are two consecutive JSDoc blocks above autoDecoderInit. Please remove the duplicate and keep a single doc comment that includes the LZMAOptionsError validation behavior, to avoid duplicated/contradictory generated docs.

Copilot uses AI. Check for mistakes.
Comment thread src/wasm/bindings.ts Outdated
Comment on lines +271 to +280
* interpreted by the C ABI (uint64_t wrap-around). Both cases are rejected
* here with LZMAOptionsError so callers always get an LZMAError subclass.
*
* @throws LZMAOptionsError if the value is invalid
*/
/**
* Validate a memlimit value before coercion to BigInt.
*
* BigInt() throws a native RangeError for NaN, Infinity, and non-integer
* numbers (e.g. 1.5). Negative integers produce a huge unsigned value when
Copy link

Copilot AI Apr 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

validateMemlimit is preceded by two back-to-back JSDoc blocks describing the same function. Please keep only one (the more complete one mentioning MAX_SAFE_INTEGER) to avoid duplicated docs and reduce noise.

Suggested change
* interpreted by the C ABI (uint64_t wrap-around). Both cases are rejected
* here with LZMAOptionsError so callers always get an LZMAError subclass.
*
* @throws LZMAOptionsError if the value is invalid
*/
/**
* Validate a memlimit value before coercion to BigInt.
*
* BigInt() throws a native RangeError for NaN, Infinity, and non-integer
* numbers (e.g. 1.5). Negative integers produce a huge unsigned value when

Copilot uses AI. Check for mistakes.
Comment thread src/lzma.ts Outdated
flushFlag: opts.flushFlag ?? liblzma.LZMA_RUN,
// memlimit is genuinely optional in ResolvedLZMAOptions: the native binding ignores it
// (UINT64_MAX hardcoded; see TODO "[Native] Wire memlimit in src/bindings/node-liblzma.cpp").
// Only the WASM Buffer API (xzAsync/unxz/unxzAsync) honours this field.
Copy link

Copilot AI Apr 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment says "Only the WASM Buffer API (xzAsync/unxz/unxzAsync) honours this field", but xzAsync is compression-only and does not read memlimit (as also stated a few lines above). Please update this comment to mention only the decompression paths (e.g. unxz/unxzAsync/streamBufferDecode) so it doesn't mislead readers.

Suggested change
// Only the WASM Buffer API (xzAsync/unxz/unxzAsync) honours this field.
// Only the WASM decompression paths (unxz/unxzAsync/streamBufferDecode) honour this field.

Copilot uses AI. Check for mistakes.
Round-3 Copilot doc-only findings (3L + 1S):

- Remove three duplicate JSDoc blocks from `decoderInit`,
  `autoDecoderInit`, and `validateMemlimit`. Earlier rounds
  used `insert_symbol` instead of `write_symbol`, leaving the
  original short doc next to the new validation-aware block.
  Keep the complete block in each case (C-3-001/2/3).
- Fix a missed-spot stale comment in `src/lzma.ts:370` still
  listing `xzAsync` as honouring memlimit. `xzAsync` is
  compression-only; the round-2 cleanup at line 311 missed
  this second occurrence (C-3-004).

Sweep confirms no other "xzAsync honours memlimit" claims
remain across `src/lzma.ts`, `src/wasm/bindings.ts`, and
`src/types.ts`.

600+ tests pass, tsc clean. No behaviour change.
@oorabona oorabona merged commit 6e2bc09 into master Apr 28, 2026
20 checks passed
@oorabona oorabona deleted the feat/wasm-memlimit branch April 28, 2026 14:27
oorabona added a commit that referenced this pull request Apr 28, 2026
* feat(native): wire memlimit through N-API decoder

Closes the WASM↔native asymmetry that PR #111 advertised in TSDoc but
left unresolved on the native side. `Unxz`, `createUnxz`, and the
`unxz`/`unxzAsync` Buffer APIs now all honor `LZMAOptions.memlimit`.

- N-API `InitializeDecoder` (src/bindings/node-liblzma.{cpp,hpp})
  takes a new `const Napi::Object& opts` parameter, reads
  `opts.memlimit`, and passes it to `lzma_stream_decoder` instead of
  the hardcoded `UINT64_MAX`. BigInt and Number inputs both supported;
  lossless conversion is checked.
- JS-side `validateMemlimit` lifted out of the WASM module and called
  in the `XzStream` constructor at decode-mode entry, so the same
  guard runs whether the decode path ends up in native or WASM.
  NaN/Infinity/fractional/negative/`> MAX_SAFE_INTEGER` reject
  synchronously with `LZMAOptionsError` (errno = `LZMA_OPTIONS_ERROR`).
- `validateMemlimit` also rejects `bigint > UINT64_MAX` (2^64 - 1)
  to prevent silent BigInt→uint64 truncation now that the bound
  matters. Closes the LOW-priority symmetry item left open by PR
  #111.
- TSDoc on `LZMAOptions.memlimit` no longer carries the "WASM only"
  caveat. Default behavior unchanged (no memlimit → no limit on
  native, 256 MiB on WASM Buffer API).

14 new tests in `test/native/memlimit.test.ts` cover all decoder
entry points (Unxz stream, createUnxz factory, and validation paths).
Full suite: 614/614.

* fix(native): address review findings on memlimit (PR #112)

Round-1 findings from opus + Copilot (3M + 2S + 2L):

- N-API InitializeDecoder no longer silently widens to UINT64_MAX
  on lossless=false (BigInt) or out-of-range / non-finite /
  non-integer / negative Number. Throws Napi::TypeError instead,
  giving direct-N-API consumers the same contract the JS-side
  validateMemlimit guard provides (F-3 / C-1 / C-2).
- validateMemlimit's UINT64_MAX overflow message is now
  context-neutral — explicitly notes that "no limit" semantics
  are native-only and that WASM buffer APIs default to 256 MiB
  when omitted (C-3).
- LZMAOptions.memlimit TSDoc no longer claims values are coerced
  to bigint before reaching the C ABI — accurate only for WASM,
  not for the native binding which accepts both types via N-API
  directly (C-6).
- ResolvedLZMAOptions.memlimit internal TSDoc refreshed; no
  longer states the field is "WASM only" (F-1).
- New `new Xz({ memlimit: NaN })` test locks the encoder-ignores-
  memlimit contract — encoder mode silently passes through
  invalid values because validateMemlimit is gated on STREAM_DECODE
  (GAP).
- Encoder forwarding kept with explanatory comment (F-2 cosmetic
  resolved without code change).
- Changeset wording cleaned: drop "below" (C-4) and replace
  "lifted out" with "now exported from src/wasm/bindings.ts" (C-5).

15/15 native memlimit tests pass; full 614-test suite green;
node-gyp rebuild clean; tsc + lint clean.

* fix(native): defense-in-depth memlimit validation in N-API binding

Round-2 Copilot findings on PR #112 (2M, both defense-in-depth):

- Number branch now rejects `d > Number.MAX_SAFE_INTEGER`
  (= 2^53 - 1 = 9007199254740991) with a Napi::TypeError instead
  of silently accepting precision-lost values. Mirrors the JS-side
  validateMemlimit guard (C-2-001).
- New `else` branch throws Napi::TypeError when `opts.memlimit` is
  present but is neither a BigInt nor a Number (string, object,
  array). Other options in the same file already type-validate;
  memlimit no longer the odd one out (C-2-002).

Both guards are pure defense-in-depth: the public API path is
fully covered by the JS-side validateMemlimit() called in the
XzStream constructor. They protect direct-N-API consumers (other
addons, future API surface changes) from getting silent UINT64_MAX
fallback when their input is malformed.

Pre-push opus senior review (§2.10 round-3 gate): SAFE-TO-PUSH —
literal correctness, throw idiom, conditional ordering, guard
placement, and existing-test regression all verified. Zero new
findings.

15/15 native memlimit tests pass; full 614-test suite green;
node-gyp rebuild clean.
oorabona added a commit that referenced this pull request Apr 28, 2026
One M finding + one L finding from Copilot round 2; the third
(PR description WASM mismatch) was an orchestrator slip that has
already been corrected via gh pr edit (stale /tmp/pr-body.md from
PR #111 was used at PR creation time — apologies to Copilot,
which had correctly flagged it both rounds).

- security-win32.spec.ts (C2-1, M): vi.stubGlobal('process', spread)
  drops non-enumerable Node.js APIs (EventEmitter, etc), risking
  test runner instability. Switch to Object.create(process) +
  Object.defineProperty('platform', 'win32') so only platform is
  shadowed and the rest of the process API stays intact. Helper
  stubWin32Platform() extracted with JSDoc rationale.
- .changeset/win32-toctou-hardening.md (C2-2, L): example error
  text drifted from the new wording (round 2 changed the source
  but the changeset still cited the old "symlink-swap race
  detected" message). Replaced with a generic 2-line snippet
  decoupled from exact error wording.

Tests: 155 pass / 0 fail / 3 pre-existing skips (identical to pre-fix).
Lint: 0 errors. Type-check: 0 errors.

Trend: round 1 = 6 findings (3 M + 2 L + 1 misclassified), round 2
= 3 findings (1 M + 2 L), round 3 expected = 0 (healthy down-trend).
oorabona added a commit that referenced this pull request Apr 29, 2026
…fail-closed (#114)

* chore: lifecycle hygiene — promote shipped plans, refresh TODO

- docs/plans/TAR-XZ-STREAMING-2026-04-28.md: draft → canonical
  (PR #113 squash 06a9937 shipped 2026-04-29; all 5 blocks
  delivered, opus adversarial + LLM consensus + 4 Copilot rounds
  passed)
- docs/plans/native-binding-migration.md: draft → canonical
  (research conclusion: no migration; document is the deliverable)
- TODO.md: remove stale duplicate of #113 streaming entry from
  Pending-MEDIUM (was a copy of the now-shipped item), refresh
  Quick Reference table, move Win32 follow-up from Pending to
  In Progress (story WIN32-TOCTOU-2026-04-29 starts here)

* fix(tar-xz): close Win32 symlink-swap TOCTOU with JS-pure 'wx'+retry fail-closed

Replace the by-path Win32 fallback in extractFile with a handle-based
path using O_EXCL ('wx') atomic create plus an unlink-then-retry
pattern for legitimate overwrite. fd-based chmod/utimes follow the
file descriptor, not the path. If a symlink is injected during the
unlink+retry race, the second 'wx' open fails with EEXIST and a
security error is thrown citing the entry name — no bytes are written
through the symlink.

Recon invalidated the original "match node-tar with CreateFileW"
framing: node-tar is pure JS and explicitly does NOT protect Windows
either (their PR #456 is Unix-only, get-write-flag.ts gates O_NOFOLLOW
on !isWindows). Adopting JS-pure 'wx'+retry exceeds the ecosystem
norm without expanding the native addon surface.

Adversarial pass folded one M finding (reparse-point coverage gap)
into SECURITY.md with full IO_REPARSE_TAG_SYMLINK / MOUNT_POINT /
CLOUD_FILES coverage table. Vectors 2-5 (hardlinks, case-insensitive
NTFS, ADS, vi.stubGlobal reachability) confirmed handled or
non-issues; vector 5 simplified the implementation by cutting the
platform-detection seam.

Closes the TOCTOU window widened by PR #113 streaming refactor
(was ~ms post-Buffer.concat, became seconds-minutes per entry size
during streaming write — realistic exploit window for co-tenants).

- packages/tar-xz/src/node/file.ts: Win32 branch rewritten
  ('wx' + unlink+retry, fd-based chmod/utimes, threat-model JSDoc
  citing W1-W4); unused chmod/utimes named imports trimmed
- packages/tar-xz/test/security-win32.spec.ts: 4 BDD scenarios
  (atomic-create, legit-overwrite, race-detected fail-closed,
  pre-existing-symlink-rejected-upstream regression lock);
  vi.mock('node:fs/promises') + vi.spyOn for race injection
- packages/tar-xz/SECURITY.md: §"Windows symlink-swap TOCTOU"
  with reparse-tag coverage table, residual race documentation,
  user mitigations
- packages/tar-xz/README.md: link to SECURITY.md§Win32
- .changeset: patch-level entry
- docs/plans/WIN32-TOCTOU-2026-04-29.md: spec (12 sections,
  threat model, BDD, adversarial ledger; will be promoted to
  canonical at /finalize)

All tar-xz tests: 155 pass / 0 fail / 3 pre-existing platform skips.
Lint: 0 errors. Type-check: 0 errors.

* fix(tar-xz): apply Copilot round-1 findings on Win32 TOCTOU PR

Three M-class findings + two L-class fixes from Copilot review;
one finding (C-4) dismissed as a hallucination after content
verification (changeset is on-topic, not WASM-related).

- file.ts:358 — wrap unlink() in try/catch ENOENT (M-3): when
  target disappears between failed first open() and our unlink,
  ignore ENOENT so the retry open('wx') can succeed. Preserves
  fail-closed for reparse-point injection (retry still rejects
  any non-empty path).
- file.ts:364 — reword security error (M-2): replace
  "symlink-swap race detected … a symlink was injected" with
  "target still exists on retry … possible symlink/junction
  injection or concurrent creation at the target path between
  unlink and open". The new wording does not overclaim a
  specific reparse-point type and accepts benign concurrent
  creators as a possibility, while still failing closed.
- security-win32.spec.ts — cross-platform symlink target (M-1):
  pre-create attacker-target.txt under tmp; symlink points there
  instead of /dev/null. Add `it.skipIf(!canCreateSymlinks)` gate
  with a beforeAll probe so Win32 runners without Developer Mode
  skip symlink-creation paths instead of failing with EPERM.
  Also fixes Scenario 4's /dev/null target (same portability
  issue, same fix).
- SECURITY.md / README.md — version-agnostic phrasing (L-1, L-2):
  replace three "v6.1.1" mentions with "this hardening" / "now
  closed". Removes documentation drift risk if the release ends
  up as 6.2.0 instead of 6.1.1.
- WIN32-TOCTOU-2026-04-29.md — spec drift fix: §7 Observable
  Success block now cites the new error wording and the
  cross-platform attacker target.

C-4 dismissed: Copilot claimed the changeset references "WASM
decoder memlimit"; verified false — `.changeset/win32-toctou-
hardening.md` only discusses Win32 TOCTOU.

Tests: 155 pass / 0 fail / 3 pre-existing skips.
Lint: 0 errors. Type-check: 0 errors.

* fix(tar-xz): apply Copilot round-2 findings on Win32 TOCTOU PR

One M finding + one L finding from Copilot round 2; the third
(PR description WASM mismatch) was an orchestrator slip that has
already been corrected via gh pr edit (stale /tmp/pr-body.md from
PR #111 was used at PR creation time — apologies to Copilot,
which had correctly flagged it both rounds).

- security-win32.spec.ts (C2-1, M): vi.stubGlobal('process', spread)
  drops non-enumerable Node.js APIs (EventEmitter, etc), risking
  test runner instability. Switch to Object.create(process) +
  Object.defineProperty('platform', 'win32') so only platform is
  shadowed and the rest of the process API stays intact. Helper
  stubWin32Platform() extracted with JSDoc rationale.
- .changeset/win32-toctou-hardening.md (C2-2, L): example error
  text drifted from the new wording (round 2 changed the source
  but the changeset still cited the old "symlink-swap race
  detected" message). Replaced with a generic 2-line snippet
  decoupled from exact error wording.

Tests: 155 pass / 0 fail / 3 pre-existing skips (identical to pre-fix).
Lint: 0 errors. Type-check: 0 errors.

Trend: round 1 = 6 findings (3 M + 2 L + 1 misclassified), round 2
= 3 findings (1 M + 2 L), round 3 expected = 0 (healthy down-trend).

* fix(tar-xz): apply Copilot round-3 finding on Win32 TOCTOU PR

One M finding from Copilot round 3 — broken skipIf gate.

The `it.skipIf(process.platform === 'win32' && !canCreateSymlinks)`
gate added in round 2 was subtly broken: Vitest evaluates `skipIf`
conditions at test-registration time, but `canCreateSymlinks` is
only updated later inside `beforeAll`. Result: the gate always
sees `true` on Win32, the test runs even without Developer Mode,
and `fsp.symlink` throws EPERM during the race-injection mock,
making the test fail with the wrong error.

Replaced with a static `it.skipIf(process.platform === 'win32')`
on the 2 tests that need symlink creation (race-detected and
pre-existing-symlink-rejected scenarios). Removed the orphaned
`canCreateSymlinks` variable and `beforeAll` probe block.

Coverage tradeoff:
- Linux/macOS (~90% CI): all 4 tests run via process.platform stub.
  Security contract (extract → 'wx' → unlink+retry → security error)
  fully validated.
- Windows: skips the 2 symlink-injection tests cleanly. The other 2
  tests (atomic-create-success, legit-overwrite-via-unlink-retry)
  exercise the real-NTFS Win32 path without needing symlink
  privilege.

Tests: 155 pass / 0 fail / 3 pre-existing skips.
Lint: 0 errors. Type-check: 0 errors.

Trend: round 1 = 6 findings, round 2 = 3, round 3 = 1, round 4
expected = 0. Convergence.

* fix(tar-xz): apply Copilot round-3 findings on Win32 TOCTOU PR

Three findings: 1 regression we introduced + 1 test gap + 1 doc.

- file.ts (C4-1, M, regression): wrap handle.chmod/utimes on Win32
  branch in narrow try/catch with /* best-effort */ comment. The
  master pre-PR Win32 path treated chmod/utimes as best-effort
  (some Win32 filesystems — FAT32, cloud-mounted shares — reject
  these with EPERM even when the write succeeded). Our PR
  inadvertently made them fatal, breaking those workflows. Now
  matches master semantics. handle.write errors still propagate
  (data integrity).
- security-win32.spec.ts (C4-2, M, observable proof): the race-
  detected test verified the symlink remained the attacker's, but
  did NOT assert attackerTarget content was unchanged. If the
  extractor accidentally followed the symlink and wrote through it
  we wouldn't have caught it. Added the byte-equality assert on
  attackerTarget after the throw — true observable-proof regression
  gate. attackerTarget was already pre-populated with
  'attacker content' from round 2.
- .changeset/win32-toctou-hardening.md (C4-3, L): empty typescript
  code fence (round 3 emptied the block but left the fence) replaced
  with a 3-line minimal example using extractFile + rejects.toThrow.

Tests: 155 pass / 0 fail / 3 pre-existing skips.
Lint: 0 errors. Type-check: 0 errors.

Trend: round 1=6, round 2=3, round 3=1, round 4=3 (2 real Ms in
our own prior code), round 5=3 fixed → expected 0 remaining.

* chore(tar-xz): apply Copilot round-4 L-only nits on Win32 TOCTOU PR

Five cosmetic findings — no behavior change.

- docs/plans/native-binding-migration.md: bump doc-meta updated date
  2026-03-01 → 2026-04-29 to match this PR's housekeeping commit
  (status was promoted to canonical but updated stayed stale).
- docs/plans/WIN32-TOCTOU-2026-04-29.md: remove machine-specific
  '/mnt/wsl/...' Project line — was non-portable.
- packages/tar-xz/test/security-win32.spec.ts: remove orphan JSDoc
  block detached from any declaration; correct "proxy" comment
  near vi.mock (the factory does shallow spread, not Proxy);
  align scenario comment regex with the actual assertion text
  (/symlink-swap race detected/ → /target still exists on retry/,
  drift from round-2 wording change).

Tests: 155 pass / 0 fail / 3 pre-existing skips.
Lint: 0 errors. Type-check: 0 errors.

Pushed with NO-COPILOT-REVIEW marker — premium quota saved per
workflow §2.10 termination rule (a) for L-only round on a
ready-to-merge PR.

Trend: round 1=6, round 2=3, round 3=1, round 4=3 (2 real Ms
in our prior code), round 5=3 fixed, round 6=5 L addressed.
Total findings folded across 6 rounds: 21. PR ready for merge.
oorabona added a commit that referenced this pull request Apr 29, 2026
The Changesets - Version PR workflow has been failing on `node-liblzma`
because the root package isn't included in pnpm-workspace.yaml's
`packages: ['packages/*']` glob. The `native-memlimit.md` changeset
referenced `node-liblzma` and triggered "Found changeset for package
not in the workspace" errors on every master push.

Architectural reality: this monorepo uses release.yml (release-it +
conventional commits) as the canonical release path. Changesets
artifacts are decorative — never consumed by the actual release
flow. Their content is already captured in CHANGELOG.md via
release-it's conventional-commit scan.

Cleanup:
- .changeset/native-memlimit.md (PR #112, node-liblzma not in
  workspace, blocked the workflow)
- .changeset/wasm-memlimit-option.md (PR #111, same reason)
- .changeset/validate-changesets-flow.md (test file from
  changesets adoption — no longer needed)
- .changeset/tar-xz-streaming.md (PR #113, shipped via
  tar-xz@6.1.0 release-it 2026-04-29)
- .changeset/win32-toctou-hardening.md (PR #114, shipped via
  same tar-xz@6.1.0 release)

Keep .changeset/config.json + README.md for the directory's
identity. Future PRs can still add changesets if the changesets
workflow is later re-enabled as a primary path.

Unblocks: changesets.yml workflow on master pushes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants