Skip to content

feat(native): wire memlimit through N-API decoder#112

Merged
oorabona merged 3 commits into
masterfrom
feat/native-memlimit
Apr 28, 2026
Merged

feat(native): wire memlimit through N-API decoder#112
oorabona merged 3 commits into
masterfrom
feat/native-memlimit

Conversation

@oorabona
Copy link
Copy Markdown
Owner

Summary

Closes the WASM↔native asymmetry that PR #111 advertised in TSDoc but didn't fix on the native side. After this PR, Unxz, createUnxz, unxz, and unxzAsync all honor LZMAOptions.memlimit uniformly.

const stream = new Unxz({ memlimit: 64 * 1024 * 1024 }); // 64 MiB
// → emits 'error' with LZMAMemoryLimitError on .xz exceeding the limit

Changes

  • N-API binding (src/bindings/node-liblzma.{cpp,hpp}) — InitializeDecoder takes a new const Napi::Object& opts parameter, reads opts.memlimit, and passes it to lzma_stream_decoder. BigInt + Number both supported; lossless conversion is checked.
  • JS-side guard lift (src/lzma.ts, src/wasm/bindings.ts) — validateMemlimit is now exported from src/wasm/bindings.ts and called at the XzStream decode-mode constructor entry. NaN / Infinity / fractional / negative / > Number.MAX_SAFE_INTEGER reject synchronously with LZMAOptionsError. New bigint upper-bound guard rejects > UINT64_MAX (2^64 - 1) to prevent silent BigInt→uint64 truncation now that the bound matters.
  • TSDoc (src/types.ts) — "WASM only" caveat removed; memlimit documented as uniform across all decoder paths.
  • Tests — 14 new tests in test/native/memlimit.test.ts covering: error path (memlimit:1024 on real .xz), success path (memlimit:256 MiB number + bigint), validation (NaN / Infinity / fractional / negative / MAX_SAFE_INTEGER+1 / > UINT64_MAX), default behavior, and createUnxz factory variants.
  • TODO.md — closes the [Native] Wire memlimit MEDIUM and the [WASM] validateMemlimit symmetry LOW (the latter became load-bearing as soon as native honored memlimit).
  • Changesets — new native-memlimit.md minor entry; previous wasm-memlimit-option.md updated to remove the now-stale "WASM only for now" caveat (both ship in the same release).

Test plan

  • pnpm build (TypeScript) — exit 0
  • node-gyp rebuild (native) — exit 0, no warnings beyond the usual
  • pnpm exec tsc --noEmit — exit 0
  • rtk proxy biome lint src/ test/ — exit 0, no errors
  • pnpm exec vitest run test/native/memlimit.test.ts — 14/14 pass
  • pnpm test — 614 / 614 (488 root + 99 tar-xz + 27 nxz)

Follow-ups

None mandatory. The [tar-xz] true streaming MEDIUM remains in TODO.md as the next planned story.

Closes the WASM↔native asymmetry that PR #111 advertised in TSDoc but
left unresolved on the native side. `Unxz`, `createUnxz`, and the
`unxz`/`unxzAsync` Buffer APIs now all honor `LZMAOptions.memlimit`.

- N-API `InitializeDecoder` (src/bindings/node-liblzma.{cpp,hpp})
  takes a new `const Napi::Object& opts` parameter, reads
  `opts.memlimit`, and passes it to `lzma_stream_decoder` instead of
  the hardcoded `UINT64_MAX`. BigInt and Number inputs both supported;
  lossless conversion is checked.
- JS-side `validateMemlimit` lifted out of the WASM module and called
  in the `XzStream` constructor at decode-mode entry, so the same
  guard runs whether the decode path ends up in native or WASM.
  NaN/Infinity/fractional/negative/`> MAX_SAFE_INTEGER` reject
  synchronously with `LZMAOptionsError` (errno = `LZMA_OPTIONS_ERROR`).
- `validateMemlimit` also rejects `bigint > UINT64_MAX` (2^64 - 1)
  to prevent silent BigInt→uint64 truncation now that the bound
  matters. Closes the LOW-priority symmetry item left open by PR
  #111.
- TSDoc on `LZMAOptions.memlimit` no longer carries the "WASM only"
  caveat. Default behavior unchanged (no memlimit → no limit on
  native, 256 MiB on WASM Buffer API).

14 new tests in `test/native/memlimit.test.ts` cover all decoder
entry points (Unxz stream, createUnxz factory, and validation paths).
Full suite: 614/614.
Copilot AI review requested due to automatic review settings April 28, 2026 15:00
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Wires LZMAOptions.memlimit through the native (N-API) decoder to match the existing WASM behavior, making memlimit effective across Unxz/createUnxz and buffer decode APIs, with updated docs and test coverage.

Changes:

  • Native binding: read opts.memlimit and pass it to lzma_stream_decoder.
  • JS-side: validate memlimit early for decode-mode streams; add UINT64_MAX upper-bound check for bigint.
  • Docs/tests/release: update TSDoc + changesets; add a native memlimit test suite; update TODOs.

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 6 comments.

Show a summary per file
File Description
src/bindings/node-liblzma.cpp Reads memlimit from options and uses it in native decoder init.
src/bindings/node-liblzma.hpp Updates decoder init signature to accept the options object.
src/lzma.ts Validates memlimit for decode streams before allocating the native object; passes through to native binding.
src/wasm/bindings.ts Exports validateMemlimit and adds bigint UINT64_MAX upper-bound guard.
src/types.ts Updates memlimit TSDoc to remove “WASM only” caveat and document new constraints/defaults.
test/native/memlimit.test.ts Adds native tests covering error/success paths and validation behavior.
TODO.md Marks the native memlimit wiring + validation symmetry work as completed.
.changeset/native-memlimit.md Adds a minor changeset for native memlimit support.
.changeset/wasm-memlimit-option.md Updates prior changeset to reflect native parity in the same release.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +496 to +504
// JS-side validateMemlimit() already rejects values > UINT64_MAX, so lossless
// should always be true here. We check it defensively and fall back to UINT64_MAX
// (no limit) on truncation, which is safe and conservative.
bool lossless = false;
uint64_t val = optsMemlimit.As<Napi::BigInt>().Uint64Value(&lossless);
if (lossless)
{
memlimit = val;
}
Copy link

Copilot AI Apr 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the BigInt branch, if Uint64Value() is not lossless (e.g. value > UINT64_MAX or negative), the code currently falls back to UINT64_MAX (no limit). That can silently disable the caller’s intended limit and reintroduce a potential memory-DoS vector if JS-side validation is bypassed. Prefer throwing a JS exception (RangeError/TypeError) when the value isn’t representable as uint64_t, instead of silently defaulting to no limit.

Suggested change
// JS-side validateMemlimit() already rejects values > UINT64_MAX, so lossless
// should always be true here. We check it defensively and fall back to UINT64_MAX
// (no limit) on truncation, which is safe and conservative.
bool lossless = false;
uint64_t val = optsMemlimit.As<Napi::BigInt>().Uint64Value(&lossless);
if (lossless)
{
memlimit = val;
}
// Reject non-lossless conversions defensively so invalid BigInt input cannot
// silently disable the caller's intended memory limit by falling back to
// UINT64_MAX (no limit).
bool lossless = false;
uint64_t val = optsMemlimit.As<Napi::BigInt>().Uint64Value(&lossless);
if (!lossless)
{
Napi::RangeError::New(env, "memlimit must be a non-negative BigInt representable as uint64_t").ThrowAsJavaScriptException();
return false;
}
memlimit = val;

Copilot uses AI. Check for mistakes.
Comment on lines +506 to +516
else if (optsMemlimit.IsNumber())
{
// JS-side validateMemlimit() already rejects non-integer, negative, NaN, Infinity,
// and values above Number.MAX_SAFE_INTEGER (2^53-1), so a simple cast to uint64_t
// is safe here.
double d = optsMemlimit.ToNumber().DoubleValue();
if (d >= 0.0 && d <= static_cast<double>(UINT64_MAX))
{
memlimit = static_cast<uint64_t>(d);
}
}
Copy link

Copilot AI Apr 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Number branch assumes JS-side validation ensures integer / finite values, but the native binding is still a public boundary. Consider validating here too (finite, integer, non-negative, <= MAX_SAFE_INTEGER or otherwise explicitly documented), and throw on invalid input rather than silently ignoring (leaving memlimit as UINT64_MAX). This avoids surprising behavior if the binding is used directly or future JS validation paths change.

Copilot uses AI. Check for mistakes.
Comment thread src/wasm/bindings.ts Outdated
if (memlimit > 18446744073709551615n) {
throw new LZMAOptionsError(
LZMA_OPTIONS_ERROR,
'memlimit bigint exceeds UINT64_MAX (2^64 - 1); use a smaller value or omit for no limit'
Copy link

Copilot AI Apr 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error message suggests “omit for no limit”, but in the WASM decoder paths omitting memlimit uses the DEFAULT_MEMLIMIT (256 MiB), not “no limit”. Since this validator is now shared across native+WASM call sites, consider making the message context-neutral (or explicitly say that UINT64_MAX is the “no limit” value) to avoid misleading WASM users.

Suggested change
'memlimit bigint exceeds UINT64_MAX (2^64 - 1); use a smaller value or omit for no limit'
'memlimit bigint exceeds UINT64_MAX (2^64 - 1); use a smaller value, or use UINT64_MAX explicitly for no limit'

Copilot uses AI. Check for mistakes.
Comment thread .changeset/wasm-memlimit-option.md Outdated
**Default:** `BigInt(256 * 1024 * 1024)` (256 MiB — unchanged from existing behaviour).

**Native parity:** The native Node.js binding (`InitializeDecoder`) still hardcodes `UINT64_MAX` and ignores `memlimit`. This is WASM-only for now; native tracking in TODO.md.
**Native parity:** Wired in the same release — see the native-memlimit changeset below for `Unxz`/`createUnxz` coverage.
Copy link

Copilot AI Apr 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line says “see the native-memlimit changeset below”, but changesets are separate files so there isn’t necessarily a “below” in the rendered output. Consider rephrasing to “see the native-memlimit changeset” to avoid confusing release-note readers.

Suggested change
**Native parity:** Wired in the same release — see the native-memlimit changeset below for `Unxz`/`createUnxz` coverage.
**Native parity:** Wired in the same release — see the native-memlimit changeset for `Unxz`/`createUnxz` coverage.

Copilot uses AI. Check for mistakes.
Comment thread .changeset/native-memlimit.md Outdated

**Changes**
- N-API `InitializeDecoder` reads `opts.memlimit` and passes it to `lzma_stream_decoder` (was hardcoded to `UINT64_MAX`).
- JS-side `validateMemlimit` lifted out of the WASM module and applied at `Xz`/`Unxz` constructor entry, so NaN/Infinity/fractional/negative/`> Number.MAX_SAFE_INTEGER`/`> UINT64_MAX` are caught synchronously with `LZMAOptionsError` regardless of whether the eventual decoder is native or WASM.
Copy link

Copilot AI Apr 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This changeset says validateMemlimit was “lifted out of the WASM module”, but the implementation is still located in src/wasm/bindings.ts (now exported and reused from the Node entrypoint). Consider rewording to reflect that it’s being reused/shared rather than moved out, so the release notes stay accurate.

Suggested change
- JS-side `validateMemlimit` lifted out of the WASM module and applied at `Xz`/`Unxz` constructor entry, so NaN/Infinity/fractional/negative/`> Number.MAX_SAFE_INTEGER`/`> UINT64_MAX` are caught synchronously with `LZMAOptionsError` regardless of whether the eventual decoder is native or WASM.
- JS-side `validateMemlimit` is now reused from the WASM bindings and applied at `Xz`/`Unxz` constructor entry, so NaN/Infinity/fractional/negative/`> Number.MAX_SAFE_INTEGER`/`> UINT64_MAX` are caught synchronously with `LZMAOptionsError` regardless of whether the eventual decoder is native or WASM.

Copilot uses AI. Check for mistakes.
Comment thread src/types.ts Outdated
Comment on lines 62 to 64
* Accepted types: `number` or `bigint` (both are validated then coerced to
* `bigint` before being passed to the WASM C ABI, which maps `uint64_t` to
* `bigint` before being passed to the C ABI, which maps `uint64_t` to
* `BigInt`). The `number` form must be a finite, non-negative integer; passing
Copy link

Copilot AI Apr 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The TSDoc says memlimit values are “validated then coerced to bigint before being passed to the C ABI”, which is true for the WASM C ABI but not for the native stream path (the N-API binding reads Number/BigInt directly). Since the doc now claims memlimit is honored uniformly across native+WASM, it would be clearer/accurate to describe the two paths separately (WASM coerces to bigint; native converts via N-API with lossless checks).

Copilot uses AI. Check for mistakes.
Round-1 findings from opus + Copilot (3M + 2S + 2L):

- N-API InitializeDecoder no longer silently widens to UINT64_MAX
  on lossless=false (BigInt) or out-of-range / non-finite /
  non-integer / negative Number. Throws Napi::TypeError instead,
  giving direct-N-API consumers the same contract the JS-side
  validateMemlimit guard provides (F-3 / C-1 / C-2).
- validateMemlimit's UINT64_MAX overflow message is now
  context-neutral — explicitly notes that "no limit" semantics
  are native-only and that WASM buffer APIs default to 256 MiB
  when omitted (C-3).
- LZMAOptions.memlimit TSDoc no longer claims values are coerced
  to bigint before reaching the C ABI — accurate only for WASM,
  not for the native binding which accepts both types via N-API
  directly (C-6).
- ResolvedLZMAOptions.memlimit internal TSDoc refreshed; no
  longer states the field is "WASM only" (F-1).
- New `new Xz({ memlimit: NaN })` test locks the encoder-ignores-
  memlimit contract — encoder mode silently passes through
  invalid values because validateMemlimit is gated on STREAM_DECODE
  (GAP).
- Encoder forwarding kept with explanatory comment (F-2 cosmetic
  resolved without code change).
- Changeset wording cleaned: drop "below" (C-4) and replace
  "lifted out" with "now exported from src/wasm/bindings.ts" (C-5).

15/15 native memlimit tests pass; full 614-test suite green;
node-gyp rebuild clean; tsc + lint clean.
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 2 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +512 to +521
// and values above Number.MAX_SAFE_INTEGER (2^53-1). Throw on any out-of-range
// value (defense-in-depth: direct-N-API consumers bypassing the JS guard get an
// error, not a silent UINT64_MAX 'no limit' coercion that inverts caller intent).
double d = optsMemlimit.ToNumber().DoubleValue();
if (d < 0.0 || !std::isfinite(d) || d != std::floor(d) || d > static_cast<double>(UINT64_MAX))
{
Napi::TypeError::New(env, "memlimit out of range or precision lost").ThrowAsJavaScriptException();
return false;
}
memlimit = static_cast<uint64_t>(d);
Copy link

Copilot AI Apr 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the optsMemlimit.IsNumber() branch, the check currently allows any integer-valued double up to UINT64_MAX, but it doesn’t actually enforce a lossless JS number → uint64_t conversion. For defense-in-depth (and to match the JS validateMemlimit contract), consider rejecting values above Number.MAX_SAFE_INTEGER (2^53-1) and/or verifying round-trip equality (static_cast<double>(static_cast<uint64_t>(d)) == d) before accepting the cast.

Suggested change
// and values above Number.MAX_SAFE_INTEGER (2^53-1). Throw on any out-of-range
// value (defense-in-depth: direct-N-API consumers bypassing the JS guard get an
// error, not a silent UINT64_MAX 'no limit' coercion that inverts caller intent).
double d = optsMemlimit.ToNumber().DoubleValue();
if (d < 0.0 || !std::isfinite(d) || d != std::floor(d) || d > static_cast<double>(UINT64_MAX))
{
Napi::TypeError::New(env, "memlimit out of range or precision lost").ThrowAsJavaScriptException();
return false;
}
memlimit = static_cast<uint64_t>(d);
// and values above Number.MAX_SAFE_INTEGER (2^53-1). Enforce the same contract
// here so Number -> uint64_t conversion is lossless for direct native callers too.
double d = optsMemlimit.ToNumber().DoubleValue();
const double kMaxSafeInteger = 9007199254740991.0; // Number.MAX_SAFE_INTEGER
if (d < 0.0 || !std::isfinite(d) || d != std::floor(d) || d > kMaxSafeInteger)
{
Napi::TypeError::New(env, "memlimit out of range or precision lost").ThrowAsJavaScriptException();
return false;
}
uint64_t val = static_cast<uint64_t>(d);
if (static_cast<double>(val) != d)
{
Napi::TypeError::New(env, "memlimit out of range or precision lost").ThrowAsJavaScriptException();
return false;
}
memlimit = val;

Copilot uses AI. Check for mistakes.
return false;
}
memlimit = static_cast<uint64_t>(d);
}
Copy link

Copilot AI Apr 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If opts.memlimit is provided but is neither a BigInt nor a Number (e.g. a string), the current logic silently falls back to UINT64_MAX (no limit). Since other options are type-validated, it would be more consistent/safer to throw a TypeError when memlimit is present with an unsupported type, rather than ignoring caller intent.

Suggested change
}
}
else
{
Napi::TypeError::New(env, "memlimit must be a Number or BigInt").ThrowAsJavaScriptException();
return false;
}

Copilot uses AI. Check for mistakes.
Round-2 Copilot findings on PR #112 (2M, both defense-in-depth):

- Number branch now rejects `d > Number.MAX_SAFE_INTEGER`
  (= 2^53 - 1 = 9007199254740991) with a Napi::TypeError instead
  of silently accepting precision-lost values. Mirrors the JS-side
  validateMemlimit guard (C-2-001).
- New `else` branch throws Napi::TypeError when `opts.memlimit` is
  present but is neither a BigInt nor a Number (string, object,
  array). Other options in the same file already type-validate;
  memlimit no longer the odd one out (C-2-002).

Both guards are pure defense-in-depth: the public API path is
fully covered by the JS-side validateMemlimit() called in the
XzStream constructor. They protect direct-N-API consumers (other
addons, future API surface changes) from getting silent UINT64_MAX
fallback when their input is malformed.

Pre-push opus senior review (§2.10 round-3 gate): SAFE-TO-PUSH —
literal correctness, throw idiom, conditional ordering, guard
placement, and existing-test regression all verified. Zero new
findings.

15/15 native memlimit tests pass; full 614-test suite green;
node-gyp rebuild clean.
@oorabona oorabona merged commit 0d09200 into master Apr 28, 2026
19 checks passed
@oorabona oorabona deleted the feat/native-memlimit branch April 28, 2026 16:09
oorabona added a commit that referenced this pull request Apr 29, 2026
The Changesets - Version PR workflow has been failing on `node-liblzma`
because the root package isn't included in pnpm-workspace.yaml's
`packages: ['packages/*']` glob. The `native-memlimit.md` changeset
referenced `node-liblzma` and triggered "Found changeset for package
not in the workspace" errors on every master push.

Architectural reality: this monorepo uses release.yml (release-it +
conventional commits) as the canonical release path. Changesets
artifacts are decorative — never consumed by the actual release
flow. Their content is already captured in CHANGELOG.md via
release-it's conventional-commit scan.

Cleanup:
- .changeset/native-memlimit.md (PR #112, node-liblzma not in
  workspace, blocked the workflow)
- .changeset/wasm-memlimit-option.md (PR #111, same reason)
- .changeset/validate-changesets-flow.md (test file from
  changesets adoption — no longer needed)
- .changeset/tar-xz-streaming.md (PR #113, shipped via
  tar-xz@6.1.0 release-it 2026-04-29)
- .changeset/win32-toctou-hardening.md (PR #114, shipped via
  same tar-xz@6.1.0 release)

Keep .changeset/config.json + README.md for the directory's
identity. Future PRs can still add changesets if the changesets
workflow is later re-enabled as a primary path.

Unblocks: changesets.yml workflow on master pushes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants