Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
21 changes: 21 additions & 0 deletions .changeset/native-memlimit.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,21 @@
---
'node-liblzma': minor
---

Wire `memlimit` through the native N-API decoder, closing the WASM↔native asymmetry advertised in TSDoc since the v5.x line.

`Unxz`, `createUnxz`, and the `unxz`/`unxzAsync` Buffer APIs now all honor `LZMAOptions.memlimit`:

```ts
const stream = new Unxz({ memlimit: 64 * 1024 * 1024 }); // 64 MiB
// → emits 'error' with LZMAMemoryLimitError on real .xz exceeding the limit
```

**Changes**
- N-API `InitializeDecoder` reads `opts.memlimit` and passes it to `lzma_stream_decoder` (was hardcoded to `UINT64_MAX`).
- JS-side `validateMemlimit` (now exported from `src/wasm/bindings.ts`) is reused at the `Xz`/`Unxz` constructor entry, so NaN/Infinity/fractional/negative/`> Number.MAX_SAFE_INTEGER`/`> UINT64_MAX` are caught synchronously with `LZMAOptionsError` regardless of whether the eventual decoder is native or WASM.
- New bigint upper-bound guard: `memlimit > 18446744073709551615n` (`UINT64_MAX`) now rejected, preventing silent BigInt→uint64 truncation.

**TSDoc** updated to remove the "WASM only" caveat — `memlimit` is now uniform across all decoder paths.

Default behavior is preserved: when `memlimit` is omitted, the native decoder uses `UINT64_MAX` (no limit), and the WASM Buffer API uses 256 MiB (its existing default).
2 changes: 1 addition & 1 deletion .changeset/wasm-memlimit-option.md
Original file line number Diff line number Diff line change
Expand Up @@ -15,4 +15,4 @@ When the compressed stream would require more memory than the limit, the promise
**Accepted types:** `number | bigint` (both coerced to `bigint` for the WASM C ABI).
**Default:** `BigInt(256 * 1024 * 1024)` (256 MiB — unchanged from existing behaviour).

**Native parity:** The native Node.js binding (`InitializeDecoder`) still hardcodes `UINT64_MAX` and ignores `memlimit`. This is WASM-only for now; native tracking in TODO.md.
**Native parity:** Wired in the same release — see the native-memlimit changeset for `Unxz`/`createUnxz` coverage.
8 changes: 5 additions & 3 deletions TODO.md
Original file line number Diff line number Diff line change
Expand Up @@ -11,15 +11,17 @@ _None_
## Pending - MEDIUM

- [ ] [tar-xz] True streaming for Node `extract()`/`list()` — replace `Buffer.concat` accumulation (extract.ts:59,91 + list.ts:26) with incremental header→content parsing so memory stays O(largest entry) instead of O(archive). Public README v6.0.0 advertises this as a "planned optimization" — Priority: M
- [ ] [Native] Wire `memlimit` in `src/bindings/node-liblzma.cpp` `InitializeDecoder` — currently hardcodes `UINT64_MAX`; should read `opts.memlimit` and call `lzma_stream_decoder(stream, memlimit, flags)`. WASM already supports it. — Priority: M
- [→] [WASM] Wire `memlimit` through `LZMAOptions` and `unxzAsync` — moved to In Progress (2026-04-28) → ✅ completed (2026-04-28)

## Pending - LOW (Nice to Have)

- [ ] [WASM] `validateMemlimit` symmetry — bigint branch has no UINT64_MAX upper-bound guard (only the `number` branch checks `MAX_SAFE_INTEGER`). Currently benign because native side hardcodes UINT64_MAX and `lzma_stream_decoder` reads `uint64_t` (so `2n ** 65n` would silently wrap to a benign-but-unintended ceiling, no security/leak/crash). Becomes load-bearing once `[Native] Wire memlimit` lands — fix together. Priority: L (defer until native parity work).
- [ ] [Lint] Biome warnings sweep (6 total, all `pnpm exec biome lint` exits 0 — warnings only, ran via `rtk proxy biome lint` workaround on 2026-04-28). Two from PR #111: `test/wasm/decompress-memlimit.test.ts:30` (`useTemplate` — string concat → template literal, biome FIXABLE) + `test/wasm/decompress-memlimit.test.ts:141` (`noNonNullAssertion` on `result!` in callback success test, replace with assertion guard). Four pre-existing on master: `src/errors.ts:176` `noNonNullAssertion` (`messages[errno]!`), `src/lzma.ts:63` + `src/pool.ts:20` `noImportCycles` (lzma↔pool re-export cycle, already noted in MEMORY.md as "benign — ESM resolves at runtime"), `src/pool.ts:166` `noNonNullAssertion` (`this.queue.shift()!` after empty-check). Priority: L (cosmetic; can batch with another pass). Note: lint pipeline is silently broken until RTK biome bug fixes (workaround: `rtk proxy biome ...`).

## Completed

- [x] ✅ [Native] PR #112 Round 2 Copilot fixes — C-2-001 MAX_SAFE_INTEGER guard in Number branch (d > 9007199254740991.0 → TypeError, defense-in-depth comment), C-2-002 else-branch for wrong-type memlimit (string/object/array → TypeError "memlimit must be a Number or BigInt"); sibling pattern: InitializeEncoder uses if(!IsNumber){throw} — same strictest pattern mirrored; gyp+tsc+lint+15 native+full suite pass (2026-04-28)
- [x] ✅ [Native] PR #112 Round 1 Copilot fixes — F-3/C-1/C-2 C++ defense-in-depth throw on lossless=false/out-of-range (was silent UINT64_MAX fallback), C-3 error message context-neutral, C-6 TSDoc coercion wording, F-1 ResolvedLZMAOptions stale TSDoc, F-2 encoder memlimit comment, C-4/C-5 changeset wording; GAP test encoder ignores memlimit; gyp+tsc+lint+15 native+full suite pass (2026-04-28)
- [x] ✅ [Native] Wire `memlimit` in `InitializeDecoder` — `.hpp` signature updated, `node-gyp rebuild` clean, 14/14 memlimit tests pass, 488+99+27=614 total tests pass (2026-04-28)
- [x] ✅ [WASM] `validateMemlimit` symmetry — bigint UINT64_MAX upper-bound guard (closed alongside native memlimit PR) (2026-04-28)
- [x] ✅ [WASM] PR #111 Round 3 Copilot fixes — C-3-001/2/3 duplicate JSDoc blocks removed from decoderInit/autoDecoderInit/validateMemlimit, C-3-004 stale xzAsync/unxzAsync comment fixed in lzma.ts:370; tsc+memlimit+full suite pass (2026-04-28)
- [x] ✅ [WASM] PR #111 Round 2 Copilot fixes — C-2-001 TSDoc xzAsync removed from honored-by list, C-2-002 stale lzma.ts comment, C-2-003 LZMA_OPTIONS_ERROR constant replaces magic 8, C-2-004 MAX_SAFE_INTEGER guard + TSDoc, C-2-005 validateMemlimit lifted to decoderInit+autoDecoderInit; 12 new tests, 474+99+27=600 tests pass (2026-04-28)
- [x] ✅ [WASM] PR #111 Round 1 review fixes — F-001 memlimit validation (NaN/Inf/frac/neg → LZMAOptionsError), F-002 ResolvedLZMAOptions internal type, C-001/C-002 async callback fixture pattern, C-003 byte-equality assertion, F-003 TSDoc reorder, F-004 stale comment, F-005 fixture comment magnitude, F-006 default-path caveat; 4 new tests (12 total in decompress-memlimit.test.ts), 458+99+27=584 tests pass (2026-04-28)
Expand Down
64 changes: 61 additions & 3 deletions src/bindings/node-liblzma.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -16,6 +16,7 @@
* along with this program. If not, see <https://www.gnu.org/licenses/>.
**/

#include <cmath>
#include <node_buffer.h>
#include <vector>
#include "node-liblzma.hpp"
Expand Down Expand Up @@ -109,7 +110,7 @@ LZMA::LZMA(const Napi::CallbackInfo &info) : Napi::ObjectWrap<LZMA>(info), _stre
switch (mode)
{
case STREAM_DECODE:
success = InitializeDecoder(env);
success = InitializeDecoder(env, opts);
break;
case STREAM_ENCODE:
success = InitializeEncoder(opts, preset, check);
Expand Down Expand Up @@ -482,9 +483,66 @@ bool LZMA::InitializeEncoder(const Napi::Object &opts, uint32_t preset, lzma_che
return true;
}

bool LZMA::InitializeDecoder(const Napi::Env &env)
bool LZMA::InitializeDecoder(const Napi::Env &env, const Napi::Object &opts)
{
lzma_ret ret = lzma_stream_decoder(&this->_stream, UINT64_MAX, LZMA_CONCATENATED);
// Read optional memlimit from JS options object.
// If absent or undefined, use UINT64_MAX (no limit — preserves pre-fix behaviour).
uint64_t memlimit = UINT64_MAX;
Napi::Value optsMemlimit = opts.Get("memlimit");
if (!optsMemlimit.IsUndefined() && !optsMemlimit.IsNull())
{
if (optsMemlimit.IsBigInt())
{
// node-addon-api Napi::BigInt::Uint64Value: lossless=true when value fits uint64_t.
// JS-side validateMemlimit() already rejects values > UINT64_MAX, so lossless
// should always be true here. Throw on precision loss (defense-in-depth: a future
// direct-N-API consumer bypassing the JS guard must not silently get 'no limit').
bool lossless = false;
uint64_t val = optsMemlimit.As<Napi::BigInt>().Uint64Value(&lossless);
if (!lossless)
{
Napi::TypeError::New(env, "memlimit out of range or precision lost").ThrowAsJavaScriptException();
return false;
}
Comment on lines +497 to +506
Copy link

Copilot AI Apr 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the BigInt branch, if Uint64Value() is not lossless (e.g. value > UINT64_MAX or negative), the code currently falls back to UINT64_MAX (no limit). That can silently disable the caller’s intended limit and reintroduce a potential memory-DoS vector if JS-side validation is bypassed. Prefer throwing a JS exception (RangeError/TypeError) when the value isn’t representable as uint64_t, instead of silently defaulting to no limit.

Suggested change
// JS-side validateMemlimit() already rejects values > UINT64_MAX, so lossless
// should always be true here. We check it defensively and fall back to UINT64_MAX
// (no limit) on truncation, which is safe and conservative.
bool lossless = false;
uint64_t val = optsMemlimit.As<Napi::BigInt>().Uint64Value(&lossless);
if (lossless)
{
memlimit = val;
}
// Reject non-lossless conversions defensively so invalid BigInt input cannot
// silently disable the caller's intended memory limit by falling back to
// UINT64_MAX (no limit).
bool lossless = false;
uint64_t val = optsMemlimit.As<Napi::BigInt>().Uint64Value(&lossless);
if (!lossless)
{
Napi::RangeError::New(env, "memlimit must be a non-negative BigInt representable as uint64_t").ThrowAsJavaScriptException();
return false;
}
memlimit = val;

Copilot uses AI. Check for mistakes.
memlimit = val;
}
else if (optsMemlimit.IsNumber())
{
// JS-side validateMemlimit() already rejects non-integer, negative, NaN, Infinity,
// and values above Number.MAX_SAFE_INTEGER (2^53-1). Throw on any out-of-range
// value (defense-in-depth: direct-N-API consumers bypassing the JS guard get an
// error, not a silent UINT64_MAX 'no limit' coercion that inverts caller intent).
double d = optsMemlimit.ToNumber().DoubleValue();
if (d < 0.0 || !std::isfinite(d) || d != std::floor(d) || d > static_cast<double>(UINT64_MAX))
{
Napi::TypeError::New(env, "memlimit out of range or precision lost").ThrowAsJavaScriptException();
return false;
}
// Defense-in-depth: validateMemlimit() in JS rejects this first; native enforces
// independently for direct-N-API consumers.
// Number.MAX_SAFE_INTEGER = 2^53 - 1 = 9007199254740991. Any double above this
// threshold no longer represents an exact integer — caller intent is silently
// corrupted. Reject to mirror the JS-side guard.
if (d > 9007199254740991.0)
{
Napi::TypeError::New(env, "memlimit number exceeds MAX_SAFE_INTEGER (use bigint for values >= 2^53)").ThrowAsJavaScriptException();
return false;
}
memlimit = static_cast<uint64_t>(d);
Comment on lines +512 to +531
Copy link

Copilot AI Apr 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the optsMemlimit.IsNumber() branch, the check currently allows any integer-valued double up to UINT64_MAX, but it doesn’t actually enforce a lossless JS number → uint64_t conversion. For defense-in-depth (and to match the JS validateMemlimit contract), consider rejecting values above Number.MAX_SAFE_INTEGER (2^53-1) and/or verifying round-trip equality (static_cast<double>(static_cast<uint64_t>(d)) == d) before accepting the cast.

Suggested change
// and values above Number.MAX_SAFE_INTEGER (2^53-1). Throw on any out-of-range
// value (defense-in-depth: direct-N-API consumers bypassing the JS guard get an
// error, not a silent UINT64_MAX 'no limit' coercion that inverts caller intent).
double d = optsMemlimit.ToNumber().DoubleValue();
if (d < 0.0 || !std::isfinite(d) || d != std::floor(d) || d > static_cast<double>(UINT64_MAX))
{
Napi::TypeError::New(env, "memlimit out of range or precision lost").ThrowAsJavaScriptException();
return false;
}
memlimit = static_cast<uint64_t>(d);
// and values above Number.MAX_SAFE_INTEGER (2^53-1). Enforce the same contract
// here so Number -> uint64_t conversion is lossless for direct native callers too.
double d = optsMemlimit.ToNumber().DoubleValue();
const double kMaxSafeInteger = 9007199254740991.0; // Number.MAX_SAFE_INTEGER
if (d < 0.0 || !std::isfinite(d) || d != std::floor(d) || d > kMaxSafeInteger)
{
Napi::TypeError::New(env, "memlimit out of range or precision lost").ThrowAsJavaScriptException();
return false;
}
uint64_t val = static_cast<uint64_t>(d);
if (static_cast<double>(val) != d)
{
Napi::TypeError::New(env, "memlimit out of range or precision lost").ThrowAsJavaScriptException();
return false;
}
memlimit = val;

Copilot uses AI. Check for mistakes.
}
Copy link

Copilot AI Apr 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If opts.memlimit is provided but is neither a BigInt nor a Number (e.g. a string), the current logic silently falls back to UINT64_MAX (no limit). Since other options are type-validated, it would be more consistent/safer to throw a TypeError when memlimit is present with an unsupported type, rather than ignoring caller intent.

Suggested change
}
}
else
{
Napi::TypeError::New(env, "memlimit must be a Number or BigInt").ThrowAsJavaScriptException();
return false;
}

Copilot uses AI. Check for mistakes.
else
{
// memlimit is present but is neither BigInt nor Number (e.g. string, object, array).
// Silently falling through to UINT64_MAX would disable the limit and invert caller
// intent. Throw explicitly to match how sibling options (threads) handle wrong types.
// Note: TypeScript declaration (number | bigint) blocks this at compile-time for TS
// callers; this guard protects direct-N-API and plain-JS consumers.
Napi::TypeError::New(env, "memlimit must be a Number or BigInt").ThrowAsJavaScriptException();
return false;
}
Comment on lines +509 to +532
Copy link

Copilot AI Apr 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Number branch assumes JS-side validation ensures integer / finite values, but the native binding is still a public boundary. Consider validating here too (finite, integer, non-negative, <= MAX_SAFE_INTEGER or otherwise explicitly documented), and throw on invalid input rather than silently ignoring (leaving memlimit as UINT64_MAX). This avoids surprising behavior if the binding is used directly or future JS validation paths change.

Copilot uses AI. Check for mistakes.
}

lzma_ret ret = lzma_stream_decoder(&this->_stream, memlimit, LZMA_CONCATENATED);

// F-005: Throw exception on decoder init failure (consistent with InitializeEncoder)
if (ret != LZMA_OK)
Expand Down
2 changes: 1 addition & 1 deletion src/bindings/node-liblzma.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -81,7 +81,7 @@ class LZMA : public Napi::ObjectWrap<LZMA>
bool ValidateConstructorArgs(const Napi::CallbackInfo &info, uint32_t &mode, Napi::Object &opts);
bool InitializeFilters(const Napi::Object &opts, uint32_t preset);
bool InitializeEncoder(const Napi::Object &opts, uint32_t preset, lzma_check check);
bool InitializeDecoder(const Napi::Env &env);
bool InitializeDecoder(const Napi::Env &env, const Napi::Object &opts);

// Common cleanup operations for both sync and async completion
void AfterCommon(const Napi::Env &env);
Expand Down
21 changes: 14 additions & 7 deletions src/lzma.ts
Original file line number Diff line number Diff line change
Expand Up @@ -33,6 +33,7 @@ import {
LZMAOptionsError,
LZMAProgrammingError,
} from './errors.js';
import { validateMemlimit } from './wasm/bindings.js';
import type {
CheckType,
CompressionCallback,
Expand Down Expand Up @@ -306,9 +307,9 @@ export type {
* Internal resolved options for XzStream instances.
*
* All fields are required (defaults applied in constructor) EXCEPT memlimit,
* which is genuinely optional: the native binding ignores it (UINT64_MAX
* hardcoded; see TODO "[Native] Wire memlimit in src/bindings/node-liblzma.cpp").
* Only the WASM Buffer API decompression paths (unxz/unxzAsync/streamBufferDecode) honour memlimit; xzAsync is compression-only and ignores this field.
* which is genuinely optional: validated by validateMemlimit() at the XzStream
* constructor for decode streams; passed through to both the native N-API decoder
* and the WASM Buffer API. xzAsync is compression-only and ignores this field.
*/
interface ResolvedLZMAOptions {
check: number;
Expand All @@ -318,7 +319,7 @@ interface ResolvedLZMAOptions {
threads: number;
chunkSize: number;
flushFlag: number;
/** Honoured only by the WASM Buffer API; native streams ignore this field. */
/** Optional memlimit. Validated at the public XzStream constructor; passed through to the decoder backend (native or WASM). */
memlimit?: number | bigint;
}

Expand Down Expand Up @@ -357,6 +358,13 @@ export abstract class XzStream extends Transform {
clonedFilters = [filter.LZMA2];
}

// Validate memlimit early for decoder streams so callers get LZMAOptionsError
// (not a raw RangeError) before any native object is allocated.
// Encoder streams pass memlimit through but it has no effect on the C side.
if (streamMode === liblzma.STREAM_DECODE && opts.memlimit !== undefined) {
validateMemlimit(opts.memlimit);
}

this._opts = {
check: opts.check ?? check.NONE,
preset: opts.preset ?? preset.DEFAULT,
Expand All @@ -365,9 +373,8 @@ export abstract class XzStream extends Transform {
threads: opts.threads ?? 1,
chunkSize: opts.chunkSize ?? liblzma.BUFSIZ,
flushFlag: opts.flushFlag ?? liblzma.LZMA_RUN,
// memlimit is genuinely optional in ResolvedLZMAOptions: the native binding ignores it
// (UINT64_MAX hardcoded; see TODO "[Native] Wire memlimit in src/bindings/node-liblzma.cpp").
// Only the WASM decompression APIs (unxz/unxzAsync/streamBufferDecode) honour this field. xzAsync is compression-only and ignores it.
// memlimit is passed through to the native binding for STREAM_DECODE.
// Validated above for decode streams. Ignored for encode streams.
memlimit: opts.memlimit,
};

Expand Down
21 changes: 11 additions & 10 deletions src/types.ts
Original file line number Diff line number Diff line change
Expand Up @@ -56,25 +56,26 @@ export interface LZMAOptions {
/**
* Memory usage limit for decompression, in bytes.
*
* **Honored only by `unxz`/`unxzAsync` (WASM Buffer API decompression).**
* `xzAsync` is compression-only and does not read `memlimit`.
* Currently **silently ignored** by native streams (`Xz`/`Unxz`/`createXz`/`createUnxz`) —
* see TODO `[Native] Wire memlimit in src/bindings/node-liblzma.cpp`.
* **Honored by all decoder APIs** (`unxz`, `unxzAsync`, `Unxz`, `createUnxz` — both native
* and WASM paths). `xzAsync` is compression-only and does not read `memlimit`.
*
* Accepted types: `number` or `bigint` (both are validated then coerced to
* `bigint` before being passed to the WASM C ABI, which maps `uint64_t` to
* `BigInt`). The `number` form must be a finite, non-negative integer; passing
* Accepted types: `number` or `bigint`. Both are validated before reaching the
* decoder. The native binding accepts either type directly via N-API; the WASM
* Buffer API coerces to `bigint` for the C ABI. The `number` form must be a
* finite, non-negative integer; passing
* `NaN`, `Infinity`, a fractional value, or a negative number throws
* `LZMAOptionsError` before any decompression is attempted.
* For values ≥ `Number.MAX_SAFE_INTEGER` (2^53 - 1), use `bigint` to avoid
* precision loss on coercion; passing a `number` above this threshold also
* throws `LZMAOptionsError`.
* `bigint` values above `UINT64_MAX` (2^64 - 1) are also rejected with
* `LZMAOptionsError`.
*
* WASM default: `BigInt(256 * 1024 * 1024)` (256 MiB).
* Native: ignored (UINT64_MAX hardcoded — see follow-up TODO above).
* Default: `UINT64_MAX` (no limit) for native; `BigInt(256 * 1024 * 1024)` (256 MiB)
* for WASM buffer APIs (`unxz`/`unxzAsync`).
*
* When the compressed stream requires more memory than this limit,
* decompression throws `LZMAMemoryLimitError` with
* decompression throws/emits `LZMAMemoryLimitError` with
* `code === LZMA_MEMLIMIT_ERROR` (numeric constant `6`, re-exported from
* `src/errors.ts`).
*/
Expand Down
8 changes: 7 additions & 1 deletion src/wasm/bindings.ts
Original file line number Diff line number Diff line change
Expand Up @@ -261,11 +261,17 @@ export function easyBufferEncode(
*
* @throws LZMAOptionsError if the value is invalid
*/
function validateMemlimit(memlimit: number | bigint): void {
export function validateMemlimit(memlimit: number | bigint): void {
if (typeof memlimit === 'bigint') {
if (memlimit < 0n) {
throw new LZMAOptionsError(LZMA_OPTIONS_ERROR, 'memlimit must be a non-negative value');
}
if (memlimit > 18446744073709551615n) {
throw new LZMAOptionsError(
LZMA_OPTIONS_ERROR,
'memlimit bigint exceeds UINT64_MAX (2^64 - 1); use a smaller value (UINT64_MAX = no limit on native; WASM buffer APIs default to 256 MiB when omitted)'
);
}
return;
}
if (!Number.isFinite(memlimit)) {
Expand Down
Loading
Loading