test(security): Phase 4 audit — test coverage + 11 impl fixes#986
Merged
Conversation
…DF2, Random) Pre-fix, every `crypto.pbkdf2(..., cb)` and `crypto.randomFill(..., cb)` test ran its assertions inside the callback after the test function had already resolved. A wrong digest or a non-null `err` produced an unhandled rejection rather than a test failure — i.e. these tests were silently always green. Each affected test now wraps the callback in `new Promise<void>((resolve, reject) => ...)` and returns it so the runner's `await test()` actually observes the assertion outcome. Touched: PBKDF2 — RFC 6070 testFn, the `handles buffers` mixed sync/async test, and the per-algorithm fixture-driven `async w/ ...` loop (≈42 generated tests). Random — `simple test 5/6/7/8`, `randomFill - deepStringEqual - Buffer/Uint8Array`, the `randomFill (async) - view over larger buffer ...` regression test, the `randomBytes`/`pseudoRandomBytes` length matrix (16 generated tests), and the three `randomInt - Asynchronous API / positive range / negative range` 100-iteration soak tests. Each `randomInt` test now uses a single `settled` flag so the first-failing assertion rejects exactly once even when 100 callbacks are in flight. Adds 2 negative regression tests (one per suite) that purposely assert `expect(...).to.equal(<wrong>)` inside a callback — verifying via `assertThrowsAsync` that the new wrapper actually surfaces the failure. Smoke-test cases that have no assertions (e.g. `randomInt 1`, `randomFill int16`) are intentionally left as-is to keep this commit scoped to the fire-and-forget *assertion* class.
…crypt, BLAKE3)
HKDF — expand from RFC 5869 Case 1 only to all 7 cases (§A.1–A.7) covering
both SHA-256 (Cases 1–3) and SHA-1 (Cases 4–7), basic + long inputs,
zero-length salt+info, and Case 7's "salt not provided" path. Also adds a
SHA-1 WebCrypto `subtle.deriveBits` test the existing Node-API loop doesn't
exercise.
Argon2 — add output-byte comparison against the canonical RFC 9106 §5 KAT
tags for argon2d / argon2i / argon2id. The existing tests only checked
`result.length === 32`, accepting any 32 random bytes. Tags pulled from
Node.js's `test-webcrypto-derivebits-argon2.js` (Node's vectors are taken
from RFC 9106 §5.1/§5.2/§5.3). The pre-existing `RFC_PARAMS` input set
already had the §5 (P, S, K, X, t, m, p, T, v=0x13) tuple; `version: 0x13`
is now passed explicitly so a future binding-default change can't silently
break the KAT.
Scrypt — add RFC 7914 §11 Test Case 4 (P="pleaseletmein", S="SodiumChloride",
N=2^20, r=8, p=1, dkLen=64) with maxmem: 1.5 GiB. Lives in its own opt-in
suite `scrypt-tc4-slow` because the working set is ~1.07 GiB and would OOM
on Android emulators / slow CI substantially — Node.js's parallel scrypt
test omits this vector for the same reason. Sync + async variants.
BLAKE3 — add BLAKE3_KAT_CASES with first-32-bytes-of-extended-output for
keyed_hash and derive_key modes for input_len ∈ {0, 1, 8, 64}, sourced
verbatim from packages/react-native-quick-crypto/deps/blake3/test_vectors/
test_vectors.json. Pre-existing tests checked that keyed mode produced
*something different* from unkeyed but never pinned to the published BLAKE3
KAT bytes. Module-load assertion that BLAKE3_KAT_KEY.length === 32 so future
Unicode contamination of the source string can't silently shift every
expected output.
Crypto-specialist independently verified all 7 HKDF tuples, all 3 Argon2
tags, the Scrypt TC4 expected output, and all 8 BLAKE3 (mode, input_len)
entries against their RFC / source-of-truth values.
… ML-DSA, ML-KEM)
Hash (SHA family) — 33 new tests pinning empty-string + "abc" outputs for
sha1/224/256/384/512, sha512-224, sha512-256, sha3-224/256/384/512 against
FIPS 180-4 Appendix C / FIPS 202 §B.1 published values, plus the FIPS 180-4
§B.3/§B.5 long-input ("a" × 1,000,000) vectors for SHA-256 and SHA-512 to
exercise the multi-chunk path. The empty + "abc" outputs are also driven
through the `hash()` one-shot wrapper so both the streaming and one-shot
APIs are pinned.
AES-GCM/CCM/OCB — new AEAD_KATS array with NIST GCM Test Cases 2/3/4
(Joux/McGrew GCM Test Vectors), NIST SP 800-38C CCM Examples C.1 (Tlen=4 B),
C.2 (Tlen=6 B), C.3 (Tlen=8 B), and RFC 7253 §A AES-OCB vectors (empty +
8B P/8B AAD). Each KAT runs both encrypt (asserting ciphertext + tag bytes)
and decrypt (asserting plaintext recovers from given C+T). Until now the
cipher suite only round-tripped over getCiphers() — that catches wiring
bugs but doesn't pin bit-exact output against another implementation.
ML-DSA — add cross-variant rejection (44-sig under 65-pub must fail),
tampered-message rejection, and full PKCS8/SPKI export → import → sign+verify
round-trip per variant, verifying the imported signature against both the
imported public key and the originally-generated public key.
ML-KEM — add FIPS 203 implicit-rejection tests: tampered ciphertext returns
32 deterministic-but-different bytes (must not throw); wrong private key
likewise produces different deterministic bytes. Plus cross-variant size
rejection (768 ciphertext into 512 priv must throw — size validation runs
before any KEM op). OpenSSL doesn't expose seeded ML-DSA/ML-KEM keygen so
we can't anchor to FIPS 204/203 KAT outputs deterministically; these tests
pin the FIPS-mandated *properties* observable at a black-box level.
Crypto-specialist verified the FIPS 180-4/202 hash digests and the NIST
AES-GCM/CCM/OCB AEAD outputs. Two transcription errors were caught and
corrected before commit:
- SHA-512/224 empty-string output had a wrong digit count + value
(had 32-byte length, must be 28 bytes). Re-derived from
`openssl dgst -sha512-224`.
- OCB §A "8B P + 8B AAD" entry was against the wrong nonce
N=...221103 with bogus C/T values. Replaced with the actual §A
N=...221101 vector C=6820b3657b6f615a
T=5725bda0d3b4eb3a257c9af1f8f03009 (verified against Node's OpenSSL).
Each AEAD spec mandates a strict ordering of API calls; implementations
that silently accept misordered calls open up real attacks (e.g. setAAD
after update would let an attacker truncate AAD bytes the application
thought were authenticated).
Adds 4 tests per cipher across aes-128-gcm, aes-256-gcm, aes-128-ccm,
aes-128-ocb, and chacha20-poly1305 — 20 tests total:
1. setAAD after update must throw — pinning that the auth tag commits
to the *full* AAD, not a truncated prefix.
2. setAuthTag on a Cipher instance must throw — only Decipher consumes
tags, allowing this on Cipher would invite confusion between "this
is the tag I computed" and "this is the tag I was given".
3. getAuthTag on a Decipher instance must throw — only Cipher produces
tags, calling this on Decipher is meaningless and must surface as
an error rather than a silent empty / stale buffer.
4. decipher.final() without first calling setAuthTag must throw —
otherwise the call accepts unauthenticated ciphertext, defeating
the AEAD guarantee entirely.
Pinning these matches Node's crypto-module behavior.
Phase 3.1 pinned AES-CBC, AES-CCM, AES-GCM, and xsalsa20. This sweep
extends to aes-128-ocb, aes-256-ocb, chacha20-poly1305, aes-192-cbc,
aes-256-ctr, des-ede3-cbc, plus the libsodium-only xchacha20-poly1305
and xsalsa20-poly1305.
Per cipher, 4 tests pin the boundary:
(a) correct (key, iv) lengths do NOT throw
(b) too-short key throws RangeError matching /key length/
(c) too-long key throws RangeError matching /key length/
(d) wrong iv length throws RangeError matching /(iv|nonce) length/
— libsodium ciphers say "nonce", OpenSSL says "iv".
32 generated tests total across 8 ciphers. The point of the sweep isn't
exhaustive cipher coverage (the existing big roundtrip loop over
getCiphers() already exercises every cipher) but to confirm boundary
rejection fires uniformly across the validator's three code paths: the
libsodium fast-path (strict equality), the OpenSSL default-match
fast-path, and the OpenSSL per-parameter fallback that calls
getCipherInfo(name, undefined, ivLen) to ask whether the ivLen is
acceptable.
…s ↔ RNQC)
The fixtures below are generated by Node.js's `crypto` module on the host
(script kept inline below for reproducibility) and pinned hex/b64 in the
new test file `example/src/tests/keys/cross_impl_verify.ts` so RNQC's
verify path is exercised against bytes it didn't produce. Catches the
bug class where RNQC and Node both round-trip with themselves but disagree
on the wire format — e.g. an ECDSA signature DER-encoded with a
leading-zero bug, or a PBKDF2 output that uses the wrong endianness for
the iteration counter on one side.
Six pinned interop checks:
1. ECDSA P-256/SHA-256 (SPKI + sig) — RNQC verifies, with a Node-API
fallback if `dsaEncoding: 'der'` isn't honored at the WebCrypto layer.
2. Ed25519 (SPKI + sig) — RNQC verifies. Ed25519 is fully deterministic
per RFC 8032, so a passing verify is itself proof that RNQC's
verifier reproduces the exact bit-string Node's signer emits.
3. RSASSA-PKCS1-v1_5/SHA-256/2048 (SPKI + sig) — RNQC verifies.
4. RSA-PSS/SHA-256/2048 (saltLength=32) — RNQC verifies. (Re-signing
under RNQC would not be a determinism check because PSS uses a
random salt.)
5. PBKDF2-HMAC-SHA-256 (100k iters, 32 B) — RNQC sync + async output
bytes must match Node byte-exact.
6. HKDF-SHA-256 (32 B) — RNQC sync + async output bytes must match Node
byte-exact.
Generation script (run once to produce the fixtures, the outputs of which
are pinned in the test file):
node -e "
const c=require('crypto');
const ec=c.generateKeyPairSync('ec',{namedCurve:'P-256'});
const sig=c.createSign('SHA256')
.update('cross-impl test message').sign(ec.privateKey);
console.log(ec.publicKey.export({type:'spki',format:'der'}).toString('base64'));
console.log(sig.toString('hex'));
"
(equivalent calls for Ed25519, RSA-PKCS1-v1_5, RSA-PSS, PBKDF2, HKDF).
Five distinct fixes addressing failures introduced by the Phase 4.3
(AEAD misuse), 4.5 (Random callback), and 4.6 (cross-impl verify)
suites:
1. AEAD cipher state guards (4 tests) — adds `has_update_called` to
HybridCipher base; `setAAD` throws if called after `update` for
OCB and ChaCha20-Poly1305. OpenSSL silently accepts misordered
AAD/data on those modes, letting an attacker truncate
authenticated data.
For CCM decipher, `update()` no longer throws on missing-tag
verification — defers to `final()` so the failure surfaces at
the documented place ("auth tag mismatch on final") rather than
the misleading "Tag verification failed on update".
ChaCha20-Poly1305 `final()` explicitly checks `auth_tag_state`
for decipher; OpenSSL's chacha20-poly1305 `EVP_CipherFinal_ex`
does not flag a missing tag, which would silently accept
unauthenticated ciphertext.
`setAuthTag` overrides on OCB and ChaCha20-Poly1305 now also
transition `auth_tag_state` (was only the base-class method).
2. ECDSA P1363↔DER auto-detect (1 test) — `HybridEcKeyPair::verify`
was unconditionally converting sigs from P1363 to DER, which
fails for callers that already pass DER (`subtle.verify` with
`dsaEncoding:'der'`, or the Node-API path). Now: if `sig_len ==
2*n` treat as P1363 and convert, otherwise pass through as DER.
3. `randomInt` async callback (3 tests) — was passing `undefined`
as the error arg; tests pin Node-style `cb(null, value)`.
4. `randomFill` in-place semantics (1 test) — the native async
path operates on a copy of the underlying buffer (necessary for
thread safety on the worker), so the caller's view never saw
the random bytes. JS layer now copies the randomized window
from the C++ result back into the caller's buffer, restoring
Node-API in-place semantics.
5. HKDF RFC 5869 Case 6 fixture (2 tests) — IKM was 11 bytes of
0x0b but the expected OKM `0ac1af70…` is the verbatim RFC A.6
vector, which requires 22 bytes of 0x0b. The implementation was
correct; the test fixture had a typo (Case 7 below uses the full
22-byte IKM, matching RFC A.7).
Contributor
🤖 End-to-End Test Results - iOSStatus: ✅ Passed 📸 Final Test ScreenshotScreenshot automatically captured from End-to-End tests and will expire in 30 days This comment is automatically updated on each test run. |
Phase 3 (commit 6870c90, on main) bumped the Android workflow from Node 20 to Node 24. Under Node 24 + the TS 5.9 that bun resolves into the prepare-script process, `moduleResolution: "node"` (alias for the legacy `node10` algorithm) is treated as a hard TS5107 error and breaks `bun install`'s prepare hook before any Android build steps run: tsconfig.json(12,25): error TS5107: Option 'moduleResolution=node10' is deprecated and will stop functioning in TypeScript 7.0. Switch to `bundler` to match what the shared `config/tsconfig.json` already uses for the example app and the workspace root. `bundler` is the recommended setting for a Metro/Bun-bundled package and is not flagged for removal. Verified locally: `bun --filter='*' typescript` passes, `bun run prepare` in the rnqc package succeeds.
Contributor
🤖 End-to-End Test Results - AndroidStatus: ✅ Passed 📸 Final Test ScreenshotScreenshot automatically captured from End-to-End tests and will expire in 30 days This comment is automatically updated on each test run. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.


Summary
Phase 4 of the security audit. Adds comprehensive test coverage across NIST KATs, RFC vectors, AEAD misuse resistance, wrong-key/IV rejection, and cross-implementation verification, then fixes the 11 failing tests that coverage exposed.
7 commits, +1,783/-218 lines.
Test coverage added
Implementation fixes (the final commit)
The new coverage exposed 11 failures. All fixed in one commit:
AEAD cipher state guards (4 tests)
HybridCipherbase class now trackshas_update_called;setAADthrows if called afterupdate(OCB, ChaCha20-Poly1305 silently accepted misordered AAD, allowing AAD truncation attacks)CCMCipher::updatefor decipher no longer throws on missing-tag verification — defers tofinal()so the failure surfaces at the documented placeChaCha20Poly1305Cipher::final()explicitly checksauth_tag_statefor decipher (OpenSSL's chacha20-poly1305 EVP_CipherFinal_ex doesn't surface a missing tag, silently accepting unauthenticated ciphertext)setAuthTagoverrides on OCB and ChaCha20-Poly1305 now also transitionauth_tag_state(was only the base method)ECDSA P1363↔DER auto-detect (1 test)
HybridEcKeyPair::verifywas unconditionally converting sigs P1363→DER, failing for callers that pass DER (subtle.verifywithdsaEncoding:'der', Node-API path). Now: ifsig_len == 2*ntreat as P1363 and convert, otherwise pass through.Random fixes (4 tests)
randomIntasync callback was passingundefinedinstead ofnullas the error argrandomFillasync wasn't writing back to the caller's buffer (native works on a copy for thread safety); JS layer now copies the randomized window back to restore Node-API in-place semanticsHKDF Case 6 fixture (2 tests)
0bbut the expected OKM is the verbatim RFC 5869 A.6 vector requiring 22 bytes. Implementation was correct; the fixture had a typo.Test plan
setAuthTagstill works — the deferred error path only fires on misuse)🤖 Generated with Claude Code