Skip to content

test(security): Phase 4 audit — test coverage + 11 impl fixes#986

Merged
boorad merged 8 commits into
mainfrom
feat/security-audit-phase-4
Apr 27, 2026
Merged

test(security): Phase 4 audit — test coverage + 11 impl fixes#986
boorad merged 8 commits into
mainfrom
feat/security-audit-phase-4

Conversation

@boorad
Copy link
Copy Markdown
Collaborator

@boorad boorad commented Apr 27, 2026

Summary

Phase 4 of the security audit. Adds comprehensive test coverage across NIST KATs, RFC vectors, AEAD misuse resistance, wrong-key/IV rejection, and cross-implementation verification, then fixes the 11 failing tests that coverage exposed.

7 commits, +1,783/-218 lines.

Test coverage added

  • 4.1 — NIST KATs: Hash, AES-GCM/CCM/OCB, ML-DSA, ML-KEM
  • 4.2 — RFC vectors: HKDF (RFC 5869), Argon2 (RFC 9106), Scrypt (RFC 7914), BLAKE3 (official test set)
  • 4.3 — AEAD misuse: setAAD-after-update, getAuthTag-on-Decipher, setAuthTag-on-Cipher, final-without-setAuthTag — across GCM/CCM/OCB/ChaCha20-Poly1305/X(Salsa|ChaCha)20-Poly1305
  • 4.4 — Boundary rejection: wrong key/IV/nonce length for AEAD modes + representative legacy block modes
  • 4.5 — Async correctness: fixed fire-and-forget assertions in PBKDF2 / Random tests that were silently passing on rejection
  • 4.6 — Cross-impl verify: ECDSA, Ed25519, RSA-PKCS1-v1.5, RSA-PSS, PBKDF2, HKDF — pinned bytes from Node.js, verified by RNQC

Implementation fixes (the final commit)

The new coverage exposed 11 failures. All fixed in one commit:

AEAD cipher state guards (4 tests)

  • HybridCipher base class now tracks has_update_called; setAAD throws if called after update (OCB, ChaCha20-Poly1305 silently accepted misordered AAD, allowing AAD truncation attacks)
  • CCMCipher::update for decipher no longer throws on missing-tag verification — defers to final() so the failure surfaces at the documented place
  • ChaCha20Poly1305Cipher::final() explicitly checks auth_tag_state for decipher (OpenSSL's chacha20-poly1305 EVP_CipherFinal_ex doesn't surface a missing tag, silently accepting unauthenticated ciphertext)
  • setAuthTag overrides on OCB and ChaCha20-Poly1305 now also transition auth_tag_state (was only the base method)

ECDSA P1363↔DER auto-detect (1 test)

  • HybridEcKeyPair::verify was unconditionally converting sigs P1363→DER, failing for callers that pass DER (subtle.verify with dsaEncoding:'der', Node-API path). Now: if sig_len == 2*n treat as P1363 and convert, otherwise pass through.

Random fixes (4 tests)

  • randomInt async callback was passing undefined instead of null as the error arg
  • randomFill async wasn't writing back to the caller's buffer (native works on a copy for thread safety); JS layer now copies the randomized window back to restore Node-API in-place semantics

HKDF Case 6 fixture (2 tests)

  • IKM hex string was 11 bytes of 0b but the expected OKM is the verbatim RFC 5869 A.6 vector requiring 22 bytes. Implementation was correct; the fixture had a typo.

Test plan

  • Rebuild iOS Pods + reload RN example app; metro log shows 0 failures (was 11 in latest run)
  • Same on Android
  • No regressions in previously-passing AEAD round-trip tests (CCM happy path with setAuthTag still works — the deferred error path only fires on misuse)

🤖 Generated with Claude Code

boorad added 7 commits April 27, 2026 11:37
…DF2, Random)

Pre-fix, every `crypto.pbkdf2(..., cb)` and `crypto.randomFill(..., cb)` test
ran its assertions inside the callback after the test function had already
resolved. A wrong digest or a non-null `err` produced an unhandled rejection
rather than a test failure — i.e. these tests were silently always green.

Each affected test now wraps the callback in `new Promise<void>((resolve,
reject) => ...)` and returns it so the runner's `await test()` actually
observes the assertion outcome.

Touched:

  PBKDF2 — RFC 6070 testFn, the `handles buffers` mixed sync/async test, and
  the per-algorithm fixture-driven `async w/ ...` loop (≈42 generated tests).

  Random — `simple test 5/6/7/8`, `randomFill - deepStringEqual -
  Buffer/Uint8Array`, the `randomFill (async) - view over larger buffer ...`
  regression test, the `randomBytes`/`pseudoRandomBytes` length matrix (16
  generated tests), and the three `randomInt - Asynchronous API / positive
  range / negative range` 100-iteration soak tests. Each `randomInt` test
  now uses a single `settled` flag so the first-failing assertion rejects
  exactly once even when 100 callbacks are in flight.

Adds 2 negative regression tests (one per suite) that purposely assert
`expect(...).to.equal(<wrong>)` inside a callback — verifying via
`assertThrowsAsync` that the new wrapper actually surfaces the failure.

Smoke-test cases that have no assertions (e.g. `randomInt 1`, `randomFill
int16`) are intentionally left as-is to keep this commit scoped to the
fire-and-forget *assertion* class.
…crypt, BLAKE3)

HKDF — expand from RFC 5869 Case 1 only to all 7 cases (§A.1–A.7) covering
both SHA-256 (Cases 1–3) and SHA-1 (Cases 4–7), basic + long inputs,
zero-length salt+info, and Case 7's "salt not provided" path. Also adds a
SHA-1 WebCrypto `subtle.deriveBits` test the existing Node-API loop doesn't
exercise.

Argon2 — add output-byte comparison against the canonical RFC 9106 §5 KAT
tags for argon2d / argon2i / argon2id. The existing tests only checked
`result.length === 32`, accepting any 32 random bytes. Tags pulled from
Node.js's `test-webcrypto-derivebits-argon2.js` (Node's vectors are taken
from RFC 9106 §5.1/§5.2/§5.3). The pre-existing `RFC_PARAMS` input set
already had the §5 (P, S, K, X, t, m, p, T, v=0x13) tuple; `version: 0x13`
is now passed explicitly so a future binding-default change can't silently
break the KAT.

Scrypt — add RFC 7914 §11 Test Case 4 (P="pleaseletmein", S="SodiumChloride",
N=2^20, r=8, p=1, dkLen=64) with maxmem: 1.5 GiB. Lives in its own opt-in
suite `scrypt-tc4-slow` because the working set is ~1.07 GiB and would OOM
on Android emulators / slow CI substantially — Node.js's parallel scrypt
test omits this vector for the same reason. Sync + async variants.

BLAKE3 — add BLAKE3_KAT_CASES with first-32-bytes-of-extended-output for
keyed_hash and derive_key modes for input_len ∈ {0, 1, 8, 64}, sourced
verbatim from packages/react-native-quick-crypto/deps/blake3/test_vectors/
test_vectors.json. Pre-existing tests checked that keyed mode produced
*something different* from unkeyed but never pinned to the published BLAKE3
KAT bytes. Module-load assertion that BLAKE3_KAT_KEY.length === 32 so future
Unicode contamination of the source string can't silently shift every
expected output.

Crypto-specialist independently verified all 7 HKDF tuples, all 3 Argon2
tags, the Scrypt TC4 expected output, and all 8 BLAKE3 (mode, input_len)
entries against their RFC / source-of-truth values.
… ML-DSA, ML-KEM)

Hash (SHA family) — 33 new tests pinning empty-string + "abc" outputs for
sha1/224/256/384/512, sha512-224, sha512-256, sha3-224/256/384/512 against
FIPS 180-4 Appendix C / FIPS 202 §B.1 published values, plus the FIPS 180-4
§B.3/§B.5 long-input ("a" × 1,000,000) vectors for SHA-256 and SHA-512 to
exercise the multi-chunk path. The empty + "abc" outputs are also driven
through the `hash()` one-shot wrapper so both the streaming and one-shot
APIs are pinned.

AES-GCM/CCM/OCB — new AEAD_KATS array with NIST GCM Test Cases 2/3/4
(Joux/McGrew GCM Test Vectors), NIST SP 800-38C CCM Examples C.1 (Tlen=4 B),
C.2 (Tlen=6 B), C.3 (Tlen=8 B), and RFC 7253 §A AES-OCB vectors (empty +
8B P/8B AAD). Each KAT runs both encrypt (asserting ciphertext + tag bytes)
and decrypt (asserting plaintext recovers from given C+T). Until now the
cipher suite only round-tripped over getCiphers() — that catches wiring
bugs but doesn't pin bit-exact output against another implementation.

ML-DSA — add cross-variant rejection (44-sig under 65-pub must fail),
tampered-message rejection, and full PKCS8/SPKI export → import → sign+verify
round-trip per variant, verifying the imported signature against both the
imported public key and the originally-generated public key.

ML-KEM — add FIPS 203 implicit-rejection tests: tampered ciphertext returns
32 deterministic-but-different bytes (must not throw); wrong private key
likewise produces different deterministic bytes. Plus cross-variant size
rejection (768 ciphertext into 512 priv must throw — size validation runs
before any KEM op). OpenSSL doesn't expose seeded ML-DSA/ML-KEM keygen so
we can't anchor to FIPS 204/203 KAT outputs deterministically; these tests
pin the FIPS-mandated *properties* observable at a black-box level.

Crypto-specialist verified the FIPS 180-4/202 hash digests and the NIST
AES-GCM/CCM/OCB AEAD outputs. Two transcription errors were caught and
corrected before commit:

  - SHA-512/224 empty-string output had a wrong digit count + value
    (had 32-byte length, must be 28 bytes). Re-derived from
    `openssl dgst -sha512-224`.

  - OCB §A "8B P + 8B AAD" entry was against the wrong nonce
    N=...221103 with bogus C/T values. Replaced with the actual §A
    N=...221101 vector C=6820b3657b6f615a
    T=5725bda0d3b4eb3a257c9af1f8f03009 (verified against Node's OpenSSL).
Each AEAD spec mandates a strict ordering of API calls; implementations
that silently accept misordered calls open up real attacks (e.g. setAAD
after update would let an attacker truncate AAD bytes the application
thought were authenticated).

Adds 4 tests per cipher across aes-128-gcm, aes-256-gcm, aes-128-ccm,
aes-128-ocb, and chacha20-poly1305 — 20 tests total:

  1. setAAD after update must throw — pinning that the auth tag commits
     to the *full* AAD, not a truncated prefix.

  2. setAuthTag on a Cipher instance must throw — only Decipher consumes
     tags, allowing this on Cipher would invite confusion between "this
     is the tag I computed" and "this is the tag I was given".

  3. getAuthTag on a Decipher instance must throw — only Cipher produces
     tags, calling this on Decipher is meaningless and must surface as
     an error rather than a silent empty / stale buffer.

  4. decipher.final() without first calling setAuthTag must throw —
     otherwise the call accepts unauthenticated ciphertext, defeating
     the AEAD guarantee entirely.

Pinning these matches Node's crypto-module behavior.
Phase 3.1 pinned AES-CBC, AES-CCM, AES-GCM, and xsalsa20. This sweep
extends to aes-128-ocb, aes-256-ocb, chacha20-poly1305, aes-192-cbc,
aes-256-ctr, des-ede3-cbc, plus the libsodium-only xchacha20-poly1305
and xsalsa20-poly1305.

Per cipher, 4 tests pin the boundary:

  (a) correct (key, iv) lengths do NOT throw
  (b) too-short key throws RangeError matching /key length/
  (c) too-long key throws RangeError matching /key length/
  (d) wrong iv length throws RangeError matching /(iv|nonce) length/
      — libsodium ciphers say "nonce", OpenSSL says "iv".

32 generated tests total across 8 ciphers. The point of the sweep isn't
exhaustive cipher coverage (the existing big roundtrip loop over
getCiphers() already exercises every cipher) but to confirm boundary
rejection fires uniformly across the validator's three code paths: the
libsodium fast-path (strict equality), the OpenSSL default-match
fast-path, and the OpenSSL per-parameter fallback that calls
getCipherInfo(name, undefined, ivLen) to ask whether the ivLen is
acceptable.
…s ↔ RNQC)

The fixtures below are generated by Node.js's `crypto` module on the host
(script kept inline below for reproducibility) and pinned hex/b64 in the
new test file `example/src/tests/keys/cross_impl_verify.ts` so RNQC's
verify path is exercised against bytes it didn't produce. Catches the
bug class where RNQC and Node both round-trip with themselves but disagree
on the wire format — e.g. an ECDSA signature DER-encoded with a
leading-zero bug, or a PBKDF2 output that uses the wrong endianness for
the iteration counter on one side.

Six pinned interop checks:

  1. ECDSA P-256/SHA-256 (SPKI + sig) — RNQC verifies, with a Node-API
     fallback if `dsaEncoding: 'der'` isn't honored at the WebCrypto layer.

  2. Ed25519 (SPKI + sig) — RNQC verifies. Ed25519 is fully deterministic
     per RFC 8032, so a passing verify is itself proof that RNQC's
     verifier reproduces the exact bit-string Node's signer emits.

  3. RSASSA-PKCS1-v1_5/SHA-256/2048 (SPKI + sig) — RNQC verifies.

  4. RSA-PSS/SHA-256/2048 (saltLength=32) — RNQC verifies. (Re-signing
     under RNQC would not be a determinism check because PSS uses a
     random salt.)

  5. PBKDF2-HMAC-SHA-256 (100k iters, 32 B) — RNQC sync + async output
     bytes must match Node byte-exact.

  6. HKDF-SHA-256 (32 B) — RNQC sync + async output bytes must match Node
     byte-exact.

Generation script (run once to produce the fixtures, the outputs of which
are pinned in the test file):

  node -e "
    const c=require('crypto');
    const ec=c.generateKeyPairSync('ec',{namedCurve:'P-256'});
    const sig=c.createSign('SHA256')
      .update('cross-impl test message').sign(ec.privateKey);
    console.log(ec.publicKey.export({type:'spki',format:'der'}).toString('base64'));
    console.log(sig.toString('hex'));
  "

(equivalent calls for Ed25519, RSA-PKCS1-v1_5, RSA-PSS, PBKDF2, HKDF).
Five distinct fixes addressing failures introduced by the Phase 4.3
(AEAD misuse), 4.5 (Random callback), and 4.6 (cross-impl verify)
suites:

1. AEAD cipher state guards (4 tests) — adds `has_update_called` to
   HybridCipher base; `setAAD` throws if called after `update` for
   OCB and ChaCha20-Poly1305. OpenSSL silently accepts misordered
   AAD/data on those modes, letting an attacker truncate
   authenticated data.

   For CCM decipher, `update()` no longer throws on missing-tag
   verification — defers to `final()` so the failure surfaces at
   the documented place ("auth tag mismatch on final") rather than
   the misleading "Tag verification failed on update".

   ChaCha20-Poly1305 `final()` explicitly checks `auth_tag_state`
   for decipher; OpenSSL's chacha20-poly1305 `EVP_CipherFinal_ex`
   does not flag a missing tag, which would silently accept
   unauthenticated ciphertext.

   `setAuthTag` overrides on OCB and ChaCha20-Poly1305 now also
   transition `auth_tag_state` (was only the base-class method).

2. ECDSA P1363↔DER auto-detect (1 test) — `HybridEcKeyPair::verify`
   was unconditionally converting sigs from P1363 to DER, which
   fails for callers that already pass DER (`subtle.verify` with
   `dsaEncoding:'der'`, or the Node-API path). Now: if `sig_len ==
   2*n` treat as P1363 and convert, otherwise pass through as DER.

3. `randomInt` async callback (3 tests) — was passing `undefined`
   as the error arg; tests pin Node-style `cb(null, value)`.

4. `randomFill` in-place semantics (1 test) — the native async
   path operates on a copy of the underlying buffer (necessary for
   thread safety on the worker), so the caller's view never saw
   the random bytes. JS layer now copies the randomized window
   from the C++ result back into the caller's buffer, restoring
   Node-API in-place semantics.

5. HKDF RFC 5869 Case 6 fixture (2 tests) — IKM was 11 bytes of
   0x0b but the expected OKM `0ac1af70…` is the verbatim RFC A.6
   vector, which requires 22 bytes of 0x0b. The implementation was
   correct; the test fixture had a typo (Case 7 below uses the full
   22-byte IKM, matching RFC A.7).
@boorad boorad self-assigned this Apr 27, 2026
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Apr 27, 2026

🤖 End-to-End Test Results - iOS

Status: ✅ Passed
Platform: iOS
Run: 25009259084

📸 Final Test Screenshot

Maestro Test Results - ios

Screenshot automatically captured from End-to-End tests and will expire in 30 days


This comment is automatically updated on each test run.

Phase 3 (commit 6870c90, on main) bumped the Android workflow from Node
20 to Node 24. Under Node 24 + the TS 5.9 that bun resolves into the
prepare-script process, `moduleResolution: "node"` (alias for the legacy
`node10` algorithm) is treated as a hard TS5107 error and breaks
`bun install`'s prepare hook before any Android build steps run:

  tsconfig.json(12,25): error TS5107: Option 'moduleResolution=node10'
  is deprecated and will stop functioning in TypeScript 7.0.

Switch to `bundler` to match what the shared `config/tsconfig.json`
already uses for the example app and the workspace root. `bundler` is
the recommended setting for a Metro/Bun-bundled package and is not
flagged for removal.

Verified locally: `bun --filter='*' typescript` passes, `bun run prepare`
in the rnqc package succeeds.
@github-actions
Copy link
Copy Markdown
Contributor

🤖 End-to-End Test Results - Android

Status: ✅ Passed
Platform: Android
Run: 25009259087

📸 Final Test Screenshot

Maestro Test Results - android

Screenshot automatically captured from End-to-End tests and will expire in 30 days


This comment is automatically updated on each test run.

@boorad boorad merged commit 3e06d95 into main Apr 27, 2026
7 checks passed
@boorad boorad deleted the feat/security-audit-phase-4 branch April 27, 2026 17:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant