
perf(cache): reuse a single SHA-256 hasher across blobToHash calls#128

Open
KodyJKing wants to merge 2 commits into main from kodyk/wasm-hash-singleton

@KodyJKing KodyJKing commented Jun 23, 2026

Collaborator

Description

  • Use a single shared hash-wasm hasher so that blobToHash no longer creates a new WASM instance on every call.
  • Use hash-wasm's resumable hashing API (save()/load()) so that concurrent hash calls cannot corrupt each other's state.

See FFP PR #12270 for motivation.

Related Issue

Motivation and Context

How Has This Been Tested?

Screenshots (if appropriate):

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

  • I have signed the Adobe Open Source CLA.
  • My code follows the code style of this project.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I have read the CONTRIBUTING document.
  • I have added tests to cover my changes.
  • All new and existing tests passed.

Previously blobToHash instantiated a fresh hash-wasm instance per call.
Reuse one lazily-created global instance, using hash-wasm's resumable
save()/load() so concurrent calls interleave at their stream reads
without serializing or corrupting each other. Optimizes the I/O-bound
case where blobs are often not fully in memory.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Comment on lines +60 to +62
hasher.load(state);
hasher.update(chunk.value);
state = hasher.save();

@KodyJKing KodyJKing Jun 23, 2026

Collaborator Author


The save() and load() calls are the only likely performance cost of this change.

They are needed because the hasher is now shared: every await reader.read() yields control, so concurrent hash calls can interleave, and without a per-call snapshot they would corrupt each other's state.
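
To make the pattern above concrete, here is a minimal sketch rather than the code in this PR. It assumes a toy stand-in hasher (32-bit FNV-1a, not SHA-256) that only mimics the init/update/digest/save/load shape, so it runs without hash-wasm. The names `createToyHasher`, `getHasher`, `readerOf`, and `hashChunks` are hypothetical and chosen for illustration.

```typescript
// Toy stand-in with the init/update/digest/save/load shape discussed above.
// It computes 32-bit FNV-1a, NOT SHA-256, and is NOT the hash-wasm API.
interface ResumableHasher {
  init(): void;
  update(data: Uint8Array): void;
  digest(): string;
  save(): Uint8Array;
  load(state: Uint8Array): void;
}

let instancesCreated = 0; // counts how often the "expensive" setup runs

async function createToyHasher(): Promise<ResumableHasher> {
  instancesCreated++;
  let h = 0x811c9dc5; // FNV-1a offset basis
  const hasher: ResumableHasher = {
    init(): void { h = 0x811c9dc5; },
    update(data: Uint8Array): void {
      for (let i = 0; i < data.length; i++) {
        h = Math.imul(h ^ data[i], 0x01000193) >>> 0;
      }
    },
    digest(): string { return ("0000000" + h.toString(16)).slice(-8); },
    save(): Uint8Array {
      return new Uint8Array([h >>> 24, (h >>> 16) & 255, (h >>> 8) & 255, h & 255]);
    },
    load(s: Uint8Array): void {
      h = ((s[0] << 24) | (s[1] << 16) | (s[2] << 8) | s[3]) >>> 0;
    },
  };
  return hasher;
}

// Lazily created, shared by every call.
let sharedHasher: Promise<ResumableHasher> | undefined;
function getHasher(): Promise<ResumableHasher> {
  if (!sharedHasher) sharedHasher = createToyHasher();
  return sharedHasher;
}

// Minimal reader shape, standing in for blob.stream().getReader().
interface ChunkReader {
  read(): Promise<{ done: boolean; value?: Uint8Array }>;
}

function readerOf(chunks: Uint8Array[]): ChunkReader {
  let i = 0;
  return {
    async read(): Promise<{ done: boolean; value?: Uint8Array }> {
      await Promise.resolve(); // yield, as a real stream read would
      if (i < chunks.length) return { done: false, value: chunks[i++] };
      return { done: true };
    },
  };
}

async function hashChunks(reader: ChunkReader): Promise<string> {
  const hasher = await getHasher();
  hasher.init();
  let state = hasher.save(); // this call's private snapshot
  for (;;) {
    const chunk = await reader.read(); // other calls may use the hasher here
    if (chunk.done || !chunk.value) break;
    hasher.load(state); // restore this call's progress
    hasher.update(chunk.value);
    state = hasher.save(); // snapshot again before the next await
  }
  hasher.load(state);
  return hasher.digest();
}
```

In this sketch no `await` sits between a `load()` and the following `save()` or `digest()`, so each call only ever touches the shared hasher while holding its own state. Running two `hashChunks` calls under `Promise.all` should therefore give the same digests as running them one after the other.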

@KodyJKing KodyJKing requested review from AnaSathish and krisnye June 23, 2026 22:50
…eads

Adds a deterministic test that drives concurrent blobToHash calls
through a gated fake Blob, forcing chunk reads to interleave round-robin
and asserting results equal the serial digests. Fails on a shared hasher
without save()/load(); passes with it.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
}
});

it("interleaved concurrent reads match serial hashes", async () => {

Collaborator Author


I verified that this test fails when save()/load() are omitted, so it does exercise interleaved reads.
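
The gated-read idea behind this test can be sketched as follows. This is an illustration under assumptions, not the test added in this PR: it uses plain functions instead of the project's test runner, and a hypothetical byte-sum accumulator in place of the SHA-256 hasher so that the expected values are easy to check by hand. `GatedReader`, `sumShared`, `runInterleaved`, and `demo` are invented names.

```typescript
// Hypothetical shared accumulator with a save()/load() shape.
// It sums bytes; it is NOT SHA-256 and NOT the hash-wasm API.
let total = 0;
const shared = {
  init(): void { total = 0; },
  update(data: Uint8Array): void {
    for (let i = 0; i < data.length; i++) total += data[i];
  },
  save(): number { return total; },
  load(state: number): void { total = state; },
  digest(): number { return total; },
};

type ReadResult = { done: boolean; value?: Uint8Array };

// A reader whose read() resolves only when the test releases it.
class GatedReader {
  private chunks: Uint8Array[];
  private index = 0;
  private gate: (() => void) | undefined = undefined;

  constructor(chunks: Uint8Array[]) { this.chunks = chunks; }

  read(): Promise<ReadResult> {
    return new Promise<ReadResult>((resolve) => {
      this.gate = () => {
        this.gate = undefined;
        if (this.index < this.chunks.length) {
          resolve({ done: false, value: this.chunks[this.index++] });
        } else {
          resolve({ done: true });
        }
      };
    });
  }

  // Release the pending read, if any; returns false when nothing is waiting.
  release(): boolean {
    const gate = this.gate;
    if (!gate) return false;
    gate();
    return true;
  }
}

// Function under test: uses the shared accumulator, optionally with save/load.
async function sumShared(reader: GatedReader, useSaveLoad: boolean): Promise<number> {
  shared.init();
  let state = shared.save();
  for (;;) {
    const chunk = await reader.read();
    if (chunk.done || !chunk.value) break;
    if (useSaveLoad) shared.load(state);
    shared.update(chunk.value);
    state = shared.save();
  }
  if (useSaveLoad) shared.load(state);
  return shared.digest();
}

// Start all calls, then release one read per call in round-robin order.
async function runInterleaved(inputs: Uint8Array[][], useSaveLoad: boolean): Promise<number[]> {
  const readers = inputs.map((chunks) => new GatedReader(chunks));
  const results = Promise.all(readers.map((r) => sumShared(r, useSaveLoad)));
  let released = true;
  while (released) {
    released = false;
    for (const r of readers) {
      if (r.release()) released = true;
      // Let the released call run up to its next read before moving on.
      await new Promise<void>((resolve) => setTimeout(resolve, 0));
    }
  }
  return results;
}

async function demo(): Promise<{ safe: number[]; unsafe: number[] }> {
  const a = [new Uint8Array([1, 2]), new Uint8Array([3])];     // serial sum: 6
  const b = [new Uint8Array([10]), new Uint8Array([20, 30])];  // serial sum: 60
  const safe = await runInterleaved([a, b], true);
  const unsafe = await runInterleaved([a, b], false);
  return { safe, unsafe };
}
```

With save/load enabled, `demo()` should report the serial sums `[6, 60]`. Without it, both calls read the same polluted accumulator and report `[66, 66]`, which is the kind of failure the gate is meant to provoke deterministically.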
