perf: speed up V8 snapshot deserialization #1977

Open
bartlomieju wants to merge 1 commit into consolidate-v8-patch-management from speed-up-snapshot-deserialization

@bartlomieju
Member

Summary

  • Skip write barriers during snapshot deserialization. During startup/shared-heap/context deserialization all objects go to old space and incremental marking is inactive, so every write barrier is a guaranteed no-op. Each one still costs ~8 instructions (MemoryChunk lookup + branches) per slot write; for ~875K slots, that is roughly 7 million wasted instructions.
  • Compile out trace_deserialization flag checks. The bytecode dispatch loop contains 31 if (v8_flags.trace_deserialization) checks, which bloat ReadSingleBytecodeData and hurt icache. They are replaced with TRACE_DESER macros that compile to nothing unless V8_ENABLE_TRACE_DESERIALIZATION is defined.
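
The first optimization can be illustrated with a toy model. This is a minimal sketch, not the patch itself: ToyHeap, BarrierMode, and DeserializeSlots are hypothetical names, and the "barrier" only counts how often it runs. It shows the general idea of selecting the barrier policy once, outside the loop, so the skip variant carries no per-slot check.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy model only: these types are hypothetical stand-ins, not V8's classes.
enum class BarrierMode { kFull, kSkip };

struct ToyHeap {
  bool incremental_marking_active = false;
  std::size_t barrier_checks = 0;  // how many times the barrier code ran

  // Models the per-write cost: inspect state and branch, even when the
  // outcome is "nothing to do".
  void WriteBarrier(std::uintptr_t* slot) {
    ++barrier_checks;
    if (!incremental_marking_active) return;  // guaranteed no-op path
    (void)slot;  // real marking work would record the slot here
  }
};

// Fills every slot. `mode` is a compile-time constant, so the compiler can
// drop the barrier branch entirely in the kSkip instantiation.
template <BarrierMode mode>
void DeserializeSlots(ToyHeap& heap, std::vector<std::uintptr_t>& slots) {
  if (mode == BarrierMode::kSkip) {
    // Skipping is only sound while the barrier would do nothing anyway.
    assert(!heap.incremental_marking_active);
  }
  for (std::size_t i = 0; i < slots.size(); ++i) {
    slots[i] = i + 1;  // stand-in for the deserialized value
    if (mode == BarrierMode::kFull) {
      heap.WriteBarrier(&slots[i]);
    }
  }
}
```

Calling DeserializeSlots<BarrierMode::kSkip> writes the same slot values as the kFull variant while leaving barrier_checks at zero. The assertion documents the precondition stated above: the shortcut is valid only while incremental marking is inactive.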

Estimated combined improvement: ~15-25% of total snapshot deserialization time (~2-3ms out of ~11ms on Apple Silicon).

Test plan

  • Verify patch applies: cd v8 && git am -3 ../patches/0004-*.patch
  • Build rusty_v8 and run existing tests
  • Measure with deno run --v8-flags=--profile-deserialization empty.js before/after

Two optimizations targeting the hot deserialization loop:

1. Skip write barriers during non-user-code deserialization. During
   startup/shared-heap/context deserialization all objects are allocated
   in old space and incremental marking is not active, so every write
   barrier is a guaranteed no-op — but each still pays ~8 instructions
   of MemoryChunk lookups and branch checks per slot write.

2. Compile out trace_deserialization flag checks from the hot bytecode
   dispatch loop (31 occurrences). Replace with TRACE_DESER macros that
   compile to nothing unless V8_ENABLE_TRACE_DESERIALIZATION is defined,
   reducing icache pressure in ReadSingleBytecodeData.
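
A compile-time tracing macro of the kind described in item 2 might look like the following. This is a sketch under assumptions: the macro body, g_trace_deserialization (a stand-in for v8_flags.trace_deserialization), and HandleBytecode are illustrative, and the patch's actual definition may differ.

```cpp
#include <cstdio>

// Assumption: this definition is illustrative; the patch's actual macro
// body is not shown in this description.
#ifdef V8_ENABLE_TRACE_DESERIALIZATION
// Hypothetical stand-in for v8_flags.trace_deserialization.
static bool g_trace_deserialization = true;
#define TRACE_DESER(...)                                   \
  do {                                                     \
    if (g_trace_deserialization) std::printf(__VA_ARGS__); \
  } while (false)
#else
// Expands to an empty statement: no flag load, no branch, and the
// arguments are never evaluated.
#define TRACE_DESER(...) \
  do {                   \
  } while (false)
#endif

// Hypothetical stand-in for one case in the bytecode dispatch loop.
int HandleBytecode(int bytecode) {
  TRACE_DESER("handling bytecode %d\n", bytecode);
  return bytecode + 1;
}
```

In this sketch, compiling with the V8_ENABLE_TRACE_DESERIALIZATION symbol defined restores the runtime flag check and the trace output; without it, each call site contributes no code to the loop body.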

@bartlomieju force-pushed the speed-up-snapshot-deserialization branch from 5f136f0 to deced33 on May 7, 2026 07:26