Skip to content

perf(compile): build dispatch handler context as a stable-shape literal #52

Open
ChristianMurphy wants to merge 1 commit into
syntax-tree:mainfrom
ChristianMurphy:perf/dispatch-context-reuse
Open

perf(compile): build dispatch handler context as a stable-shape literal #52
ChristianMurphy wants to merge 1 commit into
syntax-tree:mainfrom
ChristianMurphy:perf/dispatch-context-reuse

Conversation

@ChristianMurphy

@ChristianMurphy ChristianMurphy commented May 3, 2026

Copy link
Copy Markdown
Member

Initial checklist

  • I read the support docs
  • I read the contributing guide
  • I agree to follow the code of conduct
  • I searched issues and discussions and couldn’t find anything or linked relevant results below
  • I made sure the docs are up to date
  • I included tests (or that’s not needed)

Description of changes

The compile loop built each handler's this-binding with
Object.assign({sliceSerialize: ...}, context). Object.assign goes
through a generic copy-loop over the source's enumerable properties;
V8 cannot give that loop a stable hidden class for the merged object
because the source's shape is a runtime variable. Each event paid the
cost of a copy-iteration plus a hidden-class transition.

The fix replaces the Object.assign with an inline object literal that
lists every field explicitly. The literal has a fixed compile-time
shape so V8 keeps a single hidden class for the merged object across
every event, allocates it through a fast path, and can inline the
construction at the call site. The merged object is still allocated
per event, so the prior contract (handlers may capture this or
reassign top-level fields without leaking into subsequent events) is
preserved. The dispatch is otherwise unchanged.

Inputs that benefit, with multi-run median-of-medians vs the baseline
(spread in parentheses):

10,000 short backtick code spans -36.1% (5.3%)
1,000 inline links in a paragraph -31.0% (2.0%)
10,000 character entity references -29.9% (2.2%)
one CommonMark example -13.8% (5.5%)
CommonMark spec * 35 (~564 KB) -10.6% (3.6%)
CommonMark spec * 7 (~113 KB) -7.8% (18.3%)
full CommonMark spec (~16 KB) -7.4% (43.5%, NOISY)

Every input that drives a non-trivial number of events through the
dispatch loop benefits, because Object.assign was the cost being
eliminated. Inputs with high node-count-per-byte (many small inline
tokens) gain the most.

Trade-offs and inputs where the gain is small or absent:

A 10,000-unmatched-asterisks input and a 256 KB Unicode-heavy input
both moved within +/- 1% of baseline. Their event count per byte is
low, so the per-event Object.assign was not a hotspot.

The pure emphasis stress inputs ('a**b' repeated 10,000 times and
similar) reported large deltas in single runs but their cross-run
spread is 44 to 53% on the baseline alone. The input shape (mostly
attentionSequence events that mostly do not match a handler) means
the per-event merge was not the cost driver. The deltas there are
noise.

Tests pass: dev + prod 1448/1448, 100% coverage maintained,
mdast-util-gfm 54/54, mdast-util-mdx 11/13. The two failing mdx tests
reproduce on upstream/main and are not introduced by this branch.

@github-actions github-actions Bot added 👋 phase/new Post is being triaged automatically 🤞 phase/open Post is being triaged manually and removed 👋 phase/new Post is being triaged automatically labels May 3, 2026
@ChristianMurphy ChristianMurphy requested a review from Copilot May 3, 2026 13:53
@ChristianMurphy ChristianMurphy added the 🏁 area/perf This affects performance label May 3, 2026

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR optimizes the markdown-to-mdast compile dispatch loop by reducing per-event allocations in the hot path, aiming to improve performance on large event streams.

Changes:

  • Pre-allocates a single merged handler this-binding object (callContext) and mutates only sliceSerialize per event.
  • Refactors the loop to reuse event local variable for cleaner access and slightly less indexing.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread dev/lib/index.js Outdated
@ChristianMurphy ChristianMurphy force-pushed the perf/dispatch-context-reuse branch from 42ca832 to a01d72a Compare May 3, 2026 15:08
The compile loop built each handler's this-binding with
Object.assign({sliceSerialize: ...}, context). Object.assign goes
through a generic copy-loop over the source's enumerable properties;
V8 cannot give that loop a stable hidden class for the merged object
because the source's shape is a runtime variable. Each event paid the
cost of a copy-iteration plus a hidden-class transition.

The fix replaces the Object.assign with an inline object literal that
lists every field explicitly. The literal has a fixed compile-time
shape so V8 keeps a single hidden class for the merged object across
every event, allocates it through a fast path, and can inline the
construction at the call site. The merged object is still allocated
per event, so the prior contract (handlers may capture `this` or
reassign top-level fields without leaking into subsequent events) is
preserved. The dispatch is otherwise unchanged.

Inputs that benefit, with multi-run median-of-medians vs the baseline
(spread in parentheses):

  10,000 short backtick code spans          -36.1%  (5.3%)
  1,000 inline links in a paragraph         -31.0%  (2.0%)
  10,000 character entity references        -29.9%  (2.2%)
  one CommonMark example                    -13.8%  (5.5%)
  CommonMark spec * 35  (~564 KB)           -10.6%  (3.6%)
  CommonMark spec * 7  (~113 KB)             -7.8%  (18.3%)
  full CommonMark spec  (~16 KB)             -7.4%  (43.5%, NOISY)

Every input that drives a non-trivial number of events through the
dispatch loop benefits, because Object.assign was the cost being
eliminated. Inputs with high node-count-per-byte (many small inline
tokens) gain the most.

Trade-offs and inputs where the gain is small or absent:

A 10,000-unmatched-asterisks input and a 256 KB Unicode-heavy input
both moved within +/- 1% of baseline. Their event count per byte is
low, so the per-event Object.assign was not a hotspot.

The pure emphasis stress inputs ('a**b' repeated 10,000 times and
similar) reported large deltas in single runs but their cross-run
spread is 44 to 53% on the baseline alone. The input shape (mostly
attentionSequence events that mostly do not match a handler) means
the per-event merge was not the cost driver. The deltas there are
noise.

Tests pass: dev + prod 1448/1448, 100% coverage maintained,
mdast-util-gfm 54/54, mdast-util-mdx 11/13. The two failing mdx tests
reproduce on upstream/main and are not introduced by this branch.
@ChristianMurphy ChristianMurphy force-pushed the perf/dispatch-context-reuse branch from a01d72a to 5deffad Compare May 3, 2026 15:13
@ChristianMurphy ChristianMurphy changed the title perf(compile): reuse dispatch handler context perf(compile): build dispatch handler context as a stable-shape literal May 3, 2026
@ChristianMurphy ChristianMurphy requested a review from Copilot May 3, 2026 15:13

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 1 out of 2 changed files in this pull request and generated 1 comment.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread dev/lib/index.js
Comment on lines +255 to +257
// less to optimize. A fresh object per event preserves the previous
// contract that handlers may capture `this` or reassign top-level fields
// without leaking into subsequent events.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair comment

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

🏁 area/perf This affects performance 🤞 phase/open Post is being triaged manually

Development

Successfully merging this pull request may close these issues.

3 participants