
perf: reduce allocation in parse tokenizer + matcher hot-path structure #169

Open
homanp wants to merge 1 commit into micromatch:master from homanp:perf/parse-allocation-and-matcher-structure

homanp commented Apr 19, 2026

Two files, one PR. Match-phase wins depend on both.

lib/parse.js: splitTopLevel switched from for…of (allocates a char-string per iteration) to numeric index + charCodeAt. Sticky-flagged REGEX_NON_SPECIAL_CHARS_STICKY replaces REGEX_NON_SPECIAL_CHARS.exec(remaining()) so the plain-text run consumer doesn't slice per call. Char-code fast paths for [a-zA-Z0-9_-], /, . at the top of the main loop, merging into the previous text token in place. Plus smaller things: peek split into peek1() / peek(n), opts.noextglob hoisted, globstar fragment cached per-parse, append() inlined, dead consume() / state.consumed removed.
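For reviewers less familiar with the sticky flag, here is a minimal sketch of the technique. The names `NON_SPECIAL_STICKY`, `consumePlainRun`, and `consumePlainRunSliced` are hypothetical, and the character class is a simplified stand-in, not the actual REGEX_NON_SPECIAL_CHARS from lib/parse.js.

```javascript
// Simplified stand-in for "a run of non-special characters".
// Assumption: the real character class in lib/parse.js differs.
const NON_SPECIAL_STICKY = /[^*?\[\]{}()!+@\\\/.]+/y;
const NON_SPECIAL_ANCHORED = /^[^*?\[\]{}()!+@\\\/.]+/;

// Sticky variant: matches only at `index`, no substring is created first.
function consumePlainRun(input, index) {
  NON_SPECIAL_STICKY.lastIndex = index;
  const match = NON_SPECIAL_STICKY.exec(input);
  return match === null ? '' : match[0];
}

// Slice variant: same result, but allocates input.slice(index) on every call.
function consumePlainRunSliced(input, index) {
  const match = NON_SPECIAL_ANCHORED.exec(input.slice(index));
  return match === null ? '' : match[0];
}
```

Both functions should return the same run for any index; the sticky one just avoids the intermediate string.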

lib/picomatch.js: dispatch specialization. Three matcher factories chosen at compile time by option shape, so no-options string-glob callers (most of them) skip the per-call typeof checks on callbacks/options. Lazy regex compilation on that hot path: picomatch(pattern) returns a matcher without calling makeRe, and the regex builds on first call and caches in closure. Four shape fastpaths in makeRe extending the existing parse.fastpaths idea, covering src/**/*.js, src/**/*.{js,ts}, !(seg)/**/*.ext, and src/**/!(*.test).{js,ts}. delete regex.state made conditional on returnState to avoid a hidden-class transition.
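A rough sketch of how dispatch specialization and lazy compilation fit together. This is not the code in the PR: it uses two factories instead of three, and `createMatcher`, `createFastMatcher`, `createGeneralMatcher`, and the `compile` parameter (standing in for makeRe) are hypothetical names.

```javascript
// No-options path: nothing is compiled until the matcher is first called.
function createFastMatcher(pattern, compile) {
  let regex = null; // built lazily, then cached in the closure
  return (input) => {
    if (regex === null) regex = compile(pattern, undefined);
    return input !== '' && regex.test(input);
  };
}

// General path: compiles eagerly and resolves the callback check once,
// at factory time, instead of on every call.
function createGeneralMatcher(pattern, options, compile) {
  const regex = compile(pattern, options);
  const onMatch = typeof options.onMatch === 'function' ? options.onMatch : null;
  return (input) => {
    const isMatch = input !== '' && regex.test(input);
    if (isMatch && onMatch !== null) onMatch({ input, pattern });
    return isMatch;
  };
}

// The factory is chosen once, by option shape, when the matcher is created.
function createMatcher(pattern, options, compile) {
  return options === undefined
    ? createFastMatcher(pattern, compile)
    : createGeneralMatcher(pattern, options, compile);
}
```

With a `compile` stub that counts its invocations, the fast path shows zero compilations until the first match call and exactly one afterwards.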

Verification

1,975 tests pass, lint clean. The existing 41-fixture equivalence grid matches master byte-for-byte. I also built a 10,481-comparison grid (223 real-world pattern shapes × 47 boundary-case paths including dotfiles, double extensions, deep nesting, the usual gotchas) specifically to check the hardcoded shape-fastpath outputs against the parser. Also byte-identical.
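The grid idea can be sketched as a pattern × path cross product that runs a reference matcher and a candidate matcher side by side. This is an illustrative harness, not the one used for the PR: `runEquivalenceGrid`, `makeReference`, and `makeCandidate` are hypothetical names, and it compares boolean match results, whereas the actual grid may compare generated regex sources.

```javascript
// Compares two matcher factories over every (pattern, path) pair.
// makeReference / makeCandidate: (pattern) => (path) => boolean
function runEquivalenceGrid(patterns, paths, makeReference, makeCandidate) {
  const mismatches = [];
  let comparisons = 0;
  for (const pattern of patterns) {
    const reference = makeReference(pattern);
    const candidate = makeCandidate(pattern);
    for (const path of paths) {
      comparisons++;
      const expected = reference(path);
      const actual = candidate(path);
      if (expected !== actual) {
        mismatches.push({ pattern, path, expected, actual });
      }
    }
  }
  return { comparisons, mismatches };
}
```

With 223 patterns and 47 paths this performs 223 × 47 = 10,481 comparisons, and an empty `mismatches` array means the two implementations agree on the whole grid.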

GHSA-c2c7-rcm5-vvqj unaffected. Nothing here touches extglob quantifier construction.

Numbers

Median of 5 runs, M-series Mac, Node 23.11. Match-phase throughput (one pattern, many paths, the chokidar shape):

| case | master | PR | ratio |
| --- | --- | --- | --- |
| `src/**/*.js` | 21.3M | 37.0M | 1.74× |
| `src/**/*.{js,ts,...}` | 20.9M | 37.0M | 1.77× |
| `!(seg)/**/*.ext` | 9.8M | 13.7M | 1.39× |
| `src/**/!(*.test).{js,ts}` | 22.0M | 38.1M | 1.73× |
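For anyone wanting to reproduce the shape of this measurement, a minimal median-of-N throughput harness might look like the following. It is a sketch, not the bench used for the numbers above; `median`, `measureOpsPerSec`, and `medianThroughput` are hypothetical helpers, and it assumes a global `performance.now()` (available in current Node versions and browsers).

```javascript
// Median of a list of numbers (average of the two middle values if even).
function median(values) {
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2;
}

// One pattern, many paths: call the matcher on every path, `rounds` times.
function measureOpsPerSec(matcher, paths, rounds) {
  let hits = 0; // accumulate results so the loop has an observable effect
  const start = performance.now();
  for (let r = 0; r < rounds; r++) {
    for (let i = 0; i < paths.length; i++) {
      if (matcher(paths[i])) hits++;
    }
  }
  const elapsedMs = performance.now() - start;
  return { opsPerSec: (rounds * paths.length) / (elapsedMs / 1000), hits };
}

// Repeats the measurement and reports the median throughput.
function medianThroughput(matcher, paths, rounds, runs = 5) {
  const samples = [];
  for (let i = 0; i < runs; i++) {
    samples.push(measureOpsPerSec(matcher, paths, rounds).opsPerSec);
  }
  return median(samples);
}
```

Taking the median rather than the mean keeps a single slow run (GC pause, thermal throttling) from skewing the reported figure.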

Compile-phase throughput is higher too, but most of that jump comes from lazy compilation deferring the regex build past the point the bench measures; it does not mean 240× less work.

Saw the README line on accuracy over perf, hence the equivalence grid before trusting the shape fastpaths. If any of these changes lands somewhere you'd rather not touch, say so and I'll drop it.

Semantics-preserving changes across lib/parse.js (allocation reductions
in the main tokenizer loop) and lib/picomatch.js (hot-path matcher
structure, lazy regex compilation, shape fastpaths). All verified
against the existing mocha/lint suite plus two equivalence grids
(41 adversarial fixtures and ~10,500 comparisons across real-world
pattern shapes).

lib/parse.js:

- splitTopLevel rewritten to scan with numeric index + charCodeAt
  instead of the string iterator protocol. Collects split positions,
  slices substrings at the end, avoiding per-iteration string
  allocation and per-char accumulator growth.
- REGEX_NON_SPECIAL_CHARS_STICKY (sticky-flagged twin) replaces
  REGEX_NON_SPECIAL_CHARS.exec(remaining()), eliminating the per-call
  input.slice() allocation in the plain-text run consumer.
- Char-code-dispatched fast paths at the top of the main tokenizer
  loop for plain-text runs, '/', '.'. Same outputs, with in-place
  merging into prev text token to skip the allocation that push()
  would immediately consume via its text-merge branch.
- peek split into peek1() (n=1 hot case) and peek(n) (generic).
- opts.noextglob hoisted to local boolean.
- { ...options } clone skipped when minimatch-compat bridge isn't
  needed; globstar regex fragment cached per-parse.
- append() inlined; dead consume() / state.consumed removed.
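The splitTopLevel change described above can be sketched as follows. This is an assumption-laden illustration, not the function from lib/parse.js: `splitTopLevelSketch` is a hypothetical name, and it assumes the separator is `|` and that only parentheses create nesting.

```javascript
const CHAR_LEFT_PAREN = 40;  // '('
const CHAR_RIGHT_PAREN = 41; // ')'
const CHAR_BACKSLASH = 92;   // '\'
const CHAR_PIPE = 124;       // '|'

// Splits on '|' outside parentheses. Scans by char code, records only the
// split positions, and slices the substrings once at the end.
function splitTopLevelSketch(input) {
  const positions = [];
  let depth = 0;
  for (let i = 0; i < input.length; i++) {
    const code = input.charCodeAt(i);
    if (code === CHAR_BACKSLASH) {
      i++; // skip the escaped character
    } else if (code === CHAR_LEFT_PAREN) {
      depth++;
    } else if (code === CHAR_RIGHT_PAREN) {
      if (depth > 0) depth--;
    } else if (code === CHAR_PIPE && depth === 0) {
      positions.push(i);
    }
  }
  const parts = [];
  let start = 0;
  for (let p = 0; p < positions.length; p++) {
    parts.push(input.slice(start, positions[p]));
    start = positions[p] + 1;
  }
  parts.push(input.slice(start));
  return parts;
}
```

Compared with a for…of loop that appends each character to an accumulator string, this version creates one string per output part and none per input character.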

lib/picomatch.js:

- Dispatch specialization: matcher factories selected at compile
  time based on which options are set. No-options string-glob
  callers (the majority of real chokidar/fast-glob call sites) get
  a minimal closure; everything else goes through getSlow() with
  unchanged logic.
- Lazy regex compilation on the no-options hot path. picomatch()
  returns a matcher without calling makeRe; the regex is built on
  the first call to the matcher and cached in the closure.
- Four additional shape fastpaths in makeRe extending the existing
  parse.fastpaths idea:
    <seg>/**/*.<ext>                       (src/**/*.js)
    <seg>/**/*.{e1,e2,...}                 (src/**/*.{js,ts})
    !(<seg>)/**/*.<ext>                    (!(node_modules)/**/*.js)
    <path>/**/!(*.a|*.b).{e1,e2,...}       (src/**/!(*.test).{js,ts})
  Only activate when options is undefined and the pattern matches
  the shape; fall through to the full parser otherwise. Hardcoded
  outputs verified byte-for-byte against the parser on a 10,481-
  comparison equivalence grid.
- Conditional 'delete regex.state': only when returnState is truthy.
  Avoids hidden-class transition on every compiled regex.
- onResult/onMatch/onIgnore typeof checks hoisted out of the matcher
  closure.
- input !== '' replaces input.length !== 0 on the matcher hot path.
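To make the shape-fastpath idea concrete, here is a minimal sketch for the first shape only (`<seg>/**/*.<ext>`). Everything in it is illustrative: `makeReWithFastpath`, `SHAPE_SEG_GLOBSTAR_EXT`, and `fullCompile` are hypothetical names, and the hardcoded regex is a simplified approximation of dotfile-excluding globstar semantics, not the byte-identical source the PR emits.

```javascript
// Recognizes patterns like "src/**/*.js": one plain segment, a globstar,
// then "*." followed by a plain extension.
const SHAPE_SEG_GLOBSTAR_EXT = /^([A-Za-z0-9_-]+)\/\*\*\/\*\.([A-Za-z0-9]+)$/;

// fullCompile: (pattern, options) => RegExp, standing in for the full parser.
function makeReWithFastpath(pattern, options, fullCompile) {
  if (options === undefined) {
    const shape = SHAPE_SEG_GLOBSTAR_EXT.exec(pattern);
    if (shape !== null) {
      const seg = shape[1];
      const ext = shape[2];
      // Illustrative regex: "seg/", any number of non-dot directories,
      // then a non-dot basename ending in ".ext".
      return new RegExp(`^${seg}\\/(?:[^./][^/]*\\/)*[^./][^/]*\\.${ext}$`);
    }
  }
  // Any options, or any other pattern shape: use the full parser.
  return fullCompile(pattern, options);
}
```

The important property is the fall-through: the fastpath only fires when options is undefined and the shape matches exactly, so every other input takes the unchanged path.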

No API or behavior changes. All 1975 mocha tests pass, lint clean.