Rebuild cache: sound type-aware invalidation via two-layer scheme#87
Merged
Conversation
The per-file mtime cache shipped in 3.1.0 is unsound for type-aware rules — see commit 4f4f923 for context. The minimal patch in 4f4f923 restored the 3.0.4 invariant (don't cache type-aware rule results) but type-aware rules then pay full cost on every run, which is the workload TSSLint primarily serves. Rather than evolve the existing tuple-shaped cache, remove it entirely and rebuild from a sound foundation. Plan in `packages/cli/CACHE.md`. This commit deletes: - `packages/cli/lib/cache.ts` — load/save + path key derivation - `Reporter.withoutCache()` from `@tsslint/types` — only meaningful when there's a cache to opt out of - `core.FileLintCache` type and the `cache?` parameter throughout `core.lint`, `core.getCodeFixes`, `core.getRules`, `core.getConfigs` - The getter probe + `typeAwareRules` set added in 4f4f923 — needed only because there was a cache to gate; bring it back in step 1 of the new design - The `cache-typeaware.test.ts` suite — testing removed code - The `--force` CLI flag and its summary hint - `minimatchResult` per-pattern cache — was useful only as a third tuple slot; cost is microseconds per file, can be reintroduced later as a flat top-level key if profiling justifies it CLI smoke: `pnpm -r exec tsc --build` clean across all packages. Compat-eslint pipeline + lazy-estree tests pass unchanged (the `withoutCache()` mocks in compat-pipeline.test.ts are excess properties on object literals; TS allows, runtime no-ops since no caller invokes them after this commit). Net: -442 LOC.
Captures the design for the cache rewrite in `packages/cli/CACHE.md`:
- Two layers gated by an `--incremental` flag
- Layer 1: per-file mtime cache for syntactic rules, runtime
getter probe to classify type-aware rules (sticky), persist
classification across sessions. Always on.
- Layer 2: `BuilderProgram`-based affected-file tracking for
type-aware rule diagnostics, cross-session state via TSSLint-
managed `.tsbuildinfo`. Opt-in until perf is measured.
- New cache file shape: object form with `version`, `ruleModes`,
`files.{mtime, rules}`. Drops the 3.0.x tuple format and the
`minimatchResult` slot.
- Implementation order, edge cases (incremental + noEmit + tsconfig
collision, .tsbuildinfo format stability across TS majors),
backwards compat (none — path key includes pkg.version), and
open questions (default-on vs opt-in for layer 2, ruleModes
expiry policy).
This is a working doc; expected to evolve as implementation lands.
Ten items to run through before touching cache code, grouped by soundness / cache key / implementation / operational. Each item corresponds to a real failure mode the old design hit or could have hit (silent stale on global type changes, half-written JSON on SIGINT, missing TS-version in cache key, etc.). Also captures the realistic upper bound from profile data: cache caps at ~40% wall-time reduction on warm runs because TS Program build is unavoidable.
First step of the cache rebuild plan. Adds the foundation but no cache wiring yet — see packages/cli/CACHE.md for the full design. createLinter now: - Accepts an optional `initialTypeAwareRules` iterable. The CLI will load this from the cache file's `ruleModes` map on session start so a rule classified type-aware in any prior session stays classified — closes the cold-session-with-stale-cache hole 3.0.4 had where the first invocation of a session re-probed and could silently mismatch a stale cache entry. - Wraps `rulesContext.program` in a getter that flips a per-rule `touchedProgram` flag. After each rule call, if the flag is set, add the rule to the linter's `typeAwareRules` set (sticky). - Exposes `getTypeAwareRules(): ReadonlySet<string>` so the CLI can snapshot the classification at session end and persist into the cache file. No behavior change to cache-less operation: probe runs but its output isn't consulted yet. Layer 1 (next commit) is what gates cache reads/writes on the classification. Tests in packages/core/test/probe.test.ts (5 scenarios): - syntactic rule not classified - rule that reads program is classified - classification sticks past the touching file (cross-file in same session) - seeded `initialTypeAwareRules` preserved across sessions - `getTypeAwareRules()` returns the live set
CACHE.md item 7 — spike the LS / BP integration before wiring it
through the CLI. Six scenarios:
1. `builder.getProgram() === languageService.getProgram()` — no
double-build when wrapping LS's program in a builder
2. Cold first run flags every file as affected
3. Edit a.ts → both a.ts and b.ts (importer) in affected set
4. Identical re-run → no files affected (cache hit territory)
5. Edit unrelated c.ts → only c.ts affected, NOT a.ts/b.ts
6. Edit globals.d.ts → all consumers in affected set
↑ the motivating soundness case: ambient declarations don't
appear in any file's `imports` but BuilderProgram's reference
graph tracks them transitively. This is the gap layer 1
alone cannot close.
13/13 checks pass. The integration model works. Layer 2 wiring
into the CLI lint flow is the next step.
Test runs in-process (oldProgram passed directly). Cross-session
state transfer via `.tsbuildinfo` is a separate concern handled
when wiring `incremental: true` into the LS's compilerOptions.
CACHE.md steps 2 + 3 of the rebuild plan: the cache file format and
its wiring into core.lint(). CLI integration (file mtime check,
load/save in the lint loop) is the next commit.
CLI side — packages/cli/lib/cache.ts:
- Cache file at os.tmpdir()/tsslint-cache/<tsslint-ver>/<ts-ver>/<key>.cache.json.
TS version segment closes the .tsbuildinfo-format-cross-major hole
flagged in CACHE.md item 4.
- Object-form CacheData { version, ruleModes, files{mtime, rules} }
replaces the 3.0.x tuple form. `version: "v2"` lets us evolve the
schema; mismatch returns empty cache.
- Atomic write via tmp + rename (CACHE.md item 6). Half-written JSON
on SIGINT no longer corrupts the canonical file.
- Default hash is sha256 hex (was btoa, which produced filesystem-
ENAMETOOLONG paths on real configs).
- Loader treats any uncertainty (parse error, shape mismatch, version
drift) as a cache miss — CACHE.md item 2's "miss is always safe"
invariant.
- Also: scope `files` in cli's package.json to explicit paths so the
newly-introduced test/ doesn't leak to npm. Mirrors fix in e40bfbb
for core, 0585586 for compat-eslint.
Core side — packages/core/index.ts:
- New types `RuleCache { hasFix, diagnostics }` and `FileLintCache =
Record<ruleId, RuleCache>`. Per-file shape; CLI manages the file-
level mtime/version/ruleModes envelope.
- lint() takes optional fileCache. Read path: hit only when entry
exists AND rule is not in typeAwareRules — type-aware rules ignore
any cache entry, matching CACHE.md's layer 1 invariant.
- Write path: report() pushes a serialization-friendly twin of the
diagnostic (file: undefined, relatedInformation files reduced to
{ fileName }). After the rule call, if it touched program, the
whole entry is deleted; otherwise hasFix is set if any fix was
registered.
- Restored diagnostics on cache hit get rehydrated with the live
ts.SourceFile from the current Program.
Tests:
- packages/cli/test/cache.test.ts (14 checks): roundtrip, missing
file → empty, corrupted JSON → empty, version mismatch → empty,
shape mismatch → empty, ts-version segregation, config edit
invalidates path key, no .tmp leak after successful save.
- packages/core/test/cache-layer1.test.ts (18 checks): syntactic
rule cached, type-aware rule not cached, cache hit skips rule
re-execution, restored diagnostic has live file ref, report-then-
touch deletes entry, sticky cross-file, initialTypeAwareRules
ignores stale cache entry, hasFix flag, lint without fileCache
still works (back-compat).
Existing tests pass unchanged (probe, builder-program-poc, compat-
pipeline, lazy-estree).
Walks back the cache-aware lint() introduced in 2284d7c. The cache file format and lifecycle are CLI concerns; core's job is to run rules. Future cache changes (BuilderProgram-based invalidation, schema bumps) shouldn't churn this layer. API changes on createLinter(): - `lint(fileName, options?: { skipRules?: ReadonlySet<string> })` Replaces the cache-mutation overload. Caller passes rule IDs to skip; core just doesn't run them. Caller is responsible for merging cached output back into the diagnostic list. - `hasFixForDiagnostic(fileName, diagnostic): boolean` New per-diagnostic fix accessor. The CLI cache layer uses this to snapshot the `hasFix` flag for a rule's cache entry without reaching into core's internal `diagnostic2Fixes` map. - Removed types: `RuleCache`, `FileLintCache`. Both move to the CLI cache module in the next commit. - Removed the cache-write side of `report()`. Reports now only update the in-memory lint result; serialization for persistence is handled outside core. What stays in core: - The runtime probe + sticky `typeAwareRules` set + `getTypeAwareRules()` exposed for the CLI to persist into `ruleModes`. - `initialTypeAwareRules` constructor param so a session starting cold can preseed classification from the cache file. Tests: - packages/core/test/cache-layer1.test.ts removed — that suite tested cache mutation through lint(), which is no longer core's job. The same scenarios are reformulated against the CLI cache layer in the next commit. - packages/core/test/skip-rules.test.ts added (9 checks): rule in skipRules doesn't run, only specified rule is skipped, lint without options runs everything (back-compat), seeded classification persists through skipping, hasFixForDiagnostic returns true only for diags with registered fixes. Existing probe.test.ts and builder-program-poc.test.ts pass unchanged.
Counterpart to the prior commit (eaabb4c) that pulled cache concerns out of core. Lands the layer 1 logic in packages/cli/lib/cache-flow.ts: lintWithCache(linter, fileName, fileCache, fileMtime, program) → DiagnosticWithLocation[] Responsibilities, all owned by this module: - mtime check: bumping the file mtime resets every rule entry - skip-set computation: cached entries minus type-aware rules - call into core's `linter.lint(fileName, { skipRules })` - group fresh diagnostics by `diagnostic.code` (the rule ID core sets in `report()`) — that's the attribution channel - update cache for rules that ran: - type-aware rules: delete the entry (sticky cleanup, covers report-then-touch and stale prior-session entries) - syntactic rules: write fresh entry, derive `hasFix` via `linter.hasFixForDiagnostic` for each emitted diagnostic - rehydrate cached diagnostics for skipped rules — replace the serialized `file: undefined` with the live `ts.SourceFile` from the current Program; same for `relatedInformation[].file` Layer 2 (BuilderProgram-based affected-file invalidation) plugs in here later: it'll narrow the skip set further by removing entries the BuilderProgram says are affected. Core stays untouched. Tests in packages/cli/test/cache-flow.test.ts (25 checks): syntactic cached / type-aware not, cache hit skips rule, mtime change re-runs, report-then-touch deletes entry, sticky across files, stale entry ignored + cleaned, hasFix derivation, multi-rule partial cache. Setup notes: - Added `@tsslint/types` to cli's dependencies — needed for tests to type-resolve `Config` / `RuleContext` (cli only used them transitively via core before). - Added types/tsconfig project reference to cli's tsconfig. Total cache test surface across the rebuild: 8 + 9 + 13 + 14 + 25 = 69 checks. 144 across all suites.
End of CACHE.md step 4: cli/index.ts and worker.ts now go through the cache-flow module. Per-project cache file is loaded at startup, each file's lint pass uses cached entries when available, and the file is written back at end of project. Project class: - `cacheData: cache.CacheData` (defaults to empty cache) - `init()` calls `cache.loadCache(...)` unless `--force` is passed. Cache key includes ts.version now (the segment was added when the cache module landed; this is the first caller honouring it). Lint loop: - Per file: stat for mtime, get-or-create `project.cacheData.files[fileName]`, pass both to `worker.lint`. - After all files in a project: snapshot `worker.getTypeAwareRules()` into `project.cacheData.ruleModes`, then `cache.saveCache(...)`. - Diagnostic loop now also OR's in `fileCache.rules[diagnostic.code]?.hasFix` because cache-hit diagnostics don't re-register fixes through `linter.hasCodeFixes`. Worker (lib/worker.ts): - `setup` takes `initialTypeAwareRules: readonly string[]` and forwards to `core.createLinter`. - `lint(fileName, fix, fileCache, fileMtime)` — both cache args now required. Replaces direct `linter.lint` calls with `cacheFlow.lintWithCache`. After a fix writes the file, mtime is refreshed via `fs.statSync` so the post-fix re-lint sees the new mtime and invalidates layer-1 entries. - New `getTypeAwareRules()` wrapper for the CLI to snapshot. - `--fix` interaction: before calling lintWithCache in the fix branch, drop cached entries for rules with `hasFix === true`. Their `getEdits` callbacks don't survive the JSON cache round- trip, so we have to actually run those rules to rebuild them. Rules with no fixes can stay cached even under `--fix`. `--force` flag is back, behind a HELP-only documentation entry. With `--force`, `loadCache` is skipped — the project starts with an empty `cacheData` and all rules run fresh. The cache is still written at end of run. Smoke-tested on fixtures/define-rule/: - Cold run: lints, writes cache file with the no-console diagnostic + hasFix=true. - Warm run: cache hit on no-console; same diagnostic shown; `--fix` hint correctly surfaces (via the cached hasFix flag, since the rule didn't run this session). - `--fix` cold + warm: both apply the fix correctly. The pre-call drop of hasFix entries forces those rules to re-run and rebuild their getEdits callbacks. All 144 existing tests still pass (probe / skip-rules / builder- program-poc / cache module / cache-flow / compat-pipeline).
Adds the affected-file gate that BuilderProgram (or any equivalent
tracker) feeds layer 2 with. New option on lintWithCache:
options.typeAwareUnaffected: boolean
true → type-aware rules behave like syntactic ones for caching
(cache hit allowed, cache write on run)
false / undefined → mode A: type-aware rules always re-run, their
entries deleted (current default behavior — safe when
mtime alone can't catch ambient-declaration edits)
Behavior change is opt-in. Existing call sites pass no option and
keep the layer-1-only semantics. The CLI side that drives this
(creating BuilderProgram, walking getSemanticDiagnosticsOfNextAffected-
File, computing per-file affected sets, plumbing through worker.lint)
is the next commit.
Skip-set computation now considers the option:
- syntactic + cached → skip
- type-aware + cached + typeAwareUnaffected=true → skip
- type-aware + cached + typeAwareUnaffected!=true → run (current)
Cache-write logic mirrors:
- rule ran AND (syntactic OR typeAwareUnaffected=true) → write entry
- rule ran AND type-aware AND not typeAwareUnaffected → delete entry
Tests in cache-flow.test.ts cover four new scenarios (10–13):
- typeAwareUnaffected=true → first run caches the entry
- typeAwareUnaffected=true → second run cache hit, rule skipped
- mode B → mode A: entry replaced by re-run, cleanup applied
- default / explicit false → mode A semantics preserved
36 cache-flow checks total.
Drive-by: drop unused `SerializedRelatedInfo` import in cache-flow.ts
(the user's earlier serialize-helper rewrite no longer references it).
CLI surface for layer 2. Adds the `--incremental` flag (in HELP
between --force and --failures-only) and threads it through to
worker.setup as a new boolean param. Worker creates a Semantic-
DiagnosticsBuilderProgram once at setup and walks `getSemantic-
DiagnosticsOfNextAffectedFile` to collect the affected file set.
worker.lint computes `typeAwareUnaffected` per file:
!!affectedFiles && !fix && !affectedFiles.has(fileName)
- falsey when --incremental wasn't passed (no BP, layer 1 only)
- false in --fix mode (fix mutates files mid-session, the setup-
time affected snapshot becomes stale for downstream files —
conservatively treat all as affected)
- true when BP says the file is unaffected this session
The flag is forwarded to cache-flow's lintWithCache option,
already implemented in 576641f.
Limitation, documented inline: without cross-session BP state
(.tsbuildinfo persistence), the first session's BP has no
oldProgram and considers every file affected on cold start. So
within a one-shot CLI invocation, `--incremental` is effectively
a no-op today. The wiring lands here so the cross-session work
can plug in without touching cache-flow or core.
Smoke tests on fixtures/define-rule (cold without flag, cold with
--incremental, warm with --incremental) all produce the expected
diagnostic. Existing unit tests untouched (155 checks across
six suites).
Spawns the local tsslint binary against a throwaway fixture in
os.tmpdir() and inspects diagnostics + cache state on disk. The
in-process cache-flow tests cover the module's invariants — these
exercise everything around it: argv parsing, mtime detection from
the real filesystem, --force / --incremental gates, the cache file
load/save round-trip, and Project's wiring of fileCache through
worker.lint.
Six scenarios:
1. Cold run produces the no-console diagnostic and writes a cache
file containing a rule entry for the fixture.
2. Warm run produces the same diagnostic (cache hit doesn't lose
data).
3. Editing the linted file moves the cached `mtime` past the edit
on the next run (mtime invalidation works through the CLI).
4. Editing tsslint.config.ts mints a fresh cache file under a
different path key (config mtime+size in the hash).
5. --force still produces the diagnostic and exits non-zero.
6. --incremental is accepted and writes a cache (the layer-2
plumbing in 10f0f5c doesn't break anything).
Setup gotcha worth noting: macOS' /var symlinks to /private/var,
and the CLI canonicalises paths through realpath when keying the
cache. The fixture helper realpaths the temp dir up front so the
test's cache-file lookups line up with the keys the CLI writes.
Total cache test surface: 8 + 9 + 13 + 14 + 36 + 14 = 94 across
six suites, plus 75 in compat-pipeline.
Persists per-file state between `tsslint --incremental` runs so the next session can compute which files' type-relevant inputs (own text, transitive deps, ambient `.d.ts`, lib) have actually moved. Without this, the prior wiring (10f0f5c) treated every file as affected on cold start — wiring done but value was zero. Why not `.tsbuildinfo`? TS reads/writes it via `getProgramBuildInfo` / `createBuilderProgramUsingProgramBuildInfo`, both internal and TS- major-coupled. Going through public `BuilderProgram.getAllDependencies` + a content-hash digest trades some hit rate (we invalidate on body- only edits where TS's shape signatures wouldn't) for a contract that survives TS upgrades. New module — packages/cli/lib/incremental-state.ts: - `IncrementalState` shape: `{ version: 'v1', files: { [name]: { contentHash, deps: string[] } } }`. Saved as a new optional `incrementalState` field on the cache file. - `buildIncrementalState(builder, hash)` — harvest from a fresh BuilderProgram pass over the LS program (called once at end of project, after the lint loop). - `computeAffectedFiles(prev, program, hash)` — diff: a file is affected if its own content moved, any dep moved, it's newly added, or anyone listed a removed file in deps. Stale-deps risk is bounded because dep-graph changes always move the file's own content (import statement edit), so they propagate via the own-hash path. Worker: - `setup` now takes `prevIncrementalState` and (when `--incremental`) computes the affected set via the new module instead of the cold- start "everything affected" placeholder. - New `buildIncrementalState()` accessor, called from CLI at end of project to harvest fresh state for persistence. - `defaultHash` fallback (sha256 hex via `node:crypto`) if `ts.sys. createHash` is unavailable on the host. CLI: - Reads `project.cacheData.incrementalState` and passes to `worker.setup`. - After the lint loop, calls `worker.buildIncrementalState()` and writes back to `project.cacheData.incrementalState` before `cache.saveCache`. Cache schema (lib/cache.ts): - Optional `incrementalState?: IncrementalState` on `CacheData`. Layer-1-only sessions stay clean (field absent). Field's own `version` lets us bump the diff format without touching the cache schema. Tests in packages/cli/test/incremental-state.test.ts (18 checks): - No prior state → all affected - Schema version bump → all affected - Identical state → no user file affected (lib files mirrored in prev) - Own content changed → just that file - Dep changed (the `globals.d.ts` killer case) → all consumers affected - Newly added file → affected - Removed file → consumers affected (transitive impact propagates) - Independent file change doesn't affect unrelated files Smoke-tested on fixtures/define-rule with `--incremental`: cold run writes 84 files × deps into the cache file; warm run reads the state and produces the same diagnostic. Total cache test surface now: 8 + 9 + 13 + 14 + 36 + 18 + 14 = 112 across seven cache-related suites, plus 75 in compat-pipeline.
Two bugs found via the layer-2 soundness integration tests:
1. Cold session under --incremental wasn't writing type-aware cache
entries because the affected-file diff (correctly) lists every file
as affected on first encounter (no prev state). The single-flag
`typeAwareUnaffected` conflated "trust cache" with "write cache" —
so layer 2 had nothing to read on the warm session.
Split the option into two flags on cacheFlow.lintWithCache:
- `incremental: boolean` — write type-aware entries (next-session
bait). True ⇔ CLI passed --incremental.
- `typeAwareUnaffected: boolean` — trust this file's cached
type-aware entries (run-skip via skipRules). Only meaningful
when `incremental` is true.
cold session: incremental=true, unaffected=false → run + WRITE
warm session unaffected: incremental=true, unaffected=true → SKIP
warm session affected: incremental=true, unaffected=false → run + WRITE
no flag: any combination → never write type-aware (mode A)
2. `BuilderProgram.getAllDependencies` walks the explicit reference
graph (imports + /// references). Ambient `.d.ts` files in
script mode (e.g. user's `globals.d.ts` declaring globals) connect
only via global scope — no file imports them — so they're absent
from every consumer's dep list. Editing one wouldn't propagate to
the affected-file set; the killer layer-2 case fails silently.
Fix in incremental-state.ts: enumerate non-lib script-mode `.d.ts`
files at state-build time and treat them as universal deps. Lib
files are excluded since `compilerOptions.lib` flips already
invalidate the whole cache via the path key.
Detect script mode via `externalModuleIndicator` (cast since field
is technically internal in TS's public types, but stable at
runtime — used by typescript-eslint and ts-morph for the same
purpose).
Worker and cache-flow tests updated for the new option shape. Two new
cache-flow checks (38 total) cover the cold-session-writes-fresh-entry
case.
Three integration scenarios now passing end-to-end via the marker-
file fixture (rule writes its own execution count to disk, test reads
back to verify):
- Test 7: warm `--incremental` skips type-aware rule (cache hit)
- Test 8: editing `ambient.d.ts` forces dependent re-lint
- Test 9: without `--incremental`, type-aware rule re-runs every
session (mode A semantics preserved)
22 integration checks total (was 14). All other suites unchanged.
Two layered fixes after the layer 2 wiring crashed `JSON.stringify` on Dify (5867 source files, 16M+ transitive dep references → 80MB+ serialized payload, past V8's max string length). 1. Path interning. Every file path stored once in a `paths` table; the per-file entry uses integer indices in its `deps` list. On the same Dify program: ~5× compression vs the verbose form, and numbers serialize faster than strings. 2. Track only user files. node_modules and lib files aren't tracked; their changes don't propagate via this state. Trade-off: a `pnpm install` between `tsslint --incremental` runs serves stale type-aware results until `--force`. Acceptable for CLI use; lib changes are already covered by the cache path key (`compilerOptions.lib` participates in tsconfig+config mtime). Schema bump to `v2` (cache file's `incrementalState.version`). Old "v1" state files are treated as cache-misses on load — by-design, since the path-list/dep-list shape is incompatible. Smoke-tested on Dify web/ (5867 files, react-x/no-leaked-conditional- rendering rule, --incremental): - cold: 19.6 s, cache file 17 MB (vs 80MB+ pre-fix, would crash) - warm (no edits): 11.9 s — ~1.6× speedup from layer 2 cache hits Tests rewritten for the interned shape — 18 checks unchanged in spirit (test 8 renamed from "hash collision in deps" to "independent file change doesn't false-positive" — its actual intent).
The previous approach (path-interned content-hash + dep graph in 68140be) shipped working but with two limitations: - 17MB cache file on Dify (vs 3.6MB now) - drops node_modules dep tracking to fit; ambient declarations needed an explicit special case in code Both stem from rebuilding what TS already does internally for `tsc --incremental`. Switching to the underlying internal API (`getBuildInfo`, `createBuilderProgramUsingIncrementalBuildInfo`) gets all of TS's machinery for free: shape signatures, ambient detection, module augmentation, project references. Mechanism: BuilderProgram emits its state via `emitBuildInfo` to a synthetic `tsBuildInfoFile` path; we capture the text via the host's `writeFile` callback and stash it in `IncrementalState.tsBuildInfoText`. Next session reads it back through `getBuildInfo` and feeds the result to `createBuilderProgramUsingIncrementalBuildInfo`, producing an `oldBuilder` we pass to the new BP. Risk surface (documented in incremental-state.ts): - These two functions are not in `typescript.d.ts`. Stable across TS 5.x → 6.x but no public contract. The cache file path key already includes `ts.version`, so version drift gives a clean miss instead of corruption. - On any deserialization failure (parse error, version mismatch, schema bump), `reconstructOldBuilder` returns `undefined` — caller treats as cold start. The "miss is always safe" invariant. Schema bump to `v3` (the path-keyed `paths`/`files` structure is gone — now just `{ version, tsBuildInfoText }`). Old `v2` state files are treated as misses on load. Worker side: - `getScriptVersion` now falls back to file mtime (was always '0' for unmutated files). Without this, BP couldn't tell that a file's content changed across sessions. In-session bumps from `--fix` still win — they reflect content changes the BP needs to see immediately. - Setup: when `--incremental`, override `compilerOptions.incremental` + `tsBuildInfoFile` (synthetic path). User's own `tsc --incremental` builds use a different path; no collision. - One BP (cached as `currentBuilder`) covers both the affected-file diff at setup and the buildinfo capture at end of session. Dify-scale numbers (5867 files, react-x/no-leaked-conditional-rendering): cold (--incremental): ~58s ← +38s drain cost vs no-incremental warm (--incremental): ~6s ← BP says 0 affected, type-aware rules cache-hit no-incremental: ~10s ← layer 1 only Cold-time penalty is real: BP's `getSemanticDiagnosticsOfNextAffectedFile` forces full semantic-check work that TSSLint's regular lint pass doesn't trigger. CI workloads that always cold-start should NOT use `--incremental`. Repeat dev workflows pay it once and amortize over N warm runs. Tests: - incremental-state.test.ts rewritten for new shape (10 checks): state captured, version stamp, round-trip identical-program → 0 affected, round-trip with edited dep → consumer in affected, undefined / version-mismatch / corrupted-text → cold-start fallback. - integration.test.ts: cache-shape assertion updated (look for `tsBuildInfoText`, not `files`/`paths`). - All other layer 2 integration scenarios (test 7-9: cache hit on warm, ambient-edit invalidation, mode-A re-run) pass unchanged. Out: custom path table, content hashes, ambient-as-universal-dep workaround, 200 LOC of dep-graph plumbing.
The drain pattern from 1c4c888 paid for full semantic diagnostics on every affected file — TSSLint's lint pass already triggers semantic checks lazily for the symbols type-aware rules query, so doing it twice cost +38s on Dify cold (58s vs 10s baseline). `getSemanticDiagnosticsOfNextAffectedFile` accepts an `ignoreSource- File` callback. Returning true from it skips the diagnostic compute for that file but the internal `getNextAffectedFile` walk continues — graph propagation through the reference map still happens, signatures still get computed during emitBuildInfo, the next session can still diff. We just don't pay for diagnostics we never use. Threading the affected-file recorder through the callback keeps the shape we want (affected set in `affectedFiles`) and discards the unused diagnostics. Type wrinkle: the public type for `ignoreSourceFile` declares `(sourceFile: SourceFile) => boolean`, but TS internally calls it with the iterator's `affected` value — which can also be a Program (whole-program affected path, e.g. lib flip). Discriminate at runtime via the `fileName` field; cast through `SourceFile | Program`. Dify-scale numbers (5867 files, react-x/no-leaked-conditional-rendering): before this commit: --incremental cold: 58.3 s --incremental warm: 5.8 s no --incremental: 10.3 s after: --incremental cold: 10.9 s (+0.6 s vs baseline — noise) --incremental warm: 5.9 s (1.7× speedup over baseline) no --incremental: 10.3 s `--incremental` now strictly dominates layer 1 for warm and matches it for cold. CI workloads no longer need to avoid the flag. Existing tests pass unchanged (192 checks across 8 cache suites).
Cold-time penalty was the only reason `--incremental` had to be opt-in. After c6cfbac brought it to +0.6s vs baseline (noise), there's no longer a reason to keep two paths. `--force` already serves as the opt-out — it skips the cache load, so the next invocation starts cold (no prev state for layer 2 to diff against, type-aware rules re-run, fresh state written for the session after that). Changes: - HELP: drop the `--incremental` line. - CLI: drop the `process.argv.includes('--incremental')` check; always pass through to worker. - Worker `setup`: drop the `incremental: boolean` parameter. `compilerOptions.incremental` + `tsBuildInfoFile` always set (overriding the user's tsc-side values, which use a different path). - Worker `lint`: simplify — `affectedFiles` is always populated; drop the conditional that derived `incremental` from its presence. - `cache-flow.lintWithCache`'s `incremental` option stays — the module is library-level and still supports mode A for tests / future callers. Tests: - integration.test.ts test 6: rephrase ("incrementalState always persisted" instead of "--incremental accepted"). - Tests 7-8: drop the redundant `--incremental` arg. - Test 9: repurposed from "without --incremental, type-aware rule re-runs" (no longer reachable from CLI) to "with --force, type-aware rule re-runs every time" — same invariant, exposed via the supported opt-out. Mode A semantics still tested at the cache-flow unit level (38 checks unchanged in cache-flow.test.ts). 192 checks across 8 cache-related suites all pass.
Lost in the cache-removal refactor (47f5410) and never put back when the cache returned. Now that `--force` is the documented opt-out from layer 2 (d303aba), the hint is genuinely useful — without it, users hitting stale cache results have no surfaced way to clear it. Tracking: a file counts as cache-hit if its prev `mtime` matches the current stat AND there's at least one rule entry recorded for it. Approximation, not soundness — layer 2's BP may still re-run type-aware rules if their deps moved, but the user-visible signal just answers "did the cache have something for this file." Output now reads: cold: "1 message (use --fix to apply fixes)" warm: "1 message (use --force to ignore cache, --fix to apply fixes)" with --force: "1 message (use --fix to apply fixes)"
Debug aid for understanding which configured rules went down the
type-aware path vs ran as plain syntactic. Reads from the cache
data we already accumulate (`ruleModes` for type-aware, per-file
`rules` keys for the rest), groups + sorts for stable output.
Sample on Dify (5867 files, react-x rule):
type-aware (1)
react-x/no-leaked-conditional-rendering
Sample on fixtures/define-rule (no-console, no program access):
syntactic (1)
no-console/default
Multi-project runs union the type-aware set across projects so a
rule classified type-aware in one project doesn't show up as
syntactic from another.
Previous behaviour skipped a section header when its count was 0,
so a project with only type-aware rules formatted differently from
one with both kinds. Always print both headers; visually consistent
across runs.
Output now reads, regardless of which side is empty:
type-aware (1)
react-x/no-leaked-conditional-rendering
syntactic (0)
Drops the special-case "(no rules ran)" line — if both counts are
zero the user sees `type-aware (0)` / `syntactic (0)`, which is
already self-explanatory.
`renderer.summary` didn't update lastWasContent, so the blank-line separator before --list-rules output only appeared when diagnostics happened to print earlier in the run. On a clean project (e.g. Dify, where the only rule is type-aware and there are no findings) the separator vanished, making the section visually blur into the summary line. Mark summary as content so any follow-up info() — currently just the --list-rules block — gets the same separator that follows a diagnostic block. Non-TTY output is unchanged (info still skips the blank line outside a TTY).
Layer 2 leans on three TS internal APIs: - ts.getBuildInfo (load path) - ts.createBuilderProgramUsingIncrementalBuildInfo (load path) - BuilderProgram.emitBuildInfo (save path) Load was already wrapped in try/catch — if a future TS renames or removes the loader pair, reconstructOldBuilder returns undefined and the run starts cold. Save was unguarded: missing emitBuildInfo would throw a TypeError after lint already finished, surfacing as an unhandled rejection that takes down the CLI and discards diagnostics for the run. captureIncrementalState now feature-detects emitBuildInfo, wraps the call in try/catch, and returns undefined on either path. The caller (worker.buildIncrementalState → cli persistence) already treats undefined as "layer-1-only this session", so the next run just rebuilds layer 2 from a cold start. Wrong miss > wrong hit. Tests cover both directions — Proxy-stub TS without the loader APIs; fake BuilderProgram without emitBuildInfo and one whose emitBuildInfo throws.
Silent fallback meant a TS upgrade that drops getBuildInfo /
createBuilderProgramUsingIncrementalBuildInfo / emitBuildInfo
would degrade tsslint to layer-1-only without telling anyone — the
run would just get slower across sessions, with no obvious cause.
Now all four failure paths print a yellow `warn` line on stderr
naming the TS version, the missing or failing API, and the
consequence (cache disabled this session / not persisted for next):
warn TypeScript 6.0.3 is missing internal APIs (getBuildInfo /
createBuilderProgramUsingIncrementalBuildInfo). Type-aware
cache disabled — every type-aware rule will re-run from
cold start.
warn TypeScript 6.0.3 BuilderProgram is missing emitBuildInfo.
Type-aware cache cannot be persisted — next run will start
cold.
warn Could not persist incremental state on TypeScript 6.0.3:
<error>. Type-aware cache cannot be persisted — next run
will start cold.
warn Could not load previous incremental state on TypeScript
6.0.3: <error>. Type-aware cache will rebuild from cold
start.
Normal cold-start paths (no prev cache; cache schema bumped) stay
silent — only actual incompatibility / failure warns. stderr keeps
it out of the renderer's stdout flow but visible to anyone who
hasn't redirected stderr.
`captureIncrementalState` takes ts.version as an explicit param
now so the helper stays free of `import 'typescript'` (it works
with a typeof-only import). Tests verify each warn fires with the
right substring and that silent paths produce no stderr.
tsslint's own dogfood lint flagged these as
@typescript-eslint/no-unnecessary-type-assertion:
- probe.test, skip-rules.test: `((rctx: RuleContext) => { ... }) as any`
on rule values typed against `Config`. The arrow function already
matches `Rule = (ctx: RuleContext) => void`; the `as any` was
leftover scaffolding from earlier iterations.
- incremental-state.test: `new Proxy(ts, ...) as typeof ts`. Proxy's
constructor preserves the target type; the cast is a no-op. Same
file: `stale as any` on a `{ version, tsBuildInfoText }` object that
already structurally satisfies `IncrementalState`.
Applied via `tsslint --fix`. All 133 cross-package checks still pass.
Multi-project runs reuse the same in-process worker. setup() reset
linterHost back to originalHost but left snapshots / versions /
affectedFiles / currentBuilder populated from the previous project.
Two consequences:
- memory leak: cross-project file paths accumulate in snapshots
and versions for the lifetime of the process. Short-lived CLI
today, but a future daemon mode would grow unbounded.
- latent correctness bug: if two projects share an absolute file
path (uncommon but possible with monorepo workspace links),
affectedFiles from project A could mis-classify the same path
in project B as cache-hit-eligible.
Clear all four at the top of setup(). projectVersion/typeRootsVersion
keep bumping for LS-cache invalidation as before.
isCacheData was a shallow check — top-level keys only. A corrupt
files entry (mtime as string, rules as null, missing keys) would
pass the gate and surface later as a TypeError mid-lint, with no
actionable signal to the user.
Walk Object.values(files) and verify each entry has a numeric
mtime + non-null rules object. Same for incrementalState if
present (must be {version: string, tsBuildInfoText: string}).
Cost is one full traversal at load time — negligible against the
cost of a confusing crash.
Tests cover both inner-shape mismatches (mtime wrong type, rules
null) and incrementalState corruption (wrong type, missing field).
The TS internal buildinfo format is compact (~3.6MB on Dify's 5867 files), but pathologically large monorepos could in theory grow past V8's max-string limit, or just make the surrounding JSON.stringify of cacheData feel sticky on every save. Hard cap at 64MB (~10–15× headroom over Dify scale, comfortably below V8's max). If captureIncrementalState gets a payload over that, warn + return undefined — caller persists no incrementalState this session, next run starts cold for layer 2 (layer 1 mtime cache still works). Wrong miss > wrong hit.
The summary→info blank-line invariant (recently fixed in 8995ed4) was only verified by integration tests eyeballing subprocess stdout. Cover it directly so the formatting rules can't drift silently. Patches process.stdout.write + process.stdout.isTTY around each case to exercise both the TTY-mode separator path and the non-TTY clean-log path. Re-requires the render module per case so the closure-captured isTTY reflects the current value. Verifies all four state transitions (info→info, diagnostic→info, summary→info, summary→info→info), TTY vs non-TTY for each, and edge cases (empty summary is a no-op, diagnostic indents only in TTY).
3.1.0 (e05416f) shipped Reporter.withoutCache() as the per-diagnostic opt-out for findings whose correctness depends on inputs the cache doesn't track. Dropping it in 47f5410 silently broke any rule that relied on it — the call site would hit a TypeError at runtime, or fail to type-check if written in TS. Restore the API with semantics that match the original: - Diagnostic is still returned for the current run. - cache-flow filters it out before serialising the rule entry to disk, so the next warm hit on this file won't replay it (rule has to re-run to surface it again). Marker is a Symbol.for('@tsslint/no-cache') stamped on the diagnostic in core. Symbol-keyed → invisible to JSON.stringify and to {...spread}, so it doesn't leak into the on-disk cache. Test 17 covers that explicitly. README's "Caching" section is rewritten to describe the new two-layer model and explain when withoutCache() still makes sense (external inputs / fs-driven side reads); for cross-file types, the recommended path is to read ctx.program once so layer 2 handles invalidation. Tests: - cache-flow.test #15: marked diagnostic returned but not persisted - cache-flow.test #16: warm replay drops the marked one - cache-flow.test #17: marker doesn't leak through serialisation
Document the runtime probe's behavior on rules that file-shape-filter before reading ctx.program. Two scenarios that came up during review but weren't directly covered: - probe.test #5 (regression): early-return file processed FIRST in a session leaves the rule unclassified for that invocation, but the next file that does touch program flips classification mid-session, and it sticks even on subsequent early-return calls. - cache-flow.test #17 (regression): in the same scenario, the early-return file's cache entry replays cleanly on a later warm hit — even after the rule is globally classified type-aware. Sound because the early-return path's output is a deterministic function of file text alone (mtime catches changes), and the type-aware path simply never executes for that file. Both pin the user-facing invariant: "a rule that does file-type filtering before accessing program won't be silently mis-classified in a way that produces stale cached diagnostics."
johnsoncodehk
added a commit
that referenced
this pull request
May 2, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Why
3.1.0's per-file mtime cache has two problems:
.d.tsand the consumer file's mtime hasn't moved — the previous run's diagnostics get re-served against types that have changed under them.This PR rebuilds the cache as two layers — one per soundness model.
How
BuilderProgramaffected-file diff, persisted via TS internaltsBuildInfoTextround-tripA runtime probe on
RuleContext.programflips a per-rule flag the first time a rule reads it, classifying it type-aware (sticky across sessions). Cache file atos.tmpdir()/tsslint-cache/<tsslint-version>/<ts-version>/<sha256(...)>.cache.jsonwith atomic write + deep shape validation. Wrong miss > wrong hit on every failure path.Layer 2's cross-session state rides on three TS internal APIs (
getBuildInfo,createBuilderProgramUsingIncrementalBuildInfo,BuilderProgram.emitBuildInfo). Each is feature-detected + try/catched, with a yellowwarnline on stderr if any are missing — naming the TS version, the missing API, and the consequence. Normal cold-start paths stay silent.Cache logic lives entirely in
packages/cli/lib/{cache,cache-flow,incremental-state}.ts. Core only exposesskipRules+hasFixForDiagnostic+getTypeAwareRules— it has no knowledge of mtime, BuilderProgram state, or on-disk format.Numbers
Dify web/, 5867 files,
react-x/no-leaked-conditional-rendering./usr/bin/time -p, same machine, cache wiped before each cold measurement, three warm samples for stability.--forceThe win is warm: master's mtime cache only memoises output, so type-aware rules still drive the type checker every run. Layer 2 short-circuits the type-aware lint pass when the file's transitive deps haven't moved. Cold matches master — the BP setup + drain is overhead-free in practice (drain uses
ignoreSourceFileto skip diagnostic compute). Warm bottoms out at the LS Program build floor; daemonisation is the only path below ~6s for a cross-process CLI.Other changes
--force— opts out by clearing the loaded cache, so the same code path runs cold. Summary surfaces(use --force to ignore cache)only when the cache had effect.--list-rules— debug aid, prints rule classification (type-aware vs syntactic) after the run. Both section headers always emit even at count 0 so the format is stable.incremental: true+ a synthetictsBuildInfoFilesoBuilderProgram.emitBuildInfoproduces content. Verified via spike that this doesn't change any TS diagnostic output (semantic, syntactic, suggestion all identical with vs. without).isCacheDatadeep-validatesfilesentries,tsBuildInfoTextcapped at 64MB with warn on overflow,summary→infoblank-line invariant pinned by unit tests.Test plan
pnpm buildcleancore/{builder-program-poc,probe,skip-rules}(30) +cli/{cache,cache-flow,incremental-state,integration,render}(121)tsslint --project packages/{cli,core,config,types,compat-eslint,typescript-plugin}/tsconfig.jsonreports 0 messages across 51 files