Quenty · Quenty · Jun 24, 2026 · Jun 24, 2026 · Jun 24, 2026 · Jun 24, 2026
diff --git a/.claude/skills/strict-typing-luau/SKILL.md b/.claude/skills/strict-typing-luau/SKILL.md
@@ -0,0 +1,201 @@
+---
+name: strict-typing-luau
+description: Convert a Nevermore Luau file from --!nonstrict (or untyped/--!nocheck) to --!strict, adding the project's explicit type annotations and fixing every type error the checker reports. Use this whenever the user asks to "strictly type", "add types to", "make strict", "type-annotate", "convert to --!strict", or clean up the typing on a .lua/.luau file in this repo — including when they just select a file with a `--!nonstrict` header and say "type this" or point you at a legacy module. Also use it when a strict file is throwing luau-lsp type errors and the user wants them resolved following Nevermore conventions.
+---
+
+# Strictly typing Luau files
+
+Convert a file to `--!strict` and make it pass the type checker cleanly. This is mostly
+**mechanical pattern application** — move fast and correctly. The checker is your fast
+feedback loop; lean on it instead of guessing.
+
+## Triage first — match effort to the file
+
+- **Plain util / data module** (`local X = {}` of functions, or a `return {...}` table; no
+  `setmetatable`): flip the header, give every function typed params/returns, run single-file
+  analyze. Usually no `export type` block needed.
+- **Class** (`setmetatable({}, ...)` + constructor + methods): the real work — needs the
+  `export type` block. Follow the patterns below and `references/conventions.md`.
+- **Already `--!strict` but erroring**: skip conversion; just resolve the reported errors.
+
+## Why it's not find-and-replace
+
+Luau can't infer fields through `setmetatable`, so strict mode turns that blind spot into
+errors. The fix is an explicit `export type` block enumerating every `self` field, plus
+dot-syntax methods that name `self`. Flipping the header without these just produces a wall
+of errors — supplying the types the checker can't infer *is* the job.
+
+## The verification loop (fast inner loop)
+
+`luau-lsp analyze` is the same engine as `lint:luau`, pointed at one file (~2.5s). One-time
+setup if `sourcemap.json` / `globalTypes.d.lua` are missing at repo root: `npm run prelint:luau`.
+
+```bash
+luau-lsp analyze --sourcemap=sourcemap.json --base-luaurc=.luaurc \
+  --defs=globalTypes.d.lua --flag:LuauSolverV2=false --ignore='**/node_modules/**' \
+  src/<package>/src/<Realm>/<File>.lua
+```
+
+Clean = only the `[INFO] Loading...` line. `LuauSolverV2=false` is required (repo pins the old
+solver). Iterate until clean, then run `npm run lint:luau` **once** as the final gate — single-file
+analyze can't see files that depend on *yours*, and tightening a type ripples to subclasses and
+callers. Triage new downstream errors: pre-existing → leave & flag; your type is genuinely too
+tight for the real contract → loosen *your* type (often `T?` not `T`); a small obvious follow-on
+→ fix it.
+
+## Core patterns
+
+**Class with a parent (the common case):**
+
+```lua
+--!strict
+local require = require(script.Parent.loader).load(script)
+
+local BaseObject = require("BaseObject")
+
+local MyClass = setmetatable({}, BaseObject)
+MyClass.ClassName = "MyClass"
+MyClass.__index = MyClass
+
+export type MyClass = typeof(setmetatable(
+	{} :: {
+		_serviceBag: ServiceBag.ServiceBag,   -- EVERY self field, with its type
+		_enabled: ValueObject.ValueObject<boolean>,
+	},
+	{} :: typeof({ __index = MyClass })
+)) & BaseObject.BaseObject   -- intersection pulls in inherited _maid, _obj, etc.
+
+function MyClass.new(serviceBag: ServiceBag.ServiceBag): MyClass
+	local self: MyClass = setmetatable(BaseObject.new() :: any, MyClass)
+	self._serviceBag = assert(serviceBag, "No serviceBag")
+	return self
+end
+```
+
+**Methods use dot syntax with explicit `self`** (colon syntax loses the `self` type in strict
+mode). Callers still write `obj:Method()`; only the definition changes:
+
+```lua
+function MyClass.GetEnabled(self: MyClass): boolean
+	return self._enabled.Value
+end
+```
+
+## The export type rule — always `typeof(setmetatable(...))`, never hand-list methods
+
+There is **one** way to write a class's export type, and it holds for generic, inherited, and
+dynamic-`__index` classes alike:
+
+```lua
+export type MyClass = typeof(setmetatable(
+	{} :: { ...only the instance FIELDS... },
+	{} :: typeof({ __index = MyClass })
+)) [& Parent.Parent]
+```
+
+Methods come from `typeof({ __index = MyClass })` — never hand-write `Method: (self, ...) -> ...`
+records (they drift and cost enormous churn on big classes). Inheritance: list only the child's
+*own new* fields; inherited ones arrive via `& Parent.Parent`. Generics: keep `<T>` load-bearing
+by putting it in a field (e.g. a virtual `Value: T`); never collapse `<T>` to `any`. The one tax:
+a metatable'd type needs `self :: any` for dynamic self-access (`rawget(self :: any, k)`).
+
+## When `:: any` is acceptable
+
+Confined to boundaries the checker genuinely can't model:
+
+- `setmetatable(ParentClass.new(...) :: any, MyClass)` — the metatable transform
+- `Binder.new("Tag", MyClass :: any) :: Binder.Binder<MyClass>` — binder registration
+- `local t: any = require("t")` — `t` (and the rare library like it) isn't strict-friendly
+- **Rx / reactive chains** (`:Pipe({...})`, `switchMap`/`map` closures, `RxSignal`, `Signal.new()`)
+  — the #1 time-sink. Don't try to thread types through them; cast to `any` on the *first* analyze
+  error and move on, keeping the public return type precise. See `references/rx.md`.
+
+**Cast the chain body, never the public return type.** Type every public method's return precisely
+(`Promise.Promise<T>`, `Observable.Observable<T>`); only the internal *expression* producing it may
+be `any`. Casting the return type itself is the single biggest source of avoidable looseness — a
+`GetX(self): any` leaks `any` to every caller. See the Rx/Promise examples in `references/conventions.md`.
+
+Reaching for `:: any` inside a method body means the `export type` block is wrong — fix it there.
+A call-site `(x :: any):Method()` to dodge a *signature mismatch* is a smell: usually the upstream
+signature is too strict (a param that should be optional) — fix it upstream if in scope, else flag
+it. Don't bury a real type bug under a cast.
+
+## Hard cases — try the precise type first, loosen the narrowest spot only
+
+Concrete metatable types (`Signal.Signal<T>`, `ValueObject.ValueObject<T>`, `Maid.Maid`) import
+and check cleanly almost always — write them out. Only these justify deviating:
+
+1. **A type you cannot import** (sibling is still nonstrict, exports no type) → write a **precise
+   structural interface** of the exact surface you use (real signatures, real field types) — this
+   is typing by another means, not a loosening. Better, if cheap: add the `export type` upstream.
+2. **A heavy cyclic generic** the old solver can't hold (a deep class wrapped across many
+   members) → type each occurrence as `any` with JUST the intended type in a trailing comment:
+   `_indexObservers: any, -- ObservableSubscriptionTable<T?>`. Find the one culprit type and sweep
+   **all** its occurrences in one pass (fields, params, **returns**), don't bisect.
+3. **Require cycle** between a class and its definition/factory → hoist shared shapes into a
+   types-only `<Package>Types.lua` (requires nothing at runtime, so importing it can't recreate the
+   cycle). Keep the public API precise; only the internal back-reference may give.
+
+Time-box it: if a file won't go clean within ~2 iterations on the *same* cyclic/"too complex"
+error, take the escape for that member and move on. The public API must stay precise; internal
+plumbing can absorb imprecision.
+
+## Comment discipline
+
+**Default: write loose `any`s BARE — no explanatory comment.** Write `_cmdrService: any,` — NOT
+`_cmdrService: any, -- CmdrServiceClient (nonstrict, no exported type)`. This holds for *every* `any`
+whose source module simply exports no type yet (the common case). The comment reads as "blessed,
+intentional" and discourages the next person from tightening it — and we *want* these `any`s gone. A
+bare `any` is honest debt; a commented one looks finished. Don't explain yourself.
+
+Only two exceptions earn a comment:
+- **A forced cyclic-complexity `any`** (Hard case #2) records its intended type so it's recoverable:
+  `_indexObservers: any, -- ObservableSubscriptionTable<T?>`. The comment is *information*, not an apology.
+- **A non-obvious deliberate structure** (e.g. a structural type dodging a require cycle) gets a line
+  so no one "fixes" it back into the cycle.
+
+If you're tempted to comment an `any` for any other reason — don't.
+
+## Report your hand-offs
+
+End with a short **"Hand-offs"** list: every spot where you took an escape instead of a precise
+type — file, member, the `any`/structural fallback, and the intended type. Skip routine sanctioned
+boundaries (metatable casts, `t: any`, binder registration). This is the deliverable that lets the
+user tighten later; a run that loosened things silently is not done.
+
+## Pitfalls
+
+- **`--!strict` must be the file's FIRST line** — before the `--[=[ ]=]` docstring. Luau only honors
+  a mode directive at the very top; placed after the docstring it's silently ignored and the file is
+  never strict-checked (analyze looks "clean" but isn't).
+- Don't leave `--!nonstrict`/`--!nocheck` "for safety" — the point is `--!strict`.
+- Don't run repo-wide `lint:luau` on every edit — single-file analyze is the inner loop, `lint:luau`
+  is the final gate.
+- Don't invent fields — the `export type` block must list exactly what `self` gets.
+- Keep legacy/deprecation docstrings; typing doesn't change deprecation status.
+
+## Tooling
+
+- `references/conventions.md` — full canonical examples + a common-error → fix table.
+  `references/rx.md` — the Rx/reactive escape in detail.
+- **Converting a whole package?** Don't eyeball the order. Run the planner for a
+  dependency-ordered, model-routed conversion plan — which files, in what order, and which form
+  cyclic clusters to convert together:
+
+  ```bash
+  # from the repo root (the script cd's to the repo top itself, so CWD doesn't matter):
+  .claude/skills/strict-typing-luau/evals/lib/run.sh plan src/<package>     # e.g. src/settings
+  ```
+
+  It defaults to **real mode** (targets the files that aren't `--!strict` yet); convert leaves-first
+  in the order it prints. (`--eval-gold` switches it to the harness's gold-bearing view.)
+- **Parallelize at scale.** Files in the same dependency layer of the plan are independent — they
+  don't require each other, so they can convert concurrently. When a package has **more than ~3
+  files to convert**, fan out **one sub-agent per file within a layer** (launch them in a single
+  message so they run in parallel), wait for the layer, then move to the next. Each sub-agent
+  converts ONE file against its already-converted deps. This makes wall-clock ≈ the slowest file per
+  layer instead of the sum, and keeps each context small. At ≤3 files, just convert sequentially —
+  the fan-out overhead isn't worth it.
+- That same **`run.sh`** with no arguments lists every command. `plan` is the one generally useful
+  for real conversions; the rest (`gold`, `convert`, `place`/`score`/`restore`, `triggers`) *evaluate*
+  the skill — see `evals/README.md` for those workflows.
diff --git a/.claude/skills/strict-typing-luau/evals/README.md b/.claude/skills/strict-typing-luau/evals/README.md
@@ -0,0 +1,115 @@
+# strict-typing-luau — mechanical eval harness
+
+A fast, deterministic, **no-LLM** way to validate the skill. Cases are mined from **main
+history**: real nonstrict→strict conversions the maintainers shipped. Each case stores the
+pre-conversion blob (`input`) and anchors `gold` to **main HEAD**, so single-file analyze always
+matches the main-built sourcemap (no staleness). See `manifest.json`.
+
+## The checks
+
+### 1. Gold smoke test (~20s, no tokens) — validates the harness itself
+
+```bas
+bash evals/lib/run.sh gold
+```
+
+Scores the maintainer's gold (main) for every case. Every positive must come back
+`strict=true, analyze=0`; negatives must be `strict=false`. If this is green, the pairs, the
+scorer, and the plumbing are sound. Run it after editing the manifest or scorer.
+
+### 2. Worker eval — measures an actual skill run
+
+Per case, drive the worker (an agent running the skill) with these primitives:
+
+```bash
+bash evals/lib/run.sh place   <case_id>   # lay the nonstrict INPUT at the file's real path
+#   ... run the skill on that path to convert it in place ...
+bash evals/lib/run.sh score   <case_id>   # -> {strict, analyze_errors, any, any_gold}
+bash evals/lib/run.sh restore <case_id>   # undo: restore the package to main
+```
+
+A conversion **passes** when `strict=true`, `analyze_errors=0`, and `any_nonrx <= any_gold_nonrx`
+(no looser than the maintainer's output, *excluding* Rx-chain casts — those are deliberate policy,
+see `references/rx.md`). `score.sh` reports both the total `any` and the budgeted `any_nonrx`. Run
+`npm run lint:luau` once at the end as the downstream gate.
+
+**Whole-package runs.** Cases sharing an `input` commit belong to one package conversion — e.g.
+the six `settings-*` cases (all `input: b1bc1512aa`, `src/settings`, the `PlayerSettings` package
+the maintainers strict-typed in PR #550). `place` any one of them and the whole package drops to
+its untyped state; the agent then converts every file (leaves first — see their manifest order),
+and you `score` each case. If the agent **added** a file during conversion (a shared `Types.lua`),
+`run.sh sync` before scoring so single-file analyze can resolve it, then `restore` + `sync` again
+to return the sourcemap to main. (Single-file conversions that add nothing don't need `sync`.)
+
+### 3. Triggering test (~5s, one judge call) — validates the `description`
+
+```bash
+bash evals/lib/triggers.sh
+```
+
+Does the skill's frontmatter `description` fire on the right prompts and stay quiet on near-misses?
+Cases live in `triggers.json` (positives + hard negatives like TypeScript-strict, tsconfig-strict,
+formatting). A judge model (`claude -p`) sees only the *live* description from SKILL.md and each
+prompt, returns fire/skip per prompt, and we score against `expect`. The judge reads the current
+description, so this catches a description edit that quietly breaks (or over-broadens) triggering.
+Add a case = one line in `triggers.json`.
+
+### 4. Tooling-scope test (~6s, one judge call) — guards against over-orchestration
+
+```bash
+bash evals/lib/run.sh tooling
+```
+
+Now that SKILL.md surfaces the planner and harness, a single-file worker could wrongly run `plan`
+or the eval harness instead of just converting its file. A judge reads the *live* SKILL.md and
+classifies each task in `tooling.json` as `planner` (whole-package only) or `direct` (single file /
+scoped node / error-fix), scored against `expect`. Catches a SKILL.md edit that over-broadens the
+planner's apparent scope. Add a case = one line in `tooling.json`.
+
+### 5. Parallelism test (~9s, one judge call) — execution strategy at scale
+
+```bash
+bash evals/lib/run.sh parallelism
+```
+
+A judge reads the live SKILL.md and classifies each `parallelism.json` task as `parallel` (fan out
+one sub-agent per file within a dependency layer, for a package > ~3 files) or `sequential` (single
+file / handful / error-fix). Guards the real-use regression where the agent converted a whole package
+serially because the skill never told it to parallelize.
+
+## Design notes
+
+- **Files are placed at package granularity** (`src/<pkg>`): a package is the unit of
+  type-consistency, so a cyclic file scored against nonstrict siblings won't error falsely.
+- **`restore` unstages and cleans** gold-only new files (e.g. a new `*Types.lua`) so cases can't
+  leak into each other.
+- **Budget enforcement belongs here, not in the skill prose.** Cap the worker's tokens/iterations
+  in the runner; if it can't finish a case in budget, that's a recorded failure, not a reason to
+  add more instructions to SKILL.md.
+- **Negatives** come from the revert commit `9027756cd7` (a naive flip that broke downstream and
+  was reverted). They exist to prove the downstream `lint:luau` gate fires.
+- **Model routing (per node, not per file).** `plan` tags each unit with a suggested worker model:
+  **opus** for class / cyclic-cluster / Rx-importing nodes (precision + comprehension matter),
+  **sonnet** for mechanical leaf/util nodes (speed is free, little to cast loosely). Backed by
+  probes on a hard Rx class: Sonnet is fastest (3.1 min vs Opus 5.5) but casts ~3× more `any`
+  (34 vs 12) — fine where looseness is cheap, not where it isn't. **Do not** split one hard file
+  across models as a Sonnet-draft→Opus-polish pipeline: measured 2× the cost of Opus-alone with no
+  precision gain (the polish pass re-comprehends from scratch; post-hoc `any`-tightening is harder
+  than typing precisely up front). Route whole *nodes*, never halves of a file.
+- **`any`-budget gate** (for a Sonnet-first driver): after scoring a node, if `any_nonrx >
+  any_gold_nonrx + slack`, re-run that node on Opus. Uses the budgeted count, so heavy Rx casting
+  (sanctioned) never trips the gate — only genuine non-Rx looseness does.
+
+## Adding a case
+
+Pick a file currently `--!strict` on main, find the commit that introduced it, and use its parent
+as `input` (both must be reachable from main):
+
+```bash
+f=src/<pkg>/src/<Realm>/<File>.lua
+c=$(git log -S'--!strict' --reverse --format='%H' -- "$f" | head -1)
+git merge-base --is-ancestor "$c" main && echo "on main: input=$(git rev-parse --short "$c^")"
+```
+
+Add `{ id, archetype, path, input: "<C^ sha>", gold: "main", polarity: "positive" }` to
+`manifest.json`, then re-run the gold smoke test.