Commit 26bcac0

feat(ux-9): JS runtime for requires_js specs (full dynamic generator coverage) (#102)
* ux-9 phase 0: refresh coverage baseline + counters + fixtures

- Refresh docs/coverage-baseline.json with 2026-05-03 numbers (3,641 requires_js, was stale at 1,889).
- Add new counter fields to status --json (schema 1.1): commands_addressable, commands_(fully|partially|non)functional, requires_js_generators_(total|supported|unsupported), command_alias_conflicts.
- Surface file-level scan under file_scan key (raw JSON walk so the count is independent of CompletionSpec's structured deserializer, which silently drops OptionSpec.args[N>0] today).
- Fix docs/COMPLETION_SPEC.md to describe the actual requires_js skip behaviour and document the planned js_runtime metadata shape.
- Update docs/SPECS.md baseline-refresh workflow to project the new counter fields.
- Add Phase 0 fixtures under crates/gc-suggest/tests/fixtures/ux9/. New js_runtime fixtures parked under parked/ for Phase 2 activation.
- Add scripts/count-spec-coverage.sh — repo-local jq-based counter whose results cross-check against ghost-complete status --json.

* ux-9 phase 0: fix-up doc schema, script naming, comment accuracy

- docs/COMPLETION_SPEC.md: document js_runtime optional fields input, timeout_ms, allow_shell_command per SPEC.md:232-238
- scripts/count-spec-coverage.sh: rename file-level commands_* outputs to file_scan_* to disambiguate from runtime-level status --json counters (527/176 vs 529/180 today)
- crates/ghost-complete/src/status.rs: clarify that requires_js_generators_total=3641 is sourced from the file-level walk; the loader-level count is currently 3506 (135 under, will converge after Phase 1 stem-keying)

* ux-9 phase 1: spec store keyed by filename stem with alias index

Replaces the single HashMap<String, CompletionSpec> store with a unique entry list plus a HashMap<String, Arc<SpecEntry>> alias index. Each parsed spec is addressable by its filename stem (canonical id) and by its CompletionSpec.name when that doesn't conflict with another spec.
- 709 spec files now address as 709 unique entries (no silent loss). Pre-Phase-1 the loader keyed on CompletionSpec.name and the corpus's ~6 stem/name collisions silently dropped one spec per pair.
- 14 filename/name mismatches surface as either non-conflicting aliases (8 succeed) or AliasConflict records (6 collide). DuplicateName, NameMatchesOtherStem, and DirectoryPrecedence cover the three collision shapes.
- Stems take precedence over `name` aliases via a two-pass registration. kubectl.json keeps the `kubectl` alias even when kubecolor.json (declared name="kubectl") is processed first alphabetically.
- Directory precedence preserved: earlier configured dirs win, user specs win over embedded fallback. The losing copy is recorded as a DirectoryPrecedence conflict at debug! level (not an error — that's how user overrides work).
- SpecStore::iter() yields one tuple per unique spec entry, not per alias, so status counts don't double-count.
- SpecStore::get() resolves stem and non-conflicting name alias.
- New API: SpecEntry, AliasConflict, AliasConflictKind, AliasOwner; SpecStore::entries(), SpecStore::aliases_count(), SpecStore::conflicts().

Status JSON updates:

- commands_addressable now counts unique resolvable aliases (was: file count). Against the embedded corpus this is 717 = 709 stems + 8 non-conflicting name aliases.
- command_alias_conflicts now reflects runtime-visible collisions from SpecStore::conflicts() (6 against the embedded corpus, exactly the set previously lost to name-keyed dedupe).
- commands_nonfunctional stays 0 — Phase 1 keeps every spec addressable via its stem; Phase 7 will surface specs whose only generator path is unsupported requires_js.

Removes the file-walk count_alias_conflicts shim in status.rs; SpecStore::conflicts() is the single source of truth now.
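The two-pass registration described for phase 1 can be illustrated with a minimal sketch. It is not the SpecStore code: `Spec` and `build_index` are hypothetical names, the index maps aliases to stems rather than to Arc<SpecEntry>, and conflicts are reduced to (alias, losing stem) pairs instead of AliasConflict records.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified stand-in for a parsed spec file.
struct Spec {
    stem: String, // filename stem, e.g. "kubecolor"
    name: String, // declared CompletionSpec.name, e.g. "kubectl"
}

/// Builds an alias -> owning-stem index in two passes and returns the
/// name aliases that could not be registered as (alias, losing stem).
fn build_index(specs: &[Spec]) -> (HashMap<String, String>, Vec<(String, String)>) {
    let mut index: HashMap<String, String> = HashMap::new();
    let mut conflicts = Vec::new();

    // Pass 1: every stem claims its own alias, regardless of file order.
    for spec in specs {
        index.insert(spec.stem.clone(), spec.stem.clone());
    }

    // Pass 2: a declared name is registered only if the alias is still free.
    for spec in specs {
        if spec.name == spec.stem {
            continue; // already registered in pass 1
        }
        match index.get(&spec.name).cloned() {
            None => {
                index.insert(spec.name.clone(), spec.stem.clone());
            }
            Some(owner) if owner != spec.stem => {
                conflicts.push((spec.name.clone(), spec.stem.clone()));
            }
            Some(_) => {}
        }
    }
    (index, conflicts)
}
```

Because stems are registered before any name alias, kubecolor's declared name "kubectl" cannot displace kubectl's own stem even though kubecolor sorts first.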
* ux-9 phase 1: fix-up DirectoryPrecedence classification + doc claims

- DirectoryPrecedence is now reserved for the literal same-filename scenario (winner.filename_stem == loser.filename_stem). Cross-dir stem-vs-alias collisions classify as NameMatchesOtherStem, which Phase 7 doctor diagnostics will render with the right hint.
- Soften docs/COMPLETION_SPEC.md and status.rs comment to remove the forward-looking claim that doctor surfaces per-conflict diagnostics today. That's Phase 7 work; status --json's count is what users see for now.

* ux-9 phase 2: preserve js_runtime metadata in schema, converter, build

Adds JsRuntimeSpec / JsRuntimeKind / JsRuntimeInput to the Rust generator schema as Option<JsRuntimeSpec> on GeneratorSpec. deny_unknown_fields preserved; existing specs without js_runtime parse identically.

Converter emissions:

  _custom -> js_runtime.kind = "custom"
  _scriptFunction -> js_runtime.kind = "script_function"
  _postProcess -> js_runtime.kind = "post_process" (when matcher cannot lower to transforms)

Native generator and transform mappings still win first. Regen of all 709 specs produces the expected js_runtime addition; js_source is no longer emitted (replaced by js_runtime.source).

The static converter also now drops null/undefined entries in option name arrays (Fig's `name:["-H",,"--hostname"]` sparse-hole pattern, which the Phase 2 regen exposed as a parse failure on `next.json` and `pnpx.json`).

Build.rs continues to strip any legacy js_source defensively, asserts js_runtime.source survives the strip, and renames the helper to strip_legacy_js_source to reflect its post-Phase-2 role. Phase 4 will switch the runtime dispatch path to honor js_runtime.

Activates the two parked Phase 0 fixtures (crates/gc-suggest/tests/fixtures/ux9/{post_process_supported,custom_unsupported}.json) out of `parked/` now that the schema accepts js_runtime under deny_unknown_fields.
Re-applies hand-curated overrides that the converter regen would otherwise wipe: cd.json (folders template), cargo.json (build/run priority 92, test 90, install 75), docker.json (compose priority 88), aws.json (--profile cache.ttl_seconds: 300). Heuristic priorities re-applied via tools/spec-priority-audit/apply.mjs.

Binary size: 106928256B -> 108452944B (delta +1524688B / +1.45 MB, under the 2 MB delta gate). Updates benchmarks/binary-size-baseline.txt to the new floor.

* ux-9 phase 3: gc-jsrt crate with bounded QuickJS evaluator

Adds a new gc-jsrt workspace crate that owns the rquickjs dependency and exposes a bounded JS evaluator for Phase 4+ consumption. The runtime is NOT yet wired into the suggestion engine -- the JsRuntimeKind dispatch from gc-suggest into gc-jsrt arrives in Phase 4 (PostProcess) and Phase 5 (ScriptFunction / Custom).

Architecture:

- Sync rquickjs::Runtime on a dedicated worker thread. Tokio side uses an mpsc channel for jobs and a per-job oneshot for replies; AsyncRuntime is intentionally not used because the corpus JS is short, synchronous post-processing logic.
- Fresh rquickjs::Context per invocation; the Runtime is reused across jobs for warm GC and stable allocator pressure.
- Sandbox: removes Node-/Deno-/Bun-style globals (require, process, Deno, Bun, setTimeout, fetch, Buffer, Worker, ...); shadows the eval and Function intrinsics with throwing closures; rquickjs is built with default-features = false plus only rust-alloc, so no module loader, no native-plugin loader, no async surface, no proc macros.
- Wall-clock interrupt via Runtime::set_interrupt_handler; a shared AtomicI64 carries the deadline relative to a worker-process-start Instant, and an RAII guard disarms the deadline between jobs. Documented as bounded but not hard real-time on long native operations (pathological regex, JSON parsing of huge strings).
- Memory limit 8 MiB, max stack 512 KiB, GC threshold 2 MiB.
Output normalization:

- Goes through QuickJS' JSON.stringify, then serde_json::Value. This sidesteps cycles (stringify throws), drops functions / symbols / host objects, and crosses the UTF-16/UTF-8 boundary exactly once.
- Accepts string, string[], object{name|displayName|text, description?}, object[], and Promise<any of the above>.
- Rejects functions, missing name, oversized arrays/strings, cyclic objects, symbols, host objects with structured diagnostics (Timeout, MemoryExceeded, InvalidShape, OversizedOutput, Exception, UnsupportedApi, EmptyOutput).
- Limits: 1024 suggestions, 256-byte names, 1024-byte descriptions, 256 KiB total JSON.

Adds [suggest.providers] js_runtime to ProvidersConfig (default true). Phase 3 wires the schema only; Phase 4 begins gating the post_process dispatch on the flag.

Documentation:

- ADR 0006 records the rquickjs choice, sync-runtime decision, and sandbox model.
- docs/JS_RUNTIME.md is the ongoing reference doc (Class A/B/C distinctions, sandbox layers, timeout caveats, normalization, cache-key composition, kill switch, concurrency model). Linked from docs/ARCHITECTURE.md.

Binary size: 108452944 -> 108451968 bytes (delta -976 bytes, well within the 110 MiB ceiling). The new crate is built but not yet linked into ghost-complete; Phase 4 will pull it in.

Tests: 37 in gc-jsrt (11 unit + 26 integration) covering eval primitives, Promise unwrap, sandbox globals + intrinsic shadowing, runaway-loop wall-clock interrupt, memory exhaustion, oversized output, cyclic objects, exceptions, intrinsic availability, context isolation, concurrent evaluation. Workspace test suite remains green (~1100 tests).
Verification:

  cargo check --all-targets
  cargo clippy --all-targets -- -D warnings
  cargo fmt --check
  cargo test --workspace
  cargo test -p gc-jsrt
  bash scripts/check-binary-size.sh --absolute-max 110MB
  bash scripts/check-binary-size.sh --delta-max 2MB

* ux-9 phase 4: wire post-process js_runtime through engine

Phase 4 dispatches js_runtime.kind == post_process generators through gc-jsrt instead of skipping them. The script runs as before, its stdout feeds into the JS evaluator, and the normalized suggestions merge into the candidate set with the same Script source / Command kind tags as the existing script generators.

- collect_generators() classifies post_process+script/script_template as supported. All other requires_js shapes (script_function, custom, missing js_runtime) stay skipped with a structured tracing::info log line that records kind/has_script/has_template.
- Cache split: CacheKey::Stdout vs CacheKey::JsProcessed{source_hash}. Two different post-process bodies on the same script never share cache entries; spawn cost is shared via the stdout layer. The source hash is std::hash::DefaultHasher (non-cryptographic, stable per process) and only used as a partition key.
- New gc-suggest::js_runtime adapter lazily spawns gc-jsrt::JsWorker on first call. The worker thread costs ~5 MB and a QuickJS runtime init, so users that never trigger a Phase-4 generator never pay for it. Diagnostics (Timeout, Exception, MemoryExceeded, InvalidShape, OversizedOutput, UnsupportedApi) surface as structured tracing::warn events. Phase 7 will route them into status --json and doctor.
- suggest.providers.js_runtime kill switch is now active: when false, post_process generators are skipped at dispatch time, equivalent to pre-Phase-4 behavior.
- with_suggest_config plumbs the new flag through gc-pty::handler and gc-pty::proxy from gc-config::ProvidersConfig.
- status --json: requires_js_generators_supported counts every generator with kind == post_process AND a script (or script_template); requires_js_generators_unsupported is the remainder. Against the 709-spec corpus on disk the runtime reports 1944 supported / 5338 unsupported / 7282 total. The total comes from a raw-JSON walk that does NOT suffer the OptionSpec.args truncation in the structured loader, so it is larger than the plan's pre-AWS-restoration estimate of 3641.
- Cache layer rewritten as a CachedPayload enum so a single HashMap stores both stdout strings and suggestion vectors, with the variant tag preventing cross-contamination at the Hash + PartialEq level.

Binary size: 108_452_944 B -> 109_668_688 B (+1_215_744 B, +1.16 MB). The rquickjs contribution links here for the first time. Well under the +2 MB delta gate; no zstd-compress-specs fallback needed.

Tests: 7 new fixture tests in crates/gc-suggest/tests/phase4_js_dispatch.rs exercise success, kill switch, unsupported skip, cache split, stdout cache reuse across JS sources, timeout, and exception. 5 new cache-layer unit tests cover the JsProcessed / Stdout keyspace partitioning. ux9_active_fixtures_classify_correctly was updated to expect supported=1, unsupported=2 against the 3 requires_js fixtures in tests/fixtures/ux9.

Verification:

- cargo fmt --check clean
- cargo clippy -D warnings clean
- cargo test -p gc-jsrt 26 passed
- cargo test -p gc-suggest all unit + integration green (incl. 7 phase4 tests)
- cargo test --workspace all green (no FAILED)
- cargo run -- validate-specs --json 1418 valid / 0 failed
- cargo run -- status --json spec_counts as documented above
- check-binary-size.sh --absolute-max 110MB PASS
- check-binary-size.sh --delta-max 2MB PASS

* ux-9 phase 5: dispatch script_function and custom JS generators

Phase 5 wires Class B (`script_function`) and Class C (`custom`) generators through the existing gc-jsrt QuickJS evaluator behind a minimal Fig-compatible host API.
Both kinds now contribute live suggestions for the 3641 requires_js generators across the corpus instead of stalling at static-only completion.

gc-jsrt:

- New `host.rs` installs the bounded host API on every job: cwd, env, tokens, search/current/previous-token strings, executeShellCommand, and the legacy `fig` namespace surface. Unsupported subnamespaces (fs, path, keychain, ipc, ui) throw structured errors with stable diagnostic codes.
- executeShellCommand accepts argv arrays and `{command, args}` descriptors by default. Shell-string form is denied unless the caller flips `allow_shell_command` (per-spec opt-in plumbed end-to-end). A 5-call recursion cap (MAX_SHELL_CALLS_PER_EVALUATION) prevents accidental fork-bombs inside a single evaluation.
- New `ShellRunner` trait + `ShellRunOutput`/`ShellRunError` types let the engine inject its async runner; the worker thread is a plain OS thread so Handle::block_on is safe.
- New `JsExecutionKind` enum (PostProcess / ScriptFunction / Custom) drives dispatch in worker.rs. ScriptFunction emits argv via `normalize_argv` (array form or descriptor); Custom and PostProcess continue through `normalize_value`. UnsupportedHostApi accumulates from HostState and is surfaced as diagnostics.
- New diagnostic codes: UnsupportedHostApi, ShellCommandStringDenied, ShellCommandLimitExceeded, ShellCommandFailed, InvalidArgv.

gc-suggest:

- New `shell_runner.rs` implements EngineShellRunner using tokio::runtime::Handle::block_on(run_script(...)) bridged from the worker OS thread. run_string parses with shlex::split.
- `js_runtime.rs` adds JsExecContext, script_function/custom helpers, and IIFE wrappers that pass `(tokens, ctx)` or `(tokens, executeShellCommand, ctx)` to the user source. Diagnostic codes are logged via tracing::warn!.
- `engine.rs` restructures the dispatch loop: js_kind is determined early, the kill-switch fires consistently for all three kinds, ScriptFunction goes through run_script_function_dispatch (JS->argv->shell->transforms), Custom goes through run_custom_dispatch (JS produces suggestions directly via the runner). make_js_exec_context mirrors process env minus GHOST_COMPLETE_ACTIVE.
- `specs.rs` widens collect_generators so script_function/custom (no static argv) reach the engine; `is_js_dispatchable` is the new gate.

ghost-complete status:

- `is_post_process_supported` now classifies all three kinds as supported when source is non-empty. The 709-spec corpus reports requires_js_generators_supported = 3641 / unsupported = 0 in the JSON status output.

Converter spike:

- `run-spike.mjs` walkGenerators now reads both legacy `js_source` and the Phase-2 `js_runtime.source` shape, regenerating the inventory against 3641 generators (was finding ~0 post-Phase-2). shape-inventory.json, shape-inventory.md, and candidate-providers.json refreshed.

Tests: 11 new gc-jsrt phase5_host_api tests + 12 new gc-suggest phase5_js_dispatch tests, including aws/cargo/kubectl/docker corpus-shape fixtures driven by `printf` stubs. cargo test --workspace passes 1579/0; npm --prefix tools/fig-converter test passes 232/0.

Verification:

- cargo fmt --check, cargo clippy --all-targets -- -D warnings clean.
- cargo test -p gc-jsrt: 26 unit + 11 phase5 host API.
- cargo test -p gc-suggest: full suite incl. 12 phase5 dispatch + 4 corpus.
- cargo test --workspace -- --test-threads=1: 1579 passed, 0 failed.
- npm --prefix tools/fig-converter test: 232 passed, 0 failed.
- ghost-complete validate-specs --json: 709/709 valid.
- ghost-complete status --json: requires_js_generators_supported = 3641, unsupported = 0, fully_functional = 529, partially_functional = 180.
- cargo build --release; binary 104.78 MB (under 110 MB cap; +1.35 MB).
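The executeShellCommand admission rules from phase 5 (argv form allowed by default, shell-string form gated on allow_shell_command, at most five calls per evaluation) can be sketched as a small gate. This is an illustration only: `ShellGate`, `ShellRequest`, and `GateError` are hypothetical names, and the choice that a denied shell-string call does not count toward the cap is an assumption, not something the commit message states.

```rust
/// Cap named in the commit message; the value 5 comes from "5-call recursion cap".
const MAX_SHELL_CALLS_PER_EVALUATION: usize = 5;

/// Hypothetical request shapes: argv/descriptor form vs. a raw shell string.
enum ShellRequest {
    Argv(Vec<String>),
    ShellString(String),
}

/// Variant names mirror the diagnostic codes listed above.
#[derive(Debug, PartialEq)]
enum GateError {
    ShellCommandStringDenied,
    ShellCommandLimitExceeded,
}

/// Hypothetical per-evaluation gate; one instance per JS job.
struct ShellGate {
    allow_shell_command: bool,
    calls: usize,
}

impl ShellGate {
    fn new(allow_shell_command: bool) -> Self {
        ShellGate { allow_shell_command, calls: 0 }
    }

    /// Decides whether a request may be handed to the shell runner.
    fn admit(&mut self, request: &ShellRequest) -> Result<(), GateError> {
        // Shell strings need the per-spec opt-in.
        if matches!(request, ShellRequest::ShellString(_)) && !self.allow_shell_command {
            return Err(GateError::ShellCommandStringDenied);
        }
        // Assumption: only admitted calls count against the cap.
        if self.calls >= MAX_SHELL_CALLS_PER_EVALUATION {
            return Err(GateError::ShellCommandLimitExceeded);
        }
        self.calls += 1;
        Ok(())
    }
}
```

Keeping the counter on a per-job value means the cap resets naturally when the next evaluation constructs a fresh gate.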
* ux-9 phase 4-5 fix-up: hot-path perf, multi-dir count fix, snapshots, baseline

- Status counter no longer double-counts requires_js generators across overlapping spec_dirs. SpecStore now exposes canonical_paths() so the file scan walks resolved entries only. Default-config status now reports 3641 (was 7282) regardless of how many overlapping dirs are configured.
- Spec snapshots refreshed for the 187 specs Phase 2 regenerated (js_source -> js_runtime{kind,source}).
- Binary-size baseline bumped to 109,870,752 bytes (~104.78 MB, +1.35 MB over the Phase 3 baseline). Captures the combined Phase 4 + 5 rquickjs-link cost. Absolute 104.78 MB / 110 MB ceiling, delta zero against the new baseline.
- Hot-path: GeneratorSpec.js_runtime is now Option<Arc<JsRuntimeSpec>>; the dispatch path Arc-clones (pointer bump) instead of deep-cloning the embedded JS source on every keystroke. AWS specs ship multi-KB source bodies, so the saving is real.
- Diagnostic warn tests now install a thread-local tracing-subscriber capture writer and assert the relevant code/generator_id appear in emitted logs (Timeout, Exception, UnsupportedHostApi). dev-deps pick up tracing-subscriber from the workspace.
- JsWorker::Drop blocking shutdown documented with TODO for the follow-up bounded-shutdown work.

* ux-9 phase 7: diagnostics, doctor, coverage drift gate

User-facing diagnostics:

- status JSON schema 1.2: surfaces alias conflicts as a structured list (`command_alias_conflict_details` with alias / kind / winner_* / loser_* fields), splits requires_js generator counts by kind class (`requires_js_generators_supported_by_kind`: post_process / script_function / custom), and exposes the JS runtime kill switch under a top-level `js_runtime` block.
- status text mode: new Coverage / Dynamic generators / Command addressability / JS runtime sections — each metric mirrors a status --json field name so the two views stay aligned.
- doctor: new "Spec addressability" check listing each AliasConflict grouped by kind with kind-specific hints (DuplicateName / NameMatchesOtherStem / DirectoryPrecedence). New "JS runtime" check warns when the kill switch is off. New "Embedded specs" check verifies every requires_js generator in the loaded corpus has populated js_runtime metadata (`source != ""`).

Validation:

- validate-specs --strict now fails on shipped specs with requires_js but no js_runtime metadata (or empty source). The check is strict-only so non-strict invocations don't suddenly warn on every existing spec that hasn't been re-converted.

CI:

- New scripts/check-coverage-regression.sh fails when the unsupported requires_js generator count exceeds the baseline by more than the tolerance (default 0), or whenever any command is reported nonfunctional. Wired into ci.yml as `continue-on-error: true` for the initial rollout; the maintainer who promotes it to a hard gate is responsible for refreshing the baseline first.
- docs/ci-gates.md documents the new gate, including local debug commands and the readiness-table entry.
- Self-tests at scripts/check-coverage-regression.test.sh (23 cases) cover flag parsing, missing baseline / status-json, the regression / improvement / nonfunctional / within-tolerance branches, and the env-var pathway.

Baseline:

- docs/coverage-baseline.json refreshed with v0.12.0-ux9 row: 3641 / 3641 / 0 (total / supported / unsupported), 0 nonfunctional, 6 alias conflicts, 717 commands_addressable, 171 corrected_generators, 15 native_providers.

* ux-9 phase 8: docs reflect runtime default-on; corpus verification

- docs/SPECS.md: replace the "no JS runtime" non-goal with a pointer to the gc-jsrt crate (default-on as of UX-9 Phase 3+).
- docs/COMPLETION_SPEC.md: rewrite the requires_js section for runtime active state across all three js_runtime.kind variants; mention the [suggest.providers] js_runtime kill switch.
- docs/ARCHITECTURE.md: gc-jsrt crate row, dependency-graph edge from gc-suggest, and a generator-table row covering JS-backed generators with the CacheKey::JsProcessed split.
- README.md: 9 crates instead of 8 (gc-jsrt added), Completion Specs paragraph mentions JS-backed generators, Known Limitations entry reframes requires_js from "partially functional" to a bounded sandbox.

* ux-9 phase 8 follow-up: restore z.json zoxide query --list generator

Phase 2's converter regen overwrote specs/z.json with an empty stub because upstream @withfig/autocomplete has no z spec — the rich spec was a hand-curated override (added in 620c4bf, expanded in a2aa016) and it survived earlier regens only because nothing re-emitted z.json. UX-9's converter ran for every name and clobbered it.

Restored z.json with its zoxide query --list generator, split_lines/filter_empty/trim transforms, 60s ttl cache, folders template, and -/~ static suggestions. Snapshot regenerated to match.

This was the only spec that lost generators in Phase 2's regen (verified by walking 88f7204 → HEAD diff for every spec's script/script_template/generators count); other hand-curated overrides (cd, cargo, docker, aws priorities) survived because spec-priority-audit/apply.mjs reapplied them.

* fix(review-loop): iteration 1 - address 22 findings

- fix JS runtime host API compatibility and diagnostics
- schedule supported requires_js generators in production paths
- cover strict metadata validation for option arg arrays
- refresh JS runtime docs and comments

* fix(review-loop): iteration 2 - address runtime followups

- preserve JS token context and stale-result handling
- gate JS runtime support on self-contained sources
- refine option-arg resolution and coverage status reporting
- update converter, docs, and coverage counter comments

Verification note: cargo clippy --all-targets -- -D warnings passed; cargo test -- --test-threads=1 still fails in strict option-arg metadata tests.
* fix(review-loop): iteration 3 - address 63 review findings

Aggregated four parallel review-fix clusters covering 65 panel findings (2 critical, 14 high, 39 medium, 10 low). 63 addressed; 2 deferred as cross-cluster (type-3 enum refactor, type-4 type-name collision).

ghost-complete (8/8):

- validate.rs/doctor.rs walk both args and extra_args (formerly missed every requires_js generator on a non-first option arg; both PR-added regression tests now pass)
- prune Phase-X annotations across doctor.rs, status.rs, validate.rs
- new crates/ghost-complete/tests/doctor_cli.rs smoke covers exit codes

gc-suggest + gc-pty (26/26):

- script::run_script_full returns structured stdout/stderr/exit_code so EngineShellRunner can stop string-sniffing anyhow messages
- timeout-vs-hung classification corrected to ShellRunError::Timeout
- Custom dispatch cache key always includes cwd
- ScriptFunction/Custom None js_runtime branches warn instead of silent-continue
- JsRuntimeAdapter worker spawn race serialised behind a Mutex
- handler::generator_depends_on_current_word tightened to Custom + ScriptFunction (PostProcess gets prefix-superset semantics)
- new tests for shell-runner classification + allow_shell_command + TTL=0
- OptionSpec::extra_args bumped to pub for cross-crate iteration

gc-jsrt (20/22):

- delete parking_lot_lite (~55 lines unsafe), HostState uses std::sync::Mutex
- bounded_shell_timeout short-circuits below 5ms with Timeout, no spawn
- WorkerHandle::drop logs panic payloads via tracing::error
- parse_exec_options emits diagnostics for wrong-typed cwd/timeout
- throw_with_code logs before degrading to undefined-throw fallback
- interrupted_flag downgraded AtomicU64 -> AtomicBool
- new test asserts WorkerDead via JsWorker::spawn_for_test_with_failing_thread
- new test iterates the 5 unsupported host namespaces (UnsupportedHostApi)
- Phase tags pruned, orphan TODOs removed

docs (9/9):

- JS_RUNTIME.md / SPECS.md / COMPLETION_SPEC.md / ci-gates.md present-tense for shipped behaviour; ADR 0006 gets a post-merge status note rather than a retroactive rewrite (preserves immutable-ADR convention)
- tools/fig-converter Phase tags stripped, WHY hints kept

Verified: cargo fmt + cargo clippy --all-targets -- -D warnings + cargo check --workspace + cargo test --workspace (1632/1632) + npm --prefix tools/fig-converter test (234/234) all clean.

* fix(review-loop): iteration 4 - address 27 review findings

Aggregated four parallel review-fix clusters covering all 27 panel findings (1 critical, 1 high, 18 medium, 7 low). 0 skipped.

ghost-complete (3/3):

- CRITICAL: status.rs supported_kind now requires self_contained:true for script_function/custom (mirrors engine::is_supported_script_generator). Coverage baseline corrected: supported 3641 -> 1944, unsupported 0 -> 1697. Doctor extended to count non-PostProcess generators lacking self_contained as missing. 5 new regression tests cover both kinds, both polarities, and the post_process exemption.
- validate.rs walk_subs switched from recursion to the iterative stack pattern doctor.rs already uses (no future stack-depth attack surface).
- status.rs scan_spec_files emits tracing::warn on read/parse failure and no longer increments spec_files_total for files that fail to parse.

gc-jsrt + cross-cluster type-3 (12/12):

- type-3 enum refactor: JsRuntimeOutput.payload is now JsRuntimeOutputPayload::{Suggestions, Argv, None}, with engine.rs pattern matching at the consumer seams. Illegal states unrepresentable.
- executeShellCommand opts cwd/timeout now wins over descriptor-embedded values (correct precedence).
- parse_exec_options emits diagnostics for non-object opts arg, decode-failed cwd strings, and non-finite/out-of-range timeout values.
- 6 new tests pin opts-vs-descriptor precedence, the bad-type diagnostics, and the SHELL_TIMEOUT_FLOOR short-circuit (no spawn below 5ms).
- engine.rs gains a cache-key cwd negative test and a None-js_runtime no-panic test.
gc-suggest (8/8):

- Removed the dead JsRuntimeSpec.input field (and JsRuntimeInput enum): audited zero corpus uses, converter never emits, validator warning gone.
- Renamed phase4_js_dispatch.rs -> js_post_process_dispatch.rs and phase5_js_dispatch.rs -> js_dispatch.rs; stripped phase chronology from module docstrings, fixture descriptions, and inline test markers.
- handler.rs gains 5 unit tests for the asymmetric generator_depends_on_current_word predicate (Custom/ScriptFunction pin, PostProcess does not).
- New worker_serialises_concurrent_spawn_attempts concurrency test.
- json_string_literal docstring rewritten to point at QuickJS ES2020 contract and U+2028/U+2029 explicitly.

docs + scripts + workflows (5/5):

- COMPLETION_SPEC.md / SPECS.md / ci-gates.md / count-spec-coverage.sh / ci.yml / index.test.js: stripped Phase chronology markers per CLAUDE.md doc hygiene; structural keys (job names, env vars, function names) intact.
- COMPLETION_SPEC.md table loses the dead `input` field row.

Verified: cargo fmt + cargo clippy --all-targets -- -D warnings + cargo check --workspace + cargo test --workspace (1644/1644) + npm --prefix tools/fig-converter test (234/234) all clean.

* fix(review-loop): iteration 5 - address 23 review findings

Aggregated four parallel review-fix clusters covering all 23 panel findings (1 high, 12 medium, 10 low). All addressed; one sub-finding skipped as cross-cluster (specs.rs:1419 duplicate of engine-side gate).

ghost-complete (5/5):

- HIGH: 9 stale references to non-existent `js_runtime_supported` engine symbol renamed to actual `is_supported_script_generator` across doctor.rs / status.rs / tests/doctor_cli.rs.
- doctor's count_missing_js_runtime_in_spec PostProcess branch now mirrors the engine: requires `script` or `script_template` to be present.
- supported_kind doc rewritten as a SHAPE invariant (not a snapshot count).
- New regression tests: non-bool self_contained classifies as unsupported; validate's iterative descent emits correct JSON-pointer paths through a 6-level nested subcommand tree.

gc-jsrt + cross-cluster engine.rs (9/9):

- Renamed phase5_host_api.rs -> host_api.rs (and dropped the phase prefix from the unsupported-host-namespaces test).
- gc-jsrt Cargo.toml description loses the (UX-9 / requires_js specs) parenthetical.
- engine.rs is_supported_script_generator gains source-emptiness check for PostProcess; matches doctor + validate.
- Hot-path silent-drop in is_supported_script_generator now traces under RUST_LOG=gc_suggest=trace; doctor remains the actionable surface.
- worker.rs `obj.contains_key("...").unwrap_or(false)` anti-pattern hardened with explicit error handling, mirroring host.rs.
- Stale wire-protocol-mismatch comments and dead warns in engine dispatch removed; into_argv()/into_suggestions() docs reflect that the cross-variant arms are structurally unreachable.
- New host_api integration test pins the cwd<decode-failure> diagnostic via the lone-surrogate path.
- 8 new unit tests in types.rs cover JsRuntimeOutputPayload accessors across all three variants.

gc-suggest (5/5):

- Stripped phase4_/phase5_/phase1_ prefixes from 37 test fns across js_post_process_dispatch.rs, js_dispatch.rs, and specs.rs.
- 3 ux9 fixture descriptions rewritten present-tense.
- canonical_paths() doc loses the Pre-Phase-4 chronology framing.
- script_function_invalid_argv_yields_empty test extended with log capture asserting the upstream JsDiagnosticCode::InvalidArgv warn.

docs + scripts + config (4/4):

- check-coverage-regression.sh / gc-config lib.rs / ARCHITECTURE.md / JS_RUNTIME.md / run-spike.mjs: stripped UX-9 / Phase chronology tags; rewrote one config doc present-tense.

Verified: cargo fmt + cargo clippy --all-targets -- -D warnings + cargo test --workspace (1659/1659) + npm --prefix tools/fig-converter test (234/234) all clean.
* fix(review-loop): iteration 6 - validate predicate parity, doctor messages

Aggregated single ghost-complete cluster covering all 4 iter-4 findings (2 medium, 2 low). 0 skipped.

- validate.rs::collect_missing_js_runtime_warnings predicate extended to mirror engine + doctor: now also flags PostProcess+missing-script and ScriptFunction/Custom+missing-self_contained. New JsRuntimeWarningKind enum disambiguates the four undispatchable classes in warning text.
- doctor.rs check_embedded_runtime_metadata fail message now enumerates all four classes instead of the legacy "missing js_runtime metadata" framing; the function-level doc comment broadened to match the iter-3 predicate.
- 7 new tests including a cross-surface property test (validate_doctor_and_engine_predicates_agree) that iterates a 10-fixture matrix and asserts validate's verdict matches the expected supported/unsupported judgement — future divergence between any two of validate / doctor / engine fails this test.

Verified: cargo fmt + cargo clippy --all-targets -- -D warnings + cargo test --workspace (1667/1667) + npm --prefix tools/fig-converter test (234/234) all clean.

* fix(review-loop): iteration 7 - doctor doc summary + stale walk_subs ref

Aggregated single ghost-complete cluster covering both iter-5 findings (1 medium docstring, 1 low test docstring). 0 skipped. Pure docstring edits.

- doctor.rs::count_missing_js_runtime_in_spec summary at line 510 now enumerates all four corpus-defect classes ("All four surface as a doctor Fail") instead of the legacy "All three" framing left over from before the post_process+missing-script branch landed in iter-3.
- validate.rs test docstring no longer references the removed `walk_subs` symbol — points at the inlined subcommand-descent loop in `collect_missing_js_runtime_warnings`.

Verified: cargo fmt + cargo clippy -p ghost-complete --all-targets -- -D warnings + cargo test -p ghost-complete (184/184) all clean.
* fix(test): release-mode flake in below_floor_skips_spawn timing test

The iter-2-added test burned 95ms of a 100ms JS budget then expected the SHELL_TIMEOUT_FLOOR (5ms) short-circuit to fire. In release mode the JS busy-loop overshoots less than in debug, so remaining could land at exactly 5ms — failing the strictly-less-than floor check, spawning the runner, failing the assertion.

Bump burn target from 95ms to 99ms so even tight release-mode loop overhead leaves remaining well under SHELL_TIMEOUT_FLOOR. The else branch already handles the case where the deadline expires before the shell call is reached (Timeout diagnostic, no suggestion), so the larger burn doesn't change semantics — only makes the short-circuit trigger reliably across optimisation modes.

Verified: cargo test --release -p gc-jsrt --test host_api (27/27); cargo test --release --workspace clean except for the pre-existing PTY-exhaustion flake in smoke::test_exit_code_zero (passes in isolation; harness/mod.rs:28 openpty call under parallel test load).
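The boundary condition behind this flake can be sketched in a few lines. This is a minimal illustration, assuming the short-circuit is a strict `remaining < SHELL_TIMEOUT_FLOOR` comparison as the message describes; `should_skip_spawn` is a hypothetical helper, not the crate's actual API.

```rust
use std::time::Duration;

/// Floor named in the commit message (5ms).
const SHELL_TIMEOUT_FLOOR: Duration = Duration::from_millis(5);

/// Hypothetical helper: skip spawning the shell runner when the time left
/// in the JS budget is strictly below the floor.
fn should_skip_spawn(budget: Duration, elapsed: Duration) -> bool {
    // saturating_sub clamps to zero if the budget is already overspent.
    let remaining = budget.saturating_sub(elapsed);
    remaining < SHELL_TIMEOUT_FLOOR
}

fn main() {
    let budget = Duration::from_millis(100);

    // Burning exactly 95ms leaves exactly 5ms, which is NOT strictly below
    // the floor, so the runner would be spawned.
    println!("{}", should_skip_spawn(budget, Duration::from_millis(95))); // prints "false"

    // Burning 99ms leaves 1ms, comfortably under the floor.
    println!("{}", should_skip_spawn(budget, Duration::from_millis(99))); // prints "true"
}
```

With a 95ms burn the test only passed when loop overhead pushed the elapsed time past 95ms. A 99ms burn removes that dependence on overshoot.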
1 parent 88f7204 commit 26bcac0

430 files changed

Lines changed: 45213 additions & 8593 deletions

.github/workflows/ci.yml

Lines changed: 22 additions & 0 deletions
@@ -148,3 +148,25 @@ jobs:
       - uses: actions/checkout@v6
       - name: Check coverage baseline drift
         run: scripts/check-coverage-baseline-drift.sh
+
+  coverage-regression:
+    name: Coverage regression
+    # Coverage regression gate: fails when the unsupported requires_js
+    # generator count rises above the baseline by more than the configured
+    # tolerance, OR when any command becomes nonfunctional. Wired as
+    # `continue-on-error: true` during the initial rollout so a
+    # baseline-refresh oversight does not block unrelated work; promotion
+    # to a hard gate is a separate follow-up.
+    needs: [check]
+    runs-on: macos-latest
+    continue-on-error: true
+    steps:
+      - uses: actions/checkout@v6
+      - uses: dtolnay/rust-toolchain@stable
+      - uses: Swatinem/rust-cache@v2
+      - name: Build release binary
+        run: cargo build --release
+      - name: Coverage regression gate
+        run: scripts/check-coverage-regression.sh
+      - name: Coverage regression gate (self-test)
+        run: bash scripts/check-coverage-regression.test.sh
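The two failure conditions in the workflow comment can be modelled as a small predicate. This is a sketch only: the real gate is the shell script `scripts/check-coverage-regression.sh`, whose internals are not shown here. The `Coverage` struct, `check_regression`, and the numbers in `main` are hypothetical, and "any command becomes nonfunctional" is assumed to mean the nonfunctional count rising above the baseline.

```rust
/// Hypothetical snapshot of the two counters the gate compares.
#[derive(Debug, Clone, Copy)]
struct Coverage {
    unsupported_requires_js: u64,
    nonfunctional_commands: u64,
}

/// Returns Err with a reason when the gate should fail.
fn check_regression(baseline: Coverage, current: Coverage, tolerance: u64) -> Result<(), String> {
    // Condition 1: unsupported generators may rise by at most `tolerance`.
    let ceiling = baseline.unsupported_requires_js.saturating_add(tolerance);
    if current.unsupported_requires_js > ceiling {
        return Err(format!(
            "unsupported requires_js generators: {} exceeds ceiling {}",
            current.unsupported_requires_js, ceiling
        ));
    }
    // Condition 2 (assumed reading): no new nonfunctional commands, with no tolerance.
    if current.nonfunctional_commands > baseline.nonfunctional_commands {
        return Err(format!(
            "nonfunctional commands rose from {} to {}",
            baseline.nonfunctional_commands, current.nonfunctional_commands
        ));
    }
    Ok(())
}

fn main() {
    // Illustrative numbers only, not the repository's real baseline.
    let baseline = Coverage { unsupported_requires_js: 100, nonfunctional_commands: 2 };
    let current = Coverage { unsupported_requires_js: 104, nonfunctional_commands: 2 };
    match check_regression(baseline, current, 3) {
        Ok(()) => println!("gate passed"),
        Err(reason) => println!("gate failed: {reason}"),
    }
}
```

Because the job sets `continue-on-error: true`, a failure of this kind is reported without blocking the workflow during the rollout.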

Cargo.lock

Lines changed: 42 additions & 0 deletions
Some generated files are not rendered by default.

Cargo.toml

Lines changed: 1 addition & 0 deletions
@@ -9,6 +9,7 @@ members = [
     "crates/gc-overlay",
     "crates/gc-config",
     "crates/gc-terminal",
+    "crates/gc-jsrt",
 ]

 [workspace.package]

README.md

Lines changed: 4 additions & 3 deletions
@@ -172,13 +172,13 @@ Beyond specs, built-in providers offer:
 - **Shell alias resolution** — `alias g=git` → `g push` uses the git spec
 - **Frecency-ranked history** — frequently/recently used commands score higher

-Many specs include dynamic generators that run shell commands for live results (e.g., `brew list`, `docker ps`, `kubectl get`). Generator results are cached with configurable TTL. A loading indicator (`...`) appears while generators run.
+Many specs include dynamic generators that run shell commands for live results (e.g., `brew list`, `docker ps`, `kubectl get`). Generator results are cached with configurable TTL. A loading indicator (`...`) appears while generators run. Specs that originally relied on inline JavaScript (Fig `postProcess`, `script: () => [...]`, `custom: async () => [...]`) execute through the embedded `gc-jsrt` runtime — see [`docs/JS_RUNTIME.md`](docs/JS_RUNTIME.md) for the sandbox model.

 Custom specs go in `~/.config/ghost-complete/specs/`. See [docs/COMPLETION_SPEC.md](docs/COMPLETION_SPEC.md) for the format reference.

 ## Architecture

-Rust workspace with 8 crates:
+Rust workspace with 9 crates:

 | Crate | Role |
 |-------|------|
@@ -190,6 +190,7 @@ Rust workspace with 8 crates:
 | `gc-overlay` | ANSI popup rendering with synchronized output |
 | `gc-config` | TOML config, keybindings, themes |
 | `gc-terminal` | Terminal detection, capability profiling, render strategy selection |
+| `gc-jsrt` | Bounded QuickJS evaluator for `requires_js` specs (rquickjs) |

 See [docs/ARCHITECTURE.md](docs/ARCHITECTURE.md) for the full design — data flow, dependency graph, key design decisions, and performance characteristics.

@@ -207,7 +208,7 @@ See [docs/ARCHITECTURE.md](docs/ARCHITECTURE.md) for the full design — data fl
 - **Terminal.app inside tmux is not detected.** Terminal.app sets no environment variable that leaks through tmux, so Ghost Complete cannot identify it. Ghostty, Kitty, WezTerm, Alacritty, and iTerm2 in tmux work correctly via their respective env vars (`GHOSTTY_RESOURCES_DIR`, `KITTY_WINDOW_ID`, `WEZTERM_UNIX_SOCKET`, `ALACRITTY_SOCKET`, `ITERM_SESSION_ID`).
 - **Dynamic generator results require a keystroke to render.** Async generators (shell commands for live results) merge into the popup on the next PTY read. If the shell is idle after the generator completes, results won't appear until the next keystroke.
 - **Bash and fish: manual trigger only.** Auto-trigger on typing is not implemented for bash or fish. Use Ctrl+/ to manually invoke completions.
-- **Specs with `requires_js: true` are partially functional.** Static completions (subcommands, options) work, but JS-based generators from the original Fig ecosystem are not executed.
+- **JS-backed generators run in a bounded sandbox.** Specs with `requires_js: true` are evaluated by the embedded `gc-jsrt` runtime (QuickJS via rquickjs, default on). The sandbox enforces resource caps (memory, stack, time) and a restricted host API; long-running native operations (pathological regex, large `JSON.parse`) are not preempted at the exact deadline. The runtime can be disabled wholesale via `[suggest.providers] js_runtime = false`. See [`docs/JS_RUNTIME.md`](docs/JS_RUNTIME.md).
 - **No Linux or Windows support.** macOS only. The PTY proxy and terminal detection rely on macOS-specific behavior.

 ## FAQ
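The README caveat that native operations "are not preempted at the exact deadline" is typical of cooperative deadline checks. The sketch below is a pure-Rust illustration of that behaviour, assuming the deadline is only polled between units of work. That mechanism is an assumption, since the README states only the observable effect. `run_with_deadline` is hypothetical and `thread::sleep` stands in for one uninterruptible native call.

```rust
use std::thread;
use std::time::{Duration, Instant};

/// Runs `steps` in order, checking the deadline only BETWEEN steps.
/// Returns how many steps ran and the total elapsed time.
fn run_with_deadline(steps: &[Duration], budget: Duration) -> (usize, Duration) {
    let start = Instant::now();
    let mut completed = 0;
    for step in steps {
        // Cooperative check: this is the only place the deadline is observed.
        if start.elapsed() >= budget {
            break;
        }
        // Stand-in for a native operation that cannot be interrupted midway.
        thread::sleep(*step);
        completed += 1;
    }
    (completed, start.elapsed())
}

fn main() {
    let steps = [Duration::from_millis(30), Duration::from_millis(30)];
    let (completed, elapsed) = run_with_deadline(&steps, Duration::from_millis(10));
    // The first 30ms step starts inside the 10ms budget and runs to completion,
    // so the total overshoots the budget; the second step never starts.
    println!("completed {completed} step(s) in {elapsed:?}");
}
```

The budget still bounds how much work gets started. What it cannot bound is the length of the step already in flight when the deadline passes.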
Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
-106910288
+109870752

crates/gc-config/src/lib.rs

Lines changed: 12 additions & 0 deletions
@@ -170,6 +170,17 @@ pub struct ProvidersConfig {
     pub filesystem: bool,
     pub specs: bool,
     pub git: bool,
+    /// Global kill switch for the QuickJS evaluator that backs
+    /// `requires_js` generators. Default `true` so JS-backed
+    /// completions work out of the box; users can disable locally
+    /// with `[suggest.providers] js_runtime = false` in their
+    /// `config.toml`.
+    ///
+    /// When `false`, the suggestion engine skips every JS-backed
+    /// `requires_js` generator whose `js_runtime.kind` is populated
+    /// (`post_process`, `script_function`, or `custom`). Static spec
+    /// completions remain available.
+    pub js_runtime: bool,
 }

 impl Default for ProvidersConfig {
@@ -179,6 +190,7 @@ impl Default for ProvidersConfig {
             filesystem: true,
             specs: true,
             git: true,
+            js_runtime: true,
         }
     }
 }
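How the kill switch might gate dispatch can be sketched as below. This is an illustration under assumptions, not the engine's code: `ProvidersConfig` is trimmed to the one field under discussion, and `JsRuntimeKind` and `should_dispatch_js` are hypothetical names that mirror the three `js_runtime.kind` values listed in the doc comment.

```rust
/// Mirrors the three populated `js_runtime.kind` values from the doc comment.
#[derive(Debug, Clone, Copy, PartialEq)]
enum JsRuntimeKind {
    PostProcess,
    ScriptFunction,
    Custom,
}

/// Trimmed stand-in for the real struct; only the new field is modelled.
struct ProvidersConfig {
    js_runtime: bool,
}

impl Default for ProvidersConfig {
    fn default() -> Self {
        // Default on, matching the diff above.
        Self { js_runtime: true }
    }
}

/// Hypothetical predicate: dispatch to the JS runtime only when the switch is
/// on AND the generator carries runtime metadata.
fn should_dispatch_js(cfg: &ProvidersConfig, kind: Option<JsRuntimeKind>) -> bool {
    cfg.js_runtime && kind.is_some()
}

fn main() {
    let on = ProvidersConfig::default();
    let off = ProvidersConfig { js_runtime: false };
    for kind in [JsRuntimeKind::PostProcess, JsRuntimeKind::ScriptFunction, JsRuntimeKind::Custom] {
        println!(
            "{:?}: on={} off={}",
            kind,
            should_dispatch_js(&on, Some(kind)),
            should_dispatch_js(&off, Some(kind))
        );
    }
}
```

In either case static completions are unaffected, since the predicate concerns only JS-backed generators.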

crates/gc-jsrt/Cargo.toml

Lines changed: 31 additions & 0 deletions
@@ -0,0 +1,31 @@
+[package]
+name = "gc-jsrt"
+version.workspace = true
+edition.workspace = true
+rust-version.workspace = true
+license.workspace = true
+authors.workspace = true
+publish = false
+
+description = "Bounded QuickJS evaluator for Ghost Complete's requires_js spec generators."
+
+[dependencies]
+# Sync rquickjs Runtime + Context. We deliberately avoid the `loader`,
+# `dyn-load`, `futures`, `parallel`, `macro`, and `phf` features:
+# - `loader`/`dyn-load` would expose ES6 module loading and native plugin
+#   loading we never want corpus JS to reach.
+# - `futures` pulls in the AsyncRuntime; we use a dedicated worker thread
+#   instead so we keep a smaller surface and a single concurrency model.
+# - `macro`/`phf` add proc-macro and PHF dependencies we don't use.
+# `rust-alloc` keeps QuickJS allocations going through Rust's global
+# allocator so memory usage shows up in the same accounting as everything
+# else in the binary.
+rquickjs = { version = "0.10", default-features = false, features = ["rust-alloc"] }
+serde = { workspace = true }
+serde_json = "1"
+thiserror = "2"
+tokio = { workspace = true }
+tracing = { workspace = true }
+
+[dev-dependencies]
+tokio = { workspace = true, features = ["test-util", "macros", "rt-multi-thread", "time"] }
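The "dedicated worker thread" model mentioned in the dependency comment can be sketched with the standard library alone. This is a minimal sketch, not the crate's implementation: `Job`, `spawn_worker`, and `evaluate` are hypothetical names, and the "evaluation" is a stand-in where a QuickJS runtime and context would presumably live.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// One evaluation request: the source to run plus a channel for the reply.
struct Job {
    source: String,
    reply: mpsc::Sender<Result<String, String>>,
}

/// Spawns the single worker thread and returns the handle used to submit jobs.
fn spawn_worker() -> mpsc::Sender<Job> {
    let (tx, rx) = mpsc::channel::<Job>();
    thread::spawn(move || {
        // A JS runtime would be created once here and reused for every job.
        for job in rx {
            // Stand-in for evaluation: report the source length.
            let result = if job.source.is_empty() {
                Err("empty source".to_string())
            } else {
                Ok(format!("evaluated {} bytes", job.source.len()))
            };
            // The caller may have timed out and dropped its receiver; ignore that.
            let _ = job.reply.send(result);
        }
    });
    tx
}

/// Submits one job and waits up to `timeout` for the worker's reply.
fn evaluate(worker: &mpsc::Sender<Job>, source: &str, timeout: Duration) -> Result<String, String> {
    let (reply_tx, reply_rx) = mpsc::channel();
    worker
        .send(Job { source: source.to_string(), reply: reply_tx })
        .map_err(|_| "worker thread is gone".to_string())?;
    match reply_rx.recv_timeout(timeout) {
        Ok(result) => result,
        Err(_) => Err("timed out or worker dropped the job".to_string()),
    }
}

fn main() {
    let worker = spawn_worker();
    println!("{:?}", evaluate(&worker, "1+1", Duration::from_secs(1)));
    println!("{:?}", evaluate(&worker, "", Duration::from_secs(1)));
}
```

Funnelling every request through one thread keeps all interpreter state on that thread, which is one way to get the "single concurrency model" the comment describes without pulling in an async runtime feature.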
