Ghost Complete is a terminal-native autocomplete engine that works as a PTY proxy — it sits between your terminal emulator and your shell, intercepting the data stream to render suggestion popups using native ANSI escape sequences. No Accessibility API, no IME hacks, no Electron overlay.

    ┌──────────────────────────────────────────────────────────┐
    │                    Terminal Emulator                     │
    │        (Ghostty, Kitty, WezTerm, Alacritty, ...)         │
    │        Receives: shell output + overlay sequences        │
    └──────────────────────┬───────────────────────────────────┘
                           │ stdin / stdout (raw bytes)
                           ▼
                  ┌──────────────────┐
                  │  Ghost Complete  │
                  │   (PTY Proxy)    │
                  │                  │
                  │  ┌────────────┐  │
                  │  │ VT Parser  │◄─┼── parses shell output, tracks cursor
                  │  └─────┬──────┘  │   position, screen dims, prompt bounds
                  │        │         │
                  │        ▼         │
                  │  ┌────────────┐  │
                  │  │   Buffer   │  │
                  │  │  Tracker   │──┼── reconstructs current command line,
                  │  └─────┬──────┘  │   detects command context
                  │        │         │
                  │        ▼         │
                  │  ┌────────────┐  │
                  │  │ Suggestion │  │
                  │  │   Engine   │──┼── fuzzy matching against completions
                  │  └─────┬──────┘  │   (specs, filesystem, git, history)
                  │        │         │
                  │        ▼         │
                  │  ┌────────────┐  │
                  │  │  Overlay   │  │
                  │  │  Renderer  │──┼── renders popup using ANSI sequences
                  │  └────────────┘  │   with synchronized output
                  │                  │
                  └────────┬─────────┘
                           │ PTY master ↔ slave
                           ▼
                  ┌──────────────────┐
                  │  Shell Process   │
                  │ (zsh/bash/fish)  │
                  └──────────────────┘

- User types a keystroke in the terminal emulator
- `gc-pty` receives it on stdin: if the popup is visible, it intercepts navigation keys (Tab, arrows, Escape, Enter); otherwise it forwards the keystroke to the shell PTY
- Shell produces output, which flows through `gc-parser` (VT state tracking) and then to terminal stdout
- `gc-parser` tracks cursor position, screen dimensions, prompt boundaries, CWD, and the shell's exported env snapshot using shell-emitted OSC markers
- On trigger conditions (space after command, `/`, `-`, `--`, Ctrl+/, or delay timeout), `gc-suggest` computes ranked suggestions
- Static suggestions (subcommands, options, templates) render immediately via `gc-overlay`
- Script generators execute async in the background; results merge into the popup progressively without resetting cursor position
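The keystroke routing rule in the flow above can be sketched as a small pure function. This is a minimal illustration, not the `gc-pty` implementation: the `Key` and `Route` enums and `route_key` are hypothetical names introduced here.

```rust
/// Keys the proxy distinguishes (hypothetical enum for this sketch).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Key {
    Tab,
    Up,
    Down,
    Escape,
    Enter,
    /// Any other byte; always passed through.
    Other(u8),
}

/// Where a single keystroke ends up.
#[derive(Debug, PartialEq, Eq)]
pub enum Route {
    /// Consumed by the popup; the shell never sees it.
    Popup(Key),
    /// Forwarded unchanged to the shell PTY.
    Shell(Key),
}

/// Navigation keys are intercepted only while the popup is visible.
pub fn route_key(key: Key, popup_visible: bool) -> Route {
    let is_nav = matches!(
        key,
        Key::Tab | Key::Up | Key::Down | Key::Escape | Key::Enter
    );
    if popup_visible && is_nav {
        Route::Popup(key)
    } else {
        Route::Shell(key)
    }
}
```

With the popup hidden, Tab reaches the shell as usual, so ordinary shell completion keeps working until a popup is actually on screen.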
The workspace contains 9 crates under crates/:
| Crate | Purpose | Key Dependencies |
|---|---|---|
| `ghost-complete` | Binary entry point, CLI (clap), install/uninstall, status, doctor, validate-specs | clap |
| `gc-pty` | PTY proxy event loop: spawns shell, multiplexes stdin/stdout with `tokio::select!`, handles SIGWINCH, async generator merge | portable-pty, tokio |
| `gc-parser` | VT escape sequence parsing: cursor position, screen dimensions, prompt boundaries (OSC 133 + OSC 7771), CWD (OSC 7), exported env (OSC 7773) | vte |
| `gc-buffer` | Command line reconstruction: current command, argument position, pipes, redirects, quotes | |
| `gc-suggest` | Suggestion engine: dispatches to providers, fuzzy-ranks with nucleo, async generators with transform pipelines and TTL caching | nucleo, serde_json |
| `gc-overlay` | ANSI popup rendering: cursor save/restore, synchronized output, scroll-to-make-room, scrollbar, fuzzy match highlighting | |
| `gc-config` | TOML config, keybindings, themes (presets + custom styles), generator timeouts | serde, toml |
| `gc-terminal` | Terminal detection and capability profiling: `TerminalProfile` with `RenderStrategy` and `PromptDetection` enums | |
| `gc-jsrt` | Bounded QuickJS evaluator for `requires_js` specs. Active and wired into `gc-suggest` for all four `js_runtime.kind` variants (`post_process`, `script_function`, `custom`, `token_only`). See docs/JS_RUNTIME.md. | rquickjs |

    ghost-complete ──► gc-pty ──► gc-parser
                         │    ──► gc-buffer
                         │    ──► gc-config
                         │    ──► gc-terminal
                         │    ──► gc-suggest ──► gc-buffer
                         │                   ├─► gc-config
                         │                   └─► gc-jsrt
                         │    ──► gc-overlay ──► gc-suggest
                         │                   └─► gc-terminal

gc-parser, gc-buffer, gc-config, gc-terminal, and gc-jsrt are leaf crates with no other gc-* dependencies. gc-suggest depends on gc-buffer, gc-config, and gc-jsrt. gc-overlay depends on gc-suggest and gc-terminal. gc-pty depends on every other crate and ties them all together.
Ghost Complete runs as a PTY proxy rather than a zsh/fish plugin. The proxy sits between the terminal and the shell, seeing all bytes in both directions. This means:
- No zle widget conflicts — doesn't hook into shell internals
- No plugin manager dependencies — one binary, works after install
- No RPROMPT corruption — popup rendering is independent of shell prompt
- Shell-agnostic core — the same proxy works with zsh, bash, and fish
The tradeoff is complexity: we have to maintain our own VT parser to track cursor position, rather than asking the shell where it is.
We use the vte crate — a parser-only VT state machine that fires callbacks per escape sequence. We do NOT maintain a full screen buffer (like alacritty_terminal or vt100). We only track:
- Cursor position (row, column)
- Screen dimensions
- Prompt boundaries (via OSC 133 / OSC 7771)
- Current working directory (via OSC 7)
- Exported shell environment snapshot (via OSC 7773)
This keeps memory usage minimal and parsing fast. The tradeoff: the tracked cursor position can drift from the terminal's real one over time, because some complex escape sequences are not fully modeled. We correct for this with CPR sync: the proxy periodically requests the terminal's actual cursor position via CSI 6n and reconciles the tracked position against the reply.
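As an illustration of CPR sync, the sketch below parses a cursor position report (`ESC [ <row> ; <col> R`, the standard reply to CSI 6n) and overwrites the tracked position with it. The type and function names (`CursorPos`, `parse_cpr`, `reconcile`) are assumptions made for this example, not the `gc-parser` API.

```rust
/// Cursor position as reported by the terminal (1-based, like CPR itself).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct CursorPos {
    pub row: u16,
    pub col: u16,
}

/// The request written to the terminal: CSI 6n (Device Status Report).
pub const CPR_REQUEST: &[u8] = b"\x1b[6n";

/// Parses a CPR reply of the form `ESC [ <row> ; <col> R`.
/// Returns `None` for anything that is not a complete, well-formed reply.
pub fn parse_cpr(reply: &[u8]) -> Option<CursorPos> {
    let body = reply.strip_prefix(b"\x1b[")?.strip_suffix(b"R")?;
    let text = std::str::from_utf8(body).ok()?;
    let (row, col) = text.split_once(';')?;
    Some(CursorPos {
        row: row.parse().ok()?,
        col: col.parse().ok()?,
    })
}

/// Replaces the tracked position with the reported one.
/// Returns `true` if the two disagreed, i.e. drift was corrected.
pub fn reconcile(tracked: &mut CursorPos, reported: CursorPos) -> bool {
    let drifted = *tracked != reported;
    *tracked = reported;
    drifted
}
```

The terminal's answer is treated as authoritative here; a real implementation would also need to pull the reply out of the stdin stream so it is not forwarded to the shell as typed input.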
nucleo (the fuzzy matcher from the Helix editor) is ~6x faster than skim. It presegments Unicode strings once and reuses them across queries, making incremental search fast. With 10,000 candidates, nucleo returns results in <1ms. For an autocomplete tool running on every keystroke, this is the difference between "instant" and "noticeable lag."
Modern terminals support DECSET 2026 — the terminal buffers all output between begin/end markers and renders it atomically. This eliminates flicker during popup rendering. Ghostty, Kitty, WezTerm, Alacritty, and Rio all support this.
For terminals that don't (iTerm2, Terminal.app), we fall back to a pre-render buffer strategy: build the entire frame into a byte buffer and emit it in a single write() syscall, relying on kernel write atomicity.
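The two strategies differ only in how a finished frame is framed before it is written. A minimal sketch, assuming a hypothetical `build_frame` helper and a local `RenderStrategy` enum that mirrors the two modes described above:

```rust
use std::io::{self, Write};

/// The two rendering modes described above (local copy for this sketch).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RenderStrategy {
    Synchronized,
    PreRenderBuffer,
}

/// DECSET 2026: begin synchronized update.
const SYNC_BEGIN: &[u8] = b"\x1b[?2026h";
/// DECRST 2026: end synchronized update.
const SYNC_END: &[u8] = b"\x1b[?2026l";

/// Builds the complete frame in memory. Synchronized terminals get the
/// begin/end markers; the fallback gets the body alone.
pub fn build_frame(strategy: RenderStrategy, body: &[u8]) -> Vec<u8> {
    let mut frame = Vec::with_capacity(body.len() + SYNC_BEGIN.len() + SYNC_END.len());
    if strategy == RenderStrategy::Synchronized {
        frame.extend_from_slice(SYNC_BEGIN);
    }
    frame.extend_from_slice(body);
    if strategy == RenderStrategy::Synchronized {
        frame.extend_from_slice(SYNC_END);
    }
    frame
}

/// Emits the frame with one `write_all` call. For a small frame this is
/// normally a single write() syscall, though `write_all` retries on a
/// short write.
pub fn emit_frame<W: Write>(out: &mut W, strategy: RenderStrategy, body: &[u8]) -> io::Result<()> {
    out.write_all(&build_frame(strategy, body))?;
    out.flush()
}
```

Building the whole frame first means both strategies share one code path, and the popup renderer never writes partial output to the terminal.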
The gc-terminal crate detects the terminal at startup and assigns capabilities via a TerminalProfile:
- `RenderStrategy`: `Synchronized` (DECSET 2026) or `PreRenderBuffer` (single write)
- `PromptDetection`: `Osc133` (native) or `ShellIntegration` (OSC 7771 markers)
Detection uses TERM_PROGRAM plus terminal-specific env vars (KITTY_WINDOW_ID, WEZTERM_UNIX_SOCKET, ALACRITTY_SOCKET, ZED_TERM, VSCODE_IPC_HOOK_CLI). Inside tmux, these env vars leak through from the outer terminal, allowing detection of the host terminal.
The overlay and parser crates are strategy-driven — they query the profile for capabilities rather than checking terminal names. Adding a new terminal means adding one enum variant and one match arm in gc-terminal; no other crate needs changes.
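A rough sketch of capability detection follows. The enum and struct names come from the text above, but the field names, the `detect_profile` function, the exact `TERM_PROGRAM` strings, and the conservative default for unknown terminals are assumptions for illustration only.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RenderStrategy {
    Synchronized,
    PreRenderBuffer,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PromptDetection {
    Osc133,
    ShellIntegration,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct TerminalProfile {
    pub render: RenderStrategy,
    pub prompt: PromptDetection,
}

/// `env` abstracts environment lookup so the function can be exercised
/// without touching the real process environment.
pub fn detect_profile(env: impl Fn(&str) -> Option<String>) -> TerminalProfile {
    use PromptDetection::*;
    use RenderStrategy::*;

    let term_program = env("TERM_PROGRAM").unwrap_or_default();
    let has = |name: &str| env(name).is_some();

    let (render, prompt) = if has("KITTY_WINDOW_ID") {
        (Synchronized, Osc133)
    } else if has("ALACRITTY_SOCKET") {
        // Synchronized output, but prompt markers come from OSC 7771.
        (Synchronized, ShellIntegration)
    } else {
        // TERM_PROGRAM values below are assumed, not taken from the source.
        match term_program.as_str() {
            "ghostty" | "WezTerm" | "rio" => (Synchronized, Osc133),
            // Assumed default: unknown terminals get the fallback path.
            _ => (PreRenderBuffer, ShellIntegration),
        }
    };
    TerminalProfile { render, prompt }
}
```

In a real process the lookup would be `detect_profile(|k| std::env::var(k).ok())`. Downstream code then branches on `profile.render` and `profile.prompt` only, which is what keeps terminal names out of the other crates.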
The PTY proxy runs four concurrent worker tasks plus a main coordination loop:
| Task | Spawn type | Role |
|---|---|---|
| Task A (stdin reader) | `spawn_blocking` | Reads user keystrokes, intercepts popup navigation when visible, forwards to shell PTY |
| Task B (PTY reader) | `spawn_blocking` | Reads shell output, runs VT parser, detects triggers, renders popup, forwards to stdout |
| Task C (debounce timer) | `tokio::spawn` (only when `delay_ms > 0`) | Waits for typing pauses, fires delayed suggestion triggers |
| Task D (merge loop) | `tokio::spawn` | Drains async generator results from an mpsc channel and merges them into the visible popup |
| Main loop | `tokio::select!` | Waits on SIGWINCH, SIGTERM, SIGHUP, child exit, and the shutdown channel |
Task B notifies Task C via tokio::sync::Notify when the buffer is dirty but no immediate trigger fired. Task C resets its timer on each notification and fires a trigger after delay_ms (default 150ms) of inactivity. Task D consumes results posted by per-generator tasks so async output can land in an idle shell without waiting for the next keystroke.
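The debounce behavior of Task C can be modeled without any async runtime. The sketch below is a deterministic stand-in that takes the current time as an argument; it deliberately swaps `tokio::sync::Notify` and a timer for explicit `notify`/`poll` calls, and the `Debouncer` type is hypothetical.

```rust
use std::time::{Duration, Instant};

/// Deterministic model of the debounce timer: callers pass `now` explicitly.
pub struct Debouncer {
    delay: Duration,
    deadline: Option<Instant>,
}

impl Debouncer {
    pub fn new(delay_ms: u64) -> Self {
        Self {
            delay: Duration::from_millis(delay_ms),
            deadline: None,
        }
    }

    /// The buffer became dirty: (re)start the quiet-period timer.
    pub fn notify(&mut self, now: Instant) {
        self.deadline = Some(now + self.delay);
    }

    /// Returns `true` exactly once per quiet period, when `delay` has
    /// elapsed since the most recent notification.
    pub fn poll(&mut self, now: Instant) -> bool {
        match self.deadline {
            Some(deadline) if now >= deadline => {
                self.deadline = None;
                true
            }
            _ => false,
        }
    }
}
```

Each notification pushes the deadline out again, so a burst of keystrokes produces a single trigger after the user pauses rather than one trigger per key.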
Ghost Complete ships 711 Fig-compatible JSON specs embedded in the binary. Specs are zstd-compressed (level 19) into a single archive and lazily decompressed on first lookup; each decompressed spec is cached as a `&'static str` in a `Mutex<HashMap<&'static str, &'static str>>`, so the second lookup of any spec is a hash hit with no decompression. Compression shrinks the 47 MB JSON corpus to a 3,881,220-byte (3.70 MB) archive, and it reduced the macOS arm64 release binary benchmarked in ux-12b from 103.41 MB to 11.81 MB; the current CI size baseline is 21,383,696 bytes. See docs/plans/ux-12b-zstd-spec-compression/SPEC.md for the archive layout. At startup, specs are registered but not parsed; see "Lazy Spec Loading" below for the rationale and contract. Command aliases are indexed at registration time so lookup can find a lazy candidate chain without parsing the full spec body.
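The decompress-once cache can be sketched with the standard library alone. In this illustration `decompress` is a hypothetical stand-in for the zstd archive lookup, and `spec_json` is an assumed name; only the `Mutex<HashMap<&'static str, &'static str>>` shape comes from the description above.

```rust
use std::collections::HashMap;
use std::sync::{Mutex, OnceLock};

/// Process-wide cache of decompressed spec bodies.
static CACHE: OnceLock<Mutex<HashMap<&'static str, &'static str>>> = OnceLock::new();

/// Hypothetical stand-in for reading one entry out of the zstd archive.
fn decompress(name: &str) -> Option<String> {
    match name {
        "git" => Some(r#"{"name":"git"}"#.to_string()),
        _ => None,
    }
}

/// Returns the JSON for `name`, decompressing at most once per spec.
pub fn spec_json(name: &'static str) -> Option<&'static str> {
    let mut cache = CACHE
        .get_or_init(|| Mutex::new(HashMap::new()))
        .lock()
        .unwrap();
    if let Some(json) = cache.get(name) {
        return Some(*json); // hash hit, no decompression
    }
    // Leak the decompressed text so it lives for the rest of the process;
    // this is what makes a `&'static str` possible.
    let json: &'static str = Box::leak(decompress(name)?.into_boxed_str());
    cache.insert(name, json);
    Some(json)
}
```

Leaking is a reasonable trade here because each spec is leaked at most once and the data is meant to live as long as the process anyway.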
Specs support multiple generator types:
| Type | Execution | Latency |
|---|---|---|
| Rust-native (`git_branches`, `git_tags`, etc.) | Sync, in-process | Instant |
| Templates (`filepaths`, `folders`) | Sync, filesystem | Instant |
| Script generators (shell commands) | Async, spawned process | Variable (cached with TTL) |
| Script templates (commands with `{current_token}`) | Async, spawned process | Variable |
| `requires_js` generators (Fig `postProcess` / script-fn / custom) | Async via `gc-jsrt` (bounded QuickJS); see docs/JS_RUNTIME.md | Variable (cached separately under `CacheKey::JsProcessed`) |
Script generator output passes through a transform pipeline (split_lines, trim, regex_extract, json_extract, column_extract, etc.) that is validated at spec load time.
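To make the pipeline idea concrete, here is a minimal sketch covering three of the named transforms. The `Transform` enum, `run_pipeline`, and the exact semantics of each step (for example, that `column_extract` splits on whitespace and is zero-based) are assumptions for this example; the real transforms may behave differently.

```rust
/// A subset of the transform steps, with assumed semantics.
#[derive(Debug, Clone)]
pub enum Transform {
    /// Split every item into its lines.
    SplitLines,
    /// Strip leading and trailing whitespace from every item.
    Trim,
    /// Keep the zero-based whitespace-separated column; drop rows that lack it.
    ColumnExtract(usize),
}

/// Runs `steps` in order over the raw script output.
pub fn run_pipeline(raw: &str, steps: &[Transform]) -> Vec<String> {
    let mut items = vec![raw.to_string()];
    for step in steps {
        items = match step {
            Transform::SplitLines => items
                .iter()
                .flat_map(|s| s.lines().map(str::to_string))
                .collect(),
            Transform::Trim => items.iter().map(|s| s.trim().to_string()).collect(),
            Transform::ColumnExtract(idx) => items
                .iter()
                .filter_map(|s| s.split_whitespace().nth(*idx).map(str::to_string))
                .collect(),
        };
    }
    items
}
```

Because the steps are plain data, a spec's pipeline can be checked once at load time (unknown step names, bad regexes) instead of failing on a keystroke.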
Generator results are cached in-memory with configurable TTL per-generator. cache_by_directory keys cache entries by CWD for commands whose output is directory-dependent. JS post-processed output uses a separate keyspace (CacheKey::JsProcessed { source_hash }) so two js_runtime.source bodies sharing the same script don't cross-contaminate.
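A TTL cache keyed by generator and, optionally, by working directory might look like the sketch below. `GeneratorKey` and `GeneratorCache` are hypothetical names (the real key type is the `CacheKey` enum mentioned above), and time is passed in explicitly to keep the example deterministic.

```rust
use std::collections::HashMap;
use std::path::PathBuf;
use std::time::{Duration, Instant};

/// Hypothetical cache key for this sketch.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct GeneratorKey {
    pub generator: String,
    /// `Some(cwd)` when the generator sets `cache_by_directory`.
    pub cwd: Option<PathBuf>,
}

struct Entry {
    values: Vec<String>,
    expires_at: Instant,
}

#[derive(Default)]
pub struct GeneratorCache {
    entries: HashMap<GeneratorKey, Entry>,
}

impl GeneratorCache {
    /// Stores `values` under `key`, valid until `now + ttl`.
    pub fn insert(&mut self, key: GeneratorKey, values: Vec<String>, ttl: Duration, now: Instant) {
        self.entries.insert(key, Entry { values, expires_at: now + ttl });
    }

    /// Returns the cached values, dropping the entry first if it has expired.
    pub fn get(&mut self, key: &GeneratorKey, now: Instant) -> Option<&[String]> {
        let expired = self.entries.get(key).map_or(false, |e| now >= e.expires_at);
        if expired {
            self.entries.remove(key);
        }
        self.entries.get(key).map(|e| e.values.as_slice())
    }
}
```

Including the directory in the key is what stops results computed in one repository from showing up after a `cd` into another.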
Earlier revisions parsed every embedded spec into Arc<CompletionSpec> at startup. The AWS spec alone (~36 MB minified, ~17 K subcommands, ~116 K descriptions) ballooned the daemon's physical footprint to ~333 MB on first load — most of which the user never touched. The lazy-loading layer decouples registration from parsing:

    SpecStore::load_with_embedded(&[])  ──►  register every (filename, alias, &'static str)
                                             as SpecEntry { source: Embedded(json),
                                                            parsed: lazy slot }
                                             and add it to the alias index
                                        ─►   ~183 µs, ~5 MB heap
    store.get("git")                    ──►  first touch: try each registered candidate in
                                             precedence order; serde_json::from_str(json) into
                                             Arc<CompletionSpec>, store in the parsed slot
                                        ─►   subsequent get("git") hits the cached fast path

Each SpecEntry holds:
| Field | Purpose |
|---|---|
| `source: SpecSource` | `Filesystem(PathBuf)` for user specs, `Embedded(&'static str)` for the binary corpus. The lazy parse path reads from this without re-touching disk for embedded specs. |
| `parsed: RwLock<ParsedSlot>` | First-touch parse result. `Failed(String)` makes parse failures sticky: a malformed spec doesn't get re-parsed on every lookup. Surfaces via `SpecEntry::load_error()`. |
| `aliases: Vec<String>` | Every command name that resolves to this entry: filename stem first, then a non-conflicting `CompletionSpec.name` alias when it differs from the stem. |
Registration-time alias metadata is collected before a SpecEntry is stored. For the embedded corpus the build script emits an EMBEDDED_SPEC_ALIASES table; for filesystem specs the loader runs a shallow serde_json::from_str::<SpecHeader> (just name) at load time. That transient name_alias is folded into SpecEntry.aliases. Duplicate filenames remain registered as lower-precedence fallback candidates, so a malformed higher-precedence filesystem spec can fall through to the next filesystem copy and then to the embedded corpus.
SpecStore::iter() force-loads every registered candidate and yields one tuple per resolved runtime spec — used by ghost-complete status to avoid double-counting hidden fallbacks while still surfacing load errors via SpecStore::force_load_errors(). validate-specs uses its separate validator and parses configured spec directories directly. SpecStore::get() only loads the requested candidate chain. The embedded corpus is registered with the lowest precedence, so a valid filesystem spec with the same name (e.g. user-edited ~/.config/ghost-complete/specs/git.json) wins.
The runtime no longer materialises the embedded corpus to ~/.cache/ghost-complete/embedded-specs/. ghost-complete install and uninstall purge that legacy path if an earlier binary left it behind.
The lazy-loading layer above parses each spec on first access and keeps it resident for the rest of the daemon lifetime. The opt-in eviction layer adds a sweep that releases the heap of specs that have been idle past a configurable TTL.
The SpecEntry::parsed field is now RwLock<ParsedSlot> with four states:
| State | Meaning |
|---|---|
| `Empty` | Registered, never accessed. Zero parsed heap. |
| `Loaded(Arc<CompletionSpec>)` | Parsed and resident. Eligible for eviction. |
| `Evicted` | Was `Loaded`, then a sweep released the cached `Arc`. Re-parses on next access. |
| `Failed(String)` | Parse failed. Sticky: never evicted, never retried. |
Each entry also carries an AtomicU64 last_accessed_nanos. Every
successful get() bumps the timestamp; the sweep task takes write locks on
entries whose timestamp is older than the TTL and replaces Loaded(_) with
Evicted. Dropping the cache's Arc releases the heap; any caller still
holding a clone keeps its copy alive.
SpecStore::get returns Option<Arc<CompletionSpec>> (changed from
Option<&CompletionSpec>). Callers own an Arc for as long as they need
it, so the slot can mutate behind their back without invalidating the
resolved spec.
The sweep task is opt-in via [suggest.spec_cache] idle_ttl_secs > 0. With
the default (0), no sweep is spawned and the engine keeps the lazy-loading
layer's "parse once, hold forever" behavior. See docs/CONFIGURATION.md for
the full config schema and a suggested starting recipe.
Re-parse on a cold-cache hit blocks the popup for that one keystroke
(~150 ms in the AWS worst case, <5 ms for most specs). Shell stdin
forwarding is unaffected; it runs on a separate spawn_blocking thread.
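The four-state slot and the sweep can be sketched as follows. This is a simplified model under stated assumptions: `CompletionSpec` is a stub, `parse` is a hypothetical stand-in for the serde_json path, timestamps are passed in as plain nanosecond counts, and `get` always takes the write lock (a real implementation would likely try a read-lock fast path first).

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::{Arc, RwLock};

/// Stub for the real parsed spec type.
#[derive(Debug)]
pub struct CompletionSpec {
    pub name: String,
}

pub enum ParsedSlot {
    Empty,
    Loaded(Arc<CompletionSpec>),
    Evicted,
    Failed(String),
}

pub struct SpecEntry {
    source: &'static str,
    parsed: RwLock<ParsedSlot>,
    last_accessed_nanos: AtomicU64,
}

impl SpecEntry {
    pub fn new(source: &'static str) -> Self {
        Self {
            source,
            parsed: RwLock::new(ParsedSlot::Empty),
            last_accessed_nanos: AtomicU64::new(0),
        }
    }

    /// Hypothetical parser: treats an empty source as malformed.
    fn parse(&self) -> Result<CompletionSpec, String> {
        if self.source.is_empty() {
            Err("empty spec".to_string())
        } else {
            Ok(CompletionSpec { name: self.source.to_string() })
        }
    }

    /// Returns an owned `Arc`, parsing on first touch or after eviction.
    pub fn get(&self, now_nanos: u64) -> Option<Arc<CompletionSpec>> {
        let mut slot = self.parsed.write().unwrap();
        if let ParsedSlot::Failed(_) = *slot {
            return None; // sticky failure: never retried
        }
        if let ParsedSlot::Loaded(spec) = &*slot {
            self.last_accessed_nanos.store(now_nanos, Ordering::Relaxed);
            return Some(Arc::clone(spec));
        }
        // Empty or Evicted: parse now and cache the result.
        match self.parse() {
            Ok(spec) => {
                let spec = Arc::new(spec);
                *slot = ParsedSlot::Loaded(Arc::clone(&spec));
                self.last_accessed_nanos.store(now_nanos, Ordering::Relaxed);
                Some(spec)
            }
            Err(msg) => {
                *slot = ParsedSlot::Failed(msg);
                None
            }
        }
    }

    /// Sweep step: release the cached `Arc` if idle for longer than the TTL.
    pub fn evict_if_idle(&self, now_nanos: u64, ttl_nanos: u64) -> bool {
        let idle = now_nanos.saturating_sub(self.last_accessed_nanos.load(Ordering::Relaxed));
        if idle <= ttl_nanos {
            return false;
        }
        let mut slot = self.parsed.write().unwrap();
        if matches!(*slot, ParsedSlot::Loaded(_)) {
            *slot = ParsedSlot::Evicted;
            true
        } else {
            false
        }
    }
}
```

Returning `Arc<CompletionSpec>` rather than a borrowed reference is what makes eviction safe: a caller that resolved the spec before the sweep keeps a valid copy, and the memory is freed only when the last clone is dropped.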
The popup is rendered entirely via ANSI escape sequences — no alternate screen buffer, no TUI framework. The rendering flow:
- Calculate viewport deficit (does the popup fit below the cursor?)
- If not, scroll the viewport by emitting newlines at the bottom
- Save cursor (DECSC)
- For each visible suggestion: position cursor (CUP), apply styling (SGR), write text
- Restore cursor (DECRC)
All of this is wrapped in DECSET 2026 begin/end markers (or pre-rendered into a single buffer for terminals without synchronized output).
Scrollback protection: The popup area is cleared by overwriting with spaces, never by using ED (Erase Display) or EL (Erase Line) — those would push popup text into scrollback history.
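The escape sequences involved are standard, so the flow can be illustrated directly. The sketch below is an assumption-laden simplification: function names are hypothetical, the selected row uses reverse video as a placeholder style, width is counted in characters rather than display cells, and the scroll-to-make-room step is reduced to computing the deficit.

```rust
/// Rows the viewport must scroll so a popup of `popup_rows` fits below a
/// cursor on 1-based `cursor_row` in a screen of `screen_rows`.
pub fn viewport_deficit(cursor_row: u16, popup_rows: u16, screen_rows: u16) -> u16 {
    (cursor_row + popup_rows).saturating_sub(screen_rows)
}

/// Builds the bytes for a popup whose first line is at (`top_row`, `left_col`).
pub fn render_popup(
    top_row: u16,
    left_col: u16,
    items: &[&str],
    selected: usize,
    width: usize,
) -> Vec<u8> {
    let mut out = String::from("\x1b7"); // DECSC: save cursor
    for (i, item) in items.iter().enumerate() {
        let row = top_row + i as u16;
        out.push_str(&format!("\x1b[{};{}H", row, left_col)); // CUP
        if i == selected {
            out.push_str("\x1b[7m"); // SGR: reverse video (placeholder style)
        }
        // Pad or truncate the label to exactly `width` characters.
        out.push_str(&format!("{:<width$.width$}", item, width = width));
        out.push_str("\x1b[0m"); // SGR: reset
    }
    out.push_str("\x1b8"); // DECRC: restore cursor
    out.into_bytes()
}

/// Clears the popup area by overwriting it with spaces (no ED or EL).
pub fn clear_popup(top_row: u16, left_col: u16, rows: u16, width: usize) -> Vec<u8> {
    let blank = " ".repeat(width);
    let mut out = String::from("\x1b7");
    for i in 0..rows {
        out.push_str(&format!("\x1b[{};{}H{}", top_row + i, left_col, blank));
    }
    out.push_str("\x1b8");
    out.into_bytes()
}
```

Both functions return a byte buffer rather than writing directly, so the result can be wrapped in synchronized-output markers or sent in one write depending on the terminal.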
| Metric | Target | Achieved |
|---|---|---|
| Keystroke to suggestion | <50ms | <20ms typical |
| PTY forwarding overhead | <1ms | <1ms |
| Fuzzy match (10k candidates) | <1ms | <1ms (nucleo) |
| Memory (idle) | <10MB | ~8MB |
| Startup | <100ms | <50ms |
Benchmarks use Criterion and live in gc-suggest and gc-parser. Run with cargo bench.
Shell integration scripts in shell/ emit semantic prompt markers:
- OSC 133 — standard semantic prompt protocol (supported by Ghostty, Kitty, WezTerm, Rio)
- OSC 7771 — Ghost Complete's own marker (used as fallback on Alacritty, iTerm2, Terminal.app)
- OSC 7773 — Ghost Complete's exported environment snapshot, consumed by the proxy and stripped before terminal output
Prompt markers are emitted simultaneously by the integration scripts, so the parser can use whichever the terminal supports.
Without shell integration, features are limited — prompt boundary detection falls back to heuristics, and manual trigger (Ctrl+/) is the only way to invoke completions.
The shell integration reports the live edit buffer to the proxy after every ZLE redraw via OSC 7772:
\e]7772;<cursor>;<percent-encoded-utf8-buffer>\a
- `<cursor>` is a decimal codepoint count (zsh `$CURSOR`).
- `<percent-encoded-utf8-buffer>` uses a deliberately small allow-list: bytes in `[A-Za-z0-9._~/-]` and the literal space pass through; every other byte (including `;`, `\a` (BEL), `\x1b` (ESC), `\\`, `%`, all `<0x20` controls, `0x7F`, and `0x80`–`0xFF`) is encoded as `%XX`. UTF-8 multibyte sequences are encoded byte-by-byte.
The narrow alphabet is non-negotiable: any unencoded ; would split the
OSC parameter list and silently truncate the buffer at the parser; any
unencoded BEL would terminate the envelope mid-payload; an unencoded ESC
could smuggle a nested escape sequence into the parser's state machine.
See ADR 0003.
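The encoding rules above are small enough to sketch in full. The function names are hypothetical, and emitting uppercase hex digits is an assumption (the decoder below accepts either case); the allow-list itself follows the description.

```rust
/// Bytes that pass through unencoded: `[A-Za-z0-9._~/-]` and space.
fn is_allowed(b: u8) -> bool {
    b.is_ascii_alphanumeric() || matches!(b, b'.' | b'_' | b'~' | b'/' | b'-' | b' ')
}

/// Percent-encodes every byte outside the allow-list, byte by byte.
pub fn percent_encode(buffer: &str) -> String {
    let mut out = String::with_capacity(buffer.len());
    for &b in buffer.as_bytes() {
        if is_allowed(b) {
            out.push(b as char);
        } else {
            out.push_str(&format!("%{:02X}", b));
        }
    }
    out
}

/// Reverses `percent_encode`. Returns `None` on a truncated or non-hex
/// escape, or if the decoded bytes are not valid UTF-8.
pub fn percent_decode(encoded: &str) -> Option<String> {
    let bytes = encoded.as_bytes();
    let mut out = Vec::with_capacity(bytes.len());
    let mut i = 0;
    while i < bytes.len() {
        if bytes[i] == b'%' {
            let hi = (*bytes.get(i + 1)? as char).to_digit(16)?;
            let lo = (*bytes.get(i + 2)? as char).to_digit(16)?;
            out.push((hi * 16 + lo) as u8);
            i += 3;
        } else {
            out.push(bytes[i]);
            i += 1;
        }
    }
    String::from_utf8(out).ok()
}

/// Builds a complete OSC 7772 frame terminated by BEL.
pub fn frame_7772(cursor: usize, buffer: &str) -> String {
    format!("\x1b]7772;{};{}\x07", cursor, percent_encode(buffer))
}
```

After encoding, the only `;` characters in the frame are the two field separators and the only BEL is the terminator, which is exactly the property the narrow alphabet is meant to guarantee.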
OSC 7770 (the prior raw framing) is accepted by the parser as a deprecated
read-only path for one release: the first hit per process logs a one-shot
tracing::warn! and subsequent hits drop to trace!. The 7770 dispatch
arm is scheduled for #[ignore] in v0.11.0 and removal in v0.12.0. New
shell integrations only emit 7772.
The zsh integration reports exported scalar parameters at each prompt via
OSC 7773. The payload is a single percent-encoded field containing
NUL-separated KEY=value entries:
\e]7773;<percent-encoded-env-snapshot>\a
The PTY proxy consumes the frame, stores the snapshot on parser state, and
filters the frame out of the byte stream before writing shell output to the
terminal. Providers, JS host contexts, and script generators use this live
snapshot instead of the proxy process's startup environment, so `export AWS_PROFILE=...` or other session-level env changes affect completions on
the next prompt.
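Decoding an OSC 7773 payload amounts to percent-decoding the single field and splitting on NUL. The sketch below assumes the separator arrives encoded as `%00`, that entries split at the first `=`, and that one malformed entry rejects the whole snapshot; `parse_env_snapshot` and its helper are hypothetical names.

```rust
use std::collections::HashMap;

/// Percent-decodes to raw bytes; `None` on a truncated or non-hex escape.
fn percent_decode_bytes(encoded: &str) -> Option<Vec<u8>> {
    let bytes = encoded.as_bytes();
    let mut out = Vec::with_capacity(bytes.len());
    let mut i = 0;
    while i < bytes.len() {
        if bytes[i] == b'%' {
            let hi = (*bytes.get(i + 1)? as char).to_digit(16)?;
            let lo = (*bytes.get(i + 2)? as char).to_digit(16)?;
            out.push((hi * 16 + lo) as u8);
            i += 3;
        } else {
            out.push(bytes[i]);
            i += 1;
        }
    }
    Some(out)
}

/// Parses the payload of an OSC 7773 frame into a KEY -> value map.
pub fn parse_env_snapshot(payload: &str) -> Option<HashMap<String, String>> {
    let raw = percent_decode_bytes(payload)?;
    let mut env = HashMap::new();
    // Entries are NUL-separated; skip empty chunks such as a trailing NUL.
    for entry in raw.split(|&b| b == 0).filter(|e| !e.is_empty()) {
        let entry = std::str::from_utf8(entry).ok()?;
        // Split at the first '=', so values may themselves contain '='.
        let (key, value) = entry.split_once('=')?;
        env.insert(key.to_string(), value.to_string());
    }
    Some(env)
}
```

Using NUL as the separator avoids ambiguity with newlines or spaces inside values, since NUL cannot appear in an environment variable's value.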