Coding Agent Expansion (Added Tools, New Slash Commands, Rewritten Prompts, Improved Chat UX, and more) by russell-rozenbaum · Pull Request #2210 · hazelgrove/hazel

russell-rozenbaum · 2026-04-09T22:06:13Z

Summary

Major coding-agent expansion. Scope: agent surface (src/web/view/agent{Core,View}/, agent CSS/tests, OpenRouter.re, HighLevelNodeMap.re, prompt factory).

What's new

Structure-editor tools — path-addressed syntax-projector tools via HighLevelNodeMap, statics tools, generalized insert_before/insert_after, per-tool + per-category toggles, ProjectorCatalog.
Chat loop & control — send queue, stop, multi-tool abort, auto + manual /compact, context meter, sticky scroll, centralized empty/API retry path.
Session modes & workbench — Converse / Edit / Plan top-bar toggle; tasks + subtasks with ordering and status tracking.
Slash commands — typed payloads + SlashCommandOutput view; /help, /session-usage, /key, /key-usage, /account-usage, /show-thinking, /compact.
Model & API — thinking view, reasoning-effort dropup, model picker (FP Lab recs, fuzzy search, only-free), OpenRouter account/key endpoints, safer error paths.
Display — plain-text tool-call rows + signifiers + jump-to-node, ToolCallSummary, agent-message Markdown via Omd → vdom, chat export/copy.
Prompts & docs — structure-editor-first identity, composition/compaction overhaul, ^^ projector syntax, AGENTS.md + handoff doc.
Refactors — AgentCore/AgentView → agentCore/agentView; top-bar above textbox; dropped empty-reply workbench nudge (plan-mode loop fix).
Tests — new Test_AgentControlFlow, Test_AgentMultiTool; expanded Test_AgentUX, Test_AgentTools (incl. HighLevelNodeMap).
Streaming — tokens are now streamed in via OpenRouter

Files outside agent scope

11 narrow integration touches (icons, Page route, sidebar hook, JsUtil helpers, prompt-string trim, projector/probe perform plumbing, test-runner reg, run_tests/.gitignore). Rest of git diff origin/dev..branch is artifact from dev moving 421 commits since divergence at ab5ba95da25 — no branch-side deletions.

Tour — new bottom-bar controls

Toggle agent modes: Edit / Plan / Converse
Select reasoning effort
Change model inline (current model shown)

New slash commands

Agent can place probes and understand their intermediate results

Along with this, it can also place projectors.

Agent can batch multiple tools per turn

More informative collapsed tool descriptions

Cmd/Ctrl+Click on a tool-call row also jumps to the definition that was edited

Thinking blobs now displayed as their own UI chunks

Refactored model selection UI

We can recommend some models (this requires maintaining code though :/ might need to remove).

Also added the ability to grep for models in the master view.

Stopped displaying free models first, instead added a filter toggle.

User can stop chats early (and also queue up messages)

Batched Tool Calls

Batching comes with error handling: if tool call n fails, all tool calls scheduled after n are not executed, which avoids cascading failures, rogue bugs, and unintended control flow.
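The skip-on-failure rule can be sketched as follows. This is a minimal Python illustration, not the Reason code from the PR; `ToolOutcome` and `run_batch` are hypothetical names, and a raised exception stands in for a tool failure.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class ToolOutcome:
    name: str
    status: str  # "ok", "failed", or "skipped"
    detail: str = ""


def run_batch(calls: List[Tuple[str, Callable[[], str]]]) -> List[ToolOutcome]:
    """Run tool calls in order; once one fails, mark the rest as skipped."""
    outcomes: List[ToolOutcome] = []
    failed = False
    for name, thunk in calls:
        if failed:
            # Tools scheduled after the first failure are never executed.
            outcomes.append(ToolOutcome(name, "skipped", "not executed"))
            continue
        try:
            outcomes.append(ToolOutcome(name, "ok", thunk()))
        except Exception as exc:  # assumption: a failure surfaces as an exception
            failed = True
            outcomes.append(ToolOutcome(name, "failed", str(exc)))
    return outcomes
```

Recording skipped calls explicitly, rather than dropping them, is what lets a transcript show which calls never ran.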

… docs
- Scroll chat to bottom on updates only when already pinned to bottom
- System prompt and developer notes use full-screen-doc (left-aligned, readable)
- Hide edits summary when there are no edit tools in a turn
- Wire Agent Tools to tools-view-* CSS (flex row, hover, toggle states)
Made-with: Cursor

Add place_statics, remove_statics, and toggle_statics with ProbePerform helpers, StaticsAction wiring, and Agent handling. Register tools in CompositionUtils and extend tests, including projector snippets. Introduce ProjectorCatalog and refresh CompositionPrompt, CompactionPrompt, HazelDocumentation, HazelSyntaxNotes, and ProbeTools copy for projectors and statics. When marking the active workbench task complete, fold remaining subtasks to completed instead of failing; document in WorkbenchTools. Fix chat messages autoscroll while pinned to the bottom by deferring scroll-to-bottom after layout and enriching the scroll stamp (tool results and slack). Set min-height on full-screen view content for flex scrolling. Made-with: Cursor

Add place/remove/toggle_syntax_projector tools (OpenAPI + CompositionUtils + CompositionActions) and ProjectorPerform helpers for path-resolved placement. Agent.re: handle SyntaxProjectorAction with expand-on-success; fail when zero paths apply; extend mk_diff for projector/probe/statics actions; retry and workbench nudges can use synthetic user-role API messages; document message channels in file header. HighLevelNodeMap: improve closest-path suggestions for nested bindings; note outer/inner paths in path_to_id errors. Prompts: message_channels, partnering_and_user_intent, CONTEXT UPDATE echo ban, EditTools path semantics, projector catalog; compaction prompt aligned. Tests: HighLevelNodeMap path cases, binding_clause scenarios, syntax projector parse coverage. UI: AgentMessageMarkdown + styles for assistant bubbles. Track .cursor/docs/coding-agent-projector-tools-extension.md via .gitignore exception for agent onboarding. Made-with: Cursor

- Copy exact LLM context snapshot from Agent Context panel (clipboard shim + toast); shared Message/Agent helpers and tests in Test_AgentUX.
- Multi-tool turns: stop after first real failure; skipped tools with AgentToolResult.skipped, grey circle-minus icon, chat export [not executed]; Test_AgentMultiTool.
- Context meter: 80% rounded limit capped at 100k tokens (AgentGlobals); compaction uses same limit.
- JsUtil: copy_via_shim, show_copy_toast; Icons.circle_with_minus; CSS for header actions and skipped tool state.
- run_tests: node stack + idb_stub alignment; haz3ltest registers multi-tool suite.
- Add AGENTS.md; .gitignore .cursor/ with docs exception; prompt/catalog/HighLevelNodeMap/SyntaxProjectorTools/WorkbenchTools tweaks.
Made-with: Cursor

…agent
- After update_definition, sanitize syntax projectors in the definition segment via ProjectorPerform.sanitize_projectors_in_segment so wrappers match init.
- GeneralTreeUtils and CompositionGo: get_refs_to after pattern edits; clearer ambiguous-path behavior and EditTools prompt notes.
- StringUtil: trim_leading and trim_trailing_whitespace strip horizontal whitespace (space, tab, CR); align with paste (ClipboardCache) and tests.
- CompositionPrompt documents multiple tool calls per turn and sequential skip-on-failure; CompactionPrompt is expanded with Overview, Goals, Rules, and related sections.
- Agent: surface Action.Failure.show when a structural tool cannot be applied; minor clarity in tool-chain failure detection.
- Tests: AgentTools, AgentUX, and StringUtil updates.
Made-with: Cursor

- Chat.Utils: estimate_openrouter_prompt_tokens and context_meter_prompt_tokens so the bar updates when history is compacted (estimate) until the next agent reply supplies provider usage again.
- ChatBottomBar uses context_meter_prompt_tokens for the token display.
- CompositionPrompt and update_definition tool text: ^^kind(expr) for livelits (slider, sliderf, check, text, csv, card) when overwriting definitions.
Made-with: Cursor

Enforce at most one in-flight main or compaction LLM request; add Stop which clears busy state and ignores the matching reply via flight sequence numbers (including API error and retry paths). Chat bottom bar shows Stop while awaiting or compacting and blocks send until idle. Prompts: prefer incremental insert/update tools over monolithic initialize; treat workbench tasks as optional for large multi-turn work only. Compaction instructions prioritize transcript and tool results over a misleading empty agent-view snapshot. Tests: HandleLLMResponse passes main_llm_seq; add stop_square icon. Made-with: Cursor
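One way to picture the flight-sequence mechanism is the small state machine below. It is a hedged Python sketch, not the PR's Reason code: `ChatState`, `send`, `stop`, and `handle_reply` are hypothetical names, and the real implementation also distinguishes main and compaction requests.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class ChatState:
    next_seq: int = 0
    in_flight: Optional[int] = None          # at most one request at a time
    ignored: Set[int] = field(default_factory=set)
    log: List[str] = field(default_factory=list)


def send(state: ChatState) -> Optional[int]:
    """Start a request and return its sequence number, or None if busy."""
    if state.in_flight is not None:
        return None
    seq = state.next_seq
    state.next_seq += 1
    state.in_flight = seq
    return seq


def stop(state: ChatState) -> None:
    """Clear busy state and remember which reply must be ignored later."""
    if state.in_flight is not None:
        state.ignored.add(state.in_flight)
        state.in_flight = None


def handle_reply(state: ChatState, seq: int, text: str) -> bool:
    """Accept a reply only if it belongs to the current, non-stopped flight."""
    if seq in state.ignored:
        state.ignored.discard(seq)
        return False
    if seq != state.in_flight:
        return False  # stale reply from an earlier flight
    state.in_flight = None
    state.log.append(text)
    return True
```

Tagging each request with a sequence number means a late reply to a stopped request can be discarded without cancelling the network call itself.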

- Queue user sends while busy; flush when idle; Stop only via button; append cancel line before flushing so ordering stays correct.
- Chunked UI: ResponseCancelled outside Filbert; compaction body renders as Markdown; CSS for queue panel and markdown compaction.
- Compaction: dialogue slice includes prior summary for chained compacts; prompt stresses merging that block with new turns; structured Markdown output contract.
- Context meter shows only API prompt_tokens; use em dash until next reply after compaction (no client estimate).
- HighLevelNodeMap: duplicate path strings resolve to earliest sibling; update CompositionPrompt and EditTools path guidance.
- Tests: duplicate path order; compaction dialogue slice includes summary.
Made-with: Cursor

…ENTS manual QA
- Test_AgentControlFlow: Stop, stale HandleLLMResponse/Compaction, ApiError ignore, queue flush ordering, Send while busy
- Test_AgentUX: context_meter_prompt_tokens, messages_for_openrouter, MarkActiveTaskComplete subtask auto-complete
- Test_AgentTools: get_refs_to_after_pattern_edit vs get_refs_to when pre/post agree
- haz3ltest: register AgentControlFlow suite
- AGENTS.md: agent test filters and manual QA checklist for UI/clipboard
Made-with: Cursor

- Add SetToolsInCategoryEnabled and Agent Tools header toggles per suite (View/Edit/Workbench/Other): all-on indicator, click enables or disables every tool in that category for the chat.
- Sidebar: assistant tab tooltip "Open Hazel Coding Agent".
- ChatBottomBar: shorter queue placeholders while compacting or awaiting.
Made-with: Cursor

- Document that csv attaches only to empty [] and card expects playing-card tuples; update projector catalog, composition prompt, edit tools, and syntax projector tool descriptions.
- Context meter: second line for one-decimal percent, show 100k for 100000 limit, br() layout; drop nowrap on label for line break.
Made-with: Cursor

codecov · 2026-04-09T22:10:11Z

Codecov Report

❌ Patch coverage is 33.86693% with 1322 lines in your changes missing coverage. Please review.
✅ Project coverage is 50.96%. Comparing base (4061cfd) to head (a6e1e94).

Files with missing lines	Patch %	Lines
src/web/view/agentCore/AgentUpdate.re	25.82%	428 Missing ⚠️
src/web/view/agentCore/Message.re	31.69%	125 Missing ⚠️
src/util/API.re	0.00%	82 Missing ⚠️
src/util/OpenRouter.re	55.00%	81 Missing ⚠️
src/web/view/agentView/ToolCallSummary.re	48.52%	70 Missing ⚠️
src/web/view/agentCore/Chat.re	50.71%	69 Missing ⚠️
src/haz3lcore/projectors/ProjectorPerform.re	11.84%	67 Missing ⚠️
src/web/view/agentCore/AgentToolCallHandler.re	34.34%	65 Missing ⚠️
...mpositionCore/AgentWorkbenchCore/AgentWorkbench.re	8.19%	56 Missing ⚠️
src/web/view/agentCore/ChatSystem.re	32.05%	53 Missing ⚠️
... and 14 more

Additional details and impacted files

@@            Coverage Diff             @@
##              dev    #2210      +/-   ##
==========================================
+ Coverage   50.26%   50.96%   +0.70%     
==========================================
  Files         267      278      +11     
  Lines       31709    32800    +1091     
==========================================
+ Hits        15938    16716     +778     
- Misses      15771    16084     +313

Files with missing lines	Coverage Δ
...CompositionCore/prompt_factory/CompactionPrompt.re	`100.00% <ø> (ø)`
src/util/StringUtil.re	`73.55% <100.00%> (+0.22%)`	⬆️
src/web/app/common/Icons.re	`100.00% <100.00%> (ø)`
src/web/view/agentCore/ChatSlashCommands.re	`100.00% <100.00%> (ø)`
src/haz3lcore/CompositionCore/AgentToolResult.re	`6.66% <50.00%> (+6.66%)`	⬆️
src/web/view/agentCore/AgentResult.re	`0.00% <0.00%> (ø)`
...CompositionCore/prompt_factory/ProjectorCatalog.re	`75.00% <75.00%> (ø)`
src/haz3lcore/CompositionCore/CompositionGo.re	`74.13% <61.53%> (+8.07%)`	⬆️
src/haz3lcore/CompositionCore/GeneralTreeUtils.re	`57.89% <54.54%> (-0.44%)`	⬇️
src/haz3lcore/ProbePerform.re	`22.29% <60.00%> (+8.82%)`	⬆️
... and 18 more

... and 31 files with indirect coverage changes


- HighLevelNodeMap.next_sibling_of / prev_sibling_of: `mod` binds tighter than `+/-`, so `idx + 1 mod len` was parsed as `idx + (1 mod len)`, which never wrapped and raised on empty siblings. Wrap explicitly and guard len=0.
- Agent.Update: API-error content had `"\\Error: "` (literal backslash), producing "Code: 429\Error: ..." in user-visible messages. Extract format_api_error_content and use `"\nError: "`.
- Tests: 4 new sibling wrap-around cases in HighLevelNodeMap suite; 1 new API-error format case in Agent UX suite.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
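The precedence pitfall is easy to reproduce outside Reason, since Python's `%` also binds tighter than `+`. The sketch below is an illustration only; the function names are hypothetical and indices stand in for the sibling lookups in HighLevelNodeMap.

```python
from typing import Optional


def buggy_next(idx: int, n: int) -> int:
    # Parsed as idx + (1 % n): never wraps, and divides by zero when n == 0.
    return idx + 1 % n


def next_sibling_index(idx: int, n: int) -> Optional[int]:
    """Index of the next sibling, wrapping to 0 after the last one."""
    if n == 0:
        return None  # guard: no siblings at all
    return (idx + 1) % n


def prev_sibling_index(idx: int, n: int) -> Optional[int]:
    """Index of the previous sibling, wrapping to the last one before 0."""
    if n == 0:
        return None
    # Adding n keeps the dividend non-negative, so this is also correct in
    # languages whose remainder truncates toward zero.
    return (idx - 1 + n) % n
```

With three siblings, `buggy_next(2, 3)` yields 3 (out of range), while `next_sibling_index(2, 3)` yields 0.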

- HighLevelNodeMap.closest_valid_path_to_ill_path: called inside the error-handling branch of path_to_id, so returning "" on an empty node_map is safer than raising (avoids compound failure).
- Agent.Update: hoist max_empty_retries to module level next to max_api_retries; interpolate it into the retry copy so the bound and the user-visible "/N" stay in sync.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Code-review fixes from the branch review:
- M1: HighLevelNodeMap sibling-nav precedence + empty-siblings guard
- M2: API-error content uses \n between Code and Error labels
- N4: closest_valid_path_to_ill_path no longer raises on empty map
- N1: max_empty_retries centralized and interpolated in retry copy
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…oggle Restructure the LLM model list into a curated recommended section and a searchable master list. Recommended entries (Opus 4.6, Sonnet 4.6, Gemini 3 Flash Preview, MiMo V2 Pro, Gemma 4 31B) are ordered most capable → cheapest with per-model taglines. Master list supports subsequence fuzzy search on name+id and an "Only free" toggle, with state in agent_globals (model_filter, only_free_models; yojson/sexp defaults keep existing saves compatible). Recommended and master live in separate scroll containers sized to keep the Confirm Settings button visible at typical sidebar heights. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
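A subsequence fuzzy match combined with an "Only free" filter might look like the following. This is a Python sketch under assumptions: `ModelInfo`, `is_subsequence`, and `filter_models` are hypothetical names, case-insensitive matching is assumed, and the model entries in the usage note are made up.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ModelInfo:
    id: str
    name: str
    is_free: bool


def is_subsequence(query: str, text: str) -> bool:
    """True if the characters of query appear in text in order (gaps allowed)."""
    remaining = iter(text.lower())
    # `ch in remaining` advances the iterator, so order is enforced.
    return all(ch in remaining for ch in query.lower())


def filter_models(models: List[ModelInfo], query: str, only_free: bool) -> List[ModelInfo]:
    """Keep models matching the query on name + id, optionally only free ones."""
    return [
        m
        for m in models
        if (m.is_free or not only_free)
        and is_subsequence(query, m.name + " " + m.id)
    ]
```

For example, with a hypothetical entry named "Alpha Large" and id "vendor/alpha-large", the query "alrg" matches because a, l, r, g occur in that order; an empty query matches everything.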

Empty-program seeding now uses insert_before / insert_after with no path; removes the separate initialize tool. Omitting path on either insert prepends (Before) or appends (After) at the program boundary, collapsing two code paths into one.
- Drop EditTools.initialize JSON and Initialize action variant
- Add InsertAtProgramBoundary(direction, code) for no-path inserts
- Parser treats missing / empty-string path as the boundary case
- Update prompts (Composition, Compaction, RecFib), tool catalogs, and chat/agent UI tool-name lists
- Repurpose initialize tests as no-path insert_before/insert_after tests
- Tool count: 33 -> 32; insert tools drop path from required list
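The boundary-case parsing rule can be sketched in a few lines. This is an illustrative Python stand-in: `InsertAtProgramBoundary` mirrors the variant named above, while `InsertAtPath`, `parse_insert`, and the JSON field names `path` and `code` are assumptions.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class InsertAtPath:
    direction: str  # "Before" or "After"
    path: str
    code: str


@dataclass(frozen=True)
class InsertAtProgramBoundary:
    direction: str  # "Before" prepends, "After" appends
    code: str


def parse_insert(tool_name: str, args: dict) -> Union[InsertAtPath, InsertAtProgramBoundary]:
    """Map an insert tool call to an action; no path means the program boundary."""
    direction = {"insert_before": "Before", "insert_after": "After"}[tool_name]
    code = args["code"]
    path = args.get("path")
    if path is None or path == "":
        # A missing and an empty-string path are treated identically.
        return InsertAtProgramBoundary(direction, code)
    return InsertAtPath(direction, path, code)
```

Folding the empty-program case into the existing insert tools means one parser branch replaces a dedicated tool.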

…ge, /cost, /help
Replaces markdown-rendered slash output with a typed `slash_command_payload` variant pipeline. Each command builds its own record (cost_output, credits_output, usage_output, help_output, KeyOutput, SlashError); the view layer owns formatting via SlashCommandOutputView, which renders custom card UIs per kind with branded border-left colors, stat tiles, kv rows, a credit progress bar, and a help table.
New commands:
- /cost: estimates session $ from per-message token counts
- /account-usage: hits /api/v1/credits (account-wide credit pool)
- /key-usage: hits /api/v1/key (per-key spend, limits, daily/weekly/monthly)
- /key: shows the currently-set OpenRouter API key string
- /help: lists all slash commands
OpenRouter HTTP additions follow the existing get_models idiom; no caching, no key-persistence change. Tests updated to cover the new alphabetical ordering and help_payload contents.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Pure formatter reflow, no semantic changes. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Adds a per-model reasoning-effort selector to the chat bottom bar, gated on whether the active OpenRouter model advertises `reasoning` in its `supported_parameters`. The dropup appears in the action-button row (left side) with options Off / Low / Medium / High and uses pure CSS hover for open/close.
Wiring:
- `llm_info` gains a `supports_reasoning: bool` field, parsed from the `/api/v1/models` payload alongside the existing required-params check; defaults false on legacy stored data.
- `AgentGlobals` gains `reasoning_effort: option(effort_level)` and a `SetReasoningEffort` action.
- `Payload.Utils.mk_default` accepts an optional `~reasoning`. Threaded through the four main-chat send paths (initial send, retries, retry-empty); chat-naming and compaction deliberately leave it `None`.
Also fixes overflow on the `/key` slash card: long API keys now wrap within the card (`overflow-wrap: anywhere` + `word-break: break-all`).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…o-node Replaces the boxed tool-call rows in the chat transcript with plain-text rows that expand inline on click. Collapsed rows now show a per-tool signifier (structural path or free-text summary) so the user can see at a glance what each call did; expanded content is indented rather than boxed. Lead-glyph categories (edit/read/view/projector/probe/statics/workbench) are rendered on the left; success/fail/skip status icon stays on the right. Cmd/Ctrl-click a row dispatches Globals.Update.JumpToTile(id) to the corresponding AST node via HighLevelNodeMap.path_to_id_opt; primary click still toggles expand. Rows whose path no longer resolves dim (.stale) only for tools where persistence is expected — delete_* / remove_* never dim. New ToolCallSummary module centralizes per-tool category + signifier + jump_paths + persists derivation. Summarizer is applied to the three tool-call surfaces: ToolResultView (full redesign), ChatMessagesView .summary-tool-link (signifier added), and WorkbenchView tool rows (redesign via shared ToolResultView). Tests: 14 summarizer unit tests in Test_AgentUX covering category mapping, signifier extraction, path-list joining, free-text truncation, and unknown-tool fallback. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…meter bar Renames the user-facing slash command for symmetry with /key-usage and /account-usage. Internal variant names (CostOutput, RunSlashCommandCost) intentionally left as-is — only the typed-in command label changed. Also moves the "(N.N%)" line in the bottom-bar context meter from the label group (above the bar) to its own div below the bar, matching the visual hierarchy of label → bar → percentage. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Match repo camelCase folder convention (e.g. menhirParser). include_subdirs unqualified means no dune/import edits needed. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

- Render reasoning above response: italic dim body with "Thought for Ns" / "Nm Ms" header. Persisted show_thinking flag (default true) gates display.
- /show-thinking slash command toggles the flag, posts Notice.
- Reply.Model gains reasoning field; parse reasoning_content/reasoning/thinking from OpenRouter responses. Capture request elapsed_ms in HandleLLMResponse, store on Message for header.
- Bottom bar stacks active model's pretty name above a "change model" button routing to main menu; main menu auto-returns to chat on model select.
- Page.re: clipboard-shim copy falls back to window selection inside agent containers via new JsUtil.try_copy_window_selection_in_classes.
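The header formatting and reasoning-field lookup described above could be sketched like this. It is a Python illustration with assumptions labeled: `thinking_header` and `extract_reasoning` are hypothetical names, seconds are assumed to round down, and the lookup order of the three fields is a guess.

```python
from typing import Optional


def thinking_header(elapsed_ms: int) -> str:
    """Format elapsed time as 'Thought for Ns' or 'Thought for Nm Ms'."""
    total_s = max(0, elapsed_ms) // 1000  # assumption: round down to whole seconds
    if total_s < 60:
        return f"Thought for {total_s}s"
    minutes, seconds = divmod(total_s, 60)
    return f"Thought for {minutes}m {seconds}s"


def extract_reasoning(message: dict) -> Optional[str]:
    """Return the first non-empty reasoning-like field, if any."""
    # Assumption: fields are tried in the order the commit message lists them.
    for key in ("reasoning_content", "reasoning", "thinking"):
        value = message.get(key)
        if isinstance(value, str) and value:
            return value
    return None
```

Checking several field names in one place keeps the parser tolerant of providers that label their reasoning output differently.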

Persisted mode in AgentGlobals gates which tools the agent may call:
- converse: only view tools (no edits, workbench, or overlays)
- plan: blocks edits, keeps workbench so the agent can build todos
- edit: full toolset (still subject to per-tool toggles)
Mode is injected into the per-turn context snapshot and explained statically in CompositionPrompt; identity copy now stresses Filbert is a structure-editor agent, not a text-editor agent. Bottom-bar gets a top row: info-icon + colored mode toggle (click to cycle) on the left, current model name on the right. Change-model button stays in the bottom row. Placeholder gains a "type / for commands" hint.
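The mode gate reduces to a small predicate over tool categories. The Python sketch below is illustrative only: the category labels ("view", "edit", "workbench", "overlay") and the `disabled` set for per-tool toggles are assumptions about how the pieces fit together.

```python
from typing import FrozenSet


def tool_allowed_in_mode(mode: str, category: str) -> bool:
    """Mode gate: converse allows only view tools, plan blocks edits, edit allows all."""
    if mode == "edit":
        return True
    if mode == "plan":
        return category != "edit"
    if mode == "converse":
        return category == "view"
    raise ValueError(f"unknown mode: {mode}")


def tool_callable(mode: str, category: str, tool: str,
                  disabled: FrozenSet[str] = frozenset()) -> bool:
    """A tool must pass the mode gate and not be switched off individually."""
    return tool_allowed_in_mode(mode, category) and tool not in disabled
```

Keeping the mode gate separate from the per-tool toggle means edit mode can stay "full toolset" while still respecting a user's disabled tools.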

Rewrote the identity block so "structure-editor agent operating on typed syntactic structure via a small calculus of typed tool calls" is the headline, not a parenthetical buried mid-paragraph. Added an explicit guideline that introspective questions ("what are you?", "what can you do?") must open with that framing instead of a generic feature list — Filbert was answering them by listing capabilities ("writes code", "debugs", "explains") without mentioning the structure-editor angle at all.

UI: move mode toggle + model name out of the bordered input container into a top-bar sibling above it. Context meter stays in the action-row (extracted to a let-binding for reuse). Trim placeholder back to "Type your message..." — the slash-cmd tip added bulk. Behavior: drop the workbench-nudge that fires when the assistant finishes a turn with no text and an active subtask still open. The "MANDATORY: write a sentence to the user" follow-up was creating loops in plan mode. Idle path goes straight to the compact check.

…o scroll stamp The stick-to-bottom hook is keyed on chat_messages_scroll_stamp, which previously only hashed finalized-log contents. During streaming the log is stable and the stamp didn't move, so the hook's update never fired and the view drifted away from the growing in-progress bubble. Mixing String.length of pending_assistant_{content,reasoning} into the stamp makes every delta re-trigger the hook. The existing stick_to_bottom / is_near_bottom policy already disengages when the user scrolls up and re-engages when they scroll back. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
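Why the old stamp stalled during streaming can be seen in a toy version. This Python sketch is an assumption-laden illustration: `log_only_stamp` and `scroll_stamp` are hypothetical names, and a tuple stands in for whatever hashed value the real hook is keyed on.

```python
from typing import List, Tuple


def log_only_stamp(finalized_log: List[str]) -> int:
    """Old behavior: the stamp depends only on finalized messages."""
    return hash(tuple(finalized_log))


def scroll_stamp(finalized_log: List[str],
                 pending_content: str,
                 pending_reasoning: str) -> Tuple[int, int, int]:
    """New behavior: mix in the lengths of the in-progress bubble's text."""
    return (hash(tuple(finalized_log)), len(pending_content), len(pending_reasoning))
```

While tokens stream in, the finalized log is unchanged, so `log_only_stamp` stays constant; `scroll_stamp` changes on every delta because the pending lengths grow.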

Adds four tools so the agent can maintain its own plan, not just append:
- update_active_task / update_active_subtask: edit title and/or description; title changes rename the StringMap key and update the active_task / display_task / subtask_ordering pointers that reference it.
- delete_task / delete_subtask: hard delete by title; clears pointers (active_task, display_task, active_subtask) that pointed at the removed entry.
Closes the gap where the agent could only add/reorder and had no way to correct a bad title or drop a planned step.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

The 4 new CRUD tools were falling through to the default None branch in ToolCallSummary.of_tool_call and rendering under the OTHER category. Route update_active_task/subtask and delete_task/subtask through the workbench(...) helper with sensible signifiers (new_title or new_description for updates; title for deletes). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Adds `AgentTools.AscribedBindings` group (16 tests) covering `let x : T = v in ...`-style explicitly-typed bindings across:
- path_to_id resolution (top-level, after type-alias chain, nested)
- delete_binding_clause (simple, middle of chain, after type chain, list-literal body)
- delete_body on ascribed let
- update_definition (preserves ascription)
- update_binding_clause (replaces incl. ascription)
- insert_before/insert_after around ascribed lets
- chess-style full-chain delete (all 5 type aliases + ascribed initial_board)
- verbatim chess program: path_to_id finds every binding (incl. tuple-body Piece)
- verbatim chess program: delete_binding_clause Piece succeeds
Motivated by a live-editor repro where `delete_binding_clause Piece` failed with "Path 'Piece' not found in node map" on a chain ending in an ascribed let. String-parsed reproducers all pass, which leaves room to detect regressions if the string-parse path ever breaks.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

When an agent's tool call references a binding path that doesn't resolve, the error message now includes (a) the full list of available paths in the node map, and (b) a fully qualified nested path when the requested bare name is uniquely present deeper in the tree (e.g. "outer/inner" for a bare "inner" query). Motivated by a live-editor repro where `delete_binding_clause Piece` failed with "Path 'Piece' not found in node map" but the agent only saw a single levenshtein suggestion (PieceType) with no way to know what *was* in the map. The root-cause of why `Piece` wasn't in the map from the live zipper is still unreproduced from string-parse tests; this change makes the failure mode self-diagnosing so the next occurrence leaves actionable evidence. Resolution semantics are unchanged — `path_to_id` is still strict on the success path. Only the error message grew richer. Adds 2 InvalidPaths tests covering the new diagnostic fields. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Every successful Delete(BindingClause) was surfacing to the agent as "Path X not found in node map". Root cause: CompositionGo.Local.get_diff called path_to_id(new_node_map, path) after a Delete, which correctly no longer contains the path — so it raised, and the exception propagated up as a spurious tool-call failure despite the edit having worked. Swap to path_to_id_opt on the new-side lookup; None → new_segment=None (semantically correct: a deleted binding has no replacement segment). Old-side lookup unchanged. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
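The shape of the fix can be illustrated with dictionaries standing in for node maps. This is a hypothetical Python sketch: `path_to_id`, `path_to_id_opt`, and `get_diff` reuse the names from the commit message, but their signatures and the `segments` lookup here are assumptions.

```python
from typing import Dict, Optional, Tuple


def path_to_id(node_map: Dict[str, int], path: str) -> int:
    """Strict lookup: raises when the path is absent."""
    if path not in node_map:
        raise LookupError(f"Path '{path}' not found in node map")
    return node_map[path]


def path_to_id_opt(node_map: Dict[str, int], path: str) -> Optional[int]:
    """Lenient lookup: returns None when the path is absent."""
    return node_map.get(path)


def get_diff(old_map: Dict[str, int], new_map: Dict[str, int],
             segments: Dict[int, str], path: str) -> Tuple[str, Optional[str]]:
    """Return (old_segment, new_segment); new_segment is None after a delete."""
    old_id = path_to_id(old_map, path)      # must exist before the edit
    new_id = path_to_id_opt(new_map, path)  # legitimately gone after a delete
    new_segment = None if new_id is None else segments[new_id]
    return segments[old_id], new_segment
```

Using the strict lookup on the old side still catches genuinely bad paths, while the lenient lookup on the new side stops a successful delete from being reported as a failure.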

Captures the agent-side invariants for the live-editor scenario that triggered "Exception during View: Cannot read properties of undefined (reading 'length')" when placing a probe on [fib]. The test asserts path resolves, add_manual does not raise, statics rebuild, and node_map rebuild. All pass — confirming the agent tool call path is clean and the crash lives downstream in view/eval render (core, out of agent scope). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

4 workbench tools added on this branch (see WorkbenchTools.re); the sentinel count in Test_AgentTools fell behind and failed CI.

disconcision · 2026-04-21T12:36:58Z

The system prompt refers to the old forall keyword; this should be replaced by poly.

russell-rozenbaum · 2026-04-21T14:22:29Z

Okay, verified that the main code this touches outside the agent infrastructure is the *Perform.re files, and it mainly adds to them.
This was for the projector and probe placement tools we gave the agent.

…-tools-extension

Agent.re was ~4600 lines. Extract each top-level module into its own file under src/web/view/agentCore/. Pure code motion: no logic or behavior changes. New files: - AgentResult.re (Failure + Result, paired foundational types) - Message.re (Message.Model/Utils/Update) - Chat.re (Chat.Model/Utils/Update) - ChunkedUIChat.re (UI chunking of the chat log) - ChatSlashCommands.re - ChatSystem.re (multi-chat container) Agent.re now contains only the Agent module and unwraps the outer module wrapper, so callers drop the Agent.Agent.X stutter in favor of Agent.X. All other agent modules are referenced bare at file scope: Message.X, Chat.X, ChunkedUIChat.X, ChatSlashCommands.X, ChatSystem.X. Agent.re: 4584 -> 3046 lines. Build clean, full agent test suite green. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Phase 2 of the Agent.re god-module split. After phase 1 carved out the chat-layer modules, Agent.re still held 3046 lines for the Agent module itself. Extract each submodule into its own file: - AgentModel.re (llm_error_origin + Model + Persistent) - AgentToolUtils.re (tool-JSON helpers) - AgentUtils.re (init, cleanup helpers) - AgentToolCallHandler.re (CompositionActions -> ChatSystem dispatch) - AgentUpdate.re (action dispatch + LLM request plumbing) Agent.re is now a 30-line facade: doc comment + `include AgentModel` for Model/Persistent/llm_error_origin + four module aliases for the rest. External callers keep their `Agent.Model.t`, `Agent.Update.X`, etc. paths unchanged. Pure code motion; no logic, behavior, or public-API changes. Full agent test suite green. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

- Replace outdated `forall` keyword with `poly` in Hazel language guides (HazelSyntaxNotes, HazelDocumentation). `forall` is now the exp-level form; type-level polymorphism uses `poly`.
- Add three guideline directives to CompositionPrompt: iterate until done (stop only for user input), write adequate tests, end with a concise summary of what was done.

russell-rozenbaum · 2026-04-22T00:55:44Z

The system prompt refers to the old forall keyword; this should be replaced by poly.

resolved

Probe and Statics branches in ToolCallHandler.update silently dropped unresolved paths and returned Ok, so `place_probe(["bogus"])` produced a "tool call was successful" message while leaving the editor untouched. Agent had no signal to self-correct. Mirror the SyntaxProjector guard: track resolved vs unresolved paths, and when `List.length(paths) > 0` but every path was unresolved, return an Error listing the unresolved paths and explaining the HighLevelNodeMap path format. Partial resolves (some valid, some not) still succeed, matching existing SyntaxProjector semantics. Strengthening that to surface partial failures is a separate design call.
Adds 3 regression tests in Test_AgentUX.toolcall_handler_tests:
- place_probe with only bogus paths → Error
- place_statics with only bogus paths → Error
- place_probe with mixed valid/bogus paths → still Ok
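The resolved/unresolved guard amounts to the following check. This is a Python sketch, not the Reason handler: `place_overlay` is a hypothetical name, a dictionary stands in for the HighLevelNodeMap, and the error text is invented.

```python
from typing import Dict, List, Tuple, Union


def place_overlay(node_map: Dict[str, int],
                  paths: List[str]) -> Tuple[str, Union[str, List[str]]]:
    """Error only when paths were given and none resolved; partial success is Ok."""
    resolved = [p for p in paths if p in node_map]
    unresolved = [p for p in paths if p not in node_map]
    if paths and not resolved:
        # Total failure: report every unresolved path so the agent can retry.
        return ("Error", "No paths resolved: " + ", ".join(unresolved))
    return ("Ok", resolved)
```

An empty path list still returns Ok here, matching the stated condition that the error fires only when the list is non-empty and every path fails.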

- Eg_EmojiPaint.re was an 11-line placeholder (with a typo in the first sentence) never referenced from CompositionPrompt. Delete.
- LanguageServerAction branch in ToolCallHandler.update returned Ok silently for an unimplemented path. Return Error instead so the agent gets a real signal if a future tool ever produces it.

…rlay_action The three overlay-tool branches in ToolCallHandler.update had near-identical scaffolding: build the HighLevelNodeMap, fold per path tracking unresolved and changed counts, error on total failure, rebuild the zipper/editor, dispatch AgentContext.Expand on the changed paths. Extract into a shared apply_overlay_action helper parameterized by tool_label, resolve_path, and a perform closure returning option((zipper, should_expand)). Net: 98 fewer lines and a single place to update overlay dispatch semantics. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Three new test groups pin down agent-core semantics that previously had no coverage:
- tool_allowed_in_mode: Edit allows all; Plan blocks edit tools only; Converse additionally blocks workbench + overlay tools.
- backoff_ms: attempts 0..3 return 1000, 2000, 4000, 8000 (1000 * 2^n).
- StreamDelta: dropped when flight_seq matches pending_ignore_main_reply_seq; otherwise accumulates content + reasoning onto pending_assistant_*. Also covers the case where pending_ignore is set for a different flight.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
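The backoff schedule and the stream-delta rule pinned down by these tests can be sketched as below. This is a Python illustration; `Pending`, `stream_delta`, and the `ignore_seq` field (standing in for pending_ignore_main_reply_seq) are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


def backoff_ms(attempt: int) -> int:
    """Exponential backoff: 1000 * 2^attempt milliseconds."""
    return 1000 * 2 ** attempt


@dataclass(frozen=True)
class Pending:
    content: str = ""
    reasoning: str = ""
    ignore_seq: Optional[int] = None  # flight whose deltas must be dropped


def stream_delta(p: Pending, flight_seq: int,
                 content: str = "", reasoning: str = "") -> Pending:
    """Drop deltas from the ignored flight; otherwise append to the pending text."""
    if p.ignore_seq == flight_seq:
        return p
    return Pending(p.content + content, p.reasoning + reasoning, p.ignore_seq)
```

A delta is dropped only when its flight number equals the ignored one, so an ignore marker left over from a different flight does not suppress the current stream.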

Rework the compaction system prompt so the summarizer produces a thorough markdown recap rather than an aggressive compression. Key shifts:
- Add "Completeness over brevity" as the first goal. Length is cheap; forgetting a user rule or a failing test is expensive.
- Require near-verbatim reproduction of the final user and assistant messages in a new "## Most recent exchange" section, with blockquotes / fenced blocks preserving exact wording and any pending tool calls.
- Add "## User rules & preferences (quoted)" as a required section so standing instructions ("don't touch core/…", "prefer tail recursion") survive compaction verbatim and carry forward across chained compactions.
- Add "## Tool results & program values" so probe values, test pass/fail counts, and per-tool outcomes are enumerated with arguments/paths/values.
- Add "## Plans & notes" for stated plans, TODOs, and "next I'll…" commits.
- Expand "Preserve in the summary" to spell out every category the summarizer must cover, and explicitly tell the model that many output tokens are expected when the history warrants it.
No code-path changes: only the static string list feeding mk_system_prompt. Build + 2605 tests green.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

The .cursor/ IDE scratch folder had a gitignore exception for .cursor/docs/ that was letting personal handoff notes ship to PRs. Remove the tracked file and drop the exception so .cursor/ is fully ignored, matching dev's posture. The file remains on disk locally. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…-tools-extension

Two call sites added on this branch referenced the old Info.exp.term field, which Elastatics (#2213) split into user_term/elab_term. Other call sites in these files were renamed on dev; our added functions were not reached by that sweep.

disconcision

Looks solid in general. I skimmed most of the agent-specific code, focussing on the interface with the rest of hazel; see code comments for detail.

Functional issues:

New code insertion still seems to have leading spaces. Probably worth trying to make tests around this as it seems to be a recurring issue.
Pressing Enter to send a message in a long-ish conversation often lags. It's probably worth identifying why, but regardless, it would be nice to get some immediate feedback so it's obvious something is happening (e.g. the message gets added to the log and disappears from the entry box), even if the actual processing is subject to the lag. Otherwise the user is left doubting what's going on.
When inserting new let bindings into the body below the last let binding, it doesn't always seem to insert a linebreak or space, so you sometimes get stuff like inlet empty_playlist : PlayList = ([], NoSongSelected) in (the preceding in fused with the new let), which is problematic: if it's copy-pasted, it'll break.
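A regression test for the first and third items could assert two properties of the text produced by an insertion: the inserted code carries no stray leading whitespace, and it is always separated from the preceding binding. The sketch below states those properties in illustrative Python. It is not Hazel's insertion code or test harness, and join_binding is a hypothetical helper that only models the desired result.

```python
def join_binding(existing: str, new_binding: str) -> str:
    """Append new_binding after existing with exactly one line break between them.

    Models the desired outcome only: surrounding whitespace on the new
    binding is stripped, and a newline is always placed before it when
    there is preceding code.
    """
    body = existing.rstrip()
    binding = new_binding.strip()
    if not body:
        return binding
    return body + "\n" + binding


def has_leading_whitespace(text: str) -> bool:
    """True when any line of text starts with a space or tab."""
    return any(line[:1] in (" ", "\t") for line in text.splitlines())
```

With these definitions, joining a body that ends in `in` with a new `let` binding never produces the fused `inlet` token, regardless of how much whitespace the tool call supplied.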

Aesthetic:

Slash cards (.slash-card) are a bit dark; let's use background: var(--T2) instead.
Tool calls are currently displayed as tool_name def_name. Despite using different colors, these kind of run into each other; let's stick some kind of separator between the two words. Also, it seems like def_name is slightly vertically elevated over tool_name; check the CSS.

Broader issues:

Not specific to this PR, but I think I'm running into issues based on how we're only keeping the current code view/map. I had the agent do one of the test tasks; he also wrote some tests. I then pasted in our test suite over the agent's, and told him that I did so and that a test was failing. The agent got confused though as all he saw were 'the tests he added'. I'm not actually totally clear on what was going on here, but I think it was that the agent is only ever seeing the current version, plus agent edits, so the agent gets confused when referring to 'updated'/'new' versions resulting from user edits, as those don't show up in the log. something like that. I copy/pasted the history here (https://gist.github.com/disconcision/cdf2950ee8b49820ce36f8540a169669) but it's not too readable. We should discuss this. I think this showcases the importance of a way to see what the model saw at each stage of a job.

disconcision · 2026-05-07T21:37:39Z

+    changed. Strips a projector (exposing underlying syntax) when
+    [[MakeTerm.for_projection]] or [[ProjectorInit.init]] fails, migrating
+    refractors to the underlying term id when possible. */
+let sanitize_projectors_in_segment =


what does sanitize mean? re-validate? what does it mean for maketerm/init to fail? what does this have to do with refractors, which aren't in syntax the same way projectors are?

I don't really get what this is doing and the call site doesn't clear things up for me either

disconcision · 2026-05-11T21:58:35Z

@@ -0,0 +1,170 @@
+# AGENTS.md - Hazel Development Guide


not sure if we should commit this or not; seems fine; discuss with @cyrus-

disconcision · 2026-05-11T22:01:18Z

        switch (target) {
        | Some(el) =>
          let elId = Js.Opt.to_option(Js.Unsafe.coerce(el)##.id);
-          if (is_input_field(elId)) {


This hopefully won't be an issue when you're up to date with dev; @Negabinary has some keyboard handling changes that should obviate the need for these workarounds. lmk if that's not the case and we'll find a better way of doing this.

disconcision · 2026-05-11T22:02:22Z

+  let is_trailing_ws = (c: char): bool => c == ' ' || c == '\t' || c == '\r';
  let trim_line = (line: string): string => {
    let chars = String.to_seq(line) |> List.of_seq;
-    let rec drop_leading_spaces = (chars: list(char)): list(char) =>


was this just mis-named before?

disconcision · 2026-05-11T22:03:05Z

+
+    Used to rescue native copy when a focused hidden element (e.g. the editor's
+    clipboard shim) would otherwise intercept Cmd/Ctrl+C. */
+let try_copy_window_selection_in_classes =


hopefully won't be necessary when current with dev... see comment below

disconcision · 2026-05-11T22:04:38Z

    api_key: option(string),
    active_llm: option(OpenRouter.AvailableLLMs.Model.llm_info),
    available_llms: OpenRouter.AvailableLLMs.Model.t,
+    [@yojson.default ""] [@sexp.default ""]


why are these fields but not others defaulted? trying to make sure this isn't just papering over something

disconcision · 2026-05-11T22:05:39Z

  };

+/** Like [[toggle_statics]], but for path-resolved ids (agent tools). */
+let toggle_statics_at = toggle_statics;


not sure this alias is necessary?

disconcision · 2026-05-11T22:13:21Z

  | SampleFocus(a) => Ok(SampleFocusPerform.go(z, a))
  };
 };
+


I don't really get what this set of functions is doing. they seem to be for generic projector placement, but internally they call migrate_refractor, which only applies to probes/statics, which unlike projectors are not written into the syntax tree.

russell-rozenbaum added 12 commits April 2, 2026 18:15

Agent: minor Agent.update follow-up after projector tools

6040c1d

russell-rozenbaum mentioned this pull request Apr 9, 2026

Coding agent UI updates v2 #2206

Closed

russell-rozenbaum and others added 16 commits April 17, 2026 15:08

style: dune fmt sweep on Test_AgentTools.re

6f94c20

Pure formatter reflow, no semantic changes.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Merge branch 'russell-agent-tool-display' into russell-wip-2026-04-17

75c9500

refactor: rename AgentCore/AgentView → agentCore/agentView

bb80c52

Match repo camelCase folder convention (e.g. menhirParser). include_subdirs unqualified means no dune/import edits needed.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

russell-rozenbaum and others added 8 commits April 18, 2026 20:26

test(agent): bump ToolJsonDefinitions tool count 32 → 36

6aac7a8

4 workbench tools added on this branch (see WorkbenchTools.re); the sentinel count in Test_AgentTools fell behind and failed CI.

russell-rozenbaum force-pushed the coding-agent-projector-tools-extension branch from 1842763 to 6aac7a8 Compare April 21, 2026 02:57

cyrus- added this to Hazel Big Board Apr 21, 2026

russell-rozenbaum marked this pull request as draft April 21, 2026 14:18

russell-rozenbaum marked this pull request as ready for review April 21, 2026 22:37

russell-rozenbaum and others added 4 commits April 21, 2026 18:39

Merge remote-tracking branch 'origin/dev' into coding-agent-projector…

8824a00

…-tools-extension

russell-rozenbaum and others added 8 commits April 21, 2026 21:07

Merge remote-tracking branch 'origin/dev' into coding-agent-projector…

b7dfde0

…-tools-extension

russell-rozenbaum changed the title ~~Coding agent expansion and improvements March/April~~ Coding Agent Expansion (Added Tools, New Slash Commands, Rewritten Prompts, Improved Chat UX, and more) Apr 23, 2026

cyrus- moved this to AI Assistant in Hazel Big Board Apr 23, 2026

disconcision requested changes May 11, 2026

russell-rozenbaum commented Apr 9, 2026 • edited

Summary

What's new

Files outside agent scope

Tour — new bottom-bar controls

New slash commands

Agent can place probes and understand their intermediate results

Agent can batch multiple tools per turn

More informative collapsed tool descriptions

Thinking blobs now displayed as their own UI chunks

Refactored model selection UI

User can stop chats early (and also queue up messages)

Batched Tool Calls

codecov Bot commented Apr 9, 2026 • edited

Codecov Report

disconcision commented Apr 21, 2026

russell-rozenbaum commented Apr 21, 2026

russell-rozenbaum commented Apr 22, 2026

disconcision left a comment

3 participants
