Coding Agent Expansion (Added Tools, New Slash Commands, Rewritten Prompts, Improved Chat UX, and more) #2210
russell-rozenbaum wants to merge 52 commits into
… docs
- Scroll chat to bottom on updates only when already pinned to bottom
- System prompt and developer notes use full-screen-doc (left-aligned, readable)
- Hide edits summary when there are no edit tools in a turn
- Wire Agent Tools to tools-view-* CSS (flex row, hover, toggle states)
Made-with: Cursor
Add place_statics, remove_statics, and toggle_statics with ProbePerform helpers, StaticsAction wiring, and Agent handling. Register tools in CompositionUtils and extend tests, including projector snippets. Introduce ProjectorCatalog and refresh CompositionPrompt, CompactionPrompt, HazelDocumentation, HazelSyntaxNotes, and ProbeTools copy for projectors and statics. When marking the active workbench task complete, fold remaining subtasks to completed instead of failing; document in WorkbenchTools. Fix chat messages autoscroll while pinned to the bottom by deferring scroll-to-bottom after layout and enriching the scroll stamp (tool results and slack). Set min-height on full-screen view content for flex scrolling. Made-with: Cursor
Add place/remove/toggle_syntax_projector tools (OpenAPI + CompositionUtils + CompositionActions) and ProjectorPerform helpers for path-resolved placement. Agent.re: handle SyntaxProjectorAction with expand-on-success; fail when zero paths apply; extend mk_diff for projector/probe/statics actions; retry and workbench nudges can use synthetic user-role API messages; document message channels in file header. HighLevelNodeMap: improve closest-path suggestions for nested bindings; note outer/inner paths in path_to_id errors. Prompts: message_channels, partnering_and_user_intent, CONTEXT UPDATE echo ban, EditTools path semantics, projector catalog; compaction prompt aligned. Tests: HighLevelNodeMap path cases, binding_clause scenarios, syntax projector parse coverage. UI: AgentMessageMarkdown + styles for assistant bubbles. Track .cursor/docs/coding-agent-projector-tools-extension.md via .gitignore exception for agent onboarding. Made-with: Cursor
- Copy exact LLM context snapshot from Agent Context panel (clipboard shim + toast); shared Message/Agent helpers and tests in Test_AgentUX.
- Multi-tool turns: stop after first real failure; skipped tools with AgentToolResult.skipped, grey circle-minus icon, chat export [not executed]; Test_AgentMultiTool.
- Context meter: 80% rounded limit capped at 100k tokens (AgentGlobals); compaction uses same limit.
- JsUtil: copy_via_shim, show_copy_toast; Icons.circle_with_minus; CSS for header actions and skipped tool state.
- run_tests: node stack + idb_stub alignment; haz3ltest registers multi-tool suite.
- Add AGENTS.md; .gitignore .cursor/ with docs exception; prompt/catalog/HighLevelNodeMap/SyntaxProjectorTools/WorkbenchTools tweaks.
Made-with: Cursor
…agent
- After update_definition, sanitize syntax projectors in the definition segment via ProjectorPerform.sanitize_projectors_in_segment so wrappers match init.
- GeneralTreeUtils and CompositionGo: get_refs_to after pattern edits; clearer ambiguous-path behavior and EditTools prompt notes.
- StringUtil: trim_leading and trim_trailing_whitespace strip horizontal whitespace (space, tab, CR); align with paste (ClipboardCache) and tests.
- CompositionPrompt documents multiple tool calls per turn and sequential skip-on-failure; CompactionPrompt is expanded with Overview, Goals, Rules, and related sections.
- Agent: surface Action.Failure.show when a structural tool cannot be applied; minor clarity in tool-chain failure detection.
- Tests: AgentTools, AgentUX, and StringUtil updates.
Made-with: Cursor
- Chat.Utils: estimate_openrouter_prompt_tokens and context_meter_prompt_tokens so the bar updates when history is compacted (estimate) until the next agent reply supplies provider usage again.
- ChatBottomBar uses context_meter_prompt_tokens for the token display.
- CompositionPrompt and update_definition tool text: ^^kind(expr) for livelits (slider, sliderf, check, text, csv, card) when overwriting definitions.
Made-with: Cursor
Enforce at most one in-flight main or compaction LLM request; add Stop which clears busy state and ignores the matching reply via flight sequence numbers (including API error and retry paths). Chat bottom bar shows Stop while awaiting or compacting and blocks send until idle. Prompts: prefer incremental insert/update tools over monolithic initialize; treat workbench tasks as optional for large multi-turn work only. Compaction instructions prioritize transcript and tool results over a misleading empty agent-view snapshot. Tests: HandleLLMResponse passes main_llm_seq; add stop_square icon. Made-with: Cursor
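As a rough illustration of the flight-sequence idea in the commit above, here is a minimal OCaml sketch. The record fields and the function names (`send`, `stop`, `handle_reply`) are hypothetical stand-ins, not the names used in Agent.re; the point is only that a reply carrying the sequence number of a stopped request is dropped.

```ocaml
(* Hypothetical model of "one in-flight request + Stop"; not the real Agent state. *)
type state = {
  next_seq : int;          (* sequence number the next request will carry *)
  in_flight : int option;  (* at most one request may be in flight *)
  ignore_seq : int option; (* reply to drop because the user pressed Stop *)
  log : string list;       (* accepted replies, newest first *)
}

let init = { next_seq = 0; in_flight = None; ignore_seq = None; log = [] }

(* Start a request only when idle; a send while busy is a no-op here. *)
let send (s : state) : state =
  match s.in_flight with
  | Some _ -> s
  | None -> { s with next_seq = s.next_seq + 1; in_flight = Some s.next_seq }

(* Stop clears the busy state and remembers which reply to ignore. *)
let stop (s : state) : state =
  match s.in_flight with
  | None -> s
  | Some seq -> { s with in_flight = None; ignore_seq = Some seq }

(* A reply is accepted only if it matches the current flight. *)
let handle_reply (s : state) (seq : int) (content : string) : state =
  if s.ignore_seq = Some seq then { s with ignore_seq = None }
  else if s.in_flight = Some seq then
    { s with in_flight = None; log = content :: s.log }
  else s
```

With this shape, a late reply to a stopped request cannot clobber the state of a newer request, because the newer request carries a different sequence number.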
- Queue user sends while busy; flush when idle; Stop only via button; append cancel line before flushing so ordering stays correct.
- Chunked UI: ResponseCancelled outside Filbert; compaction body renders as Markdown; CSS for queue panel and markdown compaction.
- Compaction: dialogue slice includes prior summary for chained compacts; prompt stresses merging that block with new turns; structured Markdown output contract.
- Context meter shows only API prompt_tokens; use em dash until next reply after compaction (no client estimate).
- HighLevelNodeMap: duplicate path strings resolve to earliest sibling; update CompositionPrompt and EditTools path guidance.
- Tests: duplicate path order; compaction dialogue slice includes summary.
Made-with: Cursor
…ENTS manual QA
- Test_AgentControlFlow: Stop, stale HandleLLMResponse/Compaction, ApiError ignore, queue flush ordering, Send while busy
- Test_AgentUX: context_meter_prompt_tokens, messages_for_openrouter, MarkActiveTaskComplete subtask auto-complete
- Test_AgentTools: get_refs_to_after_pattern_edit vs get_refs_to when pre/post agree
- haz3ltest: register AgentControlFlow suite
- AGENTS.md: agent test filters and manual QA checklist for UI/clipboard
Made-with: Cursor
- Add SetToolsInCategoryEnabled and Agent Tools header toggles per suite (View/Edit/Workbench/Other): all-on indicator, click enables or disables every tool in that category for the chat.
- Sidebar: assistant tab tooltip "Open Hazel Coding Agent".
- ChatBottomBar: shorter queue placeholders while compacting or awaiting.
Made-with: Cursor
- Document that csv attaches only to empty [] and card expects playing-card tuples; update projector catalog, composition prompt, edit tools, and syntax projector tool descriptions.
- Context meter: second line for one-decimal percent, show 100k for 100000 limit, br() layout; drop nowrap on label for line break.
Made-with: Cursor
Codecov Report
❌ Patch coverage is …
Additional details and impacted files
@@            Coverage Diff             @@
##              dev    #2210      +/-   ##
==========================================
+ Coverage   50.26%   50.96%   +0.70%
==========================================
  Files         267      278      +11
  Lines       31709    32800    +1091
==========================================
+ Hits        15938    16716     +778
- Misses      15771    16084     +313
- HighLevelNodeMap.next_sibling_of / prev_sibling_of: `mod` binds tighter than `+/-`, so `idx + 1 mod len` was parsed as `idx + (1 mod len)`, which never wrapped and raised on empty siblings. Wrap explicitly and guard len=0.
- Agent.Update: API-error content had `"\\Error: "` (literal backslash), producing "Code: 429\Error: ..." in user-visible messages. Extract format_api_error_content and use `"\nError: "`.
- Tests: 4 new sibling wrap-around cases in HighLevelNodeMap suite; 1 new API-error format case in Agent UX suite.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
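The precedence issue described above is easy to reproduce in isolation. The following OCaml sketch uses hypothetical helpers over a plain index and sibling count (the real next_sibling_of / prev_sibling_of work on node-map entries); it only illustrates the parse and the shape of the fix.

```ocaml
(* Hypothetical stand-ins for the sibling-navigation arithmetic. *)

(* Buggy form: parsed as idx + (1 mod len), so it never wraps around,
   and 1 mod 0 raises Division_by_zero when there are no siblings. *)
let next_index_buggy (idx : int) (len : int) : int = idx + 1 mod len

(* Fixed form: explicit parentheses plus a guard for the empty case. *)
let next_index (idx : int) (len : int) : int option =
  if len = 0 then None else Some ((idx + 1) mod len)

(* Adding len before taking the remainder keeps the result
   non-negative when idx = 0. *)
let prev_index (idx : int) (len : int) : int option =
  if len = 0 then None else Some ((idx - 1 + len) mod len)
```

For the last of three siblings, `next_index_buggy 2 3` evaluates to 3 (out of range), while `next_index 2 3` wraps to `Some 0`.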
- HighLevelNodeMap.closest_valid_path_to_ill_path: called inside the error-handling branch of path_to_id, so returning "" on an empty node_map is safer than raising (avoids compound failure).
- Agent.Update: hoist max_empty_retries to module level next to max_api_retries; interpolate it into the retry copy so the bound and the user-visible "/N" stay in sync.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Code-review fixes from the branch review:
- M1: HighLevelNodeMap sibling-nav precedence + empty-siblings guard
- M2: API-error content uses \n between Code and Error labels
- N4: closest_valid_path_to_ill_path no longer raises on empty map
- N1: max_empty_retries centralized and interpolated in retry copy
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…oggle Restructure the LLM model list into a curated recommended section and a searchable master list. Recommended entries (Opus 4.6, Sonnet 4.6, Gemini 3 Flash Preview, MiMo V2 Pro, Gemma 4 31B) are ordered most capable → cheapest with per-model taglines. Master list supports subsequence fuzzy search on name+id and an "Only free" toggle, with state in agent_globals (model_filter, only_free_models; yojson/sexp defaults keep existing saves compatible). Recommended and master live in separate scroll containers sized to keep the Confirm Settings button visible at typical sidebar heights. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
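A minimal sketch of subsequence matching combined with an "Only free" filter, in OCaml. The `model` record, the case-insensitive comparison, and matching against the name and id joined by a space are all assumptions made for illustration; the actual model-list code may differ.

```ocaml
(* Hypothetical model entry; the real llm_info has more fields. *)
type model = { name : string; id : string; free : bool }

(* True when every character of query appears in target, in order,
   not necessarily adjacent. Case-insensitivity is an assumption. *)
let is_subsequence (query : string) (target : string) : bool =
  let q = String.lowercase_ascii query and t = String.lowercase_ascii target in
  let qn = String.length q and tn = String.length t in
  let rec go qi ti =
    if qi = qn then true
    else if ti = tn then false
    else if q.[qi] = t.[ti] then go (qi + 1) (ti + 1)
    else go qi (ti + 1)
  in
  go 0 0

(* Keep models that pass the free-only toggle and match the query
   against name and id together. *)
let filter_models ~(query : string) ~(only_free : bool) (models : model list)
    : model list =
  List.filter
    (fun m ->
      ((not only_free) || m.free) && is_subsequence query (m.name ^ " " ^ m.id))
    models
```

An empty query matches everything, so the master list shows all models (subject to the toggle) until the user starts typing.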
Empty-program seeding now uses insert_before / insert_after with no path; removes the separate initialize tool. Omitting path on either insert prepends (Before) or appends (After) at the program boundary, collapsing two code paths into one.
- Drop EditTools.initialize JSON and Initialize action variant
- Add InsertAtProgramBoundary(direction, code) for no-path inserts
- Parser treats missing / empty-string path as the boundary case
- Update prompts (Composition, Compaction, RecFib), tool catalogs, and chat/agent UI tool-name lists
- Repurpose initialize tests as no-path insert_before/insert_after tests
- Tool count: 33 -> 32; insert tools drop path from required list
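The parsing rule for the no-path case can be sketched as below. `InsertAtProgramBoundary(direction, code)` comes from the commit message; the `InsertAtPath` constructor and the `parse_insert` function are hypothetical names used only to make the example self-contained.

```ocaml
type direction = Before | After

type action =
  | InsertAtPath of direction * string * string  (* hypothetical: direction, path, code *)
  | InsertAtProgramBoundary of direction * string

(* A missing or empty-string path selects the program boundary:
   Before prepends, After appends. *)
let parse_insert (dir : direction) (path : string option) (code : string)
    : action =
  match path with
  | None | Some "" -> InsertAtProgramBoundary (dir, code)
  | Some p -> InsertAtPath (dir, p, code)
```

Treating `None` and `Some ""` the same way means a model that sends an empty string instead of omitting the field still lands on the boundary case.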
…ge, /cost, /help
Replaces markdown-rendered slash output with a typed `slash_command_payload` variant pipeline. Each command builds its own record (cost_output, credits_output, usage_output, help_output, KeyOutput, SlashError); the view layer owns formatting via SlashCommandOutputView, which renders custom card UIs per kind with branded border-left colors, stat tiles, kv rows, a credit progress bar, and a help table.
New commands:
- /cost: estimates session $ from per-message token counts
- /account-usage: hits /api/v1/credits (account-wide credit pool)
- /key-usage: hits /api/v1/key (per-key spend, limits, daily/weekly/monthly)
- /key: shows the currently-set OpenRouter API key string
- /help: lists all slash commands
OpenRouter HTTP additions follow the existing get_models idiom; no caching, no key-persistence change. Tests updated to cover the new alphabetical ordering and help_payload contents.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Pure formatter reflow, no semantic changes. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Adds a per-model reasoning-effort selector to the chat bottom bar, gated on whether the active OpenRouter model advertises `reasoning` in its `supported_parameters`. The dropup appears in the action-button row (left side) with options Off / Low / Medium / High and uses pure CSS hover for open/close. Wiring: - `llm_info` gains a `supports_reasoning: bool` field, parsed from the `/api/v1/models` payload alongside the existing required-params check; defaults false on legacy stored data. - `AgentGlobals` gains `reasoning_effort: option(effort_level)` and a `SetReasoningEffort` action. - `Payload.Utils.mk_default` accepts an optional `~reasoning`. Threaded through the four main-chat send paths (initial send, retries, retry-empty); chat-naming and compaction deliberately leave it `None`. Also fixes overflow on the `/key` slash card: long API keys now wrap within the card (`overflow-wrap: anywhere` + `word-break: break-all`). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…o-node Replaces the boxed tool-call rows in the chat transcript with plain-text rows that expand inline on click. Collapsed rows now show a per-tool signifier (structural path or free-text summary) so the user can see at a glance what each call did; expanded content is indented rather than boxed. Lead-glyph categories (edit/read/view/projector/probe/statics/workbench) are rendered on the left; success/fail/skip status icon stays on the right. Cmd/Ctrl-click a row dispatches Globals.Update.JumpToTile(id) to the corresponding AST node via HighLevelNodeMap.path_to_id_opt; primary click still toggles expand. Rows whose path no longer resolves dim (.stale) only for tools where persistence is expected — delete_* / remove_* never dim. New ToolCallSummary module centralizes per-tool category + signifier + jump_paths + persists derivation. Summarizer is applied to the three tool-call surfaces: ToolResultView (full redesign), ChatMessagesView .summary-tool-link (signifier added), and WorkbenchView tool rows (redesign via shared ToolResultView). Tests: 14 summarizer unit tests in Test_AgentUX covering category mapping, signifier extraction, path-list joining, free-text truncation, and unknown-tool fallback. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…meter bar Renames the user-facing slash command for symmetry with /key-usage and /account-usage. Internal variant names (CostOutput, RunSlashCommandCost) intentionally left as-is — only the typed-in command label changed. Also moves the "(N.N%)" line in the bottom-bar context meter from the label group (above the bar) to its own div below the bar, matching the visual hierarchy of label → bar → percentage. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Match repo camelCase folder convention (e.g. menhirParser). include_subdirs unqualified means no dune/import edits needed. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- Render reasoning above response: italic dim body with "Thought for Ns" / "Nm Ms" header. Persisted show_thinking flag (default true) gates display.
- /show-thinking slash command toggles the flag, posts Notice.
- Reply.Model gains reasoning field; parse reasoning_content/reasoning/thinking from OpenRouter responses. Capture request elapsed_ms in HandleLLMResponse, store on Message for header.
- Bottom bar stacks active model's pretty name above a "change model" button routing to main menu; main menu auto-returns to chat on model select.
- Page.re: clipboard-shim copy falls back to window selection inside agent containers via new JsUtil.try_copy_window_selection_in_classes.
Persisted mode in AgentGlobals gates which tools the agent may call:
- converse: only view tools (no edits, workbench, or overlays)
- plan: blocks edits, keeps workbench so the agent can build todos
- edit: full toolset (still subject to per-tool toggles)
Mode is injected into the per-turn context snapshot and explained statically in CompositionPrompt; identity copy now stresses Filbert is a structure-editor agent, not a text-editor agent.
Bottom-bar gets a top row: info-icon + colored mode toggle (click to cycle) on the left, current model name on the right. Change-model button stays in the bottom row. Placeholder gains a "type / for commands" hint.
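The gating rules above amount to a small decision table. A hedged OCaml sketch follows; the `category` type and its constructor names are assumptions (the real code presumably classifies tools by name), and per-tool toggles are left out.

```ocaml
type mode = Converse | Plan | Edit

(* Hypothetical tool categories used only for this illustration. *)
type category = View | EditTool | Workbench | Overlay

let tool_allowed_in_mode (m : mode) (c : category) : bool =
  match m, c with
  | Edit, _ -> true                                     (* full toolset *)
  | Plan, EditTool -> false                             (* plan blocks edits only *)
  | Plan, (View | Workbench | Overlay) -> true
  | Converse, View -> true                              (* converse: view tools only *)
  | Converse, (EditTool | Workbench | Overlay) -> false
```

Spelling out every mode/category pair, rather than using a catch-all `_ -> false`, lets the compiler flag any category added later that has not been classified.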
Rewrote the identity block so "structure-editor agent operating on
typed syntactic structure via a small calculus of typed tool calls"
is the headline, not a parenthetical buried mid-paragraph. Added an
explicit guideline that introspective questions ("what are you?",
"what can you do?") must open with that framing instead of a generic
feature list — Filbert was answering them by listing capabilities
("writes code", "debugs", "explains") without mentioning the
structure-editor angle at all.
UI: move mode toggle + model name out of the bordered input container into a top-bar sibling above it. Context meter stays in the action-row (extracted to a let-binding for reuse). Trim placeholder back to "Type your message..." — the slash-cmd tip added bulk. Behavior: drop the workbench-nudge that fires when the assistant finishes a turn with no text and an active subtask still open. The "MANDATORY: write a sentence to the user" follow-up was creating loops in plan mode. Idle path goes straight to the compact check.
…o scroll stamp
The stick-to-bottom hook is keyed on chat_messages_scroll_stamp, which
previously only hashed finalized-log contents. During streaming the log
is stable and the stamp didn't move, so the hook's update never fired
and the view drifted away from the growing in-progress bubble.
Mixing String.length of pending_assistant_{content,reasoning} into the
stamp makes every delta re-trigger the hook. The existing
stick_to_bottom / is_near_bottom policy already disengages when the
user scrolls up and re-engages when they scroll back.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
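To make the scroll-stamp change concrete, here is a small OCaml sketch. The `stamp` record and both function names are hypothetical; the real chat_messages_scroll_stamp may combine its inputs differently. The idea shown is only that the lengths of the pending buffers are part of the stamp, so each streamed delta changes it.

```ocaml
(* Hypothetical stamp: a hash of the finalized log plus the lengths of
   the in-progress assistant content and reasoning buffers. *)
type stamp = {
  log_hash : int;
  pending_content_len : int;
  pending_reasoning_len : int;
}

let scroll_stamp ~(log : string list) ~(pending_content : string)
    ~(pending_reasoning : string) : stamp =
  { log_hash = Hashtbl.hash log;
    pending_content_len = String.length pending_content;
    pending_reasoning_len = String.length pending_reasoning }

(* The stick-to-bottom hook is assumed to re-run only when the stamp changes. *)
let should_rescroll ~(prev : stamp) ~(next : stamp) : bool = prev <> next
```

With only `log_hash` in the stamp, two consecutive streaming deltas would produce equal stamps, which matches the drift described in the commit.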
Adds four tools so the agent can maintain its own plan, not just append:
- update_active_task / update_active_subtask: edit title and/or description; title changes rename the StringMap key and update the active_task / display_task / subtask_ordering pointers that reference it.
- delete_task / delete_subtask: hard delete by title; clears pointers (active_task, display_task, active_subtask) that pointed at the removed entry.
Closes the gap where the agent could only add/reorder and had no way to correct a bad title or drop a planned step.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The 4 new CRUD tools were falling through to the default None branch in ToolCallSummary.of_tool_call and rendering under the OTHER category. Route update_active_task/subtask and delete_task/subtask through the workbench(...) helper with sensible signifiers (new_title or new_description for updates; title for deletes). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Adds `AgentTools.AscribedBindings` group (16 tests) covering `let x : T = v in ...`-style explicitly-typed bindings across:
- path_to_id resolution (top-level, after type-alias chain, nested)
- delete_binding_clause (simple, middle of chain, after type chain, list-literal body)
- delete_body on ascribed let
- update_definition (preserves ascription)
- update_binding_clause (replaces incl. ascription)
- insert_before/insert_after around ascribed lets
- chess-style full-chain delete (all 5 type aliases + ascribed initial_board)
- verbatim chess program: path_to_id finds every binding (incl. tuple-body Piece)
- verbatim chess program: delete_binding_clause Piece succeeds
Motivated by a live-editor repro where `delete_binding_clause Piece` failed with "Path 'Piece' not found in node map" on a chain ending in an ascribed let. String-parsed reproducers all pass, which leaves room to detect regressions if the string-parse path ever breaks.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
When an agent's tool call references a binding path that doesn't resolve, the error message now includes (a) the full list of available paths in the node map, and (b) a fully qualified nested path when the requested bare name is uniquely present deeper in the tree (e.g. "outer/inner" for a bare "inner" query). Motivated by a live-editor repro where `delete_binding_clause Piece` failed with "Path 'Piece' not found in node map" but the agent only saw a single levenshtein suggestion (PieceType) with no way to know what *was* in the map. The root-cause of why `Piece` wasn't in the map from the live zipper is still unreproduced from string-parse tests; this change makes the failure mode self-diagnosing so the next occurrence leaves actionable evidence. Resolution semantics are unchanged — `path_to_id` is still strict on the success path. Only the error message grew richer. Adds 2 InvalidPaths tests covering the new diagnostic fields. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Every successful Delete(BindingClause) was surfacing to the agent as "Path X not found in node map". Root cause: CompositionGo.Local.get_diff called path_to_id(new_node_map, path) after a Delete, which correctly no longer contains the path — so it raised, and the exception propagated up as a spurious tool-call failure despite the edit having worked. Swap to path_to_id_opt on the new-side lookup; None → new_segment=None (semantically correct: a deleted binding has no replacement segment). Old-side lookup unchanged. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
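The fix above comes down to using a strict lookup on the old side and an optional lookup on the new side. A simplified OCaml sketch, with a plain string-to-int map standing in for the node map and a hypothetical `diff_ids` standing in for the relevant part of get_diff:

```ocaml
module StringMap = Map.Make (String)

(* Strict lookup: raises when the path is absent. *)
let path_to_id (m : int StringMap.t) (path : string) : int =
  match StringMap.find_opt path m with
  | Some id -> id
  | None -> failwith ("Path '" ^ path ^ "' not found in node map")

(* Optional lookup: absence is an ordinary result, not an error. *)
let path_to_id_opt (m : int StringMap.t) (path : string) : int option =
  StringMap.find_opt path m

(* Old side stays strict; the new side may legitimately be missing
   after a delete, so it becomes None instead of an exception. *)
let diff_ids ~(old_map : int StringMap.t) ~(new_map : int StringMap.t)
    (path : string) : int * int option =
  (path_to_id old_map path, path_to_id_opt new_map path)
```

After a successful delete the path exists only in the old map, so the result is `(id, None)` rather than a spurious failure.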
Captures the agent-side invariants for the live-editor scenario that triggered "Exception during View: Cannot read properties of undefined (reading 'length')" when placing a probe on [fib]. The test asserts path resolves, add_manual does not raise, statics rebuild, and node_map rebuild. All pass — confirming the agent tool call path is clean and the crash lives downstream in view/eval render (core, out of agent scope). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
4 workbench tools added on this branch (see WorkbenchTools.re); the sentinel count in Test_AgentTools fell behind and failed CI.
1842763 to 6aac7a8
The system prompt refers to the old
Okay, verified that main code this hits external to agent infrastructure are
Agent.re was ~4600 lines. Extract each top-level module into its own file under src/web/view/agentCore/. Pure code motion: no logic or behavior changes.
New files:
- AgentResult.re (Failure + Result, paired foundational types)
- Message.re (Message.Model/Utils/Update)
- Chat.re (Chat.Model/Utils/Update)
- ChunkedUIChat.re (UI chunking of the chat log)
- ChatSlashCommands.re
- ChatSystem.re (multi-chat container)
Agent.re now contains only the Agent module and unwraps the outer module wrapper, so callers drop the Agent.Agent.X stutter in favor of Agent.X. All other agent modules are referenced bare at file scope: Message.X, Chat.X, ChunkedUIChat.X, ChatSlashCommands.X, ChatSystem.X.
Agent.re: 4584 -> 3046 lines. Build clean, full agent test suite green.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Phase 2 of the Agent.re god-module split. After phase 1 carved out the chat-layer modules, Agent.re still held 3046 lines for the Agent module itself. Extract each submodule into its own file:
- AgentModel.re (llm_error_origin + Model + Persistent)
- AgentToolUtils.re (tool-JSON helpers)
- AgentUtils.re (init, cleanup helpers)
- AgentToolCallHandler.re (CompositionActions -> ChatSystem dispatch)
- AgentUpdate.re (action dispatch + LLM request plumbing)
Agent.re is now a 30-line facade: doc comment + `include AgentModel` for Model/Persistent/llm_error_origin + four module aliases for the rest. External callers keep their `Agent.Model.t`, `Agent.Update.X`, etc. paths unchanged.
Pure code motion; no logic, behavior, or public-API changes. Full agent test suite green.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- Replace outdated `forall` keyword with `poly` in Hazel language guides (HazelSyntaxNotes, HazelDocumentation). `forall` is now the exp-level form; type-level polymorphism uses `poly`.
- Add three guideline directives to CompositionPrompt: iterate until done (stop only for user input), write adequate tests, end with a concise summary of what was done.
resolved
Probe and Statics branches in ToolCallHandler.update silently dropped unresolved paths and returned Ok, so `place_probe(["bogus"])` produced a "tool call was successful" message while leaving the editor untouched. Agent had no signal to self-correct. Mirror the SyntaxProjector guard: track resolved vs unresolved paths, and when `List.length(paths) > 0` but every path was unresolved, return an Error listing the unresolved paths and explaining the HighLevelNodeMap path format. Partial resolves (some valid, some not) still succeed — matches existing SyntaxProjector semantics. Strengthening that to surface partial failures is a separate design call. Adds 3 regression tests in Test_AgentUX.toolcall_handler_tests: - place_probe with only bogus paths → Error - place_statics with only bogus paths → Error - place_probe with mixed valid/bogus paths → still Ok
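The guard described above (error only when every requested path fails, success on partial resolution) can be sketched as follows. `apply_paths`, the `resolve` callback, and the error wording are hypothetical; the real handler also performs the overlay action and rebuilds the editor.

```ocaml
(* Resolve each path; report an Error only when paths were given and
   none of them resolved. Partial resolution still succeeds. *)
let apply_paths ~(resolve : string -> int option) (paths : string list)
    : (int list, string) result =
  let resolved, unresolved =
    List.fold_left
      (fun (ok, bad) p ->
        match resolve p with
        | Some id -> (id :: ok, bad)
        | None -> (ok, p :: bad))
      ([], []) paths
  in
  if paths <> [] && resolved = [] then
    Error ("No paths resolved: " ^ String.concat ", " (List.rev unresolved))
  else Ok (List.rev resolved)
```

Under these assumptions, a call with only bogus paths yields an `Error` the agent can react to, while a mixed call returns `Ok` with the ids that did resolve.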
- Eg_EmojiPaint.re was an 11-line placeholder (with a typo in the first sentence) never referenced from CompositionPrompt. Delete.
- LanguageServerAction branch in ToolCallHandler.update returned Ok silently for an unimplemented path. Return Error instead so the agent gets a real signal if a future tool ever produces it.
…rlay_action The three overlay-tool branches in ToolCallHandler.update had near-identical scaffolding: build the HighLevelNodeMap, fold per path tracking unresolved and changed counts, error on total failure, rebuild the zipper/editor, dispatch AgentContext.Expand on the changed paths. Extract into a shared apply_overlay_action helper parameterized by tool_label, resolve_path, and a perform closure returning option((zipper, should_expand)). Net: 98 fewer lines and a single place to update overlay dispatch semantics. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Three new test groups pin down agent-core semantics that previously had no coverage:
- tool_allowed_in_mode: Edit allows all; Plan blocks edit tools only; Converse additionally blocks workbench + overlay tools.
- backoff_ms: attempts 0..3 return 1000, 2000, 4000, 8000 (1000 * 2^n).
- StreamDelta: dropped when flight_seq matches pending_ignore_main_reply_seq; otherwise accumulates content + reasoning onto pending_assistant_*. Also covers the case where pending_ignore is set for a different flight.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
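The backoff schedule quoted above (1000 * 2^n milliseconds) is a one-liner. This OCaml sketch assumes non-negative, small attempt numbers and makes no claim about how the real backoff_ms handles larger values.

```ocaml
(* Exponential backoff: 1000 ms doubled once per prior attempt.
   1 lsl n computes 2^n for small non-negative n. *)
let backoff_ms (attempt : int) : int = 1000 * (1 lsl attempt)

(* The delays for attempts 0 through 3. *)
let first_four : int list = List.map backoff_ms [0; 1; 2; 3]
```

Here `first_four` is `[1000; 2000; 4000; 8000]`, matching the values the tests pin down.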
Rework the compaction system prompt so the summarizer produces a thorough
markdown recap rather than an aggressive compression. Key shifts:
- Add "Completeness over brevity" as the first goal. Length is cheap;
forgetting a user rule or a failing test is expensive.
- Require near-verbatim reproduction of the final user and assistant
messages in a new "## Most recent exchange" section, with blockquotes
/ fenced blocks preserving exact wording and any pending tool calls.
- Add "## User rules & preferences (quoted)" as a required section so
standing instructions ("don't touch core/…", "prefer tail recursion")
survive compaction verbatim and carry forward across chained compactions.
- Add "## Tool results & program values" so probe values, test pass/fail
counts, and per-tool outcomes are enumerated with arguments/paths/values.
- Add "## Plans & notes" for stated plans, TODOs, and "next I'll…" commits.
- Expand "Preserve in the summary" to spell out every category the
summarizer must cover, and explicitly tell the model that many output
tokens are expected when the history warrants it.
No code-path changes — only the static string list feeding
mk_system_prompt. Build + 2605 tests green.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The .cursor/ IDE scratch folder had a gitignore exception for .cursor/docs/ that was letting personal handoff notes ship to PRs. Remove the tracked file and drop the exception so .cursor/ is fully ignored, matching dev's posture. The file remains on disk locally. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Two call sites added on this branch referenced the old Info.exp.term field, which Elastatics (#2213) split into user_term/elab_term. Other call sites in these files were renamed on dev; our added functions were not reached by that sweep.
disconcision
left a comment
Looks solid in general. I skimmed most of the agent-specific code, focussing on the interface with the rest of hazel; see code comments for detail.
Functional issues:
- New code insertion still seems to have leading spaces. Probably worth trying to make tests around this as it seems to be a recurring issue.
- Pressing enter to send a message in a long-ish conversation often lags... probably worth identifying why, but regardless, it would be nice to get some immediate feedback, just to make it obvious something is happening (eg message gets added to log and disappears from entry box), even if actual processing is subject to the lag. otherwise the user is left doubting what's going on.
- When inserting new let bindings into the body below the last let binding, it doesn't always seem to insert a linebreak or space, so you sometimes get stuff like `inlet empty_playlist : PlayList = ([], NoSongSelected) in`, which is problematic: if it's copy-pasted, it'll break.
Aesthetic:
- Slash cards (`.slash-card`) are a bit dark... let's use `background: var(--T2)` instead.
- Tool calls are currently displayed as `tool_name def_name`. Despite using different colors, these kind of run into each other; let's stick some kind of separator between the two words. Also, it seems like def_name is slightly vertically elevated over tool_name; check the CSS.
Broader issues:
- Not specific to this PR, but I think I'm running into issues based on how we're only keeping the current code view/map. I had the agent do one of the test tasks; he also wrote some tests. I then pasted in our test suite over the agent's, and told him that I did so and that a test was failing. The agent got confused though as all he saw were 'the tests he added'. I'm not actually totally clear on what was going on here, but I think it was that the agent is only ever seeing the current version, plus agent edits, so the agent gets confused when referring to 'updated'/'new' versions resulting from user edits, as those don't show up in the log. something like that. I copy/pasted the history here (https://gist.github.com/disconcision/cdf2950ee8b49820ce36f8540a169669) but it's not too readable. We should discuss this. I think this showcases the importance of a way to see what the model saw at each stage of a job.
   changed. Strips a projector (exposing underlying syntax) when
   [[MakeTerm.for_projection]] or [[ProjectorInit.init]] fails, migrating
   refractors to the underlying term id when possible. */
let sanitize_projectors_in_segment =
what does sanitize mean? re-validate? what does it mean for maketerm/init to fail? what does this have to do with refractors, which aren't in syntax the same way projectors are?
I don't really get what this is doing and the call site doesn't clear things up for me either
@@ -0,0 +1,170 @@
# AGENTS.md - Hazel Development Guide
not sure if we should commit this or not; seems fine; discuss with @cyrus-
switch (target) {
| Some(el) =>
  let elId = Js.Opt.to_option(Js.Unsafe.coerce(el)##.id);
  if (is_input_field(elId)) {
This hopefully won't be an issue when you're up to date with dev; @Negabinary has some keyboard handling changes that should obviate the need for these workarounds. lmk if that's not the case and we'll find a better way of doing this.
let is_trailing_ws = (c: char): bool => c == ' ' || c == '\t' || c == '\r';
let trim_line = (line: string): string => {
  let chars = String.to_seq(line) |> List.of_seq;
  let rec drop_leading_spaces = (chars: list(char)): list(char) =>
was this just mis-named before?
   Used to rescue native copy when a focused hidden element (e.g. the editor's
   clipboard shim) would otherwise intercept Cmd/Ctrl+C. */
let try_copy_window_selection_in_classes =
hopefully won't be necessary when current with dev... see comment below
api_key: option(string),
active_llm: option(OpenRouter.AvailableLLMs.Model.llm_info),
available_llms: OpenRouter.AvailableLLMs.Model.t,
[@yojson.default ""] [@sexp.default ""]
why are these fields but not others defaulted? trying to make sure this isn't just papering over something
};

/** Like [[toggle_statics]], but for path-resolved ids (agent tools). */
let toggle_statics_at = toggle_statics;
not sure this alias is necessary?
  | SampleFocus(a) => Ok(SampleFocusPerform.go(z, a))
  };
};
I don't really get what this set of functions is doing. they seem to be for generic projector placement, but internally they call migrate_refractor, which only applies to probes/statics, which unlike projectors are not written into the syntax tree.
Summary
Major coding-agent expansion. Scope: agent surface (`src/web/view/agent{Core,View}/`, agent CSS/tests, `OpenRouter.re`, `HighLevelNodeMap.re`, prompt factory).
What's new
- `HighLevelNodeMap`, statics tools, generalized `insert_before`/`insert_after`, per-tool + per-category toggles, `ProjectorCatalog`.
- `/compact`, context meter, sticky scroll, centralized empty/API retry path.
- `SlashCommandOutput` view; `/help`, `/session-usage`, `/key`, `/key-usage`, `/account-usage`, `/show-thinking`, `/compact`.
- `ToolCallSummary`, agent-message Markdown via Omd → vdom, chat export/copy.
- `^^ projector` syntax, `AGENTS.md` + handoff doc.
- `AgentCore`/`AgentView` → `agentCore`/`agentView`; top-bar above textbox; dropped empty-reply workbench nudge (plan-mode loop fix).
- `Test_AgentControlFlow`, `Test_AgentMultiTool`; expanded `Test_AgentUX`, `Test_AgentTools` (incl. `HighLevelNodeMap`).
Files outside agent scope
11 narrow integration touches (icons, Page route, sidebar hook, JsUtil helpers, prompt-string trim, projector/probe perform plumbing, test-runner reg, `run_tests`/`.gitignore`). The rest of `git diff origin/dev..branch` is an artifact of dev moving 421 commits since divergence at `ab5ba95da25`; there are no branch-side deletions.
Tour: new bottom-bar controls
New slash commands
Agent has ability to place probes and understand their intermittent results
Along with this, it can also place projectors.
Agent can batch multiple tools per turn
More informative collapsed tool descriptions
Cmd/Ctrl+Click on a collapsed row also jumps to the definition that was edited.
Thinking blobs now displayed as their own UI chunks
Refactored model selection UI
We can recommend some models (this requires maintaining code though :/ might need to remove).
Also added the ability to grep for models in the master view.
Stopped displaying free models first, instead added a filter toggle.
User can stop chats early (and also queue up messages)
Batched Tool Calls
Batching comes with error handling: if tool call n fails, all tool calls scheduled after n are not executed, to avoid cascading failures, rogue bugs, and unintended logic flow.
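A minimal OCaml sketch of skip-on-failure for a batch of tool calls. The `outcome` type and `run_batch` are hypothetical names for illustration, and each tool call is modeled as a thunk returning a `result`; the real agent tracks richer per-tool results.

```ocaml
type outcome = Succeeded of string | Failed of string | Skipped

(* Run calls in order; after the first failure, later calls are not
   executed and are recorded as Skipped. *)
let run_batch (calls : (unit -> (string, string) result) list) : outcome list =
  let _, rev_outcomes =
    List.fold_left
      (fun (failed, acc) call ->
        if failed then (true, Skipped :: acc)
        else
          match call () with
          | Ok v -> (false, Succeeded v :: acc)
          | Error e -> (true, Failed e :: acc))
      (false, []) calls
  in
  List.rev rev_outcomes
```

Because a skipped thunk is never invoked, any side effects of calls after the failing one do not happen, which is the property that prevents cascading failures.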