Feat/complete tool timeline by jaikoo · Pull Request #2807 · ultraworkers/claw-code

jaikoo · 2026-04-26T17:30:19Z

No description provided.

Extract 88 format/reporting functions into format/ submodules:
- format/tool_fmt.rs: tool call/result formatting, truncation
- format/status.rs: status reports, git workspace summary
- format/model.rs: model aliases, provenance, resolution
- format/permissions.rs: permission mode parsing/reporting
- format/sessions.rs: session management, history formatting
- format/cost.rs: cost and compact reports
- format/errors.rs: error classification, suggestions
- format/slash_help.rs: help rendering, completions

main.rs reduced from 13,106 to 11,119 lines (-1,987). All 213 tests pass (179 unit + 34 e2e).

… status bar

Phase 0.2-0.6: Extracted ~7,300 lines from main.rs (13,106 → 5,776 lines)
- app.rs: LiveCli, BuiltRuntime, RuntimeMcpState, CliPermissionPrompter, CliToolExecutor, AnthropicRuntimeClient, build_runtime, run_repl
- args.rs: CliAction, CliOutputFormat, parse_args
- cli_commands.rs: 50+ subcommand runners (doctor, resume, export, diff, etc.)
- Made shared types pub(crate), fixed cross-module references

Phase 1: Added tui/status_bar.rs with StatusBarState and StatusBar
- Raw ANSI escape sequences for trait-object compatibility
- Renders model, permission mode, message count, tokens, cost, elapsed time
- Wired into consume_stream on MessageDelta (Usage) events
- Unit tests for truncation, formatting, render output

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>

…n-tree/agent-context primitives

Implements five phases of missing primitives for autonomous AI coding harness:

Phase 1: Runtime-loaded models.json config (~/.claw/models.json, .claw/models.json)
- Custom providers with base URL, API key (literal or env var), and model definitions
- Provider-prefixed lookup (e.g. ollama/llama3.1:8b) and bare ID matching
- Merged discovery: user-level + project-level providers coexist; same-key project overrides user
- Hooks into metadata_for_model, detect_provider_kind, max_tokens_for_model, model_token_limit
- Custom provider routing in ProviderClient respects api field (anthropic-messages vs openai-completions)

Phase 2: SDK crate with AgentSession, EventBus, SessionManager, ToolRegistry
- AgentSession wraps ConversationRuntime with event-driven lifecycle
- EventBus provides multi-subscriber broadcast channels for session events
- SessionManager handles CRUD for persisted sessions
- ToolRegistry and SdkToolExecutor for tool registration/execution stubs

Phase 3: Extension system with Extension trait, ExtensionRegistry, SimpleExtension

Phase 4: SessionTree with branching/forking/navigation using single-source-of-truth BTreeMap
- Children stored as ID references (not duplicated node data)
- Fork at any node, navigate between branches

Phase 5: AgentContext (thread-safe KV store), AgentTask, TaskRegistry, SessionAgent
- Inter-agent communication via shared AgentContext
- Task lifecycle management with completion/failure tracking

Includes 7 e2e tests for models_file, 27 SDK unit tests, 4 models_file unit tests. All 1,072 workspace tests pass.

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>

feat: add runtime models.json config, SDK crate, and extension/session-tree/agent-context primitives

Replace direct truncate_output_for_display calls with collapse_tool_output() from the tui/tool_panel module. Tool output now defaults to 10 visible lines (down from 60) per the TUI design doc. Adds DISPLAY_TRUNCATION_NOTICE, which is appended to collapsed output, and ToolDisplayConfig for customization.

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
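The collapse behaviour described in this commit can be sketched roughly as follows. Only the names collapse_tool_output, DISPLAY_TRUNCATION_NOTICE, ToolDisplayConfig and the 10-line default come from the commit message; the field name, the function signature, and the notice text are assumptions made for illustration.

```rust
/// Placeholder notice text (assumption: the real wording is not shown in this PR).
const DISPLAY_TRUNCATION_NOTICE: &str = "[output truncated for display]";

/// Hypothetical shape of the display config; only the default of 10 is sourced.
struct ToolDisplayConfig {
    max_visible_lines: usize,
}

impl Default for ToolDisplayConfig {
    fn default() -> Self {
        Self { max_visible_lines: 10 }
    }
}

/// Keep the first `max_visible_lines` lines; append the notice only if lines were dropped.
fn collapse_tool_output(output: &str, config: &ToolDisplayConfig) -> String {
    let total = output.lines().count();
    if total <= config.max_visible_lines {
        return output.to_string();
    }
    let mut collapsed: Vec<&str> = output.lines().take(config.max_visible_lines).collect();
    collapsed.push(DISPLAY_TRUNCATION_NOTICE);
    collapsed.join("\n")
}
```

Short output passes through unchanged, so the notice appears only when something was actually hidden.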

Reverse-engineered from the implemented codebase and pi-mono reference research. Covers all 5 phases (models.json, SDK, extensions, session tree, inter-agent comms), architectural comparison table, design decisions, remaining gaps, and test coverage summary.

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>

Add tui/permission.rs with:
- describe_tool_action(): plain-English descriptions per tool type
- format_enhanced_permission_prompt(): box-drawing borders, ANSI styling
- parse_permission_response(): parses y/n/a/v responses
- PermissionDecision enum (Allow, Deny, AllowAll, ViewInput)
- 11 unit tests

Update CliPermissionPrompter in app.rs:
- Use enhanced prompt for display
- Support 'a' (allow all): sets approve_all flag
- Support 'v' (view input): prints raw input, then re-prompts

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
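A minimal sketch of the y/n/a/v parsing named above. The enum variants are taken from the commit message; the Option return type, the whitespace trimming, and the case-insensitive matching are assumptions, not confirmed behaviour of tui/permission.rs.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum PermissionDecision {
    Allow,
    Deny,
    AllowAll,
    ViewInput,
}

/// Map a single-letter reply to a decision; anything else yields None so the
/// caller can re-prompt (assumed behaviour).
fn parse_permission_response(input: &str) -> Option<PermissionDecision> {
    match input.trim().to_ascii_lowercase().as_str() {
        "y" => Some(PermissionDecision::Allow),
        "n" => Some(PermissionDecision::Deny),
        "a" => Some(PermissionDecision::AllowAll),
        "v" => Some(PermissionDecision::ViewInput),
        _ => None,
    }
}
```

In this sketch a prompter loop would treat ViewInput as "print the raw input and ask again", matching the re-prompt behaviour the commit describes for 'v'.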

Add tui/diff_view.rs with:
- parse_unified_diff(): git unified diff parser (DiffLine enum)
- render_colored_diff(): green additions, red deletions, cyan hunk headers
- render_diff_summary(): file-level +/- counts
- format_colored_diff(): full colored diff with summary header
- DiffCounts, count_diff_lines(), count_diff_files()
- 9 unit tests

Wire into cli_commands.rs:
- render_diff_report_for() now uses format_colored_diff()
- Staged/unstaged sections get ANSI coloring

Update mock_parity_harness assertions for new prompt text.

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
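The parsing and counting steps listed above might look something like this. The names parse_unified_diff, DiffLine, DiffCounts and count_diff_lines come from the commit; the variant set, field names, and signatures are assumptions. The classifier is line-prefix based and does not track hunk state, which is a simplification.

```rust
#[derive(Debug, PartialEq, Eq)]
enum DiffLine<'a> {
    FileHeader(&'a str),
    HunkHeader(&'a str),
    Addition(&'a str),
    Deletion(&'a str),
    Context(&'a str),
}

/// Classify each line by its prefix. Simplification: a removed line whose
/// content itself starts with "--" would be mistaken for a file header.
fn parse_unified_diff(diff: &str) -> Vec<DiffLine<'_>> {
    diff.lines()
        .map(|line| {
            if line.starts_with("diff --git") || line.starts_with("+++") || line.starts_with("---") {
                DiffLine::FileHeader(line)
            } else if line.starts_with("@@") {
                DiffLine::HunkHeader(line)
            } else if let Some(rest) = line.strip_prefix('+') {
                DiffLine::Addition(rest)
            } else if let Some(rest) = line.strip_prefix('-') {
                DiffLine::Deletion(rest)
            } else {
                DiffLine::Context(line)
            }
        })
        .collect()
}

#[derive(Debug, Default, PartialEq, Eq)]
struct DiffCounts {
    additions: usize,
    deletions: usize,
}

/// Tally added and removed lines for a summary header.
fn count_diff_lines(lines: &[DiffLine<'_>]) -> DiffCounts {
    let mut counts = DiffCounts::default();
    for line in lines {
        match line {
            DiffLine::Addition(_) => counts.additions += 1,
            DiffLine::Deletion(_) => counts.deletions += 1,
            _ => {}
        }
    }
    counts
}
```

A renderer would then map each variant to a color (green, red, cyan for hunk headers, per the commit) without re-parsing the text.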

Add tui/thinking.rs with:
- ThinkingFrames: infinite cycled dot-wave animation frames (magenta)
- format_thinking_completed(): static 'Reasoned for X.Xs' line
- render_thinking_inline(): dim-colored reasoning indicator
- 5 unit tests

Update render_thinking_block_summary() in app.rs to delegate to the new module, adding magenta ANSI coloring to all thinking summaries.

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>

Replace monolithic startup_banner() with BannerStyle enum and dispatch:
- BannerStyle::Full: original ASCII art (opt-in)
- BannerStyle::Compact: 2-line banner (default)
- BannerStyle::None: empty banner (opt-out)

Add BannerStyle::from_config() for future config file integration. Add compact_banner() and full_banner() methods to LiveCli. All 220 tests pass.

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>

Add tui/terminal.rs with:
- TerminalSize: thread-safe terminal-dimension tracker
- Periodic polling (1s interval) via crossterm::terminal::size()
- invalidate() for force-refresh on next read
- AtomicU16 storage, no locks on hot path (read path is lock-free)

StatusBar already consumes terminal_width; TerminalSize provides a reusable shared instance for all TUI components.

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
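One way the lock-free tracker described above could be structured is sketched here. To stay free of external crates, the size probe is injected as a function pointer; in the real module it would presumably wrap crossterm::terminal::size(). The field names, the get() method, the 80x24 fallback, and the millisecond bookkeeping are all assumptions.

```rust
use std::sync::atomic::{AtomicBool, AtomicU16, AtomicU64, Ordering};
use std::time::{Duration, Instant};

const POLL_INTERVAL: Duration = Duration::from_secs(1);

struct TerminalSize {
    cols: AtomicU16,
    rows: AtomicU16,
    last_poll_ms: AtomicU64, // milliseconds since `started`
    stale: AtomicBool,       // set by invalidate()
    started: Instant,
    probe: fn() -> Option<(u16, u16)>, // hypothetical stand-in for the crossterm call
}

impl TerminalSize {
    fn new(probe: fn() -> Option<(u16, u16)>) -> Self {
        // Assumed fallback when the probe fails (e.g. output is not a TTY).
        let (cols, rows) = probe().unwrap_or((80, 24));
        Self {
            cols: AtomicU16::new(cols),
            rows: AtomicU16::new(rows),
            last_poll_ms: AtomicU64::new(0),
            stale: AtomicBool::new(false),
            started: Instant::now(),
            probe,
        }
    }

    /// Force the next read to re-probe the terminal.
    fn invalidate(&self) {
        self.stale.store(true, Ordering::Relaxed);
    }

    /// Return (cols, rows), re-probing at most once per interval unless invalidated.
    fn get(&self) -> (u16, u16) {
        let now_ms = self.started.elapsed().as_millis() as u64;
        let last_ms = self.last_poll_ms.load(Ordering::Relaxed);
        let due = now_ms.saturating_sub(last_ms) >= POLL_INTERVAL.as_millis() as u64;
        if self.stale.swap(false, Ordering::Relaxed) || due {
            if let Some((cols, rows)) = (self.probe)() {
                self.cols.store(cols, Ordering::Relaxed);
                self.rows.store(rows, Ordering::Relaxed);
            }
            self.last_poll_ms.store(now_ms, Ordering::Relaxed);
        }
        (self.cols.load(Ordering::Relaxed), self.rows.load(Ordering::Relaxed))
    }
}
```

Because columns and rows are stored in separate atomics, a reader racing with a resize could briefly see one old and one new value; for status-bar layout that is usually tolerable, but it is a trade-off of avoiding a lock.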

Add tui/timeline.rs with:
- ToolCallTimeline: accumulator for tool events during a turn
- ToolCallEvent: per-call metadata (step, name, timing, error, truncation)
- start_tool() / complete_tool() builder API
- render(): numbered timeline with elapsed time and line counts
- 6 unit tests

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
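As a rough illustration of the accumulator described above: the type and method names follow the commit message, while the field layout, the step-number return value, and the rendered text format are assumptions (truncation tracking and ANSI styling are omitted).

```rust
use std::time::{Duration, Instant};

struct ToolCallEvent {
    step: usize,
    name: String,
    started: Instant,
    elapsed: Option<Duration>, // None while the tool is still running
    is_error: bool,
    output_lines: usize,
}

#[derive(Default)]
struct ToolCallTimeline {
    events: Vec<ToolCallEvent>,
}

impl ToolCallTimeline {
    /// Record the start of a tool call and return its 1-based step number.
    fn start_tool(&mut self, name: &str) -> usize {
        let step = self.events.len() + 1;
        self.events.push(ToolCallEvent {
            step,
            name: name.to_string(),
            started: Instant::now(),
            elapsed: None,
            is_error: false,
            output_lines: 0,
        });
        step
    }

    /// Fill in duration, error flag, and line count for a previously started step.
    fn complete_tool(&mut self, step: usize, is_error: bool, output_lines: usize) {
        if let Some(event) = self.events.iter_mut().find(|e| e.step == step) {
            event.elapsed = Some(event.started.elapsed());
            event.is_error = is_error;
            event.output_lines = output_lines;
        }
    }

    /// Render one numbered line per tool call.
    fn render(&self) -> String {
        self.events
            .iter()
            .map(|e| {
                let status = match (e.elapsed, e.is_error) {
                    (None, _) => "running".to_string(),
                    (Some(d), true) => format!("failed after {:.1}s", d.as_secs_f64()),
                    (Some(d), false) => format!("{:.1}s, {} lines", d.as_secs_f64(), e.output_lines),
                };
                format!("{}. {} ({})", e.step, e.name, status)
            })
            .collect::<Vec<_>>()
            .join("\n")
    }
}
```

Splitting start and completion into two calls lets different components record each half, which is relevant to the later commits that share one timeline between the streaming client and the tool executor.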

Remove inherited upstream documentation (PHILOSOPHY.md, PARITY.md, USAGE.md, ROADMAP.md, prd.json, progress.txt, container.md, MODEL_COMPATIBILITY.md); all still available in git history.

Replace with project-aligned docs:
- README.md: agent-first harness overview, quick start, architecture
- docs/ROADMAP.md: 6-phase plan (SDK, agent integration, human DX, orchestration, security, developer experience)
- docs/AGENT-INTEGRATION.md: SDK usage, CLI patterns, planned RPC mode, event types, model configuration
- docs/HUMAN-DX.md: review workflows, notification strategy, auto-expiring demo deployments, tailscale integration, orchestrator interface

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>

Add tui/theme.rs with:
- Theme struct containing all semantic ANSI color constants
- DIM, SUCCESS, ERROR, HIGHLIGHT, THINKING, WARNING, MUTED, etc.
- Composite helpers: truncation_notice(), permission_border()
- 2 unit tests verifying non-empty constants and truncation notice format
- Single import point for all TUI modules to use consistent colors

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>

- Wire ToolCallTimeline into consume_stream: start_tool on ContentBlockStop, render at stream end when tools were used
- Replace direct crossterm::terminal::size() with TerminalSize tracker for periodic resize checking
- Import Theme in diff_view.rs, status_bar.rs, tool_panel.rs
- Use Theme::status_bar_fg() in StatusBar::render

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>

Implements `claw --mode rpc` (SDK layer, CLI wiring pending):
- JSON-RPC 2.0 request/response protocol over stdin/stdout
- Methods: session.create, session.turn, session.list, session.destroy
- Session tree ops: session.tree.fork, session.tree.navigate, session.tree.path
- Event subscription: events.subscribe with notification streaming
- Lifecycle: ping, shutdown
- Session tree integration: user/assistant turns tracked as tree nodes
- 11 unit tests covering round-trip RPC, error handling, shutdown

Protocol example:

    -> {"method":"session.create","params":{"model":"claude-sonnet-4-6"},"id":1}
    <- {"result":{"sessionId":"abc123"},"id":1}
    -> {"method":"session.turn","params":{"sessionId":"abc123","input":"hello"},"id":2}
    <- {"result":{"status":"completed","tokensUsed":150},"id":2}

All 1,083 workspace tests pass.

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>

- Fix tree inconsistency: only add user/assistant nodes on run_turn success, not before the call (prevents orphaned user nodes on failure)
- Fix events.subscribe no-op: store EventBus in ManagedSession and actually drain events via subscribe() instead of faking notifications
- Remove dead RpcMethod enum and default_model() (dispatch uses string matching, enum was never deserialized)
- Remove unused imports (Arc, AgentSessionEvent)

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>

Adds `claw --mode rpc` (or `claw --mode=rpc`) that starts the JSON-RPC server over stdin/stdout for agent integration. This is the primary entry point for non-Rust consumers to integrate with Claw Code.

Usage:

    echo '{"method":"session.create","params":{"model":"claude-sonnet-4-6"},"id":1}' | claw --mode rpc

Also fixes unused import warning in models_file.rs.

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>

- status_bar.rs: content.len() -> content.chars().count() for unicode-safe truncation
- permission.rs: use Theme::WARNING, Theme::DIM, Theme::permission_border()
- diff_view.rs: use Theme::SUCCESS, Theme::ERROR, Theme::HIGHLIGHT, Theme::MUTED, Theme::DIM
- timeline.rs: use Theme::MUTED, Theme::SUCCESS_BOLD, Theme::ERROR_BRIGHT, Theme::HIGHLIGHT, Theme::DIM
- thinking.rs: use Theme::THINKING in format_thinking_completed and render_thinking_inline
- All 230 tests pass

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
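The len() versus chars().count() change matters because str::len() returns bytes, and slicing a multi-byte string at a byte offset can panic or cut a character in half. A minimal sketch of character-based truncation follows; the function name echoes the truncate_str fix mentioned elsewhere in this PR, but the signature and the trailing ellipsis are assumptions.

```rust
/// Truncate to at most `max_chars` Unicode scalar values, ending with an
/// ellipsis when text was cut. Counts chars, not bytes or display columns.
fn truncate_str(content: &str, max_chars: usize) -> String {
    if content.chars().count() <= max_chars {
        return content.to_string();
    }
    if max_chars == 0 {
        return String::new();
    }
    let mut out: String = content.chars().take(max_chars - 1).collect();
    out.push('…');
    out
}
```

Counting chars is still an approximation for terminal layout: wide CJK glyphs and combining sequences occupy a different number of columns than their char count suggests, so a width-aware crate would be needed for exact alignment.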

…ble API client

- Add AgentSessionBuilder with fluent API (model, system_prompt, tools, permission_mode, api_client)
- Add BoxedApiClient type-erased wrapper for any runtime::ApiClient
- Add DummyApiClient as the default no-op client
- Refactor AgentSession to use BoxedApiClient instead of being generic
- Export new types from lib.rs
- Fix doctests to use DummyApiClient

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>

…, abort, dispose

- steer(): inject mid-turn steering messages into session
- follow_up(): queue follow-up messages for next turn
- set_model() / cycle_model(): runtime model switching with rotation
- compact(): explicit context compaction via runtime CompactionConfig
- abort(): cooperative abort signal for mid-turn cancellation
- dispose(): clean session teardown with lifecycle event emission
- All methods guard against use-after-dispose
- 11 new unit tests (51 total in SDK crate)

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>

- steer() now writes to runtime's session (via session_mut()) not stale SDK clone
- compact() applies compacted_session back to runtime and syncs SDK copy
- run_turn() now checks ensure_not_disposed() before executing
- abort() uses runtime's HookAbortSignal (wired via with_hook_abort_signal) instead of disconnected Arc<AtomicBool>
- set_model() emits SessionLifecycleEvent::ModelChanged instead of misleading Created
- dispose() clears both SDK and runtime session copies
- Add ModelChanged variant to SessionLifecycleEvent

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>

…ith Theme constants

All \x1b[...m escape sequences in tool_fmt.rs replaced with Theme::DIM, RESET, SUCCESS_BOLD, ERROR_BRIGHT, WARNING, HIGHLIGHT, ERROR, SUCCESS, MUTED, COMMAND_BG constants. 230 tests pass.

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>

…d custom handlers

- ToolHandler trait for custom tool execution (Send + Sync)
- define_tool() ergonomic builder: name, description, input/output schemas, handler
- SchemaValidator: JSON Schema validation (type, required, nested properties)
- FnToolHandler: wrap closures as ToolHandler implementations
- Enhanced ToolRegistry: register builtin stubs + custom ToolDefinitions with handlers
- Upgraded SdkToolExecutor: dispatch to custom handlers with input/output validation
- 15 new unit tests (66 total in SDK crate)

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>

- SchemaValidator now supports "integer" type (checks as_i64() for whole numbers)
- SdkToolExecutor rejects malformed JSON input when a non-trivial schema is defined
- 20 new tests covering: all JSON types (null, array, boolean, number, integer), empty/malformed schemas, deeply nested property paths, malformed JSON bypass, handler error propagation, empty tool name, builtin idempotency, builtin-after-custom conflicts, unregistered tool validation passthrough, minimal tool builds, full end-to-end pipeline, SchemaValidationError Display formatting

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>

- TreeEntry enum with 6 typed entries: Message, Compaction, Branch, ModelChange, ThinkingLevel, Custom
- SessionTreeLog: append-only JSONL persistence with automatic tree reconstruction
- build_session_context(): walk active path and collect provider-relevant entries
- Branch labels and summaries stored alongside tree structure
- fork_to_new_file(): extract ancestor subtree into independent JSONL file
- Compaction and model-change entries create tree nodes for full audit trail
- Round-trip file persistence with serde JSON serialization
- active_id() getter on SessionTree for external module access
- 10 new unit tests (95 total in SDK crate)
- Update ROADMAP.md: mark Phase 2.3 done, clarify Phase 2.4 is Python adapter work

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>

…ence

- P1: load_from_file now skips unparseable trailing lines (crash recovery) instead of failing the entire session load
- P2: ModelChange/ThinkingLevel now return error when tree has no active node instead of silently becoming zombie entries
- apply_entry now propagates all tree errors instead of silently ignoring them
- 16 new tests: serde round-trip all 6 variants, truncated line recovery, corrupted middle line, empty file, ModelChange/ThinkingLevel rejection, branch without label, compaction+branch reconstruction, multi-branch, build_context with branches and compaction, fork at root/leaf, fork with branch/custom entries, duplicate node_id behavior
- Add active_id() getter to SessionTree

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
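The crash-recovery idea in P1 (tolerate a half-written final record in an append-only JSONL file) can be sketched generically. The function name load_entries and the injected parse closure are hypothetical; the real load_from_file presumably deserializes TreeEntry values with serde. This sketch assumes only the last non-empty line is forgiven and that a corrupted middle line is still an error, which the commit message does not confirm.

```rust
/// Parse one record per non-empty line. A parse failure on the final line is
/// treated as an interrupted write and dropped; a failure anywhere else is
/// returned as an error (assumed policy).
fn load_entries<T, E>(
    contents: &str,
    parse: impl Fn(&str) -> Result<T, E>,
) -> Result<Vec<T>, E> {
    let lines: Vec<&str> = contents
        .lines()
        .filter(|line| !line.trim().is_empty())
        .collect();
    let mut entries = Vec::with_capacity(lines.len());
    for (index, line) in lines.iter().enumerate() {
        match parse(line) {
            Ok(entry) => entries.push(entry),
            Err(_) if index + 1 == lines.len() => break,
            Err(err) => return Err(err),
        }
    }
    Ok(entries)
}
```

The reasoning is that an append-only writer can only be interrupted while writing the newest record, so damage at the tail is expected after a crash, whereas damage in the middle suggests a different kind of corruption.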

Resolve conflict in main.rs by:
- Removing inline CliAction, CliOutputFormat, LocalHelpTopic, parse_args (already extracted to args.rs by Phase 0)
- Adding CliAction::Rpc variant and --mode rpc parsing to args.rs
- Keeping all new mainline files (sdk crate, docs, API changes)

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>

TUI refactor: modular tui/ module, compact banner, tool timeline, theme system

…d approval gates

- RiskLevel enum (Low/Medium/High) with ordering
- ChangeRecord/FileChange: structured change tracking with diff hunks
- ReviewGate: configurable approval gates by risk level and sensitive file patterns
- RiskClassifier: auto-classify changes based on file paths and change size
- ReviewManager: submit, approve, reject, request_changes, batch_approve
- Review history with full audit trail
- Glob matching (* and ** patterns) for sensitive file path detection
- 25 unit tests covering all components

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
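To illustrate how * and ** glob matching could drive sensitive-path detection, here is a small self-contained matcher. RiskLevel and its ordering come from the commit; glob_match, its helpers, and the classify function are hypothetical, and the real RiskClassifier also weighs change size, which this sketch ignores.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum RiskLevel {
    Low,
    Medium,
    High,
}

/// `*` matches within one path segment; `**` matches zero or more whole segments.
fn glob_match(pattern: &str, path: &str) -> bool {
    let pattern_segments: Vec<&str> = pattern.split('/').collect();
    let path_segments: Vec<&str> = path.split('/').collect();
    match_segments(&pattern_segments, &path_segments)
}

fn match_segments(pattern: &[&str], path: &[&str]) -> bool {
    match pattern.split_first() {
        None => path.is_empty(),
        Some((head, rest)) if *head == "**" => {
            // Try consuming 0..=n leading path segments with the `**`.
            (0..=path.len()).any(|skip| match_segments(rest, &path[skip..]))
        }
        Some((head, rest)) => match path.split_first() {
            Some((segment, tail)) => match_segment(head, segment) && match_segments(rest, tail),
            None => false,
        },
    }
}

fn match_segment(pattern: &str, text: &str) -> bool {
    match pattern.chars().next() {
        None => text.is_empty(),
        Some('*') => {
            // `*` may absorb any prefix of the remaining text (on char boundaries).
            let rest = &pattern[1..];
            text.char_indices()
                .map(|(index, _)| index)
                .chain(std::iter::once(text.len()))
                .any(|index| match_segment(rest, &text[index..]))
        }
        Some(c) => {
            text.starts_with(c) && match_segment(&pattern[c.len_utf8()..], &text[c.len_utf8()..])
        }
    }
}

/// Simplified classification: any sensitive-pattern hit is High, otherwise Low.
fn classify(path: &str, sensitive_patterns: &[&str]) -> RiskLevel {
    if sensitive_patterns.iter().any(|p| glob_match(p, path)) {
        RiskLevel::High
    } else {
        RiskLevel::Low
    }
}
```

Deriving Ord on the enum in declaration order gives Low < Medium < High, so a gate can be expressed as a simple comparison such as `risk >= gate_threshold`.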

- RiskClassifier: empty file list, .env variants, false positives (environment.rs), case-insensitive paths, renamed files
- ReviewManager: sequential ID generation, double-approve rejection, reject/request_changes on nonexistent, request_changes→approve history accumulation, duplicate explicit ID, all_pending vs pending_reviews distinction, batch approve edge cases
- ReviewGate: custom low-risk gate with sensitive path, no-match gate
- Glob matching: ? wildcard, empty patterns, exact match, case sensitivity, multi-** segments
- Serde round-trips: RiskLevel, Decision, FileChangeType, ReviewGate
- Decision display formatting

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>

- Add ProviderKind::DeepSeek, Ollama, Qwen, Vllm variants
- Add ProviderClient::DeepSeek, Ollama, Qwen, Vllm variants
- Add OpenAiCompatConfig::deepseek(), ollama(), qwen(), vllm() constructors
- Add MODEL_REGISTRY entries for deepseek-chat, deepseek-reasoner, deepseek-r1
- Route deepseek*, ollama/*, qwen/*, vllm/* prefixes to new providers
- Change qwen/ prefix to route to external Qwen (non-DashScope)
- Keep bare qwen-* routing to DashScope for backward compat
- Add reasoning_content field to ChunkDelta for DeepSeek R1 thinking
- Handle reasoning/thinking blocks in StreamState (ingest_chunk, finish)
- Add deepseek-reasoner to is_reasoning_model()
- Update strip_routing_prefix() for new provider prefixes
- Add env-var-based detection in detect_provider_kind()
- Add 15 new tests covering aliases, routing, reasoning, configs
- Update exhaustive matches in app.rs and format/model.rs

…scaffolding

- Add startup_banner: Option<String> to RuntimeFeatureConfig in runtime crate
- Add parse_optional_startup_banner() parser for 'startupBanner' config key
- Add RuntimeConfig::startup_banner() accessor
- Wire BannerStyle::from_config() in run_repl() (reads settings.json)
- Migrate full_banner() and compact_banner() to use Theme::DIM/Theme::RESET
- Add None arg to run_repl() call sites for new startup_banner parameter
- 230 tests pass, runtime crate compiles clean

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>

- 5 integration tests (provider_client_integration): DeepSeek routing, missing creds, Ollama prefix, vLLM prefix, Qwen external prefix
- 1 e2e streaming test (openai_compat_integration): verify reasoning_content produces Thinking blocks with correct indices
- Fix from_env() to skip credential check for no-auth providers (Ollama, vLLM) when api_key_env is empty

- New sdk::setup with provider detection, tool detection, SetupReport, session templates
- Fix TOCTOU race in DetectedProvider::check
- Add DeepSeek, Ollama, vLLM, Qwen providers
- Add 9 new edge-case tests (env var present/empty, render branches, full serde equality)
- Fix pre-existing clippy: map+unwrap_or -> is_ok_and, Duration::from_hours

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>

- NotificationSink trait: ConsoleSink, FileSink, WebhookSink, EmailSink
- Severity enum with 5 levels: Debug → Critical
- EventType enum covering agent lifecycle and review events
- NotificationDispatcher with per-sink filters (severity, event type, tags, exclusions)
- Generic Notification with builder pattern (with_tag, with_payload)
- Full serde support for all types
- 14 unit tests: filtering, severity ordering, file JSONL, sink dispatch, round-trip, edge cases

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
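The per-sink filtering described above could be shaped along these lines. NotificationSink, ConsoleSink, Severity, NotificationDispatcher, and with_tag are names from the commit; the three middle severity names, the SinkFilter struct, the method signatures, and the delivered-count return value are assumptions. Event-type filters, payloads, and serde support are left out.

```rust
/// Five levels from Debug to Critical; the middle names are assumed.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Severity {
    Debug,
    Info,
    Warning,
    Error,
    Critical,
}

struct Notification {
    severity: Severity,
    message: String,
    tags: Vec<String>,
}

impl Notification {
    fn new(severity: Severity, message: &str) -> Self {
        Self { severity, message: message.to_string(), tags: Vec::new() }
    }

    /// Builder-style tag attachment.
    fn with_tag(mut self, tag: &str) -> Self {
        self.tags.push(tag.to_string());
        self
    }
}

trait NotificationSink {
    fn send(&mut self, notification: &Notification);
}

struct ConsoleSink;

impl NotificationSink for ConsoleSink {
    fn send(&mut self, notification: &Notification) {
        println!("[{:?}] {}", notification.severity, notification.message);
    }
}

/// Hypothetical per-sink filter: minimum severity plus excluded tags.
struct SinkFilter {
    min_severity: Severity,
    excluded_tags: Vec<String>,
}

impl SinkFilter {
    fn accepts(&self, notification: &Notification) -> bool {
        notification.severity >= self.min_severity
            && !notification.tags.iter().any(|tag| self.excluded_tags.contains(tag))
    }
}

#[derive(Default)]
struct NotificationDispatcher {
    sinks: Vec<(SinkFilter, Box<dyn NotificationSink>)>,
}

impl NotificationDispatcher {
    fn add_sink(&mut self, filter: SinkFilter, sink: Box<dyn NotificationSink>) {
        self.sinks.push((filter, sink));
    }

    /// Deliver to every sink whose filter accepts; return how many received it.
    fn dispatch(&mut self, notification: &Notification) -> usize {
        let mut delivered = 0;
        for (filter, sink) in self.sinks.iter_mut() {
            if filter.accepts(notification) {
                sink.send(notification);
                delivered += 1;
            }
        }
        delivered
    }
}
```

Keeping the filter next to each sink, rather than inside it, means a noisy console sink and a strict webhook sink can share one dispatcher without either knowing about the other's thresholds.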

…n validation

1. Token limits: add deepseek-chat/deepseek-reasoner to model_token_limit() (8,192 max output, 131,072 context window per DeepSeek API docs)
2. Pricing: add deepseek-chat ($0.27/$1.10 per M tokens) and deepseek-reasoner ($0.55/$2.19 per M tokens) to pricing_for_model()
3. README: add Built-in Providers table with all 8 providers, env vars, model prefixes, auto-detection order, and usage examples
4. models.json validation: validate the api field against known values (openai-completions, anthropic-messages, deepseek, ollama, qwen, vllm) in both load_custom_models() and load_and_merge_custom_models()

- Add tests: token limits, pricing (3 tests), api field validation (2 tests)

- Add base_url_fallback_env field to OpenAiCompatConfig for Qwen to fall back to OPENAI_BASE_URL when QWEN_BASE_URL is not set
- Add streaming e2e test for reasoning-only stream (thinking → tools, no text content) verifying correct index offsets
- Add unit tests for Qwen base URL fallback behavior
- Update all config constructors with new base_url_fallback_env field
- Update read_base_url() to check fallback env var
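The fallback order described above (primary variable, then fallback variable, then a built-in default) can be sketched as below. The name read_base_url and the two environment variable names come from the commit; the signature, the injected lookup closure, the placeholder default URL, and the choice to treat an empty value as unset are assumptions.

```rust
/// Resolve a base URL: primary env var, then optional fallback env var, then default.
/// `lookup` is injected so the logic can be tested without touching the process
/// environment; production code could pass `|key| std::env::var(key).ok()`.
fn read_base_url(
    primary_env: &str,
    fallback_env: Option<&str>,
    default_url: &str,
    lookup: impl Fn(&str) -> Option<String>,
) -> String {
    // Assumption: an empty value counts as "not set".
    let non_empty = |key: &str| lookup(key).filter(|value| !value.trim().is_empty());
    non_empty(primary_env)
        .or_else(|| fallback_env.and_then(|key| non_empty(key)))
        .unwrap_or_else(|| default_url.to_string())
}
```

Passing the lookup as a parameter also avoids tests mutating shared process state, which can make environment-dependent tests flaky when run in parallel.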

feat: startup_banner config, Theme migration in app.rs

- Add SharedToolCallTimeline (Arc<Mutex<ToolCallTimeline>>) for shared access between streaming client and tool executor
- Add tool_timeline field to CliToolExecutor + constructor param
- Wire complete_tool() in execute(): records duration, error, truncation, and line count on each tool result
- Update all CliToolExecutor::new() call sites with None arg
- Re-export SharedToolCallTimeline from tui/mod.rs
- 230 tests pass

Note: passing an active timeline from consume_stream to the executor requires threading through build_runtime -> ConversationRuntime which crosses the runtime crate boundary (separate PR).

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>

…ecutor

- Create SharedToolCallTimeline once in build_runtime_with_plugin_state()
- Pass clone to both AnthropicRuntimeClient and CliToolExecutor
- Wire start_tool() via self.tool_timeline in consume_stream() (instead of a local ToolCallTimeline that was invisible to the executor)
- complete_tool() already wired in CliToolExecutor::execute() from PR #4
- Remove unused set_timeline() method since timeline is now passed at construction time
- Remove unused ToolCallTimeline import from app.rs
- 230 tests pass

code-yeongyu and others added 30 commits April 25, 2026 18:10

Merge pull request #1 from deep-thinking-llc/feat/sdk-and-models-config

a389cba

feat: add runtime models.json config, SDK crate, and extension/session-tree/agent-context primitives

fix: truncate_str byte-length -> char-count for unicode safety

11a557b

Add MIT license

1df2271

jaikoo and others added 13 commits April 26, 2026 02:20

Merge pull request #2 from deep-thinking-llc/feat/tui-refactor

bdbf0b3

TUI refactor: modular tui/ module, compact banner, tool timeline, theme system

Merge pull request #3 from deep-thinking-llc/feat/tui-refactor

6b34b35

feat: startup_banner config, Theme migration in app.rs

jaikoo wants to merge 43 commits into ultraworkers:main from deep-thinking-llc:feat/complete-tool-timeline

jaikoo commented Apr 26, 2026


2 participants