This document provides an overview of how the LLM (Language Model) and Tool System work together to create a functional AI coding assistant.
The system uses an agent coordination pattern with three main components:
- LLM acts as the brain - decides what actions to take
- Tools act as the hands - execute the actual operations
- Agent coordinates - manages the conversation and execution loop
User Request
β
βββββββββββββββββββ
β Agent Layer β β Coordinates LLM + tools
β - Executor β Builds context
β - Context β Manages loop
βββββββββββββββββββ
β
βββββββββββββββββββ
β LLM Layer β β Makes decisions
β - Client β Returns tool calls
β - Streaming β Processes results
βββββββββββββββββββ
β
βββββββββββββββββββ
β Tool Layer β β Executes actions
β - Runner β Handles permissions
β - Registry β Returns results
βββββββββββββββββββ
β
Back to LLM for synthesis
Core execution engine that:
- Manages the LLM + tool calling loop
- Handles both interactive and non-interactive modes
- Coordinates streaming responses and tool execution
- Manages conversation length through auto-compaction (see User Interface for details)
- Types: Defines message formats and OpenAI API compliance
- Validation: Ensures proper message ordering for tool calls
- Manages security controls for file system and bash operations
- For detailed permission models, approval modes, and configuration, see Permission System
- Handles tool execution with permission integration
- Manages async permission requests and retry logic
- For detailed permission workflow, see Permission System
- User Input β Agent receives prompt
- Context Building β System message + session history + project context β LLM
- LLM Response β Streaming text OR tool calls
- Tool Execution β Permission check β Execute β Return results
- Loop Continuation β Results fed back to LLM β Final response
The system automatically incorporates project-specific context through the AGENTS.md file:
- Automatic Reading: System reads
AGENTS.mdfile from project root - Context Integration: Project context is included in system prompts for LLM
- Persistent Memory:
AGENTS.mdprovides long-term project memory across sessions - Customizable: Users can update
AGENTS.mdto provide project-specific information
The system automatically manages conversation length when approaching token limits. For detailed information about auto-compaction triggers, manual commands, and usage, see User Interface.
- Streaming: Real-time LLM responses and progressive tool results
- Permission Integration: Multi-layer security with async approval flow (see Permission System)
- Auto-Compaction: Automatic token management for long conversations (see User Interface)
- Dual Mode: Interactive UI and non-interactive CLI support
- Tools - Tool system design
- Permission System - Permission handling
- User Interface - UI components, state management, and user interactions