-
Notifications
You must be signed in to change notification settings - Fork 3
Agent harness
Thang Chung edited this page Feb 9, 2026
·
3 revisions
An agent harness is the software scaffolding or "operating system" that wraps around a Large Language Model (LLM) to make it a functional, autonomous agent capable of executing tasks, managing context, and interacting with external tools. While the LLM acts as the brain (reasoning), the harness acts as the body (doing/safety).
The main components of an agent harness include:
- Tool Interaction & Execution Engine: this layer connects the model to external applications, APIs, and environments, allowing it to take action.
- Action Execution: Handles tool calling (e.g., executing code, API calls, web browsing).
- Tool Sandbox: Provides a secure environment for running tools, such as terminal access or file system management.
- Error Handling: Manages and recovers from failed tool calls or invalid output from the model.
- Context & Memory Management: the harness curates the information presented to the model, acting as its working memory.
- State Persistence: Stores intermediate steps, conversation history, and variable values to maintain continuity over long-running tasks. Context Engineering: Reduces, compacts, or compresses the prompt to fit within the model's limitations and prevent "context rot".
- Orchestration & Planning (Agent Loop): this component manages the logic of how the agent breaks down complex tasks and sequences actions.
- Goal Decomposition: Breaks high-level user goals into smaller, executable sub-tasks.
- Sub-agent Coordination: Manages communication and workflow between specialized sub-agents.
- Lifecycle Management: Handles task initiation, progress monitoring, and finalization (closing resources, saving state).
- Human-in-the-Loop (HITL) Controls
- For safety and reliability, the harness includes mechanisms for human intervention before executing critical or high-risk actions (e.g., deleting files, sending emails, deploying code).
- Prompt & Instruction Management: the harness defines the system-level rules, constraints, and instructions that govern the agent's behavior, often utilizing pre-defined prompts.
Summary Table
| Component | Responsibility |
|---|---|
| Tools/Action | File system, API connectors, browser |
| Context/Memory | Session history, state persistence, summarization |
| Orchestration | Task planning, loop control, sub-agents |
| Safety/Guardrails | Human-in-the-loop, permissioning |
Ref: https://share.google/aimode/DOpVzeiO858vZOist
- https://github.com/microsoft/vscode-copilot-chat
- https://github.com/anthropics/claude-code
- https://github.com/google-gemini/gemini-cli
- https://github.com/langchain-ai/deepagents
- https://nightlies.apache.org/flink/flink-agents-docs-main/docs/get-started/overview/
- https://github.com/Azure-Samples/Azure-Language-OpenAI-Conversational-Agent-Accelerator
- https://github.com/virattt/dexter => evals: https://github.com/virattt/dexter/tree/main/src/evals
.