Skip to content

Latest commit

Β 

History

History
569 lines (522 loc) Β· 19.8 KB

File metadata and controls

569 lines (522 loc) Β· 19.8 KB

System Architecture

Version: 0.1.1 | Last Updated: 2026-03-21

High-Level Architecture

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚                    Applications Layer                        β”‚
β”‚  (adana, adana-repl, dana-code, dana-init, dana-cli)        β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                       β”‚
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚                    Agent Layer                              β”‚
β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”β”‚
β”‚  β”‚  STARAgent (STAR Loop Implementation)                   β”‚β”‚
β”‚  β”‚  β”œβ”€ Communicator (LLM interface)                        β”‚β”‚
β”‚  β”‚  β”œβ”€ State (Conversation state)                          β”‚β”‚
β”‚  β”‚  β”œβ”€ Learner (Learning from interactions)               β”‚β”‚
β”‚  β”‚  └─ Observer (Introspection & metrics)                 β”‚β”‚
β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                       β”‚
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚                 Core Systems Layer                          β”‚
β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”β”‚
β”‚  β”‚  Resource System      Timeline         Workflows        β”‚β”‚
β”‚  β”‚  β”œβ”€ BashResource      β”œβ”€ Entry        β”œβ”€ BaseWorkflow  β”‚β”‚
β”‚  β”‚  β”œβ”€ FileIOResource    β”œβ”€ Compressor   β”œβ”€ Executor      β”‚β”‚
β”‚  β”‚  β”œβ”€ SearchResource    └─ Serializer   └─ Validation    β”‚β”‚
β”‚  β”‚  └─ CustomResources                                     β”‚β”‚
β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜β”‚
β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”β”‚
β”‚  β”‚  Memory          Prompts        Skills        Reminder  β”‚β”‚
β”‚  β”‚  β”œβ”€ STMemory     β”œβ”€ Builder     β”œβ”€ Registry  └─ Context β”‚β”‚
β”‚  β”‚  β”œβ”€ LTMemory     └─ API         └─ Executor β”‚β”‚
β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                       β”‚
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚              LLM Abstraction Layer                          β”‚
β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”β”‚
β”‚  β”‚  CodecRuntimeBase (Unified interface)                   β”‚β”‚
β”‚  β”‚  β”œβ”€ NativeToolsCodec (Claude, GPT-4 Turbo)            β”‚β”‚
β”‚  β”‚  └─ CSXMLCodec (Non-native tool use)                   β”‚β”‚
β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜β”‚
β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”β”‚
β”‚  β”‚  Provider Implementations                               β”‚β”‚
β”‚  β”‚  β”œβ”€ OpenAI           β”œβ”€ Azure OpenAI                   β”‚β”‚
β”‚  β”‚  β”œβ”€ Anthropic        β”œβ”€ Gemini                         β”‚β”‚
β”‚  β”‚  └─ Anthropic-Like   └─ Local (LLaMA/Ollama)          β”‚β”‚
β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                       β”‚
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚            Data Persistence & Infrastructure                β”‚
β”‚  β”œβ”€ Repository Layer (File, Database adapters)             β”‚
β”‚  β”œβ”€ Configuration (config.json, environment)               β”‚
β”‚  β”œβ”€ Logging (structlog)                                    β”‚
β”‚  └─ Utilities (common functions, helpers)                  β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

STAR Agent Pattern (Execution Flow)

User Input
    β”‚
    β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚  SEE (Perceive)     β”‚
β”‚  β€’ Parse intent     β”‚
β”‚  β€’ Analyze context  β”‚
β”‚  β€’ Load memory      β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
         β”‚
         β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚  THINK (Reason)     β”‚
β”‚  β€’ Prompt building  β”‚
β”‚  β€’ LLM inference    β”‚
β”‚  β€’ Tool planning    β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
         β”‚
         β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚  ACT (Execute)      β”‚
β”‚  β€’ Call resources   β”‚
β”‚  β€’ Collect results  β”‚
β”‚  β€’ Update timeline  β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
         β”‚
         β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚  REFLECT (Learn)    β”‚
β”‚  β€’ Analyze outcome  β”‚
β”‚  β€’ Update memory    β”‚
β”‚  β€’ Metrics capture  β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
         β”‚
         β–Ό
    Response

Component Interaction Diagram

STARAgent
β”œβ”€ Communicator (LLM Communication)
β”‚  β”œβ”€ Runtime (Provider abstraction)
β”‚  β”‚  └─ Codec (Tool schema conversion)
β”‚  β”‚      β”œβ”€ NativeToolsCodec
β”‚  β”‚      └─ CSXMLCodec
β”‚  β”œβ”€ Prompt Builder
β”‚  β”‚  β”œβ”€ System prompt
β”‚  β”‚  β”œβ”€ Tool schemas
β”‚  β”‚  └─ Conversation history
β”‚  └─ LLM Providers
β”‚      β”œβ”€ OpenAI
β”‚      β”œβ”€ Anthropic
β”‚      β”œβ”€ Gemini
β”‚      └─ Others
β”‚
β”œβ”€ State (Conversation State)
β”‚  β”œβ”€ Timeline (History)
β”‚  β”‚  β”œβ”€ User entries
β”‚  β”‚  β”œβ”€ Assistant entries
β”‚  β”‚  β”œβ”€ Tool results
β”‚  β”‚  └─ Compression logic
β”‚  β”œβ”€ Short-term memory
β”‚  └─ Tool cache
β”‚
β”œβ”€ Learner (Learning System)
β”‚  β”œβ”€ Long-term memory
β”‚  β”‚  β”œβ”€ Lessons
β”‚  β”‚  β”œβ”€ Episodes
β”‚  β”‚  β”œβ”€ Facts
β”‚  β”‚  └─ Patterns
β”‚  └─ Feedback processing
β”‚
└─ Observer (Metrics & Introspection)
   β”œβ”€ Token tracking
   β”œβ”€ Performance metrics
   β”œβ”€ Error logging
   └─ Debug information

Data Flow: Message Processing

User Message
    β”‚
    β–Ό
Timeline.add_entry(UserEntry)
    β”‚
    β–Ό
check_should_compress()
    β”‚
    β”œβ”€ Yes β†’ Timeline.compress()
    β”‚        (LLM-based summarization)
    β”‚
    β–Ό
Prompt.build(
    system_prompt,
    timeline_entries,
    tool_schemas,
    memory_context
)
    β”‚
    β–Ό
Runtime.complete(
    messages,
    tools
) ─ Handles provider-specific format
    β”‚
    β”œβ”€ OpenAI: native tools
    β”œβ”€ Anthropic: native tools
    └─ Others: XML-based tools
    β”‚
    β–Ό
Parse Response (ToolCalls + Text)
    β”‚
    β”œβ”€ Tool Calls
    β”‚  β”‚
    β”‚  └─ for each tool_call:
    β”‚     β”‚
    β”‚     β”œβ”€ ResourceRegistry.get(tool_name)
    β”‚     β”œβ”€ Execute with parameters
    β”‚     └─ Timeline.add_entry(ToolResultEntry)
    β”‚
    β”œβ”€ Text Content
    β”‚  β”‚
    β”‚  └─ Timeline.add_entry(AssistantEntry)
    β”‚
    β–Ό
Optional Refinement (if tool errors or incomplete)
    β”‚
    β–Ό
Learner.process_interaction(
    context,
    outcome,
    feedback
)
    β”‚
    β–Ό
Return Response to User

Resource Execution System

STARAgent
    β”‚
    └─ Resource Registry (Global)
       β”‚
       β”œβ”€ BashResource
       β”‚  β”œβ”€ execute(command)
       β”‚  └─ Tool Schema: {name, description, parameters}
       β”‚
       β”œβ”€ FileIOResource
       β”‚  β”œβ”€ read(path)
       β”‚  β”œβ”€ write(path, content)
       β”‚  └─ list_dir(path)
       β”‚
       β”œβ”€ FileEditResource
       β”‚  β”œβ”€ edit_file(path, old, new)
       β”‚  └─ Shows diff before applying
       β”‚
       β”œβ”€ SearchResource
       β”‚  β”œβ”€ search(query, num_results)
       β”‚  └─ Returns structured results
       β”‚
       β”œβ”€ WebResearchResource
       β”‚  β”œβ”€ research(topic, sources)
       β”‚  β”œβ”€ HTML extraction
       β”‚  β”œβ”€ Content synthesis
       β”‚  └─ Multi-source aggregation
       β”‚
       β”œβ”€ TaskResource
       β”‚  β”œβ”€ create(title, description)
       β”‚  β”œβ”€ update(id, status)
       β”‚  └─ list(filter)
       β”‚
       β”œβ”€ TodoResource
       β”‚  β”œβ”€ add(title)
       β”‚  β”œβ”€ complete(id)
       β”‚  └─ list()
       β”‚
       β”œβ”€ SkillResource
       β”‚  β”œβ”€ execute_skill(name, args)
       β”‚  └─ Load from .claude/skills/
       β”‚
       β”œβ”€ MCPResource
       β”‚  β”œβ”€ Call MCP servers
       β”‚  β”œβ”€ Server discovery
       β”‚  └─ Protocol handling
       β”‚
       └─ CustomResources (User-defined)
          └─ Extend BaseResource
             β”œβ”€ Annotated methods = tools
             └─ Auto schema generation

Workflow Execution

WorkflowExecutor
    β”‚
    β”œβ”€ Parse workflow definition
    β”‚  (steps, conditions, parallel tasks)
    β”‚
    β”œβ”€ For each step:
    β”‚  β”‚
    β”‚  β”œβ”€ Validate inputs
    β”‚  β”œβ”€ Execute step (CallableWorkflow)
    β”‚  β”œβ”€ Collect outputs
    β”‚  β”œβ”€ Handle errors/retries
    β”‚  └─ Pass outputs to next step
    β”‚
    β”œβ”€ Conditional branches
    β”‚  β”œβ”€ Evaluate condition
    β”‚  └─ Route to correct step
    β”‚
    β”œβ”€ Parallel execution
    β”‚  β”œβ”€ Run concurrent steps
    β”‚  └─ Gather results
    β”‚
    β–Ό
Return aggregated results

LLM Provider Architecture

Runtime (Abstract Interface)
    β”‚
    β”œβ”€ CodecRuntimeBase
    β”‚  β”‚
    β”‚  β”œβ”€ __init__(codec: ToolCodec, ...)
    β”‚  └─ async complete(messages, tools) -> Response
    β”‚
    β”œβ”€ DefaultRuntime
    β”‚  β”œβ”€ Try providers in priority order
    β”‚  └─ Fallback on provider failure
    β”‚
    β”œβ”€ ProviderSpecificRuntime (OpenAI, Anthropic, etc.)
    β”‚  └─ Provider-specific optimizations
    β”‚
    └─ Codec System
       β”‚
       β”œβ”€ NativeToolsCodec
       β”‚  β”œβ”€ Anthropic native tools
       β”‚  β”œβ”€ OpenAI function calling
       β”‚  └─ No XML conversion needed
       β”‚
       └─ CSXMLCodec
          β”œβ”€ Convert tools to XML format
          β”œβ”€ Parse XML responses
          └─ For non-native providers

Provider Configuration & Priority

config.json
{
  "llm_providers": {
    "openai": {
      "priority": 100,
      "api_key": "${OPENAI_API_KEY}",
      "models": ["gpt-4.1", "gpt-4.1-mini", "o3", "o4-mini"]
    },
    "anthropic": {
      "priority": 90,
      "api_key": "${ANTHROPIC_API_KEY}",
      "models": ["claude-sonnet-4-6", "claude-opus-4-6"]
    },
    "gemini": {
      "priority": 85,
      "api_key": "${GOOGLE_API_KEY}",
      "models": ["gemini-2.5-flash", "gemini-2.5-pro"]
    },
    "azure": {
      "priority": 40,
      "api_key": "${AZURE_OPENAI_KEY}",
      "endpoint": "${AZURE_ENDPOINT}"
    }
  }
}

DefaultRuntime tries providers in priority order (high to low).

Memory System Architecture

STARAgent
    β”‚
    β”œβ”€ Short-Term Memory (STMemory)
    β”‚  β”œβ”€ Per-session cache
    β”‚  β”œβ”€ Fast retrieval
    β”‚  └─ Location: dana/core/memory/
    β”‚
    └─ Long-Term Memory (LTMemory)
       β”œβ”€ Persistent markdown files
       β”œβ”€ 4 memory types:
       β”‚  β”œβ”€ Lesson (Key learnings)
       β”‚  β”œβ”€ Episode (Significant interactions)
       β”‚  β”œβ”€ Fact (Factual information)
       β”‚  └─ Pattern (Observed patterns)
       β”‚
       β”œβ”€ Storage: ~/.dana/memory/ or configured path
       β”œβ”€ Retrieval: Embedding-based search (optional)
       └─ Location: dana/lib/memory/

Prompt Building Pipeline

User Message + Context
    β”‚
    β”œβ”€ PromptBuilder.build()
    β”‚  β”‚
    β”‚  β”œβ”€ 1. System Prompt
    β”‚  β”‚  └─ Model instructions, capabilities
    β”‚  β”‚
    β”‚  β”œβ”€ 2. Memory Context
    β”‚  β”‚  β”œβ”€ STMemory retrieval
    β”‚  β”‚  └─ LTMemory search results
    β”‚  β”‚
    β”‚  β”œβ”€ 3. Tool Schemas
    β”‚  β”‚  β”œβ”€ Names, descriptions
    β”‚  β”‚  └─ Parameter schemas (JSON)
    β”‚  β”‚
    β”‚  β”œβ”€ 4. Conversation Timeline
    β”‚  β”‚  β”œβ”€ Recent entries (uncompressed)
    β”‚  β”‚  └─ Older entries (if compressed)
    β”‚  β”‚
    β”‚  └─ 5. Reminder Context
    β”‚     └─ Dynamic context injected
    β”‚
    β–Ό
Final Prompt β†’ LLM

Token Management & Compression

Timeline.add_entry(entry)
    β”‚
    β”œβ”€ Update token count
    β”‚
    └─ Check compression threshold
       β”‚
       β”œβ”€ If tokens < threshold * max_tokens: OK
       β”‚
       └─ If tokens >= threshold * max_tokens:
          β”‚
          β”œβ”€ Compress older entries
          β”‚  β”‚
          β”‚  └─ LLM-based summarization
          β”‚     β”œβ”€ Summarize 3-5 oldest entries
          β”‚     β”œβ”€ Replace with summary
          β”‚     └─ Maintain token budget
          β”‚
          └─ Continue conversation

Default Settings:

  • max_tokens: 4096
  • compression_threshold: 0.8 (compress at 80% usage)
  • Compression reduces tokens to ~60% of original

Compaction Parity Upgrades (Phases 1–4)

The compression pipeline now has three additional layers for parity with OpenClaude-style engines:

  1. Single-knob heuristic trigger (P3) β€” DANA_COMPACT_TRIGGER_TOKENS (default 150000, clamp [8k, 2M]) gates needs_compression(). Optional system_tokens_fn / tools_tokens_fn callbacks fold system-prompt and tools-schema size into the estimate. Always len(str)/4.
  2. Cheap client-side shrink (P6) β€” cheap_shrink_tool_results() stubs old tool_result bodies to "[cleared for context budget]" while preserving tool_call_id. Opt-in via CompressedTimelineConfig.enable_cheap_shrink_tool_results. A predictive gate skips shrink when it cannot close the token gap alone β€” prevents vacuous summaries over stubs.
  3. Reactive compact + circuit breaker (P2) β€” PromptTooLongError raised by providers is caught in llm_caller._invoke_llm_sync/async, which calls timeline.reactive_compact(attempt) (drop 5β†’10β†’20 oldest kept entries + forward-orphan pruning + full summary) with exponential backoff 1s/3s. After 3 consecutive failures the circuit opens; cooldown DANA_CIRCUIT_COOLDOWN_SECONDS (default 300s) plus half-open probe provide automatic recovery. Kill switch via DANA_DISABLE_REACTIVE_COMPACT=1.

Provider PTL mapping:

Provider Detection
Anthropic / Anthropic-like BadRequestError body type="invalid_request_error" + "prompt is too long" in message
OpenAI / Azure / Moonshot APIStatusError body code="context_length_exceeded"
Gemini No SDK error β€” post-hoc WARNING log on finish_reason=="MAX_TOKENS" (reactive compact unavailable; tune DANA_COMPACT_TRIGGER_TOKENS conservatively)

Telemetry: dana/core/timeline/telemetry.py exposes CompressionLogFields TypedDict allowlist. An AST-based unit test asserts log extra={...} keys stay within the allowlist (no prompt-content leakage).

Error Handling & Recovery

Resource Execution
    β”‚
    β”œβ”€ Try: execute tool
    β”‚  β”‚
    β”‚  β”œβ”€ Success β†’ Return result
    β”‚  β”‚
    β”‚  └─ Error:
    β”‚     β”‚
    β”‚     β”œβ”€ Catch specific exception
    β”‚     β”‚
    β”‚     β”œβ”€ Log error with context
    β”‚     β”‚
    β”‚     β”œβ”€ Decide on retry (transient vs permanent)
    β”‚     β”‚  β”œβ”€ Transient (network): retry with backoff
    β”‚     β”‚  └─ Permanent (invalid): fail immediately
    β”‚     β”‚
    β”‚     └─ Return error to agent
    β”‚        └─ Agent may retry with different params
    β”‚
    └─ Timeline.add_entry(ToolErrorEntry)

Streaming Architecture

Agent.stream_response(messages)
    β”‚
    β”œβ”€ Runtime.stream_complete(messages, tools)
    β”‚  β”‚
    β”‚  └─ Provider-specific streaming
    β”‚     (Server-Sent Events or chunked)
    β”‚
    β”œβ”€ Token-by-token yield
    β”‚
    β”œβ”€ Collect full response
    β”‚
    β”œβ”€ Parse tool calls if present
    β”‚
    └─ Timeline.add_entry(AssistantEntry)

Concurrency Model

  • Async throughout: All I/O is non-blocking
  • Tool execution: Sequential by default (preserves order)
  • Multiple agents: Can run concurrently with asyncio.gather()
  • Web research: Concurrent HTTP requests within single research task

Extension Points

1. Custom Resources

class MyResource(BaseResource):
    async def my_tool(self, param: str) -> str:
        return f"result: {param}"

2. Custom Workflows

class MyWorkflow(BaseWorkflow):
    async def execute(self, context):
        result1 = await self.step1()
        result2 = await self.step2(result1)
        return result2

3. Custom Memory Adapters

class DatabaseMemory(LTMemory):
    async def save(self, memory):
        await db.insert(memory)

4. Custom Providers

Implement provider interface + add to config.json

Performance Characteristics

Operation Latency Notes
Message processing <5s avg Depends on LLM
Tool execution Variable Depends on tool
Timeline compression 1-2s LLM-based
Memory retrieval <100ms Embedding search
Resource lookup <1ms Hash registry
Streaming first token 1-3s LLM latency

Security Architecture

  • Input Validation: Pydantic models for all inputs
  • Code Execution: Python sandbox with restricted builtins
  • Environment Secrets: Never logged, loaded from .env
  • Tool Filtering: Only allowed resources accessible
  • Command Execution: Bash sandboxing where possible

Version: 0.1.1 | Last Updated: 2026-03-21