Architecture

Request Flow

sequenceDiagram
    actor User
    participant chat.py
    participant FastAPI
    participant AgentLoop
    participant LiteLLM
    participant SkillRegistry
    participant ContainerPool
    participant MemoryManager

    User->>chat.py: message
    chat.py->>FastAPI: POST /chat
    FastAPI->>AgentLoop: run(session_id, message)

    AgentLoop->>MemoryManager: load preferences + ChromaDB top-N
    AgentLoop->>LiteLLM: completion(tools, messages)
    LiteLLM-->>AgentLoop: tool_call

    AgentLoop->>SkillRegistry: execute(tool_name, params)
    SkillRegistry->>ContainerPool: execute(manifest, tool, params)
    ContainerPool->>ContainerPool: POST /execute to skill container
    ContainerPool-->>SkillRegistry: result
    SkillRegistry-->>AgentLoop: result

    AgentLoop->>AgentLoop: sanitize result
    AgentLoop->>AgentLoop: if save_search_criteria → save_preferences
    AgentLoop->>AgentLoop: score result 1-5
    AgentLoop->>MemoryManager: save_history if score >= 3
    AgentLoop->>LiteLLM: completion with tool result
    LiteLLM-->>AgentLoop: final response

    AgentLoop->>MemoryManager: save session episode (L2)
    AgentLoop-->>FastAPI: AgentResponse
    FastAPI-->>chat.py: response
    chat.py-->>User: print response

Streaming Request Flow

sequenceDiagram
    actor User
    participant Frontend
    participant FastAPI
    participant AgentLoop
    participant LiteLLM
    participant SkillRegistry

    User->>Frontend: message
    Frontend->>FastAPI: POST /chat/stream
    FastAPI->>AgentLoop: run_stream(session_id, message)

    loop until finish_reason=stop
        AgentLoop->>LiteLLM: acompletion(stream=True)
        LiteLLM-->>AgentLoop: chunk stream
        AgentLoop->>AgentLoop: accumulate tool call chunks
        AgentLoop->>SkillRegistry: execute(tool_name, params)
        SkillRegistry-->>AgentLoop: result
    end

    AgentLoop-->>FastAPI: yield token events
    FastAPI-->>Frontend: "data: {type:token,...}"
    AgentLoop-->>FastAPI: yield data event
    FastAPI-->>Frontend: "data: {type:data,...}"
    AgentLoop-->>FastAPI: yield hints event
    FastAPI-->>Frontend: "data: {type:hints,...}"
    AgentLoop-->>FastAPI: yield done event
    FastAPI-->>Frontend: "data: {type:done,...}"

Agent Loop

flowchart TD
    A[Build context] --> C[LiteLLM call]
    C -->|finish_reason: stop| E[Save session]
    C -->|finish_reason: tool_calls| F[Dispatch tool calls]
    C -->|ContextWindowExceededError| G[Reactive compact]
    G --> C
    F --> H[Sanitize result]
    H --> H1{save_search_criteria?}
    H1 -->|Yes| H2[Save preferences] --> I
    H1 -->|No| I[Score result 1-5]
    I -->|"score >= 3"| J[Save to ChromaDB]
    I -->|"score < 3"| K[Discard]
    J --> L[Append to history]
    K --> L
    L --> A
    E --> N[Save L2 episode]
    N --> O[Generate hints]
    O --> M[Return / yield done]

Memory Model

graph LR
    subgraph Identity ["Identity Layer (per skill, on disk)"]
        AM["skills/{name}/AGENT.md\nL0 — identity, hard constraints\ninjected at position 0"]
    end

    subgraph Session ["Session Layer (per conversation)"]
        SJ["sessions/id.json\nfull message history\n(source of truth)"]
        SE["sessions/id/chroma/\nL2 episode store\nolder turns indexed by embedding"]
    end

    subgraph Persistent ["Persistent Layer (per skill, cross-session)"]
        PM["preferences.md\nL1 — PREFERENCE / DECISION / OBSERVATION\nalways injected"]
        CD["skill/chroma/\nL3 — semantic history\nscore >= 3 only"]
    end

    AgentLoop["AgentLoop"] -->|"position 0"| AM
    AgentLoop -->|"last 5 turns verbatim"| SJ
    AgentLoop -->|"older turns: top-K by similarity"| SE
    AgentLoop -->|"always inject"| PM
    AgentLoop -->|"top-N by similarity"| CD
    SJ -->|"each exchange saved as episode"| SE

Container Pool Lifecycle

stateDiagram-v2
    [*] --> Warm : start() x pool_size
    Warm --> InUse : execute() checks out container
    InUse --> Recreating : tool call completes
    Recreating --> Warm : new container ready

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Architecture

Request Flow

Streaming Request Flow

Agent Loop

Memory Model

Container Pool Lifecycle

FilesExpand file tree

architecture.md

Latest commit

History

architecture.md

File metadata and controls

Architecture

Request Flow

Streaming Request Flow

Agent Loop

Memory Model

Container Pool Lifecycle