Contributing to ChatUI

Thank you for contributing! This guide covers coding conventions, log-level policy, design patterns, and the pre-submit checklist. Please read it before opening a PR.

Module Architecture Overview
- Package Map
- Request Lifecycle
Log Level Decision Guide
Package Façade Pattern
Protocol Pattern for Module Boundaries
Design Patterns in Use
Bug Prevention Checklist

Module Architecture Overview

Package Map

ChatUI's backend is organized into 13 decomposed packages under lib/, each using the Package Façade Pattern. The architecture separates request-handling hot-path modules from domain-specific computation packages.

lib/
├── __init__.py              ← Root config (API keys, model names, env vars)
├── _pkg_utils.py            ← Shared façade utilities (build_facade, safe_import, …)
├── protocols.py             ← Protocol interfaces for all module boundaries
├── log.py                   ← Centralized logging (get_logger, log_context, audit_log, …)
├── llm_client.py            ← LLM API communication (body building, SSE parsing) [HOT_PATH]
├── search.py                ← Web search execution [HOT_PATH]
│
├── tasks_pkg/               ← Task orchestration system
│   ├── orchestrator.py      ← Main LLM↔tool loop [HOT_PATH]
│   ├── executor.py          ← Tool dispatch registry [HOT_PATH]
│   ├── tool_dispatch.py     ← Tool parsing + parallel execution [HOT_PATH]
│   ├── compaction.py        ← Context window compaction [HOT_PATH]
│   ├── model_config.py      ← Model config lookup [HOT_PATH]
│   ├── manager.py           ← Task CRUD, SSE events, persistence
│   ├── endpoint.py          ← Autonomous work→review→revise loop
│   └── ...
│
├── llm_dispatch/            ← Multi-key, multi-model LLM dispatcher
│   ├── slot.py              ← Slot dataclass (api_key × model)
│   ├── config.py            ← Default slot configurations
│   ├── dispatcher.py        ← LLMDispatcher (pool management)
│   ├── factory.py           ← DispatcherFactory + get_dispatcher()
│   └── api.py               ← High-level: dispatch_chat, smart_chat
│
├── fetch/                   ← Content fetching + extraction
│   ├── utils.py             ← HTTP session, user-agent rotation
│   ├── http.py              ← Raw HTTP + retry logic
│   ├── html_extract.py      ← Readability-based text extraction
│   ├── pdf_extract.py       ← PDF text extraction
│   ├── playwright_pool.py   ← Browser pool (optional dep)
│   └── core.py              ← Main entry point: fetch_page_content [HOT_PATH]
│
├── trading/                    ← Trading portfolio advisor
│   ├── _common.py           ← TradingClient (shared HTTP session, proxy)
│   ├── nav.py               ← NAV fetching / caching / history
│   ├── info.py              ← Fund info, search, fee calculation
│   ├── strategy_data.py     ← Built-in strategy definitions
│   ├── intel.py             ← Intelligence CRUD + crawling (optional)
│   └── backtest.py          ← Backtesting (optional)
│
├── trading_strategy_engine/    ← Pure-computation strategy engine (no LLM)
│   ├── strategy.py          ← Strategy Protocol/ABC, registry
│   ├── signals.py           ← Multi-timeframe signal confirmation
│   ├── risk_metrics.py      ← Risk-adjusted performance metrics
│   ├── ensemble.py          ← Ensemble strategy backtesting (optional)
│   ├── monte_carlo.py       ← Monte Carlo simulation (optional)
│   ├── optimization.py      ← Walk-forward optimization (optional)
│   ├── portfolio.py         ← Portfolio construction (optional)
│   └── pipeline.py          ← Full analysis pipeline (optional)
│
├── trading_backtest_engine/    ← Event-driven backtesting engine
│   ├── config.py            ← Default constants
│   ├── state.py             ← Portfolio state management
│   ├── strategies.py        ← Strategy implementations
│   ├── reporting.py         ← Metrics computation
│   ├── engine.py            ← Core simulation engine
│   ├── validation.py        ← Walk-forward validation (optional)
│   ├── comparison.py        ← Multi-strategy comparison (optional)
│   └── analysis.py          ← Bias verification (optional)
│
├── trading_autopilot/          ← Autonomous trading analyst robot
│   ├── _constants.py        ← Thresholds & timing
│   ├── correlation.py       ← Intelligence cross-correlation (optional)
│   ├── strategy_evolution.py← Strategy self-improvement (optional)
│   ├── kpi.py               ← Pre-backtest scoring (optional)
│   ├── reasoning.py         ← LLM mega-prompt builder (optional)
│   ├── cycle.py             ← Full autopilot cycle runner (optional)
│   ├── scheduler.py         ← Periodic toggle (optional)
│   └── outcome.py           ← Recommendation outcome tracker (optional)
│
├── tools/                   ← Tool execution definitions
│   ├── search.py            ← web_search, site_search
│   ├── browser.py           ← Browser automation tools
│   ├── code_exec.py         ← Code execution tools
│   ├── conversation.py      ← Conversation reference tools
│   ├── meta.py              ← Dynamic tool list assembly
│   └── project.py           ← Project-mode file tools
│
├── swarm/                   ← Multi-agent orchestration
│   ├── artifact_store.py    ← Inter-agent data sharing [HOT_PATH]
│   └── ...
│
├── scheduler/               ← Task scheduler
│   ├── cron.py              ← Cron expression parser
│   ├── executor.py          ← Task execution handlers
│   ├── manager.py           ← SchedulerManager class
│   └── tool_defs.py         ← Scheduler tool schemas
│
└── feishu/                  ← Feishu (Lark) Bot integration
    ├── _state.py            ← Per-user state, locks, config
    ├── conversation.py      ← Chat history, DB sync (optional)
    ├── messaging.py         ← Lark API message sending (optional)
    ├── pipeline.py          ← LLM task pipeline (optional)
    ├── commands.py          ← Slash command registry (optional)
    ├── events.py            ← Event handlers (optional)
    └── startup.py           ← WebSocket connection (optional)

Request Lifecycle

A user message flows through the hot-path modules in this order:

Client POST /api/chat/start
       │
       ▼
 ┌─ orchestrator.py ─────────────────────────────────────┐
 │  1. _resolve_model_config()    ← model_config.py      │
 │  2. _inject_system_contexts()  ← system_context.py    │
 │  3. _prefetch_user_urls()      ← executor.py          │
 │  4. ┌─ TOOL LOOP ───────────────────────────────────┐ │
 │     │ a. build_body()          ← llm_client.py       │ │
 │     │ b. stream_llm_response() ← manager.py          │ │
 │     │ c. parse_tool_calls()    ← tool_dispatch.py    │ │
 │     │ d. execute_tool_pipeline()                      │ │
 │     │    └─ _execute_tool_one() ← executor.py        │ │
 │     │       ├─ perform_web_search() ← search.py      │ │
 │     │       ├─ fetch_page_content() ← fetch/core.py  │ │
 │     │       └─ [other tool handlers]                  │ │
 │     │ e. compact_messages_fast() ← compaction.py     │ │
 │     └─ REPEAT until model says "stop" ───────────────┘ │
 │  5. persist_task_result()       ← manager.py           │
 └────────────────────────────────────────────────────────┘
       │
       ▼
 Client polls /api/chat/stream/<id> for SSE events

Every module in this chain is marked # HOT_PATH and must follow the Hot-Path Rules.

Log Level Decision Guide

Use this table to select the correct log level for new code. Misuse of log levels pollutes production logs and obscures real issues.

Decision Flowchart

Is this a per-request / per-item event?
  YES → Is it useful ONLY during active debugging?
    YES → logger.debug(...)
    NO  → Is it a recoverable/expected failure?
      YES → logger.warning(...)  [no exc_info unless traceback adds value]
      NO  → logger.error(..., exc_info=True)
  NO → Is it a one-time startup / config event?
    YES → logger.info(...)
    NO  → Is it a periodic background event (e.g. scheduled task)?
      YES → logger.info(...)  [at most once per cycle]
      NO  → logger.debug(...)

Level Definitions

Level	When to Use	Example
`DEBUG`	Per-request traces, tool arguments, retry attempts, individual trading data fetches, loop iterations, parsed values, cache hits/misses	`logger.debug('Fetched %d bytes from %s', n, url)`
`INFO`	Server startup, config changes, new conversation created, scheduled task registered, periodic summaries (1/cycle)	`logger.info('Server started on port %d', port)`
`WARNING`	Recoverable errors (network retry succeeded), degraded mode, missing optional deps, fallback triggered, resilient-import failures	`logger.warning('Proxy failed, retrying direct')`
`ERROR`	Unrecoverable failures requiring attention — LLM call failed all retries, DB corruption, critical service unavailable; always add traceback	`logger.error('DB write failed', exc_info=True)`
`CRITICAL`	System-level failures preventing continued operation	`logger.critical('Cannot open database')`

Quick-Reference Decision Table

Scenario	Level	`exc_info`	Why
Logging each tool argument before dispatch	`DEBUG`	No	Per-request detail, only useful when debugging
Individual asset price fetch from API	`DEBUG`	No	Per-item data retrieval, high frequency
Retry attempt N of M for an HTTP request	`DEBUG`	No	Per-attempt trace; the final outcome uses WARNING/ERROR
Cache hit on fetch URL	`DEBUG`	No	High frequency, operational detail only
LLM body construction details	`DEBUG`	No	Hot-path, per-request
SSE chunk parsed from LLM stream	`DEBUG`	No	Very high frequency, never useful except deep debugging
Flask server started on port 5001	`INFO`	No	One-time startup event
New conversation created	`INFO`	No	Significant state change, low frequency
Scheduled task registered	`INFO`	No	One-time lifecycle event
Config value changed via admin API	`INFO`	No	Operational event worth auditing
Autopilot cycle started/completed	`INFO`	No	Periodic summary, once per cycle
Network request failed but retry succeeded	`WARNING`	No	Recovered — no action needed, but worth noting
Optional dependency missing (e.g. playwright)	`WARNING`	`True`	Feature degraded but system operational
Resilient import failed in façade	`WARNING`	`True`	Package continues without this sub-module
Running in degraded mode (no proxy)	`WARNING`	No	Operator should be aware of reduced capability
JSON parse error on user data (recovered)	`WARNING`	`True`	Traceback helps diagnose malformed data
LLM API call failed after all retries	`ERROR`	`True`	Unrecoverable — operator needs to investigate
Database corruption detected	`ERROR`	`True`	Data integrity issue requiring attention
Critical external service unreachable	`ERROR`	`True`	System cannot fulfil core function
Max tool rounds exhausted (infinite loop?)	`WARNING`	No	Operational issue but task completes with partial result
`finish_reason` is `None` from LLM	`ERROR`	No	Stream likely interrupted — needs investigation

Formatting Rules

Always use %-style formatting (lazy evaluation):

# ✅ Correct — args only formatted if level is enabled
logger.info('Fetched %s in %.1fs', url, elapsed)

# ❌ Wrong — f-string always evaluates, wastes CPU on hot path
logger.info(f'Fetched {url} in {elapsed:.1f}s')

Include identifiers for grep-ability: task ID, conversation ID, model name, URL.
Truncate large data: logger.debug('Response preview: %.500s', body)
Sanitize secrets: never log API keys, tokens, or full auth headers.
Structured prefix for easy grepping: [LLM], [ext:service], [op:name], [rid:xxxx].

Hot-Path Rules

Convention: Any module whose functions are called on the per-request path MUST include a module-level comment # HOT_PATH as the very first line of the file (before the module docstring). This is machine-grep-able and keeps the list of hot-path modules self-documenting.

# HOT_PATH — functions in this module are called per-request.
# Do NOT use logger.info() here; use logger.debug() instead.
"""Tool execution — unified dispatch for all tool types."""

from __future__ import annotations
...

Discovering hot-path modules — grep, don't maintain a hardcoded list:

# List all hot-path modules
grep -rln '# HOT_PATH' lib/ | sort

CI enforcement — fail the build if any # HOT_PATH module uses logger.info():

#!/usr/bin/env bash
hot_files=$(grep -rln '# HOT_PATH' lib/)
if [ -n "$hot_files" ]; then
  echo "$hot_files" | xargs grep -n 'logger\.info' \
    && { echo "FAIL: logger.info() found in HOT_PATH module(s)"; exit 1; }
fi

Rule of thumb: if you are inside a # HOT_PATH module, always reach for logger.debug(). Use logger.info() only for true one-time events (server boot, config load) — and those belong in non-hot-path modules.

Known Hot-Path Modules (reference)

The canonical list is always grep -rln '# HOT_PATH' lib/, but these are the modules known at time of writing:

Module	Role in Request Path
`lib/tasks_pkg/orchestrator.py`	Main LLM↔tool loop — entry point per request
`lib/tasks_pkg/executor.py`	Tool execution dispatch — called per tool call
`lib/tasks_pkg/tool_dispatch.py`	Tool argument parsing + parallel execution
`lib/tasks_pkg/compaction.py`	Context compaction — called during long conversations
`lib/tasks_pkg/model_config.py`	Model config lookup — called per LLM request
`lib/llm_client.py`	LLM API communication — called per LLM request
`lib/search.py`	Search execution — called per search tool invocation
`lib/fetch/core.py`	Content fetching — called per fetch_url tool call
`lib/swarm/artifact_store.py`	Inter-agent artifact I/O — called per agent step

If you add a new module to the per-request path, add # HOT_PATH at the top.

Traceback Rules (`exc_info=True`)

Always add exc_info=True to logger.error() calls inside except blocks — the traceback is the single most valuable piece of debugging information. No exceptions to this rule.

# ✅ CORRECT — error in except block always gets exc_info
try:
    data = complex_operation()
except Exception as e:
    logger.error('complex_operation failed: %s', e, exc_info=True)
    raise

Add exc_info=True to logger.warning() when the exception itself is the primary news (e.g., unexpected parse failure where the traceback helps diagnosis, or a resilient-import failure in a façade).
```
# ✅ When the traceback adds diagnostic value
except json.JSONDecodeError as e:
    logger.warning('Unexpected JSON in DB row %s: %s', row_id, e, exc_info=True)
```

Omit exc_info=True from logger.warning() when recovery succeeded and the warning just notes the fallback (e.g., "Retried with backup model").

# ✅ Recovered — traceback would be noise
except ConnectionError:
    logger.warning('Primary API down, falling back to backup endpoint')

Never add exc_info=True to logger.debug() — tracebacks in debug logs create noise without benefit.

Log Infrastructure

The project uses lib/log.py as the centralized logging module. Every Python file must use it:

from lib.log import get_logger
logger = get_logger(__name__)

Key utilities provided by lib/log.py:

Utility	Purpose	When to Use
`get_logger(name)`	Get a named logger (usually `__name__`)	Every module — top-level constant
`log_exception()`	`logger.error(msg, exc_info=True)` shorthand	Catch-and-reraise blocks
`log_context(op)`	Context manager: logs start/end/duration/exception	Any operation > 1 second
`log_external()`	Context manager for external API calls with timing	HTTP calls to third-party services
`log_route()`	Decorator for Flask route handlers	All route handler functions
`log_suppressed()`	Log a swallowed exception with context	Guard clauses with `except: pass`
`audit_log()`	Structured JSON audit entry to `logs/audit.log`	Security events, config changes
`set_req_id()`	Set per-request correlation ID (middleware)	Called by request middleware
`req_id()`	Get current request ID	Prefix logs for correlation

Log file layout (configured in server.py):

File	Content	Rotation
`logs/app.log`	Business logic (lib., routes., server) INFO+	Daily, 30 days
`logs/access.log`	HTTP request log (werkzeug) INFO+	Daily, 14 days
`logs/error.log`	WARNING/ERROR/CRITICAL from all sources	5 MB × 10 backups
`logs/vendor.log`	Third-party libraries WARNING+ only	5 MB × 3 backups
`logs/audit.log`	Structured JSON audit trail	Append-only

Package Façade Pattern

When decomposing a monolith lib/X.py into lib/X/ (a package with sub-modules), follow these conventions to maintain a clean public API.

Principles

lib/X/__init__.py is the façade — consumers import from lib.X, never from lib.X.submodule directly.
Each sub-module defines __all__ listing its public API.
The façade uses helpers from lib/_pkg_utils.py for consistency and resilient import handling.
Core sub-modules (required for basic operation) import directly — failures propagate immediately.
Optional sub-modules (features that can be disabled) are wrapped in try/except — failures are logged as warnings and the rest of the package continues to work.

Anatomy of a Façade

Every package __init__.py under lib/ follows the same structure:

"""lib/X/ — Package description (Package Façade)

Sub-modules:
  core           — Main functionality
  config         — Configuration constants
  optional_feat  — Optional feature (graceful degradation)
"""

import logging
from lib._pkg_utils import build_facade

_logger = logging.getLogger(__name__)

__all__: list[str] = []

# ── Core modules (always required — let exceptions propagate) ────
from . import core, config  # noqa: E402
from .core import *          # noqa: F401,F403
from .config import *        # noqa: F401,F403

build_facade(__all__, core, config)

# ── Optional modules (graceful degradation) ──────────────────────
try:
    from . import optional_feat  # noqa: E402
    from .optional_feat import *  # noqa: F401,F403
    build_facade(__all__, optional_feat)
except Exception as _exc:
    _logger.warning('lib.X.optional_feat failed to load — feature disabled: %s',
                    _exc, exc_info=True)

What build_facade() does: reads each module's __all__ and appends those names to the package-level __all__. Combined with the from .X import * statements, this ensures consumers can do from lib.X import any_public_name.

Core vs. Optional Sub-modules

Category	Import Style	On Failure	Use When
Core	Direct `from .X import *`	Exception propagates → package fails	Sub-module is essential (e.g. NAV fetching)
Optional	`try/except` around `from .X import *`	Warning logged → package continues	Feature can be disabled (e.g. Playwright)

Real examples from the codebase:

Package	Core Sub-modules	Optional Sub-modules
`lib/trading`	`_common`, `nav`, `info`, `strategy_data`	`intel`, `backtest`
`lib/fetch`	`utils`, `http`, `html_extract`, `pdf_extract`, `core`	`playwright_pool`
`lib/trading_strategy_engine`	`strategy`, `signals`, `risk_metrics`	`ensemble`, `monte_carlo`, `optimization`, `portfolio`, `pipeline`
`lib/trading_backtest_engine`	`config`, `state`, `strategies`, `reporting`, `engine`	`validation`, `comparison`, `analysis`
`lib/trading_autopilot`	`_constants`	`correlation`, `strategy_evolution`, `kpi`, `reasoning`, `cycle`, `scheduler`, `outcome`
`lib/llm_dispatch`	(none — all guarded)	`config`, `slot`, `factory`, `dispatcher`, `api`
`lib/feishu`	`_state`	`conversation`, `messaging`, `pipeline`, `commands`, `events`, `startup`
`lib/tools`	All: `search`, `browser`, `code_exec`, `conversation`, `meta`, `project`	(none)
`lib/scheduler`	All: `cron`, `executor`, `manager`, `tool_defs`	(none)

The `lib/_pkg_utils.py` Toolkit

Function	Purpose	Use Case
`build_facade()`	Extends package `__all__` from multiple modules' `__all__` lists	After core `from .X import *` statements
`safe_star_import()`	Programmatic `from module import *` into caller's namespace + `__all__`	When you need finer control
`extend_all()`	Appends one module's `__all__` to package `__all__`	Single module, `__all__` only (no import)
`resilient_import()`	`importlib.import_module` + `safe_star_import` in try/except	Importing by fully-qualified name
`safe_import()`	Returns a `_SafeImporter` callable bound to a specific package	Most concise pattern for optional imports
`facade_imports()`	Batch-imports multiple optional sub-modules in one call	Many optional modules at once
`validate_all()`	CI helper — checks all names in `__all__` actually exist in namespace	Pre-submit CI check

Concise alternative pattern using safe_import():

from lib._pkg_utils import build_facade, safe_import

__all__: list[str] = []
_import = safe_import(__name__, globals(), __all__)

# Core (must load)
from . import core, config
from .core import *
from .config import *
build_facade(__all__, core, config)

# Optional — one line each, independent (one failure doesn't block others)
_import('optional_a', 'feature A')
_import('optional_b', 'feature B')

Batch pattern using facade_imports():

from lib._pkg_utils import build_facade, facade_imports

__all__: list[str] = []

from . import core
from .core import *
build_facade(__all__, core)

# Optional — all in one call
facade_imports(__name__, globals(), __all__, [
    ('optional_a', 'feature A description'),
    ('optional_b', 'feature B description'),
])

Adding a New Package

When decomposing lib/monolith.py into lib/monolith/:

Create the directory with sub-modules, each defining __all__.
Create __init__.py following the façade pattern above.
Classify sub-modules as core (direct import) or optional (try/except).
Use build_facade() to aggregate __all__ lists.
Test imports: python -c "from lib.monolith import *; print(__all__)" — the public API should be identical to the old monolith.
Keep consumer imports unchanged: from lib.monolith import func must still work.

Lazy-Loading Variant (`lib/tasks_pkg`)

lib/tasks_pkg uses a different pattern optimized for startup time: only the lightweight manager module is imported eagerly (it's used by route registration at import time). Heavy modules like orchestrator, executor, and endpoint are loaded on first access via __getattr__:

# lib/tasks_pkg/__init__.py
__all__ = ['tasks', 'create_task', 'run_task', ...]

# Eager — lightweight, needed at import time
from lib.tasks_pkg.manager import tasks, create_task, append_event, ...

# Lazy — loaded on first access
_LAZY_MAP = {
    'run_task':        ('lib.tasks_pkg.orchestrator', 'run_task'),
    'ToolRegistry':    ('lib.tasks_pkg.executor',     'ToolRegistry'),
    'run_endpoint_task':('lib.tasks_pkg.endpoint',    'run_endpoint_task'),
    ...
}

def __getattr__(name):
    if name in _LAZY_MAP:
        module_path, attr_name = _LAZY_MAP[name]
        import importlib
        mod = importlib.import_module(module_path)
        val = getattr(mod, attr_name)
        globals()[name] = val  # Cache — __getattr__ only called once
        return val
    raise AttributeError(...)

Use this variant when a package has both:

Symbols needed at import time (e.g., task dict, locks)
Heavy modules that can be deferred (e.g., executor with many deps)

CI Validation

Verify façade completeness in CI:

from lib._pkg_utils import validate_all
import lib.tools

missing = validate_all(lib.tools)
assert not missing, f"Stale __all__ entries: {missing}"

Protocol Pattern for Module Boundaries

Why Protocols

The project uses typing.Protocol (PEP 544) for structural subtyping at high-coupling module boundaries. This enables:

No hard imports — consumers don't import the concrete module → avoids circular imports, reduces coupling.
Testability — mock objects satisfying the protocol can be injected without monkeypatching module globals.
Documented contracts — the expected interface is explicit in type annotations, not implicit in "I happen to call these methods."

All protocols live in one file: lib/protocols.py. This is deliberate — it's the single location for all boundary contracts, importable by any module without pulling in transitive dependencies.

Protocol Catalogue

Protocol	Methods	Satisfied By	Used By
`LLMService`	`chat()`, `stream()`	`lib.llm_dispatch` module functions	trading_autopilot, swarm, tasks_pkg, trading.intel
`FetchService`	`fetch_page_content()`, `fetch_urls()`	`lib.fetch` module functions	tasks_pkg.executor, trading.intel
`TradingDataProvider`	`get_latest_price()`, `fetch_asset_info()`, `fetch_price_history()`, `build_intel_context()`	`lib.trading` module functions	trading_autopilot
`TaskEventSink`	`append_event()`	`lib.tasks_pkg.manager.append_event`	executor, tool_dispatch
`ToolHandler`	`__call__(task, tc, fn_name, …)`	Each `@tool_registry.handler()` fn	ToolRegistry dispatch
`BodyBuilder`	`__call__(model, messages, …)`	`lib.llm_client.build_body`	swarm agents, orchestrators

All protocols are @runtime_checkable — you can use isinstance() checks in defensive code.

Using Protocols in Consumer Code

Type annotation at function boundaries:

from lib.protocols import LLMService

def my_function(llm: LLMService, prompt: str) -> str:
    """Uses structural subtyping — any object with .chat() works."""
    content, usage = llm.chat(
        [{'role': 'user', 'content': prompt}],
        max_tokens=1024,
    )
    return content

Dependency injection with fallback to concrete import:

from lib.protocols import FetchService

def _prefetch_user_urls(
    messages: list[dict],
    task: dict,
    *,
    fetch_service: FetchService | None = None,  # inject for testing
) -> list[tuple[str, str]]:
    # Fall back to concrete import when no DI provided
    if fetch_service:
        _fetch_urls = fetch_service.fetch_urls
    else:
        from lib.fetch import fetch_urls
        _fetch_urls = fetch_urls
    ...

Explicit protocol binding in hot-path modules:

# lib/tasks_pkg/orchestrator.py
from lib.llm_client import build_body as _build_body_impl
from lib.protocols import BodyBuilder

build_body: BodyBuilder = _build_body_impl  # explicit protocol binding

This pattern serves as living documentation: readers immediately see that build_body satisfies the BodyBuilder protocol, and type checkers can verify it.

Runtime type checking (defensive code):

from lib.protocols import LLMService

assert isinstance(llm_client, LLMService), "Expected LLMService protocol"

Adding a New Protocol

Define it in lib/protocols.py — the single location for all boundary contracts.
Use @runtime_checkable so it works with isinstance().
Document in the docstring which concrete types satisfy the protocol.
Add to __all__ in lib/protocols.py.
Add to the table above in this document.

Template:

@runtime_checkable
class MyServiceProtocol(Protocol):
    """Protocol for [description].

    Satisfied by:
      - ``lib.my_module`` module-level functions
      - Test mocks

    Used by:
      - ``lib.consumer_a``
      - ``lib.consumer_b``
    """

    def do_thing(self, arg: str) -> Result:
        """[Method description]."""
        ...

When to create a protocol — use this decision tree:

Does module A import concrete symbols from module B?
  YES → Do multiple modules (A, C, D) import the same symbols from B?
    YES → Is B heavy or slow to import?
      YES → Define a Protocol. ✅
      NO  → Is there a circular import risk?
        YES → Define a Protocol. ✅
        NO  → Do you want to mock B in tests for A?
          YES → Define a Protocol. ✅
          NO  → Direct import is fine. Skip Protocol.
    NO  → Direct import is fine. Skip Protocol.

Design Patterns in Use

ToolRegistry (`lib/tasks_pkg/executor.py`)

Pattern: Decorator-based dispatch registry for tool handlers.

The ToolRegistry class provides a central, extensible dispatch mechanism. Tool handlers are registered at module-load time via decorators, and lookup is performed at dispatch time via registry.lookup(fn_name, round_entry).

Three registration modes:

Mode	Method / Decorator	Use Case
Exact	`@registry.handler(name)`	Single tool name → handler (O(1) dict lookup)
Set-based	`@registry.tool_set(set)`	A group of related tools → one handler
Special	`@registry.special(key)`	Matched via `round_entry` metadata, not name

Example — registering a handler:

from lib.tasks_pkg.executor import tool_registry

@tool_registry.handler('web_search', category='search',
                       description='Perform a web search via API')
def _handle_web_search(task, tc, fn_name, tc_id, fn_args, rn,
                       round_entry, cfg, project_path,
                       project_enabled, all_tools=None):
    ...
    return tc_id, tool_content, True  # (tc_id, content_str, is_search)

# Set-based: one handler for many tool names
@tool_registry.tool_set(BROWSER_TOOL_NAMES, category='browser',
                        description='Execute a browser automation tool')
def _handle_browser_tool(task, tc, fn_name, ...):
    ...

# Special: matched by round_entry metadata, not fn_name
@tool_registry.special('__code_exec__', category='code',
                       description='Execute a shell command')
def _handle_code_exec(task, tc, fn_name, ...):
    ...

Lookup order: exact match → special key check → set-based scan → None.

Introspection: tool_registry.list_tools() returns all registered tools with metadata — useful for debugging and documentation generation.

TradingClient DI (`lib/trading/_common.py`)

Pattern: Injectable HTTP client with lazy singleton initialization.

The TradingClient class encapsulates the HTTP session, proxy configuration, and network-state logic. A module-level singleton is lazily initialised (not at import time) via double-checked locking to avoid opening HTTP connections during module import.

class TradingClient:
    """HTTP client for trading data APIs with proxy support."""
    def __init__(self, proxy_url=None):
        self.session = requests.Session()
        self.proxy_url = proxy_url or os.environ.get('HTTPS_PROXY', '')
        ...

    def check_network(self) -> bool:
        """Thread-safe, rate-limited network health probe."""
        ...

# Lazy singleton with double-checked locking
_lazy_client = None
_lazy_lock = threading.Lock()

def _get_default_client() -> TradingClient:
    global _lazy_client
    if _lazy_client is not None:          # Fast path: no lock
        return _lazy_client
    with _lazy_lock:
        if _lazy_client is not None:      # Double-check after lock
            return _lazy_client
        _lazy_client = TradingClient()
    return _lazy_client

Testing advantage: Instantiate TradingClient(proxy_url="") directly in tests — no monkeypatching of module globals required.

Module-Level Dispatch Table (`lib/tasks_pkg/tool_display.py`)

Pattern: Static dict mapping tool names → display handlers, built once at import time. Replaces long if/elif chains with O(1) dict lookup:

_TOOL_DISPLAY_DISPATCH = _build_display_dispatch_table()  # built once at import
handler = _TOOL_DISPLAY_DISPATCH.get(fn_name, _tool_display_generic)

Similarly, lib/tasks_pkg/executor.py uses a browser badge dispatch table:

_BROWSER_BADGE_DISPATCH = {
    'browser_list_tabs':   _badge_list_tabs,
    'browser_read_tab':    _badge_read_tab,
    'browser_screenshot':  _badge_screenshot,
    ...
}

badge_fn = _BROWSER_BADGE_DISPATCH.get(fn_name)
if badge_fn is not None:
    badge_fn(meta, fn_name, display_text, chars, is_screenshot)

Continuous Integration & Testing

CI Pipeline

Every push to main and every pull request triggers the GitHub Actions CI workflow:

Job	What it checks	Python versions	Blocking?
Lint	`ruff check` on `lib/`, `routes/`, `tests/`	3.12	✅ Yes
Unit Tests	`pytest -m unit` + coverage — pure logic, no server/browser/network	3.10, 3.12	✅ Yes
API Tests	`pytest -m api` + coverage — Flask test client + mock LLM server	3.12	✅ Yes
Healthcheck	`python healthcheck.py` — syntax, imports, schema, assets	3.12	✅ Yes
Visual E2E	`pytest -m visual` — Playwright browser tests (main branch only)	3.12	❌ Non-blocking

Unit tests run on a Python 3.10 + 3.12 matrix to catch compatibility issues early. Coverage reports are generated as XML artifacts for each run.

Running Tests Locally

# Quick: just unit tests
make test-unit

# Full CI pipeline (lint + unit + api + healthcheck)
make ci

# With coverage report
make test-coverage

# Individual commands
make lint              # Ruff linter + format check
make lint-fix          # Auto-fix lint issues
make test-api          # API integration tests
make test-visual       # Playwright E2E tests (needs: playwright install chromium)
make test-all          # All tests including visual
make smoke             # Smoke tests only (imports, syntax, blueprints)
make healthcheck       # Project diagnostics

Or use pytest directly:

python -m pytest -m unit -v                    # Unit tests, verbose
python -m pytest -m api --tb=long              # API tests, full tracebacks
python -m pytest -k "test_build_body" -v       # Filter by name
python -m pytest tests/test_smoke.py -v        # Just smoke tests
python -m pytest -m unit --cov=lib --cov=routes --cov-report=term-missing  # With coverage
python tests/run_all.py --unit                 # Legacy test runner

Test Organization

Tests live in two directories:

Directory	Purpose	Run in CI?	Style
`tests/`	Structured, self-contained pytest tests with mocks	✅ Yes	pytest classes + `@pytest.mark.*` markers
`debug/`	Manual exploration scripts, benchmarks, API-dependent tests	❌ No	Standalone `python debug/script.py`

When a debug/test_*.py script proves valuable for regression prevention, it should be migrated to tests/ with proper mocking and markers. See tests/README.md for the full test directory layout.

Test Markers

Every test must have a marker. strict_markers = true in pyproject.toml means unknown or missing markers cause a hard error.

Marker	Meaning	Needs server?	Needs browser?
`@pytest.mark.unit`	Pure logic tests	No	No
`@pytest.mark.api`	Flask test client + mock LLM	Test server	No
`@pytest.mark.visual`	Playwright E2E	Live server	Yes (Chromium)
`@pytest.mark.slow`	Takes > 10 seconds	Varies	Varies

Adding New Tests

Create test functions/classes in tests/test_*.py
Add the appropriate @pytest.mark.* decorator to every class or function
Verify locally: make test-unit (or the relevant marker)
Ensure make ci passes before submitting a PR

Reporting Bugs

When filing a bug report on GitHub:

Visit http://localhost:5001/api/support-bundle to get a diagnostic JSON bundle
Open a Bug Report issue
Paste the diagnostic bundle JSON into the designated field
Include relevant log excerpts from logs/error.log — use the request ID (rid:XXXX) from error messages to find related entries in logs/app.log

The diagnostic bundle includes system info, recent errors (sanitized), and active configuration — API keys are automatically redacted.

Bug Prevention Checklist

Before submitting code, verify:

Logging & Error Handling

Every new .py file has from lib.log import get_logger; logger = get_logger(__name__)
All logger.error() in except blocks include exc_info=True
No logger.info() in # HOT_PATH modules (use debug instead)
No bare except: or except Exception: pass (always log or re-raise)
No f-strings in log calls (use %-style: logger.info('x=%s', x))
Operations > 1 second use log_context() or log_external()

Safety & Robustness

All json.loads() calls on DB/user data are wrapped in try/except
All HTTP calls (requests.*, session.*) include a timeout= parameter
String variables used after content.strip() in except blocks are pre-defined

Architecture

Protocol interfaces are used for cross-module boundaries (not concrete imports)
Package façades use lib/_pkg_utils.py helpers for optional sub-module imports
New tool handlers are registered via @tool_registry.handler() (not if/elif chains)
New packages follow the façade pattern with build_facade()
Any new per-request module has # HOT_PATH as the first line
New protocols are added to lib/protocols.py with @runtime_checkable and __all__

FilesExpand file tree

CONTRIBUTING.md

Latest commit

History