Thank you for contributing! This guide covers coding conventions, log-level policy, design patterns, and the pre-submit checklist. Please read it before opening a PR.
- Module Architecture Overview
- Log Level Decision Guide
- Package Façade Pattern
- Protocol Pattern for Module Boundaries
- Design Patterns in Use
- Bug Prevention Checklist
ChatUI's backend is organized into 13 decomposed packages under lib/,
each using the Package Façade Pattern. The
architecture separates request-handling hot-path modules from
domain-specific computation packages.
lib/
├── __init__.py ← Root config (API keys, model names, env vars)
├── _pkg_utils.py ← Shared façade utilities (build_facade, safe_import, …)
├── protocols.py ← Protocol interfaces for all module boundaries
├── log.py ← Centralized logging (get_logger, log_context, audit_log, …)
├── llm_client.py ← LLM API communication (body building, SSE parsing) [HOT_PATH]
├── search.py ← Web search execution [HOT_PATH]
│
├── tasks_pkg/ ← Task orchestration system
│ ├── orchestrator.py ← Main LLM↔tool loop [HOT_PATH]
│ ├── executor.py ← Tool dispatch registry [HOT_PATH]
│ ├── tool_dispatch.py ← Tool parsing + parallel execution [HOT_PATH]
│ ├── compaction.py ← Context window compaction [HOT_PATH]
│ ├── model_config.py ← Model config lookup [HOT_PATH]
│ ├── manager.py ← Task CRUD, SSE events, persistence
│ ├── endpoint.py ← Autonomous work→review→revise loop
│ └── ...
│
├── llm_dispatch/ ← Multi-key, multi-model LLM dispatcher
│ ├── slot.py ← Slot dataclass (api_key × model)
│ ├── config.py ← Default slot configurations
│ ├── dispatcher.py ← LLMDispatcher (pool management)
│ ├── factory.py ← DispatcherFactory + get_dispatcher()
│ └── api.py ← High-level: dispatch_chat, smart_chat
│
├── fetch/ ← Content fetching + extraction
│ ├── utils.py ← HTTP session, user-agent rotation
│ ├── http.py ← Raw HTTP + retry logic
│ ├── html_extract.py ← Readability-based text extraction
│ ├── pdf_extract.py ← PDF text extraction
│ ├── playwright_pool.py ← Browser pool (optional dep)
│ └── core.py ← Main entry point: fetch_page_content [HOT_PATH]
│
├── trading/ ← Trading portfolio advisor
│ ├── _common.py ← TradingClient (shared HTTP session, proxy)
│ ├── nav.py ← NAV fetching / caching / history
│ ├── info.py ← Fund info, search, fee calculation
│ ├── strategy_data.py ← Built-in strategy definitions
│ ├── intel.py ← Intelligence CRUD + crawling (optional)
│ └── backtest.py ← Backtesting (optional)
│
├── trading_strategy_engine/ ← Pure-computation strategy engine (no LLM)
│ ├── strategy.py ← Strategy Protocol/ABC, registry
│ ├── signals.py ← Multi-timeframe signal confirmation
│ ├── risk_metrics.py ← Risk-adjusted performance metrics
│ ├── ensemble.py ← Ensemble strategy backtesting (optional)
│ ├── monte_carlo.py ← Monte Carlo simulation (optional)
│ ├── optimization.py ← Walk-forward optimization (optional)
│ ├── portfolio.py ← Portfolio construction (optional)
│ └── pipeline.py ← Full analysis pipeline (optional)
│
├── trading_backtest_engine/ ← Event-driven backtesting engine
│ ├── config.py ← Default constants
│ ├── state.py ← Portfolio state management
│ ├── strategies.py ← Strategy implementations
│ ├── reporting.py ← Metrics computation
│ ├── engine.py ← Core simulation engine
│ ├── validation.py ← Walk-forward validation (optional)
│ ├── comparison.py ← Multi-strategy comparison (optional)
│ └── analysis.py ← Bias verification (optional)
│
├── trading_autopilot/ ← Autonomous trading analyst robot
│ ├── _constants.py ← Thresholds & timing
│ ├── correlation.py ← Intelligence cross-correlation (optional)
│ ├── strategy_evolution.py← Strategy self-improvement (optional)
│ ├── kpi.py ← Pre-backtest scoring (optional)
│ ├── reasoning.py ← LLM mega-prompt builder (optional)
│ ├── cycle.py ← Full autopilot cycle runner (optional)
│ ├── scheduler.py ← Periodic toggle (optional)
│ └── outcome.py ← Recommendation outcome tracker (optional)
│
├── tools/ ← Tool execution definitions
│ ├── search.py ← web_search, site_search
│ ├── browser.py ← Browser automation tools
│ ├── code_exec.py ← Code execution tools
│ ├── conversation.py ← Conversation reference tools
│ ├── meta.py ← Dynamic tool list assembly
│ └── project.py ← Project-mode file tools
│
├── swarm/ ← Multi-agent orchestration
│ ├── artifact_store.py ← Inter-agent data sharing [HOT_PATH]
│ └── ...
│
├── scheduler/ ← Task scheduler
│ ├── cron.py ← Cron expression parser
│ ├── executor.py ← Task execution handlers
│ ├── manager.py ← SchedulerManager class
│ └── tool_defs.py ← Scheduler tool schemas
│
└── feishu/ ← Feishu (Lark) Bot integration
├── _state.py ← Per-user state, locks, config
├── conversation.py ← Chat history, DB sync (optional)
├── messaging.py ← Lark API message sending (optional)
├── pipeline.py ← LLM task pipeline (optional)
├── commands.py ← Slash command registry (optional)
├── events.py ← Event handlers (optional)
└── startup.py ← WebSocket connection (optional)
A user message flows through the hot-path modules in this order:
Client POST /api/chat/start
│
▼
┌─ orchestrator.py ─────────────────────────────────────┐
│ 1. _resolve_model_config() ← model_config.py │
│ 2. _inject_system_contexts() ← system_context.py │
│ 3. _prefetch_user_urls() ← executor.py │
│ 4. ┌─ TOOL LOOP ───────────────────────────────────┐ │
│ │ a. build_body() ← llm_client.py │ │
│ │ b. stream_llm_response() ← manager.py │ │
│ │ c. parse_tool_calls() ← tool_dispatch.py │ │
│ │ d. execute_tool_pipeline() │ │
│ │ └─ _execute_tool_one() ← executor.py │ │
│ │ ├─ perform_web_search() ← search.py │ │
│ │ ├─ fetch_page_content() ← fetch/core.py │ │
│ │ └─ [other tool handlers] │ │
│ │ e. compact_messages_fast() ← compaction.py │ │
│ └─ REPEAT until model says "stop" ───────────────┘ │
│ 5. persist_task_result() ← manager.py │
└────────────────────────────────────────────────────────┘
│
▼
Client polls /api/chat/stream/<id> for SSE events
Every module in this chain is marked # HOT_PATH and must follow the
Hot-Path Rules.
Use this table to select the correct log level for new code. Misuse of log levels pollutes production logs and obscures real issues.
Is this a per-request / per-item event?
YES → Is it useful ONLY during active debugging?
YES → logger.debug(...)
NO → Is it a recoverable/expected failure?
YES → logger.warning(...) [no exc_info unless traceback adds value]
NO → logger.error(..., exc_info=True)
NO → Is it a one-time startup / config event?
YES → logger.info(...)
NO → Is it a periodic background event (e.g. scheduled task)?
YES → logger.info(...) [at most once per cycle]
NO → logger.debug(...)
| Level | When to Use | Example |
|---|---|---|
DEBUG |
Per-request traces, tool arguments, retry attempts, individual trading data fetches, loop iterations, parsed values, cache hits/misses | logger.debug('Fetched %d bytes from %s', n, url) |
INFO |
Server startup, config changes, new conversation created, scheduled task registered, periodic summaries (1/cycle) | logger.info('Server started on port %d', port) |
WARNING |
Recoverable errors (network retry succeeded), degraded mode, missing optional deps, fallback triggered, resilient-import failures | logger.warning('Proxy failed, retrying direct') |
ERROR |
Unrecoverable failures requiring attention — LLM call failed all retries, DB corruption, critical service unavailable; always add traceback | logger.error('DB write failed', exc_info=True) |
CRITICAL |
System-level failures preventing continued operation | logger.critical('Cannot open database') |
| Scenario | Level | exc_info |
Why |
|---|---|---|---|
| Logging each tool argument before dispatch | DEBUG |
No | Per-request detail, only useful when debugging |
| Individual asset price fetch from API | DEBUG |
No | Per-item data retrieval, high frequency |
| Retry attempt N of M for an HTTP request | DEBUG |
No | Per-attempt trace; the final outcome uses WARNING/ERROR |
| Cache hit on fetch URL | DEBUG |
No | High frequency, operational detail only |
| LLM body construction details | DEBUG |
No | Hot-path, per-request |
| SSE chunk parsed from LLM stream | DEBUG |
No | Very high frequency, never useful except deep debugging |
| Flask server started on port 5001 | INFO |
No | One-time startup event |
| New conversation created | INFO |
No | Significant state change, low frequency |
| Scheduled task registered | INFO |
No | One-time lifecycle event |
| Config value changed via admin API | INFO |
No | Operational event worth auditing |
| Autopilot cycle started/completed | INFO |
No | Periodic summary, once per cycle |
| Network request failed but retry succeeded | WARNING |
No | Recovered — no action needed, but worth noting |
| Optional dependency missing (e.g. playwright) | WARNING |
True |
Feature degraded but system operational |
| Resilient import failed in façade | WARNING |
True |
Package continues without this sub-module |
| Running in degraded mode (no proxy) | WARNING |
No | Operator should be aware of reduced capability |
| JSON parse error on user data (recovered) | WARNING |
True |
Traceback helps diagnose malformed data |
| LLM API call failed after all retries | ERROR |
True |
Unrecoverable — operator needs to investigate |
| Database corruption detected | ERROR |
True |
Data integrity issue requiring attention |
| Critical external service unreachable | ERROR |
True |
System cannot fulfil core function |
| Max tool rounds exhausted (infinite loop?) | WARNING |
No | Operational issue but task completes with partial result |
finish_reason is None from LLM |
ERROR |
No | Stream likely interrupted — needs investigation |
- Always use %-style formatting (lazy evaluation):
# ✅ Correct — args only formatted if level is enabled logger.info('Fetched %s in %.1fs', url, elapsed) # ❌ Wrong — f-string always evaluates, wastes CPU on hot path logger.info(f'Fetched {url} in {elapsed:.1f}s')
- Include identifiers for grep-ability: task ID, conversation ID, model name, URL.
- Truncate large data:
logger.debug('Response preview: %.500s', body) - Sanitize secrets: never log API keys, tokens, or full auth headers.
- Structured prefix for easy grepping:
[LLM],[ext:service],[op:name],[rid:xxxx].
Convention: Any module whose functions are called on the per-request path
MUST include a module-level comment # HOT_PATH as the very first line of
the file (before the module docstring). This is machine-grep-able and keeps the
list of hot-path modules self-documenting.
# HOT_PATH — functions in this module are called per-request.
# Do NOT use logger.info() here; use logger.debug() instead.
"""Tool execution — unified dispatch for all tool types."""
from __future__ import annotations
...Discovering hot-path modules — grep, don't maintain a hardcoded list:
# List all hot-path modules
grep -rln '# HOT_PATH' lib/ | sortCI enforcement — fail the build if any # HOT_PATH module uses logger.info():
#!/usr/bin/env bash
hot_files=$(grep -rln '# HOT_PATH' lib/)
if [ -n "$hot_files" ]; then
echo "$hot_files" | xargs grep -n 'logger\.info' \
&& { echo "FAIL: logger.info() found in HOT_PATH module(s)"; exit 1; }
fiRule of thumb: if you are inside a
# HOT_PATHmodule, always reach forlogger.debug(). Uselogger.info()only for true one-time events (server boot, config load) — and those belong in non-hot-path modules.
The canonical list is always grep -rln '# HOT_PATH' lib/, but these are
the modules known at time of writing:
| Module | Role in Request Path |
|---|---|
lib/tasks_pkg/orchestrator.py |
Main LLM↔tool loop — entry point per request |
lib/tasks_pkg/executor.py |
Tool execution dispatch — called per tool call |
lib/tasks_pkg/tool_dispatch.py |
Tool argument parsing + parallel execution |
lib/tasks_pkg/compaction.py |
Context compaction — called during long conversations |
lib/tasks_pkg/model_config.py |
Model config lookup — called per LLM request |
lib/llm_client.py |
LLM API communication — called per LLM request |
lib/search.py |
Search execution — called per search tool invocation |
lib/fetch/core.py |
Content fetching — called per fetch_url tool call |
lib/swarm/artifact_store.py |
Inter-agent artifact I/O — called per agent step |
If you add a new module to the per-request path, add # HOT_PATH at the top.
-
Always add
exc_info=Truetologger.error()calls insideexceptblocks — the traceback is the single most valuable piece of debugging information. No exceptions to this rule.# ✅ CORRECT — error in except block always gets exc_info try: data = complex_operation() except Exception as e: logger.error('complex_operation failed: %s', e, exc_info=True) raise
-
Add
exc_info=Truetologger.warning()when the exception itself is the primary news (e.g., unexpected parse failure where the traceback helps diagnosis, or a resilient-import failure in a façade).# ✅ When the traceback adds diagnostic value except json.JSONDecodeError as e: logger.warning('Unexpected JSON in DB row %s: %s', row_id, e, exc_info=True)
-
Omit
exc_info=Truefromlogger.warning()when recovery succeeded and the warning just notes the fallback (e.g., "Retried with backup model").# ✅ Recovered — traceback would be noise except ConnectionError: logger.warning('Primary API down, falling back to backup endpoint')
-
Never add
exc_info=Truetologger.debug()— tracebacks in debug logs create noise without benefit.
The project uses lib/log.py as the centralized logging module. Every Python
file must use it:
from lib.log import get_logger
logger = get_logger(__name__)Key utilities provided by lib/log.py:
| Utility | Purpose | When to Use |
|---|---|---|
get_logger(name) |
Get a named logger (usually __name__) |
Every module — top-level constant |
log_exception() |
logger.error(msg, exc_info=True) shorthand |
Catch-and-reraise blocks |
log_context(op) |
Context manager: logs start/end/duration/exception | Any operation > 1 second |
log_external() |
Context manager for external API calls with timing | HTTP calls to third-party services |
log_route() |
Decorator for Flask route handlers | All route handler functions |
log_suppressed() |
Log a swallowed exception with context | Guard clauses with except: pass |
audit_log() |
Structured JSON audit entry to logs/audit.log |
Security events, config changes |
set_req_id() |
Set per-request correlation ID (middleware) | Called by request middleware |
req_id() |
Get current request ID | Prefix logs for correlation |
Log file layout (configured in server.py):
| File | Content | Rotation |
|---|---|---|
logs/app.log |
Business logic (lib., routes., server) INFO+ | Daily, 30 days |
logs/access.log |
HTTP request log (werkzeug) INFO+ | Daily, 14 days |
logs/error.log |
WARNING/ERROR/CRITICAL from all sources | 5 MB × 10 backups |
logs/vendor.log |
Third-party libraries WARNING+ only | 5 MB × 3 backups |
logs/audit.log |
Structured JSON audit trail | Append-only |
When decomposing a monolith lib/X.py into lib/X/ (a package with
sub-modules), follow these conventions to maintain a clean public API.
lib/X/__init__.pyis the façade — consumers import fromlib.X, never fromlib.X.submoduledirectly.- Each sub-module defines
__all__listing its public API. - The façade uses helpers from
lib/_pkg_utils.pyfor consistency and resilient import handling. - Core sub-modules (required for basic operation) import directly — failures propagate immediately.
- Optional sub-modules (features that can be disabled) are wrapped in
try/except— failures are logged as warnings and the rest of the package continues to work.
Every package __init__.py under lib/ follows the same structure:
"""lib/X/ — Package description (Package Façade)
Sub-modules:
core — Main functionality
config — Configuration constants
optional_feat — Optional feature (graceful degradation)
"""
import logging
from lib._pkg_utils import build_facade
_logger = logging.getLogger(__name__)
__all__: list[str] = []
# ── Core modules (always required — let exceptions propagate) ────
from . import core, config # noqa: E402
from .core import * # noqa: F401,F403
from .config import * # noqa: F401,F403
build_facade(__all__, core, config)
# ── Optional modules (graceful degradation) ──────────────────────
try:
from . import optional_feat # noqa: E402
from .optional_feat import * # noqa: F401,F403
build_facade(__all__, optional_feat)
except Exception as _exc:
_logger.warning('lib.X.optional_feat failed to load — feature disabled: %s',
_exc, exc_info=True)What build_facade() does: reads each module's __all__ and appends
those names to the package-level __all__. Combined with the from .X import *
statements, this ensures consumers can do from lib.X import any_public_name.
| Category | Import Style | On Failure | Use When |
|---|---|---|---|
| Core | Direct from .X import * |
Exception propagates → package fails | Sub-module is essential (e.g. NAV fetching) |
| Optional | try/except around from .X import * |
Warning logged → package continues | Feature can be disabled (e.g. Playwright) |
Real examples from the codebase:
| Package | Core Sub-modules | Optional Sub-modules |
|---|---|---|
lib/trading |
_common, nav, info, strategy_data |
intel, backtest |
lib/fetch |
utils, http, html_extract, pdf_extract, core |
playwright_pool |
lib/trading_strategy_engine |
strategy, signals, risk_metrics |
ensemble, monte_carlo, optimization, portfolio, pipeline |
lib/trading_backtest_engine |
config, state, strategies, reporting, engine |
validation, comparison, analysis |
lib/trading_autopilot |
_constants |
correlation, strategy_evolution, kpi, reasoning, cycle, scheduler, outcome |
lib/llm_dispatch |
(none — all guarded) | config, slot, factory, dispatcher, api |
lib/feishu |
_state |
conversation, messaging, pipeline, commands, events, startup |
lib/tools |
All: search, browser, code_exec, conversation, meta, project |
(none) |
lib/scheduler |
All: cron, executor, manager, tool_defs |
(none) |
| Function | Purpose | Use Case |
|---|---|---|
build_facade() |
Extends package __all__ from multiple modules' __all__ lists |
After core from .X import * statements |
safe_star_import() |
Programmatic from module import * into caller's namespace + __all__ |
When you need finer control |
extend_all() |
Appends one module's __all__ to package __all__ |
Single module, __all__ only (no import) |
resilient_import() |
importlib.import_module + safe_star_import in try/except |
Importing by fully-qualified name |
safe_import() |
Returns a _SafeImporter callable bound to a specific package |
Most concise pattern for optional imports |
facade_imports() |
Batch-imports multiple optional sub-modules in one call | Many optional modules at once |
validate_all() |
CI helper — checks all names in __all__ actually exist in namespace |
Pre-submit CI check |
Concise alternative pattern using safe_import():
from lib._pkg_utils import build_facade, safe_import
__all__: list[str] = []
_import = safe_import(__name__, globals(), __all__)
# Core (must load)
from . import core, config
from .core import *
from .config import *
build_facade(__all__, core, config)
# Optional — one line each, independent (one failure doesn't block others)
_import('optional_a', 'feature A')
_import('optional_b', 'feature B')Batch pattern using facade_imports():
from lib._pkg_utils import build_facade, facade_imports
__all__: list[str] = []
from . import core
from .core import *
build_facade(__all__, core)
# Optional — all in one call
facade_imports(__name__, globals(), __all__, [
('optional_a', 'feature A description'),
('optional_b', 'feature B description'),
])When decomposing lib/monolith.py into lib/monolith/:
- Create the directory with sub-modules, each defining
__all__. - Create
__init__.pyfollowing the façade pattern above. - Classify sub-modules as core (direct import) or optional (try/except).
- Use
build_facade()to aggregate__all__lists. - Test imports:
python -c "from lib.monolith import *; print(__all__)"— the public API should be identical to the old monolith. - Keep consumer imports unchanged:
from lib.monolith import funcmust still work.
lib/tasks_pkg uses a different pattern optimized for startup time: only the
lightweight manager module is imported eagerly (it's used by route
registration at import time). Heavy modules like orchestrator, executor,
and endpoint are loaded on first access via __getattr__:
# lib/tasks_pkg/__init__.py
__all__ = ['tasks', 'create_task', 'run_task', ...]
# Eager — lightweight, needed at import time
from lib.tasks_pkg.manager import tasks, create_task, append_event, ...
# Lazy — loaded on first access
_LAZY_MAP = {
'run_task': ('lib.tasks_pkg.orchestrator', 'run_task'),
'ToolRegistry': ('lib.tasks_pkg.executor', 'ToolRegistry'),
'run_endpoint_task':('lib.tasks_pkg.endpoint', 'run_endpoint_task'),
...
}
def __getattr__(name):
if name in _LAZY_MAP:
module_path, attr_name = _LAZY_MAP[name]
import importlib
mod = importlib.import_module(module_path)
val = getattr(mod, attr_name)
globals()[name] = val # Cache — __getattr__ only called once
return val
raise AttributeError(...)Use this variant when a package has both:
- Symbols needed at import time (e.g., task dict, locks)
- Heavy modules that can be deferred (e.g., executor with many deps)
Verify façade completeness in CI:
from lib._pkg_utils import validate_all
import lib.tools
missing = validate_all(lib.tools)
assert not missing, f"Stale __all__ entries: {missing}"The project uses typing.Protocol (PEP 544) for structural subtyping at
high-coupling module boundaries. This enables:
- No hard imports — consumers don't import the concrete module → avoids circular imports, reduces coupling.
- Testability — mock objects satisfying the protocol can be injected without monkeypatching module globals.
- Documented contracts — the expected interface is explicit in type annotations, not implicit in "I happen to call these methods."
All protocols live in one file: lib/protocols.py. This is deliberate —
it's the single location for all boundary contracts, importable by any module
without pulling in transitive dependencies.
| Protocol | Methods | Satisfied By | Used By |
|---|---|---|---|
LLMService |
chat(), stream() |
lib.llm_dispatch module functions |
trading_autopilot, swarm, tasks_pkg, trading.intel |
FetchService |
fetch_page_content(), fetch_urls() |
lib.fetch module functions |
tasks_pkg.executor, trading.intel |
TradingDataProvider |
get_latest_price(), fetch_asset_info(), fetch_price_history(), build_intel_context() |
lib.trading module functions |
trading_autopilot |
TaskEventSink |
append_event() |
lib.tasks_pkg.manager.append_event |
executor, tool_dispatch |
ToolHandler |
__call__(task, tc, fn_name, …) |
Each @tool_registry.handler() fn |
ToolRegistry dispatch |
BodyBuilder |
__call__(model, messages, …) |
lib.llm_client.build_body |
swarm agents, orchestrators |
All protocols are @runtime_checkable — you can use isinstance() checks
in defensive code.
Type annotation at function boundaries:
from lib.protocols import LLMService
def my_function(llm: LLMService, prompt: str) -> str:
"""Uses structural subtyping — any object with .chat() works."""
content, usage = llm.chat(
[{'role': 'user', 'content': prompt}],
max_tokens=1024,
)
return contentDependency injection with fallback to concrete import:
from lib.protocols import FetchService
def _prefetch_user_urls(
messages: list[dict],
task: dict,
*,
fetch_service: FetchService | None = None, # inject for testing
) -> list[tuple[str, str]]:
# Fall back to concrete import when no DI provided
if fetch_service:
_fetch_urls = fetch_service.fetch_urls
else:
from lib.fetch import fetch_urls
_fetch_urls = fetch_urls
...Explicit protocol binding in hot-path modules:
# lib/tasks_pkg/orchestrator.py
from lib.llm_client import build_body as _build_body_impl
from lib.protocols import BodyBuilder
build_body: BodyBuilder = _build_body_impl # explicit protocol bindingThis pattern serves as living documentation: readers immediately see that
build_body satisfies the BodyBuilder protocol, and type checkers can
verify it.
Runtime type checking (defensive code):
from lib.protocols import LLMService
assert isinstance(llm_client, LLMService), "Expected LLMService protocol"- Define it in
lib/protocols.py— the single location for all boundary contracts. - Use
@runtime_checkableso it works withisinstance(). - Document in the docstring which concrete types satisfy the protocol.
- Add to
__all__inlib/protocols.py. - Add to the table above in this document.
Template:
@runtime_checkable
class MyServiceProtocol(Protocol):
"""Protocol for [description].
Satisfied by:
- ``lib.my_module`` module-level functions
- Test mocks
Used by:
- ``lib.consumer_a``
- ``lib.consumer_b``
"""
def do_thing(self, arg: str) -> Result:
"""[Method description]."""
...When to create a protocol — use this decision tree:
Does module A import concrete symbols from module B?
YES → Do multiple modules (A, C, D) import the same symbols from B?
YES → Is B heavy or slow to import?
YES → Define a Protocol. ✅
NO → Is there a circular import risk?
YES → Define a Protocol. ✅
NO → Do you want to mock B in tests for A?
YES → Define a Protocol. ✅
NO → Direct import is fine. Skip Protocol.
NO → Direct import is fine. Skip Protocol.
Pattern: Decorator-based dispatch registry for tool handlers.
The ToolRegistry class provides a central, extensible dispatch mechanism.
Tool handlers are registered at module-load time via decorators, and lookup is
performed at dispatch time via registry.lookup(fn_name, round_entry).
Three registration modes:
| Mode | Method / Decorator | Use Case |
|---|---|---|
| Exact | @registry.handler(name) |
Single tool name → handler (O(1) dict lookup) |
| Set-based | @registry.tool_set(set) |
A group of related tools → one handler |
| Special | @registry.special(key) |
Matched via round_entry metadata, not name |
Example — registering a handler:
from lib.tasks_pkg.executor import tool_registry
@tool_registry.handler('web_search', category='search',
description='Perform a web search via API')
def _handle_web_search(task, tc, fn_name, tc_id, fn_args, rn,
round_entry, cfg, project_path,
project_enabled, all_tools=None):
...
return tc_id, tool_content, True # (tc_id, content_str, is_search)
# Set-based: one handler for many tool names
@tool_registry.tool_set(BROWSER_TOOL_NAMES, category='browser',
description='Execute a browser automation tool')
def _handle_browser_tool(task, tc, fn_name, ...):
...
# Special: matched by round_entry metadata, not fn_name
@tool_registry.special('__code_exec__', category='code',
description='Execute a shell command')
def _handle_code_exec(task, tc, fn_name, ...):
...Lookup order: exact match → special key check → set-based scan → None.
Introspection: tool_registry.list_tools() returns all registered tools
with metadata — useful for debugging and documentation generation.
Pattern: Injectable HTTP client with lazy singleton initialization.
The TradingClient class encapsulates the HTTP session, proxy configuration, and
network-state logic. A module-level singleton is lazily initialised (not at
import time) via double-checked locking to avoid opening HTTP connections during
module import.
class TradingClient:
"""HTTP client for trading data APIs with proxy support."""
def __init__(self, proxy_url=None):
self.session = requests.Session()
self.proxy_url = proxy_url or os.environ.get('HTTPS_PROXY', '')
...
def check_network(self) -> bool:
"""Thread-safe, rate-limited network health probe."""
...
# Lazy singleton with double-checked locking
_lazy_client = None
_lazy_lock = threading.Lock()
def _get_default_client() -> TradingClient:
global _lazy_client
if _lazy_client is not None: # Fast path: no lock
return _lazy_client
with _lazy_lock:
if _lazy_client is not None: # Double-check after lock
return _lazy_client
_lazy_client = TradingClient()
return _lazy_clientTesting advantage: Instantiate TradingClient(proxy_url="") directly in
tests — no monkeypatching of module globals required.
Pattern: Static dict mapping tool names → display handlers, built once at
import time. Replaces long if/elif chains with O(1) dict lookup:
_TOOL_DISPLAY_DISPATCH = _build_display_dispatch_table() # built once at import
handler = _TOOL_DISPLAY_DISPATCH.get(fn_name, _tool_display_generic)Similarly, lib/tasks_pkg/executor.py uses a browser badge dispatch table:
_BROWSER_BADGE_DISPATCH = {
'browser_list_tabs': _badge_list_tabs,
'browser_read_tab': _badge_read_tab,
'browser_screenshot': _badge_screenshot,
...
}
badge_fn = _BROWSER_BADGE_DISPATCH.get(fn_name)
if badge_fn is not None:
badge_fn(meta, fn_name, display_text, chars, is_screenshot)Every push to main and every pull request triggers the
GitHub Actions CI workflow:
| Job | What it checks | Python versions | Blocking? |
|---|---|---|---|
| Lint | ruff check on lib/, routes/, tests/ |
3.12 | ✅ Yes |
| Unit Tests | pytest -m unit + coverage — pure logic, no server/browser/network |
3.10, 3.12 | ✅ Yes |
| API Tests | pytest -m api + coverage — Flask test client + mock LLM server |
3.12 | ✅ Yes |
| Healthcheck | python healthcheck.py — syntax, imports, schema, assets |
3.12 | ✅ Yes |
| Visual E2E | pytest -m visual — Playwright browser tests (main branch only) |
3.12 | ❌ Non-blocking |
Unit tests run on a Python 3.10 + 3.12 matrix to catch compatibility issues early. Coverage reports are generated as XML artifacts for each run.
# Quick: just unit tests
make test-unit
# Full CI pipeline (lint + unit + api + healthcheck)
make ci
# With coverage report
make test-coverage
# Individual commands
make lint # Ruff linter + format check
make lint-fix # Auto-fix lint issues
make test-api # API integration tests
make test-visual # Playwright E2E tests (needs: playwright install chromium)
make test-all # All tests including visual
make smoke # Smoke tests only (imports, syntax, blueprints)
make healthcheck # Project diagnosticsOr use pytest directly:
python -m pytest -m unit -v # Unit tests, verbose
python -m pytest -m api --tb=long # API tests, full tracebacks
python -m pytest -k "test_build_body" -v # Filter by name
python -m pytest tests/test_smoke.py -v # Just smoke tests
python -m pytest -m unit --cov=lib --cov=routes --cov-report=term-missing # With coverage
python tests/run_all.py --unit # Legacy test runnerTests live in two directories:
| Directory | Purpose | Run in CI? | Style |
|---|---|---|---|
tests/ |
Structured, self-contained pytest tests with mocks | ✅ Yes | pytest classes + @pytest.mark.* markers |
debug/ |
Manual exploration scripts, benchmarks, API-dependent tests | ❌ No | Standalone python debug/script.py |
When a debug/test_*.py script proves valuable for regression prevention,
it should be migrated to tests/ with proper mocking and markers. See
tests/README.md for the full test directory layout.
Every test must have a marker. strict_markers = true in pyproject.toml
means unknown or missing markers cause a hard error.
| Marker | Meaning | Needs server? | Needs browser? |
|---|---|---|---|
@pytest.mark.unit |
Pure logic tests | No | No |
@pytest.mark.api |
Flask test client + mock LLM | Test server | No |
@pytest.mark.visual |
Playwright E2E | Live server | Yes (Chromium) |
@pytest.mark.slow |
Takes > 10 seconds | Varies | Varies |
- Create test functions/classes in
tests/test_*.py - Add the appropriate
@pytest.mark.*decorator to every class or function - Verify locally:
make test-unit(or the relevant marker) - Ensure
make cipasses before submitting a PR
When filing a bug report on GitHub:
- Visit
http://localhost:5001/api/support-bundleto get a diagnostic JSON bundle - Open a Bug Report issue
- Paste the diagnostic bundle JSON into the designated field
- Include relevant log excerpts from
logs/error.log— use the request ID (rid:XXXX) from error messages to find related entries inlogs/app.log
The diagnostic bundle includes system info, recent errors (sanitized), and active configuration — API keys are automatically redacted.
Before submitting code, verify:
- Every new
.pyfile hasfrom lib.log import get_logger; logger = get_logger(__name__) - All
logger.error()inexceptblocks includeexc_info=True - No
logger.info()in# HOT_PATHmodules (usedebuginstead) - No bare
except:orexcept Exception: pass(always log or re-raise) - No f-strings in log calls (use %-style:
logger.info('x=%s', x)) - Operations > 1 second use
log_context()orlog_external()
- All
json.loads()calls on DB/user data are wrapped in try/except - All HTTP calls (
requests.*,session.*) include atimeout=parameter - String variables used after
content.strip()in except blocks are pre-defined
- Protocol interfaces are used for cross-module boundaries (not concrete imports)
- Package façades use
lib/_pkg_utils.pyhelpers for optional sub-module imports - New tool handlers are registered via
@tool_registry.handler()(not if/elif chains) - New packages follow the façade pattern with
build_facade() - Any new per-request module has
# HOT_PATHas the first line - New protocols are added to
lib/protocols.pywith@runtime_checkableand__all__