Commit 2d0178d

Merge pull request #56 from context-labs/add-missing-cli-params
Add missing cli flags
2 parents da1725f + ab750b3 commit 2d0178d

5 files changed

Lines changed: 296 additions & 128 deletions

README.md

Lines changed: 63 additions & 9 deletions
Original file line number | Diff line number | Diff line change
@@ -84,14 +84,68 @@ halo --help
8484

8585
1. [Integrate Tracing](docs/integrations/openai-agents-sdk.md)
8686
2. Collect traces by running your agent
87-
3. Run the HALO engine, see the [CLI](/halo_cli/README.md) docs for more info
87+
3. Run the HALO engine
8888

8989
```bash
9090
export OPENAI_API_KEY=...
91+
# Optional: point HALO at another OpenAI-compatible provider.
92+
export OPENAI_BASE_URL=https://openrouter.ai/api/v1
9193

9294
halo path_to_your_traces.jsonl -p "Diagnose errors you find and suggest fixes"
9395
```
9496

97+
HALO uses the canonical OpenAI env vars: `OPENAI_API_KEY` for credentials and `OPENAI_BASE_URL` for OpenAI-compatible providers. If `OPENAI_BASE_URL` is unset, HALO uses `https://api.openai.com/v1`. Run `halo --help` to see all CLI options. The CLI mirrors the model/provider settings exposed by the Python SDK's
98+
[`ModelConfig`](engine/model_config.py) and
99+
[`ModelProviderConfig`](engine/model_provider_config.py).
100+
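A minimal sketch of the fallback order described above, under stated assumptions: an explicit `--base-url` value wins, then `OPENAI_BASE_URL`, then the OpenAI default. `resolve_base_url` and `DEFAULT_BASE_URL` are hypothetical names used for illustration; they are not taken from the HALO source.

```python
import os
from typing import Mapping, Optional

# Default endpoint named in the README text above.
DEFAULT_BASE_URL = "https://api.openai.com/v1"


def resolve_base_url(
    cli_value: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Pick the API base URL: CLI flag, then env var, then the default.

    Hypothetical helper; HALO itself may delegate this to the OpenAI client.
    """
    env = os.environ if env is None else env
    if cli_value is not None:
        return cli_value
    return env.get("OPENAI_BASE_URL", DEFAULT_BASE_URL)
```

Passing `env` explicitly keeps the function easy to exercise without touching the real process environment.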
101+
### CLI options
102+
103+
| Flag | Default | Description |
104+
| --------------------------------------------- | -------------------------------------------- | ---------------------------------------------------------------------------------------------- |
105+
| `TRACE_PATH` | required | JSONL trace file |
106+
| `--prompt`, `-p` | required | User prompt sent to the root agent |
107+
| `--model`, `-m` | `gpt-5.4-mini` | Model name for root, subagent, synthesis, and compaction calls |
108+
| `--max-depth` | `2` | Max subagent recursion depth |
109+
| `--max-turns` | `20` | Max turns per agent |
110+
| `--max-parallel` | `10` | Max concurrent subagents |
111+
| `--base-url` | `OPENAI_BASE_URL` / `https://api.openai.com/v1` | OpenAI-compatible API base URL |
112+
| `--api-key` | `OPENAI_API_KEY` | Provider API key |
113+
| `--header`, `-H` | unset | Provider header as `NAME: VALUE`. Repeat for multiple headers, matching curl's `-H` convention |
114+
| `--temperature` | provider default | Sampling temperature forwarded to the model |
115+
| `--max-output-tokens` | provider default | Maximum output tokens forwarded to the model |
116+
| `--parallel-tool-calls / --no-parallel-tool-calls` | enabled | Allow models to issue parallel tool calls |
117+
| `--refusal-retries` | `0` | Retry an agent model request this many times when the model refuses |
118+
| `--reasoning-effort` | model/provider default | Reasoning effort for root, subagent, and synthesis calls. Compaction never uses reasoning |
119+
| `--telemetry` | off | Emit OpenInference traces of HALO's own LLM, tool, and agent activity |
120+
121+
For example:
122+
123+
```bash
124+
halo path_to_your_traces.jsonl \
125+
-p "Diagnose errors you find and suggest fixes" \
126+
--base-url https://openrouter.ai/api/v1 \
127+
-H "HTTP-Referer: https://example.com"
128+
```
129+
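The `NAME: VALUE` convention for `--header` can be illustrated with a short sketch. This assumes the split happens at the first colon, as curl does, so that values containing colons (such as URLs) survive intact. `parse_header` and `parse_headers` are hypothetical helpers, not HALO functions.

```python
from typing import Dict, Iterable, Tuple


def parse_header(raw: str) -> Tuple[str, str]:
    """Split one 'NAME: VALUE' string at the first colon (assumed behavior)."""
    name, sep, value = raw.partition(":")
    if not sep or not name.strip():
        raise ValueError(f"expected 'NAME: VALUE', got {raw!r}")
    return name.strip(), value.strip()


def parse_headers(raws: Iterable[str]) -> Dict[str, str]:
    """Collect repeated header flags into a dict; later duplicates overwrite earlier ones."""
    return dict(parse_header(raw) for raw in raws)
```

With this sketch, the header from the example above parses into the name `HTTP-Referer` and the value `https://example.com`.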
130+
### Telemetry
131+
132+
HALO can emit OpenInference-shaped traces of its own LLM, tool, and agent activity. It is off by default; nothing is emitted unless you pass `--telemetry`.
133+
134+
```bash
135+
halo TRACE_PATH --prompt "..." --telemetry
136+
```
137+
138+
When telemetry is enabled and `CATALYST_OTLP_TOKEN` is set, HALO uploads spans to inference.net Catalyst over OTLP. If the token is unset, spans are written to a local JSONL file at `./halo-telemetry-{run_id}.jsonl` in the current working directory.
139+
140+
| Var | Default | Purpose |
141+
|---|---|---|
142+
| `CATALYST_OTLP_TOKEN` | unset | If set, uploads to Catalyst over OTLP. If unset, writes JSONL locally |
143+
| `CATALYST_OTLP_ENDPOINT` | catalyst-tracing default | OTLP endpoint base URL, for example `https://telemetry.inference.net` |
144+
| `CATALYST_DEBUG` | unset | Set to `1` to surface OTLP export errors at WARNING level |
145+
| `CATALYST_TRACING_RUN_ID` | unset | Uses this HALO run id instead of a generated uuid |
146+
| `CATALYST_TRACING_*` | unset | Generic catalyst-tracing passthrough |
147+
| `HALO_TELEMETRY_PATH` | `./halo-telemetry-{run_id}.jsonl` | Local fallback file path. Only used when `CATALYST_OTLP_TOKEN` is unset |
148+
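The routing rules in the table above can be sketched as follows. This is an illustration under assumptions, not HALO's implementation: `resolve_run_id` and `telemetry_destination` are hypothetical names, and treating an empty token as unset is an assumption of this sketch.

```python
import uuid
from typing import Mapping, Optional, Tuple


def resolve_run_id(env: Mapping[str, str]) -> str:
    """Use CATALYST_TRACING_RUN_ID when set, otherwise generate a uuid."""
    return env.get("CATALYST_TRACING_RUN_ID") or str(uuid.uuid4())


def telemetry_destination(
    env: Mapping[str, str], run_id: str
) -> Tuple[str, Optional[str]]:
    """Return ("otlp", endpoint_or_None) or ("file", path).

    Assumption: an empty CATALYST_OTLP_TOKEN counts as unset.
    """
    if env.get("CATALYST_OTLP_TOKEN"):
        # None means "let the tracing library use its default endpoint".
        return "otlp", env.get("CATALYST_OTLP_ENDPOINT")
    path = env.get("HALO_TELEMETRY_PATH") or f"./halo-telemetry-{run_id}.jsonl"
    return "file", path
```

Neither function is consulted unless `--telemetry` is passed, which matches the off-by-default behavior described above.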
95149
We have provided a [simple demo](/demo/openai-agents-sdk-demo/) and an [AppWorld](#appworld) demo.
96150

97151
### Python API
@@ -102,14 +156,14 @@ simplicity. The yielded types ([`AgentOutputItem`](engine/models/engine_output.p
102156
and [`AgentTextDelta`](engine/models/engine_output.py)) are defined in
103157
[`engine/models/engine_output.py`](engine/models/engine_output.py):
104158

105-
| Function | Sync / async | Returns | When to use |
106-
| ---------------------------- | ------------ | -------------------------------------------------- | -------------------------------------------------------------------------------------------------------- |
107-
| `stream_engine_async` | async | `AsyncIterator[AgentOutputItem \| AgentTextDelta]` | You want every event including streaming-token deltas (live UI, custom rendering). |
108-
| `stream_engine_output_async` | async | `AsyncIterator[AgentOutputItem]` | You want to log / persist each completed step (assistant message, tool call, tool result) as it lands. |
109-
| `run_engine_async` | async | `list[AgentOutputItem]` | You want the final list at the end and don't care about per-step observability. |
110-
| `stream_engine` | sync | `Iterator[AgentOutputItem \| AgentTextDelta]` | Sync generator; yields every event including deltas. Drives the async iterator on a private event loop. |
111-
| `stream_engine_output` | sync | `Iterator[AgentOutputItem]` | Sync generator; yields completed items only. Same shape as the async variant for sync callers. |
112-
| `run_engine` | sync | `list[AgentOutputItem]` | Sync, collects to a list. Pure convenience over `asyncio.run(run_engine_async(...))`. |
159+
| Function | Sync / async | Returns | When to use |
160+
| ---------------------------- | ------------ | -------------------------------------------------- | ------------------------------------------------------------------------------------------------------- |
161+
| `stream_engine_async` | async | `AsyncIterator[AgentOutputItem \| AgentTextDelta]` | You want every event including streaming-token deltas (live UI, custom rendering). |
162+
| `stream_engine_output_async` | async | `AsyncIterator[AgentOutputItem]` | You want to log / persist each completed step (assistant message, tool call, tool result) as it lands. |
163+
| `run_engine_async` | async | `list[AgentOutputItem]` | You want the final list at the end and don't care about per-step observability. |
164+
| `stream_engine` | sync | `Iterator[AgentOutputItem \| AgentTextDelta]` | Sync generator; yields every event including deltas. Drives the async iterator on a private event loop. |
165+
| `stream_engine_output` | sync | `Iterator[AgentOutputItem]` | Sync generator; yields completed items only. Same shape as the async variant for sync callers. |
166+
| `run_engine` | sync | `list[AgentOutputItem]` | Sync, collects to a list. Pure convenience over `asyncio.run(run_engine_async(...))`. |
113167

114168
```python
115169
from engine.main import stream_engine_output_async

engine/model_provider_config.py

Lines changed: 4 additions & 3 deletions
@@ -12,9 +12,10 @@ class ModelProviderConfig(BaseModel):
1212
1313
Each field is independent: when ``None`` the underlying ``AsyncOpenAI``
1414
client falls back to the matching env var (``OPENAI_BASE_URL`` /
15-
``OPENAI_API_KEY``). Setting one and not the other is supported — e.g.
16-
point ``base_url`` at OpenRouter while letting ``OPENAI_API_KEY`` from
17-
the environment supply the credential.
15+
``OPENAI_API_KEY``). When ``OPENAI_BASE_URL`` is unset, the endpoint is
16+
OpenAI's API base URL, ``https://api.openai.com/v1``. Setting one and not
17+
the other is supported — e.g. point ``base_url`` at OpenRouter while
18+
letting ``OPENAI_API_KEY`` from the environment supply the credential.
1819
"""
1920

2021
model_config = ConfigDict(extra="forbid")

halo_cli/README.md

Lines changed: 12 additions & 102 deletions
@@ -1,109 +1,19 @@
11
# HALO CLI
22

3-
Thin Typer wrapper around the HALO engine that streams the engine over a JSONL trace file.
3+
This package contains the `halo` console entry point registered in `pyproject.toml`.
4+
It is a thin Typer wrapper around the engine API:
45

5-
## Install
6+
- Parses CLI arguments and environment-backed provider settings.
7+
- Builds an `EngineConfig` from those arguments.
8+
- Calls `stream_engine_async` over a JSONL trace file.
9+
- Renders streaming text deltas and completed agent output items to stdout.
610

7-
```bash
8-
pip install halo-engine
9-
```
11+
User-facing installation, usage, options, and telemetry docs live in the root
12+
[`README.md`](../README.md).
1013

11-
This installs the `halo` script onto your `PATH`. No extra configuration — the script is registered as a console entry point in the `halo-engine` wheel.
14+
## Code Layout
1215

13-
Verify:
16+
`main.py` intentionally keeps the CLI small. The engine owns behavior; the CLI only
17+
maps shell arguments to existing config objects.
1418

15-
```bash
16-
halo --help
17-
```
18-
19-
### Setup
20-
21-
The engine needs real LLM access:
22-
23-
```bash
24-
export OPENAI_API_KEY=sk-...
25-
```
26-
27-
## Usage
28-
29-
```bash
30-
halo TRACE_PATH --prompt "your question"
31-
```
32-
33-
### Required
34-
35-
| Arg | Description |
36-
| ---------------- | --------------------------------------------------------------- |
37-
| `TRACE_PATH` | JSONL trace file (e.g. `tests/fixtures/realistic_traces.jsonl`) |
38-
| `--prompt`, `-p` | User prompt sent to the root agent |
39-
40-
### Options
41-
42-
| Flag | Default | Description |
43-
| -------------------- | ------------------ | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
44-
| `--model`, `-m` | `gpt-5.4-mini` | Model name for root, sub, synthesis, and compaction |
45-
| `--max-depth` | `1` | Max subagent recursion depth |
46-
| `--max-turns` | `8` | Max turns per agent |
47-
| `--max-parallel` | `2` | Max concurrent subagents |
48-
| `--reasoning-effort` | _(model default)_ | Reasoning effort for root, subagent, and synthesis calls. One of `none`, `minimal`, `low`, `medium`, `high`, `xhigh`. Compaction never uses reasoning. Omit to use the model family's documented max for known reasoning models. |
49-
50-
## Example
51-
52-
```bash
53-
halo tests/fixtures/realistic_traces.jsonl \
54-
-p "What are the most common failure modes?" \
55-
--max-depth 2 \
56-
--max-turns 12 \
57-
--reasoning-effort high
58-
```
59-
60-
Output streams to stdout: text deltas inline, then a rule-separated panel for each agent output item.
61-
62-
## Telemetry (optional)
63-
64-
HALO can emit OpenInference-shaped traces of its **own** LLM, tool, and agent activity — useful when you're tuning HALO and want to inspect what it actually did. Off by default; nothing is emitted unless you pass `--telemetry`.
65-
66-
### Enable on a run
67-
68-
```bash
69-
halo TRACE_PATH --prompt "..." --telemetry
70-
```
71-
72-
### Routing
73-
74-
The destination is decided by env vars:
75-
76-
- `CATALYST_OTLP_TOKEN` set → spans are uploaded to **inference.net Catalyst** over OTLP.
77-
- `CATALYST_OTLP_TOKEN` unset → spans are written to a **local JSONL file** at `./halo-telemetry-{run_id}.jsonl` in the current working directory.
78-
79-
### Environment variables
80-
81-
| Var | Default | Purpose |
82-
|---|---|---|
83-
| `CATALYST_OTLP_TOKEN` | *(unset)* | If set, uploads to Catalyst over OTLP. If unset, writes JSONL locally. |
84-
| `CATALYST_OTLP_ENDPOINT` | catalyst-tracing default | OTLP endpoint **base URL** (e.g. `https://telemetry.inference.net`). catalyst-tracing appends `/v1/traces` automatically — do **not** include the path, or you'll get a `.../v1/traces/v1/traces` 404 and silently no traces. |
85-
| `CATALYST_DEBUG` | *(unset)* | Set to `1` to surface OTLP export errors at WARNING level. Useful for troubleshooting "no errors, no traces" — the default `BatchSpanProcessor` swallows export failures. |
86-
| `CATALYST_TRACING_RUN_ID` | *(unset)* | When set, becomes the HALO run id (and the `halo.run.id` resource attribute) instead of a generated uuid. Lets a launching system (typically Catalyst) keep its own bookkeeping in sync with HALO's traces. |
87-
| `CATALYST_TRACING_*` | *(unset)* | Generic passthrough — see below. |
88-
| `HALO_TELEMETRY_PATH` | `./halo-telemetry-{run_id}.jsonl` | Local fallback file path. Only consulted when `CATALYST_OTLP_TOKEN` is unset. |
89-
90-
### Local file format
91-
92-
The local JSONL is the inference.net OTLP-shaped form that HALO itself ingests, so traces produced by running HALO can be loaded back into HALO for analysis.
93-
94-
### Notes
95-
96-
- Enabling `--telemetry` clears the openai-agents SDK's default trace processor (which would otherwise upload to OpenAI's dashboard). HALO's own LLM traffic stays out of OpenAI's dashboard while telemetry is on.
97-
- When telemetry is off (the default), no env vars are read and no files are written.
98-
99-
## Developing locally
100-
101-
If you want to hack on the CLI or the engine itself, install from a checkout of this repo with [`uv`](https://docs.astral.sh/uv/):
102-
103-
```bash
104-
git clone https://github.com/context-labs/HALO
105-
cd HALO
106-
uv sync
107-
```
108-
109-
`uv sync` creates `.venv/` and installs `halo-engine` in editable mode. Use `uv run halo ...` (or activate the venv) to invoke the CLI against your local checkout.
19+
Tests for argument parsing and config wiring live in `tests/unit/test_halo_cli.py`.
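
The "map shell arguments to config objects" role described in `halo_cli/README.md` can be sketched with the standard library. This sketch uses `argparse` instead of Typer so that it stays dependency-free, and `SketchConfig` is a hypothetical stand-in for `EngineConfig`, whose real fields are not shown in this diff. Flag names and defaults are copied from the CLI options table in the root README; only a subset is included.

```python
import argparse
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SketchConfig:
    """Hypothetical config object; not the real EngineConfig."""
    trace_path: str
    prompt: str
    model: str
    max_depth: int
    max_turns: int
    max_parallel: int
    temperature: Optional[float]
    parallel_tool_calls: bool


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="halo-sketch")
    parser.add_argument("trace_path", help="JSONL trace file")
    parser.add_argument("--prompt", "-p", required=True)
    parser.add_argument("--model", "-m", default="gpt-5.4-mini")
    parser.add_argument("--max-depth", type=int, default=2)
    parser.add_argument("--max-turns", type=int, default=20)
    parser.add_argument("--max-parallel", type=int, default=10)
    # None stands for "provider default" in this sketch.
    parser.add_argument("--temperature", type=float, default=None)
    # Generates both --parallel-tool-calls and --no-parallel-tool-calls.
    parser.add_argument(
        "--parallel-tool-calls",
        action=argparse.BooleanOptionalAction,
        default=True,
    )
    return parser


def parse_config(argv: List[str]) -> SketchConfig:
    """Parse argv and copy the values straight into the config object."""
    namespace = build_parser().parse_args(argv)
    return SketchConfig(**vars(namespace))
```

Keeping the parser and the config construction separate makes the wiring straightforward to unit test, which is the kind of coverage `tests/unit/test_halo_cli.py` is described as providing.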
