Commit 6c2b577

docs(workspace): clarify pipeline walkthrough and requirements rationale
1 parent b17b754 commit 6c2b577

1 file changed

Lines changed: 169 additions & 0 deletions

infrastructure/workspace/README.md
@@ -0,0 +1,169 @@
# Workspace Infrastructure Modules

This package documents the helper modules located in `infrastructure/workspace/`.
They support local automation experiments (Hamilton-inspired pipelines, Codex MCP
playbooks and language server tooling) and intentionally ship without a dedicated
`requirements.txt`. The modules either re-implement small portions of external
packages for educational purposes or model optional integrations, leaving it to the
consumer to decide whether to install the third-party dependency.

## Execution quickstart

1. **Activate the project virtual environment** (see repository root `README.md`).
2. **Run the module tests** to validate the self-contained helpers:
   ```bash
   pytest infrastructure/workspace/tests
   ```
3. **Walk the Hamilton dataflow end to end**:
   ```bash
   pytest infrastructure/workspace/tests/hamilton_llm/test_driver.py \
     --maxfail=1 -k hamilton_builder_executes_llm_business_flow -vv
   ```
   Pytest executes the declarative pipeline and prints the captured plan to the
   console. The Hamilton shim in `hamilton_llm/` mirrors the public API of the
   real `hamilton` package without requiring external installs. The step-by-step
   output shows the execution order, configuration overrides and adapter
   chaining you would observe inside the official Hamilton UI.
4. **Review Codex MCP playbooks** in `codex_mcp/playbooks.py`. They describe the
   declarative configuration passed to the Codex CLI (`npx codex mcp`). Optional
   dependencies (`openai-agents`, `openai`) are listed inside the module under
   `ENVIRONMENT_SETUP` instead of a requirements file because installing them is
   only necessary when exercising the Codex workflow.
5. **Language server tooling** (`dev_tools/language_server/`) exposes utilities
   used by the internal development environment. These utilities rely solely on
   the Python standard library and therefore do not require extra dependencies.
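
To make the "declarative pipeline" idea in step 3 concrete, the sketch below shows one common way such a driver can work: each function is a node, its parameter names declare its dependencies, and the driver resolves them recursively while recording the execution order. This is an illustrative toy, not the repository's shim and not the Hamilton API; the node functions and the `execute` helper are hypothetical names invented for this example.

```python
import inspect


# Hypothetical node functions: each parameter name refers to an input
# or to another node with the same name.
def word_count(text: str) -> int:
    return len(text.split())


def token_estimate(word_count: int) -> int:
    # Rough illustrative heuristic, not a real tokenizer.
    return word_count * 2


def summary(text: str, token_estimate: int) -> str:
    return f"{token_estimate} tokens for: {text}"


NODES = {fn.__name__: fn for fn in (word_count, token_estimate, summary)}


def execute(outputs, inputs):
    """Resolve the requested outputs depth-first and log the execution order."""
    values = dict(inputs)
    order = []

    def resolve(name):
        if name in values:  # supplied input or already computed node
            return values[name]
        fn = NODES[name]
        kwargs = {param: resolve(param) for param in inspect.signature(fn).parameters}
        values[name] = fn(**kwargs)
        order.append(name)
        return values[name]

    return {name: resolve(name) for name in outputs}, order


results, order = execute(["summary"], {"text": "guard prompts with declarative functions"})
print("Execution order:", order)
print("Summary:", results["summary"])
```

Because dependencies are resolved before the node that needs them, the logged order is always a valid topological order of the requested subgraph, which is the property the test output above lets you inspect.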

> **Why you do not see a browser UI**
>
> The Hamilton demo is purposely console-first to keep the repository
> dependency-light. When you run the test target above, pytest prints the
> executed nodes and their order. Connecting the same modules to a browser UI is
> possible (for example by embedding the driver inside a FastAPI app), but the
> repository keeps that integration out of scope to avoid shipping additional
> JavaScript tooling or backend frameworks.

### Visualising the pipeline on the command line

Run the snippet below to print the resolved nodes and the resulting business
payload exactly as the tests assert:

```bash
python - <<'PY'
from infrastructure.workspace.hamilton_llm import dataflow, driver
from infrastructure.workspace.hamilton_llm.llm_client import MockLLMClient

# Mock client with a canned response keyed by the dataflow label.
mock = MockLLMClient(
    price_per_1k_tokens=0.4,
    response_catalog={dataflow.DATAFLOW_LABEL: "Use Hamilton declarative functions to guard prompts."},
)

# Build the pipeline from the dataflow module and a pricing policy.
pipeline = (
    driver.Builder()
    .with_modules(dataflow)
    .with_config({"pricing_policy": {"price_per_1k_tokens": 0.4, "safety_multiplier": 1.15}})
    .build()
)

# Request two outputs and pass the pipeline inputs.
result = pipeline.execute(
    ["business_value", "cost_estimate"],
    {
        "idea": "AI copilots for compliance analysts",
        "domain_data": {
            "data": "archived compliance tickets",
            "ui": "browser extension",
            "business_process": "regulatory audit",
        },
        "edge_cases": [
            "Input state space",
            "Guard against prompt injection",
            "Domain expertise",
            "Evaluation",
            "Cost/GPUs",
        ],
        "llm_client": mock,
    },
)

print("Execution order:", pipeline.execution_log)
print("Business payload:", result["business_value"]["llm_plan"])
print("Estimated cost:", result["cost_estimate"])
PY
```

The script mirrors the pytest fixture and produces terminal output such as:

```
Execution order: ['pace_of_development', 'prompt_template', 'llm_prompt', 'llm_response', 'prompt_token_estimate', 'business_value', 'cost_estimate']
Business payload: Use Hamilton declarative functions to guard prompts.
Estimated cost: 0.0552
```
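
As a sanity check on the `Estimated cost` line, the figure is consistent with a simple token-based formula. The sketch below assumes the cost is computed as tokens / 1000 × `price_per_1k_tokens` × `safety_multiplier`; that formula is an assumption, since the actual calculation lives in the dataflow module and is not shown here. Under that assumption the printed value corresponds to an estimate of 120 tokens, a number inferred from the output rather than taken from the source. The `estimate_cost` helper is hypothetical.

```python
def estimate_cost(tokens, price_per_1k_tokens, safety_multiplier):
    """Hypothetical cost formula; the real dataflow may compute this differently."""
    return tokens / 1000 * price_per_1k_tokens * safety_multiplier


# 120 tokens is inferred from the sample output, not stated in the README.
cost = estimate_cost(tokens=120, price_per_1k_tokens=0.4, safety_multiplier=1.15)
print(f"Estimated cost: {cost:.4f}")  # prints "Estimated cost: 0.0552"
```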

## Bootstrapping a clean virtual environment

The modules in this directory run on standard library primitives, but the test
suite and Codex integrations expect a handful of optional dependencies. When
working inside a fresh virtual environment, run the following commands:

```bash
python3.11 -m venv .venv
source .venv/bin/activate
pip install --upgrade pip setuptools wheel
pip install pytest
```

Install the Codex/OpenAI extras only if you plan to exercise the MCP playbooks:

```bash
pip install openai-agents openai python-dotenv
```

Create a `.env` file at the project root and load it via `python-dotenv` to make
the API keys available to the workflows:

```bash
OPENAI_API_KEY=sk-your-api-key
```
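
One way to load that file from Python is sketched below. It uses `load_dotenv` from `python-dotenv` when the package is installed and otherwise falls back to whatever is already in the process environment. The `load_api_key` helper and its error message are hypothetical; the repository's own workflows may load the key differently.

```python
import os


def load_api_key(env_file=".env"):
    """Return OPENAI_API_KEY, reading env_file first when python-dotenv is available."""
    try:
        from dotenv import load_dotenv  # provided by the python-dotenv package
    except ImportError:
        load_dotenv = None  # optional dependency missing; use the process environment

    if load_dotenv is not None:
        # By default, variables that are already set are not overridden.
        load_dotenv(env_file)

    key = os.environ.get("OPENAI_API_KEY")
    if not key:
        raise RuntimeError("OPENAI_API_KEY is not set; add it to .env or export it")
    return key
```

Calling `load_api_key()` once at the start of a workflow fails early with a clear message instead of surfacing a missing key deep inside an API call.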

The Hamilton driver example and developer tooling modules continue to operate
without additional packages beyond what ships with CPython.

### Capturing a requirements snapshot (optional)

When teams want to reproduce the exact versions used during an experiment they
can materialise a temporary `requirements.txt` from the active environment
without committing it to source control:

```bash
pip install openai-agents openai python-dotenv  # only if Codex integrations are needed
pip freeze --exclude-editable > infrastructure/workspace/requirements.txt
```

The generated file documents the state of your virtual environment at that
moment (suitable for sharing in an issue or attaching to a pipeline artifact),
while keeping the repository source clean. Delete the file afterwards if it is
only needed for local troubleshooting:

```bash
rm infrastructure/workspace/requirements.txt
```

## Why there is no `requirements.txt`

- **Hamilton example**: the code re-implements the essentials of the Apache
  Hamilton driver so that unit tests run without external packages.
- **Codex MCP integration**: the required packages depend on the MCP client chosen
  by the operator. The module documents the recommended packages and environment
  variables but does not enforce their installation globally.
- **Developer tools**: these rely on the standard library. Pinning versions in a
  separate requirements file would duplicate the repository-level dependency
  management without adding value.

When using the Codex or OpenAI integrations, install the optional packages in
your active environment:

```bash
pip install openai-agents openai python-dotenv
```

Create a `.env` file with the required keys (`OPENAI_API_KEY`, etc.) before
running the workflows.
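
Before running the workflows, it can help to confirm which of the optional packages are actually present in the active environment. The sketch below queries the standard library's `importlib.metadata` for the three distribution names used in the `pip install` command above; the `installed_versions` helper and the `OPTIONAL_PACKAGES` tuple are hypothetical names for this example.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names taken from the pip install command in this README.
OPTIONAL_PACKAGES = ("openai-agents", "openai", "python-dotenv")


def installed_versions(packages=OPTIONAL_PACKAGES):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


for name, ver in installed_versions().items():
    print(f"{name}: {ver or 'not installed'}")
```

Looking up distributions by name avoids importing the packages themselves, so the check stays cheap and works even when a package's import name differs from its distribution name.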
