Commit 9341eb1

Merge pull request #9 from StewAlexander-com/harden-install-run
harden install/run: flags, preflight report, offline pip, port-in-use
2 parents feab0d5 + e0df4c1 commit 9341eb1

7 files changed

Lines changed: 760 additions & 129 deletions

.github/workflows/ci.yml

Lines changed: 11 additions & 0 deletions
@@ -133,3 +133,14 @@ jobs:
           else
             echo "scripts/smoke_prompts.sh missing; skipping prompt smoke"
           fi
+
+      - name: Smoke-test CLI flags (--help, --no-launch, port-in-use)
+        env:
+          TUTOR_SKIP_OLLAMA: "1"
+        run: |
+          if [ -f scripts/smoke_flags.sh ]; then
+            chmod +x scripts/smoke_flags.sh
+            ./scripts/smoke_flags.sh
+          else
+            echo "scripts/smoke_flags.sh missing; skipping flags smoke"
+          fi

README.md

Lines changed: 33 additions & 5 deletions
@@ -91,19 +91,46 @@ and floating "Ask tutor" panel.
 > the daemon, pulling the model, or launching the app are all opt-in y/N
 > prompts.** Press Enter and nothing changes on your host.

-### Unattended install
+Run `./install.sh --help` or `./run.sh --help` for every option. The most
+common shapes:

 ```bash
-TUTOR_NONINTERACTIVE=1 ./install.sh # answer "no" to everything
-PYTHON_TUTOR_ASSUME_YES=1 ./install.sh # answer "yes" to everything (trusted hosts only)
-TUTOR_SKIP_OLLAMA=1 ./install.sh # skip all Ollama probes
+./install.sh --yes # trusted host: install Ollama, pull model, launch
+./install.sh --noninteractive # CI: never prompt, default everything to "no"
+./install.sh --skip-ollama # set up Python only; skip every Ollama probe
+./install.sh --model llama3.1:8b # use a different model than gemma3:4b
+./run.sh --port 8042 # choose a different port
+./run.sh --open-browser # open the URL once /api/health is green
 ```

-Full list of env vars and the design rationale behind the two-script flow:
+The classic env vars (`TUTOR_NONINTERACTIVE`, `PYTHON_TUTOR_ASSUME_YES`,
+`TUTOR_SKIP_OLLAMA`, `TUTOR_MODEL`, `TUTOR_PORT`, …) still work — the flags
+are sugar on top of them.
+
+Full env-var list and design rationale:
 [`docs/install-runtime-workflow.md`](docs/install-runtime-workflow.md).

 ---

+## Install reliability
+
+`install.sh` and `run.sh` are designed so the obvious failures fail
+*loudly* with a concrete next step. The most common ones:
+
+| Symptom | What to do |
+| --------------------------------------------- | ----------------------------------------------- |
+| "Python 3.10+ is required and was not found" | `brew install python@3.12` / `apt install python3.12` and re-run. |
+| `pip install` fails on DNS / proxy / pypi | The script detects this and prints offline/proxy/wheelhouse recipes. See [install-audit.md](docs/install-audit.md#pip-install-fails-on-a-network-you-dont-control). |
+| "Port 8001 is already in use" | `./run.sh --port 8002` (probe uses `/dev/tcp`, no `lsof` needed). |
+| Ollama installed but daemon down on `:11434` | Answer `y` to "Start `ollama serve` now?" or run it yourself in another Terminal. |
+| `gh repo clone` fails with auth error | `gh auth status`, then `gh auth login`. Public clone via HTTPS also works. |
+| Repo was moved after install -> "venv broken" | The script auto-rebuilds. Virtualenvs hard-code their own path; relocating is unsupported by Python itself. |
+
+Detailed runbook and the audit that produced these mitigations:
+[`docs/install-audit.md`](docs/install-audit.md).
+
+---
+
 ## Architecture at a glance

 ```
@@ -181,6 +208,7 @@ safety scan over the curriculum, and a Markdown link sanity check. See
 - [Evaluation](docs/evaluation.md)
 - [Roadmap](docs/roadmap.md)
 - [Install & runtime workflow](docs/install-runtime-workflow.md)
+- [Install reliability audit](docs/install-audit.md)
 - [Python foundations curriculum](curriculum/python-foundations.md)
 - [Tutor system prompt](prompts/tutor-system-prompt.md)
 - [ADR 0001 — offline-first local LLM](adr/0001-offline-first-local-llm.md)

docs/install-audit.md

Lines changed: 157 additions & 0 deletions
@@ -0,0 +1,157 @@
# Install audit & reliability runbook

This document captures real-world install failure modes observed in the
wild and the script-level mitigations that ship in `install.sh` /
`run.sh`. Treat it as the runbook a fresh contributor reaches for when
something goes sideways on a new host.

## Origins

The audit was triggered by a real install on a macOS laptop that
exposed several gaps in the earlier scripts:

- Python 3.14 was present, but the semver parser in `install.sh`
  incorrectly extracted the patch component as "minor". It worked by
  luck on `3.14.4`; it would have rejected `3.10.0` / `3.11.0` outright.
- `gh auth` had an invalid token, so a direct `gh repo clone` of a
  private mirror failed. The error was clear but the README did not
  acknowledge it.
- `pip install` failed because DNS could not resolve `pypi.org`.
  The script printed raw pip output and exited; no hint about offline
  wheelhouses, proxies, or internal mirrors.
- Ollama was installed but the daemon was not running. The probe
  worked, but the recovery path required the user to know the magic
  `ollama serve` invocation.
- The remote command sandbox could not talk to `localhost:11434`
  directly; only an interactive Terminal session could. Nothing in the
  scripts surfaced this distinction.
- After verification the install was moved to `~/Projects/python-tutor`.
  The venv had to be rebuilt because virtualenvs hard-code their own
  path inside `pyvenv.cfg` and the shebangs of `bin/*`.

## What changed

### `install.sh`

| Change | Why |
| --- | --- |
| Proper semver parser using `sys.version_info[:2]` | The old `${ver##*.}` pattern silently misparsed 3-component versions like `3.10.0`. |
| Preflight report at the top | Lets the user see OS, Python, Ollama state, model, and mode in one screen before anything mutates the host. |
| Venv path-sensitivity marker (`.tutor_repo_root`) | Rebuilds the venv automatically if the repo was moved since the last install, so users do not get cryptic shebang failures. |
| Captured pip output + DNS/proxy hint detection | When pip fails, the script greps for known network signatures (`name or service not known`, `getaddrinfo`, etc.) and prints the offline wheelhouse recipe. |
| `--help`, `--yes`, `--noninteractive`, `--no-launch`, `--skip-ollama`, `--skip-model-pull`, `--model TAG` flags | Old env-var-only interface was inscrutable. Flags are sugar over the same env vars; existing scripts keep working. |
| Documented exit codes (0/1/2/3) | Lets CI and parent scripts distinguish "Python missing" from "pip failed" from "bad CLI". |

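To illustrate what "flags are sugar over the same env vars" can look like, here is a minimal sketch of a flag parser. It covers only a subset of the flags, the function name `parse_flags` is hypothetical, and it is not the code that ships in `install.sh`; the flag-to-variable mapping is the one listed in `docs/install-runtime-workflow.md`.

```bash
# Hypothetical sketch: translate CLI flags into the env vars the scripts
# already read. parse_flags is an illustrative name, not the real function.
parse_flags() {
  while [ "$#" -gt 0 ]; do
    case "$1" in
      -y|--yes)            export PYTHON_TUTOR_ASSUME_YES=1 ;;
      -n|--noninteractive) export TUTOR_NONINTERACTIVE=1 ;;
      --skip-ollama)       export TUTOR_SKIP_OLLAMA=1 ;;
      --skip-model-pull)   export TUTOR_SKIP_MODEL_PULL=1 ;;
      --model)
        # --model takes a value; a missing value is treated as a CLI error.
        [ "$#" -ge 2 ] || { echo "--model needs a TAG" >&2; return 3; }
        export TUTOR_MODEL="$2"
        shift ;;
      *)
        # 3 is the documented "invalid CLI args" exit code.
        echo "unknown option: $1" >&2
        return 3 ;;
    esac
    shift
  done
}
```

Because the flags only set variables, an invocation such as `TUTOR_SKIP_OLLAMA=1 ./install.sh` and `./install.sh --skip-ollama` would take the same path through the rest of the script under this design.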
### `run.sh`

| Change | Why |
| --- | --- |
| `--help`, `--host`, `--port`, `--model`, `--open-browser`, `--no-launch`, `--skip-ollama`, `-y`, `-n` | Same rationale as install.sh: discoverability. |
| Port-in-use probe via `/dev/tcp` before exec'ing uvicorn | uvicorn's bind-error is ugly; the script now exits 4 with `pick another port`. No new system deps required (no `lsof`/`ss`). |
| `--open-browser` background watcher | Polls `/api/health` and opens the URL only after the server reports healthy, so the browser does not race the bind. |
| `--no-launch` | Lets CI exercise the full preflight (venv check, Ollama probe, port-in-use) without binding a socket. |
| Documented exit codes (0/3/4) | Same reason as install.sh. |

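As a sketch of how a parent script or CI job might use the documented exit codes, the helper below maps a script name and exit status to a short message. The name `describe_exit` is hypothetical; the code-to-meaning mapping is the one given in `docs/install-runtime-workflow.md`.

```bash
# Hypothetical helper: turn a documented exit code into a readable message.
describe_exit() {
  script="$1"
  code="$2"
  case "$script:$code" in
    install.sh:0|run.sh:0) echo "ok" ;;
    install.sh:1)          echo "Python too old or missing" ;;
    install.sh:2)          echo "pip failed" ;;
    install.sh:3|run.sh:3) echo "invalid CLI args" ;;
    run.sh:4)              echo "port already in use" ;;
    *)                     echo "undocumented exit code $code" ;;
  esac
}
```

A caller could run `./run.sh --no-launch`, capture `$?` straight away, and pass it to `describe_exit run.sh` to decide whether to retry on another port or fail the job.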
## Failure modes & remediations

### "Python 3.10+ is required and was not found on PATH"

The script iterates `python3.13 python3.12 python3.11 python3.10 python3`
and accepts the first interpreter whose `sys.version_info[:2]` is
`>= (3, 10)`. If your only Python is newer than that list
(e.g. `python3.14` via `pyenv`), make sure `python3` on `PATH` resolves
to it, or symlink it as `python3.13`.

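The selection loop described above can be sketched as follows. This is an illustration under assumptions, not the code in `install.sh`: the function names `python_is_new_enough` and `find_python` are hypothetical. The last two lines show why stripping everything up to the final dot is not a reliable way to read the minor version.

```bash
# Returns 0 if the given interpreter reports a version >= 3.10.
# The comparison happens inside Python, so no string parsing is needed.
python_is_new_enough() {
  "$1" -c 'import sys; sys.exit(0 if sys.version_info[:2] >= (3, 10) else 1)' 2>/dev/null
}

# Prints the first acceptable interpreter name, or returns 1 if none is found.
find_python() {
  for candidate in python3.13 python3.12 python3.11 python3.10 python3; do
    if command -v "$candidate" >/dev/null 2>&1 && python_is_new_enough "$candidate"; then
      echo "$candidate"
      return 0
    fi
  done
  return 1
}

# For contrast: "${ver##*.}" keeps only the text after the LAST dot,
# which is the patch component of a three-part version.
ver="3.10.0"
echo "${ver##*.}"   # prints 0, not 10
```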
### `pip install` fails on a network you don't control

The script now prints actionable hints whenever pip's log contains a
known network signature. Three paths:

1. **Behind a corporate proxy:**

   ```bash
   export HTTPS_PROXY=http://proxy.example:8080
   export HTTP_PROXY=http://proxy.example:8080
   ./install.sh
   ```

2. **Internal mirror:**

   ```bash
   PIP_INDEX_URL=https://pypi.internal/simple ./install.sh
   ```

3. **Fully offline / air-gapped:** build a wheelhouse on a connected
   host, copy it over, then install from disk only.

   ```bash
   # On a host with pypi access:
   pip download -d wheelhouse -r backend/requirements-dev.txt
   # scp/rsync wheelhouse/ to the target host, then:
   PIP_NO_INDEX=1 PIP_FIND_LINKS="$PWD/wheelhouse" ./install.sh
   ```

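The hint detection behind these recipes can be approximated with a capture-and-grep wrapper. The sketch below is an assumption about the shape of that logic, not the shipped code: the function names are hypothetical, only the first two patterns (`name or service not known`, `getaddrinfo`) come from this audit, and the remaining patterns are illustrative guesses at other network signatures.

```bash
# Returns 0 when a captured pip log contains a known network-failure signature.
# Only the first two patterns are named in the audit; the rest are assumptions.
looks_like_network_failure() {
  grep -Eiq 'name or service not known|getaddrinfo|temporary failure in name resolution|proxyerror|connection refused' "$1"
}

# Runs the given command, captures its output, and prints a hint on
# network-looking failures. Returns 2, the documented "pip failed" code.
run_pip_with_hints() {
  log="$(mktemp)"
  if ! "$@" >"$log" 2>&1; then
    cat "$log" >&2
    if looks_like_network_failure "$log"; then
      echo "hint: pip could not reach the package index; set HTTPS_PROXY, set PIP_INDEX_URL, or install from a wheelhouse with PIP_NO_INDEX=1 and PIP_FIND_LINKS" >&2
    fi
    rm -f "$log"
    return 2
  fi
  rm -f "$log"
}
```

A call could look like `run_pip_with_hints pip install -r backend/requirements-dev.txt`; the wrapper takes the command as arguments so it can be exercised without network access.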
### "venv at backend/.venv looks broken"

Almost always means the repo was moved (or copied) after the venv was
created. The script detects this via the `.tutor_repo_root` marker and
rebuilds. If you intentionally moved the repo and want to keep the venv,
the only safe move is to recreate it -- there is no supported way to
relocate a virtualenv.

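A marker check of this kind might look like the sketch below. The marker location `backend/.venv/.tutor_repo_root` is taken from `docs/install-runtime-workflow.md`; the function names are hypothetical, and treating a missing marker as stale is an assumption made for this example.

```bash
# Returns 0 (stale) when the recorded repo root differs from the current one.
venv_is_stale() {
  repo_root="$1"
  marker="$repo_root/backend/.venv/.tutor_repo_root"
  # Assumption: a venv without a marker is treated as stale.
  [ -f "$marker" ] || return 0
  [ "$(cat "$marker")" != "$repo_root" ]
}

# Records the repo root inside the venv directory after a successful build.
write_marker() {
  repo_root="$1"
  mkdir -p "$repo_root/backend/.venv"
  printf '%s\n' "$repo_root" > "$repo_root/backend/.venv/.tutor_repo_root"
}
```

Because the marker stores an absolute path, copying the repo elsewhere carries the old path along with it, so the copy is reported as stale on its first run.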
### "Port 8001 is already in use"

`run.sh --port 8002` (or any free port). The port-in-use probe uses
bash's `/dev/tcp` so it works without `lsof` / `ss` / `netstat`.

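A `/dev/tcp` probe can be written in a few lines of bash. The sketch below is one plausible form, not necessarily what `run.sh` does: the function names are hypothetical, the loopback address is an assumption, and `/dev/tcp` is a bash feature, so the snippet needs to run under bash rather than a plain POSIX `sh`.

```bash
# Returns 0 if something accepts TCP connections on host:port (bash only).
port_in_use() {
  # The subshell keeps fd 3 local, so the connection closes when it exits.
  (exec 3<>"/dev/tcp/$1/$2") 2>/dev/null
}

# Returns 4, the documented "port already in use" code, when the port is taken.
require_free_port() {
  port="${1:-${TUTOR_PORT:-8001}}"
  if port_in_use 127.0.0.1 "$port"; then
    echo "Port ${port} is already in use; pick another port, e.g. ./run.sh --port $((port + 1))" >&2
    return 4
  fi
}
```

The probe works by attempting a client connection, so it reports a port as busy only when a listener accepts on the probed address.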
### "Ollama is installed but the daemon is not running on :11434"

Two paths:

1. Let the script start it: answer `y` to "Start 'ollama serve' in the
   background now?" -- the daemon log goes to `/tmp/ollama-serve.log`.
2. Start it yourself in another Terminal: `ollama serve`. Some hosts
   (notably remote command sandboxes) cannot reach `localhost:11434`
   from a non-interactive session even when the daemon is running --
   in that case, run `./install.sh` from an interactive Terminal.

### `gh repo clone` fails with an auth error on a private repo

```bash
gh auth status # check current token
gh auth refresh # re-authorize
gh auth login # full re-login (web flow)
```

The public mirror at `https://github.com/StewAlexander-com/python-tutor`
does not require auth; only private forks do.

## Recommended install / run commands

Interactive (default):

```bash
gh repo clone StewAlexander-com/python-tutor
cd python-tutor
./install.sh # prompts y/N for any host-level change
./run.sh # serves UI + API at http://localhost:8001/
```

Unattended (trusted host -- installs Ollama, pulls model, launches):

```bash
./install.sh --yes
```

CI / dry-run (no system changes, no server):

```bash
./install.sh --noninteractive --skip-ollama --skip-model-pull --no-launch
./run.sh --no-launch --skip-ollama
```

Custom port with a browser pop:

```bash
./run.sh --port 8042 --open-browser
```
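
For readers curious how an `--open-browser` style watcher can avoid racing the server bind, here is a minimal polling sketch. It makes several assumptions: `curl` is available, a successful HTTP status from `/api/health` means "healthy", and the browser is opened with `open` (macOS) or `xdg-open` (Linux). The function names are hypothetical and this is not the watcher shipped in `run.sh`.

```bash
# Polls a URL once per second until it answers with a successful HTTP status.
# Returns 0 on success, 1 if the attempts run out.
wait_for_health() {
  url="$1"
  attempts="${2:-30}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if curl -fsS -o /dev/null "$url" 2>/dev/null; then
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  return 1
}

# Opens the base URL only after the health endpoint responds.
open_when_healthy() {
  base="$1"
  if wait_for_health "$base/api/health"; then
    if command -v open >/dev/null 2>&1; then
      open "$base"
    elif command -v xdg-open >/dev/null 2>&1; then
      xdg-open "$base"
    fi
  fi
}
```

Running `open_when_healthy "http://localhost:8042" &` before starting the server would keep the poll in the background while the foreground process binds the port.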

docs/install-runtime-workflow.md

Lines changed: 61 additions & 0 deletions
@@ -275,7 +275,68 @@ Unattended:
 ```bash
 # Pre-approved: install Ollama, start it, pull the model, exec run.sh.
 PYTHON_TUTOR_ASSUME_YES=1 ./install.sh
+# (or, equivalently:)
+./install.sh --yes

 # CI: do not touch Ollama at all.
 TUTOR_SKIP_OLLAMA=1 TUTOR_NONINTERACTIVE=1 ./install.sh
+./install.sh --noninteractive --skip-ollama --skip-model-pull --no-launch
 ```
+
+## CLI flags
+
+Both scripts now accept flags in addition to env vars. Run
+`./install.sh --help` or `./run.sh --help` for the full list. The flags
+are sugar over the same env vars; existing CI invocations keep working.
+
+| Flag | Equivalent env var |
+| -------------------------- | -------------------------------- |
+| `-y`, `--yes` | `PYTHON_TUTOR_ASSUME_YES=1` |
+| `-n`, `--noninteractive` | `TUTOR_NONINTERACTIVE=1` |
+| `--no-launch` | (install) suppresses launch prompt; (run) preflight-only dry run |
+| `--skip-ollama` | `TUTOR_SKIP_OLLAMA=1` |
+| `--skip-model-pull` | `TUTOR_SKIP_MODEL_PULL=1` |
+| `--model TAG` | `TUTOR_MODEL=TAG` |
+| `--host ADDR` (run only) | `TUTOR_HOST=ADDR` |
+| `--port N` (run only) | `TUTOR_PORT=N` |
+| `--open-browser` (run only) | (no env equivalent; opt-in) |
+
+Exit codes:
+
+- `install.sh`: `0` ok, `1` Python too old/missing, `2` pip failed, `3`
+  invalid CLI args.
+- `run.sh`: `0` ok (server started, or `--no-launch` dry-run), `3`
+  invalid CLI args, `4` port already in use.
+
+## Offline / restricted networks
+
+`install.sh` calls `pip install` against PyPI by default. When the host
+cannot reach `pypi.org` (corporate proxy, air-gapped lab, flaky DNS) the
+script captures pip's stderr and prints actionable hints. Three
+documented paths:
+
+1. **Behind a proxy:** export `HTTPS_PROXY` / `HTTP_PROXY` and re-run.
+2. **Internal mirror:** `PIP_INDEX_URL=https://pypi.internal/simple ./install.sh`.
+3. **Air-gapped:** build a wheelhouse on a connected host, then point
+   pip at the local directory and skip the index:
+
+   ```bash
+   # on a connected host:
+   pip download -d wheelhouse -r backend/requirements-dev.txt
+   # rsync wheelhouse/ to the target, then:
+   PIP_NO_INDEX=1 PIP_FIND_LINKS="$PWD/wheelhouse" ./install.sh
+   ```
+
+The detailed audit (failure modes seen in real installs and the
+mitigations now in the scripts) lives at
+[`install-audit.md`](install-audit.md).
+
+## Venv path sensitivity
+
+Python virtualenvs hard-code their absolute path inside
+`pyvenv.cfg` and the shebangs of `bin/*`. Moving or copying a venv
+silently breaks it. `install.sh` writes the repo path to
+`backend/.venv/.tutor_repo_root` and on subsequent runs rebuilds the
+venv if the repo has moved. The takeaway: choose your install location
+before running `./install.sh`. If you must move the repo, re-run
+`./install.sh` from the new location.
