|
| 1 | +# Install audit & reliability runbook |
| 2 | + |
| 3 | +This document captures real-world install failure modes observed in the |
| 4 | +wild and the script-level mitigations that ship in `install.sh` / |
| 5 | +`run.sh`. Treat it as the runbook a fresh contributor reaches for when |
| 6 | +something goes sideways on a new host. |
| 7 | + |
| 8 | +## Origins |
| 9 | + |
| 10 | +The audit was triggered by a real install on a macOS laptop that |
| 11 | +exposed several gaps in the earlier scripts: |
| 12 | + |
| 13 | +- Python 3.14 was present, but the semver parser in `install.sh` |
| 14 | + incorrectly extracted the patch component as "minor". It worked by |
| 15 | + luck on `3.14.4`; it would have rejected `3.10.0` / `3.11.0` outright. |
| 16 | +- `gh auth` had an invalid token, so a direct `gh repo clone` of a |
| 17 | + private mirror failed. The error message was clear, but the README |
| 18 | + did not cover this failure mode. |
| 19 | +- `pip install` failed because DNS could not resolve `pypi.org`. |
| 20 | + The script printed raw pip output and exited, with no hint about |
| 21 | + offline wheelhouses, proxies, or internal mirrors. |
| 22 | +- Ollama was installed but the daemon was not running. The probe |
| 23 | + detected this correctly, but recovery required the user to already |
| 24 | + know the `ollama serve` invocation. |
| 25 | +- The remote command sandbox could not talk to `localhost:11434` |
| 26 | + directly; only an interactive Terminal session could. Nothing in the |
| 27 | + scripts surfaced this distinction. |
| 28 | +- After verification the install was moved to `~/Projects/python-tutor`. |
| 29 | + The venv had to be rebuilt because virtualenvs hard-code their own |
| 30 | + path inside `pyvenv.cfg` and the shebangs of `bin/*`. |
| 31 | + |
| 32 | +## What changed |
| 33 | + |
| 34 | +### `install.sh` |
| 35 | + |
| 36 | +| Change | Why | |
| 37 | +| --- | --- | |
| 38 | +| Proper semver parser using `sys.version_info[:2]` | The old `${ver##*.}` pattern silently misparsed 3-component versions like `3.10.0`. | |
| 39 | +| Preflight report at the top | Lets the user see OS, Python, Ollama state, model, and mode in one screen before anything mutates the host. | |
| 40 | +| Venv path-sensitivity marker (`.tutor_repo_root`) | Rebuilds the venv automatically if the repo was moved since the last install, so users do not get cryptic shebang failures. | |
| 41 | +| Captured pip output + DNS/proxy hint detection | When pip fails, the script greps for known network signatures (`name or service not known`, `getaddrinfo`, etc.) and prints the offline wheelhouse recipe. | |
| 42 | +| `--help`, `--yes`, `--noninteractive`, `--no-launch`, `--skip-ollama`, `--skip-model-pull`, `--model TAG` flags | Old env-var-only interface was inscrutable. Flags are sugar over the same env vars; existing scripts keep working. | |
| 43 | +| Documented exit codes (0/1/2/3) | Lets CI and parent scripts distinguish "Python missing" from "pip failed" from "bad CLI". | |
| 44 | + |
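The parsing pitfall in the first row is easy to reproduce in isolation. The sketch below is illustrative only: `ver` is a stand-in for whatever string the old script captured, and the explicit split is shown just for contrast (the shipped fix asks Python for `sys.version_info[:2]` instead).

```bash
# Hypothetical input; the old script's actual variable is not shown in this runbook.
ver="3.10.0"

# `##*.` strips the LONGEST prefix ending in a dot, leaving the last
# component. For a three-part version that is the patch, not the minor.
wrong_minor="${ver##*.}"

# Splitting step by step keeps the components apart.
major="${ver%%.*}"   # drop everything from the first dot onward
rest="${ver#*.}"     # drop the major component and its dot
minor="${rest%%.*}"  # keep only the part before the next dot

echo "misparsed minor: ${wrong_minor}, actual: ${major}.${minor}"
```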
| 45 | +### `run.sh` |
| 46 | + |
| 47 | +| Change | Why | |
| 48 | +| --- | --- | |
| 49 | +| `--help`, `--host`, `--port`, `--model`, `--open-browser`, `--no-launch`, `--skip-ollama`, `-y`, `-n` | Same rationale as install.sh: discoverability. | |
| 50 | +| Port-in-use probe via `/dev/tcp` before exec'ing uvicorn | uvicorn's raw bind error is hard to read; the script now exits with code 4 and a `pick another port` message. No new system dependencies are required (no `lsof`/`ss`). | |
| 51 | +| `--open-browser` background watcher | Polls `/api/health` and opens the URL only after the server reports healthy, so the browser does not race the bind. | |
| 52 | +| `--no-launch` | Lets CI exercise the full preflight (venv check, Ollama probe, port-in-use) without binding a socket. | |
| 53 | +| Documented exit codes (0/3/4) | Same reason as install.sh. | |
| 54 | + |
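As a rough illustration of the `--open-browser` watcher, the sketch below polls a health URL and opens the browser only after it answers. The function names, retry counts, and the use of `curl` are assumptions made for this example; the real `run.sh` may be structured differently.

```bash
# Hypothetical helper: succeed once the URL answers, give up after N tries.
wait_for_health() {
  local url="$1" attempts="${2:-30}" delay="${3:-1}"
  local i=0
  while [ "$i" -lt "$attempts" ]; do
    # -f: fail on HTTP errors; -s: quiet; --max-time bounds each attempt.
    if curl -fs --max-time 2 -o /dev/null "$url"; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Hypothetical opener: `open` on macOS, `xdg-open` on most Linux desktops.
open_url() {
  if command -v open >/dev/null 2>&1; then
    open "$1"
  elif command -v xdg-open >/dev/null 2>&1; then
    xdg-open "$1"
  else
    echo "Server is up at $1" >&2
  fi
}

# Usage, started in the background just before the server is exec'ed:
#   ( wait_for_health "http://localhost:8001/api/health" \
#       && open_url "http://localhost:8001/" ) &
```

Running the watcher in a background subshell keeps it from delaying the server start, and the bounded retry count means it exits on its own if the server never comes up.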
| 55 | +## Failure modes & remediations |
| 56 | + |
| 57 | +### "Python 3.10+ is required and was not found on PATH" |
| 58 | + |
| 59 | +The script tries `python3.13 python3.12 python3.11 python3.10 python3`, |
| 60 | +in that order, and accepts the first interpreter whose |
| 61 | +`sys.version_info[:2]` is `>= (3, 10)`. A newer Python that exists only |
| 62 | +under another name (e.g. `python3.14` via `pyenv`) is not probed: make |
| 63 | +sure `python3` on `PATH` resolves to it, or symlink it as `python3.13`. |
| 64 | + |
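The discovery loop described above can be sketched as follows. `find_python` is a hypothetical name; the candidate list and the version test are the ones this section describes.

```bash
# Print the path of the first acceptable interpreter, or return 1.
find_python() {
  local candidate
  for candidate in python3.13 python3.12 python3.11 python3.10 python3; do
    command -v "$candidate" >/dev/null 2>&1 || continue
    # Let the interpreter compare its own version tuple; exit status 0 means OK.
    if "$candidate" -c 'import sys; sys.exit(0 if sys.version_info[:2] >= (3, 10) else 1)' 2>/dev/null; then
      command -v "$candidate"
      return 0
    fi
  done
  return 1
}

if PYTHON="$(find_python)"; then
  echo "Using ${PYTHON}"
else
  echo "Python 3.10+ is required and was not found on PATH" >&2
fi
```

Comparing tuples inside Python avoids string parsing in the shell entirely, which is what made the earlier parser fragile.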
| 65 | +### `pip install` fails on a network you don't control |
| 66 | + |
| 67 | +The script now prints actionable hints whenever pip's log contains a |
| 68 | +known network signature. Three paths: |
| 69 | + |
| 70 | +1. **Behind a corporate proxy:** |
| 71 | + |
| 72 | + ```bash |
| 73 | + export HTTPS_PROXY=http://proxy.example:8080 |
| 74 | + export HTTP_PROXY=http://proxy.example:8080 |
| 75 | + ./install.sh |
| 76 | + ``` |
| 77 | + |
| 78 | +2. **Internal mirror:** |
| 79 | + |
| 80 | + ```bash |
| 81 | + PIP_INDEX_URL=https://pypi.internal/simple ./install.sh |
| 82 | + ``` |
| 83 | + |
| 84 | +3. **Fully offline / air-gapped:** build a wheelhouse on a connected |
| 85 | + host, copy it over, then install from disk only. |
| 86 | + |
| 87 | + ```bash |
| 88 | + # On a host with pypi access: |
| 89 | + pip download -d wheelhouse -r backend/requirements-dev.txt |
| 90 | + # scp/rsync wheelhouse/ to the target host, then: |
| 91 | + PIP_NO_INDEX=1 PIP_FIND_LINKS="$PWD/wheelhouse" ./install.sh |
| 92 | + ``` |
| 93 | + |
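For orientation, hint detection of this kind can be as small as a `grep` over the captured log. In the sketch below the function names are hypothetical, and only the first two patterns come from this runbook; the remaining ones are common resolver and proxy messages added as assumptions.

```bash
# Return 0 when a captured pip log looks like a network failure.
looks_like_network_failure() {
  local log_file="$1"
  grep -Eiq 'name or service not known|getaddrinfo|temporary failure in name resolution|connection refused|proxyerror' "$log_file"
}

# Capture pip's combined output so it can be inspected after a failure.
# The requirements path is passed in; nothing here is specific to install.sh.
install_with_hints() {
  local pip_log
  pip_log="$(mktemp)"
  if ! python3 -m pip install -r "$1" >"$pip_log" 2>&1; then
    cat "$pip_log" >&2
    if looks_like_network_failure "$pip_log"; then
      echo "hint: this looks like a DNS/proxy problem; try HTTPS_PROXY, PIP_INDEX_URL, or an offline wheelhouse." >&2
    fi
    rm -f "$pip_log"
    return 1
  fi
  rm -f "$pip_log"
}
```

Printing the raw log before the hint keeps the original error visible, so the hint adds to pip's output instead of replacing it.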
| 94 | +### "venv at backend/.venv looks broken" |
| 95 | + |
| 96 | +This almost always means the repo was moved (or copied) after the venv |
| 97 | +was created. The script detects this via the `.tutor_repo_root` marker |
| 98 | +and rebuilds the venv. Even if you moved the repo intentionally, the |
| 99 | +only safe option is to recreate the venv; there is no supported way to |
| 100 | +relocate a virtualenv. |
| 101 | + |
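A marker check along these lines might look like the sketch below. It assumes the marker file lives inside the venv directory and stores the absolute repo path recorded at creation time; both details, and the function names, are illustrative guesses.

```bash
# Return 0 when the venv should be (re)built for this repo location.
venv_needs_build() {
  local venv_dir="$1" repo_root="$2"
  local marker="${venv_dir}/.tutor_repo_root"
  [ -f "$marker" ] || return 0            # no venv, or a venv with no marker
  [ "$(cat "$marker")" != "$repo_root" ]  # rebuild when the recorded path differs
}

# Record where the repo lived when the venv was created.
record_repo_root() {
  local venv_dir="$1" repo_root="$2"
  printf '%s\n' "$repo_root" >"${venv_dir}/.tutor_repo_root"
}

# Usage:
#   if venv_needs_build backend/.venv "$PWD"; then
#     rm -rf backend/.venv
#     python3 -m venv backend/.venv && record_repo_root backend/.venv "$PWD"
#   fi
```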
| 102 | +### "Port 8001 is already in use" |
| 103 | + |
| 104 | +Rerun on a free port, e.g. `./run.sh --port 8002`. The port-in-use probe |
| 105 | +uses bash's `/dev/tcp`, so it works without `lsof` / `ss` / `netstat`. |
| 106 | + |
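A minimal sketch of such a probe, assuming bash (the `/dev/tcp` pseudo-path is a bash feature, not a real file, and is absent from plain `sh`). `port_in_use` is a hypothetical name.

```bash
# Return 0 when something accepts TCP connections on host:port.
port_in_use() {
  local host="$1" port="$2"
  # bash treats /dev/tcp/HOST/PORT as a TCP connect. The subshell keeps
  # fd 3 scoped, so the connection closes as soon as the probe returns.
  (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null
}

# Usage:
#   if port_in_use 127.0.0.1 8001; then
#     echo "Port 8001 is already in use; pick another port" >&2
#     exit 4
#   fi
```

Because the probe works by connecting, it reports a port as busy only when a listener actually accepts the connection, which is the case that matters before a local bind.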
| 107 | +### "Ollama is installed but the daemon is not running on :11434" |
| 108 | + |
| 109 | +Two paths: |
| 110 | + |
| 111 | +1. Let the script start it: answer `y` to "Start 'ollama serve' in the |
| 112 | + background now?" -- the daemon log goes to `/tmp/ollama-serve.log`. |
| 113 | +2. Start it yourself in another Terminal: `ollama serve`. Some hosts |
| 114 | + (notably remote command sandboxes) cannot reach `localhost:11434` |
| 115 | + from a non-interactive session even when the daemon is running -- |
| 116 | + in that case, run `./install.sh` from an interactive Terminal. |
| 117 | + |
| 118 | +### `gh repo clone` fails with an auth error on a private repo |
| 119 | + |
| 120 | +```bash |
| 121 | +gh auth status # check current token |
| 122 | +gh auth refresh # re-authorize |
| 123 | +gh auth login # full re-login (web flow) |
| 124 | +``` |
| 125 | + |
| 126 | +The public mirror at `https://github.com/StewAlexander-com/python-tutor` |
| 127 | +does not require auth; only private forks do. |
| 128 | + |
| 129 | +## Recommended install / run commands |
| 130 | + |
| 131 | +Interactive (default): |
| 132 | + |
| 133 | +```bash |
| 134 | +gh repo clone StewAlexander-com/python-tutor |
| 135 | +cd python-tutor |
| 136 | +./install.sh # prompts y/N for any host-level change |
| 137 | +./run.sh # serves UI + API at http://localhost:8001/ |
| 138 | +``` |
| 139 | + |
| 140 | +Unattended (trusted host -- installs Ollama, pulls model, launches): |
| 141 | + |
| 142 | +```bash |
| 143 | +./install.sh --yes |
| 144 | +``` |
| 145 | + |
| 146 | +CI / dry-run (no system changes, no server): |
| 147 | + |
| 148 | +```bash |
| 149 | +./install.sh --noninteractive --skip-ollama --skip-model-pull --no-launch |
| 150 | +./run.sh --no-launch --skip-ollama |
| 151 | +``` |
| 152 | + |
| 153 | +Custom port with a browser pop: |
| 154 | + |
| 155 | +```bash |
| 156 | +./run.sh --port 8042 --open-browser |
| 157 | +``` |