1 | 1 | # AGENTS.md |
2 | 2 |
| 3 | +## Dependency upgrades |
| 4 | + |
| 5 | +### Recommended upgrade order |
| 6 | + |
| 7 | +Retrack has a hard dependency surface (Rust crates → SQLx schema cache → Node runtime → |
| 8 | +NPM packages → Docker base images). Upgrade in this order so each stage builds on a |
| 9 | +green previous one and a single failure does not contaminate later stages: |
| 10 | + |
| 11 | +1. **Rust crates** in `Cargo.toml` (workspace root), `components/retrack-types/Cargo.toml`, |
| 12 | + and `benches/js-runtime-perf/Cargo.toml`. Bump everything to the latest semver-compatible |
| 13 | + release in one pass; refresh `Cargo.lock` with `cargo update`. |
| 14 | +2. **`.nvmrc`** to the next Node LTS (the same major must be reflected in `engines.node` of |
| 15 | + every `package.json` in the workspace). |
| 16 | +3. **NPM packages** in workspace-root `package.json` and `components/retrack-web-scraper/package.json`. |
| 17 | + Refresh `package-lock.json` with `npm install` (run from the workspace root so all |
| 18 | + workspaces resolve through the same tree). |
| 19 | +4. **Dockerfile base images** (`Dockerfile`, `Dockerfile.web-scraper`, |
| 20 | + `Dockerfile.web-scraper-camoufox`), UPX, Camoufox, `playwright-python`. Re-pin SHA256 |
| 21 | + manifest digests with `./dev/scripts/docker-pin-digests.sh`. |
| 22 | + |
| 23 | +Do **not** reorder. Bumping NPM packages before Node means the host `npm install` |
| 24 | +runs against the old Node and may produce a lockfile that the npm bundled with the new |
| 25 | +Node rejects; bumping the Docker bases before the NPM lock means the runtime stage's |
| 26 | +`npm ci` validates against a stale lock. |
| 27 | + |
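The ordering rule above can be enforced mechanically. What follows is a minimal sketch, not a script that exists in the repo: `run_stages` and the `stage_*` names are hypothetical, and each stage body only chains commands already listed in this section.

```bash
# run_stages FUNC...: call each stage function in order and stop at the
# first failure, so later stages never run on top of a red one.
run_stages() {
  local stage
  for stage in "$@"; do
    echo "==> ${stage}"
    if ! "${stage}"; then
      echo "stage '${stage}' failed; fix it before moving on" >&2
      return 1
    fi
  done
}

# Hypothetical stage wrappers around the commands from stages 1 to 4.
stage_rust()   { cargo update && cargo clippy --all-targets -- -D warnings && cargo test; }
stage_node()   { npm install && npm test --ws --if-present; }
stage_npm()    { npm install && npm run build --ws --if-present; }
stage_docker() { ./dev/scripts/docker-pin-digests.sh && make docker-api; }

# Usage (from the workspace root):
#   run_stages stage_rust stage_node stage_npm stage_docker
```

Keeping each stage in its own function makes it easy to rerun a single stage after a fix without repeating the earlier ones.
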
| 28 | +### Stage 1 — Rust crates |
| 29 | + |
| 30 | +```bash |
| 31 | +# Edit Cargo.toml files, then: |
| 32 | +cargo update |
| 33 | +cargo fmt |
| 34 | +cargo clippy --all-targets -- -D warnings |
| 35 | +cargo test |
| 36 | +make perf ANALYZE=1 PERF_ITERATIONS=20 PERF_WARMUP=5 # smoke; never fails the build |
| 37 | +``` |
| 38 | + |
| 39 | +What to watch for: |
| 40 | + |
| 41 | +- **`deno_core`** bumps almost always invalidate the `js_runtime::tests::can_access_deno_apis` |
| 42 | + inline snapshot because `Deno.core.*` exposes new ops or removes deprecated ones (e.g. |
| 43 | + `__processTimers`/`__resolveOps` were removed in favour of `__eventLoopTick` / |
| 44 | + `__setTimerExpiry`). Run `cargo insta accept -p retrack` and review the diff to confirm |
| 45 | + the new API surface is intentional rather than a regression. |
| 46 | +- **`sqlx`** macros validate queries against `.sqlx/` cached query plans. After a `sqlx` |
| 47 | + bump or a query change, `cargo check` will fail offline with `Connection refused` / |
| 48 | + `SQLX_OFFLINE` errors. Regenerate the cache: |
| 49 | + |
| 50 | + ```bash |
| 51 | + make dev-up # starts the dev Postgres |
| 52 | + make db-prepare # runs `cargo sqlx prepare` against the live DB |
| 53 | + ``` |
| 54 | + |
| 55 | + Commit the regenerated `.sqlx/` directory alongside the bump. CI runs |
| 56 | + `make db-prepare-check` and fails if the cache is out of date. |
| 57 | + |
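To see locally whether regenerating the cache actually changed `.sqlx/`, one option is to fingerprint the directory before and after `make db-prepare`. This is a sketch under assumptions: `dir_fingerprint` is a hypothetical helper, not part of the repo, and it does not replace the `make db-prepare-check` run in CI.

```bash
# dir_fingerprint DIR: print one checksum summarizing every file under DIR
# (relative path plus content checksum), so two runs can be compared.
dir_fingerprint() {
  local dir="$1"
  [ -d "$dir" ] || { echo "no such directory: $dir" >&2; return 1; }
  (
    cd "$dir" || exit 1
    find . -type f | LC_ALL=C sort | while IFS= read -r file; do
      printf '%s ' "$file"
      cksum < "$file"
    done
  ) | cksum
}

# Usage:
#   before="$(dir_fingerprint .sqlx)"
#   make db-prepare
#   [ "$before" = "$(dir_fingerprint .sqlx)" ] || echo ".sqlx/ changed; commit it with the bump"
```
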
| 58 | +### Stage 2 — Node major bump (`.nvmrc`) |
| 59 | + |
| 60 | +```bash |
| 61 | +echo 24 > .nvmrc |
| 62 | +# Update engines.node in every package.json (workspace root + each leaf) to "24.x" |
| 63 | +# Update @types/node to ^24.x in every package.json that declares it (root, web-scraper). |
| 64 | +npm install # refreshes package-lock.json |
| 65 | +npm run lint --ws --if-present |
| 66 | +npm test --ws --if-present |
| 67 | +npm run build --ws --if-present |
| 68 | +``` |
| 69 | + |
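Because the Node major has to agree between `.nvmrc` and every `engines.node`, a small consistency check can catch a missed `package.json`. The sketch below assumes `node` is on `PATH` and that `engines.node` is set in each file; `major_of` and `check_node_pins` are hypothetical helper names.

```bash
# major_of VERSION: print the leading major number of strings such as
# "24", "v24.1.0", "24.x" or "^24.0.0".
major_of() {
  printf '%s\n' "$1" | sed -E 's/^[^0-9]*([0-9]+).*$/\1/'
}

# check_node_pins PACKAGE_JSON...: compare each file's engines.node with the
# major in ./.nvmrc; returns non-zero if any file disagrees.
check_node_pins() {
  local want pkg range status=0
  want="$(major_of "$(cat .nvmrc)")"
  for pkg in "$@"; do
    range="$(node -p 'JSON.parse(require("fs").readFileSync(process.argv[1], "utf8")).engines.node' "$pkg")" || range=""
    if [ "$(major_of "$range")" != "$want" ]; then
      echo "${pkg}: engines.node is '${range}', .nvmrc says ${want}" >&2
      status=1
    fi
  done
  return "$status"
}

# Usage (from the workspace root):
#   check_node_pins package.json components/retrack-web-scraper/package.json
```
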
| 70 | +### Stage 3 — NPM packages |
| 71 | + |
| 72 | +```bash |
| 73 | +# Edit package.json files (root + leaves), then from the workspace root: |
| 74 | +npm install |
| 75 | +npm run lint --ws --if-present |
| 76 | +npm test --ws --if-present |
| 77 | +npm run build --ws --if-present |
| 78 | +``` |
| 79 | + |
| 80 | +What to watch for: |
| 81 | + |
| 82 | +- **`playwright-core` must be pinned to a single exact version** across the whole product |
| 83 | + (web-scraper, secutils-webui, e2e harness, and the `playwright-python` git ref baked |
| 84 | + into `Dockerfile.web-scraper-camoufox`). Mismatches cause subtle protocol-level |
| 85 | + incompatibilities. When bumping it here, record the pinned version and bump the other |
| 86 | + three locations in their respective stages — do not let them drift between PRs. |
| 87 | +- **`@commitlint/*` majors** sometimes change their config schema; rerun |
| 88 | + `npx commitlint --from HEAD~1` after bumping to confirm the existing |
| 89 | + `commitlint.config.cjs` still parses. |
| 90 | + |
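The Playwright pin described above has two parts: the Node side uses an exact version, and the Python side tracks the same minor through a `release-x.y` ref. A hedged sketch of both checks follows. The helper names are hypothetical, and how the two values are read out of `package.json` and the Dockerfile is left to the caller, since that depends on how they are written there.

```bash
# is_exact_version VERSION: succeed only for a plain x.y.z pin (no ^, ~, ranges or "x").
is_exact_version() {
  printf '%s\n' "$1" | grep -Eq '^[0-9]+\.[0-9]+\.[0-9]+$'
}

# minor_of VERSION: print "major.minor" from strings like "1.59.1" or "release-1.59".
minor_of() {
  printf '%s\n' "$1" | sed -E 's/^[^0-9]*([0-9]+\.[0-9]+).*$/\1/'
}

# playwright_minors_match CORE_VERSION PYTHON_REF: succeed when both sides
# are on the same major.minor.
playwright_minors_match() {
  [ "$(minor_of "$1")" = "$(minor_of "$2")" ]
}

# Usage (values are examples only):
#   is_exact_version "1.59.1" && playwright_minors_match "1.59.1" "release-1.59"
```
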
| 91 | +### Stage 4 — Docker base images |
| 92 | + |
| 93 | +```bash |
| 94 | +# Edit FROM lines (image + tag, no digest) and language/tool versions. |
| 95 | +./dev/scripts/docker-pin-digests.sh # rewrites @sha256:... to current manifest digests |
| 96 | +make docker-api |
| 97 | +make docker-scraper |
| 98 | +make docker-scraper-camoufox |
| 99 | +``` |
| 100 | + |
| 101 | +The pin script reads every `FROM` line in the three Dockerfiles, calls |
| 102 | +`docker buildx imagetools inspect <image>:<tag>` to resolve the current manifest-list digest, and |
| 103 | +rewrites the line in place. It always re-pins, even when the tag did not change — running |
| 104 | +it on every upgrade keeps the digest fresh against the rolling tag. |
| 105 | + |
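The behavior described above can be sketched as two small functions. This is an illustration, not the contents of `docker-pin-digests.sh`: `resolve_digest` and `pin_from_line` are hypothetical names, the sketch assumes the default text output of `imagetools inspect` includes a `Digest:` line, and it does not handle `FROM --platform=...` or `FROM <earlier-stage>` lines.

```bash
# resolve_digest IMAGE:TAG: print the current manifest digest for a tag.
# Assumes the default text output contains a line starting with "Digest:".
resolve_digest() {
  docker buildx imagetools inspect "$1" | awk '/^Digest:/ { print $2; exit }'
}

# pin_from_line LINE DIGEST: rewrite one FROM line so the image reference
# carries @DIGEST, replacing any digest already present. Other lines are
# printed unchanged.
pin_from_line() {
  local line="$1" digest="$2"
  printf '%s\n' "$line" | sed -E \
    "s|^(FROM[[:space:]]+[^[:space:]@]+)(@sha256:[0-9a-f]+)?|\1@${digest}|"
}

# Usage:
#   pin_from_line "FROM node:24-alpine AS builder" "$(resolve_digest node:24-alpine)"
```

Replacing an existing digest instead of skipping pinned lines matches the "always re-pins" behavior: a rolling tag may point at a newer manifest even when the tag text is unchanged.
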
| 106 | +What to watch for: |
| 107 | + |
| 108 | +- **`Dockerfile.web-scraper` runtime stage requires the workspace layout.** The runtime |
| 109 | + image is a flat install, but the workspace lockfile records the web-scraper's deps |
| 110 | + under `packages."components/retrack-web-scraper"`, not under the top-level |
| 111 | + `packages.""` key. With npm ≥ 11 (which ships with Node ≥ 24), `npm ci --production` |
| 112 | + in a flattened layout fails with `Missing: <pkg> from lock file` / |
| 113 | + `Invalid: lock file's <pkg>@x does not satisfy <pkg>@y`. The Dockerfile preserves the |
| 114 | + workspace structure under `/app` and installs with: |
| 115 | + |
| 116 | + ```dockerfile |
| 117 | + COPY --from=builder ["/app/package.json", "/app/package-lock.json", "./"] |
| 118 | + COPY --from=builder ["/app/components/retrack-web-scraper/package.json", "./components/retrack-web-scraper/"] |
| 119 | + COPY --from=builder ["/app/components/retrack-web-scraper/dist/", "./components/retrack-web-scraper/"] |
| 120 | + RUN npm ci --omit=dev --workspace=retrack-web-scraper --include-workspace-root=false && ... |
| 121 | + CMD ["node", "components/retrack-web-scraper/src/index.js"] |
| 122 | + ``` |
| 123 | + |
| 124 | + Do not "simplify" by flattening — it works against the host npm cache but breaks the |
| 125 | + fresh `npm ci` inside the runtime stage. |
| 126 | +- **Camoufox is a coupled triple.** `cloverlabs-camoufox` (PyPI), `playwright-python` |
| 127 | + (git ref `release-x.y`), and the Camoufox Firefox build ID |
| 128 | + (`python -m camoufox fetch official/<id>`) must move together. Only bump the Firefox |
| 129 | + ID when `cloverlabs-camoufox` has been released against it; otherwise the loader |
| 130 | + rejects the binary. The current set has `cloverlabs-camoufox==0.5.5` + |
| 131 | + `playwright-python@release-1.59` + `official/146.0.1-alpha.25`. |
| 132 | +- **`playwright-python` minor** must match the `playwright-core` minor pinned in stage 3 |
| 133 | + (currently `1.59`). The Camoufox image's Playwright driver speaks the protocol of the |
| 134 | + matching Node-side Playwright; mismatches surface as cryptic "browser closed |
| 135 | + unexpectedly" errors at runtime. |
| 136 | +- **Smoke-test each image after build:** check that the entrypoint path exists |
| 137 | + (`components/retrack-web-scraper/src/index.js` for the scraper, `/app/camoufox_launcher.py` |
| 138 | + for camoufox), the language runtime version is what was requested |
| 139 | + (`node --version` / `python3 --version`), and for camoufox that the Firefox build cache |
| 140 | + populated under `/root/.cache/camoufox/browsers`. The full app stack lives one level up |
| 141 | + in the parent `secutils` repo's `dev/docker/docker-compose.yml`; the local |
| 142 | + `dev/docker/docker-compose.yml` here only spins up Postgres for `make dev-up`. |
| 143 | + |
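The smoke checks listed above can be scripted. The sketch below assumes each image contains `sh`, and the image names are passed in by the caller because the tags produced by the `make docker-*` targets are not stated here; `in_image`, `smoke_scraper`, and `smoke_camoufox` are hypothetical helpers.

```bash
# in_image IMAGE CMD: run CMD through `sh -c` in a throwaway container,
# bypassing the image's own entrypoint.
in_image() {
  local image="$1"
  shift
  docker run --rm --entrypoint sh "$image" -c "$*"
}

# Scraper: entrypoint file exists and Node reports its version.
smoke_scraper() {
  in_image "$1" 'test -f components/retrack-web-scraper/src/index.js && node --version'
}

# Camoufox: launcher exists, Python reports its version, browser cache is populated.
smoke_camoufox() {
  in_image "$1" 'test -f /app/camoufox_launcher.py && python3 --version && test -d /root/.cache/camoufox/browsers'
}

# Usage (image names are placeholders):
#   smoke_scraper <scraper-image> && smoke_camoufox <camoufox-image>
```

The printed `node --version` / `python3 --version` output still needs to be compared by eye against the version requested in the Dockerfile.
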
| 144 | +### Cross-cutting reminders |
| 145 | + |
| 146 | +- **Commitlint.** The repo enforces conventional commits via husky pre-commit. Use |
| 147 | + `chore(deps): ...` for dependency-only commits and `chore(docker): ...` for image |
| 148 | + re-pins; mixing dep code changes (e.g. an API adapter for `hickory-resolver`) with the |
| 149 | + bump is fine and should still go under `chore(deps)`. |
| 150 | +- **Performance harness regressions are advisory.** A perf delta after a `deno_core`, |
| 151 | + `tokio`, or `reqwest` bump is informational; the CI job never fails on it. If a |
| 152 | + regression is large, decide whether to roll back the specific crate or accept it |
| 153 | + before commit — but do not block the upgrade chain on the harness. |
| 154 | + |
3 | 155 | ## JS Runtime Performance Harness (`benches/js-runtime-perf/`) |
4 | 156 |
5 | 157 | ### Overview |