Commit f26a819

Harden state persistence browser smoke CI (#12)
* [verified] Harden state persistence browser smoke
* [verified] Document bridge agent workflow hardening
* [verified] Fix CI workflow validation
1 parent 6619c0d commit f26a819

8 files changed

Lines changed: 602 additions & 39 deletions

.github/workflows/ci.yml

Lines changed: 22 additions & 0 deletions
@@ -11,13 +11,27 @@ jobs:
     name: Build WebGPU Bridge (WASM)
     runs-on: ubuntu-latest
     env:
+      FORCE_JAVASCRIPT_ACTIONS_TO_NODE24: "true"
       LLAMA_CPP_TAG: b9116
+      LLAMA_WEBGPU_SMOKE_MODEL_URL: https://huggingface.co/aladar/llama-2-tiny-random-GGUF/resolve/main/llama-2-tiny-random.gguf
+      LLAMA_WEBGPU_SMOKE_MODEL_SHA256: 81f226c62d28ed4a1a9b9fa080fcd9f0cc40e0f9d5680036583ff98fbcd035cb
+      LLAMA_WEBGPU_SMOKE_MODEL_CACHE: ~/.cache/llama-web-bridge/state-smoke-models
+      LLAMA_WEBGPU_SMOKE_ARTIFACTS_DIR: /tmp/state-persistence-smoke-artifacts
     steps:
       - uses: actions/checkout@v4

       - name: Validate state persistence API contract
         run: python3 scripts/verify_state_persistence_api.py

+      - name: Verify CI reliability contract
+        run: python3 scripts/verify_ci_reliability.py
+
+      - name: Cache state smoke model
+        uses: actions/cache@v4
+        with:
+          path: ~/.cache/llama-web-bridge/state-smoke-models
+          key: state-smoke-model-${{ env.LLAMA_WEBGPU_SMOKE_MODEL_SHA256 }}
+
       - name: Clone llama.cpp source
         run: |
           git clone --depth 1 --branch "$LLAMA_CPP_TAG" https://github.com/ggml-org/llama.cpp.git third_party/llama_cpp
@@ -52,6 +66,14 @@ jobs:
           BRIDGE_DIST_DIR: ${{ runner.temp }}/webgpu_bridge_dist
         run: python3 scripts/state_persistence_browser_smoke.py

+      - name: Upload state persistence smoke diagnostics
+        if: failure()
+        uses: actions/upload-artifact@v4
+        with:
+          name: state-persistence-smoke-artifacts
+          path: ${{ env.LLAMA_WEBGPU_SMOKE_ARTIFACTS_DIR }}
+          if-no-files-found: ignore
+
       - name: Upload bridge artifacts
         uses: actions/upload-artifact@v4
         with:
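
The workflow pins the smoke model by URL and SHA-256, and it sets the cache directory to `~/.cache/llama-web-bridge/state-smoke-models`, a value that reaches the script as a literal string with the `~` still in it. The sketch below shows one way a script could expand that path and check a cached file against the pinned digest before reusing it. The helper names `resolve_cache_dir` and `sha256_matches` are hypothetical; the actual logic in `scripts/state_persistence_browser_smoke.py` is not shown in this commit view and may differ.

```python
import hashlib
import os
from pathlib import Path


def resolve_cache_dir(raw: str) -> Path:
    """Expand a leading `~` so the path points at the real home directory."""
    return Path(os.path.expanduser(raw)).resolve()


def sha256_matches(path: Path, expected_hex: str, chunk_size: int = 1 << 20) -> bool:
    """Stream the file through SHA-256 and compare with the pinned digest."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read in chunks so a large model file is never held in memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_hex.strip().lower()
```

A caller would typically download the model only when the cached file is missing or `sha256_matches` returns `False`, so a corrupted cache entry is replaced rather than trusted.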

.github/workflows/publish_assets.yml

Lines changed: 1 addition & 0 deletions
@@ -19,6 +19,7 @@ on:
       - 'v*'

 env:
+  FORCE_JAVASCRIPT_ACTIONS_TO_NODE24: "true"
   ASSETS_TAG: ${{ github.event_name == 'workflow_dispatch' && inputs.assets_tag || github.ref_name }}
   ASSETS_REPO: ${{ github.event_name == 'workflow_dispatch' && inputs.assets_repo || 'leehack/llama-web-bridge-assets' }}
   LLAMA_CPP_TAG: ${{ github.event_name == 'workflow_dispatch' && inputs.llama_cpp_tag || 'b9116' }}

AGENTS.md

Lines changed: 58 additions & 0 deletions
@@ -32,6 +32,21 @@ Useful environment overrides:
 - `OUT_DIR`
 - `CMAKE_BUILD_TYPE`

+## Agent PR Workflow
+
+For non-trivial runtime, workflow, or API changes, keep the PR path explicit:
+
+1. Start from a clean topic branch and inspect `git status` before editing.
+2. Add or update a regression/contract check before changing behavior when
+   practical. Static contract scripts are acceptable for workflow invariants.
+3. Keep Emscripten build directories, ccache, model caches, and Playwright
+   artifacts outside the repository unless they are intentionally versioned.
+4. Run the targeted checks in this file and the full browser smoke when the
+   change touches `js/`, `src/`, `scripts/`, or GitHub workflows.
+5. Use an independent review before committing PR-bound changes. Fix blocking
+   findings, rerun the targeted checks, then commit locally; do not push or open
+   a PR unless the maintainer asks.
+
 ### Local Verification Notes

 When validating bridge runtime changes locally, keep build/cache output outside
@@ -44,9 +59,44 @@ export EM_CACHE=/private/tmp/llama_web_bridge_emcache
 BUILD_DIR=/private/tmp/llama_web_bridge_build MEM64_BUILD_DIR=/private/tmp/llama_web_bridge_build_mem64 OUT_DIR=/private/tmp/llama_web_bridge_dist WEBGPU_BRIDGE_BUILD_MEM64=1 ./scripts/build_bridge.sh
 ```

+Minimum local checks before handing off a PR-ready branch:
+
+```bash
+node --check js/llama_webgpu_bridge.js
+node --check js/llama_webgpu_bridge_worker.js
+python3 -m py_compile scripts/verify_state_persistence_api.py scripts/verify_ci_reliability.py scripts/state_persistence_browser_smoke.py
+python3 scripts/verify_state_persistence_api.py
+python3 scripts/verify_ci_reliability.py
+```
+
+For state-persistence or workflow changes, also run the browser smoke against a
+built `OUT_DIR`. Keep the tiny model in a user cache or `/private/tmp`; do not
+commit downloaded GGUFs or smoke artifacts:
+
+```bash
+python3 -m pip install --user playwright
+python3 -m playwright install chromium
+python3 scripts/state_persistence_browser_smoke.py \
+  --dist-dir /private/tmp/llama_web_bridge_dist \
+  --model-url https://huggingface.co/aladar/llama-2-tiny-random-GGUF/resolve/main/llama-2-tiny-random.gguf \
+  --model-sha256 81f226c62d28ed4a1a9b9fa080fcd9f0cc40e0f9d5680036583ff98fbcd035cb \
+  --model-cache-dir ~/.cache/llama-web-bridge/state-smoke-models \
+  --artifacts-dir /private/tmp/llama_web_bridge_state_smoke_artifacts
+```
+
 ## CI / Release

 - CI build gate: `.github/workflows/ci.yml`
+- CI reliability contract: `scripts/verify_ci_reliability.py`
+  - Keep this script updated when changing browser smoke behavior, action
+    versions, or workflow diagnostics.
+  - The CI smoke must use a pinned tiny GGUF URL plus SHA-256, cache the model in
+    the same expanded `~/.cache/llama-web-bridge/state-smoke-models` directory
+    used by `actions/cache`, and upload `state-persistence-smoke-artifacts` on
+    failure.
+  - Both CI and publish workflows intentionally set
+    `FORCE_JAVASCRIPT_ACTIONS_TO_NODE24` so action-runtime regressions are caught
+    before Node 20 deprecation becomes a hard failure.
 - Publish workflow: `.github/workflows/publish_assets.yml`
   - Requires `WEBGPU_BRIDGE_ASSETS_PAT`
   - Pushes assets + tag to `llama-web-bridge-assets`
@@ -73,3 +123,11 @@ After publishing assets tag:
   `navigator.hardwareConcurrency` is greater than the bridge pthread pool size.
 - Run the smoke through both direct runtime (`disableWorker: true`) and the
   bridge worker path; both should report `n_threads` capped to the pool size.
+- For state persistence, exercise both direct and worker runtimes with a real
+  tiny model. The smoke should evaluate a prompt, save bytes, mutate state,
+  reload bytes, and verify generation still works after restore.
+- Worker and direct runtime filesystems are separate. Do not silently fall back
+  from worker-owned state APIs to direct runtime state; byte APIs are the durable
+  app-storage path for IndexedDB/OPFS/Cache API integrations.
+- If the smoke downloads a model, never expose raw signed/authenticated locations in
+  thrown errors or artifacts. Redact userinfo, query, and fragment values.
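
The redaction rule in the last bullet (drop userinfo, query, and fragment before a location appears in an error or artifact) could be implemented along these lines. This is a minimal sketch using only `urllib.parse` from the standard library; the function name `redact_url` is hypothetical and is not taken from the repository's smoke script.

```python
from urllib.parse import urlsplit, urlunsplit


def redact_url(url: str) -> str:
    """Return the URL without userinfo, query string, or fragment."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    if ":" in host:
        # IPv6 literals lose their brackets in `hostname`; restore them.
        host = f"[{host}]"
    # Rebuild the network location from host and port only, dropping user:password@.
    netloc = f"{host}:{parts.port}" if parts.port is not None else host
    return urlunsplit((parts.scheme, netloc, parts.path, "", ""))
```

Rebuilding the network location from `hostname` and `port`, rather than editing the original string, means credentials cannot leak through an unusual `user:password@` form that a regex might miss.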

CONTRIBUTING.md

Lines changed: 52 additions & 0 deletions
@@ -29,14 +29,66 @@ cd llama-web-bridge
 LLAMA_CPP_DIR=../llama.cpp OUT_DIR=dist ./scripts/build_bridge.sh
 ```

+For local agent/maintainer validation, prefer external build and cache paths so
+generated files do not dirty the checkout:
+
+```bash
+export CCACHE_DIR=/private/tmp/llama_web_bridge_ccache
+export EM_CACHE=/private/tmp/llama_web_bridge_emcache
+BUILD_DIR=/private/tmp/llama_web_bridge_build \
+MEM64_BUILD_DIR=/private/tmp/llama_web_bridge_build_mem64 \
+OUT_DIR=/private/tmp/llama_web_bridge_dist \
+WEBGPU_BRIDGE_BUILD_MEM64=1 \
+./scripts/build_bridge.sh
+```
+
 ## Validate Outputs

 Expected files:

 - `dist/llama_webgpu_bridge.js`
+- `dist/llama_webgpu_bridge_worker.js`
 - `dist/llama_webgpu_core.js`
 - `dist/llama_webgpu_core.wasm`

+Before opening or updating a PR, run the lightweight contracts:
+
+```bash
+node --check js/llama_webgpu_bridge.js
+node --check js/llama_webgpu_bridge_worker.js
+python3 -m py_compile scripts/verify_state_persistence_api.py scripts/verify_ci_reliability.py scripts/state_persistence_browser_smoke.py
+python3 scripts/verify_state_persistence_api.py
+python3 scripts/verify_ci_reliability.py
+```
+
+For state-persistence, worker, or workflow changes, also run the browser smoke
+against a built dist directory. Use a checksum-pinned tiny model and keep caches
+and artifacts outside the repository:
+
+```bash
+python3 scripts/state_persistence_browser_smoke.py \
+  --dist-dir /private/tmp/llama_web_bridge_dist \
+  --model-url https://huggingface.co/aladar/llama-2-tiny-random-GGUF/resolve/main/llama-2-tiny-random.gguf \
+  --model-sha256 81f226c62d28ed4a1a9b9fa080fcd9f0cc40e0f9d5680036583ff98fbcd035cb \
+  --model-cache-dir ~/.cache/llama-web-bridge/state-smoke-models \
+  --artifacts-dir /tmp/llama-web-bridge-state-smoke
+```
+
+If the smoke downloads from a URL, errors and diagnostics must redact userinfo,
+query strings, and fragments before printing the location.
+
+## Agent Workflow Guardrails
+
+- Keep workflow reliability rules in `scripts/verify_ci_reliability.py` when
+  changing `.github/workflows/ci.yml`, `.github/workflows/publish_assets.yml`, or
+  `scripts/state_persistence_browser_smoke.py`.
+- Preserve `FORCE_JAVASCRIPT_ACTIONS_TO_NODE24` in CI and publish workflows so
+  GitHub Action runtime changes are detected before they become mandatory.
+- Upload state-persistence smoke diagnostics only on failure; successful CI runs
+  should stay quiet beyond the normal build artifacts.
+- Do not push branches, tags, or publish assets from local agent work unless the
+  maintainer explicitly requests that side effect.
+
 ## Publish Process

 Use workflow `.github/workflows/publish_assets.yml`:

README.md

Lines changed: 33 additions & 1 deletion
@@ -121,7 +121,39 @@ This repo includes a wasm build gate in:

 - `.github/workflows/ci.yml`

-It builds against pinned `llama.cpp` tag `b9116` and uploads build artifacts.
+It builds against pinned `llama.cpp` tag `b9116`, uploads build artifacts, and
+runs the static CI reliability contract:
+
+```bash
+python3 scripts/verify_ci_reliability.py
+```
+
+The reliability contract protects the browser smoke and workflow invariants that
+are easy to regress during agent-driven maintenance:
+
+- both CI and publish workflows opt into `FORCE_JAVASCRIPT_ACTIONS_TO_NODE24`
+  to catch action-runtime deprecation issues early;
+- the state-persistence browser smoke supports an integrity-checked tiny GGUF
+  model round trip;
+- the CI model cache path expands `~` before resolving so it matches the
+  `actions/cache` directory;
+- browser smoke failures upload `state-persistence-smoke-artifacts` with console
+  logs, result JSON, and screenshots when available.
+
+Run the model-backed smoke locally after building the bridge if a change touches
+state persistence, workers, browser smoke, or workflow diagnostics:
+
+```bash
+python3 scripts/state_persistence_browser_smoke.py \
+  --dist-dir /path/to/webgpu_bridge_dist \
+  --model-url https://huggingface.co/aladar/llama-2-tiny-random-GGUF/resolve/main/llama-2-tiny-random.gguf \
+  --model-sha256 81f226c62d28ed4a1a9b9fa080fcd9f0cc40e0f9d5680036583ff98fbcd035cb \
+  --model-cache-dir ~/.cache/llama-web-bridge/state-smoke-models \
+  --artifacts-dir /tmp/llama-web-bridge-state-smoke
+```
+
+Do not commit downloaded GGUFs, Playwright screenshots, console logs, generated
+`dist/` assets, or Emscripten build/cache directories.

 ## Publishing
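
The README change describes `scripts/verify_ci_reliability.py` as a static contract, but the script itself is not part of this view. One plausible shape for such a check is a substring scan over the workflow text, sketched below. The marker list and the function name `missing_markers` are assumptions drawn from the workflow lines visible in this commit, not the script's actual contents.

```python
from typing import Iterable, List

# Hypothetical marker list based on the ci.yml lines added in this commit.
CI_MARKERS = (
    'FORCE_JAVASCRIPT_ACTIONS_TO_NODE24: "true"',
    "LLAMA_WEBGPU_SMOKE_MODEL_SHA256:",
    "uses: actions/cache@v4",
    "name: state-persistence-smoke-artifacts",
    "if: failure()",
)


def missing_markers(workflow_text: str, required: Iterable[str]) -> List[str]:
    """Return the required substrings that the workflow text does not contain."""
    return [marker for marker in required if marker not in workflow_text]
```

A script built this way would read each workflow file, call `missing_markers`, and exit non-zero when the returned list is non-empty, which keeps the check fast enough to run before the WASM build starts.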

js/llama_webgpu_bridge.js

Lines changed: 23 additions & 1 deletion
@@ -4163,8 +4163,30 @@ class LlamaWebGpuBridgeRuntime {
       throw new Error('Bridge filesystem is not initialized');
     }

-    if (!core.FS.analyzePath('/states').exists) {
+    const hasStateDir = () => {
+      try {
+        const entries = core.FS.readdir('/');
+        return Array.isArray(entries) && entries.includes('states');
+      } catch (_) {
+        return false;
+      }
+    };
+
+    // Emscripten's generated FS.analyzePath can throw for existing directories in
+    // the pthread/browser runtime. Check the root directory listing instead so
+    // repeated bytes save/load round-trips do not retry mkdir('/states') and hit
+    // EEXIST after the first snapshot.
+    if (hasStateDir()) {
+      return;
+    }
+
+    try {
       core.FS.mkdir('/states');
+    } catch (error) {
+      if (hasStateDir()) {
+        return;
+      }
+      throw error;
     }
   }
