Harden state persistence browser smoke CI (#12)

leehack · web-flow · commit f26a8195e15e · 2026-05-12T21:43:24.000-04:00
* [verified] Harden state persistence browser smoke

* [verified] Document bridge agent workflow hardening

* [verified] Fix CI workflow validation
diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml
@@ -11,13 +11,27 @@ jobs:
     name: Build WebGPU Bridge (WASM)
     runs-on: ubuntu-latest
     env:
+      FORCE_JAVASCRIPT_ACTIONS_TO_NODE24: "true"
       LLAMA_CPP_TAG: b9116
+      LLAMA_WEBGPU_SMOKE_MODEL_URL: https://huggingface.co/aladar/llama-2-tiny-random-GGUF/resolve/main/llama-2-tiny-random.gguf
+      LLAMA_WEBGPU_SMOKE_MODEL_SHA256: 81f226c62d28ed4a1a9b9fa080fcd9f0cc40e0f9d5680036583ff98fbcd035cb
+      LLAMA_WEBGPU_SMOKE_MODEL_CACHE: ~/.cache/llama-web-bridge/state-smoke-models
+      LLAMA_WEBGPU_SMOKE_ARTIFACTS_DIR: /tmp/state-persistence-smoke-artifacts
     steps:
       - uses: actions/checkout@v4
 
       - name: Validate state persistence API contract
         run: python3 scripts/verify_state_persistence_api.py
 
+      - name: Verify CI reliability contract
+        run: python3 scripts/verify_ci_reliability.py
+
+      - name: Cache state smoke model
+        uses: actions/cache@v4
+        with:
+          path: ~/.cache/llama-web-bridge/state-smoke-models
+          key: state-smoke-model-${{ env.LLAMA_WEBGPU_SMOKE_MODEL_SHA256 }}
+
       - name: Clone llama.cpp source
         run: |
           git clone --depth 1 --branch "$LLAMA_CPP_TAG" https://github.com/ggml-org/llama.cpp.git third_party/llama_cpp
@@ -52,6 +66,14 @@ jobs:
           BRIDGE_DIST_DIR: ${{ runner.temp }}/webgpu_bridge_dist
         run: python3 scripts/state_persistence_browser_smoke.py
 
+      - name: Upload state persistence smoke diagnostics
+        if: failure()
+        uses: actions/upload-artifact@v4
+        with:
+          name: state-persistence-smoke-artifacts
+          path: ${{ env.LLAMA_WEBGPU_SMOKE_ARTIFACTS_DIR }}
+          if-no-files-found: ignore
+
       - name: Upload bridge artifacts
         uses: actions/upload-artifact@v4
         with:
diff --git a/.github/workflows/publish_assets.yml b/.github/workflows/publish_assets.yml
@@ -19,6 +19,7 @@ on:
       - 'v*'
 
 env:
+  FORCE_JAVASCRIPT_ACTIONS_TO_NODE24: "true"
   ASSETS_TAG: ${{ github.event_name == 'workflow_dispatch' && inputs.assets_tag || github.ref_name }}
   ASSETS_REPO: ${{ github.event_name == 'workflow_dispatch' && inputs.assets_repo || 'leehack/llama-web-bridge-assets' }}
   LLAMA_CPP_TAG: ${{ github.event_name == 'workflow_dispatch' && inputs.llama_cpp_tag || 'b9116' }}
diff --git a/AGENTS.md b/AGENTS.md
@@ -32,6 +32,21 @@ Useful environment overrides:
 - `OUT_DIR`
 - `CMAKE_BUILD_TYPE`
 
+## Agent PR Workflow
+
+For non-trivial runtime, workflow, or API changes, keep the PR path explicit:
+
+1. Start from a clean topic branch and inspect `git status` before editing.
+2. Add or update a regression/contract check before changing behavior when
+   practical. Static contract scripts are acceptable for workflow invariants.
+3. Keep Emscripten build directories, ccache, model caches, and Playwright
+   artifacts outside the repository unless they are intentionally versioned.
+4. Run the targeted checks in this file and the full browser smoke when the
+   change touches `js/`, `src/`, `scripts/`, or GitHub workflows.
+5. Use an independent review before committing PR-bound changes. Fix blocking
+   findings, rerun the targeted checks, then commit locally; do not push or open
+   a PR unless the maintainer asks.
+
 ### Local Verification Notes
 
 When validating bridge runtime changes locally, keep build/cache output outside
@@ -44,9 +59,44 @@ export EM_CACHE=/private/tmp/llama_web_bridge_emcache
 BUILD_DIR=/private/tmp/llama_web_bridge_build MEM64_BUILD_DIR=/private/tmp/llama_web_bridge_build_mem64 OUT_DIR=/private/tmp/llama_web_bridge_dist WEBGPU_BRIDGE_BUILD_MEM64=1 ./scripts/build_bridge.sh
 ```
 
+Minimum local checks before handing off a PR-ready branch:
+
+```bash
+node --check js/llama_webgpu_bridge.js
+node --check js/llama_webgpu_bridge_worker.js
+python3 -m py_compile scripts/verify_state_persistence_api.py scripts/verify_ci_reliability.py scripts/state_persistence_browser_smoke.py
+python3 scripts/verify_state_persistence_api.py
+python3 scripts/verify_ci_reliability.py
+```
+
+For state-persistence or workflow changes, also run the browser smoke against a
+built `OUT_DIR`. Keep the tiny model in a user cache or `/private/tmp`; do not
+commit downloaded GGUFs or smoke artifacts:
+
+```bash
+python3 -m pip install --user playwright
+python3 -m playwright install chromium
+python3 scripts/state_persistence_browser_smoke.py \
+  --dist-dir /private/tmp/llama_web_bridge_dist \
+  --model-url https://huggingface.co/aladar/llama-2-tiny-random-GGUF/resolve/main/llama-2-tiny-random.gguf \
+  --model-sha256 81f226c62d28ed4a1a9b9fa080fcd9f0cc40e0f9d5680036583ff98fbcd035cb \
+  --model-cache-dir ~/.cache/llama-web-bridge/state-smoke-models \
+  --artifacts-dir /private/tmp/llama_web_bridge_state_smoke_artifacts
+```
+
 ## CI / Release
 
 - CI build gate: `.github/workflows/ci.yml`
+- CI reliability contract: `scripts/verify_ci_reliability.py`
+  - Keep this script updated when changing browser smoke behavior, action
+    versions, or workflow diagnostics.
+  - The CI smoke must use a pinned tiny GGUF URL plus SHA-256, cache the model in
+    the same expanded `~/.cache/llama-web-bridge/state-smoke-models` directory
+    used by `actions/cache`, and upload `state-persistence-smoke-artifacts` on
+    failure.
+  - Both CI and publish workflows intentionally set
+    `FORCE_JAVASCRIPT_ACTIONS_TO_NODE24` so action-runtime regressions are caught
+    before Node 20 deprecation becomes a hard failure.
 - Publish workflow: `.github/workflows/publish_assets.yml`
   - Requires `WEBGPU_BRIDGE_ASSETS_PAT`
   - Pushes assets + tag to `llama-web-bridge-assets`
@@ -73,3 +123,11 @@ After publishing assets tag:
   `navigator.hardwareConcurrency` is greater than the bridge pthread pool size.
 - Run the smoke through both direct runtime (`disableWorker: true`) and the
   bridge worker path; both should report `n_threads` capped to the pool size.
+- For state persistence, exercise both direct and worker runtimes with a real
+  tiny model. The smoke should evaluate a prompt, save bytes, mutate state,
+  reload bytes, and verify generation still works after restore.
+- Worker and direct runtime filesystems are separate. Do not silently fall back
+  from worker-owned state APIs to direct runtime state; byte APIs are the durable
+  app-storage path for IndexedDB/OPFS/Cache API integrations.
+- If the smoke downloads a model, never expose raw signed/authenticated locations in
+  thrown errors or artifacts. Redact userinfo, query, and fragment values.
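
The redaction rule in the last bullet above can be sketched as follows. This is a minimal illustration, not the code in `scripts/state_persistence_browser_smoke.py` (its diff is not shown here); the helper name `redact_url` is hypothetical, and it assumes that dropping userinfo, query, and fragment outright is an acceptable form of redaction.

```python
from urllib.parse import urlsplit, urlunsplit


def redact_url(url: str) -> str:
    """Return url without userinfo, query string, or fragment.

    Hypothetical helper: the name and behavior are assumptions for illustration.
    """
    parts = urlsplit(url)
    host = parts.hostname or ""
    # urlsplit strips the brackets from IPv6 literals; restore them.
    if ":" in host:
        host = f"[{host}]"
    try:
        port = parts.port
    except ValueError:
        port = None  # malformed port: drop it rather than echo it back
    if port is not None:
        host = f"{host}:{port}"
    # Rebuild from scheme, host[:port], and path only.
    return urlunsplit((parts.scheme, host, parts.path, "", ""))
```

Applying the helper before formatting any exception message or writing any artifact keeps signed tokens out of CI logs, while unauthenticated locations such as the pinned Hugging Face URL pass through unchanged.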
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
@@ -29,14 +29,66 @@ cd llama-web-bridge
 LLAMA_CPP_DIR=../llama.cpp OUT_DIR=dist ./scripts/build_bridge.sh
 ```
 
+For local agent/maintainer validation, prefer external build and cache paths so
+generated files do not dirty the checkout:
+
+```bash
+export CCACHE_DIR=/private/tmp/llama_web_bridge_ccache
+export EM_CACHE=/private/tmp/llama_web_bridge_emcache
+BUILD_DIR=/private/tmp/llama_web_bridge_build \
+MEM64_BUILD_DIR=/private/tmp/llama_web_bridge_build_mem64 \
+OUT_DIR=/private/tmp/llama_web_bridge_dist \
+WEBGPU_BRIDGE_BUILD_MEM64=1 \
+./scripts/build_bridge.sh
+```
+
 ## Validate Outputs
 
 Expected files:
 
 - `dist/llama_webgpu_bridge.js`
+- `dist/llama_webgpu_bridge_worker.js`
 - `dist/llama_webgpu_core.js`
 - `dist/llama_webgpu_core.wasm`
 
+Before opening or updating a PR, run the lightweight contracts:
+
+```bash
+node --check js/llama_webgpu_bridge.js
+node --check js/llama_webgpu_bridge_worker.js
+python3 -m py_compile scripts/verify_state_persistence_api.py scripts/verify_ci_reliability.py scripts/state_persistence_browser_smoke.py
+python3 scripts/verify_state_persistence_api.py
+python3 scripts/verify_ci_reliability.py
+```
+
+For state-persistence, worker, or workflow changes, also run the browser smoke
+against a built dist directory. Use a checksum-pinned tiny model and keep caches
+and artifacts outside the repository:
+
+```bash
+python3 scripts/state_persistence_browser_smoke.py \
+  --dist-dir /private/tmp/llama_web_bridge_dist \
+  --model-url https://huggingface.co/aladar/llama-2-tiny-random-GGUF/resolve/main/llama-2-tiny-random.gguf \
+  --model-sha256 81f226c62d28ed4a1a9b9fa080fcd9f0cc40e0f9d5680036583ff98fbcd035cb \
+  --model-cache-dir ~/.cache/llama-web-bridge/state-smoke-models \
+  --artifacts-dir /tmp/llama-web-bridge-state-smoke
+```
+
+If the smoke downloads from a URL, errors and diagnostics must redact userinfo,
+query strings, and fragments before printing the location.
+
+## Agent Workflow Guardrails
+
+- Keep workflow reliability rules in `scripts/verify_ci_reliability.py` when
+  changing `.github/workflows/ci.yml`, `.github/workflows/publish_assets.yml`, or
+  `scripts/state_persistence_browser_smoke.py`.
+- Preserve `FORCE_JAVASCRIPT_ACTIONS_TO_NODE24` in CI and publish workflows so
+  GitHub Action runtime changes are detected before they become mandatory.
+- Upload state-persistence smoke diagnostics only on failure; successful CI runs
+  should stay quiet beyond the normal build artifacts.
+- Do not push branches, tags, or publish assets from local agent work unless the
+  maintainer explicitly requests that side effect.
+
 ## Publish Process
 
 Use workflow `.github/workflows/publish_assets.yml`:
diff --git a/README.md b/README.md
@@ -121,7 +121,39 @@ This repo includes a wasm build gate in:
 
 - `.github/workflows/ci.yml`
 
-It builds against pinned `llama.cpp` tag `b9116` and uploads build artifacts.
+It builds against pinned `llama.cpp` tag `b9116`, uploads build artifacts, and
+runs the static CI reliability contract:
+
+```bash
+python3 scripts/verify_ci_reliability.py
+```
+
+The reliability contract protects the browser smoke and workflow invariants that
+are easy to regress during agent-driven maintenance:
+
+- both CI and publish workflows opt into `FORCE_JAVASCRIPT_ACTIONS_TO_NODE24`
+  to catch action-runtime deprecation issues early;
+- the state-persistence browser smoke supports an integrity-checked tiny GGUF
+  model round trip;
+- the CI model cache path has `~` expanded before it is resolved, so it
+  matches the `actions/cache` directory;
+- browser smoke failures upload `state-persistence-smoke-artifacts` with console
+  logs, result JSON, and screenshots when available.
+
+Run the model-backed smoke locally after building the bridge if a change touches
+state persistence, workers, browser smoke, or workflow diagnostics:
+
+```bash
+python3 scripts/state_persistence_browser_smoke.py \
+  --dist-dir /path/to/webgpu_bridge_dist \
+  --model-url https://huggingface.co/aladar/llama-2-tiny-random-GGUF/resolve/main/llama-2-tiny-random.gguf \
+  --model-sha256 81f226c62d28ed4a1a9b9fa080fcd9f0cc40e0f9d5680036583ff98fbcd035cb \
+  --model-cache-dir ~/.cache/llama-web-bridge/state-smoke-models \
+  --artifacts-dir /tmp/llama-web-bridge-state-smoke
+```
+
+Do not commit downloaded GGUFs, Playwright screenshots, console logs, generated
+`dist/` assets, or Emscripten build/cache directories.
 
 ## Publishing
 
diff --git a/js/llama_webgpu_bridge.js b/js/llama_webgpu_bridge.js
@@ -4163,8 +4163,30 @@ class LlamaWebGpuBridgeRuntime {
       throw new Error('Bridge filesystem is not initialized');
     }
 
-    if (!core.FS.analyzePath('/states').exists) {
+    const hasStateDir = () => {
+      try {
+        const entries = core.FS.readdir('/');
+        return Array.isArray(entries) && entries.includes('states');
+      } catch (_) {
+        return false;
+      }
+    };
+
+    // Emscripten's generated FS.analyzePath can throw for existing directories in
+    // the pthread/browser runtime. Check the root directory listing instead so
+    // repeated bytes save/load round-trips do not retry mkdir('/states') and hit
+    // EEXIST after the first snapshot.
+    if (hasStateDir()) {
+      return;
+    }
+
+    try {
       core.FS.mkdir('/states');
+    } catch (error) {
+      if (hasStateDir()) {
+        return;
+      }
+      throw error;
     }
   }
 
diff --git a/scripts/state_persistence_browser_smoke.py b/scripts/state_persistence_browser_smoke.py
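
The body of this file's diff is not shown. As a hedged sketch of the two behaviors the documentation above attributes to the smoke (expanding `~` in the model cache directory and checking the downloaded GGUF against a pinned SHA-256), something like the following would do. All function names here are hypothetical and are not taken from the script.

```python
import hashlib
import os
from pathlib import Path


def resolve_cache_dir(raw: str) -> Path:
    """Expand a leading ~ and return an absolute path (hypothetical helper)."""
    return Path(os.path.expanduser(raw)).resolve()


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large models are never read into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_model(path: Path, expected_sha256: str) -> None:
    """Raise ValueError if the file does not match the pinned checksum."""
    actual = sha256_of(path)
    if actual != expected_sha256.strip().lower():
        # Report only the file name and digests, never the download location.
        raise ValueError(
            f"checksum mismatch for {path.name}: "
            f"expected {expected_sha256}, got {actual}"
        )
```

Expanding `~` explicitly matters because a value such as `~/.cache/llama-web-bridge/state-smoke-models` reaches the script as a literal string when it is passed through a workflow `env:` entry rather than typed into a shell.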
diff --git a/scripts/verify_ci_reliability.py b/scripts/verify_ci_reliability.py
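
The body of this file's diff is likewise not shown. The following is a minimal sketch of what a static workflow contract check could look like, assuming it works by pattern-matching the workflow text. The pattern table and the `missing_invariants` name are assumptions; only the workflow keys and action versions come from the `ci.yml` hunks above.

```python
import re

# Invariants drawn from the ci.yml changes in this commit; the table itself is
# an illustrative assumption, not the contents of verify_ci_reliability.py.
REQUIRED_PATTERNS = {
    "node24 opt-in": r'FORCE_JAVASCRIPT_ACTIONS_TO_NODE24:\s*"true"',
    "pinned model checksum": r"LLAMA_WEBGPU_SMOKE_MODEL_SHA256:\s*[0-9a-f]{64}\b",
    "model cache step": r"uses:\s*actions/cache@v4",
    "failure-only diagnostics": r"if:\s*failure\(\)\s+uses:\s*actions/upload-artifact@v4",
}


def missing_invariants(workflow_text: str) -> list[str]:
    """Return the names of invariants that the workflow text does not satisfy."""
    return [
        name
        for name, pattern in REQUIRED_PATTERNS.items()
        if not re.search(pattern, workflow_text)
    ]
```

A caller would read `.github/workflows/ci.yml`, pass its text to `missing_invariants`, and exit non-zero if the returned list is non-empty. Text matching is cruder than parsing the YAML, but it needs no third-party dependency on the runner.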