Commit 69466f6

feat(runbook): surface ann governance drilldowns in workspace cards
1 parent 0c4865f commit 69466f6

15 files changed

Lines changed: 287 additions & 31 deletions

docs/diataxis/en/explanation/development-progress-dashboard.md

Lines changed: 8 additions & 6 deletions
@@ -22,9 +22,9 @@ It tracks what is already implemented, where the hard gaps remain, and how to ve
2222
- ANN-style prefilter, representation telemetry, circuit health, remote index sync, and live `external_http` connector proof now exist in `src/learning/queryBackend.ts` and `src/learning/vectorAccelerationAdapter.ts`,
2323
- runtime capability/runbook governance now includes explicit ANN remote index-sync health (`query_vector_acceleration_index_sync_health`) in addition to prefilter, health, traceability, and circuit checks,
2424
- `server.ts` now closes the corresponding operator loop: the index-sync gate participates in verification escalation, remediation action-queue generation, and per-check runbook history summaries,
25-
- the agent workspace runtime-runbook verify/checks/action-queue cards now render ANN sync-health metrics, so operators can see the gate directly from the frontend shell without drilling into raw JSON,
25+
- the agent workspace runtime-runbook surfaces now render operator-facing ANN governance directly in the frontend shell: verify/checks now expose sync-health plus circuit-budget, traceability, and prefilter summaries, while action-queue keeps the index-sync incident drilldown,
2626
- the modular `src/routes/knowledge.ts` runtime-runbook surfaces now delegate to live server-side runbook ops with full query-parameter passthrough, so browser/runtime consumers no longer hit the old KLP placeholder payloads for verify/history/checks/action-queue/remediation/schedule flows,
27-
- the browser strict smoke gate now also proves those ANN runbook surfaces from real browser evidence: verify-card ANN health/counts, checks-card first-check ANN sync, and action-queue index-sync drilldown are now asserted end to end instead of remaining component-test-only,
27+
- the browser strict smoke gate now also proves those ANN runbook surfaces from real browser evidence: verify-card ANN sync/circuit/traceability/prefilter content, checks-card first-check ANN sync plus circuit/traceability/prefilter snapshots, and action-queue index-sync drilldown are now asserted end to end instead of remaining component-test-only,
2828
- locale governance for the agent workspace is now tighter on both static and runtime surfaces: bilingual locale bundles now cover the query/quality/runbook cards exercised by strict browser smoke, `src/agent_workspace.locale.contract.test.ts` blocks source-referenced `agentWorkspace.*` key drift, and startup-time translate helpers no longer emit false missing-key warnings before locale initialization finishes,
2929
- Phase-2 runtime diagnostics are now materially implemented in `src/learning/KnowledgeLearningPlatform.ts` for query-backend comparison/history/trend, knowledge staleness diagnostics/rebuild planning, learning-quality history/trend, session-plan quality evaluation/history/trend/runtime-threshold diagnostics, query-backend config, and query-backend diagnostics,
3030
- Phase-3 tutor/memory diagnostics remain real and now include an active default runtime tutor adapter path in `src/server.ts`, so normal server execution can emit adapter telemetry instead of staying catalog-only.
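Review note: the updated verify/checks bullet in this hunk names four ANN governance summaries (sync-health, circuit-budget, traceability, prefilter). The following is a minimal sketch of how a card could collapse raw check results into those four rows. It is not the code in `src/frontend/agent_workspace.js`. The `RunbookCheck` and `AnnSummaryRow` shapes, the `summarizeAnnGovernance` helper, and every check ID except `query_vector_acceleration_index_sync_health` are assumptions made for illustration.

```typescript
type CheckStatus = "pass" | "warn" | "fail";

// Hypothetical shape of one runbook check result.
interface RunbookCheck {
  id: string;
  status: CheckStatus;
  detail?: string;
}

// Hypothetical shape of one row rendered on a verify/checks card.
interface AnnSummaryRow {
  label: string;
  status: CheckStatus | "missing";
  detail: string;
}

// Only the index-sync ID appears in the dashboard text.
// The other three IDs are hypothetical placeholders for illustration.
const ANN_GOVERNANCE_CHECKS: ReadonlyArray<readonly [string, string]> = [
  ["query_vector_acceleration_index_sync_health", "sync-health"],
  ["hypothetical_ann_circuit_budget", "circuit-budget"],
  ["hypothetical_ann_traceability", "traceability"],
  ["hypothetical_ann_prefilter", "prefilter"],
];

// Collapse raw check results into one row per ANN governance area,
// marking areas with no reported check as "missing" instead of hiding them.
function summarizeAnnGovernance(checks: RunbookCheck[]): AnnSummaryRow[] {
  const byId = new Map<string, RunbookCheck>();
  for (const check of checks) {
    byId.set(check.id, check);
  }
  return ANN_GOVERNANCE_CHECKS.map(([id, label]): AnnSummaryRow => {
    const check = byId.get(id);
    return {
      label,
      status: check ? check.status : "missing",
      detail: check && check.detail ? check.detail : "",
    };
  });
}
```

Emitting an explicit "missing" row, rather than dropping the area, keeps an absent gate visible to the operator, which matches the stated goal of not having to drill into raw JSON.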
@@ -36,6 +36,7 @@ It tracks what is already implemented, where the hard gaps remain, and how to ve
3636
- Active execution focus therefore shifts to truth-first foundation recovery:
3737
- finish the remaining packaged/runtime + heavier-workload closure for the embedded graph backend baseline,
3838
- finish the remaining workload/threshold closure for the now-live ANN connector baseline,
39+
- move the newly surfaced ANN runbook visibility from operator-readable summaries to workload-calibrated release gates,
3940
- keep the new diagnostic surfaces honest against the same runtime truth,
4041
- then promote Phase-2 / Phase-3 gates as release-significant only after the graph/ANN baseline is release-grade.
4142

@@ -105,7 +106,7 @@ Current branch status for this slice:
105106
- CI now has an always-on strict desktop evidence job in `.github/workflows/migration-gates.yml` (`agent-workspace-tauri-strict-evidence`) that runs `verify:agent-workspace:tauri:rust:strict` and `verify:agent-workspace:tauri:window-evidence:strict` on Linux hosts with explicit `javascriptcoregtk-4.1` / `libsoup-3.0` dependencies, and release workflow `.github/workflows/release-desktop-multi-os.yml` now enforces the same strict evidence gate on the Linux desktop build path before bundle generation; both workflows also generate a strict evidence index (`verify:agent-workspace:tauri:evidence:index:strict`), enforce a strict evidence manifest gate (`verify:agent-workspace:tauri:evidence:manifest:strict`), and upload tauri evidence artifacts (retention policy pinned to 30 days) for audit traceability, while the Linux release path now publishes `release-fragment-latest.md` into GitHub Release notes using marker-based idempotent upsert,
106107
- migration workflow now also includes a dedicated always-on `agent-workspace-contract-gates` job that runs `test:agent-workspace:contracts` (parity/frontend/tauri contract suites) plus `test:conversation-turn-cache:durability` (restart durability check for turn-cache trend index/export consistency), closing the CI drift-detection gap for agent-workspace contract evolution,
107108
- license governance now adds `test:license:contract` to enforce `GPL-3.0-only` parity across `LICENSE`, `README`, `package.json`, and `src-tauri/Cargo.toml`, and this gate is wired into `migration-gates` CI to block license drift,
108-
- browser smoke now exercises real `conversation/path/query-compare/quality/session/runbook` backend slices (including trend + history diagnostics plus runbook verify/checks/action-queue), real graph runtime, and real path runtime, and now asserts ANN sync-health card content from verify/checks/action-queue browser evidence before emitting screenshot/console/network-summary artifacts (`scripts/verify-agent-workspace-browser.js`, `src/agent_workspace.browser.contract.test.ts`),
109+
- browser smoke now exercises real `conversation/path/query-compare/quality/session/runbook` backend slices (including trend + history diagnostics plus runbook verify/checks/action-queue), real graph runtime, and real path runtime, and now asserts ANN sync-health plus verify/checks circuit/traceability/prefilter card content from browser evidence before emitting screenshot/console/network-summary artifacts (`scripts/verify-agent-workspace-browser.js`, `src/agent_workspace.browser.contract.test.ts`),
109110
- scoped conversation-memory foundation is now wired end-to-end (typed contracts, backend normalizers/routes, capability operation registry, locale keys, lifecycle tests, browser/runtime verification) through `/api/knowledge/conversation-memory/{list,add,search,delete,feedback}` (`src/learning/api.ts`, `src/learning/types.ts`, `src/learning/KnowledgeLearningPlatform.ts`, `src/server.ts`, `src/frontend/agent_workspace.js`, `src/knowledge.api.contract.test.ts`, `src/learning/KnowledgeLearningPlatform.test.ts`, `src/agent_workspace.frontend.test.ts`),
110111
- unified turn streaming baseline is now delivered on `/api/knowledge/conversation` via `Accept: text/event-stream` negotiation with a minimal event set (`turn_started`/`capability_planned`/`capability_progress`/`capability_result`/`turn_completed`/`turn_failed`) and frontend stream-first + sync fallback behavior (`src/server.ts`, `src/frontend/agent_workspace.js`, `src/knowledge.api.contract.test.ts`, `src/agent_workspace.frontend.test.ts`),
111112
- M8.2 recovery semantics are now in place on top of the stream baseline: frontend requests propagate client turn IDs across stream-first + sync fallback, server route `/api/knowledge/conversation` now enforces replay-window idempotency with turn-level dedupe/conflict protection (`turn_id_conflict`), and resumed stream requests replay cached turn events instead of re-running execution (`src/server.ts`, `src/frontend/agent_workspace.js`, `src/knowledge.api.contract.test.ts`, `src/agent_workspace.frontend.test.ts`),
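Review note: the M8.2 bullet above describes replay-window idempotency keyed by client turn ID, with conflict protection (`turn_id_conflict`) and replay of cached events for resumed requests. The following is a minimal sketch of that decision logic under stated assumptions. It is not the implementation in `src/server.ts`. The `TurnReplayWindow` class, the request `fingerprint` parameter, and the explicit `now` clock argument are hypothetical. Only the `turn_id_conflict` code and the event names are taken from the text.

```typescript
// Event names such as "turn_started" or "turn_completed" come from the stream baseline.
interface TurnEvent {
  type: string;
  data?: unknown;
}

interface CachedTurn {
  fingerprint: string; // hypothetical: a hash of the request payload
  events: TurnEvent[];
  storedAt: number;
}

type TurnDecision =
  | { kind: "execute" }
  | { kind: "replay"; events: TurnEvent[] }
  | { kind: "conflict"; code: "turn_id_conflict" };

class TurnReplayWindow {
  private readonly turns = new Map<string, CachedTurn>();
  private readonly windowMs: number;

  constructor(windowMs: number) {
    this.windowMs = windowMs;
  }

  // Decide whether a request should run, replay cached events, or be rejected.
  decide(turnId: string, fingerprint: string, now: number): TurnDecision {
    const cached = this.turns.get(turnId);
    if (!cached || now - cached.storedAt > this.windowMs) {
      // Unknown or expired turn ID: forget any stale entry and execute normally.
      this.turns.delete(turnId);
      return { kind: "execute" };
    }
    if (cached.fingerprint !== fingerprint) {
      // Same turn ID reused for a different payload inside the window.
      return { kind: "conflict", code: "turn_id_conflict" };
    }
    // Same turn ID and payload: replay instead of re-running execution.
    return { kind: "replay", events: cached.events };
  }

  // Store the events of a finished turn so a retry can replay them.
  record(turnId: string, fingerprint: string, events: TurnEvent[], now: number): void {
    this.turns.set(turnId, { fingerprint, events, storedAt: now });
  }
}
```

Passing the clock in as `now` keeps the window logic deterministic in tests. A real server would also need to handle a retry that arrives while the first attempt is still running, which this sketch does not cover.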
@@ -128,9 +129,10 @@ Current branch status for this slice:
128129
- M10 rollout profile operator visibility is now wired end-to-end: runtime payload now exposes `rolloutProfile` (store/vector strictness + aggregate mode), `runtime-capability-matrix` plus runbook/verify/history/history-checks/action-queue/remediation-history/replay-schedule endpoints (including remediation POST flows: `event`/`replay`/`schedule`/`tick`) now echo the same profile, and learning-workbench runtime summary now surfaces `rollout=<mode>(...)` cue; integration/contract/frontend behavior coverage is in place (`src/server.ts`, `src/notemd.server.integration.test.ts`, `src/knowledge.api.contract.test.ts`, `src/frontend/path_app.js`, `src/path_app.runtime_trace_filter.behavior.test.ts`),
129130
- the legacy global Path Mode entry now clears the docked pane first so full-path entry remains deterministic (`src/frontend/app.js`).
130131

131-
## Latest Validation Snapshot (2026-05-12)
132+
## Latest Validation Snapshot (2026-05-14)
132133

133-
- Passed on the current Windows host: `npm run test:agent-workspace:contracts`, `npm run verify:agent-workspace:runtime`, `npm run verify:agent-workspace:browser`, `NOTE_CONNECTION_AGENT_WORKSPACE_BROWSER_STRICT=1 node scripts/verify-agent-workspace-browser.js`, `NOTE_CONNECTION_AGENT_WORKSPACE_BROWSER_UI_STRICT=1 node scripts/verify-agent-workspace-browser.js`, `NOTE_CONNECTION_AGENT_WORKSPACE_BROWSER_UI_STRICT=1 NOTE_CONNECTION_AGENT_WORKSPACE_BROWSER_UI_DYNAMIC_STRICT=1 node scripts/verify-agent-workspace-browser.js`, `npm run verify:agent-workspace:tauri`, `node node_modules/jest/bin/jest.js src/source_manager.loadflow.test.ts src/welcome.loadflow.test.ts src/pathmode.history.contract.test.ts --runInBand --no-cache`, `npm run verify:sidecar:supply`, `npm test -- src/knowledge.api.contract.test.ts --runInBand`, `npm run docs:diataxis:check`, `npm run docs:site:build`.
134+
- Reconfirmed on the current Windows host in this turn: `node node_modules/jest/bin/jest.js src/agent_workspace.frontend.test.ts --runInBand --no-cache`, `npm run test:agent-workspace:contracts`, `npm run build:with-vite`, `npm run docs:diataxis:check`, `npm run docs:site:build`, `NOTE_CONNECTION_AGENT_WORKSPACE_BROWSER_STRICT=1 NOTE_CONNECTION_AGENT_WORKSPACE_BROWSER_UI_STRICT=1 NOTE_CONNECTION_AGENT_WORKSPACE_BROWSER_UI_DYNAMIC_STRICT=1 node scripts/verify-agent-workspace-browser.js`.
135+
- The strict browser proof now explicitly verifies the bilingual runtime-runbook verify/checks ANN governance labels that were added in this slice: sync-health plus circuit, traceability, and prefilter summaries.
134136
- Tauri strict evidence is implementation-closed but still host-dependent:
135137
- the current Windows host proves non-strict tauri/runtime behavior and load-flow parity,
136138
- Linux strict evidence commands (`verify:agent-workspace:tauri:rust:strict`, `verify:agent-workspace:tauri:window-evidence:strict`, strict evidence index/manifest) still require provisioned `webkit2gtk-4.1`, `javascriptcoregtk-4.1`, and `libsoup-3.0`.
@@ -170,7 +172,7 @@ Operational note:
170172
| L2 Retrieval | Evidence-first explainable retrieval | `local_hybrid`, `keyword_only`, and `local_vector` retrieval backends exist with ANN-style prefilter, fallback telemetry, and a live sync-backed `external_http` acceleration path; the default graph store baseline is now embedded `graphdb/sqlite` with restart-durability proof | Keep the remaining backend gap honest: packaged/runtime + heavier-workload hardening are still open for graphdb, and ANN still needs benchmarked rollout thresholds plus larger-workload validation |
171173
| L3 Learning | Mastery diagnostics + actionable path generation | Mastery diagnostics, misconception summaries, dual-path recommendation, session execution primitives, and live quality/session-plan trend surfaces are implemented | Calibrate the now-live `learning quality` / `session plan quality` history-trend-evaluation surfaces on top of a release-grade graphdb/ANN baseline before claiming Phase-2 gate closure |
172174
| L4 Interaction | Workbench for operations + tutoring + diagnostics | The agent workspace shell, focus/path panes, typed capability contract, runtime/browser smoke, and turn-cache operator surfaces are real | Keep the shell/runtime evidence healthy, but stop treating observability cards as release-closed while they still depend on an operational rather than release-grade graph/ANN baseline |
173-
| L5 Governance | Runtime checks, trend gates, remediation loop | Runtime capability matrix, connector/circuit telemetry, ANN index-sync telemetry, and remediation plumbing are real | Tie governance upgrades to non-empty live thresholds plus adapter-backed telemetry on top of a release-grade graph/ANN baseline |
175+
| L5 Governance | Runtime checks, trend gates, remediation loop | Runtime capability matrix, connector/circuit telemetry, ANN index-sync telemetry, remediation plumbing, and operator-facing verify/checks drilldowns for ANN sync/circuit/traceability/prefilter are real | Tie governance upgrades to non-empty live thresholds plus adapter-backed telemetry on top of a release-grade graph/ANN baseline |
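Review note: the L5 Governance row ties gate promotion to three conditions: non-empty live thresholds, adapter-backed telemetry, and a release-grade graph/ANN baseline. The following is a minimal sketch of such a promotion check, assuming a hypothetical `GovernanceGateInputs` shape and hypothetical blocker codes. None of these names come from the repository.

```typescript
// Hypothetical inputs; field names are illustrative only.
interface GovernanceGateInputs {
  liveThresholds: Record<string, number>;
  adapterBackedTelemetry: boolean;
  baselineReleaseGrade: boolean;
}

interface GateDecision {
  eligible: boolean;
  blockers: string[];
}

// A gate is promotable only when every condition named in the table row holds.
// Each unmet condition is reported so operators can see what still blocks promotion.
function evaluateGovernancePromotion(inputs: GovernanceGateInputs): GateDecision {
  const blockers: string[] = [];
  if (Object.keys(inputs.liveThresholds).length === 0) {
    blockers.push("thresholds_empty");
  }
  if (!inputs.adapterBackedTelemetry) {
    blockers.push("telemetry_not_adapter_backed");
  }
  if (!inputs.baselineReleaseGrade) {
    blockers.push("baseline_not_release_grade");
  }
  return { eligible: blockers.length === 0, blockers };
}
```

Returning the full blocker list, rather than a bare boolean, fits the dashboard's emphasis on keeping the remaining gaps visible.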
174176

175177
## Architecture Refactoring Status (2026-05-05, FINAL)
176178
