feat(dashboard): SQLite workflowstore for ML jobs and OpenClaw; healt… #1656
Pull request overview
This PR migrates dashboard workflow/control-plane state (ML pipeline jobs + OpenClaw collaboration entities) from in-memory / JSON-file persistence to a shared, durable SQLite-backed workflowstore, and wires new APIs/health reporting around it.
Changes:
- Introduces `dashboard/backend/workflowstore` SQLite store with schema + legacy OpenClaw JSON import.
- Updates ML pipeline runner/handlers to persist jobs + typed progress events, plus adds `/api/ml-pipeline/jobs/{id}/events`.
- Updates OpenClaw handlers/tests to store containers/teams/rooms/messages in SQLite and adds `/api/workflows/health`.
Reviewed changes
Copilot reviewed 28 out of 28 changed files in this pull request and generated 9 comments.
| File | Description |
|---|---|
| docs/agent/state-taxonomy-and-inventory.md | Updates state inventory to reflect SQLite-backed durability for ML pipeline + OpenClaw. |
| dashboard/frontend/src/hooks/useConversationStorage.ts | Documents that localStorage chat is demo-only; points to OpenClaw rooms for durable history. |
| dashboard/backend/workflowstore/store.go | Adds SQLite store open/init + schema for ML pipeline + OpenClaw entities. |
| dashboard/backend/workflowstore/mlpipeline.go | Adds persisted ML job + progress event CRUD and recovery logic. |
| dashboard/backend/workflowstore/openclaw.go | Adds OpenClaw entity/message CRUD against SQLite. |
| dashboard/backend/workflowstore/legacy_import.go | Adds one-time import path from legacy OpenClaw JSON files into SQLite. |
| dashboard/backend/workflowstore/store_test.go | Adds restart/reopen and incremental message append tests for the store. |
| dashboard/backend/router/router.go | Opens workflow store, registers workflow health endpoint, injects store into OpenClaw/ML pipeline. |
| dashboard/backend/router/core_routes.go | Updates ML pipeline route wiring to create runner with store + recover running jobs. |
| dashboard/backend/router/openclaw_routes.go | Updates OpenClaw handler construction to require workflow store. |
| dashboard/backend/router/mcp_routes_test.go | Sets WorkflowDBPath for router integration tests. |
| dashboard/backend/mlpipeline/runner*.go | Replaces in-memory job map with persisted jobs + typed progress events in store. |
| dashboard/backend/handlers/workflow_health.go | Adds /api/workflows/health snapshot endpoint backed by store counts. |
| dashboard/backend/handlers/openclaw*.go + tests | Switches OpenClaw registry/rooms/messages persistence to store and updates tests to use temp SQLite. |
| dashboard/backend/handlers/mlpipeline.go | Adds /events sub-route for durable typed ML progress history. |
| dashboard/backend/handlers/deploy_test.go | Fixes AMD config fixture path to use the documented recipe. |
| dashboard/backend/config/config.go | Adds --workflow-db / env default path config for workflow SQLite. |
Can we reuse the Postgres we set up now in #1683? Migrate all the SQLite storage into a unified one?
dd15c8f to 129edb2
…h/events APIs; AMD deploy test fix Signed-off-by: mkoushni <mkoushni@redhat.com>
…og persist errors
- Add _foreign_keys=1 to SQLite DSN so ON DELETE CASCADE is enforced
- Use INSERT OR IGNORE instead of INSERT OR REPLACE in AppendOpenClawRoomMessage to preserve seq ordering on duplicate messages
- Replace silent error discards (_ = ...) with log.Printf in runner.go so persistence failures are visible in logs
Made-with: Cursor
Signed-off-by: mkoushni <mkoushni@redhat.com>
Signed-off-by: mkoushni <mkoushni@redhat.com>
Signed-off-by: mkoushni <mkoushni@redhat.com> Made-with: Cursor Signed-off-by: mkoushni <mkoushni@redhat.com>
…errors Signed-off-by: mkoushni <mkoushni@redhat.com>
Signed-off-by: mkoushni <mkoushni@redhat.com>
129edb2 to ce103c8
Summary
Make dashboard workflow state restart-safe and server-owned for ML pipeline jobs and OpenClaw collaboration entities, instead of relying on in-memory maps, workspace-local JSON files, or browser-only state.
What changed
- New `workflowstore` package (`dashboard/backend/workflowstore/`): a shared SQLite database (`./data/workflow.sqlite` by default, override via `DASHBOARD_WORKFLOW_DB_PATH` or the `--workflow-db` flag) that owns durable state for ML pipeline jobs, typed progress events, and OpenClaw containers, teams, rooms, and room messages.
- ML pipeline jobs (`dashboard/backend/mlpipeline/runner.go`):
  - Jobs and progress events are stored in the `ml_pipeline_jobs` and `ml_pipeline_progress_events` tables instead of an in-memory `map[string]*Job`.
  - `RecoverInterruptedMLJobs` marks any `running` jobs as `failed` with a clear message (same pattern as the evaluation subsystem).
  - `GET /api/ml-pipeline/jobs/{id}/events` returns typed durable progress history (not log-derived).
- OpenClaw entities (`dashboard/backend/handlers/openclaw*.go`):
  - State moves from JSON files (`containers.json`, `teams.json`, `rooms.json`, `room-messages/*.json`) to SQLite tables.
  - Room messages are appended as `INSERT` rows instead of rewriting the entire JSON file on every message.
- Workflow health API: `GET /api/workflows/health` returns a typed JSON snapshot with store connectivity status, ML job counts, and OpenClaw entity counts, with no log scraping.
- Frontend annotation: `useConversationStorage.ts` now documents that browser `localStorage` chat history is demo/playground-only; OpenClaw room APIs are the supported server-owned collaboration history path.
- State inventory update: `docs/agent/state-taxonomy-and-inventory.md` rows for ML pipeline and OpenClaw updated to reflect the new persistence contract.
- Test fix: `TestMergeDeployPayload_RoundTripsMaintainedAMDConfig` pointed at the non-existent `deploy/amd/config.yaml`; corrected to use `deploy/recipes/balance.yaml` (the actual AMD reference recipe documented in `deploy/amd/README.md`).
Files changed
- `dashboard/backend/workflowstore/{store,mlpipeline,openclaw,legacy_import,store_test}.go`
- `dashboard/backend/handlers/workflow_health.go`
- `dashboard/backend/handlers/openclaw_test_helpers_test.go`
- `dashboard/backend/mlpipeline/runner{,_subprocess,_http,_config}.go`
- `dashboard/backend/handlers/openclaw{,_rooms}.go`
- `dashboard/backend/router/{router,core_routes,openclaw_routes}.go`
- `dashboard/backend/config/config.go`
- `openclaw_test.go`, `openclaw_image_test.go`, `openclaw_mcp_test.go`, `openclaw_room_readonly_test.go`, `openclaw_room_context_test.go`, `openclaw_worker_chat_test.go`, `mcp_routes_test.go`, `deploy_test.go`
- `dashboard/frontend/src/hooks/useConversationStorage.ts`
- `docs/agent/state-taxonomy-and-inventory.md`
Design decisions
- SQLite, consistent with `auth.db` and `evaluations.db` already in the dashboard. Keeps local-dev simple; the `Store` interface is a natural seam for a future Postgres adapter when HA is needed.
- Entity collections are saved with `DELETE + INSERT ALL` in a transaction (same semantics as the JSON-file writes they replace). Room messages use append-only `INSERT`.
- Legacy import runs only when the `openclaw_container` table is empty and `LegacyOpenClawDir` is set. No migration tooling needed; existing JSON data is preserved on disk.
Related
Test plan
- `go test ./dashboard/backend/workflowstore/`: restart survival, recovery, incremental messages
- `go test ./dashboard/backend/handlers/`: all OpenClaw handler tests use `newTestOpenClawHandler` with temp SQLite
- `go test ./dashboard/backend/router/`: MCP integration test with `WorkflowDBPath` set
- `TestMergeDeployPayload_RoundTripsMaintainedAMDConfig` passes with corrected recipe path
- `GET /api/workflows/health` returns entity counts and `"store":"ok"`
- Jobs left `running` across a restart are marked `failed` with recovery message, and progress events are queryable via `/api/ml-pipeline/jobs/{id}/events`
- With legacy `containers.json`/`teams.json`/`rooms.json` present, verify legacy import populates SQLite; subsequent restarts do not re-import
Resolve #1609