feat(workflow): add checkpointed rolling generate resumable workflow #5

Merged: fcogidi merged 16 commits into main from feat/workflow_engine on Apr 21, 2026
Conversation

@fcogidi fcogidi commented Apr 21, 2026

Summary

  • Split the generate workflow into focused internal modules for setup, planning, preparation, and execution.
  • Keep results.jsonl as the live artifact while moving checkpointing to SQLite with strict resume validation.
  • Add planner-backed resume handling, clearer shutdown behavior, and targeted readability comments for the new workflow seams.
  • Update CLI/docs and tests to match the new workflow structure and interrupt-safe cleanup path.
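
The SQLite checkpointing with strict resume validation described in the summary could look roughly like the sketch below. It is an illustration under stated assumptions, not the schema this PR ships: the `meta` and `done` tables, the `source_fp` key, and the helpers `fingerprint`, `open_checkpoint`, `mark_done`, and `pending_rows` are all hypothetical names.

```python
import hashlib
import sqlite3
from typing import List


def fingerprint(text: str) -> str:
    """Return a stable SHA-256 hex digest of the input text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def open_checkpoint(path: str, source_fp: str, resume: bool) -> sqlite3.Connection:
    """Open a checkpoint DB; on resume, refuse to continue if the input changed."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE IF NOT EXISTS meta (key TEXT PRIMARY KEY, value TEXT)")
    conn.execute("CREATE TABLE IF NOT EXISTS done (row_index INTEGER PRIMARY KEY)")
    stored = conn.execute(
        "SELECT value FROM meta WHERE key = 'source_fp'"
    ).fetchone()
    if resume:
        # Strict validation: a resume is only allowed against the same input.
        if stored is None:
            conn.close()
            raise ValueError("no checkpoint to resume from")
        if stored[0] != source_fp:
            conn.close()
            raise ValueError("input changed since the checkpoint was written")
    else:
        # Fresh run: forget earlier progress and record the new fingerprint.
        conn.execute("DELETE FROM done")
        conn.execute(
            "INSERT OR REPLACE INTO meta VALUES ('source_fp', ?)", (source_fp,)
        )
    conn.commit()
    return conn


def mark_done(conn: sqlite3.Connection, row_index: int) -> None:
    """Record that one input row has been fully processed."""
    conn.execute("INSERT OR IGNORE INTO done VALUES (?)", (row_index,))
    conn.commit()


def pending_rows(conn: sqlite3.Connection, total: int) -> List[int]:
    """Return the indices of rows that still need to be processed."""
    done = {row[0] for row in conn.execute("SELECT row_index FROM done")}
    return [i for i in range(total) if i not in done]
```

Committing after each row keeps the checkpoint durable across an interrupt, at the cost of one transaction per row; a real implementation might batch commits instead.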

Testing

  • uv run pytest -q
  • UV_CACHE_DIR=/tmp/uv-cache uv run pre-commit run -a

@fcogidi fcogidi changed the title from "Refactor generate workflow into a resumable internal package" to "feat(workflow): add checkpointed rolling generate resumable workflow" Apr 21, 2026
@fcogidi fcogidi requested a review from Copilot April 21, 2026 00:35
Copilot AI left a comment

Pull request overview

This PR refactors infermesh generate into a checkpointed, rolling-window workflow that can resume interrupted runs using a SQLite checkpoint DB (with strict validation), while keeping results.jsonl as the live user-facing artifact.

Changes:

  • Introduces a modular generate workflow engine (_workflow/*) with rolling scheduling, per-row persistence, and strict resume planning/validation.
  • Updates CLI to use the new workflow, adds --checkpoint-dir / INFERMESH_CHECKPOINT_DIR, and improves error surfacing.
  • Adds comprehensive tests plus docs/README updates for the new resume + mapper behavior.
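
As a rough illustration of what "rolling scheduling with per-row persistence" can mean, the sketch below keeps a bounded number of requests in flight and hands each result to a callback as soon as it completes. It is not the engine from `_workflow/engine.py`: `run_rolling`, its parameters, and the `on_result` callback are hypothetical names chosen for this example.

```python
import asyncio
from typing import Awaitable, Callable, Dict, Iterable


async def run_rolling(
    items: Iterable[str],
    worker: Callable[[str], Awaitable[str]],
    window: int,
    on_result: Callable[[int, str], None],
) -> None:
    """Keep at most `window` worker calls in flight, refilling as each finishes."""
    pending: Dict["asyncio.Task[str]", int] = {}
    source = enumerate(items)
    exhausted = False
    while True:
        # Top the window back up from the input stream.
        while not exhausted and len(pending) < window:
            try:
                index, item = next(source)
            except StopIteration:
                exhausted = True
                break
            pending[asyncio.create_task(worker(item))] = index
        if not pending:
            break
        # Wake up as soon as any in-flight item completes.
        done, _ = await asyncio.wait(
            set(pending), return_when=asyncio.FIRST_COMPLETED
        )
        for task in done:
            # Emit each row immediately instead of waiting for a whole batch.
            on_result(pending.pop(task), task.result())
```

Compared with fixed batches, a rolling window means one slow request does not hold up the rest of its batch, and each completed row can be written out (for example to `results.jsonl`) right away. In this sketch a worker exception propagates out of `task.result()`; a fuller version would presumably turn it into an error row.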

Reviewed changes

Copilot reviewed 20 out of 20 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| `tests/test_workflow.py` | New end-to-end unit tests for rolling execution, checkpointing, and strict resume validation. |
| `tests/test_sync_runner.py` | Adds a regression test ensuring SyncRunner cancels and allows cleanup on KeyboardInterrupt. |
| `tests/test_cli.py` | Updates CLI tests for new resume/checkpoint behavior, mapper support, and error handling. |
| `tests/fakes.py` | Extends fakes to support async generation, checkpoint DB helpers, and resume assertions. |
| `src/infermesh/sync_runner.py` | Reworks run() to cancel loop-owned tasks on KeyboardInterrupt and allow coroutine cleanup. |
| `src/infermesh/client.py` | Routes sync wrappers through the new _run_sync() helper. |
| `src/infermesh/cli.py` | Switches generate to run_generate_workflow, adds checkpoint-dir plumbing, and uses managed client lifecycle. |
| `src/infermesh/_workflow/source.py` | Adds source parsing, fingerprinting, stdin materialization, and input/output path validation helpers. |
| `src/infermesh/_workflow/runtime.py` | Centralizes generate-run setup/cleanup (staging, persistence sink, resume plan, preparer selection). |
| `src/infermesh/_workflow/resume.py` | Implements strict resume validation and a SQLite-backed planner for file-backed resumes. |
| `src/infermesh/_workflow/prepare.py` | Adds preparers for sequential runs and planner-driven resumes; maps rows into work items or immediate error rows. |
| `src/infermesh/_workflow/models.py` | Introduces internal workflow dataclasses (checkpoint keys, work items, resume plan, etc.). |
| `src/infermesh/_workflow/mapping.py` | Adds mapper loading, built-in mapping behavior, metadata validation, and mapping fingerprinting. |
| `src/infermesh/_workflow/engine.py` | Implements rolling-window scheduler, per-item emission/persistence, and the public workflow entrypoint. |
| `src/infermesh/_workflow/checkpoint.py` | Adds SQLite checkpoint schema, staging/bootstrap logic, and a single-threaded persistence sink. |
| `src/infermesh/_workflow/__init__.py` | Exposes run_generate_workflow from the internal workflow package. |
| `src/infermesh/_client_runtime.py` | Adds _run_sync() helper on the client runtime to centralize sync execution via SyncRunner. |
| `src/infermesh/_cli_support.py` | Adds missing deployments table validation and re-homes _build_generation_record for reuse by the workflow engine. |
| `docs/guide.md` | Updates resume docs to describe SQLite checkpointing, strict validation, rolling window behavior, and --mapper. |
| `README.md` | Updates generate/resume examples to reference checkpoint files and adds a mapper example. |
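
The `sync_runner.py` row mentions cancelling loop-owned tasks on KeyboardInterrupt so coroutines can clean up. One common way to get that behavior with plain asyncio is sketched below. It is an assumption about the general technique, not the actual `SyncRunner.run()` code, and `run_with_cleanup` is a hypothetical name.

```python
import asyncio
from typing import Any, Coroutine, TypeVar

T = TypeVar("T")


def run_with_cleanup(coro: Coroutine[Any, Any, T]) -> T:
    """Run a coroutine to completion on a private event loop.

    On KeyboardInterrupt, cancel every unfinished task on that loop and
    drive the loop until their cancellation handlers have run, then re-raise.
    """
    loop = asyncio.new_event_loop()
    try:
        task = loop.create_task(coro)
        try:
            return loop.run_until_complete(task)
        except KeyboardInterrupt:
            # all_tasks() only returns tasks that have not finished yet.
            unfinished = asyncio.all_tasks(loop)
            for pending in unfinished:
                pending.cancel()
            if unfinished:
                # Let `except CancelledError` / `finally` blocks execute.
                loop.run_until_complete(
                    asyncio.gather(*unfinished, return_exceptions=True)
                )
            if task.done() and not task.cancelled():
                # Mark the stored exception as retrieved to avoid a log warning.
                task.exception()
            raise
    finally:
        loop.close()
```

Re-raising after the drain keeps the interrupt visible to the caller, while the second `run_until_complete` gives cancelled coroutines a chance to flush state such as a checkpoint before the loop closes.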

Comment thread src/infermesh/sync_runner.py Outdated
Comment thread src/infermesh/_workflow/engine.py
@fcogidi fcogidi marked this pull request as ready for review April 21, 2026 00:59
@fcogidi fcogidi merged commit 605199c into main Apr 21, 2026
8 checks passed
@fcogidi fcogidi deleted the feat/workflow_engine branch April 21, 2026 01:16
