The Lockfile

pytask.lock is the default state backend. It stores task state in a portable, git-friendly format so runs can be resumed or shared across machines.

!!! note

SQLite is the legacy format. When no lockfile exists, pytask reads the legacy database
state and writes `pytask.lock`. The lockfile remains the primary backend for skip
decisions, and `pytask build` also keeps the legacy database updated for downgrade
compatibility.

Example

# This file is automatically @generated by pytask.
# It is not intended for manual editing.

lock-version = "1"

[[task]]
id = "src/tasks/data.py::task_clean_data"
state = "f9e8d7c6..."

[task.depends_on]
"data/raw/input.csv" = "e5f6g7h8..."

[task.produces]
"data/processed/clean.parquet" = "m3n4o5p6..."

Behavior

On each run, pytask:

  1. Reads pytask.lock (if present).
  2. Compares the current state() of each task, dependency, and product with the stored state.
  3. Skips tasks whose states match; runs the rest.
  4. Updates pytask.lock after each completed task (atomic write).
  5. Updates pytask.lock after skipping unchanged tasks (unless --dry-run or --explain is active).
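
The comparison in steps 2 and 3 can be sketched as follows. This is a minimal illustration, not pytask's internal code: `LockEntry` and `can_skip` are hypothetical names, and the state strings are made-up placeholders.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class LockEntry:
    """Hypothetical in-memory view of one [[task]] entry (not a pytask class)."""

    state: str
    depends_on: dict[str, str] = field(default_factory=dict)
    produces: dict[str, str] = field(default_factory=dict)


def can_skip(stored: LockEntry | None, current: LockEntry) -> bool:
    """Return True only if the task, its dependencies, and its products are unchanged."""
    if stored is None:
        # The task has never been recorded, so it has to run.
        return False
    return (
        stored.state == current.state
        and stored.depends_on == current.depends_on
        and stored.produces == current.produces
    )


# Placeholder states for illustration only.
stored = LockEntry("t1", {"data/raw/input.csv": "d1"}, {"data/processed/clean.parquet": "p1"})
unchanged = LockEntry("t1", {"data/raw/input.csv": "d1"}, {"data/processed/clean.parquet": "p1"})
edited_input = LockEntry("t1", {"data/raw/input.csv": "d2"}, {"data/processed/clean.parquet": "p1"})
```

With these values, `can_skip(stored, unchanged)` is true, while a changed dependency state or a missing stored entry forces a run.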

Portability

There are two portability concerns:

  1. IDs: Lockfile IDs must be project‑relative and stable across machines.
  2. State values: state is opaque; portability depends on each node’s state() implementation. Content hashes are portable; timestamps are not.
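
The difference between portable and non-portable state values can be illustrated with two hypothetical state functions. Neither `content_state` nor `mtime_state` is part of pytask; they only show why a content hash survives a copy to another machine while a timestamp does not.

```python
import hashlib
import os
import tempfile
from pathlib import Path


def content_state(path: Path) -> str:
    """Portable: depends only on the bytes in the file."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def mtime_state(path: Path) -> str:
    """Not portable: changes on copy, checkout, or when clocks differ."""
    return str(path.stat().st_mtime_ns)


with tempfile.TemporaryDirectory() as tmp:
    first = Path(tmp, "a.csv")
    second = Path(tmp, "b.csv")
    first.write_bytes(b"x,y\n1,2\n")
    second.write_bytes(b"x,y\n1,2\n")
    # Simulate a copy made at a different time by forcing an old timestamp.
    os.utime(second, ns=(1_000_000_000, 1_000_000_000))

    same_hash = content_state(first) == content_state(second)
    same_mtime = mtime_state(first) == mtime_state(second)
```

Here the two files have identical content, so the hash-based states agree, but the timestamp-based states differ.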

Maintenance

Use pytask lock clean to rewrite pytask.lock with only currently collected tasks. The command removes stale task entries without executing tasks again.
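
Conceptually, cleaning amounts to filtering task entries by the set of collected task IDs. The sketch below illustrates that idea on plain dictionaries; `drop_stale` is a hypothetical helper, not the implementation behind `pytask lock clean`.

```python
from __future__ import annotations


def drop_stale(entries: list[dict], collected_ids: set[str]) -> list[dict]:
    """Keep only entries whose id belongs to a currently collected task."""
    return [entry for entry in entries if entry["id"] in collected_ids]


# Hypothetical entries; the second task no longer exists in the project.
entries = [
    {"id": "src/tasks/data.py::task_clean_data", "state": "s1"},
    {"id": "src/tasks/old.py::task_removed", "state": "s2"},
]
kept = drop_stale(entries, {"src/tasks/data.py::task_clean_data"})
```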

File Format Reference

Top-Level

| Field | Required | Description |
| --- | --- | --- |
| lock-version | Yes | Schema version (currently "1") |

Task Entry

| Field | Required | Description |
| --- | --- | --- |
| id | Yes | Portable task identifier |
| state | Yes | Opaque state string |
| depends_on | No | Mapping from node id to state |
| produces | No | Mapping from node id to state |

Dependency/Product Entry

Node entries are stored as key-value pairs inside depends_on and produces, where the key is the node id and the value is the node state string.

IDs vs Signatures

id in the lockfile is a portable identifier used to match entries across runs and machines. It is not the same as a node or task signature used internally in the DAG.

  • signature: runtime identity in the in-memory DAG.
  • id: portable lockfile key persisted to pytask.lock.

When implementing custom nodes, keep lockfile IDs stable and unique within a task.
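
One way to obtain a stable ID for a path-based node is to express the path relative to the project root with forward slashes. The sketch below assumes such a scheme; `portable_id` is a hypothetical helper, and the way pytask actually derives IDs may differ.

```python
from pathlib import PurePath, PurePosixPath, PureWindowsPath


def portable_id(path: PurePath, project_root: PurePath) -> str:
    """Return a project-relative, forward-slash ID that does not depend on the machine."""
    return path.relative_to(project_root).as_posix()


# The same file checked out in different locations and on different systems.
linux_id = portable_id(
    PurePosixPath("/home/alice/project/data/raw/input.csv"),
    PurePosixPath("/home/alice/project"),
)
ci_id = portable_id(
    PurePosixPath("/srv/ci/checkout/data/raw/input.csv"),
    PurePosixPath("/srv/ci/checkout"),
)
windows_id = portable_id(
    PureWindowsPath(r"C:\work\project\data\raw\input.csv"),
    PureWindowsPath(r"C:\work\project"),
)
```

All three calls yield the same ID, so an entry written on one machine can be matched on another.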

Version Compatibility

Only lock-version "1" is supported. A lockfile with an older or newer version is rejected with an error that includes a clear upgrade message.
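
A version gate of this kind could look like the sketch below. `LockVersionError` and `check_version` are hypothetical names, and the message wording is invented for illustration; the actual exception type and text in pytask may differ.

```python
SUPPORTED_LOCK_VERSION = "1"


class LockVersionError(Exception):
    """Hypothetical error type for an unsupported lock-version."""


def check_version(data: dict) -> None:
    """Raise if the decoded lockfile does not declare the supported version."""
    version = data.get("lock-version")
    if version != SUPPORTED_LOCK_VERSION:
        raise LockVersionError(
            f"Unsupported lock-version {version!r}; "
            f"this sketch only understands {SUPPORTED_LOCK_VERSION!r}."
        )
```

Comparing the version as a string keeps the check strict: the integer `1` or a missing field is rejected as well.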

Implementation Notes

  • The lockfile is encoded/decoded with msgspec’s TOML support.
  • Writes are atomic: pytask writes a temporary file and replaces pytask.lock.
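
The write-then-replace pattern can be sketched with the standard library alone. This is a generic illustration of the technique, assuming the temporary file lives in the same directory as the target; `atomic_write_text` is a hypothetical helper, not pytask's own function.

```python
import os
import tempfile
from pathlib import Path


def atomic_write_text(path: Path, text: str) -> None:
    """Write text to a temporary file, then replace the target in one step."""
    # Same directory as the target, so the final rename stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, prefix=path.name + ".", suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(text)
            handle.flush()
            os.fsync(handle.fileno())  # push the data to disk before the rename
        os.replace(tmp_name, path)  # readers see the old file or the new one, never a partial file
    except BaseException:
        # Clean up the temporary file if anything failed before the replace.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
```

Because `os.replace` swaps the file in a single operation, an interrupted run leaves the previous lockfile intact rather than a truncated one.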