Commit 46a45d9

Improve lockfile documentation structure (#874)
1 parent 3e0db0e commit 46a45d9

12 files changed

Lines changed: 125 additions & 101 deletions

CHANGELOG.md

Lines changed: 3 additions & 0 deletions
@@ -7,6 +7,9 @@ releases are available on [PyPI](https://pypi.org/project/pytask) and
 
 ## Unreleased
 
+- [#874](https://github.com/pytask-dev/pytask/pull/874) improves the lockfile
+documentation by restructuring related guides around user workflows and introducing
+`pytask.lock` in the tutorials.
 - [#868](https://github.com/pytask-dev/pytask/pull/868) resets the global marker
 configuration during unconfigure so `--strict-markers` no longer leaks into later
 marker access in the same process.

docs/AGENTS.md

Lines changed: 5 additions & 0 deletions
@@ -2,6 +2,11 @@
 
 ## General
 
+- The structure of the documentation follows https://diataxis.fr/. Before writing or
+editing an article, read the relevant guidance from the Diataxis framework:
+https://diataxis.fr/tutorials, https://diataxis.fr/how-to-guides/,
+https://diataxis.fr/explanation/, https://diataxis.fr/reference/.
+https://diataxis.fr/compass/ explains what belongs where and how the parts relate.
 - Document only public APIs and user-facing behavior - exclude internals, framework
 abstractions, and implementation plumbing - Users need actionable documentation on
 what they can use, not confusing details about internal mechanics they can't control

docs/CLAUDE.md

Lines changed: 0 additions & 1 deletion
This file was deleted.

docs/source/how_to_guides/index.md

Lines changed: 2 additions & 2 deletions
@@ -9,8 +9,8 @@ specific tasks with pytask.
 
 - [Migrating From Scripts To Pytask](migrating_from_scripts_to_pytask.md)
 - [Interfaces For Dependencies Products](interfaces_for_dependencies_products.md)
-- [Portability](portability.md)
-- [Update the Lockfile to Match Project State](reconciling_lockfile_state.md)
+- [Move a Project to Another Machine](move_project_to_another_machine.md)
+- [Update the Lockfile to Match Project State](update_the_lockfile_to_match_project_state.md)
 - [Remote Files](remote_files.md)
 - [Functional Interface](functional_interface.md)
 - [Capture Warnings](capture_warnings.md)

docs/source/how_to_guides/move_project_to_another_machine.md

Lines changed: 77 additions & 0 deletions
@@ -0,0 +1,77 @@
+# Move a Project to Another Machine
+
+This guide teaches you how to move a pytask project to another machine or environment
+and reuse existing outputs where possible.
+
+## Update the lockfile on the source machine
+
+Run a normal build with [`pytask build`](../reference_guides/commands.md#pytask-build)
+before moving the project so that `pytask.lock` and all outputs are up to date:
+
+```console
+$ pytask build
+```
+
+## Move the project files and reusable outputs
+
+If you have not done it yet, commit `pytask.lock` to your repository and move it with
+the project. In practice, move:
+
+- the project files tracked in version control, including source files, configuration,
+data inputs, and `pytask.lock`
+- the build artifacts you want to reuse, often in `bld/` if you follow the tutorial
+layout
+- the `.pytask` folder if you use the data catalog and it manages some of your files
+
+## Keep external files in the same relative layout
+
+If tasks use files outside the project root, keep the same relative layout on the target
+machine. The project root is the folder with the `pyproject.toml` file.
+
+For example, if a task reads `../shared/input.csv` on the source machine, the moved
+project also needs a readable `../shared/input.csv` next to the project root on the
+target machine.
+
+## Run pytask on the target machine
+
+After you have moved the project to the target machine, run pytask to build the project:
+
+```console
+$ pytask build
+```
+
+Assuming that the project was fully built before the move, pytask will not rebuild the
+project and will skip all tasks.
+
+## Clean stale lockfile entries
+
+If you removed, renamed, or moved tasks before transferring the project, clean up stale
+lockfile entries on the source machine before you move the project:
+
+```console
+$ pytask build --clean-lockfile
+```
+
+This rewrites the lockfile after a successful build with only the currently collected
+tasks and their current state values.
+
+## If your project uses custom nodes
+
+Make sure custom node IDs and state values stay stable across machines:
+
+- Use project-relative IDs instead of absolute paths.
+- Prefer file content hashes over timestamps.
+- Avoid machine-specific paths or timestamps in custom
+[`state()`](../api/nodes_and_tasks.md#pytask.PNode.state) implementations.
+- Provide a custom hash function for
+[`PythonNode`](../api/nodes_and_tasks.md#pytask.PythonNode) values that are not
+natively stable.
+
+Most projects that only use built-in nodes do not need extra work here.
+
+!!! seealso
+
+    The lockfile format and behavior are documented in the
+    [reference guide](../reference_guides/lockfile.md). For custom nodes, see
+    [Writing custom nodes](writing_custom_nodes.md). For hashing guidance, see
+    [Hashing inputs of tasks](hashing_inputs_of_tasks.md).
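
The custom-node advice in the new guide (project-relative IDs, content hashes instead of timestamps) can be sketched in plain Python. This is a minimal illustration under stated assumptions: the helper names `portable_id` and `content_hash` are hypothetical, and the code does not use or reproduce pytask's node API.

```python
import hashlib
import os
from pathlib import Path


def portable_id(path: Path, project_root: Path) -> str:
    """Return an ID relative to the project root, with forward slashes.

    Hypothetical helper: a relative ID such as "bld/out.txt" or
    "../shared/input.csv" stays the same when the project is moved,
    whereas an absolute path would differ between machines.
    """
    return Path(os.path.relpath(path, project_root)).as_posix()


def content_hash(path: Path) -> str:
    """Hash the file's bytes, so the value ignores modification times."""
    return hashlib.sha256(path.read_bytes()).hexdigest()
```

A custom node could return `portable_id(...)` as its identifier and `content_hash(...)` as its state value; how those plug into a real node class depends on the pytask interfaces described in the linked guides.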

docs/source/how_to_guides/portability.md

Lines changed: 0 additions & 92 deletions
This file was deleted.

docs/source/how_to_guides/reconciling_lockfile_state.md renamed to docs/source/how_to_guides/update_the_lockfile_to_match_project_state.md

File renamed without changes.

docs/source/tutorials/defining_dependencies_products.md

Lines changed: 6 additions & 0 deletions
@@ -36,6 +36,8 @@ my_project
 │ ├────task_data_preparation.py
 │ └────task_plot_data.py
 
+├───pytask.lock
+
 └───pyproject.toml
 ```
 
@@ -107,6 +109,10 @@ Now, let us execute the two paths.
 
 --8<-- "docs/source/_static/md/defining-dependencies-products.md"
 
+The build updates `pytask.lock` with the state of both tasks. When you run the same
+tasks again without changing their dependencies, products, or source files, pytask uses
+the lockfile to skip them.
+
 ## Relative paths
 
 Dependencies and products do not have to be absolute paths. If paths are relative, they
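
The skipping behavior described in the added tutorial paragraph can be pictured as a comparison between recorded and current state. The sketch below is a simplified model, not pytask's implementation and not the real `pytask.lock` format; `file_state`, `needs_run`, and the dictionary layout are hypothetical names chosen for illustration.

```python
from __future__ import annotations

import hashlib
from pathlib import Path


def file_state(path: Path) -> str | None:
    """Return a content hash for an existing file, or None if it is missing."""
    if not path.exists():
        return None
    return hashlib.sha256(path.read_bytes()).hexdigest()


def needs_run(recorded: dict[str, str | None], paths: list[Path]) -> bool:
    """Return True if any tracked file is missing or differs from the record.

    `recorded` stands in for the state stored by a previous build
    (an assumed mapping from path to hash, for illustration only).
    """
    for path in paths:
        state = file_state(path)
        if state is None or recorded.get(path.as_posix()) != state:
            return True
    return False
```

In this model, a task is skipped only when every dependency, product, and source file still matches the recorded state, which mirrors the condition stated in the tutorial text.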

docs/source/tutorials/set_up_a_project.md

Lines changed: 21 additions & 4 deletions
@@ -19,7 +19,7 @@ The following directory tree gives an overview of the project's different parts.
 ```text
 my_project
 
-├───.pytask
+├───.pytask # Generated by pytask.
 
 ├───bld
 │ └────...
@@ -30,13 +30,16 @@ my_project
 │ ├────config.py
 │ └────...
 
+├───pytask.lock # Generated by pytask.
+
 └───pyproject.toml
 ```
 
-Replicate this directory structure for your project or start from pytask's
+Create the project files and folders for your project or start from pytask's
 [cookiecutter-pytask-project](https://github.com/pytask-dev/cookiecutter-pytask-project)
 template or any other
 [linked template or example project](../how_to_guides/bp_templates_and_projects.md).
+pytask creates the `.pytask` folder and `pytask.lock` file later when you run tasks.
 
 ## The `src` directory
 
@@ -129,10 +132,24 @@ The `[tool.pytask.ini_options]` section tells pytask to look for tasks in
 `src/my_project`. You will learn more about configuration in the
 [configuration tutorial](configuration.md).
 
+## The `pytask.lock` file
+
+The `pytask.lock` file records which tasks and products are up to date. pytask updates
+it during builds so later runs can skip unchanged tasks. This file should be kept in
+version control.
+
+!!! seealso
+
+    You will later learn how to sync the lockfile with the project state using the
+    [`pytask lock`](../reference_guides/commands.md#pytask-lock) command and how the
+    lockfile enables you to
+    [move a project to another machine](../how_to_guides/move_project_to_another_machine.md),
+    but don't worry about that for now.
+
 ## The `.pytask` directory
 
-The `.pytask` directory is where pytask stores its information. You do not need to
-interact with it.
+The `.pytask` directory is where pytask stores some of its ephemeral information. You do
+not need to interact with it, nor do you need to keep it in version control.
 
 ## Installation
 
docs/source/tutorials/using_a_data_catalog.md

Lines changed: 4 additions & 0 deletions
@@ -36,6 +36,8 @@ my_project
 │ ├────task_data_preparation.py
 │ └────task_plot_data.py
 
+├───pytask.lock
+
 └───pyproject.toml
 ```
 
@@ -148,6 +150,8 @@ my_project
 
 ├───pyproject.toml
 
+├───pytask.lock
+
 ├───src
 │ └───my_project
 │ ├────config.py
