Commit 443090c

docs: document restricted diagram operations (new in 2.2)
- Add Operational Methods section to diagram.md spec: cascade(), restrict(), delete(), drop(), preview(), prune(), restriction propagation rules, OR-vs-AND convergence
- Add Graph-Driven Diagram Operations section to whats-new-22.md: motivation, preview-then-execute pattern, two propagation modes, pruning empty tables
- Add Diagram-Level Delete section to delete-data.md: build-preview-execute workflow, when to use
- Add prune() to read-diagrams how-to
- Add version admonition in data-manipulation.md noting graph-driven cascade internals
- Cross-references between all files

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1 parent 7e28d67 commit 443090c

5 files changed

+282
-47
lines changed

src/explanation/whats-new-22.md

Lines changed: 64 additions & 1 deletion
@@ -1,6 +1,6 @@
11
# What's New in DataJoint 2.2
22

3-
DataJoint 2.2 introduces **isolated instances** and **thread-safe mode** for applications that need multiple independent database connections—web servers, multi-tenant notebooks, parallel pipelines, and testing.
3+
DataJoint 2.2 introduces **isolated instances**, **thread-safe mode**, and **graph-driven diagram operations** for applications that need multiple independent database connections, explicit cascade control, and operational use of the dependency graph.
44

55
> **Upgrading from 2.0 or 2.1?** No breaking changes. All existing code using `dj.config` and `dj.Schema()` continues to work. The new Instance API is purely additive.
66
@@ -201,9 +201,72 @@ class MyTable(dj.Manual):
201201

202202
Once a Schema is created, table definitions, inserts, queries, and all other operations work identically regardless of which pattern was used to create the Schema.
203203

204+
## Graph-Driven Diagram Operations
205+
206+
DataJoint 2.2 promotes `dj.Diagram` from a visualization tool to an operational component. The same dependency graph that renders pipeline diagrams now powers cascade delete, table drop, and data subsetting.
207+
208+
### From Visualization to Operations
209+
210+
In prior versions, `dj.Diagram` existed solely for visualization — drawing the dependency graph as SVG or Mermaid output. The cascade logic inside `Table.delete()` traversed dependencies independently, with no way to inspect or control the cascade before it executed.
211+
212+
In 2.2, `Table.delete()` and `Table.drop()` delegate internally to `dj.Diagram`. The user-facing behavior of `Table.delete()` is unchanged, but the diagram-level API is now available as a more powerful interface for complex scenarios.
213+
214+
### The Preview-Then-Execute Pattern
215+
216+
The key benefit of the diagram-level API is the ability to build a cascade explicitly, inspect it, and then decide whether to execute:
217+
218+
```python
219+
# Build the dependency graph
220+
diag = dj.Diagram(schema)
221+
222+
# Apply cascade restriction — nothing is deleted yet
223+
restricted = diag.cascade(Session & {'subject_id': 'M001'})
224+
225+
# Inspect: what tables and how many rows would be affected?
226+
counts = restricted.preview()
227+
# {'`lab`.`session`': 3, '`lab`.`trial`': 45, '`lab`.`processed_data`': 45}
228+
229+
# Execute only after reviewing the blast radius
230+
restricted.delete(prompt=False)
231+
```
232+
233+
This is valuable when working with unfamiliar pipelines, large datasets, or multi-schema dependencies where the cascade impact is not immediately obvious.
234+
235+
### Two Propagation Modes
236+
237+
The diagram supports two restriction propagation modes with different convergence semantics:
238+
239+
**`cascade()` uses OR at convergence.** When a child table has multiple restricted ancestors, the child row is affected if *any* parent path reaches it. This is the right semantics for delete — if any reason exists to remove a row, it should be removed. `cascade()` is one-shot: it can only be called once on an unrestricted diagram.
240+
241+
**`restrict()` uses AND at convergence.** A child row is included only if *all* restricted ancestors match. This is the right semantics for data subsetting and export — only rows satisfying every condition are selected. `restrict()` is chainable: call it multiple times to build up conditions from different tables.
242+
243+
The two modes are mutually exclusive on the same diagram. This prevents accidental mixing of incompatible semantics.
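
To make the difference at convergence concrete, here is a minimal sketch in plain Python that does not use the DataJoint API. It assumes a hypothetical child table whose rows are reachable through two restricted parent paths; the row keys and the `converge` helper are illustrative assumptions, not part of `dj.Diagram`.

```python
# Toy model of convergence, independent of DataJoint.
# Each set holds the child-row keys reachable through one restricted parent path.
via_subject = {1, 2, 3}   # hypothetical: child rows matched through parent A
via_session = {3, 4}      # hypothetical: child rows matched through parent B


def converge(paths, mode):
    """Combine per-path row sets: 'or' models cascade(), 'and' models restrict()."""
    if not paths:
        return set()
    if mode == "or":
        return set().union(*paths)
    if mode == "and":
        return set.intersection(*map(set, paths))
    raise ValueError(f"unknown mode: {mode!r}")


cascade_rows = converge([via_subject, via_session], "or")
restrict_rows = converge([via_subject, via_session], "and")
print(sorted(cascade_rows))   # [1, 2, 3, 4]
print(sorted(restrict_rows))  # [3]
```

Under OR, a row reached by either path is affected, which suits delete. Under AND, only the row reached by both paths survives, which suits subsetting.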
244+
245+
### Pruning Empty Tables
246+
247+
After applying restrictions, some tables in the diagram may have zero matching rows. The `prune()` method removes these tables from the diagram, leaving only the subgraph with actual data:
248+
249+
```python
250+
export = (dj.Diagram(schema)
251+
.restrict(Subject & {'species': 'mouse'})
252+
.restrict(Session & 'session_date > "2024-01-01"')
253+
.prune())
254+
255+
export.preview() # only tables with matching rows
256+
export # visualize the export subgraph
257+
```
258+
259+
Without prior restrictions, `prune()` removes physically empty tables. This is useful for understanding which parts of a pipeline are populated.
260+
261+
### Architecture
262+
263+
`Table.delete()` now constructs a `Diagram` internally, calls `cascade()`, and then `delete()`. This means every table-level delete benefits from the same graph-driven logic. The diagram-level API simply exposes this machinery for direct use when more control is needed.
264+
204265
## See Also
205266

206267
- [Use Isolated Instances](../how-to/use-instances.md/) — Task-oriented guide
207268
- [Working with Instances](../tutorials/advanced/instances.ipynb/) — Step-by-step tutorial
208269
- [Configuration Reference](../reference/configuration.md/) — Thread-safe mode settings
209270
- [Configure Database](../how-to/configure-database.md/) — Connection setup
271+
- [Diagram Specification](../reference/specs/diagram.md/) — Full reference for diagram operations
272+
- [Delete Data](../how-to/delete-data.md/) — Task-oriented delete guide

src/how-to/delete-data.md

Lines changed: 35 additions & 0 deletions
@@ -189,8 +189,43 @@ count = (Subject & restriction).delete(prompt=False)
189189
print(f"Deleted {count} subjects")
190190
```
191191

192+
## Diagram-Level Delete
193+
194+
!!! version-added "New in 2.2"
195+
Diagram-level delete was added in DataJoint 2.2.
196+
197+
For complex scenarios — previewing the blast radius, working across schemas, or understanding the dependency graph before deleting — use `dj.Diagram` to build and inspect the cascade before executing.
198+
199+
### Build, Preview, Execute
200+
201+
```python
202+
import datajoint as dj
203+
204+
# 1. Build the dependency graph
205+
diag = dj.Diagram(schema)
206+
207+
# 2. Apply cascade restriction (nothing deleted yet)
208+
restricted = diag.cascade(Session & {'subject_id': 'M001'})
209+
210+
# 3. Preview: see affected tables and row counts
211+
counts = restricted.preview()
212+
# {'`lab`.`session`': 3, '`lab`.`trial`': 45, '`lab`.`processed_data`': 45}
213+
214+
# 4. Execute only after reviewing
215+
restricted.delete(prompt=False)
216+
```
217+
218+
### When to Use
219+
220+
- **Preview blast radius**: Understand what a cascade delete will affect before committing
221+
- **Multi-schema cascades**: Build a diagram spanning multiple schemas and delete across them in one operation
222+
- **Programmatic control**: Use `preview()` return values to make decisions in automated workflows
223+
224+
For routine deletes, `(Table & restriction).delete()` remains the simplest approach. Use the diagram-level API when you need more visibility or control.
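
As a sketch of the programmatic-control case, the helper below decides whether an automated job should proceed, given a mapping of table names to row counts shaped like the `preview()` output shown in the example above. The function name `within_budget` and the `max_rows` threshold are hypothetical, and the counts are hard-coded so the example runs without a database.

```python
def within_budget(counts, max_rows=100):
    """Return True if the total number of affected rows is at most max_rows.

    counts: mapping of table name -> affected row count, shaped like the
    preview() output in the example above (an assumption about its type).
    """
    return sum(counts.values()) <= max_rows


# Hard-coded stand-in for restricted.preview()
counts = {"`lab`.`session`": 3, "`lab`.`trial`": 45, "`lab`.`processed_data`": 45}

if within_budget(counts, max_rows=100):
    print("OK to delete:", sum(counts.values()), "rows")
else:
    print("Refusing to delete: cascade too large")
```

In a real workflow, the call to `restricted.delete(prompt=False)` would sit inside the first branch, so the delete runs only when the preview passes the check.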
225+
192226
## See Also
193227

228+
- [Diagram Specification](../reference/specs/diagram.md/) — Full reference for diagram operations
194229
- [Master-Part Tables](master-part.ipynb) — Compositional data patterns
195230
- [Model Relationships](model-relationships.ipynb) — Foreign key patterns
196231
- [Insert Data](insert-data.md) — Adding data to tables

src/how-to/read-diagrams.ipynb

Lines changed: 3 additions & 44 deletions
@@ -1325,22 +1325,7 @@
13251325
"cell_type": "markdown",
13261326
"id": "cell-ops-ref",
13271327
"metadata": {},
1328-
"source": [
1329-
"**Operation Reference:**\n",
1330-
"\n",
1331-
"| Operation | Meaning |\n",
1332-
"|-----------|--------|\n",
1333-
"| `dj.Diagram(schema)` | Entire schema |\n",
1334-
"| `dj.Diagram(Table) - N` | Table + N levels upstream |\n",
1335-
"| `dj.Diagram(Table) + N` | Table + N levels downstream |\n",
1336-
"| `D1 + D2` | Union of two diagrams |\n",
1337-
"| `D1 * D2` | Intersection (common nodes) |\n",
1338-
"\n",
1339-
"**Finding paths:** Use intersection to find connection paths:\n",
1340-
"```python\n",
1341-
"(dj.Diagram(upstream) + 100) * (dj.Diagram(downstream) - 100)\n",
1342-
"```"
1343-
]
1328+
"source": "**Operation Reference:**\n\n| Operation | Meaning |\n|-----------|--------|\n| `dj.Diagram(schema)` | Entire schema |\n| `dj.Diagram(Table) - N` | Table + N levels upstream |\n| `dj.Diagram(Table) + N` | Table + N levels downstream |\n| `D1 + D2` | Union of two diagrams |\n| `D1 * D2` | Intersection (common nodes) |\n| `D.prune()` | Remove tables with zero matching rows *(2.2+)* |\n\n**Finding paths:** Use intersection to find connection paths:\n```python\n(dj.Diagram(upstream) + 100) * (dj.Diagram(downstream) - 100)\n```"
13441329
},
13451330
{
13461331
"cell_type": "markdown",
@@ -3322,33 +3307,7 @@
33223307
"cell_type": "markdown",
33233308
"id": "cell-summary-md",
33243309
"metadata": {},
3325-
"source": [
3326-
"## Summary\n",
3327-
"\n",
3328-
"| Visual | Meaning |\n",
3329-
"|--------|--------|\n",
3330-
"| **Thick solid** | One-to-one extension |\n",
3331-
"| **Thin solid** | One-to-many containment |\n",
3332-
"| **Dashed** | Reference (independent identity) |\n",
3333-
"| **Underlined** | Introduces new dimension |\n",
3334-
"| **Orange dots** | Renamed FK via `.proj()` |\n",
3335-
"| **Colors** | Green=Manual, Gray=Lookup, Red=Computed, Blue=Imported |\n",
3336-
"| **Grouped boxes** | Tables grouped by schema/module |\n",
3337-
"| **3D box (gray)** | Collapsed schema *(2.1+)* |\n",
3338-
"\n",
3339-
"| Feature | Method |\n",
3340-
"|---------|--------|\n",
3341-
"| Layout direction | `dj.config.display.diagram_direction` |\n",
3342-
"| Mermaid output | `.make_mermaid()` |\n",
3343-
"| Collapse schema | `.collapse()` *(2.1+)* |\n",
3344-
"\n",
3345-
"## Related\n",
3346-
"\n",
3347-
"- [Diagram Specification](../reference/specs/diagram.md)\n",
3348-
"- [Entity Integrity: Dimensions](../explanation/entity-integrity.md#schema-dimensions)\n",
3349-
"- [Semantic Matching](../reference/specs/semantic-matching.md)\n",
3350-
"- [Schema Design Tutorial](../tutorials/basics/02-schema-design.ipynb)"
3351-
]
3310+
"source": "## Summary\n\n| Visual | Meaning |\n|--------|--------|\n| **Thick solid** | One-to-one extension |\n| **Thin solid** | One-to-many containment |\n| **Dashed** | Reference (independent identity) |\n| **Underlined** | Introduces new dimension |\n| **Orange dots** | Renamed FK via `.proj()` |\n| **Colors** | Green=Manual, Gray=Lookup, Red=Computed, Blue=Imported |\n| **Grouped boxes** | Tables grouped by schema/module |\n| **3D box (gray)** | Collapsed schema *(2.1+)* |\n\n| Feature | Method |\n|---------|--------|\n| Layout direction | `dj.config.display.diagram_direction` |\n| Mermaid output | `.make_mermaid()` |\n| Collapse schema | `.collapse()` *(2.1+)* |\n| Prune empty tables | `.prune()` *(2.2+)* |\n\n## Related\n\n- [Diagram Specification](../reference/specs/diagram.md)\n- [Entity Integrity: Dimensions](../explanation/entity-integrity.md#schema-dimensions)\n- [Semantic Matching](../reference/specs/semantic-matching.md)\n- [Schema Design Tutorial](../tutorials/basics/02-schema-design.ipynb)"
33523311
},
33533312
{
33543313
"cell_type": "code",
@@ -3397,4 +3356,4 @@
33973356
},
33983357
"nbformat": 4,
33993358
"nbformat_minor": 5
3400-
}
3359+
}

src/reference/specs/data-manipulation.md

Lines changed: 3 additions & 0 deletions
@@ -332,6 +332,9 @@ Delete automatically cascades to all dependent tables:
332332
2. Recursively delete matching rows in child tables
333333
3. Delete rows in target table
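
The ordering in these steps, dependents before the target, can be sketched with a toy dependency graph in plain Python. The table names and the `children` mapping are hypothetical, and this illustrates only the child-first ordering, not DataJoint's actual implementation.

```python
def delete_order(children, target):
    """Return tables in an order that deletes every dependent before its parent."""
    order, seen = [], set()

    def visit(table):
        if table in seen:
            return
        seen.add(table)
        for child in children.get(table, []):
            visit(child)          # recurse into dependents first
        order.append(table)       # parent is appended after all its descendants

    visit(target)
    return order


# Hypothetical pipeline: Session -> Trial -> ProcessedData, Session -> Note
children = {"Session": ["Trial", "Note"], "Trial": ["ProcessedData"]}
print(delete_order(children, "Session"))
# ['ProcessedData', 'Trial', 'Note', 'Session']
```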
334334

335+
!!! version-added "New in 2.2"
336+
`Table.delete()` now uses graph-driven cascade internally via `dj.Diagram`. User-facing behavior is unchanged — the same parameters and return values apply. For direct control over the cascade (preview, multi-schema operations), use the [Diagram operational methods](diagram.md#operational-methods).
337+
335338
### 4.3 Basic Usage
336339

337340
```python
