src/tutorials/basics/01-first-pipeline.ipynb
+3 -43 (lines changed: 3 additions & 43 deletions)
@@ -3,22 +3,7 @@
3
3
{
4
4
"cell_type": "markdown",
5
5
"metadata": {},
6
-
"source": [
7
-
"# A Simple Pipeline\n",
8
-
"\n",
9
-
"This tutorial introduces DataJoint by building a simple research lab database. You'll learn to:\n",
10
-
"\n",
11
-
"- Define tables with primary keys and dependencies\n",
12
-
"- Insert and query data\n",
13
-
"- Use the four core operations: restriction, projection, join, aggregation\n",
14
-
"- Understand the schema diagram\n",
15
-
"\n",
16
-
"We'll work with **Manual tables** only—tables where you enter data directly. Later tutorials introduce automated computation.\n",
17
-
"\n",
18
-
"For complete working examples, see:\n",
19
-
"- [University Database](../examples/university.ipynb) — Academic records with complex queries\n",
20
-
"- [Blob Detection](../examples/blob-detection.ipynb) — Image processing with computation"
21
-
]
6
+
"source": "# A Simple Pipeline\n\nThis tutorial introduces DataJoint by building a simple research lab database. You'll learn to:\n\n- Define tables with primary keys and dependencies\n- Insert and query data\n- Use the four core operations: restriction, projection, join, aggregation\n- Understand the schema diagram\n\nWe'll work with **Manual tables** only—tables where you enter data directly. Later tutorials introduce automated computation.\n\nFor complete working examples, see:\n- [University Database](../examples/university/) — Academic records with complex queries\n- [Blob Detection](../examples/blob-detection/) — Image processing with computation"
22
7
},
23
8
{
24
9
"cell_type": "markdown",
@@ -2698,32 +2683,7 @@
2698
2683
{
2699
2684
"cell_type": "markdown",
2700
2685
"metadata": {},
2701
-
"source": [
2702
-
"## Summary\n",
2703
-
"\n",
2704
-
"You've learned the fundamentals of DataJoint:\n",
2705
-
"\n",
2706
-
"| Concept | Description |\n",
2707
-
"|---------|-------------|\n",
2708
-
"| **Tables** | Python classes with a `definition` string |\n",
"**Key principle:** Solid lines mean the parent's identity becomes part of the child's identity. Dashed lines mean the child maintains independent identity.\n",
1328
-
"\n",
1329
-
"**Note:** Diagrams do NOT show `[nullable]` or `[unique]` modifiers—check table definitions for these constraints.\n",
1330
-
"\n",
1331
-
"See [How to Read Diagrams](../../how-to/read-diagrams.ipynb) for diagram operations and comparison to ER notation.\n",
1332
-
"\n",
1333
-
"## Insert Test Data and Populate"
1334
-
]
1302
+
"source": "### Reading the Diagram\n\nDataJoint diagrams show tables as nodes and foreign keys as edges. The notation conveys relationship semantics at a glance.\n\n**Line Styles:**\n\n| Line | Style | Relationship | Meaning |\n|------|-------|--------------|---------|\n| ━━━ | Thick solid | Extension | FK **is** entire PK (one-to-one) |\n| ─── | Thin solid | Containment | FK **in** PK with other fields (one-to-many) |\n| ┄┄┄ | Dashed | Reference | FK in secondary attributes (one-to-many) |\n\n**Visual Indicators:**\n\n| Indicator | Meaning |\n|-----------|---------|\n| **Underlined name** | Introduces new dimension (new PK attributes) |\n| Non-underlined name | Inherits all dimensions (PK entirely from FKs) |\n| **Green** | Manual table |\n| **Gray** | Lookup table |\n| **Red** | Computed table |\n| **Blue** | Imported table |\n| **Orange dots** | Renamed foreign keys (via `.proj()`) |\n\n**Key principle:** Solid lines mean the parent's identity becomes part of the child's identity. Dashed lines mean the child maintains independent identity.\n\n**Note:** Diagrams do NOT show `[nullable]` or `[unique]` modifiers—check table definitions for these constraints.\n\nSee [How to Read Diagrams](../../how-to/read-diagrams/) for diagram operations and comparison to ER notation.\n\n## Insert Test Data and Populate"
1335
1303
},
1336
1304
{
1337
1305
"cell_type": "code",
@@ -1562,80 +1530,7 @@
1562
1530
{
1563
1531
"cell_type": "markdown",
1564
1532
"metadata": {},
1565
-
"source": [
1566
-
"## Best Practices\n",
1567
-
"\n",
1568
-
"### 1. Choose Meaningful Primary Keys\n",
1569
-
"- Use natural identifiers when possible (`subject_id = 'M001'`)\n",
1570
-
"- Keep keys minimal but sufficient for uniqueness\n",
1571
-
"\n",
1572
-
"### 2. Use Appropriate Table Tiers\n",
1573
-
"- **Manual**: Data entered by operators or instruments\n",
"- **Imported**: Data read from files (recordings, images)\n",
1576
-
"- **Computed**: Derived analyses and summaries\n",
1577
-
"\n",
1578
-
"### 3. Normalize Your Data\n",
1579
-
"- Don't repeat information across rows\n",
1580
-
"- Create separate tables for distinct entities\n",
1581
-
"- Use foreign keys to link related data\n",
1582
-
"\n",
1583
-
"### 4. Use Core DataJoint Types\n",
1584
-
"\n",
1585
-
"DataJoint has a three-layer type architecture (see [Type System Specification](../reference/specs/type-system.md)):\n",
1586
-
"\n",
1587
-
"1. **Native database types** (Layer 1): Backend-specific types like `INT`, `FLOAT`, `TINYINT UNSIGNED`. These are **discouraged** but allowed for backward compatibility.\n",
1588
-
"\n",
1589
-
"2. **Core DataJoint types** (Layer 2): Standardized, scientist-friendly types that work identically across MySQL and PostgreSQL. **Always prefer these.**\n",
1590
-
"\n",
1591
-
"3. **Codec types** (Layer 3): Types with `encode()`/`decode()` semantics like `<blob>`, `<attach>`, `<object@>`.\n",
"**Why native types are allowed but discouraged:**\n",
1608
-
"\n",
1609
-
"Native types (like `int`, `float`, `tinyint`) are passed through to the database but generate a **warning at declaration time**. They are discouraged because:\n",
1610
-
"- They lack explicit size information\n",
1611
-
"- They are not portable across database backends\n",
1612
-
"- They are not recorded in field metadata for reconstruction\n",
1613
-
"\n",
1614
-
"If you see a warning like `\"Native type 'int' used; consider 'int32' instead\"`, update your definition to use the corresponding core type.\n",
"| **Secondary Attributes** | Attributes below `---` that store additional data |\n",
1626
-
"| **Foreign Key** (`->`) | Reference to another table, imports its primary key |\n",
1627
-
"| **One-to-Many** | FK in primary key: parent has many children |\n",
1628
-
"| **One-to-One** | FK is entire primary key: exactly one child per parent |\n",
1629
-
"| **Master-Part** | Compositional integrity: master and parts inserted/deleted atomically |\n",
1630
-
"| **Nullable FK** | `[nullable]` makes the reference optional |\n",
1631
-
"| **Lookup Table** | Pre-populated reference data |\n",
1632
-
"\n",
1633
-
"## Next Steps\n",
1634
-
"\n",
1635
-
"- [Data Entry](03-data-entry.ipynb) — Inserting, updating, and deleting data\n",
1636
-
"- [Queries](04-queries.ipynb) — Filtering, joining, and projecting\n",
1637
-
"- [Computation](05-computation.ipynb) — Building computational pipelines"
1638
-
]
1533
+
"source": "## Best Practices\n\n### 1. Choose Meaningful Primary Keys\n- Use natural identifiers when possible (`subject_id = 'M001'`)\n- Keep keys minimal but sufficient for uniqueness\n\n### 2. Use Appropriate Table Tiers\n- **Manual**: Data entered by operators or instruments\n- **Lookup**: Configuration, parameters, reference data\n- **Imported**: Data read from files (recordings, images)\n- **Computed**: Derived analyses and summaries\n\n### 3. Normalize Your Data\n- Don't repeat information across rows\n- Create separate tables for distinct entities\n- Use foreign keys to link related data\n\n### 4. Use Core DataJoint Types\n\nDataJoint has a three-layer type architecture (see [Type System Specification](../../reference/specs/type-system/)):\n\n1. **Native database types** (Layer 1): Backend-specific types like `INT`, `FLOAT`, `TINYINT UNSIGNED`. These are **discouraged** but allowed for backward compatibility.\n\n2. **Core DataJoint types** (Layer 2): Standardized, scientist-friendly types that work identically across MySQL and PostgreSQL. **Always prefer these.**\n\n3. **Codec types** (Layer 3): Types with `encode()`/`decode()` semantics like `<blob>`, `<attach>`, `<object@>`.\n\n**Core types used in this tutorial:**\n\n| Type | Description | Example |\n|------|-------------|---------|\n| `uint8`, `uint16`, `int32` | Sized integers | `session_idx : uint16` |\n| `float32`, `float64` | Sized floats | `reaction_time : float32` |\n| `varchar(n)` | Variable-length string | `name : varchar(100)` |\n| `bool` | Boolean | `correct : bool` |\n| `date` | Date only | `date_of_birth : date` |\n| `datetime` | Date and time (UTC) | `created_at : datetime` |\n| `enum(...)` | Enumeration | `sex : enum('M', 'F', 'U')` |\n| `json` | JSON document | `task_params : json` |\n| `uuid` | Universally unique ID | `experimenter_id : uuid` |\n\n**Why native types are allowed but discouraged:**\n\nNative types (like `int`, `float`, `tinyint`) are passed through to the database but generate a **warning at declaration time**. They are discouraged because:\n- They lack explicit size information\n- They are not portable across database backends\n- They are not recorded in field metadata for reconstruction\n\nIf you see a warning like `\"Native type 'int' used; consider 'int32' instead\"`, update your definition to use the corresponding core type.\n\n### 5. Document Your Tables\n- Add comments after `#` in definitions\n- Document units in attribute comments\n\n## Key Concepts Recap\n\n| Concept | Description |\n|---------|-------------|\n| **Primary Key** | Attributes above `---` that uniquely identify rows |\n| **Secondary Attributes** | Attributes below `---` that store additional data |\n| **Foreign Key** (`->`) | Reference to another table, imports its primary key |\n| **One-to-Many** | FK in primary key: parent has many children |\n| **One-to-One** | FK is entire primary key: exactly one child per parent |\n| **Master-Part** | Compositional integrity: master and parts inserted/deleted atomically |\n| **Nullable FK** | `[nullable]` makes the reference optional |\n| **Lookup Table** | Pre-populated reference data |\n\n## Next Steps\n\n- [Data Entry](03-data-entry/) — Inserting, updating, and deleting data\n- [Queries](04-queries/) — Filtering, joining, and projecting\n- [Computation](05-computation/) — Building computational pipelines"
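The "Key Concepts Recap" in the added cell says that attributes above `---` form the primary key and attributes below it are secondary. A minimal sketch of that rule in plain Python follows. It is not DataJoint's own parser and does not import `datajoint`. The helper `split_definition` and the example table text are hypothetical, written only for illustration, and the comment handling is simplified: any `#` is treated as the start of a comment.

```python
def split_definition(definition: str) -> dict:
    """Split a DataJoint-style definition string at the `---` divider.

    Illustrative only: a simplified stand-in, not the library's parser.
    """
    primary, secondary = [], []
    target = primary  # lines go to the primary key until `---` is seen
    for raw in definition.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the table-level comment
        if line.startswith("---"):
            target = secondary  # everything after the divider is secondary
            continue
        # Drop a trailing attribute comment (simplification: any `#`)
        target.append(line.split("#", 1)[0].strip())
    return {"primary": primary, "secondary": secondary}


# Hypothetical definition using core types named in the tutorial
definition = """
# Experimental session (hypothetical example table)
-> Subject
session_idx : uint16      # session number within subject
---
session_date : date       # date of the session
reaction_time : float32   # mean reaction time (seconds)
"""

parts = split_definition(definition)
print("primary key:", parts["primary"])
print("secondary:  ", parts["secondary"])
```

Here the foreign key `-> Subject` sits above the divider alongside `session_idx`, which matches the recap's one-to-many case: the foreign key is part of the primary key, not all of it.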