DataJoint 2.0 replaces the complex fetch() method with a set of explicit, composable output methods. This provides better discoverability, clearer intent, and more efficient iteration.
Design Principles
Explicit over implicit: Each output format has its own method
Composable: Use existing .proj() for column selection
Lazy iteration: Single cursor streaming instead of fetch-all-keys
Modern formats: First-class support for polars and Arrow
New API Reference
Output Methods
| Method | Returns | Description |
|---|---|---|
| to_dicts() | list[dict] | All rows as a list of dictionaries |
| to_pandas() | DataFrame | pandas DataFrame with the primary key as index |
| to_polars() | polars.DataFrame | polars DataFrame (requires datajoint[polars]) |
| to_arrow() | pyarrow.Table | PyArrow Table (requires datajoint[arrow]) |
| to_arrays() | np.ndarray | numpy structured array (recarray) |
| to_arrays('a', 'b') | tuple[array, array] | Tuple of arrays for specific columns |
| keys() | list[dict] | Primary key values only |
| fetch1() | dict | Single row as dict (raises if not exactly 1) |
| fetch1('a', 'b') | tuple | Single row attribute values |
Common Parameters
All output methods accept these optional parameters:
table.to_dicts(
    order_by=None,      # str or list: column(s) to sort by, e.g. "KEY", "name DESC"
    limit=None,         # int: maximum rows to return
    offset=None,        # int: rows to skip
    squeeze=False,      # bool: remove singleton dimensions from arrays
    download_path=".",  # str: path for downloading external data
)
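To make the sorting and paging parameters concrete, here is a minimal sketch that assumes order_by, limit, and offset behave like SQL's ORDER BY, LIMIT, and OFFSET. That mapping is an assumption, not something this document confirms. The sketch uses Python's built-in sqlite3 module as a stand-in backend, and the experiment table and its rows are made up for illustration.

```python
import sqlite3

# Stand-in database: an in-memory SQLite table (not a DataJoint schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE experiment (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO experiment (id, name) VALUES (?, ?)",
    [(1, "alpha"), (2, "bravo"), (3, "charlie"), (4, "delta")],
)

# Assumed SQL equivalent of: order_by="name DESC", limit=2, offset=1
page = conn.execute(
    "SELECT name FROM experiment ORDER BY name DESC LIMIT 2 OFFSET 1"
).fetchall()
names = [row[0] for row in page]
# names == ["charlie", "bravo"]: "delta" is skipped by the offset,
# then two rows are returned.
```

Under this reading, offset is applied after sorting and before limit, so the three parameters together select one page of a sorted result.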
Iteration
# Lazy streaming - yields one dict per row from the database cursor
for row in table:
    process(row)  # row is a dict
Migration Guide
Basic Fetch Operations
| Old Pattern (1.x) | New Pattern (2.0) |
|---|---|
| table.fetch() | table.to_arrays() or table.to_dicts() |
| table.fetch(format="array") | table.to_arrays() |
| table.fetch(format="frame") | table.to_pandas() |
| table.fetch(as_dict=True) | table.to_dicts() |
Attribute Fetching
| Old Pattern (1.x) | New Pattern (2.0) |
|---|---|
| table.fetch('a') | table.to_arrays('a') |
| a, b = table.fetch('a', 'b') | a, b = table.to_arrays('a', 'b') |
| table.fetch('a', 'b', as_dict=True) | table.proj('a', 'b').to_dicts() |
Primary Key Fetching
| Old Pattern (1.x) | New Pattern (2.0) |
|---|---|
| table.fetch('KEY') | table.keys() |
| table.fetch(dj.key) | table.keys() |
| keys, a = table.fetch('KEY', 'a') | See note below |
For mixed KEY + attribute fetch:
# Old: keys, a = table.fetch('KEY', 'a')
# New: combine keys() with to_arrays()
keys = table.keys()
a = table.to_arrays('a')
# Or use to_dicts(), which includes all columns
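The to_dicts() alternative can be sketched in plain Python. The helper below, split_keys_and_attribute, is a hypothetical name, not part of DataJoint. It assumes the rows have the list-of-dicts shape that to_dicts() is described as returning, and that the caller supplies the primary key attribute names. The sample rows are made up for illustration.

```python
def split_keys_and_attribute(rows, primary_key, attribute):
    """Split row dicts into (keys, values), mimicking the old mixed fetch.

    rows:        list of dicts, one per row (the shape to_dicts() returns)
    primary_key: names of the primary key attributes
    attribute:   name of the single attribute to extract
    """
    keys = [{name: row[name] for name in primary_key} for row in rows]
    values = [row[attribute] for row in rows]
    return keys, values


# Illustrative rows standing in for the result of table.to_dicts()
rows = [
    {"experiment_id": 1, "name": "pilot", "date": "2024-01-05"},
    {"experiment_id": 2, "name": "main", "date": "2024-02-10"},
]
keys, names = split_keys_and_attribute(rows, ["experiment_id"], "name")
```

Because keys and values come from the same pass over the same rows, they are aligned by construction. With two separate calls, alignment depends on both queries returning rows in the same order.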
The config['fetch_format'] setting is removed; use the explicit output methods instead.
Removed Imports
# Old (removed)
from datajoint import key
result = table.fetch(key)

# New
result = table.keys()
Examples
Example 1: Basic Data Retrieval
# Get all data as a DataFrame
df = Experiment().to_pandas()

# Get all data as a list of dicts
rows = Experiment().to_dicts()

# Get all data as a numpy array
arr = Experiment().to_arrays()
Example 2: Filtered and Sorted Query
# Get recent experiments, sorted by date
recent = (Experiment() & 'date > "2024-01-01"').to_pandas(
    order_by='date DESC',
    limit=100,
)
Example 3: Specific Columns
# Fetch specific columns as arrays
names, dates = Experiment().to_arrays('name', 'date')

# Or with primary key included
names, dates = Experiment().to_arrays('name', 'date', include_key=True)
Example 4: Primary Keys for Iteration
# Get keys for restriction
keys = Experiment().keys()
for key in keys:
    process(Session() & key)
Example 5: Single Row
# Get one row as dict
row = (Experiment() & key).fetch1()

# Get specific attributes
name, date = (Experiment() & key).fetch1('name', 'date')
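The "exactly one row" contract of fetch1() can be illustrated with a small stand-alone sketch. The helpers exactly_one and exactly_one_values are hypothetical names, and ValueError is a stand-in for whatever exception DataJoint actually raises. This is an illustration of the contract, not the library's implementation.

```python
def exactly_one(rows):
    """Return the single row in rows; raise if there are zero or several."""
    rows = list(rows)
    if len(rows) != 1:
        # Stand-in exception type; the real library may raise something else.
        raise ValueError(f"expected exactly one row, got {len(rows)}")
    return rows[0]


def exactly_one_values(rows, *attributes):
    """Return the named attribute values of the single row as a tuple."""
    row = exactly_one(rows)
    return tuple(row[name] for name in attributes)


# Illustrative data: one matching row
matching = [{"experiment_id": 1, "name": "pilot", "date": "2024-01-05"}]
row = exactly_one(matching)
name, date = exactly_one_values(matching, "name", "date")
```

Failing loudly on zero or multiple rows means a restriction that is looser or tighter than intended surfaces immediately, rather than silently returning an arbitrary row.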
Example 6: Lazy Iteration
# Stream rows efficiently (single database cursor)
for row in Experiment():
    if should_process(row):
        process(row)
    if done:
        break  # Early termination - no wasted fetches
Example 7: Modern DataFrame Libraries
# Polars (fast, modern)
import polars as pl
df = Experiment().to_polars()
result = df.filter(pl.col('value') > 100).group_by('category').agg(pl.mean('value'))

# PyArrow (zero-copy interop)
table = Experiment().to_arrow()
# Can convert to pandas or polars with zero copy
Performance Considerations
Lazy Iteration
The new iteration is significantly more efficient:
# Old (1.x): N+1 queries
#   1. fetch("KEY") gets ALL keys
#   2. fetch1() for EACH key

# New (2.0): Single query
#   Streams rows from one cursor
for row in table:
    ...
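The two access patterns can be compared without DataJoint. The sketch below uses Python's built-in sqlite3 module as a stand-in database and counts the queries each pattern issues. The experiment table and the helpers make_db, iter_n_plus_one, and iter_streaming are hypothetical names for illustration and do not reflect DataJoint internals.

```python
import sqlite3


def make_db(n_rows):
    """Create an in-memory table with n_rows rows (illustrative schema)."""
    conn = sqlite3.connect(":memory:")
    conn.row_factory = sqlite3.Row  # rows can be converted to dicts
    conn.execute("CREATE TABLE experiment (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO experiment (id, name) VALUES (?, ?)",
        [(i, f"exp{i}") for i in range(n_rows)],
    )
    return conn


def iter_n_plus_one(conn, counter):
    """1.x-style pattern: one query for all keys, then one query per key."""
    counter["queries"] += 1
    keys = [r["id"] for r in conn.execute("SELECT id FROM experiment ORDER BY id")]
    for key in keys:
        counter["queries"] += 1
        row = conn.execute(
            "SELECT * FROM experiment WHERE id = ?", (key,)
        ).fetchone()
        yield dict(row)


def iter_streaming(conn, counter):
    """2.0-style pattern: a single query whose cursor is consumed lazily."""
    counter["queries"] += 1
    for row in conn.execute("SELECT * FROM experiment ORDER BY id"):
        yield dict(row)


conn = make_db(5)
old_count = {"queries": 0}
new_count = {"queries": 0}
old_rows = list(iter_n_plus_one(conn, old_count))
new_rows = list(iter_streaming(conn, new_count))
# Same rows either way; 6 queries for the old pattern versus 1 for streaming.
```

Both generators yield identical rows, but the first issues one query per row plus one for the keys, while the second issues a single query regardless of table size. Breaking out of the streaming loop early also avoids fetching the remaining rows.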
Memory Efficiency
to_dicts(): Returns full list in memory
for row in table:: Streams one row at a time
to_arrays(limit=N): Fetches only N rows
Format Selection
| Use Case | Recommended Method |
|---|---|
| Data analysis | to_pandas() or to_polars() |
| JSON API responses | to_dicts() |
| Numeric computation | to_arrays() |
| Large datasets | for row in table: (streaming) |
| Interop with other tools | to_arrow() |
Error Messages
When attempting to use removed methods, users see helpful error messages: