Welcome to DataJoint for Python!

Packages: PyPI, conda-forge
Developer Chat: DataJoint Slack
License: LGPL-2.1
Citation: bioRxiv, Zenodo

DataJoint for Python is a framework for scientific workflow management based on relational principles. DataJoint is built on the foundation of the relational data model and prescribes a consistent method for organizing, populating, computing, and querying data.

DataJoint was initially developed in 2009 by Dimitri Yatsenko in Andreas Tolias' Lab at Baylor College of Medicine for the distributed processing and management of large volumes of data streaming from regular experiments. Starting in 2011, DataJoint has been available as an open-source project adopted by other labs and improved through contributions from several developers. Presently, the primary developer of DataJoint open-source software is the company DataJoint (https://datajoint.com).

Data Pipeline Example

[Figure: data pipeline diagram]

Yatsenko et al., bioRxiv 2021

Getting Started

Developer Guide

Prerequisites

  • Docker for MySQL and MinIO services
  • Python 3.10+

Running Tests

Tests are organized into tests/unit/ (no external services needed) and tests/integration/ (requires running MySQL and MinIO services):

# Install dependencies
pip install -e ".[test]"

# Run unit tests only (fast, no Docker needed)
pytest tests/unit/

# Start MySQL and MinIO for integration tests
docker compose up -d db minio

# Run all tests
pytest tests/

# Run specific test file
pytest tests/integration/test_blob.py -v

# Stop services when done
docker compose down
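
Before running the integration tests, it can help to confirm that the MySQL and MinIO containers are accepting connections. The sketch below is a hypothetical helper, not part of DataJoint or its test suite; it uses only the Python standard library and assumes the default ports listed under Environment Variables (3306 and 9000). A successful TCP connection only shows that the port is open, not that the service has finished initializing.

```python
import socket


def service_reachable(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


if __name__ == "__main__":
    # Assumed default endpoints from this README; adjust if you override them.
    services = [("MySQL", "localhost", 3306), ("MinIO", "localhost", 9000)]
    for name, host, port in services:
        status = "reachable" if service_reachable(host, port) else "not reachable"
        print(f"{name} at {host}:{port} is {status}")
```

If either service is reported as not reachable, re-run docker compose up -d db minio and wait a few seconds before starting pytest.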

Alternative: Full Docker

Run tests entirely in Docker (no local Python needed):

docker compose --profile test up djtest --build

Alternative: Using pixi

pixi users can run tests with automatic service management:

pixi install        # First time setup
pixi run test       # Starts services and runs tests
pixi run services-down  # Stop services

Pre-commit Hooks

pre-commit install          # Install hooks (first time)
pre-commit run --all-files  # Run all checks

Environment Variables

Tests use these defaults (configured in pyproject.toml):

| Variable    | Default        | Description    |
|-------------|----------------|----------------|
| DJ_HOST     | localhost      | MySQL hostname |
| DJ_PORT     | 3306           | MySQL port     |
| DJ_USER     | root           | MySQL username |
| DJ_PASS     | password       | MySQL password |
| S3_ENDPOINT | localhost:9000 | MinIO endpoint |

For Docker-based testing (devcontainer, djtest), set DJ_HOST=db and S3_ENDPOINT=minio:9000.
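
To illustrate how these variables and their defaults combine, here is a minimal sketch that overlays environment overrides on the documented defaults. The function name resolve_test_settings and the DEFAULTS dictionary are hypothetical and exist only for this example; how the test suite itself reads these values is not shown in this README.

```python
import os

# Defaults copied from the Environment Variables table in this README.
DEFAULTS = {
    "DJ_HOST": "localhost",
    "DJ_PORT": "3306",
    "DJ_USER": "root",
    "DJ_PASS": "password",
    "S3_ENDPOINT": "localhost:9000",
}


def resolve_test_settings(environ=None) -> dict:
    """Return the documented defaults, overridden by any variables set in environ."""
    env = os.environ if environ is None else environ
    return {name: env.get(name, default) for name, default in DEFAULTS.items()}


if __name__ == "__main__":
    for name, value in resolve_test_settings().items():
        # Avoid echoing the password to the terminal.
        shown = "***" if name == "DJ_PASS" else value
        print(f"{name}={shown}")
```

For example, running this with DJ_HOST=db and S3_ENDPOINT=minio:9000 exported would report the Docker-based values for those two variables and the defaults for the rest.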