Processing

Link to code

The processing.json file captures the data processing and analysis steps that have been carried out, mostly for derived data assets. It tracks what code was used for each step, when it was run, what the inputs and outputs were, and what parameters were set. This includes steps such as spike sorting, image alignment, and cell segmentation, as well as manual annotation, quality control, and data analysis.

Each subsequent stage of processing or analysis should append its steps to the processing file rather than replace the existing record.
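A minimal sketch of that append-only workflow, using plain dictionaries rather than the schema classes. The helper `append_process` and the step names and values are hypothetical; only the field names `data_processes`, `name`, `stage`, and `start_date_time` come from the tables in this document.

```python
import json
from datetime import datetime, timezone


def append_process(processing: dict, new_process: dict) -> dict:
    """Return a copy of `processing` with `new_process` added at the end.

    Hypothetical helper: it rejects a step whose name is already used,
    since step names are described as unique.
    """
    existing = processing.get("data_processes", [])
    names = {p["name"] for p in existing}
    if new_process["name"] in names:
        raise ValueError(f"duplicate process name: {new_process['name']!r}")
    return {**processing, "data_processes": [*existing, new_process]}


# Hypothetical contents of an existing processing.json.
processing = {"data_processes": [{"name": "Spike sorting", "stage": "Processing"}]}

# A later analysis stage adds its own record instead of replacing earlier ones.
analysis_step = {
    "name": "Unit quality analysis",
    "stage": "Analysis",
    "start_date_time": datetime(2024, 1, 2, 9, 0, tzinfo=timezone.utc).isoformat(),
}

updated = append_process(processing, analysis_step)
print(json.dumps(updated, indent=2))
```

Returning a new dictionary keeps the earlier record untouched, which makes it easy to compare the file before and after a stage runs.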

Example

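The following is an illustrative sketch of what a small processing.json might contain, built from plain dictionaries so that it runs without the schema package. It is not the library's own example. All values are placeholders, and the `Code` entries show only the `name` field mentioned in this document; the real `Code` model may require more.

```python
import json
from datetime import datetime, timezone

# Timezone-aware timestamps, as the start/end fields require.
start = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
end = datetime(2024, 1, 1, 13, 30, tzinfo=timezone.utc)

processing = {
    "data_processes": [
        {
            "process_type": "Spike sorting",  # placeholder ProcessName value
            "name": "Spike sorting",
            "stage": "Processing",
            "code": {"name": "example-spike-sorter"},  # placeholder Code entry
            "experimenters": ["Example Person"],
            "pipeline_name": "example-ephys-pipeline",
            "start_date_time": start.isoformat(),
            "end_date_time": end.isoformat(),
            "output_parameters": {"n_units": 42},  # placeholder output
            "notes": None,
        }
    ],
    # The pipeline referenced by pipeline_name above is listed here by name.
    "pipelines": [{"name": "example-ephys-pipeline"}],
    "notes": None,
    # This step has no upstream steps, so its list of inputs is empty.
    "dependency_graph": {"Spike sorting": []},
}

processing_json = json.dumps(processing, indent=2)
print(processing_json)
```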

Core file

Processing

Description of all processes run on data

| Field | Type | Title (Description) |
|-------|------|---------------------|
| `data_processes` | `List[DataProcess]` | Data processing |
| `pipelines` | `Optional[List[Code]]` | Pipelines (For processing done with pipelines, list the repositories here. Pipelines must use the name field, and be referenced in the pipeline_name field of a DataProcess.) |
| `notes` | `Optional[str]` | Notes |
| `dependency_graph` | `Optional[Dict[str, List[str]]]` | Dependency graph (Directed graph of processing step dependencies. Each key is a process name, and the value is a list of process names that are inputs to that process.) |
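The cross-references described in this table (pipeline names, and process names in the dependency graph) can be checked with a short script. The sketch below works on plain dictionaries and uses the standard-library `graphlib` module; `check_processing` is a hypothetical helper, not part of the schema package, and the step names are placeholders.

```python
from graphlib import TopologicalSorter


def check_processing(processing: dict) -> list[str]:
    """Check cross-references and return process names in dependency order."""
    names = {p["name"] for p in processing["data_processes"]}
    pipeline_names = {c["name"] for c in processing.get("pipelines") or []}

    # Every pipeline_name must match the name of a listed pipeline.
    for process in processing["data_processes"]:
        pipeline = process.get("pipeline_name")
        if pipeline is not None and pipeline not in pipeline_names:
            raise ValueError(f"unknown pipeline: {pipeline!r}")

    # Every key and every input in the graph must be a known process name.
    graph = processing.get("dependency_graph") or {}
    for name, inputs in graph.items():
        unknown = ({name} | set(inputs)) - names
        if unknown:
            raise ValueError(f"unknown process names: {sorted(unknown)}")

    # Keys map to their inputs (predecessors), which is the shape that
    # TopologicalSorter expects. A cycle raises graphlib.CycleError.
    return list(TopologicalSorter(graph).static_order())


example = {
    "data_processes": [
        {"name": "Image alignment", "pipeline_name": "example-pipeline"},
        {"name": "Cell segmentation", "pipeline_name": "example-pipeline"},
        {"name": "Quality control"},
    ],
    "pipelines": [{"name": "example-pipeline"}],
    "dependency_graph": {
        "Image alignment": [],
        "Cell segmentation": ["Image alignment"],
        "Quality control": ["Cell segmentation"],
    },
}

order = check_processing(example)
print(order)
```

Because each value lists the inputs of its key, the sorted order places every step after the steps it depends on.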

Model definitions

DataProcess

Description of a single processing step

| Field | Type | Title (Description) |
|-------|------|---------------------|
| `process_type` | `ProcessName` | Process type |
| `name` | `str` | Name (Unique name of the processing step. If not provided, the type will be used as the name.) |
| `stage` | `ProcessStage` | Processing stage |
| `code` | `Code` | Code (Code used for processing) |
| `experimenters` | `List[str]` | Experimenters (People responsible for processing) |
| `pipeline_name` | `Optional[str]` | Pipeline name (Pipeline names must exist in Processing.pipelines) |
| `start_date_time` | `datetime (timezone-aware)` | Start date time |
| `end_date_time` | `Optional[datetime (timezone-aware)]` | End date time |
| `output_path` | `Optional[List[AssetPath]]` | Output path (Path to processing outputs, if stored.) |
| `output_parameters` | `Optional[dict]` | Outputs (Output parameters) |
| `notes` | `Optional[str]` | Notes |
| `resources` | `Optional[ResourceUsage]` | Process resource usage |
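Two rules in this table are easy to miss: the name falls back to the process type, and the timestamps must be timezone-aware. The sketch below illustrates both with a standard-library dataclass. `DataProcessSketch` is a hypothetical stand-in that covers only a few fields; it is not the schema's `DataProcess` class, and the real validation may differ.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class DataProcessSketch:
    """Hypothetical, reduced stand-in for a DataProcess record."""

    process_type: str
    start_date_time: datetime
    name: Optional[str] = None
    end_date_time: Optional[datetime] = None

    def __post_init__(self) -> None:
        # If no name is given, the process type is used as the name.
        if self.name is None:
            self.name = self.process_type
        # Reject naive datetimes: utcoffset() is None when tzinfo is missing.
        for label in ("start_date_time", "end_date_time"):
            value = getattr(self, label)
            if value is not None and value.utcoffset() is None:
                raise ValueError(f"{label} must be timezone-aware")


step = DataProcessSketch(
    process_type="Image alignment",
    start_date_time=datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc),
)
print(step.name)
```

When the same process type runs more than once, passing an explicit `name` for each run keeps the names unique.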

ProcessStage

Stages of processing

| Name | Value |
|------|-------|
| `PROCESSING` | `Processing` |
| `ANALYSIS` | `Analysis` |

ResourceTimestamped

Description of resource usage at a moment in time

| Field | Type | Title (Description) |
|-------|------|---------------------|
| `timestamp` | `datetime (timezone-aware)` | Timestamp |
| `usage` | `float` | Usage |

ResourceUsage

Description of resources used by a process

| Field | Type | Title (Description) |
|-------|------|---------------------|
| `os` | `str` | Operating system |
| `architecture` | `str` | Architecture |
| `cpu` | `Optional[str]` | CPU name |
| `cpu_cores` | `Optional[int]` | CPU cores |
| `gpu` | `Optional[str]` | GPU name |
| `system_memory` | `Optional[float]` | System memory |
| `system_memory_unit` | `Optional[MemoryUnit]` | System memory unit |
| `ram` | `Optional[float]` | System RAM |
| `ram_unit` | `Optional[MemoryUnit]` | Ram unit |
| `cpu_usage` | `Optional[List[ResourceTimestamped]]` | CPU usage |
| `gpu_usage` | `Optional[List[ResourceTimestamped]]` | GPU usage |
| `ram_usage` | `Optional[List[ResourceTimestamped]]` | RAM usage |
| `usage_unit` | `str` | Usage unit |
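As an illustration of how the timestamped usage lists might be consumed, the sketch below reduces a list of samples to a peak and a mean. The dataclass mirrors the two `ResourceTimestamped` fields listed in this document but is a local stand-in, not the schema class. `summarize_usage` and the sample values are hypothetical, and the samples are assumed to be percentages.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class ResourceTimestamped:
    """Local stand-in with the two fields listed in the table above."""

    timestamp: datetime
    usage: float


def summarize_usage(samples: list[ResourceTimestamped]) -> dict[str, float]:
    """Return the peak and mean usage over a non-empty list of samples."""
    values = [sample.usage for sample in samples]
    return {"peak": max(values), "mean": sum(values) / len(values)}


# Hypothetical CPU samples taken one minute apart.
t0 = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
cpu_usage = [
    ResourceTimestamped(t0, 20.0),
    ResourceTimestamped(t0 + timedelta(minutes=1), 80.0),
    ResourceTimestamped(t0 + timedelta(minutes=2), 50.0),
]

summary = summarize_usage(cpu_usage)
print(summary)
```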