Commit 172ba18
Merge pull request #3438 from chrishalcrow/preprocessing-pipeline
Add PreprocessingPipeline

Build a full Sorting pipeline with dicts
========================================

When using ``SpikeInterface`` there are two phases. First, you
experiment: figure out which special steps or parameters you need
to get everything working with your data. Once you're happy, you then
need to build a sturdy, consistent pipeline to process all your ephys
sessions.

It is now possible to create a flexible spike sorting pipeline using
three simple dictionaries: one for preprocessing (and the
``PreprocessingPipeline``), another for sorting (and ``run_sorter``),
and a final one for postprocessing (and the ``compute`` method). Here's
an example:

.. code:: ipython3

    import spikeinterface.full as si

    my_protocol = {
        'preprocessing': {
            'bandpass_filter': {},
            'common_reference': {'operator': 'average'},
            'detect_and_remove_bad_channels': {},
        },
        'sorting': {
            'sorter_name': 'mountainsort5',
            'verbose': False,
            'snippet_T2': 15,
            'remove_existing_folder': True,
            'progress_bar': False
        },
        'postprocessing': {
            'random_spikes': {},
            'noise_levels': {},
            'templates': {},
            'unit_locations': {'method': 'center_of_mass'},
            'spike_amplitudes': {},
            'correlograms': {},
        },
    }

    # Usually, you would read in your raw recording
    rec, _ = si.generate_ground_truth_recording(num_channels=4, durations=[60], seed=0)
    preprocessed_rec = si.apply_pipeline(rec, my_protocol['preprocessing'])
    sorting = si.run_sorter(recording=preprocessed_rec, **my_protocol['sorting'])
    analyzer = si.create_sorting_analyzer(recording=preprocessed_rec, sorting=sorting)
    analyzer.compute(my_protocol['postprocessing'])

This is a full and flexible spike sorting pipeline in 5 lines of code!

To try out a different pipeline, you only need to update your protocol
dicts.
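One detail that makes this dict-based style reliable: plain Python dicts preserve insertion order (guaranteed since Python 3.7), so the steps in your protocol run in the order you write them (assuming, as the examples here suggest, that ``apply_pipeline`` walks the dict in order). A minimal illustration:

```python
# Dicts preserve insertion order (Python 3.7+), so reordering the entries
# below would change the order in which the preprocessing steps are applied.
preprocessing = {
    'bandpass_filter': {},
    'common_reference': {'operator': 'average'},
    'detect_and_remove_bad_channels': {},
}

step_order = list(preprocessing)
print(step_order)
```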

Once you have an analyzer, you can then do things with it:

.. code:: ipython3

    analyzer.save_as(folder="my_analyzer")
    si.plot_unit_summary(analyzer, unit_id=1)

.. parsed-literal::

    /home/nolanlab/Work/Developing/fromgit/spikeinterface/src/spikeinterface/widgets/unit_waveforms.py:183: UserWarning: templates_percentile_shading can only be used if the 'waveforms' extension is available. Settimg templates_percentile_shading to None.
      warn(

.. parsed-literal::

    <spikeinterface.widgets.unit_summary.UnitSummaryWidget at 0x7b122f7419d0>

.. image:: build_pipeline_with_dicts_files/build_pipeline_with_dicts_6_2.png

The main disadvantage of the dictionaries approach is that you don't
know exactly which options and steps are available to you. You can
search the API for help, or use the many dictionaries of tools and
parameters that ``SpikeInterface`` stores, as shown below.

Get all preprocessing steps:

.. code:: ipython3

    from spikeinterface.preprocessing.pipeline import pp_names_to_functions
    print(pp_names_to_functions.keys())

.. parsed-literal::

    dict_keys(['filter', 'bandpass_filter', 'highpass_filter', 'notch_filter', 'gaussian_filter', 'normalize_by_quantile', 'scale', 'center', 'zscore', 'scale_to_physical_units', 'whiten', 'common_reference', 'phase_shift', 'detect_and_remove_bad_channels', 'detect_and_interpolate_bad_channels', 'rectify', 'clip', 'blank_saturation', 'silence_periods', 'remove_artifacts', 'zero_channel_pad', 'deepinterpolate', 'resample', 'decimate', 'highpass_spatial_filter', 'interpolate_bad_channels', 'depth_order', 'average_across_direction', 'directional_derivative', 'astype', 'unsigned_to_signed'])
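Conceptually, applying a preprocessing dict is just a left-to-right fold over these steps: each function takes a recording plus keyword arguments and returns a new recording. This is a toy sketch of that idea only, with string stand-ins instead of real ``spikeinterface`` recordings and preprocessors:

```python
# Toy stand-ins for real preprocessing functions: each takes a "recording"
# (here just a string) plus keyword arguments and returns a new "recording".
def bandpass_filter(recording, **kwargs):
    return recording + " -> bandpass_filter"

def common_reference(recording, operator="median", **kwargs):
    return recording + f" -> common_reference({operator})"

names_to_functions = {
    'bandpass_filter': bandpass_filter,
    'common_reference': common_reference,
}

def apply_pipeline_sketch(recording, preprocessing_dict):
    # Fold the steps over the recording, in dict order
    for step_name, step_kwargs in preprocessing_dict.items():
        recording = names_to_functions[step_name](recording, **step_kwargs)
    return recording

chain = apply_pipeline_sketch("raw", {'bandpass_filter': {}, 'common_reference': {'operator': 'average'}})
print(chain)  # raw -> bandpass_filter -> common_reference(average)
```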
You can then check the arguments of each preprocessing step using
e.g. their docstrings (in Jupyter you can run ``si.bandpass_filter?``
and in the terminal ``help(si.bandpass_filter)``):

.. code:: ipython3

    print(si.bandpass_filter.__doc__)

.. parsed-literal::

    Bandpass filter of a recording

    Parameters
    ----------
    recording : Recording
        The recording extractor to be re-referenced
    freq_min : float
        The highpass cutoff frequency in Hz
    freq_max : float
        The lowpass cutoff frequency in Hz
    margin_ms : float
        Margin in ms on border to avoid border effect
    dtype : dtype or None
        The dtype of the returned traces. If None, the dtype of the parent recording is used
    **filter_kwargs : dict
        Certain keyword arguments for `scipy.signal` filters:
            filter_order : order
                The order of the filter. Note as filtering is applied with scipy's
                `filtfilt` functions (i.e. acausal, zero-phase) the effective
                order will be double the `filter_order`.
            filter_mode : "sos" | "ba", default: "sos"
                Filter form of the filter coefficients:
                - second-order sections ("sos")
                - numerator/denominator : ("ba")
            ftype : str, default: "butter"
                Filter type for `scipy.signal.iirfilter` e.g. "butter", "cheby1".

    Returns
    -------
    filter_recording : BandpassFilterRecording
        The bandpass-filtered recording extractor object

Get the default sorter parameters of mountainsort5:

.. code:: ipython3

    print(si.get_default_sorter_params('mountainsort5'))

.. parsed-literal::

    {'scheme': '2', 'detect_threshold': 5.5, 'detect_sign': -1, 'detect_time_radius_msec': 0.5, 'snippet_T1': 20, 'snippet_T2': 20, 'npca_per_channel': 3, 'npca_per_subdivision': 10, 'snippet_mask_radius': 250, 'scheme1_detect_channel_radius': 150, 'scheme2_phase1_detect_channel_radius': 200, 'scheme2_detect_channel_radius': 50, 'scheme2_max_num_snippets_per_training_batch': 200, 'scheme2_training_duration_sec': 300, 'scheme2_training_recording_sampling_mode': 'uniform', 'scheme3_block_duration_sec': 1800, 'freq_min': 300, 'freq_max': 6000, 'filter': True, 'whiten': True, 'delete_temporary_recording': True, 'pool_engine': 'process', 'n_jobs': 1, 'chunk_duration': '1s', 'progress_bar': True, 'mp_context': None, 'max_threads_per_worker': 1}
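A common pattern is to start from these defaults and override only what you need when building the ``'sorting'`` entry of a protocol dict. A sketch using an illustrative subset of the defaults printed above (in practice the full dict comes from ``get_default_sorter_params``):

```python
# Illustrative subset of the mountainsort5 defaults shown above
default_params = {'scheme': '2', 'detect_threshold': 5.5, 'snippet_T1': 20, 'snippet_T2': 20}

# Override only what you need; in a dict merge, later entries win
my_overrides = {'snippet_T2': 15}
sorting_params = {**default_params, **my_overrides}

print(sorting_params['snippet_T1'], sorting_params['snippet_T2'])  # 20 15
```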

Find the possible extensions you can compute:

.. code:: ipython3

    print(analyzer.get_computable_extensions())

.. parsed-literal::

    ['random_spikes', 'waveforms', 'templates', 'noise_levels', 'amplitude_scalings', 'correlograms', 'isi_histograms', 'principal_components', 'spike_amplitudes', 'spike_locations', 'template_metrics', 'template_similarity', 'unit_locations', 'quality_metrics']
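Since a typo in the protocol dict only surfaces when ``compute`` runs, it can be worth checking your ``'postprocessing'`` keys against the computable extensions first. A minimal sketch, using a hard-coded subset of the list above (in practice you would call ``analyzer.get_computable_extensions()``):

```python
# Hard-coded subset of the extension list shown above
computable_extensions = ['random_spikes', 'waveforms', 'templates', 'noise_levels',
                         'correlograms', 'spike_amplitudes', 'unit_locations']

postprocessing = {
    'random_spikes': {},
    'templates': {},
    'unit_locations': {'method': 'center_of_mass'},
}

# Collect any requested extensions that the analyzer cannot compute
unknown = [name for name in postprocessing if name not in computable_extensions]
if unknown:
    raise ValueError(f"Unknown extensions in protocol: {unknown}")
print("all postprocessing steps are computable")
```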

And the arguments for each extension ‘blah’ can be found in the
docstring of ‘compute_blah’, e.g.

.. code:: ipython3

    print(si.compute_spike_amplitudes.__doc__)

.. parsed-literal::

    AnalyzerExtension
    Computes the spike amplitudes.

    Needs "templates" to be computed first.
    Computes spike amplitudes from the template's peak channel for every spike.

    Parameters
    ----------
    sorting_analyzer : SortingAnalyzer
        A SortingAnalyzer object
    peak_sign : "neg" | "pos" | "both", default: "neg"
        Sign of the template to compute extremum channel used to retrieve spike amplitudes.

    Returns
    -------
    spike_amplitudes: np.array
        All amplitudes for all spikes and all units are concatenated (along time, like in spike vector)

doc/modules/preprocessing.rst


**NOTE:** all sorters will automatically perform the saving operation internally.

The Preprocessing Pipeline
--------------------------

The module also contains the :code:`PreprocessingPipeline` object, which aims to let users easily share pipelines across
labs. The input to create the pipeline is a dictionary of preprocessing steps, whose keys are the names of the steps
and whose values are dictionaries of parameters. For example, to construct a pipeline consisting of highpass filtering
with a cutoff frequency of 250 Hz, followed by whitening with default parameters, and finally a
detect-and-remove-bad-channels step, we first make the appropriate dictionary:

.. code-block:: python

    from spikeinterface.preprocessing import apply_pipeline, PreprocessingPipeline

    preprocessing_dict = {
        'highpass_filter': {'freq_min': 250},
        'whiten': {},
        'detect_and_remove_bad_channels': {},
    }

We can then pass this dictionary to the :code:`apply_pipeline` function to make a preprocessed recording

.. code-block:: python

    preprocessed_recording = apply_pipeline(recording, preprocessing_dict)

Alternatively, we can construct a :code:`PreprocessingPipeline`, allowing us to investigate the pipeline before
using it.

.. code-block:: python

    preprocessing_pipeline = PreprocessingPipeline(preprocessing_dict)
    # to view the pipeline:
    preprocessing_pipeline

.. raw:: html

    <div>
    <strong>PreprocessingPipeline</strong>
    <div style='border:1px solid #ccc; padding:10px;'>
    <strong>Initial Recording</strong>
    </div>
    <div style='margin: auto; text-indent: 30px;'>&#x2193;</div>
    <details style='border:1px solid #ddd; padding:5px;'>
    <summary><strong>highpass_filter</strong></summary>
    <ul>
    <li><strong>freq_min</strong>: 250</li>
    <li><strong>margin_ms</strong>: 5.0</li>
    <li><strong>dtype</strong>: None</li>
    <li><strong>**filter_kwargs</strong>: None</li>
    </ul>
    </details>
    <details style='border:1px solid #ddd; padding:5px;'>
    <summary><strong>whiten</strong></summary>
    <ul>
    <li><strong>dtype</strong>: None</li>
    <li><strong>apply_mean</strong>: False</li>
    <li><strong>regularize</strong>: False</li>
    <li><strong>regularize_kwargs</strong>: None</li>
    <li><strong>mode</strong>: 'global'</li>
    <li><strong>radius_um</strong>: 100.0</li>
    <li><strong>int_scale</strong>: None</li>
    <li><strong>eps</strong>: None</li>
    <li><strong>W</strong>: None</li>
    <li><strong>M</strong>: None</li>
    <li><strong>**random_chunk_kwargs</strong>: None</li>
    </ul>
    </details>
    <details style='border:1px solid #ddd; padding:5px;'>
    <summary><strong>detect_and_remove_bad_channels</strong></summary>
    <ul>
    <li><strong>parent_recording</strong>: None</li>
    <li><strong>bad_channel_ids</strong>: None</li>
    <li><strong>channel_labels</strong>: None</li>
    <li><strong>**detect_bad_channels_kwargs</strong>: None</li>
    </ul>
    </details>
    <div style='margin: auto; text-indent: 30px;'>&#x2193;</div>
    <div style='border:1px solid #ccc; padding:10px;'>
    <strong>Preprocessed Recording</strong>
    </div>
    </div>

Once we have the pipeline, we can apply it to a recording in the same way as applying the dictionary

.. code-block:: python

    preprocessed_recording_again = apply_pipeline(recording, preprocessing_pipeline)

To share the pipeline you have made with another lab, you can simply share the dictionary. The dictionary
can also be obtained from the pipeline object directly:

.. code-block:: python

    dict_used_to_make_pipeline = preprocessing_pipeline.preprocessor_dict
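Because the pipeline is just a dictionary of step names and plain parameter values, one simple way to share it is as a JSON file. A sketch, assuming every parameter value in your dict is JSON-serializable (the filename here is illustrative):

```python
import json

preprocessing_dict = {
    'highpass_filter': {'freq_min': 250},
    'whiten': {},
    'detect_and_remove_bad_channels': {},
}

# Write the pipeline to disk so a collaborator can apply the exact same steps
with open("my_preprocessing_pipeline.json", "w") as f:
    json.dump(preprocessing_dict, f, indent=2)

# The collaborator loads it back and passes it to apply_pipeline
with open("my_preprocessing_pipeline.json") as f:
    loaded_dict = json.load(f)

print(loaded_dict == preprocessing_dict)  # True
```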

Impact on recording dtype
-------------------------

doc/tutorials_custom_index.rst


These tutorials focus on the :py:mod:`spikeinterface.core` module.

.. grid-item-card:: Build full pipeline with dicts
    :link: how_to/build_pipeline_with_dicts.html
    :img-top: /images/logo.png
    :img-alt: Build full pipeline with dicts
    :class-card: gallery-card
    :text-align: center

Extractors tutorials
--------------------
