.. _pipeline_steps:

Pipeline Steps
==============

The AIND Ephys Pipeline consists of several key processing steps that are executed in sequence. Here's a detailed look at each step:

Job Dispatch
------------

The job-dispatch step:

* Generates JSON files for parallel processing
* Enables parallelization across:

  * Multiple probes
  * Multiple shanks (e.g., for NP2-4shank probes)

* Creates independent processing jobs for parallel execution
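
The fan-out described above can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline's actual code: the ``build_jobs`` and ``write_jobs`` helpers, the ``job_<index>.json`` file naming, and the JSON field names are all assumptions made for the example.

.. code-block:: python

    import json
    from pathlib import Path


    def build_jobs(session_name, probes):
        """Return one job dict per probe, or one per shank for multi-shank probes.

        ``probes`` maps a probe name to its number of shanks (hypothetical input).
        """
        jobs = []
        for probe_name, num_shanks in probes.items():
            if num_shanks > 1:
                # Multi-shank probes are split so that each shank is its own job
                for shank in range(num_shanks):
                    jobs.append({"session": session_name, "probe": probe_name, "shank": shank})
            else:
                jobs.append({"session": session_name, "probe": probe_name, "shank": None})
        return jobs


    def write_jobs(jobs, output_dir):
        """Write each job to its own JSON file and return the file paths."""
        output_dir = Path(output_dir)
        output_dir.mkdir(parents=True, exist_ok=True)
        paths = []
        for index, job in enumerate(jobs):
            path = output_dir / f"job_{index}.json"
            path.write_text(json.dumps(job, indent=2))
            paths.append(path)
        return paths

Because every JSON file describes a single probe or shank, each downstream job can read its own file and run without knowing about the others.
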

Preprocessing
-------------

The preprocessing step handles several critical data preparation tasks:

* Phase shift correction
* Highpass filtering
* Denoising

  * Bad channel removal
  * Common median reference ("cmr") or highpass spatial filter ("destripe")

* Motion estimation and correction (optional)
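
To make the "cmr" option concrete: a common median reference subtracts, at each time sample, the median across channels from every channel, which removes noise shared by all channels. The plain-Python sketch below only illustrates that idea; it is not the pipeline's implementation, and the function name and input layout are assumptions.

.. code-block:: python

    from statistics import median


    def common_median_reference(traces):
        """Subtract the per-sample median across channels.

        ``traces`` is a list of samples; each sample is a list holding one
        value per channel (hypothetical layout chosen for readability).
        """
        referenced = []
        for sample in traces:
            reference = median(sample)  # signal common to all channels at this sample
            referenced.append([value - reference for value in sample])
        return referenced

For example, a sample ``[1.0, 2.0, 6.0]`` has a median of ``2.0`` and becomes ``[-1.0, 0.0, 4.0]``.
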

Spike Sorting
-------------

The pipeline supports multiple spike sorting algorithms:

* Kilosort2.5
* Kilosort4
* SpykingCircus2

Each sorter can be selected based on your specific needs and data characteristics.

Postprocessing
--------------

The postprocessing step performs additional processing on the
combined preprocessed recording and sorted data:

* Removal of duplicate units
* Computation of *extensions*:

  * Waveform extraction
  * Templates
  * Spike amplitudes
  * Unit locations
  * Principal Component Analysis (PCA) projections
  * Spike locations
  * Correlograms
  * Template similarity
  * Template metrics
  * Quality metrics
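
As an illustration of what removing duplicate units can involve, the sketch below flags pairs of units whose spike trains mostly coincide. The coincidence criterion, the ``max_delta`` window, the ``threshold`` value, and both function names are assumptions made for this example; the pipeline's actual criterion may differ.

.. code-block:: python

    def count_coincident(spikes_a, spikes_b, max_delta):
        """Count spikes in ``spikes_a`` that have a spike in ``spikes_b`` within ``max_delta`` samples.

        Both inputs are sorted lists of spike times in samples.
        """
        count = 0
        j = 0
        for t in spikes_a:
            # Skip spikes in b that are too early to match t
            while j < len(spikes_b) and spikes_b[j] < t - max_delta:
                j += 1
            if j < len(spikes_b) and spikes_b[j] <= t + max_delta:
                count += 1
        return count


    def find_duplicate_pairs(spike_trains, max_delta=3, threshold=0.8):
        """Return pairs of unit ids whose spike trains mostly coincide."""
        unit_ids = sorted(spike_trains)
        pairs = []
        for i, unit_a in enumerate(unit_ids):
            for unit_b in unit_ids[i + 1:]:
                a, b = spike_trains[unit_a], spike_trains[unit_b]
                if not a or not b:
                    continue
                # Normalize by the shorter train so a subset counts as a duplicate
                shorter, longer = (a, b) if len(a) <= len(b) else (b, a)
                fraction = count_coincident(shorter, longer, max_delta) / len(shorter)
                if fraction >= threshold:
                    pairs.append((unit_a, unit_b))
        return pairs

One unit from each flagged pair would then be dropped before the extensions are computed.
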

Curation
--------

The curation step applies quality control by:

* Quality metrics-based filtering using thresholds on:

  * ISI violation ratio
  * Presence ratio
  * Amplitude cutoff

* Unit classification as noise, MUA, or SUA using a pretrained classifier (UnitRefine)

The *recipe* for quality metrics can be customized to suit your specific needs.
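
A threshold-based recipe of this kind can be sketched as a list of (metric, comparison, limit) rules that a unit must satisfy in full. The metric key names and the numeric limits below are placeholders chosen for illustration, not the pipeline's defaults, and ``passes_quality_control`` is a hypothetical helper.

.. code-block:: python

    import operator

    # Hypothetical recipe: each rule is (metric name, comparison, limit).
    # The limits are placeholders, not the pipeline's default values.
    RECIPE = [
        ("isi_violations_ratio", operator.lt, 0.5),
        ("presence_ratio", operator.gt, 0.8),
        ("amplitude_cutoff", operator.lt, 0.1),
    ]


    def passes_quality_control(unit_metrics, recipe=RECIPE):
        """Return True only if the unit satisfies every rule in the recipe."""
        return all(compare(unit_metrics[name], limit) for name, compare, limit in recipe)

Customizing the recipe then amounts to editing the list of rules, for example by tightening a limit or adding a rule for another metric.
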

Visualization
-------------

The visualization step generates static figures and interactive Figurl links for each probe:

* *timeseries*: including snippets of raw data, drift map, and motion visualizations
* *sorting_summary*: for spike sorting results inspection and curation

Each plot of the *timeseries* is also saved as a static image in the ``visualization/`` folder.

Result Collection
-----------------

The result collection step:

* Aggregates outputs from all parallel jobs
* Copies output folders to the results directory
* Organizes results in a standardized structure
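
The aggregation described above can be sketched with the standard library. The ``collect_results`` helper and the one-folder-per-job layout are assumptions for illustration only; the real step defines its own output structure.

.. code-block:: python

    import shutil
    from pathlib import Path


    def collect_results(job_output_dirs, results_dir):
        """Copy each job's output folder into ``results_dir``, keeping its name."""
        results_dir = Path(results_dir)
        results_dir.mkdir(parents=True, exist_ok=True)
        collected = []
        for job_dir in map(Path, job_output_dirs):
            destination = results_dir / job_dir.name
            # dirs_exist_ok (Python 3.8+) lets a re-run overwrite a previous copy
            shutil.copytree(job_dir, destination, dirs_exist_ok=True)
            collected.append(destination)
        return collected
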

NWB Export
----------

The final step creates standardized NWB output files, including:

* Session and subject information from ``aind-subject-nwb``
* Ecephys data from ``aind-ecephys-nwb``
* Unit data from ``aind-units-nwb``

Features:

* Supports multiple streams (e.g., probes) per file
* Optional raw data and LFP data writing