.. _pipeline_steps:

Pipeline Steps
==============

The AIND Ephys Pipeline consists of several key processing steps that are executed in sequence. Here's a detailed look at each step:

Job Dispatch
------------

The job-dispatch step:

* Generates JSON files for parallel processing
* Enables parallelization across:

  * Multiple probes
  * Multiple shanks (e.g., for NP2-4shank probes)

* Creates independent processing jobs for parallel execution

Preprocessing
-------------

The preprocessing step handles several critical data preparation tasks:

* Phase shift correction
* Highpass filtering
* Denoising

  * Bad channel removal
  * Common median reference ("cmr") or highpass spatial filter ("destripe")

* Motion estimation and correction (optional)

Spike Sorting
-------------

The pipeline supports multiple spike sorting algorithms:

* Kilosort2.5
* Kilosort4
* SpykingCircus2

Each sorter can be selected based on your specific needs and data characteristics.

Postprocessing
--------------

The postprocessing step performs additional processing on the combined preprocessed recording and sorted data:

* Removal of duplicate units
* Computation of *extensions*:

  * Waveform extraction
  * Templates
  * Spike amplitudes
  * Unit locations
  * Principal Component Analysis (PCA) projections
  * Spike locations
  * Correlograms
  * Template similarity
  * Template metrics
  * Quality metrics

Curation
--------

The curation step applies quality control by:

* Quality metrics-based filtering using thresholds on:

  * ISI violation ratio
  * Presence ratio
  * Amplitude cutoff

* Unit classification as noise, MUA, or SUA using a pretrained classifier (UnitRefine)

The *recipe* for quality metrics can be customized to suit your specific needs.
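
To illustrate the threshold-based part of curation, the following is a minimal sketch of how a quality-metrics recipe could be applied to a set of units. It is not the pipeline's implementation: the metric keys (``isi_violations_ratio``, ``presence_ratio``, ``amplitude_cutoff``), the threshold values, and the helper names ``passes_qc`` and ``filter_units`` are all assumptions chosen for the example.

```python
# Illustrative sketch only. Metric names, threshold values, and function
# names are assumptions, not the pipeline's actual recipe or defaults.
from typing import Dict, List, Tuple

# Hypothetical recipe: metric name -> (comparison operator, limit).
THRESHOLDS: Dict[str, Tuple[str, float]] = {
    "isi_violations_ratio": ("<", 0.5),
    "presence_ratio": (">", 0.8),
    "amplitude_cutoff": ("<", 0.1),
}


def passes_qc(
    metrics: Dict[str, float],
    thresholds: Dict[str, Tuple[str, float]] = THRESHOLDS,
) -> bool:
    """Return True only if every metric in the recipe satisfies its threshold."""
    for name, (op, limit) in thresholds.items():
        value = metrics.get(name)
        if value is None:
            # A missing metric is treated as a failure in this sketch.
            return False
        if op == "<":
            ok = value < limit
        elif op == ">":
            ok = value > limit
        else:
            raise ValueError(f"Unsupported operator: {op!r}")
        if not ok:
            return False
    return True


def filter_units(units: Dict[str, Dict[str, float]]) -> List[str]:
    """Return the IDs of units that pass all thresholds, in input order."""
    return [unit_id for unit_id, metrics in units.items() if passes_qc(metrics)]


# Made-up metric values for three units.
units = {
    "unit_0": {"isi_violations_ratio": 0.1, "presence_ratio": 0.95, "amplitude_cutoff": 0.02},
    "unit_1": {"isi_violations_ratio": 0.9, "presence_ratio": 0.99, "amplitude_cutoff": 0.01},
    "unit_2": {"isi_violations_ratio": 0.0, "presence_ratio": 0.40, "amplitude_cutoff": 0.05},
}

print(filter_units(units))  # ['unit_0']
```

Expressing the recipe as a plain mapping keeps the thresholds separate from the filtering logic, which is one way a customizable recipe could be swapped without touching the code.
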

Visualization
-------------

The visualization step generates static figures and interactive Figurl links for each probe:

* *timeseries*: including snippets of raw data, drift map, and motion visualizations
* *sorting_summary*: for spike sorting results inspection and curation

Each plot of the *timeseries* is also saved as a static image in the ``visualization/`` folder.

Result Collection
-----------------

The result collection step:

* Aggregates outputs from all parallel jobs
* Copies output folders to the results directory
* Organizes results in a standardized structure

NWB Export
----------

The final step creates standardized NWB output files, including:

* Session and subject information from aind-subject-nwb
* Ecephys data from aind-ecephys-nwb
* Unit data from aind-units-nwb

Features:

* Supports multiple streams (e.g., probes) per file
* Optional raw data and LFP data writing
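
As a rough illustration of the result collection step described above, the sketch below gathers the output folders of several parallel jobs into one results directory. The layout it assumes (one folder per job, containing one subfolder per processing step, copied to ``results/<step>/<job>``) and the function name ``collect_results`` are hypothetical and are not taken from the pipeline itself.

```python
# Illustrative sketch only. The folder layout and the function name are
# assumptions; the actual pipeline may organize its outputs differently.
import shutil
import tempfile
from pathlib import Path
from typing import Iterable, List


def collect_results(job_dirs: Iterable[Path], results_dir: Path) -> List[Path]:
    """Copy each job's step folders to results_dir/<step>/<job_name>."""
    results_dir = Path(results_dir)
    collected = []
    for job_dir in map(Path, job_dirs):
        # Every subfolder of a job directory is treated as one step's output.
        for step_dir in sorted(p for p in job_dir.iterdir() if p.is_dir()):
            target = results_dir / step_dir.name / job_dir.name
            # copytree creates missing parent folders; dirs_exist_ok
            # (Python 3.8+) lets the copy be repeated safely.
            shutil.copytree(step_dir, target, dirs_exist_ok=True)
            collected.append(target)
    return collected


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        # Build two fake job folders, one per (hypothetical) probe.
        for job in ("probeA", "probeB"):
            step = root / "jobs" / job / "postprocessed"
            step.mkdir(parents=True)
            (step / "info.txt").write_text(job)
        jobs = sorted((root / "jobs").iterdir())
        for path in collect_results(jobs, root / "results"):
            print(path.relative_to(root))
```

Grouping by step first and by job second is just one possible standardized structure; the same loop could build ``results/<job>/<step>`` instead by swapping the two path components.
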