Pipeline Parameters¶
Global Parameters¶
The pipeline accepts several global parameters that control its overall behavior:
--n_jobs N_JOBS Number of parallel jobs (for local deployment)
--runmode {full,fast} Processing mode ('fast' skips some steps like motion correction)
--sorter {kilosort25,kilosort4,spykingcircus2} Spike sorter selection
Parameter File¶
A parameter file can be used to set all parameters at once.
This is the recommended way to configure the pipeline, especially for complex setups.
The parameter file should be in JSON format and you can use the pipeline/default_params.json file as a template.
To use a parameter file, specify it with the --params_file option:
--params_file PATH_TO_PARAMS_FILE
# Example: --params_file pipeline/default_params.json
Note
In the spikesorting section of the parameter file, you can specify the sorter and its parameters.
The sorter field, if specified and not null, will override the command line --sorter parameter.
Parameter Editor Webapp¶
A browser-based parameter editor is included in params_app/.
It reads the JSON schema (pipeline/default_params_schema.json) and renders an
interactive form for creating and editing parameter files, with built-in validation.
To run the webapp, use the included launcher script (requires Python 3):
python params_app/serve.py
This starts a local server from the repository root, prints the URL, and opens it in your browser. An optional port argument is supported:
python params_app/serve.py 9000
The webapp provides two tabs:
Editor — an interactive form with all pipeline parameters, inline descriptions, enum dropdowns, nullable toggles, and collapsible sections. You can generate, download, copy, or import JSON files.
Validate JSON — paste or upload an existing JSON file to validate it against the schema. Errors are shown with their JSON path and message.
No installation or build step is required — the app is fully static.
JSON Schema¶
The file pipeline/default_params_schema.json is a
JSON Schema (draft-07) that formally describes every
parameter, its type, allowed values, and defaults. You can use it for:
Editor integration — VS Code, PyCharm, and other editors can provide autocompletion and inline validation when you add a
$schemareference at the top of your params file:{ "$schema": "./default_params_schema.json", "job_dispatch": { "input": "nwb" } }
Programmatic validation — validate parameter files in Python:
import json, jsonschema with open("pipeline/default_params_schema.json") as f: schema = json.load(f) with open("my_params.json") as f: params = json.load(f) jsonschema.validate(params, schema) # raises on error
Process-Specific Arguments¶
Parameters can be specified via the parameter file or passed directly as command line arguments when running the pipeline. CLI arguments will override any conflicting parameters set in the parameter file.
Each pipeline step can be configured with specific parameters using the format:
--{step_name}_args="{args}"
Job Dispatch Parameters¶
Parameter file section (job_dispatch):
{
"split_segments": true,
"split_groups": true,
"debug": false,
"debug_duration": 30,
"skip_timestamps_check": false,
"multi_session": false,
"input": "openephys",
"spikeinterface_info": null
}
split_segmentsIf
true, each recording segment is processed independently. Iffalse, all segments are concatenated before processing.split_groupsIf
true, different electrode groups (e.g., probes) are dispatched as separate parallel jobs. Iffalse, all groups are combined into a single job.debugEnable debug mode: the recording is clipped to
debug_durationseconds to allow rapid end-to-end testing of the pipeline.debug_durationDuration in seconds to which the recording is clipped when
debugistrue.skip_timestamps_checkSkip validation of sample timestamps. Useful when timestamps are absent or known to be unreliable (e.g. some Open Ephys recordings).
multi_sessionIf
true, the data folder is expected to contain multiple session sub-folders, each of which is processed independently.inputData loader (reader) to use. One of
aind,spikeglx,openephys,nwb, orspikeinterface. Usespikeinterfacetogether with--spikeinterface-infofor any format supported by SpikeInterface.spikeinterface_infoJSON string containing the information needed to load a recording with SpikeInterface when
inputis set tospikeinterface. It includes:reader_type(required): string with the reader type (e.g. ‘plexon’, ‘neuralynx’, ‘intan’ etc.).Use ‘spikeinterface’ for any format supported by SpikeInterface’s universal reader.
reader_kwargs(optional): dictionary with the reader kwargs (e.g. {‘folder’: ‘/path/to/folder’}).keep_stream_substrings(optional): string or list of strings with the stream names to load (e.g. ‘AP’ or [‘AP’, ‘LFP’]).skip_stream_substrings(optional): string (or list of strings) with substrings used to skip streams (e.g. ‘NIDQ’ or [‘USB’, ‘EVENTS’]).probe_paths(optional): string or dict the probe paths to a ProbeInterface JSON file (e.g. ‘/path/to/probe.json’).If a dict is provided, the key is the stream name and the value is the probe path. If reader_kwargs is not provided, the reader will be created with default parameters. The probe_path is required if the reader doesn’t load the probe automatically.
{ "reader_type": "intan", "reader_kwargs": { "file_path": "/path/to/intan.rhd" }, "skip_stream_substrings": ["EVENTS"], "probe_paths": "path/to/probe.json" }
Note
If the reader needs extra packages installed, specify them in the EXTRA_INSTALLS variable in the capsule_versions.env file
(e.g. EXTRA_INSTALLS="mtscomp").
Preprocessing Parameters¶
Parameter file section (``preprocessing``):
{
"job_kwargs": {
"chunk_duration": "1s",
"progress_bar": false
},
"min_preprocessing_duration": 120,
"custom_preprocessing_pipeline": null,
"denoising_strategy": "cmr",
"filter_type": "highpass",
"highpass_filter": {
"freq_min": 300.0,
"margin_ms": 5.0
},
"bandpass_filter": {
"freq_min": 300.0,
"freq_max": 6000.0,
"margin_ms": 5.0
},
"phase_shift": {
"margin_ms": 100.0
},
"detect_bad_channels": {
"method": "coherence+psd",
"dead_channel_threshold": -0.5,
"noisy_channel_threshold": 1.0,
"outside_channel_threshold": -0.3,
"outside_channels_location": "top",
"n_neighbors": 11,
"seed": 0
},
"remove_out_channels": true,
"remove_bad_channels": true,
"max_bad_channel_fraction": 0.5,
"common_reference": {
"reference": "global",
"operator": "median"
},
"highpass_spatial_filter": {
"n_channel_pad": 60,
"n_channel_taper": null,
"direction": "y",
"apply_agc": true,
"agc_window_length_s": 0.01,
"highpass_butter_order": 3,
"highpass_butter_wn": 0.01
},
"motion_correction": {
"compute": true,
"apply": false,
"preset": "dredge_fast",
"detect_kwargs": {},
"select_kwargs": {},
"localize_peaks_kwargs": {},
"estimate_motion_kwargs": {
"win_step_norm": 0.1,
"win_scale_norm": 0.1
},
"interpolate_motion_kwargs": {}
}
}
job_kwargs.chunk_durationSize of each processing chunk, e.g.
"1s". Larger chunks reduce overhead but require more memory.job_kwargs.progress_barShow a progress bar during chunk-based processing.
min_preprocessing_durationMinimum recording duration in seconds required to run preprocessing. Recordings shorter than this value are skipped entirely.
custom_preprocessing_pipelineA dictionary defining a fully custom preprocessing pipeline. When
null, the default pipeline (filter → phase-shift → bad-channel detection → CMR/destripe → motion) is used. See [SpikeInterface docs](https://spikeinterface.readthedocs.io/en/stable/how_to/build_pipeline_with_dicts.html)denoising_strategyStrategy used for channel-level denoising after filtering:
"cmr"— Common Median Reference: subtracts the median trace computed across all (good) channels."destripe"— IBL destriping: applies a high-pass spatial filter along the probe axis (parameters controlled byhighpass_spatial_filter).
filter_typeTemporal filter applied to the raw signal before phase-shift correction:
"highpass"— useshighpass_filtersettings."bandpass"— usesbandpass_filtersettings.
highpass_filter.freq_minHigh-pass cutoff frequency in Hz.
highpass_filter.margin_msMargin in milliseconds added at segment boundaries to reduce filter edge artifacts.
bandpass_filter.freq_min/bandpass_filter.freq_maxLower and upper cutoff frequencies in Hz for the bandpass filter (only used when
filter_typeis"bandpass").bandpass_filter.margin_msBoundary margin in ms for the bandpass filter.
phase_shift.margin_msMargin in ms used for inter-sample phase-shift correction. This step compensates for the time offset introduced by multiplexed ADCs (e.g. Neuropixels).
detect_bad_channels.methodAlgorithm used to classify bad channels.
"coherence+psd"combines local signal coherence with power-spectral density to identify dead, noisy, and out-of-brain channels.detect_bad_channels.dead_channel_thresholdCoherence threshold below which a channel is classified as dead/disconnected.
detect_bad_channels.noisy_channel_thresholdSNR threshold above which a channel is classified as excessively noisy.
detect_bad_channels.outside_channel_thresholdThreshold used to detect channels outside the brain (based on PSD features).
detect_bad_channels.outside_channels_locationExpected anatomical position of out-of-brain channels on the probe:
"top"(channels at the tip end) or"bottom"(channels at the base end).detect_bad_channels.n_neighborsNumber of neighboring channels used when computing local signal coherence.
detect_bad_channels.seedRandom seed for reproducibility of the bad channel detection algorithm.
remove_out_channelsIf
true, channels detected as outside the brain are removed from further processing.remove_bad_channelsIf
true, dead and noisy channels are removed from further processing.max_bad_channel_fractionMaximum fraction of total channels that may be classified as bad before the entire recording is skipped. For example,
0.5means preprocessing is aborted if more than half of the channels are bad.common_reference.referenceScope of the common reference calculation.
"global"uses all good channels on the probe.common_reference.operatorAggregation function for the common reference.
"median"is robust to outlier channels.highpass_spatial_filterParameters for IBL destriping. Only used when
denoising_strategyis"destripe".n_channel_pad— number of channels padded at each edge before spatial filtering.n_channel_taper— number of channels used for the cosine taper (null= auto).direction— axis along which to apply the spatial filter ("y"= depth axis).apply_agc— apply Automatic Gain Control before spatial filtering.agc_window_length_s— AGC window length in seconds.highpass_butter_order— order of the Butterworth spatial high-pass filter.highpass_butter_wn— normalised cutoff frequency of the spatial filter (0–1).
motion_correction.computeIf
true, estimate probe drift and save the motion object. The motion estimate is always saved to results even if it is not applied to the recording.motion_correction.applyIf
true, apply motion interpolation to the recording traces. Iffalse(default), motion is computed and saved but the raw traces are left unmodified; postprocessing can optionally apply it later.motion_correction.presetNamed preset controlling the full motion-estimation workflow (detection, localisation, estimation).
- Available motion presets:
dredgedredge_fast(default)nonrigid_accuratenonrigid_fast_and_accuraterigid_fastkilosort_like
motion_correction.estimate_motion_kwargsExtra keyword arguments forwarded to the motion estimator.
win_step_normandwin_scale_normcontrol the temporal and spatial window step/scale (normalised to the recording duration and probe length, respectively).motion_correction.detect_kwargs/select_kwargs/localize_peaks_kwargs/interpolate_motion_kwargsAdditional keyword arguments forwarded to the peak detection, peak selection, peak localisation, and motion interpolation steps, respectively. Leave empty (
{}) to use preset defaults.
Spike Sorting Parameters¶
Parameter file section (spikesorting):
{
"sorter": null,
"{sorter_name}": {
"job_kwargs": {
"chunk_duration": "1s",
"progress_bar": false
},
"skip_motion_correction": false,
"min_drift_channels": 6,
"raise_if_fails": true,
"clear_cache": false,
"sorter": {
// sorter-specific parameters forwarded to SpikeInterface
}
}
}
Note
The kilosort4, kilosort25, and spykingcircus2 sub-objects inside spikesorting
hold sorter-specific parameters and are documented separately in each sorter separately.
sorterSelects the spike sorter to use. Accepted values:
"kilosort4","kilosort25","spykingcircus2". Whennull, the sorter is determined by the--sorterCLI argument.{sorter}.job_kwargsParallel processing chunk settings for the spike sorting step (same format as other steps).
{sorter}.skip_motion_correctionIf
true, disables the sorter’s built-in motion correction (useful when motion has already been handled in preprocessing).{sorter}.min_drift_channelsMinimum number of channels required to activate the sorter’s internal motion correction. Recordings with fewer channels skip drift correction automatically.
{sorter}.raise_if_failsIf
true, a sorting failure raises an exception and stops the pipeline for that recording. Iffalse, the failure is logged and the pipeline continues with the remaining recordings.{sorter}.clear_cache(Kilosort4 only) Force PyTorch to release its memory cache between memory-intensive operations. Useful on GPUs with limited VRAM.
{sorter}.sorterDictionary of sorter-specific parameters forwarded directly to the SpikeInterface sorter wrapper (e.g.
batch_size,Th_universalfor Kilosort4). Refer to the SpikeInterface documentation for the full list of accepted parameters per sorter.
Postprocessing Parameters¶
Parameter file section (postprocessing):
{
"job_kwargs": {
"chunk_duration": "1s",
"progress_bar": false
},
"use_motion_corrected": false,
"sparsity": {
"method": "radius",
"radius_um": 100
},
"duplicate_threshold": 0.9,
"return_in_uV": true,
"extensions": {
"random_spikes": {
"max_spikes_per_unit": 500,
"method": "uniform",
"margin_size": null,
"seed": null
},
"noise_levels": {
"num_chunks_per_segment": 20,
"chunk_size": 10000,
"seed": null
},
"waveforms": {
"ms_before": 2.0,
"ms_after": 3.0,
"dtype": null
},
"templates": {},
"spike_amplitudes": {
"peak_sign": "neg"
},
"template_similarity": {
"method": "l1"
},
"correlograms": {
"window_ms": 50.0,
"bin_ms": 1.0
},
"isi_histograms": {
"window_ms": 100.0,
"bin_ms": 5.0
},
"unit_locations": {
"method": "monopolar_triangulation"
},
"spike_locations": {
"method": "grid_convolution"
},
"template_metrics": {
"upsampling_factor": 10,
"sparsity": null,
"include_multi_channel_metrics": true
},
"principal_components": {
"n_components": 5,
"mode": "by_channel_local",
"whiten": true
},
"quality_metrics": {
"metric_names": [
"num_spikes", "firing_rate", "presence_ratio",
"snr", "isi_violation", "rp_violation",
"sliding_rp_violation", "amplitude_cutoff",
"amplitude_median", "amplitude_cv",
"synchrony", "firing_range", "drift",
"isolation_distance", "l_ratio", "d_prime",
"nearest_neighbor", "silhouette"
],
"metric_params": {
"presence_ratio": { "bin_duration_s": 60 },
"snr": { "peak_sign": "neg", "peak_mode": "extremum" },
"isi_violation": { "isi_threshold_ms": 1.5, "min_isi_ms": 0 },
"rp_violation": { "refractory_period_ms": 1, "censored_period_ms": 0.0 },
"sliding_rp_violation": {
"bin_size_ms": 0.25, "window_size_s": 1,
"exclude_ref_period_below_ms": 0.5, "max_ref_period_ms": 10,
"contamination_values": null
},
"amplitude_cutoff": {
"peak_sign": "neg", "num_histogram_bins": 100,
"histogram_smoothing_value": 3, "amplitudes_bins_min_ratio": 5
},
"amplitude_median": { "peak_sign": "neg" },
"amplitude_cv": {
"average_num_spikes_per_bin": 50, "percentiles": [5, 95],
"min_num_bins": 10, "amplitude_extension": "spike_amplitudes"
},
"firing_range": { "bin_size_s": 5, "percentiles": [5, 95] },
"synchrony": { "synchrony_sizes": [2, 4, 8] },
"nearest_neighbor": { "max_spikes": 10000, "n_neighbors": 4 },
"silhouette": { "method": ["simplified"] }
}
}
}
}
job_kwargsParallel processing chunk settings (same format as other steps).
use_motion_correctedIf
trueand motion was estimated but not applied during preprocessing, motion interpolation is applied to the recording before computing postprocessing extensions. Has no effect if motion correction was already applied or was not computed.sparsity.methodStrategy for selecting the subset of channels associated with each unit.
"radius"retains all channels withinradius_umµm of the estimated unit location.sparsity.radius_umRadius in micrometres around each unit’s estimated location used for sparse channel selection.
duplicate_thresholdTemplate correlation threshold above which two units are considered duplicates. The unit with fewer spikes is removed to avoid counting the same neuron twice.
return_in_uVIf
true, waveforms and templates are returned in microvolts (µV) by applying the recording’s gain/offset. Iffalse, values remain in raw ADC counts.extensionsParameters for the SpikeInterface extensions. Check spikeinterface documentation for the full list of available extensions and their parameters.
Curation Parameters¶
Parameter file section (curation):
{
"job_kwargs": {
"chunk_duration": "1s",
"progress_bar": false
},
"query": "isi_violations_ratio < 0.5 and presence_ratio > 0.8 and amplitude_cutoff < 0.1",
"noise_neural_classifier": "SpikeInterface/UnitRefine_noise_neural_classifier",
"sua_mua_classifier": "SpikeInterface/UnitRefine_sua_mua_classifier"
}
job_kwargsParallel processing chunk settings (same format as other steps).
queryA pandas-style query string applied to the quality metrics table. Units that do not satisfy the condition are labelled as
"bad"(they are retained in the output but flagged). Example:"isi_violations_ratio < 0.5 and presence_ratio > 0.8 and amplitude_cutoff < 0.1"
Any quality metric column name can be used in the expression. Set to
nullor""to skip query-based curation.noise_neural_classifierHuggingFace model ID for the noise-vs-neural unit classifier (part of the UnitRefine suite). The model takes waveform features as input and predicts whether each unit represents a real neuron or recording noise.
sua_mua_classifierHuggingFace model ID for the single-unit (SUA) vs. multi-unit (MUA) classifier. Predicts whether a unit is a well-isolated single neuron or a mixture of multiple neurons.
NWB Ecephys Parameters¶
Parameter file section (nwb.ecephys):
{
"backend": "zarr",
"stub": false,
"stub_seconds": 10,
"write_lfp": true,
"write_raw": false,
"lfp_temporal_factor": 2,
"lfp_spatial_factor": 4,
"lfp_highpass_freq_min": 0.1,
"surface_channel_agar_probes_indices": "",
"lfp": {
"filter": {
"freq_min": 0.1,
"freq_max": 500
},
"sampling_rate": 2500
}
}
backendNWB file format.
"zarr"produces a chunked, cloud-friendly Zarr store;"hdf5"produces a standard HDF5.nwbfile.stubIf
true, write a truncated version of the file for quick validation and testing.stub_secondsDuration in seconds of the stub recording written when
stubistrue.write_lfpIf
true, include the LFPElectricalSeriesin the NWB file.write_rawIf
true, include the raw (unfiltered, full-bandwidth)ElectricalSeriesin the NWB file. Note: this significantly increases output file size.lfp_temporal_factorTemporal downsampling factor applied to the LFP band before writing. A value of
2halves the sample rate. Use0or1to keep all samples.lfp_spatial_factorChannel subsampling stride for the LFP band. A value of
4retains every 4th channel. Use0or1to retain all channels.lfp_highpass_freq_minHigh-pass cutoff frequency in Hz applied to the LFP band before writing. Use
0to skip this filter.surface_channel_agar_probes_indicesJSON string mapping probe names to the index of the most superficial channel still in tissue, used for common-median referencing on probes inserted through agar. Example:
{"ProbeA": 350, "ProbeB": 360}. Leave empty ("") when not applicable.lfp.filter.freq_min/lfp.filter.freq_maxBandpass filter bounds (Hz) that define the LFP frequency band applied before downsampling.
lfp.sampling_rateTarget sampling rate in Hz for the LFP band after temporal downsampling.
Visualization Parameters¶
Parameter file section (visualization):
{
"job_kwargs": {
"chunk_duration": "1s",
"progress_bar": false
},
"timeseries": {
"n_snippets_per_segment": 2,
"snippet_duration_s": 0.5
},
"drift": {
"detection": {
"peak_sign": "neg",
"detect_threshold": 5,
"exclude_sweep_ms": 0.1
},
"localization": {
"ms_before": 0.1,
"ms_after": 0.3,
"radius_um": 100.0
},
"n_skip": 30,
"alpha": 0.15,
"vmin": -200,
"vmax": 0,
"cmap": "Greys_r",
"figsize": [10, 10]
},
"motion": {
"cmap": "Greys_r",
"scatter_decimate": 15,
"figsize": [15, 10]
}
}
job_kwargsParallel processing chunk settings (same format as other steps).
timeseries.n_snippets_per_segmentNumber of raw/preprocessed time-series snippet plots generated per recording segment.
timeseries.snippet_duration_sDuration in seconds of each time-series snippet.
drift.detection.peak_signPolarity of peaks detected for the drift scatter plot (
"neg"= negative peaks).drift.detection.detect_thresholdDetection threshold in median-absolute-deviation (MAD) units. Only peaks above this threshold are included in the drift plot.
drift.detection.exclude_sweep_msExclusion window in milliseconds around each detected peak to suppress double-detections.
drift.localization.ms_before/drift.localization.ms_afterWaveform window (ms) around each detected peak used for spike localisation in the drift plot.
drift.localization.radius_umRadius in µm around each peak’s primary channel used when localising spikes for the drift scatter plot.
drift.n_skipDecimation factor for the drift scatter plot. A value of
30means only 1 in 30 detected spikes is plotted (to reduce render time on dense recordings).drift.alphaTransparency (alpha) of scatter points in the drift plot (0 = fully transparent, 1 = opaque).
drift.vmin/drift.vmaxColour-axis limits for the drift colourmap (typically amplitude in µV).
drift.cmapMatplotlib colourmap used for colouring drift scatter points.
drift.figsizeFigure size
[width, height]in inches for the drift plot.motion.cmapMatplotlib colourmap used for the motion summary plot.
motion.scatter_decimateDecimation factor for the motion scatter plot (same concept as
drift.n_skip).motion.figsizeFigure size
[width, height]in inches for the motion plot.
Full example with custom parameters¶
Here’s an example of running the pipeline with custom parameters:
DATA_PATH=$DATA RESULTS_PATH=$RESULTS \
nextflow -C nextflow_local.config run main_multi_backend.nf \
--n_jobs 16 \
--sorter kilosort4 \
--job_dispatch_args="--input spikeglx --debug --debug-duration 120" \
--preprocessing_args="--motion compute --motion-preset nonrigid_fast_and_accurate" \
--nwb_ecephys_args="--skip-lfp"
- This example:
Runs 16 parallel jobs
Uses Kilosort4 for spike sorting
Processes SpikeGLX data in debug mode
Computes nonrigid motion correction
Skips LFP export in NWB files