Input and Output

Input Data Types

The pipeline supports several input data formats:

SpikeGLX

  • Input folder should contain a SpikeGLX saved folder

  • Recommended: Include subject.json and data_description.json following aind-data-schema specification

Open Ephys

  • Input folder should contain an Open Ephys folder

  • Recommended: Include subject.json and data_description.json following aind-data-schema specification

NWB

  • Input folder should contain a single NWB file

  • Supports both HDF5 and Zarr backends

SpikeInterface

  • Input folder should contain a SpikeInterface recording object

  • The spikeinterface_info parameter can be used to specify the recording type and additional parameters

AIND

  • Used for AIND-specific data ingestion

  • Input folder structure (as defined in aind-file-standards):
    • ecephys/ directory containing:
      • ecephys_clipped/ (clipped Open Ephys folder)

      • ecephys_compressed/ (compressed traces with Zarr)

    • JSON files following aind-data-schema specification

Pipeline Output

The pipeline output is organized in the RESULTS_PATH directory with the following structure:

preprocessed/

Contains preprocessing outputs:

  • Preprocessed JSON files for each stream

  • Motion folders with estimated motion

  • Can be loaded with SpikeInterface:

import spikeinterface as si
recording_preprocessed = si.load(
   "path-to-preprocessed.json",
   base_folder="path-to-raw-data-parent"
)

# Load motion data
import spikeinterface.preprocessing as spre
motion_info = spre.load_motion_info("path-to-motion-folder")

spikesorted/

Contains raw spike sorting output:

  • One folder per stream

  • Can be loaded as:

import spikeinterface as si
sorting_raw = si.load("path-to-spikesorted-folder")

curated/

Contains curated spike sorting outputs:

  • Includes unit deduplication and quality metric-based curation

  • Unit classification results

  • Load with SpikeInterface:

import spikeinterface as si
sorting_curated = si.load("path-to-curated-folder")

# Access curation properties
default_qc = sorting_curated.get_property("default_qc")  # True/False for QC pass
decoder_label = sorting_curated.get_property("decoder_label")  # noise/MUA/SUA

postprocessed/

Contains postprocessing output in Zarr format:

  • One folder per stream

  • Load with SpikeInterface as a SortingAnalyzer object:

import spikeinterface as si
sorting_analyzer = si.load("path-to-postprocessed-folder.zarr")

# Access extensions
unit_locations = sorting_analyzer.get_extension("unit_locations").get_data()
qm = sorting_analyzer.get_extension("quality_metrics").get_data()

Note

The default_qc and decoder_label properties are also included in the SortingAnalyzer! You can access them using: sorting_analyzer.get_sorting_property("default_qc"/"decoder_label").

nwb/

Contains generated NWB files:

  • One NWB file per block/segment

  • Includes all streams for that block/segment

  • Contains:
    • Session/subject information

    • Ecephys metadata

    • LFP signals (optional)

    • Units data

visualization/

Contains generated visualizations:

  • Drift maps

  • Motion plots

  • Sample traces for all streams

Additional Files

  • visualization_output.json: Contains Figurl links for each stream

  • processing.json: Logs processing steps, parameters, and execution times

  • nextflow/: Contains all Nextflow-generated files