UnitRefine is an open-source machine-learning framework for automated spike-sorting curation in electrophysiological experiments.
It strongly reduces the need for manual curation of spike-sorting results by leveraging supervised classifiers trained on human-labeled data to predict single-unit activity (SUA), multi-unit activity (MUA), and noise in sorted clusters.
UnitRefine is fully integrated into SpikeInterface, enabling users to:
- Apply pre-trained curation models to new recordings
- Train and fine-tune custom models on their own curated experimental data
- Iteratively improve models using active learning
- Share reproducible models via the Hugging Face Hub
UnitRefine is agnostic to probe type, species, brain region, and spike sorter, and it generalizes to previously unseen datasets. It has been validated across high-density probes, Utah arrays, and intracranial human recordings from multiple laboratories and species.
A user-friendly GUI supports end-to-end workflows including curation (using SpikeInterface-GUI for cluster visualization), training, validation, model loading, and retraining.
The GUI also supports active learning by highlighting uncertain clusters, allowing users to iteratively improve model performance through targeted relabeling.
To learn more about UnitRefine and how to use it in your projects, please see our preprint.
If UnitRefine is useful for your research, please consider citing our work.
UnitRefine provides several pre-trained models from different species and experimental setups.
Each model folder includes:
- The trained classifier (`.skops` format)
- Model metadata
- The curated feature matrix used for training
In our preprint we show that UnitRefine reliably identifies human-labeled Single-Unit Activity (SUA) across multiple datasets, probe types, and species.
| Dataset | Species | Probe type | Spike sorter | Pipeline | Output format | Source |
|---|---|---|---|---|---|---|
| Base dataset | Mouse | Neuropixels 1.0 | Kilosort 2.5 | SpikeInterface | Kilosort folders | UnitRefine base dataset |
| IBL dataset | Mouse | Neuropixels 1.0 | IBL sorter (PyKilosort 2.5) | IBL pipeline | SortingAnalyzer objects | International Brain Laboratory |
| Allen dataset | Mouse | Neuropixels 2.0 | Kilosort 4 | Allen ecephys | .zarr files | Allen Institute |
| Mole rat dataset | Naked mole rat | Neuropixels 2.0 | Kilosort 4 | SpikeInterface | SortingAnalyzer objects | Shirdhankar et al., 2025 |
| Monkey dataset | Rhesus macaque | Utah array | Kilosort 4 | Custom | Kilosort folders | Chen et al., 2022 |
| Human dataset | Human | Behnke–Fried electrodes | Combinato | Combinato | Combinato output | Gerken et al., 2025 |
All datasets are also publicly available on figshare.
- Run spike sorting (e.g. Kilosort) and compute metrics with SpikeInterface.
- Apply a pre-trained UnitRefine model or label a subset of clusters manually.
- Train or fine-tune a classifier.
- Automatically curate the full dataset.
- Optionally refine using active learning.
- Install uv (recommended Python package manager). (Note for Windows users: If you have issues installing uv, please check out the FAQ section.)
- Use Git (https://git-scm.com/install) to clone the UnitRefine repository. Then move into the repo folder to install dependencies and run the GUI.

      git clone https://github.com/anoushkajain/UnitRefine.git
      cd UnitRefine

- Install dependencies

      uv sync

We provide detailed Jupyter Notebook tutorials to help you get started with UnitRefine.
Tutorials are available in the repository under UnitRefine/tutorial
The notebooks demonstrate how to:
- Apply pre-trained models to automatically curate spike-sorted datasets
- Train custom classifiers using manually curated labels
- Use pre-computed cluster metrics stored as `.csv` files
- Integrate UnitRefine directly with SpikeInterface `SortingAnalyzer` objects
UnitRefine supports two main workflows:
- Analyzer-based workflow (recommended): Uses SpikeInterface `SortingAnalyzer` objects for metric computation and ensures consistency with SpikeInterface pipelines.
- CSV-based workflow: Uses pre-computed cluster metrics stored as `.csv` files. This enables integration into custom pipelines outside of SpikeInterface.
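
To illustrate what the CSV-based workflow starts from, here is a minimal sketch that reads a metrics table and separates labeled from unlabeled clusters. It uses only the Python standard library and is not UnitRefine's own loader. The column names, the `unit_id` key, and the `LABELS` dictionary are assumptions made for this example; see the tutorial notebooks for the format UnitRefine actually expects.

```python
import csv
import io

# Hypothetical metrics table: one row per cluster, one column per quality metric.
# The column names are illustrative, not the names UnitRefine requires.
METRICS_CSV = """unit_id,snr,isi_violations_ratio,firing_rate
0,8.2,0.01,5.4
1,2.1,0.90,12.0
2,1.3,,0.7
"""

# Hypothetical manual labels for a subset of clusters.
LABELS = {0: "sua", 1: "mua"}


def load_metrics(handle):
    """Return (unit_ids, feature_names, rows); empty cells become None."""
    reader = csv.DictReader(handle)
    feature_names = [name for name in reader.fieldnames if name != "unit_id"]
    unit_ids, rows = [], []
    for record in reader:
        unit_ids.append(int(record["unit_id"]))
        rows.append(
            [float(record[name]) if record[name] else None for name in feature_names]
        )
    return unit_ids, feature_names, rows


unit_ids, feature_names, rows = load_metrics(io.StringIO(METRICS_CSV))

# Labeled clusters form the training set; the rest still need to be curated.
train = [(row, LABELS[uid]) for uid, row in zip(unit_ids, rows) if uid in LABELS]
unlabeled = [uid for uid in unit_ids if uid not in LABELS]

print(feature_names)
print(len(train), unlabeled)
```

Missing metric values are kept as `None` here so that a later step can decide how to impute them.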
For additional background on automated curation within the SpikeInterface ecosystem, see the official SpikeInterface automated curation tutorials.
For transparent and reproducible model interpretation (as described in the UnitRefine paper), we provide a dedicated notebook:
SHAP_plots.ipynb
This notebook demonstrates how to compute SHAP values, evaluate feature importance stability across random seeds, select the best-performing model, and generate confusion matrices.
We provide a UnitRefine GUI that simplifies unit curation, model training, loading, and relabeling.
For detailed instructions and usage examples, please refer to the GUI documentation here.
To run the GUI inside the UnitRefine repo, create a new project:

    uv run unitrefine --project_folder my_new_project

Important: This command must be executed from the root folder of the cloned UnitRefine repository.
This will create a new project folder and launch the UnitRefine GUI in a new window.
Within the GUI, users can:
- Visualize cluster waveforms, amplitudes, correlograms, and quality metrics
- Manually assign or correct cluster labels
- Train new models using curated labels
- Load and apply pre-trained models
- Validate model predictions
- Retrain models based on updated labels
The GUI also supports active learning by highlighting clusters with low prediction confidence, enabling efficient and targeted relabeling to improve model performance.
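
The idea of picking low-confidence clusters for relabeling can be sketched in a few lines. This is an illustration only, not the GUI's implementation. It assumes confidence is measured as the largest predicted class probability; the function name `rank_by_uncertainty` and the example probabilities are hypothetical.

```python
def rank_by_uncertainty(probabilities, top_k=3):
    """Return up to top_k unit ids, least confident prediction first.

    probabilities maps a unit id to its per-class probabilities.
    Confidence is taken as the largest class probability (an assumption).
    """
    confidence = {unit: max(p) for unit, p in probabilities.items()}
    ranked = sorted(confidence, key=confidence.get)
    return ranked[:top_k]


# Hypothetical classifier output: [P(SUA), P(MUA), P(noise)] per cluster.
predicted = {
    10: [0.95, 0.03, 0.02],
    11: [0.40, 0.35, 0.25],
    12: [0.10, 0.55, 0.35],
    13: [0.02, 0.08, 0.90],
}

to_relabel = rank_by_uncertainty(predicted, top_k=2)
print(to_relabel)  # [11, 12]
```

Relabeling the clusters returned here and retraining is one round of the loop. Other uncertainty measures, such as the margin between the two most likely classes, would fit the same pattern.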
UnitRefine requires only a standard computer with enough RAM to support the in-memory operations.
Tested on: Linux • macOS • Windows
Python: 3.11+
- NumPy, Pandas
- scikit-learn
- SpikeInterface + spikeinterface-gui
- PyQt5 (GUI backend)
- Hugging Face Hub (model loading)
- skops (model serialization)
This software is released under the MIT license.
UnitRefine builds heavily on the flexible and powerful SpikeInterface and SpikeInterface-GUI packages. Many thanks to Alessio, Sam, Zack, and Joe for their help and feedback on this project, as well as to the entire SpikeInterface team. We would also like to express our sincere gratitude to the following individuals for their invaluable contributions to this project:
- Code Refactoring and Integration in SpikeInterface: Chris Halcrow, Jake Swann, Robyn Greene, Sangeetha Nandakumar (IBOTS)
- Model Curators: Nilufar Lahiji, Sacha Abou Rachid, Severin Graff, Luca Koenig, Natalia Babushkina, Simon Musall
- Advisors and collaborators: Alessio Buccino, Olivier Winter, Sonja Grün, Matthias Hennig, Simon Musall
We encourage feedback, contributions, and collaboration from the community to improve UnitRefine. Feel free to open issues or submit pull requests to enhance the toolbox further.
1. Issues with installing uv
For Windows users trying to install uv, try running

    pip install uv

If this does not work, please follow the instructions on the uv Windows installation page.
2. Number of labels
You need to provide at least 6 labels for each class you use (SUA, MUA, Noise) to prevent errors during model fitting (e.g. in cross-validation and class balancing). You can also use only SUA and Noise labels to train a binary classifier instead of a 3-class classifier. For a starting model with decent performance, label at least 10% of the data, which should amount to more than 50 clusters in total.
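
The thresholds above can be checked before training with a small helper. This is a sketch, not part of UnitRefine: the function name `check_labels` and the split into errors and warnings are assumptions made for illustration. The per-class minimum is treated as a hard requirement, and the 10% and 50-cluster figures are treated as recommendations.

```python
from collections import Counter

MIN_PER_CLASS = 6     # minimum labels per class (hard requirement from the FAQ)
MIN_FRACTION = 0.10   # recommended labeled fraction for a starting model
MIN_TOTAL = 50        # recommended: more than 50 labeled clusters in total


def check_labels(labels, n_clusters):
    """Return (errors, warnings) for a list of manual labels."""
    errors, warnings = [], []
    counts = Counter(labels)
    if len(counts) < 2:
        errors.append("at least two classes are required")
    for name, count in sorted(counts.items()):
        if count < MIN_PER_CLASS:
            errors.append(
                f"class '{name}' has {count} labels, need at least {MIN_PER_CLASS}"
            )
    if len(labels) < MIN_FRACTION * n_clusters:
        warnings.append("fewer than 10% of clusters are labeled")
    if len(labels) <= MIN_TOTAL:
        warnings.append("50 or fewer clusters are labeled")
    return errors, warnings


# Hypothetical label set: 54 of 400 clusters labeled, but too few MUA examples.
labels = ["sua"] * 30 + ["mua"] * 4 + ["noise"] * 20
errors, warnings = check_labels(labels, n_clusters=400)
print(errors)
print(warnings)
```

In this example the MUA class falls below the minimum, so either more MUA clusters need labels or the MUA labels can be dropped to train a binary SUA/Noise classifier.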
