Commit 4c9d32d: bombcell returns dataframe and docs
1 parent 42d9945 commit 4c9d32d

13 files changed: +420 -57 lines changed

.gitignore

Lines changed: 1 addition & 0 deletions

@@ -141,6 +141,7 @@ examples/modules_gallery/**/*.zarr

 # Files and folders generated during tests
 test_folder/
+*trained_pipeline/

 # Mac OS
 .DS_Store

doc/how_to/auto_label_units.rst

Lines changed: 272 additions & 0 deletions

@@ -0,0 +1,272 @@

Automatic labeling of units after spike sorting
===============================================

This example shows how to automatically label units after spike sorting,
using three different approaches:

1. Simple filtering based on quality metrics
2. Bombcell: a heuristic approach that labels units based on quality and
   template metrics [Fabre]_
3. UnitRefine: pre-trained classifiers that label units as noise or
   SUA/MUA [Jain]_

.. code:: ipython3

    import numpy as np

    import spikeinterface as si
    import spikeinterface.curation as sc
    import spikeinterface.widgets as sw

    from pprint import pprint

.. code:: ipython3

    %matplotlib inline

.. code:: ipython3

    analyzer_path = "/ssd980/working/analyzer_np2_single_shank.zarr"

.. code:: ipython3

    sorting_analyzer = si.load(analyzer_path)

.. code:: ipython3

    sorting_analyzer

.. parsed-literal::

    SortingAnalyzer: 96 channels - 142 units - 1 segments - zarr - sparse - has recording
    Loaded 14 extensions: amplitude_scalings, correlograms, isi_histograms, noise_levels, principal_components, quality_metrics, random_spikes, spike_amplitudes, spike_locations, template_metrics, template_similarity, templates, unit_locations, waveforms

The ``SortingAnalyzer`` includes several metrics that we can use for
curation:

.. code:: ipython3

    sorting_analyzer.get_metrics_extension_data().columns

.. parsed-literal::

    Index(['amplitude_cutoff', 'amplitude_cv_median', 'amplitude_cv_range',
           'amplitude_median', 'd_prime', 'drift_mad', 'drift_ptp', 'drift_std',
           'firing_range', 'firing_rate', 'isi_violations_count',
           'isi_violations_ratio', 'isolation_distance', 'l_ratio', 'nn_hit_rate',
           'nn_miss_rate', 'noise_cutoff', 'noise_ratio', 'num_spikes',
           'presence_ratio', 'rp_contamination', 'rp_violations', 'sd_ratio',
           'silhouette', 'sliding_rp_violation', 'snr', 'sync_spike_2',
           'sync_spike_4', 'sync_spike_8', 'exp_decay', 'half_width',
           'main_peak_to_trough_ratio', 'main_to_next_extremum_duration',
           'num_negative_peaks', 'num_positive_peaks',
           'peak_after_to_trough_ratio', 'peak_after_width',
           'peak_before_to_peak_after_ratio', 'peak_before_to_trough_ratio',
           'peak_before_width', 'peak_to_trough_duration', 'recovery_slope',
           'repolarization_slope', 'spread', 'trough_width', 'velocity_above',
           'velocity_below', 'waveform_baseline_flatness'],
          dtype='object')

1. Quality-metrics-based curation
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

A simple solution is to filter units based on quality metrics. To do so,
we can use the ``spikeinterface.curation.qualitymetrics_label_units``
function and provide a set of thresholds.

.. code:: ipython3

    qm_thresholds = {
        "snr": {"min": 5},
        "firing_rate": {"min": 0.1, "max": 200},
        "rp_contamination": {"max": 0.5}
    }
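Conceptually, each entry in ``qm_thresholds`` defines a min/max window for one metric, and a unit is kept only if every metric falls inside its window. Here is a minimal, self-contained pandas sketch of that logic with toy values; it illustrates the idea, and is not the actual ``qualitymetrics_label_units`` implementation:

```python
import pandas as pd

# Toy metrics table: three units with made-up values
metrics = pd.DataFrame(
    {"snr": [8.0, 2.0, 12.0], "firing_rate": [1.5, 0.05, 300.0]},
    index=["unit0", "unit1", "unit2"],
)

thresholds = {"snr": {"min": 5}, "firing_rate": {"min": 0.1, "max": 200}}

# A unit passes only if every metric respects its min/max bounds
passes = pd.Series(True, index=metrics.index)
for metric, bounds in thresholds.items():
    if bounds.get("min") is not None:
        passes &= metrics[metric] >= bounds["min"]
    if bounds.get("max") is not None:
        passes &= metrics[metric] <= bounds["max"]

labels = passes.map({True: "good", False: "noise"})
print(labels.to_dict())  # {'unit0': 'good', 'unit1': 'noise', 'unit2': 'noise'}
```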
.. code:: ipython3

    qm_labels = sc.qualitymetrics_label_units(sorting_analyzer, thresholds=qm_thresholds)

.. code:: ipython3

    qm_labels["label"].value_counts()

.. parsed-literal::

    label
    noise    115
    good      27
    Name: count, dtype: int64

.. code:: ipython3

    w = sw.plot_unit_labels(sorting_analyzer, qm_labels["label"], ylims=(-300, 100))
    w.figure.suptitle("Quality-metrics labeling")

.. image:: auto_label_units_files/auto_label_units_12_1.png

Only 27 units are labeled as *good*, and we can see from the plots that
some "noisy" waveforms are not properly flagged and some visually good
waveforms are labeled as noise. Let's take a look at more powerful
methods.
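Before moving on, it can help to know which criterion did most of the rejecting. A self-contained sketch with toy metric values (with real data you would start from ``sorting_analyzer.get_metrics_extension_data()``):

```python
import pandas as pd

# Toy metrics for four units (illustrative values only)
metrics = pd.DataFrame({
    "snr": [8.0, 3.0, 2.0, 10.0],
    "firing_rate": [1.0, 0.05, 2.0, 500.0],
    "rp_contamination": [0.01, 0.02, 0.7, 0.03],
})

qm_thresholds = {
    "snr": {"min": 5},
    "firing_rate": {"min": 0.1, "max": 200},
    "rp_contamination": {"max": 0.5},
}

# Count, per metric, how many units fall outside the allowed window
fail_counts = {}
for metric, bounds in qm_thresholds.items():
    fail = pd.Series(False, index=metrics.index)
    if bounds.get("min") is not None:
        fail |= metrics[metric] < bounds["min"]
    if bounds.get("max") is not None:
        fail |= metrics[metric] > bounds["max"]
    fail_counts[metric] = int(fail.sum())

print(fail_counts)  # {'snr': 2, 'firing_rate': 2, 'rp_contamination': 1}
```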
2. Bombcell
~~~~~~~~~~~

**Bombcell** ([Fabre]_) is another threshold-based method that also uses
quality metrics and template metrics, but in a much more refined way! It
can label units as ``noise``, ``mua``, and ``good``, and it can further
detect ``non-soma`` units. It comes with default thresholds, but
user-defined thresholds can be provided as a dictionary or a JSON file.

.. code:: ipython3

    bombcell_default_thresholds = sc.bombcell_get_default_thresholds()
    pprint(bombcell_default_thresholds)

.. parsed-literal::

    {'amplitude_cutoff': {'max': 0.2, 'min': None},
     'amplitude_median': {'max': None, 'min': 40},
     'drift_ptp': {'max': 100, 'min': None},
     'exp_decay': {'max': 0.1, 'min': 0.01},
     'main_peak_to_trough_ratio': {'max': 0.8, 'min': None},
     'num_negative_peaks': {'max': 1, 'min': None},
     'num_positive_peaks': {'max': 2, 'min': None},
     'num_spikes': {'max': None, 'min': 300},
     'peak_after_to_trough_ratio': {'max': 0.8, 'min': None},
     'peak_before_to_peak_after_ratio': {'max': 3, 'min': None},
     'peak_before_to_trough_ratio': {'max': 3, 'min': None},
     'peak_before_width': {'max': None, 'min': 0.00015},
     'peak_to_trough_duration': {'max': 0.00115, 'min': 0.0001},
     'presence_ratio': {'max': None, 'min': 0.7},
     'rp_contamination': {'max': 0.1, 'min': None},
     'snr_baseline': {'max': None, 'min': 5},
     'trough_width': {'max': None, 'min': 0.0002},
     'waveform_baseline_flatness': {'max': 0.5, 'min': None}}
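Because the thresholds are a plain dictionary, you can tweak a few entries and save them to a JSON file for reuse; ``None`` bounds survive the round-trip as JSON ``null``. The file name below is just an example:

```python
import json

# Tweak a couple of entries (values are illustrative, not recommendations)
my_thresholds = {
    "num_spikes": {"max": None, "min": 500},         # stricter than the default 300
    "rp_contamination": {"max": 0.05, "min": None},  # stricter than the default 0.1
}

# Save to a JSON file ...
with open("my_bombcell_thresholds.json", "w") as f:
    json.dump(my_thresholds, f, indent=2)

# ... and load it back later
with open("my_bombcell_thresholds.json") as f:
    loaded = json.load(f)

print(loaded == my_thresholds)  # True: None round-trips as JSON null
```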
.. code:: ipython3

    bombcell_labels = sc.bombcell_label_units(sorting_analyzer, thresholds=bombcell_default_thresholds)

.. code:: ipython3

    bombcell_labels["label"].value_counts()

.. parsed-literal::

    label
    good        58
    noise       50
    mua         33
    non_soma     1
    Name: count, dtype: int64

.. code:: ipython3

    w = sw.plot_unit_labels(sorting_analyzer, bombcell_labels["label"], ylims=(-300, 100))
    w.figure.suptitle("Bombcell labeling")

.. image:: auto_label_units_files/auto_label_units_18_1.png

3. UnitRefine
~~~~~~~~~~~~~

**UnitRefine** ([Jain]_) also uses quality and template metrics, but in
a different way: it uses pre-trained classifiers trained on
hand-curated data. By default, the classification is performed in two
steps: first a *noise*/*neural* classifier is applied, followed by a
*sua*/*mua* classifier. Several models are available on the
`SpikeInterface HuggingFace page <https://huggingface.co/SpikeInterface>`__.

.. code:: ipython3

    unitrefine_labels = sc.unitrefine_label_units(
        sorting_analyzer,
        noise_neural_classifier="SpikeInterface/UnitRefine_noise_neural_classifier",
        sua_mua_classifier="SpikeInterface/UnitRefine_sua_mua_classifier",
    )

.. code:: ipython3

    unitrefine_labels["label"].value_counts()

.. parsed-literal::

    label
    sua      62
    noise    47
    mua      33
    Name: count, dtype: int64

.. code:: ipython3

    w = sw.plot_unit_labels(sorting_analyzer, unitrefine_labels["label"], ylims=(-300, 100))
    w.figure.suptitle("UnitRefine labeling")

.. image:: auto_label_units_files/auto_label_units_22_1.png
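Because each labeling function returns a DataFrame with a ``label`` column, the outputs of two methods are easy to cross-tabulate to see where they disagree. A toy sketch with made-up labels (with real data you would pass e.g. ``bombcell_labels["label"]`` and ``unitrefine_labels["label"]``):

```python
import pandas as pd

# Made-up labels from two methods for the same five units
labels_a = pd.Series(["good", "noise", "mua", "good", "noise"], name="method_a")
labels_b = pd.Series(["sua", "noise", "mua", "noise", "noise"], name="method_b")

# Contingency table of method_a labels vs method_b labels
agreement = pd.crosstab(labels_a, labels_b)
print(agreement)

# Units on which the two methods disagree about the noise call
disagree = labels_a[(labels_a == "noise") != (labels_b == "noise")]
print(disagree.index.tolist())  # [3]
```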
**NOTE:** If you want to train your own models, see the `UnitRefine
repo <https://github.com/anoushkajain/UnitRefine>`__ for instructions!

This "How To" demonstrated how to automatically label units after spike
sorting with different strategies. We recommend running **Bombcell** and
**UnitRefine** as part of your pipeline. These methods will facilitate
further curation and make downstream analysis cleaner.
To remove units from your ``SortingAnalyzer``, you can simply use the
``select_units`` function:

.. code:: ipython3

    non_noisy_units = bombcell_labels["label"] != "noise"
    sorting_analyzer_clean = sorting_analyzer.select_units(sorting_analyzer.unit_ids[non_noisy_units])

.. code:: ipython3

    sorting_analyzer_clean

.. parsed-literal::

    SortingAnalyzer: 96 channels - 92 units - 1 segments - memory - sparse - has recording
    Loaded 14 extensions: random_spikes, waveforms, templates, amplitude_scalings, correlograms, isi_histograms, noise_levels, principal_components, spike_locations, spike_amplitudes, quality_metrics, template_metrics, template_similarity, unit_locations

doc/how_to/index.rst

Lines changed: 1 addition & 0 deletions

@@ -21,6 +21,7 @@ Guides on how to solve specific, short problems in SpikeInterface. Learn how to.
 load_matlab_data
 load_your_data_into_sorting
 benchmark_with_hybrid_recordings
+auto_label_units
 auto_curation_training
 auto_curation_prediction
 import_kilosort_data
