Update vbn quality metrics tutorial #2528

Merged
@@ -8,13 +8,15 @@
"\n",
"## Tutorial overview\n",
"\n",
"**Note**: We have changed the default behavior of the SDK from the Visual Coding Neuropixels dataset. We now return *all* units by default, without filtering based on waveform `quality` or other metrics. We leave this filtering to the user. Applying these metrics is an important part of any analysis pipeline and we encourage users to use this notebook and the linked resources to get a thorough understanding of what quality metric filters their analyses require.\n",
"\n",
"This Jupyter notebook will provide a detailed explanation of the unit quality metrics included in the Allen Institute Neuropixels Visual Behavior dataset. It's important to pay attention to quality metrics, because failing to apply them correctly could lead to invalid scientific conclusions, or could end up hiding potentially useful data.\n",
"\n",
"To help you avoid these pitfalls, this tutorial will explore how these metrics are calculated, how they can be biased, and how they should be applied to specific use cases. It's important to keep in mind that none of these metrics are perfect, and that the use of unit quality metrics for filtering ephys data is still an evolving area of research. More work is required in order to establish general-purpose best practices and standards in this domain.\n",
"\n",
"This tutorial assumes you've already created a data cache, or are working with the files on AWS (Simple Storage Service (S3) bucket: [visual-behavior-neuropixels-data](https://s3.console.aws.amazon.com/s3/buckets/visual-behavior-neuropixels-data)). If you haven't reached that step yet, we recommend going through the [data access tutorial](./visual_behavior_neuropixels_data_access.ipynb) first.\n",
"\n",
"Functions related to data analysis will be covered in other tutorials. For a full list of available tutorials, see the [SDK documentation](https://allensdk.readthedocs.io/en/latest/visual_behavior_neuropixels.html)."
"Functions related to data analysis will be covered in other tutorials. For a full list of available tutorials, see the [SDK documentation](https://allensdk.readthedocs.io/en/latest/visual_behavior_neuropixels.html).\n"
]
},
{
@@ -56,7 +58,7 @@
"source": [
"## How these metrics were calculated\n",
"\n",
"The Python code used to calculate these metrics from the outputs of Kilosort2 is available in the [ecephys_spike_sorting](https://github.com/AllenInstitute/ecephys_spike_sorting/tree/master/ecephys_spike_sorting/modules/quality_metrics) repository. A number of the metrics are based on the waveform principal components, which are not included in the data release. To recompute these metrics on your own, you'll need access to the raw data, which is available in the [Allen Brain Observatory S3 Bucket on AWS](****NEED LINK*). \n",
"The Python code used to calculate these metrics from the outputs of Kilosort2 is available in the [ecephys_spike_sorting](https://github.com/AllenInstitute/ecephys_spike_sorting/tree/master/ecephys_spike_sorting/modules/quality_metrics) repository. A number of the metrics are based on the waveform principal components, which are not included in the data release.\n",
"\n",
"This code was recently incorporated into the [SpikeMetrics](https://github.com/SpikeInterface/spikemetrics) repository by the SpikeInterface team. It's now available as a PyPi package (`pip install spikemetrics`) if you'd like to try them out on your own data.\n",
"\n",
@@ -275,7 +277,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Before we move on to the next metric, let's add one more feature to these plots. Displaying the metrics separately for different brain regions can be helpful for understanding the variation that results from the physiological features of the area we're recording from. The four main regions that are part of the Neuropixels Visual Coding dataset are cortex, thalamus, hippocampus, and midbrain. We'll use the Allen CCF structure acronyms to find the units that belong to each region."
"Before we move on to the next metric, let's add one more feature to these plots. Displaying the metrics separately for different brain regions can be helpful for understanding the variation that results from the physiological features of the area we're recording from. The four main regions that are part of the Neuropixels Visual Behavior dataset are cortex, thalamus, hippocampus, and midbrain. We'll use the Allen CCF structure acronyms to find the units that belong to each region."
]
},
{
@@ -402,7 +404,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"It's clear that most units have a presence ratio of 0.9 or higher, which means they are present for at least 90% of the recording. Units with lower presence ratio are likely to have drifted out of the recording, or had waveforms that changed so dramatically they were assigend to separate clusters.\n",
"It's clear that most units have a presence ratio of 0.9 or higher, which means they are present for at least 90% of the recording. Units with lower presence ratio are likely to have drifted out of the recording, or had waveforms that changed so dramatically they were assigned to separate clusters.\n",
"\n",
"Calculating the exact fraction of units with presence ratio above 0.9 is easy:"
]