
Kilosort data storage and GPU usage advice #389

Closed

saumilpatel opened this issue Apr 29, 2021 · 1 comment · Fixed by #595

Comments

@saumilpatel

Dear Marius,

We, in the Andreas Tolias Lab at Baylor College of Medicine in Houston, Texas, are planning close to 2 hours' worth of Neuropixels recordings in mice. Based on some calculations, this would amount to roughly 170 GB per recording (a back-of-the-envelope check is sketched below). We plan to use Kilosort 2 for spike sorting and have a few questions about handling data of this size:

- Should we store the data in one file or in multiple files?
- How does Kilosort deal with large recordings? Does it analyze them in smaller time bins?
- What GPU would you recommend?
- Can we take advantage of multiple GPUs? We have plenty of GPU servers, so if the sorting can be parallelized, we have the hardware for it.
- What kind of compute machine would you recommend, especially in terms of memory?
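As a rough sanity check on that figure (the probe configuration below is an assumption, not something stated above): a single Neuropixels 1.0 probe records 384 channels at 30 kHz with int16 samples, so two hours comes to about 166 GB:

```python
# Back-of-the-envelope size estimate; all parameters are assumptions
# (single Neuropixels 1.0 probe, AP band only), not values from the thread.
n_channels = 384          # recording channels on a Neuropixels 1.0 probe
sample_rate_hz = 30_000   # AP-band sampling rate
bytes_per_sample = 2      # int16 samples
duration_s = 2 * 60 * 60  # ~2 hours of recording

size_gb = n_channels * sample_rate_hz * bytes_per_sample * duration_s / 1e9
print(f"{size_gb:.0f} GB")  # ~166 GB, consistent with the ~170 GB estimate
```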

Is there anything else you would recommend ?

We plan to record using our LabVIEW software and store the data in one or more binary files, so that as soon as the files are closed, Kilosort can start working on them.

We appreciate any comments or feedback you may have.

Thanks, Best, Saumil

@marius10p
Contributor

Hi there, probably the most important thing would be to use Kilosort 2.5 instead of 2.0, especially for recordings like yours that are longer than one hour. See the recent Neuropixels 2.0 paper for a lot of detail about how 2.5 works.

The wiki has a hardware guide. Kilosort requires a single concatenated binary file, which it then processes into a second temporary, high-pass filtered and whitened binary file. Ideally, both the raw data and that temporary file are on an SSD or a fast network connection. The processing is done in batches, so recording length itself is not a problem, but some quantities accumulate over batches before being written to the results, and in Kilosort2 this sometimes leads to memory problems on machines with less RAM. I think you should be fine with 32 or 64 GB, though I'm not sure. There is no parallelization at the level of individual recordings, because the optimization process is sequential and there is barely enough work to keep one GPU busy.
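Since Kilosort expects a single concatenated binary file, recordings split across multiple files have to be merged before sorting. A minimal sketch of that step, assuming raw int16 binaries with identical channel counts and channel ordering (the paths and file pattern here are hypothetical):

```python
import shutil
from pathlib import Path

# Hypothetical paths; replace with the actual recording outputs. All files
# must share the same channel count, dtype (int16), and channel ordering.
session_files = sorted(Path("/data/session1").glob("rec_*.bin"))
merged_path = Path("/data/session1/merged.bin")

with open(merged_path, "wb") as out:
    for f in session_files:
        with open(f, "rb") as src:
            # Stream each file into the merged binary without loading it in RAM.
            shutil.copyfileobj(src, out, length=64 * 1024 * 1024)
```

If the recording is split across sessions, keeping a note of each file's sample count makes it straightforward to assign the sorted spike times back to their sessions afterwards.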
