Output filtered .dat for phy #223

Closed
biomath opened this issue Aug 18, 2020 · 2 comments · Fixed by #595

Comments


biomath commented Aug 18, 2020

Hi there,
I apologize if this has been asked before (I could not find it searching the Issues).
I'm trying to output the pre-processed (band-pass and CAR filtered) recordings out of the preprocessDataSub.m code so that they can be viewed in phy during single unit curation.

I've narrowed the problem down to the offset buffers before filtering (Screenshot 1 below; red marking). Below I show cases of a single run of Kilosort and post-hoc filtered data extractions. Meaning, for debugging purposes, I ran Kilosort on these data only once (with the offset buffer code), and created a separate code to output the filtered data. To read it in phy, I simply changed params.py to read the new files.
Screenshot 1:
image

When I comment out the buffer code (red markings) before filtering, I can output a filtered file (with filtering artifacts, obviously) that aligns perfectly with the timestamps resulting from the original Kilosort run (screenshot 2 below). The waveforms are the same as when reading the unfiltered data in phy.
Screenshot 2:
image

Simply re-adding the offset buffer code before the filtering completely wipes out the spikes picked out by the previous run of Kilosort (screenshot 3 below). The raw recording shown in the TraceView also seems to be off, as if the offset buffers were not removed correctly before writing to file.
Screenshot 3:
image

Let me know if you need any more info. Any input on how to solve this would be greatly appreciated.

PS: Kilosort is fantastic and I very much appreciate all your work.

Thank you!
Matheus Macedo-Lima

@biomath biomath changed the title Output filtered-whitened .dat for phy Output filtered .dat for phy Aug 18, 2020

biomath commented Aug 20, 2020

I found a provisional fix for this. I'm not 100% sure about any of this, so please correct me if I am wrong or don't understand this thoroughly.

I noticed that the original offset ( offset = max(0, ops.twind + 2*NchanTOT*((NT - ops.ntbuff) * (ibatch-1) - 2*ops.ntbuff)); ) does not sit at a fixed distance from the window [(ibatch-1)*NT; ibatch*NT], but shifts with every iteration. For example, when ibatch = 2, the start of the offset buffer relative to (ibatch-1)*NT is: (ibatch-1)*NT - offset/(2*NchanTOT) = 3*ops.ntbuff. Then when ibatch = 3, (ibatch-1)*NT - offset/(2*NchanTOT) = 4*ops.ntbuff. And so forth. So (taking ops.twind = 0) the distance between the start of the offset and (ibatch-1)*NT is (ibatch+1)*ops.ntbuff, which grows linearly, by ops.ntbuff with each batch.
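To make the arithmetic above easy to check, here is a minimal Python sketch that evaluates the quoted offset formula and reports how far the read start sits before (ibatch-1)*NT. The function names are hypothetical, and the values NT = 65600, ntbuff = 64, NchanTOT = 385 and ops.twind = 0 are illustrative assumptions, not values taken from this thread.

```python
def original_offset_bytes(ibatch, NT, ntbuff, nchan_tot, twind=0):
    """Byte offset from the formula quoted above (1-based ibatch, int16 = 2 bytes)."""
    return max(0, twind + 2 * nchan_tot * ((NT - ntbuff) * (ibatch - 1) - 2 * ntbuff))


def gap_in_samples(ibatch, NT, ntbuff, nchan_tot):
    """Samples between the read start and the nominal batch start (ibatch-1)*NT."""
    start_sample = original_offset_bytes(ibatch, NT, ntbuff, nchan_tot) // (2 * nchan_tot)
    return (ibatch - 1) * NT - start_sample


# Illustrative (assumed) parameter values.
NT, ntbuff, nchan_tot = 65600, 64, 385

# Express each gap in units of ntbuff for batches 2..5.
gaps = [gap_in_samples(i, NT, ntbuff, nchan_tot) // ntbuff for i in range(2, 6)]
print(gaps)  # [3, 4, 5, 6]
```

The gap increases by exactly one ntbuff per batch, i.e. it equals (ibatch+1)*ntbuff.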

If I replace the original offset with offset = max(0, ops.twind + 2*NchanTOT*(NT*(ibatch-1) - 2*ops.ntbuff));, I get a consistent offset that is always 2*ops.ntbuff before (ibatch-1)*NT. With this, the buffer consistently spans 2*ops.ntbuff samples before (ibatch-1)*NT and 2*ops.ntbuff samples after ibatch*NT in every iteration (with NTbuff = NT + 4*ops.ntbuff). Doing this, I also need to replace the original ioffset with ioffset = 2*ops.ntbuff;
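The same kind of check can be applied to the replacement formula. This Python sketch (hypothetical function name, illustrative values NT = 65600 and ntbuff = 64, ops.twind assumed 0) works in samples rather than bytes and computes the padding on each side of [(ibatch-1)*NT, ibatch*NT) for a read of NTbuff = NT + 4*ntbuff samples.

```python
def proposed_window(ibatch, NT, ntbuff):
    """(start, stop) sample range read for a 1-based batch under the replacement offset."""
    start = max(0, NT * (ibatch - 1) - 2 * ntbuff)
    stop = start + NT + 4 * ntbuff  # NTbuff samples are read
    return start, stop


# Illustrative (assumed) parameter values.
NT, ntbuff = 65600, 64

# Padding before (ibatch-1)*NT and after ibatch*NT, for batches 2..4.
pads = []
for ibatch in range(2, 5):
    start, stop = proposed_window(ibatch, NT, ntbuff)
    pads.append(((ibatch - 1) * NT - start, stop - ibatch * NT))
print(pads)  # [(128, 128), (128, 128), (128, 128)]
```

From the second batch on, both pads stay at 2*ntbuff. The first batch is clamped to start at 0 and so has no leading pad.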

The filtered file resulting from this code aligns perfectly with the original Kilosort run mentioned in my original post (Screenshot below).
image

That said,
Is the original behavior of offset expected? I noticed it is used in many other functions throughout Kilosort, so if I want to replace it in preprocessDataSub.m, I also have to replace it in many other modules. If the original behavior is expected, there is something I fundamentally don't understand about it, and I'd appreciate a hint on what it's supposed to do.

Thank you!

@marius10p
Contributor

You understand correctly: the filtered binary file is not meant to contain a continuous range of time samples. It is instead a concatenation of batches, each of which has its own buffers on the ends.
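A rough Python sketch of why such a batch-concatenated file drifts away from continuous recording time. It makes two assumptions that this thread does not confirm in detail: each batch is stored as NT samples, and successive batches start NT - ntbuff samples apart in the raw recording (the stride implied by the offset formula quoted earlier). Constant buffer offsets are ignored, and the function name and parameter values are hypothetical.

```python
def file_vs_recording_drift(ibatch, NT, ntbuff):
    """Samples by which a batch's position in the file leads its position in the recording."""
    file_start = (ibatch - 1) * NT            # batch start in the concatenated file (assumed NT per batch)
    rec_start = (ibatch - 1) * (NT - ntbuff)  # batch start in the raw recording (assumed stride)
    return file_start - rec_start


# Illustrative (assumed) parameter values.
NT, ntbuff = 65600, 64

drift = [file_vs_recording_drift(i, NT, ntbuff) for i in range(1, 5)]
print(drift)  # [0, 64, 128, 192]
```

Under these assumptions the mismatch grows by ntbuff samples per batch, so spike times referenced to the continuous recording would not line up with a file written this way.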
