Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: add find_empty_room step #629

Closed
larsoner opened this issue Oct 12, 2022 · 9 comments · Fixed by #634
Closed

ENH: add find_empty_room step #629

larsoner opened this issue Oct 12, 2022 · 9 comments · Fixed by #634

Comments

@larsoner
Copy link
Member

@agramfort experienced some "caching slowdowns" of unknown origin. Then locally for me when I ran a dataset, I also noticed it taking 5-10 sec per subject during the maxwell_filter step just to say "cached, skipping".

I tracked this down to the filename resolution step of scripts/preprocessing/_01_maxfilter::get_input_fnames_maxwell_filter, specifically this line:

in_files["raw_er"] = ref_bids_path.find_empty_room()

This is because, in order to find the empty room file, mne-bids will at least sometimes resort to reading info from all of the empty room files to find the one with the closest measurement date. (I think ideally this would be stored during the BIDS-ification step, but it's not always done, and we should accommodate these use cases if we can.)

To avoid this, I propose we add a scripts/init/_01_match_empty_room.py file that does this once ahead of time for datasets that need it, and then saves the match in some new file. This step can be cached just like all the others, and should only need to be run once per dataset (unless the raw data actually changes, which fortunately sane caching logic should take care of for us).

The only open question in this implementation is: where can we store the mapping from subject->empty room file? Can we create a new sidecar file somewhere? @hoechenberger

@larsoner
Copy link
Member Author

The only open question in this implementation is: where can we store the mapping from subject->empty room file? Can we create a new sidecar file somewhere? @hoechenberger

... From working a bit on #631 I'm thinking that we should just write a .json for each subject that gives the mapping to the empty-room file, if relevant. Then based on this:

$ git grep find_empty_room
mne_bids_pipeline/scripts/preprocessing/_01_maxfilter.py:            in_files["raw_er"] = ref_bids_path.find_empty_room()
mne_bids_pipeline/scripts/preprocessing/_02_frequency_filter.py:                in_files["raw_er"] = ref_bids_path.find_empty_room()

We can just modify the in_files for these two steps to get these from the JSON file (when relevant).

@agramfort
Copy link
Member

agramfort commented Oct 13, 2022 via email

@hoechenberger
Copy link
Member

Problem is we don't want to modify the input dataset

@agramfort
Copy link
Member

agramfort commented Oct 13, 2022 via email

@larsoner
Copy link
Member Author

larsoner commented Oct 13, 2022

true but at least it can be fast if sidecar is complete

Problem is we don't want to modify the input dataset

And if the sidecar is not complete -- as it seems to be for my dataset, and at least one of yours -- I propose we just store it in a little json file EDIT: in the mne-bids-pipeline derivatives. Then we get good speed in all cases without modifying the root dataset, or requiring people to update their root dataset.

@hoechenberger
Copy link
Member

true but at least it can be fast if sidecar is complete

Message ID: @.***>

Yes but we already are because we're using the sidecar if it's present

Check the use_sidecar_only parameter :)

@larsoner
Copy link
Member Author

larsoner commented Oct 13, 2022

From a brief chat with @agramfort he's okay with adding a step to store the matching in our own file, so I'll make a PR for that. Hopefully this can be done jointly with simplifications from #632 after #631 is merged, since #631 sets up the _io.py that I'll want to use, and #632 would greatly simplify/streamline the logic inside the freq filter and maxfilter functions that I'm going to be messing with anyway.

@hoechenberger
Copy link
Member

Sounds good!

@larsoner
Copy link
Member Author

I decided just to put the #632 logic changes in #633 for simplicity. I'll add the new find_empty_room step once #633 and #631 are in

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants