-
Notifications
You must be signed in to change notification settings - Fork 92
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[ENH] Add function to enable smart path search #1098 #1103
Conversation
Hello! 👋 Thanks for opening your first pull request here! ❤️ We will try to get back to you soon. 🚴🏽♂️ |
|
||
regexp = ( | ||
leading_path_str + | ||
sub_str + ses_str + task_str + acq_str + run_str + proc_str + | ||
rec_str + space_str + split_str + desc_str + suffix_str + ext_str |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here I fixed a small bug. "space" must be searched before "recording", according to their order in the basename.
Thanks for the contribution! I merged |
Codecov Report
@@ Coverage Diff @@
## main #1103 +/- ##
==========================================
+ Coverage 95.24% 95.28% +0.03%
==========================================
Files 24 24
Lines 3873 3900 +27
==========================================
+ Hits 3689 3716 +27
Misses 184 184
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. |
All green! I don't have time to review, but maybe @sappelhoff, @agramfort, or @adam2392 can find a couple minutes :) |
@moritz-gerster We'll definitely need a changelog entry though :) in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We also need to add the new function to the API docs, and I'd be very happy to see it in action in any one of our existing examples.
Thanks for your contribution, Moritz!
doc/whats_new.rst
Outdated
@@ -44,6 +44,7 @@ Detailed list of changes | |||
- Add :meth:`mne_bids.BIDSPath.find_matching_sidecar` to find the sidecar file associated with a given file path by `Eric Larson`_ (:gh:`1093`) | |||
- When writing data via :func:`~mne_bids.write_raw_bids`, it is now possible to specify a custom mapping of :class:`mne.Annotations` descriptions to event codes via the ``event_id`` parameter. Previously, passing this parameter would always require to also pass ``events``, and using a custom event code mapping for annotations was impossible, by `Richard Höchenberger`_ (:gh:`1084`) | |||
- Improve error message when :obj:`~mne_bids.BIDSPath.fpath` cannot be uniquely resolved by `Eric Larson`_ (:gh:`1097`) | |||
- Add :func:`mne_bids.find_matching_paths` to get all bids paths that match the requirements by :newcontrib:`Moritz Gerster` (:gh:`1103`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this correct @hoechenberger? In the mne_bids contribution guide it is not described how to add lines so I followed the mne contributing guide.
Here I didn't add my name because I guess this works automatically?
The following authors contributed for the first time. Thank you so much! 🤩
* ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess this works automatically
unfortunately not :-)
I followed the mne contributing guide.
it's a bit different here. You just add your name to the list you quoted above, and we don't have the sphinx role ":new_contrib:", so you can just follow the style of previous changelog entries.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nevermind, I checked the old versions of whats_new.rst
and understood that you don't have :newcontrib:
as in mne.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if you want the function to be available as
mne_bids.find_matching_paths
, for example, you'd need to make additions here:Lines 4 to 18 in b97ce0b
from mne_bids import commands from mne_bids.report import make_report from mne_bids.path import (BIDSPath, get_datatypes, get_entity_vals, print_dir_tree, get_entities_from_fname, search_folder_for_text, get_bids_path_from_fname) from mne_bids.read import get_head_mri_trans, read_raw_bids from mne_bids.utils import get_anonymization_daysback from mne_bids.write import (make_dataset_description, write_anat, write_raw_bids, mark_channels, write_meg_calibration, write_meg_crosstalk, get_anat_landmarks, anonymize_dataset) from mne_bids.sidecar_updates import update_sidecar_json, update_anat_landmarks from mne_bids.inspect import inspect_dataset from mne_bids.dig import (template_to_head, convert_montage_to_ras, convert_montage_to_mri) for example look at
BIDSPath
in that file --> if it weren't where it is, you would always need to domne_bids.path.BIDSPath
.And for the API docs, you need to add it here (provided you followed the above):
Lines 19 to 22 in b97ce0b
.. autosummary:: :toctree: generated/ write_raw_bids
Thanks @sappelhoff! Added with 1bc097e.
@sappelhoff, how do I add the function to the API docs? I have never done that before. I thought this happens automatically via the docstring. |
if you want the function to be available as Lines 4 to 18 in b97ce0b
for example look at And for the API docs, you need to add it here (provided you followed the above): Lines 19 to 22 in b97ce0b
|
I have a question. What do you think about extending this proposed function to enable ignoring certain entity values? Suggestion: def find_matching_paths(root, subjects=None, sessions=None, tasks=None,
acquisitions=None, runs=None, processings=None,
recordings=None, spaces=None, splits=None,
descriptions=None, suffixes=None, extensions=None,
datatypes=None, ignore_subjects=None,
ignore_sessions=None, ignore_tasks=None,
ignore_runs=None, ignore_processings=None,
ignore_spaces=None, ignore_acquisitions=None,
ignore_splits=None, ignore_descriptions=None, # ignore_modalities=None, # <- what's this?
ignore_datatypes=None, check=True):
"""Get list of all matching paths for all matching entity values. ... Returns
-------
bids_paths : list of mne_bids.BIDSPath
The matching paths.
"""
if datatypes is None and ignore_datatypes is not None:
datatypes = get_entity_vals(root, 'datatype',
ignore_datatypes=ignore_datatypes)
fpaths = _return_root_paths(root, datatype=datatypes, ignore_json=False)
if subjects is None and ignore_subjects is not None:
subjects = get_entity_vals(root, 'subject',
ignore_subjects=ignore_subjects)
if sessions is None and ignore_sessions is not None:
sessions = get_entity_vals(root, 'session',
ignore_sessions=ignore_sessions)
if tasks is None and ignore_tasks is not None:
tasks = get_entity_vals(root, 'task', ignore_tasks=ignore_tasks)
if runs is None and ignore_runs is not None:
runs = get_entity_vals(root, 'run', ignore_runs=ignore_runs)
if processings is None and ignore_processings is not None:
processings = get_entity_vals(root, 'processing',
ignore_processings=ignore_processings)
if spaces is None and ignore_spaces is not None:
spaces = get_entity_vals(root, 'space', ignore_spaces=ignore_spaces)
if acquisitions is None and ignore_acquisitions is not None:
acquisitions = get_entity_vals(root, 'acquisition',
ignore_acquisitions=ignore_acquisitions)
if splits is None and ignore_splits is not None:
splits = get_entity_vals(root, 'split', ignore_splits=ignore_splits)
if descriptions is None and ignore_descriptions is not None:
descriptions = get_entity_vals(root, 'description',
ignore_descriptions=ignore_descriptions)
fpaths_filtered = _filter_fnames(fpaths,
subject=subjects,
session=sessions,
task=tasks,
acquisition=acquisitions,
run=runs,
processing=processings,
recording=recordings,
space=spaces,
split=splits,
description=descriptions,
suffix=suffixes,
extension=extensions) Reason: In the current implementation it is somewhat tedious to avoid certain entities. For example, if I want to avoid the subjects "emptyroom" and "03" and I want all tasks except for "Rest" and I don't need the ".tsv" files, I have to write subjects = get_entity_vals(root, "subject", ignore_subjects=['03', 'emptyroom'])
tasks = get_entity_vals(root, "task", ignore_tasks='Rest')
bids_paths = find_matching_paths(root, subjects=subjects, tasks=tasks) This becomes really tedious for users who don't know this function (I don't think everyone does), they might have to write bids_paths = find_matching_paths(root, subjects=["01", "02", "04", "05", "06", "07, "08", "09", "10", "11", "emptyroom"],
tasks=["Hold", "Move", "Dance", "Read"]) If the study is ongoing and they collect more subjects they would have to manually edit "subjects" many times. With the new function this would become bids_paths = find_matching_paths(root, ignore_subjects=['03', 'emptyroom'], ignore_tasks='Rest') What do you think? I haven't tested the code but I think it would work. If both "subjects" and "ignore_subjects" is provided, "ignore_subjects" would be ignored. |
mne_bids/path.py
Outdated
@@ -1951,8 +1951,8 @@ def _filter_fnames(fnames, *, subject=None, session=None, task=None, | |||
if split else r'(|_split-([^_]+))') | |||
desc_str = (r'_desc-(' + '|'.join(description) + ')' | |||
if description else r'(|_desc-([^_]+))') | |||
suffix_str = (r'_(' + '|'.join(suffixe) + ')' if suffix | |||
else (r'_(' + '|'.join(ALLOWED_FILENAME_SUFFIX) + ')')) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I noticed that the function _filter_fnames()
only returns paths with valid BIDS suffixes even if the two mother functions find_matching_paths()
and BIDSPath.match()
have the keyword argument check=False
.
In theory, one could add the argument check=False
to _filter_fnames(check=False)
and then code
if suffix:
suffix_str = r'_(' + '|'.join(suffix) + ')'
elif suffix is None and check:
# only return files with suffixes that are valid in BIDS
suffix_str = (r'_(' + '|'.join(ALLOWED_FILENAME_SUFFIX) + ')'))
elif suffix is None and not check:
suffix_str = r'_([^_]+)'
However, this seems overkill to me since the BIDS conformity is checked afterwards using _fnames_to_bidspaths(check=check)
.
A good reason to advertise that function in an example with the new Personally, I find your example subjects = get_entity_vals(root, "subject", ignore_subjects=['03', 'emptyroom'])
tasks = get_entity_vals(root, "task", ignore_tasks='Rest')
bids_paths = find_matching_paths(root, subjects=subjects, tasks=tasks) straight forward. On the other hand having "ignore" keywords in the function signature makes it veeeery long and messy 🤔 |
Ok, I agree :) Let's keep it as it is |
I have looked through the examples and I didn't find a good place to insert it. Because if I exchange a BIDSPath().match() example with this function, then .match() is no longer documented. However, if I add the function and show that it yields the same, the doc becomes longer and contains more than just the most important code snippets. But I guess this would be ok? I'll have a look again. Let me know if you find a good place to use it. |
Ok, I think I found a good example. I'll push it soon. |
# Note that this is the same as running: | ||
session = 'off' | ||
bids_path = BIDSPath(root=bids_root, session=session, datatype=datatype) | ||
print(bids_path.match(ignore_json=True)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we print the same list twice to show that the result is the same or should we comment out
# print(bids_path.match(ignore_json=True))
and just say that this would yield the same?
I went for printing it twice to make it as obvious as possible.
I'm done working on it @sappelhoff :) I added two examples. |
Co-authored-by: Richard Höchenberger <richard.hoechenberger@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you add yourself to this file, please? Just below Denis
Line 150 in b97ce0b
- given-names: Denis |
thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am happy with this, great work @moritz-gerster.
+1 to merge once CIs are green
🎉 Congrats on merging your first pull request! 🥳 Looking forward to seeing more from you in the future! 💪 |
Thanks @moritz-gerster! |
Thank you two for all your effort!! |
PR Description
Fixes #1098. The added function
find_matching_paths
returns all foundbids_paths
similar toBIDSPath().match()
. However, in this function, it is possible to filter for several entities which can be passed as lists. For examplebids_paths = find_matching_paths(root, subjects=["01", "02", "03"], sessions="MedOn", tasks=["Rest", "Move"])
Since this function is very similar to
BIDSPath().match()
I added the private functions_fnames_to_bidspaths
and_return_root_paths
and refactored thematch()
method to avoid code repetition. Also note, that the private function_filter_fnames
was extended to enable passing lists of strings as input arguments.Merge checklist
Maintainer, please confirm the following before merging.
If applicable: