clarification for derivatives of derivatives #345
Comments
I feel that nesting derivatives allows for a cascade of processing steps where the input to each derivative comes from the sub-### folders of the parent directory. This way the derivative hierarchy can provide some of the provenance for the analysis trajectory. This also allows for shared starting points for multiple analyses (a single preprocessing derivative can be used as the input for multiple subsequent analysis procedures).

We have an example of how we like to work with EEG in this way here: https://github.com/BUCANL/bids-examples/tree/face13_nest/eeg_face13

eeg_face13/sub-###
... contains the root project data files that are the inputs resulting in the derivative files...
eeg_face13/derivatives/BIDS-Lossless-EEG/sub-###
... where the code used to generate these derivative data files as well as the execution logs are located respectively at:
eeg_face13/derivatives/BIDS-Lossless-EEG/code

These derivative files are preprocessed, with annotations used to remove artifacts. Next we want to purge the artifacts and segment the data. Because the segmenting is performed on the derivative data (e.g. eeg_face13/derivatives/BIDS-Lossless-EEG/sub-###) and not the root data (eeg_face13/sub-###), I feel that the segmentation derivative should be nested inside the preprocessed derivative folder, such that the output data are located at:
eeg_face13/derivatives/BIDS-Lossless-EEG/derivatives/BIDS-Seg-Face13-EEGLAB/sub-###

Each of these derivatives (nested or not) can be used independently (e.g. I could simply copy the segmentation derivative if I wanted to try to replicate a result, or I could copy the preprocessed derivative if I wanted to do a new segmentation from the same input data). The full provenance of the resulting data files, however, requires the full derivative hierarchy up to the root of the project (I think that this is ok and good).
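To make the provenance idea concrete, here is a minimal Python sketch that recovers the chain of parent datasets from the path of a nested derivative. It assumes the layout described above, where each derivative lives at <parent dataset>/derivatives/<pipeline name>. The function name provenance_chain is hypothetical and not part of any BIDS tool.

```python
from pathlib import PurePosixPath


def provenance_chain(dataset_root):
    """Return dataset roots from the project root down to the given derivative.

    Assumption: every derivative sits at <parent dataset>/derivatives/<pipeline name>.
    """
    path = PurePosixPath(dataset_root)
    chain = [path]
    # Stepping up two levels from a derivative (past "derivatives" and the
    # pipeline folder) reaches the dataset it was derived from.
    while path.parent.name == "derivatives":
        path = path.parent.parent
        chain.append(path)
    return [str(p) for p in reversed(chain)]


for root in provenance_chain(
    "eeg_face13/derivatives/BIDS-Lossless-EEG/derivatives/BIDS-Seg-Face13-EEGLAB"
):
    print(root)
```

With a flat layout the same call would return only the project root and the derivative itself, so the dependency of the segmentation on the preprocessing would have to be recorded somewhere else.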
What do MRI people do? @yarikoptic @tyarkoni
Keeping the raw data (i.e. not talking about sharing preprocessed data only), would you do derivatives/preprocessed and derivatives/stats, even though stats depends on preprocessed? Nesting seems easier, but the spec doesn't specify this (see posts above).
I think either approach is compliant. Every derivatives dataset must be a fully compliant BIDS dataset, which implies that you can nest derivatives. Note that the spec also supports a sourcedata/ folder, which would be another way to specify where the sources are (e.g., via symlink).
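As a rough illustration of "every derivatives dataset is itself a BIDS dataset", the sketch below walks a dataset and lists its derivative datasets, including nested ones. It assumes that a dataset_description.json file marks the root of each dataset and that derivatives live under a derivatives/ folder. The function name find_derivative_datasets is hypothetical, and this is not how the validator works.

```python
from pathlib import Path


def find_derivative_datasets(dataset_root):
    """Yield every derivative dataset below dataset_root, nested ones included."""
    derivatives_dir = Path(dataset_root) / "derivatives"
    if not derivatives_dir.is_dir():
        return
    for candidate in sorted(derivatives_dir.iterdir()):
        # Assumption: dataset_description.json marks the root of a dataset.
        if (candidate / "dataset_description.json").is_file():
            yield candidate
            # A derivative is a dataset too, so it may hold its own derivatives/.
            yield from find_derivative_datasets(candidate)
```

Calling list(find_derivative_datasets("eeg_face13")) on a local copy would return the flat derivatives first, each followed by whatever is nested inside it.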
OK, so when nesting, would you be OK with
@sappelhoff @effigies can you foresee any issues with the validator using flat derivatives vs. nesting?
The first level under
Note though that the above assumes that the
👍 To what @tyarkoni said, although it is not true that all subdirectories of
But the overall point that each BIDS-Derivatives dataset is a BIDS dataset, and thus may contain
#345 discuss splitting a pipeline into multiple derivatives with/without nesting
OK, thanks for clarifying, guys. I made a PR into the common derivatives branch of the spec.
specify further the pipeline following #345
I've been discussing the issue of creating derivatives from already derived data.
An example of that would be:
From there, say someone else extends the pipeline to make connectivity matrices; these can be stored as
but since it is now a different pipeline and according to BEP003: