Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Various proposed changes to diffusion derivatives #6

Open
wants to merge 11 commits into
base: enh/derivatives
Choose a base branch
from
Open

Various proposed changes to diffusion derivatives #6

wants to merge 11 commits into from

Conversation

Lestropie
Copy link

Hey @chrisfilo,

Sorry it's been so long since I've had a proper go at this. Since you indicated that the derivatives are approaching merge, I made the time to have a go at it.

I'll give as much detail as I can here, and leave it up to you as to whether you want me to split things out into separate commits / revert certain changes / incorporate suggestions I've made in the text below but not made in the code / anything else. It's also possible that some things have been discussed & resolved and I'm now re-raising them naively. Feel free to tag anyone else you think should have a look.

  1. Various formatting fixes / changes. Can discuss individual parts if need be.

  2. Eddy current correction can be performed at different orders, so a boolean value may be too restrictive.

  3. There are a wide range of techniques that can fall under the banner of "intensity normalisation" in DWI; indeed @draffelt tried to write a manuscript on it but it because too large & inconclusive. So again, a boolean may be too restrictive.

  4. "Nonlinear corrections" is too ambiguous. I'm guessing this is supposed to apply to gradient non-linearities, but it could equally well apply to EPI field inhomogeneity corrections, which are also "non-linear". The latter of these appeared to be absent entirely from the document, so I've added it in explicitly. Also, if I'm correct here, there's no need to have both a boolean to indicate that non-linearity correction was performed and a string to indicate what types of correction were performed.

  5. Requiring that all parameters with respect to any particular diffusion model be entirely captured within a single NIfTI image might be too restrictive and/or difficult to interpret. As an example: Consider NODDI, where the current proposal is to have a 4D image with 3 volumes, in the order ICVF, OD, ISOVF. The order is arbitrary, and a user not too familiar with the model may not be able to interpret from the images alone what that order is. While this could be read from the specification, it would perhaps be more clear to instead have 3 x 3D NIFTIs, each with the additional filename keyvalue "_parameter-<parameter>" (or alternative) to identify which model parameter each image stores. Another example would be the AWF volume in WMTI, which is a fundamentally different parameter to the others. Indeed for all four of these parameters, they were re-listed in the "Model-derived maps" section.

    As models become increasingly complex & incorporate more multi-modal / multi-contrast data this issue may exacerbate over time.

    Only a single JSON would be required; I don't know whether or not this would make this particular proposition too incompatible with BIDS "spirit" of having sidecar JSONs with identical names to NIfTIs?

  6. In some cases, ODFs are defined based on a dense set of pre-defined directions rather than a spherical harmonic basis (in MRtrix3 we refer to this as a "dixel" representation, as opposed to "fixels" which are a sparse set of directions where each corresponds to an estimated discrete fibre bundle orientation). Here the best format is to store in a 4D image a list of scalar amplitudes per voxel, and store the pre-defined set of unit directions within the JSON file. We already support this in MRtrix3, including for visualisation. Some diffusion models do yield their outputs in this format (e.g. damped Richardson-Lucy), and so such would need to be described in the "Diffusion Models" section here.

  7. Having the diffusion model label in the JSON seems to be redundant with respect to the file name. All good if it's intentional, just thought I'd make the observation.

  8. It's not clear to me what was being indicated in the description of the JSON entry for the response function in CSD. What I've replaced it with here is how we represent multi-shell response functions in MRtrix3; if this fundamentally conflicts with other software / what was intended & that had strong justifications behind it, we'll need to discuss this more specifically.

  9. TDI is not derived from the diffusion model, but from a tractogram reconstruction, so I wouldn't group it with "model-derived maps".

  10. "AFD" is somewhat context-dependent, there's a couple of different interpretations:

    • In the original, most basic case, this is literally the same as the amplitude of the (white matter) ODF in any given unit direction, and so therefore is not a "derived parameter map". It's also not scalar (unless one wants "AFD_TOTAL", which is just the l=0 term of the SH ODF);

    • In the newer, more complex case, "AFD" is represented as the fibre density attributed to a particular fixel, which opens up a can of worms.

  11. The description of "fixel" here - "Different voxels can have different numbers of fixels, and therefore a sparse file format is required" - fundamentally conflicts with the higher-level "Vector-valued maps" description in which it appears. This section refers only to XxYxZx(3N) images, for which the storage is not sparse (same amount of memory required for a voxel with 1 fixel as for another with 3). In MRtrix3 we have a specific format for handling fixel data that is sparse, but that requires its own extensive description, and couldn't be slipped into the "vector-valued maps" section as it is currently described (I believe incorporating the fixel format into BIDS derivatives was raised but it's being left out due to its NIfTI-2 dependency).

    As far as the rest of the description, each of the XYZ triplets in such a dataset is a fixel, so "fixels" don't belong as a sub-category of such.

  12. An ODF is never stored in the XxYxZx3N format so shouldn't be listed here.

  13. The issue of axis convention of vector-valued maps has not been addressed. In MRtrix we specify vector orientations with respect to the scanner-space / real-space XYZ axes, other packages define them with respect to the IJK image axes; further, due to the legacy of the old Analyse format, it may be possible for the left-right direction of vectors to be flipped to conform to a left-handed / neurological system even if the reference image data are in a right-handed / radiological system, similarly to how the DWI bvecs are stored; someone from FMRIB may need to confirm this though.

    This would at least need some kind of key in the JSON to indicate what convention is used.

  14. "Tractography maps contain visitation counts" is attributing a very specific type of data to a very general term. Although some softwares do natively output their tractography results in this format, there's a huge raft of other 3D maps that could be generated from tractography data where the underlying voxel quantities are something other than streamlines counts; indeed we published precisely this as "Track-Weighted Imaging (TWI)".

  15. Currently only two tractography 'classes' are listed: "local" and "global". I would advocate that "front evolution" and "geodesic" are probably their own classes also.

  16. The possibilities for streamlines seeding is wider than what is currently listed ("seedsPerVoxel", "globalSeedNumber", "totalStreamlinesNumber"):

    • Could be surface-based or voxel-based;

    • Could be a fixed number of (seeds, generated streamlines, or output streamlines) per (vertex, or voxel), or it could be a fixed number of (seeds, generated streamlines, or output streamlines) for the whole reconstruction;

    • Even more advanced approaches, e.g. "Dynamic seeding".

  17. What is the intended discrimination between FACT and DET? I see the former term being used in different ways all over the place without any real consensus on what details it specifically refers to.

  18. Sherbondy 2009 reference isn't really a tractography method in its own right; it's more of a tractography post-processing method.

  19. If the diffmodel JSON splits model parameters into a "Parameters" key hierarchy, then the same should probably be done for tractography.


OK I think that's enough for now 😅

@chrisgorgo
Copy link
Owner

@francopestilli @oesteban could you have a look and let me know if this can be merged as is or requires changes?

@Lestropie could you rebase or merge chrisfilo:enh/derivatives to resolve conflicts?

@Lestropie
Copy link
Author

Manual merge da97988 required to resolve conflicts between 12ff0a6 and ea2bc96.

Copy link

@jchoude jchoude left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @Lestropie I went through the proposed changes and had a few comments and answers to some of your points from the initial long list :)

I think your changes are helpful.

I'll comment on the rest of the list (if applicable) on Friday.

Thanks!

src/05-derivatives/05-diffusion-derivatives.md Outdated Show resolved Hide resolved
@@ -54,7 +54,7 @@ The following is a general example of naming convention:
<pipeline_name>/
sub-<participant_label>/
dwi/
<source_keywords>[_space-<space>][_desc-<label>]_model-<label>_diffmodel.nii[.gz]
<source_keywords>[_space-<space>][_desc-<label>]_model-<label>[_parameter-<parameter>]_diffmodel.nii[.gz]
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure about the addition of the parameter keyword here. I re-read your PR comment, and I can't seem to find if there was an explanation of the intended use of this. Was the goal to disembiguate between 2 instances of the same model using different parameters, such as the SH order?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I concur. What would we cover with the 'parameter'?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to be as brief as I could initially due to the scope of the PR, but it can come off as cryptic when I try to explain the general case. I'll use the example of NODDI.


With the current spec, one would have:

<pipeline_name>/
    sub-<participant_label>/
        dwi/
            sub-<participant_label>_model-NODDI_diffmodel.nii[.gz]

This would be a 4D image, with 3 volumes. The user would need to consult the specifications to discover that these three volumes correspond to ICVF, OD, ISOVF in that order.


What I propose would be more intuitive is:

<pipeline_name>/
    sub-<participant_label>/
        dwi/
            sub-<participant_label>_model-NODDI_parameter-ICVF_diffmodel.nii[.gz]
            sub-<participant_label>_model-NODDI_parameter-OD_diffmodel.nii[.gz]
            sub-<participant_label>_model-NODDI_parameter-ISOVF_diffmodel.nii[.gz]

Each of these files would be a 3D image.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes sorry I wanted to edit to say I understood later on. Thanks for claryfying for the others as well!

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good suggestion. Thanks

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've given this a go, and modified NODDI and fwDTI accordingly; see what you think. However the AWF parameter in the WMTI model is glaring at me; it's quite clumsy being concatenated onto the end of all the tensor / kurtosis coefficients.

- `Tissue` : string
- `SphericalHarmonicDegree` : value
- `ResponseFunctionDegree` : value
- `ResponseFunction` : values
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Related to your question in your main PR comment: the previously described format is the one used in tools such as Dipy, where the first 3 elements are the eigenvalues of a tensor that would represent a single fiber population, and the last element is the mean b=0 value in the voxels where the RF was estimated (for scaling).

I'm not sure what to do here: on the one hand, Mrtrix' approach (especially for MSMT CSD) makes sense. On the other hand, I don't know if other tools will want to convert. I think the most inclusive approach would be to support both formats, depending on the originating tool, but that might complexify the spec.

Maybe a better way to do it would be to instead save the RF(s) in files, and just specify the representation type in the .json file. That way, Mrtrix-style and Dipy-style could coexist.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder whether this occasion could be used to bring consensus on these formats and increase compatibility between MRtrix and Dipy.
@arokem @Lestropie

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry - I am not sure that I follow. What exactly do you propose? That we change the implementation in DIPY? It is just a simple matter of programming.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I personnally don't think the model should be changed in dipy. I think we should support both types in the spec, since they are quite different.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was not familiar with the Dipy representation, so treat my comments as being naive of such.

While eigenvalues of a tensor can be converted to a (zonal) spherical harmonic representation, the converse is not true; so the latter is technically more "general". It's also the case that in MRtrix3 one does not actually need a b=0 "shell": the rows simply correspond to ascending b-value "shells" (where we treat b=0 as a "shell"). So e.g. something we've considered in the past is acquiring 3-6 volumes at b~100-200 to get comparable contrast but with less Gibbs ringing.

So from this, it would certainly be problematic to only have the Dipy representation in the BIDS spec, as it would fundamentally preclude MRtrix3 data from being properly represented. Having the two side-by-side in the spec would certainly be feasible; the question then is whether that complexity is warranted, or whether it would in fact be "cleaner" for Dipy to support the latter representation (even if it were to just perform conversions to its own internal representation at import/export).

Having said that: If this is for the purpose of deconvolution, is the tensor representation not immediately converted to (zonal) spherical harmonics in the Dipy code anyway? If that's the case, it might be overall more convenient to move to that representation? The b=0 mean is still encoded as the l=0 term of the first row if there is at least one b=0 volume in the input data.


One other thing I've realised here is that the tissue response function cannot be properly interpreted unless the b-value represented by each row is known. While this information is kind of in the bvals file, there can be ambiguity in regards to how these data are arranged into "shells". So it might be preferable to explicitly encode such here.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@arokem Do you know from memory how this is done in Dipy? Else I'll recheck.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two things I now notice about the DIPY implementation. The first is that we support two kinds of inputs for response functions (at least in the programmatic API, probably not in the CLI). The first is a spherical harmonic representation through an AxSymShResponse object. The other is a tensor (three eigenvalues). They both require an S0 value.

But yes, IIUC, the tensor representation is almost immediately converted into a SH representation (here). Is that representation identical to the one that mrtrix has?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a couple of different ways that this could be done if both response function representations were to be supported:


ResponseFunctionZSH : values
(1 row per unique b-value, with lower b-values listed first;
1 column per even harmonic degree starting from zero)
ResponseFunctionTensor : values
(4 values: 3 tensor eigenvalues, then a b=0 intensity reference)

Obviously these two fields would need to be mutually exclusive.


ResponseFunctionBasis: [ZSH,Tensor]
ResponseFunctionCoeffs : values
(If "ResponseFunctionBasis" = "ZSH":
1 row per unique b-value, with lower b-values listed first,
1 column per even harmonic degree starting from zero;
If "ResponseFunctionBasis" = "Tensor":
4 values: 3 tensor eigenvalues, then a b=0 intensity reference)


Additionally, would field "ResponseFunctionDegree" need to be removed? In the case of the zonal spherical harmonic representation, the maximal degree can be inferred from the coefficients; in the case of the tensor representation, it either doesn't apply to those coefficients directly, or if that parameter is supposed to apply then to the initial conversion of those coefficients to the ZSH basis, it re-raises the question of why the response function would be provided in the tensor form in the first place.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've given some of this a go in 32183c2; see what you think.

- `ResponseFunction` : values
(1 row per unique *b*-value, with lower *b*-values listed first;
1 column per even harmonic degree starting from zero)
- `SphericalHarmonicBasis` : {`MRtrix 0.2`,`MRtrix3`,`DESCOTEAUX`}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm rethinking this live: would it be better to specify something like tournier 2007, normalized tournier, descoteaux 2007, since the basis is not explicitely linked to a software tool?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am the fence here if the articles are specified as proposed by @jchoude users will need to go into the papers and translate the text there into code. Instead, by pointing to the toolbox more technical information can be found.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Normalized" would be too ambiguous; it's specifically an orthonormal SH basis.

For someone without access to these existing software tools, reading those papers would not provide enough detail for that person to be able to properly interpret data encoded in one of these bases. So both that proposal, and the current text, have their own issues.

If we really want a general solution, that can both support data from existing tools and not be tied specifically to software tools, the bases themselves would need to be decomposed:

  • Is it orthonormal, or is there a sqrt(2) factor in the m!=0 terms?
    (Or indeed, should we simply throw this out altogether? We're removing support for this from MRtrix3, and I doubt any substantial number of people would try to use MRtrix 0.2 with BIDS)

  • What is the scaling between DWI amplitude data and the SH representation?

  • Is there a -1 scaling between consecutive (even) spherical harmonic degrees?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right about the fact that people should be using Mrtrix3 with Bids. Even for legacy data that might have been previously processed with Mrtrix 0.2, I think people would be better off re-processing it. In this case, I agree we should drop Mrtrix 0.2, and just use the names you suggested.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am all for reducing the number of dependencies.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy to remove the MRtrix 0.2 basis support also.

The question of naming of the remaining two options nevertheless remains. Some options:

  1. SphericalHarmonicBasis : [MRtrix3, DESCOTEAUX]
    (or alternatives; e.g. should both be named after authors, or should both be named after softwares?)

  2. SphericalHarmonicScaleFactor : value
    (Some parameter relating to the magnitude scaling between the SH and amplitude basis; for instance, it could define the value of the l=0 coefficient corresponding to the outcome of deconvolving the response function with itself)
    SphericalHarmonicInvertEveryOtherDegree : boolean

We're also assuming throughout here both that the order in which the different SH coefficients are provided in a 4D ODF image are the same between these two existing bases, and that this ordering is sufficiently unambiguous to not need to be specified explicitly.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jchoude is this assumption Ok for dipy?

"We're also assuming throughout here both that the order in which the different SH coefficients are provided in a 4D ODF image are the same between these two existing bases, and that this ordering is sufficiently unambiguous to not need to be specified explicitly."

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So we seem to agree on dropping mrtrix 0.2 @Lestropie do you guys have a way to read/write mrtrix2 tractography into mrtrix3 tractography? If not how difficult would it be for folks to reprocess the data as @jchoude suggests?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Lestropie do you guys have a way to read/write mrtrix2 tractography into mrtrix3 tractography?

For ODFs we do: command shbasis makes a decision about whether or not input SH data are or are not in the newer orthonormal basis, and can convert data between the two. We're actually considering removing this command from the repository, since we highly doubt anybody actually requires it any more, but it would still be made available e.g. as a Gist.

For tractography data, the .tck format remains unchanged since 0.2.

@jchoude
Copy link

jchoude commented Mar 8, 2019

Hey @chrisfilo,

Sorry it's been so long since I've had a proper go at this. Since you indicated that the derivatives are approaching merge, I made the time to have a go at it.

I'll give as much detail as I can here, and leave it up to you as to whether you want me to split things out into separate commits / revert certain changes / incorporate suggestions I've made in the text below but not made in the code / anything else. It's also possible that some things have been discussed & resolved and I'm now re-raising them naively. Feel free to tag anyone else you think should have a look.

  1. Various formatting fixes / changes. Can discuss individual parts if need be.

👍

  1. Eddy current correction can be performed at different orders, so a boolean value may be too restrictive.

You're right. Initially, when drafting the proposal, we used the philosophy (cited by Chris) that the overarching goal of the spec is to enable a end user to know globally what was done on the data, but not necessarily having all details. There were quite a few discussions at the meeting about keeping track of the provenance, which would tell for example the version of the tool, the OS, etc. I think that in this case a string might be better. The question then is: "Should we enforce a vocabulary?". I think I would let @chrisfilo give his opinion on this.

  1. There are a wide range of techniques that can fall under the banner of "intensity normalisation" in DWI; indeed @draffelt tried to write a manuscript on it but it because too large & inconclusive. So again, a boolean may be too restrictive.

Same answer. I see you modified it in this PR. I agree.

  1. "Nonlinear corrections" is too ambiguous. I'm guessing this is supposed to apply to gradient non-linearities, but it could equally well apply to EPI field inhomogeneity corrections, which are also "non-linear". The latter of these appeared to be absent entirely from the document, so I've added it in explicitly. Also, if I'm correct here, there's no need to have both a boolean to indicate that non-linearity correction was performed and a string to indicate what types of correction were performed.

👍

  1. Requiring that all parameters with respect to any particular diffusion model be entirely captured within a single NIfTI image might be too restrictive and/or difficult to interpret. As an example: Consider NODDI, where the current proposal is to have a 4D image with 3 volumes, in the order ICVF, OD, ISOVF. The order is arbitrary, and a user not too familiar with the model may not be able to interpret from the images alone what that order is. While this could be read from the specification, it would perhaps be more clear to instead have 3 x 3D NIFTIs, each with the additional filename keyvalue "_parameter-<parameter>" (or alternative) to identify which model parameter each image stores. Another example would be the AWF volume in WMTI, which is a fundamentally different parameter to the others. Indeed for all four of these parameters, they were re-listed in the "Model-derived maps" section.
    As models become increasingly complex & incorporate more multi-modal / multi-contrast data this issue may exacerbate over time.
    Only a single JSON would be required; I don't know whether or not this would make this particular proposition too incompatible with BIDS "spirit" of having sidecar JSONs with identical names to NIfTIs?

I agree that this issue is getting complex, and will only grow in complexity. I think one way to split this, which would fit with your comments, would be to reassess, for each model, if a model representation needs to be saved (eg: the tensor coefficients, or the SH coefficients for CSD), or if only derived quantities are saved (such as the NODDI maps). That way, we would have a way to represent the models,and for those techniques where the model is the end result, we could save those as derived maps.

Does my comment make sense? I'm writing in a car, and not sure if it's clear enough.

  1. In some cases, ODFs are defined based on a dense set of pre-defined directions rather than a spherical harmonic basis (in MRtrix3 we refer to this as a "dixel" representation, as opposed to "fixels" which are a sparse set of directions where each corresponds to an estimated discrete fibre bundle orientation). Here the best format is to store in a 4D image a list of scalar amplitudes per voxel, and store the pre-defined set of unit directions within the JSON file. We already support this in MRtrix3, including for visualisation. Some diffusion models do yield their outputs in this format (e.g. damped Richardson-Lucy), and so such would need to be described in the "Diffusion Models" section here.

From memory, we didn't include them in this version of the proposal since we felt that that they are not that commonly used, especially for space reasons. Do you use a sparse model when saving, or you save the amplitudes for all vertices on your sphere representation?

  1. Having the diffusion model label in the JSON seems to be redundant with respect to the file name. All good if it's intentional, just thought I'd make the observation.

You're right. I see you removed it, right?

  1. It's not clear to me what was being indicated in the description of the JSON entry for the response function in CSD. What I've replaced it with here is how we represent multi-shell response functions in MRtrix3; if this fundamentally conflicts with other software / what was intended & that had strong justifications behind it, we'll need to discuss this more specifically.

I added a specific comment about this directly in the PR.

  1. TDI is not derived from the diffusion model, but from a tractogram reconstruction, so I wouldn't group it with "model-derived maps".

I see you removed it. I think that's the right move. We decided to not touch everything that was derived from streamlines at this point, since the scope of the proposal would have grown too much. Next version.

  1. "AFD" is somewhat context-dependent, there's a couple of different interpretations:

    • In the original, most basic case, this is literally the same as the amplitude of the (white matter) ODF in any given unit direction, and so therefore is not a "derived parameter map". It's also not scalar (unless one wants "AFD_TOTAL", which is just the l=0 term of the SH ODF);
    • In the newer, more complex case, "AFD" is represented as the fibre density attributed to a particular fixel, which opens up a can of worms.

Pretty sure it was meant as AFD_TOTAL. As you say, this opens a can of worms, so we might postpone for next version.

  1. The description of "fixel" here - "Different voxels can have different numbers of fixels, and therefore a sparse file format is required" - fundamentally conflicts with the higher-level "Vector-valued maps" description in which it appears. This section refers only to XxYxZx(3N) images, for which the storage is not sparse (same amount of memory required for a voxel with 1 fixel as for another with 3). In MRtrix3 we have a specific format for handling fixel data that is sparse, but that requires its own extensive description, and couldn't be slipped into the "vector-valued maps" section as it is currently described (I believe incorporating the fixel format into BIDS derivatives was raised but it's being left out due to its NIfTI-2 dependency).
    As far as the rest of the description, each of the XYZ triplets in such a dataset is a fixel, so "fixels" don't belong as a sub-category of such.

You're right for the integration of the fixel format, it was postponed because of the Nifti-2 dependency. @chrisfilo might want to chime in to see if we change course on this point.

  1. An ODF is never stored in the XxYxZx3N format so shouldn't be listed here.
  1. The issue of axis convention of vector-valued maps has not been addressed. In MRtrix we specify vector orientations with respect to the scanner-space / real-space XYZ axes, other packages define them with respect to the IJK image axes; further, due to the legacy of the old Analyse format, it may be possible for the left-right direction of vectors to be flipped to conform to a left-handed / neurological system even if the reference image data are in a right-handed / radiological system, similarly to how the DWI bvecs are stored; someone from FMRIB may need to confirm this though.
    This would at least need some kind of key in the JSON to indicate what convention is used.

Yes, we should add a flag for this. As you said, a few tools using the FSL bvecs format all work in image space, so vector-valued maps are also in that space. Do you want to add it here, or do you want me to add it once this one is merged?

  1. "Tractography maps contain visitation counts" is attributing a very specific type of data to a very general term. Although some softwares do natively output their tractography results in this format, there's a huge raft of other 3D maps that could be generated from tractography data where the underlying voxel quantities are something other than streamlines counts; indeed we published precisely this as "Track-Weighted Imaging (TWI)".
  2. Currently only two tractography 'classes' are listed: "local" and "global". I would advocate that "front evolution" and "geodesic" are probably their own classes also.

Yes, those would be good additions.

  1. The possibilities for streamlines seeding is wider than what is currently listed ("seedsPerVoxel", "globalSeedNumber", "totalStreamlinesNumber"):

    • Could be surface-based or voxel-based;
    • Could be a fixed number of (seeds, generated streamlines, or output streamlines) per (vertex, or voxel), or it could be a fixed number of (seeds, generated streamlines, or output streamlines) for the whole reconstruction;
    • Even more advanced approaches, e.g. "Dynamic seeding".

Yes, this would need to be expanded or made more generic. Do you already have an idea? Else I can work on something once this one is merged.

  1. What is the intended discrimination between FACT and DET? I see the former term being used in different ways all over the place without any real consensus on what details it specifically refers to.

From memory, FACT was used for the Fact algorithm working on tensors, whereas DET was used for deterministic tracking over HARDI models (I would say this would be in line with SD_STREAM in Mrtrix).

  1. Sherbondy 2009 reference isn't really a tractography method in its own right; it's more of a tractography post-processing method.

True. Could be removed.

  1. If the diffmodel JSON splits model parameters into a "Parameters" key hierarchy, then the same should probably be done for tractography.

True.

OK I think that's enough for now

Those are all really great comments. Thanks for your time, I think this will improve the spec a lot.

@@ -235,19 +222,19 @@ fields for the maps.
<pipeline_name>/
sub-<participant_label>/
dwi/
<source_keywords>[_space-<space>][_model-<label>][_desc-<label>]_<map_label>.nii[.gz]
<source_keywords>[_space-<space>][_desc-<label>]_model-<label>_<parameter>.nii[.gz]

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Lestropie same comment here as about the use of 'parameter'
@jchoude parameter is more general, but we were trying to create a structure here, I believe, [_model + _desc + _map]
This structure would allow human-friendly sorting in Unix, by model, as it was coming first.

The proposed structure [_desc + _model + _parameter] has a couple of issues for me:
(1) the term parameter is more open-ended and not as informative as _map
(2) we lose the sorting by 'model' by moving _desc to early.

Two cents.

Copy link
Author

@Lestropie Lestropie Mar 12, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The sorting issue makes sense; I think my primary concern with the existing text was that "_model" was optional, not compulsory.

I think another thing I was going for here was that intrinsic parameters of the diffusion model would have "[...]_model-<model>[_parameter-<parameter>]_diffmodel.nii[.gz]", whereas derived parameters would be "[...]_model-<model>_<parameter>.nii[.gz]". The discussion around NODDI sort of highlights that the distinction between intrinsic and derived parameters is not clear-cut. Then again maybe using the same terminology in both locations is contributing to that, I don't know. I'm not a huge fan of "map", but having trouble articulating why; maybe it's my coding background, where a "map" provides a complex mapping from a dictionary to a set of values, or the fact that "map" in non-scientific parlance has a very different meaning.


| `<map_label>` value | Description | Possible Model sources | Unit or scale |
| `<parameter>` value | Description | Possible Model sources | Unit or scale |

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See previous comments about 'parameter'

@oesteban
Copy link

oesteban commented Mar 8, 2019

@chrisfilo I'll be going through this today. Sorry for the delay.

@chrisgorgo
Copy link
Owner

Eddy current correction can be performed at different orders, so a boolean value may be too restrictive.

You're right. Initially, when drafting the proposal, we used the philosophy (cited by Chris) that the overarching goal of the spec is to enable a end user to know globally what was done on the data, but not necessarily having all details. There were quite a few discussions at the meeting about keeping track of the provenance, which would tell for example the version of the tool, the OS, etc. I think that in this case a string might be better. The question then is: "Should we enforce a vocabulary?". I think I would let @chrisfilo give his opinion on this.

I would start small and focus on broad strokes important in your field. We can refine this later. It's worth reminding that the space of provenance depth is infinte.

You're right for the integration of the fixel format, it was postponed because of the Nifti-2 dependency. @chrisfilo might want to chime in to see if we change course on this point.

To keep the scope managable I would leave Nifti-2 till another time unless absolutely necessary.

@Lestropie
Copy link
Author

I agree that this issue is getting complex, and will only grow in complexity. I think one way to split this, which would fit with your comments, would be to reassess, for each model, if a model representation needs to be saved (eg: the tensor coefficients, or the SH coefficients for CSD), or if only derived quantities are saved (such as the NODDI maps). That way, we would have a way to represent the models, and for those techniques where the model is the end result, we could save those as derived maps.

Maybe some amount of ambiguity here is whether these NODDI parameters are the fundamental model representation or derived maps. Currently, these are listed in "model parameters", not "model-derived maps". I'm not familiar with how existing software actually handles such, but hypothetically:

  • If software were to provide more "intrinsic" NODDI parameters as one output, and then these parameters (i.e. ICVF, OD, ISOVF) were subsequently calculated from such, then it would make sense for the latter to be referred to as "model-derived maps", and should be moved in the spec accordingly.

  • If such "intrinsic" parameters are not provided, then I would argue that data such as ICVF, OD, ISOVF are in fact the intrinsic model parameters themselves, not derivations thereof.

From memory, we didn't include them in this version of the proposal since we felt that that they are not that commonly used, especially for space reasons. Do you use a sparse model when saving, or you save the amplitudes for all vertices on your sphere representation?

While I don't know what fraction of the community uses such, there's at least one well-known model that uses such (dRL), and I've certainly had communication with one researcher who was performing calculations with such a representation and trying to write software to visualise such, at the time unaware that MRtrix3 already had the capability of visualising ODFs in this form. So my intuition is that given the reasonable exhaustiveness of the listing here, it doesn't warrant exclusion, and use of such may be more common than we think (I'm sure I've read many papers that perform calculations on such a representation, but simply aren't made available in popular software).

This form is specifically not a sparse representation. The 4D image contains N volumes, where N is the number of directions on the half-sphere (and that direction set does not change across voxels). For any particular voxel, the image intensity in each volume corresponds to an amplitude on the half-sphere; the direction on the half-sphere in which that amplitude was observed is the entry in the basis direction set corresponding to the volume number.

Pretty sure it was meant as AFD_TOTAL. As you say, this opens a can of worms, so we might postpone for next version.

👍

Not sure whether or not to add AFD_TOTAL to the list of model-derived maps: apart from being just the l=0 term of the white matter ODF, there's also the chance of differential scaling (i.e. multiplying by sqrt(4pi) so that AFD_TOTAL in the white matter is approximately 1.0). Thoughts?

Yes, we should add a flag for this. As you said, a few tools using the FSL bvecs format all work in image space, so vector-valued maps are also in that space. Do you want to add it here, or do you want me to add it once this one is merged?

I think it should be added here. What I would however need to know from the Oxford group is whether these vectors are stored in image space, or neurological image space (i.e. do they assume a left-hand coordinate system regardless of the handedness of the NIfTI, as is done with bvecs), or do both of these possibilities need to be supported. Anyone you can tag who may be able to provide this?

Currently only two tractography 'classes' are listed: "local" and "global". I would advocate that "front evolution" and "geodesic" are probably their own classes also.

Yes, those would be good additions.

I'll see how I go dredging up some references for these...

The possibilities for streamlines seeding is wider than what is currently listed

Yes, this would need to be expanded or made more generic. Do you already have an idea? Else I can work on something once this one is merged.

I feel I have a good sense of the "general solution" here, in software solution at least; the tricky part will be describing such in a way that doesn't become entirely cryptic. Let me have a go here.

From memory, FACT was used for the Fact algorithm working on tensors, whereas DET was used for deterministic tracking over HARDI models (I would say this would be in line with SD_STREAM in Mrtrix).

I definitely think that there may not be a consensus on what these terms mean (especially FACT). Indeed some time ago in MRtrix3 the algorithm that was referred to as FACT changed: the FACT algorithm was renamed to something else, and a different algorithm that already existed under a different name was renamed to FACT. I've also read multiple articles that talk about "FACT with multi-fiber capability", which would indicate that "FACT without multi-fiber capability" is based on a XYZ fiber orientation representation (e.g. primary eigenvector) in each voxel, not a tensor representation. There's also been some contention about whether "FACT" does or does not include interpolation (which can't be done trivially with a multi-fiber representation).

I think if the term is to remain the spec needs to be quite specific about what it refers to.

@chrisgorgo
Copy link
Owner

let me know when this is ready to merge

Copy link

@francopestilli francopestilli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding individual nifti files for a single model as output seems helpful

@@ -54,7 +54,7 @@ The following is a general example of naming convention:
<pipeline_name>/
sub-<participant_label>/
dwi/
<source_keywords>[_space-<space>][_desc-<label>]_model-<label>_diffmodel.nii[.gz]
<source_keywords>[_space-<space>][_desc-<label>]_model-<label>[_parameter-<parameter>]_diffmodel.nii[.gz]

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good suggestion. Thanks

Copy link

@francopestilli francopestilli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving this suggestion

src/05-derivatives/05-diffusion-derivatives.md Outdated Show resolved Hide resolved
Copy link

@francopestilli francopestilli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can accept this change except for the order of model and parameter in the file name. We agree 'model' should come earlier.

@@ -235,19 +222,19 @@ fields for the maps.
<pipeline_name>/
sub-<participant_label>/
dwi/
<source_keywords>[_space-<space>][_model-<label>][_desc-<label>]_<map_label>.nii[.gz]
<source_keywords>[_space-<space>][_desc-<label>]_model-<label>_<parameter>.nii[.gz]

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps we can accept this except that we want _model-_parameter-_diffusionmodeo.nii[.gz]

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@francopestilli Happy with 32183c2? Now have model as compulsory and desc as optional, but in that order.

Having said that I think there may be other locations in the spec where this key ordering may also be an issue...

Following on from ea2bc96.
- Fix grammattical error in description of DWI JSON.
- Modify description of recommended NODDI model output.
- For "fwDTI" model, separate tensor coefficients and free water fraction data into two separate NIfTI images.
- For DTI model, include iterative weighted least squares as an option, and separate RESTORE (outlier rejection) as being a separate parameter rather than a method of fitting.
- For CSD model, offer two mechanisms for specifying the response function; also remove MRtrix 0.2 SH basis, and remove "none" from NonNegativityConstraint (as without a non-negativity constraint, it would not be CSD).
- Provide additional examples for DWI model outputs.
- Remove NODDI parameters from "model-derived maps", as these are re-classified as fundamental model output parameters.
- Remove DECFA from "model-derived maps", and instead define "directionally-encoded colour maps" as a different form of model-derived map, since any scalar map can be augmented with DEC.
- Add "Count" as a compulsory parameter for tractography output.
- For tractography output, provide three distinct dictionaries within the JSON: "Constraints" (ROIs and anatomical tissue constraints), "Parameters" (for parameters specific to the streamlines tractography algorithm), and "Seeding" (since determining streamline seeds is in fact independent of the particular tractography algorithm and parameters). Field "TerminationCriterion" was removed in this process as it is common for there to be more than one mechanism by which streamlines are terminated. Possibilities for streamline seeding have been more broadly generalised as part of this process.
@Lestropie
Copy link
Author

Sorry to make everyone's lives even harder, but going through this a second time and thinking about the practical limitations that I would have trying to express my own BIDS derivative outputs within the existing spec made me change things quite a bit further.

Some details & justifications in 32183c2, but a more general comment here. While much of this echoes how things are done in MRtrix3, and therefore may come off as self-serving, with MRtrix3 we put a great deal of effort into figuring out the right classifications of mechanisms & command-line options, which mechanisms are independent of one another, and what the most general solutions to various problems are. So I think BIDS derivatives could benefit from what we've learned through those travels, in that it should become less likely that someone who wants to define their data in this format but has not been involved in developing the specification will find themselves unable to adequately describe their data / experiment within this framework.

@chrisgorgo
Copy link
Owner

Should I move on with merging this or are there still outstanding issues?

@Lestropie
Copy link
Author

Probably best to have @oesteban have another look first since I did change a fair bit more in 32183c2.

Is there an up-to-date estimated time for the full Derivatives merge? If I find anything further I want to change / add I'd rather defer to smaller PRs, but it would be useful to know how much time I might have for doing so.

@chrisgorgo
Copy link
Owner

chrisgorgo commented Mar 28, 2019 via email

@Lestropie
Copy link
Author

OK cool; the process of deriving examples should give the level of scrutiny I'm thinking of.

What I might try to do is update my BIDS App outputs to conform properly (currently they only conform "in spirit") as part of that finalisation process: that'll give me the chance to roadtest the proposals rather than trying to think of hypothetical use cases, and any conflicts or suggestions I come up with can be raised in the context of something more practical.

@chrisgorgo
Copy link
Owner

chrisgorgo commented Mar 28, 2019 via email

@francopestilli
Copy link

That's a great idea!

On Wed, Mar 27, 2019, 11:46 PM Robert Smith @.***> wrote: OK cool; the process of deriving examples should give the level of scrutiny I'm thinking of. What I might try to do is update my BIDS App outputs to conform properly (currently they only conform "in spirit") as part of that finalisation process: that'll give me the chance to roadtest the proposals rather than trying to think of hypothetical use cases, and any conflicts or suggestions I come up with can be raised in the context of something more practical. — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#6 (comment)>, or mute the thread https://github.com/notifications/unsubscribe-auth/AAOkp7U6c2IwZUWcIwSbDp2LLXfbbPsxks5vbGVKgaJpZM4bhCgg .

That is great! BTW -> This is precisely what we did with brainlife.io to make sure this was consistent ->
See below examples of download into BIDS-derivatives from this project https://doi.org/10.25663/bl.p.3 :
The following example shows the CLI command to download all T1w datasets from a subject in the publication data Release 2:
bl pub query # this will return the publication IDs
bl bids download --pub 5c0ff604391ed50032b634d1 --subject 0001 --datatype neuro/anat/t1w

The following will download the data in the entire project (from Release 2) into BIDS format:
bl bids download --pub 5c0ff604391ed50032b634d1

Note. The brainlife.io CLI can be installed on most Unix/Linux systems using the following command: npm install brainlife -g. The CLI can be used to query and download partial of full datasets.

@jchoude
Copy link

jchoude commented Apr 12, 2019

Sorry about the delay. Personal stuff and a job status change didn't give me lots of free time in the last few weeks. I'll give this a re-read in the coming hours.

@chrisgorgo
Copy link
Owner

Please change the target of this PR to https://github.com/bids-standard/bids-specification/tree/derivatives

@Lestropie
Copy link
Author

Unable to change target of an existing PR to a different fork; a new PR for the requested target can be found at bids-standard#205.

- Minor spelling & grammar fixes.
- In "scalar maps" section, expand out the different parameters to be derived from the ball-and-stick(s) model.
- Provide more accurate link to MRtrix3 tracks file format.
- Fix use of superscripts in "Scalar maps" table.
- For "Units" field in tractography JSON file, list possible values in the same format as that used elsewhere in the document.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants