Transformation types #101
**Linear transforms**

I would not recommend introducing separate transform types for special cases of linear transforms (such as `rigid` or `axis_permutation`).
It seems simple, but years of experience with the NIFTI file format show that this is a problem that is almost impossible to solve correctly. A common issue is that, due to numerical inaccuracies, most of the time images have slightly non-orthogonal axes, so you need to define tolerance metrics to decide whether the axes are orthogonal, unit-length, etc., and based on that decide whether you write the transform out as rigid (discarding the accurate orientation and scaling) or as affine (keeping all the values accurate). This has been an open problem for over 20 years; there is still no universal solution that works well for all use cases.
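To make the tolerance problem concrete, here is a minimal sketch (assuming numpy; the function name and threshold are made up for illustration) of the kind of check a writer has to perform before deciding which transform type to emit:

```python
import numpy as np

def classify_linear(matrix, tol=1e-6):
    """Decide whether a 3x3 voxel-to-world matrix can be stored as
    rotation + per-axis scale without losing information."""
    m = np.asarray(matrix, dtype=float)
    scales = np.linalg.norm(m, axis=0)      # per-axis scale = column length
    directions = m / scales                 # unit-length direction vectors
    # For orthogonal axes the Gram matrix equals the identity.
    deviation = np.abs(directions.T @ directions - np.eye(3)).max()
    return "rigid + scale" if deviation < tol else "affine"

# Axes that are orthogonal only up to resampling round-off: the answer
# depends entirely on the arbitrary choice of `tol`.
m = np.array([[0.9999999, 1e-8, 0.0],
              [-1e-8, 1.0, 0.0],
              [0.0, 0.0, 1.0]])
print(classify_linear(m))
```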
If you introduce a new transform type for each parameterization of a 4x4 matrix, then you cannot stop at just affine, rigid, axis_permutation; you'll have to add all the other commonly used parameterizations, as is done in ITK. Of course, it is just software, everything is doable, but still, implementing 15 transform types (instead of just 1) for representing a simple linear transform is a significant workload. Most likely, each application would choose to implement just a subset, ending up with incompatibilities and many not-well-tested code branches in file I/O source code. Overall, this leads to unhappy users and developers.

**Other transforms**
Thanks for having a look at this @lassoan

**Linear transforms**

I would be happy not to include `rigid` and similar special-case types. There is some value in other simpler parametrizations though, i.e. we should keep `scale` and `translation`.
I think one reason to do this was that some applications can only consume those simple transformations (scale and translate). However, I found the suggestion made in one of the last NGFF calls worth considering: those applications could just pull out the scale and translation from the affine. So, even if it may break our current spec, I wonder, given the 20+ years of experience of @lassoan, whether we should reconsider and only support affine at the spec level (APIs could support more and then translate from and to affine).
Getting the scale from the transformation matrix is very simple (scale[i] is the length of the i-th column of the 3x3 matrix).
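For reference, a sketch of that extraction with numpy, using the row-major 3x4 layout from the "scanner-to-anatomical" example later in this thread:

```python
import numpy as np

# 3x4 affine, row-major, as in the basic example below
affine = np.array([ 0.9975, 0.0541, -0.0448, 0,
                   -0.05185, 0.9974, 0.0507, 0,
                    0.04743, -0.04824, 0.99771, 0]).reshape(3, 4)

scale = np.linalg.norm(affine[:, :3], axis=0)  # scale[i] = length of column i
translation = affine[:, 3]
print(scale, translation)
```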
I am with @bogovicj and others to support explicit subsets of affine transformations. I never found it helpful to remove information only to later rediscover it and to deal with the associated inaccuracies. If a transformation treats the axes independently (such as translations and scalings), then it should say so, because an application can take helpful shortcuts when dealing with it, e.g. rendering transformed data is much faster and easier. If a transformation is meant to be orthogonal (similarities) or even orthonormal (rigid), then it is helpful to know this instead of guessing it from noisy projections. Applications that handle only affine transformations are free to convert first and then do their thing. This could indeed be a tool written in jQuery or a jQuery based translation layer. Proposed name: "see-everything-that-is-an-affine-as-an-affine".
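As an illustration of the rendering shortcut mentioned above, a naive nearest-neighbor sketch (numpy; all names are made up) that exploits the fact that an axis-aligned scale-and-translation never mixes axes, so no matrix multiplication or cross-axis computation is needed per sample:

```python
import numpy as np

def render_scale_translation(src, scale, translation, out_shape):
    """Map each output index back to a source index with one multiply
    and one add per axis -- the fast path that is available only when
    the transform is known to be axis-aligned."""
    out = np.zeros(out_shape, dtype=src.dtype)
    for idx in np.ndindex(*out_shape):
        src_idx = tuple(int(round((i - t) / s))
                        for i, s, t in zip(idx, scale, translation))
        if all(0 <= j < n for j, n in zip(src_idx, src.shape)):
            out[idx] = src[src_idx]
    return out

img = np.arange(16.0).reshape(4, 4)
print(render_scale_translation(img, scale=(2.0, 2.0),
                               translation=(0.0, 0.0), out_shape=(8, 8)))
```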
Thanks for working on this @bogovicj, a few first comments from my side:
I think both of these are not so easy to understand. That does not mean we should not include them, but they will need some more motivation, explanation and examples.
If we stick with the current way of specifying transformations in 0.4, then sequence is not necessary; whenever transformations are given in the spec they should be given as a list. I would be open to changing this, but I think we should have only one of the two potential solutions, i.e. either plain lists or an explicit `sequence` type.
I think that these are not really
I am not sure yet how we represent non-cartesian spaces in #94. Maybe it's simpler to leave these out for now as well, but I am happy to change my mind on this if the solution is simple. Regarding affine transformations and subsets thereof: I fully agree with @axtimwalde's comment above that being able to specify the explicit subset is better than needing to extract this information from the full affine representation. The forward translation of going from a subset such as `scale` to the full affine is trivial, whereas the reverse direction is lossy and requires tolerance checks.
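To illustrate the asymmetry: the `scale` below can always be rewritten as the equivalent `affine` (row-major 3x4 layout, as in the examples later in this thread) by placing the factors on the diagonal, whereas recovering the `scale` form from an arbitrary affine first requires verifying that every off-diagonal entry is (numerically) zero:

```json
{ "type": "scale", "scale": [ 0.8, 0.8, 2.2 ] }
```

```json
{ "type": "affine",
  "affine": [ 0.8, 0, 0, 0,   0, 0.8, 0, 0,   0, 0, 2.2, 0 ] }
```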
**The need for `sequence`**

We will eventually support references to transformations that are re-used multiple times. This saves storage and makes it explicit that a specific transformation is being used. Transformations used for specific datasets can then be constructed as a combination of references and explicitly stored transformations. The referenced transformations can be single transformations or sequences of transformations, and may themselves contain references to transformations. This whole structure means that transformations are trees that, when applied, are flattened and applied as a sequence. The cleanest way to do this is to enable leaf transformations, sequences, and references (to leaves or sequences), and to understand them all as the same kind of node: a transformation.

Best example for me: lens distortion correction for stitched EM or confocal images. The distortion correction consists of a non-linear polynomial transformation and an affine transformation that normalizes between color channels (confocal) or across cameras (EM), i.e. it is a sequence. The same lens distortion correction transformation is re-used by thousands of tiles in the stitched dataset. We may improve the lens-distortion correction at a later time with better calibration data and would then update only one instance instead of thousands. Each tile also has a rigid or general affine transformation that stitches it into a global montage.
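To make the tree structure concrete, a hypothetical sketch of such a reused sequence; the field names (`transformations`, `reference`, `coefficients`) and all values are illustrative only, not taken from the spec or the prototype:

```json
{
  "coordinateTransformations": [
    {
      "name": "lens-correction",
      "type": "sequence",
      "transformations": [
        { "type": "polynomial", "coefficients": [ 0.0, 1.0, 0.0, -2.0e-9 ] },
        { "type": "affine", "affine": [ 1.0, 0.0, 0.0, 0.0, 1.0, 0.0 ] }
      ]
    },
    {
      "name": "tile-0001-to-montage",
      "type": "sequence",
      "transformations": [
        { "type": "reference", "name": "lens-correction" },
        { "type": "affine", "affine": [ 1.0, 0.01, 1024.0, -0.01, 1.0, 2048.0 ] }
      ]
    }
  ]
}
```

Updating the single `lens-correction` entry would then update every tile that references it.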
Non-cartesian image data is abundant in medical imaging and must therefore be supported. The data arrays are just as multi-dimensional as microscopy acquisitions. A good practical example: ultrasound scanner data.
Having multiple ways of specifying an affine transform adds a small amount of complexity but is indeed relatively easy to handle when reading. It is similarly true that it is easy to deal with `scale` and `translation` only. However, I am less convinced that it will actually reduce implementation complexity even if you support optimizations in the scale-translation-only case, because in practice you will likely have to compose multiple transformations, and an affine transform matrix is the simplest way to do that composition. Then, in the final transform matrix, you can check for whatever conditions you have optimizations for. Of course, if there are non-linear transforms this sort of composition is not possible, but those transforms will have to be supported in a less efficient way (or not supported at all), and you would still want to compose any affine transforms before and after each non-linear transform.

One issue I can foresee, related to what @lassoan said, is that if there are multiple ways to represent affine transforms, but some ome-zarr implementations support only some of those representations, or support them more efficiently, then when writing a transformation you will have to be aware of which representations are supported / more efficient in each implementation. For example, if some viewer only supports `scale` and `translation`, then writers will be pushed to use those types even when a general affine would describe the data more accurately.
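A sketch of that composition strategy (numpy, 2d for brevity; the helper is made up and follows the JSON field names used in this thread):

```python
import numpy as np

def to_homogeneous(transform):
    """Promote a scale / translation / affine transform to a 3x3
    homogeneous matrix (2d case)."""
    m = np.eye(3)
    if transform["type"] == "scale":
        m[0, 0], m[1, 1] = transform["scale"]
    elif transform["type"] == "translation":
        m[:2, 2] = transform["translation"]
    elif transform["type"] == "affine":
        m[:2, :] = np.reshape(transform["affine"], (2, 3))
    return m

# Compose by matrix multiplication (later transforms multiply on the
# left), then test the final matrix once for optimizable special cases.
seq = [{"type": "scale", "scale": [2.2, 1.1]},
       {"type": "translation", "translation": [10, 12]}]
total = np.eye(3)
for t in seq:
    total = to_homogeneous(t) @ total

linear = total[:2, :2]
is_axis_aligned = np.allclose(linear, np.diag(np.diag(linear)))
print(total, is_axis_aligned)
```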
Here is a brief summary of some examples. I've started a prototype implementation with more details.
**Basic example**

Pixel to physical space, and a simple affine between two physical spaces (scanner vs anatomical) for our medical imaging friends.

Basic example metadata:

```json
{
"spaces": [
{
"name": "scanner",
"axes": [
{ "type": "space", "label": "x", "unit": "millimeter", "discrete": false },
{ "type": "space", "label": "y", "unit": "millimeter", "discrete": false },
{ "type": "space", "label": "z", "unit": "millimeter", "discrete": false }
]
},
{
"name": "LPS",
"axes": [
{ "type": "space", "label": "LR", "unit": "millimeter", "discrete": false },
{ "type": "space", "label": "AP", "unit": "millimeter", "discrete": false },
{ "type": "space", "label": "IP", "unit": "millimeter", "discrete": false }
]
}
],
"coordinateTransformations": [
{
"scale": [ 0.8, 0.8, 2.2 ],
"type": "scale",
"name": "to-mm",
"input_space": "/basic/mri",
"output_space": "scanner"
},
{
"affine": [ 0.9975, 0.0541, -0.0448, 0, -0.05185, 0.9974, 0.0507, 0, 0.04743, -0.04824, 0.99771, 0 ],
"type": "affine",
"name": "scanner-to-anatomical",
"input_space": "scanner",
"output_space": "LPS"
}
]
}
```

**Crop / cutout example**

This example has two 2d datasets, `/crop/img2d` and `/crop/img2dcrop`, where the second is a crop (cutout) of the first.
In addition to the default pixel spaces, there are: `physical`, `crop-offset`, and `crop-physical`.
Crop example metadata:

```json
{
"spaces": [
{
"name": "physical",
"axes": [
{ "type": "space", "label": "x", "unit": "micrometer", "discrete": false },
{ "type": "space", "label": "x", "unit": "micrometer", "discrete": false }
]
},
{
"name": "crop-offset",
"axes": [
{ "type": "space", "label": "ci", "unit": "", "discrete": true },
{ "type": "space", "label": "cj", "unit": "", "discrete": true }
]
},
{
"name": "crop-physical",
"axes": [
{ "type": "space", "label": "cx", "unit": "micrometer", "discrete": false },
{ "type": "space", "label": "cy", "unit": "micrometer", "discrete": false }
]
}
],
"coordinateTransformations": [
{
"name": "to-physical",
"type": "scale",
"scale": [ 2.2, 1.1 ],
"input_space": "/crop/img2d",
"output_space": "physical"
},
{
"name": "to-crop-physical",
"type": "scale",
"scale": [ 2.2, 1.1 ],
"input_space": "/crop/img2dcrop",
"output_space": "crop-physical"
},
{
"name": "offset",
"type": "translation",
"translation": [ 10, 12 ],
"input_space": "/crop/img2dcrop",
"output_space": "/crop/img2d"
}
]
}
```

**Multiscale**

A multiscale dataset; the edits of note compared to the 0.4 metadata are visible in the example below.

Example multiscale metadata (lightly edited):

```json
{
"spaces": [
{
"name": "physical",
"axes": [
{ "type": "space", "label": "x", "unit": "um", "discrete": false },
{ "type": "space", "label": "y", "unit": "um", "discrete": false }
]
}
],
"multiscales": [
{
"version": "0.5-prototype",
"name": "ms_avg",
"type": "averaging",
"datasets": [
{
"path": "/multiscales/avg/s0",
"coordinateTransformations": [
{ "scale": [ 2.2, 3.3 ], "type": "scale" },
"name": "s0-to-physical",
"input_space": "/multiscales/avg/s0",
"output_space": "physical"
]
},
{
"path": "/multiscales/avg/s1",
"coordinateTransformations": [
{ "scale": [ 4.4, 6.6 ], "type": "scale" },
{ "translation": [ 1.1, 1.65 ], "type": "translation" }
],
"name": "s1-to-physical",
"input_space": "/multiscales/avg/s1",
"output_space": "physical"
},
{
"path": "/multiscales/avg/s2",
"coordinateTransformations": [
{ "scale": [ 8.8, 13.2 ], "type": "scale" },
{ "translation": [ 3.3, 4.95 ], "type": "translation" }
],
"name": "s2-to-physical",
"input_space": "/multiscales/avg/s2",
"output_space": "physical"
}
]
}
]
}
```

Example discrete multiscale metadata:

```json
{
"spaces": [
{
"name": "physical",
"axes": [
{ "type": "space", "label": "x", "unit": "um", "discrete": false },
{ "type": "space", "label": "y", "unit": "um", "discrete": false }
]
}
],
"multiscales": [
{
"version": "0.5-prototype",
"name": "ms_discrete",
"type": "discrete",
"datasets": [
{
"path": "/multiscales/discrete/s0",
"coordinateTransformations": []
},
{
"path": "/multiscales/avg/s1",
"coordinateTransformations": [
{ "scale": [ 2, 2 ], "type": "scale" },
],
"name": "s1-to-s0",
"input_space": "/multiscales/discrete/s1",
"output_space": "/multiscales/discrete/s0",
},
{
"path": "/multiscales/avg/s2",
"coordinateTransformations": [
{ "scale": [ 4, 4 ], "type": "scale" }
],
"name": "s2-to-s0",
"input_space": "/multiscales/discrete/s2",
"output_space": "/multiscales/discrete/s0",
}
],
"coordinateTransformations" : [
{ "scale": [ 0.8, 1.1 ], "type": "scale" },
],
"input_space": "/multiscales/avg/s0",
"output_space": "physical"
}
]
}
```

This alternative maps the downsampled arrays (`s1`, `s2`) into the pixel space of `s0` rather than directly to physical space. This example also includes a "global" `coordinateTransformations` that maps `s0` to the `physical` space.

Example discrete multiscale metadata with shorthands:

```json
{
"multiscales": [
{
"version": "0.5-prototype",
"name": "ms_discrete",
"type": "discrete",
"datasets": [
{
"path": "/multiscales/discrete/s0"
},
{
"path": "/multiscales/avg/s1",
"coordinateTransformations": [
{ "scale": [ 2, 2 ], "type": "scale" },
]
},
{
"path": "/multiscales/avg/s2",
"coordinateTransformations": [
{ "scale": [ 4, 4 ], "type": "scale" }
]
}
]
}
]
}
```

This final example omits the global `coordinateTransformations` and the `spaces` definitions, relying on shorthands.

Example multiscale metadata with multiple spaces:

```json
{
Example multiscale metadata with multiple spaces{
"spaces" : [
{
"name": "physical",
"axes": [
{ "type": "space", "label": "x", "unit": "um", "discrete": false },
{ "type": "space", "label": "y", "unit": "um", "discrete": false }
]
},
{
"name": "anatomical",
"axes": [
{ "type": "space", "label": "LR", "unit": "um", "discrete": false },
{ "type": "space", "label": "AS", "unit": "um", "discrete": false }
]
}
],
"coordinateTransformations" : [
{
"name" : "s0-to-physical",
"type" : "scale",
"scale" : [ 0.8, 2.2 ],
"input_space" : "/multiscales/discrete/s0",
"output_space" : "physical"
},
{
"name" : "physical-to-anatomical",
"type" : "affine",
"affine" : [ 0.8, 0.05, -3.4, 0.08, 0.91, 10.2 ],
"input_space" : "physical",
"output_space" : "anatomical"
}
],
"multiscales": [
{
"version": "0.5-prototype",
"name": "ms_discrete",
"type": "discrete",
"datasets": [
{
"path": "/multiscales/discrete/s0"
},
{
"path": "/multiscales/avg/s1",
"coordinateTransformations": [
{ "scale": [ 2, 2 ], "type": "scale" },
]
},
{
"path": "/multiscales/avg/s2",
"coordinateTransformations": [
{ "scale": [ 4, 4 ], "type": "scale" }
]
}
]
}
]
}
```
In the multiscale example, the schema you have shown seems to allow multiple coordinate transforms for each scale and multiple coordinate spaces for the multiscale. Is that something you specifically intended to support?
No, every level gets one transform.
The "spaces" property of the items of the "multiscales" array is also an array --- but are you saying that is also intended to be just a single item? What do you imagine the use would be for the "name" given to each of the scale's coordinate transforms --- is that intended to allow something outside of that particular multiscale definition to reuse that coordinate transform? |
Forgive me for not giving a great answer now --- a good answer means describing how I intend to use the spec, i.e. how it enables having a nice API (in my view). I will write that up in longer form soon, but wanted to get some examples out there first. In short:
I've updated and added new multiscales examples to the comment above (preserving the originals for the record). Changes and new examples:
@bogovicj thanks for working on this! I really like the global `coordinateTransformations`.
I agree that these are a priority, but I would also add
As @axtimwalde and others have mentioned, while it is possible to represent scale, translation, rigid, etc. inside an affine transformation, it is not possible to know immediately whether an affine transformation does not contain shearing, for example. And, while affine composition can easily and universally be achieved with little computational overhead, decomposition depends on the availability of more advanced tools, and the result depends on the method and values (it is "noisy," as @axtimwalde put it), e.g. a negative sign in the scale can end up attributed to any axis, or to the rotation, depending on the decomposition method.
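A tiny numpy illustration of that ambiguity: column norms lose the sign of a flip, and an SVD-based decomposition silently moves the reflection into the "rotation":

```python
import numpy as np

# A pure anisotropic scale with one flipped axis, stored as a full matrix:
a = np.diag([-1.2, 0.8, 2.0])

u, s, vt = np.linalg.svd(a)
rotation = u @ vt                  # nearest orthogonal factor

print(np.linalg.norm(a, axis=0))   # [1.2 0.8 2. ] -- the sign is gone
print(np.linalg.det(rotation))     # -1.0 -- the flip migrated here
```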
Numerical inaccuracies are an important consideration, but storing transformation parameters as binary 64-bit IEEE floats instead of ASCII decimal is the way to minimize them.
Regarding support for different basic transformation types: ITK supports different representations of transformation types for the purposes of registration / optimization. I agree with @lassoan in that we want to keep representations as minimal and as simple as possible. I do not think we should require different representations to be supported in NGFF, for the simplicity of the standard and of implementing software. An example of different representations of a transformation is Euler angles vs. a versor for a rigid transformation.
Regarding binary vs text representation of floating point numbers: while it is certainly easy to lose precision when converting to a text representation, it is also quite possible to convert losslessly --- there is no need to use a binary representation just for that purpose. In particular, the Python `repr` of a float produces the shortest decimal string that parses back to the exact same value.
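For instance (Python 3):

```python
import struct

x = 0.1 + 0.2                 # 0.30000000000000004
s = repr(x)                   # shortest decimal string that round-trips
assert float(s) == x
# The underlying 64 bits are bit-identical after the text round trip:
assert struct.pack("<d", float(s)) == struct.pack("<d", x)
print(s)
```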
Yes, while it is possible to convert binary to text floating point losslessly, there are issues and limitations that are not always handled ideally by every language / library / implementation. We found that, in practice, ITK needed to use the Google double-conversion library when serializing / deserializing transform parameters to text transform file formats to avoid a loss of information that interfered with results.
This issue has been mentioned on Image.sc Forum. There might be relevant details there: https://forum.image.sc/t/ome-ngff-community-call-transforms-and-tables/71792/1
This issue has been mentioned on Image.sc Forum. There might be relevant details there: https://forum.image.sc/t/save-irregular-time-coordinates-in-ome-zarr/82138/2
Details here, examples forthcoming.
The v0.4 specification declares the types `identity`, `translation`, and `scale`. Version 0.5 should include new types of transformations. Here is a preliminary list, ordered approximately by importance / urgency / utility (perceived by me).
**Questions**

Is `sequence` necessary?

@constantinpape @xulman @tischi @axtimwalde @tpietzsch @d-v-b @jbms @satra