
Downsampling and translation #440

Closed
d-v-b opened this issue Feb 17, 2023 · 9 comments

Comments

@d-v-b
Contributor

d-v-b commented Feb 17, 2023

neuroglancer has a rendering error for multiscale images described by ome-ngff metadata -- it looks like the different scale level arrays are offset relative to each other. See this example. (edit: fixed broken url)

The ome-ngff metadata (below) assigns s0 a translation transform of (0, 0, 0), but s1 has (0.5, 0.5, 0.5) and s2 has (1.5, 1.5, 1.5), which is consistent with the effect of downsampling (and is the behavior of the coarsening routines in xarray). If neuroglancer were internally applying extra translations to lower scale levels of multiscale images, then the rendering error I observed could be a side effect. A fix would be to avoid applying extra translations to lower scale levels when the right metadata is present.

Here's the ome-ngff metadata for that data:

{
    "multiscales": [
        {
            "axes": [
                {
                    "name": "z",
                    "type": "space",
                    "units": null
                },
                {
                    "name": "y",
                    "type": "space",
                    "units": null
                },
                {
                    "name": "x",
                    "type": "space",
                    "units": null
                }
            ],
            "coordinateTransformations": [
                {
                    "scale": [
                        1.0,
                        1.0,
                        1.0
                    ],
                    "type": "scale"
                }
            ],
            "datasets": [
                {
                    "coordinateTransformations": [
                        {
                            "scale": [
                                1.0,
                                1.0,
                                1.0
                            ],
                            "type": "scale"
                        },
                        {
                            "translation": [
                                0.0,
                                0.0,
                                0.0
                            ],
                            "type": "translation"
                        }
                    ],
                    "path": "s0"
                },
                {
                    "coordinateTransformations": [
                        {
                            "scale": [
                                2.0,
                                2.0,
                                2.0
                            ],
                            "type": "scale"
                        },
                        {
                            "translation": [
                                0.5,
                                0.5,
                                0.5
                            ],
                            "type": "translation"
                        }
                    ],
                    "path": "s1"
                },
                {
                    "coordinateTransformations": [
                        {
                            "scale": [
                                4.0,
                                4.0,
                                4.0
                            ],
                            "type": "scale"
                        },
                        {
                            "translation": [
                                1.5,
                                1.5,
                                1.5
                            ],
                            "type": "translation"
                        }
                    ],
                    "path": "s2"
                }
            ],
            "metadata": null,
            "name": null,
            "type": null,
            "version": "0.5-dev"
        }
    ]
}
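
For reference, the translations above are exactly what windowed-averaging downsampling produces when coordinates refer to pixel centers: the first sample at a level downsampled by a factor f averages base voxels 0 through f - 1, so its center sits at (f - 1) / 2 in the base frame. A minimal sketch in plain Python (purely illustrative; not part of the metadata or of any particular library):

# Expected scale/translation per level when downsampling a unit-spaced
# volume by windowed averaging, with coordinates referring to pixel centers.
base_scale = 1.0
for level, factor in enumerate([1, 2, 4]):
    scale = base_scale * factor
    # The first output sample averages base voxels 0..factor-1, whose centers
    # span [0, factor - 1]; its own center is therefore at:
    translation = base_scale * (factor - 1) / 2
    print(f"s{level}: scale={scale}, translation={translation}")
# s0: scale=1.0, translation=0.0
# s1: scale=2.0, translation=0.5
# s2: scale=4.0, translation=1.5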
jbms added a commit that referenced this issue Feb 17, 2023
Previously, Neuroglancer did not take into account that OME treats the
center of a pixel as its origin, while Neuroglancer treats the corner
as the origin.

Fixes #440.
@jbms
Collaborator

jbms commented Feb 17, 2023

Thanks for reporting this.

The issue is due to Neuroglancer using the corner of a voxel as its origin (by default), while OME (or at least your example) uses the center as the origin.

I recall from the discussions on various OME metadata that it was decided to use the center as the origin of a voxel, but when I looked at the specification just now I was not able to find anything about that. In any case, it sounds like other tools use the center, so Neuroglancer should as well, for consistency.

I have a fix here: https://github.com/google/neuroglancer/tree/fix-440

Unfortunately there are some complications:

  • The solution that would seem to provide the greatest fidelity to the OME coordinate transformation spec would be to make Neuroglancer's continuous coordinate space exactly match what is specified, i.e. in your example the voxel at (0, 0, 0) would correspond to the continuous region [-0.5, -0.5, -0.5] to [0.5, 0.5, 0.5]. That would ensure that data sources like meshes are correctly aligned. Neuroglancer's rendering handles this fine, but in the normal case, where we intend to translate the base scale by just an integer number of voxels, there are a few UI issues due to Neuroglancer still treating the global coordinate space as having the voxel origin at the corner:

    • The global position display rounds down to the nearest integer, which doesn't do the right thing for your example.
    • Discrete navigation actions (like the arrow keys and the mouse wheel) snap to the center of a global coordinate space voxel, which in the OME coordinate space is the boundary of a voxel. Being on the boundary of a voxel is problematic for rendering cross sections because it is not well defined which of the two adjacent cross sections is displayed.
  • If we faithfully follow the OME coordinate space, then the base scale of an OME multiscale array (i.e. with zero translation) will be translated by half a voxel relative to how Neuroglancer would display the base scale as a plain zarr array. I think this would be quite unfortunate and confusing.

  • Instead, in the commit linked above, I translate the OME coordinate space by a half voxel in the base resolution so that it works well with Neuroglancer's UI. This means Neuroglancer's coordinate space is translated relative to the OME coordinate space.
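
For concreteness, here is a rough sketch of the arithmetic this implies (illustrative Python only, not the actual implementation; the function name and the faithful flag are made up for this example):

def corner_translation(scale, translation, base_scale, faithful=False):
    # Pixel-center convention: sample i sits at scale * i + translation,
    # so the corner of its box is at scale * i + translation - scale / 2.
    t = translation - scale / 2
    if not faithful:
        # The approach described above: shift the whole OME coordinate space
        # by half a base-resolution voxel so the base scale keeps its corner
        # at its stated translation.
        t += base_scale / 2
    return t

# With the metadata from this issue (base_scale = 1.0):
for name, s, t in [("s0", 1.0, 0.0), ("s1", 2.0, 0.5), ("s2", 4.0, 1.5)]:
    print(name, corner_translation(s, t, base_scale=1.0))
# All three levels map to a corner translation of 0.0, so they align.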

I'm inclined to think that the current solution is the best option, but due to these issues I haven't pushed it to the main branch yet and I'm interested in feedback.

I suppose an alternative solution would be to extend Neuroglancer's coordinate space abstraction to optionally allow the origin to correspond to the center of a voxel. But I'm not sure how to fit that into the UI and I'd rather avoid the added complexity.

@d-v-b
Contributor Author

d-v-b commented Feb 17, 2023

The fix in the fix-440 branch solves my acute problem. Longer term, I would probably recommend pulling off the band-aid and switching to a pixel- / voxel-centered global coordinate system. But I don't know how laborious that would be, or whether it would break existing workflows.

@jbms
Collaborator

jbms commented Feb 17, 2023

Changing neuroglancer to use voxel center as origin by default or always would be a breaking change and therefore not something I'd want to do. Adding it as an option for each dimension of the global coordinate space is possible but I'm not sure the benefit would be worth the added complexity.

More generally, it seems to me that making the origin of a voxel its corner makes more sense for the imaging applications I'm familiar with, where we generally assume voxels correspond to an area roughly equal to their stride, rather than being point samples with infinitesimal spread.

The problem with choosing any particular solution here is that changing it later will be backwards incompatible.

@d-v-b
Contributor Author

d-v-b commented Feb 17, 2023

More generally, it seems to me that making the origin of a voxel its corner makes more sense for the imaging applications I'm familiar with, where we generally assume voxels correspond to an area roughly equal to their stride, rather than being point samples with infinitesimal spread.

"image samples are small squares / cubes" is a natural perspective for visualizing images, but "image samples are points" is more consistent with the actual process of imaging -- a camera produces an array of values per frame, none of which have any area or volume properties, and a point detector like those used in raster scanning microscopy or SEM returns a stream of values that only has a geometrical interpretation when combined with knowledge of how the excitation beam was scanned, and the conventional raster scan is just one of many possible scan trajectory, only a tiny subset of which are grid-like.

Assuming that images are made of squares / cubes also makes it hard to represent images sampled on an irregular grid, which is an inevitability in climate imaging (because the earth is a sphere but the detectors that image it are ~squares).

Also, if images are made of squares / cubes, it's impossible to conceive of an image with one sample, because the size of the square / cube is undefined, as there is no other sample to measure distance to. But if images are made of points, there's no problem with a single sample.

All that being said, I'm perfectly happy with the solution in fix-440. But I suspect the underlying issue will re-appear in the future.

@bogovicj

I recall from the discussions on various OME metadata that it was decided to use the center as the origin of a voxel, but when I looked at the specification just now I was not able to find anything about that.

That's right, and just as a heads up for what's coming: the spec will be clear that the center is the origin after ome/ngff#138 is merged.

The relevant part of the spec:
https://github.com/bogovicj/ngff/blob/coord-transforms/latest/index.bs#L440-L450

And the issue where discussion took place:
ome/ngff#89

@jbms
Collaborator

jbms commented Feb 17, 2023

More generally, it seems to me that making the origin of a voxel its corner makes more sense for the imaging applications I'm familiar with, where we generally assume voxels correspond to an area roughly equal to their stride, rather than being point samples with infinitesimal spread.

"image samples are small squares / cubes" is a natural perspective for visualizing images, but "image samples are points" is more consistent with the actual process of imaging -- a camera produces an array of values per frame, none of which have any area or volume properties, and a point detector like those used in raster scanning microscopy or SEM returns a stream of values that only has a geometrical interpretation when combined with knowledge of how the excitation beam was scanned, and the conventional raster scan is just one of many possible scan trajectory, only a tiny subset of which are grid-like.

Okay, but here we are explicitly trying to map the data back to physical space. And also we are necessarily dealing with a regular grid, since we are starting with a zarr array and then applying just a scale/translation transform. I also think that, in general, regardless of imaging method, you would want to choose your pixel spread to approximately match the stride; if it is smaller than the stride then you will get aliasing artifacts.

I also deal with segmentations a lot, where we can't do interpolation even if we wanted to, and so it is natural to just assume the pixel corresponds to a rectangular region.

Assuming that images are made of squares / cubes also makes it hard to represent images sampled on an irregular grid, which is an inevitability in climate imaging (because the earth is a sphere but the detectors that image it are ~squares).

In this case you would need an entirely different transformation from raw array to physical space, though. I don't know much about geospatial imaging, but I think it would be reasonable to choose coordinate spaces for microscopy that make sense for microscopy without being too concerned about the geospatial domain.

Also, if images are made of squares / cubes, it's impossible to conceive of an image with one sample, because the size of the square / cube is undefined, as there is no other sample to measure distance to. But if images are made of points, there's no problem with a single sample.

Not sure I understand this. Let's say we have a 1x1 pixel image, where the detection diameter is approximately 4nm. Then we might reasonably render this as a 4nm by 4nm square.

All that being said, I'm perfectly happy with the solution in fix-440. But I suspect the underlying issue will re-appear in the future.

Let's say you have a zarr array of size 10x10 and you specify a scale of [5nm, 5nm]. Some possible behaviors in Neuroglancer are:

  1. Render this as [0nm, 50nm] × [0nm, 50nm] (the patch above, the existing behavior, and also the behavior of the plain zarr data source if you manually adjust the source units)
  2. Render this as [-2.5nm, 47.5nm] × [-2.5nm, 47.5nm]
  3. Render this as [0nm, 45nm] × [0nm, 45nm]

Option 1 works well with Neuroglancer as is, but does not match the OME spec.

Options 2 and 3 match the OME spec but have UI problems in Neuroglancer currently, and don't match what a plain zarr data source does.

Option 3 is consistent with your comment that if we have a single sample we can't visualize it, but I don't think it is what users normally expect or desire from image visualization software.
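
To make the three options concrete, the per-dimension extents above can be written out directly (illustrative Python, values in nm):

n, scale = 10, 5.0
option_1 = (0.0, n * scale)                 # corner origin: [0, 50]
option_2 = (-scale / 2, (n - 0.5) * scale)  # center origin, full voxel extent: [-2.5, 47.5]
option_3 = (0.0, (n - 1) * scale)           # center origin, center-to-center extent: [0, 45]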

@d-v-b
Contributor Author

d-v-b commented Feb 18, 2023

Not sure I understand this. Let's say we have a 1x1 pixel image, where the detection diameter is approximately 4nm. Then we might reasonably render this as a 4nm by 4nm square.

And what are the X and Y axes that define the extent of this square? How would you tell if it was upside down? The sample itself is silent on this. You might also reasonably render the sample as a disk with a 4nm radius, or as a hexagon, or as a Gaussian blur with a FWHM of 4nm, but these are decisions you make when rendering it. There is no right or wrong way -- the geometry of the glyph we use to depict the sample is not something determined by the sample itself.

Regarding these three options:

  1. Render this as [0nm, 50nm] × [0nm, 50nm] (the patch above, the existing behavior, and also the behavior of the plain zarr data source if you manually adjust the source units)
  2. Render this as [-2.5nm, 47.5nm] × [-2.5nm, 47.5nm]
  3. Render this as [0nm, 45nm] × [0nm, 45nm]

Option 2 seems the most sensible to me; it's what matplotlib and 3D Slicer do. I'm not sure I understand the issue with the plain zarr data source, but if it's extremely important that arrays without any coordinate information have pixel renderings that start at (0, 0, 0), then you could make the default transform include an offset of (0.5, 0.5, 0.5) 🤷. If the user hasn't supplied any coordinates, then it's up to you to define where the data gets rendered. But I don't know how much work it would take to implement this. Maybe it's prohibitive, in which case a hack is fine, but I'm sure this problem will arise again.

edit: no "point vs square pixel" debate is complete without a link to the classic (anti-square) screed on this topic: http://alvyray.com/Memos/CG/Microsoft/6_pixel.pdf

@jbms jbms closed this as completed in 6f2439e Mar 6, 2023
@jbms jbms reopened this Mar 7, 2023
@jbms
Collaborator

jbms commented Mar 7, 2023

I actually plan to implement a better solution than 6f2439e:

  • Neuroglancer will properly respect the OME-zarr coordinate space
  • Neuroglancer will auto-detect whether it should snap coordinates to integer or integer + 0.5 depending on whether the lower and upper bounds for the dimension are integer or integer + 0.5.
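
A rough sketch of that snapping heuristic (illustrative Python only; the function and its signature are not Neuroglancer's API):

import math

def snap_to_voxel_center(x, lower, upper):
    # If the dimension's bounds are half-integers (pixel-center convention,
    # e.g. [-0.5, 9.5]), voxel centers lie on integers; if the bounds are
    # integers (pixel-corner convention, e.g. [0, 10]), centers lie on
    # integer + 0.5.
    if lower % 1 == 0.5 and upper % 1 == 0.5:
        return float(round(x))
    return math.floor(x) + 0.5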

@jbms
Collaborator

jbms commented Mar 7, 2023

The better solution is now implemented:
8240a91

@jbms jbms closed this as completed Mar 7, 2023