-
Notifications
You must be signed in to change notification settings - Fork 11
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ZarrReader: Add support for OME-XML from bf2raw #31
Conversation
…pport # Conflicts: # src/loci/formats/in/ZarrReader.java
A clarification:
Background information is available in ome/ngff#112 cc: @pwalczysko |
Testing with...
|
thank you @joshmoore . I think I undestand. The two directories "0" and "1" were the images in the two "Series" as defined by the bf2raw. But ZarrReader did not read the ome.xml file where the info which is BF usually using to decide whether this is a MIF or not and how to order it and name it. Thus, prior to this PR, the MIF OME-NGFFs were not imported as MIFs, you could only import single images ? And surely they were wrongly named, I saw that during testing. |
@pwalczysko : that all sounds correct, yes. |
This is possibly a topic for a separate issue but I am raising here as I suspect this will become apparent with the testing of non-HCS multi-image OME-NGFF. It would be good to write down our expectation for the naming of these images. The current historical logic for non-HCS imports to use the filename of the first file in the fileset (as detected by Bio-Formats), hence for OME-NGFF most images will be named |
This seems natural to me. I remember a discussion where this issue was raised even in pre-ngff times (i.e. this should be what BF is doing, but is not.. (was one voice of the discussion)). This seems to me now like the best solution. |
Image converted to NGFF with |
@willmoore after resetting the rendering settings to |
https://merge-ci.openmicroscopy.org/web/webclient/?show=image-230866 But, thumbnails are broken (for full-sized image 230866) or incorrect as above for the other images in the FileSet. |
From the logs when trying to save a thumbnail with the file above
It almost feels like the server only sees the full resolution of the WSI - which would match the slow user experience in the viewer. |
@dgault pointing
so it looks like an issue with the detection of sub-resolutions |
Cross-linking to existing links
Overriding the target image name via the importer is always a possibility although this presents obvious usability and scalability issue. It might be worth moving this discussion into a separate issue to limit the scope of the review for this particular PR. As a first question, what is the expectation for the default image name for these datasets under the various scenarios (single images, MIFs)? |
Pasting here a first start on the testing of comparisons of ome.xml files between pre-conversion ome.tiffs and post-conversion ome.zarrs. Testing done with
The content of the
This does not look to me like the zarr reader is missing something, all differences seem logical ? cc @sbesson @dgault . |
That looks rather promising to me. The differences there all seem to be expected or explainable
|
In the above output, I do not understand the The |
There is something I do not understand about the Maybe whatever ome.tiff inside the folder I pick, bf2raw will take all the other sections into account and produce the zarr out of all of them ? (eidt) |
So big in fact I cannot paste it here. I think this workflow needs a revision. Studying the output more closely, I can see 4 questions
Hard to judge, there are too many of them. Does that maybe bind with the point Ad 2. in this comment ?
Imported the zarr I created during the workflow into merge-ci user-8 https://merge-ci.openmicroscopy.org/web/webclient/?show=image-231015 . It looks good, no difference spotted when comparing with the ome.tiff https://outreach.openmicroscopy.org/webclient/?show=image-95214 |
Bug: When trying to create an ome.xml output from the BBBC plate, I got an error. Command and error see below
|
Testing the zarr from spatial-image/multiscale-spatial-image#32:
chunks are |
I was able to reproduce #31 (comment) independently by converting a test plate in the context of broadinstitute/lincs-cell-painting#54 (comment). I believe the issue is related to the handling of the |
Comparing the OME.XML from https://merge-ci.openmicroscopy.org/web/webclient/?show=image-230717 (converted to NGFF and imported) with that of the Original Image (in the same Dataset) I noticed that the |
@dgault Is the strange sorting of T indices reported by @will-moore something which was indicated already in the diff of #31 (comment) maybe ? The |
Here is the slightly truncated diff of xml files of ome.tiff vs ome.zarr of the Leica-1.ome.tiff from I am not sure about the handling of the RGB true or false difference between ome.tiff and ome.zarr, this feels wrong, the number of images is different as well. Further, the number of Resolutions differ between the tiff and zarr.
|
Another example of ordering of planes problem is the output of @will-moore 's script see below:
|
Initial thought while reading #31 (comment) is that the planes seem to be unordered in the XML output. This might be a feature of the exporter code which possible retrieves the plane metadata in no particular order. The downside is that it makes the XML (at least the |
Ok, a number of updates with the new commits, namely in the areas below: HCS data
Original metadata Plane ordering The plane ordering issue is a bit a messy, the below are the steps that are occuring here:
Although this is the most common scenario in which we would see problems, the list of planes could be in any given order and still be considered valid. This leads to a potentially messy attempt to reorder them or the hopefully simpler approach of trusting the original data. So with the last the commit the reader will now trust the original plane data to be correct and avoid any attempt to update those fields. As long as the associated OME-XML is valid for the dataset then this should be correct. Number of resolutions RGB Proposed 0.5 changes |
Retested and test passing. See user-8 https://merge-ci.openmicroscopy.org/web/webclient/?show=image-231365. Fresh reimport of the ome.zarr was necessary, but I guess this is expected. Good to go for my part of the tests. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the HCS front, tested the latest state of this PR with three different HCS samples to try and cover the different codepaths mentioned in #31 (comment)
- the
idr0104
plate mentioned in https://www.openmicroscopy.org/2020/12/01/zarr-hcs.html which is stored according to the OME-NGFF 0.1 specification without column/row index keys - a fake
test&plateRows=2&plateCols=2&fields=2.fake
converted usingbioformats2raw 0.4.0
(OME-NGFF 0.2 specification withcolumn_index/row_index
keys) andbioformats2raw 0.5.0-rc1
(OME-NGFF 0.4 specification withcolumnIndex/rowIndex
keys)
For the last 2 samples, two changes were required in the the acquisitions
key of the top-level .zattrs
- the
id
type had to be changed from a string to an integer - see Fix the type for acquisitions.id under the plate metadata glencoesoftware/bioformats2raw#152 for the relevant PR fixing the writer - a
maximumfieldcount
key had to be defined to avoid a NPE when populating the metadata
With these two changes, all three plates imported without issues with all images successfully loading. The OME-NGFF 0.4 plate in particular shows both original metadata as well as original metadata for each field of view confirming another feature introduced in this PR.
From my perspective, this is ready to merge and possibly release immediately. I'll open a follow-up PR relaxing the requirement on maximumfieldcount
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Testing today:
$ omero import -d 54419 --depth=10 438CTR_01_4_R3D_D3D.zarr/
...
Bioformats version: 6.10.0-SNAPSHOT revision: 4abd23de11919a93aa5a2cc8e8b41f2b8f144127 date: 16 May 2022
Comparing OME.xml with script (sorting Planes by Z/C/T) finds no diffs and metadata looks correct in webclient. 👍 LGTM.
Also retested 83843bc and confirmed that HCS OME-NGFF with |
For the next steps, two questions:
|
This pull request has been mentioned on Image.sc Forum. There might be relevant details there: https://forum.image.sc/t/intermission-ome-ngff-0-4-1-bioformats2raw-0-5-0-et-al/72214/1 |
This PR adds parsing for the METADATA.ome.xml file produced via bioformats2raw.
Currently the XML is parsed first and any conflicting data in the Zarr attributes will override the XML values.
To test this the best approach would be to take existing OME-TIFF samples, convert them using bf2raw and then compare the OME-XML output from the original to that from the ZarrReader. The ZarrReader is still currently dumping the ngff metadata as annotations, so you will see extra annotations in the converted version.