Extend precompressed support to read DICOM and write TIFF/OME-TIFF #4190

Merged: 14 commits merged into ome:develop from melissalinkert:dicom-reader-precompressed on Aug 26, 2024

Conversation

melissalinkert (Member):

Opening as a draft for now, as #4181 is higher priority. This may need some updates once #4181 is merged.

I would expect basic conversions such as bfconvert -noflat -precompressed -compression JPEG input.dcm output.ome.tiff to work, where the input DICOM is part of a whole slide dataset. Tile sizes should be correctly picked up without being specified in the bfconvert options.

A few outstanding things related to this PR and #4181:

/cc @fedorov, @dclunie

Only "plain" TIFF, OME-TIFF not supported yet.
Instead of requiring DICOM output and a pyramid. This makes sure that the
smaller resolutions of a pyramid are written tile-wise when writing OME-TIFF.
Different resolutions may have different tile sizes.
Mostly this reuses logic in the basic TIFF writer.
@imagesc-bot:

This pull request has been mentioned on Image.sc Forum. There might be relevant details there:

https://forum.image.sc/t/bfconverts-precompressed-option-for-ome-tif-output/96519/2

Commit: …ression type automatically

This doesn't yet completely support mixed compression types within the same dataset, but something like:

$ bfconvert -noflat -precompressed CMU-1.svs CMU-1.dcm

should run without error and convert the pyramid without recompressing.
Commit: Always perform tile-wise conversion if precompressed conversion was requested, independent of the format and the presence of an image pyramid.

@melissalinkert melissalinkert marked this pull request as ready for review August 9, 2024 20:57
@melissalinkert melissalinkert added this to the 8.0.0 milestone Aug 9, 2024
@sbesson (Member) left a comment:

In the spirit of #3992 and #4181, I used CMU-1.svs and CMU-1.ndpi as the source files for conversion.

For both files, the bfconvert utility built with this PR was executed:

  • using both OME-TIFF and DICOM as the target format
  • using -compression JPEG, -compression JPEG-2000 and -precompressed as the options

Additionally, the two DICOM datasets generated with JPEG and JPEG-2000 compression from CMU-1.ndpi were used as the source of a conversion from DICOM to OME-TIFF, using the same three compression options as above.

| Source file | Target | Precompressed | Compression | Conversion time | File size |
| --- | --- | --- | --- | --- | --- |
| CMU-1.svs | DICOM | no | JPEG | 229.057s | 109M |
| CMU-1.svs | DICOM | no | JPEG-2000 | 682.503s | 539M |
| CMU-1.svs | DICOM | yes | | 83.013s | 177M |
| CMU-1.svs | OME-TIFF | no | JPEG | 152.064s | 247M |
| CMU-1.svs | OME-TIFF | no | JPEG-2000 | 518.381s | 735M |
| CMU-1.svs | OME-TIFF | yes | | 27.083s | 176M |
| CMU-1.ndpi | DICOM | no | JPEG | 113.465s | 127M |
| CMU-1.ndpi | DICOM | no | JPEG-2000 | 640.609s | 1.2G |
| CMU-1.ndpi | DICOM | yes | | 16.811s | 191M |
| CMU-1.ndpi | OME-TIFF | no | JPEG | 111.965s | 127M |
| CMU-1.ndpi | OME-TIFF | no | JPEG-2000 | 620.038s | 1.3G |
| CMU-1.ndpi | OME-TIFF | yes | | 6.852s | 191M |
| CMU-1_0_0.dcm | OME-TIFF | no | JPEG | 137.521s | 127M |
| CMU-1_0_0.dcm | OME-TIFF | no | JPEG-2000 | 1150.285s | 1.3G |
| CMU-1_0_0.dcm (JPEG) | OME-TIFF | yes | | 6.654s | 127M |
| CMU-1_0_0.dcm (JPEG-2000) | OME-TIFF | yes | | 16.244s | 1.2G |
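
For reference, each row corresponds to a bfconvert invocation of roughly the following shape; this is only a sketch, the output file names are illustrative, and the JPEG-2000 rows substitute the writer's JPEG-2000 codec name for JPEG:

$ bfconvert -noflat -compression JPEG CMU-1.svs CMU-1.dcm
$ bfconvert -noflat -compression JPEG CMU-1.ndpi CMU-1.ome.tiff
$ bfconvert -noflat -precompressed CMU-1_0_0.dcm CMU-1-from-dicom.ome.tiff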

Overall, the conversion times and file sizes are consistent with the previous measurements made during the work on reading precompressed SVS and NDPI files. The speed-up associated with the -precompressed flag is comparable when writing OME-TIFF, and the absolute conversion times are typically smaller.

Trying to validate the binary data in a viewer, all converted images were found to look good except for the two OME-TIFF files created from the DICOM datasets without -precompressed, where the tiles appear to be incorrectly ordered:

(screenshot 2024-08-13: OME-TIFF converted from DICOM showing scrambled tiles)

Once the above is addressed, I would re-run the tests above for consistency, and I would propose to carry out a few additional functional tests (example commands are sketched after this list), including:

  • converting to TIFF rather than OME-TIFF (small images)
  • converting to other supported formats
  • mixing -precompressed / -compression flags
  • for TIFF/OME-TIFF, testing a few other compression flags
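
As a rough sketch, with illustrative file names and codec names as accepted by the TIFF/OME-TIFF writer, those follow-up tests could look like:

# plain TIFF rather than OME-TIFF, using a small input image
$ bfconvert -noflat -precompressed small-input.svs small-output.tiff
# mixing the -precompressed and -compression flags
$ bfconvert -noflat -precompressed -compression JPEG CMU-1.svs CMU-1.ome.tiff
# another compression type supported by the TIFF/OME-TIFF writer
$ bfconvert -noflat -compression LZW CMU-1.svs CMU-1.ome.tiff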

@@ -832,6 +842,12 @@ else if (saveTileWidth > 0 && saveTileHeight > 0) {
}
}

if (precompressed && FormatTools.canUsePrecompressedTiles(reader, writer, writer.getSeries(), writer.getResolution())) {
if (getReaderCodecName().startsWith("JPEG")) {
writer.setInterleaved(true);
@sbesson (Member):

Is there a reason to special-case here, or should we ensure that reader.isInterleaved() returns true appropriately so that

writer.setInterleaved(reader.isInterleaved() && !autoscale);

works correctly above?

@melissalinkert (Member Author):

SVSReader was the particular issue here; my concern was that swapping the return value of reader.isInterleaved is potentially a lot more invasive. When that value changes, what openBytes returns has to change, which means things like pixel hashes change, etc.

// the conversion must then be performed tile-wise to match the tile sizes,
// even if precompression doesn't end up being possible
if (precompressed) {
return true;
@sbesson (Member):

Definitely good to see instanceof checks removed. Are there specific tests that should be carried out here if the target format is neither DICOM nor TIFF/OME-TIFF?

@melissalinkert (Member Author):

See 29e8c05.

It's probably worth a couple of tests, maybe with fake data, for things like ICS. Keep in mind most writers do not allow tile-wise conversion at all (e.g. https://github.com/ome/bioformats/blob/develop/components/formats-bsd/src/loci/formats/out/APNGWriter.java#L84), and DICOM/TIFF/OME-TIFF are the only ones that would support pyramids.

@melissalinkert (Member Author):

> In the spirit of #3992 and #4181, I used CMU-1.svs and CMU-1.ndpi as the source files for conversion.
>
> Trying to validate the binary data in a viewer, all converted images were found to look good except for the two OME-TIFF files created from the DICOM datasets without -precompressed, where the tiles appear to be incorrectly ordered.

Currently investigating, but it looks like this issue is reproducible if the same DICOM -> OME-TIFF conversion is done with develop. It's still worth fixing here, but may not have been introduced by this PR.

@melissalinkert (Member Author):

I believe 29e8c05 fixes the scrambled tiles issue. As noted above, I could reproduce the problem both with this PR and with develop alone, using something like:

bfconvert -noflat -compression JPEG /path/to/cmu1_0_0.dcm cmu1.ome.tiff

However, specifying the tile sizes explicitly:

bfconvert -noflat -compression JPEG -tilex 512 -tiley 512 /path/to/cmu1_0_0.dcm cmu1.ome.tiff

no longer showed the problem. With 29e8c05, I would expect either command to result in a correct OME-TIFF.

@sbesson (Member) left a comment:

Reproducing the tests from #4190 (review) with the last commit, the issue raised when converting from DICOM to OME-TIFF without precompression is now addressed:

(screenshot 2024-08-14: the converted OME-TIFF now renders with correctly ordered tiles)

Timings and file sizes are consistent with the table in the previous review.

@sbesson (Member) left a comment:

In addition to the last review, I performed a few additional conversions using different input/output formats (fake, ICS, TIFF) with different source compressions and tile/compression options.

The first set of conversions used a fake file with XY dimensions 2000x2000, 10 focal points and 2 channels. All the conversions using TIFF as the source format used one of the TIFF files generated from the fake file.
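
For reference, such a fake file can be described entirely through its file name; this is a sketch assuming the standard name-encoded parameters (sizeX, sizeY, sizeZ for the focal points, sizeC) and the options used elsewhere in this thread, not the exact commands behind the table below:

$ bfconvert "test&sizeX=2000&sizeY=2000&sizeZ=10&sizeC=2.fake" test.ics
$ bfconvert -tilex 512 -tiley 512 -compression zlib "test&sizeX=2000&sizeY=2000&sizeZ=10&sizeC=2.fake" test.tiff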

| Source format | Source compression | Target format | Precompressed | Target tile size | Target compression | Conversion time | File size |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Fake | Uncompressed | ICS | no | | | 1.143s | 197M |
| Fake | Uncompressed | ICS | no | 512x512 | | 1.105s | 197M |
| Fake | Uncompressed | ICS | yes | | | failed | |
| Fake | Uncompressed | TIFF | no | | Uncompressed | 2.558s | 130M |
| Fake | Uncompressed | TIFF | no | | zlib | 5.001s | 18M |
| Fake | Uncompressed | TIFF | no | 512x512 | Uncompressed | 3.222s | 81M |
| Fake | Uncompressed | TIFF | no | 512x512 | zlib | 3.21s | 448K |
| Fake | Uncompressed | TIFF | no | 512x512 | JPEG | 3.889s | 2.5M |
| Fake | Uncompressed | TIFF | no | 512x512 | JPEG-2000 | 7.178s | 448K |
| TIFF | zlib | ICS | no | | | 1.952s | 77M |
| TIFF | zlib | ICS | no | 512x512 | | 1.065s | 105M |
| TIFF | zlib | ICS | yes | | | 1.074s | 105M |
| TIFF | zlib | TIFF | no | | Uncompressed | 2.436s | 77M |
| TIFF | zlib | TIFF | yes | | | failed | |
| TIFF | zlib | TIFF | no | 512x512 | JPEG | 3.724s | 2.5M |
| TIFF | JPEG | TIFF | no | | Uncompressed | 2.494s | 77M |
| TIFF | JPEG | TIFF | yes | | | failed | |
| TIFF | JPEG | TIFF | no | 512x512 | zlib | 3.26s | 546K |

The conversions that failed with the -precompressed flag did so with

java.lang.UnsupportedOperationException: Reader does not support pre-compressed tile access

which I believe is the expectation.
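
Using the fake file name sketched above, the failing case corresponds to an invocation along these lines (again only a sketch, not the exact command):

$ bfconvert -precompressed "test&sizeX=2000&sizeY=2000&sizeZ=10&sizeC=2.fake" test.ics
# java.lang.UnsupportedOperationException: Reader does not support pre-compressed tile access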

All generated files were validated visually with all planes/tiles being available for all coordinates.

From my side this is ready for inclusion in the next Bio-Formats release. Given the nature of the changes, this is definitely not suitable for a patch release, but it could be included in a release incrementing the minor or major version to communicate the backwards-compatible API additions.

In terms of documentation, I can see a few follow-up tasks including:

  • update the bfconvert utility help to list the formats/codecs supporting pre-compressed tiles, as we do for compression
  • update the user/technical documentation, especially https://bio-formats.readthedocs.io/en/latest/users/comlinetools/conversion.html, to remove the requirement to specify the compression format and to list the additional readers/writers/codecs supporting the -precompressed flag

In the mid-term, I wonder whether we'll want to maintain a table of formats/codecs that support the precompressed pathway, separately for reading and for writing.

@sbesson (Member) commented Aug 26, 2024:

Merging to fix conflicts with #4164 and unblock the performance work.

@sbesson sbesson merged commit 54cf1e8 into ome:develop Aug 26, 2024
18 checks passed
@melissalinkert melissalinkert deleted the dicom-reader-precompressed branch September 6, 2024 19:00