Improve NativeFrame memory footprint by storing data in a flat slice and supporting generic integer data representation. #315

suyashkumar · 2024-05-27T04:15:35Z

This change refactors the Go NativeFrame representation to vastly reduce the memory footprint (by 90% in some cases) and CPU usage (~30% in some cases). It does this by:

Storing data in a flat slice, instead of a nested one (e.g. [][]uint16). This helps because the 24 byte slice header for Go slices is very significant here for the inner small slices (e.g. when storing 1-3 uint16s, which is often the case). More details in my comment here: readNativeFrames Performance Optimizations #161 (comment)
Using generics to allow the Go NativeFrame Data slice use an appropriately sized Go integer based on the bitsAllocated (e.g. have the underlying slice be uint8, uint16, etc instead of always a full-size int).

This is an API breaking change. I'm not sure I love the INativeFrame API and naming, but introducing an interface likely makes it easier to interact with the NativeFrame, without forcing propagation of generic signatures everywhere. It also still preserves access to the underlying concrete integer slice if needed. Having a more efficient default NativeFrame representation is important, so lets get this in and iterate if needed before the next release.

This fixes #291 and addresses a big part of #161.

Benchmarks

See the PR GitHub Action here, with an example run from one of the runs screenshotted below:

Memory reduction:

CPU reduction:

… of cleanup required and things to explore, tests to fix, things to rename, etc

suyashkumar · 2024-05-27T04:28:53Z

initial PR benchmark shows some reduction in total B/op for the test dicoms.

suyashkumar · 2024-08-08T02:30:24Z

With the latest changes to only store a flat data array (in addition to storing them as Go integers of the proper size) we're seeing significant memory and cpu improvements over HEAD in the benchmark, as expected for most benchmarks!

Allocations:

CPU:

ducquangkstn · 2024-09-02T00:17:47Z

pkg/frame/native.go

+//	|
+//	▼
+//	Y
+func (n *NativeFrame[I]) GetPixel(x, y int) ([]int, error) {


Can we have an API to manipulate the pixel data?

Smt like SetPixel(x, y int, value []int)

Sure, sgtm! I can add that in, or feel free to send a PR if you'd like. Note that iiuc editing the data in the returned RawDataSlice() will also edit the data in the frame.

ducquangkstn · 2024-09-02T00:30:10Z

pkg/frame/frame.go

@@ -16,12 +16,12 @@ var ErrorFrameTypeNotPresent = errors.New("the frame type you requested is not p
 type CommonFrame interface {
 	// GetImage gets this frame as an image.Image. Beware that the underlying frame may perform
 	// some default rendering and conversions. Operate on the raw NativeFrame or EncapsulatedFrame
-	// if you need to do some custom rendering work or want the data from the dicom.
+	// if you need to do some custom rendering work or want the Data from the dicom.


Nit, we do need to either:

write an example program: convert a DICOM to a bitmap image.

update godoc + README

That will help other ppl understand how this library works.

Agreed, we should add some Go testable examples once the API is baked. The Godoc updates automatically, when a release is cut. (Note: while this was merged to main, it's not yet part of any release yet which is why the Go doc isn't updated. I intentionally haven't cut a release yet, since the next one will have a bunch of API changes, and I'd like to let them settle in first and solicit opinions before grouping everything together into a release).

Open to any PRs for Godoc updates also of course! Would appreciate any thoughts in general

ducquangkstn · 2024-09-02T00:35:16Z

pkg/frame/native.go

+	// A flattened slice is used instead of a nested 2D slice because there is
+	// significant overhead to creating nested slices in Go discussed here:
+	// https://github.com/suyashkumar/dicom/issues/161#issuecomment-2143627792.
+	RawData                 []I


Assuming one day, go have native support for JPEG2000, how you will adapt it into this project.

How we can allow users to decode DICOM, alter the image, and then write back to another DICOM files.

For the sustainable development of this library, do you think the data structure + interface are roburst to changes in the future.

Assuming there is a developer who want to write a plugin that decodes JPEG2000. How can we make our API open to that need.

Assuming one day, go have native support for JPEG2000, how you will adapt it into this project.

This is the NativeFrame construct, so this is only for representing Native PixelData that is encoded in binary format as various sizes of integers. And specifically, this is the container we parse that data into from the dicom. Encapsulated images encoded in JPEG2000 or other flavors of JPEG will not be represented directly in this format.

This is worth thinking about some more, but in the current API, any decoding (including JPEG/JPEG2000) of raw data would occur in EncapsulatedFrame.GetImage() which is the abstraction layer function for returning a standard Golang image.Image. However, image.Image may itself be an imperfect representation as it may be lossy in terms of precision for some kinds of image data that dicoms can represent to higher precision. So this is something we should dig into a bit deeper imo.

Assuming there is a developer who want to write a plugin that decodes JPEG2000. How can we make our API open to that need.

Very good question, as I know you're thinking a bit about this. I'm very open to thoughts and suggestions btw. The first thing we need to decide: is image.Image a suitable decoding target? If yes, then the answer would be that a contribution should be made to add any peeking + decoding logic to EncapsulatedFrame.GetImage(). If we want to allow users to register their own decoders, we can call the appropriate decoding logic here in this function based on global registrations seen so far.

One idea that might be worth thinking about--should we have a better intermediate decoding target other than image.Image? e.g. some structure holding []T along with some rows, cols, etc (something similar to NativeFrame)?

For the sustainable development of this library, do you think the data structure + interface are roburst to changes in the future.

I think this is a reasonable way to represent Native PixelData internally, mostly because this is how Native PixelData is modeled in a DICOM, and now we've correctly allowed for generic types to be used for storage (though we can expand the generic envelope here some more for signed ints, float PixelData, etc). Most folks will not need to interact with this directly though, and will interact via the NativeFrame's helper methods, where we can add additional layers of abstraction and ergonomics as needed. What do you think? Very open to changes and feedback, and don't think this is a final version or anything :).

How we can allow users to decode DICOM, alter the image, and then write back to another DICOM files.

This workflow needs improvement for sure. This is possible to do today very manually, but I definitely think we should allow for helper methods/packages/examples as makes sense to help with key parts of this process. Right now, you can call GetImage() on any frame.CommonFrame and get an image.Image. You can then manipulate that image.Image however you see fit, then encode it (manually, using system libraries) to binary bytes and drop it back into an EncapsulatedFrame, then write it out later. Some pieces here are missing, including setting the transfer syntax for the updated image if it's done in a different way.

Some thoughts on TODOs:

Determine if we want image.Image to be the Golang abstraction for working with images

Consider storing the transfer syntax uid on frames internally, and consult that later when writing the data back out

Write helper function to: ToFrame(img image.Image, outputTransferSyntaxUID string, frameMetadata ...) frame.Frame (the inverse of GetImage). Users can call something like this to create a new frame that can be added to a PixelDataInfo to be written out in a Dataset. Maybe we can have a common frameMetadata structure shared across frames to make life easy. Users would still need to take some care to edit other associated tags if needed for their use case (e.g. windowWidth, windowlevel, etc).

Anyway sorry for the long comment! Just some off the cuff thoughts.

suyashkumar added 3 commits May 26, 2024 23:59

Initial hacking to explore multi-value native pixel data--still a lot…

77468eb

… of cleanup required and things to explore, tests to fix, things to rename, etc

passing tests (though more needed)

a69d754

go mod tidy

01880ce

suyashkumar mentioned this pull request May 27, 2024

readNativeFrames Performance Optimizations #161

Open

suyashkumar self-assigned this May 28, 2024

suyashkumar added 3 commits June 1, 2024 19:13

try populating slice inline

7e8d5f7

Store Native pixel data in flat slice

eb8fb2b

Merge branch 'main' into s/pixeldata-rep

680f125

suyashkumar force-pushed the s/pixeldata-rep branch from c442e59 to 680f125 Compare August 8, 2024 02:22

suyashkumar mentioned this pull request Aug 8, 2024

Memory usage (sometimes 15x from the size of dicom file) #291

Closed

suyashkumar added 2 commits August 8, 2024 22:28

Add comments, NativeFrame tests

4630689

More comments, cleanup

a69ec44

suyashkumar changed the title ~~[work in progress / hacking] Support multi-go-type pixeldata~~ Improve NativeFrame data representation, leading to massive memory and CPU improvements! Aug 9, 2024

suyashkumar changed the title ~~Improve NativeFrame data representation, leading to massive memory and CPU improvements!~~ Improve NativeFrame representation, leading to massive memory and CPU improvements! Aug 9, 2024

suyashkumar marked this pull request as ready for review August 9, 2024 02:45

suyashkumar added 4 commits August 8, 2024 23:03

add some equals tests

87f9dc9

comment update

e598a0e

add uint32 read test

a667e8e

fix typo

f0126e8

suyashkumar changed the title ~~Improve NativeFrame representation, leading to massive memory and CPU improvements!~~ Improve NativeFrame representation by storing data in a flat slice and supporting multi-go-type NativeFrame data. Aug 9, 2024

suyashkumar changed the title ~~Improve NativeFrame representation by storing data in a flat slice and supporting multi-go-type NativeFrame data.~~ Improve NativeFrame representation by storing data in a flat slice and supporting multi-go-type data. Aug 9, 2024

suyashkumar changed the title ~~Improve NativeFrame representation by storing data in a flat slice and supporting multi-go-type data.~~ Improve NativeFrame memory footprint by storing data in a flat slice and supporting multi-go-type data. Aug 9, 2024

suyashkumar changed the title ~~Improve NativeFrame memory footprint by storing data in a flat slice and supporting multi-go-type data.~~ Improve NativeFrame memory footprint by storing data in a flat slice and supporting generic integer data slices. Aug 9, 2024

suyashkumar changed the title ~~Improve NativeFrame memory footprint by storing data in a flat slice and supporting generic integer data slices.~~ Improve NativeFrame memory footprint by storing data in a flat slice and supporting generic integer data representation. Aug 9, 2024

suyashkumar merged commit cb3864e into main Aug 9, 2024
4 checks passed

suyashkumar mentioned this pull request Aug 13, 2024

Utils function for read/write value from unprocessed data #306

Open

suyashkumar mentioned this pull request Aug 29, 2024

Support for RGB for pixeldata #336

Open

ducquangkstn reviewed Sep 2, 2024

View reviewed changes

suyashkumar mentioned this pull request Sep 29, 2024

dicom.Write fails on 'SmallestImagePixelValue' with VR of 'US' #190

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Improve NativeFrame memory footprint by storing data in a flat slice and supporting generic integer data representation. #315

Improve NativeFrame memory footprint by storing data in a flat slice and supporting generic integer data representation. #315

suyashkumar commented May 27, 2024 •

edited

Loading

suyashkumar commented May 27, 2024 •

edited

Loading

suyashkumar commented Aug 8, 2024

ducquangkstn Sep 2, 2024

suyashkumar Sep 2, 2024

ducquangkstn Sep 2, 2024

suyashkumar Sep 2, 2024 •

edited

Loading

ducquangkstn Sep 2, 2024

suyashkumar Sep 2, 2024

Improve NativeFrame memory footprint by storing data in a flat slice and supporting generic integer data representation. #315

Improve NativeFrame memory footprint by storing data in a flat slice and supporting generic integer data representation. #315

Conversation

suyashkumar commented May 27, 2024 • edited Loading

Benchmarks

suyashkumar commented May 27, 2024 • edited Loading

suyashkumar commented Aug 8, 2024

ducquangkstn Sep 2, 2024

Choose a reason for hiding this comment

suyashkumar Sep 2, 2024

Choose a reason for hiding this comment

ducquangkstn Sep 2, 2024

Choose a reason for hiding this comment

suyashkumar Sep 2, 2024 • edited Loading

Choose a reason for hiding this comment

ducquangkstn Sep 2, 2024

Choose a reason for hiding this comment

suyashkumar Sep 2, 2024

Choose a reason for hiding this comment

suyashkumar commented May 27, 2024 •

edited

Loading

suyashkumar commented May 27, 2024 •

edited

Loading

suyashkumar Sep 2, 2024 •

edited

Loading