Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow option to tolerate PixelData VL and expected length mismatches. #245

Merged
merged 8 commits into from
Sep 7, 2022

Conversation

ducquangkstn
Copy link
Contributor

Sometimes, the VL of pixel data does not match with calculated value. (I can not provide the sample file to test this)

  • In this case, we want to return an error rather than continue to parse and returns an EOF error (very hard to debug why)
  • This MR also adds a flag to allow ppl to skip the pixelData and mark it as Encapsulated.

@ducquangkstn ducquangkstn marked this pull request as ready for review July 27, 2022 09:24
Copy link
Owner

@suyashkumar suyashkumar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi there, thanks for the PR! Had some comments for you to consider.

Can you also share more about your use case? In particular, if you encounter a dicom like this with this ParseOption enabled, do you intend to use that malformed data in some way?

Or, if you wish to generally ignore these errors, would it be possible for you to simply inspect this error from Parse and then ignore it in your business logic? Since the main pixeldata should be the last element anyway, you'd still have the whole data set (except in cases of thumbnails [and in those cases you could consider parsing again with an ignore pixel data flag, which is being built in another PR]).

e.g.

ds, err := Parse(...)
if errors.Is(err, ErrorMismatchPixelDataLength) {
    // Do your custom logic, or ignore, depending on what your usecase needs.
}

Anyway, just thought I'd get a better sense of your use case. Thanks!

element.go Outdated Show resolved Hide resolved
parse.go Outdated Show resolved Hide resolved
parse.go Outdated Show resolved Hide resolved
read.go Outdated Show resolved Hide resolved
read.go Outdated
Comment on lines 292 to 297
f := frame.Frame{
Encapsulated: true,
EncapsulatedData: frame.EncapsulatedFrame{
Data: data,
},
}
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I like this model of parsing is as encapsulated Frame when it isn't actually supposed to be an encapsulated frame. Like trying to use this data as an encapsulated frame wouldn't work as expected later. Maybe we need some kind of error flag to indicate this was a fallback in the frame element.

Can you tell me more about how you need to access this data in your use case? Like if you hit a dicom with this error condition, do you just want to skip doing any downstream work with the pixel data? Or mark the pixel data as unusable or something? Because iiuc in the current model you have no way of distinguishing this case from a case where there was just a valid encapsulated frame parsed (if your option is set to true).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added another variable at dd6ecec (IsVLMismatch).

  • Without AllowMismatchPixelDataLength, if the VL mismatches calculated BytesToRead, the library returns an error.
  • With AllowMismatchPixelDataLength, the library return a Frame with VLMismatch=true indicates that the pixel data is unusable.

read.go Outdated
@@ -269,10 +266,52 @@ func readNativeFrames(d dicomio.Reader, parsedData *Dataset, fc chan<- *frame.Fr

debug.Logf("readNativeFrames:\nRows: %d\nCols:%d\nFrames::%d\nBitsAlloc:%d\nSamplesPerPixel:%d", MustGetInts(rows.Value)[0], MustGetInts(cols.Value)[0], nFrames, bitsAllocated, samplesPerPixel)

bytesAllocated := bitsAllocated / 8
bytesRead = bytesAllocated * samplesPerPixel * pixelsPerFrame * nFrames
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should call this variable bytesToRead in this context. By moving it earlier, I do think it changes what it was returning (previously presumably it was returning 0 for a lot of the return statements below), but I don't think this is used much anyway.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When we parse the broken DICOM files, the library reads the data in the stream by bytesToRead , so it doesn't compare VL to bytesToRead and returns an EOF error, which is not really useful.

There is some case when the length remaining bytes is smaller than bytesToRead. So in general, this change improves better error message.

read.go Outdated Show resolved Hide resolved
read_test.go Outdated Show resolved Hide resolved
read_test.go Show resolved Hide resolved
parse.go Outdated Show resolved Hide resolved
@ducquangkstn
Copy link
Contributor Author

Anyway, just thought I'd get a better sense of your use case. Thanks!

We have a DICOM which the PixelData under tag Icon Image Sequence is broken.
The main PixelData is fine

import pydicom
dcm = pydicom.dcmread('...')
dcm.pixel_array
array([[1899, 1927, 1875, ..., 4889, 5041, 5266],
       [1906, 1926, 1882, ..., 4865, 5072, 5240],
       [1923, 1907, 1865, ..., 4977, 5050, 5171],
       ...,
       [2288, 2319, 2271, ..., 6861, 6835, 6822],
       [2330, 2358, 2277, ..., 6888, 6835, 6848],
       [2337, 2359, 2301, ..., 6861, 6848, 6749]], dtype=uint16)

dcm.IconImageSequence[0]
(0028, 0002) Samples per Pixel                   US: 1
(0028, 0004) Photometric Interpretation          CS: 'MONOCHROME2'
(0028, 0010) Rows                                US: 64
(0028, 0011) Columns                             US: 64
(0028, 0100) Bits Allocated                      US: 8
(0028, 0101) Bits Stored                         US: 8
(0028, 0102) High Bit                            US: 7
(0028, 0103) Pixel Representation                US: 0
(0028, 1040) Pixel Intensity Relationship        CS: 'LOG'
(7fe0, 0010) Pixel Data                          OB: Array of 1976 elements

dcm.IconImageSequence[0].pixel_array
error

My use-case is reading the DICOM and just ignoring the pixelData inside IconImageSequence

Copy link
Owner

@suyashkumar suyashkumar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! I think this is almost ready, had some comments for your consideration.

Comment on lines 23 to 24
// IsVLMismatch indicates if the pixel-data VL mismatches with calculated value
IsVLMismatch() bool
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we just have something more generic like an Error field on the frame?

Suggested change
// IsVLMismatch indicates if the pixel-data VL mismatches with calculated value
IsVLMismatch() bool
// ParsingErr indicates if there was an error when reading this Frame from the DICOM. If this is set, this means fallback behavior was triggered to blindly write the PixelData bytes to an encapsulated frame. The ParsingErr will contain details about the specific error encountered.
ParsingErr error

You can then also have an exported sentinel error somewhere (perhaps in the frame package or where you think it fits best) ErrVLMismatch.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and also, would it make sense for this to belong on the PixelDataInfo itself instead of on the frame?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated at 95a533f

read.go Outdated Show resolved Hide resolved
read.go Outdated Show resolved Hide resolved
read.go Outdated Show resolved Hide resolved
read.go Outdated Show resolved Hide resolved
@ducquangkstn
Copy link
Contributor Author

@suyashkumar Pls also approve workflow for enable Github CI for this PR.

Copy link
Owner

@suyashkumar suyashkumar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this looks good to me, once the final comments are addressed. I have a couple follow up PRs I may send to build on this as well. Thank you for your contribution!!

element.go Outdated Show resolved Hide resolved
read.go Outdated Show resolved Hide resolved
@@ -23,6 +23,8 @@ var ErrorElementNotFound = errors.New("element not found")
// within this Dataset (including Elements nested within Sequences).
type Dataset struct {
Elements []*Element `json:"elements"`

opts parseOptSet
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can take care of this in another PR, but just putting a note for myself: I think if we're going to add this set of options to Dataset, it makes sense that any person who holds a Dataset should be able to inspect the opts ...ParseOption the Dataset was parsed with. Typically I don't like just tacking on options to another struct to "pass it through" but in this scenario I think this makes sense because it seems useful for anyone who has a Dataset to see the options it was parsed with.

parse.go Outdated Show resolved Hide resolved
parse.go Outdated Show resolved Hide resolved
@suyashkumar suyashkumar merged commit 58fd583 into suyashkumar:main Sep 7, 2022
@suyashkumar
Copy link
Owner

suyashkumar commented Sep 7, 2022

🎉 Merged! Thank you again!

@suyashkumar suyashkumar changed the title Fix: pixel data mismatch Allow option to tolerate PixelData VL and expected length mismatches. Sep 7, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants