Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug fix] fix get_dicom_info.pl so that it skips the scanner reports #371

Conversation

cmadjar
Copy link
Collaborator

@cmadjar cmadjar commented Jan 17, 2019

Description

The dcm2mnc conversion is currently failing on datasets that contains Siemens reports. This PR adds a check in get_dicom_info.pl so that the script skips the scanner report.

Also added a check so that only DICOM medical imaging data files are passed to get_dicom_info.pl when converting data to MINC files.

…marizing the files so that dcm2mnc does not crash
…or dcm2mnc so that it only provides to the script DICOM medical imaging data
@@ -132,6 +132,8 @@ =head2 Methods
my $dicom = DICOM->new();
$dicom->fill($filename);

next if ($dicom->value('0008','103E') eq 'PhoenixZIPReport');
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't checking for the presence of header (7fe0,0010) # Pixel data be more robust? Couldn't there be other types of reports (like AtlantaZIPReport, if that even exists) that could be falsely reported as being DICOM files?

Also, I think the code that checks if a file is DICOM should be centralized in subroutine isDicom of ImagingUpload.pm. I suggest modifying the existing code in that library and adding your additional check in there. Also, subroutine isDicom should be allowed to receive one or more file name arguments and return a hash array (key: file name, value: 0 or 1) that would tell whether a file is DICOM or not. The reason behind this is because isDicom issues a Unix file command to do its job and it's better to issue one big file command with a lot of arguments, possibly using xargs, instead of a truckload of file commands with a single argument.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

uploadNeuroDB/NeuroDB/MRIProcessingUtility.pm Show resolved Hide resolved
@cmadjar
Copy link
Collaborator Author

cmadjar commented Jan 21, 2019

@nicolasbrossard Thank you for the review. I have integrated your suggestion. Please, re-review when you get a chance.

my $dicom = DICOM->new();
$dicom->fill($filename);

next unless ($dicom->value('7fe0','0010'));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry if I wasn't clear in my original review but I wanted to move this check inside the isDicom method that you put in MRI.pm. Maybe also rename isDicom into isDicomImage since DICOM files can be images, reports or something else. Makes sense?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ooops. I did not get that. Makes sense. Will move this to the isDicomImage function :)

- @non_dicom_files: array with the list of non-DICOM files

=cut

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As noted in my previous comment, I think this method should be renamed and the check for the pixel data header added in the algorithm.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

my @non_dicom_files;
foreach my $line (@file_types) {
my ($file, $type) = split(':', $line);
push @dicom_files, $file if ($type =~ /DICOM medical imaging data$/);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the method is of the form is<something> I'd have a tendency to return true or false for each file, depending on whether the file is DICOM or not:

$isDicom{$file} = $type =~ /DICOM medical imaging data$/ ? 1 : 0;

You'll have to declare my %isDicom before the foreach loop for this to work though.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

push @non_dicom_files, $file if (!$type =~ /DICOM medical imaging data$/);
}

return \@dicom_files, \@non_dicom_files;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replaced by

return \%isDicom;

The code that calls this method will have to be replaced also.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

}
}
}
my ($dicom_files, $non_dicom_files) = NeuroDB::MRI::isDicom(@file_list);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will have to change if you agree with my comments on isDicom (see below).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Copy link
Collaborator Author

@cmadjar cmadjar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nicolasbrossard I integrated your suggestions to the code. Note that in order to be able to run the imaging pipeline on datasets with scanner reports I had to modify part of the code in ValidateCandidateInfo so that it gives a warning that there are non-image files instead of returning an error and 0 from that function.

my $dicom = DICOM->new();
$dicom->fill($filename);

next unless ($dicom->value('7fe0','0010'));
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ooops. I did not get that. Makes sense. Will move this to the isDicomImage function :)

}
}
}
my ($dicom_files, $non_dicom_files) = NeuroDB::MRI::isDicom(@file_list);
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

- @non_dicom_files: array with the list of non-DICOM files

=cut

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

my @non_dicom_files;
foreach my $line (@file_types) {
my ($file, $type) = split(':', $line);
push @dicom_files, $file if ($type =~ /DICOM medical imaging data$/);
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

push @non_dicom_files, $file if (!$type =~ /DICOM medical imaging data$/);
}

return \@dicom_files, \@non_dicom_files;
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done


This method checks whether the list of files given as an argument is of type DICOM.
This method checks whether the files given as an argument are of type DICOM. All
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mmmm. Isn't this doc outdated?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ooooooops


=cut

sub isDicom {
sub isDicomImage {
my (@files_list) = @_;

my $cmd = "file " . join(' ', @files_list);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this command could fail for large file lists (we could exceed the maximum length a Unix command can have). Maybe we could change this to:

my $cmd = "ls @files_list | xargs file";

just to be safe. What do you think?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good point!

@cmadjar
Copy link
Collaborator Author

cmadjar commented Jan 23, 2019

@nicolasbrossard Ready for another review. Thanks!

Copy link
Collaborator

@nicolasbrossard nicolasbrossard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@nicolasbrossard nicolasbrossard merged commit 7dad428 into aces:20.2-dev Jan 23, 2019
@cmadjar cmadjar deleted the fix_get_dicom_info_to_skip_Siemens_report branch July 22, 2020 17:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants