-
Notifications
You must be signed in to change notification settings - Fork 51
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Bug fix] fix get_dicom_info.pl so that it skips the scanner reports #371
[Bug fix] fix get_dicom_info.pl so that it skips the scanner reports #371
Conversation
…marizing the files so that dcm2mnc does not crash
…or dcm2mnc so that it only provides to the script DICOM medical imaging data
dicom-archive/get_dicom_info.pl
Outdated
@@ -132,6 +132,8 @@ =head2 Methods | |||
my $dicom = DICOM->new(); | |||
$dicom->fill($filename); | |||
|
|||
next if ($dicom->value('0008','103E') eq 'PhoenixZIPReport'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't checking for the presence of header (7fe0,0010) # Pixel data
be more robust? Couldn't there be other types of reports (like AtlantaZIPReport, if that even exists) that could be falsely reported as being DICOM files?
Also, I think the code that checks if a file is DICOM should be centralized in subroutine isDicom
of ImagingUpload.pm
. I suggest modifying the existing code in that library and adding your additional check in there. Also, subroutine isDicom
should be allowed to receive one or more file name arguments and return a hash array (key: file name, value: 0 or 1) that would tell whether a file is DICOM or not. The reason behind this is because isDicom
issues a Unix file
command to do its job and it's better to issue one big file
command with a lot of arguments, possibly using xargs
, instead of a truckload of file
commands with a single argument.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
@nicolasbrossard Thank you for the review. I have integrated your suggestion. Please, re-review when you get a chance. |
dicom-archive/get_dicom_info.pl
Outdated
my $dicom = DICOM->new(); | ||
$dicom->fill($filename); | ||
|
||
next unless ($dicom->value('7fe0','0010')); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry if I wasn't clear in my original review but I wanted to move this check inside the isDicom
method that you put in MRI.pm
. Maybe also rename isDicom
into isDicomImage
since DICOM files can be images, reports or something else. Makes sense?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ooops. I did not get that. Makes sense. Will move this to the isDicomImage function :)
- @non_dicom_files: array with the list of non-DICOM files | ||
|
||
=cut | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As noted in my previous comment, I think this method should be renamed and the check for the pixel data header added in the algorithm.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
uploadNeuroDB/NeuroDB/MRI.pm
Outdated
my @non_dicom_files; | ||
foreach my $line (@file_types) { | ||
my ($file, $type) = split(':', $line); | ||
push @dicom_files, $file if ($type =~ /DICOM medical imaging data$/); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the method is of the form is<something>
I'd have a tendency to return true or false for each file, depending on whether the file is DICOM or not:
$isDicom{$file} = $type =~ /DICOM medical imaging data$/ ? 1 : 0;
You'll have to declare my %isDicom
before the foreach
loop for this to work though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
uploadNeuroDB/NeuroDB/MRI.pm
Outdated
push @non_dicom_files, $file if (!$type =~ /DICOM medical imaging data$/); | ||
} | ||
|
||
return \@dicom_files, \@non_dicom_files; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Replaced by
return \%isDicom;
The code that calls this method will have to be replaced also.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
} | ||
} | ||
} | ||
my ($dicom_files, $non_dicom_files) = NeuroDB::MRI::isDicom(@file_list); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will have to change if you agree with my comments on isDicom
(see below).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@nicolasbrossard I integrated your suggestions to the code. Note that in order to be able to run the imaging pipeline on datasets with scanner reports I had to modify part of the code in ValidateCandidateInfo so that it gives a warning that there are non-image files instead of returning an error and 0 from that function.
dicom-archive/get_dicom_info.pl
Outdated
my $dicom = DICOM->new(); | ||
$dicom->fill($filename); | ||
|
||
next unless ($dicom->value('7fe0','0010')); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ooops. I did not get that. Makes sense. Will move this to the isDicomImage function :)
} | ||
} | ||
} | ||
my ($dicom_files, $non_dicom_files) = NeuroDB::MRI::isDicom(@file_list); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
- @non_dicom_files: array with the list of non-DICOM files | ||
|
||
=cut | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
uploadNeuroDB/NeuroDB/MRI.pm
Outdated
my @non_dicom_files; | ||
foreach my $line (@file_types) { | ||
my ($file, $type) = split(':', $line); | ||
push @dicom_files, $file if ($type =~ /DICOM medical imaging data$/); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
uploadNeuroDB/NeuroDB/MRI.pm
Outdated
push @non_dicom_files, $file if (!$type =~ /DICOM medical imaging data$/); | ||
} | ||
|
||
return \@dicom_files, \@non_dicom_files; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
uploadNeuroDB/NeuroDB/MRI.pm
Outdated
|
||
This method checks whether the list of files given as an argument is of type DICOM. | ||
This method checks whether the files given as an argument are of type DICOM. All |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mmmm. Isn't this doc outdated?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ooooooops
uploadNeuroDB/NeuroDB/MRI.pm
Outdated
|
||
=cut | ||
|
||
sub isDicom { | ||
sub isDicomImage { | ||
my (@files_list) = @_; | ||
|
||
my $cmd = "file " . join(' ', @files_list); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this command could fail for large file lists (we could exceed the maximum length a Unix command can have). Maybe we could change this to:
my $cmd = "ls @files_list | xargs file";
just to be safe. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good point!
@nicolasbrossard Ready for another review. Thanks! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
Description
The
dcm2mnc
conversion is currently failing on datasets that contains Siemens reports. This PR adds a check inget_dicom_info.pl
so that the script skips the scanner report.Also added a check so that only DICOM medical imaging data files are passed to
get_dicom_info.pl
when converting data to MINC files.