Reference Compatibility tool #7959

Merged: 6 commits, Jul 28, 2022

@orlicohen (Contributor) commented Jul 21, 2022

Tool to determine compatibility of a provided BAM/CRAM/VCF against provided references.

codecov bot commented Jul 26, 2022

Codecov Report

Merging #7959 (42fac0f) into master (64f9953) will decrease coverage by 0.362%.
The diff coverage is 93.478%.

@@               Coverage Diff               @@
##              master     #7959       +/-   ##
===============================================
- Coverage     87.062%   86.700%   -0.362%     
- Complexity     37007     38450     +1443     
===============================================
  Files           2218      2310       +92     
  Lines         173758    180344     +6586     
  Branches       18769     19841     +1072     
===============================================
+ Hits          151277    156358     +5081     
- Misses         15896     17042     +1146     
- Partials        6585      6944      +359     
Impacted Files Coverage Δ
...ls/reference/CompareReferencesIntegrationTest.java 77.465% <ø> (ø)
...tute/hellbender/tools/reference/ReferencePair.java 89.744% <50.000%> (+3.257%) ⬆️
...bender/tools/reference/ReferenceSequenceTable.java 87.097% <90.909%> (+1.786%) ⬆️
...r/tools/reference/CheckReferenceCompatibility.java 93.137% <93.137%> (ø)
...ce/CheckReferenceCompatibilityIntegrationTest.java 93.548% <93.548%> (ø)
.../hellbender/tools/reference/CompareReferences.java 66.667% <100.000%> (ø)
...ools/reference/ReferenceSequenceTableUnitTest.java 98.230% <100.000%> (+0.358%) ⬆️
...ynumber/models/AlleleFractionGlobalParameters.java 76.471% <0.000%> (-23.529%) ⬇️
...nstitute/hellbender/engine/filters/ReadFilter.java 83.784% <0.000%> (-16.216%) ⬇️
...er/tools/walkers/annotator/StrandBiasBySample.java 72.000% <0.000%> (-6.261%) ⬇️
... and 181 more

@jamesemery (Collaborator) left a comment

Overall I think this is in good shape. A bunch of minor comments and some medium-sized comments about how the tool is structured.

oneLineSummary = "",
programGroup = ReferenceProgramGroup.class
)

Collaborator

Add @DocumentedFeature and @ExperimentalFeature.


@CommandLineProgramProperties(
summary = "",
oneLineSummary = "",
Collaborator

Fill in the summaries.


private void initializeSequenceDictionaryForInput(){
// BAMs/CRAMs
if(hasReads() ^ vcfPath != null){
Collaborator

There needs to be a check here that fails if both are specified, at least for now. Currently this falls off the bottom with empty inputs, I think.

Collaborator

I would rework the if statements here to first check that the inputs are valid, then handle reads, then handle VCFs, checking explicitly each time. Then put a check at the bottom that throws an exception if nothing got set for some reason (indicating that there was no input).
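
The restructuring suggested above could look roughly like the following. This is only a sketch under assumptions, not the tool's actual code: `InputSelectionSketch`, `InputKind`, `selectInput`, and the plain `String` paths are hypothetical stand-ins, and `IllegalArgumentException` stands in for `UserException.BadInput` so the snippet compiles without GATK on the classpath.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class InputSelectionSketch {
    /** Hypothetical label for the chosen input; not a GATK type. */
    enum InputKind { READS, VCF }

    /**
     * Validates the inputs first, then handles reads and VCFs with explicit checks,
     * and finally fails if nothing was selected. All names here are illustrative.
     */
    static InputKind selectInput(final List<String> readPaths, final String vcfPath) {
        final boolean hasReads = readPaths != null && !readPaths.isEmpty();
        final boolean hasVcf = vcfPath != null;

        // 1. Validate up front so that no invalid combination reaches the handlers.
        if (hasReads && hasVcf) {
            throw new IllegalArgumentException("Specify either a reads input or a VCF, not both.");
        }
        if (hasReads && readPaths.size() > 1) {
            throw new IllegalArgumentException("Tool analyzes one reads input at a time.");
        }

        InputKind selected = null;
        // 2. Handle reads, checking explicitly.
        if (hasReads) {
            selected = InputKind.READS;
        }
        // 3. Handle VCFs, checking explicitly instead of relying on a bare else.
        if (hasVcf) {
            selected = InputKind.VCF;
        }
        // 4. Final guard: nothing was set, which means no input was provided.
        if (selected == null) {
            throw new IllegalArgumentException("No input provided: specify a reads input or a VCF.");
        }
        return selected;
    }

    public static void main(final String[] args) {
        System.out.println(selectInput(Arrays.asList("sample.bam"), null));
        System.out.println(selectInput(Collections.<String>emptyList(), "sample.vcf"));
        try {
            selectInput(Collections.<String>emptyList(), null);
        } catch (final IllegalArgumentException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

Validating before branching means the "both" and "neither" cases each produce their own error message instead of silently falling through.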

if(hasReads() ^ vcfPath != null){
if (hasReads()) {
if (readArguments.getReadPathSpecifiers().size() > 1) {
throw new UserException.BadInput("Tool analyzes one BAM at a time.");
Collaborator

BAM -> "Reads Input"

runCommandLine(args);
}

// TODO: compatibility based on MD5 faulty since MD5s not in sequence dictionary (see ticket #730 "VCFHeader drops sequence dictionary attributes")
Collaborator

Since #730 is in another repository (htsjdk), make this the full link; otherwise it is going to be confusing.


// for quick stdout testing
@Test(enabled = false)
public void testStdOutput() throws IOException{
Collaborator

I would probably get rid of this before merging this branch.

gatk-bot commented Jul 27, 2022

GitHub Actions tests reported job failures from Actions build 2748430417
Failures in the following jobs:

Test Type JDK Job ID Logs
integration 11 2748430417.12 logs
integration 8 2748430417.0 logs

@orlicohen force-pushed the oc_referencecompatibilitychecker branch from 063d164 to 5648af9 on July 27, 2022 18:25
@@ -45,9 +48,9 @@
* <pre>
* #Current Reference: reads_data_source_test1_withmd5s_missingchr1.bam
* Reference Compatibility
Collaborator

You forgot to update the table line here.

}
// VCFs
else {
try(final FeatureDataSource<VariantContext> vcfReader = new FeatureDataSource<>(vcfPath.toString())){
try (final FeatureDataSource<VariantContext> vcfReader = new FeatureDataSource<>(vcfPath.toString())) {
Collaborator

Make this explicitly check for a VCF in this else statement. vcfPath.toString() can still throw a NullPointerException if you get here without specifying either a VCF or a BAM.
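
To illustrate the failure mode described above, here is a minimal sketch that contrasts a bare `else` with an explicit VCF check. The class and method names are hypothetical, `java.nio.file.Path` stands in for `GATKPath`, and `IllegalArgumentException` stands in for `UserException.BadInput`; none of this is the tool's actual code.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

final class ExplicitVcfCheckSketch {
    /** Bare else: "not reads" is assumed to mean "VCF", so a null vcfPath is dereferenced. */
    static String sourceWithBareElse(final boolean hasReads, final Path vcfPath) {
        if (hasReads) {
            return "reads";
        } else {
            return vcfPath.toString(); // throws NullPointerException when vcfPath is null
        }
    }

    /** Explicit check: the VCF branch runs only when a VCF was actually specified. */
    static String sourceWithExplicitCheck(final boolean hasReads, final Path vcfPath) {
        if (hasReads) {
            return "reads";
        } else if (vcfPath != null) {
            return vcfPath.toString();
        } else {
            // Neither input was specified: fail with a user-facing message instead of an NPE.
            throw new IllegalArgumentException("No input provided: specify a reads input or a VCF.");
        }
    }

    public static void main(final String[] args) {
        System.out.println(sourceWithExplicitCheck(false, Paths.get("calls.vcf")));
        try {
            sourceWithBareElse(false, null);
        } catch (final NullPointerException e) {
            System.out.println("Bare else failed with a NullPointerException");
        }
    }
}
```

With the explicit branch, a run with no inputs ends in a descriptive argument error rather than a stack trace from inside the VCF handling.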

//final File output = createTempFile("testReferenceCompatibilityMultipleReferencesWithMD5s", ".table");
final File expectedOutput = new File(getToolTestDataDir(), "expected.testReferenceCompatibilityMultipleReferencesBAMWithMD5s.table");
@Test(expectedExceptions = UserException.BadInput.class)
public void testReferenceCompatibilityBAMandVCF() throws IOException {
Collaborator

Add one for neither BAM nor VCF as well.

private final GATKPath ref;
private final Compatibility status;

private enum Compatibility{
Collaborator

Rename this to CompatibilityStatus.

@orlicohen force-pushed the oc_referencecompatibilitychecker branch from 5648af9 to 44dfc50 on July 27, 2022 20:50
@orlicohen force-pushed the oc_referencecompatibilitychecker branch from 44dfc50 to 42fac0f on July 27, 2022 21:55
@orlicohen merged commit c22972a into master on Jul 28, 2022
@orlicohen deleted the oc_referencecompatibilitychecker branch on July 28, 2022 19:42
@droazen changed the title from "**DO NOT MERGE** Reference Compatibility tool" to "Reference Compatibility tool" on Jul 29, 2022