Commit
Support for flow based sequencing (#7876)
Added support for Ultima Genomics flow-based sequencing data. The major new features are as follows:
- Added a "FlowMode" argument to HaplotypeCaller, which better supports flow-based calling
- Added a new haplotype filtering step after assembly, which removes suspicious haplotypes from the genotyper
- Added two new likelihood models, FlowBasedHMM and FlowBasedAlignmentLikelihoodEngine
- Added "FlowMode" to Mutect2, which better supports flow-based calling
- Added support for uncertain read end positions in MarkDuplicatesSpark
- Added a new tool, FlowFeatureMapper, for quick heuristic calling of BAMs for diagnostics
- Added a new tool, GroundTruthReadsBuilder, to generate ground truth files for basecalling
- Added a new diagnostic tool, HaplotypeBasedVariantRecaller, for recalling VCF files using the HaplotypeCallerEngine
- Added a new tool, SplitCram, for breaking up CRAM files by their blocks

There are a number of code and technical changes as well:
- Added a new read interface called `FlowBasedRead` that manages the new features for flow-based data
- Added a number of flow-specific read filters
- Added a number of flow-specific variant annotations
- Added support for read annotation clipping as part of ClipReads and GATKRead
- Added a new PartialReadWalker that supports terminating before traversal is finished

Co-authored-by: James <emeryj@broadinstitute.org>
Co-authored-by: Megan Shand <mshand@broadinstitute.org>
Co-authored-by: Dror Kessler <dror27.kessler@gmail.com>
1 parent 8c348aa, commit f1e7265
Showing 388 changed files with 372,947 additions and 1,144 deletions.
17 changes: 17 additions & 0 deletions
src/main/java/org/broadinstitute/hellbender/cmdline/programgroups/FlowBasedProgramGroup.java
    @@ -0,0 +1,17 @@
    package org.broadinstitute.hellbender.cmdline.programgroups;

    import org.broadinstitute.barclay.argparser.CommandLineProgramGroup;
    import org.broadinstitute.hellbender.utils.help.HelpConstants;

    /**
     * Tools that perform variant calling and genotyping for short variants (SNPs, SNVs and Indels) on
     * flow-based sequencing platforms
     */
    public class FlowBasedProgramGroup implements CommandLineProgramGroup {

        @Override
        public String getName() { return HelpConstants.DOC_CAT_SHORT_FLOW_BASED; }

        @Override
        public String getDescription() { return HelpConstants.DOC_CAT_SHORT_FLOW_BASED_SUMMARY; }
    }
75 changes: 75 additions & 0 deletions
src/main/java/org/broadinstitute/hellbender/engine/PartialReadWalker.java
    @@ -0,0 +1,75 @@
    package org.broadinstitute.hellbender.engine;

    import org.broadinstitute.hellbender.engine.filters.CountingReadFilter;
    import org.broadinstitute.hellbender.utils.SimpleInterval;
    import org.broadinstitute.hellbender.utils.read.GATKRead;

    import java.util.Spliterator;
    import java.util.concurrent.atomic.AtomicBoolean;
    import java.util.function.BiConsumer;
    import java.util.stream.Stream;

    /**
     * A specialized read walker that may be gracefully stopped before the input stream ends
     *
     * A tool derived from this class should implement {@link PartialReadWalker#shouldExitEarly(GATKRead)}
     * to indicate when to stop. This method is called before {@link ReadWalker#apply(GATKRead, ReferenceContext, FeatureContext)}
     *
     */
    abstract public class PartialReadWalker extends ReadWalker {

        /**
         * traverse is overridden to consult the implementation class whether to stop
         *
         * The stoppage is implemented using a custom forEach method to compensate for the
         * lack of .takeWhile() in Java 8
         */

        @Override
        public void traverse() {

            final CountingReadFilter countedFilter = makeReadFilter();
            breakableForEach(getTransformedReadStream(countedFilter), (read, breaker) -> {

                // check if we should stop
                if ( shouldExitEarly(read) ) {
                    breaker.set(true);
                } else {
                    // this is the body of the iteration
                    final SimpleInterval readInterval = getReadInterval(read);
                    apply(read,
                            new ReferenceContext(reference, readInterval), // Will create an empty ReferenceContext if reference or readInterval == null
                            new FeatureContext(features, readInterval));   // Will create an empty FeatureContext if features or readInterval == null

                    progressMeter.update(readInterval);
                }
            });

            logger.info(countedFilter.getSummaryLine());
        }

        /**
         * Method to be overridden by the implementation class to determine when to stop the read stream traversal
         * @param read - the read to be processed next (in case it is needed)
         * @return boolean indicator: true means stop!
         */
        protected abstract boolean shouldExitEarly(GATKRead read);

        /**
         * Java 8 does not have a .takeWhile() on streams. The code below implements a custom forEach to allow
         * breaking out of a stream prematurely.
         *
         * code adapted from: https://www.baeldung.com/java-break-stream-foreach
         */
        private static <T> void breakableForEach(Stream<T> stream, BiConsumer<T, AtomicBoolean> consumer) {
            Spliterator<T> spliterator = stream.spliterator();
            boolean hadNext = true;
            AtomicBoolean breaker = new AtomicBoolean();

            while (hadNext && !breaker.get()) {
                hadNext = spliterator.tryAdvance(elem -> {
                    consumer.accept(elem, breaker);
                });
            }
        }
    }
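
The early-exit mechanism in `PartialReadWalker` can be tried in isolation. The following is a minimal sketch, not part of this commit: it copies the shape of the private `breakableForEach` helper into a standalone class and drives it with a plain `Stream<Integer>` instead of GATK reads. The class name `BreakableForEachDemo` and the method `takeUntilNegative` are hypothetical names introduced only for illustration; the negative-value check stands in for a tool's `shouldExitEarly` implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Spliterator;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.BiConsumer;
import java.util.stream.Stream;

// Hypothetical demo class; not part of GATK.
class BreakableForEachDemo {

    // Same shape as the helper in PartialReadWalker: pull one element at a time
    // from the stream's spliterator and stop once the consumer sets the breaker flag.
    static <T> void breakableForEach(Stream<T> stream, BiConsumer<T, AtomicBoolean> consumer) {
        final Spliterator<T> spliterator = stream.spliterator();
        final AtomicBoolean breaker = new AtomicBoolean();
        boolean hadNext = true;

        while (hadNext && !breaker.get()) {
            hadNext = spliterator.tryAdvance(elem -> consumer.accept(elem, breaker));
        }
    }

    // Stand-in for a walker: "apply" each value until the first negative one,
    // which plays the role of shouldExitEarly(read) returning true.
    static List<Integer> takeUntilNegative(Stream<Integer> values) {
        final List<Integer> seen = new ArrayList<>();
        breakableForEach(values, (value, breaker) -> {
            if (value < 0) {
                breaker.set(true);   // request a stop; this element is not applied
            } else {
                seen.add(value);     // the body of the iteration
            }
        });
        return seen;
    }

    public static void main(String[] args) {
        // prints [3, 1, 4]
        System.out.println(takeUntilNegative(Stream.of(3, 1, 4, -1, 5)));
    }
}
```

Note that the element that triggers the stop is still pulled from the stream, it is simply never applied, and nothing after it is consumed. On Java 9 or later, `Stream.takeWhile` expresses the same cut-off directly; the helper exists because the code targets Java 8.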