Streaming inference with CNN #4097

lucidtronix · 2018-01-09T02:53:51Z

Annotate a VCF with scores from a pretrained model. Stream from java to python.

codecov-io · 2018-01-10T22:28:52Z

Codecov Report

Merging #4097 into master will decrease coverage by 0.492%.
The diff coverage is 84.848%.

@@               Coverage Diff               @@
##              master     #4097       +/-   ##
===============================================
- Coverage     78.458%   77.966%   -0.491%     
- Complexity     16444     17700     +1256     
===============================================
  Files           1039      1084       +45     
  Lines          59196     64175     +4979     
  Branches        9692     10632      +940     
===============================================
+ Hits           46444     50035     +3591     
- Misses          8995     10124     +1129     
- Partials        3757      4016      +259

Impacted Files	Coverage Δ	Complexity Δ
...nder/utils/runtime/StreamingProcessController.java	`69.547% <ø> (-0.823%)`	`50 <0> (ø)`
...ute/hellbender/utils/variant/GATKVCFConstants.java	`80% <ø> (ø)`	`4 <0> (ø)`
...e/hellbender/utils/variant/GATKVCFHeaderLines.java	`99.286% <100%> (+0.005%)`	`10 <0> (ø)`
...lbender/tools/walkers/vqsr/NeuralNetInference.java	`84.733% <84.733%> (ø)`	`20 <20> (?)`
.../DiscoverVariantsFromContigAlignmentsSAMSpark.java	`71.839% <0%> (-28.161%)`	`37% <0%> (+24%)`
...adinstitute/hellbender/tools/IndexFeatureFile.java	`94.444% <0%> (-5.556%)`	`17% <0%> (+5%)`
...oadinstitute/hellbender/utils/GenomeLocParser.java	`84.848% <0%> (-3.03%)`	`57% <0%> (-2%)`
...tools/walkers/mutect/SomaticLikelihoodsEngine.java	`83.871% <0%> (-2.971%)`	`22% <0%> (+8%)`
...e/hellbender/tools/spark/sv/utils/SVVCFWriter.java	`86.047% <0%> (-1.709%)`	`10% <0%> (-1%)`
...ignment/AssemblyContigWithFineTunedAlignments.java	`42.105% <0%> (-1.316%)`	`15% <0%> (-1%)`
... and 106 more

cmnbroad · 2018-01-16T15:24:22Z

@lucidtronix I have a few comments on the Java side of this, and want to do a review pass. Let me know if/when it's ready for that (it may already be, now that tests are passing).

lucidtronix · 2018-01-16T15:32:55Z

Go for it!
I'll add the 2D CNN in a separate PR after we iron this one out...

lucidtronix · 2018-01-16T19:49:20Z

@cmnbroad I accidentally added the file AddScores.java in this PR, please ignore, I will remove it.

cmnbroad

First round done - it will probably take one more round after these changes are made. I'm going to submit the timeout changes for the script executor in a separate PR, so once those are in this can be rebased on that.

cmnbroad · 2018-01-16T20:29:50Z

src/main/java/org/broadinstitute/hellbender/tools/walkers/vqsr/NeuralNetStreamingExecutor.java

+
+/**
+ * Created by sam on 11/17/17.
+ */


This javadoc shows up in the online doc. It can be sparse for now, but should say something more descriptive, and we generally don't list the author.

cmnbroad · 2018-01-16T20:31:17Z

src/main/java/org/broadinstitute/hellbender/tools/walkers/vqsr/NeuralNetStreamingExecutor.java

+/**
+ * Created by sam on 11/17/17.
+ */
+@CommandLineProgramProperties(


This should probably have either a @Beta or @Experimental annotation (probably @Experimental).

cmnbroad · 2018-01-16T20:35:46Z

src/main/java/org/broadinstitute/hellbender/tools/walkers/vqsr/NeuralNetStreamingExecutor.java

+        programGroup = VariantEvaluationProgramGroup.class
+)
+
+public class NeuralNetStreamingExecutor extends VariantWalker {


Is 2d going to be a separate tool, or a mode of this tool ? We should probably pick a more descriptive name that doesn't have "StreamingExecutor" in it. So maybe something that will be symmetric with the names of the companion tools (i.e., training and/or 2d) once they're available.

I think 2d will be a mode of this tool. Renamed to NeuralNetInference.

cmnbroad · 2018-01-16T20:40:52Z

src/main/java/org/broadinstitute/hellbender/tools/walkers/vqsr/NeuralNetStreamingExecutor.java

+    @Argument(fullName = StandardArgumentDefinitions.OUTPUT_LONG_NAME,
+            shortName = StandardArgumentDefinitions.OUTPUT_SHORT_NAME,
+            doc = "Output file")
+    private File outputFile = null; // output file produced by Python code


We need to update this tool to write the final output file using the tool variant writer, and update this comment.

cmnbroad · 2018-01-16T20:44:28Z

src/main/java/org/broadinstitute/hellbender/tools/walkers/vqsr/NeuralNetStreamingExecutor.java

+    private boolean keepInfo = true;
+
+    @Argument(fullName = "python-batch-size", shortName = "pbs", doc = "Size of batches for python to do inference.", optional = true)
+    private int pythonBatchSize = 256;


I probably originated this name on one of my branches, but we should change it to something more descriptive. I'd suggest maybe inference-batch-size.

Should this argument be @Advanced, and/or have minValue and maxValue attributes on it? (The Argument annotation has attributes for that.) My experience using it was that even moderately larger values like 16k or 32k allocated huge gobs of memory.

Changed and added min max and advanced.

cmnbroad · 2018-01-16T22:26:42Z

src/main/java/org/broadinstitute/hellbender/tools/walkers/vqsr/NeuralNetStreamingExecutor.java

+                        pythonExecutor.getAccumulatedOutput();
+                    }
+                    final String pythonCommand = String.format(
+                            "vqsr_cnn.score_and_write_batch(model, tempFile, fifoFile, %d, %d)", curBatchSize, pythonBatchSize) + NL;


Can you factor out the code that's duplicated here and in onTraversalSuccess, especially so the Python code appears in only one place?
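One way to read this request is sketched below: the Python call text is built by a single helper, and both the mid-traversal path and the end-of-traversal path go through the same flush method. This is only an illustration, not the code that was merged. The class name, `flushBatch`, and the list standing in for the Python executor are hypothetical; only the `vqsr_cnn.score_and_write_batch(...)` string comes from the diff above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: everything except the Python call string is illustrative.
class BatchCommandSketch {
    private static final String NL = "\n";

    private final int inferenceBatchSize;
    // Stand-in for the streaming Python executor: records the commands it would receive.
    private final List<String> sentCommands = new ArrayList<>();

    BatchCommandSketch(final int inferenceBatchSize) {
        this.inferenceBatchSize = inferenceBatchSize;
    }

    // The only place where the Python call text is spelled out.
    static String scoreBatchCommand(final int curBatchSize, final int inferenceBatchSize) {
        return String.format(
                "vqsr_cnn.score_and_write_batch(model, tempFile, fifoFile, %d, %d)",
                curBatchSize, inferenceBatchSize) + NL;
    }

    // Shared by the per-variant path (when a batch fills up) and the end-of-traversal path.
    void flushBatch(final int curBatchSize) {
        if (curBatchSize > 0) {
            sentCommands.add(scoreBatchCommand(curBatchSize, inferenceBatchSize));
        }
    }

    List<String> getSentCommands() {
        return sentCommands;
    }
}
```

With this shape, a change to the Python function's signature touches one line of Java instead of two.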

cmnbroad · 2018-01-16T22:29:20Z

src/main/java/org/broadinstitute/hellbender/tools/walkers/vqsr/NeuralNetStreamingExecutor.java

+                    asyncWriter.startAsynchronousBatchWrite(batchList);
+                    waitforBatchCompletion = true;
+                    curBatchSize = 0;
+                    batchList = new ArrayList<>(pythonSyncFrequency);


This whole code block (line 145 to here) isn't covered by the test - it never syncs until traversal is finished because there aren't many variants. I'm not sure how long the test takes to run, but could we add another test (a duplicate of the original) using smaller batch/frequency values to force it through here?

cmnbroad · 2018-01-16T23:06:18Z

.../broadinstitute/hellbender/tools/walkers/vqsr/NeuralNetStreamingExecutorIntegrationTest.java

+        try {
+            spec.executeTest("testInference", this);
+        } catch (IOException e) {
+            e.printStackTrace();


Just declare "throws IOException" in the method signature, and then you can remove the try/catch block altogether.
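The difference between the two patterns can be sketched without TestNG. In the example below, `executeTest` is a hypothetical stand-in for a call such as `spec.executeTest(...)` that may throw `IOException`; the class and method names are assumptions made for illustration.

```java
import java.io.IOException;

// Hypothetical sketch contrasting a swallowed exception with a propagated one.
class ThrowsInSignatureSketch {

    // Stand-in for a test helper that can fail with a checked IOException.
    static void executeTest(final boolean fail) throws IOException {
        if (fail) {
            throw new IOException("simulated I/O failure");
        }
    }

    // Before: the failure is printed and then dropped, so the caller sees success.
    static boolean swallowing(final boolean fail) {
        try {
            executeTest(fail);
        } catch (IOException e) {
            e.printStackTrace();
        }
        return true; // reached even when executeTest failed
    }

    // After: the exception propagates to the caller, which can report the failure.
    static void propagating(final boolean fail) throws IOException {
        executeTest(fail);
    }
}
```

In a test method, the second form lets the test framework mark the test as failed instead of passing with a stack trace in the log.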

cmnbroad · 2018-01-17T00:02:07Z

.../broadinstitute/hellbender/tools/walkers/vqsr/NeuralNetStreamingExecutorIntegrationTest.java

+                .addArgument("architecture", architectureHD5)
+                .addArgument(StandardArgumentDefinitions.ADD_OUTPUT_VCF_COMMANDLINE, "false");
+
+        runCommandLine(argsBuilder);


Tests should be written using either ArgumentsBuilder/runCommandLine, or IntegrationTestSpec, but not both. IntegrationTestSpec is "old-style", but convenient. It automatically generates a temporary output file and substitutes the name for %s, and also compares the output to the expected results file. If you use runCommandLine, you need to generate your own temporary output file, and do your own expected results comparison. Since this method mixes both styles, it runs the test twice - first using runCommandLine with an output filename of literal "%s", and then again via executeTest using a generated temp file. Take a look at SelectVariantsIntegrationTest as an example of using IntegrationTestSpec, or PrintReadSparkIntegrationTest for the other style.

cmnbroad · 2018-01-17T00:54:15Z

src/main/java/org/broadinstitute/hellbender/tools/walkers/vqsr/NeuralNetStreamingExecutor.java

+                String varData = getVariantDataString(variant, referenceContext);
+                String isSnp = variant.isSNP() ? "1" : "0";
+                String genos = "\t.";
+                if (noSamples) genos = "";


These variables all could use less cryptic names; most of them are only used once though so they could be inlined below.

lucidtronix · 2018-01-17T20:13:32Z

Responded to most of the comments, but I still need to implement proper GATK-style VCF writing from Java. Also, I copied @mbabadi's Python package setup, but I haven't been able to upload to PyPI with setup_vqsr_cnn.py, so for the time being I also have a setup.py inside the vqsr_cnn package which works with PyPI but hardcodes the version. I'm sure there is a better way. Some Python tests failed, but it seems to be a Maven jar issue...

lucidtronix · 2018-01-18T22:34:53Z

Added intermediate temp file from python and proper VCF writing as we discussed, back to you @cmnbroad.

cmnbroad

A few more comments. I think we're close. My PR with the timeout changes is #4218, and should be reviewed soon. I'm hoping we can get all of this in for the point release, which is scheduled for Friday.

cmnbroad · 2018-01-22T15:53:52Z

src/main/java/org/broadinstitute/hellbender/tools/walkers/vqsr/NeuralNetInference.java

+
+    private void addScoresToVCF(){
+        try {
+            Scanner scoreScan = new Scanner(new File(scoreFile));


The Scanner should be created inside of a try-with-resources statement (you'll have to create the File object outside of the try block) so it will always be automatically closed, even if an exception is thrown. Also, as mentioned above, we should make sure it's deleted.
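A minimal sketch of that pattern follows, assuming the score file is a plain text file that can be read line by line. The class name and the `writeTempScores` helper are hypothetical; `UncheckedIOException` is used here only so the sketch compiles on its own, where the tool itself would use its own exception type.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

// Hypothetical sketch of reading a temporary score file and cleaning it up afterwards.
class ScoreFileReaderSketch {

    // Illustrative helper: writes rows to a temp file, standing in for scores produced by Python.
    static File writeTempScores(final List<String> rows) {
        try {
            final File scoreFile = File.createTempFile("scores", ".tsv");
            Files.write(scoreFile.toPath(), rows);
            return scoreFile;
        } catch (IOException e) {
            throw new UncheckedIOException("Could not create temporary score file", e);
        }
    }

    // The File is created by the caller; the Scanner is the managed resource.
    static List<String> readAndDelete(final File scoreFile) {
        final List<String> lines = new ArrayList<>();
        try (final Scanner scoreScan = new Scanner(scoreFile)) {
            while (scoreScan.hasNextLine()) {
                lines.add(scoreScan.nextLine());
            }
        } catch (FileNotFoundException e) {
            throw new UncheckedIOException("Could not open score file " + scoreFile, e);
        } finally {
            // Runs after the Scanner has been closed, whether or not reading succeeded.
            scoreFile.delete();
        }
        return lines;
    }
}
```

The `finally` block of a try-with-resources statement runs only after the declared resources are closed, so the file is no longer open when it is deleted.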

cmnbroad · 2018-01-22T15:54:32Z

src/main/java/org/broadinstitute/hellbender/tools/walkers/vqsr/NeuralNetInference.java

+                    });
+
+        } catch (IOException e) {
+            e.printStackTrace();


Rather than calling printStackTrace, wrap this exception in a GATKException and re-throw it.

I realize in looking at this (post traversal code) that we don't have a sanctioned way to do a second pass over the input data. We can leave this for now, but we'll probably need to add engine functionality to support this, i.e., a TwoPassVariantWalker.

Fixed, and yeah I just grabbed this from variantWalkerBase traverse.
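The wrap-and-rethrow suggestion above can be sketched as follows. `ToolException` is a local stand-in for an unchecked exception such as GATKException (the assumption being that it is a RuntimeException with a message-and-cause constructor), and `loadModel` is a hypothetical method that always fails so the behavior is visible.

```java
import java.io.IOException;

// Hypothetical sketch: convert a checked IOException into an unchecked tool exception.
class WrapAndRethrowSketch {

    // Local stand-in for an unchecked exception type such as GATKException (assumed shape).
    static class ToolException extends RuntimeException {
        ToolException(final String message, final Throwable cause) {
            super(message, cause);
        }
    }

    // Illustrative operation that fails with a checked exception.
    static void loadModel(final String architecture) throws IOException {
        throw new IOException("cannot read " + architecture);
    }

    // Instead of printStackTrace(), keep the original exception as the cause and rethrow,
    // so the failure stops the tool and the full stack trace is preserved.
    static void startUp(final String architecture) {
        try {
            loadModel(architecture);
        } catch (IOException e) {
            throw new ToolException("Error loading CNN architecture: " + architecture, e);
        }
    }
}
```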

cmnbroad · 2018-01-22T15:56:08Z

src/main/java/org/broadinstitute/hellbender/tools/walkers/vqsr/NeuralNetInference.java

+                                && variant.getReference().getBaseString().equals(scoredVariant[2])
+                                && variant.getAlternateAlleles().toString().equals(scoredVariant[3])) {
+                            final VariantContextBuilder builder = new VariantContextBuilder(variant);
+                            builder.attribute(GATKVCFConstants.CNN_1D_KEY, scoredVariant[4]);


Can you replace these 0,1,2,3 constants with symbolic constants saying what they represent.
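As a sketch of what named indices might look like, the example below assumes the column order implied by the diff (contig, position, reference allele, alternate alleles, score). The constant names, the class name, and the tab delimiter in the usage note are assumptions, not the names used in the merged code.

```java
// Hypothetical sketch: named column indices for a scored-variant record.
class ScoredVariantColumnsSketch {
    static final int CONTIG_INDEX = 0;
    static final int POS_INDEX = 1;
    static final int REF_INDEX = 2;
    static final int ALT_INDEX = 3;
    static final int SCORE_INDEX = 4;

    // Mirrors the four-way comparison in the diff, with plain values standing in
    // for the VariantContext accessors.
    static boolean matches(final String contig, final int start, final String ref,
                           final String alts, final String[] scoredVariant) {
        return contig.equals(scoredVariant[CONTIG_INDEX])
                && Integer.toString(start).equals(scoredVariant[POS_INDEX])
                && ref.equals(scoredVariant[REF_INDEX])
                && alts.equals(scoredVariant[ALT_INDEX]);
    }

    static String score(final String[] scoredVariant) {
        return scoredVariant[SCORE_INDEX];
    }
}
```

For example, splitting the line `chr1<TAB>100<TAB>A<TAB>[T]<TAB>0.75` on tabs gives an array for which `matches("chr1", 100, "A", "[T]", fields)` holds and `score(fields)` is the string `0.75`.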

cmnbroad · 2018-01-22T17:07:48Z

src/main/java/org/broadinstitute/hellbender/tools/walkers/vqsr/NeuralNetInference.java

+                        if(variant.getContig().equals(scoredVariant[0])
+                                && Integer.toString(variant.getStart()).equals(scoredVariant[1])
+                                && variant.getReference().getBaseString().equals(scoredVariant[2])
+                                && variant.getAlternateAlleles().toString().equals(scoredVariant[3])) {


Does this handle joining variants that are not SNPs/indels ("OTHER") correctly? I'm just not sure what's getting written out for those.

I dug into those a bit, and the ones I saw were all multiallelic sites that have one SNP allele and one INDEL allele. For now they get processed like any other variant and scored as if they were SNPs. This is not ideal; maybe we should average the SNP and INDEL scores and write that. I will ask @ldgauthier what she thinks. It is very few sites, so I don't think we should worry too much about them right now.

cmnbroad · 2018-01-22T17:19:51Z

src/main/java/org/broadinstitute/hellbender/tools/walkers/vqsr/NeuralNetInference.java

+    private String getVariantInfoString(final VariantContext variant){
+        String varInfo = "";
+        for (final String attributeKey : variant.getAttributes().keySet()) {
+            varInfo += attributeKey + "=" + variant.getAttribute(attributeKey).toString().replace(" ", "").replace("[", "").replace("]", "") + ";";


Can you stick a comment in here saying this is creating a python dictionary.

cmnbroad · 2018-01-22T17:35:27Z

src/main/java/org/broadinstitute/hellbender/tools/walkers/vqsr/NeuralNetInference.java

+    }
+
+
+    private void addScoresToVCF(){


I'd rename this, maybe to writeOutputVCFWithScores or something like that.

cmnbroad · 2018-01-22T17:41:15Z

src/main/java/org/broadinstitute/hellbender/tools/walkers/vqsr/NeuralNetInference.java

+        } catch (IOException e) {
+            e.printStackTrace();
+        }
+        vcfWriter.close();


We're always creating the input writer, but only closing it if we get here, on success. It would be better to only create the writer when we need it (probably best, since that way we don't create a half-baked, header-only output file if there is a failure).

cmnbroad · 2018-01-22T17:43:04Z

src/main/python/org/broadinstitute/hellbender/vqsr_cnn/vqsr_cnn/inference.py

+		elif b in ambiguity_codes:
+			dna_data[i] = ambiguity_codes[b]
+		else:
+			print('Error! Unknown code:', b)


This will get silently swallowed by the java front end since there is no exception/traceback here. If this is fatal, it should raise an exception. Not sure if there are other similar instances anywhere.

cmnbroad · 2018-01-22T17:43:39Z

...java/org/broadinstitute/hellbender/tools/walkers/vqsr/NeuralNetInferenceIntegrationTest.java

+//
+//        runCommandLine(argsBuilder);
+//
+//    }


Should this be removed ?

Probably yes, but it is still helpful, as every time I update the model I uncomment and use this to generate a new expected VCF.

cmnbroad · 2018-01-22T17:49:16Z

src/main/java/org/broadinstitute/hellbender/tools/walkers/vqsr/NeuralNetInference.java

+                variant.getAlternateAlleles().toString()
+        );
+
+        return varData;


Don't really need the intermediate variable, but if you keep it, can be final.

cmnbroad

A couple of last code cleanup requests. It might be a good time to squash the commits down, remove the timeout commits, and rebase on my timeout branch. Although if you want to wait until my branch is merged, that's fine too.

cmnbroad · 2018-01-23T16:37:30Z

src/main/java/org/broadinstitute/hellbender/tools/walkers/vqsr/NeuralNetInference.java

+    final StreamingPythonScriptExecutor pythonExecutor = new StreamingPythonScriptExecutor(true);
+
+    private FileOutputStream fifoWriter;
+    private VariantContextWriter vcfWriter;


The scope of this variable can be reduced now. See comment below.

cmnbroad · 2018-01-23T16:38:00Z

src/main/java/org/broadinstitute/hellbender/tools/walkers/vqsr/NeuralNetInference.java

+            pythonExecutor.sendSynchronousCommand(String.format("model = load_model('%s', custom_objects=vqsr_cnn.get_metric_dict())", architecture) + NL);
+            logger.info("Loaded CNN architecture:"+architecture);
+        } catch (IOException e) {
+            e.printStackTrace();


Wrap in GATKException and re-throw.

cmnbroad · 2018-01-23T16:42:18Z

src/main/java/org/broadinstitute/hellbender/tools/walkers/vqsr/NeuralNetInference.java

+
+    private void writeOutputVCFWithScores(){
+        writeVCFHeader();
+        try (Scanner scoreScan = new Scanner(scoreFile)) {


Can you remove the class level vcfWriter variable, and localize it to try-with-resources:

try (final Scanner scoreScan = new Scanner(scoreFile);
     final VariantContextWriter vcfWriter = createVCFWriter(new File(outputFile))) {
    scoreScan.useDelimiter("\\n");
    writeVCFHeader(vcfWriter); // or call this getOutputHeader, have it return the header, and write it here

Then all the closing is handled automatically (you can remove the explicit vcfWriter.close() below), and the resource handling will be nicely symmetric.

Nice cleanup, fixed
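To illustrate why declaring both resources in one try-with-resources statement makes the explicit close() unnecessary, here is a small sketch. The `Named` class is a hypothetical stand-in for the Scanner and the VCF writer; it only records when it is opened and closed.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: two resources in one try-with-resources, with a failure midway.
class TwoResourceSketch {

    // Stand-in resource that logs its own lifecycle.
    static class Named implements AutoCloseable {
        private final String name;
        private final List<String> log;

        Named(final String name, final List<String> log) {
            this.name = name;
            this.log = log;
            log.add("open " + name);
        }

        @Override
        public void close() {
            log.add("close " + name);
        }
    }

    static List<String> run(final boolean failMidway) {
        final List<String> log = new ArrayList<>();
        try (final Named scoreScan = new Named("scoreScan", log);
             final Named vcfWriter = new Named("vcfWriter", log)) {
            log.add("write header");
            if (failMidway) {
                throw new IllegalStateException("simulated failure");
            }
            log.add("write variants");
        } catch (IllegalStateException e) {
            // Both resources are already closed by the time this handler runs.
            log.add("caught");
        }
        return log;
    }
}
```

Resources are closed in the reverse order of their declaration, on both the success path and the failure path, so the writer is always closed before the scanner without any explicit call.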

lucidtronix · 2018-01-23T23:26:02Z

Thanks for the speedy reviews! Squashed and rebased and made the changes.

cmnbroad · 2018-01-25T16:41:12Z

@lucidtronix #4218 is merged now so you should be able to rebase this on master. It looks like when you squashed you left in some of the timeout changes, so you'll have to resolve the resulting conflicts in favor of master.

lucidtronix · 2018-01-25T19:46:53Z

Ok rebased on master, if tests pass do you think it's ready to merge?

cmnbroad

There are still a couple of files included that should be reverted completely (that have changes left over from the previous merging of branches). We should remove those, and then we can merge once tests pass. I do still have some minor code pattern comments, but we can fix those in the 2d branch. And we probably need more test coverage before we remove @Experimental - we should have a ticket for that.

I'm still lobbying for a better tool name....Otherwise looks good!

cmnbroad · 2018-01-25T20:20:37Z

src/main/java/org/broadinstitute/hellbender/tools/walkers/vqsr/NeuralNetInference.java

+        programGroup = VariantEvaluationProgramGroup.class
+)
+
+public class NeuralNetInference extends VariantWalker {


Still think we need a name that's more specific. But we can change it later.

cmnbroad · 2018-01-25T20:22:36Z

...t/java/org/broadinstitute/hellbender/utils/python/StreamingPythonScriptExecutorUnitTest.java

@@ -7,6 +7,7 @@
 import org.broadinstitute.hellbender.utils.runtime.ProcessOutput;
 import org.broadinstitute.hellbender.utils.runtime.StreamingPythonTestUtils;
 import org.testng.Assert;
+import org.testng.annotations.AfterClass;


You should be able to revert this whole file.

cmnbroad · 2018-01-25T20:23:02Z

src/test/java/org/broadinstitute/hellbender/utils/runtime/ProcessControllerUnitTest.java

@@ -17,6 +18,8 @@
 import java.util.LinkedHashMap;
 import java.util.Map;

+import static java.lang.Thread.sleep;
+


Same here (revert the whole file).

lucidtronix · 2018-01-25T22:57:58Z

Yes we're still brainstorming for a better name, but it can wait for the next PR. I removed those files and rebased.

cmnbroad · 2018-01-26T01:03:04Z

@lucidtronix It looks like the StreamingPythonExecutorUnitTest and ProcessControllerUnitTest files are entirely removed now, instead of just being reverted (they had some stray changes included before). Those files need to be restored, then we can merge once tests pass again.

lucidtronix · 2018-01-26T16:00:37Z

Ooops! They're back now and checks passed...

cmnbroad

All right then.

lucidtronix force-pushed the sf_nn_streaming_inference branch 4 times, most recently from 2bd8226 to e27d5e3 Compare January 12, 2018 17:13

lucidtronix closed this Jan 12, 2018

lucidtronix reopened this Jan 12, 2018

cmnbroad self-requested a review January 16, 2018 15:21

cmnbroad requested changes Jan 17, 2018

cmnbroad requested changes Jan 22, 2018

cmnbroad requested changes Jan 23, 2018

lucidtronix force-pushed the sf_nn_streaming_inference branch from 4d90134 to 52d0120 Compare January 23, 2018 23:21

lucidtronix mentioned this pull request Jan 24, 2018

2D CNN Streaming Inference #4245

Merged

lucidtronix force-pushed the sf_nn_streaming_inference branch from 52d0120 to f6a8058 Compare January 25, 2018 19:45

cmnbroad reviewed Jan 25, 2018

lucidtronix force-pushed the sf_nn_streaming_inference branch from f6a8058 to 8f3fc41 Compare January 25, 2018 22:56

CNN 1d variant filter streaming inference from java to python

f5b949c

lucidtronix force-pushed the sf_nn_streaming_inference branch from 8f3fc41 to f5b949c Compare January 26, 2018 15:00

cmnbroad approved these changes Jan 26, 2018

cmnbroad merged commit 25f96d4 into master Jan 26, 2018

cmnbroad deleted the sf_nn_streaming_inference branch January 26, 2018 20:51

lbergelson pushed a commit that referenced this pull request Jan 31, 2018

CNN 1d variant filter streaming inference from java to python (#4097)

2346a95
