Added regularization to covariance in GMM maximization step to fix convergence issues in VQSR. #7709

samuelklee · 2022-03-07T14:03:52Z

A quick patch to help out the Variants team, which is struggling with a problematic callset.

Note that a similar regularization to the effective number per component probably should have been applied to solve the issue in #6425. I'm not sure if the lack of this regularization will still lead to convergence issues, but I would hope that the fix that was implemented instead (treating vanishing components as a special case and skipping computation) suffices. As discussed there, we may also want to eventually remove the idiosyncratic finalize step; it’s likely this is the source of issues here, since the correct Bayesian M step is already regularized by the prior.

The covariance regularization term added here is standard (c.f. e.g. https://github.com/scikit-learn/scikit-learn/blob/7e1e6d09bcc2eaeba98f7e737aac2ac782f0e5f1/sklearn/mixture/_gaussian_mixture.py#L154), but it may result in non-negligible changes to VQSLODs. As just discussed with the Variants team, we can probably use the WARP validation to convince ourselves that results are functionally equivalent.

I updated the exact-match tests without much close examination (by simply forcing IntegrationTestSpec.assertEqualTextFiles to overwrite the old expected files), so someone may want to sanity check them. There were also a few more interactions between the integration tests for different tools than I anticipated. Some tests use output generated by an upstream tool as input and break encapsulation.

…nvergence issues in VQSR.

...est/java/org/broadinstitute/hellbender/tools/walkers/vqsr/GatherTranchesIntegrationTest.java

...st/resources/org/broadinstitute/hellbender/tools/walkers/VQSR/expected.AS.alternate.tranches

...ava/org/broadinstitute/hellbender/tools/walkers/vqsr/VariantRecalibratorIntegrationTest.java

ldgauthier · 2022-03-08T14:12:25Z

@samuelklee I will build the docker and run the warp tests, but if we do this again I'm going to teach you to fish. Wasn't there a VariantRecalibrator integration test that had some sort of random seed hijinx? Hopefully your fix should mean we don't need any shenanigans anymore -- can you see if there are any other tests we can clean up?

samuelklee · 2022-03-08T14:22:52Z

Thanks for the quick review, @ldgauthier!

I don't think my fix will address any non-determinism in the integration tests. I'm inclined to just do better with the new tools---there does seem to be enough duct tape in the integration tests regarding re/setting the RNG so that the exact-match tests consistently pass.

As for learning how to run the WARP tests, I think that would indeed be pretty useful---for anyone that might have to update code for VQSR or the new tools in the future! Can we teach everyone to fish? Isn't this what CARROT is for?

ldgauthier · 2022-03-08T14:49:25Z

See "smart tests" here: broadinstitute/warp#639 Should finish in maybe an hour and a half? Then I'll look at the VerifyNA12878 results.

ldgauthier · 2022-03-08T14:51:12Z

It's this I'm hoping we can take out:

gatk/src/test/java/org/broadinstitute/hellbender/tools/walkers/vqsr/VariantRecalibratorIntegrationTest.java

Line 229 in b097f75

final int hack = Utils.getRandomGenerator().nextInt();

Kevin has kindly volunteered to help me port the warp tests to Carrot, if I can ever make time. :-(

ldgauthier · 2022-03-08T19:07:43Z

Warp exome JG tests had slightly different metrics, so they failed, but the NA12878 results look like this:

test
type TP FP FN RECALL PRECISION
SNP 17444 47 843 0.954 0.997
INDEL 300 73 103 0.744 0.804

truth
type TP FP FN RECALL PRECISION
SNP 17443 48 844 0.954 0.997
INDEL 304 74 99 0.754 0.804

(Over the exome) negligible change in SNPs, slight recall decrease in indels, which I would say is "in the noise" given how many truth indels are in the exome.

samuelklee · 2022-03-08T19:11:17Z

Thanks @ldgauthier! That canary certainly looks alive to me. Happy to merge whenever you and/or Variants team approve.

ldgauthier · 2022-03-09T14:37:02Z

I will approve after you take a look at the integration test I linked above.

samuelklee · 2022-03-09T14:53:07Z

Sorry, I guess I didn't see your edit pointing out that line of code.

I've indeed looked at that test and more---there's a lot of similar duct tape and inconsistent resetting of the RNG throughout the entire test suite. But since I think we can reasonably assume that there's enough duct tape to make things deterministic overall, I don't think it's worth cleaning up the duct tape just to get neater, but equally deterministic behavior. (Or perhaps can you point to instances of persisting non-determinism, e.g. random failures in Travis?)

In any case, I think it makes more sense to focus effort on making cleaner tests for the new tools, rather than make an 11th hour effort to revamp these existing tests. Do you agree?

See e.g. #6112 for some related discussion.

Also added a note mentioning that the original GATK3 expected results have been updated, although now looking back at the commit history, I'm not sure if that was already true.

ldgauthier · 2022-03-10T15:21:12Z

I just want to know if the need for that RNG draw is obviated by your convergence fix. If it is, then we can remove that line. If not, then we know there are still some issues yet to be addressed.

samuelklee · 2022-03-10T15:31:44Z

As far as I can tell, even master passes when that line is removed. So the comments in the integration test are already misleading or out of date. Do you see differently?

samuelklee · 2022-03-10T16:17:57Z

Just for future reference, note that comments in testVariantRecalibratorSNPMaxAttempts are also incorrect or out of date. The test passes even if you limit it to one attempt.

// For this test, we deliberately *DON'T* sample a single random int as above; this causes
// the tool to require 4 attempts to acquire enough negative training data to succeed

So again, the tests were already "broken." But still, rather than attempt to fix them, I think it's best to follow the principle of not changing both production and test code to the extent that it is possible in this scenario. We've already updated enough exact-match expected results to make me a bit uncomfortable!

Someone else may want to tackle fixing the tests in a separate push, but I think it makes sense for me to focus on avoiding these sorts of issues when writing tests for the new tools.

EDIT: For the record, I confirmed that the undesired behavior in this test that the RNG hack was trying to avoid was fixed (and hence, the test was "broken") in #6425. Probably wasn't noticed because this is the only non-exact-match test and the test isn't strict enough to check that attempts 1-3 fail, it only checks that we succeed by attempt 4. Again, someone else may feel free to examine the actual coverage of this test and whether it's safe to remove it and/or clean up all the duct tape---but at some point, it becomes difficult to tell which pieces of duct tape are load bearing!

samuelklee force-pushed the sl_vqsr_convergence branch 2 times, most recently from 612bcc3 to 9eb886c Compare March 7, 2022 14:06

Added regularization to covariance in GMM maximization step to fix co…

6c0b156

…nvergence issues in VQSR.

samuelklee force-pushed the sl_vqsr_convergence branch from 9eb886c to d9a5468 Compare March 7, 2022 16:45

Fixed exact-match tests.

5bf0578

samuelklee force-pushed the sl_vqsr_convergence branch from d9a5468 to 5bf0578 Compare March 7, 2022 19:56

samuelklee commented Mar 7, 2022

View reviewed changes

...est/java/org/broadinstitute/hellbender/tools/walkers/vqsr/GatherTranchesIntegrationTest.java Show resolved Hide resolved

samuelklee commented Mar 7, 2022

View reviewed changes

...st/resources/org/broadinstitute/hellbender/tools/walkers/VQSR/expected.AS.alternate.tranches Show resolved Hide resolved

droazen requested a review from ldgauthier March 7, 2022 21:02

droazen assigned ldgauthier Mar 7, 2022

Changed disabled test for convergence failure to regression test.

f6c0119

samuelklee commented Mar 8, 2022

View reviewed changes

...ava/org/broadinstitute/hellbender/tools/walkers/vqsr/VariantRecalibratorIntegrationTest.java Show resolved Hide resolved

Added note about updating GATK3 results.

0431b77

samuelklee force-pushed the sl_vqsr_convergence branch from 0985caa to 0431b77 Compare March 9, 2022 14:57

ldgauthier approved these changes Mar 10, 2022

View reviewed changes

samuelklee merged commit b14f810 into master Mar 10, 2022

samuelklee deleted the sl_vqsr_convergence branch March 10, 2022 17:46

samuelklee mentioned this pull request Mar 23, 2022

New tools for annotation-based filtering. #7724

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Added regularization to covariance in GMM maximization step to fix convergence issues in VQSR. #7709

Added regularization to covariance in GMM maximization step to fix convergence issues in VQSR. #7709

samuelklee commented Mar 7, 2022 •

edited

Loading

ldgauthier commented Mar 8, 2022

samuelklee commented Mar 8, 2022

ldgauthier commented Mar 8, 2022

ldgauthier commented Mar 8, 2022 •

edited

Loading

ldgauthier commented Mar 8, 2022

samuelklee commented Mar 8, 2022 •

edited

Loading

ldgauthier commented Mar 9, 2022

samuelklee commented Mar 9, 2022 •

edited

Loading

ldgauthier commented Mar 10, 2022

samuelklee commented Mar 10, 2022

samuelklee commented Mar 10, 2022 •

edited

Loading

Added regularization to covariance in GMM maximization step to fix convergence issues in VQSR. #7709

Added regularization to covariance in GMM maximization step to fix convergence issues in VQSR. #7709

Conversation

samuelklee commented Mar 7, 2022 • edited Loading

ldgauthier commented Mar 8, 2022

samuelklee commented Mar 8, 2022

ldgauthier commented Mar 8, 2022

ldgauthier commented Mar 8, 2022 • edited Loading

ldgauthier commented Mar 8, 2022

samuelklee commented Mar 8, 2022 • edited Loading

ldgauthier commented Mar 9, 2022

samuelklee commented Mar 9, 2022 • edited Loading

ldgauthier commented Mar 10, 2022

samuelklee commented Mar 10, 2022

samuelklee commented Mar 10, 2022 • edited Loading

samuelklee commented Mar 7, 2022 •

edited

Loading

ldgauthier commented Mar 8, 2022 •

edited

Loading

samuelklee commented Mar 8, 2022 •

edited

Loading

samuelklee commented Mar 9, 2022 •

edited

Loading

samuelklee commented Mar 10, 2022 •

edited

Loading