Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove abandoned filter project and unneeded build dependency #7950

Merged
merged 3 commits into from
Jul 19, 2022

Conversation

TedBrookings
Copy link
Contributor

I'm never revisiting this effort (even if we somehow go back to the spark SV tool, I think there are better ways to approach this), so I wanted to get the xgboost predictor dependency out of GATK.

@gatk-bot
Copy link

gatk-bot commented Jul 18, 2022

Github actions tests reported job failures from actions build 2693262585
Failures in the following jobs:

Test Type JDK Job ID Logs
integration 11 2693262585.12 logs
integration 8 2693262585.0 logs

@codecov
Copy link

codecov bot commented Jul 18, 2022

Codecov Report

Merging #7950 (17c7434) into master (804d1e2) will decrease coverage by 0.009%.
The diff coverage is 100.000%.

@@               Coverage Diff               @@
##              master     #7950       +/-   ##
===============================================
- Coverage     87.078%   87.070%   -0.009%     
+ Complexity     36997     36905       -92     
===============================================
  Files           2217      2213        -4     
  Lines         173849    173276      -573     
  Branches       18791     18700       -91     
===============================================
- Hits          151385    150871      -514     
- Misses         15832     15833        +1     
+ Partials        6632      6572       -60     
Impacted Files Coverage Δ
...tructuralVariationDiscoveryArgumentCollection.java 94.595% <ø> (-0.142%) ⬇️
...on/FindBreakpointEvidenceSparkIntegrationTest.java 96.296% <ø> (-3.704%) ⬇️
.../sv/integration/SVIntegrationTestDataProvider.java 93.333% <ø> (-0.784%) ⬇️
...spark/sv/evidence/FindBreakpointEvidenceSpark.java 69.710% <100.000%> (+0.225%) ⬆️
...er/tools/spark/sv/evidence/BreakpointEvidence.java 66.822% <0.000%> (-11.916%) ⬇️
...rg/broadinstitute/hellbender/utils/SVInterval.java 90.000% <0.000%> (-2.857%) ⬇️
...ute/hellbender/tools/spark/utils/IntHistogram.java 86.335% <0.000%> (-2.484%) ⬇️
...lbender/tools/sv/cluster/CanonicalSVCollapser.java 87.471% <0.000%> (-0.464%) ⬇️
...kers/haplotypecaller/AssemblyBasedCallerUtils.java 90.517% <0.000%> (-0.041%) ⬇️
... and 18 more

Copy link
Member

@cwhelan cwhelan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, fine to remove this stuff if we're not going to use it again. Nice to streamline dependencies and unused code out of GATK -- writing it was a learning experience but I think we've moved on to better approaches. My one question is -- aren't we going to add the xgboost dependency back in #7705 if that gets merged?

@TedBrookings
Copy link
Contributor Author

No, this dependency is not from the main xgboost project, it's a 3rd-party library that allows you to make predictions using saved model files. At the time, the plan was to train models in python, then use this java tool to apply the trained predictor. The GQ filtering project trains using a java tool, so I have to bring in the real xgboost library.

@cwhelan
Copy link
Member

cwhelan commented Jul 19, 2022

Ah, so the ml.dmlc dependency is the real xgboost library and the tool for predicting from trained models? Got it, thanks for the clarification.

@TedBrookings TedBrookings merged commit 3749fa7 into master Jul 19, 2022
@TedBrookings TedBrookings deleted the tb_remove_xgboost_abandonware branch July 19, 2022 17:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants