feat: balsamic UMI workflow solution #932

Merged: 18 commits, May 23, 2022

@ashwini06 ashwini06 commented May 17, 2022

This PR:

Fixes: #896

  • Adds a new --analysis-workflow option to balsamic config case that selects which workflow to run (balsamic or balsamic-umi)
  • With --analysis-workflow=balsamic, only the balsamic workflow runs; with --analysis-workflow=balsamic-umi, both the balsamic and UMI workflows run
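
The selection rule in the two bullets above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not BALSAMIC's actual code: the names ANALYSIS_WORKFLOWS and workflows_to_run are hypothetical, and only the two option values come from this PR.

```python
from typing import List

# Hypothetical sketch: these names are illustrative, not taken from BALSAMIC.
ANALYSIS_WORKFLOWS = ("balsamic", "balsamic-umi")  # option values named in this PR


def workflows_to_run(analysis_workflow: str) -> List[str]:
    """Return the workflow parts implied by an --analysis-workflow value."""
    if analysis_workflow not in ANALYSIS_WORKFLOWS:
        raise ValueError(
            f"unknown analysis workflow {analysis_workflow!r}; "
            f"expected one of {ANALYSIS_WORKFLOWS}"
        )
    parts = ["balsamic"]  # the balsamic workflow runs for both values
    if analysis_workflow == "balsamic-umi":
        parts.append("umi")  # the UMI workflow runs in addition
    return parts
```

Rejecting unknown values up front mirrors what a restricted CLI choice would do, so a mistyped workflow name fails before any analysis is configured.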

Test command line

# create config json for balsamic only analysis

balsamic config case  --analysis-workflow balsamic --case-id TN_panel_balsamic --analysis-dir run_tests/ --balsamic-cache /home/proj/stage/cancer/balsamic_cache/ -t tests/test_data/fastq/S1_R_1.fastq.gz -n tests/test_data/fastq/S2_R_1.fastq.gz -p tests/test_data/references/panel/panel.bed

balsamic run analysis -s run_tests/TN_panel_balsamic/TN_panel_balsamic.json

# create config json for balsamic+UMI  analysis

balsamic config case  --analysis-workflow balsamic-umi --case-id TN_panel_balsamic_umi --analysis-dir run_tests/ --balsamic-cache /home/proj/stage/cancer/balsamic_cache/ -t tests/test_data/fastq/S1_R_1.fastq.gz -n tests/test_data/fastq/S2_R_1.fastq.gz -p tests/test_data/references/panel/panel.bed 

balsamic run analysis -s run_tests/TN_panel_balsamic_umi/TN_panel_balsamic_umi.json

Review and tests:

  • Tests pass
  • Code review
  • New code is executed and covered by tests, and the tests are approved

@ashwini06 ashwini06 self-assigned this May 17, 2022
@ashwini06 ashwini06 changed the base branch from master to develop May 17, 2022 12:26
@ashwini06 ashwini06 marked this pull request as draft May 17, 2022 12:28
@ashwini06 ashwini06 changed the title Feat: balsamic UMI workflow solution feat: balsamic UMI workflow solution May 17, 2022

codecov bot commented May 17, 2022

Codecov Report

Merging #932 (5e758cc) into develop (97e83cf) will increase coverage by 0.00%.
The diff coverage is 100.00%.

@@           Coverage Diff            @@
##           develop     #932   +/-   ##
========================================
  Coverage    99.25%   99.25%           
========================================
  Files           29       29           
  Lines         1749     1756    +7     
========================================
+ Hits          1736     1743    +7     
  Misses          13       13           
Flag        Coverage Δ
unittests   99.25% <100.00%> (+<0.01%) ⬆️
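
The headline figures can be checked by hand from the Hits and Lines rows of the report. A minimal sketch, assuming coverage is simply hits divided by lines (the helper name coverage_percent is made up for this illustration):

```python
def coverage_percent(hits: int, lines: int) -> float:
    """Coverage as a percentage of covered lines (assumed definition)."""
    return 100.0 * hits / lines


before = coverage_percent(1736, 1749)  # develop, from the report
after = coverage_percent(1743, 1756)   # this PR, from the report
delta = after - before                 # change in percentage points

print(f"before={before:.4f}% after={after:.4f}% delta={delta:+.4f}")
```

Both values fall between 99.25% and 99.26%, and the difference is about 0.003 percentage points, which is consistent with the report showing +<0.01% even though seven more lines are covered.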

Impacted Files                      Coverage Δ
BALSAMIC/commands/config/pon.py     95.12% <ø> (ø)
BALSAMIC/commands/config/qc.py      100.00% <ø> (ø)
BALSAMIC/commands/config/case.py    96.42% <100.00%> (ø)
BALSAMIC/constants/common.py        100.00% <100.00%> (ø)
BALSAMIC/utils/models.py            100.00% <100.00%> (ø)

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 97e83cf...5e758cc.

ashwini06 commented May 17, 2022

balsamic dry-runs

For Tonly_panel (Balsamic_only)

Tonly_panel_balsamic

For Tonly_panel (Balsamic + UMI)

Tonly_panel_balsamic_umi

For TN_panel (Balsamic only)

TNpanel_balsamic

For TN_panel (Balsamic+UMI)

TNpanel_balsamic_umi

@ashwini06 ashwini06 marked this pull request as ready for review May 19, 2022 09:38
Collaborator

@khurrammaqbool khurrammaqbool left a comment

It looks good. I have a minor suggestion related to code readability.

@@ -136,6 +129,19 @@
"will be <outdir>/genome_version"
),
)
@click.option(
"-w",
"--analysis-workflow",
Collaborator

"analysis workflow" here and subsequently may cause confusion with "workflow solution". I suggest calling it umi-option, with balsamic-umi-on and balsamic-umi-off as its values, and updating all subsequent terms accordingly. That could be a better alternative to adding "workflow" as a keyword, and may make the name unique and descriptive.

Contributor Author

Thanks Khurram! I agree, and I thought about the existing constant workflow_solution:

WORKFLOW_SOLUTION = ["BALSAMIC", "Sentieon", "DRAGEN", "Sentieon_umi"]

But this option (--workflow-solution) is not used in the config case CLI. It is just a constant used within balsamic.smk.

Your suggestion to use balsamic-umi-on and balsamic-umi-off instead of analysis_workflow might be too specific. I was thinking of expanding this analysis_workflow option to run other analysis snakemake workflows (e.g. balsamic_PON) in the future.

So the thought process for running balsamic from production based on different analysis_workflow values (balsamic/balsamic_umi/balsamic_pon) is:

cg workflow balsamic start <case_id>
## executes
balsamic config case --case-id xx --analysis-dir xx --balsamic-cache xx -t xx -p xx --analysis-workflow balsamic
balsamic run analysis -s xx.json


cg workflow balsamic-umi start <case_id>
## executes
balsamic config case --case-id xx --analysis-dir xx --balsamic-cache xx -t xx -p xx --analysis-workflow balsamic_umi
balsamic run analysis -s xx.json

cg workflow balsamic-pon start <case_id>
## executes and generates a PON reference file in a specific location on hasta
balsamic config pon --case-id xx --analysis-dir xx --balsamic-cache xx --fastq_dir xx -p xx --analysis-workflow balsamic_pon
balsamic run analysis -s xx.json
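
The mapping described above, from a cg workflow name to the two balsamic commands it would execute, could be sketched as follows. This is a hypothetical helper, not code from cg or BALSAMIC: the names CG_TO_ANALYSIS_WORKFLOW and build_balsamic_commands are assumptions, the option values are the hyphenated ones from the PR description, and the PON case is left out because it is only proposed for the future. The config path layout follows the test commands in the PR description.

```python
import shlex
from pathlib import PurePosixPath
from typing import Dict, List

# Hypothetical mapping from a cg workflow name to an --analysis-workflow value.
CG_TO_ANALYSIS_WORKFLOW: Dict[str, str] = {
    "balsamic": "balsamic",
    "balsamic-umi": "balsamic-umi",
}


def build_balsamic_commands(
    cg_workflow: str,
    case_id: str,
    analysis_dir: str,
    balsamic_cache: str,
    tumor_fastq: str,
    panel_bed: str,
) -> List[List[str]]:
    """Return the config and run commands as argument lists."""
    analysis_workflow = CG_TO_ANALYSIS_WORKFLOW[cg_workflow]
    # Assumed layout: <analysis_dir>/<case_id>/<case_id>.json
    config_json = str(PurePosixPath(analysis_dir) / case_id / f"{case_id}.json")
    config_cmd = [
        "balsamic", "config", "case",
        "--case-id", case_id,
        "--analysis-dir", analysis_dir,
        "--balsamic-cache", balsamic_cache,
        "-t", tumor_fastq,
        "-p", panel_bed,
        "--analysis-workflow", analysis_workflow,
    ]
    run_cmd = ["balsamic", "run", "analysis", "-s", config_json]
    return [config_cmd, run_cmd]


if __name__ == "__main__":
    # Print the commands instead of executing them.
    for cmd in build_balsamic_commands(
        "balsamic-umi", "xx", "run_tests/", "cache/", "tumor.fastq.gz", "panel.bed"
    ):
        print(shlex.join(cmd))
```

Keeping the commands as argument lists avoids shell quoting problems if they are later passed to subprocess.run; only the workflow value differs between the two cg entry points.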

Contributor

I agree with --analysis-workflow and will also use it for balsamic-qc. I think it makes the cg commands clearer. Indeed, analysis_workflow is close to workflow_solution, but workflow_solution could be more specific, no? Only the variant callers are influenced, not the rest of the pipeline. What about something like calling_solution?

Contributor Author

workflow_solution is just a constant name used within balsamic.smk. If you think it could lead to confusion, we can rename it to something else, like calling_solution or variant_calling_solution.

Contributor

@rannick rannick May 20, 2022

I don't think workflow_solution or analysis_workflow needs to be changed, but if you want to change one of them to make them more distinct from each other, I vote to rename workflow_solution to variant_calling_solution or something like it, and keep analysis_workflow.

@ashwini06 ashwini06 requested a review from rannick May 20, 2022 14:00
BALSAMIC/utils/models.py (outdated, resolved)
BALSAMIC/workflows/balsamic.smk (outdated, resolved)
@ashwini06 ashwini06 changed the base branch from develop to master May 20, 2022 14:48
@ashwini06 ashwini06 changed the base branch from master to develop May 20, 2022 14:49
ashwini06 and others added 2 commits May 20, 2022 17:24
Co-authored-by: Annick Renevey <47788523+rannick@users.noreply.github.com>
Co-authored-by: Annick Renevey <47788523+rannick@users.noreply.github.com>
@sonarqubecloud

Kudos, SonarCloud Quality Gate passed!

Bugs: 0 (rating A)
Vulnerabilities: 0 (rating A)
Security Hotspots: 0 (rating A)
Code Smells: 0 (rating A)

Coverage: no coverage information
Duplication: 0.0%

Contributor

@ivadym ivadym left a comment

👍

@ashwini06 ashwini06 merged commit 8d8ec17 into develop May 23, 2022
@ashwini06 ashwini06 deleted the feat/workflow_solution branch May 23, 2022 15:04
@ashwini06 ashwini06 mentioned this pull request Jun 13, 2022
17 tasks
@ashwini06 ashwini06 mentioned this pull request Jun 21, 2022
3 tasks
Successfully merging this pull request may close these issues.

Decouple UMI workflow from BALSAMIC
4 participants