-
Notifications
You must be signed in to change notification settings - Fork 16
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: SVdb merge SV and CNV #886
Conversation
Codecov Report
@@ Coverage Diff @@
## develop #886 +/- ##
========================================
Coverage 99.60% 99.60%
========================================
Files 29 29
Lines 1762 1765 +3
========================================
+ Hits 1755 1758 +3
Misses 7 7
Flags with carried forward coverage won't be shown. Click here to find out more.
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Brilliant solution. See my suggestions below.
@@ -222,16 +226,18 @@ rule svdb_merge_tumor_normal: | |||
tumor = get_sample_type(config["samples"], "tumor"), | |||
normal = get_sample_type(config["samples"], "normal"), | |||
case_name = config["analysis"]["case_id"], | |||
vcf= lambda wildcards, input:[input[index] + ":" + sv_callers[index] for index in range(0,len(input))], | |||
svdb_priority= ",".join(sv_callers) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd suggest to have this variable named properly. e.g. svdb_sv_caller_prio
or something.
@@ -222,16 +226,18 @@ rule svdb_merge_tumor_normal: | |||
tumor = get_sample_type(config["samples"], "tumor"), | |||
normal = get_sample_type(config["samples"], "normal"), | |||
case_name = config["analysis"]["case_id"], | |||
vcf= lambda wildcards, input:[input[index] + ":" + sv_callers[index] for index in range(0,len(input))], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably create a function for this for readability. A function that takes sv_callers and input vcf, and outputs the string that you want. This way, you'd avoid any potential bug if order of input/sv_callers changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The solution above is currently the only one working without any errors!
vcf= lambda wildcards, input:[input[index] + ":" + sv_callers[index] for index in range(0,len(input))], | ||
svdb_priority= ",".join(sv_callers) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comments as above.
@@ -75,7 +75,7 @@ def get_variant_callers( | |||
WorkflowRunError if values are not valid | |||
""" | |||
|
|||
valid_variant_callers = set() | |||
valid_variant_callers = list() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changing from list to set might cause redundant entries :-) Make sure you remove them and order it properly now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I looked into this and it is not generating redundant entries. The workflow does not show any changes. Using set() generates random order and this has significant impact on the results.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be ambiguous if/when orders change. Just something to keep it in mind.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good 👍🏼
@@ -222,16 +227,19 @@ rule svdb_merge_tumor_normal: | |||
tumor = get_sample_type(config["samples"], "tumor"), | |||
normal = get_sample_type(config["samples"], "normal"), | |||
case_name = config["analysis"]["case_id"], | |||
vcf= lambda wildcards, input:[input[index] + ":" + svdb_sv_callers_to_merge_prio[index] for index in range(0,len(input))], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
svdb_sv_callers_to_merge_prio - maybe this name can be simplified.
decoupling this into a separate function could simplify the readability. But maybe it will turn the difficulty of passing input vcf into the function without using wildcards. We can leave this as such for now., as it works fine
@@ -118,16 +122,17 @@ rule svdb_merge_tumor_only: | |||
params: | |||
tumor = get_sample_type(config["samples"], "tumor"), | |||
case_name = config["analysis"]["case_id"], | |||
vcf= lambda wildcards, input:[input[index] + ":" + svdb_sv_callers_to_merge_prio[index] for index in range(0,len(input))], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same as above
cfdb51d
to
880e860
Compare
Kudos, SonarCloud Quality Gate passed! 0 Bugs No Coverage information |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in the function you added, you forgot to add wildcards as input, no?
otherwise, nicely done.
@@ -75,7 +75,7 @@ def get_variant_callers( | |||
WorkflowRunError if values are not valid | |||
""" | |||
|
|||
valid_variant_callers = set() | |||
valid_variant_callers = list() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be ambiguous if/when orders change. Just something to keep it in mind.
@@ -222,16 +226,18 @@ rule svdb_merge_tumor_normal: | |||
tumor = get_sample_type(config["samples"], "tumor"), | |||
normal = get_sample_type(config["samples"], "normal"), | |||
case_name = config["analysis"]["case_id"], | |||
vcf= lambda wildcards, input:[input[index] + ":" + svdb_callers_prio[index] for index in range(0,len(input))], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this work?
def construct_vcf(w, i, sv):
outp=list()
for index in range(0,len(i)):
outp.append(f"i[index]:sv[index]")
return outp
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
or maybe not. wildcard is the limiting factor, and can cause an issue. Ignore this :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
or maybe not. wildcard is the limiting factor, and can cause an issue. Ignore this :-)
Also, input is not a list and will not iterate in for....
This PR:
Added: svdb merge SV and CNV
Review and tests: