This repository has been archived by the owner on Jan 27, 2020. It is now read-only.

Multi-Sample Output Organization #691

Closed
apeltzer opened this issue Nov 16, 2018 · 14 comments

@apeltzer
Collaborator

Is your feature request related to a problem? Please describe.

Currently, the output is organized so that every VCF file lands in a single folder.

Describe the solution you'd like

It would be nice if users could specify a single TSV containing multiple patients and the generated VCFs etc. were stored nicely in

Describe alternatives you've considered

Creating n TSV files for each patient individually, which is quite cumbersome...

We discussed this together with @ggabernet @maxulysse on Gitter :-)
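
As a rough illustration of the request, per-patient output could be sketched as below. The directory layout (`<outdir>/<patient>/VariantCalling/<caller>/`), the function name `patient_output_path`, and the sample file names are all hypothetical; they are not taken from Sarek.

```python
from pathlib import PurePosixPath


def patient_output_path(outdir, patient, caller, filename):
    """Return a per-patient location for one result file (hypothetical layout)."""
    return PurePosixPath(outdir) / patient / "VariantCalling" / caller / filename


# Hypothetical (patient, caller, file) results from one multi-patient run.
results = [
    ("patient1", "Strelka", "tumor1_vs_normal1.vcf.gz"),
    ("patient2", "Strelka", "tumor2_vs_normal2.vcf.gz"),
]

for patient, caller, filename in results:
    # Each patient gets its own subtree instead of one shared VCF folder.
    print(patient_output_path("results", patient, caller, filename))
```

With a layout like this, a single TSV listing several patients would still produce one self-contained folder per patient.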

@ggabernet

Currently it is possible to provide a TSV containing multiple patients as input, but I realized that somatic variant calling then calls all possibilities of tumor-normal pairs, even between different patients, so this would need to be fixed too.
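
To make the difference concrete, here is a minimal sketch of the two pairing strategies. It is not how Sarek implements pairing; the record shape `(patient, status, sample)` with status 0 for normal and 1 for tumor, and both function names, are assumptions made for illustration.

```python
from collections import defaultdict
from itertools import product

# Hypothetical sample records: (patient, status, sample); 0 = normal, 1 = tumor.
samples = [
    ("patient1", 0, "normal1"),
    ("patient1", 1, "tumor1"),
    ("patient2", 0, "normal2"),
    ("patient2", 1, "tumor2"),
]


def all_pairs(samples):
    """Pair every tumor with every normal, ignoring the patient (the reported behaviour)."""
    normals = [s for s in samples if s[1] == 0]
    tumors = [s for s in samples if s[1] == 1]
    return [(t[2], n[2]) for t, n in product(tumors, normals)]


def pairs_within_patient(samples):
    """Pair tumors only with normals that belong to the same patient."""
    by_patient = defaultdict(lambda: {"normal": [], "tumor": []})
    for patient, status, sample in samples:
        by_patient[patient]["tumor" if status == 1 else "normal"].append(sample)
    pairs = []
    for patient, group in by_patient.items():
        for tumor, normal in product(group["tumor"], group["normal"]):
            pairs.append((patient, tumor, normal))
    return pairs


print(len(all_pairs(samples)))        # 4 pairs, two of them across patients
print(pairs_within_patient(samples))  # one pair per patient
```

For N patients with one tumor and one normal each, the first strategy produces N × N pairs while the second produces N.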

@maxulysse
Member

I can see why this is happening. Thanks for telling me about it.

@ggabernet

Sure, I'm very happy to use your pipeline!

@maxulysse
Member

I'm glad that you're using it. With more users like you, Sarek can only improve its quality ;-)

@jongtaek-kim

somatic variant calling then calls all possibilities of tumor-normal pairs, even between different patients,

Has this been addressed yet? This wastes time and computing resources, and the waste grows roughly quadratically as the number of patients increases.
For now, I am resorting to running each patient's T/N pair separately.
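
The workaround of running each patient separately can be scripted. The sketch below splits a multi-patient TSV into one TSV per patient. It assumes the patient identifier is the first tab-separated column, and the three-column rows are shortened placeholders, not the full input format.

```python
import csv
import io
from collections import defaultdict


def split_tsv_by_patient(tsv_text):
    """Group the rows of a multi-patient TSV by the value in the first column."""
    rows_by_patient = defaultdict(list)
    for row in csv.reader(io.StringIO(tsv_text), delimiter="\t"):
        if row:  # skip blank lines
            rows_by_patient[row[0]].append(row)
    per_patient = {}
    for patient, rows in rows_by_patient.items():
        out = io.StringIO()
        csv.writer(out, delimiter="\t", lineterminator="\n").writerows(rows)
        per_patient[patient] = out.getvalue()
    return per_patient


# Placeholder rows: patient, status, sample (assumed column order).
example = (
    "patient1\t0\tnormal1\n"
    "patient1\t1\ttumor1\n"
    "patient2\t0\tnormal2\n"
    "patient2\t1\ttumor2\n"
)

for patient, text in split_tsv_by_patient(example).items():
    # In practice each text would be written to its own file, e.g. f"{patient}.tsv".
    print(patient, text.count("\n"), "rows")
```

Each resulting TSV can then be passed to its own pipeline run, so no cross-patient pairs are ever formed.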

@maxulysse
Member

Hi,
I'm sorry, I didn't manage to find time to implement that yet.
But it's definitely something that I plan for our next release.

@maxulysse
Member

The bug encountered by @ggabernet will be fixed by #728.
Concerning the organization of files, is it just the VCFs that are posing an issue for you,
or would you want separate directories for the BAMs as well?

@ggabernet

ggabernet commented Feb 14, 2019 via email

@maxulysse
Member

Thanks for your input.
@jongtaek-kim what is your use case?

@jongtaek-kim

@maxulysse
I am running the pipeline on exome T/N samples to detect somatic SNVs and indels on my academic health center's HPC. Thanks for the pipeline repo.

@maxulysse
Member

@jongtaek-kim I can see why you're interested in this issue.
It does indeed make a lot of sense to have that for exome.

Are VCFs organized in patient-specific folders enough for you?
Are you interested in the same thing for BAMs as well?

@jongtaek-kim

Are VCFs organized in patient-specific folders enough for you?
Are you interested in the same thing for BAMs as well?

After doing some thinking, I think both are good ideas to implement.

@maxulysse
Member

OK, so the VCFs are to be separated in PR #728.
I'll work on the BAMs later on.

@jongtaek-kim

jongtaek-kim commented Feb 26, 2019

@maxulysse Thank you so much.

Also, I appreciate getting targetBED fixed (eb3dfb1).
I did a small test run on the dev branch and can confirm that the --targetBED parameter worked.
This fits my exome sequencing workflow, where I have to provide a BED file.
I will run the whole thing on dev with #728 merged and see how it goes.
