This repository has been archived by the owner on Aug 23, 2024. It is now read-only.

Missing parameters in example file for v2.1.0 #45

Open
ardydavari opened this issue Nov 23, 2019 · 20 comments

Comments

@ardydavari

https://github.com/cancerit/dockstore-cgpwgs/blob/develop/examples/cgpwgs/bam_bai.json is missing references to a .bas file

@ardydavari
Author

ardydavari commented Nov 23, 2019

I was able to generate the .bas file using bam_stats and confirmed that the example script then runs without error. However, the base example .json should still be updated.

@keiranmraine
Contributor

Thank you for reporting. This is likely because an earlier version of the flow generated the .bas files itself when they were not supplied.
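
Since newer versions apparently expect the .bas file to be supplied, a quick pre-flight check can catch a missing one before the workflow starts. The following is a minimal sketch, not part of the project: the function name check_bam_companions is hypothetical, and it assumes the companion files sit next to the BAM and are named by appending .bai and .bas to the BAM file name (the .bas naming matches the command used in this thread; the .bai naming is an assumption).

```shell
#!/bin/sh
# Sketch: report missing companion files for each BAM passed as an argument.
# Assumes companions are named <bam>.bai and <bam>.bas (naming is an assumption).
check_bam_companions() {
    missing=0
    for bam in "$@"; do
        for ext in bai bas; do
            if [ ! -e "$bam.$ext" ]; then
                echo "missing: $bam.$ext"
                missing=1
            fi
        done
    done
    # Non-zero exit status when at least one companion file is absent
    return "$missing"
}
```

Called as check_bam_companions tumour.bam normal.bam (file names are placeholders), it lists each absent file and returns a non-zero status if any are missing.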

@jinghu23

Hi @ardydavari, where can I get this bwa_stats tool? I did not find it. Could you share a link?

@ardydavari
Author

Sorry, I made a typo. It's bam_stats, and it's included in the dockstore-cgpmap image. I ran the container as root, mounted the volume with the example bam, and executed bam_stats -i example.bam -o example.bam.bas
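
The steps described above could be wrapped roughly as follows. This is only a sketch under stated assumptions: the image reference (registry path and TAG) is a placeholder to replace with whatever dockstore-cgpmap image you actually pulled, /data is an arbitrary mount point, and the make_bas function and DRY_RUN switch are hypothetical helpers. Only the bam_stats -i ... -o ... invocation comes from this thread.

```shell
#!/bin/sh
# Sketch: generate <bam>.bas next to a BAM by running bam_stats in a container.
# IMAGE is a placeholder; set it to the dockstore-cgpmap image and tag you use.
IMAGE="${IMAGE:-quay.io/wtsicgp/dockstore-cgpmap:TAG}"

make_bas() {
    bam="$1"                                # path to the BAM on the host
    dir=$(cd "$(dirname "$bam")" && pwd)    # absolute directory to mount
    name=$(basename "$bam")
    # When DRY_RUN is set, the command is printed instead of executed
    ${DRY_RUN:+echo} docker run --rm --user root -v "$dir":/data "$IMAGE" \
        bam_stats -i "/data/$name" -o "/data/$name.bas"
}
```

Running DRY_RUN=1 make_bas example.bam first prints the exact docker command, which makes it easy to check the mount and paths before running it for real.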

@jinghu23

Thanks so much. I just tried it, but it failed with a segmentation fault. Have you ever had this problem before?

@ghost

ghost commented Dec 11, 2019

@jinghu23 can you paste the full error message you received please?

@jinghu23

jinghu23 commented Dec 11, 2019 via email

@ghost

ghost commented Dec 11, 2019

And the command you ran please?

@jinghu23

I did just as ardydavari said. I ran the image with a directory mounted, then:
[image]

@jinghu23

Hi @drjsanger, I also tried to run it on my own PC and it failed with the same fault.

@winni2k

winni2k commented Jun 8, 2020

I tried running bam_stats from the cgpwgs singularity container and I also got a segmentation fault. I will try remapping my reads with cgpmap and see if I get the required files that way.

@keiranmraine
Contributor

Trying to allocate time for TLC on these projects...sorry

@winni2k

winni2k commented Jun 8, 2020

I'm just trying to replicate an analysis from a scientific paper that used caveman for variant calling. Just to make sure I'm avoiding the XY problem here: Is my understanding correct that this docker image is likely the easiest way to run caveman as a one-off on a small number of BAM files?

@keiranmraine
Contributor

@winni2k if you are only wanting to run caveman then the dedicated container is the best option, but that depends on which version of the underlying algorithm you are wanting to replicate. Caveman has had updates which would change the outcome:

https://github.com/cancerit/cgpCaVEManWrapper/releases

@winni2k

winni2k commented Jun 8, 2020

Sorry, which dedicated container are you referring to? Do you mean the one provided by this github repo?

@keiranmraine
Contributor

Sorry, it looks like the docs on that project weren't updated either. There's no CWL for this one at present:

https://quay.io/repository/wtsicgp/cgpcavemanwrapper?tab=info

The main reason to look at this is that you don't need BAS files for caveman, only for pindel and brass. However, if you are running an analysis that expects to incorporate the ASCAT profile into CaVEMan, then you need the cgpwgs container, as it will do everything.

BAS file generation is known to work on the minimal image for PCAP-core:
https://quay.io/repository/wtsicgp/pcap-core

@winni2k

winni2k commented Jun 8, 2020

Thank you for the heads up on the substantial changes. The paper only describes the methods in prose, and does not specify the version of caveman used. It does cite the cgpCaVEManWrapper paper by Jones D et al (2016) and that "All somatic changes in whole genome data were discovered with mutation calling pipelines developed in house (available at https://github.com/cancerit )."

So, then that means that I would use https://quay.io/repository/wtsicgp/cgpcavemanwrapper?tab=info to run the cgpcavemanwrapper script. The documentation on https://github.com/cancerit/cgpCaVEManWrapper seems to imply that the wrapper should actually be run through the docker image provided by this repo 😕

Thanks for the link to the wrapper docker image. I'll try that. Sorry to hijack this issue.

@keiranmraine
Contributor

If the VCF files are provided as supplementary data, the version is in the VCF header. Older versions of caveman are only in the cgpwgs image, and it's possible that the specific version isn't present in any. Do you have a more complete reference? That one isn't great.
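
To look for the version in a VCF header, something like the sketch below may help. It is an illustration only: the function name vcf_header_grep is hypothetical, and because the exact header key that records the caveman version is not stated in this thread, it simply prints every `##` meta-information line that matches a case-insensitive pattern (default "caveman") and leaves the reading to you.

```shell
#!/bin/sh
# Sketch: print VCF meta-information lines (those starting with "##") that
# match a pattern. Handles plain and gzip-compressed files.
vcf_header_grep() {
    vcf="$1"
    pattern="${2:-caveman}"     # default search term is an assumption
    case "$vcf" in
        *.gz) gzip -dc "$vcf" ;;
        *)    cat "$vcf" ;;
    esac |
        # Stop at the first line that is not a "##" header line
        awk '/^##/ {print; next} {exit}' |
        grep -i "$pattern"
}
```

For example, vcf_header_grep sample.caveman.vcf.gz (file name is a placeholder) prints the matching header lines, and the status is non-zero if nothing matches.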

@winni2k

winni2k commented Jun 8, 2020

Oh, sorry, I am trying to reproduce the SNP variant calling described in the supplementary methods section "Mutation discovery and tree building from whole genome sequencing" of

Lee-Six, Henry, et al. "Population dynamics of normal human blood inferred from somatic mutations." Nature 561.7724 (2018): 473-478.

That supplementary section cites

Jones D et al. cgpCaVEManWrapper: simple execution of CaVEMan in order to detect somatic single nucleotide variants in NGS data. Curr Protoc Bioinformatics 56, 15.10.1–15.10.18 (2016)

@keiranmraine
Contributor

I recommend you use v1.1.5 of cgpwgs for your attempt. Please be aware that you should use the cgpwxs image for the targeted data. Should you have further queries please create a relevant issue specific to your problem.
