how to create input #1

Open
baj12 opened this issue Oct 24, 2021 · 2 comments
Comments

@baj12

baj12 commented Oct 24, 2021

Hi,
I would like to test your tool. How can I create the two input files? I have the results from a cellranger run, i.e. BAM, FASTQ, and count matrix.
Thx
Bernd

@lweber21
Collaborator

Hi,
Thanks for inquiring about our tool. I'm not too familiar with cellranger, but my understanding is that it is a single-cell RNA expression tool. Are you working with single-cell RNA data or single-cell DNA data? Please note that our tool is designed for high-coverage single-cell DNA sequencing data, although it might be possible to try it on RNA data.

To create the input, you will need a list of single-nucleotide variants. If you do not have a list of suspected variants, you would need to run a variant caller on the pooled single cells. You can then use your BAM file to perform a pileup at these positions.

The two input matrices both have dimension (number of cells or droplets) x (number of variants). In the DP matrix, entry (i, j) is the total read count for cell (droplet) i at variant j. In the AD matrix, entry (i, j) is the number of alternate (variant) reads for cell (droplet) i at variant j. For the specific file formatting, please see the example files.
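
To make the DP/AD definitions concrete, here is a minimal Python sketch of the counting step only. It assumes the pileup has already been reduced to one record per read of the form (cell barcode, variant ID, whether the read carries the alternate allele). That record layout, the function name `build_count_matrices`, and the variant ID strings are all hypothetical and chosen for illustration. This is not the tool's own preprocessing code, and it does not write the tool's file format (see the example files for that).

```python
def build_count_matrices(observations, cells, variants):
    """Build DP (total reads) and AD (alternate reads) matrices.

    observations: iterable of (cell_barcode, variant_id, is_alt) tuples,
                  one per read overlapping a variant position
                  (hypothetical layout, assumed for this sketch).
    cells:        ordered list of cell barcodes (matrix rows).
    variants:     ordered list of variant IDs (matrix columns).
    """
    cell_index = {c: i for i, c in enumerate(cells)}
    variant_index = {v: j for j, v in enumerate(variants)}

    # Both matrices have shape: number of cells x number of variants.
    dp = [[0] * len(variants) for _ in cells]
    ad = [[0] * len(variants) for _ in cells]

    for cell, variant, is_alt in observations:
        i = cell_index.get(cell)
        j = variant_index.get(variant)
        if i is None or j is None:
            # Skip reads from unknown barcodes or unlisted positions.
            continue
        dp[i][j] += 1          # every read counts toward total depth
        if is_alt:
            ad[i][j] += 1      # only alternate-allele reads count here
    return dp, ad


# Toy data: two cells, two variants.
cells = ["cellA", "cellB"]
variants = ["chr1:100:A>T", "chr2:250:G>C"]
observations = [
    ("cellA", "chr1:100:A>T", True),
    ("cellA", "chr1:100:A>T", False),
    ("cellA", "chr2:250:G>C", False),
    ("cellB", "chr1:100:A>T", True),
    ("cellX", "chr1:100:A>T", True),  # barcode not in `cells`, ignored
]

dp, ad = build_count_matrices(observations, cells, variants)
print(dp)  # [[2, 1], [1, 0]]
print(ad)  # [[1, 0], [1, 0]]
```

By construction, every AD entry is less than or equal to the matching DP entry, which is a quick sanity check on real data.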

@baj12
Author

baj12 commented Oct 27, 2021

Thanks for the response. Indeed, I am working with RNAseq data.
When you talk about high-coverage single-cell DNA data, do you mean "full" coverage of the genome, sequencing depth per position, or large groups of single cells (and if so, how many cells)? Since I would expect only one DNA molecule per cell, I assume it is the first one.
I am interested in testing this on mRNA data, but I am not too familiar with the variant calling process. Do you have any pointers on which programs to use and which parameters to look out for? As a starter, the commands you used to generate the input data would be perfect. Thanks so much for your kind support.
