Typical command for EM-based 10x processing #1262
Comments
Hi Alex, you are right: there is no detailed description of STARsolo options in the manual at the moment; it's available here: To match the CR4 results, your parameters look good, except you also need Cheers |
Thank you very much for the answer! Yes, I've read the preprint carefully - but I was not sure why the option was missing from the manual, while others (e.g. Uniform, Rescue etc) were listed. Also, there is no p.59 in the preprint, I think you meant 19 ;) Actually, there seems to be a problem with Just in case it matters, I'm using STAR v2.7.9a installed from bioconda. |
Some extra information: it seems like the extra files only show up in the "raw" output folder. If I use Thank you for your help again! |
Hi Alex, that's a bug, it indeed does not work when only one multi-mapping option is supplied, but works fine when multiple options are supplied. I will fix it shortly. Cheers |
…EM options is specified in --soloMultiMappers.
Hello Alex, just a quick follow-up on multimapper processing. Thank you for fixing the issue above! There are a few things I wanted to check with you:
|
Hi Alex, thanks for your suggestions!
That's the present behavior... it should be possible to filter cells using the multimapping matrices, but I have not tried it. You can run STAR in CellFiltering mode, and use the multimapping raw matrices as the input, which will produce the filtering output:
Yes, multimapping output for velocyto matrices is not implemented yet, this is on my TODO list.
Rounding to integer values may not be the best way to deal with multimapping values, as it introduces large errors for many genes, especially for values close to 1, which are quite abundant. After the count matrix is normalized, all values become non-integer, and at that point you may want to filter out values that are too small. I am not a big fan of the hdf5 format. The matrix.mtx seems sufficient and is much easier to work with. Cheers |
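To put rough numbers on the rounding argument, here is a minimal sketch in plain Python. The fractional counts are made up for illustration and do not come from any STARsolo run, and `relative_rounding_error` is a hypothetical helper, not part of STAR.

```python
# Made-up fractional counts of the kind a multimapper-aware
# quantification might produce; not taken from any real run.
counts = [0.4, 0.6, 1.4, 1.6, 10.4, 100.4]


def relative_rounding_error(x):
    """Return |round(x) - x| / x for a positive count x."""
    return abs(round(x) - x) / x


for c in counts:
    err = relative_rounding_error(c)
    print(f"{c:7.1f} -> {round(c):4d}   relative error {err:.1%}")
```

The absolute error is 0.4 for every value here, but relative to the count it shrinks quickly as the count grows, so the damage from rounding is concentrated in the small values near 1.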
Thank you very much for this detailed answer, I really appreciate it. Filtering of the multimapping matrix works out quite nicely - I've actually used the newest Your argument about rounding is an important one. I think for this reason |
Hello Alex,
Thanks again for STAR and STARsolo - the latest update with multi-mapper counting is especially great to have!
I had a few questions/suggestions:
--soloMultiMappers EM option listed in the manual for v2.7.9a, although the command itself works fine?
STARsolo runs in the manual, since the number of settings became overwhelming. Am I right to assume that the command to reproduce the latest CellRanger (version 4 and above) result should look something like this:
STAR --runThreadN 16 --genomeDir $REF --readFilesIn $R2 $R1 --runDirPerm All_RWX --readFilesCommand zcat --outSAMtype BAM SortedByCoordinate --outMultimapperOrder Random --runRNGseed 1 --outSAMattributes NH HI AS nM CB UB GX GN --soloType CB_UMI_Simple --soloCBwhitelist $BC --soloBarcodeReadLength 0 --soloUMIlen $UMILEN --soloStrand Forward --soloUMIdedup 1MM_CR --soloCBmatchWLtype 1MM_multi_Nbase_pseudocounts --soloUMIfiltering MultiGeneUMI_CR --soloCellFilter EmptyDrops_CR --clipAdapterType CellRanger4
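For scripted runs, the command above can be assembled as an argument list, with the multimapper option from the first question appended. This is only a sketch: `build_starsolo_command`, its parameter names, and the example file names and UMI length are hypothetical; the flags are copied from the command above plus `--soloMultiMappers`. It does not include the additional parameter the maintainer's reply says is needed to match CR4 results, since that parameter is not shown in this thread.

```python
import shlex


def build_starsolo_command(genome_dir, r1, r2, whitelist, umi_len,
                           threads=16, multimappers=("EM",)):
    """Return the STAR argument list from the post, plus --soloMultiMappers."""
    cmd = [
        "STAR",
        "--runThreadN", str(threads),
        "--genomeDir", genome_dir,
        "--readFilesIn", r2, r1,  # cDNA read first, barcode read second, as in the post
        "--runDirPerm", "All_RWX",
        "--readFilesCommand", "zcat",
        "--outSAMtype", "BAM", "SortedByCoordinate",
        "--outMultimapperOrder", "Random",
        "--runRNGseed", "1",
        "--outSAMattributes", "NH", "HI", "AS", "nM", "CB", "UB", "GX", "GN",
        "--soloType", "CB_UMI_Simple",
        "--soloCBwhitelist", whitelist,
        "--soloBarcodeReadLength", "0",
        "--soloUMIlen", str(umi_len),
        "--soloStrand", "Forward",
        "--soloUMIdedup", "1MM_CR",
        "--soloCBmatchWLtype", "1MM_multi_Nbase_pseudocounts",
        "--soloUMIfiltering", "MultiGeneUMI_CR",
        "--soloCellFilter", "EmptyDrops_CR",
        "--clipAdapterType", "CellRanger4",
    ]
    if multimappers:
        # One or more multimapper methods, e.g. ("EM",) or ("EM", "Uniform")
        cmd += ["--soloMultiMappers", *multimappers]
    return cmd


# Hypothetical inputs, for illustration only
cmd = build_starsolo_command("ref_dir", "R1.fastq.gz", "R2.fastq.gz",
                             "whitelist.txt", 12)
print(shlex.join(cmd))
```

The list can be passed to `subprocess.run(cmd, check=True)` on a machine where STAR is installed. Passing several methods, e.g. `multimappers=("EM", "Uniform")`, may be worth trying on v2.7.9a, given the report in this thread that a single multi-mapping option did not work there.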