include process to estimate error #64
Comments
Hello! I decided to take a shot at this issue. I've never worked on a pipeline of this complexity, so I'm still figuring out how to correctly define the inputs and outputs, how everything connects, and so on. So far I've made a process for EstimateError.py and got the pipeline to run on a sample of the Stern2014 data. Since it's such a small subset of the data (10K reads) and I'm testing on my wimpy home computer, I did not include the additional EstimateError arguments used in the example. I can keep working on this. I think only R1 is used at the moment, and I still need to send the outputs and logs to where the rest of the Presto outputs and logs go; currently the output files are written to the work dirs. Can you advise whether I put the process in a sensible place? It can run on the fastq outputs of PairSeq, PostConsensus_PairSeq, ClusterSets, and Parse_ClusterSets. My thinking was that running EstimateError right after PairSeq is the best choice, but maybe that's not the case. Here's the code so far if you'd like to see: Comparing changes to master branch.
Logs and outputs are now sent to a dir like the rest of the Presto outputs, and EstimateError is run on both R1 and R2. However, if I remember correctly, R2 almost never has a UMI, so EstimateError may be running unnecessarily on that read. A flag, or a check for which read has the UMI, could be added to avoid this.
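For the "check for which read has the UMI" idea, here is a rough sketch of what such a check might look like. It is not code from the pipeline. It assumes the UMI has already been copied into the read headers as a pRESTO-style `|FIELD=value` annotation, that the field is named `BARCODE` (the pipeline may use a different name), and that the FASTQ uses four lines per record. The function names `fastq_headers` and `has_umi` are made up for illustration.

```python
import gzip
from itertools import islice


def fastq_headers(path, limit=1000):
    """Yield up to `limit` header lines from a FASTQ file (plain or gzipped).

    Assumes four lines per record, so headers are lines 0, 4, 8, ...
    """
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt") as handle:
        for line in islice(handle, 0, limit * 4, 4):
            yield line.rstrip("\n")


def has_umi(path, field="BARCODE", limit=1000):
    """Return True if any sampled header carries a `|FIELD=` annotation.

    `field` is an assumed annotation name, not one confirmed by the pipeline.
    """
    tag = f"|{field}="
    return any(tag in header for header in fastq_headers(path, limit))
```

The result for each read file could then decide whether the EstimateError process is run on it, for example by skipping R2 when `has_umi("R2.fastq")` is false.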
EstimateError now runs on R1 or R2 based on the [...] Still to do (I'm putting this here mainly as notes for myself):
EstimateError can be used to estimate sequencing and PCR error rates: https://presto.readthedocs.io/en/stable/examples/tasks.html#estimating-sequencing-and-pcr-error-rates-with-umi-data
A process could be included to estimate these errors per sample