Mitch Peer Review #3

Open
mrezzoni opened this issue Oct 27, 2018 · 0 comments

Ana!

First of all, I'm tripping out on how similar our pseudocode is. 10/10 for overall readability. Your test examples are incredibly thorough: you seem to have considered many different kinds of duplicates, which will pay significant dividends in your real code. I like how you explicitly stated that you want samtools to sort by chromosome, as this will make identifying duplicates a lot faster.

Jason also suggested that I implement a set, so I can totally help you with this. You just need to initialize an empty set (as you've done) and add your items of interest with ".add". Consider re-initializing the set after you have written out your unique reads and moved on to a new chromosome, to minimize the amount of stuff stored in memory.
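Here is a rough sketch of what I mean by the set idea, assuming your records arrive sorted by chromosome. The function name `unique_reads` and the `(chrom, key, line)` tuple layout are placeholders I made up for illustration, not anything from your pseudocode.

```python
def unique_reads(records):
    """Yield the line of each record whose key is new on its chromosome.

    `records` is an iterable of (chrom, key, line) tuples, assumed to be
    sorted by chromosome. `key` stands in for whatever identifies a
    duplicate (for example UMI, strand, and adjusted position).
    """
    seen = set()          # keys already written for the current chromosome
    current_chrom = None
    for chrom, key, line in records:
        if chrom != current_chrom:
            seen = set()  # new chromosome: old keys can never match again
            current_chrom = chrom
        if key not in seen:
            seen.add(key)
            yield line
```

Because the input is sorted, a key from an earlier chromosome can't show up again, so throwing the old set away only frees memory and doesn't change which reads are kept.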

Really nice high-level functions. The fact that you created two functions to account for strandedness is awesome, but consider combining adj_strand+_pos and adj_strand-_pos, as one is simply the inverse of the other. Nice job accounting for other CIGAR operations such as I, D, and N. When you call UMI_check(string) to see if the UMI is in dict_known_UMI, wouldn't you want to skip it altogether?
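One possible way to fold the two strand functions into one is sketched below. The name `adjusted_position` and the `reverse` flag are hypothetical, and the convention is my assumption rather than yours: 1-based leftmost POS, no hard clipping, and the minus-strand 5' end taken as the last reference base covered by the read plus any trailing soft clip.

```python
import re

# One CIGAR element: a length followed by an operation letter.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def adjusted_position(pos, cigar, reverse):
    """Return the 5' position of a read, correcting for soft clipping.

    pos     -- 1-based leftmost mapped position (SAM column 4)
    cigar   -- CIGAR string (SAM column 6)
    reverse -- True if the read maps to the minus strand
    """
    ops = [(int(length), op) for length, op in CIGAR_OP.findall(cigar)]
    if not reverse:
        # Plus strand: the 5' end is on the left, so undo a leading soft clip.
        if ops and ops[0][1] == "S":
            return pos - ops[0][0]
        return pos
    # Minus strand: the 5' end is on the right. M, D, N, = and X consume
    # reference bases; I does not, so it is left out of the sum.
    ref_len = sum(length for length, op in ops if op in "MDN=X")
    trailing_clip = ops[-1][0] if len(ops) > 1 and ops[-1][1] == "S" else 0
    return pos + ref_len + trailing_clip - 1
```

For example, `adjusted_position(100, "2S8M", False)` gives 98, and `adjusted_position(100, "10M", True)` gives 109 under this convention.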

I like how you already thought where you want to open your file(s). Make sure you think about where you want to put the command to write to your open files. Don't forget to close the files at the end.
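If it helps, one option is to open both files in a single `with` statement so they get closed for you, even if something fails halfway through. This is only a sketch: `copy_unique_lines` is a made-up name, and using the whole line as the duplicate key is a stand-in for your real key.

```python
def copy_unique_lines(in_path, out_path):
    """Copy header lines and first-seen records; return the record count."""
    seen = set()
    written = 0
    # Both files are closed automatically when the with block exits.
    with open(in_path) as sam_in, open(out_path, "w") as sam_out:
        for line in sam_in:
            if line.startswith("@"):
                sam_out.write(line)  # header lines pass straight through
                continue
            if line not in seen:     # placeholder key: the whole line
                seen.add(line)
                sam_out.write(line)  # the write happens here, inside the loop
                written += 1
    return written
```

The write call sits inside the loop and inside the `with` block, which answers the "where do I write" question and removes the need for an explicit close at the end.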

Nice start. Please let me know if you'd like any elaboration on my feedback.
Mitch
