mosdepth missing values imputed #1810
Changes:
- mosdepth output has missing values in *.{region,global}.dist.txt
- This change to the module fills any missing values with the next value
- e.g., if there is 100% at 100X and 80% at 80X, the value at 90X will be recorded as 80%
- This may underestimate coverage slightly, but it's not clear from the mosdepth docs how it should be handled.
- See brentp/mosdepth#190
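The fill described above can be sketched roughly as follows. This is not the code from the PR, only a minimal illustration under stated assumptions: the data is a dict mapping coverage threshold to cumulative percentage, the percentage falls as coverage rises, and "next value" is read as the entry at the next higher recorded coverage (the lower, more conservative figure). The function name `fill_missing_with_next` and the choice to skip thresholds with nothing recorded at or above them are hypothetical.

```python
from bisect import bisect_left


def fill_missing_with_next(cumulative, thresholds):
    """Return {threshold: percent} for the requested thresholds.

    A threshold missing from `cumulative` borrows the value recorded at the
    next higher coverage. Thresholds with nothing recorded at or above them
    are left out of the result (an assumption, not mosdepth behaviour).
    """
    recorded = sorted(cumulative)  # coverage levels that are present
    filled = {}
    for x in thresholds:
        i = bisect_left(recorded, x)  # first recorded coverage >= x
        if i < len(recorded):
            filled[x] = cumulative[recorded[i]]
    return filled


# Made-up numbers: 20X is absent, so it takes the value recorded at 30X.
example = fill_missing_with_next({0: 100.0, 10: 95.0, 30: 80.0}, [0, 10, 20, 30, 40])
```

Because the borrowed value comes from a higher coverage level, it can only equal or understate the true percentage, which matches the "may underestimate coverage slightly" caveat.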
Thanks! Speed thought: generally I try to avoid computing new data within MultiQC (plenty of exceptions abound). I agree that reporting Won't be as pretty, but would be more true to the original data. This data is essentially the same as the plot lower in the report though, right? So prettiness shouldn't be super important? Or have I got that wrong?
aha! Yes, if we remove the |
Oh hold on, different plots. I think the data in general stats is coming from my plot though.
I added some code where it evaluates every integer in the coverage values, but I haven't pushed it because while it parses the data fine and solves the problem, it takes forever to write the file. Unfortunately using None instead of zero creates a coverage of |
Hmm, how annoying. Can we just remove the keys if |
This reverts commit 2d71020.
Before and after comparison:
Great! Just moved the log statement out of the loop as I was worried that there might be a lot per sample and it could make the debug log file huge.
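As a rough illustration of that change (not the module's actual code), the pattern is to collect per-sample counts inside the loop and emit one debug line afterwards. The logger name, the function name `count_missing`, and the input shape are all hypothetical.

```python
import logging

log = logging.getLogger("mosdepth_sketch")  # hypothetical logger name


def count_missing(samples, thresholds):
    """Count thresholds absent from each sample and log a single summary."""
    missing_per_sample = {}
    for name, cumulative in samples.items():
        n_missing = sum(1 for x in thresholds if x not in cumulative)
        if n_missing:
            missing_per_sample[name] = n_missing
    # One log call in total, rather than one per sample or per threshold.
    if missing_per_sample:
        log.debug(
            "Missing coverage thresholds in %d sample(s): %s",
            len(missing_per_sample),
            missing_per_sample,
        )
    return missing_per_sample
```

This keeps the debug log to one line per run, however many samples are affected.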
Thanks for working on this and accepting feedback 👍🏻
Just before I merge - does this definitely fix the plot as well? The test data isn't affected like your screenshot above. From a skim read of the code it now looks like it's only affecting values in the general stats table. Just want to check that the plot is good as well.
What was the problem again? I think the main problem was the general stats table. The other problem is the coverage plot which isn’t fixed but I can look into that in the new year (in a new PR?)
Sorry, had a typo - I was talking about the plot. If you have some example data kicking around I can take a quick look for a fix. I'd prefer to do both in one PR if possible, just so that we don't forget.
Ok, let's merge - I'm hoping to get a release out in 2022 and it would be good to include this 😅
I dropped the log message, hope that's ok. Figured that it's fairly self explanatory behaviour.
Thank you! I don’t have access to my laptop until next weekend so wouldn’t have been able to get this done for a while.
Very sensible! 👍🏻
CHANGELOG.md has been updated