Issue: download counts dropped between 5.8 to 5.9? #8840
Comments
Interesting. I see a similar drop (slightly larger percentage-wise) at QDR right now. FWIW: the estimate comes from the query in src/main/resources/db/migration/V5.8.0.3__7804-optimizations.sql. There are suggestions on the internet that this is a reasonable, fast estimate, but as far as I can tell its accuracy depends on how the table is used, and the count can be out of date because some of the parameters in the query are only recalculated periodically. One thing I just tried (on non-production systems) is to run [...]. Assuming vacuuming is a good idea, doing it manually or via autovacuum may help keep this estimate accurate. Alternatively, if the time to get a count via the select count() query isn't prohibitive, I think sites could alter the query to use it, either always or, for example, only when the estimate is less than 1M. (Perhaps that could even be a modification contributed back to the community version? My guess is that autovacuum is the better choice overall unless there's a reason not to do it.)
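The hybrid approach suggested above (use the fast estimate for large tables, fall back to an exact count when the estimate is small) could be sketched roughly as follows. This is only an illustration: `choose_download_count`, `exact_count_fn`, and the callable-based structure are hypothetical and not Dataverse code, and the 1M cutoff is simply the example figure from the comment.

```python
from typing import Callable

# Hypothetical cutoff, taken from the "less than 1M" example in the comment.
EXACT_COUNT_CUTOFF = 1_000_000


def choose_download_count(estimate: int,
                          exact_count_fn: Callable[[], int],
                          cutoff: int = EXACT_COUNT_CUTOFF) -> int:
    """Return the exact count for small tables, the estimate for large ones.

    exact_count_fn stands in for running a full `select count(*)`; it is
    only called when the estimate is below the cutoff, so large
    installations never pay for the full scan.
    """
    if estimate < cutoff:
        return exact_count_fn()
    return estimate


if __name__ == "__main__":
    # Small site: a stale estimate is replaced by the real count.
    print(choose_download_count(36, lambda: 833))
    # Large site: the estimate is used and the exact count is never run.
    print(choose_download_count(5_000_000, lambda: 5_000_123))
```

The trade-off is that sites just above the cutoff still see the estimate's error, so this complements rather than replaces more frequent analyze runs.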
@qqmyers Thanks for verifying that you were able to replicate the issue on QDR.
FWIW: After more discussion and investigation, it looks like autovacuum is on by default, but the default settings probably don't trigger 'analyze' on the guestbookresponse table often enough to keep the estimate as accurate as we might like. I may look into lowering settings such as autovacuum_analyze_scale_factor (e.g. from 0.1 to 0.01) for that table (not sure yet whether that is what is needed). If anyone knows more about postgres and what we could do here to improve the estimate, please comment here and/or consider a PR suggesting postgres config settings to add to the guide.
@qqmyers did you observe that the download statistic on the homepage did not change over a week? We observed that the homepage download stat does not update over 6 days (using a cached value) while the db query [...]. From the user perspective, it's probably OK that the stats aren't that accurate, but it would not be OK if the stats are not updated even on a weekly basis. Is the solution to this to trigger [...]?
@eunices - this is definitely something where I think someone has to take the time to get a good answer (I was going to reopen this issue - glad you did). I haven't yet done a vacuum/analyze at QDR but plan to do so. From the postgres docs, where it says [...]. I haven't yet dug deep enough to see whether lowering autovacuum_analyze_scale_factor would be a way (or the best way) to limit the discrepancy. My naïve guess would be that the estimate can get to be up to ~10% off when the scale factor is 0.1. Again, I plan to try this at QDR (checking with our sys admin there first to make sure there are no concerns). If you or others try this and think it helps, we can make a PR to put that sql command in the docs and/or into flyway. (Anyone who runs this sql update on a test or production system: please report here how it goes. Seeing the difference between the estimate and the real count stay within 1% would help confirm that this works.)
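To put rough numbers on that guess: the PostgreSQL documentation gives the autoanalyze trigger as a base threshold plus the scale factor times the number of rows in the table, with defaults of 50 and 0.1. Below is a minimal sketch of that arithmetic for an illustrative table of 150,000 rows. The function name is made up for this example, and the row count is not a measured value.

```python
def analyze_trigger_rows(table_rows: int,
                         scale_factor: float = 0.1,
                         base_threshold: int = 50) -> float:
    """Rows that must change before autovacuum runs ANALYZE on a table.

    Mirrors the formula in the PostgreSQL docs:
    autovacuum_analyze_threshold + autovacuum_analyze_scale_factor * rows.
    """
    return base_threshold + scale_factor * table_rows


if __name__ == "__main__":
    rows = 150_000  # illustrative table size, not a measured value
    for factor in (0.1, 0.01):
        trigger = analyze_trigger_rows(rows, scale_factor=factor)
        print(f"scale_factor={factor}: ANALYZE after ~{trigger:.0f} changed rows "
              f"({trigger / rows:.1%} of the table)")
```

With the default 0.1, the statistics can lag by roughly 10% of the table before they are refreshed, which is consistent with the guess above. At 0.01 the lag is about 1% plus the base threshold.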
FWIW: At QDR I set [...]. One thing I noticed is that the threshold for autovacuum is actually the scale factor (i.e. now 1%) plus 50 rows (another parameter that can be set). Watching the updates for a bit, it appears that the estimate at QDR changes in steps of reltuples/relpages, which is ~67. It's possible that with fewer counts a similar lower limit on the estimate's step size exists, which would mean the estimate may still not change when 1% more downloads occur. In summary: so far autovacuum has reduced the estimate's error at QDR from several thousand (many percent) to something that is so far <0.1% off. I'll try to keep monitoring to see the maximum error before autoanalyze runs again. Assuming it stays small, I'd push for some next steps: either recommend this setting (especially since it is reversible), or have a few more sites try it on production or highly used sites (perhaps demo.dataverse.org?). I think this could also be added as a flyway script in some future version, so that admins don't have to make a manual change.
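The step-size observation would follow if the estimate is computed as row density at the last analyze times the table's current page count. That is a common form of fast PostgreSQL estimate, but whether the Dataverse query has exactly this form is an assumption here, and all names and numbers below are illustrative. Under that assumption the estimate can only move in multiples of reltuples/relpages between analyze runs:

```python
def estimate_rows(reltuples: float, relpages: int, current_pages: int) -> float:
    """Estimate row count as (rows per page at last ANALYZE) * (pages now).

    reltuples and relpages are the statistics stored at the last
    VACUUM/ANALYZE; current_pages is the table's size in pages right now.
    """
    if relpages == 0:
        return 0.0  # no statistics yet, nothing to extrapolate from
    return (reltuples / relpages) * current_pages


if __name__ == "__main__":
    # Illustrative statistics giving a density of 67 rows per page.
    reltuples, relpages = 67_000.0, 1_000
    for pages in (1_000, 1_001, 1_002):
        print(pages, estimate_rows(reltuples, relpages, pages))
```

In this model the estimate changes only when the table grows by a whole page, so at ~67 rows per page it jumps by ~67 at a time regardless of how many downloads were recorded in between.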
I'm talking to "analia-s" in https://chat.dataverse.org this morning and she reported seeing 36 downloads on the homepage, but some datasets (such as https://dataverse.unr.edu.ar/dataset.xhtml?persistentId=doi:10.57715/UNR/PTDCEY) have files with over 200 downloads. I told her she may be affected by this issue. Here are some screenshots: [...] Update: I just tried https://dataverse.unr.edu.ar/api/info/metrics/downloads and it shows 833 downloads. Why would this be different from the GUI?
I don't think the API uses the estimate, so this is probably a good indication that the difference is because of this issue. One could check in the db by comparing: [...]
@qqmyers thanks!
@analia-s - was the 739 number from before a restart? If 36 was showing in the UI at the same time the query reported 739, something else is going on. If more downloads or the restarts happened between when you saw 36 and when you ran the query that returned 739, postgres could have updated its statistics (which is what the change in this issue/PR will make happen more frequently).
@qqmyers I couldn't check whether the number of downloads increased before restarting postgres. Today I restarted postgres again, without running the query, and the number increased again.
What steps does it take to reproduce the issue?
When does this issue occur?
Upgrading between 5.8 and somewhere between 5.9/5.10/5.10.1 (likely 5.9, see https://groups.google.com/g/dataverse-community/c/z1EYZHswhhI)
Which page(s) does it occur on?
Homepage, downloads metric at top of page
Seems related to dataverse.xhtml (https://github.com/IQSS/dataverse/blob/develop/src/main/webapp/dataverse.xhtml), where guestbookResponseServiceBean.getCountOfAllGuestbookResponses was updated in 5.9 (#7804) to use an estimated value (dataverse/src/main/java/edu/harvard/iq/dataverse/GuestbookResponseServiceBean.java, line 920 in fb24c87).
What happens?
After the upgrade, downloads dropped by about 2,000 (with ~150,000 downloads)
To whom does it occur (all users, curators, superusers)?
All users
What did you expect to happen?
Downloads should not drop after an upgrade.
Which version of Dataverse are you using?
5.10.1
Any related open or closed issues to this bug report?
https://groups.google.com/g/dataverse-community/c/z1EYZHswhhI
#7804, PR: #8143
Some organisations track precise download stats, and we'd just like to highlight this. It would be great to have a fix, but understandably one might not be preferred if it comes at the expense of performance. I suppose if we needed precise stats, we could do a
select count(o.id) from GuestbookResponse o