Bug in monthly collection, days 1-31 #246

Open
huitema opened this issue May 26, 2024 · 0 comments

There was a bug in the monthly collection of the "by day" data. The offending code was:

    if (ret) {
        this->hourly_volume[time_m.tm_hour] += 1;
        this->daily_volume[time_m.tm_mday] += 1;
    }

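This is wrong, because tm_mday has values from 1 to 31, while daily_volume has 31 elements with valid indices 0 to 30. On day 31, the write went one element past the end of the array. It is not clear whether this affected the field that follows daily_volume in the record written to the data file: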

    uint64_t query_volume;
    uint64_t hourly_volume[24];
    uint64_t daily_volume[31];
    uint64_t arpa_count;

The values of arpa_count seem correct, but this is worth investigating.

The daily record is shifted by 1: slot 0 is never written, and slots 1 to 30 hold days 1 to 30. We can fix that in the pandas analysis by changing the column headers: relabel the first column as "d00", shift the remaining labels accordingly, and compute "d31" as queries - sum(d01..d30).
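The relabeling could look roughly like the sketch below. It is only an illustration under assumptions: the headers as written are taken to be "d01" through "d31", the total is taken to be in a column named "queries", and the helper name realign_daily_columns and the sample row are made up for the example. It also assumes that "queries" counts exactly the events that were added to the daily slots.

```python
import pandas as pd

def realign_daily_columns(df, query_col="queries"):
    """Relabel the shifted daily columns and rebuild the missing d31.

    Assumes (hypothetically) that the file headers are d01..d31 and that
    query_col holds the total number of counted queries for the month.
    """
    as_written = [f"d{i:02d}" for i in range(1, 32)]  # headers in the file
    actual = [f"d{i:02d}" for i in range(0, 31)]      # what each slot really holds
    out = df.rename(columns=dict(zip(as_written, actual)))
    days_1_to_30 = [f"d{i:02d}" for i in range(1, 31)]
    # Day 31 was never stored in the array, so recover it from the total.
    out["d31"] = out[query_col] - out[days_1_to_30].sum(axis=1)
    return out

# Synthetic row: slot 0 is empty, day 1 has 10 queries, day 2 has 20,
# and the remaining 70 queries are assumed to belong to day 31.
row = {"queries": 100, **{f"d{i:02d}": 0 for i in range(1, 32)}}
row["d02"] = 10  # written under the header "d02", actually day 1
row["d03"] = 20  # written under the header "d03", actually day 2
raw = pd.DataFrame([row])
fixed = realign_daily_columns(raw)
```

After the relabeling, "d00" should be zero in every row, since tm_mday is never 0; a nonzero value there would mean the assumed layout is wrong.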

Then, with pandas, we can compare the new d31 to the arpa counter. If we find a high correlation, this becomes suspicious: it would suggest that the day-31 overflow was written into arpa_count!
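A minimal sketch of that comparison follows. The column names "d31" and "arpa", the helper name d31_arpa_correlation, and the numbers are all assumptions for illustration; the sample is synthetic and built so that arpa moves exactly with d31, which is the suspicious case.

```python
import pandas as pd

def d31_arpa_correlation(df, d31_col="d31", arpa_col="arpa"):
    """Pearson correlation between the reconstructed d31 and the arpa counter.

    Column names are hypothetical; adapt them to the real data frame.
    """
    return df[d31_col].corr(df[arpa_col])

# Synthetic data, not real measurements: arpa is a constant plus d31,
# so the correlation is close to 1.0 by construction.
sample = pd.DataFrame({
    "d31": [5, 0, 7, 0, 3],
    "arpa": [105, 100, 107, 100, 103],
})
r = d31_arpa_correlation(sample)
```

On real data, some correlation is probably expected anyway, because busy collection points have more of every kind of query. Comparing against the correlation between arpa and another day, such as d30, may help separate that effect from an actual overflow.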
