Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Fix rank calculation in Summary::quantile()
- Changes the rank calculation in Summary::quantile() to 'q * total' from 'floor(q * (total - 1))'. The 'floor' in the previous calculation was implicit due to the cast to u64. - Changes the condition for mapping to the 'negative', 'positive', or 'zeroes' sketch to be '<=' instead of '<'. - Defers rounding the quantile to the nearest rank to the underlying sketch. This change is not necessary for correctness, but could be desireable. For example, deferring this operation could allow the underlying sketch to perform interpolation. - Fixes incorrect assertions in test_basics(), which pass with the changes to the rank calculation and quantile mapping. The previous method could incorrectly map a request for a value at a particular rank to the adjacent rank in some cases. Consider a distribution with the values {0.0, 1.0}. In this case, any request for a quantile < 1.0 would map to the 'zeroes' sketch. for example, the rank for q = 0.9 is computed as 'floor(0.9 * (2 - 1)) = floor(0.9) = 0'. This would map to the 'zeroes' sketch while it should instead map to a request for q = 0.8 in the positive sketch. The new method would give us a rank of '0.9 * 2 = 1.8' which correctly maps to 0.8 in the positive sketch. The condition for mapping to the appropriate sketch now uses the '<=' operator instead of the '<' operator. Consider a request for q = 0.5, with the distribution {-1.0, 1.0}. The '<=' operator ensures that this request is mapped to q = 0.0 in the negative sketch (would be 1.0, but we flip the distribution) rather than q = 0.0 in the positive sketch. The mapping for q = 0.0 was handled as a special case in a prior commit to allow for the updated mapping, as q = 0.0 would otherwise always map to the negative sketch, whether or not the negative sketch had any values. Both of the cases described above are now covered (with a different distribution) in test_basics().
- Loading branch information