-
Notifications
You must be signed in to change notification settings - Fork 24.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Range Aggregation is not working properly for float fields #77033
Comments
Pinging @elastic/es-analytics-geo (Team:Analytics) |
Hi @ofirEdi , Thanks for submitting this. I ran your repro steps and it also fails for me. Feels like it might be a rounding bug. I'll look into it and see if I can track down a fix. |
Thank you @not-napoleon . I'll keep watching. |
HI @not-napoleon any update? |
Hi @ofirEdi, I actually haven't had a chance to start on this. I've had a few other big things come up, unfortunately. It is next on my list though. Thank you for your patience. |
Hi @not-napoleon, thank you for the update. |
Okay, so I've done some digging, and the issue here is (basically) that we always parse the range end points as I'm looking into a solution, but it's a little tricky. I think the right thing to do is parse the range end points with the type of the field being aggregated on, if we can. Problem is we don't know the field type until after parse time. Anyway, I'm working on it. Hopefully will have something soon. |
@not-napoleon i will keep following on the pull request. Thanks for taking the time and address this issue. |
This fixes a bug where the range aggregation always treats the range end points as doubles, even if the field value doesn't have enough resolution to fill a double. This was creating issues where the range would have a "more precise" approximation of an unrepresentable number than the field, causing the value to fall in the wrong bucket. Note 1: This does not resolve the case where we have a long value that is not precisely representable as a double. Since the wire format sends the range bounds as doubles, by the time we get to where this fix is operating, we've already lost the precision to act on a big long. Fixing that problem will require a wire format change, and I'm not convinced it's worth it right now. Note 2: This is probably still broken for ScaledFloats, since they don't implement NumberFieldType. Resolves #77033
) This fixes a bug where the range aggregation always treats the range end points as doubles, even if the field value doesn't have enough resolution to fill a double. This was creating issues where the range would have a "more precise" approximation of an unrepresentable number than the field, causing the value to fall in the wrong bucket. Note 1: This does not resolve the case where we have a long value that is not precisely representable as a double. Since the wire format sends the range bounds as doubles, by the time we get to where this fix is operating, we've already lost the precision to act on a big long. Fixing that problem will require a wire format change, and I'm not convinced it's worth it right now. Note 2: This is probably still broken for ScaledFloats, since they don't implement NumberFieldType. Resolves elastic#77033
…) (#78932) * Scale doubles to floats when necessary to match the field (#78344) This fixes a bug where the range aggregation always treats the range end points as doubles, even if the field value doesn't have enough resolution to fill a double. This was creating issues where the range would have a "more precise" approximation of an unrepresentable number than the field, causing the value to fall in the wrong bucket. Note 1: This does not resolve the case where we have a long value that is not precisely representable as a double. Since the wire format sends the range bounds as doubles, by the time we get to where this fix is operating, we've already lost the precision to act on a big long. Fixing that problem will require a wire format change, and I'm not convinced it's worth it right now. Note 2: This is probably still broken for ScaledFloats, since they don't implement NumberFieldType. Resolves #77033
Elasticsearch version (
bin/elasticsearch --version
): 7.12.1Plugins installed: []
JVM version (
java -version
):1.8.0_292OS version (
uname -a
if on a Unix-like system): Ubuntu 18.04.1Description of the problem including expected versus actual behavior:
According to the documentation: https://www.elastic.co/guide/en/elasticsearch/reference/7.12/search-aggregations-bucket-range-aggregation.html range aggreagation creates buckets in a way that from is inclusive and to is exclusive. When working with float fields Elasticsearch seems to work vise versa and classify to the wrong bucket.
Steps to reproduce:
the result i'm getting is that the doc belongs to 0.01-0.04 bucket and not to 0.04-0.06 bucket:
I tested with Integers and it seems to be fine but for floats the aggregation not behaving like in the documentation.
The text was updated successfully, but these errors were encountered: