Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix AvgTests error on -0.0 avg #113272

Merged
merged 12 commits into from
Oct 22, 2024
Merged

Conversation

ivancea
Copy link
Contributor

@ivancea ivancea commented Sep 20, 2024

Fixes #113225
Fixes #114175

@ivancea ivancea added >test Issues or PRs that are addressing/adding tests Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) auto-backport Automatically create backport pull requests when merged :Analytics/ES|QL AKA ESQL v8.16.0 v9.0.0 labels Sep 20, 2024
@ivancea ivancea requested a review from astefan September 20, 2024 14:31
@ivancea ivancea requested a review from a team as a code owner September 20, 2024 14:31
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

@not-napoleon
Copy link
Member

Maybe I'm reading this wrong, but it looks to me like this will accept -0.0 as the correct output. We really shouldn't be returning negative zeros from any operations. (In fact, probably AbstractAggregatorTestCase#extractResultsFromAggregator should assert that we never return a forbidden double value).

@ivancea
Copy link
Contributor Author

ivancea commented Sep 25, 2024

Maybe I'm reading this wrong, but it looks to me like this will accept -0.0 as the correct output. We really shouldn't be returning negative zeros from any operations.

@not-napoleon TestCaseSupplier provides -0.0 values, so if it can be an input, it should also be an output.

In ES, I see this: https://www.elastic.co/guide/en/elasticsearch/reference/current/number.html#_which_type_should_i_use (Just above that anchor)

The double, float and half_float types consider that -0.0 and +0.0 are different values.

So if ES also allows it, I would allow it too. And for this case, the avg of -0.0 should be -0.0 too (IMO) to be as precise as possible.

I don't have an specific opinion on this topic, I'm just basing my thought process on what we currently have (And compatibility with ES)

Copy link
Member

@not-napoleon not-napoleon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to be as precise as possible.

It's not more precise, it's just more confusing. The IEEE spec for floating points says that the two zeros should be considered equal when possible. Java honors this with it's binary comparison operators (== and friends), but not with Double.equals(). You can read the docs on the double class to get an understanding for why this is the case. Practically, this means that if we start allowing negative zeros in the language, we need to be VERY clear on where they compare via == and where they compare via Double.equals() or Number.equals(). It becomes extremely trappy for users if that behavior is not entirely consistent.

We made the decision early on to not allow -0.0, positive or negative infinity, or NaN as possible values in ESQL. We should be consistent with that here. /cc @costin

@not-napoleon
Copy link
Member

so if it can be an input, it should also be an output.

It's an input on the test precisely to test that it isn't an output.

@nik9000
Copy link
Member

nik9000 commented Sep 27, 2024

We made the decision early on to not allow -0.0, positive or negative infinity, or NaN as possible values in ESQL. We should be consistent with that here. /cc @costin

I think we should be consistent with this decision. Our functions should never receive any of those forbidden doubles. We could (should?) add an assertion to the generated code that reads doubles for EVAL and STATS that asserts that we don't receive those numbers. They should always come in as null. And they should always be emitted as null.

ES might support these but they are all null to ESQL. One day we may change this but that's a difficult thing to change at the moment.

@not-napoleon
Copy link
Member

ES might support these but they are all null to ESQL. One day we may change this but that's a difficult thing to change at the moment.

Well, the infinities and NaN are null; -0.0 is 0.

@ivancea
Copy link
Contributor Author

ivancea commented Sep 30, 2024

Ok, so:

  • To ensure it's not an output, I'll add a check to tests like we have to NaN and infinities
  • I'll revert the changes on AvgTests for now

Then, if we have to keep the -0.0 as a potential input in tests, but forbid it as an output, as many functions are currently returning it (Max, MvMax, Min, Values...), I'll fix them by doing a conversion (-0.0 -> 0.0).
I could add it in the agg/evaluator implementers, but it would affect even functions never emitting -0.0. So I'll see first how many functions are failing with this

@ivancea ivancea marked this pull request as draft October 2, 2024 13:28
@ivancea
Copy link
Contributor Author

ivancea commented Oct 11, 2024

@not-napoleon After some checks and talks about this:

  • Added some "discoveries" about -0.0 here, to tackle whenever we want to work on this: [ES|QL] Don't return negative zero #104866 (comment)
  • Restored this PR with the Avg fix. It does nothing around -0.0 directly, so I think it's safe to merge whatever the decision we take. Right now, it's agnostic.
    • I left the little refactor on tests: removing repeated code, and adding test cases of single values to aggs, as they are handled differently (This test was failing 1/1000 times because of this)

@ivancea ivancea requested a review from not-napoleon October 11, 2024 14:35
@ivancea ivancea marked this pull request as ready for review October 14, 2024 14:28
@@ -750,6 +752,26 @@ public static void testFunctionInfo() {
assertEquals(returnFromSignature, returnTypes);
}

@SuppressWarnings("unchecked")
protected final void assertTestCaseResultAndWarnings(Object result) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Javadoc would be nice here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added!

Copy link
Member

@not-napoleon not-napoleon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is fine for now.

@ivancea ivancea merged commit d65a030 into elastic:main Oct 22, 2024
16 checks passed
@ivancea ivancea deleted the avg-tests-error-minus-zero branch October 22, 2024 17:53
@elasticsearchmachine
Copy link
Collaborator

💔 Backport failed

Status Branch Result
8.x Commit could not be cherrypicked due to conflicts

You can use sqren/backport to manually backport by running backport --upstream elastic/elasticsearch --pr 113272

smalyshev pushed a commit to smalyshev/elasticsearch that referenced this pull request Oct 23, 2024
georgewallace pushed a commit to georgewallace/elasticsearch that referenced this pull request Oct 25, 2024
jfreden pushed a commit to jfreden/elasticsearch that referenced this pull request Nov 4, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Analytics/ES|QL AKA ESQL auto-backport Automatically create backport pull requests when merged Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) >test Issues or PRs that are addressing/adding tests v8.17.0 v9.0.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[CI] AvgTests testFold {TestCase=<double> #7} failing [CI] AvgTests testFold {TestCase=<double> #2} failing
5 participants