-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[MINOR]: Make some slt tests deterministic #8525
[MINOR]: Make some slt tests deterministic #8525
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
Quite curious, if the test is not deterministic, why we did not fail on this test before? Oh, I think probably because of Maybe we can remove unneccessary |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense to me -- the output order some of these queries likely depends both on the threading (e.g test_threads) as well as defails of how hash values are computed (and thus what the order of groups is), etc
@@ -38,17 +38,17 @@ LOCATION '../../testing/data/csv/aggregate_test_100.csv' | |||
# Basic example: distinct on the first column project the second one, and | |||
# order by the third | |||
query TI | |||
SELECT DISTINCT ON (c1) c1, c2 FROM aggregate_test_100 ORDER BY c1, c3; | |||
SELECT DISTINCT ON (c1) c1, c2 FROM aggregate_test_100 ORDER BY c1, c3, c9; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think changing the order by changes the results because DISTINCT ON basically picks the first values for each of the distinct values (of c1) in this case
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed
Yes, because setting is not random. Results are deterministicat each run. However, also in terms of query specifications some results are not certain. For instance for the query SELECT DISTINCT ON (c1) c1, c2 FROM aggregate_test_100 ORDER BY c1, c3; Both
and
are valid. However for query SELECT DISTINCT ON (c1) c1, c2 FROM aggregate_test_100 ORDER BY c1, c3, c9; only valid result is
|
Which issue does this PR close?
Closes #.
Rationale for this change
While working on another PR, I realized that some of the test results are not deterministic. This PR makes these test results deterministic.
What changes are included in this PR?
Are these changes tested?
Existing tests should work
Are there any user-facing changes?