feat(api): add nulls_first=False argument to order_by #9385
a17ff2e to 0b216b9
Fixing up the snapshots now.
e108969 to 67ce726
67ce726 to be221d4
Need to add tests that order by more than one column, with nulls present, and with different values of nulls_first.
Great work!
We can add the tests for multiple nulls_first specifications here or in a follow-up, up to you.
dd1be07 to e34476b
7bfb757 to b4d349c
ibis/backends/tests/test_generic.py
result = con.execute(expr).reset_index(drop=True)
expected = pd.DataFrame(expected)

tm.assert_frame_equal(result.replace({np.nan: None}), expected)
If I use .replace({np.nan: None}) locally, I get this:
> tm.assert_frame_equal(result.replace({np.nan: None}), expected)
E AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="col1") are different
E
E Attribute "dtype" are different
E [left]: object
E [right]: float64
only in this test. I don't see it in the other ones.
It's the expected that needs the astype, and perhaps the result (not sure). I'll push up a fix.
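The dtype mismatch in the traceback above can be reproduced outside the test suite. The following is only a minimal sketch with made-up data (the column name col1 is taken from the error message, the values are hypothetical): replacing NaN with None turns a float column into an object column, while a frame built from Python values that include None is inferred as float64.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for a backend result: a float column with a missing value.
result = pd.DataFrame({"col1": [1.0, np.nan, 3.0]})

# Built from plain Python values: pandas infers float64 and stores None as NaN.
expected = pd.DataFrame({"col1": [1.0, None, 3.0]})

# Swapping NaN for None forces the column to object dtype.
replaced = result.replace({np.nan: None})

print(replaced["col1"].dtype)
print(expected["col1"].dtype)
```

With the two dtypes printed side by side, it is easier to see why one side of the comparison needs a cast before tm.assert_frame_equal can pass.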
Yes, what's not clear to me is why we needed it in the other tests. Locally, they all passed for me without it.
After losing a few brain cells trying to understand how to set collation and sorting in MS SQL, we gave up and decided to xfail the multi-key sort column null ordering tests for backends whose original developers decided, for reasons known only to them, that strings should be compared ignoring case by default, and that it should be outrageously difficult to change that behavior.
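To illustrate why a case-insensitive default changes the outcome of a multi-key sort, here is a small pure-Python sketch. It does not model MS SQL's actual collation rules; str.casefold is only an assumed stand-in for "compare ignoring case", and the rows are invented.

```python
# Hypothetical rows: (string key, integer key).
rows = [("a", 2), ("A", 1), ("B", 4), ("b", 3)]

# Case-sensitive comparison: uppercase letters sort before lowercase ones,
# so "A" and "a" are different keys and the integer column never breaks a tie.
case_sensitive = sorted(rows)

# Case-insensitive comparison: "A" and "a" tie on the first key,
# so the second key decides their relative order.
case_insensitive = sorted(rows, key=lambda r: (r[0].casefold(), r[1]))

print(case_sensitive)
print(case_insensitive)
```

The same input produces two different row orders, which is why a single expected frame cannot satisfy both kinds of backend.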
This is work in progress
TODO
Sort out (pun intended) how to deal with backends that don't expose an option for ordering nulls.
- [x] mssql
- [x] mysql
EDIT: They get handled by sqlglot and follow the expected behavior
Add a test in test_api.py to check that t.order_by(ibis.desc("b", nulls_first=True)) is equal to t.order_by(t["b"].desc(nulls_first=True))
Fix window test errors (thanks @gforsyth for the "diving deep into the weeds" session)
Decide how and deprecate
ibis/ibis/expr/operations/sortkeys.py
Lines 38 to 51 in 2dac5e4
Options are:
NOTE: I can't find any tests using this, so it's not clear to me what we want to do here.
EDIT: see #9413
desc() or asc() is taking the nulls_first default value set in SortKeys, which is True (I'm not sure yet, need to investigate). EDIT: doctests and other tests using order_by that didn't trigger desc or asc rely on the default nulls_first in SortKeys. Before, this value wasn't provided and SQLGlot handled this by triggering nulls_first=False. So to be consistent we set it to False in the class default. I know why they are failing; it's unclear what path we want to take here. Need to discuss: https://github.com/ncclementi/ibis/blob/78f26dada233a3da3a5e145fc32d590e058f1a4e/ibis/expr/operations/sortkeys.py#L28-L37
Discuss/fix: Now we have a lot of snapshot tests failing because of the fix in the window_merge_frames rewrite, where we are now passing nulls_first; see ibis/ibis/expr/rewrites.py, lines 310 to 314 in a17ff2e.
Add tests that order by more than one column, have nulls, and use different values of nulls_first.
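As a reference for what such a multi-column test might expect, the sketch below spells out the intended semantics in plain Python. It is not ibis code: the order_by helper, the (column, descending, nulls_first) key shape, and the sample rows are all hypothetical, chosen only to make the per-key null placement explicit.

```python
def order_by(rows, keys):
    """Sort dict rows by several keys, each with its own null placement.

    keys is a list of (column, descending, nulls_first) tuples, most
    significant key first. This shape is an assumption for illustration.
    """
    out = list(rows)
    # Apply keys from least to most significant; every step is stable,
    # so earlier (less significant) orderings survive as tie-breakers.
    for column, descending, nulls_first in reversed(keys):
        nulls = [r for r in out if r[column] is None]
        non_nulls = [r for r in out if r[column] is not None]
        non_nulls.sort(key=lambda r: r[column], reverse=descending)
        out = nulls + non_nulls if nulls_first else non_nulls + nulls
    return out


rows = [
    {"col1": 1, "col2": "b"},
    {"col1": None, "col2": "a"},
    {"col1": 1, "col2": None},
    {"col1": 2, "col2": "c"},
]

# col1 ascending with nulls first, then col2 descending with nulls last.
ordered = order_by(rows, [("col1", False, True), ("col2", True, False)])
print([(r["col1"], r["col2"]) for r in ordered])
```

Mixing nulls_first=True on one key with nulls_first=False on another, as done here, is the case a single-column test cannot cover.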