Add optional validation output to joins #2377
Comments
The indicator seems like a good option to me. If we choose that to be a boolean, then it will also be very memory efficient.
If boolean, that would have to be two indicators, one for whether it occurs in the original left dataframe, and one for the right dataframe. I see Pandas opts for a categorical to cover left/right/both.
Still would save 2/8s of RAM. Two indicators sounds good to me.
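To make the two-indicator idea concrete, here is a minimal sketch in pandas (not polars, and not a proposed implementation). It adds a marker column to each side before an outer merge; the column names `in_left` and `in_right` and the sample data are hypothetical, chosen only for illustration.

```python
import pandas as pd

left = pd.DataFrame({"id": [1, 2, 3], "lval": ["a", "b", "c"]})
right = pd.DataFrame({"id": [2, 3, 4], "rval": ["x", "y", "z"]})

# Tag every row with a marker before joining (hypothetical column names).
merged = (
    left.assign(in_left=True)
    .merge(right.assign(in_right=True), on="id", how="outer")
    .sort_values("id")
    .reset_index(drop=True)
)

# Rows missing from one side carry a null marker; convert markers to plain booleans.
merged["in_left"] = merged["in_left"].notna()
merged["in_right"] = merged["in_right"].notna()

print(merged[["id", "in_left", "in_right"]])
```

A row with both flags set corresponds to what a three-valued indicator would call "both".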
I think the |
Note that the validate option has recently been added in #9278.
Hi. I'm not an expert on suggesting changes, so I apologize if this isn't the correct method. Although the validation option has been added, it would be very helpful to include an 'indicator' option, similar to what is available in pandas. As far as I know, this feature hasn't been added yet, and the issue explicitly requesting this option ( #5983 ) has been closed. Thanks!
(originally by #2292 (comment) )
@austospumanto:
Separately, on the problem of two columns becoming one column in the join result: it would be great if polars could retain both columns in the join like pandas does (when the two columns have different names). This is useful for checking for nulls in non-inner joins to see which rows found matches, and also for situations like the one you stated. I find myself duplicating+suffixing columns before joining to get this behavior in polars.
Suggestion by me:
It may be easier if we have indicator as an optional output as in Pandas https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.merge.html?highlight=merge#pandas.DataFrame.merge
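For reference, this is how the pandas `indicator` option behaves. The sample frames are made up for illustration; `indicator=True` adds a categorical `_merge` column with the values `left_only`, `right_only`, and `both`.

```python
import pandas as pd

left = pd.DataFrame({"id": [1, 2, 3], "lval": ["a", "b", "c"]})
right = pd.DataFrame({"id": [2, 3, 4], "rval": ["x", "y", "z"]})

# indicator=True appends a categorical "_merge" column describing each row's origin.
result = (
    left.merge(right, on="id", how="outer", indicator=True)
    .sort_values("id")
    .reset_index(drop=True)
)

print(result[["id", "_merge"]])
```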