hash list of String #13950

Open
gab23r opened this issue Jan 24, 2024 · 5 comments
Labels
enhancement New feature or an improvement of an existing feature

Comments

@gab23r
Contributor

gab23r commented Jan 24, 2024

Description

I may not be the first to ask for this feature, but I couldn't find anything.

I wish I could hash a list of strings with Polars. Is this out of scope?
It would be awesome to be able to hash any nested Polars data type.

import polars as pl

pl.DataFrame({
    'a': [['1', '2']],
}).select(pl.col('a').hash())
# PanicException: Hashing a list with a non-numeric inner type not supported. Got dtype: List(String)
@gab23r gab23r added the enhancement New feature or an improvement of an existing feature label Jan 24, 2024
@ion-elgreco
Contributor

You can hash the inner elements first and then hash the resulting list: list.eval(pl.element().hash()).hash()

@gab23r
Contributor Author

gab23r commented Jan 24, 2024

Thanks @ion-elgreco, this works as a workaround, but I think the feature is still really valuable. Today, if you want to hash a full DataFrame with hash_rows, for example, it may fail for certain data types, so you have to handle those cases by hand.

@ion-elgreco
Contributor

I agree :) Just showcasing a workaround that might help

@cmdlineluser
Contributor

I think this has been mentioned a few times with regard to .group_by

Another workaround is to .cast(pl.List(pl.Categorical))

I'm not sure of the technical reasons why it is currently disallowed.

@reswqa
Collaborator

reswqa commented Jan 26, 2024

I think this is a duplicate of #10747.
