Add empty attr dictionary to lazy/dataframe #14815

Closed
mcrumiller opened this issue Mar 1, 2024 · 2 comments
Labels
enhancement New feature or an improvement of an existing feature

Comments

@mcrumiller
Contributor

mcrumiller commented Mar 1, 2024

Description

As of #13236 we can no longer dynamically add attributes to DataFrame and LazyFrame objects. However, sometimes it is desirable to add user attributes to frames for later use.

I propose adding an attr entry to __slots__, initialized to an empty dictionary, so that users can store user-defined attributes on their frames, as in:

import polars as pl

df = pl.DataFrame()
df.attr["name"] = "my_dataframe"
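
To make the proposal concrete, here is a minimal sketch of the idea using a hypothetical stand-in class. `SlottedFrame` is an assumption for illustration only and is not the polars implementation; it only shows how a class with `__slots__` rejects ad hoc attributes while still exposing a single dictionary for user metadata.

```python
class SlottedFrame:
    """Hypothetical stand-in for a frame class (not the polars implementation)."""

    # Only these attribute names may be set on instances.
    __slots__ = ("_df", "attr")

    def __init__(self, data=None):
        self._df = data if data is not None else {}
        self.attr = {}  # empty dictionary reserved for user-defined metadata


frame = SlottedFrame()
frame.attr["name"] = "my_dataframe"  # allowed: mutates the dict held in the slot

try:
    frame.name = "my_dataframe"  # rejected: "name" is not listed in __slots__
except AttributeError:
    dynamic_attribute_allowed = False
else:
    dynamic_attribute_allowed = True
```

In this sketch the instance still has no per-instance `__dict__`, so the extra slot does not reintroduce arbitrary attribute assignment.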
@mcrumiller mcrumiller added the enhancement New feature or an improvement of an existing feature label Mar 1, 2024
@stinodego
Member

stinodego commented Mar 2, 2024

This is not reliable, as not all DataFrame methods preserve the class. If you were to call df.sort(), the data stored in attr would be gone.
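
The concern can be illustrated with a toy class. This is a sketch under stated assumptions: `SlottedFrame` and its `sort` method are hypothetical and do not reflect how polars implements `DataFrame.sort`. The point is only that a method returning a newly constructed object does not carry the user dictionary along unless it copies it explicitly.

```python
class SlottedFrame:
    """Hypothetical frame-like class (not the polars implementation)."""

    __slots__ = ("_rows", "attr")

    def __init__(self, rows):
        self._rows = list(rows)
        self.attr = {}  # user-defined metadata lives here

    def sort(self):
        # Builds a brand-new object; nothing copies `attr` across.
        return SlottedFrame(sorted(self._rows))


df = SlottedFrame([3, 1, 2])
df.attr["name"] = "my_dataframe"

sorted_df = df.sort()
# The original keeps its metadata, while the sorted result starts empty.
print(df.attr)
print(sorted_df.attr)
```

Every method that returns a new frame would need to propagate `attr` for the stored values to survive.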

Also, it kind of defeats the purpose of __slots__. One of the benefits of __slots__ is that this is no longer possible, and users will not be surprised if the above happens.

Closing in favor of #5117

@stinodego stinodego closed this as not planned Mar 2, 2024
@mcrumiller
Contributor Author

mcrumiller commented Mar 2, 2024

> One of the benefits of __slots__ is that this is no longer possible.

I'd call it more of a side effect. The primary benefit is faster access to the _df variable in almost all of the Python methods, and adding another variable to __slots__ does not remove that benefit. I think it's worthwhile to allow a single variable for user-defined parameters. This is prevalent in other languages as well. A good resolution of the other issue would be preferred, but that seems substantially more difficult and less likely to be realized.
