Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf(python): Add __slots__ to most Polars classes #13236

Merged
merged 5 commits into from
Feb 26, 2024

Conversation

Object905
Copy link
Contributor

@Object905 Object905 commented Dec 24, 2023

Closes #13198

In this PR are "safe" additions of __slots__. Mostly for classes that are not usually subclassed by users - pl.Expr, its subclasses and build in namespaces.
polars.datatypes.classes also really good candidates, since they are instantiated now, but not included in this PR (yet?).

Adding __slots__ for LazyFrame (and DataFrame/Series) turned out to be a breaking change because of subclassing.
Using ldf.__class__ = SubClassedLazyFrame, like in test_preservation_of_subclasses is an error because of different object layouts. And even if subclass were to define __slots__ - they can only be empty, without additional attributes on instance to keep this assigment valid.
That can't be solved without specific conversion mechanism to subclasses to properly create a new instance (like pl.LazyFrame.as_subclass(SubClassedLazyFrame, *args, **kwargs)) where _ldf attribute itself would be passed/assigned to new instance of subclass.
Also using ldf.__class__ = SubClassedLazyFrame a dirty hack anyway and if polars were to properly support subclassing of frames, I think something similar should be in place - then __slots__ on them can be added safely.

Deeper description about __slots__ and benefits\downsides can be found here.
Also in issue were not mentioned memory usage benefits aside from slightly faster attribute access.
(sqlalchemy claimed 46% less memory usage for it's internal structures). Even though in latest python versions this gap should be smaller, due to more compact dict.

@github-actions github-actions bot added performance Performance issues or improvements python Related to Python Polars labels Dec 24, 2023
@@ -43,7 +43,6 @@ def __get__(self, instance: NS | None, cls: type[NS]) -> NS | type[NS]:
return self._ns

ns_instance = self._ns(instance) # type: ignore[call-arg]
setattr(instance, self._accessor, ns_instance)
Copy link
Contributor Author

@Object905 Object905 Dec 24, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this were not required even before, but it served as "cache" on instance of expression.
Now it gives error because instance does not have __dict__ and _accessor goes straight to descriptor - and it's read only.

But since expressions are frequently recreated, then this "cache" were only working on chained namespace access like
pl.col("foo").namespace_foo.func().namespace_foo.other_func()

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd appreciate if @alexander-beedie could sign off on this change before this gets merged. I don't really see exactly what is going on here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Turns out this did have user impact:
#14851

For this reason, and some related issues popping up, I will be reverting this PR.

@stinodego
Copy link
Member

Thanks for the PR. Could you provide some data on the performance impact of this change? I think I'm in favor of this change, but it would be nice to have some numbers to support it.

About subclassing: we don't support this. That test is from a while ago and we no longer guarantee preserving the class on calling a method.

@Object905
Copy link
Contributor Author

It's hard to measure exactly, because of variety of possible use cases. But speedup is small, which is anticipated.

In my production use case constructing pl.Expr and pl.LazyFrame with all the glue takes ~20% of request runtime, while collecting them only about 10% because dataframes are pretty small (100-2000 rows).
And given that this change is sadly close to noise.

Anyway here is some synthetic bench.

upstream

In [5]: %timeit -n 5000 pl.when(pl.col("foo") == 1).then("bar").when(pl.col("foo") == 2).then(pl.col("bar") / 2).when(pl.col("bar") == p
   ...: l.col("foo")).then("baz").otherwise(pl.col("bar").list.eval(pl.element().str.starts_with("ni")))
34.4 µs ± 480 ns per loop (mean ± std. dev. of 7 runs, 5,000 loops each)
In [2]: %timeit -n 10000 -r 100 pl.col("foo").alias("bar")
1.41 µs ± 43.8 ns per loop (mean ± std. dev. of 100 runs, 10,000 loops each)
In [3]: sys.getsizeof(pl.col.foo)
Out[3]: 48

this PR

In [7]: %timeit -n 5000 pl.when(pl.col("foo") == 1).then("bar").when(pl.col("foo") == 2).then(pl.col("bar") / 2).when(pl.col("bar") == p
   ...: l.col("foo")).then("baz").otherwise(pl.col("bar").list.eval(pl.element().str.starts_with("ni")))
33.5 µs ± 505 ns per loop (mean ± std. dev. of 7 runs, 5,000 loops each)
In [3]: %timeit -n 10000 -r 100 pl.col("foo").alias("bar")
1.33 µs ± 41.3 ns per loop (mean ± std. dev. of 100 runs, 10,000 loops each)
In [3]: sys.getsizeof(pl.col.foo)
Out[3]: 40

Also docs are failing to build because of this, because sphinx tries to get __dict__ on namespaces to check for classmethod/staticmethod.
If it's acceptable, I will add workaround for docs and __slots__ to other wrappers (LazyFrame/DataFrame/Series) later, otherwise might as well close this PR.

@Object905
Copy link
Contributor Author

Ready to be merged or closed 😐

@Object905 Object905 changed the title perf(python): add "safe" __slots__ to pl.Expr and built in namespaces. perf(python): add __slots__ to pl.Expr, pl.LazyFrame, built in namespaces, various internal structures and some dtypes. Jan 5, 2024
@alexander-beedie
Copy link
Collaborator

Ready to be merged or closed 😐

If you can rebase against the latest code I'll see about reviewing this 👌

@Object905
Copy link
Contributor Author

@alexander-beedie
Sorry for disappearance. Rebased main.

@deanm0000
Copy link
Collaborator

Does this prevent monkey patching?

I use pl.Expr.some_func = lambda x: x #some cool thing from time to time and it seems __slots__ would prevent that.

Also, what about this syntax pl.col.a instead of pl.col('a')?

@Object905
Copy link
Contributor Author

Object905 commented Feb 15, 2024

@deanm0000
pl.Expr.some_func = ... modifies class __dict__, which is still in place. But this would prevent

instance = pl.col("a")
instance.some_func = ...

Also suggesting to take a look at pipe method on expression for this, it also plays well with typing.

pl.col.a works, because its implemented with getattr.

@stinodego stinodego changed the title perf(python): add __slots__ to pl.Expr, pl.LazyFrame, built in namespaces, various internal structures and some dtypes. perf(python): Add __slots__ to Expr, LazyFrame, and various other classes Feb 15, 2024
Copy link
Member

@stinodego stinodego left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I left some comments.

I think Series and DataFrame can also get a __slots__ attribute. We do not guarantee subclass preservation, so I removed the related tests in another PR.

The way I see it, the biggest blocker for this PR for now is the Sphinx issue, which we should be able to address. Or we could add slots for these namespace classes later.

@@ -43,7 +43,6 @@ def __get__(self, instance: NS | None, cls: type[NS]) -> NS | type[NS]:
return self._ns

ns_instance = self._ns(instance) # type: ignore[call-arg]
setattr(instance, self._accessor, ns_instance)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd appreciate if @alexander-beedie could sign off on this change before this gets merged. I don't really see exactly what is going on here.

Comment on lines 19 to 20
if not BUILDING_SPHINX_DOCS:
__slots__ = ("_pyexpr",)
Copy link
Member

@stinodego stinodego Feb 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not a fan of this construction. I would expect the sphinx_accessor to take care of this, but it doesn't. We should fix that directly rather than introducing this check everywhere.

It seems to be a problem with sphinx-autosummary-accessors.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree on this. Will try to look into this.
I've tried monkey-patching, but dynamically adding __dict__ is way harder when __slots__ are in place, because it modifies object layout.
Problem seems to be that autosummary makes objects into "modules" and sphinx expects module objects to have dict, so exception thrown from sphinx itself.
Maybe will try to make subclass with __dict__ dynamically.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Created pull request to handle this xarray-contrib/sphinx-autosummary-accessors#124

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's great! They have not released in a year though, so I don't expect this to be picked up quickly.

I'd suggest removing the namespace slots for now - we can add them again when the issue is resolved.

py-polars/polars/datatypes/classes.py Outdated Show resolved Hide resolved
@alexander-beedie
Copy link
Collaborator

@alexander-beedie
Sorry for disappearance. Rebased main.

Thx; will look at the weekend

Copy link

codecov bot commented Feb 17, 2024

Codecov Report

Attention: Patch coverage is 82.79570% with 16 lines in your changes are missing coverage. Please review.

Project coverage is 80.78%. Comparing base (0e63403) to head (4ed1b8b).
Report is 15 commits behind head on main.

❗ Current head 4ed1b8b differs from pull request most recent head 7da83bd. Consider uploading reports for the commit 7da83bd to get more accurate results

Files Patch % Lines
py-polars/polars/expr/array.py 66.66% 0 Missing and 1 partial ⚠️
py-polars/polars/expr/binary.py 66.66% 0 Missing and 1 partial ⚠️
py-polars/polars/expr/categorical.py 66.66% 0 Missing and 1 partial ⚠️
py-polars/polars/expr/datetime.py 66.66% 0 Missing and 1 partial ⚠️
py-polars/polars/expr/list.py 66.66% 0 Missing and 1 partial ⚠️
py-polars/polars/expr/meta.py 66.66% 0 Missing and 1 partial ⚠️
py-polars/polars/expr/name.py 66.66% 0 Missing and 1 partial ⚠️
py-polars/polars/expr/string.py 66.66% 0 Missing and 1 partial ⚠️
py-polars/polars/expr/struct.py 66.66% 0 Missing and 1 partial ⚠️
py-polars/polars/series/array.py 66.66% 0 Missing and 1 partial ⚠️
... and 6 more
Additional details and impacted files
@@            Coverage Diff             @@
##             main   #13236      +/-   ##
==========================================
- Coverage   80.92%   80.78%   -0.15%     
==========================================
  Files        1328     1326       -2     
  Lines      173446   173123     -323     
  Branches     2455     2455              
==========================================
- Hits       140369   139865     -504     
- Misses      32605    32771     +166     
- Partials      472      487      +15     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@stinodego stinodego changed the title perf(python): Add __slots__ to Expr, LazyFrame, and various other classes perf(python): Add __slots__ to most Polars classes Feb 26, 2024
Copy link
Member

@stinodego stinodego left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed the namespace slots for now. Let's merge this and see if we get any user feedback on this. If not, that would be a good thing :)

@stinodego stinodego merged commit f940c58 into pola-rs:main Feb 26, 2024
12 of 13 checks passed
@lorentzenchr
Copy link
Contributor

@stinodego May I ask what the decision criteria for inclusion finally was? I'm really just curious because I considered something similar in scikit-learn and measurement showed no performance improvement at all, quite similar to #13236 (comment) did neither.

@stinodego
Copy link
Member

stinodego commented Feb 27, 2024

@stinodego May I ask what the decision criteria for inclusion finally was? I'm really just curious because I considered something similar in scikit-learn and measurement showed no performance improvement at all, quite similar to #13236 (comment) did neither.

I like that this clearly documents/enforces what the instance variables are. That is the most important thing for me.

I found out later that this changes the behavior of getattr and I'm not sure I'm happy with that:

class A:
    __slots__ = ("x",)

print(getattr(A, "x", None))
# <member 'x' of 'A' objects>       <-- without slots this would print None

But besides that, I don't see any big downsides so might as well give this a try. There may be some users out there that hit a specific case where the potential performance benefits actually help them.

@lorentzenchr
Copy link
Contributor

@stinodego Thanks for the explanation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance Performance issues or improvements python Related to Python Polars
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add __slots__ to pl.Expr and pl.LazyFrame
5 participants