-
-
Notifications
You must be signed in to change notification settings - Fork 30.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
gh-102578: Optimise setting and deleting mutable attributes on non-dataclass subclasses of frozen dataclasses #102573
Conversation
Most changes to Python require a NEWS entry. Please add it using the blurb_it web app or the blurb command-line tool. |
set
-based name lookup rather than tuple
s for frozen dataclassesset
-based name lookup rather than tuple
s for frozen dataclasses
Misc/NEWS.d/next/Library/2023-03-10-10-10-00.gh-issue-102573.GQ7sV6.rst
Outdated
Show resolved
Hide resolved
The You should create an issue to describe the change you're proposing here, and then link to it in the title of this PR. |
set
-based name lookup rather than tuple
s for frozen dataclassesset
-based name lookup rather than tuple
s for frozen dataclasses
Opened a linked issue and added some benchmark results. |
Misc/NEWS.d/next/Library/2023-03-10-13-21-16.gh-issue-102578.-gujoI.rst
Outdated
Show resolved
Hide resolved
Misc/NEWS.d/next/Library/2023-03-10-13-21-16.gh-issue-102578.-gujoI.rst
Outdated
Show resolved
Hide resolved
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
Looks like the CLA check has started failing with the latest commit -- if you're using two email address, you may have to sign it with both email addresses, unfortunately :( |
Seems that the CLA check has passed now. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great to me. I verified the speedup by running this benchmark locally (which is a little different to the one you posted in your issue and the one you posted in this PR).
Benchmark:
import dataclasses
import string
import time
Foo = dataclasses.make_dataclass(
"Foo",
[(letter, int) for letter in string.ascii_lowercase],
frozen=True
)
class Bar(Foo): ...
instance = Bar(*range(26))
t0 = time.perf_counter()
for _ in range(10_000_000):
instance.foo = 1
del instance.foo
print(f"{time.perf_counter() - t0:.2f}")
The result of this benchmark is 15.15 seconds on main
on my machine (--pgo
non-debug build), but 6.44 seconds with this patch applied.
I'll wait for a thumbs-up from @ericvsmith, @carljm, or another core dev before merging, but this has my approval -- thanks!
set
-based name lookup rather than tuple
s for frozen dataclasses
Lib/dataclasses.py
Outdated
else: | ||
# Special case for the zero-length tuple. | ||
# Special case for the zero-length set. | ||
# Use the empty tuple singleton to avoid unnecessary `set` construction |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not that it matters much, but the zero length case could just avoid the or name in ...
test entirely. Maybe fields_str should become fields_test, and then set it to or name in {<<generated set literal>>}
or set it to an empty string if there are no fields. Then change the generated code to f'if type(self) is cls {fields_test}:'
Although that doesn't read very well. Maybe tweak fields_test to be something else.
This could be part of a different PR, or include it here. But in any event I'm not positive that the zero length case actually has a test. We should make sure it does for this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But in any event I'm not positive that the zero length case actually has a test. We should make sure it does for this PR.
Added a small test for empty frozen dataclass.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code changes LGTM. A couple comments on the new test.
A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated. Once you have made the requested changes, please leave a comment on this pull request containing the phrase And if you don't make the requested changes, you will be put in the comfy chair! |
I have made the requested changes; please review again |
Thanks for making the requested changes! @carljm, @AlexWaygood: please review the changes made to this pull request. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me; thanks for the perf improvement!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great!
…non-dataclass subclasses of frozen dataclasses (pythongh-102573)
Creating
dataclass
es with argumentfrozen=True
will automatically generate methods__setattr__
and__delattr__
in_frozen_get_del_attr
.This PR changes$O(n)$ to $O(1)$ .
tuple
-based lookup toset
-based lookup. Reduce the time complexity fromA tiny benchmark script:
Result:
set
-based lookup:tuple
-based lookup (original):The$O(1)$ vs. $O(n)$).
set
-based is constantly faster than the old approach. And the theoretical time complexity is also smaller (Resolves #102578