gh-102578: Optimise setting and deleting mutable attributes on non-dataclass subclasses of frozen dataclasses #102573

Merged
merged 15 commits into from
Mar 11, 2023

Conversation

XuehaiPan
Contributor

@XuehaiPan XuehaiPan commented Mar 10, 2023

Creating a dataclass with `frozen=True` automatically generates `__setattr__` and `__delattr__` methods; they are built in `_frozen_get_del_attr`.

This PR changes the field-name check in those methods from a tuple-based lookup to a set-based lookup, reducing the time complexity of the membership test from $O(n)$ to $O(1)$ on average.

In [1]: # tuple-based

In [2]: %timeit 'a' in ('a', 'b', 'c', 'd', 'e', 'f', 'g')
9.91 ns ± 0.0982 ns per loop (mean ± std. dev. of 7 runs, 100,000,000 loops each)

In [3]: %timeit 'd' in ('a', 'b', 'c', 'd', 'e', 'f', 'g')
33.2 ns ± 0.701 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)

In [4]: %timeit 'g' in ('a', 'b', 'c', 'd', 'e', 'f', 'g')
56.4 ns ± 0.818 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)

In [5]: # set-based

In [6]: %timeit 'a' in {'a', 'b', 'c', 'd', 'e', 'f', 'g'}
11.3 ns ± 0.0723 ns per loop (mean ± std. dev. of 7 runs, 100,000,000 loops each)

In [7]: %timeit 'd' in {'a', 'b', 'c', 'd', 'e', 'f', 'g'}
11 ns ± 0.106 ns per loop (mean ± std. dev. of 7 runs, 100,000,000 loops each)

In [8]: %timeit 'g' in {'a', 'b', 'c', 'd', 'e', 'f', 'g'}
11.1 ns ± 0.126 ns per loop (mean ± std. dev. of 7 runs, 100,000,000 loops each)
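
For context, here is a minimal hand-written sketch of the kind of check the generated methods perform. It is not the code `dataclasses` emits: `Point`, `Labeled` and `_FIELD_NAMES` are hypothetical names chosen for illustration, and the only point is to show where the `name in ...` membership test sits.

```python
from dataclasses import FrozenInstanceError


class Point:
    # Hypothetical stand-in for a generated frozen dataclass.
    # A set gives average O(1) membership tests; a tuple would be O(n).
    _FIELD_NAMES = {"x", "y"}

    def __init__(self, x, y):
        # Bypass our own __setattr__, as a frozen class has to do.
        object.__setattr__(self, "x", x)
        object.__setattr__(self, "y", y)

    def __setattr__(self, name, value):
        # Exact instances are fully frozen; subclasses only protect fields.
        if type(self) is Point or name in Point._FIELD_NAMES:
            raise FrozenInstanceError(f"cannot assign to field {name!r}")
        super().__setattr__(name, value)

    def __delattr__(self, name):
        if type(self) is Point or name in Point._FIELD_NAMES:
            raise FrozenInstanceError(f"cannot delete field {name!r}")
        super().__delattr__(name)


class Labeled(Point):
    """Plain (non-dataclass) subclass: may carry extra mutable attributes."""


p = Labeled(1, 2)
p.label = "origin"  # not a field, so the membership test lets it through

try:
    p.x = 10  # a field, so it is rejected even on the subclass
except FrozenInstanceError:
    field_write_blocked = True
else:
    field_write_blocked = False
```

Every attribute write on a subclass instance goes through the membership test, which is why its cost matters for the benchmarks that follow.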

A tiny benchmark script:

from contextlib import suppress
from dataclasses import FrozenInstanceError, dataclass

@dataclass(frozen=True)
class Foo2:
    a: int
    b: int

foo2 = Foo2(1, 2)

def bench2(inst):
    with suppress(FrozenInstanceError):
        inst.a = 0
    with suppress(FrozenInstanceError):
        inst.b = 0

@dataclass(frozen=True)
class Foo7:
    a: int
    b: int
    c: int
    d: int
    e: int
    f: int
    g: int

foo7 = Foo7(1, 2, 3, 4, 5, 6, 7)

def bench7(inst):
    with suppress(FrozenInstanceError):
        inst.a = 0
    with suppress(FrozenInstanceError):
        inst.b = 0
    with suppress(FrozenInstanceError):
        inst.c = 0
    with suppress(FrozenInstanceError):
        inst.d = 0
    with suppress(FrozenInstanceError):
        inst.e = 0
    with suppress(FrozenInstanceError):
        inst.f = 0
    with suppress(FrozenInstanceError):
        inst.g = 0

class Bar(Foo7):
    def __init__(self, a, b, c, d, e, f, g):
        super().__init__(a, b, c, d, e, f, g)
        self.baz = 0

# Instance used by `bench(bar)` in the results below.
bar = Bar(1, 2, 3, 4, 5, 6, 7)

def bench(inst):
    inst.baz = 1

Result:

set-based lookup:

In [2]: %timeit bench2(foo2)
1.08 µs ± 28.1 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

In [3]: %timeit bench7(foo7)
3.81 µs ± 20.3 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

In [4]: %timeit bench(bar)
249 ns ± 6.31 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

tuple-based lookup (original):

In [2]: %timeit bench2(foo2)
1.15 µs ± 10.9 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

In [3]: %timeit bench7(foo7)
3.97 µs ± 15.7 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

In [4]: %timeit bench(bar)
269 ns ± 4.09 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

The set-based lookup is consistently faster than the old tuple-based approach, and its theoretical time complexity is also lower (average $O(1)$ vs. $O(n)$).
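
The complexity difference can also be observed without timing anything, by counting equality comparisons. The sketch below is an illustration only: `CountingStr` and `count_comparisons` are hypothetical helpers, and the exact counts depend on CPython implementation details of `in` for tuples and sets.

```python
class CountingStr(str):
    """A str subclass that counts how often it is compared for equality."""

    eq_calls = 0

    def __eq__(self, other):
        CountingStr.eq_calls += 1
        return str.__eq__(self, other)

    # Defining __eq__ clears the inherited __hash__, so restore it.
    __hash__ = str.__hash__


def count_comparisons(container, needle):
    """Return (found, number of __eq__ calls) for `needle in container`."""
    CountingStr.eq_calls = 0
    found = CountingStr(needle) in container
    return found, CountingStr.eq_calls


names = ("a", "b", "c", "d", "e", "f", "g")

# Searching for the last name: the tuple is scanned element by element,
# while the set jumps to the matching hash bucket.
tuple_result = count_comparisons(names, "g")
set_result = count_comparisons(set(names), "g")
print("tuple:", tuple_result)
print("set:  ", set_result)
```

The tuple search is expected to compare against every element before the match, whereas the set search should need far fewer comparisons regardless of how many names there are.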

Resolves #102578

@XuehaiPan XuehaiPan requested a review from ericvsmith as a code owner March 10, 2023 09:57
@bedevere-bot

Most changes to Python require a NEWS entry.

Please add it using the blurb_it web app or the blurb command-line tool.

@cpython-cla-bot

cpython-cla-bot bot commented Mar 10, 2023

All commit authors signed the Contributor License Agreement.

@XuehaiPan XuehaiPan changed the title from "Use set-based name lookup rather than tuples for frozen dataclasses" to "gh-102573: Use set-based name lookup rather than tuples for frozen dataclasses" Mar 10, 2023
Lib/dataclasses.py (outdated, resolved)
@AlexWaygood
Member

AlexWaygood commented Mar 10, 2023

The issue-number CI check is failing because gh-102573 does not yet exist as a GitHub issue :)

You should create an issue to describe the change you're proposing here, and then link to it in the title of this PR.

@XuehaiPan XuehaiPan changed the title from "gh-102573: Use set-based name lookup rather than tuples for frozen dataclasses" to "gh-102578: Use set-based name lookup rather than tuples for frozen dataclasses" Mar 10, 2023
@XuehaiPan
Contributor Author

Opened a linked issue and added some benchmark results.

XuehaiPan and others added 2 commits March 10, 2023 22:08
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
@AlexWaygood AlexWaygood added the performance (Performance or resource usage) and stdlib (Python modules in the Lib dir) labels Mar 10, 2023
@AlexWaygood
Member

Looks like the CLA check has started failing with the latest commit -- if you're using two email addresses, you may have to sign it with both of them, unfortunately :(

@XuehaiPan
Contributor Author

Seems that the CLA check has passed now.

Member

@AlexWaygood AlexWaygood left a comment

Looks great to me. I verified the speedup by running this benchmark locally (which is a little different to the one you posted in your issue and the one you posted in this PR).

Benchmark:
import dataclasses
import string
import time

Foo = dataclasses.make_dataclass(
    "Foo",
    [(letter, int) for letter in string.ascii_lowercase],
    frozen=True
)

class Bar(Foo): ...

instance = Bar(*range(26))
t0 = time.perf_counter()
for _ in range(10_000_000):
    instance.foo = 1
    del instance.foo
print(f"{time.perf_counter() - t0:.2f}")

The result of this benchmark is 15.15 seconds on main on my machine (--pgo non-debug build), but 6.44 seconds with this patch applied.
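
As a reminder of the semantics this benchmark exercises, the sketch below (a small illustration with a two-field class, not part of the PR) shows that a non-dataclass subclass of a frozen dataclass can set and delete its own extra attributes, while the inherited fields stay protected.

```python
import dataclasses

# Two-field frozen dataclass, built the same way as in the benchmark above.
Foo = dataclasses.make_dataclass("Foo", [("a", int), ("b", int)], frozen=True)


class Bar(Foo):
    """Plain subclass: not decorated with @dataclass."""


instance = Bar(1, 2)

# Non-field attributes are mutable on the subclass...
instance.foo = 1
assert instance.foo == 1
del instance.foo

# ...but the frozen fields still cannot be reassigned.
try:
    instance.a = 99
except dataclasses.FrozenInstanceError:
    field_write_blocked = True
else:
    field_write_blocked = False
```

Both the assignment and the deletion of `foo` pass through the generated `__setattr__` and `__delattr__`, so each loop iteration in the benchmark pays for two membership tests.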

I'll wait for a thumbs-up from @ericvsmith, @carljm, or another core dev before merging, but this has my approval -- thanks!

@AlexWaygood AlexWaygood changed the title from "gh-102578: Use set-based name lookup rather than tuples for frozen dataclasses" to "gh-102578: Optimise setting and deleting mutable attributes on non-dataclass subclasses of frozen dataclasses" Mar 10, 2023
  else:
- # Special case for the zero-length tuple.
+ # Special case for the zero-length set.
+ # Use the empty tuple singleton to avoid unnecessary `set` construction
Member

Not that it matters much, but the zero length case could just avoid the `or name in ...` test entirely. Maybe `fields_str` should become `fields_test`, and then set it to `or name in {<<generated set literal>>}` or set it to an empty string if there are no fields. Then change the generated code to `f'if type(self) is cls {fields_test}:'`. Although that doesn't read very well. Maybe tweak `fields_test` to be something else.

This could be part of a different PR, or include it here. But in any event I'm not positive that the zero length case actually has a test. We should make sure it does for this PR.
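
One way to read this suggestion is sketched below. It is an assumption about the shape of the idea, not the code that was merged: `build_condition` is a hypothetical helper that only assembles the source text of the `if` line.

```python
def build_condition(field_names):
    """Return the source of the `if` line for the generated methods.

    Hypothetical helper: with no fields, the `or name in {...}` part is
    dropped entirely instead of testing against an empty container.
    """
    if field_names:
        literal = ", ".join(repr(name) for name in field_names)
        fields_test = f" or name in {{{literal}}}"
    else:
        fields_test = ""
    return f"if type(self) is cls{fields_test}:"


print(build_condition(["a", "b"]))
print(build_condition([]))
```

Keeping the leading space inside `fields_test` avoids a stray space before the colon in the zero-field case.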

Contributor Author

> But in any event I'm not positive that the zero length case actually has a test. We should make sure it does for this PR.

Added a small test for empty frozen dataclass.
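
A test along these lines might look like the sketch below. It is not the test added in this PR; `EmptyFrozenTest` and its method names are hypothetical, and it only checks behaviour that follows from the `type(self) is cls` part of the condition.

```python
import unittest
from dataclasses import FrozenInstanceError, dataclass


class EmptyFrozenTest(unittest.TestCase):  # hypothetical test case name
    def test_frozen_empty(self):
        @dataclass(frozen=True)
        class C:
            pass

        c = C()
        # With zero fields, exact instances must still be frozen.
        with self.assertRaises(FrozenInstanceError):
            c.i = 5
        self.assertFalse(hasattr(c, "i"))
        with self.assertRaises(FrozenInstanceError):
            del c.i

    def test_plain_subclass_of_empty_frozen(self):
        @dataclass(frozen=True)
        class C:
            pass

        class S(C):
            pass

        s = S()
        # A non-dataclass subclass may still set its own attributes.
        s.i = 5
        self.assertEqual(s.i, 5)
        del s.i
        self.assertFalse(hasattr(s, "i"))
```

Saved to a file, this can be run with `python -m unittest`.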

Lib/dataclasses.py (outdated, resolved)
Member

@carljm carljm left a comment

Code changes LGTM. A couple comments on the new test.

Lib/test/test_dataclasses.py (outdated, resolved)
Lib/test/test_dataclasses.py (resolved)
@bedevere-bot

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

And if you don't make the requested changes, you will be put in the comfy chair!

@XuehaiPan
Contributor Author

I have made the requested changes; please review again

@bedevere-bot

Thanks for making the requested changes!

@carljm, @AlexWaygood: please review the changes made to this pull request.

Member

@carljm carljm left a comment

Looks good to me; thanks for the perf improvement!

Member

@AlexWaygood AlexWaygood left a comment

Looks great!

@ericvsmith ericvsmith merged commit ee6f841 into python:main Mar 11, 2023
@XuehaiPan XuehaiPan deleted the dataclasses-lookup branch March 11, 2023 04:44
iritkatriel pushed a commit to iritkatriel/cpython that referenced this pull request Mar 12, 2023