-
-
Notifications
You must be signed in to change notification settings - Fork 30.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
gh-74690: Improve performance when testing against protocols with many attributes. #112157
gh-74690: Improve performance when testing against protocols with many attributes. #112157
Conversation
Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool. If this change has little impact on Python users, wait for a maintainer to apply the |
@@ -1850,6 +1850,11 @@ def __instancecheck__(cls, instance): | |||
if super().__instancecheck__(instance): | |||
return True | |||
|
|||
# quick true negative check for protocols with many attributes. | |||
if not hasattr(instance, '__getattr__'): | |||
if not (cls.__protocol_attrs__ <= set(dir(instance))): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about classes that define custom __dir__()
methods? What about classes that define custom __getattribute__
methods? I think this approach is very risky
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, is there any C-implemented alternative to dir
, that returns some iterable of the attribute names? I saw inspect.getmembers_static
, but it is mostly plain python and way too slow. If custom __dir__
is a worry, one could maybe use object.__dir__(instance)
.
dir itself is even inefficient since it sorts the names, here we really only need something that iterates over the names.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In regard to the custom __getattribute__
stuff, I think the current code already suffers from that, no? The docs say about getattr_static
:
Note: this function may not be able to retrieve all attributes that getattr can fetch (like dynamically created attributes) and may find attributes that getattr can’t (like descriptors that raise AttributeError).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is there any C-implemented alternative to
dir
, that returns some iterable of the attribute names?
Not that I know of
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In regard to the custom
__getattribute__
stuff, I think the current code already suffers from that, no?
Ah indeed; good point
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So really the issue here seems that
getattr_static
is just really slow (I re-ran the test, replacing it with plaingetattr
and got 11x speedup)getattr
can be slow if it triggers some computation.
isinstance(x, proto)
only checks statically known attributes and can give incorrect results if certain attributes are defined dynamically.- since Protocols are about static typing, this seems ok to do.
- There is nothing in the data-model for classes to provide something like a
__hasattr__
to communicate which attributes are set dynamically.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Your analysis is correct.
getattr_static
is just really slow (I re-ran the test, replacing it with plaingetattr
and got 11x speedup)
getattr
can be slow if it triggers some computation.
isinstance(x, proto)
only checks statically known attributes and can give incorrect results if certain attributes are defined dynamically.
Note that we changed these two bullet points in Python 3.12 after long discussions in python/typing#1363 and #102433. Previously, isinstance()
checks didn't use inspect.getattr_static
, and did resolve dynamically defined attributes. But it was decided that that behaviour was bad, so we decided to accept the performance penalty in order to have more correct and less unexpected behaviour. (And through various other performance optimisations to compensate for that, isinstance()
checks against runtime-checkable protocols are still overall much faster in Python 3.12 than in Python 3.11, despite the slowdown from switching to inspect.getattr_static
.)
Wait for |
This adds a very simple check for detecting true negatives:
Running some benchmarks with
pyperf
against the example provided here: #74690 (comment) give the following speedups:one_int
andthree_float
are negative examples, and we see a 2.7x-3.7x speedup, but no measurable slowdown for the true positivetwo_frac
.Are there any other benchmarks to test against?