-
-
Notifications
You must be signed in to change notification settings - Fork 2.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Don't type check most function bodies if ignoring errors #14150
Conversation
This comment has been minimized.
This comment has been minimized.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! I like the idea, left couple comments (not a full review).
self.lvalue = False | ||
self.found = False | ||
|
||
def visit_assignment_stmt(self, s: AssignmentStmt) -> None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you also need to check assignment expression? (i.e. walrus a.k.a :=
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't support assigning to an attribute.
): | ||
# We only strip method bodies if they don't assign to an attribute, as | ||
# this may define an attribute which has an externally visible effect. | ||
visitor = FindAttributeAssign() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be possible to strip statements after last assignment to attribute? This could make a bit more perf gain.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This may be possible! I'd rather not do it in this PR to keep it simple (and the impact is probably pretty minor), but it's seems like a promising follow-up improvement to investigate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got a chance to test the PR on a full run with Home Assistant (with all dependencies installed). The performance improvement is noticeable. Both with the uncompiled version on Python 3.9 - perviously: ~11min - now: ~7:30min 🚀
Unfortunately, I also got a few new error messages which the primer didn't pick up on.
homeassistant/components/nibe_heatpump/__init__.py:267: error: "Coroutine[Any, Any, AsyncIterator[Coil]]" has no attribute "__aiter__" (not async iterable) [attr-defined]
homeassistant/components/nibe_heatpump/__init__.py:267: note: Maybe you forgot to use "await"?
homeassistant/components/google/calendar.py:350: error: "Coroutine[Any, Any, AsyncIterator[ListEventsResponse]]" has no attribute "__anext__" [attr-defined]
homeassistant/components/google/calendar.py:350: note: Maybe you forgot to use "await"?
homeassistant/components/homekit_controller/config_flow.py:155: error: "Coroutine[Any, Any, AsyncIterable[AbstractDiscovery]]" has no attribute "__aiter__" (not async iterable) [attr-defined]
homeassistant/components/homekit_controller/config_flow.py:155: note: Maybe you forgot to use "await"?
homeassistant/components/amcrest/__init__.py:209: error: Return type "_AsyncGeneratorContextManager[Response]" of "async_stream_command" incompatible with return type "_AsyncGeneratorContextManager[<nothing>]" in supertype "Http" [override]
homeassistant/components/amcrest/__init__.py:214: error: Need type annotation for "ret" [var-annotated]
Found 5 errors in 4 files (checked 5433 source files)
One of the edge cases seem to be async iterators with yield
. To reproduce it, create two files
# a.py
from typing import AsyncIterator
class L:
async def some_func(self, i: int) -> str:
return str(i)
async def get_iterator(self) -> AsyncIterator[str]:
for i in range(5):
yield await self.some_func(i)
# b.py
from a import L
async def func(l: L) -> None:
reveal_type(l.get_iterator)
async for i in l.get_iterator():
print(i)
And run mypy b.py
.
b.py:5: note: Revealed type is "def () -> typing.Coroutine[Any, Any, typing.AsyncIterator[builtins.str]]"
b.py:6: error: "Coroutine[Any, Any, AsyncIterator[str]]" has no attribute "__aiter__" (not async iterable) [attr-defined]
b.py:6: note: Maybe you forgot to use "await"?
The second edge case is fairly similar, just with the addition of # a.py
from contextlib import asynccontextmanager
from typing import AsyncIterator
class Parent:
@asynccontextmanager
async def async_func(self) -> AsyncIterator[str]:
yield '' # b.py
from contextlib import asynccontextmanager
from typing import AsyncIterator
from test8 import Parent
class Child(Parent):
@asynccontextmanager
async def async_func(self) -> AsyncIterator[str]:
yield '' Running
|
@cdce8p Thanks for the detailed reports about regressions! I clearly need to investigate those cases. |
These can have an externally visible effect within an async function. The existence of yield affects the inferred return type, and a yield from generates a blocking error.
This comment has been minimized.
This comment has been minimized.
Just tested the PR with Home Assistant again. The last changes resolve all issues 🎉 |
There are some performance regressions on master that I'd like to address first before merging this. Otherwise the performance impact estimate might be inflated. |
@JukkaL I just fixed conflicts. What's the status of this PR? Are you ready to re-measure perf / merge in time for 1.0? |
This comment has been minimized.
This comment has been minimized.
Just wanted to quickly note that I think this will be very valuable for codebases that are migrating from untyped Python to typed Python. We use # Exclude files with '# mypy: ignore-errors' for performance
# - Requires GNU Grep (tested with 3.8)
# - Ignores the same directories as Mypy https://mypy.readthedocs.io/en/stable/command_line.html#cmdoption-mypy-exclude
# - Also ignores hidden directories like `.git` and `.mypy_cache`
# - Does not support unnecessary whitespace in comment
exclude="$(\
grep \
--binary-files without-match \
--recursive \
--include '*.py' \
--include '*.pyi' \
--exclude-dir 'site-packages' \
--exclude-dir 'node_modules' \
--exclude-dir '__pycache__' \
--exclude-dir '.*' \
--files-with-matches \
'^# mypy: ignore[-_]errors$' \
| sed 's/^\.\///g' \
| tr '\n' '|' \
| sed 's/\./\\\./g' \
| sed 's/|$//'\
)"
mypy --exclude "$exclude" |
According to mypy_primer, this change has no effect on the checked open source code. 🤖🎉 |
I finally got around to running some benchmarks. Self-checking mypy (not including mypyc) was about 10% faster when compiled, and 15% faster when interpreted. |
If errors are ignored, type checking function bodies often can have no effect. Remove function bodies after parsing to speed up type checking.
Methods that define attributes have an externally visible effect even if errors are ignored. The body of any method that assigns to any attribute is preserved to deal with this (even if it doesn't actually define a new attribute). Most methods don't assign to an attribute, so stripping bodies is still effective for methods.
There are a couple of additional interesting things in the implementation:
...
) to checksuper()
method calls. The approach here is to preserve such trivial bodies and treat them differently from no body at all.The main benefit is faster type checking when using installed packages with inline type information (PEP 561). Errors are ignored in this case, and it's common to have a large number of third-party code to type check. For example, a self check (code under
mypy/
) is now about 20% faster, with a compiled mypy on Python 3.11.Another, more subtle benefit is improved reliability. A third-party library may have some code that triggers a mypy crash or an invalid blocking error. If bodies are stripped, often the error will no longer be triggered, since the amount code to type check is much lower.