-
-
Notifications
You must be signed in to change notification settings - Fork 2.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Speed up make_simplified_union, fix recursive tuple crash #15128
Merged
Merged
Changes from all commits
Commits
Show all changes
8 commits
Select commit
Hold shift + click to select a range
184e415
Speed up make_simplified_union, remove a potential crash
hauntsaninja cd5f40b
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] 13c4007
Merge branch 'master' into msu
hauntsaninja b009483
fix performance on materialize
hauntsaninja bea67d2
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] fd8463b
more perf
hauntsaninja b132663
add regression test for crash
hauntsaninja bc79133
micro optimisations
hauntsaninja File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -385,25 +385,6 @@ def callable_corresponding_argument( | |
return by_name if by_name is not None else by_pos | ||
|
||
|
||
def simple_literal_value_key(t: ProperType) -> tuple[str, ...] | None: | ||
"""Return a hashable description of simple literal type. | ||
|
||
Return None if not a simple literal type. | ||
|
||
The return value can be used to simplify away duplicate types in | ||
unions by comparing keys for equality. For now enum, string or | ||
Instance with string last_known_value are supported. | ||
""" | ||
if isinstance(t, LiteralType): | ||
if t.fallback.type.is_enum or t.fallback.type.fullname == "builtins.str": | ||
assert isinstance(t.value, str) | ||
return "literal", t.value, t.fallback.type.fullname | ||
if isinstance(t, Instance): | ||
if t.last_known_value is not None and isinstance(t.last_known_value.value, str): | ||
return "instance", t.last_known_value.value, t.type.fullname | ||
return None | ||
|
||
|
||
def simple_literal_type(t: ProperType | None) -> Instance | None: | ||
"""Extract the underlying fallback Instance type for a simple Literal""" | ||
if isinstance(t, Instance) and t.last_known_value is not None: | ||
|
@@ -414,7 +395,6 @@ def simple_literal_type(t: ProperType | None) -> Instance | None: | |
|
||
|
||
def is_simple_literal(t: ProperType) -> bool: | ||
"""Fast way to check if simple_literal_value_key() would return a non-None value.""" | ||
if isinstance(t, LiteralType): | ||
return t.fallback.type.is_enum or t.fallback.type.fullname == "builtins.str" | ||
if isinstance(t, Instance): | ||
|
@@ -500,68 +480,80 @@ def make_simplified_union( | |
def _remove_redundant_union_items(items: list[Type], keep_erased: bool) -> list[Type]: | ||
from mypy.subtypes import is_proper_subtype | ||
|
||
removed: set[int] = set() | ||
seen: set[tuple[str, ...]] = set() | ||
|
||
# NB: having a separate fast path for Union of Literal and slow path for other things | ||
# would arguably be cleaner, however it breaks down when simplifying the Union of two | ||
# different enum types as try_expanding_sum_type_to_union works recursively and will | ||
# trigger intermediate simplifications that would render the fast path useless | ||
for i, item in enumerate(items): | ||
proper_item = get_proper_type(item) | ||
if i in removed: | ||
continue | ||
# Avoid slow nested for loop for Union of Literal of strings/enums (issue #9169) | ||
k = simple_literal_value_key(proper_item) | ||
if k is not None: | ||
if k in seen: | ||
removed.add(i) | ||
# The first pass through this loop, we check if later items are subtypes of earlier items. | ||
# The second pass through this loop, we check if earlier items are subtypes of later items | ||
# (by reversing the remaining items) | ||
for _direction in range(2): | ||
new_items: list[Type] = [] | ||
# seen is a map from a type to its index in new_items | ||
seen: dict[ProperType, int] = {} | ||
unduplicated_literal_fallbacks: set[Instance] | None = None | ||
for ti in items: | ||
proper_ti = get_proper_type(ti) | ||
|
||
# UninhabitedType is always redundant | ||
if isinstance(proper_ti, UninhabitedType): | ||
continue | ||
|
||
# NB: one would naively expect that it would be safe to skip the slow path | ||
# always for literals. One would be sorely mistaken. Indeed, some simplifications | ||
# such as that of None/Optional when strict optional is false, do require that we | ||
# proceed with the slow path. Thankfully, all literals will have the same subtype | ||
# relationship to non-literal types, so we only need to do that walk for the first | ||
# literal, which keeps the fast path fast even in the presence of a mixture of | ||
# literals and other types. | ||
safe_skip = len(seen) > 0 | ||
seen.add(k) | ||
if safe_skip: | ||
continue | ||
|
||
# Keep track of the truthiness info for deleted subtypes which can be relevant | ||
cbt = cbf = False | ||
for j, tj in enumerate(items): | ||
proper_tj = get_proper_type(tj) | ||
if ( | ||
i == j | ||
# avoid further checks if this item was already marked redundant. | ||
or j in removed | ||
# if the current item is a simple literal then this simplification loop can | ||
# safely skip all other simple literals as two literals will only ever be | ||
# subtypes of each other if they are equal, which is already handled above. | ||
# However, if the current item is not a literal, it might plausibly be a | ||
# supertype of other literals in the union, so we must check them again. | ||
# This is an important optimization as is_proper_subtype is pretty expensive. | ||
or (k is not None and is_simple_literal(proper_tj)) | ||
): | ||
continue | ||
# actual redundancy checks (XXX?) | ||
if is_redundant_literal_instance(proper_item, proper_tj) and is_proper_subtype( | ||
tj, item, keep_erased_types=keep_erased, ignore_promotions=True | ||
duplicate_index = -1 | ||
# Quickly check if we've seen this type | ||
if proper_ti in seen: | ||
duplicate_index = seen[proper_ti] | ||
elif ( | ||
isinstance(proper_ti, LiteralType) | ||
and unduplicated_literal_fallbacks is not None | ||
and proper_ti.fallback in unduplicated_literal_fallbacks | ||
): | ||
# We found a redundant item in the union. | ||
removed.add(j) | ||
cbt = cbt or tj.can_be_true | ||
cbf = cbf or tj.can_be_false | ||
# if deleted subtypes had more general truthiness, use that | ||
if not item.can_be_true and cbt: | ||
items[i] = true_or_false(item) | ||
elif not item.can_be_false and cbf: | ||
items[i] = true_or_false(item) | ||
# This is an optimisation for unions with many LiteralType | ||
# We've already checked for exact duplicates. This means that any super type of | ||
# the LiteralType must be a super type of its fallback. If we've gone through | ||
# the expensive loop below and found no super type for a previous LiteralType | ||
# with the same fallback, we can skip doing that work again and just add the type | ||
# to new_items | ||
pass | ||
else: | ||
# If not, check if we've seen a supertype of this type | ||
for j, tj in enumerate(new_items): | ||
tj = get_proper_type(tj) | ||
# If tj is an Instance with a last_known_value, do not remove proper_ti | ||
# (unless it's an instance with the same last_known_value) | ||
if ( | ||
isinstance(tj, Instance) | ||
and tj.last_known_value is not None | ||
and not ( | ||
isinstance(proper_ti, Instance) | ||
and tj.last_known_value == proper_ti.last_known_value | ||
) | ||
): | ||
continue | ||
|
||
if is_proper_subtype( | ||
proper_ti, tj, keep_erased_types=keep_erased, ignore_promotions=True | ||
): | ||
duplicate_index = j | ||
break | ||
if duplicate_index != -1: | ||
# If deleted subtypes had more general truthiness, use that | ||
orig_item = new_items[duplicate_index] | ||
if not orig_item.can_be_true and ti.can_be_true: | ||
new_items[duplicate_index] = true_or_false(orig_item) | ||
elif not orig_item.can_be_false and ti.can_be_false: | ||
new_items[duplicate_index] = true_or_false(orig_item) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Similar to above. |
||
else: | ||
# We have a non-duplicate item, add it to new_items | ||
seen[proper_ti] = len(new_items) | ||
new_items.append(ti) | ||
if isinstance(proper_ti, LiteralType): | ||
if unduplicated_literal_fallbacks is None: | ||
unduplicated_literal_fallbacks = set() | ||
unduplicated_literal_fallbacks.add(proper_ti.fallback) | ||
|
||
return [items[i] for i in range(len(items)) if i not in removed] | ||
items = new_items | ||
hauntsaninja marked this conversation as resolved.
Show resolved
Hide resolved
|
||
if len(items) <= 1: | ||
break | ||
items.reverse() | ||
hauntsaninja marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
||
return items | ||
|
||
|
||
def _get_type_special_method_bool_ret_type(t: Type) -> Type | None: | ||
|
@@ -992,17 +984,6 @@ def custom_special_method(typ: Type, name: str, check_all: bool = False) -> bool | |
return False | ||
|
||
|
||
def is_redundant_literal_instance(general: ProperType, specific: ProperType) -> bool: | ||
if not isinstance(general, Instance) or general.last_known_value is None: | ||
return True | ||
if isinstance(specific, Instance) and specific.last_known_value == general.last_known_value: | ||
return True | ||
if isinstance(specific, UninhabitedType): | ||
return True | ||
|
||
return False | ||
|
||
|
||
def separate_union_literals(t: UnionType) -> tuple[Sequence[LiteralType], Sequence[Type]]: | ||
"""Separate literals from other members in a union type.""" | ||
literal_items = [] | ||
|
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we only change
can_be_true
, or is it ok to only adjustcan_be_false
?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a good question. I'm going to preserve the existing behaviour for now and explore this in another PR. These code paths don't have many unit tests (see #15094 and #15098), so I want to be careful here