Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implementing background infrastructure for recursive types: Part 1 #7330

Merged
merged 19 commits into from
Aug 16, 2019

Conversation

ilevkivskyi
Copy link
Member

During planning discussions one of the main concerns about recursive types was the fact that we have hundreds of places where certain types are special-cased using isinstance(), and fixing all of them will take weeks.

So I did a little experiment this weekend, to understand how bad it actually is. I wrote a simple mypy plugin for mypy self-check, and it discovered 800+ such call sites. This looks pretty bad, but it turns out that fixing half of them (roughly 400 plugin errors) took me less than 2 days. This is kind of a triumph of our tooling :-) (i.e. mypy plugin + PyCharm plugin).

Taking into account results of this experiment I propose to actually go ahead and implement recursive types. Here are some comments:

  • There will be four subsequent PRs: second part of isinstance() cleanup, implementing visitors and related methods everywhere, actual core implementation, adding extra tests for tricky recursion patterns.
  • The core idea of implementation stays the same as we discussed with @JukkaL: TypeAliasType and TypeAlias node will essentially match logic between Instance and TypeInfo (but structurally, as for protocols)
  • I wanted to make PlaceholderType a non-ProperType, but it didn't work immediately because we call make_union() during semantic analysis. If this seems important, this can be done with a bit more effort.
  • I make TypeType.item a proper type (following PEP 484, only very limited things can be passed to Type[...]). I also make UnionType.items proper types, mostly because of make_simplified_union(). Finally, I make FuncBase.type a proper type, I think a type alias can never appear there.
  • It is sometimes hard to decide where exactly is to call get_proper_type(), I tried to balance calling them not too soon and not too late, depending of every individual case. Please review, I am open to modifying logic in some places.

@ilevkivskyi ilevkivskyi requested a review from JukkaL August 13, 2019 13:32
Copy link
Collaborator

@JukkaL JukkaL left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's great to see us moving towards supporting general recursive types! Looks good, just various small comments.

Using a plugin is a smart way to find all the places where we have dangerous isinstance calls. I had considered using NewType hacks but it would produce more confusing error messages.

mypy/types.py Outdated
A type alias to another type.

To support recursive type aliases we don't immediately expand a type alias
during semantic analysis, but create an in stance of this type that records the target alias
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo: "in stance"

mypy/types.py Outdated
"""Expand to the target type exactly once.

This doesn't do full expansion, i.e. the result can contain another
(or even this same) type alias.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this should be considered an internal helper (perhaps prefixed with _) and we could suggest using get_proper_type instead.

raise NotImplementedError('TODO')

@property
def can_be_true(self) -> bool: # type: ignore
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The # type: ignore may break mypyc is strange ways, so it would be better not to use it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is actually a mypy false positive #6759 that doesn't allow fixing this. I will leave a comment here with a link to issue.

return self.alias.target.can_be_true

@property
def can_be_false(self) -> bool: # type: ignore
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to above.

mypy/types.py Outdated
@@ -144,6 +144,97 @@ def deserialize(cls, data: JsonDict) -> 'Type':
raise NotImplementedError('Cannot deserialize {} instance'.format(cls.__name__))


class TypeAliasType(Type):
"""
A type alias to another type.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Style nit: move the first line to the same line with """ ("""A type alias ...).

definition node (mypy.nodes.TypeAlias) and type arguments (for generic aliases).

This is very similar to how TypeInfo vs Instance interact.
"""
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Style nit: Add empty line after """, for consistency.

"""
A type alias to another type.

To support recursive type aliases we don't immediately expand a type alias
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a prominent note that this is not being used yet, and the implementation is still incomplete?

@@ -868,7 +959,7 @@ def __init__(self,
assert len(arg_types) == len(arg_kinds) == len(arg_names)
if variables is None:
variables = []
self.arg_types = arg_types
self.arg_types = list(arg_types)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about making sure that these are expanded enough so that they are proper types (but may contain type alias references)? Not sure if this would be worth it, but it might let us avoid a bunch of get_proper_type calls elsewhere.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like this is actually not possible. Actually even what I did for unions is problematic, for example if we have (not very meaningful, since technically A = int) A = Union[int, A]. To represent this we must have somewhere a union that contains an item that is A. I will try to fix this now.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually these kind of "trivial" recursive unions are the most problematic, maybe we should actually ban them. Another solution would be to actually remove aliases being defined from immediate union, but this can silently hide an error, so I don't like this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After some more thinking I think I convinced myself that aliases like A = Union[int, A] are really bad. Essentially they are not much different from A = A, for example all situations like this:

x: A
x.to_bytes()

cause infinite recursion. Also, it is very hard to imagine any practical use for such aliases. Finally, they can be easily detected during semantic analysis and banned.

If we ban them, then we can keep UnionType.items and TypeType.item proper. I however didn't find any simple consistent representation schema where we could also make TupleType.items and CallableType.arg_types (and .ret_type) proper, mostly because of things like A = Union[int, Callable[[], A]].

@JukkaL If you don't have objections, I would merge this to get things moving (sorry for pinging during vacation :-)).

@@ -1216,7 +1307,7 @@ def deserialize(cls, data: JsonDict) -> 'Overloaded':
return Overloaded([CallableType.deserialize(t) for t in data['items']])


class TupleType(Type):
class TupleType(ProperType):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to above, I can imagine that expanding items could be useful (or not).

@@ -55,7 +58,7 @@ def _get_argument(call: CallExpr, name: str) -> Optional[Expression]:
callee_node = call.callee.node
if (isinstance(callee_node, (Var, SYMBOL_FUNCBASE_TYPES))
and callee_node.type):
callee_node_type = callee_node.type
callee_node_type = get_proper_type(callee_node.type)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this change may affect plugins, it needs to be documented in plugin docs and the plugins changelog issue before the next mypy release.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I actually added this expansion to minimize possible breakages (at least few of our plugins used isinstance() with these types).

@ilevkivskyi ilevkivskyi merged commit e04bf78 into python:master Aug 16, 2019
@ilevkivskyi ilevkivskyi deleted the recursive-types branch August 16, 2019 07:59
ilevkivskyi added a commit that referenced this pull request Aug 22, 2019
…7366)

This completes the addition of expansion points before every `isinstance()` checks on types.

This essentially does the same as #7330. Couple comments:
* Many checks (probably more than half) in type checker are against `AnyType`, `NoneType`, and `UninhabitedType`, but technically these are valid targets for aliases, so I still do expansions there.
* I added couple hooks to the plugin to flag redundant `get_proper_type()` and `get_proper_types()` calls, however sometimes when they trigger they add some additional spurious errors about non-matching overloads. I tried to fix this, but it looks like this requires calling `get_function_hook()` once on a whole overload (now it is called for every item).

Next part will contain implementation of the visitors.
ilevkivskyi added a commit that referenced this pull request Nov 8, 2019
…7885)

This is the last part of plumbing for recursive types (previous #7366 and #7330). Here I implement visitors and related functions.

I convinced myself that we need to only be more careful when a recursive type is checked against another recursive one, so I only special-case these. Logic is similar to how protocols behave, because very roughly type alias can be imagined as a protocol with single property:
```python
A = Union[T, Tuple[A[T], ...]]

class A(Protocol[T]):
    @Property
    def __base__(self) -> Union[T, Tuple[A[T], ...]]: ...
```
but where `TypeAliasType` plays role of `Instance` and `TypeAlias` plays role of `TypeInfo`.

Next two pull requests will contain some non-trivial implementation logic.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants