Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-105499: Merge typing.Union and types.UnionType #105511

Open
wants to merge 50 commits into
base: main
Choose a base branch
from

Conversation

JelleZijlstra
Copy link
Member

@JelleZijlstra JelleZijlstra commented Jun 8, 2023

Lib/test/test_typing.py Outdated Show resolved Hide resolved
@AlexWaygood
Copy link
Member

Would we be able to get rid of typing._make_union, and just have TypeVar etc. call UnionType.__class_getitem__ directly?

@JelleZijlstra
Copy link
Member Author

Would we be able to get rid of typing._make_union, and just have TypeVar etc. call UnionType.__class_getitem__ directly?

Yes

@AlexWaygood
Copy link
Member

Instead of deleting typing.Union completely, adding UnionType.__class_getitem__, and making typing.Union an alias for types.UnionType, I wonder if we could just change typing.Union so that it's a special form that just returns instances of UnionType:

from operator import or_
from functools import reduce

@_SpecialForm
def Union(self, parameters):
    return reduce(or_, parameters)

That would avoid the issue of "Should we rename types.UnionType to be Union", as we'd keep them as distinct objects.

But I know your plan is to deal with handling forward refs in | expressions later, which would mean that^ idea above wouldn't work right now (unions involving forward references would break).

So perhaps you could instead expose a _secret_undocumented_constructor method for types.UnionType (we can bikeshed over the name), and then just do

@_SpecialForm
def Union(self, parameters):
    return types.UnionType._secret_undocumented_constructor(parameters)

Or we could just expose the constructor of types.UnionType, of course.

@AlexWaygood
Copy link
Member

AlexWaygood commented Jun 8, 2023

Also you can now get rid of this function as part of this PR; prior callers of the function can now just use isinstance() checks against types.UnionType:

cpython/Lib/functools.py

Lines 842 to 844 in 6a8b862

def _is_union_type(cls):
from typing import get_origin, Union
return get_origin(cls) in {Union, types.UnionType}

@JelleZijlstra
Copy link
Member Author

That would avoid the issue of "Should we rename types.UnionType to be Union", as we'd keep them as distinct objects.

I'd rather not keep them as distinct objects, though; that means we still have two objects that to a user look like the same thing. Plus, we'd have get_origin(Union[int, str]) != Union.

@AlexWaygood
Copy link
Member

AlexWaygood commented Jun 8, 2023

I guess one of my reservations here is that UnionType ends up looking like a pretty weird type. You can't construct instances of it directly:

>>> from types import UnionType
>>> UnionType(int, str)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: cannot create 'types.UnionType' instances

...Except wait, you can, we now have a "secret __class_getitem__ backdoor" to create instances directly:

>>> UnionType[int, str, bytes]
int | str | bytes

It's very unusual for __class_getitem__ to just directly return instances of the class, and it feels really weird to make that the only way you're allowed to construct instances of the class.

Maybe that means we should just expose the constructor as well as adding __class_getitem__...?

Plus, we'd have get_origin(Union[int, str]) != Union.

Good point, that would indeed be confusing.

@JelleZijlstra
Copy link
Member Author

I can make the constructor work. Should it take *args and union them all together?

@AlexWaygood
Copy link
Member

I can make the constructor work. Should it take *args and union them all together?

That makes sense to me!

Copy link
Member

@AlexWaygood AlexWaygood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great, in my opinion. From a design perspective, the only weirdnesses I can see are that both of these become valid at runtime:

from types import UnionType
from typing import Union

UnionType[int, str]
Union(int, str)

I can live with that, though, and hopefully linters can flag those uses. (Type checkers almost certainly will, anyway.) On our side, we can just not document that you can do either of those things.

@JelleZijlstra
Copy link
Member Author

Coming back to this PR now, I still think it's best to combine the two objects completely. This creates a less confusing situation in the long term as we no longer have two nearly-equivalent objects. @AlexWaygood do you still have concerns about this?

@JelleZijlstra
Copy link
Member Author

Removed the tp_new for Union to alleviate part of @AlexWaygood's concerns. Now you can (still) only instantiate a union by subscripting it:

>>> from typing import Union
>>> Union[int, str]
int | str
>>> Union(int, str)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: cannot create 'typing.Union' instances
>>> type(int | str) is Union
True

Doc/library/stdtypes.rst Outdated Show resolved Hide resolved
@JelleZijlstra
Copy link
Member Author

@AlexWaygood any further thoughts? I just looked over the PR again and continue to think it's a good simplification.

@JelleZijlstra
Copy link
Member Author

This is now failing tests due to changes from #112283 that allow unhashable entries in unions. I'll need some new C code to make that work.

@JelleZijlstra
Copy link
Member Author

Finally got around to fixing hashability, and I'm pretty happy with the result (less code than before!). At the sprint I finagled @AlexWaygood into maybe taking another look. I still think this is a good idea and we should move forward with it.

@@ -709,10 +709,6 @@ def test_or_types_operator(self):
y = int | bool
with self.assertRaises(TypeError):
x < y
# Check that we don't crash if typing.Union does not have a tuple in __args__
y = typing.Union[str, int]
y.__args__ = [str, int]
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.__args__ is no longer writable.

Lib/test/test_types.py Outdated Show resolved Hide resolved
@@ -1015,9 +1005,14 @@ def __eq__(self, other):
return 1 / 0

bt = BadType('bt', (), {})
bt2 = BadType('bt2', (), {})
# Comparison should fail and errors should propagate out for bad types.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the new code there are fewer code paths that trigger the equality comparison.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants