-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
An argument against intersections. Or, having our cake, and eating it too. #38
Comments
Firstly, thanks @mikeshardmind for formulating this - it's really cool to see a little idea I had being rigorously explored! Just have a syntax question really, which I mentioned before but I think it's worth discussing again - how do we differentiate instances vs classes? # In current python
class A:
pass
x: A = A()
y: type[A] = A In the new syntax class A:
pass
class B:
pass
class C(B,A):
pass
x: type[B,A] = C
x = C() # How do we express type of this?
# Would it instead be:
x: type[B,A] = C()
x: type[type[B,A]] = C # This syntax seems kind of strange The term type is used to say, "this follows the metaclass of |
x = C() # How do we express type of this? This would be fine: typ_x: type[B, A] = C
x: object[B, A] = C()
# also fine:
x: object[B, A] = typ_x() But I believe that in that particular case, it should just be left alone (To inference) or typed as C. it becomes more clearly needed in a function signature def some_constructor(typ: type[*Ts]) -> object[*Ts]:
return typ() leaving out some details there of course |
Thanks for raising my attention to this on discord. I don't have much to add to this. I would be happy with this as an outcome. +0 on using object this way, it's better that it's a builtin and it doesn't need to be an import, but I don't feel as strongly as you do on it being a natural choice. Allowing reordering when trivial to the type structurally seems pragmatic to nominal typing use, but feels like this is just a clever formulation that hides the differences the type checker cares about being different from what runtime cares about. I think that's okay here since the behavior appears to be fully consistent with runtime and describes everything I think matters at static analysis time as well, but this needs to be something people seeing this don't gloss over. |
typ_x: type[B, A] = C
x: object[B, A] = C()
# also fine:
x: object[B, A] = typ_x() I follow your idea here - I like the idea of saving an import, but there are I think some points to consider: Why is
|
type[T] already works this way today. Code sample in pyright playground And also: >>> class X:
... pass
...
>>> isinstance(X, type)
True The only new things here are expanding this to allow multiple type parameters / typevartuples, defining what that means, and providing a way to specify when you want the instance behavior by expanding object. |
Sorry I realise now my heading was slightly misleading, but want I meant was "why is Update: I've investigated further and found it does not depend on the metaclass at all, so we're alright on that front - the below works: class B(type):
pass
class C(metaclass=B):
pass
x: type[C] = C |
Ah okay, then I should clarify in a bit more detail on a couple things you said. Yes,
The relationship between type and object is actually a little different, I showed a little bit of it above, but classes (Rather than instances of them) are type objects and are |
I agree this is unavoidable whichever name we choose - I guess I was wondering if giving specifically
Also Also by using a different name it has the nice property that, as with all other classes, Consider: T = TypeVar("T")
def f(o: T) -> type[T]:
...
# Now suppose we have:
x: object[A,B] = C()
y = f(x) The type of If instead: x: Inherit[A,B] = C()
y = f(x) The type of y can be You could also consider the inverse function of this which I think has a similar issue. |
Seems fine to me. When I had the idea to use object, it was from the relation between type and object. even the example you posed that mirrors with existing behavior of type, for instance, x: type[type] = type # This type checks, and is correct today
y: object = object() # this type checks, and is correct today
z: object[object] = object() # in theory, this should be fine too. It'll only look strange in cases where people have the single form, in which we can say that people should prefer not wrapping in I wouldn't want to special case and preclude that though since the equivalence is trivial. This could very easily be a linting rule in something that is opinionated about style including a code action in IDE or other autofix (see stuff like ruff), but we do need a corresponding way to refer to an instance, and requiring an import (and keep in mind that typing is considered expensive to import, and many people avoid doing so) for this when object is right here and a reasonable parallel seems like it makes this worse. |
@mikeshardmind I'm pretty convinced by your points here:
The only case that still bothers me is this one:
and it's inverse func as below: def f(o: type[T]) -> T:
...
x: type[A,B] = C()
y = f(x) Type of |
This is an interesting idea. @mikeshardmind, thanks for working out the details and presenting the proposal. Here are a few follow-up questions to help me clarify some points that I'm unclear on...
From a naming perspective, I think that |
Type Variables and typevar tuples are allowed, but type checkers are allowed to say they don't know that they represent if there isn't an appropriate corresponding use. I have not written out all of the places I would expect this to be solvable yet, but this is definitely a question that needs at least a minimum set of detectable specified even if we leave an upper limit wider in case of type checker advances. I can work on better specifying this with places I expect it to work and not, but I think it's fine to leave this with similar rules to existing type vars: there needs to be a clear place the types enter the system at some point to tie it all together. I'd really prefer to leave this as much in the ball court of type checkers as possible with how much they deem they can infer. I would state though that because this is intended to flexibly work with other non-specified bases, it's probably not reasonably inferable from "there are no other bases" I'll see about firming this up. At assignment, it is sufficient to check that each type is individually satisifed because creating the type will handle checking that creating the type is valid. From the use side, it's "safe to use" what's provided by the minimum bound.
quick example: def foo(some_list: list, x: object[int, S, list[T]]):
# type checker may choose to error or warn that it does not know what S or T, nothing provides that information here
# may alternatively chose to use the lowest known bound, object.
some_list[x] # this is fine, looking for `.__index__` we find it on int
for item in x: # `__iter__` is not found on int, not found on S, but found on list[T] without needing to specialize for T
print(item) # And the type being Iterator[T], and all types being printable allows this.
class Foo[S, T]:
def __init__(self, stype: S, ttype: T): ...
def foo(self, some_list: list, x: object[int, S, list[T]]):
# S and T are now resolvable in relation to something else (the class)
# but inside the body prior to specialization, may still only be known to the minimum bound, rest same as above Other Examples: class InplaceSortable(Protocol):
def sort(self):
...
def foo[T](x: object[Sequence[int], InplaceSortable, T]) -> object[Sequence[int], InplaceSortable, T]:
# type checker may error here, there's no appropriate way to infer what T is
# may indicate that the next example is what should actually be used
def foo_fixed[T: object[Sequence[int], InplaceSortable]](x: T) -> T:
...
class FooFineHere[T: SupportsIndex]:
def __init__(self, emitted_type: type[T]):
self.typ: T = emitted_type
self.some_list: list[T]
def foo(self, x: object[InplaceSortable, Sequence[T]])
# T fine here, inferable from class
x.sort() # this is fine
x.do_something() # not fine, nothing provides this
self.some_list[x[1]] # This works fine, even if it's nonsense for use as an example, bound of T
It is not the same.
This is entirely a type system side feature with the other checks you mention existing when someone tries to make the type that will be used here, so users don't lose out on those checks, but they will be raised to them in other places. The lookups are intentionally defined to be ordered in the same way mro would, but to allow levels of indirection not specified to keep both reasoning about this simple, as well as implementation. Some examples showing a few of the ways this is different: class A:
...
class B:
...
class ABRealized1(A, B):
...
class ABRealized2(B, A):
...
class C(B):
...
class ABRealized3(A, C):
...
type NeedsAB = object[A, B]
# all of these are fine.
# These are all instances of types which are all nominal subclasses of all of what is specified, LSP verified in respective definitions.
x: NeedsAB
x = ABRealized1()
x = ABRealized2()
x = ABRealized3()
# I believe this is in here, but a protocol indicating supporting rich comparison if not
from useful_types import RichComparable
def total_ordering(typ: type[T]) -> type[RichComparable, T]: # see functools.total_ordering
...
# Any type wrapped in this is now known to be both a nominal subtype of the original type
# while also being a structural subtype of the protocol RichComparison.
@total_ordering
class Point:
def __eq__(self, other) -> bool: ...
def __lt__(self, other) -> bool: ...
p1: Point
p2: Point
p1 >= p2 # fine |
OK, I think I understand. To summarize: A I think that implies... If either
That's a great point. I think that's really compelling because it eliminates the need to duplicate those complex checks and rules within the intersection type itself. Those errors will be easier to understand if they are surfaced at the point where such a type is defined (e.g. a Your answer about TypeVar usage makes sense, but I don't think you fully answered my question about constraint solving. I was asking about the case where a call expression targets a generic function or constructor whose signature includes a TypeVar within an class A: ...
class B: ...
class C: ...
class D(A, B): ...
class E(A, C): ...
def call_target[T](val: object[A, T]) -> T:
return val
def caller(d: D, de: D | E):
reveal_type(call_target(d)) # B ?
reveal_type(call_target(de)) # B | C ? I can think of other more complex constraint solver examples that involve structural types, overlapping types, etc. Some of these don't have such obvious solutions, but similar complex (and unspecified) situations arise when solving TypeVars that are included in unions, so maybe it's OK to leave some of these details up to individual type checkers to figure out. For all the cases I can think of, I'm able to come up with some answer that is reasonable. As for nomenclature, is there a reason we wouldn't want to use the term What other limitations are there on types that can be included in this construct? You've mentioned that any class (nominal or structural) is allowed, as are |
Mostly, yes. There's an example with non-disjoint types to consider though, and part of the reason the ordering does matter class A
def foo(self) -> int:
...
class B:
def foo(self, some_arg: bool = True) -> int:
...
def foo(x: object[A, B]):
x.foo() # ok
x.foo(True) # not okay we've specified A as first in order when resolving lookup at static analysis time. In this trivial case we can figure out that anything that subclasses A and B validly must allow this, but we're specifically leaving ordering to the user so that the more complcex cases where a type checker wouldn't be able to make a decision function properly for multiple use cases. Type checkers may decide to note that swapping the order would fix it to give a helpful message, but should error here ( I mentioned this off-handedly in the part about obviousness of behavior and ergonomics)
Actually, that's a compelling reason to not use obviously in the case of def foo[T](typ: type[T]) -> T:
... we want the current behavior def foo[*Ts](typ: type[*Ts]) ->object[*Ts]:
... In this case, is Ts the full mro of the type? Reasonable to do so, but is it useful? idk def foo[T, U](typ: type[T, U]) -> object[T, U]:
... This case looks fine at first, it's resolvable by having T resolve to the type passed in and U as object (or any other base in the mro) But it invites the case in your sample you have here where there's no reasonable choice as any choice would inherently be lossy on type information, and there's no singular correct pick that doesn't require understanding the intent of the developer outside of the type system itself.
Not really, no. I'd say in both cases here there's not a reasonable choice for the type checker to go with here, the type checker isn't provided enough information to make the right decision unambiguously here, and I don't think that this should be destructuring the mro partially to make sense of what the user is trying to do, the user should need to find a less ambiguous way to write this. If the specific call site shown is the intent, there's no clear reason why this wouldn't be a typevar bound, on a singular type var. From what the type checker should do, There's multiple answers which won't error, and at least one of these involves understanding all of the call sites, which may not be possible, especially if we consider libraries. this could present some undesirable cases where the lack of inference doesn't feel consistent, even if there's a consistent underlying rule for the type checker, which may make this harder to teach, learn, and/ or otherwise adopt. I would say from this that it's safe to infer only from call site in the singular case, and in the plural case, there should probably be some other external information here.
A preference on avoiding the same split in the future that is currently happening with TypeGuards, and a preference to avoid inviting use of I would be fine with
I have no reason to disallow any prior existing type construct here. The fun part about "it just needs to be all of these, and we have an order that the type checker should check for things in", is any error where it's impossible to make it will arise elsewhere when someone tries to make it and can't satisfy LSP or some other constraint of the type system. Literal types are the flimsiest of these, but since I can't see a reason to outright forbid it, I have to wonder if there is someone who will find a use. I also think that since users are responsible for sensible ordering in the case of any overlaps here, there's no additional complexity for type checkers. This should scale linearly with the number of elements in the ordered intersection. |
This is an interesting example because there would be no legal way to define a type with the order class Good(B, A): # OK
...
class Bad(A, B): # Type error: A.foo overrides B.foo in an incompatible way
... I think that any non-disjoint types are either illegal to combine (if the disjoint member is invariant) or demand a specific ordering (if the disjoint member is covariant). Just brainstorming... Here are some other terms that might connote "intersection-like" without using the term "Intersection": Overlap, Shared, Joint, Conjunction, Common, Meet, Mutual. I'm pretty bullish on this proposal. It strikes a good balance between functionality and complexity, and it appears to address most (all?) of the major use cases that were identified for intersections. Thoughts on next steps? Maybe start writing a more complete spec that could serve as a first draft of a PEP? |
Hmm. I'm not sure I want this to be specified behavior as mandatory, as this may increase complexity for type checkers when it would already be detected when you try to create something that is that type, but I do think that's a reasonable way type checkers could go above and beyond for users and detect it earlier. Not sure here. If you think this is reasonable to detect at the site of the ordered intersection as well, I can include detection of that in writing up a more formalized draft, but this was something there wasn't good consensus on in the unordered form, and while much of the disagreement should go away with the ordering being part of the construct, I'm unsure if it's necessary.
I should be able to spend time on that soon if there are no other major things to consider here. If we're mostly in agreement on the construct not having major discrepancies across use-cases, then we're down to defining it, specifying it well enough, potentially bikeshedding on a name, and presenting the argument for it to the wider python community as part of the PEP process. |
No, that's not what I meant. Sorry for any confusion. I definitely don't think that it's reasonable to detect that at the site of the type annotation, and I wouldn't want to see that appear in the spec. My point is that only some base class orders are legal in a real class definition. Both classes |
Short answer, yes. Slightly longer: it's still possible to resolve it even with that ordering (one of the cases against intersection detecting this in the unordered form from before, complexity in determining what the type would result in minimally) class NoLongerBad(A, B):
def foo(self, some_arg: bool = True) -> int:
if some_arg:
return super().foo()
else:
return 1
def foo(x: OrderedIntersection[A, B]):
# but we still use the order we told static analysis to look for it here,
# so while x supports an arg, we can't use it here unless we reorder here.
x.foo(True) # still an error
x.foo()
foo(NoLongerBad()) # fine Defering to user provided ordering can in some cases result in wider or narrower cases. The clear reason why deferring to that ordering isn't just to make type checking simpler shows up with |
I think implementation wise I'm convinced - I agree with Eric that a separate name would be better than Re names, I'm more in favour of single word names for brevity's sake - also I think
I think Other thoughts - I know I originally suggested
|
Other related naming ideas based on the same concept:
|
Disclosure: I haven't I got a bit lost in why Problems with resolving somehting like Wouldn't the problem not also exist for |
@CarliJoy Just to clarify, there are really two distinct options presented in terms of names: 1. Using a separate name:For the instance: What that name is matters less - hence various options discussed - as it doesn't affect behaviour in any way, and you can do import as whatever you want. I've used 2. Using type and objectFor the instance: This presents a quite difference interface - we've been discussing if this is a good interface or not mainly.
The Also we've discussed equivalences that come out of this, and whether these matter such as:
One thing I like about option 1 is that only one of these exists: |
@mikeshardmind I've been pondering a little thought experiment following our discussion on
This set me thinking - is there any precedent in python for banning single item type wrappers when multiple args is allowed? I discovered there is: from typing import Union
class A:
pass
Union[A] The above returns in pyright |
Pyright might not like it, but the Union itself allows it and in fact |
Oh really? That's interesting - so the type checker is more restrictive than the PEP. If the PEP for |
There are a couple issues being discussed on discourse that may effect the specific language required, specifically around Annotated and Literal. I'm advocating what I believe are sensible interpretations with those, but there are other interpretations which could cause needing special behavior for intersections with Annotated or Literal. If those come to pass before this is accepted, I'll need to do some extra work with specification here. I'll try and wrap the draft specifying this up by the end of the week for review since there have been no major detractions from it in a week. I'm a little behind on this compared to my intent to have the draft ready today, my apologies to the couple people who asked about my timeline on this out of band. I'm going to use the name |
As someone who hasn't followed the nuanced / in-the-weeds discussion too well: Would anyone be willing to summarize what might be wanted from a |
Also, my 2¢ on the import issue: IMO if folks are sensitive to typing import costs, then they should wrap type imports in |
Given the relative rarity of expected use cases for this (as evidenced by the fact that they don't exist yet 🙂), I'm 👎 on overloading the term I vote |
This was super helpful for understanding the issue. Thanks. I see now why you don't want to call it an "Intersection": because the word "Intersection" might give a mistaken expectation of commutativity of the arguments when they don't in fact commute. |
This contrasting commutativity between Unions vs the proposed spec for Intersections is making me think about a parallel to sets and dicts. E.g. The similarities may end there (I'm not sure), but seems like interesting food for thought. |
What about |
It's an interesting comparison - I think there's a good reason for the second point if we consider the annotations accessible in each: class A:
foo: int
bar: str
class B:
bar: float
class C(B,A):
pass
class D(A,B):
pass
print(A.__annotations__)
print(B.__annotations__)
print({**A.__annotations__, **B.__annotations__}) # C
print({**B.__annotations__, **A.__annotations__}) # D Here the results are representative of the attributes available in C and D |
I see where you're going with this idea - I think maybe it's better to avoid overlap with the existing Also
|
I'm now starting to wonder if I dismissed the |
Ok another idea bouncing off @vergenzt 's thought on class: This does suffer this issue I mentioned earlier of being a "harder sell". Also just noticed something: a: list[A] = list([A()])
a: all[A, B] /= all([A(), B()]) I think this probably means we don't want to go down this road - just wanted to follow the thought through to the end. |
I don't like this for stylistic reasons, but even beyond that, this probably wouldn't work, as this would get very muddy with runtime use of the type system. That expression is a list literal with types as it's items. Other quick name notes:
|
There's very little friction to adding a new type form, and not enough reason to overload the behavior of concatenate IMO. This could lead to hard to read type signatures if an intersection involves multiple callables that use concatenate themselves. |
I think I got the idea for this from a document on diagram structures, that I used it for multiple inheritance. It's also used by scss for multiple inheritance I believe.
I'm fine with this as a name but I do think it's a bit long - I wondered about I think it'd be worth having a call with anyone who wants in to brainstorm and pick a name if that's the only outstanding thing - it's not worth discussing for aeons, but I think a quick discussion wouldn't hurt. If |
Also I just went through the thread and gathered up all the possible names that have been discussed - I'm excluding any that overlap with any existing names here: Feel free to add more if people have more thoughts |
Thank you for the list mark. We use Once the the PEP is handed in, we can create a survey in Typing on discuss python. During the PEP discussion process we still can change the name of the operator. As mark pointed out, shortcuts like Please react with 👍🏻 or 👎🏻 if you like/dislike this proposal on the progress. |
Yeah, I think it's worth moving ahead with this in the way you describe - couple of points on the name I just wanted to mention, but happy to park the discussion. As a little thought experiment, I took a look at the existing names in the typing module, to see what the longest ones were. Longest I've found were I did also think maybe it's worth noting how likely it is the regular unordered intersection could exist. They could just use I also think the majority of people would just start their script As I say though, if |
This approach will not be possible to implement in a consistent manner without larger changes to the type system. Those same changes would allow just not having an ordered intersection. I'll be following up with both a summary of my findings on this, as well as details and references to other related discussions at a later point in time. |
@mikeshardmind Is there no way we can make a specification that allows for the special cases as they are currently? I'm judging from the other discussion that making the proposed changes to the way Any subclassing works won't be possible, but I don't think that necessarily means we have to scrap intersections altogether. So long as there is a defined way that it behaves can't we just make a special case in intersections for now? Perhaps I'm missing something more fundamental, but it seems to me that the above approach works for almost all the cases we've considered which seems like something still worth pursuing. |
Another thing I've been considering is that it seems we're trying to resolve a lot of cases, some of which may not come up very much in practice. If we can determine exactly what these cases are, is there a way we can deliberately exclude them from the specification? While it may not be the most satisfying theoretically, I think perhaps an approach where we create an intersection that satisfies 99% of use cases is definitely the most practical. Once we have this "minimum viable intersection", there's no reason why this couldn't be expanded by some future pep to be more complete when other issues in the typing system are resolved. |
There is no useful to users way to reconcile Any in an intersection with the current behavior type checkers have, and I consider the behavior of having Any as a base class to be fundamentally broken right now. I'm unwilling to put my name behind a proposal that makes this situation worse and harder to untangle in the future by layering on top of something that is in this state, and I'm unwilling to expend more effort on this proposal until a time when it can be done without these problems. The "best case" for accepting intersections in the current state with what we explored and with how this needs to be done in a way that the underlying problems can be fixed without people saying "but we're relying on the current behavior" (see TypeGuard vs TypeIs for a simpler example of how fixing behavior isn't trivial) right now would be to have intersections that were T & Any is T & Any, unordered, not special-cased, and that type checkers should know that T places a theoretical lower bound on T & Any, but for now they are free to treat it as if it were Any. |
That's fair enough, I respect your position on this. You've certainly put in a lot of effort already into trying to figure out a path forward with it - I'm quite keen for all our efforts not to go to waste, so I might at some point try and continue this forward in some form. Would you be happy if I use your draft for ideas on some parts in drafting a new PEP? I think there's a lot of good ideas here, and I like to think that there might be a way forward with it. |
As it stands, I believe that intersections are likely to cause more harm than good if accepted without addressing the underlying issues around how we handle gradual types within the type system. I won't go as far to say that you shouldn't feel free to use what was discussed intentionally in the open to try and steer a way to salvage the effort, but I think we're at a point where attempts to do so are potentially misguided and a product of the sunk cost fallacy, so I would urge caution in actually doing so. I think other productive things could be worked on that wouldn't be intersections, but which came about throughout these discussions and would help those who want them, such as I'm burnt out on this. I was hopeful we could find some sort of compromise. I've worked for that compromise and balancing of various concerns through everything from people telling me I was too concerned with theory, to not enough on theory, to that I was unwilling to compromise*, only to conclude that right now there seems to be too much working against this feature without other things that are at the heart of why compromise seemed needed in the first place being addressed first. * The only thing I've been unwilling to compromise on, is that I think we needed to be able to show that this would be a net positive for users, and would not make other typing features worse for users as a side effect. I think that's a pretty tame constraint when discussing being confident enough in a feature to propose it for standard inclusion. I won't be detailing all of the fatal flaws you'd need to overcome right now (see above about burnout, I don't want to retread everything to go pick out every single minor detail that must be addressed), but a brief handful of major ones until I've had time to just let things be for a while are below.
If after reading that, you still think that's worth pursuing right now rather than in a future where some of the issues have been resolved ahead of intersections, I wish you the best of luck with it. |
Thanks for the summary, that's very useful. I actually had a slightly new idea last night based on a lot of what we've discussed and the PEP that's already been formulated. I have a feeling it would address a lot of these issues, but it's good to have some of them gathered in one place so I can potentially address previous discussions in the PEP. It'll take a bit of time for me to flesh out this PEP, but I'm hopeful that there's a way forward - I wish you all the best. I'll close this thread as it's getting quite long and I think I'll probably make a fresh start with the new one. |
@mark-todd Hi, do you have any updates/are you currently working on this? I stumbled across https://github.com/python/typing/issues/213 today and went down the rabbithole into this repository, and while I do not have a solution to the issues presented here, @mikeshardmind's alternate approach made me think that something like a I would expect By tieing the meaning of the annotation to the actual inheritance mechanism in python, we could mostly get around any additional definitions (maybe there is some benefit in allowing for example
I currently have no idea about the internals of typecheckers; therefore I cannot judge whether this would be heavy on hot paths, but I think that checking whether a given object complies with a In my mind, it would make sense to allow the expression I realize that this issue is probably the wrong place to discuss this further, so what would be an appropriate place for this? Should I open a new issue on the python typing repository? I saw at least one similar suggestion there, but as far as I remember, it did not yield any significant results/pointed to the future implementation of intersections. |
Hi @hoelzeli - thanks for taking an interest in intersection! Just to provide some context, not much has happened on this since around April (although later than the current thread). Essentially we had a lot of discussions, went back and forth on a lot of the trade offs, but couldn't agree how to proceed in a coherent way. Personally, I was very much in favour of an "MROEquivalent" version of intersection - I had a go at making a PEP for essentially this here: #49 (There were several debates on the name - in the end people were in favour of calling it Intersection, but honestly I think if another name was less controversial that might be better). ...however, I understand there was one particular edge case involving A lot of the reasoning against taking the name Intersection was that someone might eventually come up with a formulation that would not depend on the MRO in this, and would be essentially be Unordered. It's also worth noting that for the Protocol case it isn't quite as straight forward as following the MRO, but I think we can use this to guide the formulation. In terms of how we proceed - I'd be keen to develop this into a feature people could use - perhaps we start off with the draft PEP I had before, but essentially "rebrand" it away from being an intersection as such? |
Oh also perhaps let's discuss here: #48 This is the later issue on this |
Note: This is An argument for the intentional non-implementation of pure intersections, and instead use all of what we worked out to implement an alternative option that satisfies all currently supportable use-cases, while leaving the door open to full intersections. This is not arguing to throw away everything we've worked on, but to use what we worked out to help our collective use cases, productively. I am actively inviting any constructive detraction to this, as if there is something these cannot do that are "must haves" to go this route would require determining if intersections are must-have to support those uses.
(edited) Note 2: Where you see
type[*Ts]
and object[*Ts]it may make more sense to have
OrderedIntersection` There's more details at the end on this nowWith Thanks to @DiscordLiz , @erictraut , @gvanrossum, @mark-todd, and @NeilGirdhar in alphabetical order (and no other order of particularity) for presenting each extremely strong, and opinionated arguments as well as some feedback that led to some self-reflection on why I felt so strongly about particular outcomes of this, I believe that there is a strong argument that without other constructs (such as negation, pick, and omit), there is not a good argument for the inclusion of the theoretical intersection, and that even with those, it might only make sense for that to work with structural subtyping.
However, that leaves an awkward question, how can we best handle the use cases that are certainly valid without them such that I can say we don’t need intersections? Have we really just wasted months of work? No.
The pitch: The order matters to us and to the behavior we want, so lets go with an ordered construct that's simpler and still does what we need.
type[*Ts]
andobject[*Ts]
First of all, my apologies to @mark-todd as when this was originally floated as an idea in a slightly different form by him in Discord, I stated the non-equivalence to intersections and that that should be separate, and we went back to working on intersections, but there’s a real question here I think we need to answer: Do these not solve every case we care about sufficiently? As originally stated, I would say that they didn't, but I believe I made a mistake in viewing these as something we should consider separately instead of as something which might do enough while being simpler. Exploring the idea further led to a slight tweak in which they appeared to me to solve all presented use cases that intersections could in the context of the current type system.
If it does, we can use the existing explorations to justify this construct instead and leave notes on rationale for anyone who wants to explore the difference between this and “true intersections” at a later date.
Background
The primary difference between all interpretations of Intersection which anyone involved has stated support for (For this I'm mostly looking at 4, 5, and the not fully fleshed out, hypothetical 6 from #31, #37) comes into play in things that are defined on multiple operands. Put another way, the unordered nature of Intersection as a parallel construct to Unions has some very unpleasant consequences. This became especially contentious when looking at what happens in non-disjoint cases and (implicitly non-disjoin) cases involving
Any
.I iterated on this several times trying to find a solution that satisfied all the complaints from multiple parties, but every time there was still some corner case that had to favor one use over another, and given that we wanted intersections to also take up an operator in the type system
&
the consequences for needing two different constructs to express one version or the other wasn’t sitting right with me. With some time spent reflecting on it, I realized my primary issue with favoring any solution over another was that by promoting one to use&
and another being relegated to requiring an import, we were necessarily putting some needs above others when there was no consensus on which behavior would be more common, or how this would reconcile withAny
.Enter, an alternative:
type[*Ts]
This is ordered. People want option 4 for type[Any & T]? Write type[Any, T]. Want option 5? Write type[T, Any]. This reflects existing behavior, including the ordered nature with subclassing of Any.
Semantics
type[*Ts]
does not create a new type at runtime. It creates a virtual type that has the bases*Ts
. The order is not semantically considered for nominal subtyping, but should be considered when handling diamond patterns. As a return type or annotation, the type returned must have the bases *Ts. The bases do not need to be in the same order, but the types must resolve as if they were.object[*Ts]
is the equivalent formulation for an instance of something of that type, reflecting the duality between type and object at both runtime and in the type system.This allows the following class definitions (and many non-trivial others)
to each work for
type[A, B]
or instances of these classes forobject[A, B]
.This also allows all of the current assumptions of nominal subtyping to apply to this construct.
In the case of protocols, each must be satisfied.
Because of this, it is enough at assignment to check that each type in
Ts
fortype[*Ts]
is satisfied, as any real object being passed would be checked when defined for LSP. LSP does not need to be checked for at time of assignment. At time of use, lookup should be done in order. By deferring to the fact that LSP violations would be detected elsewhere, the ordered search should be fine.Examples:
Does this really cover all of our use cases?
It covers all of mine, including potential future cases with type negation as a separate construct. I've given it a lot of thought, and it seems to cover every case I've seen floated for intersections, both in this repo and elsewhere when discussing intersections for python, but the trick with
type[A, B]
allowing the mro order to be B, A is mandatory for that to be the caseThere are potential cases that I would be blind to here, but we’ve limited the scope of this to non-disjoint overlap of types with differing types involved in resolution, and this allows handling of even that scope to a very large degree.
There are potentially other cases that can come up with higher kinded types. I have not explored in depth what this would mean with the inclusion of higher kinded types in python, because there is no strong prevailing push for these to use as a framework to explore. The temptation to view HKTs as in one of the languages that has them exists, but there’s no guarantee that any of these would reflect what is added to python’s type system. I believe that this composes properly with any reasonable future interpretation of higher kinded types, but will refrain from saying that HKTs could not re-introduce the need for a fully featured intersection.
What about type safety vs pragmatism and warnings?
Any
did not precede non-Any types in the order, or even to preclude it all together in a check, but this would be an optional additional constraint provided by a specific tool, not something that needs to be forbidden by the type system (And which would have negative effects on gradual typing to forbid outright)What about ergonomics and obviousness of behavior?
In the most commonly anticpated case, of disjoint types, the ordering won’t matter. The definition above only cares about the order when mro would change a resolved type. The complex cases where the ordering will matter should be obvious, as it will be the same as the required mro order when defining the type.
What about backwards compatibility with
type[T]
(singular type)?Behavior is unchanged, the definition here is compatible with and does not change the behavior of the current use of
type[T]
What about not using type and object here?
It's probably better to use a new
typing.OrderedIntersection
instead ofobject
for this, and allow it to compose withtype
with the existing semantics.That would make one of the examples using decorators look like this:
Having our cake and eating it too
It may be less satisfying for some here to have done all this work and to walk away not implementing intersections, but if we use this work to implement something that handles as many cases as this would, and which leaves the edge cases in the user's hands to pick the behavior they want, I think we walk away accomplishing everything we set out to do by adding intersections, by not adding intersections. If this works for everyone here, the work wasn't a waste, but informative of what the issues would be with an unordered intersection, and an exploration of what the scope of difference would be with different solutions. This remains available as a reference to those revisiting and (if we go with this) we should absolutely document for history "why not a true intersection".
The text was updated successfully, but these errors were encountered: