Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle "impossible" intersections #5

Closed
CarliJoy opened this issue Jul 24, 2023 · 186 comments
Closed

Handle "impossible" intersections #5

CarliJoy opened this issue Jul 24, 2023 · 186 comments

Comments

@CarliJoy
Copy link
Owner

CarliJoy commented Jul 24, 2023

This issue is about handling an "impossible" intersection by static type checkers. We consider two cases:

It must be transformed to Never

Arguments in favor:

It is up to the type checker to reduce it or not

Arguments in favor:

  • @erictraut argues that the conditions are complex and expensive.
  • @erictraut argues in the context of reduction and normalization.
  • @CarliJoy argues that there are problems caused by inconsistent typing definitions in the standard library.

(Please edit this comment. I haven't gone through the whole issue to add links to everyone's arguments.)

@mikeshardmind
Copy link
Collaborator

mikeshardmind commented Jul 24, 2023

I don't really think this needs its own issue. We can detect some cases where an intersection is impossible to satisfy, and that should result in Never, I'm not sure why this would be something requiring any amount of bikeshedding.

@CarliJoy
Copy link
Owner Author

CarliJoy commented Jul 24, 2023

@mikeshardmind I already wrote an algorithm for protocols in the current specification under https://github.com/CarliJoy/intersection_examples/blob/main/specification.rst#protocol-reduction

Could you adopt this algorithm to accompany this?

There is already a section with evaluation to never, we could add things there, too.

But in this case I would like to define also, that type checkers should deliver a good error msg. Otherwise finding these issues will be pain.

@mikeshardmind
Copy link
Collaborator

mikeshardmind commented Jul 24, 2023

I'm not in a position to do anything formal at the moment, but that said I already posted a partial list of "trivial" reductions in the discord server, I'll post them here as well.

Note, removal of Any was included prior to some people finding that controversial, don't read into it as a mandate that this approach is the resolution

Trivial Removal of Any
Any & P -> P
Any & P & Q -> P & Q

Subtype relationship example
(copied directly from draft, already included)
(Note: examples for Any above also follow this rule of more specific types taking precedence)

Intersection[str, Employee, Manager] is Intersection[str, Manager]

concrete types which fully satisfy protocols

SupportsIndex & int -> int

Reductions involving None

None & ConcreteType -> Never
None & ProtocolNoneDoesntSatisfy -> Never
None & ProtocolNoneSatisfies -> None (ie. bool)

Interaction with other Final types

(Note: None follows all of these rules, no special casing of None)

FT is a final type
NBFT is another type that isn't in FT's bases
BFT is a type that is in FT's bases
SFTP is a protocol (heh) that FT satisfies
NSFTP is a protocol that FT does not satisfy

FT & NBFT -> Never
FT & BFT -> FT
FT & NSFTP -> Never
FT & SFTP -> FT


In addition to this post, there's also if you have two types with incompatible interfaces -> Never as it would be impossible to satisfy both requirements requested by use of the intersection. I don't think artificially creating overloads works, for reasons above, complexity, and so on. It also behaves more like a Union at that point, so if you want a Union, just use a Union.

@CarliJoy
Copy link
Owner Author

CarliJoy commented Jul 24, 2023

I already included them in the specification. Did you have look at it?

@CarliJoy
Copy link
Owner Author

In addition to this post, there's also if you have two types with incompatible interfaces -> Never as it would be impossible to satisfy both requirements requested by use of the intersection. I don't think artificially creating overloads works, for reasons above, complexity, and so on. It also behaves more like a Union at that point, so if you want a Union, just use a Union.

That's where we disagree.
Creating overloads is actually one use case for intersections. I.e. think of Callable[[int], str] & Callable[[float], [str].
People already said that would expect an overload there.

Of course the special cases about staticmethods and classmethods are non trivial and need some evaluation.

@mikeshardmind
Copy link
Collaborator

mikeshardmind commented Jul 24, 2023

I would expect this to be an error. If I wanted that, the correct tool is a Union, not an intersection.

Callable[[int], str] | Callable[[float], str] has the behavior you are putting into the intersection here by creating overloads artifically.

@mikeshardmind
Copy link
Collaborator

The specification is also stricter on allowable type combinations than what is required by python. It is possible to have multiple inheritance from multiple builtin types without it being in error. This is why my post was specific to mention type compatibility as well as if a class was marked as being final

@CarliJoy
Copy link
Owner Author

The specification is also stricter on allowable type combinations than what is required by python. It is possible to have multiple inheritance from multiple builtin types without it being in error. This is why my post was specific to mention type compatibility as well as if a class was marked as being final

That is not correct, see https://github.com/CarliJoy/intersection_examples/blob/main/test_basetypes.py I didn't find any combination that is valid.

@mikeshardmind
Copy link
Collaborator

mikeshardmind commented Jul 24, 2023

I should have been so much more specific. There is no rule in the interpreter or at runtime that multiple inheritance from builtin classes is impossible. The specification implies this off it being true for the current types there, but one can imagine that it may not remain true. It should be sufficient, more succinct, and more permanently correct to say that A & B is not valid for concrete types if class(A, B) is not valid.

@mikeshardmind
Copy link
Collaborator

Anyhow, if you are resolving this by adding overloads, I'm no longer interested in intersections. I consider this to be antithetical to why I want an intersection in python. Everything else there's been disagreement about I could see multiple paths out of, but this would result in me just not using the feature ever. It would be more correct to just hand out protocols that specify exactly my intent at that point.

If I specify A & B where A is mine, and B is a library's, I expect that A & B are out-of-the-box not incompatible interfaces. Hiding the incompatibility with artificially inserted overloads that then have to be checked for an outcome when used breaks expectations and some of the reasons for Intersections. I want to be able to use this as if it is A and as if it is B. If I wanted to check which interface I had, I wouldn't need an intersection.

@CarliJoy
Copy link
Owner Author

CarliJoy commented Jul 24, 2023

Well the correct definition would be A & B is never if class (A, B) or class(B, A) raises TypeError.

There are two Problems with this definition:

  1. Performance: You need to check all permutations of all arguments
  2. It requires running the code what a type checker never should do (strongest arguemt)
  3. How should type checker not implemented in python know this? And there are even multiple implementations of python next to CPython. So maybe in one of this implementation class (int, float) maybe even works. So typing will now depend on the interpreter?

@mikeshardmind
Copy link
Collaborator

The type error here is predictable and arises from a base class layout conflict. This can be statically determined and does not require running the code.

All permutations of arguments should be checked, it shouldn't be possible to create an impossible requirement.

Realistically, though, this goes back to what was brought up in discord by @DiscordLiz that even maintainers of type checkers thought intersections on anything other than protocols would be too complex.

@DiscordLiz
Copy link
Collaborator

pragmatically speaking, it's worth just disallowing multiple inheritance of builtin types. The manner in which they are implemented would need to change for this to no longer be true, and other implementations could allow it.

you do need to verify that a combination of types is possible, which would be checking that there exists at least 1 valid permutation of bases.

Really should just have intersections be for fully compatible protocols only. It's simple, and and solves a real problem. All of these edge cases come in from things nobody should ever need to care about in practice, but for it to be defined robustly and allow more than protocols, do.

@DiscordLiz
Copy link
Collaborator

DiscordLiz commented Jul 24, 2023

I read through the current specification, and I need to state more than just simple agreement that the overloads are a mistake. This creates a situation where intersections do something that could never be valid on real classes for protocols, and only for protocols. Protocols are already only a definition structurally, I would expect any incompatibility to be a mistake on the part of whoever requested that something works with both.

You can't hack this together for real classes, because it would become an LSP violation. This breaks symmetry between structural and nominal typing.

@erictraut
Copy link
Collaborator

erictraut commented Jul 25, 2023

A few thoughts...

Regarding overloads, these are not the same as intersections. The two should not be conflated. Overloads have complex signature matching rules and are order dependent. Intersections have simple rules based on set theory, and are order independent. I would be surprised if the word "overload" appeared anywhere within this PEP.

The subtyping rules for intersections should build upon existing subtyping rules within the type system. Type checkers already have rules in place for evaluating subtyping relationships for functions and overloads. The same is true for protocols, nominal classes, generics, etc. This PEP should not attempt to define these existing rules. It should simply explain how intersections work with respect to these existing subtyping rules. This will simplify the PEP and help ensure that it composes well with existing type features. If the PEP attempts to define all of the behaviors in terms of specific cases ("if an overload is involved..." or "if a protocol is intersected with a nominal class..." or "if a nominal class is intersected with Any"), the complexity of the spec will explode and inconsistencies and omissions will result. I therefore encourage the authors to stick to defining intersection rules that leverage existing generalized rules within the type system.

Regarding "impossible" intersections, my recommendation is to leave it up to each type checker to decide whether to detect such impossible conditions and where such conditions should be flagged as an error (e.g. at the point where the type is "spelled" as an annotation, where it's evaluated, or where it's used to evaluate type compatibility). I don't think the PEP should mandate such errors — or even express an opinion as to whether such errors should be reported. There are many reasons in Python why an intersection type may not be realizable without generating type checking or runtime errors. Some of these conditions are quite expensive to check for. Here's a partial list off the top of my head:

  • Classes that are explicitly marked @final
  • Certain builtin classes that are implicitly @final
  • Classes that generate MRO conflicts when combined
  • Classes that generate metaclass conflicts when combined
  • Conflicts involving __slots__
  • Certain builtins that cannot be combined
  • Generic classes whose specialized type arguments conflict
  • Classes that define the same attribute with conflicting types — those that result in a Never when intersected

Type checkers today detect a subset of these conditions when evaluating a class declaration with multiple base classes, but many of these conditions are not flagged as errors. Here are some examples:

class A(int, str): ... # Runtime error, but not currently flagged as error by mypy or pyright

class B(list[str], Sequence[int]): ... # Flagged as error by pyright in strict mode, no error in mypy

class P1(Protocol):
    a: str

class P2(Protocol):
    a: int

class C(P1, P2): # Both mypy and pyright generate an error here
    ...

In practice, I think that we'll find that most of these "impossible intersection" cases occur very rarely (if ever) in real code. If some of these cases do come up more often, users will report these cases, and type checker authors can decide if it makes sense (based on tradeoffs like implementation complexity and runtime costs) to check for that specific condition. So my preference is for the PEP not to say anything about "impossible intersections" or simply state that "type checkers may (but are not obligated to) check for certain conditions where an intersection type cannot be realized at runtime".

@DiscordLiz
Copy link
Collaborator

I would be surprised if the word "overload" appeared anywhere within this PEP. [...] Some of these conditions are quite expensive to check for. Here's a partial list off the top of my head: [...] Classes that define the same attribute with conflicting types — those that result in a Never when intersected

The Current draft has specified behavior for this case which multiple people believe to be unsound. The draft specifies this results in overloads of each conflicting signature for Protocols, and Unions of types for TypedDict fields rather than being Never

@NeilGirdhar
Copy link
Collaborator

NeilGirdhar commented Jul 25, 2023

Requesting an impossible intersection should be treated as an error.

I agree that requesting an impossible intersection should be treated as an error (with each type checker deciding what that means as Eric says), but there are certain intersections that are absolutely possible.

First, in your example, you have one method that's static and one that's an ordinary method. Python typing doesn't have good support for the descriptor protocol yet so this isn't going to be testable to create a subtype method that supports both. Let's not look at these.

In general however, you can often create a subtype method that satisfy any supertype methods provided that the subtype method returns a type that's the intersection of all its superclasses. That's what I illustrated in the other thread. Disparate parameter lists can always be accommodated through overloads.

So, very few things are "impossible" except when that intersection is empty—for example for different return types that cannot be subclassed.

@mikeshardmind
Copy link
Collaborator

To what end is it worth creating a rule that this must be valid? Pragmatism and the fact that there Are so many edge cases on this that you are trying to handwave away is all I need to say this is a horrific idea. Solve the edge cases in a way that satisfies a real need, currently, this is just a pit for type checker performance, and a way for subtle incompatibilities to go unnoticed.

@NeilGirdhar
Copy link
Collaborator

To what end is it worth creating a rule that this must be valid?

Because intersections can be created synthetically in generic types. If you block all of these things, you're going to severely limit the type system.

What's the reason for blocking perfectly valid code?

@mikeshardmind
Copy link
Collaborator

To what end is it worth creating a rule that this must be valid?

Because intersections can be created synthetically in generic types. If you block all of these things, you're going to severely limit the type system.

What's the reason for blocking perfectly valid code?

Well, if you had read the whole message that's from you'd have your answer. There are edge cases that you do not have an answer for, and for which the current proposal can be shown to be unsound. It is possible to start with what is currently possible to do correctly, and expand incrementally as there is demonstrated need for it.

@NeilGirdhar
Copy link
Collaborator

There are edge cases that you do not have an answer for, and for which the current proposal can be shown to be unsound. It is possible to start with what is currently possible to do correctly, and expand incrementally as there is demonstrated need for it.

That's not a good reason for blocking possible intersections. I'm with Eric: you should leave this up to type checkers.

@mikeshardmind
Copy link
Collaborator

That's not a good reason for blocking possible intersections. I'm with Eric: you should leave this up to type checkers.

Okay, then leave it up to type checkers. The current spec instead mandates a synthetically created overload and this is what I took issue with, and what you attempted to demonstrate was "fine actually".

@NeilGirdhar
Copy link
Collaborator

That's not a good reason for blocking possible intersections. I'm with Eric: you should leave this up to type checkers.

Okay, then leave it up to type checkers. The current spec instead mandates a synthetically created overload and this is what I took issue with, and what you attempted to demonstrate was "fine actually".

Sorry, that's not what I was saying. What I was saying was that trying to block "incompatible interfaces" would end up blocking intersections that were actually possible, and I was presenting the overloads as an example of a possible intersection.

@mikeshardmind
Copy link
Collaborator

Sorry, that's not what I was saying. What I was saying was that trying to block "incompatible interfaces" would end up blocking intersections that were actually possible, and I was presenting the overloads as an example of a possible intersection.

Hmm. I think the mix of overloads being prescribed in the draft and showing use that didn't fit edge cases may have led to me reading that with less understanding of your point of view. I think we fully agree that compatible things should not be blocked if at all reasonably possible, and only disagreed on the specific example given being reasonably possible.

I think I largely agree with @erictraut's comment here: #3 (comment) that merely defining the rules of the intersection should be enough. I don't think that overloads are a satisfactory answer to an intersection of two methods with conflicting signatures, but in defining it as just "fields are intersected as well", it allows anything which can be determined as satisfactory to be allowed, and doesn't require an answer to that being satisfactory to be immediately apparent

@NeilGirdhar
Copy link
Collaborator

I think we fully agree that compatible things should not be blocked if at all reasonably possible, and only disagreed on the specific example given being reasonably possible.

Fair enough. It sounded like you were suggesting an incremental approach:

It is possible to start with what is currently possible to do correctly, and expand incrementally as there is demonstrated need for it.

I don't want to go through all of these issues that will sit unsolved for years. Better to not induce them to be created in the first place.

and doesn't require an answer to that being satisfactory to be immediately apparent

Right.

@mikeshardmind
Copy link
Collaborator

Fair enough. It sounded like you were suggesting an incremental approach:

Well, to be fair, I was, but not in a way that I believed required this to become a progress blocker or extra rules to specifically block something valid. I think Eric's comments made the case far more elegantly than I did, but it is my belief that if the rules we pick are fundamentally sound, then it doesn't matter if we do not have all the answers, but if we have special cases for things, we actually do need all the answers that are relevant to those special cases, or the rules become ambiguous.

Because I was operating with the lens of the current draft specifying a specific behavior for how to solve an intersection, rather than only defining the intersection and whether something is considered a sound intersection based on general rules, I was attempting to show that the prescribed behavior for a specific case was not able to completely solve all instances of that case, and was therefore partially ambiguous, and better not to have that included as a result. Even simpler though is what Eric said, just define it correctly from base rules and let implementations handle the implementation of the definition.

@gandhis1
Copy link

While I understand the need for practicality to determine what rules are checked or not, I'm not thrilled at the idea of allowing for even more areas where the various type checkers can have inconsistent errors. This freedom is beneficial for type checker maintainers but a burden to type annotation maintainers. Libraries often have to choose a single type checker to support or attempt to support both and regularly run into conflicts (as I have witnessed with pandas-stubs).

I don't want to derail this thread with a debate on that more general issue, but simply want to offer a counterpoint as to why mandating a consistent set of errors that all maintainers can agree on actually may be desirable. If something will likely end up being non-performant or tedious to implement, perhaps now is the time to refine that.

@CarliJoy
Copy link
Owner Author

@gandhis1 thanks for laying out this potential conflict. I am sure we can find a solution the fits to both target groups.
A PEP that doesn't limit type checker maintainers in their implementation but is still well enough defined to ensure compatible results of different checkers.

Seeing the different opinions about how strict a type checker should be i.e. when it comes to differing function parameter definition I am also fearing we would increase the problem of type checkers giving different results.
Maybe one type checker implements a strict understanding of intersection and another a flexible.
A maintainer of type annotations for code that is valid python but wasn't well designed (i.e. not 100% following the LISP completely) could run in serious troubles.
Even worse would be if one type checker is strict with one aspect and the other in another.

@erictraut I fully understand your concern about the final PEP becoming to complex and unsound. Still don't you think that formulating that target what should evaluate to Never could be formulated without falling into a logical/complex trap?

@mikeshardmind I am not an expert so feel free to come up changes of the draft :-)

@mikeshardmind
Copy link
Collaborator

@CarliJoy We have 1 case for sure where we can't leave it to type checkers and need to mandate resolution to Never because of the existence of typing.assert_never If it is left up to type checkers and not mandated for this case, then we aren't mandating compatibility with existing type system features.

That is very much still under discussion.

Is it? I don't think anyone came up with a counter to this point at all when it was presented.

@diabolo-dan
Copy link

Is it? I don't think anyone came up with a counter to this point at all when it was presented.

@randolf-scholz was countering this point as recently as 2 days ago. It seems premature to believe that he's stopped posting because he's been persuaded to agree with you, rather than hasn't had time to continue for now.

I haven't personally addressed it (yet) because I have limited time (and emotional energy), and I think that we probably need to make progress on other parts of the discussion first before we can fruitful progress there.

@mikeshardmind
Copy link
Collaborator

mikeshardmind commented Aug 30, 2023

was countering this point as recently as 2 days ago. It seems premature to believe that he's stopped posting because he's been persuaded to agree with you, rather than hasn't had time to continue for now.

I don't see anything he said as countering it. He said an example was begging the question when it wasn't; The example was specifically demonstrating how typing.assert_never would no longer function as documented and purposed if we don't mandate this. After this was pointed out, he moved on to other things without addressing that point.

I haven't personally addressed it (yet) because I have limited time (and emotional energy), and I think that we probably need to make progress on other parts of the discussion first before we can fruitful progress there.

Any direction in particular? Seriously, it feels like we're going in circles and not actually wrapping up anything, and I actually think many other open questions are predicated on whether or not we are mandating reductions, and if so, in what cases. This has specific effects on if #17 remains needed. Specifically, #17 is no longer needed, only nice, if incompatible intersections are always detected in the intersection themselves.

It also has specific effects on how we need to handle the next steps for the PEP. If we are mandating reductions, we need to write the rules for those reductions, if we aren't, we don't. If we are, we also need to loop in type checkers on the rationale for why we are and open that line of discussion about the tradeoffs this proposes for them.

In either case, we need to then write the formal language that describes the prescribed behavior.

@randolf-scholz
Copy link

randolf-scholz commented Aug 30, 2023

I don't see anything he said as countering it. He said an example was begging the question when it wasn't; The example was specifically demonstrating how typing.assert_never would no longer function as documented and purposed if we don't mandate this. After this was pointed out, he moved on to other things without addressing that point.

I stand by this point, because the whole example was built on the presumption that A&B must be set equal to Never, when my position is that A&B should be defined purely by its subtyping rules. This is crucial when we take serious the reality that nominal subtypes of A and B can exist even when A and B are incompatible.

The reasoning that it is required for type hinting code with C-implementation I find unconvincing, since, in that case, if the library that provides A makes any changes of A's interface the code is plausibly no longer type-safe, regardless of whether or not A is intersected with anything in the function type-hint.

The discussions here are straining and take a lot of time, I would also propose to instead focus on other issues and continue writing a rough draft. Each party can independently add a proposal regarding the issue of incompatible intersections, and we should try to get some external opinion on that matter and also check what other languages like TypeScript do about it.

@mikeshardmind
Copy link
Collaborator

mikeshardmind commented Aug 30, 2023

I stand by this point, because the whole example was built on the presumption that A&B must be set equal to Never, when my position is that A&B should be defined purely by its subtyping rules.

Which then breaks use of typing.assert_never, something which still hasn't been addressed. Saying "Well A&B should just be handled by it's subtyping rules" isn't useful in the context of this issue. typing.assert_never is a specific request for "is this type no longer inhabitable?" Never can always be used in place of a type, but that isn't what assert_never is asking. assert_never is used to assert that code based on the exhaustion of valid types (uninhabitabile, is only Never) has exhausted all possible valid types and therefore unreachable. Because of the existence of this, we need the ability to know if A&B is uninhabited (ie. is Never)

See the documentation on this for details, but the relevant line:

For a call to assert_never to pass type checking, the inferred type of the argument passed in must be the bottom type, Never, and nothing else.

Nobody has actually said anything that squares this issue or outright says they are fine breaking typing.assert_never.

And if we aren't breaking existing features to add intersections, does this not then mean that we already need a means to handle the reductions people are claiming are going to be too complex?

This is crucial when we take serious the reality that nominal subtypes of A and B can exist even when A and B are incompatible.

This isn't actually possible under existing type-checking rules. If A and B are incompatible, it isn't legal to subclass from both. Your ability to lie to type checkers remains intact but is irrelevant for discussing a specification of what is correct.

@DiscordLiz
Copy link
Collaborator

The discussions here are straining and take a lot of time, I would also propose to instead focus on other issues and continue writing a rough draft. Each party can independently add a proposal regarding the issue of incompatible intersections, and we should try to get some external opinion on that matter and also check what other languages like TypeScript do about it.

If people would stop breaking the existing rules we might take a lot less time. How much time has been spent by people still trying to say that they can create a class that breaks the rules of static typing and then use that in an example? Can we start with getting everyone on the same page about what is and isn't correct with the existing typing rules so we stop wasting time on this?

You're still doing it even now after it's been pointed out:

I stand by this point, because the whole example was built on the presumption that A&B must be set equal to Never, when my position is that A&B should be defined purely by its subtyping rules. This is crucial when we take serious the reality that nominal subtypes of A and B can exist even when A and B are incompatible.

@randolf-scholz
Copy link

randolf-scholz commented Aug 30, 2023

@DiscordLiz It should be clear by now that people have different ideas of what the existing rules are. For instance, I consider most of the examples given in this thread for supposedly incompatible classes to not actually be examples of incompatible classes, because I consider the following is to be type-safe [mypy-play]:

from typing import Never

class A:
    def foo(self, x: int) -> None: ...
    
class B(A):
    foo: Never   # __getattr__(B, "foo") -> raise NotImplementedError
    
x: A = B()  # type checks 
reveal_type(B.foo)  # <nothing>

@DiscordLiz
Copy link
Collaborator

DiscordLiz commented Aug 30, 2023

@randolf-scholz I believe mypy is incorrect about allowing this as a subtype, and I'll be filling an issue later. There's non-obvious conflicts between LSP and strictly structural subtyping which have not been specified since the addition of Never as something which can be used in more places directly than just a return with NoReturn (also a no-pep change....), and static typing currently is supposed to enforce multiple concepts of typing, not just structural.

However, even if this is correct, all you've shown is that you're "Solving" an incompatibility by reducing to Never, which at some point in an intersection may also be the only correct way to interpret it, and have not stated why we should not check if this is the only available solution when people are explicitly asking for "is there any possible other type left" with typing.assert_never

@DiscordLiz
Copy link
Collaborator

DiscordLiz commented Aug 30, 2023

It's also worth noting Never isn't a subtype of all other types and treating it as such is incorrect. http://bit.ly/python-subtyping
the bottom type in a type system is just a way to indicate that a type specification is uninhabited.

@CarliJoy
Copy link
Owner Author

@randolf-scholz
Hmm interesting example.
If change Never to anything else MyPy will complain

main.py:7: error: Incompatible types in assignment (expression has type "int", base class "A" defined the type as "Callable[[A, int], None]") [assignment]

I feel like the people with proper background on the formal theory need to discuss some flaws within the existing Python typing system first before we can discuss further here.
So lets wait for this issue to settle.

I also would highly recommend that we base our PEP on formal theory. It seems that it was neglected quite a lot in the past which leads to inconsistencies. I rather not want to create PEP that increases them.

Of course being a dynamic language, you can do everything that violates the LSP principle. The goal of introducing formal typing to your code base is that this is actually not happening.

I don't see a real life use case in which we need to allow/consider behaviour that brakes LSP. This would introduce problems on so many other fronts...

The only reason why type checkers can determine LSP incomptible subclassing is because they know that order matters and the last definition is picked.
But as want and need Intersections to be independent of order this can't be done here.

Consider

from typing import Never, Any, reveal_type

class A:
    def foo(self, x: int) -> None: ...
    
class B:
    foo: int

def foo(val: Any) -> None:
    reveal_type(val)
    if isinstance(val, A) and isinstance(val, B):
        reveal_type(val)  # should be A&B -> currently not printed at all
        reveal_type(val.foo)  # should be never ?  -> currently not printed all

Mypy Play

The isinstance check is fulfilled in runtime by both C(A,B) and C(B,A) but the results differ for both. So it is impossible for a type checker to determine the result, therefore it only can raise an error here.

So I don't see any reason why we should consider this in this PEP.

@DiscordLiz and @mikeshardmind I really have no good understanding of formal typing theory, bottom and top types etc.
But if we say val.foo evaluates to Never, it seems logical to me, that if we define a

class C(A,B):
    foo: Never

is valid, because it is actually the result that would expect from A&B. Probably missing something here...
If you have time and energy, could you explain for a dummy like me why this logic is flawed?

@randolf-scholz
Copy link

randolf-scholz commented Aug 31, 2023

@CarliJoy This is not surprising that your example errors, since int is not a subtype of Callable[[A, int], None].

The reason nothing is printed for the last reveal_type is because mypy deems that A and B are incompatible, so the code after the if-clause becomes unreachable, and reveal_type within unreachable sections are ignored. (mypy-play with --warn-unreachable).

Now, when we look in the type hierarchy, we have: Never <: Callable[..., Never] <: Callable[[T1, ..., Tn], R], for any types Tₖ and R. So for instance, it would be considered type safe to have

# Example ①
class A:
    def foo(self, x: int) -> None: ...
    
class B:
    def foo(self, x: int) -> Never: raise NotImplementedError

With this, no unreachable errors are emitted (mypy-play ①). However, regarding LSP, there are versions of LSP that the B above violates, in particular (LSP-exception):

(LSP-exception): New exceptions cannot be thrown by the methods in the subtype, except if they are subtypes of exceptions thrown by the methods of the supertype. (wikipedia)

The point I am making is that, apparently, the above is not part of the python typing ecosystem. Therefore, if we accept examples ① as valid type-safe subtyping, then it would have to accept that example ② is valid, type-safe subtyping (mypy-play ②).

# example ②
class A:
    @property
    def foo(self) -> int: return 0
    
class B:
    @property
    def foo(self) -> Never: raise NotImplementedError

and, as a final logical conclusion also example ③

# example ③
class A:
    def foo(self) -> int: return 0

class B:
    foo: Never

    def __getattribute__(self, key):
        if key == "foo":
            raise NotImplementedError
        return super().__getattribute__(key)

The reason is the following interpretation: If we define / annotate A.foo as something, say Callable[[int], None], then this defines a contract that looking up the attribute 'foo' of A produces an object of that type. Ergo, the following should be considered quasi equivalent:

# example ④
class A:
    foo: T

class A:
     @overload
     def __getattribute__(self, key: Literal["foo"]) -> T: ...  # promise that getattr(A, "foo") yields T.
     @overload
     def __getattribute__(self, key: str) -> Any: ...  # actual signature of object.__getattribute__

After all, the interpreter converts A.foo to a __getattribute__ lookup. But since Callable[Literal["foo"], Never] <: Callable[Literal["foo"], T] it appears that the logical consequence of accepting ① is that we also have to accept ② and ③.

Notably, this means that subclasses can choose to for example raise NotImplementedError instead of actually implementing a given interface. What type-checking still guarantees that the code will not raise any TypeError. Whether LSP is violated depends on what variant of LSP python typing tries to adhere to. Example ① indicates that (LSP-exception) is not part of it.

@diabolo-dan
Copy link

diabolo-dan commented Aug 31, 2023

        reveal_type(val.foo)  # should be never ?  -> currently not printed all

I would argue it should be

int & Callable[[int], None]*
  • Technically this should be an anonymous protocol callable that matches the parameter types.

This is implementable in python:

class CallableInt(int):
    def __call__(self, x: int) -> None: ...

It's relatively unlikely that this is what was intended by the user, but it's not impossible, so a type checker shouldn't really be considering it unreachable.

@randolf-scholz
Copy link

randolf-scholz commented Aug 31, 2023

One possible resolution: Behavioral LSP states that

  • (LSP-precondition): Preconditions cannot be strengthened in the subtype.
  • (LSP-postcondition): Postconditions cannot be weakened in the subtype.

For functions, the contra-variance of arguments and co-variance of return types can be equivalently described as a set of pre-conditions and post-conditions. In particular, the covariance of the return type of Callable[..., R] is equivalent to the post-condition isinstance(return_value, R).

However, when we try to substitute Never for R there is a problem: the equivalence to a post-condition is no longer true, since Never is supposed to be uninhabited (no instances of the type can exists). Therefore, the post-condition must fail. Indeed, the prior interpretation of Never as NoReturn means that the post-condition can actually never even be reached.

This suggests one could consider a modified subtyping rule for functions, which one may refer to:

(no-substitute-never) Callable[...., R] <: Callable[..., S] if and only if either R==S==Never or Never ≨ R <: S, and similarly for attributes.

I.e. if S≠Never we assume subtyping includes the post-condition and if S=Never it doesn't. This is a weak version of (LSP-exception) that forbids to substitute a function with another function that unconditionally raises if the original didn't. It would make example ① void, and, consequently, the whole argumentative chain from ① - ④ would fall apart. In particular, this would guarantee that B.__getattribute__("specific_key") is not allowed to raise unconditionally if the parent class didn't for "specific_key".

@DiscordLiz @mikeshardmind What do you think about this?

@mikeshardmind
Copy link
Collaborator

mikeshardmind commented Aug 31, 2023

@randolf-scholz I think this is at the crux of the issue, but resolving that Never isn't a subtype of other types over in the linked typing issue leads to the same result, so you may want to add your details about how Never currently interacts with that in that thread if you have the time to do so.

It's worth noting that the set-theoretical definition being argued over in that thread is more general, but reaches the exact same conclusion of the limited modified subtyping rule you have there

@mikeshardmind
Copy link
Collaborator

mikeshardmind commented Aug 31, 2023

@CarliJoy I think @DiscordLiz has provided a much clearer explanation of why even though A.foo & B.foo = Never, that doesn't mean C.foo can be defined to depend on Never over here I also think you are correct when you said that discussion may need to wrap up before this one.

Some code to go with the prose she wrote:

int # singular static type
class A: ...  # singular static type
class B: ... # also a singular static type
ABAlias = A & B  # *also* a singular static type
Any  # This is the top type, and could be anything. This is not a singular type.
Never  # This is the bottom type, and can't be anything. 

class SomeClass:
    foo: ABAlias  # This is a type specification, spelled with a singular type, but this is not a singular type!
    # As an annotation, this allows the set of all potential types which are also A & B
    # and of which until further inference happens, is only safe to use what is provided by A&B.
    # Never doesn't provide A&B, so allowing this would violate LSP, even if Never was possible to exist!

def x() -> Never:  # Never isn't a singular type, it's the absence of a type being returned at all!
def y(foo: Never):  # Because There are no types which are never, this code isn't callable.

@DiscordLiz
Copy link
Collaborator

DiscordLiz commented Aug 31, 2023

@randolf-scholz I think you hit the nail right on the head for the current issues with Never and LSP. This is a good formulation and good example of the desirable properties of LSP that are broken with the currently accepted definitions. My only qualm with layering on an exception instead of changing the definition, is that it appears to me that they have the same effects on all currently valid code. With this being the case, changing the rules to match those of the definitions which don't have the bottom type participate in subtyping freely gives us what is correct in more things which could be added to python later, as there are other features of type systems which are provably correct with those definitions, including rules for how they compose, that we can freely take in the future as they are proven already if we take the definition for the bottom type from the set-theoretic view.

@CarliJoy It's basically that if a type relies on the existence of Never (needs an instance of Never to exist for it to exist), the type itself can't exist. I think we need to wait on the outcome of the wider typing discussion on Never, because a definitive statement there could remove this from being the concern of this PEP or not depending on how strongly worded the definitive statement is.

And @CarliJoy following up on @mikeshardmind's above example there's also A | B, which is a set of static types. In an annotation, the annotation allows the set of all possible types that can claim to be members of {A, B} by substitutability, which is a possibly wider set of types defined by a subtyping relation to a set of types. I believe you [Michael] were aware of this given your wording in the discussion in python/typing, but it's worth mentioning for completeness.

@CarliJoy
Copy link
Owner Author

CarliJoy commented Sep 1, 2023

Thank you @DiscordLiz and @mikeshardmind . I understand it now a bit better.

@NeilGirdhar
Copy link
Collaborator

NeilGirdhar commented Sep 1, 2023

that forbids to substitute a function with another function that unconditionally raises if the original didn't.

I also don't see what this has to do with LSP-exception since you don't know whether the superclass method raised the exception that the child class method is unconditionally raising. I also don't see why there's anything wrong with subclasses raising unconditionally where their superclasses don't. That already happens in ordinary code.

Is there any progress towards resolving the top level topic, which is "should reduction to never be mandated?"

Yes, if you don't mandate reduction, then assert_never won't do what you want it to. Is there any evidence that many people will use it to test whether two interfaces are entirely compatible? What seems more like is that people will inherit from two interfaces and write tests for the various functions, and in those tests, it will come up that there are type errors. That is tested without mandated reduction.

I think ultimately, if you want to mandate this behaviour, then given that there are significant runtime and implementation costs, those will have to be justified with real world examples of bugs that would be caught by not reducing—in order to be convincing to the people reviewing the PEP.

@mikeshardmind
Copy link
Collaborator

mikeshardmind commented Sep 1, 2023

@NeilGirdhar

Yes, if you don't mandate reduction, then assert_never won't do what you want it to. Is there any evidence that many people will use it to test whether two interfaces are entirely compatible? What seems more like is that people will inherit from two interfaces and write tests for the various functions, and in those tests, it will come up that there are type errors. That is tested without mandated reduction.

It doesn't actually matter if we think they will or not. It's been confirmed that if intersections get added, they may end up in inference inside of type checker internals. We can't build this to be knowingly incompatible with other typing features that are supposed to work everywhere, especially with this being the only one of those with actual runtime behavior associated. assert_never is not for writing tests, it's for exhaustive matching against types in runtime code, and getting the type checker to confirm you actually exhausted all possible types in your handling. Getting this wrong has runtime impact as it causes assertion errors. (and worse, silent failures in runtime code using -O)

However

Is there any progress towards resolving the top level topic, which is "should reduction to never be mandated?"

Somewhat, and related to the above, this and #17 are both pending discussion here python/typing#1458

There are 5 outcomes for that that I can forsee, and 3 of them leave us able to restrict mandating when intersections need to be resolved from "a lot of places" to "only when the existence of Never is specifically checked for" (Which is currently only typing.assert_never) and remove the need for Not as a requirement, and instead turn it into a nice addition. 1 of the remaining 2 options also makes the rules for Not immediately obvious, and therefore non-expensive, both in terms of PEP language and type implementation.

@DiscordLiz Also came up with a slightly broader set than this which would be needed in the remaining 1 case here

We're quite close to being able to close much of the controversial issues with intersections and focus on picking how much needs mandating (And presenting good arguments for it) based on clearing up a couple of foundational details where some behaviors aren't well defined in python PEPs

I think ultimately, if you want to mandate this behaviour, then given that there are significant runtime and implementation costs, those will have to be justified with real world examples of bugs that would be caught by not reducing—in order to be convincing to the people reviewing the PEP.

Will do, and I'm making a conscious effort to make sure we mandate the bare minimum actually necessary in light of these costs., as well as opportunity costs where this interacts with potential future features. If it ends up the assert_never case is the only case, type checkers will almost never encounter it and can defer the reduction until reaching an intersection passed to assert_never, rather than needing to reduce it eagerly. This makes the cost directly proportional to the number of places user code specifically is asserting that something is Never and that the type checker should check this is true, and that the intersection survived other type inference removals during the user exhausting types. (otherwise it won't be part of the type passed in)

@NeilGirdhar
Copy link
Collaborator

NeilGirdhar commented Sep 1, 2023

We're quite close to being able to close much of the controversial issues with intersections and focus on picking how much needs mandating (And presenting good arguments for it) based on clearing up a couple of foundational details where some behaviors aren't well defined in python PEPs

That's good to hear.

If it ends up the assert_never case is the only case, type checkers will almost never encounter it and can defer the reduction until reaching an intersection passed to assert_never, rather than needing to reduce it eagerly.

Sounds good. (And really they don't even need to get assert_never right in my opinion. It would just mean that you can't use assert_never with intersections and expect that type checkers are checking that every method of the interface is compatible.)

Somewhat, and related to the above, this and #17 are both pending discussion here python/typing#1458

I'm not sure exactly what you are looking for to actually resolve that issue, but hopefully it is resolved to your satisfaction soon.

@mikeshardmind
Copy link
Collaborator

Actually, I'm going to go ahead and preemptively close #17. If it ends up needing to be reopened, I'll reopen it, but it seems the least likely case after we cleared up some other discrepencies.

I'm not sure exactly what you are looking for to actually resolve that issue, but hopefully it is resolved to your satisfaction soon.

@NeilGirdhar We just need a clear answer from type checkers on "Is it intended that this is allowed, and are we sure this is the behavior we want to go with for handling Never and subtyping given a few issues with that"

Right now the answer to the first half of that is yes, the second we don't have an answer to yet. There's been a few things shown where mypy has had user complaints with the current behavior actually not being expected or causing other issues, and there's a statement that we need to investigate the impact of actually changing this, which I'm hoping to have the time to do later today myself.

If the answer is yes to both, then we can start from what Liz had as a very slim set of cases where type checkers would need to.

If there is a willingness to make a slight modification on Never's behavior on the second, (specifically where we can't have attributes be Never (or replace a method with an attribute that is Never) and a class be valid, but we can have parameters be Never and only error if initialized), then we only need to check in typing.assert_never and we don't even need to mandate it in that case anymore, only define the reductions and leave it to type checkers (it can no longer leak out of library code poorly in that case).

There are a few other outcomes on the second question possible, but that's the most likely one if there is going to be a change in my opinion as it is the most minimal in its potential disruption of existing code while addressing the "safe substitution" concern.

In the mean time, I've started summarizing all of what we've discussed so that in any outcome we have a nice record of what was and wasn't considered for the PEP, not sitting idle while waiting on the upstream outcome.

@mikeshardmind
Copy link
Collaborator

mikeshardmind commented Sep 8, 2023

What was shunted over to python/typing has resulted in Eric saying we should formalize the existing type system prior to adding intersections. That said, there's a simpler case to show the problem presented here with Never without never, and a corresponding mypy issue going back to 2017.

In the case of

class A:
    foo: int

foo should be invariant. This being resolved in type-checking implementations preempts the issue in this case from interacting with Never as str & int would not be invariantly compatible with str

Demonstrating this as an issue without Never:

class A:
    a: int
    
class B(A):
    a: bool

def foo(a: A) -> None:
    a.a = 2

B().a = 2 # not ok

foo(B()) # ok

This clearly isn't safe despite being (currently) allowed.

@mikeshardmind
Copy link
Collaborator

The above has been patched in pyright, is being patched in mypy, and pyre has also been made aware of it. Given that, and the general consensus that this was something that type checkers weren't checking, but was part of the specification over in the related discussion in python/typing, I believe that leaves us with the following:

Mutable fields must be invariantly compatible all across all operands of an intersection that provide the field or the intersection can't be satisfied.

This vastly simplifies any rules here as we don't need to intersect all the way down, we can simply check for conflict. This increases the arguability to reduce to Never as simple in the case of impossible intersections.

@QuantumTim
Copy link

QuantumTim commented Nov 20, 2023

Bit late to the discussion so hopefully this isn't repeating anyone, but I think one consequence of reducing impossible types to Never is that the types C1: TypeAlias = Callable[[t1], None] | Callable[[t2], None] and C2: TypeAlias = Callable[[t1 & t2], None] are not equivalent, but if you treat such impossible types as unique types (even though they all share the property of having no valid values) then they are equivalent.

(Note: whilst I do have a maths background, it's been a while, so do let me know if there's a mistake in my logic 😉)

C1 is obviously a subtype of C2: if you have some function f that's a valid C1 then it's (only) safe to pass something that's both a t1 and a t2, aka t1 & t2. Hence f is also a valid C2*.

*
As long as you allow that `Callable[[Never], None]` is a valid type (it's the type of a callable that takes a single argument and returns None). You can't actually call it since you don't know what types it accepts, but you can do other operations that would be valid such as query its `__name__`, or introspect it to figure out what types it accepts and then call it with a valid type.

For C2 to be a subtype of C1, we could try to show that any g that's not a valid C1 is also not a valid C2. For g not to be a valid C1 there must be some type u1 that's a subtype of t1 that g cannot be called with (if no such type exists then g can be called with any subtype of t1 hence it's a valid Callable[[t1], None]. Similarly there must be some u2 subtype of t2 that g cannot be called with. The next step would be to argue that now u1 & u2 is a type that's a sub-type of t1 & t2 that cannot be passed to g**, and hence g is not a valid C2. This is where the argument diverges - if u1 & u2 can be Never, then this argument doesn't work. If u1 & u2 will always be a valid unique type (even if it might not have any valid values) then it does.

**
I've hand-waved over a step here - it's possible that some sub-types of u1 are valid for g, and same for some sub-types of u2, and the intersection u1 & u2 happens to yield a narrowed type that would actually be wholly valid for g. But this is basically the same argument again - you just narrow to u1_ = u1 & ~(u1 & u2) and u2_ = u2 & ~(u1 & u2) and use u1_ & u2_ instead, which once again only works if that intersection is not Never.

For an example of such a function g (abusing notation and glossing over the exact meanings of some operations):

class A1:
    pass
class B1(A1):
    foo: int
class A2:
    pass
class B2(A1):
    foo: str

def g(x: (A1 | A2) & ~B1 & ~B2) -> None:
    ...

This cannot be called with all A1 or all A2, so it's not a valid Callable[[A1], None] | Callable[[A2], None], but the "C2 version" would be Callable[[A1 & A2], None] and the u1 & u2 type would be B1 & B2 (which is "impossible" in CPython at least, but not actually invalid at runtime). You could also do this with

def g(x: (int | str) & ~Literal[0] & ~Literal['']) -> None:
    ...

In this case Callable[[int & str], None] already contains such an invalid type, and the u1 & u2 type of Literal[0] & Literal[''] looks even more absurd, but... it does exhibit the potential problem with reducing int & str to Never - it's throwing away information. If you reduce Callable[[int & str], None] to Callable[[Never], None], then (as per * above) your type is now the type of all callables that accept a single argument (aka Union[Callable[[T], None] for T in TYPES]) and not just Callable[[int], None] | Callable[[str], None].

So I guess the question is, do you like this nice duality of union/covariance with intersection/contravariance, or do you dislike the potential absurdity of treating types like Literal[0] & Literal[''] as distinct from Never? Personally I think I lean towards the duality.

As for assert_never() - perhaps that indicates that we need two distinct concepts here: one is the bottom type (aka intersection of all other types) and the other is the set of types that have no valid values, which would be used as the type of arguments to assert_never() (and a lot of the discussion on python/typing#1458 seems to revolve around which of these is "correct"). As has been mentioned though, it might not always be obvious to a type checker when a type (eg generated by an intersection) is in this set, which is when the programmer might need to step in to help out.

Sidenote
The name assert_never() (and similarly assert_type()) seems slightly off to me - I would usually use and assertion when I need to tell the type checker something I know (hope?) to be true but that it can't infer/prove, not when I want it to check something. Maybe it should have a check: bool argument to specify which mode you want, though I guess the ship has sailed on the default value for that.

@mikeshardmind
Copy link
Collaborator

I believe this is resolved as not to be enforced at the point of the intersection for the options we are considering

@CarliJoy
Copy link
Owner Author

@mikeshardmind Thats because due to the ordered nature of our suggestion there are no conflicts when accessing in attribute, as the highest order is taken, correctly?
And when trying to assign an object to a value, it could simply be that no object ever could satisfy the condition.
Is that more or less correct?

@mikeshardmind
Copy link
Collaborator

mikeshardmind commented Feb 22, 2024

@CarliJoy

Is that more or less correct?

Yeah. At assignment it needs to be compatible with all. The ordered nature for lookup means there's no reason to determine a minimum bound in the type system statically, and it basically gets handled when creating a type (being told the type violates other rules)

@CarliJoy
Copy link
Owner Author

there's no reason to determine a minimum bound in the type system statically

Sorry I don't understand what this means (in detail).
I think I am confused again with structual/nominal typing etc.

But I think it is more productive to wait for the PEP draft and see I reading it I will get it and if not work together with you to reformulate it :-)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

9 participants