Consider field value types when disambiguating a union of TypedDicts #11199

bluetech · 2021-09-26T13:45:42Z

Description

Previously, given a union of TypedDicts, e.g. A|B in

from typing import TypedDict, Literal, Union

class A(TypedDict):
    tag: Literal['A']
    extra_a: str

class B(TypedDict):
    tag: Literal['B']
    extra_b: str

when needing to disambiguate the union, e.g.

td: A|B = {
    'tag': 'A',
    'extra_a': 'foo',
}

mypy would only consider the keys of the dict expression and
TypedDict, e.g. 'tag' and 'extra_a'. But if multiple members of the
union have the same shape, only distinguished by a value type, the
disambiguation fails, e.g.

class A(TypedDict):
    tag: Literal['A']

class B(TypedDict):
    tag: Literal['B']

td: A|B = {  # E: Type of TypedDict is ambiguous, could be any of ("A", "B")
    'tag': 'A',
}

To allow this, also consider the types of the dict expression's values
when narrowing the candidates from the union.

Test Plan

I've added a test case and also successfully ran on my own code where I've hit this issue. The test fails before and passes after.

Fixes python#8533. Previously, given a union of TypedDicts, e.g. `A|B` in ```py from typing import TypedDict, Literal, Union class A(TypedDict): tag: Literal['A'] extra_a: str class B(TypedDict): tag: Literal['B'] extra_b: str ``` when needing to disambiguate the union, e.g. ``` td: A|B = { 'tag': 'A', 'extra_a': 'foo', } ``` mypy would only consider the *keys* of the dict expression and TypedDict, e.g. 'tag' and 'extra_a'. But if multiple members of the union have the same shape, only distinguished by a value type, the disambiguation fails, e.g. ```py class A(TypedDict): tag: Literal['A'] class B(TypedDict): tag: Literal['B'] td: A|B = { # E: Type of TypedDict is ambiguous, could be any of ("A", "B") 'tag': 'A', } ``` To allow this, also consider the types of the dict expression's values when narrowing the candidates from the union.

bluetech · 2021-09-26T13:48:42Z

Note: I don't really know if self.accept/is_subtype is the right way to do this check. In the issue (#8533) @Michael0x2a implied there might need to be more substantial changes to the flow of disambiguation/type-checking, I suspect that's true. But since it's a small change and seems to work well, I'm submitting it anyway.

sobolevn · 2021-09-26T15:05:20Z

test-data/unit/check-typeddict.test

+C2 = TypedDict('C2', {'@type': Literal['c-type'], 'extra': str})
+
+x: Union[A, B, C1, C2]
+x = {'@type': 'a-type'}


Maybe I'm missing something, but in the test case above [case testTypedDictUnionUnambiguousCase] - the inferred type of c: Union[A, B] = {'@type': 'a-type', 'a': 'Test'} is Union[TypedDict('__main__.A', {'@type': Literal['a-type'], 'a': builtins.str}), TypedDict('__main__.B', {'@type': Literal['b-type'], 'b': builtins.int})] which looks inconsistent with this case.

Should it be this way?

There is a difference if it's annotated directly c: Union[A, B] = ... or separately c: Union[A, B]; c = .... Initially I emulated the style of the previous test (annotated directly), but then switched to this one because it's shorter. I cannot explain why the difference exist, naively I would have thought they should be equivalent. I can try to figure it out if you think it's pertinent.

naively I would have thought they should be equivalent

Yeah, me too! 😄 I would be very confused if I saw this. I don't remember any other cases when x: Annotation; x = ... and x: Annotation = ... would produce different results.

For reference, this behavior is not particular to TypedDict:

class A: pass class B: pass x: A|B x = A() reveal_type(x) # Revealed type is A y: A|B = A() reveal_type(y) # Revealed type is Union[A, B]

Yeah this behavior is suprising. We have it for historical reasons and because without this feature there would be no easy way to force a wider/more general type for a variable in a type-safe manner.

JukkaL

Thanks for the PR!

This is probably fine. There's one context I can think of where this might be problematic. Can you add a test case where you do something like this:

x: TD1 | TD2
x = {'x': 1, 'y': []}   # Note empty list, disambiguate to only TD1 or TD2 based on 'x'
reveal_type(x)

The type of y should be list[int], for example, in the TypedDict type. I'm wondering how the type of the empty list would be inferred.

bluetech · 2021-10-01T20:06:22Z

@JukkaL thanks for reviewing!

I added the test you suggested, and it's seems good.

Also added a check for an ambiguous case, where the xs have the same type, but the ys are List[int] or List[str]. Given [], it should be ambiguous, and it is. But then I've had a thought to also test this given [1], which should go to the List[int] TypedDict and not be ambiguous -- but it is. So I need to look into it.

For reference, the failing check is

from typing import Union, List
from typing_extensions import TypedDict

A = TypedDict('A', {'x': str, 'y': List[int]})
B = TypedDict('B', {'x': int, 'y': List[int]})
C = TypedDict('C', {'x': int, 'y': List[str]})

abc_unambiguous: Union[A, B, C]                                                                                                            
abc_unambiguous = {'x': 0, 'y': [1]} # N: Revealed type is "TypedDict('__main__.B', {'x': builtins.int, 'y': builtins.list[builtins.int]})"
                                     # Actual: Type of TypedDict is ambiguous, could be any of ("B", "C")

bluetech · 2021-10-15T19:11:27Z

I tried to find a solution but failed unfortunately. The problem is this: as part of the proposed algorithm, I check the dict expr (the rvalue) against each TypedDict in the union. For each TypedDict, I go over each key-value and ask "is the dict expr key's value (in the example: [1]) compatible with the TypedDict key value's type (in the example: List[str])?". The "compatible" check is done like this:

value_type = self.accept(node=<dict expr key's value expr>, type_context=<TypedDict key's value type>`)
is value_type a subtype of <TypedDict key's value type>?

I pass type_context to self.accept because otherwise it will return str for strings instead of Literal when the TypedDict key's value type being matched against is a literal, and won't match when it should.

But then, self.accept(ListExpr[IntExpr(1)], type_context=list[str]) does two things I don't want: emits an incompatible-type error, and returns list[str], which matches when it shouldn't.

So there's a problem if I pass the type context, and there's a problem if I don't.

Hopefully someone else can figure it out!

sobolevn reviewed Sep 26, 2021

View reviewed changes

JukkaL reviewed Oct 1, 2021

View reviewed changes

Add a test case with inference

b769bfe

bluetech closed this Oct 15, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Consider field value types when disambiguating a union of TypedDicts #11199

Consider field value types when disambiguating a union of TypedDicts #11199

bluetech commented Sep 26, 2021 •

edited

Loading

bluetech commented Sep 26, 2021

sobolevn Sep 26, 2021

bluetech Sep 26, 2021

sobolevn Sep 26, 2021

bluetech Sep 26, 2021

JukkaL Oct 1, 2021

JukkaL left a comment

bluetech commented Oct 1, 2021

bluetech commented Oct 15, 2021

Consider field value types when disambiguating a union of TypedDicts #11199

Consider field value types when disambiguating a union of TypedDicts #11199

Conversation

bluetech commented Sep 26, 2021 • edited Loading

Description

Test Plan

bluetech commented Sep 26, 2021

sobolevn Sep 26, 2021

Choose a reason for hiding this comment

bluetech Sep 26, 2021

Choose a reason for hiding this comment

sobolevn Sep 26, 2021

Choose a reason for hiding this comment

bluetech Sep 26, 2021

Choose a reason for hiding this comment

JukkaL Oct 1, 2021

Choose a reason for hiding this comment

JukkaL left a comment

Choose a reason for hiding this comment

bluetech commented Oct 1, 2021

bluetech commented Oct 15, 2021

bluetech commented Sep 26, 2021 •

edited

Loading