diff --git a/.github/CODEOWNERS b/.github/CODEOWNERS index 77cf5b09068..e17e0940fc7 100644 --- a/.github/CODEOWNERS +++ b/.github/CODEOWNERS @@ -607,6 +607,7 @@ peps/pep-0724.rst @jellezijlstra peps/pep-0725.rst @pradyunsg peps/pep-0726.rst @AA-Turner peps/pep-0727.rst @JelleZijlstra +peps/pep-0728.rst @JelleZijlstra peps/pep-0729.rst @JelleZijlstra @hauntsaninja peps/pep-0730.rst @ned-deily peps/pep-0731.rst @gvanrossum @encukou @vstinner @zooba @iritkatriel diff --git a/peps/pep-0728.rst b/peps/pep-0728.rst new file mode 100644 index 00000000000..b5d7b2a4581 --- /dev/null +++ b/peps/pep-0728.rst @@ -0,0 +1,640 @@ +PEP: 728 +Title: TypedDict with Typed Extra Items +Author: Zixuan James Li +Sponsor: Jelle Zijlstra +Status: Draft +Type: Standards Track +Topic: Typing +Content-Type: text/x-rst +Created: 12-Sep-2023 +Python-Version: 3.13 + + +.. highlight:: rst + +Abstract +======== + +This PEP proposes a way to type extra items for :class:`~typing.TypedDict` using +a reserved ``__extra__`` key. This addresses the need to define a subset of +keys that might appear in a ``dict`` while permitting additional items of a +specified type, and the need to create a closed TypedDict type with ``__extra__: +Never``. + +Motivation +========== + +A :py:class:`typing.TypedDict` type can annotate the value type of each known +item in a dictionary. However, due to structural subtyping, a TypedDict can have +extra items that are not visible through its type. There is currently no way to +restrict the types of items that might be present in the TypedDict type's +structural subtypes. + +Defining a Closed TypedDict Type +-------------------------------- + +The current behavior of TypedDict prevents users from defining a closed +TypedDict type when it is expected that the type contains no additional items. + +Due to the possible presence of extra items, type checkers cannot infer more +precise return types for ``.items()`` and ``.values()`` on a TypedDict. This can +also be resolved by +`defining a closed TypedDict type `__. + +Another possible use case for this is a sound way to +`enable type narrowing `__ with the +``in`` check:: + + class Movie(TypedDict): + name: str + director: str + + class Book(TypedDict): + name: str + author: str + + def fun(entry: Movie | Book) -> None: + if "author" in entry: + reveal_type(entry) # Revealed type is 'Movie | Book' + +Nothing prevents a ``dict`` that is structurally compatible with ``Movie`` to +have the ``author`` key, and under the current specification it would be +incorrect for the type checker to narrow its type. + +Allowing Extra Items of a Certain Type +-------------------------------------- + +For supporting API interfaces or legacy codebase where only a subset of possible +keys are known, it would be useful to explicitly expect additional keys of +certain value types. + +However, the typing spec is more restrictive on type checking the construction of a +TypedDict, `preventing users `__ +from doing this:: + + class MovieBase(TypedDict): + name: str + + def fun(movie: MovieBase) -> None: + # movie can have extra items that are not visible through MovieBase + ... + + movie: MovieBase = {"name": "Blade Runner", "year": 1982} # Not OK + fun({"name": "Blade Runner", "year": 1982}) # Not OK + +While the restriction is enforced when constructing a TypedDict, due to +structural subtyping, the TypedDict may have extra items that are not visible +through its type. For example:: + + class Movie(MovieBase): + year: int + + movie: Movie = {"name": "Blade Runner", "year": 1982} + fun(movie) # OK + +It is not possible to acknowledge the existence of the extra items through +``in`` checks and access them without breaking type safety, even though they +might exist from arbitrary structural subtypes of ``MovieBase``:: + + def g(movie: MovieBase) -> None: + if "year" in movie: + reveal_type(movie["year"]) # Error: TypedDict 'MovieBase' has no key 'year' + +Some workarounds have already been implemented in response to the need to allow +extra keys, but none of them is ideal. For mypy, +``--disable-error-code=typeddict-unknown-key`` +`suppresses type checking error `__ +specifically for unknown keys on TypedDict. This sacrifices type safety over +flexibility, and it does not offer a way to specify that the TypedDict type +expects additional keys compatible with a certain type. + +Support Additional Keys for ``Unpack`` +-------------------------------------- + +:pep:`692` adds a way to precisely annotate the types of individual keyword +arguments represented by ``**kwargs`` using TypedDict with ``Unpack``. However, +because TypedDict cannot be defined to accept arbitrary extra items, it is not +possible to +`allow additional keyword arguments `__ +that are not known at the time the TypedDict is defined. + +Given the usage of pre-:pep:`692` type annotation for ``**kwargs`` in existing +codebases, it will be valuable to accept and type extra items on TypedDict so +that the old typing behavior can be supported in combination with the new +``Unpack`` construct. + +Rationale +========= + +A type that allows extra items of type ``str`` on a TypedDict can be loosely +described as the intersection between the TypedDict and ``Mapping[str, str]``. + +`Index Signatures `__ +in TypeScript achieve this: + +.. code-block:: typescript + + type Foo = { + a: string + [key: string]: string + } + +This proposal aims to support a similar feature without introducing general +intersection of types or syntax changes, offering a natural extension to the +existing type consistency rules. + +We propose that we give the dunder attribute ``__extra__`` a special meaning: +When it is defined on a TypedDict type, extra items are allowed, and their types +should be compatible with the value type of ``__extra__``. Different from index +signatures, the types of known items do not need to be consistent with the value +type of ``__extra__``. + +There are some advantages to this approach: + +- Inheritance works naturally. ``__extra__`` defined on a TypedDict will also + be available to its subclasses. + +- We can build on top of the `type consistency rules defined in the typing spec + `__. + ``__extra__`` can be treated as a pseudo-item in terms of type consistency. + +- There is no need to introduce a syntax to specify the type of the extra items. + +- We can precisely type the extra items without making ``__extra__`` the union + of known items. + +Specification +============= + +This specification is structured to parallel :pep:`589` to highlight changes to +the original TypedDict specification. + +Extra items are treated as non-required items having the same type of +``__extra__`` whose keys are allowed when determining +`supported and unsupported operations +`__. + +Using TypedDict Types +--------------------- + +For a TypedDict type that has the ``__extra__`` key, during construction, the +value type of each unknown item is expected to be non-required and compatible +with the value type of ``__extra__``. For example:: + + class Movie(TypedDict): + name: str + __extra__: bool + + a: Movie = {"name": "Blade Runner", "novel_adaptation": True} # OK + b: Movie = { + "name": "Blade Runner", + "year": 1982, # Not OK. 'int' is incompatible with 'bool' + } + +In this example, ``__extra__: bool`` does not mean that ``Movie`` has a required +string key ``"__extra__"`` whose value type is ``bool``. Instead, it specifies that +keys other than "name" have a value type of ``bool`` and are non-required. + +The alternative inline syntax is also supported:: + + Movie = TypedDict("Movie", {"name": str, "__extra__": bool}) + +Accessing extra keys is allowed. Type checkers must infer its value type from +the value type of ``__extra__``:: + + def f(movie: Movie) -> None: + reveal_type(movie["name"]) # Revealed type is 'str' + reveal_type(movie["novel_adaptation"]) # Revealed type is 'bool' + +Interaction with PEP 705 +------------------------ + +When ``__extra__`` is annotated with ``ReadOnly[]``, the extra items on the +TypedDict have the properties of read-only items. This affects subclassing +according to the inheritance rules specified in :pep:`PEP 705 <705#Inheritance>`. + +Notably, a subclass of the TypedDict type may redeclare the value type of +``__extra__`` or of additional non-extra items if the TypedDict type declares +``__extra__`` to be read-only. + +More details are discussed in the later sections. + +Interaction with Totality +------------------------- + +It is an error to use ``Required[]`` or ``NotRequired[]`` with the special +``__extra__`` item. ``total=False`` and ``total=True`` have no effect on +``__extra__`` itself. + +The extra items are non-required, regardless of the totality of the TypedDict. +Operations that are available to ``NotRequired`` items should also be available +to the extra items:: + + class Movie(TypedDict): + name: str + __extra__: int + + def f(movie: Movie) -> None: + del movie["name"] # Not OK + del movie["year"] # OK + +Interaction with ``Unpack`` +--------------------------- + +For type checking purposes, ``Unpack[TypedDict]`` with extra items should be +treated as its equivalent in regular parameters, and the existing rules for +function parameters still apply:: + + class Movie(TypedDict): + name: str + __extra__: int + + def f(**kwargs: Unpack[Movie]) -> None: ... + + # Should be equivalent to + def f(*, name: str, **kwargs: int) -> None: ... + +Inheritance +----------- + +``__extra__`` is inherited the same way as a regular ``key: value_type`` item. +As with the other keys, the same rules from +`the typing spec `__ +and :pep:`PEP 705 <705#inheritance>` apply. We interpret the existing rules in the +context of ``__extra__``. + +We need to reinterpret the following rule to define how ``__extra__`` interacts +with it: + + * Changing a field type of a parent TypedDict class in a subclass is not allowed. + +First, it is not allowed to change the value type of ``__extra__`` in a subclass +unless it is declared to be ``ReadOnly`` in the superclass:: + + class Parent(TypedDict): + __extra__: int | None + + class Child(Parent): + __extra__: int # Not OK. Like any other TypedDict item, __extra__'s type cannot be changed + +Second, ``__extra__: T`` effectively defines the value type of any unnamed items +accepted to the TypedDict and marks them as non-required. Thus, the above +restriction applies to any additional items defined in a subclass. For each item +added in a subclass, all of the following conditions should apply: + +- If ``__extra__`` is read-only + + - The item can be either required or non-required + + - The item's value type is consistent with ``T`` + +- If ``__extra__`` is not read-only + + - The item is non-required + + - The item's value type is consistent with ``T`` + + - ``T`` is consistent with the item's value type + +- If ``__extra__`` is not redeclared, the subclass inherits it as-is. + +For example:: + + class MovieBase(TypedDict): + name: str + __extra__: int | None + + class AdaptedMovie(MovieBase): # Not OK. 'bool' is not consistent with 'int | None' + adapted_from_novel: bool + + class MovieRequiredYear(MovieBase): # Not OK. Required key 'year' is not known to 'Parent' + year: int | None + + class MovieNotRequiredYear(MovieBase): # Not OK. 'int | None' is not consistent with 'int' + year: NotRequired[int] + + class MovieWithYear(MovieBase): # OK + year: NotRequired[int | None] + +Due to this nature, an important side effect allows us to define a TypedDict +type that disallows additional items:: + + class MovieFinal(TypedDict): + name: str + __extra__: Never + +Here, annotating ``__extra__`` with :class:`typing.Never` specifies that +there can be no other keys in ``MovieFinal`` other than the known ones. + +Type Consistency +---------------- + +In addition to the set ``S`` of keys of the explicitly defined items, a +TypedDict type that has the item ``__extra__: T`` is considered to have an +infinite set of items that all satisfy the following conditions: + +- If ``__extra__`` is read-only + + - The key's value type is consistent with ``T`` + + - The key is not in ``S``. + +- If ``__extra__`` is not read-only + + - The key is non-required + + - The key's value type is consistent with ``T`` + + - ``T`` is consistent with the key's value type + + - The key is not in ``S``. + +For type checking purposes, let ``__extra__`` be a non-required pseudo-item to +be included whenever "for each ... item/key" is stated in +:pep:`the existing type consistency rules from PEP 705 <705#type-consistency>`, +and we modify it as follows: + + A TypedDict type ``A`` is consistent with TypedDict ``B`` if ``A`` is + structurally compatible with ``B``. This is true if and only if all of the + following are satisfied: + + * For each item in ``B``, ``A`` has the corresponding key, unless the item + in ``B`` is read-only, not required, and of top value type + (``ReadOnly[NotRequired[object]]``). [Edit: Otherwise, if the + corresponding key with the same name cannot be found in ``A``, "__extra__" + is considered the corresponding key.] + + * For each item in ``B``, if ``A`` has the corresponding key [Edit: or + "__extra__"], the corresponding value type in ``A`` is consistent with the + value type in ``B``. + + * For each non-read-only item in ``B``, its value type is consistent with + the corresponding value type in ``A``. [Edit: if the corresponding key + with the same name cannot be found in ``A``, "__extra__" is considered the + corresponding key.] + + * For each required key in ``B``, the corresponding key is required in ``A``. + For each non-required key in ``B``, if the item is not read-only in ``B``, + the corresponding key is not required in ``A``. + [Edit: if the corresponding key with the same name cannot be found in + ``A``, "__extra__" is considered to be non-required as the corresponding + key.] + +The following examples illustrate these checks in action. + +``__extra__`` puts various restrictions on additional items for type +consistency checks:: + + class Movie(TypedDict): + name: str + __extra__: int | None + + class MovieDetails(TypedDict): + name: str + year: NotRequired[int] + + details: MovieDetails = {"name": "Kill Bill Vol. 1", "year": 2003} + movie: Movie = details # Not OK. While 'int' is consistent with 'int | None', + # 'int | None' is not consistent with 'int' + + class MovieWithYear(TypedDict): + name: str + year: int | None + + details: MovieWithYear = {"name": "Kill Bill Vol. 1", "year": 2003} + movie: Movie = details # Not OK. 'year' is not required in 'Movie', + # so it shouldn't be required in 'MovieWithYear' either + +Because "year" is absent in ``Movie``, ``__extra__`` is considered the +corresponding key. ``"year"`` being required violates the rule "For each +required key in ``B``, the corresponding key is required in ``A``". + +When ``__extra__`` is defined to be read-only in a TypedDict type, it is possible +for an item to have a narrower type than ``__extra__``'s value type. + + class Movie(TypedDict): + name: str + __extra__: ReadOnly[str | int] + + class MovieDetails(TypedDict): + name: str + year: NotRequired[int] + + details: MovieDetails = {"name": "Kill Bill Vol. 2", "year": 2004} + movie: Movie = details # OK. 'int' is consistent with 'str | int'. + +This behaves the same way as :pep:`705` specified if ``year: ReadOnly[str | int]`` +is an item defined in ``Movie``. + +``__extra__`` as a pseudo-item follows the same rules that other items have, so +when both TypedDicts contain ``__extra__``, this check is naturally enforced:: + + class MovieExtraInt(TypedDict): + name: str + __extra__: int + + class MovieExtraStr(TypedDict): + name: str + __extra__: str + + extra_int: MovieExtraInt = {"name": "No Country for Old Men", "year": 2007} + extra_str: MovieExtraStr = {"name": "No Country for Old Men", "description": ""} + extra_int = extra_str # Not OK. 'str' is inconsistent with 'int' for item '__extra__' + extra_str = extra_int # Not OK. 'int' is inconsistent with 'str' for item '__extra__' + +Interaction with Mapping[KT, VT] +-------------------------------- + +A TypedDict type can be consistent with ``Mapping[KT, VT]`` types other than +``Mapping[str, object]`` as long as the union of value types on the TypedDict +type is consistent with ``VT``. It is an extension of this rule from the typing +spec:: + + * A TypedDict with all ``int`` values is not consistent with + ``Mapping[str, int]``, since there may be additional non-``int`` + values not visible through the type, due to structural subtyping. + These can be accessed using the ``values()`` and ``items()`` + methods in ``Mapping`` + +For example:: + + class MovieExtraStr(TypedDict): + name: str + __extra__: str + + extra_str: MovieExtraStr = {"name": "Blade Runner", "summary": ""} + str_mapping: Mapping[str, str] = extra_str # OK + + int_mapping: Mapping[str, int] = extra_int # Not OK. 'int | str' is not consistent with 'int' + int_str_mapping: Mapping[str, int | str] = extra_int # OK + +Furthermore, type checkers should be able to infer the precise return types of +``values()`` and ``items()`` on such TypedDict types:: + + def fun(movie: MovieExtraStr) -> None: + reveal_type(movie.items()) # Revealed type is 'dict_items[str, str]' + reveal_type(movie.values()) # Revealed type is 'dict_values[str, str]' + +Interaction with dict[KT, VT] +----------------------------- + +Note that because the presence of ``__extra__`` prohibits additional required +keys in a TypedDict type's structural subtypes, we can determine if the +TypedDict type and its structural subtypes will ever have any required key +during static analysis. + +If there is no required key, the TypedDict type is consistent with ``dict[KT, +VT]`` and vice versa if all items on the TypedDict type satisfy the following +conditions: + +- ``VT`` is consistent with the value type of the item + +- The value type of the item is consistent with ``VT`` + +For example:: + + class IntDict(TypedDict): + __extra__: int + + class IntDictWithNum(IntDict): + num: NotRequired[int] + + def f(x: IntDict) -> None: + v: dict[str, int] = x # OK + v.clear() # OK + + not_required_num: IntDictWithNum = {"num": 1, "bar": 2} + regular_dict: dict[str, int] = not_required_num # OK + f(not_required_num) # OK + +In this case, methods that are previously unavailable on a TypedDict are allowed. + + not_required_num.clear() # OK + + reveal_type(not_required_num.popitem()) # OK. Revealed type is tuple[str, int] + +Open Issues +=========== + +Alternatives to the ``__extra__`` Reserved Key +---------------------------------------------- + +As it was pointed out in the `PEP 705 review comment +`__, +``__extra__`` as a reserved item has some disadvantages, including not allowing +"__extra__" as a regular key, requiring special handling to disallow +``Required`` and ``NotRequired``. There could be some better alternatives to +this without the above-mentioned issues. + +Backwards Compatibility +======================= + +While dunder attributes like ``__extra__`` are reserved for stdlib, it is still +a limitation that ``__extra__`` is no longer usable as a regular key. If the +proposal is accepted, none of ``__required_keys__``, ``__optional_keys__``, +``__readonly_keys__`` and ``__mutable_keys__`` should include ``__extra__`` in +runtime. + +Because this is a type-checking feature, it can be made available to older +versions as long as the type checker supports it. + +Rejected Ideas +============== + +Allowing Extra Items without Specifying the Type +------------------------------------------------ + +``extra=True`` was originally proposed for defining a TypedDict that accepts extra +items regardless of the type, like how ``total=True`` works:: + + class TypedDict(extra=True): + pass + +Because it did not offer a way to specify the type of the extra items, the type +checkers will need to assume that the type of the extra items is ``Any``, which +compromises type safety. Furthermore, the current behavior of TypedDict already +allows untyped extra items to be present in runtime, due to structural +subtyping. + +Supporting ``TypedDict(extra=type)`` +------------------------------------ + +This adds more corner cases to determine whether a type should be treated as a +type or a value. And it will require more work to support using special forms to +type the extra items. + +While this saves us from reserving an attribute for special use, it will require +extra work to implement inheritance, and it is less natural to integrate with +generic TypedDicts. + +Support Extra Items with Intersection +------------------------------------- + +Supporting intersections in Python's type system requires a lot of careful +considerations, and it can take a long time for the community to reach a +consensus on a reasonable design. + +Ideally, extra items in TypedDict should not be blocked by work on +intersections, nor does it necessarily need to be supported through +intersections. + +Moreover, the intersection between ``Mapping[...]`` and ``TypedDict`` is not +equivalent to a TypedDict type with the proposed ``__extra__`` special item, as +the value type of all known items in ``TypedDict`` needs to satisfy the +is-subtype-of relation with the value type of ``Mapping[...]``. + +Requiring Type Compatibility of the Known Items with ``__extra__`` +------------------------------------------------------------------ + +``__extra__`` restricts the value type for keys that are *unknown* to the +TypedDict type. So the value type of any *known* item is not necessarily +consistent with ``__extra__``'s type, and ``__extra__``'s type is not +necessarily consistent with the value types of all known items. + +This differs from TypeScript's `Index Signatures +`__ +syntax, which requires all properties' types to match the string index's type. +For example: + +.. code-block:: typescript + + interface MovieWithExtraNumber { + name: string // Property 'name' of type 'string' is not assignable to 'string' index type 'number'. + [index: string]: number + } + + interface MovieWithExtraNumberOrString { + name: string // OK + [index: string]: number | string + } + +This is a known limitation is discussed in `TypeScript's issue tracker +`__, +where it is suggested that there should be a way to exclude the defined keys +from the index signature, so that it is possible to define a type like +``MovieWithExtraNumber``. + +Reference Implementation +======================== + +pyanalyze has `experimental support +`__ +for a similar feature. + +Reference implementation for this specific proposal, however, is not currently +available. + +Acknowledgments +=============== + +Thanks to Jelle Zijlstra for sponsoring this PEP and providing review feedback, +Eric Traut who `proposed the original design +` +this PEP iterates on, and Alice Purcell for offering their perspective as the +author of :pep:`705`. + +Copyright +========= + +This document is placed in the public domain or under the +CC0-1.0-Universal license, whichever is more permissive.