TypeForm[T]: Spelling for regular types (int, str) & special forms (Union[int, str], Literal['foo'], etc) #9773

Open
davidfstr opened this issue Dec 1, 2020 · 104 comments
Labels
feature · meta (Issues tracking a broad area of work) · topic-depends-on-pep-change · topic-type-form (TypeForm might fix this)

Comments

@davidfstr
Contributor

davidfstr commented Dec 1, 2020

(An earlier version of this post used TypeAnnotation rather than TypeForm as the initially proposed spelling for the concept described here)

Feature

A new special form TypeForm[T] which is conceptually similar to Type[T] but is inhabited by not only regular types like int and str, but also by anything "typelike" that can be used in the position of a type annotation at runtime, including special forms like Union[int, str], Literal['foo'], List[int], MyTypedDict, etc.

Pitch

Being able to represent something like TypeForm[T] enables writing type signatures for new kinds of functions that can operate on arbitrary type annotation objects at runtime. For example:

# Returns `value` if it conforms to the specified type annotation using typechecker subtyping rules.
def trycast(typelike: TypeForm[T], value: object) -> Optional[T]: ...

# Returns whether the specified value can be assigned to a variable with the specified type annotation using typechecker subtyping rules.
def isassignable(value: object, typelike: TypeForm[T]) -> bool: ...

Several people have indicated interest in a way to spell this concept:

For a more in-depth motivational example showing how I can use something like TypeForm[T] to greatly simplify parsing JSON objects received by Python web applications, see my recent thread on typing-sig:

If there is interest from the core mypy developers, I'm willing to do the related specification and implementation work in mypy.

@hauntsaninja
Collaborator

hauntsaninja commented Dec 1, 2020

Why not do something like:

from __future__ import annotations
from typing import *

T = TypeVar("T")

class TypeAnnotation(Generic[T]):
    @classmethod
    def trycast(cls, value: object) -> T:
        ...

reveal_type(TypeAnnotation[Optional[int]].trycast(object()))

@JelleZijlstra
Member

Why can't we make Type[T] also work for other special forms?

@hauntsaninja's workaround is useful, but it would be better to have a feature in the core type system for this.

@davidfstr
Contributor Author

@hauntsaninja , using your workaround I am unable to access the passed parameter at runtime from inside the wrapper class.

The following program:

from __future__ import annotations
from typing import *

T = TypeVar('T')

class TypeAnnotation(Generic[T]):
    @classmethod
    def trycast(cls, value: object) -> Optional[T]:
        Ts = get_args(cls)
        print(f'get_args(cls): {Ts!r}')
        return None

ta = TypeAnnotation[Union[int, str]]
print(f'get_args(ta): {get_args(ta)!r}')
result = ta.trycast('a_str')

prints:

get_args(ta): (typing.Union[int, str],)
get_args(cls): ()
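A minimal sketch of why the parameter goes missing, assuming only standard `typing` behaviour: attribute access on the subscripted alias is forwarded to the unparameterized origin class, so the classmethod is bound to the origin. `Box` and `received_cls` are hypothetical names used only for this illustration.

```python
from typing import Generic, TypeVar, Union, get_args, get_origin

T = TypeVar("T")


class Box(Generic[T]):
    @classmethod
    def received_cls(cls):
        # Return whatever object the classmethod was actually bound to.
        return cls


alias = Box[Union[int, str]]

# The alias object itself still remembers its parameter...
alias_args = get_args(alias)
# ...but the classmethod sees the plain, unparameterized class.
bound_cls = alias.received_cls()
```

Here `alias_args` holds the `Union[int, str]` argument, while `bound_cls` is the bare `Box` class, which has no arguments to report.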

@davidfstr
Contributor Author

@JelleZijlstra commented:

Why can't we make Type[T] also work for other special forms?

@hauntsaninja's workaround is useful, but it would be better to have a feature in the core type system for this.

I agree that having Type[T] be widened to mean "anything typelike, including typing special forms" would be an alternate solution. A very attractive one IMHO.

However it appears that there was a deliberate attempt in mypy 0.780 to narrow Type[T] to only refer to objects that satisfy isinstance(x, type) at runtime. I don't understand the context for that decision. If however that decision was reversed and Type[T] made more general then there would be no need for the additional TypeAnnotation[T] syntax I'm describing in this issue.

@gvanrossum
Member

I believe the issue is that Type is used for things that can be used as the second argument of isinstance(). And those things must be actual class objects (or tuples of such) -- they cannot be things like Any, Optional[int] or List[str].

So if this feature is going to happen I think it should be a separate thing -- and for the static type system it probably shouldn't have any behavior, since such objects are only going to be useful for introspection at runtime. (And even then, how are you going to do the introspection? they all have types that are private objects in the typing module.)
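A minimal sketch of the distinction described above, using only the standard library. The helper name `usable_with_isinstance` is hypothetical; it simply probes whether an object is accepted as the second argument of isinstance().

```python
from typing import Any, List, Literal


def usable_with_isinstance(candidate: object) -> bool:
    """Return True if `candidate` is accepted as isinstance()'s second argument."""
    try:
        isinstance(object(), candidate)  # type: ignore[arg-type]
    except TypeError:
        return False
    return True


# Real classes (and tuples of classes) work.
class_ok = usable_with_isinstance(int)
tuple_ok = usable_with_isinstance((int, str))

# Special forms are rejected with TypeError.
list_form_ok = usable_with_isinstance(List[str])
literal_form_ok = usable_with_isinstance(Literal["foo"])
any_ok = usable_with_isinstance(Any)

# Subscripted typing forms are also not instances of `type`.
list_form_is_type = isinstance(List[str], type)
```

Only the first two probes succeed; the three special forms are rejected, which is the gap a separate spelling would have to cover.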

@ltworf

ltworf commented Dec 4, 2020

Well I wrote this module with various checks and add every new major py version to the tests to see that it keeps working: https://github.com/ltworf/typedload/blob/master/typedload/typechecks.py

Luckily since py3.6 it has not happened that the typing objects change between minor upgrades of python.

@davidfstr
Contributor Author

davidfstr commented Dec 5, 2020

I believe the issue is that Type is used for things that can be used as the second argument of isinstance(). And those things must be actual class objects (or tuples of such) -- they cannot be things like Any, Optional[int] or List[str].

Makes sense.

for the static type system it probably shouldn't have any behavior, since such objects are only going to be useful for introspection at runtime.

Agreed.

how are you going to do the introspection? they all have types that are private objects in the typing module.

The typing module itself provides a few methods that can be used for introspection. typing.get_args and typing.get_origin come to mind.
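As a rough illustration of that kind of introspection, here is a deliberately incomplete sketch built on typing.get_origin and typing.get_args. The function name `isassignable_sketch` is hypothetical, the `form` parameter is annotated as Any because no TypeForm spelling exists yet, and only Any, Union, Literal, List[...] and plain classes are handled.

```python
from typing import Any, List, Literal, Optional, Union, get_args, get_origin


def isassignable_sketch(value: object, form: Any) -> bool:
    """Hypothetical, incomplete runtime check of `value` against `form`."""
    if form is Any:
        return True
    origin = get_origin(form)
    args = get_args(form)
    if origin is Union:
        # Optional[X] is Union[X, None], so it is covered here too.
        return any(isassignable_sketch(value, arg) for arg in args)
    if origin is Literal:
        # Compare type and value so that True does not match Literal[1].
        return any(type(value) is type(arg) and value == arg for arg in args)
    if origin is list:
        item_form = args[0] if args else Any
        return isinstance(value, list) and all(
            isassignable_sketch(item, item_form) for item in value
        )
    if isinstance(form, type):
        return isinstance(value, form)
    raise TypeError(f"unsupported form in this sketch: {form!r}")


int_or_none = isassignable_sketch(1, Optional[int])
str_as_optional_int = isassignable_sketch("x", Optional[int])
literal_match = isassignable_sketch("foo", Literal["foo"])
mixed_list = isassignable_sketch([1, "a"], List[int])
```

A real implementation would need many more cases (TypedDict, Tuple, Callable, and so on); the point is only that the public helpers are enough to walk the structure.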

@davidfstr
Contributor Author

I could see a couple of possible spellings for the new concept:

Personally I'm now leaning toward TypeForm (over TypeAnnotation) because it is consistent with prior documentation and is more succinct to type. It does sound a bit abstract but I expect only relatively advanced developers will be using this concept anyway.

(Let the bikeshedding begin. :)

@gvanrossum
Member

I like TypeForm.

@davidfstr
Contributor Author

Okay I'll go with TypeForm then, for lack of other input.

Next steps I expect are for me to familiarize myself with the mypy codebase again since I'm a bit rusty. Hard to believe it's been since 2016 that I put in the first version of TypedDict. Rumor has it that the semantic analyzer has undergone some extensive changes since then.

@davidfstr davidfstr changed the title TypeAnnotation[T]: Spelling for regular types (int, str) & special forms (Union[int, str], Literal['foo'], etc) TypeForm[T]: Spelling for regular types (int, str) & special forms (Union[int, str], Literal['foo'], etc) Dec 8, 2020
@gvanrossum
Member

Good. And yeah, a lot has changed. Once you have this working we should make a PEP out of it.

@davidfstr
Contributor Author

Once you have this working we should make a PEP out of it.

Yep, will do this time around. :)

@davidfstr
Contributor Author

Update: This afternoon I reloaded the high-level mypy codebase structure into my brain. It appears there are now only 4 major passes of interest:

  • State.semantic_analysis_pass1()
  • semanal_main.semantic_analysis_for_scc() # pass 2
  • TypeChecker.check_first_pass()
  • TypeChecker.check_second_pass()

Next steps I expect are to trace everywhere that mypy is processing occurrences of Type[T] and T = TypeVar('T'), which I expect to be most-similar in implementation to the new TypeForm[T] support.

@davidfstr
Contributor Author

Update: I have found/examined all mypy code related to processing the statement T = TypeVar('T'). Briefly:

  • A TypeVarExpr(TypeVarLikeExpr) is parsed from the characters T = TypeVar('T') by mypy.fastparse.parse().
  • A TypeVarDef(TypeVarLikeDef) is parsed from [a TypeVarExpr assigned to a name] and put into the current scope by TypeVarLikeScope.bind_new(name: str, TypeVarLikeExpr) -> TypeVarLikeDef.
  • A TypeVarType is returned as the resolved type for an unbound type reference by TypeAnalyser.visit_unbound_type_nonoptional(t: UnboundType, ...) -> Type

Next steps I expect are to trace everywhere that mypy is processing occurrences of Type[T] and other references to a T (which a TypeVar assignment statement defines).

@davidfstr
Contributor Author

Update: I did trace everywhere that mypy is processing occurrences of Type[T], and more specifically uses of TypeType. There are a ton!

In examining those uses it looks like the behavior of TypeForm when interacting with other type system features is not completely straightforward, and therefore not amenable to direct implementation. So I've decided to take a step back and start drafting the design for TypeForm in an actual PEP so that any design issues can be ironed out and commented on in advance.

Once it's ready, I'll post a link for the new TypeForm PEP draft to here and probably also to typing-sig.

@gvanrossum
Member

Yeah, alas Type[] was not implemented very cleanly (it was one of the things I tried to do and I missed a lot of places). We do have to consider whether this is going to be worth it -- there are much more important things that require our attention like variadic generics and type guards.

@davidfstr
Contributor Author

We do have to consider whether this is going to be worth it -- there are much more important things that require our attention like variadic generics and type guards.

Aye. Variadic generics and type guards both have much wider applicability than TypeForm in my estimation. Python 3.10's upcoming alpha window from Feb 2021 thru April is coming up fast, and only so many PEPs can be focused on.

Nevertheless I'll get the initial TypeForm PEP draft in place, even if it needs to be paused ("deferred"?) for a bit.

@davidfstr
Contributor Author

Update: I've drafted an initial PEP for TypeForm.

However I was thinking of waiting to post the TypeForm PEP for review (on typing-sig) until the commenting on PEP 646 (Variadic Generics) slows down and it becomes soft-approved, since that PEP is consuming a lot of reviewer time right now and is arguably higher priority.

In the meantime I'm building out an example module (trycast) that plans to use TypeForm.

@davidfstr
Contributor Author

Update: I'm still waiting on Variadic Generics (PEP 646) on typing-sig to be soft-approved before posting the TypeForm PEP draft, to conserve reviewer time.

(In the meantime I'm continuing to work on trycast, a new library for recognizing JSON-like values that will benefit from TypeForm. Trycast is about 1-2 weeks away from a beta release.)

@gvanrossum
Member

gvanrossum commented Jan 18, 2021 via email

@TeamSpen210
Contributor

From python/typeshed#11653, dataclasses.make_dataclass() would need TypeForm, as well as the attrs equivalent. The same would apply to TypedDict and namedtuple. All these are special cased anyway so it probably wouldn't have a big impact, but it does show a use case in the standard library.

@JelleZijlstra
Member

Same goes for typing.get_origin, typing.get_args, etc.

@superlopuh

@davidfstr I work on a compiler in Python that embeds constraints on the values in the IR into the Python type system. Some of these constraints are generic, representing nested constraints, for which we wrote a function that is similar to isinstance that verifies these: isa.

@davidfstr
Contributor Author

Next topic: Naming the concept of "the type of a type annotation object":

I also like AnnotationType and am leaning toward that as the name:

  • It contains the word "annotation".
  • It aligns with the names of a few other special types like EllipsisType, FunctionType, NoneType, etc.
  • It aligns with the new definition of "annotation expression" which I believe is the equivalent concept in the big typing spec.
  • It is less jargony than "TypeForm", which primarily takes its name from a "typing special form", which is not a concept most folks have their minds wrapped around.

Comments? Support? Objections?

@adriangb
Contributor

adriangb commented Apr 2, 2024

I like AnnotationType more 😀

@mikeshardmind

I don't like AnnotationType as a name. There's a very important distinction between Annotations and Annotation Expressions that has led to confusions in the past, and even putting that aside for a moment, neither all Annotations nor all Annotation Expressions are valid values described by this special form.

This specifically describes typing special forms and not annotations as a whole. I don't think TypeForm being "more jargony" is a large enough detraction to have a name that is actively more misleading about what it describes instead.

@erictraut

Could you please move this discussion to the typing forum? This isn't a mypy-specific feature. It's a proposed change to the typing spec, so it deserves the visibility and broader input from the community.

@davidfstr
Contributor Author

Could you please move this discussion to the typing forum? [...] it deserves the visibility and broader input from the community.

Sure. I'll make a new thread there in the next few days.

@davidfstr
Contributor Author

In preparation for moving this discussion to the typing forum, I'm currently drafting a new (2024) version of the TypeForm PEP, incorporating various feedback. Hoping to be done later this week.

@davidfstr
Contributor Author

The 2024 version of the TypeForm PEP is ready for review. Please see the thread in the Typing forum.

@davidfstr
Contributor Author

Draft 2 of the TypeForm PEP (2024 edition) is ready for review. Please leave your comments either in that thread or as inline comments in the linked Google Doc.


I especially solicit feedback from maintainers of runtime type checkers:

Please see §"Abstract", §"Motivation", and §"Common kinds of functions that would benefit from TypeForm" in the PEP to see how the TypeForm feature relates to specific functions in the library you maintain.

@hynek
Member

hynek commented May 6, 2024

attrs doesn't do anything with types except copying them around (type-checking logic is entirely via a Mypy plugin and/or dataclass transforms), so I don't have feedback. But as you mention in the PEP, my less-known child svcs would benefit! But it seems to be more of a trivial byproduct of a bigger thing that I have to admit I don't fully understand. :)

@leycec

leycec commented May 7, 2024

...heh. @beartype and typeguard are the ultimate consumers of this PEP. If anyone cares, we care. Oh, how we care! Actually, our users care even more than we do. Our users deeply care so much they repeatedly ~~prod us with pain sticks~~ inspire us with feature requests until we finally do something about this. Users that deeply cared include:

  • @alexander-c-b, @patrick-kidger, @rsokl, and @skeggse all ~~prodded me with pain sticks~~ inspired me with feature requests for literally years that felt like an eternity in purgatory. They begged me to augment the beartype.door.is_bearable() tester into a full-blown typing.TypeGuard[T]. I thought it was impossible. They asked too much! Their increasingly despondent pleas fell on deaf ears – until at long last the Hero of Light emerged from the cavernous darkness of GitHub prophecy. And his username was...
  • @asford, who authored a full-frontal funny essay on this exact topic a month ago. Over the course of many, many paragraphs at which I chuckled drolly, @asford discovered how to augment the beartype.door.is_bearable() tester into a full-blown typing.TypeGuard[T]. How? By profanely combining typing.TypeGuard[T] + @typing.overload. Is it an utmost evil? It works. Thus, it can only be good:
# Note that this PEP 484- and 647-compliant API is entirely the brain child of
# @asford (Alex Ford). If this breaks, redirect all ~~vengeance~~ enquiries to:
#     https://github.com/asford
@overload
def is_bearable(
    obj: object, hint: Type[T], *, conf: BeartypeConf = BEARTYPE_CONF_DEFAULT,
) -> TypeGuard[T]:
    '''
    :pep:`647`-compliant type guard conditionally narrowing the passed object to
    the passed type hint *only* when this hint is actually a valid **type**
    (i.e., subclass of the builtin :class:`type` superclass).
    '''


@overload
def is_bearable(
    obj: T, hint: Any, *, conf: BeartypeConf = BEARTYPE_CONF_DEFAULT,
) -> TypeGuard[T]:
    '''
    :pep:`647`-compliant fallback preserving (rather than narrowing) the type of
    the passed object when this hint is *not* a valid type (e.g., the
    :pep:`586`-compliant ``typing.Literal['totally', 'not', 'a', 'type']``,
    which is clearly *not* a type).
    '''

This behaves itself under all Python versions – even Python 3.8 and 3.9, which lack typing.TypeGuard. How? By abusing typing.TYPE_CHECKING, of course. Is there anything typing.TYPE_CHECKING can't solve? Behold:

# Portably import the PEP 647-compliant "typing.TypeGuard" type hint factory
# first introduced by Python >= 3.10, regardless of the current version of
# Python and regardless of whether this submodule is currently being subject to
# static type-checking or not. Praise be to MIT ML guru and stunning Hypothesis
# maintainer @rsokl (Ryan Soklaski) for this brilliant circumvention. \o/
#
# Usage of this factory is a high priority. Hinting the return of the
# is_bearable() tester with a type guard created by this factory effectively
# coerces that tester into an arbitrarily complete type narrower and thus type
# parser at static analysis time, substantially reducing complaints from static
# type-checkers in end user code deferring to that tester.
#
# If this submodule is currently being statically type-checked (e.g., mypy),
# intentionally import from the third-party "typing_extensions" module rather
# than the standard "typing" module. Why? Because doing so eliminates Python
# version complaints from static type-checkers (e.g., mypy, pyright). Static
# type-checkers could care less whether "typing_extensions" is actually
# installed or not; they only care that "typing_extensions" unconditionally
# defines this type factory across all Python versions, whereas "typing" only
# conditionally defines this type factory under Python >= 3.10. *facepalm*
if TYPE_CHECKING:
    from typing_extensions import TypeGuard as TypeGuard
# Else, this submodule is currently being imported at runtime by Python. In this
# case, dynamically import this factory from whichever of the standard "typing"
# module *OR* the third-party "typing_extensions" module declares this factory,
# falling back to the builtin "bool" type if none do.
else:
    TypeGuard = import_typing_attr_or_fallback(
        'TypeGuard', TypeHintTypeFactory(bool))

I only vaguely understand what's happening there. If I understand correctly, acceptance of this PEP would enable @beartype to (A) dramatically simplify the above logic (e.g., by eliminating the need for @typing.overload entirely) and (B) dramatically enhance the utility of the is_bearable() tester by generalizing that tester to narrow arbitrary type hints.

After the release of Python 3.13.0, @beartype and all things like @beartype should now (A) globally replace all references to typing.TypeGuard with typing.TypeIs, which is strictly superior for all practical intents and purposes (praise Jelle Zijlstra), and (B) refactor the signatures of things like is_bearable() to now resemble:

def is_bearable(
    obj: object, hint: TypeForm[T], *, conf: BeartypeConf = BEARTYPE_CONF_DEFAULT,
) -> TypeIs[T]: ...

David Foster be praised! I rejoice at this. The @beartype codebase will once again become readable. Well... more readable. Also, users are now weeping tears of joy at this. Type narrowing will start doing something useful for once. Yet questions remain.

The Demon Is In the Nomenclature: Name Haters Gonna Hate

The 100-pound emaciated gorilla in the room is actually your own open issue, @davidfstr:

I also added one Open Issue, whether the name “TypeForm” is best one to use

...heh. My answer is: "It's really not." I have no capitalist skin in this game. I barely know what a Typeform is. Yet, googling "python typeform" trivially yields nearly half-a-million hits. Googling "typeform" itself yields an astonishing 24 million hits – none of which have anything to do with typing systems and everything to do with TypeForm, the wildly successful tech startup I only marginally understand. Their search engine optimization (SEO) would probably frown and get a crinkled forehead if we trampled all over their heavily monetized brand space.

Out of sheer courtesy to Typeform, Typeform clients, and my rapidly shrinking 401k plan, ...heh it's probably best that the CPython standard library not trample American capitalism. Leave that to the evening news.

Even if Typeform wasn't a thing, TypeForm still wouldn't necessarily be the best name. None of us associate type hints or annotations with "forms." Yeah, sure; it's an internal private implementation detail of the standard typing module that various public type hint factories leverage a private typing._SpecialForm superclass. Nobody's supposed to know about that, though. More importantly, everybody already cognitively associates "forms" with HTML- and JavaScript-driven web forms. When some dude wearing a pinstriped suit forces me to "...just fill out that friggin' TPES form, already!", I don't tend to think about type hints or annotations.

Maybe I should. Now I will. Great. Thanks a lot, @davidfstr. My cluttered mind now has even more material baggage to lug.

Oh, I Know. I Know! I've Got It. You're Just Gonna Love It. It's...

typing.TypeHint. 🥳

...heh. Who didn't see that one coming, huh? Seriously. typing.TypeHint. You know this is the name. You knew five paragraphs ago when I started rambling incoherently about American capitalism that it was all ramping up to this big climactic finale.

typing.TypeHint. 🥳 🥳

Likewise, let's consider globally replacing all usages of the corresponding term "form" throughout the PEP with "hint": e.g.,

# Instead of this, which makes my cross eyes squint even more than normal...
def isassignable[T](value: object, form: TypeForm[T]) -> TypeIs[T]: ...

# Let's do this! My wan and pale facial cheeks are now smiling.
def isassignable[T](value: object, hint: TypeHint[T]) -> TypeIs[T]: ...

Assignability Raisers and Sorters: So Those Are Things Too Now, Huh?

So. It comes to this. In the parlance of this PEP, the aforementioned is_bearable() tester is an "assignability tester." Cool. That's cool. But the beartype.door subpackage does a lot more than just testing assignability. beartype.door offers a joyous medley of general-purpose functions that operate on arbitrary type hints – including:

  • die_if_unbearable(). It's a lot like is_bearable(). Whereas is_bearable() returns a bool describing whether the passed value satisfies the passed type hint, die_if_unbearable() either:
    • If that value satisfies that type hint, does nothing (i.e., returns None, silently reduces to a noop).
    • If that value violates that type hint, raises a human-readable exception describing why.
  • is_subhint(). Now this is a cool one that has nothing to do with either die_if_unbearable() or is_bearable(). Yet, it'd be wonderful if static type-checkers and competing runtime type-checkers alike supported something similar. Basically, is_subhint() defines a partial ordering over the set of all type hints. Specifically, is_subhint() returns a bool describing whether the first passed type hint is a subhint of the second passed type hint – where "subhint" is defined as:
    • These two hints are commensurable (i.e., convey broadly similar semantics enabling these two hints to be reasonably compared). For example:
      • collections.abc.Iterable[str] and collections.abc.Sequence[int] are commensurable. These two hints both convey container semantics. Despite their differing child hints, these two hints are broadly similar enough to be reasonably comparable.
      • collections.abc.Iterable[str] and collections.abc.Callable[[], int] are incommensurable. Whereas the first hint conveys a container semantic, the second hint conveys a callable semantic. Since these two semantics are unrelated, these two hints are dissimilar enough to not be reasonably comparable.
    • The first hint is semantically equivalent to or narrower than the second hint. Formally:
      • The first hint matches less than or equal to the total number of all possible objects matched by the second hint.
      • The size of the countably infinite set of all possible objects matched by the first hint is less than or equal to that of those matched by the second hint.

In the same way that is_bearable() can be broadly thought of as a generalization of the isinstance() builtin, is_subhint() can be broadly thought of as a generalization of the issubclass() builtin. Examples or it only happened in the DMT hyperspace:

>>> from beartype.door import is_subhint

# Test simple subclass relations.
>>> is_subhint(bool, int)
True
>>> is_subhint(int, int)
True
>>> is_subhint(str, int)
False

# Test less simple type hint relations.
>>> from typing import Any
>>> is_subhint(list, Any)
True

# Test brutally hurtful type hint relations that make me squint. My eyes!
>>> from collections.abc import Callable, Sequence
>>> is_subhint(Callable[[], list], Callable[..., Sequence[Any]])
True
>>> is_subhint(Callable[[], list], Callable[..., Sequence[int]])
False

Is a partial ordering over the set of all types actually useful, though? I mean, sure. It's cool. We get that. Everything's cool if you squint enough at it. But does anyone care?

...heh. Yeah. It turns out a partial ordering over the set of all types unlocks the keys to the Kingdom of QA – including efficient runtime multiple dispatch in O(1) time. Think @typing.overload that actually does something useful. Does anyone want Julia without actually having to use Julia? Praise be to @wesselb.
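A rough sketch of how a partial ordering can drive dispatch, under loud assumptions: plain classes stand in for arbitrary type hints, issubclass() stands in for is_subhint(), and the lookup is a linear scan rather than anything O(1). All names here (`register`, `dispatch`, `is_subsignature`) are hypothetical.

```python
from typing import Callable, Dict, Tuple

Signature = Tuple[type, ...]
_implementations: Dict[Signature, Callable[..., str]] = {}


def is_subsignature(first: Signature, second: Signature) -> bool:
    """Partial order on signatures: pairwise issubclass() (stand-in for is_subhint())."""
    return len(first) == len(second) and all(
        issubclass(a, b) for a, b in zip(first, second)
    )


def register(*signature: type):
    """Register the decorated function as the implementation for `signature`."""
    def decorator(func: Callable[..., str]) -> Callable[..., str]:
        _implementations[signature] = func
        return func
    return decorator


def dispatch(*args: object) -> str:
    """Call the most specific registered implementation matching the arguments."""
    actual = tuple(type(arg) for arg in args)
    candidates = [sig for sig in _implementations if is_subsignature(actual, sig)]
    for sig in candidates:
        # The winner must be at least as narrow as every other candidate.
        if all(is_subsignature(sig, other) for other in candidates):
            return _implementations[sig](*args)
    raise TypeError(f"no unique most specific implementation for {actual!r}")


@register(object, object)
def _generic(a: object, b: object) -> str:
    return "object, object"


@register(int, int)
def _ints(a: int, b: int) -> str:
    return "int, int"


@register(bool, int)
def _bool_int(a: bool, b: int) -> str:
    return "bool, int"
```

With these registrations, `dispatch(True, 1)` selects the `(bool, int)` implementation because it sits below both other candidates in the ordering, while `dispatch("a", 1)` falls back to `(object, object)`.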

In the case of die_if_unbearable(), integration between @beartype and static type-checkers via TypeHint would inform static type-checkers that the passed value is now guaranteed to satisfy the passed type hint. No intervening if conditionals are required: e.g.,

from beartype.door import die_if_unbearable

# Define something heinous dynamically. Static type-checkers no longer have any idea what's happening.
exec('muh_list = ["kk", "cray-cray", "hey, hey", "wut is going on with this list!?"]')

# Beartype informs static type-checkers that the type of "muh_list" is "list[str]".
die_if_unbearable(muh_list, list[str])

# Static type-checkers be like: "Uhh... I... I guess, bro. I guess. Seems wack. But you do you."
print(''.join(muh_item for muh_item in muh_list))  # <-- totally fine! accept this madness, static type-checker

"TypeForm" Values Section: Not Sure What's Going On Here, But Now Squinting

The "TypeForm" Values section makes me squint. From @beartype's general-purpose broad-minded laissez faire "anything goes" perspective, anything that is a type hint should ideally be a TypeHint.

This includes type hints that are only contextually valid in various syntactic and/or semantic contexts – like Final[...], InitVar[...], Never, NoReturn, Self, TypeIs[...] and so on. Ultimately, the line between whether a type hint is globally valid or only contextually valid is incredibly thin. From @beartype's permissive perspective, for example, Final[...], InitVar[...], and Self type hints are all syntactically and semantically valid anywhere within the body of a class – which covers most real-world code, because most real-world code is object-oriented and thus resides within the body of a class. You're probably thinking: "Wait. What? How is InitVar[...] valid as the parameter of a method?" Look. It's complicated. Just know that subtle runtime interactions between @beartype and @dataclasses.dataclass require @beartype to look the other way while @dataclasses.dataclass rummages around in dunder methods like __init__() behind everyone's backs.

The PEP currently rejects these sorts of type hints as "annotated expressions" – which is itself really weird, because we already have annotated type hints that are technically Python expressions and thus "annotated expressions": typing.Annotated[...]. When you say "annotated," I think: "Surely you speak of typing.Annotated[...], good Sir!" I do not think: "Surely you speak of arbitrary type hints that are only contextually valid in various syntactic and/or semantic contexts, less good Sir!"

The problem with rejecting some but not all type hints is:

  • Why? Why bother? No justification is given. If @beartype is fine with literally all type hints, everybody is fine. Unless everybody hates @beartype. Then the @leycec emoji sobs. 😭
  • The PEP doesn't even enumerate the set of all rejected type hints. It's way more than just Self, TypeGuard[...], and TypeIs[...].
  • Even enumerating the set of all rejected type hints is pointless, because the set of all rejected type hints explodes exponentially with each subsequent Python release. Today, it's ten type hint factories or whatevah. In ten years, it's ten thousand type hint factories after the inevitable release of Python 3.923480713084723407.0 introduces an earth-shattering gamut of new type hint factories that are only contextually valid in PEP 526-compliant annotated variable assignments. What about those, huh!? Those poor guys. More sobbing can be heard. 😭

Stringified TypeForms Section: NO GODS WHY NOOOOOOOOOOOOOOOOOOOOO

A type-form value may itself be a string literal that spells a forward reference:

...heh. So. It comes to this. You're trying to commit @leycec to a sanitarium. The truth is now revealed. Please. Let's all be clear on this:

  • Any attempt to declare strings as type hints should be soundly rejected as insane.

Static type-checkers don't care about stringified type hints, of course. But static type-checkers also hallucinate. By definition, their opinions are already insane.

Runtime type-checkers, however, basically cannot cope with stringified type hints – like, any stringified type hints. In the general case, doing so requires non-portable call stack inspection. It's slow. It's fragile. It's non-portable. It basically never works right. Even when it works "right," it never works the way users expect.
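A small sketch of the underlying problem, using only typing.get_type_hints from the standard library: a stringified hint that names something defined in a local scope cannot be resolved from the function object alone, and only works once the caller supplies the missing namespace by hand. The names `make_function`, `Local`, and `func` are hypothetical.

```python
from typing import get_type_hints


def make_function():
    class Local:
        pass

    def func(x: "Local") -> None:
        pass

    return func, Local


func, local_cls = make_function()

# Without the defining scope, the string "Local" cannot be resolved.
try:
    get_type_hints(func)
    resolved_without_help = True
except NameError:
    resolved_without_help = False

# It resolves only when the caller supplies the local namespace explicitly.
hints = get_type_hints(func, localns={"Local": local_cls})
```

A general-purpose runtime checker has no such namespace handed to it, which is presumably why the fallback is call stack inspection.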

Sure. @beartype copes with stringified type hints – mostly. But @beartype is also insane. @beartype has already squandered years of sweaty blood, smelly sweat, unpaid man hours, and precious life force attempting to support insane shenanigans like PEP 563 (i.e., from __future__ import annotations) and PEP 695 (e.g., type YouTooShallKnowOurPain = 'wut_u_say' | 'die_beartype_die'). Everybody else in the runtime type-checking space just gave up and didn't even bother trying.

In 2024, with the benefit of hindsight and safety goggles, let us all quietly admit that stringified type hints were a shambolic zombie plague that should have never happened. We certainly shouldn't be expanding the size and scope of stringified type hints. We should be deprecating, obsoleting, and slowly backing away from stringified type hints with our arms placatingly raised up in the air as we repeatedly say: "We're sorry! We're sorry for what we did to you, Pydantic and @beartype and typeguard! We didn't know... Gods. We didn't know. The horrors your codebase must have seen. Please accept this hush money as compensation."

There's no demonstrable reason whatsoever to permit useless insanity like IntTreeRef: TypeForm = 'IntTree' # OK. NO, NOT OK. Absolutely not OK. If this comment has one and only one takeaway, let it be this:

Pythonistas don't let Pythonistas stringify type hints. Not even once.
— thus spake @leycec

@leycec: He Is Now Tired and Must Now Collapse onto a Bed Full of Cats

@Tinche
Contributor

Tinche commented May 7, 2024

LGTM. Excited about this!

@agronholm

Yeah, just today I ran into a problem that would probably be solved with this. And it wasn't even related to typeguard. The PEP looks good at a glance, but I'll give it a proper read-through later.

@patrick-kidger

patrick-kidger commented May 7, 2024

A broad +1 to all of @leycec's points.

  • I don't love how this is proposing to create a three-tier hierarchy of 'types' and 'type forms' and 'type hints'. The existing split is complicated enough! Better to call this TypeHint and accept anything. Even Self and Final etc. have to be handled by runtime type checkers.

  • Definitely let's not accept stringified annotations. Every runtime type-checker I write either loudly fails or silently turns those into Any, because they're basically impossible to handle well at runtime. (And once PEP 649 is implemented then hopefully this issue can go away.)

@davidfstr
Contributor Author

davidfstr commented May 8, 2024

Name

Googling "typeform" itself yields [...] TypeForm, the wildly successful tech startup I only marginally understand.

Heh. True. "Typeform" is much better known as a service for online surveys. :)

So, definitely a +1 that there's almost certainly a better name than "TypeForm". However, choosing such a name depends a lot on your next point:

Values

From @beartype's general-purpose broad-minded laissez faire "anything goes" perspective, anything that is a type hint should ideally be a TypeHint.

This includes type hints that are only contextually valid in various syntactic and/or semantic contexts – like Final[...], InitVar[...], Never, NoReturn, Self, TypeIs[...] and so on.

I've debated whether the TypeForm concept should cover all runtime type annotation objects (i.e. what the typing specification calls "annotation expressions") or only those objects which spell a "type" (i.e. what the typing specification calls "type expressions").

Allowing TypeForm[] to match non-types (like InitVar[], Final[], Self, etc) doesn't make sense when trying to combine TypeForm[] with TypeIs[] or TypeGuard[] in a function definition, one of the key capabilities I want to enable. I discuss this further in §"Rejected Ideas > Accept arbitrary annotation expressions". Consider the following code:

# AKA: is_bearable
def isassignable[T](value: object, form: TypeForm[T]) -> TypeIs[T]: ...

request_json = ...
if isassignable(request_json, Final[int]):
    assert_type(request_json, ???)  # Never? int? Certainly not Final[int] because not valid for a variable type.

What should a static type checker infer for the ??? position above? (Pause to consider your own answer here...)

Right now the PEP takes the stance that passing a non-type like Final[] where a TypeForm[] is expected is an error. So ??? would be Any (i.e. the error type).
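A runtime analogue of that stance can be sketched with the standard library alone. The helper names `is_type_qualifier` and `require_type_expression` are hypothetical, and the check covers only the qualifiers named in this thread (Final, ClassVar, InitVar); it is an illustration, not the PEP's specified behavior.

```python
from dataclasses import InitVar
from typing import ClassVar, Final, Union, get_origin

# Qualifiers that are valid in annotation expressions but do not spell a type.
_QUALIFIERS = (Final, ClassVar)


def is_type_qualifier(form: object) -> bool:
    """Return True for forms such as Final[int], ClassVar[int], InitVar[int]."""
    if form is InitVar or isinstance(form, InitVar):
        # InitVar[int] is an *instance* of dataclasses.InitVar at runtime.
        return True
    return any(form is q or get_origin(form) is q for q in _QUALIFIERS)


def require_type_expression(form: object) -> object:
    """Reject qualifier-wrapped forms instead of silently accepting them."""
    if is_type_qualifier(form):
        raise TypeError(f"{form!r} is a type qualifier, not a type expression")
    return form
```

With this policy, `require_type_expression(Final[int])` raises `TypeError`, while `int` and `Union[int, str]` pass through unchanged.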

Surprisingly, I see that is_bearable (an implementation of isassignable from beartype) can return True in the above scenario...

>>> from beartype.door import is_bearable
>>> from typing import *
>>> is_bearable(5, Final[int])
True  # 😳

Appendix: More adventures in beartype

Happily I see Self is rejected outright (👍 ):

>>> is_bearable(5, Self)
beartype.roar.BeartypeDecorHintPep673Exception: Is_bearable() PEP 673 type hint "typing.Self" invalid outside @beartype-decorated class. PEP 673 type hints are valid only inside classes decorated by @beartype.

And ClassVar is unsupported (👌 ):

>>> is_bearable(5, ClassVar[int])
beartype.roar.BeartypeDecorHintPepUnsupportedException: Is_bearable() type hint typing.ClassVar[int] currently unsupported by @beartype.

But InitVar is accepted, surprisingly:

>>> from dataclasses import InitVar
>>> is_bearable(5, InitVar[int])
True  # 😳

Appendix: Similar adventures in trycast

In the trycast library, Final, Self, ClassVar, and InitVar are all unsupported, since they don't make sense when looking at a value in isolation:

>>> from trycast import isassignable
>>> isassignable(5, Final[int])
trycast.TypeNotSupportedError: isassignable does not know how to recognize generic type typing.Final.
>>> isassignable(5, Self)
TypeError: typing.Self cannot be used with isinstance()
>>> isassignable(5, ClassVar[int])
trycast.TypeNotSupportedError: isassignable does not know how to recognize generic type typing.ClassVar.
>>> isassignable(5, InitVar[int])
TypeError: isinstance() arg 2 must be a type, a tuple of types, or a union

(Heh. I especially need to fix that last error message to be something sensible.)

Stringified TypeForms

Runtime type-checkers, however, basically cannot cope with stringified type hints – like, any stringified type hints. In the general case, doing so requires non-portable call stack inspection. It's slow. It's fragile. It's non-portable. It basically never works right. Even when it works "right," it never works the way users expect.

let us all quietly admit that stringified type hints were a shambolic zombie plague that should have never happened. We certainly shouldn't be expanding the size and scope of stringified type hints. We should be deprecating, obsoleting, and slowly backing away from stringified type hints

Agreed that stringified TypeForms - where the entire type is a string, not just some interior forward references - are very difficult to work with at runtime. I allude to this in §"How to Teach This", but I thought I had used stronger language than what I now see: 😉

  • Stringified type annotations (like 'list[str]') must be parsed (to something like typing.List[str]) to be introspected.
  • Resolving string-based forward references inside type expressions to actual values must typically be done using eval(), which is difficult/impossible to use in a safe way.
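Both bullets can be seen in a few lines. The following is a rough sketch, assuming Python 3.9+ for `list[int]`; `resolve_hint` is a hypothetical helper, not an API from trycast or any other library.

```python
import typing


def resolve_hint(hint: object, namespace: dict) -> object:
    """Turn a stringified hint into a runtime object; pass others through."""
    if not isinstance(hint, str):
        return hint
    # WARNING: eval() executes arbitrary code. Only use it on trusted strings.
    return eval(hint, dict(namespace))


namespace = {"list": list, "int": int}

# The string must be parsed and evaluated before it can be introspected.
resolved = resolve_hint("list[int]", namespace)
origin = typing.get_origin(resolved)
args = typing.get_args(resolved)

# A forward reference to a name the caller did not supply cannot be resolved.
try:
    resolve_hint("IntTree", namespace)
    unresolved = False
except NameError:
    unresolved = True
```

The sketch works only because the caller hands over the right namespace and trusts the string, which are exactly the two things a general-purpose library cannot assume.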

(Note to self: Increase emphasis in the PEP RE how difficult it is to work with stringified type annotations at runtime.)

The current PEP draft defaults to allowing stringified TypeForms since static type checkers already expect & handle them robustly in locations where a type expression can appear. But - upon further thought - anything that can fit into a TypeForm must be capable of being well-supported both by static and runtime type checkers in order to spell an implementable function definition.

So I'm inclined to agree that TypeForms probably shouldn't allow stringified annotations since they're basically impossible to work with robustly at runtime.

Edit: I changed my mind RE not allowing stringified annotations to be matched, to prioritize aligning with matching all "type expressions" (which include them).

@patrick-kidger
Copy link

patrick-kidger commented May 8, 2024

What should a static type checker infer for the ??? position above? (Pause to consider your own answer here...)

I have a possibly-controversial suggestion (that I don't feel too strongly about right now), which is that this isn't defined behaviour. For example, if I fill in the type hint explicitly with pyright (which is what I have installed at the moment), then we get the perfectly meaningless:

from typing import Final, TypeGuard

def foo(x) -> TypeGuard[Final[int]]:
    pass

x = 1
if foo(x):
    reveal_type(x)  # Type of `x` is `Final`.

To expand on this, the proposed TypeForm-that-isn't-TypeHint feels to me a bit like "the type of all positive integers". At some point we make the jump from properties we care to express in the type system to properties we don't. I feel like "the set of all valid parameters T for TypeIs[T]" is probably already substantially more niche than "the type of all positive integers" -- in fact I suspect the latter would see quite a lot more use-cases! -- but we don't implement that.

@JelleZijlstra
Member

I would argue that the types allowed by TypeForm should exactly match the definition of either "type expression" or "annotation expression" in the spec (https://typing.readthedocs.io/en/latest/spec/annotations.html#type-and-annotation-expressions). This reduces the number of concepts and makes the overall system simpler. Possibly we should add both, which suggests obvious names for the new special forms: AnnotationExpression[T] and TypeExpression[T].

I don't think it is practical to disallow stringified annotations. Consider a type alias from a third party library that is defined as Alias = list[int]. You can use Alias as a TypeForm. If the library now changes to Alias = list["int"], does that mean Alias is no longer valid as a TypeForm? Similarly, if we disallow Self, should we also disallow list[Self]?
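To illustrate the alias example, here is what the runtime objects look like (a small sketch assuming Python 3.9+; the alias names are made up for illustration):

```python
from typing import ForwardRef, List, get_args

Alias = list[int]            # interior argument is a real class
AliasWithString = list["int"]  # interior argument is a plain str
LegacyAlias = List["int"]    # typing.List wraps the string in a ForwardRef

args_plain = get_args(Alias)
args_string = get_args(AliasWithString)
args_legacy = get_args(LegacyAlias)
```

All three aliases are ordinary objects that a caller can pass around; the string only shows up once the arguments are inspected, so a rule that excluded stringified annotations would have to look inside every alias.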

There will always be some types that are hard for a runtime type checker to check. For example, is_assignable(some_generator(), Generator[int, str, float]) would be impossible to fully check without analyzing the bytecode of the generator.
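A small sketch of what a runtime checker can and cannot see here, using only the standard library (the body of `some_generator` is invented for illustration):

```python
from collections.abc import Generator


def some_generator() -> Generator[int, str, float]:
    received = yield 1              # yields an int, expects a str to be sent
    return float(len(received))


gen = some_generator()

# The only cheap runtime check is on the object's shape, not its parameters.
shape_ok = isinstance(gen, Generator)

# The generator object itself carries no record of its yield/send/return types.
has_type_args = hasattr(gen, "__args__")

first = next(gen)  # values can only be observed by actually running it
gen.close()
```

Checking the yielded value consumes it, and the send and return types stay invisible until the generator is driven further.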

To @patrick-kidger's example, pyright correctly shows an error on the TypeGuard[Final[int]] line, because TypeGuard[...] requires a type expression and Final[int] isn't one. I don't think you can draw a conclusion from pyright's behavior on the rest of an invalid program.

@TeamSpen210
Contributor

To me, it feels like we should tend towards being as loose as possible with what is permitted as a TypeForm. Anything using it is going to have restrictions at runtime, in ways that couldn't possibly be easily expressed. Users are going to have to check documentation/rely on runtime exceptions to know what is allowed, so restricting a few specific cases doesn't help too much?

For Final, ClassVar and InitVar, the rule a static type checker could use is to simply strip them off before evaluating the guard. These in particular I can see uses for, to do things like match specific configurations of, say, a dataclass field object. Maybe for things like Self that don't meaningfully interact with TypeGuard/TypeIs, there should just be a type error at the point where you call such a function with such a variable, and no narrowing occurs.
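The "strip them off" rule has a straightforward runtime counterpart. This is only a sketch of the idea; `strip_qualifiers` is a hypothetical helper and handles just the three qualifiers mentioned above.

```python
from dataclasses import InitVar
from typing import ClassVar, Final, Union, get_args, get_origin


def strip_qualifiers(form: object) -> object:
    """Peel Final[...], ClassVar[...] and InitVar[...] off a form."""
    while True:
        if isinstance(form, InitVar):
            # InitVar[int] stores its argument on the `.type` attribute.
            form = form.type
        elif get_origin(form) is Final or get_origin(form) is ClassVar:
            (form,) = get_args(form)
        else:
            return form
```

For example, `strip_qualifiers(Final[int])` and `strip_qualifiers(InitVar[int])` both give `int`, and a form with no qualifier is returned unchanged.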

@davidfstr
Contributor Author

Values

Several folks have recommended not bifurcating the existing concepts of "annotation expressions" and "type expressions" to a further third subset, and to instead just pick one of the first two.

Since the main utility of TypeForm[] is using it in combination with TypeIs[] + TypeGuard[], and because a "type expression" is what those forms accept, I'm inclined to round the concept of TypeForm to exactly match a "type expression".

Name

With the above meaning defined for the concept, I'm looking at renaming TypeForm[] to TypeExpression[].

There may be a desire to define a separate concept that aligns with "annotation expressions", perhaps called AnnotationExpression[], but I don't think it's valuable to define in this PEP. I don't see any benefits in being able to spell AnnotationExpression[] vs just spelling object, as you currently must do.

Stringified TypeForms

By rounding the concept of TypeForm[] to exactly match a "type expression", that would imply that stringified annotations like 'list[int]' would be allowed. Despite being allowed, runtime type checkers cannot handle them reliably at runtime. However this is not a unique problem: there are a number of type expressions - Generator[...] 1, Callable[...] 2, stringified annotations 3 - that are particularly hard to work with at runtime already.

Perhaps it would be sufficient in §"How to Teach This":

  • to mention that it's not expected that runtime type checkers necessarily handle every possible kind of TypeExpression[] input, which is already the status quo, and
  • to acknowledge certain specific kinds of type expressions which have been difficult to work with.

New idea: Matching TypeExpression[]s with an ABC?

@erictraut has expressed concern that it would be difficult for a static type checker like pyright to match TypeExpression[]s in locations that would normally accept only a regular value expression. 1

Static type checkers already have to deal with recognizing ABCs, so I wonder if defining a TypeExpression[] as an ABC would make it easier for a static type checker to recognize...

A quick proof of concept:

>>> from abc import ABC
>>> from typing import *
>>> import typing
>>> 
>>> class TypeExpression(ABC):
...     pass
>>> 
>>> type(str)
<class 'type'>
>>> TypeExpression.register(type)
<class 'type'>
>>> 
>>> type(Union[int, str])
<class 'typing._UnionGenericAlias'>
>>> TypeExpression.register(typing._UnionGenericAlias)
<class 'typing._UnionGenericAlias'>
>>> 
>>> isinstance(str, TypeExpression)
True
>>> isinstance(Union[int, str], TypeExpression)
True

Footnotes

  1. https://docs.google.com/document/d/1PRvl3uKE-BxvmyFO3Ic4fZDpgW2fOU58aYh3LJ3zLpk/edit?disco=AAABMt7c3oU

@JelleZijlstra
Member

renaming TypeForm[] to TypeExpression[]

We could also consider TypeExpr for brevity; "expr" is a fairly common and well-understood abbreviation of "expression".

I wonder if defining a TypeExpression[] as an ABC would make it easier for a static type checker to recognize

I don't think that would work very well. The fact that Union[str, int] is an instance of _UnionGenericAlias at runtime is an implementation detail that type checkers should not be aware of. Even if we make TypeExpression an ABC, type checkers would still need special-casing to know that Union[str, int] is compatible with TypeExpression.
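A short sketch of where class-based registration falls short even at runtime. It reuses the hypothetical `TypeExpression` ABC from the proof of concept above and registers only `type`, to stay independent of `typing` internals:

```python
from abc import ABC


class TypeExpression(ABC):
    """Hypothetical ABC from the proof of concept; not a real typing API."""


TypeExpression.register(type)

# Ordinary classes match, as intended.
matches_class = isinstance(str, TypeExpression)

# `None` and the string 'int' are valid type expressions, yet neither is
# an instance of any registered class.
matches_none = isinstance(None, TypeExpression)
matches_string = isinstance("int", TypeExpression)

# Registering `str` to cover stringified hints overshoots: every string
# now matches, whether or not it spells a type.
TypeExpression.register(str)
matches_any_string = isinstance("not a type at all", TypeExpression)
```

Being a type expression depends on how a value is written and what it means, not on its runtime class, so a checker needs dedicated logic either way.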

@mikeshardmind

mikeshardmind commented May 11, 2024

@davidfstr I think this discussion should be happening on discourse as it affects more than just mypy. As for the concern Eric brought up, that was (part of) the reason for what I mentioned with TypeForm[T, *Args] earlier in this thread, and here https://discuss.python.org/t/typeform-spelling-for-a-type-annotation-object-at-runtime/51435/2 as well. The other part being that it allows explicitly marking with overloads which types of type expressions functions handle

@davidfstr
Contributor Author

@davidfstr I think this discussion should be happening on discourse as it affects more than just mypy.

Agreed. I have requested that folks respond on the discourse thread, but most responses have actually happened here so far. I have been giving counter-responses in the same venue that I received responses, which so far has been mostly here.

As for the concern Eric brought up, that was (part of) the reason for what I mentioned with TypeForm[T, *Args] earlier in this thread

I'm not sure how

  • the TypeForm[OriginType, *ArgTypes] syntax

is related to

  • Eric's concern that it would be difficult for a static type checker like pyright to match TypeExpression[]s in locations that would normally accept only a regular value expression.

[the TypeForm[OriginType, *ArgTypes] syntax] allows explicitly marking with overloads which types of type expressions functions handle

Yes, however I've already mentioned that: I’m not sure that it's desirable (or even possible) to write out the exact constraints on what kinds of forms a particular function will accept. (And I think that’s OK.) I also responded further later in the thread.

@davidfstr
Contributor Author

Draft 3 of the TypeExpr (formerly TypeForm) PEP is ready for review. Please leave your comments in the discuss.python.org thread.
