Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issue about type annotation #822

Closed
ain-soph opened this issue Jan 11, 2021 · 27 comments
Closed

Issue about type annotation #822

ain-soph opened this issue Jan 11, 2021 · 27 comments
Labels
fixed in next version (main) A fix has been implemented and will appear in an upcoming version

Comments

@ain-soph
Copy link

ain-soph commented Jan 11, 2021

I think we should allow type modification after function returns. This is essential for Union and subclass.
See 2 cases below:

from typing import Union

def func(x: Union[int, float]) -> Union[int, float]:
    return x

x: float = func(13.0)
# However, x is still Union[int, float] according to Pylance.
class A:
    pass

class B(A):
    utils: int = 13
    pass

def func2(x: A) -> A:
    return x

y: B = func2(B())
y.utils = 15  # However, y is still regarded as A type and doesn't have attribute utils according to Pylance.
@erictraut
Copy link
Contributor

You have a misunderstanding of how type systems and type checking work. It can be confusing at first if you haven't used a typed language. For details of type checking within Python, please refer to PEP 483 and 484.

Let me explain some concepts based on the two examples you provided.

In your first example, the declared type of x is float. If it is assigned a subtype of float, it can be temporarily "narrowed" to that subtype. The type Union[int, float] is a narrowed version of float, so if you hover over x at that point in the code flow graph, it will show that narrowed type. If you were to add the statement x = 1 to the bottom of your first example and then hover over the x, you would see that it now has a narrowed type of Literal[1] at that point in the code flow graph. The type checker generally attempts to produce the narrowest type possible based on assignments and conditional checks. For more details about type narrowing, refer to this documentation.

In your second example, you have declared that y must be of type B, but you're attempting to assign it an instance of A, which is not a subtype of B. The type checker correctly reports this type violation.

If your intent is for func2 to accept any subclass of A and return the same subclass, then you would need to make it a generic function that uses a bound TypeVar, like this:

from typing import TypeVar

class A:
    pass

class B(A):
    utils: int = 13
    pass

_T = TypeVar("_T", bound=A)

def func2(x: _T) -> _T:
    return x

y: B = func2(B())
y.utils = 15

Hope that helps.

@jakebailey jakebailey added the waiting for user response Requires more information from user label Jan 11, 2021
@github-actions github-actions bot removed the triage label Jan 11, 2021
@ain-soph
Copy link
Author

ain-soph commented Jan 11, 2021

@erictraut Thanks for your detailed explanation! It makes me clear about the type narrowing. And I realize that the declaration goes first, and then narrow to the method return type.

However, there still some cases confusing.

import numpy as np

c: np.ndarray = np.bincount([1, 1, 3, 3, 4])

If I run the program, np.bincount will return a np.ndarray for sure. So it is written in the docs.
However, the Pylance claims the return type will be a tuple because numpy doesn't write explicit return type annotation, and Pylance makes predictions from https://github.com/numpy/numpy/blob/e745a19cb0ea9af0fd252a4625ec793130f87414/numpy/core/multiarray.py#L952 but ignores the decorator function that further transforms tuple to np.ndarray.

This is certainly not correct and will cause violation, and should be fixed (by some stub pyi files perhaps?).
I just wonder from user side, can I manually fix it by something like c: np.ndarray = np.bincount([1, 1, 3, 3, 4]). (I know this is certainly not working now.) Currently, this declaration won't work and the type of c is still tuple for Pylance.


I still have questions about the Union and subclass cases. But I think I need to first go over the listed PEPs and come back to this issue later.

@erictraut
Copy link
Contributor

If you want to enable type checking, annotated libraries are very important. Pylance will do its best to infer types in the absence of annotations, but inference is prone to errors, and you'll run into many issues. Parts of numpy are even delivered as native libraries, so there's no Python code for Pylance to use as a source for inference. So you really need type stubs for type checking to work. Try data-science-types and see if that meets your needs.

@ain-soph
Copy link
Author

ain-soph commented Jan 12, 2021

Yes, I admit that using stub could solve this problem for numpy, and that should be a standard that every library should have correct type annotation.
But the issue here is whether the user could handle that in their codes. I know the function return type annotation in some other libraries is actually not correct or missing, could I fix it in my code by a declaration that c is actually a np.ndarray?

Well, it seems not. And Pylance will keep the wrong prediction forever no matter what I do in the code (unless I write a stub elsewhere.).

I think it might need some feature improvement to support this?

@jakebailey
Copy link
Member

Numpy support is tracked in #150. Numpy itself may ship with stubs soon (depending on their quality).

I'd think you'd be able to manually override the type and stick a # type: ignore on it, but if type checking is off you won't get any errors. I'd be surprised if it didn't respect your annotation.

@ain-soph
Copy link
Author

ain-soph commented Jan 12, 2021

@jakebailey
Thanks for your involvement. As you can see that the first tolist is white color and no linting because c is still regarded as a tuple, and I can't go to definition.
As an expected example, you can see e.tolist() where tolist is shown in yellow color and I can go to definition. This is because np.ones returns an Any.
image

@jakebailey
Copy link
Member

This shows that annotating the variable works (what you described as a fix), it's just that ndarray doesn't have a typed tolist, which is a different problem.

@ain-soph
Copy link
Author

ain-soph commented Jan 12, 2021

This shows that annotating the variable works (what you described as a fix), it's just that ndarray doesn't have a typed tolist, which is a different problem.

Nope? Because e has tolist and I can go to definition and see the method implementation. Even though it actually goes into a pyi stub in vscode pylance plugin. But that's another issue I've seen in the issue list.

@erictraut
Copy link
Contributor

You can certainly declare the types of variables within your own code. However, you can't declare the return types of library functions in your own code. You would need to write a stub file to do that.

You can call the typing.cast function to override the type of any expression.

from typing import cast

# Cast is required because np.bincount function is not annotated.
c = cast(np.ndarray, np.bincount([1, 1, 3, 3, 4]))

@jakebailey
Copy link
Member

My mistake, I missed the yellow.

@jakebailey
Copy link
Member

I guess I'm confused as to why annotating it doesn't work; I would have expected c to be a variable of type ndarray, but then an error be emitted at the assignment that says "this isn't the right type", then later in the code c is still the type it was declared as.

@ain-soph
Copy link
Author

ain-soph commented Jan 12, 2021

@erictraut
Yes, that's what I exactly mean. I think it's a little inconvenient for most users if they have to write stub or typing.cast.

To clarify, the expected behavior should be: If the function return type is conflict with the variable declaration, raise Error (or warning?), but the variable type should still be the declaration type.

So it is for the Union or subclass.

  1. If the function return type is a wider range Union[float, int] and the declaration is float, I think the expected variable type should be float rather than Union[float, int], which is a more narrow type.
  2. If the function return type is a wider range A and the declaration is B (B subclass A), I think the expected variable type should be B rather than A, which is a more narrow type.

But I think I might understand incorrectly for the word narrow, since it seems Union[float, int] is more narrow than float.

@erictraut
Copy link
Contributor

Most users don't care about static type checking. For those that do, Python is going to be inconvenient until library authors provide complete and accurate type annotations. We're doing our best to accelerate that.

Yes, you are correct about the expected behavior.

x: int = 3
reveal_type(x) # Literal[3]
x = 5.4 # This line generates an error because 5.4 is a float
reveal_type(x) # int

In the above example, the type of x is narrowed from int to Literal[3] because of the first assignment. The second assignment is an error because 5.4 is a float, not an int. The type of x reverts back to its declared type int after the illegal assignment.

@ain-soph
Copy link
Author

ain-soph commented Jan 12, 2021

@erictraut
Thanks for your instance. But I don't quite agree with it? In python, after x = 5.4, it becomes a float, and pylance identifies that correctly. I think it's expected.
In C++, that will raise an error for sure.
image

@erictraut
Copy link
Contributor

You have typeCheckingMode set to "off". If you enable type checking (i.e. set python.analysis.typeCheckingMode to "basic"), you will see the following:

Screen Shot 2021-01-11 at 4 56 13 PM

If you're not interested in type checking and simply want completion suggestions and other language features, you should leave type checking off.

@ain-soph
Copy link
Author

ain-soph commented Jan 12, 2021

Thanks, you are correct. I've tested with python.analysis.typeCheckingMode="basic" and it works. (But I get thousands of error reports in my project 😂 I thought I was a careful python programmer, so naive.)

So the current issue is to get the expected behavior for python.analysis.typeCheckingMode="off"?
(Personally, I think the type of x is float is very good when mode='off' in the previous example and we should just keep that.)

We just need to solve the conflict between variable type declaration and method return type as np.bincount example shows.
At least allow users to fix the wrong prediction by Pylance.

@erictraut
Copy link
Contributor

If type checking is off, you will not see any type-checking errors. You will receive completion suggestions and other language features but no type checking. Why do you care what types Pylance predicts as long as you're getting the best possible completion suggestions in most cases?

Pylance intentionally changes its behavior when type checking is disabled to favor better completion suggestions based on the types of the values assigned, versus the declared types. The assumption is that if type checking is disabled, you don't care about the declared types as much as the actual assigned types.

What you're proposing is that we don't change the behavior between the two modes. That may address the specific examples you've cited, but I think it will create even problems in more situations. In other words, I think such a change would be a net negative for most Pylance users. We'll discuss this internally and get back to you, but right now I'm not inclined to change the behavior.

@ain-soph
Copy link
Author

ain-soph commented Jan 12, 2021

Yes. I don't care about the type-checking errors. I just want completion suggestions.
But now I have no auto-completion for c.tolist. That's the problem.

The problem is caused by Pylance getting wrong type predictions and users can't fix it manually without using stub or typing.cast.

I know it might be difficult to handle. Just hope we can finally get a good solution.

import numpy as np

c: np.ndarray = np.bincount([1, 1, 3, 3, 4])
c.tolist()

@erictraut
Copy link
Contributor

It's actually trivial to change the behavior and make it consistent. It simply involves deleting an if statement in the code here: https://github.com/microsoft/pyright/blob/f1da893256092f55549c1b38710bf4aec9a9f075/packages/pyright-internal/src/analyzer/typeEvaluator.ts#L2497. I'm still not convinced it's a good tradeoff though. I think it will fix some cases but break even more of them.

@ain-soph
Copy link
Author

ain-soph commented Jan 12, 2021

Thanks a lot for your patient discussion. Hope it can get solved!
Sorry for not being able to get so deep into Pyright, but I'll keep watching on this issue for sure.

@erictraut erictraut added needs decision Do we want this enhancement? and removed waiting for user response Requires more information from user labels Jan 12, 2021
@erictraut
Copy link
Contributor

We had a good internal debate about the pros and cons of making this change. We decided to make the change and get feedback. If many Pylance users dislike it, we could revert it or look for some other solution.

@erictraut erictraut added fixed in next version (main) A fix has been implemented and will appear in an upcoming version and removed needs decision Do we want this enhancement? labels Jan 12, 2021
@ain-soph
Copy link
Author

We had a good internal debate about the pros and cons of making this change. We decided to make the change and get feedback. If many Pylance users dislike it, we could revert it or look for some other solution.

You guys are really working efficiently! So damn quick, which violates my previous impression on Microsoft😂

@jakebailey
Copy link
Member

This issue has been fixed in version 2021.1.1, which we've just released. You can find the changelog here: https://github.com/microsoft/pylance-release/blob/main/CHANGELOG.md#202111-13-january-2021

@hassanselim0
Copy link

I personally support this change, I have type-checking off, and I prefer to have a way to force a type without using cast (by using the type annotation as in case 1 here).
I don't fully understand the cons of this change, but if I find any I'll report them here.

@ain-soph
Copy link
Author

ain-soph commented May 1, 2021

@erictraut
Here I find a disadvantage of previous modification (not quite sure if it is the cause), but I think it's possible to fix it.

Expectation

Let me first put my expectation here (2 cases):

    a: set[int] = {}           # 'a' is set[int]
    a.add(4)                   # with good linting, 'a' is set[int]

    a = list(a)                # 'a' is list[int]
    a.sort()                   # with good linting, 'a' is list[int]

and

    a: set[int] = {1,2,3}      # 'a' is set[int]
    a.add(4)                   # with good linting, 'a' is set[int]

    a: list[int] = list(a)     # 'a' is list[int]
    a.sort()                   # with good linting, 'a' is list[int]

Bug

To describe the bug, let's see a working-well case first (with automatic type annotation):

    a = {1,2,3}      # 'a' is set[int]
    a.add(4)         # with good linting, 'a' is set[int]

    a = list(a)      # 'a' is list[int]
    a.sort()         # with good linting, 'a' is list[int]

However, sometimes we have to construct a = {} with no element first. But we know the element should be int, so we annotate it with a: set[int] = {}, which makes the scenario I describe in the expectation case 1. But the actual behavior is:

    a: set[int] = {1,2,3}      # 'a' is set[int]
    a.add(4)                   # with good linting, 'a' is set[int]

    a = list(a)                # 'a' is still set[int]
    a.sort()                   # with no linting, 'a' is set[int]

From User's end, I try to fix the wrong annotation by a: list[int] = list(a), which is the expectation case 2. But it just leads to another unexpected issue: The 2nd annotation overrides the 1st one between the sentences 1st and 2nd annotation are claimed.

    a: set[int] = {1,2,3}      # 'a' is list[int]
    a.add(4)                   # with no linting, 'a' is set[int]

    a: list[int] = list(a)     # 'a' is list[int]
    a.sort()                   # with good linting, 'a' is list[int]

I think the current behavior is not what we expect. And if anyone think this shall be a new issue different from this one, I'm okay with that.

@erictraut
Copy link
Contributor

The current behavior is consistent with how type checking works in Python. See PEP 484 and PEP 526 for details.

When you declare the type of a variable, it is the job of a type checker to enforce that type and report an error if you attempt to assign a value to that variable that does not conform to that type.

If your intent is for the variable to accept different types, you can declare it with a union type.

a: Union[set[int], list[int]] = {1,2,3}
a.add(4)

a = list(a)
a.sort()

Or you can use a different variable name for a separate type.

a: set[int] = {1,2,3}
a.add(4)

b = list(a)
b.sort()

@ain-soph
Copy link
Author

ain-soph commented May 1, 2021

@erictraut
Thanks for your quick response.
It makes sense that I (as a programmer) should take care of it by myself, even though I wish my expectation works (but that wish is too much.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
fixed in next version (main) A fix has been implemented and will appear in an upcoming version
Projects
None yet
Development

No branches or pull requests

4 participants