False positive for possibly unbound variable #496

hauntsaninja · 2020-10-15T23:57:32Z

Language Server version: Pylance language server 2020.10.0 (pyright d1f950ac)
Python version (& distribution if applicable, e.g. Anaconda): 3.8

Expected behaviour

Pylance should treat the else statement as being unreachable and shouldn't complain about the possibility of a being unbound.

Actual behaviour

Pylance doesn't treat the else statement as being unreachable and does complain about the possibility of a being unbound.

Code Snippet / Additional information

from typing import Union

def f(x: Union[int, str, list]) -> None:
    if isinstance(x, int):
        a = 1
    elif isinstance(x, str):
        a = 2
    elif isinstance(x, list):
        a = 3
    else:
        reveal_type(x)
    print(a)

The text was updated successfully, but these errors were encountered:

erictraut · 2020-10-16T04:20:21Z

I agree that it should theoretically be possible to determine that a is not unbound in this instance, but I can't think of a practical way to determine this using the lazy evaluation technique employed in the pyright architecture. The TypeScript compiler (upon which the pyright architecture is based) does not handle this case either.

Here are several workarounds:

Assign a a dummy value in the else clause.
Remove the unnecessary elif isinstance(x, list) statement and replace it with an else.
Use a # type: ignore comment to suppress the error on that line.
Use a # pyright: reportUnboundVariable=false comment to suppress the error in that file.

savannahostrowski · 2020-11-19T21:41:35Z

This issue has been open for 30 days waiting for information. I'm going to close this issue but if the above guidance doesn't work for you, please let us know by reopening the issue.

hauntsaninja · 2020-11-19T21:53:37Z

I don't think this is a waiting-for-info (if it is, let me know what more information I can give!); looks to me like Eric considers this a won't-fix given the limitations of pyright's architecture.

jakebailey · 2020-11-19T21:55:38Z

We marked this as waiting-for-info to see if you would respond with any feedback for the workarounds, or if the original code is the only way to do it as intended.

erictraut · 2020-11-19T22:11:14Z

Here's another better workaround. I've seen this technique recommended in some mypy discussions. It involves the use of an assert_never function.

def assert_never(a: NoReturn) -> NoReturn:
    raise RuntimeError("Should not get here")

def f(x: Union[int, str, list]) -> None:
    if isinstance(x, int):
        a = 1
    elif isinstance(x, str):
        a = 2
    elif isinstance(x, list):
        a = 3
    else:
        assert_never(x)

    print(a)

hauntsaninja · 2020-11-19T22:49:52Z

Yeah, I used an else that had raise AssertionError, i.e. the runtime equivalent of the assert_never trick. assert_never is a good fit for this kind of issue, thanks for reminding me of it!
Feedback for the other workarounds: 1) is kind of ugly / didn't work well with my actual code; 2) reads less explicitly / isn't as robust in the face of change; 3,4) type ignores are always an acceptable workaround for false positives :-)

There's also the question of the false negative for the unreachable code (which obviously doesn't need a workaround / isn't as big a deal). For what it's worth, mypy --warn-unreachable catches this (and also correctly avoids flagging assert_never / raise AssertionError).

mcrumiller · 2021-01-05T21:01:57Z

Edit: I'm super wrong, sorry. Keeping what I wrote for posterity.

Yeah, there's no way pylance could do this. isinstance might be overloaded in the current module, in which case you could get a runtime error once you got to print(a). Imagine the following:

def isinstance(x, type):
    return False

from typing import Union

def f(x: Union[int, str, list]) -> None:
    if isinstance(x, int):
        a = 1
    elif isinstance(x, str):
        a = 2
    elif isinstance(x, list):
        a = 3
    else:
        print(type(x))

    print(a)

f(3)

This gives error:

Exception has occurred: UnboundLocalError
local variable 'a' referenced before assignment
  File "C:\code\blah.py", line 16, in f
    print(a)
  File "C:\code\blah.py", line 18, in 
    f(3)

pylance would have to be able to interpret and validate isinstance, which would probably violate the halting problem.

erictraut · 2021-01-05T21:08:51Z

@mcrumiller, that's not a problem. Pylance knows the difference between builtins.isinstance and a locally-defined version. Narrowing logic is applied only for the builtins version.

hauntsaninja · 2021-01-05T21:09:03Z

That's not really relevant — the same argument argument applies to any type narrowing involving isinstance. It's easy for static type checkers to detect the overloading you describe and all of them do.

sylann · 2021-01-29T14:32:58Z

I am not sure about the first example but there is one other case that I encounter a lot, in many code bases, and it seems a bit more problematic in my opinion since it is a basic specificity of python:

Here is an example:

def best_unit(dt: float) -> Tuple[float, str]:
    units = (
        (1.0, "sec"),
        (1e-3, "msec"),
        (1e-6, "usec"),
        (1e-9, "nsec"),
    )
    for scale, unit in units:
        if dt >= scale:
            break
    return scale, unit  # "unit" is possibly unbound Pylance(reportUnboundVariable)

In this case the warning message is semantically incorrect because units is obviously not empty.

erictraut · 2021-01-29T15:59:19Z

A type checker has no way of knowing whether an iterable type is going to iterate zero or more times. It simply knows that it's iterable and what types it will return. In this case, units is an iterable type, and the for statement is iterating over it. You happen to know that units is a non-empty tuple, and when a non-empty tuple is iterated over, it will not iterate zero times. A type checker does not have this information or knowledge. If it assumes that all iterables iterate at least once in all cases, it will fail to detect an entire class of errors.

A tuple is the only iterable type where a type checker sometimes (but not always) knows the length. All other iterable types (like lists, dictionaries, etc.) have lengths that are unknown to a static type checker. It's possible that in the special case of the tuple (as shown in your example), a type checker could avoid this false positive, but it would be complicated because the control flow graph would need to vary based on the type of the iterable. I'll give more thought to this and see if I can come up with a way to handle this case without introducing other false positives or greatly slowing down analysis.

In the meantime, the workaround is straightforward. If you're going to use a variable after a loop, always initialize it before the loop.

unit = ""
for scale, unit in units:
    ...
return scale, unit

Or add a # type: ignore comment.

jakebailey · 2021-02-04T00:16:58Z

This issue has been fixed in version 2021.2.0, which we've just released. You can find the changelog here: https://github.com/microsoft/pylance-release/blob/main/CHANGELOG.md#202120-3-february-2021

localrobot · 2021-02-04T19:08:23Z

I can still see reproduce this in a try-except-else-finally block in version 2021.2.0, python 3.9.0, VS Code 1.53.0

try:
    some_fn_that_can_throw_an_exception()
except Exception:
    success = False
else:
    success = True
finally:
    print(success)  # "success" is possibly unbound

Or, am I missing something?

erictraut · 2021-02-04T19:44:30Z

The try/except/else/finally clause is different from this issue. Please refer to this issue instead.

hauntsaninja · 2022-02-17T09:29:25Z

Thank you for the talk earlier today!

I was reminded of this issue and thought it might be interesting to understand why lazy type evaluation means pyright can't determine that that branch is unreachable and "a" is unbound (although maybe I just need to start reading pyright code :-) Feel free to ignore this comment, it's basically just me rambling to myself.

From the slides, it sounds like: pyright walks backward through the code flow graph. It sees a, tries to evaluate it, and sees a branch that doesn't have a assigned. But it doesn't yet know that that branch is unreachable — because you need to go forward from narrowed variables to deduce this — so it complains.

Trying out some stuff, pyright doesn't complain about:

def g():
    if True:
        a = 0
    else:
        pass
    print(a)  # no complaint

So clearly pyright is able to eliminate parts of the code flow graph as unreachable. But:

def h(truth: Literal[True]):
    if truth:
        a = 0
    else:
        pass
    print(a)  # pyright complains

def i():
    truth: Final = True
    if truth:
        a = 0
    else:
        pass
    print(a)  # pyright complains

So seems like whatever reachability checking of the code flow graph pyright does happens before we get a chance to infer local types?

mypy has some reachability analysis like if TYPE_CHECKING happen early in semantic analysis, before we know any types, but most of it happens at type check time, where it's eager and forward and knows about everything. It knows when it's done some impossible narrowing and marks statements in scope as unreachable, with some silencing logic for other diverging statements like raise Exception and assert_never

Anyway, I think I see why it's not really feasible for pyright to determine branch reachability. There's just no good time to do it! So (idiomatic use of) assert_never actually gives pyright information, whereas with mypy it's usually telling mypy something mypy already knows.

def j():
    truth: Final = True
    if truth:
        a = 0
    else:
        assert_never(truth)
    print(a)  # no complaints

Here's an interesting example:

def k():
    truth: Final = True
    if truth:
        a = 0
    else:
        def inner():
            return truth
        inner()  # pyright knows that this returns Never, but there are no reachability implications
        1 + 1
    print(a)  # pyright complains

That sort of squares with "pyright determines reachability when building a code flow graph, at which point it doesn't have access to inferences made from code flow".

Okay, but maybe pyright could still avoid the false positive for a being unbound? There is still a good time to do this: do extra checks when we'd otherwise complain about it being unbound (e.g. traverse the problematic code paths in the forward direction and see if anything gets narrowed to Never). That sounds kind of gross, but not too performance killing.

(I've already rambled longer than I thought I would, so I'm cutting myself off. Feel free to ignore, but I wouldn't mind code pointers or commentary)

erictraut · 2022-02-18T01:15:41Z

Like mypy, pyright is able to statically evaluate certain expression forms within the binder. This includes expressions like True or TYPE_CHECKING or sys.version_info >= (3, 10). The binder takes into account these expressions when it builds the code flow graph.

The other examples above involve expressions that cannot be evaluated by the binder. They involve the type evaluator, which depends on the code flow graph.

All of the examples you provided could theoretically be handled by pyright in the type evaluator, but they would be expensive in terms of analysis performance. In response to this bug report, I added support for "implied else" narrowing. When I initially added this support, we saw a significant spike in analysis and completion times. I mitigated this by excluding the "falsy/truthy" type narrowing pattern, which is not typically used in cases where type exhaustion is the goal.

The "implied else" support doesn't work in the case where the "else" is explicit, as in a couple of your cases above.

I'm wondering whether you're raising this primarily because you're curious about a theoretical edge case, or is this a real case that you think is important to handle? Like I said, there are ways to handle this in pyright, but I question whether that would be the right tradeoff if it were to have a performance impact that affected the vast majority of pylance and pyright users.

hauntsaninja · 2022-02-18T06:19:12Z

I don't think anything here is particularly important to handle and I trust you to make the correct tradeoffs.
My comment ^ was just me trying to understand what can be done performantly with lazy type evaluation and why pyright can or cannot do certain things :-) Sorry for not making that clear!

Oh neat, "implied else" is very similar to these examples. It looks like it might be implemented here: https://github.com/microsoft/pyright/blob/b9e33caee9823fc59a4ee53103ce5fcbdca1f71b/packages/pyright-internal/src/analyzer/binder.ts#L1125 I'll play around with it and see if I start to understand pyright's impl better.

I also noticed this maybe-missing-feature?

def func1(x: Literal[True]):
    reveal_type(x)  # Literal[True]
    if x is True:
        y = True
    
    print(y) # no error


def func2():
    x: Final = True
    reveal_type(x)  # Literal[True]
    if x is True:
        y = True
    
    print(y) # error

erictraut · 2022-02-18T06:38:10Z

This difference in behavior highlights another limitation of the "implied else narrowing" implementation. It requires that the reference expression for the condition (in this case, the reference expression is x) has a declared type. If its type were not declared, it would need to be inferred, and that's expensive. That's because we're already in the process of inferring the type of y (which is an expensive code flow analysis operation to begin with), and we'd need to also infer the type of x. This is especially expensive in code that is untyped.

So your second example works if you annotate x.

def func2():
    x: Literal[True] = True
    if x is True:
        y = True

    reveal_type(y)

Here's the code that imposes that limitation: https://github.com/microsoft/pyright/blob/561ec5b093cc09c3b6216c51a161388447b551b4/packages/pyright-internal/src/analyzer/codeFlowEngine.ts#L721

My main motivation for adding "implied else narrowing" was to support type exhaustion checks like the sample at the top of this issue. And in those cases, the type of the reference expression is typically declared.

gonzalogjuka · 2022-12-26T23:44:06Z

I agree that it should theoretically be possible to determine that a is not unbound in this instance, but I can't think of a practical way to determine this using the lazy evaluation technique employed in the pyright architecture. The TypeScript compiler (upon which the pyright architecture is based) does not handle this case either.

Here are several workarounds:

Assign a a dummy value in the else clause.

Remove the unnecessary elif isinstance(x, list) statement and replace it with an else.

Use a # type: ignore comment to suppress the error on that line.

Use a # pyright: reportUnboundVariable=false comment to suppress the error in that file.

Excelent workaround!

github-actions bot added the triage label Oct 15, 2020

savannahostrowski added the waiting for user response Requires more information from user label Oct 19, 2020

github-actions bot removed the triage label Oct 19, 2020

savannahostrowski closed this as completed Nov 19, 2020

jakebailey reopened this Nov 19, 2020

erictraut mentioned this issue Jan 29, 2021

Avoid unbound variable false positive when iterating over non-empty tuple microsoft/pyright#1438

Closed

erictraut added the fixed in next version (main) A fix has been implemented and will appear in an upcoming version label Jan 31, 2021

jakebailey removed the waiting for user response Requires more information from user label Feb 1, 2021

jakebailey closed this as completed Feb 4, 2021

lohedges referenced this issue in michellab/BioSimSpace May 11, 2022

Catch and handle GSL error during alignment. [closes #287]

bad708e

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

False positive for possibly unbound variable #496

False positive for possibly unbound variable #496

hauntsaninja commented Oct 15, 2020

erictraut commented Oct 16, 2020 •

edited

Loading

savannahostrowski commented Nov 19, 2020

hauntsaninja commented Nov 19, 2020

jakebailey commented Nov 19, 2020

erictraut commented Nov 19, 2020

hauntsaninja commented Nov 19, 2020 •

edited

Loading

mcrumiller commented Jan 5, 2021 •

edited

Loading

erictraut commented Jan 5, 2021

hauntsaninja commented Jan 5, 2021

sylann commented Jan 29, 2021 •

edited

Loading

erictraut commented Jan 29, 2021

jakebailey commented Feb 4, 2021

localrobot commented Feb 4, 2021

erictraut commented Feb 4, 2021

hauntsaninja commented Feb 17, 2022 •

edited

Loading

erictraut commented Feb 18, 2022

hauntsaninja commented Feb 18, 2022

erictraut commented Feb 18, 2022

gonzalogjuka commented Dec 26, 2022

False positive for possibly unbound variable #496

False positive for possibly unbound variable #496

Comments

hauntsaninja commented Oct 15, 2020

Expected behaviour

Actual behaviour

Code Snippet / Additional information

erictraut commented Oct 16, 2020 • edited Loading

savannahostrowski commented Nov 19, 2020

hauntsaninja commented Nov 19, 2020

jakebailey commented Nov 19, 2020

erictraut commented Nov 19, 2020

hauntsaninja commented Nov 19, 2020 • edited Loading

mcrumiller commented Jan 5, 2021 • edited Loading

erictraut commented Jan 5, 2021

hauntsaninja commented Jan 5, 2021

sylann commented Jan 29, 2021 • edited Loading

erictraut commented Jan 29, 2021

jakebailey commented Feb 4, 2021

localrobot commented Feb 4, 2021

erictraut commented Feb 4, 2021

hauntsaninja commented Feb 17, 2022 • edited Loading

erictraut commented Feb 18, 2022

hauntsaninja commented Feb 18, 2022

erictraut commented Feb 18, 2022

gonzalogjuka commented Dec 26, 2022

erictraut commented Oct 16, 2020 •

edited

Loading

hauntsaninja commented Nov 19, 2020 •

edited

Loading

mcrumiller commented Jan 5, 2021 •

edited

Loading

sylann commented Jan 29, 2021 •

edited

Loading

hauntsaninja commented Feb 17, 2022 •

edited

Loading