Support "is None" constraints from if statements during inference #1189

david-yz-liu · 2021-09-24T02:31:30Z

Steps

For new features or bug fixes, add a ChangeLog entry describing what your PR does.
Write a good description on what the PR does.

Description

Alright, this is a big one so please don't hold back. This PR adds infrastructure to support constraints on values during inference. Here's a simple example:

from astroid import extract_node

node = extract_node("""
def f(b):
    if b:
        x = 1
    else:
        x = None
    
    if x is not None:
        x #@
""")
print(node.inferred())

This prints [<Const.int l.4 at 0x14232a6e6d0>, <Const.NoneType l.6 at 0x142329b38b0>], using both possible values from the assignment statements. Of course, because of the x is not None if condition, only the int is possible, but astroid currently doesn't take this condition into account.

This PR changes how inference works so that only [<Const.int l.4 at 0x14232a6e6d0>] is printed.

Scope

I reviewed the existing pylint issues related to this (they are generally labelled as "topic-control-flow"). There were five types of constraints that were brought up:

x is None/x is not None
x/not x (truthy/falsey value)
isinstance(x, ...)
x == ...
Combinations of the above with and/or

There were also six primary locations where these constraints could be inferred (orthogonal to the constraints themselves):

if statement (as in above example)
and/or clauses. For example, x is not None and x[0] == 1; the second x should not be inferred as None
ternary if expressions. For example, x[0] if x is not None else 1; the first x should not be inferred as None

reassignment within an if statement, e.g.

def f():
    a = None
    if a is None:
        a = 1
    return -a

early return/raises within an if statement.
assert statements

This pull request addresses just the item 1's in each list. I used an approach that can handle 1-5 of the constraint types, and 1-3 of the locations. If this approach is approved, I think it would be pretty straightforward to make additional PRs for these items. (More on the other constraint locations below.)

Implementation overview

I added a new Constraint class to represent a constraint on a variable, like x is not None.

This is an abstract class, and can be subclassed for different constraint types.

When a variable is inferred (see inference.py, infer_name), we search the ancestor chain from the node and look for containing if statements, using these to accumulate constraints on the variable. These constraints are added to the inference context (added a new attribute to InferenceContext).

Then, in _infer_stmts (bases.py), we filter the inferred values based on whether or not they satisfy all constraints for that variable.

Uninferable is considered to satisfy all constraints.
⚠️ If there is at least one inferred value, but all inferred values are filtered out, a new Uninferable value is yielded, rather than yielding nothing at all. I think this is the conservative approach, to avoid errors from the raise_if_nothing_inferred decorator.

The above also applies to attributes (see infer_attribute).

Some comments/questions

Because constraints are gathered by searching up the ancestor chain, this PR doesn't handle real control flow issues, e.g. with the reassignment example from above.
Because the filtering occurs in _infer_stmts, it only applies to variables and attributes. So for example, subscript nodes (e.g. my_array[0]) won't have their inferred values filtered.
Gathering constraints occurs every time infer_name/infer_attribute is called, and since this requires going up the ancestor tree, this adds some overhead to inference. In principle constraints can be gathered through an AST visitor, but I thought that would be more complex so didn't choose it. (Thoughts welcome of course!)
I think adding the constraints to InferenceContext is correct, but to be honest I'm fuzzy on the nuances of this class and how it's used.
Do you like the name "constraint" here? I go back on forth on it, but haven't come up with anything better.

Type of Changes

	Type
✓	✨ New feature

Related Issue

Closes with test #791
Closes with test pylint-dev/pylint#157
Closes with test pylint-dev/pylint#1472
Closes with test pylint-dev/pylint#2016
Closes with test pylint-dev/pylint#2631
Closes with test pylint-dev/pylint#2880

DanielNoord · 2021-12-19T19:14:44Z

Hi @david-yz-liu, thanks for all of the work here. I think the size (and importance) of this PR has put off people from reviewing this but I think we should try and start the process of getting this merged.

The first question I would have before doing a full review is how we could expand this to include conditional reassignments. I think it would be every important to support the following code pattern:

def func():
    if True:
        to_infer = 1
    else:
        to_infer = 2

    print(to_infer)

This does not need to be added in this PR, but we should at least have a general idea of how we could implement this within the same system. Do/did you have any ideas about this yourself?

…nt-checking

for more information, see https://pre-commit.ci

david-yz-liu · 2022-03-21T20:57:36Z

@DanielNoord wow, thank you! So sorry for the delay, I think I saw your message and meant to reply months ago but then... forgot. My bad!

Your question is a great one. It's been a while since I've thought about this, but I'll try to give a decent answer now (and perhaps a more thorough one later). I believe my current approach piggybacks off of the way InferenceContexts are passed around when doing inference, so the general rule for what this PR can or can't do is "we can accumulate Contraints whenever we can accumulate other things in InferenceContexts". The examples I gave in the PR description are places where a context is passed from a parent down to a child node in the AST, so they're pretty straightforward.

However as far I as can remember, existing InferenceContexts don't get passed across statements in a block. I think this is fundamentally why pylint's VariablesChecker is as complex as it is?

But anyway to answer your question, I think that this PR can be extended to handle the example you provided, but it wouldn't be by adding to my code very much, but rather by changing how these InferenceContexts (or some other "context" state) is passed around, using techniques similar to what's happening in the VariablesChecker. Or, using control flow graphs (😅) to precisely model the possible execution paths, and then accumulating Contraints across paths in the graph.

Please let me know if that answers your question, or if there are any follow-ups! Happy to be of service :)

DanielNoord · 2022-03-21T21:01:19Z

@DanielNoord wow, thank you! So sorry for the delay, I think I saw your message and meant to reply months ago but then... forgot. My bad!

Your question is a great one. It's been a while since I've thought about this, but I'll try to give a decent answer now (and perhaps a more thorough one later). I believe my current approach piggybacks off of the way InferenceContexts are passed around when doing inference, so the general rule for what this PR can or can't do is "we can accumulate Contraints whenever we can accumulate other things in InferenceContexts". The examples I gave in the PR description are places where a context is passed from a parent down to a child node in the AST, so they're pretty straightforward.

However as far I as can remember, existing InferenceContexts don't get passed across statements in a block. I think this is fundamentally why pylint's VariablesChecker is as complex as it is?

But anyway to answer your question, I think that this PR can be extended to handle the example you provided, but it wouldn't be by adding to my code very much, but rather by changing how these InferenceContexts (or some other "context" state) is passed around, using techniques similar to what's happening in the VariablesChecker. Or, using control flow graphs (😅) to precisely model the possible execution paths, and then accumulating Contraints across paths in the graph.

Please let me know if that answers your question, or if there are any follow-ups! Happy to be of service :)

I think that answers it pretty clearly. As you can see I spent some time familiarising myself with the code 😄 I think this could indeed be a very good first step towards some better control flow. I have been thinking about doing something with frames and line numbering to do some more difficult constraints, but at the moment I don't see how we couldn't change the current approach to do that. It all seems flexible enough.

Let me know what you think of my changes. I made the tests pass on 3.7 and made some of the typing a little more strict. I appreciate that you had the constraints allow for other nodes than If, but for ease of later expansion I think it makes sense to keep it tighter now and expand the typing if needed later on.

david-yz-liu · 2022-03-21T21:05:08Z

Let me know what you think of my changes. I made the tests pass on 3.7 and made some of the typing a little more strict. I appreciate that you had the constraints allow for other nodes than If, but for ease of later expansion I think it makes sense to keep it tighter now and expand the typing if needed later on.

Will do. I'll try to look at this more closely tonight or tomorrow. 👍

ChangeLog

coveralls · 2022-03-22T05:49:31Z

Pull Request Test Coverage Report for Build 3586640247

92 of 92 (100.0%) changed or added relevant lines in 4 files are covered.
No unchanged relevant lines lost coverage.
Overall coverage increased (+0.06%) to 92.365%

Totals
Change from base Build 3586270619:	0.06%
Covered Lines:	9980
Relevant Lines:	10805

💛 - Coveralls

david-yz-liu · 2022-03-23T01:45:33Z

@DanielNoord Okay I reviewed all of your changes, they look good to me! Yeah I was going for more general types but happy to stay strict for now and expand later if needed. Also thanks for making that Python 3.7 fix, that would have taken me a while to track down 😅.

Anyway as far as I can tell, this is good from my end, just waiting for more detailed review/feedback.

Pierre-Sassoulas

LGTM. The tests in particular are diversified and very meticulous. This will close a ton of issues and open a lot of possibilities in pylint too.

astroid/constraint.py

DanielNoord · 2022-04-07T20:40:24Z

@cdce8p Do you still want to review this before we merge this? I think with my last comments addressed this is almost good to go!

cdce8p

Hello! I apologize for the delay again in making these updates (just have had a lot of other things going on).

Not necessary 🙂 We all do.

Notably, I've moved the constraint code back into a single file (constraint.py) and removed the all and instead used the underscore prefix where appropriate.

👍🏻

--
Left some more style comments.

cdce8p · 2022-11-15T12:37:09Z

astroid/constraint.py

+    @classmethod
+    @abstractmethod
+    def match(
+        cls: _ConstraintT, node: _NameNodes, expr: nodes.NodeNG, negate: bool = False


Since this is a classmethod, the type of the first argument should be a class type not an instance.

Suggested change

@classmethod

@abstractmethod

def match(

cls: _ConstraintT, node: _NameNodes, expr: nodes.NodeNG, negate: bool = False

@classmethod

@abstractmethod

def match(

cls: type[_ConstraintT], node: _NameNodes, expr: nodes.NodeNG, negate: bool = False

Changed this to be Self which I think makes more sense here, let me know if I need to revert

Self would be awesome. Unfortunately it isn't supported by mypy yet.
The PR has already been merged, but we need to wait for the next release (1.0.0).

Yeah, because it was already merged I thought it would be okay to use. pyright understands it and I don't think we will turn on mypy in astroid before 1.0.0 gets released so this doesn't give us any additional work (instead it prevents some 😄 )

cdce8p · 2022-11-15T12:37:49Z

astroid/constraint.py

+    @classmethod
+    def match(
+        cls: _ConstraintT, node: _NameNodes, expr: nodes.NodeNG, negate: bool = False


Suggested change

@classmethod

def match(

cls: _ConstraintT, node: _NameNodes, expr: nodes.NodeNG, negate: bool = False

@classmethod

def match(

cls: type[_ConstraintT], node: _NameNodes, expr: nodes.NodeNG, negate: bool = False

Same here

astroid/constraint.py

cdce8p · 2022-11-15T12:41:01Z

astroid/constraint.py

+                and _matches(right, cls.CONST_NONE)
+                or _matches(left, cls.CONST_NONE)


cls.CONST_NONE is a type error. See: #1189 (comment)

I think this might be fixed with using Self?

astroid/constraint.py

cdce8p · 2022-11-15T12:53:00Z

astroid/constraint.py

+    for constraint_cls in ALL_CONSTRAINT_CLASSES:
+        constraint = constraint_cls.match(node, expr, invert)
+        if constraint:
+            return constraint


Is there a case in which multiple constraint classes could match (in the future)? Does it make sense to change it to an Iterator then, i.e. yield constraint instead of return?

astroid/bases.py

Co-authored-by: Marc Mueller <30130371+cdce8p@users.noreply.github.com>

DanielNoord · 2022-11-20T23:14:56Z

I handled the review by Marc and incorporate his comments to help move this PR along. I think we're all eager to get this merged 😄

david-yz-liu · 2022-11-21T02:13:17Z

Thanks @DanielNoord and @cdce8p! I wasn't sure exactly when I'd have time to look at this again so very much appreciate the updates. 👍

BTW on the question of whether multiple constraint classes may match a given node (in the future), I think... maybe but not necessarily. For example, if we imagine a hypotheical "AndConstraint" from

if x is not None and isinstance(x, int):
    ... x ...

and so the if node's test is an and of two expressions, I could see either yielding two distinct constraints ("not none constraint" and "is int instance" constraint), or a single AndConstraint whose children are the two constraints. (Personally I'd prefer going with the latter approach.)

DanielNoord · 2022-11-21T21:32:47Z

BTW on the question of whether multiple constraint classes may match a given node (in the future), I think... maybe but not necessarily. For example, if we imagine a hypotheical "AndConstraint" from
if x is not None and isinstance(x, int):
    ... x ...
and so the if node's test is an and of two expressions, I could see either yielding two distinct constraints ("not none constraint" and "is int instance" constraint), or a single AndConstraint whose children are the two constraints. (Personally I'd prefer going with the latter approach.)

I think the concept of AndConstraint makes sense. This would remove the need for the yield behaviour.

I implemented the yielding behaviour for now but that could easily be reverted. @cdce8p what do you think?

codecov · 2023-01-06T19:45:27Z

Codecov Report

Merging #1189 (46c056f) into main (f476ebc) will increase coverage by 0.05%.
The diff coverage is 100.00%.

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1189      +/-   ##
==========================================
+ Coverage   92.58%   92.63%   +0.05%     
==========================================
  Files          93       94       +1     
  Lines       10782    10869      +87     
==========================================
+ Hits         9982    10069      +87     
  Misses        800      800

Flag	Coverage Δ
linux	`92.40% <100.00%> (+0.06%)`	⬆️
pypy	`88.54% <96.73%> (+0.05%)`	⬆️
windows	`92.31% <100.00%> (+0.06%)`	⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files	Coverage Δ
astroid/bases.py	`87.98% <100.00%> (+0.38%)`	⬆️
astroid/constraint.py	`100.00% <100.00%> (ø)`
astroid/context.py	`97.33% <100.00%> (+0.11%)`	⬆️
astroid/inference.py	`94.54% <100.00%> (+0.03%)`	⬆️

Everything was taken into account.

jacobtylerwalls · 2023-01-06T20:36:32Z

Thanks to all for reviews and persistence--huge step forward!

david-yz-liu · 2023-01-06T20:52:28Z

💯 @Pierre-Sassoulas thank you! And thank you all for the efforts in reviewing this and keeping the branch up to date. I'm super excited to see how this evolves!

Support "is None" constraints from if statements during inference

159af87

cdce8p self-requested a review September 25, 2021 10:18

Pierre-Sassoulas added the Enhancement ✨ Improvement to a component label Sep 26, 2021

Pierre-Sassoulas added this to the 2.9.0 milestone Sep 26, 2021

Pierre-Sassoulas self-requested a review September 26, 2021 07:27

Merge branch 'main' into constraint-checking

1c5b4b3

DanielNoord modified the milestones: 2.10.0, 2.11.0 Feb 26, 2022

DanielNoord and others added 8 commits March 21, 2022 20:22

Merge branch 'main' of https://github.com/PyCQA/astroid into constrai…

4dffa4d

…nt-checking

Changes

4828797

Some typing

c39baff

[pre-commit.ci] auto fixes from pre-commit.com hooks

b54d841

for more information, see https://pre-commit.ci

Some changes to typing and removal of dead code

de4c318

Fix on python 3.7

3da1c83

Fix import

7fe82e6

Add abstractness

fc4156e

Pierre-Sassoulas reviewed Mar 22, 2022

View reviewed changes

ChangeLog Outdated Show resolved Hide resolved

Update ChangeLog

55bb33f

Pierre-Sassoulas mentioned this pull request Mar 22, 2022

Turn on comments in coveralls #1482

Closed

Pierre-Sassoulas approved these changes Mar 23, 2022

View reviewed changes

astroid/constraint.py Outdated Show resolved Hide resolved

Simplify expression NoneConstraint.match

8a3b443

DanielNoord reviewed Mar 24, 2022

View reviewed changes

astroid/constraint.py Show resolved Hide resolved

astroid/constraint.py Outdated Show resolved Hide resolved

astroid/constraint.py Outdated Show resolved Hide resolved

jacobtylerwalls mentioned this pull request Mar 27, 2022

False positive "not-callable" after is not None pylint-dev/pylint#6005

Closed

jacobtylerwalls mentioned this pull request Nov 13, 2022

What's left for 3.0? pylint-dev/pylint#7607

Closed

Pierre-Sassoulas mentioned this pull request Nov 14, 2022

Planning for astroid 3.0 #1868

Closed

10 tasks

cdce8p reviewed Nov 15, 2022

View reviewed changes

DanielNoord and others added 6 commits November 20, 2022 23:49

Use Self

f6e001b

Use InferenceResult

5da24fe

Don't check None is

b46f001

Use frozenset

409543b

Co-authored-by: Marc Mueller <30130371+cdce8p@users.noreply.github.com>

Also accept Proxy

316ba46

Use an Iterator

dabf0e9

DanielNoord requested a review from cdce8p November 20, 2022 23:14

Merge branch 'main' into constraint-checking

f70fbad

Pierre-Sassoulas mentioned this pull request Dec 6, 2022

Handle properties in dataclasses correctly #1897

Merged

2 tasks

Pierre-Sassoulas mentioned this pull request Dec 22, 2022

Crash on enumerate with binary operator in start parameter pylint-dev/pylint#7963

Closed

Merge remote-tracking branch 'origin/main' into constraint-checking

46c056f

Pierre-Sassoulas merged commit 21880dd into pylint-dev:main Jan 6, 2023

david-yz-liu deleted the constraint-checking branch January 6, 2023 20:52

Pierre-Sassoulas added a commit that referenced this pull request Jan 7, 2023

[Changelog] Move #1189 changelog back in 2.13.0

fc5c93c

This was referenced Feb 4, 2023

False positive: E1307 Argument 'builtins.NoneType' does not match format type 'd' pylint-dev/pylint#2631

Closed

False positive E1126 when index previously assigned to None pylint-dev/pylint#2880

Closed

jacobtylerwalls mentioned this pull request Feb 3, 2024

False negative: no-member in conditional branch pylint-dev/pylint#9405

Open

jacobtylerwalls mentioned this pull request Mar 24, 2024

[unsubscriptable-object] False Positive on Optional list pylint-dev/pylint#9515

Open

jacobtylerwalls mentioned this pull request May 4, 2024

not-callable false positive in type narrowing pylint-dev/pylint#7412

Closed

Pierre-Sassoulas mentioned this pull request May 6, 2024

Negation and None pylint-dev/pylint#5085

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Support "is None" constraints from if statements during inference #1189

Support "is None" constraints from if statements during inference #1189

david-yz-liu commented Sep 24, 2021 •

edited

Loading

DanielNoord commented Dec 19, 2021

david-yz-liu commented Mar 21, 2022

DanielNoord commented Mar 21, 2022

david-yz-liu commented Mar 21, 2022

coveralls commented Mar 22, 2022 •

edited

Loading

david-yz-liu commented Mar 23, 2022

Pierre-Sassoulas left a comment

DanielNoord commented Apr 7, 2022

cdce8p left a comment

cdce8p Nov 15, 2022

DanielNoord Nov 20, 2022

cdce8p Nov 21, 2022

DanielNoord Nov 22, 2022

cdce8p Nov 15, 2022

cdce8p Nov 15, 2022

DanielNoord Nov 20, 2022

cdce8p Nov 15, 2022

DanielNoord commented Nov 20, 2022

david-yz-liu commented Nov 21, 2022

DanielNoord commented Nov 21, 2022

codecov bot commented Jan 6, 2023

jacobtylerwalls commented Jan 6, 2023

david-yz-liu commented Jan 6, 2023

		and _matches(right, cls.CONST_NONE)
		or _matches(left, cls.CONST_NONE)

Support "is None" constraints from if statements during inference #1189

Support "is None" constraints from if statements during inference #1189

Conversation

david-yz-liu commented Sep 24, 2021 • edited Loading

Steps

Description

Scope

Implementation overview

Some comments/questions

Type of Changes

Related Issue

DanielNoord commented Dec 19, 2021

david-yz-liu commented Mar 21, 2022

DanielNoord commented Mar 21, 2022

david-yz-liu commented Mar 21, 2022

coveralls commented Mar 22, 2022 • edited Loading

Pull Request Test Coverage Report for Build 3586640247

💛 - Coveralls

david-yz-liu commented Mar 23, 2022

Pierre-Sassoulas left a comment

Choose a reason for hiding this comment

DanielNoord commented Apr 7, 2022

cdce8p left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

DanielNoord commented Nov 20, 2022

david-yz-liu commented Nov 21, 2022

DanielNoord commented Nov 21, 2022

codecov bot commented Jan 6, 2023

Codecov Report

jacobtylerwalls commented Jan 6, 2023

david-yz-liu commented Jan 6, 2023

david-yz-liu commented Sep 24, 2021 •

edited

Loading

coveralls commented Mar 22, 2022 •

edited

Loading