-
-
Notifications
You must be signed in to change notification settings - Fork 276
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support "is None" constraints from if statements during inference #1189
Support "is None" constraints from if statements during inference #1189
Conversation
Hi @david-yz-liu, thanks for all of the work here. I think the size (and importance) of this PR has put off people from reviewing this but I think we should try and start the process of getting this merged. The first question I would have before doing a full review is how we could expand this to include conditional reassignments. I think it would be every important to support the following code pattern: def func():
if True:
to_infer = 1
else:
to_infer = 2
print(to_infer) This does not need to be added in this PR, but we should at least have a general idea of how we could implement this within the same system. Do/did you have any ideas about this yourself? |
for more information, see https://pre-commit.ci
@DanielNoord wow, thank you! So sorry for the delay, I think I saw your message and meant to reply months ago but then... forgot. My bad! Your question is a great one. It's been a while since I've thought about this, but I'll try to give a decent answer now (and perhaps a more thorough one later). I believe my current approach piggybacks off of the way However as far I as can remember, existing But anyway to answer your question, I think that this PR can be extended to handle the example you provided, but it wouldn't be by adding to my code very much, but rather by changing how these Please let me know if that answers your question, or if there are any follow-ups! Happy to be of service :) |
I think that answers it pretty clearly. As you can see I spent some time familiarising myself with the code 😄 I think this could indeed be a very good first step towards some better control flow. I have been thinking about doing something with Let me know what you think of my changes. I made the tests pass on 3.7 and made some of the typing a little more strict. I appreciate that you had the constraints allow for other nodes than |
Will do. I'll try to look at this more closely tonight or tomorrow. 👍 |
Pull Request Test Coverage Report for Build 3586640247
💛 - Coveralls |
@DanielNoord Okay I reviewed all of your changes, they look good to me! Yeah I was going for more general types but happy to stay strict for now and expand later if needed. Also thanks for making that Python 3.7 fix, that would have taken me a while to track down 😅. Anyway as far as I can tell, this is good from my end, just waiting for more detailed review/feedback. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. The tests in particular are diversified and very meticulous. This will close a ton of issues and open a lot of possibilities in pylint too.
@cdce8p Do you still want to review this before we merge this? I think with my last comments addressed this is almost good to go! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hello! I apologize for the delay again in making these updates (just have had a lot of other things going on).
Not necessary 🙂 We all do.
Notably, I've moved the constraint code back into a single file (constraint.py) and removed the all and instead used the underscore prefix where appropriate.
👍🏻
--
Left some more style comments.
astroid/constraint.py
Outdated
@classmethod | ||
@abstractmethod | ||
def match( | ||
cls: _ConstraintT, node: _NameNodes, expr: nodes.NodeNG, negate: bool = False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this is a classmethod
, the type of the first argument should be a class type not an instance.
@classmethod | |
@abstractmethod | |
def match( | |
cls: _ConstraintT, node: _NameNodes, expr: nodes.NodeNG, negate: bool = False | |
@classmethod | |
@abstractmethod | |
def match( | |
cls: type[_ConstraintT], node: _NameNodes, expr: nodes.NodeNG, negate: bool = False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed this to be Self
which I think makes more sense here, let me know if I need to revert
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Self
would be awesome. Unfortunately it isn't supported by mypy yet.
The PR has already been merged, but we need to wait for the next release (1.0.0).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, because it was already merged I thought it would be okay to use. pyright
understands it and I don't think we will turn on mypy
in astroid
before 1.0.0
gets released so this doesn't give us any additional work (instead it prevents some 😄 )
astroid/constraint.py
Outdated
@classmethod | ||
def match( | ||
cls: _ConstraintT, node: _NameNodes, expr: nodes.NodeNG, negate: bool = False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@classmethod | |
def match( | |
cls: _ConstraintT, node: _NameNodes, expr: nodes.NodeNG, negate: bool = False | |
@classmethod | |
def match( | |
cls: type[_ConstraintT], node: _NameNodes, expr: nodes.NodeNG, negate: bool = False |
Same here
astroid/constraint.py
Outdated
and _matches(right, cls.CONST_NONE) | ||
or _matches(left, cls.CONST_NONE) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cls.CONST_NONE
is a type error. See: #1189 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this might be fixed with using Self
?
astroid/constraint.py
Outdated
for constraint_cls in ALL_CONSTRAINT_CLASSES: | ||
constraint = constraint_cls.match(node, expr, invert) | ||
if constraint: | ||
return constraint |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a case in which multiple constraint classes could match (in the future)? Does it make sense to change it to an Iterator then, i.e. yield constraint
instead of return
?
Co-authored-by: Marc Mueller <30130371+cdce8p@users.noreply.github.com>
I handled the review by Marc and incorporate his comments to help move this PR along. I think we're all eager to get this merged 😄 |
Thanks @DanielNoord and @cdce8p! I wasn't sure exactly when I'd have time to look at this again so very much appreciate the updates. 👍 BTW on the question of whether multiple constraint classes may match a given node (in the future), I think... maybe but not necessarily. For example, if we imagine a hypotheical " if x is not None and isinstance(x, int):
... x ... and so the if node's |
I think the concept of I implemented the |
Codecov Report
Additional details and impacted files@@ Coverage Diff @@
## main #1189 +/- ##
==========================================
+ Coverage 92.58% 92.63% +0.05%
==========================================
Files 93 94 +1
Lines 10782 10869 +87
==========================================
+ Hits 9982 10069 +87
Misses 800 800
Flags with carried forward coverage won't be shown. Click here to find out more.
|
Everything was taken into account.
Thanks to all for reviews and persistence--huge step forward! |
💯 @Pierre-Sassoulas thank you! And thank you all for the efforts in reviewing this and keeping the branch up to date. I'm super excited to see how this evolves! |
Steps
Description
Alright, this is a big one so please don't hold back. This PR adds infrastructure to support constraints on values during inference. Here's a simple example:
This prints
[<Const.int l.4 at 0x14232a6e6d0>, <Const.NoneType l.6 at 0x142329b38b0>]
, using both possible values from the assignment statements. Of course, because of thex is not None
if condition, only the int is possible, but astroid currently doesn't take this condition into account.This PR changes how inference works so that only
[<Const.int l.4 at 0x14232a6e6d0>]
is printed.Scope
I reviewed the existing pylint issues related to this (they are generally labelled as "topic-control-flow"). There were five types of constraints that were brought up:
x is None
/x is not None
x
/not x
(truthy/falsey value)isinstance(x, ...)
x == ...
and
/or
There were also six primary locations where these constraints could be inferred (orthogonal to the constraints themselves):
if statement (as in above example)
and
/or
clauses. For example,x is not None and x[0] == 1
; the secondx
should not be inferred asNone
ternary if expressions. For example,
x[0] if x is not None else 1
; the firstx
should not be inferred asNone
reassignment within an if statement, e.g.
early return/raises within an if statement.
assert statements
This pull request addresses just the item 1's in each list. I used an approach that can handle 1-5 of the constraint types, and 1-3 of the locations. If this approach is approved, I think it would be pretty straightforward to make additional PRs for these items. (More on the other constraint locations below.)
Implementation overview
I added a new
Constraint
class to represent a constraint on a variable, likex is not None
.When a variable is inferred (see
inference.py
,infer_name
), we search the ancestor chain from the node and look for containing if statements, using these to accumulate constraints on the variable. These constraints are added to the inference context (added a new attribute toInferenceContext
).Then, in
_infer_stmts
(bases.py
), we filter the inferred values based on whether or not they satisfy all constraints for that variable.Uninferable
is considered to satisfy all constraints.Uninferable
value is yielded, rather than yielding nothing at all. I think this is the conservative approach, to avoid errors from theraise_if_nothing_inferred
decorator.The above also applies to attributes (see
infer_attribute
).Some comments/questions
_infer_stmts
, it only applies to variables and attributes. So for example, subscript nodes (e.g.my_array[0]
) won't have their inferred values filtered.infer_name
/infer_attribute
is called, and since this requires going up the ancestor tree, this adds some overhead to inference. In principle constraints can be gathered through an AST visitor, but I thought that would be more complex so didn't choose it. (Thoughts welcome of course!)InferenceContext
is correct, but to be honest I'm fuzzy on the nuances of this class and how it's used.Type of Changes
Related Issue
Closes with test #791
Closes with test pylint-dev/pylint#157
Closes with test pylint-dev/pylint#1472
Closes with test pylint-dev/pylint#2016
Closes with test pylint-dev/pylint#2631
Closes with test pylint-dev/pylint#2880