Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

VIP: require loop variable annotation #3475

Closed
charles-cooper opened this issue May 31, 2023 · 5 comments · Fixed by #3596
Closed

VIP: require loop variable annotation #3475

charles-cooper opened this issue May 31, 2023 · 5 comments · Fixed by #3596

Comments

@charles-cooper
Copy link
Member

charles-cooper commented May 31, 2023

Simple Summary

as title. require for statements like for i: uint256 in [1, 2, 3]: instead of for i in [1, 2, 3]

(h/t to @trocher for suggesting this change to the language!)

Motivation

loop variable type inference is one of the most complicated areas of the vyper type checker, taking up 150 loc (that is, 30%) of vyper/semantics/analysis/local.py:

def visit_For(self, node):
if isinstance(node.iter, vy_ast.Subscript):
raise StructureException("Cannot iterate over a nested list", node.iter)
if isinstance(node.iter, vy_ast.Call):
# iteration via range()
if node.iter.get("func.id") != "range":
raise IteratorException(
"Cannot iterate over the result of a function call", node.iter
)
validate_call_args(node.iter, (1, 2))
args = node.iter.args
if len(args) == 1:
# range(CONSTANT)
if not isinstance(args[0], vy_ast.Num):
raise StateAccessViolation("Value must be a literal", node)
if args[0].value <= 0:
raise StructureException("For loop must have at least 1 iteration", args[0])
validate_expected_type(args[0], IntegerT.any())
type_list = get_possible_types_from_node(args[0])
else:
validate_expected_type(args[0], IntegerT.any())
type_list = get_common_types(*args)
if not isinstance(args[0], vy_ast.Constant):
# range(x, x + CONSTANT)
if not isinstance(args[1], vy_ast.BinOp) or not isinstance(
args[1].op, vy_ast.Add
):
raise StructureException(
"Second element must be the first element plus a literal value", args[0]
)
if not vy_ast.compare_nodes(args[0], args[1].left):
raise StructureException(
"First and second variable must be the same", args[1].left
)
if not isinstance(args[1].right, vy_ast.Int):
raise InvalidLiteral("Literal must be an integer", args[1].right)
if args[1].right.value < 1:
raise StructureException(
f"For loop has invalid number of iterations ({args[1].right.value}),"
" the value must be greater than zero",
args[1].right,
)
else:
# range(CONSTANT, CONSTANT)
if not isinstance(args[1], vy_ast.Int):
raise InvalidType("Value must be a literal integer", args[1])
validate_expected_type(args[1], IntegerT.any())
if args[0].value >= args[1].value:
raise StructureException("Second value must be > first value", args[1])
if not type_list:
raise TypeMismatch("Iterator values are of different types", node.iter)
else:
# iteration over a variable or literal list
if isinstance(node.iter, vy_ast.List) and len(node.iter.elements) == 0:
raise StructureException("For loop must have at least 1 iteration", node.iter)
type_list = [
i.value_type
for i in get_possible_types_from_node(node.iter)
if isinstance(i, (DArrayT, SArrayT))
]
if not type_list:
raise InvalidType("Not an iterable type", node.iter)
if isinstance(node.iter, (vy_ast.Name, vy_ast.Attribute)):
# check for references to the iterated value within the body of the loop
assign = _check_iterator_modification(node.iter, node)
if assign:
raise ImmutableViolation("Cannot modify array during iteration", assign)
# Check if `iter` is a storage variable. get_descendants` is used to check for
# nested `self` (e.g. structs)
iter_is_storage_var = (
isinstance(node.iter, vy_ast.Attribute)
and len(node.iter.get_descendants(vy_ast.Name, {"id": "self"})) > 0
)
if iter_is_storage_var:
# check if iterated value may be modified by function calls inside the loop
iter_name = node.iter.attr
for call_node in node.get_descendants(vy_ast.Call, {"func.value.id": "self"}):
fn_name = call_node.func.attr
fn_node = self.vyper_module.get_children(vy_ast.FunctionDef, {"name": fn_name})[0]
if _check_iterator_modification(node.iter, fn_node):
# check for direct modification
raise ImmutableViolation(
f"Cannot call '{fn_name}' inside for loop, it potentially "
f"modifies iterated storage variable '{iter_name}'",
call_node,
)
for name in self.namespace["self"].typ.members[fn_name].recursive_calls:
# check for indirect modification
fn_node = self.vyper_module.get_children(vy_ast.FunctionDef, {"name": name})[0]
if _check_iterator_modification(node.iter, fn_node):
raise ImmutableViolation(
f"Cannot call '{fn_name}' inside for loop, it may call to '{name}' "
f"which potentially modifies iterated storage variable '{iter_name}'",
call_node,
)
self.expr_visitor.visit(node.iter)
if not isinstance(node.target, vy_ast.Name):
raise StructureException("Invalid syntax for loop iterator", node.target)
for_loop_exceptions = []
iter_name = node.target.id
for type_ in type_list:
# type check the for loop body using each possible type for iterator value
with self.namespace.enter_scope():
try:
self.namespace[iter_name] = VarInfo(type_, is_constant=True)
except VyperException as exc:
raise exc.with_annotation(node) from None
try:
with NodeMetadata.enter_typechecker_speculation():
for n in node.body:
self.visit(n)
except (TypeMismatch, InvalidOperation) as exc:
for_loop_exceptions.append(exc)
else:
# type information is applied directly here because the
# scope is closed prior to the call to
# `StatementAnnotationVisitor`
node.target._metadata["type"] = type_
# success -- bail out instead of error handling.
return
# if we have gotten here, there was an error for
# every type tried for the iterator
if len(set(str(i) for i in for_loop_exceptions)) == 1:
# if every attempt at type checking raised the same exception
raise for_loop_exceptions[0]
# return an aggregate TypeMismatch that shows all possible exceptions
# depending on which type is used
types_str = [str(i) for i in type_list]
given_str = f"{', '.join(types_str[:1])} or {types_str[-1]}"
raise TypeMismatch(
f"Iterator value '{iter_name}' may be cast as {given_str}, "
"but type checking fails with all possible types:",
node,
*(
(f"Casting '{iter_name}' as {type_}: {exc.message}", exc.annotations[0])
for type_, exc in zip(type_list, for_loop_exceptions)
),
)

because the compiler currently has to loop over all possible types for the loop variable, it

@external
def loo():
    for x in [1, 2, 3]:
        v: bool = x - 10 == 1  # x is uint256, safesub reverts

@external
def foo():
    for x in [1, 2, 3]:
        a: int256 = convert(x - 100,int256)  # x is uint256, checked conversion reverts

these above two examples could be fixed by attempting to (at compile-time) unroll the loop and fold in the loop variable, and reject types which result in folding failures. this would "correctly" (i.e., match the user's intuition) infer the type of x as int256 in both of the above two cases. however, this would result in more complexity in the loop variable type inference routine, and would result in more runtime complexity as well.

lastly, the fact that loop variables are currently not annotated is actually a peculiar outcome of PEP526 - "In addition, one cannot annotate variables used in a for or with statement". i consider this to be a flaw in PEP484+PEP526 itself, as loop variables cannot be annotated in python without commented type hints.

also, while it might seem unfamiliar to users at first, from the point of view of vyper, the fact that loop variables are not annotated is inconsistent with all other vyper variables, which do not run into the same type inference issues (because they are all required to be annotated in the first place).

Specification

require loop variables to be annotated.

an implementation note: the fact that type annotations are not allowed for loop variables causes some issues for us at the syntactic level, but they can be worked around (e.g. by adding a preprocessing rule which converts for i: uint256 in ..: to some currently unused syntactic expression, ex. for *(i, uint256) in ..:, which can then be converted into a type annotation during conversion from python to vyper ast.

Backwards Compatibility

this is a breaking change.

Dependencies

If this VIP depends on any other VIPs being implemented, please mention them.

References

Add any references that this VIP might reference (other VIPs/issues, links to blog posts, etc.)

Copyright

Copyright and related rights waived via CC0

@tserg
Copy link
Collaborator

tserg commented Jun 1, 2023

Do we want to consider disallowing literal lists if there is more than one possible type? This seems like a reasonable restriction given that one of vyper's goals is strong typing.

Instead of the first snippet, we require users to do the latter:

for i in [1, 2, 3]:
    ...
x: uint256[3] = [1, 2, 3]
for i in x:
    ...

@fubuloubu
Copy link
Member

Do we want to consider disallowing literal lists if there is more than one possible type? This seems like a reasonable restriction given that one of vyper's goals is strong typing.

Instead of the first snippet, we require users to do the latter:

for i in [1, 2, 3]:
    ...
x: uint256[3] = [1, 2, 3]
for i in x:
    ...

I think with the iterator variable finally having typing, we can forward propagate the type to the list literal

@tserg
Copy link
Collaborator

tserg commented Jun 1, 2023

Do we want to consider disallowing literal lists if there is more than one possible type? This seems like a reasonable restriction given that one of vyper's goals is strong typing.
Instead of the first snippet, we require users to do the latter:

for i in [1, 2, 3]:
    ...
x: uint256[3] = [1, 2, 3]
for i in x:
    ...

I think with the iterator variable finally having typing, we can forward propagate the type to the list literal

I was thinking that between the two un-pythonic choices of (1) type annotation for the iterator variable in for statements; or (2) disallowing literal lists with more than one possible type in for statements, option (2) would be the lesser of the two evils.

@charles-cooper
Copy link
Member Author

charles-cooper commented Jun 1, 2023

I was thinking that between the two un-pythonic choices of (1) type annotation for the iterator variable in for statements; or (2) disallowing literal lists with more than one possible type in for statements, option (2) would be the lesser of the two evils.

i disagree. annotating iterator variables is simpler from a UX perspective. it also increases consistency of vyper's variable system, since iterator variables are currently the only variables in vyper that are not annotated, whereas, requiring lists used as an iterator to not be literals decreases vyper's internal consistency since is inconsistent with .. most of the rest of vyper, untyped list literals are almost always allowed including when the type is unspecified, e.x. [1, 2, 3][2].

@tserg
Copy link
Collaborator

tserg commented Aug 2, 2023

I have a sketch at this PR - during pre-parsing, we transform for i: uint256 in ... to a tuple in for i, uint256 in ....

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants