-
-
Notifications
You must be signed in to change notification settings - Fork 810
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Type annotations for early stage compiler functions #1504
Conversation
Improves readability and passes mypy.
Actually, I'll just say that this PR is ready for review. I have some other changes that build on top of this but I don't want people to feel like they have to read through too many commits. I'll make those as another PR. |
I am afraid you'll have to revert the compile_codes interface work as it's now a solidified compiler interface for tools to use and the change is not backwards compatible. |
bin/vyper
Outdated
|
||
def uniq(seq: Iterable[T]) -> List[T]: | ||
""" | ||
Returns unique items in ``seq`` in order. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nitpick: We use imperative present tense among the rest of the Python libraries and I think this should also follow this. (Same as Python docs and many others)
https://github.com/ethereum/snake-charmers-tactical-manual/blob/master/documentation.md#grammar
Return unique items in
seq
in order
As a rule of thumb, imperative present tense comes out when it fits in this:
If we call this API it will: __________________________
bin/vyper
Outdated
T = TypeVar('T') | ||
|
||
|
||
def uniq(seq: Iterable[T]) -> List[T]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm aware this has nothing to do with this PR and should not be addressed here but it would probably be a good idea to not turn the Iterable[T]
into a List[T]
and instead do one of these things:
- Either just yield the items so that we return an
Iterable[T]
and have consumers deal with conversion if needed
or - Return a
Tuple[T, ...]
instead of aList[T]
to not return something that is mutable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@cburgdorf I like the yielding suggestion, thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great! Minor further suggestion. You could change the type annotation from Generator[T, None, None]
to Iterable[T]
which is basically a subset of a generator and looks a bit cleaner imho. Basically our convention in Trinity is to use Iterator[T]
instead of Generator[T, None, None
] whenever you are only interested in the yield type but not in the send type or return type (e.g. not using generator.send(something)
)
@jacqueswww Can you give me an example of why the API of Practically speaking, having a function return one type or another depending on the input makes working with |
@davesque vyper-debug and vyper online off the top of my head. Last time we changed it many things ended up breaking and people popped on gitter that we had to handle on a case by case basis. I agree it's better to do it the above way, however I don't see mypy as worth breaking backwards compatibility change i.e. I don't see it as critical to have OrderedDict vs. having to find all projects that use it this way currently. |
Can we ignore it for now? |
@jacqueswww @fubuloubu Perhaps we can raise a deprecation warning if the arg is provided? I don't agree that mypy is not worth making backwards incompatible changes. It helps tremendously with code readability and is a very worthy endeavor, even if there are some growing pains in adopting it. |
Is there a way to have it support both methods if that is the case? |
I can't comment on the specific case (other than that functions with variable return types make me sad 😅 ) but I 💯 % agree that adopting mypy is absolutely worth it. In the long run, it make the code base much more robust. When we started adopting mypy in Trinity we've found plenty of small bugs just by adding type hints! Also it makes refactoring code so much simpler and safer. Mypy does catch lots of things that are easy to go unnoticed otherwise. |
I think it may make sense to adopt the change here, but in a separate PR and perhaps alongside the suggestion to add the Those impacted by the change would probably be happy to find that we added a new script to help them with their use case. |
I think you guys are misunderstanding what I am saying. I am not saying mypy shouldn't be adopted, I am saying that a single use case that is difficult to annotate is, should rather be difficult to annotate than force backwards incompatibility. I see compatibility to outwards facing APIs as critical to the usage of vyper, we can make the switch but will have to be deprecation warning first for b11 and then disabled in b12. We changed the interface before and it was painful, just trying to learn from previous mistakes. |
OK (bear with me here) so I am not sure why in this specific case it's a problem and am trying to understand something: Why is it a bad practice to return different types from a function, if you ask for that type at calling time? I can see the argument if the output is not dependent on the call parameters - as you have no idea how to handle a call. But being able to this in python is great, and to me has a lot of utility. |
Good news: I scanned most implementations I could find calling compile_codes and they don't use list type, so we can probably just #yolo it. |
@jacqueswww That's convenient if true. Are you remembering that the default result type was As for the exact reason why it's not convenient, here's a commit that causes mypy to complain (I didn't give an annotation for the Here's one of the errors that mypy throws that relates to the return statement in the implementation of
The problem is that the return type could be a list or a dict, so it has to be a Those sorts of problems will come up for any kind of function that potentially returns different types. I think it's good that mypy complains about stuff like this because it highlights an API that is sort of ambiguous and potentially a source of bugs. A possibly better approach would be to just have two different, appropriately named, functions that return results with the desired structure. As for functions which take formatters as args, I guess I would try to design such an API so that the formatter only modifies the structure of the result within a single type. So a function which does string templating or something might accept a formatter which could build the result differently, but the result would still be a string either way. |
Yes so far I can trace most use the |
You are generally right that reading However, it isn't clear enough to be useful to perform static analyzing on it. A static analyzer can't tell the difference between Sure However, there is a way to have a function return different types based on its inputs which is entirely type safe and statically analyzable. E.g. you can have a function def parse_to_unicorn(raw: bytes, parser: UnicornParser[T]) -> T:
return parser.parse(raw) But because this code uses generics, mypy knows exactly the return type of Here's the entire code for that. from abc import ABC, abstractmethod
from typing import Generic, Tuple, TypeVar
T = TypeVar("T")
class UnicornParser(Generic[T], ABC):
@abstractmethod
def parse(self, raw: bytes) -> T:
...
class TupleUnicornParser(UnicornParser[Tuple[str, ...]]):
def parse(self, raw: bytes) -> Tuple[str, ...]:
return ("u", "n", "i", "c", "o", "r", "n")
class StringUnicornParser(UnicornParser[str]):
def parse(self, raw: bytes) -> str:
return "unicorn"
def parse_to_unicorn(raw: bytes, parser: UnicornParser[T]) -> T:
return parser.parse(raw)
first_unicorn = parse_to_unicorn(b"", TupleUnicornParser())
# mypy knows that `first_unicorn` is of type `Tuple[str, ...]`
print(first_unicorn) # prints: ('u', 'n', 'i', 'c', 'o', 'r', 'n')
second_unicorn = parse_to_unicorn(b"", StringUnicornParser())
# mypy knows that `second_unicorn` is of type `str`
print(second_unicorn) # prints: unicorn
# mypy rightfully rejects this with: `Unsupported operand types for + ("Tuple[str, ...]" and "str")
# first_unicorn + second_unicorn |
@cburgdorf Looking at the above example - does this mean that the only way to return different types in the mypy sphere is by creating extra classes using generics? How would it work if I wanted to pass a function pointer instead of a class in the above example? ie. it's after all a class with only one function. |
Nope, generics work both for classes and functions. I just did it class based to show case a more complete example. I'm with you that classes with single functions are generally a smell and are often better to be replaced with standalone functions. The generics work out the same. from typing import Callable, Tuple, TypeVar
T = TypeVar("T")
def tuple_unicorn_parser(raw: bytes) -> Tuple[str, ...]:
return ("u", "n", "i", "c", "o", "r", "n")
def string_unicorn_parser(raw: bytes) -> str:
return "unicorn"
def parse_to_unicorn(raw: bytes, parser: Callable[[bytes], T]) -> T:
return parser(raw)
first_unicorn = parse_to_unicorn(b"", tuple_unicorn_parser)
# mypy knows that `first_unicorn` is of type `Tuple[str, ...]`
print(first_unicorn) # prints: ('u', 'n', 'i', 'c', 'o', 'r', 'n')
second_unicorn = parse_to_unicorn(b"", string_unicorn_parser)
# mypy knows that `second_unicorn` is of type `str`
print(second_unicorn) # prints: unicorn
# mypy rightfully rejects this with: `Unsupported operand types for + ("Tuple[str, ...]" and "str")
# first_unicorn + second_unicorn |
Ok that's great :) |
@jacqueswww @cburgdorf Yeah, I should have mentioned generics in the context of the formatter example. |
@jacqueswww Just wanted to ping you on this. Still feel like doing a #yolo on it? |
Yes |
I believe I also included all of @cburgdorf 's suggestions. I'll go ahead and merge this. |
What I did
Added type annotations for a number of functions encountered during the first stages of invoking the
vyper
binary. Also did some refactors to facilitate type checking and annotation.How to verify it
Run the tests. Reviewers might also find it helpful to look at each commit individually instead of the entire branch diff.
Description for the changelog
vyper.compile_codes
has been simplified to always return anOrderedDict
mapping vyper code file names to their respective compilation results. This means that this function no longer accepts anoutput_type
keyword argument. Users wishing to duplicate the functionality of passingoutput_type='list'
should convert the results of this function to a list like so:list(compile_codes(...).values())
.Cute Animal Picture