-
Notifications
You must be signed in to change notification settings - Fork 3.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[TensorIR][M1a] TVMScript Parser/Printer #7630
Conversation
[TensorIR] TVMScript Parser/Printer Co-authored-by: Bohan Hou <32121147+spectrometerHBH@users.noreply.github.com> Co-authored-by: Junru Shao <junrushao1994@gmail.com> Co-authored-by: Tianqi Chen <tqchen@users.noreply.github.com> Co-authored-by: Ruihang Lai <lairuihangdongdong@qq.com> Co-authored-by: Hongyi Jin <3231950289@qq.com> Co-authored-by: Wuwei Lin <wuwei@apache.org>
CC: @tkonolige if you are interested :-) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @Hzfengsy. Overall I think it looks pretty good. I've commented a bit around error handling (which is important because this is user-facing).
As part of the process of merging autotir, we really need to document tvm script usage. I'm not really sure where we would put all these docs to make Sphinx happy.
We need a list of glossaries, introducing what |
Maybe we can have a big picture showing the overview of the key data structures and the whole system. Just like what we do at the beginning of Ansor upstreaming: #5962 😄 |
Co-authored-by: Tristan Konolige <tristan.konolige@gmail.com>
@tkonolige please take another look. Let us make sure all actionable items are being addressed. The key data structures are now listed in #7527 cc @comaniac @jroesch @icemelon9 @jcf94 @ZihengJiang to get more comments. |
Co-authored-by: Tristan Konolige <tristan.konolige@gmail.com>
Co-authored-by: Tristan Konolige <tristan.konolige@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall LGTM. Mostly just nits.
@@ -144,6 +144,197 @@ def test_no_body(): | |||
check_error(no_body, 3) | |||
|
|||
|
|||
def allocate_with_buffers() -> None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: why some tests have the return annotations but others haven't? IMHO, it's fine to ignore the type annotation of returning None.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please note that there are two types of functions here: 1. script function which will be parsed and 2. Python function with test_
prefix. The first ones need type annotation while it is optional for the second ones
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I noticed that the ones with type annotations are TVM scripts. So just curious, does that mean when users are writing a TVM script, they have to specify type annotation?
if nodes is None: | ||
nodes = [] | ||
self.node_stack.append(list(reversed(nodes))) | ||
self.symbols.append(dict()) | ||
|
||
def update_symbol(self, name, symbol): | ||
def enter_block_scope(self, nodes: Optional[List[synr.ast.Node]] = None): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Document the difference between a regular and block scope from the user perspective. When should a should enter_scope
be used and when should enter_block_scope
be used.
Co-authored-by: Tristan Konolige <tristan.konolige@gmail.com> Co-authored-by: Cody Yu <comaniac0422@gmail.com>
Co-authored-by: Cody Yu <comaniac0422@gmail.com>
@comaniac @tkonolige Please take another look. Thanks |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Thanks @tkonolige . The level of documentation can be debatabe, while it is always good to add more details in the docs, we also cannot too hung up on things, to ensure timely delivery of the item, assuming the code quality is good and readable (in this case get other committers to read and understands the code logic). Which such cases happens, we could apply the following principle - reviewers should strive to read and understand the code, and suggest comments that can be directly act upon to make sure merge happen in a reasonable pace. The merger/committer takes the responsibility in case future problem occurs to the case. It is of course great to help further suggest improvements. Normally in such cases it would be great if reviewer can make suggestions that also facilitate merging in a quick way, especially comments that are directly actionable(e.g. suggest changes that can be directly applied to the code). This is particularly useful to help the code review move forward. Of course doing so would demands more time from the reviewer (to read and understand), but that is what we should strive to do in code reviews. Here is an example of actionable comment, please feel free to suggest more. Example Actionable CommentPlease add the following comment to enter_block:
Please add the following clarifications to
|
@Hzfengsy please do your best effort in applying the actionable comments :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR but I have agree on the documentation process here. I read over the PR and agree there is not clarity around scoping, there is no clear reason to me we need two concepts of scoping to begin with. Given the relative brevity of the RFCs around TensorIR, and AutoTIR it is really important that some documentation lands with the actual features. Getting code into the repo in a state which only usable by the authors and is not easily understandable by contributors during review is not what we should be striving for. Especially in cases where we have comments from people who have already worked on and read the part of the code base. We are going to have many new people joining who are not experts on the code base and its important code is designed to scale the number of people who can contribute to the project.
I understand longer form documentation is a significant ask but we should at least have basic level of explanations about the concepts attached to the class doc strings and fields. I know we all want to strive to write good code but in the past across many features the probability of significant doc changes landing after the initial PR is near zero.
In the spirit of being actionable doc strings which provide the same amount of information as the type and name are not very helpful as they don't provide a picture of how the pieces fit together, i.e:
match_buffers: List[MatchBufferRegion] = []
"""List[MatchBufferRegion]: match_buffer_region list for the block
Yeah @tkonolige if there are implementation details you don't understand, as a reviewer, you can just ask. Then if you documents could help with better understanding, indeed you should bring up with suggested changes, like you always did. To be clear, we should be really serious about user-facing documents. Like if you are introducing c++ to college students, you need really clear tutorials and examples. On the other hand, I don't think a long introduction on how clang parser works would help users, because that is developer facing, and in that case, we need a design doc. |
Seems that @Hzfengsy has made the further update on the actionable comments:
Note these are changes to clarify the general internal functions. The current internal docs are reasonable(of course not perfect), and the clarity matches what we already have in the codebase, e.g. parser.cc. Additionally, significant amount of efforts of this PR are spent better on error reporting, adding type anotations. These are a significant improvements over the current TVMScript parser. We can also make further improvements in the followup PRs. Thanks for all the review suggestions so far. Please take another look. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current internal docs are reasonable(of course not perfect), and the clarity matches what we already have in the codebase, e.g. parser.cc.
I'm sorry, but I'm going to have disagree with you here. We need more docs as there still isn't a clear distinction between a block and regular scope.
@Hzfengsy Could you address previous reviews along with these new changes?
I'm with @jroesch here. It really does seem like we can merge enter_scope
and enter_block_scope
. This means we could avoid special casing scopes for blocks. Latter additions then could also take advantage of the information we retain for a block. Is there any reason not to do this?
This function should be used to handle a block scope, | ||
aka the blocks that involve a `with block` scope. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is still confusing. What is a block? How does it differ from a regular scope?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The block structure is clearly described in the RFC https://discuss.tvm.apache.org/t/rfc-tensorir-a-schedulable-ir-for-tvm/7872
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you put that in the codebase somewhere then? It'll be hard for people to understand if they have to go to discus to get the full docs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There will be docs about the TensorIR tvmscript langauge, but that should come as a separate PR. Additionally, this PR already contains test cases that covers the cases needed. Like in our previous parser code , there is less of a description of the language itself..
While I agree some examples would be helpful, it may not be necessary, assuming the maintainer have a good understanding of the Block structure itself, and future docs of the language
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with Tristan on this. I think that assuming that a maintainer has a good understanding of block structure already is not a good assumption to make. Having examples in the codebase makes code easily understandable and accessible to anyone who wants to read it, not just people who are familiar with the code. Since the project is rapidly growing and getting new contributors, it's important to make code understandable and accessible. Scaling the number of developers in TVM isn't sustainable without good documentation -- and good documentation includes having good comments. Ideally, the comments and the formal documentation would even be a little bit redundant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @electriclilies. I agree with all you said in particular wrt to code readability. We already followed the principle of enforcing heavy documentation in the case of user facing code and making sure the overall logic flows well.
Code readability also goes beyond the comments, a lot of efforts needs to be spent on API naming, intuitive callings and error handling. This PR does a lot of that, for example:
- E0: Clear API naming: see convert_to_int function in the diff
- E1: Good error handling: call_with_error_reporting that allows accurate error reporting in most of the cases.
- E2: Type annotations: adding type annotations to a lot of the places(while our current convention does not require so) to increase readability
- E3: Documentation of most functions
There is of course a tradeoff between the time we spend and amount of comments to be added and other efforts. On one hand we certainly want to add as many comments as possible. On the other hand, adding every code blocks may not be the best way of investing time -- we could spend more time on overall architectural correctness, the scaffolds(APIs and components) and other elements that makes the code more readable and maintainable.
Comming back to a related example(e.g. reviewing the quantization code). It is certainly helpful to add examples about network patterns happened during the quantization process, values being involved and so on. But that may not be the most important thing for now, since we can focus on more important issues on readability and maintainability -- e.g. clarifying the key APIs, make sure they compose well and so on. Examples can then be added to places that could contain subtle set of logic to help clarify things.
Right now we are prioritizing to add the examples to developer code paths that are more sutble, like the arithmetic modules and so on. In this particular case, an example can be suggested and checked in as followup PRs is more than welcomed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think any of us disagree with the extreme importance of comprehensive docs, but we need think a bit deeper what kind docs really help.
I would like to further specifically reiterate my understanding of categorization of helpful docs:
- D1. User-facing. This helps a user, who is possibly not familiar with the implementation, to understand how to use some functionalities of TVM;
- D2. Design doc. This provides the high-level overview of how the codebase is structured and why it is designed like this;
- D3. Private API doc. Document the private APIs that are not directly user-facing.
D1 directly helps a user to better understand how to make things work and how it works. It is clear that D1 is desirable.
D2 helps developers to understand the design philosophy, how the codebase is structured, etc, so that more people could help better maintain the codebase. Without D2, people are unable to understand the key concepts - that is why Tianqi redirects us to the RFC, so that we could all understand what a "Block" is, etc. Without D2, no matter how many words the developer use to document private APIs, it is still painful to understand some data structures.
On D3, we always insist that APIs to be at least somewhat documented, so that maintainers could get a brief sense what will be going on if we call an API. Assuming we have good D1 and D2, we could substantially lower the steep learning curve and makes understanding D3 much easier - the prerequisite is that maintainers should read D1 and D2 first, in our specific case, the RFC and related materials.
I think everybody totally agrees with Lily's words, and trust me, nobody wants bring trouble to future maintainers :-) With D1 and D2 ready in place (after M1s and M2s merged), it will be much easier to understand the design philosophy. This is what we are doing in Ansor upstreaming too - we upstream many tutorials after the codebase is fully functioning.
It is totally understandable that reviewers are feeling frustrated when not understanding the design philosophy, and that is what RFCs are for, aren't they :-) Next time, we would love to see such frustration converts to clear questions and answers. For design philosophy-related questions, I would love to redirect everybody to the RFC from the very beginning, and we shouldn't debate outside the RFC about topics like "why two scopes".
Then should we include the RFC text into the inlined docs? I think it is debatable, and I prefer not to replicate. The basic reason is that we will have D1 and D2 in the end, and maintaining two copies is quite error-prone. An easier solution for maintenance is to add links to D1 and D2 on critical data structures.
@junrushao1994 @tqchen I have a policy question: should coauthors on PR approve their own PRs? |
Thanks @tkonolige for bring it up. To be fully transparent, technically right now you are also co-author of the PR according to the commit history :) In this particular case, @Hzfengsy wrote the original code, while both @junrushao1994 and myself are serving as reviewers of the code. Personally I have not made any major change to the code as much as you do(just suggestions to the code comments and structure itself), it was nice that he added ourselves and you as co-author and as well, although from the perspective of the engineering, I am serving as the reviewer that helps to polish the overall architecture and logic correctness. |
Thanks again @Hzfengsy sorry for blocking you here, just attempting to push for better documentation as this code is really important and critical work that I believe will be important to many contributors of TVM in the future. Talked with TQ a bit more about documentation standards for new features and will come back to the community with a more clear suggestion in the coming weeks. |
Co-authored-by: Tristan Konolige <tristan.konolige@gmail.com>
Co-authored-by: Tristan Konolige <tristan.konolige@gmail.com>
Thanks everyone's kindly input. It makes this PR much clearer and readable. I totally agree that great documentation is necessary for open-source projects. But I agree more with @junrushao1994 that we should distinguish who the documents are written for. We cannot write every doc for devs without any background. I think that the docs in TVM parser are written for developers who knows the TIR data structures. So that documents for data structures (e.g. block signature: reads/writes/match_buffer_region) are not necessary. But I have been making my effort to address actional items as much as possible, making things move on. Finally, I add the block example in I have polished the code for many rounds together with my co-authors. I want to thank their efforts again for improving code quality. During development, I found some latent issues of TVMScript Parser. I just updated the following thing besides basic functional improvement:
This PR's original purpose is to add support for TensorIR, which is long about 1000 lines of code. But I just tried my best to improve code quality, and finally, the changes became to 2000+ lines. That's far more than what should an incrementally PR does. I'd love to try my best to polish code but not endless requests. In general, I'm happy to improve code quality and documentation. But it still to think about the tradeoff between readability and heavy documents. |
Thansk @Hzfengsy for continuous effort to improve this PR. Thanks @tkonolige @electriclilies @comaniac @junrushao1994 @jroesch for reviewing. This PR is now merged. |
Co-authored-by: Bohan Hou <32121147+spectrometerHBH@users.noreply.github.com> Co-authored-by: Junru Shao <junrushao1994@gmail.com> Co-authored-by: Tianqi Chen <tqchen@users.noreply.github.com> Co-authored-by: Ruihang Lai <lairuihangdongdong@qq.com> Co-authored-by: Hongyi Jin <3231950289@qq.com> Co-authored-by: Wuwei Lin <wuwei@apache.org> Co-authored-by: Tristan Konolige <tristan.konolige@gmail.com> Co-authored-by: Cody Yu <comaniac0422@gmail.com>
Co-authored-by: Bohan Hou <32121147+spectrometerHBH@users.noreply.github.com> Co-authored-by: Junru Shao <junrushao1994@gmail.com> Co-authored-by: Tianqi Chen <tqchen@users.noreply.github.com> Co-authored-by: Ruihang Lai <lairuihangdongdong@qq.com> Co-authored-by: Hongyi Jin <3231950289@qq.com> Co-authored-by: Wuwei Lin <wuwei@apache.org> Co-authored-by: Tristan Konolige <tristan.konolige@gmail.com> Co-authored-by: Cody Yu <comaniac0422@gmail.com>
In this PR, we introduce TVMScript interface for TensorIR, which is an important user interface.
Please see full tracking at #7527:
co-authors: @spectrometerHBH @junrushao1994 @tqchen @jinhongyii @MasterJH5574 and @vinx13