Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[TensorIR][M1a] TVMScript Parser/Printer #7630

Merged
merged 22 commits into from
Mar 20, 2021
Merged

[TensorIR][M1a] TVMScript Parser/Printer #7630

merged 22 commits into from
Mar 20, 2021

Conversation

Hzfengsy
Copy link
Member

In this PR, we introduce TVMScript interface for TensorIR, which is an important user interface.

Please see full tracking at #7527:

co-authors: @spectrometerHBH @junrushao1994 @tqchen @jinhongyii @MasterJH5574 and @vinx13

@tqchen tqchen self-assigned this Mar 10, 2021
[TensorIR] TVMScript Parser/Printer

Co-authored-by: Bohan Hou <32121147+spectrometerHBH@users.noreply.github.com>
Co-authored-by: Junru Shao <junrushao1994@gmail.com>
Co-authored-by: Tianqi Chen <tqchen@users.noreply.github.com>
Co-authored-by: Ruihang Lai <lairuihangdongdong@qq.com>
Co-authored-by: Hongyi Jin <3231950289@qq.com>
Co-authored-by: Wuwei Lin <wuwei@apache.org>
@jroesch
Copy link
Member

jroesch commented Mar 10, 2021

Thanks for the PR @Hzfengsy! I will read over today or tomorrow, would like to make sure we get some good reviews outside of AutoTIR team. cc @icemelon9 @zhiics

@tqchen
Copy link
Member

tqchen commented Mar 10, 2021

@junrushao
Copy link
Member

CC: @tkonolige if you are interested :-)

Copy link
Contributor

@tkonolige tkonolige left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @Hzfengsy. Overall I think it looks pretty good. I've commented a bit around error handling (which is important because this is user-facing).

As part of the process of merging autotir, we really need to document tvm script usage. I'm not really sure where we would put all these docs to make Sphinx happy.

python/tvm/script/context_maintainer.py Show resolved Hide resolved
python/tvm/script/context_maintainer.py Outdated Show resolved Hide resolved
python/tvm/script/node.py Outdated Show resolved Hide resolved
python/tvm/script/parser.py Show resolved Hide resolved
python/tvm/script/parser.py Show resolved Hide resolved
src/tir/analysis/block_access_region_detector.cc Outdated Show resolved Hide resolved
src/tir/analysis/block_access_region_detector.cc Outdated Show resolved Hide resolved
src/tir/analysis/block_access_region_detector.cc Outdated Show resolved Hide resolved
python/tvm/script/scope_handler.py Show resolved Hide resolved
@junrushao
Copy link
Member

We need a list of glossaries, introducing what Block and BlockRealize are and the motivation behind them

@jcf94
Copy link
Contributor

jcf94 commented Mar 11, 2021

We need a list of glossaries, introducing what Block and BlockRealize are and the motivation behind them

Maybe we can have a big picture showing the overview of the key data structures and the whole system. Just like what we do at the beginning of Ansor upstreaming: #5962 😄

Hzfengsy and others added 2 commits March 11, 2021 23:08
Co-authored-by: Tristan Konolige <tristan.konolige@gmail.com>
@junrushao junrushao linked an issue Mar 11, 2021 that may be closed by this pull request
29 tasks
@tqchen
Copy link
Member

tqchen commented Mar 13, 2021

@tkonolige please take another look.

Let us make sure all actionable items are being addressed. The key data structures are now listed in #7527

cc @comaniac @jroesch @icemelon9 @jcf94 @ZihengJiang to get more comments.

Hzfengsy and others added 3 commits March 14, 2021 20:39
Co-authored-by: Tristan Konolige <tristan.konolige@gmail.com>
Co-authored-by: Tristan Konolige <tristan.konolige@gmail.com>
Copy link
Contributor

@comaniac comaniac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall LGTM. Mostly just nits.

python/tvm/script/node.py Outdated Show resolved Hide resolved
python/tvm/script/parser.py Outdated Show resolved Hide resolved
python/tvm/script/parser.py Outdated Show resolved Hide resolved
python/tvm/script/scope_handler.py Show resolved Hide resolved
python/tvm/script/scope_handler.py Outdated Show resolved Hide resolved
python/tvm/script/special_stmt.py Show resolved Hide resolved
python/tvm/script/special_stmt.py Show resolved Hide resolved
python/tvm/script/special_stmt.py Show resolved Hide resolved
python/tvm/tir/analysis/analysis.py Outdated Show resolved Hide resolved
@@ -144,6 +144,197 @@ def test_no_body():
check_error(no_body, 3)


def allocate_with_buffers() -> None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: why some tests have the return annotations but others haven't? IMHO, it's fine to ignore the type annotation of returning None.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please note that there are two types of functions here: 1. script function which will be parsed and 2. Python function with test_prefix. The first ones need type annotation while it is optional for the second ones

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I noticed that the ones with type annotations are TVM scripts. So just curious, does that mean when users are writing a TVM script, they have to specify type annotation?

python/tvm/script/context_maintainer.py Outdated Show resolved Hide resolved
python/tvm/script/context_maintainer.py Outdated Show resolved Hide resolved
python/tvm/script/context_maintainer.py Outdated Show resolved Hide resolved
python/tvm/script/parser.py Outdated Show resolved Hide resolved
python/tvm/script/parser.py Outdated Show resolved Hide resolved
python/tvm/script/special_stmt.py Outdated Show resolved Hide resolved
python/tvm/script/special_stmt.py Outdated Show resolved Hide resolved
src/printer/tvmscript_printer.cc Outdated Show resolved Hide resolved
if nodes is None:
nodes = []
self.node_stack.append(list(reversed(nodes)))
self.symbols.append(dict())

def update_symbol(self, name, symbol):
def enter_block_scope(self, nodes: Optional[List[synr.ast.Node]] = None):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Document the difference between a regular and block scope from the user perspective. When should a should enter_scope be used and when should enter_block_scope be used.

python/tvm/script/context_maintainer.py Outdated Show resolved Hide resolved
Hzfengsy and others added 4 commits March 16, 2021 08:45
Co-authored-by: Tristan Konolige <tristan.konolige@gmail.com>
Co-authored-by: Cody Yu <comaniac0422@gmail.com>
Co-authored-by: Cody Yu <comaniac0422@gmail.com>
@Hzfengsy
Copy link
Member Author

@comaniac @tkonolige Please take another look. Thanks

Copy link
Contributor

@comaniac comaniac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@tqchen
Copy link
Member

tqchen commented Mar 18, 2021

Thanks @tkonolige . The level of documentation can be debatabe, while it is always good to add more details in the docs, we also cannot too hung up on things, to ensure timely delivery of the item, assuming the code quality is good and readable (in this case get other committers to read and understands the code logic).

Which such cases happens, we could apply the following principle - reviewers should strive to read and understand the code, and suggest comments that can be directly act upon to make sure merge happen in a reasonable pace. The merger/committer takes the responsibility in case future problem occurs to the case.

It is of course great to help further suggest improvements. Normally in such cases it would be great if reviewer can make suggestions that also facilitate merging in a quick way, especially comments that are directly actionable(e.g. suggest changes that can be directly applied to the code). This is particularly useful to help the code review move forward. Of course doing so would demands more time from the reviewer (to read and understand), but that is what we should strive to do in code reviews.

Here is an example of actionable comment, please feel free to suggest more.

Example Actionable Comment

Please add the following comment to enter_block:

Note
------
This function is used for normal scopes that does not
involve a `with block` scope. Use `enter_block_scope`
for the other case.

Please add the following clarifications to enter_block_scope's docstring

Note
------
This function should be used for the block scope,
aka the blocks that involves a `with block` scope.

@tqchen
Copy link
Member

tqchen commented Mar 18, 2021

@Hzfengsy please do your best effort in applying the actionable comments :)

Copy link
Member

@jroesch jroesch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR but I have agree on the documentation process here. I read over the PR and agree there is not clarity around scoping, there is no clear reason to me we need two concepts of scoping to begin with. Given the relative brevity of the RFCs around TensorIR, and AutoTIR it is really important that some documentation lands with the actual features. Getting code into the repo in a state which only usable by the authors and is not easily understandable by contributors during review is not what we should be striving for. Especially in cases where we have comments from people who have already worked on and read the part of the code base. We are going to have many new people joining who are not experts on the code base and its important code is designed to scale the number of people who can contribute to the project.

I understand longer form documentation is a significant ask but we should at least have basic level of explanations about the concepts attached to the class doc strings and fields. I know we all want to strive to write good code but in the past across many features the probability of significant doc changes landing after the initial PR is near zero.

In the spirit of being actionable doc strings which provide the same amount of information as the type and name are not very helpful as they don't provide a picture of how the pieces fit together, i.e:

 match_buffers: List[MatchBufferRegion] = []
    """List[MatchBufferRegion]: match_buffer_region list for the block

@junrushao
Copy link
Member

junrushao commented Mar 19, 2021

Yeah @tkonolige if there are implementation details you don't understand, as a reviewer, you can just ask. Then if you documents could help with better understanding, indeed you should bring up with suggested changes, like you always did.

To be clear, we should be really serious about user-facing documents. Like if you are introducing c++ to college students, you need really clear tutorials and examples. On the other hand, I don't think a long introduction on how clang parser works would help users, because that is developer facing, and in that case, we need a design doc.

@tqchen
Copy link
Member

tqchen commented Mar 19, 2021

Seems that @Hzfengsy has made the further update on the actionable comments:

  • adding clarifications to enter_block_scope and enter_scope per @tkonolige 's suggestion
  • clarified the fields per @jroesch 's suggestion

Note these are changes to clarify the general internal functions. The current internal docs are reasonable(of course not perfect), and the clarity matches what we already have in the codebase, e.g. parser.cc. Additionally, significant amount of efforts of this PR are spent better on error reporting, adding type anotations. These are a significant improvements over the current TVMScript parser. We can also make further improvements in the followup PRs.

Thanks for all the review suggestions so far. Please take another look.

Copy link
Member

@junrushao junrushao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@tkonolige tkonolige left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tqchen

The current internal docs are reasonable(of course not perfect), and the clarity matches what we already have in the codebase, e.g. parser.cc.

I'm sorry, but I'm going to have disagree with you here. We need more docs as there still isn't a clear distinction between a block and regular scope.

@Hzfengsy Could you address previous reviews along with these new changes?

I'm with @jroesch here. It really does seem like we can merge enter_scope and enter_block_scope. This means we could avoid special casing scopes for blocks. Latter additions then could also take advantage of the information we retain for a block. Is there any reason not to do this?

Comment on lines +139 to +140
This function should be used to handle a block scope,
aka the blocks that involve a `with block` scope.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is still confusing. What is a block? How does it differ from a regular scope?

Copy link
Member

@tqchen tqchen Mar 19, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The block structure is clearly described in the RFC https://discuss.tvm.apache.org/t/rfc-tensorir-a-schedulable-ir-for-tvm/7872

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you put that in the codebase somewhere then? It'll be hard for people to understand if they have to go to discus to get the full docs.

Copy link
Member

@tqchen tqchen Mar 19, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There will be docs about the TensorIR tvmscript langauge, but that should come as a separate PR. Additionally, this PR already contains test cases that covers the cases needed. Like in our previous parser code , there is less of a description of the language itself..

While I agree some examples would be helpful, it may not be necessary, assuming the maintainer have a good understanding of the Block structure itself, and future docs of the language

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with Tristan on this. I think that assuming that a maintainer has a good understanding of block structure already is not a good assumption to make. Having examples in the codebase makes code easily understandable and accessible to anyone who wants to read it, not just people who are familiar with the code. Since the project is rapidly growing and getting new contributors, it's important to make code understandable and accessible. Scaling the number of developers in TVM isn't sustainable without good documentation -- and good documentation includes having good comments. Ideally, the comments and the formal documentation would even be a little bit redundant.

Copy link
Member

@tqchen tqchen Mar 20, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @electriclilies. I agree with all you said in particular wrt to code readability. We already followed the principle of enforcing heavy documentation in the case of user facing code and making sure the overall logic flows well.

Code readability also goes beyond the comments, a lot of efforts needs to be spent on API naming, intuitive callings and error handling. This PR does a lot of that, for example:

  • E0: Clear API naming: see convert_to_int function in the diff
  • E1: Good error handling: call_with_error_reporting that allows accurate error reporting in most of the cases.
  • E2: Type annotations: adding type annotations to a lot of the places(while our current convention does not require so) to increase readability
  • E3: Documentation of most functions

There is of course a tradeoff between the time we spend and amount of comments to be added and other efforts. On one hand we certainly want to add as many comments as possible. On the other hand, adding every code blocks may not be the best way of investing time -- we could spend more time on overall architectural correctness, the scaffolds(APIs and components) and other elements that makes the code more readable and maintainable.

Comming back to a related example(e.g. reviewing the quantization code). It is certainly helpful to add examples about network patterns happened during the quantization process, values being involved and so on. But that may not be the most important thing for now, since we can focus on more important issues on readability and maintainability -- e.g. clarifying the key APIs, make sure they compose well and so on. Examples can then be added to places that could contain subtle set of logic to help clarify things.

Right now we are prioritizing to add the examples to developer code paths that are more sutble, like the arithmetic modules and so on. In this particular case, an example can be suggested and checked in as followup PRs is more than welcomed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think any of us disagree with the extreme importance of comprehensive docs, but we need think a bit deeper what kind docs really help.

I would like to further specifically reiterate my understanding of categorization of helpful docs:

  • D1. User-facing. This helps a user, who is possibly not familiar with the implementation, to understand how to use some functionalities of TVM;
  • D2. Design doc. This provides the high-level overview of how the codebase is structured and why it is designed like this;
  • D3. Private API doc. Document the private APIs that are not directly user-facing.

D1 directly helps a user to better understand how to make things work and how it works. It is clear that D1 is desirable.

D2 helps developers to understand the design philosophy, how the codebase is structured, etc, so that more people could help better maintain the codebase. Without D2, people are unable to understand the key concepts - that is why Tianqi redirects us to the RFC, so that we could all understand what a "Block" is, etc. Without D2, no matter how many words the developer use to document private APIs, it is still painful to understand some data structures.

On D3, we always insist that APIs to be at least somewhat documented, so that maintainers could get a brief sense what will be going on if we call an API. Assuming we have good D1 and D2, we could substantially lower the steep learning curve and makes understanding D3 much easier - the prerequisite is that maintainers should read D1 and D2 first, in our specific case, the RFC and related materials.

I think everybody totally agrees with Lily's words, and trust me, nobody wants bring trouble to future maintainers :-) With D1 and D2 ready in place (after M1s and M2s merged), it will be much easier to understand the design philosophy. This is what we are doing in Ansor upstreaming too - we upstream many tutorials after the codebase is fully functioning.

It is totally understandable that reviewers are feeling frustrated when not understanding the design philosophy, and that is what RFCs are for, aren't they :-) Next time, we would love to see such frustration converts to clear questions and answers. For design philosophy-related questions, I would love to redirect everybody to the RFC from the very beginning, and we shouldn't debate outside the RFC about topics like "why two scopes".

Then should we include the RFC text into the inlined docs? I think it is debatable, and I prefer not to replicate. The basic reason is that we will have D1 and D2 in the end, and maintaining two copies is quite error-prone. An easier solution for maintenance is to add links to D1 and D2 on critical data structures.

python/tvm/script/special_stmt.py Outdated Show resolved Hide resolved
python/tvm/script/special_stmt.py Outdated Show resolved Hide resolved
python/tvm/script/special_stmt.py Outdated Show resolved Hide resolved
python/tvm/tir/analysis/analysis.py Outdated Show resolved Hide resolved
python/tvm/tir/analysis/analysis.py Outdated Show resolved Hide resolved
src/tir/analysis/block_access_region_detector.cc Outdated Show resolved Hide resolved
@tkonolige
Copy link
Contributor

@junrushao1994 @tqchen I have a policy question: should coauthors on PR approve their own PRs?

@tqchen
Copy link
Member

tqchen commented Mar 19, 2021

Thanks @tkonolige for bring it up. To be fully transparent, technically right now you are also co-author of the PR according to the commit history :)

In this particular case, @Hzfengsy wrote the original code, while both @junrushao1994 and myself are serving as reviewers of the code. Personally I have not made any major change to the code as much as you do(just suggestions to the code comments and structure itself), it was nice that he added ourselves and you as co-author and as well, although from the perspective of the engineering, I am serving as the reviewer that helps to polish the overall architecture and logic correctness.

@jroesch
Copy link
Member

jroesch commented Mar 19, 2021

Thanks again @Hzfengsy sorry for blocking you here, just attempting to push for better documentation as this code is really important and critical work that I believe will be important to many contributors of TVM in the future.

Talked with TQ a bit more about documentation standards for new features and will come back to the community with a more clear suggestion in the coming weeks.

Hzfengsy and others added 5 commits March 20, 2021 19:45
Co-authored-by: Tristan Konolige <tristan.konolige@gmail.com>
Co-authored-by: Tristan Konolige <tristan.konolige@gmail.com>
@Hzfengsy
Copy link
Member Author

Hzfengsy commented Mar 20, 2021

Thanks everyone's kindly input. It makes this PR much clearer and readable. I totally agree that great documentation is necessary for open-source projects. But I agree more with @junrushao1994 that we should distinguish who the documents are written for. We cannot write every doc for devs without any background. I think that the docs in TVM parser are written for developers who knows the TIR data structures. So that documents for data structures (e.g. block signature: reads/writes/match_buffer_region) are not necessary. But I have been making my effort to address actional items as much as possible, making things move on. Finally, I add the block example in context_maintainer to make everyone happy, although I still disagree it should be here.

I have polished the code for many rounds together with my co-authors. I want to thank their efforts again for improving code quality. During development, I found some latent issues of TVMScript Parser. I just updated the following thing besides basic functional improvement:

  1. Type annotations for the Parser
  2. Fix error reporting issues left over by last PR (authored by @tkonolige)
  3. Add many test cases for Parser and error reporting (not only for new features of TensorIR)
  4. Add more document and assert checks lacking in the current version

This PR's original purpose is to add support for TensorIR, which is long about 1000 lines of code. But I just tried my best to improve code quality, and finally, the changes became to 2000+ lines. That's far more than what should an incrementally PR does. I'd love to try my best to polish code but not endless requests.

In general, I'm happy to improve code quality and documentation. But it still to think about the tradeoff between readability and heavy documents.

@tqchen tqchen merged commit fffed0f into apache:main Mar 20, 2021
@tqchen
Copy link
Member

tqchen commented Mar 20, 2021

Thansk @Hzfengsy for continuous effort to improve this PR. Thanks @tkonolige @electriclilies @comaniac @junrushao1994 @jroesch for reviewing. This PR is now merged.

@junrushao junrushao linked an issue Mar 24, 2021 that may be closed by this pull request
29 tasks
@junrushao junrushao changed the title [TensorIR] TVMScript Parser/Printer [TensorIR][M1a] TVMScript Parser/Printer Apr 16, 2021
trevor-m pushed a commit to trevor-m/tvm that referenced this pull request May 6, 2021
Co-authored-by: Bohan Hou <32121147+spectrometerHBH@users.noreply.github.com>
Co-authored-by: Junru Shao <junrushao1994@gmail.com>
Co-authored-by: Tianqi Chen <tqchen@users.noreply.github.com>
Co-authored-by: Ruihang Lai <lairuihangdongdong@qq.com>
Co-authored-by: Hongyi Jin <3231950289@qq.com>
Co-authored-by: Wuwei Lin <wuwei@apache.org>
Co-authored-by: Tristan Konolige <tristan.konolige@gmail.com>
Co-authored-by: Cody Yu <comaniac0422@gmail.com>
trevor-m pushed a commit to neo-ai/tvm that referenced this pull request May 11, 2021
Co-authored-by: Bohan Hou <32121147+spectrometerHBH@users.noreply.github.com>
Co-authored-by: Junru Shao <junrushao1994@gmail.com>
Co-authored-by: Tianqi Chen <tqchen@users.noreply.github.com>
Co-authored-by: Ruihang Lai <lairuihangdongdong@qq.com>
Co-authored-by: Hongyi Jin <3231950289@qq.com>
Co-authored-by: Wuwei Lin <wuwei@apache.org>
Co-authored-by: Tristan Konolige <tristan.konolige@gmail.com>
Co-authored-by: Cody Yu <comaniac0422@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[RFC][Tracking Issue] TensorIR Scheduling
8 participants