
[RFC] TVMScript Metaprogramming #79

Merged
merged 12 commits into from
Aug 4, 2022

Conversation

yelite
Contributor

@yelite yelite commented Jun 16, 2022

@yelite yelite changed the title [RFC] Add TVMScript Metaprogramming RFC [RFC] TVMScript Metaprogramming Jun 16, 2022
@yelite
Contributor Author

yelite commented Jun 18, 2022

Cross post from https://discuss.tvm.apache.org/t/rfc-tvmscript-metaprogramming/12969/2 for more visibility.

From @Johnson9009,

Thanks for the work. For F4, do you mean we can write some imperative code that can be executed by Python?

Yes, that’s right. Actually, you can already write imperative code to construct the IR graph now, by calling IR node constructors from tvm.tir.stmt|expr manually. The new things behind F4 are:

  1. IRBuilder API, which can be used to construct IR nodes in a more concise way compared to calling node constructor directly.
  2. The ability to embed these functions inside TVMScript. This means you can use imperative code to construct small fragments inside a large T.prim_func.

I updated the RFC to clarify when and where the imperative code gets evaluated.

Contributor

@tkonolige tkonolige left a comment

Thanks for this RFC. Overall, I think metaprogramming is an important feature to have in TVMScript. It is also a complicated one that warrants discussion. I'd like to see a little more discussion of the approaches other languages use for metaprogramming.

1. All primitive types will be captured automatically, while advanced types
(like functions) need to be explicitly declared in the decorator to be
captured (for example, `@T.prim_func(capture=[get_box_coordinates])`)
2. Call the corresponding function from IRBuilder, as the parser visits the AST
Contributor

Just to clarify, what you are saying here is that the parser will not construct AST nodes directly and it will instead use the new IRBuilder infrastructure to construct the AST?

Contributor Author

Yes, the IRBuilder is responsible for constructing the IR graph and the parser can be considered as a thin layer built on the IRBuilder.

I updated the RFC to better clarify this.


1. Evaluate the expression `T.grid(*A.shape)` by the step described above.
`T.grid` returns a value that is nearly equivalent to `List[Var]`.
2. Call `exec` on a specially constructed statement `*i = __tvm_rhs_var__`,
Contributor

exec is an incredibly powerful function and we should be careful using it. Is there a reason we can't bind the results directly?

Contributor

maybe it'd be helpful to note here that using exec means folks would need to essentially strip this code out from places that were productionizing TVM, as exec is generally a security no-no and requires special review.

Contributor Author

This is a good question. The original RFC indeed lacked some discussion of the necessity and security implications behind the usage of eval and exec. I updated the RFC to include a section on it.

To answer this comment directly, the left-hand side of an assignment can get very complicated, like

```python
a, (b, *c, _), d = rhs
```

It's much easier to use exec, compared to writing our own small Python interpreter. Since the tokens on the LHS come from the user's code without modification, and the auto-capturing behavior is limited, the usage of exec here does not create additional security risk for the user.
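As a toy illustration of why `exec` is convenient here (a sketch, not the actual parser code), the left-hand-side text can be spliced verbatim in front of a reserved RHS name, and Python's own unpacking rules do the rest:

```python
def eval_assign(lhs_code: str, rhs_value):
    # The LHS text is taken verbatim from the user's code; only the reserved
    # name __tvm_rhs_var__ is injected to carry the right-hand-side value.
    env = {"__tvm_rhs_var__": rhs_value}
    exec(f"{lhs_code} = __tvm_rhs_var__", {}, env)
    del env["__tvm_rhs_var__"]
    return env  # every name bound by the assignment, however deeply nested

bindings = eval_assign("a, (b, *c, _), d", (1, (2, 3, 4, 5), 6))
# bindings: {"a": 1, "b": 2, "c": [3, 4], "_": 5, "d": 6}
```

Handling arbitrary nesting and starred targets by hand would amount to reimplementing Python's assignment semantics.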

using exec means folks would need to essentially strip this code out from places that were productionizing TVM

I think that's a little far-fetched. I agree that the usage of eval and exec needs a strong rationale and careful security review, and that using them on untrusted input is a big security hole. But there are still legitimate use cases. In our case the input comes directly from the user's code, which should be considered safe. There is a similar usage of exec in torch.jit https://github.com/pytorch/pytorch/blob/master/torch/jit/frontend.py#L465 (constructing a function from a template with the user's input).


This is an interesting discussion and both sides seem to have valid points - eval and exec can be very useful, but they are susceptible to misuse. If they are irreplaceable, can we provide more guidelines (e.g., good/bad use cases) to avoid misuse?

Contributor Author

It will be quite challenging to have a comprehensive guideline covering the usage of eval and exec. In our particular case, the guideline is to always limit the input of eval and exec to fragments of the user's code (or an equivalent snippet, like `<lhs> = __tvm_rhs_var__`). By doing this, we eliminate the possibility that unknown code is executed without the user's awareness.

Another effect of using eval is that the timing of evaluation changes. An extreme example would be (although not something people would write in TVMScript):

```python
@T.prim_func
def f():
    ...
    if False:
        nuke_home_directory()
    ...
```

Users would expect:

  1. Nothing from f will run if f is not called by other code
  2. nuke_home_directory will not be called because it's dead code

To minimize surprise, we ask users to explicitly capture such functions before they can be evaluated in the parser, to make it clear that the timing of evaluation is different from ordinary Python code.
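The timing surprise can be made concrete with a toy decorator (purely illustrative; `eager_decorator` is a hypothetical stand-in, not part of the proposal):

```python
calls = []

def eager_decorator(fn):
    # Stand-in for a parser that evaluates explicitly captured callables
    # at decoration (parse) time rather than at call time.
    fn()
    return fn

@eager_decorator
def f():
    calls.append("ran at decoration time")

# f was never called by user code, yet its body already ran:
assert calls == ["ran at decoration time"]
```

This is the behavior that explicit capture is meant to make visible to the user.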


```python
@dispatch.register(token="tir", type_name="For")
def visit_for(self: Parser, node: doc.For) -> None:
```

Contributor

I'm a little confused at the type signature here. Shouldn't the visitor return an AST fragment so that the visit statements can be use recursively?

Contributor Author

The parser is a thin wrapper of the IRBuilder API and doesn't directly construct IR graph. IRBuilder uses thread local store for builder state so there is no need to return things from visit methods.

Contributor

Can you explain why the granularity for registering parsers is at the individual token level and not at the language level? Your example here is

```python
@dispatch.register(token="tir", type_name="For")
def visit_for(self: Parser, node: doc.For) -> None:
    ...
```

But I'd expect

```python
@dispatch.register(language="tir")
class TIRParser:
    def parse_for(...):
        ...
    def parse_with(...):
        ...
```

I don't understand why you would need the extension point of the language to be on a per-token basis.

Comment on lines +290 to +291
The implementation of IRBuilder will be in C++ so that it can be used in an
environment without Python. Python binding will be created to expose IRBuilder
Contributor

Another good motivation for having IRBuilder in C++ is that it could be used from other languages like Rust.

Contributor Author

Yes, and this complements the design of Doc and DocPrinter in the printer RFC https://github.com/apache/tvm-rfcs/blob/main/rfcs/0074-tvmscript-unified-printer.md. We will have the flexibility to parse from and print to a different language if use cases emerge.

Comment on lines 312 to 314
1. All primitive types will be captured automatically, while advanced types
(like functions) need to be explicitly declared in the decorator to be
captured (for example, `@T.prim_func(capture=[get_box_coordinates])`)
Contributor

I think it would be more consistent to capture all values explicitly. It avoids the problem of users accidentally capturing a variable without knowing it.

Contributor Author

In an ideal world we would want to capture everything implicitly, to match the closure behavior in Python. However, due to the usage of eval and exec, we want to limit auto capture to types that are absolutely safe, to mitigate security concerns.

We will provide a detailed error message if the user tries to use a variable that should have been explicitly captured, to compensate for the potential errors caused by this inconsistency.

What the parser does here is to:

1. Collect the environment inside `gen_matmul`, getting a dictionary
1. All primitive types will be captured automatically, while advanced types
Contributor

Could you list the types that are supported for capture?

Contributor Author

Sure, it's updated. It will auto capture `int`, `str`, `float` and `None`.
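For illustration, the capture rule could be filtered roughly like this (a sketch under assumptions; `collect_captures` and `get_box_coordinates` are hypothetical names, not the actual implementation):

```python
# Only these value types are considered safe enough to capture implicitly.
AUTO_CAPTURE_TYPES = (int, float, str, type(None))

def collect_captures(env, explicit=()):
    # Keep safe primitive values automatically; anything else (functions,
    # objects, ...) must be listed explicitly, mirroring
    # @T.prim_func(capture=[...]) from the RFC.
    explicit_ids = {id(v) for v in explicit}
    return {
        name: value
        for name, value in env.items()
        if isinstance(value, AUTO_CAPTURE_TYPES) or id(value) in explicit_ids
    }

def get_box_coordinates():  # hypothetical helper from the RFC example
    pass

env = {"n": 128, "dtype": "float32", "scale": 0.5, "pad": None,
       "helper": get_box_coordinates}
assert "helper" not in collect_captures(env)                       # filtered out
assert "helper" in collect_captures(env, explicit=[get_box_coordinates])
```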

Comment on lines +345 to +346
The logic of how to process a particular Python syntax node is registered
through decorator. For example,
Contributor

Could you explain why we want to register parsers for specific fragments instead of calling them explicitly? Using a registration mechanism makes it a lot harder to understand the codebase. Registrations can be split over multiple files, and you have to go on a wild chase through the codebase to find what you are looking for.

Contributor Author

Indeed, sometimes registration results in code being split over many files and makes it hard to trace how things are connected. In our case, though, all registered parsing functions for a particular IR will be centralized in a single file. Another benefit of not calling them directly is that it's easier to test individual parts of the parser.


N/A

# Rationale and alternatives
Contributor

There are definitely different approaches to metaprogramming, like Lisp macros vs C macros. Could you add a little discussion of that here?

Contributor Author

One of the design constraints is that TVMScript should not deviate from Python. All features proposed in this RFC follow Python's semantics and don't create foreign syntax or language features. Neither Lisp macros nor C macros are relevant to the design here.

I updated the RFC to include a discussion on metaprogramming features from Taichi and Triton per your other feedback.

Member

@yongwww yongwww left a comment

Overall looks very good to me!

Contributor

@areusch areusch left a comment

thanks @yelite for the RFC! overall i think it does a pretty good job of setting forth examples and motivating the proposed solution. i really liked how you grounded F1-F2 and F4 in expressing a particular operator or subgraph; i'd like to encourage you to do the same for F3 as well, as i think it would help to motivate the overall RFC even more.

i think it would be worth talking about the style of this solution, as it is pushing us in the direction of a rather complex TIR parser rather than towards a simpler one that could be combined as a tool. i think metaprogramming is a large reason for that, and it would be great to just spell out the kinds of larger tasks this allows users to accomplish more easily so folks can understand.

i think @tkonolige has also raised some good questions, and i think most of them can be addressed through smaller revisions. i've tried to add some suggestions here as to a path forward to addressing them. overall i think the structure of the RFC is good, but we could add a little bit more detail here and there to improve the motivation and clarify the intentions. lmk what you think of my suggestions!

The current design also lacks a unified approach for different IRs. At
[https://github.com/tlc-pack/relax/tree/relax/python/tvm/script/relax](https://github.com/tlc-pack/relax/tree/relax/python/tvm/script/relax),
a mature implementation of TVMScript parser is maintained for Relax. But it’s
hard to extend if we want to support more IRs for TVM unity.
Contributor

i think talking about more IRs without context might sound a bit scary to some readers...perhaps channeling the spirit of YAGNI we should avoid over-generalizing here. I think having 1 parser for Relax and TIR is quite reasonable; beyond that we might consider waiting til we have more details.

Member

@junrushao junrushao Jul 5, 2022

Thanks @areusch for your feedback! The technique itself could sound scary or over-generalized, but in fact it's a better implementation with better maintainability:

  1. It's extremely simple to implement. Concretely, at the time of writing, the core parser is 176 lines of code, which is already working for existing cases without problems. Compared with the current monolithic parser, which has 1370 lines of code, it's significantly shorter.
  2. It's good for tooling, particularly unit testing, documentation and the ease of adding more sugars. Currently the monolithic parser is extremely hard to unit-test, because it's impossible to split out a specific chunk of parsing logic, and thus all tests are end-to-end tests; for example, a 1k-line "unit" test is not quite "unit". With our new design, every part of the logic can be tested separately, and almost all sugars don't need changes to the parser; adding more functions on the IRBuilder side suffices.
  3. The new design provides much better clarity and readability.

```python
B: T.Buffer[(128, 128)],
) -> None:
...
C = T.compute((128, 128), lambda i, j: A[i, j] + B[i, j])
```
Contributor

curious if you may want to apply rewrite rules here, how would that look? how big should we go here? is it worth including this as "metaprogramming" coupled with the parser, or should we simply make it possible to "sew" fragments of TVMScript together, whether they are hand-written or machine-generated?

Member

curious if you may want to apply rewrite rules here, how would that look? how big should we go here?

Just to make sure I understand the question correctly - by rewrite rule, do you mean rewriting the Python AST?

If so, the answer is no. To implement such a feature, there is absolutely no need to change the core parser at all. We only need to add a method in the IRBuilder:

```python
def compute(shape, f_compute) -> None: ...
```

which will internally call other methods the IRBuilder offers, for example in our case is:

```python
# Note: this can be implemented in either C++ or Python, without the parser
# having to do anything.
C = T.alloc_buffer(*shape)
with T.grid(*shape) as loop_vars:
    with T.block("some_name"):
        T.buffer_store(C, shape, f_compute(*loop_vars))
```



captured (for example, `@T.prim_func(capture=[get_box_coordinates])`)
2. Call the corresponding function from IRBuilder, as the parser visits the AST
of function `f`.
1. When visiting the function argument, call `eval` on its type annotation,
Contributor

one thing to watch out for is that type annotations can be str to avoid circular imports here
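To illustrate the point (a sketch; `eval_annotation` and `Buffer` are hypothetical names): with forward references or `from __future__ import annotations`, the annotation object is a `str`, so an annotation-evaluating parser has to handle both forms:

```python
def eval_annotation(ann, env):
    # String annotations (forward references, PEP 563) must be eval'ed
    # against a namespace; real objects pass through unchanged.
    return eval(ann, env) if isinstance(ann, str) else ann

class Buffer:  # hypothetical stand-in for a TVMScript type
    pass

def f(x: "Buffer") -> None:  # the annotation here is the *string* "Buffer"
    pass

ann = f.__annotations__["x"]
assert isinstance(ann, str)
assert eval_annotation(ann, {"Buffer": Buffer}) is Buffer
```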

Member

@junrushao junrushao Jul 5, 2022

Yeah, it's definitely a good point in general. In our particular case, our type system is restricted to what TVM IR can represent (mostly tir.IterVar, tir.Var, tir.Buffer, etc.), so it shouldn't be a concern here.

@YuchenJin YuchenJin left a comment

LGTM, thanks for the proposal! The metaprogramming features are very useful for Relax. For example, when implementing a ResNet model in TVMScript, users can define a residual block function with variables inside, and generate residual blocks (Relax functions) by supplying specific values. The IRBuilder approach seems general enough to support other IRs.

@yelite
Contributor Author

yelite commented Jul 7, 2022

Thanks @tkonolige and @areusch for the valuable feedback! It was very helpful in getting me to better clarify the motivation behind this work and also the design details. I updated the RFC to address it. Please take another look and let me know if you have more questions.

@xqdan

xqdan commented Jul 12, 2022

@yelite It's a great RFC, and this is what we need right now.
The requirements we have:

  1. Compute fusion. With TE compute, it's easy to concatenate TE computes with producer-consumer relations to get a fused compute, for example conv + elementwise op fusion. We should have a similar function in TVMScript. Which thread is related to this requirement?
  2. Conditional lowering. We may have some attributes at the graph/Relay level which will further decide how to lower into different TIR. With the old IR builder / TE compute, we can do that. F4 in this RFC will ensure this, correct?
  3. Reducing boilerplate code. F3 is a good idea. Another one is to define a TIR function (with or without host Python code) and reuse it elsewhere. We see this in F4, which focuses on conditional lowering, but I think we should define/declare it as a standalone feature.

Looking forward to see this RFC in upstream!

@yelite
Contributor Author

yelite commented Jul 13, 2022

@xqdan Thanks! For your questions,

For compute fusion. With TE compute, it's easy to concatenate TE computes with producer-consumer relations to get a fused compute, for example conv + elementwise op fusion. We should have a similar function in TVMScript. Which thread is related to this requirement?

This will be supported naturally by the F3 (TE compute). F3 will be implemented by calling the IRBuilder APIs during parse-time evaluation. So one can write multiple TE compute side by side, like

```python
x = T.compute(...)
y = T.compute(...)
```

and the actual fusion is done in MetaSchedule and TIR.

For conditional lowering. We may have some attributes at the graph/Relay level which will further decide how to lower into different TIR. With the old IR builder / TE compute, we can do that. F4 in this RFC will ensure this, correct?

Yes, this is correct.

For reducing boilerplate code. F3 is a good idea. Another one is to define a TIR function (with or without host Python code) and reuse it elsewhere. We see this in F4, which focuses on conditional lowering, but I think we should define/declare it as a standalone feature.

Mutual function calls will be supported in the new TVMScript parser as well. We didn't include them in this RFC because they don't belong to the metaprogramming category and don't require potentially controversial design considerations.

@sunggg sunggg left a comment

Thanks for the exciting RFC!! 🥳
Just a couple of minor questions.



@junrushao
Member

junrushao commented Jul 28, 2022

To follow up with our latest discussion with @tkonolige @areusch @csullivan @slyubomirsky @jwfromm et al.

The following questions are raised in our discussion:

  1. Move discussion of vendor IR to tradeoffs / benefits section rather than core motivation.
  2. (Section 2) Parser registration example is a little confusing (dispatch.register), probably just needs more detail, less …
  3. (Section 1) Add a tradeoffs section noting that this isn't a typical parser: it gives a much more Pythonic interface, easier testing, more flexibility.
  4. (Section 3) More discussion around what lives in host language. Maybe need script.host_var and script.quote in part of future work, might be worth highlighting in a future work section.
  5. (Section 2) tradeoff on registration approach.

Our response:

1. Tradeoffs: TVMScript parser is not a typical parser

Unlike a typical parser that converts a token stream to an AST, the TVMScript parser is, technically speaking, a transpiler that transforms a Python AST into a TIR AST rather than parsing raw Python source.

Pros:

  1. Python’s AST package provides accurate parsing functionality, so all issues around ambiguity, grammar and parser performance are non-issues.
  2. Compared with the existing monolithic TVMScript parser, which is 1.5k lines of code, the transpiler provides much easier testing, a more Pythonic interface and more flexibility.

Cons:

  1. We are depending on Python’s AST package, which could come with breaking changes across Python versions.
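The division of labor can be seen in miniature with the standard `ast` module: Python parses the source text, and the "parser" only walks the resulting tree (whose node layout is exactly what can shift across Python versions):

```python
import ast

# Python's own parser handles tokens, grammar and ambiguity for us;
# a transpiler only needs to dispatch on the resulting nodes.
tree = ast.parse("for i, j in T.grid(4, 4):\n    T.evaluate(0)")
loop = tree.body[0]

assert isinstance(loop, ast.For)           # statement node to dispatch on
assert isinstance(loop.target, ast.Tuple)  # the (i, j) assignment targets
assert isinstance(loop.iter, ast.Call)     # the T.grid(...) call expression
```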

2. Registration Logic

The registration logic is in fact quite straightforward. For example, to support TIR’s for-loop syntax like:

```python
for i, *many_j, k in T.grid(...):  # or anything like T.serial/parallel/vectorized/unroll/thread_binding
    ...
```

The registration logic is as simple as calling into our IR-builder:

```python
@dispatch.register(token="tir", type_name="For")
def visit_for(self: Parser, node: doc.For) -> None:
    for_frame = self.eval_expr(node.iter)
    if not isinstance(for_frame, T.ForFrame):
        self.report_error(
            node.iter,
            "Expect the for loop to be one of the following: "
            "range, T.serial, T.grid, T.parallel, T.vectorized, T.unroll, T.thread_binding",
        )
    with self.var_table.with_frame():
        with for_frame as iters:
            self.eval_assign(target=node.target, source=iters, bind_value=bind_value)
            self.visit_body(node.body)
```

There is an alternative proposal that registration should happen at class-level instead of method-level, e.g.

```python
## Our RFC
@dispatch.register(token="tir", type_name="For")
def registered_method(self: Parser, node: doc.For) -> None: ...

## Alternative
@dispatch.register(token="tir")
class DispatchOfTIR:
    @staticmethod
    def visit_For(self: Parser, node: doc.For) -> None: ...
```

The advantage of the alternative proposal is that it limits the users so that they have to put all the logic inside a class, while the disadvantage is that the class itself doesn’t mean anything other than a collection of static methods, which could bring some confusion if developers attempt to instantiate the class.

To this end, “putting all logic inside a class” is equivalent to “putting all logic inside a file” because the class only serves as a namespace. Therefore, we believe the best design should be as minimal as possible, i.e. without a class.

Drawback. A registry is a necessary design when it comes to supporting per-node tooling and multiple IRs at scale. However, less carefully designed registries, for example relay operator strategy, where definitions span multiple folders and tens of files, lead to worse discoverability and a worse debugging experience. On the other hand, in our particular case, because all dispatches for a certain IR are confined to a single file (e.g. tir.py), this is not a particular concern.

3. Metaprogramming syntax

A question has been raised on supporting quotations for metaprogramming in languages like MetaOCaml [1].

This feature is very cool, and it could be an important piece of future work for well-motivated use cases. In terms of syntax, we believe it could look like the following:

```python
with script.quote():        # <===== hand back control to the Python interpreter
    ...                     # arbitrary Python program
    with script.unquote():  # <===== regain control from the Python interpreter
        ...                 # parsable TVMScript
```

[1] MetaOCaml -- an OCaml dialect for multi-stage programming https://okmij.org/ftp/ML/MetaOCaml.html
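As a toy model of the control handoff only (the `script.quote`/`script.unquote` names above are hypothetical future syntax; here they are modeled as plain context managers flipping a flag):

```python
from contextlib import contextmanager

control = ["parser"]  # who is currently interpreting the code

@contextmanager
def quote():
    control.append("interpreter")
    try:
        yield
    finally:
        control.pop()

@contextmanager
def unquote():
    control.append("parser")
    try:
        yield
    finally:
        control.pop()

with quote():
    assert control[-1] == "interpreter"  # arbitrary Python would run here
    with unquote():
        assert control[-1] == "parser"   # parsable TVMScript again
assert control == ["parser"]
```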

@slyubomirsky

Could you give an example of the metaprogramming? E.g., one of creating some value in ordinary Python and then referencing it from TVMScript

@junrushao
Member

@slyubomirsky Sure! Please see F1 and F2 for existing meta-programming capability (https://github.com/yelite/tvm-rfcs/blob/tvmscript-metaprogramming/rfcs/0079-tvmscript-metaprogramming.md#f1-template-metaprogramming), and see F4 for interleaving python interpreter with the parser. The quotation thing is stated in Section 3 of my response above

@slyubomirsky

slyubomirsky commented Jul 28, 2022

So if you define a variable in a quoted portion, you should be able to reference it in the unquoted portion? I am mostly curious about how the interactions between quoting and unquoting will be defined and also how it would be implemented. The decorator as previously understood converted a Python AST into a TIR AST, but with the quoted portions, it would have to leave those as Python, so is the result now a Python AST that also builds up TIR ASTs? (I haven't looked at the decorator implementations before, so pardon my ignorance if that's already how they work.)

@junrushao
Member

junrushao commented Jul 28, 2022

So if you define a variable in a quoted portion, you should be able to reference it in the unquoted portion

  • From quoted to unquoted: Yes, that's correct.
  • From unquoted to quoted: For security reasons, accessing values from the unquoted portion will require explicit specification if the values are not basic atomic ones (int, float, str, None, etc.)

I am mostly curious about how the interactions between quoting and unquoting will be defined and also how it would be implemented.

The mechanism here is that the core logic is in the IRBuilder, and the parser is only a thin wrapper over the IRBuilder. More specifically, any T.xxx is a call into the IRBuilder, in both the quoted and unquoted portions; the only difference is whether it's run by the interpreter or by the parser.

@tkonolige
Contributor

tkonolige commented Jul 29, 2022

I'm still a bit confused on 2. The example you give for a single class is one that has all static members, but what if instead it was just a regular class? It would instantiate a single instance of said class to do the parsing. This is what relax is doing (https://github.com/tlc-pack/relax/blob/25cb42133130fb8ec75d2cc021c6c347e99f0b28/python/tvm/script/relax/parser.py#L107). Instantiating a class would also give the benefit of being able to maintain some state while parsing, which is not possible with either free functions or a class with static methods. Here is how I would imagine it (this is just an example):

```python
# To clearly define the interface
class AbstractParser:
    def parse(self, func: ast.Function):
        raise NotImplementedError()

@register("relax")
class RelaxParser(AbstractParser):
    def __init__(self):
        self.inputs = []
        self.tir_parser = TIRParser()

    def parse(self, func: ast.Function):
        self.inputs.append(parse_args(func.args))
        self.tir_parser.add_variables(self.inputs)
        self.parse_stmt(func.body)

    def parse_stmt(self, stmt):
        if isinstance(stmt, ast.Let):
            assert stmt.var in self.inputs
        if isinstance(stmt, ast.Function):
            # parse nested functions as TIR
            self.tir_parser.parse(stmt)
```
If we want to consider future dialects, this also has the added benefit of being able to add dialects by subclassing one of the existing parsers.

Another issue I see with the proposed registration method is that there is no way to handle different parser rules based on context. As an example, what if I want to treat the first line of a function body differently than every other line (say it is a metadata dictionary):

def foo():
    {"version": "0.1"} # This should become a `MetadataNode`, only strings allowed as values.
    my_dictionary = {"hi": 1} # This should become a `DictLiteralNode`, only integers allowed as values.

With the proposed registration method there is no way to distinguish between what should be a DictLiteralNode and a MetadataNode:

@dispatch.register(token="mylang", type_name="Dict")
def visit_dict(self: Parser, node: doc.Dict):
    for key_ast, value_ast in node.items:
        key = self.visit(key_ast) # this assumes that the visitor returns something; given the type signatures in the RFC, that doesn't look to be possible?
        assert isinstance(key, StringLiteral)
        value = self.visit(value_ast)
        assert isinstance(value, StringLiteral) # or wait, should it be IntLiteral? There's no way to tell if this is the first line or not.

(Maybe I am misunderstanding how this approach works, though. Can we put arbitrary state into the `self: Parser` object? If so, then we can fix this issue, but it would be a complicated and confusing solution, as the parser would have to maintain a stack of rules or flags or some such.)

But with the class based approach I proposed above:

@register("mylang")
class MyLangParser:
    def parse_function(self, func):
        # assuming func.body is a list of statements; also applies if body is a recursive set of statements
        self.parse_metadata(func.body[0])
        for stmt in func.body[1:]:
            self.parse_statement(stmt)

    def parse_metadata(self, stmt):
        for key_ast, value_ast in stmt.items:
            key = self.parse_string(key_ast)
            value = self.parse_string(value_ast)

    def parse_statement(self, stmt):
        for key_ast, value_ast in stmt.items:
            key = self.parse_string(key_ast)
            value = self.parse_int(value_ast)
This issue is caused by registering visitors at the Python AST node level. When visiting each AST node there is no context as to where we may be in the production rules. If we instead use my proposed approach, the parser can control which production rules apply at each location.

@junrushao
Member

I believe we are all aware that the RFC aims to support general IRBuilder-based metaprogramming; with this context, I would love to address your concerns below.

there is no way to handle different parser rules based on context

Our design handles context-dependent parsing in a unified way: both TIR and Relax are context-dependent, so this is a hard requirement for any parser.

Here is an example of context-dependent parsing in TIR:

@T.prim_func
def f(a: T.handle):
  A = T.match_buffer(a, shape=(128, 128), dtype="float32")
      ^^^^^^^^^^^^^^
      # `match_buffer` here interprets the handle `a` (a buffer data pointer) as a `tvm.tir.Buffer`

@T.prim_func
def f(...):
  # An example from tensorization:
  # https://github.com/apache/tvm/blob/c2ec95616ed52e8377c42c5f3e15621841924059/tests/python/unittest/test_tir_schedule_tensorize.py#L199-L203
  with T.block(...):
     A_sub = T.match_buffer(A[vi * 16 : vi * 16 + 16, vk * 16 : vk * 16 + 16], shape=(16, 16))
             ^^^^^^^^^^^^^^
             # `match_buffer` here corresponds to `MatchBufferRegion` in TIR
             # Link: https://github.com/apache/tvm/blob/c2ec95616ed52e8377c42c5f3e15621841924059/include/tvm/tir/stmt.h#L1215-L1216  

The example you give for a single class is one that has all static members, but what if instead it was just a regular class. It would instantiate a single instance of said class to do parsing.

Instantiating a class would also give the benefit of being able to maintain some state while parsing which is not possible with either free functions or a class with static methods.

Thanks for bringing up the case where some state needs to be maintained during parsing. To be clear, that's exactly what the IRBuilder is designed to handle.

For example, consider the IRBuilder program below:

def example():
  with Builder():
    with T.block("block"):
      T.block_attr({"some_attr": some_value})
      ^^^^^^^^^^^^
      `block_attr` depends on the enclosing `T.block` above

To handle this, the IRBuilder proposed here, like every IRBuilder, keeps a stack of contexts as its internal state so that such information can be stored there.
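A minimal sketch of such a context stack (hypothetical names, not the actual TVM code) might look like this, where `block_attr` attaches data to the innermost open `block` frame:

```python
# Hypothetical sketch: an IRBuilder keeping a stack of open frames so that
# a call like `block_attr` can find and mutate the enclosing block frame.

class Frame:
    def __init__(self, kind, name=None):
        self.kind = kind
        self.name = name
        self.attrs = {}

class Builder:
    def __init__(self):
        self.stack = []

    def enter(self, frame):
        # Called when a `with T.block(...)` (or similar) scope opens.
        self.stack.append(frame)
        return frame

    def exit(self):
        # Called when the scope closes; returns the finished frame.
        return self.stack.pop()

    def block_attr(self, attrs):
        # Context-dependent call: it requires an enclosing block frame.
        assert self.stack and self.stack[-1].kind == "block", \
            "block_attr used outside of a block"
        self.stack[-1].attrs.update(attrs)

b = Builder()
b.enter(Frame("block", "block"))
b.block_attr({"some_attr": 1})
frame = b.exit()
```

Because the stack lives entirely in the builder, the same `block_attr` call works identically whether it is executed by the interpreter or evaluated by the parser.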

The proposed parser, as a thin wrapper of the IRBuilder, does not store any extra context-dependent data other than the symbol table and what is already stored in the IRBuilder. As an example,

@T.prim_func
def example():
  ...
  with T.block("block"):
       ^^^^^^^ 
       calls IRBuilder via `T.block`
    T.block_attr({"some_attr": some_value})
    ^^^^^^^^^^^^
    calls IRBuilder via `T.block_attr`

If we want to consider future dialects, this also has the added benefit of being able to add dialects by subclassing one of the existing parsers.

Agreed that we need to consider future dialects, and glad we are on the same page about this. In our RFC, the IRBuilder is IR-agnostic and extensible to multiple dialects. Contextual information is stored in the IRBuilder so that metaprogramming with the IRBuilder is possible.

self.tir_parser = TIRParser()

self.tir_parser.parse(stmt)

The problem with this design is that:

  • To support IRBuilder-based metaprogramming, extra work with similar logic has to be duplicated in the IRBuilder, which is less desirable and could cause divergence between the parser and the IRBuilder;
  • There is an assumption that a monolithic RelaxParser knows that one and only one other language, "TIR", exists, which is undesirable when, as you mentioned, we want to "add dialects". Hopefully the snippet from the exact commit you linked reveals how many underlying assumptions this approach carries.

It would be great if you could refer to the section Parsing-time evaluation to understand how the new parser is a thin wrapper of the IRBuilder.

@tkonolige
Contributor

Ok, I think I understand things a little bit better now. Thanks for the explanation! I can see how if the IRBuilder handles all state then there is not really much of a point to having classes for the parser. It might be worthwhile to mention having a stateful parser class as an alternative in the Alternatives section (with the mentioned disadvantages around duplicating logic).

Two more questions:

Given that the IRBuilder maintains all state, how does it determine which context to use when evaluating expressions? Say I have

@T.prim_func
def f(a: T.Buffer[2, "float32"]):
    ...

And I want this buffer to be a TIRBuffer when the language is TIR and a RelaxBuffer when the language is Relax. Is this possible? From my reading, the parser and IRBuilder are only invoked for statements and function calls. Is that correct? Or can you register a parser for any node in the AST (instead of just statements)?

Another question is how this IRBuilder approach handles error catching and reporting. Let's say we have

@T.prim_func
def f(a: T.handle):
    with T.block(...):
        A = T.match_buffer(a, ["a string", 1], "float32")
        #                      ^^^^^^^^^^
        # An error should be reported here because the second argument to match_buffer should be a `Sequence[Union[int, T.Var]]`

If I understand correctly, the T.match_buffer call is evaluated within a Python context. This context calls match_buffer on the IRBuilder. Then the IRBuilder constructs a MatchBuffer object, which in turn throws an error because its second argument has the wrong type. How are this error and its location then bubbled back up to the user?

@junrushao
Member

junrushao commented Aug 4, 2022

Thanks for following up!

the parser and IRBuilder are only called for statements and for function calls. Is this correct?

In fact, we allow registering visitors for any Python AST construct. For example, developers can specify the behavior when visiting type annotations, function arguments, etc.
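One way to picture this is a dispatch table keyed by `(token, node-type)`, which is roughly what the RFC's `dispatch.register` decorator implies. The sketch below is hypothetical and uses illustrative names only:

```python
# Hypothetical sketch of a (token, node-type) dispatch table: visitors can
# be registered for any Python AST construct, not just statements.

_table = {}

def register(token, type_name):
    def deco(fn):
        _table[(token, type_name)] = fn
        return fn
    return deco

def dispatch(token, node, parser=None):
    # Look up the visitor by dialect token and the AST node's type name.
    fn = _table.get((token, type(node).__name__))
    if fn is None:
        raise NotImplementedError(
            f"no visitor for {token}/{type(node).__name__}"
        )
    return fn(parser, node)

@register(token="mylang", type_name="Dict")
def visit_dict(parser, node):
    return ("dict-literal", node)

@register(token="mylang", type_name="arguments")
def visit_arguments(parser, node):
    # A visitor for function arguments, showing that non-statement
    # constructs can be handled too.
    return ("args", node)

Dict = type("Dict", (), {})  # stands in for doc.Dict
result = dispatch("mylang", Dict())
```

Adding a new dialect then amounts to registering visitors under a fresh token, without touching the existing tables.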

And I want this buffer to be a TIRBuffer when the language is tir and a RelaxBuffer when the language is relax.

It’s allowed with our infrastructure, but in practice we prefer to use prefixes, i.e. T.Buffer and R.Buffer, which is more friendly to Python tooling.

how does this IRBuilder approach handle error catching and reporting

It handles error reporting in a manner consistent with the current parser, i.e. using DiagnosticContext and its rendering.

In your specific case, where the error comes from a type mismatch in a particular parameter of a function call: although this is not well supported in the current parser (it works only for a small subset of buffer-related intrinsics, namely match_buffer, buffer_decl, and alloc_buffer, and the mechanism is not general enough), it was in fact one of our design considerations before starting this project, as was the overall tooling of the entire TVM project. The general mechanism is that we interpret expressions at a finer-grained level and do runtime type checking.
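A sketch of what such fine-grained runtime checking could look like (hypothetical names and spans, not the actual DiagnosticContext API): the evaluator checks each sub-expression's value against the expected type and raises an error carrying the span of the offending argument rather than of the whole statement.

```python
# Hypothetical sketch: runtime type checking of an evaluated argument,
# reporting the source span of the sub-expression that failed.

class ParseError(Exception):
    def __init__(self, span, message):
        super().__init__(f"{span}: {message}")
        self.span = span

def check_shape(value, span):
    # Expected type for a buffer shape: a sequence of ints.
    ok = isinstance(value, (list, tuple)) and all(
        isinstance(x, int) for x in value
    )
    if not ok:
        raise ParseError(span, "expected Sequence[int] for buffer shape")
    return value

# The second argument of a match_buffer-like call fails the check; the
# error points at that argument's span ("example.py:4:28" is illustrative).
try:
    check_shape(["a string", 1], span="example.py:4:28")
    msg = None
except ParseError as e:
    msg = str(e)
```

The rendered diagnostic can then highlight exactly the `["a string", 1]` argument in the user's source, matching how DiagnosticContext renders spans today.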

Contributor

@tkonolige tkonolige left a comment


Thanks for answering all my questions!

@junrushao junrushao merged commit ffbf686 into apache:main Aug 4, 2022
Hzfengsy added a commit to Hzfengsy/tvm that referenced this pull request Sep 2, 2022
Please join me in welcoming Yaxing Cai (@cyx-6) as a new reviewer in TVM. Yaxing has brought the PackedFunc into TVM object system ([RFC-051](apache/tvm-rfcs#51)), designed and implemented the new parser infrastructure for TVMScript and meta-programming ([RFC-079](apache/tvm-rfcs#79))

- [Commits History](https://github.com/apache/tvm/commits?author=cyx-6)
- [Code Review](https://github.com/apache/tvm/pulls?q=reviewed-by%3Acyx-6+)
junrushao pushed a commit to apache/tvm that referenced this pull request Sep 2, 2022
xinetzone pushed a commit to daobook/tvm that referenced this pull request Nov 25, 2022