
[Frontend] Add Span filling for frontends to Relay #9723

Merged
merged 5 commits into apache:main on Dec 28, 2021

Conversation

@chunit-quic (Contributor) commented Dec 13, 2021

  • Add a common span filling feature for TF1/2, TFLite, and PyTorch.
  • Add test cases for span filling in each frontend.
  • Expose Tuple and TupleGetItem spans to the Python end.

Hi community,

Here is a pull request that adds span filling for the frontend -> Relay conversion
(frontends: TF 1 and 2, TFLite, and PyTorch).
This feature helps users track the conversion more precisely.
I would like to describe how it works and its current status below. :D

  1. One-to-many conversion
    First, although there is a set_span function for TensorFlow and TensorFlow 2, some spans are still missing from time to time.
    One of the reasons is that an op conversion might be a one-to-many conversion.
    In this situation the intermediate ops end up with empty span strings.
    Take the pack conversion for example: several expand_dims ops may be added before the concatenate.
    By running an ExprMutator over the expression each time after an op is converted, we obtain fully span-tagged Relay IR (a simplified sketch of such a mutator follows the example below).

Here is a simple example. Before this change, the test case in this patch (tensorflow/test_forward.py:320) is converted to the following Relay expressions:

def @main(%input: Tensor[(?, ?, 3, 1), float32]) {
  %113 = shape_of(%input, dtype="int32") /* Shape */;
  %114 = strided_slice(%113, begin=[0], end=[1], strides=[1], axes=None);
  %115 = squeeze(%114) /* strided_slice */;
  %116 = expand_dims(%115, axis=0);
  %117 = expand_dims(3, axis=0);
  %118 = expand_dims(3, axis=0);
  %119 = (%116, %117, %118);
  %120 = concatenate(%119) /* stack */;
  dyn.reshape(%input, %120, newshape=[]) /* output */
}

With this patch we can obtain the following format.

def @main(%input: Tensor[(?, ?, 3, 1), float32]) {
  %0 = shape_of(%input, dtype="int32") /* Shape */;
  %1 = strided_slice(%0, begin=[0], end=[1], strides=[1], axes=None) /* strided_slice_PART_0 */;
  %2 = squeeze(%1) /* strided_slice */;
  %3 = expand_dims(%2, axis=0) /* stack_PART_0 */;
  %4 = expand_dims(3, axis=0) /* stack_PART_1 */;
  %5 = expand_dims(3, axis=0) /* stack_PART_2 */;
  %6 = (%3, %4, %5) /* stack_PART_3 */;
  %7 = concatenate(%6) /* stack */;
  dyn.reshape(%input, %7, newshape=[]) /* output */
}

(Thanks to @lixiaoquan's advice; keeping the suffix-free span attached to the last expression of the group seems better. :D)
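For reference, here is a minimal sketch of how such a span-filling mutator can be written. The class name SpanFiller and the "_PART_" suffix follow this patch, but the body below is illustrative only, not the exact implementation (for brevity it only handles Call nodes, while the patch also covers Tuple and others):

import tvm
from tvm import relay
from tvm.relay.expr_functor import ExprMutator

class SpanFiller(ExprMutator):
    """Tag span-less sub-expressions produced by a one-to-many
    conversion with "<node_name>_PART_<n>" (illustrative sketch)."""

    def __init__(self, node_name, suffix_str="_PART_"):
        super().__init__()
        self._node_name = node_name
        self._suffix_str = suffix_str
        self._counter = 0

    def _next_span(self):
        name = "{}{}{}".format(self._node_name, self._suffix_str, self._counter)
        self._counter += 1
        return tvm.ir.Span(tvm.ir.SourceName(name), 0, 0, 0, 0)

    def visit_call(self, call):
        new_call = super().visit_call(call)
        if new_call.span is None:
            # Rebuild the call with a derived span attached.
            new_call = relay.Call(
                new_call.op, new_call.args, new_call.attrs,
                new_call.type_args, self._next_span()
            )
        return new_call

A frontend would then run something like SpanFiller("stack").visit(converted_expr) right after converting each op.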

  2. Span naming for each frontend
    2.1. TensorFlow (1 and 2) naming: kept the same as before.
    2.2. TFLite naming: a combination of the op's position index and its output tensor name(s).
    The op position index alone is enough to map back to the TFLite model,
    and the output tensor name should be helpful when the user searches for the op in Netron.
    2.3. PyTorch naming: because PyTorch provides two kinds of graph, jit._trace.TopLevelTracedModule and _C.Graph, two key attributes, kind() and scopeName(), are recorded in a span (a small naming sketch follows this item).
    scopeName() is used to map a Relay expression back to its original PyTorch module part in jit._trace.TopLevelTracedModule and _C.Graph.
    Combined with kind(), the position of a node can be precisely located in _C.Graph.
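As an illustration of the PyTorch naming scheme (the helper name and the exact string format here are assumptions for exposition, not the code in this patch):

def make_torch_span_name(torch_node):
    """Hypothetical helper: build a span name from a torch._C.Node."""
    kind = torch_node.kind()        # e.g. "aten::relu"
    scope = torch_node.scopeName()  # e.g. "__module.lstm"; may be empty
    return "{}:{}".format(scope, kind) if scope else kind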

  3. Limitations
    3.1. A few models in test_functional_models.py are still under investigation.
    3.2. At the end of the TFLite conversion, a Tuple expression is added if there is more than one output. This Tuple will not have any span.
    3.3. Note that some conversions, like aten::to in PyTorch, might produce a Python built-in float instance; its node information is simply dropped.

  4. Trivia
    Several test cases are attached; they should serve as a quick verifier for review.

Thank you for reading. Any comment is appreciated. :)


* Add a common span filling feature for tf1/2, tflite and pytorch.
* Add test case for Span filling in each frontend.
* Expose Tuple and TupleGetItem to python end
@lixiaoquan (Contributor)

I just feel that in the one-to-many case, the original tensor/layer name should be attached to the last node in the group, because that is where the computational result (and tensor type) matches between the original graph and the Relay IR. We may need to find the last node of a group frequently, and keeping the original layer's name there makes that easier.

For example, an LSTM can become thousands of nodes in Relay IR. If its name is not attached to the last one, we would have to search layer_DERIVED_xxxx many times to find the end.

def @main(%input: Tensor[(?, ?, 3, 1), float32]) {
  %10 = shape_of(%input, dtype="int32") /* Shape */;
  %11 = strided_slice(%10, begin=[0], end=[1], strides=[1], axes=None) /* strided_slice_PART_0 */;
  %12 = squeeze(%11) /* strided_slice */;
  %13 = expand_dims(%12, axis=0) /* stack_PART_0 */;
  %14 = expand_dims(3, axis=0) /* stack_PART_1 */;
  %15 = expand_dims(3, axis=0) /* stack_PART_2 */;
  %16 = (%13, %14, %15) /* stack_PART_3 */;
  %17 = concatenate(%16) /* stack */;
  dyn.reshape(%input, %17, newshape=[]) /* output */
}

@chunit-quic (Contributor, Author) commented Dec 13, 2021

Hi @lixiaoquan,

That is good advice and easy to implement. :D
If tagging the original name onto the final expression is the better convention for users, I can change to this approach after collecting more comments from reviewers.

* Fix lint errors
* Change default string of scope_part in Pytorch
* Reorder the span position for one to many conversion
@mbs-octoml (Contributor) left a comment

Nice one, thanks for this! We are gradually improving how we flow spans through passes so that all this hard-won debug info is not immediately lost, so your work will pay even more dividends in the future.

Just some nits.

class SpanFiller(ExprMutator):
    """SpanFiller"""

    def __init__(self, node_name, surfix_str="_PART_"):
Contributor

nit: suffix_str

Contributor Author

Fixed

@@ -389,12 +389,21 @@ Doc RelayTextPrinter::VisitExpr_(const TupleNode* op) {
   if (op->fields.size() == 1) {
     doc << ",";
   }
-  return doc << ")";
+  doc << ")";
+  if (op->span.defined()) {
Contributor

nit: can you leave a warning comment that we'll probably need to protect this with some kind of 'include_spans' or 'verbose' printer flag? But at this stage I'm happy to have them all!

Contributor

Would you be up for doing the span suffix printing in the VisitExpr override? I think we might as well do it for all the node types uniformly.

Member

Ideally this should be replaced by the different printer that I described to you the other day, IIRC. I think we should bring back the old printing mode via an implementation of a Renderer.

Contributor Author

Hi @mbs-octoml,
Thank you for reviewing and giving this PR positive feedback! :D

About the comments in this part, please kindly correct me if I am wrong.

nit: can you leave a warning comment that we'll probably need to protect this with some kind of 'include_spans' or 'verbose' printer flag

If I did not misunderstand, it would be nice to have a flag to control span printing (after all, sometimes it will be super long).
In the latest commit I added a bool flag, true by default, to control it (src/printer/text_printer.h:116).
Although an empty "/* */" is still printed with this implementation... is it basically what we want?

would you be up for doing the span suffix printing in the VisitExpr override?

About this part, do you mean adding span printing to the printers that currently lack it, like ScalarLiteral, PrintExpr, and VisitExpr_(const IfNode* op)?
If so, I did try to browse them at first, yet it seems hard to track and verify them comprehensively at a glance, since sometimes we even need to check their C++ global registrations and the Python end.
If it is fine with you, perhaps one more PR for this enhancement would be better? I could also think about which test cases would help me verify those printers. :D

Contributor Author

Hi @jroesch

It sounds like there is a more suitable printer for this part.
Would you mind sharing that feature with me? Sorry that I just followed the existing format without checking more carefully.
Once we have a conclusion about which one should be used this time, I will modify my current code. :D

 * nit fixed
 * Add a bool flag to control print span
 * refactor pytorch get span to a briefer way
* Add one more condition for SpanFiller
* Refine the format for those pytorch nodes without scopeName
@chunit-quic (Contributor, Author) commented Dec 19, 2021

Hi @mbs-octoml, @jroesch,

Just a gentle ping. Should I modify anything more, or is it fine to be merged? Thanks :)

@chunit-quic (Contributor, Author)

Hi @mbs-octoml, @jroesch,
Hope you guys had a great vacation! Just a gentle ping again. :D

Hi @masahi,
I noticed you have merged a lot of PRs recently in the closed PR section. Would you mind taking a look at this one as well, please? :)

@FrozenGene (Member)

I like this PR; it could let us support heterogeneous execution in a more fine-grained way. Thanks @chunit-quic

@FrozenGene (Member)

As @mbs-octoml approved, ideally we could merge it now. However, we have one unresolved comment. @chunit-quic, do you want to file a new PR, or do you want to resolve it in this one?

@chunit-quic (Contributor, Author) commented Dec 28, 2021

Hi @FrozenGene,
Thanks for your positive feedback! :D

However, we have one unresolved comment,

Currently I would prefer to merge this one first if possible.
The unresolved part you mean should be the discussion with @jroesch and @mbs-octoml in src/printer/relay_text_printer.cc.
Since I have not received further replies from them, to the best of my knowledge it might be better to reach (or modify) what we want in one more PR, such as the formal printer, or collecting and exposing the spans that are hidden now.

@FrozenGene FrozenGene merged commit ce108c1 into apache:main Dec 28, 2021
@FrozenGene (Member)

Thanks @chunit-quic @mbs-octoml @jroesch, it is merged now. @chunit-quic, let us make new PRs to resolve the remaining discussion.

@chunit-quic (Contributor, Author)

Thanks @FrozenGene. Sure thing! :D
Once we get more precise information from @mbs-octoml and @jroesch, we can start on that work.

@mbs-octoml (Contributor)

Hi @chunit-quic, sorry the line went dead over the break, glad to see this merged. I don't have any outstanding requests for you.

ylc pushed a commit to ylc/tvm that referenced this pull request Jan 7, 2022
* [Frontend] Add Span filling for frontends to Relay

* Add a common span filling feature for tf1/2, tflite and pytorch.
* Add test case for Span filling in each frontend.
* Expose Tuple and TupleGetItem to python end

* [Frontend] Add Span filling for frontends to Relay

* Fix lint errors
* Change default string of scope_part in Pytorch
* Reorder the span position for one to many conversion

* [Frontend] Add Span filling for frontends to Relay

 * nit fixed
 * Add a bool flag to control print span
 * refactor pytorch get span to a briefer way

* [Frontend] Add Span filling for frontends to Relay

* Add one more condition for SpanFiller
* Refine the format for those pytorch nodes without scopeName

* [Frontend] Add Span filling for frontends to Relay

* Fix lint
ylc pushed a commit to ylc/tvm that referenced this pull request Jan 13, 2022
@rebel-shshin (Contributor) commented Jan 25, 2022

@chunit-quic @FrozenGene @mbs-octoml
Hi guys, I think I found a bug when converting a PyTorch LSTM layer to a Relay graph.
The LSTM layer appears twice in the converted Relay graph even though I have only one LSTM layer.
This weird behavior goes away if I comment out the following two lines in python/tvm/relay/frontend/pytorch.py:

span_str, empty_counter = self._get_torch_span(op_node, empty_counter)
relay_out = set_span(relay_out, span_str)

Do you have any idea?

@chunit-quic (Contributor, Author)

Hi @rebel-shshin,
I'm surprised by the LSTM case.
After checking test_lstm.py in the pytorch folder, the output IR graph does change with the set_span mutator.

Hi @FrozenGene, @mbs-octoml,
Since it might take me a while to investigate, perhaps it would be better to revert this change? I am preparing a PR that reverts this whole PR, and I will submit it if the reversion is OK with you.
Thank you!

@FrozenGene (Member)

Hi @rebel-shshin, I'm surprised by the LSTM case. After checking test_lstm.py in the pytorch folder, the output IR graph does change with the set_span mutator.

Hi @FrozenGene, @mbs-octoml, since it might take me a while to investigate, perhaps it would be better to revert this change? I am preparing a PR that reverts this whole PR, and I will submit it if the reversion is OK with you. Thank you!

OK. Please submit the revert PR.

chunit-quic pushed a commit to chunit-quic/tvm that referenced this pull request Jan 26, 2022
…)"

Because of the failure of LSTM conversion from Pytorch
This reverts commit ce108c1.
@chunit-quic (Contributor, Author) commented Jan 26, 2022

Thank you @FrozenGene. For your reference, here is the reversion PR: :)
#10072

@chunit-quic (Contributor, Author) commented Jan 26, 2022

Hi @rebel-shshin,
Pardon me for forgetting to confirm the details with you.
The following snapshot is what I get from the single-LSTM-layer result of test_lstm.py.

[screenshot: Relay IR with span filling (left) vs. without (right)]

On the left-hand side, with span filling, four more expressions pop out:
two more tuples (%36, %37) appear in the while loop, and a Nil (%44) followed by %45 = %39(0, %44, %states, %input), which is the LSTM body.

Is it the same as what you get? If not, would you mind sharing your model and conversion file with me? Thank you. :)

Mousius pushed a commit that referenced this pull request Jan 26, 2022
…10072)

Because of the failure of LSTM conversion from Pytorch
@rebel-shshin (Contributor) commented Jan 27, 2022

Hi @chunit-quic, thanks for the fast reaction.

My case is a bit different from yours. My network has a single LSTM layer with some dense layers before and after it. The model returns three outputs: the output, cell state, and hidden state of the LSTM. However, the converted Relay graph has two LSTM layers: the first produces the output, and the second produces the cell and hidden states. This is really weird because all of them should be generated from the same LSTM layer.

@chunit-quic (Contributor, Author)

Hi @rebel-shshin,

Thank you for the detailed information. I have found some clues but still need some time to pinpoint the problem precisely. I will try to make a test case similar to yours and give it a try; a sketch of such a case follows. :D
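As a starting point, a minimal repro sketch of the setup @rebel-shshin described might look like this. It is an assumption based on the description above, not the actual model, and the surrounding dense layers are omitted for brevity:

import torch
from tvm import relay

class LSTMModel(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.lstm = torch.nn.LSTM(input_size=8, hidden_size=16)

    def forward(self, x):
        # Return all three results, as in the reported model.
        output, (hidden, cell) = self.lstm(x)
        return output, cell, hidden

model = LSTMModel().eval()
inp = torch.randn(4, 1, 8)  # (seq_len, batch, input_size)
traced = torch.jit.trace(model, inp)
mod, params = relay.frontend.from_pytorch(traced, [("input", list(inp.shape))])
print(mod["main"])  # inspect whether the LSTM body is duplicated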

sunggg pushed a commit to sunggg/tvm that referenced this pull request Jan 29, 2022
…)" (apache#10072)

Because of the failure of LSTM conversion from Pytorch
ylc pushed a commit to ylc/tvm that referenced this pull request Feb 16, 2022
…)" (apache#10072)

Because of the failure of LSTM conversion from Pytorch
ghost pushed a commit to neo-ai/tvm that referenced this pull request Feb 21, 2022
…)" (apache#10072) (#246)

Because of the failure of LSTM conversion from Pytorch

Co-authored-by: Chun-I Tsai <quic_chunit@quicinc.com>