Add Project API RFC #8

areusch · 2021-06-30T20:29:09Z

This PR adds an RFC for the Project API (embryonic discussion - m2 roadmap item #6`). Project API is a plugin-style infrastructure that allows TVM to integrate with a variety of platform-specific build systems. It is particularly useful to allow TVM to build and time operator implementations for non-traditional build platforms (e.g. embedded firmware) under the microTVM effort.

Draft PR: apache/tvm#8380

@mehrdadh @guberti @tqchen @jroesch @tkonolige @csullivan @leandron @u99127 @Mousius @giuseros @gromero @stoa @hogepodge

mehrdadh

@areusch thanks for the RFC! It's very informative.
Just added a minor comment.

mehrdadh · 2021-06-30T20:43:10Z

rfcs/0008-microtvm-project-api.md

+standard location in that directory). TVM communicates with the Project API Server using JSON-RPC
+over standard OS pipes.
+
+TVM supplies generated code to the Project API Server using [Model Library Format](0001-model-library-format.md).


This link is broken. Is there an RFC for model library format on the way?

oh thanks for catching--it's actually just docs in apache/tvm#8270 which should land soon

This link can be fixed now apache/tvm#8270 has landed 😸

sorry, but this needs to change again since the PR is merged :(

guberti

Really nice writeup! It might be good to explicitly state that a platform doesn't need to support all of ProjectAPIHandler's methods - if I understand correctly, which workflows are supported can depend on what makes sense for each platform.

areusch · 2021-07-01T00:16:08Z

@mdw-octoml would be great to get your feedback here as well

tkonolige · 2021-07-01T15:42:14Z

rfcs/0008-microtvm-project-api.md

+def default_module_loader(pre_load_function=None):
+    """Returns a default function that can be passed as module_loader to run_through_rpc.
+    Parameters
+    ----------
+    pre_load_function : Optional[Function[tvm.rpc.Session, tvm.runtime.Module]]
+        Invoked after a session is established and before the default code-loading RPC calls are
+        issued. Allows performing pre-upload actions, e.g. resetting the remote runtime environment.
+    Returns
+    -------
+    ModuleLoader :
+        A function that can be passed as module_loader to run_through_rpc.
+    """
+
+    @contextlib.contextmanager
+    def default_module_loader_mgr(remote_kwargs, build_result):
+        remote = request_remote(**remote_kwargs)
+        if pre_load_function is not None:
+            pre_load_function(remote, build_result)
+
+        remote.upload(build_result.filename)
+        try:
+            yield remote, remote.load_module(os.path.split(build_result.filename)[1])
+
+        finally:
+            # clean up remote files
+            remote.remove(build_result.filename)
+            remote.remove(os.path.splitext(build_result.filename)[0] + ".so")
+            remote.remove("")
+
+    return default_module_loader_mgr


Please update this definition to match apache/tvm#8363.

Can you address this?

oops i see. done.

tkonolige · 2021-07-01T15:43:30Z

rfcs/0008-microtvm-project-api.md

+# Drawbacks
+[drawbacks]: #drawbacks
+
+Why should we *not* do this?


I think you can find some drawbacks.

Maybe:

More code to maintain.

Extra work for people using embedded OSes

I don't think this extra work for those coming directly from an embedded background as they'll likely produce model files directly into their own embedded projects. For example, I wouldn't expect a current Zephyr user to use this flow over using tvmc and west directly due to their existing familiarity.

Though I agree, with the variety of embedded OSes available, this will likely mean TVM as a project has to support a number of OSes - whether directly in checked in project generators or indirectly by features required to support different OSes.

yeah I agree that end users may likely just consume MLF. I think that as a "getting started" flow, it may help to have automatic project generation. I expect this flow to be primarily beneficial in implementing autotuning or remote execution flows on various platforms.

I added some drawbacks

tkonolige · 2021-07-01T15:45:55Z

rfcs/0008-microtvm-project-api.md

+alphabet. Python provides standard support for these via the `base64` module, so the most compact
+encoding (`base85`) was chosen from those standards to encode binary data in the Project API.


TVM uses base64 for encoding data in the relay text format. Maybe we should be consistent.

that is also technically an easier modulo to compute, i think. would be open to this.

Lets just do base64 to be consistent

okay fine :)

tkonolige · 2021-07-01T15:47:03Z

rfcs/0008-microtvm-project-api.md

+[unresolved-questions]: #unresolved-questions
+
+1. Is anyone particularly opposed the RPC mechanism used here?
+2. Does this seem simple for downstream platforms to implement?


Do you have an example implementation?

a simple example is the subprocess-based microtvm test: https://github.com/areusch/incubator-tvm/tree/project-generator/src/runtime/crt/host

more complex is the zephyr implementation: https://github.com/areusch/incubator-tvm/tree/project-generator/apps/microtvm/zephyr/template_project

i'd recommend reading the first one if you're not familiar with zephyr; the second one is not a minimal example and includes code to communicate with qemu and physical boards

rfcs/0008-microtvm-project-api.md

Mousius · 2021-07-05T11:17:07Z

rfcs/0008-microtvm-project-api.md

+# Drawbacks
+[drawbacks]: #drawbacks
+
+Why should we *not* do this?


I don't think this extra work for those coming directly from an embedded background as they'll likely produce model files directly into their own embedded projects. For example, I wouldn't expect a current Zephyr user to use this flow over using tvmc and west directly due to their existing familiarity.

Though I agree, with the variety of embedded OSes available, this will likely mean TVM as a project has to support a number of OSes - whether directly in checked in project generators or indirectly by features required to support different OSes.

Mousius · 2021-07-05T11:56:06Z

rfcs/0008-microtvm-project-api.md

+but ultimately rejected because JSON-RPC can be implemented in a single Python file without adding
+the complexities of an IDL compiler.
+
+## Transport functions


I'd suggest this tightly coupling autotuning and prototyping to specific frameworks called directly from TVM is the main alternative to using the Project API, inclusive of RPC server and support frameworks?

Tightly coupling would be less code and potentially easier user journeys without the consideration of the additional server. Given we can mark dependencies to include as extras, are they a real concern for someone doing pip install tlcpack ?

the concern i have is that we don't want our TVM dependencies held back by some platform's need to depend on e.g. pyyaml. if e.g. TensorFlow introduces an install_requires of pyyaml > 3.0, but some mciro platform refuses to update past pyyaml 2.x, the platform is going to lose the battle and become unsupported. going to the API server avoids this battle and also allows for development of API servers outside the TVM tree.

It'd be good to have that expressed explicitly as part of the RFC, attempting to continue with a subset of RTOSes or trying to balance different dependencies is a valid alternative to building out the project API but the reasons you've articulated give a concrete example to clear that up 😸

gromero

I took a close look at the RFC and the Project API draft implementation.

Currently I can't comment much about the AutoTVM part.

On generate_project method, and considering TVMC, my only comment is that it should allow a project creation based also on MLF .tar, instead of only a "live" executor. As we've discussed I don't think it's necessary to have it now and I'll send a patch for it mostly in the lines as we've discussed it, since TVMC will need it.

On build method I just think it should be a way to force a rebuild in the same project dir (i.e. even if build dir exists)

On flash method alright, no comments.

On transport, please consider my comments inline, specially the bit a about the speed regression.

Also, if possible, please consider the comments I've posted to the draft code also (apache/tvm#8380)

Otherwise, looks great. Thanks for refactoring and improving it!

Finally, I confirm that the fixes for MAX_ defines are ok (no build error) and the per board configs are kicking in correctly.

Cheers,
Gustavo

gromero · 2021-07-05T20:09:00Z

rfcs/0008-microtvm-project-api.md

+# Unresolved questions
+[unresolved-questions]: #unresolved-questions
+
+1. Is anyone particularly opposed the RPC mechanism used here?


@areusch Hi Andrew. Sorry, I've posted a comment regarding it on the PR related to Project API code, so I'm posting it again here, since there is an item explicitly asking about feedback about it :)

So, I'd like to understand better the motivation for having the RPC mechanism since it seems to add an unnecessary complexity (to maintain, to debug, for instance). Would that be possible to have the exactly same interfaces but using merely a, let's say, Python module / class inclusion (like importing the microtvm_api_server.py at first from the template dir and then also later when the project dir is already created)?

I've just noticed what looks like a speed regression when talking to the serial port using the new Project API Transporter (I've shared the details on our #microtvm Discord channel). It looks like related to the limitations pointed out by you on read and write calls, regarding the timeouts. So I'm wondering if these limitations would simply go away if the RPC mechanism is removed, for instance. If so, that would weight in against the use of a RPC mechanism as-is. I'm guessing such a regression is not observable in test_zephyr.py because the models there are too simple.

I also think it's a bit convoluted the way that the client-server works now, like a client creating local pipes, executing the server and passing them to it so the client and server can talk like as a parent/child mechanism, afaics. But I also wonder if you have a more ambitious plan in the long term, like really decoupling the client/server (client runs in one host, and the server in yet another host, connected via UDP or TCP). In that case I see the value of implementing the RPC mechanism now.

The main motivation for the RPC mechanism is to allow project generators to use python dependencies which may not be present in TVM's dependency set. For example, it may be necessary to include pyserial to facilitate communication with µTVM RPC server to support autotuning, but this dependency has no business being in TVM's official set of dependencies (we can't set a precedent that every single micro platform's dependencies become TVM's dependencies).

so the RPC server allows the microtvm_api_server.py to live in a separate process, and that breaks the dependency chain.

I've just noticed what looks like a speed regression when talking to the serial port using the new Project API Transporter (I've shared the details on our #microtvm Discord channel)

This was a great catch with the PoC; it's resolved now. the main reason was that I changed the semantics of read and write but forgot to update the TVM-side code in src/runtime/micro/micro_session.cc.

I also think it's a bit convoluted the way that the client-server works now, like a client creating local pipes, executing the server and passing them to it so the client and server can talk like as a parent/child mechanism, afaics

your understanding is correct...the main issue with making this a network-aware RPC is that there are local paths included in the RPC interface now. I don't really intend to make this work over the network, but having a standard way of communicating rather than an ad-hoc one makes more sense to me. then there is a well-defined spec and it's clearer how bugs in the code should be handled.

The main motivation for the RPC mechanism is to allow project generators to use python dependencies which may not be present in TVM's dependency set

Absolutely, you have clarified it on the last microTVM Meet up 👍 Now I only don't recall if it's stated that clear in RFC. Thanks. makes sense.

This was a great catch with the PoC; it's resolved now. the main reason was that I changed the semantics of read and write but forgot to update the TVM-side code in src/runtime/micro/micro_session.cc.

Right, that issue is by now resolved. Thanks for fixing it :)

your understanding is correct...the main issue with making this a network-aware RPC is that there are local paths included in the RPC interface now.

right, I was scratching my head about it : )

I don't really intend to make this work over the network, but having a standard way of communicating rather than an ad-hoc one makes more sense to me. then there is a well-defined spec and it's clearer how bugs in the code should be handled.

Got it. Yeah, plus the current main reason to use an RPC interface here is well state. ok.

areusch · 2021-07-26T19:38:38Z

Thanks @Mousius @gromero for your comments. updated the RFC to reflect accurately on PoC now that it seems to be passing regression. Please take another look so we can merge this and proceed forward with Project API.

@gromero some follow-ups on your comments:

On generate_project method, and considering TVMC, my only comment is that it should allow a project creation based also on MLF .tar, instead of only a "live" executor.

I agree; let's merge additional logic in project.py to do this as follow-on to this impl.

On build method I just think it should be a way to force a rebuild in the same project dir (i.e. even if build dir exists)

I believe that is the functionality now. Did you see something different?

On transport, please consider my comments inline, specially the bit a about the speed regression.

Replied to your comments; I believe the speed issue is not a problem now.

Also, if possible, please consider the comments I've posted to the draft code also (apache/tvm#8380)

Will take a look at these now that PoC is passing

Finally, I confirm that the fixes for MAX_ defines are ok (no build error) and the per board configs are kicking in correctly.

I reworked the way these are done since posting the PoC--now prj.conf is generated from microtvm_api_server.py so we can have a single combined template_project and the prj.conf can be situation-specific. Let me know what you think.

gromero · 2021-07-27T21:20:12Z

On generate_project method, and considering TVMC, my only comment is that it should allow a project creation based also on MLF .tar, instead of only a "live" executor.

I agree; let's merge additional logic in project.py to do this as follow-on to this impl.

Sure! I'll submit a follow up patch I've been using to test this patchset with TVMC. My comment was also more for the records for others following the discussion since we've already discussed the details about.

On build method I just think it should be a way to force a rebuild in the same project dir (i.e. even if build dir exists)

I believe that is the functionality now. Did you see something different?

Yes, a functionality detail. Currently I'm handling that at TVMC side, like if the build/ dir exists it means tvmc micro build was run before so it informs the user and asks to confirm a re-build using a --force flag. In that case, if --force is used TVMC will remove the build/ and let Project API recreate it. But I was just wondering if you would like to change that behavior. I have no preference here. What do you think?

Also, if possible, please consider the comments I've posted to the draft code also (apache/tvm#8380)

Will take a look at these now that PoC is passing

OK. I think you answered all my initial comments there. I'll just post a couple more related to that new round, i.e. related to last commit pushed: apache/tvm@de75022 I have no more comments regarding the RFC itself - I'm happy with it, so I'll just post some comments about the code. I think it's pretty close to land :)

Finally, I confirm that the fixes for MAX_ defines are ok (no build error) and the per board configs are kicking in correctly.

I reworked the way these are done since posting the PoC--now prj.conf is generated from microtvm_api_server.py so we can have a single combined template_project and the prj.conf can be situation-specific. Let me know what you think.

Yeah, I'm not much inclined to use that approach of using a Python code to generate the prj.conf because for adding new bits to the config fil will imply changing a Python code. That seems a bit strange to me. I thought of having a prj.conf per project_type but there is also other things to being handled like the stack configuration based on running the model on QEMU or a physical board, so I'm not sure, maybe we can go ahead with that approach and see it goes overtime?

Regarding the MAX_ defines, even with the new approach generating the prj.conf the ~~crt/crt_config.h~~ crt_config/crt_config.h is used and so the MAX_ defines are used anyways? Or I missed something?

gromero

LTGM.

areusch · 2021-07-27T22:36:34Z

@Mousius please take another look when you can and explicitly approve!

areusch · 2021-07-27T23:14:56Z

@tkonolige @guberti @mehrdadh please take another look and explicitly approve if you're good w/ this

Mousius · 2021-07-28T18:50:02Z

rfcs/0008-microtvm-project-api.md

+standard location in that directory). TVM communicates with the Project API Server using JSON-RPC
+over standard OS pipes.
+
+TVM supplies generated code to the Project API Server using [Model Library Format](0001-model-library-format.md).


This link can be fixed now apache/tvm#8270 has landed 😸

Mousius · 2021-07-28T18:52:47Z

rfcs/0008-microtvm-project-api.md

@@ -0,0 +1,543 @@
+- Feature Name: microtvm_project_api
+- Start Date: 2020-06-09
+- RFC PR: [apache/tvm-rfcs#0000](https://github.com/apache/tvm-rfcs/pull/0000)


This needs to be 0008? And I don't think there's an associated issue for this?

rfcs/0008-microtvm-project-api.md

Mousius · 2021-07-28T19:05:00Z

rfcs/0008-microtvm-project-api.md

+but ultimately rejected because JSON-RPC can be implemented in a single Python file without adding
+the complexities of an IDL compiler.
+
+## Transport functions


It'd be good to have that expressed explicitly as part of the RFC, attempting to continue with a subset of RTOSes or trying to balance different dependencies is a valid alternative to building out the project API but the reasons you've articulated give a concrete example to clear that up 😸

areusch · 2021-07-29T21:52:03Z

@Mousius @tqchen please take a look

tqchen · 2021-07-29T23:02:06Z

@Mousius @mehrdadh @tkonolige @guberti please take another look and https://tvm.apache.org/docs/contribute/code_review.html#approve-and-request-changes-explicitly

tkonolige · 2021-07-29T23:24:28Z

rfcs/0008-microtvm-project-api.md

+def default_module_loader(pre_load_function=None):
+    """Returns a default function that can be passed as module_loader to run_through_rpc.
+    Parameters
+    ----------
+    pre_load_function : Optional[Function[tvm.rpc.Session, tvm.runtime.Module]]
+        Invoked after a session is established and before the default code-loading RPC calls are
+        issued. Allows performing pre-upload actions, e.g. resetting the remote runtime environment.
+    Returns
+    -------
+    ModuleLoader :
+        A function that can be passed as module_loader to run_through_rpc.
+    """
+
+    @contextlib.contextmanager
+    def default_module_loader_mgr(remote_kwargs, build_result):
+        remote = request_remote(**remote_kwargs)
+        if pre_load_function is not None:
+            pre_load_function(remote, build_result)
+
+        remote.upload(build_result.filename)
+        try:
+            yield remote, remote.load_module(os.path.split(build_result.filename)[1])
+
+        finally:
+            # clean up remote files
+            remote.remove(build_result.filename)
+            remote.remove(os.path.splitext(build_result.filename)[0] + ".so")
+            remote.remove("")
+
+    return default_module_loader_mgr


Can you address this?

tkonolige · 2021-07-29T23:25:36Z

rfcs/0008-microtvm-project-api.md

+alphabet. Python provides standard support for these via the `base64` module, so the most compact
+encoding (`base85`) was chosen from those standards to encode binary data in the Project API.


Lets just do base64 to be consistent

mehrdadh

Please change the URL since the other PR is merged now, otherwise LGTM.
Thanks!

mehrdadh · 2021-07-29T23:44:55Z

rfcs/0008-microtvm-project-api.md

+standard location in that directory). TVM communicates with the Project API Server using JSON-RPC
+over standard OS pipes.
+
+TVM supplies generated code to the Project API Server using [Model Library Format](0001-model-library-format.md).


sorry, but this needs to change again since the PR is merged :(

guberti

This fits my Arduino use case perfectly. LGTM!

Mousius

Thanks for the updates 😸

tkonolige

LGTM, thanks @areusch!

Add Project API RFC

238837c

mehrdadh requested changes Jun 30, 2021

View reviewed changes

guberti reviewed Jun 30, 2021

View reviewed changes

areusch mentioned this pull request Jul 1, 2021

[microTVM] Project API infrastructure apache/tvm#8380

Merged

Revise Python interface (forgot to update this before posting)

a209a95

tkonolige reviewed Jul 1, 2021

View reviewed changes

areusch commented Jul 4, 2021

View reviewed changes

rfcs/0008-microtvm-project-api.md Show resolved Hide resolved

Mousius reviewed Jul 5, 2021

View reviewed changes

gromero reviewed Jul 5, 2021

View reviewed changes

tqchen added the status: need update RFC needs update based on feedback label Jul 24, 2021

Address Mousius comments

8fb2021

gromero approved these changes Jul 27, 2021

View reviewed changes

areusch added status: need review RFC needs review and removed status: need update RFC needs update based on feedback labels Jul 27, 2021

add drawbacks to address TK comments

5a9962e

areusch mentioned this pull request Jul 28, 2021

Update the guideline of RFC tracking issues #13

Merged

Mousius requested changes Jul 28, 2021

View reviewed changes

areusch mentioned this pull request Jul 29, 2021

[RFC] Tracking bug for RFC 0008: microTVM Project API apache/tvm#8595

Closed

address Mousius comments

a4f66e8

tkonolige requested changes Jul 29, 2021

View reviewed changes

mehrdadh approved these changes Jul 29, 2021

View reviewed changes

guberti approved these changes Jul 29, 2021

View reviewed changes

Mousius approved these changes Jul 30, 2021

View reviewed changes

areusch added 2 commits July 30, 2021 14:05

address tkonolige comments

6d6af6b

switch to base64

402e848

tkonolige approved these changes Jul 30, 2021

View reviewed changes

tqchen approved these changes Aug 3, 2021

View reviewed changes

tqchen merged commit 2f94b4c into apache:main Aug 4, 2021

		alphabet. Python provides standard support for these via the `base64` module, so the most compact
		encoding (`base85`) was chosen from those standards to encode binary data in the Project API.

Add Project API RFC #8

Add Project API RFC #8

Conversation

areusch commented Jun 30, 2021 • edited Loading

mehrdadh left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

guberti left a comment

Choose a reason for hiding this comment

areusch commented Jul 1, 2021

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

areusch Jul 26, 2021 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

gromero left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

gromero Jul 27, 2021 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

areusch commented Jul 26, 2021

gromero commented Jul 27, 2021 • edited Loading

gromero left a comment

Choose a reason for hiding this comment

areusch commented Jul 27, 2021

areusch commented Jul 27, 2021

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

areusch commented Jul 29, 2021

tqchen commented Jul 29, 2021

Choose a reason for hiding this comment

Choose a reason for hiding this comment

mehrdadh left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

guberti left a comment

Choose a reason for hiding this comment

Mousius left a comment

Choose a reason for hiding this comment

tkonolige left a comment

Choose a reason for hiding this comment

areusch commented Jun 30, 2021 •

edited

Loading

areusch Jul 26, 2021 •

edited

Loading

gromero Jul 27, 2021 •

edited

Loading

gromero commented Jul 27, 2021 •

edited

Loading