Add Project API RFC #8
@areusch thanks for the RFC! It's very informative.
Just added a minor comment.
rfcs/0008-microtvm-project-api.md
Outdated
standard location in that directory). TVM communicates with the Project API Server using JSON-RPC
over standard OS pipes.

TVM supplies generated code to the Project API Server using [Model Library Format](0001-model-library-format.md).
This link is broken. Is there an RFC for model library format on the way?
oh thanks for catching--it's actually just docs in apache/tvm#8270 which should land soon
This link can be fixed now apache/tvm#8270 has landed 😸
sorry, but this needs to change again since the PR is merged :(
Really nice writeup! It might be good to explicitly state that a platform doesn't need to support all of `ProjectAPIHandler`'s methods - if I understand correctly, which workflows are supported can depend on what makes sense for each platform.
@mdw-octoml would be great to get your feedback here as well
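To make the point about partial support concrete, here is a minimal sketch of a platform that implements only some of the workflow methods. The classes `SketchHandler` and `HostOnlyHandler` are hypothetical stand-ins and are not TVM's actual `ProjectAPIHandler`; only the method names `generate_project`, `build` and `flash` come from this discussion, and their signatures here are assumptions.

```python
class UnsupportedMethodError(Exception):
    """Raised when a platform does not implement a workflow step."""


class SketchHandler:
    """Hypothetical base class: every method is unsupported by default."""

    def generate_project(self, project_dir):
        raise UnsupportedMethodError("generate_project")

    def build(self, project_dir):
        raise UnsupportedMethodError("build")

    def flash(self, project_dir):
        raise UnsupportedMethodError("flash")


class HostOnlyHandler(SketchHandler):
    """Hypothetical platform that runs on the host, so it has nothing to flash."""

    def generate_project(self, project_dir):
        return f"generated {project_dir}"

    def build(self, project_dir):
        return f"built {project_dir}"


def supported_methods(handler):
    """Return the names of the methods that the handler's class overrides."""
    names = ("generate_project", "build", "flash")
    return [
        name
        for name in names
        if getattr(type(handler), name) is not getattr(SketchHandler, name)
    ]
```

With this shape, a caller could inspect `supported_methods(HostOnlyHandler())` before choosing a workflow, rather than discovering a missing step through a failed call.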
rfcs/0008-microtvm-project-api.md
Outdated
def default_module_loader(pre_load_function=None):
    """Returns a default function that can be passed as module_loader to run_through_rpc.

    Parameters
    ----------
    pre_load_function : Optional[Function[tvm.rpc.Session, tvm.runtime.Module]]
        Invoked after a session is established and before the default code-loading RPC calls are
        issued. Allows performing pre-upload actions, e.g. resetting the remote runtime environment.

    Returns
    -------
    ModuleLoader :
        A function that can be passed as module_loader to run_through_rpc.
    """

    @contextlib.contextmanager
    def default_module_loader_mgr(remote_kwargs, build_result):
        remote = request_remote(**remote_kwargs)
        if pre_load_function is not None:
            pre_load_function(remote, build_result)

        remote.upload(build_result.filename)
        try:
            yield remote, remote.load_module(os.path.split(build_result.filename)[1])

        finally:
            # clean up remote files
            remote.remove(build_result.filename)
            remote.remove(os.path.splitext(build_result.filename)[0] + ".so")
            remote.remove("")

    return default_module_loader_mgr
Please update this definition to match apache/tvm#8363.
Can you address this?
oops i see. done.
rfcs/0008-microtvm-project-api.md
Outdated
# Drawbacks
[drawbacks]: #drawbacks

Why should we *not* do this?
I think you can find some drawbacks.
Maybe:
- More code to maintain.
- Extra work for people using embedded OSes
I don't think this is extra work for those coming directly from an embedded background, as they'll likely produce model files directly into their own embedded projects. For example, I wouldn't expect a current Zephyr user to use this flow over using `tvmc` and `west` directly, due to their existing familiarity.
Though I agree, with the variety of embedded OSes available, this will likely mean TVM as a project has to support a number of OSes - whether directly in checked-in project generators or indirectly by features required to support different OSes.
yeah I agree that end users may likely just consume MLF. I think that as a "getting started" flow, it may help to have automatic project generation. I expect this flow to be primarily beneficial in implementing autotuning or remote execution flows on various platforms.
I added some drawbacks
rfcs/0008-microtvm-project-api.md
Outdated
alphabet. Python provides standard support for these via the `base64` module, so the most compact
encoding (`base85`) was chosen from those standards to encode binary data in the Project API.
TVM uses base64 for encoding data in the relay text format. Maybe we should be consistent.
that is also technically an easier modulo to compute, i think. would be open to this.
Let's just do base64 to be consistent
okay fine :)
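For readers weighing the two encodings discussed in this thread, the following is a small sketch using only Python's standard `base64` and `json` modules. The 300-byte payload and the `"data"` key are arbitrary choices for illustration, not part of the Project API.

```python
import base64
import json

# 300 bytes of arbitrary binary data (illustrative only).
payload = bytes(range(256)) + bytes(44)

b64_text = base64.b64encode(payload).decode("ascii")
b85_text = base64.b85encode(payload).decode("ascii")

# base64 emits 4 characters per 3 input bytes; base85 emits 5 per 4.
print(len(payload), len(b64_text), len(b85_text))

# Either encoding yields plain ASCII that can be carried inside a JSON string.
# The "data" key is a made-up field name.
message = json.dumps({"data": b64_text})
decoded = base64.b64decode(json.loads(message)["data"])
```

For this payload base64 produces 400 characters and base85 produces 375, so the size advantage of base85 is modest next to the benefit of matching the encoding TVM already uses elsewhere.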
[unresolved-questions]: #unresolved-questions

1. Is anyone particularly opposed the RPC mechanism used here?
2. Does this seem simple for downstream platforms to implement?
Do you have an example implementation?
a simple example is the subprocess-based microtvm test: https://github.com/areusch/incubator-tvm/tree/project-generator/src/runtime/crt/host
more complex is the zephyr implementation: https://github.com/areusch/incubator-tvm/tree/project-generator/apps/microtvm/zephyr/template_project
i'd recommend reading the first one if you're not familiar with zephyr; the second one is not a minimal example and includes code to communicate with qemu and physical boards
but ultimately rejected because JSON-RPC can be implemented in a single Python file without adding
the complexities of an IDL compiler.

## Transport functions
I'd suggest that tightly coupling autotuning and prototyping to specific frameworks called directly from TVM is the main alternative to using the Project API, inclusive of the RPC server and support frameworks?
Tight coupling would mean less code and potentially easier user journeys, without having to consider the additional server. Given we can mark dependencies to include as extras, are they a real concern for someone doing `pip install tlcpack`?
the concern i have is that we don't want our TVM dependencies held back by some platform's need to depend on e.g. pyyaml. if e.g. TensorFlow introduces an `install_requires` of `pyyaml > 3.0`, but some micro platform refuses to update past `pyyaml 2.x`, the platform is going to lose the battle and become unsupported. going to the API server avoids this battle and also allows for development of API servers outside the TVM tree.
It'd be good to have that expressed explicitly as part of the RFC. Attempting to continue with a subset of RTOSes or trying to balance different dependencies is a valid alternative to building out the Project API, but the reasons you've articulated give a concrete example to clear that up 😸
done!
I took a close look at the RFC and the Project API draft implementation.
Currently I can't comment much about the AutoTVM part.
On the `generate_project` method, and considering TVMC, my only comment is that it should also allow project creation based on an MLF .tar, instead of only a "live" executor. As we've discussed, I don't think it's necessary to have it now, and I'll send a patch for it mostly along the lines we discussed, since TVMC will need it.
On the `build` method, I just think there should be a way to force a rebuild in the same project dir (i.e. even if the `build` dir exists).
On the `flash` method: alright, no comments.
On `transport`, please consider my inline comments, especially the bit about the speed regression.
Also, if possible, please consider the comments I've posted on the draft code as well (apache/tvm#8380).
Otherwise, looks great. Thanks for refactoring and improving it!
Finally, I confirm that the fixes for MAX_ defines are ok (no build error) and the per board configs are kicking in correctly.
Cheers,
Gustavo
# Unresolved questions
[unresolved-questions]: #unresolved-questions

1. Is anyone particularly opposed the RPC mechanism used here?
@areusch Hi Andrew. Sorry, I've posted a comment regarding it on the PR related to Project API code, so I'm posting it again here, since there is an item explicitly asking about feedback about it :)
So, I'd like to better understand the motivation for having the RPC mechanism, since it seems to add unnecessary complexity (to maintain and to debug, for instance). Would it be possible to have exactly the same interfaces but using merely, let's say, a Python module / class inclusion (like importing `microtvm_api_server.py` at first from the template dir and then also later when the project dir is already created)?
I've just noticed what looks like a speed regression when talking to the serial port using the new Project API Transporter (I've shared the details on our #microtvm Discord channel). It looks related to the limitations you pointed out on the `read` and `write` calls, regarding the timeouts. So I'm wondering if these limitations would simply go away if the RPC mechanism were removed, for instance. If so, that would weigh against the use of an RPC mechanism as-is. I'm guessing such a regression is not observable in `test_zephyr.py` because the models there are too simple.
I also think it's a bit convoluted the way that the client-server works now, like a client creating local pipes, executing the server and passing them to it so the client and server can talk like as a parent/child mechanism, afaics. But I also wonder if you have a more ambitious plan in the long term, like really decoupling the client/server (client runs in one host, and the server in yet another host, connected via UDP or TCP). In that case I see the value of implementing the RPC mechanism now.
The main motivation for the RPC mechanism is to allow project generators to use python dependencies which may not be present in TVM's dependency set. For example, it may be necessary to include `pyserial` to facilitate communication with the µTVM RPC server to support autotuning, but this dependency has no business being in TVM's official set of dependencies (we can't set a precedent that every single micro platform's dependencies become TVM's dependencies).
so the RPC server allows the `microtvm_api_server.py` to live in a separate process, and that breaks the dependency chain.

> I've just noticed what looks like a speed regression when talking to the serial port using the new Project API Transporter (I've shared the details on our #microtvm Discord channel)

This was a great catch with the PoC; it's resolved now. the main reason was that I changed the semantics of `read` and `write` but forgot to update the TVM-side code in `src/runtime/micro/micro_session.cc`.

> I also think it's a bit convoluted the way that the client-server works now, like a client creating local pipes, executing the server and passing them to it so the client and server can talk like as a parent/child mechanism, afaics

your understanding is correct...the main issue with making this a network-aware RPC is that there are local paths included in the RPC interface now. I don't really intend to make this work over the network, but having a standard way of communicating rather than an ad-hoc one makes more sense to me. then there is a well-defined spec and it's clearer how bugs in the code should be handled.
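The parent/child arrangement described above can be sketched roughly as follows. This is not the Project API implementation: the `ping` method, the newline-delimited framing, and the use of the child's stdin/stdout as the pipes are all assumptions made to keep the example short. Only the JSON-RPC 2.0 field names and the `-32601` "Method not found" code come from the JSON-RPC specification.

```python
import json
import subprocess
import sys

# Hypothetical server: reads one JSON-RPC request per line from stdin and
# writes one response per line to stdout. The "ping" method is made up.
SERVER_SOURCE = r'''
import json, sys
for line in sys.stdin:
    request = json.loads(line)
    if request["method"] == "ping":
        reply = {"jsonrpc": "2.0", "id": request["id"], "result": "pong"}
    else:
        reply = {"jsonrpc": "2.0", "id": request["id"],
                 "error": {"code": -32601, "message": "Method not found"}}
    sys.stdout.write(json.dumps(reply) + "\n")
    sys.stdout.flush()
'''


def call(server, method, params, request_id):
    """Send one request down the pipe and block until the reply line arrives."""
    request = {"jsonrpc": "2.0", "method": method, "params": params, "id": request_id}
    server.stdin.write(json.dumps(request) + "\n")
    server.stdin.flush()
    return json.loads(server.stdout.readline())


# The client launches the server as a child process. The child may import any
# packages it needs without them becoming dependencies of the parent process.
server = subprocess.Popen(
    [sys.executable, "-c", SERVER_SOURCE],
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    text=True,
)
ping_reply = call(server, "ping", {}, 1)
unknown_reply = call(server, "no_such_method", {}, 2)

# Closing stdin ends the server's read loop, so the child exits cleanly.
server.stdin.close()
server.wait()
server.stdout.close()
```

Because the two processes share nothing but the pipe, a package such as `pyserial` would only need to be installed for the server side, which is the dependency isolation argued for in this thread.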
> The main motivation for the RPC mechanism is to allow project generators to use python dependencies which may not be present in TVM's dependency set

Absolutely, you clarified it at the last microTVM Meetup 👍 Now I just don't recall whether it's stated that clearly in the RFC. Thanks, makes sense.
> This was a great catch with the PoC; it's resolved now. the main reason was that I changed the semantics of `read` and `write` but forgot to update the TVM-side code in `src/runtime/micro/micro_session.cc`.

Right, that issue is by now resolved. Thanks for fixing it :)
> your understanding is correct...the main issue with making this a network-aware RPC is that there are local paths included in the RPC interface now.

right, I was scratching my head about it : )

> I don't really intend to make this work over the network, but having a standard way of communicating rather than an ad-hoc one makes more sense to me. then there is a well-defined spec and it's clearer how bugs in the code should be handled.

Got it. Yeah, plus the current main reason to use an RPC interface here is well stated. OK.
Thanks @Mousius @gromero for your comments. updated the RFC to reflect accurately on PoC now that it seems to be passing regression. Please take another look so we can merge this and proceed forward with Project API. @gromero some follow-ups on your comments:
I agree; let's merge additional logic in
I believe that is the functionality now. Did you see something different?
Replied to your comments; I believe the speed issue is not a problem now.
Will take a look at these now that PoC is passing
I reworked the way these are done since posting the PoC--now
Sure! I'll submit a follow up patch I've been using to test this patchset with TVMC. My comment was also more for the records for others following the discussion since we've already discussed the details about.
Yes, a functionality detail. Currently I'm handling that at TVMC side, like if the
OK. I think you answered all my initial comments there. I'll just post a couple more related to that new round, i.e. related to last commit pushed: apache/tvm@de75022 I have no more comments regarding the RFC itself - I'm happy with it, so I'll just post some comments about the code. I think it's pretty close to land :)
Yeah, I'm not much inclined to use that approach of using a Python code to generate the Regarding the
LGTM.
@Mousius please take another look when you can and explicitly approve!
@tkonolige @guberti @mehrdadh please take another look and explicitly approve if you're good w/ this
rfcs/0008-microtvm-project-api.md
Outdated
@@ -0,0 +1,543 @@
- Feature Name: microtvm_project_api
- Start Date: 2020-06-09
- RFC PR: [apache/tvm-rfcs#0000](https://github.com/apache/tvm-rfcs/pull/0000)
This needs to be 0008? And I don't think there's an associated issue for this?
done
@Mousius @mehrdadh @tkonolige @guberti please take another look and approve explicitly (see https://tvm.apache.org/docs/contribute/code_review.html#approve-and-request-changes-explicitly)
Please change the URL since the other PR is merged now, otherwise LGTM.
Thanks!
This fits my Arduino use case perfectly. LGTM!
Thanks for the updates 😸
LGTM, thanks @areusch!
This PR adds an RFC for the Project API (embryonic discussion - m2 roadmap item #6). Project API is a plugin-style infrastructure that allows TVM to integrate with a variety of platform-specific build systems. It is particularly useful to allow TVM to build and time operator implementations for non-traditional build platforms (e.g. embedded firmware) under the microTVM effort.
Draft PR: apache/tvm#8380
@mehrdadh @guberti @tqchen @jroesch @tkonolige @csullivan @leandron @u99127 @Mousius @giuseros @gromero @stoa @hogepodge