[Unity][MSC][Tracking Issue] Introduction to Multi-System Compiler #15233
Comments
TODO: add tests for M0.2 after M0.3
Discussion on translating relay to relax without loss of information: https://discuss.tvm.apache.org/t/msc-translate-relay-to-relax-without-loss-info/15650
I'm somewhat concerned about the While I agree with the need for an operator-level conversion from relay to relax, I think it should be done through extending the existing
@Lunderberg Sorry for the late reply... I've checked the failures; it seems the tril/triu methods have been changed. I'll fix them in later PRs. And the reasons for building a duplicate "relay -> relax" converter:
Thanks for watching!
@Archermmt No worries, and I've been slow responding as well. After thinking on it, I think my primary concern is in the method used for the Instead of generating a string to use the Python API, I think the MSC to Relax conversion should instead be done by directly calling the C++ APIs. This would expose any errors during the C++ compilation, rather than delaying them until runtime.
@Lunderberg Hmm... I've also thought about which method is better:
1. Convert in C++ to enable eager error detection;
2. Convert by string generation to enable independent loading.
Both have advantages and disadvantages.

The first method (let's call it the converter, in either C++ or Python), like relax.builder, can check and normalize each op while building the graph, but that limits the deployment possibilities. For example, if I need to compare results between an old version of TVM without relax and the new unity version (which may be a real task for me...), I have to spend a lot of time setting up environments and dumping test data with the converter solution. And MSC is designed not only for converting to relax, but also to torch/torch2, tensorflow/tf2, tensorrt, and so on. Considering that models are dispatched across different frameworks and environments, the converter may not be a good solution.

The second method (let's call it string generation), like the cutlass codegen, first generates strings and then processes them into a kernel/model/engine. That means the codegen process does no checking or normalization, which may lead to lazy error detection. However, the strings can be turned into script/C++ files and loaded in any environment. This method separates codegen from loading, which is essential for fast model release, especially on the cloud (where different environments and frameworks are used). And as mentioned in the RFC: https://discuss.tvm.apache.org/t/rfc-unity-msc-introduction-to-multi-system-compiler/15251

To partially solve the error detection problem, the codegen in MSC generates not only the model but also the unittest. Using the unittest, developers can locate and solve problems efficiently.

I think we can leave this part, i.e. enabling a C++ converter for MSC, as a todo. After the main target is reached, I'll consider building a converter, or maybe directly using relax as the core IR.
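To make the trade-off in this comment concrete, here is a minimal sketch of the string-generation approach in plain Python. It does not use any TVM or MSC API: the toy graph format and the names `generate_source`, `generate_test`, and `load_model` are hypothetical and exist only for illustration. Generation does no checking, so an unknown op only fails when the loaded model is called (lazy error detection), while a generated test string can be shipped and run alongside the model.

```python
# Hypothetical toy graph: (output_name, op_name, input_names). Not an MSC format.
TOY_GRAPH = [
    ("a", "add", ["x", "y"]),
    ("b", "mul", ["a", "x"]),
]

# Runtime helpers emitted into every generated source (illustrative only).
PRELUDE = """\
def add(a, b):
    return a + b

def mul(a, b):
    return a * b
"""


def generate_source(graph, inputs, output):
    """Emit Python source for the graph; no op checking happens here."""
    body = [f"    {name} = {op}({', '.join(args)})" for name, op, args in graph]
    lines = [PRELUDE, f"def model({', '.join(inputs)}):", *body, f"    return {output}"]
    return "\n".join(lines) + "\n"


def generate_test(input_values, expected):
    """Emit a one-line test that can be appended to the generated source."""
    args = ", ".join(repr(v) for v in input_values)
    return f"assert model({args}) == {expected!r}\n"


def load_model(source):
    """Load the generated source in a fresh namespace, independent of the generator."""
    namespace = {}
    exec(source, namespace)
    return namespace["model"]


source = generate_source(TOY_GRAPH, ["x", "y"], "b")
model = load_model(source)
result = model(2, 3)  # (2 + 3) * 2

# The generated test runs in whatever environment loads the source.
exec(source + generate_test((2, 3), result), {})
```

A graph containing an op missing from the prelude (for example `"sub"`) would still generate and load without complaint; the `NameError` appears only on the first call, which is the situation the generated unittest is meant to catch early.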
This is a pull request for the MSC (Multi-System Compiler) RFC: https://discuss.tvm.apache.org/t/rfc-unity-msc-introduction-to-multi-system-compiler/15251/5 Tracking issue: #15233 This PR changes the test workspace to a random workspace, which fixes the workspace-conflict bug.
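As a rough illustration of the random-workspace idea, the sketch below gives each test its own uniquely named temporary directory using only the Python standard library. The helper name `random_workspace` and the `msc_test_` prefix are assumptions made for this example; the PR's actual implementation may differ.

```python
import os
import shutil
import tempfile
from contextlib import contextmanager


@contextmanager
def random_workspace(prefix="msc_test_"):
    """Create a uniquely named directory and remove it when the test finishes."""
    path = tempfile.mkdtemp(prefix=prefix)  # unique name, so tests cannot collide
    try:
        yield path
    finally:
        shutil.rmtree(path, ignore_errors=True)


# Two tests running at once each get a separate directory.
with random_workspace() as ws_a, random_workspace() as ws_b:
    distinct = ws_a != ws_b
    both_exist = os.path.isdir(ws_a) and os.path.isdir(ws_b)
```

Because the directory name is chosen by `tempfile.mkdtemp`, concurrent test runs no longer write into one shared fixed path.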
cc @quic-sanirudh