[ANSOR][AUTOTVM] Combine Ansor and AutoTVM to Improve Scheduling #16499
base: main
Conversation
I'm not a main developer, but I have a few suggestions to improve code quality:
- Try not to mix the two string-formatting styles, f-strings and %-formatting. I think we should prefer f-strings, since they are more robust and current.
- Try to avoid one-letter variable names.
- I'm not sure whether your PR lends itself to tests, but it would be great to have some unit tests for your code.
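As a side note, the two formatting styles this comment refers to look like this; the variable names here are purely illustrative, not taken from the PR:

```python
name = "Ansor"
trials = 1000

# %-formatting (older style): placeholders and arguments are kept apart,
# so it is easy to get the order or the number of arguments wrong.
msg_old = "Tuning %s with %d trials" % (name, trials)

# f-string (preferred): the expression sits inside the string itself,
# which is easier to read and is checked at parse time.
msg_new = f"Tuning {name} with {trials} trials"

assert msg_old == msg_new  # both produce "Tuning Ansor with 1000 trials"
```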
Thank you @pfk-beta! All comments are very welcome. I'll improve my code.
Thanks for the contribution. I just want to bring in some of the context from https://discuss.tvm.apache.org/t/discuss-tvm-core-strategy-for-operator-scheduling-and-tuning/16352 — I would love to see how we can leverage some of the techniques in MetaSchedule and TensorIR in the future.
Thanks @tqchen! We already have a plan to work with MetaSchedule. I hope to bring contributions in the near future.
The branch was force-pushed from e3ac901 to 5572dc4.
@pfk-beta Could you review my code? Thanks!
In general:
- I have commented on only one instance of each problem (e.g. a one-letter variable), but a given problem may appear more than once. It is sometimes annoying (for both reviewer and author) to mark the same problem multiple times.
- There are many levels of review (important vs. less important, Pythonic vs. not Pythonic, readable vs. not readable). I just picked the problems that were most obvious and simplest for me.
- What I spotted is that you are mixing two styles, e.g. the `with` statement and manual open/close without `with`, or %-formatting and f-strings.
- One-letter variables.
@pfk-beta Thanks for the review! I applied modifications to each point you commented on. Could you see if further modifications need to be made?
In general, it looks very good. To me it is almost there — about 95% :)
@pfk-beta Thanks for the review! Could you see if further modifications need to be made?
@canesche Thanks for your effort. LGTM :)
Hi @pfk-beta, I'm not that familiar with the whole PR process, but I think you forgot to approve my PR. Could you take a look?
Hi! I'd like to share some updates on the experiments conducted for this pull request. We've added performance data for an RTX 3080 to the existing dataset in our report. The report now covers four hardware configurations: AMD x86-64 R7, ARM aarch64 A64FX, Nvidia A100, and Nvidia RTX 3080. Across all these scenarios, reducing the number of trials for Ansor while using Droplet Search to exploit the best results tends to outperform Ansor with 10,000 trials per model, considering both search time and model quality. We've also conducted a study on the impact of model size on the combination of Ansor and AutoTVM's Droplet Search; that's Section 3.3 of the manuscript. Here are our conclusions:
Description
This pull request aims to enhance model optimization by combining parts of Ansor and AutoTVM. The proposed approach involves the following steps:
Execution of Ansor over an end-to-end model that requires optimization.
Selection of the best implementation identified by Ansor for the given model.
Utilization of AutoTVM's Droplet Search to exploit the selected candidate.
By integrating Ansor with AutoTVM's Droplet Search (droplet paper), we anticipate a reduction in the number of trials explored by Ansor while still achieving faster kernel performance. Our experimentation has demonstrated significant improvements in kernel speed with reduced search times across various architectures, including Nvidia A100, Nvidia 3080, AMD x86, and ARM A64FX. The results can be found in this report: bennu paper
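The exploitation phase described above can be sketched, in spirit, as a greedy coordinate-descent local search seeded with Ansor's best candidate. The toy implementation below is an illustration of that idea only — the cost function, parameter bounds, and function name are hypothetical, not the actual TVM/Droplet Search code:

```python
def droplet_search(cost, seed, lower, upper, max_iters=100):
    """Greedy coordinate-descent local search: starting from a seed
    configuration, repeatedly move to the best neighboring point
    (one parameter changed by +/-1) until no neighbor improves."""
    best = list(seed)
    best_cost = cost(best)
    for _ in range(max_iters):
        improved = False
        for dim in range(len(best)):
            for step in (-1, 1):
                cand = list(best)
                cand[dim] += step
                if not (lower[dim] <= cand[dim] <= upper[dim]):
                    continue  # stay inside the search space
                c = cost(cand)
                if c < best_cost:
                    best, best_cost = cand, c
                    improved = True
        if not improved:
            break  # local minimum reached
    return best, best_cost

# Hypothetical "latency" surrogate with its minimum at (3, -2).
cost = lambda p: (p[0] - 3) ** 2 + (p[1] + 2) ** 2
best, best_cost = droplet_search(cost, seed=[0, 0],
                                 lower=[-10, -10], upper=[10, 10])
# Starting from the seed, the search converges to [3, -2] with cost 0.
```

In the PR's pipeline, the seed would be the best schedule found by Ansor, and the cost function would be an actual on-device measurement rather than an analytic surrogate.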
Proposed Changes
Integration of Ansor and Droplet Search methodologies.
Utilization of Droplet Search to exploit the best candidates identified by Ansor.
Motivation
The motivation behind this pull request is to streamline the model optimization process by leveraging the complementary strengths of Ansor and Droplet Search. By combining these techniques, we aim to enhance the efficiency and effectiveness of kernel search and optimization, ultimately improving overall model performance across different hardware architectures.
Testing and Validation
Extensive testing has been conducted to validate the efficacy and performance improvements achieved through the integration of Ansor and Droplet Search. Benchmarking tests have been performed across Nvidia A100, AMD x86, and ARM A64FX architectures to assess the impact on kernel speed and search-time reduction, compared with running Ansor for 10,000 trials. These results are available in Section 3 of this manuscript: bennu paper
Additional Notes
This pull request builds upon prior research and experimentation in model optimization. The proposed approach improves end-to-end models across diverse hardware platforms while still reducing Ansor's search time. We welcome the community’s feedback, suggestions, and contributions to further refine and enhance these methodologies.
Thank you.
Sincerely,
Michael Canesche, Gaurav Verma, and Fernando Pereira