[MetaSchedule][M3c] XGB-based Cost Model #9859
Conversation
Thanks for clarification and that makes lots of sense. LGTM.
One potential issue Ansor faced before is that as the training data grows, the time to train the XGBoost cost model becomes tediously long even though the accuracy no longer improves. What Ansor does is simply reduce the re-training frequency (e.g., re-train every 2 rounds) once the training data size exceeds a threshold. Beyond that, we could also compare the predicted costs against newly measured latencies to decide whether to re-train the model in the next round. These are just my two cents and we could probably revisit this issue in the future.
@comaniac Thanks for the extremely valuable feedback!
That's exactly what I'm observing too! In this particular case, the XGB hyper-parameters might no longer be suitable, which limits the model capacity, and we might have to tweak them to find the best settings.
This is how Ansor deals with it right now. We might consider better heuristics in the future, including switching models, tuning model capacity with AutoML tooling, etc.
Using our current interface, this is pretty simple to do. Anyway, I think we are pretty aligned on the methodology and the path to improvement. Let's work together to improve it in the future!
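For concreteness, here is a minimal sketch of the re-training heuristics discussed above (throttle re-training once the dataset is large, and optionally skip it while the model still predicts well). It is illustrative only; the function and parameter names are hypothetical and not part of this PR or the meta schedule API.

```python
# Hypothetical sketch of the adaptive re-training heuristic discussed above.
import numpy as np

def should_retrain(
    round_idx: int,
    num_samples: int,
    predicted: np.ndarray,   # costs predicted for the latest batch
    measured: np.ndarray,    # latencies actually measured for that batch
    size_threshold: int = 50_000,
    retrain_every: int = 2,
    max_rel_error: float = 0.15,
) -> bool:
    """Decide whether to re-train the cost model this round."""
    # Small dataset: re-training is cheap, always do it.
    if num_samples < size_threshold:
        return True
    # Large dataset: only re-train every `retrain_every` rounds ...
    if round_idx % retrain_every != 0:
        return False
    # ... and even then, skip if the model is still accurate enough.
    rel_error = np.mean(np.abs(predicted - measured) / np.maximum(measured, 1e-9))
    return rel_error > max_rel_error
```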
LGTM.
* [MetaSchedule] XGB-based Cost Model
* Fix lint
* fix doc
* fix mypy
This PR is part of stage M3c of the meta schedule project (#8473).
The architecture was re-designed by Junru and Xiyou. In this PR we introduce an XGB-based cost model built on top of meta schedule's cost model interface. Unit tests are included.
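For readers unfamiliar with XGBoost-based cost models, the sketch below shows the general idea: train a booster on per-candidate feature vectors against normalized throughput scores, then rank new candidates by predicted score. The class and method names are illustrative assumptions, not the actual interface added in this PR.

```python
# A minimal, illustrative sketch of an XGBoost-based cost model; not the
# actual meta schedule implementation or API.
import numpy as np
import xgboost as xgb

class XGBCostModelSketch:
    def __init__(self):
        self.booster = None
        self.params = {
            "max_depth": 6,
            "eta": 0.2,
            "objective": "reg:squarederror",
            "verbosity": 0,
        }

    def update(self, features: np.ndarray, run_secs: np.ndarray) -> None:
        # Train on normalized throughput scores (higher is better),
        # a common choice for tuning cost models.
        scores = np.min(run_secs) / run_secs
        dtrain = xgb.DMatrix(features, label=scores)
        self.booster = xgb.train(self.params, dtrain, num_boost_round=100)

    def predict(self, features: np.ndarray) -> np.ndarray:
        if self.booster is None:
            # No training data yet: return a neutral score for every candidate.
            return np.zeros(features.shape[0])
        return self.booster.predict(xgb.DMatrix(features))
```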
Thanks to all co-authors for contributing!
Co-authored-by: Xiyou Zhou <xiyou@octoml.ai>
Co-authored-by: Bohan Hou <32121147+spectrometerHBH@users.noreply.github.com>
Co-authored-by: Ruihang Lai <lairuihangdongdong@qq.com>
Co-authored-by: Hongyi Jin <3231950289@qq.com>
Co-authored-by: Wuwei Lin <wuwei@apache.org>
Co-authored-by: Siyuan Feng <Hzfengsy@sjtu.edu.cn>