[CI] Make Graviton3 default AArch64 job runner node #15352
To support SVE testing, this PR migrates the current default AArch64 nodes to Graviton3-based nodes. It uses r7g.large instances, which meet the memory requirements of the TVM workloads.
@@ -19,7 +19,7 @@
 {% call m.invoke_build(
   name='BUILD: arm',
-  node='ARM-SMALL',
+  node='ARM-GRAVITON3',
Thanks, it would be useful to have an analysis of the cost and of the way we structure the tests.
As of now, running the UTs directly through e2e compilation can take up a lot of CI time. Much of that comes from tests that are, as a matter of fact, integration tests.
It would be great to isolate a limited set of integration tests (for instance, under tests/arm_sve/) and run only that limited set of test cases on the specific nodes, like our require_cuda tag, while the majority of tests do not have to go through those nodes.
Thanks, it would be useful to have an analysis of the cost of the new instance. As of now, running the UTs directly through e2e compilation can take up a lot of CI time. Much of that comes from tests that likely do not need SVE. My understanding is that we will need SVE for some of the integration tests. Ideally we should isolate a limited set of integration tests (e.g. via ...). Most of the remaining tests can be structured as UTs and likely do not need SVE.
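The gating idea in the comment above (a tag in the spirit of require_cuda, so that only SVE-specific tests need an SVE-capable node) could look roughly like the sketch below. This is an illustration only: `requires_sve`, `cpu_has_sve`, and `host_has_sve` are hypothetical names, not existing TVM APIs, and the sketch assumes a Linux host where /proc/cpuinfo lists CPU flags on a "Features" line. It uses the standard-library unittest module rather than TVM's own testing helpers.

```python
import unittest


def cpu_has_sve(cpuinfo_text):
    """Return True if a /proc/cpuinfo dump lists 'sve' on a 'Features' line."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip().lower() == "features":
            # Compare whole tokens so flags such as 'svei8mm' do not match.
            if "sve" in value.split():
                return True
    return False


def host_has_sve(path="/proc/cpuinfo"):
    """Check the current host; report False if cpuinfo cannot be read."""
    try:
        with open(path) as f:
            return cpu_has_sve(f.read())
    except OSError:
        return False


# Hypothetical decorator (an assumption, not a TVM API): tests carrying it
# are skipped on hosts without SVE.
requires_sve = unittest.skipUnless(
    host_has_sve(), "requires an SVE-capable AArch64 host"
)


class SveIntegrationTest(unittest.TestCase):
    @requires_sve
    def test_runs_only_on_sve_hosts(self):
        # Placeholder body; a real test would exercise SVE code generation.
        self.assertTrue(host_has_sve())
```

With a gate like this, tests that do not carry the decorator can run on any AArch64 node, and only the decorated subset depends on the Graviton3 runners.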
@tqchen the new instance type is slightly more expensive on paper, as detailed below:
However, the new generation of instances has been shown to improve performance (see: re:Invent presentation), which suggests this is an improvement for CI costs. If you look at the diff, this replaces the
OK, got it; this seems good.