[CI] Add ml_dtypes dependency for all docker images #15226
Conversation
@tvm-bot rerun
Hi. Thanks for the PR @yzh119.
Which images were tested with the proposed change?
I couldn't verify the usage of ubuntu2004_install_python_package.sh
in ci_cpu, for example. Perhaps you meant ubuntu_install_python_package.sh
instead?
@leandron thank you for your review!
Hi @leandron, do you have any further feedback?
```diff
@@ -43,4 +43,5 @@ pip3 install --upgrade \
     junitparser==2.4.2 \
     six \
     tornado \
-    pytest-lazy-fixture
+    pytest-lazy-fixture \
+    git+https://github.com/jax-ml/ml_dtypes.git@v0.2.0
```
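Once an image is rebuilt with a change like the one above, a quick sanity check is to run a small script inside the container that confirms the expected packages are importable. This is only an illustrative sketch, not part of the PR: the helper name, the package set, and the naive version comparison are assumptions, and it assumes the import name matches the distribution name.

```python
# Illustrative sketch: verify that a rebuilt CI image's Python environment
# ships the packages we expect. The package names below are examples only,
# and module name is assumed to equal the PyPI distribution name.
import importlib.metadata
import importlib.util


def missing_packages(required):
    """Map each missing or too-old package name to a reason string.

    `required` maps a package name to a minimum version string,
    or to None when any installed version is acceptable.
    """
    problems = {}
    for name, min_version in required.items():
        if importlib.util.find_spec(name) is None:
            problems[name] = "not installed"
        elif min_version is not None:
            installed = importlib.metadata.version(name)
            # Naive numeric comparison of the first three dotted components;
            # good enough for a smoke test, not a full PEP 440 comparison.
            if tuple(map(int, installed.split(".")[:3])) < tuple(
                map(int, min_version.split("."))
            ):
                problems[name] = f"found {installed}, need >= {min_version}"
    return problems


if __name__ == "__main__":
    # ml_dtypes pinned to the version the diff installs; None means any version.
    print(missing_packages({"ml_dtypes": "0.2.0", "pytest": None}))
```

Running this inside each rebuilt image (e.g. via `docker run <image> python3 check.py`) would surface a missing ml_dtypes before CI does.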
I think we should favour using this package from PyPI, like the other platforms do. I checked and the source package is there: https://pypi.org/project/ml-dtypes/0.2.0/#files
Can we just move to a plain ml_dtypes dependency?
I was using ml_dtypes from PyPI at first, but unfortunately there is no i686 prebuilt wheel, so pip falls back to compiling from source (the .tar.gz file), which doesn't work (see this run: https://ci.tlcpack.ai/blue/organizations/jenkins/tvm-docker/detail/PR-15226/2/pipeline).
Generally I'm fine with the change, but I would also like to see another PR with tests passing against the temporary Docker images generated by this job before we proceed with merging this.
Yes, I'd love to, but the docker images would only be uploaded to tlcpack
if the PR gets merged into the mainline.
It's great that you want to do the test, and you are correct wrt tlcpack. I'm happy to say that there is a way: if you check the CI results box below, you'll see a docker/pr-head stage, which generated a set of temporary images and uploaded them to AWS ECR.
This is the specific run for this job: https://ci.tlcpack.ai/blue/organizations/jenkins/tvm-docker/detail/PR-15226/4/pipeline
Usually people take those images and submit a new PR to test CI against them (e.g. 477529581014.dkr.ecr.us-west-2.amazonaws.com/ci_i386:PR-15226-c0454f162-4) by updating https://github.com/apache/tvm/blob/main/ci/jenkins/docker-images.ini with the image names. The test PR can be closed as soon as it finishes successfully, and then we merge this one.
Other PRs have done this in the past, and it is safer for us to push for better testing on Docker changes, as they take a long time to troubleshoot when we break them.
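The workflow described above boils down to a one-line change in the test PR. As a rough sketch, docker-images.ini would temporarily point the relevant entry at the ECR image. Only the ci_i386 tag below comes from this thread; the section header and layout are assumptions about the file's format, not a verbatim copy of it:

```ini
; ci/jenkins/docker-images.ini -- temporary override for a throwaway test PR.
; Only the ci_i386 tag is taken from this conversation; the rest is illustrative.
[jenkins]
ci_i386: 477529581014.dkr.ecr.us-west-2.amazonaws.com/ci_i386:PR-15226-c0454f162-4
```

Once CI passes on the test PR with this override, the override PR is closed unmerged and the Docker change itself can land.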
The temporary images might have been cleaned up. I'd suggest re-running CI and submitting the test job as soon as the image rebuild finishes. Sorry, I have no access to the temporary images configuration.
@tvm-bot rerun
I don't see any point in blocking on this, given it is just a minor dependency update and unlikely to break anything. Technically, the set of binary images is also still stable and points to the previous build. This is exactly why we used the binary tag in the first place.
Moving main CI to include the dependency immediately is also unlikely to break anything. Maybe I am missing something here, but my technical assessment is that there is no risk in merging it.
Considering the minor nature of this dependency update, there seems to be no reason for us to hold back. The risk of it causing any significant disruptions is quite low. From a technical perspective, the current set of binary images remains stable and is still directed towards the previous build. This is precisely why we initially opted for the binary tag. Until we make the necessary changes to the Jenkinsfile, pointing it to the new binary, our main CI process will remain unaffected. The existing code does not currently rely on the new package, so it is highly unlikely to cause any complications. Even in the rare event that it does result in issues (though I cannot think of any at the moment), we can promptly revert the change if the binary breaks. As a result, there won't be any blockers in the CI pipeline since all builds will continue using the old binary image. Furthermore, it is probable that integrating the new dependency into the main CI process immediately will not lead to any problems. Though I may be overlooking certain aspects, my technical assessment indicates that there is minimal risk associated with incorporating the update.
The PR #15183 adds ml_dtypes as a dependency of TVM; however, some of the CI Docker images do not have ml_dtypes installed. This PR fixes that.
cc @yongwww, would you mind also helping upload the new images with ml_dtypes installed to Docker Hub?