Docker build with Ubuntu mirrors #16098
Once I check that all architectures build successfully, I will drop all commits except the "Add more mirrors" one.
a063086 to 9075bc8
@hashhar Do you think this approach is worth considering?
@losipiuk Any chance of getting this merged?
docker build \
    "${WORK_DIR}" \
    --pull \
    --platform "linux/$arch" \
    -f Dockerfile \
    -t "${TAG_PREFIX}-$arch" \
    --build-arg "TRINO_VERSION=${TRINO_VERSION}"
rm -fr "${WORK_DIR}/default/apt/sources.list.d"
Why not put those mirror files in `default/etc` and include all of them when building the image for each architecture? Would it break something? I think the non-matching mirrors would not be used. Then you would not need to modify this script at all.
That was my initial approach, but if I remember correctly, having all architectures defined at once made the `apt update` step longer, since it tried to download package lists from all mirrors.
One performance improvement that might be worth checking out is having all mirrors defined while also using `docker buildx` to build multiple architectures in parallel. That way we'd have a single shared build context, and the parallel build gains might offset some of the cost of fetching all mirror package lists.
> One performance improvement that might be worth checking out is having all mirrors defined while also using `docker buildx` to build multiple architectures in parallel. That way we'd have a single shared build context, and the parallel build gains might offset some of the cost of fetching all mirror package lists.
I have a WIP branch with this approach: `michal/docker_ubuntu_mirrors_dockerx`.
It's down to 11m23s on the second execution, which is still a little more than usual.
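For readers who have not used `docker buildx` this way, the parallel multi-architecture build discussed above could look roughly like the sketch below. It is not the code from the WIP branch. `WORK_DIR`, `TAG_PREFIX`, and `TRINO_VERSION` follow the names in the existing script, while the architecture list, the default values, the single shared tag, and the `DRY_RUN` switch are assumptions made for illustration.

```shell
#!/bin/sh
set -eu

# Hypothetical defaults for illustration; the real script sets these elsewhere.
WORK_DIR="${WORK_DIR:-.}"
TAG_PREFIX="${TAG_PREFIX:-trino:dev}"
TRINO_VERSION="${TRINO_VERSION:-0}"
ARCHITECTURES="${ARCHITECTURES:-amd64 arm64 ppc64le}"  # assumed target list
DRY_RUN="${DRY_RUN:-1}"                                # print instead of build by default

# Turn "amd64 arm64" into "linux/amd64,linux/arm64", the comma-separated
# form that the --platform flag of `docker buildx build` accepts.
platform_list() {
    list=""
    for arch in "$@"; do
        list="${list:+${list},}linux/${arch}"
    done
    printf '%s\n' "${list}"
}

# Word splitting of ARCHITECTURES is intended here.
platforms="$(platform_list ${ARCHITECTURES})"

# One invocation covers every architecture, so the build context is sent once
# and the per-platform builds can run in parallel inside the builder.
set -- docker buildx build \
    "${WORK_DIR}" \
    --pull \
    --platform "${platforms}" \
    -f Dockerfile \
    -t "${TAG_PREFIX}" \
    --build-arg "TRINO_VERSION=${TRINO_VERSION}"

if [ "${DRY_RUN}" = "1" ]; then
    printf '%s\n' "$*"
else
    "$@"
fi
```

Depending on the builder and image store in use, a multi-platform result may need to be pushed to a registry or exported rather than loaded locally, so the per-architecture tags of the current script would have to be handled differently.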
f093312 to 89f9d6e
Enabled: yes
Types: deb
URIs: https://mirrors.ocf.berkeley.edu/ubuntu/ https://mirror.kumi.systems/ubuntu/
Suites: jammy jammy-updates jammy-backports jammy-security
Will it always be `jammy`? Aren't they bumping the Debian version when there is a new one in `eclipse-temurin:17-jdk`?
I think once `eclipse-temurin:17-jdk` updates their base image, we should update the mirror descriptors manually.
Other than pinning ourselves to a specific hash of `eclipse-temurin`, we have no control over the distro version they use.
They might even switch from Ubuntu to something which doesn't use APT.
The point is that we will not notice that moment, right?
Or will we get some error when they update?
The error might not directly mention a mismatch of Ubuntu versions, but I expect it to look like a huge update, with APT trying to pull every installed package twice. I think it will either time out the job or take long enough for us to notice.
I also wonder if using a tag instead of a hash for the base image is a good idea. I think it's better to use immutable hashes for build repeatability.
We can probably build these mirror lists dynamically during the image build with something like:
# POSIX equivalent of `source`, so it also works under /bin/sh
. /etc/os-release
# the above sets a UBUNTU_CODENAME variable
# double quotes are needed so the shell expands ${UBUNTU_CODENAME}
sed -i "s/UBUNTU_CODENAME/${UBUNTU_CODENAME}/g" /etc/apt/sources.list.d/*
And in the mirror list we commit to the repo, we replace occurrences of the actual release name with `UBUNTU_CODENAME`.
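The placeholder substitution suggested above can be tried outside a container with a minimal sketch like the following. It is not the code merged in this PR: the function name `render_sources` and its two path parameters are hypothetical, and it assumes GNU `sed` (as shipped in Ubuntu) for the `-i` flag. It also fails loudly when `UBUNTU_CODENAME` is missing, which would be one way to notice a base image that no longer uses Ubuntu.

```shell
#!/bin/sh
set -eu

# Replace the UBUNTU_CODENAME placeholder in every file of a sources directory
# with the codename read from an os-release file.
# Both paths are parameters so the function can be exercised on temporary files.
render_sources() {
    os_release="$1"
    sources_dir="$2"

    if [ ! -r "${os_release}" ]; then
        echo "cannot read ${os_release}" >&2
        return 1
    fi

    # Read the file in a subshell so its other keys do not leak into the
    # calling shell; unset first so an inherited value cannot mask a missing key.
    codename="$(unset UBUNTU_CODENAME; . "${os_release}" && printf '%s' "${UBUNTU_CODENAME:-}")"
    if [ -z "${codename}" ]; then
        echo "UBUNTU_CODENAME is not set in ${os_release}; is the base image still Ubuntu?" >&2
        return 1
    fi

    # Double quotes let the shell expand ${codename} before sed sees the script.
    sed -i "s/UBUNTU_CODENAME/${codename}/g" "${sources_dir}"/*
}
```

Inside the image build this would be called as `render_sources /etc/os-release /etc/apt/sources.list.d` before the package lists are updated.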
> I also wonder if using a tag instead of a hash for the base image is a good idea

I think the point of not using a specific version is to always get the most recent security patches.

> We can probably build these mirrorlists dynamically during image build by something like

This looks nice.
> > I also wonder if using a tag instead of a hash for the base image is a good idea
>
> I think the point of not using a specific version is to always get the most recent security patches.

I understand that was the intention, but it seems risky because a build of the same trino git hash might work today and fail tomorrow just because the tag was updated in between. I think a more repeatable approach would be to pin ourselves to a specific Docker image hash and use `dependabot` to keep it up to date with the tag. In that case, every Docker image change would correspond to a commit in trino that we can roll back if it breaks something.
I do not think build repeatability is a very important thing to address here. Having subsequent builds use a safe Docker image as a base, without manual intervention to bump the base image version, is more important IMO.
LGTM. Please squash the fixup
2cf3887 to 8e5c991
Future-proof against Ubuntu codename update

This approach works as long as the following assumptions hold:
- the location of the `/etc/os-release` file does NOT change
- the name of the `UBUNTU_CODENAME` environment variable does NOT change
- `eclipse-temurin` still uses Ubuntu as its base
8e5c991 to a8a7912
The recent merge of trinodb#16098 increased build robustness in the event of main Ubuntu repo failures, at the cost of 2-3 minutes for downloading Ubuntu mirror indexes. It turns out the 45 minute timeout was very close to the actual time needed to perform the entire job, and extending it is necessary to avoid cancelling the job.
Description
In the last few weeks, we've seen failures in the job that builds the Docker image where it couldn't connect to `ports.ubuntu.com` or `archive.ubuntu.com`. Adding mirrors to `apt` should make it more resilient against such failures.
These two mirrors were chosen by getting a list of mirrors that include ports using this: https://askubuntu.com/questions/428698/are-there-alternative-repositories-to-ports-ubuntu-com-for-arm
and then checking which ones are the fastest using netselect, executed on an EC2 instance in us-east-2:
Out of these, the first one is invalid as it doesn't work over http, returning a 403.
Additional context and related issues
#15807 (comment)
https://github.com/trinodb/trino/actions/runs/4004028582/jobs/6872683430
This and #15840 both attempt to solve the same problem, but this solution is preferred as it doesn't rely on cache entries existing. #15840 can still be used to speed up Docker builds occasionally, but it isn't reliable enough to solve this problem.
Release notes
(x) This is not user-visible or docs only and no release notes are required.
( ) Release notes are required, please propose a release note for me.
( ) Release notes are required, with the following suggested text: