Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor dockerfiles to have a common base image. #42691

Closed
wants to merge 1 commit into from

Conversation

tomprince
Copy link
Member

Most of the Dockerfiles have common parts installing a bunch of packages as well as sccache and dumb-init. Move those common parts into a shared base image and build that first.

@rust-highfive
Copy link
Collaborator

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @alexcrichton (or someone else) soon.

If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.

Please see the contribution instructions for more information.

@tomprince
Copy link
Member Author

My only concern with putting everything in the base image is that all the images will need to be updated when we update sccache. I'm not sure if that occurs often enough to be an issue.

@tomprince
Copy link
Member Author

I also haven't test this, but am going to let travis handle it.

@bors
Copy link
Contributor

bors commented Jun 16, 2017

☔ The latest upstream changes (presumably #42631) made this pull request unmergeable. Please resolve the merge conflicts.

@shepmaster shepmaster added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jun 16, 2017
@alexcrichton
Copy link
Member

Thanks for the PR! The diffstat here is quite impressive :)

I've only got two points of hesitation for why I didn't do this in the first place. Historically we've had issues where some containers are different base images than others. I think everything right now is ubuntu:16.04 but historically we had a smorgasboard of a bunch of different versions. This in the end meant up that we ended up duplicating everything anyway because there's not a great way to have something like "middleware" in Docker (AFAIK at least).

The second point is that the base image tends to be relatively large-ish for all the other images. Most images, for example, don't need libncurses5-dev (except for the crosstool-ng ones I think).

I personally have been ok with the duplication here as it allows everything to evolve independently. If you're fixing a bug in one container there's no need to worry about an impact across every single other container. Now this does make it difficult when you want to fix a bug in all containers! My hope was that the shared scripts (like the sccache script) could be made so "updating one container" and "updating all containers" in typical day-to-day usage just had O(1) updates.

What do you think of these points? Do you feel the deduplication is worth the extra cost on images and potential loss in flexibility to change everything independently? I'm sort of up in the air about it myself, personally.

@alexcrichton
Copy link
Member

cc @malbarbo

@tomprince
Copy link
Member Author

The second point is that the base image tends to be relatively large-ish for all the other images. Most images, for example, don't need libncurses5-dev (except for the crosstool-ng ones I think).

I don't think this is a big issue. libncurses5-dev with all it dependencies appears to be about ~2MB (compared to the base image which is 118MB), so not significantly larger. I think the benefit of having once place to change things outweighs the small extra size. On a developers machine, if they are using multiple containers, it will actually end up being less size, since all the containers can share the base image, rather than each one having a slightly different package set.

It might make sense to refactor things more, so that the crosstool based images share a common subset too, in which case libncurses5-dev could be moved there (along with the custom build of make).

but historically we had a smorgasboard of a bunch of different versions.

I can't speak to this point at all.


The impetuous for this was experimenting with running linux build in taskcluster, which requires everything to be done in a docker container, including checking out the source. That would require having git in containers (along, possibly, with scripts to download the repo).

@tomprince
Copy link
Member Author

@alexcrichton
Copy link
Member

@tomprince I'm sort of leaning towards preferring to keep these containers separate (so they can all have different base versions if they need them). Is this strategy though incompatible with taskcluster?

@tomprince
Copy link
Member Author

I don't think the potential future need to have different base versions is a good reason not to do this:

  • This doesn't force all the containers to have the same base image, it just makes doing that the easiest thing (dist-{i686,x86_64}-linux still have a different base image, for example).
  • I think there is value in having a common base image. For example, the builds scripts don't need to worry about incidental differences. Making the default be having the same image probably puts a slight pressure to articulate why a particular builder needs to be different, which is a good thing.

Regarding taskcluster, there is nothing that forces having a common base image. Taskcluster does require that the checkout of source code happen inside the image, and I was pondering writing a tool that would handle that, which I would then include in the base image. But I could just duplicate that inclusion into all the images if they aren't factored like this.

@alexcrichton
Copy link
Member

Oh right yes I forgot about the dist x86 builders. So there is no default image already in that case? If that's true I personally see even more reason for not trying to consolidate all the other ones, it'd just mean there's no rule for consistency, which I'd find more confusing than consistently duplicating everything.

It sounds like there's no technical requirement for having the same base image, and furthermore we have a hard requirement that this can't be a requirement. In that sense I would prefer to keep the images separate as they are today.

@arielb1 arielb1 added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jun 20, 2017
@aidanhs
Copy link
Member

aidanhs commented Jun 21, 2017

I'm mildly against having a single base image just because it makes dependencies less explicit and it becomes harder to know if a tool can be removed or what precisely is still depending on it.

@Mark-Simulacrum
Copy link
Member

Yeah, I don't think a single image gives us many benefits in this case. I like the idea, though -- it's worth considering if there's some way we can deduplicate without potentially introducing problems later down the road. I'm especially worried about things being added to the base image when it's better to add them to, say, 3 other images.

I've also nominated for the infra meeting if we don't come to a decision before then, though with all hands it'll be a small meeting I imagine.

@Mark-Simulacrum Mark-Simulacrum added T-infra Relevant to the infrastructure team, which will review and decide on the PR/issue. I-nominated labels Jun 24, 2017
@bors
Copy link
Contributor

bors commented Jun 25, 2017

☔ The latest upstream changes (presumably #42784) made this pull request unmergeable. Please resolve the merge conflicts.

@Mark-Simulacrum Mark-Simulacrum added S-waiting-on-team Status: Awaiting decision from the relevant subteam (see the T-<team> label). and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Jul 2, 2017
@shepmaster
Copy link
Member

I'm not sure if it would apply here, but the playground used to do the same thing (have a shared "base" container that stable, beta, nightly all used). It was annoying after a while and newer Docker versions allowed me to have a single Dockerfile that is then parameterized at build time, allowing me to have a unified system but without having to remember to build A before B.

I'm also keeping my eye on multi-stage builds which might help with similar problems.

@Mark-Simulacrum
Copy link
Member

We discussed this during the infra meeting today and general consensus was to close. If someone would like to look into what @shepmaster says above, perhaps we could do that, but I am also concerned about depending on new docker versions -- the less we have to download on CI the better.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-team Status: Awaiting decision from the relevant subteam (see the T-<team> label). T-infra Relevant to the infrastructure team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants