Guidance for Building Docker Image #896

Open
pinkfloydx33 opened this issue Feb 3, 2023 · 6 comments

Comments

@pinkfloydx33

My apologies if this is not the correct place to ask this....

We run a customized version of the dependabot-script inside of our Azure Devops instance. The script requires the dependabot-core image. Due to some restrictions in AzDO, we need to alter the UID/GID in the dependabot image to match those of our Azure Devops build agent user.

Up until last week we would clone this repo and then build ./Dockerfile passing --build-arg USER_UID=1234 --build-arg USER_GID=1234. We'd then publish the new image to our container registry and use that in the actual run of dependabot.

However this file is now gone, so at the very least the file to build has changed. But.. I've reviewed the replacement file as well as the PR that introduced the changes and it is not clear to me if/how I can continue to build the image from source. I'm not a docker expert and the person who originally created this process is no longer with us. I'm struggling to figure out what the right thing to do is.

Any help/guidance would be appreciated.

@jeffwidman
Member

👋 Sorry for the pain here!

We actually should document how to do this for others as well, so I'll give you some high-level instructions and let me know what else you need, and then I'll probably turn this into a help doc long term over in the https://github.com/dependabot/dependabot-script repo.

Instead of building the root Dockerfile, you'll want to build the dockerfile(s) for the ecosystems you want to support. Example: nuget/Dockerfile

And then use that. That should actually build much faster.

If you have multiple ecosystems that you need to support, you'll need to build multiple docker images, and run each separately against the different ecosystems... ie, the java image against Java manifests, the php image against composer manifests etc.
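As a rough illustration of building one image per ecosystem, the sketch below only prints the docker build commands so they can be reviewed before running. The registry name, the tag scheme, and the directory names passed at the bottom are assumptions for illustration; the `<ecosystem>/Dockerfile` layout follows the nuget/Dockerfile example above. Any base image those Dockerfiles reference would need to be available (or built first) for the printed commands to succeed.

```shell
#!/bin/sh
# Sketch: print one `docker build` command per ecosystem directory.
# REGISTRY and the tag scheme are hypothetical, not taken from the thread.

REGISTRY="${REGISTRY:-registry.example.com/dependabot}"

# Echo the docker build command for a single ecosystem directory.
build_command() {
  eco="$1"
  printf 'docker build -f %s/Dockerfile -t %s/%s:latest .\n' \
    "$eco" "$REGISTRY" "$eco"
}

# Echo the build commands for every ecosystem given as an argument.
build_all() {
  for name in "$@"; do
    build_command "$name"
  done
}

# Directory names below are assumed; adjust to the ecosystems you need.
build_all nuget npm_and_yarn go_modules docker
```

Printing rather than executing keeps the sketch safe to run anywhere; piping the output to `sh` from the repository root would run the builds.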

Also, you may find https://github.com/tinglesoftware/dependabot-azure-devops useful if you haven't seen it already, although I'm not sure if @mburumaxwell has had a chance yet to update it to support the new per-ecosystem docker image workflows.

@pinkfloydx33
Author

pinkfloydx33 commented Feb 5, 2023

So that's interesting. We actually used to build a "slim" version of the image which involved pre-processing the dockerfile, reading it line by line and detecting the ecosystems we didn't want, outputting the lines to keep into a new file and building that.

Having images dedicated per ecosystem definitely eases that a bit. Unfortunately I need several ecosystems (nuget, npm, go, docker). On its face I don't mind building four images, I'm just not sure it's going to work with our dependabot setup, at least not without a refactoring of the job--but that's my problem :)

We aren't actually using dependabot-script since it's not designed for azure devops (or at least it wasn't at the time). Instead we are using an azure devops port that someone associated with that repo made and linked in comments a while back. We then heavily modified the ruby script to support our Azure Devops+Jira setup along with some tweaks to deal with a couple of our older monorepos. That other project looks interesting but we're pretty invested in our customized script at this point, though I will check it out to see if I can glean anything. Thanks for the tip.

It's the monorepos I'm concerned about since there's a mixture of ecosystems. If I were to run the nuget-specific image against, let's say npm, will it just ignore it? IOW will it ignore any config file entries that aren't for nuget? If that's the case I should be fine as I could build a matrix job and just run the different images blindly.

I looked at the nuget Dockerfile and I'm a bit confused. The UID/GID are set and handled in the updater dockerfile and it looks like the ecosystem-specific ones just copy files out of that image. If I build the nuget file, isn't it too late for me to set the two build args I need? Again I'm not a docker expert so apologies if that's a stupid question.

Alternatively, do you know if it's possible for me to just set the uid/gid of the main dependabot-core image, without even pulling this repo, since really that's all I need... ie. in my own build/Dockerfile:

FROM dependabot-core
RUN "reset UID and GID somehow"

I suppose that's more of a Linux question. I found this though not sure if it'll work, but I suppose it's worth a shot if all else fails.

In the meantime I'll start looking at the ecosystem specific images and see what I can come up with. Thanks for your response and any further guidance you might be able to offer.

@jeffwidman
Member

Thanks, that's a pretty interesting use case, and wanting to support multiple ecosystems in one run is probably not too uncommon.

Sounds like you have two options to quickly unblock yourself (w/o waiting on us to change things here in upstream):

  1. Refactor your entrypoint scripts to take an "ecosystem" argument, and load the appropriate image for each ecosystem, as well as skipping jobs that don't match that ecosystem... then run a matrix of jobs.
  2. Re-create a single monolith image using Docker Buildkit to pull stuff from the various broken-apart images... but it's probably going to be painful as you'll have to manually specify a bunch of things.

I suspect option 1️⃣ will be easier and certainly more maintainable in the long run. You can look at the PRs that modified the bin/docker-dev-shell script to get a sense of how you might plumb this through.
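A minimal sketch of what option 1 could look like in an entrypoint script: one helper maps an ecosystem name to an image, and another filters a job list down to the entries for that ecosystem. The image prefix and the "ecosystem:directory" job format are hypothetical and only serve the illustration; a real script would read these from its own config.

```shell
#!/bin/sh
# Sketch of option 1: choose an image per ecosystem, skip non-matching jobs.
# IMAGE_PREFIX and the "ecosystem:directory" line format are hypothetical.

IMAGE_PREFIX="${IMAGE_PREFIX:-registry.example.com/dependabot}"

# Print the image for a supported ecosystem; fail for anything else.
image_for() {
  case "$1" in
    nuget|npm|go|docker) printf '%s-%s\n' "$IMAGE_PREFIX" "$1" ;;
    *) return 1 ;;
  esac
}

# Read "ecosystem:directory" lines on stdin and print only the
# directories that belong to the wanted ecosystem.
jobs_for() {
  wanted="$1"
  while IFS=: read -r ecosystem directory; do
    if [ "$ecosystem" = "$wanted" ]; then
      printf '%s\n' "$directory"
    fi
  done
  return 0
}
```

Each leg of the matrix would then call `image_for` with its own ecosystem and pass the job list through `jobs_for`, so entries for other ecosystems are skipped rather than handed to an image that cannot process them.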

Re: changing the user... you can either do exactly what you're thinking of and tweak them in a chained Dockerfile, or just set them at build time using --build-arg... docker run supports a similar option (--user) at runtime too. Also, if there are changes you want upstream here to make it more pluggable, feel free to submit a PR.
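For the chained Dockerfile route, one possible shape is sketched below: a helper that writes a small Dockerfile which switches to root, re-maps the existing user and group to the new IDs, and switches back. It assumes (not confirmed in this thread) that the base image ships usermod/groupmod, that its unprivileged user and group are both named dependabot, and that the home directory is /home/dependabot. Files owned by the old UID outside the home directory would need their own chown.

```shell
#!/bin/sh
# Sketch: generate a chained Dockerfile that re-maps UID/GID.
# Assumed, not confirmed: user and group "dependabot", home at
# /home/dependabot, and usermod/groupmod present in the base image.

# Usage: write_remap_dockerfile <base-image> <uid> <gid> <output-file>
write_remap_dockerfile() {
  base_image="$1"
  uid="$2"
  gid="$3"
  out="$4"
  # Unquoted heredoc: variables expand, and \\ becomes a single backslash.
  cat > "$out" <<EOF
FROM ${base_image}
USER root
RUN groupmod -g ${gid} dependabot \\
 && usermod -u ${uid} -g ${gid} dependabot \\
 && chown -R ${uid}:${gid} /home/dependabot
USER dependabot
EOF
}
```

Calling `write_remap_dockerfile <your-base-image> 1234 1234 Dockerfile.remap` and then building that file with `docker build -f Dockerfile.remap` would produce the re-mapped image without cloning the repo.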

@pinkfloydx33
Author

Ok thanks for your help. I will look into the various options and see what I can come up with. I suspect I can probably create something a bit more simplistic now that these are broken apart. I appreciate your help. Thanks again!


A little more background in case it helps in whatever guidance you guys come up with generally:

We run this as a container job, meaning that the entire pipeline actually runs inside the container image (dependabot) that we select. This is different from doing a docker run inside the pipeline itself: you declare the image(s) to use upfront as part of the pipeline definition. The image is pulled before the pipeline starts and then AzDO mounts itself into the container. The pipeline agent's user is the one seen inside the container, ie id -u would return the UID of the pipeline agent. This is why we need to set the uid/gid at build time to make sure our user has permissions inside the container, otherwise the job cannot even start.

This worked for a while, until AzDO made some changes around container jobs that I think (?) coincided with dependabot moving away from the root user. I forget what the exact error was, but the solution was to provide --user 0 in the container's run arguments. You'd think that that would necessarily override things and that id -u would no longer match the agent's user but that's somehow not the case and we still had to build the image with the uid/gid args.
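Since the failure mode described here is the agent's user lacking permissions inside the container, a small preflight step can make that visible early. The sketch below is only an assumption about how one might check it: it reports the effective UID/GID and whether a given directory is writable, and the directory to check is whatever path the job actually needs.

```shell
#!/bin/sh
# Sketch: report the effective user and fail if a directory is not writable.
# The directory to check is supplied by the caller.

check_writable() {
  dir="$1"
  if [ -w "$dir" ]; then
    printf 'uid=%s gid=%s can write to %s\n' "$(id -u)" "$(id -g)" "$dir"
  else
    printf 'uid=%s gid=%s cannot write to %s\n' "$(id -u)" "$(id -g)" "$dir" >&2
    return 1
  fi
}
```

Running `check_writable` as the first step of the container job would show which UID the agent really runs as, which helps when `--user 0` and the build-time IDs appear to disagree.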

@jeffwidman
Member

I'm going to transfer to dependabot-script repo and leave open as I think this feedback is very pertinent to some of the cleanup work we plan to do there in the hopefully near future.

@jeffwidman jeffwidman transferred this issue from dependabot/dependabot-core Feb 10, 2023
@mburumaxwell

@jeffwidman I am not sure how straightforward it will be to migrate to this new workflow. Does it mean we need to migrate to using the updater? (Unfortunately, it seems fixed for GitHub only, with the internal HTTP service). Also, it's about 16 docker images to manage/maintain now.

How to use custom scripts/code is also unclear: whether to add the script to each docker image or to execute it from a local file using the docker image.

Hopefully, you can provide some guidance on this.

mburumaxwell added a commit to tinglesoftware/dependabot-azure-devops that referenced this issue Jul 30, 2023
This PR moves from the single image, which is now dated, to having an image per ecosystem. This aligns with the GitHub hosted version.

Some more information available [here](dependabot/dependabot-script#896 (comment))

As a consequence, the image repository can no longer be specified as an input to the task (using `dockerImageRepository`) or the server (using `updaterImageRepository`).

3 participants