Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use digests instead of tags to reference images #16047

Open
3 of 4 tasks
l0rd opened this issue Feb 14, 2020 · 18 comments
Open
3 of 4 tasks

Use digests instead of tags to reference images #16047

l0rd opened this issue Feb 14, 2020 · 18 comments
Labels
area/devfile-registry area/plugin-registry kind/epic A long-lived, PM-driven feature request. Must include a checklist of items that must be completed. lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. severity/P1 Has a major impact to usage or development of the system.

Comments

@l0rd
Copy link
Contributor

l0rd commented Feb 14, 2020

Is your enhancement related to a problem?

Images tags can reference an image today and a different one tomorrow. That makes our plugin and devfile registries builds not repeatable. We have the same kind of problem with the references to git repos. In this case it's for containers.

Using the digests would have the following benefits:

  • make the build of the registries repeatable (with that as well)
  • reducing the number of builds and push of images (no changes --> no build/push)
  • reduce the number of replacements we need to do in devfiles and meta.yaml when preparing a release

Describe the solution you'd like

Replacing tags with digests in:

  • che dockerfiles
  • plugin registry
  • devfile registry
  • properties files
@l0rd l0rd added the kind/enhancement A feature request - must adhere to the feature request template. label Feb 14, 2020
@l0rd l0rd changed the title Use digests instead of tags when referencing container images Use digests instead of tags to reference images Feb 14, 2020
@che-bot che-bot added the status/need-triage An issue that needs to be prioritized by the curator responsible for the triage. See https://github. label Feb 14, 2020
@ibuziuk
Copy link
Member

ibuziuk commented Feb 17, 2020

@l0rd could you please put the priority and team for this task. I suspect this is P1 since this feature is needed for the product

@l0rd l0rd added kind/epic A long-lived, PM-driven feature request. Must include a checklist of items that must be completed. severity/P1 Has a major impact to usage or development of the system. and removed status/need-triage An issue that needs to be prioritized by the curator responsible for the triage. See https://github. kind/enhancement A feature request - must adhere to the feature request template. labels Feb 17, 2020
@l0rd
Copy link
Contributor Author

l0rd commented Feb 17, 2020

@ibuziuk done. By the way this is not needed for the product. It's an improvement that will make Che more reliable. I have also affected the label kind/epic because it will need multiple teams to work on and probably a few iterations.

@nickboldt
Copy link
Contributor

properties files

What properties files do we have in Che that refer to images?

che dockerfiles

Can you give an example of this?

In CRW we only use tags converted to digests in three places:

  • plugin registry
  • devfile registry
  • operator metadata's latest CSV file

These are updated at build time from the latest tag via these scripts:

Perhaps the best solution here is to implement the same thing in upstream, so every container build of registries and every csv release uses specific digests, but the code in GH still contains the tag references?

That way you have BOTH the snapshot with the specific, permanent reference to a specific image, but also the code with the moving tag in case you want to rebuild it against a newer nightly/latest.

@RickJWagner
Copy link
Contributor

RickJWagner commented May 6, 2020

A user is asking about image upgrades. (i.e. some workspace image gets a CVE upgrade.) To make use of the improved image:

A) The user should upgrade the entire product, via che(crw)ctl server:upgrade
B) The user should manually hack plugin, devfile, operator metadata
C) We will provide some documented way of doing this
D) Other

(Note: Docker build processes above don't seem workable on RHEL 7)

@nickboldt @l0rd

@l0rd
Copy link
Contributor Author

l0rd commented May 6, 2020

@nickboldt 4 images are specified in che.properties, every Dockerfile in our repos have a FROM <base-image> where the <base-image> should be specified using a digest.

Perhaps the best solution here is to implement the same thing in upstream, so every container build of registries and every csv release uses specific digests, but the code in GH still contains the tag references?

That approach doesn't solve the main problem here: we want repeatable builds when we do a release. When our source code doesn't change we want the result of the build to be identical. The example of issue that we want to fix is when che.openshift.io got broken after an update because we were referring quarkus nightly image as nightly and that image had "silently" changed.

That way you have BOTH the snapshot with the specific, permanent reference to a specific image, but also the code with the moving tag in case you want to rebuild it against a newer nightly/latest.

That's exactly what we want to avoid.

@ericwill
Copy link
Contributor

ericwill commented May 6, 2020

From my understanding, we should convert the usage of tags to digests in:

  • the registry (devfile, plugin) meta.yaml's
  • Dockerfiles base images

Going forward then we should only accept contributions with digests used, no tags allowed. There may even be a PR check that could be triggered which will use a regex to automatically validate that no tags have been used in the PR being submitted.

@l0rd
Copy link
Contributor Author

l0rd commented May 6, 2020

@ericwill that's fine. What about the use of next version of theia? Are we going to replace next here with a digest or should we consider it as an exception?

@l0rd
Copy link
Contributor Author

l0rd commented May 6, 2020

@RickJWagner it depends on the image. If it's an update of a server image (operator, che server, registries etc...) those are updated without any manual intervention. If it's an update of a workspace image (theia, machine-exec, sidecars etc...) then existing workspaces that include that image need to be manually migrated. Existing workspaces only. New workspaces will automatically include the new patched image.

@l0rd
Copy link
Contributor Author

l0rd commented May 6, 2020

@RickJWagner but in any case we need to release a new version of the operator first. Otherwise no update will be triggered.

@ericwill
Copy link
Contributor

ericwill commented May 7, 2020

@ericwill that's fine. What about the use of next version of theia? Are we going to replace next here with a digest or should we consider it as an exception?

I think it makes sense to handle this as an exception, since we are the ones controlling the release of che-theia, and the releases of the registries are sync'ed with the releases of che-theia.

@amisevsk
Copy link
Contributor

Another option here, somewhat in line with Nick's suggestion, is to fix images via digest for releases and leave tags in the master branch. Once we release e.g. 7.15.0 for the registries, we run the write_image_digests.sh script and that release will be a reproducible build, without the manual work of maintaining 'freshness' of the digests.

@ericwill
Copy link
Contributor

Maybe I'm missing the point here, but wouldn't that still leave us with the problem of images "silently" changing between releases? For example: if a change is pushed to a :nightly image over the course of a sprint, then automatically updating that image at release time means a potentially breaking change is stuffed into the next release.

IMO we should heavily automate testing of these images, and have dependabot style automation for keeping them up-to-date. That way we aren't blindly updating images in a script, yet at the same time reducing our time spent maintaining these images. The added benefit here is that there is a clear trail of issues/PRs showing what was changed, so in case something breaks it's easier to find the breaking commit.

The flow would be:

  1. Dependabot checks images for updates, opens issues/PRs for updates.
  2. Strong automated testing of the image is triggered via CI.
  3. A quick manual verification/smoke test is conducted before merging.

@ericwill ericwill mentioned this issue May 13, 2020
43 tasks
@amisevsk
Copy link
Contributor

Part of the issue here is that if we can implement

Strong automated testing of the image is triggered via CI.

that's all the issue boils down to. If we intend automated jobs to catch breaking changes, then it doesn't really matter all that much, does it?

  • A) Keep nightly tag, use automated tests to test images. If a breaking change is pushed, CI fails and we do the work to resolve the issue.
  • B) Use digests, CI triggers PRs on update that are tested via automated tests. If tests fail, we do the work to resolve the issue.

The same image is published at release time in both cases, except that case b) will require manual verification of every change and be a significant manual burden until automated tests are in a stable place.

I agree that FROM images should be fixed by digest, I'm referring more to fixing digests for every image in the registries themselves.

@ericwill
Copy link
Contributor

Okay I think we're talking about the same thing. My main concern are the 14 base images in the devfile registry. This is where "silent changes" can really cause problems for us.

The other images in the plugin registry aren't a high priority right now, because most of them are already using "digest style" tags in the format <IMAGE_NAME>:<VERSION>-<COMMIT ID>, so it's impossible for silent changes to happen as those images are only ever pushed once.

What I propose as an immediate action:

  1. Convert the base images in the devfile registry to digests
  2. Introduce dependabot automation for said images, so we get free PR's and can test each change easily. There aren't many images here anyway so it should be manageable.
  3. Improve automated tests for the plugin/devfile registries (already in progress)
  4. Compile a list of plugins not using the "digest style" tag format and convert them.

We could probably (just a thought) handle the plugin registry via attrition actually, and just do digests going forward as plugins become updated (those have to be done manually anyway, for now).

@amisevsk
Copy link
Contributor

That makes sense 👍, sorry I misunderstood the context of the discussion.

@ericwill ericwill mentioned this issue Jun 3, 2020
40 tasks
@ericwill ericwill mentioned this issue Jun 24, 2020
31 tasks
@l0rd l0rd added area/install Issues related to installation, including offline/air gap and initial setup area/plugin-registry area/devfile-registry area/che-server labels Jul 29, 2020
@che-bot
Copy link
Contributor

che-bot commented Feb 4, 2021

Issues go stale after 180 days of inactivity. lifecycle/stale issues rot after an additional 7 days of inactivity and eventually close.

Mark the issue as fresh with /remove-lifecycle stale in a new comment.

If this issue is safe to close now please do so.

Moderators: Add lifecycle/frozen label to avoid stale mode.

@che-bot che-bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Feb 4, 2021
@ericwill ericwill added lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Feb 4, 2021
@ericwill
Copy link
Contributor

Plugin registry is done.

@ericwill
Copy link
Contributor

Devfile registry is also done.

@tolusha tolusha removed the area/install Issues related to installation, including offline/air gap and initial setup label Apr 24, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/devfile-registry area/plugin-registry kind/epic A long-lived, PM-driven feature request. Must include a checklist of items that must be completed. lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. severity/P1 Has a major impact to usage or development of the system.
Projects
None yet
Development

No branches or pull requests

9 participants