Improve retention policy and management of node and ci images #767

Open
4 tasks
NymanRobin opened this issue May 21, 2024 · 5 comments
Labels
lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. triage/accepted Indicates an issue is ready to be actively worked on.

Comments

@NymanRobin
Member

Current Situation

Currently, the retention policy keeps the last 5 images, but this isn't very safe. Taking a new image into use is a manual process, and there is little visibility into which image is currently in use. You have to be a Jenkins admin to see and change the image used for CI, so someone with rights to trigger the build can start it without knowing that it might erase the actively used image from OpenStack.

This problem also applies to node images. However, everyone currently has visibility into both Artifactory and the dev-env code, allowing them to see what image is used and understand how a new trigger will affect it.

What needs to be fixed

To address these issues, we need to ensure that the actively used image is never deleted. We also need a way to verify, through testing, that a new image works properly before it replaces the active one. Additionally, any changes to an image build should be testable in the PR before merging.

  • Make the active image separate from the candidate images
  • Add a promotion process for changing the active image
  • Add tests for the new image when the DiB image workflow changes
  • Check that all file changes affecting the DiB workflow trigger tests, and that a new build is triggered on merge
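
To illustrate the first point, here is a minimal sketch of a retention rule that can never select the active image for deletion. It is only an illustration: the `Image` type, the `images_to_delete` helper, and the image names are hypothetical, and the real cleanup would query OpenStack or Artifactory rather than an in-memory list.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Image:
    """Hypothetical stand-in for an image record in the image store."""
    name: str
    created: datetime


def images_to_delete(images, active_name, keep=5):
    """Return the candidate images that fall outside the retention window.

    The active image is filtered out before anything is counted, so it is
    never returned for deletion, no matter how old it is.
    """
    if keep < 0:
        raise ValueError("keep must be non-negative")
    candidates = [img for img in images if img.name != active_name]
    # Newest first, so the slice below drops only the oldest candidates.
    candidates.sort(key=lambda img: img.created, reverse=True)
    return candidates[keep:]


# Usage: seven images, and the oldest one happens to be the active image.
base = datetime(2024, 5, 1)
images = [Image(f"img-{i}", base + timedelta(days=i)) for i in range(7)]
for img in images_to_delete(images, active_name="img-0", keep=5):
    print("would delete", img.name)
```

With a plain "keep the last 5" rule, `img-0` would be among the deleted images here; with the active image excluded first, only the oldest candidate is selected.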

Potential solution

A potential solution could involve having the active image with a separate naming convention from the candidate images. For promotion, there would be a pipeline that takes a candidate image as input, runs tests on it, and if the tests pass, automatically changes the active image to the candidate.
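
As a rough sketch of that promotion flow (not a Jenkins pipeline, just the logic): the `ImageRegistry` class, the `promote` function, and the image names below are all hypothetical, and the checks are placeholders for whatever tests the real pipeline would run against a candidate.

```python
from typing import Callable, Iterable


class ImageRegistry:
    """Hypothetical in-memory stand-in for wherever the active image is recorded."""

    def __init__(self, active: str) -> None:
        self.active = active

    def set_active(self, name: str) -> None:
        self.active = name


def promote(
    registry: ImageRegistry,
    candidate: str,
    checks: Iterable[Callable[[str], bool]],
) -> bool:
    """Make `candidate` the active image only if every check passes.

    On the first failing check the function returns False and the
    registry is left untouched, so the previous active image stays in use.
    """
    for check in checks:
        if not check(candidate):
            return False
    registry.set_active(candidate)
    return True


# Usage with placeholder checks (assumptions, not real tests).
def boots(image: str) -> bool:
    return True


def always_fails(image: str) -> bool:
    return False


registry = ImageRegistry(active="ci-image-old")
promote(registry, "ci-image-candidate", [boots, always_fails])
print(registry.active)  # still the old image, because one check failed
promote(registry, "ci-image-candidate", [boots])
print(registry.active)  # now the candidate
```

Keeping the switch of the active image as the very last step means a failed or aborted test run cannot leave CI pointing at an unverified image.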

Note: Jenkins also offers an Artifactory plugin that supports promotion logic out of the box, which could be investigated. Not sure if the same exists for the OpenStack plugin.

By implementing these changes, we can increase the reliability and safety of our image retention process, improve coordination among team members triggering builds, and reduce the risk of active image overwriting and build failures. Testing new images before they become active will ensure they are reliable and functional, providing a smoother and more predictable CI/CD process.

@metal3-io-bot metal3-io-bot added the needs-triage Indicates an issue lacks a `triage/foo` label and requires one. label May 21, 2024
@tuminoid
Member

Absolutely +1 from me.

/triage accepted

@metal3-io-bot metal3-io-bot added triage/accepted Indicates an issue is ready to be actively worked on. and removed needs-triage Indicates an issue lacks a `triage/foo` label and requires one. labels May 21, 2024
@Sunnatillo
Member

Thank you for creating the issue. This would be a great improvement.

@metal3-io-bot
Collaborator

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues will close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

/lifecycle stale

@metal3-io-bot metal3-io-bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Aug 20, 2024
@tuminoid
Member

/remove-lifecycle stale

@metal3-io-bot metal3-io-bot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Aug 20, 2024
@tuminoid
Member

/lifecycle frozen

@metal3-io-bot metal3-io-bot added the lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. label Aug 20, 2024
Projects
Status: Backlog
Development

No branches or pull requests

4 participants