
feat(RELEASE-1283): process-file-updates resources #712

Merged 1 commit into konflux-ci:stable on Nov 26, 2024

Conversation

@scoheb (Collaborator) commented Nov 25, 2024

  • new pipeline and task for internal process-file-updates

Signed-off-by: Scott Hebert <scoheb@gmail.com>
@scoheb requested a review from a team as a code owner on November 25, 2024 at 21:56
@scoheb (Collaborator, Author) commented Nov 26, 2024

Force merging since the RHOAI team needs to start testing their add-on in the stage environment.

@scoheb merged commit 5ecc457 into konflux-ci:stable on Nov 26, 2024
5 checks passed
@mmalina (Contributor) commented Nov 26, 2024

This is not good. You're not supposed to merge anything to stable like this. Stable is supposed to be in sync with production. This will break promotion to production tomorrow, because we just do a fast-forward (git push without -f): https://github.com/konflux-ci/release-service-catalog/blob/development/.github/scripts/promote_branch.sh#L157
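
For readers unfamiliar with the promotion mechanics, here is a minimal sketch of why a plain (non-forced) push is rejected once the target branch has diverged. The branch names and push direction are placeholders, not taken from promote_branch.sh:

```sh
# Illustrative sketch only: the real branch names and promotion direction live
# in promote_branch.sh; 'source' and 'target' below are placeholders.
git fetch origin
# Promotion does a plain push (no --force), which git accepts only as a
# fast-forward, i.e. only while the current tip of 'target' is an ancestor
# of the tip of 'source'.
git push origin origin/source:refs/heads/target
# A commit merged directly into 'target' outside this flow is not reachable
# from 'source', so git rejects the push with a "non-fast-forward" error.
```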

@scoheb (Collaborator, Author) commented Nov 26, 2024

> This is not good. You're not supposed to merge anything to stable like this. Stable is supposed to be in sync with production. This will break promotion to production tomorrow, because we just do a fast-forward (git push without -f): https://github.com/konflux-ci/release-service-catalog/blob/development/.github/scripts/promote_branch.sh#L157

My bad. I did the same type of thing when going to the internal branch.

Is there some other way that we can be more tolerant of hotfixes or patches that are needed prior to the scheduled promotion? There is no way with internal-services to allow users to use sanctioned content from the trifecta of branches like we do in the catalog.

@johnbieren (Collaborator)

How do we get to a point where we can make users wait for more than one day to get a new feature they requested? I've brought this up in two meetings now and it was a frequent topic at our F2F. We are just merging things to unblock a user and creating technical debt. I keep hearing "oh this is just for high priority product X" and there is a new product X every week. If they want bleeding edge, put them on the staging SRE cluster and use the development branch. Why do we even have the staging SRE cluster if we aren't using it to enable users to use things before they are going to production?

@scoheb (Collaborator, Author) commented Nov 26, 2024

> Why do we even have the staging SRE cluster if we aren't using it to enable users to use things before they are going to production?

For:

  • e2e tests
  • manual integration testing

Right now, we are limited in how we can allow access to the staging SRE cluster from a prod Konflux cluster. To achieve this, you'd have to:

  1. set up a new deployment on the staging SRE cluster that connects to the prod Konflux cluster, and
  2. add that managed workspace to the allow list, at which point
  3. the internal services controller would start acting on every single IR coming from the Konflux prod cluster... meaning we would start signing everything twice. Not feasible or desirable.

If you have any ideas, I am all ears.

> How do we get to a point where we can make users wait for more than one day to get a new feature they requested?

Why one day? You'd have to wait maybe a week if the window has passed.

This is not some willy-nilly, untested new feature we are trying to get into prod. It was battle-tested on the stage p01 Konflux cluster and has soaked in internal/staging for at least a week.

But if you want to generalize, we can go that route: nothing gets to prod outside the promotion window.

@johnbieren (Collaborator)

> Why do we even have the staging SRE cluster if we aren't using it to enable users to use things before they are going to production?
>
> For:
>
>   • e2e tests
>   • manual integration testing
>
> Right now, we are limited in how we can allow access to the staging SRE cluster from a prod Konflux cluster. To achieve this, you'd have to:
>
>   1. set up a new deployment on the staging SRE cluster that connects to the prod Konflux cluster, and
>   2. add that managed workspace to the allow list, at which point
>   3. the internal services controller would start acting on every single IR coming from the Konflux prod cluster... meaning we would start signing everything twice. Not feasible or desirable.

? The e2e tests run in staging. They use internal requests. What do you mean we would need internal services to act on IRs from the prod cluster? I am saying put them on stage.

> If you have any ideas, I am all ears.

> How do we get to a point where we can make users wait for more than one day to get a new feature they requested?
>
> Why one day? You'd have to wait maybe a week if the window has passed.

Today is Tuesday. We promote on Wednesdays.

> This is not some willy-nilly, untested new feature we are trying to get into prod. It was battle-tested on the stage p01 Konflux cluster and has soaked in internal/staging for at least a week.

So we can battle-test this code on the stage cluster, but we can't put this product team on the stage cluster for a short amount of time? I don't understand that.

> But if you want to generalize, we can go that route: nothing gets to prod outside the promotion window.

I've never been part of a product that force-merges to production without review at 9pm. Sure, maybe it doesn't break anything. What if you made a typo? What if you pulled in an extra commit by accident? It could break just e2e. It could break users. Who knows? But then we will have an extra support load the next day if anything went wrong, because users will complain. And making a mistake can certainly happen, which is the whole point of getting a review in the first place. As is, this breaks our promotion tomorrow, as Martin mentioned. So this is a perfect example of why this type of thing shouldn't happen.

@mmalina (Contributor) commented Nov 26, 2024

> ? The e2e tests run in staging. They use internal requests. What do you mean we would need internal services to act on IRs from the prod cluster? I am saying put them on stage.

I think you're missing one thing: that doesn't work. If you want to use the stage app sre cluster for IRs in prod Konflux, that just won't work; the stage app sre cluster doesn't listen for IRs in prod Konflux.

But I have a different question (I might have asked before): is it really not possible to use git resolvers in the app sre clusters? That would solve our problem; you could then use the development branch of the catalog, including all internal resources.

@johnbieren (Collaborator)

> ? The e2e tests run in staging. They use internal requests. What do you mean we would need internal services to act on IRs from the prod cluster? I am saying put them on stage.
>
> I think you're missing one thing: that doesn't work. If you want to use the stage app sre cluster for IRs in prod Konflux, that just won't work; the stage app sre cluster doesn't listen for IRs in prod Konflux.

No. I am saying nothing about prod Konflux. I am saying put them on stage. And stage Konflux has IRs, or else our e2e tests would never pass.

> But I have a different question (I might have asked before): is it really not possible to use git resolvers in the app sre clusters? That would solve our problem; you could then use the development branch of the catalog, including all internal resources.

@scoheb (Collaborator, Author) commented Nov 26, 2024

> I am saying put them on stage.

So that means having the product team onboard to a new cluster, merge a whole bunch of onboarding PRs, re-add all their secrets that were created manually, and re-create ALL their builds (since snapshots are pushed to a different user-workloads location on stage clusters)? Only to then redo all of it one week later once the stable internal release catalog is promoted. Not a pretty user experience.

> Today is Tuesday. We promote on Wednesdays.

And I was generalizing. This could have gone down on a Thursday, for example.

> I've never been part of a product that force-merges to production without review at 9pm. Sure, maybe it doesn't break anything. What if you made a typo? What if you pulled in an extra commit by accident? It could break just e2e. It could break users. Who knows? But then we will have an extra support load the next day if anything went wrong, because users will complain. And making a mistake can certainly happen, which is the whole point of getting a review in the first place. As is, this breaks our promotion tomorrow, as Martin mentioned. So this is a perfect example of why this type of thing shouldn't happen.

And I apologized first thing in my comment.

My comments were possibly misconstrued.

I was trying to get at what Martin was saying: how can we then use the development branch of the catalog, including all internal resources?

Seems like you use git resolvers now: https://console-openshift-console.apps.rosa.appsrep09ue1.03r5.p3.openshiftapps.com/k8s/ns/openshift-pipelines/configmaps/resolvers-feature-flags/yaml
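
As a quick sanity check (a sketch that assumes cluster access and that the namespace in the console URL above is where the resolver flags live), the flag can be read directly:

```sh
# Read the Tekton resolver feature flags; adjust the namespace if your
# install differs. Prints "true" when the git resolver is enabled.
oc -n openshift-pipelines get configmap resolvers-feature-flags \
  -o jsonpath='{.data.enable-git-resolver}{"\n"}'
```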

@johnbieren (Collaborator)

> I am saying put them on stage.
>
> So that means having the product team onboard to a new cluster, merge a whole bunch of onboarding PRs, re-add all their secrets that were created manually, and re-create ALL their builds (since snapshots are pushed to a different user-workloads location on stage clusters)? Only to then redo all of it one week later once the stable internal release catalog is promoted. Not a pretty user experience.

Well, this use case has to be figured out. As far as I have been told, it is my responsibility as tech lead to ensure that only good code goes into our repos, code that does not introduce tech debt and that considers future requirements. Force-merging things into production goes directly against this, IMO.

> Today is Tuesday. We promote on Wednesdays.
>
> And I was generalizing. This could have gone down on a Thursday, for example.
>
> I've never been part of a product that force-merges to production without review at 9pm. Sure, maybe it doesn't break anything. What if you made a typo? What if you pulled in an extra commit by accident? It could break just e2e. It could break users. Who knows? But then we will have an extra support load the next day if anything went wrong, because users will complain. And making a mistake can certainly happen, which is the whole point of getting a review in the first place. As is, this breaks our promotion tomorrow, as Martin mentioned. So this is a perfect example of why this type of thing shouldn't happen.
>
> And I apologized first thing in my comment.
>
> My comments were possibly misconstrued.
>
> I was trying to get at what Martin was saying: how can we then use the development branch of the catalog, including all internal resources?
>
> Seems like you use git resolvers now: https://console-openshift-console.apps.rosa.appsrep09ue1.03r5.p3.openshiftapps.com/k8s/ns/openshift-pipelines/configmaps/resolvers-feature-flags/yaml

@mmalina (Contributor) commented Nov 27, 2024

> Seems like you use git resolvers now: https://console-openshift-console.apps.rosa.appsrep09ue1.03r5.p3.openshiftapps.com/k8s/ns/openshift-pipelines/configmaps/resolvers-feature-flags/yaml

OK, that looks promising. If we could use git resolvers, then I think that would solve this problem: that way you could use the development branch in prod Konflux and the prod app sre cluster. @johnbieren should we explore that?

@johnbieren (Collaborator)

> Seems like you use git resolvers now: https://console-openshift-console.apps.rosa.appsrep09ue1.03r5.p3.openshiftapps.com/k8s/ns/openshift-pipelines/configmaps/resolvers-feature-flags/yaml
>
> OK, that looks promising. If we could use git resolvers, then I think that would solve this problem: that way you could use the development branch in prod Konflux and the prod app sre cluster. @johnbieren should we explore that?

We should test using git resolvers. It would be surprising to me if we are allowed to, considering the fight from SRE against us using a floating branch to define our Tekton resources instead of a specific commit.

@mmalina can you clarify? I am not sure I am following your proposal. Are you saying to add the taskGitRepo and taskGitRevision stuff to the internal-requests + internal-pipelines + internal-tasks and call them that way?

@mmalina (Contributor) commented Nov 27, 2024

> @mmalina can you clarify? I am not sure I am following your proposal. Are you saying to add the taskGitRepo and taskGitRevision stuff to the internal-requests + internal-pipelines + internal-tasks and call them that way?

Yes. This would require a change in the internal-services controller: it would then use a git resolver to reference the pipeline when creating the PLR, same as the release-controller. And of course you would also need them as pipeline params so they can be used for the task references as well.
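
To make the idea concrete, here is a hedged sketch of the kind of PipelineRun the internal-services controller could create if it adopted the git resolver. The repository URL, revision, pathInRepo value, and the taskGitRepo/taskGitRevision parameter names are illustrative assumptions, not what the controller emits today:

```sh
# Sketch only: all values below are illustrative, not the controller's
# actual output.
kubectl create -f - <<'EOF'
apiVersion: tekton.dev/v1
kind: PipelineRun
metadata:
  generateName: process-file-updates-
spec:
  pipelineRef:
    resolver: git           # resolve the pipeline straight from the catalog repo
    params:
      - name: url
        value: https://github.com/konflux-ci/release-service-catalog.git
      - name: revision
        value: development   # whatever branch the InternalRequest asked for
      - name: pathInRepo
        value: pipelines/internal/process-file-updates/process-file-updates.yaml
  params:
    # Passed through so the pipeline's own taskRefs can pull tasks from the
    # same repo and revision via git resolvers as well.
    - name: taskGitRepo
      value: https://github.com/konflux-ci/release-service-catalog.git
    - name: taskGitRevision
      value: development
EOF
```

That would let an InternalRequest pin any branch of the catalog, which is what would allow a team to ride the development branch ahead of the promotion window.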

@johnbieren (Collaborator)

> @mmalina can you clarify? I am not sure I am following your proposal. Are you saying to add the taskGitRepo and taskGitRevision stuff to the internal-requests + internal-pipelines + internal-tasks and call them that way?
>
> Yes. This would require a change in the internal-services controller: it would then use a git resolver to reference the pipeline when creating the PLR, same as the release-controller. And of course you would also need them as pipeline params so they can be used for the task references as well.

I am cool with this approach. I don't see any way to enable the use case Scott wants. Do you want to create the ticket, or should I? I think it can all be one ticket. It is a lot of things, but they are all small and easy things.

@johnbieren (Collaborator)

This depends on git resolvers truly working in the app sre clusters, of course.

@mmalina (Contributor) commented Dec 3, 2024

> @mmalina can you clarify? I am not sure I am following your proposal. Are you saying to add the taskGitRepo and taskGitRevision stuff to the internal-requests + internal-pipelines + internal-tasks and call them that way?
>
> Yes. This would require a change in the internal-services controller: it would then use a git resolver to reference the pipeline when creating the PLR, same as the release-controller. And of course you would also need them as pipeline params so they can be used for the task references as well.
>
> I am cool with this approach. I don't see any way to enable the use case Scott wants. Do you want to create the ticket, or should I? I think it can all be one ticket. It is a lot of things, but they are all small and easy things.

I created a Jira to explore using the git resolver in the app sre clusters: https://issues.redhat.com/browse/RELEASE-1311

If it works, we can follow up and create more Jiras for the actual changes.

@johnbieren (Collaborator)

> @mmalina can you clarify? I am not sure I am following your proposal. Are you saying to add the taskGitRepo and taskGitRevision stuff to the internal-requests + internal-pipelines + internal-tasks and call them that way?
>
> Yes. This would require a change in the internal-services controller: it would then use a git resolver to reference the pipeline when creating the PLR, same as the release-controller. And of course you would also need them as pipeline params so they can be used for the task references as well.
>
> I am cool with this approach. I don't see any way to enable the use case Scott wants. Do you want to create the ticket, or should I? I think it can all be one ticket. It is a lot of things, but they are all small and easy things.
>
> I created a Jira to explore using the git resolver in the app sre clusters: https://issues.redhat.com/browse/RELEASE-1311
>
> If it works, we can follow up and create more Jiras for the actual changes.

#721 is actually going to test using git resolvers in the internal clusters, as the simple-signing-pipeline uses git resolvers and I am putting that on the internal clusters with that PR.
