-
Notifications
You must be signed in to change notification settings - Fork 214
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support uninstall through the installation apply command #1851
Support uninstall through the installation apply command #1851
Conversation
cfbcb06
to
f8747ff
Compare
f8747ff
to
00d8549
Compare
/azp run porter-integration |
Azure Pipelines successfully started running 1 pipeline(s). |
Allow someone to manage the entire installation lifecycle with a file, and not have to jump back to using uninstall. That causes problems when people are scripting porter calls based on what's in a repo, for example, they need to detect deleted files and treat them differently. Also if the bundle is already uninstalled, calling uninstall twice results in an error, whereas apply does the right thing. Signed-off-by: Carolyn Van Slyck <me@carolynvanslyck.com>
00d8549
to
40a7dd6
Compare
/azp run porter-integration |
Azure Pipelines successfully started running 1 pipeline(s). |
insync, err := p.IsInstallationInSync(p.RootContext, i, &run, upgradeOpts) | ||
require.NoError(t, err) | ||
assert.True(t, insync) | ||
// Nothing is printed out in this case, the calling function will print "up-to-date" for us |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if it's still a good idea to assert this behavior so we know when it changes in the future
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Other tests check for the "up-to-date" message that is printed. We can't check for it in this unit test because the function that prints it isn't called.
Here is my sad attempt at visualizing the call tree and who prints what
ApplyInstallation()
-> IsInstallationInSync()
prints if it should be triggered
performs extra logic and prints if the installation is up to date or what action will be executed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. That make sense. What about asserting that nothing is printed out, so we can confirm this expected behavior in the test?
assert.Contains(t, p.TestConfig.TestContext.GetError(), "")
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When I went to add that check I discovered that other functions end up printing to stderr. I don't think it's worth asserting that nothing is printed in this case, it would just make the test more fragile.
I was checking for the existence of the "triggering" and "up-to-date" because they help give the user context into what's going on, and why things happened or didn't. But checking for the lack of printing something is something that is much easier to break when we make changes, like adding more log statements elsewhere. And elsewhere in our tests we do check for "up-to-date", so I feel like we have this code path covered pretty well already.
Uninstall is now triggered when a CRD is deleted OR if spec.active=false. In either case we have the porter-agent call `porter installation apply`. This requires an unreleased build of porter with support for uninstalling a bundle when active=false See getporter/porter#1851. I have also updated our logging to use log.V(level) to make it easier to filter to the type of info. 5=trace, 4=debug, 3=system state, 2=app state, 1=info, 0=error. I borrowed this from another blog/k8s project but am having trouble finding it. I'm using constants so if we need to switch what the levels mean we can later. Similarly I have added trace statements so we can figure out what happened when things go wrong. Eventually we need to add instrumentation (see getporter#58 ). This PR has lots of general improvements to the operator that became relevant while implementing uninstall: * Use a well-known annotation to trigger reconcile again: `porter.sh/retry`. We are now adding that value as a label to the agent job so that we can retry without accidentally picking up the previous job if it's still there. * Use generation instead of resource version to detect changes. Resource version is updated anytime the entire installation is changed, while generation only updates when the spec changes or when the resource is deleted. * Populate the installation status indicating if we have scheduled an agent, it's running, succeeded or failed. Emiting events will be in a separate PR. There isn't more detailed information about how the run went or what action porter took. For simplicity, and to avoid data sync issues with Porter's database, the installation status is about the porter agent job only. * We reset the status when a new change is made to the installation (we aren't keeping run history for now). * Split out the reconcile logic into smaller unit-tested functions. These tests check that we are creating resources properly and that reconcile only makes a single change then requeues. * The integration test used to check a lot of details but since it's easier to check in unit tests, and the integration test is more for "does the whole thing work?" I've simplified the checks in the integration test. Signed-off-by: Carolyn Van Slyck <me@carolynvanslyck.com>
Uninstall is now triggered when a CRD is deleted OR if spec.active=false. In either case we have the porter-agent call `porter installation apply`. This requires an unreleased build of porter with support for uninstalling a bundle when active=false See getporter/porter#1851. I have also updated our logging to use log.V(level) to make it easier to filter to the type of info. 5=trace, 4=debug, 3=system state, 2=app state, 1=info, 0=error. I borrowed this from another blog/k8s project but am having trouble finding it. I'm using constants so if we need to switch what the levels mean we can later. Similarly I have added trace statements so we can figure out what happened when things go wrong. Eventually we need to add instrumentation (see getporter#58 ). This PR has lots of general improvements to the operator that became relevant while implementing uninstall: * Use a well-known annotation to trigger reconcile again: `porter.sh/retry`. We are now adding that value as a label to the agent job so that we can retry without accidentally picking up the previous job if it's still there. * Use generation instead of resource version to detect changes. Resource version is updated anytime the entire installation is changed, while generation only updates when the spec changes or when the resource is deleted. * Populate the installation status indicating if we have scheduled an agent, it's running, succeeded or failed. Emiting events will be in a separate PR. There isn't more detailed information about how the run went or what action porter took. For simplicity, and to avoid data sync issues with Porter's database, the installation status is about the porter agent job only. * We reset the status when a new change is made to the installation (we aren't keeping run history for now). * Split out the reconcile logic into smaller unit-tested functions. These tests check that we are creating resources properly and that reconcile only makes a single change then requeues. * The integration test used to check a lot of details but since it's easier to check in unit tests, and the integration test is more for "does the whole thing work?" I've simplified the checks in the integration test. Signed-off-by: Carolyn Van Slyck <me@carolynvanslyck.com>
Signed-off-by: Carolyn Van Slyck <me@carolynvanslyck.com>
023bb95
to
f8b83fe
Compare
LGTM |
@VinozzZ I'll make sure to review the commit message when I squash and merge. |
/azp run porter-integration |
Azure Pipelines successfully started running 1 pipeline(s). |
Two pull requests, getporter#1864 and getporter#1851, were merged, and while it didn't cause a merge conflict, they didn't play well together and it broke the build. Signed-off-by: Carolyn Van Slyck <me@carolynvanslyck.com>
* Support uninstall, installation status Uninstall is now triggered when a CRD is deleted OR if spec.active=false. In either case we have the porter-agent call `porter installation apply`. This requires an unreleased build of porter with support for uninstalling a bundle when active=false See getporter/porter#1851. I have also updated our logging to use log.V(level) to make it easier to filter to the type of info. 5=trace, 4=debug, 3=system state, 2=app state, 1=info, 0=error. I borrowed this from another blog/k8s project but am having trouble finding it. I'm using constants so if we need to switch what the levels mean we can later. Similarly I have added trace statements so we can figure out what happened when things go wrong. Eventually we need to add instrumentation (see #58 ). This PR has lots of general improvements to the operator that became relevant while implementing uninstall: * Use a well-known annotation to trigger reconcile again: `porter.sh/retry`. We are now adding that value as a label to the agent job so that we can retry without accidentally picking up the previous job if it's still there. * Use generation instead of resource version to detect changes. Resource version is updated anytime the entire installation is changed, while generation only updates when the spec changes or when the resource is deleted. * Populate the installation status indicating if we have scheduled an agent, it's running, succeeded or failed. Emiting events will be in a separate PR. There isn't more detailed information about how the run went or what action porter took. For simplicity, and to avoid data sync issues with Porter's database, the installation status is about the porter agent job only. * We reset the status when a new change is made to the installation (we aren't keeping run history for now). * Split out the reconcile logic into smaller unit-tested functions. These tests check that we are creating resources properly and that reconcile only makes a single change then requeues. * The integration test used to check a lot of details but since it's easier to check in unit tests, and the integration test is more for "does the whole thing work?" I've simplified the checks in the integration test. Signed-off-by: Carolyn Van Slyck <me@carolynvanslyck.com> * Use uninstalled instead of active * Also bump porter reference to pick up fix for local registries Signed-off-by: Carolyn Van Slyck <me@carolynvanslyck.com> * Fix godoc Signed-off-by: Carolyn Van Slyck <me@carolynvanslyck.com>
) * Support uninstall through the installation apply command Allow someone to manage the entire installation lifecycle with a file, and not have to jump back to using uninstall. That causes problems when people are scripting porter calls based on what's in a repo, for example, they need to detect deleted files and treat them differently. Also if the bundle is already uninstalled, calling uninstall twice results in an error, whereas apply does the right thing. Signed-off-by: Carolyn Van Slyck <me@carolynvanslyck.com> * Switch from active to uninstalled flag on installation Signed-off-by: Carolyn Van Slyck <me@carolynvanslyck.com> Signed-off-by: joshuabezaleel <joshua.bezaleel@gmail.com>
Two pull requests, getporter#1864 and getporter#1851, were merged, and while it didn't cause a merge conflict, they didn't play well together and it broke the build. Signed-off-by: Carolyn Van Slyck <me@carolynvanslyck.com> Signed-off-by: joshuabezaleel <joshua.bezaleel@gmail.com>
What does this change
Allow someone to manage the entire installation lifecycle with a file, and not have to jump back to using uninstall. That causes problems when people are scripting porter calls based on what's in a repo, for example, they need to detect deleted files and treat them differently. Also if the bundle is already uninstalled, calling uninstall twice results in an error, whereas apply does the right thing.
Also as part of this I moved two fields, created and modified, that are really status and not part of the installation's desired state.
What issue does it fix
Part of getporter/operator#27
Notes for the reviewer
I'd like your feedback on the new field
active
. If you have ideas for a more clear name, let me know.Also I realize that maybe this sets people up for tripping over "I made an installation file (but forget to set active: true) and now it's not installing and I don't know why". Perhaps there is a better name that we can use where defaulting to false, yields the more obvious behavior. So that if you omit this field, it still installs normally.
I could flip it to
inactive: true
so that you have to explicitly add that field to tell porter to delete it? But I'm not sure that people will get what inactive means... Suggestions welcome! 😅Checklist