Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support uninstall through the installation apply command #1851

Merged
merged 2 commits into from
Jan 26, 2022

Conversation

carolynvs
Copy link
Member

@carolynvs carolynvs commented Jan 6, 2022

What does this change

Allow someone to manage the entire installation lifecycle with a file, and not have to jump back to using uninstall. That causes problems when people are scripting porter calls based on what's in a repo, for example, they need to detect deleted files and treat them differently. Also if the bundle is already uninstalled, calling uninstall twice results in an error, whereas apply does the right thing.

Also as part of this I moved two fields, created and modified, that are really status and not part of the installation's desired state.

What issue does it fix

Part of getporter/operator#27

Notes for the reviewer

I'd like your feedback on the new field active. If you have ideas for a more clear name, let me know.

Also I realize that maybe this sets people up for tripping over "I made an installation file (but forget to set active: true) and now it's not installing and I don't know why". Perhaps there is a better name that we can use where defaulting to false, yields the more obvious behavior. So that if you omit this field, it still installs normally.

I could flip it to inactive: true so that you have to explicitly add that field to tell porter to delete it? But I'm not sure that people will get what inactive means... Suggestions welcome! 😅

Checklist

  • Unit Tests
  • Documentation
  • Schema (resource schema files)

@carolynvs
Copy link
Member Author

/azp run porter-integration

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

Allow someone to manage the entire installation lifecycle with a file,
and not have to jump back to using uninstall. That causes problems when
people are scripting porter calls based on what's in a repo, for
example, they need to detect deleted files and treat them differently.
Also if the bundle is already uninstalled, calling uninstall twice
results in an error, whereas apply does the right thing.

Signed-off-by: Carolyn Van Slyck <me@carolynvanslyck.com>
@carolynvs carolynvs force-pushed the uninstall-through-apply branch from 00d8549 to 40a7dd6 Compare January 18, 2022 20:06
@carolynvs
Copy link
Member Author

/azp run porter-integration

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@carolynvs carolynvs marked this pull request as ready for review January 20, 2022 21:11
@carolynvs carolynvs requested a review from VinozzZ January 20, 2022 21:15
insync, err := p.IsInstallationInSync(p.RootContext, i, &run, upgradeOpts)
require.NoError(t, err)
assert.True(t, insync)
// Nothing is printed out in this case, the calling function will print "up-to-date" for us
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if it's still a good idea to assert this behavior so we know when it changes in the future

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other tests check for the "up-to-date" message that is printed. We can't check for it in this unit test because the function that prints it isn't called.

Here is my sad attempt at visualizing the call tree and who prints what

ApplyInstallation()
  -> IsInstallationInSync()
          prints if it should be triggered
  performs extra logic and prints if the installation is up to date or what action will be executed

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. That make sense. What about asserting that nothing is printed out, so we can confirm this expected behavior in the test?

assert.Contains(t, p.TestConfig.TestContext.GetError(), "")

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I went to add that check I discovered that other functions end up printing to stderr. I don't think it's worth asserting that nothing is printed in this case, it would just make the test more fragile.

I was checking for the existence of the "triggering" and "up-to-date" because they help give the user context into what's going on, and why things happened or didn't. But checking for the lack of printing something is something that is much easier to break when we make changes, like adding more log statements elsewhere. And elsewhere in our tests we do check for "up-to-date", so I feel like we have this code path covered pretty well already.

carolynvs added a commit to carolynvs/porter-operator that referenced this pull request Jan 21, 2022
Uninstall is now triggered when a CRD is deleted OR if spec.active=false. In either case we have the porter-agent call `porter installation apply`. This requires an unreleased build of porter with support for uninstalling a bundle when active=false See getporter/porter#1851.

I have also updated our logging to use log.V(level) to make it easier to filter to the type of info. 5=trace, 4=debug, 3=system state, 2=app state, 1=info, 0=error. I borrowed this from another blog/k8s project but am having trouble finding it. I'm using constants so if we need to switch what the levels mean we can later. Similarly I have added trace statements so we can figure out what happened when things go wrong. Eventually we need to add instrumentation (see getporter#58 ).

This PR has lots of general improvements to the operator that became relevant while implementing uninstall:
* Use a well-known annotation to trigger reconcile again: `porter.sh/retry`. We are now adding that value as a label to the agent job so that we can retry without accidentally picking up the previous job if it's still there.
* Use generation instead of resource version to detect changes. Resource version is updated anytime the entire installation is changed, while generation only updates when the spec changes or when the resource is deleted.
* Populate the installation status indicating if we have scheduled an agent, it's running, succeeded or failed. Emiting events will be in a separate PR. There isn't more detailed information about how the run went or what action porter took. For simplicity, and to avoid data sync issues with Porter's database, the installation status is about the porter agent job only.
* We reset the status when a new change is made to the installation (we aren't keeping run history for now).
* Split out the reconcile logic into smaller unit-tested functions. These tests check that we are creating resources properly and that reconcile only makes a single change then requeues.
* The integration test used to check a lot of details but since it's easier to check in unit tests, and the integration test is more for "does the whole thing work?" I've simplified the checks in the integration test.

Signed-off-by: Carolyn Van Slyck <me@carolynvanslyck.com>
carolynvs added a commit to carolynvs/porter-operator that referenced this pull request Jan 21, 2022
Uninstall is now triggered when a CRD is deleted OR if spec.active=false. In either case we have the porter-agent call `porter installation apply`. This requires an unreleased build of porter with support for uninstalling a bundle when active=false See getporter/porter#1851.

I have also updated our logging to use log.V(level) to make it easier to filter to the type of info. 5=trace, 4=debug, 3=system state, 2=app state, 1=info, 0=error. I borrowed this from another blog/k8s project but am having trouble finding it. I'm using constants so if we need to switch what the levels mean we can later. Similarly I have added trace statements so we can figure out what happened when things go wrong. Eventually we need to add instrumentation (see getporter#58 ).

This PR has lots of general improvements to the operator that became relevant while implementing uninstall:
* Use a well-known annotation to trigger reconcile again: `porter.sh/retry`. We are now adding that value as a label to the agent job so that we can retry without accidentally picking up the previous job if it's still there.
* Use generation instead of resource version to detect changes. Resource version is updated anytime the entire installation is changed, while generation only updates when the spec changes or when the resource is deleted.
* Populate the installation status indicating if we have scheduled an agent, it's running, succeeded or failed. Emiting events will be in a separate PR. There isn't more detailed information about how the run went or what action porter took. For simplicity, and to avoid data sync issues with Porter's database, the installation status is about the porter agent job only.
* We reset the status when a new change is made to the installation (we aren't keeping run history for now).
* Split out the reconcile logic into smaller unit-tested functions. These tests check that we are creating resources properly and that reconcile only makes a single change then requeues.
* The integration test used to check a lot of details but since it's easier to check in unit tests, and the integration test is more for "does the whole thing work?" I've simplified the checks in the integration test.

Signed-off-by: Carolyn Van Slyck <me@carolynvanslyck.com>
Signed-off-by: Carolyn Van Slyck <me@carolynvanslyck.com>
@carolynvs carolynvs force-pushed the uninstall-through-apply branch from 023bb95 to f8b83fe Compare January 24, 2022 19:56
@VinozzZ
Copy link
Contributor

VinozzZ commented Jan 25, 2022

LGTM
One nit about the commit message. Should we update it to reference uninstall instead of active in there as well?

@carolynvs
Copy link
Member Author

@VinozzZ I'll make sure to review the commit message when I squash and merge.

@carolynvs
Copy link
Member Author

/azp run porter-integration

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@carolynvs carolynvs merged commit e955679 into getporter:release/v1 Jan 26, 2022
@carolynvs carolynvs deleted the uninstall-through-apply branch January 26, 2022 16:41
carolynvs added a commit to carolynvs/porter that referenced this pull request Jan 26, 2022
Two pull requests, getporter#1864 and getporter#1851, were merged, and while it didn't cause a merge
conflict, they didn't play well together and it broke the build.

Signed-off-by: Carolyn Van Slyck <me@carolynvanslyck.com>
@carolynvs carolynvs mentioned this pull request Jan 26, 2022
4 tasks
carolynvs added a commit to getporter/operator that referenced this pull request Jan 26, 2022
* Support uninstall, installation status

Uninstall is now triggered when a CRD is deleted OR if spec.active=false. In either case we have the porter-agent call `porter installation apply`. This requires an unreleased build of porter with support for uninstalling a bundle when active=false See getporter/porter#1851.

I have also updated our logging to use log.V(level) to make it easier to filter to the type of info. 5=trace, 4=debug, 3=system state, 2=app state, 1=info, 0=error. I borrowed this from another blog/k8s project but am having trouble finding it. I'm using constants so if we need to switch what the levels mean we can later. Similarly I have added trace statements so we can figure out what happened when things go wrong. Eventually we need to add instrumentation (see #58 ).

This PR has lots of general improvements to the operator that became relevant while implementing uninstall:
* Use a well-known annotation to trigger reconcile again: `porter.sh/retry`. We are now adding that value as a label to the agent job so that we can retry without accidentally picking up the previous job if it's still there.
* Use generation instead of resource version to detect changes. Resource version is updated anytime the entire installation is changed, while generation only updates when the spec changes or when the resource is deleted.
* Populate the installation status indicating if we have scheduled an agent, it's running, succeeded or failed. Emiting events will be in a separate PR. There isn't more detailed information about how the run went or what action porter took. For simplicity, and to avoid data sync issues with Porter's database, the installation status is about the porter agent job only.
* We reset the status when a new change is made to the installation (we aren't keeping run history for now).
* Split out the reconcile logic into smaller unit-tested functions. These tests check that we are creating resources properly and that reconcile only makes a single change then requeues.
* The integration test used to check a lot of details but since it's easier to check in unit tests, and the integration test is more for "does the whole thing work?" I've simplified the checks in the integration test.

Signed-off-by: Carolyn Van Slyck <me@carolynvanslyck.com>

* Use uninstalled instead of active

* Also bump porter reference to pick up fix for local registries

Signed-off-by: Carolyn Van Slyck <me@carolynvanslyck.com>

* Fix godoc

Signed-off-by: Carolyn Van Slyck <me@carolynvanslyck.com>
joshuabezaleel pushed a commit to joshuabezaleel/porter that referenced this pull request Feb 8, 2022
)

* Support uninstall through the installation apply command

Allow someone to manage the entire installation lifecycle with a file,
and not have to jump back to using uninstall. That causes problems when
people are scripting porter calls based on what's in a repo, for
example, they need to detect deleted files and treat them differently.
Also if the bundle is already uninstalled, calling uninstall twice
results in an error, whereas apply does the right thing.

Signed-off-by: Carolyn Van Slyck <me@carolynvanslyck.com>

* Switch from active to uninstalled flag on installation

Signed-off-by: Carolyn Van Slyck <me@carolynvanslyck.com>
Signed-off-by: joshuabezaleel <joshua.bezaleel@gmail.com>
joshuabezaleel pushed a commit to joshuabezaleel/porter that referenced this pull request Feb 8, 2022
Two pull requests, getporter#1864 and getporter#1851, were merged, and while it didn't cause a merge
conflict, they didn't play well together and it broke the build.

Signed-off-by: Carolyn Van Slyck <me@carolynvanslyck.com>
Signed-off-by: joshuabezaleel <joshua.bezaleel@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants