-
Notifications
You must be signed in to change notification settings - Fork 47
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
resolve operators on catalog availability change #216
resolve operators on catalog availability change #216
Conversation
225d8b3
to
9eaf91f
Compare
controllers/catalog_predicates.go
Outdated
catalogReady, err := isCatalogReady(e.Object) | ||
if err != nil { | ||
fmt.Println(err) | ||
return false | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the catalog is not ready, is that something we would want to reflect in the status of the Operator (either now, or in the future)?
I think I'm of the opinion that we should just blindly forward along all CatalogSource events and deal with the state of it during reconciliation where we have a change to touch the Operator status if necessary.
WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Depending on whether we have a caching layer for catalog contents at the operator controller level, we may not want to reflect a non-ready state for catalogs at the operator-controller level. I don't think it makes sense to run the resolver to trigger on any catalog change especially if the catalog is not ready, but I agree that we can have that check moved further down the resolution process. We'd need to verify that none of our sources are stale/non-ready to avoid resolving unnecessarily.
467e627
to
555895f
Compare
} | ||
}) | ||
|
||
It("resolves again when a new catalog is available", func() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we can avoid all the long timeouts here by simplifying the test expectations. I think all we really want to assert is "whenever a catalog is created/updated/deleted, all of the existing operators are reconciled". Is there something that would show up in the Operator
CR that we could use to verify that a reconcile happened after the catalog event?
That might be organized something like this:
Describe("reconcile Operators on catalog events")
- BeforeEach() -> create 3 operators
- AfterEach() -> delete 3 operators
- When("catalog is created")
- It("reconciles 3 operators")
- When("catalog exists")
- BeforeEach() -> create a catalog
- AfterEach() -> delete the catalog
- When("catalog is updated")
- It("reconciles 3 operators")
- When("catalog is deleted")
- It("reconciles 3 operators")
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or maybe even this is too much to do in the e2e? Is there a way to test the controller and the watches without a full-blown e2e?
If so, we could verify all these scenarios there, and then just focus on the happy path in the e2e?
I'm honestly not sure what's the best option here, but I think we should really focus on keeping our overall e2e run as short (in terms of overall run time) as possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that it would be ideal if we could test this outside of the e2e tests, but as far as I am aware there isn't a good way to test watches being triggered without actually starting the controller and performing operations against a cluster that would trigger those watches. I think it could be done with envtest but would likely involve a lot of stubbing out resources and would essentially boil down to the question of "do we prefer complex unit tests or longer running e2e tests?"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using a minimal catalog cuts the wait down by a good chunk, so the e2e suite shouldn't be as long lived. I'm not sure if we'd want to expose resolution/reconcile details on the operator CR just for the sake of CI, in case it can be confusing to a user.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we're concerned about adding to the status API, how about examining the logs. We control the logs, and the tests can examine them for particular log entries. It wouldn't necessarily be exposing the logs as an API since it would be for e2e/unit tests, which can be modified when the logs are.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The changes in this PR boil down to an updated controller configuration with a new watch that adds some extra stuff to the workqueue and calls a reconciler. So I'm suggesting it might be possible to isolate that specific behavior if we refactored a little bit (e.g. what if instead of SetupWithManager
being a method on the reconciler, we made it a function that we could pass a reconciler to). In our main.go
, we'd pass the real reconciler, but in a test, we'd pass a fake reconciler that does the "did I get triggered" assertions
Then we could put some envtest-based unit tests together alongside the reconciler and we would essentially split the testing into two chunks:
- Does the controller configuration enqueue the expected reconcile requests?
- Does the reconciler handle reconcile requests in the expected way?
We'd still want an e2e that runs through some scenarios of the integrated/real main.go, but we would avoid the combinatoric problem here, which would save lots of time in e2e runs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how about examining the logs
Yeah +1 if the unit test path isn't taken. We do this in the SDK helm-operator e2e tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm opening another issue to track the refactoring #247. I don't think it makes sense to block this PR on the refactor, I'm in favor of having that in a separate PR.
4c372d1
to
9755169
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this generally looks good, but would like to see @joelanford's last comment addressed, whether any changes are made or not.
Not sure how I clicked "requested changes" instead of "approve"! |
Signed-off-by: Ankita Thomas <ankithom@redhat.com>
Signed-off-by: Ankita Thomas <ankithom@redhat.com>
Signed-off-by: Ankita Thomas <ankithom@redhat.com>
9755169
to
cd88cc7
Compare
Signed-off-by: Ankita Thomas <ankithom@redhat.com>
cd88cc7
to
67caff4
Compare
/approve |
) * resolve operators on catalog availability change Signed-off-by: Ankita Thomas <ankithom@redhat.com> * removing catalog state check for operator reconcile Signed-off-by: Ankita Thomas <ankithom@redhat.com> * e2e tests for catalog watch Signed-off-by: Ankita Thomas <ankithom@redhat.com> * use smaller test index image, cleanup after tests Signed-off-by: Ankita Thomas <ankithom@redhat.com> --------- Signed-off-by: Ankita Thomas <ankithom@redhat.com>
This PR triggers resolution whenever the available catalogsources on cluster change. Resolution is triggered for all operators on cluster when there is any change to a catalogsource.
See: #205