Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix resource creation/deletion after operator group config change #675

Closed
wants to merge 3 commits into from

Conversation

jpeeler
Copy link

@jpeeler jpeeler commented Jan 18, 2019

This fixes scenario(s) where CSVs were not being copied.

Commit message:
This changes the operator group target namespace format specifically
when watching all namespaces. Instead of only including "", the
annotation has been updated to additionally include all the namespaces
seen. The new behavior fixes syncing issues when a new namespace is
created after the initial CSVs have been copied, again only when
watching all namespaces.

Doesn't change format, just fixes OperatorGroup managed resources.

Requeuing changes:
When namespaces are updated, matching operator groups are requeued.
When operator group annotations change, CSVs in that namspace are
requeued.

ALM-665

@openshift-ci-robot openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jan 18, 2019
@openshift-ci-robot openshift-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Jan 18, 2019
@jpeeler
Copy link
Author

jpeeler commented Jan 18, 2019

/retest
And so it begins.

@jpeeler
Copy link
Author

jpeeler commented Jan 18, 2019

/retest

} else {
if selector == nil || selector.Empty() {
selector = labels.Everything()
// set is sorted after returning, so being at the beginning can be relied upon
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure I follow this comment - where are we using order of the set? (asking because maps don't retain any ordering)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, I should have specifically said slice instead. It's after being converted from a set to a slice in updateNamespaceList.

@@ -58,16 +58,25 @@ func (a *Operator) syncOperatorGroups(obj interface{}) error {
a.Log.Debug("Cluster roles completed")

for _, csv := range a.csvSet(op.Namespace, v1alpha1.CSVPhaseAny) {
origCSVannotations := csv.GetAnnotations()
origCSVannotations := a.copyOperatorGroupAnnotations(&csv.ObjectMeta)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why copy out the just the three annotations and then immediately set just those three below with addOperatorGroupAnnotations? If we aren't going to retain the other annotation values, then we can just directly build the set of annotations here and not use these copyOperatorGroupAnnotations/addOperatorGroupAnnotations methods?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea here wasn't to change the logic, but to fix the bug of not detecting annotation changes due to getting a reference instead of a copy. The new copyOperatorGroupAnnotations method ensures a copy is done so that the reflection test below works properly.

@@ -310,6 +314,32 @@ func (a *Operator) syncObject(obj interface{}) (syncError error) {
a.requeueOwnerCSVs(metaObj)
}

namespace, ok := obj.(*corev1.Namespace)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this may be too aggressive for resyncing.

The cases where we know we need to resync:

  • namespace is added/deleted
  • namespace's labels updated

This makes me think we may want to detect those cases directly by passing in specific handlers for those events.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we're looking at all three add/update/delete, is having separate handlers for each better? Or perhaps you're referring to something more granular that I don't know about?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the thing I really want to omit from processing is the case where the namespace is synced periodically, or when there has been some non-label-related change to a namespace

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If no namespace processing is done then the scenario I described in #675 (comment) will only eventually work, not instantly as I was hoping could be the case. Is that acceptable?

return
}
for _, opGroup := range operatorGroupList {
namespaceMap, err := a.getMatchingNamespaces(opGroup)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We shouldn't need to list all of the namespaces of the operator group to know if the one we're currently syncing should requeue an operatorgroup? Just need to compare the namespace list (in spec of opgroup) or the label selector on the operatorgroup / labels of the namespace.

We probably already have a function somewhere that does this, but don't we just need to do something like if ShouldContain(opGroup, namespace) { resync opgroup }?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The resyncing here is specifically to make an operator group status contain a properly updated namespace list. You are right though that less work is necessary here, so I'll look into fixing that.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I take back what I said about less work being necessary. I think it should be left as is.

@@ -98,7 +98,7 @@ func NewInstallModeSet(modes []InstallMode) (InstallModeSet, error) {
// the given operatorNamespace and list of target namespaces.
func (set InstallModeSet) Supports(operatorNamespace string, namespaces []string) error {
numNamespaces := len(namespaces)
if !set[InstallModeTypeAllNamespaces] && numNamespaces == 1 && namespaces[0] == v1.NamespaceAll {
if !set[InstallModeTypeAllNamespaces] && numNamespaces > 0 && namespaces[0] == v1.NamespaceAll {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain this change? Why does it help to list additional namespaces for NamespaceAll?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The exact scenario that is solved by doing this is after an operator group's status is written that is watching all namespaces, a new namespace creation isn't "noticed" without specifically listing all the namespaces seen.

@@ -293,7 +302,7 @@ func (a *Operator) ensureTenantRBAC(operatorNamespace, targetNamespace string, c

func (a *Operator) copyCsvToTargetNamespace(csv *v1alpha1.ClusterServiceVersion, operatorGroup *v1alpha2.OperatorGroup) error {
namespaces := make([]string, 0)
if len(operatorGroup.Status.Namespaces) == 1 && operatorGroup.Status.Namespaces[0] == corev1.NamespaceAll {
if operatorGroup.Status.Namespaces[0] == corev1.NamespaceAll {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice if we didn't have to implicitly rely on list ordering here

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought this was better than searching the string each time to see if "" was in the list. I couldn't figure out any naming issues with sorting not causing "" to be first.

njhale
njhale previously requested changes Jan 24, 2019
Copy link
Member

@njhale njhale left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome overall. Just a few small things:

pkg/controller/operators/olm/operator.go Outdated Show resolved Hide resolved
@njhale njhale dismissed their stale review January 24, 2019 15:50

Not relevant in the context of the proposed solution.

@jpeeler
Copy link
Author

jpeeler commented Jan 25, 2019

I've reworked this to NOT change the namespace status format in the case of watching all namespaces. I've verified that it works in the additional namespace scenario I've been testing with. However, it does not properly remove a CSV if a namespace is removed from an existing operator group (which I assume does not work properly with the current code either). Perhaps I should add these scenarios as e2es.

@jpeeler
Copy link
Author

jpeeler commented Jan 28, 2019

/retest

@openshift-ci-robot
Copy link
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jpeeler

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@jpeeler jpeeler changed the title fix(olm): change operator group target namespace format fix resource creation/deletion after operator group config change Jan 30, 2019
@@ -282,6 +282,16 @@ func (a *Operator) ensureTenantRBAC(operatorNamespace, targetNamespace string, c
return err
}
}
for _, ns := range pruneNamespaces {
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These RBAC changes and the ones below seem to be unnecessary. Why would that be?

@@ -58,16 +58,34 @@ func (a *Operator) syncOperatorGroups(obj interface{}) error {
a.Log.Debug("Cluster roles completed")

for _, csv := range a.csvSet(op.Namespace, v1alpha1.CSVPhaseAny) {
origCSVannotations := csv.GetAnnotations()
origCSVannotations := a.copyOperatorGroupAnnotations(&csv.ObjectMeta)
a.addOperatorGroupAnnotations(&csv.ObjectMeta, op, !csv.IsCopied())
if reflect.DeepEqual(origCSVannotations, csv.GetAnnotations()) == false {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will this do the right thing if there are other annotations on the csv? (Non-operatorgrou-related)

Copy link
Author

@jpeeler jpeeler Jan 31, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My previous comment made no sense. origCSVannotations is only used for comparing to operator group annotations. So it does do the right thing.

if err != nil {
return err
}
for _, ns := range namespaceObjs {
namespaces = append(namespaces, ns.GetName())
for _, csv := range fetchedCSVs {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should also verify that the CSV status is “copied” just to be safe

@alecmerdler
Copy link
Member

@jpeeler What does this PR still need before I give it a review?

@jpeeler
Copy link
Author

jpeeler commented Jan 31, 2019

@alecmerdler sorry missed your comment. just pushed a new revision, should be good to go at this point.

a.addOperatorGroupAnnotations(&csv.ObjectMeta, op, !csv.IsCopied())
if reflect.DeepEqual(origCSVannotations, csv.GetAnnotations()) == false {
// CRDs don't support strategic merge patching, but in the future if they do this should be updated to patch
if _, err := a.client.OperatorsV1alpha1().ClusterServiceVersions(csv.GetNamespace()).Update(csv); err != nil {
a.Log.Errorf("Update for existing CSV failed: %v", err)
} else {
if _, ok := origCSVannotations[v1alpha2.OperatorGroupAnnotationKey]; ok {
if err := a.csvQueueSet.Requeue(csv.GetName(), csv.GetNamespace()); err != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updating ObjectMeta on the CSV will enqueue it. I don't think you need to explicitly requeue.

Copy link
Author

@jpeeler jpeeler Feb 4, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This specifically handles the case when an operator group related annotation update hasn't occurred, but CSVs still need to be copied to newly created namespaces.

@jpeeler
Copy link
Author

jpeeler commented Feb 4, 2019

/retest

@jpeeler
Copy link
Author

jpeeler commented Feb 5, 2019

/hold
Even though this is all green, I think there's some test ordering that will still cause test failure.

@openshift-ci-robot openshift-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Feb 5, 2019
@jpeeler
Copy link
Author

jpeeler commented Feb 6, 2019

I believe the e2e issue I saw earlier has been fixed now (here's for hoping my local success is replicated). Not removing the hold though until I look over #701.

@jpeeler jpeeler mentioned this pull request Feb 7, 2019
@openshift-ci-robot openshift-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Feb 13, 2019
Jeff Peeler added 3 commits February 25, 2019 11:27
This adds requeuing for both the namespace and operator group sync
loops, which fixes syncing issues after initial CSVs have been copied.
This is a change necessary due to newly introduced functionality, but it
fixes a problem and ultimately is cleaner anyway. Now that there exists
the possibility of "invalid" CSVs being copied over existing valid CSVs,
save the copy step until a final CSV state has been reached.
@openshift-ci-robot openshift-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Feb 25, 2019
@jpeeler
Copy link
Author

jpeeler commented Feb 25, 2019

Please focus attention on syncClusterServiceVersion as the CSV copying behavior has been (necessarily) changed due to the rebase changes.

@jpeeler
Copy link
Author

jpeeler commented Feb 28, 2019

This needs to be rebased and we talked about instead of only copying CSVs when they enter the final state (in the last commit) to add validation beforehand instead.

@openshift-ci-robot
Copy link
Collaborator

@jpeeler: PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci-robot openshift-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 3, 2019
@sferich888
Copy link

@jpeeler can we close this in favor of #736

@jpeeler jpeeler closed this Mar 8, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants