-
Notifications
You must be signed in to change notification settings - Fork 94
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
✨ support work driver config for cluster manager. #381
✨ support work driver config for cluster manager. #381
Conversation
91137ad
to
d43748a
Compare
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #381 +/- ##
==========================================
+ Coverage 61.54% 62.44% +0.89%
==========================================
Files 133 134 +1
Lines 14078 11432 -2646
==========================================
- Hits 8665 7139 -1526
+ Misses 4664 3527 -1137
- Partials 749 766 +17
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
0433e14
to
691dc14
Compare
/unhold |
048830f
to
c5f756d
Compare
@@ -209,6 +216,17 @@ func (n *clusterManagerController) sync(ctx context.Context, controllerContext f | |||
} | |||
managementClient := n.operatorKubeClient // We assume that operator is always running on the management cluster. | |||
|
|||
// If the work driver is not kube, we need to sync the work driver config secret | |||
if workDriver != operatorapiv1.WorkDriverTypeKube { | |||
if err := helpers.SyncWorkConfigSecret(ctx, n.operatorKubeClient, clusterManagerNamespace); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there is already a sync secret func in library-go
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
replaced this with SyncSecret from library-go.
@@ -209,6 +216,17 @@ func (n *clusterManagerController) sync(ctx context.Context, controllerContext f | |||
} | |||
managementClient := n.operatorKubeClient // We assume that operator is always running on the management cluster. | |||
|
|||
// If the work driver is not kube, we need to sync the work driver config secret | |||
if workDriver != operatorapiv1.WorkDriverTypeKube { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what happens if clustermanager is create at first and then secret is created?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The operator will initially error, but subsequent reconcile will succeed after creating the secret.
Currently we don't watch the secret, do we need that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we need to reflect that the secret is missing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added condition to clustermanager CR's status to reflect that.
dea5883
to
63c8853
Compare
@@ -46,10 +46,15 @@ spec: | |||
args: | |||
- "/work" | |||
- "manager" | |||
- "--work-driver=kube" | |||
- "--work-driver={{ .WorkDriver }}" | |||
{{ if eq .WorkDriver "kube" }} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should be like this ?
{{ if .HostedMode }}
{{ if eq .WorkDriver "kube" }}
hostedMode only in the case workDirver == kube
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, per my understanding.
It's not clear how it works in hosted mode for other driver.
@@ -92,16 +97,28 @@ spec: | |||
volumeMounts: | |||
- name: tmpdir | |||
mountPath: /tmp | |||
{{ if eq .WorkDriver "kube" }} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same question:
hostedMode only in the case workDirver == kube
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same as above.
8e56daf
to
48a5f00
Compare
48a5f00
to
a559f7e
Compare
{{ if .HostedMode }} | ||
- "--kubeconfig=/var/run/secrets/hub/kubeconfig" | ||
{{ end }} | ||
{{ else }} | ||
- "--cloudevents-client-id=$(POD_NAME)" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure if using podname is good choice. The pod name is changed after upgrade. Should the client id be already in the config file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the client id is not in config file, it's in cloudevents source options, see: https://github.com/open-cluster-management-io/sdk-go/blob/7dbdd1b5c63da7b832705501185138a5e5c530ba/pkg/cloudevents/generic/options/mqtt/sourceoptions.go#L21
per my understanding, it's acceptable if the client is changed, since it's used by MQTT broker to differentiate clients, the key is to make sure each publisher/subscriber has different client id.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am more concerned about if client id is part of authentication/authz, which will be impacted by the change of client id. cc @skeeey
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I remembered for some MQTT broker, like the AWS IOT, we need this client ID do authn/authz, but we can use a wildcard in the authn/authz, I think you may need a fixed format ID here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there is link openshift-online/maestro#28 (comment) we have discussed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks, I updated this to "--cloudevents-client-id=work-controller-$(POD_NAME)"
with fixed prefix work-controller-
, so in the future the authn/zuthz can be donw with wildcard work-controller-*
.
pkg/operator/helpers/helpers.go
Outdated
} | ||
|
||
// GetOperatorNamespace is used to get the namespace where the operator is running. | ||
func GetOperatorNamespace() (string, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought we already get the namespace from the controllerContext
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the work-driver-config secret is supposed to be created in the namespace (open-cluster-management
by default) as the operator, not in the operand namespace (open-cluster-management-hub
by default), because the operand namespace may not exist before clustermanager is created.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, but you can get the namespace from controllerContext which get the namespace the same as what this func does.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh... I see, it's in command ControllerContext: https://github.com/openshift/library-go/blob/0f907b20fed14da2731dbb6a537e97c691579dc7/pkg/controller/controllercmd/builder.go#L41-L60
updated this.
} | ||
} | ||
} | ||
} else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why we need to remove it? The else section here seems unnecessary to me. we can just return a condition with true when kube-driver is set.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is used for case: non-kube driver is changed to kube
driver, the condition should be removed, make sense?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can always add the condition.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done, removed this part.
} | ||
} | ||
return err | ||
} else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
to many nested if. should refactor the code here.
Wouldn't SyncSecret return a notFound error? I think we do not need to get then sync.
directly sync and return error should be ok.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SyncSecret doesn't return an error when the source secret doesn't exist, that's why I added the check before SyncSecret. Also added comments at
Lines 227 to 229 in a559f7e
// check the secret containing work driver config explicitly because | |
// resourceapply.SyncSecret below won't return err if the source secret is not found | |
_, err = n.operatorKubeClient.CoreV1().Secrets(operatorNamespace).Get(ctx, helpers.WorkDriverConfig, metav1.GetOptions{}) |
@@ -75,6 +75,7 @@ func newClusterManager(name string) *operatorapiv1.ClusterManager { | |||
}, | |||
WorkConfiguration: &operatorapiv1.WorkConfiguration{ | |||
FeatureGates: []operatorapiv1.FeatureGate{featureGate}, | |||
WorkDriver: operatorapiv1.WorkDriverTypeKube, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ut on other type would be needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added UT for sync work driver config secret and condition check.
@@ -209,6 +217,71 @@ func (n *clusterManagerController) sync(ctx context.Context, controllerContext f | |||
} | |||
managementClient := n.operatorKubeClient // We assume that operator is always running on the management cluster. | |||
|
|||
if config.MWReplicaSetEnabled { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this if is not necessary, we should alway sync.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed this.
clusterManagerFinalizer = "operator.open-cluster-management.io/cluster-manager-cleanup" | ||
clusterManagerApplied = "Applied" | ||
clusterManagerProgressing = "Progressing" | ||
workDriverConfigSecretSynced = "WorkDriverConfigSecretSynced" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe just secret synced. We will need to also sync other secret for image pull credential later. These could be in the same condition, different reasons and messages.
FYI @zhiweiyin318
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated this to a general value: "SecretSynced"
ec24242
to
6fdf148
Compare
f6f5ed0
to
e9df745
Compare
3f80cd5
to
80940b5
Compare
Another look? @qiujian16 @zhiweiyin318 |
return err | ||
} | ||
// TODO: set condition to indicate if the secret is synced successfully | ||
// check the source secret exist before syncing and set condition if not exist. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think @qiujian16's comment is to add a separated reconciler like
Line 217 in fe6e75f
&webhookReconcile{cache: n.cache, recorder: n.recorder, hubKubeClient: hubClient, kubeClient: managementClient}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done, added a new secret reconciler.
a989e1c
to
11233c5
Compare
if _, err := c.operatorKubeClient.CoreV1().Secrets(c.operatorNamespace).Get(ctx, | ||
workDriverConfig, metav1.GetOptions{}); errors.IsNotFound(err) { | ||
// TODO: set condition if the source secret doesn't exist, | ||
return cm, reconcileStop, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in helpers.SyncSecret, the synced secret will be deleted if the source secret is not found.
so if return here, the synced workDriverConfig secret will not be deleted if workDriverConfig is deleted.
not sure if this is expected ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the source secret should not be deleted per my understanding.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we hav a empty secret slice, and append the "work-driver-config" secret if feature gate is enabled?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated this with a secret set instead of a secret slice, to avoid duplicated resync.
b2ea778
to
0725a66
Compare
Signed-off-by: morvencao <lcao@redhat.com>
0725a66
to
95c0509
Compare
for _, secretName := range secretNames.UnsortedList() { | ||
// check the source secret explicitly as the 'helpers.SyncSecret' doesn't return an error | ||
// when the source secret is not found. | ||
if _, err := c.operatorKubeClient.CoreV1().Secrets(c.operatorNamespace).Get(ctx, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's remove this additional get for now. It could happen this should return success if the secret does not exist
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed this.
config.ClusterManagerNamespace, | ||
secretName, | ||
[]metav1.OwnerReference{}, | ||
); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
actually, we should append the err to a err slice and return aggregated err finally.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added aggregated error.
Signed-off-by: morvencao <lcao@redhat.com>
7289013
to
6f5c4ee
Compare
kindly ping @qiujian16 |
/lgtm |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/approve
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: morvencao, qiujian16 The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
7154863
into
open-cluster-management-io:main
Summary
This PR adds the capability to configure the work driver for the work controller via the cluster manager.
Related issue(s)
https://issues.redhat.com/browse/ACM-10422