feat(csv): install owned APIServices #492
Conversation
Looking great overall!
I noticed this in the service-catalog deploy artifacts:
```yaml
# apiserver gets the ability to read authentication. This allows it to
# read the specific configmap that has the requestheader-* entries to
# enable api aggregation
- apiVersion:
  kind: RoleBinding
  metadata:
    name: "servicecatalog.k8s.io:apiserver-authentication-reader"
    namespace: kube-system
  roleRef:
    apiGroup: rbac.authorization.k8s.io
    kind: Role
    name: extension-apiserver-authentication-reader
  subjects:
  - apiGroup: ""
    kind: ServiceAccount
    name: "service-catalog-apiserver"
    namespace: "default"
```
I think we may want to generate these as well. This one is tricky since it's a binding to a Role in kube-system (which means you can't write any equivalent `clusterPermission` or `permission` to get it).

We may want to go ahead and generate the binding to `auth-delegator` as well. The e2e tests would miss both of these things since they don't delete the package apiserver's RBAC roles.
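For reference, a minimal sketch of the `auth-delegator` binding being suggested, built with `k8s.io/api/rbac/v1` types. The function and naming scheme are illustrative assumptions, not the PR's code; only the `system:auth-delegator` ClusterRole itself is a well-known Kubernetes fixture:

```go
package example

import (
	rbacv1 "k8s.io/api/rbac/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// authDelegatorBinding grants the aggregated apiserver's ServiceAccount the
// well-known system:auth-delegator ClusterRole so it can delegate authn/authz
// decisions to the main kube-apiserver. Naming scheme is hypothetical.
func authDelegatorBinding(saName, saNamespace string) *rbacv1.ClusterRoleBinding {
	return &rbacv1.ClusterRoleBinding{
		ObjectMeta: metav1.ObjectMeta{Name: saName + ":system:auth-delegator"},
		RoleRef: rbacv1.RoleRef{
			APIGroup: "rbac.authorization.k8s.io",
			Kind:     "ClusterRole",
			Name:     "system:auth-delegator",
		},
		Subjects: []rbacv1.Subject{{
			Kind:      "ServiceAccount",
			Name:      saName,
			Namespace: saNamespace,
		}},
	}
}
```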
```diff
@@ -189,15 +189,15 @@ spec:
   items:
```
This should include the `deploymentName` field as well?
yes - thanks for catching that.
```go
Version        string `json:"version"`
Kind           string `json:"kind"`
DeploymentName string `json:"deploymentName,omitempty"`
```
I wonder if this should accept a container as well? `deploymentName: packages-apiserver/pkg-container`
Hmm, what would the value add be, further verification of the APIService Deployment?
Just hit an interesting issue: if the port used to expose the apiserver isn't 443, we get stuck in installing (but no info about why). We should probably either have a way to specify …
Also, thinking about the volume issue - what if we just had a standard volume name (like …
We could also add a liveness probe for the given or default port that enforces that a socket can be opened against it.
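A rough sketch of such a probe, using the `k8s.io/api/core/v1` types of the vintage this PR vendors (newer releases renamed the `Handler` field to `ProbeHandler`). The function, port default, and timing values are illustrative assumptions:

```go
package example

import (
	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/util/intstr"
)

// tcpLivenessProbe returns a probe that fails unless a TCP socket can be
// opened against the given port, so the kubelet restarts the apiserver
// container instead of it silently sitting unreachable.
func tcpLivenessProbe(port int32) *corev1.Probe {
	return &corev1.Probe{
		Handler: corev1.Handler{
			TCPSocket: &corev1.TCPSocketAction{Port: intstr.FromInt(int(port))},
		},
		InitialDelaySeconds: 15, // illustrative defaults
		PeriodSeconds:       10,
	}
}
```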
/assign @alecmerdler
```go
// installcheck determined we can't progress (e.g. deployment failed to come up in time)
if install.IsErrorUnrecoverable(strategyErr) {
	csv.SetPhase(v1alpha1.CSVPhaseFailed, v1alpha1.CSVReasonInstallCheckFailed, fmt.Sprintf("install failed: %s", strategyErr))
	return strategyErr
}

// if there's an error checking install that shouldn't fail the strategy, requeue with message
if apiServiceErr != nil {
	csv.SetPhase(v1alpha1.CSVPhaseInstalling, requeueConditionReason, fmt.Sprintf("APIServices installing: %s", apiServiceErr))
	a.requeueCSV(csv.GetName(), csv.GetNamespace())
```
I'm not sure that this is the best way to ensure APIService GVK discovery is immediately detected. Watching APIServices only tells us when their status changes to available, not when the expected GVKs become discoverable (which can happen much later). I expect this to swamp the CSV workqueue.

@ecordell Maybe we could requeue an APIService when it's detected that the owner CSV's expected GVKs are not discoverable in the `syncAPIServices` handler. Then, once all expected GVKs are discoverable, the `syncAPIServices` handler can requeue the owner CSV. This approach should at least shift the pressure from the CSV workqueue to the APIService workqueue. This logic could also be placed in a time-limited goroutine, which could potentially keep workqueue pressure down altogether.
This is good for now. It would be nice to watch discovery as an optimization, but this should work fine. And you don't need to worry too much about swamping the queue: `requeueCSV` obeys the rate limiting on the backing queue.

I say we just add an issue to our backlog for the optimization 😄
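For illustration, a sketch of the discovery check being discussed, using the standard `client-go` discovery client. The function name and shape are assumptions, not the PR's code:

```go
package example

import (
	"k8s.io/client-go/discovery"
)

// gvkDiscoverable reports whether the aggregated API is serving the expected
// Kind yet; until it returns true the CSV should stay in Installing and requeue.
func gvkDiscoverable(dc discovery.DiscoveryInterface, group, version, kind string) (bool, error) {
	resources, err := dc.ServerResourcesForGroupVersion(group + "/" + version)
	if err != nil {
		// The aggregated API isn't reachable yet; callers would requeue.
		return false, err
	}
	for _, r := range resources.APIResources {
		if r.Kind == kind {
			return true, nil
		}
	}
	return false, nil
}
```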
```go
if podTemplateName == depSpec.Template.GetName() {
	return nil, fmt.Errorf("a name collision occurred when generating name for PodTemplate")
}
depSpec.Template.SetName(podTemplateName)
```
This ensures that the extension api-server is actually rolled out in the event of a secret change. Triggering rollouts this way allows us to keep a static secret name.
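The PR does this by hashing into the pod template name; an equivalent, commonly used variant stamps the hash into a pod-template annotation instead. A minimal sketch of that variant, with an illustrative annotation key:

```go
package example

import (
	"crypto/sha256"
	"fmt"

	appsv1 "k8s.io/api/apps/v1"
)

// stampCertHash writes a hash of the serving cert into the pod template, so
// the Deployment controller rolls a new ReplicaSet whenever the Secret's
// contents change while the Secret's name stays static.
func stampCertHash(dep *appsv1.Deployment, certPEM []byte) {
	if dep.Spec.Template.Annotations == nil {
		dep.Spec.Template.Annotations = map[string]string{}
	}
	// "example.com/cert-hash" is a hypothetical key, not OLM's.
	dep.Spec.Template.Annotations["example.com/cert-hash"] = fmt.Sprintf("%x", sha256.Sum256(certPEM))
}
```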
```diff
@@ -1211,6 +1221,7 @@
   "k8s.io/kube-aggregator/pkg/apis/apiregistration/v1",
   "k8s.io/kube-aggregator/pkg/client/clientset_generated/clientset",
   "k8s.io/kube-aggregator/pkg/client/clientset_generated/clientset/fake",
+  "k8s.io/kube-aggregator/pkg/client/informers/externalversions",
```
nice job not pulling in kubernetes/kubernetes 😄
```yaml
version:
  type: string
  description: The version field of the APIService
kind:
  type: string
  description: The kind field of the APIService
deploymentName:
```
I was considering suggesting a `selector` here instead, so that you don't have to point to an entire deployment and to better match `Service`, but I think it's reasonable to require a deployment so that we can modify it to include the volume (using a selector to modify the pods would mean the `deployment` controller would fight us for changes).
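As a sketch of that trade-off: with a concrete `Deployment` in hand, OLM can mutate the spec directly to mount the cert `Secret`, and the Deployment controller then propagates the change itself. The volume name, mount path, and function below are illustrative assumptions:

```go
package example

import (
	appsv1 "k8s.io/api/apps/v1"
	corev1 "k8s.io/api/core/v1"
)

// mountServingCert injects the serving-cert Secret as a volume into the named
// Deployment's pod template and mounts it in every container.
func mountServingCert(dep *appsv1.Deployment, secretName string) {
	dep.Spec.Template.Spec.Volumes = append(dep.Spec.Template.Spec.Volumes, corev1.Volume{
		Name: "apiservice-cert", // hypothetical standard volume name
		VolumeSource: corev1.VolumeSource{
			Secret: &corev1.SecretVolumeSource{SecretName: secretName},
		},
	})
	for i := range dep.Spec.Template.Spec.Containers {
		c := &dep.Spec.Template.Spec.Containers[i]
		c.VolumeMounts = append(c.VolumeMounts, corev1.VolumeMount{
			Name:      "apiservice-cert",
			MountPath: "/serving-cert", // illustrative path
		})
	}
}
```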
```go
// GetRequiredAPIServiceDescriptions returns a deduplicated set of required APIServiceDescriptions
// with the intersection of required and owned removed.
// Equivalent to the set subtraction required - owned
```
This makes sense, but I think we should just prevent people from creating CSVs that have duplicate entries in owned and required. Fine for now, but let's keep it in mind for the next API version.
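For clarity, a minimal sketch of that deduplicated subtraction; the type and field names here are illustrative, not the PR's:

```go
package example

// APIServiceDescription is a simplified stand-in for the CSV's real type.
type APIServiceDescription struct {
	Group, Version, Kind string
}

// requiredMinusOwned returns required entries that are not owned, dropping
// duplicate required entries along the way: the set subtraction required - owned.
func requiredMinusOwned(required, owned []APIServiceDescription) []APIServiceDescription {
	seen := map[APIServiceDescription]struct{}{}
	for _, d := range owned {
		seen[d] = struct{}{}
	}
	var out []APIServiceDescription
	for _, d := range required {
		if _, dup := seen[d]; dup {
			continue // already owned, or a duplicate required entry
		}
		seen[d] = struct{}{}
		out = append(out, d)
	}
	return out
}
```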
```go
}
if ownerutil.IsOwnedByKind(apiService, v1alpha1.ClusterServiceVersionKind) {
	oref := ownerutil.GetOwnerByKind(apiService, v1alpha1.ClusterServiceVersionKind)
	log.Infof("APIService %s change requeuing CSV %s", apiService.GetName(), oref.Name)
```
Should this be at debug level?
```diff
@@ -158,18 +158,18 @@ spec:
       serviceAccount: prometheus-operator-0-22-2
       containers:
       - name: prometheus-operator
-        image: quay.io/coreos/prometheus-operator@sha256:3daa69a8c6c2f1d35dcf1fe48a7cd8b230e55f5229a1ded438f687debade5bcf
+        image: quay.io/coreos/prometheus-operator:master
```
Could you remove the changes from this file? This was just for testing out the operatorgroup in that combined branch for the demo; it's not something we want to change just yet.
/lgtm
/approve
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: njhale

The full list of commands accepted by this bot can be found here. The pull request process is described here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing …
feat(csv): install owned APIServices

Description

Extends the CSV control-loop to install owned `APIServices`. This includes generating and injecting associated serving certificates as volume mounted `Secrets` in matching `Deployments`, creating/updating a `Service` to point to the pods of said `Deployments`, and creating/updating `APIServices` with the CA bundle of the signing CA.

Note: This is a first attempt at something that creates all the required resources to establish trust with the OLM managed extension api-server. Managing cert expiration will be the next step. At this time, tests still need to be written for this.
Related to ALM-657
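For illustration, a sketch of the `APIService` object such a control loop might create or update, wiring the signing CA into the spec. It uses the vendored `k8s.io/kube-aggregator` v1 types (listed in the Gopkg.lock diff above); the function and priority values are placeholder assumptions, not the PR's exact code:

```go
package example

import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	apiregistrationv1 "k8s.io/kube-aggregator/pkg/apis/apiregistration/v1"
)

// buildAPIService registers an aggregated API group/version with the
// kube-aggregator, pointing it at the operator's Service and trusting the
// signing CA via CABundle.
func buildAPIService(group, version, svcName, svcNamespace string, caPEM []byte) *apiregistrationv1.APIService {
	return &apiregistrationv1.APIService{
		ObjectMeta: metav1.ObjectMeta{Name: version + "." + group},
		Spec: apiregistrationv1.APIServiceSpec{
			Group:   group,
			Version: version,
			Service: &apiregistrationv1.ServiceReference{
				Name:      svcName,
				Namespace: svcNamespace,
			},
			CABundle:             caPEM,
			GroupPriorityMinimum: 2000, // placeholder priorities
			VersionPriority:      15,
		},
	}
}
```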