-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
KEP-2339: beta details #3593
KEP-2339: beta details #3593
Conversation
Signed-off-by: Monis Khan <mok@microsoft.com>
Signed-off-by: Andrew Sy Kim <andrewsy@google.com>
Signed-off-by: Andrew Sy Kim <andrewsy@google.com>
Signed-off-by: Monis Khan <mok@microsoft.com>
Signed-off-by: Monis Khan <mok@microsoft.com>
531eac2
to
69cba67
Compare
## FAQ | ||
##### e2e tests | ||
|
||
No E2E tests are required for this enhancement as the functionality can be completely |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
an e2e test for CRDs looks practical to develop. For built-in types, an e2e test does not look practical.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason to not simply write an integration test for CRDs as well? We already have code for the ETCD storage path test that can create CRDs and CRs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm happy to have both. I'd like to see an e2e demonstrate the functionality on a running cluster .
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The point of an e2e would be to eventually make it a conformance test, I think.
|
||
###### Can the feature be disabled once it has been enabled (i.e. can we roll back the enablement)? | ||
|
||
Yes, like any other alpha API, it can be disabled with runtime config. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We're going to beta. When disabled, the cluster-admin should delete the stale resources.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Naive question: why is that rule specifically pertaining to when an API is promoted to Beta?
###### Are there any missing metrics that would be useful to have to improve observability of this feature? | ||
|
||
The following metrics could be useful, but are likely not practical due to cardinality issues or complexity of the implementation: | ||
- The latency for a single apiserver to update it's encoding version after start-up |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this encoding update is indirectly exposed via kubernetes_healthcheck{name="poststarthook/built-in-resources-storage-version-updater"}
, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In some ways yes, although I can imagine scenarios where the time between when the health check passes and when the actual encoding version is updated can be non-negligible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought that metric was a simple counter and did not include latency?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought that metric was a simple counter and did not include latency?
Given a scrap interval, you can determine how long/often it is false. The idea for the metrics/sli is an order 5s poll.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still don't think we're missing this, but if you want to leave it in missing, that's ok.
keps/sig-api-machinery/2339-storageversion-api-for-ha-api-servers/README.md
Outdated
Show resolved
Hide resolved
keps/sig-api-machinery/2339-storageversion-api-for-ha-api-servers/README.md
Show resolved
Hide resolved
keps/sig-api-machinery/2339-storageversion-api-for-ha-api-servers/README.md
Outdated
Show resolved
Hide resolved
keps/sig-api-machinery/2339-storageversion-api-for-ha-api-servers/README.md
Show resolved
Hide resolved
Signed-off-by: Monis Khan <mok@microsoft.com>
// Spec is omitted because there is no spec field. | ||
// Spec StorageVersionSpec | ||
// Spec is an empty spec. It is here to comply with Kubernetes API style. | ||
Spec StorageVersionSpec `json:"spec" protobuf:"bytes,2,opt,name=spec"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see why we would need this, but we settle these things in API review, not KEPs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lavalamp this is copy pasted from the current code in k/k
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
haha ok
```golang | ||
// Storage version of a specific resource. | ||
```go | ||
// Storage version of a specific resource. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there is one of these per {group, resource}? Would be good to state that.
### Risks and Mitigations | ||
|
||
Writes to most Kubernetes resources must be prevented until the API server has | ||
had a chance to emit its encoding versions for all resources. This requires us to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ideally a readiness check will return false during this time so a loadbalancer can know not to send traffic?
OK, some nits, but I don't see anything blocking. |
Signed-off-by: Monis Khan <mok@microsoft.com>
/lgtm |
/label tide/merge-method-squash |
PRR looks good and the PR looks good /approve |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: deads2k, enj, lavalamp The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
* KEP-2339: update to latest KEP template Signed-off-by: Monis Khan <mok@microsoft.com> * KEP-2339: add PRR for StorageVersion API Signed-off-by: Andrew Sy Kim <andrewsy@google.com> * KEP-2339: update test plan section with links to existing tests Signed-off-by: Andrew Sy Kim <andrewsy@google.com> * KEP-2339: add details around current code Signed-off-by: Monis Khan <mok@microsoft.com> * KEP-2339: add details around default behavior Signed-off-by: Monis Khan <mok@microsoft.com> * KEP-2339: address comments Signed-off-by: Monis Khan <mok@microsoft.com> * KEP-2339: address comments Signed-off-by: Monis Khan <mok@microsoft.com> Signed-off-by: Monis Khan <mok@microsoft.com> Signed-off-by: Andrew Sy Kim <andrewsy@google.com> Co-authored-by: Andrew Sy Kim <andrewsy@google.com>
/assign @deads2k