-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
KEP-647: Update apiserver tracing to GA for 1.30 #4333
KEP-647: Update apiserver tracing to GA for 1.30 #4333
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
/approve
/assign @wojtek-t |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a few smaller comments from PRR perspective.
@@ -207,15 +202,19 @@ Alpha | |||
|
|||
Beta | |||
|
|||
- [] Tracing 100% of requests does not break scalability tests (this does not necessarily mean trace backends can handle all the data). | |||
- [X] Tracing 100% of requests does not break scalability tests (this does not necessarily mean trace backends can handle all the data). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Couple smaller comments:
- For testing section:
- can you please add the coverage numbers (as in the template)
- for integration tests - can you please link to k8s-triage (as described in the template)
- for beta criteria:
Tracing 100% of requests does not break scalability tests (this does not necessarily mean trace backends can handle all the data).
For my own education - how did you do that? I think we didn't modify the periodic tests - should we?
-
SLIs question - are there any plans to have it addressed on OpenTelemetry?
-
Scalability section - there is a new question to answer:
https://github.com/kubernetes/enhancements/blob/master/keps/NNNN-kep-template/README.md#can-enabling--using-this-feature-result-in-resource-exhaustion-of-some-node-resources-pids-sockets-inodes-etc
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was a one-time test with tracing set at 100%: kubernetes/kubernetes#113695 (comment). It isn't continuous. I don't think we should modify the periodic tests. If we did, it should be at a reasonably low sampling rate to make traces available for debugging.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For SLIs, some in OTel are working on proposals: open-telemetry/oteps#238. I don't know when it will land, though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed points 1 and 4 in 4500ad8
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. Then just one more follow-up on the scalability tests:
- Can you maybe link the comment about the passing thread to the KEP itself too
- Co you add a sentence that we're consciously not modifying the periodic tests and we will consider sampling at low rate in the future?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done in 47c37e4
4500ad8
to
47c37e4
Compare
/label tide/merge-method-squash /lgtm |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: dashpole, logicalhan, wojtek-t The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
One-line PR description: Update apiserver tracing to GA for 1.30
Issue link:API Server tracing #647
/sig instrumentation
/assign @logicalhan @dgrisonnet