
[Question] Integration with tracing #1876

Closed
STRRL opened this issue Apr 25, 2022 · 17 comments
Labels
lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed.

Comments

@STRRL
Contributor

STRRL commented Apr 25, 2022

Hi! I found that the "API Server Tracing" feature has been available as alpha since Kubernetes v1.22, and this blog mentions that a simple patch could enable tracing on controller-runtime as well.

I think tracing integration would be a powerful way to enhance the observability of controller-runtime and the many operators built on it.

Is this feature on the roadmap? I am very interested in building it.

@STRRL
Contributor Author

STRRL commented May 3, 2022

Related PR: #1211

@FillZpp
Contributor

FillZpp commented May 5, 2022

Thanks @STRRL. I'm not sure. The patch example in the blog adds an otelhttp handler on top of the existing webhook server. Is that all we have to do?
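
For reference, the shape of the patch being discussed is roughly the following: wrap the webhook endpoint's http.Handler with otelhttp so every admission request gets a server span. This is a minimal sketch, not the blog's exact code, and it assumes the webhook endpoint is available as a plain http.Handler; how it is hooked into controller-runtime's webhook server is left out.

```go
package tracing

import (
	"net/http"

	"go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp"
)

// wrapWebhookHandler wraps an existing webhook handler so that every incoming
// request starts a server span, and any trace context already present in the
// request headers is extracted and continued. (Sketch only; the hook point
// into controller-runtime's webhook server is an assumption.)
func wrapWebhookHandler(h http.Handler) http.Handler {
	return otelhttp.NewHandler(h, "webhook")
}
```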

@STRRL
Contributor Author

STRRL commented May 5, 2022

Is that all we have to do?

No.

IMO, besides the webhook server, there are other components that need tracing integration, such as:

  • Controller and Reconciler
  • Client provided by controller-runtime
  • Webhook server
  • Logger/Contextual Logging

The first three could probably be handled with otelhttp plus proper context propagation. The fourth might need upstream changes in logr, but we could still provide a customized logr.LogSink implementation as a preview.
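
To make the "proper context propagation" part concrete, here is a minimal sketch (my assumption of how it could be wired, not an agreed design) of instrumenting the client side by wrapping the rest.Config transport with otelhttp, so requests from the controller-runtime client to the apiserver carry the current span context:

```go
package tracing

import (
	"net/http"

	"go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp"
	"k8s.io/client-go/rest"
	ctrl "sigs.k8s.io/controller-runtime"
)

// restConfigWithTracing wraps the transport of the loaded rest.Config with
// otelhttp, so every request made by clients built from it (including the
// controller-runtime client) injects the active span context into its headers.
func restConfigWithTracing() *rest.Config {
	cfg := ctrl.GetConfigOrDie()
	cfg.Wrap(func(rt http.RoundTripper) http.RoundTripper {
		return otelhttp.NewTransport(rt)
	})
	return cfg
}
```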

@FillZpp
Contributor

FillZpp commented May 5, 2022

I understand that tracing the webhook server may help users find out how much time a request to the apiserver costs. But I don't understand what we should trace for the controller or reconciler, since they work asynchronously. Are you going to trace each object from list/watch to reconcile?

@STRRL
Contributor Author

STRRL commented May 5, 2022

For almost all controllers/operators built on controller-runtime, the Reconciler is the most important part, containing the core business logic. I see no reason to leave it out of tracing.

But I don't understand what we should trace for the controller or reconciler, since they work asynchronously. Are you going to trace each object from list/watch to reconcile?

I have not thought through how the tracing context/span would propagate through the apiserver and etcd; it might work or it might not. I am also not sure whether "finding out how one reconciliation relates to a previous reconciliation" is practical even in theory, because the current status is the aggregation of all previous updates, so the propagation of different tracing contexts/spans would necessarily overlap. I think this should be clarified when we actually design the tracing integration.

On the other hand, tracing only the operations inside a single reconciliation is also very useful:

  • what kind of event triggered this reconciliation
  • then, which resources were modified/created/deleted
  • maybe other kinds of APIs invoked during the reconciliation
    • for Chaos Mesh, it invokes chaos-daemon via gRPC to inject chaos
    • for cloud-provider-related controllers, it invokes the cloud provider's OpenAPI
    • etc.
  • whether this reconciliation "wins" the optimistic lock when updating resources

Are you going to trace each object from list/watch to reconcile?

Based on the discussion above, I would like to trace every single reconciliation. I am not certain yet, but I lean towards yes.
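
As an illustration of what per-reconciliation tracing could look like, here is a rough sketch of a reconciler that starts one span per Reconcile call and records the request as attributes. The tracer name, attribute keys, and reconciler type are illustrative assumptions, not a proposed API:

```go
package controllers

import (
	"context"

	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/attribute"
	ctrl "sigs.k8s.io/controller-runtime"
)

type MyReconciler struct{}

func (r *MyReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
	// One root span per reconciliation; operations done with the returned ctx
	// (client calls, gRPC calls, etc.) would show up as child spans once their
	// transports are instrumented.
	ctx, span := otel.Tracer("controller-runtime").Start(ctx, "Reconcile")
	defer span.End()
	span.SetAttributes(
		attribute.String("reconcile.namespace", req.Namespace),
		attribute.String("reconcile.name", req.Name),
	)

	_ = ctx // business logic using ctx goes here
	return ctrl.Result{}, nil
}
```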

@STRRL
Contributor Author

STRRL commented Jul 22, 2022

I have been struggling to profile the performance of the Chaos Mesh controller-manager in recent days. It has made me focus much more on tracing for Kubernetes operators.

I will start working on this issue in the next few weeks.

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Oct 20, 2022
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Nov 19, 2022
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

@k8s-ci-robot
Contributor

@k8s-triage-robot: Closing this issue, marking it as "Not Planned".

In response to this:

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot closed this as not planned Dec 19, 2022
@mjnovice

mjnovice commented Apr 8, 2024

Can this be re-opened?

@sbueringer
Member

/reopen
/remove-lifecycle rotten

@k8s-ci-robot k8s-ci-robot reopened this Apr 8, 2024
@k8s-ci-robot
Contributor

@sbueringer: Reopened this issue.

In response to this:

/reopen
/remove-lifecycle rotten

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Apr 8, 2024
@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jul 7, 2024
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle rotten
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Aug 6, 2024
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

@k8s-ci-robot k8s-ci-robot closed this as not planned Sep 5, 2024
@k8s-ci-robot
Contributor

@k8s-triage-robot: Closing this issue, marking it as "Not Planned".

In response to this:

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
