A mechanism to list the Workloads that are admitted and not finished #1776

alculquicondor · 2024-02-27T18:48:38Z

What would you like to be added:

Some way of querying the list of Workloads that are still "running", that is, they are admitted and don't have a Finished condition. Querying by ClusterQueue would be ideal too.

Some options (not necessarily exclusive to each other):

The kueue CLI
A visibility endpoint.
Is there a way to do this just with a kubectl jsonpath?
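
As a rough illustration of the filter being asked for (admitted, no Finished condition, optionally narrowed to one ClusterQueue), here is a minimal Go sketch. The Workload and Condition types are simplified, hypothetical stand-ins rather than the real Kueue API types, and the condition type strings "Admitted" and "Finished" are assumed from the wording above.

```go
package main

import "fmt"

// Condition is a simplified, hypothetical stand-in for a Kubernetes
// status condition (only the fields needed for this sketch).
type Condition struct {
	Type   string
	Status string // "True", "False" or "Unknown"
}

// Workload is a hypothetical, trimmed-down view of a Workload object.
type Workload struct {
	Name         string
	ClusterQueue string
	Conditions   []Condition
}

// hasTrueCondition reports whether a condition of the given type is
// present with status "True".
func hasTrueCondition(conds []Condition, condType string) bool {
	for _, c := range conds {
		if c.Type == condType {
			return c.Status == "True"
		}
	}
	return false
}

// isRunning implements the definition from this issue: the Workload is
// admitted and does not have a true Finished condition.
func isRunning(w Workload) bool {
	return hasTrueCondition(w.Conditions, "Admitted") &&
		!hasTrueCondition(w.Conditions, "Finished")
}

// runningWorkloads returns the running Workloads, optionally restricted
// to a single ClusterQueue (an empty name means no restriction).
func runningWorkloads(all []Workload, clusterQueue string) []Workload {
	var out []Workload
	for _, w := range all {
		if clusterQueue != "" && w.ClusterQueue != clusterQueue {
			continue
		}
		if isRunning(w) {
			out = append(out, w)
		}
	}
	return out
}

func main() {
	admitted := Condition{Type: "Admitted", Status: "True"}
	finished := Condition{Type: "Finished", Status: "True"}
	all := []Workload{
		{Name: "job-a", ClusterQueue: "cq-1", Conditions: []Condition{admitted}},
		{Name: "job-b", ClusterQueue: "cq-1", Conditions: []Condition{admitted, finished}},
		{Name: "job-c", ClusterQueue: "cq-2", Conditions: []Condition{admitted}},
		{Name: "job-d", ClusterQueue: "cq-1"}, // still pending
	}
	for _, w := range runningWorkloads(all, "cq-1") {
		fmt.Println(w.Name)
	}
}
```

Whichever option is chosen (CLI, visibility endpoint, or a kubectl query), it would need to apply an equivalent predicate. The main difference is whether the filtering happens on the client or on the server.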

Why is this needed:

For basic diagnostics of the state of the cluster.

Completion requirements:

This enhancement requires the following artifacts:

Design doc
API change?
Docs update

The artifacts should be linked in subsequent comments.

tenzen-y · 2024-02-27T18:52:34Z

Do you expect us to extend the on-demand visibility server, or to save the information on a custom resource such as ClusterQueue?

alculquicondor · 2024-02-27T18:53:55Z

I don't think we should save it. This is more for on-demand queries.

tenzen-y · 2024-02-27T18:55:06Z

> I don't think we should save it. This is more for on-demand queries.

It makes sense.

astefanutti · 2024-02-28T08:27:34Z

Tangentially, do you think adding a running_workloads metric could be useful too?

tenzen-y · 2024-02-28T08:30:16Z

> Tangentially, do you think adding a running_workloads metric could be useful too?

Do you mean a Prometheus metric?

astefanutti · 2024-02-28T08:32:09Z

> > Tangentially, do you think adding a running_workloads metric could be useful too?
>
> Do you mean a Prometheus metric?

Yes, a Prometheus metric, that would complement the existing pending_workloads metric.

alculquicondor · 2024-02-28T14:47:32Z

we already have it

kueue/pkg/metrics/metrics.go, lines 120 to 126 in 9afd7e6:

    AdmittedActiveWorkloads = prometheus.NewGaugeVec(
    	prometheus.GaugeOpts{
    		Subsystem: constants.KueueName,
    		Name:      "admitted_active_workloads",
    		Help:      "The number of admitted Workloads that are active (unsuspended and not finished), per 'cluster_queue'",
    	}, []string{"cluster_queue"},
    )

astefanutti · 2024-02-28T16:37:04Z

> we already have it
>
> kueue/pkg/metrics/metrics.go, lines 120 to 126 in 9afd7e6:
>
>     AdmittedActiveWorkloads = prometheus.NewGaugeVec(
>     	prometheus.GaugeOpts{
>     		Subsystem: constants.KueueName,
>     		Name:      "admitted_active_workloads",
>     		Help:      "The number of admitted Workloads that are active (unsuspended and not finished), per 'cluster_queue'",
>     	}, []string{"cluster_queue"},
>     )

Ah right, this is the one, thanks!

alculquicondor · 2024-02-28T17:18:36Z

Would you like to propose something for this area? Maybe along the lines of a new visibility endpoint?

astefanutti · 2024-02-28T17:59:05Z

Yes. Do you want it to be initially proposed as a KEP?

alculquicondor · 2024-02-28T18:03:31Z

Given that it's an API change, a KEP would be good.

tenzen-y · 2024-02-28T19:25:10Z

Feel free to assign yourself with /assign.
Since you joined this org, the command should work well :)

KunWuLuan · 2024-04-25T07:28:07Z

If no one is working on this, I can help.

astefanutti · 2024-04-25T07:29:44Z

@KunWuLuan I haven't had the chance to work on it. Feel free to assign it to yourself.

KunWuLuan · 2024-05-06T03:38:25Z

Do we still need the new endpoints if we already have kueuectl?

alculquicondor · 2024-05-06T12:35:52Z

Maybe less so? One slight advantage would be to filter on the server side.

But maybe it would add unnecessary load to the kueue binary?

k8s-triage-robot · 2024-08-04T13:12:55Z

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

After 90d of inactivity, lifecycle/stale is applied
After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

Mark this issue as fresh with /remove-lifecycle stale
Close this issue with /close
Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

tenzen-y · 2024-08-14T16:19:48Z

/remove-lifecycle stale

alculquicondor · 2024-08-14T17:23:16Z

This is supported using kueuectl https://kueue.sigs.k8s.io/docs/reference/kubectl-kueue/commands/kueuectl_list/kueuectl_list_workload/

Should we close this?

tenzen-y · 2024-08-15T07:18:37Z

> This is supported using kueuectl https://kueue.sigs.k8s.io/docs/reference/kubectl-kueue/commands/kueuectl_list/kueuectl_list_workload/
>
> Should we close this?

Oh, you're right. Let me close this.
/close

k8s-ci-robot · 2024-08-15T07:18:41Z

@tenzen-y: Closing this issue.

In response to this:

> > This is supported using kueuectl https://kueue.sigs.k8s.io/docs/reference/kubectl-kueue/commands/kueuectl_list/kueuectl_list_workload/
> >
> > Should we close this?
>
> Oh, you're right. Let me close this.
> /close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

alculquicondor added the kind/feature label Feb 27, 2024

alculquicondor mentioned this issue Apr 29, 2024: KEP 2076: Kueuectl #2093 (Merged)

KunWuLuan mentioned this issue May 7, 2024: support running workloads in visibility endpoint #2145 (Closed)

k8s-ci-robot added the lifecycle/stale label Aug 4, 2024

k8s-ci-robot removed the lifecycle/stale label Aug 14, 2024

k8s-ci-robot closed this as completed Aug 15, 2024
