[Feature] Add tutorial to explain how to collect metrics from operator for Prometheus #921

Open
2 tasks done
Yicheng-Lu-llll opened this issue Feb 22, 2023 · 0 comments
Labels
enhancement New feature or request

Comments

@Yicheng-Lu-llll
Contributor

Yicheng-Lu-llll commented Feb 22, 2023

Search before asking

  • I have searched the issues and found no similar feature request.

Description

Metrics from the operator:

The operator code defines the following Prometheus counters:

    // Define all the prometheus counters for all clusters
    var (
        clustersCreatedCount = promauto.NewCounterVec(
            prometheus.CounterOpts{
                Name: "ray_operator_clusters_created_total",
                Help: "Counts number of clusters created",
            },
            []string{"namespace"},
        )
        clustersDeletedCount = promauto.NewCounterVec(
            prometheus.CounterOpts{
                Name: "ray_operator_clusters_deleted_total",
                Help: "Counts number of clusters deleted",
            },
            []string{"namespace"},
        )
        clustersSuccessfulCount = promauto.NewCounterVec(
            prometheus.CounterOpts{
                Name: "ray_operator_clusters_successful_total",
                Help: "Counts number of clusters successful",
            },
            []string{"namespace"},
        )
        clustersFailedCount = promauto.NewCounterVec(
            prometheus.CounterOpts{
                Name: "ray_operator_clusters_failed_total",
                Help: "Counts number of clusters failed",
            },
            []string{"namespace"},
        )
    )

According to the kubebuilder documentation, it is possible to collect both the metrics above (e.g. ray_operator_clusters_created_total) and the default metrics created by controller-runtime.

How to collect the metrics from the operator:

kind create cluster
# Only needed if the kuberay Helm repo has not been added yet.
helm repo add kuberay https://ray-project.github.io/kuberay-helm/
helm repo update
helm install kuberay-operator kuberay/kuberay-operator --version 0.4.0
helm install raycluster kuberay/ray-cluster --version 0.4.0
kubectl port-forward svc/kuberay-operator 8080:8080
# Then open http://localhost:8080/metrics.
# It shows both the metrics KubeRay defines and the default metrics created by controller-runtime.
Later, Prometheus can scrape these metrics from the same endpoint.

Why also run helm install raycluster kuberay/ray-cluster --version 0.4.0:

According to the code here:

    if len(headPods.Items) == 0 || headPods.Items == nil {
        // create head pod
        r.Log.Info("reconcilePods ", "creating head pod for cluster", instance.Name)
        common.CreatedClustersCounterInc(instance.Namespace)
        if err := r.createHeadPod(*instance); err != nil {
            common.FailedClustersCounterInc(instance.Namespace)
            return err
        }
        common.SuccessfulClustersCounterInc(instance.Namespace)

Creating a RayCluster increments some of these counters, so the increased values should be visible at http://localhost:8080/metrics.

Some background I collected:

Use case

Collecting metrics from the operator helps with debugging, benchmarking, and visualizing the operator's behavior.

Related issues

No response

Are you willing to submit a PR?

  • Yes I am willing to submit a PR!
@Yicheng-Lu-llll Yicheng-Lu-llll added the enhancement New feature or request label Feb 22, 2023
@Yicheng-Lu-llll Yicheng-Lu-llll changed the title [Feature] Add tutorial to explain how to collect metrics from operator [Feature] Add tutorial to explain how to collect metrics from operator for Prometheus Feb 22, 2023