
Expose controller-runtime metrics #786

Merged: lilic merged 35 commits into operator-framework:master from lili/metrics-helpers on Jan 25, 2019

Conversation

@lilic (Member) commented Nov 29, 2018

Description of the change:

Bring in the controller-runtime metrics by exposing them and creating a Service object. By default the metrics are served on port 8080, the same as in controller-runtime; we just pass the default value through, and if the user changes it, that change is reflected in the Service creation as well.
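
For illustration, a minimal sketch of how a scaffolded main.go might call the helper introduced by this PR. The ExposeMetricsPort signature is taken from the diff discussed further down; the import path and the port constant are assumptions, not the exact scaffold.

```go
package main

import (
	"context"
	"log"

	// Assumed import path for the metrics helper added in this PR.
	"github.com/operator-framework/operator-sdk/pkg/metrics"
)

// metricsPort is the port the controller-runtime metrics are served on.
// This first draft used 8080; the merged scaffold defaults to 8383.
const metricsPort int32 = 8383

func main() {
	ctx := context.TODO()

	// Create the Service that exposes the controller-runtime metrics endpoint.
	// The returned Service object is ignored here for brevity.
	if _, err := metrics.ExposeMetricsPort(ctx, metricsPort); err != nil {
		log.Printf("failed to expose metrics port: %v", err)
	}

	// ... set up the manager, add controllers, and call mgr.Start() as usual ...
}
```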

Motivation for the change:

As decided offline: since controller-runtime currently uses a global registry, we should use that instead of creating a new one and serving the metrics ourselves.

Closes #222

@openshift-ci-robot added the size/L label (denotes a PR that changes 100-499 lines, ignoring generated files) on Nov 29, 2018
@lilic mentioned this pull request on Nov 29, 2018
@lilic (Member, Author) commented Nov 29, 2018

I would personally not merge this until controller-runtime cuts a new release, as merging now would mean all new operators pin against master instead of a release. But, as we agreed, we should open this for review and pin against master for now.

@lilic (Member, Author) commented Nov 29, 2018

Tested this locally; these are example metrics as served on port 8080:

# HELP controller_runtime_reconcile_queue_length Length of reconcile queue per controller
# TYPE controller_runtime_reconcile_queue_length gauge
controller_runtime_reconcile_queue_length{controller="appservice-controller"} 0
# HELP controller_runtime_reconcile_time_seconds Length of time per reconcile per controller
# TYPE controller_runtime_reconcile_time_seconds histogram
controller_runtime_reconcile_time_seconds_bucket{controller="appservice-controller",le="0.005"} 5
controller_runtime_reconcile_time_seconds_bucket{controller="appservice-controller",le="0.01"} 5
controller_runtime_reconcile_time_seconds_bucket{controller="appservice-controller",le="0.025"} 5
controller_runtime_reconcile_time_seconds_bucket{controller="appservice-controller",le="0.05"} 5
controller_runtime_reconcile_time_seconds_bucket{controller="appservice-controller",le="0.1"} 5
controller_runtime_reconcile_time_seconds_bucket{controller="appservice-controller",le="0.25"} 5
controller_runtime_reconcile_time_seconds_bucket{controller="appservice-controller",le="0.5"} 5
controller_runtime_reconcile_time_seconds_bucket{controller="appservice-controller",le="1"} 5
controller_runtime_reconcile_time_seconds_bucket{controller="appservice-controller",le="2.5"} 5
controller_runtime_reconcile_time_seconds_bucket{controller="appservice-controller",le="5"} 5
controller_runtime_reconcile_time_seconds_bucket{controller="appservice-controller",le="10"} 5
controller_runtime_reconcile_time_seconds_bucket{controller="appservice-controller",le="+Inf"} 5
controller_runtime_reconcile_time_seconds_sum{controller="appservice-controller"} 6.119999999999999e-07
controller_runtime_reconcile_time_seconds_count{controller="appservice-controller"} 5
# HELP go_gc_duration_seconds A summary of the GC invocation durations.
# TYPE go_gc_duration_seconds summary
go_gc_duration_seconds{quantile="0"} 6.0225e-05
go_gc_duration_seconds{quantile="0.25"} 6.1858e-05
go_gc_duration_seconds{quantile="0.5"} 0.000228719
go_gc_duration_seconds{quantile="0.75"} 0.000725184
go_gc_duration_seconds{quantile="1"} 0.003963635
go_gc_duration_seconds_sum 0.005039621
go_gc_duration_seconds_count 5
# HELP go_goroutines Number of goroutines that currently exist.
# TYPE go_goroutines gauge
go_goroutines 40
# HELP go_info Information about the Go environment.
# TYPE go_info gauge
go_info{version="go1.11.2"} 1
# HELP go_memstats_alloc_bytes Number of bytes allocated and still in use.
# TYPE go_memstats_alloc_bytes gauge
go_memstats_alloc_bytes 5.679208e+06
# HELP go_memstats_alloc_bytes_total Total number of bytes allocated, even if freed.
# TYPE go_memstats_alloc_bytes_total counter
go_memstats_alloc_bytes_total 1.3982888e+07
# HELP go_memstats_buck_hash_sys_bytes Number of bytes used by the profiling bucket hash table.
# TYPE go_memstats_buck_hash_sys_bytes gauge
go_memstats_buck_hash_sys_bytes 1.445848e+06
# HELP go_memstats_frees_total Total number of frees.
# TYPE go_memstats_frees_total counter
go_memstats_frees_total 89562
# HELP go_memstats_gc_cpu_fraction The fraction of this program's available CPU time used by the GC since the program started.
# TYPE go_memstats_gc_cpu_fraction gauge
go_memstats_gc_cpu_fraction 0.05740754063652995
# HELP go_memstats_gc_sys_bytes Number of bytes used for garbage collection system metadata.
# TYPE go_memstats_gc_sys_bytes gauge
go_memstats_gc_sys_bytes 2.379776e+06
# HELP go_memstats_heap_alloc_bytes Number of heap bytes allocated and still in use.
# TYPE go_memstats_heap_alloc_bytes gauge
go_memstats_heap_alloc_bytes 5.679208e+06
# HELP go_memstats_heap_idle_bytes Number of heap bytes waiting to be used.
# TYPE go_memstats_heap_idle_bytes gauge
go_memstats_heap_idle_bytes 5.9236352e+07
# HELP go_memstats_heap_inuse_bytes Number of heap bytes that are in use.
# TYPE go_memstats_heap_inuse_bytes gauge
go_memstats_heap_inuse_bytes 7.118848e+06
# HELP go_memstats_heap_objects Number of allocated objects.
# TYPE go_memstats_heap_objects gauge
go_memstats_heap_objects 46716
# HELP go_memstats_heap_released_bytes Number of heap bytes released to OS.
# TYPE go_memstats_heap_released_bytes gauge
go_memstats_heap_released_bytes 0
# HELP go_memstats_heap_sys_bytes Number of heap bytes obtained from system.
# TYPE go_memstats_heap_sys_bytes gauge
go_memstats_heap_sys_bytes 6.63552e+07
# HELP go_memstats_last_gc_time_seconds Number of seconds since 1970 of last garbage collection.
# TYPE go_memstats_last_gc_time_seconds gauge
go_memstats_last_gc_time_seconds 1.5435088556178067e+09
# HELP go_memstats_lookups_total Total number of pointer lookups.
# TYPE go_memstats_lookups_total counter
go_memstats_lookups_total 0
# HELP go_memstats_mallocs_total Total number of mallocs.
# TYPE go_memstats_mallocs_total counter
go_memstats_mallocs_total 136278
# HELP go_memstats_mcache_inuse_bytes Number of bytes in use by mcache structures.
# TYPE go_memstats_mcache_inuse_bytes gauge
go_memstats_mcache_inuse_bytes 3456
# HELP go_memstats_mcache_sys_bytes Number of bytes used for mcache structures obtained from system.
# TYPE go_memstats_mcache_sys_bytes gauge
go_memstats_mcache_sys_bytes 16384
# HELP go_memstats_mspan_inuse_bytes Number of bytes in use by mspan structures.
# TYPE go_memstats_mspan_inuse_bytes gauge
go_memstats_mspan_inuse_bytes 94088
# HELP go_memstats_mspan_sys_bytes Number of bytes used for mspan structures obtained from system.
# TYPE go_memstats_mspan_sys_bytes gauge
go_memstats_mspan_sys_bytes 98304
# HELP go_memstats_next_gc_bytes Number of heap bytes when next garbage collection will take place.
# TYPE go_memstats_next_gc_bytes gauge
go_memstats_next_gc_bytes 8.23344e+06
# HELP go_memstats_other_sys_bytes Number of bytes used for other system allocations.
# TYPE go_memstats_other_sys_bytes gauge
go_memstats_other_sys_bytes 710944
# HELP go_memstats_stack_inuse_bytes Number of bytes in use by the stack allocator.
# TYPE go_memstats_stack_inuse_bytes gauge
go_memstats_stack_inuse_bytes 753664
# HELP go_memstats_stack_sys_bytes Number of bytes obtained from system for stack allocator.
# TYPE go_memstats_stack_sys_bytes gauge
go_memstats_stack_sys_bytes 753664
# HELP go_memstats_sys_bytes Number of bytes obtained from system.
# TYPE go_memstats_sys_bytes gauge
go_memstats_sys_bytes 7.176012e+07
# HELP go_threads Number of OS threads created.
# TYPE go_threads gauge
go_threads 10
# HELP process_cpu_seconds_total Total user and system CPU time spent in seconds.
# TYPE process_cpu_seconds_total counter
process_cpu_seconds_total 0.13
# HELP process_max_fds Maximum number of open file descriptors.
# TYPE process_max_fds gauge
process_max_fds 1.048576e+06
# HELP process_open_fds Number of open file descriptors.
# TYPE process_open_fds gauge
process_open_fds 8
# HELP process_resident_memory_bytes Resident memory size in bytes.
# TYPE process_resident_memory_bytes gauge
process_resident_memory_bytes 2.4887296e+07
# HELP process_start_time_seconds Start time of the process since unix epoch in seconds.
# TYPE process_start_time_seconds gauge
process_start_time_seconds 1.54350885539e+09
# HELP process_virtual_memory_bytes Virtual memory size in bytes.
# TYPE process_virtual_memory_bytes gauge
process_virtual_memory_bytes 1.33582848e+08
# HELP process_virtual_memory_max_bytes Maximum amount of virtual memory available in bytes.
# TYPE process_virtual_memory_max_bytes gauge
process_virtual_memory_max_bytes -1

@estroz (Member) left a comment

LGTM other than 2 nits.

Review threads on pkg/scaffold/cmd.go and pkg/metrics/metrics.go (outdated, resolved)
@lilic force-pushed the lili/metrics-helpers branch 2 times, most recently from 1e27f3f to 7c0f0ce on November 30, 2018 13:23
@shawn-hurley (Member) left a comment

Overall LGTM; just a personal preference around the constants file.

Review thread on pkg/metrics/constants.go (outdated, resolved)
@lilic force-pushed the lili/metrics-helpers branch 3 times, most recently from 59682b1 to 9691354 on December 3, 2018 10:29
@openshift-ci-robot added the needs-rebase label (indicates a PR cannot be merged because it has merge conflicts with HEAD) on Dec 3, 2018
Review threads on pkg/k8sutil/k8sutil.go and pkg/scaffold/cmd.go (outdated, resolved)
@lilic (Member, Author) commented Dec 5, 2018

As agreed we will put this on hold and wait until controller-runtime issues a new release.

@openshift-ci-robot removed the needs-rebase label (indicates a PR cannot be merged because it has merge conflicts with HEAD) on Dec 10, 2018
@lilic force-pushed the lili/metrics-helpers branch 2 times, most recently from 8bc16b3 to 93be61a on December 10, 2018 13:35
@lilic force-pushed the lili/metrics-helpers branch 3 times, most recently from 51cb32f to f255a39 on December 19, 2018 13:31
@lilic (Member, Author) commented Jan 21, 2019

@joelanford @hasbro17 Can you please have another look, thanks! Adjusted per your suggestions.

kubeconfig, err := config.GetConfig()
err = createService(mgr, s)

return s, nil
A Member commented:

If the service already exists, we return the one returned by initOperatorService, not the existing one. Just making sure that's what we want to happen.

@lilic (Member, Author) commented:

Actually, there is a missing error check there in general 🤦‍♀️

> Just making sure that's what we want to happen.

Yes, I'm not sure there. Do you think it would be best to request the Service and return that, or to always error out and not return a Service, thereby removing the existence check?

A Member commented:

There is no straightforward answer here, because an operator might be running in a deployment with leader election enabled. I think I've opened up a can of worms.

So there are two related concerns here:

  1. Which pods are selected by the Service's selector? Right now it looks like all pods in the deployment? I'm not totally familiar with how the metrics work, but I'd imagine that we want Prometheus to scrape only the leader's metrics, right?
  2. What's the lifecycle of the Service? Does it live for the duration of the leader pod or of the operator deployment?

I think we need to answer those questions to make sure we get the logic correct in ExposeMetricsPort.

A Contributor commented:

> Which pods are selected by the Service's selector? Right now it looks like all pods in the deployment?

Yes, since the Service uses the same selector as the Deployment, name=<operator-name>, it will expose the ports for all Deployment pods.
And it's alright if we scrape the metrics for all non-leader pods as well. If a leader pod steps down for some reason, it can have different metrics from the new leader (e.g. number of reconciles), which is worth exporting.

> What's the lifecycle of the Service? Does it live for the duration of the leader pod or of the operator deployment?

I think it should live for the duration of the Deployment. Recreating the Service every time there's a new leader doesn't make much sense if it's going to be selecting all replicas. So if a Service does not exist, the leader pod should create it with the ownerRef set to the Deployment that owns it.

> Do you think it would be best to request the Service and return that, or to always error out and not return a Service, thereby removing the existence check?

I think ExposeMetricsPort() should return the actual Service that exists: either it gets created by the pod if it doesn't exist, or the function gets and returns the Service created by a previous pod.

Although it's worth considering whether there are any drawbacks to tying the Service lifecycle to the operator Deployment; I can't think of any right now.

/cc @shawn-hurley
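
For illustration, a minimal sketch of the Service described in this thread: it reuses the Deployment's name=<operator-name> selector so all replicas are scraped, and sets an owner reference to the Deployment so the Service shares its lifetime. The helper name and label key are assumptions, not the SDK's exact implementation.

```go
package metrics

import (
	v1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/util/intstr"
)

// newMetricsService is a hypothetical helper building the metrics Service.
func newMetricsService(operatorName, namespace string, port int32, owner metav1.OwnerReference) *v1.Service {
	labels := map[string]string{"name": operatorName}
	return &v1.Service{
		ObjectMeta: metav1.ObjectMeta{
			Name:      operatorName,
			Namespace: namespace,
			Labels:    labels,
			// Owned by the Deployment, so the Service lives for the Deployment's lifetime.
			OwnerReferences: []metav1.OwnerReference{owner},
		},
		Spec: v1.ServiceSpec{
			// Same selector as the Deployment: all replicas (leader or not) are exposed.
			Selector: labels,
			Ports: []v1.ServicePort{{
				Name:       "metrics",
				Port:       port,
				TargetPort: intstr.FromInt(int(port)),
			}},
		},
	}
}
```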

A Member commented:

> Yes, since the Service uses the same selector as the Deployment, name=<operator-name>, it will expose the ports for all Deployment pods.
> And it's alright if we scrape the metrics for all non-leader pods as well. If a leader pod steps down for some reason, it can have different metrics from the new leader (e.g. number of reconciles), which is worth exporting.

Will Prometheus scrape each Service endpoint individually, or will it scrape the Service and get round-robined among the endpoints? If the former, that sounds good. If the latter, won't that cause problems for Prometheus (e.g. counters jumping up and down depending on which pod serves the request)?

A Member commented:

I agree with option 1.

@lilic (Member, Author) commented:

Did some testing, and option 1 doesn't work because the metrics only get served once Start is called. I would like to suggest upstream that the metrics be served before that, as they should be independent of Start. I am assuming that we should hold the lock before that, right?

They do handle leader election and serve metrics once there is a lock, but we use our own leader logic, so we can't rely on that, IIUC.

I guess for now, so that we have some metrics, I would suggest we go with option 2 and contribute upstream if they agree on serving metrics independently. SGTY @shawn-hurley @hasbro17 @joelanford?
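
For context, a rough sketch against the controller-runtime API of that era showing why option 1 fails: the manager is configured with a metrics bind address, but the metrics HTTP server is only started from Start(), so nothing is reachable beforehand. The helper name and exact option shape here are assumptions.

```go
package main

import (
	"log"

	"sigs.k8s.io/controller-runtime/pkg/client/config"
	"sigs.k8s.io/controller-runtime/pkg/manager"
)

func runManager(stop <-chan struct{}) {
	cfg, err := config.GetConfig()
	if err != nil {
		log.Fatal(err)
	}

	// The metrics listener address is configured here ...
	mgr, err := manager.New(cfg, manager.Options{
		MetricsBindAddress: ":8080",
	})
	if err != nil {
		log.Fatal(err)
	}

	// ... but the metrics endpoint is only served once Start() runs,
	// which is why metrics are not available before the manager starts.
	if err := mgr.Start(stop); err != nil {
		log.Fatal(err)
	}
}
```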

@lilic (Member, Author) commented:

Summary of the discussion with @shawn-hurley:

  1. Will look into disabling serving of metrics in controller-runtime, so we can serve the metrics they expose in operator-sdk.
  2. Open an issue to discuss serving metrics independently of Start() being called.

A Contributor commented:

Agreed. Option 2 sounds fine until we can work upstream or find a workaround to expose the metrics before getting the lock and calling start.

@lilic (Member, Author) commented:

> Agreed. Option 2 sounds fine until we can work upstream or find a workaround to expose the metrics before getting the lock and calling start.

Okay, I will open an issue to make it work for all pods, not just the leader, once this is merged 👍 Until then we will at least have some metrics.

Three review threads on pkg/metrics/metrics.go (outdated, resolved)
@hasbro17 (Contributor) commented:

@lilic Looks good but just a few more nits.

Can you also please update the CHANGELOG to mention that the SDK scaffold for main.go now exposes the controller-runtime metrics on port 8383 by default?
Since we never exposed metrics after the 0.1.0 refactor, I think this will just go in the Added section.
We'll update that further with a link to more docs later on.

hasbro17 and others added 4 commits January 24, 2019 09:44
Co-Authored-By: LiliC <cosiclili@gmail.com>
Co-Authored-By: LiliC <cosiclili@gmail.com>
Co-Authored-By: LiliC <cosiclili@gmail.com>
@shawn-hurley (Member) left a comment

LGTM

Due to the cache not being started at the time we attempt to query
for the Service, we instead create a new client in a similar way as we
do for leader election.
@lilic (Member, Author) commented Jan 24, 2019

We couldn't get the Service using the manager's client because the cache was not started, so I did the same thing we do when handling the leader lock and created a new client.

@hasbro17 @joelanford Tested locally and added an entry to the CHANGELOG. PTAL again.
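
As a rough sketch of the workaround described above (the helper name and error handling are illustrative, not the exact merged code): build a client directly from the rest config, since the manager's cache-backed client cannot be used before mgr.Start(), then create the Service or fetch the existing one.

```go
package metrics

import (
	"context"

	v1 "k8s.io/api/core/v1"
	apierrors "k8s.io/apimachinery/pkg/api/errors"
	"k8s.io/apimachinery/pkg/types"
	"sigs.k8s.io/controller-runtime/pkg/client"
	"sigs.k8s.io/controller-runtime/pkg/client/config"
)

// getOrCreateService is a hypothetical helper mirroring the approach above.
func getOrCreateService(ctx context.Context, desired *v1.Service) (*v1.Service, error) {
	cfg, err := config.GetConfig()
	if err != nil {
		return nil, err
	}

	// client.New talks to the API server directly and does not depend on the
	// manager's cache having been started (same trick as the leader election code).
	c, err := client.New(cfg, client.Options{})
	if err != nil {
		return nil, err
	}

	if err := c.Create(ctx, desired); err != nil {
		if !apierrors.IsAlreadyExists(err) {
			return nil, err
		}
		// The Service already exists: return the existing object so callers
		// always get the Service that is actually in the cluster.
		existing := &v1.Service{}
		key := types.NamespacedName{Name: desired.Name, Namespace: desired.Namespace}
		if err := c.Get(ctx, key, existing); err != nil {
			return nil, err
		}
		return existing, nil
	}
	return desired, nil
}
```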


service, err := k8sutil.InitOperatorService()
// ExposeMetricsPort creates a Kubernetes Service to expose the passed metrics port.
func ExposeMetricsPort(ctx context.Context, port int32) (*v1.Service, error) {
A Member commented:

👍 On passing in the context.

Will we want to change this function signature at all if kubernetes-sigs/controller-runtime#273 gets merged? Would we go back to passing in the manager (or maybe the client directly)?

If so, I'm wondering if it would be worth anticipating that now to avoid an API change, or if we should just wait since we don't know exactly how it'll look.

Thoughts?

@lilic (Member, Author) commented:

> If so, I'm wondering if it would be worth anticipating that now to avoid an API change, or if we should just wait since we don't know exactly how it'll look.

Yes, I was thinking about that as well, but right now we have no idea whether that will get merged or what it will look like in the end, so I'm not sure we can fully predict it and plan around not breaking the API. We would also have to change the Leader function's signature if we use the approach from the above PR, so I'm not sure it makes a difference here. So yes, most likely if that gets merged we will break the API, or we just decide to leave it as is; we always have that choice.

A Member commented:

Good point about needing to change the leader election API as well.

In that case, I agree with waiting and breaking the API for both if necessary.

@joelanford (Member) left a comment

LGTM. Just one more question.

@hasbro17 (Contributor) left a comment

LGTM

@lilic merged commit 6b070cd into operator-framework:master on Jan 25, 2019
@lilic deleted the lili/metrics-helpers branch on January 25, 2019 09:15
@stepin commented Jan 25, 2019

Maybe this is the wrong place to ask, but how do I enable metrics? When I just start it in a container (built from the latest sources):

/usr/local/bin/ansible-operator run ansible --watches-file=/opt/ansible/watches.yaml

there is nothing on port 8383:

bash-4.2$ curl http://127.0.0.1:8383/metrics
curl: (7) Failed connect to 127.0.0.1:8383; Connection refused

@shawn-hurley (Member) commented:

@stepin I believe that we need to turn on metrics in the Ansible operator's main file.

Labels: size/L (denotes a PR that changes 100-499 lines, ignoring generated files)

Successfully merging this pull request may close these issues: Prometheus Integration

7 participants