Exposes metrics ports on pods in order to enable GCP Managed Prometheus #2712
Conversation
Build Succeeded 👏 Build Id: d752667a-1771-401b-a365-14d6f794b995. The following development artifacts have been built and will exist for the next 30 days, including a preview of the website (the last 30 builds are retained).
Thanks for digging into this! I'm now also learning how Google Cloud Managed Prometheus works! I think in general this looks like the correct approach (exposing the container ports on the relevant pods). It's worth noting that we have a metrics endpoint on the controller as well as the allocation system (controller's service monitor) - so this will need to cover both. 🤔 Another thought I had, not sure if it sticks, but does it make any sense to allow someone to add arbitrary container ports to either the controller or the allocation Pods? Much in the same way we do. Just trying to think more generically, rather than a specific solution for this specific problem. (Also, docs are good 😄) WDYT?
Hey Mark, the controller metrics port is actually already exposed, since it's the same port that is used for its other http traffic. There's probably some work that can be done to unify the way that these ports are set up/exposed, since some have names and port numbers coming from the values. Thinking about the more generic solution: outside of this port, I'm not sure what other ports even have something interesting running on them and aren't exposed on the pod by default (as opposed to a service), so I don't know if that would be of too much value at this point in time (but it might be if another one of these crops up).
That 100% makes sense. My thought here, then, would be to just leave the container port always open (i.e. don't bother putting in a helm configuration variable). It's inside the cluster anyway, so I don't think it really matters. @roberthbailey WDYT? Regarding documentation - good point re: platform specific. I'm thinking something generic like "if your metric collection agent needs to scrape container ports directly (such as with Google Cloud Managed Prometheus), the ports you would need to scrape can be found at {insert details}". How does that sound?
Build Succeeded 👏 Build Id: a2f03f1f-27fe-4020-900b-2138ede895d4. The following development artifacts have been built and will exist for the next 30 days, including a preview of the website (the last 30 builds are retained).
I agree that it should be fine to just add the port all the time. If nothing scrapes it then it doesn't do any harm, but it's there if someone wants to scrape it (using GMP or a different Prometheus scraper that discovers pods directly).
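For anyone following along, here is a minimal sketch of what a GMP PodMonitoring resource scraping such a container port could look like; the namespace, label selector, and port name below are illustrative placeholders, not the actual values from the Agones charts.

```yaml
# Sketch of a Google Cloud Managed Prometheus PodMonitoring resource.
# The namespace, matchLabels, and port name are placeholders; they need to
# match the labels and named containerPort on the pods being scraped.
apiVersion: monitoring.googleapis.com/v1
kind: PodMonitoring
metadata:
  name: agones-allocator-metrics   # hypothetical name
  namespace: agones-system
spec:
  selector:
    matchLabels:
      app: agones-allocator        # placeholder label selector
  endpoints:
    - port: metrics                # named containerPort exposed on the pod
      path: /metrics
      interval: 30s
```

Because PodMonitoring only discovers pods, this works only if the metrics port is declared on the pod spec itself, which is what this PR adds.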
Sounds like we have consensus! If we could add some docs with a feature tag around it for the next version, that would be perfect 👍🏻
Just a heads up, we are one week away from our release candidate, so if you have time to implement the above comments, it would be awesome to get that in for that release.
Sorry, the week got away from me. I think everything should be good now. Let me know if you want any changes with the little blurb I added in the docs.
Build Failed 😱 Build Id: 797931f1-ddf6-4050-bfd0-96bdc665d58c. To get permission to view the Cloud Build view, join the agones-discuss Google Group.
Build Succeeded 👏 Build Id: 9e75dc67-aacc-4e7b-aeda-2bf2ec7f35df. The following development artifacts have been built and will exist for the next 30 days, including a preview of the website (the last 30 builds are retained).
Co-authored-by: Mark Mandel <markmandel@google.com>
Build Failed 😱 Build Id: 286bc54c-9343-429c-926b-3974ee2a86f1. To get permission to view the Cloud Build view, join the agones-discuss Google Group.
oooh, I see what it is. If you could run
Lemme know if you run into any issues doing that, and I can do it on my end, and submit a PR to your PR 😄
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: austin-space, markmandel. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
New changes are detected. LGTM label has been removed.
Build Succeeded 👏 Build Id: 0af444b2-ccef-4959-8cd2-5759e104261c. The following development artifacts have been built and will exist for the next 30 days, including a preview of the website (the last 30 builds are retained).
Build Succeeded 👏 Build Id: 4915143f-ae06-4744-a3a1-9b76c5c9b1fe. The following development artifacts have been built and will exist for the next 30 days, including a preview of the website (the last 30 builds are retained).
What type of PR is this?
/kind feature
What this PR does / Why we need it: GKE Managed Prometheus (GMP) makes some interesting decisions in how it implements a prometheus-operator-like system. The most notable change is that there is no concept of a ServiceMonitor in GMP, only a PodMonitoring custom resource. As the name implies, this monitor only has visibility into pods, not into services. Fortunately, I can set up PodMonitoring resources that closely imitate the ServiceMonitor resources in the included helm charts; the only issue is that the metrics port for the allocator service is not defined on the pod itself, just on the metrics service. This is the least intrusive way of introducing this. Alternatively, a GMP flag could be included in the values and I could add all of the pod monitors as well.
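Roughly, the change amounts to declaring the metrics port on the pod spec itself, along the lines of the sketch below; the container name, port name, and port number are illustrative placeholders rather than the exact values used in the Agones charts.

```yaml
# Illustrative Deployment fragment: expose the metrics port on the pod itself
# so that pod-level scrapers (such as GMP's PodMonitoring) can discover it.
# The container name, port name, and port number are placeholders.
spec:
  template:
    spec:
      containers:
        - name: agones-allocator
          ports:
            - name: metrics        # named port a PodMonitoring endpoint can reference
              containerPort: 8080  # placeholder; use the port the metrics server listens on
```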
Which issue(s) this PR fixes: none that I'm aware of
Special notes for your reviewer: