tmpnet: Write config enabling metrics collection by prometheus #2764

Closed
wants to merge 5 commits into from

marun
Contributor

@marun marun commented Feb 23, 2024

Why this should be merged

Temporary networks used for testing previously lacked an easy way to enable metrics collection. This PR ensures that Prometheus has the configuration it needs to scrape the metrics endpoints of a temporary network.

How this works

  • Write a Prometheus file-based service discovery configuration to ~/.tmpnet/prometheus/file_sd_configs for each node on startup and remove it on shutdown. This enables scraping of all nodes in a network no matter when they are started.
  • Add a script to scrape temporary networks with Prometheus running in agent mode.
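
The per-node write-on-startup, remove-on-shutdown flow described above can be sketched as follows. This is a minimal illustration, not the code in this PR: the function names, the one-file-per-node naming, and the `network_uuid` and `node_id` label names are all assumptions made for the example. Only the JSON shape (a list of objects with `targets` and `labels`) comes from the Prometheus file-based service discovery format.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"path/filepath"
)

// sdConfig mirrors one entry of a Prometheus file-based service discovery
// file: a list of scrape targets plus labels attached to each of them.
type sdConfig struct {
	Targets []string          `json:"targets"`
	Labels  map[string]string `json:"labels"`
}

// renderSDConfig builds the JSON document for a single node.
// The label names are hypothetical; the PR does not list the ones tmpnet uses.
func renderSDConfig(networkUUID, nodeID, metricsAddr string) ([]byte, error) {
	return json.Marshal([]sdConfig{{
		Targets: []string{metricsAddr},
		Labels: map[string]string{
			"network_uuid": networkUUID,
			"node_id":      nodeID,
		},
	}})
}

// writeSDConfig stores the node's config under dir and returns its path.
// In this sketch it would be called when a node starts.
func writeSDConfig(dir, nodeID string, data []byte) (string, error) {
	if err := os.MkdirAll(dir, 0o755); err != nil {
		return "", err
	}
	path := filepath.Join(dir, nodeID+".json")
	return path, os.WriteFile(path, data, 0o644)
}

// removeSDConfig deletes the node's config so Prometheus stops scraping it.
// In this sketch it would be called when a node shuts down.
func removeSDConfig(path string) error {
	return os.Remove(path)
}

func main() {
	// A temporary directory stands in for ~/.tmpnet/prometheus/file_sd_configs.
	dir, err := os.MkdirTemp("", "file_sd_configs")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)

	data, err := renderSDConfig("example-network-uuid", "NodeID-example", "127.0.0.1:9650")
	if err != nil {
		panic(err)
	}
	path, err := writeSDConfig(dir, "NodeID-example", data)
	if err != nil {
		panic(err)
	}
	fmt.Printf("wrote %s:\n%s\n", path, data)

	if err := removeSDConfig(path); err != nil {
		panic(err)
	}
}
```

Keeping one file per node means a node only ever touches its own file, so nodes that start or stop at different times do not need to coordinate edits to a shared target list.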

How this was tested

Locally tested.

TODO

@marun marun added the testing (This primarily focuses on testing) and monitoring (This primarily focuses on logs, metrics, and/or tracing) labels Feb 23, 2024
@marun marun self-assigned this Feb 23, 2024
@marun marun linked an issue Feb 23, 2024 that may be closed by this pull request
5 tasks
@marun marun removed the monitoring (This primarily focuses on logs, metrics, and/or tracing) label Feb 23, 2024
@marun marun force-pushed the tmpnet-network-uuids branch 2 times, most recently from a5cb88c to de22c95 Compare March 6, 2024 18:48
@marun marun changed the base branch from tmpnet-network-uuids to master March 6, 2024 18:49
Previously, tmpnet networks were only identified by their path on
disk. In anticipation of centralizing metrics collection for temporary
networks, tmpnet will now generate a UUID for each temporary network
to ensure a unique identifier across execution environments.
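
As a rough illustration of the identifier described in this commit message, a random (version 4) UUID can be generated with only the Go standard library. The `newUUID` helper is hypothetical; the commit message does not say which UUID version or library tmpnet uses.

```go
package main

import (
	"crypto/rand"
	"fmt"
)

// newUUID returns a random (version 4) UUID in canonical text form.
// Hypothetical helper for illustration; not taken from tmpnet.
func newUUID() (string, error) {
	var b [16]byte
	if _, err := rand.Read(b[:]); err != nil {
		return "", err
	}
	b[6] = (b[6] & 0x0f) | 0x40 // set the version field to 4
	b[8] = (b[8] & 0x3f) | 0x80 // set the RFC 4122 variant bits
	return fmt.Sprintf("%x-%x-%x-%x-%x", b[0:4], b[4:6], b[6:8], b[8:10], b[10:16]), nil
}

func main() {
	id, err := newUUID()
	if err != nil {
		panic(err)
	}
	fmt.Println(id)
}
```

Unlike a path on disk, an identifier generated this way does not depend on where the network is stored, so it stays distinct when metrics from several machines or CI jobs are collected in one place.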
@marun marun marked this pull request as ready for review March 6, 2024 23:17
@marun marun closed this Mar 7, 2024
@marun marun reopened this Mar 7, 2024
@marun marun marked this pull request as draft March 7, 2024 00:29
Temporary networks used for testing previously lacked an easy way to
enable metrics collection. This PR ensures that prometheus has what it
needs to scrape the metrics endpoints of a temporary network and
enables scraping of CI jobs using temporary networks.

- Write prometheus configuration to ~/.tmpnet/prometheus/file_sd_configs for
  each node on startup and remove it on shutdown. This enables scraping
  of all nodes in a network no matter when they are started.

  Ref: https://prometheus.io/docs/guides/file-sd/

- Add script to scrape temporary networks with agent-mode
  prometheus. Works locally and in CI.
@marun marun closed this Mar 7, 2024
@marun marun deleted the tmpnet-prometheus-config branch March 7, 2024 04:05
Labels
testing (This primarily focuses on testing)
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

testing: Enable metrics collection for tmpnet clusters
1 participant