Emit per-task allocated resources telemetry #4241

Closed
multani opened this issue May 2, 2018 · 10 comments

Comments

@multani
Contributor

multani commented May 2, 2018

We would like to measure how many resources a task uses compared to how many it reserved in the first place, with two specific goals in mind:

  • detect tasks that reserved a lot of resources but actually use only a small fraction of them. We could then reduce the resource reservations for these tasks.
  • detect tasks that are getting close to using all of their reserved resources. We could then tune the task and/or its reservations to prevent it from starving in the future.
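
To make the two goals concrete, here is a minimal sketch of the comparison, assuming we already have a usage figure and a reservation figure for each task. The `TaskSample` type, the `classify` function, and the 30% / 90% thresholds are all hypothetical and only illustrate the idea; they are not part of Nomad.

```python
from dataclasses import dataclass


@dataclass
class TaskSample:
    """One observation of a task (hypothetical type, for illustration only)."""
    task: str
    reserved_mb: float
    used_mb: float


def classify(sample: TaskSample, low: float = 0.3, high: float = 0.9) -> str:
    """Label a task by how much of its reservation it uses.

    The thresholds are arbitrary assumptions: below `low` the task looks
    over-reserved, at or above `high` it is close to its reservation.
    """
    if sample.reserved_mb <= 0:
        raise ValueError("reserved_mb must be positive")
    ratio = sample.used_mb / sample.reserved_mb
    if ratio < low:
        return "over-reserved"
    if ratio >= high:
        return "near-limit"
    return "ok"
```

A task reserving 1024 MB and using 100 MB would fall in the first bucket (candidate for a smaller reservation), while one using 1000 MB would fall in the second (candidate for tuning or a larger reservation).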

AFAIK, no allocated-resources metrics are emitted per task. #2327 provides these allocated metrics at the client level. #2330 was supposed to provide them at the task level, but I guess @burdandrei deleted his branch :) and the code is no longer available.

@burdandrei
Contributor

@multani I just moved it to another branch.
In the meantime, we're pretty happy with telemetry per allocation.
FYI, https://github.com/jippi/hashi-ui can show per-task usage, but only in real time.

@multani
Contributor Author

multani commented May 3, 2018

@burdandrei Thanks, I just saw the branch now!

As for jippi/hashi-ui, I guess it does this by polling the Nomad API directly (the data is definitely available there). It would be nice to provide these metrics as part of the telemetry process, though :)

@burdandrei
Contributor

@multani API calls it is.
I could bring this PR back to life, but I still have smaller PRs that have been open for months =( #3882 for example

@mlehner616

Just a bump on this. Because Nomad is so strict about memory reservations, it is crucial to be able to measure a given allocation's usage against its own reservation over time. Measuring this in real time via the API is almost pointless because it misses everything that happens while you're not observing the task (a nightly process or a high-load request, for example).
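
A rough sketch of why history matters: if usage samples are recorded continuously, the peak can be compared against the reservation, whereas a one-off look only sees the current value. `PeakTracker` and its methods are hypothetical names for illustration, not anything Nomad provides.

```python
from collections import defaultdict


class PeakTracker:
    """Keeps the highest usage seen per task (hypothetical helper)."""

    def __init__(self) -> None:
        # Highest observed usage in MB, keyed by task name.
        self._peak = defaultdict(float)

    def record(self, task: str, used_mb: float) -> None:
        """Store a usage sample, keeping only the maximum per task."""
        if used_mb > self._peak[task]:
            self._peak[task] = used_mb

    def peak_ratio(self, task: str, reserved_mb: float) -> float:
        """Return peak usage as a fraction of the reservation."""
        if reserved_mb <= 0:
            raise ValueError("reserved_mb must be positive")
        return self._peak[task] / reserved_mb
```

With samples of 200, 900, and 250 MB against a 1024 MB reservation, a spot check at the last sample suggests roughly 24% usage, while the recorded peak shows the task actually reached about 88% of its reservation.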

@margueritepd
Contributor

margueritepd commented Sep 27, 2018

@burdandrei please open up a PR with your branch! I see that #3882 has been merged since your comment :)

@burdandrei
Contributor

@margueritepd I hardly believe such an outdated branch will survive the rebase, but I'll see what's left of it.

@stale

stale bot commented May 10, 2019

Hey there

Since this issue hasn't had any activity in a while - we're going to automatically close it in 30 days. If you're still seeing this issue with the latest version of Nomad, please respond here and we'll keep this open and take another look at this.

Thanks!

@mlehner616

Bump

@tgross tgross added this to Needs Roadmapping in Nomad - Community Issues Triage Feb 12, 2021
@tgross tgross added the stage/needs-verification (Issue needs verifying it still exists) label Mar 4, 2021
@tgross tgross added the stage/needs-discussion label and removed the stage/needs-verification (Issue needs verifying it still exists) label Jul 8, 2022
@tgross tgross changed the title from "Emits per-task allocated resources telemetry" to "Emit per-task allocated resources telemetry" Jul 8, 2022
@tgross
Member

tgross commented Jul 8, 2022

Doing a little bit of issue cleanup... allocation metrics include task labels. I'm not sure off the top of my head when this was introduced, but see: https://www.nomadproject.io/docs/operations/metrics-reference#allocation-metrics
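
As a rough sketch of how those task labels could be used, the snippet below pairs a usage gauge with an allocated gauge per `(alloc_id, task)` from Prometheus-style text. The metric names `nomad_client_allocs_memory_rss` and `nomad_client_allocs_memory_allocated`, the label names, and the assumption that both gauges share a unit should all be checked against the metrics reference linked above; the parser is deliberately simplified (no timestamps, no `}` inside label values).

```python
import re

# Simplified matcher for lines such as: name{label="value",...} 123
LINE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)\{(?P<labels>[^}]*)\}\s+(?P<value>\S+)$'
)
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_line(line):
    """Return (name, labels, value), or None for lines that do not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    labels = dict(LABEL_RE.findall(match.group("labels")))
    return match.group("name"), labels, float(match.group("value"))


def usage_ratio_by_task(
    text,
    used_metric="nomad_client_allocs_memory_rss",             # assumed name
    allocated_metric="nomad_client_allocs_memory_allocated",  # assumed name
):
    """Map (alloc_id, task) to used / allocated for tasks that have both."""
    used, allocated = {}, {}
    for line in text.splitlines():
        parsed = parse_line(line)
        if parsed is None:
            continue  # comments, blank lines, unlabelled samples
        name, labels, value = parsed
        key = (labels.get("alloc_id"), labels.get("task"))
        if name == used_metric:
            used[key] = value
        elif name == allocated_metric:
            allocated[key] = value
    # Skip tasks with a missing or zero allocation to avoid dividing by zero.
    return {k: used[k] / allocated[k] for k in used if allocated.get(k)}
```

Feeding it the text scraped from a metrics endpoint would give one ratio per task, which is the per-task comparison this issue originally asked for.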

@tgross tgross closed this as completed Jul 8, 2022
Nomad - Community Issues Triage automation moved this from Needs Roadmapping to Done Jul 8, 2022
@github-actions

github-actions bot commented Nov 6, 2022

I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Nov 6, 2022
6 participants