
Introduce jobtype variable in worker dashboard #3262

Merged
croissanne merged 8 commits into osbuild:main from worker-dash-rework on Mar 21, 2023

Member

@croissanne croissanne commented Feb 2, 2023

image


The tradeoff is the second variable, target_duration. Because it's tricky to have a variable with multiple values (which would let us define the target duration per jobtype), you need to know which target_duration to set for which jobtype: 1792 for osbuild jobs, 32 for depsolve/resolve jobs.

https://grafana.stage.devshift.net/d/image-builder-worker-sanne/image-builder-worker-sanne?orgId=1

@croissanne croissanne changed the title from "Worker dash rework" to "Introduce jobtype variable in worker dashboard" Feb 2, 2023
@lavocatt
Contributor

lavocatt commented Feb 6, 2023

Can you explain a bit more why there is a need to specify the target time statically, please? I think I kind of understand, but I'm a bit unsure.

@croissanne
Member Author

> Can you explain a bit more why there is a need to specify the target time statically, please? I think I kind of understand, but I'm a bit unsure.

So per jobtype we have different targets for what constitutes "fast". We commit to 95% of depsolve jobs taking 32s or less, and we commit to osbuild jobs taking less than 30 minutes. I haven't really found a way to change the jobtype and the target time at the same time, so I made a second variable.

The target time is needed to correctly calculate the error budget and slow request rates in the job duration and job wait duration sections.
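
To illustrate why the target matters here, a rough Python sketch follows. It is not the dashboard's actual queries (those are Grafana/Prometheus expressions that this thread does not show). The names `TARGET_DURATION_SECONDS`, `slow_rate` and `budget_remaining` are hypothetical, the jobtype keys are placeholders for whatever label values the metrics really use, the sample durations are made up, and applying the 95% objective to every jobtype is an assumption.

```python
# Hypothetical mapping from jobtype to target duration in seconds,
# using the values mentioned in this PR (1792 for osbuild, 32 for depsolve).
TARGET_DURATION_SECONDS = {
    "osbuild": 1792,
    "depsolve": 32,
}

SLO = 0.95  # assumed objective: 95% of jobs finish within the target


def slow_rate(durations, target):
    """Fraction of jobs that took longer than the target duration."""
    if not durations:
        return 0.0
    slow = sum(1 for d in durations if d > target)
    return slow / len(durations)


def budget_remaining(durations, target, slo=SLO):
    """Fraction of the error budget left (1.0 = untouched, below 0 = overspent)."""
    allowed = 1.0 - slo  # share of jobs that may be slow without breaking the SLO
    return 1.0 - slow_rate(durations, target) / allowed


# Made-up sample of 100 job durations in seconds; only one exceeds 32s.
depsolve_durations = [5, 12, 31, 40] + [10] * 96
target = TARGET_DURATION_SECONDS["depsolve"]
print(slow_rate(depsolve_durations, target))
print(budget_remaining(depsolve_durations, target))
```

Evaluating the same sample against the osbuild target of 1792 would report no slow jobs and a full budget, which is why the target has to match the selected jobtype.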

@kingsleyzissou
Copy link
Contributor

> The tradeoff is the second variable, target_duration. Because it's tricky to have a variable with multiple values (which would let us define the target duration per jobtype), you need to know which target_duration to set for which jobtype: 1792 for osbuild jobs, 32 for depsolve/resolve jobs.

Awesome stuff. It looks a lot better and less cluttered like this. Pity about the jobtype, but I don't think it's a big deal. I mean, this is mostly for us, right? And I guess we'll know to switch the target duration for each job. Having said that, would it make more sense for the input selector to be a single select, so you can only view one jobtype at a time, rather than a multi-select?

@croissanne
Member Author

Discussed with Tom:

  • job wait duration per architecture
  • future: stats per image type

@croissanne
Member Author

> Discussed with Tom:
>
>   • job wait duration per architecture

OK, so we don't actually add the arch label to the job_wait_duration metric.

>   • future: stats per image type

@croissanne croissanne marked this pull request as ready for review March 13, 2023 12:51
@croissanne croissanne marked this pull request as draft March 13, 2023 12:51
This removes the rows of panels per job type and uses the jobtype variable.
This aligns vertical dividers between panels across rows.
95th percentile duration is now a fixed colour, as it's tricky to get dynamic thresholds based on the job type.

Budget remaining thresholds are now green only at infinity, turn yellow below 4 weeks, and turn red when budget consumption would only last 3 weeks (out of 4).
Display the wait duration of jobs per architecture.
@croissanne
Member Author

Job wait duration per architecture is now available after #3289

@croissanne croissanne marked this pull request as ready for review March 13, 2023 14:15
@croissanne croissanne merged commit a2a3a26 into osbuild:main Mar 21, 2023
@croissanne croissanne deleted the worker-dash-rework branch March 21, 2023 11:34