Grafana dashboard updates #5255

Merged
merged 41 commits into from
May 12, 2024

@Tom-Newton Tom-Newton commented Apr 19, 2024

Tracking issue

Why are the changes needed?

There are a few issues with the dashboards as is. This includes some bugs and some outdated metric names.

What changes were proposed in this pull request?

  • This is a series of incremental changes I made while trying to understand flyte propeller performance and debug a problem on our network.
    • Add a few extra graphs and shuffle some around a bit.
    • Fix some aggregation bugs, e.g. a missing rate function where the title says rate.
    • Fix some axis labels.
    • Correct some metric names based on flyte-core 1.11.
    • Add a few descriptions.
  • This is probably still a long way from perfect, but I think it's still worth contributing.
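
To illustrate the "missing rate function" kind of bug mentioned above: a panel titled as a rate should wrap its counter in `rate(...)` rather than plot the raw, ever-increasing counter. The following is a minimal sketch, not the code from this PR; the helper `rate_query` and the metric name `example_rounds_skipped_total` are hypothetical and only show the shape of the PromQL expression.

```python
from __future__ import annotations


def rate_query(counter: str, window: str = "5m", by: tuple[str, ...] = ()) -> str:
    """Build a PromQL expression for the summed per-second rate of a counter.

    `counter` is a hypothetical metric name; `by` optionally lists labels
    to keep when aggregating across series.
    """
    # Without grouping labels, aggregate everything into one series.
    agg = f"sum by ({', '.join(by)}) " if by else "sum"
    # rate() turns a monotonically increasing counter into a per-second rate
    # averaged over the given range window.
    return f"{agg}(rate({counter}[{window}]))"


if __name__ == "__main__":
    # Hypothetical metric name, used purely for illustration.
    print(rate_query("example_rounds_skipped_total"))
    print(rate_query("example_rounds_skipped_total", "1m", ("project",)))
```

The reverse also applies: a gauge or a latency quantile should not be wrapped in `rate(...)`, which is presumably what the "Don't use rate for round latency graphs" commit addresses.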

How was this patch tested?

Setup process

Screenshots

image
image

Check all the applicable boxes

  • I updated the documentation accordingly. Not applicable
  • All new and existing tests passed. Not applicable
  • All commits are signed-off.

Related PRs

Docs link

@Tom-Newton Tom-Newton force-pushed the tomnewton/dashboard_updates branch 2 times, most recently from 402ba45 to 598bd8c Compare April 19, 2024 19:32
@Tom-Newton
Contributor Author

It looks like the CI failure is because matplotlib's website is down and breaking a docs build.

@Tom-Newton Tom-Newton marked this pull request as ready for review April 19, 2024 20:09
@dosubot dosubot bot added size:XXL This PR changes 1000+ lines, ignoring generated files. enhancement New feature or request labels Apr 19, 2024
Contributor

@davidmirror-ops davidmirror-ops left a comment

Thank you!

@davidmirror-ops
Contributor

@neverett any idea why CI fails with a doc that's not changed by this PR?

@Tom-Newton
Contributor Author

I think the doc failure was caused by matplotlib's website going down. Is it still failing?

@Tom-Newton
Contributor Author

Actually that looks like a different error now

@neverett
Contributor

@Tom-Newton @davidmirror-ops I think you may need to merge master in to pick up changes from #5254 since that introduced some significant updates to the docs build

@davidmirror-ops
Contributor

@Tom-Newton could you merge master and try again?

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>
@Tom-Newton Tom-Newton force-pushed the tomnewton/dashboard_updates branch from c35d80a to 5af4a10 Compare May 7, 2024 18:46
@Tom-Newton
Contributor Author

Sorry for being so slow. I just rebased on master 🤞

codecov bot commented May 7, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 61.09%. Comparing base (8db9901) to head (5af4a10).
Report is 3 commits behind head on master.

Additional details and impacted files
@@             Coverage Diff             @@
##           master    #5255       +/-   ##
===========================================
- Coverage   79.30%   61.09%   -18.21%     
===========================================
  Files          18      794      +776     
  Lines        1295    51213    +49918     
===========================================
+ Hits         1027    31289    +30262     
- Misses        204    17043    +16839     
- Partials       64     2881     +2817     
Flag                        Coverage Δ
unittests-datacatalog       69.31% <ø> (?)
unittests-flyteadmin        58.86% <ø> (?)
unittests-flytecopilot      17.79% <ø> (?)
unittests-flytectl          68.30% <ø> (?)
unittests-flyteidl          79.30% <ø> (ø)
unittests-flyteplugins      61.94% <ø> (?)
unittests-flytepropeller    57.32% <ø> (?)
unittests-flytestdlib       65.75% <ø> (?)


@Tom-Newton Tom-Newton merged commit 2f38d65 into flyteorg:master May 12, 2024
51 of 52 checks passed
robert-ulbrich-mercedes-benz pushed a commit to robert-ulbrich-mercedes-benz/flyte that referenced this pull request Jul 2, 2024
* Add enqueued workflows graph

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Update queue metrics

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Better aggregations

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Add enqueued workflows graph

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Fix trailing comma

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Add transition latency, etc

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Fix

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Check-in latest dashboard

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Fixes

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Add informer stats to dashboard

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Add more stats on workflow store

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* More round metrics

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Fix Y axis labels

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Check-in latest dashboard

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Add garbage collection stats

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Fix scale on cache hit rate

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Working flytepropeller grpc histograms

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Change to a single accumulated graph

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Add GRPC histogram on flyteadmin too

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Fix admin metrics name

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Update workqueue metric names for flyte 1.11.0

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Add other queues to queue graphs

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Add round success counter, remove workqueue latencies

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Split enqueues by type and add missing rate for skip rate metric

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Don't use rate for round latency graphs

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Fix metric name for plugin failures and make it a rate

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Fix duplicate refIds

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Fix typo

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Update lots of titles and units

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Add round errors graph

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Add total round rate

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Group metrics onto the same graphs where relevant

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Small updates

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Fix streaks graph

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* rename legend

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Rename

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Add descriptions

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Add a missing title

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Remove unused imports

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Adjust a description

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* More minor updates

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

---------

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>
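
The "Fix duplicate refIds" commit above concerns Grafana query targets: within a single panel, each target is expected to carry a distinct `refId`. As a rough sketch of how such duplicates could be detected in an exported dashboard JSON (assuming the usual `panels` / `targets` / `refId` layout; the helper `duplicate_ref_ids` is hypothetical and not part of this PR):

```python
from collections import Counter


def duplicate_ref_ids(dashboard: dict) -> dict:
    """Map panel title -> sorted refIds that appear more than once in that panel."""
    duplicates = {}

    def visit(panels):
        for panel in panels:
            # Count how often each refId is used by this panel's query targets.
            counts = Counter(t.get("refId") for t in panel.get("targets", []))
            repeated = sorted(r for r, n in counts.items() if r is not None and n > 1)
            if repeated:
                duplicates[panel.get("title", "<untitled>")] = repeated
            # Row panels may nest further panels under their own "panels" key.
            visit(panel.get("panels", []))

    visit(dashboard.get("panels", []))
    return duplicates


if __name__ == "__main__":
    # Tiny hand-written dashboard fragment, used purely for illustration.
    sample = {
        "panels": [
            {"title": "Round rate", "targets": [{"refId": "A"}, {"refId": "A"}]},
            {"title": "Queue depth", "targets": [{"refId": "A"}, {"refId": "B"}]},
        ]
    }
    print(duplicate_ref_ids(sample))
```

A check like this could be run against the generated JSON before checking a dashboard in.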
Labels
enhancement New feature or request size:XXL This PR changes 1000+ lines, ignoring generated files.
3 participants