-
Notifications
You must be signed in to change notification settings - Fork 5.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[data] Change counts of metrics to rates of metrics #47236
[data] Change counts of metrics to rates of metrics #47236
Conversation
python/ray/dashboard/modules/metrics/dashboards/data_dashboard_panels.py
Outdated
Show resolved
Hide resolved
unit="bytes", | ||
title="Bytes Output / Second", | ||
description="Bytes output per second by dataset operators.", | ||
unit="bytes/sec", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
after making this change, could you also share a screenshot of the dashboard with the fixes, to verify that the units properly show up on the axes + the conversion works (e.g. once we start hitting 100's of MBs, this should display as MB/s instead of B/s)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
python/ray/dashboard/modules/metrics/dashboards/data_dashboard_panels.py
Outdated
Show resolved
Hide resolved
python/ray/dashboard/modules/metrics/dashboards/data_dashboard_panels.py
Outdated
Show resolved
Hide resolved
python/ray/dashboard/modules/metrics/dashboards/data_dashboard_panels.py
Outdated
Show resolved
Hide resolved
python/ray/dashboard/modules/metrics/dashboards/data_dashboard_panels.py
Outdated
Show resolved
Hide resolved
Signed-off-by: Matthew Owen <mowen@anyscale.com>
…_panels.py Co-authored-by: Scott Lee <scottjlee@users.noreply.github.com> Signed-off-by: Matthew Owen <omatthew98@berkeley.edu>
…_panels.py Co-authored-by: Scott Lee <scottjlee@users.noreply.github.com> Signed-off-by: Matthew Owen <omatthew98@berkeley.edu>
…_panels.py Co-authored-by: Scott Lee <scottjlee@users.noreply.github.com> Signed-off-by: Matthew Owen <omatthew98@berkeley.edu>
…_panels.py Co-authored-by: Scott Lee <scottjlee@users.noreply.github.com> Signed-off-by: Matthew Owen <omatthew98@berkeley.edu>
…_panels.py Co-authored-by: Scott Lee <scottjlee@users.noreply.github.com> Signed-off-by: Matthew Owen <omatthew98@berkeley.edu>
Signed-off-by: Matthew Owen <mowen@anyscale.com>
Signed-off-by: Matthew Owen <mowen@anyscale.com>
dc6a227
to
51774d5
Compare
## Why are these changes needed? This changes the counts of various metrics for ray data to rates for those same metrics. This allows for easier visualization. There are a variety of different approaches we can take here to calculate the rates: * Window of rate: We can compute the rate over a time window of specified size, the relevancy of this depends on the function chosen. * Function: * `rate`: Calculates the average per-second rate of increase of a time series over a specified range. Less volatile, produces more smoothing depending on the window. * `irate`: Calculates the per-second rate of increase based on the two most recent data points. More volatile and reactive, less smoothing applied, not as sensitive to the window size because last two data points are used. ### Examples: **Yellow: `rate[30s]`** (Have selected this as this seems to be the best balance between the various options) Orange: `irate[30s]` Blue: `rate[5m]` <img width="2688" alt="image" src="https://github.com/user-attachments/assets/a2424318-e06a-4b32-8d8c-83fc1dfbb540"> #### Sample code ```python import ray import time context = ray.init(include_dashboard=True) print(context.dashboard_url) def f(x): time.sleep(0.1) return x def g(x): time.sleep(0.1) return x ray.data.range(10000).map(f).materialize() ``` ## Related issue number <!-- For example: "Closes ray-project#1234" --> ## Checks - [ ] I've signed off every commit(by using the -s flag, i.e., `git commit -s`) in this PR. - [ ] I've run `scripts/format.sh` to lint the changes in this PR. - [ ] I've included any doc changes needed for https://docs.ray.io/en/master/. - [ ] I've added any new APIs to the API Reference. For example, if I added a method in Tune, I've added it in `doc/source/tune/api/` under the corresponding `.rst` file. - [ ] I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/ - Testing Strategy - [ ] Unit tests - [ ] Release tests - [ ] This PR is not tested :( --------- Signed-off-by: Matthew Owen <mowen@anyscale.com> Signed-off-by: Matthew Owen <omatthew98@berkeley.edu> Co-authored-by: Scott Lee <scottjlee@users.noreply.github.com> Signed-off-by: ujjawal-khare <ujjawal.khare@dream11.com>
Why are these changes needed?
This changes the counts of various metrics for ray data to rates for those same metrics. This allows for easier visualization.
There are a variety of different approaches we can take here to calculate the rates:
rate
: Calculates the average per-second rate of increase of a time series over a specified range. Less volatile, produces more smoothing depending on the window.irate
: Calculates the per-second rate of increase based on the two most recent data points. More volatile and reactive, less smoothing applied, not as sensitive to the window size because last two data points are used.Examples:
Yellow:
rate[30s]
(Have selected this as this seems to be the best balance between the various options)Orange:
irate[30s]
Blue:
rate[5m]
Sample code
Related issue number
Checks
git commit -s
) in this PR.scripts/format.sh
to lint the changes in this PR.method in Tune, I've added it in
doc/source/tune/api/
under thecorresponding
.rst
file.