-
Notifications
You must be signed in to change notification settings - Fork 5.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Ray Core] Fix the Paths in the Generated Monitoring Configs to Consider temp-dir #47871
Conversation
… temp-dir Signed-off-by: Mengjin Yan <mengjinyan3@gmail.com>
Signed-off-by: Mengjin Yan <mengjinyan3@gmail.com>
Signed-off-by: Mengjin Yan <mengjinyan3@gmail.com>
2. fix comment 3. Update the logic in the install_and_start_prometheus to continue leverage the hard coded file, in order to avoid generating a file in user's package or working directory. Tests added accordingly Signed-off-by: Mengjin Yan <mengjinyan3@gmail.com>
Signed-off-by: Mengjin Yan <mengjinyan3@gmail.com>
Signed-off-by: Mengjin Yan <mengjinyan3@gmail.com>
# in install_and_start_prometheus.py is updated to reload the config file. | ||
shutil.copy(PROMETHEUS_CONFIG_INPUT_PATH, prometheus_config_output_path) | ||
|
||
# The code here is slightly different from how the prometheus config is |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure I fully understand this comment section. What about:
# This code generates the Prometheus config based on the custom temporary root path
# set by the user at Ray cluster start up (via --temp-dir). In contrast, start_prometheus in
# install_and_start_prometheus.py uses a hardcoded Prometheus config at
# PROMETHEUS_CONFIG_INPUT_PATH that always uses "/tmp/ray". Other than the root path,
# the config file generated here is identical to that hardcoded config file.
# However, if in the future Ray ever modifies this file at runtime, we'll | ||
# need to use the user-friendly location instead, and reload the config | ||
# file after it's updated by Ray. | ||
# The function assumes, to-be-prometheus-monitored ray cluster uses the default |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit:
# This function assumes the Ray cluster to be monitored by Prometheus uses the default
# configuration with "/tmp/ray" as the root temporary directory.
#
# This is to support the `ray metrics launch-prometheus` command, when a ray cluster
# is not started yet and we have no way to get a `--temp-dir` anywhere. So we choose to
# use a hardcoded default value.
Signed-off-by: Mengjin Yan <mengjinyan3@gmail.com>
@angelinalg please take a look at this - Mengjin fixed an issue about when users specify |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
some style nits and a typo. For future reference, please run Vale to get style feedback: https://docs.ray.io/en/latest/ray-contribute/docs.html#how-to-use-vale
@@ -22,6 +22,12 @@ For a quick demo, you can run Prometheus locally on your machine. Follow the qui | |||
|
|||
### Quickstart: Running Prometheus locally | |||
|
|||
```{admonition} Note | |||
:class: note | |||
If you need to change the root temporary directory by using "--temp-dir" in your ray |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you need to change the root temporary directory by using "--temp-dir" in your ray | |
If you need to change the root temporary directory by using "--temp-dir" in your Ray |
```{admonition} Note | ||
:class: note | ||
If you need to change the root temporary directory by using "--temp-dir" in your ray | ||
cluster setup, please follow the [manual steps](#optional-manual-running-prometheus-locally) to setup Prometheus locally. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cluster setup, please follow the [manual steps](#optional-manual-running-prometheus-locally) to setup Prometheus locally. | |
cluster setup, follow these [manual steps](#optional-manual-running-prometheus-locally) to set up Prometheus locally. |
@@ -76,7 +82,7 @@ tar xvfz prometheus-*.tar.gz | |||
cd prometheus-* | |||
``` | |||
|
|||
Ray provides a Prometheus config that works out of the box. After running Ray, you can find the config at `/tmp/ray/session_latest/metrics/prometheus/prometheus.yml`. | |||
Ray provides a Prometheus config that works out of the box. After running Ray, you can find the config at `/tmp/ray/session_latest/metrics/prometheus/prometheus.yml`. If you specify the `--temp-dir={your_temp_path}` when starting the ray cluster, the config file will be at `{yout_temp_path}/session_latest/metrics/prometheus/prometheus.yml` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ray provides a Prometheus config that works out of the box. After running Ray, you can find the config at `/tmp/ray/session_latest/metrics/prometheus/prometheus.yml`. If you specify the `--temp-dir={your_temp_path}` when starting the ray cluster, the config file will be at `{yout_temp_path}/session_latest/metrics/prometheus/prometheus.yml` | |
Ray provides a Prometheus config that works out of the box. After running Ray, you can find the config at `/tmp/ray/session_latest/metrics/prometheus/prometheus.yml`. If you specify the `--temp-dir={your_temp_path}` when starting the Ray cluster, the config file is at: `{your_temp_path}/session_latest/metrics/prometheus/prometheus.yml` |
Signed-off-by: Mengjin Yan <mengjinyan3@gmail.com>
<!-- Thank you for your contribution! Please review https://github.com/ray-project/ray/blob/master/CONTRIBUTING.rst before opening a pull request. --> <!-- Please add a reviewer to the assignee section when you create a PR. If you don't have the access to it, we will shortly find a reviewer and assign them to your PR. --> ## Why are these changes needed? This PR fixed the style suggestions in #47871. The original PR got accidentally merged before the suggestions are fixed. <!-- Please give a short summary of the change and the problem this solves. --> ## Related issue number <!-- For example: "Closes #1234" --> ## Checks - [x] I've signed off every commit(by using the -s flag, i.e., `git commit -s`) in this PR. - [ ] I've run `scripts/format.sh` to lint the changes in this PR. - [ ] I've included any doc changes needed for https://docs.ray.io/en/master/. - [ ] I've added any new APIs to the API Reference. For example, if I added a method in Tune, I've added it in `doc/source/tune/api/` under the corresponding `.rst` file. - [ ] I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/ - Testing Strategy - [ ] Unit tests - [ ] Release tests - [ ] This PR is not tested :( --------- Signed-off-by: Mengjin Yan <mengjinyan3@gmail.com> Co-authored-by: angelinalg <122562471+angelinalg@users.noreply.github.com>
… Configs to Consider temp-dir Signed-off-by: Mengjin Yan <mengjinyan3@gmail.com>
Signed-off-by: Mengjin Yan <mengjinyan3@gmail.com>
…48054) * Resubmit #47871 and #48030 as they are reverted as release blocker * Fixed the failed test in windows CI test. * The test `test_metrics_folder_and_content` failed because of the different file separator between linux and Windows * The PR fix the issue by using `os.path.join` to generate the path for validation instead of directly using formatted string Signed-off-by: Mengjin Yan <mengjinyan3@gmail.com>
…to Consider temp-dir" (ray-project#48050) Reverts ray-project#47871
…led Test for Windows (ray-project#48054) * Resubmit ray-project#47871 and ray-project#48030 as they are reverted as release blocker * Fixed the failed test in windows CI test. * The test `test_metrics_folder_and_content` failed because of the different file separator between linux and Windows * The PR fix the issue by using `os.path.join` to generate the path for validation instead of directly using formatted string Signed-off-by: Mengjin Yan <mengjinyan3@gmail.com>
…led Test for Windows (ray-project#48054) * Resubmit ray-project#47871 and ray-project#48030 as they are reverted as release blocker * Fixed the failed test in windows CI test. * The test `test_metrics_folder_and_content` failed because of the different file separator between linux and Windows * The PR fix the issue by using `os.path.join` to generate the path for validation instead of directly using formatted string Signed-off-by: Mengjin Yan <mengjinyan3@gmail.com>
…led Test for Windows (ray-project#48054) * Resubmit ray-project#47871 and ray-project#48030 as they are reverted as release blocker * Fixed the failed test in windows CI test. * The test `test_metrics_folder_and_content` failed because of the different file separator between linux and Windows * The PR fix the issue by using `os.path.join` to generate the path for validation instead of directly using formatted string Signed-off-by: Mengjin Yan <mengjinyan3@gmail.com> Signed-off-by: mohitjain2504 <mohit.jain@dream11.com>
Why are these changes needed?
Today, when a user specifies the
--temp-dir
when starting ray, the paths in the default config files generated are still with the default root directory/tmp/ray
. The issue is because the default config files are directly copied from the package when ray starts and the paths are all hardcoded.This PR fix the issue by generating the corresponding config files with the corresponding path at runtime.
Also:
ray metrics launch-prometheus
command because the command can be executed with/without ray cluster started. The doc is updated accordingly to suggest the user to use manual mode to start prometheus is --temp-dir will be specifiedRelated issue number
Closes #41648
Checks
git commit -s
) in this PR.scripts/format.sh
to lint the changes in this PR.method in Tune, I've added it in
doc/source/tune/api/
under thecorresponding
.rst
file.