[core] Exception: Failed to read dashboard.err file #34504

Closed
shreyanssethi opened this issue Apr 17, 2023 · 6 comments
Labels
bug Something that is supposed to be working; but isn't core Issues that should be addressed in Ray Core core-observability good first issue Great starter issue for someone just starting to contribute to Ray P2 Important issue, but not time-critical stability

Comments

@shreyanssethi
shreyanssethi commented Apr 17, 2023

What happened + What you expected to happen

I am trying to start a Ray head node with the following command:

ray start --head \
    --port $RAY_PORT \
    --dashboard-port $((RAY_PORT + 1)) \
    --include-dashboard True \
    --object-store-memory 10000000000 \
    --num-cpus 0 --num-gpus 0 \
    --temp-dir ./temp_link

And I keep getting the error:
Couldn't read dashboard.log file. Error: [Errno 2] No such file or directory: './temp_link/session_2023-04-17_18-41-09_060425_2337608/logs/dashboard.log'. It means the dashboard is broken even before it initializes the logger (mostly dependency issues). Reading the dashboard.err file which contains stdout/stderr.
and then:
Exception: Failed to read dashboard.err file: cannot mmap an empty file

I have checked that the ./temp_link/session_2023-04-17_18-41-09_060425_2337608/logs/ directory exists, but it contains no dashboard.log file. The Ray cluster also fails to launch when I set --include-dashboard to False.
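The second message, "cannot mmap an empty file", matches the text of the ValueError that Python's mmap module raises when asked to map a zero-byte file. A minimal sketch of how one might inspect the session's logs directory and read a log tail without hitting that error follows. It is not Ray's actual code, and the helper names `describe_dashboard_logs` and `tail_with_mmap` are hypothetical.

```python
import mmap
import os


def describe_dashboard_logs(logs_dir):
    """Map each dashboard log file name to its size in bytes, or None if absent."""
    report = {}
    for name in ("dashboard.log", "dashboard.err"):
        path = os.path.join(logs_dir, name)
        report[name] = os.path.getsize(path) if os.path.exists(path) else None
    return report


def tail_with_mmap(path, num_lines=20):
    """Return the last `num_lines` lines of `path`, tolerating an empty file."""
    # mmap.mmap(fileno, 0) raises ValueError("cannot mmap an empty file")
    # for a zero-byte file, so check the size before mapping.
    if os.path.getsize(path) == 0:
        return []
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            data = mm[:]
    return data.decode("utf-8", errors="replace").splitlines()[-num_lines:]
```

Calling `describe_dashboard_logs` on the logs directory from the report would be expected to show `None` for dashboard.log and `0` for dashboard.err, which is the combination both error messages describe.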

I know that others experienced similar issues here (#26320) and I tried using the following fix:

pip install grpcio==1.49.1
pip uninstall -y ray
pip install -U "ray[default]"

However, the issue persists.

Versions / Dependencies

Linux
Python 3.10.6
ray 2.3.1
grpcio 1.49.1
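
Since the reinstall steps above change package versions, it can help to confirm what is actually installed in the active environment. A small sketch using the standard library's `importlib.metadata`; the function name `installed_versions` is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages=("ray", "grpcio")):
    """Map each package name to its installed version, or None if missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # The package is not installed in this environment.
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions().items():
        print(f"{name}: {ver if ver is not None else 'not installed'}")
```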

Reproduction script

ray start --head \
    --port $RAY_PORT \
    --dashboard-port $((RAY_PORT + 1)) \
    --include-dashboard True \
    --object-store-memory 10000000000 \
    --num-cpus 0 --num-gpus 0 \
    --temp-dir ./temp

Issue Severity

High: It blocks me from completing my task.

@shreyanssethi shreyanssethi added bug Something that is supposed to be working; but isn't triage Needs triage (eg: priority, bug/not-bug, and owning component) labels Apr 17, 2023
@rickyyx rickyyx self-assigned this Apr 18, 2023
@rickyyx rickyyx added core Issues that should be addressed in Ray Core and removed triage Needs triage (eg: priority, bug/not-bug, and owning component) labels Apr 18, 2023
@rickyyx
Contributor

rickyyx commented Apr 19, 2023

I think it might be a path issue. I ran into problems starting Ray with your repro. It seems some parts of Ray (the Plasma store) weren't handling relative paths well.

Could you try using an absolute path for the temp dir and see if that works for you while I work on a fix for this?
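
One way to apply the absolute-path suggestion is to resolve the directory before it is passed to `ray start`. The sketch below only builds the argument list, using the flags from the original report; the function name `build_ray_start_command` and the port value in the usage note are assumptions for illustration.

```python
import os
import shlex


def build_ray_start_command(ray_port, temp_dir):
    """Build the `ray start` arguments from the report with an absolute temp dir."""
    # Resolve a relative path such as ./temp_link against the current
    # working directory so that Ray receives an absolute path.
    abs_temp_dir = os.path.abspath(temp_dir)
    return [
        "ray", "start", "--head",
        "--port", str(ray_port),
        "--dashboard-port", str(ray_port + 1),
        "--include-dashboard", "True",
        "--object-store-memory", "10000000000",
        "--num-cpus", "0", "--num-gpus", "0",
        "--temp-dir", abs_temp_dir,
    ]


def as_shell(command):
    """Render the argument list as a single shell-quoted string."""
    return shlex.join(command)
```

For example, `as_shell(build_ray_start_command(6379, "./temp_link"))` yields a command line that can be pasted into a shell; 6379 is only a placeholder for `$RAY_PORT`.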

@hgl2017
hgl2017 commented Jun 20, 2023

I also have the same issue.
2023-06-20 18:24:41,889 ERROR services.py:1207 -- Failed to start the dashboard
2023-06-20 18:24:41,896 ERROR services.py:1232 -- Error should be written to 'dashboard.log' or 'dashboard.err'. We are printing the last 20 lines for you. See 'https://docs.ray.io/en/master/ray-observability/ray-logging.html#logging-directory-structure' to find where the log file is.
2023-06-20 18:24:41,899 ERROR services.py:1242 -- Couldn't read dashboard.log file. Error: [Errno 2] No such file or directory: './scratch/leuven/330/vsc33053/ray_spill/session_2023-06-20_18-24-13_924769_17774/logs/dashboard.log'. It means the dashboard is broken even before it initializes the logger (mostly dependency issues). Reading the dashboard.err file which contains stdout/stderr.
2023-06-20 18:24:41,901 ERROR services.py:1276 -- Failed to read dashboard.err file: cannot mmap an empty file. It is unexpected. Please report an issue to Ray github. https://github.com/ray-project/ray/issues

@anyscalesam anyscalesam added the triage Needs triage (eg: priority, bug/not-bug, and owning component) label Feb 14, 2024
@jjyao jjyao added good first issue Great starter issue for someone just starting to contribute to Ray P1 Issue that should be fixed within a few weeks and removed triage Needs triage (eg: priority, bug/not-bug, and owning component) labels Mar 11, 2024
@arshiya031196
Hi, I'm also facing the same issue. I'm using only one node and don't even need Ray, only vLLM, but vLLM initializes a Ray session internally and gets stuck indefinitely here:

2024-04-16 19:43:51,045 ERROR services.py:1330 -- Failed to start the dashboard
2024-04-16 19:43:51,045 ERROR services.py:1355 -- Error should be written to 'dashboard.log' or 'dashboard.err'. We are printing the last 20 lines for you. See 'https://docs.ray.io/en/master/ray-observability/ray-logging.html#logging-directory-structure' to find where the log file is.
2024-04-16 19:43:51,045 ERROR services.py:1365 -- Couldn't read dashboard.log file. Error: [Errno 2] No such file or directory: '/tmp/ray/session_2024-04-16_19-43-09_468986_3298/logs/dashboard.log'. It means the dashboard is broken even before it initializes the logger (mostly dependency issues). Reading the dashboard.err file which contains stdout/stderr.
2024-04-16 19:43:51,045 ERROR services.py:1399 -- Failed to read dashboard.err file: cannot mmap an empty file. It is unexpected. Please report an issue to Ray github. https://github.com/ray-project/ray/issues
2024-04-16 19:43:53,550 INFO worker.py:1752 -- Started a local Ray instance.

Is there a way to disable Ray for vLLM-only scripts, or otherwise mitigate this issue?

@rickyyx
Contributor

rickyyx commented Apr 16, 2024

cc @anyscalesam

@yangalan123
What worked for me was uninstalling grpcio, ray, and vllm, then reinstalling the latest version of vllm (==0.4.3, which automatically installs ray==2.24.0). Hope that helps!

@jjyao jjyao added P2 Important issue, but not time-critical and removed P1 Issue that should be fixed within a few weeks labels Oct 30, 2024
@dayshah
Contributor

dayshah commented Dec 9, 2024

The initial issue no longer seems to exist on the latest version of Ray. #36431, merged after this was opened, also helps ensure that this won't come up again, since we now always require an absolute directory. @arshiya031196 Feel free to open a separate issue with the versions of the packages you're using and how you're running vLLM if you're still getting this issue on the latest vLLM.
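
For scripts that wrap Ray startup, the same "absolute directory only" rule can be enforced before Ray is invoked at all. This is a hypothetical fail-fast check, not the validation introduced by #36431; the name `require_absolute_dir` is made up for illustration.

```python
import os


def require_absolute_dir(path):
    """Return the normalized path, or raise if it is not absolute."""
    if not os.path.isabs(path):
        raise ValueError(f"temp dir must be an absolute path, got {path!r}")
    return os.path.normpath(path)
```

Rejecting a relative path up front surfaces the mistake as a clear error at launch time instead of as a missing dashboard.log later.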

@dayshah dayshah closed this as completed Dec 9, 2024