[torchbench] moco fails to run with CUDA OpenXLA fallback. #7647
Comments
Seems like during mark_step we found an XLATensor with an empty data handle.
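For context, here is a minimal sketch (assumed, not taken from the issue) of where this would surface from the user's side: tensors on the XLA device stay lazy until mark_step materializes the pending graph, and that is the point at which every live XLATensor is expected to hold a device data handle.

```python
import torch
import torch_xla.core.xla_model as xm

device = xm.xla_device()

x = torch.randn(4, 4, device=device)  # lazy XLATensor; no device data yet
y = x @ x                             # still only recorded in the lazy graph

# mark_step compiles and executes the pending graph; afterwards every live
# XLATensor should be backed by a device data handle. The error reported in
# this issue suggests one of them is not.
xm.mark_step()
```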
As far as I have investigated, the only fallback we are running differently is …
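One generic way to check which ATen ops are hitting the OpenXLA fallback (a sketch of the usual approach, not the exact investigation referenced above) is the torch_xla metrics counters, where fallback ops are recorded under aten::* names:

```python
import torch
import torch_xla.core.xla_model as xm
import torch_xla.debug.metrics as met

device = xm.xla_device()

# Run some workload on the XLA device...
x = torch.randn(8, 8, device=device)
_ = x.sum()
xm.mark_step()

# Ops that went through the OpenXLA fallback show up as aten::* counters.
fallback_ops = [name for name in met.counter_names() if name.startswith("aten::")]
print(fallback_ops)
```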
Right. But I wonder whether this issue sheds light on a CUDA OpenXLA fallback implementation issue, in the sense that, even if we run that on CUDA, it should still work.
This is odd. I tried replacing the DLPack conversion with …
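For reference, this is roughly what a DLPack hand-off looks like on the PyTorch side (an illustrative sketch, not the fallback's internal code path): the tensor is exported through the DLPack protocol and re-imported so that both views alias the same CUDA memory.

```python
import torch
from torch.utils.dlpack import to_dlpack, from_dlpack

# Illustrative DLPack round-trip; the CUDA OpenXLA fallback relies on an
# equivalent zero-copy hand-off between XLA:CUDA and eager CUDA tensors.
src = torch.arange(6, dtype=torch.float32, device="cuda").reshape(2, 3)

capsule = to_dlpack(src)     # export without copying
dst = from_dlpack(capsule)   # import; aliases the same CUDA buffer

dst[0, 0] = 42.0
assert src[0, 0].item() == 42.0  # both views see the same memory
```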
Forcing CPU fallback on …
@JackCaoG
Basically, this is the timeline I am seeing:
Do you see anything strange? Any ideas where to look?
In an external discussion, we decided to work around this issue for now by forcing …
🐛 Bug
Running the upstreamed benchmarking script with the following command results in an unexpected error. It does work when using the CPU OpenXLA fallback, though.
python xla/benchmarks/experiment_runner.py \
  --suite-name torchbench \
  --accelerator cuda \
  --xla PJRT \
  --dynamo None \
  --test eval \
  --repeat 30 --iterations-per-run 5 \
  --print-subprocess \
  --no-resume --filter moco
Environment
cc @miladm @JackCaoG