[torchbench] Regression after changing load_benchmark method. #6348

ysiraichi opened this issue Jan 22, 2024 · 0 comments · Fixed by #6389

🐛 Bug

Starting with #6296, we began instantiating the model on the accelerator passed on the command line and, for XLA executions, moving the model to the XLA device afterwards (a rough sketch of this flow follows the list below). While this worked for most models, it broke the following ones:

  • detectron2_fasterrcnn_r_101_c4
  • detectron2_fasterrcnn_r_101_dc5
  • detectron2_fasterrcnn_r_101_fpn
  • detectron2_fasterrcnn_r_50_c4
  • detectron2_fasterrcnn_r_50_dc5
  • detectron2_fasterrcnn_r_50_fpn
  • detectron2_fcos_r_50_fpn
  • detectron2_maskrcnn_r_101_c4
  • detectron2_maskrcnn_r_101_fpn
  • detectron2_maskrcnn_r_50_c4
  • detectron2_maskrcnn_r_50_fpn
  • hf_Bart
  • timm_regnet
  • mobilenet_v3_large

These benchmarks fail under both the non-dynamo and the dynamo+openxla configurations.
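
For reference, the loading flow described above looks roughly like this. This is a minimal sketch only; `load_model_sketch`, `model_cls`, and the `use_xla` flag are illustrative and not the actual `experiment_runner.py` code:

```python
import torch
import torch_xla.core.xla_model as xm


def load_model_sketch(model_cls, accelerator: str, use_xla: bool):
    # Since #6296, the model is first instantiated on the accelerator
    # passed on the command line (e.g. "cuda").
    model = model_cls().to(accelerator)

    # For XLA executions, the instantiated model is then moved to the
    # XLA device. The failing benchmarks end up with a dtype mismatch
    # between the example inputs and the model's parameters (see the
    # traceback below).
    if use_xla:
        model = model.to(xm.xla_device())

    return model
```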

Raw Error
Traceback (most recent call last):
  File "xla/benchmarks/experiment_runner.py", line 906, in <module>
    main()
  File "xla/benchmarks/experiment_runner.py", line 902, in main
    runner.run()
  File "xla/benchmarks/experiment_runner.py", line 59, in run
    self.run_single_config()
  File "xla/benchmarks/experiment_runner.py", line 247, in run_single_config
    metrics, last_output = self.run_once_and_gather_metrics(
  File "xla/benchmarks/experiment_runner.py", line 324, in run_once_and_gather_metrics
    output, _ = loop(iter_fn=self._default_iter_fn)
  File "xla/benchmarks/experiment_runner.py", line 293, in loop
    output, timing, trace = iter_fn(benchmark_experiment, benchmark_model,
  File "xla/benchmarks/experiment_runner.py", line 209, in _default_iter_fn
    output = benchmark_model.model_iter_fn(
  File "xla/benchmarks/benchmark_model.py", line 155, in eval
    pred = self.module(*inputs)
  File "torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/lib/python3.8/site-packages/detectron2/modeling/meta_arch/rcnn.py", line 150, in forward
    return self.inference(batched_inputs)
  File "/lib/python3.8/site-packages/detectron2/modeling/meta_arch/rcnn.py", line 208, in inference
    proposals, _ = self.proposal_generator(images, features, None)
  File "torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/lib/python3.8/site-packages/detectron2/modeling/proposal_generator/rpn.py", line 454, in forward
    pred_objectness_logits, pred_anchor_deltas = self.rpn_head(features)
  File "torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/lib/python3.8/site-packages/detectron2/modeling/proposal_generator/rpn.py", line 175, in forward
    pred_objectness_logits.append(self.objectness_logits(t))
  File "torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "torch/nn/modules/conv.py", line 460, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File "torch/nn/modules/conv.py", line 456, in _conv_forward
    return F.conv2d(input, weight, bias, self.stride,
RuntimeError: Input type (float) and bias type (c10::Half) should be the same
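
Outside the benchmark harness, the same class of error can be reproduced with a tiny standalone snippet. This is a hypothetical sketch, not code from the runner, and the exact message wording depends on the device and backend:

```python
import torch
import torch.nn.functional as F

# A float32 input against half-precision convolution parameters trips
# the same family of dtype check as in the traceback above.
x = torch.randn(1, 3, 8, 8)                          # float32 input
weight = torch.randn(4, 3, 3, 3, dtype=torch.half)   # c10::Half weight
bias = torch.randn(4, dtype=torch.half)              # c10::Half bias

try:
    F.conv2d(x, weight, bias, stride=1, padding=1)
except RuntimeError as e:
    print(e)  # dtype-mismatch error, e.g. "... should be the same"
```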

To Reproduce

python xla/benchmarks/experiment_runner.py --no-resume --suite-name torchbench --repeat 2 --accelerator cuda --test eval --xla PJRT --dynamo None -k <benchmark>

Environment

  • Reproducible on XLA backend: CUDA
  • torch_xla version: a8b27eb

Additional Context

Further discussion can be found in #6336.

cc @miladm @JackCaoG
