
[torchbench] hf_BigBird (inference and training) fails to run on dynamo. #7833

ysiraichi opened this issue Aug 12, 2024 · 0 comments
🐛 Bug

Running the upstreamed benchmarking script with the following command results in an unexpected error:

python xla/benchmarks/experiment_runner.py \
       --suite-name torchbench \
       --accelerator cuda \
       --xla PJRT \
       --dynamo openxla \
       --test eval  --test train \
       --repeat 30 --iterations-per-run 5 \
       --print-subprocess \
       --no-resume --filter hf_BigBird
Traceback (most recent call last):
  File "torch/_dynamo/output_graph.py", line 1438, in _call_user_compiler
    compiled_fn = compiler_fn(gm, self.example_inputs())
  File "torch/_dynamo/repro/after_dynamo.py", line 129, in __call__
    compiled_gm = compiler_fn(gm, example_inputs)
  File "torch/__init__.py", line 2281, in __call__
    return self.compiler_fn(model_, inputs_, **self.kwargs)
  File "torch/_dynamo/backends/common.py", line 72, in __call__
    cg = aot_module_simplified(gm, example_inputs, **self.kwargs)
  File "torch/_functorch/aot_autograd.py", line 1033, in aot_module_simplified
    compiled_fn = dispatch_and_compile()
  File "torch/_functorch/aot_autograd.py", line 1022, in dispatch_and_compile
    compiled_fn, _ = create_aot_dispatcher_function(
  File "torch/_functorch/aot_autograd.py", line 435, in create_aot_dispatcher_function
    return _create_aot_dispatcher_function(flat_fn, flat_args, aot_config)
  File "torch/_functorch/aot_autograd.py", line 600, in _create_aot_dispatcher_function
    fw_metadata = run_functionalized_fw_and_collect_metadata(
  File "torch/_functorch/_aot_autograd/collect_metadata_analysis.py", line 168, in inner
    flat_f_outs = f(*flat_f_args)
  File "torch/_functorch/_aot_autograd/traced_function_transforms.py", line 805, in functional_call
    out = PropagateUnbackedSymInts(mod).run(
  File "torch/fx/interpreter.py", line 147, in run
    self.env[node] = self.run_node(node)
  File "torch/fx/experimental/symbolic_shapes.py", line 5592, in run_node
    result = super().run_node(n)
  File "torch/fx/interpreter.py", line 204, in run_node
    return getattr(self, n.op)(n.target, args, kwargs)
  File "torch/fx/interpreter.py", line 298, in call_method
    return getattr(self_obj, target)(*args_tail, **kwargs)
  File "torch/_subclasses/functional_tensor.py", line 526, in __torch_dispatch__
    func(*args, **kwargs)
  File "torch/_ops.py", line 713, in __call__
    return self._op(*args, **kwargs)
RuntimeError: torch_xla/csrc/aten_xla_bridge.cpp:105 : Check failed: xtensor
*** Begin stack trace ***
        tsl::CurrentStackTrace[abi:cxx11]()
        torch_xla::bridge::GetXlaTensor(at::Tensor const&)
        torch_xla::XLANativeFunctions::as_strided_(at::Tensor const&, c10::ArrayRef<long>, c10::ArrayRef<long>, std::optional<long>)



        at::_ops::as_strided_::call(at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::ArrayRef<c10::SymInt>, std::optional<c10::SymInt>)
        at::native::unsqueeze_(at::Tensor&, long)



        torch::jit::invokeOperatorFromPython(std::vector<std::shared_ptr<torch::jit::Operator>, std::allocator<std::shared_ptr<torch::jit::Operator> > > const&, pybind11::args const&, pybind11::kwargs const&, std::optional<c10::DispatchKey>)
        torch::jit::_get_operation_for_overload_or_packet(std::vector<std::shared_ptr<torch::jit::Operator>, std::allocator<std::shared_ptr<torch::jit::Operator> > > const&, c10::Symbol, pybind11::args const&, pybind11::kwargs const&, bool, std::optional<c10::DispatchKey>)



        _PyObject_Call
        _PyEval_EvalFrameDefault

        _PyObject_FastCallDictTstate
        _PyObject_Call_Prepend

        _PyObject_Call
        _PyEval_EvalFrameDefault





        PyObject_CallMethod
        torch::handle_torch_function_no_python_arg_parser(c10::ArrayRef<_object*>, _object*, _object*, char const*, _object*, char const*, torch::TorchFunctionName)



        at::_ops::unsqueeze_::redispatch(c10::DispatchKeySet, at::Tensor&, long)


        at::_ops::unsqueeze_::redispatch(c10::DispatchKeySet, at::Tensor&, long)





        at::_ops::unsqueeze_::call(at::Tensor&, long)


        _PyObject_Call
        _PyEval_EvalFrameDefault


        _PyEval_EvalFrameDefault


        _PyEval_EvalFrameDefault

        _PyEval_EvalFrameDefault



        _PyEval_EvalFrameDefault

        _PyEval_EvalFrameDefault

        _PyEval_EvalFrameDefault

        _PyEval_EvalFrameDefault

        _PyEval_EvalFrameDefault

        _PyEval_EvalFrameDefault

        PyVectorcall_Call
        _PyEval_EvalFrameDefault

        _PyObject_FastCallDictTstate
        _PyObject_Call_Prepend

        _PyObject_Call
        _PyEval_EvalFrameDefault

        _PyObject_FastCallDictTstate
        _PyObject_Call_Prepend

        _PyObject_Call

        _PyObject_MakeTpCall
        _PyEval_EvalFrameDefault

        _PyObject_FastCallDictTstate
        _PyObject_Call_Prepend

        _PyObject_MakeTpCall
        _PyEval_EvalFrameDefault

        _PyEval_EvalFrameDefault

        _PyEval_EvalFrameDefault

        _PyEval_EvalFrameDefault


        _PyEval_EvalFrameDefault

        _PyEval_EvalFrameDefault

        _PyEval_EvalFrameDefault

        _PyEval_EvalFrameDefault


        _PyEval_EvalFrameDefault

        _PyEval_EvalFrameDefault

        _PyEval_EvalFrameDefault

        _PyEval_EvalFrameDefault

        _PyEval_EvalFrameDefault

        _PyEval_EvalFrameDefault



        _PyEval_EvalFrameDefault

        _PyEval_EvalFrameDefault

*** End stack trace ***
Input tensor is not an XLA tensor: XLAFloatType

While executing %unsqueeze_ : [num_users=0] = call_method[target=unsqueeze_](args = (%band_mask, 1), kwargs = {})
Original traceback:
  File "xla/benchmarks/benchmark_model.py", line 183, in eval
    pred = self.module(*inputs)
  File "/usr/local/lib/python3.10/site-packages/transformers/models/big_bird/modeling_big_bird.py", line 2450, in forward
    outputs = self.bert(
  File "/usr/local/lib/python3.10/site-packages/transformers/models/big_bird/modeling_big_bird.py", line 2089, in forward
    blocked_encoder_mask, band_mask, from_mask, to_mask = self.create_masks_for_block_sparse_attn(
  File "/usr/local/lib/python3.10/site-packages/transformers/models/big_bird/modeling_big_bird.py", line 2201, in create_masks_for_block_sparse_attn
    band_mask = create_band_mask_from_inputs(blocked_encoder_mask, blocked_encoder_mask)
  File "/usr/local/lib/python3.10/site-packages/transformers/models/big_bird/modeling_big_bird.py", line 2197, in create_band_mask_from_inputs
    band_mask.unsqueeze_(1)
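The failure originates in the in-place `band_mask.unsqueeze_(1)` call above: under dynamo functionalization the mutation is routed through `XLANativeFunctions::as_strided_`, which checks that its input is an XLA tensor. A minimal sketch (my assumption, not a verified fix) showing that the out-of-place form produces the same result without mutating its input, which is the shape of rewrite that usually sidesteps in-place-op issues under functionalization:

```python
import torch

# Sketch only: `band_mask` here is a stand-in tensor, not the real mask
# built by transformers' create_band_mask_from_inputs.
band_mask = torch.ones(2, 3, 3)

# Out-of-place: returns a new view with shape (2, 1, 3, 3); band_mask is untouched.
out_of_place = band_mask.unsqueeze(1)

# In-place (the pattern that trips the XLA bridge): band_mask itself
# becomes (2, 1, 3, 3).
band_mask.unsqueeze_(1)

# Both forms yield the same values and shape.
assert out_of_place.shape == (2, 1, 3, 3)
assert torch.equal(out_of_place, band_mask)
```

If the in-place call is indeed the trigger, the model-side change would land in transformers rather than torch_xla; the sketch only demonstrates that the two forms are semantically equivalent here.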

Environment

  • Reproducible on XLA backend [CPU/TPU/CUDA]: CUDA
  • torch_xla version: 60b9dfe

cc @miladm @JackCaoG
