upsample_bilinear2d HLO returns unexpected data-type. #7095

Closed
ysiraichi opened this issue May 22, 2024 · 6 comments · Fixed by #7111
@ysiraichi
Collaborator

ysiraichi commented May 22, 2024

🐛 Bug

At first, upsample_bilinear2d seems to work: the returned data-type is torch.float16, as expected. However, when its result is used together with another torch.float16 tensor, it breaks unexpectedly.

In the example below, foo stacks the result of an upsample_bilinear call with another torch.float16 tensor. The function fails under PyTorch/XLA because stack (lowered to concatenate) expects all inputs to have the same data-type (note that this behavior is being fixed in #7091). However, as the error message shows, we end up calling concatenate(f32[...], f16[...]), meaning the result of upsample_bilinear wasn't really f16.

In summary: upsample_bilinear2d returns a torch.float16 tensor, even though its HLO representation is f32. The expected data-type is f16.

import torch
import torch_xla.core.xla_model as xm

def foo(x, y):
    return torch.stack([torch.nn.functional.upsample_bilinear(x, scale_factor=2), y])

a = torch.rand(1, 3, 10, 10, dtype=torch.half)
b = torch.rand(1, 3, 20, 20, dtype=torch.half)

Xa = a.to(xm.xla_device())
Xb = b.to(xm.xla_device())

out = foo(a, b)
print(out.dtype)  # torch.float16

Xout = foo(Xa, Xb)
print(Xout.dtype)  # torch.float16

# Fails with the error below.
Xout.cpu()
Non-OK-status: status.status() status: INTERNAL: during context [Unknown]: Seen floating point types of different precisions in %concatenate.82 = f32[2,1,3,20,20]{4,3,2,1,0} concatenate(f32[1,1,3,20,20]{4,3,2,1,0} %reshape.80, f16[1,1,3,20,20]{4,3,2,1,0} %reshape.81), dimensions={0}, but mixed precision is disallowed.
*** Begin stack trace ***
        tsl::CurrentStackTrace[abi:cxx11]()
        std::unique_ptr<xla::PjRtLoadedExecutable, std::default_delete<xla::PjRtLoadedExecutable> > ConsumeValue<std::unique_ptr<xla::PjRtLoadedExecutable, std::default_delete<xla::PjRtLoadedExecutable> > >(absl::lts_20230802::StatusOr<std::unique_ptr<xla::PjRtLoadedExecutable, std::default_delete<xla::PjRtLoadedExecutable> > >&&)
        torch_xla::runtime::PjRtComputationClient::Compile(std::vector<torch_xla::runtime::ComputationClient::CompileInstance, std::allocator<torch_xla::runtime::ComputationClient::CompileInstance> >)
        torch_xla::XLAGraphExecutor::Compile(std::vector<c10::intrusive_ptr<torch_xla::XLATensor, c10::detail::intrusive_target_default_null_type<torch_xla::XLATensor> >, std::allocator<c10::intrusive_ptr<torch_xla::XLATensor, c10::detail::intrusive_target_default_null_type<torch_xla::XLATensor> > > >&, absl::lts_20230802::Span<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const>, torch::lazy::LazyGraphExecutor::SyncTensorCollection const&, torch::lazy::LazyGraphExecutor::PostOrderData*, std::vector<torch::lazy::Value, std::allocator<torch::lazy::Value> > const&)
        torch_xla::XLAGraphExecutor::SyncTensorsGraphInternal(std::vector<c10::intrusive_ptr<torch_xla::XLATensor, c10::detail::intrusive_target_default_null_type<torch_xla::XLATensor> >, std::allocator<c10::intrusive_ptr<torch_xla::XLATensor, c10::detail::intrusive_target_default_null_type<torch_xla::XLATensor> > > >*, absl::lts_20230802::Span<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const>, torch::lazy::LazyGraphExecutor::SyncTensorsConfig const&, bool)
        torch_xla::XLAGraphExecutor::SyncTensorsGraph(std::vector<c10::intrusive_ptr<torch_xla::XLATensor, c10::detail::intrusive_target_default_null_type<torch_xla::XLATensor> >, std::allocator<c10::intrusive_ptr<torch_xla::XLATensor, c10::detail::intrusive_target_default_null_type<torch_xla::XLATensor> > > >*, absl::lts_20230802::Span<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const>, bool, bool, bool)
        torch_xla::XLATensor::ApplyPendingGraph()
        torch_xla::XLATensor::GetXlaData()
        torch_xla::XLATensor::ToTensor(bool)
        torch_xla::XLANativeFunctions::_to_copy(at::Tensor const&, std::optional<c10::ScalarType>, std::optional<c10::Layout>, std::optional<c10::Device>, std::optional<bool>, bool, std::optional<c10::MemoryFormat>)




        at::_ops::_to_copy::redispatch(c10::DispatchKeySet, at::Tensor const&, std::optional<c10::ScalarType>, std::optional<c10::Layout>, std::optional<c10::Device>, std::optional<bool>, bool, std::optional<c10::MemoryFormat>)



        at::_ops::_to_copy::call(at::Tensor const&, std::optional<c10::ScalarType>, std::optional<c10::Layout>, std::optional<c10::Device>, std::optional<bool>, bool, std::optional<c10::MemoryFormat>)





        at::_ops::_to_copy::redispatch(c10::DispatchKeySet, at::Tensor const&, std::optional<c10::ScalarType>, std::optional<c10::Layout>, std::optional<c10::Device>, std::optional<bool>, bool, std::optional<c10::MemoryFormat>)





        at::_ops::_to_copy::call(at::Tensor const&, std::optional<c10::ScalarType>, std::optional<c10::Layout>, std::optional<c10::Device>, std::optional<bool>, bool, std::optional<c10::MemoryFormat>)


        at::native::to(at::Tensor const&, std::optional<c10::ScalarType>, std::optional<c10::Layout>, std::optional<c10::Device>, std::optional<bool>, bool, bool, std::optional<c10::MemoryFormat>)



        at::_ops::to_dtype_layout::call(at::Tensor const&, std::optional<c10::ScalarType>, std::optional<c10::Layout>, std::optional<c10::Device>, std::optional<bool>, bool, bool, std::optional<c10::MemoryFormat>)
        at::Tensor::to(c10::TensorOptions, bool, bool, std::optional<c10::MemoryFormat>) const



        _PyEval_EvalFrameDefault

        PyEval_EvalCode



        _PyRun_SimpleFileObject
        _PyRun_AnyFileObject
        Py_RunMain
        Py_BytesMain
        __libc_start_main
        _start
*** End stack trace ***
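One way to confirm that the mismatch originates in the lowered HLO, and not in the PyTorch-level dtype, is to dump the pending graph for the upsampled tensor before synchronizing. Below is a minimal sketch, assuming the internal torch_xla._XLAC._get_xla_tensors_hlo helper (availability may vary across versions):

import torch
import torch_xla
import torch_xla.core.xla_model as xm

x = torch.rand(1, 3, 10, 10, dtype=torch.half).to(xm.xla_device())
up = torch.nn.functional.upsample_bilinear(x, scale_factor=2)

print(up.dtype)  # torch.float16 at the PyTorch level
# The dumped HLO text, however, shows the resize result as f32[...].
print(torch_xla._XLAC._get_xla_tensors_hlo([up]))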

Environment

  • Reproducible on XLA backend [CPU/TPU/CUDA]: CUDA
  • torch_xla version: 8d35eb0

Additional context

This seems to happen because the lowering computes in F32 regardless of the original input data-type:

if (is_kernel_bilinear || xla::primitive_util::IsIntegralType(input_type)) {
  input = xla::ConvertElementType(input, xla::F32);
  input_type = xla::F32;
}

cc @miladm @JackCaoG

@ysiraichi ysiraichi changed the title upsample_bilinear2d returns unexpected data-type. upsample_bilinear2d HLO returns unexpected data-type. May 22, 2024
@ysiraichi
Collaborator Author

Note: the offending code was extracted from elsewhere:

// Code copied from
// https://github.com/tensorflow/tensorflow/blob/e51d6ab5730092775d516b18fa4ee85d49602cd8/tensorflow/compiler/tf2xla/kernels/image_resize_ops.cc#L477-L672

@ysiraichi
Collaborator Author

@JackCaoG I would imagine that not only should we return a tensor cast to the input data-type, but we should also do the computation in f16, since we are not using AMP. Let me know what you think.
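As a concrete acceptance check for that behavior, a regression test along these lines could be added once the lowering is updated (a hedged sketch, not an existing test):

import torch
import torch_xla.core.xla_model as xm

def test_upsample_bilinear_keeps_half():
    x = torch.rand(1, 3, 10, 10, dtype=torch.half).to(xm.xla_device())
    y = torch.rand(1, 3, 20, 20, dtype=torch.half).to(xm.xla_device())
    out = torch.stack(
        [torch.nn.functional.upsample_bilinear(x, scale_factor=2), y])
    # Once the resize is computed in the input dtype, concatenate sees two f16
    # operands and materializing the result no longer raises.
    assert out.cpu().dtype == torch.half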

@JackCaoG
Collaborator

I think this op is copied from TF, and from the comment in tensorflow/tensorflow@f8b35e0, the TF output type is always f32. I guess we don't need to follow that rule.

@ysiraichi
Collaborator Author

I'm thinking that the way to go here is to compute and return using the input data-type, and not f32.
@JackCaoG how does that sound?

@JackCaoG
Collaborator

sgtm!
