
Fix output type of custom calls while lowering quant/dequant torch op to HLO #6283

Merged 1 commit into master from sdasgup3/fix-output-type-of-quantize-op on Jan 11, 2024

Conversation

@sdasgup3 (Collaborator) commented on Jan 10, 2024

#5763 allows lowering the torch quantize/dequantize operations to HLO custom calls. For example,
the following PyTorch code

import torch
import torch.ao.quantization.fx._decomposed  # registers the quantized_decomposed ops

x = torch.randn(2, 3, 4, 5)  # input shape matches the HLO dump below
x = torch.ops.quantized_decomposed.quantize_per_tensor(
    x, 0.4, 2, -128, 127, torch.int8)
x = torch.ops.quantized_decomposed.dequantize_per_tensor(
    x, 0.4, 2, -128, 127, torch.int8)

is lowered to the following HLO operations:

ENTRY %IrToHlo.5 (p0.1: f32[2,3,4,5]) -> (f32[2,3,4,5]) {
  %p0.1 = f32[2,3,4,5]{3,2,1,0} parameter(0)

  %custom-call.2 = f32[2,3,4,5]{3,2,1,0} custom-call(f32[2,3,4,5]{3,2,1,0} %p0.1), custom_call_target="stablehlo.uniform_quantize", api_version=API_VERSION_TYPED_FFI, backend_config={scale=[0.4],zero_point=[2],storage_type=si8,expressed_type=f32,storage_min=-128,storage_max=127}

  %custom-call.3 = f32[2,3,4,5]{3,2,1,0} custom-call(f32[2,3,4,5]{3,2,1,0} %custom-call.2), custom_call_target="stablehlo.uniform_dequantize", api_version=API_VERSION_TYPED_FFI, backend_config={scale=[0.4],zero_point=[2],storage_type=si8,expressed_type=f32,storage_min=-128,storage_max=127}
  ROOT %tuple.4 = (f32[2,3,4,5]{3,2,1,0}) tuple(f32[2,3,4,5]{3,2,1,0} %custom-call.3)
}

Note that the output of the custom call corresponding to the quantize op has element type f32. This is a logical problem, since the result of quantization is generally expected to be in the integer domain. Also note that the choice of output type should not affect the eventual conversion of the HLO custom calls to mhlo uniform.quantize/uniform.dequantize operations.
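With the fix, the quantize custom call carries the integer storage type instead. A sketch of the expected lowering for the same program, with the output type of the quantize custom call inferred from storage_type=si8 in the backend_config (value numbers and elided attributes are illustrative):

  %custom-call.2 = s8[2,3,4,5]{3,2,1,0} custom-call(f32[2,3,4,5]{3,2,1,0} %p0.1), custom_call_target="stablehlo.uniform_quantize", ...
  %custom-call.3 = f32[2,3,4,5]{3,2,1,0} custom-call(s8[2,3,4,5]{3,2,1,0} %custom-call.2), custom_call_target="stablehlo.uniform_dequantize", ...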

Moreover, based on https://github.com/pytorch/pytorch/blob/0b72ce1bd1a4a0596dde4053899b8a9a7999bc47/torch/ao/quantization/fx/_decomposed.py#L164, we set the output element type of the dequantize operation to f32.

Finally, this change improves the debuggability of the map queries by adding proper error messages.
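For illustration, a minimal sketch of what such a checked map query might look like; the function name and message are hypothetical, not the PR's actual code (XLA_CHECK is the check macro used throughout the torch_xla codebase):

// Hypothetical helper: looks up a torch dtype in the conversion map and
// fails with a descriptive message instead of an opaque std::out_of_range.
xla::PrimitiveType LookupHloDtype(const std::string& torch_dtype) {
  const auto& m = GetTorchIntDtypeToHloDtypeMap();
  auto it = m.find(torch_dtype);
  XLA_CHECK(it != m.end()) << "Unsupported torch dtype in conversion to HLO: "
                           << torch_dtype;
  return it->second;
}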

cc @GleasonK

@sdasgup3 requested a review from lsy323 on January 10, 2024 01:30
torch_xla/csrc/ops/dequant_tensor.cpp (review thread resolved)
@@ -192,4 +192,16 @@ GetHloDtypeToStablehloDtypeMap() {
return m_;
}

const std::unordered_map<std::string, xla::PrimitiveType>&
GetTorchIntDtypeToHloDtypeMap() {
static const std::unordered_map<std::string, xla::PrimitiveType> m_{
Collaborator
We could implement this as a switch / series of if statements to avoid a static dictionary; it's a short enough list. Not sure of PyTorch/XLA's stance on static data.

Collaborator Author
I agree about having less static data floating around. Replaced the dictionary with conditionals.
@lsy323 If the change looks good to you, then I can merge once the CI is green.
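For reference, a sketch of the conditional form discussed above; the dtype keys and exact signature are assumptions, not copied from the merged change:

// Maps a torch integer dtype name to the corresponding HLO primitive type
// using plain conditionals instead of a static dictionary.
xla::PrimitiveType GetTorchIntDtypeToHloDtype(const std::string& dtype) {
  if (dtype == "int8") return xla::PrimitiveType::S8;
  if (dtype == "uint8") return xla::PrimitiveType::U8;
  if (dtype == "int16") return xla::PrimitiveType::S16;
  if (dtype == "int32") return xla::PrimitiveType::S32;
  XLA_CHECK(false) << "Unsupported torch int dtype: " << dtype;
  return xla::PrimitiveType::PRIMITIVE_TYPE_INVALID;
}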

@sdasgup3 force-pushed the sdasgup3/fix-output-type-of-quantize-op branch 2 times, most recently from 8ad8b29 to f970401 on January 10, 2024 19:16
@sdasgup3 force-pushed the sdasgup3/fix-output-type-of-quantize-op branch from f970401 to 9da267d on January 10, 2024 21:45
@lsy323 merged commit 68f4750 into master on Jan 11, 2024
19 checks passed