
[Hackathon 6th No.52] move quantize, dequantize op to phi - part #63776

Closed
wants to merge 66 commits
Commits
ce3a7f7
fix
enkilee Apr 23, 2024
aad789a
fix
enkilee Apr 23, 2024
bcac994
fix
enkilee Apr 23, 2024
66871e2
fix
enkilee Apr 23, 2024
93250ca
fix
enkilee Apr 23, 2024
ec0dfd3
fix
enkilee Apr 23, 2024
a811f41
fix
enkilee Apr 24, 2024
460e91c
fix
enkilee Apr 24, 2024
6ed1339
fix
enkilee Apr 24, 2024
4c354d5
fix
enkilee Apr 24, 2024
a6129bc
fix
enkilee Apr 25, 2024
c2ada6d
fix old error :is_nagative
enkilee Apr 26, 2024
2762f51
fix
enkilee Apr 26, 2024
7a7cacd
fix
enkilee Apr 28, 2024
ea1f7cb
fix
enkilee Apr 28, 2024
1abcb7c
Merge branch 'PaddlePaddle:develop' into hackathon6-No52-part1
enkilee Apr 28, 2024
4d73a0e
fix
enkilee Apr 28, 2024
68ecbea
fix
enkilee Apr 28, 2024
21fea0b
revert
enkilee Apr 28, 2024
05424d0
fix
enkilee Apr 28, 2024
45ab520
fix
enkilee Apr 28, 2024
bb1c78f
fix
enkilee Apr 29, 2024
6f7b016
fix
enkilee Apr 30, 2024
055d1e0
Merge branch 'PaddlePaddle:develop' into hackathon6-No52-part1
enkilee May 8, 2024
ad6217e
fix
enkilee May 9, 2024
2479c80
fix
enkilee May 9, 2024
c5862ef
fix
enkilee May 11, 2024
fca71e8
fix
enkilee May 13, 2024
998ac7e
fix
enkilee May 13, 2024
99ae6b5
Merge branch 'develop' into hackathon6-No52-part1
enkilee May 13, 2024
76e1bd5
fix
enkilee May 13, 2024
2fd451a
fix
enkilee May 13, 2024
edbc163
fix
enkilee May 13, 2024
f1c0303
fix
enkilee May 13, 2024
0593deb
fix
enkilee May 13, 2024
66d36bc
fix
enkilee May 13, 2024
d74bf99
fix
enkilee May 13, 2024
21d3fd9
fix
enkilee May 13, 2024
5c00825
resume
enkilee May 14, 2024
abfe3a3
resume
enkilee May 14, 2024
ac45c6c
fix
enkilee May 14, 2024
677cf82
fix
enkilee May 14, 2024
ebe2078
redo
enkilee May 14, 2024
5ab41af
Merge branch 'PaddlePaddle:develop' into hackathon6-No52-part1
enkilee May 14, 2024
68e281f
fix
enkilee May 14, 2024
2aae25b
fix
enkilee May 15, 2024
65ddd93
Merge branch 'PaddlePaddle:develop' into hackathon6-No52-part1
enkilee May 15, 2024
51495cb
fix
enkilee May 15, 2024
33d2c91
fix
enkilee May 16, 2024
80b5265
remove dequantize in fluid
enkilee May 16, 2024
025d3ca
Merge branch 'develop' into hackathon6-No52-part1
enkilee May 16, 2024
d56217d
move to legacy_ops.yaml
enkilee May 16, 2024
a06ba07
fix
enkilee May 17, 2024
986b00e
fix
enkilee May 17, 2024
b3ae712
fix
enkilee May 20, 2024
1b85f48
fix
enkilee May 20, 2024
ceebf88
fix
enkilee May 20, 2024
70a5aee
fix
enkilee May 20, 2024
b21bdba
fix
enkilee May 20, 2024
22f23f7
fix
enkilee May 21, 2024
da1d735
fix
enkilee May 21, 2024
146f8e3
fix
enkilee May 21, 2024
7332d94
merge
enkilee May 21, 2024
6e58a03
merge
enkilee May 21, 2024
4f694b0
fix
enkilee May 21, 2024
1413910
fix
enkilee May 21, 2024
3 changes: 0 additions & 3 deletions paddle/fluid/operators/dequantize_op.cc
@@ -1,11 +1,8 @@
/* Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserved.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
3 changes: 0 additions & 3 deletions paddle/fluid/operators/dequantize_op.h
@@ -1,11 +1,8 @@
/* Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserved.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
129 changes: 0 additions & 129 deletions paddle/fluid/operators/onednn/quantize_onednn_op.cc

This file was deleted.

3 changes: 0 additions & 3 deletions paddle/fluid/operators/ops_signature/dequantize_sig.cc
@@ -1,11 +1,8 @@
/* Copyright (c) 2023 PaddlePaddle Authors. All Rights Reserved.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
26 changes: 26 additions & 0 deletions paddle/fluid/operators/ops_signature/quantize_sig.cc
@@ -0,0 +1,26 @@
/* Copyright (c) 2024 PaddlePaddle Authors. All Rights Reserved.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. */

#include "paddle/phi/core/compat/op_utils.h"

namespace phi {

KernelSignature QuantOpArgumentMapping(const ArgumentMappingContext& ctx) {
return KernelSignature(
"quantize",
{"Input"},
{"is_negative_input", "Scale", "Shift", "output_format", "bfloat16"},
{"Output"});
}

} // namespace phi

PD_REGISTER_ARG_MAPPING_FN(quantize, phi::QuantOpArgumentMapping);
10 changes: 5 additions & 5 deletions paddle/fluid/operators/quantize_op.cc
@@ -21,11 +21,11 @@ namespace operators {

phi::KernelKey QuantOp::GetExpectedKernelType(
const framework::ExecutionContext& ctx) const {
-  return phi::KernelKey(
-      phi::Backend::ONEDNN,
-      phi::DataLayout::ONEDNN,
-      phi::TransToPhiDataType(
-          OperatorWithKernel::IndicateVarDataType(ctx, "Input")));
+  auto input_data_type =
+      framework::OperatorWithKernel::IndicateVarDataType(ctx, "Input");
+  return phi::KernelKey(phi::Backend::ONEDNN,
+                        phi::DataLayout::ONEDNN,
+                        phi::TransToPhiDataType(input_data_type));
}

void QuantOpMaker::Make() {
3 changes: 0 additions & 3 deletions paddle/fluid/operators/quantize_op.h
@@ -1,11 +1,8 @@
/* Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserved.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
11 changes: 11 additions & 0 deletions paddle/phi/infermeta/unary.cc
@@ -3453,6 +3453,17 @@ void QrInferMeta(const MetaTensor& x,
r->set_dtype(x.dtype());
}

void QuantizeInferMeta(const MetaTensor& input,
bool is_negative_input,
float scale,
float shift,
const std::string& output_format,
bool bfloat16,
MetaTensor* output) {
output->set_dims(input.dims());
output->share_lod(input);
}

DDim ReduceInferDim(const MetaTensor& x,
const std::vector<int64_t>& axis,
bool keep_dim,
8 changes: 8 additions & 0 deletions paddle/phi/infermeta/unary.h
@@ -527,6 +527,14 @@ void QrInferMeta(const MetaTensor& x,
MetaTensor* q,
MetaTensor* r);

void QuantizeInferMeta(const MetaTensor& input,
bool is_negative_input,
float scale,
float shift,
const std::string& output_format,
bool bfloat16,
MetaTensor* output);

void QuantizeXPUInferMeta(const MetaTensor& x,
DataType out_dtype,
float scale,
117 changes: 117 additions & 0 deletions paddle/phi/kernels/onednn/quantize_kernel.cc
@@ -0,0 +1,117 @@
/* Copyright (c) 2024 PaddlePaddle Authors. All Rights Reserved.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. */

#include "paddle/phi/kernels/quantize_kernel.h"
#include "paddle/phi/backends/onednn/onednn_reuse.h"
#include "paddle/phi/core/compat/convert_utils.h"
#include "paddle/phi/core/enforce.h"
#include "paddle/phi/core/expect.h"
#include "paddle/phi/core/kernel_registry.h"
#include "paddle/phi/core/utils/data_type.h"

namespace phi {

using dnnl::memory;

template <typename T, typename Context>
void QuantOpKernel(const Context& dev_ctx,
const DenseTensor& input,
bool is_negative_input,
float scale,
float shift,
const std::string& output_format,
bool bfloat16,
DenseTensor* output) {
const auto quantization_shift = static_cast<int32_t>(shift);
const bool with_scale = scale != 1.0f;
const bool with_shift = quantization_shift != 0;

PADDLE_ENFORCE_NE(scale,
0.0f,
phi::errors::InvalidArgument(
"Quantization scale must be different from 0.0f"));
PADDLE_ENFORCE(quantization_shift <= 255 && quantization_shift >= 0,
phi::errors::InvalidArgument(
"Quantization shift must be lower or equal to "
"255 and greater or equal to 0, but got %d",
quantization_shift));

auto x_tz = common::vectorize<int64_t>(input.dims());
dnnl::primitive_attr attrs;
static constexpr int32_t mask = 0;

if (with_scale) {
attrs.set_scales_mask(DNNL_ARG_SRC, mask);
}

if (with_shift) {
attrs.set_zero_points_mask(DNNL_ARG_DST, mask);
}

auto x_type = phi::funcs::ToOneDNNDataType(input.dtype());
DataType out_dtype;

if (bfloat16) {
out_dtype = DataType::BFLOAT16;
} else if (is_negative_input && !with_shift) {
out_dtype = DataType::INT8;
} else {
out_dtype = DataType::UINT8;
}

auto out_type = phi::funcs::ToOneDNNDataType(out_dtype);

phi::funcs::ReorderOneDNNHandler reorder_handler(
x_tz, input.dtype(), x_type, out_dtype, out_type, dev_ctx.GetEngine());

auto reorder_src_memory_p = reorder_handler.AcquireSrcMemory(
input.mem_desc(), phi::funcs::to_void_cast(input.data<T>()));
auto reorder_dst_memory_p = reorder_handler.AcquireDstMemory(
output, input.mem_desc(), dev_ctx.GetPlace());

auto reorder_p = reorder_handler.AcquireReorder(
reorder_dst_memory_p, reorder_src_memory_p, attrs);

auto& astream = phi::OneDNNContext::tls().get_stream();

auto scales_md = dnnl::memory::desc(
{1}, dnnl::memory::data_type::f32, dnnl::memory::format_tag::x);
auto scales_mem = dnnl::memory(
scales_md, dev_ctx.GetEngine(), phi::funcs::to_void_cast<float>(&scale));
auto zero_points_md = dnnl::memory::desc(
{1}, dnnl::memory::data_type::s32, dnnl::memory::format_tag::x);
auto zero_points_mem =
dnnl::memory(zero_points_md,
dev_ctx.GetEngine(),
phi::funcs::to_void_cast<int32_t>(&quantization_shift));

std::unordered_map<int, dnnl::memory> reorder_args;
reorder_args.insert({DNNL_ARG_SRC, *reorder_src_memory_p});
reorder_args.insert({DNNL_ARG_DST, *reorder_dst_memory_p});
if (with_scale) {
reorder_args.insert({DNNL_ARG_ATTR_SCALES | DNNL_ARG_SRC, scales_mem});
}
if (with_shift) {
reorder_args.insert(
{DNNL_ARG_ATTR_ZERO_POINTS | DNNL_ARG_DST, zero_points_mem});
}

reorder_p->execute(astream, reorder_args);
astream.wait();

output->set_mem_desc(reorder_dst_memory_p->get_desc());
}
} // namespace phi

PD_REGISTER_KERNEL(quantize, OneDNN, ONEDNN, phi::QuantOpKernel, float) {}