[Fix]modify quantized ops during mkldnn int8 inference #3514

ranchongzhi · 2023-10-10T08:08:42Z

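While getting the ACT example project to run, I found that CPU quantized inference lost a considerable amount of accuracy. The cause is that the test script quantizes two more ops than quantization training does: pool2d and elementwise_mul. After removing them, inference accuracy returns to its normal level. The experimental results are below, all with benchmark=True: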

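| Metric | Original model, fp32 inference | Original model, fp32 + mkldnn acceleration | Quantized model, int8 inference (quantized: conv2d, depthwise_conv2d) | Quantized model, int8 inference (quantized: conv2d, depthwise_conv2d, elementwise_mul) | Quantized model, int8 inference (quantized: conv2d, depthwise_conv2d, elementwise_mul, pool2d) |
| --- | --- | --- | --- | --- | --- |
| mIoU | 0.7704 | 0.7704 | 0.7658 | 0.7657 | 0.7372 |
| Latency (ms) | 1216.8 | 1191.3 | 434.5 | 439.6 | 505.8 |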
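
To make the comparison easier to read, the following is a small sketch that computes the mIoU drop and the speedup of each configuration relative to plain fp32 inference. The numbers are copied from the results above; the `results` dictionary, its labels, and the `compare` helper are illustrative names only and are not part of this PR or of PaddleSeg.

```python
# Figures copied from the results above: label -> (mIoU, latency in ms).
# The labels and the helper below are hypothetical, for illustration only.
results = {
    "fp32": (0.7704, 1216.8),
    "fp32 + mkldnn": (0.7704, 1191.3),
    "int8: conv2d, depthwise_conv2d": (0.7658, 434.5),
    "int8: + elementwise_mul": (0.7657, 439.6),
    "int8: + elementwise_mul, pool2d": (0.7372, 505.8),
}


def compare(results, baseline="fp32"):
    """Return {label: (absolute mIoU drop, speedup factor)} against the baseline."""
    base_miou, base_ms = results[baseline]
    return {
        name: (base_miou - miou, base_ms / ms)
        for name, (miou, ms) in results.items()
    }


summary = compare(results)
for name, (drop, speedup) in summary.items():
    print(f"{name:34s} mIoU drop {drop:.4f}  speedup {speedup:.2f}x")
```

Read this way, the measurements suggest that adding elementwise_mul changes mIoU by only about 0.0001, while additionally quantizing pool2d accounts for most of the accuracy loss (a drop of roughly 0.033 from fp32) and is also the slowest of the three int8 configurations.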

paddle-bot · 2023-10-10T08:08:47Z

Thanks for your contribution!

shiyutang

lgtm

ranchongzhi added 2 commits October 10, 2023 16:00

modify quantized ops during mkldnn int8 inference

d188b5d

modify

0721080

paddle-bot bot added the contributor Contribution from developers label Oct 10, 2023

shiyutang approved these changes Oct 11, 2023


shiyutang merged commit 3c3a12a into PaddlePaddle:develop Oct 11, 2023

shiyutang added the Contributor PR is Merged label Oct 18, 2023
