[Fix]modify quantized ops during mkldnn int8 inference #3514

Merged (2 commits) on Oct 11, 2023

ranchongzhi
Contributor

PR types

PR changes

Description

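While getting the ACT example project to run, I found that CPU quantized inference accuracy dropped quite severely. The cause was that, at test time, two more ops were quantized than during quantization training: pool2d and elementwise_mul. After removing them, inference accuracy returned to its normal level. The experimental results are below; all runs set benchmark=True.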
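The mismatch described above can be caught with a simple set comparison before running inference. The sketch below is illustrative only: the two op sets are assumptions reconstructed from this description (conv2d and depthwise_conv2d quantized during training, plus pool2d and elementwise_mul at test time), and the helper names are hypothetical, not part of PaddleSeg or Paddle Inference.

```python
# Op sets reconstructed from the PR description (assumed, for illustration).
TRAIN_QUANTIZED_OPS = {"conv2d", "depthwise_conv2d"}
INFER_QUANTIZED_OPS = {"conv2d", "depthwise_conv2d", "elementwise_mul", "pool2d"}


def extra_inference_ops(train_ops, infer_ops):
    """Return ops quantized at inference time but not during quantization training."""
    return sorted(set(infer_ops) - set(train_ops))


def align_inference_ops(train_ops, infer_ops):
    """Keep only the inference ops that were also quantized during training."""
    return set(infer_ops) & set(train_ops)


extra = extra_inference_ops(TRAIN_QUANTIZED_OPS, INFER_QUANTIZED_OPS)
aligned = align_inference_ops(TRAIN_QUANTIZED_OPS, INFER_QUANTIZED_OPS)

print("ops quantized only at inference:", extra)
print("ops to quantize at inference:", sorted(aligned))
```

The aligned set is what one would then hand to the inference configuration as the list of int8-enabled op types.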

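| | Original model, FP32 inference | Original model, FP32 + MKLDNN acceleration | Quantized model, INT8 inference (quantizing conv2d, depthwise_conv2d) | Quantized model, INT8 inference (quantizing conv2d, depthwise_conv2d, elementwise_mul) | Quantized model, INT8 inference (quantizing conv2d, depthwise_conv2d, elementwise_mul, pool2d) |
| --- | --- | --- | --- | --- | --- |
| mIoU | 0.7704 | 0.7704 | 0.7658 | 0.7657 | 0.7372 |
| Time (ms) | 1216.8 | 1191.3 | 434.5 | 439.6 | 505.8 |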
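To make the trade-off in these numbers easier to read, the following sketch computes each INT8 configuration's mIoU drop and speedup relative to the plain FP32 baseline. The values are copied from the results above; the dictionary keys and the `compare_to_baseline` helper are hypothetical names chosen for illustration.

```python
# (mIoU, time in ms) per configuration, copied from the reported results.
RESULTS = {
    "fp32": (0.7704, 1216.8),
    "fp32_mkldnn": (0.7704, 1191.3),
    "int8_conv": (0.7658, 434.5),            # conv2d, depthwise_conv2d
    "int8_conv_mul": (0.7657, 439.6),        # + elementwise_mul
    "int8_conv_mul_pool": (0.7372, 505.8),   # + elementwise_mul, pool2d
}


def compare_to_baseline(results, baseline="fp32"):
    """Return {name: (mIoU drop, speedup factor)} relative to the baseline."""
    base_miou, base_ms = results[baseline]
    return {
        name: (base_miou - miou, base_ms / ms)
        for name, (miou, ms) in results.items()
        if name != baseline
    }


summary = compare_to_baseline(RESULTS)
for name, (drop, speedup) in summary.items():
    print(f"{name}: mIoU drop {drop:.4f}, speedup {speedup:.2f}x")
```

With these figures, the configuration that also quantizes pool2d is both the least accurate and the slowest of the three INT8 runs.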


paddle-bot bot commented Oct 10, 2023

Thanks for your contribution!

paddle-bot bot added the contributor (Contribution from developers) label Oct 10, 2023
Collaborator

@shiyutang shiyutang left a comment

lgtm

@shiyutang shiyutang merged commit 3c3a12a into PaddlePaddle:develop Oct 11, 2023