Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

【Hackathon 5th No.105】move fusion_gru/fusion_seqconv_eltadd_relu/fusion_seqexpand_concat_fc to phi #57881

Merged
merged 17 commits into from
Oct 27, 2023

Conversation

zeroRains
Copy link
Contributor

PR types

Others

PR changes

Others

Description

move fusion_gru/fusion_seqconv_eltadd_relu/fusion_seqexpand_concat_fc to phi
#57262

@paddle-bot
Copy link

paddle-bot bot commented Oct 6, 2023

你的PR提交成功,感谢你对开源项目的贡献!
请关注后续CI自动化测试结果,详情请参考Paddle-CI手册
Your PR has been submitted. Thanks for your contribution!
Please wait for the result of CI firstly. See Paddle CI Manual for details.

@paddle-bot paddle-bot bot added the contributor External developers label Oct 6, 2023
@zeroRains
Copy link
Contributor Author

zeroRains commented Oct 7, 2023

额,这个输出在fluid的OpMaker中有一个.AsIntermediate()描述,我在处理的时候当他不存在了,但是下面的报错告诉我应该是不能忽略的,想问问研发大哥这个描述应该怎么处理呢?@yuanlehome
3d4fe3c29863b48fb71ea04fc9bff47

@yuanlehome
Copy link
Contributor

额,这个输出在fluid的OpMaker中有一个.AsIntermediate()描述,我在处理的时候当他不存在了,但是下面的报错告诉我应该是不能忽略的,想问问研发大哥这个描述应该怎么处理呢? 3d4fe3c29863b48fb71ea04fc9bff47

AsIntermediate表示这个输出仅仅用于反向,不会作为其他算子的输入,参考这个
48bbfac76f14af583f7008ac676c8c84

@luotao1
Copy link
Contributor

luotao1 commented Oct 10, 2023

需要解决下冲突

@zeroRains
Copy link
Contributor Author

需要解决下冲突

Done

@zeroRains
Copy link
Contributor Author

zeroRains commented Oct 15, 2023

image
迁移了fusion_gru_op.cc之后,使得fusion_gru_mkldnn_op单测不过,但是这个单测调用的kernel应该在fusion_gru_mkldnn_op.cc文件中,我都没有修改过,仔细看了一下和fusion_gru_op.cc也没有调用关系。实在看不出来为什么会出现这个错误。

@yuanlehome

@yuanlehome
Copy link
Contributor

image 迁移了fusion_gru_op.cc之后,使得fusion_gru_mkldnn_op单测不过,但是这个单测调用的kernel应该在fusion_gru_mkldnn_op.cc文件中,我都没有修改过,仔细看了一下和fusion_gru_op.cc也没有调用关系。实在看不出来为什么会出现这个错误。

@yuanlehome

fusion_gru_mkldnn_op和fusion_gru_op使用的是同一个op名字“fusion_gru”,只是后端不同,看起来需要把fusion_gru_mkldnn_op也一并迁到phi里去才可以

@yuanlehome
Copy link
Contributor

image 迁移了fusion_gru_op.cc之后,使得fusion_gru_mkldnn_op单测不过,但是这个单测调用的kernel应该在fusion_gru_mkldnn_op.cc文件中,我都没有修改过,仔细看了一下和fusion_gru_op.cc也没有调用关系。实在看不出来为什么会出现这个错误。
@yuanlehome

fusion_gru_mkldnn_op和fusion_gru_op使用的是同一个op名字“fusion_gru”,只是后端不同,看起来需要把fusion_gru_mkldnn_op也一并迁到phi里去才可以。还需要关注下 paddle/fluid/operators/fused/fusion_gru_op.h

@zeroRains
Copy link
Contributor Author

image 迁移了fusion_gru_op.cc之后,使得fusion_gru_mkldnn_op单测不过,但是这个单测调用的kernel应该在fusion_gru_mkldnn_op.cc文件中,我都没有修改过,仔细看了一下和fusion_gru_op.cc也没有调用关系。实在看不出来为什么会出现这个错误。
@yuanlehome

fusion_gru_mkldnn_op和fusion_gru_op使用的是同一个op名字“fusion_gru”,只是后端不同,看起来需要把fusion_gru_mkldnn_op也一并迁到phi里去才可以。还需要关注下 paddle/fluid/operators/fused/fusion_gru_op.h

我迁移fusion_gru_mkldnn_op试试

@yuanlehome
Copy link
Contributor

部分单测仍然会挂掉,需要再看一下

@zeroRains
Copy link
Contributor Author

部分单测仍然会挂掉,需要再看一下

嗯嗯,我看到了,我后面检查一下

@zeroRains
Copy link
Contributor Author

部分单测仍然会挂掉,需要再看一下

嗯嗯,我看到了,我后面检查一下

额,请问这些单侧的文件在哪里呢,我在本地直接执行,好像没找到
image

image

@zeroRains
Copy link
Contributor Author

请问开启FLAGS_enable_new_ir_in_executor=1后,出现定义的输出和实际输出的数量不对应的情况,这种一般会是什么原因呢?
image

@zeroRains
Copy link
Contributor Author

image

调试中写的这些输出好像也没发挥作用,(:з」∠)

@@ -208,7 +208,7 @@
data_type : x

- op : fusion_gru
args : (Tensor x, Tensor h0, Tensor weight_x, Tensor weight_h, Tensor bias, str activation = "tanh", str gate_activation = "sigmoid", bool is_reverse = false, bool use_seq = true, bool origin_mode = false, bool use_mkldnn = false, str mkldnn_data_type = "float32", float scale_data = 1.0f, float shift_data = 0.0f, float[] scale_weights = {1.0f}, bool force_fp32_output = false)
args : (Tensor x, Tensor h0, Tensor weight_x, Tensor weight_h, Tensor bias, str activation = "tanh", str gate_activation = "sigmoid", bool is_reverse = false, bool use_seq = true, bool origin_mode = false, bool use_onednn = false, str onednn_data_type = "float32", float scale_data = 1.0f, float shift_data = 0.0f, float[] scale_weights = {1.0f}, bool force_fp32_output = false)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

op的属性改为onednn字样还需要评估下,会不会导致一些不兼容的问题

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

额,那我改回来_(:з)∠)_

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这个不用改回来,下面那个改回来吧

DenseTensor* batched_input,
DenseTensor* batched_out,
DenseTensor* hidden) {
void FusionGRUOneDNNKernel(const Context& dev_ctx,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这个kernel名字就没要加OneDNN字样了吧?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@@ -1332,15 +1325,51 @@
extra :
attrs : [str data_format = "AnyLayout"]

- op : fusion_transpose_flatten_concat
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

为什么把fusion_transpose_flatten_concat给删掉了?它的输入输出不是不一致吗?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这个没有删掉啊,之前解决冲突的时候被我换了个位置然后用pre-commit的时候位置调整了一下,他现在在1368行

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

好的

@yuanlehome
Copy link
Contributor

另外确认下,迁移完后相关的几个单测都是跑通的对吧?

@zeroRains
Copy link
Contributor Author

另外确认下,迁移完后相关的几个单测都是跑通的对吧?

不开启FLAGS_enable_new_ir_in_executor=1的时候,是可以跑通的

yuanlehome
yuanlehome previously approved these changes Oct 26, 2023
Copy link
Contributor

@yuanlehome yuanlehome left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

zyfncg
zyfncg previously approved these changes Oct 26, 2023
Comment on lines +2365 to +2366
out->set_dims({x_dims[0], w_dims[1]});
col_mat->set_dims({x_dims[0], w_dims[0]});
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

一般dtype也需要进行设置,这几个InferMeta函数可以再提个PR补一下

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@zeroRains 直接在这个PR补上吧~

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

之前合入的那个PR好像也没有设置,也一并加上吧~

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

好的

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@zeroRains zeroRains dismissed stale reviews from zyfncg and yuanlehome via 187fb27 October 26, 2023 10:57
@zeroRains
Copy link
Contributor Author

额。。Corverage超时了。。。我还没权限重跑

attrs :
trans_axis : trans_axis
flatten_axis : flatten_axis
concat_axis : concat_axis
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这里为什么将原来的删除了?会不会存在兼容性问题

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

不会的,name一样不需要添加映射

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

image

名称一致所以删掉

Copy link
Contributor

@heavyrain-lzy heavyrain-lzy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM for YAML

@phlrain phlrain self-requested a review October 27, 2023 08:27
Copy link
Contributor

@XiaoguangHu01 XiaoguangHu01 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@sunzhongkai588 sunzhongkai588 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. No doc changes

@yuanlehome yuanlehome merged commit 2157c07 into PaddlePaddle:develop Oct 27, 2023
@zeroRains zeroRains deleted the 105 branch October 27, 2023 09:21
danleifeng pushed a commit to danleifeng/Paddle that referenced this pull request Nov 14, 2023
…on_seqexpand_concat_fc to phi (PaddlePaddle#57881)

* add a part of fusion_gru

* add some code

* move the fusion_seqconv_eltadd_relu_op to phi but have the same bug in gru_op

* move fusion_seqexpand_concat_fc_op to phi, but have the same bug in fusion_gru_op

* fix the intermediate bug

* fix the conflict

* pass the fusion_seqconv in new ir

* try to move fusion_gru_mkldnn_kernel to phi

* move fusion_gru_mkldnn to phi, pass the test but some bug in new iR

* change some discribe

* fix conflict

* fix bug

* fix the bug in getInputName

* remove some describe

* add the set_dtype
@yuanlehome
Copy link
Contributor

hi同学,这几个kernel的迁移到phi存在问题,需要提PR修复下哈,mkldnn相关的参数不需要写到yaml中,也不需要在kernel、infermata的签名中体现。
参数传入的方法可参考:https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/fluid/framework/operator.cc#L3545
kernel中取参数可参考:https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/phi/kernels/onednn/matmul_kernel.cc#L109

@yuanlehome
Copy link
Contributor

hi同学,这几个kernel的迁移到phi存在问题,需要提PR修复下哈,mkldnn相关的参数不需要写到yaml中,也不需要在kernel、infermata的签名中体现。 参数传入的方法可参考:https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/fluid/framework/operator.cc#L3545 kernel中取参数可参考:https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/phi/kernels/onednn/matmul_kernel.cc#L109

修复前后,与之前的验证方法一样哈

@PaddlePaddle PaddlePaddle deleted a comment from zeroRains Nov 29, 2023
@luotao1 luotao1 changed the title 【Hackathon 5th No.105】move fusion_gru/fusion_seqconv_eltadd_relu/fusion_seqexpand_concat_fc to phi 【Hackathon 5th No.105】move fusion_gru/fusion_seqconv_eltadd_relu/fusion_seqexpand_concat_fc to phi -part Dec 22, 2023
@luotao1 luotao1 changed the title 【Hackathon 5th No.105】move fusion_gru/fusion_seqconv_eltadd_relu/fusion_seqexpand_concat_fc to phi -part 【Hackathon 5th No.105】move fusion_gru/fusion_seqconv_eltadd_relu/fusion_seqexpand_concat_fc to phi Jan 29, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants