
【PIR Dist Op Reg No.22】 reg pull_sparse_v2 #63014

Merged
merged 14 commits into PaddlePaddle:develop on Apr 26, 2024

Conversation

@xiaoyewww (Contributor) commented Mar 25, 2024

PR Category

Execute Infrastructure

PR Types

Devs

Description

#60436
Register the pull_sparse_v2 operator.

paddle-bot bot commented Mar 25, 2024

Your PR has been submitted. Thanks for your contribution!
Please wait for the CI results first. See the Paddle CI Manual for details.

@paddle-bot paddle-bot bot added the contributor External developers label Mar 25, 2024
paddle-bot bot commented Mar 25, 2024

❌ The PR is not created using PR's template. You can refer to this Demo.
Please use PR's template, it helps save our maintainers' time so that more developers get helped.

@luotao1 luotao1 added the HappyOpenSource (happy open source program issues and PRs) label Mar 26, 2024
@xiaoyewww xiaoyewww force-pushed the pir/pull_sparse_v2 branch from 36c57b0 to cc3882b Compare March 27, 2024 14:57
@xiaoyewww (Contributor, Author):

@xingmingyyj This operator still seems to have problems. It looks different from the others; the yaml registration follows push_sparse_v2.

@xingmingyyj (Contributor) commented Mar 29, 2024

> @xingmingyyj This operator still seems to have problems. It looks different from the others; the yaml registration follows push_sparse_v2.

Oh, I'm not sure about this either. Could one of the core developers take a look? @kangguangli

@xiaoyewww xiaoyewww force-pushed the pir/pull_sparse_v2 branch from edea9e7 to 015a785 Compare April 7, 2024 11:16
paddle/phi/infermeta/binary.h (review thread resolved)
paddle/phi/infermeta/binary.cc (review thread resolved)
@xiaoyewww (Contributor, Author) commented Apr 9, 2024

@kangguangli Could you please review again? CI reports an error from a third-party library here:
#62935
These two PRs appear to hit the same problem.

phi::errors::InvalidArgument(
"Output(Out) of PullSparseV2Op can not be null"));

auto hidden_size = embeddingdim;
Contributor review comment:

Suggested change
auto hidden_size = embeddingdim;
auto hidden_size = embedding_dim;

This is where the PR's current problem lies.

Comment on lines 1839 to 1842
forward : pull_sparse_v2 (Tensor[] ids, Tensor[] w, int embeddingdim = 11, int tableid = 0, str accessorclass = "", str ctrlabelname = "", int paddingid = 0, bool scalesparsegrad = true, str[] inputnames = {}, bool is_distributed = true) -> Tensor[](out)
args : (Tensor[] ids, Tensor[] w, Tensor[] out_grad, int embeddingdim, int tableid, str accessorclass, str ctrlabelname, int paddingid, bool scalesparsegrad, str[] inputnames, bool is_distributed)
output : Tensor[](out_grad_out)
invoke : push_sparse_v2(ids, w, out_grad, embeddingdim, tableid, accessorclass, ctrlabelname, paddingid, scalesparsegrad, inputnames, is_distributed)
Contributor review comment:

The naming here also needs to be replaced.

paddle-ci-bot bot commented Apr 19, 2024

Sorry to inform you that 759b953's CIs have passed for more than 7 days. To prevent PR conflicts, you need to re-run all CIs manually.

@@ -2653,6 +2653,14 @@
outputs :
out : Out

- op : pull_sparse_v2
inputs :
{ ids : Ids, W : w}
@xingmingyyj (Contributor) commented Apr 21, 2024

Suggested change
{ ids : Ids, W : w}
{ ids : Ids, w : W}

push_sparse_v2 also needs to be updated to match.


for (size_t i = 0; i < n_ids; ++i) {
out[i]->set_dims(outs_dims[i]);
out[i]->share_lod(*ids[i], i);
Contributor review comment:

The dtype also needs to be set here.
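The dtype point can be illustrated with a self-contained sketch using mock types. `MockMetaTensor` and the `"float32"` string tag below are hypothetical stand-ins for illustration only, not Paddle's real `phi::MetaTensor` API; the loop mirrors the reviewed code, with the flagged dtype step added.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for phi::MetaTensor (illustration only; the real
// class uses phi::DDim and phi::DataType rather than these mock members).
struct MockMetaTensor {
  std::vector<int64_t> dims;
  std::string dtype = "undefined";
  std::vector<size_t> lod;
  void set_dims(const std::vector<int64_t>& d) { dims = d; }
  void set_dtype(const std::string& t) { dtype = t; }
  void share_lod(const MockMetaTensor& src) { lod = src.lod; }
};

// Sketch of the reviewed InferMeta loop: besides propagating dims and lod,
// each output's dtype must also be set, otherwise it stays undefined.
void PullSparseV2InferMeta(const std::vector<MockMetaTensor*>& ids,
                           std::vector<MockMetaTensor*>* out,
                           int64_t hidden_size) {
  for (size_t i = 0; i < ids.size(); ++i) {
    (*out)[i]->set_dims({ids[i]->dims[0], hidden_size});
    (*out)[i]->share_lod(*ids[i]);
    (*out)[i]->set_dtype("float32");  // the step flagged in the review
  }
}
```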

from paddle.base.layer_helper import LayerHelper


class TestPullGpupsSparseOpTranslator(
Contributor review comment:

Suggested change
class TestPullGpupsSparseOpTranslator(
class TestPullSparseV2OpTranslator(

inputs={"Ids": [ids], "W": [w]},
outputs={"Out": [out]},
)

Contributor review comment:

Suggested change
return out

The out value also needs to be returned here.

@@ -1835,6 +1835,12 @@
data_type : x
optional : boxes_num

- backward_op : pull_sparse_v2_grad
forward : pull_sparse_v2 (Tensor[] ids, Tensor[] w, int embedding_dim = 11, int table_id = 0, str accessor_class = "", str ctrlabel_name = "", int padding_id = 0, bool scale_sparse_grad = true, str[] input_names = {}, bool is_distributed = true) -> Tensor[](out)
args : (Tensor[] ids, Tensor[] w, Tensor[] out_grad, int embedding_dim, int table_id, str accessor_class, str ctrlabel_name, int padding_id, bool scale_sparse_grad, str[] input_names, bool is_distributed)
Contributor review comment:

Suggested change
args : (Tensor[] ids, Tensor[] w, Tensor[] out_grad, int embedding_dim, int table_id, str accessor_class, str ctrlabel_name, int padding_id, bool scale_sparse_grad, str[] input_names, bool is_distributed)
args : (Tensor[] ids, Tensor[] w, Tensor[] out_grad, int embedding_dim = 11, int table_id = 0, str accessor_class = "", str ctrlabel_name = "", int padding_id = 0, bool scale_sparse_grad = true, str[] input_names = {}, bool is_distributed = true)

The current error is caused here: default values need to be set, otherwise the Python script fills them with None.
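For reference, a complete backward-op entry with the defaults filled in might look like the following sketch. This is assembled from the forward declaration and the snake_case renaming discussed in this thread, not the verbatim merged YAML:

```yaml
- backward_op : pull_sparse_v2_grad
  forward : pull_sparse_v2 (Tensor[] ids, Tensor[] w, int embedding_dim = 11, int table_id = 0, str accessor_class = "", str ctrlabel_name = "", int padding_id = 0, bool scale_sparse_grad = true, str[] input_names = {}, bool is_distributed = true) -> Tensor[](out)
  args : (Tensor[] ids, Tensor[] w, Tensor[] out_grad, int embedding_dim = 11, int table_id = 0, str accessor_class = "", str ctrlabel_name = "", int padding_id = 0, bool scale_sparse_grad = true, str[] input_names = {}, bool is_distributed = true)
  output : Tensor[](out_grad_out)
  invoke : push_sparse_v2(ids, w, out_grad, embedding_dim, table_id, accessor_class, ctrlabel_name, padding_id, scale_sparse_grad, input_names, is_distributed)
```

With defaults on the grad args, the code generator can fall back to them instead of emitting None for unspecified attributes.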

@xiaoyewww (Contributor, Author):

Thanks! Fixed.

@xingmingyyj (Contributor):

The current failure is because out's stop_gradient attribute is True, so the backward op is not inserted. You can rerun CI after the pull_gpups_sparse PR is merged.

@xiaoyewww xiaoyewww force-pushed the pir/pull_sparse_v2 branch from 4641b5f to 1b49103 Compare April 25, 2024 15:10
@xiaoyewww (Contributor, Author):

> The current failure is because out's stop_gradient attribute is True, so the backward op is not inserted. You can rerun CI after the pull_gpups_sparse PR is merged.

Thanks. I have now rebased and rerun the CI.

@kangguangli kangguangli merged commit 2566968 into PaddlePaddle:develop Apr 26, 2024
28 of 30 checks passed
runzhech pushed a commit to runzhech/Paddle that referenced this pull request Apr 30, 2024 (14 commits: feat(pir): reg pull_sparse_v2)
runzhech pushed a commit to runzhech/Paddle that referenced this pull request Apr 30, 2024 (14 commits: feat(pir): reg pull_sparse_v2)
co63oc pushed a commit to co63oc/Paddle that referenced this pull request May 10, 2024 (14 commits: feat(pir): reg pull_sparse_v2)
@xiaoyewww xiaoyewww deleted the pir/pull_sparse_v2 branch May 10, 2024 15:10
Labels
contributor (External developers), HappyOpenSource (happy open source program issues and PRs)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants