-
Notifications
You must be signed in to change notification settings - Fork 275
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
【Hackathon 5th No.28】为 Paddle 新增 slice_scatter API #668
【Hackathon 5th No.28】为 Paddle 新增 slice_scatter API #668
Conversation
文档中要求slice_scatter API的实现放在python/paddle/tensor/manipulation.py中,因此该算子可以在Python层用代码实现,且Python API的开发更便捷,本RFC将实现方式定为Python API组合。但是python/paddle/tensor/manipulation.py中的其他API均调用C++代码的算子库,为了一致性可能使用C++实现比较合适。想请问老师对于这两种实现手段有何看法,关于C++和Python API到底选用那种方式更合适。 |
- dim: (int) 将要插入src slice的维度。 | ||
- start: (Optional[int]) 待插入slice位置的起始index。 | ||
- end: (Optional[int]) 待插入slice位置的结束index。 | ||
- step: (int) 待插入slice的步长。 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
好的
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
已经修改
## API实现方案 | ||
|
||
- 使用`paddle.Tensor.clone`拷贝一个Tensor | ||
- 使用PythonAPI组合,以索引或切片方式在新Tensor上赋值以完成功能。 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
不建议直接使用索引切片的方式实现,首先有额外的不必要解析耗时,其次索引赋值是原地的(因此这里才提到需要clone),建议参考set_value OP
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
这里的方案本质上是:
a = src.clone()
a[...,start:end:step,...] = input
事实上Torch的实现也比较类似,但是Torch中是使用Aten在C++中实现的。
set_value我注意到文档中写道This API is ONLY available in Dygraph mode,似乎和我们的要求不符?想麻烦老师说明如何参考。
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a = src.clone() a[...,start:end:step,...] = input
这个写法底层的本质仍然是调用的set_value OP,这个目前没暴露成api,可以参考下 OP定义https://github.com/PaddlePaddle/Paddle/blob/864779c75bc9d57a67442935c49e7903943edbc0/paddle/phi/kernels/cpu/set_value_kernel.cc#L15
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
收到,我去看一看
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
我看了一下,set_value OP其实完成的就是将Tensor的某些位置赋值给另一个Tensor。示例代码这里是inplace完成的。
老师的意思是,我们实现时仿照_setitem_impl_
函数,通过set_value的C++ OP调用完成,同时取消inplace设置么?这样做应该就可以省去老师所说的解析开销?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
set_value的C++ OP调用完成,同时取消inplace设置么?这样做应该就可以省去老师所说的解析开销?
是的,解析开销主要指从x[:] = 1
的调用形式,逐步确认要调用哪个OP,以及准备这个OP的输入。这里由于我们已经知道应该调用哪个OP了,直接调用即可。此外,静态图还是需要append_op的方式,这个请参考上述代码或其他API的写法
- start: (Optional[int]) 待插入slice位置的起始index。 | ||
- end: (Optional[int]) 待插入slice位置的结束index。 | ||
- stop: (Optional[int]) 待插入slice位置的结束index。 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
因为和slice是对偶API,参数名建议和slice对齐;
此外x
, y
表示两个对等的参数。例如add(x, y),这里使用value
更合适
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
此处还需要修改下
此外,辛苦补充下pytorch对应API是否要求value
的shape与input取出的情况一致呢?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
好的,我再调研补充一下,抱歉之前可能没看到这条review
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
关于pytorch对应API是否要求value的shape与input取出的情况一致的问题,其算子中TORCH_CHECK(slice.sizes() == src.sizes(), "expected src to have a size equal to the slice of self. src size = ", src.sizes(), ", slice size = ", slice.sizes());
这行代码会检测value的shape和input的切片的一致性,因此是需要一致的
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
因为和slice是对偶API,参数名建议和slice对齐; 此外
x
,y
表示两个对等的参数。例如add(x, y),这里使用value
更合适
是指将参数名变为x, value, axis, start, end, step
么?此处我选择用了单数,因为slice的输入可以有多个,而这里只有一个输入。
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
是指将参数名变为
x, value, axis, start, end, step
么?此处我选择用了单数,因为slice的输入可以有多个,而这里只有一个输入。
这个地方,建议除了将y
改成value
以外。由于set_value
算子本身和slice一样,支持多个轴,多个start等输入,个人觉得从对偶的角度,和slice一样支持多输入更好。此时也能覆盖目前的单输入场景
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
是指将参数名变为
x, value, axis, start, end, step
么?此处我选择用了单数,因为slice的输入可以有多个,而这里只有一个输入。这个地方,建议除了将
y
改成value
以外。由于set_value
算子本身和slice一样,支持多个轴,多个start等输入,个人觉得从对偶的角度,和slice一样支持多输入更好。此时也能覆盖目前的单输入场景
如果要将后面的输入作为多个的话,是指将value的值放入多次切片后的位置(类似二维张量中的二维区域)么?这里我主要出于与torch对齐选择了当前的API格式,如果修改的话需要考虑对单输入特殊处理么。
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
是指将参数名变为
x, value, axis, start, end, step
么?此处我选择用了单数,因为slice的输入可以有多个,而这里只有一个输入。这个地方,建议除了将
y
改成value
以外。由于set_value
算子本身和slice一样,支持多个轴,多个start等输入,个人觉得从对偶的角度,和slice一样支持多输入更好。此时也能覆盖目前的单输入场景如果要将后面的输入作为多个的话,是指将value的值放入多次切片后的位置(类似二维张量中的二维区域)么?
是这样的语义。
目前看起来,可以将axes
, starts
等和paddle.slice
一样以list类型作为输入,单输入只是List里只有一个元素而多输入是有多个元素即可。
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
我看了一下,我目前觉得可能还是单输入更好。一是多输入情况下我们其实仍要完成和set item有一部分重复的解析和判断操作(输入list内各参数的解析与判断,axes的合法性判断,多重输入判断等等),实现起来过于复杂。二是诸如pytorch和mindspore等其他框架都使用这一格式,我觉得api如果能够尽量与其他框架一致会更好。
## API实现方案 | ||
|
||
- 使用`paddle.Tensor.clone`拷贝一个Tensor | ||
- 使用PythonAPI组合,以索引或切片方式在新Tensor上赋值以完成功能。 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
set_value的C++ OP调用完成,同时取消inplace设置么?这样做应该就可以省去老师所说的解析开销?
是的,解析开销主要指从x[:] = 1
的调用形式,逐步确认要调用哪个OP,以及准备这个OP的输入。这里由于我们已经知道应该调用哪个OP了,直接调用即可。此外,静态图还是需要append_op的方式,这个请参考上述代码或其他API的写法
@VOIDMalkuth 请问审核人的意见都改完了么,麻烦评论一下 |
目前还剩API的格式待讨论确定,完成后我们会at老师 |
由于个人事务较多,无法继续开发工作,希望#784 可以接替有关工作 |
为 Paddle 新增 slice_scatter API