-
Notifications
You must be signed in to change notification settings - Fork 5.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add reduce_merge files for reduce op #32697
Conversation
Thanks for your contribution! |
paddle/fluid/operators/reduce_ops/reduce_merge/reduce_functor_op.h
Outdated
Show resolved
Hide resolved
建议文件放在reduce_ops下即可,后缀改为.cuh |
注意clang-format |
07f509e
to
8378a5e
Compare
5a68611
to
1c44bae
Compare
…a malloc and writed more notes
1c44bae
to
f83712c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
LGTM |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
建议PR代码接入一个算子,跑下完整的CI,不然光看代码看不出是否有问题。#32804 正在给reduce_sum添加fp16 kernel,这个实现当前没有针对fp16进行优化,先不要改reduce_sum。
#include "paddle/fluid/operators/amp/fp16_type_traits.h" | ||
#include "paddle/fluid/platform/device_context.h" | ||
#include "paddle/fluid/platform/hostdevice.h" | ||
#include "paddle/fluid/platform/macros.h" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
极小化include的头文件,看起来以上头文件都没有实际依赖。
} | ||
|
||
template <typename T, size_t ElementCount, typename VectorLikeType> | ||
static inline paddle::framework::Array<T, ElementCount> from( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
这个函数是从什么转到什么?单个的From
函数一般是用作类的成员函数,如果是单独的函数,最好命名更清楚一些。
vec.size(), ElementCount)); | ||
size_t n = static_cast<size_t>(vec.size()); | ||
paddle::framework::Array<T, ElementCount> ret; | ||
for (size_t i = 0; i < n; ++i) ret[i] = vec[i]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
不要写在同一行,加{}。
|
||
template <typename Tx, typename Ty, int BlockDim, typename ReduceOp, | ||
typename TransformOp, int kRank, int kReduceRank> | ||
static void launchKernel(const Tx* x_data, Ty* y_data, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
launchKernel -> LaunchKernel,首字母大写。
|
||
template <typename Tx, typename Ty, int BlockDim, typename ReduceOp, | ||
typename TransformOp> | ||
static void launchReduceKernel(const Tx* x_data, Ty* y_data, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
launchReduceKernel -> LaunchReduceKernel
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. 先合入这个pr,下个pr尽快接入算子进行ci验证。
PR types
Performance optimization
PR changes
APIs
Describe
add reduce_op.h and functor.h in reduce_merge for reduce op
性能测试数据: