Fix linalg vector norm backward bug #8015
Conversation
Maybe try another framework, e.g. Paddle?
OK.
I took a look — Paddle doesn't have this interface.
I gave it a try with Paddle anyway, and the result is also 0.
```cpp
@@ -153,6 +153,13 @@ struct AtanhFunctor<float> {
  }
};

template<>
struct NotEqualZeroFunctor<float> {
  static OF_DEVICE_FUNC float Forward(const float x) { return x != 0; }
```
You should use static_cast<float>(0.0) here; as written, you are comparing a float against an int.
* fix reduce_sum scalar check bug
* fix linalg vector norm and clip grad bug
* fix comment
* auto format by CI
* Fix linalg vector norm backward bug (#8015)
* has multi definition bug
* fix bug
* fix commnet
* fix bug

Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org>
Previously, linalg.vector_norm with ord=0 was implemented as ScalarLogicalNotEqual + ReduceSum, which cuts the backward graph during autograd. This change adds a new unary op, NotEqualZero, to fix that.
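For illustration, here is a rough Python-level sketch of why a comparison-plus-reduce composition cuts the backward graph. The composition shown here is an assumption written at the Python level; the actual change in this PR is in the C++ kernels.

```python
import oneflow as flow

x = flow.tensor([1.0, 0.0, 2.0], requires_grad=True)

# Old-style composition: a logical comparison followed by a reduce-sum.
# The comparison yields a boolean/integer tensor that carries no gradient,
# so autograd has no path from the sum back to x.
mask = flow.ne(x, 0)              # non-differentiable output
norm0_old = mask.to(flow.float32).sum()
print(norm0_old.requires_grad)    # expected False: the graph is cut at the comparison

# The fix replaces this two-op composition with a single differentiable
# unary kernel (NotEqualZero) whose forward is x != 0 cast to float, so
# ReduceSum's backward can propagate all the way to x.
```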
Note that PyTorch's linalg.vector_norm builds the ord=0 gradient directly with zeros_like, see: https://github.com/pytorch/pytorch/pull/59135/files#diff-4adbd88239afcd60e8198aab65d4f5e43b62314e34b80551e997a1ea503adea5L231-L232 . As a result, whenever ord=0 the input gradient is always 0. That clearly does not match the semantics of this API, and I lean toward it being a PyTorch bug. So I still stand by our approach; I think ours is the correct behavior.
Example:
PyTorch outputs 0 while OneFlow outputs 1. Going by the semantics, the gradient should indeed be 1.
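For reference, a minimal reproduction along the lines of the example above. The exact input tensor is not shown in the thread, so the values here are illustrative, and the printed gradients are the ones reported in this PR rather than guaranteed outputs.

```python
import torch
import oneflow as flow

# PyTorch: the ord=0 backward is hard-coded to zeros_like, so the input
# gradient is always zero.
xt = torch.tensor([1.0], requires_grad=True)
torch.linalg.vector_norm(xt, ord=0).backward()
print(xt.grad)  # 0, as reported above

# OneFlow (with this PR): the backward goes through the new NotEqualZero op;
# the thread reports a gradient of 1 for the same computation.
xf = flow.tensor([1.0], requires_grad=True)
flow.linalg.vector_norm(xf, ord=0).backward()
print(xf.grad)  # 1, as reported above
```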