-
Notifications
You must be signed in to change notification settings - Fork 127
operator: Support Clip op for cpu, wasm, and webgl backends #107
Conversation
const size = result.floatData.length; | ||
if (inputs[0].type === 'float32') { | ||
WasmBinding.getInstance().ccall( | ||
'_clip_f32', [inputs[0].floatData, 'float32ptr'], [result.floatData, 'float32ptr', 'out'], [size, 'int32'], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No use of templatizing as unary-op
in WASM backend as there is not much to templatize (no core implementation here). In fact, the overhead in dealing with types and attributes parsing (which is much more prevalent than in binary-op
case) makes it an unattractive option. So, keep the kernels separate even for unary-op
like implementations
|
||
// Core implementation of the op | ||
template <typename T> | ||
void clip_imp(const T *input, T *output, const int32_t length, const float min, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No use of templatizing like binary_imp
as in binary ops there is a lot to templatize on (for example - the broadcasting logic, etc.). Here - there is only a single for
loop to templatize on and it becomes cumbersome to deal with special-casing of attributes in unary-op
case. So, only templatizing on different types
of the op and instantiating as necessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is OK for now. If we are going to have 10+ unary-op implementations it will be the time to consider this problem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can potentially support all unary-ops in WASM. Even then I would find it hard to templatize the core implementation as each has separate attributes. Since there is no native "attribute" object in our WASM layer, it becomes really hard to templatize this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally looks good to me. there is one small bug in WASM implementation.
About WASM:
Compare WASM vs. javascript on element-wise unary op for performance:
- possible perf gain: avoid float32<->float64 convert for each element (because javascript
number
is float64, and the underlying data is usuallyFloat32Array
) - possible drawback: extra cost in interop (
ccallSerialize()
andccallDeserialize()
)
And this is also a JIT(V8)-vs-emcc battle, so the perf result may be very different in different browser version.
Due to our previous perf data (as we tried Relu
before) in resnet50 there is no performance gain from implementation of simple element-wise unary op in WASM. But it is possible that WASM implementation can be faster than javascript. So in my opinion it's a good idea to have WASM unary-op implementation.
src/wasm-ops/clip.h
Outdated
const float max) { | ||
for (size_t i = 0; i < length; ++i) { | ||
const auto &val = input[i]; | ||
output[i] = (val < min) ? min : (val > max) ? val : val; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
output[i] = (val < min) ? min : (val > max) ? max
: val;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for catching this. Will fix.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is weird, I am not sure how this is not being caught by node tests. I tried adding unit tests, but I realized that our test data sits in some other repo making it hard to add tests :(
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
op test is in the repo and node tests are not. do you want to add test case into node tests?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had not resolved this op in the wasm
backend and so it was passing (despite the bug) all tests by falling back to the cpu
backend. I added wasm
native clip in the operators resolve function and this caught the bug (as expected)
Another thing to consider for JS vs WASM unary-ops is the actual operation being performed on each element. For example, let's consider |
It is actually For float32 tensor, the data is stored in The real underlying math function is always native so that if they are same function it should not make difference. however |
Resolve #92