Add support for broadcast multiply along batch dimension (W) #5769
fyi @apalaguha @esmalTT
This op is critical for us when doing element-wise multiplication of conv1D weights (batch-independent) with inputs (batch-dependent). The workaround of manually broadcasting the conv1D weights makes the implementation harder and non-performant.
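To make the cost of that workaround concrete, here is a minimal NumPy sketch (NumPy is used only to illustrate the semantics; it is not the device API). The shapes follow the example in this issue, and the `np.repeat` call stands in for the manual broadcast of the batch-independent weights:

```python
import numpy as np

batch = 32
# Batch-independent conv1D weights and batch-dependent inputs.
weights = np.random.rand(1, 1, 32, 1024)
inputs = np.random.rand(batch, 1, 32, 1024)

# Workaround: materialize a full (32, 1, 32, 1024) copy of the weights
# before the element-wise mul -- extra memory traffic and an extra op.
weights_tiled = np.repeat(weights, batch, axis=0)
assert weights_tiled.shape == (batch, 1, 32, 1024)

out = inputs * weights_tiled
assert out.shape == (batch, 1, 32, 1024)
```

With broadcast support in the mul op itself, the `np.repeat` materialization step would be unnecessary.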
Hi @kpaigwar @jliangTT
@bharane-ab the first approach would be ideal
@kpaigwar Regarding the above, why are we considering the mul operation alone? What about other binary operations like squared_diff, add, sub, etc.? If we make the change in a common place, it would apply to all of them. @kpaigwar Or do you want me to go with a separate op instead?
Tried the implementation as a separate composite batch-mul op in #6442.
If you are targeting performance, are you planning to run this model/op sharded, or exactly as specified in this issue? The optimizations and support for sharding are different than for interleaved, so I want to make sure we're targeting/optimizing the right thing. Similar question for #6361.
Support merged to main.
Requirement
Need support for an op that performs element-wise multiply with broadcasting along the batch dim.
For example: (32, 1, 32, 1024) * (1, 1, 32, 1024)
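The requested behavior matches standard NumPy-style broadcasting, where a size-1 batch dim on one operand is expanded to match the other operand's batch dim. A short sketch of the expected semantics (NumPy is illustrative only; the actual op would run on device):

```python
import numpy as np

# Batch-dependent inputs and batch-independent weights from the example above.
inputs = np.random.rand(32, 1, 32, 1024)
weights = np.random.rand(1, 1, 32, 1024)

# Broadcast multiply: the size-1 batch dim of `weights` is logically
# replicated across the 32 batches of `inputs`, with no materialized copy.
out = inputs * weights
assert out.shape == (32, 1, 32, 1024)
```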