
Add support for broadcast multiply along batch dimension (W) #5769

Closed · Tracked by #6445
kpaigwar opened this issue Feb 28, 2024 · 10 comments

Comments

@kpaigwar
Contributor

Requirement

Need support for an op that can perform an element-wise multiply with broadcasting along the batch dim.
For example, (32, 1, 32, 1024) * (1, 1, 32, 1024).
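For reference, this is standard leading-dim broadcasting; a minimal sketch of the expected semantics, using PyTorch only as a reference model (not the tt-metal API):

```python
import torch

# Batch-dependent activations (32 batches) and batch-independent weights (1 batch).
inputs = torch.randn(32, 1, 32, 1024)
weights = torch.randn(1, 1, 32, 1024)

# Element-wise multiply; the size-1 batch dim of `weights` broadcasts to 32.
out = inputs * weights
assert out.shape == (32, 1, 32, 1024)
```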

@kpaigwar
Contributor Author

fyi @apalaguha @esmalTT

@kpaigwar
Contributor Author

kpaigwar commented Feb 29, 2024

This op is critical for us when doing element-wise multiplication of conv1D weights (batch-independent) with inputs (batch-dependent). The workaround of manually broadcasting the conv1D weights makes the implementation harder and non-performant.

@bharane-abb

bharane-abb commented Mar 12, 2024

Hi @kpaigwar @jliangTT
We have come up with two ideas for multiplication without using broadcasting:

  1. Using the repeat function, we will repeat the smaller tensor into the required shape and then proceed with the multiplication (see the sketch after this list).
  2. Using the unpad function, we can unpad the larger tensor into smaller tensors and multiply them with the smaller tensor. However, this process is time-consuming, as it involves many unpad, multiply, and concat operations.
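A minimal sketch of idea 1, again in PyTorch reference semantics; `Tensor.repeat` stands in for the tt-metal repeat function (an assumption about the eventual call, not the actual API):

```python
import torch

inputs = torch.randn(32, 1, 32, 1024)   # larger, batch-dependent tensor
weights = torch.randn(1, 1, 32, 1024)   # smaller, batch-independent tensor

# Idea 1: materialize the broadcast by repeating the smaller tensor along
# the batch dim, then fall back to a plain same-shape element-wise multiply.
weights_repeated = weights.repeat(32, 1, 1, 1)   # -> (32, 1, 32, 1024)
out = inputs * weights_repeated
```

Note that this materializes 32 copies of the weights, which is exactly the memory and bandwidth overhead that native broadcast support would avoid.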

@kpaigwar
Contributor Author

@bharane-ab the first approach would be ideal

@umadevimcw
Contributor

umadevimcw commented Mar 15, 2024

@kpaigwar Regarding the above: why are we considering only the mul operation? What about other binary operations like squared_diff, add, sub, etc.? If we make the change in a common place, it will apply to all of them.

I tried updating eltwise_binary_op.hpp to apply the change to all ops in binary, but I am getting the error shown in the image below. The suggested flag (in the image) is not helping, and I am not able to use repeat in that file.

[Screenshot from 2024-03-15: error when using repeat in eltwise_binary_op.hpp]

@kpaigwar Do you want me to go with a separate op like batch_mul or something? Any suggestions?
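For context, the same batch-dim broadcast applies uniformly to any element-wise binary op, which is the argument for fixing it in the common path; a sketch in PyTorch reference semantics (squared_diff spelled out manually, since that name is tt-metal-specific):

```python
import torch

a = torch.randn(32, 1, 32, 1024)
b = torch.randn(1, 1, 32, 1024)

# Each of these broadcasts b's size-1 batch dim identically, so a fix in the
# shared binary-op code would cover all of them at once.
added = a + b
subtracted = a - b
squared_diff = (a - b) ** 2   # equivalent of tt-metal's squared_diff
```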

@umadevimcw
Contributor

Tried the implementation in composite as a separate batch-mul op: #6442

@tt-aho
Contributor

tt-aho commented Mar 15, 2024

If you are targeting performance, are you planning to run this model/op sharded, or exactly as specified in this issue? The optimizations/support for sharding are different than for interleaved, so I want to make sure we're targeting/optimizing the right thing. Similar question for #6361.

@kpaigwar
Contributor Author

kpaigwar commented Mar 15, 2024

Since this issue is also related to repeat (#6361), I will drop the priority of this issue and address the P0 issue #6361 first.

@umadevimcw
Contributor

@kpaigwar Please find the updated PR #6587 for batch-mul support. It doesn't involve a new op. I also changed the code in a common place, so it works for the other binary ops as well.

@umadevimcw
Contributor

Support merged to main.
