[feat] Compositional Attention #41

blefaudeux · 2021-10-26T04:55:50Z

🚀 Feature

Intriguing paper: it keeps softmax(QKᵀ) and V untangled, so that each retrieval (the ·V_i of vanilla attention) can look at all the searches, i.e. it can be evaluated against every softmax(QKᵀ)_j on a per-head basis. "Heads" then become how many searches and how many retrievals you support, and the two counts can differ.
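
To make the search/retrieval split concrete, here is a minimal NumPy sketch. It is not the paper's reference implementation nor the xformers one: the function name `compositional_attention`, the weight names, and the shapes are all hypothetical. It also assumes that each search picks among the retrievals through a second, soft attention over the candidates, which is one plausible reading of the idea rather than something stated in this issue.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def compositional_attention(x, wq, wk, wv, wq_r, wk_r):
    """Illustrative only; all names and shapes are assumptions.

    x:    (T, D)      token embeddings
    wq:   (S, D, d)   query projections, one per search
    wk:   (S, D, d)   key projections, one per search
    wv:   (R, D, d)   value projections, one per retrieval
    wq_r: (S, D, f)   selection-query projections, one per search
    wk_r: (d, f)      selection-key projection, shared
    """
    d = wq.shape[-1]
    q = np.einsum("td,sde->ste", x, wq)  # (S, T, d)
    k = np.einsum("td,sde->ste", x, wk)  # (S, T, d)
    v = np.einsum("td,rde->rte", x, wv)  # (R, T, d)

    # Searches: one softmax(QK^T) map per search, independent of any V.
    attn = softmax(np.einsum("ste,sue->stu", q, k) / np.sqrt(d))  # (S, T, T)

    # Every retrieval is evaluated against every search.
    cand = np.einsum("stu,rue->srte", attn, v)  # (S, R, T, d)

    # Assumed selection step: each search softly picks among the R candidates.
    q_r = np.einsum("td,sde->ste", x, wq_r)  # (S, T, f)
    k_r = np.einsum("srte,ef->srtf", cand, wk_r)  # (S, R, T, f)
    scores = np.einsum("stf,srtf->str", q_r, k_r) / np.sqrt(wk_r.shape[-1])
    sel = softmax(scores, axis=-1)  # (S, T, R)

    out = np.einsum("str,srte->tse", sel, cand)  # (T, S, d)
    return out, attn, sel


rng = np.random.default_rng(0)
T, D, S, R, d_head, d_sel = 5, 16, 4, 2, 8, 8  # S searches, R retrievals
x = rng.normal(size=(T, D))
wq = rng.normal(size=(S, D, d_head))
wk = rng.normal(size=(S, D, d_head))
wv = rng.normal(size=(R, D, d_head))
wq_r = rng.normal(size=(S, D, d_sel))
wk_r = rng.normal(size=(d_head, d_sel))

out, attn, sel = compositional_attention(x, wq, wk, wv, wq_r, wk_r)
print(out.shape)
```

With a single retrieval (R = 1) the selection weights are all 1, so each search just applies its attention map to the one shared value projection. The number of searches S and retrievals R are set independently, which is the "possibly different" part above.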

Motivation

Interesting take for some tasks. It does not seem life-changing for classical MLM, but it seems very relevant to reasoning or vision-related tasks.

Pitch

Implement this and see how it goes in something like DINO?

Alternatives

Not doing it

Additional context

Paper
Reference implementation

blefaudeux · 2022-01-26T16:49:34Z

done

blefaudeux added the enhancement New feature or request label Oct 26, 2021

blefaudeux self-assigned this Oct 26, 2021

blefaudeux added the brainstorm dropping an idea, may or may not be implemented in the end. RFC label Oct 28, 2021

blefaudeux mentioned this issue Jan 10, 2022

[feat] Compositional attention #178

Merged

14 tasks

blefaudeux closed this as completed Jan 26, 2022

xwhan pushed a commit to xwhan/xformers that referenced this issue Feb 8, 2022

[bugfix] Random attention / constant mask over batch (facebookresearc…

4106b4f

…h#41)