[feat] Adding a conv MLP, following VAN #321
Conversation
@register_feedforward("ConvMLP", ConvMlpConfig)
class ConvMLP(Feedforward):
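For context, a rough, self-contained sketch of what a VAN-style convolutional MLP can look like, written against plain PyTorch rather than the xformers Feedforward base class; the layer names, ordering and the hidden multiplier here are assumptions for illustration, not the exact code registered above:

```python
import math

import torch
import torch.nn as nn


class ConvMLPSketch(nn.Module):
    # Sketch only: pointwise conv ("linear"), depthwise 3x3 conv, activation,
    # pointwise conv back to the model dimension, with dropout in between.
    def __init__(self, dim_model: int, hidden_layer_multiplier: int = 4, dropout: float = 0.0):
        super().__init__()
        hidden = dim_model * hidden_layer_multiplier
        self.fc1 = nn.Conv2d(dim_model, hidden, kernel_size=1)  # pointwise "linear"
        self.dwconv = nn.Conv2d(hidden, hidden, kernel_size=3, padding=1, groups=hidden)  # depthwise 3x3
        self.act = nn.GELU()
        self.fc2 = nn.Conv2d(hidden, dim_model, kernel_size=1)
        self.drop = nn.Dropout(dropout)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: [Batch, Context, Embedding]; fold the context into a square H x W map
        B, N, C = x.shape
        side = int(math.sqrt(N))
        assert side * side == N, "context length must be a perfect square"
        x = x.transpose(1, 2).reshape(B, C, side, side)
        x = self.drop(self.act(self.dwconv(self.fc1(x))))
        x = self.drop(self.fc2(x))
        # back to [Batch, Context, Embedding]
        return x.reshape(B, C, N).transpose(1, 2)
```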
cc @fmassa, it's an interesting take I think
Codecov Report
@@            Coverage Diff             @@
##             main     #321      +/-   ##
==========================================
- Coverage   93.75%   93.70%    -0.05%
==========================================
  Files          68       69        +1
  Lines        3840     3889       +49
==========================================
+ Hits         3600     3644       +44
- Misses        240      245        +5
)

# This feedforward requires a context length which is squared, often due to 2D pooling
self.requires_squared_context = True
This does 2D convolutions, meaning that the layer needs to be able to go from [Batch x Context x Embedding] to [Batch x H x W x Embedding]. A solution which is not too intrusive is to force the sequence length to be a square number, which essentially means that we only work with square pictures; that's pretty common in vision codebases. I think another solution would be to keep track of the original H and W prior to flattening this dimension.
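For illustration, a minimal sketch of the reshape this implies; these are hypothetical helpers (not the PR's code), assuming a perfect-square context length and channels-first tensors as expected by Conv2d:

```python
import math

import torch


def seq_to_square(x: torch.Tensor) -> torch.Tensor:
    # Fold [Batch, Context, Embedding] into a channels-first [Batch, Embedding, H, W]
    # map so that 2D convolutions can be applied.
    B, N, C = x.shape
    side = int(math.sqrt(N))
    assert side * side == N, "requires_squared_context: the context length must be a perfect square"
    return x.transpose(1, 2).reshape(B, C, side, side)


def square_to_seq(x: torch.Tensor) -> torch.Tensor:
    # Inverse: back from [Batch, Embedding, H, W] to [Batch, Context, Embedding].
    B, C, H, W = x.shape
    return x.reshape(B, C, H * W).transpose(1, 2)
```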
Should be fixed with the last update.
LGTM! Nice!
What does this PR do?
One step towards #319, adding the MLP/Conv2d hybrid proposed by the VAN paper. Interestingly, testing this with a "Metaformer" (in true xformers fashion you can mix and match) on a tiny example does bring a measurable benefit.
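To make the "mix and match" point concrete, here is a toy PyTorch block in the Metaformer spirit, reusing the ConvMLPSketch from the sketch above; this is an illustrative sketch under assumed names, not the xformers model factory that the experiment actually used:

```python
import torch
import torch.nn as nn


class ToyMetaformerBlock(nn.Module):
    # Toy illustration only: a Metaformer-style block is "token mixer + feedforward",
    # and this PR lets the feedforward be a conv MLP instead of a plain MLP.
    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        self.mixer = nn.MultiheadAttention(dim, heads, batch_first=True)  # scaled dot product mixer
        self.norm2 = nn.LayerNorm(dim)
        self.feedforward = ConvMLPSketch(dim)  # swap in the conv MLP sketch

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        y = self.norm1(x)
        x = x + self.mixer(y, y, y, need_weights=False)[0]
        x = x + self.feedforward(self.norm2(x))
        return x
```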
Small (6M) Metaformer on Cifar10
Orange is the default (scaled dot product attention, not poolformer) + MLP. White is the same but with the ConvMLP that this PR introduces.
Before submitting
PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in GitHub issues, there's a high chance it will not be merged.