Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[feat] Adding a conv MLP, following VAN #321

Merged
merged 5 commits into from
Jun 6, 2022
Merged

[feat] Adding a conv MLP, following VAN #321

merged 5 commits into from
Jun 6, 2022

Conversation

blefaudeux
Copy link
Contributor

@blefaudeux blefaudeux commented Jun 3, 2022

What does this PR do?

One step towards #319, adding the MLP/Conv2d hybrid proposed by the VAN paper. Interestingly, testing this with a "Metaformer" (in true xformers fashion you can mix and match) on a tiny example does bring a measurable benefit.

Small (6M) Metaformer on Cifar10 Orange is the default (scaled dot product attention, not poolformer) + MLP White is the same but with the ConvMLP that this PR introduces

Screenshot from 2022-06-02 22-07-12

Before submitting

  • Did you have fun?
    • Make sure you had fun coding 🙃
  • Did you read the contributor guideline?
  • Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
    • N/A
  • Did you make sure to update the docs?
    • N/A
  • Did you write any new necessary tests?
    • N/A
  • Did you update the changelog? (if needed)
    • N/A

PR review

Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Jun 3, 2022


@register_feedforward("ConvMLP", ConvMlpConfig)
class ConvMLP(Feedforward):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cc @fmassa, it's an interesting take I think

@blefaudeux blefaudeux marked this pull request as draft June 3, 2022 15:25
@blefaudeux blefaudeux changed the title [DRAFT] Adding a conv MLP, following VAN [feat] Adding a conv MLP, following VAN Jun 3, 2022
@blefaudeux blefaudeux marked this pull request as ready for review June 3, 2022 16:14
@codecov-commenter
Copy link

codecov-commenter commented Jun 3, 2022

Codecov Report

Merging #321 (1b13a83) into main (5ccbcd9) will decrease coverage by 0.04%.
The diff coverage is 89.79%.

@@            Coverage Diff             @@
##             main     #321      +/-   ##
==========================================
- Coverage   93.75%   93.70%   -0.05%     
==========================================
  Files          68       69       +1     
  Lines        3840     3889      +49     
==========================================
+ Hits         3600     3644      +44     
- Misses        240      245       +5     
Flag Coverage Δ
Python 93.70% <89.79%> (-0.05%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
xformers/factory/weight_init.py 90.82% <ø> (ø)
xformers/helpers/hierarchical_configs.py 100.00% <ø> (ø)
xformers/components/feedforward/conv_mlp.py 89.36% <89.36%> (ø)
xformers/components/feedforward/base.py 100.00% <100.00%> (ø)
xformers/factory/block_factory.py 97.03% <100.00%> (+0.02%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 5ccbcd9...1b13a83. Read the comment docs.

)

# This feedforward requires a context length which is squared, often due to 2D pooling
self.requires_squared_context = True
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this does 2D convolutions, meaning that the layer needs to be able to go from [Batch x Context x Embedding] to [Batch x H x W x Embedding]. A solution which is not too intrusive is to force the use of sequences being squared numbers, meaning essentially that we only work with square pictures. It's pretty common in vision codebases, I think that another solution would be to keep track of the original H and W prior to flattening this dimension.

@blefaudeux
Copy link
Contributor Author

Codecov Report

Merging #321 (1b13a83) into main (5ccbcd9) will decrease coverage by 0.04%.
The diff coverage is 89.79%.

@@            Coverage Diff             @@
##             main     #321      +/-   ##
==========================================
- Coverage   93.75%   93.70%   -0.05%     
==========================================
  Files          68       69       +1     
  Lines        3840     3889      +49     
==========================================
+ Hits         3600     3644      +44     
- Misses        240      245       +5     

Flag Coverage Δ
Python 93.70% <89.79%> (-0.05%) arrow_down

Flags with carried forward coverage won't be shown. Click here to find out more.
Impacted Files Coverage Δ
xformers/factory/weight_init.py 90.82% <ø> (ø)
xformers/helpers/hierarchical_configs.py 100.00% <ø> (ø)
xformers/components/feedforward/conv_mlp.py 89.36% <89.36%> (ø)
xformers/components/feedforward/base.py 100.00% <100.00%> (ø)
xformers/factory/block_factory.py 97.03% <100.00%> (+0.02%) arrow_up

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 5ccbcd9...1b13a83. Read the comment docs.

should be fixed with the last update

@blefaudeux
Copy link
Contributor Author

Screenshot from 2022-06-03 15-55-17

Copy link
Contributor

@dianaml0 dianaml0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Nice!

@blefaudeux blefaudeux merged commit b41a3f3 into main Jun 6, 2022
@blefaudeux blefaudeux deleted the conv_mlp branch June 6, 2022 15:54
@blefaudeux blefaudeux mentioned this pull request Jun 8, 2022
6 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants