[feat] Add DeepNorm/DeepNet residual path #227

Closed
blefaudeux opened this issue Mar 7, 2022 · 1 comment · Fixed by #230
Labels: enhancement, ongoing

Comments

@blefaudeux
Contributor

🚀 Feature

See https://arxiv.org/abs/2203.00555v1 (DeepNet): it combines a dedicated initialization with a modified residual path.
The residual path is already modular in xformers, so it should be possible to add this in a very clean way.
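
For reference, here is a minimal sketch of the two constants the paper introduces for a single encoder-only or decoder-only stack of N layers. `alpha` scales the residual branch, and `beta` is the gain applied when initializing the feedforward, value and output projection weights. The helper name `deepnorm_coefficients` is hypothetical and not an xformers API. The paper gives different constants for encoder-decoder models, which this sketch leaves out.

```python
from typing import Tuple


def deepnorm_coefficients(num_layers: int) -> Tuple[float, float]:
    """Return (alpha, beta) for an encoder-only or decoder-only stack.

    Hypothetical helper for illustration, following the single-stack
    formulas from the DeepNet paper:
        alpha = (2N)^(1/4)    residual scaling
        beta  = (8N)^(-1/4)   initialization gain
    """
    if num_layers < 1:
        raise ValueError("num_layers must be a positive integer")
    alpha = (2 * num_layers) ** 0.25
    beta = (8 * num_layers) ** -0.25
    return alpha, beta


if __name__ == "__main__":
    for n in (6, 12, 100, 1000):
        alpha, beta = deepnorm_coefficients(n)
        print(f"N={n:5d}  alpha={alpha:.3f}  beta={beta:.3f}")
```

Deeper stacks get a larger `alpha` (the skip connection dominates more) and a smaller `beta` (the sublayers start closer to zero), which is what the "init + residual path" combination refers to.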

Motivation

Seems better all around, worth testing it out and exposing the option

Pitch

Add another residual path definition alongside the existing preLN/postLN options
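
To make the difference between the three wirings concrete, here is a rough, dependency-free sketch on plain Python lists. It is not how xformers implements its residual wrappers: the function names are made up for illustration, and the layer norm has no learned scale or bias.

```python
import math
from typing import Callable, List

Vector = List[float]
Sublayer = Callable[[Vector], Vector]


def layer_norm(x: Vector, eps: float = 1e-5) -> Vector:
    """Normalize to zero mean and unit variance (no learned affine)."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    inv_std = 1.0 / math.sqrt(var + eps)
    return [(v - mean) * inv_std for v in x]


def pre_ln(x: Vector, sublayer: Sublayer) -> Vector:
    """preLN: x + f(LN(x)). The skip connection is left untouched."""
    y = sublayer(layer_norm(x))
    return [a + b for a, b in zip(x, y)]


def post_ln(x: Vector, sublayer: Sublayer) -> Vector:
    """postLN: LN(x + f(x)). Normalization sits on the main path."""
    y = sublayer(x)
    return layer_norm([a + b for a, b in zip(x, y)])


def deepnorm(x: Vector, sublayer: Sublayer, alpha: float) -> Vector:
    """DeepNorm: LN(alpha * x + f(x)). postLN with an up-weighted skip."""
    y = sublayer(x)
    return layer_norm([alpha * a + b for a, b in zip(x, y)])


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0, 4.0]
    double = lambda v: [2.0 * t for t in v]  # stand-in for attention / MLP
    print(pre_ln(x, double))
    print(post_ln(x, double))
    print(deepnorm(x, double, alpha=2.0))
```

With `alpha = 1` the DeepNorm wiring reduces to postLN, so in this sketch the new option only needs one extra scalar on top of the existing postLN path.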

Alternatives

Not doing it

Additional context

Training stability issues are real, see

@blefaudeux blefaudeux self-assigned this Mar 7, 2022
@blefaudeux
Contributor Author

cc @dianaml0 @fmassa

@blefaudeux blefaudeux added the enhancement and ongoing labels Mar 7, 2022
@blefaudeux blefaudeux mentioned this issue Mar 7, 2022
10 tasks
@blefaudeux blefaudeux linked a pull request Mar 8, 2022 that will close this issue
8 tasks
@blefaudeux blefaudeux changed the title from "[feat] Add DeepNorm residual path" to "[feat] Add DeepNorm/DeepNet residual path" Mar 9, 2022
blefaudeux added a commit that referenced this issue Mar 14, 2022
* Should be good to go but needs testing
* adding a unit test + assert
* removing default values which could become a footgun