Update transformer_tutorial.py #2363
Conversation
fix to "perhaps there is a misprint at line 40 pytorch#2111"; review of referenced paper https://arxiv.org/pdf/1706.03762.pdf section 3.2.3 suggests: "Similarly, self-attention layers in the decoder allow each position in the decoder to attend to all positions in the decoder up to and including that position. We need to prevent leftward information flow in the decoder to preserve the auto-regressive property. We implement this inside of scaled dot-product attention by masking out (setting to −∞) all values in the input of the softmax which correspond to illegal connections. See Figure 2." Thus the suggested change in reference from nn.Transform.Encoder to nn.Transform.Decoder seems reasonable.
Hi @frasertajima! Thank you for your pull request and welcome to our community.

Action Required
In order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you.

Process
In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (e.g. your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with CLA Signed. If you have received this in error or have any questions, please contact us at cla@meta.com. Thanks!
Please sign the CLA so we can review your PR.
Signed the CLA a few hours ago. I will try again?
Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Meta Open Source project. Thanks!
LGTM, thanks for the fix!
Fix for #2111 ("perhaps there is a misprint at line 40"); the original issue did not link the file, so I searched for it online.
Review of the referenced paper (https://arxiv.org/pdf/1706.03762.pdf, section 3.2.3) suggests (emphasis added):
"Similarly, self-attention layers in the decoder allow each position in the decoder to attend to all positions in the decoder up to and including that position. We need to prevent leftward information flow in the decoder to preserve the auto-regressive property. We implement this inside of scaled dot-product attention by masking out (setting to −∞) all values in the input of the softmax which correspond to illegal connections. See Figure 2."
Thus the suggested change in the reference from nn.TransformerEncoder to nn.TransformerDecoder seems reasonable. A small illustration of how this masking behaves inside the attention computation is sketched below.
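To make the quoted passage concrete, here is a minimal, self-contained sketch (again, not code from the tutorial) of how such a mask acts inside scaled dot-product attention: the −∞ entries become zero weights after the softmax, so position i attends only to positions 0..i and no information flows in from future positions.

```python
import math
import torch

torch.manual_seed(0)
seq_len, d_k = 4, 8
q = torch.randn(seq_len, d_k)  # queries
k = torch.randn(seq_len, d_k)  # keys

# Raw scaled dot-product attention scores.
scores = q @ k.T / math.sqrt(d_k)

# Causal mask: -inf above the diagonal marks the "illegal" (future) connections.
mask = torch.triu(torch.full((seq_len, seq_len), float('-inf')), diagonal=1)

# After the softmax, masked positions receive exactly zero attention weight.
weights = torch.softmax(scores + mask, dim=-1)
print(weights)
# Row i has non-zero weights only in columns 0..i, so each position cannot
# attend to positions that come after it.
```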
Fixes #2111
Description
Checklist
cc @svekars @carljparker @pytorch/team-text-core @Nayef211