perhaps there is a misprint at line 40 #2111
Instead of "# self-attention layers in nn.TransformerEncoder are only allowed to attend", the comment should read "self-attention layers in nn.TransformerDecoder are only allowed to attend": Decoder rather than Encoder.

cc @svekars @carljparker
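The report does not link the file, but assuming the line in question is the word-language-model tutorial's comment about its use of nn.TransformerEncoder with a square causal mask, the pattern being described looks roughly like the sketch below (sizes and layer counts are illustrative, not the tutorial's exact code):

```python
import torch
import torch.nn as nn

d_model, nhead, seq_len, batch = 32, 4, 10, 2        # illustrative sizes only

layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=nhead)
encoder = nn.TransformerEncoder(layer, num_layers=2)

# Square "subsequent" mask: -inf above the diagonal, 0 on and below it,
# so position i may only attend to positions <= i (autoregressive behaviour).
mask = torch.triu(torch.full((seq_len, seq_len), float("-inf")), diagonal=1)

src = torch.randn(seq_len, batch, d_model)           # default layout: (seq, batch, d_model)
out = encoder(src, mask=mask)                        # encoder layers, masked to behave causally
print(out.shape)                                     # torch.Size([10, 2, 32])
```

In this pattern it is the mask, rather than the class name, that enforces the "only allowed to attend earlier positions" behaviour, which is the wording the issue questions.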
Comments
@weiguo-li Can you please add a link to the file here?
\assigntome
frasertajima added a commit to frasertajima/tutorials that referenced this issue on May 31, 2023:
fix to "perhaps there is a misprint at line 40 pytorch#2111"; review of referenced paper https://arxiv.org/pdf/1706.03762.pdf section 3.2.3 suggests: "Similarly, self-attention layers in the decoder allow each position in the decoder to attend to all positions in the decoder up to and including that position. We need to prevent leftward information flow in the decoder to preserve the auto-regressive property. We implement this inside of scaled dot-product attention by masking out (setting to −∞) all values in the input of the softmax which correspond to illegal connections. See Figure 2." Thus the suggested change in reference from nn.Transform.Encoder to nn.Transform.Decoder seems reasonable.
svekars pushed a commit that referenced this issue on May 31, 2023.
Fix to "perhaps there is a misprint at line 40 #2111"; review of referenced paper https://arxiv.org/pdf/1706.03762.pdf section 3.2.3 suggests: "Similarly, self-attention layers in the decoder allow each position in the decoder to attend to all positions in the decoder up to and including that position. We need to prevent leftward information flow in the decoder to preserve the auto-regressive property. We implement this inside of scaled dot-product attention by masking out (setting to −∞) all values in the input of the softmax which correspond to illegal connections. See Figure 2." Thus the suggested change in reference from nn.Transform.Encoder to nn.Transform.Decoder seems reasonable.
Closing as fixed by #2363.