
Commit

Fix bug in copy_unk (#964)
Summary:
Pull Request resolved: #964

When the copy_unk flag is set to true, any unk token produced in the output of the Seq2Seq model is replaced by the token from the utterance that was mapped to unk. This is an easy way to get gains, since outputs containing unk are always wrong.
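As a rough illustration of the idea (a minimal sketch with hypothetical names, not PyText's actual API): any unk in the decoder output is rewritten to the source-side token that fell outside the vocabulary.

    # Minimal sketch of the copy_unk idea (hypothetical names, not PyText's API).
    from typing import List

    UNK = "<unk>"

    def copy_unk(output_tokens: List[str], source_unk_token: str) -> List[str]:
        # source_unk_token is the utterance token that was out-of-vocabulary
        # and therefore encoded as UNK on the input side.
        return [source_unk_token if tok == UNK else tok for tok in output_tokens]

    # Example: the OOV word "flibble" became UNK in the input, so any UNK the
    # decoder emits is rewritten back to "flibble".
    assert copy_unk(["play", UNK, "please"], "flibble") == ["play", "flibble", "please"]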

Looking at the old code for copying the unk token, we see that TorchScript optimizes out the actual search for the unk token in the utterance:

{F207887831}

This diff updates the code to produce the correct TorchScript graph:

{F207888470}

Reviewed By: arbabu123

Differential Revision: D17213086

fbshipit-source-id: ebbfc52dcd703939316b15250110271336ef131d
ArmenAg authored and facebook-github-bot committed Sep 10, 2019
1 parent 9fc6aa4 commit b362c31
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion pytext/utils/torch.py
@@ -123,7 +123,7 @@ def lookup_words_1d(
 
     @torch.jit.script_method
     def lookup_word(self, idx: int, possible_unk_token: Optional[str] = None):
-        if idx < len(self.vocab):
+        if idx < len(self.vocab) and idx != self.unk_idx:
             return self.vocab[idx]
         else:
             return (
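The reason the extra `idx != self.unk_idx` check matters: with the old condition, the unk index is always within the vocabulary range, so the branch that returns possible_unk_token could never be taken for an unk output, and TorchScript could drop it from the graph. A standalone sketch of the two behaviors (plain functions, not the actual ScriptModule; the fallback return is an assumption based on the surrounding method):

    from typing import List, Optional

    def lookup_word_old(vocab: List[str], unk_idx: int, idx: int,
                        possible_unk_token: Optional[str] = None) -> str:
        # Old condition: unk_idx also satisfies idx < len(vocab),
        # so the copy path below is effectively dead code.
        if idx < len(vocab):
            return vocab[idx]
        return vocab[unk_idx] if possible_unk_token is None else possible_unk_token

    def lookup_word_new(vocab: List[str], unk_idx: int, idx: int,
                        possible_unk_token: Optional[str] = None) -> str:
        # New condition: the unk index now falls through to the copy path.
        if idx < len(vocab) and idx != unk_idx:
            return vocab[idx]
        return vocab[unk_idx] if possible_unk_token is None else possible_unk_token

    vocab, unk_idx = ["<unk>", "play", "music"], 0
    print(lookup_word_old(vocab, unk_idx, 0, "flibble"))  # "<unk>"  (bug)
    print(lookup_word_new(vocab, unk_idx, 0, "flibble"))  # "flibble" (copied from utterance)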
