make RoBERTa usable in more tasks including QA #1017

borguz · 2019-10-01T21:45:05Z

Summary:
Currently the RoBERTa encoder, model, and tensorizer are fairly stand-alone and do not conform to other PyText tasks. This diff is an attempt to integrate them better.

It involves the following:

- Make GPT-2 BPE act like a proper tokenizer and also return char indices. This makes the RoBERTa tensorizer more modular so code can be reused.
- Make the RoBERTa tensorizer conform more closely to BERTTensorizer so that the TransformerSentenceEncoder interfaces are better aligned.
- Add a RoBERTa tensorizer for question answering.
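
For context on why char indices matter for QA, here is a minimal sketch. It is not the code in this diff: the `Token` type, `tokenize_with_offsets`, and `char_span_to_token_span` are hypothetical names, and a whitespace split stands in for GPT-2 BPE. It only illustrates the idea that once a tokenizer reports character offsets, a character-level answer span can be mapped to token positions.

```python
import re
from typing import List, NamedTuple, Optional, Tuple


class Token(NamedTuple):
    # Hypothetical token record, not a PyText type.
    value: str
    start: int  # index of the token's first character in the original text
    end: int    # index one past the token's last character


def tokenize_with_offsets(text: str) -> List[Token]:
    """Toy stand-in for a BPE tokenizer: split on whitespace, keep char offsets."""
    return [Token(m.group(), m.start(), m.end()) for m in re.finditer(r"\S+", text)]


def char_span_to_token_span(
    tokens: List[Token], answer_start: int, answer_end: int
) -> Optional[Tuple[int, int]]:
    """Map the char span [answer_start, answer_end) to inclusive token indices."""
    overlapping = [
        i
        for i, tok in enumerate(tokens)
        if tok.start < answer_end and tok.end > answer_start
    ]
    if not overlapping:
        return None
    return overlapping[0], overlapping[-1]


context = "The quick brown fox jumps over the lazy dog"
answer = "brown fox"
tokens = tokenize_with_offsets(context)
answer_start = context.index(answer)
span = char_span_to_token_span(tokens, answer_start, answer_start + len(answer))
print(span)  # (2, 3)
```

With a real subword tokenizer one word may split into several tokens, but the overlap test stays the same as long as every token carries its start and end offsets.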

Differential Revision: D17690805

facebook-github-bot · 2019-10-02T20:11:31Z

This pull request was exported from Phabricator. Differential Revision: D17690805

facebook-github-bot · 2019-10-03T21:01:44Z

This pull request was exported from Phabricator. Differential Revision: D17690805

facebook-github-bot · 2019-10-04T02:24:08Z

This pull request has been merged in 3fd9764.

facebook-github-bot added the CLA Signed label Oct 1, 2019

borguz force-pushed the export-D17690805 branch from 0dbc24a to 2ee75f9 on October 2, 2019 20:11

borguz force-pushed the export-D17690805 branch from 2ee75f9 to 607e381 on October 3, 2019 21:01

facebook-github-bot closed this in 3fd9764 Oct 4, 2019

facebook-github-bot added the Merged label Oct 4, 2019
