Adding inputs_embeds argument and switch to paddle.nn.TransformerEncoder for Electra models #3401
Conversation
LGTM
Your PR looks great. Let's discuss two small suggestions; waiting for your comments.
if input_ids is not None:
    input_embeddings = self.word_embeddings(input_ids)
else:
    input_embeddings = inputs_embeds
I think this code block can be improved to:
if inputs_embeds is None:
    inputs_embeds = self.word_embeddings(input_ids)
and, in the `forward` method, rename `input_embeddings` to `inputs_embeds`. In this way, the code looks more concise. What do you think about it?
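For reference, a minimal sketch of the suggested pattern in context (a fragment only; it assumes the rest of the embeddings `forward` stays as it is):

```python
def forward(self, input_ids=None, token_type_ids=None, position_ids=None,
            inputs_embeds=None):
    # Reuse caller-provided embeddings when present; otherwise look the
    # token ids up in the word embedding table, then continue with the
    # existing position / token-type embedding sum.
    if inputs_embeds is None:
        inputs_embeds = self.word_embeddings(input_ids)
    ...
```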
Should the `input_embeddings = self.word_embeddings(input_ids)` in the following original code be removed?
if token_type_ids is None:
    token_type_ids = paddle.zeros_like(input_ids, dtype="int64")
input_embeddings = self.word_embeddings(input_ids)
position_embeddings = self.position_embeddings(position_ids)
token_type_embeddings = self.token_type_embeddings(token_type_ids)
embeddings = input_embeddings + position_embeddings + token_type_embeddings
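For illustration, a sketch of the same block with the duplicated lookup dropped (assuming `input_embeddings` is computed once, earlier in `forward`):

```python
if token_type_ids is None:
    # Note: once input_ids is allowed to be None, this default would need to
    # be derived from the embedding shape instead of from input_ids.
    token_type_ids = paddle.zeros_like(input_ids, dtype="int64")
position_embeddings = self.position_embeddings(position_ids)
token_type_embeddings = self.token_type_embeddings(token_type_ids)
embeddings = input_embeddings + position_embeddings + token_type_embeddings
```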
Thanks for the review folks.
@wj-Mcat While I agree that renaming `input_embeddings` to `inputs_embeds` makes the code more concise, it also makes it less explicit/readable. Therefore I prefer the way it is now.
@guoshengCS good call. Removed the redundant line of code.
inputs_embeds = None
if self.use_inputs_embeds:
    inputs_embeds = floats_tensor(
        [self.batch_size, self.seq_length, self.embedding_size])
    # In order to use inputs_embeds, input_ids needs to be set to None
    input_ids = None
If `use_inputs_embeds` is set, it should not prepare the `input_ids` tensor in the `prepare_config_and_inputs` method.
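A minimal sketch of that suggestion (the `ids_tensor` helper and the `self.vocab_size` attribute are assumptions about the surrounding test mixin):

```python
input_ids = None
inputs_embeds = None
if self.use_inputs_embeds:
    # Build embeddings directly; input_ids stays None.
    inputs_embeds = floats_tensor(
        [self.batch_size, self.seq_length, self.embedding_size])
else:
    # Only prepare input_ids when embeddings are not used.
    input_ids = ids_tensor([self.batch_size, self.seq_length], self.vocab_size)
```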
Addressed
Addressed both comments. I think this PR should be ready for merging.
There is another to-do under PaddleNLP/paddlenlp/transformers/model_outputs.py, lines 147 to 152 in f43cfd0; you set the
Good catch! Regarding the to-do:
There are some things I want to tell you:
and there are some modules that are using it. I prefer that you do it in this PR. What do you think about it? @sijunhe @guoshengCS
In order to get this PR merged, you can make some changes in
I noticed that before #3411,
LGTM
PR types
Function optimization
PR changes
APIs
Description
Addressing part of #3382:
- Add an `inputs_embeds` argument to the Electra models so that callers can pass pre-computed embeddings directly into the embedding space instead of `input_ids`. This is particularly useful for use cases such as P-Tuning.
- Switch from `TransformerEncoderPro` to `paddle.nn.TransformerEncoder`.
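A hedged usage sketch of the new argument (not taken from the PR itself; the pretrained model name and shapes are illustrative assumptions):

```python
import paddle
from paddlenlp.transformers import ElectraModel, ElectraTokenizer

model = ElectraModel.from_pretrained("electra-small")
tokenizer = ElectraTokenizer.from_pretrained("electra-small")

encoded = tokenizer("PaddleNLP now accepts inputs_embeds.")
input_ids = paddle.to_tensor([encoded["input_ids"]])

# Look up the word embeddings manually; a P-Tuning style setup could mix
# trainable prompt vectors into this tensor before calling the model.
inputs_embeds = model.embeddings.word_embeddings(input_ids)

# Pass the embeddings directly instead of token ids.
sequence_output = model(input_ids=None, inputs_embeds=inputs_embeds)
```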