MultiHeadAttention Layer #1062
Conversation
Thanks for the PR! I have some interest in this, so here are some comments/questions:
@guillaumekln thanks for the comments! I've updated the code to address some of them.
@guillaumekln @AakashKumarNain @facaiy @seanpmorgan Code should be ready for review :) Only 2 things are missing:
Does anyone know what is wrong with
bias_initializer: typing.Union[str, typing.Callable] = "zeros",
bias_regularizer: typing.Union[str, typing.Callable] = None,
bias_constraint: typing.Union[str, typing.Callable] = None,
**kwargs,
My guess is that the comma after `**kwargs` will cause `E999 SyntaxError: invalid syntax` in the flake8 test. You can run `flake8 tensorflow_addons/layers/multihead_attention.py` directly to check it out.
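For context, here is a minimal sketch of the kind of signature being discussed. The Python-version note in the comments is my reading of the guess above (a trailing comma after `**kwargs` in a `def` only became valid syntax in Python 3.6), and the `head_size`/`num_heads` parameters are illustrative additions rather than the PR's exact names:

```python
import typing


class MultiHeadAttention:  # stand-in class; the real layer subclasses tf.keras.layers.Layer
    def __init__(
        self,
        head_size: int,   # illustrative parameter
        num_heads: int,   # illustrative parameter
        bias_initializer: typing.Union[str, typing.Callable] = "zeros",
        bias_regularizer: typing.Union[str, typing.Callable] = None,
        bias_constraint: typing.Union[str, typing.Callable] = None,
        **kwargs,  # this trailing comma is a SyntaxError (flake8 E999) on interpreters older than Python 3.6
    ):
        # Placeholder body: just keep the hyperparameters around.
        self.head_size = head_size
        self.num_heads = num_heads
        self.bias_initializer = bias_initializer
        self.bias_regularizer = bias_regularizer
        self.bias_constraint = bias_constraint
```

If the CI interpreter is older than 3.6, dropping that one comma is enough to clear the E999.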
@ulf1 I see, thanks!
I've merged master into your branch to bring it up to date and fixed the formatting/conflicts it had. If you need to make some more modifications, please do.
I'll look into it and push the fix to your branch. Thanks for the heads up :)
@Squadrick I added a small commit; I don't know if that ticked off the
Thanks for the change; more comments about the unit test.
@qlzh727 the changes you requested were made.
Almost LGTM. Also, will you be willing to maintain this going forward? If so, please add it to the CODEOWNERS file.
@seanpmorgan Yeah, happy to maintain it. Added an entry to CODEOWNERS.
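For readers unfamiliar with the mechanics, a CODEOWNERS entry is just a path pattern followed by the maintainer's GitHub handle. The line below is an illustrative reconstruction from the file path and handle mentioned in this thread, not the exact committed entry:

```
# .github/CODEOWNERS (illustrative)
/tensorflow_addons/layers/multihead_attention.py @cgarciae
```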
@cgarciae Sorry for the conflicts. Could you resolve and then LGTM?
@seanpmorgan no problem. Conflicts solved!
LGTM, thanks for this great contribution! I'll leave the PR open for another day in case any of the other reviewers have issues.
@qlzh727 please let us know if the changes you requested are sufficient. I believe they were addressed.
LGTM.
* Add MultiHeadAttention Layer

Implementation of `MultiHeadAttention` as presented in Attention Is All You Need and discussed in #951. Uses `tf.einsum` to generalize dot-product attention to multiple heads.

Missing:

References:
* Vaswani et al., "Attention Is All You Need" (2017): https://arxiv.org/abs/1706.03762
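To make the einsum remark above concrete, here is a minimal self-contained sketch of multi-head scaled dot-product attention written with `tf.einsum`. It illustrates the technique the description names; the function, weight shapes, and names like `num_heads`/`head_size` are assumptions for this example, not the layer's actual API:

```python
import tensorflow as tf


def multi_head_attention(query, key, value, wq, wk, wv, wo):
    """Illustrative multi-head scaled dot-product attention via tf.einsum.

    Assumed shapes (not the layer's actual API):
      query:        [batch, n, model_dim]
      key, value:   [batch, m, model_dim]
      wq, wk, wv:   [model_dim, num_heads, head_size]
      wo:           [num_heads, head_size, model_dim]
    """
    # Project inputs into per-head queries/keys/values, one contraction each.
    q = tf.einsum("bnd,dhk->bnhk", query, wq)
    k = tf.einsum("bmd,dhk->bmhk", key, wk)
    v = tf.einsum("bmd,dhk->bmhk", value, wv)

    # Scaled dot-product scores for all heads at once: [batch, heads, n, m].
    head_size = tf.cast(tf.shape(q)[-1], q.dtype)
    scores = tf.einsum("bnhk,bmhk->bhnm", q, k) / tf.sqrt(head_size)
    weights = tf.nn.softmax(scores, axis=-1)

    # Attention-weighted sum of values per head: [batch, n, heads, head_size].
    context = tf.einsum("bhnm,bmhk->bnhk", weights, v)

    # Concatenate heads and project back to model_dim: [batch, n, model_dim].
    return tf.einsum("bnhk,hkd->bnd", context, wo)


# Self-attention usage with made-up shapes: batch=2, seq=5, model_dim=16, 4 heads of size 4.
x = tf.random.normal([2, 5, 16])
wq = tf.random.normal([16, 4, 4])
wk = tf.random.normal([16, 4, 4])
wv = tf.random.normal([16, 4, 4])
wo = tf.random.normal([4, 4, 16])
out = multi_head_attention(x, x, x, wq, wk, wv, wo)  # shape [2, 5, 16]
```

The appeal of `tf.einsum` here is that the projections and the per-head dot products are each a single batched contraction, so splitting and re-merging the heads needs no explicit reshape/transpose bookkeeping.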