MultiHeadAttention Layer #1062

Merged: 25 commits into tensorflow:master, Mar 10, 2020

Conversation

@cgarciae (Contributor) commented Feb 10, 2020:

Implementation of MultiHeadAttention as presented in "Attention Is All You Need" and discussed in #951. Uses tf.einsum to generalize dot-product attention to multiple heads.

Missing:

  • Documentation
  • Tests
  • Parameters for weight initializers, regularizers, constraints, etc.
  • config method

References: "Attention Is All You Need" (Vaswani et al., 2017) and issue #951.
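(Editorial aside, not the code in this PR: a minimal sketch of how tf.einsum generalizes dot-product attention to multiple heads. Shapes, names, and the ad-hoc weight creation below are illustrative assumptions, not the layer's actual API.)

    import tensorflow as tf

    def multi_head_attention(query, key, value, num_heads, head_dim):
        # query: [batch, n, d_model]; key/value: [batch, m, d_model]
        d_model = query.shape[-1]
        init = tf.keras.initializers.GlorotUniform()
        wq = tf.Variable(init([d_model, num_heads, head_dim]))
        wk = tf.Variable(init([d_model, num_heads, head_dim]))
        wv = tf.Variable(init([d_model, num_heads, head_dim]))
        wo = tf.Variable(init([num_heads, head_dim, d_model]))

        # Project inputs into per-head subspaces: [batch, len, heads, head_dim]
        q = tf.einsum("bnd,dhk->bnhk", query, wq)
        k = tf.einsum("bmd,dhk->bmhk", key, wk)
        v = tf.einsum("bmd,dhk->bmhk", value, wv)

        # Scaled dot-product logits for every head at once: [batch, heads, n, m]
        logits = tf.einsum("bnhk,bmhk->bhnm", q, k) / tf.sqrt(float(head_dim))
        weights = tf.nn.softmax(logits, axis=-1)

        # Weighted sum of values, then project back to d_model: [batch, n, d_model]
        context = tf.einsum("bhnm,bmhk->bnhk", weights, v)
        return tf.einsum("bnhk,hkd->bnd", context, wo)

    # Toy self-attention call: output has shape [2, 5, 16]
    x = tf.random.normal([2, 5, 16])
    out = multi_head_attention(x, x, x, num_heads=4, head_dim=8)

Each einsum handles all heads in a single call, replacing the usual reshape/transpose steps of single-head attention, which is what "generalize dot-product attention to multiple heads" refers to in the description above.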

@guillaumekln (Contributor) left a comment:

Thanks for the PR! I have some interest in this, so here are some comments/questions:

(Review comments on tensorflow_addons/layers/multihead_attention.py: outdated, resolved)
@cgarciae (Contributor, Author) commented:

@guillaumekln thanks for the comments! I've updated the code to address some of them.

@cgarciae (Contributor, Author) commented Feb 25, 2020:

@guillaumekln @AakashKumarNain @facaiy @seanpmorgan

Code should be ready for review :) Only 2 things are missing:

  • Finish docstring
  • Fix GitHub CI issues

@cgarciae (Contributor, Author) commented Feb 25, 2020:

Does anyone know what is wrong with flake8? I am not getting any errors from flake8 locally.

@cgarciae changed the title from "[WIP] MultiHeadAttention Layer" to "MultiHeadAttention Layer" on Feb 25, 2020
Review comment on tensorflow_addons/layers/multihead_attention.py (constructor signature excerpt):

    bias_initializer: typing.Union[str, typing.Callable] = "zeros",
    bias_regularizer: typing.Union[str, typing.Callable] = None,
    bias_constraint: typing.Union[str, typing.Callable] = None,
    **kwargs,
@ulf1 (Contributor) commented:

My guess is that the comma after **kwargs will cause E999 SyntaxError: invalid syntax in the flake8 test. You can run flake8 tensorflow_addons/layers/multihead_attention.py directly to check it out

@cgarciae (Contributor, Author) commented:

@ulf1 I see, thanks! black is automatically adding that comma :( I'll disable "format on save" to remove it.
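(Editorial note for readers hitting the same failure; the class and defaults below are illustrative, built around the excerpt above rather than quoted from the PR's file. Black's "magic trailing comma" appends a comma after the last argument when it splits a signature across lines, including after **kwargs. A trailing comma after **kwargs in a def is only accepted from Python 3.6 onward, so a flake8 run under an older interpreter fails while parsing the file and reports E999 instead of an ordinary style warning.)

    class Example:
        def __init__(
            self,
            bias_initializer="zeros",
            bias_regularizer=None,
            bias_constraint=None,
            **kwargs,  # Black adds this comma; pre-3.6 parsers reject it, hence flake8 E999
        ):
            self.bias_initializer = bias_initializer
            self.bias_regularizer = bias_regularizer
            self.bias_constraint = bias_constraint
            self.extra_kwargs = kwargs

Dropping the comma after **kwargs, as done here, keeps the file parseable on every interpreter flake8 might run under.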

@gabrieldemarmiesse (Member) commented:

I've merged master into your branch to update it and fixed any formatting/conflicts it might have. If you need to do some more modifications, please do git pull beforehand.

@cgarciae (Contributor, Author) commented:

The pre-commit.sh script modified a bunch of files unrelated to this PR, possibly the ones added during the merge by @gabrieldemarmiesse.

@gabrieldemarmiesse (Member) commented:

I'll look into it and push the fix to your branch. Thanks for the heads up :)

@cgarciae requested review from qlzh727 and a team as code owners, February 26, 2020 19:51
@cgarciae (Contributor, Author) commented:

@Squadrick I added a small commit; I don't know if that kicked off the Kokoro checks.

@qlzh727 (Member) left a comment:

Thanks for the change; more comments about the unit test.

(Review comments on tensorflow_addons/layers/multihead_attention.py and tensorflow_addons/layers/multihead_attention_test.py: outdated, resolved)
@cgarciae (Contributor, Author) commented Mar 3, 2020:

@qlzh727 the changes you requested were made.

@seanpmorgan (Member) left a comment:

Almost LGTM. Also, will you be willing to maintain this going forward? If so, please add it to the CODEOWNERS file.

(Review comment on tensorflow_addons/layers/__init__.py: outdated, resolved)
@boring-cyborg (bot) added the github label Mar 9, 2020
@cgarciae (Contributor, Author) commented Mar 9, 2020:

@seanpmorgan Yeah, happy to maintain it. Added entry to CODEOWNERS.
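(Editorial note: a CODEOWNERS entry is just a path pattern followed by the owning GitHub handle(s). The paths below are assumptions based on the files touched in this PR, not a quote from the repository's actual CODEOWNERS file.)

    /tensorflow_addons/layers/multihead_attention.py @cgarciae
    /tensorflow_addons/layers/multihead_attention_test.py @cgarciae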

@seanpmorgan (Member) commented:

@cgarciae Sorry for the conflicts. Could you resolve them? Then LGTM.

@cgarciae (Contributor, Author) commented Mar 9, 2020:

@seanpmorgan no problem. Conflicts solved!

@seanpmorgan (Member) left a comment:

LGTM, thanks for this great contribution! Will leave the PR open for another day in case any of the other reviewers have issues.

@seanpmorgan (Member) commented:

@qlzh727 please let us know if the changes you requested are sufficient. I believe they were addressed.

@qlzh727 (Member) left a comment:

LGTM.

@seanpmorgan merged commit 3b0d978 into tensorflow:master Mar 10, 2020
jrruijli pushed a commit to jrruijli/addons that referenced this pull request Dec 23, 2020
* Add MultiHeadAttention Layer