Enable dropout = 0.0 as an equivalent to none in BPE #1550
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
LGTM! Thanks for adding tests in python and rust.
Rebasing on main should fix the clippy issues.
Thanks. I fixed one of the lint errors, which was a range readability thing.
Fixed one more formatting issue. Now I think it should be all good!
Thanks! Let's remove the unrelated change (maybe from rebasing?)
.github/workflows/trufflehog.yml
Actually this should not be in the diff
I goofed, and now it's getting worse 😬
Ok, it's all good now, unless you want me to squash the commits.
No, don't worry, this LGTM! Thanks for updating.
* enable dropout = 0.0
* typo
* lint
* formatter
* enable dropout = 0.0
* formatter
This is related to the discussion in #1541.
This PR allows `0.0` to be used as the dropout value in BPE models, with functionality equivalent to `none`. Previously, the docs and implementation were inconsistent in how they treated `none` and the valid range `dropout \in (0.0, 1.0]`, so `BPE(dropout = 0.0)` was rejected.

This simply allows `0.0` to be an acceptable value during initialization and enables caching when tokenizing if `dropout == 0.0`.

E.g., now the following works whereas before it errored.
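The behaviour described above can be sketched in plain Python. This is only an illustration of the intended semantics, not the library's implementation: the helper names `validate_dropout`, `dropout_is_active`, and `can_use_cache` are hypothetical, and the assumption that caching is skipped whenever dropout is active follows from dropout making tokenization non-deterministic.

```python
from typing import Optional


def validate_dropout(dropout: Optional[float]) -> Optional[float]:
    """Hypothetical check: accept None or any value in the closed range [0.0, 1.0]."""
    if dropout is None:
        return None
    if not 0.0 <= dropout <= 1.0:
        raise ValueError("dropout must be None or a float in [0.0, 1.0]")
    return dropout


def dropout_is_active(dropout: Optional[float]) -> bool:
    """None and 0.0 are treated the same way: no merges are ever dropped."""
    return dropout is not None and dropout > 0.0


def can_use_cache(dropout: Optional[float]) -> bool:
    """Assumption: cached results are only reusable when tokenization is deterministic."""
    return not dropout_is_active(dropout)
```

Under this sketch, `validate_dropout(0.0)` is accepted, and `can_use_cache(0.0)` agrees with `can_use_cache(None)`, which mirrors the equivalence this PR aims for when calling `BPE(dropout = 0.0)`.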
As future work, I think that dropout should be made non-optional, with the default being 0.0. This would remove the checks for `dropout.is_none()`, etc., but keep the functionality the same. However, I guess this would be a breaking change (since then all tokenizers serialized before this change would be invalid?).