Optimize DoRA in eval and no dropout #2122
Merged: BenjaminBossan merged 17 commits into huggingface:main from ariG23498:aritra/optimize-dora on Oct 16, 2024 (+77 −45)

Commits
9593abf whether dropout (ariG23498)
ba60f0b chore: adding optimized code for dora (ariG23498)
1a01895 formating (ariG23498)
4b3af88 chore: adding bias (ariG23498)
e61fb97 chore: refactor code (ariG23498)
3506f39 chore: adding comments (ariG23498)
5b71f6e change formulation (ariG23498)
1f0888c changes
c8a2700 variable naming
09a1958 chore: review suggestions
24d53c1 chore: fix variable name (ariG23498)
4f76aa5 Merge branch 'main' into aritra/optimize-dora
aab5e19 chore: adding dora changed to bnb and hqq (ariG23498)
78136f7 chore: style (ariG23498)
109cd89 chore: add dora changes to tp and add docs (ariG23498)
9f39c51 adding benchmark results to docs (ariG23498)
0e68aba update docs (ariG23498)

@BenjaminBossan did you envision something like this?

My intuition was: `forward` or DoRA layers, where I would need to skip the alignment step and reuse `x` (the base model outputs).

Let me know if I am on the right track.

Note: I could not figure out a way to catch if the model was in `eval` mode. How would you have done it?

Yes, I think the dropout check is valid as is. Regarding eval mode, I think that checking `self.training` should work.

On how to proceed, my thinking was that if we find that we can make this optimization, we pass the base result as an additional argument to DoRA `forward` (the default for that argument being `None`), and there we use this base result if it's given; if not, we calculate it like we currently do. Could be that I'm missing something, but that's my idea.

The good news is that since we have a working implementation, we can then compare the results of both approaches, and they should be identical (of course not when there is dropout, but apart from that).
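
To make the idea above concrete, here is a minimal sketch in plain Python (no PyTorch), not the code from this PR. It assumes the DoRA term is added on top of the base output and that the weight norm is taken per output row. The names `linear`, `can_reuse_base_result`, and `dora_delta` are hypothetical and only serve the illustration. The optional `base_result` argument defaults to `None`; when it is given it is reused, otherwise the base projection is recomputed.

```python
import math


def linear(x, weight, bias=None):
    """Return W @ x (+ bias) for a single input vector x."""
    out = [sum(w * xi for w, xi in zip(row, x)) for row in weight]
    if bias is not None:
        out = [o + b for o, b in zip(out, bias)]
    return out


def can_reuse_base_result(training, dropout_p):
    """Hypothetical helper: the base output is reusable when dropout is a no-op."""
    return (not training) or dropout_p == 0.0


def dora_delta(x, weight, lora_A, lora_B, magnitude, scaling, base_result=None):
    """Extra DoRA term that is added on top of the base output W @ x.

    base_result, if given, must be W @ x without the bias (an assumption
    of this sketch).
    """
    rank, in_features = len(lora_A), len(weight[0])
    # Merged low-rank update B @ A, shape (out, in); needed for the weight norm.
    lora_weight = [
        [sum(b_row[k] * lora_A[k][j] for k in range(rank)) for j in range(in_features)]
        for b_row in lora_B
    ]
    # Per-output-row scale: magnitude / ||W + scaling * B @ A||.
    mag_norm_scale = [
        m / math.sqrt(sum((w + scaling * d) ** 2 for w, d in zip(w_row, d_row)))
        for m, w_row, d_row in zip(magnitude, weight, lora_weight)
    ]
    if base_result is None:
        # Fallback: recompute the base projection, as the unoptimized path does.
        base_result = linear(x, weight)
    lora_result = linear(linear(x, lora_A), lora_B)
    return [
        (s - 1) * b + s * scaling * l
        for s, b, l in zip(mag_norm_scale, base_result, lora_result)
    ]
```

A caller would pass `base_result` only when `can_reuse_base_result(...)` is true. With active dropout the DoRA branch sees a different input than the base layer, so the base output cannot be reused and the argument stays `None`. Calling `dora_delta` with and without `base_result` on the same input gives a direct way to check that both paths agree.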
Neat solution!