-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use wrapped model for reference completions in WinRateCallback
and set default freq
to eval_steps
in LogCompletionsCallback`
#2074
Conversation
trl/trainer/callbacks.py
Outdated
@@ -253,7 +253,8 @@ def on_train_begin(self, args: TrainingArguments, state: TrainerState, control: | |||
tokenizer = kwargs["tokenizer"] | |||
tokenizer.padding_side = "left" | |||
accelerator = self.trainer.accelerator | |||
model = getattr(self.trainer, "ref_model", kwargs["model"]) # get the ref model if any, else use the model | |||
# Use the reference model if available, otherwise use the initial model | |||
model = getattr(self.trainer, "ref_model", self.trainer.model_wrapped) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it work in the peft case? ie when ref_model is None?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh good catch - let me see if I can write a unit test for this case
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Test added here with some minor cleanup to reuse the same expected winrates across tests: 592a088
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK to merge @qgallouedec or would you like some final changes?
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, just added a suggestion for a comment
Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>
What does this PR do?
While training some models with
GKDTrainer
I ran into an issue with ZeRO-3 andWinRateCallback
where the initial model provided by the callback is unwrapped and this incompatible withunwrap_model_for_generation()
. I also realised that usingeval_steps
is far more intuitive for bothWinRateCallback
andLogCompletionsCallback
because we usually reserve logging for fast computations like the loss, andLogCompletionsCallback
produces a major bottleneck in the logging if we generate everylogging_steps
.This PR fixes both issues
Before submitting
Pull Request section?
to it if that's the case.
documentation guidelines.
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.