Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Configuration argument overriding by deprecated default values #1905

Closed
qgallouedec opened this issue Aug 6, 2024 · 0 comments · Fixed by #1909
Closed

Configuration argument overriding by deprecated default values #1905

qgallouedec opened this issue Aug 6, 2024 · 0 comments · Fixed by #1909

Comments

@qgallouedec
Copy link
Member

qgallouedec commented Aug 6, 2024

For a many deprecated args, you need to pass them it in the config and in the trainer, otherwise it will be overwritten by the default value of the trainer. Example with num_of_sequences and SFTTrainer:

>>> from trl import SFTConfig, SFTTrainer
>>> args = SFTConfig("tmp", num_of_sequences=32)
>>> trainer = SFTTrainer("gpt2", args=args)
/fsx/qgallouedec/trl/trl/trainer/sft_trainer.py:185: UserWarning: You passed a model_id to the SFTTrainer. This will automatically create an `AutoModelForCausalLM` or a `PeftModel` (if you passed a `peft_config`) for you.
  warnings.warn(
/fsx/qgallouedec/trl/trl/trainer/sft_trainer.py:289: UserWarning: You didn't pass a `max_seq_length` argument to the SFTTrainer, this will default to 1024
  warnings.warn(
/fsx/qgallouedec/trl/trl/trainer/sft_trainer.py:352: UserWarning: You passed a `num_of_sequences` argument to the SFTTrainer, the value you passed will override the one in the `SFTConfig`.
  warnings.warn(
>>> trainer.args.num_of_sequences
1024

If wrongly report that I've passed a num_of_sequences argument to the SFTTrainer. And the value that I've set in SFTConfig is overwritten by the trainer's default (1024). The only way to make it work as expected is to use

args = SFTConfig("tmp", num_of_sequences=32)
trainer = SFTTrainer("gpt2", args=args, num_of_sequences=32)

which is not satisfying, since it create a duplicate, and it uses a deprecated arg.

@qgallouedec qgallouedec linked a pull request Aug 9, 2024 that will close this issue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant