FEAT: Add CLIs in TRL ! #1419
Merged
64 commits
f0d29ce
CLI V1
younesbelkada 3a2283a
v1 CLI
younesbelkada 15d166e
add rich enhancmeents
younesbelkada 16463b8
revert unindented change
younesbelkada d20167f
some comments
younesbelkada 38ee375
cleaner CLI
younesbelkada f83882c
fix
younesbelkada 14911d2
fix
younesbelkada b7f96bc
remove print callback
younesbelkada a328d9b
move to cli instead of trl_cli
younesbelkada 4ee1c8e
revert unneeded changes
younesbelkada 55659ce
fix test
younesbelkada 459c3eb
Update trl/commands/sft.py
younesbelkada 171fd94
remove redundant strings
younesbelkada 54662da
Merge branch 'add-cli' of https://github.com/lvwerra/trl into add-cli
younesbelkada fbec5ca
fix import issue
younesbelkada 616ee60
fix other issues
younesbelkada 553f898
add packing
younesbelkada d098262
add config parser
younesbelkada 6110423
some refactor
younesbelkada e9d4f91
cleaner
younesbelkada 265b488
add example config yaml file
younesbelkada 355e57c
small refactor
younesbelkada 8df993a
change a bit the logic
younesbelkada 8cf0b05
Merge remote-tracking branch 'origin/main' into add-cli
younesbelkada f64618f
fix issues here and there
younesbelkada c33dead
add CLI in docs
younesbelkada 0e45168
move to examples/sft
younesbelkada 3119da6
remove redundant licenses
younesbelkada 4d0da9b
make it work on dpo
younesbelkada cf2290f
set to None
younesbelkada bac1780
switch to accelerate and fix many things
younesbelkada 086b37c
add docs
younesbelkada d49e5e8
more docs
younesbelkada c7f4c83
added tests
younesbelkada 90526bf
doc clarification
younesbelkada c91513a
Merge remote-tracking branch 'origin/main' into add-cli
younesbelkada 3e0e3c9
Merge remote-tracking branch 'origin/main' into add-cli
younesbelkada c260432
more docs
younesbelkada 2ba16f3
fix CI for windows and python 3.8
younesbelkada a9d68b5
fix
younesbelkada 755db1e
attempt to fix CI
younesbelkada 61ee67b
fix?
younesbelkada 89b594e
test
younesbelkada d93a8e1
Merge branch 'add-cli' of https://github.com/lvwerra/trl into add-cli
younesbelkada d5ab9d6
fix
younesbelkada 026ceef
tweak?
younesbelkada be1ec61
fix
younesbelkada 8600269
test
younesbelkada ac99f35
another test
younesbelkada 7252ad0
fix
younesbelkada 55eda92
test
younesbelkada 76dbe94
fix
younesbelkada e6678f3
fix
younesbelkada 45424b8
fix
younesbelkada 2184b05
skip tests for windows
younesbelkada 79a4074
test @lvwerra approach
younesbelkada c92dd3b
make dev
younesbelkada a477236
revert unneeded changes
younesbelkada a1d228f
fix sft dpo
younesbelkada ef144d0
optimize a bit
younesbelkada c85b8e4
address final comments
younesbelkada 91a55ca
update docs
younesbelkada 7754760
final comment
younesbelkada
@@ -0,0 +1,87 @@
# Command Line Interfaces (CLIs)

You can use the TRL CLIs to fine-tune your language model with Supervised Fine-Tuning (SFT) or Direct Preference Optimization (DPO).

Currently supported CLIs are:

- `trl sft`
- `trl dpo`

## Get started

Before getting started, pick a language model from the Hugging Face Hub. Supported models can be found by filtering the models with "text-generation". Also pick a dataset that is relevant to your task.

Then run:
```bash
accelerate config
```
and select the right configuration for your training setup (single / multi-GPU, DeepSpeed, etc.). Make sure to complete all steps of `accelerate config` before running any CLI command.

We also recommend passing a YAML config file to configure your training protocol. Below is a simple example of a YAML file that you can use to train your models with the `trl sft` command.

```yaml
model_name_or_path: HuggingFaceM4/tiny-random-LlamaForCausalLM
dataset_name: imdb
dataset_text_field: text
report_to: none
learning_rate: 0.0001
lr_scheduler_type: cosine
```

Save that config in a `.yaml` file and get started directly! Note that you can override the arguments from the config file by passing them explicitly to the CLI, e.g.:

```bash
trl sft --config example_config.yaml --output_dir test-trl-cli --lr_scheduler_type cosine_with_restarts
```

This forces `lr_scheduler_type` to `cosine_with_restarts`, regardless of the value in the config file.
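
The precedence rule above can be pictured as a plain dictionary merge. This is only an illustrative sketch, not TRL's actual argument parser: `merge_config` is a hypothetical helper, and the YAML values are written inline as a dict so the example has no dependencies.

```python
def merge_config(yaml_values: dict, cli_values: dict) -> dict:
    """Return the effective settings; explicit CLI values take precedence.

    Hypothetical helper for illustration only, not part of TRL.
    """
    merged = dict(yaml_values)
    # Only override with values the user actually passed on the command line.
    merged.update({k: v for k, v in cli_values.items() if v is not None})
    return merged


# Values as they would be read from the example YAML file.
yaml_values = {"learning_rate": 0.0001, "lr_scheduler_type": "cosine"}
# Values passed explicitly on the command line.
cli_values = {"output_dir": "test-trl-cli", "lr_scheduler_type": "cosine_with_restarts"}

effective = merge_config(yaml_values, cli_values)
print(effective["lr_scheduler_type"])
```

Settings that appear only in the YAML file, such as `learning_rate` here, pass through unchanged.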

## Supported Arguments

We support all arguments from `transformers.TrainingArguments`. For loading your model, we also support all arguments from `~trl.ModelConfig`:

[[autodoc]] ModelConfig

You can pass any of these arguments either to the CLI or in the YAML file.

### Supervised Fine-tuning (SFT)

Follow the basic instructions above and run `trl sft --output_dir <output_dir> <*args>`:

```bash
trl sft --config config.yaml --output_dir your-output-dir
```

The SFT CLI is based on the `examples/scripts/sft.py` script.

### Direct Preference Optimization (DPO)

First, follow the basic instructions above and run `trl dpo --output_dir <output_dir> <*args>`. Make sure to process your DPO dataset into the TRL format as follows:

1- Pre-tokenize the dataset using chat templates:

```bash
python examples/datasets/tokenize_ds.py --model gpt2 --dataset yourdataset
```

You might need to adapt `examples/datasets/tokenize_ds.py` to use your chat template.

2- Format the dataset into the TRL format (you can adapt `examples/datasets/anthropic_hh.py`):

```bash
python examples/datasets/anthropic_hh.py --push_to_hub --hf_entity your-hf-org
```

Once your dataset has been pushed, run the DPO CLI as follows:

```bash
trl dpo --config config.yaml --output_dir your-output-dir
```

The DPO CLI is based on the `examples/scripts/dpo.py` script.
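
To make the dataset format mentioned above more concrete, here is a rough sketch of a single preference record. It assumes the `prompt`, `chosen` and `rejected` column names used for DPO preference data and plain string values; the scripts above may store these fields differently (for example as chat messages), and `is_preference_record` is a hypothetical helper, not part of TRL.

```python
# Assumed column names for a DPO preference record.
REQUIRED_COLUMNS = ("prompt", "chosen", "rejected")


def is_preference_record(record: dict) -> bool:
    """Check that every required column holds a non-empty string.

    Hypothetical helper for illustration only.
    """
    return all(
        isinstance(record.get(col), str) and len(record[col]) > 0
        for col in REQUIRED_COLUMNS
    )


example = {
    "prompt": "What is the capital of France?",
    "chosen": "The capital of France is Paris.",
    "rejected": "France does not have a capital.",
}

print(is_preference_record(example))
```

A check like this can be run over a few rows before pushing the dataset, to catch missing columns early.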
@@ -0,0 +1,20 @@
# This is an example configuration file for the TRL CLI. You can use it for
# SFT like this: `trl sft --config config.yaml --output_dir test-sft`
# The YAML file supports environment variables by adding an `env` field
# as below

# env:
#    CUDA_VISIBLE_DEVICES: 0

model_name_or_path: HuggingFaceM4/tiny-random-LlamaForCausalLM
dataset_name: imdb
dataset_text_field: text
report_to: none
learning_rate: 1e-4
lr_scheduler_type: cosine
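
The `env` field described in the comments of this config could be handled roughly as follows. This is a sketch under stated assumptions rather than the parser added in this PR: `apply_env` is a hypothetical helper, and the config is given as an already-loaded dict.

```python
import os


def apply_env(config: dict) -> dict:
    """Export the optional `env` mapping and return the remaining settings.

    Hypothetical helper for illustration only, not TRL's implementation.
    """
    remaining = dict(config)
    env_vars = remaining.pop("env", None) or {}
    for name, value in env_vars.items():
        # Environment variables must be strings, so YAML integers are converted.
        os.environ[name] = str(value)
    return remaining


# The example config with the `env` field uncommented, as a loaded dict.
config = {"env": {"CUDA_VISIBLE_DEVICES": 0}, "dataset_name": "imdb"}
training_args = apply_env(config)
print(training_args)
```

The `env` entry is removed from the returned dict so that only real training arguments are forwarded to the training script.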
What do you think about adding it in the cli or examples folder?