
[Error] TypeError: Trainer.__init__() got an unexpected keyword argument 'processing_class' #2207

Closed
2 of 4 tasks
himanshushukla12 opened this issue Oct 9, 2024 · 3 comments
Labels
🐛 bug (Something isn't working) · 🏋 Reward (Related to Reward modelling)

Comments

@himanshushukla12

System Info

Here are the versions I'm using:

  • transformers: 4.45.2
  • trl: 0.12.0.dev0
  • platform: Linux-6.8.0-41-generic-x86_64-with-glibc2.35
  • python: 3.10.11

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder
  • My own task or dataset (give details below)

Reproduction

Steps to reproduce this behavior:

Run the command below in the trl directory:

python examples/scripts/reward_modeling.py \
    --model_name_or_path Qwen/Qwen2-0.5B-Instruct \
    --dataset_name trl-lib/ultrafeedback_binarized \
    --output_dir Qwen2-0.5B-Reward-LoRA \
    --per_device_train_batch_size 8 \
    --num_train_epochs 1 \
    --gradient_checkpointing True \
    --learning_rate 1.0e-4 \
    --logging_steps 25 \
    --eval_strategy steps \
    --eval_steps 50 \
    --max_length 2048 \
    --use_peft \
    --lora_r 32 \
    --lora_alpha 16

Expected behavior

LoRA fine-tuning was expected to start, but instead the script fails with the error below.

Actual error:

Some weights of Qwen2ForSequenceClassification were not initialized from the model checkpoint at Qwen/Qwen2-0.5B-Instruct and are newly initialized: ['score.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
/home/z004x2xz/miscExcersises/himanshushukla12/trl/examples/scripts/reward_modeling.py:99: UserWarning: You are using a `task_type` that is different than `SEQ_CLS` for PEFT. This will lead to silent bugs Make sure to pass --lora_task_type SEQ_CLS when using this script with PEFT.
  warnings.warn(
Traceback (most recent call last):
  File "/home/z004x2xz/miscExcersises/himanshushukla12/trl/examples/scripts/reward_modeling.py", line 112, in <module>
    trainer = RewardTrainer(
  File "/home/z004x2xz/miscExcersises/himanshushukla12/trl/trl/trainer/reward_trainer.py", line 246, in __init__
    super().__init__(
TypeError: Trainer.__init__() got an unexpected keyword argument 'processing_class'

Things I tried to fix the issue:

I tried changing processing_class to tokenizer in the script (roughly the change sketched below), but it didn't work, presumably because RewardTrainer itself still passes processing_class down to Trainer.__init__, as the traceback shows.
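
For reference, the attempted change looked roughly like this (the surrounding arguments are abbreviated, so treat the exact names as approximate):

# examples/scripts/reward_modeling.py, around the call at line 112
trainer = RewardTrainer(
    model=model,
    # processing_class=tokenizer,  # original argument that triggers the TypeError
    tokenizer=tokenizer,           # my attempted workaround
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
)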

@qgallouedec
Member

Thanks for reporting. You need to use the dev transformers version:

pip install git+https://github.com/huggingface/transformers.git
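
Once that's installed, a quick way to confirm the new keyword is actually available (plain standard-library introspection, nothing TRL-specific):

import inspect
from transformers import Trainer

# Should print True with the dev version of transformers;
# on 4.45.2 it prints False, which matches the TypeError above.
print("processing_class" in inspect.signature(Trainer.__init__).parameters)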

@qgallouedec added the 🐛 bug (Something isn't working) and 🏋 Reward (Related to Reward modelling) labels on Oct 9, 2024
@himanshushukla12
Author

Now I'm getting this error:

RuntimeError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.

@qgallouedec
Member

Probably not related. Can you open another issue for it?
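
In the meantime, rerunning with the flag the error message itself suggests should give a more precise stack trace (same command as before, just with the environment variable prepended):

CUDA_LAUNCH_BLOCKING=1 python examples/scripts/reward_modeling.py ...  # rest of the arguments unchanged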
