Issues: huggingface/trl
[Tracking issue] Integrate native liger-kernel losses
#2495
opened Dec 17, 2024 by
qgallouedec
Open
4
[Tracking issue] Wrong loss scaling when accumulating gradient
#2617
opened Jan 23, 2025 by
qgallouedec
Open
Issues list
AttributeError: 'Qwen2ForCausalLM' object has no attribute 'optimizer' during GRPO training with ZERO-3
🐛 bug
Something isn't working
🏋 GRPO
Related to GRPO
#2782
opened Feb 6, 2025 by
Co1lin
5 tasks done
NashMD trainer sampling policy wrong
⚡accelerate
Related to accelerate
🐛 bug
Something isn't working
⚡ PEFT
Related to PEFT
#2781
opened Feb 6, 2025 by
zhourunlong
5 tasks done
lora don't work! OOM
🐛 bug
Something isn't working
⚡ PEFT
Related to PEFT
#2780
opened Feb 6, 2025 by
zhangguoxin1
5 tasks done
ORPOTrainer crashes due to pickling failure if dataloader_num_workers > 0
🐛 bug
Something isn't working
🏋 ORPO
Related to ORPO
#2779
opened Feb 6, 2025 by
kiratp
Allow vllm sub-batching to avoid CUDA out of memory
✨ enhancement
New feature or request
🏋 GRPO
Related to GRPO
#2775
opened Feb 5, 2025 by
cfpark00
GRPO tests failing in multi-device setting on main
🐛 bug
Something isn't working
🏋 GRPO
Related to GRPO
#2774
opened Feb 5, 2025 by
tyler-romero
5 tasks done
How to log more metrics with wandb when using GRPO trainer and accelerate
⚡accelerate
Related to accelerate
✨ enhancement
New feature or request
🏋 GRPO
Related to GRPO
#2768
opened Feb 5, 2025 by
andrewsiah
5 tasks done
Add Custom Reward Functions To Online DPO (and other methods)
✨ enhancement
New feature or request
🏋 GRPO
Related to GRPO
🏋 Online DPO
Related to Online DPO
🏋 Reward
Related to Reward modelling
🏋 RLOO
Related to RLOO
#2767
opened Feb 4, 2025 by
xzuyn
Wrong quick start guide and value_model error
🐛 bug
Something isn't working
📚 documentation
Improvements or additions to documentation
🏋 PPO
Related to PPO
#2764
opened Feb 4, 2025 by
elliot-zzh
It seems difficult to use both LigerKernel and Flash Attention for both the teacher and student models in GKD.
🐛 bug
Something isn't working
🚀 deepspeed
Related to deepspeed
🏋 GKD
Related to GKD
#2761
opened Feb 4, 2025 by
YooSungHyun
5 tasks done
Llama 3 family of models does not seem to work with RewardTrainer
⚡accelerate
Related to accelerate
⚡ PEFT
Related to PEFT
🏋 Reward
Related to Reward modelling
#2758
opened Feb 4, 2025 by
JohnGiorgi
5 tasks done
Tracking Liger-Kernel progress for GRPO Loss
🏋 GRPO
Related to GRPO
#2756
opened Feb 3, 2025 by
Superskyyy
How to do multi-node training for GRPO with DeepSpeed + vLLM?
🚀 deepspeed
Related to deepspeed
🏋 GRPO
Related to GRPO
#2754
opened Feb 3, 2025 by
nikhilchandak
🐛 Installation Issue: Unable to Install : From provided instruction of contribution.md
🐛 bug
Something isn't working
📚 documentation
Improvements or additions to documentation
#2753
opened Feb 3, 2025 by
rawathemant246
Possible discrepancy in GRPO loss: Paper vs. implementation (log-prob vs. prob)
🐛 bug
Something isn't working
🏋 GRPO
Related to GRPO
❓ question
Seeking clarification or more information
#2752
opened Feb 3, 2025 by
liranringel
Training with grpo required 20 mins for single step
✨ enhancement
New feature or request
#2751
opened Feb 3, 2025 by
imrankh46
feat(GRPOTrainer): reward_func return None to skip
✨ enhancement
New feature or request
🏋 GRPO
Related to GRPO
#2737
opened Feb 2, 2025 by
ctjlewis
PLZ make padding_free for DataCollatorForChatML.
✨ enhancement
New feature or request
🏋 GKD
Related to GKD
🙋 help from community wanted
Open invitation for community members to contribute
#2736
opened Feb 2, 2025 by
YooSungHyun
SFTvsRL SFT Memorizes, RL Generalizes
✨ enhancement
New feature or request
#2735
opened Feb 2, 2025 by
NickyDark1
GRPO Trainer supports VLMs
✨ enhancement
New feature or request
🏋 GRPO
Related to GRPO
#2734
opened Feb 2, 2025 by
sunildkumar
GKD Example why do not use labels?
🏋 GKD
Related to GKD
❓ question
Seeking clarification or more information
#2732
opened Feb 2, 2025 by
YooSungHyun
5 tasks done
Latest TRL code = significantly worse rewards for GRPO training
🐛 bug
Something isn't working
🏋 GRPO
Related to GRPO
#2731
opened Feb 2, 2025 by
abacaj
5 tasks done
OOM for 7B model on A100 80Gb
🐛 bug
Something isn't working
#2719
opened Jan 31, 2025 by
JohnConnor123
5 tasks done
AttributeError: 'AutoModelForCausalLMWithValueHead' object has no attribute 'base_model_prefix'
🐛 bug
Something isn't working
⚡ PEFT
Related to PEFT
🏋 PPO
Related to PPO
#2718
opened Jan 31, 2025 by
Tarak200