New Features and Optimizations
- Implement Kahneman-Tversky Optimization (KTO).
- Sequence packing is now supported when running SFT with SFTChatDataset.
Breaking Changes
Bug Fixes
- Change log_prob_forward_micro_batch_size in DPO to mean the same as micro_batch_size: the number of samples (chosen and rejected included) processed at once.
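As a rough illustration of the new meaning, the sketch below converts the setting into a count of preference pairs per forward pass. The helper name pairs_per_forward is hypothetical and not part of the library, and the requirement that the value be even is an assumption made here so that every chosen sample has its rejected counterpart in the same micro batch.

```python
def pairs_per_forward(log_prob_forward_micro_batch_size: int) -> int:
    """Hypothetical helper: number of (chosen, rejected) pairs per forward pass.

    The setting counts individual samples, chosen and rejected included,
    so each preference pair consumes two slots.
    """
    size = log_prob_forward_micro_batch_size
    # Assumption: the value must be positive and even so pairs stay together.
    if size <= 0 or size % 2 != 0:
        raise ValueError("expected a positive, even micro batch size")
    return size // 2


# A value of 4 means 4 samples at once, i.e. 2 chosen + 2 rejected.
print(pairs_per_forward(4))
```

Under this reading, a configuration that previously counted pairs would need its value doubled to keep the same number of pairs per forward pass.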