Add Chunked SimPO Loss #386
Conversation
Great PR @pramodith! Could you rebase?
Good to go now!
lgtm
```python
        beta (float): Weight for the odds ratio loss.
        gamma (float): The simpo gamma, margin term.
    """
    logits = beta * (chosen_logps - rejected_logps) - gamma
```
Shouldn't this be normalized by length, as you said in the description?
Nvm, the logps are already averaged
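For context, here is a minimal sketch of how the quoted line fits into the full loss, assuming `chosen_logps` and `rejected_logps` are already length-averaged per-sequence log-probabilities (the function name and default values are illustrative, not the PR's actual API):

```python
import torch
import torch.nn.functional as F

def simpo_loss(
    chosen_logps: torch.Tensor,    # length-averaged log p(y_w | x), shape (batch,)
    rejected_logps: torch.Tensor,  # length-averaged log p(y_l | x), shape (batch,)
    beta: float = 2.0,             # illustrative default, not the PR's
    gamma: float = 0.5,            # illustrative default, not the PR's
) -> torch.Tensor:
    # The margin term gamma requires the chosen response to beat the
    # rejected one by at least gamma in beta-scaled average-logp units.
    logits = beta * (chosen_logps - rejected_logps) - gamma
    # Negative log-sigmoid of the margin, averaged over the batch.
    return -F.logsigmoid(logits).mean()
```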
Summary
This PR adds the Simple Preference Optimization (SimPO) loss function. The only difference between SimPO and CPO is a margin term `gamma`, which specifies that the preferred response should be at least gamma logits better than the rejected response. Note that SimPO explicitly requires that $$\pi_\theta(y|x)$$ be normalized by length, unlike DPO. This corresponds to Eq. 6 in the paper.
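For reference, my reading of that objective (Eq. 6 of the SimPO paper), with length-normalized log-probabilities:

$$\mathcal{L}_{\text{SimPO}}(\pi_\theta) = -\,\mathbb{E}_{(x,\,y_w,\,y_l)\sim\mathcal{D}}\left[\log \sigma\!\left(\frac{\beta}{|y_w|}\log \pi_\theta(y_w \mid x) - \frac{\beta}{|y_l|}\log \pi_\theta(y_l \mid x) - \gamma\right)\right]$$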
Testing Done
- GPU A100-80G-SXM
- `make test` to ensure correctness
- `make checkstyle` to ensure code style
- `make test-convergence` to ensure convergence