Fix logits computation in KTO trainer prediction step #2050

issamemari · 2024-09-11T09:41:37Z

Description of the issue

There is a bug in the following lines of kto_trainer.py:

logits_dict = {
    "eval_logits/chosen": metrics["logits/chosen"],
    "eval_logits/rejected": metrics["logits/rejected"],
}
logits = tuple(v.unsqueeze(dim=0) for k, v in logits_dict.items() if k not in ignore_keys)
logits = torch.stack(logits).mean(axis=1).to(self.accelerator.device)

This assumes that the values in logits_dict are tensors, but they are not. They are computed in get_batch_loss_metrics, where the logits are averaged and .item() is called on the resulting tensor, which returns a Python float.

This causes the following error during KTO training:

AttributeError: 'float' object has no attribute 'unsqueeze'
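
The failure can be reproduced without a trainer. The snippet below is a minimal sketch, not code from the PR: the metric values are made-up stand-ins for what get_batch_loss_metrics returns after .item(), and ignore_keys is assumed to be empty.

```python
# Stand-in metrics: plain Python floats, as produced by calling .item()
# on an averaged tensor. The numbers are made up for illustration.
metrics = {"logits/chosen": 0.25, "logits/rejected": -0.5}

logits_dict = {
    "eval_logits/chosen": metrics["logits/chosen"],
    "eval_logits/rejected": metrics["logits/rejected"],
}
ignore_keys = []  # assumption: nothing is filtered out

try:
    # Same expression as in the buggy lines: floats have no .unsqueeze()
    logits = tuple(
        v.unsqueeze(dim=0) for k, v in logits_dict.items() if k not in ignore_keys
    )
    message = None
except AttributeError as err:
    message = str(err)

print(message)
```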

What does this PR do?

Treat the values of logits_dict as floats
Fixes AttributeError: 'float' object has no attribute 'unsqueeze' during the evaluation phase of a KTO training
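
One way to treat the values as floats is to collect them in a list and build a single tensor from it. This is a hedged sketch of that idea, not necessarily the merged diff: collect_eval_logits is a hypothetical standalone helper, and the device argument stands in for self.accelerator.device.

```python
import torch


def collect_eval_logits(metrics, ignore_keys=(), device="cpu"):
    """Hypothetical helper: build the eval logits tensor from float metrics."""
    logits_dict = {
        "eval_logits/chosen": metrics["logits/chosen"],
        "eval_logits/rejected": metrics["logits/rejected"],
    }
    # The values are already floats, so no unsqueeze/stack/mean is needed.
    values = [v for k, v in logits_dict.items() if k not in ignore_keys]
    return torch.tensor(values, device=device)


# Made-up float metrics, as .item() would return them
metrics = {"logits/chosen": 0.25, "logits/rejected": -0.5}
logits = collect_eval_logits(metrics)
print(logits.shape)
```

The result is a 1-D tensor with one entry per key that was not ignored, so downstream code that expects a tensor of logits still receives one.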

trl/trainer/kto_trainer.py

HuggingFaceDocBuilderDev · 2024-09-11T10:32:35Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

trl/trainer/kto_trainer.py

Fix logits computation in KTO trainer prediction step

50f6deb

kashif reviewed Sep 11, 2024

Update trl/trainer/kto_trainer.py

9b97a2a

kashif approved these changes Sep 11, 2024

kashif reviewed Sep 11, 2024

Update trl/trainer/kto_trainer.py

fd9b583

kashif approved these changes Sep 11, 2024

kashif merged commit 9c043e5 into huggingface:main Sep 11, 2024
9 checks passed
