I found this line in `evaluation_loop()`:

```python
target_indicies = [i for i in range(len(random_batch["kl"])) if random_batch["kl"][i] is False]
```

This doesn't function correctly, because the eval dataset normally doesn't contain the key `"kl"`. See the error below:

```
[rank0]: target_indicies = [i for i in range(len(random_batch["kl"])) if random_batch["kl"][i] is False]
[rank0]: KeyError: 'kl'
```
So I guess the original intent of `self.generate_during_eval` was to test the samples with `label == False`?
If I am correct, I can submit a PR to fix it, thanks.
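If that reading is right, a minimal sketch of the fix (assuming the eval batch carries a boolean `"label"` column, as in the KTO data format, rather than `"kl"`) would be:

```python
# Hypothetical sketch: select indices of samples whose label is False.
# Assumes random_batch has a boolean "label" column instead of "kl".
random_batch = {
    "label": [True, False, True, False],
    "prompt": ["p0", "p1", "p2", "p3"],
}

target_indicies = [i for i, label in enumerate(random_batch["label"]) if label is False]
print(target_indicies)  # → [1, 3]
```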
Moreover, the subsequent code also has a subtle problem:

```python
target_batch = {
    "prompt_input_ids": itemgetter(*target_indicies)(random_batch["prompt_input_ids"]),
    "prompt_attention_mask": itemgetter(*target_indicies)(random_batch["prompt_attention_mask"]),
    "prompt": itemgetter(*target_indicies)(random_batch["prompt"]),
}
```

The items selected by `itemgetter` form a tuple, which cannot be used directly in the subsequent `generate()` call:

```python
policy_output = model.generate(
    input_ids=batch["prompt_input_ids"],
    attention_mask=batch["prompt_attention_mask"],
    max_length=self.max_length,
    do_sample=True,
    pad_token_id=self.tokenizer.pad_token_id,
)
```
This leads to an error along the lines of `tuple` having no `shape[0]`, because `generate()` expects a batched tensor as input, not a tuple of tensors.
So I suggest the following fix:

```python
target_batch = {
    "prompt_input_ids": random_batch["prompt_input_ids"][target_indicies],
    "prompt_attention_mask": random_batch["prompt_attention_mask"][target_indicies],
    "prompt": itemgetter(*target_indicies)(random_batch["prompt"]),
}
```
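To illustrate the difference, here is a stdlib-only sketch using plain lists in place of tensors (with real `torch.Tensor`s, the fancy indexing `random_batch["prompt_input_ids"][target_indicies]` returns a batched tensor directly, with `shape[0] == len(target_indicies)`):

```python
from operator import itemgetter

prompts = ["p0", "p1", "p2", "p3"]
target_indicies = [1, 3]

# itemgetter returns a tuple, which has no .shape and is not a batched tensor
selected_tuple = itemgetter(*target_indicies)(prompts)
print(type(selected_tuple).__name__)  # → tuple

# list-comprehension stand-in for tensor fancy indexing: the result stays
# a single batched container rather than a tuple of separate items
selected_batch = [prompts[i] for i in target_indicies]
print(selected_batch)  # → ['p1', 'p3']
```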
The last issue is about `prediction_step()`:
```python
with torch.no_grad(), prediction_context_manager():
    loss, metrics = self.get_batch_loss_metrics(model, inputs)

# force log the metrics
if self.accelerator.is_main_process:
    self.store_metrics(metrics, train_eval="eval")

if prediction_loss_only:
    return (loss.detach(), None, None)

# logits for the chosen and rejected samples from model
logits_dict = {
    "eval_logits/chosen": metrics["logits/chosen"],
    "eval_logits/rejected": metrics["logits/rejected"],
}
logits = tuple(v.unsqueeze(dim=0) for k, v in logits_dict.items() if k not in ignore_keys)
logits = torch.stack(logits).mean(axis=1).to(self.accelerator.device)
labels = torch.zeros(logits.shape[0], device=self.accelerator.device)
return (loss.detach(), logits, labels)
```
The implemented `get_batch_loss_metrics()` doesn't return any logits, so if `self.compute_metrics` is used in the trainer, this raises a `KeyError` on `logits/chosen` and `logits/rejected`.
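One defensive option (a hypothetical sketch, not the trainer's actual code) would be to guard the dictionary lookups, so the step can fall back to loss-only behaviour when the logits are missing from `metrics`:

```python
# Hypothetical guard: only build logits_dict from keys actually present
# in metrics, instead of indexing unconditionally and raising KeyError.
metrics = {"kl": 0.1, "loss": 0.5}  # example metrics dict without logits/*

logit_keys = ("logits/chosen", "logits/rejected")
if all(k in metrics for k in logit_keys):
    logits_dict = {f"eval_{k}": metrics[k] for k in logit_keys}
else:
    logits_dict = None  # caller can then return (loss, None, None)

print(logits_dict)  # → None
```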