RuntimeError: probability tensor contains either inf, nan or element < 0 #290

Closed
CaiJichang212 opened this issue Jun 4, 2024 · 4 comments
Labels: question (Further information is requested)

@CaiJichang212

When running run_knowedit_llama2.py with the Llama-2-7b model and the Wiki_recent dataset, I get the following error:

Traceback (most recent call last):
File "EasyEdit/examples/run_knowedit_llama2.py", line 208, in
metrics, edited_model, _ = editor.edit(
File "EasyEdit/easyeditor/editors/editor.py", line 171, in edit
return self.edit_requests(requests, sequential_edit, verbose, **kwargs)
File "EasyEdit/easyeditor/editors/editor.py", line 346, in edit_requests
edit_evaluation(all_metrics, request, edited_model, i, eval_metric, test_generation, icl_examples, **kwargs)
File "EasyEdit/easyeditor/editors/editor.py", line 321, in edit_evaluation
"post": compute_edit_quality(edited_model, self.model_name, self.hparams, self.tok, request, self.hparams.device, eval_metric=eval_metric, test_generation=test_generation),
File "EasyEdit/easyeditor/evaluate/evaluate.py", line 93, in compute_edit_quality
ret['fluency'] = test_generation_quality(model=model,tok=tok,prefixes=rewrite_prompts if isinstance(rewrite_prompts,list) else [rewrite_prompts,], max_out_len=100, vanilla_generation=False)
File "EasyEdit/easyeditor/evaluate/evaluate_utils.py", line 191, in test_generation_quality
gen_texts = generate_fast(
File "EasyEdit/easyeditor/util/generate.py", line 134, in generate_fast
new_tok_indices = torch.multinomial(softmax_out_top_k, 1)
RuntimeError: probability tensor contains either inf, nan or element < 0
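For context, torch.multinomial refuses to sample from a distribution that contains nan, and nan is exactly what an fp16 overflow produces once it reaches a softmax. A minimal standalone sketch of this failure mode (illustrative values, not EasyEdit code):

```python
import torch

# float16 has a max of ~65504, so a large logit silently overflows to inf.
logits = torch.tensor([[70000.0, 1.0, 2.0]]).to(torch.float16)
print(logits)                                  # inf appears in the first entry

# softmax over a row containing inf produces nan "probabilities".
probs = torch.softmax(logits.float(), dim=-1)
print(torch.isnan(probs).any())                # True

# torch.multinomial rejects such a distribution with a RuntimeError,
# which is what generate_fast hits in the traceback above.
torch.multinomial(probs, 1)
```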

Other info (ROME optimization log):

Executing ROME algorithm for the update: [The names of the siblings of Frances Polidori are] -> [ John William Polidori]
Computing left vector (u)...
Selected u projection object Frances Polidori
Left vector shape: torch.Size([11008])
Computing right vector (v)
Lookup index found: 11 | Sentence: The names of the siblings of Frances Polidori are John William Polid | Token: ori
Rewrite layer is 5
Tying optimization objective to 31
Recording initial value of v*
loss 6.671 = 6.671 + 0.0 + 0.0 avg prob of [ John William Polidori] 0.0012978289742022753
loss 4.004 = 4.0 + 0.004 + 0.0 avg prob of [ John William Polidori] 0.018648600205779076
loss 1.892 = 1.879 + 0.013 + 0.0 avg prob of [ John William Polidori] 0.15321972966194153
loss 0.922 = 0.758 + 0.164 + 0.0 avg prob of [ John William Polidori] 0.4717167913913727
loss 1.229 = 1.228 + 0.0 + 0.0 avg prob of [ John William Polidori] 0.2986283600330353
loss nan = nan + nan + 0.0 avg prob of [ John William Polidori] nan
loss nan = nan + nan + nan avg prob of [ John William Polidori] nan
loss nan = nan + nan + nan avg prob of [ John William Polidori] nan
loss nan = nan + nan + nan avg prob of [ John William Polidori] nan
loss nan = nan + nan + nan avg prob of [ John William Polidori] nan
loss nan = nan + nan + nan avg prob of [ John William Polidori] nan
loss nan = nan + nan + nan avg prob of [ John William Polidori] nan
loss nan = nan + nan + nan avg prob of [ John William Polidori] nan
loss nan = nan + nan + nan avg prob of [ John William Polidori] nan
loss nan = nan + nan + nan avg prob of [ John William Polidori] nan
loss nan = nan + nan + nan avg prob of [ John William Polidori] nan
loss nan = nan + nan + nan avg prob of [ John William Polidori] nan
loss nan = nan + nan + nan avg prob of [ John William Polidori] nan
loss nan = nan + nan + nan avg prob of [ John William Polidori] nan
loss nan = nan + nan + nan avg prob of [ John William Polidori] nan
loss nan = nan + nan + nan avg prob of [ John William Polidori] nan
loss nan = nan + nan + nan avg prob of [ John William Polidori] nan
loss nan = nan + nan + nan avg prob of [ John William Polidori] nan
loss nan = nan + nan + nan avg prob of [ John William Polidori] nan
loss nan = nan + nan + nan avg prob of [ John William Polidori] nan
Delta norm: nan
Change in target norm: 41.875 to nan => nan
Division Factor: 3.130859375
Right vector norm: nan
Right vector shape: torch.Size([4096])
Deltas successfully computed for ['model.layers.5.mlp.down_proj.weight']
New weights successfully inserted into ['model.layers.5.mlp.down_proj.weight']
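Note that the nan delta is still written into model.layers.5.mlp.down_proj.weight, so every subsequent forward pass (including the fluency generation that calls torch.multinomial) runs on corrupted weights. A small diagnostic helper one could use to confirm this (illustrative, not part of EasyEdit):

```python
import torch

def report_nonfinite_params(model):
    """Print every parameter that contains nan/inf, e.g. after a failed edit."""
    for name, param in model.named_parameters():
        if not torch.isfinite(param).all():
            print(f"non-finite values found in {name}")
```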

@CaiJichang212
Author

In the YAML file I set fp16: true.
Is this the cause of the bug?

@zxlzr added the question (Further information is requested) label on Jun 5, 2024
@XeeKee
Collaborator

XeeKee commented Jun 5, 2024

Yes, enabling fp16: true also causes nan to appear in our local tests.
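The likely reason is float16's narrow exponent range: both formats are 16 bits wide, but bfloat16 keeps float32's exponent width, so values that overflow to inf in fp16 stay finite in bf16. A quick illustration (not from the issue logs):

```python
import torch

print(torch.finfo(torch.float16).max)    # 65504.0
print(torch.finfo(torch.bfloat16).max)   # ~3.39e38, same exponent range as float32

x = torch.tensor(1e5)
print(x.to(torch.float16))               # inf -> later becomes nan in softmax/loss
print(x.to(torch.bfloat16))              # finite, at reduced precision
```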

@CaiJichang212
Author

Using bf16 instead avoids this data overflow. The relevant code in KE/EasyEdit/easyeditor/editors/editor.py:

            # Choose the model load dtype from the hparams: fp16 -> float16,
            # bf16 -> bfloat16 (wider exponent range), otherwise float32.
            if hasattr(hparams, 'fp16') and hparams.fp16:
                torch_dtype = torch.float16
            elif hasattr(hparams, 'bf16') and hparams.bf16:
                torch_dtype = torch.bfloat16
            else:
                torch_dtype = torch.float32

Hint: this requires torch > 2.0.1; I tested with torch==2.3.0.
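With bf16: true in the hparams YAML, the snippet above ends up loading the model in bfloat16. A minimal standalone sketch of the resulting load call (model name as in this issue; the exact call site in editor.py may differ):

```python
import torch
from transformers import AutoModelForCausalLM

# Load the model in bfloat16: same memory footprint as fp16, but float32's
# exponent range, so the ROME v* optimization no longer overflows to nan.
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",
    torch_dtype=torch.bfloat16,
)
```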

@zxlzr
Contributor

zxlzr commented Jun 8, 2024

Hi, do you have any further questions?
