RuntimeError: probability tensor contains either inf, nan or element < 0 #290

Closed
CaiJichang212 opened this issue Jun 4, 2024 · 4 comments
Labels: question (Further information is requested)

@CaiJichang212

When running run_knowedit_llama2.py with the Llama-2-7b model and the Wiki_recent dataset, I get the following error:

Traceback (most recent call last):
File "EasyEdit/examples/run_knowedit_llama2.py", line 208, in
metrics, edited_model, _ = editor.edit(
File "EasyEdit/easyeditor/editors/editor.py", line 171, in edit
return self.edit_requests(requests, sequential_edit, verbose, **kwargs)
File "EasyEdit/easyeditor/editors/editor.py", line 346, in edit_requests
edit_evaluation(all_metrics, request, edited_model, i, eval_metric, test_generation, icl_examples, **kwargs)
File "EasyEdit/easyeditor/editors/editor.py", line 321, in edit_evaluation
"post": compute_edit_quality(edited_model, self.model_name, self.hparams, self.tok, request, self.hparams.device, eval_metric=eval_metric, test_generation=test_generation),
File "EasyEdit/easyeditor/evaluate/evaluate.py", line 93, in compute_edit_quality
ret['fluency'] = test_generation_quality(model=model,tok=tok,prefixes=rewrite_prompts if isinstance(rewrite_prompts,list) else [rewrite_prompts,], max_out_len=100, vanilla_generation=False)
File "EasyEdit/easyeditor/evaluate/evaluate_utils.py", line 191, in test_generation_quality
gen_texts = generate_fast(
File "EasyEdit/easyeditor/util/generate.py", line 134, in generate_fast
new_tok_indices = torch.multinomial(softmax_out_top_k, 1)
RuntimeError: probability tensor contains either inf, nan or element < 0
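For context, torch.multinomial refuses to sample from a distribution that contains nan, and nan is exactly what an fp16 overflow produces once it reaches a softmax. A minimal standalone sketch of this failure mode (illustrative values, not EasyEdit code):

```python
import torch

# float16 has a max of ~65504, so a large logit silently overflows to inf.
logits = torch.tensor([[70000.0, 1.0, 2.0]]).to(torch.float16)
print(logits)                                  # inf appears in the first entry

# softmax over a row containing inf produces nan "probabilities".
probs = torch.softmax(logits.float(), dim=-1)
print(torch.isnan(probs).any())                # True

# torch.multinomial rejects such a distribution with a RuntimeError,
# which is what generate_fast hits in the traceback above.
torch.multinomial(probs, 1)
```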

Other info (ROME optimization log):

Executing ROME algorithm for the update: [The names of the siblings of Frances Polidori are] -> [ John William Polidori]
Computing left vector (u)...
Selected u projection object Frances Polidori
Left vector shape: torch.Size([11008])
Computing right vector (v)
Lookup index found: 11 | Sentence: The names of the siblings of Frances Polidori are John William Polid | Token: ori
Rewrite layer is 5
Tying optimization objective to 31
Recording initial value of v*
loss 6.671 = 6.671 + 0.0 + 0.0 avg prob of [ John William Polidori] 0.0012978289742022753
loss 4.004 = 4.0 + 0.004 + 0.0 avg prob of [ John William Polidori] 0.018648600205779076
loss 1.892 = 1.879 + 0.013 + 0.0 avg prob of [ John William Polidori] 0.15321972966194153
loss 0.922 = 0.758 + 0.164 + 0.0 avg prob of [ John William Polidori] 0.4717167913913727
loss 1.229 = 1.228 + 0.0 + 0.0 avg prob of [ John William Polidori] 0.2986283600330353
loss nan = nan + nan + 0.0 avg prob of [ John William Polidori] nan
loss nan = nan + nan + nan avg prob of [ John William Polidori] nan
loss nan = nan + nan + nan avg prob of [ John William Polidori] nan
loss nan = nan + nan + nan avg prob of [ John William Polidori] nan
loss nan = nan + nan + nan avg prob of [ John William Polidori] nan
loss nan = nan + nan + nan avg prob of [ John William Polidori] nan
loss nan = nan + nan + nan avg prob of [ John William Polidori] nan
loss nan = nan + nan + nan avg prob of [ John William Polidori] nan
loss nan = nan + nan + nan avg prob of [ John William Polidori] nan
loss nan = nan + nan + nan avg prob of [ John William Polidori] nan
loss nan = nan + nan + nan avg prob of [ John William Polidori] nan
loss nan = nan + nan + nan avg prob of [ John William Polidori] nan
loss nan = nan + nan + nan avg prob of [ John William Polidori] nan
loss nan = nan + nan + nan avg prob of [ John William Polidori] nan
loss nan = nan + nan + nan avg prob of [ John William Polidori] nan
loss nan = nan + nan + nan avg prob of [ John William Polidori] nan
loss nan = nan + nan + nan avg prob of [ John William Polidori] nan
loss nan = nan + nan + nan avg prob of [ John William Polidori] nan
loss nan = nan + nan + nan avg prob of [ John William Polidori] nan
loss nan = nan + nan + nan avg prob of [ John William Polidori] nan
Delta norm: nan
Change in target norm: 41.875 to nan => nan
Division Factor: 3.130859375
Right vector norm: nan
Right vector shape: torch.Size([4096])
Deltas successfully computed for ['model.layers.5.mlp.down_proj.weight']
New weights successfully inserted into ['model.layers.5.mlp.down_proj.weight']
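Note that the nan delta is still written into model.layers.5.mlp.down_proj.weight, so every subsequent forward pass (including the fluency generation that calls torch.multinomial) runs on corrupted weights. A small diagnostic helper one could use to confirm this (illustrative, not part of EasyEdit):

```python
import torch

def report_nonfinite_params(model):
    """Print every parameter that contains nan/inf, e.g. after a failed edit."""
    for name, param in model.named_parameters():
        if not torch.isfinite(param).all():
            print(f"non-finite values found in {name}")
```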

@CaiJichang212
Author

In the YAML file I set fp16: true.
Is this the cause of the bug?

@zxlzr added the question (Further information is requested) label on Jun 5, 2024
@XeeKee
Collaborator

XeeKee commented Jun 5, 2024

Yes, enabling fp16: true also causes nan to appear in our local tests.
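The likely reason is float16's narrow exponent range: both formats are 16 bits wide, but bfloat16 keeps float32's exponent width, so values that overflow to inf in fp16 stay finite in bf16. A quick illustration (not from the issue logs):

```python
import torch

print(torch.finfo(torch.float16).max)    # 65504.0
print(torch.finfo(torch.bfloat16).max)   # ~3.39e38, same exponent range as float32

x = torch.tensor(1e5)
print(x.to(torch.float16))               # inf -> later becomes nan in softmax/loss
print(x.to(torch.bfloat16))              # finite, at reduced precision
```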

@CaiJichang212
Author

Using bf16 instead avoids this data overflow. The relevant code in KE/EasyEdit/easyeditor/editors/editor.py:

            # Choose the model load dtype from the hparams: fp16 -> float16,
            # bf16 -> bfloat16 (wider exponent range), otherwise float32.
            if hasattr(hparams, 'fp16') and hparams.fp16:
                torch_dtype = torch.float16
            elif hasattr(hparams, 'bf16') and hparams.bf16:
                torch_dtype = torch.bfloat16
            else:
                torch_dtype = torch.float32

Hint: this requires torch > 2.0.1; I tested with torch==2.3.0.
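With bf16: true in the hparams YAML, the snippet above ends up loading the model in bfloat16. A minimal standalone sketch of the resulting load call (model name as in this issue; the exact call site in editor.py may differ):

```python
import torch
from transformers import AutoModelForCausalLM

# Load the model in bfloat16: same memory footprint as fp16, but float32's
# exponent range, so the ROME v* optimization no longer overflows to nan.
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",
    torch_dtype=torch.bfloat16,
)
```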

@zxlzr
Contributor

zxlzr commented Jun 8, 2024

Hi, do you have any further questions?
