
SimPO issue with DeepSeek and Gemma #6785

Closed
1 task done
Guochry opened this issue Jan 31, 2025 · 2 comments
Labels
solved This problem has been already solved

Comments

Guochry commented Jan 31, 2025

Reminder

  • I have read the above rules and searched the existing issues.

System Info

Has anyone trained Gemma and DeepSeek with SimPO? After training, the resulting model can no longer be served with vLLM (the base model runs under vLLM without issue before training). The inference error is as follows:


[rank0]: Traceback (most recent call last):
[rank0]:   File "....../rollout_open_source/generate_vllm.py", line 93, in <module>
[rank0]:     fire.Fire(main)
[rank0]:   File "/nethome/gguo37/miniconda3/envs/myenv/lib/python3.10/site-packages/fire/core.py", line 135, in Fire
[rank0]:     component_trace = _Fire(component, args, parsed_flag_args, context, name)
[rank0]:   File "/nethome/gguo37/miniconda3/envs/myenv/lib/python3.10/site-packages/fire/core.py", line 468, in _Fire
[rank0]:     component, remaining_args = _CallAndUpdateTrace(
[rank0]:   File "....../miniconda3/envs/myenv/lib/python3.10/site-packages/fire/core.py", line 684, in _CallAndUpdateTrace
[rank0]:     component = fn(*varargs, **kwargs)
[rank0]:   File "....../rollout_open_source/generate_vllm.py", line 71, in main
[rank0]:     for prompts, data_list in tqdm(batch_iter(input_file, batch_size, model_name_or_path), total=num_batches, desc="generating"):
[rank0]:   File "....../miniconda3/envs/myenv/lib/python3.10/site-packages/tqdm/std.py", line 1181, in __iter__
[rank0]:     for obj in iterable:
[rank0]:   File "....../rollout_open_source/generate_vllm.py", line 18, in batch_iter
[rank0]:     tokenizer = AutoTokenizer.from_pretrained(model_name, use_fast=False, trust_remote_code=True)
[rank0]:   File "....../miniconda3/envs/myenv/lib/python3.10/site-packages/transformers/models/auto/tokenization_auto.py", line 907, in from_pretrained
[rank0]:     return tokenizer_class.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
[rank0]:   File "....../miniconda3/envs/myenv/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 2208, in from_pretrained
[rank0]:     return cls._from_pretrained(
[rank0]:   File "....../miniconda3/envs/myenv/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 2442, in _from_pretrained
[rank0]:     tokenizer = cls(*init_inputs, **init_kwargs)
[rank0]:   File "....../miniconda3/envs/myenv/lib/python3.10/site-packages/transformers/models/llama/tokenization_llama.py", line 171, in __init__
[rank0]:     self.sp_model = self.get_spm_processor(kwargs.pop("from_slow", False))
[rank0]:   File "....../miniconda3/envs/myenv/lib/python3.10/site-packages/transformers/models/llama/tokenization_llama.py", line 198, in get_spm_processor
[rank0]:     tokenizer.Load(self.vocab_file)
[rank0]:   File "....../miniconda3/envs/myenv/lib/python3.10/site-packages/sentencepiece/__init__.py", line 961, in Load
[rank0]:     return self.LoadFromFile(model_file)
[rank0]:   File "....../miniconda3/envs/myenv/lib/python3.10/site-packages/sentencepiece/__init__.py", line 316, in LoadFromFile
[rank0]:     return _sentencepiece.SentencePieceProcessor_LoadFromFile(self, arg)
[rank0]: TypeError: not a string
[rank0]:[W130 23:21:39.260107041 ProcessGroupNCCL.cpp:1250] Warning: WARNING: process group has NOT been destroyed before we destruct ProcessGroupNCCL. On normal program exit, the application should call destroy_process_group to ensure that any pending NCCL operations have finished in this process. In rare cases this process can exit before this point and block the progress of another member of the process group. This constraint has always been present,  but this warning has only been added since PyTorch 2.4 (function operator())
srun: error: voltron: task 0: Exited with exit code 1
srun: Terminating StepId=123594.3

Reproduction

Put your message here.

Others

No response

@Guochry Guochry added bug Something isn't working pending This problem is yet to be addressed labels Jan 31, 2025
hiyouga (Owner) commented Jan 31, 2025

Copy the original model's tokenizer files into the new checkpoint directory, overwriting the ones there.
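A minimal sketch of that fix, assuming both the base model and the SimPO output are local directories (the file list and directory paths are illustrative; which tokenizer files exist depends on the model):

```python
import shutil
from pathlib import Path

# Tokenizer artifacts that AutoTokenizer / vLLM may look for; names vary by model.
TOKENIZER_FILES = [
    "tokenizer.model",
    "tokenizer.json",
    "tokenizer_config.json",
    "vocab.json",
    "merges.txt",
    "special_tokens_map.json",
]

def restore_tokenizer(base_dir: str, trained_dir: str) -> list[str]:
    """Copy whichever tokenizer files exist in base_dir into trained_dir,
    overwriting any partial copies left by training. Returns the copied names."""
    copied = []
    for name in TOKENIZER_FILES:
        src = Path(base_dir) / name
        if src.exists():
            shutil.copy2(src, Path(trained_dir) / name)
            copied.append(name)
    return copied

# Example with hypothetical paths:
# restore_tokenizer("models/gemma-base", "outputs/gemma-simpo")
```

After restoring the files, loading the checkpoint with `AutoTokenizer.from_pretrained` should find `tokenizer.model` again instead of failing inside sentencepiece with `TypeError: not a string`.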

@hiyouga hiyouga closed this as completed Jan 31, 2025
@hiyouga hiyouga added solved This problem has been already solved and removed bug Something isn't working pending This problem is yet to be addressed labels Jan 31, 2025

DBtxy commented Feb 7, 2025

Copy the original model's tokenizer files into the new checkpoint directory, overwriting the ones there.

Could this be caused by a missing vocab.json? I want to fine-tune the deepseek-distill-qwen-32b-awq version and ran into the same problem. How can I resolve it?
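One way to check whether a missing file is the cause is to diff the tokenizer files between the base model directory and the training output before loading anything (a sketch; the file list mirrors common tokenizer artifacts and is not exhaustive):

```python
from pathlib import Path

def missing_tokenizer_files(base_dir: str, trained_dir: str) -> list[str]:
    """Return tokenizer-related files present in base_dir but absent from trained_dir."""
    names = [
        "tokenizer.model",
        "tokenizer.json",
        "tokenizer_config.json",
        "vocab.json",
        "merges.txt",
        "special_tokens_map.json",
    ]
    return [
        n for n in names
        if (Path(base_dir) / n).exists() and not (Path(trained_dir) / n).exists()
    ]
```

If the list is non-empty, copying those files over from the base model (as suggested above) should let the tokenizer load again.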
