Replies: 8 comments 9 replies
-
Excuse me, do you have any new ideas? I found a `PrefixEncoder` class in the source files; it seems to be used in P-Tuning v2.
Where it is used:
In this process, the official code passes the resulting embedding in as the `past_key_values` argument (`past_key_values: Optional[Tuple[Tuple[torch.Tensor, torch.Tensor], ...]] = None`).
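A minimal sketch of that reshaping, assuming illustrative dimensions (the real values come from the ChatGLM config) and using a plain `nn.Embedding` as a stand-in for the no-projection `PrefixEncoder`:

```python
import torch
import torch.nn as nn

batch, pre_seq_len = 2, 16
num_layers, num_heads, head_dim = 28, 32, 128  # illustrative, not the real config

# Stand-in for PrefixEncoder in its simplest (no-projection) mode
prefix_encoder = nn.Embedding(pre_seq_len, num_layers * 2 * num_heads * head_dim)

prefix_tokens = torch.arange(pre_seq_len).unsqueeze(0).expand(batch, -1)
past = prefix_encoder(prefix_tokens)
past = past.view(batch, pre_seq_len, num_layers * 2, num_heads, head_dim)
past = past.permute(2, 0, 3, 1, 4)          # (num_layers*2, batch, heads, seq, dim)
past_key_values = tuple(
    (kv[0], kv[1]) for kv in past.split(2)  # one (key, value) pair per layer
)
```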
-
Not sure; I will discuss it with the algorithm colleagues.
-
I'm trying to use GCG with ChatGLM3. After reading the code carefully, I think generate() actually supports inputs_embeds, which may solve the issue.
The parameter … So in fact, to use … And I find, when running … Not sure if my understanding is correct; thanks a lot in advance for your support!
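For models whose generation path does accept embeddings (recent transformers releases support this for LLaMA-family decoder models, for example), the call would look roughly like this. Whether ChatGLM3's remote code accepts it is exactly the open question here, so treat this as a sketch with `model` and `tokenizer` assumed already loaded:

```python
# Sketch: works for models whose prepare_inputs_for_generation forwards
# inputs_embeds; ChatGLM3's custom modeling code may not.
inputs = tokenizer("Hello", return_tensors="pt").to(model.device)
inputs_embeds = model.get_input_embeddings()(inputs.input_ids)
out = model.generate(inputs_embeds=inputs_embeds, max_new_tokens=20)
print(tokenizer.batch_decode(out, skip_special_tokens=True))
```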
-
The code below does pass embeddings as input, but calling model.generate() raises an error beginning "You passed …":

```python
inputs = tokenizer(MutilTalk_Prompt, padding='max_length', max_length=99)
tensor_input_ids = torch.tensor(inputs['input_ids'] + [2])  # append token id 2 (eos)
tensor_input_ids = tensor_input_ids.cuda()
print(tensor_input_ids)
input_embeds = model.transformer.embedding(tensor_input_ids.unsqueeze(0))

# forward() accepts inputs_embeds and runs fine
outputs = model(input_ids=tensor_input_ids.unsqueeze(0), inputs_embeds=input_embeds)
logits_output = tokenizer.batch_decode(
    torch.argmax(outputs['logits'], -1).detach().cpu().numpy(),
    skip_special_tokens=True,
)
print(logits_output)

# error: generate() rejects inputs_embeds for this model
outputs = model.generate(input_ids=tensor_input_ids.unsqueeze(0), inputs_embeds=input_embeds)
logits_output = tokenizer.batch_decode(
    torch.argmax(outputs['logits'], -1).detach().cpu().numpy(),
    skip_special_tokens=True,
)
print(logits_output)
```
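One possible workaround (a sketch, not an official API) is to bypass generate() entirely and run a manual greedy loop through forward(), which does accept inputs_embeds per the snippet above. The embedding call and the use_cache/past_key_values handling here are assumptions that may need adjusting for ChatGLM's actual modeling code:

```python
# Manual greedy decoding through forward(); sketch only.
generated = tensor_input_ids.unsqueeze(0)
embeds = model.transformer.embedding(generated)  # same embedding path as above
past = None
for _ in range(50):  # up to 50 new tokens
    out = model(inputs_embeds=embeds, past_key_values=past, use_cache=True)
    next_id = out.logits[:, -1, :].argmax(-1, keepdim=True)
    generated = torch.cat([generated, next_id], dim=-1)
    past = out.past_key_values                     # reuse the KV cache
    embeds = model.transformer.embedding(next_id)  # only feed the new token
print(tokenizer.batch_decode(generated, skip_special_tokens=True))
```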
-
Oh, I get what you mean now; actually I do not use …
-
Hi all, how did you resolve the NotImplementedError raised by model.resize_token_embeddings? Did you simply avoid calling it, or does the code need to be modified? I'm not very experienced with this and would really appreciate some guidance.
-
I ran into this problem too: the model does not implement the set_input_embeddings interface, so it raises NotImplementedError at runtime. How can this be fixed?
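resize_token_embeddings relies on the model implementing get_input_embeddings / set_input_embeddings, and the ChatGLM remote code apparently leaves them at the PreTrainedModel defaults, which raise NotImplementedError. A hedged sketch of patching them in; the attribute path `transformer.embedding.word_embeddings` is an assumption based on the ChatGLM3 modeling code and may need adjusting:

```python
# Sketch: supply the two accessors that resize_token_embeddings needs.
def get_input_embeddings(self):
    # assumed attribute path; verify against your modeling_chatglm.py
    return self.transformer.embedding.word_embeddings

def set_input_embeddings(self, value):
    self.transformer.embedding.word_embeddings = value

type(model).get_input_embeddings = get_input_embeddings
type(model).set_input_embeddings = set_input_embeddings
model.resize_token_embeddings(len(tokenizer))
```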
-
I haven't looked at the concrete generate() code, so let me analyze prepare_inputs_for_generation first.
As the screenshot above shows, LLaMA's prepare_inputs_for_generation supports embedding input, but ChatGLM's does not.
Does that mean ChatGLM's generate() does not support embedding input?
Apologies if I have misunderstood.
@xunkai55 @davidlvxin @duzx16
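For reference, the LLaMA branch in question looks roughly like this (paraphrased from the transformers source, not copied verbatim): inputs_embeds is only honored on the first step, after which the cached past_key_values plus the newly generated input_ids take over. ChatGLM's custom prepare_inputs_for_generation has no such branch, which would explain the behavior.

```python
# Paraphrased from LlamaForCausalLM.prepare_inputs_for_generation:
if inputs_embeds is not None and past_key_values is None:
    # first generation step: feed the embeddings directly
    model_inputs = {"inputs_embeds": inputs_embeds}
else:
    # subsequent steps: fall back to token ids
    model_inputs = {"input_ids": input_ids}
```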