prompt = "这是关于{}的文章:".format(label) prompt_tokens = tokenizer.encode(prompt) prompt_len = len(prompt_tokens) ... second_mask = [0] * (args.seq_length - 1) for idx in range(prompt_len - 1, len(tokens) - 1): second_mask[idx] = 1
The last token of `prompt_tokens` should be the colon ':'. Shouldn't `second_mask[prompt_len - 1]` be set to 0?

Here are some reference results printed from pdb:
```
(Pdb) p prompt
'这是关于news_story的文章:'
(Pdb) p prompt_tokens
[621, 671, 14464, 555, 27743, 11, 1630, 8, 17]
(Pdb) p prompt_len - 1
8
(Pdb) p prompt_tokens[8]
17
(Pdb) p tokenizer.decode(17)
':'
(Pdb) p second_mask[8]
1
```
This mask is over the labels, not over the model's input, which is why the index is shifted by one. For example, if the text is [1, 2, 3, 4], the language model's input is [1, 2, 3] and the labels are [2, 3, 4]; since the mask applies to the labels, everything is offset by one position.
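To make the shift concrete, here is a minimal Python sketch of how the input, labels, and mask line up. The token values and `prompt_len` are hypothetical (not the project's real data); only the indexing mirrors the snippet above.

```python
# Hypothetical token sequence: 3 prompt tokens (ending in ':') followed by
# 3 article tokens. Values are illustrative only.
tokens = [621, 671, 17, 301, 302, 303]
prompt_len = 3  # assumed prompt length for this toy example

inputs = tokens[:-1]   # [621, 671, 17, 301, 302] -> fed to the language model
labels = tokens[1:]    # [671, 17, 301, 302, 303] -> what the model must predict

# The mask selects which *label* positions contribute to the loss:
# only the article tokens, i.e. everything predicted after the prompt.
second_mask = [0] * len(labels)
for idx in range(prompt_len - 1, len(tokens) - 1):
    second_mask[idx] = 1

# second_mask == [0, 0, 1, 1, 1]
# Position prompt_len - 1 = 2 is kept because labels[2] (301) is the first
# article token, predicted from the prompt's final ':' token in inputs[2].
print(list(zip(inputs, labels, second_mask)))
```

So `second_mask[prompt_len - 1]` must stay 1: it covers the prediction of the first article token, not the colon itself.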