[PaddlePaddle Hackathon] Task 55 Submission #1133
Conversation
The weight file links in file.json and the weights README have been updated.
OK
Fixed.
def forward(self, input_ids, token_type_ids=None, position_ids=None):
    if position_ids is None:
        # may need to use the shape op to unify static and dynamic graphs
        ones = paddle.ones_like(input_ids, dtype="int64")
        seq_length = paddle.cumsum(ones, axis=-1)
        position_ids = seq_length - ones
        cls_token_id = input_ids[0][0]
Is RoBERTa's position_id related to the cls_token_id?
The cls_token_id here is for compatibility with the different position_id formats.
Task 55 includes English models (whose position_id follows the original paper), while the RoBERTa in the original repo only handled the Chinese RoBERTa position_id encoding (identical to BERT's).
To support both encodings, the only usable signal among the arguments passed into the function is the cls_token_id inside input_ids (101 for BERT, 0 for the original RoBERTa paper), so it is used to decide whether position_id follows the BERT format or the original RoBERTa format. @joey12300
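
A minimal sketch of the dispatch described above, assuming dynamic-graph execution and that 0 and 101 are the only cls token ids in play; the helper name and the pad_token_id default are illustrative, not the PR's actual code:

import paddle

def infer_position_ids(input_ids, pad_token_id=1):
    # Illustrative helper, not the PR's implementation.
    ones = paddle.ones_like(input_ids, dtype="int64")
    seq_length = paddle.cumsum(ones, axis=-1)
    cls_token_id = input_ids[0][0]
    if cls_token_id == 0:
        # Original (English) RoBERTa: positions are offset past the padding
        # index, fairseq-style, so the first token sits at pad_token_id + 1.
        return seq_length - ones + pad_token_id + 1
    # Chinese RoBERTa / BERT (cls_token_id == 101): positions start at 0.
    return seq_length - ones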
"roberta-base-ft-chinanews-chn": | ||
"https://huggingface.co/uer/roberta-base-finetuned-chinanews-chinese/resolve/main/vocab.txt", | ||
"roberta-base-ft-cluener2020-chn": | ||
"https://huggingface.co/uer/roberta-base-finetuned-cluener2020-chinese/resolve/main/vocab.txt", | ||
"roberta-base-chn-extractive-qa": | ||
"https://huggingface.co/uer/roberta-base-chinese-extractive-qa/resolve/main/vocab.txt", |
These links need to be updated.
roberta_en_base_vocab_link = "https://huggingface.co/roberta-base/resolve/main/vocab.json"
roberta_en_base_merges_link = "https://huggingface.co/roberta-base/resolve/main/merges.txt"
pretrained_resource_files_map = {
    "vocab_file": {
        "roberta-en-base": roberta_en_base_vocab_link,
        "roberta-en-large": roberta_en_base_vocab_link,
        "roberta-base-squad2": roberta_en_base_vocab_link,
        "tiny-distilroberta-base":
            "https://huggingface.co/sshleifer/tiny-distilroberta-base/resolve/main/vocab.json"
    },
    "merges_file": {
        "roberta-en-base": roberta_en_base_merges_link,
        "roberta-en-large": roberta_en_base_merges_link,
        "roberta-base-squad2": roberta_en_base_merges_link,
        "tiny-distilroberta-base":
            "https://huggingface.co/sshleifer/tiny-distilroberta-base/resolve/main/merges.txt"
    }
}
These links need to be updated.
The weights have been uploaded to BOS.
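
For reference, a hedged sketch of what a BOS-hosted entry could look like once the links move off huggingface.co; the bucket path below is an assumption for illustration, not the PR's final URL:

# Hypothetical BOS URLs; the real paths are whatever the maintainers assign.
pretrained_resource_files_map = {
    "vocab_file": {
        "roberta-en-base":
            "https://bj.bcebos.com/paddlenlp/models/transformers/roberta_en_base/vocab.json",
    },
    "merges_file": {
        "roberta-en-base":
            "https://bj.bcebos.com/paddlenlp/models/transformers/roberta_en_base/merges.txt",
    },
}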
pad_token_id=0):
pad_token_id=0,
layer_norm_eps=1e-12
):  # eps is 1e-5 for roberta-base/large and 1e-12 for wwm-ext; exposed via config to make alignment tunable
Please remove the Chinese comment.
layer._epsilon = 1e-12
elif isinstance(
        layer, nn.LayerNorm
):  # eps is 1e-5 for roberta-base/large and 1e-12 for wwm-ext; exposed via config to make alignment tunable
Please remove the Chinese comment.
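
A hedged sketch of what the translated comment and the config-driven value might look like together; the helper name is hypothetical and only mirrors the diff above:

import paddle.nn as nn

def apply_layer_norm_eps(layer, layer_norm_eps=1e-12):
    # Hypothetical helper: roberta-base/large use eps=1e-5 while wwm-ext
    # uses 1e-12, so take the value from the model config instead of
    # hard-coding it; this keeps alignment tunable per checkpoint.
    if isinstance(layer, nn.LayerNorm):
        layer._epsilon = layer_norm_eps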
class RobertaForMultipleChoice(RobertaPretrainedModel):
    def __init__(self, roberta):
        super().__init__()
Please add docstrings to the RobertaForMultipleChoice class.
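
One possible shape for the requested docstring, following the Args-style blocks other paddlenlp model classes use; the wording is a sketch, not the final text (the same pattern applies to the two classes below):

class RobertaForMultipleChoice(RobertaPretrainedModel):
    """
    RoBERTa Model with a linear layer on top of the pooled output,
    intended for multiple-choice tasks such as SWAG.

    Args:
        roberta (:class:`RobertaModel`):
            An instance of RobertaModel.
    """

    def __init__(self, roberta):
        super().__init__()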
class RobertaForMaskedLM(RobertaPretrainedModel):
    def __init__(self, roberta):
        super().__init__()
Please add docstrings to the RobertaForMaskedLM class.
class RobertaForCausalLM(RobertaPretrainedModel):
    def __init__(self, roberta):
        super().__init__()
Please add docstrings to the RobertaForCausalLM class.
r""" | ||
Example:: | ||
>>> from paddlenlp.transformers import RobertaTokenizer, RobertaForCausalLM, RobertaConfig | ||
>>> import paddle | ||
>>> tokenizer = RobertaBPETokenizer.from_pretrained('roberta-base') | ||
>>> model = RobertaForCausalLM.from_pretrained('roberta-base', config=config) | ||
>>> inputs = tokenizer("Hello, my dog is cute")['input_ids'] | ||
>>> inputs = paddle.to_tensor(inputs) | ||
>>> outputs = model(inputs) | ||
""" |
The example format needs to be made consistent with the other classes.
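
For reference, paddlenlp docstrings elsewhere generally present examples as an `Example:` section with a `.. code-block::` directive rather than doctest-style `>>>` lines; a hedged sketch of the unified version:

r"""
Example:
    .. code-block::

        import paddle
        from paddlenlp.transformers import RobertaBPETokenizer, RobertaForCausalLM

        tokenizer = RobertaBPETokenizer.from_pretrained('roberta-base')
        model = RobertaForCausalLM.from_pretrained('roberta-base')
        inputs = tokenizer("Hello, my dog is cute")['input_ids']
        inputs = paddle.to_tensor([inputs])
        outputs = model(inputs)
"""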
done
LGTM
PR types
New features
PR changes
Models
Description
Task: #1075
To verify the model weight alignment, an AiStudio project has been set up: run the notebook at https://aistudio.baidu.com/aistudio/projectdetail/2453823 to verify.
Download links for the 7 converted model weights (AiStudio): https://aistudio.baidu.com/aistudio/datasetdetail/111650/0