add svtr large model #10937
Thanks for your contribution!
Force-pushed from d6dc304 to 435a928.
configs/rec/rec_svtrnet_large.yml
Outdated
use_visualdl: false
infer_img: doc/imgs_words/ch/word_1.jpg
character_dict_path: ppocr/utils/ppocr_keys_v1.txt
max_text_length: &max_text_length 25
This can be changed to 40.
configs/rec/rec_svtrnet_large.yml
Outdated
beta2: 0.99
epsilon: 1.0e-08
weight_decay: 0.05
no_weight_decay_name: norm pos_embed char_node_embed pos_node_embed char_pos_embed vis_pos_embed
Has this been tuned?
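For context on the `no_weight_decay_name` line under review, here is a minimal sketch of how a space-separated name list like this is typically applied. It assumes a parameter is exempted from weight decay when its name contains any of the listed tokens (substring matching is an assumption, not confirmed by this PR); the helper `split_decay_params` and the sample parameter names are hypothetical.

```python
def split_decay_params(param_names, no_weight_decay_name):
    """Partition parameter names into (decayed, exempt) lists.

    Assumption: a parameter is exempt when any whitespace-separated
    token of `no_weight_decay_name` occurs as a substring of its name.
    """
    tokens = no_weight_decay_name.split()
    decayed, exempt = [], []
    for name in param_names:
        if any(tok in name for tok in tokens):
            exempt.append(name)
        else:
            decayed.append(name)
    return decayed, exempt


# Hypothetical parameter names, for illustration only.
names = [
    "backbone.blocks.0.norm1.weight",
    "backbone.pos_embed",
    "backbone.blocks.0.mixer.qkv.weight",
    "head.fc.weight",
]
decayed, exempt = split_decay_params(
    names,
    "norm pos_embed char_node_embed pos_node_embed char_pos_embed vis_pos_embed",
)
print("exempt from weight decay:", exempt)
print("weight decay applied:", decayed)
```

Under this assumption, a token that matches no parameter name has no effect, so listing every token against the model's actual parameter names is a quick way to see which entries matter.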
configs/rec/rec_svtrnet_large.yml
Outdated
out_channels: 512
patch_merging: Conv
embed_dim: [192, 256, 512]
depth: [6, 6, 9]
Have all of these parameters been tuned?
configs/rec/rec_svtrnet_large.yml
Outdated
Architecture:
  model_type: rec
  algorithm: SVTR_LCNet
algorithm: SVTR?
There is no LCNet anymore.
Force-pushed from 57d2780 to c675aa5.
self.dec_pos_embed = self.create_parameter(
    shape=[1, w, dim], default_initializer=zeros_)
self.add_parameter("dec_pos_embed", self.dec_pos_embed)
# self.pos_drop = nn.Dropout(p=drop_rate)
Remove the redundant code.
@@ -88,7 +111,9 @@ def __init__(self, in_channels, out_channels_list, **kwargs):
'{} is not supported in MultiHead yet'.format(name))

def forward(self, x, targets=None):

    if self.use_pool:
        # print(x.shape)
del
@@ -61,8 +78,14 @@ def __init__(self, in_channels, out_channels_list, **kwargs):
max_text_length = gtc_args.get('max_text_length', 25)
nrtr_dim = gtc_args.get('nrtr_dim', 256)
num_decoder_layers = gtc_args.get('num_decoder_layers', 4)
self.before_gtc = nn.Sequential(
if self.use_pos:
    # add_pos = AddPos(nrtr_dim, 60)
del
Force-pushed from b3487df to f8eb3f5.
ppocr/modeling/backbones/rec_vit.py
Outdated
# See the License for the specific language governing permissions and
# limitations under the License.

from matplotlib.mlab import stride_windows
Remove the unrelated code.
def forward(self, x):

    qkv = paddle.reshape(self.qkv(x), (0, -1, 3, self.num_heads, self.dim //
Could writing it this way prevent the inference model from being exported? Has this been verified?
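For reference on the reshape in question: in `paddle.reshape`, a `0` in the target shape copies the input's dimension at that position, and a single `-1` is inferred from the remaining element count. The pure-Python sketch below only illustrates that rule; the helper `resolve_shape` and the sample sizes are hypothetical, and it does not answer whether export works, which still has to be checked by actually exporting the model.

```python
from math import prod


def resolve_shape(input_shape, target):
    """Resolve a reshape target that uses 0 (copy dim) and -1 (infer dim)."""
    # A 0 copies the input dimension at the same index.
    out = [input_shape[i] if d == 0 else d for i, d in enumerate(target)]
    if out.count(-1) > 1:
        raise ValueError("at most one dimension may be -1")
    total = prod(input_shape)
    if -1 in out:
        known = prod(d for d in out if d != -1)
        if known == 0 or total % known != 0:
            raise ValueError("cannot infer the -1 dimension")
        out[out.index(-1)] = total // known
    elif prod(out) != total:
        raise ValueError("element count mismatch")
    return out


# Hypothetical sizes: batch 2, sequence length 40, dim 192, 6 heads,
# so the qkv projection output has 3 * 192 = 576 channels.
print(resolve_shape([2, 40, 576], (0, -1, 3, 6, 192 // 6)))
```

Because the batch dimension is written as `0` and the sequence dimension as `-1`, neither is hard-coded in the target shape.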
Force-pushed from f8eb3f5 to 40396c4.
Force-pushed from 40396c4 to 2d8a6ae.
LGTM
embed_dim: [192, 256, 512]
depth: [6, 6, 9]
num_heads: [6, 8, 16]
mixer: ['Conv','Conv','Conv','Conv','Conv','Conv','Conv','Conv','Conv','Global','Global','Global','Global','Global','Global','Global','Global','Global','Global','Global','Global'] |
Hi @zhangyubo0722, may I ask a question?
I assume the Permutation column (in the SVTR paper) gives the value of mixer, so I set my config to Conv*10 and Global*11, but your config uses Conv*9 and Global*11. Could you point me to the reference you used for this config? Thank you very much.
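One way to sanity-check the counts discussed in this comment is to compare the length of `mixer` with the total number of blocks in `depth`. The sketch below assumes one `mixer` entry is consumed per block, so `len(mixer)` should equal `sum(depth)`; that assumption and the helper name `summarize_mixer` are not taken from this PR.

```python
from collections import Counter


def summarize_mixer(depth, mixer):
    """Return per-type mixer counts and whether len(mixer) == sum(depth)."""
    counts = dict(Counter(mixer))
    return counts, len(mixer) == sum(depth)


depth = [6, 6, 9]  # 21 blocks in total

# Conv*10 + Global*11 gives 21 entries, matching sum(depth).
print(summarize_mixer(depth, ["Conv"] * 10 + ["Global"] * 11))

# Conv*9 + Global*11 gives only 20 entries, one short of sum(depth).
print(summarize_mixer(depth, ["Conv"] * 9 + ["Global"] * 11))
```

Running the same check on the `mixer` list copied from the config shows directly how many `Conv` and `Global` entries it contains.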
No description provided.