add svtr large model #10937

Merged: 2 commits, Sep 26, 2023

Conversation

zhangyubo0722
Collaborator

No description provided.

paddle-bot bot commented Sep 18, 2023

Thanks for your contribution!

zhangyubo0722 force-pushed the add_svtr_large branch 2 times, most recently from d6dc304 to 435a928 (September 18, 2023 12:08)
use_visualdl: false
infer_img: doc/imgs_words/ch/word_1.jpg
character_dict_path: ppocr/utils/ppocr_keys_v1.txt
max_text_length: &max_text_length 25
Review comment (Collaborator):

This can be changed to 40.

beta2: 0.99
epsilon: 1.0e-08
weight_decay: 0.05
no_weight_decay_name: norm pos_embed char_node_embed pos_node_embed char_pos_embed vis_pos_embed
Review comment (Collaborator):

Has this been optimized?
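
For context on what the no_weight_decay_name entry above controls, here is a minimal pure-Python sketch of how a space-separated name list like this could be used to exclude parameters from weight decay. It assumes the names are matched as substrings of parameter names, which is not confirmed in this thread; split_by_weight_decay and the example parameter names are hypothetical.

```python
def split_by_weight_decay(param_names, no_weight_decay_name):
    """Partition parameter names into (decay, no_decay) groups.

    Assumption: each whitespace-separated key in `no_weight_decay_name`
    is matched as a substring of the parameter name.
    """
    keys = no_weight_decay_name.split()
    decay, no_decay = [], []
    for name in param_names:
        if any(key in name for key in keys):
            no_decay.append(name)
        else:
            decay.append(name)
    return decay, no_decay


# Hypothetical parameter names, for illustration only.
names = ["blocks.0.norm1.weight", "pos_embed", "blocks.0.attn.qkv.weight"]
decay, no_decay = split_by_weight_decay(names, "norm pos_embed")
print(decay)     # ['blocks.0.attn.qkv.weight']
print(no_decay)  # ['blocks.0.norm1.weight', 'pos_embed']
```

Under this assumption, a key that matches no parameter name has no effect, so listing extra names in the config is harmless but also does nothing.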

out_channels: 512
patch_merging: Conv
embed_dim: [192, 256, 512]
depth: [6, 6, 9]
Review comment (Collaborator):

Have all of these parameters been tuned?


Architecture:
model_type: rec
algorithm: SVTR_LCNet
Review comment (Collaborator):

algorithm: SVTR?
There is no LCNet here anymore.

zhangyubo0722 force-pushed the add_svtr_large branch 2 times, most recently from 57d2780 to c675aa5 (September 25, 2023 12:01)
self.dec_pos_embed = self.create_parameter(
shape=[1, w, dim], default_initializer=zeros_)
self.add_parameter("dec_pos_embed", self.dec_pos_embed)
# self.pos_drop = nn.Dropout(p=drop_rate)
Review comment (Collaborator):

Remove the redundant code.

@@ -88,7 +111,9 @@ def __init__(self, in_channels, out_channels_list, **kwargs):
'{} is not supported in MultiHead yet'.format(name))

def forward(self, x, targets=None):

if self.use_pool:
# print(x.shape)
Review comment (Collaborator):

Delete this.

@@ -61,8 +78,14 @@ def __init__(self, in_channels, out_channels_list, **kwargs):
max_text_length = gtc_args.get('max_text_length', 25)
nrtr_dim = gtc_args.get('nrtr_dim', 256)
num_decoder_layers = gtc_args.get('num_decoder_layers', 4)
self.before_gtc = nn.Sequential(
if self.use_pos:
# add_pos = AddPos(nrtr_dim, 60)
Review comment (Collaborator):

Delete this.

zhangyubo0722 force-pushed the add_svtr_large branch 3 times, most recently from b3487df to f8eb3f5 (September 25, 2023 12:24)
tink2123 previously approved these changes Sep 25, 2023
# See the License for the specific language governing permissions and
# limitations under the License.

from matplotlib.mlab import stride_windows
Review comment (Collaborator):

Remove the unrelated code.


def forward(self, x):

qkv = paddle.reshape(self.qkv(x), (0, -1, 3, self.num_heads, self.dim //
Review comment (Collaborator):

Could writing it this way prevent the inference model from being exported? Has this been verified?
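
For readers unfamiliar with the reshape in question: in paddle.reshape, a 0 in the target shape copies the input dimension at the same index, and a -1 is inferred from the remaining elements. The pure-Python sketch below reproduces only that shape resolution, without Paddle. resolve_reshape and the example sizes (including the head dimension of 32) are hypothetical, and the sketch says nothing about whether export actually succeeds, which still has to be checked by exporting the model as the reviewer asks.

```python
from math import prod


def resolve_reshape(in_shape, target):
    """Resolve a reshape target that may contain 0 and -1.

    0  -> copy the input dimension at the same index
    -1 -> infer this dimension from the total element count
    """
    out = []
    for i, d in enumerate(target):
        if d == 0:
            if i >= len(in_shape):
                raise ValueError(f"0 at index {i} has no matching input dim")
            out.append(in_shape[i])
        else:
            out.append(d)

    if out.count(-1) > 1:
        raise ValueError("only one dimension may be -1")

    total = prod(in_shape)
    if -1 in out:
        known = prod(d for d in out if d != -1)
        if known == 0 or total % known != 0:
            raise ValueError("cannot infer the -1 dimension")
        out[out.index(-1)] = total // known
    elif prod(out) != total:
        raise ValueError("element count mismatch")
    return out


# Hypothetical sizes: batch 2, 40 tokens, qkv width 3 * 6 heads * 32.
print(resolve_reshape([2, 40, 576], (0, -1, 3, 6, 32)))  # [2, 40, 3, 6, 32]
# The same target shape works unchanged for a different batch size.
print(resolve_reshape([5, 40, 576], (0, -1, 3, 6, 32)))  # [5, 40, 3, 6, 32]
```

The point of the sketch is that the target shape never names the batch size, so the same shape tuple applies to any batch.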

tink2123 (Collaborator) left a comment:

LGTM

tink2123 merged commit e49e491 into PaddlePaddle:dygraph on Sep 26, 2023
embed_dim: [192, 256, 512]
depth: [6, 6, 9]
num_heads: [6, 8, 16]
mixer: ['Conv','Conv','Conv','Conv','Conv','Conv','Conv','Conv','Conv','Global','Global','Global','Global','Global','Global','Global','Global','Global','Global','Global','Global']

Hi @zhangyubo0722, may I ask a question?
I assume the Permutation column in the SVTR paper corresponds to the value of mixer, so I set my config to Conv*10 and Global*11, but your config uses Conv*9 and Global*11. Could you point me to the source you used for this config? Thank you very much.
image
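
One way to sanity-check a mixer list against depth is to split the flat list into per-stage slices. The sketch below assumes the backbone consumes one mixer entry per block, in stage order, which is an assumption about how the config is read and is not confirmed in this thread; split_mixer_by_stage is a hypothetical helper.

```python
def split_mixer_by_stage(depth, mixer):
    """Split a flat mixer list into one slice per stage.

    Assumption: one mixer entry per block, consumed in stage order,
    so len(mixer) must equal sum(depth).
    """
    if len(mixer) != sum(depth):
        raise ValueError(
            f"mixer has {len(mixer)} entries but depth sums to {sum(depth)}"
        )
    stages, start = [], 0
    for d in depth:
        stages.append(mixer[start:start + d])
        start += d
    return stages


depth = [6, 6, 9]
stages = split_mixer_by_stage(depth, ["Conv"] * 10 + ["Global"] * 11)
for i, stage in enumerate(stages):
    print(i, stage.count("Conv"), stage.count("Global"))
```

With depth [6, 6, 9] the list needs 21 entries under this assumption. Conv*10 plus Global*11 gives 21, while Conv*9 plus Global*11 gives only 20 and would raise here, so counting the entries in the merged config is worth doing before comparing it to the paper.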

4 participants