
Cityscapes SOTA model export fails with 'Config' object has no attribute 'model' #3358

Closed
2 of 3 tasks
Siiiiiigma opened this issue Jul 10, 2023 · 11 comments
Assignees
Labels
bug Something isn't working contributor Contribution from developers GoodFirstIssue

Comments

@Siiiiiigma

Search before asking

Describe the Bug

python export.py --config configs/mscale_ocr_cityscapes_autolabel_mapillary.yml --save_dir ./output --input_shape 1 3 2048 1024
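Following the README, I downloaded the model weights and the pretrained weights to the specified locations. Exporting the pretrained model with the command above produces the error below.
I tried several fixes mentioned in earlier issues, such as installing the development version of paddleseg from source, but the problem persists.
The same problem occurs in a PaddlePaddle AI Studio notebook, so it is probably unrelated to the environment setup.

Error output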
d:\deeplearning\paddleseg\paddleseg\cvlibs\manager.py:113: UserWarning: MscaleOCRNet exists already! It is now updated to <class 'models.mscale_ocrnet.MscaleOCRNet'> !!!
warnings.warn("{} exists already! It is now updated to {} !!!".
Traceback (most recent call last):
  File "D:\DeepLearning\PaddleSeg\contrib\CityscapesSOTA\export.py", line 140, in <module>
    main(args)
  File "D:\DeepLearning\PaddleSeg\contrib\CityscapesSOTA\export.py", line 84, in main
    net = cfg.model
AttributeError: 'Config' object has no attribute 'model'

Environment

paddlepaddle-gpu 2.4.2.post117
paddleseg 2.8.0 d:\deeplearning\paddleseg

Bug description confirmation

  • I confirm that the bug reproduction steps, code change description, and environment information have been provided, and that the problem can be reproduced.

Are you willing to submit a PR?

  • I'd like to help by submitting a PR!
@Siiiiiigma Siiiiiigma added the bug Something isn't working label Jul 10, 2023
@Asthestarsfalll
Contributor

@Siiiiiigma Hi, this looks like a bug. CityscapesSOTA uses modules from paddleseg, and it was not updated when paddleseg later changed. You can try an earlier version for now; I will fix this shortly.
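
To illustrate the kind of mismatch described here, the following is a minimal sketch of a compatibility shim: it uses `cfg.model` when the attribute exists and otherwise falls back to a separate builder object. `LegacyConfig`, `CurrentConfig`, `Builder`, and `resolve_model` are hypothetical stand-ins written for this example; they are not PaddleSeg classes, and the actual fix in the PR may look different.

```python
class LegacyConfig:
    """Stand-in for an older config object that exposes the built model as `.model`."""

    def __init__(self, model):
        self.model = model


class CurrentConfig:
    """Stand-in for a newer config object that only stores the model settings."""

    def __init__(self, model_cfg):
        self.model_cfg = model_cfg


class Builder:
    """Hypothetical builder that constructs the model from a config object."""

    def __init__(self, cfg):
        self.cfg = cfg

    @property
    def model(self):
        # A real builder would instantiate the network; a string keeps the sketch runnable.
        return "built:" + self.cfg.model_cfg["type"]


def resolve_model(cfg, builder_cls=Builder):
    """Return cfg.model when it exists, otherwise build the model through builder_cls."""
    if hasattr(cfg, "model"):
        return cfg.model
    return builder_cls(cfg).model


legacy = resolve_model(LegacyConfig("legacy-net"))
current = resolve_model(CurrentConfig({"type": "MscaleOCRNet"}))
print(legacy, current)
```

A script written against the legacy interface accesses `cfg.model` directly, which raises `AttributeError` as soon as the config class stops providing that attribute.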

@Asthestarsfalll
Contributor

@Siiiiiigma I have submitted a PR; you can try cloning my changes.

@Siiiiiigma
Author

@Asthestarsfalll
Thanks for the fix. When I export the first config (mscale_ocr_cityscapes_autolabel_mapillary.yml), I get the following warnings. Is this expected?
(Paddle) D:\DeepLearning\PaddleSeg\contrib\CityscapesSOTA>python export.py --config configs/mscale_ocr_cityscapes_autolabel_mapillary.yml --save_dir ./output --input_shape 1 3 2048 1024
d:\deeplearning\paddleseg\paddleseg\cvlibs\manager.py:113: UserWarning: MscaleOCRNet exists already! It is now updated to <class 'models.mscale_ocrnet.MscaleOCRNet'> !!!
warnings.warn("{} exists already! It is now updated to {} !!!".
2023-07-10 16:43:06 [WARNING] Add the in_channels in train_dataset class to model config. We suggest you manually set in_channels in model config.
2023-07-10 16:43:06 [INFO] Use the following config to build model
model:
  backbone:
    in_channels: 3
    type: HRNet_W48_NV
  backbone_indices:
  - 0
  n_scales:
  - 0.5
  - 1.0
  - 2.0
  num_classes: 19
  pretrained: pretrain/pretrained.pdparams
  type: MscaleOCRNet
W0710 16:43:06.020490 7732 gpu_resources.cc:61] Please NOTE: device: 0, GPU Compute Capability: 7.5, Driver API Version: 12.1, Runtime API Version: 11.7
W0710 16:43:06.046422 7732 gpu_resources.cc:91] device: 0, cuDNN Version: 8.4.
2023-07-10 16:43:10 [INFO] Loading pretrained model from pretrain/pretrained.pdparams
2023-07-10 16:43:13 [WARNING] [SKIP] Shape of pretrained params ocrnet.head.cls_head.weight doesn't match.(Pretrained: (65, 512, 1, 1), Actual: [19, 512, 1, 1])
2023-07-10 16:43:13 [WARNING] [SKIP] Shape of pretrained params ocrnet.head.cls_head.bias doesn't match.(Pretrained: (65,), Actual: [19])
2023-07-10 16:43:13 [WARNING] [SKIP] Shape of pretrained params ocrnet.head.aux_head.1.weight doesn't match.(Pretrained: (65, 720, 1, 1), Actual: [19, 720, 1, 1])
2023-07-10 16:43:13 [WARNING] [SKIP] Shape of pretrained params ocrnet.head.aux_head.1.bias doesn't match.(Pretrained: (65,), Actual: [19])
2023-07-10 16:43:13 [WARNING] scale_attn.atten_head.0._conv.weight is not in pretrained model
2023-07-10 16:43:13 [WARNING] scale_attn.atten_head.0._batch_norm.weight is not in pretrained model
2023-07-10 16:43:13 [WARNING] scale_attn.atten_head.0._batch_norm.bias is not in pretrained model
2023-07-10 16:43:13 [WARNING] scale_attn.atten_head.0._batch_norm._mean is not in pretrained model
2023-07-10 16:43:13 [WARNING] scale_attn.atten_head.0._batch_norm._variance is not in pretrained model
2023-07-10 16:43:13 [WARNING] scale_attn.atten_head.1._conv.weight is not in pretrained model
2023-07-10 16:43:13 [WARNING] scale_attn.atten_head.1._batch_norm.weight is not in pretrained model
2023-07-10 16:43:13 [WARNING] scale_attn.atten_head.1._batch_norm.bias is not in pretrained model
2023-07-10 16:43:13 [WARNING] scale_attn.atten_head.1._batch_norm._mean is not in pretrained model
2023-07-10 16:43:13 [WARNING] scale_attn.atten_head.1._batch_norm._variance is not in pretrained model
2023-07-10 16:43:13 [WARNING] scale_attn.atten_head.2.weight is not in pretrained model
2023-07-10 16:43:14 [INFO] There are 1572/1587 variables loaded into MscaleOCRNet.
2023-07-10 16:43:48 [INFO] The inference model is saved in ./output

@Asthestarsfalll
Contributor

Asthestarsfalll commented Jul 10, 2023

@Siiiiiigma
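The first warning appears because MscaleOCRNet is already registered in paddleseg.models and is registered again by CityscapesSOTA. It is harmless.
The second group, the pretrained-params warnings, appears because the classifier weight shapes differ: the pretrained head has 65 output channels while the model being exported has 19. A mismatch between the pretrained head and the fine-tuned head is normal and harmless.
The third group, the scale_attn warnings, appears because you loaded the pretrained weights, which do not contain the scale_attn module. For deployment, load the weights trained on the downstream task.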
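
The duplicate-registration warning can be reproduced with a small sketch of a name-keyed registry. `Registry`, `MODELS`, and `make_net` are hypothetical names made up for this example and are not PaddleSeg's manager implementation; only the warning text is copied from the log above.

```python
import warnings


class Registry:
    """Hypothetical name-to-class registry; not PaddleSeg's actual manager."""

    def __init__(self):
        self._components = {}

    def register(self, cls):
        name = cls.__name__
        if name in self._components:
            # Same wording as the UserWarning in the log; the newer class replaces the older.
            warnings.warn(f"{name} exists already! It is now updated to {cls} !!!")
        self._components[name] = cls
        return cls

    def get(self, name):
        return self._components[name]


MODELS = Registry()


def make_net(tag):
    """Create a distinct class that is always named MscaleOCRNet."""
    return type("MscaleOCRNet", (), {"tag": tag})


MODELS.register(make_net("core"))
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    MODELS.register(make_net("contrib"))  # second registration under the same name

print(len(caught), MODELS.get("MscaleOCRNet").tag)
```

In this sketch the second registration wins, so a later lookup by name returns the class registered by the contrib code.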

@shiyutang shiyutang added the contributor Contribution from developers label Jul 10, 2023
@Siiiiiigma
Author

Thanks, understood. After switching to the previously downloaded saved_model/model.pdparams, the warnings are gone.
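
For reference, the [SKIP] and "is not in pretrained model" warnings in the earlier log add up: 4 shape mismatches plus 11 missing keys is 15, which matches 1587 - 1572 variables not loaded. The sketch below shows one way such a partial load can be classified. `load_matching` is a hypothetical helper, and the entries marked as hypothetical are invented for illustration; the cls_head names and shapes come from the log.

```python
def load_matching(model_shapes, pretrained_shapes):
    """Split model parameter names into loaded, shape-mismatched, and missing groups."""
    loaded, mismatched, missing = [], [], []
    for name, shape in model_shapes.items():
        if name not in pretrained_shapes:
            missing.append(name)        # logged as "... is not in pretrained model"
        elif tuple(pretrained_shapes[name]) != tuple(shape):
            mismatched.append(name)     # logged as "[SKIP] Shape of pretrained params ..."
        else:
            loaded.append(name)
    return loaded, mismatched, missing


model_shapes = {
    "backbone.conv.weight": (64, 3, 3, 3),             # hypothetical name and shape
    "ocrnet.head.cls_head.weight": (19, 512, 1, 1),    # from the log
    "ocrnet.head.cls_head.bias": (19,),                # from the log
    "scale_attn.atten_head.2.weight": (1, 256, 1, 1),  # name from the log, shape hypothetical
}
pretrained_shapes = {
    "backbone.conv.weight": (64, 3, 3, 3),             # hypothetical name and shape
    "ocrnet.head.cls_head.weight": (65, 512, 1, 1),    # from the log
    "ocrnet.head.cls_head.bias": (65,),                # from the log
}

loaded, mismatched, missing = load_matching(model_shapes, pretrained_shapes)
print(f"There are {len(loaded)}/{len(model_shapes)} variables loaded.")
```

With a checkpoint trained on the downstream task, every name and shape would match and both warning groups would be empty.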

@Siiiiiigma
Author

@Asthestarsfalll
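Hi, I want to test this model on arbitrary street-scene images. I prepared a 2048*1024 JPG image, placed it in the image folder, and ran the following command in a PaddlePaddle AI Studio notebook: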
python deploy/python/infer.py \
    --config /home/aistudio/PaddleSeg-2.6.0/output/deploy.yaml \
    --image_path /home/aistudio/PaddleSeg-2.6.0/image \
    --save_dir /home/aistudio/PaddleSeg-2.6.0/result

The following error occurred:
2023-07-10 18:47:58 [INFO] Use GPU
--- Running analysis [ir_graph_build_pass]
I0710 18:48:00.875998 2513 executor.cc:187] Old Executor is Running.
--- Running analysis [ir_analysis_pass]
--- Running IR pass [map_op_to_another_pass]
--- Running IR pass [identity_scale_op_clean_pass]
--- Running IR pass [is_test_pass]
--- Running IR pass [simplify_with_basic_ops_pass]
--- Running IR pass [delete_quant_dequant_linear_op_pass]
--- Running IR pass [delete_weight_dequant_linear_op_pass]
--- Running IR pass [constant_folding_pass]
--- Running IR pass [silu_fuse_pass]
--- Running IR pass [conv_bn_fuse_pass]
--- Running IR pass [conv_eltwiseadd_bn_fuse_pass]
--- Running IR pass [embedding_eltwise_layernorm_fuse_pass]
--- Running IR pass [multihead_matmul_fuse_pass_v2]
--- Running IR pass [vit_attention_fuse_pass]
--- Running IR pass [fused_multi_transformer_encoder_pass]
--- Running IR pass [fused_multi_transformer_decoder_pass]
--- Running IR pass [fused_multi_transformer_encoder_fuse_qkv_pass]
--- Running IR pass [fused_multi_transformer_decoder_fuse_qkv_pass]
--- Running IR pass [multi_devices_fused_multi_transformer_encoder_pass]
--- Running IR pass [multi_devices_fused_multi_transformer_encoder_fuse_qkv_pass]
--- Running IR pass [multi_devices_fused_multi_transformer_decoder_fuse_qkv_pass]
--- Running IR pass [fuse_multi_transformer_layer_pass]
--- Running IR pass [gpu_cpu_squeeze2_matmul_fuse_pass]
--- Running IR pass [gpu_cpu_reshape2_matmul_fuse_pass]
--- Running IR pass [gpu_cpu_flatten2_matmul_fuse_pass]
--- Running IR pass [gpu_cpu_map_matmul_v2_to_mul_pass]
--- Running IR pass [gpu_cpu_map_matmul_v2_to_matmul_pass]
--- Running IR pass [matmul_scale_fuse_pass]
--- Running IR pass [multihead_matmul_fuse_pass_v3]
--- Running IR pass [gpu_cpu_map_matmul_to_mul_pass]
--- Running IR pass [fc_fuse_pass]
--- Running IR pass [fc_elementwise_layernorm_fuse_pass]
--- Running IR pass [conv_elementwise_add_act_fuse_pass]
--- Running IR pass [conv_elementwise_add2_act_fuse_pass]
--- Running IR pass [conv_elementwise_add_fuse_pass]
I0710 18:48:47.483732 2513 fuse_pass_base.cc:59] --- detected 12 subgraphs
--- Running IR pass [transpose_flatten_concat_fuse_pass]
--- Running IR pass [conv2d_fusion_layout_transfer_pass]
--- Running IR pass [transfer_layout_elim_pass]
--- Running IR pass [auto_mixed_precision_pass]
--- Running IR pass [inplace_op_var_pass]
I0710 18:48:47.669679 2513 fuse_pass_base.cc:59] --- detected 3 subgraphs
--- Running analysis [save_optimized_model_pass]
W0710 18:48:47.685402 2513 save_optimized_model_pass.cc:28] save_optim_cache_model is turned off, skip save_optimized_model_pass
--- Running analysis [ir_params_sync_among_devices_pass]
I0710 18:48:47.685453 2513 ir_params_sync_among_devices_pass.cc:51] Sync params from CPU to GPU
--- Running analysis [adjust_cudnn_workspace_size_pass]
--- Running analysis [inference_op_replace_pass]
--- Running analysis [memory_optimize_pass]
I0710 18:48:50.664584 2513 memory_optimize_pass.cc:222] Cluster name : shape_28.tmp_0_slice_0 size: 8
I0710 18:48:50.664654 2513 memory_optimize_pass.cc:222] Cluster name : shape_0.tmp_0_slice_0 size: 8
I0710 18:48:50.664659 2513 memory_optimize_pass.cc:222] Cluster name : concat_1.tmp_0 size: -2147483648
I0710 18:48:50.664661 2513 memory_optimize_pass.cc:222] Cluster name : transpose_0.tmp_0 size: 1073741824
I0710 18:48:50.664664 2513 memory_optimize_pass.cc:222] Cluster name : relu_78.tmp_0 size: 50331648
I0710 18:48:50.664673 2513 memory_optimize_pass.cc:222] Cluster name : batch_norm_305.tmp_2 size: 1509949440
I0710 18:48:50.664676 2513 memory_optimize_pass.cc:222] Cluster name : batch_norm_196.tmp_2 size: 50331648
I0710 18:48:50.664680 2513 memory_optimize_pass.cc:222] Cluster name : relu_227.tmp_0 size: 12582912
I0710 18:48:50.664685 2513 memory_optimize_pass.cc:222] Cluster name : batch_norm_200.tmp_2 size: 25165824
I0710 18:48:50.664688 2513 memory_optimize_pass.cc:222] Cluster name : x size: 25165824
I0710 18:48:50.664702 2513 memory_optimize_pass.cc:222] Cluster name : relu_171.tmp_0 size: 25165824
I0710 18:48:50.664711 2513 memory_optimize_pass.cc:222] Cluster name : batch_norm_930.tmp_1 size: 768
I0710 18:48:50.664716 2513 memory_optimize_pass.cc:222] Cluster name : concat_0.tmp_0 size: 1509949440
I0710 18:48:50.664718 2513 memory_optimize_pass.cc:222] Cluster name : tmp_310 size: 3145728
I0710 18:48:50.664721 2513 memory_optimize_pass.cc:222] Cluster name : bilinear_interp_v2_35.tmp_0 size: 76
--- Running analysis [ir_graph_to_program_pass]
I0710 18:48:51.751169 2513 analysis_predictor.cc:1660] ======= optimize end =======
I0710 18:48:51.776242 2513 naive_executor.cc:164] --- skip [feed], feed -> x
I0710 18:48:51.808507 2513 naive_executor.cc:164] --- skip [argmax_0.tmp_0], fetch -> fetch
W0710 18:48:51.966293 2513 gpu_resources.cc:119] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 11.2, Runtime API Version: 11.6
W0710 18:48:51.974512 2513 gpu_resources.cc:149] device: 0, cuDNN Version: 8.4.
Traceback (most recent call last):
  File "/home/aistudio/PaddleSeg-2.6.0/deploy/python/infer.py", line 430, in <module>
    main(args)
  File "/home/aistudio/PaddleSeg-2.6.0/deploy/python/infer.py", line 418, in main
    predictor.run(imgs_list)
  File "/home/aistudio/PaddleSeg-2.6.0/deploy/python/infer.py", line 375, in run
    self.predictor.run()
ValueError: (InvalidArgument) The 2-th dimension of input[0] and input[1] is expected to be equal.But received input[0]'s shape = [1, 512, 1024, 512], input[1]'s shape = [1, 512, 512, 1024].
[Hint: Expected inputs_dims[0][j] == inputs_dims[i][j], but received inputs_dims[0][j]:1024 != inputs_dims[i][j]:512.] (at ../paddle/phi/kernels/funcs/concat_funcs.h:83)
[operator < concat > error]

Is this caused by the shape of my input data, or by the model?

@Asthestarsfalll
Contributor

@Siiiiiigma It is probably a problem with the shape of the input data.

@Siiiiiigma
Author

https://github.com/PaddlePaddle/PaddleSeg/blob/release/2.8/docs/deployment/inference/python_inference_cn.md
I get the same error even with the cityscapes_demo.png provided at that link, so it does not look like a shape problem. Am I missing a preprocessing step?

@Asthestarsfalll
Contributor

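> https://github.com/PaddlePaddle/PaddleSeg/blob/release/2.8/docs/deployment/inference/python_inference_cn.md I get the same error even with the cityscapes_demo.png provided at that link, so it does not look like a shape problem. Am I missing a preprocessing step?

The error shows that the tensors have mismatched shapes at a concat inside the model. Could you try the develop branch?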
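
The two shapes in the error, [1, 512, 1024, 512] and [1, 512, 512, 1024], differ only in that height and width are swapped. One hypothesis consistent with this, not confirmed anywhere in this thread, is that the model was exported with --input_shape 1 3 2048 1024 (height 2048, width 1024 in NCHW order) while a 2048*1024 image read as width x height arrives as 1 3 1024 2048. The sketch below mimics the concat shape check in plain Python; `check_concat` and `nchw_shape` are hypothetical helpers, not Paddle functions.

```python
def check_concat(shapes, axis=1):
    """Return the concatenated shape, or raise ValueError if a non-concat dimension differs."""
    first = shapes[0]
    for i, shape in enumerate(shapes[1:], start=1):
        for dim, (a, b) in enumerate(zip(first, shape)):
            if dim != axis and a != b:
                raise ValueError(
                    f"dimension {dim} of input[0] and input[{i}] differ: {a} != {b}"
                )
    out = list(first)
    out[axis] = sum(shape[axis] for shape in shapes)
    return tuple(out)


def nchw_shape(width, height, channels=3, batch=1):
    """NCHW puts height before width, so a 2048 x 1024 (W x H) image is (1, 3, 1024, 2048)."""
    return (batch, channels, height, width)


# Shapes copied from the error message: the check fails at dimension 2.
try:
    check_concat([(1, 512, 1024, 512), (1, 512, 512, 1024)])
    error_text = ""
except ValueError as exc:
    error_text = str(exc)

# With consistent height and width, concatenation along the channel axis succeeds.
ok_shape = check_concat([(1, 512, 512, 1024), (1, 512, 512, 1024)])
image_shape = nchw_shape(width=2048, height=1024)
print(error_text, ok_shape, image_shape)
```

If this hypothesis holds, re-exporting with a height-first shape that matches the test image would be one thing to try.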

@Siiiiiigma
Author

@Asthestarsfalll
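I tried both the source-installed development version (2.8.0) locally and the 2.6.0 version provided by the AI Studio notebook, testing with cityscapes_demo.png in each case. The problem persists and the error occurs at the same place. Please check whether there is a bug inside the model.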

@TingquanGao
Collaborator

Thanks for this issue. As it has been inactive for a long time, we are closing it. If you have any questions, please feel free to reopen it or open a new issue, and we will follow up and resolve it.


4 participants