
About the bug of paddleocr running with distributed.launch #13218

Closed
LDOUBLEV opened this issue Jun 29, 2024 · 7 comments · Fixed by #13275
Labels
bug Something isn't working

Comments

@LDOUBLEV
Collaborator

Problem Description

The place where the GPU id is obtained needs to be improved: when paddleocr is launched on multiple cards with paddle.distributed.launch, doesn't the model end up running only on the first card?
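
For illustration only (a sketch, not the current PaddleOCR code): each process started by paddle.distributed.launch is told which card it owns, and the predictor should bind to that device instead of always opening card 0.

```python
import paddle.distributed as dist

# Sketch: read the device assigned to this launched process and use it
# when building the predictor, instead of a hard-coded gpu_id of 0.
env = dist.ParallelEnv()
print("rank:", env.rank, "assigned device id:", env.device_id)
# e.g. config.enable_use_gpu(args.gpu_mem, env.device_id)
```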

Runtime Environment

  • OS:
  • Paddle:
  • PaddleOCR:

Reproduction Code

Complete Error Message

Possible solutions

Appendix

@GreatV
Collaborator

GreatV commented Jun 29, 2024

@LDOUBLEV, could you find some time to fix this?
It seems inference has always only been able to run on a single GPU.

GreatV added the bug (Something isn't working) label on Jun 29, 2024
@GreatV
Collaborator

GreatV commented Jun 29, 2024

Also, looking at the code just above, on Windows it simply returns gpu_id=0.
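
Roughly what a platform-independent lookup could look like (just a sketch, not the existing PaddleOCR helper):

```python
import os

def get_visible_gpu_id(default=0):
    # Sketch: read CUDA_VISIBLE_DEVICES directly from the environment,
    # which also works on Windows instead of returning 0 unconditionally.
    devices = os.environ.get("CUDA_VISIBLE_DEVICES", "").strip()
    if not devices:
        return default
    # If several ids are visible, the first one is the card this process uses.
    return int(devices.split(",")[0])
```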

@LDOUBLEV
Collaborator Author

LDOUBLEV commented Jul 1, 2024

Could you folks fix it?

config.enable_use_gpu(args.gpu_mem, args.gpu_id)

The main thing is to confirm with the inference folks how the parameters in the inference Config are supposed to be set: does a gpu_id have to be passed in, and can inference be run under distributed.launch at all? (sketch below)

> @LDOUBLEV, could you find some time to fix this? It seems inference has always only been able to run on a single GPU.
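
For reference, a minimal sketch of how the gpu_id ends up in the Paddle Inference config (the model paths here are placeholders):

```python
import paddle.inference as paddle_infer

# Sketch: enable_use_gpu takes the initial GPU memory pool size (MB) and
# the device id, so some gpu_id has to be chosen when the config is built.
config = paddle_infer.Config("inference.pdmodel", "inference.pdiparams")
config.enable_use_gpu(500, 0)  # 500 MB pool on GPU 0
predictor = paddle_infer.create_predictor(config)
```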

@jzhang533
Collaborator

@LDOUBLEV Weiwei, the PaddleOCR project is currently maintained mostly out of love, by volunteers. GreatV isn't a Baidu employee either, so there's no way to reach the inference people.

@GreatV
Collaborator

GreatV commented Jul 1, 2024

Judging from https://www.paddlepaddle.org.cn/inference/master/api_reference/cxx_api_doc/Config/GPUConfig.html#gpu, gpu_id does have to be passed in.

@LDOUBLEV
Collaborator Author

LDOUBLEV commented Jul 3, 2024

> Judging from https://www.paddlepaddle.org.cn/inference/master/api_reference/cxx_api_doc/Config/GPUConfig.html#gpu, gpu_id does have to be passed in.

Looks like it. Thinking it over, using distributed.launch to run inference in parallel isn't really a good fit anyway. This case came from a user whose main requirement is running ppocr inference on multiple cards in parallel; that can be done by wrapping it in multiprocessing and initializing a ppocr inference model on each card (see the sketch below).
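
Something along these lines, as a rough sketch with the paddleocr Python package (file names and the number of cards are illustrative):

```python
import os
import multiprocessing as mp

def worker(gpu_id, image_paths):
    # Pin this process to a single card before paddleocr/paddle is imported,
    # so the inference model it builds lives on that GPU only.
    os.environ["CUDA_VISIBLE_DEVICES"] = str(gpu_id)
    from paddleocr import PaddleOCR
    ocr = PaddleOCR(use_angle_cls=True, lang="ch")
    for path in image_paths:
        print(gpu_id, path, ocr.ocr(path))

if __name__ == "__main__":
    mp.set_start_method("spawn")  # avoid inheriting a CUDA context via fork
    images = ["0.jpg", "1.jpg", "2.jpg", "3.jpg"]  # illustrative
    num_gpus = 2
    procs = [
        mp.Process(target=worker, args=(i, images[i::num_gpus]))
        for i in range(num_gpus)
    ]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
```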

That said, the Windows handling there is still buggy.

We could also add a warning telling users that inference runs on the first card by default. @GreatV
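
Roughly like this, as a sketch of that warning (the condition and logger name are illustrative):

```python
import os
import logging

logger = logging.getLogger("ppocr")

# Sketch: warn when several cards are visible but inference is only going
# to be bound to the first one.
visible = os.environ.get("CUDA_VISIBLE_DEVICES", "")
if "," in visible:
    logger.warning(
        "Multiple GPUs are visible (%s), but inference will run on the first "
        "card only; start one process per GPU for parallel inference.",
        visible,
    )
```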

@GreatV
Collaborator

GreatV commented Jul 3, 2024

> We could also add a warning telling users that inference runs on the first card by default.
@LDOUBLEV Got it.

GreatV linked a pull request on Jul 6, 2024 that will close this issue
github-actions bot locked this issue as resolved and limited conversation to collaborators on Nov 11, 2024