
About the bug of paddleocr running with distributed.launch #13218

Closed
LDOUBLEV opened this issue Jun 29, 2024 · 7 comments · Fixed by #13275
Labels
bug Something isn't working

Comments

@LDOUBLEV
Collaborator

Problem Description

The place where the GPU id is obtained needs to be improved: when paddleocr is launched on multiple cards with paddle.distributed.launch, doesn't the model end up running only on the first card?
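
For illustration only (a sketch, not the current PaddleOCR code): each process started by paddle.distributed.launch is told which card it owns, and the predictor should bind to that device instead of always opening card 0.

```python
import paddle.distributed as dist

# Sketch: read the device assigned to this launched process and use it
# when building the predictor, instead of a hard-coded gpu_id of 0.
env = dist.ParallelEnv()
print("rank:", env.rank, "assigned device id:", env.device_id)
# e.g. config.enable_use_gpu(args.gpu_mem, env.device_id)
```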

Runtime Environment

  • OS:
  • Paddle:
  • PaddleOCR:

Reproduction Code

Complete Error Message

Possible solutions

Appendix

@GreatV
Collaborator

GreatV commented Jun 29, 2024

@LDOUBLEV, could you find some time to fix this?
It seems inference has always only been able to run on a single GPU.

GreatV added the bug (Something isn't working) label on Jun 29, 2024
@GreatV
Collaborator

GreatV commented Jun 29, 2024

Also, looking at the code just above, on Windows it simply returns gpu_id=0.
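
Roughly what a platform-independent lookup could look like (just a sketch, not the existing PaddleOCR helper):

```python
import os

def get_visible_gpu_id(default=0):
    # Sketch: read CUDA_VISIBLE_DEVICES directly from the environment,
    # which also works on Windows instead of returning 0 unconditionally.
    devices = os.environ.get("CUDA_VISIBLE_DEVICES", "").strip()
    if not devices:
        return default
    # If several ids are visible, the first one is the card this process uses.
    return int(devices.split(",")[0])
```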

@LDOUBLEV
Collaborator Author

LDOUBLEV commented Jul 1, 2024

Could you folks fix it?

config.enable_use_gpu(args.gpu_mem, args.gpu_id)

The main thing is to confirm with the inference folks how the parameters in the inference Config are supposed to be set: does a gpu_id have to be passed in, and can inference be run under distributed.launch at all? (sketch below)

> @LDOUBLEV, could you find some time to fix this? It seems inference has always only been able to run on a single GPU.
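
For reference, a minimal sketch of how the gpu_id ends up in the Paddle Inference config (the model paths here are placeholders):

```python
import paddle.inference as paddle_infer

# Sketch: enable_use_gpu takes the initial GPU memory pool size (MB) and
# the device id, so some gpu_id has to be chosen when the config is built.
config = paddle_infer.Config("inference.pdmodel", "inference.pdiparams")
config.enable_use_gpu(500, 0)  # 500 MB pool on GPU 0
predictor = paddle_infer.create_predictor(config)
```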

@jzhang533
Collaborator

@LDOUBLEV Weiwei, the PaddleOCR project is currently maintained mostly out of love, by volunteers. GreatV isn't a Baidu employee either, so there's no way to reach the inference people.

@GreatV
Collaborator

GreatV commented Jul 1, 2024

Judging from https://www.paddlepaddle.org.cn/inference/master/api_reference/cxx_api_doc/Config/GPUConfig.html#gpu, gpu_id does have to be passed in.

@LDOUBLEV
Collaborator Author

LDOUBLEV commented Jul 3, 2024

> Judging from https://www.paddlepaddle.org.cn/inference/master/api_reference/cxx_api_doc/Config/GPUConfig.html#gpu, gpu_id does have to be passed in.

Looks like it. Thinking it over, using distributed.launch to run inference in parallel isn't really a good fit anyway. This case came from a user whose main requirement is running ppocr inference on multiple cards in parallel; that can be done by wrapping it in multiprocessing and initializing a ppocr inference model on each card (see the sketch below).
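
Something along these lines, as a rough sketch with the paddleocr Python package (file names and the number of cards are illustrative):

```python
import os
import multiprocessing as mp

def worker(gpu_id, image_paths):
    # Pin this process to a single card before paddleocr/paddle is imported,
    # so the inference model it builds lives on that GPU only.
    os.environ["CUDA_VISIBLE_DEVICES"] = str(gpu_id)
    from paddleocr import PaddleOCR
    ocr = PaddleOCR(use_angle_cls=True, lang="ch")
    for path in image_paths:
        print(gpu_id, path, ocr.ocr(path))

if __name__ == "__main__":
    mp.set_start_method("spawn")  # avoid inheriting a CUDA context via fork
    images = ["0.jpg", "1.jpg", "2.jpg", "3.jpg"]  # illustrative
    num_gpus = 2
    procs = [
        mp.Process(target=worker, args=(i, images[i::num_gpus]))
        for i in range(num_gpus)
    ]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
```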

That said, the Windows handling there is still buggy.

We could also add a warning telling users that inference runs on the first card by default. @GreatV
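
Roughly like this, as a sketch of that warning (the condition and logger name are illustrative):

```python
import os
import logging

logger = logging.getLogger("ppocr")

# Sketch: warn when several cards are visible but inference is only going
# to be bound to the first one.
visible = os.environ.get("CUDA_VISIBLE_DEVICES", "")
if "," in visible:
    logger.warning(
        "Multiple GPUs are visible (%s), but inference will run on the first "
        "card only; start one process per GPU for parallel inference.",
        visible,
    )
```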

@GreatV
Collaborator

GreatV commented Jul 3, 2024

> We could also add a warning telling users that inference runs on the first card by default.
@LDOUBLEV Got it.

GreatV linked a pull request on Jul 6, 2024 that will close this issue
github-actions bot locked this issue as resolved and limited conversation to collaborators on Nov 11, 2024