make dynamic_loader more safe and enhance error message on windows #28117

Merged (1 commit) on Oct 21, 2020

@zhwesky2010 zhwesky2010 commented Oct 20, 2020

PR types

Bug fixes

PR changes

Others

Describe

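Fixes a bug in loading the CUDA 10.1/10.2 dynamic libraries on Windows, and improves robustness and error messages.

NVIDIA's naming scheme for the CUDA dynamic libraries varies between releases, for example:

  • The 10.0 libraries are named with major + minor version: curand64_100.dll
  • The 10.2 libraries are named with the major version only: curand64_10.dll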
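
To make the two naming schemes concrete, here is a minimal Python sketch that builds both candidate names for one library. The helper `candidate_dll_names` is hypothetical and is not part of Paddle; it only illustrates the pattern described above.

```python
def candidate_dll_names(stem: str, major: int, minor: int) -> str:
    """Return a ';'-separated list of possible DLL names (hypothetical helper)."""
    full = f"{stem}_{major}{minor}.dll"  # major + minor scheme, e.g. curand64_100.dll
    short = f"{stem}_{major}.dll"        # major-only scheme, e.g. curand64_10.dll
    return ";".join([full, short])


print(candidate_dll_names("curand64", 10, 2))
```

Listing the more specific name first is an assumption of this sketch; it simply means the major-only name acts as a fallback.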

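Windows currently loads dynamic libraries through a decorated LoadLibrary function, which is not very robust. This PR fixes that and makes loading safer:

  1. Paddle tries to load from Flags_cuda_dir, a specified path, and the environment variables, but on Windows only the first method was attempted. Now all of them are tried before an exception is thrown, which aligns with Linux.
  2. Multiple candidate library names can be specified, separated by ;, for example curand64_102.dll;curand64_10.dll. They are tried one by one, and loading fails only when none of them can be loaded.
  3. The CUDA path found at cmake time is passed into dynamic_loader as an additional search path, which raises the chance of success.
  4. The Windows error message shown when cudnn fails to load now reminds the user to download cudnn, extract it, and copy it into the CUDA folder.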
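
The fallback behaviour in items 1 and 2 can be sketched roughly as follows. This is an illustration in Python rather than the actual C++ change: `load_first_available`, its parameters, and the order in which directories and names are combined are all assumptions, and the real loader is injected as a callback so the logic can be exercised without Windows.

```python
import os
from typing import Any, Callable, Iterable, Optional


def load_first_available(
    names: str,
    search_dirs: Iterable[str],
    try_load: Callable[[str], Optional[Any]],
) -> Any:
    """Try every (directory, name) pair; raise only after all of them fail.

    names       -- ';'-separated candidate file names
    search_dirs -- directories to try in order ('' means the default search path)
    try_load    -- hypothetical callback returning a handle, or None on failure
    """
    candidates = [n.strip() for n in names.split(";") if n.strip()]
    tried = []
    for directory in search_dirs:
        for name in candidates:
            path = os.path.join(directory, name) if directory else name
            handle = try_load(path)
            if handle is not None:
                return handle  # stop at the first library that loads
            tried.append(path)
    # Only reached when every candidate in every directory failed.
    raise OSError("Failed to load any of: " + ", ".join(tried))
```

In practice `try_load` could wrap `ctypes.CDLL` in a `try`/`except OSError` that returns `None` on failure. Collecting every attempted path in the final error is what makes the message useful when nothing loads.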

@paddle-bot-old
Thanks for your contribution!
Please wait for the CI results first. See the Paddle CI Manual for details.

@zhwesky2010 zhwesky2010 changed the title fix dynamic_loader more safe and error message on windows fix dynamic_loader more safe and enhance error message on windows Oct 20, 2020

@luotao1 luotao1 left a comment

LGTM

@zhwesky2010 zhwesky2010 merged commit 5d70002 into PaddlePaddle:develop Oct 21, 2020
@zhwesky2010 zhwesky2010 changed the title fix dynamic_loader more safe and enhance error message on windows make dynamic_loader more safe and enhance error message on windows Oct 28, 2020
2 participants