Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[R-package] limit exported symbols in DLL #4494

Merged
merged 1 commit into from
Jul 30, 2021
Merged

[R-package] limit exported symbols in DLL #4494

merged 1 commit into from
Jul 30, 2021

Conversation

jameslamb
Copy link
Collaborator

@jameslamb jameslamb commented Jul 30, 2021

Through investigation of #4464, I've learned a lot about how R loads DLLs for packages. Particularly, I've learned that there are some settings where two DLLs having the same symbols can lead to conflicts that cause crashes or incorrect behavior.

This PR proposes a change to limit the visibility of symbols in the lightgbm.dll created for this project's R package.

From https://cran.r-project.org/doc/manuals/R-exts.html#Writing-portable-packages

It is good practice for DLLs to register their symbols (see Registering native routines), restrict visibility (see Controlling visibility) and not allow symbol search (see Registering native routines). It should be possible for a DLL to have only one visible symbol, R_init_pkgname, on suitable platforms, which would completely avoid symbol conflicts

From https://cran.r-project.org/doc/manuals/R-exts.html#Controlling-visibility

...there is an equally effective way to control which entry points are visible [on Windows], by supplying a definitions file pkgnme/src/pkgname-win.def: only entry points listed in that file will be visible

And from https://cran.r-project.org/doc/manuals/R-exts.html#Creating-shared-objects.

Under Windows you can supply an exports definitions file called dllname-win.def: otherwise all entry points in objects (but not libraries) supplied to R CMD SHLIB will be exported from the DLL

How this improves {lightgbm}

This does not fix #4464, but it might help to avoid future similar issues where LightGBM's DLL has exported symbols that conflict with the exports from other DLLs loaded in users' sessions.

How to test this

I used dumpbin.exe to examine the exported symbols from the package DLL before and after this change.

In my case, I used the dumpbin.exe bundled with Visual Studio. Like this:

"C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.24.28314\bin\Hostx64\x64\dumpbin.exe" /EXPORTS "C:\Users\James\Documents\R\win-library\4.1\lightgbm\libs\x64\lightgbm.dll"

Before this change, the CRAN package's DLL had 3983 exported symbols.

`dumpbin` results (click me)
Dump of file C:\Users\James\Documents\R\win-library\4.1\lightgbm\libs\x64\lightgbm.dll

File Type: DLL

  Section contains the following exports for lightgbm.dll

    00000000 characteristics
    61036AC0 time date stamp Fri Jul 30 03:58:08 2021
        0.00 version
           1 ordinal base
        3983 number of functions
        3983 number of names

    ordinal hint RVA      name

          1    0 000B75E0 LGBM_BoosterAddValidData
          2    1 000BC690 LGBM_BoosterAddValidData_R
          3    2 000AEA10 LGBM_BoosterCalcNumPredict
          4    3 000BE160 LGBM_BoosterCalcNumPredict_R
          5    4 000B6E30 LGBM_BoosterCreate
          6    5 000B4FA0 LGBM_BoosterCreateFromModelfile
          7    6 000BC290 LGBM_BoosterCreateFromModelfile_R
          8    7 000BC120 LGBM_BoosterCreate_R
          ...
       3980  F8B 002C9180 __emutls_v._ZN8LightGBM7Network7buffer_E
       3981  F8C 002C90E0 __emutls_v._ZN8LightGBM7Network8linkers_E
       3982  F8D 002CD300 __emutls_v._ZZN8LightGBM3Log14GetLogCallBackEvE8callback
       3983  F8E 002CD320 __emutls_v._ZZN8LightGBM3Log8GetLevelEvE5level

  Summary

        1000 .CRT
        2000 .bss
        5000 .data
       70000 .edata
        2000 .idata
       18000 .pdata
       36000 .rdata
        3000 .reloc
      2C8000 .text
        1000 .tls
       2E000 .xdata

After this change, the CRAN package's DLL had exactly 1 exported symbol.

`dumpbin` results (click me)
Dump of file C:\Users\James\Documents\R\win-library\4.1\lightgbm\libs\x64\lightgbm.dll

File Type: DLL

  Section contains the following exports for lightgbm.dll

    00000000 characteristics
    61036F5D time date stamp Fri Jul 30 04:17:49 2021
        0.00 version
           1 ordinal base
           1 number of functions
           1 number of names

    ordinal hint RVA      name

          1    0 000BE750 R_init_lightgbm

  Summary

        1000 .CRT
        2000 .bss
        5000 .data
        1000 .edata
        2000 .idata
       18000 .pdata
       36000 .rdata
        3000 .reloc
      2C8000 .text
        1000 .tls
       2E000 .xdata

This change does not impact the DLL produced by CMake-based builds. Before and after this change, CMake package's DLL had 151 exported symbols.

`dumpbin` results (click me)
Dump of file C:\Users\James\Documents\R\win-library\4.1\lightgbm\libs\x64\lib_lightgbm.dll

File Type: DLL

  Section contains the following exports for lib_lightgbm.dll

    00000000 characteristics
    FFFFFFFF time date stamp
        0.00 version
           1 ordinal base
         151 number of functions
         151 number of names

    ordinal hint RVA      name

          1    0 00003220 ??0Boosting@LightGBM@@QEAA@XZ
          2    1 00075B30 ??0Dataset@LightGBM@@QEAA@H@Z
          3    2 00075D20 ??0Dataset@LightGBM@@QEAA@XZ
          4    3 00084A30 ??0DatasetLoader@LightGBM@@QEAA@AEBUConfig@1@AEBV?$function@$$A6AXAEBV?$vector@U?$pair@HN@std@@V?$allocator@U?$pair@HN@std@@@2@@std@@PEAN@Z@std@@HPEBD@Z
          ...
        148   93 0018ED50 LGBM_NetworkInitWithFunctions
        149   94 0018EDB0 LGBM_RegisterLogCallback
        150   95 0018EE30 LGBM_SampleIndices
        151   96 00192E50 R_init_lightgbm

  Summary

        F000 .data
       10000 .pdata
       4B000 .rdata
        2000 .reloc
        1000 .rsrc
      1A2000 .text

@jameslamb jameslamb marked this pull request as ready for review July 30, 2021 04:08
@jameslamb jameslamb requested a review from Laurae2 as a code owner July 30, 2021 04:08
Copy link
Collaborator

@StrikerRUS StrikerRUS left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for the detailed description!

@jameslamb jameslamb merged commit 5d5f490 into master Jul 30, 2021
@jameslamb jameslamb deleted the feat/r-def branch July 30, 2021 21:13
@github-actions
Copy link

This pull request has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 23, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[R-package] R package crashes on windows when loaded together with {fansi} or anything that depends on it
2 participants