-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[ci] introduce CI jobs that mimic CRAN gcc-ASAN and clang-ASAN tests (fixes #4674) #4678
Conversation
.github/workflows/r_package.yml
Outdated
env: | ||
# env variables from CRAN's configuration: https://www.stats.ox.ac.uk/pub/bdr/memtests/README.txt | ||
ASAN_OPTIONS: "detect_leaks=0:detect_odr_violation=0" | ||
UBSAN_OPTIONS: "print_stacktrace=1" | ||
RJAVA_JVM_STACK_WORKAROUND: 0 | ||
RGL_USE_NULL: true | ||
R_DONT_USE_TK: true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like these variables are already set in the r-devel Docker on which r-debug Docker is based:
https://github.com/wch/r-debug/blob/77369304e2458974a31c3598219297e04d64f969/r-devel/Dockerfile#L111-L116
https://github.com/wch/r-debug/blob/77369304e2458974a31c3598219297e04d64f969/r-debug/Dockerfile#L4
...
https://github.com/wch/r-debug/blob/77369304e2458974a31c3598219297e04d64f969/r-debug-1/Dockerfile#L4
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh great! I did also just see this conversation in that project's issues about keeping these images' configuration in sync with CRAN: wch/r-debug#21
Since it looks like the the current image is up to date with CRAN, I'll remove this configuration and check that these jobs still reproduce the issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just pushed 6389d2a, which removes these env variables.
I intentionally did not merge the latest master
into this branch, since it include a fix for the issues this test is supposed to catch (#4673).
If the next round of CI builds reproduces that issue, I'll update this PR's branch to the latest master
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ha sorry, and cec4267. Realized I included an inaccurate comment copy-pasting from another part of the template.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I intentionally did not merge the latest master
Ahhhh, I was wondering why after pushing these changes, I couldn't replicate the test errors! But now I understand...the version of the code tested at CI will be the state of this branch merged into master
, so just having #4673 on master
is enough to make the tests in this PR now pass.
From the checkout
task on https://github.com/microsoft/LightGBM/runs/3971480759?check_suite_focus=true
I'll update this to latest master
and temporarily add a revert commit reverting #4673, just so we can test that the CI jobs are doing what they should.
I am still travelling though, so might not be able to return to this for for 2-3 days.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok yeah, seems to be working! As of the most recent state of this branch, these new CI jobs are reproducing the issues found in CRAN's clang-ASAN
and gcc-ASAN
checks.
clang-ASAN
link to failing job: https://github.com/microsoft/LightGBM/runs/3971678264?check_suite_focus=true
evidence from logs that ASAN and UBSAN are both being used.
clang++ -fsanitize=address,undefined -fno-sanitize=float-divide-by-zero -fno-sanitize=alignment -fno-omit-frame-pointer -frtti -std=gnu++11 -I"/usr/local/RDcsan/lib/R/include" -DNDEBUG -I./include -DEIGEN_MPL2_ONLY -DMM_PREFETCH=1 -DMM_MALLOC=1 -DUSE_SOCKET -DLGB_R_BUILD -I/usr/local/include -pthread -fpic -g -O0 -Wall -pedantic -c lightgbm_R.cpp -o lightgbm_R.o
gcc-ASAN
link to failing job: https://github.com/microsoft/LightGBM/runs/3971678238?check_suite_focus=true
evidence from logs that ASAN and UBSAN are both being used
g++ -fsanitize=address,undefined,bounds-strict -fno-omit-frame-pointer -std=gnu++11 -I"/usr/local/RDsan/lib/R/include" -DNDEBUG -I./include -DEIGEN_MPL2_ONLY -DMM_PREFETCH=1 -DMM_MALLOC=1 -DUSE_SOCKET -DLGB_R_BUILD -I/usr/local/include -DSWITCH_TO_REFCNT -pthread -fpic -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g -O0 -Wall -Wall -pedantic -c lightgbm_R.cpp -o lightgbm_R.o
I'll add the changes from #4673 back in and mark this ready for review.
I think this section of README should be also updated. |
Ah yes, definitely! Thanks for the reminder. Just updated it in 6389d2a. |
Ok I think this is ready for review! Added #4673 fixed the specific issues raised by CRAN and the CI jobs introduced here should hopefully help us to catch such issues in the future, during development. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot! Just two very minor suggestions below:
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
This pull request has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this. |
#4674 documents an issue in LightGBM discovered by CRAN's checks using the address sanitizer for
gcc
andclang
.This PR proposes introducing CI jobs that I believe can reproduce that issue, and which I hope will help us to catch similar issues during development in the future (to reduce the risk of CRAN rejections).
gcc
: https://github.com/jameslamb/LightGBM/pull/49/checks?check_run_id=3890682813clang
: https://github.com/jameslamb/LightGBM/pull/49/checks?check_run_id=3890682831