[ci] introduce CI jobs that mimic CRAN gcc-ASAN and clang-ASAN tests (fixes #4674) #4678

Merged: StrikerRUS merged 15 commits into master from ci/cran-asan on Oct 26, 2021


jameslamb
Collaborator

#4674 documents an issue in LightGBM discovered by CRAN's checks using the address sanitizer for gcc and clang.

This PR proposes introducing CI jobs that I believe can reproduce that issue, and which I hope will help us to catch similar issues during development in the future (to reduce the risk of CRAN rejections).

Comment on lines 191 to 197
env:
  # env variables from CRAN's configuration: https://www.stats.ox.ac.uk/pub/bdr/memtests/README.txt
  ASAN_OPTIONS: "detect_leaks=0:detect_odr_violation=0"
  UBSAN_OPTIONS: "print_stacktrace=1"
  RJAVA_JVM_STACK_WORKAROUND: 0
  RGL_USE_NULL: true
  R_DONT_USE_TK: true
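For readers unfamiliar with these variables: `ASAN_OPTIONS` and `UBSAN_OPTIONS` are colon-separated lists of `key=value` runtime settings for the sanitizers. The snippet below is a minimal sketch, not part of this PR, that splits the two values shown above into dictionaries so the individual settings are easier to read. The helper name `parse_sanitizer_options` is hypothetical, and the sketch assumes colon is the only separator in use.

```python
def parse_sanitizer_options(value: str) -> dict:
    """Split a colon-separated 'key=value' string into a dict (hypothetical helper)."""
    options = {}
    for item in value.split(":"):
        item = item.strip()
        if not item:
            continue  # tolerate empty segments such as a trailing colon
        key, _, val = item.partition("=")
        options[key] = val
    return options


# Values copied from the env block in the review comment above.
asan = parse_sanitizer_options("detect_leaks=0:detect_odr_violation=0")
ubsan = parse_sanitizer_options("print_stacktrace=1")
print(asan)
print(ubsan)
```

Read this way, the configuration turns off leak detection and ODR-violation detection for ASAN, and asks UBSAN to print a stack trace with each report.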
Collaborator

Collaborator Author

oh great! I did also just see this conversation in that project's issues about keeping these images' configuration in sync with CRAN: wch/r-debug#21

Since it looks like the current image is up to date with CRAN, I'll remove this configuration and check that these jobs still reproduce the issue.

Collaborator Author

I just pushed 6389d2a, which removes these env variables.

I intentionally did not merge the latest master into this branch, since it includes a fix for the issues this test is supposed to catch (#4673).

If the next round of CI builds reproduces that issue, I'll update this PR's branch to the latest master.

Collaborator Author

Ha, sorry, and cec4267 too. I realized I had included an inaccurate comment while copy-pasting from another part of the template.

Collaborator Author

> I intentionally did not merge the latest master

Ahhhh, I was wondering why I couldn't replicate the test errors after pushing these changes! But now I understand: the version of the code tested in CI is the state of this branch merged into master, so just having #4673 on master is enough to make the tests in this PR pass.

From the checkout task on https://github.com/microsoft/LightGBM/runs/3971480759?check_suite_focus=true

HEAD is now at d3763ec Merge eddfffd into d88b445

I'll update this to latest master and temporarily add a revert commit reverting #4673, just so we can test that the CI jobs are doing what they should.

I am still travelling though, so I might not be able to return to this for 2-3 days.
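The checkout log line quoted in this comment is what gives the merge behaviour away. As a small illustrative sketch (not code from this PR), the snippet below pulls the three abbreviated SHAs out of such a line. The function name `parse_checkout_line` and the group names are hypothetical, and reading the second SHA as the PR branch tip and the third as the master tip is an assumption based on the usual "Merge <head> into <base>" wording.

```python
import re

# Matches lines like: "HEAD is now at <merge> Merge <head> into <base>"
MERGE_LINE = re.compile(
    r"HEAD is now at (?P<merge>[0-9a-f]+) "
    r"Merge (?P<head>[0-9a-f]+) into (?P<base>[0-9a-f]+)"
)


def parse_checkout_line(line: str):
    """Return the merge/head/base SHAs, or None if this is not a merge checkout."""
    match = MERGE_LINE.search(line)
    return match.groupdict() if match else None


# Log line copied from the checkout task quoted above.
result = parse_checkout_line("HEAD is now at d3763ec Merge eddfffd into d88b445")
print(result)
```

If the line matches, CI built a temporary merge commit rather than the branch tip alone, so anything already on master (here, the fix from #4673) is included in the tested code.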

Collaborator Author

Ok yeah, seems to be working! As of the most recent state of this branch, these new CI jobs are reproducing the issues found in CRAN's clang-ASAN and gcc-ASAN checks.

clang-ASAN

link to failing job: https://github.com/microsoft/LightGBM/runs/3971678264?check_suite_focus=true

evidence from logs that ASAN and UBSAN are both being used.

clang++ -fsanitize=address,undefined -fno-sanitize=float-divide-by-zero -fno-sanitize=alignment -fno-omit-frame-pointer -frtti -std=gnu++11 -I"/usr/local/RDcsan/lib/R/include" -DNDEBUG -I./include -DEIGEN_MPL2_ONLY -DMM_PREFETCH=1 -DMM_MALLOC=1 -DUSE_SOCKET -DLGB_R_BUILD -I/usr/local/include -pthread -fpic -g -O0 -Wall -pedantic -c lightgbm_R.cpp -o lightgbm_R.o

gcc-ASAN

link to failing job: https://github.com/microsoft/LightGBM/runs/3971678238?check_suite_focus=true

evidence from logs that ASAN and UBSAN are both being used

g++ -fsanitize=address,undefined,bounds-strict -fno-omit-frame-pointer -std=gnu++11 -I"/usr/local/RDsan/lib/R/include" -DNDEBUG -I./include -DEIGEN_MPL2_ONLY -DMM_PREFETCH=1 -DMM_MALLOC=1 -DUSE_SOCKET -DLGB_R_BUILD -I/usr/local/include -DSWITCH_TO_REFCNT -pthread -fpic -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g -O0 -Wall -Wall -pedantic -c lightgbm_R.cpp -o lightgbm_R.o
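The same check can be done mechanically instead of by eye. The following is a rough sketch, not part of this PR, that collects the `-fsanitize=` and `-fno-sanitize=` values from a compiler command line. The function name `sanitizer_flags` is hypothetical, and the two commands are shortened copies of the log lines above with the unrelated flags dropped.

```python
import shlex


def sanitizer_flags(command: str):
    """Return (enabled, disabled) sets of sanitizer names found in a compile command."""
    enabled, disabled = set(), set()
    for token in shlex.split(command):
        if token.startswith("-fsanitize="):
            enabled.update(token[len("-fsanitize="):].split(","))
        elif token.startswith("-fno-sanitize="):
            disabled.update(token[len("-fno-sanitize="):].split(","))
    return enabled, disabled


# Shortened versions of the clang and gcc invocations from the CI logs.
clang_cmd = (
    "clang++ -fsanitize=address,undefined -fno-sanitize=float-divide-by-zero "
    "-fno-sanitize=alignment -fno-omit-frame-pointer -c lightgbm_R.cpp -o lightgbm_R.o"
)
gcc_cmd = (
    "g++ -fsanitize=address,undefined,bounds-strict -fno-omit-frame-pointer "
    "-c lightgbm_R.cpp -o lightgbm_R.o"
)

clang_enabled, clang_disabled = sanitizer_flags(clang_cmd)
gcc_enabled, gcc_disabled = sanitizer_flags(gcc_cmd)
print(sorted(clang_enabled), sorted(clang_disabled))
print(sorted(gcc_enabled), sorted(gcc_disabled))
```

Both commands enable `address` and `undefined`, which is the evidence that ASAN and UBSAN are active in each job.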

I'll add the changes from #4673 back in and mark this ready for review.

@StrikerRUS
Collaborator

StrikerRUS commented Oct 17, 2021

I think this section of README should be also updated.
https://github.com/microsoft/LightGBM/blob/master/R-package/README.md#ubsan

@jameslamb
Collaborator Author

> I think this section of README should be also updated.

Ah yes, definitely! Thanks for the reminder.

Just updated it in 6389d2a.

@jameslamb jameslamb changed the title from "WIP: [ci] introduce CI jobs that mimic CRAN gcc-ASAN and clang-ASAN tests" to "[ci] introduce CI jobs that mimic CRAN gcc-ASAN and clang-ASAN tests (fixes #4674)" on Oct 24, 2021
@jameslamb
Collaborator Author

Ok I think this is ready for review! Added (fixes #4674) to it, since it addresses the remaining items noted in #4674 (comment).

#4673 fixed the specific issues raised by CRAN and the CI jobs introduced here should hopefully help us to catch such issues in the future, during development.

This was referenced Oct 25, 2021
Collaborator

@StrikerRUS StrikerRUS left a comment

Thanks a lot! Just two very minor suggestions below:

R-package/README.md (outdated, resolved)
R-package/README.md (outdated, resolved)
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
@StrikerRUS StrikerRUS merged commit f6c5574 into master Oct 26, 2021
@StrikerRUS StrikerRUS deleted the ci/cran-asan branch October 26, 2021 23:25
@github-actions

This pull request has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 23, 2023