Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[R-package] [ci] Add test on R package with sanitizers #3439

Merged
merged 21 commits into from
Oct 9, 2020

Conversation

jameslamb
Copy link
Collaborator

This PR adds tests with the sanitizers run by CRAN on package submissions. See this blog post for a lot more background.

The tests take 22 minutes to run, so in this PR I'm proposing that we add them as a manual test that can be triggered by a comment (copying @StrikerRUS 's great work on #3424 ).

This can be triggered by commenting /gha run r-sanitizers-check on a PR.

How this makes LightGBM better

  • catches issues like memory violations in lib_lightgbm
  • allows us to catch issues in CI to improve the likelihood of CRAN accepting submissions of the R package

@StrikerRUS
Copy link
Collaborator

@jameslamb

The tests take 22 minutes to run, so in this PR I'm proposing that we add them as a manual test that can be triggered by a comment

I don't think that 22 min is something unacceptable for us. For example, right now duration of some tests is about 17 min.

image

image

So, given that we have 20 free parallel builds for GitHub Actions, I believe we can run new test just as a normal check.

I remember old times before rebalancing CI jobs when we had to wait about 40min-1h 😄 .

@jameslamb
Copy link
Collaborator Author

haha ok! I'll move it back to the main R GitHub Actions then

@StrikerRUS
Copy link
Collaborator

@jameslamb BTW, haven't you find a way to make one particular check "optional, but if run then required to be passed"?

@jameslamb
Copy link
Collaborator Author

@jameslamb BTW, haven't you find a way to make one particular check "optional, but if run then required to be passed"?

no, I'm not sure how to do that, sorry!

src/c_api.cpp Outdated Show resolved Hide resolved
@jameslamb jameslamb changed the title [WIP] [R-package] [ci] Add manual test on R package with sanitizers [R-package] [ci] Add manual test on R package with sanitizers Oct 8, 2020
@jameslamb jameslamb mentioned this pull request Oct 8, 2020
12 tasks
@StrikerRUS StrikerRUS changed the title [R-package] [ci] Add manual test on R package with sanitizers [R-package] [ci] Add test on R package with sanitizers Oct 8, 2020
Copy link
Collaborator

@StrikerRUS StrikerRUS left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow! Amazing job as always! I'm pleasantly surprised that we can use container and write just few lines to have a sanitizers checks. Looks great overall, just check my minor comments below.

.github/workflows/r_package.yml Outdated Show resolved Hide resolved
.github/workflows/r_package.yml Outdated Show resolved Hide resolved
.github/workflows/r_package.yml Outdated Show resolved Hide resolved
.github/workflows/r_package.yml Outdated Show resolved Hide resolved
run: |
cd R-package/tests
Rscriptdevel testthat.R 2>&1 > ubsan-tests.log
cat ubsan-tests.log
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was it left intentionally or it was for debugging purposes?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left it in intentionally, so that if the job fails you have enough logs to be able to tell what went wrong

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK.

@StrikerRUS
Copy link
Collaborator

StrikerRUS commented Oct 8, 2020

@jameslamb

no, I'm not sure how to do that, sorry!

No problem! Just wanted to be sure that I'm not discovering something that is already known 🙂 .

Looks like it is impossible to do such things with standard GitHub Actions API. But I have an idea to use REST API and query a status of required workflow via it in our cool all-successful job:

all-successful:
# https://git.luolix.topmunity/t/is-it-possible-to-require-all-github-actions-tasks-to-pass-without-enumerating-them/117957/4?u=graingert
runs-on: ubuntu-latest
needs: [test]
steps:
- name: Note that all tests succeeded
run: echo "🎉"

https://docs.github.com/en/free-pro-team@latest/rest/reference/actions#get-a-workflow-run

Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
Copy link
Collaborator

@StrikerRUS StrikerRUS left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 step towards CRAN, nice!

@StrikerRUS StrikerRUS merged commit 3f71d75 into master Oct 9, 2020
@StrikerRUS StrikerRUS deleted the ci/r-ubsan-gcc branch October 9, 2020 13:51
@StrikerRUS
Copy link
Collaborator

@jameslamb What do you think about moving all out R jobs into containers? It looks quite comfortable to have all R infrastructure and do not waste CI time to install it (very often with network errors, BTW) by our own. 🙂

@StrikerRUS
Copy link
Collaborator

StrikerRUS commented Oct 10, 2020

But I have an idea to use REST API and query a status of required workflow via it in our cool all-successful job:

OK, today I went ahead and prepared a draft for this feature. I tested it a little bit and it looks like it can work 👍 .
But it requires two actions from maintainers: drop a comment to trigger optional workflow and re-run (if already completed) main workflow to fetch the latest status of optional one.

# .ci/check_workflow_status.py
import json
from os import environ
from sys import exit
from time import sleep
from urllib import request


def get_runs():
    with request.urlopen("https://api.github.com/repos/microsoft/LightGBM/actions/workflows/test_1.yml/runs") as url:
        data = json.loads(url.read().decode())
    pr_runs = []
    if environ.get("GITHUB_EVENT_NAME", "") == "pull_request":
        pr_runs = [i for i in data['workflow_runs']
                   if i['event'] == 'pull_request_review_comment' and
                   (i.get('pull_requests') and
                    i['pull_requests'][0]['number'] == int(environ.get("GITHUB_REF").split('/')[-2]) or
                    i['head_branch'] == environ.get("GITHUB_HEAD_REF").split('/')[-1])]
    return sorted(pr_runs, key=lambda i: i['run_number'], reverse=True)


def get_status(runs):
    status = 'ok'
    for run in runs:
        if run['status'] == 'completed':
            if run['conclusion'] in {'cancelled', 'skipped'}:
                continue
            if run['conclusion'] in {'failure', 'timed_out'}:
                status = 'fail'
                break
            if run['conclusion'] == 'success':
                break
        if run['status'] in {'in_progress', 'queued'}:
            status = 'rerun'
            break
    return status


if __name__ == "__main__":
    while True:
        status = get_status(get_runs())
        if status != 'rerun':
            break
        sleep(60)
    if status == 'fail':
        exit(1)
# .github/workflows/test_1.yml
name: Test 1

on:
  pull_request_review_comment:
    types: [created]

jobs:
  test:
    name: Test 1
    runs-on: ubuntu-latest
    if: github.event.comment.body == '/gha run' && contains('OWNER,MEMBER,COLLABORATOR', github.event.comment.author_association)
    timeout-minutes: 60
    strategy:
      fail-fast: false
    steps:
      - name: Checkout repository
        uses: actions/checkout@v1
        with:
          fetch-depth: 5
          submodules: true
      - name: Test
        run: |
            sleep 2m
            exit -1
# .github/workflows/r_package.yml

...

  all-successful:
    # https://git.luolix.topmunity/t/is-it-possible-to-require-all-github-actions-tasks-to-pass-without-enumerating-them/117957/4?u=graingert
    runs-on: ubuntu-latest
    needs: [test]
    steps:
      - name: Checkout repository
        uses: actions/checkout@v1
        with:
          fetch-depth: 5
          submodules: false
      - name: Install Python
        uses: actions/setup-python@v2
        with:
          python-version: '3.x'
      - name: Note that all tests succeeded
        run: python "$GITHUB_WORKSPACE/.ci/check_workflow_status.py"

@StrikerRUS
Copy link
Collaborator

But it requires two actions from maintainers: ...

Probably it is possible to simplify it to just one comment with the help of https://docs.github.com/en/free-pro-team@latest/rest/reference/actions#re-run-a-workflow. But I'm not sure about the required permissions though.

@jameslamb
Copy link
Collaborator Author

@jameslamb What do you think about moving all out R jobs into containers? It looks quite comfortable to have all R infrastructure and do not waste CI time to install it (very often with network errors, BTW) by our own. 🙂

I think that would only work for Linux builds, and that that wouldn't help much unfortunately...since most of the temporary errors in CI have been on Mac and Windows.

@jameslamb
Copy link
Collaborator Author

But I have an idea to use REST API and query a status of required workflow via it in our cool all-successful job:

OK, today I went ahead and prepared a draft for this feature. I tested it a little bit and it looks like it can work 👍 .
But it requires two actions from maintainers: drop a comment to trigger optional workflow and re-run (if already completed) main workflow to fetch the latest status of optional one.

# .ci/check_workflow_status.py
import json
from os import environ
from sys import exit
from time import sleep
from urllib import request


def get_runs():
    with request.urlopen("https://api.github.com/repos/microsoft/LightGBM/actions/workflows/test_1.yml/runs") as url:
        data = json.loads(url.read().decode())
    pr_runs = []
    if environ.get("GITHUB_EVENT_NAME", "") == "pull_request":
        pr_runs = [i for i in data['workflow_runs']
                   if i['event'] == 'pull_request_review_comment' and
                   (i.get('pull_requests') and
                    i['pull_requests'][0]['number'] == int(environ.get("GITHUB_REF").split('/')[-2]) or
                    i['head_branch'] == environ.get("GITHUB_HEAD_REF").split('/')[-1])]
    return sorted(pr_runs, key=lambda i: i['run_number'], reverse=True)


def get_status(runs):
    status = 'ok'
    for run in runs:
        if run['status'] == 'completed':
            if run['conclusion'] in {'cancelled', 'skipped'}:
                continue
            if run['conclusion'] in {'failure', 'timed_out'}:
                status = 'fail'
                break
            if run['conclusion'] == 'success':
                break
        if run['status'] in {'in_progress', 'queued'}:
            status = 'rerun'
            break
    return status


if __name__ == "__main__":
    while True:
        status = get_status(get_runs())
        if status != 'rerun':
            break
        sleep(60)
    if status == 'fail':
        exit(1)
# .github/workflows/test_1.yml
name: Test 1

on:
  pull_request_review_comment:
    types: [created]

jobs:
  test:
    name: Test 1
    runs-on: ubuntu-latest
    if: github.event.comment.body == '/gha run' && contains('OWNER,MEMBER,COLLABORATOR', github.event.comment.author_association)
    timeout-minutes: 60
    strategy:
      fail-fast: false
    steps:
      - name: Checkout repository
        uses: actions/checkout@v1
        with:
          fetch-depth: 5
          submodules: true
      - name: Test
        run: |
            sleep 2m
            exit -1
# .github/workflows/r_package.yml

...

  all-successful:
    # https://git.luolix.topmunity/t/is-it-possible-to-require-all-github-actions-tasks-to-pass-without-enumerating-them/117957/4?u=graingert
    runs-on: ubuntu-latest
    needs: [test]
    steps:
      - name: Checkout repository
        uses: actions/checkout@v1
        with:
          fetch-depth: 5
          submodules: false
      - name: Install Python
        uses: actions/setup-python@v2
        with:
          python-version: '3.x'
      - name: Note that all tests succeeded
        run: python "$GITHUB_WORKSPACE/.ci/check_workflow_status.py"

oooo interesting! Are you thinking about something like this for the checks in #3424 and #3443 ?

@StrikerRUS
Copy link
Collaborator

@jameslamb

Are you thinking about something like this for the checks in #3424 and #3443 ?

Yeah, definitely!

@StrikerRUS
Copy link
Collaborator

@guolinke Could you please generate a secret access token for LightGBM repository with public_repo and workflows permissions? I'm working on allowing optional checks to be in either skipped or succeeded status (#3439 (comment)). And I need that token to re-trigger our main GitHub Actions workflow from optional one to workaround additional manual step for maintainer.
Steps 1 and 2 from this guide: https://stevenmortimer.com/running-github-actions-sequentially/#step-1---create-a-personal-access-token-pat.

@guolinke
Copy link
Collaborator

@StrikerRUS
Copy link
Collaborator

StrikerRUS commented Oct 19, 2020

@guolinke Thanks a lot! Unfortunately, I don't have an access to this tab of LightGBM repo settings:

image

image

But I think it shouldn't block me from using WORKFLOW name as a secret value in appropriate place in the workflow 🙂 .

@guolinke
Copy link
Collaborator

guolinke commented Oct 19, 2020

yeah, you can use it as ${{ secrets.WORKFLOW }}

@github-actions
Copy link

This pull request has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 24, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants