Action sporadically fails with exec /usr/bin/buildctl: exec format error #313

Closed
3 tasks done
clarkohw opened this issue Apr 8, 2024 · 10 comments

Comments

@clarkohw

clarkohw commented Apr 8, 2024

Contributing guidelines

I've found a bug, and:

  • The documentation does not mention anything about my problem
  • There are no open or closed issues that are related to my problem

Description

The docker/setup-buildx-action@v3 action sporadically fails on the Booting builder step. The sporadic nature of the issue seems similar to #283, but I am not using self-hosted runners and I am getting different error messages.

Expected behaviour

The action should install buildx.

Actual behaviour

Occasionally, maybe 10% of the time, the Booting builder step of the action fails.

Repository URL

No response

Workflow run URL

No response

YAML workflow

integration-tests:
    needs: [ setup-matrix, compile-contracts ]
    timeout-minutes: 20
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix: ${{fromJson(needs.setup-matrix.outputs.matrix)}}
    steps:
      - name: Add hosts to /etc/hosts
        run: |
          sudo echo "127.0.0.1 **.local.**.com" | sudo tee -a /etc/hosts
          sudo echo "127.0.0.1 **.local.**.com" | sudo tee -a /etc/hosts


      - name: Checkout ** from ${{ github.event.pull_request.base.ref }}
        uses: actions/checkout@v2

      - name: Cache contract artifacts
        uses: actions/cache@v3
        with:
          fail-on-cache-miss: true
          path: |
            ./abis/
            ./artifacts/
            ./cache/
            ./typechain-types/
          key: ${{ runner.os }}-compiled-contracts-${{ hashFiles('./contracts/') }}

      - name: Set Branch Name
        run: echo "GH_BRANCH_NAME=${{ github.event_name == 'workflow_dispatch' && github.ref_name || github.event.pull_request.base.ref }}" >> $GITHUB_ENV

      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.11.5'
          cache: 'pip'
          token: ${{ secrets.GH_ADMIN_TOKEN }}
      - run: pip install -r requirements.txt

      - name: Set up Docker
        uses: docker/setup-buildx-action@v3

      - name: Set up AWS CLI
        uses: aws-actions/configure-aws-credentials@v1
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: ${{env.AWS_REGION}}

      - name: Use Node.js ${{ env.NODE_VERSION }}
        uses: actions/setup-node@v4
        with:
          node-version: ${{ env.NODE_VERSION }}
          cache: 'yarn'
      - run: yarn install
      - run: yarn global add pm2

      - name: Install Foundry
        uses: foundry-rs/foundry-toolchain@v1

      - uses: nick-fields/retry@v2
        with:
          timeout_minutes: 10
          max_attempts: 2
          command: python3 -u run.py ${{ matrix.test }} --docker

Workflow logs

Run docker/setup-buildx-action@v3
Docker info
Buildx version
Creating a new builder instance
  /usr/bin/docker buildx create --name builder-20e51f00-009d-499f-ba6b-ec39d5720f3f --driver docker-container --buildkitd-flags --allow-insecure-entitlement security.insecure --allow-insecure-entitlement network.host --use
  builder-20e51f00-009d-499f-ba6b-ec39d5720f3f
Booting builder
  /usr/bin/docker buildx inspect --bootstrap --builder builder-20e51f00-009d-499f-ba6b-ec39d5720f3f
  #1 [internal] booting buildkit
  #1 pulling image moby/buildkit:buildx-stable-1
  #1 pulling image moby/buildkit:buildx-stable-1 0.2s done
  #1 creating container buildx_buildkit_builder-20e51f00-009d-499f-ba6b-ec39d5720f3f0
  #1 17.79 time="2024-04-08T14:25:18Z" level=warning msg="using host network as the defaul#1 creating container buildx_buildkit_builder-20e51f00-009d-499f-ba6b-ec39d5720f3f0 17.6s done
  time="2024-04-08T14:25:18Z" level=warning msg="using host network as the default"
  #1 17.79 time="2024-04-08T14:25:18Z" level=warning msg="skipping containerd worker, as \"/run/containerd/containerd.sock\" does not exist"
  #1 17.79 �dtime="2024-04-08T14:25:18Z" level=info msg="found 1 workers, default=\"lh4fhblojyqn1krqg9m33sxft\""
  #1 17.79 �`time="2024-04-08T14:25:18Z" level=warning msg="currently, only the default worker can be used."
  #1 17.79 �\time="2024-04-08T14:25:18Z" level=info msg="running server on /run/buildkit/buildkitd.sock"
  #1 17.79 time="2024-04-08T14:25:18Z" level=warning msg="skipping containerd worker, as \"/run/containerd/containerd.sock\" does not exist"
  #1 17.79 time="2024-04-08T14:25:18Z" level=warning msg="currently, only the default worker can be used."
  #1 17.79 �time="2024-04-08T14:25:18Z" level=warning msg="currently, only the default worker can be used."
  #1 17.79 exec /usr/bin/buildctl: exec format error
  #1 ERROR: exit code 1
  ------
   > [internal] booting buildkit:
  time="2024-04-08T14:25:18Z" level=warning msg="using host network as the default"
  17.79 time="2024-04-08T14:25:18Z" level=warning msg="skipping containerd worker, as \"/run/containerd/containerd.sock\" does not exist"
  17.79 �dtime="2024-04-08T14:25:18Z" level=info msg="found 1 workers, default=\"lh4fhblojyqn1krqg9m33sxft\""
  17.79 �`time="2024-04-08T14:25:18Z" level=warning msg="currently, only the default worker can be used."
  17.79 �\time="2024-04-08T14:25:18Z" level=info msg="running server on /run/buildkit/buildkitd.sock"
  17.79 time="2024-04-08T14:25:18Z" level=warning msg="skipping containerd worker, as \"/run/containerd/containerd.sock\" does not exist"
  17.79 time="2024-04-08T14:25:18Z" level=warning msg="currently, only the default worker can be used."
  17.79 �time="2024-04-08T14:25:18Z" level=warning msg="currently, only the default worker can be used."
  17.79 exec /usr/bin/buildctl: exec format error
  ------
  ERROR: exit code 1
Error: The process '/usr/bin/docker' failed with exit code 1

But I also recently got this error message:

Run docker/setup-buildx-action@v3
Docker info
Buildx version
Creating a new builder instance
Booting builder
  /usr/bin/docker buildx inspect --bootstrap --builder builder-4f1b9e61-ada5-4527-ae74-af370ab097db
  #1 [internal] booting buildkit
  #1 pulling image moby/buildkit:buildx-stable-1
  #1 pulling image moby/buildkit:buildx-stable-1 0.4s done
  #1 creating container buildx_buildkit_builder-4f1b9e61-ada5-4527-ae74-af370ab097db0
  #1 18.04 /usr/bin/buildkitd: line 0: syntax error: unexpected word (expecting ")")
  #1 creating container buildx_buildkit_builder-4f1b9e61-ada5-4527-ae74-af370ab097db0 17.6s done
  #1 18.04 /usr/bin/buildkitd: line 0: syntax error: unexpected word (expecting ")")
  #1 18.04 /usr/bin/buildkitd: line 0: syntax error: unexpected word (expecting ")")
  #1 18.04 /usr/bin/buildkitd: line 0: syntax error: unexpected word (expecting ")")
  #1 18.04 /usr/bin/buildkitd: line 0: syntax error: unexpected word (expecting ")")
  #1 18.04 /usr/bin/buildkitd: line 0: syntax error: unexpected word (expecting ")")
  #1 18.04 
  #1 ERROR: Error response from daemon: Container 38688efaa8949e35e2e2ff6d861513afbc4cfbdd24e4e40366dd65ef2d6b05dc is restarting, wait until the container is running
  ------
   > [internal] booting buildkit:
  18.04 
  18.04 
  18.04 /usr/bin/buildkitd: line 0: syntax error: unexpected word (expecting ")")
  ------
  ERROR: Error response from daemon: Container 38688efaa8949e35e2e2ff6d861513afbc4cfbdd24e4e40366dd65ef2d6b05dc is restarting, wait until the container is running

BuildKit logs

No response

Additional info

One potentially relevant factor is that we sometimes run many workflows at the same time (>20), so I was wondering whether it could be related to that.
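
Since the BuildKit logs field above was left empty, a failure-only step along these lines could capture more evidence the next time the builder fails to boot. This is a hypothetical sketch, not something taken from the action's documentation: the step name is made up, and it assumes the builder container name starts with buildx_buildkit_ (as in the logs above) and that the container and image are still present when the step runs.

```yaml
      # Hypothetical diagnostic step; place it right after the "Set up Docker" step.
      - name: Dump BuildKit diagnostics (hypothetical)
        if: failure()
        run: |
          # Host CPU architecture, to compare against the image architecture below.
          uname -m
          # OS/architecture of the BuildKit image that was pulled (tag taken from the logs above).
          docker image inspect moby/buildkit:buildx-stable-1 --format '{{.Os}}/{{.Architecture}}' || true
          # Last lines of output from any builder container left behind by the failed boot.
          for c in $(docker ps -a --filter "name=buildx_buildkit_" --format '{{.Names}}'); do
            echo "== $c =="
            docker logs "$c" 2>&1 | tail -n 100 || true
          done
```

An exec format error usually means the kernel could not recognise the file as a valid executable for the host, so comparing the host and image architectures is a cheap first check; it does not by itself rule out a corrupted image layer.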

@gete76

gete76 commented Apr 8, 2024

+1. We also run both GH and self-hosted runners with many parallel workflow runs that use this action. A cache issue?
Pinning to an older version that has been stable for us has worked around the issue:

  - name: Set up Docker Buildx
    uses: docker/setup-buildx-action@v3
    with:
      platforms: linux/amd64
      version: v0.11.2
      buildkitd-flags: --debug
      driver-opts: image=moby/buildkit:v0.11.2
      cache-binary: false

@tonistiigi
Member

> We also run both GH and self-hosted runners with many parallel workflow runs that use this action. A cache issue?
> Pinning to an older version that has been stable for us has worked around the issue:

Let us know if this shows up in the older version as well. There is nothing atm pointing to an issue with our release, and parallel workflow runs are out of our control as well.

I have 100 clean runs in a row in https://github.com/tonistiigi/gh-exec-format-error-debug/actions/runs/8606687477 based on another report. If you can point me to any differences that should be tried instead to reproduce this, then lmk.
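
For anyone wanting to attempt a similar repeated-run check in their own repository, a workflow along these lines boots the builder in several parallel jobs. This is only a sketch and is not the contents of the linked debug repository; the workflow name, job name and matrix size are arbitrary assumptions.

```yaml
# Hypothetical reproduction workflow; all names here are illustrative.
name: buildx-boot-repro
on: workflow_dispatch
jobs:
  boot:
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false          # let every attempt finish so failures can be counted
      matrix:
        attempt: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    steps:
      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3
```

With a failure rate of around 10%, a handful of attempts may well all pass, so the matrix would likely need to be enlarged or the workflow re-run several times before drawing conclusions.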

@osarobo

osarobo commented Apr 9, 2024

For those experiencing this issue, I think @tonistiigi may have used the updated runner image released yesterday; see https://github.com/actions/runner-images/releases.

Try again with the latest docker/setup-buildx-action@v3 version and see if you are still having the unexpected behaviour.

@gete76

gete76 commented Apr 9, 2024

> > We also run both GH and self-hosted runners with many parallel workflow runs that use this action. A cache issue?
> > Pinning to an older version that has been stable for us has worked around the issue:
>
> Let us know if this shows up in the older version as well. There is nothing atm pointing to an issue with our release, and parallel workflow runs are out of our control as well.
>
> I have 100 clean runs in a row in https://github.com/tonistiigi/gh-exec-format-error-debug/actions/runs/8606687477 based on another report. If you can point me to any differences that should be tried instead to reproduce this, then lmk.

Last Friday, the error became very pronounced in our CI merge queue. It was causing almost all merge queue runs to be booted by the end of the day. Reading up on the log message "Error: The process '/usr/bin/docker'" and the other messages about the default network seemed to point to mismatched versions of buildkit and buildx. I tried running the action with just --cache-binary=false to test it with the latest packages, hoping it was a cache issue, but the error still showed up.
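
The cache-only test described above presumably corresponds to a step like the following, with every other input left at its default. This is a reconstruction, not the exact step that was run; it assumes the flag mentioned refers to the action's cache-binary input, as used in the pinned configuration earlier in the thread.

```yaml
  - name: Set up Docker Buildx
    uses: docker/setup-buildx-action@v3
    with:
      # Only the binary cache is disabled; buildx and BuildKit versions stay at their defaults.
      cache-binary: false
```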

@tonistiigi
Member

@gete76 And you still see the issue?

@gete76

gete76 commented Apr 9, 2024

@tonistiigi, I haven't tested the default setting today because I don't want to disrupt our CI. SLOs and whatnot. I'll have to find a way to test this without disruption.

@gete76

gete76 commented Apr 9, 2024

@tonistiigi, I can tell you it did show up yesterday morning around 11 AM EST, when I tested the latest (default) packages with no caching.

@gete76

gete76 commented Apr 9, 2024

> For those experiencing this issue, I think @tonistiigi may have used the updated runner image released yesterday; see https://github.com/actions/runner-images/releases.
>
> Try again with the latest docker/setup-buildx-action@v3 version and see if you are still having the unexpected behaviour.

Thanks, I'll give this a try. It does appear that this is only happening on our GH-hosted runners. Our internal ones are built off of the summerwind action-runner image.

@tonistiigi
Member

Atm this looks like a GitHub-side issue related to the 20240403.1.0 runner release, which now looks to be deleted: https://github.com/actions/runner-images/blob/ubuntu20/20240403.1/images/ubuntu/Ubuntu2004-Readme.md (404).

The related issue is actions/runner-images#9632, and there is a comment about the release being broken at actions/runner-images#9654 (comment).
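
To check which runner image a failing job ran on, a step like this can be added near the top of the job. It is a sketch that assumes the ImageOS and ImageVersion environment variables are set, which GitHub-hosted runners normally do; the step name is arbitrary, and on self-hosted runners the variables may be empty.

```yaml
      - name: Print runner image version
        run: |
          # Falls back to "unknown" when the variables are not set (e.g. on self-hosted runners).
          echo "ImageOS=${ImageOS:-unknown} ImageVersion=${ImageVersion:-unknown}"
```

Comparing the printed version between passing and failing runs would show whether failures cluster on one image release.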

@gete76

gete76 commented Apr 9, 2024

Confirmed, this new runner image has resolved the issue.
