Build summary generation failing #1143

Closed
3 tasks done
leocencetti opened this issue Jun 19, 2024 · 7 comments · Fixed by docker/actions-toolkit#376


leocencetti commented Jun 19, 2024

Contributing guidelines

I've found a bug, and:

  • The documentation does not mention anything about my problem
  • There are no open or closed issues that are related to my problem

Description

Generation of the build summary in the post-build job (added by https://github.com/docker/build-push-action/releases/tag/v6.0.0) fails.

Expected behaviour

Generation should succeed

Actual behaviour

The post-build job fails unexpectedly with the following error:

error: Unavailable: connection error: desc = "error reading server preface: http2: frame too large"

The error is reproduced every time the workflow is rerun.

Repository URL

No response

Workflow run URL

No response

YAML workflow

name: Build toolchain

on:
  workflow_call:
    inputs:
      push:
        description: Push image to registry
        default: false
        type: boolean
      tag:
        description: Optional tag
        type: string

env:
  REGISTRY: ghcr.io

defaults:
  run:
    shell: bash

permissions:
  contents: read
  packages: write

jobs:
  build-toolchain:
    name: Build rootfs toolchain
    runs-on:
      - self-hosted
      - linux
      - ARM64
    container:
      image: ghcr.io/leocencetti/docker:latest
      options: --privileged
      credentials:
        username: ${{ github.actor }}
        password: ${{ secrets.GITHUB_TOKEN }}
      volumes:
        - /var/lib/docker:/var/lib/docker
        - /var/cache/github-runner:/tmp/cache/

    steps:
      - name: Check out the repo
        uses: actions/checkout@v4.1.6
        with:
          submodules: recursive
          token: ${{ secrets.ACCESS_TOKEN }}

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3

      - name: Login to Docker Hub
        uses: docker/login-action@v3
        with:
          registry: ${{ env.REGISTRY }}
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}

      - name: Generate image tags
        id: image_meta
        uses: docker/metadata-action@v5.5.0
        with:
          images: my-image
          tags: |
            type=raw,value=rid-${{ github.run_id }}
            type=raw,value=${{ inputs.tag }},enable=${{ inputs.tag != '' }}
            type=raw,value=latest,enable=${{ github.event_name == 'release' }}

      - name: Build and push image
        uses: docker/build-push-action@v6.0.0
        with:
          push: ${{ inputs.push }}
          build-args: |
            GIT_REF=${{ github.ref }}
            GIT_SHA=${{ github.sha }}
          tags: ${{ steps.image_meta.outputs.tags }}
          labels: ${{ steps.image_meta.outputs.labels }}
          secrets: |
            image_password=${{ secrets.IMAGE_PASSWORD }}
          context: .
          file: Dockerfile
          target: payload
          cache-from: |
            type=local,src=/tmp/cache/.buildx-cache
            type=local,src=/tmp/cache/.buildx-cache-new
          cache-to: type=local,dest=/tmp/cache/.buildx-cache-new,mode=max
          load: ${{ !inputs.push }}

Workflow logs

Post job cleanup.
/usr/bin/docker exec  eef2ffbc1bcffc4394ac503367b038b661a13cfadddf8db48cd4c2b72d8ec728 sh -c "cat /etc/*release | grep ^ID"
Generating build summary
  exporting build record to /__w/_temp/docker-actions-toolkit-u0hgvJ/export
  /usr/bin/mkfifo /__w/_temp/docker-actions-toolkit-u0hgvJ/buildx-in-GfD6vA.fifo
  /usr/bin/mkfifo /__w/_temp/docker-actions-toolkit-u0hgvJ/buildx-out-55Fw6E.fifo
  docker buildx --builder builder-39f22689-5718-4eaf-b370-5e4d83eddf10 dial-stdio
  docker run --rm -i -v /github/home/.docker/buildx/refs:/buildx-refs -v /__w/_temp/docker-actions-toolkit-u0hgvJ/export:/out docker.io/dockereng/export-build:latest --ref-state-dir=/buildx-refs --node=builder-39f22689-5718-4eaf-b370-5e4d83eddf10/builder-39f22689-5718-4eaf-b370-5e4d83eddf100 --ref=e5749kkypteaysamhknl3lfgs --uid=0 --gid=0
  Unable to find image 'dockereng/export-build:latest' locally
  latest: Pulling from dockereng/export-build
  170e3bcedcd0: Pulling fs layer
  5b2524eeb8ff: Pulling fs layer
  5b2524eeb8ff: Download complete
  170e3bcedcd0: Verifying Checksum
  170e3bcedcd0: Download complete
  170e3bcedcd0: Pull complete
  5b2524eeb8ff: Pull complete
  Digest: sha256:3dfedea3148487c108965dede834f22e81528fc5b2f3989e4b8ecec2f8fe10ae
  Status: Downloaded newer image for dockereng/export-build:latest
  2024/06/19 09:21:22 error: Unavailable: connection error: desc = "error reading server preface: http2: frame too large"
  github.com/moby/buildkit/util/stack.Enable
  	/go/pkg/mod/github.com/moby/buildkit@v0.13.1/util/stack/stack.go:77
  github.com/moby/buildkit/util/grpcerrors.FromGRPC
  	/go/pkg/mod/github.com/moby/buildkit@v0.13.1/util/grpcerrors/grpcerrors.go:198
  github.com/moby/buildkit/util/grpcerrors.UnaryClientInterceptor
  	/go/pkg/mod/github.com/moby/buildkit@v0.13.1/util/grpcerrors/intercept.go:41
  google.golang.org/grpc.(*ClientConn).Invoke
  	/go/pkg/mod/google.golang.org/grpc@v1.59.0/call.go:35
  github.com/moby/buildkit/api/services/control.(*controlClient).ListWorkers
  	/go/pkg/mod/github.com/moby/buildkit@v0.13.1/api/services/control/control.pb.go:2306
  github.com/moby/buildkit/client.(*Client).ListWorkers
  	/go/pkg/mod/github.com/moby/buildkit@v0.13.1/client/workers.go:31
  main.run
  	/src/main.go:103
  main.main
  	/src/main.go:80
  runtime.main
  	/usr/local/go/src/runtime/proc.go:267
  runtime.goexit
  	/usr/local/go/src/runtime/asm_arm64.s:1197
  failed to list workers
  github.com/moby/buildkit/client.(*Client).ListWorkers
  	/go/pkg/mod/github.com/moby/buildkit@v0.13.1/client/workers.go:33
  main.run
  	/src/main.go:103
  main.main
  	/src/main.go:80
  runtime.main
  	/usr/local/go/src/runtime/proc.go:267
  runtime.goexit
  	/usr/local/go/src/runtime/asm_arm64.s:1197
  failed to list workers
  main.run
  	/src/main.go:105
  main.main
  	/src/main.go:80
  runtime.main
  	/usr/local/go/src/runtime/proc.go:267
  runtime.goexit
  	/usr/local/go/src/runtime/asm_arm64.s:1197
  Warning: Process "docker run" exited with code 1
Removing temp folder /__w/_temp/docker-actions-toolkit-ifKwVX
Post cache
  State not set
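
The log above shows the post step creating two FIFOs, starting `docker buildx dial-stdio`, and then running the export container with `-i`. The exact wiring inside actions-toolkit is not visible in the log, so the following is only a generic sketch of the two-FIFO request/reply pattern, with a trivial echo loop standing in for the real server process. The function name `fifo_roundtrip` is hypothetical.

```shell
#!/bin/sh
# Generic two-FIFO round trip. The background loop is a stand-in
# (assumption) for a process that speaks a protocol over stdin/stdout.
fifo_roundtrip() {
  request="$1"
  tmp="$(mktemp -d)" || return 1
  mkfifo "$tmp/in.fifo" "$tmp/out.fifo" || return 1

  # Stand-in server: read lines from in.fifo, answer on out.fifo.
  (
    while IFS= read -r line; do
      printf 'echo:%s\n' "$line"
    done <"$tmp/in.fifo" >"$tmp/out.fifo"
  ) &
  server_pid=$!

  # Client: send one request, then read one reply line.
  printf '%s\n' "$request" >"$tmp/in.fifo"
  IFS= read -r reply <"$tmp/out.fifo"

  wait "$server_pid"
  rm -rf "$tmp"
  printf '%s\n' "$reply"
}
```

If the process on the server side does not speak the protocol the client expects, the client only sees unexpected bytes on its FIFO, which is one way a low-level framing error could surface.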

BuildKit logs

No response

Additional info

No response

@crazy-max
Member

Thanks for reporting, looking at your workflow:

    runs-on:
      - self-hosted
      - linux
      - ARM64
    container:
      image: ghcr.io/leocencetti/docker:latest
      options: --privileged
      credentials:
        username: ${{ github.actor }}
        password: ${{ secrets.GITHUB_TOKEN }}
      volumes:
        - /var/lib/docker:/var/lib/docker
        - /var/cache/github-runner:/tmp/cache/

This does not look like a common setup 😅

What is the ghcr.io/leocencetti/docker:latest image? Seems to be a private package, would you mind sharing it if possible?

Also not sure what kind of runner you're using looking at self-hosted, linux, ARM64, but these seem to be self-hosted runners. Can you share the full workflow logs to help us figure out what's going on? And also enable debug for BuildKit to get the container logs: https://docs.docker.com/build/ci/github-actions/configure-builder/#buildkit-container-logs.

@leocencetti
Author

leocencetti commented Jun 19, 2024

What is the ghcr.io/leocencetti/docker:latest image? Seems to be a private package, would you mind sharing it if possible?

This is roughly equivalent to this dockerfile, I am just using ubuntu:22.04 as the base image instead of alpine (with the required package manager adaptations).

Also not sure what kind of runner you're using looking at self-hosted, linux, ARM64 but seems like these are self-hosted runners

Yes, I am using a docker-in-docker (DIND) workflow on a self-hosted ARM64 runner (NVIDIA).

Can you share the full workflow logs to help us figure out what's going on?

Yes. I've collected the logs from relevant jobs in the workflow. I have omitted the docker build logs as they contain private info (and are probably unrelated).
logs.zip

On a side note, the image I am building is on the larger side (some GB), and the full workflow logs are quite verbose (5k+ lines). I noticed that the logs are fetched by the action to produce the summary, so I am wondering if their size could be the issue. I don't seem to have this problem when building smaller (and less verbose) images using the same setup.

@crazy-max
Member

crazy-max commented Jun 19, 2024

Yes. I've collected the logs from relevant jobs in the workflow. I have omitted the docker build logs as they contain private info (and are probably unrelated).
logs.zip

Thanks! Looking at the logs it seems you're using an old version of buildx:

2024-06-19T11:49:15.0952010Z [command]/usr/local/bin/docker buildx version
2024-06-19T11:49:15.1712504Z github.com/docker/buildx v0.11.2 9872040b6626fb7d87ef7296fd5b832e8cc2ad17

That version doesn't support the dial-stdio command, which was introduced in Buildx 0.13.0: https://github.com/docker/buildx/releases/tag/v0.13.0
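
For anyone checking their own runner, a minimal sketch of that version gate follows. It assumes the `docker buildx version` output has the shape shown in the log above (module path, version, commit). The names `parse_buildx_version`, `version_ge`, and `MIN_BUILDX` are made up for illustration and are not part of any Docker tooling.

```shell
#!/bin/sh
# Minimum Buildx version with dial-stdio, per the release linked above.
MIN_BUILDX="0.13.0"

# Extract "0.11.2" from a line such as:
#   github.com/docker/buildx v0.11.2 9872040b...
# Assumption: the version is the second whitespace-separated field.
parse_buildx_version() {
  printf '%s\n' "$1" | awk '{
    v = $2
    sub(/^v/, "", v)       # drop a leading "v"
    sub(/[-+].*$/, "", v)  # drop any pre-release or build suffix
    print v
  }'
}

# version_ge A B: exit status 0 when A >= B, comparing
# major, minor, and patch numerically (missing parts count as 0).
version_ge() {
  awk -v a="$1" -v b="$2" 'BEGIN {
    split(a, x, "."); split(b, y, ".")
    for (i = 1; i <= 3; i++) {
      if (x[i] + 0 > y[i] + 0) exit 0
      if (x[i] + 0 < y[i] + 0) exit 1
    }
    exit 0
  }'
}
```

On a runner this could be called as `version_ge "$(parse_buildx_version "$(docker buildx version)")" "$MIN_BUILDX"` before relying on the build summary.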

Can you make this change in your workflow to use the latest stable release and see if it fixes the issue on your side?

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3
        with:
          version: latest
          buildkitd-flags: --debug

I will try to repro on my side with an older version.

Edit: Was able to repro:

[screenshot of the reproduction]

@crazy-max
Member

crazy-max commented Jun 19, 2024

Opened #1145 to mitigate the issue. You can test with:

      - name: Build and push image
        uses: crazy-max/docker-build-push-action@summary-check

@leocencetti
Author

@crazy-max I tried this morning to run the CI workflow with the latest buildx (v0.15.1) and I still get a failure (not the same one though):

docker buildx --builder builder-3e2fdd69-2ba2-4478-b367-2501d8cef169 dial-stdio
  docker run --rm -i -v /github/home/.docker/buildx/refs:/buildx-refs -v /__w/_temp/docker-actions-toolkit-0j650g/export:/out docker.io/dockereng/export-build:latest --ref-state-dir=/buildx-refs --node=builder-3e2fdd69-2ba2-4478-b367-2501d8cef169/builder-3e2fdd69-2ba2-4478-b367-2501d8cef1690 --ref=b5ap7n4x5arkdbv617hvgbprg --uid=0 --gid=0
  2024/06/20 06:42:45 failed to fill local state: failed to stat local ref directory /buildx-refs/builder-3e2fdd69-2ba2-4478-b367-2501d8cef169/builder-3e2fdd69-2ba2-4478-b367-2501d8cef1690: stat /buildx-refs/builder-3e2fdd69-2ba2-4478-b367-2501d8cef169/builder-3e2fdd69-2ba2-4478-b367-2501d8cef1690/: no such file or directory
  Warning: Failed to export build record: /__w/_temp/docker-actions-toolkit-0j650g/export/rec.dockerbuild not found

Note, I am not using your latest fix yet, although I doubt it will help here (buildx version is fine)...

Full logs:
logs.zip

@crazy-max
Member

and I still get a failure (not the same one though):

docker buildx --builder builder-3e2fdd69-2ba2-4478-b367-2501d8cef169 dial-stdio
  docker run --rm -i -v /github/home/.docker/buildx/refs:/buildx-refs -v /__w/_temp/docker-actions-toolkit-0j650g/export:/out docker.io/dockereng/export-build:latest --ref-state-dir=/buildx-refs --node=builder-3e2fdd69-2ba2-4478-b367-2501d8cef169/builder-3e2fdd69-2ba2-4478-b367-2501d8cef1690 --ref=b5ap7n4x5arkdbv617hvgbprg --uid=0 --gid=0
  2024/06/20 06:42:45 failed to fill local state: failed to stat local ref directory /buildx-refs/builder-3e2fdd69-2ba2-4478-b367-2501d8cef169/builder-3e2fdd69-2ba2-4478-b367-2501d8cef1690: stat /buildx-refs/builder-3e2fdd69-2ba2-4478-b367-2501d8cef169/builder-3e2fdd69-2ba2-4478-b367-2501d8cef1690/: no such file or directory
  Warning: Failed to export build record: /__w/_temp/docker-actions-toolkit-0j650g/export/rec.dockerbuild not found

The temp folder /__w/_temp looks odd compared to what we have on GitHub public runners (/home/runner/work/_temp): https://github.com/docker/build-push-action/actions/runs/9585155679/job/26430413150#step:6:7 but I don't think that's the issue. I wonder if volume mounts are just broken with your current setup when using your DinD image. Maybe we should rely on docker cp instead of volumes 🤔. I also see that the local ref cannot be found with /github/home/.docker/buildx/refs:/buildx-refs.

Can you add these extra steps after the "Build and push image" step and share the logs?

      - name: Check docker config
        run: |
          tree -punahig /github/home/.docker

      - name: Dump context
        uses: crazy-max/ghaction-dump-context@v2
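
The missing path from the error above can also be checked directly from a step in the job container. The sketch below assumes, based only on the log, that the export tool looks for `<ref-state-dir>/<builder>/<node>`, which is the `--node` value appended to the refs root. `refs_dir_for` and `check_refs_dir` are hypothetical helper names.

```shell
#!/bin/sh
# refs_dir_for REFS_ROOT NODE_ARG
# Builds the directory the log complains about: REFS_ROOT/NODE_ARG,
# where NODE_ARG has the "<builder>/<node>" form of the --node flag.
refs_dir_for() {
  printf '%s/%s\n' "${1%/}" "$2"
}

# check_refs_dir REFS_ROOT NODE_ARG
# Prints the resolved path and returns non-zero if it is not a directory.
check_refs_dir() {
  dir="$(refs_dir_for "$1" "$2")"
  if [ -d "$dir" ]; then
    echo "ok: $dir"
  else
    echo "missing: $dir" >&2
    return 1
  fi
}
```

It would be called with the refs root from the `-v` flag (here `/github/home/.docker/buildx/refs`) and the `--node` value from the log. Note that a `-v` bind mount source is resolved on the machine where the Docker daemon runs, so a directory can exist inside the job container and still be absent for the export container; whether that applies to this setup is not established in the thread.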

@macripps

macripps commented Aug 8, 2024

Hi crazy-max, I'm getting the same error as leocencetti. We're using our own GitHub runner and docker/bake-action@v5.6.1. I can provide the logs from the two extra steps above from our system. As this issue is closed, would here be the best place for them, or would an issue in the bake-action repo be more appropriate? Thanks!
