Jobs that run privileged containers started failing #3673

Closed
orlangure opened this issue Jun 30, 2021 · 7 comments
Assignees: al-cheb
Labels: Area: Containers, investigate, OS: Ubuntu

Comments

@orlangure

Description

In gnomock there is an automated test that runs a lightweight Kubernetes distribution (k3s) inside a Docker container. The test passed successfully 5 days ago and started to fail consistently after the latest GitHub virtual environments upgrade.

The error that occurs inside the container is:

    151 conntrack.go:103] Set sysctl 'net/netfilter/nf_conntrack_max' to 131072
    2021-06-30T11:22:53.9605151Z F0630 11:22:53.387138     151 server.go:495] open /proc/sys/net/netfilter/nf_conntrack_max: permission denied

but I'm not sure this is the root cause.

I have the same test running on CircleCI, and it continues to pass.

I have a few other jobs that set up Docker containers, and they still work. The difference is that the k3s job starts a privileged container; a rough sketch of what that looks like is below.
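
For context, the failing preset boils down to starting a privileged k3s server container, roughly as sketched here; the image tag, tmpfs mounts, and port are illustrative assumptions, not the preset's exact invocation:

    # Illustrative only: a privileged k3s server in Docker, similar in spirit
    # to what the gnomock k3s preset starts (tag, mounts, and port are assumptions).
    docker run -d --privileged \
      --tmpfs /run --tmpfs /var/run \
      -p 6443:6443 \
      rancher/k3s:v1.19.3-k3s1 server

This is the container inside which the permission-denied write shown above happens.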

Virtual environments affected

  • Ubuntu 16.04
  • Ubuntu 18.04
  • Ubuntu 20.04 (affected)
  • macOS 10.15
  • macOS 11
  • Windows Server 2016
  • Windows Server 2019

Image version and build link

  Environment: ubuntu-20.04
  Version: 20210628.1
  Included Software: https://github.com/actions/virtual-environments/blob/ubuntu20/20210628.1/images/linux/Ubuntu2004-README.md
  Image Release: https://github.com/actions/virtual-environments/releases/tag/ubuntu20%2F20210628.1

Failed build: https://github.com/orlangure/gnomock/runs/2951632183?check_suite_focus=true
Successful build (5 days ago): https://github.com/orlangure/gnomock/runs/2916092414?check_suite_focus=true

Is it regression?

Yes, the job last passed on image version 20210614.1.

Expected behavior

No response

Actual behavior

No response

Repro steps

Run the [preset] k3s job from the Gnomock repository.

@dibir-magomedsaygitov added the OS: Ubuntu, Area: Containers, and investigate labels and removed the needs triage label Jun 30, 2021
@dibir-magomedsaygitov
Contributor

Hello @orlangure. Thank you for your report. We will take a look.

@lukaszo

lukaszo commented Jun 30, 2021

We have the same error when using kind (Kubernetes in Docker).

@al-cheb
Contributor

al-cheb commented Jun 30, 2021

@orlangure, looks like the issue is not with the image (see rancher/rancher#33300). Manually setting nf_conntrack_max=131072 does the trick:

    steps:
        - uses: actions/checkout@v2
          with:
            repository: 'orlangure/gnomock'
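        # Raising nf_conntrack_max on the host up front means kube-proxy inside
        # the privileged container sees a value that is already high enough and
        # skips writing /proc/sys/net/netfilter/nf_conntrack_max.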
        - run: sudo sysctl -w net/netfilter/nf_conntrack_max=131072
        - name: Set up Go 1.16
          uses: actions/setup-go@v1
          with:
            go-version: 1.16
        - name: Get dependencies
          run: go get -v -t -d ./...
        - name: Test preset
          run: go test -race -cover -coverprofile=preset-cover.txt -coverpkg=./... -v ./preset/k3s/...
        - name: Test server
          run: go test -race -cover -coverprofile=server-cover.txt -coverpkg=./... -v ./internal/gnomockd -run TestK3s

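To see whether a given runner needs the override at all, the current value can be read first (standard sysctl usage, nothing specific to this image):

    # Print the current host value; kube-proxy only attempts the write when the
    # existing value is below the limit it computes (131072 in the log above).
    sysctl net.netfilter.nf_conntrack_max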

@lukaszo, for kind, see rancher/rancher#33360.

@orlangure
Author

> Manually setting nf_conntrack_max=131072 does the trick:

@al-cheb, interesting, but the image that I use for tests hasn't changed for a while (it was updated 8 months ago), and the tests passed until now. The only change that I noticed in the past few days was the GitHub Actions virtual environment upgrade.

From the linked issues it appears that the problem happens not only in GitHub Actions, so I assume it could be related to a kernel upgrade or some package that changed recently?

@al-cheb
Contributor

al-cheb commented Jun 30, 2021

> Manually setting nf_conntrack_max=131072 does the trick:
>
> @al-cheb, interesting, but the image that I use for tests hasn't changed for a while (it was updated 8 months ago), and the tests passed until now. The only change that I noticed in the past few days was the GitHub Actions virtual environment upgrade.
>
> From the linked issues it appears that the problem happens not only in GitHub Actions, so I assume it could be related to a kernel upgrade or some package that changed recently?

Yep, that's right, the kernel was updated: https://github.com/actions/virtual-environments/releases/tag/ubuntu20%2F20210628.1
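
To confirm which kernel a runner picked up, a plain version check is enough (standard commands, nothing image-specific):

    # Print the running kernel release; comparing the output on image
    # 20210614.1 vs 20210628.1 shows the upgrade.
    uname -r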

@al-cheb
Contributor

al-cheb commented Jun 30, 2021

@orlangure, could you please update the image to the latest version to test the workaround? See https://k3d.io/faq/faq/#solved-nodes-fail-to-start-or-get-stuck-in-notready-state-with-log-nf_conntrack_max-permission-denied.
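
A docker-level sketch of the two ways out, with image tags and flags as illustrative assumptions rather than the preset's exact invocation:

    # Assumed tag, for illustration: a patched k3s release no longer needs to
    # write the host's nf_conntrack_max.
    docker run -d --privileged --tmpfs /run --tmpfs /var/run \
      rancher/k3s:v1.19.12-k3s1 server

    # Alternative for older releases (also an assumption): tell kube-proxy not
    # to manage the conntrack limit at all, via k3s's --kube-proxy-arg passthrough.
    docker run -d --privileged --tmpfs /run --tmpfs /var/run \
      rancher/k3s:v1.19.3-k3s1 server --kube-proxy-arg=conntrack-max-per-core=0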

@al-cheb al-cheb self-assigned this Jul 1, 2021
lukaszo added a commit to capactio/capact that referenced this issue Jul 1, 2021
It contains a fix for kubernetes-sigs/kind#2240

We've hit it when running GitHub Actions: actions/runner-images#3673
@orlangure
Author

Thanks @al-cheb, and sorry for the late response.
The issue appears to be gone with 1.19.12 (the only one I tried so far).

I'll prepare an update for my users to let them know that older k3s versions won't work in Gnomock.
