
Kaniko integration tests failing - localhost:5000 inaccessible -> k3s is evicting registry pod #2803

Closed
aaron-prindle opened this issue Oct 17, 2023 · 2 comments · Fixed by #2804
Labels
area/ci-cd kind/bug Something isn't working meta/integration-tests priority/p0 Highest priority. Break user flow. We are actively looking at delivering it.
Milestone

Comments

@aaron-prindle
Collaborator

aaron-prindle commented Oct 17, 2023

All of Kaniko's integration tests are currently failing at HEAD with the log snippet below. The failure doesn't seem related to a direct change in Kaniko code; more likely it comes from a change in the GitHub Actions worker image (dep update). Need to root-cause and fix this issue:

log link:
https://github.com/GoogleContainerTools/kaniko/actions/runs/6489137282/job/17622851856?pr=2791

log error snippet

integration_test.go:604: Error building image: Failed to build image localhost:5000/kaniko-dockerfile_test_cache with kaniko command "[docker run --net=host -e BENCHMARK_FILE=false -v /home/runner/work/kaniko/kaniko/integration:/workspace -v /tmp/2024572772:/kaniko/benchmarks -v /home/runner/.docker/config.json:/root/.docker/config.json -e DOCKER_CONFIG=/root/.docker executor-image -f /workspace/dockerfiles/Dockerfile_test_cache -d localhost:5000/kaniko-dockerfile_test_cache --force -c /workspace]": exit status 1
            error checking push permissions -- make sure you entered the correct tag name, and that you are authenticated correctly, and try again: checking push permission for "localhost:5000/kaniko-dockerfile_test_cache": creating push check transport for localhost:5000 failed: Get "https://localhost:5000/v2/": dial tcp [::1]:5000: connect: connection refused; Get "http://localhost:5000/v2/": dial tcp [::1]:5000: connect: connection refused
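
The error above means nothing was listening on `localhost:5000` when Kaniko probed the registry's `/v2/` endpoint. As a rough sketch (not part of Kaniko's test harness), a pre-flight check like the following could make that condition fail fast with a clearer message. The function names `registryStatusOK` and `waitForRegistry`, the 10-second timeout, and the plain-HTTP base URL are all assumptions made for illustration. The sketch treats `200 OK` and `401 Unauthorized` on `/v2/` as "registry is up", since a registry that requires auth still answers that endpoint.

```go
package main

import (
	"context"
	"fmt"
	"net/http"
	"os"
	"time"
)

// registryStatusOK reports whether an HTTP status returned by GET /v2/
// indicates a live registry: 200 (open) or 401 (up, but auth required).
func registryStatusOK(status int) bool {
	return status == http.StatusOK || status == http.StatusUnauthorized
}

// waitForRegistry (hypothetical helper) polls baseURL+"/v2/" until the
// registry answers or ctx expires, returning the last error seen.
func waitForRegistry(ctx context.Context, client *http.Client, baseURL string, interval time.Duration) error {
	var lastErr error
	for {
		req, err := http.NewRequestWithContext(ctx, http.MethodGet, baseURL+"/v2/", nil)
		if err != nil {
			return err
		}
		resp, err := client.Do(req)
		if err == nil {
			resp.Body.Close()
			if registryStatusOK(resp.StatusCode) {
				return nil
			}
			lastErr = fmt.Errorf("unexpected status %d", resp.StatusCode)
		} else {
			lastErr = err
		}
		select {
		case <-ctx.Done():
			return fmt.Errorf("registry at %s not reachable: %w", baseURL, lastErr)
		case <-time.After(interval):
			// Retry after the polling interval.
		}
	}
}

func main() {
	// Assumed values: the registry address comes from the log above,
	// the timeout and interval are arbitrary.
	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
	defer cancel()
	if err := waitForRegistry(ctx, http.DefaultClient, "http://localhost:5000", time.Second); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("registry is reachable")
}
```

Run before the integration tests, a check like this would report the unreachable registry directly rather than through a failed Kaniko push-permission check.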
@aaron-prindle aaron-prindle added meta/integration-tests kind/bug Something isn't working area/ci-cd priority/p0 Highest priority. Break user flow. We are actively looking at delivering it. labels Oct 17, 2023
@aaron-prindle aaron-prindle added this to the v1.17.0 milestone Oct 17, 2023
@aaron-prindle
Collaborator Author

aaron-prindle commented Oct 17, 2023

The issue is related to disk pressure. Snippet of k3s pod status during a failing run:

log link:
https://github.com/GoogleContainerTools/kaniko/actions/runs/6554012241/job/17800290036?pr=2804#step:6:127

log error snippet

...
local-path-provisioner-957fdf8bc-h2b8x                1/1     Running     0          22m
coredns-77ccd57875-9rd7w                              1/1     Running     0          22m
helm-install-traefik-crd-fgmlk                        0/1     Completed   0          22m
helm-install-local-registry-8zsth                     0/1     Completed   0          22m
helm-install-traefik-wlmmz                            0/1     Completed   1          22m
traefik-64f55bb67d-wv4b5                              1/1     Running     0          21m
metrics-server-648b5df564-srtt4                       1/1     Running     0          22m
local-registry-docker-registry-56d58d5d6-6dncc        0/1     Error       0          21m
local-registry-docker-registry-56d58d5d6-sh4ls        0/1     Evicted     0          18s
local-registry-docker-registry-56d58d5d6-tphm2        0/1     Pending     0          18s
svclb-local-registry-docker-registry-c7ea5916-zggj6   0/1     Evicted     0          6s
svclb-traefik-050d8b73-crcsf                          0/2     Evicted     0          0s
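
To spot this state automatically in CI logs, a listing like the one above could be filtered for pods that are neither `Running` nor `Completed`. The following is only a sketch: it assumes the whitespace-separated column layout shown in the snippet (name, ready, status, restarts, age), and the `podStatus` type and `unhealthyPods` function are hypothetical names, not part of Kaniko or k3s.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// podStatus holds the first three columns of a pod listing line.
type podStatus struct {
	Name   string
	Ready  string
	Status string
}

// unhealthyPods (hypothetical helper) returns every pod in the listing
// whose status is neither Running nor Completed. Lines with fewer than
// three fields (such as "...") and a NAME header line are skipped.
func unhealthyPods(listing string) []podStatus {
	var bad []podStatus
	sc := bufio.NewScanner(strings.NewReader(listing))
	for sc.Scan() {
		f := strings.Fields(sc.Text())
		if len(f) < 3 || f[0] == "NAME" {
			continue
		}
		switch f[2] {
		case "Running", "Completed":
			continue
		}
		bad = append(bad, podStatus{Name: f[0], Ready: f[1], Status: f[2]})
	}
	return bad
}

func main() {
	// Three lines copied from the failing run above.
	listing := `
coredns-77ccd57875-9rd7w                              1/1     Running     0          22m
local-registry-docker-registry-56d58d5d6-6dncc        0/1     Error       0          21m
local-registry-docker-registry-56d58d5d6-sh4ls        0/1     Evicted     0          18s
`
	for _, p := range unhealthyPods(listing) {
		fmt.Printf("%s is %s (%s ready)\n", p.Name, p.Status, p.Ready)
	}
}
```

Applied to the full listing, this would flag the `Error`, `Evicted`, and `Pending` registry pods as well as the evicted `svclb-*` pods.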

@aaron-prindle aaron-prindle changed the title Kaniko integration tests failing - likely related to Github Actions worker update Kaniko integration tests failing - localhost:5000 inaccessible -> k3s is evicted registry pod Oct 18, 2023
@aaron-prindle aaron-prindle changed the title Kaniko integration tests failing - localhost:5000 inaccessible -> k3s is evicted registry pod Kaniko integration tests failing - localhost:5000 inaccessible -> k3s is evicting registry pod Oct 18, 2023
@aaron-prindle
Collaborator Author

aaron-prindle commented Oct 18, 2023

NOTE: It is not clear why the disk pressure started. The possible culprits are: the tests started using more disk (doesn't seem to be the case), k3s started using more disk (unlikely IIUC, since the last release was ~1 month ago), or the GitHub Actions workers are using more disk or now come with less disk space.
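
One way to narrow down these candidates would be to log the worker's free disk space at the start of each run and compare across dates. The sketch below does that with `syscall.Statfs`, so it assumes a Linux runner. The names `diskUsage` and `belowThreshold` and the 10% threshold are assumptions for illustration only; the eviction threshold actually configured for k3s on the runner was not checked here.

```go
package main

import (
	"fmt"
	"os"
	"syscall"
)

// belowThreshold reports whether the available fraction of the disk is
// lower than minFreeFraction. A zero-sized filesystem counts as below.
func belowThreshold(availBytes, totalBytes uint64, minFreeFraction float64) bool {
	if totalBytes == 0 {
		return true
	}
	return float64(availBytes)/float64(totalBytes) < minFreeFraction
}

// diskUsage (hypothetical helper) returns the bytes available to
// unprivileged users and the total bytes of the filesystem holding path.
func diskUsage(path string) (avail, total uint64, err error) {
	var st syscall.Statfs_t
	if err = syscall.Statfs(path, &st); err != nil {
		return 0, 0, err
	}
	bsize := uint64(st.Bsize)
	return st.Bavail * bsize, st.Blocks * bsize, nil
}

func main() {
	avail, total, err := diskUsage("/")
	if err != nil {
		fmt.Fprintln(os.Stderr, "statfs failed:", err)
		os.Exit(1)
	}
	fmt.Printf("available: %d bytes of %d\n", avail, total)
	// 0.10 is an assumed threshold, not the value k3s is configured with.
	if belowThreshold(avail, total, 0.10) {
		fmt.Println("warning: free disk space is below the assumed threshold")
	}
}
```

Comparing this output between a passing run and a failing one would help separate "workers ship with less free disk" from "the tests or k3s consume more".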
