Sync SR-IOV Network Operator Helm chart #603

Merged Sep 26, 2023 (1 commit)

Conversation

e0ne
Collaborator

@e0ne e0ne commented Sep 22, 2023

No description provided.

@e0ne
Collaborator Author

e0ne commented Sep 25, 2023

/retest-nic_operator_helm

@abdallahyas
Contributor

The manager container is failing to start with:

    -   containerID: containerd://0d97256b7e7fbd7c31c1f6abe7075266347e6ba3e2e5fda01ba677b511bc3571
        image: docker.io/mellanox/network-operator:latest
        imageID: docker.io/mellanox/network-operator@sha256:bb62a913e01f5ec0716e474d968b039680923dd1fe636c1c7d768895920be424
        lastState:
            terminated:
                containerID: containerd://0d97256b7e7fbd7c31c1f6abe7075266347e6ba3e2e5fda01ba677b511bc3571
                exitCode: 128
                finishedAt: '2023-09-25T10:34:23Z'
                message: 'failed to create containerd task: failed to create shim:
                    OCI runtime create failed: container_linux.go:380: starting container
                    process caused: exec: "/manager": stat /manager: no such file
                    or directory: unknown'
                reason: StartError
                startedAt: '1970-01-01T00:00:00Z'
        name: network-operator
        ready: false
        restartCount: 9
        started: false
        state:
            waiting:
                message: back-off 5m0s restarting failed container=network-operator
                    pod=network-operator-helm-ci-6779fdd9d6-jsxpd_sriov-network-operator(d5b49d94-6281-489a-8f64-db32e9dfacb8)
                reason: CrashLoopBackOff
    hostIP: 172.18.0.2
    phase: Running
    podIP: 192.168.211.2
    podIPs:

See https://nvidia-nbu-ci-logs.s3.us-east-2.amazonaws.com/nic_operator_helm-ci/1147/logs/pods_infos/network-operator-helm-ci-6779fdd9d6-jsxpd.yaml for details.
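
For triaging dumps like the one above, here is a minimal sketch (not part of this PR) of pulling the failure reason out of a single container status entry. The `summarize_container_status` helper and the inline `example_status` dict are hypothetical; the field names simply mirror the Kubernetes `containerStatuses` entry quoted above, and the real dump would first need to be loaded from the YAML file.

```python
def summarize_container_status(status):
    """Return a one-line summary of why a container is not ready.

    `status` is one entry of a pod's `status.containerStatuses` list,
    already parsed into a dict. Returns None if the container is ready.
    """
    if status.get("ready"):
        return None
    waiting = status.get("state", {}).get("waiting", {})
    terminated = status.get("lastState", {}).get("terminated", {})
    parts = [status.get("name", "<unknown>")]
    if waiting.get("reason"):
        parts.append("waiting=" + waiting["reason"])
    if terminated.get("reason"):
        parts.append(
            "last_exit=%s(%s)" % (terminated["reason"], terminated.get("exitCode"))
        )
    parts.append("restarts=%d" % status.get("restartCount", 0))
    return " ".join(parts)


# Hypothetical input: a trimmed copy of the status quoted above.
example_status = {
    "name": "network-operator",
    "ready": False,
    "restartCount": 9,
    "lastState": {"terminated": {"exitCode": 128, "reason": "StartError"}},
    "state": {"waiting": {"reason": "CrashLoopBackOff"}},
}

print(summarize_container_status(example_status))
# network-operator waiting=CrashLoopBackOff last_exit=StartError(128) restarts=9
```

A `StartError` with `exec: "/manager": stat /manager: no such file or directory` means the runtime could not find the entrypoint binary in the image, so the summary line is enough to tell this apart from a crash inside the operator itself.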

@abdallahyas
Contributor

/retest-nic_operator_helm

@abdallahyas
Contributor

The error in CI is:

start failed in pod sriov-cuda-test-pod_default(2c235b65-beaa-45cb-ba64-c480591e982e): RunContainerError: failed to create containerd task: failed to create shim: OCI runtime create failed: container_linux.go:364: creating new parent process caused: container_linux.go:2005: running lstat on namespace path "/proc/683386/ns/ipc" caused: lstat /proc/683386/ns/ipc: no such file or directory: unknown

This does not seem related to the patch. Retriggering CI; if the issue persists, we will need to reproduce it manually and debug.
/retest-nic_operator_helm

Signed-off-by: Ivan Kolodiazhny <ikolodiazhny@nvidia.com>
@e0ne
Collaborator Author

e0ne commented Sep 26, 2023

Synced with the upstream version. CI passed, so it's safe to merge.

@e0ne e0ne merged commit 0183193 into Mellanox:master Sep 26, 2023
14 checks passed

2 participants