Testing Flake: iptables chain already exists #3447
Comments
I've heard of this mostly through Ed's tests. I have not personally seen these errors.
Though, I will note Ed has a consistent reproducer.
Ohhh, that's good to hear. @edsantiago would it be possible to use a fresh(er) build of CNI to see if the problem is fixed? (See the CNI issue I opened for details.) I only see it very inconsistently in libpod CI 😞
I only see the problem on RHEL8, and I have no idea how to get a new CNI there.
Of the two RHEL8 VMs I have, I always see this on one of them and never on the other.
@edsantiago darn, I was afraid you'd say that. I'd guesstimate I was seeing this hit maybe 1 in 10 CI jobs here (when I opened the issues), across all distros IIRC. Now it seems much reduced, but I dislike leaving that up to chance. I remember Matt saying it would be really, really expensive if podman did the locking for iptables. The crappy thing is, CNI can literally just ignore this error. I can't imagine why anything should ever care about a chain already existing on create, since creating it was the intention. There's no other data tied to existence or absence that I know of. (AFAIK, the suspected upstream "fix" in golang doesn't ignore the error, it does something else.)
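To make the "just ignore this error" idea concrete, here is a minimal sketch in Python of an idempotent chain-create. It is an illustration only, not what CNI or go-iptables actually does: `FakeTable`, `ChainExistsError`, `ensure_chain`, and `race` are all hypothetical names, and the in-memory table merely stands in for iptables so the race can be exercised without root.

```python
import threading


class ChainExistsError(Exception):
    """Raised by the hypothetical FakeTable when a chain is created twice."""


class FakeTable:
    """In-memory stand-in for an iptables table (illustration only)."""

    def __init__(self):
        self._chains = set()
        self._lock = threading.Lock()

    def new_chain(self, name):
        # Mimics chain creation: creating an existing chain is an error.
        with self._lock:
            if name in self._chains:
                raise ChainExistsError(f"iptables chain {name} already exists")
            self._chains.add(name)

    def chains(self):
        with self._lock:
            return set(self._chains)


def ensure_chain(table, name):
    """Create the chain, treating "already exists" as success.

    Returns True if this call created the chain, False if it was present.
    """
    try:
        table.new_chain(name)
        return True
    except ChainExistsError:
        # Another caller won the race; the desired end state still holds.
        return False


def race(table, name, workers=8):
    """Run ensure_chain concurrently and collect each worker's result."""
    results = []
    results_lock = threading.Lock()

    def worker():
        created = ensure_chain(table, name)
        with results_lock:
            results.append(created)

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

With this shape, however many callers race on the same chain name, exactly one of them creates it and the rest see it as already present, so none of them has to fail. Whether that is safe in the real plugins depends on nothing else being tied to who created the chain, which is the assumption stated in the comment above.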
Feel free to follow along at rhbz1627561.
That BZ is private, but I will share that it's tracking the same issue, and the fix I reported upstream in containernetworking/plugins#335. So I think we can be fairly confident that when that finalizes for RHEL, we can close all these issues also.
Dangit...saw this happen again on master:
We need a fixed release of containernetworking-plugins vendoring the patched go-iptables.
@mheon did that happen?
We are still waiting on a plugins tag.
We're carrying a patch for RHEL/CentOS 8.1. We should look into doing the same for Fedora.
Note to me: We have fresh images for all platforms in master. I will check and verify the version of CNI plugins that are present.
This is what we have today:
I'm not able to tell if any of those were built with the iptables vendor code required to fix the problem. @lsm5 do you have a way to know what vendor code was used to make those packages? (specifically go-iptables 1.8.1 or later)
The problem is definitely not fixed in Fedora: https://api.cirrus-ci.com/v1/task/6382599951351808/logs/integration_test.log
@lsm5 can we please get new containernetworking-plugins packages that are built with vendor go-iptables 1.8.1 or later? I believe the latest upstream should already specify that.
Looks like this is still happening:
Cirrus says this is fedora-30 but I have no way of knowing which version of containernetworking-plugins is installed.
UPDATE: package_versions.log shows:
@rhvgoyal was just showing some errors similar to this. We had to do a podman system reset to clear them up.
/kind bug
Description
Occasionally, automated testing fails due to a race condition involving CNI iptables (though other CNI tools could also race). firewalld is intended to mitigate this, but since its use is forbidden, synchronization needs to happen either within CNI or within libpod.
Steps to reproduce the issue:
Describe the results you received:
Multiple tests fail with error from CNI claiming "iptables chain FOO already exists"
Describe the results you expected:
Testing continuously passes despite slightly varying conditions.
Additional information you deem important (e.g. issue happens only occasionally):
containernetworking-plugins-0.7.5-1.fc30.x86_64
Output of podman version:
Output of podman info --debug:
Additional environment details (AWS, VirtualBox, physical, etc.):
Fedora 30 VM running in GCE
/etc/cni/net.d/87-podman-bridge.conflist: