Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TOCTOU Race adding firewall chain #335

Closed
cevich opened this issue Jun 12, 2019 · 7 comments
Closed

TOCTOU Race adding firewall chain #335

cevich opened this issue Jun 12, 2019 · 7 comments
Labels

Comments

@cevich
Copy link

cevich commented Jun 12, 2019

I'm not very familiar with this codebase or tool, so please forgive me of some ignorance 😄

During CI-testing of libpod, we run ginkgo with -nodes 3 (and tests in random order as default) for speed, but also because it catches "interesting" race conditions. One in particular I've been researching, seems like it may have an easy fix.

Since podman has no daemon, it's entirely possible for multiple cni operations to be running concurrently. I believe this is exposing a time-of-check, time-of-use race here:

https://github.com/containernetworking/plugins/blob/master/plugins/meta/firewall/iptables.go#L63-L69

In other words, two containers are coming up, no chain exists, both try to create, one wins, the other fails. I'm willing to take a stab at fixing this, but since I don't know the code well, I wanted to check first.

I think the fix might be as simple as always ignoring that particular error message. But also I'm not sure of the best way to add a new unit(?) test for this.

Please advise.

@squeed
Copy link
Member

squeed commented Jun 13, 2019

Ah yes, that is a very good observation. That does seem like an easy workaround, and your suggestion seems correct.

Is there a chance that we could get the same error message when trying to insert a slightly different rule? I don't think so, but can you check?

@cevich
Copy link
Author

cevich commented Jun 13, 2019

Easily, and worse. Iptables is extremely naive/simple-minded about such things. It's easily possible one process can add a rule and another one can change it. Top-of-the-head example: One process "inserts" a rule at position 1 while another (simultaneously) "deletes" the rule at position 5...which position 5? The one before or after the first process succeeded? I'd guess this was one of the main reasons firewalld/firewall-cmd was invented 😄

@dcbw
Copy link
Member

dcbw commented Jul 10, 2019

Does this fix the problem for you? coreos/go-iptables#62

I believe that was the result of debugging some podman issues and we found that when using iptables-nft the error code for "this chain already exists" was different and thus not ignored.

@cevich
Copy link
Author

cevich commented Jul 10, 2019

Thanks @dcbw it might could be, I don't recall seeing this particular flake coming out of our CI results recently. We were just talking about scooping up a new CNI version anyway for packaging, which would address the problem outside of CI (I have no reports of happening, but it seems technically possible).

I'll keep an eye on our CI system runs and mention here if I come across it occurring in the next few weeks. Okay by me to assume no-news == good-news, and close this.

@cevich
Copy link
Author

cevich commented Jul 24, 2019

Dangit...found this happening again in a run of testing libpod on master.

@mheon
Copy link

mheon commented Jul 24, 2019

The go-iptables fix seems to work in testing - would be nice to get a release including it so we can get it packaged for Fedora

@cevich
Copy link
Author

cevich commented Jul 24, 2019

Ack, thanks for confirming our CI isn't yet up to snuff...okay, so back to monitoring this then.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants