RIP routes marked inactive and not being replaced #5174
Comments
More info. I found that this 0.0.0.0 -> 10.10.1.254 route is not coming from the router but from three of our Ubuntu nodes (running FRR 7.1):
K>* 0.0.0.0/0 [0/0] via 10.10.1.254, primary-lan, 02w2d00h
This comes from netplan (default routes added for each LAN segment). So to sum up, machine 25 is getting a default route via 10.10.1.254 from machine 34 via RIP. It is also getting a default from 10.10.1.1 and 10.10.1.2 from BGP. Something is happening (I guess to machine 34 now) that is making the route inactive, so why isn't RIP timing that route out and picking up the default from one of the two routers?
I took the default routes out of netplan.yaml on nj34 and ran netplan apply. The kernel routes above stayed in the routing table, so I deleted both with ip route del 0.0.0.0/0. I then ran netstat -nr | grep 0.0.0.0 several times and watched the default route get acquired from different machines in my network, until it stopped and there was no default route at all. Curious, I logged into zebra, did a sh ip ro, and got the following:
R 0.0.0.0/0 [120/2] via 10.10.2.254 inactive, 00:00:15
So even after I deleted the route manually, it is being held (long past all timers). I finally restarted frr and it picked up the default from one of the routers. Very odd behavior.
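For reference, here is a minimal sketch of what the RIP timers would be expected to do with a route that stops being refreshed. It assumes the RFC 2453 defaults (timeout 180 s, garbage collection 120 s), which FRR's documented `timers basic` defaults match. The names `expected_state` and `age_to_seconds` are hypothetical helpers, not part of FRR. Treating the age shown by sh ip ro as "time since the last RIP update" is also an assumption; zebra may be reporting something else, such as route uptime.

```python
from enum import Enum

# RFC 2453 default timers, in seconds (assumed to be unchanged in ripd.conf).
TIMEOUT = 180          # route expires if no update arrives within this window
GARBAGE_COLLECT = 120  # expired route is kept at metric 16, then deleted


class RouteState(Enum):
    VALID = "valid"
    EXPIRED = "expired, metric 16, awaiting garbage collection"
    DELETED = "deleted"


def expected_state(seconds_since_update: int,
                   timeout: int = TIMEOUT,
                   garbage: int = GARBAGE_COLLECT) -> RouteState:
    """Return the state RFC 2453 timers would put an unrefreshed route in."""
    if seconds_since_update < 0:
        raise ValueError("age cannot be negative")
    if seconds_since_update < timeout:
        return RouteState.VALID
    if seconds_since_update < timeout + garbage:
        return RouteState.EXPIRED
    return RouteState.DELETED


def age_to_seconds(age: str) -> int:
    """Convert an 'HH:MM:SS' age string into seconds."""
    hours, minutes, seconds = (int(part) for part in age.split(":"))
    return hours * 3600 + minutes * 60 + seconds


if __name__ == "__main__":
    for age in ("00:00:15", "00:04:00", "01:30:53"):
        print(age, "->", expected_state(age_to_seconds(age)).value)
```

Under those assumptions, a route that has not been refreshed for more than 300 seconds should no longer be in the RIP table at all, which is why an inactive entry lingering for over an hour looks wrong.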
Can you reproduce it?
@seanfulton can you possibly try to recreate this on a later version of CentOS? We don't really do regression testing for 6 anymore given that it's more or less EOL at this point.
I can confirm this is happening on CentOS 7 with FRR 7.2. Same exact behavior:
R 0.0.0.0/0 [120/2] via 10.10.1.254 inactive, 01:30:53
What do you want me to do here? This is becoming very problematic for us. It's happening on CentOS 6, CentOS 7, and Ubuntu 18.04 with the 7.2 versions.
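Until the root cause is found, one way to catch the stuck state is to scan the routing table text for an inactive RIP default route. The following is a rough sketch only. It assumes the output has already been captured as text (for example from vtysh -c "show ip route") and that lines look like the ones quoted in this thread. `parse_route` and `stuck_rip_defaults` are hypothetical helper names, and the regular expression handles only the simple IPv4 "via" form shown here.

```python
import re
from typing import List, NamedTuple, Optional

# Matches lines such as:
#   R 0.0.0.0/0 [120/2] via 10.10.1.254 inactive, 01:30:53
#   K>* 0.0.0.0/0 [0/0] via 10.10.1.254, primary-lan, 02w2d00h
ROUTE_RE = re.compile(
    r"^(?P<code>[A-Z])(?P<flags>[>*]*)\s+"
    r"(?P<prefix>\d+\.\d+\.\d+\.\d+/\d+)\s+"
    r"\[(?P<distance>\d+)/(?P<metric>\d+)\]\s+"
    r"via\s+(?P<nexthop>\d+\.\d+\.\d+\.\d+)"
    r"(?P<rest>.*)$"
)


class Route(NamedTuple):
    code: str        # protocol code, e.g. "R" for RIP, "K" for kernel
    selected: bool   # True when the ">" flag is present
    prefix: str
    distance: int
    metric: int
    nexthop: str
    inactive: bool


def parse_route(line: str) -> Optional[Route]:
    """Parse one route line; return None if it does not match the pattern."""
    m = ROUTE_RE.match(line.strip())
    if m is None:
        return None
    return Route(
        code=m["code"],
        selected=">" in m["flags"],
        prefix=m["prefix"],
        distance=int(m["distance"]),
        metric=int(m["metric"]),
        nexthop=m["nexthop"],
        inactive=re.search(r"\binactive\b", m["rest"]) is not None,
    )


def stuck_rip_defaults(output: str) -> List[Route]:
    """Return RIP default routes that are marked inactive."""
    routes = (parse_route(line) for line in output.splitlines())
    return [r for r in routes
            if r and r.code == "R" and r.prefix == "0.0.0.0/0" and r.inactive]
```

A cron job or monitoring check could call `stuck_rip_defaults` on the captured output and alert whenever the list is non-empty. What to do in response (for example, restarting frr) is left to the operator.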
Hey guys, this is a serious issue. I'm reverting all of our nodes back to Quagga until someone figures this out. It's too risky to continue in production with this. sean
Seems related to #13561
We are using FRR RPM frr-7.0-01.el6.x86_64 on CentOS 6. We used Quagga with no problems until about a month ago, when we upgraded to FRR. Since then we've noticed that machines will randomly lose their default route. When I examine the routing table, I see the default route marked as a RIP route but inactive.
This seems similar to: #4535
About our network: We have two border routers running zebra. Each gets a default route via BGP and advertises it to the network using RIP. We have a static IP (#.#.#.254) that floats from router to router that non-RIP devices can use as a default GW.
When the hang occurs, I see this:
If I restart FRR, it immediately picks up a new default via RIP from 10.10.1.1 or 10.10.2.1, depending.
So my theory is that something causes the .254 address to flip over from, say, router A to router B.
My feeling is that if this .254 address becomes inactive, the route should be flushed from the routing table and a new route learned via RIP from either 10.10.1.1 or 10.10.1.2. Instead, the old route hangs.
Any idea why?
ripd.conf:
zebra.conf: