Add Route Sync Routine #1262

Merged: 3 commits, Mar 18, 2022

Collaborator

aauren commented Mar 15, 2022

@murali-reddy @mrueg

Replaces #1151 and largely based on the original commit @Rusox89 made in that PR. Thanks for your work there!

I rebased his work and patched it to cover a few use-cases that weren't accounted for and to bring it more in line with the rest of the kube-router code base.

Fixes #509, a rare situation where routes are lost after an interface is brought down and back up, by introducing a route synchronization goroutine. The goroutine keeps track of what the kernel routing table should look like and periodically re-syncs it.

I have deployed this in a test cluster and tested adding and removing routes and starting and stopping peers. It behaved correctly in every case I tried.

@aauren aauren changed the title Fix/route sync routine Add Route Sync Routine Mar 15, 2022
@aauren aauren force-pushed the fix/route_sync_routine branch 2 times, most recently from 5173dbb to beb279f Compare March 18, 2022 14:02
aauren added 2 commits March 18, 2022 11:19
Added the following items to the original logic:
* Added map route entry deletion on withdrawal so that the system doesn't
  incorrectly sync it back to the kernel's routing table
* Added an immediate route sync upon BGP path receive
* Added a mutex to ensure that deleted routes aren't accidentally synced
  back to the system
* Added stopCh and wg (wait group) handling
* Increased default sync time from 15 seconds to 1 minute since this
  scenario is unlikely and netlink calls could potentially be burdensome
  in large clusters.
@aauren aauren force-pushed the fix/route_sync_routine branch from beb279f to b73f5f3 Compare March 18, 2022 16:34
Collaborator Author

aauren commented Mar 18, 2022

Gave this quite a bit of testing in my development cluster. Along with the unit tests, I think we can have a reasonable amount of confidence in it. Merging.

@aauren aauren merged commit 2d9fb92 into cloudnativelabs:master Mar 18, 2022
eskytthe added a commit to eskytthe/kube-router that referenced this pull request Apr 29, 2022
eskytthe added a commit to eskytthe/kube-router that referenced this pull request May 5, 2022
Successfully merging this pull request may close these issues.

Route table gone forever after network interface link down and up