Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Switch GCP network lb to global TCP proxy lb #190

Merged
merged 1 commit into from
Apr 18, 2018

Conversation

dghubble
Copy link
Member

@dghubble dghubble commented Apr 15, 2018

  • Allow multi-controller clusters on Google Cloud
  • GCP regional network load balancers have a long open bug in which requests originating from a backend instance are routed to the instance itself, regardless of whether the health check passes or not. As a result, only the 0th controller node registers. We've recommended just using single master GCP clusters for a while
  • https://issuetracker.google.com/issues/67366622
  • Workaround issue by switching to a GCP TCP Proxy load balancer. TCP proxy lb routes traffic to a backend service (global) of instance group backends. In our case, spread controllers across 3 zones (all regions have 3+ zones) and organize them in 3 zonal unmanaged instance groups that serve as backends. Allows multi-controller cluster creation
  • GCP network load balancers only allowed legacy HTTP health checks so kubelet 10255 was checked as an approximation of controller health. Replace with TCP apiserver health checks to detect unhealth or unresponsive apiservers.
  • Drawbacks: GCP provision time increases, tailed logs now timeout (similar tradeoff in AWS), controllers only span 3 zones instead of the exact number present in the region
  • Workaround in Typhoon has been known and posted for 5 months, but there still appears to be no better alternative. Its probably time to support multi-master and accept the downsides

Fixes: #54

@dghubble dghubble force-pushed the google-cloud-multi-controller branch 2 times, most recently from 96ad0a5 to 07fcf34 Compare April 15, 2018 21:55
* Allow multi-controller clusters on Google Cloud
* GCP regional network load balancers have a long open
bug in which requests originating from a backend instance
are routed to the instance itself, regardless of whether
the health check passes or not. As a result, only the 0th
controller node registers. We've recommended just using
single master GCP clusters for a while
* https://issuetracker.google.com/issues/67366622
* Workaround issue by switching to a GCP TCP Proxy load
balancer. TCP proxy lb routes traffic to a backend service
(global) of instance group backends. In our case, spread
controllers across 3 zones (all regions have 3+ zones) and
organize them in 3 zonal unmanaged instance groups that
serve as backends. Allows multi-controller cluster creation
* GCP network load balancers only allowed legacy HTTP health
checks so kubelet 10255 was checked as an approximation of
controller health. Replace with TCP apiserver health checks
to detect unhealth or unresponsive apiservers.
* Drawbacks: GCP provision time increases, tailed logs now
timeout (similar tradeoff in AWS), controllers only span 3
zones instead of the exact number in the region
* Workaround in Typhoon has been known and posted for 5 months,
but there still appears to be no better alternative. Its
probably time to support multi-master and accept the downsides
@dghubble dghubble force-pushed the google-cloud-multi-controller branch from 07fcf34 to ad2e431 Compare April 18, 2018 07:09
@dghubble dghubble merged commit ad2e431 into master Apr 18, 2018
@dghubble dghubble deleted the google-cloud-multi-controller branch April 18, 2018 07:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant