Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow Calico networking on Azure and DigitalOcean #472

Merged
merged 1 commit into from
May 20, 2019
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
13 changes: 11 additions & 2 deletions CHANGES.md
Original file line number Diff line number Diff line change
Expand Up @@ -7,14 +7,23 @@ Notable changes between versions.
* Kubernetes [v1.14.2](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.14.md#v1142)
* Update etcd from v3.3.12 to [v3.3.13](https://github.com/etcd-io/etcd/releases/tag/v3.3.13)
* Upgrade Calico from v3.6.1 to [v3.7.2](https://docs.projectcalico.org/v3.7/release-notes/)
* Change flannel port from 8472 (kernel default) to 4789 (IANA VXLAN)
* Change flannel VXLAN port from 8472 (kernel default) to 4789 (IANA VXLAN)

#### AWS

* Only set internal VXLAN rules when `networking` is flannel (default: calico)
* Only set internal VXLAN rules when `networking` is "flannel" (default: calico)

#### Azure

* Allow choosing Calico as the network provider (experimental) ([#472](https://github.com/poseidon/typhoon/pull/472))
* Add a `networking` variable accepting "flannel" (default) or "calico"
* Use VXLAN encapsulation since Azure doesn't support IPIP

#### DigitalOcean

* Allow choosing Calico as the network provider (experimental) ([#472](https://github.com/poseidon/typhoon/pull/472))
* Add a `networking` variable accepting "flannel" (default) or "calico"
* Use VXLAN encapsulation since DigitalOcean doesn't support IPIP
* Add explicit ordering between firewall rule creation and secure copying Kubelet credentials ([#469](https://github.com/poseidon/typhoon/pull/469))
* Fix race scenario if copies to nodes were before rule creation, blocking cluster creation

Expand Down
17 changes: 12 additions & 5 deletions azure/container-linux/kubernetes/bootkube.tf
Original file line number Diff line number Diff line change
Expand Up @@ -2,11 +2,18 @@
module "bootkube" {
source = "git::https://github.com/poseidon/terraform-render-bootkube.git?ref=85571f6dae3522e2a7de01b7e0a3f7e3a9359641/"

cluster_name = "${var.cluster_name}"
api_servers = ["${format("%s.%s", var.cluster_name, var.dns_zone)}"]
etcd_servers = ["${formatlist("%s.%s", azurerm_dns_a_record.etcds.*.name, var.dns_zone)}"]
asset_dir = "${var.asset_dir}"
networking = "flannel"
cluster_name = "${var.cluster_name}"
api_servers = ["${format("%s.%s", var.cluster_name, var.dns_zone)}"]
etcd_servers = ["${formatlist("%s.%s", azurerm_dns_a_record.etcds.*.name, var.dns_zone)}"]
asset_dir = "${var.asset_dir}"

networking = "${var.networking}"

# only effective with Calico networking
# we should be able to use 1450 MTU, but in practice, 1410 was needed
network_encapsulation = "vxlan"
network_mtu = "1410"

pod_cidr = "${var.pod_cidr}"
service_cidr = "${var.service_cidr}"
cluster_domain_suffix = "${var.cluster_domain_suffix}"
Expand Down
6 changes: 6 additions & 0 deletions azure/container-linux/kubernetes/variables.tf
Original file line number Diff line number Diff line change
Expand Up @@ -88,6 +88,12 @@ variable "asset_dir" {
type = "string"
}

variable "networking" {
description = "Choice of networking provider (flannel or calico)"
type = "string"
default = "flannel"
}

variable "host_cidr" {
description = "CIDR IPv4 range to assign to instances"
type = "string"
Expand Down
17 changes: 11 additions & 6 deletions digital-ocean/container-linux/kubernetes/bootkube.tf
Original file line number Diff line number Diff line change
Expand Up @@ -2,12 +2,17 @@
module "bootkube" {
source = "git::https://github.com/poseidon/terraform-render-bootkube.git?ref=85571f6dae3522e2a7de01b7e0a3f7e3a9359641/"

cluster_name = "${var.cluster_name}"
api_servers = ["${format("%s.%s", var.cluster_name, var.dns_zone)}"]
etcd_servers = "${digitalocean_record.etcds.*.fqdn}"
asset_dir = "${var.asset_dir}"
networking = "flannel"
network_mtu = 1440
cluster_name = "${var.cluster_name}"
api_servers = ["${format("%s.%s", var.cluster_name, var.dns_zone)}"]
etcd_servers = "${digitalocean_record.etcds.*.fqdn}"
asset_dir = "${var.asset_dir}"

networking = "${var.networking}"

# only effective with Calico networking
network_encapsulation = "vxlan"
network_mtu = "1450"

pod_cidr = "${var.pod_cidr}"
service_cidr = "${var.service_cidr}"
cluster_domain_suffix = "${var.cluster_domain_suffix}"
Expand Down
1 change: 1 addition & 0 deletions digital-ocean/container-linux/kubernetes/ssh.tf
Original file line number Diff line number Diff line change
@@ -1,6 +1,7 @@
# Secure copy etcd TLS assets and kubeconfig to controllers. Activates kubelet.service
resource "null_resource" "copy-controller-secrets" {
count = "${var.controller_count}"

depends_on = [
"digitalocean_firewall.rules",
]
Expand Down
6 changes: 6 additions & 0 deletions digital-ocean/container-linux/kubernetes/variables.tf
Original file line number Diff line number Diff line change
Expand Up @@ -71,6 +71,12 @@ variable "asset_dir" {
type = "string"
}

variable "networking" {
description = "Choice of networking provider (flannel or calico)"
type = "string"
default = "flannel"
}

variable "pod_cidr" {
description = "CIDR IPv4 range to assign Kubernetes pods"
type = "string"
Expand Down
1 change: 1 addition & 0 deletions digital-ocean/fedora-atomic/kubernetes/ssh.tf
Original file line number Diff line number Diff line change
@@ -1,6 +1,7 @@
# Secure copy etcd TLS assets and kubeconfig to controllers. Activates kubelet.service
resource "null_resource" "copy-controller-secrets" {
count = "${var.controller_count}"

depends_on = [
"digitalocean_firewall.rules",
]
Expand Down
1 change: 1 addition & 0 deletions docs/cl/azure.md
Original file line number Diff line number Diff line change
Expand Up @@ -253,6 +253,7 @@ Reference the DNS zone with `"${azurerm_dns_zone.clusters.name}"` and its resour
| worker_priority | Set priority to Low to use reduced cost surplus capacity, with the tradeoff that instances can be deallocated at any time | Regular | Low |
| controller_clc_snippets | Controller Container Linux Config snippets | [] | [example](/advanced/customization/#usage) |
| worker_clc_snippets | Worker Container Linux Config snippets | [] | [example](/advanced/customization/#usage) |
| networking | Choice of networking provider | "flannel" | "flannel" or "calico" (experimental) |
| host_cidr | CIDR IPv4 range to assign to instances | "10.0.0.0/16" | "10.0.0.0/20" |
| pod_cidr | CIDR IPv4 range to assign to Kubernetes pods | "10.2.0.0/16" | "10.22.0.0/16" |
| service_cidr | CIDR IPv4 range to assign to Kubernetes services | "10.3.0.0/16" | "10.3.0.0/24" |
Expand Down
1 change: 1 addition & 0 deletions docs/cl/digital-ocean.md
Original file line number Diff line number Diff line change
Expand Up @@ -253,6 +253,7 @@ Digital Ocean requires the SSH public key be uploaded to your account, so you ma
| image | Container Linux image for instances | "coreos-stable" | coreos-stable, coreos-beta, coreos-alpha |
| controller_clc_snippets | Controller Container Linux Config snippets | [] | [example](/advanced/customization/) |
| worker_clc_snippets | Worker Container Linux Config snippets | [] | [example](/advanced/customization/) |
| networking | Choice of networking provider | "flannel" | "flannel" or "calico" (experimental) |
| pod_cidr | CIDR IPv4 range to assign to Kubernetes pods | "10.2.0.0/16" | "10.22.0.0/16" |
| service_cidr | CIDR IPv4 range to assign to Kubernetes services | "10.3.0.0/16" | "10.3.0.0/24" |
| cluster_domain_suffix | FQDN suffix for Kubernetes services answered by coredns. | "cluster.local" | "k8s.example.com" |
Expand Down
13 changes: 6 additions & 7 deletions docs/topics/performance.md
Original file line number Diff line number Diff line change
Expand Up @@ -26,20 +26,19 @@ Network performance varies based on the platform and CNI plugin. `iperf` was use
|----------------------------|-------:|-------------:|-------------:|
| AWS (flannel) | 5 Gb/s | 4.94 Gb/s | 4.89 Gb/s |
| AWS (calico, MTU 1480) | 5 Gb/s | 4.94 Gb/s | 4.42 Gb/s |
| AWS (calico, MTU 8981) | 5 Gb/s | 4.94 Gb/s | 4.75 Gb/s |
| Azure (flannel) | Varies | 749 Mb/s | 680 Mb/s |
| AWS (calico, MTU 8981) | 5 Gb/s | 4.94 Gb/s | 4.90 Gb/s |
| Azure (flannel) | Varies | 749 Mb/s | 650 Mb/s |
| Azure (calico) | Varies | 749 Mb/s | 650 Mb/s |
| Bare-Metal (flannel) | 1 Gb/s | 940 Mb/s | 903 Mb/s |
| Bare-Metal (calico) | 1 Gb/s | 940 Mb/s | 931 Mb/s |
| Bare-Metal (flannel, bond) | 3 Gb/s | 2.3 Gb/s | 1.17 Gb/s |
| Bare-Metal (calico, bond) | 3 Gb/s | 2.3 Gb/s | 1.17 Gb/s |
| Digital Ocean | 2 Gb/s | 1.97 Gb/s | 1.64 Gb/s |
| Digital Ocean (flannel) | Varies | 1.97 Gb/s | 1.20 Gb/s |
| Digital Ocean (calico) | Varies | 1.97 Gb/s | 1.20 Gb/s |
| Google Cloud (flannel) | 2 Gb/s | 1.94 Gb/s | 1.76 Gb/s |
| Google Cloud (calico) | 2 Gb/s | 1.94 Gb/s | 1.81 Gb/s |

Notes:

* Calico and Flannel have comparable performance. Platform and configuration differences dominate.
* AWS and Azure node bandwidth (i.e. upper bound) depends greatly on machine type
* Azure and DigitalOcean network performance can be quite variable or depend on machine type
* Only [certain AWS EC2 instance types](http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/network_mtu.html#jumbo_frame_instances) allow jumbo frames. This is why the default MTU on AWS must be 1480.
* Neither CNI provider seems to be able to leverage bonded NICs well (bare-metal)