Skip to content

Commit

Permalink
Allow Calico networking on Azure and DigitalOcean
Browse files Browse the repository at this point in the history
* Introduce "calico" as a `networking` option on Azure and DigitalOcean
using Calico's new VXLAN support (similar to flannel). Flannel remains
the default on these platforms for now.
* Historically, DigitalOcean and Azure only allowed Flannel as the
CNI provider, since those platforms don't support IPIP traffic that
was previously required for Calico.
* Looking forward, its desireable for Calico to become the default
across Typhoon clusters, since it provides NetworkPolicy and a
consistent experience
* No changes to AWS, GCP, or bare-metal where Calico remains the
default CNI provider. On these platforms, IPIP mode will always
be used, since its available and more performant than vxlan
  • Loading branch information
dghubble committed May 17, 2019
1 parent 222a942 commit c88129d
Show file tree
Hide file tree
Showing 7 changed files with 41 additions and 16 deletions.
17 changes: 12 additions & 5 deletions azure/container-linux/kubernetes/bootkube.tf
Original file line number Diff line number Diff line change
Expand Up @@ -2,11 +2,18 @@
module "bootkube" {
source = "git::https://github.com/poseidon/terraform-render-bootkube.git?ref=85571f6dae3522e2a7de01b7e0a3f7e3a9359641/"

cluster_name = "${var.cluster_name}"
api_servers = ["${format("%s.%s", var.cluster_name, var.dns_zone)}"]
etcd_servers = ["${formatlist("%s.%s", azurerm_dns_a_record.etcds.*.name, var.dns_zone)}"]
asset_dir = "${var.asset_dir}"
networking = "flannel"
cluster_name = "${var.cluster_name}"
api_servers = ["${format("%s.%s", var.cluster_name, var.dns_zone)}"]
etcd_servers = ["${formatlist("%s.%s", azurerm_dns_a_record.etcds.*.name, var.dns_zone)}"]
asset_dir = "${var.asset_dir}"

networking = "${var.networking}"

# only effective with Calico networking
# we should be able to use 1450 MTU, but in practice, 1410 was needed
network_encapsulation = "vxlan"
network_mtu = "1410"

pod_cidr = "${var.pod_cidr}"
service_cidr = "${var.service_cidr}"
cluster_domain_suffix = "${var.cluster_domain_suffix}"
Expand Down
6 changes: 6 additions & 0 deletions azure/container-linux/kubernetes/variables.tf
Original file line number Diff line number Diff line change
Expand Up @@ -88,6 +88,12 @@ variable "asset_dir" {
type = "string"
}

variable "networking" {
description = "Choice of networking provider (flannel or calico)"
type = "string"
default = "flannel"
}

variable "host_cidr" {
description = "CIDR IPv4 range to assign to instances"
type = "string"
Expand Down
17 changes: 11 additions & 6 deletions digital-ocean/container-linux/kubernetes/bootkube.tf
Original file line number Diff line number Diff line change
Expand Up @@ -2,12 +2,17 @@
module "bootkube" {
source = "git::https://github.com/poseidon/terraform-render-bootkube.git?ref=85571f6dae3522e2a7de01b7e0a3f7e3a9359641/"

cluster_name = "${var.cluster_name}"
api_servers = ["${format("%s.%s", var.cluster_name, var.dns_zone)}"]
etcd_servers = "${digitalocean_record.etcds.*.fqdn}"
asset_dir = "${var.asset_dir}"
networking = "flannel"
network_mtu = 1440
cluster_name = "${var.cluster_name}"
api_servers = ["${format("%s.%s", var.cluster_name, var.dns_zone)}"]
etcd_servers = "${digitalocean_record.etcds.*.fqdn}"
asset_dir = "${var.asset_dir}"

networking = "${var.networking}"

# only effective with Calico networking
network_encapsulation = "vxlan"
network_mtu = "1450"

pod_cidr = "${var.pod_cidr}"
service_cidr = "${var.service_cidr}"
cluster_domain_suffix = "${var.cluster_domain_suffix}"
Expand Down
1 change: 1 addition & 0 deletions digital-ocean/container-linux/kubernetes/ssh.tf
Original file line number Diff line number Diff line change
@@ -1,6 +1,7 @@
# Secure copy etcd TLS assets and kubeconfig to controllers. Activates kubelet.service
resource "null_resource" "copy-controller-secrets" {
count = "${var.controller_count}"

depends_on = [
"digitalocean_firewall.rules",
]
Expand Down
6 changes: 6 additions & 0 deletions digital-ocean/container-linux/kubernetes/variables.tf
Original file line number Diff line number Diff line change
Expand Up @@ -71,6 +71,12 @@ variable "asset_dir" {
type = "string"
}

variable "networking" {
description = "Choice of networking provider (flannel or calico)"
type = "string"
default = "flannel"
}

variable "pod_cidr" {
description = "CIDR IPv4 range to assign Kubernetes pods"
type = "string"
Expand Down
1 change: 1 addition & 0 deletions digital-ocean/fedora-atomic/kubernetes/ssh.tf
Original file line number Diff line number Diff line change
@@ -1,6 +1,7 @@
# Secure copy etcd TLS assets and kubeconfig to controllers. Activates kubelet.service
resource "null_resource" "copy-controller-secrets" {
count = "${var.controller_count}"

depends_on = [
"digitalocean_firewall.rules",
]
Expand Down
9 changes: 4 additions & 5 deletions docs/topics/performance.md
Original file line number Diff line number Diff line change
Expand Up @@ -26,13 +26,13 @@ Network performance varies based on the platform and CNI plugin. `iperf` was use
|----------------------------|-------:|-------------:|-------------:|
| AWS (flannel) | 5 Gb/s | 4.94 Gb/s | 4.89 Gb/s |
| AWS (calico, MTU 1480) | 5 Gb/s | 4.94 Gb/s | 4.42 Gb/s |
| AWS (calico, MTU 8981) | 5 Gb/s | 4.94 Gb/s | 4.75 Gb/s |
| AWS (calico, MTU 8981) | 5 Gb/s | 4.94 Gb/s | 4.90 Gb/s |
| Azure (flannel) | Varies | 749 Mb/s | 680 Mb/s |
| Azure (calico) | Varies | 749 Mb/s | 680 Mb/s |
| Bare-Metal (flannel) | 1 Gb/s | 940 Mb/s | 903 Mb/s |
| Bare-Metal (calico) | 1 Gb/s | 940 Mb/s | 931 Mb/s |
| Bare-Metal (flannel, bond) | 3 Gb/s | 2.3 Gb/s | 1.17 Gb/s |
| Bare-Metal (calico, bond) | 3 Gb/s | 2.3 Gb/s | 1.17 Gb/s |
| Digital Ocean | 2 Gb/s | 1.97 Gb/s | 1.64 Gb/s |
| Digital Ocean (flannel) | Varies | 1.97 Gb/s | 1.64 Gb/s |
| Digital Ocean (calico) | Varies | 1.97 Gb/s | 1.50 Gb/s |
| Google Cloud (flannel) | 2 Gb/s | 1.94 Gb/s | 1.76 Gb/s |
| Google Cloud (calico) | 2 Gb/s | 1.94 Gb/s | 1.81 Gb/s |

Expand All @@ -41,5 +41,4 @@ Notes:
* Calico and Flannel have comparable performance. Platform and configuration differences dominate.
* AWS and Azure node bandwidth (i.e. upper bound) depends greatly on machine type
* Only [certain AWS EC2 instance types](http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/network_mtu.html#jumbo_frame_instances) allow jumbo frames. This is why the default MTU on AWS must be 1480.
* Neither CNI provider seems to be able to leverage bonded NICs well (bare-metal)

0 comments on commit c88129d

Please sign in to comment.