A kubernetes cluster stores all of its data in etcd, making it one of the most critical components. Note that an etcd cluster needs an odd number of members (1, 3, 5, ...); this document builds a 3-node cluster.
Open the roles/etcd/tasks/main.yml file in another window and follow along with the explanation below.
Note: the certificates are created on the deploy node and then pushed to the target etcd nodes, which improves the security of the CA certificate.
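This step is normally driven by the kubeasz setup scripts rather than run by hand; depending on the version of the project you are using, the invocation looks roughly like one of the commands below (both shown only as assumptions, check your own installation).
# kubeasz 3.x style: run step 02 for your cluster (the cluster name "k8s-01" is a placeholder)
ezctl setup k8s-01 02
# older layouts: run the etcd playbook directly from the ansible working directory
ansible-playbook 02.etcd.yml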
Create the etcd certificate signing request template etcd-csr.json.j2:
{
  "CN": "etcd",
  "hosts": [
{% for host in groups['etcd'] %}
    "{{ host }}",
{% endfor %}
    "127.0.0.1"
  ],
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "ST": "HangZhou",
      "L": "XS",
      "O": "k8s",
      "OU": "System"
    }
  ]
}
- etcd uses peer certificates, so the hosts field must list the IPs of the etcd nodes authorized to use this certificate; all etcd node addresses are enumerated here
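On the deploy node, the rendered CSR is then signed by the cluster CA with cfssl. A minimal sketch of that signing step, assuming the CA files and a ca-config.json with a kubernetes profile sit in the working directory (file names and profile name are assumptions, not taken from the role):
# sign the rendered etcd-csr.json with the cluster CA, producing etcd.pem / etcd-key.pem
cfssl gencert \
  -ca=ca.pem \
  -ca-key=ca-key.pem \
  -config=ca-config.json \
  -profile=kubernetes \
  etcd-csr.json | cfssljson -bare etcd
The resulting etcd.pem and etcd-key.pem are what get pushed to {{ ca_dir }} on each etcd node.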
Create the etcd systemd service template etcd.service.j2:
[Unit]
Description=Etcd Server
After=network.target
After=network-online.target
Wants=network-online.target
Documentation=https://github.com/coreos
[Service]
Type=notify
WorkingDirectory={{ ETCD_DATA_DIR }}
ExecStart={{ bin_dir }}/etcd \
--name=etcd-{{ inventory_hostname }} \
--cert-file={{ ca_dir }}/etcd.pem \
--key-file={{ ca_dir }}/etcd-key.pem \
--peer-cert-file={{ ca_dir }}/etcd.pem \
--peer-key-file={{ ca_dir }}/etcd-key.pem \
--trusted-ca-file={{ ca_dir }}/ca.pem \
--peer-trusted-ca-file={{ ca_dir }}/ca.pem \
--initial-advertise-peer-urls=https://{{ inventory_hostname }}:2380 \
--listen-peer-urls=https://{{ inventory_hostname }}:2380 \
--listen-client-urls=https://{{ inventory_hostname }}:2379,http://127.0.0.1:2379 \
--advertise-client-urls=https://{{ inventory_hostname }}:2379 \
--initial-cluster-token=etcd-cluster-0 \
--initial-cluster={{ ETCD_NODES }} \
--initial-cluster-state={{ CLUSTER_STATE }} \
--data-dir={{ ETCD_DATA_DIR }} \
--wal-dir={{ ETCD_WAL_DIR }} \
--snapshot-count=50000 \
--auto-compaction-retention=1 \
--auto-compaction-mode=periodic \
--max-request-bytes=10485760 \
--quota-backend-bytes=8589934592
Restart=always
RestartSec=15
LimitNOFILE=65536
OOMScoreAdjust=-999
[Install]
WantedBy=multi-user.target
- For the full list of parameters, run etcd --help
- Note that etcd needs both a server certificate and a client certificate; for convenience, a single peer certificate is used here in place of the two
- When --initial-cluster-state is new, the value of --name must appear in the --initial-cluster list (a rendered example of that list is shown right after these notes)
- --snapshot-count and --auto-compaction-retention are performance-tuning parameters; see the etcd project documentation for details
- Setting --data-dir and --wal-dir to directories on different disks avoids disk I/O contention and improves performance; see the etcd project documentation for details
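As a concrete illustration (reusing the example node IPs from the verification step below), the ETCD_NODES variable that fills --initial-cluster would render to something like the following; the member names follow the etcd-{{ inventory_hostname }} pattern used by --name above, and the IPs are placeholders:
# example rendering of {{ ETCD_NODES }} for a 3-node cluster
ETCD_NODES="etcd-192.168.1.1=https://192.168.1.1:2380,etcd-192.168.1.2=https://192.168.1.2:2380,etcd-192.168.1.3=https://192.168.1.3:2380"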
- Check the service status with systemctl status etcd
- View the runtime logs with journalctl -u etcd
- Run the following commands on any etcd cluster node
# set the shell variable $NODE_IPS according to the etcd hosts configuration
export NODE_IPS="192.168.1.1 192.168.1.2 192.168.1.3"
for ip in ${NODE_IPS}; do
ETCDCTL_API=3 etcdctl \
--endpoints=https://${ip}:2379 \
--cacert=/etc/kubernetes/ssl/ca.pem \
--cert=/etc/kubernetes/ssl/etcd.pem \
--key=/etc/kubernetes/ssl/etcd-key.pem \
endpoint health; done
for ip in ${NODE_IPS}; do
ETCDCTL_API=3 etcdctl \
--endpoints=https://${ip}:2379 \
--cacert=/etc/kubernetes/ssl/ca.pem \
--cert=/etc/kubernetes/ssl/etcd.pem \
--key=/etc/kubernetes/ssl/etcd-key.pem \
--write-out=table endpoint status; done
Expected output:
https://192.168.1.1:2379 is healthy: successfully committed proposal: took = 2.210885ms
https://192.168.1.2:2379 is healthy: successfully committed proposal: took = 2.784043ms
https://192.168.1.3:2379 is healthy: successfully committed proposal: took = 3.275709ms
When all three etcd endpoints report healthy, the cluster is working correctly.
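As an optional extra check, listing the cluster members should show all three nodes; a minimal sketch, reusing the same certificate paths and an example endpoint from above:
# list cluster members; all three members should appear with status "started"
ETCDCTL_API=3 etcdctl \
  --endpoints=https://192.168.1.1:2379 \
  --cacert=/etc/kubernetes/ssl/ca.pem \
  --cert=/etc/kubernetes/ssl/etcd.pem \
  --key=/etc/kubernetes/ssl/etcd-key.pem \
  --write-out=table member list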