What is etcd? The official definition is:
etcd is a strongly consistent, distributed key-value store that provides a reliable way to store data that needs to be accessed by a distributed system or cluster of machines. It gracefully handles leader elections during network partitions and can tolerate machine failure, even in the leader node.
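In other words, etcd is a highly available, strongly consistent, distributed key-value store with the following features:
- Simple to use: an HTTP-based API that can be read and written with any standard HTTP client, such as curl
- Key-value storage: data is stored in hierarchically organized directories, similar to a standard file system
- Watches: monitor specific keys or directories for changes and react to them
- Secure: optional SSL client certificate authentication
- Fast: each instance supports about 1,000 writes per second
- Optional TTLs for key expiration
- Distributed, using the Raft protocol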
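Because etcd replicates writes through Raft, a write is committed only once a majority of members have accepted it. The sketch below is plain shell arithmetic, not an etcd command; the function names `quorum` and `tolerance` are made up for illustration. It shows why a 3-member cluster can lose one member while a 1- or 2-member cluster cannot lose any.

```shell
# Raft quorum arithmetic for a cluster of N voting members.
# quorum(N)    = floor(N/2) + 1  (members that must accept a write)
# tolerance(N) = N - quorum(N)   (members that may fail without losing quorum)

quorum() {
  echo $(( $1 / 2 + 1 ))
}

tolerance() {
  echo $(( $1 - ($1 / 2 + 1) ))
}

for n in 1 2 3 4 5; do
  printf 'members=%d quorum=%d tolerated_failures=%d\n' \
    "$n" "$(quorum "$n")" "$(tolerance "$n")"
done
```

Going from 3 to 4 members raises the quorum without raising the number of tolerated failures, which is why odd cluster sizes are the usual choice.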
Installation
Assume the cluster is deployed in the following environment:
etcd1: 192.168.122.11
etcd2: 192.168.122.243
etcd3: 192.168.122.41
OS: CentOS 7.x
etcd version: 3.3.18
First, download etcd. This walkthrough uses the prebuilt binary release; the download steps are omitted.
Start a standalone (single-member) cluster with the following command:
```shell
etcd \
  --data-dir=/opt/etcd/data \
  --advertise-client-urls=http://192.168.122.11:2379 \
  --initial-advertise-peer-urls=http://192.168.122.11:2380 \
  --initial-cluster=etcd1=http://192.168.122.11:2380 \
  --listen-client-urls=http://127.0.0.1:2379,http://192.168.122.11:2379 \
  --listen-metrics-urls=http://127.0.0.1:2381 \
  --listen-peer-urls=http://192.168.122.11:2380 \
  --initial-cluster-state new \
  --initial-cluster-token etcd-test-cluster-1 \
  --name=etcd1
```
Once it is running, you can interact with the etcd cluster through the etcdctl client:
```shell
export ETCDCTL_API=3
etcdctl put foo bar
etcdctl put foo1 bar1
etcdctl get foo
```
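A standalone cluster is a single point of failure for the whole infrastructure, so the next step is to add two more members (etcd2 and etcd3) for high availability. Start with etcd2. A new member must first be registered with the cluster through the client, so run this on etcd1:
```shell
etcdctl member add etcd2 --peer-urls "http://192.168.122.243:2380"
```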
Then start the etcd service on etcd2 with the following command:
```shell
etcd \
  --data-dir /opt/etcd/data \
  --advertise-client-urls http://192.168.122.243:2379 \
  --initial-advertise-peer-urls http://192.168.122.243:2380 \
  --initial-cluster-state existing \
  --initial-cluster "etcd1=http://192.168.122.11:2380,etcd2=http://192.168.122.243:2380" \
  --listen-client-urls http://127.0.0.1:2379,http://192.168.122.243:2379 \
  --listen-metrics-urls http://127.0.0.1:2381 \
  --listen-peer-urls http://192.168.122.243:2380 \
  --name etcd2
```
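Note the --initial-cluster-state and --initial-cluster flags here: the state is existing because the node joins a cluster that is already running, and --initial-cluster lists the current member together with the new one.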
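The --initial-cluster value grows with every member and is easy to mistype. As a rough sketch, a small helper can assemble it from name=ip pairs. The function name `initial_cluster` is hypothetical, and it assumes every member serves peers on port 2380, as in this walkthrough.

```shell
# Build the value of --initial-cluster from a URL scheme and "name=ip" pairs.
# Assumption: all members use peer port 2380.
initial_cluster() {
  local scheme=$1; shift
  local out="" pair name ip
  for pair in "$@"; do
    name=${pair%%=*}   # text before the first "="
    ip=${pair#*=}      # text after the first "="
    out="${out:+$out,}${name}=${scheme}://${ip}:2380"
  done
  printf '%s\n' "$out"
}

initial_cluster http etcd1=192.168.122.11 etcd2=192.168.122.243
```

The result can be passed straight to etcd, for example `--initial-cluster "$(initial_cluster http etcd1=192.168.122.11 etcd2=192.168.122.243)"`.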
Next, add etcd3 to the cluster. As before, register the member first:
```shell
etcdctl member add etcd3 --peer-urls "http://192.168.122.41:2380"
```
Start the service on etcd3:
```shell
etcd \
  --data-dir /opt/etcd/data \
  --advertise-client-urls http://192.168.122.41:2379 \
  --initial-advertise-peer-urls http://192.168.122.41:2380 \
  --initial-cluster-state existing \
  --initial-cluster "etcd2=http://192.168.122.243:2380,etcd1=http://192.168.122.11:2380,etcd3=http://192.168.122.41:2380" \
  --listen-client-urls http://127.0.0.1:2379,http://192.168.122.41:2379 \
  --listen-metrics-urls http://127.0.0.1:2381 \
  --listen-peer-urls http://192.168.122.41:2380 \
  --name etcd3
```
This completes the deployment of the etcd cluster. The same procedure also works as the steps for converting a standalone instance into a cluster.
TLS
etcd can encrypt its traffic with TLS, both between etcd peers and between clients and etcd. When both sides of a connection must authenticate each other, the certificates for the etcd instances and for the clients need to be issued by a common certificate authority (CA). This walkthrough uses cfssl, a PKI tool from CloudFlare, to manage the whole public key infrastructure.
First create the Certificate Authority and its configuration file, which will be used to sign the TLS certificates that follow:
```shell
cat > ca-config.json <<EOF
{
  "signing": {
    "default": {
      "expiry": "8760h"
    },
    "profiles": {
      "etcd": {
        "usages": ["signing", "key encipherment", "server auth", "peer auth", "client auth"],
        "expiry": "8760h"
      }
    }
  }
}
EOF

cat > ca-csr.json <<EOF
{
  "CN": "etcd",
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "L": "Xiamen",
      "O": "Corp",
      "OU": "CA",
      "ST": "Fujian"
    }
  ]
}
EOF

cfssl gencert -initca ca-csr.json | cfssljson -bare ca
```
Running this generates the CA certificate and private key (ca.pem and ca-key.pem, which the later commands use).
Next, create a certificate for each of the three nodes in the etcd cluster. The configuration file for each node is shown below.
etcd1-csr.json
```json
{
  "CN": "etcd1",
  "hosts": [
    "etcd1",
    "192.168.122.11",
    "127.0.0.1"
  ],
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "L": "Xiamen",
      "O": "Corp",
      "OU": "CA",
      "ST": "Fujian"
    }
  ]
}
```
etcd2-csr.json
```json
{
  "CN": "etcd2",
  "hosts": [
    "etcd2",
    "192.168.122.243",
    "127.0.0.1"
  ],
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "L": "Xiamen",
      "O": "Corp",
      "OU": "CA",
      "ST": "Fujian"
    }
  ]
}
```
etcd3-csr.json
```json
{
  "CN": "etcd3",
  "hosts": [
    "etcd3",
    "192.168.122.41",
    "127.0.0.1"
  ],
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "L": "Xiamen",
      "O": "Corp",
      "OU": "CA",
      "ST": "Fujian"
    }
  ]
}
```
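The three CSR files differ only in the CN and the hosts entries, so they could also be generated rather than written by hand. The following is a sketch of that idea; the helper name `write_csr` is hypothetical, and it writes the files into the current directory in a more compact layout than the listings above.

```shell
# Write <name>-csr.json for one node; only the name and IP vary between nodes.
write_csr() {
  local name=$1 ip=$2
  cat > "${name}-csr.json" <<EOF
{
  "CN": "${name}",
  "hosts": ["${name}", "${ip}", "127.0.0.1"],
  "key": { "algo": "rsa", "size": 2048 },
  "names": [
    { "C": "CN", "L": "Xiamen", "O": "Corp", "OU": "CA", "ST": "Fujian" }
  ]
}
EOF
}

write_csr etcd1 192.168.122.11
write_csr etcd2 192.168.122.243
write_csr etcd3 192.168.122.41
```

Keeping the host list in one place makes it harder to forget an address that clients or peers will actually connect to.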
Create the certificates:
```shell
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=etcd etcd1-csr.json | cfssljson -bare etcd1
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=etcd etcd2-csr.json | cfssljson -bare etcd2
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=etcd etcd3-csr.json | cfssljson -bare etcd3
```
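This generates the files listed below. Copy the certificate and key for etcd2 and for etcd3, together with ca.pem, to the corresponding server. The examples assume all certificates are placed under /opt/etcd:
```
etcd1.pem
etcd1-key.pem
etcd2.pem
etcd2-key.pem
etcd3.pem
etcd3-key.pem
```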
Start etcd1:
```shell
etcd \
  --name etcd1 \
  --data-dir=/opt/etcd/data1 \
  --initial-advertise-peer-urls=https://192.168.122.11:2380 \
  --listen-peer-urls=https://192.168.122.11:2380 \
  --listen-client-urls=https://127.0.0.1:2379,https://192.168.122.11:2379 \
  --advertise-client-urls=https://192.168.122.11:2379 \
  --listen-metrics-urls=http://127.0.0.1:2381 \
  --initial-cluster "etcd1=https://192.168.122.11:2380,etcd2=https://192.168.122.243:2380,etcd3=https://192.168.122.41:2380" \
  --initial-cluster-state new \
  --initial-cluster-token etcd-test-cluster-1 \
  --client-cert-auth \
  --trusted-ca-file=/opt/etcd/ca.pem \
  --cert-file=/opt/etcd/etcd1.pem \
  --key-file=/opt/etcd/etcd1-key.pem \
  --peer-client-cert-auth \
  --peer-trusted-ca-file=/opt/etcd/ca.pem \
  --peer-cert-file=/opt/etcd/etcd1.pem \
  --peer-key-file=/opt/etcd/etcd1-key.pem
```
Start etcd2:
```shell
etcd \
  --name etcd2 \
  --data-dir=/opt/etcd/data1 \
  --initial-advertise-peer-urls=https://192.168.122.243:2380 \
  --listen-peer-urls=https://192.168.122.243:2380 \
  --listen-client-urls=https://127.0.0.1:2379,https://192.168.122.243:2379 \
  --advertise-client-urls=https://192.168.122.243:2379 \
  --listen-metrics-urls=http://127.0.0.1:2381 \
  --initial-cluster "etcd1=https://192.168.122.11:2380,etcd2=https://192.168.122.243:2380,etcd3=https://192.168.122.41:2380" \
  --initial-cluster-state new \
  --initial-cluster-token etcd-test-cluster-1 \
  --client-cert-auth \
  --trusted-ca-file=/opt/etcd/ca.pem \
  --cert-file=/opt/etcd/etcd2.pem \
  --key-file=/opt/etcd/etcd2-key.pem \
  --peer-client-cert-auth \
  --peer-trusted-ca-file=/opt/etcd/ca.pem \
  --peer-cert-file=/opt/etcd/etcd2.pem \
  --peer-key-file=/opt/etcd/etcd2-key.pem
```
Start etcd3:
```shell
etcd \
  --name etcd3 \
  --data-dir=/opt/etcd/data1 \
  --initial-advertise-peer-urls=https://192.168.122.41:2380 \
  --listen-peer-urls=https://192.168.122.41:2380 \
  --listen-client-urls=https://127.0.0.1:2379,https://192.168.122.41:2379 \
  --advertise-client-urls=https://192.168.122.41:2379 \
  --listen-metrics-urls=http://127.0.0.1:2381 \
  --initial-cluster "etcd1=https://192.168.122.11:2380,etcd2=https://192.168.122.243:2380,etcd3=https://192.168.122.41:2380" \
  --initial-cluster-state new \
  --initial-cluster-token etcd-test-cluster-1 \
  --client-cert-auth \
  --trusted-ca-file=/opt/etcd/ca.pem \
  --cert-file=/opt/etcd/etcd3.pem \
  --key-file=/opt/etcd/etcd3-key.pem \
  --peer-client-cert-auth \
  --peer-trusted-ca-file=/opt/etcd/ca.pem \
  --peer-cert-file=/opt/etcd/etcd3.pem \
  --peer-key-file=/opt/etcd/etcd3-key.pem
```
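Once the cluster is up, check that the logs look normal. You can also run a health check with curl. On the etcd1 server (replace domain-name with a host name or IP that appears in the certificate's hosts field, for example 192.168.122.11):
```shell
curl --cacert ca.pem --cert etcd1.pem --key etcd1-key.pem https://domain-name:2379/health
```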
If the cluster is healthy, the response is a small JSON body reporting health as true.
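To check all three members rather than one, the same curl call can be wrapped in a function. This is a sketch under the assumptions of this walkthrough: the certificates live under /opt/etcd (overridable through the hypothetical `CERT_DIR` variable), and the function name `check_health` is made up. It prints whatever body the endpoint returns, or "unreachable" if curl itself fails.

```shell
# Query /health on one endpoint using a client certificate.
# Assumption: CERT_DIR holds ca.pem, <name>.pem and <name>-key.pem.
CERT_DIR=${CERT_DIR:-/opt/etcd}

check_health() {
  local name=$1 ip=$2 body
  if body=$(curl --silent --max-time 5 \
      --cacert "$CERT_DIR/ca.pem" \
      --cert "$CERT_DIR/$name.pem" \
      --key "$CERT_DIR/$name-key.pem" \
      "https://$ip:2379/health"); then
    printf '%s %s\n' "$ip" "$body"
  else
    printf '%s unreachable\n' "$ip"
    return 1
  fi
}

# Usage, run on etcd1 (which holds etcd1.pem and etcd1-key.pem):
#   for ip in 192.168.122.11 192.168.122.243 192.168.122.41; do
#     check_health etcd1 "$ip"
#   done
```

Using IP addresses works here because each node's certificate lists its own IP in hosts.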
This example started brand-new instances (--data-dir points to a new path). How would you convert the original non-TLS cluster into a TLS cluster instead?
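Converting a non-TLS cluster to TLS
First create the certificates as described above. Then, on the existing cluster, update the peerURLs in the cluster configuration. Working on etcd1, list the members:
```shell
export ETCDCTL_API=3
etcdctl member list
```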
The output looks like this:
```
13acd07d0ffd081a, started, etcd2, http://192.168.122.243:2380, http://192.168.122.243:2379
4bb6fb458796057d, started, etcd3, http://192.168.122.41:2380, http://192.168.122.41:2379
95291b72fbb71ec3, started, etcd1, http://192.168.122.11:2380, http://192.168.122.11:2379
```
Update the peerURLs of members etcd2 and etcd3 so that they use TLS:
```shell
etcdctl member update 13acd07d0ffd081a --peer-urls="https://192.168.122.243:2380"
etcdctl member update 4bb6fb458796057d --peer-urls="https://192.168.122.41:2380"
```
After this change the whole etcd cluster becomes unavailable. Restart etcd2 and etcd3 with TLS enabled, using the following commands.
etcd2
```shell
etcd \
  --name etcd2 \
  --data-dir=/opt/etcd/data \
  --initial-advertise-peer-urls=https://192.168.122.243:2380 \
  --listen-peer-urls=https://192.168.122.243:2380 \
  --listen-client-urls=https://127.0.0.1:2379,https://192.168.122.243:2379 \
  --advertise-client-urls=https://192.168.122.243:2379 \
  --listen-metrics-urls=http://127.0.0.1:2381 \
  --initial-cluster "etcd1=https://192.168.122.11:2380,etcd2=https://192.168.122.243:2380,etcd3=https://192.168.122.41:2380" \
  --client-cert-auth \
  --trusted-ca-file=/opt/etcd/ca.pem \
  --cert-file=/opt/etcd/etcd2.pem \
  --key-file=/opt/etcd/etcd2-key.pem \
  --peer-client-cert-auth \
  --peer-trusted-ca-file=/opt/etcd/ca.pem \
  --peer-cert-file=/opt/etcd/etcd2.pem \
  --peer-key-file=/opt/etcd/etcd2-key.pem
```
etcd3
```shell
etcd \
  --name etcd3 \
  --data-dir=/opt/etcd/data \
  --initial-advertise-peer-urls=https://192.168.122.41:2380 \
  --listen-peer-urls=https://192.168.122.41:2380 \
  --listen-client-urls=https://127.0.0.1:2379,https://192.168.122.41:2379 \
  --advertise-client-urls=https://192.168.122.41:2379 \
  --listen-metrics-urls=http://127.0.0.1:2381 \
  --initial-cluster "etcd1=https://192.168.122.11:2380,etcd2=https://192.168.122.243:2380,etcd3=https://192.168.122.41:2380" \
  --client-cert-auth \
  --trusted-ca-file=/opt/etcd/ca.pem \
  --cert-file=/opt/etcd/etcd3.pem \
  --key-file=/opt/etcd/etcd3-key.pem \
  --peer-client-cert-auth \
  --peer-trusted-ca-file=/opt/etcd/ca.pem \
  --peer-cert-file=/opt/etcd/etcd3.pem \
  --peer-key-file=/opt/etcd/etcd3-key.pem
```
Once etcd2 and etcd3 are running, the cluster becomes healthy again, but etcd1 has not rejoined yet. Now update the peerURLs of etcd1, talking to the cluster through etcd2's TLS client endpoint:
```shell
etcdctl --endpoints="192.168.122.243:2379" --cacert=./ca.pem --cert=./etcd1.pem --key=./etcd1-key.pem member list
etcdctl --endpoints="192.168.122.243:2379" --cacert=./ca.pem --cert=./etcd1.pem --key=./etcd1-key.pem member update 95291b72fbb71ec3 --peer-urls="https://192.168.122.11:2380"
```
After the update, restart the etcd service on etcd1 with the following command:
```shell
etcd \
  --name etcd1 \
  --data-dir=/opt/etcd/data \
  --initial-advertise-peer-urls=https://192.168.122.11:2380 \
  --listen-peer-urls=https://192.168.122.11:2380 \
  --listen-client-urls=https://127.0.0.1:2379,https://192.168.122.11:2379 \
  --advertise-client-urls=https://192.168.122.11:2379 \
  --listen-metrics-urls=http://127.0.0.1:2381 \
  --initial-cluster "etcd1=https://192.168.122.11:2380,etcd2=https://192.168.122.243:2380,etcd3=https://192.168.122.41:2380" \
  --client-cert-auth \
  --trusted-ca-file=/opt/etcd/ca.pem \
  --cert-file=/opt/etcd/etcd1.pem \
  --key-file=/opt/etcd/etcd1-key.pem \
  --peer-client-cert-auth \
  --peer-trusted-ca-file=/opt/etcd/ca.pem \
  --peer-cert-file=/opt/etcd/etcd1.pem \
  --peer-key-file=/opt/etcd/etcd1-key.pem
```
Check the logs again: etcd1 has now rejoined the cluster.
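Besides reading the logs, one quick sanity check after the migration is that no member still advertises a plain-HTTP peer URL. The sketch below filters the output of `etcdctl member list`; the function name `plaintext_peers` is hypothetical, and it assumes the default comma-separated output format shown earlier (ID, status, name, peer URLs, client URLs).

```shell
# Read `etcdctl member list` output on stdin and print the name and peer URL
# of every member whose peer URL still starts with http://.
plaintext_peers() {
  awk -F', ' '$4 ~ /^http:\/\// { print $3, $4 }'
}

# Usage (with the TLS flags used above):
#   etcdctl --endpoints="192.168.122.243:2379" --cacert=./ca.pem \
#     --cert=./etcd1.pem --key=./etcd1-key.pem member list | plaintext_peers
```

An empty result means every member's peer URL has been switched to https.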
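etcd in Kubernetes
In a Kubernetes cluster, etcd instances can either run on the master nodes alongside the control plane or be deployed separately on their own hosts. The two topologies are:
[Figure: etcd deployed on the masters]
[Figure: etcd deployed independently]
The following sequence diagram shows the components involved when a pod is created, and how the apiserver interacts with etcd:
[Figure: pod creation sequence diagram]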
To dump all key/value data stored in etcd:
```shell
ADVERTISE_URL="https://192.168.122.12:2379"

kubectl exec etcd-node-01 -n kube-system -- sh -c \
  "ETCDCTL_API=3 etcdctl \
  --endpoints $ADVERTISE_URL \
  --cacert /etc/kubernetes/pki/etcd/ca.crt \
  --key /etc/kubernetes/pki/etcd/server.key \
  --cert /etc/kubernetes/pki/etcd/server.crt \
  get \"\" --prefix=true -w json" > etcd-kv.json
```
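In etcdctl's JSON output, keys and values are base64-encoded, so the dump is not directly readable. As a rough sketch, the keys can be listed with standard tools. The function name `decode_keys` and the file sample-kv.json are made up for illustration, and the sample is a trimmed-down stand-in for the real file. The grep/sed parsing is a shortcut that relies on the compact JSON layout and on base64 text never containing a double quote.

```shell
# Print the decoded keys found in an etcdctl JSON dump.
decode_keys() {
  grep -o '"key":"[^"]*"' "$1" |
    sed 's/^"key":"//; s/"$//' |
    while IFS= read -r b64; do
      printf '%s' "$b64" | base64 -d
      printf '\n'
    done
}

# Tiny sample in the same shape as etcd-kv.json, for trying the function out.
cat > sample-kv.json <<'EOF'
{"header":{"revision":3},"kvs":[{"key":"Zm9v","value":"YmFy"},{"key":"Zm9vMQ==","value":"YmFyMQ=="}],"count":2}
EOF

decode_keys sample-kv.json
```

Running `decode_keys etcd-kv.json` on the real dump lists the stored keys. Decoding the values the same way is less useful in a Kubernetes cluster, since many of them are binary rather than plain text.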