k8s Cluster Deployment v1.15 Practice 8: Deploying a Highly Available kube-controller-manager Cluster

Reference documentation:

Deploying a highly available kube-controller-manager cluster

Note:

This document describes the steps for deploying a highly available kube-controller-manager cluster.

The cluster contains 3 nodes. After startup, a competitive election produces one leader node while the other nodes remain blocked. When the leader becomes unavailable, the remaining nodes hold a new election and produce a new leader, which keeps the service available.
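
The election can be pictured with an ordinary file lock: several contenders race for one lock, exactly one wins, and the rest stand by. This is only an analogy (kube-controller-manager actually leases an annotation on an Endpoints object, inspected later in this article), and the /tmp paths are made up for the demo:

```shell
# Miniature "leader election": three contenders race for one flock lock.
# Exactly one acquires it (the leader); the others fail fast and stand by.
lock=/tmp/kcm-demo.lock
log=/tmp/kcm-election.log
: > "$log"
for i in 1 2 3; do
  (
    if flock -n 9; then
      echo "node$i: became leader" >> "$log"
      sleep 1                      # hold the "lease" briefly
    else
      echo "node$i: standing by" >> "$log"
    fi
  ) 9>"$lock" &
done
wait
cat "$log"
```

Exactly one line reads "became leader"; the real mechanism differs in that blocked instances keep retrying until the lease expires.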

The binaries were already downloaded and placed in the target directory when the apiserver was deployed earlier.

[root@k8s-node1 ~]# cd /opt/k8s/bin
[root@k8s-node1 bin]# ls
cfssl           cfssljson       etcd     flanneld        kube-controller-manager  kube-scheduler
cfssl-certinfo  environment.sh  etcdctl  kube-apiserver  kubectl                  mk-docker-opts.sh
[root@k8s-node1 bin]#

1. Create the kube-controller-manager certificate and private key

Create the certificate signing request:

The hosts list contains the IPs of all kube-controller-manager nodes.

CN is system:kube-controller-manager.

O is system:kube-controller-manager. The Kubernetes built-in ClusterRoleBinding system:kube-controller-manager grants kube-controller-manager the permissions it needs to work.

[root@k8s-node1 kube-controller-manager]# pwd
/opt/k8s/k8s_software/server/kube-controller-manager
[root@k8s-node1 kube-controller-manager]# cat kube-controller-manager-csr.json 
{
  "CN": "system:kube-controller-manager",
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "hosts": [
    "127.0.0.1",
    "192.168.174.128",
    "192.168.174.129",
    "192.168.174.130"
  ],
  "names": [
    {
      "C": "CN",
      "ST": "SZ",
      "L": "SZ",
      "O": "system:kube-controller-manager",
      "OU": "4Paradigm"
    }
  ]
}
[root@k8s-node1 kube-controller-manager]#

Generate the certificate and key

[root@k8s-node1 kube-controller-manager]#  cfssl gencert -ca=/etc/kubernetes/cert/ca.pem -ca-key=/etc/kubernetes/cert/ca-key.pem -config=/etc/kubernetes/cert/ca-config.json -profile=kubernetes kube-controller-manager-csr.json | cfssljson -bare kube-controller-manager
2019/11/04 21:47:27 [INFO] generate received request
2019/11/04 21:47:27 [INFO] received CSR
2019/11/04 21:47:27 [INFO] generating key: rsa-2048
2019/11/04 21:47:27 [INFO] encoded CSR
2019/11/04 21:47:27 [INFO] signed certificate with serial number 87619246134042005891041298787368847333016054812
2019/11/04 21:47:27 [WARNING] This certificate lacks a "hosts" field. This makes it unsuitable for
websites. For more information see the Baseline Requirements for the Issuance and Management
of Publicly-Trusted Certificates, v.1.1.6, from the CA/Browser Forum (https://cabforum.org);
specifically, section 10.2.3 ("Information Requirements").
[root@k8s-node1 kube-controller-manager]# ls
kube-controller-manager.csr  kube-controller-manager-csr.json  kube-controller-manager-key.pem  kube-controller-manager.pem
[root@k8s-node1 kube-controller-manager]#
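
You can confirm the subject fields of the generated certificate with openssl. The sketch below is self-contained so it runs anywhere: it creates a throwaway certificate with the same subject (the /tmp paths are hypothetical); on the node itself you would run only the final openssl x509 command, pointed at kube-controller-manager.pem.

```shell
# Create a throwaway key + self-signed cert carrying the same subject fields
# as the CSR above (demo only -- the real cert was signed by the cluster CA).
openssl req -x509 -newkey rsa:2048 -nodes \
  -keyout /tmp/kcm-demo-key.pem -out /tmp/kcm-demo.pem -days 1 \
  -subj "/C=CN/ST=SZ/L=SZ/O=system:kube-controller-manager/OU=4Paradigm/CN=system:kube-controller-manager"

# Print the subject; O and CN should both read system:kube-controller-manager.
openssl x509 -in /tmp/kcm-demo.pem -noout -subject
```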

Distribute the certificate to all nodes

[root@k8s-node1 kube-controller-manager]# cp kube-controller-manager-key.pem kube-controller-manager.pem /etc/kubernetes/cert/
[root@k8s-node1 kube-controller-manager]# scp kube-controller-manager-key.pem kube-controller-manager.pem root@k8s-node2:/etc/kubernetes/cert/
kube-controller-manager-key.pem                                                                       100% 1675     1.6MB/s   00:00    
kube-controller-manager.pem                                                                           100% 1489     1.2MB/s   00:00    
[root@k8s-node1 kube-controller-manager]# scp kube-controller-manager-key.pem kube-controller-manager.pem root@k8s-node3:/etc/kubernetes/cert/
kube-controller-manager-key.pem                                                                       100% 1675     1.6MB/s   00:00    
kube-controller-manager.pem                                                                           100% 1489     1.2MB/s   00:00    
[root@k8s-node1 kube-controller-manager]#

Change the file owner and add the execute bit

[root@k8s-node1 kube-controller-manager]# chown -R k8s /etc/kubernetes/cert/ && chmod -R +x /etc/kubernetes/cert/
[root@k8s-node1 kube-controller-manager]# ssh root@k8s-node2 "chown -R k8s /etc/kubernetes/cert/ && chmod -R +x /etc/kubernetes/cert/"
[root@k8s-node1 kube-controller-manager]# ssh root@k8s-node3 "chown -R k8s /etc/kubernetes/cert/ && chmod -R +x /etc/kubernetes/cert/"

2. Create and distribute the kubeconfig file used by kube-controller-manager

The kubeconfig file contains all the information needed to access the apiserver: the apiserver address, the CA certificate, the client's own certificate, the user name used to access the API, its RBAC permissions, and so on.

kubectl config set-cluster kubernetes --certificate-authority=/etc/kubernetes/cert/ca.pem --embed-certs=true --server=https://192.168.174.127:8443 --kubeconfig=kube-controller-manager.kubeconfig
Cluster "kubernetes" set.

This writes the cluster information: the cluster name is kubernetes, the CA certificate is ca.pem, and the cluster address is the VIP 192.168.174.127 (port 8443). The information is written to kube-controller-manager.kubeconfig.

kubectl config set-credentials system:kube-controller-manager --client-certificate=kube-controller-manager.pem --client-key=kube-controller-manager-key.pem --embed-certs=true --kubeconfig=kube-controller-manager.kubeconfig
User "system:kube-controller-manager" set.

This writes the user information: the user is system:kube-controller-manager, its certificate is kube-controller-manager.pem, and its private key is kube-controller-manager-key.pem. The information is written to kube-controller-manager.kubeconfig.

kubectl config set-context system:kube-controller-manager --cluster=kubernetes --user=system:kube-controller-manager --kubeconfig=kube-controller-manager.kubeconfig
Context "system:kube-controller-manager" created.

This configures the context parameters: the context is named system:kube-controller-manager, it uses the kubernetes cluster and the system:kube-controller-manager user, and the information is written to kube-controller-manager.kubeconfig.

kubectl config use-context system:kube-controller-manager --kubeconfig=kube-controller-manager.kubeconfig
Switched to context "system:kube-controller-manager".

Switch to the context configured above.
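
A quick offline sanity check is to grep the resulting file: current-context and the context's user should both read system:kube-controller-manager. The demo below writes a stripped-down stand-in kubeconfig inline so the snippet runs anywhere; on the node, point grep at kube-controller-manager.kubeconfig instead.

```shell
# Minimal stand-in for kube-controller-manager.kubeconfig (demo file only;
# the real file also embeds the CA and client certificates).
cat > /tmp/demo.kubeconfig <<'EOF'
apiVersion: v1
kind: Config
current-context: system:kube-controller-manager
contexts:
- name: system:kube-controller-manager
  context:
    cluster: kubernetes
    user: system:kube-controller-manager
EOF

# Both matched lines should mention system:kube-controller-manager.
grep -E 'current-context|user:' /tmp/demo.kubeconfig
```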

Distribute the kubeconfig to all nodes

[root@k8s-node1 kube-controller-manager]# cp kube-controller-manager.kubeconfig /etc/kubernetes/
[root@k8s-node1 kube-controller-manager]# scp kube-controller-manager.kubeconfig root@k8s-node2:/etc/kubernetes/
kube-controller-manager.kubeconfig                                                                    100% 6445     5.3MB/s   00:00    
[root@k8s-node1 kube-controller-manager]# scp kube-controller-manager.kubeconfig root@k8s-node3:/etc/kubernetes/
kube-controller-manager.kubeconfig

Adjust the permissions

[root@k8s-node1 kube-controller-manager]# chown -R k8s /etc/kubernetes/ && chmod -R +x /etc/kubernetes/
[root@k8s-node1 kube-controller-manager]# ssh root@k8s-node2 "chown -R k8s /etc/kubernetes/ && chmod -R +x /etc/kubernetes/"
[root@k8s-node1 kube-controller-manager]# ssh root@k8s-node3 "chown -R k8s /etc/kubernetes/ && chmod -R +x /etc/kubernetes/"
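
The per-node scp/ssh pairs above can be collapsed into a loop. NODE_NAMES here is a stand-in list (this series' environment.sh defines the real node inventory; substitute your own), and the actual scp/ssh lines are left commented so the sketch runs without a cluster:

```shell
# Hypothetical node list -- replace with the names from your environment.sh.
NODE_NAMES="k8s-node2 k8s-node3"

for node in ${NODE_NAMES}; do
  echo ">>> ${node}"
  # Uncomment on a real cluster:
  # scp kube-controller-manager.kubeconfig root@${node}:/etc/kubernetes/
  # ssh root@${node} "chown -R k8s /etc/kubernetes/ && chmod -R +x /etc/kubernetes/"
done
```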

3. Create and distribute the kube-controller-manager systemd unit file

Parameter notes

--port=0: disables the insecure http /metrics listener and makes --address ineffective (--bind-address takes effect instead). In this setup the option had to be left disabled (commented out below), otherwise the service reported an error.

--secure-port=10252, --bind-address=0.0.0.0: listen for https /metrics requests on port 10252 on all interfaces. These two options are also left disabled here (the startup logs below show the defaults take effect: secure serving on 10257, insecure on 10252).

--kubeconfig: the path of the kubeconfig file that kube-controller-manager uses to connect to and authenticate with kube-apiserver.

--cluster-signing-cert-file, --cluster-signing-key-file: the CA certificate and key used to sign the certificates created by TLS Bootstrap.

--experimental-cluster-signing-duration: the validity period of TLS Bootstrap certificates.

--root-ca-file: the CA certificate placed into each container's ServiceAccount, used to verify the kube-apiserver certificate.

--service-account-private-key-file: the private key used to sign ServiceAccount tokens; it must be paired with the public key file given to kube-apiserver via --service-account-key-file.

--service-cluster-ip-range: the Service cluster IP range; it must match the same parameter in kube-apiserver.

--leader-elect=true: enables leader election. The node elected as leader does the work; the other nodes stay blocked.

--feature-gates=RotateKubeletServerCertificate=true: enables automatic rotation of kubelet server certificates.

--controllers=*,bootstrapsigner,tokencleaner: the list of controllers to enable; tokencleaner automatically cleans up expired Bootstrap tokens.

--horizontal-pod-autoscaler-*: custom-metrics-related parameters; supports autoscaling/v2alpha1.

--tls-cert-file, --tls-private-key-file: the server certificate and key used when serving metrics over https.

--use-service-account-credentials=true: the ClusterRole system:kube-controller-manager has very few permissions (it can only create secrets, serviceaccounts, and similar objects); each controller's permissions live in a separate ClusterRole system:controller:XXX. With this flag set in the kube-controller-manager startup parameters, the main controller creates a ServiceAccount XXX-controller for each controller, and the built-in ClusterRoleBinding system:controller:XXX grants that ServiceAccount the permissions of the corresponding ClusterRole system:controller:XXX.

User=k8s: run the service as the k8s account.

kube-controller-manager does not verify the client certificate of https metrics requests, so --tls-ca-file is not needed; that parameter has also been deprecated.

[root@k8s-node1 kube-controller-manager]# source /opt/k8s/bin/environment.sh
[root@k8s-node1 kube-controller-manager]# cat kube-controller-manager.service 
[Unit]
Description=Kubernetes Controller Manager
Documentation=https://github.com/GoogleCloudPlatform/kubernetes
[Service]
# --port, --secure-port and --bind-address are intentionally omitted; systemd
# has no comment syntax inside an ExecStart continuation, so '#'-prefixed
# lines there would be passed to the binary as literal arguments.
ExecStart=/opt/k8s/bin/kube-controller-manager \
--kubeconfig=/etc/kubernetes/kube-controller-manager.kubeconfig \
--service-cluster-ip-range=10.254.0.0/16 \
--cluster-name=kubernetes \
--cluster-signing-cert-file=/etc/kubernetes/cert/ca.pem \
--cluster-signing-key-file=/etc/kubernetes/cert/ca-key.pem \
--experimental-cluster-signing-duration=8760h \
--root-ca-file=/etc/kubernetes/cert/ca.pem \
--service-account-private-key-file=/etc/kubernetes/cert/ca-key.pem \
--leader-elect=true \
--feature-gates=RotateKubeletServerCertificate=true \
--controllers=*,bootstrapsigner,tokencleaner \
--horizontal-pod-autoscaler-use-rest-clients=true \
--horizontal-pod-autoscaler-sync-period=10s \
--tls-cert-file=/etc/kubernetes/cert/kube-controller-manager.pem \
--tls-private-key-file=/etc/kubernetes/cert/kube-controller-manager-key.pem \
--use-service-account-credentials=true \
--alsologtostderr=true \
--logtostderr=false \
--log-dir=/var/log/kubernetes \
--v=2
Restart=on-failure
RestartSec=5
User=k8s
[Install]
WantedBy=multi-user.target
[root@k8s-node1 kube-controller-manager]#

Distribute the file to all nodes

[root@k8s-node1 kube-controller-manager]# cp kube-controller-manager.service /etc/systemd/system
[root@k8s-node1 kube-controller-manager]# scp kube-controller-manager.service root@k8s-node2:/etc/systemd/system
kube-controller-manager.service                                                                       100% 1231     1.4MB/s   00:00    
[root@k8s-node1 kube-controller-manager]# scp kube-controller-manager.service root@k8s-node3:/etc/systemd/system
kube-controller-manager.service                                                                       100% 1231     1.5MB/s   00:00    
[root@k8s-node1 kube-controller-manager]#

Adjust the permissions

[root@k8s-node1 kube-controller-manager]# chmod +x /etc/systemd/system/kube-controller-manager.service
[root@k8s-node1 kube-controller-manager]# ssh root@k8s-node2 "chmod +x /etc/systemd/system/kube-controller-manager.service"
[root@k8s-node1 kube-controller-manager]# ssh root@k8s-node3 "chmod +x /etc/systemd/system/kube-controller-manager.service"

4. Start the service

systemctl daemon-reload && systemctl enable kube-controller-manager && systemctl restart kube-controller-manager
[root@k8s-node1 kube-controller-manager]# systemctl status kube-controller-manager
● kube-controller-manager.service - Kubernetes Controller Manager
   Loaded: loaded (/etc/systemd/system/kube-controller-manager.service; enabled; vendor preset: disabled)
   Active: active (running) since Mon 2019-11-04 22:36:49 EST; 2min 38s ago
     Docs: https://github.com/GoogleCloudPlatform/kubernetes
 Main PID: 17475 (kube-controller)
    Tasks: 6
   Memory: 15.0M
   CGroup: /system.slice/kube-controller-manager.service
           └─17475 /opt/k8s/bin/kube-controller-manager #--port=0 #--secure-port=10252 #--bind-address=127.0.0.1 --kubeconfig=/etc/ku...

Nov 04 22:36:49 k8s-node1 kube-controller-manager[17475]: I1104 22:36:49.847823   17475 flags.go:33] FLAG: --v="2"
Nov 04 22:36:49 k8s-node1 kube-controller-manager[17475]: I1104 22:36:49.847826   17475 flags.go:33] FLAG: --version="false"
Nov 04 22:36:49 k8s-node1 kube-controller-manager[17475]: I1104 22:36:49.847831   17475 flags.go:33] FLAG: --vmodule=""
Nov 04 22:36:50 k8s-node1 kube-controller-manager[17475]: W1104 22:36:50.112321   17475 authentication.go:249] No authentication...work.
Nov 04 22:36:50 k8s-node1 kube-controller-manager[17475]: W1104 22:36:50.112555   17475 authentication.go:252] No authentication...work.
Nov 04 22:36:50 k8s-node1 kube-controller-manager[17475]: W1104 22:36:50.112575   17475 authorization.go:146] No authorization-k...work.
Nov 04 22:36:50 k8s-node1 kube-controller-manager[17475]: I1104 22:36:50.112597   17475 controllermanager.go:164] Version: v1.15.5
Nov 04 22:36:50 k8s-node1 kube-controller-manager[17475]: I1104 22:36:50.113013   17475 secure_serving.go:116] Serving securely ...10257
Nov 04 22:36:50 k8s-node1 kube-controller-manager[17475]: I1104 22:36:50.115743   17475 deprecated_insecure_serving.go:53] Servi...10252
Nov 04 22:36:50 k8s-node1 kube-controller-manager[17475]: I1104 22:36:50.115793   17475 leaderelection.go:235] attempting to acq...er...
Hint: Some lines were ellipsized, use -l to show in full.
[root@k8s-node1 kube-controller-manager]#

Run a command to test. (scheduler reports Unhealthy here because kube-scheduler has not been deployed yet.)

[root@k8s-node1 kube-controller-manager]# kubectl get cs
NAME                 STATUS      MESSAGE                                                                                     ERROR
scheduler            Unhealthy   Get http://127.0.0.1:10251/healthz: dial tcp 127.0.0.1:10251: connect: connection refused   
controller-manager   Healthy     ok                                                                                          
etcd-2               Healthy     {"health":"true"}                                                                           
etcd-1               Healthy     {"health":"true"}                                                                           
etcd-0               Healthy     {"health":"true"}                                                                           
[root@k8s-node1 kube-controller-manager]#

Check which node is the leader

[root@k8s-node1 kube-controller-manager]# kubectl get endpoints kube-controller-manager --namespace=kube-system -o yaml
apiVersion: v1
kind: Endpoints
metadata:
  annotations:
    control-plane.alpha.kubernetes.io/leader: '{"holderIdentity":"k8s-node2_c38fd29c-84e3-47c2-b2c0-6c19952c3e86","leaseDurationSeconds":15,"acquireTime":"2019-11-05T03:37:03Z","renewTime":"2019-11-05T03:42:42Z","leaderTransitions":1}'
  creationTimestamp: "2019-11-05T03:25:45Z"
  name: kube-controller-manager
  namespace: kube-system
  resourceVersion: "2991"
  selfLink: /api/v1/namespaces/kube-system/endpoints/kube-controller-manager
  uid: 0bf961ae-1219-41f4-98f8-bebd83875190
[root@k8s-node1 kube-controller-manager]#
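
The leader name can be pulled out of that annotation with plain shell. The annotation JSON below is pasted in as a literal so the snippet is self-contained; on a live cluster you could feed it from kubectl's jsonpath output instead (shown in the comment).

```shell
# On a live cluster the annotation could come from, e.g.:
#   kubectl -n kube-system get endpoints kube-controller-manager \
#     -o jsonpath='{.metadata.annotations.control-plane\.alpha\.kubernetes\.io/leader}'
leader_json='{"holderIdentity":"k8s-node2_c38fd29c-84e3-47c2-b2c0-6c19952c3e86","leaseDurationSeconds":15}'

# Extract holderIdentity, then strip the per-process suffix after '_'.
holder=$(echo "$leader_json" | sed -n 's/.*"holderIdentity":"\([^"]*\)".*/\1/p')
echo "leader: ${holder%%_*}"   # -> leader: k8s-node2
```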

Stop the kube-controller-manager service on k8s-node2 and see whether the leader switches.

[root@k8s-node2 ~]# systemctl stop kube-controller-manager
[root@k8s-node2 ~]# systemctl status kube-controller-manager
● kube-controller-manager.service - Kubernetes Controller Manager
   Loaded: loaded (/etc/systemd/system/kube-controller-manager.service; enabled; vendor preset: disabled)
   Active: inactive (dead) since Mon 2019-11-04 22:43:26 EST; 1min 28s ago
     Docs: https://github.com/GoogleCloudPlatform/kubernetes
  Process: 17267 ExecStart=/opt/k8s/bin/kube-controller-manager #--port=0 #--secure-port=10252 #--bind-address=127.0.0.1 --kubeconfig=/etc/kubernetes/kube-controller-manager.kubeconfig --service-cluster-ip-range=10.254.0.0/16 --cluster-name=kubernetes --cluster-signing-cert-file=/etc/kubernetes/cert/ca.pem --cluster-signing-key-file=/etc/kubernetes/cert/ca-key.pem --experimental-cluster-signing-duration=8760h --root-ca-file=/etc/kubernetes/cert/ca.pem --service-account-private-key-file=/etc/kubernetes/cert/ca-key.pem --leader-elect=true --feature-gates=RotateKubeletServerCertificate=true --controllers=*,bootstrapsigner,tokencleaner --horizontal-pod-autoscaler-use-rest-clients=true --horizontal-pod-autoscaler-sync-period=10s --tls-cert-file=/etc/kubernetes/cert/kube-controller-manager.pem --tls-private-key-file=/etc/kubernetes/cert/kube-controller-manager-key.pem --use-service-account-credentials=true --alsologtostderr=true --logtostderr=false --log-dir=/var/log/kubernetes --v=2 (code=killed, signal=TERM)
 Main PID: 17267 (code=killed, signal=TERM)
Nov 04 22:37:07 k8s-node2 kube-controller-manager[17267]: I1104 22:37:07.941851   17267 controller_utils.go:1036] Caches are syn...oller
Nov 04 22:37:07 k8s-node2 kube-controller-manager[17267]: I1104 22:37:07.941931   17267 garbagecollector.go:240] synced garbage ...ector
Nov 04 22:37:07 k8s-node2 kube-controller-manager[17267]: I1104 22:37:07.965357   17267 controller_utils.go:1036] Caches are syn...oller
Nov 04 22:37:07 k8s-node2 kube-controller-manager[17267]: I1104 22:37:07.966845   17267 controller_utils.go:1036] Caches are syn...oller
Nov 04 22:37:07 k8s-node2 kube-controller-manager[17267]: I1104 22:37:07.966911   17267 controller_utils.go:1036] Caches are syn...oller
Nov 04 22:37:07 k8s-node2 kube-controller-manager[17267]: I1104 22:37:07.975179   17267 controller_utils.go:1036] Caches are syn...oller
Nov 04 22:37:07 k8s-node2 kube-controller-manager[17267]: I1104 22:37:07.975275   17267 garbagecollector.go:137] Garbage collect...rbage
Nov 04 22:37:08 k8s-node2 kube-controller-manager[17267]: I1104 22:37:08.014970   17267 controller_utils.go:1036] Caches are syn...oller
Nov 04 22:43:26 k8s-node2 systemd[1]: Stopping Kubernetes Controller Manager...
Nov 04 22:43:26 k8s-node2 systemd[1]: Stopped Kubernetes Controller Manager.
Hint: Some lines were ellipsized, use -l to show in full.
[root@k8s-node2 ~]#

The leader has switched to k8s-node1.

[root@k8s-node2 ~]# kubectl get endpoints kube-controller-manager --namespace=kube-system -o yaml
apiVersion: v1
kind: Endpoints
metadata:
  annotations:
    control-plane.alpha.kubernetes.io/leader: '{"holderIdentity":"k8s-node1_8f0109d6-8ec3-4e60-8061-6b36e0fb9f79","leaseDurationSeconds":15,"acquireTime":"2019-11-05T03:43:43Z","renewTime":"2019-11-05T03:45:16Z","leaderTransitions":2}'
  creationTimestamp: "2019-11-05T03:25:45Z"
  name: kube-controller-manager
  namespace: kube-system
  resourceVersion: "3108"
  selfLink: /api/v1/namespaces/kube-system/endpoints/kube-controller-manager
  uid: 0bf961ae-1219-41f4-98f8-bebd83875190
[root@k8s-node2 ~]#