This document describes the steps for deploying a highly available kube-controller-manager cluster.
The cluster has 3 nodes. After startup, a leader is chosen through a competitive election, and the other nodes block. When the leader becomes unavailable, the remaining nodes hold a new election and produce a new leader, which keeps the service available.
The binaries were already downloaded and placed in the target location when kube-apiserver was deployed earlier.
[root@k8s-node1 ~]# cd /opt/k8s/bin
[root@k8s-node1 bin]# ls
cfssl           cfssljson       etcd     flanneld        kube-controller-manager  kube-scheduler
cfssl-certinfo  environment.sh  etcdctl  kube-apiserver  kubectl                  mk-docker-opts.sh
[root@k8s-node1 bin]#
1. Create the kube-controller-manager certificate and private key
Create the certificate signing request:
The hosts list contains the IPs of all kube-controller-manager nodes.
CN is system:kube-controller-manager.
O is system:kube-controller-manager; the built-in Kubernetes ClusterRoleBinding system:kube-controller-manager grants kube-controller-manager the permissions it needs to do its work.
[root@k8s-node1 kube-controller-manager]# pwd
/opt/k8s/k8s_software/server/kube-controller-manager
[root@k8s-node1 kube-controller-manager]# cat kube-controller-manager-csr.json
{
  "CN": "system:kube-controller-manager",
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "hosts": [
    "127.0.0.1",
    "192.168.174.128",
    "192.168.174.129",
    "192.168.174.130"
  ],
  "names": [
    {
      "C": "CN",
      "ST": "SZ",
      "L": "SZ",
      "O": "system:kube-controller-manager",
      "OU": "4Paradigm"
    }
  ]
}
[root@k8s-node1 kube-controller-manager]#
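Before feeding the CSR to cfssl, it is worth a quick sanity check: the file must parse as JSON and the hosts list should cover every node IP. A small sketch (the /tmp path is only for illustration):

```shell
# Write the CSR shown above to a temp file and verify it with python3.
cat > /tmp/kube-controller-manager-csr.json <<'EOF'
{
  "CN": "system:kube-controller-manager",
  "key": { "algo": "rsa", "size": 2048 },
  "hosts": [
    "127.0.0.1",
    "192.168.174.128",
    "192.168.174.129",
    "192.168.174.130"
  ],
  "names": [
    { "C": "CN", "ST": "SZ", "L": "SZ",
      "O": "system:kube-controller-manager", "OU": "4Paradigm" }
  ]
}
EOF
python3 - <<'EOF'
import json
csr = json.load(open("/tmp/kube-controller-manager-csr.json"))
# Every kube-controller-manager node IP must appear in hosts.
for ip in ("192.168.174.128", "192.168.174.129", "192.168.174.130"):
    assert ip in csr["hosts"], "missing node IP: " + ip
assert csr["CN"] == "system:kube-controller-manager"
print("CSR ok:", csr["CN"], len(csr["hosts"]), "hosts")
EOF
```

If the check fails (for example after adding a node), regenerate the certificate with the extended hosts list.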
Generate the certificate and key:
[root@k8s-node1 kube-controller-manager]# cfssl gencert -ca=/etc/kubernetes/cert/ca.pem \
  -ca-key=/etc/kubernetes/cert/ca-key.pem \
  -config=/etc/kubernetes/cert/ca-config.json \
  -profile=kubernetes kube-controller-manager-csr.json | cfssljson -bare kube-controller-manager
2019/11/04 21:47:27 [INFO] generate received request
2019/11/04 21:47:27 [INFO] received CSR
2019/11/04 21:47:27 [INFO] generating key: rsa-2048
2019/11/04 21:47:27 [INFO] encoded CSR
2019/11/04 21:47:27 [INFO] signed certificate with serial number 87619246134042005891041298787368847333016054812
2019/11/04 21:47:27 [WARNING] This certificate lacks a "hosts" field. This makes it unsuitable for
websites. For more information see the Baseline Requirements for the Issuance and Management
of Publicly-Trusted Certificates, v.1.1.6, from the CA/Browser Forum (https://cabforum.org);
specifically, section 10.2.3 ("Information Requirements").
[root@k8s-node1 kube-controller-manager]# ls
kube-controller-manager.csr      kube-controller-manager-csr.json
kube-controller-manager-key.pem  kube-controller-manager.pem
[root@k8s-node1 kube-controller-manager]#
Distribute the certificate to all nodes:
[root@k8s-node1 kube-controller-manager]# cp kube-controller-manager-key.pem kube-controller-manager.pem /etc/kubernetes/cert/
[root@k8s-node1 kube-controller-manager]# scp kube-controller-manager-key.pem kube-controller-manager.pem root@k8s-node2:/etc/kubernetes/cert/
kube-controller-manager-key.pem    100% 1675   1.6MB/s   00:00
kube-controller-manager.pem        100% 1489   1.2MB/s   00:00
[root@k8s-node1 kube-controller-manager]# scp kube-controller-manager-key.pem kube-controller-manager.pem root@k8s-node3:/etc/kubernetes/cert/
kube-controller-manager-key.pem    100% 1675   1.6MB/s   00:00
kube-controller-manager.pem        100% 1489   1.2MB/s   00:00
[root@k8s-node1 kube-controller-manager]#

Change the file owner and add the execute bit:
[root@k8s-node1 kube-controller-manager]# chown -R k8s /etc/kubernetes/cert/ && chmod -R +x /etc/kubernetes/cert/
[root@k8s-node1 kube-controller-manager]# ssh root@k8s-node2 "chown -R k8s /etc/kubernetes/cert/ && chmod -R +x /etc/kubernetes/cert/"
[root@k8s-node1 kube-controller-manager]# ssh root@k8s-node3 "chown -R k8s /etc/kubernetes/cert/ && chmod -R +x /etc/kubernetes/cert/"
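The per-node scp/ssh steps above can be collapsed into a single loop. A sketch; the NODES list is an assumption matching this 3-node setup, and the leading echo makes it a dry run (delete it to actually execute):

```shell
# Dry-run loop: distribute the certs and fix ownership on each remote node.
NODES="k8s-node2 k8s-node3"   # every node except the one this runs on
RUN="echo"                    # remove to execute for real
for node in $NODES; do
  $RUN scp kube-controller-manager-key.pem kube-controller-manager.pem "root@${node}:/etc/kubernetes/cert/"
  $RUN ssh "root@${node}" "chown -R k8s /etc/kubernetes/cert/ && chmod -R +x /etc/kubernetes/cert/"
done
```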
2. Create and distribute the kubeconfig file used by kube-controller-manager
The kubeconfig file contains everything needed to access the apiserver: the apiserver address, the CA certificate, the client's own certificate, the user used to access the API, its RBAC permissions, and so on.
kubectl config set-cluster kubernetes \
  --certificate-authority=/etc/kubernetes/cert/ca.pem \
  --embed-certs=true \
  --server=https://192.168.174.127:8443 \
  --kubeconfig=kube-controller-manager.kubeconfig
Cluster "kubernetes" set.
This writes the cluster information into kube-controller-manager.kubeconfig: the cluster is named kubernetes, the CA certificate is ca.pem, and the cluster address is the VIP https://192.168.174.127:8443.
kubectl config set-credentials system:kube-controller-manager \
  --client-certificate=kube-controller-manager.pem \
  --client-key=kube-controller-manager-key.pem \
  --embed-certs=true \
  --kubeconfig=kube-controller-manager.kubeconfig
User "system:kube-controller-manager" set.
This writes the user information into kube-controller-manager.kubeconfig: the user is system:kube-controller-manager, its certificate is kube-controller-manager.pem, and its private key is kube-controller-manager-key.pem.
kubectl config set-context system:kube-controller-manager \
  --cluster=kubernetes \
  --user=system:kube-controller-manager \
  --kubeconfig=kube-controller-manager.kubeconfig
Context "system:kube-controller-manager" created.
This configures the context parameters and writes them into kube-controller-manager.kubeconfig: the context is named system:kube-controller-manager, and it uses the kubernetes cluster and the system:kube-controller-manager user.
kubectl config use-context system:kube-controller-manager \
  --kubeconfig=kube-controller-manager.kubeconfig
Switched to context "system:kube-controller-manager".
Switch to the context configured above.
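The four kubectl config steps above can be collected into one script. A sketch; the VIP and file name are taken from the commands above, and the leading echo keeps it a dry run (remove it to execute for real):

```shell
# Dry-run script generating kube-controller-manager.kubeconfig in four steps.
KUBE_APISERVER="https://192.168.174.127:8443"
KUBECONFIG_FILE="kube-controller-manager.kubeconfig"
RUN="echo"   # remove to execute for real
$RUN kubectl config set-cluster kubernetes \
  --certificate-authority=/etc/kubernetes/cert/ca.pem \
  --embed-certs=true \
  --server=${KUBE_APISERVER} \
  --kubeconfig=${KUBECONFIG_FILE}
$RUN kubectl config set-credentials system:kube-controller-manager \
  --client-certificate=kube-controller-manager.pem \
  --client-key=kube-controller-manager-key.pem \
  --embed-certs=true \
  --kubeconfig=${KUBECONFIG_FILE}
$RUN kubectl config set-context system:kube-controller-manager \
  --cluster=kubernetes \
  --user=system:kube-controller-manager \
  --kubeconfig=${KUBECONFIG_FILE}
$RUN kubectl config use-context system:kube-controller-manager \
  --kubeconfig=${KUBECONFIG_FILE}
```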
Distribute the kubeconfig to all nodes:
[root@k8s-node1 kube-controller-manager]# cp kube-controller-manager.kubeconfig /etc/kubernetes/
[root@k8s-node1 kube-controller-manager]# scp kube-controller-manager.kubeconfig root@k8s-node2:/etc/kubernetes/
kube-controller-manager.kubeconfig    100% 6445   5.3MB/s   00:00
[root@k8s-node1 kube-controller-manager]# scp kube-controller-manager.kubeconfig root@k8s-node3:/etc/kubernetes/
kube-controller-manager.kubeconfig
Fix the permissions:
[root@k8s-node1 kube-controller-manager]# chown -R k8s /etc/kubernetes/ && chmod -R +x /etc/kubernetes/
[root@k8s-node1 kube-controller-manager]# ssh root@k8s-node2 "chown -R k8s /etc/kubernetes/ && chmod -R +x /etc/kubernetes/"
[root@k8s-node1 kube-controller-manager]# ssh root@k8s-node3 "chown -R k8s /etc/kubernetes/ && chmod -R +x /etc/kubernetes/"
3. Create and distribute the kube-controller-manager systemd unit file
Parameter notes
--port=0: stop serving plain HTTP /metrics requests; --address then has no effect, while --bind-address still applies. In this deployment the flag is left commented out, otherwise the service reports errors.
Explanation:
--secure-port=10252, --bind-address=0.0.0.0: listen for HTTPS /metrics requests on port 10252 on all network interfaces; these two flags are also commented out here.
--kubeconfig: path to the kubeconfig file; kube-controller-manager uses it to connect to and authenticate with kube-apiserver.
--cluster-signing-cert-file / --cluster-signing-key-file: the CA certificate and key used to sign the certificates created during TLS bootstrap.
--experimental-cluster-signing-duration: validity period of the certificates signed for TLS bootstrap.
--root-ca-file: the CA certificate placed into each container's ServiceAccount, used to verify kube-apiserver's certificate.
--service-account-private-key-file: the private key used to sign ServiceAccount tokens; it must pair with the public key file given to kube-apiserver via --service-account-key-file.
--service-cluster-ip-range: the Service cluster IP range; it must match the flag of the same name on kube-apiserver.
--leader-elect=true: cluster mode with leader election enabled. The node elected leader does the work; the other nodes block.
--feature-gates=RotateKubeletServerCertificate=true: enables automatic rotation of kubelet server certificates.
--controllers=*,bootstrapsigner,tokencleaner: the list of controllers to enable; tokencleaner automatically cleans up expired bootstrap tokens.
--horizontal-pod-autoscaler-*: flags for custom metrics; they support autoscaling/v2alpha1.
--tls-cert-file / --tls-private-key-file: the server certificate and key used when serving metrics over HTTPS.
--use-service-account-credentials=true: the ClusterRole system:kube-controller-manager has very limited permissions; it can only create resources such as secrets and serviceaccounts. Each controller's permissions are split out into its own ClusterRole system:controller:XXX. Adding --use-service-account-credentials=true to kube-controller-manager's startup flags makes the main controller create a ServiceAccount XXX-controller for each controller; the built-in ClusterRoleBinding system:controller:XXX then grants each XXX-controller ServiceAccount the matching ClusterRole system:controller:XXX permissions.
User=k8s: run the service as the k8s user.
kube-controller-manager does not verify client certificates on HTTPS metrics requests, so there is no need to specify --tls-ca-file; that flag has been deprecated anyway.
[root@k8s-node1 kube-controller-manager]# source /opt/k8s/bin/environment.sh
[root@k8s-node1 kube-controller-manager]# cat kube-controller-manager.service
[Unit]
Description=Kubernetes Controller Manager
Documentation=https://github.com/GoogleCloudPlatform/kubernetes

[Service]
ExecStart=/opt/k8s/bin/kube-controller-manager \
  #--port=0 \
  #--secure-port=10252 \
  #--bind-address=127.0.0.1 \
  --kubeconfig=/etc/kubernetes/kube-controller-manager.kubeconfig \
  --service-cluster-ip-range=10.254.0.0/16 \
  --cluster-name=kubernetes \
  --cluster-signing-cert-file=/etc/kubernetes/cert/ca.pem \
  --cluster-signing-key-file=/etc/kubernetes/cert/ca-key.pem \
  --experimental-cluster-signing-duration=8760h \
  --root-ca-file=/etc/kubernetes/cert/ca.pem \
  --service-account-private-key-file=/etc/kubernetes/cert/ca-key.pem \
  --leader-elect=true \
  --feature-gates=RotateKubeletServerCertificate=true \
  --controllers=*,bootstrapsigner,tokencleaner \
  --horizontal-pod-autoscaler-use-rest-clients=true \
  --horizontal-pod-autoscaler-sync-period=10s \
  --tls-cert-file=/etc/kubernetes/cert/kube-controller-manager.pem \
  --tls-private-key-file=/etc/kubernetes/cert/kube-controller-manager-key.pem \
  --use-service-account-credentials=true \
  --alsologtostderr=true \
  --logtostderr=false \
  --log-dir=/var/log/kubernetes \
  --v=2
Restart=on-failure
RestartSec=5
User=k8s

[Install]
WantedBy=multi-user.target
[root@k8s-node1 kube-controller-manager]#
Distribute the file to all nodes:
[root@k8s-node1 kube-controller-manager]# cp kube-controller-manager.service /etc/systemd/system
[root@k8s-node1 kube-controller-manager]# scp kube-controller-manager.service root@k8s-node2:/etc/systemd/system
kube-controller-manager.service    100% 1231   1.4MB/s   00:00
[root@k8s-node1 kube-controller-manager]# scp kube-controller-manager.service root@k8s-node3:/etc/systemd/system
kube-controller-manager.service    100% 1231   1.5MB/s   00:00
[root@k8s-node1 kube-controller-manager]#
Fix the permissions:
[root@k8s-node1 kube-controller-manager]# chmod +x /etc/systemd/system/kube-controller-manager.service
[root@k8s-node1 kube-controller-manager]# ssh root@k8s-node2 "chmod +x /etc/systemd/system/kube-controller-manager.service"
[root@k8s-node1 kube-controller-manager]# ssh root@k8s-node3 "chmod +x /etc/systemd/system/kube-controller-manager.service"
4. Start the service
systemctl daemon-reload && systemctl enable kube-controller-manager && systemctl restart kube-controller-manager
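The start command above has to run on every node. A loop sketch; the NODES list is an assumption for this setup, and the leading echo keeps it a dry run (remove it to execute for real):

```shell
# Dry-run loop: reload systemd and (re)start kube-controller-manager on each node.
NODES="k8s-node1 k8s-node2 k8s-node3"
RUN="echo"   # remove to execute for real
for node in $NODES; do
  $RUN ssh "root@${node}" \
    "systemctl daemon-reload && systemctl enable kube-controller-manager && systemctl restart kube-controller-manager"
done
```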
[root@k8s-node1 kube-controller-manager]# systemctl status kube-controller-manager
● kube-controller-manager.service - Kubernetes Controller Manager
   Loaded: loaded (/etc/systemd/system/kube-controller-manager.service; enabled; vendor preset: disabled)
   Active: active (running) since Mon 2019-11-04 22:36:49 EST; 2min 38s ago
     Docs: https://github.com/GoogleCloudPlatform/kubernetes
 Main PID: 17475 (kube-controller)
    Tasks: 6
   Memory: 15.0M
   CGroup: /system.slice/kube-controller-manager.service
           └─17475 /opt/k8s/bin/kube-controller-manager #--port=0 #--secure-port=10252 #--bind-address=127.0.0.1 --kubeconfig=/etc/ku...

Nov 04 22:36:49 k8s-node1 kube-controller-manager[17475]: I1104 22:36:49.847823   17475 flags.go:33] FLAG: --v="2"
Nov 04 22:36:49 k8s-node1 kube-controller-manager[17475]: I1104 22:36:49.847826   17475 flags.go:33] FLAG: --version="false"
Nov 04 22:36:49 k8s-node1 kube-controller-manager[17475]: I1104 22:36:49.847831   17475 flags.go:33] FLAG: --vmodule=""
Nov 04 22:36:50 k8s-node1 kube-controller-manager[17475]: W1104 22:36:50.112321   17475 authentication.go:249] No authentication...work.
Nov 04 22:36:50 k8s-node1 kube-controller-manager[17475]: W1104 22:36:50.112555   17475 authentication.go:252] No authentication...work.
Nov 04 22:36:50 k8s-node1 kube-controller-manager[17475]: W1104 22:36:50.112575   17475 authorization.go:146] No authorization-k...work.
Nov 04 22:36:50 k8s-node1 kube-controller-manager[17475]: I1104 22:36:50.112597   17475 controllermanager.go:164] Version: v1.15.5
Nov 04 22:36:50 k8s-node1 kube-controller-manager[17475]: I1104 22:36:50.113013   17475 secure_serving.go:116] Serving securely ...10257
Nov 04 22:36:50 k8s-node1 kube-controller-manager[17475]: I1104 22:36:50.115743   17475 deprecated_insecure_serving.go:53] Servi...10252
Nov 04 22:36:50 k8s-node1 kube-controller-manager[17475]: I1104 22:36:50.115793   17475 leaderelection.go:235] attempting to acq...er...
Hint: Some lines were ellipsized, use -l to show in full.
[root@k8s-node1 kube-controller-manager]#
Run a command to test:
[root@k8s-node1 kube-controller-manager]# kubectl get cs
NAME                 STATUS      MESSAGE                                                                                     ERROR
scheduler            Unhealthy   Get http://127.0.0.1:10251/healthz: dial tcp 127.0.0.1:10251: connect: connection refused
controller-manager   Healthy     ok
etcd-2               Healthy     {"health":"true"}
etcd-1               Healthy     {"health":"true"}
etcd-0               Healthy     {"health":"true"}
[root@k8s-node1 kube-controller-manager]#

The scheduler shows Unhealthy because kube-scheduler has not been deployed yet.
Check which node is the leader:
[root@k8s-node1 kube-controller-manager]# kubectl get endpoints kube-controller-manager --namespace=kube-system -o yaml
apiVersion: v1
kind: Endpoints
metadata:
  annotations:
    control-plane.alpha.kubernetes.io/leader: '{"holderIdentity":"k8s-node2_c38fd29c-84e3-47c2-b2c0-6c19952c3e86","leaseDurationSeconds":15,"acquireTime":"2019-11-05T03:37:03Z","renewTime":"2019-11-05T03:42:42Z","leaderTransitions":1}'
  creationTimestamp: "2019-11-05T03:25:45Z"
  name: kube-controller-manager
  namespace: kube-system
  resourceVersion: "2991"
  selfLink: /api/v1/namespaces/kube-system/endpoints/kube-controller-manager
  uid: 0bf961ae-1219-41f4-98f8-bebd83875190
[root@k8s-node1 kube-controller-manager]#

Stop the kube-controller-manager service on k8s-node2 and see whether the leader switches:
[root@k8s-node2 ~]# systemctl stop kube-controller-manager
[root@k8s-node2 ~]# systemctl status kube-controller-manager
● kube-controller-manager.service - Kubernetes Controller Manager
   Loaded: loaded (/etc/systemd/system/kube-controller-manager.service; enabled; vendor preset: disabled)
   Active: inactive (dead) since Mon 2019-11-04 22:43:26 EST; 1min 28s ago
     Docs: https://github.com/GoogleCloudPlatform/kubernetes
  Process: 17267 ExecStart=/opt/k8s/bin/kube-controller-manager #--port=0 #--secure-port=10252 #--bind-address=127.0.0.1 --kubeconfig=/etc/kubernetes/kube-controller-manager.kubeconfig --service-cluster-ip-range=10.254.0.0/16 --cluster-name=kubernetes --cluster-signing-cert-file=/etc/kubernetes/cert/ca.pem --cluster-signing-key-file=/etc/kubernetes/cert/ca-key.pem --experimental-cluster-signing-duration=8760h --root-ca-file=/etc/kubernetes/cert/ca.pem --service-account-private-key-file=/etc/kubernetes/cert/ca-key.pem --leader-elect=true --feature-gates=RotateKubeletServerCertificate=true --controllers=*,bootstrapsigner,tokencleaner --horizontal-pod-autoscaler-use-rest-clients=true --horizontal-pod-autoscaler-sync-period=10s --tls-cert-file=/etc/kubernetes/cert/kube-controller-manager.pem --tls-private-key-file=/etc/kubernetes/cert/kube-controller-manager-key.pem --use-service-account-credentials=true --alsologtostderr=true --logtostderr=false --log-dir=/var/log/kubernetes --v=2 (code=killed, signal=TERM)
 Main PID: 17267 (code=killed, signal=TERM)

Nov 04 22:37:07 k8s-node2 kube-controller-manager[17267]: I1104 22:37:07.941851   17267 controller_utils.go:1036] Caches are syn...oller
Nov 04 22:37:07 k8s-node2 kube-controller-manager[17267]: I1104 22:37:07.941931   17267 garbagecollector.go:240] synced garbage ...ector
Nov 04 22:37:07 k8s-node2 kube-controller-manager[17267]: I1104 22:37:07.965357   17267 controller_utils.go:1036] Caches are syn...oller
Nov 04 22:37:07 k8s-node2 kube-controller-manager[17267]: I1104 22:37:07.966845   17267 controller_utils.go:1036] Caches are syn...oller
Nov 04 22:37:07 k8s-node2 kube-controller-manager[17267]: I1104 22:37:07.966911   17267 controller_utils.go:1036] Caches are syn...oller
Nov 04 22:37:07 k8s-node2 kube-controller-manager[17267]: I1104 22:37:07.975179   17267 controller_utils.go:1036] Caches are syn...oller
Nov 04 22:37:07 k8s-node2 kube-controller-manager[17267]: I1104 22:37:07.975275   17267 garbagecollector.go:137] Garbage collect...rbage
Nov 04 22:37:08 k8s-node2 kube-controller-manager[17267]: I1104 22:37:08.014970   17267 controller_utils.go:1036] Caches are syn...oller
Nov 04 22:43:26 k8s-node2 systemd[1]: Stopping Kubernetes Controller Manager...
Nov 04 22:43:26 k8s-node2 systemd[1]: Stopped Kubernetes Controller Manager.
Hint: Some lines were ellipsized, use -l to show in full.
[root@k8s-node2 ~]#
The leader has switched to k8s-node1:
[root@k8s-node2 ~]# kubectl get endpoints kube-controller-manager --namespace=kube-system -o yaml
apiVersion: v1
kind: Endpoints
metadata:
  annotations:
    control-plane.alpha.kubernetes.io/leader: '{"holderIdentity":"k8s-node1_8f0109d6-8ec3-4e60-8061-6b36e0fb9f79","leaseDurationSeconds":15,"acquireTime":"2019-11-05T03:43:43Z","renewTime":"2019-11-05T03:45:16Z","leaderTransitions":2}'
  creationTimestamp: "2019-11-05T03:25:45Z"
  name: kube-controller-manager
  namespace: kube-system
  resourceVersion: "3108"
  selfLink: /api/v1/namespaces/kube-system/endpoints/kube-controller-manager
  uid: 0bf961ae-1219-41f4-98f8-bebd83875190
[root@k8s-node2 ~]#
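The holderIdentity field inside the control-plane.alpha.kubernetes.io/leader annotation names the current leader: the node name plus an instance UID suffix. A quick way to pull out just the node name, using the annotation captured above as sample input (in a live cluster you would fetch the annotation via kubectl with a jsonpath query instead of hard-coding it):

```shell
# Sample annotation from the output above; sed strips the "_<uid>" suffix
# from holderIdentity, leaving only the node name.
ANNOTATION='{"holderIdentity":"k8s-node1_8f0109d6-8ec3-4e60-8061-6b36e0fb9f79","leaseDurationSeconds":15,"acquireTime":"2019-11-05T03:43:43Z","renewTime":"2019-11-05T03:45:16Z","leaderTransitions":2}'
leader=$(printf '%s' "$ANNOTATION" | sed 's/.*"holderIdentity":"\([^_"]*\)_.*/\1/')
echo "current leader: $leader"
```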