Before we begin, the machines used to deploy the Kubernetes cluster must meet the following conditions:
| Role | IP |
|---|---|
| k8s-lb | 192.168.50.100 |
| master1 | 192.168.50.128 |
| master2 | 192.168.50.129 |
| master3 | 192.168.50.130 |
| node1 | 192.168.50.131 |
| node2 | 192.168.50.132 |
Set the hostname on each node.

On the master1 node:

~]# hostnamectl set-hostname master1

On the master2 node:

~]# hostnamectl set-hostname master2

On the master3 node:

~]# hostnamectl set-hostname master3

On the node1 node:

~]# hostnamectl set-hostname node1

On the node2 node:

~]# hostnamectl set-hostname node2

Run the bash command afterwards to load the newly set hostname.
Add hosts resolution records. All nodes need these entries:

~]# cat >> /etc/hosts <<EOF
192.168.50.100 k8s-lb
192.168.50.128 master1
192.168.50.129 master2
192.168.50.130 master3
192.168.50.131 node1
192.168.50.132 node2
EOF
Generate a key pair on the master1 node and distribute it to all the other hosts.
[root@master1 ~]# ssh-keygen -t rsa -b 1200
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:OoMw1dARsWhbJKAQL2hUxwnM4tLQJeLynAQHzqNQs5s root@localhost.localdomain
The key's randomart image is:
+---[RSA 1200]----+
|*=X=*o*+         |
|OO.*.O..         |
|BO= + +          |
|**o* o           |
|o E . S          |
| o . .           |
|  . +            |
|   o             |
|                 |
+----[SHA256]-----+
[root@master1 ~]# ssh-copy-id -i ~/.ssh/id_rsa.pub root@master1
[root@master1 ~]# ssh-copy-id -i ~/.ssh/id_rsa.pub root@master2
[root@master1 ~]# ssh-copy-id -i ~/.ssh/id_rsa.pub root@master3
[root@master1 ~]# ssh-copy-id -i ~/.ssh/id_rsa.pub root@node1
[root@master1 ~]# ssh-copy-id -i ~/.ssh/id_rsa.pub root@node2
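The five ssh-copy-id invocations can also be written as a loop. A minimal sketch, shown as a dry run that only prints each command; remove the leading `echo` to actually push the key (you will be prompted once per host for the root password):

```shell
# Dry-run sketch: print the ssh-copy-id command for every host in the cluster.
# Drop the "echo" to actually distribute the public key.
for host in master1 master2 master3 node1 node2; do
  echo ssh-copy-id -i ~/.ssh/id_rsa.pub "root@$host"
done
```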
Upgrade the kernel by downloading and installing the kernel rpm package.

For a CentOS 7 system: http://elrepo.org/linux/kernel/el7/x86_64/RPMS/

Write a shell script to upgrade the kernel:
#!/bin/bash
# ----------------------------
# upgrade kernel by bomingit@126.com
# ----------------------------

yum localinstall -y kernel-lt*
if [ $? -eq 0 ];then
  grub2-set-default 0 && grub2-mkconfig -o /etc/grub2.cfg
  grubby --args="user_namespace.enable=1" --update-kernel="$(grubby --default-kernel)"
fi
echo "please reboot your system quick!!!"
Note: you must reboot the machine. After the reboot, verify the new kernel:

[root@master1 ~]# uname -r
4.4.229-1.el7.elrepo.x86_64
Disable the firewall and selinux

~]# systemctl disable --now firewalld
~]# setenforce 0
~]# sed -i 's/enforcing/disabled/' /etc/selinux/config
Disable the swap partition

~]# swapoff -a
~]# sed -i.bak 's/^.*centos-swap/#&/g' /etc/fstab
The first command only disables swap temporarily; you can also use the second one to disable it permanently, which comments out the line mounting swap in the /etc/fstab file.
Tune the kernel parameters:

~]# cat > /etc/sysctl.d/k8s.conf << EOF
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
fs.may_detach_mounts = 1
vm.overcommit_memory=1
vm.panic_on_oom=0
fs.inotify.max_user_watches=89100
fs.file-max=52706963
fs.nr_open=52706963
net.ipv4.tcp_keepalive_time = 600
net.ipv4.tcp_keepalive_probes = 3
net.ipv4.tcp_keepalive_intvl = 15
net.ipv4.tcp_max_tw_buckets = 36000
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_max_orphans = 327680
net.ipv4.tcp_orphan_retries = 3
net.ipv4.tcp_syncookies = 1
net.ipv4.tcp_max_syn_backlog = 16384
net.ipv4.ip_conntrack_max = 65536
net.ipv4.tcp_timestamps = 0
net.core.somaxconn = 16384
EOF
Make the settings take effect immediately:

~]# sysctl --system
Configure the yum repos. All nodes use the base and epel repos from the Aliyun mirror site.

~]# mv /etc/yum.repos.d/* /tmp
~]# curl -o /etc/yum.repos.d/CentOS-Base.repo https://mirrors.aliyun.com/repo/Centos-7.repo
~]# curl -o /etc/yum.repos.d/epel.repo http://mirrors.aliyun.com/repo/epel-7.repo
Set the timezone and synchronize the time:

~]# ln -sf /usr/share/zoneinfo/Asia/Shanghai /etc/localtime
~]# yum install dnf ntpdate -y
~]# ntpdate ntp.aliyun.com
To finish quickly, steps 5-8 above can be written into a shell script and automated:
#!/bin/sh
#****************************************************************#
# ScriptName: init.sh
# Author: boming
# Create Date: 2020-06-23 22:19
#***************************************************************#
# disable the firewall and selinux
systemctl disable --now firewalld
setenforce 0
sed -i 's/enforcing/disabled/' /etc/selinux/config
# disable the swap partition
swapoff -a
sed -i.bak 's/^.*centos-swap/#&/g' /etc/fstab
# tune the kernel parameters
cat > /etc/sysctl.d/k8s.conf << EOF
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
fs.may_detach_mounts = 1
vm.overcommit_memory=1
vm.panic_on_oom=0
fs.inotify.max_user_watches=89100
fs.file-max=52706963
fs.nr_open=52706963
net.ipv4.tcp_keepalive_time = 600
net.ipv4.tcp_keepalive_probes = 3
net.ipv4.tcp_keepalive_intvl = 15
net.ipv4.tcp_max_tw_buckets = 36000
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_max_orphans = 327680
net.ipv4.tcp_orphan_retries = 3
net.ipv4.tcp_syncookies = 1
net.ipv4.tcp_max_syn_backlog = 16384
net.ipv4.ip_conntrack_max = 65536
net.ipv4.tcp_timestamps = 0
net.core.somaxconn = 16384
EOF
# apply immediately
sysctl --system
# configure the Aliyun base and epel repos
mv /etc/yum.repos.d/* /tmp
curl -o /etc/yum.repos.d/CentOS-Base.repo https://mirrors.aliyun.com/repo/Centos-7.repo
curl -o /etc/yum.repos.d/epel.repo http://mirrors.aliyun.com/repo/epel-7.repo
# install the dnf tool
yum install dnf -y
dnf makecache
# install the ntpdate tool
dnf install ntpdate -y
# sync time from Aliyun
ln -sf /usr/share/zoneinfo/Asia/Shanghai /etc/localtime
ntpdate ntp.aliyun.com
Run this script on the other nodes as well.
Install docker

Configure the docker yum repo. Method: open mirrors.aliyun.com in a browser and look for docker-ce to see the mirror repo.
~]# curl -o /etc/yum.repos.d/docker-ce.repo https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
~]# cat /etc/yum.repos.d/docker-ce.repo
[docker-ce-stable]
name=Docker CE Stable - $basearch
baseurl=https://mirrors.aliyun.com/docker-ce/linux/centos/7/$basearch/stable
enabled=1
gpgcheck=1
gpgkey=https://mirrors.aliyun.com/docker-ce/linux/centos/gpg
...
...
List all installable versions of the docker-ce package:
~]# dnf list docker-ce --showduplicates
docker-ce.x86_64    3:18.09.6-3.el7    docker-ce-stable
docker-ce.x86_64    3:18.09.7-3.el7    docker-ce-stable
docker-ce.x86_64    3:18.09.8-3.el7    docker-ce-stable
docker-ce.x86_64    3:18.09.9-3.el7    docker-ce-stable
docker-ce.x86_64    3:19.03.0-3.el7    docker-ce-stable
docker-ce.x86_64    3:19.03.1-3.el7    docker-ce-stable
docker-ce.x86_64    3:19.03.2-3.el7    docker-ce-stable
docker-ce.x86_64    3:19.03.3-3.el7    docker-ce-stable
docker-ce.x86_64    3:19.03.4-3.el7    docker-ce-stable
docker-ce.x86_64    3:19.03.5-3.el7    docker-ce-stable
.....
Here we install the latest version of docker; the docker service must be installed on all nodes.
~]# dnf install -y docker-ce docker-ce-cli
Start docker and enable it at boot:

~]# systemctl enable --now docker
Check the version number to verify that docker installed successfully:
~]# docker --version
Docker version 19.03.12, build 48a66213fea
The command above only shows the docker client version. I suggest the method below for checking the docker-ce version instead, since it shows the version of both the docker client and the docker server clearly.
~]# docker version
Client:
 Version:           19.03.12
 API version:       1.40
 Go version:        go1.13.10
 Git commit:        039a7df9ba
 Built:             Wed Sep 4 16:51:21 2019
 OS/Arch:           linux/amd64
 Experimental:      false

Server: Docker Engine - Community
 Engine:
  Version:          19.03.12
  API version:      1.40 (minimum version 1.12)
  Go version:       go1.13.10
  Git commit:       039a7df
  Built:            Wed Sep 4 16:22:32 2019
  OS/Arch:          linux/amd64
  Experimental:     false
Change docker's registry mirror. The default registry address is docker's official one, which is extremely slow to reach from inside China, so replace it with a personal Aliyun mirror.
~]# cat > /etc/docker/daemon.json << EOF
{
  "registry-mirrors": ["https://f1bhsuge.mirror.aliyuncs.com"]
}
EOF
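A malformed daemon.json will prevent dockerd from starting, so it is worth validating the JSON before restarting docker. A small sketch (assumes python3 is available; the demo validates a copy in /tmp so it can run anywhere, whereas on the real host you would point it at /etc/docker/daemon.json):

```shell
# Write a demo copy of the config and check it with python's JSON parser.
cat > /tmp/daemon-demo.json << EOF
{
  "registry-mirrors": ["https://f1bhsuge.mirror.aliyuncs.com"]
}
EOF
python3 -m json.tool /tmp/daemon-demo.json > /dev/null && echo "daemon.json OK"
```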
Since the registry configuration was reloaded, docker needs to be restarted:

~]# systemctl restart docker
Install the kubernetes components

Configure the kubernetes yum repo. Method: open mirrors.aliyun.com in a browser and look for kubernetes to see the mirror repo.
~]# cat > /etc/yum.repos.d/kubernetes.repo << EOF
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=0
repo_gpgcheck=0
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
It is best to regenerate the cache:

~]# dnf clean all
~]# dnf makecache
Install the kubeadm, kubelet and kubectl components. All nodes need these components installed.
[root@master1 ~]# dnf list kubeadm --showduplicates
kubeadm.x86_64    1.17.7-0    kubernetes
kubeadm.x86_64    1.17.7-1    kubernetes
kubeadm.x86_64    1.17.8-0    kubernetes
kubeadm.x86_64    1.17.9-0    kubernetes
kubeadm.x86_64    1.18.0-0    kubernetes
kubeadm.x86_64    1.18.1-0    kubernetes
kubeadm.x86_64    1.18.2-0    kubernetes
kubeadm.x86_64    1.18.3-0    kubernetes
kubeadm.x86_64    1.18.4-0    kubernetes
kubeadm.x86_64    1.18.4-1    kubernetes
kubeadm.x86_64    1.18.5-0    kubernetes
kubeadm.x86_64    1.18.6-0    kubernetes
Because kubernetes versions change very quickly, list the available versions and choose a suitable one. We install version 1.18.6 here.
[root@master1 ~]# dnf install -y kubelet-1.18.6 kubeadm-1.18.6 kubectl-1.18.6
We enable the service at boot, but do not start the kubelet service yet.
[root@master1 ~]# systemctl enable kubelet
Configure a highly available VIP with Haproxy+Keepalived

For high availability we use the officially recommended HAproxy+Keepalived. HAproxy and Keepalived are deployed as daemons on all Master nodes.
Install keepalived and haproxy. Note: they only need to be installed on the three master nodes.
[root@master1 ~]# dnf install -y keepalived haproxy
Configure the Haproxy service

The haproxy configuration is identical on all master nodes; the configuration file is /etc/haproxy/haproxy.cfg. Finish the configuration on the master1 node first, then distribute it to the master2 and master3 nodes.
global
  maxconn 2000
  ulimit-n 16384
  log 127.0.0.1 local0 err
  stats timeout 30s

defaults
  log global
  mode http
  option httplog
  timeout connect 5000
  timeout client 50000
  timeout server 50000
  timeout http-request 15s
  timeout http-keep-alive 15s

frontend monitor-in
  bind *:33305
  mode http
  option httplog
  monitor-uri /monitor

listen stats
  bind *:8006
  mode http
  stats enable
  stats hide-version
  stats uri /stats
  stats refresh 30s
  stats realm Haproxy\ Statistics
  stats auth admin:admin

frontend k8s-master
  bind 0.0.0.0:8443
  bind 127.0.0.1:8443
  mode tcp
  option tcplog
  tcp-request inspect-delay 5s
  default_backend k8s-master

backend k8s-master
  mode tcp
  option tcplog
  option tcp-check
  balance roundrobin
  default-server inter 10s downinter 5s rise 2 fall 2 slowstart 60s maxconn 250 maxqueue 256 weight 100
  server master1 192.168.50.128:6443 check inter 2000 fall 2 rise 2 weight 100
  server master2 192.168.50.129:6443 check inter 2000 fall 2 rise 2 weight 100
  server master3 192.168.50.130:6443 check inter 2000 fall 2 rise 2 weight 100
Note: adjust the IP addresses of the three master nodes here to match your own environment.
Configure the Keepalived service

keepalived uses its track_script mechanism to run a script that probes whether the kubernetes master node is down, and switches nodes accordingly to achieve high availability.

The keepalived configuration for the master1 node is shown below; the configuration file is located at /etc/keepalived/keepalived.conf.
! Configuration File for keepalived
global_defs {
    router_id LVS_DEVEL
}
vrrp_script chk_kubernetes {
    script "/etc/keepalived/check_kubernetes.sh"
    interval 2
    weight -5
    fall 3
    rise 2
}
vrrp_instance VI_1 {
    state MASTER
    interface ens33
    mcast_src_ip 192.168.50.128
    virtual_router_id 51
    priority 100
    advert_int 2
    authentication {
        auth_type PASS
        auth_pass K8SHA_KA_AUTH
    }
    virtual_ipaddress {
        192.168.50.100
    }
#    track_script {
#        chk_kubernetes
#    }
}
A few points to note (remember to modify the first two):

- mcast_src_ip: the multicast source address, which is the IP address of the current host.
- priority: keepalived uses the size of this parameter to arbitrate which node is master. Here we let the master1 node serve kubernetes while the other two act as standbys for now, so master1 is set to 100, master2 to 99, and master3 to 98.
- state: we set the state field on the master1 node to MASTER and change it to BACKUP on the other two nodes.

I place the health-check script in the /etc/keepalived directory; the check_kubernetes.sh script is as follows:
#!/bin/bash
#****************************************************************#
# ScriptName: check_kubernetes.sh
# Author: boming
# Create Date: 2020-06-23 22:19
#***************************************************************#

function check_kubernetes() {
    for ((i=0;i<5;i++));do
        apiserver_pid_id=$(pgrep kube-apiserver)
        if [[ ! -z $apiserver_pid_id ]];then
            return
        else
            sleep 2
        fi
        apiserver_pid_id=0
    done
}

# 1:running  0:stopped
check_kubernetes

if [[ $apiserver_pid_id -eq 0 ]];then
    /usr/bin/systemctl stop keepalived
    exit 1
else
    exit 0
fi
Configure the keepalived service on the master2 and master3 nodes according to the notes above.
Start the Keepalived and Haproxy services

~]# systemctl enable --now keepalived haproxy
To be safe, check the service status and verify that the VIP is reachable:

~]# systemctl status keepalived haproxy
~]# ping 192.168.50.100                     # check that the VIP responds
PING 192.168.50.100 (192.168.50.100) 56(84) bytes of data.
64 bytes from 192.168.50.100: icmp_seq=1 ttl=64 time=0.778 ms
64 bytes from 192.168.50.100: icmp_seq=2 ttl=64 time=0.339 ms
Initialize the Master node

Run the following command on the master node:

[root@master1 ~]# kubeadm config print init-defaults > kubeadm-init.yaml
This kubeadm-init.yaml file is the one we use for initialization; roughly the following parameters need to be modified.
[root@master1 ~]# cat kubeadm-init.yaml
apiVersion: kubeadm.k8s.io/v1beta2
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdef
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 192.168.50.100          # the VIP address
  bindPort: 6443
nodeRegistration:
  criSocket: /var/run/dockershim.sock
  name: master1
  taints:
  - effect: NoSchedule
    key: node-role.kubernetes.io/master
---
apiServer:                                  # add the following two lines
  certSANs:
  - "192.168.50.100"                        # the VIP address
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta2
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: {}
dns:
  type: CoreDNS
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: registry.cn-hangzhou.aliyuncs.com/google_containers   # the Aliyun mirror site
controlPlaneEndpoint: "192.168.50.100:8443" # the VIP address and port
kind: ClusterConfiguration
kubernetesVersion: v1.18.3                  # the kubernetes version
networking:
  dnsDomain: cluster.local
  serviceSubnet: 10.96.0.0/12               # the default is fine; you can also use a custom CIDR
  podSubnet: 10.244.0.0/16                  # add the pod subnet
scheduler: {}
Note: the value of the advertiseAddress field above is not the address of the current host's NIC, but the VIP address of the highly available cluster.
Note: controlPlaneEndpoint above is also filled in with the VIP address, while the port is the haproxy service's port 8443, i.e. this block we configured in haproxy:
frontend k8s-master
  bind 0.0.0.0:8443
  bind 127.0.0.1:8443
  mode tcp
If you customized a different port instead of 8443 in this block, remember to change the port in controlPlaneEndpoint as well.
If you initialize directly with kubeadm init, the system will automatically pull the images during the run, which is slow. I suggest splitting this up, so pull the images in advance here.
[root@master1 ~]# kubeadm config images pull --config kubeadm-init.yaml
[config/images] Pulled registry.aliyuncs.com/google_containers/kube-apiserver:v1.18.0
[config/images] Pulled registry.aliyuncs.com/google_containers/kube-controller-manager:v1.18.0
[config/images] Pulled registry.aliyuncs.com/google_containers/kube-scheduler:v1.18.0
[config/images] Pulled registry.aliyuncs.com/google_containers/kube-proxy:v1.18.0
[config/images] Pulled registry.aliyuncs.com/google_containers/pause:3.1
[config/images] Pulled registry.aliyuncs.com/google_containers/etcd:3.4.3-0
[config/images] Pulled registry.aliyuncs.com/google_containers/coredns:1.6.5
If you see two warning lines at the start (not printed here), don't worry; they are only warnings and do not affect the exercise.
Pull images in advance on the other master nodes

The other two master nodes should also pull the images before initialization to reduce the initialization time:
[root@master1 ~]# scp kubeadm-init.yaml root@master2:~
[root@master1 ~]# scp kubeadm-init.yaml root@master3:~
On the master2 node:

# note: run the following command on the master2 node
[root@master2 ~]# kubeadm config images pull --config kubeadm-init.yaml
On the master3 node:

# note: run the following command on the master3 node
[root@master3 ~]# kubeadm config images pull --config kubeadm-init.yaml
Initialize the kubernetes master1 node with the following command:
[root@master1 ~]# kubeadm init --config kubeadm-init.yaml --upload-certs
[init] Using Kubernetes version: v1.18.3
[preflight] Running pre-flight checks
        [WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[certs] apiserver serving cert is signed for DNS names [master1 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 192.168.50.128 192.168.50.100]
...                             # omitted
[upload-certs] Skipping phase. Please see --upload-certs
[mark-control-plane] Marking the node master1 as control-plane by adding the label "node-role.kubernetes.io/master=''"
[mark-control-plane] Marking the node master1 as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]
[addons] Applied essential addon: CoreDNS
[endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address
[addons] Applied essential addon: kube-proxy

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

You can now join any number of control-plane nodes by copying certificate authorities
and service account keys on each node and then running the following as root:

  kubeadm join 192.168.50.100:8443 --token abcdef.0123456789abcdef \
    --discovery-token-ca-cert-hash sha256:4c738bc8e2684c5d52d80687d48925613b66ab660403649145eb668d71d85648 \
    --control-plane --certificate-key 4931f39d3f53351cb6966a9dcc53cb5cbd2364c6d5b83e50e258c81fbec69539

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 192.168.50.100:8443 --token abcdef.0123456789abcdef \
    --discovery-token-ca-cert-hash sha256:4c738bc8e2684c5d52d80687d48925613b66ab660403649145eb668d71d85648
This whole process takes only about 30s, which is exactly because we pulled the images in advance. If you get no error messages like my run above, and the output ends with roughly the same last 10 lines, the master1 node initialized successfully.
There is still some finishing work before using the cluster; run on the master1 node:
[root@master1 ~]# mkdir -p $HOME/.kube
[root@master1 ~]# sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
[root@master1 ~]# sudo chown $(id -u):$(id -g) $HOME/.kube/config
Then configure an environment variable:

[root@master1 ~]# cat >> ~/.bashrc <<EOF
export KUBECONFIG=/etc/kubernetes/admin.conf
EOF
[root@master1 ~]# source ~/.bashrc
Good; at this point the master1 node is fully initialized.
One important detail: among the last few lines of output there are two commands beginning with kubeadm join 192.168.50.100:8443. These are, respectively, the authentication commands for other master nodes and for node nodes to join the kubernetes cluster. The key is computed by the system using the sha256 algorithm, and only a holder of this key can join the current kubernetes cluster.
The two join commands differ somewhat:
In the first one, notice the trailing --control-plane --certificate-key xxxx; this is the command for control-plane nodes to join the cluster. "Control plane node" is the official kubernetes term; in our setup it simply means the other master nodes.
kubeadm join 192.168.50.100:8443 --token abcdef.0123456789abcdef \
    --discovery-token-ca-cert-hash sha256:4c738bc8e2684c5d52d80687d48925613b66ab660403649145eb668d71d85648 \
    --control-plane --certificate-key 4931f39d3f53351cb6966a9dcc53cb5cbd2364c6d5b83e50e258c81fbec69539
The last one is the command for node nodes to join the cluster, for example:
kubeadm join 192.168.50.100:8443 --token abcdef.0123456789abcdef \
    --discovery-token-ca-cert-hash sha256:4c738bc8e2684c5d52d80687d48925613b66ab660403649145eb668d71d85648
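The value after --discovery-token-ca-cert-hash is simply the SHA-256 digest of the cluster CA's public key, so it can be recomputed at any time. A sketch of the calculation: on a real master the input would be /etc/kubernetes/pki/ca.crt; here a throwaway self-signed certificate is generated purely so the commands can run anywhere.

```shell
# Generate a disposable certificate for demonstration only.
openssl req -x509 -newkey rsa:2048 -nodes -subj "/CN=demo-ca" \
  -keyout /tmp/demo-ca.key -out /tmp/demo-ca.crt -days 1 2>/dev/null

# Extract the public key, DER-encode it, and take its SHA-256 digest;
# this reproduces the discovery-token-ca-cert-hash format used by kubeadm join.
hash=$(openssl x509 -pubkey -in /tmp/demo-ca.crt \
  | openssl pkey -pubin -outform der 2>/dev/null \
  | openssl dgst -sha256 -hex | awk '{print $NF}')
echo "sha256:${hash}"
```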
So when using these commands, be sure to check which type of node is joining the cluster.
If you look at the cluster's nodes now, you will find only the master1 node itself.

[root@master1 ~]# kubectl get node
NAME      STATUS     ROLES    AGE     VERSION
master1   NotReady   master   9m58s   v1.18.4
Next we join the other two master nodes to the kubernetes cluster.
Join the other master nodes to the kubernetes cluster

Join the master2 node to the cluster. Since another master node is joining, the command to use is:
[root@master2 ~]# kubeadm join 192.168.50.100:8443 --token abcdef.0123456789abcdef \
    --discovery-token-ca-cert-hash sha256:4c738bc8e2684c5d52d80687d48925613b66ab660403649145eb668d71d85648 \
    --control-plane --certificate-key 4931f39d3f53351cb6966a9dcc53cb5cbd2364c6d5b83e50e258c81fbec69539
[preflight] Running pre-flight checks
        [WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The ......
......                                  # omitted
[mark-control-plane] Marking the node master2 as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]

This node has joined the cluster and a new control plane instance was created:

To start administering your cluster from this node, you need to run the following as a regular user:

    mkdir -p $HOME/.kube
    sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
    sudo chown $(id -u):$(id -g) $HOME/.kube/config

Run 'kubectl get nodes' to see this node join the cluster.
No errors appear, so the node joined the cluster successfully; now do the same finishing work:
[root@master2 ~]# mkdir -p $HOME/.kube
[root@master2 ~]# sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
[root@master2 ~]# sudo chown $(id -u):$(id -g) $HOME/.kube/config
Add the environment variable:

[root@master2 ~]# cat >> ~/.bashrc <<EOF
export KUBECONFIG=/etc/kubernetes/admin.conf
EOF
[root@master2 ~]# source ~/.bashrc
Join the master3 node to the cluster:

[root@master3 ~]# kubeadm join 192.168.50.100:8443 --token abcdef.0123456789abcdef \
    --discovery-token-ca-cert-hash sha256:4c738bc8e2684c5d52d80687d48925613b66ab660403649145eb668d71d85648 \
    --control-plane --certificate-key 4931f39d3f53351cb6966a9dcc53cb5cbd2364c6d5b83e50e258c81fbec69539
Do the finishing work:

[root@master3 ~]# mkdir -p $HOME/.kube
[root@master3 ~]# sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
[root@master3 ~]# sudo chown $(id -u):$(id -g) $HOME/.kube/config
[root@master3 ~]# cat >> ~/.bashrc <<EOF
export KUBECONFIG=/etc/kubernetes/admin.conf
EOF
[root@master3 ~]# source ~/.bashrc
At this point, all master nodes have joined the cluster.
Check the master nodes:

[root@master1 ~]# kubectl get node
NAME      STATUS     ROLES    AGE     VERSION
master1   NotReady   master   25m     v1.18.4
master2   NotReady   master   12m     v1.18.4
master3   NotReady   master   3m30s   v1.18.4
You can run kubectl get node on any master node to view the cluster's nodes.
Join the node nodes to the kubernetes cluster

As mentioned above, after the master1 node finished initializing, the second kubeadm join xxx command (i.e. the last lines of output) is the one for node nodes to join the cluster.
~]# kubeadm join 192.168.50.100:8443 --token abcdef.0123456789abcdef \
    --discovery-token-ca-cert-hash sha256:4c738bc8e2684c5d52d80687d48925613b66ab660403649145eb668d71d85648
Note: a node only needs to run this one command to join the cluster; as long as there are no errors, it succeeded. Unlike a master, no finishing work (environment variables and so on) is required.
Join the node1 node to the cluster:

[root@node1 ~]# kubeadm join 192.168.50.100:8443 --token abcdef.0123456789abcdef \
>     --discovery-token-ca-cert-hash sha256:4c738bc8e2684c5d52d80687d48925613b66ab660403649145eb668d71d85648
[preflight] Running pre-flight checks
        [WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
[preflight] Reading configuration from the cluster...
....
....
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...

This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the control-plane to see this node join the cluster.
The line "This node has joined the cluster" (fourth from the bottom) indicates that the node1 node joined the cluster successfully.
Join the node2 node to the cluster:

[root@node2 ~]# kubeadm join 192.168.50.100:8443 --token abcdef.0123456789abcdef \
    --discovery-token-ca-cert-hash sha256:4c738bc8e2684c5d52d80687d48925613b66ab660403649145eb668d71d85648
We can now run the following on any master node to view the cluster's node information:
[root@master1 ~]# kubectl get nodes
NAME      STATUS     ROLES    AGE     VERSION
master1   NotReady   master   20h     v1.18.4
master2   NotReady   master   20h     v1.18.4
master3   NotReady   master   20h     v1.18.4
node1     NotReady   <none>   5m15s   v1.18.4
node2     NotReady   <none>   5m11s   v1.18.4
能够看到集群的五个节点都已经存在,可是如今还不能用,也就是说如今集群节点是不可用的,缘由在于上面的第2个字段,咱们看到五个节点都是`NotReady
状态,这是由于咱们尚未安装网络插件。
There are network plugins such as calico and flannel; here we choose the flannel plugin.
The tutorials you typically find online install it with this command:

~]# kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
In practice many users cannot get this to work because of network restrictions in China, so do it as follows instead.
Change the flannel image source

On the master1 node, add the following entry to the local hosts file so the domain resolves:

199.232.28.133 raw.githubusercontent.com
Then download the flannel manifest:

[root@master1 ~]# curl -o kube-flannel.yml https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
Edit the image source: change every quay.io in the yaml file to quay-mirror.qiniu.com.

[root@master1 ~]# sed -i 's/quay.io/quay-mirror.qiniu.com/g' kube-flannel.yml
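To confirm the substitution caught every reference, you can grep the file afterwards. A quick demonstration on a two-line stand-in file (on the real host the target is kube-flannel.yml; the image tags here are illustrative):

```shell
# Build a tiny stand-in file with two quay.io image references.
cat > /tmp/kube-flannel-demo.yml << 'EOF'
        image: quay.io/coreos/flannel:v0.12.0-amd64
        image: quay.io/coreos/flannel:v0.12.0-arm64
EOF

sed -i 's/quay.io/quay-mirror.qiniu.com/g' /tmp/kube-flannel-demo.yml

# Count the rewritten references and confirm none of the old ones remain.
grep -c 'quay-mirror.qiniu.com' /tmp/kube-flannel-demo.yml    # prints 2
grep -q 'quay\.io' /tmp/kube-flannel-demo.yml || echo "no quay.io references left"
```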
Save and exit, then run this command on the master node:
[root@master1 ~]# kubectl apply -f kube-flannel.yml
podsecuritypolicy.policy/psp.flannel.unprivileged created
clusterrole.rbac.authorization.k8s.io/flannel created
clusterrolebinding.rbac.authorization.k8s.io/flannel created
serviceaccount/flannel created
configmap/kube-flannel-cfg created
daemonset.apps/kube-flannel-ds-amd64 created
daemonset.apps/kube-flannel-ds-arm64 created
daemonset.apps/kube-flannel-ds-arm created
daemonset.apps/kube-flannel-ds-ppc64le created
daemonset.apps/kube-flannel-ds-s390x created
This way the flannel image can be pulled successfully. Of course, you can also use the kube-flannel.yml file I provide.
Check whether flannel is running normally

To see whether the flannel pods are running normally, use:
[root@master1 ~]# kubectl get pods -n kube-system | grep flannel
NAME                          READY   STATUS    RESTARTS   AGE
kube-flannel-ds-amd64-dp972   1/1     Running   0          66s
kube-flannel-ds-amd64-lkspx   1/1     Running   0          66s
kube-flannel-ds-amd64-rmsdk   1/1     Running   0          66s
kube-flannel-ds-amd64-wp668   1/1     Running   0          66s
kube-flannel-ds-amd64-zkrwh   1/1     Running   0          66s
If the third field, STATUS, is not Running, flannel is abnormal and you need to troubleshoot the problem.
Check whether the nodes are Ready

Wait a moment, then run the following to check whether the nodes are available:
[root@master1 ~]# kubectl get nodes
NAME      STATUS   ROLES    AGE   VERSION
master1   Ready    master   21h   v1.18.4
master2   Ready    master   21h   v1.18.4
master3   Ready    master   21h   v1.18.4
node1     Ready    <none>   62m   v1.18.4
node2     Ready    <none>   62m   v1.18.4
The node status is now Ready, meaning the cluster nodes are usable.
Test the kubernetes cluster

Create an nginx pod

Now create an nginx pod in the kubernetes cluster to verify that it runs correctly.
Run the following steps on the master node:
[root@master1 ~]# kubectl create deployment nginx --image=nginx
deployment.apps/nginx created
[root@master1 ~]# kubectl expose deployment nginx --port=80 --type=NodePort
service/nginx exposed
Now look at the pod and the service:

[root@master1 ~]# kubectl get pod,svc -o wide
In the printed result, the first half is pod information and the second half is service information. From the service/nginx line we can see that the port the service exposes on the cluster is 30249. Remember this port.
From the pod details we can also see that the pod is currently on the node2 node, whose IP address is 192.168.50.132.
Verify the cluster with nginx

Now access it: open a browser (Firefox is recommended) at http://192.168.50.132:30249
Install the dashboard

First download the dashboard manifest. Since we already added the hosts entry earlier, the download works:
[root@master1 ~]# wget https://raw.githubusercontent.com/kubernetes/dashboard/v2.0.0-beta8/aio/deploy/recommended.yaml
By default the Dashboard is only reachable from inside the cluster; change the Service to the NodePort type to expose it externally.
Around lines 32-44 of the file, modify it as follows:
kind: Service
apiVersion: v1
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard
  namespace: kubernetes-dashboard
spec:
  type: NodePort      # add this line
  ports:
    - port: 443
      targetPort: 8443
      nodePort: 30001 # add this line; port 30001 can be customized
  selector:
    k8s-app: kubernetes-dashboard
Apply the yaml file:

[root@master1 ~]# kubectl apply -f recommended.yaml
namespace/kubernetes-dashboard created
serviceaccount/kubernetes-dashboard created
service/kubernetes-dashboard created
secret/kubernetes-dashboard-certs created
...
service/dashboard-metrics-scraper created
deployment.apps/dashboard-metrics-scraper created
Check whether the dashboard is running normally:

[root@master1 ~]# kubectl get pods -n kubernetes-dashboard
NAME                                         READY   STATUS    RESTARTS   AGE
dashboard-metrics-scraper-694557449d-mlnl4   1/1     Running   0          2m31s
kubernetes-dashboard-9774cc786-ccvcf         1/1     Running   0          2m31s
Look mainly at the STATUS column: if it is Running, and the RESTARTS field is 0 (or at least not steadily growing), everything is normal so far and we can continue to the next step.
Check which node the dashboard pod is running on.
从上面能够看出,kubernetes-dashboard-9774cc786-ccvcf
运行所在的节点是node2
上面,而且暴漏出来的端口是30001
,因此访问地址是:https://192.168.50.132:30001
Visit it with Firefox. You will be asked for a token when logging in; its value can be retrieved like this:
[root@master1 ~]# kubectl describe secrets -n kube-system $(kubectl -n kube-system get secret | awk '/dashboard-admin/{print $1}')
Enter the token value above to get into the dashboard UI.
Although we can now log in, we don't yet have enough permission to view cluster information, because we haven't bound a cluster role. You can try the steps above first, then do the following:
[root@master1 ~]# kubectl create serviceaccount dashboard-admin -n kube-system
[root@master1 ~]# kubectl create clusterrolebinding dashboard-admin --clusterrole=cluster-admin --serviceaccount=kube-system:dashboard-admin
[root@master1 ~]# kubectl describe secrets -n kube-system $(kubectl -n kube-system get secret | awk '/dashboard-admin/{print $1}')
Then log in to the dashboard again with the token printed by the output.
(1)其余master节点没法加入集群
[check-etcd] Checking that the etcd cluster is healthy
error execution phase check-etcd: error syncing endpoints with etc: context deadline exceeded
To see the stack trace of this error execute with --v=5 or higher
Check whether the cluster's high-availability configuration has a problem, for example whether the master/backup state and the priority values in the keepalived configuration are set correctly.