Kubeadm is an important tool for managing a cluster's lifecycle, from creation through configuration to upgrades. It handles bootstrapping production clusters on existing hardware and configures the core Kubernetes components in a best-practice manner, providing a secure yet simple join flow for new nodes and supporting easy upgrades. With the release of Kubernetes 1.13, kubeadm has officially reached GA.
First, prepare two virtual machines (with at least 2 CPU cores each). I created two Ubuntu 18.04 VMs under Hyper-V, with the following IPs and hostnames:

172.17.20.210 master
172.17.20.211 node1
Since version 1.8, Kubernetes has required that swap be disabled; if it is left on, kubelet will fail to start under the default configuration.
Edit the /etc/fstab file:

sudo vim /etc/fstab

UUID=8be04efd-f7c5-11e8-be8b-00155d000500 / ext4 defaults 0 0
UUID=C0E3-6A72 /boot/efi vfat defaults 0 0
#/swap.img none swap sw 0 0
As shown above, comment out the line containing /swap.img, then run:

sudo swapoff -a
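To confirm that swap is really off, a quick check (the Swap line should read all zeros):

free -m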
In Ubuntu 18.04+, DNS is managed entirely by systemd, with the stub listener on 127.0.0.53:53 and the configuration file at /etc/systemd/resolved.conf. This can sometimes cause domain-name resolution to fail; either of the following two approaches will fix it:
1. The simplest is to stop the systemd-resolved service:

sudo systemctl stop systemd-resolved
sudo systemctl disable systemd-resolved

and then maintain the /etc/resolv.conf file by hand.
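For reference, a minimal hand-written /etc/resolv.conf could look like the sketch below (the resolver addresses are just examples; any reachable DNS servers will do):

sudo vim /etc/resolv.conf

nameserver 1.1.1.1
nameserver 1.0.0.1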
2. The more recommended approach is to change the systemd-resolved settings:

sudo vim /etc/systemd/resolved.conf

# change to the following
[Resolve]
DNS=1.1.1.1 1.0.0.1
#FallbackDNS=
#Domains=
LLMNR=no
#MulticastDNS=no
#DNSSEC=no
#Cache=yes
#DNSStubListener=yes
DNS= sets the IP addresses of the DNS servers, here 1.1.1.1 and 1.0.0.1.
LLMNR=no disables LLMNR (Link-Local Multicast Name Resolution); otherwise systemd-resolved will listen on port 5355.
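After editing, the service needs a restart to pick up the new settings; on Ubuntu 18.04 the effective configuration can then be inspected (a quick sketch using the stock tooling):

sudo systemctl restart systemd-resolved

# show the DNS servers actually in effect
systemd-resolve --status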
Kubernetes has used the CRI (Container Runtime Interface) since 1.6. The default container runtime is still Docker, implemented through the dockershim CRI built into the kubelet.
For installing Docker, see the earlier post: Docker初体验.
Note that Kubernetes 1.13 has been validated against Docker versions 1.11.1, 1.12.1, 1.13.1, 17.03, 17.06, 17.09, and 18.06. The lowest supported Docker version is 1.11.1 and the highest is 18.06, while the latest Docker release is already 18.09, so we need to pin the version to 18.06.1-ce when installing:

sudo apt install docker-ce=18.06.1~ce~3-0~ubuntu
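To keep the pinned version from being upgraded accidentally, we can verify it and put the package on hold (assuming the apt-packaged docker-ce and a running daemon):

docker version --format '{{.Server.Version}}'   # should print 18.06.1-ce
sudo apt-mark hold docker-ce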
Before deploying, we need to install three packages:

kubeadm: the command-line tool that bootstraps the k8s cluster.
kubelet: the core component that runs on every node in the cluster, performing operations such as starting pods and containers.
kubectl: the command-line tool for operating the cluster.
First add the apt key:

sudo apt update && sudo apt install -y apt-transport-https curl
curl -s https://mirrors.aliyun.com/kubernetes/apt/doc/apt-key.gpg | sudo apt-key add -
Add the Kubernetes source:

sudo vim /etc/apt/sources.list.d/kubernetes.list

deb https://mirrors.aliyun.com/kubernetes/apt/ kubernetes-xenial main
Install:

sudo apt update
sudo apt install -y kubelet kubeadm kubectl
sudo apt-mark hold kubelet kubeadm kubectl
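A quick sanity check that the expected versions were installed:

kubeadm version -o short
kubectl version --client --short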
The k8s control-plane components run on the Master node, including etcd and the API server (kubectl communicates with k8s through the API server).
Before running the initialization, there are three things to note:

1. Choose a network plugin, and check whether it requires any parameters to be set when initializing the Master; for example, we may need to set the --pod-network-cidr parameter depending on the chosen plugin. See: Installing a pod network add-on.

2. kubeadm uses the network interface associated with the default gateway (usually the internal IP) as the Master node's advertise address. To use a different network interface, set the --apiserver-advertise-address=<ip-address> parameter. If IPv6 is used, an IPv6 address must be given, e.g. --apiserver-advertise-address=fd00::101.

3. Run kubeadm config images pull ahead of time to pre-pull the images needed for initialization and to check connectivity to the Kubernetes registries, as sketched right after this list.
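A small sketch of point 3 (flags as in kubeadm v1.13; the pull hits k8s.gcr.io by default, see the mirror discussion below):

# list the images this kubeadm version needs
kubeadm config images list --kubernetes-version v1.13.1

# pre-pull them to verify registry connectivity
sudo kubeadm config images pull --kubernetes-version v1.13.1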
The default Kubernetes registry address is k8s.gcr.io, which obviously cannot be reached from within mainland China, so installation with kubeadm versions before v1.13 was very painful. Version 1.13 finally solved this domestic pain point by adding an --image-repository parameter (default value k8s.gcr.io); we point it at the domestic mirror registry.aliyuncs.com/google_containers, and everything else can follow the official documentation happily.

We also need to specify the --kubernetes-version parameter, because its default value is stable-1, which causes the latest version number to be downloaded from https://dl.k8s.io/release/stable-1.txt; we can pin it to a fixed version (latest: v1.13.1) to skip that network request.
Now, let's give it a try:
# use the Calico network: --pod-network-cidr=192.168.0.0/16
sudo kubeadm init --image-repository registry.aliyuncs.com/google_containers --kubernetes-version v1.13.1 --pod-network-cidr=192.168.0.0/16

# output
[init] Using Kubernetes version: v1.13.1
[preflight] Running pre-flight checks
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Activating the kubelet service
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [master kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 172.17.20.210]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [master localhost] and IPs [172.17.20.210 127.0.0.1 ::1]
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [master localhost] and IPs [172.17.20.210 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[kubelet-check] Initial timeout of 40s passed.
[apiclient] All control plane components are healthy after 42.003645 seconds
[uploadconfig] storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.13" in namespace kube-system with the configuration for the kubelets in the cluster
[patchnode] Uploading the CRI Socket information "/var/run/dockershim.sock" to the Node API object "master" as an annotation
[mark-control-plane] Marking the node master as control-plane by adding the label "node-role.kubernetes.io/master=''"
[mark-control-plane] Marking the node master as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]
[bootstrap-token] Using token: 6pkrlg.8glf2fqpuf3i489m
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstraptoken] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstraptoken] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstraptoken] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstraptoken] creating the "cluster-info" ConfigMap in the "kube-public" namespace
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy

Your Kubernetes master has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

You can now join any number of machines by running the following on each node
as root:

  kubeadm join 172.17.20.210:6443 --token 6pkrlg.8glf2fqpuf3i489m --discovery-token-ca-cert-hash sha256:eebfe256113bee397b218ba832f412273ae734bd4686241fb910885d26efd222
This time the deployment went through very smoothly. If we want to operate kubectl as a non-root user, we can run the following commands, which are also part of the kubeadm init output:

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
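Alternatively, if operating as root, it is enough to point KUBECONFIG at the admin config (as the official documentation suggests):

export KUBECONFIG=/etc/kubernetes/admin.conf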
For Pods to communicate with each other, we must install a network plugin, and it must be installed before deploying any applications; CoreDNS will not start until a network plugin is in place.

For the complete list of network plugins, see Networking and Network Policy.
Before installing, let's check the current state of the Pods:

kubectl get pods --all-namespaces

# output
NAMESPACE     NAME                             READY   STATUS    RESTARTS   AGE
kube-system   coredns-78d4cf999f-6pgfr         0/1     Pending   0          87s
kube-system   coredns-78d4cf999f-m9kgs         0/1     Pending   0          87s
kube-system   etcd-master                      1/1     Running   0          47s
kube-system   kube-apiserver-master            1/1     Running   0          38s
kube-system   kube-controller-manager-master   1/1     Running   0          55s
kube-system   kube-proxy-mkg24                 1/1     Running   0          87s
kube-system   kube-scheduler-master            1/1     Running   0          41s

As shown above, CoreDNS is in the Pending state because we have not yet installed a network plugin.
Calico is a pure layer-3 virtual networking solution. Calico assigns an IP to every container, and every host acts as a router, connecting the containers on different hosts. Unlike VxLAN, Calico adds no extra packet encapsulation and needs no NAT or port mapping, so its scalability and performance are both very good.

By default, the Calico network plugin uses the 192.168.0.0/16 subnet. We already passed --pod-network-cidr=192.168.0.0/16 at init time to match Calico; you can of course also edit the calico.yaml file to specify a different subnet, as sketched below.
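One way to do that (a sketch; 10.244.0.0/16 is just an example subnet, and it assumes calico.yaml contains the default 192.168.0.0/16 literally) is to download the manifest, rewrite the CIDR, and apply the local copy, remembering to pass the same value to --pod-network-cidr during init:

curl -sO http://mirror.faasx.com/k8s/calico/v3.3.2/calico.yaml

# replace the default pod subnet with the example value
sed -i 's#192.168.0.0/16#10.244.0.0/16#g' calico.yaml
kubectl apply -f calico.yaml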
The Calico plugin can be installed with the following commands:

kubectl apply -f https://docs.projectcalico.org/v3.3/getting-started/kubernetes/installation/hosted/rbac-kdd.yaml
kubectl apply -f https://docs.projectcalico.org/v3.3/getting-started/kubernetes/installation/hosted/kubernetes-datastore/calico-networking/1.7/calico.yaml

# the calico.yaml above pulls images from quay.io; if that fails, use the domestic mirror below
kubectl apply -f http://mirror.faasx.com/k8s/calico/v3.3.2/rbac-kdd.yaml
kubectl apply -f http://mirror.faasx.com/k8s/calico/v3.3.2/calico.yaml
For more information about Calico, see the official documentation: kubeadm quickstart.
After a short wait, run kubectl get pods --all-namespaces again to check how the network plugin installation is going:

kubectl get pods --all-namespaces

# output
NAMESPACE     NAME                             READY   STATUS    RESTARTS   AGE
kube-system   calico-node-x96gn                2/2     Running   0          47s
kube-system   coredns-78d4cf999f-6pgfr         1/1     Running   0          54m
kube-system   coredns-78d4cf999f-m9kgs         1/1     Running   0          54m
kube-system   etcd-master                      1/1     Running   3          53m
kube-system   kube-apiserver-master            1/1     Running   3          53m
kube-system   kube-controller-manager-master   1/1     Running   3          53m
kube-system   kube-proxy-mkg24                 1/1     Running   2          54m
kube-system   kube-scheduler-master            1/1     Running   3          53m

As shown above, STATUS has changed to Running for everything, which means the installation succeeded; we can now join other nodes and deploy applications.
By default, for security reasons, the cluster does not schedule pods on the Master node. But in a development environment we may have only a single Master node, in which case the restriction can be lifted with the following command:

kubectl taint nodes --all node-role.kubernetes.io/master-

# output
node/master untainted
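If you later want to restore the default behavior, the taint can be re-applied (a sketch using the same taint key that kubeadm sets):

kubectl taint nodes master node-role.kubernetes.io/master=:NoSchedule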
To add worker nodes to the cluster, run on each machine the command printed by kubeadm init:

kubeadm join --token <token> <master-ip>:<master-port> --discovery-token-ca-cert-hash sha256:<hash>
If we have forgotten the Master node's join token, it can be looked up with the following command:

kubeadm token list

# output
TOKEN                     TTL   EXPIRES                USAGES                   DESCRIPTION                                                 EXTRA GROUPS
6pkrlg.8glf2fqpuf3i489m   22h   2018-12-07T13:46:33Z   authentication,signing   The default bootstrap token generated by 'kubeadm init'.   system:bootstrappers:kubeadm:default-node-token
By default, a token is valid for 24 hours. If our token has already expired, a new one can be generated with:

kubeadm token create

# output
u2mt59.tyqpo0v5wf05lx2q
If we also don't have the value of --discovery-token-ca-cert-hash, it can be generated with:

openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2>/dev/null | openssl dgst -sha256 -hex | sed 's/^.* //'

# output
eebfe256113bee397b218ba832f412273ae734bd4686241fb910885d26efd222
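Kubeadm can also print a ready-made join command in one step, which saves assembling the token and hash by hand (assuming the flag is available in your kubeadm version):

kubeadm token create --print-join-command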
Now, log in to the worker node and run the following command to join the cluster (this is also part of the init output above):

sudo kubeadm join 172.17.20.210:6443 --token 6pkrlg.8glf2fqpuf3i489m --discovery-token-ca-cert-hash sha256:eebfe256113bee397b218ba832f412273ae734bd4686241fb910885d26efd222

# output
[sudo] password for raining:
[preflight] Running pre-flight checks
[discovery] Trying to connect to API Server "172.17.20.210:6443"
[discovery] Created cluster-info discovery client, requesting info from "https://172.17.20.210:6443"
[discovery] Requesting info from "https://172.17.20.210:6443" again to validate TLS against the pinned public key
[discovery] Cluster info signature and contents are valid and TLS certificate validates against pinned roots, will use API Server "172.17.20.210:6443"
[discovery] Successfully established connection with API Server "172.17.20.210:6443"
[join] Reading configuration from the cluster...
[join] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[kubelet] Downloading configuration for the kubelet from the "kubelet-config-1.13" ConfigMap in the kube-system namespace
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Activating the kubelet service
[tlsbootstrap] Waiting for the kubelet to perform the TLS Bootstrap...
[patchnode] Uploading the CRI Socket information "/var/run/dockershim.sock" to the Node API object "node1" as an annotation

This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the master to see this node join the cluster.
After waiting a bit, we can check the node status on the Master node with the kubectl get nodes command:

kubectl get nodes

# output
NAME     STATUS   ROLES    AGE   VERSION
master   Ready    master   17m   v1.13.1
node1    Ready    <none>   15m   v1.13.1

As shown above, everything is Ready. Mission accomplished! We can now run a few commands to test whether the cluster works properly.
First, verify that kube-apiserver, kube-controller-manager, kube-scheduler, and the pod network are working:

# deploy an Nginx Deployment with two Pods
# https://kubernetes.io/docs/concepts/workloads/controllers/deployment/
kubectl create deployment nginx --image=nginx:alpine
kubectl scale deployment nginx --replicas=2

# verify that the Nginx Pods run correctly and are assigned cluster IPs starting with 192.168.
kubectl get pods -l app=nginx -o wide

# output:
NAME                     READY   STATUS    RESTARTS   AGE   IP            NODE    NOMINATED NODE   READINESS GATES
nginx-54458cd494-p8jzs   1/1     Running   0          31s   192.168.1.2   node1   <none>           <none>
nginx-54458cd494-v2m4b   1/1     Running   0          24s   192.168.1.3   node1   <none>           <none>
Next, verify that kube-proxy is working:

# expose the service externally via NodePort  https://kubernetes.io/docs/concepts/services-networking/connect-applications-service/
kubectl expose deployment nginx --port=80 --type=NodePort

# check the port accessible from outside the cluster
kubectl get services nginx

# output
NAME    TYPE       CLUSTER-IP     EXTERNAL-IP   PORT(S)        AGE
nginx   NodePort   10.110.49.49   <none>        80:31899/TCP   4s

# the service can be reached from outside the cluster via any NodeIP:Port;
# in this example the two nodes' IPs are 172.17.20.210 and 172.17.20.211
curl http://172.17.20.210:31899
curl http://172.17.20.211:31899
Finally, verify that DNS and the pod network are working:

# run busyboxplus in interactive mode
kubectl run -it curl --image=radial/busyboxplus:curl

# run `nslookup nginx` to check that the service's cluster IP resolves correctly, verifying that DNS works
[ root@curl-66959f6557-6sfqh:/ ]$ nslookup nginx
# output
Server:    10.96.0.10
Address 1: 10.96.0.10 kube-dns.kube-system.svc.cluster.local

Name:      nginx
Address 1: 10.110.49.49 nginx.default.svc.cluster.local

# access the service by name to verify that kube-proxy works
[ root@curl-66959f6557-6sfqh:/ ]$ curl http://nginx/
# output:
# <!DOCTYPE html> --- omitted

# access the two Pods' internal IPs to verify that cross-node pod networking works
[ root@curl-66959f6557-6sfqh:/ ]$ curl http://192.168.1.2/
[ root@curl-66959f6557-6sfqh:/ ]$ curl http://192.168.1.3/
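Once the checks are done, the demo resources can be cleaned up (a sketch; it assumes kubectl run created a Deployment named curl, which kubectl get deployments will confirm):

kubectl delete service nginx
kubectl delete deployment nginx curl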
All checks pass and the cluster is up. Next, we can follow the official documentation to deploy other services and enjoy.
To undo what kubeadm has done, first drain the node and make sure it is empty before shutting it down.

On the Master node, run:

kubectl drain <node name> --delete-local-data --force --ignore-daemonsets
kubectl delete node <node name>
Then, on the node being removed, reset the state kubeadm installed:

sudo kubeadm reset
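Note that kubeadm reset does not clean up iptables rules or IPVS tables; per the official documentation, they can be flushed manually if needed:

# flush the iptables rules left behind
sudo iptables -F && sudo iptables -t nat -F && sudo iptables -t mangle -F && sudo iptables -X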
If you want to reconfigure the cluster, simply run kubeadm init or kubeadm join again with the new parameters.