Use kubeadm to configure multiple master nodes for high availability.
lab1: etcd master haproxy keepalived 11.11.11.111
lab2: etcd master haproxy keepalived 11.11.11.112
lab3: etcd master haproxy keepalived 11.11.11.113
lab4: node 11.11.11.114
lab5: node 11.11.11.115
lab6: node 11.11.11.116
vip (load balancer IP): 11.11.11.110
Vagrantfile
# -*- mode: ruby -*-
# vi: set ft=ruby :
ENV["LC_ALL"] = "en_US.UTF-8"
Vagrant.configure("2") do |config|
  (1..6).each do |i|
    config.vm.define "lab#{i}" do |node|
      node.vm.box = "centos-7.4-docker-17"
      node.ssh.insert_key = false
      node.vm.hostname = "lab#{i}"
      node.vm.network "private_network", ip: "11.11.11.11#{i}"
      node.vm.provision "shell",
        inline: "echo hello from node #{i}"
      node.vm.provider "virtualbox" do |v|
        v.cpus = 2
        v.customize ["modifyvm", :id, "--name", "lab#{i}", "--memory", "2048"]
      end
    end
  end
end
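See the earlier article "centos7安装kubeadm" (installing kubeadm on CentOS 7).
# Configure kubelet to use a pause image from a mirror reachable inside China
# Edit /etc/systemd/system/kubelet.service.d/10-kubeadm.conf
# and add the following line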
Environment="KUBELET_EXTRA_ARGS=--pod-infra-container-image=registry.cn-shanghai.aliyuncs.com/gcr-k8s/pause-amd64:3.0"
# Or make the same change with sed
sed -i '/ExecStart=$/i Environment="KUBELET_EXTRA_ARGS=--pod-infra-container-image=registry.cn-shanghai.aliyuncs.com/gcr-k8s/pause-amd64:3.0"' /etc/systemd/system/kubelet.service.d/10-kubeadm.conf
# Reload the systemd configuration
systemctl daemon-reload
cat >>/etc/hosts<<EOF
11.11.11.111 lab1
11.11.11.112 lab2
11.11.11.113 lab3
11.11.11.114 lab4
11.11.11.115 lab5
11.11.11.116 lab6
EOF
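Appending with cat >> adds the same six lines again every time the step is re-run. As a minimal sketch, assuming hostnames as simple as the ones in this lab, the helper below (add_host is a hypothetical name, not an existing tool) appends an entry only when the hostname is not already present:

```shell
#!/bin/sh
# Hypothetical helper: append "IP NAME" to a hosts file only when NAME has no
# entry yet, so the step can be repeated safely.
# $1 = IP, $2 = hostname, $3 = hosts file (defaults to /etc/hosts)
add_host() {
  file=${3:-/etc/hosts}
  if ! grep -q "[[:space:]]$2\$" "$file" 2>/dev/null; then
    printf '%s %s\n' "$1" "$2" >> "$file"
  fi
}

# Usage on each node (writes to /etc/hosts, so run as root):
#   for i in 1 2 3 4 5 6; do add_host "11.11.11.11$i" "lab$i"; done
```

The match is a plain grep on the end of the line, so it would need tightening for hostnames that contain regex characters or for lines that carry aliases.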
Start the etcd cluster on lab1, lab2, and lab3.
# lab1
docker stop etcd && docker rm etcd
rm -rf /data/etcd
mkdir -p /data/etcd
docker run -d \
--restart always \
-v /etc/etcd/ssl/certs:/etc/ssl/certs \
-v /data/etcd:/var/lib/etcd \
-p 2380:2380 \
-p 2379:2379 \
--name etcd \
registry.cn-hangzhou.aliyuncs.com/google_containers/etcd-amd64:3.1.12 \
etcd --name=etcd0 \
--advertise-client-urls=http://11.11.11.111:2379 \
--listen-client-urls=http://0.0.0.0:2379 \
--initial-advertise-peer-urls=http://11.11.11.111:2380 \
--listen-peer-urls=http://0.0.0.0:2380 \
--initial-cluster-token=9477af68bbee1b9ae037d6fd9e7efefd \
--initial-cluster=etcd0=http://11.11.11.111:2380,etcd1=http://11.11.11.112:2380,etcd2=http://11.11.11.113:2380 \
--initial-cluster-state=new \
--auto-tls \
--peer-auto-tls \
--data-dir=/var/lib/etcd
# lab2
docker stop etcd && docker rm etcd
rm -rf /data/etcd
mkdir -p /data/etcd
docker run -d \
--restart always \
-v /etc/etcd/ssl/certs:/etc/ssl/certs \
-v /data/etcd:/var/lib/etcd \
-p 2380:2380 \
-p 2379:2379 \
--name etcd \
registry.cn-hangzhou.aliyuncs.com/google_containers/etcd-amd64:3.1.12 \
etcd --name=etcd1 \
--advertise-client-urls=http://11.11.11.112:2379 \
--listen-client-urls=http://0.0.0.0:2379 \
--initial-advertise-peer-urls=http://11.11.11.112:2380 \
--listen-peer-urls=http://0.0.0.0:2380 \
--initial-cluster-token=9477af68bbee1b9ae037d6fd9e7efefd \
--initial-cluster=etcd0=http://11.11.11.111:2380,etcd1=http://11.11.11.112:2380,etcd2=http://11.11.11.113:2380 \
--initial-cluster-state=new \
--auto-tls \
--peer-auto-tls \
--data-dir=/var/lib/etcd
# lab3
docker stop etcd && docker rm etcd
rm -rf /data/etcd
mkdir -p /data/etcd
docker run -d \
--restart always \
-v /etc/etcd/ssl/certs:/etc/ssl/certs \
-v /data/etcd:/var/lib/etcd \
-p 2380:2380 \
-p 2379:2379 \
--name etcd \
registry.cn-hangzhou.aliyuncs.com/google_containers/etcd-amd64:3.1.12 \
etcd --name=etcd2 \
--advertise-client-urls=http://11.11.11.113:2379 \
--listen-client-urls=http://0.0.0.0:2379 \
--initial-advertise-peer-urls=http://11.11.11.113:2380 \
--listen-peer-urls=http://0.0.0.0:2380 \
--initial-cluster-token=9477af68bbee1b9ae037d6fd9e7efefd \
--initial-cluster=etcd0=http://11.11.11.111:2380,etcd1=http://11.11.11.112:2380,etcd2=http://11.11.11.113:2380 \
--initial-cluster-state=new \
--auto-tls \
--peer-auto-tls \
--data-dir=/var/lib/etcd
# Verify the cluster and check its health
docker exec -ti etcd ash
etcdctl member list
etcdctl cluster-health
exit
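The three docker run commands above differ only in the member name and the node IP. As a minimal sketch (etcd_initial_cluster and etcd_member_args are hypothetical helper names, not part of etcd or Docker), the per-node flags can be derived from a single node list so that the three commands cannot drift apart:

```shell
#!/bin/sh
# Hypothetical helpers: derive the per-node etcd flags from one node list.
# Member names and IPs are the ones used in this lab.
ETCD_NODES="etcd0=11.11.11.111 etcd1=11.11.11.112 etcd2=11.11.11.113"

# Print the value for --initial-cluster (name=http://IP:2380, comma-separated).
etcd_initial_cluster() {
  out=""
  for entry in $ETCD_NODES; do
    name=${entry%%=*}
    ip=${entry#*=}
    out="${out:+$out,}${name}=http://${ip}:2380"
  done
  printf '%s\n' "$out"
}

# Print the flags that differ per node, one per line, for member name $1.
etcd_member_args() {
  for entry in $ETCD_NODES; do
    name=${entry%%=*}
    ip=${entry#*=}
    if [ "$name" = "$1" ]; then
      printf '%s\n' "--name=${name}" \
        "--advertise-client-urls=http://${ip}:2379" \
        "--initial-advertise-peer-urls=http://${ip}:2380" \
        "--initial-cluster=$(etcd_initial_cluster)"
      return 0
    fi
  done
  echo "unknown etcd member: $1" >&2
  return 1
}

# Show the flags for the member that runs on lab2.
etcd_member_args etcd1
```

The remaining flags (listen URLs, cluster token, data directory) are identical on every node and can stay as written above.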
# Generate a bootstrap token
# Keep the token; it is needed again later
token=$(kubeadm token generate)
echo $token
# Generate the configuration file
cat >kubeadm-master.config<<EOF
apiVersion: kubeadm.k8s.io/v1alpha1
kind: MasterConfiguration
kubernetesVersion: v1.10.1
#imageRepository: registry.cn-shanghai.aliyuncs.com/gcr-k8s
imageRepository: registry.cn-hangzhou.aliyuncs.com/google_containers
api:
  advertiseAddress: 11.11.11.111
apiServerExtraArgs:
  endpoint-reconciler-type: lease
controllerManagerExtraArgs:
  node-monitor-grace-period: 10s
  pod-eviction-timeout: 10s
networking:
  podSubnet: 192.168.0.0/16
etcd:
  endpoints:
  - "http://11.11.11.111:2379"
  - "http://11.11.11.112:2379"
  - "http://11.11.11.113:2379"
apiServerCertSANs:
- "lab1"
- "lab2"
- "lab3"
- "11.11.11.111"
- "11.11.11.112"
- "11.11.11.113"
- "11.11.11.110"
- "127.0.0.1"
token: $token
tokenTTL: "0"
featureGates:
  CoreDNS: true
EOF
# Initialize the first master
kubeadm init --config kubeadm-master.config
systemctl enable kubelet
# Save the join command printed when initialization completes
# If it is lost, the token can be looked up with "kubeadm token list"
# kubeadm join 11.11.11.111:6443 --token nevmjk.iuh214fc8i0k3iue --discovery-token-ca-cert-hash sha256:0e4f738348be836ff810bce754e059054845f44f01619a37b817eba83282d80f
# Configure kubectl for the current user
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
# Install the network plugin
# Download the manifest
mkdir flannel && cd flannel
wget https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
# Edit the manifest
# The network here must match podSubnet in the kubeadm configuration above
  net-conf.json: |
    {
      "Network": "192.168.0.0/16",
      "Backend": {
        "Type": "vxlan"
      }
    }
# Change the image
image: registry.cn-shanghai.aliyuncs.com/gcr-k8s/flannel:v0.10.0-amd64
# Apply the manifest
kubectl apply -f kube-flannel.yml
# If a node has more than one network interface, see Kubernetes issue 39701:
# https://github.com/kubernetes/kubernetes/issues/39701
# For now, use the --iface argument in kube-flannel.yml to name the interface on the cluster's internal network;
# otherwise DNS resolution may fail and containers may be unable to communicate. Download kube-flannel.yml locally
# and add --iface=<iface-name> to the flanneld arguments
containers:
- name: kube-flannel
  image: registry.cn-shanghai.aliyuncs.com/gcr-k8s/flannel:v0.10.0-amd64
  command:
  - /opt/bin/flanneld
  args:
  - --ip-masq
  - --kube-subnet-mgr
  - --iface=eth1
# Check the result
kubectl get pods --namespace kube-system
kubectl get svc --namespace kube-system
# Allow the masters to run application pods and take part in the workload; other system components
# such as dashboard, heapster, and EFK can now be deployed
kubectl taint nodes --all node-role.kubernetes.io/master-
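The flannel step above depends on the "Network" value in kube-flannel.yml matching podSubnet in kubeadm-master.config. A minimal sketch of a check that can be run before kubectl apply follows; the function names are hypothetical, and the parsing assumes both values are written on one line in the layout shown above (podSubnet unquoted):

```shell
#!/bin/sh
# Hypothetical check: compare podSubnet in the kubeadm config with "Network"
# in kube-flannel.yml. This is plain text matching, not a YAML/JSON parser.

# Print the podSubnet value found in a kubeadm config file ($1).
kubeadm_pod_subnet() {
  sed -n 's/^[[:space:]]*podSubnet:[[:space:]]*\([^[:space:]]*\).*$/\1/p' "$1" | head -n 1
}

# Print the "Network" value found in flannel's net-conf.json section ($1).
flannel_network() {
  sed -n 's/^.*"Network"[[:space:]]*:[[:space:]]*"\([^"]*\)".*$/\1/p' "$1" | head -n 1
}

# Return 0 when both files name the same pod network CIDR.
# $1 = kubeadm config, $2 = flannel manifest
pod_cidrs_match() {
  a=$(kubeadm_pod_subnet "$1")
  b=$(flannel_network "$2")
  [ -n "$a" ] && [ "$a" = "$b" ]
}

# Usage:
#   pod_cidrs_match kubeadm-master.config kube-flannel.yml || echo "pod CIDR mismatch"
```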
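# On the first master, pack the /etc/kubernetes/pki directory created by initialization
cd /etc/kubernetes && tar czvf /root/pki.tgz pki/ && cd ~
# Upload the archive to the other masters and unpack it under /etc/kubernetes
tar xf pki.tgz -C /etc/kubernetes/
# Delete apiserver.crt and apiserver.key from the pki directory
rm -rf /etc/kubernetes/pki/{apiserver.crt,apiserver.key}
# Generate the configuration file
# Use the same configuration as on the first master
# Keep the token identical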
cat >kubeadm-master.config<<EOF
apiVersion: kubeadm.k8s.io/v1alpha1
kind: MasterConfiguration
kubernetesVersion: v1.10.1
#imageRepository: registry.cn-shanghai.aliyuncs.com/gcr-k8s
imageRepository: registry.cn-hangzhou.aliyuncs.com/google_containers
# Remember to change the IP to this master's address
api:
  advertiseAddress: 11.11.11.112
apiServerExtraArgs:
  endpoint-reconciler-type: lease
controllerManagerExtraArgs:
  node-monitor-grace-period: 10s
  pod-eviction-timeout: 10s
networking:
  podSubnet: 192.168.0.0/16
etcd:
  endpoints:
  - "http://11.11.11.111:2379"
  - "http://11.11.11.112:2379"
  - "http://11.11.11.113:2379"
apiServerCertSANs:
- lab1
- lab2
- lab3
- "11.11.11.111"
- "11.11.11.112"
- "11.11.11.113"
- "11.11.11.110"
- "127.0.0.1"
token: nevmjk.iuh214fc8i0k3iue
tokenTTL: "0"
featureGates:
  CoreDNS: true
EOF
# Initialize
kubeadm init --config kubeadm-master.config
systemctl enable kubelet
# Check status
kubectl get pod --all-namespaces -o wide | grep lab1
kubectl get pod --all-namespaces -o wide | grep lab2
kubectl get pod --all-namespaces -o wide | grep lab3
kubectl get nodes -o wide
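The configuration for each additional master is the same file with only advertiseAddress changed and the token kept identical. As a sketch of one way to avoid copy-and-paste mistakes, the hypothetical helper below (render_master_config is not a kubeadm command) prints the configuration used in this lab for a given address and token:

```shell
#!/bin/sh
# Hypothetical helper: print the kubeadm configuration from this lab for one
# master. $1 = this master's advertise address, $2 = the shared token.
render_master_config() {
  cat <<EOF
apiVersion: kubeadm.k8s.io/v1alpha1
kind: MasterConfiguration
kubernetesVersion: v1.10.1
imageRepository: registry.cn-hangzhou.aliyuncs.com/google_containers
api:
  advertiseAddress: $1
apiServerExtraArgs:
  endpoint-reconciler-type: lease
controllerManagerExtraArgs:
  node-monitor-grace-period: 10s
  pod-eviction-timeout: 10s
networking:
  podSubnet: 192.168.0.0/16
etcd:
  endpoints:
  - "http://11.11.11.111:2379"
  - "http://11.11.11.112:2379"
  - "http://11.11.11.113:2379"
apiServerCertSANs:
- "lab1"
- "lab2"
- "lab3"
- "11.11.11.111"
- "11.11.11.112"
- "11.11.11.113"
- "11.11.11.110"
- "127.0.0.1"
token: $2
tokenTTL: "0"
featureGates:
  CoreDNS: true
EOF
}

# Usage on lab3, with the token generated on the first master:
#   render_master_config 11.11.11.113 nevmjk.iuh214fc8i0k3iue > kubeadm-master.config
```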
Start haproxy and keepalived on lab1, lab2, and lab3.
# Pull the haproxy image
docker pull haproxy:1.7.8-alpine
mkdir /etc/haproxy
cat >/etc/haproxy/haproxy.cfg<<EOF
global
    log 127.0.0.1 local0 err
    maxconn 50000
    uid 99
    gid 99
    #daemon
    nbproc 1
    pidfile haproxy.pid
defaults
    mode http
    log 127.0.0.1 local0 err
    maxconn 50000
    retries 3
    timeout connect 5s
    timeout client 30s
    timeout server 30s
    timeout check 2s
listen admin_stats
    mode http
    bind 0.0.0.0:1080
    log 127.0.0.1 local0 err
    stats refresh 30s
    stats uri /haproxy-status
    stats realm Haproxy\ Statistics
    stats auth will:will
    stats hide-version
    stats admin if TRUE
frontend k8s-https
    bind 0.0.0.0:8443
    mode tcp
    #maxconn 50000
    default_backend k8s-https
backend k8s-https
    mode tcp
    balance roundrobin
    server lab1 11.11.11.111:6443 weight 1 maxconn 1000 check inter 2000 rise 2 fall 3
    server lab2 11.11.11.112:6443 weight 1 maxconn 1000 check inter 2000 rise 2 fall 3
    server lab3 11.11.11.113:6443 weight 1 maxconn 1000 check inter 2000 rise 2 fall 3
EOF
# Start haproxy
docker run -d --name my-haproxy \
  -v /etc/haproxy:/usr/local/etc/haproxy:ro \
  -p 8443:8443 \
  -p 1080:1080 \
  --restart always \
  haproxy:1.7.8-alpine
# View the logs
docker logs my-haproxy
# Check the status page in a browser
http://11.11.11.111:1080/haproxy-status
http://11.11.11.112:1080/haproxy-status
# Pull the keepalived image
docker pull osixia/keepalived:1.4.4
# Load the required kernel module before starting
lsmod | grep ip_vs
modprobe ip_vs
# Start keepalived
# eth1 is the interface on the 11.11.11.0/24 network in this lab
docker run --net=host --cap-add=NET_ADMIN \
  -e KEEPALIVED_INTERFACE=eth1 \
  -e KEEPALIVED_VIRTUAL_IPS="#PYTHON2BASH:['11.11.11.110']" \
  -e KEEPALIVED_UNICAST_PEERS="#PYTHON2BASH:['11.11.11.111','11.11.11.112','11.11.11.113']" \
  -e KEEPALIVED_PASSWORD=hello \
  --name k8s-keepalived \
  --restart always \
  -d osixia/keepalived:1.4.4
# View the logs
# Two instances should become BACKUP and one should become MASTER
docker logs k8s-keepalived
# 11.11.11.110 is now configured on one of the machines
# Ping test
ping -c4 11.11.11.110
# If it fails, clean up and run the experiment again
docker rm -f k8s-keepalived
ip a del 11.11.11.110/32 dev eth1
# Change the IP and port in ~/.kube/config to the VIP and the haproxy port, then test with kubectl
rm -rf ~/.kube/cache ~/.kube/http-cache
kubectl get pods -n kube-system -o wide
# On lab1, lab2, and lab3
sed -i 's@server: https://11.11.11.*:6443@server: https://11.11.11.110:8443@g' /etc/kubernetes/{admin.conf,kubelet.conf,scheduler.conf,controller-manager.conf}
# Restart kubelet and docker
systemctl daemon-reload
systemctl restart kubelet docker
# Check the status of all nodes
kubectl get nodes -o wide
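The sed expression above leaves the dots unescaped and uses a greedy ".*", which is fine for these files but matches more loosely than intended. As a sketch of a stricter variant that can be tried on a copy first, the hypothetical helper below (point_at_vip is not a kubeadm or kubectl command) escapes the dots, matches only digits in the last octet, and keeps a .bak copy of the original file:

```shell
#!/bin/sh
# Hypothetical helper: rewrite the API server address in a kubeconfig-style
# file ($1) to the VIP, leaving the original next to it as "$1.bak".
VIP_ENDPOINT="https://11.11.11.110:8443"

point_at_vip() {
  sed -i.bak \
    "s@server: https://11\.11\.11\.[0-9]*:6443@server: ${VIP_ENDPOINT}@g" "$1"
}

# Usage on each master (run as root):
#   for f in admin.conf kubelet.conf scheduler.conf controller-manager.conf; do
#     point_at_vip "/etc/kubernetes/$f"
#   done
```

Running it a second time changes nothing, because the rewritten line no longer ends in :6443.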
# Point kube-proxy at the VIP
# In the editor, change the server line to: server: https://11.11.11.110:8443
kubectl edit -n kube-system configmap/kube-proxy
# Check the setting
kubectl get -n kube-system configmap/kube-proxy -o yaml
# Delete the kube-proxy pods so that they are recreated
kubectl get pods --all-namespaces -o wide | grep proxy
all_proxy_pods=$(kubectl get pods --all-namespaces -o wide | grep proxy | awk '{print $2}' | xargs)
echo $all_proxy_pods
kubectl delete pods $all_proxy_pods -n kube-system
kubectl get pods --all-namespaces -o wide | grep proxy
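# Join the worker nodes to the cluster
# This is the command printed when the first master finished initializing, with the address changed to the VIP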
kubeadm join 11.11.11.110:8443 --token nevmjk.iuh214fc8i0k3iue --discovery-token-ca-cert-hash sha256:0e4f738348be836ff810bce754e059054845f44f01619a37b817eba83282d80f
systemctl enable kubelet
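The join command needs the --discovery-token-ca-cert-hash value as well as the token. If only the token was kept, the hash can be recomputed on a master from the cluster CA certificate with the openssl pipeline given in the kubeadm documentation. The sketch below wraps that pipeline in a function; the name ca_cert_hash is hypothetical, and it assumes openssl is installed and the CA uses an RSA key:

```shell
#!/bin/sh
# Print the SHA-256 hash of the public key in a CA certificate ($1), in the
# form that follows "sha256:" in --discovery-token-ca-cert-hash.
# Assumption: the CA key is RSA; ca_cert_hash is a hypothetical helper name.
ca_cert_hash() {
  openssl x509 -pubkey -noout -in "$1" \
    | openssl rsa -pubin -outform der 2>/dev/null \
    | openssl dgst -sha256 -hex \
    | sed 's/^.* //'
}

# Usage on a master:
#   echo "sha256:$(ca_cert_hash /etc/kubernetes/pki/ca.crt)"
```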
# Point kubelet at the VIP
sed -i 's@server: https://11.11.11.*:6443@server: https://11.11.11.110:8443@g' /etc/kubernetes/kubelet.conf
# Restart kubelet and docker
systemctl daemon-reload
systemctl restart kubelet docker
# Check the status of all nodes
kubectl get nodes -o wide
Keep application pods off the master nodes
Set the masters to not accept workloads
# Check status
kubectl get nodes
# Apply the taint
# kubectl patch node lab1 -p '{"spec":{"unschedulable":true}}'
kubectl taint nodes lab1 lab2 lab3 node-role.kubernetes.io/master=true:NoSchedule
# Check status
kubectl get nodes
# Delete the coredns pods
kubectl get pods -n kube-system -o wide | grep coredns
all_coredns_pods=$(kubectl get pods -n kube-system -o wide | grep coredns | awk '{print $1}' | xargs)
echo $all_coredns_pods
kubectl delete pods $all_coredns_pods -n kube-system
# Change the replica count
# replicas: 3
# It can be set to the number of worker nodes
kubectl edit deploy coredns -n kube-system
# Check status
kubectl get pods -n kube-system -o wide | grep coredns
1. Start
# Test directly with commands
kubectl run nginx --replicas=2 --image=nginx:alpine --port=80
kubectl expose deployment nginx --type=NodePort --name=example-service-nodeport
kubectl expose deployment nginx --name=example-service
# Or test with a manifest file
cat >example-nginx.yml<<EOF
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: nginx
spec:
  replicas: 2
  template:
    metadata:
      labels:
        app: nginx
    spec:
      restartPolicy: Always
      containers:
      - name: nginx
        image: nginx:alpine
        ports:
        - containerPort: 80
        livenessProbe:
          httpGet:
            path: /
            port: 80
          initialDelaySeconds: 10
          periodSeconds: 3
        readinessProbe:
          httpGet:
            path: /
            port: 80
          initialDelaySeconds: 10
          periodSeconds: 3
---
kind: Service
apiVersion: v1
metadata:
  name: example-service
spec:
  selector:
    app: nginx
  ports:
  - name: http
    port: 80
    targetPort: 80
---
kind: Service
apiVersion: v1
metadata:
  name: example-service-nodeport
spec:
  selector:
    app: nginx
  type: NodePort
  ports:
  - name: http-nodeport
    port: 80
    nodePort: 32223
EOF
kubectl apply -f example-nginx.yml
2. Check status
kubectl get deploy
kubectl get pods
kubectl get svc
kubectl describe svc example-service
3. DNS resolution
kubectl run curl --image=radial/busyboxplus:curl -i --tty
nslookup kubernetes
nslookup example-service
curl example-service
# If the session times out and returns an error, re-enter the pod as follows and test again
curlPod=$(kubectl get pod | grep curl | awk '{print $1}')
kubectl exec -ti $curlPod -- sh
4. Access test
# 10.96.59.56 is the ClusterIP shown by kubectl get svc
curl "10.96.59.56:80"
# 32223 is the NodePort shown by kubectl get svc
http://11.11.11.114:32223/
http://11.11.11.115:32223/
5. Clean up
kubectl delete svc example-service example-service-nodeport
kubectl delete deploy nginx curl
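Shut down a master node and check whether the cluster still passes the basic tests from the previous step, then inspect the information below. Do not shut down lab1 and lab2 at the same time, because haproxy and keepalived run on them.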
kubectl get pod --all-namespaces -o wide
kubectl get pod --all-namespaces -o wide | grep lab1
kubectl get pod --all-namespaces -o wide | grep lab2
kubectl get pod --all-namespaces -o wide | grep lab3
kubectl get nodes -o wide
kubectl get deploy
kubectl get pods
kubectl get svc
kubectl describe svc example-service
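When a worker node is shut down, the pods on it are detected as failed and moved to other nodes only after about 5 minutes. To move them sooner, run kubectl delete node <node-name>. Alternatively, change these controller-manager parameters:
pod-eviction-timeout, default 5m
node-monitor-grace-period, default 40s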
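As a rough worked example, the delay before pods are evicted from a silent node can be estimated as node-monitor-grace-period plus pod-eviction-timeout. This is an approximation that ignores the controller's own polling intervals, and eviction_delay is a hypothetical helper name:

```shell
#!/bin/sh
# Rough estimate of the seconds between a node going silent and its pods being
# evicted: node-monitor-grace-period ($1) + pod-eviction-timeout ($2), in seconds.
eviction_delay() {
  echo $(( $1 + $2 ))
}

echo "defaults (40s + 5m): $(eviction_delay 40 300)s"
echo "values set in kubeadm-master.config (10s + 10s): $(eviction_delay 10 10)s"
```

With the values set in kubeadm-master.config earlier, pods should therefore start moving far sooner than with the defaults.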