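Our k8s cluster currently uses Prometheus for monitoring. Some of our developers' services expose websocket interfaces, so to monitor and alert on those services effectively we need to probe the liveness of the websocket API endpoints. The approach, the implementation steps, and the tests are described below.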
We define a simple websocket service to test monitoring and alerting against, as follows:
# Create a virtual environment (you can also deploy directly on the host)
mkvirtualenv -p /usr/bin/python3 websocket-server
# Install the required package
pip3 install websockets
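# cat websocket-server.py
import asyncio
import websockets

# Note: the (websocket, path) handler signature is the one used by older
# websockets releases; recent releases may pass only the connection object.
async def echo(websocket, path):
    # Echo every incoming message back with a fixed prefix
    async for message in websocket:
        message = "I got your message: {}".format(message)
        await websocket.send(message)

# The bind address must be reachable from the k8s cluster
asyncio.get_event_loop().run_until_complete(websockets.serve(echo, '192.168.128.6', 8765))
asyncio.get_event_loop().run_forever()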
# Start the websocket service
python websocket-server.py &
# Check that the service is listening
netstat -lnp | grep 8765
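To make "liveness" concrete, here is a minimal sketch of a probe that only uses the Python standard library. It performs the websocket opening handshake from RFC 6455 and reports 1 or 0, mirroring the gauge values shown later in this post. The function names `expected_accept` and `probe` are our own, and this is not necessarily how the exporter image used below does its check. It also assumes the whole handshake response arrives in a single `recv`, which is usually true but not guaranteed.

```python
import base64
import hashlib
import os
import socket
from urllib.parse import urlparse

# Fixed GUID defined by RFC 6455 for computing Sec-WebSocket-Accept.
_WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def expected_accept(key: str) -> str:
    """Return the Sec-WebSocket-Accept value a server must send for `key`."""
    digest = hashlib.sha1((key + _WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


def probe(url: str, timeout: float = 3.0) -> int:
    """Return 1 if a ws:// endpoint completes the opening handshake, else 0."""
    parts = urlparse(url)
    if parts.scheme != "ws" or not parts.hostname:
        return 0  # this sketch handles plain ws:// only (no TLS)
    host, port = parts.hostname, parts.port or 80
    key = base64.b64encode(os.urandom(16)).decode("ascii")
    request = (
        f"GET {parts.path or '/'} HTTP/1.1\r\n"
        f"Host: {host}:{port}\r\n"
        "Upgrade: websocket\r\n"
        "Connection: Upgrade\r\n"
        f"Sec-WebSocket-Key: {key}\r\n"
        "Sec-WebSocket-Version: 13\r\n\r\n"
    )
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(request.encode("ascii"))
            response = sock.recv(4096).decode("latin-1")
    except OSError:
        return 0  # refused, unreachable, or timed out
    # A healthy server answers 101 and echoes the hash of our key.
    if response.startswith("HTTP/1.1 101") and expected_accept(key) in response:
        return 1
    return 0


if __name__ == "__main__":
    print(probe("ws://192.168.128.6:8765"))
```

Running the script while the echo server above is up should print 1; after the server is stopped it should print 0.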
Next we define a Deployment that exposes the metrics of the monitored websocket APIs (there can be several) to Prometheus, as follows:
# cat websocket-kube-mon-prometheus.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app.kubernetes.io/name: wss
    app.kubernetes.io/version: v1.8.0
  name: websocket-exporter
  namespace: kube-mon
spec:
  replicas: 1
  selector:
    matchLabels:
      app.kubernetes.io/name: wss
  template:
    metadata:
      labels:
        app.kubernetes.io/name: wss
        app.kubernetes.io/version: v1.8.0
    spec:
      containers:
      - image: registry.cn-shanghai.aliyuncs.com/ai-voice-test/wss-expoter:v0.0.1
        env:
        - name: ENDPOINT
          # Separate multiple ws endpoints with commas
          value: ws://www.abc.com,ws://192.168.128.6:8765
        name: websocket-exporter
        ports:
        - containerPort: 9189
          name: wss-metrics
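The `ENDPOINT` variable packs several URLs into one comma-separated string. We have not looked at how the exporter image parses it, so the following is only a sketch of how such a value could be split and sanity-checked before deploying. `parse_endpoints` is a hypothetical helper of ours, not part of the exporter.

```python
import os
from urllib.parse import urlparse


def parse_endpoints(raw: str) -> list:
    """Split a comma-separated ENDPOINT value into ws:// or wss:// URLs."""
    endpoints = []
    for item in raw.split(","):
        url = item.strip()
        if not url:
            continue  # tolerate trailing commas and stray whitespace
        parts = urlparse(url)
        if parts.scheme not in ("ws", "wss") or not parts.hostname:
            raise ValueError(f"not a websocket URL: {url!r}")
        endpoints.append(url)
    return endpoints


if __name__ == "__main__":
    # Reads the same variable name that the Deployment sets.
    print(parse_endpoints(os.environ.get("ENDPOINT", "")))
```

Checking the value this way before `kubectl apply` catches typos such as a missing scheme or a stray space after a comma.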
Define a websocket Service for Prometheus to scrape, as follows:
# cat service-websocket.yaml
apiVersion: v1
kind: Service
metadata:
  name: websocket
  namespace: kube-mon
spec:
  # NodePort is used for now
  type: NodePort
  ports:
  - port: 9189
    targetPort: 9189
    protocol: TCP
    nodePort: 32071
  selector:
    app.kubernetes.io/name: wss
# Apply the Deployment and Service defined above
kubectl apply -f websocket-kube-mon-prometheus.yaml
kubectl apply -f service-websocket.yaml
# Check the pod and the service
kubectl get pod -n kube-mon
kubectl get svc -n kube-mon
NAME        TYPE       CLUSTER-IP       EXTERNAL-IP   PORT(S)          AGE
websocket   NodePort   192.168.237.56   <none>        9189:32071/TCP   1h
# vim sidecar/cm-kube-mon-sidecar.yaml
# Add the following scrape configuration
- job_name: 'websocket'
  static_configs:
  - targets: ['192.168.237.56:9189']
# Apply the updated ConfigMap
kubectl apply -f sidecar/cm-kube-mon-sidecar.yaml
# Reload Prometheus
curl -X POST http://prometheus-pod-ip:9090/-/reload
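If the reload needs to be scripted instead of run by hand, the same POST can be sent from Python with only the standard library. This is a sketch: `build_reload_request` and `reload_prometheus` are names we made up, and `prometheus-pod-ip` is the same placeholder as in the curl command. Note that Prometheus only serves `/-/reload` when it is started with `--web.enable-lifecycle`.

```python
import urllib.request


def build_reload_request(base_url: str) -> urllib.request.Request:
    """Build the POST request for the Prometheus reload endpoint."""
    url = base_url.rstrip("/") + "/-/reload"
    return urllib.request.Request(url, data=b"", method="POST")


def reload_prometheus(base_url: str, timeout: float = 5.0) -> bool:
    """Return True if Prometheus answered the reload request with HTTP 200."""
    request = build_reload_request(base_url)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # Covers connection errors, timeouts, and non-2xx HTTP responses.
        return False


if __name__ == "__main__":
    # Placeholder host: replace with the real Prometheus pod IP.
    print(reload_prometheus("http://prometheus-pod-ip:9090"))
```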
# vim sidecar/rules-cm-kube-mon-sidecar.yaml
# Add the following alerting rule
- alert: websocket endpoint probe failed
  expr: websocket{job="websocket"} < 1
  for: 30s
  labels:
    severity: urgent
  annotations:
    #summary: "Endpoint {{ $labels.url }} probe failed"
    description: "websocket address: {{ $labels.url }} probe failed, status: down."
# Apply the rules; Prometheus hot-reloads them, so just wait about a minute
kubectl apply -f sidecar/rules-cm-kube-mon-sidecar.yaml
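The rule combines a threshold (`websocket < 1`) with a hold time (`for: 30s`): a single failed probe only makes the alert pending, and it fires once the condition has held for 30 seconds. The sketch below models that behaviour on a list of samples. It is a simplified illustration of the idea, not Prometheus's actual rule evaluator, and `firing_times` is a name we chose.

```python
def firing_times(samples, threshold=1.0, hold_seconds=30.0):
    """Return the timestamps at which the alert would be firing.

    `samples` is a list of (timestamp_seconds, value) pairs sorted by time.
    """
    pending_since = None  # when the condition first became true
    firing = []
    for timestamp, value in samples:
        if value < threshold:
            if pending_since is None:
                pending_since = timestamp
            if timestamp - pending_since >= hold_seconds:
                firing.append(timestamp)
        else:
            pending_since = None  # condition cleared, start over
    return firing


if __name__ == "__main__":
    # One sample every 15 seconds: up, down three times, up, down once.
    samples = [(0, 1), (15, 0), (30, 0), (45, 0), (60, 1), (75, 0)]
    print(firing_times(samples))
```

With these samples the endpoint goes down at t=15, so the alert is pending at 15 and 30 and starts firing at t=45. The brief outage at t=75 never fires because it has not yet lasted 30 seconds.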
# Find the PID of the websocket service
netstat -lnp | grep 8765
# Kill the process (replace <pid> with the PID shown above)
kill <pid>
We can query the metrics endpoint from a terminal to see the monitoring status directly, as follows:
curl 192.168.237.56:9189/metrics
# HELP websocket websocket_help
# TYPE websocket gauge
websocket{url="ws://192.168.128.6:8765"} 0
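The output is in the Prometheus text exposition format: comment lines start with `#`, and each sample line carries the `url` label and a value of 0 or 1. As a quick manual check, the following sketch lists the endpoints currently reported as down. The regular expression only handles the single-label shape shown above, and `down_endpoints` is our own helper name.

```python
import re

# Matches sample lines of the form: websocket{url="..."} <value>
_SAMPLE = re.compile(r'^websocket\{url="(?P<url>[^"]*)"\}\s+(?P<value>\S+)$')


def down_endpoints(metrics_text: str) -> list:
    """Return the URLs whose websocket gauge is below 1."""
    down = []
    for line in metrics_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = _SAMPLE.match(line)
        if match and float(match.group("value")) < 1:
            down.append(match.group("url"))
    return down


if __name__ == "__main__":
    text = (
        "# HELP websocket websocket_help\n"
        "# TYPE websocket gauge\n"
        'websocket{url="ws://192.168.128.6:8765"} 0\n'
    )
    print(down_endpoints(text))
```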
After a short wait, the alert notification is sent.
# Restart the websocket service
python websocket-server.py &
curl 192.168.237.56:9189/metrics
# HELP websocket websocket_help
# TYPE websocket gauge
websocket{url="ws://192.168.128.6:8765"} 1
After a short wait, the recovery notification is sent.