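This article is from Rancher Labs.
In previous articles, we have spent a good deal of space on monitoring. That is because when you manage Kubernetes clusters, things change very quickly, so a tool that monitors cluster health and resource metrics is essential.
Rancher 2.5 introduced a new version of monitoring based on Prometheus Operator, which provides Kubernetes-native deployment and management of Prometheus and related monitoring components. Prometheus Operator lets you monitor the state and processes of cluster nodes, Kubernetes components, and application workloads. It also lets you define alerts on the metrics Prometheus collects and create custom dashboards, so the collected metrics are easy to visualize in Grafana. See the following link for more details on the new monitoring components:
https://rancher.com/docs/ranc...
The new monitoring also ships prometheus-adapter, which developers can use to scale their workloads based on custom metrics and HPA.
In this article, we explore how to use Prometheus Operator to scrape custom metrics and how to use those metrics for advanced workload management.
Installing Prometheus in Rancher 2.5 is straightforward: go to Cluster Explorer -> Apps and install rancher-monitoring.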
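You should be aware of the following default settings:
prometheus-adapter is enabled as part of the chart installation.
ServiceMonitorNamespaceSelector is left empty, which allows Prometheus to collect ServiceMonitors in all namespaces.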
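Once the installation completes, we can access the monitoring components from Cluster Explorer.
Now let's deploy a sample workload that exposes custom metrics from the application layer. The workload is a simple application instrumented with the Prometheus client_golang library, and it serves some custom metrics at the /metrics endpoint.
It exposes two metrics, including http_requests_total, which is used later in this article.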
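To make the shape of such an endpoint concrete, here is a minimal sketch of the Prometheus text exposition format for a counter. It is not how the sample app is implemented (the app uses client_golang); the function name, the help text, and the sample values are assumptions made for illustration. Only the metric name http_requests_total and the code label come from this article.

```python
def render_counter(name, help_text, samples):
    """Render one counter family in the Prometheus text exposition format.

    samples maps a tuple of (label_name, label_value) pairs to a count.
    """
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} counter"]
    for labels, value in sorted(samples.items()):
        label_str = ",".join(f'{key}="{val}"' for key, val in labels)
        lines.append(f"{name}{{{label_str}}} {value}")
    return "\n".join(lines) + "\n"


# Hypothetical sample values, not taken from the real application.
example = render_counter(
    "http_requests_total",
    "Count of all HTTP requests",
    {(("code", "200"),): 3, (("code", "404"),): 1},
)
print(example, end="")
```

Each sample line carries the metric name, its labels, and the current counter value, which is what Prometheus reads on every scrape.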
The following manifest deploys the workload, its Service, and an Ingress to reach it:
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app.kubernetes.io/name: prometheus-example-app
  name: prometheus-example-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app.kubernetes.io/name: prometheus-example-app
  template:
    metadata:
      labels:
        app.kubernetes.io/name: prometheus-example-app
    spec:
      containers:
      - name: prometheus-example-app
        image: gmehta3/demo-app:metrics
        ports:
        - name: web
          containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: prometheus-example-app
  labels:
    app.kubernetes.io/name: prometheus-example-app
spec:
  selector:
    app.kubernetes.io/name: prometheus-example-app
  ports:
  - protocol: TCP
    port: 8080
    targetPort: 8080
    name: web
---
# networking.k8s.io/v1beta1 was removed in Kubernetes 1.22; newer clusters need networking.k8s.io/v1
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: prometheus-example-app
spec:
  rules:
  - host: hpa.demo
    http:
      paths:
      - path: /
        backend:
          serviceName: prometheus-example-app
          servicePort: 8080
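ServiceMonitor is a custom resource definition (CRD) that lets us declaratively define how a dynamic set of services should be monitored.
See the following link for the full ServiceMonitor specification:
https://github.com/prometheus...
Now let's deploy the ServiceMonitor, which Prometheus uses to discover and scrape the pods behind the prometheus-example-app Kubernetes Service.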
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: prometheus-example-app
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: prometheus-example-app
  endpoints:
  - port: web
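The ServiceMonitor picks its targets through selector.matchLabels. As a rough sketch of those semantics (not the operator's actual code), a Service is selected when every label in the selector is present on it with the same value. The function name and the second label set below are hypothetical.

```python
def selector_matches(match_labels, object_labels):
    """Return True when every selector label is present with the same value."""
    return all(
        object_labels.get(key) == value for key, value in match_labels.items()
    )


# Selector from the ServiceMonitor and labels from the Service in this article.
selector = {"app.kubernetes.io/name": "prometheus-example-app"}
service_labels = {"app.kubernetes.io/name": "prometheus-example-app"}

# Hypothetical labels of an unrelated Service.
other_labels = {"app.kubernetes.io/name": "some-other-app", "tier": "web"}

print(selector_matches(selector, service_labels))
print(selector_matches(selector, other_labels))
```

Extra labels on the Service do not prevent a match; only the labels named in the selector are compared.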
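As you can see, the ServiceMonitor can now be browsed in Rancher monitoring.
Shortly afterward, the new service monitor and the pods associated with the Service should show up in Prometheus service discovery.
We can also see the metrics in Prometheus.
In Rancher 2.5, monitoring lets users store Grafana dashboards as ConfigMaps in the cattle-dashboards namespace.
Users or cluster administrators can now add more dashboards to this namespace to extend Grafana's custom dashboards.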
Dashboard ConfigMap Example
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-example-app-dashboard
  namespace: cattle-dashboards
  labels:
    grafana_dashboard: "1"
data:
  prometheus-example-app.json: |
    {
      "annotations": {
        "list": [
          {
            "builtIn": 1,
            "datasource": "-- Grafana --",
            "enable": true,
            "hide": true,
            "iconColor": "rgba(0, 211, 255, 1)",
            "name": "Annotations & Alerts",
            "type": "dashboard"
          }
        ]
      },
      "editable": true,
      "gnetId": null,
      "graphTooltip": 0,
      "links": [],
      "panels": [
        {
          "aliasColors": {},
          "bars": false,
          "dashLength": 10,
          "dashes": false,
          "datasource": null,
          "fieldConfig": { "defaults": { "custom": {} }, "overrides": [] },
          "fill": 1,
          "fillGradient": 0,
          "gridPos": { "h": 9, "w": 12, "x": 0, "y": 0 },
          "hiddenSeries": false,
          "id": 2,
          "legend": { "avg": false, "current": false, "max": false, "min": false, "show": true, "total": false, "values": false },
          "lines": true,
          "linewidth": 1,
          "nullPointMode": "null",
          "percentage": false,
          "pluginVersion": "7.1.5",
          "pointradius": 2,
          "points": false,
          "renderer": "flot",
          "seriesOverrides": [],
          "spaceLength": 10,
          "stack": false,
          "steppedLine": false,
          "targets": [
            {
              "expr": "rate(http_requests_total{code=\"200\",service=\"prometheus-example-app\"}[5m])",
              "instant": false,
              "interval": "",
              "legendFormat": "",
              "refId": "A"
            }
          ],
          "thresholds": [],
          "timeFrom": null,
          "timeRegions": [],
          "timeShift": null,
          "title": "http_requests_total_200",
          "tooltip": { "shared": true, "sort": 0, "value_type": "individual" },
          "type": "graph",
          "xaxis": { "buckets": null, "mode": "time", "name": null, "show": true, "values": [] },
          "yaxes": [
            { "format": "short", "label": null, "logBase": 1, "max": null, "min": null, "show": true },
            { "format": "short", "label": null, "logBase": 1, "max": null, "min": null, "show": true }
          ],
          "yaxis": { "align": false, "alignLevel": null }
        },
        {
          "aliasColors": {},
          "bars": false,
          "dashLength": 10,
          "dashes": false,
          "datasource": null,
          "description": "",
          "fieldConfig": { "defaults": { "custom": {} }, "overrides": [] },
          "fill": 1,
          "fillGradient": 0,
          "gridPos": { "h": 8, "w": 12, "x": 0, "y": 9 },
          "hiddenSeries": false,
          "id": 4,
          "legend": { "avg": false, "current": false, "max": false, "min": false, "show": true, "total": false, "values": false },
          "lines": true,
          "linewidth": 1,
          "nullPointMode": "null",
          "percentage": false,
          "pluginVersion": "7.1.5",
          "pointradius": 2,
          "points": false,
          "renderer": "flot",
          "seriesOverrides": [],
          "spaceLength": 10,
          "stack": false,
          "steppedLine": false,
          "targets": [
            {
              "expr": "rate(http_requests_total{code!=\"200\",service=\"prometheus-example-app\"}[5m])",
              "interval": "",
              "legendFormat": "",
              "refId": "A"
            }
          ],
          "thresholds": [],
          "timeFrom": null,
          "timeRegions": [],
          "timeShift": null,
          "title": "http_requests_total_not_200",
          "tooltip": { "shared": true, "sort": 0, "value_type": "individual" },
          "type": "graph",
          "xaxis": { "buckets": null, "mode": "time", "name": null, "show": true, "values": [] },
          "yaxes": [
            { "format": "short", "label": null, "logBase": 1, "max": null, "min": null, "show": true },
            { "format": "short", "label": null, "logBase": 1, "max": null, "min": null, "show": true }
          ],
          "yaxis": { "align": false, "alignLevel": null }
        }
      ],
      "schemaVersion": 26,
      "style": "dark",
      "tags": [],
      "templating": { "list": [] },
      "time": { "from": "now-15m", "to": "now" },
      "timepicker": { "refresh_intervals": [ "5s", "10s", "30s", "1m", "5m", "15m", "30m", "1h", "2h", "1d" ] },
      "timezone": "",
      "title": "prometheus example app",
      "version": 1
    }
Users should now be able to access the prometheus example app dashboard in Grafana.
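Both dashboard panels plot rate(http_requests_total{...}[5m]), the per-second increase of a counter over a five-minute window. The sketch below shows the idea in simplified form: it sums the increases between consecutive samples, treats a drop as a counter reset, and divides by the elapsed time. Prometheus's real rate() also extrapolates to the window boundaries, which is left out here. The function name and the sample data are hypothetical.

```python
def simple_rate(samples):
    """Approximate per-second rate of a counter.

    samples is a list of (timestamp_seconds, counter_value), oldest first.
    Returns None when there are too few samples to compute a rate.
    """
    if len(samples) < 2:
        return None
    increase = 0.0
    for (_, prev), (_, cur) in zip(samples, samples[1:]):
        # A drop means the counter was reset, so count the new value from zero.
        increase += (cur - prev) if cur >= prev else cur
    elapsed = samples[-1][0] - samples[0][0]
    return increase / elapsed if elapsed > 0 else None


# Hypothetical scrapes taken 60 seconds apart.
print(simple_rate([(0, 0), (60, 30), (120, 60)]))
```

Because the panels graph a rate instead of the raw counter, a restart of the application (which resets the counter to zero) does not show up as a negative spike.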
This section assumes that prometheus-adapter is already installed as part of monitoring. In fact, the monitoring installer installs prometheus-adapter by default.
Users can now create an HPA spec like the following:
# autoscaling/v2beta2 was removed in Kubernetes 1.26; newer clusters need autoscaling/v2
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: prometheus-example-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: prometheus-example-app
  minReplicas: 1
  maxReplicas: 5
  metrics:
  - type: Object
    object:
      describedObject:
        kind: Service
        name: prometheus-example-app
      metric:
        name: http_requests
      target:
        averageValue: "5"
        type: AverageValue
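See the following link for more information on HPA:
https://kubernetes.io/docs/ta...
We will use the custom http_requests_total metric to drive pod autoscaling. The HPA spec refers to it as http_requests because, with prometheus-adapter's stock rules, counters ending in _total are typically exposed through the custom metrics API as a per-second rate under the name without that suffix.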
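The HPA spec uses an Object metric with an AverageValue target. Per the Kubernetes HPA documentation, the metric value is then divided by the number of pods before being compared with the target. The sketch below is a simplified version of that calculation: it assumes the default 10% tolerance and ignores stabilization windows, pod readiness, and missing metrics. The function name and the input values are illustrative.

```python
import math


def desired_replicas(metric_value, current_replicas, target_average,
                     min_replicas=1, max_replicas=5, tolerance=0.1):
    """Simplified HPA calculation for an Object metric with AverageValue."""
    per_pod = metric_value / current_replicas
    ratio = per_pod / target_average
    # Within the tolerance band the HPA leaves the replica count alone.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    desired = math.ceil(current_replicas * ratio)
    # Clamp to the minReplicas/maxReplicas bounds from the HPA spec.
    return max(min_replicas, min(max_replicas, desired))


# Hypothetical load: 12 requests/second on 1 pod, target average of 5 per pod.
print(desired_replicas(12, 1, 5))
```

With the spec shown earlier (target 5, between 1 and 5 replicas), a sustained request rate well above 5 per second per pod should scale the Deployment up until the per-pod average drops back toward the target or maxReplicas is reached.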
Now we can generate a sample load to see the HPA in action. I can use hey to do this.
hey -c 10 -n 5000 http://hpa.demo
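The command above sends 5000 requests in total from 10 concurrent workers. If hey is not available, roughly the same load can be sketched with the Python standard library. This is not a replacement for hey (it reports no latency statistics), and the function names, the timeout, and the injectable fetch parameter are assumptions of this sketch.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from urllib.error import HTTPError
from urllib.request import urlopen


def fetch_status(url):
    """Return the HTTP status code for one GET, or None on a network error."""
    try:
        with urlopen(url, timeout=5) as resp:
            return resp.status
    except HTTPError as err:
        return err.code
    except OSError:
        return None


def run_load(url, total=5000, concurrency=10, fetch=fetch_status):
    """Send `total` requests using `concurrency` worker threads.

    Returns a Counter mapping each status code to how often it was seen.
    """
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        return Counter(pool.map(fetch, [url] * total))


if __name__ == "__main__":
    # Same target and request counts as the hey command in this article.
    print(run_load("http://hpa.demo"))
```

The fetch parameter exists only so the function can be exercised without a network; in normal use the default is enough.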
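In this article, we explored the flexibility of the new monitoring in Rancher 2.5. Developers and cluster administrators can use this stack to monitor their workloads, deploy visualizations, and take advantage of the advanced workload management features available in Kubernetes.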