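Summary: The Mixerless Telemetry technology of Service Mesh (ASM) provides non-intrusive telemetry data for application containers. This telemetry data is collected as monitoring metrics by ARMS or Prometheus for service mesh observability, and it is also consumed by HPA and Flagger, which makes it the foundation for application-level scaling and progressive canary releases. This series focuses on the practice of using telemetry data for application-level scaling and progressive canary releases, and is split into three articles: telemetry data (monitoring metrics), application-level scaling, and progressive canary release.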
The overall architecture of this series is shown in the figure below:
The Flagger website describes the progressive release process. The original text is as follows:
With the above configuration, Flagger will run a canary release with the following steps:
- detect new revision (deployment spec, secrets or configmaps changes)
- scale from zero the canary deployment
- wait for the HPA to set the canary minimum replicas
- check canary pods health
- run the acceptance tests
- abort the canary release if tests fail
- start the load tests
- mirror 100% of the traffic from primary to canary
- check request success rate and request duration every minute
- abort the canary release if the metrics check failure threshold is reached
- stop traffic mirroring after the number of iterations is reached
- route live traffic to the canary pods
- promote the canary (update the primary secrets, configmaps and deployment spec)
- wait for the primary deployment rollout to finish
- wait for the HPA to set the primary minimum replicas
- check primary pods health
- switch live traffic back to primary
- scale to zero the canary
- send notification with the canary analysis result
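This article describes how to configure and collect application-level monitoring metrics on ASM, such as the total request count istio_requests_total and the request latency istio_request_duration (exported by Envoy as istio_request_duration_milliseconds). The main steps are creating the EnvoyFilter, verifying the Envoy telemetry data, and verifying that Prometheus collects the telemetry data.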
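Log on to the ASM console, choose Service Mesh > Mesh Management in the left navigation pane, and open the feature configuration page of the ASM instance.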
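Use prometheus:9090 as the Prometheus address (this series uses the community edition of Prometheus, and the rest of the series relies on this setting). If you use the Alibaba Cloud product ARMS, see "Integrate ARMS Prometheus to monitor the mesh". After you click OK, the control plane shows the list of EnvoyFilters generated by ASM: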
Run the following command to install Prometheus (for the complete script, see demo_mixerless.sh).
kubectl --kubeconfig "$USER_CONFIG" apply -f $ISTIO_SRC/samples/addons/prometheus.yaml
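After installing Prometheus, we need to add the Istio-related monitoring metrics to its configuration. Log on to the ACK console, choose Configurations > ConfigMaps in the left navigation pane, find the prometheus row under istio-system, and click Edit.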
In the prometheus.yaml configuration, append the contents of scrape_configs.yaml to the scrape_configs section.
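After saving the configuration, choose Workloads > Pods in the left navigation pane, find the prometheus row under istio-system, and delete the Prometheus pod so that the configuration takes effect in the new pod.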
You can run the following command to list the job_name entries in the Prometheus configuration:
kubectl --kubeconfig "$USER_CONFIG" get cm prometheus -n istio-system -o jsonpath='{.data.prometheus\.yml}' | grep job_name
- job_name: 'istio-mesh'
- job_name: 'envoy-stats'
- job_name: 'istio-policy'
- job_name: 'istio-telemetry'
- job_name: 'pilot'
- job_name: 'sidecar-injector'
- job_name: prometheus
job_name: kubernetes-apiservers
job_name: kubernetes-nodes
job_name: kubernetes-nodes-cadvisor
- job_name: kubernetes-service-endpoints
- job_name: kubernetes-service-endpoints-slow
job_name: prometheus-pushgateway
- job_name: kubernetes-services
- job_name: kubernetes-pods
- job_name: kubernetes-pods-slow
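Instead of checking the job list by eye, the same output can be tested for the jobs you care about. The following is a minimal sketch, not part of the original script: check_jobs is a hypothetical helper name, and it assumes that envoy-stats is the job that scrapes the sidecar metrics, as in the Istio sample scrape configuration.

```shell
# check_jobs (hypothetical helper): read a Prometheus configuration, or just its
# job_name lines, on stdin and report whether each job given as an argument is
# present. Returns 1 if at least one job is missing.
check_jobs() {
  config=$(cat)
  rc=0
  for job in "$@"; do
    # Accept the job name with or without quotes, and anchor at end of line so
    # that "kubernetes-pods" does not match "kubernetes-pods-slow".
    if printf '%s\n' "$config" | grep -Eq "job_name: *['\"]?${job}['\"]? *\$"; then
      echo "ok: $job"
    else
      echo "missing: $job"
      rc=1
    fi
  done
  return "$rc"
}

# Usage (requires cluster access, so it is left as a comment):
#   kubectl --kubeconfig "$USER_CONFIG" get cm prometheus -n istio-system \
#     -o jsonpath='{.data.prometheus\.yml}' | check_jobs istio-mesh envoy-stats
```

A non-zero exit status makes the helper usable as a gate in a setup script before moving on to the next steps.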
Use the following commands to deploy podinfo, the sample application of this series:
kubectl --kubeconfig "$USER_CONFIG" apply -f $PODINFO_SRC/kustomize/deployment.yaml -n test
kubectl --kubeconfig "$USER_CONFIG" apply -f $PODINFO_SRC/kustomize/service.yaml -n test
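Use the following commands to send requests to podinfo and generate monitoring metric data:
# Look up the podinfo pod name, then call the /version endpoint ten times from inside the pod.
podinfo_pod=$(kubectl --kubeconfig "$USER_CONFIG" get po -n test -l app=podinfo -o jsonpath='{.items..metadata.name}')
for i in {1..10}; do
  kubectl --kubeconfig "$USER_CONFIG" exec "$podinfo_pod" -c podinfod -n test -- curl -s podinfo:9898/version
  echo
done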
The metrics this series focuses on are istio_requests_total and istio_request_duration. First, we confirm inside the envoy container that these metrics have been generated.
Use the following command to request the stats metrics from Envoy and confirm that they include istio_requests_total.
kubectl --kubeconfig "$USER_CONFIG" exec $podinfo_pod -n test -c istio-proxy -- curl -s localhost:15090/stats/prometheus | grep istio_requests_total
The response is as follows:
:::: istio_requests_total ::::
# TYPE istio_requests_total counter
istio_requests_total{response_code="200",reporter="destination",source_workload="podinfo",source_workload_namespace="test",source_principal="spiffe://cluster.local/ns/test/sa/default",source_app="podinfo",source_version="unknown",source_cluster="c199d81d4e3104a5d90254b2a210914c8",destination_workload="podinfo",destination_workload_namespace="test",destination_principal="spiffe://cluster.local/ns/test/sa/default",destination_app="podinfo",destination_version="unknown",destination_service="podinfo.test.svc.cluster.local",destination_service_name="podinfo",destination_service_namespace="test",destination_cluster="c199d81d4e3104a5d90254b2a210914c8",request_protocol="http",response_flags="-",grpc_response_status="",connection_security_policy="mutual_tls",source_canonical_service="podinfo",destination_canonical_service="podinfo",source_canonical_revision="latest",destination_canonical_revision="latest"} 10
istio_requests_total{response_code="200",reporter="source",source_workload="podinfo",source_workload_namespace="test",source_principal="spiffe://cluster.local/ns/test/sa/default",source_app="podinfo",source_version="unknown",source_cluster="c199d81d4e3104a5d90254b2a210914c8",destination_workload="podinfo",destination_workload_namespace="test",destination_principal="spiffe://cluster.local/ns/test/sa/default",destination_app="podinfo",destination_version="unknown",destination_service="podinfo.test.svc.cluster.local",destination_service_name="podinfo",destination_service_namespace="test",destination_cluster="c199d81d4e3104a5d90254b2a210914c8",request_protocol="http",response_flags="-",grpc_response_status="",connection_security_policy="unknown",source_canonical_service="podinfo",destination_canonical_service="podinfo",source_canonical_revision="latest",destination_canonical_revision="latest"} 10
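Because podinfo calls itself here, the same ten requests appear twice, once per reporter. The long label sets make that hard to read, so the counts can be condensed per reporter. This is a minimal sketch under stated assumptions: sum_requests_by_reporter is a hypothetical helper name, and it assumes that samples carry no trailing timestamp, so the last field of each line is the counter value.

```shell
# sum_requests_by_reporter (hypothetical helper): read Prometheus text-format
# metrics on stdin and print the summed istio_requests_total per reporter label.
sum_requests_by_reporter() {
  awk '
    # Only sample lines, not "# TYPE" comments or other metrics.
    index($0, "istio_requests_total{") == 1 {
      # Extract the reporter label value, e.g. reporter="source" -> source.
      if (match($0, /reporter="[^"]*"/)) {
        reporter = substr($0, RSTART + 10, RLENGTH - 11)
        total[reporter] += $NF   # last field is the sample value
      }
    }
    END {
      for (r in total) printf "%s %d\n", r, total[r]
    }
  ' | sort
}

# Usage (requires cluster access, so it is left as a comment):
#   kubectl --kubeconfig "$USER_CONFIG" exec "$podinfo_pod" -n test -c istio-proxy -- \
#     curl -s localhost:15090/stats/prometheus | sum_requests_by_reporter
```

The helper sums across all other labels, including response_code, so it gives a quick sanity check rather than a success rate.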
Use the following command to request the stats metrics from Envoy and confirm that they include istio_request_duration.
kubectl --kubeconfig "$USER_CONFIG" exec $podinfo_pod -n test -c istio-proxy -- curl -s localhost:15090/stats/prometheus | grep istio_request_duration
The response is as follows:
:::: istio_request_duration ::::
# TYPE istio_request_duration_milliseconds histogram
istio_request_duration_milliseconds_bucket{response_code="200",reporter="destination",source_workload="podinfo",source_workload_namespace="test",source_principal="spiffe://cluster.local/ns/test/sa/default",source_app="podinfo",source_version="unknown",source_cluster="c199d81d4e3104a5d90254b2a210914c8",destination_workload="podinfo",destination_workload_namespace="test",destination_principal="spiffe://cluster.local/ns/test/sa/default",destination_app="podinfo",destination_version="unknown",destination_service="podinfo.test.svc.cluster.local",destination_service_name="podinfo",destination_service_namespace="test",destination_cluster="c199d81d4e3104a5d90254b2a210914c8",request_protocol="http",response_flags="-",grpc_response_status="",connection_security_policy="mutual_tls",source_canonical_service="podinfo",destination_canonical_service="podinfo",source_canonical_revision="latest",destination_canonical_revision="latest",le="0.5"} 10
istio_request_duration_milliseconds_bucket{response_code="200",reporter="destination",source_workload="podinfo",source_workload_namespace="test",source_principal="spiffe://cluster.local/ns/test/sa/default",source_app="podinfo",source_version="unknown",source_cluster="c199d81d4e3104a5d90254b2a210914c8",destination_workload="podinfo",destination_workload_namespace="test",destination_principal="spiffe://cluster.local/ns/test/sa/default",destination_app="podinfo",destination_version="unknown",destination_service="podinfo.test.svc.cluster.local",destination_service_name="podinfo",destination_service_namespace="test",destination_cluster="c199d81d4e3104a5d90254b2a210914c8",request_protocol="http",response_flags="-",grpc_response_status="",connection_security_policy="mutual_tls",source_canonical_service="podinfo",destination_canonical_service="podinfo",source_canonical_revision="latest",destination_canonical_revision="latest",le="1"} 10
...
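Prometheus histogram buckets are cumulative: each le bucket counts every request that took at most that many milliseconds, which is why both buckets above already show 10. To read the distribution without the label noise, the buckets can be reduced to one line per le boundary. This is a minimal sketch: bucket_counts is a hypothetical helper name, and it filters on a single reporter because mixing the source and destination views would count each request twice.

```shell
# bucket_counts (hypothetical helper): read Prometheus text-format metrics on
# stdin and print the cumulative count per "le" bucket of
# istio_request_duration_milliseconds for one reporter (source or destination).
bucket_counts() {
  awk -v reporter="$1" '
    index($0, "istio_request_duration_milliseconds_bucket{") == 1 &&
    index($0, "reporter=\"" reporter "\"") > 0 {
      # Match le only as a whole label name (preceded by "{" or ",").
      if (match($0, /[{,]le="[^"]*"/)) {
        le = substr($0, RSTART + 5, RLENGTH - 6)
        if (!(le in count)) order[++n] = le   # remember first-seen order
        count[le] += $NF                      # last field is the sample value
      }
    }
    END {
      for (i = 1; i <= n; i++) printf "le=%s %d\n", order[i], count[order[i]]
    }
  '
}

# Usage (requires cluster access, so it is left as a comment):
#   kubectl --kubeconfig "$USER_CONFIG" exec "$podinfo_pod" -n test -c istio-proxy -- \
#     curl -s localhost:15090/stats/prometheus | bucket_counts destination
```

Buckets are printed in the order Envoy emits them, which avoids having to sort the +Inf boundary numerically.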
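Finally, we verify that the monitoring metrics generated by Envoy are collected by Prometheus in real time. Expose the Prometheus service externally and open it in a browser. Then enter istio_requests_total in the query box. The result is shown in the figure below.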