Basic concepts of Ingress:
Deploy the httpbin service. As before, the official demo already provides the configuration file; simply apply it with the following command:
```shell
[root@m1 ~]# kubectl apply -f /usr/local/istio-1.8.1/samples/httpbin/httpbin.yaml
serviceaccount/httpbin created
service/httpbin created
deployment.apps/httpbin created
[root@m1 ~]#
```
Configure an ingress gateway for the httpbin service by defining a Gateway resource, as shown below:
```shell
$ kubectl apply -f - <<EOF
apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: httpbin-gateway
spec:
  selector:
    istio: ingressgateway # use Istio default gateway implementation
  servers:
  - port:
      number: 80
      name: http
      protocol: HTTP
    hosts: # hosts exposed to clients; only requests for these hosts enter this Gateway
    - "httpbin.example.com"
EOF
```
Then define a VirtualService to configure the routing rules and bind it to that Gateway:
```shell
$ kubectl apply -f - <<EOF
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: httpbin
spec:
  hosts: # must match the hosts defined in the Gateway
  - "httpbin.example.com"
  gateways: # bind to the Gateway above by name
  - httpbin-gateway
  http:
  - match: # request matching rules, i.e. which endpoints are exposed
    - uri:
        prefix: /status
    - uri:
        prefix: /delay
    route:
    - destination:
        port:
          number: 8000
        host: httpbin
EOF
```
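The match rules above are evaluated in order, and a request that matches no rule gets a 404. A minimal Python sketch of first-match prefix routing (the function and rule format are illustrative, not Istio internals):

```python
def route(path, rules):
    """Return the destination of the first rule whose prefix matches, else None."""
    for prefix, destination in rules:
        if path.startswith(prefix):
            return destination
    return None  # no rule matched: the gateway answers 404

# the two exposed prefixes from the VirtualService above
rules = [("/status", "httpbin:8000"), ("/delay", "httpbin:8000")]

print(route("/status/200", rules))  # routed to the httpbin service
print(route("/headers", rules))     # None: not exposed, Envoy returns 404
```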
Use the following commands to get the actual address and port for requests to the istio-ingressgateway service (use this approach when the service's EXTERNAL-IP is pending or none; see the official docs for details):
```shell
[root@m1 ~]# kubectl -n istio-system get service istio-ingressgateway -o jsonpath='{.spec.ports[?(@.name=="http2")].nodePort}'
32482
[root@m1 ~]# kubectl get po -l istio=ingressgateway -n istio-system -o jsonpath='{.items[0].status.hostIP}'
192.168.243.140
[root@m1 ~]#
```
Next, use the curl command to test whether the httpbin endpoints can be accessed normally:
```shell
[root@m1 ~]# curl -s -I -HHost:httpbin.example.com "http://192.168.243.140:32482/status/200"
HTTP/1.1 200 OK
server: istio-envoy
date: Tue, 22 Dec 2020 03:45:17 GMT
content-type: text/html; charset=utf-8
access-control-allow-origin: *
access-control-allow-credentials: true
content-length: 0
x-envoy-upstream-service-time: 16
[root@m1 ~]#
```
Accessing an endpoint that was not exposed returns a 404:
```shell
[root@m1 ~]# curl -s -I -HHost:httpbin.example.com "http://192.168.243.140:32482/headers"
HTTP/1.1 404 Not Found
date: Tue, 22 Dec 2020 03:47:15 GMT
server: istio-envoy
transfer-encoding: chunked
[root@m1 ~]#
```
Where there is traffic entering the mesh, there is also traffic leaving it. This ingress and egress traffic is what we usually call north-south traffic, and in Istio we can control both.
Ways to access external services in Istio:
global.outboundTrafficPolicy.mode = ALLOW_ANY (the default)
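Under ALLOW_ANY the sidecar lets requests to unknown external hosts pass through, while the alternative mode, REGISTRY_ONLY, blocks any host that has no ServiceEntry in the mesh's service registry. A minimal sketch of that decision (illustrative only, not Envoy's actual logic):

```python
def outbound_allowed(host, service_registry, mode="ALLOW_ANY"):
    """Model of global.outboundTrafficPolicy.mode: ALLOW_ANY passes any
    external host through; REGISTRY_ONLY only allows registered hosts."""
    if mode == "ALLOW_ANY":
        return True
    return host in service_registry  # REGISTRY_ONLY

registry = {"httpbin.org"}  # hosts declared via ServiceEntry

print(outbound_allowed("httpbin.org", registry, "REGISTRY_ONLY"))  # True
print(outbound_allowed("google.com", registry, "REGISTRY_ONLY"))   # False
```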
Basic concepts of Egress:
In this section we will create an Egress gateway and let an internal service (sleep) access an external service (httpbin.org) through it. Both services were already demonstrated in earlier sections:
Check that the istio-egressgateway component exists:
```shell
[root@m1 ~]# kubectl get pods -n istio-system
NAME                                  READY   STATUS    RESTARTS   AGE
istio-egressgateway-d84f95b69-dmpzf   1/1     Running   1          27h
...
```
Confirm that the sleep service is running normally:
```shell
[root@m1 ~]# kubectl get pods
NAME                     READY   STATUS    RESTARTS   AGE
...
sleep-854565cb79-gm6hj   2/2     Running   2          15h
```
Define a ServiceEntry for the external service httpbin.org:
```shell
$ kubectl apply -f - <<EOF
apiVersion: networking.istio.io/v1alpha3
kind: ServiceEntry
metadata:
  name: httpbin
spec:
  hosts:
  - httpbin.org
  ports:
  - number: 80
    name: http-port
    protocol: HTTP
  resolution: DNS
EOF
```
Confirm it was created successfully:
```shell
[root@m1 ~]# kubectl get se
NAME      HOSTS             LOCATION   RESOLUTION   AGE
httpbin   ["httpbin.org"]              DNS          2s
```
Define the Egress gateway:
```shell
$ kubectl apply -f - <<EOF
apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: istio-egressgateway
spec:
  selector:
    istio: egressgateway
  servers:
  - port:
      number: 80
      name: http
      protocol: HTTP
    hosts:
    - httpbin.org
EOF
```
Define the routing that directs traffic to istio-egressgateway:
```shell
$ kubectl apply -f - <<EOF
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: vs-for-egressgateway
spec:
  hosts:
  - httpbin.org
  gateways:
  - istio-egressgateway
  - mesh
  http:
  - match: # rule for in-mesh traffic: direct all internal requests to the egress gateway
    - gateways:
      - mesh
      port: 80
    route: # route the requests to the egress gateway
    - destination:
        host: istio-egressgateway.istio-system.svc.cluster.local # DNS name of the egress component
        subset: httpbin
        port:
          number: 80
      weight: 100
  - match: # rule for the egress gateway itself
    - gateways:
      - istio-egressgateway
      port: 80
    route: # direct the egress gateway's requests to the final external service, httpbin.org
    - destination:
        host: httpbin.org
        port:
          number: 80
      weight: 100
---
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: dr-for-egressgateway
spec:
  host: istio-egressgateway.istio-system.svc.cluster.local # DNS name of the egress component
  subsets:
  - name: httpbin
EOF
```
Test accessing an endpoint of the httpbin.org service:
```shell
[root@m1 ~]# kubectl exec -it sleep-854565cb79-gm6hj -c sleep -- curl http://httpbin.org/ip
{
  "origin": "172.22.152.252, 223.74.101.7"
}
[root@m1 ~]#
```
Check the logs to verify that the outbound traffic passed through the egress gateway. Log output like the following means the egress gateway is configured correctly and the outbound traffic went through it:
```shell
[root@m1 ~]# kubectl logs -f istio-egressgateway-d84f95b69-dmpzf -n istio-system
...
[2020-12-22T06:32:22.057Z] "GET /ip HTTP/2" 200 - "-" 0 47 835 834 "172.22.152.252" "curl/7.69.1" "88e2392e-9b43-9e5f-8007-82873cdd9701" "httpbin.org" "54.164.234.192:80" outbound|80||httpbin.org 172.22.78.138:55966 172.22.78.138:8080 172.22.152.252:35658 - -
```
At this point the flow of the sleep service accessing the external service is as shown in the figure below:
For a distributed system, network failures are unavoidable, so improving resilience, the system's ability to cope with failures, is a very important topic in distributed architecture. Timeouts and retries are among the most important and most commonly used mechanisms for improving resilience.
Basic concepts
We will again use the Bookinfo application for the demonstration and add timeout and retry policies to some of its services. We will direct requests to v2 of the reviews service and add a delay to the ratings service to simulate a failure, in order to verify that the timeout and retry policies we set actually take effect:
First, create a VirtualService that routes requests to v2 of the reviews service:
```shell
$ kubectl apply -f - <<EOF
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: reviews
spec:
  hosts:
  - reviews
  http:
  - route:
    - destination:
        host: reviews
        subset: v2
EOF
```
Inject a delay into the ratings service to simulate a failure:
```shell
$ kubectl apply -f - <<EOF
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: ratings
spec:
  hosts:
  - ratings
  http:
  - fault:
      delay:
        percent: 100
        fixedDelay: 2s
    route:
    - destination:
        host: ratings
        subset: v1
EOF
```
Add a timeout policy to the reviews service:
```shell
$ kubectl apply -f - <<EOF
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: reviews
spec:
  hosts:
  - reviews
  http:
  - route:
    - destination:
        host: reviews
        subset: v2
    timeout: 1s # add the timeout policy
EOF
```
Refresh the application page now and you can see that an error message is returned (the call through reviews exceeds the 1s timeout because of the 2s delay injected into ratings):
Remove the timeout policy from the reviews service, then add a retry policy to the ratings service:
```shell
$ kubectl apply -f - <<EOF
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: ratings
spec:
  hosts:
  - ratings
  http:
  - fault:
      delay:
        percent: 100
        fixedDelay: 5s
    route:
    - destination:
        host: ratings
        subset: v1
    retries:
      attempts: 2 # number of retries
      perTryTimeout: 1s # timeout for each individual attempt (not the interval between retries)
EOF
```
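Note that perTryTimeout bounds each individual attempt rather than spacing out retries. With the 5s injected delay, a 1s per-try timeout, and 2 retries, every attempt should time out. A toy Python model of this retry loop (illustrative only, not Envoy's implementation):

```python
def call_with_retries(upstream_latency, per_try_timeout, attempts):
    """Model the retry loop: the initial request plus up to `attempts`
    retries, each attempt bounded by per_try_timeout seconds."""
    tries = []
    for _ in range(1 + attempts):  # initial try + retries
        if upstream_latency <= per_try_timeout:
            tries.append("200")
            return tries
        tries.append("timeout")
    return tries

# 5s injected delay, 1s per-try timeout, 2 retries: every attempt times out
print(call_with_retries(5.0, 1.0, 2))   # ['timeout', 'timeout', 'timeout']
# without the fault the upstream answers quickly and the first try succeeds
print(call_with_retries(0.02, 1.0, 2))  # ['200']
```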
Watch the ratings service's sidecar logs, then refresh the application page. Under normal circumstances the log output shows the request was retried twice:
```shell
[root@m1 ~]# kubectl logs -f ratings-v1-7d99676f7f-jhcv6 -c istio-proxy
...
[2020-12-22T07:57:28.104Z] "GET /ratings/0 HTTP/1.1" 200 - "-" 0 48 0 0 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.88 Safari/537.36" "c07ce12d-cddf-950d-93ae-f8fee93aeca6" "ratings:9080" "127.0.0.1:9080" inbound|9080|| 127.0.0.1:54592 172.22.152.250:9080 172.22.78.132:46554 outbound_.9080_.v1_.ratings.default.svc.cluster.local default
[2020-12-22T07:57:31.108Z] "GET /ratings/0 HTTP/1.1" 200 - "-" 0 48 0 0 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.88 Safari/537.36" "c07ce12d-cddf-950d-93ae-f8fee93aeca6" "ratings:9080" "127.0.0.1:9080" inbound|9080|| 127.0.0.1:54628 172.22.152.250:9080 172.22.78.132:42092 outbound_.9080_.v1_.ratings.default.svc.cluster.local default
```
Configuration options:
In this section we will add a circuit breaker configuration to the httpbin service, then trigger the circuit breaker with a load-testing tool. Circuit breaker settings are configured in the service's DestinationRule, as shown below:
```shell
$ kubectl apply -f - <<EOF
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: httpbin
spec:
  host: httpbin
  trafficPolicy: # traffic policy
    connectionPool: # connection pool (bulkhead pattern): use pooling to isolate resources
      tcp:
        maxConnections: 1 # at most 1 TCP connection
      http: # at most 1 pending request and 1 request per connection
        http1MaxPendingRequests: 1
        maxRequestsPerConnection: 1
    outlierDetection: # outlier detection
      consecutiveErrors: 1 # number of failures that triggers ejection
      interval: 1s # detection interval
      baseEjectionTime: 3m # minimum ejection time; repeated ejections back off exponentially
      maxEjectionPercent: 100 # maximum percentage of hosts that may be ejected
EOF
```
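The outlierDetection half of this policy behaves like a per-host circuit breaker: after consecutiveErrors failures in a row, the host is ejected from the load-balancing pool for at least baseEjectionTime. A toy Python model (the class and field names are ours for illustration, not Istio's):

```python
class CircuitBreaker:
    """Toy model of outlier detection: after `consecutive_errors` failures
    in a row, eject the host for `base_ejection_time` seconds."""
    def __init__(self, consecutive_errors=1, base_ejection_time=180):
        self.consecutive_errors = consecutive_errors
        self.base_ejection_time = base_ejection_time
        self.failures = 0
        self.ejected_until = 0.0

    def allow(self, now):
        """May the host receive traffic at time `now`?"""
        return now >= self.ejected_until

    def record(self, now, ok):
        """Record the outcome of a request finishing at time `now`."""
        if ok:
            self.failures = 0
            return
        self.failures += 1
        if self.failures >= self.consecutive_errors:
            self.ejected_until = now + self.base_ejection_time
            self.failures = 0

cb = CircuitBreaker()
cb.record(0.0, ok=False)  # a single failure trips the breaker (consecutiveErrors: 1)
print(cb.allow(1.0))      # False: host ejected, requests fail fast
print(cb.allow(181.0))    # True: the ejection window has passed
```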
Install fortio and use this load-testing tool to trigger the circuit breaker:
```shell
[root@m1 ~]# kubectl apply -f /usr/local/istio-1.8.1/samples/httpbin/sample-client/fortio-deploy.yaml
service/fortio created
deployment.apps/fortio-deploy created
[root@m1 ~]#
```
First send a single request to confirm the tool works correctly:
```shell
[root@m1 ~]# export FORTIO_POD=$(kubectl get pods -lapp=fortio -o 'jsonpath={.items[0].metadata.name}')
[root@m1 ~]# kubectl exec "$FORTIO_POD" -c fortio -- /usr/bin/fortio curl -quiet http://httpbin:8000/get
HTTP/1.1 200 OK
server: envoy
date: Tue, 22 Dec 2020 08:32:18 GMT
content-type: application/json
content-length: 622
access-control-allow-origin: *
access-control-allow-credentials: true
x-envoy-upstream-service-time: 22

{
  "args": {},
  "headers": {
    "Content-Length": "0",
    "Host": "httpbin:8000",
    "User-Agent": "fortio.org/fortio-1.11.3",
    "X-B3-Parentspanid": "918687527bfa9f4c",
    "X-B3-Sampled": "1",
    "X-B3-Spanid": "7c652cb91e0e3ad0",
    "X-B3-Traceid": "7c29c42837142e88918687527bfa9f4c",
    "X-Envoy-Attempt-Count": "1",
    "X-Forwarded-Client-Cert": "By=spiffe://cluster.local/ns/default/sa/httpbin;Hash=8452370e5dc510c7fd99321b642b4af3bd83bc832636112365a4a45e344d0875;Subject=\"\";URI=spiffe://cluster.local/ns/default/sa/default"
  },
  "origin": "127.0.0.1",
  "url": "http://httpbin:8000/get"
}
[root@m1 ~]#
```
Once that works, run a concurrent load test with the following command: 3 concurrent connections, 30 calls in total:
```shell
[root@m1 ~]# kubectl exec "$FORTIO_POD" -c fortio -- /usr/bin/fortio load -c 3 -qps 0 -n 30 -loglevel Warning http://httpbin:8000/get
08:36:00 I logger.go:127> Log level is now 3 Warning (was 2 Info)
Fortio 1.11.3 running at 0 queries per second, 4->4 procs, for 30 calls: http://httpbin:8000/get
Starting at max qps with 3 thread(s) [gomax 4] for exactly 30 calls (10 per thread + 0)
08:36:00 W http_client.go:693> Parsed non ok code 503 (HTTP/1.1 503)
08:36:00 W http_client.go:693> Parsed non ok code 503 (HTTP/1.1 503)
08:36:00 W http_client.go:693> Parsed non ok code 503 (HTTP/1.1 503)
08:36:00 W http_client.go:693> Parsed non ok code 503 (HTTP/1.1 503)
08:36:00 W http_client.go:693> Parsed non ok code 503 (HTTP/1.1 503)
08:36:00 W http_client.go:693> Parsed non ok code 503 (HTTP/1.1 503)
08:36:00 W http_client.go:693> Parsed non ok code 503 (HTTP/1.1 503)
08:36:00 W http_client.go:693> Parsed non ok code 503 (HTTP/1.1 503)
08:36:00 W http_client.go:693> Parsed non ok code 503 (HTTP/1.1 503)
08:36:00 W http_client.go:693> Parsed non ok code 503 (HTTP/1.1 503)
08:36:00 W http_client.go:693> Parsed non ok code 503 (HTTP/1.1 503)
08:36:00 W http_client.go:693> Parsed non ok code 503 (HTTP/1.1 503)
08:36:00 W http_client.go:693> Parsed non ok code 503 (HTTP/1.1 503)
Ended after 31.42646ms : 30 calls. qps=954.61
Aggregated Function Time : count 30 avg 0.0020163533 +/- 0.001477 min 0.000302125 max 0.005454188 sum 0.060490598
# range, mid point, percentile, count
>= 0.000302125 <= 0.001 , 0.000651062 , 36.67, 11
> 0.001 <= 0.002 , 0.0015 , 43.33, 2
> 0.002 <= 0.003 , 0.0025 , 83.33, 12
> 0.003 <= 0.004 , 0.0035 , 86.67, 1
> 0.004 <= 0.005 , 0.0045 , 93.33, 2
> 0.005 <= 0.00545419 , 0.00522709 , 100.00, 2
# target 50% 0.00216667
# target 75% 0.00279167
# target 90% 0.0045
# target 99% 0.00538606
# target 99.9% 0.00544738
Sockets used: 15 (for perfect keepalive, would be 3)
Jitter: false
Code 200 : 17 (56.7 %)  # 56.7% of requests succeeded with status 200
Code 503 : 13 (43.3 %)  # 43.3% failed with status 503, meaning the circuit breaker was triggered
Response Header Sizes : count 30 avg 130.33333 +/- 114 min 0 max 230 sum 3910
Response Body/Total Sizes : count 30 avg 587.23333 +/- 302.8 min 241 max 852 sum 17617
All done 30 calls (plus 0 warmup) 2.016 ms avg, 954.6 qps
[root@m1 ~]#
```
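The 503s come from the connection-pool limits rather than the upstream itself: with maxConnections: 1 and http1MaxPendingRequests: 1, of three simultaneous requests one can be served, one can wait in the pending queue, and the rest are rejected immediately. A toy admission model in Python (illustrative, not Envoy's code):

```python
def admit(in_flight, pending, max_connections=1, max_pending=1):
    """Toy admission check mirroring the connection-pool settings above:
    serve if a connection is free, queue if the pending queue has room,
    otherwise reject with a 503."""
    if in_flight < max_connections:
        return "serve"
    if pending < max_pending:
        return "queue"
    return "503"

# three simultaneous requests arriving at the pool
print(admit(0, 0))  # first request gets the single connection
print(admit(1, 0))  # second request waits in the pending queue
print(admit(1, 1))  # third request is rejected with a 503
```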
To inspect the underlying metrics, use the following command:
```shell
$ kubectl exec "$FORTIO_POD" -c istio-proxy -- pilot-agent request GET stats | grep httpbin | grep pending
```
Configuration options:
Once the network (including failure-recovery policies) is configured, we can use Istio's fault injection mechanism to test the application's overall failure-recovery capability. Fault injection is a testing method that deliberately introduces errors into a system to make sure it can withstand and recover from error conditions.
This makes fault injection especially useful: it can surface, in advance, failure-recovery policies that are incompatible or too restrictive and could otherwise make critical services unavailable. Examples of fault injection's development and adoption in the industry:
We have in fact already demonstrated Istio's fault injection configuration: in the timeouts and retries section we injected a delay fault into the ratings service:
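That delay fault boils down to a simple rule: for a configured percentage of matching requests, wait fixedDelay before forwarding. A minimal Python model (illustrative only; the function is ours, not Istio's):

```python
import random

def inject_delay(percentage, fixed_delay, rng=random.Random(42)):
    """Model of a fault.delay rule: a fraction of requests receive an
    artificial delay (in seconds) before being forwarded upstream."""
    if rng.uniform(0, 100) < percentage:
        return fixed_delay
    return 0.0

# percentage 100 -> every request is delayed by the full fixedDelay
print(inject_delay(100.0, 7.0))  # 7.0
# percentage 0 -> no request is delayed
print(inject_delay(0.0, 7.0))    # 0.0
```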
Use the following command to set every Bookinfo service's routing to its v1 version:
```shell
[root@m1 ~]# kubectl apply -f /usr/local/istio-1.8.1/samples/bookinfo/networking/virtual-service-all-v1.yaml
```
Then direct the reviews service's traffic to its v2 version, because only v2 and v3 call the ratings service:
```shell
[root@m1 ~]# kubectl apply -f /usr/local/istio-1.8.1/samples/bookinfo/networking/virtual-service-reviews-test-v2.yaml
```
Inject a delay fault into the ratings service:
```shell
[root@m1 ~]# kubectl apply -f /usr/local/istio-1.8.1/samples/bookinfo/networking/virtual-service-ratings-test-delay.yaml
```
The content of the virtual-service-ratings-test-delay.yaml file:
```yaml
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: ratings
spec:
  hosts:
  - ratings
  http:
  - match:
    - headers: # match a specific logged-in user by header, so the fault only affects that user's traffic
        end-user:
          exact: jason
    fault: # fault injection configuration
      delay: # fault type: delay
        percentage: # fraction of traffic affected by the fault
          value: 100.0
        fixedDelay: 7s # how long to delay
    route:
    - destination:
        host: ratings
        subset: v1
  - route:
    - destination:
        host: ratings
        subset: v1
```
Configuration options:
Many developers have run into this problem: a feature that works fine in the development/test environment breaks as soon as it goes live. Even with thorough unit and integration tests and high test coverage, such problems still occur, and they are hard to reproduce in development/test environments.
A major cause is that the production environment, particularly its data, differs greatly from development/test: the data volume, the request concurrency, and the ways users work with the data are all very different. This inconsistency makes it hard to discover production problems in development/test environments.
A very good solution is to use traffic mirroring: replicate a copy of live traffic into the development/test environment for testing.
Traffic mirroring (also called shadowing) is a powerful concept that lets development teams bring changes to production with as little risk as possible. The mirroring mechanism sends a copy of live traffic to a mirror service, and the mirroring happens outside the primary service's critical request path.
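The key property of shadowing is that the mirrored request is fire-and-forget: the mirror's responses and errors are discarded and never affect the client. A minimal Python model of that behavior (illustrative only; the handler shape is ours, not Envoy's):

```python
def handle(request, primary, mirror=None):
    """Model of traffic shadowing: send a copy of the request to the
    mirror fire-and-forget, then return only the primary's response."""
    if mirror is not None:
        try:
            mirror(request)       # the mirror's response/errors are ignored
        except Exception:
            pass                  # a failing mirror never hurts the client
    return primary(request)

v1_log, v2_log = [], []
resp = handle("/headers",
              primary=lambda r: v1_log.append(r) or "200 from v1",
              mirror=lambda r: v2_log.append(r) or "200 from v2")
print(resp)    # 200 from v1
print(v2_log)  # ['/headers']: the mirror saw the same request
```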
Next we will configure Istio's traffic mirroring mechanism: the requirement is to mirror traffic sent to v1 of the httpbin service to v2. So we first deploy v1 and v2 of the httpbin service. The v1 configuration:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: httpbin-v1
spec:
  replicas: 1
  selector:
    matchLabels:
      app: httpbin
      version: v1
  template:
    metadata:
      labels:
        app: httpbin
        version: v1
    spec:
      containers:
      - image: docker.io/kennethreitz/httpbin
        imagePullPolicy: IfNotPresent
        name: httpbin
        command: ["gunicorn", "--access-logfile", "-", "-b", "0.0.0.0:80", "httpbin:app"]
        ports:
        - containerPort: 80
```
Deploy v2 of the httpbin service:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: httpbin-v2
spec:
  replicas: 1
  selector:
    matchLabels:
      app: httpbin
      version: v2
  template:
    metadata:
      labels:
        app: httpbin
        version: v2
    spec:
      containers:
      - image: docker.io/kennethreitz/httpbin
        imagePullPolicy: IfNotPresent
        name: httpbin
        command: ["gunicorn", "--access-logfile", "-", "-b", "0.0.0.0:80", "httpbin:app"]
        ports:
        - containerPort: 80
```
Create a Service resource for httpbin so the Pods are exposed through a service:
```yaml
apiVersion: v1
kind: Service
metadata:
  name: httpbin
  labels:
    app: httpbin
spec:
  ports:
  - name: http
    port: 8000
    targetPort: 80
  selector:
    app: httpbin
```
Create a default virtual service and destination rule for the httpbin service:
```yaml
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: httpbin
spec:
  hosts:
  - httpbin
  http:
  - route:
    - destination:
        host: httpbin
        subset: v1 # direct requests to v1
      weight: 100
---
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: httpbin
spec:
  host: httpbin
  subsets:
  - name: v1
    labels:
      version: v1
  - name: v2
    labels:
      version: v2
```
Test whether the httpbin service's endpoints can be reached normally:
```shell
[root@m1 ~]# export SLEEP_POD=$(kubectl get pod -l app=sleep -o jsonpath={.items..metadata.name})
[root@m1 ~]# kubectl exec "${SLEEP_POD}" -c sleep -- curl -s http://httpbin:8000/headers
{
  "headers": {
    "Accept": "*/*",
    "Content-Length": "0",
    "Host": "httpbin:8000",
    "User-Agent": "curl/7.69.1",
    "X-B3-Parentspanid": "86bf83bdc5d1be76",
    "X-B3-Sampled": "1",
    "X-B3-Spanid": "07b81e555626fe41",
    "X-B3-Traceid": "72d7ebc62e57ec7386bf83bdc5d1be76",
    "X-Envoy-Attempt-Count": "1",
    "X-Forwarded-Client-Cert": "By=spiffe://cluster.local/ns/default/sa/default;Hash=a9722ea6836d43d677f56f8b39cddbec522311a27ea0baf25ea08b265303d4f3;Subject=\"\";URI=spiffe://cluster.local/ns/default/sa/sleep"
  }
}
[root@m1 ~]#
```
With the preparation done, we can configure mirroring in the virtual service so that traffic to v1 of the httpbin service is mirrored to v2:
```yaml
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: httpbin
spec:
  hosts:
  - httpbin
  http:
  - route:
    - destination:
        host: httpbin
        subset: v1
      weight: 100
    mirror: # traffic mirroring configuration
      host: httpbin # the mirror target, i.e. which service receives the copy
      subset: v2 # restrict to v2, so v1's traffic is copied to v2
    mirror_percent: 100 # percentage of traffic to mirror
```
Request a httpbin endpoint again. Because of the routing rule we configured, this request is guaranteed to be routed to v1:
```shell
[root@m1 ~]# kubectl exec "${SLEEP_POD}" -c sleep -- curl -s http://httpbin:8000/headers
```
Now watch the v2 pod's logs. The log output shows that v2 received the same request, which means our traffic mirroring configuration took effect:
```shell
[root@m1 ~]# kubectl logs -f httpbin-v2-75d9447d79-rtc9x -c httpbin
...
127.0.0.1 - - [22/Dec/2020:10:05:56 +0000] "GET /headers HTTP/1.1" 200 591 "-" "curl/7.69.1"
```
Configuration options: