How Rancher Uses iptables

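For operations engineers, iptables is a very useful tool. Combined with tcpdump/wireshark, it is especially effective for tracking down layer-4 send and receive problems. For example, a packet arrives at a network device and tcpdump, tapping in beside the protocol stack, captures the exchange, yet the packet is never delivered up to the layer-4 application. Some mechanism or policy in between must be blocking the delivery, and this is where iptables, driven from user space, can often reveal a lot of information.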

 

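iptables itself is a user-space program. It configures the kernel's netfilter subsystem, which does the actual packet handling (firewalling, load balancing and other common uses) at five hooks: PREROUTING, INPUT, FORWARD, OUTPUT and POSTROUTING. The iptables source tree also ships libipq, whose ipq_create_handle function in libipq.c calls the glibc wrapper socket(PF_NETLINK, SOCK_RAW, NETLINK_FIREWALL), or socket(PF_NETLINK, SOCK_RAW, NETLINK_IP6_FW) for IPv6. The call traps into the kernel through a system call and returns a netlink socket handle through which a user-space program can receive the packets queued at those hooks and apply its own processing logic. That interface belongs to the legacy ip_queue mechanism, which newer kernels have replaced with nfnetlink_queue, so this article does not dig into it further.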

 

What follows is a brief walk-through of how Rancher steers packets, analyzed with the help of iptables.

 

Flush the rules in every chain of the iptables mangle table

  • iptables -t mangle -F

 

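Add LOG rules to the five hooks of the iptables mangle table. The rules match TCP packets whose source or destination port is 3306 (a MySQL service exposing port 3306 will be deployed on Rancher shortly) and write them to the kernel log at log level 7 (debug), each with a per-chain prefix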

  • iptables -t mangle -A PREROUTING -p tcp --dport 3306 -j LOG --log-prefix "M-PREROUTING:" --log-level 7
  • iptables -t mangle -A POSTROUTING -p tcp --dport 3306 -j LOG --log-prefix "M-POSTROUTING:" --log-level 7
  • iptables -t mangle -A FORWARD -p tcp --dport 3306 -j LOG --log-prefix "M-FORWARD:" --log-level 7
  • iptables -t mangle -A OUTPUT -p tcp --dport 3306 -j LOG --log-prefix "M-OUTPUT:" --log-level 7
  • iptables -t mangle -A INPUT -p tcp --dport 3306 -j LOG --log-prefix "M-INPUT:" --log-level 7
  • iptables -t mangle -A PREROUTING -p tcp --sport 3306 -j LOG --log-prefix "M-PREROUTING:" --log-level 7
  • iptables -t mangle -A POSTROUTING -p tcp --sport 3306 -j LOG --log-prefix "M-POSTROUTING:" --log-level 7
  • iptables -t mangle -A FORWARD -p tcp --sport 3306 -j LOG --log-prefix "M-FORWARD:" --log-level 7
  • iptables -t mangle -A OUTPUT -p tcp --sport 3306 -j LOG --log-prefix "M-OUTPUT:" --log-level 7
  • iptables -t mangle -A INPUT -p tcp --sport 3306 -j LOG --log-prefix "M-INPUT:" --log-level 7
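
The ten rules above differ only in the chain name and in whether they match the source or the destination port, so they can be generated with a loop. The following is a minimal sketch, not part of the original setup: `gen_log_rules` is a hypothetical helper name, and it only prints the commands so they can be reviewed before anything touches the live ruleset.

```shell
#!/bin/sh
# Hypothetical helper (not from the original article): print the ten LOG
# rules for a given TCP port instead of executing them.
gen_log_rules() {
    port="$1"
    # Destination-port rules first, then source-port rules, as in the list above.
    for dir in dport sport; do
        for chain in PREROUTING POSTROUTING FORWARD OUTPUT INPUT; do
            printf 'iptables -t mangle -A %s -p tcp --%s %s -j LOG --log-prefix "M-%s:" --log-level 7\n' \
                "$chain" "$dir" "$port" "$chain"
        done
    done
}

gen_log_rules 3306
```

Once the printed commands look right, they could be applied with something like `gen_log_rules 3306 | sudo sh`. Printing first is a deliberate choice here: LOG rules on a busy port can flood the kernel log.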

 

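Deploy a MySQL service on Rancher, exposing port 3306, and then look at the main iptables rules Rancher added for the service on the host that runs the container. These rules set the stage for the packet analysis below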

  • NAT table
    • -A CATTLE_HOSTPORTS_POSTROUTING -s 10.42.158.152/32 -d 10.42.158.152/32 -p tcp -m tcp --dport 3306 -j MASQUERADE
    • -A CATTLE_OUTPUT -p tcp -m tcp --dport 3306 -m addrtype --dst-type LOCAL -j DNAT --to-destination 10.42.158.152:3306
    • -A CATTLE_PREROUTING ! -i docker0 -p tcp -m tcp --dport 3306 -j DNAT --to-destination 10.42.158.152:3306
    • -A CATTLE_PREROUTING -p tcp -m tcp --dport 3306 -m addrtype --dst-type LOCAL -j DNAT --to-destination 10.42.158.152:3306
  • FILTER table
    • -A CATTLE_FORWARD -m mark --mark 0x1068 -j ACCEPT
    • -A CATTLE_FORWARD -m mark --mark 0x4000 -j ACCEPT
    • -A CATTLE_FORWARD -d 10.42.0.0/16 -o docker0 -j ACCEPT
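
To pull the port-specific rules out of a full ruleset dump, a small filter is enough. This is a sketch under assumptions: `cattle_rules_for_port` is a hypothetical helper name, and it expects `iptables-save` style text on standard input. The demo feeds it two of the rules listed above, so it runs without root.

```shell
#!/bin/sh
# Hypothetical helper: read iptables-save output on stdin and keep only the
# rules in Rancher's CATTLE_* chains that match the given destination port.
cattle_rules_for_port() {
    port="$1"
    grep -E '^-A CATTLE_' | grep -E -e "--dport ${port}( |\$)"
}

# Demo on two of the rules listed above (no root needed): only the first
# line carries "--dport 3306", so only that line is printed.
cattle_rules_for_port 3306 <<'EOF'
-A CATTLE_PREROUTING ! -i docker0 -p tcp -m tcp --dport 3306 -j DNAT --to-destination 10.42.158.152:3306
-A CATTLE_FORWARD -m mark --mark 0x4000 -j ACCEPT
EOF
```

On a real host the input would come from something like `sudo iptables-save -t nat | cattle_rules_for_port 3306`. The FILTER rules shown above match on marks and subnets rather than on the port, so this filter will not list them.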

 

Read /var/log/kern.log and trace how the packets flow

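  • Accessing port 3306 on the MySQL host from another host
    • The SYN packet reaches the PREROUTING chain of the mangle table, which writes the log line below. The PREROUTING chain of the NAT table then applies DNAT; the matching rule is in the custom CATTLE_PREROUTING chain listed above.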
      • Nov 17 07:07:39 cattleh2 kernel: [46619.196230] M-PREROUTING:IN=enp0s8 OUT= MAC=08:00:27:e7:fe:f9:08:00:27:b5:2f:92:08:00 SRC=172.168.1.204 DST=172.168.1.200 LEN=60 TOS=0x10 PREC=0x00 TTL=64 ID=63457 DF PROTO=TCP SPT=54022 DPT=3306 WINDOW=29200 RES=0x00 SYN URGP=0 MARK=0x1068
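    • The SYN packet reaches the FORWARD chain of the mangle table, which writes the log line below (note that the destination is now the container address). It is then ACCEPTed in the FORWARD chain of the FILTER table; the matching rule is in the custom CATTLE_FORWARD chain listed above, and the packet carries MARK=0x1068.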
      • Nov 17 07:07:39 cattleh2 kernel: [46619.196283] M-FORWARD:IN=enp0s8 OUT=docker0 MAC=08:00:27:e7:fe:f9:08:00:27:b5:2f:92:08:00 SRC=172.168.1.204 DST=10.42.158.152 LEN=60 TOS=0x10 PREC=0x00 TTL=63 ID=63457 DF PROTO=TCP SPT=54022 DPT=3306 WINDOW=29200 RES=0x00 SYN URGP=0 MARK=0x1068
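    • The SYN packet reaches the POSTROUTING chain of the mangle table, which writes the log line below. It then passes the POSTROUTING chain of the NAT table without any SNAT, leaves the host's protocol stack and finally reaches the container. The container answers with a SYN+ACK packet; what happens inside the container is not traced in this article.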
      • Nov 17 07:07:39 cattleh2 kernel: [46619.196296] M-POSTROUTING:IN= OUT=docker0 SRC=172.168.1.204 DST=10.42.158.152 LEN=60 TOS=0x10 PREC=0x00 TTL=63 ID=63457 DF PROTO=TCP SPT=54022 DPT=3306 WINDOW=29200 RES=0x00 SYN URGP=0 MARK=0x1068
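    • The SYN+ACK packet reaches the PREROUTING chain of the mangle table, which writes the log line below. It then passes the PREROUTING stage of the NAT table, where no DNAT is applied.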
      • Nov 17 07:07:39 cattleh2 kernel: [46619.196717] M-PREROUTING:IN=docker0 OUT= PHYSIN=vethr1368e17ce3 MAC=02:42:2f:b4:a7:5d:02:24:37:a3:1c:c0:08:00 SRC=10.42.158.152 DST=172.168.1.204 LEN=60 TOS=0x00 PREC=0x00 TTL=64 ID=0 DF PROTO=TCP SPT=3306 DPT=54022 WINDOW=27000 RES=0x00 ACK SYN URGP=0
    • The SYN+ACK packet reaches the FORWARD chain of the mangle table, which writes the log line below, and is ACCEPTed by default in the FORWARD chain of the FILTER table.
      • Nov 17 07:07:39 cattleh2 kernel: [46619.196729] M-FORWARD:IN=docker0 OUT=enp0s8 PHYSIN=vethr1368e17ce3 MAC=02:42:2f:b4:a7:5d:02:24:37:a3:1c:c0:08:00 SRC=10.42.158.152 DST=172.168.1.204 LEN=60 TOS=0x00 PREC=0x00 TTL=63 ID=0 DF PROTO=TCP SPT=3306 DPT=54022 WINDOW=27000 RES=0x00 ACK SYN URGP=0
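    • The SYN+ACK packet reaches the POSTROUTING chain of the mangle table, which writes the log line below, and then passes the POSTROUTING stage of the NAT table, where its source address is translated back to the host address. The CATTLE_HOSTPORTS_POSTROUTING rule listed above only matches traffic the container sends to itself, so this translation is most likely connection tracking reversing the earlier DNAT rather than a match on that rule. The packet then leaves the host's protocol stack and reaches the network interface of the host that initiated the connection.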
      • Nov 17 07:07:39 cattleh2 kernel: [46619.196735] M-POSTROUTING:IN= OUT=enp0s8 PHYSIN=vethr1368e17ce3 SRC=10.42.158.152 DST=172.168.1.204 LEN=60 TOS=0x00 PREC=0x00 TTL=63 ID=0 DF PROTO=TCP SPT=3306 DPT=54022 WINDOW=27000 RES=0x00 ACK SYN URGP=0
    • The ACK packet reaches the host and is handled the same way as the SYN packet in the first three steps.
      • Nov 17 07:07:39 cattleh2 kernel: [46619.196984] M-PREROUTING:IN=enp0s8 OUT= MAC=08:00:27:e7:fe:f9:08:00:27:b5:2f:92:08:00 SRC=172.168.1.204 DST=172.168.1.200 LEN=52 TOS=0x10 PREC=0x00 TTL=64 ID=63458 DF PROTO=TCP SPT=54022 DPT=3306 WINDOW=229 RES=0x00 ACK URGP=0 MARK=0x1068
      • Nov 17 07:07:39 cattleh2 kernel: [46619.196996] M-FORWARD:IN=enp0s8 OUT=docker0 MAC=08:00:27:e7:fe:f9:08:00:27:b5:2f:92:08:00 SRC=172.168.1.204 DST=10.42.158.152 LEN=52 TOS=0x10 PREC=0x00 TTL=63 ID=63458 DF PROTO=TCP SPT=54022 DPT=3306 WINDOW=229 RES=0x00 ACK URGP=0 MARK=0x1068
      • Nov 17 07:07:39 cattleh2 kernel: [46619.197002] M-POSTROUTING:IN= OUT=docker0 SRC=172.168.1.204 DST=10.42.158.152 LEN=52 TOS=0x10 PREC=0x00 TTL=63 ID=63458 DF PROTO=TCP SPT=54022 DPT=3306 WINDOW=229 RES=0x00 ACK URGP=0 MARK=0x1068
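
The log lines above are long, and the interesting part is usually just the hook, the interfaces and the address pair. The following sketch condenses each line to those fields so the DNAT rewrite stands out. `summarize_hops` is a hypothetical helper name, and the parser assumes the exact prefix format used in this article (`M-<CHAIN>:` glued directly to the `IN=` field).

```shell
#!/bin/sh
# Hypothetical helper: condense kernel LOG lines (stdin) to
# "<hook> in=<iface> out=<iface> <src> -> <dst>".
summarize_hops() {
    awk '
    {
        hook = ""; inif = ""; outif = ""; src = ""; dst = ""
        for (i = 1; i <= NF; i++) {
            if ($i ~ /^M-[A-Z]+:/) {
                # The prefix is glued to the IN= field: "M-PREROUTING:IN=enp0s8"
                split($i, p, ":")
                hook = p[1]
                inif = substr(p[2], 4)
            } else if ($i ~ /^OUT=/) {
                outif = substr($i, 5)
            } else if ($i ~ /^SRC=/) {
                src = substr($i, 5)
            } else if ($i ~ /^DST=/) {
                dst = substr($i, 5)
            }
        }
        if (hook == "") next
        if (inif == "") inif = "-"
        if (outif == "") outif = "-"
        printf "%s in=%s out=%s %s -> %s\n", hook, inif, outif, src, dst
    }'
}

# Demo on the first two SYN log lines from the trace above.
summarize_hops <<'EOF'
Nov 17 07:07:39 cattleh2 kernel: [46619.196230] M-PREROUTING:IN=enp0s8 OUT= MAC=08:00:27:e7:fe:f9:08:00:27:b5:2f:92:08:00 SRC=172.168.1.204 DST=172.168.1.200 LEN=60 TOS=0x10 PREC=0x00 TTL=64 ID=63457 DF PROTO=TCP SPT=54022 DPT=3306 WINDOW=29200 RES=0x00 SYN URGP=0 MARK=0x1068
Nov 17 07:07:39 cattleh2 kernel: [46619.196283] M-FORWARD:IN=enp0s8 OUT=docker0 MAC=08:00:27:e7:fe:f9:08:00:27:b5:2f:92:08:00 SRC=172.168.1.204 DST=10.42.158.152 LEN=60 TOS=0x10 PREC=0x00 TTL=63 ID=63457 DF PROTO=TCP SPT=54022 DPT=3306 WINDOW=29200 RES=0x00 SYN URGP=0 MARK=0x1068
EOF
```

In the condensed output the destination changes from 172.168.1.200 at M-PREROUTING to 10.42.158.152 at M-FORWARD, which is the DNAT step described above. On a live host the input could come from `grep 'M-' /var/log/kern.log | summarize_hops`.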

 

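  • Accessing port 3306 from the host itself (127.0.0.1). The flow is analogous to the cross-host case, except that, as the logs show, locally generated packets enter at OUTPUT instead of PREROUTING (the destination is rewritten between M-OUTPUT and M-POSTROUTING, consistent with the CATTLE_OUTPUT rule above) and the replies are delivered through INPUT instead of FORWARD. Below are the log lines written by the mangle table along the way; the analysis above can be used as a guide for reading them.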
    • Nov 17 07:52:36 cattleh2 kernel: [49315.906692] M-OUTPUT:IN= OUT=lo SRC=127.0.0.1 DST=127.0.0.1 LEN=60 TOS=0x10 PREC=0x00 TTL=64 ID=19503 DF PROTO=TCP SPT=49508 DPT=3306 WINDOW=43690 RES=0x00 SYN URGP=0
    • Nov 17 07:52:36 cattleh2 kernel: [49315.906708] M-POSTROUTING:IN= OUT=docker0 SRC=127.0.0.1 DST=10.42.158.152 LEN=60 TOS=0x10 PREC=0x00 TTL=64 ID=19503 DF PROTO=TCP SPT=49508 DPT=3306 WINDOW=43690 RES=0x00 SYN URGP=0
    • Nov 17 07:52:36 cattleh2 kernel: [49315.906776] M-PREROUTING:IN=docker0 OUT= PHYSIN=vethr1368e17ce3 MAC=02:42:2f:b4:a7:5d:02:24:37:a3:1c:c0:08:00 SRC=10.42.158.152 DST=10.42.0.1 LEN=60 TOS=0x00 PREC=0x00 TTL=64 ID=0 DF PROTO=TCP SPT=3306 DPT=49508 WINDOW=27000 RES=0x00 ACK SYN URGP=0
    • Nov 17 07:52:36 cattleh2 kernel: [49315.906791] M-INPUT:IN=docker0 OUT= PHYSIN=vethr1368e17ce3 MAC=02:42:2f:b4:a7:5d:02:24:37:a3:1c:c0:08:00 SRC=10.42.158.152 DST=127.0.0.1 LEN=60 TOS=0x00 PREC=0x00 TTL=64 ID=0 DF PROTO=TCP SPT=3306 DPT=49508 WINDOW=27000 RES=0x00 ACK SYN URGP=0
    • Nov 17 07:52:36 cattleh2 kernel: [49315.906806] M-OUTPUT:IN= OUT=lo SRC=127.0.0.1 DST=127.0.0.1 LEN=52 TOS=0x10 PREC=0x00 TTL=64 ID=19504 DF PROTO=TCP SPT=49508 DPT=3306 WINDOW=342 RES=0x00 ACK URGP=0
    • Nov 17 07:52:36 cattleh2 kernel: [49315.906812] M-POSTROUTING:IN= OUT=docker0 SRC=127.0.0.1 DST=10.42.158.152 LEN=52 TOS=0x10 PREC=0x00 TTL=64 ID=19504 DF PROTO=TCP SPT=49508 DPT=3306 WINDOW=342 RES=0x00 ACK URGP=0
    • Nov 17 07:52:36 cattleh2 kernel: [49315.907322] M-PREROUTING:IN=docker0 OUT= PHYSIN=vethr1368e17ce3 MAC=02:42:2f:b4:a7:5d:02:24:37:a3:1c:c0:08:00 SRC=10.42.158.152 DST=10.42.0.1 LEN=147 TOS=0x08 PREC=0x00 TTL=64 ID=34614 DF PROTO=TCP SPT=3306 DPT=49508 WINDOW=211 RES=0x00 ACK PSH URGP=0
    • Nov 17 07:52:36 cattleh2 kernel: [49315.907331] M-INPUT:IN=docker0 OUT= PHYSIN=vethr1368e17ce3 MAC=02:42:2f:b4:a7:5d:02:24:37:a3:1c:c0:08:00 SRC=10.42.158.152 DST=127.0.0.1 LEN=147 TOS=0x08 PREC=0x00 TTL=64 ID=34614 DF PROTO=TCP SPT=3306 DPT=49508 WINDOW=211 RES=0x00 ACK PSH URGP=0
    • Nov 17 07:52:36 cattleh2 kernel: [49315.907362] M-OUTPUT:IN= OUT=lo SRC=127.0.0.1 DST=127.0.0.1 LEN=52 TOS=0x10 PREC=0x00 TTL=64 ID=19505 DF PROTO=TCP SPT=49508 DPT=3306 WINDOW=342 RES=0x00 ACK URGP=0
    • Nov 17 07:52:36 cattleh2 kernel: [49315.907367] M-POSTROUTING:IN= OUT=docker0 SRC=127.0.0.1 DST=10.42.158.152 LEN=52 TOS=0x10 PREC=0x00 TTL=64 ID=19505 DF PROTO=TCP SPT=49508 DPT=3306 WINDOW=342 RES=0x00 ACK URGP=0
    • Nov 17 07:52:37 cattleh2 kernel: [49317.290761] M-OUTPUT:IN= OUT=lo SRC=127.0.0.1 DST=127.0.0.1 LEN=54 TOS=0x10 PREC=0x00 TTL=64 ID=19506 DF PROTO=TCP SPT=49508 DPT=3306 WINDOW=342 RES=0x00 ACK PSH URGP=0
    • Nov 17 07:52:37 cattleh2 kernel: [49317.290777] M-POSTROUTING:IN= OUT=docker0 SRC=127.0.0.1 DST=10.42.158.152 LEN=54 TOS=0x10 PREC=0x00 TTL=64 ID=19506 DF PROTO=TCP SPT=49508 DPT=3306 WINDOW=342 RES=0x00 ACK PSH URGP=0
    • Nov 17 07:52:37 cattleh2 kernel: [49317.290939] M-PREROUTING:IN=docker0 OUT= PHYSIN=vethr1368e17ce3 MAC=02:42:2f:b4:a7:5d:02:24:37:a3:1c:c0:08:00 SRC=10.42.158.152 DST=10.42.0.1 LEN=52 TOS=0x08 PREC=0x00 TTL=64 ID=34615 DF PROTO=TCP SPT=3306 DPT=49508 WINDOW=211 RES=0x00 ACK URGP=0
    • Nov 17 07:52:37 cattleh2 kernel: [49317.290948] M-INPUT:IN=docker0 OUT= PHYSIN=vethr1368e17ce3 MAC=02:42:2f:b4:a7:5d:02:24:37:a3:1c:c0:08:00 SRC=10.42.158.152 DST=127.0.0.1 LEN=52 TOS=0x08 PREC=0x00 TTL=64 ID=34615 DF PROTO=TCP SPT=3306 DPT=49508 WINDOW=211 RES=0x00 ACK URGP=0
    • Nov 17 07:52:38 cattleh2 kernel: [49317.473238] M-OUTPUT:IN= OUT=lo SRC=127.0.0.1 DST=127.0.0.1 LEN=54 TOS=0x10 PREC=0x00 TTL=64 ID=19507 DF PROTO=TCP SPT=49508 DPT=3306 WINDOW=342 RES=0x00 ACK PSH URGP=0
    • Nov 17 07:52:38 cattleh2 kernel: [49317.473255] M-POSTROUTING:IN= OUT=docker0 SRC=127.0.0.1 DST=10.42.158.152 LEN=54 TOS=0x10 PREC=0x00 TTL=64 ID=19507 DF PROTO=TCP SPT=49508 DPT=3306 WINDOW=342 RES=0x00 ACK PSH URGP=0
    • Nov 17 07:52:38 cattleh2 kernel: [49317.473378] M-PREROUTING:IN=docker0 OUT= PHYSIN=vethr1368e17ce3 MAC=02:42:2f:b4:a7:5d:02:24:37:a3:1c:c0:08:00 SRC=10.42.158.152 DST=10.42.0.1 LEN=52 TOS=0x08 PREC=0x00 TTL=64 ID=34616 DF PROTO=TCP SPT=3306 DPT=49508 WINDOW=211 RES=0x00 ACK URGP=0
    • Nov 17 07:52:38 cattleh2 kernel: [49317.473386] M-INPUT:IN=docker0 OUT= PHYSIN=vethr1368e17ce3 MAC=02:42:2f:b4:a7:5d:02:24:37:a3:1c:c0:08:00 SRC=10.42.158.152 DST=127.0.0.1 LEN=52 TOS=0x08 PREC=0x00 TTL=64 ID=34616 DF PROTO=TCP SPT=3306 DPT=49508 WINDOW=211 RES=0x00 ACK URGP=0
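
With this many lines, it helps to group the entries that belong to the same packet and list the hooks it crossed. The sketch below keys each line on its source port and IP ID. `trace_by_id` is a hypothetical helper name, and the grouping is an assumption that holds for a short trace like this one: packets sent with `ID=0` (such as the SYN+ACK here) or traces spanning many connections could collide under the same key.

```shell
#!/bin/sh
# Hypothetical helper: group kernel LOG lines (stdin) by source port and IP ID
# and print the sequence of mangle hooks each packet crossed.
trace_by_id() {
    awk '
    {
        hook = ""; id = ""; spt = ""
        for (i = 1; i <= NF; i++) {
            if ($i ~ /^M-[A-Z]+:/) {
                split($i, p, ":")
                hook = p[1]
            } else if ($i ~ /^ID=/) {
                id = substr($i, 4)
            } else if ($i ~ /^SPT=/) {
                spt = substr($i, 5)
            }
        }
        if (hook == "") next
        key = "SPT=" spt " ID=" id
        if (key in path) {
            path[key] = path[key] " > " hook
        } else {
            order[++n] = key        # remember first-seen order for output
            path[key] = hook
        }
    }
    END {
        for (k = 1; k <= n; k++) print order[k] ": " path[order[k]]
    }'
}

# Demo on the first two lines of the local-access trace above.
trace_by_id <<'EOF'
Nov 17 07:52:36 cattleh2 kernel: [49315.906692] M-OUTPUT:IN= OUT=lo SRC=127.0.0.1 DST=127.0.0.1 LEN=60 TOS=0x10 PREC=0x00 TTL=64 ID=19503 DF PROTO=TCP SPT=49508 DPT=3306 WINDOW=43690 RES=0x00 SYN URGP=0
Nov 17 07:52:36 cattleh2 kernel: [49315.906708] M-POSTROUTING:IN= OUT=docker0 SRC=127.0.0.1 DST=10.42.158.152 LEN=60 TOS=0x10 PREC=0x00 TTL=64 ID=19503 DF PROTO=TCP SPT=49508 DPT=3306 WINDOW=43690 RES=0x00 SYN URGP=0
EOF
```

For the two demo lines this prints a single entry showing that the SYN crossed M-OUTPUT and then M-POSTROUTING, with no FORWARD hop in between.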

 

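To sum up, the analysis shows that Rancher does not use docker-proxy to expose services. With Docker's userland proxy, running 10,000 services would mean opening 10,000 ports on the host just to expose them, which is a considerable overhead for the kernel. Rancher's design, which uses iptables to steer packets on the host, is a sound one. Some readers may ask why not use IPVS instead. In this humble editor's view, that could be worth trying as well. Thanks, everyone.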
