k8s single-host container networking (20250216)
The "network stack" that a Linux container can see is in fact isolated inside the container's own Network Namespace.
This "network stack" comprises: the network interfaces (NICs), the loopback device, the routing table, and the iptables rules.
Veth Pair devices
A Veth Pair device has one defining trait: once created, it always appears as two virtual NICs (Veth Peers) that come as a pair, and a packet sent out of one "NIC" shows up directly on its counterpart, even when the two "NICs" live in different Network Namespaces.
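You can reproduce this pairing by hand with iproute2, outside of Docker entirely. A minimal sketch, assuming root on a Linux host; the names ns1/veth0/veth1 and the 10.0.0.0/24 addresses are invented for illustration:

ip netns add ns1                                   # a fresh Network Namespace
ip link add veth0 type veth peer name veth1        # create the pair
ip link set veth1 netns ns1                        # move one end into ns1
ip addr add 10.0.0.1/24 dev veth0
ip link set veth0 up
ip netns exec ns1 ip addr add 10.0.0.2/24 dev veth1
ip netns exec ns1 ip link set veth1 up
ip netns exec ns1 ping -c 1 10.0.0.1               # packet leaves veth1, appears on veth0

This is, in miniature, what Docker does when it starts a container. Back on the test host from the transcript: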
C:\Users\admin>ssh root@192.168.117.207
root@192.168.117.207's password:
Last login: Mon Feb 17 08:19:32 2025
# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
1ee1263b4193 cbb01a7bd410 "/coredns -conf /etc…" 1 second ago Up 1 second k8s_coredns_coredns-857d9ff4c9-29ldj_kube-system_9ee2e5e5-d728-4c02-a87e-8dcaab82fbd7_13
829516e501fa registry.aliyuncs.com/google_containers/pause:3.8 "/pause" 3 seconds ago Up 2 seconds k8s_POD_coredns-857d9ff4c9-29ldj_kube-system_9ee2e5e5-d728-4c02-a87e-8dcaab82fbd7_8
e0c8a6330d0e 9344fce2372f "/usr/local/bin/kube…" 7 seconds ago Up 6 seconds k8s_kube-proxy_kube-proxy-nq4x2_kube-system_a3ee7cb5-f97d-4339-8f9e-01e0e15874ba_9
255fea7d86a5 registry.aliyuncs.com/google_containers/pause:3.8 "/pause" 8 seconds ago Up 7 seconds k8s_POD_calico-node-9fhpq_kube-system_92a3a119-8007-48a9-8743-0afdf65f592c_7
36c5922e79eb registry.aliyuncs.com/google_containers/pause:3.8 "/pause" 9 seconds ago Up 8 seconds k8s_POD_kube-proxy-nq4x2_kube-system_a3ee7cb5-f97d-4339-8f9e-01e0e15874ba_7
1cfe981dc26a a0eed15eed44 "etcd --advertise-cl…" 23 seconds ago Up 23 seconds k8s_etcd_etcd-k8s-master_kube-system_e4b42e5b51c6629d934233cc43f26a22_9
17717a8530ef 6fc5e6b7218c "kube-scheduler --au…" 23 seconds ago Up 23 seconds k8s_kube-scheduler_kube-scheduler-k8s-master_kube-system_299cca9182c20d90f643981b13c43213_16
e0df13dfff62 8a9000f98a52 "kube-apiserver --ad…" 24 seconds ago Up 23 seconds k8s_kube-apiserver_kube-apiserver-k8s-master_kube-system_bc05f019b265f704d6a2ffb204a2c88f_10
6a21496a57a4 138fb5a3a2e3 "kube-controller-man…" 24 seconds ago Up 23 seconds k8s_kube-controller-manager_kube-controller-manager-k8s-master_kube-system_51eafc84967051e22b58cf0ebce14e35_15
5631104357a5 registry.aliyuncs.com/google_containers/pause:3.8 "/pause" 25 seconds ago Up 25 seconds k8s_POD_kube-apiserver-k8s-master_kube-system_bc05f019b265f704d6a2ffb204a2c88f_7
562543f7a8d6 registry.aliyuncs.com/google_containers/pause:3.8 "/pause" 25 seconds ago Up 25 seconds k8s_POD_kube-controller-manager-k8s-master_kube-system_51eafc84967051e22b58cf0ebce14e35_7
16dbdd75513f registry.aliyuncs.com/google_containers/pause:3.8 "/pause" 25 seconds ago Up 25 seconds k8s_POD_kube-scheduler-k8s-master_kube-system_299cca9182c20d90f643981b13c43213_7
5bfab6a1a042 registry.aliyuncs.com/google_containers/pause:3.8 "/pause" 26 seconds ago Up 25 seconds k8s_POD_etcd-k8s-master_kube-system_e4b42e5b51c6629d934233cc43f26a22_7
# docker start nginx-1
nginx-1
# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
063af5a1782b 17e960f4e39c "start_runit" 12 seconds ago Up 12 seconds k8s_calico-node_calico-node-9fhpq_kube-system_92a3a119-8007-48a9-8743-0afdf65f592c_66
133fda8d5c2f cbb01a7bd410 "/coredns -conf /etc…" 22 seconds ago Up 21 seconds k8s_coredns_coredns-857d9ff4c9-ntrmg_kube-system_9a07dc52-b60a-4376-add2-5a128335c9df_12
2cad37aaa64d 08c1b67c88ce "/usr/bin/kube-contr…" 22 seconds ago Up 22 seconds k8s_calico-kube-controllers_calico-kube-controllers-558d465845-x59c8_kube-system_1586cb4f-6051-4cf2-bcbc-7a05f93739ee_11
245ed185ea4a registry.aliyuncs.com/google_containers/pause:3.8 "/pause" 27 seconds ago Up 26 seconds k8s_POD_coredns-857d9ff4c9-ntrmg_kube-system_9a07dc52-b60a-4376-add2-5a128335c9df_8
60a93585eea1 registry.aliyuncs.com/google_containers/pause:3.8 "/pause" 27 seconds ago Up 27 seconds k8s_POD_calico-kube-controllers-558d465845-x59c8_kube-system_1586cb4f-6051-4cf2-bcbc-7a05f93739ee_9
1ee1263b4193 cbb01a7bd410 "/coredns -conf /etc…" 45 seconds ago Up 45 seconds k8s_coredns_coredns-857d9ff4c9-29ldj_kube-system_9ee2e5e5-d728-4c02-a87e-8dcaab82fbd7_13
829516e501fa registry.aliyuncs.com/google_containers/pause:3.8 "/pause" 47 seconds ago Up 46 seconds k8s_POD_coredns-857d9ff4c9-29ldj_kube-system_9ee2e5e5-d728-4c02-a87e-8dcaab82fbd7_8
e0c8a6330d0e 9344fce2372f "/usr/local/bin/kube…" 51 seconds ago Up 50 seconds k8s_kube-proxy_kube-proxy-nq4x2_kube-system_a3ee7cb5-f97d-4339-8f9e-01e0e15874ba_9
255fea7d86a5 registry.aliyuncs.com/google_containers/pause:3.8 "/pause" 52 seconds ago Up 51 seconds k8s_POD_calico-node-9fhpq_kube-system_92a3a119-8007-48a9-8743-0afdf65f592c_7
36c5922e79eb registry.aliyuncs.com/google_containers/pause:3.8 "/pause" 53 seconds ago Up 52 seconds k8s_POD_kube-proxy-nq4x2_kube-system_a3ee7cb5-f97d-4339-8f9e-01e0e15874ba_7
1cfe981dc26a a0eed15eed44 "etcd --advertise-cl…" About a minute ago Up About a minute k8s_etcd_etcd-k8s-master_kube-system_e4b42e5b51c6629d934233cc43f26a22_9
17717a8530ef 6fc5e6b7218c "kube-scheduler --au…" About a minute ago Up About a minute k8s_kube-scheduler_kube-scheduler-k8s-master_kube-system_299cca9182c20d90f643981b13c43213_16
e0df13dfff62 8a9000f98a52 "kube-apiserver --ad…" About a minute ago Up About a minute k8s_kube-apiserver_kube-apiserver-k8s-master_kube-system_bc05f019b265f704d6a2ffb204a2c88f_10
6a21496a57a4 138fb5a3a2e3 "kube-controller-man…" About a minute ago Up About a minute k8s_kube-controller-manager_kube-controller-manager-k8s-master_kube-system_51eafc84967051e22b58cf0ebce14e35_15
5631104357a5 registry.aliyuncs.com/google_containers/pause:3.8 "/pause" About a minute ago Up About a minute k8s_POD_kube-apiserver-k8s-master_kube-system_bc05f019b265f704d6a2ffb204a2c88f_7
562543f7a8d6 registry.aliyuncs.com/google_containers/pause:3.8 "/pause" About a minute ago Up About a minute k8s_POD_kube-controller-manager-k8s-master_kube-system_51eafc84967051e22b58cf0ebce14e35_7
16dbdd75513f registry.aliyuncs.com/google_containers/pause:3.8 "/pause" About a minute ago Up About a minute k8s_POD_kube-scheduler-k8s-master_kube-system_299cca9182c20d90f643981b13c43213_7
5bfab6a1a042 registry.aliyuncs.com/google_containers/pause:3.8 "/pause" About a minute ago Up About a minute k8s_POD_etcd-k8s-master_kube-system_e4b42e5b51c6629d934233cc43f26a22_7
d85077c98a69 nginx "/docker-entrypoint.…" 18 hours ago Up 12 seconds 80/tcp nginx-1
# docker exec -it nginx-1 /bin/bash
root@d85077c98a69:/# ifconfig
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 172.17.0.2  netmask 255.255.0.0  broadcast 172.17.255.255
        ether 02:42:ac:11:00:02  txqueuelen 0  (Ethernet)
        RX packets 14  bytes 1252 (1.2 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        inet6 ::1  prefixlen 128  scopeid 0x10<host>
        loop  txqueuelen 1000  (Local Loopback)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0
root@d85077c98a69:/# route
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
default 172.17.0.1 0.0.0.0 UG 0 0 0 eth0
172.17.0.0 0.0.0.0 255.255.0.0 U 0 0 0 eth0
# back on the host
root@d85077c98a69:/# exit
exit
# ifconfig
cali6632e2eedff: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet6 fe80::ecee:eeff:feee:eeee  prefixlen 64  scopeid 0x20<link>
        ether ee:ee:ee:ee:ee:ee  txqueuelen 1000  (Ethernet)
        RX packets 3  bytes 125 (125.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 8  bytes 770 (770.0 B)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0

cali7b6489f2f47: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet6 fe80::ecee:eeff:feee:eeee  prefixlen 64  scopeid 0x20<link>
        ether ee:ee:ee:ee:ee:ee  txqueuelen 1000  (Ethernet)
        RX packets 3  bytes 125 (125.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 8  bytes 770 (770.0 B)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0

calieaec58fb34e: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet6 fe80::ecee:eeff:feee:eeee  prefixlen 64  scopeid 0x20<link>
        ether ee:ee:ee:ee:ee:ee  txqueuelen 1000  (Ethernet)
        RX packets 3  bytes 125 (125.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 8  bytes 770 (770.0 B)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0

docker0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 172.17.0.1  netmask 255.255.0.0  broadcast 172.17.255.255
        inet6 fe80::42:5fff:fe05:698c  prefixlen 64  scopeid 0x20<link>
        ether 02:42:5f:05:69:8c  txqueuelen 0  (Ethernet)
        RX packets 3  bytes 125 (125.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 8  bytes 770 (770.0 B)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0

ens33: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.117.207  netmask 255.255.255.0  broadcast 192.168.117.255
        inet6 fe80::20c:29ff:fe96:278c  prefixlen 64  scopeid 0x20<link>
        ether 00:0c:29:96:27:8c  txqueuelen 1000  (Ethernet)
        RX packets 554  bytes 64561 (63.0 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 596  bytes 65850 (64.3 KiB)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        inet6 ::1  prefixlen 128  scopeid 0x10<host>
        loop  txqueuelen 1000  (Local Loopback)
        RX packets 49719  bytes 16290594 (15.5 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 49719  bytes 16290594 (15.5 MiB)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0

veth6881202: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet6 fe80::408f:cdff:fe98:623a  prefixlen 64  scopeid 0x20<link>
        ether 42:8f:cd:98:62:3a  txqueuelen 0  (Ethernet)
        RX packets 3  bytes 167 (167.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 18  bytes 1566 (1.5 KiB)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0
#
# brctl show
bridge name bridge id STP enabled interfaces
docker0 8000.02425f05698c no veth6881202
This is what makes the Veth Pair so commonly used as a "network cable" connecting different Network Namespaces.
We started a container named nginx-1.
Inside that container there is a NIC called eth0, and it is exactly the in-container end of a Veth Pair device.
Checking nginx-1's routing table with the route command, we can see that eth0 is the container's default route device, and that all requests for the 172.17.0.0/16 subnet are also handed to eth0 (the second routing rule, the one for 172.17.0.0).
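To confirm which rule a given destination would match, you can ask the kernel directly with ip route get. A sketch, assuming iproute2 is available inside the image (this nginx image demonstrably ships net-tools, but ip may need installing; 172.17.0.3 is a neighboring-container address used for illustration):

ip route get 172.17.0.3   # prints something like: 172.17.0.3 dev eth0 src 172.17.0.2
ip route get 8.8.8.8      # prints something like: 8.8.8.8 via 172.17.0.1 dev eth0 src 172.17.0.2

The first answer has no gateway (a direct route); the second goes via 172.17.0.1, i.e. docker0.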
From the host's ifconfig output you can see the other end: the Veth Pair device that belongs to nginx-1 appears on the host as a virtual NIC named veth6881202.
And the brctl show output shows that this NIC has been "plugged into" docker0.
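With several containers running, it is not obvious which host-side vethXXX belongs to which container. One standard trick (not shown in the transcript, but plain kernel behavior): a veth exposes its peer's interface index as iflink, so you can match the two ends:

docker exec nginx-1 cat /sys/class/net/eth0/iflink   # prints the host-side ifindex, e.g. 7
ip -o link | grep '^7: '                             # on the host: 7: veth6881202@if6: ...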
Now, if we start a second Docker container on this host, say nginx-2:
# brctl show
bridge name bridge id STP enabled interfaces
docker0 8000.02425f05698c no veth6881202
# docker run -d --name nginx-2 nginx
e3b1a33fa82952f99bdf47e1451d05d83a9686cb006798744d2e593f02cf65c8
# brctl show
bridge name bridge id STP enabled interfaces
docker0 8000.02425f05698c no veth40408f3
veth6881202
# check the container's IP
#
# docker inspect nginx-1
[
{
"Id": "d85077c98a69846efe9bf17c4b1b4efb2152ec2078f5de483edc524c674eed76",
"Created": "2025-02-16T06:21:15.681636573Z",
"Path": "/docker-entrypoint.sh",
----------
"Links": null,
"Aliases": null,
"MacAddress": "02:42:ac:11:00:02",
"DriverOpts": null,
"NetworkID": "5ce1ccec1789844b6a4712acd0c8d6f0ef9fba840c00f53be667a0dd6fbae39c",
"EndpointID": "786e7d287ca79fda20dc3895bb64b9830a99f1989538fd503e9f877e4ad574f3",
"Gateway": "172.17.0.1",
"IPAddress": "172.17.0.2",
"IPPrefixLen": 16,
"IPv6Gateway": "",
"GlobalIPv6Address": "",
"GlobalIPv6PrefixLen": 0,
"DNSNames": null
}
}
}
}
]
The container's IP is given by the "IPAddress" field: "172.17.0.2".
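Scrolling through the full docker inspect JSON is unnecessary if you only want the address; the docker CLI's -f flag takes a Go template:

docker inspect -f '{{.NetworkSettings.IPAddress}}' nginx-1
172.17.0.2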
Now exec into nginx-2 and ping nginx-1 (curl works just as well):
# docker exec -it nginx-2 /bin/bash
root@e3b1a33fa829:/# ping 172.17.0.2
bash: ping: command not found
root@e3b1a33fa829:/# yum -y install ping
bash: yum: command not found
root@e3b1a33fa829:/# apt-get install -y iputils-ping
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
E: Unable to locate package iputils-ping
root@e3b1a33fa829:/# curl http://172.17.0.2
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>
<p>For online documentation and support please refer to
<a target="_blank" href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a target="_blank" href="http://nginx.com/">nginx.com</a>.</p>
<p><em>Thank you for using nginx.</em></p>
</body>
</html>
root@e3b1a33fa829:/#
https://static001.geekbang.org/resource/image/e0/66/e0d28e0371f93af619e91a86eda99a66.png?wh=1715*995
When you access nginx-2's IP address from inside nginx-1 (say, ping 172.17.0.3), the destination IP matches the second routing rule in nginx-1's routing table. As you can see, that rule's gateway is 0.0.0.0, which makes it a direct route: any IP packet matching it should leave through the local eth0 NIC and be delivered to the destination directly over the Layer-2 network.
This eth0 NIC is one end of a Veth Pair: this end sits inside nginx-1's Network Namespace, while the other end sits on the host (in the host namespace), "plugged into" the host's docker0 bridge.
Once a virtual NIC is "plugged into" a bridge, it becomes a "slave device" of that bridge. A slave device is stripped of the right to invoke the network protocol stack to process packets itself, demoted to a mere port on the bridge. That port's only job is to receive incoming packets and hand complete authority over their fate (forwarding or dropping, for example) to the bridge.
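"Plugging into a bridge" is, concretely, just setting the bridge as the interface's master. A sketch with iproute2, reusing the hand-made veth0 from the earlier example as a stand-in:

ip link set veth0 master docker0      # enslave veth0 to the docker0 bridge
ip link show master docker0           # list all ports of the bridge

This is exactly the relationship that brctl show displayed above.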
To deliver the packet over Layer 2, nginx-1's network stack first needs the MAC address behind 172.17.0.3, so it sends an ARP broadcast out of eth0, which the Veth Pair carries straight onto docker0. On receiving this ARP request, the docker0 bridge plays the role of a Layer-2 switch and forwards the broadcast to the other virtual NICs "plugged into" it. The network stack of nginx-2, also attached to docker0, thus receives the request and replies to nginx-1 with the MAC address for 172.17.0.3.
With that destination MAC address in hand, nginx-1's eth0 NIC can finally send the packet on its way.
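You can watch this exchange on the bridge itself. A sketch, assuming tcpdump is installed on the host (it often is not by default); note that ARP only appears on the wire when the container's ARP cache has no entry yet:

tcpdump -ni docker0 arp                                     # terminal 1: listen on the bridge
docker exec nginx-2 curl -s http://172.17.0.2 >/dev/null    # terminal 2: trigger traffic

Expected output in terminal 1, roughly:
ARP, Request who-has 172.17.0.2 tell 172.17.0.3, length 28
ARP, Reply 172.17.0.2 is-at 02:42:ac:11:00:02, length 28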
So a container process confined to its Network Namespace actually exchanges data with other containers on the same host via this combination of a Veth Pair device plus the host bridge.
When a container tries to reach another host, say ping 10.168.0.3, the request packet first passes through the docker0 bridge and appears on the host. Then, following the direct route in the host's routing table (10.168.0.0/24 via eth0), the request for 10.168.0.3 is handed to the host's eth0, forwarded out onto the host network, and eventually arrives at the host 10.168.0.3. Naturally, this works only if the two hosts themselves are connected.
https://static001.geekbang.org/resource/image/90/95/90bd630c0723ea8a1fb7ccd738ad1f95.png?wh=1834*994
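Both halves of that path can be verified from the host. The addresses here follow the text's 10.168.0.0/24 example; on the machine in this transcript the host network is actually 192.168.117.0/24 on ens33:

ip route                                       # look for the direct route on the physical NIC
iptables -t nat -S POSTROUTING | grep 172.17   # Docker's SNAT rule for outbound container traffic

Typically the second command shows the rule Docker installs by default:
-A POSTROUTING -s 172.17.0.0/16 ! -o docker0 -j MASQUERADE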
So whenever a container cannot reach the "outside world", the first thing to try is whether it can ping the docker0 bridge, and then to check the iptables rules related to docker0 and the Veth Pair devices for anything abnormal; that is usually enough to find the answer.
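Put concretely, a first-pass checklist might look like this (a sketch; adjust addresses to your environment):

docker exec nginx-1 ping -c 1 172.17.0.1   # 1. can the container reach docker0? (ping may need installing, as seen above)
sysctl net.ipv4.ip_forward                 # 2. is IP forwarding on? should print = 1
iptables-save | grep -i docker             # 3. anything abnormal in the Docker-related chains?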
veth pair: virtual NIC 1 - docker0 - virtual NIC 2. Each of them carries an address, but virtual NICs 1 and 2 do not process packets themselves; the docker0 bridge does the processing and performs the forwarding.
The "cross-host communication" problem
Suppose there is a Docker container on another host (say 10.168.0.3) as well. How would our nginx-1 container reach it?
Under Docker's default configuration, the docker0 bridge on one host has no relationship whatsoever with the docker0 bridges on other hosts, and there is no way for them to interconnect. Consequently, containers attached to these bridges have no way to communicate with each other either.
But what if we used software to create a bridge "shared" by the entire cluster, and attached every container in the cluster to it? Then they could all talk to one another.
https://static001.geekbang.org/resource/image/b4/3d/b4387a992352109398a66d1dbe6e413d.png?wh=1828*721
The core of building this kind of container network is: on top of the existing host network, we use software to construct a virtual network that overlays the host network and joins all the containers together. That is why this technique is called an Overlay Network.
The Overlay Network itself can be made up of one "special bridge" on each host. For example, when Container 1 on Node 1 wants to reach Container 3 on Node 2, the "special bridge" on Node 1, after receiving the packet, must have some way of sending it to the correct host, Node 2; and the "special bridge" on Node 2, after receiving the packet, must in turn have some way of forwarding it to the correct container, Container 3.
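One concrete "some way" of doing that sending is an encapsulating device such as VXLAN, conceptually what Flannel's vxlan backend sets up for you. A hand-rolled sketch with an invented VNI and multicast group, only to show the mechanism:

ip link add vxlan0 type vxlan id 42 group 239.1.1.1 dev ens33 dstport 4789
ip addr add 172.18.0.1/16 dev vxlan0      # give each node a different address here
ip link set vxlan0 up

Attach vxlan0 and the local containers' veth ends to one bridge on every node, and that bridge becomes the per-node piece of the "shared" overlay.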
In fact, the hosts do not even need such a special bridge at all: simply configuring each host's routing table in some way is enough to forward packets to the correct host.
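This route-only approach is essentially what Flannel's host-gw mode and Calico do. A sketch with hypothetical per-node container subnets:

ip route add 172.17.1.0/24 via 10.168.0.3 dev eth0   # on Node 1 (10.168.0.2): Node 2's containers via Node 2
ip route add 172.17.0.0/24 via 10.168.0.2 dev eth0   # on Node 2 (10.168.0.3): the mirror rule

The catch: the next hop must be directly reachable, so this only works when the nodes share a Layer-2 segment.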
The key point here is that for a container to communicate with the outside world, the IP packets it sends must first leave its Network Namespace and arrive on the host. And the way to achieve that is exactly what we saw: create a Veth Pair device with one end acting as the container's default NIC and the other end sitting on the host.