Creating a Highly Available Kubernetes Cluster with an External etcd Cluster Using kubeadm

Preface

The official Kubernetes documentation (including the Chinese translation) is comprehensive, clearly written, and full of examples and explanations.
Whatever your situation, it is worth spending a few hours reading through the official documentation first, to understand the options available during configuration and the problems you may run into.
This article organizes the deployment process based on the "Getting started - Production environment" chapter of the official documentation.
Kubernetes Documentation | Kubernetes
Architecture


  • OS: Debian 12
  • CGroup Driver: systemd
  • Container Runtime: containerd
  • CNI: Calico
  • Kubernetes: v1.32.0
Note
Swap must be disabled on every node server; a minimal sketch of how to do that follows.
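The commands below are one common way to turn swap off immediately and keep it off across reboots; they are not part of the original write-up, and the sed expression assumes a standard /etc/fstab swap entry, so review the file before editing it blindly.
swapoff -a                              # disable all active swap devices for the current boot
cp /etc/fstab /etc/fstab.bak            # keep a backup before editing
sed -i '/\sswap\s/s/^/#/' /etc/fstab    # comment out swap entries so they do not return after reboot
free -h                                 # the Swap line should now show 0B
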


  • Other

    • Notes

      • This server runs applications outside of Kubernetes, including Nginx, Nexus, and others
      • All workloads on this server are managed with docker-compose
      • "All nodes" in the Kubernetes configuration steps below does not include this server

    • Server

      • vCPU: 2
      • Memory: 4G

    • Network: 192.168.1.100 2E:7E:86:3A:A5:20
    • Port:

      • 8443/tcp: load balancing for the Kubernetes APIServer, provided to the cluster


  • Etcd

    • Server

      • vCPU: 1
      • Memory: 1G

    • Network

      • Etcd-01: 192.168.1.101 2E:7E:86:3A:A5:21
      • Etcd-02: 192.168.1.102 2E:7E:86:3A:A5:22
      • Etcd-03: 192.168.1.103 2E:7E:86:3A:A5:23

    • Port:

      • 2379/tcp: etcd HTTP API
      • 2380/tcp: etcd peer communication


  • Master

    • Server

      • vCPU: 4
      • Memory: 8G

    • Network

      • Master-01: 192.168.1.104 2E:7E:86:3A:A5:24
      • Master-02: 192.168.1.105 2E:7E:86:3A:A5:25
      • Master-03: 192.168.1.106 2E:7E:86:3A:A5:26

    • Port:

      • 179/tcp: Calico BGP
      • 6443/tcp: Kubernetes APIServer
      • 10250/tcp: kubelet API


  • Node

    • Server

      • vCPU: 4
      • Memory: 8G

    • Network

      • Node-01: 192.168.1.107 2E:7E:86:3A:A5:27
      • Node-02: 192.168.1.108 2E:7E:86:3A:A5:28
      • Node-03: 192.168.1.109 2E:7E:86:3A:A5:29

    • Port:

      • 179/tcp: Calico BGP
      • 10250/tcp: kubelet API


Configure the Base Environment

Scope
All nodes
apt update
apt upgrade
apt install curl apt-transport-https ca-certificates gnupg2 software-properties-common vim
curl -fsSL https://mirrors.ustc.edu.cn/kubernetes/core:/stable:/v1.32/deb/Release.key | gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
chmod 644 /etc/apt/keyrings/kubernetes-apt-keyring.gpg
echo "deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://mirrors.ustc.edu.cn/kubernetes/core:/stable:/v1.32/deb/ /" | tee /etc/apt/sources.list.d/kubernetes.list
curl -fsSL https://mirrors.ustc.edu.cn/docker-ce/linux/debian/gpg -o /etc/apt/keyrings/docker.asc
chmod a+r /etc/apt/keyrings/docker.asc
echo "deb [signed-by=/etc/apt/keyrings/docker.asc] https://mirrors.ustc.edu.cn/docker-ce/linux/debian bookworm stable" | tee /etc/apt/sources.list.d/docker.list
apt update
apt install containerd.io
mkdir -p /etc/containerd
containerd config default | tee /etc/containerd/config.toml
systemctl restart containerd
apt install kubelet kubeadm kubectl
apt-mark hold kubelet kubeadm kubectl

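A quick way to confirm that the expected versions were installed and that containerd is up (not part of the original steps, just a sanity check):
kubeadm version -o short        # should print v1.32.x
kubectl version --client
containerd --version
systemctl is-active containerd  # should print "active"
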
Enable IPv4 forwarding
Edit /etc/sysctl.conf, find the following line, and uncomment it
net.ipv4.ip_forward=1

Run sysctl -p to apply the change
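Alternatively, the setting can live in its own drop-in file so that later changes to /etc/sysctl.conf do not interfere; this is a minimal sketch and the file name 99-kubernetes.conf is just an assumed convention:
cat << EOF > /etc/sysctl.d/99-kubernetes.conf
net.ipv4.ip_forward = 1
EOF
sysctl --system   # reloads every sysctl configuration file, including the new drop-in
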
Create the crictl configuration
cat << EOF > /etc/crictl.yaml
runtime-endpoint: unix:///run/containerd/containerd.sock
image-endpoint: unix:///run/containerd/containerd.sock
timeout: 10
debug: false
EOF

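To confirm that crictl can reach containerd through this endpoint (a small check, not in the original text):
crictl info      # prints runtime status as JSON when the endpoint is reachable
crictl ps -a     # lists containers; expected to be empty before any Pods exist
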
If the container registries must be reached through a proxy server, configure the proxy for containerd
mkdir -p /etc/systemd/system/containerd.service.d
cat << EOF > /etc/systemd/system/containerd.service.d/http-proxy.conf
[Service]
Environment="HTTP_PROXY=http://username:password@proxy-server-ip:port"
Environment="HTTPS_PROXY=http://username:password@proxy-server-ip:port"
Environment="NO_PROXY=localhost,127.0.0.0/8,10.0.0.0/8,172.16.0.0/12,192.168.0.0/16"
EOF
systemctl daemon-reload
systemctl restart containerd.service

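To verify that the proxy environment was actually applied to the containerd unit (an optional check):
systemctl show containerd --property=Environment   # should list the HTTP_PROXY/HTTPS_PROXY/NO_PROXY values
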
Known Issues

When systemd is used as the cgroup driver and containerd is the CRI runtime,
/etc/containerd/config.toml must be modified to add the following configuration
Related article: Configuring a cgroup driver | Kubernetes
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
  ...
  [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
    SystemdCgroup = true

Run systemctl restart containerd
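You can confirm that the merged containerd configuration actually carries the new value (optional):
containerd config dump | grep SystemdCgroup   # expected output: SystemdCgroup = true
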
If etcd keeps failing on newer Debian kernels (see the linked discussion), follow the workaround from that thread and force cgroup v1
Related article: Why does etcd fail with Debian/bullseye kernel? - General Discussions - Discuss Kubernetes
cat /etc/default/grub
# Source:
# GRUB_CMDLINE_LINUX_DEFAULT="quiet"
# Modify:
GRUB_CMDLINE_LINUX_DEFAULT="quiet systemd.unified_cgroup_hierarchy=0"

Run update-grub and reboot
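After the reboot you can check which cgroup hierarchy is active (this check comes from the Kubernetes docs, not the original post):
stat -fc %T /sys/fs/cgroup/   # "tmpfs" means cgroup v1 is in use, "cgroup2fs" means cgroup v2
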
Configure the etcd Nodes

Configure kubelet to be the service manager for etcd

Scope
All etcd nodes
apt update
apt install etcd-client
mkdir -p /etc/systemd/system/kubelet.service.d
cat << EOF > /etc/systemd/system/kubelet.service.d/kubelet.conf
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
authentication:
  anonymous:
    enabled: false
  webhook:
    enabled: false
authorization:
  mode: AlwaysAllow
cgroupDriver: systemd
address: 127.0.0.1
containerRuntimeEndpoint: unix:///var/run/containerd/containerd.sock
staticPodPath: /etc/kubernetes/manifests
EOF
# Note: \$KUBELET_CONFIG_ARGS is escaped so the literal variable name is written into the unit file
# and expanded by systemd at runtime, not by the shell while writing the heredoc.
cat << EOF > /etc/systemd/system/kubelet.service.d/20-etcd-service-manager.conf
[Service]
Environment="KUBELET_CONFIG_ARGS=--config=/etc/systemd/system/kubelet.service.d/kubelet.conf"
ExecStart=
ExecStart=/usr/bin/kubelet \$KUBELET_CONFIG_ARGS
Restart=always
EOF
systemctl daemon-reload
systemctl restart kubelet

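kubelet is now running under the override but has no static Pods to manage yet; a quick, hedged check that the drop-in files were picked up:
systemctl status kubelet --no-pager --full   # the Drop-In section should list 20-etcd-service-manager.conf
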
Create the kubeadm configuration files

Scope
The Etcd-01 node; certificates and configuration are distributed from this node to the others
This node also acts as the CA
Generate the CA
kubeadm init phase certs etcd-ca

This generates the following files

  • /etc/kubernetes/pki/etcd/ca.crt
  • /etc/kubernetes/pki/etcd/ca.key
To make the next steps easier, first export the etcd node information as environment variables
export HOST0=192.168.1.101
export HOST1=192.168.1.102
export HOST2=192.168.1.103
export NAME0="etcd-01"
export NAME1="etcd-02"
export NAME2="etcd-03"

Generate a kubeadm configuration for each etcd member
HOSTS=(${HOST0} ${HOST1} ${HOST2})
NAMES=(${NAME0} ${NAME1} ${NAME2})
for i in "${!HOSTS[@]}"; do
HOST=${HOSTS[$i]}
NAME=${NAMES[$i]}
mkdir -p /tmp/${HOST}
cat << EOF > /tmp/${HOST}/kubeadmcfg.yaml
---
apiVersion: "kubeadm.k8s.io/v1beta4"
kind: InitConfiguration
nodeRegistration:
    name: ${NAME}
localAPIEndpoint:
    advertiseAddress: ${HOST}
---
apiVersion: "kubeadm.k8s.io/v1beta4"
kind: ClusterConfiguration
etcd:
    local:
        serverCertSANs:
        - "${HOST}"
        peerCertSANs:
        - "${HOST}"
        extraArgs:
        - name: initial-cluster
          value: ${NAMES[0]}=https://${HOSTS[0]}:2380,${NAMES[1]}=https://${HOSTS[1]}:2380,${NAMES[2]}=https://${HOSTS[2]}:2380
        - name: initial-cluster-state
          value: new
        - name: name
          value: ${NAME}
        - name: listen-peer-urls
          value: https://${HOST}:2380
        - name: listen-client-urls
          value: https://${HOST}:2379
        - name: advertise-client-urls
          value: https://${HOST}:2379
        - name: initial-advertise-peer-urls
          value: https://${HOST}:2380
EOF
done

Create certificates for each member
kubeadm init phase certs etcd-server --config=/tmp/${HOST2}/kubeadmcfg.yaml
kubeadm init phase certs etcd-peer --config=/tmp/${HOST2}/kubeadmcfg.yaml
kubeadm init phase certs etcd-healthcheck-client --config=/tmp/${HOST2}/kubeadmcfg.yaml
kubeadm init phase certs apiserver-etcd-client --config=/tmp/${HOST2}/kubeadmcfg.yaml
cp -R /etc/kubernetes/pki /tmp/${HOST2}/
# Clear the certificates that must not be reused, keeping only the CA
find /etc/kubernetes/pki -not -name ca.crt -not -name ca.key -type f -delete
kubeadm init phase certs etcd-server --config=/tmp/${HOST1}/kubeadmcfg.yaml
kubeadm init phase certs etcd-peer --config=/tmp/${HOST1}/kubeadmcfg.yaml
kubeadm init phase certs etcd-healthcheck-client --config=/tmp/${HOST1}/kubeadmcfg.yaml
kubeadm init phase certs apiserver-etcd-client --config=/tmp/${HOST1}/kubeadmcfg.yaml
cp -R /etc/kubernetes/pki /tmp/${HOST1}/
find /etc/kubernetes/pki -not -name ca.crt -not -name ca.key -type f -delete
kubeadm init phase certs etcd-server --config=/tmp/${HOST0}/kubeadmcfg.yaml
kubeadm init phase certs etcd-peer --config=/tmp/${HOST0}/kubeadmcfg.yaml
kubeadm init phase certs etcd-healthcheck-client --config=/tmp/${HOST0}/kubeadmcfg.yaml
kubeadm init phase certs apiserver-etcd-client --config=/tmp/${HOST0}/kubeadmcfg.yaml
# Remove the CA key from the copies that will be shipped to the other members
find /tmp/${HOST2} -name ca.key -type f -delete
find /tmp/${HOST1} -name ca.key -type f -delete

Move the certificates to the corresponding member servers
scp -r /tmp/${HOST2}/pki root@${HOST2}:/etc/kubernetes/
scp /tmp/${HOST2}/kubeadmcfg.yaml root@${HOST2}:~/
scp -r /tmp/${HOST1}/pki root@${HOST1}:/etc/kubernetes/
scp /tmp/${HOST1}/kubeadmcfg.yaml root@${HOST1}:~/
mv /tmp/${HOST0}/kubeadmcfg.yaml ~/
rm -rf /tmp/${HOST2}
rm -rf /tmp/${HOST1}
rm -rf /tmp/${HOST0}

At this point the file layout on each of the three etcd nodes should look like this
/root
└── kubeadmcfg.yaml
---
/etc/kubernetes/pki
├── apiserver-etcd-client.crt
├── apiserver-etcd-client.key
└── etcd
    ├── ca.crt
    ├── ca.key # CA node only, i.e. Etcd-01
    ├── healthcheck-client.crt
    ├── healthcheck-client.key
    ├── peer.crt
    ├── peer.key
    ├── server.crt
    └── server.key

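If you want to double-check that each node's certificates carry the right SANs before starting etcd, openssl can print them (an optional verification, not in the original flow):
openssl x509 -in /etc/kubernetes/pki/etcd/server.crt -noout -text | grep -A1 "Subject Alternative Name"
# The output should include the node's own IP, e.g. 192.168.1.101 on Etcd-01
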
Create the static Pod manifests

Scope
All etcd nodes
kubeadm init phase etcd local --config=/root/kubeadmcfg.yaml

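kubeadm writes the manifest to the staticPodPath configured earlier, and kubelet then starts etcd as a static Pod; a quick look to confirm (optional):
ls /etc/kubernetes/manifests   # should contain etcd.yaml
crictl ps --name etcd          # an etcd container should be in the Running state
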
Check the health of the etcd cluster

Replace ${HOST0} with the IP of the node you want to check
ETCDCTL_API=3 etcdctl \
--cert /etc/kubernetes/pki/etcd/peer.crt \
--key /etc/kubernetes/pki/etcd/peer.key \
--cacert /etc/kubernetes/pki/etcd/ca.crt \
--endpoints https://${HOST0}:2379 endpoint health

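To query all three members in one go rather than one endpoint at a time, the same flags accept a comma-separated endpoint list (a small variation on the command above):
ETCDCTL_API=3 etcdctl \
--cert /etc/kubernetes/pki/etcd/peer.crt \
--key /etc/kubernetes/pki/etcd/peer.key \
--cacert /etc/kubernetes/pki/etcd/ca.crt \
--endpoints https://192.168.1.101:2379,https://192.168.1.102:2379,https://192.168.1.103:2379 \
endpoint health
# "member list" with the same flags shows the name and peer/client URLs registered by every member
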
Create a Highly Available Cluster with kubeadm

Notes

If at any point you need to completely reset the control plane configuration, you need at least one node that can still reach the cluster; run the following procedure on that node
kubectl delete pods,nodes,namespaces,deployments,services --all --all-namespaces --force
kubectl delete -f tigera-operator.yaml --force
kubectl delete -f custom-resources.yaml --force
kubeadm reset --cleanup-tmp-dir -f
rm -rf /etc/cni/net.d/*
rm -rf ~/.kube
systemctl restart kubelet containerd

Create a load balancer for kube-apiserver

Notes
Nginx is used as the load balancer in this article
Nginx configuration
http {
    ...
}
stream {
    upstream apiserver {
        server 192.168.1.104:6443 weight=5 max_fails=3 fail_timeout=30s; # Master-01
        server 192.168.1.105:6443 weight=5 max_fails=3 fail_timeout=30s; # Master-02
        server 192.168.1.106:6443 weight=5 max_fails=3 fail_timeout=30s; # Master-03
    }
    server {
        listen 8443;
        proxy_pass apiserver;
    }
}

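Assuming Nginx runs on the "Other" server (192.168.1.100) as described in the architecture section, a quick sanity check after reloading it (if Nginx runs inside a container, prefix the commands with the appropriate docker compose exec invocation):
nginx -t                # validates the configuration syntax
nginx -s reload         # reloads the configuration without restarting the process
ss -tlnp | grep 8443    # confirms a listener on the load-balancer port
# Until the first control plane node is initialized, proxied connections will be refused upstream; that is expected
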
Configure the external etcd cluster for the control plane nodes

Scope
Any one etcd node plus the primary control plane node; in this article, Etcd-01 and Master-01
Copy the files from any etcd node in the cluster to the primary control plane node
scp /etc/kubernetes/pki/etcd/ca.crt /etc/kubernetes/pki/apiserver-etcd-client.crt /etc/kubernetes/pki/apiserver-etcd-client.key root@192.168.1.104:~

On the primary control plane node, move the files into place
mkdir -p /etc/kubernetes/pki/etcd
mv ~/ca.crt /etc/kubernetes/pki/etcd/
mv ~/apiserver-etcd-client.crt /etc/kubernetes/pki/
mv ~/apiserver-etcd-client.key /etc/kubernetes/pki/

Create kubeadm-config.yaml with the following content

  • controlPlaneEndpoint: the load balancer address
  • etcd

    • external

      • endpoints: the list of etcd nodes


  • networking

    • podSubnet: the Pod IP CIDR

---
apiVersion: kubeadm.k8s.io/v1beta4
kind: ClusterConfiguration
kubernetesVersion: v1.32.0
controlPlaneEndpoint: 192.168.1.100:8443
etcd:
  external:
    endpoints:
      - https://192.168.1.101:2379
      - https://192.168.1.102:2379
      - https://192.168.1.103:2379
    caFile: /etc/kubernetes/pki/etcd/ca.crt
    certFile: /etc/kubernetes/pki/apiserver-etcd-client.crt
    keyFile: /etc/kubernetes/pki/apiserver-etcd-client.key
networking:
  dnsDomain: cluster.local
  podSubnet: 10.244.0.0/24
  serviceSubnet: 10.96.0.0/16

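Recent kubeadm releases can check the file against the configuration schema before you run init; this is optional and only catches structural mistakes, not wrong IPs:
kubeadm config validate --config kubeadm-config.yaml
# Reports unknown fields or invalid values in the file, if any
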
Initialize the primary control plane

Scope
The primary control plane node; in this article, Master-01


  • --upload-certs: uploads the certificates shared between control plane nodes to the kubeadm-certs Secret

    • The kubeadm-certs Secret and its decryption key expire after two hours
    • To re-upload the certificates and generate a new decryption key, run kubeadm init phase upload-certs --upload-certs on a control plane node that has already joined the cluster

kubeadm init --config kubeadm-config.yaml --upload-certs

Wait for it to finish; the output should look similar to the following
...
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config
Alternatively, if you are the root user, you can run:
  export KUBECONFIG=/etc/kubernetes/admin.conf
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/
You can now join any number of control-plane nodes running the following command on each as root:
  kubeadm join 192.168.1.100:8443 --token 7r34LU.iLiRgu2qHdAeeanS --discovery-token-ca-cert-hash sha256:9f86d081884c7d659a2feaa0c55ad015a3bf4f1b2b0b822cd15d6c15b0f00a08 --control-plane --certificate-key 03d66dd08835c1ca3f128cceacd1f31ac94163096b20f445ae84285bc0832d72
Please note that the certificate-key gives access to cluster sensitive data, keep it secret!
As a safeguard, uploaded-certs will be deleted in two hours; If necessary, you can use
"kubeadm init phase upload-certs --upload-certs" to reload certs afterward.
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join 192.168.1.100:8443 --token 7r34LU.iLiRgu2qHdAeeanS --discovery-token-ca-cert-hash sha256:9f86d081884c7d659a2feaa0c55ad015a3bf4f1b2b0b822cd15d6c15b0f00a08

Save the console output above; these commands will be used shortly to join the other control plane nodes and the worker nodes to the cluster
As suggested by the output, copy the kubeconfig for use with kubectl
mkdir -p ~/.kube
cp /etc/kubernetes/admin.conf ~/.kube/config

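A quick check that kubectl can now talk to the new control plane (optional):
kubectl cluster-info
kubectl get nodes
# Master-01 will report NotReady until the CNI plugin is installed in the next step
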
Install the CNI plugin
Because the tigera-operator manifest is too large, kubectl apply fails with the error shown below; use kubectl create or kubectl replace instead
Note
Make sure the IP CIDR configured under calicoNetwork in custom-resources.yaml matches the cluster's podSubnet
# kubectl apply -f https://raw.githubusercontent.com/projectcalico/calico/refs/heads/release-v3.29/manifests/tigera-operator.yaml
# The CustomResourceDefinition "installations.operator.tigera.io" is invalid: metadata.annotations: Too long: may not be more than 262144 bytes
wget https://raw.githubusercontent.com/projectcalico/calico/refs/heads/release-v3.29/manifests/tigera-operator.yaml
wget https://raw.githubusercontent.com/projectcalico/calico/refs/heads/release-v3.29/manifests/custom-resources.yaml
kubectl create -f tigera-operator.yaml
kubectl create -f custom-resources.yaml

Run the following to watch the control plane component Pods come up
kubectl get pod -A

Initialize the other control plane nodes

Scope
The control plane nodes other than the primary one; in this article, Master-02 and Master-03
A control plane node joined with the kubeadm join command gets its kubeconfig written to /etc/kubernetes/admin.conf
Following the command from the init output, run the following on each of the other control plane nodes

  • --control-plane: tells kubeadm join to create a new control plane
  • --certificate-key xxx: downloads the control plane certificates from the cluster's kubeadm-certs Secret and decrypts them with the given key
kubeadm join 192.168.1.100:8443 --token 7r34LU.iLiRgu2qHdAeeanS --discovery-token-ca-cert-hash sha256:9f86d081884c7d659a2feaa0c55ad015a3bf4f1b2b0b822cd15d6c15b0f00a08 --control-plane --certificate-key 03d66dd08835c1ca3f128cceacd1f31ac94163096b20f445ae84285bc0832d72

As suggested by the output, copy the kubeconfig for use with kubectl
mkdir -p ~/.kube
cp /etc/kubernetes/admin.conf ~/.kube/config

Initialize the worker nodes

Scope
All worker nodes
A worker node joined with the kubeadm join command gets its kubeconfig written to /etc/kubernetes/kubelet.conf
Following the command from the init output, run the following on each worker node
kubeadm join 192.168.1.100:8443 --token 7r34LU.iLiRgu2qHdAeeanS --discovery-token-ca-cert-hash sha256:9f86d081884c7d659a2feaa0c55ad015a3bf4f1b2b0b822cd15d6c15b0f00a08

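Once all joins have completed, the state of the whole cluster can be checked from any control plane node (optional; worker nodes carry no role label by default, and the label below is purely cosmetic):
kubectl get nodes -o wide
# All three control plane nodes and all three worker nodes should reach Ready once Calico is running on them
kubectl get pods -A -o wide
kubectl label node node-01 node-role.kubernetes.io/worker=   # assumes the worker's node name is node-01
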