Notes
systemctl disable ufw
ufw disable
ln -sf /usr/share/zoneinfo/Asia/Shanghai /etc/localtime
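To confirm what Asia/Shanghai resolves to (CST, UTC+8), you can query the zoneinfo database through the TZ variable without touching /etc/localtime; a quick check:

```shell
# Read the Asia/Shanghai zone from tzdata via $TZ (no root required);
# after the symlink above, a plain `date +'%Z %z'` should print the same.
TZ=Asia/Shanghai date +'%Z %z'   # expect: CST +0800
```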
CoreDNS proxies lookups for names outside the cluster's search domains (i.e. pods resolving external names) to the nameservers in the host's /etc/resolv.conf. On Ubuntu that file points at a local stub DNS server on 127.0.0.x, which forwards all local DNS requests upstream; inside a pod 127.0.0.x is the pod itself, so pods end up unable to resolve external domain names.
Here we need to disable the 127.0.0.53 stub proxy in resolv.conf on Ubuntu 20.04 LTS.
PS: if we simply set nameserver 114.114.114.114 in /etc/resolv.conf, it is reset to 127.0.0.53 after every reboot.
Override Ubuntu 20.04 DNS using systemd-resolved
Open /etc/systemd/resolved.conf and change it to:
[Resolve]
DNS=114.114.114.114
#FallbackDNS=
#Domains=
LLMNR=no
#MulticastDNS=no
#DNSSEC=no
#Cache=yes
#DNSStubListener=yes
LLMNR=no disables LLMNR (Link-Local Multicast Name Resolution); otherwise systemd-resolved listens on port 5355.
rm /etc/resolv.conf
ln -s /run/systemd/resolve/resolv.conf /etc/resolv.conf
systemctl restart systemd-resolved
swapoff -a && sysctl -w vm.swappiness=0
sed -ri '/^[^#]*swap/s@^@#@' /etc/fstab
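The sed expression comments out every uncommented fstab line that mentions swap while leaving other entries and already-commented lines alone; a sketch on a sample file (the file path and entries are illustrative):

```shell
# Build a sample fstab and apply the same sed expression as above.
cat > /tmp/fstab-sample <<'EOF'
UUID=abcd / ext4 defaults 0 1
/swap.img none swap sw 0 0
# already commented swap line stays as-is
EOF
# /^[^#]*swap/ only matches lines with no '#' before the word "swap"
sed -ri '/^[^#]*swap/s@^@#@' /tmp/fstab-sample
cat /tmp/fstab-sample   # the /swap.img line is now prefixed with '#'
```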
apt update && apt install -y wget git psmisc nfs-kernel-server nfs-common jq socat \
  bash-completion ipset ipvsadm conntrack libseccomp2 net-tools cron sysstat unzip \
  dnsutils tcpdump telnet lsof htop curl apt-transport-https ca-certificates
This installation enables the NFS server by default; we choose to disable it here.
systemctl stop nfs-server
systemctl disable nfs-server
:> /etc/modules-load.d/ipvs.conf
module=(
ip_vs
ip_vs_rr
ip_vs_wrr
ip_vs_sh
nf_conntrack
br_netfilter
)
for kernel_module in ${module[@]}; do
  /sbin/modinfo -F filename $kernel_module |& grep -qv ERROR && echo $kernel_module >> /etc/modules-load.d/ipvs.conf || :
done
Use systemctl cat systemd-modules-load to check whether the unit has an [Install] section; if not, run:
cat>>/usr/lib/systemd/system/systemd-modules-load.service<<EOF
[Install]
WantedBy=multi-user.target
EOF
Start the module-load service:
systemctl daemon-reload
systemctl enable --now systemd-modules-load.service
systemctl restart systemd-modules-load.service
# confirm the kernel modules are loaded
lsmod | grep ip_vs
cat <<EOF > /etc/sysctl.d/k8s.conf
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1
net.ipv4.neigh.default.gc_stale_time = 120
net.ipv4.conf.all.rp_filter = 0
net.ipv4.conf.default.rp_filter = 0
net.ipv4.conf.default.arp_announce = 2
net.ipv4.conf.lo.arp_announce = 2
net.ipv4.conf.all.arp_announce = 2
net.ipv4.ip_forward = 1
net.ipv4.tcp_max_tw_buckets = 5000
net.ipv4.tcp_syncookies = 1
net.ipv4.tcp_max_syn_backlog = 1024
net.ipv4.tcp_synack_retries = 2
# pass bridged traffic to iptables
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-arptables = 1
net.netfilter.nf_conntrack_max = 2310720
fs.inotify.max_user_watches=89100
fs.may_detach_mounts = 1
fs.file-max = 52706963
fs.nr_open = 52706963
vm.overcommit_memory=1
vm.panic_on_oom=0
vm.swappiness = 0
EOF
If kube-proxy runs in IPVS mode, tune the TCP keepalive parameters to avoid connection timeouts.
cat <<EOF >> /etc/sysctl.d/k8s.conf
# https://github.com/moby/moby/issues/31208
# ipvsadm -l --timeout
# fix long-connection timeouts in IPVS mode; any value below 900 works
net.ipv4.tcp_keepalive_time = 600
net.ipv4.tcp_keepalive_intvl = 30
net.ipv4.tcp_keepalive_probes = 10
EOF
sysctl --system
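Before running sysctl --system it can be worth sanity-checking the drop-in for stray lines; a small sketch (the sample file and the awk check are illustrative additions, not part of the original setup — point the awk at /etc/sysctl.d/k8s.conf on a real host):

```shell
# Build a small sample fragment in the same "key = value" syntax.
cat > /tmp/k8s-sysctl-sample.conf <<'EOF'
net.ipv4.ip_forward = 1
# a comment line is ignored
vm.swappiness = 0
EOF
# Every non-comment, non-empty line must split into exactly key and value
# around '='; anything else is reported and the exit code is non-zero.
awk -F'=' 'NF && $0 !~ /^[[:space:]]*#/ && NF != 2 { bad = 1; print "bad line: " $0 }
           END { if (!bad) print "syntax OK"; exit bad }' /tmp/k8s-sysctl-sample.conf
```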
Some of these kernel parameters are lost after a reboot (disabling IPv6, for example, no longer takes effect), so we need to re-apply them at startup.
vim /etc/rc.local
# add the following content
#!/bin/bash
# /etc/rc.local: re-apply the settings under /etc/sysctl.d at boot
/etc/init.d/procps restart
exit 0
# then add execute permission
chmod 755 /etc/rc.local
Optimize SSH logins by disabling reverse DNS lookups:
sed -ri 's/^#(UseDNS )yes/\1no/' /etc/ssh/sshd_config
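The sed above flips the commented default "#UseDNS yes" into an active "UseDNS no"; demonstrated on a sample file (the sample path and its contents are illustrative):

```shell
# Minimal stand-in for /etc/ssh/sshd_config with the commented default.
printf '%s\n' 'Port 22' '#UseDNS yes' > /tmp/sshd-sample
# Same expression as above: strip the leading '#' and change yes to no.
sed -ri 's/^#(UseDNS )yes/\1no/' /tmp/sshd-sample
grep UseDNS /tmp/sshd-sample   # → UseDNS no
```

Note that sshd has to be restarted (systemctl restart sshd) for the change to take effect.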
Raise the maximum number of open files via a drop-in configuration file:
cat>/etc/security/limits.d/kubernetes.conf<<EOF
# example limits; adjust the values to your environment
*       soft    nofile  655360
*       hard    nofile  655360
EOF
apt install -y chrony
The default configuration file already performs time synchronization, so no changes are made here.
reboot
apt install -y apt-transport-https ca-certificates curl gnupg2 software-properties-common
curl -fsSL https://mirrors.aliyun.com/docker-ce/linux/ubuntu/gpg | sudo apt-key add -
add-apt-repository "deb [arch=amd64] https://mirrors.aliyun.com/docker-ce/linux/ubuntu $(lsb_release -cs) stable"
apt update
apt install docker-ce=5:19.03.15~3-0~ubuntu-focal -y
Docker's cgroup driver must match the kubelet's cgroupDriver: systemd setting used later:
mkdir -p /etc/docker/
cat>/etc/docker/daemon.json<<EOF
{
  "exec-opts": ["native.cgroupdriver=systemd"]
}
EOF
systemctl restart docker
systemctl enable docker
# Uncomment the following lines in /etc/bash.bashrc:
# enable bash completion in interactive shells
#if ! shopt -oq posix; then
#  if [ -f /usr/share/bash-completion/bash_completion ]; then
#    . /usr/share/bash-completion/bash_completion
#  elif [ -f /etc/bash_completion ]; then
#    . /etc/bash_completion
#  fi
#fi
Copy the completion script:
cp /usr/share/bash-completion/completions/docker /etc/bash_completion.d/
Configure the environment:
source /usr/share/bash-completion/bash_completion
echo "source <(kubectl completion bash)" >> ~/.bashrc
The default package sources are hosted overseas and installation will fail; we use a domestic mirror instead, on all machines. (Aliyun's mirror is used here; the Huawei Cloud mirror fails signature verification.)
curl https://mirrors.aliyun.com/kubernetes/apt/doc/apt-key.gpg | apt-key add -
cat>/etc/apt/sources.list.d/kubernetes.list<<EOF
deb https://mirrors.aliyun.com/kubernetes/apt/ kubernetes-xenial main
EOF
apt update
Install the packages:
apt install kubeadm=1.23.3-00 kubelet=1.23.3-00 kubectl=1.23.3-00
Prepare the initialization file.
# first, generate the default init configuration
kubeadm config print init-defaults > initconfig.yaml
The modified file is shown below.
Because our network plugin is cilium, and cilium will also handle the Service layer, we skip kube-proxy during initialization.
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
imageRepository: registry.aliyuncs.com/k8sxio
kubernetesVersion: v1.23.3 # if the listed image versions are wrong, put the correct version here
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
networking: # https://godoc.org/k8s.io/kubernetes/cmd/kubeadm/app/apis/kubeadm/v1beta2#Networking
  dnsDomain: cluster.local
  serviceSubnet: 10.96.0.0/12
  podSubnet: 10.244.0.0/16
apiServer: # https://godoc.org/k8s.io/kubernetes/cmd/kubeadm/app/apis/kubeadm/v1beta2#APIServer
  timeoutForControlPlane: 4m0s
  certSANs:
  - 10.96.0.1 # first IP of the service CIDR
  - 127.0.0.1 # with multiple masters, allows quick localhost debugging if the load balancer breaks
  - localhost
  - 192.168.10.81
  - master
  - kubernetes
  - kubernetes.default
  - kubernetes.default.svc
  - kubernetes.default.svc.cluster.local
  extraVolumes:
  - hostPath: /etc/localtime
    mountPath: /etc/localtime
    name: localtime
    readOnly: true
controllerManager: # https://godoc.org/k8s.io/kubernetes/cmd/kubeadm/app/apis/kubeadm/v1beta2#ControlPlaneComponent
  extraArgs:
    bind-address: "0.0.0.0"
    experimental-cluster-signing-duration: 876000h
  extraVolumes:
  - hostPath: /etc/localtime
    mountPath: /etc/localtime
    name: localtime
    readOnly: true
scheduler:
  extraArgs:
    bind-address: "0.0.0.0"
  extraVolumes:
  - hostPath: /etc/localtime
    mountPath: /etc/localtime
    name: localtime
    readOnly: true
dns: # https://godoc.org/k8s.io/kubernetes/cmd/kubeadm/app/apis/kubeadm/v1beta2#DNS
  imageRepository: docker.io/coredns # azk8s.cn is gone; use the official coredns image on Docker Hub. If the images list below looks wrong, try docker.io/coredns here again
  imageTag: 1.8.7
etcd: # https://godoc.org/k8s.io/kubernetes/cmd/kubeadm/app/apis/kubeadm/v1beta2#Etcd
  local:
    #imageRepository: quay.io/coreos # uncomment to use quay.io; here the registry.aliyuncs.com/k8sxio image is used
    imageTag: v3.4.15
    dataDir: /var/lib/etcd
    extraArgs: # upstream has no extraVolumes for etcd yet
      auto-compaction-retention: "1h"
      max-request-bytes: "33554432"
      quota-backend-bytes: "8589934592"
      enable-v2: "false" # disable etcd v2 api
---
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration # https://godoc.org/k8s.io/kubelet/config/v1beta1#KubeletConfiguration
cgroupDriver: systemd
failSwapOn: true # set to false if swap is enabled
Verify that the configuration file has no errors:
kubeadm init --config initconfig.yaml --dry-run
Be sure to keep this configuration file; it makes later upgrades much easier.
Pull the images:
kubeadm config images list --config initconfig.yaml
kubeadm config images pull --config initconfig.yaml
Run the initialization, skipping kube-proxy:
kubeadm init --config initconfig.yaml --skip-phases=addon/kube-proxy
After initialization completes, the output includes instructions for configuring kubectl as the administrator and for joining worker nodes to the cluster.
Configure kubectl:
mkdir -p $HOME/.kube
cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
chown $(id -u):$(id -g) $HOME/.kube/config
Join the worker nodes to the cluster:
kubeadm join 192.168.10.81:6443 --token 9b2qv2.vgge4e62maud7hxx \
    --discovery-token-ca-cert-hash sha256:ed13dd55246982df17f809fd3355745e61b788fb9c745ac437aa2b3624511df0
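For reference, the --discovery-token-ca-cert-hash value is the SHA-256 of the cluster CA's public key in DER form (this derivation is documented by kubeadm); the sketch below uses a throwaway certificate in place of /etc/kubernetes/pki/ca.crt:

```shell
# Generate a throwaway CA certificate as a stand-in for the cluster CA.
openssl req -x509 -newkey rsa:2048 -nodes -keyout /tmp/demo-ca.key \
  -out /tmp/demo-ca.crt -days 1 -subj "/CN=demo-ca" 2>/dev/null
# Extract the public key, convert it to DER, and hash it -- the same
# pipeline kubeadm documents for computing the discovery hash.
hash=$(openssl x509 -pubkey -in /tmp/demo-ca.crt \
  | openssl rsa -pubin -outform der 2>/dev/null | sha256sum | awk '{print $1}')
echo "sha256:$hash"
```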
At this point the cilium network plugin has not been configured yet, so all nodes are in the NotReady state, as shown below:
root@unode1:~# kubectl get node -o wide
NAME     STATUS     ROLES                  AGE     VERSION   INTERNAL-IP     EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION     CONTAINER-RUNTIME
unode1   NotReady   control-plane,master   7m45s   v1.23.3   192.168.10.81   <none>        Ubuntu 20.04.3 LTS   5.4.0-99-generic   docker://19.3.15
unode2   NotReady   <none>                 6m24s   v1.23.3   192.168.10.82   <none>        Ubuntu 20.04.3 LTS   5.4.0-99-generic   docker://19.3.15
unode3   NotReady   <none>                 6m12s   v1.23.3   192.168.10.83   <none>        Ubuntu 20.04.3 LTS   5.4.0-99-generic   docker://19.3.15
Download the matching version of cilium-cli from GitHub.
Releases: https://github.com/cilium/cilium-cli/releases
Official documentation: https://docs.cilium.io/en/stable/gettingstarted/k8s-install-default/
curl -L --remote-name-all https://github.com/cilium/cilium-cli/releases/latest/download/cilium-linux-amd64.tar.gz{,.sha256sum}
sha256sum --check cilium-linux-amd64.tar.gz.sha256sum
sudo tar xzvfC cilium-linux-amd64.tar.gz /usr/local/bin
rm cilium-linux-amd64.tar.gz{,.sha256sum}
Requirements:
The above are cilium's deployment requirements.
curl -fsSL -o get_helm.sh https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3
chmod 700 get_helm.sh
./get_helm.sh
Official documentation: https://docs.cilium.io/en/stable/concepts/networking/ipam/
IP Address Management (IPAM) allocates and manages the IP addresses used by the network endpoints (containers and others) that cilium manages. Several IPAM modes are supported:
If you use the Cluster Scope (cluster-wide) mode, this installation approach is recommended.
Before installing, run --help and read what each parameter means, then customize accordingly.
With helm the previous options can be defined as:
ipam: kubernetes: --set ipam.mode=kubernetes
k8s-require-ipv4-pod-cidr: true: --set k8s.requireIPv4PodCIDR=true, which only works with --set ipam.mode=kubernetes
k8s-require-ipv6-pod-cidr: true: --set k8s.requireIPv6PodCIDR=true, which only works with --set ipam.mode=kubernetes
These are the helm parameters we need to set, as given by the official documentation.
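Equivalently, instead of editing values.yaml, the two options above can be passed to helm as --set flags; a minimal sketch (only collects the flags shown in the mapping above):

```shell
# Keep the flags in a bash array so quoting stays intact when the array is
# later expanded onto a `helm install cilium/cilium ...` command line.
args=(
  --set ipam.mode=kubernetes
  --set k8s.requireIPv4PodCIDR=true
)
printf '%s\n' "${args[@]}"
```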
# download the matching version of the cilium chart, then edit its values.yaml
helm repo add cilium https://helm.cilium.io/
mkdir /opt/helmrepo
cd /opt/helmrepo/
helm repo list
helm search repo -l cilium/cilium
helm fetch cilium/cilium --version=1.11.1
tar xf cilium-1.11.1.tgz  # unpack the chart so ./cilium/values.yaml can be edited
The modified values are as follows:
k8sServiceHost: 192.168.10.81
k8sServicePort: 6443
extraHostPathMounts:
- name: localtime
  mountPath: /etc/localtime
  hostPath: /etc/localtime
  readOnly: true
resources:
  limits:
    cpu: 2000m
    memory: 2Gi
  requests:
    cpu: 100m
    memory: 512Mi
ipam:
  # -- Configure IP Address Management mode.
  # ref: https://docs.cilium.io/en/stable/concepts/networking/ipam/
  mode: "kubernetes"
  operator:
    # -- Deprecated in favor of ipam.operator.clusterPoolIPv4PodCIDRList.
    # IPv4 CIDR range to delegate to individual nodes for IPAM.
    clusterPoolIPv4PodCIDR: "10.244.0.0/16"
    # -- IPv4 CIDR list range to delegate to individual nodes for IPAM.
    clusterPoolIPv4PodCIDRList: []
    # -- IPv4 CIDR mask size to delegate to individual nodes for IPAM.
    clusterPoolIPv4MaskSize: 24
ipv4:
  # -- Enable IPv4 support.
  enabled: true
k8s:
  # -- requireIPv4PodCIDR enables waiting for Kubernetes to provide the PodCIDR
  # range via the Kubernetes node resource
  requireIPv4PodCIDR: true
kubeProxyReplacement: "strict"
tunnel: "vxlan"
operator:
  # -- Additional cilium-operator hostPath mounts.
  extraHostPathMounts:
  - name: localtime
    mountPath: /etc/localtime
    hostPath: /etc/localtime
    readOnly: true
  # -- cilium-operator resource limits & requests
  # ref: https://kubernetes.io/docs/user-guide/compute-resources/
  resources:
    limits:
      cpu: 1000m
      memory: 1Gi
    requests:
      cpu: 100m
      memory: 128Mi
Then install:
helm install cilium --namespace kube-system ./cilium
# to apply any later changes to the values:
helm upgrade cilium --namespace kube-system ./cilium
Finally, verify:
root@unode1:/opt/helmrepo# cilium status
    /¯¯\
 /¯¯\__/¯¯\    Cilium:         OK
 \__/¯¯\__/    Operator:       OK
 /¯¯\__/¯¯\    Hubble:         disabled
 \__/¯¯\__/    ClusterMesh:    disabled
    \__/

DaemonSet        cilium             Desired: 3, Ready: 3/3, Available: 3/3
Deployment       cilium-operator    Desired: 2, Ready: 2/2, Available: 2/2
Containers:      cilium             Running: 3
                 cilium-operator    Running: 2
Cluster Pods:    6/6 managed by Cilium
Image versions   cilium             quay.io/cilium/cilium:v1.11.1@sha256:251ff274acf22fd2067b29a31e9fda94253d2961c061577203621583d7e85bd2: 3
                 cilium-operator    quay.io/cilium/operator-generic:v1.11.1@sha256:977240a4783c7be821e215ead515da3093a10f4a7baea9f803511a2c2b44a235: 2
I will continue studying cilium in future posts; if you have any questions, follow me and leave a comment.
Followers who share this post can message me an email address for a chance to receive a PDF of the fourth edition of Kubernetes权威指南.