Setting up a k8s cluster
Using the official CentOS 7 image, install a k8s cluster on virtual machines:
1. Initialization
Disable the firewall
# stop the firewall
systemctl stop firewalld
# disable autostart (permanently off)
systemctl disable firewalld
Disable SELinux
# temporarily disable selinux
setenforce 0
# permanently disable selinux
sed -i 's/SELINUX=permissive/SELINUX=disabled/' /etc/sysconfig/selinux
sed -i "s/SELINUX=enforcing/SELINUX=disabled/g" /etc/selinux/config
Disable the Swap partition
swapoff -a
sed -i 's/.*swap.*/#&/' /etc/fstab
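To confirm swap really is off, a quick optional check:
# free should report 0 for Swap, and swapon should print nothing
free -h
swapon --show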
Switch to the Aliyun yum mirror
# back up the original official yum repo configuration
mv /etc/yum.repos.d/CentOS-Base.repo /etc/yum.repos.d/CentOS-Base.repo.backup
# download the Centos-7.repo file
wget -O /etc/yum.repos.d/CentOS-Base.repo http://mirrors.aliyun.com/repo/Centos-7.repo
Add the Aliyun Docker CE repo:
# install the yum management tool
yum install -y yum-utils
# configure the Aliyun docker repo
yum-config-manager --add-repo http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
Add the Kubernetes repo
Pasting this heredoc directly requires a terminal that supports it; PowerShell has been tested and handles the paste fine.
cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
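Optionally, rebuild the yum cache and confirm the new repos are visible (a quick sanity check):
# refresh metadata for the newly added repos
yum clean all
yum makecache
# the list should now include the Aliyun base, docker-ce and kubernetes repos
yum repolist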
2. Base installation
Common packages
yum install vim bash-completion net-tools gcc -y
Install Docker
yum install docker-ce
# start Docker
systemctl start docker
# enable autostart
systemctl enable docker
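A quick way to confirm the Docker daemon (and the containerd it pulls in as a dependency) is running, assuming a standard docker-ce install:
systemctl is-active docker containerd
docker version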
Install Kubernetes
On the Master (control plane):
# install kubeadm, kubectl, kubelet
yum install -y kubectl kubeadm kubelet
# start the kubelet service
systemctl enable kubelet && systemctl start kubelet
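To see which versions the Aliyun repo pulled in (v1.28.x in this walkthrough, as the node listing later shows):
kubeadm version
kubectl version --client
kubelet --version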
On the Nodes (worker nodes):
# install kubeadm, kubectl, kubelet
yum install -y kubectl kubeadm kubelet
# start the kubelet service
systemctl enable kubelet && systemctl start kubelet
3. Initializing the k8s cluster
Preparation
After installation every VM is named localhost.localdomain; to avoid unnecessary trouble, give each machine a distinct hostname:
# master node
hostnamectl --static set-hostname k8s-master
hostname k8s-master # takes effect immediately
# node
hostnamectl --static set-hostname k8s-node1
hostname k8s-node1 # takes effect immediately
Initialize the master node
Run kubeadm init:
kubeadm init --image-repository registry.aliyuncs.com/google_containers --apiserver-advertise-address 192.168.1.33 --pod-network-cidr=10.122.0.0/16 --token-ttl 0
Troubleshooting
If errors come up during initialization:
- If you get the error container runtime is not running, run the following:
rm -rf /etc/containerd/config.toml
systemctl restart containerd
- If you get the error Initial timeout of 40s passed (the init eventually fails after roughly a 4-minute timeout), run the following:
ctr -n k8s.io images pull -k registry.aliyuncs.com/google_containers/pause:3.6
# retag registry.aliyuncs.com/google_containers/pause:3.6 as registry.k8s.io/pause:3.6
ctr -n k8s.io images tag registry.aliyuncs.com/google_containers/pause:3.6 registry.k8s.io/pause:3.6
kubeadm reset -f
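Before re-running init, you can optionally confirm the retagged pause image is present in containerd's k8s.io namespace:
ctr -n k8s.io images ls | grep pause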
Then re-run the kubeadm init command above. If all goes well:
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Alternatively, if you are the root user, you can run:
export KUBECONFIG=/etc/kubernetes/admin.conf
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join 192.168.1.33:6443 --token v2nbj9.n96aegm563ub38zt --discovery-token-ca-cert-hash sha256:1a6b394358789c92e09a55eb0ae8279d5054c89aef867d4dca9dae1cf4ccf859
As the output suggests, run:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Check the Master node's status:
kubectl get nodes
At this point the node is in NotReady state.
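The node stays NotReady because no CNI network plugin has been installed yet; this also shows up in the system pods, where coredns stays Pending until a network add-on is deployed:
kubectl get pods -n kube-system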
Install the network plugin (Calico)
# run
kubectl create -f https://raw.githubusercontent.com/projectcalico/calico/v3.26.4/manifests/tigera-operator.yaml
Then run:
# download to any folder:
wget https://raw.githubusercontent.com/projectcalico/calico/v3.26.4/manifests/custom-resources.yaml
vim custom-resources.yaml
This is the content of the custom-resources file; the cidr subnet must match the --pod-network-cidr value used when initializing the cluster (a sed one-liner for this follows the listing).
# This section includes base Calico installation configuration.
# For more information, see: https://projectcalico.docs.tigera.io/master/reference/installation/api#operator.tigera.io/v1.Installation
apiVersion: operator.tigera.io/v1
kind: Installation
metadata:
  name: default
spec:
  # Configures Calico networking.
  calicoNetwork:
    # Note: The ipPools section cannot be modified post-install.
    ipPools:
    - blockSize: 26
      # cidr: 192.168.0.0/16
      cidr: 10.122.0.0/16
      encapsulation: VXLANCrossSubnet
      natOutgoing: Enabled
      nodeSelector: all()
---
# This section configures the Calico API server.
# For more information, see: https://projectcalico.docs.tigera.io/master/reference/installation/api#operator.tigera.io/v1.APIServer
apiVersion: operator.tigera.io/v1
kind: APIServer
metadata:
  name: default
spec: {}
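Instead of editing the file by hand, the cidr can also be patched with sed (a sketch; it assumes the downloaded file still contains the default 192.168.0.0/16):
sed -i 's#cidr: 192.168.0.0/16#cidr: 10.122.0.0/16#' custom-resources.yaml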
Then run:
kubectl create -f custom-resources.yaml
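To know when the network is ready, you can watch the Calico pods come up (assuming the operator creates them in the calico-system namespace, as in the Calico quickstart):
watch kubectl get pods -n calico-system
# once they are all Running, the master should report Ready
kubectl get nodes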
After a while (about 4 minutes), the Master node is ready.
Join the Node(s)
On each worker node, run the join command printed above:
kubeadm join 192.168.1.33:6443 --token v2nbj9.n96aegm563ub38zt --discovery-token-ca-cert-hash sha256:1a6b394358789c92e09a55eb0ae8279d5054c89aef867d4dca9dae1cf4ccf859
If you did not save it earlier, you can regenerate it on the Master node with:
kubeadm token create --print-join-command
The following output confirms the node has joined the k8s cluster, but on the Master node its status will still show as NotReady:
This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.
Errors
If you get the error container runtime is not running, run the following:
rm -rf /etc/containerd/config.toml
systemctl restart containerd
# then run the join command again
If the node stays NotReady:
Check the logs:
journalctl -f -u kubelet.service
Then pull and retag the pause image:
ctr -n k8s.io images pull -k registry.aliyuncs.com/google_containers/pause:3.6
ctr -n k8s.io images tag registry.aliyuncs.com/google_containers/pause:3.6 registry.k8s.io/pause:3.6
After a while you will find that the cluster is up and running:
[root@k8s-master yum.repos.d]# kubectl get nodes
NAME         STATUS   ROLES           AGE   VERSION
k8s-master   Ready    control-plane   17h   v1.28.2
k8s-node1    Ready    <none>          17h   v1.28.2
k8s-node2    Ready    <none>          16h   v1.28.2
4. Resetting the k8s cluster
After the virtual machines' network changed, the whole cluster became unusable, since a k8s cluster relies heavily on the network for communication. In a learning environment the cluster can simply be reset; the basic steps with the kubeadm tool are:
On the control-plane (Master) node and on the worker nodes, run:
- Stop the kubelet service:
sudo systemctl stop kubelet
- Reset with kubeadm:
sudo kubeadm reset
This command clears the kubelet configuration, removes the Kubernetes-related containers, and performs some cleanup.
- Clean up iptables:
Clear any leftover iptables rules, which could interfere with re-initializing the cluster:
sudo iptables -F
sudo iptables -X
sudo iptables -t nat -F
sudo iptables -t nat -X
sudo iptables -t mangle -F
sudo iptables -t mangle -X
sudo iptables -P FORWARD ACCEPT
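kubeadm reset itself points out that it does not clean up CNI configuration or kubeconfig files, so in a learning environment you may also want to remove those manually:
sudo rm -rf /etc/cni/net.d
rm -f $HOME/.kube/config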
(Control-plane node) Re-initialize the cluster:
On the control-plane node, use kubeadm init to re-initialize the cluster; when running it, you can pass the new IP address as the API server advertise address.
The new IP address is 192.168.1.17, so run:
kubeadm init --image-repository registry.aliyuncs.com/google_containers --apiserver-advertise-address 192.168.1.137 --pod-network-cidr=10.122.0.0/16 --token-ttl 0
After initialization completes, follow the printed hints to set up the kubectl environment again.
(Control-plane node) Re-initialize the network
kubectl create -f https://raw.githubusercontent.com/projectcalico/calico/v3.26.4/manifests/tigera-operator.yaml
kubectl create -f custom-resources.yaml
Join the worker nodes back into the cluster:
Run:
kubeadm join 192.168.1.37:6443 --token 0c4gsd.ack5w3mvuyuauivl --discovery-token-ca-cert-hash sha256:0ebb75771d6616cca43750771263760f28f31044da103de93ac6a010fc5afff9
5. Listing and importing images
Listing images
# list images
crictl images ls
ctr images list
# import an image
ctr -n k8s.io images import docker.tar
Error when listing images
[root@localhost ~]# crictl images ls
WARN[0000] image connect using default endpoints: [unix:///var/run/dockershim.sock unix:///run/containerd/containerd.sock unix:///run/crio/crio.sock unix:///var/run/cri-dockerd.sock]. As the default settings are now deprecated, you should set the endpoint instead.
E0822 23:32:15.167096 8661 remote_image.go:119] "ListImages with filter from image service failed" err="rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing dial unix /var/run/dockershim.sock: connect: no such file or directory\"" filter="&ImageFilter{Image:&ImageSpec{Image:ls,Annotations:map[string]string{},},}"
FATA[0000] listing images: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial unix /var/run/dockershim.sock: connect: no such file or directory"
The fix is to point crictl at the containerd socket in /etc/crictl.yaml:
vim /etc/crictl.yaml
runtime-endpoint: "/run/containerd/containerd.sock"
image-endpoint: "/run/containerd/containerd.sock"
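With /etc/crictl.yaml in place, crictl should now talk to containerd directly:
crictl images
crictl info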
6. Single-master node
Method 1: add a toleration to the pod's YAML so the pod is allowed to run on the control-plane node
apiVersion: v1
kind: Pod
metadata:
  name: my-pod
spec:
  containers:
  - name: my-container
    image: nginx
  tolerations:
  - key: "node-role.kubernetes.io/control-plane"
    operator: "Exists"
    effect: "NoSchedule"
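Save the manifest (e.g. as my-pod.yaml, a filename chosen here for illustration) and apply it; the -o wide output shows which node the pod landed on:
kubectl apply -f my-pod.yaml
kubectl get pod my-pod -o wide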
Method 2: remove the taint from the master node
kubectl taint nodes --all node-role.kubernetes.io/control-plane:NoSchedule-
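To confirm the taint is gone from the control-plane node:
kubectl describe node k8s-master | grep -i taint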
Installing Kubernetes with KuboardSpray
docker run -d \
--privileged \
--restart=unless-stopped \
--name=kuboard-spray \
-p 80:80/tcp \
-v /var/run/docker.sock:/var/run/docker.sock \
-v ~/kuboard-spray-data:/data \
eipwork/kuboard-spray:latest-amd64
Reference link: Installing Kubernetes with KuboardSpray