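1. Why Build Your Own Kubernetes Cluster
In the cloud-native era, Kubernetes has become the de facto standard for container orchestration. All major cloud providers offer managed Kubernetes services, but building a cluster yourself is still a rite of passage for DevOps engineers and cloud-native developers. Setting one up by hand lets you:
- Understand the Kubernetes architecture and how its components interact
- Learn the low-level details of core configuration such as cluster networking and storage
- Lay a solid foundation for later troubleshooting and performance tuning
- Build a customized cluster for specific needs in local or air-gapped environments
I have deployed dozens of Kubernetes clusters in production, from single-node development environments to highly available production clusters spanning multiple availability zones. This tutorial shares the experience I gained and the pitfalls I hit along the way.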
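2. Environment Preparation and Base Configuration
2.1 Hardware Resource Planning
For a learning environment, prepare at least:
- 2 virtual machines with 4 cores and 8 GB of RAM each (1 Master + 1 Worker)
- 50 GB of disk space per machine (system disk + data disk)
- Gigabit network connectivity between nodes
Production environments should be sized according to the workload. Typically:
- Master nodes: 4 cores and 16 GB of RAM or more (high availability requires 3 nodes)
- Worker nodes: scale horizontally according to application needs
- SSD storage with RAID is recommended
Note: all nodes must have synchronized clocks (chrony or ntpd); otherwise certificate validation will fail.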
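Before going further, it can help to check that a machine is large enough. The sketch below is one way to do that, not part of kubeadm itself. The helper names are hypothetical, and the default thresholds (2 CPUs, 1700 MB) are an assumption based on kubeadm's documented minimum of 2 CPUs and 2 GB of RAM for a control-plane node, with some slack for memory the kernel reserves.

```shell
# meets_minimum is a hypothetical helper, not part of any Kubernetes tool.
# Arguments: CPU count, memory in MB, optional minimum CPUs, optional minimum MB.
# The defaults (2 CPUs, 1700 MB) are assumptions; adjust them to your own policy.
meets_minimum() {
  local cpus="$1" mem_mb="$2" min_cpus="${3:-2}" min_mem_mb="${4:-1700}"
  [ "$cpus" -ge "$min_cpus" ] && [ "$mem_mb" -ge "$min_mem_mb" ]
}

# Read this machine's CPU count and total memory (in MB) from the kernel.
node_cpus() { nproc; }
node_mem_mb() { awk '/^MemTotal:/ { print int($2 / 1024) }' /proc/meminfo; }

# Usage:
#   meets_minimum "$(node_cpus)" "$(node_mem_mb)" && echo "OK" || echo "too small"
#   meets_minimum "$(node_cpus)" "$(node_mem_mb)" 4 8192   # the learning-environment size above
```

Passing the thresholds as arguments keeps the same check usable for both the learning and production sizes listed above.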
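2.2 Operating System Configuration
Using Ubuntu 20.04 as an example, the base configuration is:
# Run on all nodes
sudo apt update && sudo apt upgrade -y
sudo swapoff -a # Disables swap now; the next line makes it permanent
# Comment out every active swap entry in /etc/fstab (already-commented lines are skipped)
sudo sed -ri '/^[^#].*[[:space:]]swap[[:space:]]/ s/^/#/' /etc/fstab
# Load kernel modules at boot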
cat <<EOF | sudo tee /etc/modules-load.d/k8s.conf
br_netfilter
ip_vs
ip_vs_rr
ip_vs_wrr
ip_vs_sh
nf_conntrack
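EOF
# modules-load.d only takes effect at boot, so load the modules now as well;
# br_netfilter must be loaded before the net.bridge.* parameters can be set
sudo modprobe -a br_netfilter ip_vs ip_vs_rr ip_vs_wrr ip_vs_sh nf_conntrack
# Set kernel parameters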
cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
EOF
sudo sysctl --system
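The settings above are easy to get half right, so a quick verification step is worthwhile. The following is a minimal sketch; both helper names are hypothetical, and the optional path arguments exist only so the functions can be tried against test files.

```shell
# Hypothetical helpers for verifying the node preparation; names are illustrative.

# Succeeds when the swaps table (default /proc/swaps) lists no active device.
# The first line of that file is a header, so any further line means swap is on.
swap_is_off() {
  local table="${1:-/proc/swaps}"
  [ "$(tail -n +2 "$table" | grep -c .)" -eq 0 ]
}

# Succeeds when a sysctl key given in dotted form is set to 1.
# The key is mapped to its file under /proc/sys (dots become slashes).
sysctl_is_enabled() {
  local key="$1" root="${2:-/proc/sys}"
  local path
  path="$root/$(printf '%s' "$key" | tr . /)"
  [ -r "$path" ] && [ "$(cat "$path")" = "1" ]
}

# Usage:
#   swap_is_off || echo "swap is still enabled"
#   sysctl_is_enabled net.ipv4.ip_forward || echo "ip_forward is not set"
#   sysctl_is_enabled net.bridge.bridge-nf-call-iptables || echo "is br_netfilter loaded?"
```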
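3. Installing the Container Runtime and Kubernetes Components
3.1 Choosing and Configuring a Container Runtime
containerd is the recommended runtime. The kubelet and the runtime must use the same cgroup driver, so the commands below switch containerd to the systemd driver:
# Install containerd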
sudo apt install -y containerd
sudo mkdir -p /etc/containerd
containerd config default | sudo tee /etc/containerd/config.toml
sudo sed -i 's/SystemdCgroup = false/SystemdCgroup = true/' /etc/containerd/config.toml
sudo systemctl restart containerd
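If the sed substitution above silently matches nothing (for example because the default config changed format), the kubelet and containerd end up on different cgroup drivers. A small check, sketched here with a hypothetical helper name, confirms the setting landed:

```shell
# systemd_cgroup_enabled is a hypothetical helper.
# Succeeds when the containerd config contains "SystemdCgroup = true".
systemd_cgroup_enabled() {
  local config="${1:-/etc/containerd/config.toml}"
  grep -Eq '^[[:space:]]*SystemdCgroup[[:space:]]*=[[:space:]]*true' "$config"
}

# Usage:
#   systemd_cgroup_enabled || echo "SystemdCgroup is not enabled in config.toml"
```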
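3.2 Installing the Kubernetes Components
Configure the APT repository and install the three core packages. The legacy apt.kubernetes.io and packages.cloud.google.com repositories have been shut down, so the commands below use the community-owned pkgs.k8s.io repository, which is split per minor release:
K8S_MINOR=v1.30 # example value; set this to the minor release you want to install
sudo apt install -y apt-transport-https ca-certificates curl gpg
sudo mkdir -p -m 755 /etc/apt/keyrings
curl -fsSL "https://pkgs.k8s.io/core:/stable:/${K8S_MINOR}/deb/Release.key" | sudo gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
echo "deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/${K8S_MINOR}/deb/ /" | sudo tee /etc/apt/sources.list.d/kubernetes.list
sudo apt update
sudo apt install -y kubelet kubeadm kubectl
sudo apt-mark hold kubelet kubeadm kubectl # Prevent automatic upgrades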
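The community package repository at pkgs.k8s.io publishes one repository per minor release, so scripts usually need to turn a full version such as v1.30.2 into its minor stream. The helpers below are a sketch with hypothetical names; only the URL layout comes from the repository itself.

```shell
# Hypothetical helpers; the function names are illustrative.

# Reduce a full version such as v1.30.2 (or 1.30.2) to its minor stream, v1.30.
k8s_minor() {
  local v="${1#v}"          # drop an optional leading "v"
  local major="${v%%.*}"    # text before the first dot
  local rest="${v#*.}"      # text after the first dot
  echo "v${major}.${rest%%.*}"
}

# Build the pkgs.k8s.io Debian repository URL for that minor stream.
k8s_apt_repo_url() {
  echo "https://pkgs.k8s.io/core:/stable:/$(k8s_minor "$1")/deb/"
}

# Usage:
#   k8s_apt_repo_url v1.30.2
```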
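4. Cluster Initialization and Joining Nodes
4.1 Initializing the Master Node
Initialize the cluster with kubeadm. The Pod CIDR 10.244.0.0/16 matches Flannel's default network, and for a high-availability setup --control-plane-endpoint should point at the load balancer address rather than at a single Master:
sudo kubeadm init \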
--pod-network-cidr=10.244.0.0/16 \
--apiserver-advertise-address=<MASTER_IP> \
--control-plane-endpoint=<MASTER_IP>:6443 \
--upload-certs
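The same settings can also be kept in a kubeadm configuration file, which is easier to review and version than a long flag list. The sketch below generates such a file. The helper name is hypothetical, and the apiVersion is an assumption: v1beta3 is accepted by kubeadm 1.22 and later, but `kubeadm config print init-defaults` shows what your release expects.

```shell
# render_kubeadm_config is a hypothetical helper that prints a kubeadm config
# equivalent to the flags above. The apiVersion (v1beta3) is an assumption;
# compare it with the output of `kubeadm config print init-defaults`.
render_kubeadm_config() {
  local master_ip="$1" pod_cidr="${2:-10.244.0.0/16}"
  cat <<EOF
apiVersion: kubeadm.k8s.io/v1beta3
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: ${master_ip}
---
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
controlPlaneEndpoint: "${master_ip}:6443"
networking:
  podSubnet: ${pod_cidr}
EOF
}

# Usage:
#   render_kubeadm_config <MASTER_IP> > kubeadm-config.yaml
#   sudo kubeadm init --config kubeadm-config.yaml --upload-certs
```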
After initialization completes, configure kubectl as the output instructs:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
4.2 Installing the Network Plugin
Flannel is recommended:
kubectl apply -f https://raw.githubusercontent.com/flannel-io/flannel/master/Documentation/kube-flannel.yml
4.3 Joining Worker Nodes to the Cluster
Generate the join command on the Master node:
kubeadm token create --print-join-command
Run the printed join command on each Worker node.
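After a node joins, it stays NotReady until the network plugin is running on it. A small filter over the node list makes that easy to script. This is a sketch with a hypothetical helper name; it relies only on the STATUS column being the second field of `kubectl get nodes --no-headers`.

```shell
# count_not_ready is a hypothetical helper. It reads the output of
# `kubectl get nodes --no-headers` on stdin and prints how many nodes have a
# STATUS column that does not start with "Ready".
count_not_ready() {
  awk '$2 !~ /^Ready/ { n++ } END { print n + 0 }'
}

# Usage:
#   kubectl get nodes --no-headers | count_not_ready
```

Matching on the prefix means a cordoned but healthy node (Ready,SchedulingDisabled) is not counted as a failure.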
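5. Configuring and Tuning Key Components
5.1 Storage Configuration
Install the NFS client on every node, then deploy an NFS provisioner with Helm to get dynamic volume provisioning:
# Install on all nodes
sudo apt install -y nfs-common
# Deploy the NFS provisioner (requires Helm and an existing NFS export)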
helm repo add nfs-subdir-external-provisioner https://kubernetes-sigs.github.io/nfs-subdir-external-provisioner/
helm install nfs-subdir-external-provisioner nfs-subdir-external-provisioner/nfs-subdir-external-provisioner \
--set nfs.server=<NFS_SERVER_IP> \
--set nfs.path=/data/nfs
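To confirm that dynamic provisioning works, create a small claim and watch it bind. The sketch below prints a test PersistentVolumeClaim. The helper name is hypothetical, and the storage class name nfs-client is an assumption (it is what the chart uses unless overridden); check `kubectl get storageclass` for the actual name.

```shell
# render_test_pvc is a hypothetical helper that prints a small test claim.
# The default storage class "nfs-client" is an assumption; verify it with
# `kubectl get storageclass` and pass the real name as the second argument.
render_test_pvc() {
  local name="${1:-test-claim}" storage_class="${2:-nfs-client}"
  cat <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ${name}
spec:
  storageClassName: ${storage_class}
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Mi
EOF
}

# Usage:
#   render_test_pvc | kubectl apply -f -
#   kubectl get pvc test-claim      # STATUS should become Bound
#   kubectl delete pvc test-claim
```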
5.2 Deploying Monitoring
Use kube-prometheus-stack:
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm install kube-prometheus-stack prometheus-community/kube-prometheus-stack \
--namespace monitoring \
--create-namespace
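6. Production-Grade High Availability
6.1 Deploying Multiple Master Nodes
After initializing the first Master, run the following on each additional control-plane node. The certificate key is printed by kubeadm init --upload-certs and expires after two hours:
sudo kubeadm join <LOAD_BALANCER_IP>:6443 \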
--token <TOKEN> \
--discovery-token-ca-cert-hash sha256:<HASH> \
--control-plane \
--certificate-key <KEY>
6.2 Load Balancer Configuration
Use HAProxy to load-balance the API Server:
frontend k8s-api
    bind *:6443
    mode tcp
    default_backend k8s-api-servers
backend k8s-api-servers
    mode tcp
    balance roundrobin
    server master1 <MASTER1_IP>:6443 check
    server master2 <MASTER2_IP>:6443 check
    server master3 <MASTER3_IP>:6443 check
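When the set of Master nodes changes, regenerating the backend section is less error-prone than editing it by hand. The following is a sketch only; the helper name is hypothetical and the server names master1, master2, and so on simply follow the configuration above.

```shell
# render_k8s_backend is a hypothetical helper that prints the backend section
# for any number of Master IPs, naming the servers master1, master2, ...
render_k8s_backend() {
  local i=1 ip
  echo "backend k8s-api-servers"
  echo "    mode tcp"
  echo "    balance roundrobin"
  for ip in "$@"; do
    echo "    server master${i} ${ip}:6443 check"
    i=$((i + 1))
  done
}

# Usage:
#   render_k8s_backend <MASTER1_IP> <MASTER2_IP> <MASTER3_IP>
```

After updating the file, `haproxy -c -f /etc/haproxy/haproxy.cfg` checks the configuration for syntax errors before you reload the service.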
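7. Troubleshooting Common Problems
7.1 Troubleshooting a NotReady Node
- Check the kubelet status:
systemctl status kubelet
journalctl -xeu kubelet
- Verify the network plugin (recent Flannel manifests deploy into the kube-flannel namespace, older ones into kube-system):
kubectl get pods -A -o wide
kubectl logs <flannel-pod> -n <flannel-namespace>
- Check the CNI configuration:
ls /etc/cni/net.d/
ip route show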
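7.2 Troubleshooting Pod Creation Failures
Typical errors and fixes:
| Symptom | Likely cause | Fix |
|---|---|---|
| ImagePullBackOff | Image pull failed | Check the image address and pull credentials |
| CrashLoopBackOff | Application failed to start | Check the application logs with kubectl logs |
| Pending | Insufficient resources | Check resource requests and node capacity |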
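The table can be turned into a small lookup so that a script, or a tired on-call engineer, gets a first command to run for each status. This is only a sketch: the helper name is hypothetical, the choice of command per status is a suggestion, and <pod> is a placeholder.

```shell
# next_step is a hypothetical helper that maps a Pod status from the table to
# a first diagnostic command. <pod> is a placeholder for the Pod name.
next_step() {
  case "$1" in
    ImagePullBackOff|ErrImagePull)
      echo "kubectl describe pod <pod>" ;;       # the Events section shows the pull error
    CrashLoopBackOff)
      echo "kubectl logs <pod> --previous" ;;    # logs of the container that crashed
    Pending)
      echo "kubectl describe pod <pod>" ;;       # the Events section shows scheduling failures
    *)
      echo "kubectl get pod <pod> -o yaml" ;;
  esac
}

# Usage:
#   next_step CrashLoopBackOff
```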
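8. Security Hardening Recommendations
8.1 RBAC Access Control
Create a least-privilege ServiceAccount. A ServiceAccount subject in a RoleBinding must name its namespace; the default namespace is assumed here:
apiVersion: v1
kind: ServiceAccount
metadata:
  name: limited-user
  namespace: default
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-reader
  namespace: default
rules:
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["get", "watch", "list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: read-pods
  namespace: default
subjects:
  - kind: ServiceAccount
    name: limited-user
    namespace: default  # required for ServiceAccount subjects
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io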
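It is worth verifying that the binding grants exactly what was intended. kubectl can impersonate the ServiceAccount for this, using the user name Kubernetes assigns to it. The helper below is a hypothetical convenience for building that name; the default namespace in the usage lines is an assumption.

```shell
# sa_subject is a hypothetical helper that builds the user name Kubernetes
# assigns to a ServiceAccount: system:serviceaccount:<namespace>:<name>.
sa_subject() {
  local namespace="$1" name="$2"
  echo "system:serviceaccount:${namespace}:${name}"
}

# Usage (assumes the objects above were created in the default namespace):
#   kubectl auth can-i list pods   --as="$(sa_subject default limited-user)" -n default
#   kubectl auth can-i delete pods --as="$(sa_subject default limited-user)" -n default
```

With the Role above in place, the first check should answer yes and the second no.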
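8.2 Pod Security Policy
PodSecurityPolicy was deprecated in Kubernetes v1.21 and removed in v1.25; on newer clusters use the built-in Pod Security admission controller instead. On older clusters, enable PodSecurityPolicy by adding it to the API server's admission plugins. On kubeadm clusters this flag is set in /etc/kubernetes/manifests/kube-apiserver.yaml, and NodeRestriction must stay in the list:
kube-apiserver --enable-admission-plugins=NodeRestriction,PodSecurityPolicy
Example policy:
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: restricted
spec:
  privileged: false
  allowPrivilegeEscalation: false
  requiredDropCapabilities:
    - ALL
  volumes:
    - 'configMap'
    - 'emptyDir'
    - 'secret'
  hostNetwork: false
  hostIPC: false
  hostPID: false
  runAsUser:
    rule: 'MustRunAsNonRoot'
  seLinux:
    rule: 'RunAsAny'
  supplementalGroups:
    rule: 'MustRunAs'
    ranges:
      - min: 1
        max: 65535
  fsGroup:
    rule: 'MustRunAs'
    ranges:
      - min: 1
        max: 65535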
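For the newer Pod Security mechanism, policy is applied by labelling namespaces rather than by creating policy objects. The sketch below builds the labels for one of the three standard levels. The helper name is hypothetical; the label keys and level names come from Pod Security Admission.

```shell
# psa_labels is a hypothetical helper. It prints the enforce and warn labels
# for one Pod Security level (privileged, baseline, or restricted) and
# fails for any other value.
psa_labels() {
  local level="$1"
  case "$level" in
    privileged|baseline|restricted) ;;
    *) echo "unknown level: $level" >&2; return 1 ;;
  esac
  echo "pod-security.kubernetes.io/enforce=${level}"
  echo "pod-security.kubernetes.io/warn=${level}"
}

# Usage (<namespace> is a placeholder; the substitution is left unquoted on
# purpose so that each label becomes its own argument):
#   kubectl label namespace <namespace> $(psa_labels restricted) --overwrite
```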
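9. Cluster Maintenance and Upgrades
9.1 Version Upgrade Steps
Upgrade one minor version at a time; skipping minor versions is not supported. Because the packages were put on hold at install time, release the hold before each install and restore it afterwards. If you install from pkgs.k8s.io, first point /etc/apt/sources.list.d/kubernetes.list at the new minor release.
- Upgrade kubeadm:
sudo apt-mark unhold kubeadm
sudo apt update
sudo apt install -y kubeadm=<version>
sudo apt-mark hold kubeadm
- Upgrade the Master node:
sudo kubeadm upgrade plan
sudo kubeadm upgrade apply v<version>
- Upgrade kubelet and kubectl:
sudo apt-mark unhold kubelet kubectl
sudo apt install -y kubelet=<version> kubectl=<version>
sudo apt-mark hold kubelet kubectl
sudo systemctl daemon-reload
sudo systemctl restart kubelet
- Upgrade the Worker nodes (after upgrading the kubeadm package on each one):
sudo kubeadm upgrade node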
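Because kubeadm does not support skipping minor versions, an upgrade script can refuse an invalid jump before touching any packages. The check below is a minimal sketch with a hypothetical helper name; it compares only the major and minor parts of the two versions.

```shell
# can_upgrade is a hypothetical helper. It succeeds only when the target is
# in the same major release and either the same minor release (a patch
# upgrade) or exactly one minor release ahead.
can_upgrade() {
  local from="${1#v}" to="${2#v}"
  local from_major="${from%%.*}" to_major="${to%%.*}"
  local from_rest="${from#*.}" to_rest="${to#*.}"
  local from_minor="${from_rest%%.*}" to_minor="${to_rest%%.*}"
  [ "$from_major" -eq "$to_major" ] || return 1
  local step=$((to_minor - from_minor))
  [ "$step" -ge 0 ] && [ "$step" -le 1 ]
}

# Usage:
#   can_upgrade "$(kubeadm version -o short)" v<version> || echo "upgrade one minor version at a time"
```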
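9.2 Routine Maintenance Commands
Common maintenance operations:
# Check cluster status
kubectl get nodes -o wide
kubectl get pods -A -o wide
# Drain a node (before maintenance)
kubectl drain <node> --ignore-daemonsets
# Make the node schedulable again
kubectl uncordon <node>
# Check certificate expiration (older releases used "kubeadm alpha certs")
kubeadm certs check-expiration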
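10. Performance Tuning in Practice
10.1 Tuning kubelet Parameters
On kubeadm clusters the kubelet configuration lives in /var/lib/kubelet/config.yaml (/etc/kubernetes/kubelet.conf is the kubelet's kubeconfig, not its configuration). Set the following fields there, then restart the kubelet:
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
evictionHard:
  memory.available: "500Mi"
  nodefs.available: "10%"
  nodefs.inodesFree: "5%"
  imagefs.available: "15%"
kubeReserved:
  cpu: "500m"
  memory: "1Gi"
  ephemeral-storage: "5Gi"
systemReserved:
  cpu: "500m"
  memory: "1Gi"
  ephemeral-storage: "5Gi"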
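These reservations directly reduce what the scheduler can place on a node: allocatable memory is the node's capacity minus kubeReserved, systemReserved, and the hard eviction threshold. The arithmetic can be sketched as follows; the helper name is hypothetical and all values are in MiB.

```shell
# allocatable_mem_mi is a hypothetical helper; all arguments are in MiB.
# allocatable = capacity - kubeReserved - systemReserved - evictionHard
allocatable_mem_mi() {
  local capacity="$1" kube_reserved="$2" system_reserved="$3" eviction_hard="$4"
  echo $((capacity - kube_reserved - system_reserved - eviction_hard))
}

# Example: an 8 GiB Worker with the settings above
# (1Gi kubeReserved, 1Gi systemReserved, 500Mi memory.available):
#   allocatable_mem_mi 8192 1024 1024 500   # prints 5644
```

On a small learning node the reservations take a large share of memory, so scale them to the node size rather than copying them unchanged.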
10.2 Scheduler Configuration
Custom scheduler configuration (the v1 API is available from Kubernetes 1.25; older clusters use v1beta2 or v1beta3):
apiVersion: kubescheduler.config.k8s.io/v1
kind: KubeSchedulerConfiguration
profiles:
  - schedulerName: default-scheduler
    plugins:
      score:
        disabled:
          - name: ImageLocality
        enabled:
          - name: NodeResourcesBalancedAllocation
            weight: 1
          - name: InterPodAffinity
            weight: 2
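In production I have found that well-chosen Pod topology spread constraints noticeably improve application availability. The example below spreads three nginx replicas across nodes:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  replicas: 3
  selector:  # required by apps/v1; must match the template labels
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      topologySpreadConstraints:
        - maxSkew: 1
          topologyKey: kubernetes.io/hostname
          whenUnsatisfiable: ScheduleAnyway
          labelSelector:
            matchLabels:
              app: nginx
      containers:
        - name: nginx
          image: nginx  # example image; pin a specific tag in production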
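To see what maxSkew limits, it helps to compute the skew for a given placement: the difference between the most and least loaded topology domain (here, nodes). The function below is a sketch with a hypothetical name; it takes the number of matching Pods on each node.

```shell
# max_skew is a hypothetical helper: given the number of matching Pods on each
# node, it prints the largest count minus the smallest count.
max_skew() {
  local min="$1" max="$1" n
  for n in "$@"; do
    if [ "$n" -lt "$min" ]; then min="$n"; fi
    if [ "$n" -gt "$max" ]; then max="$n"; fi
  done
  echo $((max - min))
}

# Three replicas on three nodes:
#   max_skew 1 1 1   # prints 0, within maxSkew: 1
#   max_skew 2 1 0   # prints 2, exceeds maxSkew: 1
```

With whenUnsatisfiable: ScheduleAnyway the scheduler only prefers placements within the limit; DoNotSchedule would make it a hard requirement.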