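Deploying stateful services on a Kubernetes cluster has always been challenging. Unlike stateless services, stateful services have strict requirements for network identity, storage persistence, and startup order. StatefulSet is the core controller designed to address these requirements.
A stateful service is defined by three core characteristics: a stable network identity, persistent storage that survives Pod rescheduling, and ordered startup and shutdown.
Typical stateful services include databases, caches, and message queues, such as the MySQL, Redis, and Kafka deployments covered later in this article.
Stateless services (such as Pods managed by a Deployment), by contrast, run interchangeable Pods with randomly generated name suffixes, whereas StatefulSet Pods get stable, ordinal-based names: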
```text
# Example Pod names for a stateless service
my-app-7cbbf5d5f5-abc12
my-app-7cbbf5d5f5-xyz34
# Example Pod names for a stateful service
web-0
web-1
web-2
```
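Because StatefulSet Pod names always end in an ordinal, a script can recover a Pod's index from its name. The following is a minimal sketch, assuming bash; the helper name `pod_ordinal` is hypothetical, and the check is only a heuristic (a Deployment Pod whose random suffix happened to be all digits would also match).

```bash
#!/usr/bin/env bash
# pod_ordinal: print the trailing ordinal of a StatefulSet Pod name.
# Returns non-zero when the name has no numeric suffix.
# (Hypothetical helper, for illustration only.)
pod_ordinal() {
  local name="$1"
  if [[ "$name" =~ -([0-9]+)$ ]]; then
    echo "${BASH_REMATCH[1]}"
  else
    return 1
  fi
}

pod_ordinal web-2                                           # prints 2
pod_ordinal my-app-7cbbf5d5f5-abc12 || echo "no ordinal"    # prints "no ordinal"
```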
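StatefulSet relies on three core mechanisms to keep stateful services stable: a stable network identity for each Pod (through a Headless Service), stable per-Pod storage (through volumeClaimTemplates), and ordered deployment, scaling, and updates.
Important: a StatefulSet expects its Headless Service to be created in advance, and the Service name must match the StatefulSet's serviceName field.
A complete StatefulSet manifest contains two levels of spec: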
```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
spec:                      # first-level spec: StatefulSet behavior
  serviceName: "nginx"
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:                # Pod template
    metadata:
      labels:
        app: nginx
    spec:                  # second-level spec: container properties
      containers:
      - name: nginx
        image: nginx
        ports:
        - containerPort: 80
  volumeClaimTemplates:    # PersistentVolumeClaim templates
  - metadata:
      name: www
    spec:
      accessModes: ["ReadWriteOnce"]
      storageClassName: "nfs"
      resources:
        requests:
          storage: 1Gi
```
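Other key fields:
podManagementPolicy: controls the order in which Pods are created and deleted; OrderedReady (the default) or Parallel.
updateStrategy: controls how Pods are updated; RollingUpdate (the default) or OnDelete.
volumeClaimTemplates: creates one PVC per Pod, named <template name>-<StatefulSet name>-<ordinal>.
A Headless Service is a prerequisite for a StatefulSet to work properly: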
```yaml
apiVersion: v1
kind: Service
metadata:
  name: nginx
spec:
  clusterIP: None   # the key setting: makes the Service headless
  ports:
  - port: 80
    name: web
  selector:
    app: nginx
```
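Differences from a regular Service:
No cluster IP is allocated (clusterIP: None), so a DNS query for the Service name returns the Pod IPs directly.
Each Pod gets a DNS record of the form <pod-name>.<svc-name>.<namespace>.svc.cluster.local.
StatefulSet supports two storage provisioning approaches:
Static provisioning: an administrator creates PersistentVolumes in advance, and the PVCs generated from volumeClaimTemplates bind to them.
Dynamic provisioning: a StorageClass creates PersistentVolumes on demand as each PVC is created.
Practical tip: dynamic provisioning is recommended for production, but make sure the StorageClass is configured correctly. In particular, the reclaim policy (reclaimPolicy) should usually be set to Retain so that data is not deleted by accident.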
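To make the reclaim-policy advice easy to check, the sketch below filters a list of StorageClasses and prints those whose policy is not Retain. The helper name `flag_non_retain` is hypothetical, and the sample input is made up for illustration; it stands in for real cluster output.

```bash
#!/usr/bin/env bash
# flag_non_retain: read "NAME POLICY" lines on stdin and print the names of
# StorageClasses whose reclaim policy is not Retain.
# (Hypothetical helper, for illustration only.)
flag_non_retain() {
  awk '$2 != "Retain" { print $1 }'
}

# Illustrative sample input, not real cluster output:
printf '%s\n' 'nfs-client Delete' 'ssd Retain' | flag_non_retain   # prints nfs-client
```

Against a live cluster, the input could come from `kubectl get storageclass --no-headers -o custom-columns=NAME:.metadata.name,POLICY:.reclaimPolicy`, piped into `flag_non_retain`.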
The following is a complete StatefulSet deployment example. It contains a Headless Service and a StatefulSet with a volume claim template:
```yaml
# nginx-statefulset.yaml
apiVersion: v1
kind: Service
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  ports:
  - port: 80
    name: web
  clusterIP: None
  selector:
    app: nginx
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
spec:
  serviceName: "nginx"
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.21
        ports:
        - containerPort: 80
          name: web
        volumeMounts:
        - name: www
          mountPath: /usr/share/nginx/html
  volumeClaimTemplates:
  - metadata:
      name: www
    spec:
      accessModes: ["ReadWriteOnce"]
      storageClassName: "nfs-client"
      resources:
        requests:
          storage: 1Gi
```
Deployment steps:
```bash
# Apply the manifest
kubectl apply -f nginx-statefulset.yaml
# List the created resources
kubectl get statefulset
kubectl get pods -l app=nginx
kubectl get pvc
kubectl get pv
# Verify DNS resolution from a temporary Pod
kubectl run -it --rm --image=busybox:1.28 dns-test -- /bin/sh
# Then, at the shell prompt inside the dns-test Pod:
#   nslookup web-0.nginx.default.svc.cluster.local
#   nslookup nginx.default.svc.cluster.local
```
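The per-Pod DNS names follow a fixed pattern, so the full list for a StatefulSet can be generated instead of typed by hand. A minimal sketch, assuming bash and the default cluster domain `cluster.local`; the helper name `pod_fqdns` is hypothetical.

```bash
#!/usr/bin/env bash
# pod_fqdns: print the per-Pod DNS name for every replica of a StatefulSet.
# Arguments: StatefulSet name, headless Service name, namespace, replica count.
# Assumes the default cluster domain cluster.local.
# (Hypothetical helper, for illustration only.)
pod_fqdns() {
  local sts="$1" svc="$2" ns="$3" replicas="$4" i
  for ((i = 0; i < replicas; i++)); do
    echo "${sts}-${i}.${svc}.${ns}.svc.cluster.local"
  done
}

pod_fqdns web nginx default 3
# web-0.nginx.default.svc.cluster.local
# web-1.nginx.default.svc.cluster.local
# web-2.nginx.default.svc.cluster.local
```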
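StatefulSet supports graceful scaling in both directions:
```bash
# Scale up to 5 replicas
kubectl scale statefulset web --replicas=5
# Or edit the configuration directly
kubectl edit statefulset web
# Change spec.replicas, then save
# Scale down to 2 replicas
kubectl patch statefulset web -p '{"spec":{"replicas":2}}'
# Watch Pods being created or deleted in order
kubectl get pods -l app=nginx -w
```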
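Key behavior: when scaling down, the StatefulSet deletes Pods from the highest ordinal to the lowest (web-4 first, then web-3) and keeps the associated PVCs so they can be reattached on a later scale-up.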
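Since retained PVCs keep consuming storage, it helps to know exactly which claims a scale-down leaves behind. The sketch below lists them using the `<template name>-<StatefulSet name>-<ordinal>` naming pattern; the helper name `retained_pvcs` is hypothetical, and it assumes the retention behavior described above.

```bash
#!/usr/bin/env bash
# retained_pvcs: list the PVCs left behind when a StatefulSet is scaled down.
# Arguments: claim template name, StatefulSet name, old replicas, new replicas.
# (Hypothetical helper, for illustration only.)
retained_pvcs() {
  local tpl="$1" sts="$2" old="$3" new="$4" i
  # Pods are removed from the highest ordinal down, so list in that order.
  for ((i = old - 1; i >= new; i--)); do
    echo "${tpl}-${sts}-${i}"
  done
}

retained_pvcs www web 5 2
# www-web-4
# www-web-3
# www-web-2
```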
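StatefulSet supports two update strategies:
RollingUpdate (the default) updates Pods one at a time in reverse ordinal order; partition limits the update to Pods at or above the given ordinal:
```yaml
updateStrategy:
  type: RollingUpdate
  rollingUpdate:
    partition: 2   # only Pods with ordinal >= 2 are updated
```
OnDelete updates a Pod only after it has been deleted manually:
```yaml
updateStrategy:
  type: OnDelete
```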
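The effect of `partition` is easiest to see by listing the Pods a rolling update would touch. A minimal sketch, assuming bash; the helper name `pods_to_update` is hypothetical.

```bash
#!/usr/bin/env bash
# pods_to_update: print, in update order, the Pods that a RollingUpdate with
# the given partition would touch.
# Arguments: StatefulSet name, replica count, partition.
# (Hypothetical helper, for illustration only.)
pods_to_update() {
  local sts="$1" replicas="$2" partition="$3" i
  # Rolling updates go from the highest ordinal to the lowest and stop at the
  # partition boundary; ordinals below it keep the old revision.
  for ((i = replicas - 1; i >= partition; i--)); do
    echo "${sts}-${i}"
  done
}

pods_to_update web 5 2
# web-4
# web-3
# web-2
```

Lowering the partition step by step (for example 4, then 2, then 0) gives a simple canary-style rollout.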
Example of updating the image version:
```bash
# Option 1: edit the resource directly
kubectl edit statefulset web
# Change spec.template.spec.containers[0].image, then save
# Option 2: use the patch command
kubectl patch statefulset web --type='json' -p='[{"op": "replace", "path": "/spec/template/spec/containers/0/image", "value":"nginx:1.22"}]'
# Watch the rollout progress
kubectl rollout status statefulset web
```
Storage planning:
Network optimization:
Monitoring and logging:
Backup strategy:
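Problem 1: Pod stuck in Pending
Possible causes and fixes:
The PVC is not bound (check the claim's events and confirm that the StorageClass exists):
```bash
kubectl describe pvc www-web-0
kubectl get storageclass
```
The Pod cannot be scheduled, for example because of insufficient node resources (check the Pod's events and the node status):
```bash
kubectl describe pod web-0
kubectl get nodes -o wide
```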
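Problem 2: DNS resolution fails
Troubleshooting steps:
```bash
# Check that the Service was created correctly
kubectl get svc nginx
# Check that CoreDNS is running
kubectl -n kube-system get pods -l k8s-app=kube-dns
# Run an nslookup test inside a Pod (requires nslookup in the container image)
kubectl exec -it web-0 -- nslookup nginx.default.svc.cluster.local
```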
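Problem 3: volume mount fails
How to check:
```bash
# View the Pod's events
kubectl describe pod web-0
# Check PVC status
kubectl get pvc
# Check PV binding
kubectl get pv
# Check the storage provisioner logs (the label depends on how the provisioner was deployed)
kubectl -n kube-system logs -l app=nfs-client-provisioner
```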
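When a StatefulSet has many replicas, scanning the PVC list by eye gets tedious. The sketch below prints only the claims that are not Bound. The helper name `unbound_pvcs` is hypothetical, and the sample rows are invented for illustration; it assumes the STATUS value is the second column, as in `kubectl get pvc --no-headers` output.

```bash
#!/usr/bin/env bash
# unbound_pvcs: read `kubectl get pvc --no-headers` style lines on stdin and
# print the names of claims whose STATUS column is not Bound.
# (Hypothetical helper, for illustration only.)
unbound_pvcs() {
  awk '$2 != "Bound" { print $1 }'
}

# Illustrative sample input, not real cluster output:
unbound_pvcs <<'EOF'
www-web-0   Bound     pvc-1111   1Gi   RWO   nfs-client   5m
www-web-1   Pending                          nfs-client   1m
EOF
# prints www-web-1
```

Against a live cluster this would be used as `kubectl get pvc --no-headers | unbound_pvcs`.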
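View StatefulSet events:
```bash
kubectl describe statefulset web
```
Access a specific Pod:
```bash
# Open a shell in the Pod by name
kubectl exec -it web-0 -- /bin/bash
# Reach a specific Pod through its headless Service DNS name (from inside the cluster)
curl http://web-0.nginx.default.svc.cluster.local
```
Force-delete a stuck Pod (use with care: this can briefly leave two Pods running with the same identity):
```bash
kubectl delete pod web-0 --grace-period=0 --force
```
View the controller's decision logs:
```bash
# The StatefulSet controller runs inside kube-controller-manager
kubectl logs -n kube-system <statefulset-controller-pod-name>
```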
The following is an example StatefulSet configuration for a MySQL primary-replica cluster:
```yaml
# mysql-statefulset.yaml
apiVersion: v1
kind: Service
metadata:
  name: mysql
  labels:
    app: mysql
spec:
  ports:
  - port: 3306
    name: mysql
  clusterIP: None
  selector:
    app: mysql
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mysql
spec:
  serviceName: "mysql"
  replicas: 3
  selector:
    matchLabels:
      app: mysql
  template:
    metadata:
      labels:
        app: mysql
    spec:
      initContainers:
      - name: init-mysql
        image: mysql:5.7
        command:
        - bash
        - "-c"
        - |
          set -ex
          # Derive server-id from the Pod ordinal
          [[ `hostname` =~ -([0-9]+)$ ]] || exit 1
          ordinal=${BASH_REMATCH[1]}
          echo "[mysqld]" > /mnt/conf.d/server-id.cnf
          echo server-id=$((100 + $ordinal)) >> /mnt/conf.d/server-id.cnf
          # Ordinal 0 is the primary and enables the binlog. Each file needs
          # its own [mysqld] header, and ">" avoids duplicate lines on restart.
          if [[ $ordinal -eq 0 ]]; then
            echo "[mysqld]" > /mnt/conf.d/master.cnf
            echo log-bin=mysql-bin >> /mnt/conf.d/master.cnf
          else
            echo "[mysqld]" > /mnt/conf.d/slave.cnf
            echo log-slave-updates=1 >> /mnt/conf.d/slave.cnf
          fi
        volumeMounts:
        - name: conf
          mountPath: /mnt/conf.d
      containers:
      - name: mysql
        image: mysql:5.7
        env:
        - name: MYSQL_ROOT_PASSWORD
          value: "password"   # example only; use a Secret in production
        ports:
        - containerPort: 3306
          name: mysql
        volumeMounts:
        - name: data
          mountPath: /var/lib/mysql
        - name: conf
          mountPath: /etc/mysql/conf.d
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      storageClassName: "ssd"
      resources:
        requests:
          storage: 10Gi
  - metadata:
      name: conf
    spec:
      accessModes: ["ReadWriteOnce"]
      storageClassName: "standard"
      resources:
        requests:
          storage: 1Gi
```
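Key design points: the init container derives a unique server-id from the Pod ordinal, the Pod with ordinal 0 acts as the primary and enables the binlog, and each Pod gets its own data and configuration volumes through volumeClaimTemplates.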
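The ordinal-based logic in the init container can be tried outside the cluster. The sketch below mirrors it as two standalone functions; the names `mysql_server_id` and `mysql_role` are hypothetical, and the offset of 100 is taken from the manifest above.

```bash
#!/usr/bin/env bash
# mysql_server_id: derive a unique MySQL server-id from a Pod hostname,
# mirroring the init container logic (100 + ordinal).
# (Hypothetical helper, for illustration only.)
mysql_server_id() {
  local host="$1"
  [[ "$host" =~ -([0-9]+)$ ]] || return 1
  echo $((100 + 10#${BASH_REMATCH[1]}))
}

# mysql_role: ordinal 0 is treated as the primary, all others as replicas.
mysql_role() {
  local host="$1"
  [[ "$host" =~ -([0-9]+)$ ]] || return 1
  if ((10#${BASH_REMATCH[1]} == 0)); then
    echo primary
  else
    echo replica
  fi
}

mysql_server_id mysql-2   # prints 102
mysql_role mysql-0        # prints primary
```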
Deploying a Redis cluster requires particular attention to node discovery and configuration:
```yaml
# redis-cluster-statefulset.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: redis-cluster
data:
  redis.conf: |
    cluster-enabled yes
    cluster-require-full-coverage no
    cluster-node-timeout 15000
    cluster-config-file /data/nodes.conf
    appendonly yes
---
apiVersion: v1
kind: Service
metadata:
  name: redis-cluster
spec:
  ports:
  - port: 6379
    name: client
  - port: 16379
    name: gossip
  clusterIP: None
  selector:
    app: redis-cluster
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: redis-cluster
spec:
  serviceName: redis-cluster
  replicas: 6
  selector:
    matchLabels:
      app: redis-cluster
  template:
    metadata:
      labels:
        app: redis-cluster
    spec:
      containers:
      - name: redis
        image: redis:6.2
        ports:
        - containerPort: 6379
          name: client
        - containerPort: 16379
          name: gossip
        command: ["redis-server", "/etc/redis/redis.conf"]
        volumeMounts:
        - name: conf
          mountPath: /etc/redis
          readOnly: true
        - name: data
          mountPath: /data
      volumes:
      - name: conf
        configMap:
          name: redis-cluster
          items:
          - key: redis.conf
            path: redis.conf
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      storageClassName: "ssd"
      resources:
        requests:
          storage: 5Gi
```
Cluster initialization script:
```bash
# Collect the IP:port of every Pod
nodes=()
for i in {0..5}; do
  nodes+=("$(kubectl get pod "redis-cluster-$i" -o jsonpath='{.status.podIP}'):6379")
done
# Create the cluster with one replica per master
kubectl exec -it redis-cluster-0 -- redis-cli --cluster create "${nodes[@]}" --cluster-replicas 1
```
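Before running the initialization, it is worth checking that the replica count works out: Redis Cluster requires at least 3 master nodes, and the number of masters is the node count divided by `--cluster-replicas` plus one. A minimal sketch of that arithmetic; the helper name `redis_master_count` is hypothetical.

```bash
#!/usr/bin/env bash
# redis_master_count: number of masters formed from a node count and a
# --cluster-replicas value. Fails when fewer than 3 masters would result,
# since Redis Cluster requires at least 3 master nodes.
# (Hypothetical helper, for illustration only.)
redis_master_count() {
  local nodes="$1" replicas_per_master="$2"
  local masters=$((nodes / (replicas_per_master + 1)))
  if ((masters < 3)); then
    echo "need at least 3 masters, got ${masters}" >&2
    return 1
  fi
  echo "$masters"
}

redis_master_count 6 1   # prints 3
```

With the 6 replicas in the manifest above and one replica per master, this gives 3 masters and 3 replicas.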
Deploying a Kafka cluster requires attention to the broker ID and the advertised listener address:
```yaml
# kafka-statefulset.yaml
# ZooKeeper connection settings are omitted from this example.
apiVersion: v1
kind: Service
metadata:
  name: kafka
spec:
  ports:
  - port: 9092
    name: client
  clusterIP: None
  selector:
    app: kafka
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: kafka
spec:
  serviceName: kafka
  replicas: 3
  selector:
    matchLabels:
      app: kafka
  template:
    metadata:
      labels:
        app: kafka
    spec:
      containers:
      - name: kafka
        image: confluentinc/cp-kafka:6.2.0
        command:
        - sh
        - -c
        - |
          # broker.id must be an integer, so take it from the Pod ordinal
          # (metadata.name would yield a string such as "kafka-0").
          # Assumes the image's standard entrypoint script.
          export KAFKA_BROKER_ID="${HOSTNAME##*-}"
          exec /etc/confluent/docker/run
        env:
        - name: POD_NAME   # must be defined before it is referenced below
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.name
        - name: KAFKA_ADVERTISED_LISTENERS
          value: PLAINTEXT://$(POD_NAME).kafka.default.svc.cluster.local:9092
        ports:
        - containerPort: 9092
          name: client
        volumeMounts:
        - name: data
          mountPath: /var/lib/kafka/data
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      storageClassName: "ssd"
      resources:
        requests:
          storage: 20Gi
```
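Key configuration notes: every broker needs a unique broker ID, and KAFKA_ADVERTISED_LISTENERS must advertise the Pod's stable DNS name so that clients can reach a specific broker.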
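Both values can be derived from the Pod name alone. The sketch below shows the derivation as two standalone functions; the names `kafka_broker_id` and `kafka_advertised_listener` are hypothetical, and the default cluster domain `cluster.local` is assumed.

```bash
#!/usr/bin/env bash
# kafka_broker_id: numeric broker ID taken from the Pod ordinal.
# (Hypothetical helper, for illustration only.)
kafka_broker_id() {
  local pod="$1"
  echo "${pod##*-}"
}

# kafka_advertised_listener: PLAINTEXT listener advertising the Pod's stable
# DNS name. Arguments: Pod name, headless Service name, namespace.
kafka_advertised_listener() {
  local pod="$1" svc="$2" ns="$3"
  echo "PLAINTEXT://${pod}.${svc}.${ns}.svc.cluster.local:9092"
}

kafka_broker_id kafka-1                           # prints 1
kafka_advertised_listener kafka-1 kafka default
# prints PLAINTEXT://kafka-1.kafka.default.svc.cluster.local:9092
```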
Storage optimization:
Network optimization:
Resource limits:
```yaml
resources:
  limits:
    cpu: "2"
    memory: 4Gi
  requests:
    cpu: "1"
    memory: 2Gi
```
Scheduling optimization:
Principle of least privilege:
```yaml
securityContext:
  runAsNonRoot: true
  allowPrivilegeEscalation: false
  capabilities:
    drop:
    - ALL
```
Network isolation:
Sensitive data protection:
Auditing and monitoring:
Cross-availability-zone deployment:
```yaml
topologySpreadConstraints:
- maxSkew: 1
  topologyKey: topology.kubernetes.io/zone
  whenUnsatisfiable: DoNotSchedule
  labelSelector:
    matchLabels:
      app: mysql
```
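To build intuition for `maxSkew`, the sketch below computes the spread between the most and least loaded zones for a given Pod distribution. It is a simplification: the real scheduler evaluates skew for each candidate placement and only over eligible topology domains. The helper names `zone_skew` and `satisfies_max_skew` are hypothetical.

```bash
#!/usr/bin/env bash
# zone_skew: given the number of matching Pods in each zone, print the
# difference between the most and least loaded zones.
# (Hypothetical helper, for illustration only.)
zone_skew() {
  local min="$1" max="$1" n
  for n in "$@"; do
    ((n < min)) && min="$n"
    ((n > max)) && max="$n"
  done
  echo $((max - min))
}

# satisfies_max_skew: succeed when the distribution stays within maxSkew.
# Arguments: maxSkew, then the per-zone Pod counts.
satisfies_max_skew() {
  local max_skew="$1"
  shift
  (($(zone_skew "$@") <= max_skew))
}

zone_skew 1 1 1                                            # prints 0
satisfies_max_skew 1 2 1 0 || echo "violates maxSkew=1"    # prints the message
```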
Regular backups:
Failover testing:
Key metrics to monitor:
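When deploying StatefulSets in a real production environment, start small and verify each capability step by step to make sure the application's availability and durability requirements are met. For business-critical systems, write a detailed operations runbook that covers routine maintenance, monitoring metrics, and incident handling procedures.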