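1. High-Availability Architecture: Design and Principles
In internet service architectures, high availability (HA) is a core requirement for business continuity. Combining Keepalived with Nginx achieves near-seamless service switchover by floating a virtual IP (VIP) between nodes: when the master node fails, failover completes within seconds. The approach suits web applications with a low tolerance for service interruption.
How the core components work together:
- Nginx: a high-performance reverse proxy that handles client requests and load-balances them across backends
- Keepalived: moves the VIP automatically using VRRP and monitors the Nginx process through a check script
- VRRP: master and backup nodes exchange advertisements over multicast (address 224.0.0.18, IP protocol 112); the node with the highest priority owns the VIP
Key tip: in a real deployment, give the master and backup nodes identical hardware so that a switchover does not create a performance bottleneck. On the network side, make sure multicast traffic can pass between the nodes (for example, that IGMP handling on the switches does not filter it).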
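To put a rough number on "within seconds", the sketch below computes how long a backup waits before it declares the master dead. It assumes VRRPv2 timing as defined in RFC 3768 and an abrupt master failure (no final advertisement); the function name `master_down_interval` is illustrative, not part of Keepalived.

```bash
#!/bin/bash
# Sketch: how long a backup waits before taking over, per VRRPv2 (RFC 3768):
#   skew_time            = (256 - priority) / 256
#   master_down_interval = 3 * advert_int + skew_time
master_down_interval() {
    local priority=$1 advert_int=$2
    awk -v p="$priority" -v a="$advert_int" \
        'BEGIN { printf "%.3f\n", 3 * a + (256 - p) / 256 }'
}

# Backup node with priority 90 and advert_int 1 (the values used later in this article)
master_down_interval 90 1   # prints 3.648
```

With a one-second advertisement interval the takeover delay stays under four seconds; lowering `advert_int` shortens it at the cost of more VRRP traffic.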
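2. Environment Preparation and System Configuration
2.1 Network Topology Planning
IP allocation for a typical two-node deployment:
| Node role | Physical IP | Virtual IP (VIP) | Service port |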
|---|---|---|---|
| Master | 192.168.100.201 | 192.168.100.250 | 80 |
| Backup | 192.168.100.202 | 192.168.100.250 | 80 |
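In the plan above the VIP sits in the same /24 as both physical addresses, which is the usual arrangement when the VIP is added to the same interface. The following is a minimal sketch for sanity-checking that before deployment; the helper names `ip_to_int` and `same_subnet` are made up for this example.

```bash
#!/bin/bash
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
    local a b c d
    IFS=. read -r a b c d <<< "$1"
    echo $(( (a << 24) | (b << 16) | (c << 8) | d ))
}

# Succeed if both addresses fall inside the same network for the given prefix length.
same_subnet() {
    local ip1 ip2 prefix=$3 mask
    ip1=$(ip_to_int "$1")
    ip2=$(ip_to_int "$2")
    mask=$(( (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF ))
    [ $(( ip1 & mask )) -eq $(( ip2 & mask )) ]
}

# Check both nodes from the table against the VIP
for node in 192.168.100.201 192.168.100.202; do
    if same_subnet "$node" 192.168.100.250 24; then
        echo "$node and the VIP share a /24"
    else
        echo "$node is NOT in the VIP's /24" >&2
    fi
done
```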
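2.2 Basic System Configuration
Hostname and network configuration:
```bash
# Master node
hostnamectl set-hostname nginx-master
# Backup node
hostnamectl set-hostname nginx-backup
# Network configuration example (CentOS); use 192.168.100.202/24 on the backup node
nmcli con mod eth0 ipv4.addresses 192.168.100.201/24
nmcli con mod eth0 ipv4.gateway 192.168.100.1
# Switch the connection to static addressing
nmcli con mod eth0 ipv4.method manual
nmcli con up eth0
```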
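Firewall policy:
```bash
# Allow VRRP advertisements (IP protocol 112, sent via multicast)
firewall-cmd --add-rich-rule='rule protocol value="vrrp" accept' --permanent
# Open the HTTP service port
firewall-cmd --add-port=80/tcp --permanent
firewall-cmd --reload
```
A word from experience: in production, consider disabling SELinux or setting it to permissive mode so that permission problems do not cause service failures.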
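3. Nginx Cluster Deployment in Detail
3.1 Best Practices for Building from Source
Install dependencies:
```bash
yum install -y gcc make pcre-devel openssl-devel zlib-devel
```
Build options:
```bash
# Run from the unpacked Nginx source directory
./configure \
    --prefix=/usr/local/nginx \
    --with-http_ssl_module \
    --with-http_stub_status_module \
    --with-http_realip_module \
    --with-threads \
    --with-stream
# Compile and install into the prefix chosen above
make && make install
```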
Performance tuning (nginx.conf):
```nginx
worker_processes auto;
worker_rlimit_nofile 65535;
events {
    worker_connections 4096;
    use epoll;
    multi_accept on;
}
http {
    keepalive_timeout 65;
    keepalive_requests 10000;
    sendfile on;
    tcp_nopush on;
}
```
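As a rough guide to what these numbers allow, the sketch below estimates the upper bound on simultaneous clients. It relies on the fact that `worker_connections` counts every connection a worker holds, including those to proxied backends, so a proxied request uses two. The function name `max_clients` and the four-worker figure are assumptions for illustration; `worker_processes auto` resolves to the number of CPU cores on the host.

```bash
#!/bin/bash
# Upper bound on simultaneous client connections.
# Args: worker_processes  worker_connections  mode ("static" or "proxy")
max_clients() {
    local workers=$1 conns=$2 mode=${3:-static}
    local total=$(( workers * conns ))
    # When proxying, each client connection is paired with an upstream connection
    if [ "$mode" = "proxy" ]; then
        total=$(( total / 2 ))
    fi
    echo "$total"
}

# Assumed 4-core host with the worker_connections value from the config above
max_clients 4 4096 proxy   # prints 8192
```

Each of those connections also needs a file descriptor, so `worker_rlimit_nofile` should stay comfortably above `worker_connections`, as it does in the configuration shown.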
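3.2 Keeping the Two Nodes Consistent
- Create a test page on each node:
```bash
# Put each node's own IP in the page (192.168.100.202 on the backup) so a failover is visible in the browser
echo "<h1>Welcome to 192.168.100.201</h1>" > /usr/local/nginx/html/index.html
```
- Configure systemd service management:
```ini
# /usr/lib/systemd/system/nginx.service
[Unit]
Description=nginx service
After=network.target
[Service]
Type=forking
# Default PID file location for a build with --prefix=/usr/local/nginx
PIDFile=/usr/local/nginx/logs/nginx.pid
ExecStart=/usr/local/nginx/sbin/nginx
ExecReload=/usr/local/nginx/sbin/nginx -s reload
ExecStop=/usr/local/nginx/sbin/nginx -s quit
PrivateTmp=true
[Install]
WantedBy=multi-user.target
```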
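4. Keepalived High-Availability Configuration
4.1 Differences Between Master and Backup Configuration
Master node configuration (/etc/keepalived/keepalived.conf):
```nginx
global_defs {
    router_id nginx_master
}
vrrp_script chk_nginx {
    script "/etc/keepalived/check_nginx.sh"
    interval 2      # run the check every 2 seconds
    weight -20      # subtract 20 from the priority while the check fails
}
vrrp_instance VI_1 {
    state MASTER
    interface eth0
    virtual_router_id 51    # must be identical on both nodes
    priority 100
    advert_int 1
    virtual_ipaddress {
        192.168.100.250/24
    }
    track_script {
        chk_nginx
    }
}
```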
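Backup node configuration (it differs in router_id, state, and priority):
```nginx
global_defs {
    router_id nginx_backup
}
# The script must be defined on this node as well, because track_script refers to it
vrrp_script chk_nginx {
    script "/etc/keepalived/check_nginx.sh"
    interval 2
    weight -20
}
vrrp_instance VI_1 {
    state BACKUP
    interface eth0
    virtual_router_id 51
    priority 90
    advert_int 1
    virtual_ipaddress {
        192.168.100.250/24
    }
    track_script {
        chk_nginx
    }
}
```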
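The interplay of `priority` and `weight` in the two configurations can be sketched as plain arithmetic. This is a simplified model, assuming preemption is left at its default (enabled) and that the check script signals failure through a non-zero exit status; `effective_priority` and `vip_owner` are hypothetical helper names.

```bash
#!/bin/bash
# Effective priority: the base priority, plus the (negative) weight while the check fails.
# check_ok is 1 when the tracked script succeeds, 0 when it fails.
effective_priority() {
    local base=$1 weight=$2 check_ok=$3
    if [ "$check_ok" -eq 1 ]; then
        echo "$base"
    else
        echo $(( base + weight ))
    fi
}

# Print which node should hold the VIP, using the values from the configs above.
vip_owner() {
    local master_ok=$1 backup_ok=$2 m b
    m=$(effective_priority 100 -20 "$master_ok")
    b=$(effective_priority 90 -20 "$backup_ok")
    if [ "$m" -ge "$b" ]; then echo master; else echo backup; fi
}

vip_owner 1 1   # prints master (100 vs 90)
vip_owner 0 1   # prints backup (80 vs 90)
```

The weight only works if its magnitude exceeds the gap between the two priorities: with 100 and 90, a weight of -20 is enough, whereas -5 would leave the failed master in charge.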
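4.2 Nginx Health-Check Script
```bash
#!/bin/bash
# Count processes named exactly "nginx". Grepping `ps -ef` for "nginx" would also
# match this script (check_nginx.sh), so the count would never reach zero.
count=$(pgrep -x nginx | wc -l)
if [ "$count" -eq 0 ]; then
    # Try to bring Nginx back once
    systemctl start nginx
    sleep 2
    if [ "$(pgrep -x nginx | wc -l)" -eq 0 ]; then
        # Still down: stop Keepalived so the VIP moves to the other node
        systemctl stop keepalived
        exit 1
    fi
fi
exit 0
```
Make the script executable:
```bash
chmod +x /etc/keepalived/check_nginx.sh
```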
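A lighter variant of the check only reports status and leaves the failover decision to `weight -20`, instead of restarting Nginx and stopping Keepalived itself. The sketch below is one way to do that, assuming the default PID file of a `--prefix=/usr/local/nginx` build; `nginx_alive` is a hypothetical helper, and `kill -0` needs permission to signal the Nginx master process, so the script is assumed to run as root.

```bash
#!/bin/bash
# Succeed if the PID recorded in the given file belongs to a live process.
nginx_alive() {
    local pidfile=$1 pid
    [ -r "$pidfile" ] || return 1
    pid=$(cat "$pidfile")
    # kill -0 sends no signal; it only tests whether the process can be signalled
    [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null
}

# As a Keepalived check script, the body would reduce to:
#   nginx_alive /usr/local/nginx/logs/nginx.pid || exit 1
```

Because a script like this has no side effects, it is safe to run every couple of seconds, and the priority returns to normal on its own once Nginx is back.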
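5. Advanced Features
5.1 Integrating LVS Load Balancing
DR mode configuration example:
```bash
# Director (scheduler) configuration
ipvsadm -A -t 192.168.100.250:80 -s rr
ipvsadm -a -t 192.168.100.250:80 -r 192.168.100.202:80 -g
ipvsadm -a -t 192.168.100.250:80 -r 192.168.100.203:80 -g
# Real server (RS) configuration, on every real server
# Bind the VIP to loopback so the RS accepts packets addressed to it
ip addr add 192.168.100.250/32 dev lo
# Keep the RS from answering or announcing ARP for the VIP
echo 1 > /proc/sys/net/ipv4/conf/lo/arp_ignore
echo 2 > /proc/sys/net/ipv4/conf/lo/arp_announce
echo 1 > /proc/sys/net/ipv4/conf/all/arp_ignore
echo 2 > /proc/sys/net/ipv4/conf/all/arp_announce
```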
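The `-s rr` option selects round-robin scheduling. As a deliberately simplified illustration of what that means (the kernel's IPVS scheduler also has to deal with weights and unavailable servers, which this ignores), the hypothetical `rr_pick` function below maps the n-th connection to a real server.

```bash
#!/bin/bash
# Round-robin: pick the real server for the n-th connection (n starts at 0).
rr_pick() {
    local n=$1
    shift
    local servers=("$@")
    echo "${servers[$(( n % ${#servers[@]} ))]}"
}

# Four consecutive connections alternate between the two real servers
for n in 0 1 2 3; do
    rr_pick "$n" 192.168.100.202 192.168.100.203
done
```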
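5.2 Automatic Failover with Keepalived + LVS
Combined configuration example:
```nginx
virtual_server 192.168.100.250 80 {
    delay_loop 6                # seconds between health-check rounds
    lb_algo rr
    lb_kind DR
    persistence_timeout 50      # keep a client on the same real server for 50 seconds
    protocol TCP
    real_server 192.168.100.202 80 {
        weight 1
        TCP_CHECK {
            connect_timeout 3
            nb_get_retry 3          # newer Keepalived releases prefer the name "retry"
            delay_before_retry 3
        }
    }
    real_server 192.168.100.203 80 {
        weight 1
        TCP_CHECK {
            connect_timeout 3
            nb_get_retry 3
            delay_before_retry 3
        }
    }
}
```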
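To see what a TCP_CHECK amounts to, the sketch below reproduces the idea by hand: a connect attempt with a timeout, repeated a few times with a pause in between. It uses bash's `/dev/tcp` pseudo-device and the coreutils `timeout` command. The function names are illustrative, and counting one initial attempt plus `retries` further attempts is an assumption about the semantics, not a statement of how Keepalived counts.

```bash
#!/bin/bash
# One TCP connect attempt with a timeout, via bash's /dev/tcp pseudo-device.
tcp_connect() {
    local host=$1 port=$2 timeout_s=$3
    timeout "$timeout_s" bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null
}

# Retry wrapper mirroring connect_timeout / nb_get_retry / delay_before_retry.
# Args: host port [timeout_s] [retries] [delay_s]
tcp_check() {
    local host=$1 port=$2 timeout_s=${3:-3} retries=${4:-3} delay=${5:-3} i
    for (( i = 0; i <= retries; i++ )); do
        tcp_connect "$host" "$port" "$timeout_s" && return 0
        # Pause before the next attempt, but not after the last one
        if [ "$i" -lt "$retries" ]; then
            sleep "$delay"
        fi
    done
    return 1
}

# Example: tcp_check 192.168.100.202 80 && echo "real server is up"
```

Running the same probe manually from the director is a quick way to tell whether a real server was removed from the pool because of the network path or because of the service itself.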
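6. Troubleshooting and Performance Optimization
6.1 Diagnosing Common Problems
Steps to take when the VIP does not move:
- Check VRRP traffic:
```bash
tcpdump -i eth0 -n vrrp
```
- Verify the firewall rules:
```bash
firewall-cmd --list-all
```
- Follow the Keepalived log:
```bash
journalctl -u keepalived -f
```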
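Before digging into packet captures, it helps to confirm which node actually holds the VIP. The sketch below parses the one-line-per-address output of `ip -o -4 addr show`; `has_vip` is a hypothetical helper that reads that output on standard input, which also makes it easy to run against output collected from both nodes.

```bash
#!/bin/bash
# Read `ip -o -4 addr show` output on stdin; succeed if the given VIP is assigned.
has_vip() {
    local vip=$1
    awk -v vip="$vip" '
        { split($4, a, "/"); if (a[1] == vip) found = 1 }   # field 4 is "address/prefix"
        END { exit !found }
    '
}

# On a node:
#   ip -o -4 addr show dev eth0 | has_vip 192.168.100.250 && echo "VIP is on this node"
```

If both nodes report the VIP at the same time, the nodes are not seeing each other's advertisements, which points back at the multicast and firewall checks above.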
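Analyzing Nginx performance bottlenecks:
```bash
# Count connections involving port 80 (the trailing space avoids matching :8080 and the like)
netstat -ant | grep ':80 ' | wc -l
# Load of the Nginx processes
top -p "$(pgrep -d',' nginx)"
```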
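A single total hides the difference between active and lingering connections. The sketch below breaks the same `netstat -ant` output down by TCP state for local port 80; `count_states` is an illustrative name, and the column positions assume the Linux net-tools layout (local address in column 4, state in column 6).

```bash
#!/bin/bash
# Read `netstat -ant` output on stdin; count connections on local port 80 per TCP state.
count_states() {
    awk '$4 ~ /:80$/ { n[$6]++ } END { for (s in n) print s, n[s] }' | sort
}

# On a node:
#   netstat -ant | count_states
```

A large share of TIME_WAIT entries usually points at short-lived connections, in which case the keepalive settings from section 3.1 are the first thing to revisit.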
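6.2 Recommendations for Production
- Keepalived parameter tuning:
```nginx
vrrp_instance VI_1 {
    # Delay preemption after the higher-priority node recovers.
    # This takes effect only when the instance's initial state is BACKUP.
    preempt_delay 300
    notify_master "/etc/keepalived/notify.sh master"
    notify_backup "/etc/keepalived/notify.sh backup"
}
```
- Strengthening Nginx availability:
  - Configure health checks that cover multiple processes
  - Enable a shared-memory zone to hold session state
  - Synchronize configuration between master and backup automatically
- Monitoring integration:
```nginx
# Status endpoint for Prometheus monitoring. stub_status emits plain text,
# so an exporter has to translate it into Prometheus metrics.
location /metrics {
    stub_status on;
    access_log off;
}
```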
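The tuning example refers to /etc/keepalived/notify.sh without showing it. A minimal sketch of what such a script might contain follows; it only appends a line to a log for each state change. The log path, the `LOG_FILE` variable, and the `log_transition` function are assumptions for illustration, and a real script would more likely send an alert or adjust dependent services.

```bash
#!/bin/bash
# Hypothetical /etc/keepalived/notify.sh: record each VRRP state change.
# LOG_FILE is an assumed location; override it through the environment.
LOG_FILE=${LOG_FILE:-/var/log/keepalived-notify.log}

log_transition() {
    local state=$1
    case "$state" in
        master|backup|fault)
            printf '%s %s entered %s state\n' \
                "$(date '+%F %T')" "$(uname -n)" "$state" >> "$LOG_FILE"
            ;;
        *)
            echo "usage: notify.sh {master|backup|fault}" >&2
            return 1
            ;;
    esac
}

# The state arrives as the first argument, as in the notify_master / notify_backup lines
if [ $# -gt 0 ]; then
    log_transition "$1"
fi
```

A timestamped record of transitions makes it much easier to correlate a VIP move with whatever happened to Nginx or the network at that moment.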
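In day-to-day operations, we wrapped the whole setup in an Ansible role to automate deployment. The key lesson: run a failover test after every change, and carry out a deliberate switchover drill at least once a month. For financial-grade applications, consider a three-node architecture. Setting nopreempt keeps a recovered node from taking the VIP back and so avoids unnecessary switchovers; note that it does not by itself prevent split-brain, which depends on VRRP advertisements getting through reliably between the nodes.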