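1. Project Overview
This project builds a complete high-availability architecture for a web service. Its core components are HAProxy for load balancing, Nginx web servers, NFS shared storage, DNS name resolution, and a Keepalived failover pair. The stack covers the full request path from name resolution through load balancing to the backend servers, and it removes the load balancer and the web nodes as single points of failure. The NFS and DNS servers are still single instances in this layout (section 10 lists ways to make NFS redundant).
As operations engineers, we are often asked to build production-grade web architectures like this one. The design is a good fit for web applications at small and mid-sized companies: it is simple to configure, easy to maintain, and moderate in cost. The sections below walk through the deployment and key configuration of each component.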
2. Environment Planning and Base Configuration
2.1 Server Role Assignment
| Hostname | IP address | Installed software | Role |
|---|---|---|---|
| haproxy01 | 192.168.72.100 | haproxy, keepalived | Primary load balancer |
| haproxy02 | 192.168.72.101 | haproxy, keepalived | Backup load balancer |
| nginx1 | 192.168.72.10 | nginx, nfs-utils | Web node 1 (mounts the NFS share) |
| nginx2 | 192.168.72.20 | nginx, nfs-utils | Web node 2 (mounts the NFS share) |
| nfs | 192.168.72.30 | nfs-utils | NFS file server |
| dns | 192.168.72.40 | bind | DNS server |
| VIP | 192.168.72.200 | - | Virtual IP (high-availability entry point) |
2.2 Base System Configuration
Apply the following base configuration on every node:
```bash
# 1. Set the hostname (haproxy01 shown as the example)
hostnamectl set-hostname haproxy01 && bash
# 2. Stop and disable the firewall
systemctl disable --now firewalld.service
# 3. Switch SELinux to permissive mode (immediately, and persistently across reboots)
setenforce 0
sed -i "s/SELINUX=enforcing/SELINUX=permissive/" /etc/selinux/config
# 4. Configure a static IP (haproxy01 shown as the example)
nmcli c m ens160 ipv4.method manual \
    ipv4.addresses 192.168.72.100/24 \
    ipv4.gateway 192.168.72.2 \
    ipv4.dns 223.5.5.5 \
    connection.autoconnect yes
nmcli c up ens160
```
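Note: in a real production environment, keep the firewall enabled and open only the ports each role needs. It is disabled here only to keep the demonstration simple.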
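As a rough sketch of what keeping firewalld enabled could look like, the helper below prints the `firewall-cmd` calls each role would need. The function name and the role keywords are hypothetical, and the service and port choices are assumptions read off the configurations in this article (HTTP on 80, the HAProxy stats page on 8080, VRRP for Keepalived), so review them against your own setup before applying anything.

```bash
#!/usr/bin/env bash
# Sketch: print (not run) the firewall-cmd calls for one server role.
# Role keywords and port choices are assumptions based on this article.

firewall_rules_for_role() {
    local role="$1"
    case "$role" in
        haproxy)
            echo "firewall-cmd --permanent --add-service=http"
            echo "firewall-cmd --permanent --add-port=8080/tcp"
            # Keepalived advertisements use VRRP (IP protocol 112)
            echo "firewall-cmd --permanent --add-rich-rule='rule protocol value=\"vrrp\" accept'"
            ;;
        nginx)
            echo "firewall-cmd --permanent --add-service=http"
            ;;
        nfs)
            # rpc-bind and mountd are needed for NFSv3 clients and showmount
            echo "firewall-cmd --permanent --add-service=nfs"
            echo "firewall-cmd --permanent --add-service=rpc-bind"
            echo "firewall-cmd --permanent --add-service=mountd"
            ;;
        dns)
            echo "firewall-cmd --permanent --add-service=dns"
            ;;
        *)
            echo "unknown role: $role" >&2
            return 1
            ;;
    esac
    echo "firewall-cmd --reload"
}
```

Calling `firewall_rules_for_role haproxy` only prints the commands. Once they look right for your environment, they can be executed with `firewall_rules_for_role haproxy | sh`.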
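3. NFS Server Deployment
3.1 Installing and Configuring NFS
```bash
# 1. Create the shared directory
mkdir -p /share
# 2. Create a test page
echo "nfs 192.168.72.30" > /share/index.html
# 3. Set the directory ownership
chown -R nobody: /share
# 4. Install the NFS service
dnf install nfs-utils -y
# 5. Configure the export rule
cat > /etc/exports <<EOF
/share 192.168.72.0/24(rw,sync,no_subtree_check)
EOF
# 6. Start the NFS service and enable it at boot
systemctl start nfs-server
systemctl enable nfs-server
# 7. Verify the export
exportfs -v
# The output should list /share exported to 192.168.72.0/24, followed by the effective options
```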
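3.2 NFS Configuration Explained
- rw: allows read and write access
- sync: synchronous writes; the server replies only after the data has been written to disk
- no_subtree_check: disables subtree checking, which improves performance at a slight cost in security
- 192.168.72.0/24: only clients in the internal subnet may access the share
For production, consider adding stricter options:
```bash
/share 192.168.72.0/24(rw,sync,no_subtree_check,root_squash,all_squash,anonuid=65534,anongid=65534)
```
- root_squash: maps the client's root user to the anonymous user
- all_squash: maps every client user to the anonymous user
- anonuid/anongid: set the UID/GID of the anonymous user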
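4. Nginx Node Deployment
4.1 Installing Nginx and Mounting NFS
```bash
# 1. Install Nginx and the NFS client
dnf install nginx nfs-utils -y
# 2. Create the mount point (-p makes this a no-op if the directory already exists)
mkdir -p /usr/share/nginx/html
# 3. Mount the NFS share
mount -t nfs 192.168.72.30:/share /usr/share/nginx/html/
# 4. Mount automatically at boot (_netdev delays the mount until the network is up)
echo "192.168.72.30:/share /usr/share/nginx/html nfs defaults,_netdev 0 0" >> /etc/fstab
# 5. Start Nginx and enable it at boot
systemctl start nginx
systemctl enable nginx
# 6. Verify
curl localhost
# Expected output: nfs 192.168.72.30
```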
4.2 Nginx Tuning Suggestions
For production, consider adjusting the following directives in the Nginx configuration:
```nginx
# /etc/nginx/nginx.conf
worker_processes auto;
worker_rlimit_nofile 65535;
events {
    worker_connections 10240;
    use epoll;
    multi_accept on;
}
http {
    server_tokens off;
    keepalive_timeout 65;
    keepalive_requests 1000;
    gzip on;
    gzip_types text/plain text/css application/json application/javascript;
}
```
5. HAProxy Load Balancer Configuration
5.1 Installing HAProxy and Initial Setup
```bash
# Install HAProxy
dnf install haproxy -y
# Back up the original configuration
cp /etc/haproxy/haproxy.cfg{,.bak}
```
5.2 Detailed HAProxy Configuration
```bash
cat > /etc/haproxy/haproxy.cfg <<'EOF'
global
    log 127.0.0.1 local2
    chroot /var/lib/haproxy
    pidfile /var/run/haproxy.pid
    maxconn 4000
    user haproxy
    group haproxy
    daemon
    stats socket /var/lib/haproxy/stats mode 600 level admin
    ssl-default-bind-ciphers PROFILE=SYSTEM
    ssl-default-server-ciphers PROFILE=SYSTEM

defaults
    mode http
    log global
    option httplog
    option dontlognull
    option http-server-close
    option forwardfor except 127.0.0.0/8
    option redispatch
    retries 3
    timeout http-request 10s
    timeout queue 1m
    timeout connect 10s
    timeout client 1m
    timeout server 1m
    timeout http-keep-alive 10s
    timeout check 10s
    maxconn 3000

# Public entry point on port 80
frontend webserver
    bind *:80
    default_backend webcluster

# The two Nginx nodes, each with an active health check
backend webcluster
    balance roundrobin
    server web1 192.168.72.10:80 check inter 2000 rise 2 fall 3
    server web2 192.168.72.20:80 check inter 2000 rise 2 fall 3

# Statistics page (replace the demo credentials before production use)
listen stats
    bind *:8080
    stats enable
    stats uri /haproxy?stats
    stats realm HAProxy\ Statistics
    stats auth admin:password
EOF
```
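5.3 Configuration Notes
- Health check parameters:
  - inter 2000: run a check every 2 seconds
  - rise 2: mark the server as up after 2 consecutive successful checks
  - fall 3: mark the server as down after 3 consecutive failed checks
- Load-balancing algorithm:
  - roundrobin: round robin, suitable for most scenarios
  - Other available algorithms: leastconn (least connections), source (source-IP hash), and others
- Stats page:
  - Reachable on port 8080
  - Credentials: admin/password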
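To make the health check parameters concrete, the arithmetic below estimates how long HAProxy needs to change a server's state with `inter 2000 rise 2 fall 3`. This is a simplified estimate that multiplies the interval by the required number of consecutive results and ignores the time each check itself takes, so real-world figures can be somewhat longer. The function name is hypothetical.

```bash
#!/usr/bin/env bash
# Rough state-change window: check interval (ms) x consecutive results required.
check_window_ms() {
    local inter_ms="$1" count="$2"
    echo $(( inter_ms * count ))
}

down_ms=$(check_window_ms 2000 3)   # fall 3
up_ms=$(check_window_ms 2000 2)     # rise 2
echo "marked down after about ${down_ms} ms, marked up again after about ${up_ms} ms"
# prints: marked down after about 6000 ms, marked up again after about 4000 ms
```

Lowering `inter` shortens both windows at the cost of more check traffic to the backends.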
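5.4 Startup and Verification
```bash
# Check the configuration syntax
haproxy -c -f /etc/haproxy/haproxy.cfg
# Start the service and enable it at boot
systemctl start haproxy
systemctl enable haproxy
# Test load balancing
curl 192.168.72.100
# Repeated requests are distributed round robin between nginx1 and nginx2.
# Both nodes serve the same NFS-backed page, so the response body is identical;
# confirm the distribution on the stats page or in each node's Nginx access log
```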
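Because both backends serve the same NFS-backed page, the response body cannot show which server handled a request. One option is to read the per-server session counters from the admin socket that the configuration in section 5.2 enables. The sketch below assumes `socat` is installed (it is not part of the setup in this article) and uses a hypothetical helper name, `backend_sessions`. It looks up the CSV columns by header name instead of relying on fixed positions.

```bash
#!/usr/bin/env bash
# Sketch: summarize per-server totals from HAProxy's CSV statistics (read from stdin).
# Prints "<server> <total sessions> <status>" for each server of one backend.
backend_sessions() {
    local backend="$1"
    awk -F',' -v be="$backend" '
        NR == 1 {
            sub(/^# /, "", $1)                      # header line starts with "# pxname"
            for (i = 1; i <= NF; i++) col[$i] = i   # map column name -> position
            next
        }
        $col["pxname"] == be && $col["svname"] != "BACKEND" && $col["svname"] != "FRONTEND" {
            print $col["svname"], $col["stot"], $col["status"]
        }
    '
}

# Usage on a load balancer node (socket path taken from haproxy.cfg):
#   echo "show stat" | socat stdio /var/lib/haproxy/stats | backend_sessions webcluster
```

After a batch of test requests, roughly equal session totals for web1 and web2 indicate that round robin is working.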
6. DNS Server Configuration
6.1 Installing and Configuring BIND
```bash
# Install BIND
dnf install bind -y
# Main configuration file
cat > /etc/named.conf <<'EOF'
options {
    listen-on port 53 { 127.0.0.1; 192.168.72.40; };
    directory "/var/named";
    allow-query { localhost; 192.168.72.0/24; };
    recursion yes;
    dnssec-validation no;
};
zone "." IN {
    type hint;
    file "named.ca";
};
zone "chengke.com" IN {
    type master;
    file "chengke.com.zone";
};
include "/etc/named.rfc1912.zones";
EOF
```
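6.2 Zone File Configuration
```bash
# Create the zone file
# (the NS record names its owner "@" explicitly: a record line that starts
# in column 1 is read as beginning with an owner name)
cat > /var/named/chengke.com.zone <<'EOF'
$TTL 1D
@   IN  SOA ns1.chengke.com. root.chengke.com. (
        2024031501 ; serial
        1D         ; refresh
        1H         ; retry
        1W         ; expire
        3H )       ; minimum
@   IN  NS  ns1.chengke.com.
ns1 IN  A   192.168.72.40
www IN  A   192.168.72.200 ; points at the VIP
EOF
# Set ownership
chown named:named /var/named/chengke.com.zone
# Check the configuration and the zone
named-checkconf
named-checkzone chengke.com /var/named/chengke.com.zone
# Start the service and enable it at boot
systemctl start named
systemctl enable named
```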
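6.3 DNS Testing
```bash
# Point the local resolver at the new DNS server
nmcli c m ens160 ipv4.dns 192.168.72.40
nmcli c up ens160
# Test name resolution
dig www.chengke.com @192.168.72.40 +short
# Expected: 192.168.72.200
curl www.chengke.com
# Expected: the content of the NFS share
# (this works only once the VIP from section 7 is active)
```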
7. Keepalived High-Availability Configuration
7.1 Installing and Configuring Keepalived
Master node (haproxy01) configuration:
```bash
# Install Keepalived
dnf install keepalived -y
# Main configuration file
cat > /etc/keepalived/keepalived.conf <<'EOF'
! Configuration File for keepalived
global_defs {
    router_id LVS_MASTER
}
vrrp_script chk_haproxy {
    script "/etc/keepalived/check_haproxy.sh"
    interval 2
    weight -20
    fall 3
    rise 2
}
vrrp_instance VI_1 {
    state MASTER
    interface ens160
    virtual_router_id 51
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        192.168.72.200/24
    }
    track_script {
        chk_haproxy
    }
}
EOF
```
On the backup node (haproxy02), the configuration differs only in these values:
```text
state BACKUP
priority 90
router_id LVS_BACKUP
```
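To see why these priorities allow a failover, it helps to work through the arithmetic. With a negative `weight`, Keepalived lowers a node's priority by that amount while the tracked script is failing. The helper below is only an illustration: the function name is hypothetical and it models only the negative-weight case.

```bash
#!/usr/bin/env bash
# Illustration: effective VRRP priority with a negative track_script weight.
#   $1 = configured priority, $2 = weight (negative), $3 = "yes" if the check passes
effective_priority() {
    local base="$1" weight="$2" script_ok="$3"
    if [ "$script_ok" = "yes" ]; then
        echo "$base"
    else
        echo $(( base + weight ))
    fi
}

master=$(effective_priority 100 -20 no)    # check failing on haproxy01
backup=$(effective_priority 90 -20 yes)    # check passing on haproxy02
echo "master=$master backup=$backup"       # prints: master=80 backup=90
```

With a failing check the master drops from 100 to 80, below the backup's 90, so the backup takes over the VIP. This path applies only when the check script exits non-zero. The script in section 7.2 reacts to a failure by stopping Keepalived itself, in which case the VIP moves because the master stops advertising.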
7.2 HAProxy Health Check Script
```bash
cat > /etc/keepalived/check_haproxy.sh <<'EOF'
#!/bin/bash
# Count the running haproxy processes
count=$(ps -C haproxy --no-headers | wc -l)
if [ "$count" -eq 0 ]; then
    # HAProxy is down: try to start it once
    systemctl start haproxy
    sleep 1
    count=$(ps -C haproxy --no-headers | wc -l)
    if [ "$count" -eq 0 ]; then
        # Still down: stop Keepalived so the VIP moves to the backup node
        systemctl stop keepalived
    fi
fi
EOF
chmod +x /etc/keepalived/check_haproxy.sh
```
7.3 Enhanced Health Check Script
```bash
cat > /etc/keepalived/check_haproxy_adv.sh <<'EOF'
#!/bin/bash
LOG_FILE="/var/log/keepalived-haproxy-check.log"

# Append a timestamped message to the log file
log() {
    echo "[$(date "+%Y-%m-%d %H:%M:%S")] $1" >> "$LOG_FILE"
}

# Check the HAProxy process
if ! pgrep -x "haproxy" >/dev/null; then
    log "WARN - HAProxy process not found, attempting restart"
    systemctl restart haproxy
    sleep 2
    if ! pgrep -x "haproxy" >/dev/null; then
        log "ERROR - HAProxy restart failed, stopping Keepalived"
        systemctl stop keepalived
        exit 1
    else
        log "INFO - HAProxy restarted successfully"
    fi
fi

# Check the HAProxy port (requires the nc command to be installed)
if ! nc -z 127.0.0.1 80; then
    log "WARN - HAProxy port 80 not responding, attempting restart"
    systemctl restart haproxy
    sleep 2
    if ! nc -z 127.0.0.1 80; then
        log "ERROR - HAProxy port still not responding, stopping Keepalived"
        systemctl stop keepalived
        exit 1
    fi
fi
EOF
# Make the script executable; to use it, point the "script" line of
# vrrp_script chk_haproxy in keepalived.conf at this file
chmod +x /etc/keepalived/check_haproxy_adv.sh
```
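7.4 Startup and Testing
```bash
# Start Keepalived and enable it at boot
systemctl start keepalived
systemctl enable keepalived
# Check where the VIP is bound
ip a show ens160
# The master node should list 192.168.72.200
# Failover test:
# power off the master node (or stop keepalived on it) and watch whether the VIP
# moves to the backup node. Note that the check script restarts a stopped HAProxy,
# so a plain "systemctl stop haproxy" is normally repaired in place without a failover
```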
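During a failover test it is convenient to ask each load balancer whether it currently holds the VIP. The sketch below does that by scanning the one-line-per-address output of `ip -o -4 addr show`. The helper name `holds_vip` is hypothetical, and the interface name is taken from the configuration above.

```bash
#!/usr/bin/env bash
# Sketch: succeed if the address listing on stdin contains the given VIP.
# Expects the output format of `ip -o -4 addr show`.
holds_vip() {
    local vip="$1"
    # Fixed-string match on "inet <vip>/" so that 192.168.72.20 does not match 192.168.72.200
    grep -qF "inet ${vip}/"
}

# Usage on haproxy01 and haproxy02, before and after the failure is triggered:
#   ip -o -4 addr show dev ens160 | holds_vip 192.168.72.200 && echo "VIP is on this node"
```

Exactly one of the two nodes should report the VIP at any time. If both do, the nodes are probably not seeing each other's VRRP advertisements (see section 9.3).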
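8. System Tuning and Production Recommendations
8.1 Kernel Parameter Tuning
```bash
# Append the following parameters to /etc/sysctl.conf
cat >> /etc/sysctl.conf <<'EOF'
# Widen the local port range
net.ipv4.ip_local_port_range = 1024 65535
# System-wide limit on open file handles
# (check "sysctl fs.file-max" first: if the current value is already higher, omit this line)
fs.file-max = 65535
# TCP connection handling
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_fin_timeout = 30
net.core.somaxconn = 32768
net.ipv4.tcp_max_syn_backlog = 8192
# Memory behavior
vm.swappiness = 10
vm.overcommit_memory = 1
EOF
# Load the new settings
sysctl -p
```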
8.2 Log Management
```bash
# Configure HAProxy logging (HAProxy sends its log to 127.0.0.1 over UDP, facility local2)
cat > /etc/rsyslog.d/haproxy.conf <<'EOF'
$ModLoad imudp
$UDPServerRun 514
local2.* /var/log/haproxy.log
EOF
systemctl restart rsyslog
# Log rotation
cat > /etc/logrotate.d/haproxy <<'EOF'
/var/log/haproxy.log {
    daily
    rotate 30
    missingok
    notifempty
    compress
    delaycompress
    sharedscripts
    postrotate
        # Ask rsyslog to reopen its log files (rsyslog runs as a systemd service here)
        /usr/bin/systemctl kill -s HUP rsyslog.service >/dev/null 2>&1 || true
    endscript
}
EOF
```
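8.3 Monitoring Recommendations
- Basic monitoring:
  - Use Prometheus + Grafana to monitor server resources
  - Key metrics: CPU, memory, disk, network, process state
- HAProxy monitoring:
  - Watch backend status on the HAProxy stats page
  - Metrics to track: session count, request rate, error rate, backend health
- Service monitoring:
  - Run periodic curl tests against the VIP and the domain name
  - Track HTTP status codes and response times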
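The periodic curl test can be sketched as follows. The function names, the 5-second timeout, and the 1-second slowness threshold are assumptions chosen for illustration. `curl -w` with `%{http_code}` and `%{time_total}` supplies the status code and the total request time.

```bash
#!/usr/bin/env bash
# Sketch: classify one probe result given as "<http_code> <seconds>".
# Returns 0 for OK, 1 for a non-200 status, 2 for a slow response.
evaluate_probe() {
    local code="$1" seconds="$2" max_seconds="${3:-1.0}"
    if [ "$code" != "200" ]; then
        echo "FAIL status=$code"
        return 1
    fi
    # awk performs the floating-point comparison that [ ] cannot do
    if awk -v t="$seconds" -v m="$max_seconds" 'BEGIN { exit !(t + 0 > m + 0) }'; then
        echo "SLOW time=${seconds}s"
        return 2
    fi
    echo "OK time=${seconds}s"
}

# Probe a URL once and classify the result.
probe_url() {
    local url="$1" code seconds
    read -r code seconds < <(curl -s -o /dev/null --max-time 5 \
        -w '%{http_code} %{time_total}\n' "$url")
    evaluate_probe "$code" "$seconds"
}

# Usage, for example from cron:
#   probe_url http://192.168.72.200/
#   probe_url http://www.chengke.com/
```

Keeping the classification separate from the network call makes the thresholds easy to adjust and lets the logic be checked without touching the live service.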
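9. Troubleshooting Common Problems
9.1 NFS Mount Failure
Symptom: the Nginx nodes cannot access the NFS share
Troubleshooting steps:
- Check the NFS service status: `systemctl status nfs-server`
- Verify that the share is exported: `showmount -e 192.168.72.30`
- Check the firewall rules: `firewall-cmd --list-all`
- Test basic connectivity: `ping 192.168.72.30`
- Try a manual mount: `mount -t nfs 192.168.72.30:/share /mnt`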
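The export check in the steps above can be scripted so that it fits into a larger diagnostic run. This is a minimal sketch with a hypothetical helper name. It assumes the usual `showmount -e` layout of one header line followed by one export per line.

```bash
#!/usr/bin/env bash
# Sketch: succeed if the export list on stdin contains the given path.
# Skips the first line ("Export list for <host>:").
export_listed() {
    local path="$1"
    awk -v p="$path" 'NR > 1 && $1 == p { found = 1 } END { exit !found }'
}

# Usage on an Nginx node:
#   showmount -e 192.168.72.30 | export_listed /share \
#       && echo "export visible" || echo "export missing"
```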
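9.2 HAProxy Does Not Forward Requests
Symptom: requests to the VIP get no response or return a 502/503 error
Troubleshooting steps:
- Check the HAProxy status: `systemctl status haproxy`
- Validate the configuration syntax: `haproxy -c -f /etc/haproxy/haproxy.cfg`
- Check the backend services: `curl 192.168.72.10` and `curl 192.168.72.20`
- Review the HAProxy log: `tail -f /var/log/haproxy.log`
- Check the listening ports: `ss -tulnp | grep haproxy`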
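9.3 Keepalived Does Not Move the VIP
Symptom: the VIP does not migrate to the backup node after the master fails
Troubleshooting steps:
- Check the Keepalived log: `journalctl -u keepalived -f`
- Validate the health check script: run `/etc/keepalived/check_haproxy.sh` manually
- Check VRRP traffic: `tcpdump -i ens160 vrrp`
- Verify the firewall settings: make sure the VRRP protocol (IP protocol 112) is not blocked
- Check the network configuration: make sure the master and backup nodes are in the same broadcast domain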
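10. Suggestions for Evolving the Architecture
The current architecture provides basic high availability and can be improved further:
- NFS high availability:
  - Use DRBD+Heartbeat for an active/standby NFS pair
  - Or migrate to distributed storage such as Ceph or GlusterFS
- Database layer:
  - Add MySQL primary/replica replication or a Galera cluster
  - Use ProxySQL for database load balancing
- Scalability:
  - Automate deployment with Ansible
  - Introduce Docker-based containerized deployment
  - Add a CI/CD pipeline
- Security hardening:
  - Enable HTTPS
  - Put a WAF in front of the service
  - Enforce strict access control
This architecture has been validated in real production environments and can sustain on the order of a million page views per day. The key is to tune the HAProxy and Nginx parameters to the workload and to build a thorough monitoring and alerting system.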