In an enterprise big-data environment, securing the Hadoop cluster is one of the highest operational priorities. The traditional authentication model based on Linux user permissions has two core weaknesses: it cannot enforce fine-grained, service-level access control, and it has no real credential verification, since a client can simply claim to be any user. Introducing Kerberos addresses both problems at the root.
How Kerberos works can be compared to the access-control system of a modern office building. Without Kerberos, anyone who gets past the building entrance (logs into a server) can walk into every office (service) unchecked, and this coarse-grained control leaves obvious holes. Kerberos introduces the KDC (Key Distribution Center) as a trusted third party and builds a three-party authentication scheme around it: the client first authenticates to the KDC's Authentication Server (AS) and receives a TGT (Ticket Granting Ticket), then exchanges the TGT at the Ticket Granting Server (TGS) for a service ticket, and finally presents that service ticket to the target service.

This mechanism provides mutual authentication between client and service, never sends passwords over the network, and bounds the damage of a stolen credential through ticket expiry.
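As a quick illustration of this flow from the client's side (a hypothetical principal `alice@HADOOP.COM` is assumed here; the real service principals are created later in this guide):

```bash
# 1. AS exchange: authenticate once and obtain a TGT from the KDC
kinit alice@HADOOP.COM

# 2. The credential cache now holds krbtgt/HADOOP.COM@HADOOP.COM (the TGT)
klist

# 3. The TGS exchange happens transparently on first service access:
#    the client trades the TGT for a service ticket, e.g. nn/node1@HADOOP.COM
hadoop fs -ls /

# 4. klist now additionally shows the cached service ticket
klist
```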
This deployment uses a four-node architecture, planned as follows:
| Node IP | Hostname | Role | Key packages |
|---|---|---|---|
| 192.168.239.130 | node1 | Hadoop NameNode | krb5-workstation, krb5-libs |
| 192.168.239.131 | node2 | Hadoop DataNode | krb5-workstation, krb5-libs |
| 192.168.239.132 | node3 | Hadoop DataNode | krb5-workstation, krb5-libs |
| 192.168.239.133 | node4 | KDC server | full krb5-server suite |
Key preparations: every node must resolve every hostname consistently (via /etc/hosts or DNS), clocks must be kept in sync with NTP/chrony since Kerberos rejects requests beyond a small clock skew (5 minutes by default), and the KDC ports (88 for the KDC, 749 for kadmin) must be reachable from all nodes.
On the KDC node (node4), run:
```bash
yum install -y krb5-server krb5-libs krb5-workstation
```
On the other Hadoop nodes, run:
```bash
yum install -y krb5-workstation krb5-libs
```
Note: krb5-server contains the KDC service components and only needs to be installed on the KDC node. krb5-workstation provides the client and admin tools and must be installed on every node.
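A quick way to confirm the packages landed where expected:

```bash
# On node4 all three should report an installed version;
# on node1-3 only krb5-server should come back as "not installed"
rpm -q krb5-server krb5-libs krb5-workstation
```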
Every node needs an /etc/krb5.conf. Back up the original file first:
```bash
mv /etc/krb5.conf /etc/krb5.conf.bak
```
Then create a new configuration file with the following content:
```ini
[logging]
 default = FILE:/var/log/krb5libs.log
 kdc = FILE:/var/log/krb5kdc.log
 admin_server = FILE:/var/log/kadmind.log

[libdefaults]
 default_realm = HADOOP.COM
 dns_lookup_realm = false
 dns_lookup_kdc = false
 ticket_lifetime = 24h
 renew_lifetime = 7d
 forwardable = true
 udp_preference_limit = 1
 default_tgs_enctypes = aes256-cts-hmac-sha1-96 aes128-cts-hmac-sha1-96
 default_tkt_enctypes = aes256-cts-hmac-sha1-96 aes128-cts-hmac-sha1-96
 permitted_enctypes = aes256-cts-hmac-sha1-96 aes128-cts-hmac-sha1-96

[realms]
 HADOOP.COM = {
  kdc = node4:88
  admin_server = node4:749
 }

[domain_realm]
 .hadoop.com = HADOOP.COM
 hadoop.com = HADOOP.COM
```
Key parameters explained:

- Encryption types: `default_tkt_enctypes`, `default_tgs_enctypes`, and `permitted_enctypes` are pinned to AES-128/AES-256 (CTS mode with HMAC-SHA1-96), which Hadoop handles well. Note that AES-256 requires the JCE Unlimited Strength policy files on JDK 8 and earlier.
- Ticket lifetimes: `ticket_lifetime = 24h` caps how long a ticket stays valid, while `renew_lifetime = 7d` lets long-running services keep renewing it for up to a week.
- Name resolution: `dns_lookup_realm` and `dns_lookup_kdc` are disabled in favor of the explicit `[realms]` and `[domain_realm]` mappings, and `udp_preference_limit = 1` forces TCP, which avoids fragmentation problems with large tickets.
On the KDC node (node4), configure /var/kerberos/krb5kdc/kdc.conf:
```ini
[kdcdefaults]
 kdc_ports = 88
 kdc_tcp_ports = 88

[realms]
 HADOOP.COM = {
  master_key_type = aes256-cts
  acl_file = /var/kerberos/krb5kdc/kadm5.acl
  dict_file = /usr/share/dict/words
  admin_keytab = /var/kerberos/krb5kdc/kadm5.keytab
  supported_enctypes = aes256-cts:normal aes128-cts:normal
  max_life = 24h
  max_renewable_life = 7d
 }
```
Configure the ACL file /var/kerberos/krb5kdc/kadm5.acl, which grants full kadmin privileges to any principal whose instance is /admin:

```
*/admin@HADOOP.COM *
```
Create the Kerberos database on the KDC node:

```bash
# -s writes a stash file so krb5kdc can start without prompting for the master key
kdb5_util create -s -r HADOOP.COM
```
Start the services and enable them at boot:

```bash
systemctl start krb5kdc kadmin
systemctl enable krb5kdc kadmin
```
Create the administrator principal (you will be prompted to set its password):

```bash
kadmin.local -q "addprinc admin/admin"
```
Verify that the administrator can log in remotely:

```bash
kadmin -p admin/admin
```
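A non-interactive smoke test works too; a principal listing confirms both authentication and the kadmind ACL in one step:

```bash
# Enter the admin/admin password when prompted
kadmin -p admin/admin -q "listprincs"
```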
Hadoop's components each expect principals in a strict format:
| Component | Principal format | Count | Notes |
|---|---|---|---|
| NameNode | nn/hostname@REALM | 2 | one per active/standby node |
| DataNode | dn/hostname@REALM | N | one per DataNode |
| JournalNode | jn/hostname@REALM | 3 | typically 3 are deployed |
| NodeManager | nm/hostname@REALM | N | one per compute node |
| ResourceManager | rm/hostname@REALM | 2 | one per active/standby node |
| JobHistory | jhs/hostname@REALM | 1 | the history server node |
| HTTP | HTTP/hostname@REALM | N | required on every node |
| ZooKeeper | zookeeper/hostname@REALM | 3 | one per ZooKeeper node |
On the KDC node, create the principals and export them into a single keytab:
```bash
mkdir -p /opt/keytab

# NameNode
kadmin.local -q "addprinc -randkey nn/node1"
kadmin.local -q "addprinc -randkey nn/node2"

# JournalNode
kadmin.local -q "addprinc -randkey jn/node1"
kadmin.local -q "addprinc -randkey jn/node2"
kadmin.local -q "addprinc -randkey jn/node3"

# DataNode & NodeManager
for i in {1..3}; do
  kadmin.local -q "addprinc -randkey dn/node$i"
  kadmin.local -q "addprinc -randkey nm/node$i"
done

# ResourceManager
kadmin.local -q "addprinc -randkey rm/node2"
kadmin.local -q "addprinc -randkey rm/node3"

# HTTP
for i in {1..3}; do
  kadmin.local -q "addprinc -randkey HTTP/node$i"
done

# ZooKeeper
for i in {1..3}; do
  kadmin.local -q "addprinc -randkey zookeeper/node$i"
done

# Export all principals into a single merged keytab
kadmin.local -q "ktadd -k /opt/keytab/hadoop.keytab \
  nn/node1 nn/node2 \
  jn/node1 jn/node2 jn/node3 \
  dn/node1 dn/node2 dn/node3 \
  nm/node1 nm/node2 nm/node3 \
  rm/node2 rm/node3 \
  HTTP/node1 HTTP/node2 HTTP/node3 \
  zookeeper/node1 zookeeper/node2 zookeeper/node3"

# Verify the keytab contents
klist -k /opt/keytab/hadoop.keytab
```
Distribute the generated keytab to each node:
```bash
for i in {1..3}; do
  ssh node$i "mkdir -p /opt/keytab"   # make sure the target directory exists
  scp /opt/keytab/hadoop.keytab node$i:/opt/keytab/
done
```
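After distribution, it is worth confirming on each node that the keytab is intact and its keys actually authenticate (a quick sketch, assuming passwordless SSH as root):

```bash
for i in {1..3}; do
  # List the entries, then do a login round-trip with the node's own dn principal
  ssh node$i "klist -kt /opt/keytab/hadoop.keytab && \
              kinit -kt /opt/keytab/hadoop.keytab dn/node$i && kdestroy"
done
```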
On all Hadoop nodes, create the service users and group:
```bash
groupadd hadoop
useradd -g hadoop hdfs
useradd -g hadoop yarn
useradd -g hadoop mapred
useradd -g hadoop http

# Set each user's password (replace PASSWORD with a real value)
echo "PASSWORD" | passwd --stdin hdfs
echo "PASSWORD" | passwd --stdin yarn
echo "PASSWORD" | passwd --stdin mapred
echo "PASSWORD" | passwd --stdin http
```
Lock down the keytab so only root and the hadoop group can read it:

```bash
chown -R root:hadoop /opt/keytab/
chmod 440 /opt/keytab/hadoop.keytab
```
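With 440 and root:hadoop ownership, the service users read the keytab only through their hadoop group membership; a quick check:

```bash
ls -l /opt/keytab/hadoop.keytab                      # expect: -r--r----- 1 root hadoop
su - hdfs -c "klist -kt /opt/keytab/hadoop.keytab"   # group read must work for hdfs
```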
Set directory ownership and permissions following the official recommendations:
```bash
# Hadoop installation directory
chown -R hdfs:hadoop /opt/hadoop323
chmod -R 755 /opt/hadoop323

# HDFS data directories
chown -R hdfs:hadoop /opt/hadoop323/hdpData/dfs
chmod -R 700 /opt/hadoop323/hdpData/dfs

# Log directory
chown -R hdfs:hadoop /opt/hadoop323/logs
chmod -R 775 /opt/hadoop323/logs

# YARN local directory
chown -R yarn:hadoop /opt/hadoop323/hdpData/tmp/nm-local-dir
chmod -R 755 /opt/hadoop323/hdpData/tmp/nm-local-dir

# User log directory
chown -R yarn:hadoop /opt/hadoop323/logs/userlogs
chmod -R 755 /opt/hadoop323/logs/userlogs

# JournalNode data directory
chown -R hdfs:hadoop /opt/hadoop323/hdpData/journaldata
chmod -R 755 /opt/hadoop323/hdpData/journaldata
```
Add the Kerberos settings to core-site.xml:

```xml
<!-- Enable Kerberos authentication -->
<property>
  <name>hadoop.security.authentication</name>
  <value>kerberos</value>
</property>

<!-- Enable service-level authorization -->
<property>
  <name>hadoop.security.authorization</name>
  <value>true</value>
</property>

<!-- Rules mapping Kerberos principals to local users -->
<property>
  <name>hadoop.security.auth_to_local</name>
  <value>
    RULE:[2:$1](HTTP.*)s/.*/http/
    RULE:[2:$1](nn.*)s/.*/hdfs/
    RULE:[2:$1](dn.*)s/.*/hdfs/
    RULE:[2:$1](rm.*)s/.*/yarn/
    RULE:[2:$1](nm.*)s/.*/yarn/
    RULE:[2:$1](jhs.*)s/.*/mapred/
    RULE:[2:$1](jn.*)s/.*/hdfs/
    DEFAULT
  </value>
</property>
```
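Hadoop ships a small utility for testing auth_to_local rules without restarting anything, so each mapping can be verified up front:

```bash
# Prints the local user each principal maps to (expected: hdfs, yarn, http)
hadoop org.apache.hadoop.security.HadoopKerberosName nn/node1@HADOOP.COM
hadoop org.apache.hadoop.security.HadoopKerberosName nm/node2@HADOOP.COM
hadoop org.apache.hadoop.security.HadoopKerberosName HTTP/node3@HADOOP.COM
```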
Then hdfs-site.xml. The `_HOST` placeholder is substituted with each daemon's own hostname at startup, so the same file works on every node:

```xml
<!-- NameNode -->
<property>
  <name>dfs.namenode.kerberos.principal</name>
  <value>nn/_HOST@HADOOP.COM</value>
</property>
<property>
  <name>dfs.namenode.keytab.file</name>
  <value>/opt/keytab/hadoop.keytab</value>
</property>

<!-- DataNode -->
<property>
  <name>dfs.datanode.kerberos.principal</name>
  <value>dn/_HOST@HADOOP.COM</value>
</property>
<property>
  <name>dfs.datanode.keytab.file</name>
  <value>/opt/keytab/hadoop.keytab</value>
</property>

<!-- HTTPS policy -->
<property>
  <name>dfs.http.policy</name>
  <value>HTTPS_ONLY</value>
</property>
```
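A quick sanity check that clients actually pick up these settings (run with HADOOP_CONF_DIR pointing at the cluster config):

```bash
hdfs getconf -confKey hadoop.security.authentication   # expect: kerberos
hdfs getconf -confKey dfs.http.policy                  # expect: HTTPS_ONLY
```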
And yarn-site.xml:

```xml
<!-- ResourceManager -->
<property>
  <name>yarn.resourcemanager.principal</name>
  <value>rm/_HOST@HADOOP.COM</value>
</property>
<property>
  <name>yarn.resourcemanager.keytab</name>
  <value>/opt/keytab/hadoop.keytab</value>
</property>

<!-- NodeManager -->
<property>
  <name>yarn.nodemanager.principal</name>
  <value>nm/_HOST@HADOOP.COM</value>
</property>
<property>
  <name>yarn.nodemanager.keytab</name>
  <value>/opt/keytab/hadoop.keytab</value>
</property>
```
To enable HTTPS, first create a CA. On the CA node, run:
```bash
openssl req -new -x509 -keyout /root/hdfs_ca_key -out /root/hdfs_ca_cert -days 36500 \
  -subj '/C=CN/ST=beijing/L=haidian/O=devA/OU=devB/CN=devC'
```
On each node, generate a keystore and have the CA sign it (replace node1 with the local hostname):
```bash
# Generate the keystore
keytool -keystore /root/keystore -alias node1 -genkey -keyalg RSA \
  -dname "CN=node1, OU=dev, O=dev, L=dev, ST=dev, C=CN" \
  -storepass 123456 -keypass 123456

# Generate a certificate signing request
keytool -certreq -keystore /root/keystore -alias node1 -file /root/cert \
  -storepass 123456 -keypass 123456

# Sign the request with the CA
openssl x509 -req -CA /root/hdfs_ca_cert -CAkey /root/hdfs_ca_key \
  -in /root/cert -out /root/cert_signed -days 36500 -CAcreateserial

# Import the CA certificate and the signed certificate
keytool -keystore /root/keystore -alias ca -import -file /root/hdfs_ca_cert \
  -storepass 123456 -noprompt
keytool -keystore /root/keystore -alias node1 -import -file /root/cert_signed \
  -storepass 123456 -noprompt

# Generate the truststore
keytool -keystore /root/truststore -alias ca -import -file /root/hdfs_ca_cert \
  -storepass 123456 -trustcacerts -noprompt

# Move the stores to the paths referenced by ssl-server.xml, then set permissions
mv /root/keystore /root/truststore /home/
chown root:hadoop /home/keystore /home/truststore
chmod 770 /home/keystore /home/truststore
```
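Before wiring the stores into Hadoop, the contents can be inspected; after the signed-certificate import, the node1 entry should chain back to the CA:

```bash
# Both aliases (ca, node1) should be listed with the expected Owner/Issuer
keytool -list -v -keystore /home/keystore -storepass 123456 | grep -E "Alias|Owner|Issuer"
```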
Configure ssl-server.xml:
```xml
<property>
  <name>ssl.server.truststore.location</name>
  <value>/home/truststore</value>
</property>
<property>
  <name>ssl.server.truststore.password</name>
  <value>123456</value>
</property>
<property>
  <name>ssl.server.keystore.location</name>
  <value>/home/keystore</value>
</property>
<property>
  <name>ssl.server.keystore.password</name>
  <value>123456</value>
</property>
```
Append to zoo.cfg:
```properties
authProvider.1=org.apache.zookeeper.server.auth.SASLAuthenticationProvider
jaasLoginRenew=3600000
```
Create a jaas.conf for the ZooKeeper server (on each ZK node, with that node's own principal):
```
Server {
  com.sun.security.auth.module.Krb5LoginModule required
  useKeyTab=true
  keyTab="/opt/keytab/hadoop.keytab"
  storeKey=true
  useTicketCache=false
  principal="zookeeper/node1@HADOOP.COM";
};
```
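ZooKeeper reads jaas.conf through a JVM system property. One common way to set it is via conf/java.env, which zkServer.sh sources at startup (the jaas.conf path below is an assumption; adjust it to wherever you placed the file):

```bash
# conf/java.env -- sourced by zkServer.sh
export SERVER_JVMFLAGS="-Djava.security.auth.login.config=/etc/zookeeper/jaas.conf"
```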
With everything restarted, verify the setup. First obtain a ticket from the keytab:

```bash
kinit -kt /opt/keytab/hadoop.keytab nn/node1@HADOOP.COM
```

Confirm the ticket cache:

```bash
klist
```

Test HDFS access:

```bash
hadoop fs -ls /
```

Finally, run a sample MapReduce job end to end:

```bash
hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.2.3.jar pi 10 100
```
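A useful negative test is confirming that access without a ticket is actually rejected:

```bash
kdestroy            # throw away the ticket cache
hadoop fs -ls /     # should now fail with a GSSException / no valid credentials
```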
Symptom: `GSSException: No valid credentials provided`

Checklist: verify the clocks are in sync with the KDC (skew beyond 5 minutes fails authentication), confirm that a `kinit -kt` with the same keytab and principal succeeds, check that the principal's hostname matches what `_HOST` expands to on that node, and make sure the JVM supports the configured AES enctypes (JCE policy files on older JDKs).
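MIT Kerberos can trace the entire exchange, which usually pinpoints the failing step (a diagnosis sketch):

```bash
# Trace the AS/TGS exchanges while re-acquiring the ticket
KRB5_TRACE=/dev/stderr kinit -kt /opt/keytab/hadoop.keytab nn/node1@HADOOP.COM

# Clock skew is the most common culprit: compare against the KDC
date; ssh node4 date
```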
Symptom: `String index out of range: -1`

Solution: make sure the zookeeper principal contains the full hostname, in the form:

```
zookeeper/hostname@REALM
```
Symptom: `SSL handshake failed`

Checklist: confirm the keystore/truststore paths and passwords in ssl-server.xml match the files on disk, that the CA certificate was imported into the truststore, and that each node's certificate CN matches its hostname.
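The TLS layer can also be tested independently of Hadoop (assuming the NameNode HTTPS endpoint on its default Hadoop 3.x port, 9871):

```bash
# Should print the certificate chain and end with "Verify return code: 0 (ok)"
openssl s_client -connect node1:9871 -CAfile /root/hdfs_ca_cert </dev/null
```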
Symptom: `User hdfs is not privileged`

Solution: container-executor must be owned by root with the hadoop group and carry the setuid/setgid bits:

```bash
chown root:hadoop $HADOOP_HOME/bin/container-executor
chmod 6050 $HADOOP_HOME/bin/container-executor
```
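The binary also ships with a built-in self-check that can confirm the setup (assuming your build includes the flag):

```bash
$HADOOP_HOME/bin/container-executor --checksetup   # exits 0 when correctly configured
```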
In real deployments, Kerberos usually has to be integrated with the company's LDAP/AD systems, so validate the configuration thoroughly in a test environment before going to production. For large clusters, consider a tool such as Apache Ranger for finer-grained access control.