
[ Hadoop 3.2.1 Official Documentation ] Enabling Kerberos Secure Mode in Hadoop

1. Introduction




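2. Authentication


When service-level authentication is turned on, end users must authenticate themselves before interacting with Hadoop services. The simplest way is for a user to authenticate interactively using the Kerberos kinit command. When interactive login with kinit is not feasible, programmatic authentication using a Kerberos keytab file can be used instead.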
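
As a rough illustration of the keytab route, the sketch below shells out to kinit with a keytab instead of a password prompt. It assumes an MIT Kerberos kinit on the PATH; the function names and the example host name are hypothetical, and Hadoop services themselves log in through their own Java APIs rather than this way.

```python
import subprocess


def keytab_login_command(keytab_path, principal):
    """Build the kinit invocation that obtains a ticket from a keytab."""
    # -k: authenticate with a keytab instead of prompting for a password
    # -t: path of the keytab file to read the key from
    return ["kinit", "-k", "-t", keytab_path, principal]


def login_from_keytab(keytab_path, principal):
    """Run kinit non-interactively; raises CalledProcessError on failure."""
    subprocess.run(keytab_login_command(keytab_path, principal), check=True)


if __name__ == "__main__":
    # Hypothetical keytab and principal, for illustration only.
    print(keytab_login_command(
        "/etc/security/keytab/nn.service.keytab",
        "nn/host.example.com@REALM.TLD",
    ))
```

After a successful login, klist (without options) should show the ticket in the credential cache.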


Ensure that the HDFS and YARN daemons run as different Unix users, e.g. hdfs and yarn. Also ensure that the MapReduce JobHistory server runs as a different user, such as mapred.


User:Group | Daemons
hdfs:hadoop | NameNode, Secondary NameNode, JournalNode, DataNode
yarn:hadoop | ResourceManager, NodeManager
mapred:hadoop | MapReduce JobHistory Server

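2.3. Kerberos principals for Hadoop Daemons

Each Hadoop service instance must be configured with its Kerberos principal and keytab file location.

The general format of a service principal is ServiceName/_HOST@REALM.TLD, e.g. ‘dn/_HOST@EXAMPLE.COM’.

Hadoop simplifies the deployment of configuration files by allowing the hostname component of the service principal to be specified as the _HOST wildcard. Each service instance substitutes _HOST with its own fully qualified hostname at runtime. This allows administrators to deploy the same set of configuration files on all nodes. The keytab files, however, will be different on each node.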
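
The _HOST substitution can be pictured with the following minimal sketch. It is not Hadoop's implementation; the function name is hypothetical, and lowercasing the host name and only touching a two-component principal are assumptions made for the illustration.

```python
import socket


def resolve_principal(principal, hostname=None):
    """Replace the _HOST placeholder in 'service/_HOST@REALM' with a host name."""
    name, sep, realm = principal.partition("@")
    parts = name.split("/")
    # Only a two-component principal whose second component is exactly
    # _HOST is rewritten; anything else is returned unchanged.
    if len(parts) == 2 and parts[1] == "_HOST":
        host = hostname if hostname is not None else socket.getfqdn()
        parts[1] = host.lower()  # assumption: host names are lowercased
    return "/".join(parts) + sep + realm


if __name__ == "__main__":
    # Uses this machine's fully qualified host name.
    print(resolve_principal("dn/_HOST@EXAMPLE.COM"))
```

Because the resolved principal differs per host, the keytab deployed on each node has to contain the key for that node's own principal.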

2.3.1. HDFS

The NameNode keytab file on each NameNode host should look like the following:


$ klist -e -k -t /etc/security/keytab/nn.service.keytab
Keytab name: FILE:/etc/security/keytab/nn.service.keytab
KVNO Timestamp         Principal
   4 07/18/11 21:08:09 nn/full.qualified.domain.name@REALM.TLD (AES-256 CTS mode with 96-bit SHA-1 HMAC)
   4 07/18/11 21:08:09 nn/full.qualified.domain.name@REALM.TLD (AES-128 CTS mode with 96-bit SHA-1 HMAC)
   4 07/18/11 21:08:09 nn/full.qualified.domain.name@REALM.TLD (ArcFour with HMAC/md5)
   4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (AES-256 CTS mode with 96-bit SHA-1 HMAC)
   4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (AES-128 CTS mode with 96-bit SHA-1 HMAC)
   4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (ArcFour with HMAC/md5)

The Secondary NameNode keytab file on that host should look like the following:

$ klist -e -k -t /etc/security/keytab/sn.service.keytab
Keytab name: FILE:/etc/security/keytab/sn.service.keytab
KVNO Timestamp         Principal
   4 07/18/11 21:08:09 sn/full.qualified.domain.name@REALM.TLD (AES-256 CTS mode with 96-bit SHA-1 HMAC)
   4 07/18/11 21:08:09 sn/full.qualified.domain.name@REALM.TLD (AES-128 CTS mode with 96-bit SHA-1 HMAC)
   4 07/18/11 21:08:09 sn/full.qualified.domain.name@REALM.TLD (ArcFour with HMAC/md5)
   4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (AES-256 CTS mode with 96-bit SHA-1 HMAC)
   4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (AES-128 CTS mode with 96-bit SHA-1 HMAC)
   4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (ArcFour with HMAC/md5)

The DataNode keytab file on each host should look like the following:


$ klist -e -k -t /etc/security/keytab/dn.service.keytab
Keytab name: FILE:/etc/security/keytab/dn.service.keytab
KVNO Timestamp         Principal
   4 07/18/11 21:08:09 dn/full.qualified.domain.name@REALM.TLD (AES-256 CTS mode with 96-bit SHA-1 HMAC)
   4 07/18/11 21:08:09 dn/full.qualified.domain.name@REALM.TLD (AES-128 CTS mode with 96-bit SHA-1 HMAC)
   4 07/18/11 21:08:09 dn/full.qualified.domain.name@REALM.TLD (ArcFour with HMAC/md5)
   4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (AES-256 CTS mode with 96-bit SHA-1 HMAC)
   4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (AES-128 CTS mode with 96-bit SHA-1 HMAC)
   4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (ArcFour with HMAC/md5)

2.3.2. YARN

The ResourceManager keytab file on the ResourceManager host should look like the following:


$ klist -e -k -t /etc/security/keytab/rm.service.keytab
Keytab name: FILE:/etc/security/keytab/rm.service.keytab
KVNO Timestamp         Principal
   4 07/18/11 21:08:09 rm/full.qualified.domain.name@REALM.TLD (AES-256 CTS mode with 96-bit SHA-1 HMAC)
   4 07/18/11 21:08:09 rm/full.qualified.domain.name@REALM.TLD (AES-128 CTS mode with 96-bit SHA-1 HMAC)
   4 07/18/11 21:08:09 rm/full.qualified.domain.name@REALM.TLD (ArcFour with HMAC/md5)
   4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (AES-256 CTS mode with 96-bit SHA-1 HMAC)
   4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (AES-128 CTS mode with 96-bit SHA-1 HMAC)
   4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (ArcFour with HMAC/md5)

The NodeManager keytab file on each host should look like the following:

$ klist -e -k -t /etc/security/keytab/nm.service.keytab
Keytab name: FILE:/etc/security/keytab/nm.service.keytab
KVNO Timestamp         Principal
   4 07/18/11 21:08:09 nm/full.qualified.domain.name@REALM.TLD (AES-256 CTS mode with 96-bit SHA-1 HMAC)
   4 07/18/11 21:08:09 nm/full.qualified.domain.name@REALM.TLD (AES-128 CTS mode with 96-bit SHA-1 HMAC)
   4 07/18/11 21:08:09 nm/full.qualified.domain.name@REALM.TLD (ArcFour with HMAC/md5)
   4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (AES-256 CTS mode with 96-bit SHA-1 HMAC)
   4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (AES-128 CTS mode with 96-bit SHA-1 HMAC)
   4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (ArcFour with HMAC/md5)

2.4. MapReduce JobHistory Server

The MapReduce JobHistory Server keytab file on that host should look like the following:

$ klist -e -k -t /etc/security/keytab/jhs.service.keytab
Keytab name: FILE:/etc/security/keytab/jhs.service.keytab
KVNO Timestamp         Principal
   4 07/18/11 21:08:09 jhs/full.qualified.domain.name@REALM.TLD (AES-256 CTS mode with 96-bit SHA-1 HMAC)
   4 07/18/11 21:08:09 jhs/full.qualified.domain.name@REALM.TLD (AES-128 CTS mode with 96-bit SHA-1 HMAC)
   4 07/18/11 21:08:09 jhs/full.qualified.domain.name@REALM.TLD (ArcFour with HMAC/md5)
   4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (AES-256 CTS mode with 96-bit SHA-1 HMAC)
   4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (AES-128 CTS mode with 96-bit SHA-1 HMAC)
   4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (ArcFour with HMAC/md5)

2.5. Mapping from Kerberos principals to OS user accounts

Hadoop maps Kerberos principals to OS user (system) accounts using rules specified by hadoop.security.auth_to_local. How Hadoop evaluates these rules is determined by the setting of hadoop.security.auth_to_local.mechanism. In the default hadoop mode, a Kerberos principal must be matched against a rule that transforms the principal to a simple form, i.e. a user account name without ‘@’ or ‘/’; otherwise the principal will not be authorized and an error will be logged. In MIT mode, the rules work in the same way as auth_to_local in the Kerberos configuration file (krb5.conf), and the restrictions of hadoop mode do not apply. If you use MIT mode, it is suggested to use the same auth_to_local rules that are specified in your /etc/krb5.conf as part of your default realm, and to keep them in sync. In both hadoop and MIT mode, the rules (with the exception of DEFAULT) are applied to all principals regardless of their specified realm. Also note that you should not rely on the auth_to_local rules as an ACL; use proper (OS) mechanisms instead.


RULE:exp The local name will be formulated from exp. The format for exp is [n:string](regexp)s/pattern/replacement/g. The integer n indicates how many components the target principal should have. If this matches, then a string will be formed from string, substituting the realm of the principal for $0 and the n’th component of the principal for $n (e.g., if the principal was johndoe/admin then [2:$2$1foo] would result in the string adminjohndoefoo). If this string matches regexp, then the s//[g] substitution command will be run over the string. The optional g will cause the substitution to be global over the string, instead of replacing only the first match in the string. As an extension to MIT, Hadoop auth_to_local mapping supports the /L flag that lowercases the returned name.

DEFAULT Picks the first component of the principal name as the system user name if and only if the realm matches the default_realm (usually defined in /etc/krb5.conf). e.g. The default rule maps the principal host/full.qualified.domain.name@MYREALM.TLD to system user host if the default realm is MYREALM.TLD.
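
To make the RULE and DEFAULT semantics above concrete, here is a simplified sketch of how one rule could be evaluated. It is not Hadoop's parser: the rule is passed as already-split arguments rather than as a RULE:... string, the function names are hypothetical, and treating the regexp as a full match is an assumption.

```python
import re


def split_principal(principal):
    """Split 'a/b@REALM' into (['a', 'b'], 'REALM'); realm is '' if absent."""
    name, _, realm = principal.partition("@")
    return name.split("/"), realm


def apply_rule(principal, n, fmt, regexp=None, sub=None, repeat=False, lower=False):
    """Evaluate one [n:fmt](regexp)s/pattern/replacement/ rule.

    Returns the local name, or None if the rule does not apply.
    """
    components, realm = split_principal(principal)
    if len(components) != n:
        return None
    values = [realm] + components  # $0 is the realm, $k the k-th component
    base = re.sub(r"\$(\d)", lambda m: values[int(m.group(1))], fmt)
    if regexp is not None and re.fullmatch(regexp, base) is None:
        return None
    if sub is not None:
        pattern, replacement = sub
        # repeat=True mirrors the trailing g flag (replace every match).
        base = re.sub(pattern, replacement, base, count=0 if repeat else 1)
    return base.lower() if lower else base  # lower mirrors the /L flag


def apply_default(principal, default_realm):
    """DEFAULT: first component, but only for principals in the default realm."""
    components, realm = split_principal(principal)
    return components[0] if realm == default_realm else None


if __name__ == "__main__":
    # The example from the text: [2:$2$1foo] applied to johndoe/admin.
    print(apply_rule("johndoe/admin", 2, "$2$1foo"))
```

In a real rule set the rules are tried in order and the first one that applies wins; this sketch only evaluates a single rule.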






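Custom rules can be tested using the hadoop kerbname command. This command lets you specify a principal and apply Hadoop’s current auth_to_local rule set to it.

2.6. Mapping from user to group


In practice, you need to manage an SSO environment using Kerberos with LDAP for Hadoop in secure mode.


Some products that access Hadoop services on behalf of end users, such as Apache Oozie, need to be able to impersonate end users. See the proxy user documentation for details.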

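2.8. Secure DataNode

Because the DataNode data transfer protocol does not use the Hadoop RPC framework, DataNodes must authenticate themselves using privileged ports, which are specified by dfs.datanode.address and dfs.datanode.http.address.

When you execute the hdfs datanode command as root, the server process binds the privileged port first, then drops privilege and runs as the user account specified by HDFS_DATANODE_SECURE_USER. This startup process uses the jsvc program installed to JSVC_HOME. You must specify HDFS_DATANODE_SECURE_USER and JSVC_HOME as environment variables on start up (in hadoop-env.sh).

As of version 2.6.0, SASL can be used to authenticate the data transfer protocol. In this configuration, secured clusters no longer need to start the DataNode as root using jsvc and bind to privileged ports. To enable SASL on the data transfer protocol, set dfs.data.transfer.protection in hdfs-site.xml.

1. Set a non-privileged port for dfs.datanode.address.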
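
The constraints that the DataNode configuration table attaches to dfs.data.transfer.protection (non-privileged port, HTTPS_ONLY, no HDFS_DATANODE_SECURE_USER) can be checked mechanically. The sketch below is a hypothetical helper, not part of Hadoop; it takes the settings as plain dictionaries, and the fallback port 9866 is an assumed default for Hadoop 3.x.

```python
PRIVILEGED_PORT_LIMIT = 1024  # ports below this need root to bind


def sasl_datanode_problems(conf, env):
    """Return a list of violated SASL data-transfer constraints (empty if fine)."""
    problems = []
    if not conf.get("dfs.data.transfer.protection"):
        return problems  # SASL is not enabled, nothing to check

    # Assumed default address if the property is not set explicitly.
    address = conf.get("dfs.datanode.address", "0.0.0.0:9866")
    port = int(address.rsplit(":", 1)[1])
    if port < PRIVILEGED_PORT_LIMIT:
        problems.append("dfs.datanode.address must use a non-privileged port")
    if conf.get("dfs.http.policy") != "HTTPS_ONLY":
        problems.append("dfs.http.policy must be HTTPS_ONLY")
    if env.get("HDFS_DATANODE_SECURE_USER"):
        problems.append("HDFS_DATANODE_SECURE_USER must be undefined")
    return problems


if __name__ == "__main__":
    conf = {
        "dfs.data.transfer.protection": "authentication",
        "dfs.datanode.address": "0.0.0.0:1004",
        "dfs.http.policy": "HTTP_ONLY",
    }
    for problem in sasl_datanode_problems(conf, {}):
        print(problem)
```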



3.1. Data Encryption on RPC

The data transferred between Hadoop services and clients can be encrypted on the wire. Setting hadoop.rpc.protection to privacy in core-site.xml activates data encryption.

3.2. Data Encryption on Block data transfer



Setting dfs.encrypt.data.transfer.cipher.suites to AES/CTR/NoPadding activates AES encryption. By default, this is unspecified, so AES is not used.


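3.3. Data Encryption on HTTP


To enable SSL for the web console of HDFS daemons, set dfs.http.policy to either HTTPS_ONLY or HTTP_AND_HTTPS in hdfs-site.xml.

See Hadoop KMS and Hadoop HDFS over HTTP - Server Setup for instructions on enabling KMS over HTTPS and HttpFS over HTTPS, respectively.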

To enable SSL for web console of YARN daemons, set yarn.http.policy to HTTPS_ONLY in yarn-site.xml.

To enable SSL for web console of MapReduce JobHistory server, set mapreduce.jobhistory.http.policy to HTTPS_ONLY in mapred-site.xml.

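4. Configuration

4.1. Permissions for both HDFS and local fileSystem paths

The following table lists various paths on HDFS and local filesystems (on all nodes) and the recommended permissions:


4.2. Common Configurations

To turn on RPC authentication in Hadoop, set the value of the hadoop.security.authentication property to "kerberos", and set the security-related settings listed below appropriately.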


Parameter | Value | Notes
hadoop.security.authentication | kerberos | simple : No authentication. (default) kerberos : Enable authentication by Kerberos.
hadoop.security.authorization | true | Enable RPC service-level authorization.
hadoop.rpc.protection | authentication | authentication : authentication only (default); integrity : integrity check in addition to authentication; privacy : data encryption in addition to integrity
hadoop.security.auth_to_local | RULE:exp1 RULE:exp2 … DEFAULT | The value is a string containing new line characters. See Kerberos documentation for the format of exp.
hadoop.proxyuser.superuser.hosts | | Comma-separated hosts from which superuser access is allowed for impersonation. * means wildcard.
hadoop.proxyuser.superuser.groups | | Comma-separated groups to which users impersonated by superuser belong. * means wildcard.
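
As an illustration of how such settings end up in core-site.xml, the sketch below renders a dictionary of properties into Hadoop's configuration XML layout (a configuration element containing property elements with name and value children). The helper name is hypothetical, and the chosen values are only examples taken from the table above.

```python
import xml.etree.ElementTree as ET


def render_site_xml(properties):
    """Render {name: value} pairs as a Hadoop *-site.xml document string."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    ET.indent(root)  # pretty-print; requires Python 3.9 or later
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(render_site_xml({
        "hadoop.security.authentication": "kerberos",
        "hadoop.security.authorization": "true",
        "hadoop.rpc.protection": "authentication",
    }))
```

The same file has to be deployed to every node, which is where the _HOST wildcard in principal names becomes useful.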

4.3. NameNode

Parameter | Value | Notes
dfs.block.access.token.enable | true | Enable HDFS block access tokens for secure operations.
dfs.namenode.kerberos.principal | nn/_HOST@REALM.TLD | Kerberos principal name for the NameNode.
dfs.namenode.keytab.file | /etc/security/keytab/nn.service.keytab | Kerberos keytab file for the NameNode.
dfs.namenode.kerberos.internal.spnego.principal | HTTP/_HOST@REALM.TLD | The server principal used by the NameNode for web UI SPNEGO authentication. The SPNEGO server principal begins with the prefix HTTP/ by convention. If the value is ‘*’, the web server will attempt to login with every

The following settings allow configuring SSL access to the NameNode web UI (optional).

Parameter | Value | Notes
dfs.http.policy | HTTP_ONLY or HTTPS_ONLY or HTTP_AND_HTTPS | HTTPS_ONLY turns off http access. This option takes precedence over the deprecated configuration dfs.https.enable and hadoop.ssl.enabled. If using SASL to authenticate data transfer protocol instead of running DataNode as root and using privileged ports, then this property must be set to HTTPS_ONLY to guarantee authentication of HTTP servers. (See dfs.data.transfer.protection.)
dfs.namenode.https-address | 0.0.0.0:9871 | This parameter is used in non-HA mode and without federation. See HDFS High Availability and HDFS Federation for details.
dfs.https.enable | true | This value is deprecated. Use dfs.http.policy

4.4. Secondary NameNode

Parameter | Value | Notes
dfs.namenode.secondary.http-address | 0.0.0.0:9868 | HTTP web UI address for the Secondary NameNode.
dfs.namenode.secondary.https-address | 0.0.0.0:9869 | HTTPS web UI address for the Secondary NameNode.
dfs.secondary.namenode.keytab.file | /etc/security/keytab/sn.service.keytab | Kerberos keytab file for the Secondary NameNode.
dfs.secondary.namenode.kerberos.principal | sn/_HOST@REALM.TLD | Kerberos principal name for the Secondary NameNode.
dfs.secondary.namenode.kerberos.internal.spnego.principal | HTTP/_HOST@REALM.TLD | The server principal used by the Secondary NameNode for web UI SPNEGO authentication. The SPNEGO server principal begins with the prefix HTTP/ by convention. If the value is ‘*’, the web server will attempt to login with every principal specified in the keytab file dfs.web.authentication.kerberos.keytab. For most deployments this can be set to ${dfs.web.authentication.kerberos.principal}, i.e. use the value of dfs.web.authentication.kerberos.principal.

4.5. JournalNode

Parameter | Value | Notes
dfs.journalnode.kerberos.principal | jn/_HOST@REALM.TLD | Kerberos principal name for the JournalNode.
dfs.journalnode.keytab.file | /etc/security/keytab/jn.service.keytab | Kerberos keytab file for the JournalNode.
dfs.journalnode.kerberos.internal.spnego.principal | HTTP/_HOST@REALM.TLD | The server principal used by the JournalNode for web UI SPNEGO authentication when Kerberos security is enabled. The SPNEGO server principal begins with the prefix HTTP/ by convention. If the value is ‘*’, the web server will attempt to login with every principal specified in the keytab file dfs.web.authentication.kerberos.keytab. For most deployments this can be set to ${dfs.web.authentication.kerberos.principal}, i.e. use the value of dfs.web.authentication.kerberos.principal.
dfs.web.authentication.kerberos.keytab | /etc/security/keytab/spnego.service.keytab | SPNEGO keytab file for the JournalNode. In HA clusters this setting is shared with the NameNodes.

4.6. DataNode

Parameter | Value | Notes
dfs.datanode.address | 0.0.0.0:1004 | Secure DataNode must use a privileged port in order to assure that the server was started securely. This means that the server must be started via jsvc. Alternatively, this must be set to a non-privileged port if using SASL to authenticate data transfer protocol. (See dfs.data.transfer.protection.)
dfs.datanode.http.address | 0.0.0.0:1006 | Secure DataNode must use a privileged port in order to assure that the server was started securely. This means that the server must be started via jsvc.
dfs.datanode.https.address | 0.0.0.0:9865 | HTTPS web UI address for the DataNode.
dfs.datanode.kerberos.principal | dn/_HOST@REALM.TLD | Kerberos principal name for the DataNode.
dfs.datanode.keytab.file | /etc/security/keytab/dn.service.keytab | Kerberos keytab file for the DataNode.
dfs.encrypt.data.transfer | false | Set to true when using data encryption.
dfs.encrypt.data.transfer.algorithm | | Optionally set to 3des or rc4 when using data encryption to control the encryption algorithm.
dfs.encrypt.data.transfer.cipher.suites | | Optionally set to AES/CTR/NoPadding to activate AES encryption when using data encryption.
dfs.encrypt.data.transfer.cipher.key.bitlength | | Optionally set to 128, 192 or 256 to control key bit length when using AES with data encryption.
dfs.data.transfer.protection | | authentication : authentication only; integrity : integrity check in addition to authentication; privacy : data encryption in addition to integrity. This property is unspecified by default. Setting this property enables SASL for authentication of data transfer protocol. If this is enabled, then dfs.datanode.address must use a non-privileged port, dfs.http.policy must be set to HTTPS_ONLY and the HDFS_DATANODE_SECURE_USER environment variable must be undefined when starting the DataNode process.

4.7. WebHDFS

Parameter | Value | Notes
dfs.web.authentication.kerberos.principal | http/_HOST@REALM.TLD | Kerberos principal name for the WebHDFS. In HA clusters this setting is commonly used by the JournalNodes for securing access to the JournalNode HTTP server with SPNEGO.
dfs.web.authentication.kerberos.keytab | /etc/security/keytab/http.service.keytab | Kerberos keytab file for WebHDFS. In HA clusters this setting is commonly used by the JournalNodes for securing access to the JournalNode HTTP server with SPNEGO.


4.8. ResourceManager

Parameter | Value | Notes
yarn.resourcemanager.principal | rm/_HOST@REALM.TLD | Kerberos principal name for the ResourceManager.
yarn.resourcemanager.keytab | /etc/security/keytab/rm.service.keytab | Kerberos keytab file for the ResourceManager.
yarn.resourcemanager.webapp.https.address | ${yarn.resourcemanager.hostname}:8090 | The https address of the RM web application for non-HA. In HA clusters, use yarn.resourcemanager.webapp.https.address.rm-id for each ResourceManager. See ResourceManager High Availability for details.

4.9. NodeManager

Parameter | Value | Notes
yarn.nodemanager.principal | nm/_HOST@REALM.TLD | Kerberos principal name for the NodeManager.
yarn.nodemanager.keytab | /etc/security/keytab/nm.service.keytab | Kerberos keytab file for the NodeManager.
yarn.nodemanager.container-executor.class | org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor | Use LinuxContainerExecutor.
yarn.nodemanager.linux-container-executor.group | hadoop | Unix group of the NodeManager.
yarn.nodemanager.linux-container-executor.path | /path/to/bin/container-executor | The path to the executable of Linux container executor.
yarn.nodemanager.webapp.https.address | 0.0.0.0:8044 | The https address of the NM web application.

4.10. Configuration for WebAppProxy


Parameter | Value | Notes
yarn.web-proxy.address | WebAppProxy host:port for proxy to AM web apps. | host:port. If this is the same as yarn.resourcemanager.webapp.address or it is not defined, then the ResourceManager will run the proxy; otherwise a standalone proxy server will need to be launched.
yarn.web-proxy.keytab | /etc/security/keytab/web-app.service.keytab | Kerberos keytab file for the WebAppProxy.
yarn.web-proxy.principal | wap/_HOST@REALM.TLD | Kerberos principal name for the WebAppProxy.

4.11. LinuxContainerExecutor


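The YARN framework uses a ContainerExecutor to launch and control containers.

The following are available in Hadoop YARN:

ContainerExecutor | Description
DefaultContainerExecutor | The default executor which YARN uses to manage container execution. The container process has the same Unix user as the NodeManager.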


mvn package -Dcontainer-executor.conf.dir=/etc/hadoop/

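The executable should be installed in $HADOOP_YARN_HOME/bin.

The executable must have specific permissions: 6050 or --Sr-s---, user-owned by root (super-user) and group-owned by a special group (e.g. hadoop) of which the NodeManager Unix user is a member and no ordinary application user is.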
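
The permission requirement above can be verified with a few lines of code. This is a sketch with hypothetical function names, assuming a Unix system; it only checks the mode, the owning uid and the owning group name described in the text, not group membership.

```python
import stat

REQUIRED_MODE = 0o6050  # setuid + setgid, group read/execute only


def executable_problems(mode, owner_uid, group_name, expected_group="hadoop"):
    """Compare file metadata with the container-executor requirements."""
    problems = []
    actual = stat.S_IMODE(mode)  # strip the file-type bits
    if actual != REQUIRED_MODE:
        problems.append(f"mode is {actual:04o}, expected {REQUIRED_MODE:04o}")
    if owner_uid != 0:
        problems.append("owner must be root")
    if group_name != expected_group:
        problems.append(f"group must be {expected_group}")
    return problems


def inspect_executable(path, expected_group="hadoop"):
    """Read the metadata of a real file and check it (Unix only)."""
    import grp
    import os

    st = os.stat(path)
    group_name = grp.getgrgid(st.st_gid).gr_name
    return executable_problems(st.st_mode, st.st_uid, group_name, expected_group)


if __name__ == "__main__":
    # Symbolic form of the required mode, with the leading file-type character.
    print(stat.filemode(stat.S_IFREG | REQUIRED_MODE))
```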



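For example, let’s say that the NodeManager is run as user yarn, who is part of the groups users and hadoop, either of them being the primary group. Let’s also say that users has both yarn and another user (the application submitter) alice as its members, and that alice does not belong to hadoop.

Going by the above description, the setuid/setgid executable should be set to 6050 or --Sr-s--- with user-owner yarn and group-owner hadoop, which has yarn as its member (and not users, which has alice as a member besides yarn).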


  • conf/container-executor.cfg

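A configuration file called container-executor.cfg is required.


The configuration file must be owned by the user running the NodeManager (user yarn in the above example), may be group-owned by any group, and should have the permissions 0400 or r--------.

The executable requires the following configuration items to be present in the conf/container-executor.cfg file. The items should be given as simple key=value pairs, one per line:

Parameter | Value | Notes
yarn.nodemanager.linux-container-executor.group | hadoop | Unix group of the NodeManager.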
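
A file in this key=value form is easy to read back for a sanity check. The parser below is a minimal sketch with a hypothetical function name; skipping blank lines and lines starting with # is an assumption, and it does not model any other syntax the real file may accept.

```python
REQUIRED_KEY = "yarn.nodemanager.linux-container-executor.group"


def parse_container_executor_cfg(text):
    """Parse simple key=value lines, one setting per line, into a dict."""
    settings = {}
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("#"):  # assumption: comments allowed
            continue
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"line {number}: expected key=value, got {raw!r}")
        settings[key.strip()] = value.strip()
    return settings


if __name__ == "__main__":
    sample = "yarn.nodemanager.linux-container-executor.group=hadoop\n"
    settings = parse_container_executor_cfg(sample)
    if REQUIRED_KEY not in settings:
        raise SystemExit(f"missing {REQUIRED_KEY}")
    print(settings[REQUIRED_KEY])
```

The group configured here should be the same one set for yarn.nodemanager.linux-container-executor.group in the NodeManager configuration.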



4.12. MapReduce JobHistory Server

Parameter | Value | Notes
mapreduce.jobhistory.address | MapReduce JobHistory Server host:port | Default port is 10020.
mapreduce.jobhistory.keytab | /etc/security/keytab/jhs.service.keytab | Kerberos keytab file for the MapReduce JobHistory Server.
mapreduce.jobhistory.principal | jhs/_HOST@REALM.TLD | Kerberos principal name for the MapReduce JobHistory Server.

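5. Multihoming

Multihomed setups, where each host has multiple hostnames in DNS (e.g. different hostnames corresponding to public and private network interfaces), may require additional configuration to get Kerberos authentication working. See HDFS Support for Multihomed Networks.

6. Troubleshooting


  1. Network and DNS configuration.
  2. Kerberos configuration on hosts (/etc/krb5.conf).
  3. Keytab creation and maintenance.
  4. Environment setup: JVM, user login, system clocks, etc.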



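  • Set the environment variable HADOOP_JAAS_DEBUG to true.
  • Edit the log4j.properties file to log Hadoop’s security package at DEBUG level.
  • Enable JVM-level debugging by setting some system properties.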
export HADOOP_OPTS="-Djava.net.preferIPv4Stack=true -Dsun.security.krb5.debug=true -Dsun.security.spnego.debug"



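KDiag contains a series of probes for the JVM’s configuration and the environment, dumps out some system files (/etc/krb5.conf, /etc/ntp.conf), prints out some system state, and then attempts to log in to Kerberos as the current user or as a specific principal in a named keytab.


The KDiag command has its own entry point; it is invoked by passing kdiag to the bin/hadoop command. Accordingly, it displays the Kerberos client state of the command used to invoke it.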

hadoop kdiag




-1: the command failed for an unknown reason
41: Unauthorized (== HTTP’s 401). KDiag detected a condition which causes Kerberos to not work. Examine the output to identify the issue.
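
A script that wraps hadoop kdiag can translate these codes into messages. The sketch below uses a hypothetical helper name and relies on two assumptions: that 0 means the diagnostics run succeeded, and that an exit code of -1 is reported by the operating system as 255.

```python
import subprocess


def describe_kdiag_exit(returncode):
    """Map a kdiag exit status to a short human-readable message."""
    if returncode == 0:
        return "diagnostics completed successfully"  # assumption, see lead-in
    if returncode == 41:
        return "unauthorized: a condition was found that stops Kerberos working"
    if returncode in (-1, 255):  # -1 is seen as 255 after 8-bit truncation
        return "the command failed for an unknown reason"
    return f"unexpected exit code {returncode}"


def run_kdiag(extra_args=()):
    """Run 'hadoop kdiag' and return (exit code, description)."""
    result = subprocess.run(["hadoop", "kdiag", *extra_args])
    return result.returncode, describe_kdiag_exit(result.returncode)


if __name__ == "__main__":
    print(describe_kdiag_exit(41))
```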

6.1. Usage

KDiag: Diagnose Kerberos Problems
  [-D key=value] : Define a configuration option.
  [--jaas] : Require a JAAS file to be defined in java.security.auth.login.config.
  [--keylen <keylen>] : Require a minimum size for encryption keys supported by the JVM. Default value : 256.
  [--keytab <keytab> --principal <principal>] : Login from a keytab as a specific principal.
  [--nofail] : Do not fail on the first problem.
  [--nologin] : Do not attempt to log in.
  [--out <file>] : Write output to a file.
  [--resource <resource>] : Load an XML configuration resource.
  [--secure] : Require the hadoop configuration to be secure.
  [--verifyshortname <principal>]: Verify the short name of the specific principal does not contain '@' or '/'

--jaas: Require a JAAS file to be defined in java.security.auth.login.config.
If --jaas is set, the Java system property java.security.auth.login.config must be set to a JAAS file; this file must exist, be a simple file of non-zero bytes, and readable by the current user. More detailed validation is not performed.

JAAS files are not needed by Hadoop itself, but some services (such as Zookeeper) do require them for secure operation.

--keylen <keylen>: Require a minimum size for encryption keys supported by the JVM.
If the JVM does not support this length, the command will fail.

The default value is 256, as needed for the AES256 encryption scheme. A JVM without the Java Cryptography Extensions installed does not support such a key length. Kerberos will not work unless configured to use an encryption scheme with a shorter key length.

--keytab <keytab> --principal <principal>: Log in from a keytab.
Log in from a keytab as the specific principal.

1. The file must contain the specific principal, including any named host. That is, there is no mapping from _HOST to the current hostname.
2. KDiag will log out and attempt to log back in again. This catches JVM compatibility problems which have existed in the past. (Hadoop’s Kerberos support requires use of/introspection into JVM-specific classes).

--nofail: Do not fail on the first problem.
KDiag will make a best-effort attempt to diagnose all Kerberos problems, rather than stop at the first one.

This is somewhat limited; checks are made in the order in which problems surface (e.g. key length is checked first), so an early failure can trigger many more problems. But it does produce a more detailed report.

--nologin: Do not attempt to log in.
Skip trying to log in. This takes precedence over the --keytab option, and also disables trying to log in to Kerberos as the current kinited user.

This is useful when the KDiag command is being invoked within an application, as it does not set up Hadoop’s static security state; it merely checks for some basic Kerberos preconditions.

--out outfile: Write output to file.

hadoop kdiag --out out.txt

Much of the diagnostics information comes from the JRE (to stderr) and from Log4j (to stdout). To get all the output, it is best to redirect both these output streams to the same file, and omit the --out option.

hadoop kdiag --keytab zk.service.keytab --principal zookeeper/devix.example.org@REALM > out.txt 2>&1

Even there, the output of the two streams, emitted across multiple threads, can be a bit confusing. It will get easier with practise. Looking at the thread name in the Log4j output to distinguish background threads from the main thread helps at the hadoop level, but doesn’t assist in JVM-level logging.

--resource <resource>: XML configuration resource to load.
This option loads additional XML configuration files. By default, only the core-default and core-site XML resources are loaded, so this helps when additional configuration files contain Kerberos-related settings.

hadoop kdiag --resource hbase-default.xml --resource hbase-site.xml

For extra logging during the operation, set the logging and HADOOP_JAAS_DEBUG environment variable to the values listed in “Troubleshooting”. The JVM options are automatically set in KDiag.

--secure: Fail if the command is not executed on a secure cluster.
That is: if the authentication mechanism of the cluster is explicitly or implicitly set to “simple”:


Needless to say, an application so configured cannot talk to a secure Hadoop cluster.

--verifyshortname <principal>: validate the short name of a principal
This verifies that the short name of a principal contains neither the “@” nor “/” characters.

6.2. Example

hadoop kdiag \
  --nofail \
  --resource hdfs-site.xml --resource yarn-site.xml \
  --keylen 1024 \
  --keytab zk.service.keytab --principal zookeeper/devix.example.org@REALM

This attempts to perform all diagnostics without failing early, load in the HDFS and YARN XML resources, require a minimum key length of 1024 bits, and log in as the principal zookeeper/devix.example.org@REALM, whose key must be in the keytab zk.service.keytab.

