
I have a Spark Streaming application, and for every batch I need to insert data into HBase, which is protected by Kerberos. The solution I found is this: on the driver side I create a connection and obtain a token from that connection, then pass it to the executors. On the executor side I decode it to get the token back, and with it I can insert data into HBase successfully. This seems to work, but my concern is: will the token expire? If so, how can I solve that?

My code snippet is:

import java.security.PrivilegedAction
import org.apache.hadoop.hbase.client.ConnectionFactory
import org.apache.hadoop.hbase.security.token.TokenUtil
import org.apache.hadoop.security.UserGroupInformation

// Driver side: log in from the keytab and obtain an HBase delegation token
val ugi = UserGroupInformation.loginUserFromKeytabAndReturnUGI(principal, keytabFile)
ugi.doAs(new PrivilegedAction[Unit]() {
  def run(): Unit = {
    conn = ConnectionFactory.createConnection(conf)
    val token = TokenUtil.obtainToken(conn)
    tokenStr = token.encodeToUrlString()
  }
})

In rdd.foreachPartition, on the executor side:

val token = new Token()
token.decodeFromUrlString(tokenStr)
UserGroupInformation.getCurrentUser.addToken(token)
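Regarding the expiry concern: a delegation token has an issue time and an expiry time, and a common rule of thumb is to renew it at some fraction of its lifetime so there is headroom before it actually expires. The following is a minimal, HBase-free sketch of that arithmetic only; the method name and the hard-coded 24-hour lifetime are illustrative, not taken from the HBase API (real timestamps would come from the token's decoded identifier):

```java
// Sketch: compute how long to wait before renewing a delegation token,
// given its issue and expiry timestamps. Renewing at ~80% of the
// lifetime leaves headroom before the token actually expires.
public class TokenRenewSchedule {
    /** Milliseconds to wait, counted from issueMillis, before renewing. */
    static long renewalDelayMillis(long issueMillis, long expiryMillis, double fraction) {
        long lifetime = expiryMillis - issueMillis;
        return (long) (lifetime * fraction);
    }

    public static void main(String[] args) {
        long issue = 0L;
        long expiry = 24L * 60 * 60 * 1000; // assume a 24-hour token lifetime
        System.out.println(renewalDelayMillis(issue, expiry, 0.8)); // 69120000
    }
}
```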

I have searched a lot on the Internet about this issue, but I did not find a good solution. The common answer to this question is

UserGroupInformation.getLoginUser().checkTGTAndReloginFromKeytab();

But from my testing, inside this method,

public synchronized void checkTGTAndReloginFromKeytab() throws IOException {
  if (!isSecurityEnabled()
      || user.getAuthenticationMethod() != AuthenticationMethod.KERBEROS
      || !isKeytab)
    return;
  KerberosTicket tgt = getTGT();
  if (tgt != null && Time.now() < getRefreshTime(tgt)) {
    return;
  }
  reloginFromKeytab();
}

isKeytab is always false, so the relogin code after the first check is never executed, and I do not understand why it returns false. Can anybody help me solve this? Any help is appreciated!
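One workaround I am considering, sketched below, is to simply re-run the driver-side login and token acquisition on a fixed schedule shorter than the token lifetime, and re-publish the fresh encoded token for the executors to pick up. `obtainFreshToken()` here is a hypothetical placeholder for the `loginUserFromKeytabAndReturnUGI` + `TokenUtil.obtainToken` code from my question, not a real API:

```java
// Sketch: periodically re-obtain the delegation token on the driver and
// store the fresh encoded form where executor-bound code can read it.
// obtainFreshToken() stands in for the real login + obtainToken logic.
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

public class TokenRefresher {
    private final AtomicReference<String> currentToken = new AtomicReference<>();
    private final ScheduledExecutorService scheduler =
        Executors.newSingleThreadScheduledExecutor();

    /** Placeholder for loginUserFromKeytabAndReturnUGI + TokenUtil.obtainToken. */
    String obtainFreshToken() {
        return "encoded-token-" + System.nanoTime();
    }

    void start(long periodHours) {
        // Refresh well inside the token lifetime; the cluster may be
        // configured with a much shorter lifetime than the default.
        scheduler.scheduleAtFixedRate(
            () -> currentToken.set(obtainFreshToken()),
            0, periodHours, TimeUnit.HOURS);
    }

    public static void main(String[] args) throws Exception {
        TokenRefresher r = new TokenRefresher();
        r.start(8);
        Thread.sleep(200); // let the initial refresh run
        System.out.println(r.currentToken.get() != null);
        r.scheduler.shutdownNow();
    }
}
```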

Coinnigh
    Congratulations! You have re-developed the Spark feature that automatically retrieves an HBase token and broadcasts it to the executors whenever the Spark launcher detects an `hbase-site.xml` with Kerberos auth (since V1.3). You have also re-developed part of the standard HBase plugin for Spark, which was contributed by Cloudera and is available *(a)* in the CDH distro, *(b)* as an additional JAR for other distros using HBase 1.x, or *(c)* natively in HBase 2.x. – Samson Scharfrichter Jun 29 '17 at 11:45
  • But as I tested, after several hours the application stopped at the insertToHabse method, and I got a "javax.security.sasl.SaslException: DIGEST-MD5: digest response format violation." exception, so I suspect the token expired. I can renew the token, but I do not know when I need to do it; I want to renew before the token expires, but when will it expire? – Coinnigh Jul 01 '17 at 03:25
  • Did you simply SEARCH past StackOverflow questions? You might have found the answer by Chris Nauroth (Hortonworks) to https://stackoverflow.com/questions/34616676/should-i-call-ugi-checktgtandreloginfromkeytab-before-every-action-on-hadoop – Samson Scharfrichter Jul 01 '17 at 12:40

1 Answer


It is caused by the Java version. If you want to run a secured Hadoop cluster on JDK 1.7.0_85 or later, you must run Apache Hadoop 2.7.0 or later.

See this Jira issue: HADOOP-10786.
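To check whether your JVM is at or past the affected build, you can compare the update number in the version string. A small sketch, assuming the pre-JDK-9 `1.x.y_uu` version-string format (the class and method names are illustrative):

```java
// Sketch: parse the update number out of a pre-JDK-9 version string
// such as "1.7.0_85", to check whether the running JVM is at or past
// the build where HADOOP-10786 starts to matter.
public class JdkVersionCheck {
    static int updateNumber(String version) {
        int idx = version.indexOf('_');
        return idx < 0 ? 0 : Integer.parseInt(version.substring(idx + 1));
    }

    public static void main(String[] args) {
        System.out.println(updateNumber("1.7.0_85"));       // 85
        System.out.println(updateNumber("1.7.0_85") >= 85); // true
    }
}
```

In a real application you would pass `System.getProperty("java.version")` instead of the hard-coded string.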

Talha Junaid