I have a Kafka Connect cluster running several HDFS sink connector instances. They write to an HDFS cluster secured with Kerberos. The connectors work fine, but I have some questions about the security configuration. The security-related part of the connector configuration is:

      "hdfs.authentication.kerberos": "true",
      "connect.hdfs.principal": "my_custom_user@MY_DOMAIN",
      "connect.hdfs.keytab": "/etc/kafka/my_costom_user.keytab",
      "hdfs.namenode.principal": "nn/_HOST@MY_DOMAIN",

I wrote the configuration following the official documentation, but I do not understand why I need to specify connect.hdfs.principal, connect.hdfs.keytab and hdfs.namenode.principal all together. I know a ticket has to be created, but I am not sure how the request is issued. I guess connect.hdfs.principal + connect.hdfs.keytab are used to get the TGT, but why do I need hdfs.namenode.principal?
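
For reference, the `_HOST` part of the hdfs.namenode.principal value is a placeholder rather than a literal host name. Hadoop has a helper for this kind of substitution (`SecurityUtil.getServerPrincipal`); the Python below is only a minimal sketch of the idea, not the connector's code. The function name `expand_host_placeholder` and the host `NameNode01.example.com` are hypothetical, and which host name gets substituted in practice depends on the caller.

```python
import socket


def expand_host_placeholder(principal, hostname=None):
    """Replace the _HOST instance of 'service/_HOST@REALM' with a lowercase FQDN."""
    service, slash, rest = principal.partition("/")
    instance, at, realm = rest.partition("@")
    if not slash or not at or instance != "_HOST":
        # Not of the form service/_HOST@REALM: leave the principal untouched.
        return principal
    # Fall back to the local machine's FQDN when no host name is supplied.
    fqdn = hostname or socket.getfqdn()
    return f"{service}/{fqdn.lower()}@{realm}"


# Hypothetical host name, passed explicitly so the result does not depend on the local machine.
expanded = expand_host_placeholder("nn/_HOST@MY_DOMAIN", "NameNode01.example.com")
```

The result is still only a name: expanding it needs no keytab and no contact with the KDC.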

I thought hdfs.namenode.principal was required to get the Authorization Token and then obtain a Delegation Token, but now I think that does not make sense: the Kafka Connect cluster does not have the keytab for hdfs.namenode.principal installed, so if hdfs.namenode.principal were used to create a TGT, that would have to happen in the Hadoop cluster.
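
To make the keytab question concrete, here is a toy model of the two generic Kerberos exchanges: the client uses its own key to obtain a TGT, then presents that TGT and merely names the service principal to obtain a service ticket. This is a sketch under strong simplifications (no encryption, no expiry), and `ToyKdc`, `Ticket`, the key strings and `namenode.example.com` are all hypothetical names for illustration. It does not show how the HDFS sink connector itself is implemented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ticket:
    client: str   # principal the ticket was issued to
    service: str  # principal the ticket is valid for


class ToyKdc:
    """Toy stand-in for a KDC; real Kerberos exchanges encrypted messages."""

    def __init__(self, realm, keys):
        self.realm = realm
        self.keys = dict(keys)  # principal -> secret key registered with the KDC
        self.tgs_principal = f"krbtgt/{realm}@{realm}"

    def request_tgt(self, principal, key):
        # AS exchange: the client proves it holds the key stored in its own keytab.
        if self.keys.get(principal) != key:
            raise PermissionError("unknown principal or wrong key")
        return Ticket(client=principal, service=self.tgs_principal)

    def request_service_ticket(self, tgt, service_principal):
        # TGS exchange: the client presents its TGT and names the target service.
        if tgt.service != self.tgs_principal:
            raise PermissionError("a TGT is required")
        if service_principal not in self.keys:
            raise LookupError("service principal not registered with the KDC")
        return Ticket(client=tgt.client, service=service_principal)


kdc = ToyKdc("MY_DOMAIN", {
    "my_custom_user@MY_DOMAIN": "client-key",           # client side has this key
    "nn/namenode.example.com@MY_DOMAIN": "namenode-key",  # only the KDC and NameNode have this one
})
tgt = kdc.request_tgt("my_custom_user@MY_DOMAIN", "client-key")
service_ticket = kdc.request_service_ticket(tgt, "nn/namenode.example.com@MY_DOMAIN")
```

In this model the client side never touches `namenode-key`; it only needs to know the service principal's name.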

Could anybody shed some light on this?

OneCricketeer
kiuby_88
  • Maybe `connect.hdfs.principal` and `connect.hdfs.keytab` are used to create and log in the UGI? – kiuby_88 Mar 15 '20 at 19:53
  • I think this answer https://stackoverflow.com/a/34691071/2598606 explains very well what RPC is and how it works. I have checked the HDFS sink code again and I understand the following: `connect.hdfs.principal` + `connect.hdfs.keytab` are used for `UserGroupInformation#loginUserFromKeytab`, and `hdfs.namenode.principal` is the principal of the NameNode. – kiuby_88 Mar 18 '20 at 17:00

0 Answers