I have a Kafka Connect cluster running several HDFS sink connector instances. They write to an HDFS cluster secured with Kerberos. Although the connectors work fine, I have some questions about the security configuration. Below are the security-related settings from the connector configuration:
"hdfs.authentication.kerberos": "true",
"connect.hdfs.principal": "my_custom_user@MY_DOMAIN",
"connect.hdfs.keytab": "/etc/kafka/my_costom_user.keytab",
"hdfs.namenode.principal": "nn/_HOST@MY_DOMAIN",
I wrote the configuration following the official documentation, but I do not understand why I need to specify connect.hdfs.principal, connect.hdfs.keytab, and hdfs.namenode.principal all together. I know a ticket has to be created, but I am not sure how the request is issued. I guess connect.hdfs.principal and connect.hdfs.keytab are used to get the TGT, but why do I need hdfs.namenode.principal?
I thought hdfs.namenode.principal was required to get the authorization token and then obtain a delegation token, but now I think that does not make sense: the Kafka Connect cluster does not have the keytab for hdfs.namenode.principal installed, so if hdfs.namenode.principal is used to create a TGT, I understand that this would have to happen in the Hadoop cluster.
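To make my current mental model concrete: my guess is that the `_HOST` placeholder in hdfs.namenode.principal is replaced with the NameNode's hostname, and that the resulting name is only used on the client side as the service principal to request a ticket for, which would not require the NameNode keytab. The sketch below is my own illustration of that substitution, not code from the connector or from Hadoop; the class `PrincipalSketch`, the method `expand`, and the hostname `namenode01.example.com` are made up for the example.

```java
import java.util.Locale;

// Hypothetical helper illustrating how I picture the _HOST substitution.
// This is an assumption about the behavior, not the connector's actual code.
final class PrincipalSketch {
    private static final String HOST_PLACEHOLDER = "_HOST";

    // Expands "service/_HOST@REALM" into "service/<host>@REALM".
    // Any pattern that does not have that three-part shape is returned unchanged.
    static String expand(String principalPattern, String nameNodeHost) {
        String[] parts = principalPattern.split("[/@]");
        if (parts.length != 3 || !HOST_PLACEHOLDER.equals(parts[1])) {
            return principalPattern;
        }
        // Hostnames in Kerberos principals are conventionally lower case.
        return parts[0] + "/" + nameNodeHost.toLowerCase(Locale.ROOT) + "@" + parts[2];
    }

    public static void main(String[] args) {
        // Assumed NameNode hostname, for illustration only.
        System.out.println(expand("nn/_HOST@MY_DOMAIN", "namenode01.example.com"));
    }
}
```

If this picture is right, the value from my configuration would become something like nn/namenode01.example.com@MY_DOMAIN, while connect.hdfs.principal (which has no `_HOST` part) would stay as it is.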
Could anybody shed some light on this?