
Connecting to Hive Sources on Kerberized HDP Cluster

A secure HDP cluster uses Kerberos authentication to validate and confirm access requests. You can set up Zoomdata to connect to the secure HDP cluster using the instructions below, which have been tested on HDP v2.3.0 and Tez v0.7.0.

Preparing the Hive Cluster

  • Kerberos authentication works properly only when the clocks on all instances are closely synchronized. Enable the Network Time Protocol (NTP) service in your network. For more information, see the article Using the Network Time Protocol to Synchronize Time.
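
One way to confirm that a node's clock is synchronized is to inspect the output of timedatectl. The sketch below uses a hypothetical helper (clock_is_synchronized is not part of Zoomdata or HDP) and assumes a systemd-based host, where the status line reads either "NTP synchronized: yes" or "System clock synchronized: yes" depending on the systemd version.

```shell
# Hypothetical helper: succeeds when timedatectl-style text on stdin
# reports a synchronized clock. Assumes a systemd-based host.
clock_is_synchronized() {
  grep -Eq '(NTP|System clock) synchronized: yes'
}

# Usage on each node (requires systemd):
#   timedatectl | clock_is_synchronized && echo "clock in sync"
```

Run the check on the Zoomdata node and on the Hive nodes; a node that fails the check is a likely cause of Kerberos authentication errors.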

Configuring Zoomdata Services

Obtaining Kerberos Credentials

Each service must have its own unique identifier called a principal. Perform the following steps:

  1. Install the Kerberos client on the CentOS or Ubuntu machine where the Zoomdata Server resides.
  2. Generate a Kerberos principal and a corresponding keytab for the Zoomdata service. Before you proceed, make sure that:

  • The Zoomdata service is running on a node with a proper Kerberos configuration: /etc/krb5.conf or a similar location for your Linux distribution.
  • The Kerberos realm in your environment is the same as the realm specified in the kdc.conf file on the Hive server.

  3. Check the Kerberos configuration (that is, krb5.conf) and the validity of the principal and keytab pair using the MIT Kerberos client:

     kinit -V -k -t zoomdata_principal.keytab zoomdata_principal@KERBEROS.REALM

  4. Make the keytab accessible to Zoomdata's Hive on Tez connector:

     sudo mkdir -p /etc/zoomdata
     sudo mv zoomdata_principal.keytab /etc/zoomdata
     sudo chown zoomdata:zoomdata /etc/zoomdata/zoomdata_principal.keytab
     sudo chmod 600 /etc/zoomdata/zoomdata_principal.keytab

Replace the placeholders with your actual principal name and realm.
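
Steps 2 through 4 can be sketched as two shell helpers. This is a minimal sketch, not Zoomdata's documented procedure: it assumes an MIT Kerberos KDC that you are allowed to administer with kadmin, and the function names create_principal_and_keytab and check_keytab are hypothetical. The mode check relies on GNU stat, as shipped with CentOS and Ubuntu.

```shell
# Hypothetical helper: create a principal with a random key and export it
# to a keytab. Assumes MIT Kerberos kadmin and sufficient admin privileges.
create_principal_and_keytab() {
  principal="$1"   # e.g. zoomdata_principal@KERBEROS.REALM
  keytab="$2"      # e.g. zoomdata_principal.keytab
  kadmin -q "addprinc -randkey $principal" &&
    kadmin -q "ktadd -k $keytab $principal"
}

# Hypothetical helper: confirm that a keytab exists and that only its
# owner can read or write it (mode 600).
check_keytab() {
  keytab_path="$1"
  if [ ! -f "$keytab_path" ]; then
    echo "missing: $keytab_path"
    return 1
  fi
  keytab_mode="$(stat -c '%a' "$keytab_path")"   # GNU stat syntax
  if [ "$keytab_mode" != "600" ]; then
    echo "unexpected mode $keytab_mode: $keytab_path"
    return 1
  fi
  echo "ok: $keytab_path"
}
```

After step 4, calling check_keytab /etc/zoomdata/zoomdata_principal.keytab should print a line that begins with "ok:". Any other output points to a missing file or to permissions that are looser or stricter than intended.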

Configuring a Hive on Tez Connector

  1. Create or update the file named /etc/zoomdata/edc-tez.properties. If this file already exists, verify that the information below exists in the file:

  2. Restart the Hive on Tez connector:

    sudo systemctl restart zoomdata-edc-tez
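
To verify that a setting is present in the properties file before restarting the connector, a small check such as the following may help. It is a sketch only: has_property is a hypothetical helper, and example.key in the usage comment is a placeholder, not a real Zoomdata property name.

```shell
# Hypothetical helper: succeeds when a Java-style properties file defines
# the given key on a line of the form "key=value" or "key: value".
# Comment lines starting with # or ! are ignored.
has_property() {
  props_file="$1"
  props_key="$2"
  awk -v k="$props_key" -F'[=:]' '
    !/^[ \t]*[#!]/ {
      key = $1
      gsub(/^[ \t]+|[ \t]+$/, "", key)   # trim surrounding whitespace
      if (key == k) found = 1
    }
    END { exit !found }
  ' "$props_file"
}

# Usage (example.key is a placeholder key name):
#   has_property /etc/zoomdata/edc-tez.properties example.key || echo "not set"
```

The key is compared as a literal string, so dots in property names are not treated as wildcards.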

Connecting to Kerberized Hive Source

You are now ready to create the Hive on Tez source:

  1. Open a new browser window and log into Zoomdata.
  2. Select Sources.
  3. Select Hive on Tez.
  4. Specify the name of your source and add a description (if desired). Select Next.
  5. On the Connection page, define the connection source. You can use an existing connection, if available, or create a new one. To create a new connection, select the Input New Credentials option button and specify the connection name and JDBC URL. Make sure that you enter the JDBC URL in the correct format:


Replace the placeholders as follows:

  • hive_host: the IP address or host name of the Hive on Tez node you are connecting to.
  • hive_principal@KERBEROS.REALM: the principal of the Hive node you are connecting to. To get the list of all Hive principals, navigate to Ambari > Admin > Kerberos > Advanced > Hive.

The principal specified in the JDBC URL refers to the principal of the Hive node. It is unrelated to the zoomdata_principal@KERBEROS.REALM principal specified for the Zoomdata connector.

  6. Select Validate and, once your connection is valid, select Next.
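
As an illustration of how the hive_host and hive_principal@KERBEROS.REALM placeholders fit into a URL, the sketch below assembles one from its parts. It assumes the general HiveServer2 JDBC form for Kerberos connections, jdbc:hive2://<host>:<port>/<database>;principal=<principal>. The helper name build_hive_jdbc_url, port 10000, and the database name default are assumptions, so confirm the exact format against the documentation for your Zoomdata and Hive versions.

```shell
# Hypothetical helper: assemble a HiveServer2 JDBC URL that carries the
# Hive service principal. Port 10000 and database "default" are assumed
# defaults, not values taken from this article.
build_hive_jdbc_url() {
  hive_host="$1"
  hive_principal="$2"
  hive_port="${3:-10000}"
  hive_db="${4:-default}"
  printf 'jdbc:hive2://%s:%s/%s;principal=%s\n' \
    "$hive_host" "$hive_port" "$hive_db" "$hive_principal"
}

build_hive_jdbc_url hive_host hive_principal@KERBEROS.REALM
```

With the literal placeholders as arguments, the call prints jdbc:hive2://hive_host:10000/default;principal=hive_principal@KERBEROS.REALM. Note that the principal in the URL is the Hive service principal, not the Zoomdata one.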

You can continue configuring the Hive on Tez data source as described in the Connecting to Hive on Tez article.

After you have completed the configuration, Zoomdata accesses Hive on Tez as zoomdata_principal@KERBEROS.REALM, authenticated by its keytab at /etc/zoomdata/zoomdata_principal.keytab.
