
Connecting to Impala on Kerberized CDH Cluster

OVERVIEW

A secure CDH cluster uses Kerberos authentication to validate access requests. You can set up Zoomdata to connect to a secure CDH cluster using the instructions below, which have been tested on CDH v5.7.1 and Impala v2.5.0.

Preparing the Impala Cluster

To enable Kerberos for a CDH distribution using Cloudera Manager, see Cloudera's documentation, Configuring Authentication in Cloudera Manager.

Kerberos authentication requires closely synchronized clocks on all instances to work properly. Enable the Network Time Protocol (NTP) service in your network. For more information, see the article Using the Network Time Protocol to Synchronize Time.
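As a rough illustration of the time requirement, the sketch below compares two epoch timestamps against a maximum allowed difference. The function name within_kerberos_skew is hypothetical, and the default of 300 seconds assumes the MIT Kerberos default clockskew setting; your realm may be configured differently.

```shell
# Hypothetical helper: succeed if two epoch timestamps differ by no more
# than the allowed skew (default 300 seconds, assuming the MIT Kerberos
# default clockskew; adjust if your realm uses another value).
within_kerberos_skew() {
  # $1 = local epoch seconds, $2 = remote epoch seconds, $3 = optional max skew
  diff=$(( $1 - $2 ))
  if [ "$diff" -lt 0 ]; then
    diff=$(( -diff ))
  fi
  [ "$diff" -le "${3:-300}" ]
}

# Example (assumes SSH access to the Impala node, which this article does not require):
#   within_kerberos_skew "$(date +%s)" "$(ssh impala_host date +%s)" && echo "clocks OK"
```

A check like this only detects drift; keeping the clocks aligned is still the job of the NTP service.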

Configuring the Zoomdata Services

Zoomdata v2.5 provides two Impala connectors: the Internal Impala connector and the Custom Impala connector.

This section describes the configuration of both connectors.

Obtaining Kerberos Credentials

In Kerberos, each service must have its own unique identifier, called a principal. Whether you use the Internal Impala connector or the Custom one, you must do the following:

  1. Install the Kerberos client on the CentOS or Ubuntu machine on which the Zoomdata Server or Custom connector is installed.
  2. Generate a Kerberos principal and the corresponding keytab for the Zoomdata service.
    Before you proceed, make sure that:

      • Zoomdata or the Custom connector is running on a node with a proper Kerberos configuration: /etc/krb5.conf or a similar location for your Linux distribution.
      • The Kerberos realm in your environment is the same as the realm specified in the kdc.conf file on the Impala server.

  3. Check the Kerberos configuration (that is, krb5.conf) and the validity of the principal and keytab pair using the MIT Kerberos client:

kinit -V -k -t zoomdata_principal.keytab zoomdata_principal@KERBEROS.REALM

  4. Make the keytab accessible to the Zoomdata Server or Custom connector:

sudo mkdir -p /etc/zoomdata
sudo mv zoomdata_principal.keytab /etc/zoomdata
sudo chown zoomdata:zoomdata /etc/zoomdata/zoomdata_principal.keytab
sudo chmod 600 /etc/zoomdata/zoomdata_principal.keytab
Replace the placeholders with proper credentials.
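
To confirm that the ownership and permission steps took effect, a small check along the following lines may help. This is a sketch only: the function name keytab_access_ok is hypothetical, and it assumes GNU stat, as shipped with CentOS and Ubuntu.

```shell
# Hypothetical helper: succeed only if the keytab exists, has mode 600,
# and is owned by the expected user. Assumes GNU coreutils stat.
keytab_access_ok() {
  # $1 = keytab path, $2 = expected owner (defaults to zoomdata)
  [ -f "$1" ] || { echo "missing keytab: $1" >&2; return 1; }
  actual=$(stat -c '%a %U' "$1") || return 1
  expected="600 ${2:-zoomdata}"
  [ "$actual" = "$expected" ] || {
    echo "$1 is '$actual', expected '$expected'" >&2
    return 1
  }
}

# Example, using the placeholder file name from this article:
#   keytab_access_ok /etc/zoomdata/zoomdata_principal.keytab
```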

Configuring Zoomdata Server for Internal Impala Connector

  1. Create or update the Zoomdata files /etc/zoomdata/zoomdata.env and /etc/zoomdata/spark-proxy.env so that both contain the settings below. If the files already exist, verify that these settings are present.
KERBEROS_PRINCIPAL=zoomdata_principal@KERBEROS.REALM
KERBEROS_KEYTAB=/etc/zoomdata/zoomdata_principal.keytab
KERBEROS_CONFIG=/etc/krb5.conf
The krb5.conf file is auto-generated in your Kerberos environment.

  2. Restart the Zoomdata Server:

sudo systemctl restart zoomdata
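
The verification in step 1 can be scripted. The sketch below prints each of the three Kerberos settings that is missing from an env file and succeeds only when all are present. The function name missing_kerberos_keys is hypothetical, and the check only confirms that the keys exist, not that their values are correct.

```shell
# Hypothetical helper: print every Kerberos setting that is absent from
# the given env file; return 0 only if all three are present.
missing_kerberos_keys() {
  # $1 = path to an env file such as /etc/zoomdata/zoomdata.env
  rc=0
  for key in KERBEROS_PRINCIPAL KERBEROS_KEYTAB KERBEROS_CONFIG; do
    if ! grep -q "^${key}=" "$1" 2>/dev/null; then
      echo "$key"
      rc=1
    fi
  done
  return "$rc"
}

# Example: check both files named in step 1.
#   for f in /etc/zoomdata/zoomdata.env /etc/zoomdata/spark-proxy.env; do
#     missing_kerberos_keys "$f" || echo "settings missing in $f"
#   done
```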

Configuring Custom Impala Connector

  1. Create or update the file /etc/zoomdata/edc-impala.properties. If the file already exists, verify that it contains the settings below:
kerberos.krb5.conf.location=/etc/krb5.conf
kerberos.service.account.authentication=true
kerberos.service.account.principal=zoomdata_principal@KERBEROS.REALM
kerberos.service.account.keytab.location=/etc/zoomdata/zoomdata_principal.keytab

  2. Restart the Custom Impala connector:

sudo systemctl restart zoomdata-edc-impala
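
One way to handle the "create or update" part of step 1 without duplicating entries is an idempotent setter. The sketch below is an assumption-laden illustration, not a Zoomdata tool: set_property is a hypothetical name, and the helper moves an updated key to the end of the file, which is harmless for a flat key=value file.

```shell
# Hypothetical helper: set key=value in a properties file, replacing any
# existing line for that key or appending one if the key is absent.
set_property() {
  # $1 = file, $2 = key, $3 = value
  touch "$1" || return 1
  tmp=$(mktemp) || return 1
  # Keep every line that does not start with "key=" (fixed-string match,
  # so dots in the key are not treated as wildcards).
  awk -v k="$2=" 'index($0, k) != 1' "$1" > "$tmp"
  printf '%s=%s\n' "$2" "$3" >> "$tmp"
  # Write back through the original file to preserve its owner and mode.
  cat "$tmp" > "$1" && rm -f "$tmp"
}

# Example (run with sufficient privileges for /etc/zoomdata):
#   set_property /etc/zoomdata/edc-impala.properties kerberos.krb5.conf.location /etc/krb5.conf
```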

Connecting to Kerberized Impala

You are now ready to create the Cloudera Impala source:

  1. Open a new browser window and log into Zoomdata.
  2. Click the Sources menu item.

    Figure 1
  3. Click the Cloudera Impala connector icon.
  4. Specify the name of your source and add a description (if desired). Select Next.

    Figure 2
  5. On the Connection page, define the connection source. You can use an existing connection, if available, or create a new one. To create a new connection, select the Input New Credentials option button and specify the connection name and JDBC URL. Make sure that you enter the JDBC URL in the correct format:
jdbc:hive2://impala_host:21050/;principal=[email protected]

Replace the placeholders as follows:

  • impala_host: the IP address or host name of the Impala node you are connecting to.
  • [email protected]: the principal of the Impala node you are connecting to. To list all Impala principals, navigate to Cloudera Manager > Administration > Security > Kerberos Credentials.


Figure 3

The principal parameter in the JDBC URL refers to the principal of the Impala node. This principal ([email protected]) is unrelated to the zoomdata_principal@KERBEROS.REALM principal specified for the Zoomdata Server or Custom connector.
  6. Click Validate. After successful validation, the values are saved. Select Next.
If you run into connection issues, verify that the Zoomdata Server restarted successfully. See the troubleshooting article Verifying that the Zoomdata Server Restarts Properly for assistance.
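
Because stray spaces or a wrong port in the JDBC URL are easy to miss, it can help to assemble the URL from its parts. The sketch below follows the format given in step 5. The function name impala_jdbc_url is hypothetical, and the host and principal in the example are made-up values that assume the common service/host@REALM form; use the values shown in Cloudera Manager for your cluster.

```shell
# Hypothetical helper: build the JDBC URL in the format shown in step 5.
impala_jdbc_url() {
  # $1 = Impala host, $2 = Impala service principal, $3 = optional port
  printf 'jdbc:hive2://%s:%s/;principal=%s\n' "$1" "${3:-21050}" "$2"
}

# Example with made-up values:
#   impala_jdbc_url impala01.example.com impala/impala01.example.com@EXAMPLE.COM
```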

You can continue configuring the Impala data source as described in the Connecting to Impala article.

After you have completed the configuration, Zoomdata will begin accessing Impala using zoomdata_principal@KERBEROS.REALM, authenticated by its keytab in /etc/zoomdata/zoomdata_principal.keytab.

Using TLS Encryption along with Kerberos Authentication

Refer to the Using TLS encryption along with Kerberos Authentication section for more details.