Connecting to Impala on Kerberized CDH Cluster

OVERVIEW

A secure CDH cluster uses Kerberos authentication to validate and confirm access requests. You can set up Zoomdata to connect to a secure CDH cluster using the instructions below, which have been tested on CDH 5.5.0 and Impala v2.3.0.

PREPARATION

Make sure that you have installed Impala correctly.

ESTABLISHING THE CONNECTION

  1. Install the Kerberos client on CentOS or Ubuntu.
If you plan to use Impala in your cluster, you must configure your KDC to allow tickets to be renewed, and you must configure krb5.conf to request renewable tickets. Typically, you can do this by adding the max_renewable_life setting to your realm in kdc.conf and by adding the renew_lifetime parameter to the libdefaults section of krb5.conf. For more information about renewable tickets, see the Kerberos documentation.
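
As a minimal sketch of the two settings named above, assuming a realm called KERBEROS.REALM and a seven-day renewable lifetime (both the realm name and the 7d value are illustrative assumptions, not values recommended by this article), the relevant fragments might look like this:

```ini
# krb5.conf on the client node (illustrative value)
[libdefaults]
    renew_lifetime = 7d

# kdc.conf on the KDC host (illustrative value)
[realms]
    KERBEROS.REALM = {
        max_renewable_life = 7d
    }
```

Changes to kdc.conf normally take effect only after the KDC is restarted; check the Kerberos documentation for your distribution before applying them.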

Currently, you cannot use the resource management feature in CDH 5 on a cluster that has Kerberos authentication enabled.

Before you proceed, make sure that:

  • Zoomdata is running on a node with a proper Kerberos configuration (/etc/krb5.conf, or a similar location for your Linux distribution).
    If you have multiple realms configured in your krb5.conf, you must either configure cross-realm referrals or add an entry to the domain_realm section of the krb5.conf file.
  • Your environment has the credentials needed to access the Hadoop cluster in the Kerberos realm: a keytab (zoomdata_principal.keytab)
    for a Kerberos principal (zoomdata_principal@KERBEROS.REALM).
  • The Kerberos realm in your environment is the same as the realm specified for Impala's principal.
* Replace the placeholders with proper parameters.
** A Kerberos principal consists of the following components: zoomdata_principal@KERBEROS.REALM
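
For the multi-realm case mentioned in the first item, a domain_realm entry maps host names to a realm. The fragment below is only a sketch: it assumes the cluster hosts live under a hypothetical DNS domain, example.com, and that they belong to KERBEROS.REALM.

```ini
# krb5.conf (example.com is a hypothetical domain used for illustration)
[domain_realm]
    .example.com = KERBEROS.REALM
    example.com = KERBEROS.REALM
```

The entry with the leading dot covers hosts inside the domain; the entry without it covers a host named exactly example.com.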
  2. Configure your Zoomdata Server. Copy the keytab so that it is accessible by the Zoomdata Server:

    sudo mkdir -p /etc/zoomdata
    sudo mv zoomdata_principal.keytab /etc/zoomdata
    sudo chown zoomdata:zoomdata /etc/zoomdata/zoomdata_principal.keytab
    sudo chmod 600 /etc/zoomdata/zoomdata_principal.keytab
  3. Optional: Run the following command to verify that the provided credentials and Kerberos configuration are correct:

    kinit -k -t /etc/zoomdata/zoomdata_principal.keytab zoomdata_principal@KERBEROS.REALM
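
A successful kinit prints nothing, so it can help to confirm that a ticket actually reached the credential cache. The following is a minimal sketch, assuming the MIT Kerberos client tools (kinit and klist) are installed; kerberos_login_ok is a hypothetical helper name, not part of Zoomdata.

```shell
# Hypothetical helper: obtain a ticket from a keytab, then confirm
# that the credential cache holds a valid (unexpired) ticket.
kerberos_login_ok() {
    principal="$1"
    keytab="$2"
    # -k -t: authenticate with the keytab instead of prompting for a password.
    kinit -k -t "$keytab" "$principal" || return 1
    # klist -s prints nothing and exits non-zero if the cache is
    # unreadable or the ticket has expired.
    klist -s
}

# Example call (run as a user that can read the keytab):
#   kerberos_login_ok zoomdata_principal@KERBEROS.REALM \
#       /etc/zoomdata/zoomdata_principal.keytab && klist
```

Running klist without options afterwards lists the ticket and its expiry and renewal times.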
  4. Create or update the Zoomdata file named /etc/zoomdata/zoomdata.env. If you need to create this file, add the parameters below. If the file already exists, verify that it contains them.

    KERBEROS_PRINCIPAL=zoomdata_principal@KERBEROS.REALM
    KERBEROS_KEYTAB=/etc/zoomdata/zoomdata_principal.keytab
    KERBEROS_CONFIG=/etc/krb5.conf

    The krb5.conf file is auto-generated in your Kerberos environment.
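
When the file already exists, a quick check for the three settings can save a failed restart. This is a sketch only; missing_kerberos_keys is a hypothetical helper, and it checks only that each key is present with a non-empty value, not that the value is correct.

```shell
# Hypothetical helper: print the name of each Kerberos setting that is
# missing (or empty) in the given env file. Prints nothing if all are set.
missing_kerberos_keys() {
    env_file="$1"
    for key in KERBEROS_PRINCIPAL KERBEROS_KEYTAB KERBEROS_CONFIG; do
        # Match lines of the form KEY=value with at least one character of value.
        if ! grep -q "^${key}=." "$env_file"; then
            echo "$key"
        fi
    done
}

# Example call:
#   missing_kerberos_keys /etc/zoomdata/zoomdata.env
```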
  5. Restart the Zoomdata Server.

    sudo service zoomdata restart

    You are now ready to create the Cloudera Impala source.
  6. Log in to Zoomdata.
    Administrators and users with appropriate access privileges can connect data sources in Zoomdata.
  7. Click the Sources menu item.

Figure 1

  8. Click the Cloudera Impala connector icon.
  9. Specify the name of your source and add a description (if desired). Click Next.

Figure 2

  10. On the Connection page, define the connection source. You can use an existing connection, if available, or create a new one. To create a new connection, select the Input New Credentials option button and specify the connection name and JDBC URL. Make sure that you enter the JDBC URL in the correct format:

    jdbc:hive2://impala_host_ip_address:21050/;principal=impala_principal@KERBEROS.REALM

Impala's principal is listed in your Cloudera Manager account. To copy the principal, log in to your account and click Administration > Security > Kerberos Credentials.

Replace the placeholders as follows:

  • impala_host_ip_address - enter the IP address of your Impala service
  • impala_principal@KERBEROS.REALM - this principal consists of the following components:
    • impala_principal - enter impala/fully.qualified.domain.name
    • KERBEROS.REALM - enter the same realm that you specified for zoomdata_principal.
The 'principal' parameter in the JDBC URL refers to the service name and must be set to Impala's principal. It is not related to the zoomdata_principal specified in the zoomdata.env file.
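
To make the substitution concrete, the sketch below assembles the URL from its parts. The helper name impala_jdbc_url and the sample host values (10.0.0.12, impala01.example.com) are hypothetical and used only for illustration; port 21050 and the URL layout are taken from the format shown above.

```shell
# Hypothetical helper: build the Kerberos-enabled JDBC URL from its parts.
impala_jdbc_url() {
    host="$1"    # IP address of the Impala service
    fqdn="$2"    # fully qualified domain name used in Impala's principal
    realm="$3"   # Kerberos realm (same realm as zoomdata_principal)
    printf 'jdbc:hive2://%s:21050/;principal=impala/%s@%s\n' "$host" "$fqdn" "$realm"
}

# Sample values are placeholders; substitute your own.
impala_jdbc_url 10.0.0.12 impala01.example.com KERBEROS.REALM
```

The result can be pasted into the JDBC URL field on the Connection page.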

Leave the User Name and Password fields empty unless Impala authentication is enabled. The table name, timestamp column, and other fields are the same as for a non-secure connection.

Figure 3

  11. Click Validate. After successful validation, the values are saved. Click Next.
    If you run into connection issues, verify that the Zoomdata Server restarted successfully. For assistance, see the troubleshooting article Verifying that the Zoomdata Server Restarts Properly.

You can continue configuring the Impala data source as described in the Connecting to Impala article, starting at Step 10.

After you have completed the configuration, Zoomdata will begin accessing Impala as zoomdata_principal@KERBEROS.REALM, authenticated by its keytab in /etc/zoomdata/zoomdata_principal.keytab.

The Impala principal in the JDBC URL is a different principal and must match the one under which Impala is running.