
Connecting to Impala on Kerberized CDH Cluster

OVERVIEW

A secure CDH cluster uses Kerberos authentication to validate and confirm access requests. You can set up Zoomdata to connect to a secure CDH cluster by following the instructions below, which have been tested on CDH v5.7.1 and Impala v2.5.0.

Preparing the Impala cluster

To enable Kerberos for a CDH distribution using Cloudera Manager, see Configuring Authentication in Cloudera Manager.

Kerberos authentication requires closely synchronized clocks on all instances to work properly. Enable the Network Time Protocol (NTP) service in your network. For more information, see the article Using the Network Time Protocol to Synchronize Time.
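
To illustrate why synchronized clocks matter, here is a minimal sketch that compares two clock readings against a tolerance. The 300-second limit is the MIT Kerberos default for clockskew and is an assumption here; your realm may be configured differently. The function name skew_ok and the host name in the usage comment are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report whether two epoch timestamps are close enough
# for Kerberos. 300 s is the MIT Kerberos default clockskew (assumed here).
skew_ok() {
  local local_epoch="$1"
  local remote_epoch="$2"
  local max_skew="${3:-300}"
  local diff=$(( local_epoch - remote_epoch ))
  # Take the absolute value of the difference.
  if [ "$diff" -lt 0 ]; then
    diff=$(( -diff ))
  fi
  if [ "$diff" -le "$max_skew" ]; then
    echo "ok"
  else
    echo "skew too large: ${diff}s"
    return 1
  fi
}

# Example (impala_host is a placeholder): compare this machine with a cluster node.
# skew_ok "$(date +%s)" "$(ssh impala_host date +%s)"
```

If the check reports a large skew, fix time synchronization before troubleshooting anything else, because ticket requests are likely to be rejected until the clocks agree.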

Configuring the Zoomdata services

In the Kerberos world, each service must have its own unique identifier, called a principal. To connect to Impala:

  1. Install the Kerberos client on the CentOS or Ubuntu machine on which Zoomdata is installed.
  2. Generate a Kerberos principal and a corresponding keytab for the Zoomdata service.
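
The two steps above could look roughly like the following sketch, assuming an MIT Kerberos KDC that you administer with kadmin. It only prints the commands so that you can review them before running anything. The function names, the principalname@EXAMPLE.COM principal, and the keytab file name are placeholders, and kadmin will normally also need your own admin credentials.

```shell
#!/usr/bin/env bash
# Sketch only: prints the commands instead of running them.
# Principal, realm, and keytab name below are placeholders.

# Package that provides the MIT Kerberos client tools (kinit, klist, kadmin).
client_install_cmd() {
  case "$1" in
    centos) echo "sudo yum install -y krb5-workstation" ;;
    ubuntu) echo "sudo apt-get install -y krb5-user" ;;
    *) echo "unsupported distribution: $1" >&2; return 1 ;;
  esac
}

# kadmin commands that create a principal with a random key
# and export its key to a keytab file.
keytab_cmds() {
  local principal="$1"
  local keytab="$2"
  echo "kadmin -q 'addprinc -randkey ${principal}'"
  echo "kadmin -q 'ktadd -k ${keytab} ${principal}'"
}

client_install_cmd centos
keytab_cmds "principalname@EXAMPLE.COM" "principalname.keytab"
```

If your KDC is managed differently (for example, Active Directory), the principal and keytab are created with other tools, and only the client installation part applies.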

Before you proceed, make sure that:

  • Zoomdata is running on a node with a proper Kerberos configuration file (/etc/krb5.conf, or a similar location for your Linux distribution).
  • Your environment has credentials for the Kerberos realm, in the form of a keytab for a Kerberos principal that can access the Hadoop cluster.
  • The Kerberos realm in your environment is the same as the realm specified in the kdc.conf file on the Impala server.
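
One way to check these prerequisites from the command line is sketched below. It assumes the realm in use is the default_realm entry of your krb5.conf and that the MIT Kerberos client tools klist and kinit are installed. The function names are hypothetical, and the paths in the usage comments are placeholders.

```shell
#!/usr/bin/env bash
# Print the default_realm value from a krb5.conf-style file,
# so you can compare it with the realm used on the Impala server.
default_realm() {
  awk -F= '/^[ \t]*default_realm[ \t]*=/ {
    gsub(/[ \t]/, "", $2)   # strip whitespace around the value
    print $2
    exit
  }' "$1"
}

# List the entries in a keytab, then try to obtain a ticket with it.
# Arguments: keytab path, principal name.
check_keytab() {
  klist -kt "$1" && kinit -kt "$1" "$2" && klist
}

# Usage (placeholders):
# default_realm /etc/krb5.conf
# check_keytab principalname.keytab principalname@EXAMPLE.COM
```

If kinit succeeds with the keytab, the principal and keytab are usable from this node, which rules out the most common causes of later connection failures.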
  3. Configure your Zoomdata Server. Copy the keytab so that it is accessible by the Zoomdata Server:

    sudo mkdir -p /etc/zoomdata
    sudo mv principalname.keytab /etc/zoomdata
    sudo chown zoomdata:zoomdata /etc/zoomdata/principalname.keytab
    sudo chmod 600 /etc/zoomdata/principalname.keytab

     Replace the principalname placeholder with the value for your environment.
  4. Create or update the Zoomdata files /etc/zoomdata/zoomdata.env and /etc/zoomdata/spark-proxy.env. If you need to create these files, add the parameters below. If the files already exist, verify that they contain these parameters.

    KERBEROS_PRINCIPAL=[email protected]
    KERBEROS_KEYTAB=/etc/zoomdata/principalname.keytab
    KERBEROS_CONFIG=/etc/krb5.conf

     The krb5.conf file is auto-generated in your Kerberos environment.
  5. Restart the Zoomdata Server:

    sudo service zoomdata restart

You are now ready to create the Cloudera Impala source.
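
Before moving on, you may want to confirm that the files are in the expected state. The following is a minimal sketch, not part of the Zoomdata tooling: the function names are hypothetical, and it only checks that the three parameters are present and that the keytab is readable and writable by its owner alone (mode 600).

```shell
#!/usr/bin/env bash
# Print the name of each Kerberos parameter that is missing from an env file.
missing_kerberos_keys() {
  local env_file="$1"
  local key
  for key in KERBEROS_PRINCIPAL KERBEROS_KEYTAB KERBEROS_CONFIG; do
    grep -q "^${key}=" "$env_file" || echo "$key"
  done
}

# Succeed only if the file's permission string is -rw------- (mode 600).
keytab_mode_ok() {
  [ "$(ls -l "$1" | cut -c1-10)" = "-rw-------" ]
}

# Usage (paths from the steps above; principalname is a placeholder):
# missing_kerberos_keys /etc/zoomdata/zoomdata.env
# missing_kerberos_keys /etc/zoomdata/spark-proxy.env
# keytab_mode_ok /etc/zoomdata/principalname.keytab && echo "keytab permissions ok"
```

No output from missing_kerberos_keys means that all three parameters were found in that file.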

Connecting to Kerberized Impala

  1. Open a new browser window and log into Zoomdata.
  2. Click the Sources menu item.

Figure 1

  3. Click the Cloudera Impala connector icon.
  4. Specify the name of your source and add a description (if desired). Select Next.

Figure 2

  5. On the Connection page, define the connection source. You can use an existing connection, if available, or create a new one. To create a new connection, select the Input New Credentials option button and specify the connection name and JDBC URL. Make sure that you enter the JDBC URL in the correct format:

    jdbc:hive2://impala_host_ip_address:21050/;principal=impala/your.domain.name@hadoop.com

     Replace the placeholders as follows:

  • impala_host_ip_address: enter the IP address from your Kerberos environment
  • your.domain.name: enter your specific Hadoop domain name

     The principal parameter in the JDBC URL refers to the service name, which must be set to impala. It is not related to the user name specified in the zoomdata.env file.
  6. Click Validate. After successful validation, the values are saved. Select Next.
If you run into connection issues, verify that the Zoomdata Server was restarted successfully. For assistance, see the troubleshooting article Verifying that the Zoomdata Server Restarts Properly.

You can continue configuring the Impala data source as described in the Connecting to Impala article, starting at Step 10.

After you have completed the configuration, Zoomdata begins accessing Impala as [email protected], authenticated by its keytab in /etc/zoomdata/principalname.keytab.

The Impala principal in the JDBC URL is a different principal and must match the principal under which Impala is running.
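
To avoid typing mistakes in the JDBC URL, you could assemble it from its parts. The sketch below follows the URL format given in this article; the function name impala_jdbc_url and the sample values are hypothetical. The service name impala is fixed, while the host, domain name, and realm come from your environment.

```shell
#!/usr/bin/env bash
# Build the Kerberized Impala JDBC URL in the format used in this article.
# Arguments: host IP address, domain name, realm, optional port (default 21050).
impala_jdbc_url() {
  local host="$1"
  local domain="$2"
  local realm="$3"
  local port="${4:-21050}"
  # The service part of the principal is always "impala".
  printf 'jdbc:hive2://%s:%s/;principal=impala/%s@%s\n' \
    "$host" "$port" "$domain" "$realm"
}

# Sample values only; replace them with the values for your cluster.
impala_jdbc_url 10.0.0.5 your.domain.name hadoop.com
# prints jdbc:hive2://10.0.0.5:21050/;principal=impala/your.domain.name@hadoop.com
```

The principal in this URL belongs to the Impala service, not to Zoomdata, so it does not change when you change KERBEROS_PRINCIPAL in the env files.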