Connecting to Impala

The Cloudera Impala™ connector allows users to visualize huge volumes of data stored in their Hadoop/HDFS cluster in real time and with no ETL.


Due to a JDBC driver limitation, the Impala connector does not support SSL connections in Zoomdata versions 2.2.0 through 2.2.6. Zoomdata fixed this in the v2.2.7 service pack; see the information published for that service pack for details.
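
As a quick way to check whether an installation falls inside the affected range, the version rule above can be sketched in Python. The function names here are hypothetical and are not part of Zoomdata; the sketch only encodes the range 2.2.0 through 2.2.6 stated in this article and says nothing about versions outside it.

```python
# Minimal sketch: encodes only the version range stated in this article.
# The helper names are illustrative and not part of any Zoomdata API.

FIRST_AFFECTED = (2, 2, 0)
LAST_AFFECTED = (2, 2, 6)


def parse_version(text):
    """Turn '2.2.7' or 'v2.2.7' into a 3-part tuple of ints, e.g. (2, 2, 7)."""
    parts = tuple(int(part) for part in text.strip().lstrip("vV").split("."))
    # Pad short versions such as '2.2' with zeros so comparisons stay consistent.
    return parts + (0,) * (3 - len(parts))


def impala_ssl_affected(zoomdata_version):
    """Return True if the version is in the range where Impala over SSL is unsupported."""
    return FIRST_AFFECTED <= parse_version(zoomdata_version) <= LAST_AFFECTED
```

For example, `impala_ssl_affected("2.2.6")` returns `True`, while `impala_ssl_affected("2.2.7")` returns `False`.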

Connection Instructions

The Zoomdata Server supports Impala v1.2.4 and later versions. Perform the following steps to configure the connector:

  1. Log into Zoomdata.
    Administrators and users with appropriate access privileges can connect data sources in Zoomdata.
  2. Click the Sources menu item.

Figure 1

  3. Click the Cloudera Impala connector icon.
  4. Specify the name of your source and add a description (if desired). Click Next.

Figure 2

  5. On the Connection page, define your connection source. You can select an existing connection, if available, or create a new one.

    To create a new connection, select the Input new credentials option button, specify the connection name, and enter a JDBC URL. The connection name and JDBC URL fields are required.

    Specify the JDBC URL in the following format: jdbc:hive2://SERVERNAME:PORT/;auth=noSasl
    You can enter more than one JDBC URL as long as they are separated by commas. The URLs are used in a round-robin fashion.

    In addition, if Impala authentication is enabled, specify the necessary username and password.
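
To make the URL format and the comma-separated list concrete, here is a minimal Python sketch. The host names are hypothetical, port 21050 is assumed (it is Impala's usual HiveServer2-protocol port, but use whatever your cluster exposes), and the rotation shown is a generic round-robin illustration rather than Zoomdata's actual implementation, which this article does not describe.

```python
from itertools import cycle


def build_jdbc_url(server, port):
    """Fill SERVERNAME and PORT into the URL format given in this article."""
    return f"jdbc:hive2://{server}:{port}/;auth=noSasl"


def split_jdbc_urls(field_value):
    """Split the comma-separated JDBC URL field into a clean list of URLs."""
    return [url.strip() for url in field_value.split(",") if url.strip()]


# Hypothetical hosts; replace with the Impala daemons in your own cluster.
hosts = ["impala-1.example.com", "impala-2.example.com"]

# The value you would paste into the JDBC URL field.
field_value = ", ".join(build_jdbc_url(host, 21050) for host in hosts)

# Generic round-robin: each request takes the next URL, wrapping around at the end.
urls = split_jdbc_urls(field_value)
picker = cycle(urls)
first_three = [next(picker) for _ in range(3)]
```

With two URLs, three consecutive picks use the first URL, then the second, then the first again.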

Figure 3

  6. Click Validate. After successful validation, the values will be saved. Click Next.

  7. On the Tables page, select the schema and the data collection to use for your charts. The selected fields will be displayed in the Preview section. If you don't want to use specific fields, clear their checkboxes in the Fields section.

Figure 4

  8. If you want to add a custom SQL query to retrieve the data, click Custom SQL and add your query.
    Click Next.

Zoomdata wraps your SQL query into a SELECT statement. If specific statements inside the wrapped query are not supported by your data source, the query will not be executed.
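
The effect of this wrapping can be illustrated with a short Python sketch. The exact SELECT statement that Zoomdata generates is not shown in this article, so the wrapper text, the `custom_query` alias, and the function name below are assumptions made purely for illustration.

```python
def wrap_custom_sql(custom_sql, alias="custom_query"):
    """Wrap a custom query as a subquery of an outer SELECT (illustrative only)."""
    # A trailing semicolon would be a syntax error inside the parentheses.
    inner = custom_sql.strip().rstrip(";").strip()
    if not inner:
        raise ValueError("custom SQL is empty")
    return f"SELECT * FROM ({inner}) AS {alias}"


# Hypothetical table and columns.
wrapped = wrap_custom_sql("SELECT id, amount FROM sales WHERE amount > 100;")
```

Here `wrapped` is `SELECT * FROM (SELECT id, amount FROM sales WHERE amount > 100) AS custom_query`. A statement that is only valid at the top level, such as `SHOW TABLES`, stops being valid SQL once it is placed inside the parentheses, which is the kind of case the note above warns about.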
  9. By default, SparkIt is disabled. You can enable it if required. Click Next.

  10. On the Fields page, you can create unique label names for the available fields in your data source. These labels will be displayed in the charts.

    Specify unique label names, as needed, for each Label field.

Figure 5

  11. If necessary, change the Type, Partition, and Default options, and select the Distinct Count checkbox to enable that option. For the Filter Display column, you may configure a Custom Range, if available. Click Next.

  12. On the Refresh page, you can schedule asynchronous jobs to refresh fields in your data source. Refer to the Using the Zoomdata Scheduler article for more information.

Figure 6

  13. On the Charts page, you can enable the charts that will be available for the data source and edit the settings for your charts. That is, select the styles that will be available for the data source, change the global default settings, and more. Learn more about how to customize a chart.

  14. Click Finish to save your changes.

Figure 7