Managing Impala Connectors

The Cloudera Impala™ connector allows users to visualize huge volumes of data stored in their Hadoop/HDFS cluster in real time and with no ETL.

Zoomdata supports Impala versions 2.0.0 through 2.6.0.
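As a quick illustration of the supported range, the check below compares a version string against 2.0.0 through 2.6.0. It is a minimal sketch, not part of Zoomdata: the helper names `parse_version` and `is_supported` are hypothetical, and treating both ends of the range as inclusive is an assumption.

```python
from typing import Tuple

# Range taken from the statement above; inclusive bounds are an assumption.
MIN_SUPPORTED = (2, 0, 0)
MAX_SUPPORTED = (2, 6, 0)


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '2.5.0' (or '2.5.0-cdh5.7.0') into a tuple of integers."""
    core = text.strip().split("-")[0]  # drop any distribution suffix
    return tuple(int(part) for part in core.split("."))


def is_supported(text: str) -> bool:
    """Return True when the version falls inside the supported range."""
    return MIN_SUPPORTED <= parse_version(text) <= MAX_SUPPORTED
```

For example, `is_supported("2.5.0")` is true, while `is_supported("2.7.0")` is false.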

What does Impala support?

The table below lists the features that Zoomdata supports for Cloudera Impala.

Supports Distinct Count? Yes
Supports Live Mode/Playback? Yes
Supports Group-by Time? Yes
Supports Multi Group-by Charts? Yes
Supports Histogram? Yes
Supports Box Plot? Yes
Custom SQL Capable? Yes
Supports Last Value? Yes
Supports Partition? Yes
Table Format? Tables

Impala Authentication

Zoomdata can pass along the credentials of users who have access privileges to the Impala source. With delegation, Impala queries are issued with the privileges of a specified user. Delegation is configured on the Connection page through the 'Do As User' field.

Managing your Impala connectors

When setting up an Impala connection, complete the following steps.

  1. Specify the JDBC URL. You can connect to your Impala data source using either simple user credentials authentication or Kerberos authentication with optional SSL encryption. Refer to Connecting to Impala on Kerberized CDH or Connecting to Impala with TLS (SSL) for more details on the configuration.
    Zoomdata enables you to connect either to a single Impala node or to multiple nodes within a cluster.
    To connect to a single Impala node, specify a JDBC URL in the following format:
    jdbc:hive2://<impala_host>:<port>/;auth=noSasl
    To connect to multiple Impala nodes, enter the required JDBC URLs, separated by commas, in the corresponding field. The URLs are used in a round-robin fashion. The connection remains valid as long as at least one node is available; if none of the nodes can be reached, the connection is not validated.
  2. If Impala authentication has been set up, provide the 'User Name' and 'Password'.
  3. To use Impala delegation, select a user from the 'Do As User' drop-down list (the list is set up by the Zoomdata Administrator).
    This field allows Zoomdata to pass along the credentials of the specified user, who has access rights to Impala.
  4. Select Validate.
    If successfully validated, the connection is saved.
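The URL format and the round-robin behavior described in step 1 can be sketched in Python. This is an illustration only, not Zoomdata's implementation: the function names `build_impala_url`, `parse_url_field`, and `pick_node`, the host names, and the port value 21050 are all assumptions made for the example, and the `is_reachable` callback stands in for whatever reachability check is actually performed.

```python
from itertools import cycle
from typing import Callable, Iterator, List, Optional


def build_impala_url(host: str, port: int) -> str:
    """Return a single-node JDBC URL in the format shown in step 1."""
    return f"jdbc:hive2://{host}:{port}/;auth=noSasl"


def parse_url_field(field: str) -> List[str]:
    """Split a comma-separated list of JDBC URLs, dropping blank entries."""
    return [url.strip() for url in field.split(",") if url.strip()]


def pick_node(urls: Iterator[str], count: int,
              is_reachable: Callable[[str], bool]) -> Optional[str]:
    """Advance the round-robin iterator until a reachable URL is found.

    Each URL is tried at most once per call; None means no node responded.
    """
    for _ in range(count):
        url = next(urls)
        if is_reachable(url):
            return url
    return None


# Hypothetical hosts and port, used only to show the shape of the field value.
field = ", ".join(
    build_impala_url(host, 21050) for host in ("impala-1", "impala-2", "impala-3")
)
nodes = parse_url_field(field)
rotation = cycle(nodes)  # endless round-robin over the configured URLs
```

Calling `pick_node(rotation, len(nodes), is_reachable)` repeatedly walks through the nodes in order and skips any that are down, which mirrors the rule that the connection is usable while at least one node is available.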

Setting up tables for Impala

For the Partition column, time-based fields may be configured for partitioning. The following options are available:

  • No - No partitioning is applied.
  • Date - This option is available for the Time field type. If you select this option, the list of the partitioned columns is displayed in the Configure column.
  • Function - If you select this option, the list of the partitioned columns and the supported MURMUR3_HASH function are displayed in the Configure column.

For the Configure column, numeric and time-based fields can be edited:

  • Numeric types including Money, Number and Integer - ability to select a default aggregation function
  • Time fields - ability to define the default time pattern and granularity; if the time field provides granularities of hour, minute and second, then a time zone label may be applied

For the Distinct Count column, select the checkbox for each field that requires a distinct count. For more information, see Enabling Distinct Counts on Cloudera Impala.

Chart Settings for Impala

On the Charts page, you can:

  1. Edit Global Default Setting
  2. Select the Standard and, if available, Custom chart styles to be used with the data source
  3. Set default parameters (group, sub-group, colors, sorting, and so on) for each chart style

Select Finish to save your changes. Once your data connection has been established, it is listed under My Data Sources.
