Managing Impala Connectors
The Cloudera Impala™ connector allows users to visualize huge volumes of data stored in their Hadoop/HDFS cluster in real time and with no ETL.
Zoomdata supports Impala versions 2.5.0 - 2.11.0.
What does Impala support?
The table below lists the Zoomdata features that are supported for Cloudera Impala.
|Supports Distinct Count?|Yes|
|Supports Group-by Time?|Yes|
|Supports Multi Group-by Charts?|Yes|
|Supports Box Plot?|Yes|
|Supports Derived Fields?|Yes|
|Custom SQL Capable?|Yes|
|Live Mode & Playback|Yes|
|Supports Last Value?|Yes|
- Impala versions prior to 2.5.0 have a known issue with the unix_timestamp() function. The issue affects the Zoomdata TEXT_TO_TIME row-level function and can cause it to return incorrect results. Cloudera fixed the issue in Impala 2.5.0 and later. For details, see the issue on the Impala project web site.
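The version range and the known issue described above can be checked with a small helper. This is a minimal sketch, not part of Zoomdata: the function names are hypothetical, and it assumes the Impala version is available as a plain dotted string such as "2.10.0" (vendor suffixes would need to be stripped first). Versions are compared as tuples of integers because comparing the strings directly would rank "2.10.0" below "2.5.0".

```python
def parse_version(text):
    """Turn a dotted version string such as '2.5.0' into (2, 5, 0)."""
    return tuple(int(part) for part in text.strip().split("."))


# Bounds taken from this page: supported range is 2.5.0 - 2.11.0.
MIN_SUPPORTED = parse_version("2.5.0")
MAX_SUPPORTED = parse_version("2.11.0")


def is_supported(version):
    """True if the version falls inside the supported range (inclusive)."""
    return MIN_SUPPORTED <= parse_version(version) <= MAX_SUPPORTED


def has_unix_timestamp_issue(version):
    """True for versions older than 2.5.0, where TEXT_TO_TIME may be wrong."""
    return parse_version(version) < MIN_SUPPORTED
```

A check like this could run before TEXT_TO_TIME is used against a cluster whose Impala version is uncertain.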
Zoomdata can pass along credentials for users who have access privileges to the Impala source. Delegation allows Impala queries to be issued with the privileges of a specified user. This option is available on the Connection page as the 'Do As User' field.
Managing your Impala connectors
When setting up an Impala connection, you need to provide the following.
Specify the JDBC URL. You can connect to your Impala data source using either simple user credentials authentication or Kerberos authentication with optional SSL encryption. For more details on the configuration, refer to Connecting to Impala on Kerberized CDH and Connecting to Impala with TLS (SSL).
Zoomdata enables you to connect either to a single Impala node or to multiple nodes within a cluster.
To connect to a single Impala node, specify a JDBC URL in the following format:
To connect to multiple Impala nodes, specify the required JDBC URLs, separated by commas, in the corresponding field. The URLs are used in a round-robin fashion. Such a connection remains valid as long as at least one node is available. If none of the nodes can be reached, the connection is not validated.
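The round-robin behavior described above can be sketched as follows. This illustrates the idea only and is not Zoomdata's implementation: the class name, the `is_reachable` callback, and the URLs used with it are assumptions. In particular, any URL passed in is just a string to this code, so the sketch says nothing about the JDBC URL format Zoomdata expects.

```python
def split_urls(field_value):
    """Split the comma-separated field into a list of trimmed URLs."""
    return [url.strip() for url in field_value.split(",") if url.strip()]


class RoundRobinUrls:
    """Hands out URLs in rotation, skipping nodes that cannot be reached."""

    def __init__(self, field_value, is_reachable):
        self.urls = split_urls(field_value)
        if not self.urls:
            raise ValueError("at least one JDBC URL is required")
        # is_reachable is a hypothetical callback: url -> bool.
        self._is_reachable = is_reachable
        self._next = 0

    def next_url(self):
        """Return the next reachable URL, or raise if every node is down."""
        # Try each URL at most once per call, starting where we left off.
        for _ in range(len(self.urls)):
            url = self.urls[self._next]
            self._next = (self._next + 1) % len(self.urls)
            if self._is_reachable(url):
                return url
        raise ConnectionError("no Impala node could be reached")
```

With two URLs and one node down, every call to `next_url()` returns the remaining node; when all nodes are down the call raises, which mirrors the case where the connection cannot be validated.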
- If Impala authentication has been set up, provide the 'User Name' and 'Password'.
To allow Impala delegation, select a user from the 'Do As User' drop-down list (the list is set up by the Zoomdata Administrator).
This field allows Zoomdata to pass along credentials for the specified user with access rights to Impala.
If successfully validated, the connection is saved.
For the Partition column, time-based fields may be configured for partitioning. The following options are available:
- No - no partitioning is done.
- Date - this option is available for the Time field type. If you select this option, the list of partitioned columns is displayed in the Configure column.
- Function - if you select this option, the list of partitioned columns and the supported MURMUR3_HASH function are displayed in the Configure column.
For the Configure column, numeric and time-based fields can be edited:
- Numeric types including Money, Number and Integer - ability to select a default aggregation function
- Time fields - ability to define the default time pattern and granularity; if the time field provides granularities of hour, minute and second, then a time zone label may be applied
For the Distinct Count column, select the checkbox for each field that requires Distinct Count. For more information, see Enabling Distinct Counts on Cloudera Impala.
On the Charts page, you can:
- Edit Global Default Setting
- Select the Standard and, if available, Custom chart styles to be used with the data source
- Set default parameters (group, sub-group, colors, sorting, and so on) for each chart style
Select Finish to save your changes. Once your data connection has been established, it is listed under My Data Sources.