Connecting to CDH-Cloudera
Zoomdata can connect to CDH (Cloudera's Distribution Including Apache Hadoop), Cloudera's open source Apache Hadoop platform. CDH provides unified batch processing, interactive SQL, interactive search, and role-based access controls. In addition, it offers enterprise-grade continuous availability. Specifically, Zoomdata connects to CDH's fault-tolerant storage system, the Hadoop Distributed File System (HDFS). Keep in mind that the connection to CDH requires Apache Spark, which is automatically enabled in the Zoomdata environment.
The table below lists information on the features that are supported by CDH-Cloudera:
|Feature|Supported?|
|Distinct Count|Yes|
|Live Mode/Playback|No|
|Group-by Time|Yes|
|Multi Group-by Charts|Yes|
|Box Plot|No|
|Custom SQL|No|
|Last Value|No|
CONFIGURING THE CONNECTION
For details about what each page of the connection process provides, review the article Source Connection Workflow. Depending on your needs, you can either follow the steps in order from start to finish or jump to a specific section in the connection process:
Log into Zoomdata.
Administrators and users with appropriate access privileges can connect data sources in Zoomdata.
Specify the name of your source and add a description (if desired).
- Click Next to continue to the next setup page.
This page defines the connection that Zoomdata uses to access the data source. Perform the steps below.
From the Remote File Settings (Spark It) list, select the number of entries to be displayed in the file preview.
- Specify the path to the remote file that you want to load into Zoomdata.
- Select the Read Headers checkbox to use the first row of your data source as the column names.
- In the corresponding field, specify the value separator used in your data source. Standard separators include commas (,) and semicolons (;).
- Click Preview. From the Entries list, select the number of entries to be displayed in the preview.
- In the Preview section, you can configure field properties. Click Next.
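To make the Read Headers, value separator, and Entries options more concrete, the following minimal Python sketch imitates what a file preview does with a delimited file. It is only an illustration and not Zoomdata's implementation: the function name `preview_rows`, its parameters, the generated `column_N` names, and the sample data are all hypothetical.

```python
import csv
import io
from itertools import islice


def preview_rows(text, separator=",", read_headers=True, entries=5):
    """Return (column_names, rows) for the first `entries` data rows.

    Hypothetical helper: it mirrors the Read Headers checkbox, the value
    separator field, and the Entries list described above.
    """
    reader = csv.reader(io.StringIO(text), delimiter=separator)
    first = next(reader, None)
    if first is None:
        # Empty file: nothing to preview.
        return [], []
    if read_headers:
        # The first row supplies the column names.
        columns = first
        rows = list(islice(reader, entries))
    else:
        # No header row: generate placeholder names and keep the first row as data.
        columns = [f"column_{i + 1}" for i in range(len(first))]
        rows = ([first] + list(islice(reader, max(entries - 1, 0))))[:entries]
    return columns, rows


# Made-up sample data that uses a semicolon as the value separator.
sample = "city;sales\nBoston;120\nDenver;95\nAustin;143\n"
columns, rows = preview_rows(sample, separator=";", entries=2)
print(columns)
print(rows)
```

If the preview shows every value crammed into a single column, the separator most likely does not match the one used in the file.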
The Fields page lets you (1) configure attribute options, (2) create custom labels for the fields in your data source (that will be displayed in the charts), (3) manage the Volume metric, and (4) work with Calculations.
- Determine whether the field should be visible or not to the user.
- Create unique label names, as needed, for each Label field.
- For the Type column, you have the option to edit the field type (although usually you won't need to do this).
- Numeric and time-based fields may also be edited:
- Numeric types including Money, Number and Integer - ability to select a default aggregation function
- Time fields - ability to define the default time pattern and granularity; if the time field provides granularities of hour, minute and second, then a time zone label may be applied
- Select fields for Distinct Counts as needed.
- Refresh the connection to a particular field, as desired.
- Configure Filter Display settings for fields.
- Edit the Volume Metric settings, as needed.
- Work with Calculations, if available and as needed.
If you are setting up a new connection, the Calculations section will not be available until after the connection is saved.
- Click Next to continue.
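The field options above (time pattern, granularity, default aggregation, and Distinct Counts) can be pictured with a short Python sketch. It is an illustration under stated assumptions rather than a description of how Zoomdata processes data: the pattern string uses Python's `strptime` syntax, and the `bucket` helper, the `TRUNCATE` table, and the records are hypothetical.

```python
from collections import defaultdict
from datetime import datetime

# Assumed time pattern, written in Python's strptime syntax.
TIME_PATTERN = "%Y-%m-%d %H:%M:%S"

# Which datetime components to zero out for each granularity.
TRUNCATE = {
    "day": dict(hour=0, minute=0, second=0),
    "hour": dict(minute=0, second=0),
    "minute": dict(second=0),
}


def bucket(value, granularity):
    """Parse a time string and truncate it to the chosen granularity."""
    ts = datetime.strptime(value, TIME_PATTERN)
    return ts.replace(microsecond=0, **TRUNCATE[granularity])


# Made-up records: (timestamp, user, amount).
records = [
    ("2016-03-01 09:15:00", "alice", 20),
    ("2016-03-01 09:45:30", "bob", 35),
    ("2016-03-01 10:05:10", "alice", 10),
]

totals = defaultdict(int)   # default aggregation: sum of a numeric field
users = defaultdict(set)    # distinct count of an attribute field
for stamp, user, amount in records:
    key = bucket(stamp, "hour")
    totals[key] += amount
    users[key].add(user)

summary = {k.strftime("%H:%M"): (totals[k], len(users[k])) for k in sorted(totals)}
print(summary)
```

Choosing a coarser granularity such as "day" would merge both hourly buckets into one, which is why the granularity offered for a time field affects how charts group the data.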
The Refresh page lets you schedule asynchronous jobs to update the source metadata. For guidance on setting up a refresh schedule, refer to the article Using the Zoomdata Scheduler.
On the Charts page, you can:
- Edit Global Default Setting
- Select the Standard and, if available, Custom chart styles to be used with the data source
- Set default parameters (group, sub-group, colors, sorting, and so on) for each chart style
Click Finish to save your changes. Once your data connection has been established, it will be listed under the My Data Sources section of the page.