Connecting to CDH-Cloudera
Zoomdata can connect to CDH (Cloudera’s Distribution Including Apache Hadoop), Cloudera’s open source Apache Hadoop platform. CDH provides unified batch processing, interactive SQL, interactive search, and role-based access controls, along with enterprise-grade continuous availability. Specifically, Zoomdata connects to the Hadoop Distributed File System (HDFS), CDH’s fault-tolerant storage system. The connection to CDH requires Apache Spark, which is enabled automatically in the Zoomdata environment.
- To learn about the Apache Spark setup in Zoomdata, refer to the article Changing the Default Configuration for an Embedded Spark Server.
- To learn more about Spark functionality and how Zoomdata enables and uses it, see How Zoomdata Uses Apache Spark.
CONFIGURING THE HDFS CONNECTOR
To configure the connector, perform the following steps:
Log in to Zoomdata.
Administrators and users with appropriate access privileges can connect data sources in Zoomdata.
- Click Next.
On the next page, specify the path to the remote file that you want to upload into Zoomdata.
To use the first row of your data source as the column names, select the Read Headers checkbox.
In the corresponding field, specify the value separator used in your data source. Standard separators include commas (,) and semicolons (;).
Click Preview. From the Entries list, select the number of entries to display in the preview.
- In the Preview section, you can configure field properties. Click Next.
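To make the Read Headers, value separator, and Entries options more concrete, here is a minimal Python sketch of how a delimited file might be interpreted under those settings. It is an illustration only, not Zoomdata's implementation: the function `preview_rows`, its parameters, the generated `column_N` names, and the sample data are all hypothetical.

```python
import csv
import io
from itertools import chain, islice


def preview_rows(text, separator=",", read_headers=True, entries=10):
    """Return (column_names, rows) for the first `entries` data rows.

    Hypothetical helper: it mimics the effect of the Read Headers checkbox,
    the value separator field, and the Entries list described above.
    """
    reader = csv.reader(io.StringIO(text), delimiter=separator)
    first = next(reader, None)
    if first is None:
        # Empty input: no columns and no rows.
        return [], []
    if read_headers:
        # The first row supplies the column names and is not a data row.
        columns = first
        data = reader
    else:
        # No header row: invent placeholder names and keep the first row as data.
        columns = [f"column_{i + 1}" for i in range(len(first))]
        data = chain([first], reader)
    rows = list(islice(data, max(entries, 0)))
    return columns, rows


# Sample data that uses a semicolon as the value separator (made up for this sketch).
sample = "city;sales\nBoston;120\nDenver;95\nAustin;143\n"
columns, rows = preview_rows(sample, separator=";", read_headers=True, entries=2)
print(columns)
print(rows)
```

With Read Headers cleared, the same first row would be treated as data, which is why the option matters when your file starts with column names.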
On the next page, create unique label names, as needed, for each field. If necessary, change the field options, and select the checkboxes in the corresponding column to enable them. If you do not want to use specific fields from the data source, clear the checkboxes for those fields. Configure the Filter Display settings for the required fields.
You can also add calculations in the Calculations section.
Click Next.
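The field settings in this step (unique labels, excluded fields, and calculations) can be pictured as a small transformation applied to each record. The Python sketch below is a conceptual model only: `FieldConfig`, `apply_field_config`, and the sample record are hypothetical names made up for illustration, and the sketch assumes that calculations can read any source field, including ones excluded from display.

```python
from dataclasses import dataclass


@dataclass
class FieldConfig:
    name: str             # column name in the data source
    label: str            # display label; must be unique
    visible: bool = True  # False models clearing the field's checkbox


def apply_field_config(row, fields, calculations=None):
    """Relabel visible fields of one record and append calculated values."""
    labels = [f.label for f in fields]
    if len(labels) != len(set(labels)):
        raise ValueError("field labels must be unique")
    # Keep only the fields that were left selected, under their labels.
    out = {f.label: row[f.name] for f in fields if f.visible}
    # Assumption: each calculation is a function of the raw source record.
    for label, func in (calculations or {}).items():
        out[label] = func(row)
    return out


fields = [
    FieldConfig("city", "City"),
    FieldConfig("sales", "Sales"),
    FieldConfig("cost", "Cost"),
    FieldConfig("internal_id", "Internal ID", visible=False),
]
record = {"city": "Boston", "sales": 120, "cost": 80, "internal_id": 7}
result = apply_field_config(
    record, fields, calculations={"Profit": lambda r: r["sales"] - r["cost"]}
)
print(result)
```

The uniqueness check reflects the requirement to create unique label names; two fields that share a label would otherwise overwrite each other in the output.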
- On the Refresh page, you can schedule asynchronous jobs to refresh fields in your data source. Refer to the article Using the Zoomdata Scheduler for more information.
On the next page, you can enable the charts that will be available for the data source and edit their settings.
That is, you can select the chart styles that will be available for the data source, change the global default settings, and more.
Learn more about how to customize a chart.
Click Finish to save your changes.