
Managing Amazon S3 Connectors

Amazon Simple Storage Service (S3) provides a “web service interface that can be used to store and retrieve any amount of data, at any time, from anywhere on the web [1].” Zoomdata connects to S3 sources using the Apache Spark processing framework.
[1]: Excerpted from the AWS documentation, “What is Amazon S3?”

 

By default, the Amazon S3 connector is included with Zoomdata. Before configuring S3, you need to download and enable the connector.

The Amazon S3 connector now uses its own embedded Spark functionality, which runs separately from the Zoomdata embedded Spark. For steps on configuring Spark for S3, see Managing Configurations in Zoomdata.

What does S3 support?

The table below lists information on the features that are supported by Amazon S3:

Supports Distinct Count? Yes
Supports Live Mode/Playback? No
Supports Group-by Time? Yes
Supports Multi Group-by Charts? Yes
Supports Histogram? Yes
Supports Box Plot? No
Custom SQL Capable? No
Supports Last Value? No
Supports Partition? No

Managing your Connectors

For details about managing the embedded Spark Server for your S3 data source, see Managing Configurations in Zoomdata.

When setting up your S3 connector, you need to do the following:

  1. From the Remote File Settings list, select the number of entries to display in the file preview.

  2. Specify the path to the file. This is the path to the remote file that you want to upload into Zoomdata.
    (You can use this publicly available dataset:
    s3n://AKIAI535P5R2QX7NYAQQ:[email protected]/consolidated_olympic_events.csv)

  3. Select the Read Headers checkbox if you want to use the first row of your data source as column names.

  4. Specify the Value Separator used in your data source. Standard separators include commas (,) and semicolons (;).

  5. Toggle the caching setting. Caching is enabled by default.

  6. Select Preview to see a preview of the data file.
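
As a rough illustration of what the path, Read Headers, Value Separator, and preview-size settings in the steps above control, the following Python sketch parses a small in-memory sample. It is not Zoomdata's implementation: the function names (split_s3_path, preview_rows), the generated column names, the sample rows, and the example bucket and keys are all hypothetical. It also assumes the path follows the s3n://ACCESS_KEY:SECRET_KEY@bucket/object form and that the secret key contains no "/" or "@" characters.

```python
import csv
import io
from itertools import chain, islice
from urllib.parse import urlsplit


def split_s3_path(path):
    """Split an s3n:// path into (access_key, bucket, object_key)."""
    parts = urlsplit(path)
    if parts.scheme != "s3n":
        raise ValueError("expected an s3n:// path")
    # The secret key (parts.password) is deliberately not returned.
    return parts.username, parts.hostname, parts.path.lstrip("/")


def preview_rows(text, separator=",", read_headers=True, limit=10):
    """Return (column_names, rows) holding at most `limit` data rows."""
    reader = csv.reader(io.StringIO(text), delimiter=separator)
    first = next(reader, None)
    if first is None:
        return [], []
    if read_headers:
        # The first row supplies the column names and is not a data row.
        return first, list(islice(reader, limit))
    # Hypothetical placeholder names; the first row is ordinary data here.
    columns = [f"column_{i + 1}" for i in range(len(first))]
    return columns, list(islice(chain([first], reader), limit))


# Made-up sample data using a semicolon as the value separator.
sample = "athlete;year;medal\nA. Example;2008;Gold\nB. Example;2012;Silver\n"

columns, rows = preview_rows(sample, separator=";", read_headers=True, limit=1)
print(columns)
print(rows)

# Made-up credentials and bucket, shown only to illustrate the path layout.
print(split_s3_path("s3n://EXAMPLEKEY:EXAMPLESECRET@example-bucket/data/events.csv"))
```

With Read Headers cleared, the same sample would yield placeholder column names and the header line would be treated as the first data row, which is usually a sign that the checkbox should be selected.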

Refresh settings for S3

For version 2.6, scheduled reload of newly added data is not supported for the S3 connector. To add new data, do the following:

  1. Navigate to the Fields page of your data source.
  2. Enable the Refresh Fields option. This forces the connector to reload any new data discovered at the level of the data source.

Chart Settings for S3

On the Charts page, you can:

  1. Edit Global Default Settings.
  2. Select the Standard and, if available, Custom chart styles to be used with the data source.
  3. Set default parameters (group, sub-group, colors, sorting, and so on) for each chart style.

Select Finish to save your changes. Once your data connection has been established, it is listed under My Data Sources.
