Connecting to Elasticsearch
Zoomdata supports Elasticsearch versions 1.4.1 through 1.7.5.
CONFIGURING THE ELASTICSEARCH CONNECTOR
To configure the Elasticsearch connector, perform the following steps:
Log into Zoomdata.
Administrators and users with appropriate access privileges can connect data sources in Zoomdata.
Specify the name of your source and add a description (if desired). Click Next.
On the next page, define your connection source. You can either create a new connection or select an existing one, if available.
To create a new connection, select the 'Input New Credentials' option and fill in the following fields:
- Connection Name - this field must be completed before you continue
- Client Type - select the connection protocol: HTTP or Transport (TCP)
- ElasticSearch Cluster Name - specify the name of the cluster that you want to connect to
- ElasticSearch Host - specify the IP address of the node to which Zoomdata will connect
- ElasticSearch Port - specify the port number
- Enable SSL - select this checkbox to use SSL when connecting to the data source. If you selected the Transport client type and SSL is configured on the cluster, this checkbox must be selected.
- If required, specify your Elasticsearch User Name and Password.
- Click Validate. Validated credentials will be saved. Click Next.
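The connection fields above can be sanity-checked before they are typed into the form. The following is only a minimal sketch: the `EsConnection` class and its method names are hypothetical and not part of Zoomdata, and the port defaults (9200 for HTTP, 9300 for transport) are Elasticsearch's stock defaults, which your cluster may override.

```python
from dataclasses import dataclass
from typing import List, Optional

# Elasticsearch's stock default ports; a cluster may be configured differently.
DEFAULT_PORTS = {"HTTP": 9200, "TRANSPORT": 9300}


@dataclass
class EsConnection:
    """Hypothetical container mirroring the fields of the connection form."""
    connection_name: str
    client_type: str              # "HTTP" or "TRANSPORT"
    cluster_name: str
    host: str
    port: Optional[int] = None    # None means "use the default for the client type"
    enable_ssl: bool = False
    user_name: Optional[str] = None
    password: Optional[str] = None

    def validate(self) -> List[str]:
        """Return a list of problems; an empty list means the fields look usable."""
        problems = []
        if not self.connection_name.strip():
            problems.append("Connection Name must be completed")
        if self.client_type not in DEFAULT_PORTS:
            problems.append("Client Type must be HTTP or TRANSPORT")
        if not self.cluster_name.strip():
            problems.append("Cluster Name is empty")
        if not self.host.strip():
            problems.append("Host is empty")
        if self.port is not None and not 1 <= self.port <= 65535:
            problems.append("Port must be between 1 and 65535")
        # Whether SSL is required for the transport client depends on the
        # cluster's own configuration, so it cannot be checked here.
        return problems

    def base_url(self) -> str:
        """Build the URL an HTTP client would use for this connection."""
        if self.client_type != "HTTP":
            raise ValueError("base_url only applies to the HTTP client type")
        scheme = "https" if self.enable_ssl else "http"
        port = self.port or DEFAULT_PORTS["HTTP"]
        return f"{scheme}://{self.host}:{port}"


conn = EsConnection("prod-logs", "HTTP", "es-cluster", "10.0.0.5", enable_ssl=True)
print(conn.validate())
print(conn.base_url())
```

Opening the resulting URL in a browser or with an HTTP client is a quick way to confirm that the node is reachable before you click Validate.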
On the Indices page, select the indices and types to be queried, and select the fields to be handled. You can do this in three steps:
- Select indices and aliases to be queried.
You can select indices manually or automatically.
If you want to get the data only from specific indices, select the Manually option and choose the corresponding indices from the list below.
The Automatically option is more flexible. It lets you set a pattern by which indices are selected automatically. This means that if a new index is added to your data set and it matches the specified pattern, that index will be queried by Zoomdata.
Note that if no indices match the pattern at query time, you will get an empty chart.
For this option, you can select one of the pattern types:
- specify the pattern for index names. Use asterisk (*) to replace one character or a set of characters.
For example, to get all the indices whose names start with log and end with 16, specify the following pattern: log*16
- set the time pattern to get the matching indices.
Check the supported date and time patterns.
For example, the time pattern YYYY-MM returns all the indices whose names match this pattern (as shown in the Figure 5 example). Note that if the Index Name includes text along with the date and time pattern, you need to enclose the text portion in square brackets [ ].
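The two pattern types can be approximated in a few lines of Python to preview which index names a pattern would pick up. This is a sketch under stated assumptions, not Zoomdata's matching logic: the wildcard is treated as a shell-style glob, only the tokens YYYY, MM, and DD are handled (each assumed to be zero-padded digits), and the index names and function names are made up for illustration.

```python
import fnmatch
import re

# Assumed token set; the full list of supported date and time tokens
# is defined by the Zoomdata documentation, not by this sketch.
TOKENS = {"YYYY": r"\d{4}", "MM": r"\d{2}", "DD": r"\d{2}"}


def match_wildcard(index_names, pattern):
    """Return the names that match a shell-style wildcard pattern."""
    return [n for n in index_names if fnmatch.fnmatchcase(n, pattern)]


def time_pattern_to_regex(pattern):
    """Translate a time pattern such as '[events-]YYYY-MM' into a regex."""
    parts = []
    i = 0
    while i < len(pattern):
        if pattern[i] == "[":
            # Text in square brackets is literal, not a date token.
            end = pattern.index("]", i)
            parts.append(re.escape(pattern[i + 1:end]))
            i = end + 1
            continue
        for token, token_regex in TOKENS.items():
            if pattern.startswith(token, i):
                parts.append(token_regex)
                i += len(token)
                break
        else:
            # Anything else (separators such as '-' or '.') is literal.
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(parts))


def match_time_pattern(index_names, pattern):
    """Return the names that match the whole time pattern."""
    regex = time_pattern_to_regex(pattern)
    return [n for n in index_names if regex.fullmatch(n)]


# Hypothetical index names.
indices = ["log-2016", "logs-2015", "log-2016-01", "2016-01", "events-2016-02"]

print(match_wildcard(indices, "log*16"))
print(match_time_pattern(indices, "YYYY-MM"))
print(match_time_pattern(indices, "[events-]YYYY-MM"))
```

With these sample names, log*16 selects only log-2016, YYYY-MM selects only 2016-01, and the bracketed pattern selects only events-2016-02.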
- Configure filtering by type. This step is optional. If you need to filter by type, select Enable Filter By Type and click Filter. When you click Edit, the list of types available in the selected indices is displayed. If a type has different mappings in different indices, you will see all the fields present in both mappings.
If this checkbox is cleared, all the types that belong to the selected indices will be selected.
If a field has different data types in different types, you will not be able to use it for grouping, filters, and so on. However, it will still be available for raw export.
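To see in advance which fields would be affected, you can compare the field data types across the types you plan to select. The sketch below assumes the mappings have already been reduced to a simple dictionary of type name to field name to data type; the function name and the sample mappings are hypothetical.

```python
from collections import defaultdict


def find_conflicting_fields(type_mappings):
    """Return fields whose data type differs between Elasticsearch types.

    type_mappings: {type_name: {field_name: data_type}}
    """
    seen = defaultdict(set)
    for fields in type_mappings.values():
        for field_name, data_type in fields.items():
            seen[field_name].add(data_type)
    # A field is in conflict when more than one data type was recorded for it.
    return {name: sorted(types) for name, types in seen.items() if len(types) > 1}


# Hypothetical simplified mappings for two types.
mappings = {
    "order": {"id": "long", "status": "string", "created": "date"},
    "refund": {"id": "string", "status": "string", "amount": "double"},
}

# All fields present in either type (what the field list would show).
all_fields = sorted(set().union(*mappings.values()))
print(all_fields)

# Fields that could not be used for grouping or filters.
print(find_conflicting_fields(mappings))
```

In this sample, id is mapped as long in one type and as string in the other, so it is the only field reported as conflicting.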
- Configure the field settings if needed. If your data set contains multi-field types, they will be recognized and listed under the select fields section.
Their sub-fields are detected according to the mapping. Fields of the token_count type cannot be used in raw export and are not shown in the details or in the text-search results.
- Enable or disable caching and lookup values for your data source. Click Next.
On the Fields page, create unique label names for the available fields in your data source. These labels will be displayed in the charts. Specify the label names, as needed, for each Label field. You can also make the necessary changes in the Type and Default fields. If you want to search by a word or phrase on your chart, select the checkbox in the Faceted filter column for the corresponding field. Select the checkboxes in the Distinct Count column as needed. Configure the Filter Display settings for the required fields. Click Next.
On the Refresh page, you can schedule asynchronous jobs to refresh fields in your data source. Refer to the Using the Zoomdata Scheduler article for more information.
On the last page, you can enable the charts that will be available for the data source and edit the settings for your charts.
That is, you can select the styles that will be available for the data source, change the global default settings, and more. Click Finish to save your changes.
Learn more about how to customize a chart.
When you connect to your Elasticsearch data source, an additional service column, _type, is added.
The _type column contains all selected Elasticsearch types that you can visualize as attributes on your charts.
Working with Elasticsearch
Distinct Counts and Percentiles
Distinct count and percentile metrics return approximate values in Elasticsearch. The precision of the result returned by the distinct count metric depends on the precision threshold setting (the default value is 1000).
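For context, Elasticsearch exposes these approximate metrics as the cardinality and percentiles aggregations, and the precision threshold is an option of the cardinality aggregation. The sketch below builds such request bodies in Python. It illustrates the Elasticsearch side only; the exact request Zoomdata sends is not shown in this article, and the function names, aggregation names, and field names are made up for the example.

```python
import json

MAX_PRECISION_THRESHOLD = 40000  # maximum supported value


def distinct_count_query(field, precision_threshold=1000):
    """Build a search body with an approximate distinct count on one field."""
    if not 0 <= precision_threshold <= MAX_PRECISION_THRESHOLD:
        raise ValueError("precision_threshold must be between 0 and 40000")
    return {
        "size": 0,  # return only the aggregation result, no documents
        "aggs": {
            "distinct_values": {
                "cardinality": {
                    "field": field,
                    "precision_threshold": precision_threshold,
                }
            }
        },
    }


def percentiles_query(field, percents=(50, 95, 99)):
    """Build a search body with approximate percentiles on a numeric field."""
    return {
        "size": 0,
        "aggs": {
            "value_percentiles": {
                "percentiles": {"field": field, "percents": list(percents)}
            }
        },
    }


# Hypothetical field names.
print(json.dumps(distinct_count_query("user_id"), indent=2))
print(json.dumps(percentiles_query("response_time"), indent=2))
```

A higher threshold trades memory on the Elasticsearch nodes for more accurate counts.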
You can change the precision threshold by setting the elasticsearch.query.cardinality.precision.threshold property in the configuration file.
The table below lists all available properties that you can modify to work with Elasticsearch.
|Property||Default Value||Use the property to||Notes|
|elasticsearch.query.cardinality.precision.threshold||1000||control the level of accuracy of the distinct counts||The maximum supported value is 40000. However, Zoomdata does not recommend setting the value that high, as it may cause performance issues and the data source itself may return errors. For more information, refer to the Precision Control section of the Elasticsearch documentation.|
|elasticsearch.query.limit.nongrouped||10000||set the limit for the number of non-grouped records (per shard) to execute on.|
|elasticsearch.query.limit.grouped||10000||set the limit for the number of grouped records (per shard) to execute on.|
If you need to change the default settings, you can add the corresponding properties (listed above) to the configuration file and assign the required values. For more details about working with this file, refer to the article Managing Configurations in Zoomdata.
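As an illustration of the properties listed in the table, the sketch below checks a set of overrides and renders them as lines to add to the configuration file. The name=value line syntax is an assumption (Java-style properties), and the `render_overrides` helper is hypothetical; consult Managing Configurations in Zoomdata for the actual file and format.

```python
# Property names and default values as listed in the table above.
DEFAULTS = {
    "elasticsearch.query.cardinality.precision.threshold": 1000,
    "elasticsearch.query.limit.nongrouped": 10000,
    "elasticsearch.query.limit.grouped": 10000,
}
PRECISION = "elasticsearch.query.cardinality.precision.threshold"
MAX_PRECISION_THRESHOLD = 40000


def render_overrides(overrides):
    """Validate overrides and return them as name=value lines (assumed syntax)."""
    lines = []
    for name, value in overrides.items():
        if name not in DEFAULTS:
            raise KeyError(f"unknown property: {name}")
        if isinstance(value, bool) or not isinstance(value, int) or value < 0:
            raise ValueError(f"{name} must be a non-negative integer")
        if name == PRECISION and value > MAX_PRECISION_THRESHOLD:
            raise ValueError(f"{name} cannot exceed {MAX_PRECISION_THRESHOLD}")
        # Values equal to the default do not need to be written out.
        if value != DEFAULTS[name]:
            lines.append(f"{name}={value}")
    return "\n".join(lines)


print(render_overrides({
    PRECISION: 3000,
    "elasticsearch.query.limit.grouped": 10000,
}))
```

Only the precision threshold line is printed here, because the grouped limit in the example is equal to its default.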
Keep in mind that, by default, Elasticsearch analyzes (tokenizes) fields of type string (or attribute). As a result, a string consisting of two or more words may be split into separate terms when connected to Zoomdata (for example, a city name like Las Vegas). To disable this process and ensure that a string field is not tokenized, enter the following code for that field: