Connector Information Keys

Overview

Zoomdata sends each connector a ServerInfoRequest to learn the connector's capabilities and behavior. The connector responds with a set of keys whose values are handled as strings. These keys tell Zoomdata which data store features the connector supports and which limitations apply, which in turn determines the types of requests the connector can fulfill.

Types of Keys

All information keys belong to a single set and should be encoded as strings in a single, flat structure. For ease of understanding, this topic divides the keys into the groups described below.

Except where indicated, all values are booleans reported as the strings "true" and "false".
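
As a rough illustration of the flat, string-only encoding described above, the following Python sketch converts a set of capability flags into string values. The function name encode_info_keys and the use of a plain dictionary are assumptions made for this example; the actual response format is defined by the connector API and is not shown here.

```python
def encode_info_keys(capabilities):
    """Flatten capability flags into one map of string keys to string values."""
    encoded = {}
    for key, value in capabilities.items():
        if isinstance(value, bool):
            # Booleans are reported as the strings "true" and "false".
            encoded[key] = "true" if value else "false"
        else:
            # Non-boolean values (for example REQUEST.TYPE) are passed as strings.
            encoded[key] = str(value)
    return encoded


# Hypothetical set of capabilities for an example connector.
server_info = encode_info_keys({
    "REQUEST.SEND_METADATA": True,
    "REQUEST.TYPE": "STRUCTURED",
    "FEATURE.DISTINCT_COUNT": True,
    "FEATURE.OFFSET": False,
})
```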

Required Keys

The following keys are required for the correct operation of a connector server.

Key Name Description Values
REQUEST.SEND_METADATA The connector is capable of sending metadata. This value must be set to "true".
REQUEST.TYPE The connector can receive structured requests. This value must be set to "STRUCTURED".
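
A connector developer might want to check these two values before returning the response. The sketch below is one way to do that in Python; the helper name missing_required_keys is hypothetical, and the dictionary stands in for whatever structure the connector actually uses.

```python
# Required keys and the exact string values listed in the table above.
REQUIRED_KEYS = {
    "REQUEST.SEND_METADATA": "true",
    "REQUEST.TYPE": "STRUCTURED",
}


def missing_required_keys(info):
    """Return the required keys that are absent or carry the wrong value."""
    return sorted(
        key for key, expected in REQUIRED_KEYS.items()
        if info.get(key) != expected
    )
```

An empty list means that both required keys are present with the expected values.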

Feature Keys

The following keys provide information about data store functionality supported by the connector server. A value of "true" indicates that the data store supports the feature and that the connector makes this feature available to Zoomdata. Thus, there are two limiting factors: the data store's native abilities and the connector's support for them.

Key Name Description
FEATURE.CUSTOM_QUERY "true" indicates that the source supports custom queries.

For example, in SQL-based sources, Zoomdata typically wraps the query with select * from, so that if the original query was

select count(*), someField from myCollection GROUP By someField

the resulting query that Zoomdata uses will be

select * from (select count(*), someField from myCollection GROUP By someField)

While custom queries are most commonly used with SQL-based data stores, this feature can also apply to other stores that allow arbitrary queries, such as MongoDB or Elasticsearch.

For this feature to be considered implemented, the data store generally needs to support its full range of query capabilities when the custom query is used as a subquery.
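
The wrapping behavior can be tried out locally. The sketch below uses Python's built-in sqlite3 module as a stand-in for a SQL-based source; the helper wrap_custom_query is hypothetical and only mirrors the select * from (...) pattern shown above. Some data stores also require an alias on the subquery, which this sketch does not add.

```python
import sqlite3


def wrap_custom_query(custom_query):
    # Hypothetical helper: wraps a custom query the way the text above describes.
    return f"select * from ({custom_query})"


conn = sqlite3.connect(":memory:")
conn.execute("create table myCollection (someField text)")
conn.executemany(
    "insert into myCollection values (?)",
    [("a",), ("a",), ("b",)],
)

custom = "select count(*) as cnt, someField from myCollection group by someField"
wrapped = wrap_custom_query(custom)

# The wrapped query returns the same rows as the custom query itself.
rows = sorted(conn.execute(wrapped).fetchall())
```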
FEATURE.DISTINCT_COUNT "true" indicates that the source supports counting distinct values for a field.

Given a single collection and string field with three values:

1. Apple
2. Orange
3. Apple

distinct count returns 2, since there are only two distinct values (“Apple” and “Orange”), while an ordinary count returns 3 to reflect the total number of records.

For example, SQL-based sources might produce a query that looks like

select count(distinct myField) from myCollection
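
The difference between the two counts can be reproduced with the three-value example above. This sketch uses Python's built-in sqlite3 module as a stand-in for a SQL-based source; the table and field names are taken from the sample query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table myCollection (myField text)")
conn.executemany(
    "insert into myCollection values (?)",
    [("Apple",), ("Orange",), ("Apple",)],
)

# Ordinary count: one per record.
total = conn.execute("select count(myField) from myCollection").fetchone()[0]

# Distinct count: one per unique value.
distinct = conn.execute(
    "select count(distinct myField) from myCollection"
).fetchone()[0]
```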
FEATURE.FAST_DISTINCT_VALUES "true" indicates that the source supports optimized retrieval of distinct (unique) values among a large number of records.

For example, Kudu supports dictionary encoding. Elasticsearch, as another example, keeps lists of distinct values at the ready. Features such as these make FAST_DISTINCT_VALUES possible for your connector.

There is currently no test suite coverage for this feature, and no metric defines “fast”. Whether to enable this key is left to the judgment of the connector developer.

For most connectors, this feature can be safely left disabled without impact.
FEATURE.GROUP_BY_TIME "true" indicates that the source supports grouping on time fields.

Most commonly, a source has a date or timestamp type that corresponds to Zoomdata’s date type.

Take for example the SQL query

select timeField, max(otherField) from myCollection group by timeField

If grouping on time is not supported, some charts will be unavailable for the source.
FEATURE.GROUP_BY_TIME.GROUP_BY_UNIX_TIME This key represents the same concept as GROUP_BY_TIME, except that the time value is stored in an integer field and represented as Unix (epoch) time.
FEATURE.HISTOGRAM "true" indicates that the data store supports the calculations necessary for histograms.

If histograms are not supported, histogram visualizations will be unavailable for the source.

For information about histogram calculations, see Histogram Calculations.
FEATURE.HISTOGRAM.HISTOGRAM_FOR_FLOAT_POINT_VALUES "true" indicates that the data store supports the calculations for histograms with non-integer values such as floats and doubles.

For information about histogram calculations, see Histogram Calculations.
FEATURE.LIVE_SOURCE "true" indicates that the data store has the ability to play data in live mode.

The data store should be capable of receiving new or updated data; that is, its data should not be static, as it is in flat files. If live sourcing is not supported, the “Live Mode” checkbox is disabled during source creation.
FEATURE.LV_METRIC "true" indicates that the data store supports the last-value metric.

Although many data stores implement a last-value function, in Zoomdata this key indicates that the source can load the last value of a given field in a collection and use it in a visualization immediately.

If the last-value metric is not supported, the last-value metric function in Zoomdata is unavailable.
FEATURE.MULTI_GROUP_SUPPORT "true" indicates that the source has the ability to group by more than one field in a query.

Take for example the SQL query

select firstField, secondField, count(distinct otherField) from myCollection group by firstField, secondField

If multi-group querying is not supported, some visualizations will be unavailable for the source.
FEATURE.OFFSET "true" indicates that the source has a method to specify offset.

Offset is typically used to return records that are not located at the beginning of a collection. Combined with a limit on the number of records returned, this allows Zoomdata to paginate data during raw data read requests.

For example, given a collection with 100 records, the first request starts from the first record and limits the response to 10 records. With an offset of 10, the next request skips the first 10 records and starts from the eleventh record instead of the first. This approach creates 10 "pages" of data.

Many data stores support limit-and-offset functionality, but may use different terminology.
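
The paging scheme can be sketched with Python's built-in sqlite3 module standing in for a data store that supports limit and offset. The function read_page and the id column are hypothetical names chosen for this example; an explicit order by is included because paging is only stable when the row order is fixed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table myCollection (id integer)")
conn.executemany(
    "insert into myCollection values (?)",
    [(i,) for i in range(1, 101)],  # 100 records, numbered 1 through 100
)


def read_page(page, page_size=10):
    """Return one page of ids; page 0 is the first page."""
    offset = page * page_size  # number of records to skip
    cursor = conn.execute(
        "select id from myCollection order by id limit ? offset ?",
        (page_size, offset),
    )
    return [row[0] for row in cursor]
```

With a page size of 10, read_page(0) returns records 1 through 10, read_page(1) returns records 11 through 20, and read_page(10) returns an empty list because all 100 records have been read.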
FEATURE.PARTITION "true" indicates that the source supports partitions and pruning.

Although many sources support partitions in some form, this specifically tells Zoomdata that the partitions may be used for manual pruning of result sets to increase speed.

Currently, this feature is only used for Hive-based sources and requires multiple steps and integration points that are still evolving. Zoomdata recommends leaving this disabled at this time.
FEATURE.PERCENTILES "true" indicates that the data store supports percentile calculations.

If percentile calculations are not supported, some charts, such as the box plot, will be unavailable.
FEATURE.REFRESHABLE "true" indicates that the structure of a collection, its fields, or its field types can be modified after source creation.

This key toggles the availability of the “refresh field statistics” button on the fields tab of source creation. It is true for most sources.
FEATURE.SUPPORTS_MULTI_VALUED "true" indicates that the data store supports multi-valued fields such as maps and objects.

Zoomdata support for complex types is evolving. Zoomdata recommends leaving this feature unimplemented at this time.
FEATURE.SUPPORTS_NESTED "true" indicates that the data store supports nested-field structures such as JSON structures.

Zoomdata support for complex types is evolving. Zoomdata recommends leaving this feature unimplemented at this time.
FEATURE.SUPPORTS_SCHEMA "true" indicates that the data store supports some sort of namespace, schema, or catalog notation for organizing collections.

If enabled, the connector should be able to provide a list of the schemas in the MetaSchemasResponse. Zoomdata will also allow schema selection during source creation.
FEATURE.SUPPORT_OPTIMIZED_READ
FEATURE.TEXT_SEARCH "true" indicates that the data store supports search on text-based fields.

Although many data stores support this functionality, Zoomdata currently uses it only for search-based sources such as Elasticsearch and Solr, and ignores the capability if it is implemented for other data stores. Zoomdata recommends leaving this feature disabled at this time.

Limitation Keys

The following keys do not impose limits on a connector. Instead, they report to Zoomdata that the relevant data store has a functional limitation.

Key Name Description
FEATURE.DISTINCT_COUNT.DISTINCT_COUNT_ONLY_ONE Defaults to "false".

"true" indicates that the source can accept only a single distinct count field per query. In this case, Zoomdata limits distinct counts to a single field per request.

Apache Impala is an example of such a data store.
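
To make the limitation concrete, the sketch below builds either one combined query or one query per field, depending on the flag. This is an illustration only: the function distinct_count_queries is hypothetical, and how Zoomdata itself splits or restricts requests is not described here.

```python
def distinct_count_queries(collection, fields, only_one):
    """Build distinct-count queries that respect the single-field limitation."""
    if only_one:
        # One distinct count per query, so issue a separate query for each field.
        return [
            f"select count(distinct {field}) from {collection}"
            for field in fields
        ]
    # No limitation: all distinct counts can share a single query.
    columns = ", ".join(f"count(distinct {field})" for field in fields)
    return [f"select {columns} from {collection}"]
```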
FEATURE.RAW_DATA_ONLY "true" indicates that the source requires Zoomdata’s internal engine to perform aggregations.

Some sources expose only raw data: they return their data as-is, row by row, and cannot perform aggregate functions such as count() or sum().
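
The following Python sketch illustrates what aggregating outside the data store involves: the source returns plain rows, and the caller computes count and sum per group. It is not Zoomdata's internal engine; the function aggregate and the sample rows are assumptions made for this example.

```python
from collections import defaultdict

# Rows as a raw-data-only source might return them: no grouping, no aggregation.
raw_rows = [
    {"someField": "a", "amount": 5},
    {"someField": "b", "amount": 7},
    {"someField": "a", "amount": 3},
]


def aggregate(rows, group_field, metric_field):
    """Compute count and sum of metric_field for each value of group_field."""
    totals = defaultdict(lambda: {"count": 0, "sum": 0})
    for row in rows:
        bucket = totals[row[group_field]]
        bucket["count"] += 1
        bucket["sum"] += row[metric_field]
    return dict(totals)


result = aggregate(raw_rows, "someField", "amount")
```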
