Implementing Responses to Data Requests

Overview

The DataReadRequest and DataReadResponse perform the main work of the connector. Zoomdata issues a DataReadRequest for creating visualizations, calculating collection statistics, determining eligible filter values, and almost any activity in the system that leverages data.

Server Workflow

There are two phases and three related response types for handling read requests. Communication with Zoomdata is asynchronous and nondeterministic, meaning that while the lifecycle for a read request can vary, its generally expected path is:

  1. Zoomdata sends a DataReadRequest to a connector. The connector responds to Zoomdata with a PrepareResponse that includes one or more unique response IDs.
  2. Zoomdata may send a StatusRequest to poll the status of the request using the provided IDs. The connector server responds with a StatusResponse.
  3. Zoomdata may send a Fetch request to retrieve the data in batches or send a cancellation. The connector server sends a DataResponse to provide the queried information.

These steps are illustrated below.

Each of these steps is discussed in the sections below.
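
The expected path above can be sketched from the caller's side as a prepare, poll, fetch loop. The sketch below is only an illustration: FakeConnector, its method names (prepare, status, fetch), and the dictionary shapes are assumptions made for this example, not the Thrift-generated API.

```python
import uuid


class FakeConnector:
    """Hypothetical in-memory stand-in for a connector server."""

    def __init__(self, rows, batch_size=2):
        self._rows = rows
        self._batch_size = batch_size
        self._cursors = {}  # request ID -> index of the next row to return

    def prepare(self, request):
        # Phase 1: hand back an ID; no query is executed yet.
        request_id = str(uuid.uuid4())
        self._cursors[request_id] = 0
        return {"requestIds": [request_id]}

    def status(self, request_id):
        # This fake is always ready; a real connector may report PROGRESS.
        return {"requestId": request_id, "status": "DONE"}

    def fetch(self, request_id):
        # Phase 2: return one batch and flag whether more batches remain.
        start = self._cursors[request_id]
        end = start + self._batch_size
        self._cursors[request_id] = end
        return {
            "requestId": request_id,
            "records": self._rows[start:end],
            "hasNext": end < len(self._rows),
        }


def read_all(connector, request):
    """Drive one read request through prepare, status, and fetch."""
    records = []
    for request_id in connector.prepare(request)["requestIds"]:
        while connector.status(request_id)["status"] != "DONE":
            pass  # a real caller would wait between polls
        while True:
            response = connector.fetch(request_id)
            records.extend(response["records"])
            if not response["hasNext"]:
                break
    return records
```

Because the caller may stop after any step, a connector should not assume that a prepared request is ever polled or fetched.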

Data Read Request

The DataReadRequest provides legacy support for a DataRequestType.SQL request type. That type is deprecated in favor of the more general DataRequestType.STRUCTURED, which should be used going forward.

StructuredRequest requests are available in the following types.

  • statsDataRequest
  • distinctValuesRequest
  • rawDataRequest
  • aggDataRequest

A DataReadRequest must include exactly one of these StructuredRequest types. An example DataReadRequest making an aggDataRequest follows.

{
	type: DataRequestType.STRUCTURED,
	structured: {
		type: StructuredRequestType.AGG,
		fieldMetadata: {
			_ts: {
				name: '_ts',
				type: FieldType.INTEGER,
				fieldParams: {
					flags: [PLAYABLE]
				}
			},
			amount: {
				name: 'amount',
				type: FieldType.DOUBLE
			}
		},
		aggDataRequest: {
			metrics: [{
				type: MetricType.PERCENTILES,
				percentile: {
					field: 'amount',
					margin: 0.0
				}
			}, {
				type: MetricType.PERCENTILES,
				percentile: {
					field: 'amount',
					margin: 100.0
				}
			}]
		}
	},
	requestInfo: /*...*/
}
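
A connector may want to check the "exactly one" rule before dispatching a request. The following is a minimal sketch that assumes the structured part of the request has already been converted to a plain dictionary keyed by the four names listed above; the function name structured_kind is hypothetical.

```python
STRUCTURED_KEYS = (
    "statsDataRequest",
    "distinctValuesRequest",
    "rawDataRequest",
    "aggDataRequest",
)


def structured_kind(structured):
    """Return the single request key that is set, or raise ValueError."""
    present = [key for key in STRUCTURED_KEYS if structured.get(key) is not None]
    if len(present) != 1:
        raise ValueError(
            "expected exactly one structured request, found %d: %s"
            % (len(present), present)
        )
    return present[0]
```

For example, structured_kind({"aggDataRequest": {"metrics": []}}) returns "aggDataRequest", while an empty dictionary or one with two request keys raises ValueError.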

More about each type of DataReadRequest follows.

StatsDataRequest

StatsDataRequest is used for gathering statistics about one or more fields in a collection. As of Zoomdata 2.3, that includes two metric functions: MIN and MAX. This request is typically issued when calculating collection statistics for a field to determine its upper and lower bounds.

Example of SQL that might be sent to the data store:

select min(ds.field1) as minField1,
max(ds.field1) as maxField1,
min(ds.field2) as minField2
from myCollection as ds
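
One way to produce a query of this shape is sketched below. The function name stats_sql and its parameters are hypothetical, it emits both MIN and MAX for every field, and it interpolates names directly, so it assumes field and collection names have already been validated.

```python
def stats_sql(collection, fields):
    """Build a MIN/MAX query for the given field names.

    Names are interpolated as-is, so they are assumed to be validated.
    """
    if not fields:
        raise ValueError("at least one field is required")
    columns = []
    for field in fields:
        suffix = field[0].upper() + field[1:]
        columns.append(f"min(ds.{field}) as min{suffix}")
        columns.append(f"max(ds.{field}) as max{suffix}")
    return "select " + ", ".join(columns) + f" from {collection} as ds"
```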

DistinctValuesRequest

DistinctValuesRequest is used to identify distinct values for a field. The resulting list must include all distinct values and exclude duplicates. This request is often used to provide a pick list for selecting elements in a group so the user can filter on specific values. Distinct values requests are only issued against a single field at a time.

Example of SQL that might be sent to the data store:

select distinct ds.field1 AS field1Values
from myCollection as ds
order by field1Values

By convention, if Zoomdata does not explicitly send a sort with the DistinctValuesRequest, the connector should sort the field in ascending order.
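
The default-sort convention can be sketched as follows. The function name distinct_values_sql and the direction parameter are hypothetical, and names are assumed to be validated before they are interpolated.

```python
def distinct_values_sql(collection, field, direction=None):
    """Build a distinct-values query; sort ascending when no sort is sent."""
    direction = (direction or "asc").lower()
    if direction not in ("asc", "desc"):
        raise ValueError(f"unsupported sort direction: {direction}")
    alias = f"{field}Values"
    return (
        f"select distinct ds.{field} as {alias} "
        f"from {collection} as ds "
        f"order by {alias} {direction}"
    )
```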

RawDataRequest

RawDataRequest is used to retrieve simple data, possibly with filters, limits, offsets, and sorts. It must not contain metrics or calculations. Use cases include paginating through a data set or retrieving specific values.

Example of SQL that might be sent to the data store:

select ds.column1 as myCol, ds.column2 as myCol2
from myCollection as ds
where ds.mycol3 > 5
order by myCol
limit 10

If the RawDataRequest does not request any specific fields, then the connector should treat it as selecting all available fields, as in SQL's select * clause.
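
A minimal sketch of this fallback, with the clauses emitted in an order most SQL dialects accept, is shown below. The function raw_data_sql and its parameters are hypothetical, and the where and order_by arguments are assumed to be pre-built, trusted SQL fragments.

```python
def raw_data_sql(collection, fields=None, where=None, order_by=None, limit=None):
    """Build a raw-data query; no requested fields means all fields."""
    columns = ", ".join(f"ds.{name}" for name in fields) if fields else "*"
    sql = f"select {columns} from {collection} as ds"
    if where:
        sql += f" where {where}"
    if order_by:
        sql += f" order by {order_by}"
    if limit is not None:
        sql += f" limit {int(limit)}"
    return sql
```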

AggDataRequest

AggDataRequest is used when constructing most of Zoomdata’s visualizations. It often includes some combination of fields and metrics, and may also include sorts, filters, limits, and so on.

Example of SQL that might be sent to the data store:

select ds.myCol1 as col1, sum(ds.myCol2) as sumCol
from myCollection as ds
where ds.mycol3 between 1 and 10
group by ds.myCol1

Successfully implementing the server largely involves breaking down AggDataRequest requests and passing them along to the data store.
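
As a rough illustration of that breakdown, the sketch below turns a list of group fields and a list of metrics into a grouped query. The function agg_sql and the (function, field, alias) tuple shape are assumptions made for this example, and names are assumed to be validated before interpolation.

```python
def agg_sql(collection, groups, metrics, where=None):
    """Build a grouped query.

    groups: field names to group by.
    metrics: (function, field, alias) tuples, such as ("sum", "myCol2", "sumCol").
    """
    group_cols = [f"ds.{name}" for name in groups]
    metric_cols = [f"{func}(ds.{field}) as {alias}" for func, field, alias in metrics]
    if not group_cols and not metric_cols:
        raise ValueError("at least one group or metric is required")
    sql = "select " + ", ".join(group_cols + metric_cols)
    sql += f" from {collection} as ds"
    if where:
        sql += f" where {where}"
    if group_cols:
        sql += " group by " + ", ".join(group_cols)
    return sql
```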

Time Format Handling

Zoomdata accesses time fields in one of these ways:

  • Data explicitly stored as a date or time type, such as a timestamp
  • Data stored as an integer representing time, often as epoch/Unix time

Time Granularity

Zoomdata allows the truncation of time data to a specified granularity. Connectors that support the GROUP_BY_TIME feature should be able to handle the following levels of granularity.

  • Year
  • Quarter (Jan 1, Apr 1, Jul 1, Oct 1)
  • Month
  • Week - Zoomdata expects that the week starts on a Monday. Many data stores implement week truncation assuming that weeks start on Sunday. Your connector may need to adjust its calculation accordingly to provide Zoomdata the result that it expects.
  • Day
  • Hour
  • Minute
  • Second
  • Millisecond

For example, the timestamp Sunday, May 15, 2016 14:03:56.287, represented using the ISO 8601 standard as 2016-05-15T14:03:56.287, would be truncated as follows.

Granularity    Returned Result
Year           2016-01-01T00:00:00
Quarter        2016-04-01T00:00:00
Month          2016-05-01T00:00:00
Week           2016-05-09T00:00:00
Day            2016-05-15T00:00:00
Hour           2016-05-15T14:00:00
Minute         2016-05-15T14:03:00
Second         2016-05-15T14:03:56
Millisecond    2016-05-15T14:03:56.287
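
For data stores that cannot truncate natively, or that start weeks on Sunday, the truncation can be done in the connector. The following is a minimal sketch using Python's standard datetime module; the function name truncate and the lowercase granularity strings are assumptions for illustration.

```python
from datetime import datetime, timedelta


def truncate(ts, granularity):
    """Truncate a naive datetime to the named granularity."""
    g = granularity.lower()
    if g == "millisecond":
        return ts.replace(microsecond=ts.microsecond // 1000 * 1000)
    if g == "second":
        return ts.replace(microsecond=0)
    if g == "minute":
        return ts.replace(second=0, microsecond=0)
    if g == "hour":
        return ts.replace(minute=0, second=0, microsecond=0)
    midnight = ts.replace(hour=0, minute=0, second=0, microsecond=0)
    if g == "day":
        return midnight
    if g == "week":
        # weekday() is 0 for Monday, so this steps back to the Monday.
        return midnight - timedelta(days=ts.weekday())
    if g == "month":
        return midnight.replace(day=1)
    if g == "quarter":
        first_month = (ts.month - 1) // 3 * 3 + 1
        return midnight.replace(month=first_month, day=1)
    if g == "year":
        return midnight.replace(month=1, day=1)
    raise ValueError(f"unknown granularity: {granularity}")
```

Applied to 2016-05-15T14:03:56.287, the week case steps back six days from Sunday to Monday, 2016-05-09.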

Epoch / Unix Time Conversion

Connectors that support Unix time need the ability to transform Unix times, whether in seconds or milliseconds, to a timestamp. The connector must be able to perform granularity truncations on the resulting timestamp.

Timestamps

Connectors that handle time should be able to:

  • Accept ISO 8601-compliant timestamps as strings and treat them as timestamps for calculation purposes.
  • Cast years into ISO 8601-compliant timestamps.
    For example, given the value 2014, your connector must be able to convert it to 2014-01-01T00:00:00.
  • Cast epoch/Unix time in milliseconds to a timestamp.
    For example, given the Unix time 1475405646290, your connector must be able to convert it to 2016-10-02T10:54:06.
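
These conversions can be sketched with Python's standard datetime module. The helper names from_iso, from_year, and from_epoch are hypothetical, and results are naive datetimes that are assumed to represent UTC.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)  # Unix epoch, treated as UTC


def from_iso(text):
    """Parse an ISO 8601 string such as 2016-05-15T14:03:56.287."""
    return datetime.fromisoformat(text)


def from_year(year):
    """Cast a year such as 2014 to midnight on January 1 of that year."""
    return datetime(int(year), 1, 1)


def from_epoch(value, unit="ms"):
    """Cast Unix time in milliseconds ("ms") or seconds ("s") to a datetime."""
    if unit == "ms":
        return EPOCH + timedelta(milliseconds=value)
    if unit == "s":
        return EPOCH + timedelta(seconds=value)
    raise ValueError(f"unknown unit: {unit}")
```

Adding a timedelta to the epoch keeps the arithmetic in integers, which avoids floating-point rounding of the millisecond part.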

Time Zones

Because Zoomdata does not currently provide comprehensive time zone support, the best practice is to serve time data in the UTC time zone.

Prepare Response

Upon receiving a DataReadRequest, the connector generates a request ID for it. The request ID is a universally unique identifier that Zoomdata can use to track and manage the request. The connector responds to the request with a PrepareResponse that contains any necessary request IDs.

When a connector receives a DataReadRequest, it should not necessarily expect to execute it. That is, Zoomdata may never request a status update, fetch data for the request, or even cancel the request. The best practice is not to send requests down to the data store during this phase. Another best practice is for the connector server to eventually release any resources that it allocates during this phase.
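
A minimal sketch of both practices follows: requests are stored under a generated UUID without touching the data store, and entries that are never fetched are eventually dropped. The class PreparedRequests, its method names, and the five-minute default lifetime are assumptions made for this example.

```python
import time
import uuid


class PreparedRequests:
    """Hold prepared requests without executing them."""

    def __init__(self, ttl_seconds=300.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._pending = {}  # request ID -> (request, time prepared)

    def prepare(self, request):
        """Store the request and return a new unique request ID."""
        request_id = str(uuid.uuid4())
        self._pending[request_id] = (request, self._clock())
        return request_id

    def get(self, request_id):
        """Return the stored request; raises KeyError if unknown or evicted."""
        return self._pending[request_id][0]

    def evict_expired(self):
        """Drop requests older than the lifetime; return how many were dropped."""
        cutoff = self._clock() - self._ttl
        expired = [rid for rid, (_, t) in self._pending.items() if t < cutoff]
        for rid in expired:
            del self._pending[rid]
        return len(expired)
```

Calling evict_expired periodically, for example from a background timer, bounds the memory held for requests that Zoomdata never follows up on.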

Status Request and Status Response

After Zoomdata has received a request ID from a PrepareResponse, it may poll the connector server with a StatusRequest for those IDs. In response to a StatusRequest, the connector returns a StatusResponse that contains the request ID and the request's status:

  • RequestStatus.DONE indicates that the connector is ready to fetch data.
  • RequestStatus.PROGRESS indicates that the response is still being prepared.

If the data store supports reporting the progress of a request, the progress field can be used to report progress expressed as a percentage from 0.0 to 100.0 (complete) using a double.
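
The shape of such a response can be sketched as a plain dictionary. The function status_response and the requestId and status key names are assumptions for illustration; only the progress field and the two status values come from the description above.

```python
def status_response(request_id, done, progress=None):
    """Build a status payload; progress is clamped to the 0.0-100.0 range."""
    response = {
        "requestId": request_id,
        "status": "DONE" if done else "PROGRESS",
    }
    if progress is not None:
        response["progress"] = min(100.0, max(0.0, float(progress)))
    return response
```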

Fetching and the Data Response

RequestStatus.DONE indicates that the connector has completed preparing the query and is ready to send it to the data store. At that point, Zoomdata may send a fetch request to the connector. The connector should then fetch the required data from the data store. The connector then returns to Zoomdata the set of Record objects and the original request ID in a DataResponse. If there are more batches to be fetched, the connector should set the hasNext property of the DataResponse to true.

Zoomdata expects the Record objects for aggregated requests to be in the following order:

  • Groups
  • Counts
  • Metrics

While the API specification does not require Record objects to be ordered in this way, failing to order them as above may interfere with proper functioning of the connector.
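
One way to enforce this ordering is to tag each output column with its kind and sort on that tag. The sketch below assumes a hypothetical list of (name, kind) pairs, where kind is "group", "count", or "metric"; it is not part of the connector API.

```python
def ordered_columns(columns):
    """Order (name, kind) pairs as groups, then counts, then metrics.

    sorted() is stable, so columns of the same kind keep their relative order.
    """
    rank = {"group": 0, "count": 1, "metric": 2}
    return [name for name, kind in sorted(columns, key=lambda col: rank[col[1]])]
```

For example, a column list of sumCol (metric), col1 (group), count (count), and col2 (group) comes back as col1, col2, count, sumCol.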

Other Requests and Responses

The Thrift server definition used to build your connector includes requests and responses that should not be implemented at present. These requests and responses manage unusual cases or uncommon data store features. For a complete list of connector API feature advisories, see Connector Best Practices and Advisories.