High-Performance Query Engine for Modern Data

Plan, Optimize, and Execute

Zoomdata's high-performing query engine is purpose-built to let people explore today's most demanding data sources: big data, streaming data, and data from non-traditional data sources that don't use JDBC/ODBC, SQL, or relational database schemas.

What Does the Query Engine Do?

The Zoomdata Query Engine sits between the web application (or an API client) and Zoomdata Smart Data Connectors. It has three primary roles.

Plan

Deconstruct and convert abstract query requests into distributed query plans

Optimize

Optimize execution plans based on data platform capabilities, in-memory cached results, and Query Engine capabilities

Execute

Push down queries to Smart Data Connectors and/or caches and perform in-memory Fusion and post-processing if needed

Query Abstraction

Zoomdata has a single query API that is standardized for all user requests. It doesn't matter to the user or to the Query Engine whether the target data platform requires SQL or its own API-based query language.

Abstracting user requests into a single, simple query API is part of what makes modern data platforms like Elasticsearch, Impala, MongoDB, and Snowflake readily available to a general user population. Users do not need to write SQL, XML, or any other query language.

The Query Engine applies its knowledge of each target data source and what's available within its caches in order to optimize and distribute execution plans. However, the Query Engine doesn't actually write queries in SQL or any other database language. It delegates the execution of the query plans to the Smart Data Connectors, which translate Zoomdata query API requests into SQL and/or a native query API.
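To make the division of labor concrete, here is a minimal sketch of one abstract request being translated into two different native forms. It is not Zoomdata's actual query API: `QueryRequest`, `to_sql`, and `to_mongo_pipeline` are hypothetical names invented for illustration, and the request supports only equality filters and a single summed metric.

```python
from dataclasses import dataclass, field


@dataclass
class QueryRequest:
    """Hypothetical abstract request; not Zoomdata's real query API."""
    source: str
    group_by: str
    metric: str                                   # field to sum
    filters: dict = field(default_factory=dict)   # equality filters only


def to_sql(req):
    """Render the request as a parameterized SQL string plus its parameters."""
    sql = f"SELECT {req.group_by}, SUM({req.metric}) AS total FROM {req.source}"
    params = []
    if req.filters:
        clauses = []
        for name, value in req.filters.items():
            clauses.append(f"{name} = ?")
            params.append(value)
        sql += " WHERE " + " AND ".join(clauses)
    sql += f" GROUP BY {req.group_by}"
    return sql, params


def to_mongo_pipeline(req):
    """Render the same request as a MongoDB-style aggregation pipeline."""
    pipeline = []
    if req.filters:
        pipeline.append({"$match": dict(req.filters)})
    pipeline.append({
        "$group": {"_id": f"${req.group_by}", "total": {"$sum": f"${req.metric}"}}
    })
    return pipeline


request = QueryRequest("sales", "region", "revenue", {"year": 2019})
sql, params = to_sql(request)
pipeline = to_mongo_pipeline(request)
```

The caller builds one `request` and never sees either output, which mirrors the idea that translation belongs to the connector layer rather than to the user or the Query Engine.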

Adaptive Caching

When enabled for a data source, Zoomdata maintains multiple caches to accelerate time-to-answer and avoid unnecessarily expensive queries. Both data retrieved from data sources and data previously processed by the Query Engine can be cached as interactive data to improve the performance of calculations, multi-source analysis, and other analytical functionality. To optimize performance, the Query Engine evaluates the complexity of different approaches and selects the simplest one.
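The caching idea can be sketched in a few lines. This is an illustration under stated assumptions, not Zoomdata's implementation: `QueryCache`, `choose_simplest`, and the numeric cost estimates are all hypothetical, and a real cache would also need expiry and invalidation.

```python
class QueryCache:
    """Hypothetical result cache keyed by a normalized request."""

    def __init__(self):
        self._store = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def key(source, group_by, metric, filters):
        # Sort the filters so that equivalent requests share one key.
        return (source, group_by, metric, tuple(sorted(filters.items())))

    def get_or_compute(self, key, compute):
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        result = compute()
        self._store[key] = result
        return result


def choose_simplest(candidates):
    """Pick the approach with the lowest estimated cost (made-up units)."""
    return min(candidates, key=lambda c: c[1])[0]


calls = []


def expensive_query():
    # Stand-in for a round trip to the data platform.
    calls.append(1)
    return [("east", 120), ("west", 95)]


cache = QueryCache()
key = QueryCache.key("sales", "region", "revenue", {"year": 2019})
first = cache.get_or_compute(key, expensive_query)
second = cache.get_or_compute(key, expensive_query)   # served from cache

candidates = [
    ("query the data platform", 100),
    ("re-aggregate cached rows", 5),
    ("serve cached result", 1),
]
chosen = choose_simplest(candidates)
```

The second call never reaches `expensive_query`, and the cost comparison stands in for the engine weighing alternative approaches.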

Pushdown Processing

Zoomdata’s default processing strategy is to push down as much work to the underlying data platforms as possible. Internal to the Query Engine is a query optimizer that evaluates each end-user request, and determines whether to submit all or part of the request to the target data platform(s). This includes pushing down filtering criteria, aggregations, calculations, and offset, limit, sort, and time bucketing operations. 
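One way to picture the optimizer's decision is as a split of an ordered list of operations into a pushed-down part and a part that stays in the engine. The sketch below is a simplification with hypothetical names (`split_plan`, the capability sets); it assumes that once one step has to run in the engine, every later step runs there too because it depends on that step's output.

```python
def split_plan(operations, platform_capabilities):
    """Split ordered operations into (pushed_down, run_in_engine)."""
    pushed, local = [], []
    for op in operations:
        # Push down only while every earlier step was also pushed down.
        if not local and op in platform_capabilities:
            pushed.append(op)
        else:
            local.append(op)
    return pushed, local


operations = ["filter", "aggregate", "calculate", "sort", "limit"]

# Hypothetical capability sets for two kinds of data platform.
full_platform = {"filter", "aggregate", "calculate", "sort", "limit",
                 "offset", "time_bucket"}
limited_platform = {"filter", "aggregate"}

all_pushed = split_plan(operations, full_platform)
partly_pushed = split_plan(operations, limited_platform)
```

With the fully capable platform the engine has nothing left to do; with the limited one it pushes down filtering and aggregation and handles the remaining steps itself.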

Zoomdata Fusion

Zoomdata Fusion is one way to achieve multi-source analysis. Fusion is a specialized Zoomdata data source that maps fields from different data sources using common keys so that they appear to the business user to be from a single source.

The two most common reasons for using Fusion are to enrich data and to make it easier to interpret. One simple example is to use Fusion to look up labels or other attributes. Depending on the data size, Fusion can be resource-intensive.

The Query Engine is responsible for planning and optimizing the query execution plans. If caches are enabled, the Query Engine pulls what it can from cache, and pushes as much processing as possible down to the data platforms. The Query Engine processes the data from the multiple systems to deliver the correct values back to the user.
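The label-lookup case can be sketched as an in-memory join on a common key. This is only an illustration: `fuse` and the sample rows are hypothetical, and it assumes each source has already returned its (ideally pre-aggregated) rows to the engine.

```python
def fuse(fact_rows, lookup_rows, key, label_field):
    """Attach label_field from lookup_rows to each fact row via a shared key."""
    labels = {row[key]: row[label_field] for row in lookup_rows}
    fused = []
    for row in fact_rows:
        enriched = dict(row)                       # leave the input untouched
        enriched[label_field] = labels.get(row[key])   # None if no match
        fused.append(enriched)
    return fused


# Rows as they might come back from two different data sources.
sales = [
    {"store_id": 1, "revenue": 120},
    {"store_id": 2, "revenue": 95},
    {"store_id": 3, "revenue": 40},
]
stores = [
    {"store_id": 1, "store_name": "Downtown"},
    {"store_id": 2, "store_name": "Airport"},
]

fused = fuse(sales, stores, "store_id", "store_name")
```

The lookup table is held in memory, which hints at why Fusion can become resource-intensive as the data grows.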

Data Sharpening™ and Microqueries

Microqueries and Data Sharpening™ are patented technologies that work together to give end users an interactive experience when analyzing big data.

If microqueries are enabled for a data source, the Query Engine determines whether to invoke them. Microqueries repeatedly sample data across partitions and return the sample data to the Query Engine for processing.

[Figure: Zoomdata Architecture]

Data Sharpening complements microqueries. It consists of two processes:

  • Analyzing sample data returned by microqueries
  • Streaming estimated results to the user’s browser (or other API client)

Data Sharpening’s estimated results may fluctuate a bit up or down until the final query resolves. However, the relative values of each group usually remain consistent as the data is sharpened. For example, the tallest bar in the chart at 10% completion will almost always remain the tallest bar at 100% completion. This means that users can be confident exploring data even as it streams live to the dashboard.
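The behavior described above can be imitated with a toy progressive estimator. This is not the patented algorithm, only a sketch under strong assumptions: `sharpen` is a hypothetical name, partitions are read one at a time, and all partitions are assumed to be the same size so that counts can be extrapolated by a simple scale factor.

```python
from collections import Counter


def sharpen(partitions, group_field):
    """Yield (fraction_done, estimated_counts) after each partition is read."""
    seen = Counter()
    total = len(partitions)
    for i, partition in enumerate(partitions, start=1):
        seen.update(row[group_field] for row in partition)
        scale = total / i   # extrapolate from the partitions read so far
        yield i / total, {group: n * scale for group, n in seen.items()}


def rows(groups):
    """Build one partition from a string of single-letter group labels."""
    return [{"group": g} for g in groups]


partitions = [rows("AAABC"), rows("AABBC"), rows("AAABB"), rows("AABCC")]

updates = list(sharpen(partitions, "group"))
fractions = [fraction for fraction, _ in updates]
leaders = [max(estimate, key=estimate.get) for _, estimate in updates]
final_estimate = updates[-1][1]
```

In this sample the estimate for group A moves between updates, yet A leads at every step, and the last update equals the exact counts.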

When the user zooms in, filters, changes metrics or groupings, or takes any other action that would change data values, Zoomdata cancels any active queries. Canceling active queries, however, is not trivial, and many JDBC drivers do not support it. In these cases, Zoomdata’s Smart Data Connectors issue native API calls to complete the task.
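The engine-side half of cancellation can be sketched as a cooperative flag that a new request sets on the one before it. The names `Session` and `run_in_chunks` are hypothetical, and the sketch deliberately leaves out the hard part the text mentions: actually stopping work on the data platform, which depends on driver or native API support.

```python
import threading


class Session:
    """Hypothetical per-user session that keeps one active query at a time."""

    def __init__(self):
        self._active = None

    def start(self):
        # A new request supersedes whatever is still running.
        if self._active is not None:
            self._active.set()
        self._active = threading.Event()
        return self._active


def run_in_chunks(chunks, cancelled):
    """Sum the chunks, giving up as soon as the cancel flag is set."""
    total = 0
    for chunk in chunks:
        if cancelled.is_set():
            return None        # result abandoned; nothing is sent to the user
        total += sum(chunk)
    return total


session = Session()
first_flag = session.start()     # user opens a chart
second_flag = session.start()    # user changes a filter before it finishes

data = [[1, 2], [3]]
stale = run_in_chunks(data, first_flag)
fresh = run_in_chunks(data, second_flag)
```

Here the superseded query returns nothing, while the query for the user's latest request runs to completion.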

An Elastically Scalable, Modern Query Engine

  1. Zoomdata Application Server
  2. Modern Query Engine -- you are here
  3. Smart Data Connectors
  4. JavaScript SDK and RESTful APIs
  5. Microservices architecture
  6. Adaptable security model

Contact

Sales: +1-571-279-6166

General Inquiries: +1-571-279-6000

sales@zoomdata.com