Why BI Must Move to Streaming Analytics

Business Requirements for Streaming Data Analytics

  • Streaming data analytics can mean several things, depending on context. At Zoomdata, it means a platform for people to explore and interact with data while it updates in near-real-time. To be relevant and useful across a broad range of users and scenarios, customers have guided us to build a modern BI platform that fulfills the following requirements for streaming data analytics.

Stream Data in Near-Real-Time

  • Zoomdata delivers fresh data to the dashboard without requiring users to do anything -- no pressing F5 or hitting the “refresh” button. Unlike in-stream analytics for automated, split-second decision-making (for example, to open or shut a valve), streaming data analytics for BI aggregates a variety of data for human decision-making. Human eyes can’t differentiate between split-second updates, and different use cases require different rates of refresh; one data source could be updated every few seconds, for example, whereas every minute or so would work great for another data source. Refresh rates in Zoomdata are configurable.

Interactivity, Exploration, and Dynamic Calculations

  • Exploring live data should offer the same rich experience as working with historical data. To understand what’s going on across a variety of data points -- to become situationally aware -- business users need to dynamically filter, sort, group, and drill down on live data. They also want to build new charts, create new derived fields, and even calculate new measures on the fly. That’s a tall order when the data is constantly rolling in, and Zoomdata does it.

Sometimes You Want to Stop Time

  • For the same reasons video surveillance is recorded, sometimes you want to pause, rewind, and play back data streams so you can better see, understand, and communicate events. What’s even more powerful, however, is delivering the Data DVR as a standard feature for interactive, visual dashboards so you can better understand what preceded current events and why.
  • There is one attribute common to all data streams: time. Every event, from a heartbeat to a purchase to an airplane departure, can be associated with a point in time. Zoomdata holds several patents related to streaming data visualization, and how we work with time is a key part of our enabling technology.
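
As a rough illustration of why time is the key attribute, the pause-and-rewind idea can be sketched as a buffer of events kept in timestamp order. This is a minimal sketch under stated assumptions, not Zoomdata's patented implementation: the StreamRecorder class, its methods, and the sample events are all hypothetical names invented for this example.

```python
from bisect import bisect_left, bisect_right
from dataclasses import dataclass, field


@dataclass
class StreamRecorder:
    """Hypothetical recorder that keeps events ordered by timestamp for replay."""

    timestamps: list = field(default_factory=list)
    payloads: list = field(default_factory=list)

    def record(self, ts: float, payload: dict) -> None:
        # Insert in timestamp order so late-arriving events still replay correctly.
        i = bisect_right(self.timestamps, ts)
        self.timestamps.insert(i, ts)
        self.payloads.insert(i, payload)

    def replay(self, start: float, end: float) -> list:
        """Return (timestamp, payload) pairs with start <= ts < end, oldest first."""
        lo = bisect_left(self.timestamps, start)
        hi = bisect_left(self.timestamps, end)
        return list(zip(self.timestamps[lo:hi], self.payloads[lo:hi]))


recorder = StreamRecorder()
recorder.record(10.0, {"event": "purchase", "amount": 25})
recorder.record(12.0, {"event": "purchase", "amount": 40})
recorder.record(11.0, {"event": "heartbeat"})  # arrives out of order

# "Rewind" to the window [10, 12) and play it back in time order.
rewound = recorder.replay(10.0, 12.0)
```

Because every event carries a timestamp, rewinding reduces to selecting a time range, which is also why the same mechanism works for any kind of event.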

Don’t Forget The Past

  • Business users often need to aggregate or compare near-real-time data with data from the past. The problem is that neither you nor they necessarily know how much historical data they will need until they need it. Twenty-four hours? One month? Five years? Streaming data analytics for BI purposes should assume that data doesn’t expire and that users should be able to work seamlessly with live data together with historical data. Zoomdata easily supports a variety of use cases that require “hot” and “cold” (or warm) data, including the real-time data warehouse and big data operational analytics.

Zoomdata customers put streaming data analytics to work

Streaming Data & Big Data & Embed

One of Zoomdata’s customers, a brand analytics software provider, embedded Zoomdata into its software to enable its customers to see and interact with analytics in new ways. The company’s customers are primarily in the retail, transportation, and financial services markets. They need to visualize sales transactions, social media mentions, and other factors that influence their brands in near-real-time. With Zoomdata embedded in the software, these customers can:

  • Visualize petabytes of data in near real time from millions of customer transactions
  • Directly query data where it lives, in data lakes, modern data platforms, and legacy data warehouses
  • Use artificial intelligence to correlate customer data
  • Turn data into actionable insights that deliver quality brand experiences in near real-time

Payment Processing: Customer Case Study

There’s more! Learn how Cielo S.A. uses Zoomdata in Live Mode to:

  • Reduce analytic latency from 30 days to under one hour
  • Monitor card usage statistics to fine-tune relevant offers
  • Respond to data processing network anomalies as they occur

FAQs about Streaming Data Analytics

A: Zoomdata keeps two-way WebSocket connections open between the users’ web browsers and the Zoomdata Query Engine. When in Live Mode, Zoomdata polls the data source for data that arrives within a configurable time window (e.g., a few seconds, or a minute), processes it, and then pushes the aggregated and calculated results to the users’ dashboards.
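
The poll, aggregate, and push cycle described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the Zoomdata Query Engine: the function names, the (timestamp, region, amount) row layout, and the in-memory list standing in for a data source are all hypothetical, and the "push" is simulated by yielding each result instead of sending it over a WebSocket.

```python
from collections import defaultdict


def poll_window(rows, window_start, window_end):
    """Aggregate rows whose timestamp falls in [window_start, window_end)."""
    totals = defaultdict(float)
    for ts, group, value in rows:
        if window_start <= ts < window_end:
            totals[group] += value
    return dict(totals)


def live_updates(rows, start, end, window_seconds):
    """Yield one aggregated result per time window, as a server might push them."""
    window_start = start
    while window_start < end:
        window_end = min(window_start + window_seconds, end)
        yield window_start, poll_window(rows, window_start, window_end)
        window_start = window_end


# Hypothetical (timestamp_seconds, region, sales_amount) rows.
rows = [(0, "east", 10.0), (3, "west", 5.0), (6, "east", 7.5), (9, "west", 2.5)]

# A 5-second window produces two updates for the 10 seconds of data.
updates = list(live_updates(rows, start=0, end=10, window_seconds=5))
```

The window_seconds parameter plays the role of the configurable time window: a smaller value means fresher dashboards at the cost of more frequent queries against the data source.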

A: Any supported data source that includes some sort of query and processing engine, such as traditional databases and high-performance query engines for big data such as Apache Impala and Presto, can be used in Live Mode. Plain file data sources such as Amazon S3 and HDFS need a query engine in order to work in Live Mode. Zoomdata recommends using modern data sources that operate as “fast data sinks” for Live Mode.

A: A “fast data sink” is a fancy term for any database or data platform that can be configured to handle fast writes and many concurrent reads (queries).

A: Some good examples of fast data sinks are Apache Impala on top of Kudu or Parquet files in HDFS; search-engine databases such as Elasticsearch, Cloudera Search, and Solr; MemSQL; Snowflake; and the like. You can use a traditional database, but test first to make sure it can handle the data volume and refresh rates at a level acceptable to your user community.

A: Data is “landed” in fast data sinks using any number of stream processors, such as Kafka Connect, Google Cloud Dataflow, NiFi, Spark Streaming, SQLStream, StreamSets, and Zoomdata’s built-in Stream Writer Service. Some stream processors can rapidly clean, enrich, and transform data before it is landed in the fast data sink.

A: The business requirements for near-real-time business intelligence demand a complete data platform that goes beyond visualizing an event stream. Fast data sinks with “Live Mode” enabled can scale to handle ad hoc end-user exploration of potentially massive amounts of live, historical, and other related data.

A: Lambda architectures attempt to combine batch and stream processing, and are tricky to set up and expensive to maintain. Instead, Zoomdata recommends landing and keeping data in a fast data sink, and using Live Mode to get near-real-time updates. If old data needs to be rolled off the fast data sink into warm or cold data storage, you can use one of Zoomdata’s Multisource Many Ways™ techniques to work concurrently with live and historical data.
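
To make the idea of working across live and rolled-off data concrete, here is a generic sketch of splitting one query range at a roll-off cutoff and merging the per-group results. It is not a description of how Zoomdata's techniques work internally: split_range, combine_totals, the cutoff value, and the sample totals are all hypothetical.

```python
from collections import Counter


def split_range(start, end, cutoff):
    """Split [start, end) into a cold part before cutoff and a hot part from cutoff on."""
    cold = (start, min(end, cutoff)) if start < cutoff else None
    hot = (max(start, cutoff), end) if end > cutoff else None
    return cold, hot


def combine_totals(cold_totals, hot_totals):
    """Merge per-group sums from the two stores; the ranges must not overlap."""
    combined = Counter()
    combined.update(cold_totals)
    combined.update(hot_totals)
    return dict(combined)


# Data older than timestamp 60 is assumed to have been rolled off to cold storage.
cold_range, hot_range = split_range(0, 100, cutoff=60)

# Hypothetical per-region totals returned by each store for its part of the range.
cold_totals = {"east": 100.0, "west": 40.0}
hot_totals = {"east": 5.0, "north": 2.0}
combined = combine_totals(cold_totals, hot_totals)
```

Splitting at a single cutoff keeps each event in exactly one store's range, so simple sums can be added without double counting.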

A: Zoomdata does not ingest or store data, but does offer a Stream Writer Service that can land data from a stream to a database.

A: The data must be indexed or partitioned by a timestamp field.
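
The reason for this requirement is that each Live Mode refresh asks for a narrow time range, and a timestamp index (or partition) lets the data source find that range without scanning everything. The sketch below illustrates the idea with Python's built-in sqlite3 module as a stand-in database; the events table, its columns, and the sum_in_window helper are assumptions made up for this example.

```python
import sqlite3

# In-memory database standing in for a real data source.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (ts INTEGER NOT NULL, amount REAL NOT NULL)")
# Index on the timestamp column so range queries can seek instead of scan.
conn.execute("CREATE INDEX idx_events_ts ON events (ts)")
conn.executemany(
    "INSERT INTO events (ts, amount) VALUES (?, ?)",
    [(100, 10.0), (105, 20.0), (110, 30.0), (115, 40.0)],
)


def sum_in_window(start_ts, end_ts):
    """Sum amounts for events with start_ts <= ts < end_ts."""
    row = conn.execute(
        "SELECT COALESCE(SUM(amount), 0.0) FROM events WHERE ts >= ? AND ts < ?",
        (start_ts, end_ts),
    ).fetchone()
    return row[0]


# One "refresh" covering the window [105, 115).
window_total = sum_in_window(105, 115)

# Ask SQLite how it would run the range query; the exact wording of the plan
# varies by SQLite version, so it is printed rather than asserted here.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT SUM(amount) FROM events WHERE ts >= ? AND ts < ?",
    (105, 115),
).fetchall()
print(window_total, plan)
```

On large tables the same pattern applies to partitioning: a query bounded by timestamp only needs to touch the partitions that overlap the requested window.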

