Interacting with billions of rows of data in seconds from a single source is exciting, but big data exploration gets really interesting when you join data from multiple sources -- without needing to move it into a data warehouse or mart. Maybe you have application usage data in Hadoop, but you’d like to enrich that data with some customer demographic reference information or transactional data stored in an Oracle database. Or suppose you have a stream of sensor readings and you want to perform calculations across that real-time stream and include historical metrics. Or you might have product reviews indexed in Elasticsearch that you’d like to correlate with product purchase history in an enterprise data warehouse.
All of these scenarios require data blending across multiple sources to get the best insights and to get them fast, increasing the intelligence of your big data analytics efforts while improving the user experience.
In the past, blending and analyzing data from multiple sources required two steps. First, you would create data warehouses or data marts; then you would physically copy and aggregate data into them from the original systems. This two-step process has several limitations:
In addition, the optimum performance of many mobile applications requires blending data from multiple sources.
The availability of modern, highly scalable in-memory data storage and processing frameworks such as Apache Spark has made it possible to take a radically fresh approach to data blending. Zoomdata Fusion makes multiple sources, including relational data sources, appear as one data set without physically moving the data to a common data store. Best of all, everyday users and business analysts can join sources without having to wait for a data architect to set it up.
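To make the idea of joining sources in place more concrete, here is a minimal sketch in plain Python. It is not Zoomdata Fusion's implementation or API. It only illustrates the general technique of joining two separate sources on a shared key at query time instead of first copying them into a common store. The usage events, the `customers` table, and the `minutes_by_region` helper are all hypothetical names invented for this illustration.

```python
import sqlite3
from collections import defaultdict

# Hypothetical source 1: application usage events
# (a stand-in for usage data kept in a system such as Hadoop).
usage_events = [
    {"customer_id": 1, "minutes": 30},
    {"customer_id": 2, "minutes": 45},
    {"customer_id": 1, "minutes": 15},
    {"customer_id": 3, "minutes": 20},
]

# Hypothetical source 2: customer reference data
# (an in-memory SQLite table standing in for a relational database).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, region TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(1, "East"), (2, "West"), (3, "East")],
)


def minutes_by_region(events, db):
    """Join usage events to customer regions at query time; sum minutes per region."""
    # Read the small reference table once and key it by the join column.
    region_of = dict(db.execute("SELECT customer_id, region FROM customers"))
    totals = defaultdict(int)
    for event in events:
        region = region_of.get(event["customer_id"])
        if region is not None:  # inner join: skip events with no matching customer
            totals[region] += event["minutes"]
    return dict(totals)


result = minutes_by_region(usage_events, conn)
print(result)
```

Neither source is copied into the other's store. The join happens in memory when the question is asked, which is the property the blended approach relies on.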
Once defined, a fused source can be used like any other source in Zoomdata. All the interactive visualizations are available for a blended source. Zoomdata Fusion automatically determines when and how to access the different sources to derive a given visualization without the end user having to worry about how it works. Fused sources can also be used within dashboards and embedded applications, providing users with seamless access to disparate data.
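One way to picture the "when and how to access the different sources" decision is as a lookup from requested fields to the sources that own them. The sketch below is an assumption made for illustration, not a description of how Zoomdata Fusion actually plans its queries. The `SOURCE_FIELDS` catalog, the `JOIN_KEY` constant, and the `sources_needed` function are hypothetical.

```python
# Hypothetical catalog: which fields each underlying source provides.
SOURCE_FIELDS = {
    "usage": {"customer_id", "minutes"},
    "customers": {"customer_id", "region", "age_group"},
}

# Hypothetical join key shared by every source in the fused definition.
JOIN_KEY = "customer_id"


def sources_needed(requested_fields):
    """Return the sorted names of the sources a visualization must query."""
    needed = set()
    for field in requested_fields:
        if field == JOIN_KEY:
            continue  # the key exists in every source, so it never forces a join
        owners = [name for name, fields in SOURCE_FIELDS.items() if field in fields]
        if not owners:
            raise KeyError(f"no source provides field {field!r}")
        needed.add(owners[0])
    if not needed:
        # Only the join key was requested; any single source can answer.
        needed.add(sorted(SOURCE_FIELDS)[0])
    return sorted(needed)


single = sources_needed(["minutes"])
blended = sources_needed(["minutes", "region"])
print(single, blended)
```

Under these assumptions, a chart that uses only `minutes` touches one source, while a chart that groups `minutes` by `region` requires both sources and a join on the key.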
Zoomdata comes bundled with Apache Spark, pre-configured out of the box. If you already have a Spark cluster, Zoomdata can be configured to use it instead.