This week, Cloudera announced their production release of both Cloudera Enterprise 5.10 and Kudu 1.2. Zoomdata has been working closely with Cloudera on this development as we believe it’s an important step towards the future of streaming analytics. Congratulations to the Cloudera team on the release of Kudu 1.2, a Hadoop storage engine for fast analytics on fast data!
Kudu, a storage engine for real-time analytics, has come a long way since its initial conception. In conjunction with Zoomdata, Cloudera further developed Kudu to simplify interactive access to big data in Hadoop. With Zoomdata’s native integration with Cloudera Enterprise, including Zoomdata Data Sharpening™, our joint customers can now run analytic queries in real time and use Zoomdata’s Data DVR to pause, rewind, and replay the combined real-time and historical data stream to understand what happened.
How Does Kudu Work?
Kudu is a new storage engine for fast analytics on fast (rapidly changing) data. It is purpose-built for use cases such as time series data, machine data analytics, and online reporting, as part of a complete analytic or operational database. Kudu combines fast inserts and updates with efficient columnar scans, so multiple real-time analytic workloads can share a single storage layer. It is engineered to take advantage of next-generation hardware and in-memory processing, and it significantly lowers query latency for both Apache Impala (incubating) and Apache Spark. Data inserted into Kudu is immediately available to Zoomdata for low-latency analytic queries.
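To make the pairing of fast inserts and updates with columnar scans more concrete, here is a minimal Python sketch of the idea. It is a toy in-memory model, not Kudu's API or storage format: the class `ToyColumnarTable`, its methods, and the sensor data are all hypothetical names invented for this illustration.

```python
from typing import Any, Dict, List


class ToyColumnarTable:
    """Toy in-memory table (hypothetical): one list per column plus a key index."""

    def __init__(self, key: str, columns: List[str]) -> None:
        self.key = key
        # Each column is stored as its own list, so a scan reads only that list.
        self.columns: Dict[str, List[Any]] = {name: [] for name in [key, *columns]}
        self._row_of_key: Dict[Any, int] = {}  # primary key -> row position

    def upsert(self, row: Dict[str, Any]) -> None:
        """Insert a new row, or overwrite the row with the same key in place."""
        pos = self._row_of_key.get(row[self.key])
        if pos is None:
            self._row_of_key[row[self.key]] = len(self.columns[self.key])
            for name, values in self.columns.items():
                values.append(row.get(name))
        else:
            for name, value in row.items():
                self.columns[name][pos] = value

    def scan(self, column: str) -> List[Any]:
        """Read a single column without touching the others."""
        return list(self.columns[column])


table = ToyColumnarTable(key="sensor_id", columns=["ts", "temperature"])
table.upsert({"sensor_id": "a", "ts": 1, "temperature": 20.5})
table.upsert({"sensor_id": "b", "ts": 1, "temperature": 22.0})
table.upsert({"sensor_id": "a", "ts": 2, "temperature": 21.5})  # update, not a new row

temps = table.scan("temperature")
average = sum(temps) / len(temps)
```

The updated reading for sensor `a` is visible to the very next scan, and the average is computed from the `temperature` column alone. A real engine adds durability, partitioning, and encoding on top of this basic shape.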
Working with All Data
Kudu provides unified storage without the need to explicitly address append-only vs. random-access operations. It eliminates the need for any parallel infrastructure to provide data compression, and it lets queries cover the most recent data and historical data together. This simplified structure leads to more precise analytics. Companies using Kudu don’t have to worry about the performance issues that arise when files are compacted at the wrong time for other programs. As the pace of data ebbs and flows with periodic trends, converting files into a different format at the wrong time can cause lapses in analytic accuracy. Cloudera and Zoomdata optimized the Kudu interaction model to streamline the entire process, keeping companies efficient and current with even the largest datasets.
Ideal for Enterprises across Multiple Industries
Kudu is a viable foundation for big data visual analytics across many kinds of initiatives. Businesses tracking their network activity can sort data and run ad hoc analytic queries on the fly. They can also run analytics on their entire data set, even with massive amounts of fast data. E-commerce companies can track patterns across their data in real time. For example, you can sort data by product category and then examine trends by consumer income, gender, location, and more. Then, watch as the current streaming data evolves based on actual consumer traffic.
Combining Kudu with Zoomdata
Once data is inserted into Kudu, Zoomdata can use it via Impala to run analytic queries immediately. Because the data is available right away, customers can monitor pattern changes as they happen. If something unusual occurs in the data stream, customers can use Zoomdata’s Data DVR to pause, rewind, and re-examine the data behavior. To further improve the user experience, Kudu works with Zoomdata’s Data Sharpening algorithms, which process data tables with billions of rows at lightning speed. Our customers see analytics results in seconds, and those results get clearer as the remainder of the query is processed.
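The idea of results that get clearer as more of a query is processed can be sketched as progressive refinement of an aggregate. The Python below is only an illustration of that general pattern. Zoomdata's actual Data Sharpening algorithms are not described in this post, and the function name `progressive_mean` and the sample chunks are assumptions made for the example.

```python
from typing import Iterable, Iterator, List, Tuple


def progressive_mean(chunks: Iterable[List[float]]) -> Iterator[Tuple[int, float]]:
    """Yield (rows_seen, running_mean) after each chunk is processed."""
    total = 0.0
    count = 0
    for chunk in chunks:
        if not chunk:
            continue  # nothing new to report for an empty chunk
        total += sum(chunk)
        count += len(chunk)
        # An early, approximate answer that is refined with every further chunk.
        yield count, total / count


# Hypothetical data, split into the chunks a long-running query might return.
chunks = [[10.0, 20.0], [30.0, 40.0], [50.0, 60.0]]
estimates = list(progressive_mean(chunks))
```

Each element of `estimates` is a usable answer based on the rows seen so far, and the last one is the exact mean over all rows. A dashboard following this pattern can draw the first estimate right away and redraw as later ones arrive.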
Kudu works so well with Zoomdata because it is natively formatted to integrate with Impala and the entire existing Hadoop ecosystem. All of the optimization technology Zoomdata has created for Impala now works perfectly with the Kudu-based method.