Building Real-Time BI Systems With Kafka, Spark, And Kudu

One of the key challenges in working with real-time and streaming data is that the format used to capture data is not necessarily the best format for ad hoc analytic queries. For example, Avro is a convenient and popular serialization format that works well for initially bringing data into HDFS. Avro has native integration with Flume and other tools, which makes it a good choice for landing data in Hadoop. But columnar file formats, such as Parquet and ORC, are much better optimized for ad hoc queries that aggregate over large numbers of similar rows.
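
To make the row-versus-column distinction concrete, here is a minimal pure-Python sketch. It does not use Avro, Parquet, or ORC. It only mimics the two layouts in memory: a list of records stands in for a row-oriented capture format, and a dictionary of lists stands in for a columnar one. The event fields (`user`, `page`, `latency_ms`) and the helper names are hypothetical and chosen for illustration.

```python
from collections import defaultdict

# Hypothetical click events, arriving one complete record at a time.
# This mirrors a row-oriented layout: all fields of a record sit together.
rows = [
    {"user": "a", "page": "/home", "latency_ms": 120},
    {"user": "b", "page": "/cart", "latency_ms": 340},
    {"user": "a", "page": "/home", "latency_ms": 95},
    {"user": "c", "page": "/cart", "latency_ms": 210},
]


def to_columns(records):
    """Pivot row-oriented records into one list of values per field."""
    columns = defaultdict(list)
    for record in records:
        for field, value in record.items():
            columns[field].append(value)
    return dict(columns)


def mean_latency_by_page(columns):
    """Aggregate using only the two columns the query needs.

    The "user" column is never read, which is the property that
    lets columnar layouts skip irrelevant data during aggregation.
    """
    totals = defaultdict(lambda: [0, 0])  # page -> [sum, count]
    for page, latency in zip(columns["page"], columns["latency_ms"]):
        totals[page][0] += latency
        totals[page][1] += 1
    return {page: total / count for page, (total, count) in totals.items()}


columns = to_columns(rows)
result = mean_latency_by_page(columns)
print(result)
```

Appending a new event is natural in the row layout, since it is a single record. The aggregate, by contrast, only needs `page` and `latency_ms`, so the columnar layout lets it ignore every other field. Real columnar formats add encoding and compression on top of this idea, which the sketch does not attempt to model.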


Copyright © 2020 Zoomdata™ All Rights Reserved Terms of Use | Privacy Policy | EOE Statement