Blending Traditional and Modern Data Sets
Watch this video to learn about the challenge of blending traditional and modern data sets.
Today there are many sources of data. There’s structured data stored in relational databases. Unstructured data can live in Elasticsearch, Solr, or even social media. IoT devices like cars, household appliances, and smartphones produce a constant stream of real-time or near-real-time data. Then there are the modern data sources and platforms like Spark, Hadoop, Impala, and Kudu. The ability to blend this data is critical to extracting the maximum analytical value from it.
My name is Mike McCarty. I'm a senior software engineer, focused on big data applications and visualizations.
One of the primary challenges we have in the industry is trying to blend traditional data sets with modern data sets, whether they're structured and coming out of a relational database, or unstructured and stored in Elasticsearch or Solr, or even on social media. Some of the sources even exist on external hard drives that aren't even plugged in, or on IoT devices such as toasters and refrigerators. We're dealing with a lot of log files as well, coming from servers.
So the challenge is trying to grab all this data, blend it with data coming out of some of the modern platforms like Spark and Hadoop, whether it lives on HDFS, in Impala, or in Kudu, and extract useful knowledge from it.