Picking the Right Data Store: What is Hadoop?
Watch this video to learn, the democratizing effect Hadoop has had on big data and data analytics.
Before Hadoop, querying large amounts data meant expensive proprietary hardware and software. Not anymore. Hadoop, an open-source project that can be deployed on commodity software, has evolved beyond MapReduce to solve a lot of different problems.
What is Hadoop? That's a great question. So, imagine 10 years ago, before there was Hadoop, any time you wanted to query large amounts of data, you had to spend a vast amount of money on database software and servers. The servers were specialized and super expensive. The software was crazy expensive. That's how Larry (Larry Ellison) bought an island in Hawaii.
What's Hadoop? Hadoop is an open source project designed to leverage super cheap hardware and scale out to be able to run queries against that data. Initially, Hadoop started off with MapReduce. So, this is how the Googlers decided they needed to work out what were the relationships between different pages so that they could do pagerank, so that they could solve the problem of search and actually show people interesting results when they searched for something on the web.
That algorithm was a slow but massively scalable algorithm and was super useful. So, Hadoop since then has evolved into a lot of projects to solve a lot of particular problems. It's gone beyond MapReduce, but the concept remains the same, which is high scale out, using cheap hardware, open source software to be able to do the kinds of things that used to require proprietary hardware and proprietary software.