Every business decision taken today should be driven by data. The collection of data has become big business itself as companies try to make the best decisions possible based on what the data reveals rather than on assumption or guesswork.
Data is usually collected to confirm whether a hypothesis is correct. Someone already has a question in their mind that they want to answer. Whatever data they collect is meant to provide that answer. Charts are made, reports get written, data gets analyzed, and answers are disseminated across the organization confirming whether the hypothesis is correct. A decision might get made, and that’s usually where things end.
That process is fine, but it assumes the data you have is static and only useful for answering predefined questions.
Data that you have already collected can serve more than one purpose. With the right data visualization tool, you can turn your data into charts and analyses that answer questions you didn’t even know you had.
What if by simply looking at your data you could find new trends, lines of inquiry, and business opportunities?
That’s where Exploratory Data Analysis (EDA) comes in. With EDA you can use charts, graphs, and other visualizations as a starting point for investigating almost every part of your data and how it all relates. You can spot relationships that may not be apparent, find trends that surprise you, and make new predictions using the data you already have.
Here are 5 ways you can use exploratory data analysis to begin seeing what your own data can reveal:
1. Box Plots - Your starting point
Box plots are the first step in EDA for many data scientists. Developed by famed statistician John Tukey (who also pioneered EDA!), a box plot is a great way to get a visual sense of an entire range of data.
A box plot divides data into its quartiles. The “box” spans the range between the first and third quartiles, and the median is drawn as a line somewhere inside the box. Lines known as “whiskers” extend from each end of the box to the most extreme values that are not outliers. If there are any outliers, those can be plotted as individual points beyond the whiskers.
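To make those pieces concrete, here is a minimal Python sketch of the numbers a box plot is built from. It rests on a few assumptions that are not in the text above: outliers are defined with the common 1.5 × IQR rule (charting tools may use a different cutoff), quartiles are computed with the standard library's "inclusive" method (one of several quartile conventions), the function name box_plot_summary is hypothetical, and the daily order counts are made up for illustration.

```python
import statistics


def box_plot_summary(values, whisker_factor=1.5):
    """Return the quartiles, whisker ends, and outliers for a box plot.

    Assumption: a value is an outlier if it lies more than
    whisker_factor * IQR beyond the first or third quartile.
    """
    data = sorted(values)

    # Three cut points split the data into four quartiles.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")

    # The interquartile range (IQR) is the height of the "box".
    iqr = q3 - q1
    low_fence = q1 - whisker_factor * iqr
    high_fence = q3 + whisker_factor * iqr

    # Whiskers end at the most extreme values that are not outliers.
    inside = [v for v in data if low_fence <= v <= high_fence]
    outliers = [v for v in data if v < low_fence or v > high_fence]

    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        "outliers": outliers,
    }


# Made-up sample: daily order counts with one unusually busy day.
daily_orders = [12, 15, 14, 10, 18, 16, 13, 15, 17, 48]
summary = box_plot_summary(daily_orders)
print(summary)
```

With this sample, the box runs from 13.25 to 16.75, the whiskers reach 10 and 18, and the day with 48 orders falls outside the upper fence, so it would be drawn as a separate outlier point. That single point is the kind of detail a box plot surfaces quickly and that is worth a closer look.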