Analyzing Textual Data
In this video, we take a look at the value of analyzing text data, especially in narrative forms like social media, clinical notes, and product reviews.
It’s easy to aggregate and store this information, but extracting knowledge from it requires special text analytics tools such as sentiment analysis, natural language processing, and computational linguistics. With these tools, we can assess things like the mood of a Tweet or the truthfulness of a product review. In a clinical setting, text analysis can add context to test results and other forms of quantitative medical data.
My name is Mike McCarty. I'm a senior software engineer, focused on big data applications and visualizations. One of the areas that's really interesting to me right now is textual data stored in the form of narratives. So, we've got a lot of comments coming off social media, or even product reviews, that have a large amount of text in them.
Healthcare Records -- Analyzing Clinical Notes
Another area is healthcare records, where a professional is writing down a whole boatload of information. It's easy to get that information written down and stored. But then trying to go back over it and extract knowledge from these text blobs is a very challenging problem.
Sentiment Analysis and Natural Language Processing
So, it's a great application of techniques like sentiment analysis, natural language processing, and computational linguistics. Some of the things we're trying to do with these textual narratives are, for example, with political Tweets, trying to understand the general mood of the person writing the Tweets, or, with Amazon reviews, assessing the quality of the review and whether the person writing the review is lying.
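To make the idea of scoring the mood of a Tweet or review concrete, here is a minimal sketch of lexicon-based sentiment scoring in Python. It is not the method used in the video, which does not name one. The word lists and the function names `sentiment_score` and `label` are hypothetical and chosen only for illustration. Production systems typically rely on much larger lexicons or trained models.

```python
import re

# Hypothetical, tiny word lists for illustration only; real lexicons are far larger.
POSITIVE = {"great", "love", "excellent", "good", "happy"}
NEGATIVE = {"bad", "terrible", "hate", "poor", "angry"}


def sentiment_score(text):
    """Return (positive hits - negative hits) / token count, a value in [-1, 1]."""
    # Lowercase the text and keep runs of letters and apostrophes as tokens.
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    return (pos - neg) / len(tokens)


def label(text):
    """Map the numeric score to a coarse mood label."""
    score = sentiment_score(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


print(label("I love this great product"))
print(label("Terrible policy, I hate it"))
```

A word-counting approach like this misses negation, sarcasm, and context, and it says nothing about whether a reviewer is being truthful. That is part of why extracting knowledge from narrative text is a hard problem.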
Quantitative and Qualitative Analysis
By using quantitative and qualitative analysis of unstructured data, organizations can make better data-driven decisions, and they can do this across multiple dimensions such as time, geography or product space. This opens up a whole new spectrum in the variety of information we can extract from data.
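As a rough illustration of analyzing across dimensions such as time, geography, or product space, the sketch below averages per-document sentiment scores along a chosen dimension. The records, the field names, and the helper `average_by` are all hypothetical. The scores are assumed to come from some earlier text analysis step that is not shown here.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (month, region, product, sentiment score in [-1, 1]).
records = [
    ("2024-01", "US", "widget", 0.5),
    ("2024-01", "EU", "widget", -0.5),
    ("2024-02", "US", "gadget", 0.25),
    ("2024-02", "US", "widget", 0.75),
]

# Position of each dimension inside a record tuple.
FIELDS = ("month", "region", "product")


def average_by(rows, dimension):
    """Group rows by one dimension and return the mean score per group."""
    idx = FIELDS.index(dimension)
    groups = defaultdict(list)
    for row in rows:
        groups[row[idx]].append(row[3])
    return {key: mean(scores) for key, scores in groups.items()}


print(average_by(records, "month"))
print(average_by(records, "region"))
print(average_by(records, "product"))
```

The same scored text can be sliced by time, by geography, or by product without rerunning the text analysis, which is what makes this kind of multi-dimensional view inexpensive once the scores exist.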