From the course: Data Science Foundations: Data Mining in R

R for data mining

- [Barton] Back in 1890, William James, the founder of American psychology and my professional hero, wrote about how newborn babies are inundated with sensory stimuli. He called it a great blooming, buzzing confusion. It takes time and effort for the baby to make sense of it all: what's me, what's not me, what matters, and what can be safely ignored.
That's a feeling that comes to a lot of people when they start working with data, especially large and diverse data sets. The problem is that data, which holds the secrets to so much value, can be an embarrassment of riches. You have so much data available to you, with so much more coming in, but you don't know what to do with it. How can you make sense of it? How can you find the real value in such an enormity of raw materials? What matters, and what can be safely ignored?
But rather than having to wait like a newborn for the very slow processes of cognitive and social development, we can find some immediate answers to our questions by turning to data mining. I'm Barton Poulson, and in this course, we'll explore some of the most important principles and techniques in modern data mining to help you cut through the noise.
We'll look at some of the most useful methods for dealing with the deluge of data. Those methods include dimensionality reduction to help you sort through the noise and find reliable indicators in your data, clustering techniques to help you sort cases into useful groups, classification to automate some of the most difficult work of categorization, association analysis to find if/then predictions in your data, time series techniques to describe and predict valuable temporal patterns, and methods for mining text, with a special focus on locating the critical evaluations that people leave in their unstructured data. We'll focus on the hands-on practice of mining data using one of my favorite tools, the R programming language and the associated RStudio environment.
You'll see how you can quickly use both simple and sophisticated methods to get a better understanding of how to mine data for value and, in turn, reach the goals that are important to you. All of these will help you sort through the great blooming, buzzing confusion of the data world. And with that, let's get started with Data Science Foundations: Data Mining with R.
