From the course: Data Science Foundations: Data Mining in Python

Dimensionality reduction overview

- [Instructor] Data mining comes into its own when you're dealing with lots of data. And you may be familiar with the three V's that defined big data: volume, velocity, and variety. But it turns out that, paradoxically, you can have too much of a good thing. Having lots of cases or observations is generally good, but things get much more complicated when you have many, many dimensions, or fields, or features, or variables. And there are a few reasons why that can be a special challenge. Some of the problems that come up with high dimensionality include exponentially increasing complexity and computational demands. You have so many more possibilities now, and it becomes so much harder to compute. That's because each dimension isn't additive, it's multiplicative. Also, with each variable, you have idiosyncratic error. That means each variable doesn't perfectly represent the thing that you're hoping it measures. It…
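
The point about dimensions being multiplicative rather than additive can be sketched in a few lines of Python. This is a minimal illustration and not part of the course materials: the function names `grid_cells` and `additive_total`, and the choice of 10 bins per variable, are assumptions made for the example. It counts how many distinct cells you would need to cover if every variable were split into the same number of bins.

```python
def grid_cells(bins_per_variable: int, n_variables: int) -> int:
    """Cells in a grid where every variable is split into the same number of bins.

    Each added variable multiplies the number of cells (hypothetical helper).
    """
    return bins_per_variable ** n_variables


def additive_total(bins_per_variable: int, n_variables: int) -> int:
    """What the total would be if each variable only added its bins (for contrast)."""
    return bins_per_variable * n_variables


if __name__ == "__main__":
    bins = 10  # assumed value, chosen only for illustration
    for n in (1, 2, 3, 10):
        print(
            f"{n:>2} variables: additive = {additive_total(bins, n):>3}, "
            f"multiplicative = {grid_cells(bins, n):,}"
        )
```

With a fixed number of observations, the multiplicative count quickly outgrows the data, so most cells end up empty. That is one way to see why many variables can make computation and estimation harder.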
