From the course: Data Science Foundations: Data Mining in Python
PCA - Python Tutorial
PCA
- [Instructor] By far the most common way to reduce dimensionality in a dataset is with principal component analysis, usually just called PCA. It takes only a few lines to do in Python. We'll begin by loading a few standard packages, including scikit-learn, or sklearn, which gives us the PCA functionality. Then I'm going to load our datasets. We're going to load the training data and split it so that the x attribute variables are separated from the y class variable, and then do the same thing for the testing data. Once we've done that, we can look at the first few rows of the training data. You can see that we have 64 attribute variables, zero through 63, where P is for pixel, and then the class variable, which indicates what the digit actually is, is here as y at the end. We have one, one, six and so on. We'll begin by training the model with the training data. We'll set up a PCA object. That's what we're…
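The steps described so far can be sketched roughly as follows. This is a minimal illustration, not the instructor's notebook: the course's own data files are not shown here, so the sketch assumes scikit-learn's built-in `load_digits` dataset (8x8 digit images, 64 pixel values each) as a stand-in, and the column names `P0` through `P63`, the 70/30 split, and the `random_state` value are all assumptions made for the example.

```python
import pandas as pd
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split

# Stand-in data (assumption): scikit-learn's bundled digits, 64 pixels per image.
digits = load_digits()
pixel_cols = [f"P{i}" for i in range(64)]  # hypothetical column names P0..P63
X = pd.DataFrame(digits.data, columns=pixel_cols)
y = pd.Series(digits.target, name="y")

# Separate training and testing data (the split ratio is an assumption).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=1, stratify=y
)

# Look at the first few rows: 64 attribute columns plus the class column y.
print(pd.concat([X_train, y_train], axis=1).head())

# Set up a PCA object and fit it on the training attributes only.
pca = PCA()
pca.fit(X_train)

# Project both sets onto the components learned from the training data.
X_train_pca = pca.transform(X_train)
X_test_pca = pca.transform(X_test)

# Share of variance captured by the first five components.
print(pca.explained_variance_ratio_[:5].round(3))
```

Fitting on the training data alone and then reusing the fitted object to transform the testing data keeps the test set from influencing the components.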