From the course: Introduction to Spark SQL and DataFrames
Basic machine learning with DataFrames, part 2
- [Instructor] We're going to look at another commonly used technique from machine learning, data science, and statistics called linear regression. Linear regression is useful when you have data in which you believe you can make predictions about one variable using knowledge about another variable. So for example, if you believe that knowing CPU utilization will allow you to guess the number of sessions or the amount of free memory, then linear regression would be a good technique to use. So once again, we'll use the utilization data, and I'll just load that. And as in the previous video, we're importing some code from the machine learning libraries in Spark. In particular, we're importing the VectorAssembler, which we have seen before, and we're also importing the linear regression models. So, what we want to do is create a vector with the feature columns that we're interested in.…
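To make the idea of predicting one variable from another concrete, here is a minimal sketch in plain Python, not the Spark code from the video. It fits a straight line by ordinary least squares to a single feature. The helper names `fit_line` and `predict`, and the sample values for CPU utilization and session count, are hypothetical and made up purely for illustration. Spark's linear regression fits the same kind of model, but over a vector of feature columns such as the one VectorAssembler builds.

```python
def fit_line(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Spread of x around its mean, and how x and y vary together.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """Predict y for a new x using the fitted line."""
    return intercept + slope * x


# Hypothetical sample data (not from the course files):
# CPU utilization as a fraction, and the session count seen at that load.
cpu_utilization = [0.2, 0.4, 0.6, 0.8]
session_count = [25.0, 45.0, 65.0, 85.0]

slope, intercept = fit_line(cpu_utilization, session_count)
estimated_sessions = predict(slope, intercept, 0.5)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, "
      f"estimated sessions at 50% CPU={estimated_sessions:.2f}")
```

The slope says how much the predicted session count changes per unit of CPU utilization, and the intercept is the prediction when utilization is zero. With several feature columns, the model learns one coefficient per feature instead of a single slope.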