From the course: Introduction to Spark SQL and DataFrames

Basic machine learning with DataFrames, part 2

- [Instructor] We're going to look at another commonly used machine learning, data science, or statistics technique called linear regression. Linear regression is useful when you believe you can make predictions about one variable using knowledge about another variable. So for example, if you think knowing CPU utilization will allow you to guess what the number of sessions is or what the free memory is, then linear regression would be a good technique to use. So once again, we'll use the utilization data, and I'll just load that. And as in the previous video, we're loading some code from the machine learning libraries in Spark. In particular, we're loading the VectorAssembler, which we have seen before, and we're also loading the linear regression models. So, what we want to do is create a vector with the feature columns that we're interested in.…
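To make the idea of predicting one variable from another concrete, here is a minimal sketch in plain Python rather than Spark. It does not use the VectorAssembler or Spark's linear regression classes mentioned in the video; it fits a straight line by ordinary least squares so the underlying arithmetic is visible. The function name `fit_line` and the sample values for CPU utilization and session count are made up for illustration and are not taken from the course's utilization data.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (the 1/n factors cancel).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical sample data: the predictor is CPU utilization,
# the value we want to guess is the number of sessions.
cpu_utilization = [0.2, 0.4, 0.6, 0.8]
session_count = [30, 50, 70, 90]

slope, intercept = fit_line(cpu_utilization, session_count)

# Predict the session count for a CPU utilization we have not observed.
predicted = slope * 0.5 + intercept
print(f"slope={slope:.2f}, intercept={intercept:.2f}, predicted={predicted:.2f}")
```

In Spark the same relationship would be learned from a DataFrame, with the predictor columns first packed into a single feature vector, which is the step the instructor turns to next.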
