From the course: Supervised Learning Essential Training

Checking your dataset for assumptions

- [Instructor] An important task in regression modeling is checking the validity of the assumptions made when fitting a linear regression model. Assumptions of linear models are never perfectly met in real life, but we want to see if they're reasonable to work with. We'll continue working with our student performance data for this.

First, we need to make sure there's actually a linear relationship between the target variable and at least one feature variable. The easiest way to do this is to create 2D plots of some feature variables against our target variable. That gives us a better understanding of which features are helpful in predicting our target.

Let's get started by loading in our data, and then let's just preview it. As you can see, we have our response variable, G3, at the end, and a ton of other feature variables. Let's do some data cleaning by dropping these protected columns. And then we can get started…
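The linearity check described above can be sketched roughly as follows. This is a minimal illustration, not the instructor's notebook: the tiny DataFrame is an invented stand-in for the student performance data, the column names other than G3 are assumptions, and the choice of `sex` and `age` as the "protected" columns is hypothetical, since the transcript does not name them.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Invented stand-in for the student performance data (values are illustrative only).
df = pd.DataFrame({
    "sex": ["F", "M", "F", "M", "F", "M", "F", "M"],
    "age": [15, 16, 17, 15, 18, 16, 17, 15],
    "studytime": [1, 2, 2, 3, 1, 4, 3, 2],
    "absences": [10, 4, 6, 0, 14, 2, 3, 8],
    "G1": [7, 11, 10, 15, 6, 17, 13, 9],
    "G3": [6, 11, 10, 15, 5, 18, 14, 9],
})

# Preview the data, as in the transcript.
print(df.head())

# Hypothetical list of protected columns; the transcript does not say which ones are dropped.
protected = ["sex", "age"]
features = df.drop(columns=protected)

target = "G3"
feature_cols = [c for c in features.columns if c != target]

# One scatter plot per remaining feature against the target.
fig, axes = plt.subplots(
    1, len(feature_cols), figsize=(4 * len(feature_cols), 3.5), sharey=True
)
for ax, col in zip(axes, feature_cols):
    ax.scatter(features[col], features[target])
    ax.set_xlabel(col)
axes[0].set_ylabel(target)
fig.tight_layout()
fig.savefig("feature_vs_g3.png")

# A numeric companion to the plots: Pearson correlation of each feature with the target.
correlations = features[feature_cols].corrwith(features[target])
print(correlations.sort_values())
```

A roughly straight-line cloud in a panel, together with a correlation far from zero, suggests that feature has a usable linear relationship with G3. The correlation is only a summary, so the plots remain the primary check for curvature or outliers.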