From the course: Data Science Foundations: Data Assessment for Predictive Modeling


Exploring and verifying data quality with the UCI heart dataset


- Okay, let's revisit the heart dataset after quite some time. We've made a little bit of progress on this analysis, so what would be on our mind next? Well, we certainly have to populate our cheat sheet with all the notes we have on the different variables: level of measurement, potential role, and questions that we want to ask the subject matter expert. We're going to assume that the rows we suggested should be discarded were approved to be discarded. In other words, they weren't valid data and needed to be set aside. But going back to the data, we're rapidly going to run into a roadblock, because we're going to want to move on to our univariate visualizations and our bivariate visualizations, looking for potential predictors. There are a number of things that will be on our mind, but we have a lot of missing data, and we have to figure out whether our row is a transactional row or an ID. And…
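The two checks mentioned here, how much data is missing and whether each row is a transaction or a single record per ID, can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the course's own workflow: the column names (`patient_id`, `age`, `chol`), the sample rows, and the set of tokens treated as missing are all hypothetical.

```python
from collections import Counter

# Assumption: a value counts as missing if it is None, empty, or "?".
MISSING_TOKENS = {None, "", "?"}


def missing_counts(rows):
    """Return {column: number of missing values} for a list of dict rows."""
    counts = {}
    for row in rows:
        for column, value in row.items():
            counts[column] = counts.get(column, 0) + int(value in MISSING_TOKENS)
    return counts


def row_grain(rows, id_column):
    """Report whether the ID column repeats, which suggests transactional rows."""
    per_id = Counter(row[id_column] for row in rows)
    repeated = {key: n for key, n in per_id.items() if n > 1}
    grain = "transactional" if repeated else "one row per ID"
    return grain, repeated


# Hypothetical sample rows, made up for this sketch.
rows = [
    {"patient_id": 1, "age": 63, "chol": 233},
    {"patient_id": 2, "age": 67, "chol": "?"},
    {"patient_id": 2, "age": 67, "chol": 286},
    {"patient_id": 3, "age": None, "chol": "?"},
]

missing = missing_counts(rows)
grain, repeated = row_grain(rows, "patient_id")
print(missing)
print(grain, repeated)
```

If the ID column never repeats, each row can be treated as one case; if it does repeat, the rows are more likely visits or transactions and may need to be aggregated before modeling.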
