From the course: Data Science Foundations: Data Assessment for Predictive Modeling

Challenge: Practice describe data with the UCI heart dataset

(electronic music) - [Instructor] Okay, we're going to continue working in the same spreadsheet, with that added Discards tab, which is currently empty. And that's your next challenge. To pick up where we left off, there's clearly some problematic data here in the UCI Heart spreadsheet. Identify the rows that you think are the most suspicious: not merely missing values, but rows that are problematic and reflect a data loading problem. Cut and paste them into the Discards tab as if you were preparing for a meeting with a subject matter expert. So specifically, here's your challenge. Identify rows that look like they don't belong, cut them from the Data tab, and place them in the Discards tab as if you were preparing for a meeting with a subject matter expert. And this should take a good 10 minutes, since you're not very familiar with the data file yet. The simple act of identifying the bad rows might be a bit quicker, but you want…
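
The challenge itself is done by hand in the spreadsheet, but the same idea can be sketched in code for readers who want to see it spelled out. The Python below is only a rough illustration under stated assumptions: the sample rows, the column names, and the missing-value markers ("?" and empty cells) are hypothetical stand-ins rather than the course's actual file, and the rule "a non-missing field that is not a number, or a row with the wrong number of fields, suggests a loading problem" is one possible heuristic, not the instructor's method.

```python
import csv
import io

# Hypothetical stand-in for the Data tab; these are NOT real UCI values.
SAMPLE = """age,sex,chol,thalach
63,1,233,150
67,1,?,108
name,Cleveland,--,9999x
41,0,204,172
"""

# Assumed markers for a merely missing value (not a loading problem).
MISSING = {"", "?"}


def is_number(text):
    """Return True if text parses as a number."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def looks_misloaded(row):
    """Flag a row that seems to come from a loading problem.

    A row is flagged when it has the wrong number of fields, or when any
    field that is not a missing marker fails to parse as a number.
    """
    # csv.DictReader uses a None key for extra fields and None values
    # for absent ones, so either signals a wrong field count.
    if None in row or None in row.values():
        return True
    return any(
        value.strip() not in MISSING and not is_number(value.strip())
        for value in row.values()
    )


def split_rows(rows):
    """Separate rows into (data, discards), mirroring the two tabs."""
    data, discards = [], []
    for row in rows:
        (discards if looks_misloaded(row) else data).append(row)
    return data, discards


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
data, discards = split_rows(rows)
print(len(data), "rows kept;", len(discards), "rows moved to discards")
```

Note that the row with a "?" stays in the data, since a missing value alone is not treated as suspicious here. Anything the heuristic flags is only a candidate for the Discards tab; whether it truly doesn't belong is a question for the subject matter expert.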
