From the course: Data Science Foundations: Data Assessment for Predictive Modeling
Solution: What should the row be?
(upbeat music) - [Tutor] Okay, here we go. This is going to be more of a thought process than a definitive solution, because to make a definitive decision about this, you'd have to look outside the data set and consult with a subject matter expert. Nonetheless, we have a lot to consider already. So this is what I've done. I've got the Titanic data set in its original form, which you can find in the originals folder, and I've added a new tab with a pivot table. I've also turned on the filters so we can take a look. So if we click down and filter here, it looks like we have as many passenger IDs as we have passengers. It looks like almost all the numbers are used, so that's good. That means that if we wanted to build a risk score at the passenger level, it looks like we wouldn't have duplicates to deal with. That's certainly a plausible way to go. What about the other options? Let's take a look at Ticket, for instance, as a possible…
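The same check the filter and pivot table perform, comparing the number of distinct values in a candidate column against the number of rows, can be sketched in a few lines of Python. This is only an illustration under assumptions: the column names `PassengerId` and `Ticket` are assumed to match the spreadsheet headers, the helper `key_profile` is a hypothetical name, and the sample rows are made up rather than taken from the Titanic file.

```python
from collections import Counter


def key_profile(rows, column):
    """Summarize how well `column` would work as a row identifier.

    Hypothetical helper for illustration: counts rows, distinct values,
    and the size of the largest group sharing one value.
    """
    values = [row[column] for row in rows]
    counts = Counter(values)
    return {
        "rows": len(values),
        "distinct": len(counts),
        "is_unique": len(counts) == len(values),
        "largest_group": max(counts.values()) if counts else 0,
    }


# Made-up sample rows for illustration only, not taken from the real file.
# Column names are assumed to match the spreadsheet headers.
sample = [
    {"PassengerId": "1", "Ticket": "T-100"},
    {"PassengerId": "2", "Ticket": "T-100"},
    {"PassengerId": "3", "Ticket": "T-200"},
    {"PassengerId": "4", "Ticket": "T-300"},
]

passenger_profile = key_profile(sample, "PassengerId")
ticket_profile = key_profile(sample, "Ticket")

print(passenger_profile)
print(ticket_profile)
```

In this toy sample, every passenger ID appears once, so one row per passenger is a workable unit of analysis, while two rows share a ticket value, so using the ticket as the row would mean aggregating passengers. Which of these is the right row is still a question for a subject matter expert, as noted above.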