From the course: Data Science Foundations: Data Assessment for Predictive Modeling
Tips and tricks when searching for quirks in your data
- [Instructor] Okay, we're in the Census Income Excel spreadsheet, and I want to talk about quirks and weird patterns and how to handle them when sitting down with a subject matter expert. Now, since a lot of times we're looking at the relationship between nominal variables, one of the best ways to do this is to run a crosstab, using the Crosstab node in KNIME or other software that supports it. The problem is that as the number of categories grows, the table becomes somewhat unmanageable, so for demonstration purposes I'm going to use filters in Excel. A lot of times, though, you're going to be checking the correspondence between two variables. So let me give you an example. If we go over to education and we go to education number 12, let's say, it would be important to verify that this always means the same thing. So clearly education number is not years of school, because that would be a high school…
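The correspondence check described in the transcript can also be sketched in code. The example below is a minimal illustration in Python with pandas, not the instructor's workflow (the video uses Excel filters and mentions a crosstab node). The column names `education_num` and `education`, the helper `inconsistent_codes`, and the toy rows are all assumptions made up for this sketch; the rows are not taken from the census file and include one deliberate conflict so the check has something to find.

```python
import pandas as pd


def inconsistent_codes(df, code_col, label_col):
    """Return {code: sorted labels} for every code paired with more than one label.

    Hypothetical helper for illustration: an empty result means each code
    always corresponds to the same label.
    """
    labels_per_code = df.groupby(code_col)[label_col].unique()
    return {
        code: sorted(labels)
        for code, labels in labels_per_code.items()
        if len(labels) > 1
    }


# Made-up toy rows (NOT from the census file). Code 12 is deliberately
# paired with two different labels to simulate a quirk.
toy = pd.DataFrame(
    {
        "education_num": [9, 9, 12, 12, 13, 12],
        "education": [
            "HS-grad",
            "HS-grad",
            "Assoc-acdm",
            "Assoc-acdm",
            "Bachelors",
            "Assoc-voc",
        ],
    }
)

# Crosstab of the two nominal variables: with a clean one-to-one mapping,
# each row has exactly one non-zero cell.
table = pd.crosstab(toy["education_num"], toy["education"])
print(table)

# Programmatic version of the same check, useful when the number of
# categories makes the crosstab too large to read by eye.
result = inconsistent_codes(toy, "education_num", "education")
print(result)
```

Any code the helper reports is a candidate to raise with the subject matter expert, since it suggests the code does not always mean the same thing.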