From the course: Cleaning Bad Data in R

Detecting illogical values

- [Narrator] Detecting outliers isn't the only way to find bad data in your datasets. Sometimes you'll find illogical values that break business rules or violate common sense, and you can write tests in R that identify them.

For example, consider a dataset containing information about the residents of a town: their ages, employment status, information about where they live, and whether they own a car. In the exercise files, I've provided code to load this dataset, which has around 2,000 records. As with our previous examples, we begin by loading the tidyverse, setting our working directory, and then reading in the residents dataset.

I'll begin by taking a look at some summary statistics for this dataset. Looking at the summary, I see that I have records about adults of working age: all of the values for the age variable are between 18 and 65. I also have a field that shows whether the person is employed, using a logical value where TRUE means that they are employed.…
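As a rough illustration of the kind of test described here, the sketch below summarizes a small stand-in table and flags ages outside the 18 to 65 range. It is not the course's code. The column names (`age`, `employed`, `own_car`) and all of the values are assumptions, the table is built inline rather than read from the exercise files, and the last row is deliberately invalid so the check has something to flag.

```r
library(dplyr)

# Hypothetical stand-in for the residents dataset. The column names and
# values are assumptions, not the course's actual data.
residents <- tibble(
  age      = c(34, 52, 18, 41, 65, 130),  # 130 is deliberately illogical
  employed = c(TRUE, FALSE, TRUE, TRUE, FALSE, TRUE),
  own_car  = c(TRUE, TRUE, FALSE, TRUE, FALSE, TRUE)
)

# Summary statistics: the minimum and maximum of age show the observed range.
summary(residents)

# Rule check: keep only the records whose age falls outside 18 to 65.
out_of_range <- residents %>%
  filter(age < 18 | age > 65)

# Number of records that break the rule.
nrow(out_of_range)
```

A check like this returns zero rows when every record passes, which is what the summary in the transcript suggests for the age variable in the real dataset.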
