From the course: Cleaning Bad Data in R

Suspicious multiples

- [Instructor] Another common source of suspicion in datasets is when you see unusual recurring multiples. The most common example of this is when all of the values in a dataset end in several zeros. This may be the result of rounding, or it may come from extrapolation. For example, earlier in this course we used this dataset containing the number of acres of public land in each state. Did you notice anything suspicious about this dataset when we first looked at it? Well, all of the values here end in three zeros, and there's a good reason for that. I built this file using a government source document. Let's take a look at that document. The data in our file comes from the first and third columns of this page. Take a look at the third column. It contains the total area of national forest system land, but it's showing the data in thousands of acres. So Alabama is listed as 665, which represents 665,000 acres. Back here in the dataset, we have the round number 665,000; that's not the exact…
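One way to check for this kind of pattern is sketched below in base R. It is a minimal illustration, not the code used in the course. The data frame `land`, its column names, and the helper `common_power_of_ten` are all hypothetical, and the values are made up except for Alabama's 665,000, which comes from the transcript.

```r
# Hypothetical data: only Alabama's figure comes from the transcript;
# the other rows are invented for illustration.
land <- data.frame(
  state = c("Alabama", "State B", "State C"),
  acres = c(665000, 1234000, 87000)
)

# Hypothetical helper: find the largest power of 10 that divides
# every nonzero, non-missing value in x.
common_power_of_ten <- function(x, max_power = 9) {
  x <- x[!is.na(x) & x != 0]
  if (length(x) == 0) {
    return(NA_real_)
  }
  power <- 0
  # Keep raising the power while every value is still evenly divisible.
  while (power < max_power && all(x %% 10^(power + 1) == 0)) {
    power <- power + 1
  }
  10^power
}

multiple <- common_power_of_ten(land$acres)
multiple  # 1000 here, meaning every value ends in three zeros
```

A result larger than 1 suggests that the values may have been rounded or recorded in larger units, such as thousands of acres, so it is worth going back to the source document before treating them as exact counts.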
