From the course: Cleaning Bad Data in R

Inconsistent spellings

- [Narrator] Spelling errors are another source of chaos in data analysis, particularly when you're trying to group results. There are many possible sources of spelling errors, but one of the most common is hand entry: people typing data into a system don't know how to spell a word, mistype it, or use inconsistent punctuation. One of the best ways to reduce spelling errors is to limit the use of freeform text input fields in your applications. Instead of asking a user to type in an entry, you can present them with a list of options and ask them to select the appropriate value. Of course, you can only do this when the range of choices is limited. Otherwise, you'll have to undertake the painstaking process of identifying and correcting spelling errors in your data before you begin analysis. Let's take a look at an example of how we can do this in R. We'll be working with a data file that contains the…
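The transcript cuts off before the course's own data file is described, so the following is only a minimal sketch of the general idea in base R, not the course's example. The `visits` data frame, its `city` and `count` columns, and the `corrections` lookup are all hypothetical names invented for illustration. The sketch inspects the distinct spellings, normalizes case, whitespace, and punctuation, and then maps the remaining variants to one canonical value so that grouping gives the expected totals.

```r
# Hypothetical hand-entered data (NOT the course's data file):
# the same two cities appear under several spellings.
visits <- data.frame(
  city  = c("St. Louis", "St Louis", "Saint Louis", "st. louis ",
            "Chicago", "chicago", "Chicgo"),
  count = c(10, 5, 3, 2, 7, 4, 1),
  stringsAsFactors = FALSE
)

# Step 1: list the distinct values to spot inconsistent spellings.
print(table(visits$city))

# Step 2: normalize case, surrounding whitespace, and punctuation.
normalized <- tolower(trimws(visits$city))
normalized <- gsub(".", "", normalized, fixed = TRUE)  # drop literal periods

# Step 3: map each remaining variant to a canonical spelling.
# This lookup table is an assumption; in practice you would build it
# by reviewing the output of Step 1.
corrections <- c(
  "st louis"    = "Saint Louis",
  "saint louis" = "Saint Louis",
  "chicago"     = "Chicago",
  "chicgo"      = "Chicago"
)
visits$city_clean <- unname(corrections[normalized])

# Any value missing from the lookup becomes NA, so fail loudly
# rather than silently dropping rows during grouping.
stopifnot(!anyNA(visits$city_clean))

# Step 4: group on the cleaned column.
totals <- aggregate(count ~ city_clean, data = visits, FUN = sum)
print(totals)
```

Keeping the original `city` column alongside `city_clean` is a deliberate choice here: it lets you audit which raw spellings were merged into each canonical value.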
