From the course: Data Science Foundations: Data Assessment for Predictive Modeling


Investigating the provenance of the missing data


- [Instructor] A very common pattern when examining data quality is variables that cluster together in terms of their missing data. When one column is missing, another is consistently missing along with it, so the pattern is not random. When this happens, try to figure out the provenance of the data. Where did it come from? I had a memorable experience visiting a cell phone company overseas. Prepaid mobile minutes are more popular internationally than they are domestically. It's quite common for international business travelers to buy a SIM card from a kiosk right in the airport. It's also common at convenience stores. About half of my client's customers on this trip purchased their minutes this way, either at a kiosk or a convenience store. Alternatively, some of their customers had a monthly contract, which, of course, is the dominant method here in the US. Those customers fill out a form with various…
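
One way to check whether columns go missing together, and whether that tracks where the records came from, is sketched below in plain Python. Everything in it is an assumption made for illustration: the column names (`channel`, `age`, `address`, `minutes`), the sample records, and the helper functions are hypothetical and do not come from the course or the client's data. The sketch measures how much the missing rows of two columns overlap, then breaks a column's missing rate out by a provenance column.

```python
from itertools import combinations


def missing_mask(rows, column):
    """Return one boolean per row: True where the column is None or absent."""
    return [row.get(column) is None for row in rows]


def co_missing_jaccard(rows, columns):
    """For each pair of columns, compute the Jaccard overlap of their missing rows.

    1.0 means the two columns are always missing on exactly the same rows;
    0.0 means they are never missing together.
    """
    masks = {c: missing_mask(rows, c) for c in columns}
    result = {}
    for a, b in combinations(columns, 2):
        both = sum(x and y for x, y in zip(masks[a], masks[b]))
        either = sum(x or y for x, y in zip(masks[a], masks[b]))
        result[(a, b)] = both / either if either else 0.0
    return result


def missing_rate_by_source(rows, column, source_column):
    """Share of rows with `column` missing, grouped by a provenance column."""
    counts = {}
    for row in rows:
        total, missing = counts.get(row[source_column], (0, 0))
        is_missing = row.get(column) is None
        counts[row[source_column]] = (total + 1, missing + is_missing)
    return {src: missing / total for src, (total, missing) in counts.items()}


# Hypothetical records, invented purely for illustration.
customers = [
    {"channel": "kiosk", "age": None, "address": None, "minutes": 120},
    {"channel": "convenience_store", "age": None, "address": None, "minutes": 60},
    {"channel": "contract", "age": 34, "address": "12 Main St", "minutes": 400},
    {"channel": "contract", "age": 51, "address": "9 Oak Ave", "minutes": 350},
]

overlap = co_missing_jaccard(customers, ["age", "address", "minutes"])
by_channel = missing_rate_by_source(customers, "age", "channel")

for pair, score in overlap.items():
    print(pair, round(score, 2))
print(by_channel)
```

A high overlap score only says that two columns are missing on the same rows. The per-source breakdown is what ties that pattern back to how each record entered the system, which is the provenance question raised above.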
