From the course: Leading Teams Working with Data: Pitfalls and Best Practices

Clean and dirty data

- Before you do any data analysis, you need to make sure that your data is ready. This means you need to ensure your data is clean. There's an important distinction between clean data and dirty data. Dirty data can have all kinds of problems. Columns might not align, and rows might not align. The units of measure or the currencies could be wrong. You might also find duplicate data or missing data, and other data may have somehow shifted out of place. All of these things need to be controlled for in order for your data to be considered clean, and the only data you should be analyzing is clean data. After all, if you analyze dirty data, it's garbage in, garbage out: the results of all your hard work and analysis will probably be meaningless and could even point you in the wrong direction when it comes to deriving implications. Furthermore, all the work you will have done…
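
As a rough illustration of the checks described above, the following minimal sketch counts three of the problems mentioned (duplicate rows, missing values, and an unexpected currency) in a small set of records. The sample rows, the field names, the `audit_rows` helper, and the choice of USD as the expected currency are all hypothetical and are not taken from the course; a real dataset would need checks tailored to its own columns.

```python
from collections import Counter

# Hypothetical sample records; field names and values are illustrative only.
ROWS = [
    {"order_id": "A1", "amount": "100.00", "currency": "USD"},
    {"order_id": "A2", "amount": "", "currency": "USD"},        # missing amount
    {"order_id": "A1", "amount": "100.00", "currency": "USD"},  # exact duplicate
    {"order_id": "A3", "amount": "85.50", "currency": "EUR"},   # unexpected currency
]


def audit_rows(rows, expected_currency="USD"):
    """Count three kinds of dirty-data problems in a list of dict records."""
    # Treat the first row's keys as the required fields (an assumption).
    required = set(rows[0].keys()) if rows else set()

    # A row is a duplicate if an identical row appeared earlier.
    seen = Counter(tuple(sorted(r.items())) for r in rows)
    duplicates = sum(n - 1 for n in seen.values() if n > 1)

    # A row has missing data if any required field is empty or absent.
    missing = sum(
        1 for r in rows if any(r.get(k) in ("", None) for k in required)
    )

    # A row has a currency problem if it differs from the expected one.
    wrong_currency = sum(
        1 for r in rows if r.get("currency") != expected_currency
    )

    return {
        "duplicates": duplicates,
        "missing": missing,
        "wrong_currency": wrong_currency,
    }


report = audit_rows(ROWS)
print(report)
```

A report with any non-zero count suggests the data should be cleaned before analysis begins; the thresholds and follow-up actions are left to the team.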
