From the course: Big Data in the Age of AI

Challenges with data preparation

- [Narrator] Salmon can be farmed or they can be caught wild. But either way, it takes a fair amount of work before they are turned into this. Everybody knows that food prep is an important, although time-consuming and frequently tedious, part of cooking. There is a similar principle in any big data project. The rule of thumb is that about 80 percent of the time on a big data project is spent preparing the data, and that's been my own experience. Now, there are several reasons why this may be the case. One is how the data is entered. If you're using wild-caught data, meaning data that you found out there in the world and that may have been entered as free text, you have to look at things like place names. Here are four different ways of indicating California. You can write it out, or you can use various abbreviations, and the inclusion of a period, at least by default, marks it as a separate answer from the one without…
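
The place-name problem described above can be sketched in a few lines of Python. This is a minimal illustration, not the course's own method: the sample entries, the `CANONICAL` lookup table, and the `normalize_place` helper are all hypothetical, since the transcript does not list the exact four spellings shown on screen.

```python
from collections import Counter

# Hypothetical free-text entries; the transcript does not give the exact variants.
raw_entries = ["California", "CA", "Calif.", "Calif", "california ", "CA"]

# Hypothetical lookup from a cleaned spelling to one canonical name.
CANONICAL = {
    "california": "California",
    "ca": "California",
    "calif": "California",
}


def normalize_place(value: str) -> str:
    """Trim whitespace, drop periods, lowercase, then map to a canonical name."""
    key = value.strip().replace(".", "").lower()
    # Unknown values pass through (trimmed) so they can be reviewed by hand.
    return CANONICAL.get(key, value.strip())


# By default every distinct spelling is counted as a separate answer.
before = Counter(raw_entries)
# After normalization the variants collapse into a single category.
after = Counter(normalize_place(v) for v in raw_entries)

print("distinct answers before:", len(before))
print("distinct answers after:", len(after))
```

In real data the lookup table is usually the expensive part: it has to be built and checked by hand or against a reference list, which is one reason preparation takes up so much of a project.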
