From the course: Data Science Foundations: Data Assessment for Predictive Modeling

Anticipating data integration

- [Instructor] In the 25 years that I've been doing this, I've concluded that the best source of modeling data is often transactional data, but it's not in the form that we need. It's not flat. So what we have to do is aggregate it, and in numerous ways. This is all done in the data preparation phase, but we have to be clear on what the data will look like when we're done so that we can properly perform our data understanding tasks. So just one example might be calculating the average number of domestic U.S. calls over this four-month period, turning four rows into just one number. Then for this one customer, that becomes their value on a single row of data in our modeling flat file. That's just the form we need: one row per customer, or whatever our unit of analysis might be.
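
To make the aggregation described above concrete, here is a minimal sketch in Python. The customer IDs, months, call counts, and field names are all invented for illustration and do not come from the course data; the function name `flatten_to_customer_rows` is likewise hypothetical. It shows one way, among many, to turn four monthly rows per customer into a single averaged value on one row per customer.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical transactional data: one row per customer per month.
# All values and field names are made up for illustration.
transactions = [
    {"customer_id": "C001", "month": "2023-01", "domestic_calls": 40},
    {"customer_id": "C001", "month": "2023-02", "domestic_calls": 52},
    {"customer_id": "C001", "month": "2023-03", "domestic_calls": 48},
    {"customer_id": "C001", "month": "2023-04", "domestic_calls": 60},
    {"customer_id": "C002", "month": "2023-01", "domestic_calls": 10},
    {"customer_id": "C002", "month": "2023-02", "domestic_calls": 14},
    {"customer_id": "C002", "month": "2023-03", "domestic_calls": 12},
    {"customer_id": "C002", "month": "2023-04", "domestic_calls": 20},
]


def flatten_to_customer_rows(rows):
    """Aggregate monthly rows into one row per customer (the unit of analysis)."""
    calls_by_customer = defaultdict(list)
    for row in rows:
        calls_by_customer[row["customer_id"]].append(row["domestic_calls"])

    # One output row per customer: several input rows collapse into one number.
    return [
        {
            "customer_id": customer_id,
            "avg_domestic_calls": mean(calls),
            "months_observed": len(calls),
        }
        for customer_id, calls in sorted(calls_by_customer.items())
    ]


flat_file = flatten_to_customer_rows(transactions)
for customer_row in flat_file:
    print(customer_row)
```

The average is only one of the "numerous ways" to aggregate; the same grouping step could just as easily produce a total, a maximum, or a count, each becoming another column on the customer's single row.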
