From the course: Data Science Foundations: Data Assessment for Predictive Modeling

Considerations in gathering the relevant data

- [Instructor] The first task is to collect initial data. One thing experience has taught me is that the data that proves most effective during modeling is often found where no one else has been looking, so you have to do a thorough search. So where do you start? How do you assemble a list of potential data sources and of subject matter experts (SMEs) who can help you understand the data? I often start with a building directory or an organizational chart. It might seem counterintuitive, but if you want to understand the business problem, you have to understand every aspect of the business that potentially involves that problem, and the managers within each department will be your best SMEs. Each department or business process is a potential source of data. Most variables that you try will fail, but almost every source of data will produce at least one good variable. At first, this seems to be…
