From the course: Data Science on Google Cloud Platform: Exploratory Data Analytics

Statistics and correlations

- [Narrator] One of the key EDA tasks for any data set is understanding the data through descriptive statistics, which summarize the characteristics of each data element. We do the same for the campaign's data set. Pandas makes this easy with the describe function, which we call on this data frame. As you can see, the command prints the count, mean, standard deviation, and quartiles for each column in the data set. For example, the price column has a mean of 40.08 and a standard deviation of 8.22. Its values range from 30 to 50, with 40 being the median. Another key step in understanding data is correlation analysis between the target variable and the feature variables. In this data set, the target variable is conversion. We want to understand how conversion is affected by feature variables like price, discount, gender, age, etc. We do this with the corr function on the data frame. A value closer to…
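
The two steps the narrator describes can be sketched roughly as follows. This is a minimal illustration, not the course's notebook: the `campaigns` data frame and all of its values are made up for the example, and only the column names price, discount, age, and conversion come from the transcript. Gender is left out because it would first need a numeric encoding, which the transcript does not describe.

```python
import pandas as pd

# Hypothetical stand-in for the course's data set; every value here is invented.
campaigns = pd.DataFrame({
    "price": [30, 35, 40, 40, 45, 50],
    "discount": [0, 5, 10, 10, 15, 20],
    "age": [22, 35, 41, 29, 53, 38],
    "conversion": [1, 1, 1, 0, 0, 0],
})

# Descriptive statistics: count, mean, std, min, quartiles, and max per column.
summary = campaigns.describe()
print(summary)

# Pairwise Pearson correlations, narrowed to each feature's correlation
# with the target column and sorted for easier reading.
correlations = (
    campaigns.corr()["conversion"]
    .drop("conversion")
    .sort_values()
)
print(correlations)
```

In the output of `describe`, the row labeled `50%` is the median, which is where a statement like "40 being the median" would be read from. The correlation values always fall between -1 and 1.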
