The sampling distribution is the most important concept in inferential statistics. This video provides the foundational ideas for understanding sampling distributions.
- [Narrator] We'll turn our attention toward a very important concept in statistics: sampling distributions. I can't stress enough that this concept is fundamental for statistics. I've taught statistics for many, many years, and I can tell you that if you understand sampling distributions, there's a good chance you'll understand statistics. If not, statistics will seem like just a long list of disconnected formulas.

Let's get back to some basics. A population is a large group, a universe of individuals, and we're interested in some measurable characteristics of those individuals, parameters in other words. A sample is a small, representative group that we select from the population. The idea is to measure their characteristics, which are called statistics, and from the sample statistics, we infer the population parameters.

Now, a sampling distribution is based on a population of samples. All the samples have the same number of people. In each sample, we measure the individuals on some property, and then we calculate a statistic like the mean. The sampling distribution of a statistic is the distribution of all those means calculated on all those samples. This is a probability distribution. Depending on what's being measured, it could be discrete, a probability mass function, or it could be continuous, a probability density function. Like any other distribution, it has a mean and a standard deviation.

Here's a picture that shows what I mean: a huge number of samples coming out of a population. That set of means is the sampling distribution of the mean. So a sampling distribution is the distribution of all possible values of a statistic for a given sample size. The standard deviation of the sampling distribution is called the standard error, and this is a very important and useful definition in statistics. It's a term we encounter in many contexts. Bear in mind that when we do a study, we never actually create a sampling distribution.
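The idea above can be made concrete with a short simulation. This is a minimal sketch, not part of the original lesson: it assumes a hypothetical normal population of heights (mean 170 cm, standard deviation 10 cm), draws many same-size samples, and records each sample's mean. The spread of those means approximates the standard error, which theory says should be close to 10/√25 = 2.

```python
import random
import statistics

random.seed(42)

# Hypothetical population: heights in cm, normal with mean 170, sd 10.
POP_MEAN, POP_SD = 170.0, 10.0
SAMPLE_SIZE = 25        # every sample has the same number of people
NUM_SAMPLES = 10_000    # a large (not infinite) collection of samples

# Draw many samples of the same size; keep each sample's mean.
sample_means = [
    statistics.mean(random.gauss(POP_MEAN, POP_SD) for _ in range(SAMPLE_SIZE))
    for _ in range(NUM_SAMPLES)
]

# The distribution of these means approximates the sampling
# distribution of the mean for n = 25.
mean_of_means = statistics.mean(sample_means)
standard_error = statistics.stdev(sample_means)

print(f"mean of sample means: {mean_of_means:.2f}")   # close to 170
print(f"standard error:       {standard_error:.2f}")  # close to 10/sqrt(25) = 2
```

Note that the simulated standard error shrinks if you increase `SAMPLE_SIZE`, which is exactly the behavior the narrator's definition predicts.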
That would take an infinite amount of effort. Here's what we do instead: from one sample and some additional statistical knowledge, we can find the parameters of a sampling distribution. What does this enable us to do? It enables us to calculate probabilities, and thus to create estimates of parameters and to test hypotheses. Here's an analogy: it's something like a paleontologist working with a small fossil and recreating an entire dinosaur.

In practice, step one is to calculate a statistic from some sample data, and step two is to compare that statistic to the sampling distribution of the statistic. For estimating parameters, this enables us to set upper and lower boundaries for a parameter estimate. For testing hypotheses, this enables us to find the probability of obtaining sample data like ours if the null hypothesis is true.

In summary, a sampling distribution is the distribution of all possible values of a statistic for a given sample size. When we do a study, we don't create a sampling distribution. Instead, we use sample data and some additional knowledge to find the parameters of a sampling distribution. Sampling distributions enable us to estimate parameters and to test hypotheses.
- Explain how to calculate simple probability.
- Review the Excel statistical formulas for finding mean, median, and mode.
- Differentiate statistical nomenclature when calculating variance.
- Identify components when graphing frequency polygons.
- Explain how t-distributions operate.
- Describe the process of determining a chi-square statistic.