One part of statistics involves parameter estimation. Learn about confidence limits and confidence intervals when estimating a parameter.
- [Instructor] Let's turn to parameter estimation. When we estimate, we use a statistic, like x-bar, to estimate a parameter, like mu. We can't know the exact value of the parameter, so some uncertainty will always be in our estimate. So we quantify the amount of uncertainty we're willing to live with. Alpha is the amount of uncertainty we're willing to live with when we estimate a parameter. .05 is a value of alpha that's used a lot. Confidence is the amount of certainty we have when we estimate a parameter. Confidence equals one minus alpha. A frequently used level of confidence is .95, also referred to as 95% confidence. Confidence limits are the lower and upper bounds for an estimate, based on confidence level. A typical estimation statement might be: "The 95% confidence limits for the mean are 57.5 and 65.3." The confidence interval is the interval between the confidence limits. A typical estimation statement would be: "The 95% confidence interval for the mean is 57.5 to 65.3." To find confidence limits, we first decide on a confidence level, one minus alpha, and gather data. Then we calculate the statistics: the mean, the standard deviation, and the standard error of the mean. If the sample size is greater than or equal to 30, we use the standard normal distribution as the sampling distribution of the mean; otherwise, we use the t-distribution, and the degrees of freedom are the sample size minus one. So we work with the sampling distribution of the mean, and its mean is x-bar, and its standard error is s-x-bar. We use alpha to find the two-tailed critical values in the sampling distribution, meaning the proportions of area they cut off in the two extreme ends of the distribution have to add up to alpha. With x-bar as the center of the sampling distribution, the two-tailed critical values are the confidence limits. Here's what I'm talking about. Consider the upper and lower 95% confidence limits for the estimate of a mean. 
Imagine a sampling distribution of the mean with x-bar at the center. The standard error of the mean is equal to the sample standard deviation divided by the square root of the sample size. So the upper and lower 95% confidence limits for estimating mu are the two-tailed critical values that cut off areas adding up to 5% of the area under the distribution. Some things to remember. The confidence interval depends on alpha and on the sample size. For a narrower confidence interval, we could increase alpha. The 90% confidence interval is narrower than the 95% confidence interval. For a narrower confidence interval, we could also increase the sample size. This decreases the standard error. Here's what I mean. The two pictures on the left show that, to have a narrower confidence interval, increase alpha. The bottom picture here on the left shows more area in the two tails than the top picture shows. So the distance between the two cutoffs is shorter in the bottom picture, and that translates to a narrower confidence interval. The two pictures on the right show that, to have a narrower confidence interval, increase the sample size. The bottom picture on the right is a sampling distribution based on a large sample size. The top picture is a sampling distribution based on a smaller sample size. The standard error in the bottom picture is smaller, so the distance between the two cutoffs is shorter, and that translates to a narrower confidence interval. So, what does this all mean? Once we find the 95% confidence interval, does that mean that .95 is the probability that the population mean is between the lower limit and the upper limit? No! That's a common mistake. The parameter is either within the confidence interval or it's not. Well, then, what does it mean? 95% confident refers to the reliability of the estimation procedure. Wait, what?
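Both effects on the interval width can be checked numerically. This sketch assumes a hypothetical sample standard deviation of 8 and compares the half-width of the interval as alpha and the sample size change:

```python
from statistics import NormalDist

s = 8.0  # hypothetical sample standard deviation

def half_width(alpha, n):
    """Half the confidence interval: critical value times standard error."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return z * s / n ** 0.5

w95 = half_width(0.05, 36)      # 95% confidence, n = 36
w90 = half_width(0.10, 36)      # increase alpha: interval narrows
w_big_n = half_width(0.05, 144) # increase sample size: interval narrows
print(f"95%, n=36:  +/- {w95:.2f}")
print(f"90%, n=36:  +/- {w90:.2f}")
print(f"95%, n=144: +/- {w_big_n:.2f}")
```

Note that quadrupling the sample size exactly halves the width at the same confidence level, because the standard error divides by the square root of the sample size.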
Well, 95% confident means this: If we repeat this estimation procedure on many samples, we expect that the calculated 95% confidence interval, which will change from sample to sample, contains the parameter in 95% of the samples. And that's parameter estimation.
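That repeated-sampling interpretation can be demonstrated with a short simulation. This sketch draws 1,000 samples from a population with a known (hypothetical) mean of 60, builds a 95% confidence interval from each sample, and counts how often the interval contains the true mean:

```python
import random
from statistics import NormalDist, mean, stdev

random.seed(42)
MU, SIGMA = 60.0, 8.0  # hypothetical "true" population parameters
z = NormalDist().inv_cdf(0.975)  # two-tailed critical value for alpha = .05

trials, n, hits = 1000, 40, 0
for _ in range(trials):
    sample = [random.gauss(MU, SIGMA) for _ in range(n)]
    x_bar = mean(sample)
    sem = stdev(sample) / n ** 0.5
    if x_bar - z * sem <= MU <= x_bar + z * sem:
        hits += 1

coverage = hits / trials  # should be close to .95
print(f"{coverage:.0%} of the 1,000 intervals contain mu")
```

Each individual interval either contains mu or it doesn't; it is the long-run proportion of intervals that contain mu that lands near 95%.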
- Explain how to calculate simple probability.
- Review the Excel statistical formulas for finding mean, median, and mode.
- Differentiate statistical nomenclature when calculating variance.
- Identify components when graphing frequency polygons.
- Explain how t-distributions operate.
- Describe the process of determining a chi-square.