A widely used test of the difference between two independent samples is based on the t-distribution. In this video, learn how to use Excel to apply this test.
- [Instructor] Let's take a look at a frequently used statistical test for assessing the difference between two independent sample means: the t-test. Here are the possible hypothesis tests for two independent samples: directional, in which the alternative hypothesis specifies that the difference between parameters is greater than zero; directional in the other direction, where the alternative hypothesis specifies that the difference is less than zero; or non-directional, where the alternative hypothesis doesn't specify a direction for the difference. With the hypotheses in hand, the process is to gather the data, calculate X bar one minus X bar two, convert X bar one minus X bar two to units of the sampling distribution of the difference between means, and, if it's in the rejection region, reject the null hypothesis.
With small samples and unknown variance, we use the t-distribution for the sampling distribution of the difference between means. With the appropriate degrees of freedom, determine the critical value for alpha, convert X bar one minus X bar two into a t, and, if the t is in the rejection region, reject the null hypothesis.
When we don't know the population variance, we use the sample variance as our estimate. Here, we have two estimates: s one squared, based on N one, the size of sample one, and s two squared, based on N two, the size of sample two. We use them to create a pooled estimate of variance. The pooled estimate is a sort of weighted average of the first sample variance and the second sample variance, calculated this way: the quantity N one minus one times s one squared, plus the quantity N two minus one times s two squared, all divided by N one minus one plus N two minus one. That denominator works out to N one plus N two minus two. It's important to note the degrees of freedom. The degrees of freedom are the denominator of the variance estimate. In this case, it's N one plus N two minus two. So to convert the difference between sample means into a t, follow this formula.
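Written out in symbols, the pooled variance estimate and the conversion to t look like this. The subscript p for "pooled" is a label introduced here for convenience, and the term mu one minus mu two stands for the null-hypothesized difference between means:

```latex
s_p^2 = \frac{(N_1 - 1)\,s_1^2 + (N_2 - 1)\,s_2^2}{(N_1 - 1) + (N_2 - 1)}
      = \frac{(N_1 - 1)\,s_1^2 + (N_2 - 1)\,s_2^2}{N_1 + N_2 - 2}

t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}
         {s_p \sqrt{\dfrac{1}{N_1} + \dfrac{1}{N_2}}},
\qquad s_p = \sqrt{s_p^2},
\qquad df = N_1 + N_2 - 2
```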
It's the difference between sample means minus the null-hypothesized difference between means, divided by the pooled estimate of the standard deviation times the square root of one over N one plus one over N two. And the pooled estimate of the standard deviation is, of course, the square root of the pooled estimate of the variance.
Here's an example of all this at work. A manager believes that four 10-hour workdays will result in a different level of productivity than five eight-hour workdays. He randomly assigns 10 workers to the 10-hour days, Group 1, and 10 other workers to the eight-hour days, Group 2. He measures the number of widgets each worker produces. The null hypothesis is that mu one minus mu two equals zero, and the alternative is that mu one minus mu two is not zero. Alpha is 0.05. The data: X bar one is 87, X bar two is 99, s one is 14, and s two is 16. Do we reject the null hypothesis? Well, first, we'll calculate the pooled estimate of the variance. Plugging the numbers into the formula, that pooled estimate comes out to 226. Now, converting X bar one minus X bar two to a t, plugging the numbers into the formula yields a value of minus 1.785. What's the decision? Well, with degrees of freedom equal to 18 and alpha equal to 0.05, the critical value is plus or minus 2.10. The rejection region is on both sides because of the non-directional alternative hypothesis, mu one minus mu two not equal to zero. The result, minus 1.785, is not in the rejection region, so we do not reject the null hypothesis.
Now here's something to think about. If the alternative hypothesis had been that mu one minus mu two is less than zero, the 0.05 cutoff would have been minus 1.734, in just the left tail. The decision then would have been to reject the null hypothesis. What does this tell you? The one-tailed test provides a lower hurdle than the two-tailed test. It's easier to reject the null hypothesis with the one-tailed test than with the two-tailed test.
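The arithmetic in the widget example can be checked with a short Python sketch. It uses only the summary statistics quoted above. The variable names are illustrative, and the two critical values are copied from the narration rather than computed.

```python
import math

# Summary statistics from the worked example in the narration.
n1, n2 = 10, 10          # workers per group
mean1, mean2 = 87, 99    # X bar one, X bar two
s1, s2 = 14, 16          # sample standard deviations
null_diff = 0            # null-hypothesized difference between means

# Degrees of freedom are the denominator of the pooled variance estimate.
df = n1 + n2 - 2
pooled_var = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df
pooled_sd = math.sqrt(pooled_var)

# Convert the difference between sample means into a t.
t = (mean1 - mean2 - null_diff) / (pooled_sd * math.sqrt(1 / n1 + 1 / n2))

# Critical values quoted in the narration for df = 18 and alpha = 0.05.
critical_two_tailed = 2.10
critical_left_tail = -1.734

reject_two_tailed = abs(t) > critical_two_tailed
reject_one_tailed = t < critical_left_tail

print(f"df = {df}, pooled variance = {pooled_var:.0f}, t = {t:.3f}")
print(f"Reject with two-tailed test: {reject_two_tailed}")
print(f"Reject with one-tailed (left) test: {reject_one_tailed}")
```

The same t falls short of the two-tailed cutoff but passes the one-tailed cutoff, which is the contrast the example draws.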
The one-tailed test requires knowledge, enough knowledge to know in what direction the alternative hypothesis should be. Power, in statistical terminology, is the ability to correctly reject the null hypothesis. What does this tell us? Knowledge is power.
Now for a spreadsheet that showcases the Excel data analysis tool for carrying out a t-test. I have data for two groups: Group 1 data in Column E and Group 2 data in Column F. On the Data tab, click the Data Analysis button, which we installed as an Excel add-in. From the Data Analysis dialog box, select t-Test: Two-Sample Assuming Equal Variances and click OK. With the Variable 1 Range box active, select the data in Column E, beginning with the Group 1 heading. With the Variable 2 Range box active, select the data in Column F, beginning with the Group 2 heading. Leave the Hypothesized Mean Difference box blank, and the tool assumes that the null-hypothesized mean difference is zero. Click the checkbox next to Labels. Note that alpha is assumed to be 0.05. We'll leave that one alone. Make sure the radio button next to New Worksheet Ply is selected. Click OK. The results appear in a new tab. To automatically widen the columns, click in between A and B.
What you get is a pretty complete list of statistics. You get the sample means and variances and the pooled variance. Moving down, you have the t value, which is called t Stat. Next is the area in one tail of the t-distribution cut off by this value, followed by the critical t value for a one-tailed test with this number of degrees of freedom. Then we have the area cut off in two tails by the negative value of this t combined with the positive value of this t. Notice that the second probability is double the first. Then it provides the critical t value for a two-tailed test with this number of degrees of freedom. With this particular set of data, we reject the null hypothesis.
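For readers who want to see where the first rows of that output come from, here is a minimal Python sketch that reproduces the means, variances, pooled variance, degrees of freedom, and t statistic from raw data. The function name `pooled_t_summary` and the two data lists are hypothetical, since the spreadsheet's actual values are not given in the narration. The tail probabilities and critical values are left out because they need the t-distribution itself, which Python's standard library does not provide.

```python
import math
import statistics


def pooled_t_summary(group_1, group_2, hypothesized_diff=0.0):
    """Compute the descriptive rows and t statistic of an equal-variance t-test."""
    n1, n2 = len(group_1), len(group_2)
    mean_1, mean_2 = statistics.mean(group_1), statistics.mean(group_2)
    # statistics.variance is the sample variance (divides by N - 1).
    var_1, var_2 = statistics.variance(group_1), statistics.variance(group_2)

    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * var_1 + (n2 - 1) * var_2) / df
    t_stat = (mean_1 - mean_2 - hypothesized_diff) / math.sqrt(
        pooled_var * (1 / n1 + 1 / n2)
    )
    return {
        "mean_1": mean_1,
        "mean_2": mean_2,
        "variance_1": var_1,
        "variance_2": var_2,
        "pooled_variance": pooled_var,
        "df": df,
        "t_stat": t_stat,
    }


# Hypothetical data standing in for Columns E and F of the spreadsheet.
group_1 = [84, 91, 78, 95, 88, 90, 82, 87]
group_2 = [97, 102, 93, 99, 105, 96, 101, 98]

for name, value in pooled_t_summary(group_1, group_2).items():
    print(f"{name}: {value:.4f}")
```

Leaving `hypothesized_diff` at its default of zero mirrors leaving the Hypothesized Mean Difference box blank in the dialog.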
- Explain how to calculate simple probability.
- Review the Excel statistical formulas for finding mean, median, and mode.
- Differentiate statistical nomenclature when calculating variance.
- Identify components when graphing frequency polygons.
- Explain how t-distributions operate.
- Describe the process of determining a chi-square.