Two-factor ANOVA is the appropriate test when you're studying two independent variables, or factors, in one study. In this video, learn the basics of this test.
- [Instructor] Let's meet the ANOVA for analyzing the results of a two-factor study. Factor is another name for an independent variable, and you can investigate more than one in a study. A factor can have any number of levels, and we'll look at the simplest case: two factors with two levels each, which is called a two-by-two design. Here's an example of a two-by-two study. A researcher investigates the effects of background (silence versus music) and light color (red versus green) on problem solving. 20 people are randomly assigned to one of four combinations: silence and red, silence and green, music and red, or music and green. Five people are in each combination. The dependent variable is the time in seconds to solve a paper-and-pencil maze. Here's a full set of hypotheses: the hypotheses about red and green, the hypotheses about silence and music, and the hypotheses about the interaction, with alpha equal to .05. The table on the left shows the data. The dependent variable is, as I said, the time in seconds to solve a paper-and-pencil maze. The levels of background, silence and music, are in rows, so background is the row variable. Now consider the columns. The levels of light color, red and green, are in columns, so light color is the column variable. And now the cells. Each combination of a row level and a column level is called a cell. Statistics for the hypothesis tests involve row means, column means, and cell means. Five kinds of variance are in this dataset: variance among row means, variance among column means, variance because of row and column interaction, variance within cells, and the variance among all the scores. Let's have a quick look at each one. As for the row variance, well, like any set of numbers, row means have a variance. This is called the Mean Square Row, or MS Row. With respect to the column variance, same story. Like any set of numbers, column means have a variance. This is called the Mean Square Column, or MS Col.
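To make the row, column, and cell means concrete, here's a short Python sketch of a two-by-two layout like the one described. The times below are invented for illustration; they are not the study's actual data.

```python
import numpy as np

# Hypothetical maze-solving times (seconds) for a 2x2 design:
# rows = background (silence, music), columns = light color (red, green),
# 5 participants per cell. These numbers are made up for illustration.
data = np.array([
    [[20, 21, 23, 22, 24], [30, 29, 31, 32, 28]],   # silence: red, green
    [[25, 27, 26, 28, 24], [35, 34, 36, 33, 37]],   # music:   red, green
])

cell_means = data.mean(axis=2)        # one mean per cell (2x2 grid)
row_means = data.mean(axis=(1, 2))    # silence vs. music
col_means = data.mean(axis=(0, 2))    # red vs. green
grand_mean = data.mean()              # mean of all 20 scores

print(cell_means)   # [[22. 30.] [26. 35.]]
print(row_means)    # [26.  30.5]
print(col_means)    # [24.  32.5]
print(grand_mean)   # 28.25
```

MS Row is then just the variance of the row means, and MS Col the variance of the column means, each scaled up in the ANOVA computations by the number of scores behind each mean.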
We have within-cell variance too. The scores within each cell vary around their cell mean. These variances are approximately equal even though the cell means are not. We pool these variances to create an estimate of population variance. The pooled estimate is the Mean Square Within, or MSW. And now for interaction variance. That's the variance due to the combination of the levels of the row variable with the levels of the column variable. It's what's left over after accounting for the row variance, the column variance, and the within-cell variance. It's called the Mean Square Row by Column, or MS Row by Col. All the scores, regardless of row, column, or cell, vary around the grand mean, the mean of all the scores. This is the Mean Square Total, or MST. Let's think about the reasoning process. If the null hypothesis about silence and music is true, then the row means should vary as much or as little as any two numbers randomly selected from the population. So we'll compare MS Row to an estimate of population variance. If it's higher, we reject the null hypothesis. If the null hypothesis about red and green is true, then the column means should vary as much or as little as any two numbers randomly selected from the population. So we'll compare MS Col to an estimate of population variance. If it's higher, we reject that null hypothesis. If the null hypothesis about the background by light color interaction is true, well, after accounting for row variance, column variance, and within-cell variance, the remaining variance should be about the same as the population variance. So we'll compare MS Row by Col to an estimate of population variance, and if it's higher, we'll reject the null hypothesis. Examining the mean square for rows: this is a variance estimate based on the number of rows, so the degrees of freedom here equals the number of rows minus one. And examining the column variance: this is a variance estimate based on the number of columns.
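The pooling step for MSW can be sketched in Python with the same kind of invented two-by-two data (again, not the study's actual numbers): compute each cell's variance around its own cell mean, then combine them into one pooled estimate.

```python
import numpy as np

# Made-up 2x2 data: rows = background, columns = light color, 5 per cell.
data = np.array([
    [[20, 21, 23, 22, 24], [30, 29, 31, 32, 28]],
    [[25, 27, 26, 28, 24], [35, 34, 36, 33, 37]],
])

# MSW pools the sample variance of each cell around its own cell mean.
cell_vars = data.var(axis=2, ddof=1)     # unbiased variance within each cell
n_per_cell = data.shape[2]

# Sum of squares within = sum over cells of (n - 1) * cell variance.
ss_within = (n_per_cell - 1) * cell_vars.sum()
df_within = data.size - data.shape[0] * data.shape[1]   # people minus cells
msw = ss_within / df_within

print(msw)   # 2.5 for this invented data
```

Each mean square in the reasoning above (MS Row, MS Col, MS Row by Col) is then compared against this MSW; that ratio is the F statistic for the corresponding hypothesis test.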
So, degrees of freedom for columns equals the number of columns minus one. As for the Mean Square Within, this is a variance estimate based on pooling the cell variances, so its degrees of freedom is the denominator of that estimate, which comes out to the number of people minus the number of cells. Examining the Mean Square Total, this variance estimate is based on all the people in the study, so degrees of freedom total equals the number of people minus one. So, for this example, degrees of freedom for rows equals one, degrees of freedom for columns equals one, degrees of freedom for the interaction equals one, degrees of freedom within comes out to 16, and degrees of freedom total is 19. Some notable relationships: the degrees of freedom for total is the sum of the other four degrees of freedom, and the sum of squares total is the sum of the other four sums of squares. Here's the ANOVA table for this example. And here's the graph of the cell means. Summing up: we talked about rows, columns, and cells, and about three sets of hypotheses, and we test each one by comparing its mean square to the Mean Square Within.
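Putting the pieces together, here's a sketch that builds a full two-by-two ANOVA table from invented data and checks the two relationships just noted: the degrees of freedom and the sums of squares each add up to their totals. The numbers are illustrative only, not the study's data.

```python
import numpy as np

# Made-up 2x2 data: rows = background, columns = light color, 5 per cell.
data = np.array([
    [[20, 21, 23, 22, 24], [30, 29, 31, 32, 28]],
    [[25, 27, 26, 28, 24], [35, 34, 36, 33, 37]],
])
r, c, n = data.shape          # rows, columns, people per cell
grand = data.mean()

# Sums of squares for each source of variance.
ss_row = c * n * ((data.mean(axis=(1, 2)) - grand) ** 2).sum()
ss_col = r * n * ((data.mean(axis=(0, 2)) - grand) ** 2).sum()
ss_within = ((data - data.mean(axis=2, keepdims=True)) ** 2).sum()
ss_total = ((data - grand) ** 2).sum()
ss_inter = ss_total - ss_row - ss_col - ss_within   # what's left over

# Degrees of freedom; the four components sum to the total.
df_row, df_col = r - 1, c - 1
df_inter = df_row * df_col
df_within = r * c * (n - 1)     # people minus cells
df_total = r * c * n - 1
assert df_total == df_row + df_col + df_inter + df_within

# Each F statistic compares a mean square to the Mean Square Within.
msw = ss_within / df_within
f_row = (ss_row / df_row) / msw
f_col = (ss_col / df_col) / msw
f_inter = (ss_inter / df_inter) / msw
print(f_row, f_col, f_inter)
```

Comparing each F against the critical value from the F distribution (with the matching numerator degrees of freedom and 16 denominator degrees of freedom, at alpha = .05) decides each of the three hypothesis tests.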