From the course: Introduction to jamovi

Exploratory factor analysis - jamovi Tutorial

- [Instructor] When it comes to finding clusters of variables in your data, the two most common approaches, by far, are Principal Component Analysis, which we covered in a previous video, and Exploratory Factor Analysis, which I'm going to talk about right here. For a lot of people, the differences between these two don't really amount to much: it's a distinction without a difference, or it's sort of like the two procedures are identical twins from different parents. That's an interesting thing, because these two methods make profoundly different assumptions about the relationships between the observed variables and the implicit factors that they're related to, and really about which one drives the other. But I want to show you how you can do both of these in jamovi, so you can make your choice depending on what seems to work best with your particular data.
Now, Principal Component Analysis and Exploratory Factor Analysis are sophisticated topics, and my goal here is not to give you a thorough demonstration of the techniques and the theory, but really to show you how to set it up in jamovi and where to find the output, so you can match that up with your understanding of how these things work in real life.
To do this, I'm going to use the Big Five data set, where we have information from people about age and gender, and then we have 50 variables, 10 each for the five major personality characteristics: extraversion, conscientiousness, agreeableness, openness, and neuroticism. What we're going to do is come up here to Factor and choose Exploratory Factor Analysis. The first thing we need to do there is tell it what variables we're going to use. Well, we're going to use the rating scale ones that are supposed to be measuring these five characteristics. There are 10 each, so there are 50 total. I'm going to do a Shift + Click down to the bottom, and then move those over to Variables. And then you can see the table starts populating immediately.
But we do get to make some choices. Now, first off, it's choosing to do seven factors, and we saw this when we did Principal Components as well. But I want to do it a little bit differently. So, what I'm going to do is come over here and choose a different extraction method. Minimum Residuals is a lot like the Least Squares criterion that we have in regression. And we have some other choices. Principal Axis is going to make it practically identical to Principal Component Analysis. Let's use Maximum Likelihood, 'cause that's also a really productive approach in a lot of situations.
We can also choose different kinds of rotation. Remember, a rotation is a way of changing how the results are presented. Think about it this way: if you have height as one variable and weight as another, those are going to be highly correlated, so you can rotate the data a little bit and maybe talk about big versus small instead, and it's a way of making things more interpretable. So, we can come down here, and I like Promax, so I'm going to choose that one; it allows for correlated factors. It's going to shift the table a little bit over here, but we do still have seven factors.
Now, because I know this is supposed to be measuring five factors, I'm going to come right here and choose a fixed number, and I'll put in five right there, so we can see how well it lines up with that. When I do that, and when I let things sort in the order that they appear in the data, so we have E1 through E10, and so on, you can see they fall into this nice little step, step, step pattern. There are numbers in the off-diagonal cells, but you don't see them, because anything below 0.3 is suppressed. That's why we're missing one here, 'cause that would be less than 0.3, and these ones are a little bit bigger, so they show. But you still see this really clear pattern going all the way down.
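
To make the suppression of small loadings concrete, here is a minimal Python sketch of how a loading table ends up with blank cells. It is not jamovi's own code: the function name `format_loadings`, the item names, and the loading values are all made up for illustration, and it assumes the 0.3 cutoff is applied to the absolute value of each loading.

```python
def format_loadings(loadings, names, hide_below=0.3):
    """Return one text row per item, blanking loadings under the cutoff.

    Assumption: the cutoff is compared with the absolute value, so a
    loading of -0.31 is still displayed.
    """
    rows = []
    for name, row in zip(names, loadings):
        cells = [f"{v:6.2f}" if abs(v) >= hide_below else " " * 6 for v in row]
        rows.append(f"{name:<4}" + "".join(cells))
    return rows


# Hypothetical loadings for four items on two factors (made-up numbers).
names = ["E1", "E2", "N1", "N2"]
loadings = [
    [0.71, 0.05],
    [0.64, -0.12],
    [0.10, 0.68],
    [-0.31, 0.59],
]

table = format_loadings(loadings, names)
print("\n".join(table))
```

The small cross-loadings are still part of the solution; they are only hidden in the display, which is what produces the step pattern when the items are kept in data order.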
I can sort them by size, but that makes it harder to interpret right here. I do want to mention a couple of other things that we can do. Number one is Bartlett's test of sphericity. Sphericity is kind of like normality: in most situations, you would expect your data to be normally distributed in a bell curve, but when you have a multi-dimensional data set, like you're using with Exploratory Factor Analysis, what you're dealing with is something that's called sphericity. And if we come down here now, this is going to be similar to what we saw in our previous example with Principal Component Analysis: our data do differ significantly from sphericity. Now, probably the easiest way to check that, although it's a little tedious, would be to go back and look at the distributions of the 50 variables going into it, to see, for instance, are they symmetrical, do we have outliers? You can check that quickest by doing a whole series of box plots, where it's very easy to check symmetry and whether things sort of match the expectations of a normal distribution.
Another thing we can do is come down here and start looking at the factor summary, which is going to tell us the sum of squared loadings, the percent of variance that each of the factors accounts for, and the cumulative variance. This is basically identical to what we had with the Principal Component Analysis. We could also look at the factor correlations, because I used a rotation that allowed for these to be correlated with each other, and that's what we have right there. There is a collection of model-fit measures that are available as well. That includes things like RMSEA, that's the Root Mean Square Error of Approximation. And we have several others, including a chi-square there at the end, that you can use to see how well the data match the five-factor structure that we're putting in here.
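
As a rough illustration of what the factor summary columns mean, the Python sketch below computes the sum of squared loadings, the percent of variance, and the cumulative percent from a small loading matrix. The function name `factor_summary` and the loadings are hypothetical, and the sketch assumes standardized variables and uncorrelated factors; with an oblique rotation such as Promax, software may also take the factor correlations into account, so treat this as a simplified picture rather than a reproduction of the jamovi output.

```python
def factor_summary(loadings):
    """Return (ss_loadings, percent_of_variance, cumulative_percent) per factor.

    Assumptions: each variable is standardized (one unit of variance each)
    and the factors are uncorrelated.
    """
    n_variables = len(loadings)
    n_factors = len(loadings[0])
    rows = []
    cumulative = 0.0
    for j in range(n_factors):
        # Sum of squared loadings down one column of the loading table.
        ss = sum(row[j] ** 2 for row in loadings)
        percent = 100.0 * ss / n_variables
        cumulative += percent
        rows.append((ss, percent, cumulative))
    return rows


# Hypothetical loadings for four items on two factors (made-up numbers).
example_loadings = [
    [0.8, 0.0],
    [0.6, 0.0],
    [0.0, 0.5],
    [0.0, 0.5],
]

summary = factor_summary(example_loadings)
for i, (ss, pct, cum) in enumerate(summary, start=1):
    print(f"Factor {i}: SS loadings = {ss:.2f}, "
          f"% of variance = {pct:.1f}, cumulative % = {cum:.1f}")
```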
And then, finally, you can also do a graphical scree plot, and this is a way of looking at how the factors that are determined by the factor analysis account for the variability that the data began with. If each variable is standardized, then you have one unit of variance for each of the variables. We have 50 variables, so we have 50 units of variance. And we see that the first factor accounts for about seven, then four, and so on. These are the five that we expect from the Big Five. Empirically, we could go down to six or seven, but those don't have the theoretical tightness that we expect from a measure that's supposed to be looking at the Big Five personality factors.
And so, the analysis we have here is nearly identical to the one that we had with Principal Component Analysis. We certainly are getting the same conclusions out of it, and which one you choose is really a preference about the nature of the relationship between the individual items and the factors, or the components, that you're getting from the analysis. Either one's going to give you insight into how, for instance, you could group variables, and either one's going to help you find the stability of your data, by averaging out across these items, and getting more reliable conclusions that will work in new situations and help you put your insights into practice.
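
The "units of variance" idea behind the scree plot can be checked with a few lines of Python. This is only a sketch: the item scores are made up, and `standardize` and `percent_of_total` are hypothetical helper names. It shows that each standardized variable contributes one unit of variance, so a value of about seven on a scree plot for 50 variables corresponds to roughly 14 percent of the total.

```python
from statistics import mean, stdev, variance


def standardize(column):
    """Convert raw scores to z-scores using the sample standard deviation."""
    m, s = mean(column), stdev(column)
    return [(x - m) / s for x in column]


def percent_of_total(eigenvalue, n_variables):
    """Share of total variance for one factor, with one unit per variable."""
    return 100.0 * eigenvalue / n_variables


# Hypothetical raw scores for three items (made-up numbers).
items = {
    "E1": [1, 2, 4, 5, 3],
    "E2": [2, 2, 5, 4, 1],
    "N1": [5, 3, 1, 2, 4],
}

z_scores = {name: standardize(col) for name, col in items.items()}

# After standardizing, each item has variance 1, so the total equals
# the number of items.
total_variance = sum(variance(col) for col in z_scores.values())
print(round(total_variance, 6))

# A scree value of about 7 out of 50 units of variance.
print(percent_of_total(7, 50))
```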
