Which Test: Factor Analysis (FA, EFA, PCA, CFA)

Confused about when to use FA, EFA, PCA, or CFA? Well, all of them are interdependence methods, in which no single variable or group of variables is defined as being independent or dependent. The statistical procedure analyzes all variables in the data set simultaneously, so the goal of these interdependence procedures is to uncover structure by grouping variables (as in factor analysis) rather than respondents (typically in cluster analysis) or objects (typically in perceptual mapping). Interdependence methods therefore do not attempt to predict one or more variables from others, as regression analysis does.

Let's take a quick look at each of them. Exploratory Factor Analysis (EFA) is often referred to as Factor Analysis (FA) or as common Factor Analysis (no, not abbreviated as CFA), and should be differentiated from its close ally, Principal Components Analysis (PCA). While all of these “explore” (hence “exploratory”) the interrelationships among several variables to explain them in terms of their common underlying dimensions (factors), there is a subtle (but very important) difference between EFA and PCA.

Think “communalities”. Communality (denoted by h²) is defined as the amount of variance that a specific manifest (measured) variable shares with the other manifest variables included in the analysis. It also refers to the amount of variance that a manifest variable has in common with the latent construct on which it loads (the common factor). From the angle of the latent construct, communality is the amount of variance in a manifest variable that the latent construct explains.
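To make the definition concrete, here is a minimal sketch of how a communality is usually computed from factor loadings. The loading values are made up for illustration, and the calculation assumes standardized variables and uncorrelated (orthogonal) factors, in which case h² is simply the sum of a variable's squared loadings.

```python
import numpy as np

# Hypothetical loadings (not from any real data set):
# 4 manifest variables (rows) on 2 common factors (columns).
loadings = np.array([
    [0.80, 0.10],
    [0.70, 0.20],
    [0.15, 0.75],
    [0.05, 0.60],
])

# Communality h2 = sum of squared loadings across the factors.
# This shortcut assumes orthogonal (uncorrelated) factors.
communalities = (loadings ** 2).sum(axis=1)

# Whatever is not shared with the common factors is unique to the variable.
uniqueness = 1.0 - communalities

for i, (h2, u2) in enumerate(zip(communalities, uniqueness), start=1):
    print(f"variable {i}: h2 = {h2:.4f}, non-common variance = {u2:.4f}")
```

A variable with a high h² is well represented by the common factors; a low h² suggests it shares little with the other variables in the analysis.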

Therefore, EFA relies on various assumptions to estimate the latent constructs (e.g. that the correlation among manifest variables is due to one or more common underlying factors), while PCA makes no assumption about a model and is concerned only with which linear relationships exist and how any particular variable might contribute to those relationships.

Expressed in terms of communalities: Because EFA is a model (so some variability will be explained by the model and some will not), it analyzes only the variance that is shared with other items (the common variance, or “covariance”, meaning the proportion of variance in one variable explained by the other variables), so communalities will always be less than 1. The leftover non-common variance (error variance) is then dropped from the analysis.

PCA, on the other hand, is not a model (so there is no unexplained error) and analyzes all the variance in the variables (not just the common variance), so the (initial) communalities are all 1, which represents all (100%) of the variance of each item included in our analysis. No differentiation (or separation) is made between common variance and non-common variance (error variance).
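The contrast in initial communalities can be sketched in a few lines. The data below are simulated purely for illustration, and the EFA starting values use squared multiple correlations (SMCs), which is a common choice in factoring programs but an assumption here, since the text above does not name a specific estimate.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data (an assumption for illustration): 500 cases, 4 items
# driven by one common factor plus item-specific noise.
n = 500
factor = rng.normal(size=(n, 1))
true_loadings = np.array([[0.8, 0.7, 0.6, 0.5]])
noise = rng.normal(size=(n, 4)) * np.sqrt(1 - true_loadings ** 2)
X = factor @ true_loadings + noise

R = np.corrcoef(X, rowvar=False)

# PCA analyzes R as it is: the diagonal of 1s means every item starts
# with a communality of 1 (all of its variance is analyzed).
pca_initial = np.diag(R).copy()

# EFA replaces the diagonal with an estimate of shared variance only.
# Here: the squared multiple correlation of each item with the others.
smc = 1.0 - 1.0 / np.diag(np.linalg.inv(R))

# Sanity check: if ALL components are retained, PCA reproduces 100%
# of each item's variance (sum of squared loadings per item = 1).
eigvals, eigvecs = np.linalg.eigh(R)
full_loadings = eigvecs * np.sqrt(eigvals)
pca_full = (full_loadings ** 2).sum(axis=1)

print("PCA initial communalities:", np.round(pca_initial, 3))
print("EFA initial communalities (SMC):", np.round(smc, 3))
print("PCA, all components kept:", np.round(pca_full, 3))
```

Once only a few components are kept, the PCA communalities after extraction drop below 1 as well; it is the starting point that differs between the two methods.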

EFA commonly relies on Principal Axis Factoring (PAF), while other methods (e.g. alpha and image factoring) are less commonly used. PCA relies only on the Principal Components method, hence the name PCA. A popular definition of PCA is: “a linear transformation technique that provides a smaller set of uncorrelated variables (called components) from a set of correlated variables while maintaining most of the information in the original data set. These components can then be used in place of the original variables in the regression model” (Yun Wang, Cornell University, 2002).
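The quoted definition of PCA can be illustrated with a short sketch: correlated variables go in, uncorrelated components come out, and the first component carries most of the variance. The data are simulated and the variable names are placeholders; this is one straightforward way to do it with an eigendecomposition of the correlation matrix, not the only one.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated correlated variables (an assumption for illustration):
# 3 variables that all depend on one shared source plus noise.
n = 300
source = rng.normal(size=(n, 1))
X = source @ np.array([[1.0, 0.8, 0.6]]) + 0.5 * rng.normal(size=(n, 3))

# Standardize so that the analysis is based on the correlation matrix.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
R = np.corrcoef(Z, rowvar=False)

# Eigendecomposition of R, sorted from largest to smallest eigenvalue.
eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# The linear transformation: component scores are weighted sums of
# the standardized variables.
scores = Z @ eigvecs

# The components are uncorrelated with each other.
score_corr = np.corrcoef(scores, rowvar=False)

# Share of total variance carried by each component.
explained = eigvals / eigvals.sum()

print("correlations among components:\n", np.round(score_corr, 6))
print("proportion of variance per component:", np.round(explained, 3))
```

Keeping only the first column of `scores` would give a single variable that could stand in for the three originals, for example as a predictor in a regression.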

The differences between the two can be so small that the results are often practically the same. However, they can also be large, so it is important to select the right method of factoring depending on the objective of the research. PCA (Principal Components method) is preferred by most researchers as a method for data reduction (combining many variables into a few components with low inter-correlations that summarize their variance), while the less commonly used EFA (via Principal Axis Factoring) is used when the objective is to detect and model the structure of correlations among the variables, or to establish a parsimonious representation of the relationships among manifest variables. The solution of a PCA is a “means to an end”, while the solution of an EFA normally is “the end” itself.
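To see how the two approaches can diverge, the sketch below runs both on the same correlation matrix. The matrix is hypothetical: it is built from a one-factor model with loadings 0.8, 0.7, 0.6 and 0.5. The function `principal_axis_factoring` is an illustrative implementation of iterated PAF written for this post (starting from squared multiple correlations), not the routine of any particular statistics package.

```python
import numpy as np


def principal_axis_factoring(R, n_factors, max_iter=200, tol=1e-6):
    """Iterated principal axis factoring on a correlation matrix (sketch)."""
    # Initial communalities: squared multiple correlations.
    h2 = 1.0 - 1.0 / np.diag(np.linalg.inv(R))
    for _ in range(max_iter):
        reduced = R.copy()
        np.fill_diagonal(reduced, h2)  # communalities replace the 1s
        eigvals, eigvecs = np.linalg.eigh(reduced)
        top = np.argsort(eigvals)[::-1][:n_factors]
        # Clip guards against small negative eigenvalues of the reduced matrix.
        loadings = eigvecs[:, top] * np.sqrt(np.clip(eigvals[top], 0.0, None))
        new_h2 = (loadings ** 2).sum(axis=1)
        done = np.max(np.abs(new_h2 - h2)) < tol
        h2 = new_h2
        if done:
            break
    return loadings, h2


def principal_components(R, n_components):
    """Principal components extraction: the diagonal of R stays at 1."""
    eigvals, eigvecs = np.linalg.eigh(R)
    top = np.argsort(eigvals)[::-1][:n_components]
    loadings = eigvecs[:, top] * np.sqrt(eigvals[top])
    return loadings, (loadings ** 2).sum(axis=1)


# Hypothetical correlation matrix implied by a one-factor model.
lam = np.array([0.8, 0.7, 0.6, 0.5])
R = np.outer(lam, lam)
np.fill_diagonal(R, 1.0)

paf_loadings, paf_h2 = principal_axis_factoring(R, n_factors=1)
pca_loadings, pca_h2 = principal_components(R, n_components=1)

# The sign of an eigenvector is arbitrary, so compare absolute loadings.
print("PAF loadings:", np.round(np.abs(paf_loadings[:, 0]), 3))
print("PCA loadings:", np.round(np.abs(pca_loadings[:, 0]), 3))
print("PAF communalities:", np.round(paf_h2, 3))
print("PCA communalities:", np.round(pca_h2, 3))
```

With only four items and moderate loadings, the PCA loadings come out noticeably higher than the PAF loadings, because the component also absorbs non-common variance. With many items and high communalities the two sets of results tend to move closer together.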

If you are still not exactly sure whether you should do EFA or PCA, then I bet you most likely need PCA. So launch your Factor Analysis program, select the factoring method “Principal Components”, and you will be on your way to explaining all the variance in your variables and extracting your factors.

Confirmatory Factor Analysis (CFA) is generally part of a procedure such as Structural Equation Modeling (SEM), conducted via software such as LISREL, AMOS, Mplus, etc. The SEM model typically includes two different sub-models: 1) the measurement model (CFA) and 2) the structural model (SEM). In the measurement model we specify the measurement theory and then validate it with CFA, so the focus is on the “validity” of our latent constructs. CFA requires selecting the variables a priori on the basis of established theory and hypothesizing the number of factors in the model beforehand. We most commonly use the CFA measurement model to validate multi-item constructs, for example the set of items used to measure a construct such as satisfaction.

Further Reading: 

Wang, Yun (2002), Principal Components: Not Just Another Factor Analysis
Accessed July 24, 2002: http://cscu.cornell.edu/news/statnews/stnews49.pdf