Cluster analysis and factor analysis are two statistical methods of data analysis. Both are heavily used in the natural and behavioral sciences. Each allows the user to group parts of the data, either into "clusters" or onto "factors," depending on the type of analysis. Researchers new to these methods may feel that the two are similar overall. While cluster analysis and factor analysis seem similar on the surface, they differ in many ways, including in their overall objectives and applications.
Cluster analysis and factor analysis have different objectives. The usual objective of factor analysis is to explain correlation in a set of data and relate variables to each other, while the objective of cluster analysis is to address heterogeneity in each set of data. In spirit, cluster analysis is a form of categorization, whereas factor analysis is a form of simplification.
Complexity is one point on which factor analysis and cluster analysis differ: data size affects each analysis differently. As the set of data grows, searching for the best cluster solution becomes computationally intractable, because the number of possible cluster solutions grows explosively with the number of data points. For example, the number of ways to divide 20 objects into 4 clusters of equal size is over 488 million. For cluster analysis, this rules out direct computational methods, the category of methods to which factor analysis belongs.
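The figure of over 488 million can be checked with a short calculation. The sketch below assumes the clusters are unlabeled (swapping two whole clusters does not count as a new solution), which is the reading that reproduces the number quoted above. The function name `equal_size_partitions` is made up for this illustration.

```python
from math import factorial


def equal_size_partitions(n_objects: int, n_clusters: int) -> int:
    """Count the ways to split n_objects into n_clusters unlabeled groups of equal size."""
    if n_objects % n_clusters != 0:
        raise ValueError("n_objects must be divisible by n_clusters")
    size = n_objects // n_clusters
    # Multinomial count for labeled groups, divided by the number of
    # orderings of the groups, because the clusters carry no labels.
    return factorial(n_objects) // (factorial(size) ** n_clusters * factorial(n_clusters))


print(equal_size_partitions(20, 4))  # 488864376
```

Raising the first argument from 20 to 40 while keeping 4 clusters shows how quickly the count grows as objects are added.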
Even though the solutions to both factor analysis and cluster analysis problems are subjective to some degree, factor analysis allows a researcher to obtain a "best" solution, in the sense that the researcher can optimize a certain aspect of the solution (orthogonality, ease of interpretation and so on). This is not so for cluster analysis, since all algorithms that could guarantee a best cluster solution are computationally inefficient. Hence, researchers employing cluster analysis cannot guarantee an optimal solution.
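One way to see why an optimal cluster solution cannot be guaranteed is to run a common heuristic from two different starting points. The sketch below uses a plain k-means style iteration (Lloyd's algorithm), chosen here only as an illustration and not named by the sources above. The four points and the starting centers are invented for the example.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def lloyd(points, centers, max_iter=100):
    """Run assignment/update iterations from the given starting centers.

    Returns the final centers and the total within-cluster squared distance.
    """
    centers = [tuple(c) for c in centers]
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest center.
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: sq_dist(p, centers[i]))
            groups[nearest].append(p)
        # Update step: move each center to the mean of its group
        # (an empty group keeps its old center).
        new_centers = [
            tuple(sum(coord) / len(g) for coord in zip(*g)) if g else c
            for g, c in zip(groups, centers)
        ]
        if new_centers == centers:
            break  # nothing moved, so the procedure has converged
        centers = new_centers
    cost = sum(min(sq_dist(p, c) for c in centers) for p in points)
    return centers, cost


# Four corners of a wide rectangle: the natural grouping is left pair vs. right pair.
points = [(0, 0), (0, 1), (4, 0), (4, 1)]

_, stuck_cost = lloyd(points, [(2, 0), (2, 1)])  # starts between the two pairs
_, good_cost = lloyd(points, [(0, 0), (4, 1)])   # starts inside each pair

print(stuck_cost, good_cost)
```

Both runs converge, but the first settles on a top/bottom split with a total squared distance of 16, while the second finds the left/right split with a total of 1. The procedure itself gives no signal that the first result is worse, which is why heuristic cluster solutions are usually rerun from several starting points.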
Factor analysis and cluster analysis differ in how they are applied to real data. Because factor analysis can reduce an unwieldy set of variables to a much smaller set of factors, it is suitable for simplifying complex models. Factor analysis also has a confirmatory use, in which the researcher develops a set of hypotheses regarding how variables in the data are related. The researcher can then run factor analysis on the data set to confirm or reject these hypotheses. Cluster analysis, on the other hand, is suitable for classifying objects according to certain criteria. For example, a researcher can measure certain aspects of a group of newly discovered plants and place these plants into species categories by employing cluster analysis.
- "Analyzing Multivariate Data"; James Lattin, et al.; 2003
- "The Essentials of Factor Analysis"; Dennis Child; 2006