Congruence coefficient and the alignment of factor loads

Rachid Laajaj, Karen Macours, Daniel Alejandro Pinzon Hernandez, Omar Arias, Samuel D. Gosling, Jeff Potter, Marta Rubio-Codina, Renos Vakis

This protocol is extracted from the research article:

Challenges to capture the big five personality traits in non-WEIRD populations

**Sci Adv**, Jul 10, 2019; DOI: 10.1126/sciadv.aaw5226

Procedure

Tucker’s congruence coefficient (or just congruence coefficient) is an index that assesses the similarity between the factor structures of the same set of items applied to two different populations. Principal component analysis (PCA) is a dimensionality reduction technique that reduces a number of items to a smaller number of components (in our case five) that best explain the variation of the items. Each component is a weighted average of the items, and the vector of weights is also called the vector of loadings. The correlation between the vectors of loadings in two different populations provides a measure of similarity between the two components.

After applying the PCA to two different populations, one calculates the congruence coefficient of the two vectors of loadings to assess the similarity between a component *x* and a component *y*:

$$\phi(\mathbf{x},\mathbf{y})=\frac{\sum_{i}x_{i}y_{i}}{\sqrt{\left(\sum_{i}x_{i}^{2}\right)\left(\sum_{i}y_{i}^{2}\right)}}$$

where *x*_{i} and *y*_{i} are the loadings of item *i* on components *x* and *y*, respectively (each one extracted by applying the PCA of the same items to a different population). In our case, one population is the database of interest and the other is a population from the United States, which is used as the reference. Hence, a higher congruence coefficient indicates that the factor structure follows the one that has been found in the United States, where the Big Five personality traits (PTs) clearly stand out. To obtain the normative target, the Varimax orthogonal rotation was used for the U.S. data.
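The formula above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' code: the function name `congruence` and the loading values are hypothetical, chosen only to show the arithmetic.

```python
import numpy as np

def congruence(x, y):
    """Tucker's congruence coefficient between two loading vectors."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Sum of cross-products divided by the root of the product of sums of squares
    return float(x @ y / np.sqrt((x @ x) * (y @ y)))

# Hypothetical loadings of five items on one component in two populations
x = [0.70, 0.65, 0.60, 0.10, 0.05]
y = [0.68, 0.60, 0.55, 0.20, 0.00]
phi = congruence(x, y)
print(round(phi, 3))
```

Note that the loadings are not mean-centered, so the coefficient is the cosine of the angle between the two loading vectors: proportional vectors give 1 and orthogonal vectors give 0.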

The congruence coefficient can be interpreted as a standardized measure of proportionality of elements in both vectors. A coefficient that is equal to 1 corresponds to a perfectly identical factor structure between the two populations, while a coefficient equal to 0 corresponds to a structure that is completely orthogonal.

The calculation of the congruence coefficient also requires a decision on the type of rotation to be applied to the survey data. The Procrustes rotation toward the factor structure of the reference population is typically used in confirmatory factor analysis to ease comparability. The factor solution obtained from a replication sample is rotated orthogonally to conform as closely as possible to a predetermined factor structure (i.e., the target). Because the spatial orientation of factors in factor analysis is arbitrary, factor solutions obtained in different groups may be rotated in reference to each other to maximize their similarity. This is known as the Procrustes rotation or targeted rotation. Compared to other rotation choices, Procrustes tends to increase congruence coefficients. In the survey data, not applying the Procrustes rotation reduces the congruence coefficient, hence reinforcing the concerns raised about the factor structure (results available upon request). The low level of congruence in the survey data is especially notable given that we use the Procrustes rotation method, which specifically rotates the data to obtain the factor structure that most aligns with its target (the U.S. data).
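An orthogonal Procrustes rotation of this kind can be sketched with a singular value decomposition. The sketch below assumes both loading matrices are arranged as items by factors and uses the standard least-squares solution; the function name `procrustes_rotate` and the simulated matrices are hypothetical, and the authors' exact implementation is not described in the text.

```python
import numpy as np

def procrustes_rotate(loadings, target):
    """Orthogonally rotate `loadings` (items x factors) toward `target`."""
    # Minimize ||loadings @ R - target|| over orthogonal R:
    # with U S V' = svd(loadings' target), the solution is R = U V'
    u, _, vt = np.linalg.svd(loadings.T @ target)
    rotation = u @ vt
    return loadings @ rotation, rotation

# Simulated example: 15 items, 5 factors (values are illustrative only)
rng = np.random.default_rng(0)
target = rng.normal(size=(15, 5))
q, _ = np.linalg.qr(rng.normal(size=(5, 5)))  # an arbitrary orthogonal matrix
sample = target @ q                           # same structure, different orientation
rotated, rotation = procrustes_rotate(sample, target)
```

In this simulated case the sample differs from the target only by its orientation, so the rotation recovers the target exactly. With real survey data the rotated loadings only approximate the target, and the remaining discrepancy is what the congruence coefficient measures.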

The congruence coefficient is initially calculated by component or factor. We average it across the five factors to obtain a single congruence coefficient that summarizes the similarity of the factor structures. To know which factor in the first population matches each factor of the second population, we calculate the average congruence coefficient for every possible combination and keep the one that maximizes the congruence coefficient. Because the matching is chosen to maximize congruence, coefficients tend to be higher when there are fewer items per factor (for reasons similar to overfitting in a regression). To give an order of magnitude for the interpretation, Lorenzo-Seva and ten Berge (*50*) indicate that a congruence coefficient in a range of 0.85 to 0.94 shows fair similarity, whereas coefficients over 0.95 imply a high level of similarity (where the components can be considered equal).
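The matching step can be sketched as an exhaustive search over one-to-one pairings of factors; with five factors there are only 120 pairings. This sketch assumes both loading matrices have the same number of columns and does not handle sign reflections of factors; the function names and the simulated data are hypothetical.

```python
from itertools import permutations
import numpy as np

def congruence_matrix(a, b):
    """phi[j, k]: congruence between column j of `a` and column k of `b`."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    norms = np.sqrt(np.outer((a ** 2).sum(axis=0), (b ** 2).sum(axis=0)))
    return (a.T @ b) / norms

def best_matching(a, b):
    """Pairing of factors that maximizes the average congruence."""
    phi = congruence_matrix(a, b)
    k = phi.shape[0]
    best_perm, best_mean = None, -np.inf
    for perm in permutations(range(k)):
        # perm[j] is the factor of `b` paired with factor j of `a`
        mean_phi = phi[np.arange(k), list(perm)].mean()
        if mean_phi > best_mean:
            best_perm, best_mean = perm, mean_phi
    return best_perm, float(best_mean)

# Simulated example: the sample has the reference factors in a shuffled order
rng = np.random.default_rng(1)
reference = rng.normal(size=(15, 5))
sample = reference[:, [2, 0, 4, 1, 3]]
perm, mean_phi = best_matching(sample, reference)
```

Here `perm` gives, for each sample factor, the index of the reference factor it is paired with, and `mean_phi` is the averaged congruence coefficient for that pairing.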

In Table 1, we complement the congruence coefficient with a visual inspection of how the items sort themselves into the components for each database. To do this, each item was assigned to the component on which it has the highest loading. The red cells highlight items that are matched to the wrong component with respect to the five-factor model (FFM). One should take into account that, as explained above, each component was matched with a Big Five PT in the way that best aligns the factor structures, and despite this, we found an average of 4 (of 15) items per dataset that do not fit in the right component.
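The item-sorting check can be sketched as follows. The function name `misassigned_items`, the `by_magnitude` option, and the small loading matrix are hypothetical; the text only states that each item is assigned to the component with the highest loading, so comparing absolute loadings (which may matter for reverse-keyed items) is offered here as an assumption, not as the authors' rule.

```python
import numpy as np

def misassigned_items(loadings, expected, by_magnitude=False):
    """Assign each item to its highest-loading component and flag mismatches."""
    loadings = np.asarray(loadings, dtype=float)
    # Assumption: optionally compare absolute loadings instead of raw values
    scores = np.abs(loadings) if by_magnitude else loadings
    assigned = scores.argmax(axis=1)
    wrong = np.flatnonzero(assigned != np.asarray(expected))
    return assigned, wrong

# Hypothetical example: 6 items, 2 components; items 0-2 are expected on
# component 0 and items 3-5 on component 1
loadings = [
    [0.70, 0.10],
    [0.65, 0.20],
    [0.25, 0.55],  # expected on component 0, but loads higher on component 1
    [0.15, 0.60],
    [0.05, 0.72],
    [0.30, 0.50],
]
expected = [0, 0, 0, 1, 1, 1]
assigned, wrong = misassigned_items(loadings, expected)
print(wrong.tolist())
```

The length of `wrong` is the count of items that do not fit in the expected component, which corresponds to the number of highlighted cells for one dataset.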
