Robust oblique Target-rotation for small samples

Beauducel, André; Hilger, Norbert

doi:10.3389/fpsyg.2023.1285212

ORIGINAL RESEARCH article

Front. Psychol., 27 November 2023

Sec. Quantitative Psychology and Measurement

Volume 14 - 2023 | https://doi.org/10.3389/fpsyg.2023.1285212


André Beauducel^*

Norbert Hilger

Department of Psychology, University of Bonn, Bonn, Germany

Introduction: Oblique Target-rotation in the context of exploratory factor analysis is a relevant method for the investigation of the oblique simple structure. It was argued that minimizing single cross-loadings by means of target rotation may lead to large effects of sampling error on the target rotated factor solutions.

Method: In order to minimize effects of sampling error on results of Target-rotation we propose to compute the mean cross-loadings for each block of salient loadings of the independent clusters model and to perform Target-rotation for the block-wise mean cross-loadings. The resulting transformation-matrix is then applied to the complete unrotated loading matrix in order to produce mean Target-rotated factors.

Results: A simulation study based on correlated independent clusters model and zero-mean cross-loading models revealed that mean oblique Target-rotation resulted in smaller bias of factor inter-correlations than conventional Target-rotation based on single loadings, especially when sample size was small and when the number of factors was large. An empirical example revealed that the similarity of Target-rotated factors computed for small subsamples with Target-rotated factors of the total sample was more pronounced for mean Target-rotation than for conventional Target-rotation.

Discussion: Mean Target-rotation can be recommended in the context of oblique factor models based on simple structure, especially for small samples. An R-script and an SPSS-script for this form of Target-rotation are provided in the Supplementary Material.

1 Introduction

Exploratory factor analysis (EFA) is a widely used multivariate method (Harris, 2013), especially in the context of the development of instruments for psychological assessment. Although confirmatory factor analysis may be used for similar purposes, there is still room for EFA because the expectation of perfect simple structure with one large salient loading of each observed variable on one factor and zero cross-loadings, i.e., an independent clusters model (ICM), may lead to unrealistic simplifications in the context of confirmatory factor analysis. The specification of the ICM for data sets with substantial cross-loadings may cause model misfit in confirmatory factor analysis resulting in model modifications and capitalization on chance (MacCallum et al., 1992). Hsu et al. (2014) demonstrated that even small cross-loadings not specified in the confirmatory factor model may result in substantial overestimation of factor covariances. Ximénez et al. (2022) have shown that unspecified cross-loadings in bifactor models may impair parameter recovery and that common fit indices may not be useful for the detection of misfit due to unspecified cross-loadings. Moreover, Hayes and Usami (2020) provide algebraic demonstrations of the bias of factor inter-correlations resulting from unspecified cross-loadings. Overall, these studies indicate that unspecified cross-loadings may cause problems for confirmatory factor analysis and structural equation modeling. Problems with unspecified cross-loadings do not occur with the ICM in the context of EFA because Target-rotation toward an ICM in the context of EFA or exploratory structural equation modeling (ESEM; Asparouhov and Muthén, 2009) will only provide an orientation of the factor axes so that cross-loadings might be minimized without any consequence for model fit. When the cross-loadings are large, Bayesian structural equation modeling (BSEM) may also be superior to confirmatory factor analysis based on the ICM. 
If the priors are known, BSEM might be preferred to ESEM, whereas ESEM might be preferred when the priors are unknown (Wei et al., 2022).

The advantage of using Target-rotation in the context of EFA instead of an ICM in the context of confirmatory factor analysis has been demonstrated for the five-factor model of personality (McCrae et al., 1996). Empirical research has also shown that the use of Target-rotation in the context of ESEM makes it possible to avoid the over-estimation of factor inter-correlations that may occur when the ICM is specified in the context of confirmatory factor analysis (Joshanloo, 2016). The relationship between cross-loadings, factor inter-correlations, and different criteria of factor rotation has also been investigated in the context of simulation studies (Sass and Schmitt, 2010; Schmitt and Sass, 2011). Sass and Schmitt (2010) found that the criteria of factor rotation differ in allowing for larger cross-loadings and in the size of the resulting factor inter-correlations.

The relationship between the loading pattern and the factor inter-correlations has also been addressed by Zhang et al. (2019), who extended partial Target-rotation in order to allow for the specification of a Target-matrix for the factor inter-correlations in addition to the Target-matrix for the loadings. With their extension Target-rotation allows for the investigation of hypotheses on the size of factor inter-correlations. Their approach is based on oblique partial Target-rotation (Browne, 1972) and the gradient projection algorithm (Jennrich, 2002; Bernaards and Jennrich, 2005). Moreover, Hurley and Cattell (1962) initially introduced complete oblique Target-rotation providing rotated loadings and estimates for factor inter-correlations when all values of the Target-matrix of loadings are specified.

While Target-rotation allows for a specification of the ICM in the Target-loadings, Target-rotation will typically be performed in order to minimize cross-loadings. Unless specific Target-values are specified for the correlations by means of extended Target-rotation, Target-rotation will modify the factor inter-correlations in order to reduce cross-loadings. If the ICM holds in the population, sampling error will nevertheless lead to some cross-loadings. When the distribution of cross-loadings resulting from sampling error is not perfectly symmetric, minimizing these cross-loadings may affect the factor inter-correlations. Thereby, sampling error may affect factor inter-correlations resulting from Target-rotation. Moreover, when an ICM holds and when single cross-loadings are minimized by Target-rotation, random differences between single cross-loadings may also affect the rotated loading pattern.

It is therefore proposed to minimize the effect of sampling error on the loading pattern and factor inter-correlations resulting from oblique Target-rotation by means of minimizing mean cross-loadings instead of the single cross-loadings. It is expected that using the mean cross-loadings instead of the single cross-loadings for rotation will reduce the effect of sampling error on the results of Target-rotation. The method is termed oblique Mean-Target-rotation (OMT) and may also be of interest when a few substantial cross-loadings occur in the population because it avoids minimizing the single cross-loadings. Thereby, OMT could be helpful for the investigation of departures from the ICM.

The ICM is a relevant model for the investigation of oblique factor rotation because most researchers will consider a model with minimal cross-loadings as an advantage for factor interpretation. Nevertheless, it might be of interest to compare OMT- and OT-rotation for other population factor models. However, more complex factor models will typically not allow clear conclusions to be drawn on optimal factor rotation and optimal factor inter-correlations. The reason is that for several more complex models some researchers might prefer larger cross-loadings combined with smaller factor inter-correlations and others might prefer smaller cross-loadings combined with larger factor inter-correlations (Schmitt and Sass, 2011). There is typically no objective way to decide between these preferences.

Effects of sampling error on cross-loadings can be positive and negative. Therefore, the absolute size of a single population cross-loading can be over- or underestimated when sampling error occurs. It is, however, rather unlikely that sampling error affects all cross-loadings on one factor in the same direction. An approximately symmetric distribution of positive and negative effects of sampling error on the cross-loadings on one factor is most likely. One might therefore expect that the positive and negative effects of sampling error on cross-loadings cancel out across cross-loadings so that the mean of the cross-loadings on one factor could be an estimate of the average population cross-loadings on this factor. However, different population factor inter-correlations may result in different population cross-loadings in the corresponding orthogonal loading pattern.

An example for the effect of the correlation between two factors on the cross-loadings of the corresponding orthogonal factors is shown in Table 1. In order to show the effect of sampling error on cross-loadings, the orthogonal population loading pattern is given together with a corresponding orthogonal sample loading pattern for n = 1,000 cases. There are two blocks of non-zero cross-loadings in the population loading pattern and, in the sample, the cross-loadings are slightly smaller or larger than the corresponding population cross-loading. The average of cross-loadings for each block of salient loadings will be close to the population cross-loadings. Therefore, the block-wise average of cross-loadings might minimize the effect of sampling error on cross-loadings while it maintains the population mean cross-loading that might be important for factor rotation. The example also illustrates that the larger cross-loadings are eliminated in the oblique loading pattern.


Table 1. Example for the effect of sampling error on cross-loadings.

When the effect of sampling error on cross-loadings is minimized by block-wise averaging and the effect of population cross-loadings is minimized by oblique rotation, the resulting loading patterns with approximately zero-mean cross-loadings might allow for a rather simple interpretation of the factors. For these models, there is no trade-off or, when the sum of the positive cross-loadings is not perfectly equal to the absolute sum of the negative cross-loadings, only a rather small trade-off between minimizing the absolute size of the cross-loadings and allowing for larger factor inter-correlations. Thus, like the ICM, zero-mean cross-loading models (ZCLM) refer to rather clearly defined oblique rotations of the factors. Therefore, both models are appropriate starting points for the investigation of the effect of sampling error on OMT- and OT-rotation. From a more general perspective, the ICM and the ZCLM can both be regarded as different ways to achieve simple structure. However, loading patterns with other structures may also be of interest (Browne, 2001; Ertel, 2011). The principle of using block-wise mean loadings for Target-rotation might also be an interesting option for rotations that are not based on simple structure, but this more general issue is beyond the scope of the present study.

It was expected that the effect of sampling error on the weighted mean cross-loadings computed in OMT-rotation is smaller than on the single cross-loadings used in OT-rotation. Since smaller sample sizes result in larger standard errors of factor loadings, OMT-rotation should be more appropriate for the investigation of small sample sizes than OT-rotation. The standard errors of factor loadings depend on the model estimation method, on the method of factor rotation, and on the complexity of the loading pattern (Zhang, 2014; Zhang and Preacher, 2015). Therefore, several numerical methods like, for example, the nonparametric bootstrap have been proposed for this issue (Zhang, 2014). The exploration of different methods for the computation of standard errors of loadings is beyond the scope of the present study. However, the comparison of the standard deviations of OT- and OMT-rotated loadings in a simulation study should reveal whether averaging cross-loadings minimizes the effect of sampling error on factor loadings. If averaging cross-loadings reduces the effect of sampling error, the standard deviations of OMT-rotated loadings should be smaller than the standard deviations of OT-rotated loadings.

After some definitions, the OMT-rotation and a population example will be presented. A simulation study was performed for the oblique ICM to compare OMT-rotation with conventional oblique Target-rotation (OT). Moreover, OMT- and OT-rotation were compared by means of an empirical example. Finally, recommendations for analyses of oblique ICM and ZCLM by means of Target-rotations are discussed.

2 Definitions

According to the population common factor model a random vector x of p observed variables is explained by a random vector ξ of q common factors and a random vector δ of p unique factors. This can be written as

$$x = \Lambda \xi + \delta, \tag{1}$$

where Λ is the p × q matrix of factor loadings and $E(\xi\xi') = \Phi$, $\mathrm{diag}(\Phi) = I$, $E(\delta\delta') = \Psi^{2} = \mathrm{diag}(\Psi^{2})$, and $E(\xi\delta') = 0$. This implies

$$E(xx') = \Sigma = \Lambda \Phi \Lambda' + \Psi^{2} = \Lambda_{u} \Lambda_{u}' + \Psi^{2}, \tag{2}$$

where Λ_u is the matrix of common factor loadings for uncorrelated factors, i.e., for Φ = I. Oblique target-rotations (Hurley and Cattell, 1962; Browne, 1972) start from an orthogonal loading matrix Λ_u, which is mostly the unrotated loading matrix resulting from factor extraction.
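The equivalence of the oblique and the orthogonal representation in Eq. 2 can be illustrated numerically. The following is a minimal sketch with NumPy; the loading values, the factor inter-correlation of 0.3, and the use of the Cholesky factor to obtain Λ_u are illustrative assumptions, not values from the study.

```python
import numpy as np

# Hypothetical oblique pattern: two factors, three salient loadings each.
Lam = np.array([[0.6, 0.0], [0.5, 0.0], [0.7, 0.0],
                [0.0, 0.6], [0.0, 0.5], [0.0, 0.7]])
Phi = np.array([[1.0, 0.3], [0.3, 1.0]])   # factor inter-correlations

# Any C with C C' = Phi turns the oblique pattern into loadings on
# uncorrelated factors; the Cholesky factor is one convenient choice.
C = np.linalg.cholesky(Phi)
Lam_u = Lam @ C

common_oblique = Lam @ Phi @ Lam.T         # Lambda Phi Lambda'
common_orthogonal = Lam_u @ Lam_u.T        # Lambda_u Lambda_u'

# Unique variances chosen so that Sigma has a unit diagonal.
Psi2 = np.diag(1.0 - np.diag(common_oblique))
Sigma = common_oblique + Psi2
```

Both products reproduce the same common part of Σ, which is why an oblique rotation can start from any orthogonal loading matrix Λ_u.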

3 Oblique mean-target-rotation

OMT-rotation starts with an orthogonal Target-rotation (Schönemann, 1966) of the unrotated loadings Λ_u toward a loading Target-matrix Λ_T corresponding to a perfect ICM, with

$$\Lambda_{T} = I_{q} \otimes 1_{p/q}, \tag{3}$$

where I_q is a q × q identity matrix, 1_p/q is a p/q × 1 unit-vector representing the Target-loadings, and “⊗” denotes the Kronecker-product. The resulting Λ₁ represents an orthogonal loading matrix where the salient loadings are a least squares approximation of Λ_T. Weighted mean loadings are computed for each block of salient loadings

$$\Lambda_{1m} = (\Lambda_{1} \cdot \Lambda_{T})' \Lambda_{1} \left( (\Lambda_{1} \cdot \Lambda_{T})' \Lambda_{T} \right)^{-1}, \tag{4}$$

where “⋅” is the Hadamard-product. Therefore, Λ₁ ⋅ Λ_T yields the weights of the salient loadings so that the cross-loadings are weighted by the salient-loadings of the respective variable on the respective factor. The resulting weighted mean loading matrix Λ_1m is a q × q matrix so that a q × q identity matrix I_q can be used as Target-matrix for oblique Target-rotation according to Hurley and Cattell (1962), where the transformation matrix

$$T = (\Lambda_{1m}' \Lambda_{1m})^{-1} \Lambda_{1m}' I_{q}, \tag{5}$$

is normalized in order to get

$$T_{n} = \mathrm{diag}(T'T)^{-0.5}\, T. \tag{6}$$

This transformation matrix is then used for rotation of the complete loadings, with the reference structure

$$\Lambda_{2} = \Lambda_{1} T_{n}, \tag{7}$$

and the OMT-rotated loading pattern

$$\Lambda_{O} = \Lambda_{2}\, \mathrm{diag}\left( (T_{n}' T_{n})^{-1} \right)^{0.5}, \tag{8}$$

and the OMT-rotated factor inter-correlations

$$\Phi_{O} = (\Lambda_{O}' \Lambda_{O})^{-1} \Lambda_{O}' (\Lambda_{u} \Lambda_{u}') \Lambda_{O} (\Lambda_{O}' \Lambda_{O})^{-1}. \tag{9}$$
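The sequence of Eqs. 3 to 9 can be sketched in a few lines of NumPy. This is an illustrative transcription, not the R- or SPSS-script of the Supplementary Material. The function name `omt_rotation` is hypothetical, the orthogonal Target-rotation is computed via the singular value decomposition, and the transformation in Eq. 5 is taken as the least-squares solution $(\Lambda_{1m}'\Lambda_{1m})^{-1}\Lambda_{1m}' I_q$. The check at the end uses an error-free, balanced population ICM (equal blocks, equal factor inter-correlations); unbalanced patterns are not examined here.

```python
import numpy as np

def omt_rotation(L_u, q):
    """Sketch of Eqs. 3-9 for an unrotated p x q loading matrix L_u.

    Assumes p is a multiple of q and that rows are ordered block-wise.
    """
    p = L_u.shape[0]
    L_T = np.kron(np.eye(q), np.ones((p // q, 1)))        # Eq. 3

    # Orthogonal Procrustes rotation of L_u toward L_T via the SVD.
    U, _, Vt = np.linalg.svd(L_u.T @ L_T)
    L_1 = L_u @ (U @ Vt)

    W = L_1 * L_T                                         # Hadamard weights
    L_1m = W.T @ L_1 @ np.linalg.inv(W.T @ L_T)           # Eq. 4

    # Eq. 5, taken here as the least-squares fit of L_1m @ T to I_q.
    T = np.linalg.solve(L_1m.T @ L_1m, L_1m.T)
    T_n = np.diag(np.diag(T.T @ T) ** -0.5) @ T           # Eq. 6
    L_2 = L_1 @ T_n                                       # Eq. 7
    scale = np.sqrt(np.diag(np.linalg.inv(T_n.T @ T_n)))
    L_O = L_2 @ np.diag(scale)                            # Eq. 8

    G = np.linalg.inv(L_O.T @ L_O)
    Phi_O = G @ L_O.T @ (L_u @ L_u.T) @ L_O @ G           # Eq. 9
    return L_O, Phi_O

# Error-free population check: q = 3, p/q = 5, phi = 0.25.
lam = np.array([[0.40], [0.45], [0.50], [0.55], [0.60]])
L_pop = np.kron(np.eye(3), lam)
Phi = np.full((3, 3), 0.25)
np.fill_diagonal(Phi, 1.0)
L_u = L_pop @ np.linalg.cholesky(Phi)    # one orthogonal representation
L_O, Phi_O = omt_rotation(L_u, q=3)
```

In this population case `L_O` reproduces the ICM pattern and `Phi_O` reproduces the factor inter-correlations, because the block-wise mean cross-loadings carry all the information needed for the oblique transformation.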

In order to evaluate whether $\Lambda_{1m}' \Lambda_{1m}$ is ill-conditioned, the condition-number κ is computed (Moler, 2008). If κ is large, the inversion of the matrix may lead to numerical imprecision. As in ridge regression, there is the option to add small ridge constants when κ is large and to retain the solution with the largest mean congruence (Tucker, 1951) of Λ_O with Λ_T. For large sample sizes and large salient loadings, this option might be irrelevant, but in general it cannot be harmful because the solution with the best congruence with Λ_T is retained. The loop for the ridge constant can be found in the R- and SPSS-script in the Supplementary Material.
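The two ingredients of this safeguard, the condition number and the mean congruence, can be sketched as follows. The helper names, the candidate ridge constants, and the choice to add the constant to the diagonal of $\Lambda_{1m}'\Lambda_{1m}$ (as in ridge regression) are assumptions made for illustration; the loop actually used is the one in the supplementary scripts.

```python
import numpy as np

def mean_tucker_congruence(A, B):
    """Mean of the column-wise Tucker congruence coefficients of A and B."""
    num = np.sum(A * B, axis=0)
    den = np.sqrt(np.sum(A ** 2, axis=0) * np.sum(B ** 2, axis=0))
    return float(np.mean(num / den))

def ridge_transformation(L_1m, k):
    """Least-squares transformation toward I_q with ridge constant k
    added to the diagonal of L_1m' L_1m (assumed form of the ridge)."""
    q = L_1m.shape[1]
    return np.linalg.solve(L_1m.T @ L_1m + k * np.eye(q), L_1m.T)

# Hypothetical, nearly collinear 2 x 2 matrix of block-wise mean loadings.
L_1m = np.array([[0.50, 0.49],
                 [0.49, 0.50]])
kappa = np.linalg.cond(L_1m.T @ L_1m)    # 2-norm condition number

# Candidate ridge constants (illustrative values, not from the scripts).
candidates = {k: ridge_transformation(L_1m, k) for k in (0.0, 0.001, 0.01)}
```

Each candidate transformation would then be normalized and applied to the complete loadings, and the solution whose rotated pattern has the largest `mean_tucker_congruence` with Λ_T would be retained.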

4 Population example

An R-script and an SPSS-script that are based on the example presented here and allow for OMT- and OT-rotation are given in the Supplementary Material. Users of the R-script may install R-4.3.1 and replace the initial orthogonal loadings by orthogonal loadings of interest. The following orthogonal loading matrix shows the difference between OMT- and OT-rotation (see Table 2, left). As the cross-loadings balance out within each block of salient loadings, their block-wise mean is zero, so that the ideal OMT-rotated loading pattern is already reached and the initial orthogonal solution is not modified by OMT-rotation. In contrast, OT-rotation minimizes the negative loadings and thereby introduces a negative factor inter-correlation (Table 2, bottom). In consequence, the block-wise mean cross-loadings of the OT-rotated solution are not zero. It is, of course, a matter of theoretical preference which model should be used. However, it is clear that the OMT-rotated solution could also be of interest when the mean non-salient loadings are expected to be zero.


Table 2. Population example with initial orthogonal loadings.

5 Simulation study

5.1 Specification

5.1.1 Independent variables

A simulation study based on the population ICM and population ZCLM with q ∈ {3, 6, 9, 12} factors and p/q ∈ {5, 8} salient loadings per factor was performed. For p/q = 5 two levels of salient loadings were introduced with

$$\lambda_{.50} = \begin{bmatrix} .40 \\ .45 \\ .50 \\ .55 \\ .60 \end{bmatrix} \quad \text{and} \quad \lambda_{.70} = \begin{bmatrix} .60 \\ .65 \\ .70 \\ .75 \\ .80 \end{bmatrix} \tag{10}$$

for each salient loading block with p/q = 5. For p/q = 8 the two levels of salient loadings were

$$\lambda_{.50} = \begin{bmatrix} .38 \\ .42 \\ .45 \\ .48 \\ .52 \\ .55 \\ .58 \\ .62 \end{bmatrix} \quad \text{and} \quad \lambda_{.70} = \begin{bmatrix} .58 \\ .62 \\ .65 \\ .68 \\ .72 \\ .75 \\ .78 \\ .82 \end{bmatrix}. \tag{11}$$

The standard deviation of the salient loadings was about 0.08 for both levels of p/q. The ICM was based on zero population cross-loadings (CL = 0) and the ZCLM was based on a condition with a balanced set of non-zero population cross-loadings (CL ≠ 0). In the condition with non-zero cross-loadings the largest absolute population cross-loadings were one third of the average population salient loading. As an example, the non-zero population cross-loadings for p/q = 5 were

$$CL_{.50} = \begin{bmatrix} .17 \\ -.08 \\ .06 \\ -.04 \\ .03 \end{bmatrix} \quad \text{and} \quad CL_{.70} = \begin{bmatrix} .23 \\ -.12 \\ .08 \\ -.06 \\ .05 \end{bmatrix}. \tag{12}$$

The loadings in CL_.50 have M = 0.03 and SD = 0.10, and the loadings in CL_.70 have M = 0.04 and SD = 0.14. Small deviations from zero-mean were defined in order to provide a realistic approximation to the ZCLM. The columns denoted as CL_.50 and CL_.70 can be inserted into the loading pattern as presented here, or the columns may be multiplied by −1. This results in different patterns of zero-mean non-zero cross-loadings. In order to cover a large number of combinations of cross-loadings, the patterns of cross-loading columns were slightly different for p/q = 5 and p/q = 8. As OMT-rotation is based on mean cross-loadings, which are approximately zero for these columns, OMT-rotation should be less affected by the different loading patterns than OT-rotation. Three levels of population factor inter-correlations ϕ ∈ {0.00, 0.25, 0.50} and five sample sizes n ∈ {100, 150, 200, 300, 500} were investigated. The combinations of independent variables result in 4 (q) × 2 (p/q) × 2 (λ_.50, λ_.70) × 2 (CL = 0, CL ≠ 0) × 3 (ϕ) × 5 (n) = 480 conditions of the simulation study. An example for q = 3 and CL = 0, for ϕ = 0.00 and ϕ > 0.00, is presented in Figure 1. The first panel (A) of Figure 1 shows the condition based on the ICM (CL = 0) with q = 3 factors, zero factor inter-correlations (ϕ = 0.00) and five variables with salient loadings on each factor (p/q = 5). The second panel (B) shows the same ICM with non-zero factor inter-correlations (ϕ > 0.00). This refers to the two levels of factor inter-correlations that were included in the study (ϕ = 0.25 and ϕ = 0.50). The third panel (C) shows the uncorrelated ICM with q = 3 based on eight variables with salient loadings on each factor (p/q = 8), and the fourth panel (D) shows the correlated ICM with q = 3 based on p/q = 8. That is, the last panel refers to the conditions based on ϕ = 0.25 and ϕ = 0.50.


Figure 1. Example for models with three factors (q = 3) and CL = 0, (A) for ϕ = 0.00 and p/q = 5, (B) for ϕ > 0.00 and p/q = 5, (C) for ϕ = 0.00 and p/q = 8, (D) for ϕ > 0.00 and p/q = 8; q indicates the number of factors; p/q indicates the number of salient loadings per factor.

The independent variables of the simulation study are summarized in Table 3.


Table 3. Independent variables of the simulation study.
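The size of the design and the descriptive statistics of the cross-loading columns can be checked with a few lines of standard-library Python. The variable names are illustrative. The means and standard deviations refer to the columns as printed in Eq. 12, and multiplying a column by −1 only flips the sign of its mean.

```python
from itertools import product
from statistics import mean, stdev

# Levels of the six independent variables (Section 5.1.1).
q_levels = (3, 6, 9, 12)            # number of factors
pq_levels = (5, 8)                  # salient loadings per factor
lambda_levels = (0.50, 0.70)        # mean salient loading
cl_levels = ("CL = 0", "CL != 0")   # ICM versus ZCLM
phi_levels = (0.00, 0.25, 0.50)     # population factor inter-correlations
n_levels = (100, 150, 200, 300, 500)

conditions = list(product(q_levels, pq_levels, lambda_levels,
                          cl_levels, phi_levels, n_levels))
n_conditions = len(conditions)      # 4 * 2 * 2 * 2 * 3 * 5

# Cross-loading columns for p/q = 5 as printed in Eq. 12.
cl_50 = [0.17, -0.08, 0.06, -0.04, 0.03]
cl_70 = [0.23, -0.12, 0.08, -0.06, 0.05]
summary = {name: (round(mean(col), 2), round(stdev(col), 2))
           for name, col in (("CL_.50", cl_50), ("CL_.70", cl_70))}
```

With 1,000 replications per condition, the full crossing corresponds to 480,000 sample correlation matrices.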

5.1.2 Dependent variables

The dependent variables were the OT- and OMT-factor inter-correlations and their bias, which was computed as the difference between the OT/OMT-factor inter-correlations and the corresponding population factor inter-correlations (ϕ). Moreover, the root mean square (RMS) difference between the OT- and OMT-rotated factor patterns and the population loading pattern was computed, as was the recovery of factor scores, i.e., factor score indeterminacy (Grice, 2001), defined as the correlation of the regression factor scores calculated from the OT/OMT-rotated factors with the true factors.
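Two of these dependent variables can be written as short functions. The sketch below assumes that the bias is averaged over all off-diagonal factor inter-correlations and that the RMS difference is taken over all elements of the loading pattern; the text does not spell out the aggregation, and the function names and example values are hypothetical.

```python
import numpy as np

def intercorrelation_bias(Phi_hat, phi):
    """Mean off-diagonal element of Phi_hat minus the population value phi."""
    off = Phi_hat[~np.eye(Phi_hat.shape[0], dtype=bool)]
    return float(off.mean() - phi)

def rms_difference(L_hat, L_pop):
    """Root mean square of element-wise differences between two patterns."""
    return float(np.sqrt(np.mean((L_hat - L_pop) ** 2)))

# Hypothetical rotated solution for a population with phi = 0.50.
Phi_hat = np.array([[1.0, 0.4], [0.4, 1.0]])
bias = intercorrelation_bias(Phi_hat, phi=0.5)   # negative: under-estimation

L_pop = np.array([[0.5, 0.0], [0.0, 0.5]])
L_hat = L_pop + 0.1                              # every loading off by 0.1
rms = rms_difference(L_hat, L_pop)
```

A negative bias indicates under-estimation of the population factor inter-correlations, which is the direction reported for the ICM conditions.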

5.1.3 Data generation

Data generation was performed with the R-package ‘fungible’ provided by Waller et al. (2023) based on Waller (2016), where population loadings and factor inter-correlations were entered in order to generate sample correlation matrices. The population factor models were entered into the ‘MonteCarlo’ option of the ‘simFA’ package and correlation matrices were generated for continuous variables under multivariate normality. For each of the 480 conditions 1,000 sample correlation matrices were generated. Least squares factor analysis with the correct number of factors was performed with the ‘simFA’ package and unrotated factor loadings were computed. The unrotated factor loadings were entered into the script provided in the Supplement (p. 7) to compute the OT- and OMT-rotated loadings and the corresponding factor inter-correlations. To compare the correlation of the factor score predictor with the original factor for OT- and OMT-rotated factors (factor score indeterminacy), we used the ‘FactorScores’ option. As the computation of individual scores requires a considerable amount of computation time, we restricted this analysis to a few conditions based on n = 100, where large differences between rotation methods can be expected. Therefore, factor scores were investigated for the conditions with n = 100, q = 6 and q = 9, p/q = 5, ϕ = 0.25, λ = 0.50, CL = 0 and CL ≠ 0.
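The principle of the data generation, drawing a sample correlation matrix from a population factor model under multivariate normality, can be sketched with NumPy. This is not the ‘fungible’ implementation used in the study; the function name and the seed are arbitrary, and unit variances of the observed variables are assumed.

```python
import numpy as np

def sample_correlation_matrix(Lam, Phi, n, rng):
    """Draw n cases from the factor model and return their correlations."""
    common = Lam @ Phi @ Lam.T
    Sigma = common + np.diag(1.0 - np.diag(common))   # unit variances
    X = rng.multivariate_normal(np.zeros(Lam.shape[0]), Sigma, size=n)
    return np.corrcoef(X, rowvar=False)

lam_50 = np.array([[0.40], [0.45], [0.50], [0.55], [0.60]])   # Eq. 10
Lam = np.kron(np.eye(3), lam_50)       # ICM with q = 3 and p/q = 5
Phi = np.full((3, 3), 0.25)
np.fill_diagonal(Phi, 1.0)

rng = np.random.default_rng(2023)      # arbitrary seed
R = sample_correlation_matrix(Lam, Phi, n=100, rng=rng)
```

Repeating the last line 1,000 times per condition and factoring each `R` would mirror the structure of the simulation, with sampling error entering only through the finite number of cases.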

5.2 Results

Repeated measures ANOVA was performed for bias of factor inter-correlations as dependent variable and OT- versus OMT-rotation as within-factor (Rot) and number of factors (q), number of salient loadings (p/q), loading size (λ), cross-loadings (CL), factor inter-correlations (ϕ), and sample size (n) as between-factors. All effects were significant at p < 0.001 and the corresponding effect sizes $η_{p}^{2}$ are reported in Table 4. We followed Skrondal’s (2000) recommendation of reporting only main effects and two-way interaction effects on each dependent variable of Monte Carlo experiments to facilitate interpretation.


Table 4. Repeated measures ANOVA main effects and two-way interactions for the conditions of the simulation study (independent variables) and bias of factor inter-correlations (dependent variable) based on OT- and OMT-rotation.

The largest effect occurred for the interaction of rotation method with cross-loadings. Whereas the negative bias of OT-rotation for factor inter-correlations was larger (M = −0.07; SE < 0.001) than the negative bias of OMT-rotation (M = −0.02; SE < 0.001) for CL = 0 (ICM), the positive bias of OT-rotation was larger (M = 0.04; SE < 0.001) than the positive bias of OMT-rotation (M < 0.001; SE < 0.001) for CL ≠ 0 (ZCLM). The extremely small standard errors indicate that the mean differences are significant. The second largest effect occurred for the interaction of rotation method with factor inter-correlations. Whereas the bias of OT-rotation on factor inter-correlations was more positive (M = 0.04; SE < 0.001) than the bias of OMT-rotation (M = 0.002; SE < 0.001) for ϕ = 0.00, the bias for ϕ = 0.25 was close to zero for OT-rotation (M = 0.004; SE < 0.001) and for OMT-rotation (M < 0.001; SE < 0.001), and for ϕ = 0.50, it was more negative for OT-rotation (M = −0.09; SE < 0.001) than for OMT-rotation (M = −0.03; SE < 0.001). The next largest effect was the interaction of rotation method with the number of salient loadings per factor (p/q). For p/q = 5 the negative bias of OT-rotation (M = −0.06; SE < 0.001) was more substantial than the negative bias of OMT-rotation (M = −0.02; SE < 0.001) and for p/q = 8 the positive bias was larger for OT-rotation (M = 0.03; SE < 0.001) than for OMT-rotation (M = 0.003; SE < 0.001). These largest interaction effects indicate that the positive and negative biases that are related to the conditions CL, ϕ, and p/q were larger for OT-rotation than for OMT-rotation.

The descriptive results for the factor inter-correlations for CL = 0 (ICM) based on population factor inter-correlations of ϕ = 0.50 are presented in Figure 2. For mean salient loadings of 0.50 and samples of n = 200 and below, the mean inter-correlations of OT-rotated factors are considerably smaller than ϕ = 0.50. In contrast, the mean inter-correlations of the OMT-rotated factors are much closer to 0.50 and show a smaller negative bias. For q = 12 factors and mean salient loadings of 0.50, the mean inter-correlations of the OT-rotated factors are zero, whereas the mean inter-correlations of the OMT-rotated factors are a bit larger than 0.20. Thus, the under-estimation of the inter-correlations is present in all Target-rotated factors but it is much smaller for the OMT-rotated factors than for the OT-rotated factors. The under-estimation of the population factor inter-correlations is considerably reduced for mean salient loadings of 0.70 (see Figure 2). The under-estimation of factor inter-correlations was also smaller for OMT-rotated factors than for OT-rotated factors for ϕ = 0.25 (see Figure 3). Overall, the size of the effects was reduced and the pattern was the same as for ϕ = 0.50. No under-estimation of the population factor inter-correlations and no substantial difference between OT- and OMT-rotated factors occurred for ϕ = 0.00 (see Supplementary Figure S1). However, in this condition, the standard deviation of the factor inter-correlations was larger for OT-rotated factors than for OMT-rotated factors for mean salient loadings of 0.50, n = 100, and q = 12.


Figure 2. Means and standard deviations of inter-factor correlations resulting from OT- and OMT-rotation for the ICM, population factor inter-correlations of ϕ = 0.50; q indicates the number of factors.


Figure 3. Means and standard deviations of inter-factor correlations resulting from OT- and OMT-rotation for the ICM, population factor inter-correlations of ϕ = 0.25; q indicates the number of factors.

Whereas the results for the ICM show that the under-estimation of factor inter-correlations for OT-rotation is more pronounced than for OMT-rotation, the results for the ZCLM are more complex (see Figure 4). For p/q = 5, there are small over- and underestimations of the inter-correlations of the OT-rotated factors for different numbers of factors, whereas the inter-correlations of the OMT-rotated factors remain rather similar across different numbers of factors. In contrast, for p/q = 8, an over-estimation of factor inter-correlations for OT-rotated factors increases with the number of factors. Again, only very small variations of the inter-correlations of the OMT-rotated factors occurred. Although the different patterns of non-zero population cross-loadings induce different factor inter-correlations of OT-rotated factors, the inter-correlation of the OMT-rotated factors is only slightly affected by the different patterns.


Figure 4. Means and standard deviations of inter-factor correlations resulting from OT- and OMT-rotation for the ZCLM, population factor inter-correlations of ϕ = 0.50; q indicates the number of factors.

The mean RMS differences of the OT- and OMT-rotated loading patterns with the population loadings for the ϕ = 0.50 condition are presented in Figure 5. For all loading sizes and sample sizes, the mean RMS differences were nearly the same for q = 3. For q > 6, mean salient loadings of 0.50, and sample sizes smaller than 300, the mean RMS differences were substantially larger for OT-rotated factor patterns than for OMT-rotated factor patterns. In these conditions, the mean RMS differences increased with q for the OT-rotated factor patterns, whereas they did not substantially increase with q for the OMT-rotated factor patterns. In these conditions, the standard deviations of the RMS differences were also much larger for the OT-rotated factor patterns than for the OMT-rotated factor patterns (see Figure 5). For ϕ = 0.25 the effects of q, n, and mean salient loading size on mean RMS differences were smaller than for ϕ = 0.50, but the pattern of results was the same (see Figure 6). For ϕ = 0.00 the mean RMS differences were very small and only a small increase of mean RMS differences occurred for OT-rotated factors for n = 100, q > 6, and mean salient loadings of 0.50.


Figure 5. Root Mean Square (RMS) difference between the population loading pattern and the OT- and OMT-rotated loading patterns for population factor inter-correlations of ϕ = 0.50; q indicates the number of factors.


Figure 6. Root Mean Square (RMS) difference between the population loading pattern and the OT- and OMT-rotated loading patterns for population factor inter-correlations of ϕ = 0.25; q indicates the number of factors.

Differences of OT- and OMT-rotation for the standard deviations of salient loadings and cross-loadings are presented for n = 100 and p/q = 5 in the Supplementary Figures S3A–D. In this condition, the largest standard deviations of loadings can be expected. For the ICM, the standard deviations of OT-rotated cross-loadings were considerably larger than the standard deviations of the OMT-rotated loadings for q ≥ 6 at all levels of ϕ (Supplementary Figures S3A,B). For the ZCLM, considerably larger standard deviations of OT-rotated loadings occurred for q = 6 and ϕ = 0.50 (Supplementary Figure S3C) and for q > 6 at all levels of ϕ. For q = 3, the standard deviations of OT-rotated and OMT-rotated loadings were similar for all levels of ϕ (Supplementary Figure S3D).

Repeated measures ANOVA was performed for factor score indeterminacy of OT-rotated and OMT-rotated factors. Rotation method was a within-factor (Rot), and number of factors (q), and cross-loadings (CL) were between-factors. The main effect of rotation method and the interactions of rotation method with q and CL were significant at p < 0.001 and the corresponding effect sizes $η_{p}^{2}$ are reported in Table 5. The largest effect occurred for rotation method with M = 0.52 (SE = 0.002) for OT-rotation and M = 0.60 (SE = 0.001) for OMT-rotation. The second largest effect was the interaction of rotation method with the number of factors, indicating that the difference between the rotation methods was smaller for q = 6 (OT-rotation: M = 0.58, SE = 0.003; OMT-rotation: M = 0.63, SE = 0.001) than for q = 9 (OT-rotation: M = 0.46, SE = 0.003; OMT-rotation: M = 0.57, SE = 0.001). The remaining effect sizes were rather small.

Table 5. Repeated measures ANOVA results for the conditions of the simulation study (independent variables) and indeterminacy of factors based on OT- and OMT-rotation.

6 Empirical example

As an empirical example, a subsample of participants' responses to 20 items from the Open-Source Psychometrics Project¹, based on the Big-Five Factor Markers (BIG5.zip, last updated 5/18/2014, retrieved on 08/22/2023) from the International Personality Item Pool (IPIP, Goldberg, 1992), was used. Only the first 19,700 participants (age/years: M = 26.27, SD = 11.59; gender: 11,973 females, 7,601 males, 102 others, 24 missing values) from the total file of 19,719 participants were used in order to split the total sample into 197 subsamples, each containing the responses of 100 participants to the first four items (E1-E4, N1-N4, A1-A4, C1-C4, O1-O4) of each of the five factors. Only a subset of the items was used in order to investigate a data set that is less favorable for optimal factor rotation.
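The splitting step can be illustrated with a minimal sketch. It assumes that the subsamples are consecutive, non-overlapping blocks of rows (the text only states that the first 19,700 participants were used), and it uses random placeholder responses instead of the real data; the function name `split_into_subsamples` is hypothetical.

```python
import numpy as np

def split_into_subsamples(data, n_per_sample):
    """Split rows into consecutive, non-overlapping subsamples of equal size.

    Leftover rows at the end that do not fill a complete subsample are dropped.
    Assumption: consecutive blocks; the article does not state the assignment rule.
    """
    n_sub = data.shape[0] // n_per_sample
    used = data[: n_sub * n_per_sample]
    return used.reshape(n_sub, n_per_sample, data.shape[1])

rng = np.random.default_rng(0)
# Placeholder data with the dimensions described in the text:
# 19,719 participants, 20 items (four per factor), five response categories.
responses = rng.integers(1, 6, size=(19719, 20))

subsamples = split_into_subsamples(responses, 100)
print(subsamples.shape)  # (197, 100, 20)
```

Each slice `subsamples[k]` could then be passed to a separate factor analysis, which is the design described above.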

The aim was to compare the OT- and OMT-rotated five-factor solution of the total sample with the OT- and OMT-rotated five-factor solutions of the subsamples. Principal axis factoring of the total sample and of the subsamples was performed with IBM SPSS Version 29.0, and OT- and OMT-rotation were performed with the code provided in the Supplementary Material. The rotated solutions for the total sample are presented in Table 6. The OT- and OMT-rotated loading patterns are very similar, which indicates that both rotation methods work well for the very large total sample. The inter-correlations of the OMT-rotated factors were a bit larger than the inter-correlations of the OT-rotated factors (Table 6).

Table 6. OT- and OMT-rotated five factor loading patterns and factor inter-correlations for 20 BIG-Five Markers of the total sample.

Overall, 195 out of the 197 principal factor analyses converged. OT- and OMT-rotation were performed for the unrotated factor solutions, and the RMS difference between each rotated factor pattern and the corresponding rotated factor pattern of the total sample was computed. For RMS-OT, five values greater than one were set to one, which resulted in a mean RMS-OT of 0.18 (SD = 0.19); for RMS-OMT, no values greater than one occurred, and the mean RMS-OMT was 0.16 (SD = 0.06). For the factor inter-correlations of the OT-rotated factors, the mean RMS-OT was 0.25 (SD = 0.21; two values greater than one were set to one); for the factor inter-correlations of the OMT-rotated factors, the mean RMS-OMT was 0.15 (SD = 0.05; no values greater than one occurred).
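The RMS comparison with truncation at one can be sketched as follows. This is an illustration under assumptions: the RMS is taken over all elements of the two matrices, and the function name `rms_difference` and the example matrices are hypothetical rather than taken from the Supplementary Material.

```python
import numpy as np

def rms_difference(pattern, reference, cap=1.0):
    """Root mean square difference between two equally shaped matrices.

    Values above `cap` are set to `cap`, mirroring the truncation of
    RMS values greater than one described in the text.
    Assumption: the mean is taken over all matrix elements.
    """
    diff = np.asarray(pattern, dtype=float) - np.asarray(reference, dtype=float)
    rms = float(np.sqrt(np.mean(diff ** 2)))
    return min(rms, cap)

# Hypothetical two-variable, two-factor patterns for illustration only.
total_sample_pattern = [[0.6, 0.0], [0.0, 0.6]]
subsample_pattern = [[0.5, 0.1], [0.0, 0.6]]
print(rms_difference(subsample_pattern, total_sample_pattern))
```

Truncating at one limits the influence of a few degenerate subsample solutions on the mean, which matters more for a method that occasionally produces extreme values.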

7 Discussion

Investigations of simple structure by means of EFA are still relevant, also because analyses of simple structure models by means of confirmatory factor analyses may lead to a series of model modifications. It was, however, expected that sampling error substantially affects results of conventional Target-rotation because single cross-loadings are minimized. In order to reduce the effect of sampling error on results, OMT-rotation was proposed, which minimizes mean cross-loadings instead of single cross-loadings. It was shown in a population example that minimizing single cross-loadings by means of conventional OT-rotation may lead to ambiguous results when the mean cross-loadings are close to zero while the absolute size of the cross-loadings is substantial. In the population model, the observed variables with single cross-loadings that were close to zero after rotation were arbitrary because the variables all had the same absolute cross-loading before rotation. This indicates that OMT-rotation may be of special interest when the cross-loadings with positive and negative sign balance out.

Accordingly, OT- and OMT-rotation were compared in a simulation study based on the oblique ICM, comprising cross-loadings that are exactly zero in the population, and on a model with non-zero population cross-loadings resulting in zero-mean cross-loadings (ZCLM). ANOVA revealed that the bias of the factor inter-correlations resulting from OT- and OMT-rotation varied with the type of model (ICM vs. ZCLM), the size of the population factor inter-correlations, and the number of variables with salient loadings per factor. For these conditions, the positive and negative biases of the factor inter-correlations were more pronounced for OT-rotation than for OMT-rotation.

ANOVA and descriptive results revealed that, for the oblique ICM, sampling error may induce negative bias in Target-rotated factor inter-correlations. The negative bias of the factor inter-correlations in the ICM was substantially more pronounced for OT-rotation than for OMT-rotation, especially for small sample sizes, moderate mean salient loadings, and a large number of factors. For 12 factors, 100 cases, mean salient loadings of 0.50, and population inter-correlations of 0.50, the mean sample inter-correlation of OT-rotated factors was zero, whereas it was greater than 0.20 for OMT-rotated factors. The mean RMS differences of rotated factor patterns and the population factor pattern were larger for OT-rotation than for OMT-rotation. Thus, when sample size was small and the number of factors large, loading patterns and factor inter-correlations were more similar to the population loading patterns and factor inter-correlations for OMT-rotation than for OT-rotation. However, no relevant differences between the rotation methods were found for the uncorrelated ICM. For the ZCLM, more than three factors, and 8 variables with salient loadings per factor, an overestimation of the inter-correlations of the OT-rotated factors occurred that did not occur for the OMT-rotated factors. This reveals that the size and direction of bias of factor inter-correlations depend not only on sampling error but also on the type of model. Therefore, recommendations regarding the minimum sample size should be based on general studies concerning this issue. A sample size of n = 50 has been recommended as a reasonable absolute minimum for exploratory factor analysis (de Winter et al., 2009). However, the study of de Winter et al. (2009) reveals that even smaller samples might be possible when the salient loadings are large, the number of observed variables is large, and the number of factors is low.
As the minimum sample size of the present study was n = 100, it remains open for further research whether OMT-rotation may help to improve results of very small samples. Nevertheless, the present simulation study shows that OMT-rotation reduced the standard deviations of loadings, especially when the number of factors is large, and the combined effect of sampling error and model parameters on the bias of factor inter-correlations. Overall, the effect of the conditions of the simulation study on positive and negative bias of factor inter-correlations was more pronounced for OT-rotation than for OMT-rotation. It should be noted that the simulated non-zero population cross-loadings in the ZCLM were not extremely large as they were at maximum one third of the average salient loadings. As small ZCLM cross-loadings resulted in larger bias of factor inter-correlations for OT-rotation than for OMT-rotation, researchers expecting that non-zero cross-loadings cancel out may consider OMT-rotation.

An empirical example was based on open data for the BIG-five model of personality (Goldberg, 1992). A large total sample based on four marker variables per factor was divided into several subsamples based on 100 participants in order to investigate the similarity of the OT- and OMT-rotated subsample solutions with the corresponding OT- and OMT-rotated total sample solutions. The similarity of the rotated loading patterns and factor inter-correlations for the subsamples with the corresponding rotated loading pattern and factor inter-correlations for the total sample was more pronounced for OMT-rotation than for OT-rotation. This indicates that OMT-rotation may help to get more robust results, especially when the number of marker variables per factor and the sample size are rather small.

Overall, the results of the simulation study and of the empirical example indicate that OMT-rotation is more robust than OT-rotation. The robustness refers to effects of sampling error and to the distribution of non-zero cross-loadings across factors. Therefore, OMT-rotation can be recommended when an oblique ICM or a ZCLM is expected and when salient loadings are moderate, factor numbers large, and sample sizes small. The relevant orthogonal/unrotated loading matrices for OMT-rotation may be entered into the R-script or into the SPSS-script provided in the Supplementary Material.

From a broader perspective, it should be noted that less biased factor inter-correlations are an important basis for hierarchical factor models. Factor prediction, as it can be performed with ESEM, also needs optimal estimates of the factor inter-correlations. Further research may consider the investigation of OMT-rotation with very small samples (n = 50 and below). Moreover, the basic idea of OMT-rotation, i.e., averaging loadings in order to provide more suitable initial loadings for factor rotation, may be investigated for other methods of factor rotation. The main idea from the present study in this respect is that the transformation matrix for factor rotation derived from rotation of an averaged loading pattern can be used for rotation of the loading pattern based on single observed variables. That is, not only Target-rotation may be improved by reducing the effect of single loadings on the results. The effect of averaging loadings may be beneficial for analytical methods of factor rotation, but it may also be relevant for algorithms that are based on the permutation of starting values, as, for example, gradient projection algorithms (Bernaards and Jennrich, 2005).
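The idea of deriving the transformation matrix from an averaged loading pattern and applying it to the full loading pattern can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from the Supplementary Material: all loadings of a block (salient and cross-loadings) are averaged, and a closed-form solution that sets the block-wise mean cross-loadings exactly to zero stands in for the iterative Target-rotation. The function names and the small population example are hypothetical.

```python
import numpy as np

def mean_target_transform(A, blocks):
    """Transformation matrix M derived from block-wise mean loadings.

    A      : (p, q) unrotated (orthogonal) loading matrix
    blocks : list of q index arrays, one block of salient variables per factor
    Assumption: a closed-form solve replaces the iterative Target-rotation.
    """
    # (q, q) matrix of mean unrotated loadings, one row per block
    A_mean = np.vstack([A[idx].mean(axis=0) for idx in blocks])
    # A_mean @ inv(A_mean) = I, so all mean cross-loadings become zero
    M = np.linalg.inv(A_mean)
    # Rescale columns so that the rotated factors have unit variance
    d = np.linalg.norm(A_mean, axis=1)
    return M * d

def rotate(A, M):
    """Apply M to the complete loading matrix; return pattern and Phi."""
    M_inv = np.linalg.inv(M)
    return A @ M, M_inv @ M_inv.T

# Hypothetical population: 3 factors, 4 variables each, salient loadings 0.6,
# cross-loadings of +0.2 and -0.2 that balance out within each block.
blocks = [np.arange(0, 4), np.arange(4, 8), np.arange(8, 12)]
P = np.zeros((12, 3))
for j, idx in enumerate(blocks):
    others = [k for k in range(3) if k != j]
    P[idx, j] = 0.6
    P[idx[:2], others[0]] = [0.2, -0.2]
    P[idx[2:], others[1]] = [0.2, -0.2]
Phi = np.full((3, 3), 0.3)
np.fill_diagonal(Phi, 1.0)

A = P @ np.linalg.cholesky(Phi)      # an orthogonal representation of the model
M = mean_target_transform(A, blocks)
pattern, phi = rotate(A, M)
print(np.round(phi, 3))
```

In this error-free example the rotated pattern and the factor inter-correlations reproduce the population values although single cross-loadings of ±0.2 remain, because only their block-wise means are driven to zero. With sample data the recovery would of course be approximate.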

From an applied perspective, it should be noted that the correlation of the factor score predictors with the original factors (i.e., indeterminacy) was improved by means of OMT-rotation. Reducing the effects of single cross-loadings on the rotation of the loading pattern may therefore also help to improve decisions that are based on individual factor score predictors. As the original factors are typically unknown in applied settings, it is recommended to compare the OT- and the OMT-rotated loading patterns to check whether single observed variables have a theoretically unjustified effect on factor rotation. Note that cross-loadings with smaller effect on oblique rotation may remain substantial after rotation. However, in the OMT-rotated loading pattern, some cross-loadings might remain large while the mean cross-loadings remain close to zero. This would indicate that a balanced measurement of the latent variable is reached without increasing the factor inter-correlations by reducing the single cross-loading. Moreover, correlation-preserving factor score predictors (e.g., McDonald, 1981) also require optimal estimates of the factor inter-correlations. Of course, also in settings where the factor inter-correlations are relevant for further research, OMT-rotation might be considered.
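For readers who want to inspect the correlation of factor score predictors with their factors for a given rotated solution, the following sketch may help. It assumes regression-type factor score predictors and a model-implied correlation matrix with unit variances; the operationalization used in the simulation study may differ, and the function name `determinacy` is hypothetical.

```python
import numpy as np

def determinacy(pattern, phi):
    """Correlation of regression factor score predictors with their factors.

    pattern : (p, q) rotated loading pattern
    phi     : (q, q) factor inter-correlations
    Assumption: regression-type predictors and standardized observed variables.
    """
    pattern = np.asarray(pattern, dtype=float)
    phi = np.asarray(phi, dtype=float)
    # Model-implied correlation matrix: common part plus unique variances,
    # which is obtained by setting the diagonal of the common part to one.
    sigma = pattern @ phi @ pattern.T
    np.fill_diagonal(sigma, 1.0)
    structure = pattern @ phi          # correlations of variables with factors
    rho_squared = np.diag(structure.T @ np.linalg.solve(sigma, structure))
    return np.sqrt(rho_squared)

# Hypothetical single-factor example: four variables with loadings of 0.6.
print(determinacy(np.full((4, 1), 0.6), np.eye(1)))
```

Computing this coefficient for both the OT- and the OMT-rotated solution gives a simple descriptive comparison alongside the comparison of the loading patterns recommended above.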

Data availability statement

Publicly available datasets were analyzed in this study. This data can be found at: http://openpsychometrics.org/_rawdata/.

Ethics statement

Ethical approval was not required for the study involving humans in accordance with the local legislation and institutional requirements. Written informed consent to participate in this study was not required from the participants or the participants’ legal guardians/next of kin in accordance with the national legislation and the institutional requirements.

Author contributions

AB: Conceptualization, Investigation, Methodology, Software, Writing – original draft. NH: Conceptualization, Investigation, Software, Writing – review & editing.

Funding

The author(s) declare that no financial support was received for the research, authorship, and/or publication of this article.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Supplementary material

The Supplementary material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpsyg.2023.1285212/full#supplementary-material

Footnotes

1. http://openpsychometrics.org/_rawdata/

References

Asparouhov, T., and Muthén, B. (2009). Exploratory structural equation modeling. Struct. Equ. Model. 16, 397–438. doi: 10.1080/10705510903008204

Bernaards, C. A., and Jennrich, R. I. (2005). Gradient projection algorithms and software for arbitrary rotation criteria in factor analysis. Educ. Psychol. Meas. 65, 676–696. doi: 10.1177/0013164404272507

Browne, M. W. (1972). Oblique rotation to a partially specified target. Br. J. Math. Stat. Psych. 25, 207–212. doi: 10.1111/j.2044-8317.1972.tb00492.x

Browne, M. W. (2001). An overview of analytic rotation in exploratory factor analysis. Multivar. Behav. Res. 36, 111–150. doi: 10.1207/S15327906MBR3601_05

de Winter, J. C. F., Dodou, D., and Wieringa, P. A. (2009). Exploratory factor analysis with small sample sizes. Multiv. Behav. Res. 44, 147–181. doi: 10.1080/00273170902794206

Ertel, S. (2011). Exploratory factor analysis revealing complex structure. Pers. Individ. Differ. 50, 196–200. doi: 10.1016/j.paid.2010.09.026

Goldberg, L. R. (1992). The development of markers for the big-five factor structure. Psych. Assess. 4, 26–42. doi: 10.1037/1040-3590.4.1.26

Grice, J. W. (2001). Computing and evaluating factor scores. Psychol. Meth. 6, 430–450. doi: 10.1037/1082-989X.6.4.430

Harris, R. J. (2013). A primer of multivariate statistics. 3rd Edn New York: Taylor & Francis.

Hayes, T., and Usami, S. (2020). Factor score regression in connected measurement models containing cross-loadings. Struct. Equ. Model. Multidiscip. J. 27, 942–951. doi: 10.1080/10705511.2020.1729160

Hsu, H.-Y., Troncoso Skidmore, S., Li, Y., and Thompson, B. (2014). Forced zero cross-loading misspecifications in measurement component of structural equation models: beware of even “small” misspecifications. Methodology 10, 138–152. doi: 10.1027/1614-2241/a000084

Hurley, J. R., and Cattell, R. B. (1962). The procrustes program: producing direct rotation to test a hypothesized factor structure. Comput. Behav. Sci. 7, 258–262. doi: 10.1002/bs.3830070216

Jennrich, R. I. (2002). A simple general method for oblique rotation. Psychometrika 67, 7–19. doi: 10.1007/BF02294706

Joshanloo, M. (2016). Revisiting the empirical distinction between hedonic and eudaimonic aspects of well-being using exploratory structural equation modeling. J. Happ. Stud. 17, 2023–2036. doi: 10.1007/s10902-015-9683-z

MacCallum, R. C., Roznowski, M., and Necowitz, L. B. (1992). Model modifications in covariance structure analysis: the problem of capitalizing on chance. Psychol. Bull. 111, 490–504. doi: 10.1037/0033-2909.111.3.490

McCrae, R. R., Zonderman, A. B., Costa, P. T., Bond, M. H., and Paunonen, S. V. (1996). Evaluating replicability of factors in the revised NEO personality inventory: confirmatory factor analysis versus Procrustes rotation. J. Pers. Soc. Psychol. 70, 552–566. doi: 10.1037/0022-3514.70.3.552

McDonald, R. P. (1981). Constrained least squares estimators of oblique common factors. Psychometrika 46, 337–341. doi: 10.1007/BF02293740

Moler, C.B. (2008). Numerical computing with Matlab (2nd Ed.). Philadelphia: SIAM.

Sass, D. A., and Schmitt, T. A. (2010). A comparative investigation of rotation criteria within exploratory factor analysis. Multivar. Behav. Res. 45, 73–103. doi: 10.1080/00273170903504810

Schmitt, T. A., and Sass, D. A. (2011). Rotation criteria and hypothesis testing for exploratory factor analysis: implications for factor pattern loadings and interfactor correlations. Educ. Psychol. Meas. 71, 95–113. doi: 10.1177/0013164410387348

Schönemann, P. H. (1966). A generalized solution of the orthogonal Procrustes problem. Psychometrika 31, 1–10. doi: 10.1007/BF02289451

Skrondal, A. (2000). Design and analysis of Monte Carlo experiments: attacking the conventional wisdom. Multiv. Behav. Res. 35, 137–167. doi: 10.1207/S15327906MBR3502_1

Tucker, L.R. (1951). A method for synthesis of factor analysis studies. Personnel Research Section Report No. 984. Washington, DC: Department of the Army.

Waller, N. G. (2016). Fungible correlation matrices: a method for generating nonsingular, singular, and improper correlation matrices for Monte Carlo research. Multivar. Behav. Res. 51, 554–568. doi: 10.1080/00273171.2016.1178566

Waller, N., Jones, J., Giordano, C., and Nguyen, H. V. (2023). Package 'fungible' Version 2.3 [Computer software manual]. Available at: https://cran.r-project.org/web/packages/fungible/fungible.pdf

Wei, X., Huang, J., Zhang, L., Pan, D., and Pan, J. (2022). Evaluation and comparison of SEM, ESEM, and BSEM in estimating structural models with potentially unknown cross-loadings. Struct. Equ. Model. Multidiscip. J. 29, 327–338. doi: 10.1080/10705511.2021.2006664

Ximénez, C., Revuelta, J., and Castañeda, R. (2022). What are the consequences of ignoring cross-loadings in bifactor models? A simulation study assessing parameter recovery and sensitivity of goodness-of-fit indices. Front. Psychol. 13:923877. doi: 10.3389/fpsyg.2022.923877

Zhang, G. (2014). Estimating standard errors in exploratory factor analysis. Multivar. Behav. Res. 49, 339–353. doi: 10.1080/00273171.2014.908271

Zhang, G., Hattori, M., Trichtinger, L. A., and Wang, X. (2019). Target rotation with both factor loadings and factor correlations. Psychol. Meth. 24, 390–402. doi: 10.1037/met0000198

Zhang, G., and Preacher, K. J. (2015). Factor rotation and standard errors in exploratory factor analysis. J. Educat. Behav. Statist. 40, 579–603. doi: 10.3102/1076998615606098

Keywords: exploratory factor analysis, factor-rotation, independent clusters model, factor inter-correlation, Target-rotation

Citation: Beauducel A and Hilger N (2023) Robust oblique Target-rotation for small samples. Front. Psychol. 14:1285212. doi: 10.3389/fpsyg.2023.1285212

Received: 29 August 2023; Accepted: 07 November 2023;
Published: 27 November 2023.

Edited by:

Holmes Finch, Ball State University, United States

Reviewed by:

Carmen Ximénez, Autonomous University of Madrid, Spain
Rodrigo Schames Kreitchmann, Universidad Nacional de Educación a Distancia, Spain

Copyright © 2023 Beauducel and Hilger. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: André Beauducel
