
ORIGINAL RESEARCH article

Front. Psychol., 11 May 2017
Sec. Quantitative Psychology and Measurement
This article is part of the Research Topic "Recent Advancements in Structural Equation Modeling (SEM): From Both Methodological and Application Perspectives."

Comparing Indirect Effects in Different Groups in Single-Group and Multi-Group Structural Equation Models

Ehri Ryu1* and Jeewon Cheong2
  • 1Psychology, Boston College, Chestnut Hill, MA, USA
  • 2Health Education and Behavior, University of Florida, Gainesville, FL, USA

In this article, we evaluated the performance of statistical methods in single-group and multi-group analysis approaches for testing group difference in indirect effects and for testing simple indirect effects in each group. We also investigated whether the performance of the methods in the single-group approach was affected when the assumption of equal variance was not satisfied. The assumption was critical for the performance of the two methods in the single-group analysis: the method using a product term for testing the group difference in a single path coefficient, and the Wald test for testing the group difference in the indirect effect. Bootstrap confidence intervals in the single-group approach and all methods in the multi-group approach were not affected by the violation of the assumption. We compared the performance of the methods and provided recommendations.

Introduction

In mediation analysis, it is a standard practice to conduct a formal statistical test on mediation effects in addition to testing each of the individual parameters that constitute the mediation effect. Over the past few decades, statistical methods have been developed to achieve valid statistical inferences about mediation effects. The sampling distribution of a mediation effect is complicated because the mediation effect is quantified by a product of at least two parameters. For this reason, numerous studies have proposed and recommended methods that do not rely on distributional assumptions (e.g., bootstrapping) for testing mediation effects (e.g., Bollen and Stine, 1990; Shrout and Bolger, 2002; MacKinnon et al., 2004; Preacher and Hayes, 2004).

It is often a question of interest whether a mediation effect is the same across different groups of individuals or under different conditions, in other words, whether a mediation effect is moderated by another variable (called a moderator) that indicates the group membership or different conditions. For example, Levant et al. (2015) found that the mediation effect of endorsement of masculinity ideology on sleep disturbance symptoms via energy drink use was significantly different between white and racial minority groups. Schnitzspahn et al. (2014) found that time monitoring mediated the effect of mood on prospective memory in young adults, but not in old adults. Gelfand et al. (2013) showed that the effect of cultural difference (US vs. Taiwan) on the optimality of negotiation outcome was mediated by harmony norm when negotiating as a team but not when negotiating as solos. In these studies, the mediation effect was moderated by a categorical moderator (e.g., racial group, age group, experimental condition). With a categorical moderator, the moderated mediation effect concerns the difference in the indirect effect between groups. Treating a moderator as categorical is appropriate when the moderator is truly categorical, but it is not appropriate to create groups based on arbitrary categorization of a continuous moderator (Maxwell and Delaney, 1993; MacCallum et al., 2002; Edwards and Lambert, 2007; Rucker et al., 2015).

Structural equation modeling (SEM) is a popular choice for many researchers to test a mediation model and to conduct a formal test on mediation effects. In SEM, the mediation effect can be specified as an indirect effect (Alwin and Hauser, 1975; Bollen, 1987) such as “the indirect effect of an independent variable (X) on a dependent variable (Y) via a mediator (M)” in which X affects M, which in turn affects Y. For incorporating a categorical moderator, there are two approaches in SEM: single-group and multi-group analysis. In the single-group analysis approach, the categorical moderator is represented by a variable, or a set of variables, in the model. On the other hand, the multi-group analysis approach uses the categorical moderator to separate the observations into groups at each level of the moderator, and the moderator does not appear in the model as a variable.

In this article, we present the single-group and multi-group analysis approaches to comparing indirect effects between groups, and introduce statistical methods in each approach for testing the group difference in the indirect effect and for testing the simple indirect effect in each group. Then we present a simulation study to compare the performance of the methods. In particular, we examine how robust the methods in the single-group analysis approach are when the assumption of homogeneity of variance is not satisfied (the assumption is described in a later section).

Group Difference in Indirect Effect and Simple Indirect Effect in Each Group

We use the following example throughout this article. Suppose that we hypothesize a mediation model in which the effect of an independent variable X on a dependent variable Y is mediated by a mediator M (Figure 1).

Figure 1. A mediation model.

We also hypothesize that the X to M relationship is not the same in two groups of individuals (e.g., men and women). This model can be considered a special case of the first-stage moderation model in Edwards and Lambert (2007) and of Model 2 in Preacher et al. (2007), in which the moderator is a categorical variable with two levels. When comparing the indirect effect between two groups, two quantities are of interest for estimation and statistical inference. First, what is the estimated difference in the indirect effect between the groups? Second, what is the estimated indirect effect in each group (i.e., the simple indirect effect)?

In the single-group analysis, a categorical variable (or a set of variables) indicating group membership is used as a covariate in the model, and an interaction term between X and the group membership variable (Group) is included to test the difference in the X to M relationship between groups (see Figure 2A).

Figure 2. (A) Single-group and (B) multi-group analysis models for testing group difference in the indirect effect. In the (A) single-group model, Group is a categorical variable that indicates distinctive group membership.

The interpretation of the parameters depends on how the group membership is coded. For example, when the group membership (Group) is dummy coded as 1 = Group 1 and 0 = Group 2, a1 = simple effect of X on M in Group 2; a2 = group difference in conditional mean of M for those whose level of X is at zero (i.e., conditional mean of M in Group 1 − conditional mean of M in Group 2); a3 = difference in simple effect of X on M between groups (i.e., simple effect of X on M in Group 1 − simple effect of X on M in Group 2). If a3 ≠ 0, it means that the relationship between X and M is not the same between groups.

When the relationship between X and M differs between groups, the indirect effect of X on Y via M is conditional on the group membership, because the indirect effect consists of the X to M relationship and the M to Y relationship. In the model shown in Figure 2A, an estimate of the indirect effect of X on Y via M is obtained by [â1 + â3(Group)]b̂ (Preacher et al., 2007). So the simple indirect effect (i.e., the conditional indirect effect) estimate is [â1 + â3(1)]b̂ = (â1 + â3)b̂ in Group 1 (coded 1), and [â1 + â3(0)]b̂ = â1b̂ in Group 2 (coded 0). The estimated group difference in the indirect effect is (â1 + â3)b̂ − â1b̂ = â3b̂ (Hayes, 2015).
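As a quick illustration of these expressions, the short sketch below plugs hypothetical estimates (not values from this article) into the single-group formulas; the variable names are ours.

```python
# Hypothetical single-group estimates (for illustration only, not from this article)
a1_hat, a3_hat, b_hat = 0.30, 0.15, 0.40

ind_group1 = (a1_hat + a3_hat) * b_hat   # simple indirect effect in Group 1 (coded 1)
ind_group2 = a1_hat * b_hat              # simple indirect effect in Group 2 (coded 0)
difference = ind_group1 - ind_group2     # equals a3_hat * b_hat

print(ind_group1, ind_group2, difference)   # approximately 0.18, 0.12, 0.06
```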

In multi-group analysis, group membership is not used as a predictor variable in the model. Instead, a set of hypothesized models (e.g., a set of two models if there are two distinctive groups) are specified and estimated simultaneously (see Figure 2B). The group difference in the simple effect of X on M (that is estimated by a3 in the single-group analysis) is estimated by (âG1 − âG2). The simple indirect effect is estimated by âG1b̂G1 and âG2b̂G2 in Group 1 and in Group 2, respectively. The estimated difference in the indirect effect is (âG1b̂G1 − âG2b̂G2).

Statistical Inferences

There are numerous methods for making statistical inferences about the simple indirect effects and about the group difference in the indirect effect. The methods can be categorized into the following branches: (1) normal-theory standard error, (2) bootstrapping methods, (3) Monte Carlo method, (4) likelihood ratio (LR) test, and (5) Wald test¹. Table 1 summarizes the methods and shows the abbreviation used to refer to each method. In the abbreviations, the superscripts "S" and "M" indicate the single-group and multi-group approaches, respectively. The subscripts indicate which effect is tested by the method; e.g., "diff" means the group difference in the indirect effect, and "ind" means the simple indirect effect in each group.

Table 1. Methods for testing group difference in a path, group difference in the indirect effect, and simple indirect effect in each group.

Normal-Theory Standard Error

The normal-theory standard error method is based on the assumption that the sampling distribution of the estimate follows a normal distribution. In testing an indirect effect, it is well-known that the normality assumption is not appropriate for the sampling distribution of the indirect effect, and normal-theory based methods do not perform well in testing the indirect effect (e.g., MacKinnon et al., 2002; Shrout and Bolger, 2002; MacKinnon et al., 2004; Preacher and Selig, 2012). In moderated mediation models, Preacher et al. (2007) advocated the bootstrapping methods over the normal-theory standard error method for testing the simple indirect effect.

Bootstrapping Methods

The bootstrapping methods can provide interval estimates without relying on distributional assumptions. For this reason, the bootstrapping methods have been recommended for testing indirect effects in previous studies (e.g., MacKinnon et al., 2004; Preacher and Hayes, 2004). The bootstrapping methods can be applied to obtain interval estimates for any effect of interest, e.g., the simple indirect effect in Group 1, the simple indirect effect in Group 2, or the group difference in the indirect effect. In bootstrapping methods, a large number of bootstrap samples (e.g., 1,000), each of the same size as the original sample, are drawn from the original sample with replacement. An estimate is obtained in each bootstrap sample, and an empirical sampling distribution is constructed from the set of 1,000 bootstrap estimates. From the bootstrap sampling distribution, [100 * (1 − α)]% percentile bootstrap confidence intervals can be computed from the (α/2) and (1 − α/2) percentiles. Bias-corrected bootstrap confidence intervals can be computed with the percentiles adjusted based on the proportion of bootstrap estimates lower than the original sample estimate (see MacKinnon et al., 2004).
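For readers who want to see the two interval constructions side by side, here is a minimal sketch in Python (the article itself used Mplus and SAS PROC IML). The simple OLS-based estimator of the indirect effect, the function names, and the data arguments are illustrative assumptions, not the authors' code.

```python
# Minimal sketch of percentile and bias-corrected (BC) bootstrap CIs for an
# indirect effect a*b. The OLS estimation step is a stand-in for the SEM estimator.
import numpy as np
from scipy.stats import norm

def indirect_estimate(x, m, y):
    """Estimate a*b from two least-squares regressions (M on X; Y on M and X)."""
    a = np.polyfit(x, m, 1)[0]                      # slope of M on X
    design = np.column_stack([np.ones_like(x), m, x])
    b = np.linalg.lstsq(design, y, rcond=None)[0][1]  # coefficient of M in Y on M, X
    return a * b

def bootstrap_cis(x, m, y, n_boot=1000, alpha=0.05, seed=1):
    rng = np.random.default_rng(seed)
    n = len(x)
    est = indirect_estimate(x, m, y)
    boots = np.empty(n_boot)
    for i in range(n_boot):
        idx = rng.integers(0, n, n)                 # resample cases with replacement
        boots[i] = indirect_estimate(x[idx], m[idx], y[idx])
    # Percentile CI: (alpha/2) and (1 - alpha/2) quantiles of the bootstrap distribution
    pc = np.quantile(boots, [alpha / 2, 1 - alpha / 2])
    # Bias-corrected CI: shift the percentiles using the proportion of bootstrap
    # estimates that fall below the original-sample estimate
    z0 = norm.ppf(np.mean(boots < est))
    lo = norm.cdf(2 * z0 + norm.ppf(alpha / 2))
    hi = norm.cdf(2 * z0 + norm.ppf(1 - alpha / 2))
    bc = np.quantile(boots, [lo, hi])
    return est, pc, bc
```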

In the single-group analysis, the estimate of the simple indirect effect in each group is computed by (â1* + â3*)b̂* in Group 1 (coded 1), and â1*b̂* in Group 2 (coded 0), in each bootstrap sample. The superscript * denotes that the estimates are obtained in bootstrap samples. In each group, the percentile (PCindS in Table 1) and the bias-corrected (BCindS) bootstrap confidence intervals for the simple indirect effect are computed from the bootstrap sampling distribution [i.e., the distribution of (â1* + â3*)b̂* for Group 1, and the distribution of â1*b̂* for Group 2] as described above.

In the multi-group analysis, the estimate of the simple indirect effect is computed by âG1*b̂G1* in Group 1 and âG2*b̂G2* in Group 2. The percentile (PCindM) and the bias-corrected (BCindM) bootstrap confidence intervals for the simple indirect effect are obtained from the distribution of âG1*b̂G1* and the distribution of âG2*b̂G2*, in Group 1 and Group 2, respectively. The percentile (PCdiffM) and the bias-corrected (BCdiffM) bootstrap confidence intervals for the group difference in the indirect effect are obtained from the bootstrap sampling distribution of (âG1*b̂G1* − âG2*b̂G2*).

Monte Carlo Method

The Monte Carlo method provides a statistical test or an interval estimate of an effect by generating parameter values under a distributional assumption (e.g., multivariate normal). For testing the group difference in the indirect effect in the multi-group analysis model, the parameter estimates and standard errors are used to specify a joint sampling distribution of the parameter estimates, from which parameter values are generated for a large number of replications, e.g., 1,000 (Preacher and Selig, 2012; Ryu, 2015). The joint distribution of the four parameters aG1, bG1, aG2, and bG2 is the multivariate normal distribution shown below.

$$
\begin{bmatrix} a_{G1} \\ b_{G1} \\ a_{G2} \\ b_{G2} \end{bmatrix}
\sim \mathrm{MVN}\!\left(
\begin{bmatrix} \hat{a}_{G1} \\ \hat{b}_{G1} \\ \hat{a}_{G2} \\ \hat{b}_{G2} \end{bmatrix},\;
\begin{bmatrix}
\hat{\sigma}^2_{a_{G1}} & 0 & 0 & 0 \\
0 & \hat{\sigma}^2_{b_{G1}} & 0 & 0 \\
0 & 0 & \hat{\sigma}^2_{a_{G2}} & 0 \\
0 & 0 & 0 & \hat{\sigma}^2_{b_{G2}}
\end{bmatrix}
\right) \quad (1)
$$

where âG1, b̂G1, âG2, and b̂G2 are the estimates in the original sample, and σ̂aG1, σ̂bG1, σ̂aG2, and σ̂bG2 are the estimated standard errors in the original sample. The parameters in Group 1 (aG1, bG1) are independent of the parameters in Group 2 (aG2, bG2) because Group 1 and Group 2 are independent as long as the assumption of independent observations holds. In mediation models, the covariance between the a and b paths is often replaced with zero (Preacher and Selig, 2012), so the covariance between the a and b paths is zero in each group (σ̂bG1,aG1 = 0; σ̂bG2,aG2 = 0). For a large number of replications, parameter values âG1+, b̂G1+, âG2+, and b̂G2+ are generated from the multivariate normal distribution shown in (1). The superscript + denotes the parameter values generated by the Monte Carlo method. In each replication, the simple indirect effect estimate is computed by âG1+b̂G1+ in Group 1 and by âG2+b̂G2+ in Group 2. The group difference in the indirect effect is computed by (âG1+b̂G1+ − âG2+b̂G2+). The Monte Carlo confidence intervals ([100 * (1 − α)]%) are obtained from the (α/2) and (1 − α/2) percentiles of the set of generated values. The Monte Carlo confidence intervals for the simple indirect effects (MCindM) are computed using the set of âG1+b̂G1+ values in Group 1 and the set of âG2+b̂G2+ values in Group 2. The Monte Carlo confidence interval for the group difference in the indirect effect (MCdiffM) is obtained using the set of (âG1+b̂G1+ − âG2+b̂G2+) values. The Monte Carlo method is less computer-intensive and less time-consuming than the bootstrapping methods.
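A minimal sketch of this Monte Carlo procedure is given below, with hypothetical estimates and standard errors standing in for the values that would come from a fitted multi-group model.

```python
# Minimal sketch of the Monte Carlo confidence intervals based on Equation (1),
# using hypothetical multi-group estimates and standard errors.
import numpy as np

est = np.array([0.30, 0.40, 0.10, 0.40])   # assumed (a_G1, b_G1, a_G2, b_G2) estimates
se  = np.array([0.08, 0.07, 0.09, 0.07])   # assumed standard errors

rng = np.random.default_rng(1)
draws = rng.multivariate_normal(mean=est, cov=np.diag(se**2), size=1000)

ind_g1 = draws[:, 0] * draws[:, 1]         # a_G1+ * b_G1+ in each replication
ind_g2 = draws[:, 2] * draws[:, 3]         # a_G2+ * b_G2+ in each replication
diff   = ind_g1 - ind_g2                   # group difference in the indirect effect

alpha = 0.05
ci_ind_g1 = np.quantile(ind_g1, [alpha / 2, 1 - alpha / 2])  # MC CI, simple indirect effect G1
ci_ind_g2 = np.quantile(ind_g2, [alpha / 2, 1 - alpha / 2])  # MC CI, simple indirect effect G2
ci_diff   = np.quantile(diff,   [alpha / 2, 1 - alpha / 2])  # MC CI, group difference
```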

Likelihood Ratio Test

The likelihood ratio (LR) test and the Wald test can be used to test a constraint or a set of constraints. The LR test (Bentler and Bonett, 1980; Bollen, 1989) is obtained by estimating two nested models, with (M1) and without (M0) the constraints. The LR test yields a chi-square statistic with degrees of freedom (df) equal to the difference in the number of freely estimated parameters between the two models.

$$
\chi^2 = -2\log\!\left[\frac{L(M_1)}{L(M_0)}\right]
= \{-2\log[L(M_1)]\} - \{-2\log[L(M_0)]\} \quad (2)
$$

where L(Mk) = likelihood of model k. The LR test can be used to test the group difference in the “X → M” relationship in the multi-group analysis model, by comparing two models with and without the constraint aG1 = aG2, with df = 1 (LRaM). Likewise, the LR test can be used to test the group difference in the indirect effect by comparing two models with and without the constraint aG1bG1 = aG2bG2, with df = 1 (LRdiffM).
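As a small illustration, the LR statistic in Equation (2) can be computed from the −2 log-likelihood values reported for the constrained and unconstrained models; the numbers below are made up for this sketch.

```python
# Minimal sketch of the LR test in Equation (2). The -2 log-likelihood values
# are hypothetical; in practice they come from the SEM output for the two models.
from scipy.stats import chi2

neg2ll_constrained   = 5412.8   # -2 log L(M1): model with aG1*bG1 = aG2*bG2 imposed (assumed)
neg2ll_unconstrained = 5406.3   # -2 log L(M0): model without the constraint (assumed)

lr_stat = neg2ll_constrained - neg2ll_unconstrained   # chi-square statistic with df = 1
p_value = chi2.sf(lr_stat, df=1)
```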

Wald Test

The Wald test (Wald, 1943; Bollen, 1989) evaluates a constraint in a model in which the constraint is not imposed. For testing group difference in the indirect effect, the constraint a3b = 0 is tested in the single-group analysis (WdiffS). The Wald statistic (with df = 1) is obtained by

$$
W = \hat{\theta}_1^{\,2} \,/\, \widehat{\mathrm{avar}}(\hat{\theta}_1) \quad (3)
$$

where θ1 = a3b and avar(θ̂1) = the estimated asymptotic variance of θ̂1, i.e., the estimated asymptotic variance of â3b̂. Likewise, for testing the group difference in the indirect effect in the multi-group model, the constraint aG1bG1 = aG2bG2 is tested (WdiffM). The Wald statistic (df = 1) is obtained by (3) with θ1 = aG1bG1 − aG2bG2 in the multi-group model.
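A minimal sketch of the Wald statistic in Equation (3) for the multi-group contrast is shown below, using a delta-method approximation of avar(θ̂1); the estimates and the (diagonal) asymptotic covariance matrix are hypothetical.

```python
# Minimal sketch of the Wald test in Equation (3) for theta = aG1*bG1 - aG2*bG2,
# using the delta method with a hypothetical (diagonal) parameter covariance matrix.
import numpy as np
from scipy.stats import chi2

est  = np.array([0.30, 0.40, 0.10, 0.40])        # assumed (aG1, bG1, aG2, bG2)
acov = np.diag([0.08, 0.07, 0.09, 0.07]) ** 2    # assumed asymptotic covariance matrix

theta = est[0] * est[1] - est[2] * est[3]                 # estimated group difference
grad  = np.array([est[1], est[0], -est[3], -est[2]])      # d(theta)/d(aG1, bG1, aG2, bG2)
avar  = grad @ acov @ grad                                # delta-method asymptotic variance

W = theta**2 / avar                                       # Wald statistic, df = 1
p_value = chi2.sf(W, df=1)
```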

A previous simulation study (Ryu, 2015) compared the performance of different methods for testing group difference in the indirect effect in multi-group analysis. In the previous study, the LR test performed well in terms of Type I error rate and statistical power. The percentile bootstrap confidence intervals for the group difference in indirect effect showed coverage rates that are close to the nominal level. The bias-corrected bootstrap confidence intervals were more powerful than the percentile bootstrap confidence intervals but the bias-corrected bootstrap confidence intervals showed inflated Type I error rates.

Single-Group and Multi-Group Approaches

The multi-group analysis model shown in Figure 2B is less restrictive than the single-group analysis model shown in Figure 2A. In the single-group model shown in Figure 2A, the b and c′ paths are assumed to be equal between groups, whereas the b and c′ paths are allowed to differ between groups in the multi-group model, unless additional equality constraints are imposed. It is possible to specify a single-group model that allows the b or c′ paths to differ between groups. In order to allow these parameters to differ between groups in the single-group model, additional parameters need to be estimated or additional interaction terms need to be added. If the model shown in Figure 2A is modified by specifying the path coefficients "Group → Y" and "X*Group → Y" to be freely estimated, that will allow c′ to differ between groups. In order to allow b to differ between groups, the model needs an additional variable "M*Group," and the path coefficients "Group → Y" and "M*Group → Y" need to be freely estimated. The multi-group model can be simplified by imposing the equality constraints bG1 = bG2 and/or c′G1 = c′G2.

In the single-group model, the variance and covariance parameters are assumed to be equal as well, whereas in the multi-group model those parameters are not restricted to be the same between groups unless additional equality constraints are imposed. Specifically, in the single-group analysis model (as shown in Figure 2A) the residual variances of M and Y are assumed to be equal in both groups. The equal variance assumption in the single-group analysis is one of the standard assumptions in general linear models. The assumption is that the conditional variance of the dependent variable is homogeneous at all levels of the independent variables. For example, in regression analysis, the conditional variance of the dependent variable is assumed to be equal at all levels of the predictor variable. In between-subject analysis of variance or in a t-test to compare two independent means, the within-group variance is assumed to be equal across all groups. It is well-known that the empirical Type I error rate can be different from the nominal level when the equal variance assumption is violated (e.g., Box, 1954; Glass et al., 1972; Dretzke et al., 1982; Aguinis and Pierce, 1998).

The purpose of this study is to introduce the single-group and multi-group approaches in SEM to comparing indirect effects between groups, and to empirically evaluate the performance of the statistical methods summarized in Table 1. Specifically, we evaluate how well the methods perform for three questions in the moderated mediation model: (i) comparing the a path (X → M) between groups, (ii) comparing the indirect effect between groups, and (iii) testing the simple indirect effect in each group. We also evaluate how robust the methods in the single-group analysis are when the assumption of equal variances does not hold between groups. We expected that the performance of the methods in the multi-group analysis would not be affected by the violation of the assumption of equal variances, because the multi-group analysis model does not rely on the assumption. In the single-group analysis, we expected that the performance of the za3S and WdiffS methods would be affected by the violation of the equal variance assumption, and that the confidence intervals produced by the bootstrapping methods (PCindS, BCindS) would not be affected. The estimates are expected to be unbiased regardless of whether the equal variance assumption is violated. The bootstrap sampling distribution is constructed from the estimates in the bootstrap samples. Therefore, as long as the violation of the equal variance assumption does not affect the unbiasedness of the estimates, the performance of the bootstrap confidence intervals is not expected to be affected by the violation of the assumption.

Simulation

We used the mediation model shown in Figure 2B as the population model. There were two distinctive groups (denoted by G1 and G2). We considered a total of 63 conditions: 21 populations × 3 sample sizes.

As shown in Table 2, the 21 populations were created by combinations of three sets of parameter values for structural paths (Populations I, II, and III) and seven sets of parameter values for residual variances (Populations -0, -M1, -M2, -M3, -Y1, -Y2, -Y3). In Population I, there was no group difference in the indirect effect (aG1bG1 = 0.165; aG2bG2 = 0.165). In Population II, there was no indirect effect in G1 and a small indirect effect in G2 (aG1bG1 = 0.000; aG2bG2 = 0.055); the group difference in the indirect effect was (aG1bG1 − aG2bG2) = −0.055. In Population III, there was no indirect effect in G1 and a large indirect effect in G2 (aG1bG1 = 0.000; aG2bG2 = 0.165); the group difference in the indirect effect was −0.165. The direct effect of X on Y was set to zero (i.e., c′G1 = c′G2 = 0) in all populations. A previous simulation study (Ryu, 2015) showed that the population value of the direct effect had little influence on the performance of the five methods for testing the group difference in the indirect effect. With each set of the parameter values for structural paths, there were seven patterns of residual variances of M and Y. In Population -0, the residual variances of M and Y were equal between the groups in the population. In Populations -M1, -M2, and -M3, the residual variance of M was smaller in G1. In Populations -Y1, -Y2, and -Y3, the residual variance of Y was smaller in G1. Note that the effect sizes varied depending on the residual variances. The proportions of explained variance in M and Y in the 21 populations are summarized in Table 3.

Table 2. Parameter values for structural paths a and b, and for residual variances of M and Y in population.

Table 3. Proportion of explained variance in M and Y in population.

We considered three different sample sizes for each of the 21 populations. Sample size 1: nG1 = 150; nG2 = 150. Sample size 2: nG1 = 200; nG2 = 100. Sample size 3: nG1 = 100; nG2 = 200. With Sample size 2, the residual variances were smaller in the larger group. With Sample size 3, the residual variances were smaller in the smaller group. We used Mplus 7 for data generation and estimation (Muthén and Muthén, 1998–2012). We used SAS PROC IML for resampling of the data to create bootstrap samples. We conducted 1,000 replications in each condition.
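For concreteness, the sketch below shows how one replication could be generated for a single cell of the design (the authors used Mplus for data generation, so this is not their code). It assumes standard-normal X, zero intercepts, and c′ = 0, and the residual variances shown are placeholders rather than the exact values in Table 2.

```python
# Minimal sketch of the data-generating step for one replication of one condition.
import numpy as np

def generate_group(n, a, b, var_m, var_y, rng, c_prime=0.0):
    """Generate (X, M, Y) for one group under the mediation model in Figure 2B."""
    x = rng.standard_normal(n)                                  # assumed X ~ N(0, 1)
    m = a * x + rng.normal(scale=np.sqrt(var_m), size=n)        # M = a*X + e_M
    y = b * m + c_prime * x + rng.normal(scale=np.sqrt(var_y), size=n)  # Y = b*M + c'*X + e_Y
    return x, m, y

rng = np.random.default_rng(2017)
# e.g., Population II with Sample size 2: no indirect effect in G1, a small one in G2
xg1, mg1, yg1 = generate_group(200, a=0.000, b=0.390, var_m=1.0, var_y=1.0, rng=rng)
xg2, mg2, yg2 = generate_group(100, a=0.141, b=0.390, var_m=1.0, var_y=1.0, rng=rng)
```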

We analyzed each of the generated data sets both in single-group analysis (0 = Group 1, 1 = Group 2) and in multi-group analysis to test the group difference in a path, the group difference in the indirect effect of X on Y via M, and the simple indirect effect in each group. We used the methods summarized in Table 1. We provide the sample syntax for data generation and analysis in the Appendix.

Evaluation of Methods

In order to check the data generation and estimation, we first examined the bias of the estimates. Bias was computed by (mean of estimates − true value in the population). Relative bias was computed by (bias / true value in the population) for the effects whose population values were not zero. In the single-group analysis, we compared the following estimates to their corresponding population values: individual path coefficients â1, â3, b̂, the simple indirect effect in Group 1 â1b̂, and the simple indirect effect in Group 2 (â1 + â3)b̂. In the multi-group analysis, we compared the following estimates to their corresponding population values: individual path coefficients âG1, b̂G1, âG2, b̂G2, the simple indirect effects in each group âG1b̂G1, âG2b̂G2, and the group difference in the indirect effect (âG1b̂G1 − âG2b̂G2).

To evaluate the performance of the methods, we examined the rejection rates, which can be interpreted as the Type I error rate (when the effect was zero in the population) or statistical power (when there was a non-zero effect in the population), for each method. For the z test of the a3 path (za3S), the LR tests (LRaM, LRdiffM), and the Wald tests (WdiffS, WdiffM), we used the α = 0.05 criterion. For the 95% confidence intervals, we computed the rejection rate as the proportion of replications in which the interval estimates did not include zero. We also examined coverage rates, width of confidence intervals, rate of left-side misses, rate of right-side misses, and the ratio of left-side to right-side misses for the interval estimates.
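A minimal sketch of how these outcome measures can be computed from the interval estimates across replications is given below; the left/right-side miss convention coded here (interval falling entirely to the left vs. entirely to the right of the true value) is our assumption about the labeling, and the function name is ours.

```python
# Minimal sketch of the outcome measures for interval estimates across replications.
import numpy as np

def summarize_intervals(lower, upper, true_value):
    """Rejection rate, coverage, average width, and left/right-side miss rates."""
    lower, upper = np.asarray(lower), np.asarray(upper)
    rejection = np.mean((lower > 0) | (upper < 0))                     # CI excludes zero
    coverage  = np.mean((lower <= true_value) & (true_value <= upper)) # CI contains true value
    width     = np.mean(upper - lower)                                 # average CI width
    left_miss  = np.mean(upper < true_value)   # interval lies entirely to the left of the true value
    right_miss = np.mean(lower > true_value)   # interval lies entirely to the right of the true value
    return rejection, coverage, width, left_miss, right_miss
```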

Results

As expected, the estimates were unbiased in all populations with all sample sizes. In the single-group analysis, the bias ranged from 0.007 to −0.005, and the relative bias ranged from −0.038 to 0.007. The estimates obtained in the single-group analysis were unbiased regardless of whether the assumption of equal residual variances was satisfied. In the multi-group analysis, the bias ranged from −0.004 to 0.007, and the relative bias ranged from −0.011 to 0.051.

We present the simulation results in three sections: methods for testing the group difference in a path, methods for testing the group difference in the indirect effect, and methods for testing simple indirect effect in each group.

Group Difference in a Path

Table 4 shows the empirical Type I error rates (nominal α = 0.05) of the methods for testing the group difference in a path in single-group (za3S) and multi-group analysis (LRaM) in Population I.

Table 4. Type I error rates of the methods for testing group difference in a path.

The Type I error rates of the LRaM method stayed close to the nominal level. But the za3S method resulted in inflated Type I error rates when the residual variance of M was smaller in the group with a larger sample size (Populations I-M1 to I-M3; nG1 = 200; nG2 = 100). The za3S method resulted in deflated Type I error rates when the residual variance of M was smaller in the group with a smaller sample size (Populations I-M2 and I-M3; nG1 = 100; nG2 = 200). Whether or not the residual variance of Y was equal between groups did not affect the Type I error rates of the za3S method. Figure 3 shows the empirical power of the two methods for Populations II and III.

Figure 3. Empirical power for testing group difference in X to M relationship (a path) in Population II (A) and in Population III (B). See Table 1 for description of the methods.

Note that the effect sizes differ across populations; Figure 3 compares the two methods, za3S and LRaM, within each condition. When the group sizes were equal, the power was similar for the two methods. When the residual variance of M was not equal (Populations II-M1 to II-M3, Populations III-M1 to III-M3), the za3S method showed higher power than the LRaM method with Sample size 2 (nG1 = 200; nG2 = 100), and lower power than the LRaM method with Sample size 3 (nG1 = 100; nG2 = 200).

Group Difference in the Indirect Effect

Type I Error Rates

Table 5 shows the empirical Type I error rates of the methods for testing the group difference in the indirect effect in Population I.

Table 5. Type I error rates of the methods for testing group difference in the indirect effect.

The Type I error rates for the WdiffS method were higher than the nominal level when the residual variance of M was smaller in the group with a larger sample size (Populations I-M2 and I-M3; nG1 = 200; nG2 = 100); and the Type I error rates were smaller than the nominal level when the residual variance of M was smaller in the group with a smaller sample size (Populations I-M1 to I-M3; nG1 = 100; nG2 = 200). This is a similar pattern to the Type I error rates of the za3S method in Table 4.

For the five methods in the multi-group analysis, the Type I error rates ranged from 0.049 to 0.068 with Sample size 1; ranged from 0.047 to 0.070 with Sample size 2; and ranged from 0.053 to 0.065 with Sample size 3. The equality of residual variances of M and Y in the population did not affect the Type I error rates of the five methods in the multi-group analysis. The Type I error rates of the BCdiffM method were slightly higher than the Type I error rates of the other methods.

Power

The empirical power for testing the group difference in the indirect effect in Populations II and III is shown in Figure 4.

Figure 4. Empirical power for testing group difference in the indirect effect in Population II (A) and in Population III (B). See Table 1 for description of the methods.

Note that the differences in empirical power across populations (i.e., across different lines) are due to different effect sizes, as shown in Table 3. The BCdiffM method showed higher power than the other methods. The WdiffM method showed lower power than the other methods in the multi-group analysis. For Population III, in which the group difference in the indirect effect was larger, the differences in empirical power between the methods were greater with the sample size nG1 = 200; nG2 = 100, i.e., when the indirect effect was zero in the larger group and larger in the smaller group. When the residual variance of M was not equal between groups (e.g., II-M1, …, II-M3, III-M1, …, III-M3), the WdiffS method yielded higher power than the other methods with the sample size nG1 = 200; nG2 = 100; note that the WdiffS method showed inflated Type I error rates in these conditions. The WdiffS method yielded lower power than the other methods with the sample size nG1 = 100; nG2 = 200; in these conditions, its Type I error rates were lower than the nominal level.

Coverage Rates, Width, and Misses

Three methods in multi-group analysis produced 95% confidence intervals for the group difference in the indirect effect: PCdiffM, BCdiffM, and MCdiffM. The results showed similar patterns in all simulation conditions. The performance of the three confidence intervals was comparable in terms of coverage, width, and misses. The coverage rates of the PCdiffM confidence intervals ranged from 0.927 to 0.951 (average = 0.939). The coverage rates of the BCdiffM confidence intervals ranged from 0.923 to 0.947 (average = 0.935). The coverage rates of the MCdiffM confidence intervals ranged from 0.926 to 0.949 (average = 0.934). On average, the coverage rates were slightly lower than the nominal level. The width of the confidence intervals produced by the three methods was similar to one another. The average width was 0.248 for PCdiffM, 0.250 for BCdiffM, and 0.246 for MCdiffM.

For PCdiffM, the average ratio of left- to right-side misses was 1.427, 1.927, and 1.824 in Populations I, II, and III, respectively. For BCdiffM, the average ratio was 1.274, 1.521, and 1.249 in Populations I, II, and III, respectively. For MCdiffM, the average ratio was 1.397, 1.783, and 1.664 in Populations I, II, and III, respectively. All three confidence intervals showed higher rates of left-side misses than right-side misses². The BCdiffM confidence intervals were the most balanced (i.e., average ratio closest to 1).

Simple Indirect Effect in Each Group

Type I Error Rates

The indirect effect was zero in Group 1 in Populations II and III. The results were similar in the two populations, so the Type I error rates for testing the simple indirect effect in Group 1 in Population II are shown in Table 6.

Table 6. Type I error rates for testing simple indirect effect in Group 1 in Population II.

In the single-group analysis, the Type I error rates were higher for the BCindS method than for the PCindS method. In the multi-group analysis, the PCindM and MCindM methods showed the Type I error rates that were close to the nominal level. Overall, the BCindM method resulted in higher Type I error rates than the PCindM and MCindM methods. The Type I error rates of the BCindM method were greater than 0.075 in some conditions (shown in bold).

Power

Figure 5 shows the power for testing the simple indirect effect in Group 2 in Population II, in which a = 0.141 and b = 0.390. When a = 0.424 and b = 0.390 in the population (i.e., both groups in Population I, and Group 2 in Population III), the power for testing the simple indirect effect was very high in all conditions.

Figure 5. Empirical power for testing simple indirect effect in Group 2 in Population II. See Table 1 for description of the methods.

Again, note that the differences in empirical power across populations (i.e., across different lines) are due to different effect sizes, as shown in Table 3. The BCindS and BCindM methods were slightly more powerful than the other methods. The PCindS, PCindM, and MCindM methods showed similar power.

Coverage Rates, Width, and Misses

In the single-group analysis, the coverage rates of the PCindS confidence intervals ranged from 0.926 to 0.952 (average = 0.939). The coverage rates of the BCindS confidence intervals ranged from 0.919 to 0.950 (average = 0.934). In the multi-group analysis, the coverage rates ranged from 0.920 to 0.962 (average = 0.937) for the PCindM method; from 0.910 to 0.953 (average = 0.932) for the BCindM method; and from 0.919 to 0.962 (average = 0.938) for the MCindM method. The results showed a similar pattern in Populations I, II, and III. We present the coverage rates for Group 1 in Population II in Figure 6.

Figure 6. Coverage rates of 95% confidence intervals for the simple indirect effects in Group 1 in Population II. See Table 1 for description of the methods.

The BCindS and BCindM methods yielded lower coverage rates than the other methods. The PCindS, PCindM, and MCindM methods showed more accurate coverage rates than the BCindS and BCindM methods.

On average, the confidence interval methods in the multi-group analysis resulted in wider intervals than those in the single-group analysis. The average width across all conditions was 0.147 for PCindS, and 0.148 for BCindS. In the multi-group analysis, the average width was 0.169 for PCindM, 0.172 for BCindM, and 0.168 for MCindM.

Table 7 shows the average ratio of left- to right-side misses of the confidence interval methods for the simple indirect effects.

Table 7. Average ratio of left- to right-side misses of confidence interval methods for simple indirect effects.

The confidence intervals showed higher rates of right-side misses for the simple indirect effects whose population values were positive, except BCindM in Population I. The confidence intervals showed higher rates of left-side misses for the simple indirect effects whose population values were zero. In both the single-group and multi-group analyses, the bias-corrected confidence intervals, BCindS and BCindM, were the most balanced (i.e., average ratio closest to 1).

Empirical Example

We illustrate the methods using empirical data from the PISA 2003 database (Programme for International Student Assessment, Organisation for Economic Co-operation and Development, 2004, 2005). We adopted a conceptual model from Yeung (2007). We compared the indirect effect of teachers' emotional support on math interest via math self-concept in Australia (AUS; N = 12,551) and Austria (AUT; N = 4,597). The estimated multi-group and single-group structural equation models are shown in Figure 7. We applied the methods for (i) comparing the a path between groups, (ii) comparing the indirect effect between groups, and (iii) testing the simple indirect effect in each group. In the multi-group model (Figure 7A), with the b path (Math self-concept → Math interest) and c′ path (Emotional support → Math interest) set equal between groups, χ2(2) = 0.464, p = 0.793, CFI = 1.000, RMSEA = 0.000, SRMR = 0.003. We kept the equality constraints on the b and c′ paths in the multi-group model so that the specification of the fixed effects is equivalent to the single-group model. In the single-group model (Figure 7B), we created a group variable to represent the two countries, with 0 = Australia (AUS) and 1 = Austria (AUT). The results are summarized in Table 8. In the multi-group model, the residual variances were slightly smaller in AUS, whose sample size was larger. This is similar to the Sample size 2 (nG1 = 200; nG2 = 100) condition in the simulation. In Table 8, LRaM was slightly more conservative than za3S in testing the group difference in the a path; LRdiffM and WdiffM were slightly more conservative than WdiffS in testing the group difference in the indirect effect. For the difference in the indirect effect, PCdiffM, BCdiffM, and MCdiffM yielded comparable results. For the simple indirect effect, PCindS, BCindS, PCindM, BCindM, and MCindM resulted in comparable interval estimates.

Figure 7. Estimated multi-group and single-group structural equation models. In the (A) multi-group model, the path coefficient "Math self-concept → Math interest" was set equal between groups; the path coefficient "Emotional support → Math interest" was set equal between groups. The estimate of the indirect effect was 0.292*0.497 = 0.145 in Australia (AUS) and 0.234*0.497 = 0.116 in Austria (AUT). In the (B) single-group model, group was coded 0 = AUS and 1 = AUT. The estimated indirect effect was 0.292*0.496 = 0.145 for AUS and (0.292 − 0.058)*0.496 = 0.116 for AUT. The estimated group difference in the indirect effect was 0.116 − 0.145 = −0.029.

Table 8. Empirical example results.

Summary and Discussion

When the research question involves comparing indirect effects between distinctive groups, researchers can choose either the single-group or the multi-group analysis approach in the SEM framework to incorporate the group membership as a categorical moderator. In this article, we evaluated statistical methods for (i) comparing a structural path (in our example, the a path or X → M relationship) between groups, (ii) comparing the indirect effect between groups, and (iii) testing the simple indirect effect in each group. We continue to use the abbreviated names of each method to summarize and discuss the results (see Table 1).

The key findings in the simulation study are:

(1) In the single-group analysis, the za3S and WdiffS methods may result in invalid statistical inferences when the assumption of equal variances does not hold.

(2) However, the performance of bootstrapping confidence intervals is robust even when the bootstrap estimates are obtained in the single-group model.

(3) The bias-corrected bootstrap confidence intervals are slightly more powerful than the percentile bootstrap and Monte Carlo confidence intervals, but at the cost of higher Type I error rates; and

(4) For comparing an indirect effect between groups, the likelihood ratio test in the multi-group analysis is as powerful as the other methods with the Type I error rate staying close to the desired level.

For testing the group difference in the a path, the assumption of equal variances was critical for the za3S method, but not for the LRaM method in the multi-group analysis. When the assumption was not satisfied, the za3S method showed inaccurate Type I error rates, as expected. The Type I error rates were inflated when the variance was larger in the smaller group, and deflated when the variance was larger in the larger group.

For testing the simple indirect effect in each group, the bootstrap confidence intervals in the single-group analysis (PCindS, BCindS) were not affected by the violation of the equal variances assumption. The PCindS and BCindS confidence intervals were obtained from the set of 1,000 estimates in the bootstrap samples. As shown in the simulation results, the estimates in the single-group analysis model were unbiased regardless of whether the assumption of equal variances was satisfied, so the empirical sampling distribution of the indirect effect is expected to be comparable whether or not the assumption holds. Therefore, the bootstrap confidence intervals obtained from the empirical sampling distribution were not affected by the violation of the assumption.

In the multi-group analysis, none of the methods showed differences in performance depending on whether or not the equal variances assumption was satisfied. These results were expected, because the variances are estimated in each group separately in the multi-group model.

In both single-group and multi-group approaches, the bias-corrected bootstrap methods (BCindS, BCindM, BCdiffM) tended to show slightly higher Type I error rates, higher statistical power, and lower coverage rates than the percentile bootstrap methods (PCindS, PCindM, PCdiffM). This pattern of results is consistent with what has been found in previous studies (e.g., Preacher et al., 2007; Preacher and Selig, 2012; Ryu, 2015). The Monte Carlo methods (MCindM, MCdiffM) performed similarly to the percentile bootstrap methods. The Type I error rates and the coverage rates of the confidence intervals were close to the desired level in all conditions. The empirical power was slightly lower than the bias-corrected bootstrap methods, but not by much. The largest difference in power was 0.091.

For the interval estimates of the group difference in the indirect effect, the average widths were comparable for all three methods in the multi-group analysis (PCdiffM, BCdiffM, MCdiffM). For the interval estimates of the simple indirect effects, the two methods in the single-group analysis (PCindS, BCindS) showed similar average widths, and the three methods in the multi-group analysis (PCindM, BCindM, MCindM) showed similar average widths. The multi-group methods resulted in wider interval estimates of the simple indirect effects than the single-group methods.

The confidence intervals for the simple indirect effects were unbalanced with higher rate of left-side misses when the simple indirect effect was zero in population, and unbalanced with higher rate of right-side misses when there was a positive simple indirect effect in population. For both the group difference in the indirect effect and the simple indirect effects, the bias-corrected bootstrapping methods (BCdiffM, BCindS, BCindM) were most balanced in terms of the ratio of left- and right-side misses.

In the multi-group analysis, the likelihood ratio test (LRdiffM) and the Wald test (WdiffM) performed well in terms of Type I error rates. But the WdiffM method showed lower power than the LRdiffM and the confidence intervals methods for testing the group difference in the indirect effect. The empirical power of the LRdiffM method was comparable to the power of PCdiffM and MCdiffM. These results are consistent with those found in a previous study (Ryu, 2015). In the single-group analysis, the performance of the Wald test (WdiffS) for testing the group difference in the indirect effect was affected by the violation of the equal variance assumption, particularly with unequal group sizes. The Type I error rates were higher than the desired level when the variance was larger in the smaller group. The Type I error rates were smaller than the nominal level when the variance was larger in the larger group.

In many cases, studies are conducted to address questions about means (unconditional or conditional) and relationships between variables, and the variance estimates are often neglected. It is important for researchers to pay attention to variance estimates, even when they are not of key interest. When the research question involves a moderation effect by a distinctive group membership, we recommend that the variance parameters be examined first with no restriction that the variances are equal in all groups. When it is reasonable to assume that the variances are equal, researchers may choose either the single-group or the multi-group analysis approach. When it is not reasonable to assume equal variances, the multi-group analysis is recommended. The single-group analysis resulted in unbiased parameter estimates even when the assumption was violated, but some methods for statistical inference were affected by the violation. If the single-group analysis is adopted, statistical methods must be chosen with careful consideration.

The multi-group analysis approach has advantages over the single-group approach in incorporating a categorical moderator in the model. First, the multi-group approach does not depend on the assumption of equal variances, so the parameter estimates and statistical inferences are not affected by whether the assumption is satisfied or violated. Second, it is less complicated to specify and test the group difference in more than one indirect effect. For example, suppose that a mediation model is hypothesized in which three indirect effects are specified between one independent variable (X), three mediating variables (M1, M2, and M3), and one dependent variable (Y). In order to specify a model that allows the three indirect effects to differ between groups, the single-group approach requires at least three additional product terms to represent the interaction with the group membership. The number of required product terms increases further if there are more than two levels of the categorical moderator, or if both the relationships between X and the mediators and the relationships between the mediators and Y differ between groups. In the multi-group analysis, however, the group differences can be specified and tested without increasing the number of variables in the model.

In conclusion, when the data are from more than one distinctive group, we recommend that researchers first examine parameter estimates (including variance parameters) in each group with no restriction before choosing to adopt the single-group analysis. For testing the group difference in the indirect effect in the multi-group analysis, the likelihood ratio test is more powerful than the Wald test, with the Type I error rate close to the desired level. For confidence intervals of the group difference in the indirect effect, the bias-corrected bootstrap confidence intervals were more powerful and more balanced than the percentile bootstrap and Monte Carlo confidence intervals, but at the cost of higher Type I error rates and lower coverage rates. For the simple indirect effect in each group, the bias-corrected bootstrap confidence intervals were more powerful than the percentile bootstrap and Monte Carlo confidence intervals, but again their Type I error rates were higher. Taken together, we recommend the likelihood ratio test along with the percentile bootstrap or Monte Carlo interval estimates for the group difference in the indirect effect, and the percentile bootstrap or Monte Carlo interval estimates for the simple indirect effect.

Author Contributions

ER and JC designed the study. ER conducted the simulation study and took a leading role in writing the manuscript. JC conducted a part of the simulation and participated in writing.

Conflict of Interest Statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Supplementary Material

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fpsyg.2017.00747/full#supplementary-material

Footnotes

1. In mediation analysis, the poor performance of the method based on the normality assumption is well-known. We included the normal-theory standard error method in the simulation study. As expected, and consistent with the previous findings in the literature, the normal standard error method did not perform well. We introduce the method here for the purpose of reviewing previous literature but do not consider the normal-theory standard error method hereafter to avoid redundancy. The normal-theory standard error does not appear in Table 1. We do not present simulation results regarding this method.

2. The confidence intervals were obtained for (aG1bG1 − aG2bG2). The rates of left-side and right-side misses would be reversed if the group difference is calculated in the opposite direction.

References

Aguinis, H., and Pierce, C. A. (1998). Heterogeneity of error variance and the assessment of moderating effects of categorical variables: a conceptual review. Organ. Res. Methods 1, 296–314. doi: 10.1177/109442819813002

Alwin, D. F., and Hauser, R. M. (1975). The decomposition of effects in path analysis. Am. Sociol. Rev. 40, 35–47. doi: 10.2307/2094445

Bentler, P. M., and Bonett, D. G. (1980). Significance tests and goodness of fit in the analysis of covariance structures. Psychol. Bull. 88, 588–606.

Bollen, K. A. (1987). "Total, direct, and indirect effects in structural equation models," in Sociological Methodology, ed C. C. Clogg (Washington, DC: American Sociological Association), 37–69.

Bollen, K. A. (1989). Structural Equations with Latent Variables. New York, NY: Wiley.

Bollen, K. A., and Stine, R. (1990). Direct and indirect effects: classical and bootstrap estimates of variability. Sociol. Methodol. 20, 115–140. doi: 10.2307/271084

Box, G. E. P. (1954). Some theorems on quadratic forms applied in the study of analysis of variance problems, I. Effect of inequality of variance in the one-way classification. Ann. Math. Stat. 25, 290–302. doi: 10.1214/aoms/1177728786

Dretzke, B. J., Levin, J. R., and Serlin, R. C. (1982). Testing for regression homogeneity under variance heterogeneity. Psychol. Bull. 91, 376–383. doi: 10.1037/0033-2909.91.2.376

Edwards, J. R., and Lambert, L. S. (2007). Methods for integrating moderation and mediation: a general analytical framework using moderated path analysis. Psychol. Methods 12, 1–22. doi: 10.1037/1082-989X.12.1.1

Gelfand, M. J., Brett, J., Gunia, B. C., Imai, L., Huang, T.-J., and Hsu, B.-F. (2013). Toward a culture-by-context perspective on negotiation: negotiating teams in the United States and Taiwan. J. Appl. Psychol. 98, 504–513. doi: 10.1037/a0031908

Glass, G. V., Peckham, P. D., and Sanders, J. R. (1972). Consequences of failure to meet assumptions underlying the fixed effects analysis of variance and covariance. Rev. Educ. Res. 42, 237–288. doi: 10.3102/00346543042003237

Hayes, A. F. (2015). An index and test of linear moderated mediation. Multivariate Behav. Res. 50, 1–22. doi: 10.1080/00273171.2014.962683

Levant, R. F., Parent, M. C., McCurdy, E. R., and Bradstreet, T. C. (2015). Moderated mediation of the relationships between masculinity ideology, outcome expectations, and energy drink use. Health Psychol. 34, 1100–1106. doi: 10.1037/hea0000214

MacCallum, R. C., Zhang, S., Preacher, K. J., and Rucker, D. D. (2002). On the practice of dichotomization of quantitative variables. Psychol. Methods 7, 19–40.

MacKinnon, D. P., Lockwood, C. M., and Williams, J. M. (2004). Confidence limits for the indirect effect: distribution of the product and resampling methods. Multivariate Behav. Res. 39, 99–128. doi: 10.1207/s15327906mbr3901_4

MacKinnon, D. P., Lockwood, C. M., Hoffman, J. M., West, S. G., and Sheets, V. (2002). A comparison of methods to test mediation and other intervening variable effects. Psychol. Methods 7, 83–104. doi: 10.1037//1082-989X.7.1.83

Maxwell, S. E., and Delaney, H. D. (1993). Bivariate median splits and spurious statistical significance. Psychol. Bull. 113, 181–190. doi: 10.1037/0033-2909.113.1.181

Muthén, L. K., and Muthén, B. O. (1998–2012). Mplus User's Guide, 7th Edn. Los Angeles, CA: Muthén & Muthén.

Organisation for Economic Co-operation and Development (2004). Learning for Tomorrow's World: First Results from PISA 2003. Paris: Organisation for Economic Co-operation and Development.

Organisation for Economic Co-operation and Development (2005). PISA 2003 Technical Report. Paris: Organisation for Economic Co-operation and Development.

Preacher, K. J., and Hayes, A. F. (2004). SPSS and SAS procedures for estimating indirect effects in simple mediation models. Behav. Res. Methods Instrum. Comput. 36, 717–731. doi: 10.3758/BF03206553

Preacher, K. J., and Selig, J. P. (2012). Advantages of Monte Carlo confidence intervals for indirect effects. Commun. Methods Meas. 6, 77–98. doi: 10.1080/19312458.2012.679848

Preacher, K. J., Rucker, D. D., and Hayes, A. F. (2007). Addressing moderated mediation hypotheses: theory, methods, and prescriptions. Multivariate Behav. Res. 42, 185–227. doi: 10.1080/00273170701341316

Rucker, D. D., McShane, B. B., and Preacher, K. J. (2015). A researcher's guide to regression, discretization, and median splits of continuous variables. J. Consum. Psychol. 25, 666–678. doi: 10.1016/j.jcps.2015.04.004

Ryu, E. (2015). Multi-group analysis approach to testing group difference in indirect effects. Behav. Res. Methods 47, 484–493. doi: 10.3758/s13428-014-0485-8

Schnitzspahn, K. M., Thorley, C., Phillips, L., Voigt, B., Threadgold, E., Hammond, E. R., et al. (2014). Mood impairs time-based prospective memory in young but not older adults: the mediating role of attentional control. Psychol. Aging 29, 264–270. doi: 10.1037/a0036389

Shrout, P. E., and Bolger, N. (2002). Mediation in experimental and nonexperimental studies: new procedures and recommendations. Psychol. Methods 7, 422–445. doi: 10.1037/1082-989X.7.4.422

Wald, A. (1943). Tests of statistical hypotheses concerning several parameters when the number of observations is large. Trans. Am. Math. Soc. 54, 426–482. doi: 10.2307/1990256

Yeung, K. (2007). Perception of Teacher Emotional Support and Parental Education Level: The Impacts on Students' Math Performance. Unpublished doctoral dissertation, University of Leicester.

Keywords: moderated mediation, moderated indirect effect, group difference in mediation, multi-group analysis, simple indirect effect

Citation: Ryu E and Cheong J (2017) Comparing Indirect Effects in Different Groups in Single-Group and Multi-Group Structural Equation Models. Front. Psychol. 8:747. doi: 10.3389/fpsyg.2017.00747

Received: 23 December 2016; Accepted: 24 April 2017;
Published: 11 May 2017.

Edited by:

Prathiba Natesan, University of North Texas, USA

Reviewed by:

Jiun-Yu Wu, National Chiao Tung University, Taiwan
Victoria Savalei, University of British Columbia, Canada

Copyright © 2017 Ryu and Cheong. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Ehri Ryu, ehri.ryu.1@bc.edu
