Psychometric Properties of the Chinese Version of the Neuroticism Subscale of the NEO-PI

Xi, Chang; Zhong, Mingtian; Lei, Xiaoxia; Liu, Ying; Ling, Yu; Zhu, Xiongzhao; Yao, Shuqiao; Yi, Jinyao

doi:10.3389/fpsyg.2018.01454

ORIGINAL RESEARCH article

Front. Psychol., 17 August 2018

Sec. Quantitative Psychology and Measurement

Volume 9 - 2018 | https://doi.org/10.3389/fpsyg.2018.01454


Chang Xi¹

Mingtian Zhong²

Xiaoxia Lei¹

Ying Liu¹

Yu Ling³

Xiongzhao Zhu¹,⁴

Shuqiao Yao¹,⁴

Jinyao Yi¹,⁴*

¹Medical Psychological Center, The Second Xiangya Hospital, Central South University, Changsha, China
²Center for Studies of Psychological Application, School of Psychology, South China Normal University, Guangzhou, China
³Education Institute, Hunan Agricultural University, Changsha, China
⁴Medical Psychological Institute of Central South University, Changsha, China

Neuroticism is an important concept in psychology, and self-report measures of neuroticism are important for both research and clinical practice. The neuroticism subscale of the Neuroticism-Extraversion-Openness Personality Inventory (NEO-PI) is a brief measure of neuroticism that is widely used around the world. This study aimed to examine the psychometric properties of the Chinese version of the neuroticism subscale of the NEO-PI. A total of 5,494 undergraduates from three universities and 551 clinical patients with mental disorders from a psychological clinic completed the Chinese version of the neuroticism subscale of the NEO-PI. Confirmatory factor analysis was performed to examine how well three hypothetical models fit the data and to test the measurement equivalence of the neuroticism subscale across gender. Internal consistency and test-retest reliability were also evaluated. Both the six-facet model and the bi-factor model (six-facet model with one general factor) achieved satisfactory fit, and the six-facet model had the best fit (Undergraduate sample: TLI = 0.919, CFI = 0.933, RMSEA = 0.044, SRMR = 0.033; Clinical sample: TLI = 0.921, CFI = 0.935, RMSEA = 0.047, SRMR = 0.041) and showed measurement equivalence across gender. The neuroticism subscale also showed acceptable internal consistency and good stability. Within the undergraduate sample, there were statistically significant gender differences in the neuroticism total score and the scores of all six facets, whereas there were no significant gender differences in neuroticism scores in the clinical sample. In both the undergraduate sample and the clinical sample, the anxiety, depression and vulnerability facets of the neuroticism subscale significantly predicted the depression level, while the anxiety, angry-hostility and vulnerability facets significantly predicted the anxiety level.
In conclusion, the Chinese version of the neuroticism subscale is a reliable and valid measure of neuroticism in both undergraduate and clinical populations.

Introduction

While the term neuroticism dates back to Freudian theory, the modern concept of neuroticism was introduced by Eysenck. Eysenck recognized neuroticism as a trait of emotionality, specifically the tendency to arouse quickly when stimulated and to inhibit emotions slowly (Eysenck and Michael, 1985). Neuroticism is operationally defined by the factors of items referring to irritability, anger, sadness, anxiety, worry, hostility, and self-consciousness (Costa and McCrae, 1992b).

Research on neuroticism is important to public health because of the robust correlation between neuroticism and a wide variety of mental and physical health problems (Malouff et al., 2005, 2006). Neurotic individuals have a limited tolerance for aversive stimuli and tend to experience negative emotions, such as anger, anxiety and depression (McCrae and Costa, 1997, 2008). Individuals with higher levels of neuroticism not only tend to develop mood disorders, but also exhibit other disorders, such as substance use disorder and eating disorders (Widiger, 2001). Moreover, neurotic individuals are at higher risk of engaging in potentially destructive behaviors, such as smoking and excessive drinking, that may harm health and shorten life expectancy (Daniel et al., 2009).

Self-report questionnaires are the most commonly used tools for assessing neuroticism. The Eysenck Personality Questionnaire (Eysenck et al., 1985), the Neuroticism-Extraversion-Openness Personality Inventory (NEO-PI; John and Kentle, 1991), and the Dutch Personality Questionnaire (Kerkhof, 2003) have demonstrated adequate psychometric properties in assessing neuroticism. The most widely used instrument to date is the 48-item neuroticism subscale of the NEO-PI, which assesses individual differences in the predisposition to experience negative emotional states associated with symptoms of depression, anxiety and high arousal (Pervin, 1999; Matthews, 2000). The NEO-PI is a 240-item self-report questionnaire based on the five-factor model (the Big Five) of personality: Extraversion, Neuroticism, Conscientiousness, Agreeableness, and Openness/Intellect. Each factor of the NEO-PI includes six facets. Over the past two decades, the Big Five factors have become the most prominent model for describing the structure of personality traits (Egger et al., 2003; Malouff et al., 2006; Ortet et al., 2012). Among the Big Five factors, neuroticism has been the most studied by researchers around the world. The neuroticism subscale of the NEO-PI measures six facets of neuroticism: anxiety, angry-hostility, depression, self-consciousness, impulsiveness and vulnerability (Costa and McCrae, 1992b). The NEO-PI has been shown to be valid across cultures and ages, and each facet scale includes eight items so that the scales are comparable in many respects (Terracciano, 2011). Considering the wide use of the neuroticism subscale of the NEO-PI, studies focusing on its psychometric properties are warranted.

Prior research has found that women score higher than men on almost all facets of neuroticism and on the total level of neuroticism (Grossman and Wood, 1993; Lynn and Martin, 1997; Costa et al., 2001; Terracciano, 2003). One study including data from 26 countries concluded that women scored higher than men on neuroticism as measured by the neuroticism subscale of the NEO-PI, as well as on most facets of neuroticism (Costa et al., 2001). A study in Italy also found that women scored significantly higher than men on total neuroticism and on each of its facets (Terracciano, 2003). Since neuroticism increases the risk of a wide range of mental disorders, the gender difference in neuroticism may contribute to the gender differences seen in major psychopathologies. For example, women diagnosed with many mental disorders (e.g., phobias, major depression, dysthymic disorder, generalized anxiety disorder, borderline personality disorder) were associated with neuroticism at a higher rate than men (Grossman and Wood, 1993). Although many previous studies support the difference in neuroticism between men and women (Foot and Koszycki, 2010; Banzhaf et al., 2012; Konishi et al., 2014), the observed gender difference in neuroticism reflects a true difference between males and females only when the latent structure is equivalent across gender.

Measurement invariance is “the mathematical equality of corresponding measurement parameters for a given factorially defined construct across groups” (Little, 1997, p. 55). Generally, tests of measurement invariance examine whether an instrument's items operate in the same way in different groups, so that estimated parameters reflect differences in the latent construct (Meredith, 1993). Simply put, they examine whether an instrument's items measure the same construct in different groups. If the neuroticism scale shows poor measurement invariance across gender, the observed difference between males and females may be confounded by the fact that different constructs are being measured.

Therefore, the purpose of this study was to examine the psychometric properties of the 48-item neuroticism subscale of the NEO-PI in a Chinese population. One undergraduate sample and one clinical sample were used. Specifically, the current study examined measurement invariance across gender in the undergraduate sample.

Methods

Participants

From March 2015 to October 2015, 5,612 undergraduates from three universities in Changsha were invited to participate in the study. A total of 5,494 (97.9%) students [2,878 (52.4%) men, 2,616 (47.6%) women; aged 19–30 (Mean = 25.0; standard deviation (SD) = 1.02)] provided complete data anonymously. To estimate test–retest reliability, a subgroup of 865 students who were taking a psychology course completed the neuroticism subscale of the NEO-PI twice, with a 2-month interval.

In addition to the subjects described above, from March 2015 to March 2017, 568 outpatients who had been referred for assessment and treatment in the psychological clinic of Second Xiangya Hospital were also asked to participate in the study. Patients who could not understand the questions well, such as patients with intellectual disability, were excluded. A total of 551 (97%) outpatients provided complete data, consisting of 281 men (51%) and 270 women (49%), ranging in age from 16 to 77 (Mean = 31.41; SD = 18.52). The diagnoses of the clinical sample were schizophrenia, depressive disorder, anxiety disorder, obsessive-compulsive disorder, trauma and stress-related disorder, somatic symptom and related disorder, personality disorder and other mental disorders. The clinical sample was significantly older than the undergraduate sample (t = 5.772, p < 0.001), but no significant gender difference was found between the two samples (χ² = 0.386, df = 1, p = 0.535).

The study was approved by the Ethics committee of Second Xiangya Hospital, Central South University. All participants provided written informed consent at the time of enrollment.

Instruments

The Neuroticism Subscale of the NEO-PI

The NEO-PI is the most comprehensive self-report questionnaire that measures the five-factor model of personality, in which neuroticism is included (Costa and McCrae, 1997; Xu and Potenza, 2012). The NEO-PI consists of 240 items and has been extensively validated (John and Kentle, 1991; Luchetti et al., 2018). It measures five major factors and 30 facets. The internal consistency of the NEO-PI factors was high (0.86 to 0.92), and the internal consistency of its facets ranged from 0.58 to 0.81 (Costa and McCrae, 1992a). The neuroticism subscale of the NEO-PI includes 48 items, each rated on a 5-point Likert scale (0–4), with the total score ranging from 0 to 192. Higher scores indicate a higher level of neuroticism. The neuroticism subscale includes six facets: anxiety, angry-hostility, depression, self-consciousness, impulsiveness, and vulnerability. Each facet is measured by eight items, and the score of each facet ranges from 0 to 32 (Dai and Yao, 2004). The neuroticism subscale of the NEO-PI has previously been shown to be valid in the assessment of neuroticism among Chinese people (Dai et al., 1999).

The Center for Epidemiological Studies Depression Scale

The 20-item Center for Epidemiological Studies Depression Scale (CES-D; Radloff, 1977) assesses current depressive symptoms in the general population. Each item is rated on a 4-point scale (0–3), so the total score ranges from 0 to 60. Higher scores indicate a greater presence of depressive symptoms. The Chinese version of the CES-D has acceptable reliability (Cronbach's α = 0.88) and validity in the Chinese population (Wang et al., 2013).

The Self-Rating Anxiety Scale

The Zung Self-Rating Anxiety Scale (SAS) was designed to quantify an individual's level of anxiety (Zung, 1965). Each question on the SAS is scored on a Likert-type scale of 1–4 (1 = “rarely,” 2 = “some of the time,” 3 = “often,” and 4 = “most of the time”). The total score ranges from 20 to 80. Higher scores indicate more symptoms of anxiety. The SAS has demonstrated acceptable reliability (Cronbach's α = 0.78) and validity in the assessment of anxiety in Chinese-speaking samples (Luo et al., 2006; Wang and Tang, 2011).

Statistical Analyses

To evaluate the construct validity of the neuroticism subscale, we performed robust maximum-likelihood (MLR) confirmatory factor analysis (CFA) using Mplus 7.11 software. The CFA was used to determine how well the factor models fit the data, and to examine the measurement equivalence of the neuroticism subscale across gender. We examined three hypothetical models: the original single-factor model (one general factor), the six-facet model (six facets: anxiety, angry-hostility, depression, self-consciousness, impulsiveness, and vulnerability), and the bi-factor model (six-facet model with one general factor, each item loading on both one facet and the general factor) (Reise et al., 2010). Each model included 24 parcels, which were created by randomly selecting two items within a facet for each parcel (Kishton and Widamen, 1994; Floyd and Widamen, 1995), so that each facet included four parcels (see Supplementary Material). Several fit indices were used to evaluate model fit: the Tucker-Lewis index (TLI), the comparative fit index (CFI), and the root-mean-square error of approximation (RMSEA) (Chou et al., 1991; Hu and Bentler, 1998). The criteria for acceptable fit were TLI ≥ 0.90, CFI ≥ 0.90, and RMSEA ≤ 0.08 (Browne and Cudeck, 1993; Hu and Bentler, 1999). To further compare the fit of the competing models, the χ² difference test is also reported.
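The parceling step described above can be sketched in a few lines. This is only an illustration under stated assumptions: the item-to-facet numbering (items 1–8 for the first facet, 9–16 for the second, and so on), the seed, and the choice to score a parcel as the sum of its two items are all hypothetical, since the actual item assignment is given only in the Supplementary Material.

```python
import random

# Facet names follow the paper; the item numbering below is hypothetical.
FACETS = [
    "anxiety", "angry_hostility", "depression",
    "self_consciousness", "impulsiveness", "vulnerability",
]


def make_parcels(items_by_facet, seed=0):
    """Randomly pair the items of each facet into two-item parcels."""
    rng = random.Random(seed)  # fixed seed so the pairing is reproducible
    parcels = {}
    for facet, items in items_by_facet.items():
        shuffled = list(items)
        rng.shuffle(shuffled)
        # Consecutive pairs of the shuffled list become the parcels.
        parcels[facet] = [
            tuple(shuffled[i:i + 2]) for i in range(0, len(shuffled), 2)
        ]
    return parcels


def parcel_scores(responses, parcels):
    """Sum the two item responses (0-4 each) in every parcel (an assumption)."""
    return {
        facet: [responses[a] + responses[b] for a, b in pairs]
        for facet, pairs in parcels.items()
    }


# Assumed layout: items 1-8 form the first facet, 9-16 the second, and so on.
items_by_facet = {
    facet: list(range(8 * i + 1, 8 * i + 9)) for i, facet in enumerate(FACETS)
}
parcels = make_parcels(items_by_facet, seed=2018)

# One made-up respondent who answers 2 on every item.
responses = {item: 2 for item in range(1, 49)}
scores = parcel_scores(responses, parcels)
```

With eight items per facet, this yields four parcels per facet and 24 parcels in total, matching the counts reported in the text.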

To assess the measurement invariance of the neuroticism subscale across gender in the undergraduate sample, multi-group confirmatory factor analysis (MGCFA) was used. The test of measurement invariance was divided into four levels from low to high: configural invariance, weak factorial invariance, strong factorial invariance, and strict invariance (Joreskog, 1971). The four levels are hierarchical, so the analyses were carried out step by step: (1) configural invariance (model 1), in which a baseline model was constructed for each group such that, for both men and women, the following criteria were met: TLI ≥ 0.90, CFI ≥ 0.90, and RMSEA ≤ 0.08; (2) weak factorial invariance (model 2), in which factor loadings were constrained to be equal across groups; (3) strong factorial invariance (model 3), in which the factor loadings and intercepts of variables were equal across groups; and (4) strict invariance (model 4), in which the factor loadings, intercepts of variables and error variances were equal across groups. Strict invariance is not necessary for most research (Widaman and Reise, 1997). Measurement invariance was considered established when at least two of the following were satisfied: the χ² difference test resulted in a p-value > 0.05, the change in CFI was < 0.01, the change in TLI was < 0.01, and the change in RMSEA was < 0.015 (Cheung and Rensvold, 2002; Chen, 2007; Ferro and Boyle, 2013).
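The decision rule for each step of the invariance sequence can be written as a small check. The sketch below assumes the rule means "at least two of the four criteria" and compares absolute changes between nested models. The function name and the example values are hypothetical and are not taken from Table 3.

```python
def invariance_supported(p_chi2_diff, delta_cfi, delta_tli, delta_rmsea, min_met=2):
    """Return True when at least `min_met` of the four criteria are satisfied."""
    criteria = [
        p_chi2_diff > 0.05,        # non-significant chi-square difference test
        abs(delta_cfi) < 0.01,     # change in CFI
        abs(delta_tli) < 0.01,     # change in TLI
        abs(delta_rmsea) < 0.015,  # change in RMSEA
    ]
    return sum(criteria) >= min_met


# Hypothetical comparison of two nested models (not values from the paper):
# the chi-square test and the CFI change fail, the TLI and RMSEA changes pass.
example = invariance_supported(
    p_chi2_diff=0.001, delta_cfi=-0.012, delta_tli=-0.004, delta_rmsea=0.002
)
```

Because the χ² difference test is sensitive to sample size, a rule of this kind lets the change-in-fit indices carry the decision in large samples such as the undergraduate group.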

Cronbach's alphas (α) and mean inter-item correlations (M_IC) for the full neuroticism scale and for each of the facets were calculated to evaluate internal reliability. Generally speaking, a minimum standard of 0.70 is set for Cronbach's α coefficients, but an α of 0.60 is also considered acceptable (DeVellis, 1991). An optimal range of 0.10–0.40 was set for the M_IC (Briggs and Venue, 1986; Nunnally, 1994). The Pearson correlation coefficient (r) was used to evaluate test–retest reliability, with a minimum standard of 0.70 (Anastasi and Urbina, 1997). The relationships between the total scale and the six facets were also examined with Pearson's r. Independent-samples t-tests were used to compare neuroticism subscale scores between the undergraduate sample and the clinical sample, and between men and women.
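For readers who want to reproduce the two internal-consistency statistics named above, a minimal sketch follows. It uses the standard formulas (α from item and total-score variances, M_IC as the mean of all pairwise Pearson correlations). The three-item toy data are made up for illustration and have nothing to do with the study data.

```python
from itertools import combinations
from statistics import mean, variance


def pearson_r(x, y):
    """Pearson correlation between two equally long score lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def cronbach_alpha(items):
    """items: one list of scores per item, all covering the same respondents."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]  # total score per respondent
    item_var_sum = sum(variance(col) for col in items)
    return k / (k - 1) * (1 - item_var_sum / variance(totals))


def mean_inter_item_correlation(items):
    """Average Pearson r over all distinct item pairs."""
    return mean(pearson_r(a, b) for a, b in combinations(items, 2))


# Made-up responses (0-4) from five respondents to three items.
toy_items = [
    [0, 1, 2, 3, 4],
    [1, 1, 2, 4, 3],
    [0, 2, 1, 3, 4],
]
alpha = cronbach_alpha(toy_items)
m_ic = mean_inter_item_correlation(toy_items)
```

The resulting values would then be compared with the thresholds stated in the text (α of at least 0.60 to 0.70, M_IC between 0.10 and 0.40).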

To examine whether depression and anxiety were predicted by demographic variables and neuroticism, we performed multiple linear regression in both the undergraduate sample and the clinical sample, with the CES-D total score and the SAS total score as the respective dependent variables.

Results

Descriptive Statistics

The total scores on the neuroticism subscale of the NEO-PI ranged from 11 to 155 (Mean = 72.12; SD = 20.99) in the undergraduate sample and from 19 to 175 (Mean = 106.84; SD = 28.36) in the clinical sample.

The Goodness of Fit Indices for Neuroticism Subscale

In both the undergraduate sample and the clinical sample, the fit indices of the six-facet model and the bi-factor model (six-facet model with one general factor) all reached acceptable standards, but the single-factor model did not fit well (Table 1). The six-facet model showed a significant improvement over the bi-factor model in the clinical sample, χ² (diff) = 24.79, p < 0.001, but not in the undergraduate sample, χ² (diff) = 2.78, p > 0.05. The standardized factor loadings of the six-facet model are presented in Table 2.

TABLE 1

Table 1. The model fit indices of the six-facet model, the single-factor model and the six-facet bi-factor model in the two groups.

TABLE 2

Table 2. The factor loading of each parcel in the six-facet model.

Measurement Invariance Across Gender for the Undergraduate Sample

As the six-facet model fitted the data best, we chose it to test measurement invariance across gender. As shown in Table 3, the baseline models met the fit criteria for both the male (TLI = 0.917; CFI = 0.931; RMSEA = 0.041) and the female undergraduates (TLI = 0.915; CFI = 0.930; RMSEA = 0.047), providing evidence for configural invariance. Furthermore, the changes in TLI and RMSEA (ΔTLI < 0.010, ΔRMSEA < 0.015) supported weak equivalence and strong equivalence of the neuroticism subscale. In other words, the neuroticism subscale demonstrated good measurement equivalence across gender.
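As a quick check, the baseline fit indices reported in this paragraph can be compared against the cutoffs given in the Statistical Analyses section (TLI ≥ 0.90, CFI ≥ 0.90, RMSEA ≤ 0.08). The function and variable names below are illustrative only.

```python
def acceptable_fit(tli, cfi, rmsea):
    """Apply the cutoffs stated in the paper: TLI >= 0.90, CFI >= 0.90, RMSEA <= 0.08."""
    return tli >= 0.90 and cfi >= 0.90 and rmsea <= 0.08


# Baseline (configural) fit indices as reported in the text: (TLI, CFI, RMSEA).
baseline = {
    "male": (0.917, 0.931, 0.041),
    "female": (0.915, 0.930, 0.047),
}
configural_ok = {group: acceptable_fit(*idx) for group, idx in baseline.items()}
```

Both groups pass all three cutoffs, which is the condition the text treats as evidence of configural invariance.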

TABLE 3

Table 3. Measurement invariance of the neuroticism subscale across gender in the undergraduate sample.

Reliability

In the undergraduate sample, the Cronbach's α coefficient was 0.91 for the neuroticism subscale, and ranged from 0.58 (self-consciousness) to 0.77 (vulnerability) for the six facets. In the clinical sample, the Cronbach's α coefficient was 0.93 for the neuroticism subscale, and ranged from 0.54 (self-consciousness) to 0.83 (depression) for its six facets (Table 4).

TABLE 4

Table 4. The Cronbach's α, mean inter-item correlation and test-retest reliability for the neuroticism subscale and its six facets.

The M_IC of the neuroticism subscale was 0.18 in the undergraduate sample and 0.21 in the clinical sample. In the undergraduate sample, the M_IC of the six facets ranged from 0.15 (self-consciousness) to 0.30 (vulnerability), while in the clinical sample it ranged from 0.13 (self-consciousness) to 0.38 (depression) (Table 4).

In the undergraduate sample, the test-retest coefficient for the neuroticism subscale was 0.71, and the test-retest coefficients for the six facets ranged from 0.52 (impulsiveness) to 0.64 (depression), see Table 4. Test-retest reliability was not evaluated in the clinical sample.

Intercorrelation Among the Total Neuroticism Subscale and Its Six Facets

The correlation coefficients among the total score of neuroticism subscale and its six facets ranged from 0.44 to 0.85 in the undergraduate sample, and from 0.47 to 0.87 in the clinical sample (Table 5). All correlation coefficients were positive and statistically significant (p < 0.001).

TABLE 5

Table 5. Intercorrelations among total neuroticism subscale and its six facets.

Gender Differences and Group Differences in Neuroticism

In the undergraduate sample, females scored significantly higher than males on the total score of the neuroticism subscale and on all six facets (Total score: t = 17.031, p < 0.001, Cohen's d = 0.46; Angry-Hostility score: t = 11.775, p < 0.001, Cohen's d = 0.32; Anxiety score: t = 13.868, p < 0.001, Cohen's d = 0.37; Depression score: t = 11.197, p < 0.001, Cohen's d = 0.30; Self-Consciousness score: t = 9.688, p < 0.001, Cohen's d = 0.26; Impulsiveness score: t = 12.547, p < 0.001, Cohen's d = 0.34; and Vulnerability score: t = 21.358, p < 0.001, Cohen's d = 0.58). There were no significant differences between males and females in the scores on the neuroticism subscale or its six facets in the clinical sample (Table 6). Compared with the undergraduate group, the clinical group scored significantly higher on total neuroticism (t = 35.70, p < 0.001, Cohen's d = 1.39) and on all six facets (t: 19.06–33.96, all p < 0.001, Cohen's d: 0.80–1.31).
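The reported statistics can be cross-checked from the summary figures given in the text. The sketch below assumes an equal-variance (pooled) independent-samples t-test and the conversion d = t·√(1/n₁ + 1/n₂). The paper does not state which formulas were used, so this is a consistency check under those assumptions and not a reconstruction of the authors' analysis. The group sizes (2,616 women and 2,878 men; 551 patients and 5,494 undergraduates) and the means and SDs come from the Participants and Descriptive Statistics sections.

```python
from math import sqrt


def cohens_d_from_t(t, n1, n2):
    """Cohen's d implied by an equal-variance independent-samples t statistic."""
    return t * sqrt(1 / n1 + 1 / n2)


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Independent-samples t from summary statistics, assuming equal variances."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    se = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se


# Gender difference on the total score in the undergraduate sample.
d_total = cohens_d_from_t(17.031, 2616, 2878)
# Gender difference on the vulnerability facet.
d_vulnerability = cohens_d_from_t(21.358, 2616, 2878)
# Clinical versus undergraduate total score, from the reported means and SDs.
t_groups = pooled_t(106.84, 28.36, 551, 72.12, 20.99, 5494)
```

Under these assumptions the gender effect sizes round to the reported 0.46 and 0.58, and the between-sample t comes out close to the reported 35.70. The between-sample d of 1.39 is not recovered by the same conversion, which suggests it was computed with a different standardizer.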

TABLE 6

Table 6. Scores on total neuroticism subscale and each of the six facets separated by gender (Means ± SD, ranges).

Predictive Validity of Demographic Variables and the Six Facets of Neuroticism Subscale for Depression and Anxiety

Results of the multiple regression analyses with the CES-D total score as the dependent variable are presented in Table 7. For the undergraduate sample, the demographic variables (gender and age) and the anxiety, angry-hostility, depression, self-consciousness and vulnerability facets significantly predicted the CES-D total score (p < 0.05), whereas the impulsiveness facet did not (p > 0.05). For the clinical sample, the anxiety, depression and vulnerability facets significantly predicted the CES-D total score (p < 0.05), but none of the other three facets of the neuroticism subscale did (p > 0.05).

TABLE 7

Table 7. Multiple regression analysis with the CES-D total score as the dependent variable.

As Table 8 shows, for the undergraduate sample, the demographic variables (gender and age) and the anxiety, angry-hostility, depression, impulsiveness and vulnerability facets all significantly predicted the SAS total score (p < 0.05), whereas the self-consciousness facet did not (p > 0.05). For the clinical sample, the anxiety, angry-hostility and vulnerability facets significantly predicted the SAS total score (p < 0.05), while none of the other three facets of the neuroticism subscale did (p > 0.05).

TABLE 8

Table 8. Multiple regression analysis with the SAS total score as the dependent variable.

Discussion

The neuroticism subscale of the NEO-PI is widely used around the world and is considered a valid and reliable measure of the personality trait of neuroticism (Costa and McCrae, 1992a; Nunnally, 1994). The present study evaluated the reliability and validity of the neuroticism subscale in two Chinese samples (an undergraduate sample and a clinical sample), and examined its measurement invariance across gender in the undergraduate sample. Three hypothetical models were tested: a single-factor model, a six-facet model and a bi-factor model (six-facet model with one general factor). The fit indices of the six-facet model and the bi-factor model met the fit standards in both the undergraduate sample and the clinical sample, supporting the construct validity of the neuroticism subscale. Since the six-facet model fitted the data best, it was chosen for the subsequent measurement invariance analysis. In the undergraduate sample, the fit indices for both males and females indicated a good fit of the neuroticism subscale's theoretical structure. In addition, the Chinese version of the neuroticism subscale demonstrated acceptable reliability and factorial validity in this study.

The Cronbach's α coefficients for the total neuroticism subscale reached accepted standards (α > 0.90) in both the undergraduate sample and the clinical sample. Among the six facets, five had α values above 0.60 in both samples, while the self-consciousness facet had a value below 0.60. These results were consistent with those of the original English version, for which the internal reliability of the facets ranged from 0.58 to 0.81 (Costa and McCrae, 1992a). Hence, the α values found in this study were deemed acceptable. Although the α values of the self-consciousness facet in this study were similar to those of the original version (Costa and McCrae, 1992a), the internal consistency of this facet should be reexamined in the future, and researchers should be cautious in interpreting it. In both the undergraduate sample and the clinical sample, all the mean inter-item correlations were above the lowest accepted level (Briggs and Venue, 1986), which also supports good internal consistency of the neuroticism subscale of the NEO-PI.

The test-retest reliability coefficient for the total neuroticism subscale (r = 0.71) met the minimum standard of 0.70 and supports the notion of neuroticism as a stable personality trait. These findings were in agreement with the original version of the NEO-PI (Anastasi and Urbina, 1997). The 2-month stability coefficients of the six facets ranged from 0.52 to 0.64 in this study, while a previous study found that the 6-month test-retest reliability of the neuroticism subscale of the NEO-PI ranged from 0.66 to 0.92 (Costa and McCrae, 1988). Our results were a little lower than those of the previous study, which might be due to natural fluctuations in the state of the six facets of neuroticism. Further studies are needed to examine the stability of the neuroticism subscale in both general and clinical populations, with larger samples and longer intervals.

The correlations among the six facets were all significantly positive in both groups (all p < 0.01), which indicates that the six facets are related to each other while remaining relatively independent. The six facets of the neuroticism subscale measure different aspects of neuroticism, yet all reflect the neuroticism trait. These results provide further support for the validity of the neuroticism subscale.

The scores on the total neuroticism subscale and on all six facets were significantly higher in the clinical sample than in the undergraduate sample. These results were consistent with previous findings (Kotov et al., 2010; Ormel et al., 2013). For example, Kotov et al. (2010) found that individuals diagnosed with mental disorders also scored high in neuroticism. Previous studies also indicate that neuroticism is correlated with a wide variety of mental disorders, and that individuals with high neuroticism tend to have more mental disorders (Kotov et al., 2010; Ormel et al., 2013). It has previously been stated that neuroticism is one of the most important risk factors in behavioral public health (Lahey, 2009), and that the economic costs of high neuroticism are estimated to exceed those of all common mental disorders combined (Cuijpers et al., 2010). Additionally, it has been shown that neuroticism could predict the onset of common mental disorders even after controlling for most psychiatric confounding variables (Ormel et al., 2013). All these findings suggest the importance of neuroticism in screening individuals at high risk of mental disorders, and in implementing early prevention programs for individuals with high levels of neuroticism.

The current study found that most of the facets, especially the anxiety facet, significantly predicted depression and anxiety symptoms in both the undergraduate sample and the clinical sample. Previous research has shown a strong link between personality and psychiatric illness, especially between neuroticism and depressive symptoms (Luciano, 2015). Individual differences in personality traits, particularly in neuroticism, are known risk factors in the onset and development of depression (Saklofske and Janzen, 1995; Kendler et al., 2004). Research in England indicated that neuroticism predisposes individuals to depression via the relationship between ruminative thinking and low mood (Thorsten and Tobias, 2010). Additionally, certain personality traits may play an important role in how individuals respond to life events. Individuals with high levels of neuroticism are particularly vulnerable to the effect of life events on anxiety (Veen et al., 2016). A study investigating the relationship between neuroticism and symptoms of anxiety and depression in three patient groups (generalized anxiety disorder; major depressive disorder; mixed anxiety-depressive disorder) revealed that neuroticism might increase the risk of anxious and depressive symptoms, as evidenced by increased worrying or brooding (Merino et al., 2016). In this study, three facets (anxiety, depression and vulnerability) of the neuroticism subscale significantly predicted the depression level, and three facets (anxiety, angry-hostility and vulnerability) significantly predicted the anxiety level in both samples. Therefore, our results, in conjunction with the prior research discussed above, suggest that neuroticism has a substantial impact on symptoms of depression and anxiety both in the general population and in those diagnosed with mental disorders.

It should be stressed that the validity of comparing groups depends on measurement invariance. The examination of measurement invariance is a necessary step prior to performing any comparisons across groups (Meredith, 1993; Byrne, 2008). The measurement invariance analysis across gender in the current study supported weak equivalence (equal factor loadings) and strong equivalence (equal factor loadings and intercepts), which is sufficient for meaningful comparisons between groups. Strict measurement invariance was not obtained in our results; however, as Baumgartner and Steenkamp (1998) suggested, strict measurement invariance is not required for substantive analyses. In general, the results of measurement invariance across gender in this study supported the validity of the significant gender differences in neuroticism scores, which was consistent with previous findings (Hirsh, 2011). Our results showed that women scored significantly higher than men on both total neuroticism and all six facets in the undergraduate sample, while in the clinical sample there were no significant gender differences in the total score of neuroticism or the scores of its six facets. In the current study, the high level of neuroticism and the relatively small size of the clinical sample may have contributed to the lack of gender differences.

Conclusion

The current study provides further evidence that the neuroticism subscale of the NEO-PI is a reliable and valid tool for assessing neuroticism in the Chinese population. Additionally, its measurement invariance across gender indicates that the gender differences in neuroticism found with the Chinese version of the neuroticism subscale of the NEO-PI are reliable and valid.

Our research was restricted to specific Chinese samples, and the generalizability of the results to other countries remains to be determined. Studies with longitudinal designs and larger clinical samples are warranted.

Author Contributions

JY and SY: conceived and designed the study; MZ, XL, YiL, YuL and XZ: collected the data; CX, MZ, XL, and YiL: analyzed the data; CX and JY: wrote the paper.

Funding

This work was supported by the National Natural Science Foundation [grant number: 81370034].

Conflict of Interest Statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Supplementary Material

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpsyg.2018.01454/full#supplementary-material

References

Anastasi, A., and Urbina, S. (1997). Psychological Testing. Upper Saddle River NJ: Prentice-Hall.


Banzhaf, A., Ritter, K., Merkl, A., Schulte-Herbrüggen, O., Lammers, C. H., and Roepke, S. (2012). Gender differences in a clinical sample of patients with borderline personality disorder. J. Pers. Disord. 26, 368–380. doi: 10.1521/pedi.2012.26.3.368


Baumgartner, H., and Steenkamp, J. B. (1998). Multi-group latent variable models varying numbers of items and factors with cross-national and longitudinal applications. Market Lett. 9, 21–35. doi: 10.1023/A:1007911903032