- 1 Medical Psychology and Medical Sociology, University Medical Center of the Johannes Gutenberg-University Mainz, Mainz, Germany
- 2 Department of Clinical Psychology, Psychotherapy and Psychoanalysis, Institute of Psychology, University of Klagenfurt, Klagenfurt am Wörthersee, Austria
- 3 Department of Psychosomatic Medicine and Psychotherapy, University Medical Center Mainz, Johannes Gutenberg University Mainz, Mainz, Germany
- 4 Integrated Research and Treatment Center Adiposity Diseases, Behavioral Medicine Unit, Department of Psychosomatic Medicine and Psychotherapy, Leipzig University Medical Center, Leipzig, Germany
Perceived stress is a construct of crucial importance to health and well-being, necessitating the provision of economic, psychometrically sound instruments to assess it in routine clinical practice and large-scale survey studies. Two competing short versions of the Perceived Stress Scale (PSS), each consisting of four items, have been proposed. In the present study, we compare the two in a sample representative of the German general population (n = 2,527). Our analyses show that both versions are sufficiently reliable and valid, given the right measurement model. Specifically, the original PSS-4 by Cohen et al. suffers from response style effects, which we remedied using random intercept factor analysis. With the addition of the method factor, it is a highly reliable and valid scale. The PSS-2&2 by Schäfer et al. is more complex in its interpretation since it is split into two facets which cannot be summarized into a single score. Specifically, the Helplessness subscale correlates with related constructs very similar to the original unifactorial model but its reliability is lackluster. In contrast, the Self-Efficacy subscale is reliable but diverges in terms of its correlational pattern. In sum, both versions can be recommended for research designs in need of a brief measure of stress and offer unique contributions.
1 Introduction
Acute and chronic psychological stress has long been identified as a crucial determinant of physical and psychological health as well as overall well-being (Cohen et al., 2007; Kivimäki and Steptoe, 2018; O’Connor et al., 2021; Wirtz and von Känel, 2017). The stress response is an evolutionarily adaptive mechanism that enables individuals to cope with acute threats, often referred to as the “fight-or-flight” response. When confronted with a stressor, the body activates the hypothalamic–pituitary–adrenal (HPA) axis and the sympathetic nervous system, leading to the release of stress hormones such as cortisol and adrenaline. These physiological changes increase heart rate, sharpen attention, and mobilize energy resources, which together enhance the organism’s ability to respond quickly and effectively to immediate challenges. However, when stress becomes chronic (through ongoing or repeated exposure to stressors, insufficient coping strategies, or a combination of the two), the body remains in a state of heightened arousal, which can have a range of negative physical and mental health consequences (Agorastos and Chrousos, 2022). Prolonged activation of the HPA axis entails high allostatic load, a cumulative “wear and tear” on the body that results in dysregulation of immune function, cardiovascular health, and metabolic processes. This chronic stress response has been linked to diverse adverse outcomes, including increased risks for cardiovascular disease, hypertension, and depression, as well as cognitive impairments due to neural damage in stress-sensitive areas like the hippocampus (McEwen, 2017). It can therefore be assumed that stress plays an important role in the incidence and chronicity of widespread diseases in the population.
Given this substantial impact of stress on health, its valid and reliable measurement is of great importance in many research questions across various settings, comprising clinical as well as population-level investigations (Crosswell and Lockwood, 2020; Epel et al., 2018; Giannakakis et al., 2019). Measuring stress accurately allows for targeted prevention and intervention efforts. The fact that the most influential psychological models define stress as the result of an individual’s appraisal of a situation supports the use of self-report measures which provide insights into the subjective experience: According to the transactional model of stress and coping, stress is not only a physiological response but also a subjective experience shaped by the individual appraisals of stressors. Lazarus and Folkman (1984) emphasized that stress arises when a person perceives a situation as threatening or demanding, exceeding their coping resources. This cognitive appraisal process affects how often and intensely the stress response is activated, directly influencing the above-described physiological outcomes and contributing to allostatic load over time.
The Perceived Stress Scale (PSS; Cohen et al., 1983; Cohen and Williamson, 1988) is one of the most widely applied measures of stress, and previous research has suggested two competing ultra-short versions. First, the original authors of the PSS (Cohen et al., 1983) suggested the configuration marked in Supplementary Table S1. This version, however, has repeatedly been shown to suffer from suboptimal factorial validity (Demkowicz et al., 2020; Ingram IV et al., 2016; Mondo et al., 2021). Second, Schäfer et al. (2023) recently proposed an alternative version that comprises two correlated factors, Helplessness and Self-Efficacy, captured using the items marked in Supplementary Table S1, but that does not yield a total score.
The present study aims to test both versions in a representative sample of the German population and to compare their psychometric merits. Specifically, we test whether the multi-dimensionality introduced by Schäfer et al. (2023) is necessary and reflects actual properties of the latent stress construct, or whether it is merely a method artefact caused by acquiescence, a response bias commonly encountered with reverse-coded items (Maydeu-Olivares and Coffman, 2006; Podsakoff et al., 2003).
2 Method
2.1 Participants and procedure
The present survey sample was collected in 2014 by a German market research agency (Unabhängiger Service für Umfragen, Methoden und Analysen, Berlin, Germany). The ethics committee of the University of Leipzig approved the present investigation (063-14-10032014). To obtain a representative survey, a random-route procedure was utilized: First, regions were identified based on electoral districts. Second, within these regions, households were randomly selected. Third, within the household, the respondent was determined based on the Kish selection grid. Out of the 4,607 households that were initially contacted, 55.1% gave their informed consent and participated in the survey. The remaining sample of 2,527 is described in Table 1.
2.2 Instruments
The Perceived Stress Scale (PSS-10, Cohen et al., 1983; Schäfer et al., 2023; Schneider et al., 2020) assesses an individual’s acute stress level using 10 items and a 5-point scale. It mainly focuses on the extent to which the respondent feels capable (or incapable) of handling their daily life and upcoming challenges. The German version has previously been investigated regarding its psychometric properties and shown mostly good internal consistency across samples (Reis et al., 2019).
The Personal Burnout Scale (PBS, Nübling et al., 2006; Pejtersen et al., 2010) of the Copenhagen Psychosocial Questionnaire (6 items, ω = 0.915 in this sample) was used to measure physical and mental exhaustion. Specifically, it uses a 6-point scale to inquire into the frequency of the following states: tired, physically exhausted, emotionally exhausted, unable to go on, weak and prone to illness.
The Questions on Life Satisfaction (FLZ-8; Henrich and Herschbach, 2000), specifically the General Life Satisfaction Module, is an 8-item instrument that quantifies a respondent’s life satisfaction. It does so using a 5-point response scale. In the current sample, reliability was estimated at ω = 0.816.
The Patient Health Questionnaire (PHQ-4; Kroenke et al., 2009; Löwe et al., 2010) is a brief measure of symptoms of depression (PHQ-2) and anxiety (Generalized Anxiety Disorder-2, GAD-2), consisting of two items each. Respondents indicate their agreement with the respective symptom descriptions on a 4-point scale. Reliability in the present sample was ω = 0.763 and 0.778, respectively.
2.3 Statistical methods
All analyses were carried out in R (version 4.4.1) using the packages ezCutoffs, lavaan, and semTools (Jorgensen et al., 2022; Rosseel, 2012; Schmalbach et al., 2019). For the Cohen-PSS, we first estimated a congeneric factor analysis model as well as a random intercept model (Maydeu-Olivares and Coffman, 2006). In the random intercept model, the method factor loadings of all items are constrained to equality, and the method factor and the content factor are constrained to be uncorrelated. For the Schäfer-PSS, we estimated a correlated-factors model. All models were estimated with the robust full-information maximum likelihood estimator (MLR), although the proportion of missing values was negligible (0.4%). We then inspected χ², the Comparative Fit Index (CFI), the Tucker-Lewis Index (TLI), the Root Mean Square Error of Approximation (RMSEA), and the Standardized Root Mean Square Residual (SRMR). Following the recommendations by Schermelleh-Engel et al. (2003), values for CFI/TLI should be greater than 0.95 (better yet 0.97), and values for RMSEA/SRMR should be smaller than 0.08/0.10 (better yet 0.05). To supplement these fixed cutoffs, we additionally calculated simulated cutoff values with the ezCutoffs package, based on 1,000 replications and an α of 0.05. We report McDonald’s ω as a measure of internal consistency (Dunn et al., 2014). To calculate convergent correlations, we included separate factors for all relevant scales in each of the three PSS models and estimated the interfactor correlations.
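For readers less familiar with random intercept factor analysis, the specification described above for the Cohen-PSS can be written out as follows. This is a sketch under assumptions rather than the exact parameterization used in the analysis: the symbols (x, μ, λ, η, γ, ε, ψ, θ) are notation introduced here, and the equality constraint on the method loadings is written as a fixed loading of 1 with a freely estimated method variance, which is one common way of setting up the model (Maydeu-Olivares and Coffman, 2006).

```latex
\begin{aligned}
x_{ij} &= \mu_j + \lambda_j \eta_i + \gamma_i + \varepsilon_{ij}, \qquad j = 1, \dots, 4, \\
\operatorname{Var}(\eta_i) &= 1, \qquad \operatorname{Var}(\gamma_i) = \psi_\gamma, \qquad \operatorname{Cov}(\eta_i, \gamma_i) = 0, \\
\Sigma &= \lambda \lambda^{\top} + \psi_\gamma \, \mathbf{1}\mathbf{1}^{\top} + \Theta, \qquad \Theta = \operatorname{diag}(\theta_1, \dots, \theta_4).
\end{aligned}
```

Here η denotes the content (stress) factor and γ a respondent-specific intercept that shifts all four item responses in the same direction, which is how acquiescence is captured. Setting ψ_γ = 0 recovers the congeneric model, so the two models differ by this single variance parameter.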
3 Results
3.1 Factorial validity
We present the model fit results from the various factor analyses in Table 2, along with the path diagrams for the final models in Figure 1. In summary, the congeneric model for the Cohen-PSS is unacceptable by all applied standards (except the SRMR fixed cutoff), although its reliability was acceptable at ω = 0.71. Including a method factor for acquiescence dramatically improved model fit – both in terms of traditional fixed and simulated cutoffs. Reliability of the content factor also improved, ω = 0.82.
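To make the reliability coefficient concrete, the following minimal sketch computes McDonald's ω for a one-factor model from standardized loadings. It assumes a unit-variance factor and uncorrelated residuals; the function names and the loading values are hypothetical illustrations, not estimates from this study, and the semTools computation used in the analysis may differ in detail.

```python
def standardized_residuals(loadings):
    """Residual variances implied by standardized loadings (1 - loading^2)."""
    return [1.0 - loading ** 2 for loading in loadings]


def mcdonald_omega(loadings, residual_variances):
    """McDonald's omega for a single factor with variance fixed to 1.

    omega = (sum of loadings)^2 / ((sum of loadings)^2 + sum of residual variances)
    """
    common = sum(loadings) ** 2
    return common / (common + sum(residual_variances))


# Hypothetical standardized loadings, chosen only for illustration.
four_item_loadings = [0.70, 0.70, 0.70, 0.70]
two_item_loadings = [0.60, 0.60]

omega_four = mcdonald_omega(four_item_loadings,
                            standardized_residuals(four_item_loadings))
omega_two = mcdonald_omega(two_item_loadings,
                           standardized_residuals(two_item_loadings))

print(f"four items: {omega_four:.2f}, two items: {omega_two:.2f}")
```

With these made-up values, the four-item scale reaches an ω of roughly 0.79 and the two-item scale roughly 0.53, which illustrates why very short subscales with modest loadings tend to yield low reliability estimates.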
Figure 1. Path diagrams of the final models of the Perceived Stress Scale 4. Models were standardized by setting the latent variable variances to 1 with the exception of the random intercept factor. *This standardized factor loading is equal to one because of the negative error variance of Item 5 which was then set to ≥0.
With regard to the Schäfer-PSS, the initial model already showed a very good fit. However, the error variance of Item 5 was negative (θ = −0.299, SE = 0.200, p = 0.134). We identified this as a Heywood case; since the estimate was not significantly smaller than 0, we constrained this specific variance to a positive value, which marginally worsened the fit but yielded a valid model. The resulting model fit was, however, still very good, both by conventional fixed standards and in comparison with simulated cutoffs. Reliability estimates were mixed, with ω = 0.79 for Self-Efficacy and ω = 0.51 for Helplessness. For exploratory purposes, we also tested the Schäfer-PSS in a one-factor random intercept configuration. However, this model exhibited unacceptable fit, χ²(2) = 192.84, p < 0.001, CFI = 0.873, TLI = 0.620, RMSEA = 0.199, SRMR = 0.138.
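Two of the numerical statements above can be retraced with a few lines of code. The sketch below (a) runs a two-sided Wald test on the reported error variance estimate, assuming a standard normal reference distribution, and (b) compares the reported fit indices of the one-factor random intercept model against the fixed "acceptable" cutoffs named in the Statistical methods section. The helper names `wald_test`, `check_fit`, and `ACCEPTABLE` are hypothetical, and the software used in the analysis may compute its p-value slightly differently.

```python
import math


def wald_test(estimate, standard_error):
    """Return (z, two-sided p) for H0: parameter = 0 under a normal reference."""
    z = estimate / standard_error
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)) for the standard normal CDF Phi.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


# Fixed cutoffs for acceptable fit as stated in the Statistical methods section.
ACCEPTABLE = {
    "CFI": lambda value: value > 0.95,
    "TLI": lambda value: value > 0.95,
    "RMSEA": lambda value: value < 0.08,
    "SRMR": lambda value: value < 0.10,
}


def check_fit(indices):
    """Map each fit index to True if it meets its fixed cutoff, else False."""
    return {name: ACCEPTABLE[name](value) for name, value in indices.items()}


# Values as reported in the text for Item 5 and the one-factor model.
z, p = wald_test(-0.299, 0.200)
one_factor_fit = check_fit(
    {"CFI": 0.873, "TLI": 0.620, "RMSEA": 0.199, "SRMR": 0.138}
)

print(f"z = {z:.3f}, p = {p:.3f}")
print(one_factor_fit)
```

Under the normal approximation, the p-value lands close to the reported p = 0.134, consistent with the estimate not differing significantly from zero, and none of the four indices of the one-factor model meets its cutoff.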
3.2 Convergent correlations
Regarding the direction and magnitude of the correlations (see Table 3), the expected pattern emerged for the congeneric model of the Cohen-PSS. That is, we found large positive correlations with the PBS, the PHQ, and the GAD, as well as a large negative correlation with the FLZ. After including the method factor for the negative items, the correlational pattern remained largely unchanged. In contrast, the two-factorial Schäfer-PSS showed a more complex pattern of associations: whereas the Self-Efficacy subscale correlated moderately positively with the FLZ and negatively with the symptom and burnout scales, the Helplessness subscale exhibited the same pattern of correlations as the original Cohen-PSS, only with somewhat lower magnitude.
4 Discussion
The present study sought to compare two competing ultra-short versions of the Perceived Stress Scale (PSS-4): the original version provided by Cohen et al. (1983) and the newly constructed version by Schäfer et al. (2023). Schäfer et al. (2023) had constructed their two-factorial version of the PSS to improve upon some perceived shortcomings of the original version. Our findings, obtained in a large, population-representative sample, indicate that neither of the two instruments can be recommended without reservation. Specifically, the original Cohen-PSS-4, when modeled as a congeneric model, has unacceptable model fit and moderate internal consistency, but very high convergent correlations. Upon introduction of a method factor for acquiescent response style (by means of a random intercept), reliability improved markedly and model fit became near-perfect, while the correlational pattern with convergent scales was retained. The Schäfer-PSS-4 posed a unique challenge because of a Heywood case (a negative measurement error variance). However, after remedying this issue, we found a model with very good fit, mixed reliability, and reasonable overlap in terms of its validity. That is, reliability was good for the Self-Efficacy subscale but not for the Helplessness subscale. The correlational pattern of the Helplessness subscale corresponded well to that of the original PSS-4, whereas the associations of the Self-Efficacy subscale were in the opposite direction (as expected) but of reduced magnitude. The Schäfer-PSS did not fit well with a one-factor random intercept model, providing evidence for its multi-dimensional structure.
In practical terms, the original Cohen version of the PSS-4 allows for a highly reliable and valid measurement of stress, but not if one relies on the observed score to conduct one’s research. As recent contributions in psychological methods research have pointed out, there can be a big difference between using observed and latent scale scores (McNeish and Wolf, 2020; Schmalbach et al., 2024). This needs to be kept in mind when utilizing the original PSS-4. This is further complicated by the presence of response biases such as acquiescence (Podsakoff et al., 2003). A common and effective remedy is the introduction of a method factor to account for these non-content-related portions of variance (Maydeu-Olivares and Coffman, 2006; Schmalbach et al., 2021). The Schäfer-PSS, or PSS-2&2, on the other hand, is clearly two-dimensional in terms of its content which may be of interest to researchers seeking to differentiate various facets of stress. Because of its uncomplicated design (not including any negative items) it does not suffer from the same method effect issues as the Cohen-PSS. This means that it is more readily interpretable in observed score form. However, it should be noted that it does not provide a total score (unlike the Cohen-PSS), but only facet scores. In addition, the low reliability of the Helplessness scale calls into question how accurate the measurement for this facet actually is. To be fair, it should also be mentioned that Schäfer et al. (2023) found an ω of 0.85 for the same scale in their sample. Thus, the scale may very well prove reliable enough in future studies. Finally, the divergence in terms of dimensionality between the PSS-4 and PSS-2&2 indicates that the exact dimensionality of stress and in particular the PSS might need more study.
5 Conclusion
Our analyses show that, overall, both ultrashort versions of the Perceived Stress Scale (PSS-4) can be considered reliable and valid – but not without reservation. Our findings emphasize the need to utilize appropriate measurement models for both the psychometric evaluation of an instrument as well as its application in subsequent research questions. The insights gained from this investigation not only contribute to the ongoing refinement of stress measurement instruments but also provide a critical foundation for future research aiming to enhance the precision of stress-related outcomes in diverse populations. As stress remains a pivotal determinant of health, advancing epidemiological as well as clinical research is contingent on the accuracy and reliability of the measurement of the constructs of interest.
Data availability statement
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.
Ethics statement
The studies involving humans were approved by the Ethics Committee at the Medical Faculty of the University of Leipzig. The studies were conducted in accordance with the local legislation and institutional requirements. The participants provided their written informed consent to participate in this study.
Author contributions
BS: Formal analysis, Methodology, Validation, Visualization, Writing – original draft, Writing – review & editing. ME: Validation, Writing – original draft, Writing – review & editing. EB: Conceptualization, Data curation, Supervision, Writing – original draft, Writing – review & editing. KP: Conceptualization, Supervision, Writing – original draft, Writing – review & editing.
Funding
The author(s) declare that no financial support was received for the research, authorship, and/or publication of this article.
Conflict of interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Publisher’s note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
Supplementary material
The Supplementary material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpsyg.2024.1479701/full#supplementary-material
References
Agorastos, A., and Chrousos, G. P. (2022). The neuroendocrinology of stress: the stress-related continuum of chronic disease development. Mol. Psychiatry 27, 502–513. doi: 10.1038/s41380-021-01224-9
Cohen, S., Janicki-Deverts, D., and Miller, G. E. (2007). Psychological stress and disease. JAMA 298, 1685–1687. doi: 10.1001/jama.298.14.1685
Cohen, S., Kamarck, T., and Mermelstein, R. (1983). A global measure of perceived stress. J. Health Soc. Behav. 24, 385–396. doi: 10.2307/2136404
Cohen, S., and Williamson, G. (1988). “Perceived stress in a probability sample of the United States” in The social psychology of health. eds. S. Spacapan and S. Oskamp (Washington, DC: Sage).
Crosswell, A. D., and Lockwood, K. G. (2020). Best practices for stress measurement: how to measure psychological stress in health research. Health Psychol. Open 7:2055102920933072. doi: 10.1177/2055102920933072
Demkowicz, O., Panayiotou, M., Ashworth, E., Humphrey, N., and Deighton, J. (2020). The factor structure of the 4-item perceived stress scale in English adolescents. Eur. J. Psychol. Assess. 36, 913–917. doi: 10.1027/1015-5759/a000562
Dunn, T. J., Baguley, T., and Brunsden, V. (2014). From alpha to omega: a practical solution to the pervasive problem of internal consistency estimation. Br. J. Psychol. 105, 399–412. doi: 10.1111/bjop.12046
Epel, E. S., Crosswell, A. D., Mayer, S. E., Prather, A. A., Slavich, G. M., Puterman, E., et al. (2018). More than a feeling: a unified view of stress measurement for population science. Front. Neuroendocrinol. 49, 146–169. doi: 10.1016/j.yfrne.2018.03.001
Giannakakis, G., Grigoriadis, D., Giannakaki, K., Simantiraki, O., Roniotis, A., and Tsiknakis, M. (2019). Review on psychological stress detection using biosignals. IEEE Trans. Affect. Comput. 13, 440–460. doi: 10.1109/TAFFC.2019.2927337
Henrich, G., and Herschbach, P. (2000). Questions on life satisfaction (FLZM): a short questionnaire for assessing subjective quality of life. Eur. J. Psychol. Assess. 16, 150–159. doi: 10.1027//1015-5759.16.3.150
Ingram, P. B. IV, Clarke, E., and Lichtenberg, J. W. (2016). Confirmatory factor analysis of the perceived stress Scale-4 in a community sample. Stress. Health 32, 173–176. doi: 10.1002/smi.2592
Jorgensen, T. D., Pornprasertmanit, S., Schoemann, A. M., and Rosseel, Y. (2022). semTools: useful tools for structural equation modeling. R package version 0.5-6. Available online at: https://CRAN.R-project.org/package=semTools
Kivimäki, M., and Steptoe, A. (2018). Effects of stress on the development and progression of cardiovascular disease. Nat. Rev. Cardiol. 15, 215–229. doi: 10.1038/nrcardio.2017.189
Kroenke, K., Spitzer, R. L., Williams, J. B., and Löwe, B. (2009). An ultra-brief screening scale for anxiety and depression: the PHQ–4. Psychosomatics 50, 613–621. doi: 10.1016/S0033-3182(09)70864-3
Löwe, B., Wahl, I., Rose, M., Spitzer, C., Glaesmer, H., Wingenfeld, K., et al. (2010). A 4-item measure of depression and anxiety: validation and standardization of the patient health Questionnaire-4 (PHQ-4) in the general population. J. Affect. Disord. 122, 86–95. doi: 10.1016/j.jad.2009.06.019
Maydeu-Olivares, A., and Coffman, D. L. (2006). Random intercept item factor analysis. Psychol. Methods 11, 344–362. doi: 10.1037/1082-989X.11.4.344
McEwen, B. S. (2017). Neurobiological and systemic effects of chronic stress. Chronic Stress 1:2470547017692328. doi: 10.1177/2470547017692328
McNeish, D., and Wolf, M. G. (2020). Thinking twice about sum scores. Behav. Res. Methods 52, 2287–2305. doi: 10.3758/s13428-020-01398-0
Mondo, M., Sechi, C., and Cabras, C. (2021). Psychometric evaluation of three versions of the Italian perceived stress scale. Curr. Psychol. 40, 1884–1892. doi: 10.1007/s12144-019-0132-8
Nübling, M., Stößel, U., Hasselhorn, H. M., Michaelis, M., and Hofmann, F. (2006). Measuring psychological stress and strain at work-evaluation of the COPSOQ questionnaire in Germany. GMS Psycho-Soc. Med. 3:Doc05
O’Connor, D. B., Thayer, J. F., and Vedhara, K. (2021). Stress and health: a review of psychobiological processes. Annu. Rev. Psychol. 72, 663–688. doi: 10.1146/annurev-psych-062520-122331
Pejtersen, J. H., Kristensen, T. S., Borg, V., and Bjorner, J. B. (2010). The second version of the Copenhagen psychosocial questionnaire. Scand. J. Public Health 38, 8–24. doi: 10.1177/1403494809349858
Podsakoff, P. M., MacKenzie, S. B., Lee, J. Y., and Podsakoff, N. P. (2003). Common method biases in behavioral research: a critical review of the literature and recommended remedies. J. Appl. Psychol. 88, 879–903. doi: 10.1037/0021-9010.88.5.879
Reis, D., Lehr, D., Heber, E., and Ebert, D. D. (2019). The German version of the perceived stress scale (PSS-10): evaluation of dimensionality, validity, and measurement invariance with exploratory and confirmatory bifactor modeling. Assessment 26, 1246–1259. doi: 10.1177/1073191117715731
Rosseel, Y. (2012). Lavaan: an R package for structural equation modeling. J. Stat. Softw. 48, 1–36. doi: 10.18637/jss.v048.i02
Schäfer, S. K., von Boros, L., Göritz, A. S., Baumann, S., Wessa, M., Tüscher, O., et al. (2023). The perceived stress scale 2&2: a two-factorial German short version of the perceived stress scale. Front. Psych. 14:1195986. doi: 10.3389/fpsyt.2023.1195986
Schermelleh-Engel, K., Moosbrugger, H., and Müller, H. (2003). Evaluating the fit of structural equation models: Tests of significance and descriptive goodness-of-fit measures. Methods Psychol. Res. Online. 8, 23–74.
Schmalbach, B., Irmer, J., and Schultze, M. (2019). ezCutoffs: fit measure cutoffs in SEM. R package version 1.0.1. Available online at: https://CRAN.R-project.org/package=ezCutoffs
Schmalbach, B., Schmalbach, I., and Hardt, J. (2024). What you see is not what you get: Observed scale score comparisons misestimate true group differences. [Manuscript submitted for publication].
Schmalbach, B., Zenger, M., Michaelides, M. P., Schermelleh-Engel, K., Hinz, A., Körner, A., et al. (2021). From bi-dimensionality to Uni-dimensionality in self-report questionnaires. Eur. J. Psychol. Assess. 37, 135–148. doi: 10.1027/1015-5759/a000583
Schneider, E. E., Schönfelder, S., Domke-Wolf, M., and Wessa, M. (2020). Measuring stress in clinical and nonclinical subjects using a German adaptation of the perceived stress scale. Int. J. Clin. Health Psychol. 20, 173–181. doi: 10.1016/j.ijchp.2020.03.004
Keywords: stress, measurement instrument, psychometric analysis, screening instrument, factor analysis
Citation: Schmalbach B, Ernst M, Brähler E and Petrowski K (2025) Psychometric comparison of two short versions of the Perceived Stress Scale (PSS-4) in a representative sample of the German population. Front. Psychol. 15:1479701. doi: 10.3389/fpsyg.2024.1479701
Edited by:
Hamdollah Ravand, Vali-E-Asr University of Rafsanjan, Iran
Reviewed by:
Abdul Aziz Harith, University of Otago, New Zealand
Paweł Larionow, Kazimierz Wielki University, Poland
Copyright © 2025 Schmalbach, Ernst, Brähler and Petrowski. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Bjarne Schmalbach, schmalbb@uni-mainz.de