Glucocorticoids in relation to behavior, morphology, and physiology as proxy indicators for the assessment of animal welfare. A systematic mapping review

Tiemann, Inga; Fijn, Lisa B.; Bagaria, Marc; Langen, Esther M. A.; van der Staay, F. Josef; Arndt, Saskia S.; Leenaars, Cathalijn; Goerlich, Vivian C.

doi:10.3389/fvets.2022.954607

SYSTEMATIC REVIEW article

Front. Vet. Sci., 06 January 2023

Sec. Animal Behavior and Welfare

Volume 9 - 2022 | https://doi.org/10.3389/fvets.2022.954607

This article is part of the Research TopicTowards a New 3Rs Era in Experimental ResearchView all 37 articles

Glucocorticoids in relation to behavior, morphology, and physiology as proxy indicators for the assessment of animal welfare. A systematic mapping review

Inga Tiemann¹^*

Lisa B. Fijn²

Marc Bagaria²

Esther M. A. Langen²

F. Josef van der Staay³

Saskia S. Arndt²

Cathalijn Leenaars⁴

Vivian C. Goerlich²

¹Faculty of Agriculture, Institute of Agricultural Engineering, University of Bonn, Bonn, Germany
²Division of Animals in Science and Society, Department of Population Health Sciences, Faculty of Veterinary Medicine, Utrecht University, Utrecht, Netherlands
³Division of Farm Animal Health, Behaviour and Welfare Group, Department of Population Health Sciences, Faculty of Veterinary Medicine, Utrecht University, Utrecht, Netherlands
⁴Institute for Laboratory Animal Science, Hannover Medical School, Hanover, Germany

Translating theoretical concepts of animal welfare into quantitative assessment protocols is an ongoing challenge. Glucocorticoids (GCs) are frequently used as physiological measure in welfare assessment. The interpretation of levels of GCs and especially their relation to welfare, however, is not as straightforward, questioning the informative power of GCs. The aim of this systematic mapping review was therefore to provide an overview of the relevant literature to identify global patterns in studies using GCs as proxy for the assessment of welfare of vertebrate species. Following a systematic protocol and a-priory inclusion criteria, 509 studies with 517 experiments were selected for data extraction. The outcome of the experiments was categorized based on whether the intervention significantly affected levels of GCs, and whether these effects were accompanied by changes in behavior, morphology and physiology. Additional information, such as animal species, type of intervention, experimental set up and sample type used for GC determination was extracted, as well. Given the broad scope and large variation in included experiments, meta-analyses were not performed, but outcomes are presented to encourage further, in-depth analyses of the data set. The interventions did not consistently lead to changes in GCs with respect to the original authors hypothesis. Changes in GCs were not consistently paralleled by changes in additional assessment parameter on behavior, morphology and physiology. The minority of experiment quantified GCs in less invasive sample matrices compared to blood. Interventions showed a large variability, and species such as fish were underrepresented, especially in the assessment of behavior. The inconclusive effects on GCs and additional assessment parameter urges for further validation of techniques and welfare proxies. Several conceptual and technical challenges need to be met to create standardized and robust welfare assessment protocols and to determine the role of GCs herein.

1. Introduction

Safeguarding and improving the welfare of animals under human care, irrespective of species and context, is a goal recognized by science and society. Many theoretical frameworks have been put forward to conceptualize what “good” welfare is and how animal welfare could be quantified. These concepts lay the basis for practical recommendation for assessing animal welfare and measures for improving animal welfare [e.g., (1)]. The concept of the “Five Freedoms,” proposed by the Brambell committee in 1965 (2), is the earliest and most influential approach to defining the basic aspects of husbandry necessary to safeguard welfare especially of farmed animals. Meanwhile, next to the mere absence of negative states, the importance of the inclusion of positive states in welfare has been emphasized (3). Current concepts incorporate several domains, such as behavior, naturalness, health, and physiology [e.g., the “Five Domains” concept; (4, 5)]. In this review, we use the following conceptual approach to animal welfare: animal welfare is a dynamic process, not a momentary snapshot, to which both positive and negative states contribute (3, 6). The ability to cope and adapt to environmental stimuli and stressors, delimited by the animal's adaptive capacity, is the basis for the animal to “[…] reach a state that it perceives as positive […]” (6). The mental and emotional state of an animal, which is accompanied by correlated physiological patterns, therefore forms a crucial part of welfare (7–9). Establishing the potential relation between these aspects, however, needs further research.

While many different concepts contribute valuable insight into animal welfare, the assessment of welfare is an ongoing challenge. The identification of measurements concerning health, behavior, and/or physiology to derive readout parameters indicative of a positive, or negative, welfare state is a much-debated goal in animal welfare research (10). Ultimately, if one or a few well-validated parameters would correlate highly with other parameters that are considered valid proxies/biomarkers of animal welfare, these could serve as index of welfare [iceberg indicators, e.g., (11)]. In this case, one could dispense with a multitude of measurements in animal welfare research and focus on few key indicators.

Despite critical evaluations of the usefulness of GC values as a proxy indicator for stress (12) and welfare states (13–17), the assumption of animals exhibiting high levels of GCs, and therefore experiencing a diminished welfare, remains widespread (18). Given that welfare is a multidimensional concept, it should therefore be assessed using a combination of behavioral, morphological, and physiological indicators (14). A broad range of additional parameters has been investigated as potential proxy indicators of animal welfare. Some may be measurable directly and quantitatively, e.g., the presence of wounds or infections, others can only be inferred, such as subjective mental states and cognitive bias (19).

In search for key indicators of animal welfare, the measurement of a physiological parameter may imply objectiveness and straightforward interpretation. Glucocorticoids (GCs), in particular the steroid hormones cortisol and corticosterone, have gained much popularity in research on welfare of vertebrate species. External and internal stimuli and stressors may affect an individual's welfare, and an individual's welfare state may affect its ability to cope with these. GCs mediate the endocrine stress response, which is orchestrated by the hypothalamus–pituitary–adrenal (HPA) axis in mammals, birds and reptiles, and the hypothalamic–pituitary–interrenal (HPI) axis in fish and amphibians (20). Notably, next to being a key player in the endocrine response to stressors, GCs induce a manifold of behavioral and physiological processes to promote restoring homeostasis and survival (21). Given the pleiotropic actions of GCs, the interpretation of GC levels and release patterns proves complex (12, 21–23). GCs may rise not only in response to a stressor with potentially negative consequences, but also in response to stimuli such as environmental enrichment or sexual encounters (13, 24–26). Notably, the interpretation of the valence of the stimulus, whether it is perceived as positive or as negative and potentially threatful, may depend on the individual's personality and cognitive traits (18, 27, 28). Adding onto the biological complexity of interpreting GC levels, are the variations in techniques to sample and determine GC levels in various tissues. To make robust assumptions on the relation between welfare and GCs, methodological limitations need to be identified and overcome.

Welfare is dynamic and describes the individual's coping with stimuli and stressors (14). In combination, GC levels may add information on the activation of the HPA axis and arousal, aiding the interpretation of an animal's response to an intervention aimed at affecting welfare. The usefulness of GCs in assessing an animal's welfare may very well-depend on the time frame within which GC levels are monitored. GCs are time-sensitive in their excretion after a triggering event, but also with regard to biological rhythms (29). GCs excretion follows ultradian, circadian and seasonal rhythms, leading to measurable variation in levels under undisturbed circumstances (14, 30, 31). Given the circadian rhythm, an animal should ideally be monitored for at least 24 h to infer information on deviation in GC release (32). Moreover, the genomic actions of GCs need several hours to come into action (21). Finally, considering the central role of the animal's ability to cope and adapt in welfare concepts, monitoring should ideally last longer than 24 h.

Given the popularity, and criticism, of GC measurements in welfare assessment, we performed a mapping review of the research field to identify general trends and provide a basis for future, in-depth analyses. A mapping review “is a high-level review with a broad research question and presents the global results” (33). It follows a systematic search of literature and data extraction, but does not provide detailed information on a meta-analysis level. Rather, the results aid the identification of trends and gaps in knowledge and suggests avenues for future studies.

Based on strict a-priori criteria, we examined experimental studies which tested an intervention aimed at affecting the welfare of a target population. We were interested in the consistency of patterns of GCs, and welfare read out parameters in the domains behavior, morphology, and physiology, in welfare assessment studies. Following our conceptual approach to welfare being a multidimensional construct, we selected studies which measured GCs and concurrently parameters related to behavior, morphology, and physiology to evaluate the effects of interventions (15). Moreover, as we see welfare as being dynamic and comprising more than the peak GC response to a stressor [which typically lasts < 24 h (30, 34), but may exert longer impact on the animal (32)], we restricted our search to studies following animals longer than 24 h after an acute intervention or studies that investigated long-term interventions.

From the included studies, we extracted data on the effects of the intervention on GCs and behavior, morphology, and physiology. We then categorized the impact of the welfare intervention on the outcomes within the four domains into “no effect,” “homogenous effect,” and “heterogenous effect” (Table 1) to provide an overview of the outcome of interest in relation the study characteristics. These within-domain categorizations are presented next to each other to reflect patterns across readout parameters.

TABLE 1

Table 1. Conceptual approach to animal welfare underlying the selection of studies included in this mapping review and categorization of parameter outcomes within the four domains GCs, behavior, morphology or physiology (B, M, P).

We expected GC levels to change in response to welfare interventions in congruence with changes in readout parameters related to behavior, morphology, and physiology (e.g., in both domains “behavior and “GCs” a homogenous effect). Moreover, we expected GC outcomes to follow the author hypothesis. To provide an overview of the experiments, we report general aspects of the experimental studies, such as animal species, experimental design, and sample matrix for GC determination, as well.

With more than 500 records included, we present one of the largest mapping reviews focusing on GC measurements in the field of animal welfare assessment. Our results provide an overview of the field of research on animal welfare, describing the available evidence on the relation between GCs, additional readout parameters and welfare of vertebrate species.

2. Material and methods

2.1. Literature search strategy

The research question leading to this mapping review was: Do GCs change consistently in response to a welfare intervention? For the comprehensive search strategy, the question was rephrased according to the PICO-format (35) to: do GCs (Outcome) consistently change due to interventions potentially affecting welfare (Intervention) compared to within- or between-subject control conditions (Condition) in non-human vertebrate animals (Population)?

Therefore, this mapping review is based on a comprehensive search strategy, using multiple databases: PubMed, Embase and Web of Science. The searches for each database consisted of three search components [SCs, (36)]: SC1 welfare; SC2 glucocorticoids; SC3 all non-human vertebrates. The detailed search strings covered synonyms, alternative spellings and related terms, such as wellbeing for welfare.

For PubMed and Embase searches, both thesaurus-terms (MeSH for PubMed and Emtree for Embase) and title/abstract/keywords terms were included (Web of Science does not use an internal thesaurus). SC1 (animal welfare) covered the terms animal welfare and animal wellbeing, SC2 (GCs) was adapted from Leenaars et al. (33), SC3 (animals) from the Syrcle animal filter (37), with invertebrates removed, and some terms added (title-abstract terms for additional avian species, e.g., “turkey”). The complete search strings can be found in the Supplementary material 1. All searches were performed in June 2020.

2.2. Study selection

Search results were imported into the reference manager Zotero, where duplicate and triplicate records were removed. All remaining records were imported into Rayyan QCRI (https://www.rayyan.ai/) for screening (38, 39).

In the following sections we refer to “records” as the reference to a study or book, etc., “report” and “study” as the published research paper, and “experiment” to the independent interventions investigated within a study.

The reviewer team (five people) were first trained to apply the inclusion and exclusion criteria consistently, using a set of 10 randomly chosen studies from the retrieved records. The retrieved reports were then screened for relevance in two steps: first title and abstract, then full text. Reports were sorted on title and allocated randomly to the reviewers. Screening of each report took place by at least two independent reviewers. Discrepancies were solved by discussion among the team based on the conceptual approach detailed in Table 1.

2.3. Inclusion and exclusion criteria

We included original research publications reporting the assessment of GCs and welfare in non-human vertebrates. We used the following exclusion criteria in title-abstract screening: (1) invertebrate or human study population, (2) no glucocorticoids measured in the study, (3) wrong publication type (no primary data, reviews, theses, conference abstracts), (4) animal welfare not primary focus. During full text screening we used the same exclusion criteria, in addition: (5) lack of an appropriate control for the intervention, and (6) non-English publication. We did not apply any publication date restriction.

The exclusion criterion (4) “animal welfare not primary focus” was operationalized as follows: studies needed to examine parameters of at least one additional domain (behavior, morphology, physiology) next to GCs, as welfare is more than purely the peak GC response to a stressor. As welfare is dynamic and should be monitored over a longer timeframe, we included only studies that followed the animals at least 24 h post intervention. These studies either investigated the long-term response to a short-term intervention, or the effects of a long-term intervention (chronic and/or repeated). Of these studies, all data covering GC, behavioral, morphological, and physiological measurements, thus also those collected within the first 24 h, were taken into consideration. Studies investigating welfare-related parameters over a period of < 24 h after the intervention were excluded.

The “appropriate control” could either be a pre-intervention baseline in within-subject experimental designs, or a separate group of animals not exposed to the intervention. Studies comparing groups without a clear control group (e.g., studies comparing different housing densities or diets where none was explicitly designated as control by the authors) were excluded.

2.4. Data extraction and analyses

From the final set of 509 included studies, we extracted the domains intervention, animals, sampling, and outcome. In total, 17 different categories of information were extracted per experiment included, such as numeric data (number of animals) and descriptive data [species, category, strain/breed (free text), and sex] for the domain “animals.” Other information categories covered descriptive data on the authors hypothesis, the type of welfare intervention, experimental design, sampling regime, and sample type used for determination of GCs (see Supplementary material 2). A priori, we agreed on a list of potential interventions, based on our experience with the field of research. If, during extraction, an intervention did not fit into one of these categories, it was labeled as “other.” If an intervention comprised several categories, it was labeled as “combination” and further specified in a free text column. The authors original hypothesis on the effect of the experimental intervention was classified as welfare enhancing, diminishing, or no hypothesis, based on agreement of at least three of the five reviewers.

Similarly, the impact of the intervention was scored for each readout parameter within the domains GCs, behavior, morphology, physiology (GCs, B, M, P). Based on agreement of at least three of the five reviewers, the outcome was scored as no effect or significant effect compared to baseline/control. Effects were scored as significant based on analytical statistics reported per experiment. The outcome within each domain was scored as homogenous if there was a consistent change in readout parameters compared to baseline/control, i.e., a distinct effect of the intervention on all readout parameters included in the statistical analyses. The outcome within each domain was scored as heterogenous if readout parameters showed inconclusive or inconsistent effects compared to baseline/control (see Table 1). We did not differentiate between an increase or decrease of single parameters, as we were interested in detecting changes in readout parameters caused by the intervention rather than judging on whether an increase or decrease of a readout parameter indicates an increase or decrease in welfare.

To avoid vote counting (40), statistical comparison of outcome categories within parameter domains (GCs, B, M, P) were not performed, but descriptive statistics are reported. The quantitative, but from a statistical point of view not analytical, description of the extracted information of the set of included studies was first organized in Excel (Microsoft Corporation, Redmond, WA, USA). Next, the percentage distribution was calculated for each result category and presented graphically (using SigmaPlot 14, Systat Software, San Jose, CA) or in tabular form. In general, n represents the number of studies and k the number of experiments included in the analyses.

We used VOSviewer for the exploratory visualization of the terms in the title and the abstracts (version 1.6.16; https://www.vosviewer.com/). The result are maps based on the co-occurrence of terms used in the titles and abstracts of the studies included. To simplify the set of terms, VOSviewer calculates clusters, indicated by distinct groups in the same color (41). The size of the circles and of the labels reflects the frequency of occurrence. Labels with low frequencies might not be shown to avoid overlapping. The relatedness is reflected as distance between two items and based on the VOS mapping technique using a similarity matrix. The total link strength indicates the sum of link strengths as a weight attribute, where the link strength is defined as number of links of an item with another item.

3. Results

3.1. Reference flow

Our literature searches provided 717 results from PubMed, 1,092 from Embase, and 1,948 from Web of Science. After duplicate removal, 2,428 records remained for title-abstract screening. After applying the exclusion criteria during title-abstract and subsequent full text screening, n = 509 studies were included in this mapping review [Figure 1, (42); see Supplementary material 3]. Four of these 509 studies, reported two independent experimental interventions, and one reported five independent experimental interventions, resulting in a total of k = 517 experiments for data extraction.

FIGURE 1

Figure 1. Modified PRISMA 2020 flow diagram of the mapping review detailing the database searches, the number of records screened for title and abstracts (TiAb) and full-text, and the final number of studies (n) and experiments [k] included for data extraction. Figure adapted from Page et al. (42).

3.2. Network visualization

A link cloud was calculated based on the terms used in the abstracts of all included records (setting applied: min. occurrence of words: 7; resulting in items: 488; cluster: 9; links: 14,714; total link strength: 109,316; Figure 2). The main clusters revealed the different animal species investigated in the studies: pigs, cows, small rodents, and birds. The clusters also provide a first insight into the research topics predominantly investigated in these species. For pigs, the housing system as well as management including reproduction were frequently occurring themes. Comparable topics were relevant for cows, especially calves, although in this genus, the major focus was on castration. For mice and rats, research focused on the impact of housing and cage design on welfare. For birds, mainly chickens, the stocking density and husbandry system and their impact on welfare was commonly investigated. Research focused on zoo animals and their interaction with visitors is indicated by “penguin” and “exhibit.”

FIGURE 2

Figure 2. VOSviewer visualization of the abstract co-occurrence network based on the 509 studies included in this mapping review. Clusters are indicated by the same color. The frequency of occurrence of terms is reflected by the size of the circles. The distance between the circles reflects their relatedness based on the co-occurrence of the terms, which means the shorter the distance, the stronger the relatedness.

3.3. Effects on glucocorticoids

We scored whether the experimental intervention led to a statistically significant effect on GCs which could either be an increase or decrease of GC levels compared to control or pre-intervention baseline (homogenous effect) or a heterogenous effect (the intervention not leading to a clear and distinct change in GC levels, Table 1). Of the analyzed experiments, 39.26% (k = 203) found no significant effect of the intervention on GC levels, 38.10% (k = 197) found a homogenous effect and 22.63% (k = 117) of the experiments reported a heterogenous effect. Note that outcomes in which the GC response showed the typical course of an initial increase and following decrease were scored as homogenous.

3.4. Additional assessment parameters

In line with our conceptual approach to welfare, additional parameters from all three domains behavior, morphology, and physiology were assessed most frequently amongst the experiments (32.30%, k = 167). The combination of behavior and physiology (22.82%, k = 118) was most common in experiments assessing two additional parameters, while behavior was the most occurring single domain (13.93%, k = 72).

3.5. Outcomes related to study hypothesis

We were interested in the distribution of interventions and whether the experimental outcome aligned with the a-priori research hypotheses stated by the respective study authors (Table 2). The hypotheses were classified based on the expected direction of impact of the experimental intervention on animal welfare as stated by the authors of the studies (increasing or diminishing welfare compared to control/baseline).

TABLE 2

Table 2. Categorization of the proportion and total numbers of experiments (517 experiments in 509 studies) of the research hypothesis (A, B, C) stated by the respective authors of the reviewed studies reporting an effect on GC levels and/or the three additional assessment parameter, behavior, morphology and physiology.

Most of the experiments investigated a putative welfare improving intervention, followed by a putative welfare diminishing interventions. The smallest subset of experiments did not formulate a a-priory hypothesis.

If the intervention was expected to improve welfare, most of the experiments found no effect on GC levels, followed by homogenous changes in GC levels. If the intervention was expected to diminish welfare, most of the experiments found a homogenous effect on GC levels, followed by no effect. If authors did not formulate a directional hypothesis, most of the experiments found no effect on GC levels, followed by heterogenous effects. Outcomes in the domain behavior were most often heterogenous (mean 47.18%), irrespective of the hypotheses. Morphological parameters were most frequently not measured (mean 41.87%), or were found not affected by the intervention (mean 24.12%). Effects on physiology were most often heterogenous (mean 37.41%), again, irrespective of the original research hypotheses.

3.6. Glucocorticoid sampling

The proportion of homogenous, heterogeneous or no effects on GCs, depending on the number of sampling timepoints is depicted in Figure 3. GCs were measured four and more times (43.13%, k = 223), followed by experiments measuring GC two (16.83%, k = 87), and three times (16.25%, k = 84). GCs were measured only once in roughly one quarter of the experiments (23.79%, k = 123).

FIGURE 3

Figure 3. Proportion and total numbers of experiments (517 experiments in 509 studies) finding a homogenous (consistent change in readout parameter compared to baseline/control), heterogenous (inconsistent change in readout parameter compared to baseline/control) or no effect of the intervention on GC levels among the different experimental designs with regard to the number of timepoints at which GC was measured during the experiment. Results are presented in proportions [(%), left vertical axis] and total number of experiments [(k), right vertical axis].

With an increasing number of sampling timepoints throughout the experimental period, the proportion of experiments finding no effect on GC levels declined, similarly to experiments finding a homogenous effect. However, the percentage of experiments finding heterogenous effects on GC levels rose with number of GC measurements.

Samples for GC determination were most frequently collected under undisturbed circumstances (aiming at determining baseline concentrations, 78.92%, k = 408). In only 14.70% of the experiments (k = 76), GCs were measured after applying a stressor or challenge (e.g., ACTH), or a combination of both (6.38%, k = 33). If a stressor was applied, 32.89% (k = 25) of the studies found a homogenous effect on GCs, whereas 46.05% (k = 35) did not find an effect. In case there was no stressor applied, 40.44% of the experiments (k = 165) found a homogenous effect on GC, whereas 38.48% (k = 157) did not find an effect.

Regarding the sample matrix and prominent GC metabolite, GC levels were predominantly determined from blood samples, in which mainly cortisol was quantified (Table 3). Less-invasive sampling techniques, such as sampling feces or saliva, ranked second and third, respectively. Table 3 also shows the outcome categories in relation to the sample matrices. Of the experiments that sampled blood and saliva, 40.20% (k = 123) and 45.28% (k = 24), respectively, found homogenous effects, while of the experiments that quantified GCs in feces or urine, 58.33% (k = 49) and 42.86% (k = 9) found no effect.

TABLE 3

Table 3. Ranking of the proportion and total numbers of experiments (517 experiments in 509 studies) of biological sample types used for determination of GC levels, the corresponding most commonly determined GC metabolite and the outcome of the GC assessment.

3.7. Experimental design

Most experiments, compared treatment to control groups (56.87%, k = 294). Of these experiments, 37.07% (k = 109) found a homogenous effect on GC levels, whereas 40.48% (k = 119) found no effect. A within-subject design was used by 16.44% (k = 85) of the experiments, and 49.25% (k = 33) found a homogenous effect, whereas 32.84% (k = 22) did not find an effect. The combination of between and within subject design was used by 26.69% (k = 138) experiments, of which 3.48% (k = 18) used a cross-over design.

Duration of the intervention was categorized as days to weeks for almost two thirds of the experiments (58.61%, k = 303), followed by months to years (16.83%, k = 87). A proportion of 11.61% (k = 60) of the experiments investigated acute treatments (lasting minutes to hours), 5.03% (k = 26) investigated permanent treatments (from birth till sampling/euthanasia), followed by experimental treatments that were repeatedly continuing (days to weeks; 4.06%, k = 21), repeatedly acute (minutes to hours; 3.29%, k = 17) or repeatedly long-term (months to years; 0.58%, k = 3). Within the category of acute interventions, 46.47% (k = 28) of experiments found homogenous effects on GC responses, in 26.67% (k = 16) no effect was reported. Of the experiments investigating interventions that lasted from days to week, 37.95% (k = 115) found homogenous effects on GC, and 37.62% (k = 114) reported no effects on GC levels. Homogenous effects on GC levels were found in 39.08% (k = 34) of the experiments that investigated long-term interventions, whereas 41.38% (k = 36) found no effect.

Sample size of the experimental animals used in the reviewed experiments ranged from 1 (orangutan) to 650 (tilapia). In all three GC outcome categories, the median sample size ranged from 32 to 40 individual animals (no effect, homogenous, heterogenous, Figure 4). We further summarized the sample size per class of animals in Table 4.

FIGURE 4

Figure 4. Sample size (N) of animals sampled for GCs in studies finding a homogenous (consistent change in readout parameter compared to baseline/control), heterogenous (inconsistent change in readout parameter compared to baseline/control) or no effect of the intervention on GC levels. The box plots represent the median (and 10th, 25th, 75th and 90th percentile) as well as outliers (outside percentiles).

TABLE 4

Table 4. Categorization of the proportion and total numbers of experiments (517 experiments in 509 studies) of animal class, occurrence, primary GC sample type, primary additional assessment parameters (B, Behavior; M, Morphology; P, Physiology), primary investigated sex, and average sample size tested for GC levels (N).

3.8. Study population

The reviewed experiments comprised a wide variety of animal species, ranging from zebrafish to elephants. Most of the experiments investigated the welfare of pigs, cows, chicken and sheep (i.e., farm animals, 51.06%, k = 264), or mice and rats (i.e., laboratory animals, 14.70%, k = 76, Figure 5).

FIGURE 5

Figure 5. Proportion [%] of experiments and the types of animals investigated. Stacked bars refer to the proportion of experiments reporting on homogenous (consistent change in readout parameter compared to baseline/control), heterogenous (inconsistent change in readout parameter compared to baseline/control) or no effect of an intervention on GC levels found among the species investigated. The figure shows the types of animals with a proportion above 1% of the complete set of experiments.

We combined the information given for the animal class, the prevalence among experiments, the welfare parameters assessed as well as sex of the animals investigated and sample size of the experiments. To identify over- or under representation of animal classes in welfare research, we combined species to classes to assess their proportion in the reviewed experiments. As welfare assessment may differ between species and classes, we also related the additional assessment parameters to the respective class (Table 4). Three quarters of the selected studies investigated mammals. Blood as GC sampling matrix was most commonly used across animal classes, as also was the combinatory assessment of behavioral, morphological, and physiological parameters. Females were the most often investigated sex in mammals and avian species, whereas in fish studies, sexes were most often not reported. The mean sample size ranged from 16 individuals in reptiles to 84 in fish, with a minimum of 1 in mammals and 650 in avian species.

3.9. Interventions

Among the interventions aimed at affecting welfare, most of the experiments used a combination of interventions, followed by structural enrichments, other treatments (e.g., milking procedures in dairy cows being the most commonly applied single treatment), combinations of enrichments, and nutrition/nutritional supplementation (Figure 6).

FIGURE 6

Figure 6. Proportion [%] of experiments and the type of interventions investigated. Stacked bars refer to the proportion of experiments reporting on homogenous (consistent change in readout parameter compared to baseline/control), heterogenous (inconsistent change in readout parameter compared to baseline/control) or no effect of the intervention on GC levels. The figure shows the types of interventions with a proportion above 1% of the complete set of experiments.

4. Discussion

The question leading to this mapping review was whether GCs change consistently in response to a welfare intervention, and whether changes were also reflected in measures of behavior, morphology and physiology. With help of a systematic mapping review of the relevant literature, we examined 509 studies including 517 independent experiments. The included experiments used measurements of GCs in combination with measurements of behavior, morphology, and physiology, to assess the welfare of non-human vertebrates in response to an intervention expected to affect animal welfare. We were specifically interested in the added value of GCs as a proxy indicator in animal welfare studies. While this mapping review started with a comprehensive search and screening, the search and screening were restricted to studies with a focus on welfare. Relevant information from publications with another focus was thus not included, as we were specifically interested in the added value of GCs as a proxy indicator in animal welfare studies.

For a mapping review, the goal is not to perform in-depth (meta) analyses, but rather to describe the available literature, in this case on the use of GCs in the assessment of animal welfare. We present an overview on aspects of experimental set-ups and sampling regimes, the sample types analyzed for quantification of GCs, and the species and sex of the target animals. To identify global patterns in the outcome of GC monitoring as proxy for animal welfare, the interventions and original authors' hypotheses were categorized according to the effects of the interventions on GC levels. Finally, we were interested in whether a significant change in GCs was accompanied by changes in behavioral, morphologic, and physiological parameters.

4.1. Network visualization

The network visualization reveals the diversity of welfare-related topics in which GCs are assessed. The clusters formed in the network analyses match current societal and scientific welfare concerns across animals in different contexts such as farm, laboratory and zoological exhibitions (43, 44). Examples are husbandry (light and density), procedures of commercial livestock management (castration), health issues of farm animals (lameness), and the aim to assess emotional states (fear). The selection criteria thus resulted in a representative coverage of the relevant literature.

4.2. Effects on glucocorticoids

The results suggest that the prevalence of effects on GC driven by welfare-related interventions is ambiguous. To make informed conclusions, and avoid vote counting, however, the experimental outcomes need to be analyzed on a more detailed level, ideally using meta-analysis approach. Almost two thirds of the reviewed experiments found an effect of the experimental intervention on GC levels, even if heterogenous effects, thus not all measurements showing significant effects, or inconsistent effects. The probability for an intervention to result in heterogenous effects within a domain may rise with the number of readout parameters. For GCs, however, the spread across the outcome categories was similar across number of sampling moments. Note that to statistically confirm these patterns, further detailed analyses are necessary.

The large number of records and variation in experimental set-ups and analyses prevented a clear distinction of outcomes into increase or decrease of GCs with respect to control/baseline. Notably, rather than the direction of change of GC levels, change itself, may be an indicator of an animal being in a state of arousal (45).

Even if GC levels do not change in response to an intervention, this does not mean that modulation of HPA axis and stress responsivity has not taken place. Under some circumstances, changes in patterns of GC release may only be visible when accounting for circadian rhythms [e.g., (46, 47)]. We included only studies which followed the animals longer than 24 h. Short-term GC responses may point to the animal experiencing acute stress, however, to identify and impact on welfare and the individual's capacity to cope and adapt, an animal needs to be followed over a longer period of time (6, 48).

Animals may also develop a hypo- or hyperresponsiveness to stressors, after being exposed to prolonged periods of chronic or repeated stressors (49–51). This modulation of GC baseline and peak levels in response to chronic exposure to stressors may hamper the interpretation of GC responses to interventions, as individuals in the stressor group may have lower baseline GC levels compared to controls (52). The animal may have adapted to the challenge in order to cope by a new physiological setpoint [allostatic state, (48, 53)]. The crucial point is whether adjusted setpoints have negative consequences on the physical and emotional wellbeing of the individual, or on the animal's capacity to reach a state that it perceives as positive.

Also, persisting changes in GC levels can be, but are not necessarily accompanied by structural changes in organs, e.g., changes in the morphology of the adrenal glands (54). Even more variation is added by the sample type reflecting free or bound fractions of GCs or even GC metabolites (30). These may differ between species (29, 55) and need to be identified prior to measuring and interpreting concentrations with respect to the effects of an intervention (56). We show variation in GC metabolites per sample matrix. This pattern may be influenced by the sample matrices that are typically used withing a particular animal class. However, whether the probability to find an effect of a welfare intervention depends on the sample matrix, remains to be investigated. The lack of validation of GC (metabolite) quantification in alternative sample types such as feces, for example, may lead to erroneous results (30).

Other influential factors in the interpretation of GC levels are corticosteroid binding globulins (CBGs), which regulate bioavailability of GCs. The measurement and correction for CBG levels is not yet widely being applied in research on stress and animal welfare, though recognized by critical reviews on the usefulness of GCs for welfare assessment (13–15). Thus, basing conclusions concerning the effect of a welfare related intervention solely on an effect found on the animal's GC levels appears unsubstantiated by the literature. Regarding the determination of GCs, conceptual, and technical challenges need to be met to create standardized and robust data on GC levels, and to aid the interpretation of changes in GC patterns.

4.3. Additional assessment parameters

GC levels alone are of limited informational value without additional information about the animal, at least about its current behavior, morphology, and physiology, ideally accompanied by information on ontogeny and previous experiences. The value of GCs may be adding information to other welfare proxies while the correlation between different welfare readout parameters needs further study.

Although the measurement of GCs is often considered to represent valid “welfare measurements,” GCs alone do not seem to be valid indicators of welfare, and additional (or even more suitable) readout parameters need to be identified and used. As we limited our review to studies about welfare assessment, we only included studies in our mapping review that measured additional morphological, behavioral, and/or physiological parameters.

The combination of sampling of GC and collection of parameters from the three domains, behavior, morphology, and physiology, was applied most frequently. This reflects the general consensus that welfare comprises several domains (8). Behavior was the most often assessed additional parameter. This might be due to the unanimous opinion that behavior is the main non-invasive readout parameter to assess the welfare state of an animal, across species and between breeds (57–59). How an individual valences its own emotional state cannot be assessed by other non-invasive procedures than behavioral observations (60). Promising indicators for assessing pain and distress in animals, for example, have been obtained through the analyses of vocalizations (61, 62) and by analyzing the animal's facial expressions, using grimace scales. The latter have successfully been developed for several species (63).

When behavior was assessed as additional parameter to GC values, studies reported heterogenous results. This finding indicates that, if there was an effect of the intervention, measurements of different behavioral traits did not show consistent changes. We did not extract information on type and details of behavioral observations, but the high variability in outcomes urges for adaptation, standardization and validation of tests, a common definition of test aims (i.e., which domain is tested in the animal, e.g., exploration, fear, habituation), and detailed description of test setups.

While we acknowledge behavior as the most important indicator of an individual's emotional state, the use of parameters related to health, morphology, and physiology may be indispensable for a comprehensive approach to welfare. The measurement of physiological parameters is subjected to the same issues as GC measurements, thus variability due to sampling regimen or due to individual phenotypes. Moreover, there is a great variety of parameters to choose from, and therefore it was not surprising that studies reported no or heterogenous effects. The large variation in parameters (and outcomes) calls for detailed reviews identifying overarching patterns in effects on specific readout parameters. While there are many promising physiological parameters to monitor and assess welfare, these also need a thorough validation and should be evaluated in a more in-depth review [e.g., (64, 65)].

Morphological parameters are thought to be closely linked to animal welfare, since physical health is often considered a prerequisite for welfare. In our data set, however, the majority of experiments did not monitor parameters related to morphology. Notably, an animal in good health is not automatically in a state of positive welfare, and vice versa. As long as the individual has not reached the limits of its adaptive capacity, and the health status does not prevent it from reaching a state that it perceives as positive, compromised health may not lead to seriously compromised welfare (66). Monitoring clinical signs and body mass is common practice to assess welfare, as these traits are relatively easy to assess. Monitoring weight loss has been shown to be indicative of severe suffering in laboratory animals, but assessing (milder forms of) distress requires the measurement of additional parameters (67). Therefore, also the use of body mass as primary welfare indicator for at least laboratory animals should be questioned [e.g., (68)]. Regarding animals selected for high productivity, the relation between body mass and welfare may be distorted, e.g., in broiler chicken or pigs, where fear of humans, indicative of a negative welfare state, and productivity are inversely correlated (69, 70).

4.4. Outcomes related to study hypothesis

A considerable number of experiments yielded results that were inconsistent with the hypotheses (i.e., no effect while enhancing/diminishing welfare was hypothesized). While this finding can be explained by the ambiguous effects that welfare related interventions may have on assessment parameters, it also highlights the importance of formulating a-priori, testable, and clear research hypotheses. Hypothesis-driven studies are especially needed in the field of stress and welfare research, which is complex and faced with subjective attitudes and interpretations (71). Also, the underlying conceptual approach to welfare, and a hypothesis about why GC levels should in-/decrease in response to a certain intervention, should be explicitly stated. Interventions aimed at affecting welfare and not resulting in changes in GCs may support the critical literature on the validity of GCs for the assessment of welfare (13–16). Alternatively, the interventions may have been unsuccessful in affecting welfare. Teasing apart these two aspects requires a critical evaluation of the design of interventions, and its appropriateness for the study population.

It is a common misperception that GC levels equal the levels of stress an animal experiences (12), and that stress equals diminished welfare (72). We argue that an animal resides in a positive welfare state as long as it can cope and adapt to the demands of its (prevailing) environmental circumstances, enabling it to reach a state that it perceives as positive, e.g., that evokes positive emotions (3, 6). Thus, when investigating the effects of stressors, these aspects should be considered when formulating the research hypothesis.

Since we extracted the hypotheses as they were presented in the final publications, our results may have been affected by authors adapting their hypotheses in the writing phase to improve storytelling. Recent work shows that this phenomenon, generally referred to as “HARKing” (Hypothesizing After Results are Known) also exists in animal welfare research (19). The high incidence of inconsistency between the hypotheses and results reported in the set of reviewed studies suggests that HARKing in the welfare field is less common. Alternatively, the proportion of inconclusive results may even be larger than can be deduced from the published records, i.e., the proportion of inconsistent results may be an underrepresentation of the real number of studies that did not support the (original) hypothesis. This idea may become testable as soon as a priori protocol registration becomes common practice and original hypotheses can reliably be retrieved.

4.4.1. Glucocorticoid sampling

To elucidate potential improvements regarding the informative value of GCs as welfare proxy, we extracted information on sampling methods. Most often, experiments sampled GCs more than once, in line with our conceptual approach to welfare, which is dynamic (3, 6). The proportion of experiments finding homogenous effects or no effects were comparable across the number of GC measurements, but the proportion of experiments reporting heterogenous effects increased. Note that long-term experiments, in which the GC levels initially changed due to the intervention and then returned to control or baseline levels, were not scored as heterogenous but included in the homogenous category because this change was considered as temporary.

To identify patterns of GC release in response to an acute stressor, multiple measurements on the dynamics of the GC response may prove to be more informative about the animal's coping style and resilience (73). Further long-term and in-depth analyses are needed to understand the time course of HPA axis regulation, the role of allostasis and resilience, and the consequences for welfare (74).

Blood sampling for cortisol was the most common procedure. Animal species differ in the glucocorticoid that is predominantly produced by the adrenals, e.g., corticosterone in birds and rodents, and cortisol in most other mammals. However, research has shown that local GC and GC metabolite synthesis may lead to local differences in which glucocorticoid is present at a higher level, challenging the idea that species are corticosterone or cortisol dominant [e.g., (75)]. In addition, cortisol and corticosterone concentrations might not correlate (76). Especially studies investigating the immune system as index of welfare should consider measuring several GC metabolites when trying to elucidate mechanisms [e.g., (77)].

Nearly half of the experiments collected sample matrices such as feces, urine, hair, or other substances (e.g., milk or eggs) to quantify GC levels. These matrices allow for less invasive sampling than blood collection. Additionally, the procedures necessary to collect blood samples may affect levels of GCs themselves (78). Therefore, alternative sample matrices are promising for repeatedly sampling to avoid accumulation of discomfort. Notably, handling to collect alternative samples, e.g., placing the animal in a separate cage [e.g., (79)], may affect GC levels in subsequent samples. Moreover, the relation between GC levels in blood and in alternative matrices needs more investigation. Sample matrices reflect different time periods of GC accumulation (56), thus relative levels might not relate between sample types (23). Interestingly, minimally invasively measured levels of GCs, e.g., in feces or in feathers, have been shown to reflect biologically meaningful patterns (80, 81). Further validation of measurements, e.g., by challenging animals with ACTH and subsequently collecting samples over a period of time to determine peak GC concentrations, is highly recommended (29, 30).

Next to handling of the animals, several technical challenges may account, at least partly, for the variation in the relation between GCs and additional proxies of welfare. Sampling regimens should account for variation on the individual instead of group/cage level (82–84). A circadian and circannual rhythm [e.g., (85)] may lead to additional variation, which needs to be controlled experimentally and statistically. The subsequent processing of samples in the laboratory—i.e., whether and how the sample is extracted and purified, the choice of antibody and assay type—may further impact measured concentrations (86), though a recent meta-analysis did not find an effect of assay method (87).

4.5. Experimental design

A between-subject design was applied in most of the reviewed experiments, comparing the effects of a treatment with a control group. Regarding to the importance of the individual in welfare concepts, data on pre-intervention values would add valuable information about individual profiles (88). Vice versa, studies investigating within-individual changes may miss the comparison to a control group over time. One third of the reviewed studies did indeed use a combined experimental set-up, thus included within- and between individual, or group, measurements. Long-term effects of an intervention may then become visible, which is important in view of the concept that welfare is dynamic and dependent on the individuals' adaptive capacity.

All three GC outcome domains were prevalent across a range of sample sizes, but detailed analyses are needed to infer the likelihood of finding effects on GC levels are driven by the number of animals sampled. Information on sample size for GC measurements proved rather difficult to retrieve. We therefore urge the reporting of precise sample sizes along with the statistics or graphical representations of GC levels. A sufficiently large sample size and appropriate statistical power is an issue across research topics. Guidelines for animal welfare research have been published to ensure the appropriate samples size in animal welfare studies, especially when investigating adverse interventions (89).

4.5.1. Study population

Most of the experiments focused on farm animals such as pigs, cows, and chicken, and on laboratory animals such as mice and rats, followed by other domesticated species. These results may indicate the great effort of researchers investigating welfare and welfare-related interventions in the two most frequently used animal clusters (44). The proportion of studies aiming to study the welfare of a given species might also be linked to the perceived ethical conflict regarding the intrinsic value of animals, urging to safeguard the animal's integrity on the one side, and the use of animals for human purposes, on the other (90). The perceived need for improvement of husbandry circumstances seems to be highest in pigs as well as other species used for meat production.

Despite the increasing production of fish in commercial aquaculture, studies investigating a welfare related intervention in fish were scarce in our data set. The small number of studies represented in our review may be due to our selection criteria, as research on fish welfare often assesses measurements directly relating to the stress response and does not include additional parameter such as behavior. Given the large numbers of fish used in intensive aquaculture, and the large variety of fish species used, more research on their biological needs and potential welfare, using appropriate readout parameter is clearly needed (91, 92). We therefore advocate investigating the biological needs and to identify readout parameters suitable for assessing welfare in species underrepresented in our data set, such as amphibians, reptiles, and fish (93–96).

4.6. Interventions

Prior to data extraction, the team of reviewers defined a set of categories of interventions, based on our knowledge of the field. We were able to assign the major part of the experiments to these categories, however, ten percent were defined as “other intervention.” The outcome confirms the broad scope of welfare related research, and the complexity of factors to which the animals are exposed to. The clusters found are similar to the clusters of the network analysis from Freire and Nicol (44).

The variation of interventions, even within categories, may partly explain the variation in outcome of welfare proxies such as GCs, and behavior, morphology, and physiology. It remains to be statistically evaluated whether certain interventions lead to specific patterns in GC responses. Moreover, the type and array of welfare assessment parameters should be adjusted to the type of intervention.

Structural enrichment was the main category of single interventions, followed by a combination of enrichments. Enrichment aims to trigger different motivational systems and stimulates the animal to express its full behavioral repertoire (97). Similarly, space allowance and restricted movement were often investigated. Both determine which behaviors an animal can express, as some systems severely restrict movement (e.g., gestation stalls in sows). Expressing natural and/or normal behavior is recognized as a biological need and crucial for the animal to reach a positive welfare state (98, 99). Assessment of welfare based on GCs and other physiological parameters may not be appropriate in these systems, as physical exercise may affect the outcomes, leading to e.g., higher GC levels (100).

4.7. Limitations

This mapping review provides an overview of experiments systematically in- or excluded according to a list of a priori criteria. While some of the exclusion criteria were based on a clear go/no go decision, others may be debatable. The first exclusion criterium, invertebrate, or human study population, served to limit the review to animal species in which GCs are components of the stress response. Based on the recent insights into emotional capacities in invertebrate species, we urge to extend welfare concepts and assessment protocols to these taxa (101).

Given the variety of opinions on welfare, the decision whether an experiment was assessing welfare or purely focused on the physiological response to stressors was challenging. Based on the conceptual approach that welfare is dynamic and comprises several domains (4), we established the criterium that, next to GCs, additional parameters needed to be measured (out of the domains, behavior, morphology and physiology) and that the experiment needed to follow the animals longer than 24 h. The 24 h cut-off was chosen to distinguish the acute GC stress response from long-term consequences affecting the coping capacity of animals (32). Nevertheless, we claim that 24 h are certainly not long enough to truly reflect coping capacity and welfare. Welfare assessment of an individual should take its lifetime experiences into account, positive as well as negative, as reflected by concepts such as “a Life Worth Living” (102, 103). Important for the welfare of an individual is that also acute interventions may have long-term effects, especially if applied during sensitive phases such as early life or adolescence, or even prenatally (104–106). If we, however, would have selected only studies across a lifetime of an animal, we would have significantly limited our sample size, as permanent interventions represented only a very small proportion of our set of publications. Nevertheless, our dataset offers the future possibility to investigate the included studies in more detail.

We did not extract detailed quantitative information about GC levels or the other outcome parameter. Instead, we chose to provide an overview of studies measuring GCs as part of assessing animal welfare. We did not analyze relations between experimental characteristics and outcome parameter to prevent “vote counting” (40). Vote counting, thus adding up the number of studies finding an effect, while omitting the statistical analysis of these studies, may lead to a false interpretation and over- or underestimation of effect sizes. While summations may suggest an effect of an intervention (or the absence thereof), reliable interpretation is only possible by performing a weighted meta-analysis on the summarized data, taking means and variation into account (40). Keeping this in mind, we describe GC, behavioral, morphological, and physiological outcomes to provide an overview of the field, not inferring statistical implications. The here-presented results therefore do not allow for causal inferences, rather serves as inspiration. Further exploration of the data set, focusing on a carefully selected and defined subset of limited criteria and parameters, offers the opportunity to perform a meta-analysis and support further results by applying appropriate statistical analyses.

5. Conclusions and recommendations

The variation in outcomes of GCs, behavioral, morphological, and physiological parameter indicates that we have not yet developed the toolbox to decide with sufficient certainly whether an intervention improves or compromises welfare. Further research relating these parameters to the emotional experience of an individual, e.g., in terms of positive welfare indicators, is crucial (107). The complexity and multidimensional nature of welfare assessment needs, and benefits from, more than one assessment dimension (108).

This mapping review provides one of the largest explorative analyses of experimental studies using GCs as one of the parameters to assess the effects of interventions on animal welfare. The results are meant to encourage further extraction and quantitative meta-analysis of the relationships between parameter, for example testing effects on GCs in relation to intervention or sample matrix. Further research aiding our understanding of the interplay of GCs and behavioral, morphological and physiological processes, is clearly needed.

Data availability statement

The data that support the findings of this study are available from the corresponding author, IT upon reasonable request: aW5nYS50aWVtYW5uQHVuaS1ib25uLmRl.

Author contributions

VG and IT conceived the presented idea. IT, LF, MB, CL, and VG defined the literature search. IT, LF, MB, EL, and VG carried out the literature search. IT and VG wrote the paper with input from all the co-authors. All authors contributed to the article and approved the submitted version.

Funding

The publication was supported by the Open Access Publication Fund of the University of Bonn.

Acknowledgments

We thank Malika Weima, Eveline Giersberg, Thea Kranz, Laura Kröger, and Josefine Stuff for their help with the data extraction.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher's note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Supplementary material

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fvets.2022.954607/full#supplementary-material

References

1. Welfare Quality®. Welfare Quality^® Assessment Protocol for Cattle. Lelystad: Welfare Quality^® (2009).

Google Scholar

2. Brambell FW, Barbour DS, Lady Barnett, Ewer TK, Hobson A, Pitchforth H, et al. Report of the Technical Committee to Enquire Into the Welfare of Animals Kept Under Intensive Livestock Husbandry Systems. London (1965).

Google Scholar

3. Ohl F, van der Staay FJ. Animal welfare: at the interface between science and society. Vet J. (2012) 192:13–9. doi: 10.1016/j.tvjl.2011.05.019

PubMed Abstract | CrossRef Full Text | Google Scholar

4. Mellor DJ, Beausoleil NJ, Littlewood KE, McLean AN, McGreevy PD, Jones B, et al. The 2020 five domains model: including human–animal interactions in assessments of animal welfare. Animals. (2020) 10:1870. doi: 10.3390/ani10101870

PubMed Abstract | CrossRef Full Text | Google Scholar

5. Mellor DJ, Reid C. Concepts of animal well-being and predicting the impact of procedures on experimental animals. In: Improving the Well-Being of Animals in the Research Environment. Glen Osmond, SA: WellBeing International (1994). p. 3–18. Available online at: https://www.wellbeingintlstudiesrepository.org/exprawel/7/

Google Scholar

6. Arndt SS, Goerlich VC, van der Staay FJ. A dynamic concept of animal welfare: the role of appetitive and adverse internal and external factors and the animal's ability to adapt to them. Front Anim Sci. (2022) 3:908513. doi: 10.3389/fanim.2022.908513

Glucocorticoids in relation to behavior, morphology, and physiology as proxy indicators for the assessment of animal welfare. A systematic mapping review

1. Introduction

2. Material and methods

2.1. Literature search strategy

2.2. Study selection

2.3. Inclusion and exclusion criteria

2.4. Data extraction and analyses

3. Results

3.1. Reference flow

3.2. Network visualization

3.3. Effects on glucocorticoids

3.4. Additional assessment parameters

3.5. Outcomes related to study hypothesis

3.6. Glucocorticoid sampling

3.7. Experimental design

3.8. Study population

3.9. Interventions

4. Discussion

4.1. Network visualization

4.2. Effects on glucocorticoids

4.3. Additional assessment parameters

4.4. Outcomes related to study hypothesis

4.4.1. Glucocorticoid sampling

4.5. Experimental design

4.5.1. Study population

4.6. Interventions

4.7. Limitations

5. Conclusions and recommendations

Data availability statement

Author contributions

Funding

Acknowledgments

Conflict of interest

Publisher's note

Supplementary material

References

94% of researchers rate our articles as excellent or good