REVIEW article

Front. Psychol., 03 November 2022
Sec. Evolutionary Psychology

Hearing the physical condition: The relationship between sexually dimorphic vocal traits and underlying physiology

Shitao Chen1, Chengyang Han1, Shuai Wang1, Xuanwen Liu1, Bin Wang1, Ran Wei2 and Xue Lei3*
  • 1Department of Psychology, College of Education, Hangzhou Normal University, Hangzhou, Zhejiang, China
  • 2School of Psychology, Shenzhen University, Shenzhen, Guangdong, China
  • 3School of Business Administration, Zhejiang University of Finance and Economics, Hangzhou, China

A growing body of research has shown associations between sexually dimorphic vocal traits and physiological conditions related to reproductive advantage. This paper reviews the literature on the relationship between sexually dimorphic vocal traits and sex hormones, body size, and physique, physiological conditions that are important in reproductive success and mate selection. Regarding sex hormones, there are associations between sex-specific hormones and sexually dimorphic vocal traits; regarding body size, formant frequencies are more reliable predictors of human body size than pitch/fundamental frequency; regarding physique, there is a possible but still controversial association between the human voice and strength and combat power, while pitch is more often used as a signal of aggressive intent in conflict. Future research should consider demographic, cross-cultural, cognitive interaction, and emotional motivation influences in order to assess the relationship between voice and physiology more accurately. Moreover, neurological studies are recommended to gain a deeper understanding of the evolutionary origins and adaptive functions of voice modulation.

Introduction

Research has shown a high degree of consistency in the way people judge other people’s voices (Pisanski and Bryant, 2019). For example, low male voice pitch and high female voice pitch are generally considered more attractive (Skrinda et al., 2014). Evolutionary psychologists have suggested that this consistency arises because different voice characteristics imply corresponding biological information (e.g., reproductive health and physical fitness), which is highly correlated with the corresponding social judgments (e.g., sexual attractiveness, resource appropriation; Feinberg, 2008; Puts et al., 2012). Previous studies have often linked human sexually dimorphic vocal traits to sex hormones, body size, and physique. This paper first introduces the widely studied sexually dimorphic acoustic parameters of the voice, and then collates and synthesizes research on the relationship between human sexually dimorphic vocal traits and these three physical signs.

Overview of the human voice

Voice related parameters and measurements

The famous Source-Filter Theory was developed by Fant in the 1960s to describe how humans and most mammals produce sound (Fant, 1960). Sound production can be divided into two parts, the source and the filter. The source is generated by the vocal cords: air exiting the lungs passes through the windpipe and strikes the vocal cords, causing them to vibrate and generate the fundamental frequency (F0). The filter refers to the supraglottal vocal tract, the spaces shaped by the pharynx, soft palate, tongue, oral cavity, nasal cavity, and sinuses. The supraglottal vocal tract changes its configuration (e.g., the position of the soft palate and tongue) to alter the length and size of the vocal tract transiently, thus changing the frequency of the resonances generated by sound reflections in the vocal tract. The peaks of these resonances are called formant frequencies (Titze and Martin, 1998). Finally, by superimposing the source sound on the resonances of the vocal tract, humans and most mammals achieve vocalization. Similarly, the sound of a violin depends not only on the frequency of string vibration (the source), but also on the resonances (the filter) produced by sound reflection in the cavity of the instrument.
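The filter side of this account can be illustrated with a simple textbook idealization that is not part of the reviewed studies: treating the vocal tract as a uniform tube closed at the glottis and open at the lips, whose resonances fall at odd multiples of c/(4L). The sketch below is only an approximation under that assumption; the function name `tube_resonances` and the speed of sound of 350 m/s are illustrative choices, and real vocal tracts are neither uniform nor fixed in shape.

```python
# Minimal sketch: resonances of an idealized uniform tube (closed at one end).
# ASSUMPTIONS (not from the article): uniform-tube model, c = 350 m/s.

SPEED_OF_SOUND_M_S = 350.0  # assumed speed of sound in warm, humid air


def tube_resonances(tract_length_m, n_formants=4, speed_of_sound=SPEED_OF_SOUND_M_S):
    """Return the first n resonance frequencies (Hz) of a quarter-wave tube.

    F_i = (2i - 1) * c / (4 * L) for i = 1..n
    """
    if tract_length_m <= 0:
        raise ValueError("tract_length_m must be positive")
    return [
        (2 * i - 1) * speed_of_sound / (4.0 * tract_length_m)
        for i in range(1, n_formants + 1)
    ]


# A longer tube yields lower, more closely spaced resonances than a shorter one.
longer_tract = tube_resonances(0.175)   # roughly 500, 1500, 2500, 3500 Hz
shorter_tract = tube_resonances(0.150)
```

Under this idealization, lengthening the tube lowers every resonance and narrows the spacing between them, which is the intuition behind using formants as cues to vocal tract length.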

The perceptual correlate of the human vocal fundamental frequency is commonly referred to as voice pitch. Generally, larger vocal folds vibrate at a lower frequency than smaller ones, resulting in a relatively low F0; however, regardless of their size, F0 increases when the vocal folds are stretched and under tension, so F0 is determined by the volume, length, and tension of the vocal folds (Titze, 2011; Hollien, 2014). Timbre is related to the formant frequencies produced in the vocal tract above the larynx. Because formant frequencies comprise a series of values (e.g., the first formant F1, the second formant F2, etc.), they are often converted into a single value to represent their distribution in research, and there are currently 12 methods for doing so (Pisanski et al., 2014). The most commonly used are formant position (Pf), estimated vocal tract length (VTL), and formant dispersion (Df). The relative positions of the formants (especially F1 and F2) play a key role in speech production and perception, and play a greater role than F0 in non-tonal languages (Titze and Martin, 1998). Longer vocal tracts produce lower and more closely spaced formants than shorter ones, while manipulation of the tongue, lips, jaw, and soft palate can also alter the shape of the vocal tract, affecting the relative dispersion of formant frequencies and thus producing different articulations (Pisanski et al., 2014).
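As a rough illustration of how a set of formants can be collapsed into a single value, the sketch below computes formant dispersion (Df) as the average spacing between adjacent formants, the definition commonly attributed to Fitch (1997), and converts it to an apparent vocal tract length. The conversion VTL = c/(2·Df) assumes a uniform tube closed at one end and a speed of sound of 350 m/s; both are simplifying assumptions rather than details given in this review, and the function names are hypothetical.

```python
# Minimal sketch: formant dispersion (Df) and apparent vocal tract length (VTL).
# ASSUMPTIONS (not from the article): uniform-tube model, c = 350 m/s.


def formant_dispersion(formants_hz):
    """Average spacing (Hz) between adjacent formants: (F_n - F_1) / (n - 1)."""
    ordered = sorted(formants_hz)
    if len(ordered) < 2:
        raise ValueError("at least two formants are required")
    return (ordered[-1] - ordered[0]) / (len(ordered) - 1)


def apparent_vtl_m(formants_hz, speed_of_sound=350.0):
    """Apparent vocal tract length (m) implied by Df under the tube model."""
    return speed_of_sound / (2.0 * formant_dispersion(formants_hz))


# Hypothetical formant measurements (Hz) for one speaker.
example_formants = [500.0, 1500.0, 2500.0, 3500.0]
df_hz = formant_dispersion(example_formants)   # average spacing in Hz
vtl_m = apparent_vtl_m(example_formants)       # apparent length in metres
```

A smaller Df implies a longer apparent vocal tract; other summary measures mentioned above, such as formant position, standardize each formant before averaging and are not shown here.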

The sexual dimorphism of the human voice is currently of great interest in the fields of evolutionary psychology and acoustics.

Gender dimorphic features of the voice

From the perspective of evolutionary psychology, sexual selection and parental investment theories suggest that the sex that invests less in its offspring will experience stronger intra-sexual competition during reproduction (Buss, 2019). In many anthropoid primates, males tend to experience stronger intra-sexual competition and possess external traits (e.g., facial and physical traits) that play a role in winning mates (Buss, 2019; Aung and Puts, 2020). It has also been suggested that an animal’s vocalizations may reflect its formidability (Sell et al., 2010), and that low-frequency vocalizations may help males gain mates by intimidating other males and/or attracting females, thus leading males to evolve a lower F0 than females (Puts et al., 2014, 2016).

The human voice exhibits significant sexual dimorphism (Puts et al., 2016). In males, androgen levels rise markedly during puberty and the vocal folds thicken and lengthen, resulting in a significant decrease in F0 (Fouquet et al., 2016; Markova et al., 2016). In adulthood, the vocal folds are about 60% longer and F0 is about five standard deviations lower in men than in women (Puts et al., 2016). Individuals with low pitch usually have longer vocal folds and less muscle tension on the vocal folds, which vibrate at lower fundamental frequencies (Titze and Martin, 1998). Many studies have found that the lower pitch of males plays a significant role in conveying an impression of dominance to other individuals (Puts et al., 2006; Hodges-Simeon et al., 2011; Hill et al., 2013; Puts, 2016). Puts et al. (2006) speculate that males and females may use the marked sex differences in F0 and formant frequencies to convey various sex-related attributes, for example, information related to physical dominance, heterosexual attraction, and threat (Cartei et al., 2012).

The human voice and sex hormones

The male voice and testosterone

The effects of androgens such as testosterone on the voice mainly concern pitch. In males, testosterone levels rise during puberty, promoting the development of the laryngeal tissues and framework and the formation of the laryngeal prominence. At the same time, the muscular and mucosal layers of the vocal folds thicken, the vocal folds become longer and wider, and male pitch typically decreases after puberty (Harries et al., 1997). Harries et al. (1997) recorded data on a group of boys aged 13–14 years, including their speaking pitch and their salivary testosterone levels, over a period of 1 year, during which these boys experienced the vocal changes specific to puberty. This study did not find a correlation between testosterone levels and pitch, but, interestingly, testicular volume was associated with changes in pitch: the larger the testicular volume, the lower the pitch (Harries et al., 1997). Previous studies have suggested that changes in the male voice are completed at puberty, and thus there would be no reason to expect pitch to be associated with circulating testosterone levels after puberty. However, a small number of studies have found that circulating testosterone levels are associated with acoustic parameters (F0 and Pf) of the adult male voice. Meuser and Nieschlag (1977) found that tenor singers had lower testosterone/estradiol ratios than baritone and bass singers. Two separate studies of young male samples found a negative correlation between testosterone levels and pitch (Pedersen et al., 1986; Dabbs and Mallinger, 1999). Later, Bruckert et al. (2006) found that men with less dispersed formant frequencies had higher testosterone levels, but did not find a correlation between testosterone levels and pitch.

The divergence among these studies may be explained as follows. First, testosterone levels in the saliva and serum of adult men vary dynamically throughout the day, peaking in the morning and reaching their lowest point in the evening, so inconsistency in sampling times may bias experimental results (Campbell et al., 1982). Given this within-day variability, researchers should use a more rigorous approach to measuring testosterone levels. For example, Evans et al. (2008) collected saliva samples from subjects at 9 am, 12 noon, and 3 pm and explored the relationship between testosterone levels and voice parameters. The results supported previous findings of a negative correlation between testosterone levels and fundamental frequency, and the correlation was stronger than in previous studies. The authors inferred that voice pitch can provide an honest signal of an individual’s hormone levels (Evans et al., 2008). In addition, some medical studies suggest that if males do not transition well during the voice change period, this may lead to adolescent falsetto, also described as a female-sounding male voice, a functional vocal disorder that can be treated with appropriate doses of testosterone to lower vocal frequency (Zhuang and Liu, 2021).
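The sampling logic described above (several saliva samples per participant across the day, then a single correlation with voice pitch) can be sketched as follows. This is only an illustration of the general idea, not the analysis used by Evans et al. (2008); the data values, time labels, and function names are hypothetical.

```python
# Minimal sketch: average repeated testosterone samples per participant,
# then correlate the daily mean with mean F0.
# ASSUMPTION: all numbers below are invented for illustration only.
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation coefficient for two equal-length sequences."""
    if len(x) != len(y) or len(x) < 3:
        raise ValueError("need two sequences of equal length (at least 3)")
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def daily_mean_testosterone(samples_by_time):
    """Average each participant's samples across all sampling times.

    samples_by_time maps a time label to a list with one value per participant.
    """
    return [mean(values) for values in zip(*samples_by_time.values())]


# Hypothetical salivary testosterone (arbitrary units) for five participants.
saliva = {
    "09:00": [120, 150, 180, 210, 240],
    "12:00": [100, 130, 150, 190, 200],
    "15:00": [90, 110, 140, 160, 190],
}
mean_f0_hz = [130, 125, 118, 110, 104]  # hypothetical mean speaking F0

testosterone_means = daily_mean_testosterone(saliva)
r = pearson_r(testosterone_means, mean_f0_hz)  # negative for these invented data
```

Averaging across fixed time points reduces the influence of the diurnal cycle on any single measurement, which is one plausible reason why better-controlled sampling could yield more consistent correlations.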

In sum, some studies have found a negative correlation between testosterone levels and men’s pitch, and diurnal shifts in testosterone levels may make it difficult to obtain consistent results when the timing of testing is not well controlled.

The female voice and estrogen

The larynx is an important target organ for sex hormones. For women, the vocal fold mucosa proliferates and increases glandular secretion and capillary permeability in response to estrogen, while progesterone acts on top of estrogen to inhibit estrogen-induced hyperplasia of the vocal fold mucosa and glandular secretion and to reduce capillary permeability (Kirgezen et al., 2017; Kim et al., 2020). During the pubertal phase, women experience a mild thickening and lengthening of the vocal folds and a decrease in pitch of approximately one-third of an octave in response to estrogen and progesterone (Zhuang and Liu, 2021).

It has been shown that the female voice changes cyclically with the menstrual cycle. The follicular phase, which begins the menstrual cycle, is a period when estrogen levels are significantly higher and progesterone levels are significantly lower. This hormonal change leads to vocal fold edema and increased blood flow through the vocal folds; polysaccharides in the vocal folds break down more easily and bind water more readily, which further promotes the accumulation of fluid in the vocal folds (Kadakia et al., 2013). In addition, blood vessels in the nasal cavity dilate, affecting airflow, and the hormonal environment can also increase reflux symptoms by slowing gastric motility (Kadakia et al., 2013). During the luteal phase, progesterone levels increase much more than estrogen levels; progesterone promotes the shedding of the laryngeal epithelium and inhibits its proliferation, and it also causes glandular secretions to become more viscous, leading to a decrease in the frequency of vocal cord vibration (Kadakia et al., 2013). Kadakia et al. (2013) postulated that these changes are the main cause of vocal changes during the menstrual cycle. In one study, voice recordings were collected from female subjects at different times during their menstrual cycle, and 30 men and 30 women then rated the attractiveness of the recordings. Voice attractiveness ratings increased significantly as the likelihood of conception increased (closer to ovulation), suggesting that women’s voices may provide reproductive signals related to sex hormone fluctuations (Pipitone and Gallup, 2008).

During menopause, women’s voices change dramatically as their estrogen and progesterone levels decline. At the beginning of menopause, follicle stimulating hormone (FSH) and luteinizing hormone (LH) remain at a high level and the ovaries continue to produce androgens. For women with high fat reserves, these secreted androgens are converted into estrogen, maintaining the impact of estrogen on the body. However, for some women with low fat reserves, no androgens can be converted, thus leaving androgen levels relatively high, which reduces the pitch of the voice and causes irreversible changes (Strauss et al., 1985).

In sum, owing to hormonal changes, women’s voices change during puberty, across the menstrual cycle, and at menopause.

Furthermore, data from both women and men suggest that changes in sex hormone levels can influence an individual’s voice, especially pitch. Sexually dimorphic vocal traits are affected by sex hormones, and these hormones are linked with reproductive and health viability in men and women (Venners et al., 2006; Almeida et al., 2017). It is therefore possible that sexually dimorphic vocal traits signal reproductive advantage (Apicella et al., 2007; Atkinson et al., 2012) and that sexual selection favored these traits (Puts, 2016), which in turn amplified the vocal differences between men and women. A similar pattern is seen between the human voice and other physiological traits that are important to mate competition, such as men’s body size and physique.

Voice and body size

Studies on animals have shown that large body size is generally preferred by the opposite sex. For example, female cichlid fish prefer to spawn near larger males because larger males are better able to provide territorial defense as a means of protecting their offspring (Keenleyside et al., 1985). For territorial monogamous species, females also prefer larger males as larger males tend to gain more territory and thus provide better environmental conditions for females to raise their offspring (Eberhard and Ewald, 1994; Nimje et al., 2021). Evidence from animal vocalizations studies has shown that acoustic signals can provide information on physical characteristics, such as body size, age, and sex. The formant dispersion was found to be a reliable predictor of body size in macaques, as measured by radiographs and computer graphics techniques (Fitch, 1997). In a study of domestic dogs, a significant correlation between formant dispersion and body size was found by recording the acoustic signals of domestic dogs growling (Riede and Fitch, 1999).

In humans, body size is often associated with competitiveness, social status, and attractiveness. It is also an important cue for individuals to assess the strength of their competitors and the quality of their mates (Fitch, 2000). It is often assumed that men with low voices are more attractive to the opposite sex and more dominant over the same sex, so what exactly is the relationship between the human voice and body size?

Pitch and body size

It has been found that lower F0 in males predicts a number of parameters related to physical signs, such as shoulder circumference and chest circumference as well as height and weight (Evans et al., 2006; Pisanski et al., 2014; Aung and Puts, 2020). Sensory exploitation theories of sexual selection explain this phenomenon as a simple physical property of the world, just as a larger rock emits lower-frequency vibrations than a smaller one when struck with a stick (Titze and Martin, 1994). On this view, the perception of a lower male pitch as more dominant simply reflects the organism’s response to objects that emit lower-frequency vibrations. This response (the perception that bass tones are loud and frightening) is evident not just in humans but throughout the animal kingdom, suggesting that this sensory bias is evolutionarily long-standing (Morton, 1977). One study found that congenitally blind and sighted people alike perceived males with lower pitch as larger, suggesting that visual learning is not required for this auditory perception (Pisanski et al., 2017).

Sensory exploitation theories of sexual selection also suggest that the “lower is louder” heuristic is commonly used in the processing of auditory stimuli. As a result, the perception that bass males are larger and more dominant is likely to be a mere by-product of this heuristic (Rendall et al., 2007). So how does the “lower is louder” heuristic select for bass males? Feinberg et al. (2018) suggest two possible pathways. First, all else being equal, bass men exploit women’s sensory bias that “bass is bigger,” causing women to perceive them as larger and more dominant, which leads women to actively choose bass men. Consistent with this possibility, artificially lowering the pitch of men’s voices in experiments has a positive effect on the opposite sex’s assessment of their attractiveness. Second, all else being equal, men with lower pitch are more likely to win in same-sex competition. Low-pitched men exploit other men’s sensory bias that “bass is bigger,” causing other men to perceive them as larger and more threatening and therefore to be less confident of winning or even to flee the contest, which makes it easier for bass men to win intra-sexual competition and gives them an evolutionary advantage (Feinberg et al., 2018). Consistent with this possibility, artificially lowering the pitch of male voices in experiments has a positive effect on same-sex assessments of their dominance (Jones et al., 2010).

However, there is also evidence that the relationship between the human voice and true body size is not robust. It has been found that, when controlling for sex and age, pitch has a very limited role in predicting body size in many mammals (Fink et al., 2003; Ey et al., 2007). Studies on humans have likewise found that pitch is a poor predictor of body size (Pisanski et al., 2014). In a meta-analysis, Pisanski et al. (2014) found that, after controlling for sex, pitch explained at most 2% of the variance in body size. In addition, some studies of adult men and women do not support a significant correlation between pitch and body size (González, 2004; Rendall et al., 2005; Evans et al., 2006).

Formant frequencies and body size

Formant frequencies may provide more clues about body size than pitch. Unlike the vocal folds, the length of the vocal tract is largely constrained by the skeletal structures that surround it, namely the length of the neck and the size of the skull; in turn, these structural features are closely tied to overall body size. Three studies have demonstrated a correlation between formant frequencies and adult height in males (Rendall et al., 2005; Bruckert et al., 2006; Evans et al., 2006), and a similar correlation was found in a study of a female sample (Collins and Missing, 2003). A meta-analysis found that formant frequencies of the human voice explained approximately 10% of the variance in body size (Pisanski et al., 2014). Furthermore, Pisanski et al. (2014) suggest that, given a sufficient sample size, formant frequencies can explain variation in female body size, and women’s voices may also carry information about their waist-to-hip ratio (Pisanski et al., 2016b). This finding is consistent with the growing literature indicating that the “hourglass” shape of a woman’s body is a key indicator of her age, fertility, and health status (Singh and Singh, 2011; Pisanski and Feinberg, 2013), so an attractive female voice may suggest a reproductive advantage.
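To put the "variance explained" figures in this section on a common scale, note that for a single predictor the share of variance explained is the squared correlation coefficient. The sketch below performs only this arithmetic; the resulting correlation magnitudes are back-calculated illustrations, not values reported by the studies cited, and the function names are hypothetical.

```python
# Minimal sketch: converting between a correlation and variance explained.
# ASSUMPTION: single-predictor case, where variance explained = r squared.
from math import sqrt


def variance_explained(r):
    """Share of variance explained by a predictor with correlation r."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and 1")
    return r * r


def correlation_for_variance(share):
    """Magnitude of the correlation implied by a share of variance explained."""
    if not 0.0 <= share <= 1.0:
        raise ValueError("share must lie between 0 and 1")
    return sqrt(share)


# Figures quoted in this section: about 2% for pitch, about 10% for formants.
r_pitch = correlation_for_variance(0.02)     # roughly 0.14
r_formants = correlation_for_variance(0.10)  # roughly 0.32
```

On this scale, both cues are modest predictors of body size within a sex, with formants the stronger of the two.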

Unlike the sensory exploitation theories of sexual selection, Aung and Puts (2020) suggested that the long-standing tendency to associate low-frequency vocalizations with larger body size in vertebrates may have a role in assessing body size among and within species in natural competition (Morton, 1977). Furthermore, in response to the question of whether organisms can be deceived by volitional vocalizations used to exaggerate body size, game theory models theorize that such deceptive signals must be rare in order for the signal system to remain evolutionarily stable (Grafen, 1990). Otherwise, the following two outcomes may occur: either the organism evolves to ignore the signal altogether; or the organism evolves to be able to distinguish deceptive signals from real signals that provide accurate physiological signs (Garcia and Ramirez, 2005; Pisanski and Reby, 2021). In studies on humans, it has been found that men with lower voices tend to earn more, win more political elections, have more sexual partners and leave more offspring (Apicella et al., 2007; Mayew et al., 2013; Klofstad, 2016). If the male voice was unrelated to physical signs, evolutionary direction should have predisposed one to ignore this signal. Then, why is it that men with lower voices are perceived to be more attractive and dominant, and in the real world, they are more successful? The most likely answer is that the voice signal is, at least in part, accurate and true (Puts et al., 2019). One recent work also provides strong evidence that some features of male voices are related to their physiology (albeit not perfectly; Pisanski and Reby, 2021).

Human voice and physique

Due to the potentially costly nature of intra-sexual conflict, individuals may prefer to reduce costly conflict by predicting each other’s physique, such as strength, fighting ability, and even social status (in the form of dominance), through non-combative approaches (e.g., appearance, voice) (Andersson, 1994). The relationship between sound and fighting ability has been studied in animals from early on. Several studies have shown that in many terrestrial mammals, such as giant pandas, sea lions, horse, deer and domestic dogs, acoustic signals can be used to determine each other’s relative position in aggressive vocalizations, especially in male competition (Reby et al., 2005; Charlton et al., 2010; Taylor et al., 2010; Charrier et al., 2011; Pitcher et al., 2015). These acoustic signals not only predict aggression toward each other, but also elicit a fight or flight response from signal receivers based on their relative combat prowess toward each other (Tibbetts and Dale, 2004; Osiejuk et al., 2007; Anderson et al., 2012). It follows that acoustic cues may contain information about individual physicality that is relevant to individual conflict, particularly intra-sexual competition.

Studies on humans have also shown that there may be a correlation between voice and strength and combat power. In a cross-cultural study, researchers found that in an American sample, Pf (formant position, a summary measure of formant frequencies) predicted individual upper limb strength, while F0-SD (the standard deviation of F0) predicted self-reported physical aggression and was slightly negatively correlated with arm strength (Puts et al., 2012). In addition, F0 declines sharply with male puberty and shows a high degree of sexual dimorphism in adulthood, and thus also provides information about variables such as strength. For example, F0 explains over 60% of the variance in grip strength in a mixed-sex sample of US adult college students, as well as over 70% of the variance in upper limb strength in a sample of Bolivian adolescent males (Aung and Puts, 2020). Sell et al. (2010) recruited listeners to assess the upper limb strength of voice providers from a sample of eight from four different language groups. The results showed that people could assess each other’s strength consistently, that their judgments were accurate whether they were assessing familiar or unfamiliar languages, and that they were more accurate in assessing males than females (Sell et al., 2010). In addition, Raine et al. (2018) showed that listeners were also able to judge the strength and height of subjects relative to their own by assessing their aggressive speech or threatening roars; for example, when assessing threatening roars, male listeners accurately identified subjects who were taller and stronger than themselves in 88% of the experiments. However, this study did not examine the correlation between strength and acoustic parameters (Raine et al., 2018).

Some researchers have also argued that there is still insufficient evidence for a significant negative correlation between pitch and upper limb strength in men, and that previously reported correlations between pitch and upper limb strength would not be significant when corrected for multiple comparisons (Feinberg et al., 2018). One likely reason for the contradictory results of previous studies is that they differed considerably in their measurements, including the measurement of strength: upper limb strength or grip strength represents strength only in a local area and is arguably just one of the many components of strength. Therefore, assessing strength with more precise measurements, or by combining multiple measurements, is likely to reveal a stronger association with acoustic parameters.
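One simple way to combine multiple strength measurements, offered here only as an illustrative sketch and not as a method used in the studies above, is to standardize each measure and average the resulting z-scores per participant. The measure names and values below are hypothetical.

```python
# Minimal sketch: a composite strength score from several measures.
# ASSUMPTION: equal weighting of z-scored measures; all data are invented.
from statistics import mean, stdev


def z_scores(values):
    """Standardize values to mean 0 and sample standard deviation 1."""
    m, s = mean(values), stdev(values)
    if s == 0:
        raise ValueError("values must not all be identical")
    return [(v - m) / s for v in values]


def composite_strength(measures):
    """Average the z-scored measures for each participant.

    measures maps a measure name to a list with one value per participant.
    """
    standardized = [z_scores(values) for values in measures.values()]
    return [mean(per_participant) for per_participant in zip(*standardized)]


# Hypothetical measurements for three participants.
strength_measures = {
    "grip_strength_kg": [30.0, 40.0, 50.0],
    "chest_press_kg": [50.0, 60.0, 70.0],
    "leg_press_kg": [90.0, 120.0, 150.0],
}
composite = composite_strength(strength_measures)
```

A composite of this kind could then be correlated with F0 or formant measures in place of a single local measure such as grip strength; whether it actually yields stronger associations is an empirical question.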

Some research has recently begun to focus on the relationship between the human voice and signals of aggressive intent. In an experimental study, Zhang et al. (2021) showed that, at least for males, lowered pitch served primarily as a signal of aggressive intent, independent of an assessment of their own combat strength. This study suggested that although listeners are able to judge the strength or combat power of subjects from their voices, the correlation between the human voice and strength and combat power remains largely unknown after excluding some invalid or inconsistent findings (Sell et al., 2010; Puts et al., 2012; Hodges-Simeon et al., 2014; Smith et al., 2017; Han et al., 2018; Kordsmeyer et al., 2018).

Taken together, these studies raise the following questions. First, as there is an evolutionary commonality in the structure and function of other mammalian vocalizations and human spontaneous vocalizations, such as laughter (Ross et al., 2009, 2010; Bryant and Aktipis, 2014; Pisanski et al., 2016a, 2022) and infant screams of pain (Lingle et al., 2012; Lingle and Riede, 2014), does human spontaneous vocalization convey information about physiological aspects (e.g., strength) more effectively than volitional vocalization (speech)? Recent work tested this at the perceptual level and found that roar-like vocalizations increase the perceived physical strength of vocalizers relative to screams, distressed speech, and neutral speech (Raine et al., 2019; Kleisner et al., 2021). Future research can test this against actual physical condition (i.e., how well spontaneous vocalization and speech each predict real physical strength). Second, volitional vocalizations are more complex and diverse in humans than in other mammals. For example, humans can use words that exaggerate their strength or physical qualities to influence listeners’ judgments of their physiological indicators, while the content of language also constrains human non-verbal vocalization to a greater degree. Therefore, if interfering factors such as language content, motivational state, and emotional information are further controlled, will acoustic parameters convey physiological information more effectively than in previous studies? Again, this question will need to be explored in future experiments. One recent paper reviewed the literature on human nonverbal vocalization and introduced new techniques for manipulating the voice (Pisanski et al., 2022), which should help future work to control confounding variables.

Research discussion

In summary, there is a correlation between human voice characteristics and physiological signs. In terms of body size, formant frequencies are a more reliable predictor of human body size than pitch; in terms of physique, the human voice may be correlated with strength and combat power, but this remains controversial; and in terms of sex hormones, sex hormone levels affect variation in the human voice and the perception of the voice. While these studies have provided further insight into the human voice and physiological signs, some conclusions remain divergent and need to be further expanded, deepened, and refined. In addition, the consistency of social evaluations of the voice suggests that they are based on certain biological characteristics; for example, a low voice is often perceived as coming from a tall and powerful person, and is therefore easily perceived as a highly dominant voice (Banai et al., 2017; Han et al., 2021). It is expected that the correspondence between the social evaluation of the voice and its physiological basis will receive increasing attention.

At the same time, voice research is comprehensive and multidisciplinary in nature, spanning biology (anatomy, physiology, neurology), psychology (cognitive, developmental, cross-cultural, experimental, social), ethology (including primatology), anthropology, bioacoustics, communication, and linguistics. There has been considerable interest, and a gradual rise in research by scholars from different disciplines, in key topics such as the physiological mechanisms of voice control and modulation, the cultural and environmental factors affecting voice modulation, the evolutionary origins and adaptive functions of voice modulation, and the social functions of voice modulation. To date, however, most of the research has been scattered across specialist journals on a variety of topics, and there is a lack of cross-disciplinary dialog as well as of compilations that bring specialized fields together. Recently, Royal Society Publishing produced a multidisciplinary special issue on the topic of “Voice modulation: from origin and mechanism to social impact” (Leongómez et al., 2021). In the future, interdisciplinary collaboration could open further avenues of dialog and bridge the blind spots between disciplines, allowing researchers to transcend traditional disciplinary boundaries and laying a lasting, interdisciplinary foundation for the field of voice research.

In addition, there are a number of aspects of current voice-related research that could be further strengthened.

The need to control demographic factors

Through voice training, actors and voice imitators can significantly raise or lower the acoustic parameters of their voices (F0 and formant frequencies; Kreiman and Sidtis, 2011). For example, the political figure Margaret Thatcher underwent a long period of voice coaching to lower the frequency of her voice in order to present a more authoritative, leader-like image (Karpf, 2006). A growing body of research suggests that people who have consciously trained their voices often spontaneously modulate their voices in everyday communication situations (e.g., dating and job interviews) as a way to gain social advantage (Pisanski et al., 2016a). Social factors, including culture and gender, are also important. In a voice imitation task with boys and girls (aged 8–10 years), Cartei et al. (2022) found that children spontaneously masculinized or feminized their voices by lowering or raising their pitch, depending on whether the person they were talking to was stereotypically masculine (rugby) or feminine (ballet), suggesting that volitional voice modulation may emerge early in childhood development. Therefore, how an individual’s voice varies across occupations, ages, and ecological contexts, and whether such voices are perceived differently from those of the general population, needs to be further explored.

The need to study culturally diverse groups of subjects

Most participants in past studies have been Westerners, but listeners from different ethnic and cultural backgrounds may perceive the attractiveness and dominance of a voice differently. At the same time, the voice carries important ‘dynamic’ information, such as regional (accent) or ethnicity-specific articulation patterns, which allows listeners to identify physical and psychological characteristics (e.g., trustworthiness) more accurately from the voice (Kreiman, 1997). In addition, voices contain emotional information (e.g., anger and sadness), and groups from different backgrounds may differ in how they recognize emotion in voices. In a cross-cultural study of laughter perception, Kamiloğlu et al. (2022) compared data from Dutch and Japanese listeners and found that listeners from both cultures perceived spontaneous laughter as more positive than volitional laughter. Moreover, listeners could identify whether laughter was produced by speakers from their own culture, suggesting that non-verbal information in the human voice can encode cultural identity (Kamiloğlu et al., 2022). Future studies are recommended to examine individuals from different cultures (especially speakers of tonal versus non-tonal languages) to investigate how cultural differences affect voice perception.

The need to improve experimental design

In previous studies, most experiments have manipulated one acoustic parameter at a time, and how different acoustic parameters interact in shaping listeners’ judgments has not been well documented (Schild et al., 2020). Because previous studies have focused on linear relationships, it also remains largely unclear whether acoustic parameters have curvilinear effects on perception (Puts et al., 2012). At the same time, experimenter effects and Hawthorne effects are difficult to avoid in experimental designs, and linguistic content and motivation can unconsciously influence non-verbal factors. Future research should therefore examine how different acoustic parameters interact in natural speech and clarify the relationships among them, and should use more contextualized language content in experiments to assess the influence of non-verbal factors accurately.
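
As an illustration of what testing for interactive and curvilinear effects could look like, the following minimal sketch fits listener ratings to two acoustic parameters with a squared term and an interaction term. It is not taken from any of the studies reviewed here: the data are synthetic, the predictors are assumed to be standardized F0 and formant measures, the coefficients in `TRUE_COEF` are arbitrary, and all function and variable names are hypothetical.

```python
import random


def design_row(f0, formants):
    """Predictors for one voice: intercept, F0, formants, F0 squared, F0 x formants."""
    return [1.0, f0, formants, f0 * f0, f0 * formants]


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_least_squares(rows, y):
    """Ordinary least squares via the normal equations (X'X) beta = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


# Synthetic data only: standardized F0 and formant values for 500 imaginary voices.
random.seed(0)
TRUE_COEF = [4.0, -0.5, -0.3, -0.2, 0.15]  # arbitrary values chosen for the demo
voices = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(500)]
rows = [design_row(f0, formants) for f0, formants in voices]
ratings = [
    sum(b * x for b, x in zip(TRUE_COEF, row)) + random.gauss(0, 0.1)
    for row in rows
]

coef = fit_least_squares(rows, ratings)
names = ["intercept", "F0", "formants", "F0^2", "F0 x formants"]
for name, value in zip(names, coef):
    print(f"{name:>14}: {value:+.3f}")
```

In such a model, a non-zero coefficient on the squared term would indicate a curvilinear effect of F0, and a non-zero coefficient on the product term would indicate that the effect of F0 depends on the formant measure. Real rating data would additionally call for a model that accounts for repeated measures across listeners and voices.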

Possible effects of verbal emotional messages and motivational states

People convey emotional information in verbal communication, and this information usually has a positive or negative valence. Changes in vocal emotional information are often produced directly by changes in acoustic parameters such as pitch, formant frequencies, and volume (Zheng et al., 2017). Furthermore, speakers in different motivational states may also produce different acoustic parameters; for example, the voice may show different properties in a mate-choice motivational state than in a self-preservation motivational state (Puts, 2006). Pinheiro et al. (2021) studied two types of non-verbal vocalizations, crying and laughing, and tested whether volitional and spontaneous versions of these vocalizations affect listeners’ perceptions of the speaker. The results showed that listeners were able to discriminate between spontaneous and volitional vocalizations and that spontaneous vocalizations were considered more trustworthy than volitional vocalizations (Pinheiro et al., 2021). This may partly explain why some earlier studies did not find a consistent correlation between acoustic parameters and physiological signs: the correlation is likely to depend to some extent on emotional and motivational states. Future studies should carefully distinguish these states and investigate whether non-verbal vocalizations (e.g., martial arts vocalizations) convey physiological cues more reliably.

The need for research at the neural level

Many species, including humans, possess the ability to perceive vocalizations. Human infants cannot yet speak or understand language, but they are able to recognize sounds. In experiments measuring newborns’ responses to different sounds, newborns were able to recognize sounds and showed a preference for their mothers’ voices (DeCasper and Fifer, 1980). This suggests that the ability to perceive vocalizations is likely to be acquired before birth (Kisilevsky et al., 2003). Neuroimaging findings also suggest that neurocognitive models of voice perception are largely similar to those of face perception, and that different types of voice information can be processed in partially separated functional pathways (Belin et al., 2004). In an experiment on voice imitation, Waters et al. (2021) simultaneously recorded vocal anatomy and brain function in trained singers and non-singer controls. Using real-time recordings of these changes, they found that singers adjusted their speech more accurately in a task requiring them to imitate volume level and pitch, and showed stronger laryngeal neural associations within the right dorsal somatosensory cortex, suggesting a common neural basis for enhanced vocal control in speech and song (Waters et al., 2021). Future research should use advanced brain imaging techniques to examine how perceived vocalizations affect neural activity, in order to provide a neural-level explanation for listeners’ cognitive judgments of voices.

Author contributions

SC, XLei, and CH contributed to the conception of the paper. SC and CH wrote the first draft of the manuscript. SW, XLiu, BW, and RW contributed to manuscript revision and read and approved the submitted version. All authors contributed to the article and approved the submitted version.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Footnotes

1. Non-tonal languages: languages that do not use pitch to distinguish word meaning or grammatical meaning, such as English, French, and German.

References

Almeida, S., Rato, L., Sousa, M., Alves, M. G., and Oliveira, P. F. (2017). Fertility and sperm quality in the aging male. Curr. Pharm. Des. 23, 4429–4437. doi: 10.2174/1381612823666170503150313

Anderson, R. C., Searcy, W. A., Hughes, M., and Nowicki, S. (2012). The receiver-dependent cost of soft song: a signal of aggressive intent in songbirds. Anim. Behav. 83, 1443–1448. doi: 10.1016/j.anbehav.2012.03.016

Andersson, M. (1994). Sexual Selection. Princeton, NJ: Princeton University Press.

Apicella, C. L., Feinberg, D. R., and Marlowe, F. W. (2007). Voice pitch predicts reproductive success in male hunter-gatherers. Biol. Lett. 3, 682–684. doi: 10.1098/rsbl.2007.0410

Atkinson, J., Pipitone, R. N., Sorokowska, A., Sorokowski, P., Mberira, M., Bartels, A., et al. (2012). Voice and handgrip strength predict reproductive success in a group of indigenous African females. PLoS One 7:e41811. doi: 10.1371/journal.pone.0041811

Aung, T., and Puts, D. (2020). Voice pitch: a window into the communication of social power. Curr. Opin. Psychol. 33, 154–161. doi: 10.1016/j.copsyc.2019.07.028

Banai, I. P., Banai, B., and Bovan, K. (2017). Vocal characteristics of presidential candidates can predict the outcome of actual elections. Evol. Hum. Behav. 38, 309–314. doi: 10.1016/j.evolhumbehav.2016.10.012

Belin, P., Fecteau, S., and Bedard, C. (2004). Thinking the voice: neural correlates of voice perception. Trends Cogn. Sci. 8, 129–135. doi: 10.1016/j.tics.2004.01.008

Bruckert, L., Liénard, J.-S., Lacroix, A., Kreutzer, M., and Leboucher, G. (2006). Women use voice parameters to assess men’s characteristics. Proc. R. Soc. B Biol. Sci. 273, 83–89. doi: 10.1098/rspb.2005.3265

Bryant, G. A., and Aktipis, C. A. (2014). The animal nature of spontaneous human laughter. Evol. Hum. Behav. 35, 327–335. doi: 10.1016/j.evolhumbehav.2014.03.003

Buss, D. M. (2019). Evolutionary Psychology: The New Science of the Mind. London: Routledge.

Campbell, I. C., Walker, R., Riad-Fahmy, D., Wilson, D., and Griffiths, K. (1982). Circadian rhythms of testosterone and cortisol in saliva: effects of activity-phase shifts and continuous daylight. Chronobiologia 9, 89–96.

Cartei, V., Cowles, H. W., and Reby, D. (2012). Spontaneous voice gender imitation abilities in adult speakers. PLoS One 7:e31353. doi: 10.1371/journal.pone.0031353

Cartei, V., Reby, D., Garnham, A., Oakhill, J., and Banerjee, R. (2022). Peer audience effects on children’s vocal masculinity and femininity. Philos. Trans. R. Soc. B 377:20200397. doi: 10.1098/rstb.2020.0397

Charlton, B. D., Zhihe, Z., and Snyder, R. J. (2010). Giant pandas perceive and attend to formant frequency variation in male bleats. Anim. Behav. 79, 1221–1227. doi: 10.1016/j.anbehav.2010.02.018

Charrier, I., Ahonen, H., and Harcourt, R. G. (2011). What makes an Australian sea lion (Neophoca cinerea) male’s bark threatening? J. Comp. Psychol. 125, 385–392. doi: 10.1037/a0024513

Collins, S. A., and Missing, C. (2003). Vocal and visual attractiveness are related in women. Anim. Behav. 65, 997–1004. doi: 10.1006/anbe.2003.2123

Dabbs, J. M. Jr., and Mallinger, A. (1999). High testosterone levels predict low voice pitch among men. Personal. Individ. Differ. 27, 801–804. doi: 10.1016/S0191-8869(98)00272-4

DeCasper, A. J., and Fifer, W. P. (1980). Of human bonding: newborns prefer their mothers’ voices. Science 208, 1174–1176. doi: 10.1126/science.7375928

Eberhard, J. R., and Ewald, P. W. (1994). Food availability, intrusion pressure and territory size: an experimental study of Anna’s hummingbirds (Calypte anna). Behav. Ecol. Sociobiol. 34, 11–18. doi: 10.1007/BF00175453

Evans, S., Neave, N., and Wakelin, D. (2006). Relationships between vocal characteristics and body size and shape in human males: an evolutionary explanation for a deep male voice. Biol. Psychol. 72, 160–163. doi: 10.1016/j.biopsycho.2005.09.003

Evans, S., Neave, N., Wakelin, D., and Hamilton, C. (2008). The relationship between testosterone and vocal frequencies in human males. Physiol. Behav. 93, 783–788. doi: 10.1016/j.physbeh.2007.11.033

Ey, E., Pfefferle, D., and Fischer, J. (2007). Do age- and sex-related variations reliably reflect body size in non-human primate vocalizations? A review. Primates 48, 253–267. doi: 10.1007/s10329-006-0033-y

Fant, G. (1960). Acoustic Theory of Speech Production. The Hague, Netherlands: Mouton.

Feinberg, D. R. (2008). Are human faces and voices ornaments signaling common underlying cues to mate value? Evol. Anthropol. 17, 112–118. doi: 10.1002/evan.20166

Feinberg, D. R., Jones, B. C., and Armstrong, M. M. (2018). Sensory exploitation, sexual dimorphism, and human voice pitch. Trends Ecol. Evol. 33, 901–903. doi: 10.1016/j.tree.2018.09.007

Fink, B., Neave, N., and Manning, J. (2003). Second to fourth digit ratio, body mass index, waist-to-hip ratio, and waist-to-chest ratio: their relationships in heterosexual men and women. Ann. Hum. Biol. 30, 728–738. doi: 10.1080/03014460310001620153

Fitch, W. T. (1997). Vocal tract length and formant frequency dispersion correlate with body size in rhesus macaques. J. Acoust. Soc. Am. 102, 1213–1222. doi: 10.1121/1.421048

Fitch, W. T. (2000). The evolution of speech: a comparative review. Trends Cogn. Sci. 4, 258–267. doi: 10.1016/S1364-6613(00)01494-7

Fouquet, M., Pisanski, K., Mathevon, N., and Reby, D. (2016). Seven and up: individual differences in male voice fundamental frequency emerge before puberty and remain stable throughout adulthood. R. Soc. Open Sci. 3:160395. doi: 10.1098/rsos.160395

Garcia, C. M., and Ramirez, E. (2005). Evidence that sensory traps can evolve into honest signals. Nature 434, 501–505. doi: 10.1038/nature03363

González, J. (2004). Formant frequencies and body size of speaker: a weak relationship in adult humans. J. Phon. 32, 277–287. doi: 10.1016/S0095-4470(03)00049-4

Grafen, A. (1990). Biological signals as handicaps. J. Theor. Biol. 144, 517–546. doi: 10.1016/S0022-5193(05)80088-8

Han, C., Wang, H., Fasolt, V., Hahn, A. C., Holzleitner, I. J., Lao, J., et al. (2018). No clear evidence for correlations between handgrip strength and sexually dimorphic acoustic properties of voices. Am. J. Hum. Biol. 30:e23178. doi: 10.1002/ajhb.23178

Han, C., Watkins, C. D., Nan, Y., Ou, J., Lei, X., Li, X., et al. (2021). Exogenous testosterone decreases men’s sensitivity to vocal cues of male dominance. Horm. Behav. 127:104871. doi: 10.1016/j.yhbeh.2020.104871

Harries, M. L., Walker, J. M., Williams, D. M., Hawkins, S., and Hughes, I. (1997). Changes in the male voice at puberty. Arch. Dis. Child. 77, 445–447. doi: 10.1136/adc.77.5.445

Hill, A. K., Hunt, J., Welling, L. L., Cárdenas, R. A., Rotella, M. A., Wheatley, J. R., et al. (2013). Quantifying the strength and form of sexual selection on men’s traits. Evol. Hum. Behav. 34, 334–341. doi: 10.1016/j.evolhumbehav.2013.05.004

Hodges-Simeon, C. R., Gaulin, S. J., and Puts, D. A. (2011). Voice correlates of mating success in men: examining “contests” versus “mate choice” modes of sexual selection. Arch. Sex. Behav. 40, 551–557. doi: 10.1007/s10508-010-9625-0

Hodges-Simeon, C. R., Gurven, M., Puts, D. A., and Gaulin, S. J. (2014). Vocal fundamental and formant frequencies are honest signals of threat potential in peripubertal males. Behav. Ecol. 25, 984–988. doi: 10.1093/beheco/aru081

Hollien, H. (2014). Vocal fold dynamics for frequency change. J. Voice 28, 395–405. doi: 10.1016/j.jvoice.2013.12.005

Jones, B. C., Feinberg, D. R., DeBruine, L. M., Little, A. C., and Vukovic, J. (2010). A domain-specific opposite-sex bias in human preferences for manipulated voice pitch. Anim. Behav. 79, 57–62. doi: 10.1016/j.anbehav.2009.10.003

Kadakia, S., Carlson, D., and Sataloff, R. T. (2013). The effect of hormones on the voice. J. Sing. 69, 571–574.

Kamiloğlu, R. G., Tanaka, A., Scott, S. K., and Sauter, D. A. (2022). Perception of group membership from spontaneous and volitional laughter. Philos. Trans. R. Soc. B 377:20200404. doi: 10.1098/rstb.2020.0404

Karpf, A. (2006). The Human Voice: How This Extraordinary Instrument Reveals Essential Clues About Who We Are. New York, NY: Bloomsbury Publishing USA.

Keenleyside, M. H., Rangeley, R. W., and Kuppers, B. U. (1985). Female mate choice and male parental defense behaviour in the cichlid fish Cichlasoma nigrofasciatum. Can. J. Zool. 63, 2489–2493. doi: 10.1139/z85-368

Kim, J. M., Shin, S. C., Park, G. C., Lee, J. C., Jeon, Y. K., Ahn, S. J., et al. (2020). Effect of sex hormones on extracellular matrix of lamina propria in rat vocal fold. Laryngoscope 130, 732–740. doi: 10.1002/lary.28086

Kirgezen, T., Sunter, A. V., Yigit, O., and Huq, G. E. (2017). Sex hormone receptor expression in the human vocal fold subunits. J. Voice 31, 476–482. doi: 10.1016/j.jvoice.2016.11.005

Kisilevsky, B. S., Hains, S. M., Lee, K., Xie, X., Huang, H., Ye, H. H., et al. (2003). Effects of experience on fetal voice recognition. Psychol. Sci. 14, 220–224. doi: 10.1111/1467-9280.02435

Kleisner, K., Leongómez, J. D., Pisanski, K., Fiala, V., Cornec, C., Groyecka-Bernard, A., et al. (2021). Predicting strength from aggressive vocalizations versus speech in African bushland and urban communities. Philos. Trans. R. Soc. B 376:20200403. doi: 10.1098/rstb.2020.0403

Klofstad, C. A. (2016). Candidate voice pitch influences election outcomes. Polit. Psychol. 37, 725–738. doi: 10.1111/pops.12280

Kordsmeyer, T. L., Hunt, J., Puts, D. A., Ostner, J., and Penke, L. (2018). The relative importance of intra- and intersexual selection on human male sexually dimorphic traits. Evol. Hum. Behav. 39, 424–436. doi: 10.1016/j.evolhumbehav.2018.03.008

Kreiman, J. (1997). “Listening to voices: theory and practice in voice perception research,” in Talker Variability in Speech Processing. 1st Edn. eds. K. Johnson and J. W. Mullennix (Academic Press), 85–108.

Kreiman, J., and Sidtis, D. (2011). Foundations of Voice Studies: An Interdisciplinary Approach To Voice Production and Perception. Hoboken, NJ: John Wiley & Sons.

Leongómez, J. D., Pisanski, K., Reby, D., Sauter, D., Lavan, N., Perlman, M., et al. (2021). Voice modulation: from origin and mechanism to social impact. Philos. Trans. R. Soc. B 376:20200386. doi: 10.1098/rstb.2020.0386

Lingle, S., and Riede, T. (2014). Deer mothers are sensitive to infant distress vocalizations of diverse mammalian species. Am. Nat. 184, 510–522. doi: 10.1086/677677

Lingle, S., Wyman, M. T., Kotrba, R., Teichroeb, L. J., and Romanow, C. A. (2012). What makes a cry a cry? A review of infant distress vocalizations. Curr. Zool. 58, 698–726. doi: 10.1093/czoolo/58.5.698

Markova, D., Richer, L., Pangelinan, M., Schwartz, D. H., Leonard, G., Perron, M., et al. (2016). Age- and sex-related variations in vocal-tract morphology and voice acoustics during adolescence. Horm. Behav. 81, 84–96. doi: 10.1016/j.yhbeh.2016.03.001

Mayew, W. J., Parsons, C. A., and Venkatachalam, M. (2013). Voice pitch and the labor market success of male chief executive officers. Evol. Hum. Behav. 34, 243–248. doi: 10.1016/j.evolhumbehav.2013.03.001

Meuser, W., and Nieschlag, E. (1977). Sexualhormone und Stimmlage des Mannes. DMW-Deutsche Medizinische Wochenschrift 102, 261–264. doi: 10.1055/s-0028-1104875

Morton, E. S. (1977). On the occurrence and significance of motivation-structural rules in some bird and mammal sounds. Am. Nat. 111, 855–869. doi: 10.1086/283219

Nimje, P. S., Mayer, M., Zedrosser, A., Sæbø, M., and Rosell, F. (2021). Territory acquisition and mate choice in a monogamous mammal, the Eurasian beaver. Anim. Behav. 178, 165–173. doi: 10.1016/j.anbehav.2021.06.015

Osiejuk, S., Łosak, K. T., and Dale, S. (2007). Cautious response of inexperienced birds to conventional signal of stronger threat. J. Avian Biol. 38, 644–649. doi: 10.1111/j.2007.0908-8857.04255.x

Pedersen, M., Møller, S., Krabbe, S., and Bennett, P. (1986). Fundamental voice frequency measured by electroglottography during continuous speech. A new exact secondary sex characteristic in boys in puberty. Int. J. Pediatr. Otorhinolaryngol. 11, 21–27. doi: 10.1016/S0165-5876(86)80024-6

Pinheiro, A. P., Anikin, A., Conde, T., Sarzedas, J., Chen, S., Scott, S. K., et al. (2021). Emotional authenticity modulates affective and social trait inferences from voices. Philos. Trans. R. Soc. B 376:20200402. doi: 10.1098/rstb.2020.0402

Pipitone, R. N., and Gallup, G. G. Jr. (2008). Women’s voice attractiveness varies across the menstrual cycle. Evol. Hum. Behav. 29, 268–274. doi: 10.1016/j.evolhumbehav.2008.02.001

Pisanski, K., and Bryant, G. A. (2019). “The evolution of voice perception,” in The Oxford Handbook of Voice Studies. eds. N. Eidsheim and K. Meizel (Oxford University Press), 269–300.

Pisanski, K., Bryant, G. A., Cornec, C., Anikin, A., and Reby, D. (2022). Form follows function in human nonverbal vocalisations. Ethol. Ecol. Evol. 34, 303–321. doi: 10.1080/03949370.2022.2026482

Pisanski, K., Cartei, V., McGettigan, C., Raine, J., and Reby, D. (2016a). Voice modulation: a window into the origins of human vocal control? Trends Cogn. Sci. 20, 304–318. doi: 10.1016/j.tics.2016.01.002

Pisanski, K., and Feinberg, D. R. (2013). Cross-cultural variation in mate preferences for averageness, symmetry, body size, and masculinity. Cross-Cult. Res. 47, 162–197. doi: 10.1177/1069397112471806

Pisanski, K., Feinberg, D., Oleszkiewicz, A., and Sorokowska, A. (2017). Voice cues are used in a similar way by blind and sighted adults when assessing women’s body size. Sci. Rep. 7, 1–6. doi: 10.1038/s41598-017-10470-3

Pisanski, K., Fraccaro, P. J., Tigue, C. C., O’Connor, J. J., Röder, S., Andrews, P. W., et al. (2014). Vocal indicators of body size in men and women: a meta-analysis. Anim. Behav. 95, 89–99. doi: 10.1016/j.anbehav.2014.06.011

Pisanski, K., Jones, B. C., Fink, B., O’Connor, J. J., DeBruine, L. M., Röder, S., et al. (2016b). Voice parameters predict sex-specific body morphology in men and women. Anim. Behav. 112, 13–22. doi: 10.1016/j.anbehav.2015.11.008

Pisanski, K., and Reby, D. (2021). Efficacy in deceptive vocal exaggeration of human body size. Nat. Commun. 12, 1–9. doi: 10.1038/s41467-021-21008-7

Pitcher, B. J., Briefer, E. F., and McElligott, A. G. (2015). Intrasexual selection drives sensitivity to pitch, formants and duration in the competitive calls of fallow bucks. BMC Evol. Biol. 15, 1–13. doi: 10.1186/s12862-015-0429-7

Puts, D. A. (2006). Cyclic variation in women’s preferences for masculine traits. Hum. Nat. 17, 114–127. doi: 10.1007/s12110-006-1023-x

Puts, D. (2016). Human sexual selection. Curr. Opin. Psychol. 7, 28–32. doi: 10.1016/j.copsyc.2015.07.011

Puts, D. A., Apicella, C. L., and Cárdenas, R. A. (2012). Masculine voices signal men’s threat potential in forager and industrial societies. Proc. R. Soc. B Biol. Sci. 279, 601–609. doi: 10.1098/rspb.2011.0829

Puts, D. A., and Aung, T. (2019). Does men’s voice pitch signal formidability? A reply to Feinberg. Trends Ecol. Evol. 34, 189–190. doi: 10.1016/j.tree.2018.12.004

Puts, D. A., Doll, L. M., and Hill, A. K. (2014). “Sexual selection on human voices” in Evolutionary Perspectives on Human Sexual Psychology and Behavior (Berlin: Springer), 69–86.

Puts, D. A., Gaulin, S. J., and Verdolini, K. (2006). Dominance and the evolution of sexual dimorphism in human voice pitch. Evol. Hum. Behav. 27, 283–296. doi: 10.1016/j.evolhumbehav.2005.11.003

Puts, D. A., Hill, A. K., Bailey, D. H., Walker, R. S., Rendall, D., Wheatley, J. R., et al. (2016). Sexual selection on male vocal fundamental frequency in humans and other anthropoids. Proc. R. Soc. B Biol. Sci. 283:20152830. doi: 10.1098/rspb.2015.2830

Puts, D. A., Jones, B. C., and DeBruine, L. M. (2012). Sexual selection on human faces and voices. J. Sex Res. 49, 227–243. doi: 10.1080/00224499.2012.658924

Raine, J., Pisanski, K., Bond, R., Simner, J., and Reby, D. (2019). Human roars communicate upper-body strength more effectively than do screams or aggressive and distressed speech. PLoS One 14:e0213034. doi: 10.1371/journal.pone.0213034

Raine, J., Pisanski, K., Oleszkiewicz, A., Simner, J., and Reby, D. (2018). Human listeners can accurately judge strength and height relative to self from aggressive roars and speech. Iscience 4, 273–280. doi: 10.1016/j.isci.2018.05.002

Reby, D., McComb, K., Cargnelutti, B., Darwin, C., Fitch, W. T., and Clutton-Brock, T. (2005). Red deer stags use formants as assessment cues during intrasexual agonistic interactions. Proc. R. Soc. B Biol. Sci. 272, 941–947. doi: 10.1098/rspb.2004.2954

Rendall, D., Kollias, S., Ney, C., and Lloyd, P. (2005). Pitch (F0) and formant profiles of human vowels and vowel-like baboon grunts: the role of vocalizer body size and voice-acoustic allometry. J. Acoust. Soc. Am. 117, 944–955. doi: 10.1121/1.1848011

Rendall, D., Vokey, J. R., and Nemeth, C. (2007). Lifting the curtain on the wizard of Oz: biased voice-based impressions of speaker size. J. Exp. Psychol. Hum. Percept. Perform. 33, 1208–1219. doi: 10.1037/0096-1523.33.5.1208

Riede, T., and Fitch, T. (1999). Vocal tract length and acoustics of vocalization in the domestic dog (Canis familiaris). J. Exp. Biol. 202, 2859–2867. doi: 10.1242/jeb.202.20.2859

Ross, M. D., Owren, M. J., and Zimmermann, E. (2009). Reconstructing the evolution of laughter in great apes and humans. Curr. Biol. 19, 1106–1111. doi: 10.1016/j.cub.2009.05.028

Ross, M. D., Owren, M. J., and Zimmermann, E. (2010). The evolution of laughter in great apes and humans. Commun. Integr. Biol. 3, 191–194. doi: 10.4161/cib.3.2.10944

Schild, C., Aung, T., Kordsmeyer, T. L., Cardenas, R. A., Puts, D. A., and Penke, L. (2020). Linking human male vocal parameters to perceptions, body morphology, strength and hormonal profiles in contexts of sexual selection. Sci. Rep. 10, 1–16. doi: 10.1038/s41598-020-77940-z

Sell, A., Bryant, G. A., Cosmides, L., Tooby, J., Sznycer, D., Von Rueden, C., et al. (2010). Adaptations in humans for assessing physical strength from the voice. Proc. R. Soc. B Biol. Sci. 277, 3509–3518. doi: 10.1098/rspb.2010.0769

Singh, D., and Singh, D. (2011). Shape and significance of feminine beauty: an evolutionary perspective. Sex Roles 64, 723–731. doi: 10.1007/s11199-011-9938-z

Skrinda, I., Krama, T., Kecko, S., Moore, F. R., Kaasik, A., Meija, L., et al. (2014). Body height, immunity, facial and vocal attractiveness in young men. Naturwissenschaften 101, 1017–1025. doi: 10.1007/s00114-014-1241-8

Smith, K. M., Olkhov, Y. M., Puts, D. A., and Apicella, C. L. (2017). Hadza men with lower voice pitch have a better hunting reputation. Evol. Psychol. 15:1474704917740466. doi: 10.1177/1474704917740466

Strauss, R. H., Liggett, M. T., and Lanese, R. R. (1985). Anabolic steroid use and perceived effects in ten weight-trained women athletes. JAMA 253, 2871–2873. doi: 10.1001/jama.1985.03350430083032

Taylor, A., Reby, D., and McComb, K. (2010). Size communication in domestic dog, Canis familiaris, growls. Anim. Behav. 79, 205–210. doi: 10.1016/j.anbehav.2009.10.030

Tibbetts, E. A., and Dale, J. (2004). A socially enforced signal of quality in a paper wasp. Nature 432, 218–222. doi: 10.1038/nature02949

Titze, I. R. (2011). Vocal fold mass is not a useful quantity for describing F0 in vocalization. J. Speech Lang. Hear. Res. 54, 520–522. doi: 10.1044/1092-4388(2010/09-0284)

Titze, I., and Martin, D. (1994). Principles of Voice Production. Hoboken, NJ: Prentice Hall.

Titze, I. R., and Martin, D. W. (1998). Principles of voice production. Acoust. Soc. Am. 104:1148.

Venners, S. A., Liu, X., Perry, M. J., Korrick, S. A., Li, Z., Yang, F., et al. (2006). Urinary estrogen and progesterone metabolite concentrations in menstrual cycles of fertile women with non-conception, early pregnancy loss or clinical pregnancy. Hum. Reprod. 21, 2272–2280. doi: 10.1093/humrep/del187

Waters, S., Kanber, E., Lavan, N., Belyk, M., Carey, D., Cartei, V., et al. (2021). Singers show enhanced performance and neural representation of vocal imitation. Philos. Trans. R. Soc. B 376:20200399. doi: 10.1098/rstb.2020.0399

Zhang, J., Hodges-Simeon, C., Gaulin, S. J., and Reid, S. A. (2021). Pitch lowering enhances men’s perceived aggressive intent, not fighting ability. Evol. Hum. Behav. 42, 51–60. doi: 10.1016/j.evolhumbehav.2020.07.007

Zheng, Y., Shang, J., Li, B., Liang, Y., He, J., You, Y., et al. (2017). The factors affecting attractiveness of human voice. Adv. Psychol. Sci. 25:237. doi: 10.3724/SP.J.1042.2017.00237

Zhuang, P., and Liu, Y. (2021). Hormones and voice. J. Otolaryngol. Ophthalmol. Shandong Univ. 35, 5–9. doi: 10.6040/j.issn.1673-3770.1.2020.117

Keywords: voice, pitch, formant frequencies, sexual dimorphism, sex hormone, body size, strength

Citation: Chen S, Han C, Wang S, Liu X, Wang B, Wei R and Lei X (2022) Hearing the physical condition: The relationship between sexually dimorphic vocal traits and underlying physiology. Front. Psychol. 13:983688. doi: 10.3389/fpsyg.2022.983688

Received: 01 July 2022; Accepted: 17 October 2022; Published: 03 November 2022.

Edited by:

Lin Zhang, Ningbo University, China

Reviewed by:

Jinguang Zhang, Sun Yat-sen University, China
R. Nathan Pipitone, Florida Gulf Coast University, United States

Copyright © 2022 Chen, Han, Wang, Liu, Wang, Wei and Lei. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Xue Lei, leixue@zufe.edu.cn

†These authors share first authorship

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.