ORIGINAL RESEARCH article

Front. Psychiatry, 24 April 2023
Sec. Autism

Impairments in recognition of emotional facial expressions, affective prosody, and multisensory facilitation of response time in high-functioning autism

  • 1Department of General Psychiatry and Psychotherapy, University of Tübingen, Tübingen, Germany
  • 2School of Psychology, Fresenius University of Applied Sciences, Frankfurt am Main, Germany

Introduction: Deficits in emotional perception are common in autistic people, but it remains unclear to what extent these perceptual impairments are linked to specific sensory modalities, specific emotions, or multisensory facilitation.

Methods: This study aimed to investigate uni- and bimodal perception of emotional cues as well as multisensory facilitation in autistic (n = 18, mean age: 36.72 years, SD: 11.36) compared to non-autistic (n = 18, mean age: 36.41 years, SD: 12.18) people using auditory, visual and audiovisual stimuli.

Results: Identification accuracy was lower and response times were longer in high-functioning autistic people. These differences were independent of modality and emotion and showed large effect sizes (Cohen’s d 0.8–1.2). Furthermore, multisensory facilitation of response time was observed in non-autistic but not in autistic people, whereas the two groups did not differ in multisensory facilitation of accuracy.

Discussion: These findings suggest that processing of auditory and visual components of audiovisual stimuli is carried out more separately in autistic individuals (with equivalent temporal demands required for processing of the respective unimodal cues), but still with similar relative improvement in accuracy, whereas earlier integrative multimodal merging of stimulus properties seems to occur in non-autistic individuals.

1. Introduction

For successful social interaction, it is crucial to recognize the emotional state of our counterpart and to act accordingly. To identify the current emotional state of others, we rely on the understanding of emotional cues simultaneously conveyed via various communicational channels, including auditory (i.e., the tone of voice, also referred to as prosody) as well as visual (i.e., facial expressions, gestures) cues. This ability can, however, be compromised in certain conditions. Autism spectrum condition is characterized by persistent deficits in social communication and social interaction, restricted, repetitive patterns of behavior, interests and activities as well as atypical sensory processing (1).

Previous literature has reported that autistic people show considerable impairments in recognition of various nonverbal emotional cues, including facial expressions and prosody, compared to non-autistic people (NAP). In a recent meta-analysis, a large effect size (Hedges’ g = −0.80) was calculated for the average impairment across different types of nonverbal cues (2). Furthermore, this meta-analysis revealed moderate impairments in the processing speed within the autistic group (g = −0.61).

For a closer look at cue-specific deficits, a meta-analysis by Zhang et al. (3), focusing exclusively on recognition of affective prosody, revealed significantly reduced identification accuracy in autistic individuals with a moderate-to-large effect size (Hedges’ g = −0.63), as well as large effects regarding prolonged reaction times (g = −1.35). Cue-specific differences with a small-to-moderate effect were also identified for reduced accuracy in recognition of emotional facial expressions (4). Besides these impairments in processing unimodal cues, reduced identification accuracy has also been observed when audiovisual emotional cues are presented, with impairments of audiovisual emotion recognition ranging from moderate (5) to large (6).

It should be noted, however, that the true effects in these meta-analyses might be partially overestimated due to a publication bias favoring studies reporting significant impairments (2, 3). Conclusions regarding the relevance of these deficits for everyday life might be further compromised since most of the previous studies focused on unimodal emotion recognition. The majority evaluated either perception of auditory cues (7–16) or perception of visual cues (17–28). Furthermore, visual cue studies have mainly implemented static, pictorial stimuli. These do not reflect typical perceptual processes in everyday life, which are normally characterized by multisensory dynamic nonverbal cues of interactional partners. Astonishingly few studies have addressed emotion recognition in autistic compared to non-autistic individuals based on dynamic audiovisual emotional cues (6, 29–33). These studies are relatively inconsistent in their methodology, and the stimulus material is quite diverse. Some studies included body language (6) or verbal information (31) in the stimulus material, while others combined static visual stimuli with dynamic auditory stimuli (34), and still others focused exclusively on negative emotions (5). Thus, it remains unclear whether the degree of perceptual deficit is similar for prosody, facial expression, and bimodal audiovisual signals, or whether modality-specific or emotion-specific differences exist.

During perception of multisensory signals in healthy individuals, a facilitation of cerebral processing has been reported, such that audiovisual cues can be recognized more accurately and faster than unimodal stimuli (35). The nomenclature for this effect varies, with the term “multisensory facilitation” being used for facilitation of accuracy as well as for facilitation of response time or, depending on the study, for parameters combining the two. In the current study, we focus on multisensory facilitation (MSF) of accuracy and multisensory facilitation of response time as two distinct parameters of multisensory integration.

Compared to non-autistic people, autistic individuals have been shown to benefit significantly less from multimodal presentation, in terms of both accuracy and response time: a recent meta-analysis reported an overall effect size of −0.41, indicating a small-to-moderate effect across various stimuli (36). Multisensory facilitation of response time has frequently been investigated using simple, non-social stimuli (e.g., beeps or flashes) and comparing response times to unimodal and audiovisual presentation. In these studies, autistic individuals showed less pronounced multisensory facilitation of response time than non-autistic people (37–40).

More complex, social stimuli have been used to investigate multisensory facilitation of accuracy, mainly by focusing on speech. In healthy individuals, audiovisual presentation of speech leads to a more accurate perception of the semantic language content (41). In contrast, autistic children benefit less from receiving speech information from multiple sensory modalities (42–46). Impairments in audiovisual speech integration in autistic individuals have furthermore been related to complex audiovisual integration at the cerebral level, with low-level integrational abilities reported to remain intact (47).

Based on these results, it can be assumed that deficits in multisensory facilitation also affect perception of nonverbal emotional cues in autistic people. This has been shown in a study with affective vocalizations and respective facial expressions, in which autistic individuals benefited significantly less from the presentation of audiovisual stimuli than non-autistic people (5) during disgust and fear processing. At the cerebral level, differences in multisensory visual and auditory nonverbal emotional cue processing in autistic individuals have also been reported (48), along with a potential modulatory role of attention (49).

In summary, evidence exists for differences in multisensory (emotional) cue processing between autistic and non-autistic people. However, it remains unclear to what extent the perceptual impairments in autistic people are linked to specific sensory modalities, specific emotions, or multisensory facilitation. Therefore, in the current study, stimuli with high ecological validity were used to investigate unimodal and bimodal perception of emotional cues in autistic compared to non-autistic individuals. For this purpose, video recordings of actors were presented either bimodally (AV = emotional facial expression and affective prosody), unimodally visually (V = mute presentation, i.e., facial expression only), or unimodally auditorily (A = audio track, i.e., prosody only).

Since autism describes a large group of people with great interindividual variability, we included only people diagnosed with high-functioning early childhood autism (ICD-10: F84.0) or Asperger syndrome (ICD-10: F84.5), thereby focusing our work on autistic individuals without impairments in intellectual abilities, also referred to as high-functioning autism (HFA). Based on the evidence presented above, we hypothesized that, compared to non-autistic individuals,

1. Autistic individuals show lower accuracies in emotion recognition for each of the three modalities;

2. Autistic individuals show longer response times for each of the three modalities;

3. Autistic individuals show reduced multisensory facilitation of accuracy rates;

4. Autistic individuals show reduced multisensory facilitation of response times.

Moreover, we carried out exploratory analyses to evaluate if the extent of perceptual impairment differs between specific sensory modalities and specific emotional categories in autistic individuals regarding accuracy rates and response times compared to non-autistic individuals.

2. Materials and methods

2.1. Participants

Eighteen autistic and 18 non-autistic people participated in this study. The autistic individuals were recruited from the special outpatient consultation service for autistic adults of the Department of General Psychiatry and Psychotherapy at the University of Tübingen. All autistic people were diagnosed with high-functioning early childhood autism (F84.0) or Asperger syndrome (F84.5) according to the ICD-10 criteria (50) by fully trained psychiatrists based on extensive clinical examination. This included a comprehensive anamnesis and an evaluation of interactional behavior, structured self-rating instruments completed by the autistic individuals (the Autism-Spectrum Quotient, AQ (51); the Empathy Quotient, EQ (52); the Multiple-Choice Vocabulary Intelligence Test, MWT-B (53); and the Beck Depression Inventory, BDI (54)), and questionnaires for parents or close relatives able to report firsthand about the participant’s behavior during the first decade of life (the Social Responsiveness Scale, SRS (55); the Social Communication Questionnaire, SCQ/FSK (56); and the Marburg Rating Scale for Asperger’s Syndrome, MBAS (57)). EQ, SRS, SCQ/FSK, and MBAS were assessed only for autistic people during the diagnostic process to achieve high diagnostic confidence; since these parameters were not assessed for non-autistic people, these scores are not reported here. The BDI was assessed to evaluate whether symptoms of depression (a common comorbidity of autism) were present in our study cohort. The MWT-B was used to approximate IQ, and the Self-Report Emotional Intelligence Test (SREIT) (58) was used as a measure of emotional intelligence to compare self-estimated emotional intelligence with the results of our analyses.

Non-autistic people were recruited from the pool of employees at the Medical Center of the University of Tübingen and from their acquaintances. The non-autistic people were selected to match the autistic individuals in terms of age, gender, IQ, and educational level. The comparison group was screened with the AQ questionnaire to confirm that they were not autistic. All participants spoke German at the level of a native speaker, had normal or corrected-to-normal vision and hearing, and had a sufficient level of everyday function to fulfill the tasks required in this study. No data on ethnicity were recorded. An overview of the assessed data is provided in Table 1.

Table 1. Participant characteristics.

2.2. Stimulus material

Short video and audio clips were used as stimulus material in which professional actors nonverbally conveyed information about their current emotional state (happy, alluring, neutral, angry, or disgusted). The four professional actors (2 female, 2 male) were asked to say one of four words, which were selected and balanced based on the results of a previous assessment of their valence and arousal (5962) and each had a neutral meaning [Möbel = furniture, female actor; Gabel = fork, male actor; Zimmer = room, male actor; Objekt = object, female actor; mean valence scores ± SD: 4.9 ± 0.4 on a 9-point self-assessment Manikin scale (63)]. While speaking, the actors were instructed to express one of the five emotional states simultaneously via facial expressions and modulations of their tone of voice. Recordings from every actor were used for each emotional state, with the resulting stimulus material comprising 20 videos (4 actors × 5 emotional states). The videos were then edited in such a way that only the actor’s face could be seen on a black background in order to exclude any influence by environmental factors. The video recordings were presented to participants in three different modalities: with video and audio track (audiovisual = AV, i.e., emotional facial expression and affective prosody), only the video with muted audio (visual = V, i.e., facial expression only) or only the audio track without the video (auditory = A, i.e., prosody only), resulting in a total set of 60 stimuli (20 per modality). In previous studies, these stimuli were evaluated as reliable and valid for measures of emotion recognition abilities, since the emotional information expressed by the actors was identified correctly well above chance level for each stimulus (61, 64, 65). In the original validation with 30 healthy participants, the stimuli with the highest percentage of correct classifications in the AV condition were selected from a total set of 630 stimuli (61). The classification accuracy was 57% (A), 70% (V), and 86% (AV) (61). A subset of these stimuli (only the stimuli with words with neutral meaning) was used in the current study.
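
For concreteness, the 4 × 5 × 3 factorial structure of this stimulus set can be enumerated in a few lines of Python. This is purely an illustrative sketch; the actor labels and modality codes below are hypothetical placeholders, not identifiers from the original material.

```python
# Illustrative sketch of the stimulus set described above:
# 4 actors x 5 emotional states = 20 recordings, each presented in 3 modalities.
from itertools import product

ACTORS = ["female_1", "female_2", "male_1", "male_2"]  # hypothetical labels
EMOTIONS = ["happy", "alluring", "neutral", "angry", "disgusted"]
MODALITIES = ["AV", "V", "A"]  # audiovisual, visual only, auditory only

stimuli = [
    {"actor": a, "emotion": e, "modality": m}
    for a, e, m in product(ACTORS, EMOTIONS, MODALITIES)
]
assert len(stimuli) == 60  # 20 recordings x 3 presentation modalities
```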

In the current study, we aimed to create a balanced task design with respect to the number of emotions with positive and negative valence, matched for their arousal level. To this end, we included two distinct negative emotional states that are characterized by a high degree of arousal (anger and disgust) and two different positive emotional states also characterized by a high arousal level (happiness and sexual interest). Alluring stimuli (expressing sexual interest) were selected as the second category of nonverbal cues with a positive valence due to their relevance in social interaction and the conceptual distinction from happy cues (60, 65), which otherwise constitute the only positive category within the concept of “basic emotions” according to Ekman and Friesen (66). Additionally, we included neutral nonverbal cues. During recording of alluring stimuli, the actors were asked to nonverbally communicate sexual interest in an inviting manner. Alluring stimuli were relatively uniform across actors, with a soft and sustained intonation in the lower frequency spectrum and slowly changing facial expressions, mostly with a slight smile, a slight widening of the palpebral fissure, and a lifting of one or both eyebrows.

2.3. Experimental design

The stimuli were presented to participants seated in front of a 17-inch flat-screen monitor. Sound was delivered via binaural headphones (Sennheiser HD 515; Sennheiser electronic GmbH & Co. KG, Wedemark-Wennebostel, Germany), with the volume individually adjusted to a comfortable level by each participant. The “Presentation” software (Neurobehavioral Systems Inc., Albany, CA, United States) was used to present the stimuli and record the responses. First, a scale with the five different emotional states, horizontally aligned, appeared on the screen for 1 s. Then, a fixation cross appeared in the middle of the screen, accompanied by a simultaneously presented pure tone (302 Hz) to attract the participant’s attention. Subsequently, a random stimulus was presented in the audiovisual, visual, or auditory modality. The scale showing the five different categories of emotional states reappeared on the screen after stimulus offset, and the participants were instructed to select the emotional state that, in their intuitive opinion, was most likely expressed in the stimulus. To select the answer, the participants pressed one of five horizontally adjacent buttons on a Cedrus RB-730 response pad (Cedrus Corporation, San Pedro, CA, United States), corresponding to the position of the emotions displayed on the screen. As soon as the participant responded, a short visual feedback (700 ms) of the selected answer was displayed, and the answer could no longer be changed. The response window (10 s duration) started with stimulus onset. The total duration of a single trial varied from 3.7 to 12.7 s, depending on the stimulus duration and the individual response time.

In order to avoid response bias, the horizontal position of the five emotional states was permuted between participants. While “neutral” always remained in the middle, the other emotional states were distributed randomly, with the restriction that the two positive and the two negative emotional states always appeared on the same side, either right or left of the “neutral” response option. This resulted in eight different answer scales, which were counterbalanced across participants.
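
The following minimal Python sketch (a reconstruction from the constraints described above, not the authors’ code) illustrates why these restrictions yield exactly eight distinct answer scales:

```python
# "neutral" stays in the middle; the positive pair occupies one side and the
# negative pair the other, giving 2 (side) x 2 (positive order) x 2 (negative
# order) = 8 layouts.
from itertools import permutations

POSITIVE = ("happy", "alluring")
NEGATIVE = ("angry", "disgusted")

scales = []
for pos in permutations(POSITIVE):                    # 2 orderings of the positive pair
    for neg in permutations(NEGATIVE):                # 2 orderings of the negative pair
        for left, right in ((pos, neg), (neg, pos)):  # 2 side assignments
            scales.append(left + ("neutral",) + right)

assert len(scales) == 8  # eight distinct horizontal answer scales
```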

Each participant performed a short training session, consisting of 15 stimuli that were not part of the main experiment, to become familiar with the task before starting with the actual experiment. All 60 stimuli were then presented to each participant in a random order.

2.4. Data analysis

Data analysis focused on the accuracy of answers and the time until an answer was given (response time). Accuracy rates were calculated as the proportion of correct answers. Accuracy rates and response times were averaged for each modality and for each emotional category. One-sided independent t-tests were calculated for both accuracy rates and response times for all three modalities. Furthermore, Cohen’s d was calculated as a measure of effect size. This information can be helpful for estimating sample sizes in future studies on modality effects in emotion recognition by autistic individuals.
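
For illustration, this group comparison can be sketched in Python. The original analysis was performed in SPSS; the data below are synthetic, and Welch’s correction is assumed here because the Results section reports fractional degrees of freedom:

```python
# One-sided independent t-test (directional hypothesis: NAP accuracy > HFA
# accuracy) plus Cohen's d, on synthetic per-participant accuracy rates.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
acc_nap = rng.normal(0.80, 0.08, 18)  # hypothetical NAP accuracy rates (n = 18)
acc_hfa = rng.normal(0.70, 0.10, 18)  # hypothetical HFA accuracy rates (n = 18)

def cohens_d(x, y):
    """Cohen's d based on the pooled standard deviation of two independent samples."""
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * x.var(ddof=1) + (ny - 1) * y.var(ddof=1)) / (nx + ny - 2)
    return (x.mean() - y.mean()) / np.sqrt(pooled_var)

# equal_var=False gives Welch's test with fractional degrees of freedom.
t, p = stats.ttest_ind(acc_nap, acc_hfa, equal_var=False, alternative="greater")
print(f"t = {t:.2f}, one-sided p = {p:.3f}, d = {cohens_d(acc_nap, acc_hfa):.2f}")
```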

For further exploration of the effect of cue modality and emotional category on accuracy rates and response times, a mixed-design analysis of variance (ANOVA) with modality (auditory, visual, and audiovisual) and emotional category (happy, alluring, neutral, angry, and disgusted) as within-subject factors and group (NAP, HFA) as the between-subject factor was calculated for accuracy rates as well as for response times. The BDI score was used as a covariate. ANOVA results were Greenhouse–Geisser-corrected.
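
A simplified version of this analysis can be sketched with the Python package pingouin. This is an approximation, not the published SPSS analysis: pingouin’s mixed_anova handles a single within-subject factor and no covariate, so the sketch collapses the design to modality × group and uses synthetic data:

```python
# Mixed-design ANOVA sketch: modality (within) x group (between), with
# sphericity correction enabled (correction=True). Data are synthetic.
import numpy as np
import pandas as pd
import pingouin as pg

rng = np.random.default_rng(1)
rows = []
for group, group_bonus in [("NAP", 0.10), ("HFA", 0.00)]:
    for s in range(18):  # 18 participants per group, as in the study
        for modality, base_acc in [("A", 0.55), ("V", 0.65), ("AV", 0.80)]:
            rows.append({"subject": f"{group}_{s}", "group": group,
                         "modality": modality,
                         "accuracy": base_acc + group_bonus + rng.normal(0, 0.05)})
df = pd.DataFrame(rows)

aov = pg.mixed_anova(data=df, dv="accuracy", within="modality",
                     subject="subject", between="group", correction=True)
print(aov.round(3))
```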

As outlined in the Introduction, we treated multisensory facilitation of accuracy and multisensory facilitation of response time as two distinct parameters of multisensory integration. Multisensory facilitation of accuracy was operationalized as the percent improvement (MSF-A%) in accuracy rate after audiovisual presentation compared to the highest accuracy rate after unimodal presentation. The unimodal modality with the higher accuracy rate (visual or auditory) was determined individually per participant. Thus, MSF-A% was calculated as: [(audiovisual accuracy − max. unimodal accuracy)/max. unimodal accuracy] × 100. Similarly, multisensory facilitation of response time was operationalized as the percent reduction (MSF-RT%) in response time after audiovisual presentation compared to the fastest unimodal response time per participant. MSF-RT% was calculated as: [(min. unimodal RT − audiovisual RT)/min. unimodal RT] × 100. Thus, positive values of MSF-A% and MSF-RT% reflect an improvement (higher accuracy rate or shorter response time) during audiovisual presentation compared to the best unimodal presentation. These calculations have been used in previous studies to quantify effects of multisensory stimulus presentation (5, 37, 40).
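
The two facilitation indices translate directly into code. The following minimal Python sketch implements the formulas above; the example values are hypothetical, not study data:

```python
# Per-participant multisensory facilitation, computed against the individually
# best unimodal condition, exactly as defined in the text.
def msf_accuracy_pct(acc_av: float, acc_a: float, acc_v: float) -> float:
    """Percent accuracy gain of audiovisual over the better unimodal condition."""
    best_uni = max(acc_a, acc_v)
    return (acc_av - best_uni) / best_uni * 100.0

def msf_rt_pct(rt_av: float, rt_a: float, rt_v: float) -> float:
    """Percent response-time reduction of audiovisual vs. the faster unimodal condition."""
    fastest_uni = min(rt_a, rt_v)
    return (fastest_uni - rt_av) / fastest_uni * 100.0

# Hypothetical participant: positive values indicate facilitation.
print(msf_accuracy_pct(acc_av=0.85, acc_a=0.60, acc_v=0.72))  # ~18.1 % gain
print(msf_rt_pct(rt_av=2.1, rt_a=2.9, rt_v=2.5))              # 16.0 % reduction
```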

For the statistical analysis of the MSF-A% and the MSF-RT%, we used a one-sided independent t-test between groups (NAP, HFA). Pearson correlation coefficients among all psychometric variables are reported in the Supplementary material (Supplementary Tables S1, S2). All statistical data analyses were performed using IBM SPSS Statistics, Version 27. Significance levels were set at p < 0.05.

2.5. Ethical approval

The study was performed in accordance with the ethical principles expressed in the Declaration of Helsinki. Approval of the research protocol was granted by the Medical Ethics Committee at the University of Tübingen, Germany (#469/2013BO2). Written informed consent was obtained from all participants prior to the involvement in this research.

3. Results

3.1. Accuracy

To evaluate our first hypothesis, one-sided independent t-tests were calculated to compare the mean accuracy rates between both groups for all three modalities. In accordance with our hypothesis, significantly lower accuracy rates were observed in the HFA group compared to the NAP group for audiovisual, t(21.74) = 3.50, p = 0.001, visual, t(27.53) = 3.13, p = 0.002, and auditory, t(34) = 3.11, p = 0.002 modalities, with large effect sizes (all Cohen’s d > 1).

Aiming to evaluate possible effects of cue modality and emotional category on the accuracy of emotion perception, an ANOVA was calculated in an exploratory approach. The BDI was used as a covariate to evaluate whether it had a confounding effect on accuracy rates. The results revealed a significant main effect for group, F(1, 33) = 4.94, p = 0.033, partial η2 = 0.13, for cue modality, F(1.75, 57.72) = 98.61, p < 0.001, partial η2 = 0.75, and for emotional category, F(3.04, 100.44) = 18.12, p < 0.001, partial η2 = 0.16. Overall, accuracy rates were higher in NAP than in HFA. In both groups, the highest accuracy rates per modality were observed in the audiovisual modality, followed by the visual and the auditory modality (see Figure 1A; Table 2). In both groups, the highest accuracy rates per emotion were observed for neutral expressions and the lowest for disgusted expressions, while accuracy rates for alluring, happy, and angry cues lay in between (see Supplementary material; Supplementary Table S3).

Figure 1. Mean accuracy rates (A) and mean response times (B) by modality. Significantly decreased accuracy rates and prolonged response times in autistic (HFA) as compared to non-autistic people (NAP) were observed for each of the three modalities. Error bars represent the corresponding 95%-CI. *p < 0.05. **p < 0.01.

Table 2. Accuracy rates (proportion of correct answers) per modality.

There was no statistically significant interaction between cue modality and group, F(1.75, 57.72) = 1.53, p = 0.227, partial η2 = 0.04, between emotional category and group, F(3.04, 100.44) = 1.01, p = 0.393, partial η2 = 0.03, or between cue modality, emotional category, and group, F(5.07, 167.32) = 1.03, p = 0.404, partial η2 = 0.03. There was also no significant effect of BDI, F(1, 33) = 1.45, p = 0.238, partial η2 = 0.04.

3.2. Response time

To evaluate our second hypothesis, one-sided independent t-tests were calculated to compare mean response times between both groups for each modality. These analyses showed significantly prolonged response times in the HFA group compared to the NAP group for audiovisual, t(34) = −3.11, p = 0.002, visual, t(34) = −3.08, p = 0.002, and auditory, t(34) = −2.35, p = 0.013, stimuli, with large effect sizes (all Cohen’s d ≥ 0.78).

Additionally, an ANOVA was calculated to further explore the effects of cue modality and emotional category on response times, and the BDI was again used as a covariate to evaluate whether it had a confounding effect on response times. This exploratory approach revealed significant main effects for group, F(1, 33) = 6.42, p = 0.016, partial η2 = 0.16, for cue modality, F(1.48, 48.98) = 4.74, p = 0.021, partial η2 = 0.13, and for emotional category, F(2.83, 93.34) = 23.98, p < 0.001, partial η2 = 0.42. Overall, response times were prolonged in HFA compared to NAP. In both groups, the shortest mean response times per modality were observed in the audiovisual modality (see Figure 1B; Table 3). In both groups, the shortest mean response times per emotion were observed for neutral expressions, followed by happy, angry, alluring, and disgusted expressions (see Supplementary material; Supplementary Table S4).

Table 3. Response times per modality.

There was no statistically significant interaction between cue modality and group, F(1.48, 48.98) = 1.11, p = 0.322, partial η2 = 0.03, between emotional category and group, F(2.83, 93.34) = 0.60, p = 0.605, partial η2 = 0.02, or between cue modality, emotional category, and group, F(5.24, 173.07) = 0.62, p = 0.691, partial η2 = 0.02. There was also no significant effect of BDI, F(1, 33) = 0.20, p = 0.658, partial η2 = 0.01.

3.3. Multisensory facilitation

To evaluate the third and fourth hypotheses, we first used one-sample t-tests to determine whether the multisensory facilitation of accuracy (MSF-A%) and of response time (MSF-RT%) differed significantly from 0 in each group, and then compared MSF-A% and MSF-RT% between the groups using one-sided independent t-tests.

MSF-A% differed significantly from 0 for NAP, t(17) = 6.98, p < 0.001, as well as for HFA, t(17) = 4.78, p < 0.001, meaning that multisensory facilitation of accuracy was observed in both groups. MSF-RT% differed significantly from 0 for NAP, t(17) = 2.54, p = 0.021, but not for HFA, t(17) = −0.866, p = 0.399, meaning that multisensory facilitation of response time was present in NAP but not in HFA.
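
For illustration, these one-sample tests can be sketched in Python as follows (synthetic MSF-RT% values, not the study data; a two-sided test against zero is assumed):

```python
# Does each group's MSF-RT% differ from 0, i.e., is there any facilitation at all?
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
msf_rt_nap = rng.normal(5.0, 8.0, 18)    # hypothetical NAP MSF-RT% values (n = 18)
msf_rt_hfa = rng.normal(-1.0, 12.0, 18)  # hypothetical HFA MSF-RT% values (n = 18)

for label, values in [("NAP", msf_rt_nap), ("HFA", msf_rt_hfa)]:
    t, p = stats.ttest_1samp(values, popmean=0.0)  # one-sample test against 0
    print(f"{label}: t({len(values) - 1}) = {t:.2f}, p = {p:.3f}")
```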

Statistical analysis further revealed no significant group difference regarding MSF-A%, t(34) = −0.621, p = 0.270, Cohen’s d = −0.207, whereas a significantly reduced MSF-RT% was observed in the HFA group, t(34) = 1.922, p = 0.032, Cohen’s d = 0.641. The HFA group showed a negative mean MSF-RT%, meaning that the average response times under audiovisual stimulus conditions tended to be even longer than under the respective fastest unimodal condition; this deviation from zero was, however, not significant. An overview of these data is shown in Figure 2 and Table 4.

Figure 2. Multisensory facilitation (MSF) of accuracy (A) and of response time (B), showing a significant difference between NAP and HFA for the MSF of response time. Error bars represent the corresponding 95%-CI. MSF-A%, multisensory facilitation of accuracy (percent improvement); MSF-RT%, multisensory facilitation of response time (percent reduction); NAP, non-autistic people; HFA, high-functioning autism. *p < 0.05. **p < 0.01. ***p < 0.001.

Table 4. Multisensory facilitation of accuracy and response times.

Regarding the unimodal conditions, the visual modality was the faster of the two unimodal stimulus conditions for 14 of the 18 participants (78%) in the NAP group and for 10 of the 18 participants (56%) in the HFA group. Accordingly, the proportion of participants for whom the auditory modality was the faster unimodal modality was twice as high in the HFA group (8 of 18, 44%) as in the NAP group (4 of 18, 22%).

4. Discussion

The aim of this study was to investigate perception of auditory, visual, and audiovisual nonverbal emotional cues, as well as multisensory facilitation in autistic compared to non-autistic people.

In agreement with the literature, a significantly reduced accuracy rate was observed in autistic individuals for all three investigated modalities (auditory, visual, and audiovisual), thus confirming our first hypothesis. With our ecologically valid stimulus material, large effect sizes were found in all three modalities (A: Cohen’s d = 1.04, V: Cohen’s d = 1.04, AV: Cohen’s d = 1.17). Moreover, no interaction between these group effects and modality- or emotion-specific differences was evident.

Furthermore, in accordance with the literature, autistic people showed significantly prolonged response times for each of the three modalities, meaning that our second hypothesis can also be confirmed. Again, we observed large effect sizes in all three modalities (A: Cohen’s d = 0.78, V: Cohen’s d = 1.03, AV: Cohen’s d = 1.04), and again, no interaction of group effects with modality- or emotion-specific differences emerged.

We further hypothesized that multisensory facilitation is reduced in autistic compared to non-autistic individuals in terms of both accuracy (Hypothesis 3) and response time (Hypothesis 4). However, we observed a significantly reduced multisensory facilitation only for response time, not for accuracy, meaning that we can confirm Hypothesis 4 but did not find confirming evidence for Hypothesis 3 in our study. Particularly interesting is the fact that, in contrast to the results obtained in the control group, we did not observe significant multisensory facilitation of response times in autistic people at all. The mean MSF-RT% in the HFA group was in fact negative, meaning that autistic people showed a tendency toward even longer response times under audiovisual stimulus conditions compared to the respective unimodal stimuli, although this deviation was not significant. This contrasts with previous findings in which autistic individuals showed reduced multisensory facilitation of response times but still responded faster to audiovisual than to unimodal presentation, albeit to a lesser degree than non-autistic people (37, 40). MSF-RT% showed a negative mean in the HFA group even though the mean response time in the HFA group was fastest in the AV modality. This can be explained by the higher interindividual variability of MSF-RT%, with stronger outliers (for positive as well as negative values) in HFA compared to NAP (see Supplementary material; Supplementary Figure S1; Supplementary Table S5), and by the observation that the auditory modality was more frequently the faster of the two unimodal modalities in HFA than in NAP (44% vs. 22%, respectively).

The reduced multisensory facilitation of response times suggests that the neural mechanisms underlying the integration of multisensory stimuli might differ between autistic and non-autistic people. Multisensory facilitation of response times in non-autistic people, as confirmed in numerous studies (36), indicates a convergence of multisensory emotional cues into a bimodal percept at an early step of neural processing, which accelerates identification of the respective emotional state. The results of the current study indicate that in autistic individuals, in contrast, auditory and visual cues might be processed separately within modality-specific brain areas, with integration into a bimodal percept occurring to a lesser degree or at a later processing stage. These differences in the neural processing of multisensory cues might also contribute to the occurrence of specific symptoms such as sensitivity to light and noise or a general sensory overload in autistic people. It must be stated, however, that this conclusion is speculative, since the underlying neural mechanisms are not yet fully understood. Neuro-oscillatory functions may be important for multisensory facilitation and appear to be altered in autistic people (67–70). Reduced inter-regional brain connectivity in autism (71, 72) could also play a role, since successful multisensory integration requires that information from different sensory cortical areas be integrated into a joint percept. Further investigation is necessary for a better understanding of the underlying mechanisms.

Another very interesting observation is that, despite the reduced multisensory facilitation of response times, we did not find deficits in multisensory facilitation of accuracy (MSF-A% did not differ significantly between the groups). This suggests that autistic individuals perform a more time-consuming analysis of bimodal compared to unimodal emotional cues but can thereby still achieve the same relative improvement in accuracy. We suppose that this compensatory strategy develops during childhood, since previous literature identified generally larger multisensory integration deficits in younger children than in adolescents (36); in particular, the bimodal facilitation of accuracy in speech recognition (as a social stimulus) was impaired in autistic children (7–12 years) but not adolescents [13–15 years (43)].

Another explanation for the lack of a group difference in multisensory facilitation of accuracy in the current study might be the fact that no background noise or complex task demands were present in our experimental design. Prior studies of multisensory facilitation in speech recognition identified the largest impairments in autistic individuals when background noise was present at a level where non-autistic people benefited the most and the signal-to-noise ratio was low, while minor or even no deficits were observed without background noise (43, 45). For emotional signals, differences in multisensory integration were only found under complex (divided) attention conditions (49). Thus, particularly if compensatory strategies are involved, it can be expected that differences in multisensory integration of emotional cues between autistic and non-autistic people become more pronounced with increasing task-related and non-task-related demands. Experimental designs with increasing complexity likely approximate the simultaneous demands present in everyday interactional situations more accurately and may thereby depict differences in multisensory integration experienced by autistic people more precisely. It would therefore be interesting to address the impact of background noise and complex task demands on multisensory integration of emotional cues in future research.

Some limitations of our study should be mentioned. Our experiment was conducted using only the described emotional stimuli; non-emotional stimuli were not implemented. Therefore, we cannot distinguish whether the impairment of multisensory facilitation of response time in the HFA group is specifically related to emotion recognition or whether it rather reflects a general impairment in multisensory integration, as has been observed in autistic individuals (37–40, 42–49). Although we implemented stimulus material with high ecological validity, each stimulus consisted of only one spoken word. As sentences in daily life usually contain more than one word, they provide more prosodic information, and facial expressions can be observed more intensely. Thus, emotion perception abilities might differ substantially in daily conversations. In addition, the male and female actors spoke different words, so there might be a confounding influence of speaker gender and word content on the results. However, since the selected words had a neutral meaning and gender influences were not evaluated within this study, any such influence would not be expected to systematically bias the reported results.

In summary, with our stimulus material, clear impairments in emotion perception are evident in autistic individuals. These impairments are independent of modality and emotion and show large effect sizes. Due to the reduced multisensory facilitation of response time with preserved multisensory facilitation of accuracy in the HFA group, it can be assumed that audiovisual stimuli might be analyzed separately (and thus more slowly) but still with the same relative improvement of accuracy in autistic individuals, whereas a more integrative stimulus perception seems to take place in non-autistic people. However, further investigation is still necessary to better understand the underlying neural mechanisms.

Data availability statement

The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

Ethics statement

The studies involving human participants were reviewed and approved by Medical Ethics Committee at the University of Tübingen, Germany. The patients/participants provided their written informed consent to participate in this study.

Author contributions

DW, CB, and HJ conceptualized the study. GT-P performed the data acquisition. JH, CB, GT-P, HJ, MP, LH, AM, and DW carried out the analysis and interpretation of the data. JH, DW, and AM carried out writing of the manuscript and preparation of the figures. All authors contributed to the manuscript revision, read and approved the submitted version, and provided approval for publication of the content.

Funding

We acknowledge support by Open Access Publishing Fund of University of Tübingen.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Supplementary material

The Supplementary material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpsyt.2023.1151665/full#supplementary-material

References

1. American Psychiatric Association (ed). Autism spectrum disorder In: Diagnostic and statistical manual of mental disorders. 5th ed. Arlington, VA: American Psychiatric Association (2013). 50.

2. Velikonja, T, Fett, AK, and Velthorst, E. Patterns of nonsocial and social cognitive functioning in adults with autism Spectrum disorder: a systematic review and meta-analysis. JAMA Psychiat. (2019) 76:135–51. doi: 10.1001/jamapsychiatry.2018.3645

3. Zhang, M, Xu, S, Chen, Y, Lin, Y, Ding, H, and Zhang, Y. Recognition of affective prosody in autism spectrum conditions: a systematic review and meta-analysis. Autism Int J Res Pract. (2021) 15:1362361321995725. doi: 10.1177/1362361321995725

4. Lozier, LM, Vanmeter, JW, and Marsh, AA. Impairments in facial affect recognition associated with autism spectrum disorders: a meta-analysis. Dev Psychopathol. (2014) 26:933–45. doi: 10.1017/S0954579414000479

5. Charbonneau, G, Bertone, A, Lepore, F, Nassim, M, Lassonde, M, Mottron, L, et al. Multilevel alterations in the processing of audio-visual emotion expressions in autism spectrum disorders. Neuropsychologia. (2013) 51:1002–10. doi: 10.1016/j.neuropsychologia.2013.02.009

6. Fridenson-Hayo, S, Berggren, S, Lassalle, A, Tal, S, Pigat, D, Bölte, S, et al. Basic and complex emotion recognition in children with autism: cross-cultural findings. Mol Autism. (2016) 7:52. doi: 10.1186/s13229-016-0113-9

7. Kujala, T, Lepistö, T, Nieminen-von Wendt, T, Näätänen, P, and Näätänen, R. Neurophysiological evidence for cortical discrimination impairment of prosody in Asperger syndrome. Neurosci Lett. (2005) 383:260–5. doi: 10.1016/j.neulet.2005.04.048

8. Baker, KF, Montgomery, AA, and Abramson, R. Brief report: perception and lateralization of spoken emotion by youths with high-functioning forms of autism. J Autism Dev Disord. (2010) 40:123–9. doi: 10.1007/s10803-009-0841-1

9. Grossman, RB, Bemis, RH, Plesa Skwerer, D, and Tager-Flusberg, H. Lexical and affective prosody in children with high-functioning autism. J Speech Lang Hear Res JSLHR. (2010) 53:778–93. doi: 10.1044/1092-4388(2009/08-0127)

10. Brennand, R, Schepman, A, and Rodway, P. Vocal emotion perception in pseudo-sentences by secondary-school children with autism Spectrum disorder. Res Autism Spectr Disord. (2011) 5:1567–73. doi: 10.1016/j.rasd.2011.03.002

11. Heaton, P, Reichenbacher, L, Sauter, D, Allen, R, Scott, S, and Hill, E. Measuring the effects of alexithymia on perception of emotional vocalizations in autistic spectrum disorder and typical development. Psychol Med. (2012) 42:2453–9. doi: 10.1017/S0033291712000621

12. Kim, CH, Kim, YT, and Lee, SJ. Effect of context and affective prosody on emotional perception in children with high-functioning autism. Commun Sci Disord. (2013) 18:24–34. doi: 10.12963/csd.13003

13. Lyons, M, Schoen Simmons, E, and Paul, R. Prosodic development in middle childhood and adolescence in high-functioning autism. Autism Res Off J Int Soc Autism Res. (2014) 7:181–96. doi: 10.1002/aur.1355

14. Wang, JE, and Tsao, FM. Emotional prosody perception and its association with pragmatic language in school-aged children with high-function autism. Res Dev Disabil. (2015) 37:162–70. doi: 10.1016/j.ridd.2014.11.013

15. Martzoukou, M, Papadopoulou, D, and Kosmidis, MH. The comprehension of syntactic and affective prosody by adults with autism Spectrum disorder without accompanying cognitive deficits. J Psycholinguist Res. (2017) 46:1573–95. doi: 10.1007/s10936-017-9500-4

16. Rosenblau, G, Kliemann, D, Dziobek, I, and Heekeren, HR. Emotional prosody processing in autism spectrum disorder. Soc Cogn Affect Neurosci. (2017) 12:224–39. doi: 10.1093/scan/nsw118

17. Dziobek, I, Fleck, S, Rogers, K, Wolf, OT, and Convit, A. The “amygdala theory of autism” revisited: linking structure to behavior. Neuropsychologia. (2006) 44:1891–9. doi: 10.1016/j.neuropsychologia.2006.02.005

18. Boraston, Z, Blakemore, SJ, Chilvers, R, and Skuse, D. Impaired sadness recognition is linked to social interaction deficit in autism. Neuropsychologia. (2007) 45:1501–10. doi: 10.1016/j.neuropsychologia.2006.11.010

19. Corden, B, Chilvers, R, and Skuse, D. Avoidance of emotionally arousing stimuli predicts social-perceptual impairment in Asperger’s syndrome. Neuropsychologia. (2008) 46:137–47. doi: 10.1016/j.neuropsychologia.2007.08.005

20. Wallace, S, Sebastian, C, Pellicano, E, Parr, J, and Bailey, A. Face processing abilities in relatives of individuals with ASD. Autism Res Off J Int Soc Autism Res. (2010) 3:345–9. doi: 10.1002/aur.161

21. Mathewson, KJ, Drmic, IE, Jetha, MK, Bryson, SE, Goldberg, JO, Hall, GB, et al. Behavioral and cardiac responses to emotional stroop in adults with autism spectrum disorders: influence of medication. Autism Res Off J Int Soc Autism Res. (2011) 4:98–108. doi: 10.1002/aur.176

22. Lai, MC, Lombardo, MV, Ruigrok, ANV, Chakrabarti, B, Wheelwright, SJ, Auyeung, B, et al. Cognition in males and females with autism: similarities and differences. PLoS One. (2012) 7:e47198. doi: 10.1371/journal.pone.0047198

23. Eack, SM, Bahorik, AL, McKnight, SAF, Hogarty, SS, Greenwald, DP, Newhill, CE, et al. Commonalities in social and non-social cognitive impairments in adults with autism spectrum disorder and schizophrenia. Schizophr Res. (2013) 148:24–8. doi: 10.1016/j.schres.2013.05.013

24. Sucksmith, E, Allison, C, Baron-Cohen, S, Chakrabarti, B, and Hoekstra, RA. Empathy and emotion recognition in people with autism, first-degree relatives, and controls. Neuropsychologia. (2013) 51:98–105. doi: 10.1016/j.neuropsychologia.2012.11.013

25. Wilson, CE, Happé, F, Wheelwright, SJ, Ecker, C, Lombardo, MV, Johnston, P, et al. The neuropsychology of male adults with high-functioning autism or asperger syndrome. Autism Res Off J Int Soc Autism Res. (2014) 7:568–81. doi: 10.1002/aur.1394

26. Eack, SM, Mazefsky, CA, and Minshew, NJ. Misinterpretation of facial expressions of emotion in verbal adults with autism spectrum disorder. Autism Int J Res Pract. (2015) 19:308–15. doi: 10.1177/1362361314520755

27. Walsh, JA, Creighton, SE, and Rutherford, MD. Emotion perception or social cognitive complexity: what drives face processing deficits in autism Spectrum disorder? J Autism Dev Disord. (2016) 46:615–23. doi: 10.1007/s10803-015-2606-3

28. Otsuka, S, Uono, S, Yoshimura, S, Zhao, S, and Toichi, M. Emotion perception mediates the predictive relationship between verbal ability and functional outcome in high-functioning adults with autism Spectrum disorder. J Autism Dev Disord. (2017) 47:1166–82. doi: 10.1007/s10803-017-3036-1

29. Dziobek, I, Fleck, S, Kalbe, E, Rogers, K, Hassenstab, J, Brand, M, et al. Introducing MASC: a movie for the assessment of social cognition. J Autism Dev Disord. (2006) 36:623–36. doi: 10.1007/s10803-006-0107-0

30. Golan, O, Baron-Cohen, S, Hill, JJ, and Golan, Y. The “reading the mind in films” task: complex emotion recognition in adults with and without autism spectrum conditions. Soc Neurosci. (2006) 1:111–23. doi: 10.1080/17470910600980986

31. Lindner, JL, and Rosén, LA. Decoding of emotion through facial expression, prosody and verbal content in children and adolescents with Asperger’s syndrome. J Autism Dev Disord. (2006) 36:769–77. doi: 10.1007/s10803-006-0105-2

32. Mathersul, D, McDonald, S, and Rushby, JA. Understanding advanced theory of mind and empathy in high-functioning adults with autism spectrum disorder. J Clin Exp Neuropsychol. (2013) 35:655–68. doi: 10.1080/13803395.2013.809700

33. Murray, K, Johnston, K, Cunnane, H, Kerr, C, Spain, D, Gillan, N, et al. A new test of advanced theory of mind: the “strange stories film task” captures social processing differences in adults with autism spectrum disorders. Autism Res Off J Int Soc Autism Res. (2017) 10:1120–32. doi: 10.1002/aur.1744

34. Holdnack, J, Goldstein, G, and Drozdick, L. Social perception and WAIS-IV performance in adolescents and adults diagnosed with Asperger’s syndrome and autism. Assessment. (2011) 18:192–200. doi: 10.1177/1073191110394771

35. Bremner, AJ, Lewkowicz, DJ, and Spence, C. The multisensory approach to development In: AJ Bremner, DJ Lewkowicz, and C Spence, editors. Multisensory development. Oxford: Oxford University Press (2012). 1–26. Available at: https://oxford.universitypressscholarship.com/view/10.1093/acprof:oso/9780199586059.001.0001/acprof-9780199586059-chapter-001 (Accessed November 28, 2021).

36. Feldman, JI, Dunham, K, Cassidy, M, Wallace, MT, Liu, Y, and Woynaroski, TG. Audiovisual multisensory integration in individuals with autism spectrum disorder: a systematic review and meta-analysis. Neurosci Biobehav Rev. (2018) 95:220–34. doi: 10.1016/j.neubiorev.2018.09.020

37. Brandwein, AB, Foxe, JJ, Butler, JS, Russo, NN, Altschuler, TS, Gomes, H, et al. The development of multisensory integration in high-functioning autism: high-density electrical mapping and psychophysical measures reveal impairments in the processing of audiovisual inputs. Cereb Cortex. (2013) 23:1329–41. doi: 10.1093/cercor/bhs109

38. Collignon, O, Charbonneau, G, Peters, F, Nassim, M, Lassonde, M, Lepore, F, et al. Reduced multisensory facilitation in persons with autism. Cortex J Devoted Study Nerv Syst Behav. (2013) 49:1704–10. doi: 10.1016/j.cortex.2012.06.001

39. Brandwein, AB, Foxe, JJ, Butler, JS, Frey, HP, Bates, JC, Shulman, LH, et al. Neurophysiological indices of atypical auditory processing and multisensory integration are associated with symptom severity in autism. J Autism Dev Disord. (2015) 45:230–44. doi: 10.1007/s10803-014-2212-9

40. Ostrolenk, A, Bao, VA, Mottron, L, Collignon, O, and Bertone, A. Reduced multisensory facilitation in adolescents and adults on the autism Spectrum. Sci Rep. (2019) 9:11965. doi: 10.1038/s41598-019-48413-9

41. Fraser, S, Gagné, JP, Alepins, M, and Dubois, P. Evaluating the effort expended to understand speech in noise using a dual-task paradigm: the effects of providing visual speech cues. J Speech Lang Hear Res JSLHR. (2010) 53:18–33. doi: 10.1044/1092-4388(2009/08-0140)

42. Smith, EG, and Bennetto, L. Audiovisual speech integration and lipreading in autism. J Child Psychol Psychiatry. (2007) 48:813–21. doi: 10.1111/j.1469-7610.2007.01766.x

43. Foxe, JJ, Molholm, S, Del Bene, VA, Frey, HP, Russo, NN, Blanco, D, et al. Severe multisensory speech integration deficits in high-functioning school-aged children with autism Spectrum disorder (ASD) and their resolution during early adolescence. Cereb Cortex. (2015) 25:298–312. doi: 10.1093/cercor/bht213

44. Ross, LA, Del Bene, VA, Molholm, S, Frey, HP, and Foxe, JJ. Sex differences in multisensory speech processing in both typically developing children and those on the autism spectrum. Front Neurosci. (2015) 9:185. doi: 10.3389/fnins.2015.00185

45. Stevenson, RA, Baum, SH, Segers, M, Ferber, S, Barense, MD, and Wallace, MT. Multisensory speech perception in autism spectrum disorder: from phoneme to whole-word perception. Autism Res Off J Int Soc Autism Res. (2017) 10:1280–90. doi: 10.1002/aur.1776

46. Stevenson, RA, Segers, M, Ncube, BL, Black, KR, Bebko, JM, Ferber, S, et al. The cascading influence of multisensory processing on speech perception in autism. Autism Int J Res Pract. (2018) 22:609–24. doi: 10.1177/1362361317704413

47. Magnée, MJCM, de Gelder, B, van Engeland, H, and Kemner, C. Audiovisual speech integration in pervasive developmental disorder: evidence from event-related potentials. J Child Psychol Psychiatry. (2008) 49:995–1000. doi: 10.1111/j.1469-7610.2008.01902.x

48. Magnée, MJCM, de Gelder, B, van Engeland, H, and Kemner, C. Atypical processing of fearful face-voice pairs in pervasive developmental disorder: an ERP study. Clin Neurophysiol. (2008) 119:2004–10. doi: 10.1016/j.clinph.2008.05.005

49. Magnée, MJCM, de Gelder, B, van Engeland, H, and Kemner, C. Multisensory integration and attention in autism spectrum disorder: evidence from event-related potentials. PLoS One. (2011) 6:e24196. doi: 10.1371/journal.pone.0024196

50. World Health Organization. International statistical classification of diseases and related health problems (10th ed) (2016). Available at: https://icd.who.int/browse10/2016/en

51. Baron-Cohen, S, Wheelwright, S, Skinner, R, Martin, J, and Clubley, E. The autism-spectrum quotient (AQ): evidence from Asperger syndrome/high-functioning autism, males and females, scientists and mathematicians. J Autism Dev Disord. (2001) 31:5–17. doi: 10.1023/A:1005653411471

52. Baron-Cohen, S, and Wheelwright, S. The empathy quotient: an investigation of adults with Asperger syndrome or high functioning autism, and normal sex differences. J Autism Dev Disord. (2004) 34:163–75. doi: 10.1023/B:JADD.0000022607.19833.00

53. Lehrl, S. Mehrfachwahl-Wortschatz-Intelligenztest MWT-B. 5th ed. Balingen: Spitta Verlag (2005).

54. Beck, AT, Ward, CH, Mendelson, M, Mock, J, and Erbaugh, J. An inventory for measuring depression. Arch Gen Psychiatry. (1961) 4:561–71. doi: 10.1001/archpsyc.1961.01710120031004

55. Constantino, JN, and Gruber, C. Social responsiveness scale (SRS). Los Angeles: Western Psychological Services (2005).

56. Rutter, M, Bailey, A, and Lord, C. The social communication questionnaire. Torrance, CA: Western Psychological Services (2003).

57. Kamp-Becker, I, Mattejat, F, Wolf-Ostermann, K, and Remschmidt, H. The Marburg rating scale for Asperger’s syndrome (MBAS)–a screening instrument for high-functioning autistic disorders. Z Kinder Jugendpsychiatr Psychother. (2005) 33:15–26. doi: 10.1024/1422-4917.33.1.15

58. Schutte, NS, Malouff, JM, Hall, LE, Haggerty, DJ, Cooper, JT, Golden, CJ, et al. Development and validation of a measure of emotional intelligence. Personal Individ Differ. (1998) 25:167–77. doi: 10.1016/S0191-8869(98)00001-4

59. Herbert, C, Kissler, J, Junghöfer, M, Peyk, P, and Rockstroh, B. Processing of emotional adjectives: evidence from startle EMG and ERPs. Psychophysiology. (2006) 43:197–206. doi: 10.1111/j.1469-8986.2006.00385.x

60. Ethofer, T, Wiethoff, S, Anders, S, Kreifelts, B, Grodd, W, and Wildgruber, D. The voices of seduction: cross-gender effects in processing of erotic prosody. Soc Cogn Affect Neurosci. (2007) 2:334–7. doi: 10.1093/scan/nsm028

61. Kreifelts, B, Ethofer, T, Grodd, W, Erb, M, and Wildgruber, D. Audiovisual integration of emotional signals in voice and face: an event-related fMRI study. NeuroImage. (2007) 37:1445–56. doi: 10.1016/j.neuroimage.2007.06.020

62. Wiethoff, S, Wildgruber, D, Kreifelts, B, Becker, H, Herbert, C, Grodd, W, et al. Cerebral processing of emotional prosody–influence of acoustic parameters and arousal. NeuroImage. (2008) 39:885–93. doi: 10.1016/j.neuroimage.2007.09.028

63. Bradley, MM, and Lang, PJ. Measuring emotion: the self-assessment manikin and the semantic differential. J Behav Ther Exp Psychiatry. (1994) 25:49–59. doi: 10.1016/0005-7916(94)90063-9

64. Lambrecht, L, Kreifelts, B, and Wildgruber, D. Age-related decrease in recognition of emotional facial and prosodic expressions. Emot Wash DC. (2012) 12:529–39. doi: 10.1037/a0026827

65. Lambrecht, L, Kreifelts, B, and Wildgruber, D. Gender differences in emotion recognition: impact of sensory modality and emotional category. Cogn Emot. (2014) 28:452–69. doi: 10.1080/02699931.2013.837378

66. Ekman, P, and Friesen, WV. Constants across cultures in the face and emotion. J Pers Soc Psychol. (1971) 17:124–9. doi: 10.1037/h0030377

67. Gomez-Ramirez, M, Kelly, SP, Molholm, S, Sehatpour, P, Schwartz, TH, and Foxe, JJ. Oscillatory sensory selection mechanisms during intersensory attention to rhythmic auditory and visual inputs: a human electrocorticographic investigation. J Neurosci. (2011) 31:18556–67. doi: 10.1523/JNEUROSCI.2164-11.2011

68. Kaiser, MD, Yang, DYJ, Voos, AC, Bennett, RH, Gordon, I, Pretzsch, C, et al. Brain mechanisms for processing affective (and nonaffective) touch are atypical in autism. Cereb Cortex. (2016) 26:2705–14. doi: 10.1093/cercor/bhv125

69. Mercier, MR, Foxe, JJ, Fiebelkorn, IC, Butler, JS, Schwartz, TH, and Molholm, S. Auditory-driven phase reset in visual cortex: human electrocorticography reveals mechanisms of early multisensory integration. NeuroImage. (2013) 79:19–29. doi: 10.1016/j.neuroimage.2013.04.060

70. Simon, DM, and Wallace, MT. Dysfunction of sensory oscillations in autism Spectrum disorder. Neurosci Biobehav Rev. (2016) 68:848–61. doi: 10.1016/j.neubiorev.2016.07.016

71. Beker, S, Foxe, JJ, and Molholm, S. Ripe for solution: delayed development of multisensory processing in autism and its remediation. Neurosci Biobehav Rev. (2018) 84:182–92. doi: 10.1016/j.neubiorev.2017.11.008

72. Schipul, SE, Keller, TA, and Just, MA. Inter-regional brain communication and its disturbance in autism. Front Syst Neurosci. (2011) 5:10. doi: 10.3389/fnsys.2011.00010

Keywords: multisensory integration, facial expressions, affective prosody, high-functioning autism, emotional perception

Citation: Hoffmann J, Travers-Podmaniczky G, Pelzl MA, Brück C, Jacob H, Hölz L, Martinelli A and Wildgruber D (2023) Impairments in recognition of emotional facial expressions, affective prosody, and multisensory facilitation of response time in high-functioning autism. Front. Psychiatry. 14:1151665. doi: 10.3389/fpsyt.2023.1151665

Received: 26 January 2023; Accepted: 03 April 2023;
Published: 24 April 2023.

Edited by:

Wataru Sato, RIKEN, Japan

Reviewed by:

Makoto Wada, National Rehabilitation Center for Persons with Disabilities, Japan
Akihiro Tanaka, Tokyo Woman's Christian University, Japan

Copyright © 2023 Hoffmann, Travers-Podmaniczky, Pelzl, Brück, Jacob, Hölz, Martinelli and Wildgruber. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Jonatan Hoffmann, jonatanhoffmann1@gmail.com
