AUTHOR=Zheng Zhong, Li Keyi, Guo Yang, Wang Xinrong, Xiao Lili, Liu Chengqi, He Shouhuan, Feng Gang, Feng Yanmei

TITLE=The Relative Weight of Temporal Envelope Cues in Different Frequency Regions for Mandarin Disyllabic Word Recognition

JOURNAL=Frontiers in Neuroscience

VOLUME=Volume 15 - 2021

YEAR=2021

URL=https://www.frontiersin.org/journals/neuroscience/articles/10.3389/fnins.2021.670192

DOI=10.3389/fnins.2021.670192

ISSN=1662-453X

ABSTRACT=Objectives: Acoustic temporal envelope (E) cues, which carry speech information, are distributed across the entire frequency spectrum. To provide a theoretical basis for signal coding in hearing devices, we examined the relative weight of E cues in different frequency regions for Mandarin disyllabic word recognition in quiet. Design: E cues were extracted from 30 contiguous frequency bands spanning 80 to 7,562 Hz using Hilbert decomposition and assigned to five frequency regions from low to high. Disyllabic word recognition scores were obtained from 20 normal-hearing listeners provided with E cues in combinations of two, three, or four frequency regions. The relative weights of the five frequency regions were calculated using a least-squares approach. Results: Listeners correctly identified 3.13–38.13%, 27.50–83.13%, and 75.00–93.13% of words when presented with two, three, and four frequency regions, respectively. Increasing the number of frequency regions in a combination improved recognition scores and reduced the differences in scores between combinations. This suggests a synergistic effect among E cues from different frequency regions. The mean weights of E cues of frequency regions 1–5 were 0.31, 0.19, 0.26, 0.22, and 0.02, respectively. Conclusions: For Mandarin disyllabic words, E cues of frequency regions 1 (80–502 Hz) and 3 (1,022–1,913 Hz) contributed more to word recognition than other regions, while frequency region 5 (3,856–7,562 Hz) contributed little.
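
The least-squares weighting described in the abstract can be illustrated with a minimal sketch. It assumes a simple additive model in which each listening condition is coded by the presence (1) or absence (0) of each of the five frequency regions, the recognition score is regressed on that coding, and the fitted coefficients are normalized to sum to 1. The abstract does not state the exact model the authors used (for example, whether scores were transformed or an intercept was included), so the function names, the model, and the synthetic scores below are all hypothetical and are not the study's data.

```python
from itertools import combinations

N_REGIONS = 5


def design_matrix(n_regions=N_REGIONS, sizes=(2, 3, 4)):
    """One row per condition: 1.0 if a region is presented, else 0.0."""
    rows = []
    for k in sizes:
        for combo in combinations(range(n_regions), k):
            rows.append([1.0 if r in combo else 0.0 for r in range(n_regions)])
    return rows


def solve_least_squares(X, y):
    """Solve the normal equations (X^T X) b = X^T y by Gauss-Jordan elimination."""
    n = len(X[0])
    # Augmented matrix [X^T X | X^T y]
    A = [
        [sum(row[i] * row[j] for row in X) for j in range(n)]
        + [sum(row[i] * yi for row, yi in zip(X, y))]
        for i in range(n)
    ]
    for col in range(n):
        # Partial pivoting for numerical stability
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(n):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][n] / A[i][i] for i in range(n)]


def relative_weights(X, y):
    """Normalize the fitted coefficients so that they sum to 1."""
    coeffs = solve_least_squares(X, y)
    total = sum(coeffs)
    return [c / total for c in coeffs]


# Hypothetical weights, used only to generate noise-free synthetic scores.
true_w = [0.4, 0.1, 0.3, 0.15, 0.05]
X = design_matrix()
scores = [100.0 * sum(x * w for x, w in zip(row, true_w)) for row in X]
weights = relative_weights(X, scores)
```

With all combinations of two, three, and four regions out of five, the design has 25 conditions, and because the synthetic scores are noise-free the normalized coefficients recover the hypothetical weights. Real recognition scores would not follow an additive model exactly, particularly given the synergistic effect the abstract reports, so the fit in practice is only an approximation.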