- 1MED-EL GmbH, Innsbruck, Austria
- 2ENT/Audiological Acoustics, University Hospital, Goethe University Frankfurt, Frankfurt, Germany
Background: One factor which influences the speech intelligibility of cochlear implant (CI) users is the number and the extent of the functionality of spiral ganglion neurons (SGNs), referred to as “cochlear health.” To explain the interindividual variability in speech perception of CI users, a clinically applicable estimate of cochlear health could be insightful. The change in the slope of the electrically evoked compound action potential (eCAP) amplitude growth function (AGF) in response to an increased interphase gap (IPG), termed IPGEslope, has been introduced as a potential measure of cochlear health. Although this measure has been widely used in research, its relationship to other parameters requires further investigation.
Methods: This study investigated the relationship between IPGEslope, demographics and speech intelligibility by (1) considering the relative importance of each frequency band to speech perception, and (2) investigating the effect of the polarity of the stimulating pulse. The eCAPs were measured in three different conditions: (1) forward masking with an anodic-leading pulse (FMA), (2) forward masking with a cathodic-leading pulse (FMC), and (3) alternating polarity (AP). This allowed the investigation of the effect of polarity on the diagnosis of cochlear health. For an accurate investigation of the correlation between IPGEslope and speech intelligibility, a weighting function was applied to the measured IPGEslopes on each electrode in the array to consider the relative importance of each frequency band for speech perception. A weighted Pearson correlation analysis was also applied to compensate for the effect of missing data by giving higher weights to the ears with more successful IPGEslope measurements.
Results: A significant correlation was observed between IPGEslope and speech perception in both quiet and noise for between-subject data, especially when the relative importance of frequency bands was considered. A strong and significant correlation was also observed between IPGEslope and age when stimulation was performed with cathodic-leading pulses, but not for the anodic-leading pulse condition.
Conclusion: Based on the outcome of this study it can be concluded that IPGEslope has potential as a relevant clinical measure indicative of cochlear health and its relationship to speech intelligibility. The polarity of the stimulating pulse could influence the diagnostic potential of IPGEslope.
1. Introduction
Cochlear implants (CIs) are the treatment of choice to restore hearing in patients with severe to profound hearing loss (HL). The success of the treatment depends on individual factors such as the cognitive abilities of the patient or the reaction of the immune system to the implant, as well as on implant type and the depth of insertion of the electrode array. One influential factor is the condition of the cochlea, specifically the survival and the physiological status of the spiral ganglion neurons (SGNs). Although the importance of this factor is clear, relevant data are sparse.
Spiral ganglion neurons are the target neurons for electrical stimulation with cochlear implants. Large variations have been documented in the number and condition of surviving SGNs in CI recipients, and this could contribute to the similarly large variability in auditory performance observed (Seyyedi et al., 2014). The parameters describing the status of the auditory nerve include the number of SGNs (neural count), the presence, density, and myelination of their peripheral processes (PP), and metabolic and genetic factors. In this paper, we will use “cochlear health” as a generally inclusive term to encompass all of these parameters.
Some of the early attempts to correlate speech recognition in CI users with cochlear health used post-mortem histology of the temporal bone. These studies showed either negative (Nadol et al., 2001) or no correlation (Khan et al., 2005) between residual SGN counts and word recognition. Potential reasons for the lack of correlation include the long duration between speech recognition testing and histological analysis, the limited datasets, the use of pooled data across the electrode array, the use of only neural count, excluding condition, and the inter-individual differences in cognitive abilities. Seyyedi et al. (2014) presented the first study data which showed a positive correlation between the number of surviving SGNs and word recognition score. In a within-subject comparison of left and right ears, eliminating any between-subject confounding factors, the number of SGNs was consistently higher in the ear which produced the better word recognition scores. Despite the positive correlation observed between cochlear health and auditory performance in CI users, several of the limitations of previous studies were also present here. The study was also conducted post-mortem, the demonstrated differences were small, histological data were again pooled across the cochlea, and only very limited data were available.
Another aspect of cochlear health, genetic factors, can influence the function of the auditory nerve (Shearer et al., 2017; Shearer and Hansen, 2019; Usami and Nishio, 2022). Several attempts have been made to establish Cochlear Nerve Deficiency (CND) as a reference model for verifying cochlear health measures (He et al., 2018, 2020; Xu et al., 2020). However, these findings were not as expected, and some measures for cochlear health were found to be contrary to the initial hypothesis (Xu et al., 2020). This may be due to the pathogenesis of CND, in which the status of the auditory nerve is affected during embryogenesis (Jackler et al., 1987) and the frequent presence of concurrent neurological deficits (Huang et al., 2010). These findings may suggest that patients with CND may not be a suitable model for general investigation of cochlear health (Xu et al., 2020).
Electrically evoked compound action potentials (eCAPs) may provide a means for the estimation of cochlear health in live individuals. The eCAP represents the synchronous ensemble activity of electrically stimulated auditory nerve fibers, and has the same neural origin as Wave I of the electrically evoked auditory brainstem response (eABR). Its primary constituents are a negative (N1) peak, which occurs at approximately 0.2–0.4 ms after stimulus onset, followed by a positive (P2) peak at 0.6–0.8 ms. Several characteristics of the eCAP could potentially be examined to glean information regarding cochlear health (van Eijl et al., 2017). DeVries et al. (2016) observed a negative correlation between eCAP amplitude and behavioral thresholds, and reported a tendency toward better speech perception for subjects with higher eCAP amplitude and lower behavioral thresholds. Kim et al. (2010) observed a significant correlation between the slope of eCAP amplitude growth function (AGF) and speech recognition in quiet and in noise, although only for the sub-group of participants with short electrode arrays.
To mitigate the site-specific variation in eCAP response caused by non-neural factors such as the distance between each electrode and its target SGNs, as well as tissue or bone growth, Prado-Guitierrez et al. (2006) introduced a method based on alteration of the interphase gap (IPG, Figure 1A). The IPG is the short zero-amplitude portion between the anodic and cathodic phases of a symmetric, charge-balanced biphasic pulse used in CIs. The effect of increasing the IPG (IPG effect, IPGE) of the stimulating pulse on eCAP characteristics has become widely used as a measure of cochlear health. Ramekers et al. (2014) measured the IPGE on different eCAP AGF characteristics such as amplitude, threshold, slope, and latency in implanted normal-hearing (NH) and pharmaceutically deafened guinea pigs to investigate the consequences of secondary degeneration of SGNs after severe hair cell loss through chemical ablation. A significant correlation between spiral ganglion cell packing density and the IPGE on some of the AGF characteristics, including slope, was shown.
Figure 1. Illustration of the methods employed to calculate IPGEslope (A) and IPGEoffset (B). The eCAP AGFs obtained by IPG 2.1 and 30 μs are plotted in black and gray, respectively. In panel (A), the green lines mark the AGFs steepest slopes. In panel (B), the green horizontal lines indicate the offset between the AGFs with short and long IPGs for several N1-P2 amplitudes.
The same method was used by Schvartz-Leyzac and Pfingst (2016) for estimation of cochlear health in human subjects. The authors measured the IPGE on the AGF slope (IPGEslope) and observed across-site profiles reflecting the local variability along the cochlea. In a subsequent study (Schvartz-Leyzac and Pfingst, 2018), each ear was represented by the across-site mean (ASM) of the measured IPGEslope for all electrodes. Similarly to Seyyedi et al. (2014), by calculating the ear-difference in the IPGEslope ASM in bilateral CI users, the between-subject bias stemming from variation in central cognitive ability could be reduced. A significant correlation between ear-differences in IPGEslope and ear-differences in speech reception threshold (SRT) was shown. The approach proposed by Schvartz-Leyzac and Pfingst (2018) was challenged by Brochier et al. (2021), who investigated the IPG effect on the eCAP AGF in computational and animal models. Based on both models, they concluded that IPGEslope did not successfully control for non-neural factors. Instead, they proposed the IPGE on offset (IPGEoffset), defined as the average dB offset between the overlapping linear regions of the two eCAP AGFs obtained with a short and a long IPG (Figure 1B), expressed on logarithmic input-output axes. The contradictory conclusions of the two above-mentioned studies emphasize the necessity of further research to clarify the suitability of each measure for estimation of cochlear health.
Speech information in different frequency bands, transmitted from different sections of the cochlea to the brain, is not of equal importance (ANSI, 1997). Because of this, two ears with the same ASM of the IPGE on eCAP characteristics may differ in speech perception if the distribution of surviving SGNs differs between each cochlea. Regardless of the type of cochlear health measure, when it comes to relating the measure to speech recognition performance, any such measure may benefit from an adjustment using a band importance function that reflects the human auditory system mechanisms of speech perception.
Between-subject variations in the site-specific cochlear capacity for transferring speech information have been considered for other measures related to speech intelligibility. The speech intelligibility index predicts the intelligibility of speech based on the sum of speech audibility in different frequency bands multiplied by the importance of each band, which is determined by that band’s contribution to the intelligibility of total speech information (ANSI, 1997). This function has been estimated in normal-hearing subjects using recognition scores of successively low- and high-pass filtered speech. The importance of a band is then determined by comparing the recognition scores across two successive cutoff frequencies (Healy et al., 2013). The perception of the masked speech signal depends on the noise spectrum and differs through variations in sub-band signal-to-noise ratio (SNR) even for the same subject, as reported by Pollack (1948), who measured the intelligibility of low- and high-pass filtered speech masked with white noise. The author reported a relative contribution of the various speech frequencies as a function of SNR. On the other hand, the speech perception of hearing-impaired individuals with different patterns in their audiograms varies even under the same noise condition. In the case of profound HL and in the absence of functional hair cells, variation in neural survival along Rosenthal’s canal influences the transmission of speech in each frequency band. We therefore conclude that it may be advisable to refine the method developed by Schvartz-Leyzac and Pfingst (2018) and to consider the speech band importance function when relating the IPGE to speech perception.
Another factor that might affect the diagnostic power of the IPGE is the polarity of the stimulating pulse. Histological studies have shown that SGN degeneration occurs over an extended period of time in humans and that, after loss of supporting cells and degeneration of their PPs, SGNs can survive as monopolar neurons (Liu et al., 2015; Wu et al., 2019). Models predict that SGNs with degenerated PPs require 5 to 6 times more current than healthy neurons to respond to cathodic-leading pulses; however, this is not true for anodic-leading pulses (Rattay et al., 2001a,b; Joshi et al., 2017; Resnick et al., 2018). Single-fiber recordings (Shepherd and Javel, 1997) and investigations of pseudo-monophasic and triphasic pulses (Undurraga et al., 2010, 2013) have provided electrophysiological evidence for this hypothesis. Biophysical considerations on the effect of externally applied electrical fields on neuronal excitation assign an important role to the polarity of the stimulus (Kalkman et al., 2022). Here, a significant factor is the orientation of the neuron in relation to the orientation of the voltage gradient caused by the electric field. In general, cathodic stimulation is seen as more effective for neuronal stimulation (Rattay, 1998). The electrode generates negative potentials in the extracellular space, such that the intracellular potential is no longer negative (Brocker and Grill, 2013) and the transmembrane potential is depolarized, leading to an action potential. This is, however, dependent on the distance between the electrode and the target neuron, which in the case of CI stimulation are never in close contact and are separated by perilymph as well as multiple tissues. In an analogous way, anodic stimulation can create “virtual cathodes” in locations distant from the electrode. CIs often evoke larger eCAPs with anodic stimulation (Macherey et al., 2008; Herrmann et al., 2021).
Degeneration and demyelination of the SGN peripheral process effectively moves the neuron further away from the electrode, compared to a healthy SGN (Kamakura et al., 2018). Factors increasing electrode-neuron distance may create a preference for anodic stimulation by an electrode. The “generalized activating function” (Rattay, 1999) observes spatial and temporal local voltage changes, which depend on extracellular voltages, axonal resistance, and membrane capacitance. On a single-cell level, the simulations showed that the amount of threshold increase following the loss of the peripheral processes essentially depended on the electrode position and the polarity of the stimulus. Degeneration and demyelination of PPs are predicted by all these models to lead to a more significant loss of efficiency for cathodic stimuli, due to more suitable alignment of excitable regions of the SGN and excitatory extracellularly applied potentials in the anodic case. However, often both anodic-leading and cathodic-leading stimuli are used in a single eCAP measurement and then averaged for purposes of artifact reduction (alternating polarity, AP), making a differentiated observation for single polarities impossible. The model outcomes motivated a comparison of the diagnostic potential of anodic-leading stimuli versus that of cathodic-leading stimuli alone for objective assessment of cochlear health. The forward masking (FM) artifact reduction method uses two biphasic pulses with the same leading phase polarity, most commonly a cathodic-leading pulse (He et al., 2017). It is also possible to implement FM with an anodic-leading pulse. The hypothesis of this study is not directed at the polarity effect on excitability as investigated in previous studies (Hughes et al., 2017, 2018; Jahn and Arenberg, 2019a,b), which focused on the average differences of the responses to anodic-leading and cathodic-leading stimuli.
Instead, both polarities are investigated separately in the current study to examine the sensitivity of each polarity separately for use as an electrophysiological biomarker for SGN degeneration.
In summary, the goals of this study were to investigate the relationship between IPGEslope, demographics, and speech intelligibility in CI users by (1) investigating the effect of the polarity of the stimulating pulse, and (2) considering the band importance weighting function when investigating the correlation between the speech perception measures and IPGEslope.
2. Subjects and methods
2.1. Subjects
The subject cohort consisted of 13 bilateral CI users with a mean age of 58 years (range 29–91). Detailed demographic data are given in Table 1. We recruited CI users implanted with MED-EL devices to ensure compatibility with the custom-made eCAP measurement software described below. Etiology varied between subjects, with the most common etiology being progressive sensorineural HL (10 subjects, at least in one ear). All subjects were native German speakers and were stimulated in monopolar mode. For all subjects, the FS4 coding strategy was used and the lowest frequency was set to 70 Hz. Individuals whose HL was secondary to meningitis were excluded from this study, due to the reduced incidence of recordable eCAPs in this condition (Guedes et al., 2007). Subjects 7, 9, 12, and 13 suffered from progressive hearing impairment which was detected prelingually. Subjects 7 and 13 had restricted speech development and were diagnosed with mild auditory dyslalia. Electrode 12 of the right ear of subject 12 and electrodes 4 and 5 of the left ear of subject 7 were deactivated clinically. In the left ear of subject 7, the IPGEslope measurement of electrodes 3, 7, and 10 was interrupted due to the subject’s complaint of an unpleasant sensation. The left ear of subject 7 was the only ear implanted with a relatively short electrode array (Flex24 EAS). Recruitment of subjects for this study was approved by the Ethics Committee of the Goethe University Hospital in Frankfurt (ERB number 44/19), and all subjects gave written informed consent. Subjects received an expense allowance for participation in the study.
Table 1. Demographic data and summary statistics are provided for age, duration of hearing loss until CI-implantation, hearing aid experience, CI experience and residual hearing.
2.2. Speech recognition testing (measurement procedure and stimuli)
All measurements were administered in a double-walled sound attenuating audiological booth, which fulfilled the requirements of the standard DIN EN ISO 8253-1 (2011). The stimuli were presented via calibrated loudspeaker (JBL Control 1, Harman, Garching, Germany). The subject faced the loudspeaker at a one meter distance. Two speech tests were conducted, the German matrix sentence test (Wagener et al., 1999) and the Freiburg monosyllable test (Hahlbrock, 1953). The measurements were performed separately for the two ears, removing the audio processor from the contralateral side, using the clinically adjusted audio processor configurations (i.e., threshold and most-comfortable loudness levels and compression). The microphone directional sensitivity was set to “omnidirectional” and noise reduction and automatic functions were disabled to create a uniform testing condition between the users.
2.2.1. German matrix sentence test
In this test, the participant was presented with a spoken German sentence in the presence of masking noise. Each sentence consists of five words with the structure “Name-Verb-Number-Adjective-Object.” The sentences are syntactically correct but semantically unpredictable. The participant was then asked to select on a touch screen the words they heard among ten alternatives for each word. The speech material is balanced to represent the phonetic variety of the German language. A stationary noise comprising the long-term average spectrum of the speech material (Wagener et al., 1999) was used as the masking noise. Speech presentation level was fixed at 65 dB SPL. At the beginning of the test, noise was presented at +5 dB SNR (60 dB SPL), and thereafter the SNR was adjusted automatically to adaptively measure the SNR corresponding to a 50% correct classification rate, reported as the speech reception threshold (SRT). For more general information on international language matrix tests, please refer to the manual (HörTech gGmbH, 2019).
2.2.2. Freiburg monosyllable test
This test consists of 20 lists, each containing 20 monosyllabic words. For each measurement two of the lists were randomly selected. The words were presented to the listeners at 65 dB SPL in quiet. The listeners were instructed to repeat the words. The percent of words repeated correctly was recorded. Test results of the two presented lists were averaged.
2.3. Evaluation of the speech band importance function for improving the accuracy of the outcome measure
The band importance function for one-third octave bands and for monosyllables of speech in noise introduced by ANSI S3.5 (1997, Supplementary Table 2) was adapted to the MED-EL default filter bank setting. The number of bands and the center frequencies of the one-third octave bands differ from the filter bank setting of the MED-EL speech processor. A fourth-degree polynomial was therefore fitted to the above-mentioned band importance function and evaluated at the MED-EL default center frequencies. This procedure resulted in 12 weights for the 12 electrodes. Figure 2 shows the original band importance function from ANSI S3.5 and the function adapted to the default center frequencies of MED-EL devices. The IPGEslope for each individual electrode was then multiplied by the respective adapted weight for that electrode to reflect the importance of the speech information transmitted via that band for speech intelligibility. In cases of deactivated electrodes, the same adapted weights as depicted in Figure 2 were used. For each ear, the across-site mean (ASM) of the weighted IPGEslope was calculated. These ear-specific weighted IPGE ASMs were then used to investigate the correlation between these measures of cochlear health and the two speech perception measures.
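The adaptation and weighting steps described in this section can be sketched as follows. This is a minimal illustration, not the study's implementation: the band importance values and the 12 electrode center frequencies below are placeholders (neither the ANSI S3.5 table nor the MED-EL default filter bank is reproduced here), the function names are hypothetical, and fitting on a logarithmic frequency axis is an assumption, since the axis used for the fit is not stated.

```python
import numpy as np
from numpy.polynomial import Polynomial


def adapt_band_importance(band_freqs_hz, band_importance, electrode_freqs_hz, degree=4):
    """Fit a polynomial to a band importance function and evaluate it
    at the electrode center frequencies."""
    # Assumption: the fit is done on a log10 frequency axis.
    fit = Polynomial.fit(np.log10(band_freqs_hz), band_importance, degree)
    return fit(np.log10(electrode_freqs_hz))


def weighted_asm(ipge_slopes, weights):
    """Across-site mean of the weighted IPGEslope.
    NaN marks electrodes without a usable IPGEslope; they are skipped."""
    weighted = np.asarray(ipge_slopes, dtype=float) * np.asarray(weights, dtype=float)
    return float(np.nanmean(weighted))


# Placeholder inputs (NOT the ANSI S3.5 values, NOT the MED-EL filter bank).
band_freqs = np.array([160, 200, 250, 315, 400, 500, 630, 800, 1000, 1250,
                       1600, 2000, 2500, 3150, 4000, 5000, 6300, 8000], dtype=float)
band_importance = np.exp(-0.5 * ((np.log10(band_freqs) - np.log10(2000.0)) / 0.45) ** 2)
band_importance /= band_importance.sum()
electrode_freqs = np.geomspace(180.0, 7500.0, 12)  # hypothetical center frequencies

weights = adapt_band_importance(band_freqs, band_importance, electrode_freqs)

# Hypothetical per-electrode IPGEslope values; NaN = no accepted measurement.
ipge = np.array([0.8, 1.1, np.nan, 0.9, 1.4, 1.2, 1.0, 0.7, np.nan, 0.6, 0.5, 0.4])
asm = weighted_asm(ipge, weights)
```

Skipping missing electrodes with `nanmean` is one possible way of handling rejected measurements; the text does not specify how they enter the mean.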
Figure 2. One-third octave band importance function for monosyllables of speech in presence of noise (Supplementary Table 2, ANSI S3.5 1997) in blue and adapted to MED-EL default filter bank setting in red.
2.4. Impedance measurement
Electrode impedances were measured via impedance field telemetry (IFT) using the clinical software (MAESTRO 7.0, MED-EL Medical Electronics, Innsbruck, Austria) with the MAX Programming Interface (MED-EL Medical Electronics, Innsbruck, Austria) and a suitable coil. This results in measurements of the implant’s supply voltage, and the impedance values of the 12 implanted electrode contacts. For the purpose of the study, the measurement results were exported (using the scientific export) from the clinical software into XML files, and the supply voltage and the electrode impedance values were extracted for determining the compliance limits. The extracted values were also used in the statistical analysis of the results.
2.5. Loudness-based measurements
A custom-made MATLAB-based (The MathWorks, Natick, MA, United States) research software was used to perform loudness-based measurements with pulse-forms and sequences identical to those used in the following experimental measurements. The MATLAB program communicated with the implant using the Research Interface Box 2 (RIB2) Dynamic Link Library (dll), rib2.dll, Version 1.73.0.0, 64 bit (Department of Ion Physics and Applied Physics, University of Innsbruck, Innsbruck, Austria), the MAX Programming Interface and a suitable coil. Threshold (THR) and maximum acceptable level (MAL) stimulation charges were measured using manual control for all active electrodes using cathodic-leading biphasic stimuli with IPG of 2.1 μs, and repeated for an IPG of 30 μs, in sequences of at least 400 ms duration to allow sufficient loudness integration. The phase duration was the same as that selected for the following eCAP measurements. The amplitude could be increased up to the compliance level which was calculated after impedance measurement.
2.6. eCAP measurements
A custom-made MATLAB-based tool was used for eCAP measurements, taking into account the THR and MAL values from the loudness fitting tool. Communication with the implant was the same as for the loudness-based measurement software.
Electrically evoked compound action potentials were measured using the forward masking (FM) artifact reduction approach [Brown et al. (1990), illustrated in Figure 1 of Baudhuin et al. (2016)]. This method exploits the absolute refractory period of SGNs by implementing a double-pulse paradigm with a sufficiently short inter-pulse-interval. Theoretically, the technique results in a voltage trace containing the neural response alone. More details on currently applied artifact reduction techniques have been reviewed by He et al. (2017). FM artifact reduction allows investigation of the auditory nerve response to pulses with a specific initial polarity. In this study, all eCAPs were measured both with anodic-leading (FMA) and with cathodic-leading (FMC) pulses, in order to investigate the polarity-specific behavior of neural responses. For each parameter set, 50 sweeps were recorded and averaged. An AP artifact reduction method was implemented by averaging the FMA and FMC probe responses. It should be noted that FMA and FMC probes were not measured consecutively, which is different to how AP is implemented in clinical software. In all cases, eCAPs were measured using two IPGs, 2.1 and 30 μs. The combination of polarities and IPGs resulted in six different conditions: FMA 2.1, FMA 30, FMC 2.1, FMC 30, (virtual) AP 2.1, and (virtual) AP 30.
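The virtual AP condition and the resulting six measurement conditions can be illustrated with a short sketch. The traces below are synthetic and the function name is hypothetical; the example only shows that averaging two responses cancels any residual component that inverts with stimulus polarity, under the simplifying assumption that the neural response is identical for both polarities.

```python
import numpy as np


def virtual_alternating_polarity(fma_trace, fmc_trace):
    """Average an anodic-leading (FMA) and a cathodic-leading (FMC)
    probe response to obtain a virtual alternating-polarity trace."""
    fma = np.asarray(fma_trace, dtype=float)
    fmc = np.asarray(fmc_trace, dtype=float)
    if fma.shape != fmc.shape:
        raise ValueError("FMA and FMC traces must have the same length")
    return 0.5 * (fma + fmc)


# Synthetic demonstration (all values illustrative, in volts and seconds).
t = np.linspace(0.0, 1.5e-3, 200)
neural = (-100e-6 * np.exp(-((t - 0.3e-3) / 0.1e-3) ** 2)
          + 80e-6 * np.exp(-((t - 0.7e-3) / 0.15e-3) ** 2))
artifact = 50e-6 * np.exp(-t / 0.2e-3)  # residual artifact that flips with polarity

ap = virtual_alternating_polarity(neural + artifact, neural - artifact)

# The six conditions: three polarity conditions times two IPGs (in microseconds).
conditions = [(polarity, ipg_us)
              for polarity in ("FMA", "FMC", "AP")
              for ipg_us in (2.1, 30.0)]
```

In real recordings the neural responses to the two polarities differ, so the averaged trace is a mixture of both rather than a clean recovery of either.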
ECAP recordings were performed in monopolar configuration at a rate of approximately 60 Hz using the standard stimulation ground of the implant. Recording electrodes were by default set to the next more apical active electrode relative to the stimulating electrode (n−1, where n is the electrode number), except for electrode 1 (the most apical electrode), which had electrode 2 as the default recording electrode. When necessary, the location of each recording electrode could be altered by the investigator to yield a clear eCAP based on visual inspection following an initial test pulse. The recording electrode was neighboring the stimulating electrode in nearly all cases. ECAP recordings were obtained with high temporal resolution (stimulator internal sampling rate 1.2 MHz). The measurement delay was set to 120 μs for stimuli with IPG = 2.1 μs and to 149 μs for stimuli with IPG = 30 μs to compensate for the different durations of the respective biphasic stimuli. Both masker and probe signals were measured with the same recording electrode pair. The masker level was 10% higher than that of the probe, except for the highest amplitude step, for which the masker was set to MAL and the probe was set to 95% of MAL. This procedure led to a smaller increase for the last stimulus step in the AGF and to potentially less effective masking; however, it ensured that MAL would not be exceeded.
The default phase duration was set to 30 μs and the default masker-probe interval to 350 μs, but both parameters could be adjusted by the investigator when necessary. For subjects 7 and 10 the phase duration was increased to 40 and 50 μs, respectively, to record eCAPs with sufficient reliability. Subject 14 had different phase durations in right (30 μs) and left (50 μs) ears. AGFs were recorded on all active electrodes, with 10 amplitude steps between and including threshold and MAL, as well as two sub-threshold measurements. The amplitude increment was equidistant. In the case of 12 clinically active electrodes, this resulted in a total of 576 eCAPs for each ear (12 electrodes × 12 current levels × 2 polarities × 2 IPGs).
To further reduce the influence of the measurement noise, the recorded eCAPs were filtered with a fifth-order Butterworth lowpass filter having a cut-off frequency of 5 kHz. Zero-phase filtering was applied, which effectively doubled the filter order to ten. To prevent a potential tilt of the responses, caused by residual stimulus artifact, from biasing the determination of eCAP characteristics, the filtered eCAPs were detrended. To estimate the trend, a weighted linear least-squares analysis was applied to the eCAP. Each eCAP trace was divided into three equally long parts, and samples within the first, second and third part were multiplied by weights of 0.1, 0.5, and 1, respectively, to emphasize the tail of the response in the estimation of the trend, where the presence of the artifact is more pronounced. This allowed more accurate estimation of the trend. This estimated trend was then subtracted from the eCAP. Finally, in order to eliminate the artifact caused by internal circuitry, the response to the lowest subthreshold current level was subtracted from the detrended eCAP.
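The weighted detrending step can be sketched as follows. This is an illustrative reading of the description, not the authors' code: it assumes that the weights of 0.1, 0.5, and 1 act on the squared residuals of a straight-line fit, and it omits the preceding low-pass filtering. The function names are hypothetical.

```python
import numpy as np


def weighted_linear_detrend(trace, part_weights=(0.1, 0.5, 1.0)):
    """Remove a linear trend estimated by weighted least squares.

    The trace is split into three parts of (nearly) equal length and each
    part gets its own weight, so that the tail dominates the trend estimate.
    Returns (detrended_trace, trend).
    """
    y = np.asarray(trace, dtype=float)
    n = y.size
    x = np.arange(n, dtype=float)

    # Per-sample weights for the first, second and third part of the trace.
    w = np.empty(n)
    edges = [0, n // 3, (2 * n) // 3, n]
    for k, part_weight in enumerate(part_weights):
        w[edges[k]:edges[k + 1]] = part_weight

    # Assumption: minimise sum_i w_i * (y_i - a - b * x_i)^2.
    # Scaling rows by sqrt(w_i) turns this into an ordinary least-squares problem.
    sqrt_w = np.sqrt(w)
    design = np.column_stack([np.ones(n), x]) * sqrt_w[:, None]
    coef, *_ = np.linalg.lstsq(design, y * sqrt_w, rcond=None)

    trend = coef[0] + coef[1] * x
    return y - trend, trend


def remove_circuit_artifact(detrended_trace, subthreshold_trace):
    """Subtract the response to the lowest subthreshold level.
    Assumption: the reference trace was preprocessed in the same way."""
    return (np.asarray(detrended_trace, dtype=float)
            - np.asarray(subthreshold_trace, dtype=float))
```

For a trace that is a pure straight line, the estimated trend equals the trace and the detrended output is zero; for real traces, the early samples containing N1 and P2 influence the fit only weakly because of their low weight.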
2.7. AGF slope calculation and IPGEslope
The extrema of the eCAP amplitude time-course falling into the time intervals of approximately 0.02–0.4 ms and 0.3–0.8 ms were chosen to determine the N1 and P2 peaks, respectively. The earliest extremum of the corresponding time interval was chosen as the N1 peak. For the detection of P2, the extremum with the largest amplitude was selected to allow a consistent definition of the peak in cases of double peaks. The default time intervals were modified if no peak was detected (in particular in the case of a missing N1). Absence of the N1 peak could be driven by an early response of the neurons occurring during the blanking delay and therefore hidden from the measurement system (Lai and Dillier, 2000). Therefore, in such a case, the N1 peak was defined arbitrarily as the amplitude of the signal at 0.03 ms after stimulus onset. For the selection of the P2 peak, the above-mentioned time interval was expanded toward the onset of the signal to compensate for the delay in the recording and to allow the detection of any peaks that occurred earlier than expected.
Amplitude growth functions, i.e., the N1-P2 peak amplitude difference as a function of stimulating current level, were calculated for each electrode, each IPG, each polarity and each artifact reduction approach (FM/AP). An automatic AGF selection was performed so that only AGFs with adequate reliability were analyzed. The criteria for AGF selection were based on the maximum eCAP amplitude (the N1-P2 peak amplitude for the highest current level must be larger than 120 μV), the impedance of the stimulating electrode (must be lower than 10 kΩ), the monotonicity of the AGF, and the comparison between the maximal AGF slope and the slope of a line fitted to the first three points of the AGF (the responses to the subthreshold and threshold currents) as an estimation of the artifact. For the last criterion, the slope difference should be larger than 0.5 μV/μA unless the slope of the line fitted to the eCAP amplitudes measured with subthreshold and threshold currents was smaller than 0.3 μV/μA, indicating a mild level of artifact.
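The selection criteria listed above could be combined along the following lines. This is a hedged sketch: the function and parameter names are hypothetical, monotonicity is interpreted as a non-decreasing AGF without any noise tolerance, and the maximal AGF slope is passed in as a precomputed value in μV/μA.

```python
import numpy as np


def agf_is_acceptable(current_ua, amplitude_uv, impedance_kohm, max_slope_uv_per_ua,
                      min_amplitude_uv=120.0, max_impedance_kohm=10.0,
                      min_slope_difference=0.5, mild_artifact_slope=0.3):
    """Return True if an AGF passes all four selection criteria."""
    current = np.asarray(current_ua, dtype=float)
    amplitude = np.asarray(amplitude_uv, dtype=float)

    # 1) N1-P2 amplitude at the highest current level must exceed 120 uV.
    if amplitude[-1] <= min_amplitude_uv:
        return False
    # 2) Impedance of the stimulating electrode must be below 10 kOhm.
    if impedance_kohm >= max_impedance_kohm:
        return False
    # 3) Monotonicity (assumption: non-decreasing, no tolerance for noise).
    if np.any(np.diff(amplitude) < 0):
        return False
    # 4) Artifact estimate: slope of a line through the first three points
    #    (two subthreshold levels and threshold).
    artifact_slope = float(np.polyfit(current[:3], amplitude[:3], 1)[0])
    if artifact_slope < mild_artifact_slope:
        return True  # mild artifact: slope-difference check not required
    return (max_slope_uv_per_ua - artifact_slope) > min_slope_difference


# Illustrative AGF with 12 current levels (values are made up).
current = np.linspace(100.0, 650.0, 12)
amplitude = np.array([2, 3, 4, 10, 40, 90, 150, 220, 290, 350, 400, 440], dtype=float)
accepted = agf_is_acceptable(current, amplitude, impedance_kohm=5.0,
                             max_slope_uv_per_ua=1.4)
```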
The AGF slopes were estimated according to the window method introduced by Skidmore et al. (2022). The input was converted to charge (nC). ECAP AGFs were resampled at 13 data points to handle the non-uniformly sampled data in the original AGF. Subsequently, first order linear functions were fitted to different subsections of the resampled AGF and the slope of the steepest linear function was determined as the slope of the eCAP AGF. Each subset consisted of four points of the resampled AGF, with overlap of three points between subsequent subsets. Only the data points above the noise floor (set to 20 μV) were considered for the analysis. The final AGF slope was the maximum slope that had been determined using this moving window approach. IPGEslope was then obtained by subtracting the AGF slope calculated for IPG 2.1 μs from the AGF slope for IPG 30 μs. Figure 1 depicts two exemplary AGFs obtained with IPG 2.1 μs and IPG 30 μs and their corresponding maximum slope.
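A minimal sketch of the moving-window slope estimate and the resulting IPGEslope is given below. It is not the implementation of Skidmore et al. (2022): linear interpolation for the resampling and discarding resampled points at or below the noise floor before forming the windows are assumptions, and the function names are hypothetical.

```python
import numpy as np


def agf_max_slope(charge_nc, amplitude_uv, n_resample=13, window=4, noise_floor_uv=20.0):
    """Steepest slope of straight lines fitted to sliding 4-point windows
    of a resampled AGF (windows overlap by three points)."""
    q = np.asarray(charge_nc, dtype=float)    # must be in ascending order
    a = np.asarray(amplitude_uv, dtype=float)

    # Resample onto a uniform charge grid (assumption: linear interpolation).
    q_rs = np.linspace(q.min(), q.max(), n_resample)
    a_rs = np.interp(q_rs, q, a)

    # Keep only points above the noise floor.
    keep = a_rs > noise_floor_uv
    q_rs, a_rs = q_rs[keep], a_rs[keep]
    if q_rs.size < window:
        return float("nan")  # not enough points for a single window

    slopes = [np.polyfit(q_rs[i:i + window], a_rs[i:i + window], 1)[0]
              for i in range(q_rs.size - window + 1)]
    return float(max(slopes))


def ipge_slope(slope_ipg_2p1, slope_ipg_30):
    """IPGEslope: AGF slope at IPG 30 us minus AGF slope at IPG 2.1 us."""
    return slope_ipg_30 - slope_ipg_2p1
```

With the input in nC and amplitudes in μV, the returned slope is in μV/nC; an AGF that never rises above the noise floor yields NaN rather than a slope.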
2.8. Statistical analysis and outliers
All data were analyzed using MATLAB 2020a. Single and multiple linear regressions were employed to investigate the relationship between the IPGEslope and speech test outcomes, demographics and electrode impedances. The coefficient of determination (R2) was calculated based on the Pearson correlation coefficients in each case. These were reported together with the corresponding level of significance. In the case of multiple linear regression, the adjusted R2 was reported to compensate for the effect of over-fitting caused by the moderate sample size of this study. The adjusted R2 was defined as

$$R_{adj}^{2} = 1 - \frac{(1 - R^{2})(n - 1)}{n - k - 1}, \tag{1}$$

where n and k were the sample size and the number of independent variables, respectively.
In addition to the standard correlation, a weighted Pearson correlation analysis was also implemented to account for missing data due to rejected AGFs, i.e., cases in which the criteria for the automatic AGF selection were not satisfied. The correlation coefficient was calculated according to

$$r_w = \frac{\sum_{i=1}^{n} w_i (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} w_i (x_i - \bar{x})^{2} \, \sum_{i=1}^{n} w_i (y_i - \bar{y})^{2}}}, \tag{2}$$

where xi and yi were samples of the independent and dependent variables of length n, and x̄ and ȳ were the corresponding mean values. In cases where each sample was an ear (analysis of monaural data), the weight for each sample, wi, was the number of electrodes with an accepted AGF for both IPG 2.1 μs and IPG 30 μs divided by the total number of electrodes (12). For the analysis of the ear-differences data, the weight wi was the average of the weights for the two ears of the subject. For example, for subject 4 in the FMA condition, an acceptable AGF was obtained for both employed IPGs on 10 electrodes in the right ear and 2 electrodes in the left ear, resulting in weights of 0.8333 and 0.1667, respectively. For analyzing ear-differences, a weight of 0.5 (the average of 0.8333 and 0.1667) was therefore applied.
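The weighting scheme can be made concrete with a short sketch. The function names are hypothetical, and using weighted means for x̄ and ȳ is an assumption (the text only calls them "the corresponding mean values").

```python
import math


def ear_weight(n_accepted, n_total=12):
    """Weight of one ear: fraction of electrodes with an accepted AGF at both IPGs."""
    return n_accepted / n_total


def weighted_pearson(x, y, w):
    """Weighted Pearson correlation coefficient of x and y with weights w."""
    sw = sum(w)
    # Assumption: the means are weighted means.
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    cov = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    var_x = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    var_y = sum(wi * (yi - my) ** 2 for wi, yi in zip(w, y))
    return cov / math.sqrt(var_x * var_y)


# Subject 4, FMA: 10 accepted electrodes on the right, 2 on the left.
w_right = ear_weight(10)
w_left = ear_weight(2)
w_ear_difference = (w_right + w_left) / 2
```

With equal weights the function reduces to the standard Pearson coefficient, so the two analyses differ only in how much each ear contributes.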
In the context of weighted Pearson correlation, to calculate the corresponding level of significance, the test value for the Student’s t-distribution was defined as

$$t = r_w \sqrt{\frac{n_w - 2}{1 - r_w^{2}}}, \tag{3}$$

where rw was calculated according to Eq. (2) and nw was the effective sample size, defined as the exponential of the entropy of the weights, with the weights normalized to their sum.
The p-value corresponding to the adapted t-value in Eq. (3) was calculated using MATLAB’s default numerical methods, as for the standard p-value. The degrees of freedom were defined as

$$df = n_w - 2. \tag{4}$$
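The effective sample size and the adapted t-value can be sketched as below. The function names are illustrative, and the use of the natural logarithm for the entropy is an assumption that matches taking its exponential.

```python
import math


def effective_sample_size(weights):
    """Exponential of the Shannon entropy of the weights normalized to sum 1."""
    total = sum(weights)
    p = [w / total for w in weights if w > 0]
    entropy = -sum(pi * math.log(pi) for pi in p)
    return math.exp(entropy)


def weighted_t_value(r_w, n_w):
    """Adapted t-value and degrees of freedom for a weighted correlation."""
    df = n_w - 2
    return r_w * math.sqrt(df / (1 - r_w ** 2)), df


# Equal weights recover the ordinary sample size (here 20 ears).
n_equal = effective_sample_size([1.0] * 20)

# Rounded values reported for FMA (Rw = -0.50, df = 18.10, i.e., nw = 20.10).
t_fma, df_fma = weighted_t_value(-0.50, 20.10)
```

With the rounded values reported for the FMA condition this gives t of about −2.46, close to the reported −2.45. A two-sided p-value would then follow from the Student's t survival function, for instance `scipy.stats.t.sf`.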
To identify outliers for the variables age, duration of hearing loss until implantation (DHL), hearing aid experience (HAE) and CI experience (CIE), the quartiles outlier test was applied. The coefficients of determination for the ranges 0.0–0.3, 0.3–0.6, and 0.6–1.0 were categorized as weak, moderate and strong, respectively.
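A minimal sketch of the outlier rule and the strength categories follows. It assumes that the quartiles test flags values more than 1.5 interquartile ranges outside the quartiles, which is how MATLAB's `isoutlier` with the `'quartiles'` method defines it; the quartile interpolation used here may differ slightly from MATLAB's, and assigning the shared boundaries 0.3 and 0.6 to the upper category is also an assumption.

```python
import statistics


def quartile_outliers(values):
    """Flag values more than 1.5 IQR below Q1 or above Q3."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v < low or v > high for v in values]


def categorize_r2(r2):
    """Map a coefficient of determination to weak / moderate / strong."""
    if r2 < 0.3:
        return "weak"
    if r2 < 0.6:
        return "moderate"
    return "strong"
```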
3. Results
3.1. Patient data
Table 1 contains subject demographic data. Subjects 9 and 16 were the youngest and oldest participants of this study. Six subjects had residual hearing. The extent of residual hearing was comparable among this subgroup. Subject 7L was the only case of implantation with a short electrode array (Table 1). This subject had the longest duration of HL, and the CI experience was below the mean value of the group data. No residual hearing was observed for this subject at the time of experiment. Subject 2 was the only subject who showed no ear-difference in any of the investigated demographic data (duration of HL, hearing aid experience and CI experience). This subject was implanted with the same electrode array type and wore the same type of speech processor on both sides. In general, the variation in electrode array type was low among the ears tested. Most of the ears were implanted with a 28 mm long electrode array. There was a small difference in electrode length in a few subjects. The ear-differences in duration of HL, hearing aid experience and CI experience were the same because there was no ear-difference in the onset of HL and hearing aid use in any subject. Therefore, the time of implantation was the only cause of variation in all these three demographic data for the ear-differences.
3.2. Speech test outcomes and individual patient factors
Figure 3 shows the SRT and percent correct results for the German matrix sentence test and the FMT, respectively (ranked according to the best-ear SRT in a descending order). In general, the SRTs ranged between −5 and 12 dB SNR, with only subject 13 and subject 7L showing SRTs higher than 1 dB SNR. These two subjects were the only ones with (mild) auditory dyslalia. Subject 13 was one of the few subjects of this study who had had a long duration of progressive prelingual HL which probably had a detrimental influence upon speech perception. All these factors have the potential to manifest themselves in the SRT outcome. As subject 13 had notably worse SRT scores in both ears than the other subjects, subject 13 was determined as an outlier and excluded from the analysis related to correlation between speech scores and cochlear health metrics, but was included in the rest of the analysis (with demographic data). Subjects 2, 9, and 10 reached the lowest (best) SRTs. Only subjects 6, 7, 11, 13, and 16 showed ear-differences larger than 1 dB SNR. Given the high measured SRTs for subject 7L and subject 13 (both ears), at least part of the ear-difference in SRT might fall into the test-retest reliability for the matrix sentence test (Hey et al., 2014).
Figure 3. (A) Depicts the speech reception thresholds (SRT, dB SNR) measured for the German matrix sentence test for the right ear (red bars), left ear (blue bars) and the absolute value of the ear-differences (right-left, black bars) sorted according to the SRT for the best ear in descending (improvement in speech intelligibility) order. (B) Displays the outcome of the Freiburg monosyllable test (% correct). The subject order and the display are the same as in (A).
For the FMT, scores ranged between 25 and 95%. No outliers were identified for this test. In the monaural condition, a strong and highly significant correlation (R2 = 0.67***, p-val = 0.00, t-val = −6.97, df = 24, R = −0.81) was observed between the outcomes of the two tests (Figure 4, left panel). Both tests ranked subjects 9 and 13 as good and poor performers, respectively. Nevertheless, for ear-differences, a comparison of the tests revealed differences in their outcome. A clear difference was apparent for subject 8, who showed the smallest ear-difference for SRT and one of the largest differences for FMT. No correlation was observed between the outcomes of the tests for ear-differences (Figure 4, right panel). Furthermore, weak but significant correlations were found between monaural SRTs and the demographic variables age (R2 = 0.20*) and CI experience (R2 = 0.21*) (data not shown).
Figure 4. The correlation between the two speech intelligibility measures, i.e., speech reception threshold (SRT) and Freiburg monosyllable test scores (FMT scores), for the monaural data (left panel) and the ear-differences (right panel). Each subject is represented by a number according to Tables 1, 2. Red, blue and black indicate data from the right ear, the left ear and ear-differences, respectively. Black solid lines indicate significant correlations. Dashed gray regression lines indicate non-significant correlations. ***p-value ≤ 0.001.
3.3. IPGEslope – Individual electrodes and across-site mean (ASM)
Figure 5 shows the measured IPGEslope for all the subjects. Each subfigure shows the calculated IPGEslope for individual electrodes (Electrode 1: the most apical, Electrode 12: the most basal) and their corresponding ASM for one ear. The right and left ears are indicated with red and blue. Circles, squares and triangles mark the three conditions FMA, FMC and AP, respectively. Black crosses indicate clinically deactivated electrodes.
Figure 5. The calculated IPG effect on slope (IPGEslope) for each of the 12 electrodes (1: most apical and 12: most basal) and their corresponding across site mean (ASM) values. Each subfigure shows the data of one ear. The data from right (B) and left (A) ears are coded in red and blue. Circles, squares, and triangles mark forward masking with anodic-leading pulse (FMA), forward masking with cathodic-leading pulse (FMC) and alternating polarity (AP) conditions. Crosses indicate the clinically deactivated electrodes. Arrows indicate IPGEslope with magnitudes larger than 40 (μV/ nC). For two out of the three conditions for each electrode, the data points are slightly shifted to the left and right to improve visibility.
An exemplary case of a successful measurement is subject 16. For subject 16L an acceptable monotonic AGF was obtained for all electrodes and both polarities of the stimulating pulse. Subject 5 is an example of incomplete measurements. For this subject, successful eCAP measurements were obtained for only four electrodes. The calculated IPGEslope values for these electrodes were relatively low, possibly indicating poor cochlear health. A clear variation in IPGEslope along the cochlea was observed in subject 2L, with electrodes 2 and 9 showing higher IPGEslope values and electrodes 4 and 11 showing lower values. For some subjects, such as subject 9, a large difference was observed between the right and left ears in terms of the number of electrodes for which IPGEslope was available. For subject 9L (who had a congenital component to their HL), successful measurements were obtained for all three conditions for most of the electrodes. For the right ear of this subject, eCAP measurement was not possible.
For the three conditions further differences were observed between the ears. For example, for subject 4R, the difference in IPGEslope between conditions was minor. For subject 10R, however, a difference between polarities was apparent in the estimated IPGEslope. In the case of FMA and AP, measurements were successful for most of the electrodes, in contrast to FMC which resulted in successful measurements on only 2 electrodes. A noticeable difference was also observed between FMA and AP for the estimated IPGEslope. Subject 7L and subject 9R were excluded from all analyses due to the extent of the missing data for these ears. Both subjects had a congenital component to their HL. In cases of analyzing ear-differences, data of subject 7 and 9 was excluded for both ears since calculation of ear-differences was not possible.
A comparison of IPGEslope measured in this study with the ones measured in a guinea pig model (Figure 7 of Ramekers et al., 2014) showed a smaller magnitude of this cochlear health measure for human subjects. For the 6-week deafened animals in the study of Ramekers et al. (2014) which are comparable to the human subjects of this study in terms of the degree of HL, the measured IPGEslope was 6 to 8 (depending on the phase duration) times larger than those measured in this study for AP condition.
3.4. Correlation between IPGEslope and speech test outcomes
Figure 6 shows a scatter plot of the SRT as a function of IPGEslope ASM for the three conditions (FMA, FMC and AP). The upper panel shows the standard Pearson correlation results. In this condition all ears contributed equally to the obtained coefficient of determination, regardless of the extent of missing electrode data. The middle panel shows the weighted Pearson correlation. The size of the number labels representing individual ears was scaled according to the corresponding weight for that ear e.g., the subject 16L has the largest label because, for this ear, eCAPs were measured successfully for all 12 electrodes. In contrast, subject 10L with successful eCAP measurement on only three electrodes has one of the smallest labels.
Figure 6. The correlation between speech reception threshold (SRT) and IPGEslope ASM for monaural data. (A) Shows the result of standard Pearson correlation analysis where each ear contributed equally to the calculated correlation coefficient. (B) Depicts the results obtained by applying the weighted Pearson correlation analysis where ears with less missing data contributed more to the calculation of the correlation coefficient. (C) Shows the result of the weighted Pearson correlation between SRT and weighted IPGEslope to account for both missing data and the relative importance of frequency bands for speech intelligibility. The right and left ears are presented in red and blue, respectively. Subject numbers are in accordance with Table 1. The print size of each number in panels (B,C) is proportional to the number of electrodes with successful eCAP AGF measurement for that ear. Black solid lines indicate significant correlations. Dashed gray regression lines indicate non-significant correlations. *p-value ≤ 0.05, **p-value ≤ 0.01.
In the bottom panel, the weighting function was applied to the measured IPGEslope on individual electrodes before calculation of the ASM of each ear in order to take into account the relative contribution of each electrode’s assigned frequency band to speech intelligibility. Here too, a weighted Pearson correlation was calculated. No correlation was observed between IPGEslope and SRT when the effect of missing data was not compensated for and when the relative importance of frequency bands was not taken into account (upper panel). For FMC, a weighted Pearson analysis resulted in a weak but significant correlation (middle panel). In general, for both polarities (FMA and FMC) the highest correlation was observed when the effect of missing data was compensated for and the relative importance of frequency bands for speech intelligibility was considered (FMA: Rw2 = 0.25*, p-val = 0.02, t-val = −2.45, df = 18.10, Rw = −0.50; FMC: Rw2 = 0.33**, p-val = 0.00, t-val = −2.95, df = 17.75, Rw = −0.57). For AP, however, this analysis showed a trend but no significant correlation (lower panel).
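Applying a band-importance weighting before averaging across electrodes can be sketched as follows. The weight values in the usage example are placeholders and not the weighting function of Figure 2; normalizing by the summed weights of the available electrodes is likewise an assumption, as is the function name.

```python
def weighted_asm(ipge_slopes, band_weights):
    """Across-site mean of IPGEslope with per-electrode importance weights.

    Electrodes without an accepted AGF are passed as None and skipped.
    Assumption: the weighted sum is normalized by the weights of the
    electrodes that actually contributed.
    """
    pairs = [(s, w) for s, w in zip(ipge_slopes, band_weights) if s is not None]
    total_weight = sum(w for _, w in pairs)
    return sum(s * w for s, w in pairs) / total_weight


# Placeholder data: three electrodes, the second one has no accepted AGF.
example_asm = weighted_asm([2.0, None, 4.0], [1.0, 5.0, 3.0])
```

With equal weights the function returns the plain across-site mean of the available electrodes, so the unweighted ASM is a special case.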
The purpose of the weighted correlation was to compensate for missing data. Therefore, a comparison was made between the results of the weighted correlation analysis of all ears and a standard correlation analysis of a subset of ears with relatively complete measurements, which we define as AGF for at least 8 out of 12 electrodes in both polarities and for both IPGs (in total, 9 ears). This standard correlation analysis is shown in Figure 7. A stronger IPGEslope – SRT correlation was observed for the subset of ears with more successful eCAP measurements in comparison to the result of the weighted correlation analysis of all subjects (Figure 6, lower panel). For the subset of the ears with more successful measurements, the magnitude of the coefficient of determination was comparable between the three conditions (FMA: R2 = 0.55*, p-val = 0.02, t-val = −2.92, df = 7, R = −0.74, FMC: R2 = 0.50*, p-val = 0.03, t-val = −2.66, df = 7, R = −0.71, AP: R2 = 0.54*, p-val = 0.02, t-val = −2.85, df = 7, R = −0.73).
Figure 7. The standard Pearson correlation between speech reception thresholds (SRT) and weighted IPGEslope ASM for monaural data. Only the ears with successful eCAP AGF measurement on at least eight electrodes in all the three conditions (A: FMA, B: FMC, C: AP) are included. Right and left ears are plotted in red and blue. *p-value ≤ 0.05.
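As a plausibility check, the t-values reported for this subset can be recomputed from the reported R and df using the usual relation for a Pearson coefficient. The numbers below use the rounded values from the text, so small deviations from the reported t-values (−2.92, −2.66 and −2.85) are expected; df = 7 corresponds to the 9 ears in the subset.

```latex
\begin{align*}
t &= R\sqrt{\frac{df}{1 - R^{2}}}, \qquad df = n - 2 = 9 - 2 = 7 \\
\text{FMA: } t &= -0.74\sqrt{\frac{7}{1 - 0.74^{2}}} = -0.74\sqrt{\frac{7}{0.4524}} \approx -2.91 \\
\text{FMC: } t &= -0.71\sqrt{\frac{7}{1 - 0.71^{2}}} = -0.71\sqrt{\frac{7}{0.4959}} \approx -2.67 \\
\text{AP: } t &= -0.73\sqrt{\frac{7}{1 - 0.73^{2}}} = -0.73\sqrt{\frac{7}{0.4671}} \approx -2.83
\end{align*}
```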
No significant correlation was observed between ear-differences of SRT and ear-differences of IPGEslope either for the standard (FMA: R2 = 0.06, FMC: R2 = 0.00, AP: R2 = 0.04) or the weighted (FMA: Rw2 = 0.06, FMC: Rw2 = 0.00, AP: Rw2 = 0.04) Pearson correlation analyses. Applying the weighting to IPGEslope to account for the relative importance of each frequency band for speech intelligibility did not result in a significant correlation with SRT (FMA: Rw2 = 0.06, FMC: Rw2 = 0.02, AP: Rw2 = 0.13) when ear-differences were analyzed (data not shown).
Figure 8 depicts the FMT scores as a function of the IPGEslope for monaural data and has the same structure as Figure 6. For FMC, a weak but significant correlation was observed between monaural FMT scores and monaural IPGEslope ASM even with the standard Pearson correlation analysis and without the weighting that accounts for the relative importance of each band for speech intelligibility. The coefficient of determination increased when the weighted Pearson correlation analysis was employed. The highest correlation was observed when, in addition to the weighted correlation, the speech-related weighting was also applied, although the correlation was not significant in the case of AP (FMA: Rw2 = 0.28*, p-val = 0.01, t-val = 2.64, df = 18.10, Rw = 0.52; FMC: Rw2 = 0.25*, p-val = 0.02, t-val = 2.46, df = 17.75, Rw = 0.50; AP: Rw2 = 0.19, p-val = 0.05, t-val = 2.09, df = 18.67, Rw = 0.43). When ear-differences were analyzed, no significant correlation was obtained between IPGEslope ASM and FMT scores for the standard or the weighted Pearson correlation, or after applying the speech-related weighting (FMA: Rw2 = 0.18, FMC: Rw2 = 0.01, AP: Rw2 = 0.04, data not shown).
Figure 8. Freiburg monosyllable (FMT) scores as a function of IPGEslope ASM for monaural data. The figure has the same structure as Figure 6. (A) Shows the result of standard Pearson correlation analysis where each ear contributed equally to the calculated correlation coefficient. (B) Depicts the results obtained by applying the weighted Pearson correlation analysis where ears with less missing data contributed more to the calculation of the correlation coefficient. (C) Shows the result of the weighted Pearson correlation between FMT scores and weighted IPGEslope to account for both missing data and the relative importance of frequency bands for speech intelligibility. The right and left ears are presented in red and blue, respectively. Subject numbers are in accordance with Table 1. The print size of each number in panels (B,C) is proportional to the number of electrodes with successful eCAP AGF measurement for that ear. Black solid lines indicate significant correlations. Dashed gray regression lines indicate non-significant correlations. *p-value ≤ 0.05.
3.5. IPGEslope and demographic data
Figure 9 depicts age as a function of IPGEslope for the three artifact reduction methods with monaural data. The correlation analysis showed a clear effect of the polarity of the stimulating pulse: a significant correlation was observed only when a cathodic-leading pulse was used for stimulation. The strength of the correlation (R2 = 0.38, p-val = 0.00, t-val = −3.71, df = 22, R = −0.62) for the AP condition was intermediate between those obtained with FMA and FMC. No correlations were observed between hearing aid experience, duration of HL or CI experience and the IPGEslope (data not shown).
Figure 9. The correlation between IPGEslope ASM and age for monaural data. Each column represents one artifact reduction approach (A: FMA, B: FMC, C: AP). Each number represents one ear, in accordance with Table 1. The right and left ears are plotted in red and blue, respectively. The print size of each number is proportional to the number of electrodes with successful measurement for that ear. Black solid lines indicate significant correlations. Dashed gray regression lines indicate non-significant correlations. **p-value ≤ 0.01, ***p-value ≤ 0.001.
Figure 10 shows the correlation between IPGEslope and age only for the subset of ears with relatively successful eCAP measurement on at least 8 out of the 12 electrodes. This strict inclusion criterion (applied post hoc) strengthened the correlations between the two parameters for all the three artifact reduction approaches. Notably for FMC, a strong and highly significant correlation (R2 = 0.84, p-val = 0.00, t-val = −5.98, df = 7, R = −0.91) was observed which was considerably higher than the correlations observed in FMA and AP (R2 = 0.60, p-val = 0.01, t-val = −3.26, R = −0.77) conditions.
Figure 10. Standard Pearson correlation analysis between age and IPGEslope ASM for monaural data. Each column represents one of the three conditions investigated (A: FMA, B: FMC, C: AP). Only the ears with successful eCAP measurement on at least eight electrodes in all three conditions are included. Right and left ears are plotted in red and blue, respectively. Each number represents one ear, in accordance with Table 1. Black solid lines indicate significant correlations. Dashed gray regression lines indicate non-significant correlations. *p-value ≤ 0.05, ***p-value ≤ 0.001.
3.6. Multiple linear regression
A multiple linear regression model was used to investigate the relation of cochlear health measures to demographic data and to electrode impedances. The model was also used to investigate whether considering demographic data and classical impedances in addition to cochlear health, in one model, explains the variation in speech intelligibility to a greater extent. The obtained results for the multiple linear analysis were compared to the standard Pearson correlation in a two-dimensional domain as the reference point. The choice of standard instead of weighted Pearson correlation here was to avoid implementation of weighted multiple regression analysis which requires complex calculations. Two-, three-, four- and five-dimensional models were used. If the dimensionality was higher than two, adjusted coefficients of determination (Radj2) were reported to compensate for overfitting, resulting from an increase in dimensionality. The analysis showed that only age was a significant predictor of IPGEslope for FMC and AP conditions. Addition of other investigated demographic factors or electrode impedances did not result in an improvement of model prediction.
Table 2 describes the variations in SRT as a function of the cochlear health measure alone (first row) and together with CI experience. By considering these two variables, more than 50% of the variation in SRT was explained in the case of FMC. Considering both CI experience and IPGEslope significantly improved the model prediction in comparison with considering only IPGEslope as the independent variable [df(1,19), F-val = 15.61***, p-val = 0.00]. For FMA and AP, the highest explained variation was almost 30%.
Table 2. Results from the multiple linear regression models to predict the variation in speech reception thresholds (SRTs).
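The reported model comparison has the form of a partial F-test for nested regression models. Assuming that this standard test was used (the text does not name it), the statistic for adding q predictors to a reduced model, giving a full model with k predictors fitted to n samples, is:

```latex
F = \frac{\left(R^{2}_{\mathrm{full}} - R^{2}_{\mathrm{reduced}}\right) / q}
         {\left(1 - R^{2}_{\mathrm{full}}\right) / (n - k - 1)},
\qquad df = (q,\; n - k - 1)
```

Under this reading, adding CI experience to a model containing only IPGEslope gives q = 1 and k = 2, and the reported df(1,19) would correspond to n − k − 1 = 19, i.e., n = 22 ears.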
3.7. Correlation between IPGEoffset and speech intelligibility
Brochier et al. (2021) compared different methods used for interpretation of changes in eCAP AGF due to changes in IPG using both computational and animal models. They showed a significant correlation between IPGE on level 50% and SGN density in the animal model. No correlation was observed for IPGEslope in the same animal model. They concluded that IPGEslope in either the linear or logarithmic domain is vulnerable to non-neural factors such as electrode-to-modiolus distance or impedances of the stimulating and/or recording electrodes. As a solution, for human subjects, the authors proposed the IPGEoffset which was defined as average offset (in dB re 1 nC) in stimulus amplitude between the linearly growing portions of the eCAP AGFs (obtained with short and long IPGs) expressed on logarithmic input-output axis [Figure 9 in Brochier et al. (2021)].
To compare IPGEslope with IPGEoffset, the same analysis introduced by Brochier et al. (2021) was applied to the human data of this study. Two step-sizes for the sampling of N1-P2 amplitudes were used, 0.1 μV as introduced by Brochier et al. (2021) and a step-size of 0.02 μV. Figure 11 shows the results and has the same structure as Figure 6. It depicts SRTs as a function of IPGEoffset for standard (upper panel) and weighted (middle panel) Pearson correlation and as a function of weighted IPGEoffset for weighted Pearson correlation (lower panel) to consider both the effect of missing data and relative importance of each frequency band for speech intelligibility. No significant correlation was observed in any case regardless of the employed step-size.
Figure 11. The correlation between speech reception threshold (SRT) and IPGEoffset ASM for monaural data. The figure has the same structure as Figure 6.
4. Discussion
This study investigated the relationship between IPGEslope, considered to be a measure of SGN survival (Prado-Guitierrez et al., 2006; Ramekers et al., 2014; Schvartz-Leyzac and Pfingst, 2016), and speech recognition in the monaural condition and for ear-differences of each measure. The purpose of analyzing the ear-differences was to provide a controlled experimental condition as reported by Schvartz-Leyzac and Pfingst (2018) by factoring out the interindividual variability in cognitive abilities and other aspects related to processing of the central auditory system in CI users. Furthermore, the influences of stimulus polarity, artifact reduction technique, demographic data and the impact of the application of a weighting function related to the speech band importance function on a measure of cochlear health were all investigated.
4.1. Correlation between IPGEslope and speech intelligibility and the effect of weighting
For both monaural data and ear-differences, non-significant or very weak correlations were observed between absolute values of IPGEslope and speech test outcomes when a standard Pearson correlation was applied, regardless of stimulus polarity. One factor limiting the potential applicability of a correlation analysis was missing data due to unsuccessful eCAP measurements on several electrodes. The extent of missing data was heterogeneous across ears and for both polarities. A weighted Pearson correlation analysis was employed to compensate for the influence of missing data. This resulted in a weak but significant correlation between monaural IPGEslope and speech test outcomes for some of the conditions as shown in the middle row of Figures 6, 8.
Another factor influencing the relationship between IPGEslope and speech test outcome was the relative importance of each frequency band for speech intelligibility. The importance of speech information is not uniform across the spectrum, but is frequency dependent. Consequently as a result of cochlear tonotopy, degeneration of spiral ganglion cells along Rosenthal’s Canal does not equally impair CI speech outcomes. It is essential to account for this relative importance when relating cochlear health measures to speech intelligibility. This was implemented in this paper by applying the weighting function depicted in Figure 2 to AGF slopes. Employment of this weighting function together with the weighted correlation analysis resulted in an improved significant correlation between IPGEslope and speech test outcomes for monaural data and in FMA and FMC conditions. A comparison of the upper rows of Figures 6, 8 with the lower rows reveals the effect of compensation for these two factors. No correlation was observed for ear-differences in any condition.
These findings are in line with those of Imsiecke et al. (2021), who found no correlation between IPGEslope and speech intelligibility in CI listeners with residual hearing for the case where the effect of missing data and relative importance of speech information were not considered. In contrast, Schvartz-Leyzac and Pfingst (2018) observed a strong correlation between ear-differences of IPGEslope and SRT. The observed difference might partly be due to differences in calculation of IPGEslope ASM. In this study, first changes in AGF slopes in response to changes in IPG were calculated for each electrode and subsequently the mean of slope changes for all the electrodes was calculated. In Schvartz-Leyzac and Pfingst (2018), the difference of the means of AGF slopes for single IPGs was reported (most probably to overcome the ambiguities raised by missing data). The two approaches would have been identical in the absence of missing data. In both studies, for some of the electrodes no AGF was obtained at least for one of the IPGs resulting in a difference when changing the order of averaging and subtraction.
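The effect of swapping the order of averaging and subtraction can be shown with a small numerical example. The slope values are invented for illustration and the function names are hypothetical; `None` marks an electrode without an accepted AGF at that IPG.

```python
def mean_available(values):
    """Mean over the entries that are not None."""
    available = [v for v in values if v is not None]
    return sum(available) / len(available)


def mean_of_differences(slopes_short, slopes_long):
    """Per-electrode slope change first, then the across-site mean (this study)."""
    diffs = [long_ - short for short, long_ in zip(slopes_short, slopes_long)
             if short is not None and long_ is not None]
    return sum(diffs) / len(diffs)


def difference_of_means(slopes_short, slopes_long):
    """Across-site mean per IPG first, then the difference of the two means."""
    return mean_available(slopes_long) - mean_available(slopes_short)


# Complete data: both orders agree.
complete_short = [10.0, 20.0, 30.0, 40.0]
complete_long = [14.0, 26.0, 36.0, 48.0]

# Missing data: electrode 3 lacks the short-IPG AGF, electrode 4 the long-IPG AGF.
missing_short = [10.0, 20.0, None, 40.0]
missing_long = [14.0, 26.0, 60.0, None]
```

For the complete data both functions return 6.0. For the data with gaps, the mean of differences is 5.0 while the difference of means is 10.0, because the latter mixes electrodes that contribute to only one of the two averages.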
Another potential reason behind the contradictory conclusions of the two studies might be the differences in speech understanding scores of the subjects in each study. The ear-differences in SRT for several participants of the study of Schvartz-Leyzac and Pfingst (2018) ranged from a few dB to 10 dB SNR, whereas in this study ear-differences higher than 1 dB SNR were measured only for four subjects. Two of these four subjects, subject 7 and subject 13, who showed the largest ear-differences, were excluded from the IPGEslope-SRT correlation analysis. Subject 13 was excluded because of identification as an outlier in terms of speech intelligibility. In the case of subject 7, the eCAP measurement of the worse ear was interrupted by the subject, leading to incomplete data. These two subjects were also the only ones who had (mild) auditory dyslalia, which is an indication of irregular speech development, possibly leading to a relatively large effect of cognitive processing on performance (Lang-Roth, 2014). Showing evidence for the relationship of IPGEslope and SRT ear-differences might have been more straightforward with SRT ear-differences in the range of 10 dB SNR. However, recruitment from the relatively large patient database of our center resulted in no such participants for this study. Consideration of sequential delays between implantation dates on both sides also played a role for subject inclusion. Subjects who received their second implant shortly after the first implantation were prioritized (except for subject 9). A comparison of the ear-differences in CI experience showed a tendency toward larger differences for the study of Schvartz-Leyzac and Pfingst (2018). Differences in the speech testing materials used may also contribute to the difference in speech-test outcomes. The tests might not be equal in their semantic content and consequently in their engagement of cognition.
In terms of SRT measurement, two methodological differences were observed between the two studies. First, Schvartz-Leyzac and Pfingst (2018) employed a step-size of 2 dB to obtain the adaptive track. In this study, however, an adaptive step-size was used which was varied depending on the subject’s response and might have enabled a more accurate estimation of the speech reception threshold. Second, Schvartz-Leyzac and Pfingst (2018) kept the level of the mixed signal (speech + noise) constant. In the present study, the level of noise was kept constant and the level of speech was varied to obtain the desired SNR. Additionally, differences in electrode array types, test language, and inclusion of two subjects with a history of explantation/reimplantation by Schvartz-Leyzac and Pfingst (2018) may contribute to the differences in the outcome of the two studies.
Standard Pearson correlation analysis of the ears with the most successful eCAP measurements showed mild but significant correlations between IPGEslope and SRTs. This result supported the validity of weighted correlation as a method to partly compensate for the effect of missing data. It also supported the assumption that IPGEslope would be more suitable for assessing cochlear health if complete eCAP measurement sets were available. The strength of correlation was roughly similar for the three conditions. Based on this result, it could be concluded that, in case of high quality eCAP measurement, polarity might not be influential in correlating cochlear health to speech intelligibility in quiet or in stationary noise. However, an analysis with the same subset of subjects relating age to IPGEslope did exhibit differences between the three conditions (Figure 10).
Significant correlations were observed between SRT and FMT scores for monaural listening but not for ear-differences of the two measures. The same pattern was obtained for the analysis of IPGEslope and SRTs. The large number of subjects in this cohort with progressive HL as the etiology may suggest a high number of genetic causes, which can be more likely to be symmetric. These findings give rise to doubts about the general applicability of an analysis of ear-differences of speech test results. While it appears feasible in certain groups of listeners with relatively large ear-differences in SRT, such as those reported by Schvartz-Leyzac and Pfingst (2018), transferring the approach of analysis of ear-differences to a random selection of bilateral CI users might not always yield a useful result.
4.2. Correlation between IPGEoffset and speech intelligibility
Brochier et al. (2021) compared different methods used for interpretation of changes in eCAP AGF due to changes in IPG using both computational and animal models. They showed a significant correlation between IPGE on level 50% and SGN density in the animal model. No correlation was observed for IPGEslope in the same animal model. They concluded that IPGEslope in either the linear or logarithmic domain is vulnerable to non-neural factors such as electrode-to-modiolus distance or impedances of the stimulating and/or recording electrodes. As a solution for human subjects, the authors proposed the IPGEoffset, which was defined as the average offset (in dB re 1 nC) in stimulus amplitude between the linearly growing portions of the eCAP AGFs (obtained with short and long IPGs) expressed on logarithmic input-output axes [Figure 9 in Brochier et al. (2021)].
One potential reason for the difference between the outcome of this study and the findings of Brochier et al. (2021) might be the difference in assessment approaches. Unlike with computational and animal models, the accurate estimation of level 50% is difficult in most human subjects. When the AGF is sampled at only 12 current levels, as was done in the present study, a robust estimation of level 50% requires the AGF to reach the inflection point. This was rarely observable in our data and is generally difficult to measure post-operatively with stimulation levels below the loudest applicable presentation level. As a substitute for level 50%, the method based on averaging the current offset over different voltage levels (and not only level 50%) was implemented as suggested by Brochier et al. (2021). This change in the approach for estimation of IPGEoffset might at least partly explain the difference in the outcome of these two studies. The results obtained in this study are in line with the findings of Kim et al. (2010), who employed a very similar approach for the calculation of IPGEoffset and reported no significant correlation between this measure and speech performance in human subjects.
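The averaging of the offset over several voltage levels can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis code of this study: the AGFs are assumed to be monotonic, stimulus charge is converted to dB re 1 nC with a 20·log10 convention, the level at each target amplitude is found by linear interpolation, and all function names and numbers are hypothetical.

```python
import math

def charge_to_db(charge_nc):
    """Convert charge in nC to dB re 1 nC (20*log10 convention assumed)."""
    return 20.0 * math.log10(charge_nc)

def level_at_amplitude(levels_db, amplitudes, target):
    """Interpolate the stimulus level (dB) at which a monotonic AGF reaches `target`."""
    points = list(zip(levels_db, amplitudes))
    for (l0, a0), (l1, a1) in zip(points, points[1:]):
        if a0 <= target <= a1 and a1 > a0:
            return l0 + (target - a0) * (l1 - l0) / (a1 - a0)
    raise ValueError("target amplitude lies outside the sampled AGF")

def ipge_offset_db(charge_short, amps_short, charge_long, amps_long, targets):
    """Mean horizontal offset (short IPG minus long IPG, in dB) over target amplitudes."""
    short_db = [charge_to_db(q) for q in charge_short]
    long_db = [charge_to_db(q) for q in charge_long]
    offsets = [
        level_at_amplitude(short_db, amps_short, t)
        - level_at_amplitude(long_db, amps_long, t)
        for t in targets
    ]
    return sum(offsets) / len(offsets)

# Hypothetical AGFs: the long-IPG curve reaches each amplitude at 80% of the charge.
charge_short = [10.0, 12.0, 14.0, 16.0, 18.0]   # nC
charge_long = [8.0, 9.6, 11.2, 12.8, 14.4]      # nC
amps = [100.0, 200.0, 300.0, 400.0, 500.0]      # eCAP amplitude in microvolts

offset = ipge_offset_db(charge_short, amps, charge_long, amps, [150, 250, 350, 450])
print(f"IPGEoffset = {offset:.2f} dB")  # about 1.94 dB for this synthetic shift
```

Because the synthetic long-IPG curve is a pure horizontal shift of the short-IPG curve on the dB axis, every target amplitude yields the same offset; with measured AGFs the per-level offsets would differ and the mean summarizes them.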
4.3. The choice of linear domain
Many studies have investigated the proper unit for analysis of psychophysical and physiological measurements in CI users. McKay (2012) investigated the psychometric probe threshold measured using a forward masking paradigm. The author argued that ratio or logarithmic units are best for the estimation of probe thresholds because only in these domains are the effects of electrode-neuron distances canceled out, so that only the effective change that neurons experience remains. For example, increasing the stimulation current from 100 to 200 μA might result in an increase of the effective current from 0.5 to 1 μA in one case and from 1 to 2 μA in another case. In both cases the effective current received by the neurons is doubled in response to doubling the stimulation current, but the raw increments differ. The author argued that the ratio and logarithmic domains can reflect this effect but the linear domain cannot.
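The numerical example above can be reproduced with a few lines of arithmetic. The sketch below assumes that the effective current at the neurons is simply the stimulation current multiplied by a fixed attenuation factor (0.005 and 0.01 are hypothetical values chosen to match the example); the function name is likewise illustrative.

```python
import math

def effective_change(attenuation, i_start_ua, i_end_ua):
    """Return (linear increment in uA, ratio in dB) of the effective current."""
    before = attenuation * i_start_ua
    after = attenuation * i_end_ua
    return after - before, 20.0 * math.log10(after / before)

# Two hypothetical electrode-neuron interfaces with different attenuation.
for label, attenuation in (("far", 0.005), ("near", 0.01)):
    step_ua, step_db = effective_change(attenuation, 100.0, 200.0)
    print(f"{label}: +{step_ua:.2f} uA, +{step_db:.2f} dB")
```

The linear increments differ (0.5 vs. 1 μA), whereas the change in dB is about 6.02 dB in both cases, because the attenuation factor cancels in the ratio.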
Brochier et al. (2021) used the same argumentation for the calculation of IPGEoffset and applied the IPGEoffset in the logarithmic domain as a cochlear health measure robust against non-neural factors such as variation in electrode-neuron interfaces or variation in the stimulating current level. It should be noted that, as IPGEoffset is a differential measure, a logarithmic transformation not only compensates for different field gradient strengths, i.e., the effective current at the recruited population, but also removes any time-related effects. Degeneration affects the degree of the temporal integration of neurons. Therefore, a reliable estimation of cochlear health requires a measure that is sensitive to changes in temporal integration. This argumentation is supported by the outcome of this study, which showed a significant correlation between IPGEslope in the linear domain and speech test outcome but no significant correlation for IPGEoffset in the logarithmic domain. The observation that only the IPGE in the linear domain showed a significant effect could therefore be explained by differences in how the neural population integrates over time, information that is removed by an analysis in the logarithmic domain.
The findings of this study are in line with the study of Takanen et al. (2022), who modeled three cochlear health measures: (1) IPGEslope in the linear domain, (2) relative IPGEslope (ratio of slopes), and (3) IPGEoffset in the logarithmic domain. They investigated the effect of electrode-neuron interfaces and cochlear health (defined as the number of surviving SGNs). They reported that only IPGEslope in the linear domain was sensitive to cochlear health, although it was also affected by variation in electrode-neuron distance. Relative IPGEslope and IPGEoffset in the logarithmic domain were not sensitive to either factor.
4.4. Analysis of age and other demographic data
Cochlear implant research has already shed light on the relationship between some of the cochlear health measures and demographic data. A correlation between the duration of HL and AGF slope was reported by Schvartz-Leyzac and Pfingst (2016). IPGE on threshold and level 50% were also correlated to duration of HL in the study of Imsiecke et al. (2021). In the present study, a strong correlation was observed between age and IPGEslope for FMC but not for FMA (Figures 9, 10). This may highlight a higher diagnostic power of cathodic-leading pulses for a particular aspect of cochlear health. The physiological decrease in human SGN populations with age (Zimmermann et al., 1995; Otte et al., 2015) suggests that a measure sensitive to degenerative processes, such as demyelination and loss of SGN PPs as well as subsequent SGN death, would exhibit a negative correlation with age. Modeling studies (Rattay, 1999; Rattay et al., 2001a,b; Resnick et al., 2018) elaborated on why cathodic-leading pulses are less effective than anodic-leading pulses in eliciting spikes in the region beyond the cell body of SGNs with degenerated PPs. PPs of SGNs degenerate as a consequence of sensorineural deafness (Glueckert et al., 2005) or age-related HL (Kumar et al., 2022) following decreased neurotrophic support from the organ of Corti and from supporting cells (Zilberstein et al., 2012). Degenerated PPs should consequently be assessed more reliably with FMC, and one finding of this study is that cathodic-leading stimuli are better suited as an electrophysiological marker for SGN degeneration than anodic-leading stimuli. The observation of a strong negative correlation between age and IPGEslope for FMC but not for FMA in this study (Figure 9) suggests that cathodic-leading pulses may in fact be more sensitive for assessing degeneration of the peripheral processes and thereby cochlear health.
Previous studies that investigated the polarity effect on behavioral thresholds using triphasic stimuli without interphase gap (Jahn and Arenberg, 2019a,b) found evidence for increasing cathodic thresholds with increasing age, which is in agreement with the findings of the present study. Previous electrophysiological investigations were conducted with similar hypotheses regarding the polarity effect and cochlear health, but generated less conclusive results (Hughes et al., 2017, 2018). The relatively small sample size in such studies is always a statistical obstacle when attempting to generalize and compare results. However, the current investigation differs from the previous studies in various respects, such as the focus on cathodic-leading stimuli (in contrast to anodic minus cathodic, i.e., the polarity effect), the investigation of individual subjects (in contrast to averaging across all or groups of subjects), the correlation with age in years, and the investigation of the IPGE between 30 and 2.1 μs for different polarities (in contrast to comparisons of the polarity effect on eCAP threshold, AGF slope, or MCL). One or several of these factors, as well as a different subject selection, may account for the more conclusive results in this study.
Taken together, our findings provide evidence for the initial hypothesis that IPGEslope may be used as an electrophysiological biomarker for cochlear health when measured with cathodic-leading stimuli. Partial degeneration and/or complete loss of SGNs may both play an important role, and this differentiation requires further research. Demyelination of the peripheral process will increase membrane capacitance (Rattay et al., 2001b) and, along with possible downregulation of ion channel expression (Pan et al., 2016, 2021), the response times of SGNs to extracellular stimulation will increase. Further models investigating demyelination suggested that response thresholds may be largely unaffected, but response timing may change significantly (Resnick et al., 2018). The vulnerable period, in which the second phase can still prevent the depolarization caused by the first phase from passing threshold (van den Honert and Mortimer, 1979), has been suggested to range between 8.7 and 16 μs (Rubinstein et al., 2001) but may extend during years of deafness such that a 30 μs IPG is no longer sufficient to allow spike initiation in neurons close to threshold and at the borders of the excited SGN population. Setting the shorter IPG to 2.1 μs ensured that the second phase interrupted the vulnerable period; longer values may not be suitable as the shorter IPG. The etiology of HL was progressive in 10 of 13 subjects (77%) in this study, suggesting ongoing degenerative processes as well as the presence of remaining hair cells and PPs. Further research is required for more detailed investigation of the principles underlying preferential polarity sensitivity in CI users.
4.5. Multiple linear regression analysis
Although a significant correlation was observed between cochlear health and speech measures, variations in speech perception among CI users are still large. In order to explain the interindividual variability in speech perception of CI users, more than one factor needs to be taken into account. To realize this goal in the present study, a multiple linear regression model was employed and the variation in speech intelligibility was partly explained. IPGEslope was used as the main independent variable. Integration of CI experience in the model in addition to IPGEslope resulted in the greatest performance of the model. This is most probably because IPGEslope was correlated to age and inclusion of age therefore did not provide complementary information to the model. For the case of FMA, the observed correlation was most likely due to CI experience.
It is important to consider the effect of overfitting with high dimensional models, particularly in cases of a smaller sample size as in the one employed here. Adjusted R2 values were reported to compensate for the effect of overfitting. Nevertheless, caution should be taken in interpreting the outcome of these models in particular with respect to the data size of 24 ears. Therefore, it is suggested to repeat this analysis with a larger dataset and (if possible) with less missing data. For the multiple regression analysis, it was not possible to include only the ears with relatively complete eCAP measurements. However, based on the higher correlation observed in case of considering only ears with successful measurements on at least 8 electrodes, it can be concluded that employing such a model with a dataset with less missing data might result in an increase in the predictive performance of the model.
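The correction applied by the adjusted coefficient of determination can be illustrated with the standard formula R2adj = 1 − (1 − R2)(n − 1)/(n − p − 1), where n is the number of observations and p the number of predictors. The sketch below uses n = 24 ears as in this study; the R2 value and the predictor counts are hypothetical and serve only to show how the penalty grows with model size.

```python
def adjusted_r2(r2, n_obs, n_predictors):
    """Adjusted R^2 for a linear model with an intercept and n_predictors regressors."""
    if n_obs - n_predictors - 1 <= 0:
        raise ValueError("too few observations for this number of predictors")
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)

# Hypothetical R^2 of 0.60 for models with an increasing number of predictors.
for p in (1, 2, 4, 8):
    print(p, round(adjusted_r2(0.60, 24, p), 3))
```

For the same raw R2, each additional predictor lowers the adjusted value, which is why adding a variable that carries no complementary information does not improve the adjusted fit.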
Walia et al. (2022) also employed multiple regression models to predict the speech intelligibility of CI users in noise, although they considered different factors from the ones employed in this study. They explained up to 60% of the variation in speech intelligibility by considering electrocochleography (EcochG) and cognition. Complementary to their study, which used EcochG as the cochlear health measure, this study employed IPGEslope. EcochG has the advantage of being measurable in CI candidates prior to implantation and can therefore be used in the process of decision making for implantation. However, since it is not channel-specific, its post-implantation applications are limited. In contrast, eCAP-based cochlear health measures, as used in this study, have the potential to determine cochlear health locally and post-implantation, and to be employed for individualized fitting. Walia et al. (2022) reported the coefficient of determination (R2) and not the adjusted coefficient of determination (R2adj), so their value might include an increased contribution from overfitting.
For future studies additional factors such as deficits in the fitting of CI processors or issues related to rehabilitation measures may be of interest.
4.6. The uncontrolled variables and future work
In this study, the correlation with speech intelligibility was selected to assess the performance of IPGEslope in the estimation of cochlear health (neural status). The study design controlled for many of the covarying factors affecting speech intelligibility. The variability in electrode array type was kept as low as possible. However, to completely factor out the interindividual variability in the reconstruction of cochlear tonotopy, information about the length of the electrode array should be assessed together with the respective insertion angle and cochlear size. To avoid such a bias, this information should be considered in future studies.
In an attempt to control for the cognitive ability of the subjects, ear-differences in cochlear health measures and speech intelligibility were employed. However, analysis of the ear-differences revealed limitations to this approach. These limitations include the difficulty in recruiting a large enough number of subjects with between-ear SRT differences higher than 1 dB SNR, and the difficulty of obtaining complete eCAP measurements for both ears in some subjects. Therefore, the analysis of monaural data was preferred in the present study. However, this approach came at the expense of losing control over the cognitive ability (which is a highly individual variable and demonstrated in many studies to be related to performance). Therefore, it might be useful for future studies to assess the cognitive ability of the subjects via additional testing in order to describe some of the remaining unexplained variation in speech intelligibility of the CI users.
One other uncontrolled factor influencing speech intelligibility is the spread of excitation. Patterns of spread of excitation vary among CI users and are affected by features of electrode positioning such as distance from the lateral wall, by the electrode impedances, and by the parameters of the stimulating pulse, e.g., pulse amplitude. A large spread of excitation results in interaction between neighboring channels and consequently deteriorates the speech cues. Reliable transmission of speech cues requires focused excitation as well as functioning SGNs. Transmission of deteriorated speech cues by functioning SGNs may result in degraded speech intelligibility. Therefore, to assess the influence of cochlear health on speech intelligibility it is important to control for the spread of excitation. However, it might be difficult to find a measure of the spread of excitation that controls for the survival of SGNs (He et al., 2017). Garcia et al. (2021) took an initial step in disentangling the effect of these two inter-related factors and introduced an approach for estimation of spread of excitation while using neural health vectors to control for SGN survival. Measures based on imaging techniques might also be helpful in bypassing these confounding effects (Noble et al., 2013). Considering the variation in the spread of excitation might help to better explain the outcome of this study.
Electrode impedances also might be indicative of the influence of non-neural factors on cochlear health measures. Impedance measures are regularly assessed in clinical visits and could be used to modify an index of cochlear health without additional effort by the clinician. In animal models, impedance measures have been shown to correlate with intrascalar fibrosis (Ramekers et al., 2021) and with ossification (Colesa et al., 2022), while a distinct relationship between impedance and CI speech outcomes cannot be shown (Prenzler et al., 2020). In future, the relationship between electrode impedances and cochlear health should be investigated to assess the impact of electrode impedances on cochlear health measures.
In addition to objective measures of cochlear health, a subjective measure called charge integration efficiency has been introduced by Zhou et al. (2020). Loudness grows more slowly with an increase in pulse phase duration in comparison with pulse amplitude (for the same delivered charge). The dB difference between pulse amplitude and pulse phase duration dynamic range, i.e., the established chronaxie measure, may be used to estimate the extent of neural degeneration. Zhou et al. (2020) correlated the charge integration efficiency with duration of HL, an indirect measure of cochlear health, as well as with speech recognition (Zhou et al., 2021). In comparison with IPGEslope, charge integration efficiency might be a faster measure of cochlear health as it can be measured psychophysically in a co-operative subject. Nevertheless, its subjective nature might restrict its possible application, e.g., for pediatric cases. Further studies are required to compare IPGEslope and charge integration efficiency in terms of their accuracy as well as their vulnerability to missing data.
To calculate the relative importance of each band for speech intelligibility, the band importance function introduced in ANSI S3.5 (1997) was adapted to the MED-EL default filter bank setting. In general, the estimation of the speech band importance function could be affected by several factors such as the characteristics of the frequency bands (center frequency and bandwidth, compare Supplementary Table 1 to Supplementary Table 2 for ANSI S3.5-1997) or the language (Jin et al., 2016). Also, natural acoustic speech differs in terms of content from CI-coded speech (Bosen and Chatterjee, 2016). This latter factor is still an active research topic for CI studies. Even for a given language, variation in speech material results in differences in the estimated band importance function (CID-22 vs. NU6, Supplementary Table 2, ANSI S3.5-1997). For any application of a band importance function, it is desirable to consider as many of these factors as possible to obtain a function which is tailored to that particular application. It is hypothesized that a tailored band importance function together with complete electrophysiological measurements results in a more accurate prediction of speech intelligibility.
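One simple way to combine per-electrode IPGEslopes with a band importance function is a weighted mean that is renormalized over the electrodes with a valid measurement. The sketch below is an assumption made for illustration, since the exact weighting scheme is not restated in this section; the importance values are hypothetical and are not taken from ANSI S3.5-1997 or from the MED-EL filter bank.

```python
def band_weighted_slope(slopes, band_importance):
    """Importance-weighted mean of per-electrode slopes; None marks a missing measurement."""
    pairs = [(s, w) for s, w in zip(slopes, band_importance) if s is not None]
    if not pairs:
        raise ValueError("no valid measurements on this ear")
    total_weight = sum(w for _, w in pairs)
    # Renormalize so that the weights of the measured electrodes sum to one.
    return sum(s * w for s, w in pairs) / total_weight

# Hypothetical importance weights for a four-electrode toy array (sum to 1).
importance = [0.1, 0.3, 0.4, 0.2]
# Hypothetical per-electrode IPGEslopes; the second electrode has no valid eCAP.
slopes = [1.0, None, 3.0, 2.0]

print(band_weighted_slope(slopes, importance))
```

Renormalizing keeps the result on the same scale regardless of how many electrodes were measured, but it cannot recover the information lost on missing electrodes, which is consistent with the observation that complete measurements are preferable.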
The clinical map of some of the subjects might differ from the default map. Various factors determine the suitable fitting map for individual users in terms of the filter bank setting. The presence of low-frequency hearing usually results in a change in the filter bank setting because in such a case the listeners are able to hear the low frequencies acoustically and the CI codes a restricted frequency bandwidth. Another influential factor might be the use of anatomy-based fitting, which aims at preserving the natural frequency-place map. Here, the amount of change depends on the insertion depth and the position of the electrodes in the cochlea. Electrode deactivation also affects the filter bank setting and consequently the band importance function. Facial nerve stimulation and open or short circuits are common reasons for electrode deactivation, which results in a redistribution of frequencies among the remaining electrodes that depends on the number of deactivated electrodes. The extent of the deviation from the default map is individual and ranges from slight to moderate. Employing band importance functions adapted to the individual maps of the CI users would be worthwhile to test in the future.
5. Conclusion
This study investigated the applicability of cochlear health measures for prediction of speech perception capabilities in CI users. We focused particularly on investigating the effect of the polarity of the stimulating pulse and the utility of the band importance function. In conclusion, significant correlations were observed between IPGEslope and speech perception outcomes, with similar correlation strength for anodic-leading and cathodic-leading pulses, but not for the ear-differences. We found that reliable relationships between the investigated parameters of cochlear health could only be established when the relative importance of each frequency band for speech intelligibility was taken into account. A significant negative correlation was observed between IPGEslope and age. In this case, cathodic-leading pulses resulted in a significant and strong correlation, while anodic-leading pulses showed no significant correlation, supporting the hypothesis that cathodic-leading pulses are better suited for detection of degenerated SGN PPs. The higher sensitivity of younger CI users to cathodic-leading stimulation may be due to a larger number of excitable PPs in regions closer to the electrode contact, where a cathodic stimulus leads to depolarization more effectively. Missing data was particularly detrimental to the analysis. The highest correlations were observed when the effect of missing data was compensated either by implementation of a weighted correlation or by including only ears with relatively complete measurements in the analysis. For an accurate estimation of cochlear health (neural status), high-quality eCAP measurements were essential. Stimulation with a cathodic-leading phase might help to improve the estimation of cochlear health.
The results of this study together with further information about current spread, which is assumed to be an individual factor and degrades the spectral resolution of the coded speech, have the potential to explain the observed variation in performance achieved by CI users, partly due to variation in the degeneration level of the auditory periphery.
Data availability statement
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.
Ethics statement
The studies involving human participants were reviewed and approved by the Ethics Committee of the Goethe University Hospital in Frankfurt (ERB number 44/19), and all subjects gave written informed consent to participate in this study. Subjects received an expense allowance for participation in the study.
Author contributions
LZ wrote the manuscript and analyzed the data. BM analyzed the data and created the figures. HB and UB designed the study and revised and finalized the manuscript. JT reviewed the study design and the manuscript. CG designed the study and revised the manuscript. All authors contributed to the article and approved the submitted version.
Funding
Support for this work was provided in part by MED-EL GmbH.
Acknowledgments
We thank Marko Takanen for his helpful consultation on speech information, Philipp Spitzer for his substantial support with eCAP signal processing, Darshan Shah for creating the custom-made MATLAB research tool, two students Vera Komeyer and Sophie Hamkens for data measurement, Marko Takanen, Stefan Strahl, and Konrad Schwarz for their constructive feedback and the review of an early version of the manuscript, and Patrick Connolly for providing English language editing for the manuscript. We would like to express gratitude to the participants of the study for the generous dedication of their time. A portion of this work was presented at the 2021 Conference on Implantable Auditory Prostheses.
Conflict of interest
LZ, HB, JT, and CG were employed by MED-EL GmbH, Innsbruck, Austria.
The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Publisher’s note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
Supplementary material
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fnint.2023.1125712/full#supplementary-material
References
ANSI (1997). American National Standards S3.5-1997. Methods for the calculation of the speech intelligibility index. New York, NY: American National Standards Institute.
Baudhuin, J. L., Hughes, M. L., and Goehring, J. L. (2016). A comparison of alternating polarity and forward masking artifact-reduction methods to resolve the electrically evoked compound action potential. Ear Hear. 37, e247–e255. doi: 10.1097/AUD.0000000000000288
Bosen, A. K., and Chatterjee, M. (2016). Band importance functions of listeners with cochlear implants using clinical maps. J. Acoust. Soc. Am. 140:3718. doi: 10.1121/1.4967298
Brochier, T., McKay, C. M., and Carlyon, R. P. (2021). Interpreting the effect of stimulus parameters on the electrically evoked compound action potential and on neural health estimates. J. Assoc. Res. Otolaryngol. 22, 81–94. doi: 10.1007/s10162-020-00774-z
Brocker, D. T., and Grill, W. M. (2013). Principles of electrical stimulation of neural tissue. Handbook Clin. Neurol. 116, 3–18. doi: 10.1016/B978-0-444-53497-2.00001-2
Brown, C. J., Abbas, P. J., and Gantz, B. (1990). Electrically evoked whole-nerve action potentials: Data from human cochlear implant users. J. Acoust. Soc. Am. 88, 1385–1391. doi: 10.1121/1.399716
Colesa, D. J., Colesa, K. L., Low, Y., Swiderski, D. L., Raphael, Y., and Pfingst, B. E. (2022). Does impedance reflect intrascalar tissue in the implanted cochlea? J. Assoc. Res. Otolaryngol.
DeVries, L., Scheperle, R., and Bierer, J. A. (2016). Assessing the electrode-neuron interface with the electrically evoked compound action potential, electrode position, and behavioral thresholds. J. Assoc. Res. Otolaryngol. 17, 237–252. doi: 10.1007/s10162-016-0557-9
Garcia, C., Goehring, T., Cosentino, S., Turner, R. E., Deeks, J. M., Brochier, T., et al. (2021). The panoramic eCAP method: Estimating patient-specific patterns of current spread and neural health in cochlear implant users. J. Assoc. Res. Otolaryngol. 22, 567–589. doi: 10.1007/s10162-021-00795-2
Glueckert, R., Pfaller, K., Kinnefors, A., Rask-Andersen, H., and Schrott-Fischer, A. (2005). The human spiral ganglion: New insights into ultrastructure, survival rate and implications for cochlear implants. Audiol. Neurotol. 10, 258–273. doi: 10.1159/000086000
Guedes, M. C., Weber, R., Gomez, M. V., Neto, R. V., Peralta, C. G., and Bento, R. F. (2007). Influence of evoked compound action potential on speech perception in cochlear implant users. Braz. J. Otorhinolaryngol. 73, 439–445. doi: 10.1016/s1808-8694(15)30095-1
Hahlbrock, K. (1953). Über sprachaudiometrie und neue wörterteste. Arch. Ohren Nasen Kehlkopfheilkd. 162, 394–431.
He, S., Shahsavarani, B. S., McFayden, T. C., Wang, H., Gill, K. E., Xu, L., et al. (2018). Responsiveness of the electrically stimulated cochlear nerve in children with cochlear nerve deficiency. Ear Hear. 39, 238–250. doi: 10.1097/AUD.0000000000000467
He, S., Teagle, H. F. B., and Buchman, C. A. (2017). The electrically evoked compound action potential: From laboratory to clinic. Front. Neurosci. 11:339. doi: 10.3389/fnins.2017.00339
He, S., Xu, L., Skidmore, J., Chao, X., Jeng, F. C., Wang, R., et al. (2020). The effect of interphase gap on neural response of the electrically stimulated cochlear nerve in children with cochlear nerve deficiency and children with normal-sized cochlear nerves. Ear Hear. 41, 918–934. doi: 10.1097/AUD.0000000000000815
Healy, E. W., Yoho, S. E., and Apoux, F. (2013). Band importance for sentences and words reexamined. J. Acoust. Soc. Am. 133, 463–473. doi: 10.1121/1.4770246
Herrmann, D. P., Kretzer, K. V. A., Pieper, S. H., and Bahmer, A. (2021). Effects of electrical pulse polarity shape on intra cochlear neural responses in humans: Triphasic pulses with anodic and cathodic second phase. Hear. Res. 412:108375. doi: 10.1016/j.heares.2021.108375
Hey, M., Hocke, T., Hedderich, J., and Müller-Deile, J. (2014). Investigation of a matrix sentence test in noise: Reproducibility and discrimination function in cochlear implant patients. Int. J. Audiol. 53, 895–902. doi: 10.3109/14992027.2014.938368
HörTech gGmbH (2019). International matrix tests: Reliable speech audiometry in noise. Available online at: https://www.hz-ol.de/files/hoerzentrum/medien/Infomaterial%20%26%20Dokumente/8-1-4-0-Dokumente-OLSA/Hoerzentrum_Broschuere_Internationale_Tests_2019_WEB_klein.pdf (accessed March 09, 2023)
Huang, B. Y., Roche, J. P., Buchman, C. A., and Castillo, M. (2010). Brain stem and inner ear abnormalities in children with auditory neuropathy spectrum disorder and cochlear nerve deficiency. AJNR Am. J. Neuroradiol. 31, 1972–1979. doi: 10.3174/ajnr.A2178
Hughes, M. L., Choi, S., and Glickman, E. (2018). What can stimulus polarity and interphase gap tell us about auditory nerve function in cochlear-implant recipients? Hear. Res. 359, 50–63. doi: 10.1016/j.heares.2017.12.015
Hughes, M. L., Goehring, J. L., and Baudhuin, J. L. (2017). Effects of stimulus polarity and artifact reduction method on the electrically evoked compound action potential. Ear Hear. 38, 332–343.
Imsiecke, M., Büchner, A., Lenarz, T., and Nogueira, W. (2021). Amplitude growth functions of auditory nerve responses to electric pulse stimulation with varied interphase gaps in cochlear implant users with ipsilateral residual hearing. Trends Hear. 25:23312165211014137. doi: 10.1177/23312165211014137
Jackler, R. K., Luxford, W. M., and House, W. F. (1987). Congenital malformations of the inner ear: A classification based on embryogenesis. Laryngoscope 97, 2–14. doi: 10.1002/lary.5540971301
Jahn, K. N., and Arenberg, J. G. (2019a). Evaluating psychophysical polarity sensitivity as an indirect estimate of neural status in cochlear implant listeners. J. Assoc. Res. Otolaryngol. 20, 415–430. doi: 10.1007/s10162-019-00718-2
Jahn, K. N., and Arenberg, J. G. (2019b). Polarity sensitivity in pediatric and adult cochlear implant listeners. Trends Hear. 23:2331216519862987. doi: 10.1177/2331216519862987
Jin, I. K., Lee, J., Lee, K., Kim, J., Kim, D., Sohn, J., et al. (2016). The band-importance function for the Korean standard sentence lists for adults. J. Audiol. Otol. 20, 80–84. doi: 10.7874/jao.2016.20.2.80
Joshi, S. N., Dau, T., and Epp, B. (2017). A model of electrically stimulated auditory nerve fiber responses with peripheral and central sites of spike generation. J. Assoc. Res. Otolaryngol. 18, 323–342. doi: 10.1007/s10162-016-0608-2
Kalkman, R. K., Briaire, J. J., Dekker, D. M. T., and Frijns, J. H. M. (2022). The relation between polarity sensitivity and neural degeneration in a computational model of cochlear implant stimulation. Hear. Res. 415:108413. doi: 10.1016/j.heares.2021.108413
Kamakura, T., O’Malley, J. T., and Nadol, J. B. Jr. (2018). Preservation of cells of the organ of corti and innervating dendritic processes following cochlear implantation in the human: An immunohistochemical study. Otol. Neurotol. 39, 284–293. doi: 10.1097/MAO.0000000000001686
Khan, A. M., Handzel, O., Burgess, B. J., Damian, D., Eddington, D. K., and Nadol, J. B. Jr. (2005). Is word recognition correlated with the number of surviving spiral ganglion cells and electrode insertion depth in human subjects with cochlear implants? Laryngoscope 115, 672–677. doi: 10.1097/01.mlg.0000161335.62139.80
Kim, J. R., Abbas, P. J., Brown, C. J., Etler, C. P., O’Brien, S., and Kim, L. S. (2010). The relationship between electrically evoked compound action potential and speech perception: A study in cochlear implant users with short electrode array. Otol. Neurotol. 31, 1041–1048. doi: 10.1097/MAO.0b013e3181ec1d92
Kumar, P., Sharma, S., Kaur, C., Pal, I., Bhardwaj, D. N., Vanamail, P., et al. (2022). The ultrastructural study of human cochlear nerve at different ages. Hear. Res. 416:108443. doi: 10.1016/j.heares.2022.108443
Lai, W. K., and Dillier, N. (2000). A simple two-component model of the electrically evoked compound action potential in the human cochlea. Audiol. Neurotol. 5, 333–345. doi: 10.1159/000013899
Lang-Roth, R. (2014). Hearing impairment and language delay in infants: Diagnostics and genetics. GMS Curr. Top. Otorhinolaryngol. Head Neck Surg. 13:Doc05. doi: 10.3205/cto000108
Liu, W., Edin, F., Atturo, F., Rieger, G., Löwenheim, H., Senn, P., et al. (2015). The pre- and post-somatic segments of the human type I spiral ganglion neurons–structural and functional considerations related to cochlear implantation. Neuroscience 284, 470–482. doi: 10.1016/j.neuroscience.2014.09.059
Macherey, O., Carlyon, R. P., van Wieringen, A., Deeks, J. M., and Wouters, J. (2008). Higher sensitivity of human auditory nerve fibers to positive electrical currents. J. Assoc. Res. Otolaryngol. 9, 241–251. doi: 10.1007/s10162-008-0112-4
McKay, C. M. (2012). Forward masking as a method of measuring place specificity of neural excitation in cochlear implants: A review of methods and interpretation. J. Acoust. Soc. Am. 131, 2209–2224. doi: 10.1121/1.3683248
Nadol, J. B., Burgess, B. J., Gantz, B. J., Coker, N. J., Ketten, D. R., Kos, I., et al. (2001). Histopathology of cochlear implants in humans. Ann. Otol. Rhinol. Laryngol. 110, 883–891. doi: 10.1177/000348940111000914
Noble, J. H., Labadie, R. F., Gifford, R. H., and Dawant, B. M. (2013). Image-guidance enables new methods for customizing cochlear implant stimulation strategies. IEEE Trans. Neural Syst. Rehabil. Eng. 21, 820–829. doi: 10.1109/TNSRE.2013.2253333
Otte, J., Schuknecht, H. F., and Kerr, A. G. (2015). Ganglion cell populations in normal and pathological human cochleae. Implications for cochlear implantation. Laryngoscope 125:1038. doi: 10.1002/lary.25219
Pan, C. C., Du, Z. H., Zhao, Y., Chu, H. Q., and Sun, J. W. (2021). Downregulation of Cav3.1 T-type calcium channel expression in age-related hearing loss model. Curr. Med. Sci. 41, 680–686. doi: 10.1007/s11596-021-2416-0
Pan, C., Chu, H., Lai, Y., Liu, Y., Sun, Y., Du, Z., et al. (2016). Down-regulation of the large conductance Ca(2+)-activated K(+) channel expression in C57BL/6J cochlea. Acta Otolaryngol. 136, 875–878. doi: 10.3109/00016489.2016.1168941
Pollack, I. (1948). Effects of high pass and low pass filtering on the intelligibility of speech in noise. J. Acoust. Soc. Am. 20, 259–266. doi: 10.1121/1.1906369
Prado-Guitierrez, P., Fewster, L. M., Heasman, J. M., McKay, C. M., and Shepherd, R. K. (2006). Effect of interphase gap and pulse duration on electrically evoked potentials is correlated with auditory nerve survival. Hear. Res. 215, 47–55. doi: 10.1016/j.heares.2006.03.006
Prenzler, N. K., Weller, T., Steffens, M., Lesinski-Schiedat, A., Büchner, A., Lenarz, T., et al. (2020). Impedance values do not correlate with speech understanding in cochlear implant recipients. Otol. Neurotol. 41, e1029–e1034. doi: 10.1097/MAO.0000000000002743
Ramekers, D., Bouwmeester, A., Hendriksen, F., Benav, H., and Versnel, H. (2021). The relationship between intrascalar tissue growth, electrode impedance and eCAP measures in guinea pigs with chronically implanted electrode arrays. J. Assoc. Res. Otolaryngol.
Ramekers, D., Versnel, H., Strahl, S. B., Smeets, E. M., Klis, S. F. L., and Grolman, W. (2014). Auditory-nerve responses to varied inter-phase gap and phase duration of the electric pulse stimulus as predictors for neuronal degeneration. J. Assoc. Res. Otolaryngol. 15, 187–202. doi: 10.1007/s10162-013-0440-x
Rattay, F. (1998). Analysis of the electrical excitation of CNS neurons. IEEE Trans. Biomed. Eng. 45, 766–772. doi: 10.1109/10.678611
Rattay, F. (1999). The basic mechanism for the electrical stimulation of the nervous system. Neuroscience 89, 335–346. doi: 10.1016/s0306-4522(98)00330-3
Rattay, F., Leao, R. N., and Felix, H. (2001a). A model of the electrically excited human cochlear neuron. II. Influence of the three-dimensional cochlear structure on neural excitability. Hear. Res. 153, 64–79. doi: 10.1016/S0378-5955(00)00257-4
Rattay, F., Lutter, P., and Felix, H. (2001b). A model of the electrically excited human cochlear neuron: I. Contribution of neural substructures to the generation and propagation of spikes. Hear. Res. 153, 43–63. doi: 10.1016/S0378-5955(00)00256-2
Resnick, J. M., O’Brien, G. E., and Rubinstein, J. T. (2018). Simulated auditory nerve axon demyelination alters sensitivity and response timing to extracellular stimulation. Hear. Res. 361, 121–137. doi: 10.1016/j.heares.2018.01.014
Rubinstein, J. T., Miller, C. A., Mino, H., and Abbas, P. J. (2001). Analysis of monophasic and biphasic electrical stimulation of nerve. IEEE Trans. Biomed. Eng. 48, 1065–1070. doi: 10.1109/10.951508
Schvartz-Leyzac, K. C., and Pfingst, B. E. (2016). Across-site patterns of electrically evoked compound action potential amplitude-growth functions in multichannel cochlear implant recipients and the effects of the interphase gap. Hear. Res. 341, 50–65. doi: 10.1016/j.heares.2016.08.002
Schvartz-Leyzac, K. C., and Pfingst, B. E. (2018). Assessing the relationship between the electrically evoked compound action potential and speech recognition abilities in bilateral cochlear implant recipients. Ear Hear. 39, 344–358. doi: 10.1097/AUD.0000000000000490
Seyyedi, M., Viana, L. M., and Nadol, J. B. Jr. (2014). Within-subject comparison of word recognition and spiral ganglion cell count in bilateral cochlear implant recipients. Otol. Neurotol. 35, 1446–1450. doi: 10.1097/MAO.0000000000000443
Shearer, A., and Hansen, M. (2019). Auditory synaptopathy, auditory neuropathy, and cochlear implantation. Laryngosc. Investig. Otolaryngol. 4, 429–440. doi: 10.1002/lio2.288
Shearer, A., Eppsteiner, R., Frees, K., Tejani, V., Sloan-Heggen, C., Brown, C., et al. (2017). Genetic variants in the peripheral auditory system significantly affect adult cochlear implant performance. Hear. Res. 348, 138–142. doi: 10.1016/j.heares.2017.02.008
Shepherd, R. K., and Javel, E. (1997). Electrical stimulation of the auditory nerve: I. Correlation of physiological responses with cochlear status. Hear. Res. 108, 112–144.
Skidmore, J., Ramekers, D., Colesa, D. J., Schvartz-Leyzac, K. C., Pfingst, B. E., and He, S. (2022). A broadly applicable method for characterizing the slope of the electrically evoked compound action potential amplitude growth function. Ear Hear. 43, 150–164. doi: 10.1097/AUD.0000000000001084
Takanen, M., Strahl, S., and Schwarz, K. (2022). Auditory model-based recommendations for evaluation of cochlear health using the inter-phase gap effect. J. Assoc. Res. Otolaryngol. (under review)
Undurraga, J. A., Carlyon, R. P., Wouters, J., and van Wieringen, A. (2013). The polarity sensitivity of the electrically stimulated human auditory nerve measured at the level of the brainstem. J. Assoc. Res. Otolaryngol. 14, 359–377. doi: 10.1007/s10162-013-0377-0
Undurraga, J. A., van Wieringen, A., Carlyon, R. P., Macherey, O., and Wouters, J. (2010). Polarity effects on neural responses of the electrically stimulated auditory nerve at different cochlear sites. Hear. Res. 269, 146–161. doi: 10.1016/j.heares.2010.06.017
Usami, S. I., and Nishio, S. Y. (2022). The genetic etiology of hearing loss in Japan revealed by the social health insurance-based genetic testing of 10K patients. Hum. Genet. 141, 665–681. doi: 10.1007/s00439-021-02371-3
van den Honert, C., and Mortimer, J. T. (1979). The response of the myelinated nerve fiber to short duration biphasic stimulating currents. Ann. Biomed. Eng. 7, 117–125. doi: 10.1007/BF02363130
van Eijl, R. H., Buitenhuis, P. J., Stegeman, I., Klis, S. F., and Grolman, W. (2017). Systematic review of compound action potentials as predictors for cochlear implant performance. Laryngoscope 127, 476–487. doi: 10.1002/lary.26154
Wagener, K., Brand, T., and Kollmeier, B. (1999). Entwicklung und Evaluation eines Satztests für die deutsche Sprache III: Evaluation des Oldenburger Satztests. Zeitschrift Audiol. 38, 86–95.
Walia, A., Shew, M. A., Kallogjeri, D., Wick, C. C., Durakovic, N., Lefler, S. M., et al. (2022). Electrocochleography and cognition are important predictors of speech perception outcomes in noise for cochlear implant recipients. Sci. Rep. 12:3083. doi: 10.1038/s41598-022-07175-7
Wu, P. Z., Liberman, L. D., Bennett, K., de Gruttola, V., O’Malley, J. T., and Liberman, M. C. (2019). Primary neural degeneration in the human cochlea: Evidence for hidden hearing loss in the aging ear. Neuroscience 407, 8–20. doi: 10.1016/j.neuroscience.2018.07.053
Xu, L., Skidmore, J., Luo, J., Chao, X., Wang, R., Wang, H., et al. (2020). The effect of pulse polarity on neural response of the electrically stimulated cochlear nerve in children with cochlear nerve deficiency and children with normal-sized cochlear nerves. Ear Hear. 41, 1306–1319. doi: 10.1097/AUD.0000000000000854
Zhou, N., Dong, L., and Galvin, J. J. III (2020). A behavioral method to estimate charge integration efficiency in cochlear implant users. J. Neurosci. Methods 342:108802. doi: 10.1016/j.jneumeth.2020.108802
Zhou, N., Zhu, Z., Dong, L., and Galvin, J. J. III (2021). Sensitivity to pulse phase duration as a marker of neural health across cochlear implant recipients and electrodes. J. Assoc. Res. Otolaryngol. 22, 177–192. doi: 10.1007/s10162-021-00784-5
Zilberstein, Y., Liberman, M. C., and Corfas, G. (2012). Inner hair cells are not required for survival of spiral ganglion neurons in the adult cochlea. J. Neurosci. 32, 405–410. doi: 10.1523/JNEUROSCI.4678-11.2012
Keywords: cochlear implant, cochlear health, speech recognition, neural degeneration, band importance function, age
Citation: Zamaninezhad L, Mert B, Benav H, Tillein J, Garnham C and Baumann U (2023) Factors influencing the relationship between cochlear health measures and speech recognition in cochlear implant users. Front. Integr. Neurosci. 17:1125712. doi: 10.3389/fnint.2023.1125712
Received: 16 December 2022; Accepted: 27 March 2023; Published: 12 May 2023.
Edited by:
Elizabeth B. Torres, Rutgers, The State University of New Jersey, United States
Reviewed by:
Carolyn M. McClaskey, The Medical University of South Carolina, United States
Daniel Wong, Facebook Reality Labs Research, United States
Copyright © 2023 Zamaninezhad, Mert, Benav, Tillein, Garnham and Baumann. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Uwe Baumann, Uwe.Baumann@kgu.de