Statistics of the instantaneous interaural parameters for dichotic tones in diotic noise (N0Sψ)

Encke, Jörg; Dietz, Mathias

doi:10.3389/fnins.2022.1022308

ORIGINAL RESEARCH article

Front. Neurosci., 08 November 2022

Sec. Auditory Cognitive Neuroscience

Volume 16 - 2022 | https://doi.org/10.3389/fnins.2022.1022308

This article is part of the Research Topic: Listening with Two Ears – New Insights and Perspectives in Binaural Research


Jörg Encke^1,2^*

Mathias Dietz^1,2

¹Physiology and Modeling of Auditory Perception, Department of Medical Physics and Acoustics, University of Oldenburg, Oldenburg, Germany
²Cluster of Excellence “Hearing4all”, University of Oldenburg, Oldenburg, Germany

Stimuli consisting of an interaurally phase-shifted tone in diotic noise—often referred to as N₀S_ψ—are commonly used to study binaural hearing. As a consequence of mixing diotic noise with a dichotic tone, this type of stimulus contains random fluctuations in both interaural phase- and level-difference. We report the joint probability density functions of the two interaural differences as a function of amplitude and interaural phase of the tone. Furthermore, a second joint probability density function for interaural phase differences and the instantaneous cross-power is derived. The closed-form expression can be used in future studies of binaural unmasking first to obtain the interaural statistics and then study more directly the relation between those statistics and binaural tone detection.

1. Introduction

Tone-in-noise detection thresholds improve when the interaural configuration of the tone differs from that of the noise, compared to the diotic case. A rich literature reports on the influence of virtually every parameter of acoustic stimuli on this binaural unmasking (see, e.g., Culling and Lavandier, 2021, for a review). Amongst these parameters, the phase difference ψ introduced between the target tones of the two ear signals is fundamental and was already explored in the first study of dichotic tone-in-noise detection by Hirsh (1948). Such a signal is commonly referred to as N₀S_ψ, where the subscripts indicate the interaural phase difference (IPD) of the noise (N) or signal (S). The difference between the detection threshold for the purely diotic N₀S₀ and the N₀S_ψ signal is referred to as the binaural masking level difference (BMLD) and is largest for the case where ψ = π (Hirsh, 1948).

Adding a dichotic S_ψ tone to diotic N₀ noise reduces the correlation between the left and right signals but also introduces random fluctuations of the interaural phase and level differences (IPD, ILD) (visualized in Figure 1A). The interaural correlation decreases with the tone level, so binaural unmasking and incoherence detection are often treated synonymously (Durlach et al., 1986). However, especially for narrowband noise, the value of interaural correlation itself was found to be an insufficient predictor for decorrelation detection performance. Instead, detection performance correlated with the amount of IPD and ILD fluctuations as measured by their standard deviation (Goupell and Hartmann, 2006). Similarly, other studies reported that the performance in detecting the tone within an N₀S_ψ stimulus varies considerably depending on the individual noise token. This token-to-token variability was best accounted for by models that considered the amount of instantaneous fluctuations in IPD and ILD (Davidson et al., 2009).

FIGURE 1

Figure 1. (A) Visualization of the random fluctuations in IPD ΔΦ(t), ILD ΔL(t), and cross-power P′(t) due to mixing an antiphasic 500 Hz tone with a 500 Hz wide band of diotic noise (SNR = −10 dB). (B) Signal model used to derive the PDFs for an N₀S_ψ stimulus. The graphic shows the complex-plane representation of the basebands of the left- and right-ear signals: $Z_{L} (t) = A_{L} (t) e^{i Φ_{L} (t)}$ (blue) and $Z_{R} (t) = A_{R} (t) e^{i Φ_{R} (t)}$ (red). The left-ear baseband is constructed by adding a “tone” vector with length C and angle +ψ/2 to the noise baseband X(t)+iY(t). The right-ear baseband is constructed by adding a “tone” vector with an angle of −ψ/2 to the same noise baseband. The instantaneous IPD ΔΦ(t) of the N₀S_ψ signal equals the difference between Φ_L and Φ_R. (C) Complex-plane representation of the interaural baseband Z₁(t) = Ξ(t) + iϒ(t), which is obtained by dividing the left-ear baseband by the right-ear baseband. The absolute value of this baseband equals the interaural amplitude ratio R, while its phase equals the interaural phase difference ΔΦ.

Therefore, accounting for binaural tone-in-noise sensitivity can be subdivided into two steps. The first step is the signal-based analysis of how stimulus design parameters such as ψ or the signal-to-noise ratio (SNR) influence the interaural cue statistics. In the second step, binaural sensitivity can then be studied more directly by relating it to the interaural cues. Only relatively few studies, however, have previously treated these statistics. The probability density function (PDF) underlying the statistical distribution of IPDs in (partly) decorrelated noise has been derived in the context of optical interferometry (Just and Bamler, 1994). Henning (1973) derived the PDF for IPDs in the special case of N₀S_π and, using a very similar approach for the same stimulus condition, Zurek (1991) additionally derived marginal PDFs for ILDs. Other studies also seem to have worked on stimuli where the tone IPD did not equal π, but this work seems to have remained unpublished (Levitt and Lundry, 1966). The present study closes this gap by deriving a closed-form expression for the joint PDF of IPDs and ILDs in the general case of an N₀S_ψ stimulus. From this distribution, the marginal PDFs can also be calculated using numerical integration. These PDFs are especially useful when considering narrowband noises that remain relatively unaffected by the bandpass properties of the auditory periphery. Statistics at the stimulus level should thus describe the statistics of the binaural parameters at the level of binaural integration well.

Suppose fluctuations of the IPD are indeed a cue used to detect the tone in an N₀S_ψ stimulus. In that case, the stimulus energy at which these fluctuations occur might also affect performance. A large IPD occurring during low-energy stimulus sections can be expected to have less impact than the same IPD occurring at high stimulus energy. Information about the stimulus energy in both ears is captured by the product of the left- and right-ear stimulus envelopes, also called the instantaneous cross-power P′(t). Furthermore, the cross-power plays an essential role in defining the interaural coherence of a stimulus (Encke and Dietz, 2022). Consequently, this study also derives the joint PDF for P′(t) and IPD.

2. Deriving the probability density functions

The following section derives the two joint PDFs. A computational notebook that can be used to reproduce these derivations in the computer algebra system sympy (Meurer et al., 2017) can be found in the Supplementary material.

If N(t) is a Gaussian noise process with a mean value of zero, the process can be represented using its in-phase and quadrature components X(t) and Y(t):

\begin{array}{l} N (t) = X (t) cos (ω_{0} t) - Y (t) sin (ω_{0} t), & (1) \end{array}

where X(t) and Y(t) are orthogonal noise processes with the same variance and mean as N(t). The reference frequency ω₀ is not of relevance for the derivation and can thus be chosen freely. For computational convenience, ω₀ is set to equal the frequency of the tone S(t) which is added with the amplitude C and phase ψ:

\begin{array}{l} S (t) = C cos (ω_{0} t + ψ) & (2) \end{array}

The resulting signal W(t) = N(t) + S(t) then equals:

\begin{array}{l} W (t) = [X (t) + C cos (ψ)] cos (ω_{0} t) \\ - [Y (t) + C sin (ψ)] sin (ω_{0} t) . & (3) \end{array}

When dealing with instantaneous phase and amplitude values, it is beneficial to instead work with the analytic representation W_a(t) of the signal:

\begin{array}{l} W_{a} (t) = {[X (t) + C cos (ψ)] + i [Y (t) + C sin (ψ)]} e^{i ω_{0} t}, & (4) \end{array}

where $i = \sqrt{- 1}$ is the imaginary unit. The first term of this expression (enclosed in curly brackets) can be interpreted as an amplitude and phase modulator of the harmonic oscillation $e^{i ω_{0} t}$. This combined modulator will be referred to as the signal's complex baseband Z(t):

\begin{array}{l} Z (t) = [X (t) + C cos (ψ)] + i [Y (t) + C sin (ψ)] = A (t) e^{i Φ (t)}, & (5) \end{array}

where A(t), Φ(t) are the instantaneous amplitude and phase of the baseband. In the case of the N₀S_ψ stimulus, a tone with phase ψ/2 is added to the noise in the left-ear signal while the phase of the tone in the right-ear signal is −ψ/2 resulting in the two basebands:

\begin{array}{l} Z_{L} (t) = [X (t) + C cos (ψ / 2)] + i [Y (t) + C sin (ψ / 2)] = A_{L} (t) e^{i Φ_{L} (t)} & (6) \end{array}

\begin{array}{l} Z_{R} (t) = [X (t) + C cos (- ψ / 2)] + i [Y (t) + C sin (- ψ / 2)] \\ = A_{R} (t) e^{i Φ_{R} (t)} . & (7) \end{array}

A vector model of the basebands Z_R and Z_L is shown in Figure 1B where the individual components are visualized as vectors in the complex plane.

Based on these two basebands, PDFs for the interaural parameters will be derived using two separate approaches. In the first approach, the baseband of the left-ear signal Z_L(t) is divided by the baseband of the right-ear signal Z_R(t) resulting in the interaural baseband Z₁(t):

\begin{array}{l} Z_{1} (t) = \frac{Z_{L} (t)}{Z_{R} (t)} = \frac{A_{L} (t)}{A_{R} (t)} e^{i [Φ_{L} (t) - Φ_{R} (t)]} \\ = R (t) e^{i Δ Φ (t)}, & (8) \end{array}

where ΔΦ(t) and R(t) are the instantaneous IPDs and the interaural amplitude ratios (IARs), respectively. Instantaneous ILDs can then be calculated as ΔL(t) = 20 log₁₀ R(t). In the second approach, the PDF for IPDs and the product of the left- and right-ear envelopes (cross-power) P′ is derived by multiplying Z_L(t) with the complex conjugate of Z_R(t), resulting in

\begin{array}{l} Z_{2} (t) = Z_{L} (t) Z_{R}^{*} (t) = A_{L} (t) A_{R} (t) e^{i [Φ_{L} (t) - Φ_{R} (t)]} \\ = P^{'} (t) e^{i Δ Φ (t)} . & (9) \end{array}

The process of deriving the PDFs from Equation (8) and Equation (9) follows the exact same rationale so that the process will only be detailed for Equation (8). Results for the second approach will then be stated without further detail.
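To make the two constructions concrete, the following minimal Python sketch builds both basebands for a single noise sample and reads off the IPD, ILD, and cross-power. It is an illustration only and not taken from the Supplementary notebook: the function name `interaural_parameters` and its arguments are hypothetical, and the left-ear baseband is placed in the numerator, as described in the text and used in Equation (10).

```python
import cmath
import math


def interaural_parameters(x, y, C, psi):
    """Return (IPD in rad, ILD in dB, cross-power) for one noise sample x + iy."""
    noise = complex(x, y)
    # Basebands of Equations (6) and (7): tone phase +psi/2 (left), -psi/2 (right)
    z_left = noise + C * cmath.exp(1j * psi / 2)
    z_right = noise + C * cmath.exp(-1j * psi / 2)
    # Interaural baseband (left over right) and conjugate product
    z1 = z_left / z_right
    z2 = z_left * z_right.conjugate()
    ipd = cmath.phase(z2)                # same angle as cmath.phase(z1)
    ild_db = 20 * math.log10(abs(z1))    # ILD from the amplitude ratio R
    cross_power = abs(z2)                # product of the two envelopes
    return ipd, ild_db, cross_power


# Noise-free example: only the tone remains in both ears
print(interaural_parameters(0.0, 0.0, C=1.0, psi=math.pi / 2))
```

For x = y = 0 only the tone remains, so the function should return an IPD of ψ, an ILD of 0 dB, and a cross-power of C², up to floating-point rounding.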

For the interaural baseband, Z_L and Z_R as resulting from Equations (6) and (7) are inserted into Equation (8) resulting in:

\begin{array}{l} Z_{1} (t) = \frac{[X (t) + C cos (ψ / 2)] + i [C sin (ψ / 2) + Y (t)]}{[X (t) + C cos (- ψ / 2)] + i [C sin (- ψ / 2) + Y (t)]} \\ = Ξ (t) + i ϒ (t) & (10) \end{array}

where Ξ(t) and ϒ(t) are the in-phase and quadrature components of the baseband Z₁(t). They can be derived from Equation (10) as:

\begin{array}{l} Ξ (t) = \frac{Y^{2} (t) + {[C cos (ψ / 2) + X (t)]}^{2} - C^{2} {sin}^{2} (ψ / 2)}{{[C sin (ψ / 2) - Y (t)]}^{2} + {[C cos (ψ / 2) + X (t)]}^{2}} & (11) \end{array}

\begin{array}{l} ϒ (t) = \frac{2 C [C cos (ψ / 2) + X (t)] sin (ψ / 2)}{{[C sin (ψ / 2) - Y (t)]}^{2} + {[C cos (ψ / 2) + X (t)]}^{2}} . & (12) \end{array}

Figure 1C visualizes the resulting baseband in the complex plane. From this visualization, it can be seen that the instantaneous IPDs and IARs can be calculated as the argument ΔΦ(t) = arg{Z₁(t)} = arctan2(ϒ(t), Ξ(t)) and the modulus $R (t) = | Z_{1} (t) | = \sqrt{ϒ {(t)}^{2} + Ξ {(t)}^{2}}$ of the baseband. Here, arctan2 is the two-argument arctangent, which returns the angle of the point (Ξ, ϒ) in the plane and thus covers the full range from −π to π.
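Equations (11) and (12) can be cross-checked numerically against a direct complex division. The sketch below is a plain-Python illustration under the signal model above; `xi_upsilon` is a hypothetical helper name, and the parameter values are arbitrary.

```python
import math
import random


def xi_upsilon(x, y, C, psi):
    """In-phase and quadrature parts of Z1 following Equations (11) and (12)."""
    a = C * math.cos(psi / 2) + x
    s = C * math.sin(psi / 2)
    denom = (s - y) ** 2 + a ** 2
    xi = (y ** 2 + a ** 2 - s ** 2) / denom
    upsilon = 2 * s * a / denom
    return xi, upsilon


rng = random.Random(1)
C, psi = 0.8, 2.0
for _ in range(5):
    x, y = rng.gauss(0, 1), rng.gauss(0, 1)
    # Direct complex division of the left- by the right-ear baseband
    z_left = complex(x + C * math.cos(psi / 2), y + C * math.sin(psi / 2))
    z_right = complex(x + C * math.cos(psi / 2), y - C * math.sin(psi / 2))
    z1 = z_left / z_right
    xi, upsilon = xi_upsilon(x, y, C, psi)
    assert math.isclose(z1.real, xi, rel_tol=1e-9, abs_tol=1e-12)
    assert math.isclose(z1.imag, upsilon, rel_tol=1e-9, abs_tol=1e-12)
    # IPD and IAR as argument and modulus of the interaural baseband
    ipd, iar = math.atan2(upsilon, xi), math.hypot(xi, upsilon)
```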

Both random processes R(t) and ΔΦ(t) are functions of X(t) and Y(t), which are uncorrelated Gaussian noise processes with variance σ². The joint PDF f_{X, Y}(x, y) of X(t) and Y(t) is thus that of a bivariate Gaussian distribution:

\begin{array}{l} f_{X, Y} (x, y) = \frac{1}{2 π σ^{2}} e^{- \frac{1}{2 σ^{2}} (x^{2} + y^{2})}, & (13) \end{array}

where

\begin{array}{l} \iint_{- \infty}^{\infty} f_{X, Y} (x, y) d x d y = 1 . & (14) \end{array}

Here and in all future equations, lower-case variables will be used to refer to the individual instances generated by a given noise process. x and y are thus two instances generated by the noise processes X(t) and Y(t) and ξ, υ are generated by Ξ(t) and ϒ(t).

Probability density functions for Ξ(t) and ϒ(t) can be gained by applying a coordinate transformation to Equation (13). For this, Equations (11) and (12) are rearranged to calculate x and y given the values of ξ and υ:

\begin{array}{l} x (ξ, υ) = C [\frac{2 υ sin (ψ / 2)}{υ^{2} + {(ξ - 1)}^{2}} - cos (ψ / 2)], \\ y (ξ, υ) = \frac{C (υ^{2} + ξ^{2} - 1) sin (ψ / 2)}{υ^{2} + ξ^{2} - 2 ξ + 1} . & (15) \end{array}

These expressions allow us to derive the Jacobian determinant |J(x, y)|. The Jacobian is then used to apply a coordinate transformation from dx and dy to dξ and dυ:

\begin{array}{l} d x d y = | J (x, y) | d ξ d υ = \frac{4 C^{2} {sin}^{2} (ψ / 2)}{{[υ^{2} + {(ξ - 1)}^{2}]}^{2}} d ξ d υ . & (16) \end{array}
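As a quick plausibility check, the Jacobian determinant in Equation (16) can be compared with a finite-difference estimate based on Equation (15). The following sketch assumes nothing beyond these two equations; the helper names, the step size `eps`, and the test point are arbitrary choices made for illustration.

```python
import math


def xy_from_xi_upsilon(xi, ups, C, psi):
    """Inverse mapping of Equation (15): noise sample (x, y) for given (xi, upsilon)."""
    s, c = math.sin(psi / 2), math.cos(psi / 2)
    d = ups ** 2 + (xi - 1) ** 2
    x = C * (2 * ups * s / d - c)
    y = C * (ups ** 2 + xi ** 2 - 1) * s / d
    return x, y


def jacobian_fd(xi, ups, C, psi, eps=1e-6):
    """Central-difference estimate of |det d(x, y) / d(xi, upsilon)|."""
    x_p, y_p = xy_from_xi_upsilon(xi + eps, ups, C, psi)
    x_m, y_m = xy_from_xi_upsilon(xi - eps, ups, C, psi)
    dx_dxi, dy_dxi = (x_p - x_m) / (2 * eps), (y_p - y_m) / (2 * eps)
    x_p, y_p = xy_from_xi_upsilon(xi, ups + eps, C, psi)
    x_m, y_m = xy_from_xi_upsilon(xi, ups - eps, C, psi)
    dx_dup, dy_dup = (x_p - x_m) / (2 * eps), (y_p - y_m) / (2 * eps)
    return abs(dx_dxi * dy_dup - dx_dup * dy_dxi)


def jacobian_closed_form(xi, ups, C, psi):
    """Closed-form factor on the right-hand side of Equation (16)."""
    return 4 * C ** 2 * math.sin(psi / 2) ** 2 / (ups ** 2 + (xi - 1) ** 2) ** 2


xi, ups, C, psi = 0.3, -0.7, 1.2, 2.0
print(jacobian_fd(xi, ups, C, psi), jacobian_closed_form(xi, ups, C, psi))
```

The two printed numbers should agree to several significant digits; the remaining difference reflects the finite-difference approximation.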

Applying the transformations in Equations (15) and (16) to change the variables of Equation (13) results in:

\begin{array}{l} f_{Ξ, ϒ} (ξ, υ) \\ = \frac{2 C^{2} {sin}^{2} (ψ / 2)}{π σ^{2} {[υ^{2} + {(ξ - 1)}^{2}]}^{2}} e^{- \frac{C^{2} [υ^{2} - 2 υ sin (ψ) + ξ^{2} - 2 ξ cos (ψ) + 1]}{2 σ^{2} [υ^{2} + {(ξ - 1)}^{2}]}} . & (17) \end{array}

This is the joint PDF for the two random processes Ξ(t) and ϒ(t). To obtain the joint PDF f_{R, ΔΦ}(r, Δφ), Equation (17) is transformed from rectangular to polar coordinates (see Figure 1C). This is achieved by using the transforms ξ = r cos Δφ, υ = r sin Δφ, and dξ dυ = r dr dΔφ, resulting in:

\begin{array}{l} f_{R, Δ Φ} (r, Δ φ) = \frac{C^{2} 2 r {sin}^{2} (ψ / 2)}{σ^{2} π h {(0)}^{2}} e^{- \frac{C^{2} h (ψ)}{σ^{2} 2 h (0)}}, & (18) \end{array}

where h(ψ) = r² − 2r cos(Δφ − ψ) + 1, so that h(0) = r² − 2r cos(Δφ) + 1, and r ∈ [0, ∞), Δφ ∈ [−π, π].

This equation can be interpreted as the distribution of all possible values of the interaural baseband $z_{1} = r e^{i Δ φ}$ and thus the distribution of all possible combinations of IPDs Δφ and IARs r. It is also apparent from Equation (18) that equal ratios of C²/σ² result in the same PDF, so that PDFs will be referenced using the signal-to-noise ratio SNR = C²/(2σ²) instead of σ² and C. Some examples of these functions are shown in Figures 2A–D. The joint PDF of Δφ and the ILD Δl instead of the IAR r is easily derived by using the transforms r = 10^(Δl/20) and dr = (r ln(10)/20) dΔl:

\begin{array}{l} f_{Δ L, Δ Φ} (Δ l, Δ φ) = \frac{C^{2} 10^{Δ l / 10} ln (10) {sin}^{2} (ψ / 2)}{10 σ^{2} π h {(0)}^{2}} e^{- \frac{C^{2} h (ψ)}{σ^{2} 2 h (0)}} . & (19) \end{array}
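Equations (18) and (19) are straightforward to evaluate numerically. The following sketch is one possible implementation, not the authors' code: the function names are hypothetical, the PDFs are parameterized by the linear SNR = C²/(2σ²), and the ILD version is obtained by applying the stated change of variables to Equation (18) rather than by typing out a separate formula.

```python
import math


def joint_pdf_iar_ipd(r, dphi, snr, psi):
    """Equation (18), parameterized by the linear SNR = C**2 / (2 * sigma**2)."""
    h0 = r ** 2 - 2 * r * math.cos(dphi) + 1            # h(0)
    h_psi = r ** 2 - 2 * r * math.cos(dphi - psi) + 1   # h(psi)
    if h0 == 0.0:
        return 0.0  # r = 1 and dphi = 0, where the density vanishes
    exponent = -snr * h_psi / h0  # equals -C**2 h(psi) / (2 sigma**2 h(0))
    if exponent < -700.0:
        return 0.0  # exp() would underflow to zero anyway
    prefactor = 4 * snr * r * math.sin(psi / 2) ** 2 / (math.pi * h0 ** 2)
    return prefactor * math.exp(exponent)


def joint_pdf_ild_ipd(ild_db, dphi, snr, psi):
    """Joint PDF of ILD (dB) and IPD via the substitution r = 10**(ild / 20)."""
    r = 10 ** (ild_db / 20)
    dr_dild = r * math.log(10) / 20
    return joint_pdf_iar_ipd(r, dphi, snr, psi) * dr_dild


def db_to_linear(snr_db):
    """Convert an SNR in dB to the linear power ratio used above."""
    return 10 ** (snr_db / 10)


snr = db_to_linear(0.0)
print(joint_pdf_iar_ipd(1.0, math.pi, snr, math.pi))
print(joint_pdf_ild_ipd(0.0, math.pi, snr, math.pi))
```

Parameterizing by the SNR rather than by C and σ separately reflects the observation above that only the ratio C²/σ² enters Equation (18).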

To derive the joint PDF of ΔΦ(t) and P′(t), the process detailed above is repeated based on the interaural baseband Z₂(t) as defined in Equation (9) resulting in the PDF:

\begin{array}{l} f_{P^{'}, Δ Φ} (p^{'}, Δ φ) = \frac{p^{'}}{2 π σ^{2} \sqrt{g}} e^{- \frac{C^{2}}{2 σ^{2}} + \frac{p^{'} [cos (Δ φ) - cos (Δ φ - ψ)]}{2 σ^{2} [cos (ψ) - 1]}} & (20) \end{array}

where g is given by:

\begin{array}{l} g = 2 C^{2} {sin}^{2} (ψ / 2) [2 p^{'} cos (Δ φ) - C^{2} (cos (ψ) - 1)] \\ - {p^{'}}^{2} {sin}^{2} (Δ φ) . & (21) \end{array}

and the range of values is defined by:

\begin{array}{l} p^{'} \in [0, \hat{p^{'}} (Δ φ)], Δ φ \in [- \hat{Δ φ} (p^{'}), + \hat{Δ φ} (p^{'})], & (22) \end{array}

where

\begin{array}{l} \hat{p^{'}} (Δ φ) = C^{2} \frac{cos (ψ) - 1}{cos (Δ φ) - 1} . & (23) \end{array}

The function $\hat{Δ φ} (p^{'})$ can be gained by solving Equation (23) for Δφ.
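The support limits can be sketched in a few lines. The code below writes Equation (23) with half-angle sines, which is algebraically identical because 1 − cos(α) = 2 sin²(α/2), and inverts it to obtain the largest possible |Δφ| for a given cross-power. The function names and the random spot check are illustrative assumptions, not part of the original derivation.

```python
import cmath
import math
import random


def p_hat(dphi, C, psi):
    """Upper cross-power limit of Equation (23), written with half-angle sines."""
    denom = math.sin(dphi / 2) ** 2          # equals (1 - cos(dphi)) / 2
    if denom == 0.0:
        return math.inf                      # no upper limit at an IPD of zero
    return C ** 2 * math.sin(psi / 2) ** 2 / denom


def dphi_hat(p, C, psi):
    """Largest possible |IPD| for cross-power p (Equation 23 solved for dphi)."""
    p_at_pi = C ** 2 * math.sin(psi / 2) ** 2   # value of the limit at dphi = +-pi
    if p <= p_at_pi:
        return math.pi                          # every IPD is possible
    return 2 * math.asin(math.sqrt(p_at_pi / p))


# Spot check with random noise samples: no sample should exceed the limit
rng = random.Random(0)
C, psi, sigma = 1.0, 2.0, 1.0
tone_l, tone_r = C * cmath.exp(1j * psi / 2), C * cmath.exp(-1j * psi / 2)
violations = 0
for _ in range(2000):
    noise = complex(rng.gauss(0, sigma), rng.gauss(0, sigma))
    z2 = (noise + tone_l) * (noise + tone_r).conjugate()
    if abs(z2) > p_hat(cmath.phase(z2), C, psi) * (1 + 1e-9):
        violations += 1
print(violations)  # expected to be 0 under the signal model
```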

FIGURE 2

Figure 2. (A–D) Some examples of the joint PDF of IAR and IPD given in Equation (18). All plots show results for a tone IPD of π with the SNR increasing from left to right. Angles in the polar plots are the IPDs, while the radial variable is the IAR. Colors indicate the probability density. A logarithmically scaled colormap was used due to the large dynamic range of the PDF. White areas located at an IAR of 1 and an IPD of 0 for 0 and 10 dB indicate a probability density of 0. (E–H) Joint PDF for cross-power and IPDs given in Equation (20). Results are shown for the same parameters as in (A–D). As in the first row of plots, angles indicate the IPD and colors the probability density. The radial variable, however, is the cross-power. These PDFs were calculated for a noise variance of σ² = 1. A logarithmically scaled colormap was used due to the large dynamic range of the PDF. White areas indicate undefined combinations of cross-power and IPD as defined by Equation (23). (I–K) Evaluation of the analytical results by comparing the derived marginal PDFs with numerically estimated PDFs. In all cases, black dashed lines indicate analytical results obtained from Equations (24)–(27). Colored lines indicate results that were instead numerically estimated from waveforms. Panel (I) shows marginal PDFs for IPDs ΔΦ, (J) for ILDs ΔL, and (K) for the cross-power P′.

Similar to Equation (18), which defined the distribution of all possible combinations of Δφ and r, this function can be interpreted as the distribution of all possible combinations of Δφ and p′. However, the range of these combinations is limited by Equation (23), so that large areas of the exemplary PDFs shown in Figures 2E–H are undefined. This limitation will be treated further in the discussion.

The marginal PDFs of the IAR R, the ILD ΔL, the IPD ΔΦ, and the cross-power P′ can be calculated from the joint PDFs defined in Equations (18)–(20) by integrating over the other variable:

\begin{array}{l} f_{Δ Φ} (Δ φ) = \int_{0}^{\infty} f_{R, Δ Φ} (r, Δ φ) d r \\ = \int_{0}^{\hat{p^{'}} (Δ φ)} f_{P^{'}, Δ Φ} (p^{'}, Δ φ) d p^{'} & (24) \end{array}

\begin{array}{l} f_{R} (r) = \int_{- π}^{π} f_{R, Δ Φ} (r, Δ φ) d Δ φ & (25) \end{array}

\begin{array}{l} f_{P^{'}} (p^{'}) = \int_{- \hat{Δ φ} (p^{'})}^{\hat{Δ φ} (p^{'})} f_{P^{'}, Δ Φ} (p^{'}, Δ φ) d Δ φ & (26) \end{array}

\begin{array}{l} f_{Δ L} (Δ l) = \int_{- π}^{π} f_{Δ L, Δ Φ} (Δ l, Δ φ) d Δ φ . & (27) \end{array}
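The first form of Equation (24) can be illustrated with a very simple quadrature. The sketch below integrates Equation (18) over r with a plain midpoint rule after the substitution r = t/(1 − t); this is a deliberately simple stand-in for an adaptive quadrature routine, and all function names and grid sizes are illustrative assumptions. The Equation (18) helper is included so that the block runs on its own.

```python
import math


def joint_pdf_iar_ipd(r, dphi, snr, psi):
    """Equation (18), parameterized by the linear SNR = C**2 / (2 * sigma**2)."""
    h0 = r ** 2 - 2 * r * math.cos(dphi) + 1
    h_psi = r ** 2 - 2 * r * math.cos(dphi - psi) + 1
    if h0 == 0.0:
        return 0.0
    exponent = -snr * h_psi / h0
    if exponent < -700.0:
        return 0.0
    return 4 * snr * r * math.sin(psi / 2) ** 2 / (math.pi * h0 ** 2) * math.exp(exponent)


def marginal_ipd_pdf(dphi, snr, psi, n=400):
    """Marginal IPD PDF: midpoint rule over t in (0, 1) with r = t / (1 - t)."""
    total = 0.0
    for k in range(n):
        t = (k + 0.5) / n
        r = t / (1.0 - t)
        # dr = dt / (1 - t)**2 accounts for the substitution
        total += joint_pdf_iar_ipd(r, dphi, snr, psi) / (1.0 - t) ** 2
    return total / n


# Normalization check for an antiphasic tone at an SNR of 0 dB
m = 180
snr, psi = 1.0, math.pi
step = 2 * math.pi / m
norm = sum(marginal_ipd_pdf(-math.pi + (j + 0.5) * step, snr, psi) for j in range(m)) * step
print(norm)  # should be close to 1 if Equation (18) is a proper PDF
```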

As previously discussed, the PDFs of Δφ and Δl (and thus r) only depend on the SNR and not on the absolute stimulus power. The cross-power P′, however, is the product of the left and right stimulus envelope and must thus also depend on stimulus power. For this reason, PDFs for P′ will always be shown normalized by C² so that PDFs only depend on the SNR and are independent of overall stimulus power.

No closed-form solution for Equations (24)–(27) could be found, so numerical integration was used to evaluate them (QUADPACK algorithms QAGS/QAGI; Piessens et al., 1983). Figures 2I–K show some examples of the PDFs of ΔΦ, ΔL, and P′ and verify the results by comparing Equations (24)–(27) to PDFs that were numerically estimated from signal waveforms.
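A comparison of this kind can be sketched with independent Gaussian samples of the noise baseband instead of full waveforms, which is an assumption of this example and not the procedure used for Figures 2I–K. For ψ = π, the signal model additionally provides a closed-form reference that is not stated in the article: |Δφ| exceeds π/2 exactly when the noise envelope is smaller than C, which happens with probability 1 − exp(−SNR) for Gaussian noise.

```python
import cmath
import math
import random


def sample_ipds(n, snr, psi, sigma=1.0, seed=0):
    """Draw n instantaneous IPDs of an N0S_psi stimulus from the baseband model."""
    rng = random.Random(seed)
    C = math.sqrt(2 * snr) * sigma  # from SNR = C**2 / (2 * sigma**2)
    tone_l, tone_r = C * cmath.exp(1j * psi / 2), C * cmath.exp(-1j * psi / 2)
    ipds = []
    for _ in range(n):
        noise = complex(rng.gauss(0, sigma), rng.gauss(0, sigma))
        ipds.append(cmath.phase((noise + tone_l) * (noise + tone_r).conjugate()))
    return ipds


def histogram_density(values, n_bins=36):
    """Histogram of angles on [-pi, pi], normalized to integrate to one."""
    width = 2 * math.pi / n_bins
    counts = [0] * n_bins
    for v in values:
        counts[min(int((v + math.pi) / width), n_bins - 1)] += 1
    return [c / (len(values) * width) for c in counts]


ipds = sample_ipds(20000, snr=1.0, psi=math.pi)
density = histogram_density(ipds)  # empirical estimate of the marginal IPD PDF
# Fraction of samples with |IPD| > pi/2 versus the reference 1 - exp(-SNR)
fraction = sum(abs(v) > math.pi / 2 for v in ipds) / len(ipds)
print(fraction, 1 - math.exp(-1.0))
```

The histogram in `density` is the quantity that would be compared against the numerically integrated marginal PDF; the tail fraction is only a coarse sanity check.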

3. Discussion

Figures 2A–D show joint PDFs for IAR and IPD calculated at a tone IPD of ψ = π and different SNRs. Without any tone, this distribution would equal a delta distribution with infinite probability density at an IPD of zero and an IAR of 1. At low SNRs (Figures 2A,B), the antiphasic tone has only a small influence on the noise, resulting in probability densities that are still tightly clustered around an IPD of 0 and an IAR of 1. With increasing amplitude of the tone and thus increasing SNR, this clustering becomes less pronounced (Figures 2B,C). When the tone starts to dominate the stimulus, the probability density becomes highest around the tone IPD of π (Figures 2C,D). At large SNRs, the PDF would converge toward a delta distribution at the tone IPD of π and an IAR of 1. Figures 2E–H show joint PDFs for cross-power and IPD at the same conditions as used in Figures 2A–D. Without the antiphasic tone, the probability density would be concentrated on a single line at zero IPD. In addition, the signal would be diotic, so the cross-power would equal the stimulus power and the cross-power distribution would equal that of the squared envelope. At low SNRs (Figures 2E,F), the addition of the tone starts to introduce IPD fluctuations, thus widening the joint PDF. Large areas of these joint PDFs are, however, undefined. These undefined areas are determined by Equation (23) and become intuitive when studying the signal model shown in Figure 1B. At low tone amplitudes C, large IPDs can only occur at moments where the envelope of the noise, and thus |x + iy|, is small. This also results in a small cross-power $p^{'} = a_{L} \times a_{R}$. With increasing C, large IPDs can then also appear at increasingly large values of p′. This is seen in Figure 2G and especially in Figure 2H.

While joint PDFs are the main contribution of this study, they are hard to visualize and, consequently, difficult to discuss in detail. Instead, the following section discusses marginal PDFs for IPDs, cross-power, and ILDs as a function of different stimulus properties. These PDFs lack information about the interaction between the individual metrics, such as IPD and cross-power or ILD. However, they do convey the impact of different metrics more intuitively. Figures 3A,D show examples of the marginal IPD PDFs for ψ = π and ψ = π/2 while varying the SNR. The instantaneous IPD Δφ can be interpreted as the result of mixing the zero IPD of the diotic noise with the IPD ψ of the tone. The weighting of the two IPDs is determined by the noise's instantaneous power relative to the tone's power. Thus, at large negative SNRs, where the stimulus is dominated by noise, IPD PDFs show a mean value close to zero and only little variance. With increasing SNR, the IPDs are increasingly influenced by the tone IPD, so that the distribution's mean moves toward ψ and the variance increases. At larger positive SNRs, where the noise power is small compared to the tone, the IPDs are dominated by the tone IPD ψ, so that the variance decreases again. In the two extreme cases where the SNR would be either −∞ or +∞, the signal consists of only the noise or only the tone, so that neither IPD nor ILD fluctuates and both PDFs are δ-distributions. For the IPD, this distribution is located either at zero (SNR = −∞) or at ψ (SNR = +∞), while the ILD distribution is always centered at 0 dB.

Figures 3B,E show ILD PDFs for the same parameters as used for the IPD PDFs in Figures 3A,D. Instantaneous ILDs Δl are a direct result of the relative energy of the instantaneous noise and the tone. As a result, ILD PDFs exhibit the same change of variance as discussed for the IPDs: low variance at both high and low SNRs, where the stimulus is dominated by either the tone or the noise, and an increase of variance at intermediate SNRs.

Figures 3C,F show distributions for the remaining parameter P′ plotted in decibels relative to the squared amplitude of the tone. For large SNRs, the signal is dominated by the tone, and p′/C² is thus narrowly distributed around 0 dB. With decreasing SNR, the noise power increases relative to C², so that the peak of the distribution shifts toward larger values of p′/C² while the overall shape of the distribution remains largely unchanged.

FIGURE 3

Figure 3. Exemplary marginal PDFs for IPDs (first column), ILDs (second column), and the cross-power (third column). For better visualization, the cross-power values were normalized with the squared tone amplitude so that the x-axis shows $10 {log}_{10} (p^{'} / C^{2})$. (A–F) PDFs calculated for two fixed signal phases ψ = π (top row) and ψ = π/2 (bottom row). Different colors indicate results at different SNRs. (G–L) PDFs calculated for two fixed SNRs: −10 dB (top row) and 0 dB (bottom row). Different colors indicate results at different signal phases ψ.

Figures 3G–L additionally show IPD, ILD, and P′ PDFs for cases where the SNR was fixed while varying ψ. From the vector summation shown in Figure 1B, it is intuitive that, at the same tone amplitude C, a smaller value of ψ also results in smaller IPDs. As a direct consequence, IPD and ILD PDFs also show less variance for smaller values of ψ. The PDFs for P′, however, are largely uninfluenced by ψ, with the notable exception of a sharp peak located at p′/C² = sin²(ψ/2). This peak is a consequence of Equation (23), which limits the possible combinations of IPDs and P′. To better understand the origin of this peak, Figure 4 shows joint PDFs of IPD and P′. Notably, the probability density is heavily clustered close to the limit defined by Equation (23). The low slope of the limiting ${\hat{p}}^{'}$ function toward ±π in combination with the accumulation of probability density along this limit results in the observed peak in the cross-power PDFs. It follows from Equation (23) that ${\hat{p}}^{'} (Δ φ = \pm π) = C^{2} {sin}^{2} (ψ / 2)$, which is the location of the peaks in Figures 3I,L.

FIGURE 4

Figure 4. Joint probability density functions of the cross-power P′ and IPD as defined in Equation (20). For better comparison, the cross-power was normalized with the squared tone amplitude so that the y-axis shows $10 {log}_{10} (p^{'} / C^{2})$. The top row shows PDFs at an SNR of −10 dB, while the bottom row shows PDFs at an SNR of 10 dB. Each panel shows the PDF for a different combination of SNR and tone IPD ψ. The horizontal dashed black lines indicate the location where p′ = C², so that the normalized cross-power is 0 dB. The vertical black lines indicate where the IPD matches the tone phase, Δφ = ψ. Note that the color map is logarithmically scaled and that the color scale was limited to values between 10⁻³ and 1.

All PDFs derived above show discontinuities for Δφ ∈ {0, ±π}, for which the probability densities approach zero. In other words, an N₀S_ψ stimulus will never contain IPDs that are exactly zero or π. Both discontinuities can be understood when keeping in mind that the IPD is defined by Δφ = arctan2(υ, ξ), which can only result in a value of 0 or ±π if υ = 0. According to Equation (12), this is only the case when x = −C cos (ψ/2). As the probability of x taking this exact value approaches zero, the joint PDFs will also approach zero. For further discussion of the PDFs, however, this discontinuity was not shown explicitly in the plots above, as its implication in practice is limited.

Furthermore, the PDFs derived in this study are independent of noise spectrum and bandwidth. They are thus valid for any Gaussian noise with zero mean. Further, the tone frequency does not need to be located within the noise spectrum. However, with auditory processing, especially peripheral filtering, the spectrum will influence the effective SNR at the level of binaural interaction and, thus, the PDFs of the encoded binaural cues. In these cases, PDFs will be determined by the effective SNR of the stimulus as processed, meaning after considering the bandpass properties of the auditory periphery. While all PDFs were derived for the diotic noise case N₀S_ψ, they can easily be generalized to cases where an additional phase delay ψ₂ is applied to the whole stimulus. Such a signal could then be referred to as (N₀S_ψ)_ψ₂ and would result in identical IPD distributions as in the N₀S_ψ case but shifted by ψ₂ with ILD and P′ distributions remaining unchanged.

3.1. Quantifying IPD and ILD variability

Multiple studies have used models making use of the variability of IPDs, ILDs, or a combination of the two, as a detection cue for tone in noise experiments (e.g., Davidson et al., 2009; Dietz et al., 2021; Encke and Dietz, 2022; Eurich et al., 2022) or for decorrelation detection (Goupell and Hartmann, 2007). Based on the derived PDFs, the following section will thus discuss different measures for the amount of IPD and ILD fluctuation for the special case of N₀S_π.

The amount of ILD fluctuations can be quantified by calculating the variance V of the underlying distribution defined as:

\begin{array}{l} V = \langle Δ L {(t)}^{2} \rangle = \int_{- π}^{π} \int_{- \infty}^{\infty} Δ l^{2} f_{Δ L, Δ Φ} (Δ l, Δ φ) d Δ l d Δ φ, & (28) \end{array}

where the angular brackets symbolize the ensemble average. The resulting variance as a function of SNR is shown in Figure 5A. As expected from the plots in Figure 3, ILD variance first increases with SNR until reaching its maximum around an SNR of −0.73 dB from where the variance decreases as the tone starts to dominate the stimulus.

FIGURE 5

Figure 5. (A) Variance of ILDs in an N₀S_π signal calculated at different SNRs. The dashed line marks the maximum of the function. (B) Circular IPD variance in an N₀S_π signal calculated at different SNRs (blue line) and the matching interaural coherence (gray line). Dotted lines indicate the location of the maximum in variance and minimum in coherence. (C) Circular IPD variance as a function of stimulus coherence for an N₀S_π stimulus (gray line and symbols) as well as (partly) decorrelated noise (dashed black line) (Just and Bamler, 1994). Symbols and labels indicate SNRs resulting in a given combination of coherence and variance.

Most previous studies relied on the regular variance (or standard deviation $\sqrt{V}$ ) as defined in Equation (28) when quantifying IPD variance (Goupell and Hartmann, 2007; Davidson et al., 2009). This approach makes sense at low SNRs where IPDs are narrowly distributed around 0. At higher SNRs, however, the distribution starts to move toward a mean value of π, and calculating the regular variance is of little significance. An alternative and better-suited metric for quantifying the IPD variability is the circular variance V_circ (Fisher, 1993) defined as:

\begin{array}{l} V_{circ} = 1 - | \langle e^{i Δ Φ (t)} \rangle | = 1 - | \int_{- π}^{π} \int_{0}^{{\hat{p}}^{'} (Δ φ)} e^{i Δ φ} f_{P^{'}, Δ Φ} (p^{'}, Δ φ) d p^{'} d Δ φ |, & (29) \end{array}

where the angular brackets symbolize the ensemble average. V_circ can take values between 0 and 1, with a value of 0 indicating no IPD fluctuations. In contrast, a value of 1 indicates a wide distribution of IPDs (but not necessarily a uniform distribution). The gray line in Figure 5B shows the circular variance as a function of SNR. Like the ILD variance, the circular IPD variance increases with increasing SNR until reaching its maximum around an SNR of −1.93 dB, from where the variance starts to decrease.

A second and alternative metric for quantifying the amount of IPD fluctuations has recently been shown to directly account for the detection performance in a variety of tone in noise tasks: The interaural coherence¹ |γ| (Encke and Dietz, 2022; Eurich et al., 2022). The interaural coherence is defined as the modulus of the complex-valued correlation coefficient and can be calculated as:

\begin{array}{l} | γ | = \frac{| \langle R_{a} (t) L_{a}^{*} (t) \rangle |}{\sqrt{\langle | R_{a} (t) |^{2} \rangle \langle | L_{a} (t) |^{2} \rangle}} = \frac{| \langle P^{'} (t) e^{i Δ Φ (t)} \rangle |}{\sqrt{\langle | R_{a} (t) |^{2} \rangle \langle | L_{a} (t) |^{2} \rangle}} & (30) \end{array}

\begin{array}{l} = \frac{1}{2 σ^{2} + C^{2}} | \int_{- π}^{π} \int_{0}^{{\hat{p}}^{'} (Δ φ)} p^{'} e^{i Δ φ} f_{P^{'}, Δ Φ} (p^{'}, Δ φ) d p^{'} d Δ φ |, & (31) \end{array}

where R_a and L_a are the analytic representations of the right- and left-ear signals, the asterisk symbolizes the complex conjugate, and σ² and C are the variance of the noise and the amplitude of the tone, respectively. Comparing this equation to the definition of V_circ in Equation (29) shows that the two measures are closely related, with the main difference being that |γ| weights the IPDs by p′ before averaging. This weighting requires a normalization, which is achieved by the term before the integrals. In addition, the two metrics show inverse behavior: a stimulus with no IPD fluctuations will result in an interaural coherence of |γ| = 1, while the circular variance would be V_circ = 0.
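The three measures discussed in this section can also be estimated by Monte Carlo sampling of the baseband model, which avoids the double integrals in Equations (28), (29), and (31). The following sketch is illustrative only: it draws independent noise samples, uses a hypothetical function name, and estimates the ensemble averages by sample means.

```python
import cmath
import math
import random


def variability_metrics(snr, psi=math.pi, n=20000, sigma=1.0, seed=0):
    """Monte Carlo estimates of ILD variance, circular IPD variance, and coherence."""
    rng = random.Random(seed)
    C = math.sqrt(2 * snr) * sigma  # from SNR = C**2 / (2 * sigma**2)
    tone_l, tone_r = C * cmath.exp(1j * psi / 2), C * cmath.exp(-1j * psi / 2)
    ild_sq = 0.0
    phasor_sum = 0j   # sum of exp(i * IPD), used for Equation (29)
    cross_sum = 0j    # sum of P' * exp(i * IPD), used for Equation (30)
    power_l = power_r = 0.0
    for _ in range(n):
        noise = complex(rng.gauss(0, sigma), rng.gauss(0, sigma))
        z_l, z_r = noise + tone_l, noise + tone_r
        z2 = z_l * z_r.conjugate()
        ild_sq += (20 * math.log10(abs(z_l) / abs(z_r))) ** 2
        phasor_sum += z2 / abs(z2)
        cross_sum += z2
        power_l += abs(z_l) ** 2
        power_r += abs(z_r) ** 2
    ild_variance = ild_sq / n                   # mean squared ILD, Equation (28)
    v_circ = 1 - abs(phasor_sum) / n            # circular variance
    coherence = abs(cross_sum) / math.sqrt(power_l * power_r)
    return ild_variance, v_circ, coherence


for snr_db in (-30, -10, 0, 10):
    print(snr_db, variability_metrics(10 ** (snr_db / 10)))
```

Because all three estimates come from the same samples, the loop makes it easy to compare how the measures change with SNR, for example that the coherence estimate is close to zero at 0 dB for the antiphasic tone.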

An interesting property of |γ| is that any stimulus with a real-valued cross-power density spectrum, such as N₀S_π, also results in a real-valued γ, which then equals the interaural (Pearson) correlation. Figure 5B shows the interaural coherence (and thus correlation) as a function of SNR (blue line). As expected from the previous discussions, the coherence decreases with increasing SNR until reaching a coherence of zero at an SNR of 0 dB, from where it starts to increase. Surprisingly, however, the minimum in coherence does not match the maximum in IPD or ILD variability. Figure 5C thus shows the same data as in panel (B) but plots the IPD variance as a function of coherence. The same plot also shows the IPD variance of two partly correlated noise tokens as a function of coherence. From this figure, one can appreciate that, depending on the stimulus, the same coherence can result in different amounts of IPD variance. These differences are caused by the p′ weighting of IPDs that is included when calculating |γ| (see Equation 31). Two stimuli that share the same IPD PDF but differ in their P′ PDFs would thus also differ in their coherence.

4. Summary

This study aimed to derive the joint PDFs for ILDs (IARs) and IPDs as well as for IPDs and P′. The two functions are given by Equations (19) and (20). These equations are a key component for understanding how the SNR and ψ influence the magnitude of binaural unmasking when IPD and ILD variance are considered as the underlying cue. The approach used to derive the PDFs can further serve as a template for other types of binaural signals. In the future, it will hopefully help to provide a better understanding of how different stimulus statistics influence binaural unmasking.

Data availability statement

The original contributions presented in the study are included in the article/Supplementary material; further inquiries can be directed to the corresponding author.

Author contributions

JE and MD designed the research and wrote the paper. JE conducted the calculations, analyzed the data, and produced the figures. All authors contributed to the article and approved the submitted version.

Funding

This work was supported by the European Research Council (ERC) under the European Union's Horizon 2020 Research and Innovation Programme grant agreement No. 716800 (ERC Starting Grant to MD).

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher's note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Supplementary material

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fnins.2022.1022308/full#supplementary-material

Footnotes

1. Note that there are several different definitions of coherence. Our use of coherence as |γ| is a typical time-domain definition (Saleh, 2007). In general signal processing, the coherence function is instead often defined in the frequency domain and calculated as the normalized absolute value of the cross-spectral power density (CSPD) (Shin, 2008). The two definitions are closely related, as the time-domain coherence can also be defined by using a Fourier transform of the CSPD. In binaural research, a third definition exists, where interaural coherence is sometimes used to refer to the maximum of the real-valued cross-correlation function (Blauert, 1983).

References

Blauert, J. (1983). Spatial Hearing: The Psychophysics of Human Sound Localization. Cambridge, MA: MIT Press.

Culling, J. F., and Lavandier, M. (2021). “Chapter 8: Binaural unmasking and spatial release from masking,” in Binaural Hearing, eds R. Y. Litovsky, M. J. Goupell, R. R. Fay, and A. N. Popper (Cham: Springer International Publishing), 209–241. doi: 10.1007/978-3-030-57100-9_8

Davidson, S. A., Gilkey, R. H., Colburn, H. S., and Carney, L. H. (2009). An evaluation of models for diotic and dichotic detection in reproducible noises. J. Acoust. Soc. Am. 126, 1906. doi: 10.1121/1.3206583

Dietz, M., Encke, J., Bracklo, K. I., and Ewert, S. D. (2021). Tone detection thresholds in interaurally delayed noise of different bandwidths. Acta Acust. 5, 60. doi: 10.1051/aacus/2021054

Durlach, N. I., Gabriel, K. J., Colburn, H. S., and Trahiotis, C. (1986). Interaural correlation discrimination: II. Relation to binaural unmasking. J. Acoust. Soc. Am. 79, 1548–1557. doi: 10.1121/1.393681

Encke, J., and Dietz, M. (2022). A hemispheric two-channel code accounts for binaural unmasking in humans. Commun. Biol. 5. doi: 10.1038/s42003-022-04098-x

Eurich, B., Encke, J., Ewert, S. D., and Dietz, M. (2022). Lower interaural coherence in off-signal bands impairs binaural detection. J. Acoust. Soc. Am. 151, 3927–3936. doi: 10.1121/10.0011673

Fisher, N. I. (1993). “Chapter 2: Descriptive methods,” in Statistical Analysis of Circular Data (Cambridge, UK: Cambridge University Press), 15–38. doi: 10.1017/CBO9780511564345.004

Goupell, M. J., and Hartmann, W. M. (2006). Interaural fluctuations and the detection of interaural incoherence: bandwidth effects. J. Acoust. Soc. Am. 119, 3971–3986. doi: 10.1121/1.2200147

Goupell, M. J., and Hartmann, W. M. (2007). Interaural fluctuations and the detection of interaural incoherence. III. Narrowband experiments and binaural models. J. Acoust. Soc. Am. 122, 1029–1045. doi: 10.1121/1.2734489

Henning, G. B. (1973). Effect of interaural phase on frequency and amplitude discrimination. J. Acoust. Soc. Am. 54, 1160–1178. doi: 10.1121/1.1914363

Hirsh, I. J. (1948). The influence of interaural phase on interaural summation and inhibition. J. Acoust. Soc. Am. 20, 536–544. doi: 10.1121/1.1906407

Just, D., and Bamler, R. (1994). Phase statistics of interferograms with applications to synthetic aperture radar. Appl. Opt. 33, 4361. doi: 10.1364/AO.33.004361

Levitt, H., and Lundry, E. A. (1966). Binaural vector model: relative interaural time differences. J. Acoust. Soc. Am. 40, 1251–1251. doi: 10.1121/1.1943044

Meurer, A., Smith, C. P., Paprocki, M., Čertík, O., Kirpichev, S. B., Rocklin, M., et al. (2017). Sympy: symbolic computing in Python. PeerJ Comput. Sci. 3, e103. doi: 10.7717/peerj-cs.103

Piessens, R., de Doncker-Kapenga, E., Überhuber, C. W., and Kahaner, D. K. (1983). Quadpack. Berlin; Heidelberg: Springer. doi: 10.1007/978-3-642-61786-7

Saleh, B. (2007). Fundamentals of Photonics. Hoboken, NJ: Wiley-Interscience.

Shin, K. (2008). Fundamentals of Signal Processing for Sound and Vibration Engineers. Chichester, Hoboken, NJ: John Wiley & Sons.

Zurek, P. M. (1991). Probability distributions of interaural phase and level differences in binaural detection stimuli. J. Acoust. Soc. Am. 90, 1927–1932. doi: 10.1121/1.401672

Keywords: sound localization, probability density function, interaural level difference, interaural phase difference, tone in noise detection, binaural unmasking

Citation: Encke J and Dietz M (2022) Statistics of the instantaneous interaural parameters for dichotic tones in diotic noise (N₀S_ψ). Front. Neurosci. 16:1022308. doi: 10.3389/fnins.2022.1022308

Received: 18 August 2022; Accepted: 19 October 2022;
Published: 08 November 2022.

Edited by:

Huiming Zhang, University of Windsor, Canada

Reviewed by:

John Culling, Cardiff University, United Kingdom
Herman Myburgh, University of Pretoria, South Africa

Copyright © 2022 Encke and Dietz. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Jörg Encke, joerg.encke@uni-oldenburg.de
