A Note on Capon's Minimum Variance Projection for Multi-Spacecraft Data Analysis

Narita, Yasuhito

doi:10.3389/fphy.2019.00008

PERSPECTIVE article

Front. Phys., 01 February 2019

Sec. Space Physics

Volume 7 - 2019 | https://doi.org/10.3389/fphy.2019.00008

Yasuhito Narita^1,2^*

¹Space Research Institute, Austrian Academy of Sciences, Graz, Austria
²Institut für Geophysik und Extraterrestrische Physik, Technische Universität Braunschweig, Braunschweig, Germany

Capon's minimum variance projection for multi-point measurements is revisited using the method of the likelihood function to derive the minimum variance projection and a simplified error estimate analytically. Theoretical construction of the minimum variance projection assumes a Gaussian form of the likelihood function and also regards the data covariance as a proxy of the noise covariance. The minimum variance projection is extended to the problem of two-spacecraft mode decomposition in the Mercury magnetosphere, in which the magnetic field is a superposition of the constant field from the current sheet and the dipolar field from the planet. The extension of the Capon estimator (the data-variance projection) can identify the signal amplitudes of the different fields with sufficient accuracy when the statistical averaging is properly done. The Capon estimator serves as a powerful analysis tool when the spatial resolution is limited to only a few points.

1. Capon's Minimum Variance Projection

Minimum variance projection introduced by Capon [1] has a wide range of applications in the geophysical and space physical research fields whenever multi-point measurements are available.

Various formations are possible in the multi-point measurements (Figure 1): the THEMIS mission in a one-dimensional array aligned with a magnetic field line [2], the Swarm mission spanning a plane with three spacecraft [3, 4], and the Cluster mission [5] and the MMS mission [6], each forming a tetrahedron.

Figure 1. Multi-point spacecraft configurations.

The projection works with various kinds of shape vectors (or models for the data) by minimizing the projection error without changing the signal amplitude. The minimum variance projection is, after Capon [1] or Haykin [7], obtained by solving a constrained optimization problem:

\begin{array}{l} \text{minimize } {\vec{w}}^{†} R \vec{w} \text{ subject to } {\vec{w}}^{†} \vec{h} = 1, & (1) \end{array}

or, by formulating into a variational problem using the variation operator δ[⋯ ] and the Lagrangian multiplier λ,

\begin{array}{l} δ [{\vec{w}}^{†} R \vec{w} - λ ({\vec{w}}^{†} \vec{h} - 1)] = 0 . & (2) \end{array}

Here $\vec{w}$ is the weight vector operating on the measurement covariance matrix $R = 〈 \vec{d} {\vec{d}}^{†} 〉$ , $\vec{d}$ the measurement data in a vectorial form, 〈⋯ 〉 the operation of ensemble averaging, and $\vec{h}$ the shape vector. The problem (Equation 2) can be solved analytically. The solution is, by treating the shape vector as a complex-number vector,

\begin{array}{l} \vec{w} = \frac{R^{- 1} \vec{h}}{{\vec{h}}^{†} R^{- 1} \vec{h}} . & (3) \end{array}

See, for example, Haykin [7] for the derivation. The projected power (squared signal amplitude) is

\begin{array}{l} P = {\vec{w}}^{†} R \vec{w} = {[{\vec{h}}^{†} R^{- 1} \vec{h}]}^{- 1} . & (4) \end{array}
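Equations (3) and (4) can be evaluated numerically in a few lines. The following is a minimal sketch, assuming a hypothetical one-dimensional array of four sensors observing a single plane wave in white noise; the sensor positions, the wavenumber, the noise level, and the function name `capon_weight_and_power` are illustrative choices and are not taken from the cited works.

```python
import numpy as np

def capon_weight_and_power(R, h):
    """Weight vector of Equation (3) and projected power of Equation (4)."""
    Rinv_h = np.linalg.solve(R, h)        # R^{-1} h without forming the inverse
    denom = np.vdot(h, Rinv_h).real       # h^dagger R^{-1} h (real for Hermitian R)
    return Rinv_h / denom, 1.0 / denom

# Hypothetical synthetic data: one plane wave plus complex white noise.
rng = np.random.default_rng(0)
positions = np.array([0.0, 1.0, 2.5, 4.0])     # assumed sensor positions
k_true = 0.8                                   # assumed wavenumber
h = np.exp(1j * k_true * positions)            # plane-wave shape vector
n_samples = 500
amp = rng.normal(size=n_samples) + 1j * rng.normal(size=n_samples)
noise = 0.1 * (rng.normal(size=(n_samples, 4))
               + 1j * rng.normal(size=(n_samples, 4)))
d = amp[:, None] * h[None, :] + noise

# Ensemble-averaged covariance R_ij = <d_i d_j^*>.
R = d.T @ d.conj() / n_samples

w, P = capon_weight_and_power(R, h)
gain = np.vdot(w, h)                      # w^dagger h, should equal 1
power_check = np.vdot(w, R @ w).real      # w^dagger R w, should equal P
```

The last two lines check the unit-gain constraint of Equation (1) and the equality of the two expressions for the projected power in Equation (4).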

A larger set of algorithms has so far been developed in the frame of direction-of-arrival estimation in the adaptive filter theory. Many algorithms were recently reviewed by Khmou et al. [8], including the beamforming method, Bartlett method, Capon method, linear prediction method, maximum entropy method, Pisarenko harmonic decomposition, minimum norm, MUSIC algorithm, propagator method, and partial covariance matrix method. In the multi-spacecraft wave-field analysis, three methods are most relevant: the beamforming method, the Capon method, and the MSR method [9].

• The beamforming method is easy to implement in the data analysis, but provides a lower resolution of the signal-to-noise ratio in the spectral analysis than the Capon method when only a few data points are used in the analysis [10, 11].

• The Capon method is versatile in the multi-spacecraft data analysis because the method can be extended in various ways, e.g., to the vectorial data set (the wave telescope technique) [12, 13], to different field types (e.g., electric field and magnetic field used in the k-filtering technique) [14], and to the mode decomposition (this paper).

• The MSR method [11] uses both the Capon (or the wave telescope) method and the eigenvector-based method (the extended MUSIC algorithm), and provides an improved signal-to-noise ratio in the wavevector spectrum. The MSR method is optimized in the estimate of the total fluctuation energy (i.e., trace of the spectral density matrix) but not to the matrix elements. Also, the MSR method is constrained to the isotropic noise assumption.

So far, three shape vectors for the Capon method have successfully been applied to the multi-spacecraft data analysis in the field of space physics:

1. Plane waves [1, 12, 14]:

\begin{array}{l} h_{i} (k_{x}, k_{y}, k_{z}) = exp [i (k_{x} r_{x i} + k_{y} r_{y i} + k_{z} r_{z i})], & (5) \end{array}

where {k_x, k_y, k_z} are the three components of the wavevectors, {r_xi, r_yi, r_zi} the spatial coordinates of the i-th sensor, and i the imaginary number unit.

2. Spherical waves [15, 16]:

\begin{array}{l} h_{i} (k, r_{xc}, r_{yc}, r_{zc}) = {[\sum_{j = 1}^{n} \frac{1}{| {\vec{r}}_{j} - {\vec{r}}_{c} |^{2}}]}^{- 1 / 2} \frac{exp [i k | {\vec{r}}_{i} - {\vec{r}}_{c} |]}{| {\vec{r}}_{i} - {\vec{r}}_{c} |}, & (6) \end{array}

where k is the wavenumber, and ${\vec{r}}_{c} = [r_{xc}, r_{yc}, r_{zc}]$ the coordinate of the spherical wave center. The summation in the normalization constant runs over the number of sensors n.

3. Phase-shifted waves [17]:

\begin{array}{l} h_{i} (Φ, r_{xc}, Δ x, k_{y}) = exp [i (Φ arctan (\frac{r_{x i} - r_{xc}}{Δ x}) + k_{y} r_{y i})], & (7) \end{array}

where Φ is the amount of phase jump, r_xc the x-coordinate of the phase jump center, Δx the x-width (or range in x) of the phase jump around the center, and k_y the y-component of the wavenumber.
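The three shape vectors of Equations (5) to (7) can be written as short functions. The sketch below is only illustrative: the function names, the tetrahedron-like sensor coordinates, and all parameter values are assumptions made for this example.

```python
import numpy as np

def plane_wave_shape(k, r):
    """Equation (5): h_i = exp(i k . r_i); r has shape (n, 3), k has shape (3,)."""
    return np.exp(1j * (r @ k))

def spherical_wave_shape(k, r, r_c):
    """Equation (6): spherical wave centred at r_c, normalised over the n sensors."""
    dist = np.linalg.norm(r - r_c, axis=1)
    norm = np.sum(1.0 / dist**2) ** -0.5
    return norm * np.exp(1j * k * dist) / dist

def phase_shifted_shape(phi, x_c, dx, k_y, r):
    """Equation (7) as written: arctan-shaped phase change in x, propagation in y."""
    return np.exp(1j * (phi * np.arctan((r[:, 0] - x_c) / dx) + k_y * r[:, 1]))

# Hypothetical sensor coordinates (arbitrary length unit).
r = np.array([[0.0, 0.0, 0.0],
              [1.0, 0.0, 0.0],
              [0.5, 0.866, 0.0],
              [0.5, 0.289, 0.816]])

h_plane = plane_wave_shape(np.array([0.3, -0.2, 0.1]), r)
h_sph = spherical_wave_shape(0.5, r, np.array([5.0, 5.0, 5.0]))
h_phase = phase_shifted_shape(np.pi, 0.5, 0.2, 0.1, r)
```

The plane-wave and phase-shifted vectors have unit modulus at every sensor, while the spherical-wave vector is normalised such that the squared moduli sum to one over the sensors.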

Capon's minimum variance projection uses the measurement data to guide the projection by minimizing the estimated power during the projection (in the spirit of minimizing uncertainty) while keeping the gain. We address the question here, “Why is the inverse of the data covariance matrix R⁻¹ used in the Capon projection (Equation 3) and in the spectrum (Equation 4)?” In this paper, we offer an answer to this question, that is, the Capon method uses the measured data as a proxy of the noise property in the optimization procedure. The method of the likelihood function is introduced to give an alternative and more intuitive derivation of Capon's method. Moreover, by revisiting Capon's minimum variance projection through the likelihood function method, it becomes clear that one may project the measurement data not only onto a single shape vector but also onto a multitude of shape vectors, which will enhance the capability of the multi-spacecraft data analysis.

It is worthwhile to note that the maximum likelihood and the minimum variance aspects of the Capon estimator are already covered in detail in the original paper by Capon et al. [18] for a seismic array problem. In this paper, the essence of the Capon estimator is reviewed (in section 2) and the estimator is extended to the problem of mode decomposition. As an application, a new method is constructed (in section 3) for two-spacecraft measurements in the magnetosphere to identify the magnetic field of the current sheet origin and the dipolar field from the planet, which is relevant to the BepiColombo mission.

2. A View From Likelihood Function

2.1. Scalar Field

Minimum variance projection can be formulated using the likelihood function $L$ as follows (see also [19]). Consider a model for the observational data as

\begin{array}{l} d_{i} = h_{i} s + η_{i}, & (8) \end{array}

where d_i is the data element at the i-th sensor (i = {1, 2, ⋯ , n}, so n is the number of sensors), h_i the shape vector given a priori as a model, s the signal (in which we are interested), and η_i the noise at the i-th sensor. The noise is characterized by the n × n noise matrix N

\begin{array}{l} N_{i j} = 〈 η_{i} η_{j} 〉 . & (9) \end{array}

Again, the angular bracket 〈⋯ 〉 denotes the ensemble averaging over different realizations. The averaging is important because otherwise the determinant of the matrix vanishes and the matrix cannot be inverted.
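The need for averaging can be illustrated numerically: a covariance matrix built from a single realization is an outer product of rank one and therefore singular, whereas the average over many realizations has full rank. The sketch below uses hypothetical Gaussian noise samples at four sensors; the sample counts and variable names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4                                          # number of sensors (assumed)

# Single realization: eta eta^T is a rank-one matrix with vanishing determinant.
eta_single = rng.normal(size=n)
N_single = np.outer(eta_single, eta_single)

# Ensemble average over many realizations: full-rank, invertible matrix.
eta_samples = rng.normal(size=(200, n))
N_avg = eta_samples.T @ eta_samples / len(eta_samples)

rank_single = np.linalg.matrix_rank(N_single)
rank_avg = np.linalg.matrix_rank(N_avg)
```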

We minimize the squared deviation between the model and the data, i.e., we minimize χ² rather than maximize the likelihood function directly, because χ² has a steeper gradient toward the minimum (or the extremum) than the Gaussian curve (Figure 2). The deviation χ² is constructed as

\begin{array}{l} χ^{2} = 〈 \sum_{j = 1}^{n} \sum_{i = 1}^{n} (d_{i} - h_{i} s) N_{i j}^{- 1} (d_{j} - h_{j} s) 〉 & (10) \end{array}

with respect to the signal s. We now assume a Gaussian shape for the likelihood function,

\begin{array}{l} L \propto exp [- \frac{χ^{2}}{2}] . & (11) \end{array}

The likelihood function $L$ represents the probability of finding the true signal value for s under the given data set d_i.

Figure 2. Likelihood function on linear and logarithmic scales. The likelihood function is approximated by a Gaussian curve around the maximum (or the minimum of $χ^{2} = - 2 ln L$ , up to an additive constant) in the parameter space k.

Our goal is to estimate the signal amplitude (denoted as $\tilde{s}$ ) for the shape vector for the given data set by finding a maximum of the likelihood function or equivalently by minimizing the deviation χ².

We differentiate χ² with respect to the signal s,

\begin{array}{l} \frac{\partial χ^{2}}{\partial s} = - 2 〈 \sum_{i = 1}^{n} \sum_{j = 1}^{n} h_{i} N_{i j}^{- 1} (d_{j} - h_{j} s) 〉 . & (12) \end{array}

Requiring that the derivative be zero (which corresponds to the extremum), ∂χ²/∂s = 0, we obtain

\begin{array}{l} 〈 \sum_{i = 1}^{n} \sum_{j = 1}^{n} h_{i} N_{i j}^{- 1} (d_{j} - h_{j} s) 〉 = 0 . & (13) \end{array}

Equation (13) can be arranged into the following form,

\begin{array}{l} 〈 \sum_{i = 1}^{n} \sum_{j = 1}^{n} h_{i} N_{i j}^{- 1} h_{j} s 〉 = 〈 \sum_{i = 1}^{n} \sum_{j = 1}^{n} h_{i} N_{i j}^{- 1} d_{j} 〉 . & (14) \end{array}

Note that the part in the angular bracket on the left hand side of Equation (14) is already statistically evaluated, so one may leave the angular bracket out in the following discussion. For a scalar quantity of s, one may obtain the estimator $\tilde{s}$ as

\begin{array}{l} \tilde{s} = {[\sum_{i = 1}^{n} \sum_{j = 1}^{n} h_{i} N_{i j}^{- 1} h_{j}]}^{- 1} 〈 \sum_{i = 1}^{n} \sum_{j = 1}^{n} h_{i} N_{i j}^{- 1} d_{j} 〉 & (15) \end{array}

\begin{array}{l} = & C_{N} 〈 \sum_{i = 1}^{n} \sum_{j = 1}^{n} h_{i} N_{i j}^{- 1} d_{j} 〉, & (16) \end{array}

where C_N is the noise covariance,

\begin{array}{l} C_{N} = {[\sum_{i = 1}^{n} \sum_{j = 1}^{n} h_{i} N_{i j}^{- 1} h_{j}]}^{- 1} . & (17) \end{array}

Or, in a matrix notation,

\begin{array}{l} \tilde{s} = C_{N} 〈 {\vec{h}}^{†} N^{- 1} \vec{d} 〉 & (18) \end{array}

and

\begin{array}{l} C_{N} = {({\vec{h}}^{†} N^{- 1} \vec{h})}^{- 1} . & (19) \end{array}
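Equations (18) and (19) can be tried out on synthetic data. The following is a minimal sketch for real-valued data, assuming a hypothetical three-sensor setup with a known diagonal noise matrix; the shape vector, noise levels, sample count, and the function name `ml_signal_estimate` are illustrative assumptions.

```python
import numpy as np

def ml_signal_estimate(d_mean, h, N):
    """Equations (18)-(19) for real data: s = C_N h^T N^{-1} <d>."""
    Ninv_h = np.linalg.solve(N, h)     # N^{-1} h (N is symmetric)
    C_N = 1.0 / (h @ Ninv_h)           # Equation (19)
    return C_N * (Ninv_h @ d_mean), C_N

# Hypothetical synthetic data: d_i = h_i s + eta_i, Equation (8).
rng = np.random.default_rng(2)
h = np.array([1.0, 0.5, 0.25])         # assumed shape vector
sigmas = np.array([0.1, 0.2, 0.4])     # assumed noise levels per sensor
N = np.diag(sigmas**2)                 # known noise matrix, Equation (9)
s_true = 3.0
d = s_true * h + sigmas * rng.normal(size=(2000, 3))

s_est, C_N = ml_signal_estimate(d.mean(axis=0), h, N)
```

Sensors with lower noise receive a larger weight through N⁻¹, which is why the estimate is dominated by the first sensor in this example.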

The signal covariance (or the optimized power estimate) is (by noting that the ensemble averaging is taken after the covariance calculation)

\begin{array}{l} C_{R} = 〈 \tilde{s} \tilde{s} 〉 & (20) \end{array}

\begin{array}{l} = & 〈 C_{N} {\vec{h}}^{†} N^{- 1} \vec{d} {\vec{d}}^{†} {(N^{- 1})}^{†} \vec{h} C_{N} 〉 & (21) \end{array}

\begin{array}{l} = & C_{N} {\vec{h}}^{†} N^{- 1} 〈 \vec{d} {\vec{d}}^{†} 〉 {(N^{- 1})}^{†} \vec{h} C_{N} . & (22) \end{array}

In general, the data covariance matrix $R = 〈 \vec{d} \vec{d} 〉$ and the noise covariance matrix N must be evaluated separately, or the noise covariance matrix needs to be known a priori to determine the signal covariance C_S. Capon's minimum variance projection is obtained by imposing (or using) the data covariance matrix as the noise covariance matrix, N → R. An explicit calculation yields

\begin{array}{l} C_{R} = C_{N} {\vec{h}}^{†} R^{- 1} \vec{h} C_{N} & (23) \end{array}

\begin{array}{l} = & {({\vec{h}}^{†} R^{- 1} \vec{h})}^{- 1} {\vec{h}}^{†} R^{- 1} \vec{h} {({\vec{h}}^{†} R^{- 1} \vec{h})}^{- 1} & (24) \end{array}

\begin{array}{l} = & {({\vec{h}}^{†} R^{- 1} \vec{h})}^{- 1} . & (25) \end{array}
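The reduction from Equation (22) to Equation (25) under the replacement N → R can be verified numerically. The sketch below is a consistency check on hypothetical real-valued random data; the array size, the sample count, and the function name `projected_covariance` are assumptions for illustration.

```python
import numpy as np

def projected_covariance(h, N, R):
    """Equation (22) for real data: C_N h^T N^{-1} R N^{-1} h C_N."""
    Ninv_h = np.linalg.solve(N, h)
    C_N = 1.0 / (h @ Ninv_h)
    return C_N * (Ninv_h @ R @ Ninv_h) * C_N

# Hypothetical data covariance from 400 samples at 5 sensors.
rng = np.random.default_rng(3)
n = 5
h = rng.normal(size=n)
d = rng.normal(size=(400, n)) + 2.0 * rng.normal(size=(400, 1)) * h
R = d.T @ d / len(d)

general = projected_covariance(h, R, R)        # Equation (22) with N -> R
capon = 1.0 / (h @ np.linalg.solve(R, h))      # Equation (25)
```

Both numbers agree to rounding error, which reproduces the Capon power of Equation (4).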

2.2. Vector Field

The method with the likelihood function can be extended to the vector field treatment in a straightforward fashion. We construct the data model as follows.

\begin{array}{l} \vec{d} = H \vec{s} + η . & (26) \end{array}

The data are arranged into a long vector, $\vec{d} = [d_{x 1}, d_{y 1}, d_{z 1}, d_{x 2}, d_{y 2}, d_{z 2}, \dots d_{z n}]$ . The signal amplitude vector is $\vec{s} = [s_{x}, s_{y}, s_{z}]$ . H is the shape-and-pointing matrix,

\begin{array}{l} H = [\begin{matrix} h_{x 1} & 0 & 0 \\ 0 & h_{y 1} & 0 \\ 0 & 0 & h_{z 1} \\ h_{x 2} & 0 & 0 \\ 0 & h_{y 2} & 0 \\ 0 & 0 & h_{z 2} \\ ⋮ & ⋮ & ⋮ \\ 0 & 0 & h_{z n} \end{matrix}] . & (27) \end{array}

η is the noise with the same construction as that of $\vec{d}$ (with 3 × n elements). The essential difference from the scalar-field treatment is that the covariance becomes a matrix, e.g., $C_{S} = 〈 \vec{s} \vec{s} 〉$ for the signal covariance matrix. Repeating the same procedure (by taking care of the vector and matrix operations), the signal estimator for the k-th component (k = {x, y, z}) is obtained as

\begin{array}{l} {\tilde{s}}_{k} = \sum_{ℓ}^{x, y, z} \sum_{i = 1}^{n} \sum_{j = 1}^{n} {(C_{N})}_{ℓ k} H_{i ℓ} N_{i j}^{- 1} d_{j}, & (28) \end{array}

where the summation on ℓ runs over the x, y, and z components, and that on i and j over the number of sensors. The signal estimator is given in a matrix notation as

\begin{array}{l} \tilde{\vec{s}} = C_{N} 〈 H^{t} N^{- 1} \vec{d} 〉 . & (29) \end{array}

The noise covariance matrix C_N is given by

\begin{array}{l} {(C_{N}^{- 1})}_{ℓ k} = \sum_{i = 1}^{n} \sum_{j = 1}^{n} H_{i ℓ} N_{i j}^{- 1} H_{j k} & (30) \end{array}

or, in a matrix notation,

\begin{array}{l} C_{N} = {[H^{t} N^{- 1} H]}^{- 1} . & (31) \end{array}

The signal covariance matrix, when using the data covariance matrix again as the noise covariance matrix, is

\begin{array}{l} C_{R}^{(vec)} = C_{N} H^{t} N^{- 1} R {(N^{- 1})}^{t} H C_{N} & (32) \end{array}

\begin{array}{l} \to & {[H^{t} R^{- 1} H]}^{- 1} . & (33) \end{array}
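The vector-field estimator can be sketched in the same way as the scalar one. The example below builds the shape-and-pointing matrix of Equation (27) and evaluates Equations (29) and (31) for real-valued data; the uniform-field shape values, the noise level, and the function names are hypothetical choices for illustration.

```python
import numpy as np

def shape_and_pointing_matrix(h_xyz):
    """Equation (27): stack diag(h_xi, h_yi, h_zi) for sensors i = 1..n -> (3n, 3)."""
    return np.vstack([np.diag(row) for row in h_xyz])

def vector_signal_estimate(d_mean, H, N):
    """Equations (29) and (31) for real data with a symmetric noise matrix N."""
    Ninv_H = np.linalg.solve(N, H)             # N^{-1} H, shape (3n, 3)
    C_N = np.linalg.inv(H.T @ Ninv_H)          # Equation (31)
    return C_N @ (Ninv_H.T @ d_mean), C_N      # Equation (29)

# Hypothetical setup: uniform vector field seen by 4 sensors.
rng = np.random.default_rng(5)
n = 4
h_xyz = np.ones((n, 3))                        # assumed shape values per component
H = shape_and_pointing_matrix(h_xyz)
s_true = np.array([1.0, -2.0, 0.5])
sigma = 0.3
d = H @ s_true + sigma * rng.normal(size=(1000, 3 * n))
N = sigma**2 * np.eye(3 * n)

s_est, C_N = vector_signal_estimate(d.mean(axis=0), H, N)
```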

2.3. Mode Decomposition

The minimum variance projection can be extended to multiple shape vectors. By doing so, it is possible to decompose the measurement data set into a spectrum of m different modes or shapes. We construct a model for the measurement data (scalar field) with a multitude of modes and a noise.

\begin{array}{l} d_{i} = \sum_{α = 1}^{m} h_{i}^{α} s^{α} + η_{i}, & (34) \end{array}

where d_i is the measured field at the i-th sensor, m the number of the modes introduced into the model, ${\vec{h}}^{α} = \{ {\vec{h}}^{(1)}, {\vec{h}}^{(2)}, \dots, {\vec{h}}^{(m)} \}$ a set of shape vectors for modes α = {1, 2, ⋯ , m}, s^α the signal amplitude of each mode, and η_i the noise at the i-th sensor.

Derivation of the minimum variance estimator is essentially the same as that for the single mode χ² minimization. The estimator for the signal amplitude at the α-th mode is obtained as

\begin{array}{l} {\tilde{s}}^{α} = 〈 \sum_{β}^{m} \sum_{i = 1}^{n} \sum_{j = 1}^{n} {(C_{N})}^{α β} h_{i}^{β} N_{i j}^{- 1} d_{j} 〉 . & (35) \end{array}

The noise covariance matrix is a projection of the inverse noise matrix N⁻¹ onto different modes (h^α and h^β),

\begin{array}{l} {(C_{N}^{- 1})}^{α β} = \sum_{i = 1}^{n} \sum_{j = 1}^{n} h_{i}^{α} N_{i j}^{- 1} h_{j}^{β} . & (36) \end{array}

The signal covariance matrix for the Capon-type projection is obtained as

\begin{array}{l} {(C)}^{α β} = C_{N} {\vec{h}}^{α} N^{- 1} R N^{- 1} {\vec{h}}^{β} C_{N} & (37) \end{array}

\begin{array}{l} \to & {[\vec{h^{α}} R^{- 1} {\vec{h}}^{β}]}^{- 1} . & (38) \end{array}

The diagonal elements in the signal covariance matrix are the power (squared signal amplitude) of each mode.
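A small numerical sketch of the mode decomposition (Equations 35, 36, and 38) is given below for real-valued data. It assumes a hypothetical line of four sensors and two modes, a constant field and a linear gradient; the positions, amplitudes, noise level, and the function name `mode_amplitudes` are illustrative assumptions.

```python
import numpy as np

def mode_amplitudes(d_mean, h, M):
    """Equations (35)-(36) with weighting matrix M (M = N or M = R).

    h has shape (n, m): one column per mode. Returns amplitudes and C.
    """
    Minv_h = np.linalg.solve(M, h)             # M^{-1} h
    C = np.linalg.inv(h.T @ Minv_h)            # [h^T M^{-1} h]^{-1}
    return C @ (Minv_h.T @ d_mean), C

# Hypothetical setup: 4 sensors, modes = constant field and linear gradient.
rng = np.random.default_rng(7)
x = np.array([0.0, 1.0, 2.0, 3.0])             # assumed sensor positions
h = np.column_stack([np.ones_like(x), x])
s_true = np.array([5.0, -1.5])
sigma = 0.2
n_s = 5000                                     # averaging size
d = h @ s_true + sigma * rng.normal(size=(n_s, x.size))
d_mean = d.mean(axis=0)

N = sigma**2 * np.eye(x.size)                  # known noise matrix
R = d.T @ d / n_s                              # data covariance <d d^T>

s_noise, C_N = mode_amplitudes(d_mean, h, N)   # noise-variance projection
s_data, C_R = mode_amplitudes(d_mean, h, R)    # data-variance (Capon) projection
mode_power = np.diag(C_R)                      # power of each mode
```

With more sensors than modes the two weightings are not identical, but in this example both recover the amplitudes, and the diagonal of C_R is close to the squared amplitudes.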

2.4. Error Estimate

A useful form for the one-sigma error (68 % confidence) of the minimum variance estimator can be evaluated from the likelihood function. Strictly speaking, the error estimate requires knowledge of the noise property, i.e., the inverse of the noise matrix N⁻¹ must be known. Still, we obtain an insight by considering a special case, that is, we model the noise matrix N as diagonal with each diagonal element equal to C_S+C_N (which is a scalar). The likelihood function for a simplified error estimate is again modeled as Gaussian (see, e.g., Equation 11.21 in Dodelson [19]),

\begin{array}{l} L \propto \frac{1}{{(C_{S} + C_{N})}^{n / 2}} exp [- \frac{1}{2} \frac{〈 \sum_{i = 1}^{n} {(d_{i} - h_{i} s)}^{2} 〉}{C_{S} + C_{N}}] & (39) \end{array}

The one-sigma error is obtained from the second-order derivative of the logarithm of the likelihood function (which is essentially χ² in the Gaussian likelihood model):

\begin{array}{l} σ_{S} = {[- \frac{\partial^{2}}{\partial C_{S}^{2}} (ln L)]}^{- 1 / 2} & (40) \end{array}

\begin{array}{l} = & \sqrt{\frac{2}{n}} (C_{S} + C_{N}) & (41) \end{array}

\begin{array}{l} \to & \sqrt{\frac{8 C_{S}^{2}}{n}} . & (42) \end{array}

Here Equation (40) is evaluated at the peak of the likelihood function to obtain (Equation 41), and the signal covariance C_S is used as a proxy of the noise covariance C_N for Capon's minimum variance projection in deriving (Equation 42). Thus, the simplified error of the Capon-estimated signal power is $σ^{2} = 8 C_{S}^{2} / n$ . See Appendix for derivation of Equation (40).
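The simplified error estimate is easy to evaluate. The sketch below implements Equations (41) and (42) directly; the function names and the example numbers (C_S = 4 in arbitrary units, n = 8) are illustrative assumptions.

```python
import math

def one_sigma_error(C_S, C_N, n):
    """Equation (41): sigma_S = sqrt(2/n) (C_S + C_N)."""
    return math.sqrt(2.0 / n) * (C_S + C_N)

def capon_one_sigma_error(C_S, n):
    """Equation (42): C_N is replaced by its proxy C_S."""
    return one_sigma_error(C_S, C_S, n)

# Hypothetical numbers: signal covariance 4 (arbitrary units), n = 8.
C_S = 4.0
n = 8
sigma_S = capon_one_sigma_error(C_S, n)
sigma_S_closed_form = math.sqrt(8.0 * C_S**2 / n)
```

For these numbers both expressions give σ_S = 4, i.e., the error is as large as the estimated power itself.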

3. Application

Mode decomposition using the minimum variance projection serves as a powerful analysis tool when the measurements are limited to only a few spatial points. A test using a synthetic data set is presented with two-spacecraft measurements in the Mercury magnetosphere in view of the BepiColombo mission [20].

3.1. Setup

Magnetic fields are modeled as a superposition of two different fields (or modes), B^(a) and B^(b), and noise η in the magnetosphere at two sensor locations, r₁ = 480 km (planetary orbiter) and r₂ = 590 km (magnetospheric orbiter) above the surface (at a radius of R_s = 2,440 km from the planetary center). The measurement data are thus

\begin{array}{l} B (r_{1}) = B^{(a)} (r_{1}) + B^{(b)} (r_{1}) + η_{1} & (43) \end{array}

\begin{array}{l} B (r_{2}) = B^{(a)} (r_{2}) + B^{(b)} (r_{2}) + η_{2}, & (44) \end{array}

where the first mode is the magnetic field from the current sheet

\begin{array}{l} B^{(a)} = s^{(a)} = const . & (45) \end{array}

and the second mode is the dipolar field from the planet

\begin{array}{l} B^{(b)} = s^{(b)} {(\frac{r}{R_{s}})}^{- 3} . & (46) \end{array}

Measurements are assumed to be on the magnetic equatorial plane such that the magnetic fields have only one non-vanishing component (say, the z component). The geometrical configuration is illustrated in Figure 3. The true signal amplitudes are s^(a) = 20 nT and s^(b) = 200 nT, respectively. Noise is assumed to be Gaussian distributed with a standard deviation of σ_n = 1 nT. The goal of the numerical test using the synthetic data is to estimate the signal amplitudes for the two modes, ${\tilde{s}}^{(a)}$ and ${\tilde{s}}^{(b)}$ , using the noise-variance minimization and the data-variance minimization.

Figure 3. Superposition of a constant field from the current sheet (mode 1) and a dipolar field from the planet (mode 2) in a region between the distance to the current sheet (from the planetary center) R_c and that to the planetary surface R_s.

3.2. Analysis – Preparation

The shape vector for the first mode (a constant field) is

\begin{array}{l} {\vec{h}}_{1} = [\begin{matrix} 1 \\ 1 \end{matrix}] & (47) \end{array}

and that for the second mode (decaying field) is

\begin{array}{l} {\vec{h}}_{2} = [\begin{matrix} {(\frac{r_{1}}{R_{s}})}^{- 3} \\ {(\frac{r_{2}}{R_{s}})}^{- 3} \end{matrix}] . & (48) \end{array}

The shape matrix is constructed as $[{\vec{h}}_{1} {\vec{h}}_{2}]$ ,

\begin{array}{l} h = [\begin{matrix} 1 & {(\frac{r_{1}}{R_{s}})}^{- 3} \\ 1 & {(\frac{r_{2}}{R_{s}})}^{- 3} \end{matrix}] . & (49) \end{array}

The data matrices are averaged over different realizations or samples,

\begin{array}{l} R = 〈 \vec{d} \vec{d} 〉, & (50) \end{array}

using an averaging size of N_s. The measurement data vector is averaged when using the minimum variance estimator as $〈 \vec{d} 〉$ . We study the minimum variance estimators for different sampling sizes N_s. The measurement noise matrix follows uncorrelated Gaussian statistics,

\begin{array}{l} N = 〈 \vec{η} \vec{η} 〉 = [\begin{matrix} σ_{n}^{2} & 0 \\ 0 & σ_{n}^{2} \end{matrix}] . & (51) \end{array}

3.3. Analysis – Projection

The noise minimum variance estimator for each of the modes α = {a, b} is

\begin{array}{l} {({\tilde{\vec{s}}}^{(nv)})}_{α} = {(C_{N})}_{α β} {(h)}_{β i} {(N^{- 1})}_{i j} 〈 {\vec{d}}_{j} 〉 . & (52) \end{array}

The data minimum variance estimator for each of the modes α = {a, b} is obtained by replacing the measurement noise covariance N by the measurement data covariance $R = 〈 \vec{d} \vec{d} 〉$ and also replacing the noise projection C_N by the data projection C_R as

\begin{array}{l} {({\tilde{\vec{s}}}^{(dv)})}_{α} = {(C_{R})}_{α β} {(h)}_{β i} {(R^{- 1})}_{i j} 〈 {\vec{d}}_{j} 〉 . & (53) \end{array}

The noise covariance matrix and the data covariance matrix are essentially the Capon projection, namely,

\begin{array}{l} C_{N} = {[h N^{- 1} h]}^{- 1} & (54) \end{array}

\begin{array}{l} C_{R} = {[h R^{- 1} h]}^{- 1} . & (55) \end{array}

The noise minimum variance estimator needs knowledge of the noise property and does not require the measurement data themselves to set up the projection. Therefore, the noise minimum variance projection can be prepared without the presence of the data and may be useful when planning a measurement or an experiment. For uncorrelated Gaussian noise statistics, the noise minimum variance estimator computes the mode amplitude as a linear combination of the data. The data minimum variance estimator, in contrast, requires the data but not knowledge of the noise property. The mode amplitude is computed in a non-linear fashion, i.e., the measurement weight for each sensor is influenced by the data.
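The two estimators of Equations (52) and (53) can be sketched for the synthetic Mercury setup as follows. The example assumes that r in Equation (46) is the distance from the planetary center, i.e., R_s plus the altitude above the surface, and it uses a hypothetical averaging size of N_s = 20,000; the random seed and the function name `estimate` are likewise illustrative and do not reproduce the numerical experiment of this paper.

```python
import numpy as np

R_S = 2440.0                                   # planetary radius in km
ALTITUDES = np.array([480.0, 590.0])           # km above the surface
# Assumption: r in Equation (46) is measured from the planetary centre.
r = R_S + ALTITUDES

# Shape matrix of Equation (49): columns = constant mode, dipolar mode.
h = np.column_stack([np.ones(2), (r / R_S) ** -3])
s_true = np.array([20.0, 200.0])               # true amplitudes in nT
sigma_n = 1.0                                  # noise standard deviation in nT

def estimate(h, M, d_mean):
    """Equations (52)-(55): C h^T M^{-1} <d>, with C = [h^T M^{-1} h]^{-1}."""
    Minv_h = np.linalg.solve(M, h)
    C = np.linalg.inv(h.T @ Minv_h)
    return C @ (Minv_h.T @ d_mean)

rng = np.random.default_rng(42)
n_s = 20000                                    # assumed averaging size N_s
d = h @ s_true + sigma_n * rng.normal(size=(n_s, 2))
d_mean = d.mean(axis=0)

N = sigma_n**2 * np.eye(2)                     # Equation (51)
R = d.T @ d / n_s                              # Equation (50)

s_nv = estimate(h, N, d_mean)                  # noise-variance estimator
s_dv = estimate(h, R, d_mean)                  # data-variance estimator
```

In this sketch the shape matrix is square (two sensors, two modes), so both weightings amount to solving h s = ⟨d⟩ for the averaged data.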

3.4. Results

Signal amplitudes for the two modes are obtained using the noise variance projection (Equation 52) and the data variance projection (Equation 53), and are graphically displayed as a function of the averaging size (or the number of realizations or samples) in Figures 4, 5 together with the one-sigma errors. Both estimators can find the true signal amplitudes (20 nT for mode 1 and 200 nT for mode 2) within the error bar, and the analysis using a larger statistical sampling size is beneficial in reducing the error. Yet the accuracy is considerably better in the data-variance projection. The error bar is about 10 nT for an averaging size of 10–100 and becomes only a few nT or better (cf. the noise amplitude of 1 nT) for an even larger averaging size in the data-variance projection. The estimated amplitudes also converge sufficiently to the true values for averaging sizes above 10. In contrast, the noise-variance projection still exhibits random deviations of the signal amplitudes from the true values for a larger averaging size (for example, a size of 1,000).

Figure 4. Amplitudes for the mode 1 (constant field from the current sheet) and the mode 2 (dipolar field at the planetary surface) estimated by the noise-variance projection as a function of the averaging size.

Figure 5. Amplitudes for the mode 1 and the mode 2 (the same format as Figure 4) estimated by the data-variance projection.

4. Outlook

Capon's projection is a useful tool when the noise property is unknown, and has a higher flexibility for various applications and extensions compared to the beamforming or the MSR methods. The extension of the minimum variance projection onto a multitude of shape vectors in the data opens the door to a decomposition method for the multi-point data. An application is presented for a decomposition of the multi-point data into a constant magnetic field and a dipolar field in view of the Mercury magnetosphere. It is comforting that even two-point measurements are capable of identifying the signal amplitudes when a sufficient amount of data is obtained for the proper averaging operation.

Another application is a decomposition into a set of orthogonal basis functions. The Capon estimator and its extension to the mode decomposition can be applied to various wave fields in the solar wind, the foreshock, the magnetosheath, and the magnetotail regions as well as various static fields as far as the spatial structure can be properly modeled (such as the dayside magnetosphere as presented in this paper). For example, the source locator uses the lowest-order spherical Bessel function j₀(x) = sin(x)/x and the lowest-order Neumann function n₀(x) = −j₋₁(x) = −cos(x)/x. The expansion into a series of spherical Bessel functions or cylindrical functions (presumably with a cutoff) is a possible application to the multi-point measurements, e.g., identification or reconstruction of the spherical propagation or the vortical shape or motion.

Capon's minimum variance projection is not limited to the search for wave propagations or mode decomposition into different sources of the magnetic fields, but the method can be applied to solitary, spatially-localized structures. A useful example may be the KdV (Korteweg-de Vries) soliton (or ion-acoustic solitons in the case of plasmas) characterized by the following shape vector, $h_{i} (A, D, c, x_{0}) = A \, \mathrm{sech}^{2} [D (x_{i} - c t + x_{0})]$ , where c is the phase speed of the propagation, A the amplitude, and D the width of the soliton structure around the peak. The amplitude and the width are determined by the phase speed. For ion-acoustic solitons, the amplitude is given as A = 3c and the width $D = \sqrt{c / 2}$ [21].
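The soliton shape vector can be sketched as a short function. The example below uses the relations A = 3c and D = √(c/2) quoted above for ion-acoustic solitons; the sensor grid, the time, the phase speed, and the function name `kdv_soliton_shape` are illustrative assumptions.

```python
import numpy as np

def kdv_soliton_shape(x, t, c, x0=0.0):
    """Shape vector h_i = A sech^2[D (x_i - c t + x0)] with A = 3c, D = sqrt(c/2)."""
    A = 3.0 * c
    D = np.sqrt(c / 2.0)
    return A / np.cosh(D * (x - c * t + x0)) ** 2

# Hypothetical sensor positions along x and a snapshot at t = 1.
x = np.linspace(-10.0, 10.0, 201)
c = 2.0
h_soliton = kdv_soliton_shape(x, t=1.0, c=c)
x_peak = x[np.argmax(h_soliton)]               # peak sits at x = c t - x0
```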

Author Contributions

The author confirms being the sole contributor of this work and has approved it for publication.

Conflict of Interest Statement

The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Acknowledgments

Discussions with Tohru Hada, Uwe Motschmann, and Daniel Heyner, which improved the quality of the manuscript, are gratefully acknowledged.

References

1. Capon J. High resolution frequency-wavenumber spectrum analysis. Proc IEEE (1969) 57:1408–18. doi: 10.1109/PROC.1969.7278

2. Angelopoulos V. The THEMIS mission. Space Sci Rev. (2008) 141:5–34. doi: 10.1007/s11214-008-9336-1

3. Friis-Christensen E, Lühr H, Hulot G. Swarm: a constellation to study the Earth's magnetic field. Earth Planets Space (2006) 58:351–8. doi: 10.1186/BF03351933

4. Friis-Christensen E, Lühr H, Knudsen D, Haagmans R. Swarm–an earth observation mission Investigating geospace. Adv Space Res. (2008) 41:210–6. doi: 10.1016/j.asr.2006.10.008

5. Escoubet CP, Fehringer M, Goldstein M. Introduction to the Cluster mission. Ann Geophys. (2001) 19:1197–200. doi: 10.5194/angeo-19-1197-2001

6. Burch JL, Moore TE, Torbert RB, Giles BL. Magnetospheric multiscale overview and science objectives. Space Sci Rev. (2016) 199:5–21. doi: 10.1007/s11214-015-0164-9

7. Haykin S. Adaptive Filter Theory, 2nd. ed., Prentice Hall Information and System Science Series. Upper Saddle River, NJ: Prentice-Hall Inc. (1991).

8. Khmou Y, Safi S, Frikel M. Comparative study between several direction of arrival estimation methods. J Telecomm Inform Tech. (2014) 2014:41–8.

9. Narita Y, Glassmeier KH, Motschmann U. Wave vector analysis methods using multi-point measurements. Nonlin Processes Geophys. (2010) 17:383–94. doi: 10.5194/npg-17-383-2010

10. Motschmann U, Woodward TI, Glassmeier KH, Dunlop MW. Array signal processing techniques. In: Glassmeier KH, Motschmann U, Schmidt R. editors. Proceedings of the Cluster Workshop on Data Analysis Tools, Braunschweig (1995). ESA SP-371, p. 79–86. p. 28–3.

11. Narita Y, Glassmeier KH, Motschmann U. High-resolution wave number spectrum using multi-point measurements in space–the multi-point signal resonator (MSR) technique. Ann Geophys. (2011) 29:351–60. doi: 10.5194/angeo-29-351-2011

12. Motschmann U, Woodward TI, Glassmeier KH, Southwood DJ, Pinçon J. Wavelength and direction filtering by magnetic measurements at satellite arrays: generalized minimum variance analysis. J Geophys Res. (1996) 101:4961–6. doi: 10.1029/95JA03471

13. Glassmeier KH, Motschmann U, Dunlop M, Balogh A, Acuña MH, Carr C, et al. Cluster as a wave telescope–first results from the fluxgate magnetometer. Ann Geophys. (2001) 19:1439–47. doi: 10.5194/angeo-19-1439-2001

14. Pinçon JL, Lefeuvre F. Local characterization of homogeneous turbulence in a space plasma from simultaneous measurements of field components at several points in space. J Geophys Res. (1991) 96:1789–802. doi: 10.1029/90JA02183

15. Constantinescu OD, Glassmeier KH, Motschmann U, Treumann RA, Fornaçon KH, Fränz M. Plasma wave source location using CLUSTER as a spherical wave telescope. J Geophys Res Space Phys. (2006) 111:A09221. doi: 10.1029/2005JA011550

16. Constantinescu OD, Glassmeier KH, Décréau PME, Fränz M, Fornaçon KH. Low frequency wave sources in the outer magnetosphere, magnetosheath, and near Earth solar wind. Ann Geophys. (2007) 25:2217–28. doi: 10.5194/angeo-25-2217-2007

17. Plaschke F, Glassmeier KH, Constantinescu OD, Mann IR, Milling DK, Motschmann U, et al. Statistical analysis of ground based magnetic field measurements with the field line resonance detector. Ann Geophys. (2008) 26:3477–89. doi: 10.5194/angeo-26-3477-2008

18. Capon J, Greenfield RJ, Kolker RJ. Multidimensional maximum-likelihood processing of a large aperture seismic array. Proc IEEE (1967) 55:192–211. doi: 10.1109/PROC.1967.5439

19. Dodelson S. Modern Cosmology. London: Academic Press (2003).

20. Benkhoff J, van Casteren J, Hayakawa H, Fujimoto M, Laakso H, Novara M, et al. BepiColombo–comprehensive exploration of mercury: mission overview and Science goals. Planet Space Sci. (2010) 58:2–20. doi: 10.1016/j.pss.2009.09.020

21. Ichikawa YH, Watanabe S. Solitons, envelope solitons in collisionless plasmas. J Phys Colloques (1977) 38:15–26. doi: 10.1051/jphyscol:1977603

Appendix: Derivative Calculation

Calculation for Equation (40) is as follows. The first-order derivative of the Gaussian error likelihood function (Equation 39) with respect to the variance C_S is obtained as

\begin{array}{l} \frac{\partial L}{\partial C_{S}} = - \frac{n / 2}{{(C_{S} + C_{N})}^{n / 2 + 1}} \exp [- \frac{1}{2} \frac{〈 \sum_{i = 1}^{n} {(d_{i} - h_{i} s)}^{2} 〉}{C_{S} + C_{N}}] + \\ \frac{1}{2} \frac{1}{{(C_{S} + C_{N})}^{n / 2}} \frac{〈 \sum_{i = 1}^{n} {(d_{i} - h_{i} s)}^{2} 〉}{{(C_{S} + C_{N})}^{2}} \times \\ \exp [- \frac{1}{2} \frac{〈 \sum_{i = 1}^{n} {(d_{i} - h_{i} s)}^{2} 〉}{C_{S} + C_{N}}] & (A 1) \end{array}

\begin{array}{l} = & L [- \frac{n}{2 (C_{S} + C_{N})} + \frac{〈 \sum_{i = 1}^{n} {(d_{i} - h_{i} s)}^{2} 〉}{2 {(C_{S} + C_{N})}^{2}}] . & (A 2) \end{array}

The second-order derivative of the logarithm of the likelihood function is

\begin{array}{l} \frac{\partial^{2} (ln L)}{\partial C_{S}^{2}} = \frac{\partial}{\partial C_{S}} [\frac{1}{L} \frac{\partial L}{\partial C_{S}}] & (A 3) \end{array}

\begin{array}{l} = & \frac{\partial}{\partial C_{S}} [\frac{- n / 2}{C_{S} + C_{N}} + \frac{1}{2} \frac{〈 \sum_{i = 1}^{n} {(d_{i} - h_{i} s)}^{2} 〉}{{(C_{S} + C_{N})}^{2}}] & (A 4) \end{array}

\begin{array}{l} = & \frac{n}{2} \frac{1}{{(C_{S} + C_{N})}^{2}} - \frac{n}{{(C_{S} + C_{N})}^{2}} & (A 5) \end{array}

\begin{array}{l} = & - \frac{n}{2} \frac{1}{{(C_{S} + C_{N})}^{2}} . & (A 6) \end{array}

Here the condition that the first-order derivative vanishes at the peak of the likelihood function, $\partial L / \partial C_{S} = 0$ , i.e.,

\begin{array}{l} 〈 \sum_{i = 1}^{n} {(d_{i} - h_{i} s)}^{2} 〉 = n (C_{S} + C_{N}) & (A 7) \end{array}

is used in deriving Equation (A5). The one-sigma error (Equation 40) is evaluated using Equation (A6).
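The curvature of the log-likelihood at its peak can be cross-checked numerically with a finite difference. The sketch below writes C for the total variance C_S + C_N and S for the averaged sum of squared residuals; the values n = 6 and S = 12, the step size, and the function name `log_likelihood` are illustrative assumptions.

```python
import math

def log_likelihood(C, n, S):
    """ln L from Equation (39) up to a constant, with C = C_S + C_N."""
    return -0.5 * n * math.log(C) - 0.5 * S / C

# Hypothetical numbers: n = 6 sensors, averaged squared residual sum S = 12.
n = 6
S = 12.0
C_peak = S / n                                 # peak condition, Equation (A7)

# Central finite difference for the second derivative at the peak.
eps = 1e-3
second = (log_likelihood(C_peak + eps, n, S)
          - 2.0 * log_likelihood(C_peak, n, S)
          + log_likelihood(C_peak - eps, n, S)) / eps**2

sigma_numeric = (-second) ** -0.5              # Equation (40)
sigma_formula = math.sqrt(2.0 / n) * C_peak    # Equation (41)
```

The finite-difference curvature is close to −n/(2C²) = −0.75 for these numbers, and the resulting one-sigma error agrees with Equation (41).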

Keywords: adaptive filter theory, Capon estimator, multi-spacecraft data analysis, waves and turbulence, mode decomposition

Citation: Narita Y (2019) A Note on Capon's Minimum Variance Projection for Multi-Spacecraft Data Analysis. Front. Phys. 7:8. doi: 10.3389/fphy.2019.00008

Received: 15 June 2018; Accepted: 11 January 2019;
Published: 01 February 2019.

Edited by:

Hermann Lühr, Helmholtz Center Potsdam German Geophysical Research Center (GFZ), Germany

Reviewed by:

Peter Haesung Yoon, University of Maryland, College Park, United States
Xochitl Blanco-Cano, National Autonomous University of Mexico, Mexico

Copyright © 2019 Narita. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Yasuhito Narita, yasuhito.narita@oeaw.ac.at
