Reconstruction of electron radiation belts using data assimilation and machine learning

Drozdov, Alexander Y.; Kondrashov, Dmitri; Strounine, Kirill; Shprits, Yuri Y.

doi:10.3389/fspas.2023.1072795

ORIGINAL RESEARCH article

Front. Astron. Space Sci., 19 May 2023

Sec. Space Physics

Volume 10 - 2023 | https://doi.org/10.3389/fspas.2023.1072795

This article is part of the Research Topic Radiation Belt Dynamics: Theory, Observation and Modeling

Alexander Y. Drozdov¹

Dmitri Kondrashov²*

Kirill Strounine³

Yuri Y. Shprits¹,³,⁴

¹Department of Earth, Planetary, and Space Sciences, University of California, Los Angeles, Los Angeles, CA, United States
²Department of Atmospheric and Oceanic Sciences, University of California, Los Angeles, Los Angeles, CA, United States
³Space Sciences Innovations Inc., Seattle, WA, United States
⁴GFZ German Centre for Geosciences, Potsdam, Germany

We present a reconstruction of radiation belt electron fluxes using data assimilation with low-Earth-orbiting Polar Orbiting Environmental Satellites (POES) measurements mapped to near-equatorial regions. Such mapping is challenging, and an appropriate methodology must be selected. To map POES measurements, we explore two machine learning methods: multivariate linear regression (MLR) and neural network (NN). The reconstructed flux is included in data assimilation with the Versatile Electron Radiation Belt (VERB) model and compared with Van Allen Probes and GOES observations. We demonstrate that data assimilation using MLR-based mapping provides reasonably good agreement with observations. Furthermore, data assimilation with the flux reconstructed by NN performs better than data assimilation using flux reconstructed by MLR. However, the improvement from adding data assimilation is limited when compared to the purely NN model, which by itself already performs well in predicting electron fluxes at high altitudes. When an optimized machine learning model is not available, our results suggest that data assimilation can be beneficial for reconstructing outer belt electrons by correcting errors of a machine-learning-based LEO-to-MEO mapping and by providing physics-based extrapolation to the portion of parameter space not included in the LEO-to-MEO mapping, such as GEO orbit in this study.

1 Introduction

The radiation belts consist of electrons and protons trapped by the Earth’s magnetic field (Lyons and Thorne, 1973) and are a major source of damaging space weather effects on near-Earth spacecraft. The inner electron belt is located typically between 1.2 and 2.0 Earth radii R_E, while the outer belt extends from about 3 to ∼8 R_E. Relativistic electron fluxes in the outer belt are highly variable; this variability is due to the competing effects of source and loss processes, both of which are forced by solar-wind-driven magnetospheric dynamics and by resonant interactions of plasma waves and particles (Thorne, 2010; Shprits et al., 2008a; 2008b).

Understanding the mechanisms of build-up and decay of radiation belt electron fluxes is one of the fundamental problems of modern space physics, with important applications to human technological systems. While significant progress has been achieved in understanding electron radiation belt dynamics using physics-based models, this understanding is still incomplete because only a limited number of satellites map the global radiation environment in space at any given time. Here, data assimilation techniques become very helpful, as they combine measurements that are irregularly distributed in space and time with a physics-based model to estimate the evolution of the system’s state in time; both the model and observations typically include errors. The Kalman filter (K-filter, hereafter) (Kalman, 1960) is a data assimilation technique of the so-called sequential filtering, or sequential estimation, type; it and its various generalizations have been successfully applied in various engineering fields, including autonomous or assisted navigation systems, as well as in atmospheric, oceanic, and climate studies (Ghil and Malanotte-Rizzoli, 1991; Kalnay, 2003). Data assimilation for radiation belts by K-filter techniques was pioneered at UCLA in collaboration with Richard Thorne and Michael Ghil (Kondrashov et al., 2007; Shprits et al., 2007; Daae et al., 2011; Kondrashov et al., 2011), starting with the Versatile Electron Radiation Belt (VERB) 1-D code, where only radial diffusion is included, similar to the study of Koller et al. (2007) at about the same time. For the VERB-3D code, where the state vector is very large, $O(10^{6})$ to $O(10^{7})$, and the computational requirements of the standard K-filter become very large, Shprits et al. (2013) developed a novel efficient approximation of a K-filter inspired by the operator splitting technique. This method still applies the standard formulation of a K-filter, but only for the 1-D diffusion operators of the VERB-3D model in L-shell, energy, and pitch angle, thus operating sequentially on matrices of much smaller size for each grid line. Utilizing the split-operator technique, the first operational data-assimilative radiation belt forecast model was developed at UCLA (e.g., Kellerman et al., 2014; Shprits et al., 2023). Additionally, data assimilation with the VERB model has been successfully used to study radiation belt source and loss mechanisms (Cervantes et al., 2020a; Cervantes et al., 2020b), although so far no reconstructed measurements based on LEO observations have been used for data assimilation. Recently, K-filter type approaches have been extended to a complex high-dimensional magnetosphere model, where it has been demonstrated that missing physics in global MHD models can be successfully compensated for by data assimilation, namely that pressure gradients in the inner magnetosphere can be generated via the imposition of an observed low-latitude current system (Merkin et al., 2016).

Before the launch of the Van Allen Probes (Mauk et al., 2013), which provided unprecedented measurements of the radiation belts, several works attempted to reconstruct the electron flux variation at geostationary orbit using a neural network (e.g., Koons and Gorney, 1991; Fukata et al., 2002; Ling et al., 2010; Kitamura et al., 2011), which was important for space weather applications and for understanding the physical processes driving radiation belt dynamics. The neural network approach to electron flux prediction showed decent agreement with observations and other models (Perry et al., 2010). More recently, machine learning methods, including neural networks, have become increasingly common in reconstructing and forecasting relativistic electrons in the radiation belts, using solar wind conditions, geomagnetic indices, and other inputs (e.g., Batusov et al., 2018; Pires de Lima et al., 2020; Sarma et al., 2020; Chu et al., 2021; Landis et al., 2022; Ma et al., 2022; Wing et al., 2022; Zhelavskaya et al., 2016; 2017; 2018; 2021).

Kanekal et al. (2001) found a remarkable global coherency in ultrarelativistic electron populations (>2 MeV) throughout the outer zone observed on satellites in distinct orbits, ranging from polar low-Earth to geosynchronous altitudes. More recently, Chen et al. (2016) established cross-energy, cross-pitch-angle coherence between the trapped MeV electrons observed by the Van Allen Probes and precipitating electrons with energies of hundreds of keV at LEO. These findings naturally motivated further studies and model development on forecasting and nowcasting of outer belt electrons using LEO measurements.

Chen et al. (2019) developed a linear filter model to predict distributions of electrons within Earth’s outer radiation belt using measurements from the Polar Operational Environmental Satellite (POES) and LANL GEO. This PreMevE model provided predictions spanning several hours as well as 1-day forecasts of the spin-averaged ∼MeV radiation belt electrons near the equator. The extended PreMevE 2.0 (Pires de Lima et al., 2020) and 2E (Sinha et al., 2021) models further evaluated multiple machine learning models that fall into four different classes of linear and neural network architectures and used electron intensities from POES and LANL GEO to map into 1 MeV and >2 MeV trapped spin-averaged electron fluxes, with a focus on extended prediction (up to 2 days), also taking solar wind parameters as input.

Claudepierre and O’Brien (2020) also developed the neural network SHELLS model, which nowcasts daily 350 keV and 1 MeV electron fluxes in the outer radiation belt using electron fluxes from the POES satellites as input; the model was built for spin-averaged flux. A new version of the SHELLS model was recently developed by Boyd et al. (2023), which incorporates the radial, angular, and energy dependence as well as finer temporal resolution, and can accurately nowcast the outer electron radiation belt dynamics, as shown with out-of-sample data from both the Van Allen Probes and GPS.

In this work, we use machine learning to enhance existing satellite observations for data assimilation purposes. Our main goal is to build a model that maps low-Earth-orbit satellite data to near-equatorial regions. Mapping the POES data to the equatorial region enables data assimilation (DA) of the electron radiation belts with the Versatile Electron Radiation Belt (VERB) code, in particular providing the state of the radiation belts over a wide range of equatorial pitch angles and energies. The fully reconstructed state of the radiation belts is particularly useful for space weather applications, as it allows virtual satellites to be flown with arbitrary orbital parameters. POES data are well suited to this task because of their long history and expected availability in the near future.

Our work extends earlier studies and differs from them in several important ways. First, we use POES data for mapping (nowcasting) the newly available Van Allen Probes ECT dataset (Boyd et al., 2021) over an extended range of available energies and equatorial pitch angles. This range is essential for DA and radiation belt reconstruction because it is needed to compute the phase-space density (PSD) in the adiabatic invariant space used in physics-based models, e.g., VERB. Second, we explore two machine learning methods for mapping: multivariate linear regression (MLR) and neural network (NN).

This is an initial study that aims to test whether the considered machine learning models (MLR and NN) can be used in data assimilation to reconstruct the radiation belts. We test the entire workflow, including first mapping to the equator and then assimilating the ML model results into the VERB model. In the particular case described in this study, we use only POES satellites (specifically, NOAA-15, NOAA-16, NOAA-18, and NOAA-19) for mapping from LEO to MEO (Van Allen Probes) in order to reconstruct the entire radiation belts, including along any satellite trajectory, such as that of GOES. Furthermore, data assimilation allows us to combine measurements from different sources and different satellites, which will be explored in future studies.

To summarize, the machine learning (ML) based mapping of LEO to MEO can be interpreted as creating high-quality Van Allen Probes-like satellite measurements, even after the end of the Van Allen Probes mission, which can be used to reconstruct the radiation belts via data assimilation. Such use of an ML-based “virtual satellite” is a powerful and novel concept that could potentially be applied in other Earth sciences. Our approach combines physics-based (via data assimilation with VERB) and ML approaches, known as gray box modeling (Camporeale, 2019), and takes into account errors (uncertainties) in both.

2 Data and methods

2.1 Data

In this study, we utilize measurements from the National Aeronautics and Space Administration (NASA) Van Allen Probes (Mauk et al., 2013) and from the National Oceanic and Atmospheric Administration (NOAA) Polar Orbiting Environmental Satellites (POES) (Evans and Greer, 2004).

The Van Allen Probes included two identical spacecraft (RBSP-A and RBSP-B) that orbited through the Earth’s radiation belts with a perigee and apogee of 1.1 and 5.8 R_E, respectively (medium Earth orbit, MEO), and a low inclination (∼10°). Each probe had an orbital period of 9 h and provided near-equatorial electron measurements. On board the satellites were multiple instruments that were part of the Energetic Particle, Composition and Thermal Plasma Suite (ECT) (Spence et al., 2013), providing measurements of electrons in a wide energy range (from 1 eV up to 20 MeV). In this study, we use a new ECT data product that incorporates the pitch-angle-resolved electron flux measurements in a consistent cross-calibrated data set (Boyd et al., 2021). Figure 1A illustrates 1 month of electron flux measurements at local pitch angle α_loc = 90°, with the corresponding equatorial pitch-angle coverage in Figure 1B.

FIGURE 1. Example of the data used in this study: (A) Van Allen Probes observations from the RBSP-A satellite (ECT data set), 1 MeV electron flux, and (B) corresponding equatorial pitch angle α_eq. (C) POES observations from the NOAA-15 satellite (SEM2Peck data set), 0.97 MeV electron flux, and (D) corresponding equatorial pitch angle α_eq. (E) Kp index.

The POES are multiple Sun-synchronous low-orbiting satellites (altitude of ∼800 km, low Earth orbit, LEO), which provide comprehensive coverage in L-shell and magnetic local time (MLT). The orbital period of each satellite is ∼100 min. The satellites provide measurements with two telescopes, oriented toward the zenith (0°) and perpendicular to it (90°). The two telescopes enable us to distinguish between particles in the loss cone and the trapped (or quasi-trapped) population. In this study, we use 4 satellites: NOAA-15, NOAA-16, NOAA-18, and NOAA-19. We use a contamination-corrected dataset of differential electron flux that is available from 1998 until 11 May 2014 (Peck et al., 2015). We limit the energy range in the selected dataset to ∼30 keV up to ∼1.9 MeV, providing 20 energy channels. Figure 1C shows electron flux measurements from a single POES satellite (NOAA-15) using the perpendicular telescope, which maximizes the equatorial pitch-angle coverage shown in Figure 1D. In comparison to the Van Allen Probes, observations from POES are limited in equatorial pitch-angle coverage but have a much finer temporal resolution, which makes them highly advantageous for the reconstruction of the radiation belts.

GOES spacecraft at geosynchronous orbit measure electrons in several integral flux channels using the Energetic Proton, Electron, and Alpha particle Detector (EPEAD) (e.g., Rodriguez et al., 2014). In this study, we use the >800 keV and >2 MeV channels and calculate the differential electron flux between those energies using a Gaussian fit of the spectrum. We use the 2 spacecraft available for the time of interest, GOES-13 and GOES-15.

For all missions, the adiabatic invariants μ, K, and L* (Roederer, 1970) are computed with the International Radiation Belt Environment Modeling (IRBEM) library, utilizing the International Geomagnetic Reference Field (IGRF) internal field model and the T89 external field model (Tsyganenko, 1989).

2.2 Versatile electron radiation belt (VERB) code

The adiabatic motion of energetic charged particles in the radiation belts consists of three basic periodic components: gyro-motion about Earth’s magnetic field lines; bounce motion of the gyration center up and down a given magnetic field line; and azimuthal drift of particles around the Earth, perpendicular to the meridional planes formed by the magnetic polar axis and the field lines. There are three adiabatic invariants, each associated with one of these motions, and by averaging over the gyro, bounce, and drift motions, we can describe the evolution of the particles’ phase-space density (PSD) solely in terms of these invariants, (μ, J, Φ), respectively. In the collisionless magnetospheric plasma, resonant wave-particle interactions provide the dominant mechanism for violation of the adiabatic invariants, resulting in changes in the outer radiation belt structure. For small wave amplitudes and a broad wave spectrum, such resonant interactions can be described within the framework of quasi-linear (QL) theory, which is based on the 3-D Fokker-Planck diffusion equation (Shultz and Lanzerotti, 1974). The three-dimensional Versatile Electron Radiation Belt (VERB-3D) code (Subbotin and Shprits, 2009) solves the Fokker-Planck equation for the PSD of electrons f, written in terms of operators describing radial diffusion, equatorial pitch angle (α_eq) diffusion, and energy (or momentum p) diffusion:

\begin{align} \frac{\partial f}{\partial t} = {} & \frac{1}{G} \frac{\partial}{\partial L^{*}} \left( G D_{L^{*} L^{*}} \frac{\partial f}{\partial L^{*}} \right) + \frac{1}{G} \frac{\partial}{\partial p} \left( G D_{pp} \frac{\partial f}{\partial p} \right) \\ & + \frac{1}{G} \frac{\partial}{\partial α_{eq}} \left( G D_{α_{eq} α_{eq}} \frac{\partial f}{\partial α_{eq}} \right) - \frac{f}{τ}, \end{align} (1)

where $G = \frac{8 π R_{E}^{3}}{m_{0}} p^{s} \sin(2 α_{eq}) {L^{*}}^{2} T(\sin(α_{eq}))$ is the Jacobian of the transformation from the adiabatic invariant system (μ, J, Φ) to (p, α_eq, L*); L* is a form of the third invariant Φ; m₀ is the particle’s rest mass; R_E is Earth’s radius; and T(sin(α_eq)) is a function corresponding to the bounce frequency (Shultz and Lanzerotti, 1974). The diffusion coefficients $D_{L^{*} L^{*}}$, D_pp, and $D_{α_{eq} α_{eq}}$ of Eq. 1 describe radial diffusion, energy diffusion, and pitch-angle scattering, respectively, and are estimated using QL diffusion theory and statistical hiss and chorus wave properties (Brautigam and Albert, 2000; Zhu et al., 2019). The mixed terms are not included, which simplifies the use of the VERB code in data assimilation (see Section 2.3). The lifetime parameter τ accounts for electron losses due to collisions with neutral particles, which are modeled by setting the lifetime equal to a quarter of the bounce time for electrons inside the loss cone and to infinity outside the loss cone. The VERB model has been successfully validated on time scales from several months to several years (e.g., Drozdov et al., 2015; Drozdov et al., 2017; Zhu et al., 2019; Drozdov et al., 2020; Wang et al., 2020; Drozdov et al., 2021; Saikin et al., 2021).
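To illustrate how a single operator of Eq. 1 can be advanced in time, the following is a minimal sketch of one backward-Euler step for a generic one-dimensional term of the form (1/G) ∂/∂x (G D ∂f/∂x) − f/τ. It is not the VERB implementation: the uniform grid, the zero-flux boundaries, the arithmetic averaging of G·D onto cell interfaces, and the function name `implicit_diffusion_step` are all assumptions made for this example.

```python
import numpy as np

def implicit_diffusion_step(f, x, G, D, tau, dt):
    """Backward-Euler step of df/dt = (1/G) d/dx (G D df/dx) - f/tau.

    Assumptions (not from the paper): uniform grid x, zero-flux boundaries,
    and G*D averaged arithmetically onto cell interfaces.
    """
    n = f.size
    dx = x[1] - x[0]
    GD = G * D
    GD_half = 0.5 * (GD[:-1] + GD[1:])       # G*D at interfaces i+1/2
    A = np.zeros((n, n))                      # discretized -(1/G) d/dx (G D d/dx)
    for i in range(n):
        if i > 0:                             # flux through the left interface
            w = GD_half[i - 1] / (G[i] * dx**2)
            A[i, i - 1] -= w
            A[i, i] += w
        if i < n - 1:                         # flux through the right interface
            w = GD_half[i] / (G[i] * dx**2)
            A[i, i + 1] -= w
            A[i, i] += w
    loss = np.diag(1.0 / tau)                 # 1/tau is 0 where tau is infinite
    # Solve (I + dt * (A + loss)) f_new = f_old
    return np.linalg.solve(np.eye(n) + dt * (A + loss), f)
```

With an infinite lifetime and a spatially constant f, the step leaves f unchanged, because the discrete diffusion operator conserves a flat profile; with a finite constant τ, the same profile decays by the factor 1/(1 + dt/τ).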

2.3 Data assimilation (DA)

Using common nomenclature for data assimilation (DA), in the K-filter formulation for a numerically discretized model (such as VERB-3D), the observational data y^o and the dynamically evolving fields of the model forecast x^f are combined into the analysis x^a:

x_{k + 1}^{a} = M_{k} x_{k}^{f} + K_{k} (y_{k}^{o} - H_{k} x_{k}^{f}) (2)

Here x_k represents a state column vector composed of all model variables on a numerical grid (in our case, the PSD f in Eq. 1), k is the time-stepping index, and the time-dependent matrix M_k of the VERB numerical model is obtained by numerically discretizing the partial differential equations that govern the physical system under study, i.e., the Fokker-Planck equation for PSD (Eq. 1). The use of the full Kalman filter for a three-dimensional model is challenging, as its computational complexity is O(N³), where N is the total number of grid points. In this study, we use a 31 × 30 × 29 grid in the coordinates L*, p, and α_eq, respectively. Instead of using the full Kalman filter, we use the alternative split-operator approach (Shprits et al., 2013), where the Kalman filter is applied for each grid direction. The model matrices M_k correspond to each of the diffusion operators in Eq. 1. The grid is selected to cover L* ∈ [1, 7], with pitch angle and energy covering α_eq ∈ [0.3°, 89.7°] and E ∈ [0.01, 10] MeV at L* = 7.
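The scale of the saving can be illustrated with a rough operation count for the grid quoted above. The cost model for the split-operator filter used here (one small filter per grid line, each costing the cube of the line length) is a simplifying assumption for illustration, not a figure taken from Shprits et al. (2013).

```python
# Grid sizes quoted in the text: L*, p, alpha_eq.
n_L, n_p, n_a = 31, 30, 29
N = n_L * n_p * n_a

# Full Kalman filter: cost grows as N^3.
full_cost = N ** 3

# Split-operator filter (assumed cost model): one small filter per grid line,
# i.e. (number of lines along a direction) x (line length)^3, summed over
# the three grid directions.
split_cost = (
    (n_p * n_a) * n_L ** 3
    + (n_L * n_a) * n_p ** 3
    + (n_L * n_p) * n_a ** 3
)

print(N, full_cost, split_cost, full_cost / split_cost)
```

Under this rough count, N = 26,970 and the full filter is more expensive than the split-operator filter by a factor of roughly 2.7 × 10⁵.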

The matrix H_k represents a map between the model state x_k and the observations of that state. The quantity in parentheses in the last term on the right-hand side of Eq. 2,

z_{k} \equiv y_{k}^{o} - H_{k} x_{k}^{f}, (3)

is the innovation vector that represents the mismatch between the model and observations and is used to drive the model state closer to the observations.

Specifying the model and observational errors Q and R allows us to follow the time evolution of the forecast-error P^f and analysis-error P^a covariance matrices.

This knowledge of the error-covariance matrices provides, in turn, the optimal Kalman gain matrix K_k, which gives the proper weight to the observations vs. the model prediction:

\begin{align} \begin{aligned} P_{k}^{f} & = M_{k} P_{k - 1}^{a} M_{k}^{T} + Q_{k}, \\ P_{k}^{a} & = (I - K_{k} H_{k}) P_{k}^{f}, \\ K_{k} & = P_{k}^{f} H_{k}^{T} {(H_{k} P_{k}^{f} H_{k}^{T} + R_{k})}^{- 1} . \end{aligned} \end{align} (4)

Information obtained in the error-covariance matrices is crucial in modifying the state vector x_k in observation-void regions.
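The role of the error covariances in observation-void regions can be seen in a small numerical sketch of Eqs 3, 4. The function below uses the common textbook ordering (forecast, then analysis); the function name `kalman_step` and this ordering of the time indices are assumptions of the example rather than the VERB data assimilation code.

```python
import numpy as np

def kalman_step(x_a_prev, P_a_prev, M, Q, H, R, y_obs):
    """One forecast + analysis cycle of a linear Kalman filter (textbook form)."""
    # Forecast: propagate the state and the analysis-error covariance.
    x_f = M @ x_a_prev
    P_f = M @ P_a_prev @ M.T + Q
    # Kalman gain (Eq. 4): weights the observations against the forecast.
    S = H @ P_f @ H.T + R
    K = P_f @ H.T @ np.linalg.inv(S)
    # Analysis: correct the forecast with the innovation z = y - H x_f (Eq. 3).
    z = y_obs - H @ x_f
    x_a = x_f + K @ z
    P_a = (np.eye(len(x_f)) - K @ H) @ P_f
    return x_a, P_a, K

# Two grid points, only the first is observed; the forecast errors are correlated.
x_a, P_a, K = kalman_step(
    x_a_prev=np.zeros(2),
    P_a_prev=np.array([[1.0, 0.5], [0.5, 1.0]]),
    M=np.eye(2),
    Q=np.zeros((2, 2)),
    H=np.array([[1.0, 0.0]]),
    R=np.array([[1.0]]),
    y_obs=np.array([2.0]),
)
```

In this toy case the unobserved second grid point is still updated, because the off-diagonal term of the forecast-error covariance carries the innovation from the observed point to its neighbor.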

In the standard formulation of the Kalman filter, the noise covariances Q and R are assumed to be known. This rarely happens in practice, and usually some simple approximations are made. Assuming a log-normal distribution of errors for PSD and uncorrelated errors in different locations, both Q and R are specified as diagonal matrices, and the diagonal terms of Q and R are taken simply as $ζ_{o,m} f_{o,m}^{2}$, where $f_{o,m}$ is the observed or modeled PSD value, and ζ_o,m is a specified factor corresponding to observational or model error (Kondrashov et al., 2011). This heuristic approach worked well in previous DA studies using VERB. Note that the exact values of ζ_o,m are not important: it is their ratio that determines the weight given to the observations vs. the model solution in the analysis (or update) step of the data assimilation. In this study, we use ζ_o = 0.5 and ζ_m = 0.5.
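The statement that only the ratio of the error factors matters can be checked with a scalar sketch. It assumes a single grid point with H = 1 and a forecast-error variance equal to the model error term Q (that is, no error carried over from earlier steps); the helper name `scalar_gain` is hypothetical.

```python
def scalar_gain(zeta_m, zeta_o, f_model, f_obs):
    """Scalar Kalman gain K = Q / (Q + R) with Q = zeta_m * f_m^2, R = zeta_o * f_o^2.

    Assumes H = 1 and a forecast-error variance equal to Q.
    """
    Q = zeta_m * f_model ** 2
    R = zeta_o * f_obs ** 2
    return Q / (Q + R)

# Values used in the text: equal error factors for model and observations.
gain_equal = scalar_gain(0.5, 0.5, f_model=2.0, f_obs=2.0)

# Scaling both factors by the same number leaves the gain unchanged.
gain_a = scalar_gain(0.5, 0.5, f_model=1.0, f_obs=3.0)
gain_b = scalar_gain(5.0, 5.0, f_model=1.0, f_obs=3.0)
```

With equal factors and equal PSD values, the analysis weights the model and the observation equally (gain of 0.5).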

2.4 Multivariate linear regression model (MLR)

To map POES measurements to the high equatorial pitch-angle region, we use the following data processing. Van Allen Probes data are interpolated onto a regular grid of equatorial pitch angles from 5° to 85° with a step of 10°. Then the data are interpolated onto the same energy grid as the energy channels of the POES satellites. The interpolated flux from RBSP-A and RBSP-B is merged and binned in time and L* ($j_{RBSP}^{binned}$). For the binning, we use a time step of 3 h and an L* step of 0.1.

Next, we calculated the standard deviation of the log₁₀ of the POES flux (j) at all POES energies and found very high values, indicating very high variation of the flux. We attribute this variation to outliers, i.e., unrealistically low or high values of the measured fluxes. To remove the unrealistic measurements, we exclude the data that fall below a threshold chosen on the basis of visual inspection of the measurements. The threshold is calculated for the 1-year period of 01 March 2013–01 March 2014 for each energy channel over the entire L* range:

\mathrm{threshold} = 10^{\left\langle \log_{10}(j) - \mathrm{std}(\log_{10}(j)) / 4 \right\rangle} (5)

The results of this method were inspected at all energies for several months of POES measurements. The inspection included the analysis of the flux vs. L* dependence with a determined threshold level. The threshold level was selected to be significantly below the reliable flux level.
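A minimal sketch of the threshold in Eq. 5 for a single energy channel is given below. It assumes that the angle brackets denote the mean over all samples and that the population standard deviation is used; neither detail is stated in the text, and the function names are hypothetical.

```python
import math
import statistics

def flux_threshold(fluxes):
    """Threshold of Eq. 5: 10 ** (mean(log10 j) - std(log10 j) / 4).

    Assumes positive fluxes and the population standard deviation.
    """
    logs = [math.log10(j) for j in fluxes if j > 0]
    return 10 ** (statistics.fmean(logs) - statistics.pstdev(logs) / 4)

def remove_low_outliers(fluxes):
    """Keep only the measurements at or above the threshold."""
    thr = flux_threshold(fluxes)
    return [j for j in fluxes if j >= thr]
```

For example, for the two values 1 and 100 the mean and standard deviation of log₁₀(j) are both 1, so the threshold is 10^0.75 and only the larger value is retained.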

To obtain the extrapolated pitch-angle distribution, we assume a simplified functional dependence of the flux as shown by the following equation:

j(α_{eq}) = j_{0} \cdot \sin(α_{eq}) (6)

where α_eq is an equatorial pitch angle and j₀ is the flux of the trapped population at 90°. Then the POES measurements are extrapolated to the equatorial pitch angles on the grid ($α_{eq}^{grid}$) from 5° to 85° with a fixed step of 10°:

j(t, E, α_{eq}^{grid}) = \sin(α_{eq}^{grid}) \frac{j(t, E, α_{eq})}{\sin(α_{eq})} (7)

where $j(t, E, α_{eq}^{grid})$ is the flux extrapolated to the new equatorial pitch angle grid, and j(t, E, α_eq) is the flux observed by POES at time t, energy E, and equatorial pitch angle α_eq. At each point in time, POES provides flux measurements for two local pitch angles. We select the measurement that corresponds to the higher pitch angle to calculate the extrapolated flux values. The extrapolated POES flux is binned in the same manner as the Van Allen Probes flux ($j_{POES}^{binned}$). The simplified sine approximation is used to establish the baseline method described in this section. As discussed in Section 4, the use of a more advanced pitch-angle approximation will be a subject of future research.
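The extrapolation of Eqs 6, 7 can be sketched in a few lines. The function and constant names are hypothetical, and the example treats a single measurement at one time and energy.

```python
import math

# Equatorial pitch-angle grid from the text: 5, 15, ..., 85 degrees.
ALPHA_GRID_DEG = list(range(5, 86, 10))

def extrapolate_to_grid(j_obs, alpha_obs_deg):
    """Extrapolate one POES flux value to the pitch-angle grid using Eq. 7."""
    # j0 of Eq. 6: flux at 90 degrees implied by the observed value.
    j0 = j_obs / math.sin(math.radians(alpha_obs_deg))
    return [j0 * math.sin(math.radians(a)) for a in ALPHA_GRID_DEG]

# A flux of 100 observed at an equatorial pitch angle of 30 degrees.
j_grid = extrapolate_to_grid(100.0, 30.0)
```

Because sin(30°) = 0.5, the implied j₀ in this example is 200, and the extrapolated flux increases monotonically across the nine grid angles.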

Then, we calculate the ratio $r(t, E, α_{eq}^{grid}) = j_{RBSP}^{binned} / j_{POES}^{binned}$ of the binned Van Allen Probes and binned POES fluxes for a 1-year period (01 March 2013–01 March 2014). We first take the median of this ratio for each Kp value and then bin by Kp, L*, energy, and equatorial pitch angle. Typically, inter-calibration coefficients are used to describe differences between instruments. Here we use the obtained ratio r to capture not only the bias of the instrument but also the bias of the procedure for extrapolation to high pitch angles, which may depend on Kp and L*.

Using the logarithm of the obtained ratio (log₁₀(r)), we perform a multivariate linear regression analysis. We obtain calibration coefficients (ξ) that depend on Kp, L*, energy (E) and pitch angle (α_eq) based on this analysis:

ξ(Kp, L^{*}, E, α_{eq}) = b_{0} + b_{1} \cdot Kp + b_{2} \cdot L^{*} + b_{3} \cdot E + b_{4} \cdot α_{eq} (8)

where b₀…b₄ are regression coefficients. The calibration coefficients ξ can be used to obtain the fluxes at a given α_eq, namely $j_{α_{eq}^{grid}} = 10^{ξ} \cdot j_{POES}$, for each $α_{eq}^{grid}$ and POES energy. Figure 2 illustrates that the flux resulting from the MLR method is in reasonable agreement with Van Allen Probes measurements. The resulting flux for the period from 01 April 2014 until 01 May 2014 (see Figure 3), which is outside the 1-year interval used to construct the MLR calibration coefficients, is included in the data assimilation in Section 3.
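The regression of Eq. 8 and its use for calibrating POES flux can be sketched with an ordinary least-squares fit. The function names, the units (MeV for energy, degrees for pitch angle), and the synthetic coefficients below are assumptions made purely for illustration; they are not the values obtained in this study.

```python
import numpy as np

def fit_calibration(kp, lstar, energy, alpha_eq, log10_ratio):
    """Least-squares fit of Eq. 8; returns [b0, b1, b2, b3, b4]."""
    A = np.column_stack([np.ones_like(kp), kp, lstar, energy, alpha_eq])
    b, *_ = np.linalg.lstsq(A, log10_ratio, rcond=None)
    return b

def calibrated_flux(b, kp, lstar, energy, alpha_eq, j_poes):
    """Apply j = 10**xi * j_POES with xi from Eq. 8."""
    xi = b[0] + b[1] * kp + b[2] * lstar + b[3] * energy + b[4] * alpha_eq
    return 10.0 ** xi * j_poes

# Synthetic check with made-up coefficients (illustrative values only).
rng = np.random.default_rng(0)
n = 500
kp = rng.uniform(0.0, 9.0, n)
lstar = rng.uniform(3.0, 6.0, n)
energy = rng.uniform(0.03, 1.9, n)       # MeV (assumed unit)
alpha_eq = rng.uniform(5.0, 85.0, n)     # degrees (assumed unit)
b_true = np.array([0.3, -0.05, 0.2, -0.4, 0.01])
log10_r = (b_true[0] + b_true[1] * kp + b_true[2] * lstar
           + b_true[3] * energy + b_true[4] * alpha_eq)
b_fit = fit_calibration(kp, lstar, energy, alpha_eq, log10_r)
```

On noise-free synthetic data the fit recovers the coefficients used to generate it; with real binned ratios the residual scatter would reflect how well a linear dependence on Kp, L*, E, and α_eq describes log₁₀(r).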

FIGURE 2. Observed electron flux and flux reconstructed using the NOAA-15 satellite at 1 MeV and α_eq = 75° for the training/testing period of 1 year from 01 March 2013 until 01 March 2014. (A) Van Allen Probes observations. (B) Flux reconstructed with multivariate linear regression analysis. (C) Flux reconstructed with the neural network. (D) Kp index.

FIGURE 3. Observed electron flux and flux reconstructed using the NOAA-15 satellite at 1 MeV and α_eq = 75° for the interval outside of the training/testing period, from 01 April 2014 until 01 May 2014. (A) Van Allen Probes observations. (B) Flux reconstructed with multivariate linear regression analysis. (C) Flux reconstructed with the neural network. (D) Kp index.

2.5 Neural network (NN) model

The constructed neural network (NN) model predicts the Van Allen Probes ECT electron flux j_RBSP for a specific energy channel and a local pitch angle channel using one fully connected hidden layer with 32 neurons and the rectified linear unit (“relu”) activation function. Our NN design choices follow best practices and result from an extensive parameter search and testing. We compared various NN designs with different numbers of hidden layers and neurons (not shown here) and selected 32 hidden neurons based on the minimal validation error.

Thus, in total, we independently train N_E × N_α = 180 NN models, where N_E = 20 and N_α = 9 are the numbers of selected energy and local pitch angle channels from the ECT dataset, respectively. The 20 selected energy channels from the ECT dataset are chosen to be close to the selected energy channels of the POES dataset by Peck et al. (2015). The pitch angles are selected from 10° up to 90° with a step of 10°. The input to a single network consists of the POES fluxes in all 20 energy channels (1…20), one POES equatorial pitch angle (selected only from the perpendicular telescope), one Van Allen Probes equatorial pitch angle (selected from a local pitch angle channel), and L* (as explained below and computed with the T89 model), as well as the Kp index:

\begin{align} j_{RBSP}(t, α_{loc}, E_{RBSP}) = NN \big( & j_{POES}(t, E_{1 \ldots 20}), α_{eq}^{POES}(t), \\ & α_{eq}^{RBSP}(t), L^{*}(t), Kp(t) \big) \end{align} (9)
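The shape of one such network can be sketched as a forward pass with 24 inputs (20 POES fluxes, two equatorial pitch angles, L*, and Kp), 32 hidden ReLU units, and a single output. The linear output unit, the random initialization, and all names below are assumptions for illustration; the trained weights of the actual models are not reproduced here.

```python
import numpy as np

N_INPUTS = 20 + 1 + 1 + 1 + 1   # POES fluxes, alpha_eq POES, alpha_eq RBSP, L*, Kp
N_HIDDEN = 32                   # hidden neurons, as stated in the text
N_MODELS = 20 * 9               # one network per energy and local pitch-angle channel

def init_params(rng):
    """Random (untrained) weights for a 24-32-1 fully connected network."""
    return {
        "W1": rng.normal(0.0, 0.1, (N_HIDDEN, N_INPUTS)),
        "b1": np.zeros(N_HIDDEN),
        "W2": rng.normal(0.0, 0.1, (1, N_HIDDEN)),
        "b2": np.zeros(1),
    }

def forward(params, x):
    """Map one standardized input vector to one (log-space) flux prediction."""
    h = np.maximum(0.0, params["W1"] @ x + params["b1"])   # ReLU hidden layer
    return (params["W2"] @ h + params["b2"])[0]            # assumed linear output

params = init_params(np.random.default_rng(0))
y = forward(params, np.zeros(N_INPUTS))
```

Each of the 180 models would hold its own set of such parameters, trained independently for its energy and pitch-angle channel.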

Both POES and Van Allen Probes fluxes are transformed into logarithmic space and normalized before fitting the network. Outliers, i.e., unrealistically low or high values of the measured POES flux, are removed in the same way as described in Section 2.4. All inputs and outputs are aggregated and averaged at the specific time t (within 1 h) and L* location (within 0.1 L*) of the ECT output, and are also standardized to have zero mean and unit variance. The network minimizes the mean-squared error (MSE) loss function using the stochastic gradient descent ‘adam’ method with an initial learning rate of 0.005 and a piecewise learning rate schedule that multiplies the learning rate by a factor of 0.2 every 125 epochs. To avoid overfitting, we use training, validation, and test datasets. We randomly select 90% of the data from 01 March 2013 until 01 March 2014 as the training set and 10% as the validation set (see Figure 2). We use validation-based early stopping, that is, the training process stops when the MSE of the validation set stops improving for several consecutive epochs.
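The learning rate schedule and the early-stopping rule described above can be sketched as two small functions. The zero-based epoch counting and the `patience` value are assumptions, since the text specifies only "every 125 epochs" and "several consecutive epochs"; the function names are hypothetical.

```python
def learning_rate(epoch, initial=0.005, drop_factor=0.2, drop_period=125):
    """Piecewise-constant schedule: multiply by drop_factor every drop_period epochs."""
    return initial * drop_factor ** (epoch // drop_period)

def early_stopping_epoch(val_losses, patience=5):
    """Return the epoch at which training stops under validation-based early stopping.

    Training stops once the validation loss has not improved for `patience`
    consecutive epochs (the patience value here is a hypothetical choice).
    """
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, wait = loss, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return len(val_losses) - 1
```

Under these assumptions the learning rate is 0.005 for epochs 0 to 124, 0.001 for epochs 125 to 249, and so on.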

The test period from 01 April 2014 until 01 May 2014 (see Figure 3) is used to assess out-of-sample model performance and is included in the data assimilation (Section 3). Once the neural network is trained, the Van Allen Probes data are no longer needed; only the POES fluxes, locations in space and pitch angle, and the Kp index are required to specify the outer electron belt environment. While $α_{eq}^{RBSP}(t)$ changes in time during the training of the NN model, we use a constant value from $α_{eq}^{grid}$ (see Section 2.4) in the predictive mode used for assessment, comparison, and data assimilation. The NN that corresponds to the selected value of $α_{eq}^{grid}$ is chosen based on the time median of the corresponding $α_{eq}^{RBSP}(t)$ during training.

3 Results

3.1 POES-to-RBSP reconstruction by NN and MLR models

Figure 2 shows the electron flux at 1 MeV and α_eq = 75° for a 1-year period (01 March 2013–01 March 2014). This period is used to obtain calibration coefficients using the MLR method (Figure 2B) and to train and validate the NN (Figure 2C). Figure 2 serves an illustrative purpose and demonstrates that both methods (MLR and NN) provide a reasonable reconstruction of the electron flux at a higher equatorial pitch angle than POES can observe. For testing the methods and for the subsequent data assimilation, we use a different period (01 April 2014–01 May 2014), which is shown in Figure 3. For the quantitative evaluation of the MLR and NN models, we use metrics presented in Claudepierre and O’Brien (2020). Namely, we use coefficients of determination [r²; Eq. (1) from Claudepierre and O’Brien (2020)] and correlation coefficients calculated in logarithmic (r_log) and linear (r_lin) space between the Van Allen Probes (RBSP-A) flux and the flux reconstructed from NOAA-15. The metrics are calculated for the full range of L*. Since RBSP-A and NOAA-15 data have different time resolutions, the data are binned with a time step of 4 h and an L* step of 0.1 prior to calculating the coefficients. The coefficients are presented in Table 1 and are computed for 3 energies (0.5, 1.0, 1.5 MeV) and separately for 3 pitch-angle values (35°, 55°, 75°). Although the selection of energies and pitch angles is limited, the two models show similar performance, with the NN model noticeably better than MLR in terms of r².
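A minimal sketch of these three metrics is given below. The coefficient of determination uses the common definition 1 − SS_res/SS_tot evaluated on log₁₀ flux, which is an assumption and may differ in detail from Eq. (1) of Claudepierre and O’Brien (2020); the function names are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson linear correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def skill_metrics(obs, model):
    """r2 (assumed: 1 - SS_res/SS_tot in log10 space), r_log, and r_lin."""
    log_o = [math.log10(v) for v in obs]
    log_m = [math.log10(v) for v in model]
    mean_o = sum(log_o) / len(log_o)
    ss_res = sum((o - m) ** 2 for o, m in zip(log_o, log_m))
    ss_tot = sum((o - mean_o) ** 2 for o in log_o)
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "r_log": pearson(log_o, log_m),
        "r_lin": pearson(obs, model),
    }

# A reconstruction that is biased high by a factor of 2 everywhere.
biased = skill_metrics([1.0, 10.0, 100.0], [2.0, 20.0, 200.0])
```

The example shows why r² is the more demanding metric: a reconstruction with a constant multiplicative bias is perfectly correlated with the observations in both linear and logarithmic space, yet its r² falls below 1.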

TABLE 1. Coefficients of determination (r²) and correlation coefficients calculated in logarithmic (r_log) and linear (r_lin) space between Van Allen Probes RBSP-A fluxes and fluxes reconstructed from POES NOAA-15 data using the MLR and NN models.

3.2 VERB data assimilation using NN- and MLR-reconstructed data

For convenience, the flux j predicted by the MLR and NN models is converted to PSD f as f = j/(p ⋅ c)², where p is the momentum and c is the speed of light. The first (μ) and second (K) adiabatic invariants are calculated from the energy and equatorial pitch angle using a dipole field, while preserving the third adiabatic invariant (L*), which is calculated using the T89 magnetic field model. The resulting PSD from the POES-based MLR- and NN-reconstructed fluxes at multiple energies and equatorial pitch angles ($α_{eq}^{grid}$) is used as observations (y^o in Eq. (2)) for assimilation with the VERB model. Hence, each point reconstructed in time from a single POES satellite, covering 20 energy values and 9 $α_{eq}^{grid}$ values, is interpolated to the simulation grid and included in DA.
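The flux-to-PSD conversion and the first invariant in a dipole field can be illustrated with a short sketch. The function names, the equatorial surface field of 0.312 G, and the choice of MeV and gauss units are assumptions made for this example; the second invariant K, which requires an integral along the field line, is omitted.

```python
import math

M0C2_MEV = 0.511     # electron rest energy, MeV
B0_GAUSS = 0.312     # assumed equatorial surface dipole field of the Earth, G

def pc_squared(kinetic_mev):
    """Relativistic (p*c)^2 in MeV^2 from kinetic energy: E_k * (E_k + 2 * m0 * c^2)."""
    return kinetic_mev * (kinetic_mev + 2.0 * M0C2_MEV)

def flux_to_psd(flux, kinetic_mev):
    """PSD f = j / (p*c)^2; the units follow those of the input flux."""
    return flux / pc_squared(kinetic_mev)

def mu_dipole(kinetic_mev, alpha_eq_deg, l_shell):
    """First adiabatic invariant (MeV/G) at the equator of a dipole field."""
    b_eq = B0_GAUSS / l_shell ** 3                # equatorial dipole field, G
    sin2 = math.sin(math.radians(alpha_eq_deg)) ** 2
    # mu = p_perp^2 / (2 m0 B) = (p c)^2 sin^2(alpha) / (2 m0 c^2 B)
    return pc_squared(kinetic_mev) * sin2 / (2.0 * M0C2_MEV * b_eq)

# Example: a 1 MeV electron with alpha_eq = 75 deg at L = 4.5
psd = flux_to_psd(1.0e3, 1.0)
mu = mu_dipole(1.0, 75.0, 4.5)
```

Working in (μ, K, L*) rather than (energy, pitch angle, position) lets the assimilation compare observations and model at fixed adiabatic invariants.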

Next, we compare DA results using POES-based NN- and MLR-reconstructed fluxes in the validation period from 01 April 2014 until 01 May 2014. Figure 4A shows the binned Van Allen Probes observations, taken as the ground truth, at 1 MeV and α_eq = 75°, in comparison to DA results using MLR-reconstructed fluxes, shown in Figure 4B. Figure 4C shows the logarithmic difference between the Van Allen Probes observations and the DA results.

FIGURE 4. Data assimilation using data reconstructed with multivariate linear regression analysis. Electron flux at 1 MeV and α_eq = 75° for the period from 01 April 2014 until 01 May 2014. (A) Binned Van Allen Probes observations. (B) Data assimilation using reconstructed flux from POES data. (C) Logarithmic difference between flux from data assimilation and observations. (D) Kp index. (E) Comparison of fluxes between observations Flux_data and DA results Flux_da. (F) Distribution of the logarithmic flux ratio.

Figure 4E shows a quantitative comparison of fluxes between observations Flux_data and data assimilation using MLR-reconstructed fluxes Flux_da, with 62.3% of points being within a factor of 2. Figure 4F shows a histogram of the corresponding logarithmic ratio. The histogram is close to a normal distribution, with a slight overestimation of Flux_da in comparison to Flux_data.
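The "within a factor of 2" score and the logarithmic ratio used in panels (E) and (F) can be computed as in the sketch below. The function names are hypothetical, and the sign convention log10(Flux_da / Flux_data), where positive values mean overestimation by DA, is an assumption rather than something stated in the text.

```python
import math
import statistics

def log_flux_ratio(flux_da, flux_data):
    """log10(Flux_da / Flux_data) for all pairs with positive fluxes."""
    return [math.log10(a / b) for a, b in zip(flux_da, flux_data) if a > 0 and b > 0]

def fraction_within_factor(flux_da, flux_data, factor=2.0):
    """Fraction of points where the two fluxes agree within the given factor."""
    ratios = log_flux_ratio(flux_da, flux_data)
    limit = math.log10(factor)
    inside = sum(1 for r in ratios if abs(r) <= limit)
    return inside / len(ratios)

def median_log_bias(flux_da, flux_data):
    """Median of the log ratio; positive values indicate overestimation by DA."""
    return statistics.median(log_flux_ratio(flux_da, flux_data))

# Synthetic example: two of four points agree within a factor of 2
flux_data = [1.0, 1.0, 1.0, 1.0]
flux_da = [1.0, 3.0, 0.4, 1.5]
score = fraction_within_factor(flux_da, flux_data)
```

A histogram of `log_flux_ratio` centered on zero indicates an unbiased reconstruction, while a shifted peak indicates systematic over- or underestimation.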

Figure 5 is in the same format as Figure 4 but shows DA results with NN-reconstructed fluxes, which indicate an improved accuracy with 72.9% of points within a factor of 2 (Figure 5E). The histogram in Figure 5F shows that data assimilation using NN-reconstructed fluxes results in almost no overestimation larger than a factor of 2, and its peak is shifted towards underestimation of Flux_da in comparison to Flux_data.

FIGURE 5. Data assimilation using data reconstructed with the neural network. Electron flux at 1 MeV and α_eq = 75° for the period from 01 April 2014 until 01 May 2014. (A) Binned Van Allen Probes observations. (B) Data assimilation using reconstructed flux from POES data. (C) Logarithmic difference between flux from data assimilation and observations. (D) Kp index. (E) Comparison of fluxes between observations Flux_data and DA results Flux_da. (F) Distribution of the logarithmic flux ratio.

Furthermore, Table 2 shows that DA improves the accuracy (as measured by r²) of reconstructed fluxes in the heart of the radiation belts (L* ∈ [3.5, 6.0], where electron dynamics is the most significant) in comparison with the standalone machine learning model results. This improvement by DA is more pronounced when using MLR-based fluxes, and the accuracy is only marginally better when using NN-based fluxes. We chose the narrower L* region because, for the selected period, the physics-based VERB simulation yields a very low PSD level in the slot region compared with the observations (see Figure 3), which are set there by the instrumental noise level. A similar comparison at lower L* < 3.5 (below the heart of the radiation belts; Reeves et al., 2013) results in fitting to observations when DA is applied to NN-based fluxes (not shown).

TABLE 2. Coefficients of determination (r²) and correlation coefficients calculated in logarithmic (r_log) and linear (r_lin) space between Van Allen Probes RBSP-A fluxes and fluxes reconstructed from POES NOAA-15 data using the MLR and NN models (first 5 rows, similar to Table 1), and the same comparison for data assimilation with POES NOAA-15 using the MLR and NN models (last 5 rows). The calculation of coefficients is limited to L* ∈ [3.5, 6.0], which represents the heart of the radiation belts.

One of the main advantages of DA is that it provides a complete reconstruction of the radiation belts. This enables a virtual flyby of an arbitrary satellite that retrieves an accurate representation of the electron flux/PSD along its trajectory, similar to the Observing System Simulation Experiments (OSSEs) study recently supported by NOAA (Schiller et al., 2022) in so-called “fraternal twin” assimilation experiments (Kondrashov et al., 2007; Shprits et al., 2007; Kondrashov et al., 2011), where synthetic data from virtual satellites along different orbits (LEO, GTO, MEO) of a VERB simulation with one set of physical parameters are assimilated into VERB with different physical parameter settings, with the goal of best reconstructing the fluxes at GEO. We achieve such a reconstruction using physics-based extrapolation of LEO observations with the VERB code and machine learning. To demonstrate this capability, Figures 6, 7 show DA results in comparison to GOES observations at 1 MeV and α_eq = 55° in the validation period from 01 April 2014 until 01 May 2014, using the same format as Figures 4, 5. The accuracy of the DA reconstruction using NN-based fluxes is significantly better than that using the MLR method: 70.9% of points are within a factor of 2 for the former vs. 55.1% for the latter. In addition, we compare the GOES fluxes with those reconstructed from LEO using the ML methods and DA. The wide L-shell coverage provided by POES allows us to reconstruct the flux level in the region of GEO. However, none of our ML models (MLR and NN) were trained on data outside of the Van Allen Probes spatial coverage in L*, which is below GEO. Hence, the physics-based extrapolation imposed by DA may become more important for such a task. Table 3 provides details of the comparison of DA and our ML models when extrapolating to GEO at different energies and pitch angles, in a format similar to Table 2.

The agreement between the observed and reconstructed fluxes at GEO is better for DA than for our ML models, although the accuracy of the DA-NN model is lower than in Table 2. This is an expected result because our ML models were not trained on GEO data. Also, much better predictive ML models already exist that include GEO electron data in training (e.g., Boynton et al., 2013; Shin et al., 2016; Zhang et al., 2020; Wang et al., 2023). However, such models usually rely on knowledge of the solar wind data, while the DA technique demonstrated in this paper uses only the Kp index as an indicator of geomagnetic activity, which is available in near real time (e.g., Matzka et al., 2021). Also, the demonstrated method of reconstructing fluxes at GEO from LEO measurements is of interest to the community (e.g., Drozdov et al., 2022).
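The virtual flyby idea can be illustrated by sampling a gridded reconstruction along a satellite track. The sketch below assumes a reconstruction stored as log10 flux on a regular (time, L*) grid at a single energy and pitch angle, and uses bilinear interpolation; the function name, grid layout, and interpolation scheme are illustrative assumptions and not part of the VERB code.

```python
import bisect
import math

def fly_through(t_grid, l_grid, log_flux, track):
    """Sample log10 flux along a track of (time, L*) points.

    log_flux[i][j] holds the value at t_grid[i], l_grid[j]; both grids must be
    sorted in ascending order. Points outside the grid return NaN.
    """
    samples = []
    for t, l in track:
        inside = t_grid[0] <= t <= t_grid[-1] and l_grid[0] <= l <= l_grid[-1]
        if not inside:
            samples.append(float("nan"))
            continue
        # Index of the lower-left corner of the enclosing grid cell
        i = min(bisect.bisect_right(t_grid, t) - 1, len(t_grid) - 2)
        j = min(bisect.bisect_right(l_grid, l) - 1, len(l_grid) - 2)
        wt = (t - t_grid[i]) / (t_grid[i + 1] - t_grid[i])
        wl = (l - l_grid[j]) / (l_grid[j + 1] - l_grid[j])
        # Interpolate in L* at the two bracketing times, then in time
        early = (1 - wl) * log_flux[i][j] + wl * log_flux[i][j + 1]
        late = (1 - wl) * log_flux[i + 1][j] + wl * log_flux[i + 1][j + 1]
        samples.append((1 - wt) * early + wt * late)
    return samples

# Synthetic grid where log10 flux = t + (L* - 4)
t_grid = [0.0, 1.0]
l_grid = [4.0, 5.0, 6.0]
log_flux = [[0.0, 1.0, 2.0], [1.0, 2.0, 3.0]]
track = [(0.5, 4.5), (1.0, 6.0), (0.5, 7.0)]
values = fly_through(t_grid, l_grid, log_flux, track)
```

Interpolating in log10 flux rather than in linear flux is a design choice suited to quantities that vary over orders of magnitude; a track point beyond the grid, such as L* = 7 here, is flagged with NaN rather than extrapolated.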

FIGURE 6. Data assimilation using data reconstructed with multivariate linear regression analysis. Electron flux at 1 MeV and α_eq = 55° for the period from 01 April 2014 until 01 May 2014. (A) Binned GOES observations. (B) Data assimilation using reconstructed flux from POES data. (C) Logarithmic difference between flux from data assimilation and observations. (D) Kp index. (E) Comparison of fluxes between observations Flux_data and DA results Flux_da. (F) Distribution of the logarithmic flux ratio.

FIGURE 7. Data assimilation using data reconstructed with the neural network. Electron flux at 1 MeV and α_eq = 55° for the period from 01 April 2014 until 01 May 2014. (A) Binned GOES observations. (B) Data assimilation using reconstructed flux from POES data. (C) Logarithmic difference between flux from data assimilation and observations. (D) Kp index. (E) Comparison of fluxes between observations Flux_data and DA results Flux_da. (F) Distribution of the logarithmic flux ratio.

TABLE 3. Coefficients of determination (r²) and correlation coefficients calculated in logarithmic (r_log) and linear (r_lin) space between GOES-13 and GOES-15 fluxes and fluxes reconstructed from POES NOAA-15 data using the MLR and NN models (first 5 rows), and the same comparison for data assimilation with POES NOAA-15 using the MLR and NN models (last 5 rows). The calculation of coefficients is limited to L* ∈ [5.0, 7.0], the GOES coverage.

4 Conclusion

In this work, we demonstrated that the electron radiation belt flux observed by a MEO satellite can be successfully reconstructed from LEO POES measurements with machine learning methods. We used 2 ML methods: multivariate linear regression analysis (MLR) and a neural network (NN). The reconstructed flux was included in data assimilation (DA) with the VERB code and compared with Van Allen Probes and GOES observations. The MLR method represents a reference model that is easy to implement in space weather applications that require reconstruction of the radiation belt dynamics. We found that data assimilation using MLR-reconstructed flux can provide a reasonable agreement with observations. However, data assimilation with the flux reconstructed using the NN provided only a limited improvement over the standalone NN model. Therefore, our main conclusion is that, when an optimized machine learning model is not possible, our preliminary results suggest that data assimilation can be beneficial for reconstructing outer belt electrons by correcting errors of a subpar machine learning based LEO-to-MEO mapping (e.g., the MLR case), as well as by providing physics-based extrapolation to the portion of the parameter space that is inadequately covered by existing measurements (GEO is treated as such a case here). Meanwhile, when a well-trained ML model is feasible (e.g., the NN case), the application of DA shows only limited improvement.

Although both methods (MLR and NN) in combination with DA showed applicability in the reconstruction of the radiation belts, this study includes several assumptions and limitations. The selected implementation of the MLR-reconstructed flux has limitations, as we used a simplified sine-function extrapolation of the electron flux. A more realistic reconstruction of the pitch-angle distribution (e.g., Allison et al., 2018; Zhao et al., 2018; Smirnov et al., 2022), as well as MLT dependence, may be used in future studies to improve the results. Additionally, we used the POES data set presented by Peck et al. (2015), which is convenient for this study but has limited temporal coverage (1998–2014) and thus too short an overlap with the Van Allen Probes to allow for a robust comparison of DA-MLR and DA-NN results between quiet and disturbed geomagnetic activity, including extreme geomagnetic storms. Future work will consider near real-time POES measurements, a comprehensive analysis of PSD (e.g., Wing et al., 2022), and a detailed analysis of a wider range of energies and pitch angles. In addition, future work will include the combination of different measurements with various errors into a data assimilative model.

The main advantage of data assimilation is that it can help reduce the errors that arise from inaccuracies of measurements, inaccuracies associated with the mixing of trapped and quasi-trapped populations, and inaccuracies associated with extrapolation to the equator. When an optimized machine learning model is not possible, our results suggest that data assimilation can be beneficial for reconstructing outer belt electrons by correcting errors of a machine learning based LEO-to-MEO mapping and by providing physics-based extrapolation to the portion of the parameter space not included in the LEO-to-MEO mapping, such as at GEO. Machine learning models can also be inaccurate, especially when applied outside of the training interval and during extreme geomagnetic conditions. In these situations, we may consider rebalancing using an approach similar to that of Shprits et al. (2019), or using different machine learning models (e.g., MLR and NN) depending on the geomagnetic activity, with DA compensating for possible machine learning errors during extreme geomagnetic storms (see Zhelavskaya et al., 2021).

The ML- and DA-based reconstruction of the radiation belts with the presented methodology enables continuous monitoring of the radiation belt state even without in situ near-equatorial radiation belt measurements. This is particularly crucial for space weather applications and space weather prediction. Such an approach can also be used to study the global long-term dynamics of the radiation belts. Furthermore, analysis of the pitch-angle distributions of the radiation belts reconstructed from LEO measurements can provide information about the dominant physical mechanism that drives radiation belt dynamics and will be addressed in future research.

Data availability statement

We thank the Van Allen Probe ECT team for providing the data: https://rbsp-ect.newmexicoconsortium.org/. The POES measurements are available at the NOAA NGDC website: https://satdat.ngdc.noaa.gov/sem/poes/data/processed/ngdc/corrected/peck/. The GOES measurements are available at the NOAA-NGDC website: https://www.ngdc.noaa.gov/stp/. The authors used geomagnetic indices provided by OMNIWeb: https://omniweb.gsfc.nasa.gov/.

Author contributions

AD led the work, performed the analyses, and assisted in writing the paper. DK developed the NN model, advised AD on the DA details and interpretation of the results, and wrote the paper. KS assisted in DA analysis. YS conceptualized the study. All authors contributed to the article and approved the submitted version.

Funding

This research is funded by contract FA9453-19-C-0619 submitted to AFRL STTR topic AF17C-T03. DK is supported by NSF grant AGS-2211345.

Acknowledgments

The authors acknowledge the developers of the International Radiation Belt Environment Modeling (IRBEM) library. We thank Drew Turner, Quintin Schiller and Geoff Reeves for their useful discussions and Sharon Uy for proofreading this manuscript.

Conflict of interest

AD, KS and YS have a significant financial interest in Space Sciences Innovations Inc.

The remaining author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Author disclaimer

The views expressed are those of the author and do not necessarily reflect the official policy or position of the Department of the Air Force, the Department of Defense, or the U.S. government. The appearance of external hyperlinks does not constitute endorsement by the United States Department of Defense (DoD) of the linked websites, or the information, products, or services contained therein. The DoD does not exercise any editorial, security, or other control over the information you may find at these locations.

References

Allison, H. J., Horne, R. B., Glauert, S. A., and Del Zanna, G. (2018). Determination of the equatorial electron differential flux from observations at low Earth orbit. J. Geophys. Res. Space Phys. 123, 9574–9596. doi:10.1029/2018ja025786
