Statistical and Machine Learning Approach to Earthquake Forecast: Models, Laboratory and Field Data

Editorial

02 September 2022

Editorial: Statistical and machine learning approach to earthquake forecast: Models, laboratory and field data

Alexey Lyubushin

1,360 views

0 citations

Editors

Filippos Vallianatos

National and Kapodistrian University of Athens

Tamaz Lucka Chelidze

M. Nodia Institute of Geophysics

Alexey Lyubushin

Institute of Physics of the Earth (RAS)

Pan Xiong

Institute of Earthquake Forecasting, China Earthquake Administration

Impact

Original Research

22 June 2022

Long-Term Forecasting of Strong Earthquakes in North America, South America, Japan, Southern China and Northern India With Machine Learning

Victor Manuel Velasco Herrera, 7 more and Carlos Vera

Strong earthquakes (magnitude ≥ 7) occur worldwide, affecting different cities and countries and causing great human, ecological and economic losses. The ability to forecast strong earthquakes on a long-term basis is essential to minimize the risks and vulnerabilities of people living in highly active seismic areas. We studied seismic activity in North America, South America, Japan, Southern China and Northern India, searching for patterns in strong earthquakes in each of these active seismic zones between 1900 and 2021 using the wavelet transform. We found that the primary seismic activity patterns for M ≥ 7 earthquakes are 55, 3.7, 7.7, and 8.6 years for the seismic zones of the southwestern United States and northern Mexico, southwestern Mexico, South America, and Southern China-Northern India, respectively. In the case of Japan, the most important seismic pattern for earthquakes with magnitude 7 ≤ M < 8 is 4.1 years, and for strong earthquakes with M ≥ 8 it is 40 years. Every seismic pattern obtained clusters the earthquakes into historical intervals (episodes) with and without strong earthquakes in the individually analyzed seismic zones. We want to clarify that the intervals without strong earthquakes do not imply the total absence of seismic activity, because earthquakes of lesser magnitude can occur within the same interval. From the information and patterns obtained from the wavelet analyses, we created a probabilistic, long-term earthquake prediction model for each seismic zone using the Bayesian Machine Learning method. We propose that the periods of occurrence of earthquakes in each seismic zone analyzed could be interpreted as the period in which stress builds up on different planes of a fault, until this energy is released through rupture along faults and fractures near the tectonic plate boundaries. A series of earthquakes can then occur along the fault until the stress subsides and a new cycle begins.
Our machine learning models predict a new period of strong earthquakes between 2040 ± 5 and 2057 ± 5, 2024 ± 1 and 2026 ± 1, 2026 ± 2 and 2031 ± 2, 2024 ± 2 and 2029 ± 2, and 2022 ± 1 and 2028 ± 2 for the five active seismic zones of the United States, Mexico, South America, Japan, and Southern China and Northern India, respectively. In addition, our methodology can be applied in areas where moderate earthquakes occur, as in the case of the Parkfield section of the San Andreas fault (California, United States). Our methodology explains why a moderate earthquake could not occur in 1988 ± 5 as proposed and why the long-awaited Parkfield earthquake occurred in 2004. Furthermore, our model predicts that seismic events may occur between 2019 and 2031, with a high probability of earthquakes at Parkfield around 2025 ± 2 years.
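As a rough illustration of how a wavelet analysis can expose a dominant recurrence period in a yearly series, the sketch below scans candidate periods with a Morlet-like wavelet (a complex sinusoid under a Gaussian envelope). It is a minimal sketch and not the authors' method: the series is synthetic (an 8-year cycle plus a weaker 21-year cycle over 122 samples, standing in for 1900 to 2021), and the names `wavelet_power`, `dominant_period` and the `cycles` parameter are hypothetical.

```python
import cmath
import math


def wavelet_power(series, period, cycles=1.0):
    """Time-averaged power of a Morlet-like wavelet at one candidate period."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]  # remove the constant offset
    sigma = cycles * period  # width of the Gaussian envelope, in samples
    total = 0.0
    for t in range(n):
        acc = 0j
        norm = 0.0
        for u in range(n):
            d = u - t
            w = math.exp(-0.5 * (d / sigma) ** 2)
            # Correlate with a complex sinusoid of the candidate period.
            acc += centered[u] * w * cmath.exp(-2j * math.pi * d / period)
            norm += w
        # Dividing by the window weight keeps amplitudes comparable
        # across periods and near the ends of the series.
        total += abs(acc / norm) ** 2
    return total / n


def dominant_period(series, periods):
    """Candidate period with the largest time-averaged wavelet power."""
    return max(periods, key=lambda p: wavelet_power(series, p))


# Synthetic yearly series (assumption): strong 8-year and weak 21-year cycles.
series = [
    math.sin(2 * math.pi * t / 8) + 0.3 * math.sin(2 * math.pi * t / 21)
    for t in range(122)
]
periods = [2 + 0.5 * k for k in range(57)]  # 2.0, 2.5, ..., 30.0 years
best_period = dominant_period(series, periods)
print(f"dominant period: {best_period} years")
```

The double loop is quadratic in the series length, which is acceptable for about a century of yearly values; a real catalogue analysis would more likely rely on an established wavelet library and a significance test against a noise background.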

8,837 views

13 citations

(A) Graph of daily average values of the index γ calculated for all reference points; the green line is a moving average in a 57-day window. (B) Graph of average values of the index γ after removal of local trends by a Gaussian window with a radius of 2 days; red and blue dots indicate the 377 most pronounced local maxima and minima of the DJ index γ after detrending. (C) Time sequence of the 377 earthquakes with a magnitude of at least 7 worldwide for the time interval 1997–2021. (D) and (E) Pairs of graphs showing the shares of intensities of the sequences of seismic events [(d1) and (e1)] and of the points of local extrema of the DJ index [(d2) and (e2)] when estimating the model of mutual influence of two point processes in a 5-year window for the relaxation time τ = 15 days.

Original Research

05 July 2022

Investigation of the Global Seismic Noise Properties in Connection to Strong Earthquakes

Alexey Lyubushin

2,774 views

12 citations

Original Research

01 March 2022

Earthquake Forecast as a Machine Learning Problem for Imbalanced Datasets: Example of Georgia, Caucasus

Tamaz Chelidze, 3 more and Gennady Kobzev

In this article, we considered the problem of $M \geq 3$ earthquake (EQ) forecasting (hindcasting) with a machine learning (ML) approach, using experimental (training) time series of water-level variations in deep wells as well as geomagnetic and tidal time series in Georgia (Caucasus). For such magnitudes, the ratio of “seismic” to “aseismic” days in Georgia is approximately 1:5 and the dataset is close to balanced. However, forecasting is practically important for stronger events, say events of $M \geq 3.5$, for which the learning dataset becomes more imbalanced: the ratio of seismic to aseismic days in Georgia reaches values of the order of 1:20 and more. In this case, some accepted ML classification measures, such as accuracy, lead to wrong predictions due to the large number of true negative cases. As a result, the minority class (here, seismically active periods) is ignored altogether. We applied specific measures to avoid the imbalance effect and exclude the possibility of overfitting. After regularization (balancing) of the training data, we built the confusion matrix and performed receiver operating classification in order to forecast the next-day probability of $M \geq 3.5$ earthquake occurrence. We found that the Matthews’ correlation coefficient (MCC) is a measure that gives good results even if the negative and positive classes are of very different sizes. Application of MCC to the observed geophysical data gives a good forecast of the next-day $M \geq 3.5$ seismic event probability, of the order of 0.8. After randomization of EQ dates in the training dataset, the Matthews’ coefficient efficiency decreases to 0.17.
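The contrast between accuracy and MCC on an imbalanced dataset can be shown with a small worked example. The sketch below is illustrative only: the labels are invented (10 seismic and 200 aseismic days, matching the 1:20 ratio mentioned above), the two "classifiers" are hypothetical, and the helper names are not taken from the article.

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels, 1 = seismic day."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    """Fraction of days classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; 0.0 when the denominator vanishes."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


# Invented labels: 10 seismic days followed by 200 aseismic days (1:20).
y_true = [1] * 10 + [0] * 200

# Hypothetical classifier A: always predicts "aseismic".
pred_always_aseismic = [0] * 210
# Hypothetical classifier B: detects 8 of 10 seismic days, 10 false alarms.
pred_detector = [1] * 8 + [0] * 2 + [1] * 10 + [0] * 190

always_aseismic = confusion_counts(y_true, pred_always_aseismic)
detector = confusion_counts(y_true, pred_detector)

for name, counts in [("always aseismic", always_aseismic),
                     ("detector", detector)]:
    print(f"{name}: accuracy={accuracy(*counts):.3f}, MCC={mcc(*counts):.3f}")
```

Classifier A scores the higher accuracy even though it never flags a seismic day, while its MCC is zero; classifier B loses a little accuracy to false alarms but is the only one with a clearly positive MCC. This is the sense in which accuracy can mislead when true negatives dominate.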

5,406 views

9 citations

Road network model during the golden window.

Original Research

17 May 2022

Resilience Assessment of Road Networks in the Extremely Severe Disaster Areas of the Wenchuan Earthquake

Meng Wei, 4 more and Li Yang

2,708 views

6 citations