- ¹School of Computer, Electronics and Information, Guangxi University, Nanning, China
- ²Center for Geodata and Analysis, Faculty of Geographical Science, Beijing Normal University, Beijing, China
- ³Institute of Automation, Chinese Academy of Sciences, Beijing, China
- ⁴International College, Guangxi University, Nanning, China
- ⁵The First Affiliated Hospital of Guangxi Medical University, Nanning, China
Epileptiform discharges are of fundamental importance in understanding the physiology of epilepsy. To aid in the clinical diagnosis, classification, prognosis, and treatment of epilepsy, it is important to develop automated computer programs to distinguish epileptiform discharges from normal electroencephalogram (EEG). This is a challenging task, as clinically used scalp EEG often contains substantial noise and motion artifacts. The challenge is even greater if one wishes to develop explainable rather than black-box approaches. To take on this challenge, we propose to use a multiscale complexity measure, the scale-dependent Lyapunov exponent (SDLE). We analyzed 640 multi-channel EEG segments, each 4 s long. Among these segments, 540 are short epileptiform discharges, and 100 are from healthy controls. We found that features from SDLE were very effective in distinguishing epileptiform discharges from normal EEG. Using a random forest (RF) classifier and a support vector machine (SVM), the proposed approach with different features from SDLE robustly achieves an accuracy exceeding 99% in distinguishing epileptiform discharges from normal controls. To better understand this high classification accuracy, we introduce a single parameter, the ratio of the spectral energy of the EEG signal to the SDLE, which quantifies the regularity or predictability of the EEG signal. This regularity is found to be considerably greater for epileptiform discharges than for normal controls. Because it robustly achieves high accuracy in distinguishing epileptiform discharges from normal controls irrespective of which classification scheme is used, the proposed approach has the potential to be used widely in a clinical setting.
1. Introduction
Epilepsy is a common disorder of the brain (Li et al., 2019). Approximately 8–10% of people will experience an epileptic seizure during their lifetime (Gavvala and Schuele, 2016). In adults, the risk of seizure recurrence within the 5 years following a new-onset or a second seizure is 35% and 75%, respectively (Gavvala and Schuele, 2016). These percentages are even higher in children, with a 50% risk of recurrence within the 5 years following a single unprovoked seizure, and 80% after two unprovoked seizures (Camfield and Camfield, 2015). In the United States in 2011, about 1.6 million seizure patients made emergency department visits; approximately 25% of these visits were for new-onset seizures (Gavvala and Schuele, 2016). The exact incidence of epileptic seizures in low-income and middle-income countries is unknown; however, it is speculated to exceed that in high-income countries (Ba-Diop et al., 2014).
Electroencephalography (EEG) provides a continuous measure of cortical function with excellent time resolution, and thus remains the primary diagnostic test of brain function, especially in those with epileptic seizures, even though new functional imaging procedures such as functional MRI (fMRI), single-photon emission computed tomography (SPECT), and positron emission tomography (PET) have been increasingly used for assessing anatomical changes in the brain. EEG is especially valuable in investigating patients with known or suspected seizures or encephalopathy. Seizures are, however, infrequent events in the majority of patients in an outpatient setting, making the recording of ictal EEG time-consuming and labor-intensive. So far, the mainstay of diagnosis remains the detection of interictal (i.e., between seizures) epileptiform discharges. Therefore, epileptiform discharges are of fundamental importance in understanding the physiology of epilepsy. To aid in the clinical diagnosis, classification, prognosis, and treatment of epilepsy, it is critical to develop automated computer programs to distinguish epileptiform discharges from normal EEG.
Many methods have been developed to study EEG. Simple but important features of EEG include the amplitude values (Toet et al., 2005) and the Power Spectral Density (PSD) (Gao et al., 2007). The wavelet transform is also a popular approach (Adeli et al., 2003; Subasi, 2007; Faust et al., 2015; Chen et al., 2017). Clinically, however, neurologists still rely heavily on visual examination of long continuous EEG signals. Unfortunately, this approach is time-consuming and prone to error due to human fatigue. This issue has motivated much effort to develop automated algorithms to detect epileptiform discharges or other features from EEG (Sharmila and Geethanjali, 2019). Notable works along this line use entropy (Nicolaou and Georgiou, 2012; Arunkumar et al., 2016, 2017) and complexity measures (Gao et al., 2011, 2012b; Martis et al., 2015; Medvedeva et al., 2016; Pratiher et al., 2016; Sikdar et al., 2018). The majority of the published works are, however, based on the electrocorticogram (ECoG), which is obtained invasively by directly attaching electrodes to the cerebral cortex (Wang et al., 2019). Clinically, the more widely available form of EEG is the non-invasive scalp EEG. Compared with ECoG, scalp EEG signals have much poorer signal-to-noise ratios (Haufe et al., 2018). Scalp EEG recordings also contain various kinds of artifacts (Islam et al., 2016; Brienza et al., 2019), including eye movements (e.g., blinks), muscle activities (e.g., swallowing, head movements), and the heartbeat (Kappel et al., 2017). This noise and these artifacts greatly exacerbate the difficulty of automatically distinguishing epileptiform discharges from normal controls.
Although machine learning based approaches (Mirowski et al., 2008; Shen et al., 2009; Antoniades et al., 2016; Kuswanto et al., 2017; Ullah et al., 2018; van Putten et al., 2018; Subasi et al., 2019) can partly solve some of these problems, overall, the problem remains largely open, especially with regard to the development of explainable, non-black-box approaches.
In this paper, we propose to use the scale-dependent Lyapunov exponent (SDLE) to develop a readily explainable approach to automatically distinguish epileptiform discharges from normal controls. SDLE is a multiscale complexity measure developed to unambiguously distinguish chaos from noise, and more fundamentally to automatically characterize the defining parameters/properties of complex data (Gao et al., 2006, 2007). SDLE stems from two important concepts, the time-dependent exponent curves (Gao and Zheng, 1993, 1994a,b; Gao, 1997) and the finite size Lyapunov exponent (Torcini et al., 1995; Aurell et al., 1996, 1997). SDLE was first introduced in Gao et al. (2006, 2007), and has been further developed in Gao et al. (2009, 2012a) and applied to characterize ECoG (Gao et al., 2011), HRV (Hu et al., 2009, 2010), financial time series (Gao et al., 2013), Earth's geodynamo (Ryan and Sarson, 2008), precipitation dynamics (Fan et al., 2013), sea clutter (Hu and Gao, 2013), THz imagery (Blasch et al., 2012), and randomness (Li et al., 2016). We will show that the proposed approach is very accurate in distinguishing epileptiform discharges from normal controls.
The remainder of the paper is organized as follows. In section 2, we briefly describe the EEG data and analysis methods. In section 3, we present analysis results. In section 4, we summarize our findings.
2. Materials and Methods
2.1. Data
The scalp EEG data analyzed here were clinically obtained at the First Affiliated Hospital of Guangxi Medical University. The studies involving human participants were reviewed and approved by the ethics committee of the First Affiliated Hospital of Guangxi Medical University. The participants provided their written informed consent to participate in this study. Fifty-nine epilepsy patients underwent 3-h video-EEG monitoring with 19-channel EEG recording, with electrodes placed on the scalp according to the international 10–20 system, at a 256 Hz sampling rate. The electrode impedances were kept below 10 kΩ. The 19 scalp electrodes were Fp1, Fp2, F7, F3, Fz, F4, F8, T3, C3, Cz, C4, T4, T5, P3, Pz, P4, T6, O1, and O2. Since the information yielded by an EEG channel is essentially the difference of electrical activity between two electrodes in the time domain (Pardey et al., 1996; Lopez et al., 2016), the amplitude, frequency, and synchronization of the brain waves and background will change (Seeck et al., 2017; Vanherpe and Schrooten, 2017), depending on which montage is chosen (e.g., earlobe reference, averaged reference, or bipolar; Christodoulakis et al., 2013; Geier and Lehnertz, 2017; Rana et al., 2017; Acharya and Acharya, 2019; Rios et al., 2019). In this work, we choose the widely used earlobe reference.
All epileptiform discharges were annotated by an experienced clinical neurophysiologist based on the average montage with an analog bandwidth of 0.1~70 Hz and a notch filter of 50 Hz. EEG signals were segmented into 4 s epochs, with each epoch assigned a random number. The collected epochs were converted into European Data Format (EDF) for further analysis. In total, there were 532 EEG recordings of epileptiform discharges and 100 healthy controls, each 4 s long, from all the participants. Among the 532 short epileptiform discharges, there were 69 spike waves, 82 sharp waves, 174 spike and slow wave complexes, 72 sharp and slow wave complexes, 64 polyspike complexes, 77 polyspike and slow wave complexes, and 2 spike rhythm discharges. Note that the numbers for these seven types of epileptiform discharges sum to 540, which is slightly larger than 532. The reason is that a few discharges were considered to belong simultaneously to more than one of the seven types. For convenience of reference, the definitions of these seven types are given below, with the number of cases analyzed for each type given in parentheses immediately after its name. Examples of their waveforms are shown in Figure 1.
• Spike wave (69): the most basic paroxysmal EEG activity, with a duration of 20~70 ms; amplitude varies but is typically >50 μV (Kane et al., 2017).
• Sharp wave (82): a transient wave similar to the spike and clearly distinguishable from background activity; its duration is 70~200 ms (5~14 Hz), its amplitude is between 100 and 200 μV, and its phase is usually negative.
• Spike and slow wave complex (174): pattern consisting of a spike followed by a slow wave (classically the slow wave being of higher amplitude than the spike); may be single or multiple (Kane et al., 2017).
• Sharp and slow wave complex (72): pattern consisting of a sharp wave followed by a slow wave (classically the slow wave being of higher amplitude than the sharp wave); may be single or multiple (Kane et al., 2017).
• Polyspike complex (64): a sequence of two or more spikes.
• Polyspike and slow wave complex (77): pattern with two or more spikes associated with one or more slow waves.
• Spike rhythm (2): a rare pattern of widespread 10~25 Hz spike rhythm outbreak, with an amplitude of 100~200 μV and the highest voltage in the frontal area, lasting more than 1 s.
Figure 1. Typical waveforms of the 7 major types of epileptiform EEG, where (A–G) denote spike wave, spike and slow wave complex, sharp wave, sharp and slow wave complex, polyspike complex, polyspike and slow wave complex, and spike rhythm discharges, respectively.
Recall that a few epileptiform discharge waveforms were considered to simultaneously belong to more than 1 of these 7 different epileptiform discharges. Because of this, we will not pursue the issue of further characterizing the differences among the 7 epileptiform discharges here.
2.2. Computation of Power Spectral Density (PSD)
The PSD of EEG can be readily obtained by taking the Fourier transform of the EEG signal, computing the square of the amplitude of the transform to obtain the power, and finally plotting the power against the frequency. In clinical applications, brain waves are often categorized into five bands: delta (0.5~3 Hz), theta (4~7 Hz), alpha (8~13 Hz), beta (14~30 Hz), and gamma (>30 Hz). To obtain the energy of these waves, one only needs to integrate the PSD curve over the respective band. In this work, we integrate the PSD curve over frequencies between 0.5 and 25 Hz for the 10 electrodes with the strongest signals, and then take the average.
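The band-energy computation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the one-sided periodogram normalization, the mean removal, and the reading of "strongest" as "largest band energy" are all assumptions, and the function names `band_energy` and `average_strongest` are hypothetical. A plain discrete Fourier transform is used so that the example depends only on the standard library.

```python
import cmath
import math


def band_energy(signal, fs, f_lo=0.5, f_hi=25.0):
    """Integrate a one-sided periodogram PSD estimate over [f_lo, f_hi] Hz.

    Assumption: the PSD is estimated as 2*|X_k|^2 / (fs*n), so that the
    integral over a band equals the signal variance carried by that band.
    """
    n = len(signal)
    mean = sum(signal) / n
    x = [s - mean for s in signal]      # remove the DC offset
    df = fs / n                         # frequency resolution in Hz
    energy = 0.0
    for k in range(1, n // 2):          # positive frequencies below Nyquist
        f = k * df
        if f < f_lo or f > f_hi:
            continue                    # skip bins outside the band
        coeff = sum(x[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))
        psd = 2.0 * abs(coeff) ** 2 / (fs * n)
        energy += psd * df              # rectangle-rule integration
    return energy


def average_strongest(channels, fs, k=10):
    """Average the band energy over the k channels with the largest energy.

    Assumption: "strongest" is interpreted here as largest band energy.
    """
    energies = sorted((band_energy(ch, fs) for ch in channels), reverse=True)
    top = energies[:k]
    return sum(top) / len(top)
```

For a 4 s epoch sampled at 256 Hz (1,024 points), the frequency resolution is 0.25 Hz, so the 0.5 to 25 Hz band spans 99 bins. A production analysis would more likely use an FFT routine, which gives the same result far faster.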
2.3. Computation of the SDLE
As with the estimation of PSD, for each subject, we picked the 10 strongest EEG signals from the 19 electrodes, computed the SDLE from each of the 10 EEG signals, and took the average.
To compute SDLE, we first need to reconstruct a phase space from the EEG signals. Denoting the signal as x(i), i = 1, ⋯ , n, we construct vectors
$$V_i = [x(i),\; x(i+L),\; \ldots,\; x(i+(m-1)L)], \qquad (1)$$
where m is called the embedding dimension and L the delay time. In practice, m and L have to be chosen properly. This is the issue of optimal embedding. For example, to reconstruct the phase plane of a harmonic oscillator from a sinusoidal signal, the optimal delay time is 1/4 of the period (Gao et al., 2007). Extensive work has been done to optimally determine m and L. Two of the most systematic and most extensively tested approaches are a statistical method called the false nearest neighbor method (Kennel et al., 1992) and a dynamical method based on time-dependent exponent curves developed by Gao and Zheng (1993, 1994a,b). The basic idea of the latter is to choose L in such a way that the motion in the reconstructed phase space is as uniform as possible (in the case of a harmonic oscillator, the reconstructed phase plane is an ellipse, which becomes a circle when L is 1/4 of the period; motion on the circle is the most uniform when compared with motions on ellipses). This is achieved by requiring that the divergence characterized by time-dependent exponent curves be a minimum when L is varied, and that the divergence not become much larger when m is further increased. This is the method that is employed here. For the EEG signals analyzed in this work, which were sampled at 256 Hz, we found that L = 1 is optimal. With a larger sampling frequency, L also has to be larger. For example, when the sampling frequency is 1,024 Hz, L then needs to be 4. As our EEG signal is not that long (4 s, or 1,024 points), we also found that m = 2 worked very well. After the phase space is reconstructed, we consider an ensemble of trajectories. We denote the initial separation between two nearby trajectories by ϵ0, and their average separation at time t and t+Δt by ϵt and ϵt+Δt, respectively. The trajectory separation is schematically shown in Figure 2. Note that ϵt+Δt is not necessarily larger than ϵt. We then examine the relation between ϵt and ϵt+Δt, where Δt is small.
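The phase-space reconstruction can be illustrated with a short sketch. The function name `delay_embed` is hypothetical, and indices are zero-based here, whereas the text counts samples from 1; the defaults m = 2 and L = 1 are the values reported above for the 256 Hz recordings.

```python
def delay_embed(x, m=2, L=1):
    """Build delay vectors V_i = (x[i], x[i+L], ..., x[i+(m-1)L]).

    m is the embedding dimension and L the delay in samples.
    """
    n_vectors = len(x) - (m - 1) * L
    if n_vectors <= 0:
        raise ValueError("signal too short for this choice of m and L")
    return [tuple(x[i + j * L] for j in range(m)) for i in range(n_vectors)]
```

With these defaults, a 1,024-point epoch yields 1,023 two-dimensional vectors, each pairing a sample with its immediate successor.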
When Δt → 0, we have
$$\epsilon_{t+\Delta t} = \epsilon_t\, e^{\lambda(\epsilon_t)\,\Delta t}, \qquad (2)$$
where λ(ϵt) is the SDLE given by
$$\lambda(\epsilon_t) = \frac{\ln \epsilon_{t+\Delta t} - \ln \epsilon_t}{\Delta t}. \qquad (3)$$
With the above definition, we can readily compute SDLE using the vectors defined by Equation (1). Specifically, we check whether pairs of vectors (Vi, Vj) satisfy the following inequality:
$$\epsilon_k \le \lVert V_i - V_j \rVert \le \epsilon_k + \Delta\epsilon_k, \qquad k = 1, 2, 3, \ldots, \qquad (4)$$
where ϵk and Δϵk are prescribed small distances. Geometrically, a pair of ϵk and Δϵk defines a shell, with the former being the diameter of the shell and the latter the thickness of the shell (which reduces to a ball with radius Δϵk when ϵk = 0; in the 2-D plane employed here, a ball is a circle described by (x − a)² + (y − b)² = r², where (a, b) is the center of the circle, and r is the radius). We then monitor the evolution of all such vector pairs (Vi, Vj) within a shell and take the ensemble average over the indices i, j. Since we are most interested in exponential or power-law functions, we assume that taking the logarithm and averaging can be exchanged; then Equation (3) can be written as
$$\lambda(\epsilon_t) = \frac{\left\langle \ln \lVert V_{i+t+\Delta t} - V_{j+t+\Delta t} \rVert - \ln \lVert V_{i+t} - V_{j+t} \rVert \right\rangle}{\Delta t}, \qquad (5)$$
where t and Δt are integers in units of the sampling time, and the angle brackets denote the average over indices i, j within a shell. Note that 〈||Vi+t+Δt−Vj+t+Δt||〉 and 〈||Vi+t−Vj+t||〉 amount to ϵt+Δt and ϵt, respectively. For EEG signals, the most relevant scaling law for SDLE is
$$\lambda(\epsilon) \sim -\gamma \ln \epsilon, \qquad (6)$$
where γ determines the speed of loss of information.
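The shell-based procedure can be sketched in a few lines. This is an illustrative reading of the steps described above rather than the authors' code: the function name `sdle_curve`, the minimum temporal separation `min_sep`, the horizon `t_max`, and the choice to keep only pairs that can be followed for the full horizon are all assumptions. The SDLE is returned in units of inverse samples; dividing by the sampling interval would convert it to inverse seconds.

```python
import math


def sdle_curve(vectors, eps, d_eps, t_max=10, dt=1, min_sep=1):
    """Return [(ln eps_t, lambda(eps_t)), ...] for one shell (eps, eps + d_eps).

    Pairs (i, j) whose initial distance falls inside the shell are followed
    for t_max steps; the SDLE is the local slope of the averaged
    log-distance curve, i.e. logarithm first, ensemble average second.
    """
    n = len(vectors)
    limit = n - t_max                   # keep pairs trackable up to t_max
    pairs = []
    for i in range(limit):
        for j in range(i + min_sep, limit):
            d0 = math.dist(vectors[i], vectors[j])
            if eps <= d0 <= eps + d_eps:
                pairs.append((i, j))
    if not pairs:
        return []

    log_eps = []                        # <ln ||V_{i+t} - V_{j+t}||> per t
    for t in range(t_max + 1):
        logs = []
        for i, j in pairs:
            d = math.dist(vectors[i + t], vectors[j + t])
            if d > 0.0:                 # the log of zero is undefined
                logs.append(math.log(d))
        if not logs:
            break
        log_eps.append(sum(logs) / len(logs))

    # Local slope between t and t + dt gives the SDLE at scale eps_t.
    return [(log_eps[t], (log_eps[t + dt] - log_eps[t]) / dt)
            for t in range(len(log_eps) - dt)]
```

Because the slope is taken locally, the curve remains well defined even when the averaged log-distance does not grow linearly with t. The double loop over pairs is quadratic in the number of vectors, which is acceptable for 1,024-point epochs but would need a neighbor-search structure for long recordings.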
Figure 2. A schematic showing two arbitrary trajectories in a general high-dimensional space, with the distance between them at time 0, t, and t+Δt being ϵ0, ϵt, and ϵt+Δt, respectively.
To make the computation of SDLE readily repeated by other researchers, and more importantly, to enable different researchers to readily compare their results, we recommend to use the size of the first shell by of the standard deviation of the EEG signal, and successive shells shrink by a factor of . Altogether, we used four shells, and then took the average of the four SDLE curves.
2.4. Random Forest Classifier (RF)
Random forest (RF) is an ensemble learning technique for classification (Cutler et al., 2012). It is relatively robust to overtraining, does not require normalization of the input data, and typically achieves high accuracy. It uses many separate classification trees. Each tree is grown from a separate bootstrap sample of the data set and classifies the data. A majority vote among the trees provides the final result.
The objective of the RF classifier used here is to classify which of the two classes an EEG signal belongs to: normal or epileptiform discharges. The inputs to the RF classifier are the PSD and a feature extracted from the SDLE curve. Following usual practice, we have randomly taken one-third of the total data as testing data and two-thirds of the data for training the model in this paper.
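The two-thirds/one-third partition can be sketched as below. The text only states that the split was random; drawing the test samples separately within each class (stratification), rounding the test count up, and the function name `stratified_split` are assumptions made for illustration. With 540 discharge segments and 100 controls, as given in the abstract, rounding up yields the 180 and 34 test cases reported in the Results.

```python
import random


def stratified_split(labels, denominator=3, seed=0):
    """Randomly hold out 1/denominator of each class as test data.

    Returns (train_indices, test_indices). The per-class test count is
    rounded up (an assumption made for this sketch).
    """
    rng = random.Random(seed)
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)

    train, test = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = (len(indices) + denominator - 1) // denominator  # ceiling
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return sorted(train), sorted(test)
```

The returned index lists can then be used to slice the feature matrix (PSD and the chosen SDLE feature) before fitting whichever classifier is being evaluated.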
2.5. Support Vector Machine (SVM)
Support Vector Machine (SVM) is a popular machine learning method for pattern classification (Cristianini and Shawe-Taylor, 2000). It has been widely used in biomedical applications. It aims to find a hyperplane in an N-dimensional space (N, the number of features) that maximizes the distance between two classes of points. Hyperplanes are decision boundaries that help classify the data points. Data points falling on either side of the hyperplane can be attributed to the two different classes. The dimension of the hyperplane depends upon the number of features. If the number of input features is 2, then the hyperplane is just a line. If the number of input features is 3, then the hyperplane is a two-dimensional plane. When the number of features exceeds 3, it becomes difficult to imagine the shape of the hyperplane, nevertheless, it can be readily computed.
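For reference, the maximum-margin idea can be written in its standard textbook form for linearly separable data. This is the generic hard-margin linear formulation, shown only as an illustration: the text does not specify the kernel or the regularization settings used in this study. Here the x_i are the N-dimensional feature vectors and the y_i ∈ {−1, +1} are the class labels.

```latex
\min_{w,\,b}\ \tfrac{1}{2}\lVert w \rVert^{2}
\quad \text{subject to} \quad
y_i \left( w^{\top} x_i + b \right) \ge 1, \qquad i = 1, \dots, n .
```

The separating hyperplane is w^T x + b = 0, and the distance between the two classes that it maximizes is 2/||w||. With two input features, such as PSD and one SDLE value, this hyperplane is a line in the scatter plot of the two features.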
2.6. Evaluation of Performance
The consistency between the diagnosis by the neurologists and machine classification needs to be quantified. This can be accomplished by computing the receiver operating characteristic (ROC) curve and various statistics derived from it. A good understanding of these metrics can be based on the confusion matrix, which is a table with two rows and two columns that reports the number of false positives (FP), false negatives (FN), true positives (TP), and true negatives (TN). From them we can define three major metrics:
$$\text{Sensitivity} = \frac{TP}{TP + FN}, \qquad \text{Specificity} = \frac{TN}{TN + FP}, \qquad \text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}.$$
Note that the sensitivity is also called true positive rate (TPR) and 1−specificity is also called false positive rate (FPR).
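As a small worked example, the three metrics follow directly from the four confusion-matrix counts. The function name `classification_metrics` is hypothetical, and treating the epileptiform discharge as the positive class is an assumption of this sketch.

```python
def classification_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity, and accuracy from confusion-matrix counts.

    Assumption: "positive" means epileptiform discharge.
    """
    return {
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # 1 - false positive rate
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }
```

For instance, the counts reported in section 3 for the algorithm using PSD and λ(ϵ1), namely 1 of 180 discharges and 1 of 34 controls misclassified, correspond to `classification_metrics(tp=179, fn=1, tn=33, fp=1)`, whose accuracy is 212/214, or about 99.07%.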
The ROC is a plot of TPR vs. FPR using different threshold values as a sweeping variable. The ROC is a good way to characterize imbalanced data sets, as it does not suffer from class imbalance. The area below the ROC is called the area under the curve (AUC). Its value ranges from 0 to 1. An AUC of 0.5 means that the classification model has no predictive ability at all. On the other hand, when the AUC reaches 1, the prediction ability is 100%. This is equivalent to the ROC being a unit step function.
3. Results
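The threshold sweep and the area computation can be sketched as follows. This is a generic illustration, not the evaluation code used in the study: the function names `roc_points` and `auc` are hypothetical, labels are assumed to be 1 for the positive class and 0 for the negative class, and the area is obtained with the trapezoidal rule.

```python
def roc_points(scores, labels):
    """Return ROC points (FPR, TPR), sweeping the threshold from high to low.

    Samples with tied scores are added together so that the curve does not
    depend on their order.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))
```

A classifier that ranks every positive above every negative traces the unit step described above and gives an AUC of 1, whereas scores that carry no information trace the diagonal and give 0.5.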
As mentioned, for each subject, to compute the SDLE curves, we chose the 10 strongest EEG signals from the 19 electrodes, computed the SDLE curves from each EEG signal, then took the average. For each EEG signal, we reconstructed a phase space with m = 2, L = 1, then computed 4 ln ϵt vs. t curves corresponding to 4 shells, with the diameter of the largest shell being of the standard deviation of the EEG signal, and successive shells shrinking by a factor of . Eight typical ln ϵt vs. t curves for epileptiform discharges and normal EEG corresponding to these four shells are shown in Figure 3. For simplicity, we call these error growth curves. Note that the classic algorithm for computing the Lyapunov exponent amounts to assuming $\epsilon_t \sim \epsilon_0 e^{\lambda_1 t}$, where λ1 is the largest positive Lyapunov exponent, and estimating λ1 by (ln ϵt−ln ϵ0)/t (Wolf et al., 1985). This clearly is inappropriate here, since ln ϵt does not increase linearly with t. In other words, small variations in EEG signals did not really grow exponentially. This difficulty is readily overcome with SDLE, since the SDLE is the local slope of such error growth curves, which is always well-defined. The SDLE curves corresponding to the error growth curves of Figure 3 are shown in Figure 4. There are 4 SDLE curves here, corresponding to the 4 shells chosen. The left-most curve corresponds to the smallest shell, while the right-most curve corresponds to the largest shell (they are often indistinguishable on larger scales). The most salient feature of these SDLE curves is the scaling behavior described by Equation (6).
Figure 3. Typical ln ϵt vs. t curves for epileptiform discharges and normal EEG, where the four curves correspond to four different shells, with the diameter of the largest shell being of the standard deviation of the EEG signal, and successive shells shrinking by a factor of . (A–H) illustrate the differences between the seven types of epileptiform discharges (spike wave, spike and slow wave complex, sharp wave, sharp and slow wave complex, polyspike complex, polyspike and slow wave complex, spike rhythm discharges) and normal EEG.
Figure 4. Typical λ(ϵ) vs. ln ϵ curves for epileptiform discharges and normal EEG. The four curves, shown in four different colors, correspond to the error growth curves shown in Figure 3. (A–H) illustrate the differences between the seven types of epileptiform discharges (spike wave, spike and slow wave complex, sharp wave, sharp and slow wave complex, polyspike complex, polyspike and slow wave complex, spike rhythm discharges) and normal EEG.
It would be desirable to combine the 4 SDLE curves into a single curve. The most rigorous way to estimate the SDLE at a specific scale ϵ* is to first interpolate each SDLE curve to that scale so that it has a value there, then average the 4 SDLE curves at ϵ* using the number of pairs of vectors in each shell as the weights. For simplicity, one could also first align the 4 SDLE curves with the left-most curve, and then simply take the arithmetic average (in cases where the 4 curves are indistinguishable, this alignment operation is unnecessary). To make the proposed method easier to reproduce, we adopted this simplified approach here. For the purpose of distinguishing epileptiform discharges from normal controls, we focused on three SDLEs λ(ϵ1), λ(ϵ2), and λ(ϵ3) at three specific scales ϵ1, ϵ2, and ϵ3, and their average (hereafter the average SDLE, denoted λ̄). The three scales ϵ1, ϵ2, and ϵ3 are specifically indicated in Figures 3A, 4A. These scales correspond to the smallest, intermediate, and boundary scales where the scaling law of Equation (6) holds.
To appreciate how well SDLEs can be used to distinguish epileptiform discharges from normal controls, we formed scatter plots with PSD and SDLEs, where PSD was obtained using the Fourier transform, as explained earlier. The scatter plots with PSD and λ(ϵ1), PSD and λ(ϵ2), and PSD and the average SDLE λ̄ are shown in Figures 5–7, respectively. We observe that in all three cases, the separation between all seven types of epileptiform discharges and the normal controls is excellent. Therefore, we can expect that the classification accuracy will be very high. Below, we specifically evaluate the performance of these three algorithms, which use PSD and λ(ϵ1), PSD and λ(ϵ2), and PSD and the average SDLE λ̄, respectively.
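The pair-count-weighted combination described above as the most rigorous option (which is not the simplified averaging adopted in this work) can be sketched as follows. The function names `interpolate` and `combined_sdle` are hypothetical, each curve is assumed to be a list of (ln ϵ, λ) points, and shells whose curve does not cover the requested scale are simply left out of the average, which is an additional assumption.

```python
def interpolate(curve, x):
    """Linearly interpolate a curve of (ln_eps, lam) points at abscissa x.

    The curve must be sorted by ln_eps; returns None outside its range.
    """
    for (x0, y0), (x1, y1) in zip(curve, curve[1:]):
        if x0 <= x <= x1:
            if x1 == x0:
                return (y0 + y1) / 2.0
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    return None


def combined_sdle(curves, pair_counts, ln_eps_star):
    """Weighted average of several SDLE curves at the scale ln_eps_star.

    pair_counts holds the number of vector pairs found in each shell and
    serves as the weight of the corresponding curve.
    """
    numerator = denominator = 0.0
    for curve, weight in zip(curves, pair_counts):
        lam = interpolate(sorted(curve), ln_eps_star)
        if lam is not None:             # skip shells that do not cover it
            numerator += weight * lam
            denominator += weight
    return numerator / denominator if denominator else None
```

Weighting by pair count gives more influence to shells whose ensemble average rests on more vector pairs, which is the motivation for preferring it over a plain arithmetic mean when the curves differ.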
Figure 5. Scatter plots with PSD and λ(ϵ1), where (A–G) illustrate the differences between the seven types of epileptiform discharges (spike wave, spike and slow wave complex, sharp wave, sharp and slow wave complex, polyspike complex, polyspike and slow wave complex, spike rhythm discharges) and normal EEG. These plots strongly suggest that the classification accuracy will be very high.
Figure 6. Scatter plots with PSD and λ(ϵ2), where (A–G) illustrate the differences between the seven types of epileptiform discharges (spike wave, spike and slow wave complex, sharp wave, sharp and slow wave complex, polyspike complex, polyspike and slow wave complex, spike rhythm discharges) and normal EEG. These plots strongly suggest that the classification accuracy will be very high.
Figure 7. Scatter plots with PSD and the average SDLE λ̄, where (A–G) illustrate the differences between the seven types of epileptiform discharges (spike wave, spike and slow wave complex, sharp wave, sharp and slow wave complex, polyspike complex, polyspike and slow wave complex, spike rhythm discharges) and normal EEG. These plots strongly suggest that the classification accuracy will be very high.
To compute the classification accuracy, we employed RF and SVM. We randomly took two-thirds of the data as the training data and the remaining one-third as the testing data. The class distribution of the samples in the training and testing data sets is summarized in Table 1. The test performance of the classifier can be determined by computing the metrics defined in section 2.6. The confusion matrix in Table 2 for Algorithm 1, which used PSD and λ(ϵ1), showed that both RF and SVM misclassified 1 of the 34 normal controls as an epileptiform discharge and 1 of the 180 epileptiform discharges as a normal control. Algorithm 2, which used PSD and λ(ϵ2), was even better with RF: it misclassified only 1 of the 180 epileptiform discharges as a normal control and made no other errors (with SVM, the classification accuracy remained the same as that for Algorithm 1). Algorithm 3, which used PSD and the average SDLE λ̄, was also excellent: with both RF and SVM, it misclassified only 1 of the 34 normal controls as an epileptiform discharge and made no other errors. These results are also summarized in Table 2. With these confusion matrices, we computed the sensitivity, specificity, and accuracy of the three algorithms, which are listed in Table 3. We find that all three algorithms are excellent, with accuracy exceeding 99% for both classification schemes, RF and SVM.
Table 2. Confusion matrix for the testing data of 180 epileptiform discharges and 34 normal controls: Algorithms 1, 2, 3 use PSD and λ(ϵ1), PSD and λ(ϵ2), and PSD and the average SDLE λ̄, respectively.
The strong performance of these methods is further corroborated by the unit-step-like ROC curves shown in Figure 8. To facilitate comparison of our algorithms with that of Anh-Dao et al. (2018), which achieved a high AUC of 0.945, we also list the AUC for the three algorithms proposed here in Table 3. The AUC of the three algorithms ranges from 0.9727 to 0.9980, and is therefore in every case considerably better than that of Anh-Dao et al. (2018).
Figure 8. The ROC curves for the testing data: (A–C) are for algorithms using PSD and λ(ϵ1), PSD and λ(ϵ2), and PSD and the average SDLE λ̄, respectively.
4. Conclusion and Discussion
In this paper, we have proposed to employ SDLE for distinguishing epileptiform discharges from normal EEGs, with the aim of being able to use them conveniently in a clinical setting. We found that SDLE computed from scalp EEG signals was mainly characterized by a scaling law described by Equation (6). When the scale parameters were confined to where this scaling law held, SDLE was very effective in distinguishing epileptiform discharges from normal EEG. Using RF and SVM, the proposed approach with different features from SDLE was found to robustly achieve an accuracy exceeding 99% in distinguishing epileptiform discharges from normal control ones.
Why is the choice of a concrete classification scheme such as RF or SVM not critical for the proposed approach to achieve high accuracy in distinguishing epileptiform discharges from normal controls? The reason has to be the excellent separations revealed by the scatter plots shown in Figures 5–7. To better understand the explainability of the proposed approach, we need to better understand the meaning of the SDLE. The definition of SDLE is equivalent to
$$\frac{d\epsilon_t}{dt} = \lambda(\epsilon_t)\,\epsilon_t .$$
Letting ϵTdb = 2ϵ0, we find the error doubling time Tdb given by
$$T_{db} = \int_{\epsilon_0}^{2\epsilon_0} \frac{d\epsilon}{\epsilon\,\lambda(\epsilon)} .$$
As a first approximation, we may consider 1/λ(ϵ) to be proportional to the error doubling time (Gao et al., 2009). This understanding motivates us to combine the two parameters PSD and SDLE into a single parameter such as PSD/λ(ϵ1). Since on average PSD is larger but λ(ϵ1) (as well as λ(ϵ2) and the average SDLE λ̄, as shown in Figures 5–7) is smaller for epileptiform discharges than for normal controls, we can expect that this ratio will on average be larger for epileptiform discharges. In fact, this ratio can be regarded as a measure of the regularity or predictability of EEG signals, since a large PSD stems from synchronized firing of neurons, while a small SDLE indicates slow divergence and thus considerable regularity and predictability.
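The link between 1/λ(ϵ) and the error doubling time can be made explicit with a one-line worked example. It rests on a simplifying assumption that is not part of the text: the SDLE is treated as approximately constant, λ(ϵ) ≈ λ₀, over the range from ϵ0 to 2ϵ0, so that the separation grows exponentially over that range.

```latex
\epsilon_{T_{db}} = \epsilon_0\, e^{\lambda_0 T_{db}} = 2\,\epsilon_0
\;\Longrightarrow\;
T_{db} = \frac{\ln 2}{\lambda_0},
\qquad \text{so that} \qquad
\frac{\mathrm{PSD}}{\lambda(\epsilon_1)} \;\propto\; \mathrm{PSD} \cdot T_{db} .
```

Under this approximation, a smaller SDLE means a longer doubling time, so the ratio grows both with stronger synchronized activity (larger PSD) and with slower divergence of nearby trajectories.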
Now the question is whether such a single parameter can effectively distinguish normal controls from epileptiform discharges. For this purpose, we have computed the probability density function (PDF) of PSD/λ(ϵ1) for the epileptiform discharges and the normal controls. The results are shown in Figure 9 as the red and the blue curves, respectively. The overlap of the blue and the red curves defines a right tail for the blue curve and a left tail for the red curve; the corresponding probabilities are 1.39 and 4.19%, as indicated in the plot. They correspond to the probability that a normal control may be misclassified as an epileptiform discharge, and vice versa. As the classification accuracy with the scheme based on a single parameter will not be higher than that based on two parameters, we can readily understand that the probabilities of 1.39 and 4.19% are the lower bounds that a normal control may be misclassified as an epileptiform discharge, and vice versa. This is surely consistent with the probabilities shown in Table 3 (the case for Algorithm 1). As these misclassification probabilities are very low, we can be confident that the proposed approach will be very accurate in distinguishing epileptiform discharges from normal controls, no matter what classification scheme is used.
Figure 9. The probability density function (PDF) of the ratio PSD/λ(ϵ1) for the epileptiform discharges (red curve) and normal controls (blue curve). The overlap of the blue and the red curves defines a right tail for the blue curve and a left tail for the red curve; the corresponding probabilities are 1.39 and 4.19%, as indicated in the plot.
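The two tail probabilities can be estimated empirically once a cut-off on the ratio has been chosen. The sketch below is an assumption-laden illustration: the text does not state how the PDFs were estimated or how the crossover point was located, so the `threshold` argument and the function name `tail_probabilities` are hypothetical, and simple sample fractions stand in for integrals of the estimated densities.

```python
def tail_probabilities(control_ratios, discharge_ratios, threshold):
    """Empirical overlap tails of the ratio PSD / lambda(eps_1).

    Returns (right_tail, left_tail): the fraction of normal controls at or
    above the threshold, and the fraction of epileptiform discharges below
    it. The threshold is a hypothetical cut-off chosen by the user.
    """
    right_tail = (sum(r >= threshold for r in control_ratios)
                  / len(control_ratios))
    left_tail = (sum(r < threshold for r in discharge_ratios)
                 / len(discharge_ratios))
    return right_tail, left_tail
```

The first value estimates how often a normal control would be taken for an epileptiform discharge by a single-parameter rule, and the second estimates the reverse error.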
It is interesting to note that if we choose SDLE corresponding to larger scales, such as ϵ3 indicated in Figures 3A, 4A, an algorithm based on PSD and λ(ϵ3) would be slightly worse than the three algorithms discussed here, but still slightly better than that of Anh-Dao et al. (2018). This suggests the importance of properly selecting the scale for analysis. On the other end, if we use a three parameter method, for example, using PSD, , and ϵ∞ (which characterizes the size of an attractor and amounts to the largest scale in Figure 3), then the accuracy in distinguishing epileptiform discharges from normal controls can be further improved to 100%. The reason is that ϵ∞ contains information independent of PSD and SDLE. However, we had not further pursed the issue of improving the accuracy here, since the high accuracy achieved by the easily explainable algorithms presented is already more than satisfying. Overall, our analysis highly suggests that the proposed approach is very promising to be used clinically.
It is worth noting that the epileptiform discharges analyzed here were provided by our collaborators at Guangxi Medical University in two batches. In the first batch, which was about 2/3 of the data analyzed here, the accuracy was similar to that reported here. Then another 1/3 of the data were given to us to further examine whether the accuracy remained as high, and it did. Nevertheless, the data analyzed here were still quite limited. It would be interesting and important to further validate the proposed approaches with more data in different clinical settings.
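The two-batch check described above can be sketched as a simple hold-out test: a decision rule is fixed on the first batch and then applied unchanged to the second. The sketch below uses a single-feature threshold rule on synthetic values, not the classifiers or the clinical data of this study. The helper names `fit_threshold` and `accuracy` are hypothetical, and the split of segments between the two batches is an assumption made for illustration.

```python
import random


def fit_threshold(values, labels):
    """Pick the cut with the fewest training errors (1 = discharge, 0 = control)."""
    def errors(t):
        return sum((v > t) != bool(y) for v, y in zip(values, labels))
    return min(sorted(values), key=errors)


def accuracy(values, labels, threshold):
    """Share of segments whose side of the threshold matches the label."""
    hits = sum((v > threshold) == bool(y) for v, y in zip(values, labels))
    return hits / len(values)


rng = random.Random(1)
# Synthetic feature values; NOT the clinical data.
labels = [1] * 540 + [0] * 100
values = [rng.gauss(4.0 if y else 0.0, 1.0) for y in labels]

order = list(range(len(labels)))
rng.shuffle(order)
cut = (2 * len(order)) // 3  # first batch: about 2/3 of the segments
first, second = order[:cut], order[cut:]

x1, y1 = [values[i] for i in first], [labels[i] for i in first]
x2, y2 = [values[i] for i in second], [labels[i] for i in second]

threshold = fit_threshold(x1, y1)       # rule fixed on the first batch only
acc_first = accuracy(x1, y1, threshold)
acc_second = accuracy(x2, y2, threshold)  # applied unchanged to the second batch
print(f"first batch: {acc_first:.3f}, second batch: {acc_second:.3f}")
```

If the accuracy on the second batch stays close to that on the first, the rule has not merely been tuned to the first batch.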
Brain activities involve spatially and temporally coordinated dynamics of numerous neurons in different regions of the brain, i.e., they involve numerous functional brain networks. To better characterize the synergistic effects among the brain networks, it is important to construct brain networks based on multi-channel EEG signals. Closely related to this network issue is the problem of inferring the localization of each type of epileptiform discharge, which is of great clinical importance. These issues have not been pursued in this work, which is a serious limitation of the current study. In the near future, we will examine these issues systematically, especially from the viewpoint of synthesizing network analysis with nonlinear analysis based on complexity science.
Data Availability Statement
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.
Ethics Statement
The scalp EEG data analyzed here were clinically obtained at the First Affiliated Hospital of Guangxi Medical University. The studies involving human participants were reviewed and approved by the ethics committee of the First Affiliated Hospital of Guangxi Medical University. The participants provided their written informed consent to participate in this study.
Author Contributions
QL performed most of the experimental work. QH and YW provided the data needed for this experiment and engaged in many discussions, together with BX. JG conceived the study, provided overall supervision, and directed all phases of the study, including the writing of the manuscript. All authors read and approved the final manuscript.
Conflict of Interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Acknowledgments
This research was supported by the National Natural Science Foundation of China under Grant Nos. 71661002 and 41671532 and by the Fundamental Research Funds for the Central Universities. It was also supported by the National Key Research and Development Program of China, grant number 2019AAA0103402. One of the authors (JG) also benefited tremendously from participating in the long program on culture analytics organized by the Institute for Pure and Applied Mathematics (IPAM) at UCLA, which was supported by the National Science Foundation.
References
Acharya, J. N., and Acharya, V. J. (2019). Overview of EEG montages and principles of localization. J. Clin. Neurophysiol. 36, 325–329. doi: 10.1097/WNP.0000000000000538
Adeli, H., Zhou, Z., and Dadmehr, N. (2003). Analysis of EEG records in an epileptic patient using wavelet transform. J. Neurosci. Methods 123, 69–87. doi: 10.1016/S0165-0270(02)00340-0
Anh-Dao, N. T., Linh-Trung, N., Van Nguyen, L., Tran-Duc, T., and Boashash, B. (2018). A multistage system for automatic detection of epileptic spikes. Rev J. Electron. Commun. 8, 1–12. doi: 10.21553/rev-jec.166
Antoniades, A., Spyrou, L., Took, C. C., and Sanei, S. (2016). “Deep learning for epileptic intracranial EEG data,” in Deep learning International Workshop on Machine Learning for Signal Processing (MLSP) (Vietri sul Mare), 1–6. doi: 10.1109/MLSP.2016.7738824
Arunkumar, N., Ram Kumar, K., and Venkataraman, V. (2016). Automatic detection of epileptic seizures using permutation entropy, Tsallis entropy and Kolmogorov complexity. J. Med. Imaging Health Inform. 6, 526–531. doi: 10.1166/jmihi.2016.1710
Arunkumar, N., Ramkumar, K., Venkatraman, V., Abdulhay, E., Fernandes, S. L., Kadry, S., et al. (2017). Classification of focal and non-focal EEG using entropies. Pattern Recogn. Lett. 94, 112–117. doi: 10.1016/j.patrec.2017.05.007
Aurell, E., Boffetta, G., Crisanti, A., Paladin, G., and Vulpiani, A. (1996). Growth of noninfinitesimal perturbations in turbulence. Phys. Rev. Lett. 77:1262. doi: 10.1103/PhysRevLett.77.1262
Aurell, E., Boffetta, G., Crisanti, A., Paladin, G., and Vulpiani, A. (1997). Predictability in the large: an extension of the concept of Lyapunov exponent. J. Phys. A 30:1. doi: 10.1088/0305-4470/30/1/003
Ba-Diop, A., Marin, B., Druet-Cabanac, M., Ngoungou, E., Newton, C., and Preux, P. (2014). Epidemiology, causes, and treatment of epilepsy in Sub-Saharan Africa. Lancet Neurol. 13, 1029–1044. doi: 10.1016/S1474-4422(14)70114-0
Blasch, E. P., Gao, J., and Tung, W.-W. (2012). “Chaos-based image assessment for THZ imagery,” in 2012 11th International Conference on Information Science, Signal Processing and Their Applications (ISSPA) (Montreal, QC), 360–365. doi: 10.1109/ISSPA.2012.6310576
Brienza, M., Davassi, C., and Mecarelli, O. (2019). “Artifacts,” in Clinical Electroencephalography, ed O. Mecarelli (Cham: Springer), 109–130. doi: 10.1007/978-3-030-04573-9_8
Camfield, P., and Camfield, C. (2015). Incidence, prevalence and aetiology of seizures and epilepsy in children. Epilept. Disord. 17, 117–123. doi: 10.1684/epd.2015.0736
Chen, D., Wan, S., Xiang, J., and Bao, F. S. (2017). A high-performance seizure detection algorithm based on discrete wavelet transform (DWT) and EEG. PLoS ONE 12:e0173138. doi: 10.1371/journal.pone.0173138
Christodoulakis, M., Hadjipapas, A., Papathanasiou, E. S., Anastasiadou, M., Papacostas, S. S., and Mitsis, G. D. (2013). “Graph-theoretic analysis of scalp EEG brain networks in epilepsy-the influence of montage and volume conduction,” in 13th IEEE International Conference on Bioinformatics and Bioengineering (Chania), 1–4. doi: 10.1109/BIBE.2013.6701572
Cristianini, N., and Shawe-Taylor, J. (2000). An Introduction to Support Vector Machines and Other Kernel-Based Learning Methods. Cambridge: Cambridge University Press. doi: 10.1017/CBO9780511801389
Cutler, A., Cutler, D. R., and Stevens, J. R. (2012). “Random forests,” in Ensemble Machine Learning, eds C. Zhang and Y. Q. Ma (Boston, MA: Springer), 157–175. doi: 10.1007/978-1-4419-9326-7_5
Fan, Q., Wang, Y., and Zhu, L. (2013). Complexity analysis of spatial-temporal precipitation system by PCA and SDLE. Appl. Math. Model. 37, 4059–4066. doi: 10.1016/j.apm.2012.09.009
Faust, O., Acharya, U. R., Adeli, H., and Adeli, A. (2015). Wavelet-based EEG processing for computer-aided seizure detection and epilepsy diagnosis. Seizure 26, 56–64. doi: 10.1016/j.seizure.2015.01.012
Gao, J. (1997). Recognizing randomness in a time series. Phys. D 106, 49–56. doi: 10.1016/S0167-2789(97)00024-9
Gao, J., Cao, Y., Tung, W.-w., and Hu, J. (2007). Multiscale Analysis of Complex Time Series: Integration of Chaos and Random Fractal Theory, and Beyond. Hoboken, NJ: John Wiley & Sons. doi: 10.1002/9780470191651
Gao, J., Hu, J., Mao, X., and Tung, W.-W. (2012a). Detecting low-dimensional chaos by the “noise titration” technique: possible problems and remedies. Chaos Solitons Fractals 45, 213–223. doi: 10.1016/j.chaos.2011.12.004
Gao, J., Hu, J., Tung, W., and Cao, Y. (2006). Distinguishing chaos from noise by scale-dependent Lyapunov exponent. Phys. Rev. E 74:066204. doi: 10.1103/PhysRevE.74.066204
Gao, J., Hu, J., and Tung, W.-W. (2011). Complexity measures of brain wave dynamics. Cogn. Neurodyn. 5, 171–182. doi: 10.1007/s11571-011-9151-3
Gao, J., Hu, J., and Tung, W.-w. (2012b). Entropy measures for biological signal analyses. Nonlin. Dyn. 68, 431–444. doi: 10.1007/s11071-011-0281-2
Gao, J., Hu, J., Tung, W.-W., and Zheng, Y. (2013). Multiscale analysis of economic time series by scale-dependent Lyapunov exponent. Quant. Fin. 13, 265–274. doi: 10.1080/14697688.2011.580774
Gao, J., Tung, W., and Hu, J. (2009). Quantifying dynamical predictability: the pseudo-ensemble approach. Chin. Ann. Math. Ser. B 30, 569–588. doi: 10.1007/s11401-009-0108-3
Gao, J., and Zheng, Z. (1993). Local exponential divergence plot and optimal embedding of a chaotic time series. Phys. Lett. A 181, 153–158. doi: 10.1016/0375-9601(93)90913-K
Gao, J., and Zheng, Z. (1994a). Direct dynamical test for deterministic chaos. Europhys. Lett. 25:485. doi: 10.1209/0295-5075/25/7/002
Gao, J., and Zheng, Z. (1994b). Direct dynamical test for deterministic chaos and optimal embedding of a chaotic time series. Phys. Rev. E 49:3807. doi: 10.1103/PhysRevE.49.3807
Gavvala, J., and Schuele, S. (2016). New-onset seizure in adults and adolescents: a review. JAMA 316, 2657–2668. doi: 10.1001/jama.2016.18625
Geier, C., and Lehnertz, K. (2017). Which brain regions are important for seizure dynamics in epileptic networks? Influence of link identification and EEG recording montage on node centralities. Int. J. Neural Syst. 27:1650033. doi: 10.1142/S0129065716500337
Haufe, S., DeGuzman, P., Henin, S., Arcaro, M., Honey, C. J., Hasson, U., et al. (2018). Reliability and correlation of fMRI, ECOG and EEG during natural stimulus processing. BioRxiv 2018, 207456. doi: 10.1101/207456
Hu, J., and Gao, J. (2013). Multiscale characterization of sea clutter by scale-dependent Lyapunov exponent. Math. Probl. Eng. 2013:584252. doi: 10.1155/2013/584252
Hu, J., Gao, J., and Tung, W.-W. (2009). Characterizing heart rate variability by scale-dependent Lyapunov exponent. Chaos 19:028506. doi: 10.1063/1.3152007
Hu, J., Gao, J., Tung, W.-W., and Cao, Y. (2010). Multiscale analysis of heart rate variability: a comparison of different complexity measures. Ann. Biomed. Eng. 38, 854–864. doi: 10.1007/s10439-009-9863-2
Islam, M. K., Rastegarnia, A., and Yang, Z. (2016). Methods for artifact detection and removal from scalp EEG: a review. Neurophysiol. Clin. 46, 287–305. doi: 10.1016/j.neucli.2016.07.002
Kane, N., Acharya, J., Beniczky, S., Caboclo, L., Finnigan, S., Kaplan, P. W., et al. (2017). A revised glossary of terms most commonly used by clinical electroencephalographers and updated proposal for the report format of the EEG findings. Revision 2017. Clin. Neurophysiol. Pract. 2:170. doi: 10.1016/j.cnp.2017.07.002
Kappel, S. L., Looney, D., Mandic, D. P., and Kidmose, P. (2017). Physiological artifacts in scalp EEG and ear-EEG. Biomed. Eng. Online 16:103. doi: 10.1186/s12938-017-0391-2
Kennel, M. B., Brown, R., and Abarbanel, H. D. (1992). Determining embedding dimension for phase-space reconstruction using a geometrical construction. Phys. Rev. A 45:3403. doi: 10.1103/PhysRevA.45.3403
Kuswanto, H., Salamah, M., and Fachruddin, M. I. (2017). Random forest classification and support vector machine for detecting epilepsy using electroencephalograph records. Am. J. Appl. Sci. 14, 533–539. doi: 10.3844/ajassp.2017.533.539
Li, F., Liang, Y., Zhang, L., Yi, C., Liao, Y., Jiang, Y., et al. (2019). Transition of brain networks from an interictal to a preictal state preceding a seizure revealed by scalp EEG network analysis. Cogn. Neurodyn. 13, 175–181. doi: 10.1007/s11571-018-09517-6
Li, X.-Z., Zhuang, J.-P., Li, S.-S., Gao, J.-B., and Chan, S.-C. (2016). Randomness evaluation for an optically injected chaotic semiconductor laser by attractor reconstruction. Phys. Rev. E 94:042214. doi: 10.1103/PhysRevE.94.042214
Lopez, S., Gross, A., Yang, S., Golmohammadi, M., Obeid, I., and Picone, J. (2016). “An analysis of two common reference points for EEGs,” in 2016 IEEE Signal Processing in Medicine and Biology Symposium (SPMB) (Philadelphia, PA), 1–5. doi: 10.1109/SPMB.2016.7846854
Martis, R. J., Tan, J. H., Chua, C. K., Loon, T. C., Yeo, S. W. J., and Tong, L. (2015). Epileptic EEG classification using nonlinear parameters on different frequency bands. J. Mech. Med. Biol. 15:1550040. doi: 10.1142/S0219519415500402
Medvedeva, T. M., Lüttjohann, A., van Luijtelaar, G., and Sysoev, I. V. (2016). “Evaluation of nonlinear properties of epileptic activity using largest Lyapunov exponent,” in Saratov Fall Meeting 2015: Third International Symposium on Optics and Biophotonics and Seventh Finnish-Russian Photonics and Laser Symposium (PALS), Vol. 9917 (Saratov: International Society for Optics and Photonics), 991724.
Mirowski, P. W., LeCun, Y., Madhavan, D., and Kuzniecky, R. (2008). “Comparing SVM and convolutional networks for epileptic seizure prediction from intracranial EEG,” in 2008 IEEE Workshop on Machine Learning for Signal Processing (Cancún), 244–249. doi: 10.1109/MLSP.2008.4685487
Nicolaou, N., and Georgiou, J. (2012). Detection of epileptic electroencephalogram based on permutation entropy and support vector machines. Expert Syst. Appl. 39, 202–209. doi: 10.1016/j.eswa.2011.07.008
Pardey, J., Roberts, S., and Tarassenko, L. (1996). A review of parametric modelling techniques for EEG analysis. Med. Eng. Phys. 18, 2–11. doi: 10.1016/1350-4533(95)00024-0
Pratiher, S., Patra, S., and Bhattacharya, P. (2016). “On the marriage of Kolmogorov complexity and multi-fractal parameters for epileptic seizure classification,” in 2016 2nd International Conference on Contemporary Computing and Informatics (IC3I) (Noida), 831–836. doi: 10.1109/IC3I.2016.7918797
Rana, A. Q., Ghouse, A. T., and Govindarajan, R. (2017). “Basics of electroencephalography (EEG),” in Neurophysiology in Clinical Practice, ed J. Renwick (Cham: Springer), 3–9. doi: 10.1007/978-3-319-39342-1_1
Rios, W. A., Olguín, P. V., Mena, D. A., Cabrera, M. C., Escalona, J., Garcia, A. M., et al. (2019). The influence of EEG references on the analysis of spatio-temporal interrelation patterns. Front. Neurosci. 13:941. doi: 10.3389/fnins.2019.00941
Ryan, D., and Sarson, G. (2008). The geodynamo as a low-dimensional deterministic system at the edge of chaos. Europhys. Lett. 83:49001. doi: 10.1209/0295-5075/83/49001
Seeck, M., Koessler, L., Bast, T., Leijten, F., Michel, C., Baumgartner, C., et al. (2017). The standardized EEG electrode array of the IFCN. Clin. Neurophysiol. 128, 2070–2077. doi: 10.1016/j.clinph.2017.06.254
Sharmila, A., and Geethanjali, P. (2019). A review on the pattern detection methods for epilepsy seizure detection from EEG signals. Biomed. Eng. 64, 507–517. doi: 10.1515/bmt-2017-0233
Shen, T.-W., Kuo, X., and Hsin, Y.-L. (2009). “Ant k-means clustering method on epileptic spike detection,” in 2009 Fifth International Conference on Natural Computation, Vol. 6 (Tianjin), 334–338. doi: 10.1109/ICNC.2009.639
Sikdar, D., Roy, R., and Mahadevappa, M. (2018). Epilepsy and seizure characterisation by multifractal analysis of EEG subbands. Biomed. Signal Process. Control 41, 264–270. doi: 10.1016/j.bspc.2017.12.006
Subasi, A. (2007). EEG signal classification using wavelet feature extraction and a mixture of expert model. Expert Syst. Appl. 32, 1084–1093. doi: 10.1016/j.eswa.2006.02.005
Subasi, A., Kevric, J., and Canbaz, M. A. (2019). Epileptic seizure detection using hybrid machine learning methods. Neural Comput. Appl. 31, 317–325. doi: 10.1007/s00521-017-3003-y
Toet, M. C., Groenendaal, F., Osredkar, D., van Huffelen, A. C., and de Vries, L. S. (2005). Postneonatal epilepsy following amplitude-integrated EEG-detected neonatal seizures. Pediatr. Neurol. 32, 241–247. doi: 10.1016/j.pediatrneurol.2004.11.005
Torcini, A., Grassberger, P., and Politi, A. (1995). Error propagation in extended chaotic systems. J. Phys. A 28:4533. doi: 10.1088/0305-4470/28/16/011
Ullah, I., Hussain, M., Qazi, E. H., and Aboalsamh, H. (2018). An automated system for epilepsy detection using EEG brain signals based on deep learning approach. Expert Syst. Appl. 107, 61–71. doi: 10.1016/j.eswa.2018.04.021
van Putten, M. J., de Carvalho, R., and Tjepkema-Cloostermans, M. C. (2018). F85. Deep learning for detection of epileptiform discharges from scalp EEG recordings. Clin. Neurophysiol. 129, e98–e99. doi: 10.1016/j.clinph.2018.04.248
Vanherpe, P., and Schrooten, M. (2017). Minimal EEG montage with high yield for the detection of status epilepticus in the setting of postanoxic brain damage. Acta Neurol. Belgica 117, 145–152. doi: 10.1007/s13760-016-0663-9
Wang, Q., Valdés-Hernández, P. A., Paz-Linares, D., Bosch-Bayard, J., Oosugi, N., Komatsu, M., et al. (2019). EECOG-comp: an open source platform for concurrent EEG/ECOG comparisons-applications to connectivity studies. Brain Topogr. 32, 1–19. doi: 10.1007/s10548-019-00708-w
Keywords: EEG, epileptiform discharges, power spectral density (PSD), scale-dependent Lyapunov exponent (SDLE), random forest classifier, support vector machine (SVM)
Citation: Li Q, Gao J, Huang Q, Wu Y and Xu B (2020) Distinguishing Epileptiform Discharges From Normal Electroencephalograms Using Scale-Dependent Lyapunov Exponent. Front. Bioeng. Biotechnol. 8:1006. doi: 10.3389/fbioe.2020.01006
Received: 16 April 2020; Accepted: 31 July 2020;
Published: 08 September 2020.
Edited by:
Francesco Rundo, STMicroelectronics (Italy), Italy
Reviewed by:
Nenad Filipovic, University of Kragujevac, Serbia
Michael Ming-Yuan Wei, Texas Commission on Environmental Quality, United States
Copyright © 2020 Li, Gao, Huang, Wu and Xu. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Jianbo Gao; Yuan Wu