- 1Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, China
- 2Brain Cognition and Brain-Computer Intelligence Integration Group, Kunming University of Science and Technology, Kunming, China
- 3Faculty of Science, Kunming University of Science and Technology, Kunming, China
- 4School of Information Engineering, Chinese People’s Armed Police Force Engineering University, Xi’an, China
Common spatial pattern (CSP) is an effective algorithm for extracting electroencephalogram (EEG) features of motor imagery (MI); however, CSP mainly aims at multichannel EEG signals, and its effect in extracting EEG features with fewer channels is poor—even worse than before using CSP. To solve the above problem, a new combined feature extraction method has been proposed in this study. For EEG signals from fewer channels (three channels), wavelet packet transform, fast ensemble empirical mode decomposition, and local mean decomposition were used to decompose the band-pass filtered EEG into multiple time–frequency components, and the corresponding components were selected according to the frequency characteristics of MI or the correlation coefficient between its time–frequency components and the original EEG signal. Furthermore, phase space reconstruction (PSR) was performed on the selected components after the three time-frequency decompositions, the maximum Lyapunov index was calculated, and the features were reconstructed; then, CSP projection mapping was used for the reconstructed features. The support vector machine probability output model was trained by the obtained three mappings. Probability outputs by three different support vector machines were then obtained. Finally, the classification of test samples was determined by the fusion of the Dempster–Shafer evidence theory at the decision level. The results showed that the accuracy of the proposed method was 95.71% on data set III of BCI competition II (left- and right-hand MI), which was 2.88% higher than the existing methods. On data set IIb of BCI competition IV, the average accuracy was 86.60%, which was 2.3% higher than the existing methods. This study verified the effectiveness of the proposed method and provided an approach for the research and development of the MI-BCI system based on fewer channels.
Introduction
Brain–computer interface (BCI) is a revolutionary human–computer interaction with the ability to improve the quality of life of medical patients, disabled people, and healthy people (Wolpaw et al., 2000; Cheng et al., 2002; Blankertz et al., 2010). Among various types of BCI, motor imagery (MI) is an important BCI paradigm with great potential to be applied in rehabilitation training for motor dysfunction (Attallah et al., 2020).
In MI-BCI, effectively decoding MI intent based on electroencephalogram (EEG) signals is an important problem (Chen et al., 2018; Wang et al., 2018; Jiao et al., 2019; Zhang et al., 2019). Common spatial pattern (CSP) is often used to extract features in most MI-BCI studies; this algorithm is recognized as an effective method to extract features of MI-EEG (Ang et al., 2008; Wang et al., 2017; Jin et al., 2021; Miao et al., 2021). However, CSP is mainly applicable to multichannel EEG signals and is not effective in extracting EEG features with fewer channels for identification. In addition, the application of multichannel MI-BCI is limited due to a large number of channels. In comparison, MI-BCI with fewer channels is more practical. For MI-BCI with fewer channels, how to effectively apply CSP to this kind of BCI is a problem worth exploring.
Although many improved CSP variants have achieved good results in multichannel MI-BCI, they cannot be used directly in few-channel MI-BCI. Moreover, with fewer channels, the lack of sufficient spatial information makes the classification accuracy of CSP features worse than that of other simple spatial filtering methods. To address this problem, Huang (2009) regarded multiple frequency bands of each channel as new channels and used CSP to extract features. This method effectively utilizes frequency domain information in the case of fewer channels, but frequency information cannot completely replace spatial information. Meng et al. (2013) used multiple coordinate delays of each channel time series to optimize a spatial filter and a higher order multiparameter FIR filter at the same time, thereby enhancing the distinguishability of MI-EEG signals with fewer channels. Nevertheless, the results of this method are closely related to the number of delay factors, which is generally large and therefore seriously increases the computational complexity. Yang et al. (2015) used phase space reconstruction (PSR) to decompose EEG with fewer channels into multichannel signals. The effect of PSR depends mainly on the delay parameters, whereas the choice of embedding dimension has less effect on its performance; because PSR is used directly to expand the channels, the reconstructed signal still suffers from insufficient spatial information. Guan and Duan (2019) put forward a spectral feature and transformation method based on multivariate empirical mode decomposition (MEMD), which has some advantages but suffers from high computational complexity and insufficient spatial information after decomposition and reconstruction. Xu et al. (2019) used wavelet transform and CNN to decode MI-EEG signals. Although this method can improve the classification accuracy of MI-BCI to a certain extent, it requires a large amount of training data and is not suitable for few-channel MI-BCI. Wang and He (2004) used a frequency decomposition and weighting synthesis strategy to recognize imagined right- and left-hand movements. Although this method has a certain effect on MI-BCI signal decoding, it is only suitable for multichannel MI-EEG decoding and cannot effectively solve the problem of insufficient spatial information in few-channel MI-EEG signals.
In view of the above problems of CSP applied to MI-BCI with fewer channels, this study proposed a combined feature extraction method called TFD-PSR-CSP. This method combines different time–frequency decomposition (TFD) methods (wavelet packet transform [WPT], fast ensemble empirical mode decomposition [FEEMD], and local mean decomposition [LMD]) with PSR to expand the EEG signals from fewer channels into multichannel EEG, from which features are then extracted by CSP. Afterward, considering the nonlinear and nonstationary characteristics of EEG signals, three different support vector machine (SVM) probability outputs were obtained by introducing a sigmoid function for probability mapping of SVM outputs. Finally, the Dempster–Shafer (D–S) evidence theory was used for decision-level fusion to determine the category of test samples. The remainder of this study is organized into “Materials and methods,” “Results,” “Discussion,” and “Conclusion” sections.
Materials and Methods
EEG Data Set and Preprocessing
EEG Data Set
To evaluate the performance of TFD-PSR-CSP and D–S evidence theory in the identification of MI intention with fewer channels, data set III of BCI competition II (denoted as data set 1) and data set IIb of BCI competition IV (denoted as data set 2) were adopted, which are briefly described as follows:
Data set 1: A healthy 25-year-old female subject performed left- and right-hand MI. Each trial lasted 9 s, and the 280 trials, all recorded on the same day, were divided into seven groups. Three bipolar EEG channels were recorded with electrodes placed anterior and posterior to C3, CZ, and C4. The EEG sampling frequency was 128 Hz, and band-pass filtering was performed at 0.5–30 Hz. The data set is split equally into training and testing sets of 140 trials each.
For additional details, see http://www.bbci.de/competition/ii/.
Data set 2: Nine healthy, right-handed subjects with normal or corrected-to-normal eyesight took part. Each subject completed five sessions of left- and right-hand MI experiments; the first two sessions had no feedback, and the last three had feedback. Three bipolar leads (C3, CZ, and C4) were recorded with a sampling frequency of 250 Hz, 0.5–100 Hz band-pass filtering, and 50 Hz notch filtering. Following previous studies (Bustios and Rosa, 2017; Zhang et al., 2019), only the third session data were used for evaluation, giving 160 trials with feedback per subject (80 for the right hand and 80 for the left hand). For additional details, see http://www.bbci.de/competition/iv/.
Pretreatment
In this study, for each trial of data set 1, the data segment of 0.5–3 s after the prompt was extracted; for each trial of data set 2, the data segment of 0–4 s after the prompt was extracted, and the prompt time was 0 s. Studies have shown that the frequency range of EEG-based MI-BCI signals is generally 8–30 Hz; so in this study, the original EEG data were filtered at 8–30 Hz by the 5th-order Butterworth band-pass filter.
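The band-pass step described above can be sketched as follows. This is a minimal sketch using SciPy, not the authors' implementation: the function name `bandpass_8_30` and the channels × samples array layout are assumptions, and zero-phase (forward-backward) filtering is also an assumption, since the text does not state how the filter was applied. The test signal is synthetic.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

def bandpass_8_30(eeg, fs, order=5, low=8.0, high=30.0):
    """Band-pass filter EEG of shape (channels, samples) along the time axis."""
    # Second-order sections are numerically safer than (b, a) coefficients.
    sos = butter(order, [low, high], btype="bandpass", fs=fs, output="sos")
    # Forward-backward filtering gives zero phase shift (assumed here).
    return sosfiltfilt(sos, eeg, axis=-1)

# Synthetic 3-channel, 4-s segment at 128 Hz (the data set 1 sampling rate).
fs = 128
t = np.arange(0, 4, 1 / fs)
slow = np.sin(2 * np.pi * 2 * t)    # 2 Hz, outside the pass band
mu = np.sin(2 * np.pi * 12 * t)     # 12 Hz, inside the pass band
eeg = np.tile(slow + mu, (3, 1))
filtered = bandpass_8_30(eeg, fs)
```

In this sketch the 2 Hz component is removed and the 12 Hz component passes almost unchanged, which is the behavior the 8–30 Hz preprocessing relies on.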
Method
General Idea and Fusion Algorithm Flow
The overall research idea and fusion algorithm flow of the proposed method are shown in Figure 1. First, the EEG data set was divided into a training set and a testing set, and TFD-PSR-CSP was used to extract features from the training data set. Specifically, WPT + PSR, FEEMD + PSR, and LMD + PSR were used for time–frequency decomposition and reconstruction, and features were extracted by CSP to construct feature vectors. Then, the SVM classifier was trained. A sigmoid function was introduced to map the classification results to probability, and three SVM probability output models were obtained through training.
After establishing the probability output model of SVM on the training set, TFD-PSR-CSP extracted features from any trial on the test set. Then, the probability output of each category of the test sample was obtained through the trained SVM probability output model. Finally, the classification result was obtained through D–S evidence theory fusion.
TFD-PSR-CSP Method and Its Specific Design
TFD-PSR-CSP first carried out TFD on the preprocessed EEG data to obtain effective time–frequency information, then carried out PSR to expand the EEG signal with few channels into multichannel EEG, and finally extracted features with good separability by CSP. The method and specific design are described below.
(1) Time–frequency decomposition (TFD)
In this experiment, three different TFD methods were used: WPT, FEEMD, and LMD.
(1) Wavelet packet transform (WPT)
WPT is an adaptive time–frequency localization analysis method whose time window and frequency window can be changed (Ting et al., 2008). The sampling frequencies of data set 1 and data set 2 in this study were 128 and 250 Hz, respectively. According to the Nyquist sampling theorem, the usable bandwidths are 64 and 125 Hz, and the best decomposition depths of WPT were 3 and 4 layers, respectively; each terminal node then spans about 8 Hz (64/8 Hz and 125/16 Hz). For an arbitrary trial on data set 1 and data set 2, WPT was performed on the EEG signals of the C3, C4, and CZ channels, and the db4 wavelet was used after comparing the results of each wavelet base.
Each node in the 3rd or 4th layer was reconstructed, the energy ratio in each frequency range was calculated, and the related node signal was selected as the object of subsequent processing. In this way, the original 3-channel (C3, CZ, and C4) signals were expanded into 12-channel EEG data after WPT. Figures 2A,B illustrates the EEG waveform and the energy ratio of the first five nodes in layer 3 after WPT of the C3 channel of the EEG signal in data set 1.
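The relation between sampling frequency, decomposition depth, and node bandwidth can be checked with a few lines of arithmetic. The sketch below assumes ideal band splitting and nodes listed in frequency order (wavelet packet libraries may return nodes in a different, natural order); the helper name `wpt_node_bands` is hypothetical.

```python
def wpt_node_bands(fs, level):
    """Nominal frequency band (Hz) of each terminal node, in frequency order."""
    nyquist = fs / 2
    width = nyquist / 2 ** level
    return [(k * width, (k + 1) * width) for k in range(2 ** level)]

bands_ds1 = wpt_node_bands(128, 3)   # data set 1: 8 nodes, each 8 Hz wide
bands_ds2 = wpt_node_bands(250, 4)   # data set 2: 16 nodes, each 7.8125 Hz wide

for k, (lo, hi) in enumerate(bands_ds1[:5]):
    print(f"layer-3 node {k}: {lo:.1f}-{hi:.1f} Hz")
```

With both depths, the 8–30 Hz MI band falls across a small number of low-order nodes, which is why only the first few nodes need to be examined.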
Figure 2. Time domain, frequency spectrum, and energy ratio diagram of each component of EEG signal after TFD. (A) EEG waveform of the first 5 nodes of layer 3 after WPT; (B) EEG energy ratio diagrams of the first 5 nodes of layer 3 after WPT; (C) Spectrum diagram of each IMF component after FEEMD; (D) Spectrum diagram of each PF component after LMD.
(2) Fast ensemble empirical mode decomposition (FEEMD)
Fast ensemble empirical mode decomposition is suitable for processing nonlinear and nonstationary signals, which can effectively solve the mode aliasing problem in empirical mode decomposition (EMD) and further reduce the computational complexity (Wang et al., 2014; Chen and You, 2017; Dai et al., 2021). In this experiment, FEEMD was used to decompose the preprocessed EEG data in time and frequency, which mainly involved setting two parameters, namely, the amplitude coefficient of the white noise sequence and the number of times EMD was executed.
For convenience, taking the EEG signal of the left-hand MI C3 channel of data set 1 as an example, the amplitude coefficient of the white noise sequence was determined to be 0.2 by the grid search method, and EMD was executed 40 times. Each channel to be decomposed was divided into 10 intrinsic mode function (IMF) components. It can be seen from the spectrogram of each component in Figure 2C that the frequency ranges of IMF1, IMF2, IMF3, and IMF4 were 8–30, 8–16, 0–12, and 0–6 Hz, respectively, and the amplitude of each component decreased with the increase of decomposition number. Comparing the frequency bands of these four components and considering the frequency range of MI-EEG, the first three components (IMF1–IMF3) were selected to form a new EEG signal as the object of subsequent processing. In this way, the original 3-channel (C3, CZ, and C4) signals were expanded into 9-channel EEG data after passing through FEEMD.
(3) Local mean decomposition (LMD)
Local mean decomposition is an adaptive time–frequency analysis method for nonstationary signals proposed by Smith (Smith, 2005; Liu et al., 2018). The original signal x(t) can be expressed as the sum of the product functions PF_i(t) and the residual u_n(t), as follows:

$$x(t) = \sum_{i=1}^{n} PF_i(t) + u_n(t) \tag{1}$$
For convenience, taking the EEG signal of channel C3 of data set 1 as an example, first, the preprocessed EEG signal is decomposed into several PF components by LMD, as shown in Figure 2D. In Figure 2D, the frequency ranges of PF1, PF2, and PF3 are 0–30, 0–16, and 0–7 Hz, respectively, and the frequency ranges of PF4 and PF5 are significantly lower than 8 Hz. Considering the frequency range of MI-EEG and calculating the correlation coefficient between EEG data before LMD and PF components after LMD, the correlation coefficients with PF1, PF2, and PF3 are 0.9194, 0.2420, and 0.0015, respectively. Therefore, PF1 and PF2 are selected to form a new EEG signal as the subsequent processing object. In this way, the original 3-channel (C3, CZ, and C4) signals are expanded into 6-channel EEG data after LMD.
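The correlation-based selection of components can be illustrated with a short sketch. The components below are synthetic sinusoids standing in for PF components, not real LMD output, and the threshold of 0.1 is a hypothetical value chosen only because it separates the reported coefficients (0.9194 and 0.2420 versus 0.0015); the text does not state a threshold.

```python
import numpy as np

def select_by_correlation(signal, components, threshold=0.1):
    """Keep components whose |Pearson r| with the original signal exceeds threshold."""
    r = [float(np.corrcoef(signal, c)[0, 1]) for c in components]
    keep = [i for i, ri in enumerate(r) if abs(ri) > threshold]
    return keep, r

# Synthetic stand-ins for PF1-PF3 over a 4-s window sampled at 128 Hz.
fs = 128
t = np.arange(4 * fs) / fs
pf1 = np.sin(2 * np.pi * 20 * t)
pf2 = 0.25 * np.sin(2 * np.pi * 10 * t)
pf3 = 0.01 * np.sin(2 * np.pi * 3 * t)
signal = pf1 + pf2                     # pf3 is deliberately not part of the mixture
keep, r = select_by_correlation(signal, [pf1, pf2, pf3])
```

Here the first two components are retained and the third is discarded, mirroring the PF1/PF2 selection described above.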
The numbers of expanded channels of the original EEG data after WPT, FEEMD, and LMD are 12, 9, and 6, respectively. The following PSR is used to further expand the spatial information.
(2) Phase space reconstruction (PSR)
Electroencephalogram signals have chaotic characteristics. PSR is an effective method for analyzing chaotic time series, and studies have shown that it is suitable for extracting MI-EEG features (Meng et al., 2013). Takens (1981) put forward the embedding theorem and used the delayed coordinate method to analyze and reconstruct a chaotic time series x = {x(i) | i = 1, 2, …, N}. The resulting reconstructed signal y(i) is

$$y(i) = [x(i),\, x(i+\tau),\, \ldots,\, x(i+(d-1)\tau)], \quad i = 1, 2, \ldots, N-(d-1)\tau \tag{2}$$
where τ is time delay, and d is embedded dimension. In this study, the C-C method proposed by Kim et al. (1999) is used to select the values of τ and d.
To clearly explain how WPT + PSR, FEEMD + PSR, and LMD + PSR are formed, the original 3-channel MI-EEG data were reconstructed by WPT, FEEMD, and LMD, and the number of channels was expanded to 12, 9, and 6 channels, respectively. Although these three TFD methods expanded the spatial information of EEG with few channels, the expansion is limited. To further expand the spatial information of EEG, we use the C-C method proposed by Kim et al. based on the idea of the embedded window method to calculate the time delay τ and embedded dimension d of the above three kinds of TFD EEG data, and then we use the searching method to select the best parameter values. Then, based on the optimal parameter values and the actual computational complexity, the embedding dimensions of WPT, FEEMD, and LMD processed data were selected as 2, 2, and 3, respectively, by PSR. In this way, the EEG data after three different TFDs were further processed by PSR, and the number of channels was extended to 24, 18, and 18 channels, respectively.
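The channel expansion by PSR can be sketched with a delay embedding applied to every channel. In this sketch the function names are hypothetical, the delay τ = 2 is a placeholder (the study selects τ with the C-C method), and random numbers stand in for the decomposed EEG; only the embedding dimensions (2, 2, and 3) and channel counts come from the text.

```python
import numpy as np

def delay_embed(x, tau, d):
    """Delay-coordinate embedding of a 1-D series: rows are x(i), x(i+tau), ..."""
    x = np.asarray(x)
    n_points = x.shape[0] - (d - 1) * tau
    if n_points <= 0:
        raise ValueError("series too short for this tau and d")
    return np.stack([x[k * tau : k * tau + n_points] for k in range(d)])

def expand_channels(eeg, tau, d):
    """Embed every channel: (C, T) -> (C * d, T - (d - 1) * tau)."""
    return np.concatenate([delay_embed(ch, tau, d) for ch in eeg], axis=0)

rng = np.random.default_rng(0)
wpt_like = rng.standard_normal((12, 320))     # 12 channels after WPT
feemd_like = rng.standard_normal((9, 320))    # 9 channels after FEEMD
lmd_like = rng.standard_normal((6, 320))      # 6 channels after LMD

print(expand_channels(wpt_like, tau=2, d=2).shape)    # (24, 318)
print(expand_channels(feemd_like, tau=2, d=2).shape)  # (18, 318)
print(expand_channels(lmd_like, tau=2, d=3).shape)    # (18, 316)
```

Each original channel contributes d delayed copies, so the channel counts become 12 × 2 = 24, 9 × 2 = 18, and 6 × 3 = 18.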
(3) Common spatial pattern (CSP) algorithm
After the original 3-channel EEG signal was expanded into a multichannel signal by TFD-PSR, CSP was applied to extract features.
The main idea of the CSP algorithm is to diagonalize the covariance matrices of the two classes simultaneously, i.e., to maximize the variance of one class while minimizing that of the other, so as to find a set of optimal spatial filters and obtain the feature vector with the best separability by projection. Before the feature extraction algorithm is applied, band-pass filtering and centering are generally required for the data, so the spatial covariance matrices of the two classes are expressed as:

$$C_k = \frac{1}{n_k} \sum_{i=1}^{n_k} X_{i,k} X_{i,k}^{T}, \quad k \in \{1, 2\} \tag{3}$$
where $X_{i,1}, X_{i,2} \in \mathbb{R}^{N \times T}$ represent the EEG data of the two classes obtained from the i-th trial, N represents the number of channels, T represents the number of sampling points, and $n_1$ and $n_2$ represent the numbers of trials of the two classes.
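The mathematical principle of the CSP algorithm is to maximize the following objective function (Lotte and Guan, 2011), in which $C_1$ and $C_2$ denote the spatial covariance matrices of the two classes:

$$J(w) = \frac{w^{T} C_1 w}{w^{T} C_2 w} \tag{4}$$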
where w is the spatial filter.
The optimization of this Rayleigh quotient can be transformed into a generalized eigenvalue problem for the two class covariance matrices $C_1$ and $C_2$, whose solution gives the optimal spatial filter:

$$C_1 w = \lambda C_2 w \tag{5}$$
where λ is a generalized eigenvalue, and w is a generalized eigenvector. The eigenvalue λ is used to calculate the variance ratio between the two classes. The spatial filter wcsp is composed of n eigenvectors corresponding to maximum and minimum eigenvalues.
Finally, the original signal $X_i$ is passed through the spatial filter $w_{csp}$ to obtain the projection signal $Z_i$:

$$Z_i = w_{csp}^{T} X_i \tag{6}$$
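The CSP steps described in this subsection can be sketched in NumPy as follows. This is an illustrative sketch rather than the authors' code: the function names are hypothetical, the covariance is taken as the trial average of X Xᵀ without trace normalization (an assumption), the normalized log-variance feature is the customary CSP feature and is assumed here because the text does not specify one, and the data are synthetic.

```python
import numpy as np

def class_covariance(trials):
    """Average spatial covariance over trials of shape (n_trials, N, T)."""
    centered = trials - trials.mean(axis=2, keepdims=True)
    return np.mean([x @ x.T for x in centered], axis=0)

def csp_filters(c1, c2, n_pairs=1):
    """Solve C1 w = lambda C2 w and keep filters from both ends of the spectrum."""
    # Whitening with the Cholesky factor of C2 turns the generalized problem
    # into an ordinary symmetric one (C2 must be positive definite).
    chol = np.linalg.cholesky(c2)
    chol_inv = np.linalg.inv(chol)
    evals, vecs = np.linalg.eigh(chol_inv @ c1 @ chol_inv.T)  # ascending order
    w_all = chol_inv.T @ vecs
    idx = np.r_[np.arange(n_pairs), np.arange(len(evals) - n_pairs, len(evals))]
    return w_all[:, idx], evals[idx]

def log_variance_features(w, trial):
    """Project one trial (N, T) and return normalized log-variance features."""
    z = w.T @ trial
    var = z.var(axis=1)
    return np.log(var / var.sum())

# Synthetic two-class data: class 1 is strong on channel 0, class 2 on channel 5.
rng = np.random.default_rng(1)
n_ch, n_t = 6, 256
scale = np.ones(n_ch)
scale[0] = 3.0
class1 = rng.standard_normal((20, n_ch, n_t)) * scale[:, None]
class2 = rng.standard_normal((20, n_ch, n_t)) * scale[::-1][:, None]

c1, c2 = class_covariance(class1), class_covariance(class2)
w_csp, lambdas = csp_filters(c1, c2, n_pairs=1)
features = log_variance_features(w_csp, class1[0])
```

The filter associated with the largest eigenvalue emphasizes the channel on which class 1 has high variance, and the filter associated with the smallest eigenvalue does the same for class 2, so the two features separate the classes.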
D–S Evidence Theory
After features are extracted by TFD-PSR-CSP, the sigmoid function is used to map SVM outputs to obtain three different SVM probability outputs, and D–S evidence theory is used to determine the classification of the final test samples by decision-level fusion.
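The sigmoid mapping of SVM outputs to probabilities can be sketched as below, following the general form of Platt scaling. The parameter values `A` and `B` are hypothetical (in practice they are fitted on training data), and treating the positive class as {L} is an assumption made only for this illustration.

```python
import math

def platt_probability(decision_value, a, b):
    """P(positive class | f) = 1 / (1 + exp(a * f + b)), computed stably."""
    z = a * decision_value + b
    if z >= 0:
        e = math.exp(-z)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(z))

# Hypothetical fitted parameters; a is negative so that larger decision
# values map to larger probabilities of the positive class.
A, B = -2.0, 0.0
p_left = platt_probability(0.8, A, B)   # SVM decision value f = 0.8
p_right = 1.0 - p_left
```

A decision value of zero maps to a probability of 0.5 when `B` is zero, and values far from the separating hyperplane map to probabilities close to 0 or 1.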
Dempster–Shafer evidence theory is a decision-making method for uncertain reasoning proposed and improved by Dempster and Shafer (Dempster, 1967; Shafer, 1976; Jin et al., 2021). It combines multiple pieces of evidence by rules to make decisions under a known identification framework to obtain higher correct recognition and reliability (Mathon et al., 2010).
To use the D–S evidence theory for decision-level fusion in an MI-EEG signal classification problem, this experiment introduces the sigmoid function (Platt, 1999) to map the SVM output for an EEG sample to the probability that the sample belongs to the left-hand MI category {L} or the right-hand MI category {R}. The recognition framework is defined as follows:

$$\Theta = \{L, R\} \tag{7}$$

The power set $2^{\Theta}$ of the recognition framework is as follows:

$$2^{\Theta} = \{\varnothing, \{L\}, \{R\}, \Theta\} \tag{8}$$
where ∅ is an empty set. According to the trained SVM probability output model, the corresponding probability P{L} or P{R} of the test sample category {L} or {R} is obtained, and then the basic probability assignment (BPA) value of each focal element under the recognition framework is calculated by formulas (9)–(11).
where S({L}) and S({R}) represent the BPA values of categories {L} and {R}, respectively, S(Θ) represents the BPA values of the identification framework, Nsv represents the number of support vectors in the SVM probabilistic output model, and l represents the total number of samples.
After obtaining the BPA value of each focal element under the identification framework, the BPA values of each focal element are fused by the D–S evidence theory combination rules (12) and (13). For any A ⊆ Θ, the combination rules for n mass functions on the recognition framework Θ are as follows:
In these formulas, K is the normalization factor and indicates the conflict degree of evidence. The mass function indicates the trust degree of each category after classification, and the trust level of category Ai is m(Ai). When expressed by probability, m(Ai) represents the probability of class A in the i-th probability output model, and Ai represents class A in the i-th probability output model.
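Dempster's rule of combination on the two-class framework can be sketched in plain Python. This sketch uses the common convention in which the conflict is the total mass falling on empty intersections and the fused masses are divided by one minus that conflict; whether K in the formulas above is defined as the conflict or as its complement is not assumed here. The input BPAs are hypothetical.

```python
from itertools import product

L, R = frozenset({"L"}), frozenset({"R"})
THETA = L | R

def dempster_combine(*masses):
    """Combine BPAs (dicts mapping frozenset -> mass) with Dempster's rule."""
    fused, conflict = {}, 0.0
    for combo in product(*(m.items() for m in masses)):
        focal, weight = THETA, 1.0
        for subset, mass in combo:
            focal = focal & subset
            weight *= mass
        if focal:
            fused[focal] = fused.get(focal, 0.0) + weight
        else:
            conflict += weight     # mass assigned to contradictory evidence
    if conflict >= 1.0:
        raise ValueError("total conflict: evidence cannot be combined")
    return {a: v / (1.0 - conflict) for a, v in fused.items()}, conflict

# Hypothetical BPAs from two probability output models.
m1 = {L: 0.6, R: 0.3, THETA: 0.1}
m2 = {L: 0.5, R: 0.4, THETA: 0.1}
fused, conflict = dempster_combine(m1, m2)
```

In this example the fused mass on Θ is smaller than in either input, which is the reduction of uncertainty that decision-level fusion aims for; the same function accepts three BPAs for the three probability output models.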
After obtaining the fused BPA values Sf({L}), Sf({R}), and Sf({Θ}), the classification of the final test samples is determined according to the following decision rules:
Decision rules: A test sample is assigned to class Ω, Ω ∈ {{L},{R}}, if the following conditions are satisfied:
In this experiment, ε1 = 0, ε2 = 0.1.
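A decision step of this kind can be sketched as follows. The exact conditions are an assumption: the sketch uses a commonly used form in which the winning class must have the largest fused BPA, its margin over the other class must exceed ε1, and the fused uncertainty S(Θ) must stay below ε2. The function name is hypothetical; the example values are the fused BPAs reported for test sample 2 in Table 4.

```python
def decide(fused, eps1=0.0, eps2=0.1):
    """Return 'L' or 'R', or None when the evidence is too uncertain to decide."""
    s_l, s_r, s_theta = fused["L"], fused["R"], fused["Theta"]
    winner, margin = ("L", s_l - s_r) if s_l >= s_r else ("R", s_r - s_l)
    # Assumed conditions: sufficient margin and sufficiently small uncertainty.
    if margin > eps1 and s_theta < eps2:
        return winner
    return None

label = decide({"L": 0.5408, "R": 0.4241, "Theta": 0.0351})
print(label)   # L
```

With ε1 = 0 the margin condition only rules out exact ties, so ε2 acts as the effective rejection threshold.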
Results
Classification Accuracy of TFD-PSR-CSP Combined With D–S Evidence Theory
The last column in Table 1 shows the classification accuracy of the proposed method on data set 1 and data set 2, which are 95.71 and 86.60 ± 11.14%, respectively.
Table 1. 5 × 5 cross-validation classification accuracy of proposed methods and different methods using BCI competition II data set III and competition IV data set IIb.
Figure 3 shows the classification accuracy of the proposed method, traditional CSP, and different time–frequency decomposition and reconstruction methods. The results show that TFD and PSR achieve better classification accuracy than traditional CSP methods by effectively combining with CSP. At the same time, the proposed method in this study achieves the highest classification accuracy in all subjects.
Kappa Value of TFD-PSR-CSP Combined With D–S Evidence Theory
In the last column of Table 2, the kappa values of the proposed method in data set 1 and data set 2 are given, which are 0.91 and 0.74 ± 0.229, respectively.
Table 2. 5 × 5 cross-validation kappa value of proposed methods and different methods using BCI competition II data set III and competition IV data set IIb.
Figure 4 shows the kappa value of the proposed method, traditional CSP, and different time–frequency decomposition and reconstruction methods. The results show that, except for the FDM method, the other methods obtain higher kappa values than traditional CSP. At the same time, the proposed method in this study achieves the highest kappa value in all subjects.
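The relation between accuracy and kappa can be made concrete with a short calculation. Cohen's kappa is (p_o − p_e)/(1 − p_e), where p_o is the observed agreement and p_e the chance agreement; for two balanced classes p_e = 0.5, so kappa reduces to 2 × accuracy − 1. The confusion matrix below is hypothetical: it only illustrates 140 balanced trials with 134 classified correctly, which corresponds to an accuracy of 95.71%.

```python
def cohen_kappa(confusion):
    """Cohen's kappa from a square confusion matrix (rows: true, columns: predicted)."""
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    observed = sum(confusion[i][i] for i in range(n)) / total
    expected = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion) for i in range(n)
    ) / total ** 2
    return (observed - expected) / (1.0 - expected)

# Hypothetical split of the 6 errors between the two classes.
kappa = cohen_kappa([[67, 3], [3, 67]])
print(round(kappa, 2))   # 0.91
```

Because the true classes are balanced, the result does not depend on how the six errors are split between the two classes.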
Discussion
To make better use of the D–S evidence theory for decision-level fusion to get the final test sample category, this study employs three different TFD methods with PSR-CSP + SVM training to get three different probability output models. The following compares the proposed method with traditional CSP and different time–frequency decomposition reconstruction methods and also with other existing methods.
Comparison of TFD-PSR-CSP and D-S Evidence Theory With Traditional CSP and Different Time–Frequency Decomposition Reconstruction Methods
Classification Accuracy
It can be seen from Table 1 that, compared with the traditional CSP, the proposed method improved the classification accuracy on data set 1 from 84.29 to 95.71%. In addition, applying PSR before extracting features from EEG data by CSP increased the accuracy from 84.29 to 88.57%, which indicated the effectiveness of PSR in the intention identification of MI with fewer channels. TFD not only expanded the number of channels of original EEG data but also achieved better classification accuracy when combined with CSP than traditional CSP; this also showed the effectiveness of TFD in the intention identification of MI with fewer channels.
It can also be seen from Table 1 that on data set 2, the average classification accuracy of the method proposed in this study changed from 78.52 ± 12.31 to 86.60 ± 11.14% compared with FDM (direct fusion of features extracted from three different TFDs combined with PSR-CSP). In addition, the classification accuracy of traditional CSP and other methods was tested by the paired t-test, and the P-values of other methods were far less than 0.005, except for FDM (P-value is 0.0547). The above showed that the fusion of D–S evidence theory after extracting features by different TFDs combined with PSR-CSP was better than FDM.
Kappa Value
It can be seen from Table 2 that the kappa value or mean value of the proposed method was increased by 0.22 and 0.21, respectively, compared with the traditional CSP on data sets 1 and 2. In addition, when TFD and PSR were used before CSP feature extraction to expand the number of original EEG data channels and thus increase the amount of spatial information, the kappa value or mean value was improved. This showed the effectiveness of TFD and PSR in the intention identification of MI with fewer channels. Compared with FDM, the kappa values of data sets 1 and 2 increased from 0.86 to 0.91 and from 0.57 to 0.74, respectively, which indicated that the D–S evidence theory fusion was superior to FDM after different TFD and PSR combined with CSP to extract features.
Feature Distribution
To compare the distribution differences of features extracted by different methods, Figure 5 presents the distribution of features extracted by these methods on data set 1 (for convenience, only the two preferred features in each method are displayed). It can be seen from Figure 5 that the feature distribution obtained by combining three different TFD methods and PSR with CSP in this study was more separable than that obtained by a single CSP.
Figure 5. Characteristic box diagram of traditional CSP and of three different time–frequency decomposition and reconstruction methods. M1 represents the characteristic box diagram of CSP; M2 represents the characteristic box diagram of PSR + CSP; M3 represents the characteristic box diagram of WPT + CSP; M4 represents the characteristic box diagram of WPT + PSR + CSP; M5 represents the characteristic box diagram of LMD + CSP; M6 represents the characteristic box diagram of LMD + PSR + CSP; M7 represents the characteristic box diagram of FEEMD + CSP; M8 represents the characteristic box diagram of FEEMD + PSR + CSP.
To intuitively compare the distribution of features proposed by different methods, Table 3 shows the upper quartile difference and median difference of each characteristic box diagram; the difference between the upper quartile difference and the median difference indicates the separability of left- and right-hand feature distribution. The separability difference of WPT-PSR-CSP (0.00559 to –1.41e-03) was larger than the separability difference of WPT-CSP (0.00293 to –1.82e-04), which indicated that PSR can improve the separability of the left- and right-hand MI features in the intention identification of MI with fewer channels. In addition, the Fisher score (FS) was calculated for the features extracted by each method. The FS value of the proposed method was much larger than that of CSP, which also indicated that the features extracted from the original EEG by three different TFD methods and PSR combined with CSP had better separability.
Table 3. Comparison of feature distribution between CSP method and three different time–frequency decomposition reconstruction methods.
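A Fisher score for a single feature can be computed as sketched below. The study does not state which definition it uses, so this sketch assumes one common two-class form: the squared difference of the class means divided by the sum of the class variances (population variances here). The sample values are made up for illustration.

```python
from statistics import fmean, pvariance

def fisher_score(class_a, class_b):
    """Two-class Fisher score of one feature: (mean gap)^2 / (sum of variances)."""
    gap = fmean(class_a) - fmean(class_b)
    return gap ** 2 / (pvariance(class_a) + pvariance(class_b))

# Made-up feature values for left- and right-hand trials.
well_separated = fisher_score([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
overlapping = fisher_score([1.0, 2.0, 3.0], [1.5, 2.5, 3.5])
```

A larger score means that the class means are far apart relative to the spread within each class, which is the sense in which a higher FS indicates better separability.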
Performance Analysis of D–S Evidence Theory in Decision Fusion Layer
To analyze the performance of the D–S evidence theory in the decision fusion layer, Table 4 shows the BPA of the single-feature domain and multifeature domain fusion of two test samples. The largest BPA in the single-feature domain of test sample 1 in Table 4 is S2{L} (0.5985). After the fusion of D–S evidence theory, S{L} is increased to 0.8269, and the uncertainty S{Θ} is decreased from a minimum of 0.1188 to 0.0282. Therefore, the D–S evidence theory can be used to improve BPA and reduce the uncertainty of categories.
Table 4. BPA values of test samples under the framework Θ. Si and Si* represent the BPA of test samples 1 and 2, respectively. sf and sf* represent the BPA of test samples 1 and 2, respectively, according to D–S decision fusion rules.
For test sample 2 in Table 4, one single-domain model assigns {L} a larger BPA (0.4617) than {R} (0.0883), whereas the other two models each assign {R} a larger BPA than {L}. After D–S evidence theory fusion, the BPA of {L} (0.5408) is higher than that of {R} (0.4241), while S{Θ} decreases from 0.1188 to 0.0351. The final test sample category is therefore judged as {L}, which is in line with the true category of the sample. Under the traditional voting principle, the sample would instead be judged as {R}, which is inconsistent with its true category. This shows that the fusion of D–S evidence theory can not only reduce the uncertainty of the test sample but also improve the classification accuracy.
Comparison of TFD-PSR-CSP and D–S Evidence Theory With Existing Methods
To compare the performance of the proposed method with the existing methods (both for data set 1 and data set 2), the classification accuracy of data set 1 is compared, and the kappa value and classification accuracy of data set 2 are compared. The best accuracies and kappa values are highlighted in bold in Tables 5–7.
Table 5. Comparison of classification accuracy between the methods proposed in this study and the existing methods in data set 1.
Table 6. Comparison of kappa values on data set 2 between the methods proposed in this study and the methods adopted by the top four in the BCI competition.
Table 7. Comparison of classification accuracy between the method proposed in this study and existing methods on data set 2.
Comparison of Results of Data Set 1
Table 5 compares the classification accuracy of the method proposed in this study with the existing methods on data set 1. These methods differ in the features and classification algorithms they adopt, and their results differ accordingly. It can be seen from Table 5 that the proposed method achieved the best classification accuracy, 95.71%, which is 2.88% higher than the highest classification accuracy obtained by the existing methods (the method proposed by Ge et al.).
Comparison of Results of Data Set 2
(1) Kappa value comparison
Table 6 compares kappa values of the methods proposed in this study and the methods adopted by the top four in the BCI competition on data set 2. It can be seen from Table 6 that the average kappa value of the method proposed in this study is higher than that of other methods in the table, and all subjects have achieved better performance.
(2) Comparison of classification accuracy
Table 7 compares the classification accuracy of the proposed method with the existing methods on data set 2. It can be seen from Table 7 that the proposed method has achieved better average classification accuracy, reaching 86.60%, which is 2.3% higher than the highest average classification accuracy obtained by the existing methods (the method proposed by Yu et al.), but the classification accuracy on subjects 7 and 8 is slightly lower than that of some existing methods.
Compared with multichannel MI-BCI, MI-BCI with fewer channels may be more easily accepted by users. To address the inapplicability of CSP in MI-BCI with fewer channels, this study proposed TFD-PSR-CSP, which expands the spatial information of MI-EEG with fewer channels and enriches the information in the time and frequency domains. In addition, the sigmoid function was used for probability mapping of SVM outputs to obtain three different SVM probability outputs, and D–S evidence theory was used for decision-level fusion, which effectively improved the performance of few-channel MI-BCI.
However, the methods proposed in this study also have some limitations. One of the limitations is that compared with traditional CSP, the computational complexity of the proposed method is higher. Therefore, in our future work, we will work toward reducing the computational complexity of the proposed method. Another limitation is that the BCI competition data set was used in this study, which can only be used for offline evaluation of the proposed method. Hence, an online evaluation of the proposed method will also be a part of our future work.
Conclusion
Because EEG data with fewer channels lack spatial information, CSP cannot extract features from them effectively. In this study, a time–frequency–space multidomain fusion method combining TFD-PSR-CSP feature extraction with D–S evidence theory was proposed. Compared with existing methods, it showed advantages in classification accuracy, kappa value, and feature distribution, and it is expected to provide ideas for the research and development of MI-BCI online systems based on EEG with fewer channels.
Data Availability Statement
The original contributions presented in the study are included in the article/supplementary material; further inquiries can be directed to the corresponding authors.
Ethics Statement
Ethical review and approval was not required for the study on human participants in accordance with the local legislation and institutional requirements. Written informed consent from the patients/participants or patients/participants’ legal guardian/next of kin was not required to participate in this study in accordance with the national legislation and the institutional requirements.
Author Contributions
FW: conceptualization, methodology, writing—original draft, software, and validation. HL: methodology and conceptualization. LZ: investigation and data curation. LS: data curation and software. JZ: software. AG: writing—review and editing, and funding acquisition. YF: conceptualization, methodology, writing—review and editing, and funding acquisition. All authors contributed to the article and approved the submitted version.
Funding
This work was supported by the National Natural Science Foundation of China (Nos. 82172058, 81771926, 61763022, and 62006246).
Conflict of Interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Publisher’s Note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
References
Ang, K. K., Chin, Z. Y., Zhang, H., and Guan, C. (2008). “Filter Bank Common Spatial Pattern (Fbcsp) in Brain-Computer Interface,” in 2008 IEEE International Joint Conference on Neural Networks (IEEE World Congress on Computational Intelligence), (Piscataway: IEEE).
Attallah, O., Abougharbia, J., Tamazin, M., and Nasser, A. A. (2020). A Bci System Based on Motor Imagery for Assisting People with Motor Deficiencies in the Limbs. Brain Sci. 10:864. doi: 10.3390/brainsci10110864
Blankertz, B., Tangermann, M., Vidaurre, C., Fazli, S., Sannelli, C., Haufe, S., et al. (2010). The Berlin Brain-Computer Interface: Non-Medical Uses of Bci Technology. Front. Neurosci. 4:198. doi: 10.3389/fnins.2010.00198
Bustios, P., and Rosa, J. L. (2017). “Restricted Exhaustive Search for Frequency Band Selection in Motor Imagery Classification,” in 2017 International Joint Conference on Neural Networks (IJCNN), (Piscataway: IEEE).
Chen, B., Li, Y., Dong, J., Na, L., and Jing, Q. (2018). Common Spatial Patterns Based on the Quantized Minimum Error Entropy Criterion. IEEE Transac. Syst. Man Cybernet. 50, 1–12.
Chen, W., and You, Y. (2017). “Masking Empirical Mode Decomposition-Based Hybrid Features for Recognition of Motor Imagery in Eeg,” in 2017 3rd IEEE International Conference on Control Science and Systems Engineering (ICCSSE), (Piscataway: IEEE).
Cheng, M., Gao, X., Gao, S., and Xu, D. (2002). Design and Implementation of a Brain-Computer Interface with High Transfer Rates. IEEE Trans. Biomed. Eng. 49, 1181–1186. doi: 10.1109/tbme.2002.803536
Dai, Y., Duan, F., Feng, F., Sun, Z., Zhang, Y., Caiafa, C. F., et al. (2021). A Fast Approach to Removing Muscle Artifacts for Eeg with Signal Serialization Based Ensemble Empirical Mode Decomposition. Entropy 23:1170. doi: 10.3390/e23091170
Dempster, A. P. (1967). Upper and Lower Probabilities Induced by a Multivalued Mapping. Ann. Mathemat. Stat. 38, 325–339. doi: 10.1214/aoms/1177698950
Dose, H., Møller, J. S., Iversen, H. K., and Puthusserypady, S. (2018). An End-to-End Deep Learning Approach to Mi-Eeg Signal Classification for Bcis. Exp. Syst. Appl. 114, 532–542. doi: 10.1088/1741-2552/ab3471
Ge, R., and Hu, J. (2018). The Classification of Eeg Signals with Multi-Domain Fusion Based on D-S Evidence Theory. J. Circ. Syst. Comput. 28:1950160. doi: 10.1142/s0218126619501603
Guan, J., and Duan, F. (2019). The Improvement of Motor Imagery Based on Spectral Feature and Transformation on Multivariate Empirical Mode Decomposition. J. Phys. Conf. Ser. 1169:012044. doi: 10.1088/1742-6596/1169/1/012044
Huang, G., Liu, G., and Zhu, X. (2009). Common Spatial Patterns in Classification Based on Less Number Channels of EEG. Chin. J. Biomed. Eng. 28, 840–845.
Jiao, Y., Zhang, Y., Chen, X., Yin, E., Jin, J., Wang, X., et al. (2019). Sparse Group Representation Model for Motor Imagery Eeg Classification. IEEE J. Biomed. Health Inform. 23, 631–641. doi: 10.1109/JBHI.2018.2832538
Jin, J., Xiao, R., Daly, I., Miao, Y., Wang, X., and Cichocki, A. (2021). Internal Feature Selection Method of Csp Based on L1-Norm and Dempster-Shafer Theory. IEEE Trans. Neural Netw. Learn. Syst. 32, 4814–4825. doi: 10.1109/TNNLS.2020.3015505
Kee, C.-Y., Ponnambalam, S., and Loo, C.-K. (2017). Binary and Multi-Class Motor Imagery Using Renyi Entropy for Feature Extraction. Neural Comput. Appl. 28, 2051–2062. doi: 10.1007/s00521-016-2178-y
Kim, H. S., Eykholt, R., and Salas, J. (1999). Nonlinear Dynamics, Delay Times, and Embedding Windows. Physica D Nonlin. Phenom. 127, 48–60. doi: 10.1016/s0167-2789(98)00240-1
Liu, Y., Tang, B., Duan, L., and Fei, F. (2018). “Feature Extraction for Rolling Bearing Diagnosis Based on Improved Local Mean Decomposition,” in 2018 Prognostics and System Health Management Conference (PHM-Chongqing), (Piscataway: IEEE).
Lotte, F., and Guan, C. (2011). Regularizing Common Spatial Patterns to Improve Bci Designs: Unified Theory and New Algorithms. IEEE Trans. Biomed. Eng. 58, 355–362. doi: 10.1109/TBME.2010.2082539
Mathon, B. R., Ozbek, M. M., and Pinder, G. F. (2010). Dempster–Shafer Theory Applied to Uncertainty Surrounding Permeability. Mathemat. Geosci. 42, 293–307. doi: 10.1007/s11004-009-9246-0
Meng, J. J., Sheng, X. J., Yao, L., and Zhu, X. Y. (2013). Common Spatial Spectral Pattern for Motor Imagery Tasks in Small Channel Configuration. Chin. J. Biomed. Eng. 32, 553–561.
Miao, Y., Jin, J., Daly, I., Zuo, C., Wang, X., Cichocki, A., et al. (2021). Learning Common Time-Frequency-Spatial Patterns for Motor Imagery Classification. IEEE Trans. Neural Syst. Rehabil. Eng. 29, 699–707. doi: 10.1109/TNSRE.2021.3071140
Park, Y., and Chung, W. (2019). Frequency-Optimized Local Region Common Spatial Pattern Approach for Motor Imagery Classification. IEEE Trans. Neural Syst. Rehabil. Eng. 27, 1378–1388. doi: 10.1109/TNSRE.2019.2922713
Platt, J. (1999). Probabilistic Outputs for Support Vector Machines and Comparisons to Regularized Likelihood Methods. Adv. Larg. Marg. Classif. 10, 61–74.
Rodríguez-Bermúdez, G., García-Laencina, P. J., Roca-González, J., and Roca-Dorda, J. (2013). Efficient Feature Selection and Linear Discrimination of Eeg Signals. Neurocomputing 115, 161–165. doi: 10.1016/j.neucom.2013.01.001
Smith, J. S. (2005). The Local Mean Decomposition and Its Application to Eeg Perception Data. J. R. Soc. Inter. 2, 443–454. doi: 10.1098/rsif.2005.0058
Song, J.-L., Hu, W., and Zhang, R. (2016). Automated Detection of Epileptic EEGs Using a Novel Fusion Feature and Extreme Learning Machine. Neurocomputing 175, 383–391. doi: 10.1016/j.neucom.2015.10.070
Takens, F. (1981). Detecting Strange Attractors in Turbulence. Dynamical Systems and Turbulence, Warwick 1980. New York: Springer, 366–381.
Ting, W., Guo-Zheng, Y., Bang-Hua, Y., and Hong, S. (2008). Eeg Feature Extraction Based on Wavelet Packet Decomposition for Brain Computer Interface. Measurement 41, 618–625. doi: 10.1016/j.measurement.2007.07.007
Wang, J., Feng, Z., and Lu, N. (2017). “Feature Extraction by Common Spatial Pattern in Frequency Domain for Motor Imagery Tasks Classification,” in 2017 29th Chinese Control and Decision Conference (CCDC), (Piscataway: IEEE).
Wang, J., Feng, Z., Ren, X., Lu, N., Luo, J., and Sun, L. (2020). Feature Subset and Time Segment Selection for the Classification of Eeg Data Based Motor Imagery. Biomed. Sig. Proc. Control 61:102026. doi: 10.1109/IEMBS.2009.5334902
Wang, K., Zhai, D.-H., and Xia, Y. (2019). “Motor Imagination Eeg Recognition Algorithm Based on Dwt, Csp and Extreme Learning Machine,” in 2019 Chinese Control Conference (CCC), (Piscataway: IEEE).
Wang, T., and He, B. (2004). An Efficient Rhythmic Component Expression and Weighting Synthesis Strategy for Classifying Motor Imagery Eeg in a Brain-Computer Interface. J. Neural Engin. 1, 1–7. doi: 10.1088/1741-2560/1/1/001
Wang, Y., Shen, X., and Peng, Z. (2018). “Research of Eeg Recognition Algorithm Based on Motor Imagery,” in 2018 2nd International Conference on Robotics and Automation Sciences (ICRAS), (Piscataway: IEEE).
Wang, Y.-H., Yeh, C.-H., Young, H.-W. V., Hu, K., and Lo, M.-T. (2014). On the Computational Complexity of the Empirical Mode Decomposition Algorithm. Physica A Stat. Mech. Appl. 400, 159–167. doi: 10.1016/j.physa.2014.01.020
Wolpaw, J. R., Birbaumer, N., Heetderks, W. J., McFarland, D. J., Peckham, P. H., Schalk, G., et al. (2000). Brain-Computer Interface Technology: A Review of the First International Meeting. IEEE Trans. Rehabil. Eng. 8, 164–173. doi: 10.1109/tre.2000.847807
Xu, B. G., Zhang, L. L., Song, A. G., Wu, C. C., Li, W. L., Zhang, D. L., et al. (2019). Wavelet Transform Time-Frequency Image and Convolutional Network-Based Motor Imagery Eeg Classification. IEEE Access 7, 6084–6093. doi: 10.1109/access.2018.2889093
Yang, Q., Zhang, Z., Leng, Y., Yang, Y., and Ge, S. (2015). “Phase Space Reconstruction for Improvement of Classification in Few-Channel Bci Systems,” in 2015 12th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), (Piscataway: IEEE).
Zhang, Y., Nam, C. S., Zhou, G., Jin, J., Wang, X., and Cichocki, A. (2019). Temporally Constrained Sparse Group Spatial Patterns for Motor Imagery Bci. IEEE Trans. Cybern. 49, 3322–3332. doi: 10.1109/TCYB.2018.2841847
Zhao, H., Zheng, Q., Ma, K., Li, H., and Zheng, Y. (2021). Deep Representation-Based Domain Adaptation for Nonstationary Eeg Classification. IEEE Trans. Neural Netw. Learn. Syst. 32, 535–545. doi: 10.1109/TNNLS.2020.3010780
Keywords: MI-BCI with fewer channels, Dempster–Shafer evidence theory, time-frequency decomposition (TFD), phase space reconstruction (PSR), common spatial pattern (CSP)
Citation: Wang F, Liu H, Zhao L, Su L, Zhou J, Gong A and Fu Y (2022) Improved Brain–Computer Interface Signal Recognition Algorithm Based on Few-Channel Motor Imagery. Front. Hum. Neurosci. 16:880304. doi: 10.3389/fnhum.2022.880304
Received: 22 February 2022; Accepted: 24 March 2022; Published: 06 May 2022.
Edited by:
Bin He, Carnegie Mellon University, United States
Reviewed by:
Yuan Yang, University of Oklahoma, United States
Yuliang Ma, Hangzhou Dianzi University, China
Jianjun Meng, Shanghai Jiao Tong University, China
Copyright © 2022 Wang, Liu, Zhao, Su, Zhou, Gong and Fu. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Anmin Gong, gonganmincapf@163.com; Yunfa Fu, fyf@ynu.edu.cn