AUTHOR=Rath Adyasha , Mishra Debahuti , Panda Ganapati TITLE=Imbalanced ECG signal-based heart disease classification using ensemble machine learning technique JOURNAL=Frontiers in Big Data VOLUME=Volume 5 - 2022 YEAR=2022 URL=https://www.frontiersin.org/journals/big-data/articles/10.3389/fdata.2022.1021518 DOI=10.3389/fdata.2022.1021518 ISSN=2624-909X ABSTRACT=Machine learning (ML) based classification models are widely used for the automated detection of heart diseases (HD) from various physiological signals such as electrocardiogram (ECG), magnetocardiography (MCG), heart sound (HS), and impedance cardiography (ICG) signals. However, ECG-based HD identification is the approach most commonly used by clinicians. In the current investigation, the ECG records, or subjects, have been sampled and used as inputs to the classification model to distinguish between normal and abnormal patients. The study has employed an imbalanced number of ECG samples for training the various classification models. A few ML methods that have rarely been used for HD detection, such as support vector machine (SVM), logistic regression (LR), and adaptive boosting (AdaBoost), have been selected. The performance of the developed models has been evaluated in terms of accuracy, F1-score, and area under the curve (AUC) using ECG signals of subjects from the publicly available PTB-ECG and MIT-BIH datasets. The models have been ranked based on these performance metrics, and the AdaBoost and LR classifiers are found to rank first and second, respectively. These two models have been ensembled based on the majority voting principle, and the performance measures of this ensemble model have also been determined. In general, the proposed ensemble model demonstrates the best HD detection performance: 0.946, 0.949, and 0.951 for the PTB-ECG dataset and 0.921, 0.926, and 0.950 for the MIT-BIH dataset in terms of accuracy, F1-score, and AUC, respectively. 
The proposed methodology can also be employed for the classification of HD using ICG, MCG, and HS signals as inputs. Furthermore, it can be applied to the detection of other diseases.
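
The majority-voting step and the three reported metrics can be illustrated with a minimal, dependency-free Python sketch. It is not the authors' implementation. The function names, the toy labels, and the scores are hypothetical. The tie-break rule in particular is an assumption: the abstract does not say how disagreements between the two voters (AdaBoost and LR) are resolved, so the sketch falls back to a configurable label.

```python
from collections import Counter
from typing import List, Sequence


def majority_vote(predictions: Sequence[Sequence[int]], tie_break: int = 1) -> List[int]:
    """Combine per-model label predictions by majority vote.

    predictions[m][i] is the label that model m assigns to sample i.
    tie_break is returned when the two most frequent labels receive the
    same number of votes (an assumption; not specified in the abstract).
    """
    combined = []
    for votes in zip(*predictions):
        ranked = Counter(votes).most_common()
        if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
            combined.append(tie_break)  # tie between the top two labels
        else:
            combined.append(ranked[0][0])
    return combined


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> float:
    """F1 = 2*TP / (2*TP + FP + FN) for the chosen positive class."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def roc_auc(y_true: Sequence[int], scores: Sequence[float], positive: int = 1) -> float:
    """AUC as the probability that a random positive sample scores higher
    than a random negative sample (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy, imbalanced labels (1 = abnormal, 0 = normal); all values are made up.
y_true = [1, 1, 1, 0, 0, 1]
model_a = [1, 1, 0, 0, 1, 1]  # stand-in for one classifier's predictions
model_b = [1, 0, 1, 0, 0, 1]  # stand-in for the other classifier's predictions

ensemble = majority_vote([model_a, model_b], tie_break=1)
ensemble_accuracy = accuracy(y_true, ensemble)
ensemble_f1 = f1_score(y_true, ensemble)
toy_auc = roc_auc([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1])
```

With only two voters, every disagreement is a tie, so the chosen `tie_break` label determines the outcome for those samples. Breaking ties toward the abnormal class, as in the toy call above, trades more false positives for fewer missed abnormal records. Whether that matches the published ensemble is not stated in the abstract.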