ORIGINAL RESEARCH article

Front. Pharmacol., 08 April 2022
Sec. Translational Pharmacology
This article is part of the Research Topic Model-Informed Drug Development and Evidence-Based Translational Pharmacology

A Supervised ML Applied Classification Model for Brain Tumors MRI

Zhengyu Yu1,2, Qinghu He3, Jichang Yang3 and Min Luo1,3*
  • 1Department of Nephrology, The Second Xiangya Hospital, Central South University, Changsha, China
  • 2Faculty of Engineering and IT, University of Technology Sydney, Sydney, NSW, Australia
  • 3Department of Rehabilitation Medicine and Health Care, Hunan University of Medicine, Huaihua, China

Brain tumors originate from abnormal cells that grow uncontrollably. Magnetic resonance imaging (MRI) generates high-quality images and provides extensive information for medical research. Machine learning algorithms can increase the diagnostic value of MRI by enabling automated and accurate classification of MRI data. In this research, we propose a supervised machine learning model, with separate training and testing stages, to classify and analyze features of brain tumor MRI, and we evaluate it in terms of accuracy, precision, sensitivity and F1 score. The results show that the model achieves more than 95% accuracy. It can classify features more accurately than other existing methods.

Introduction

In the human body, the brain is a complex organ. A brain tumor originates when an abnormal group of cells forms in the brain and divides uncontrollably (Logeswari and Karnan, 2010). These abnormal cells destroy healthy cells and affect the normal activity of the brain. Brain tumors are classified as benign or malignant. Benign tumors grow slowly and originate in the brain; they are considered non-progressive or non-cancerous. Benign tumors cannot spread to other organs in the body. In contrast, malignant tumors are progressive and cancerous. They grow unpredictably, in an indeterminate manner. Primary malignant tumors originate in the brain itself. In addition, malignant tumors can originate in other organs and spread to the brain.

MRI is an imaging technology that can generate high-quality images of human anatomy. MRI provides extensive information for medical diagnosis and research (Zhang et al., 2011). Automated and accurate classification of MRI images has dramatically improved the diagnostic value of MRI (Scapaticci et al., 2012). However, a single type of MRI cannot provide full details of brain tumors, which contain many different tissues (Sudharani et al., 2016). Differently weighted images are therefore combined for the image segmentation of brain tumors. Three weighted MRI images (T1, T2, and FLAIR, in Figure 1) are used for image segmentation of the skull on different axial slices (Vannier et al., 1988; Clark et al., 1994; Dou et al., 2007).

FIGURE 1. Comparison of T1, T2 and FLAIR images of brain tumor MRI (Clark et al., 2013; Scarpace et al., 2022).

MRI is one of the best imaging methods, and researchers use it to analyze the progression of a brain tumor during detection and treatment. Because MRI generates high-resolution images, it captures brain structure information, such as brain tissue abnormalities, in detail. Therefore, MRI strongly influences the automatic analysis of medical images (Zacharaki et al., 2009; Litjens et al., 2017). Since medical images can be scanned and loaded into a computer, researchers have proposed different automated methods for observing and classifying brain tumors from brain MRI images (Litjens et al., 2017).

Two categories of classification methods have been proposed. The first is unsupervised classification, such as fuzzy c-means and self-organizing feature maps (Ibrahim et al., 2013). The second is supervised classification, such as K Nearest Neighbours (KNN) and Support Vector Machine (SVM) (Cocosco et al., 2003; Chaplot et al., 2006). In terms of classification accuracy, supervised classification performs better than unsupervised classification (Zhang and Wu, 2008; Ibrahim et al., 2013). Nevertheless, most reported classification accuracies are below 95% (Yeh and Fu, 2008). In the past decades, SVM and Neural Network (NN) have become popular owing to their outstanding performance in detecting and classifying brain tumors (Ibrahim et al., 2013). More recently, deep learning methods have established a novel modeling approach in machine learning. Deep architectures can represent complex relationships effectively without needing as many nodes as shallow architectures such as SVM and KNN. As a result, they have rapidly developed into the most advanced technologies in various health research fields (such as medical image analysis, medical informatics, and bioinformatics) (Pan et al., 2015; Ravì et al., 2016; Litjens et al., 2017).

Materials and Methods

We propose a classification method based on supervised machine learning algorithms to determine whether cysts are detected in brain tumor MRI. Figure 2 illustrates the workflow for the training and testing models of the classification method. The process is summarised below:

1) Extract datasets of brain tumor MRI images. In this research, the datasets are from the Repository of Molecular Brain Neoplasia Data (REMBRANDT) (Clark et al., 2013; Scarpace et al., 2022).

2) Extract features. As Table 1 shows, 30 features are extracted from brain tumor MRI, including 21 categorical features and 9 numerical features. Feature 8 is selected as the target feature; the rest are used as attributes.

3) Compare machine learning classification algorithms. Supervised machine learning classifiers, namely Decision Tree (DT), SVM, KNN and NN, are compared to estimate the performance of each training model. Cross-validation is computed with different numbers of folds to avoid overfitting. 80% of the dataset is used for the training model. The results indicate that the model using DT is the most accurate.

4) Evaluate the testing model using the remaining 20% of the dataset. In this stage, feature 8 is again selected as the target feature and the rest of the features are used as attributes. The results show that the DT model with 30 cross-validation folds performs best.

5) After the final model has been evaluated, its accuracy is found to be 95.9%.
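The data partitioning in steps 3 and 4 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sample count, the random seed and the helper names `holdout_split` and `k_fold` are hypothetical and do not come from the study, and the study's own tooling is not specified here.

```python
import random


def holdout_split(n_samples, test_fraction=0.2, seed=0):
    """Shuffle sample indices and split them into train and test lists."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = round(n_samples * test_fraction)
    return indices[n_test:], indices[:n_test]


def k_fold(indices, k):
    """Yield (train, validation) index lists for k roughly equal folds."""
    folds = [indices[i::k] for i in range(k)]
    for i, validation in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, validation


# Hypothetical sample count, chosen only for illustration.
train_idx, test_idx = holdout_split(n_samples=100, test_fraction=0.2)

# Each fold serves once as validation data; a classifier would be fitted
# on the remaining folds and scored on the held-out fold.
fold_sizes = [len(val) for _, val in k_fold(train_idx, k=5)]
print(len(train_idx), len(test_idx), fold_sizes)
```

Keeping the 20% test indices out of every cross-validation fold is what allows the final test score to serve as an independent check on the selected model.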

FIGURE 2. Workflow diagram for training model and testing model.

TABLE 1. Data features extracted from brain tumors MRI.

Datasets

The dataset used in this research is REMBRANDT (Scarpace et al., 2022), accessed from The Cancer Imaging Archive (TCIA) database (Clark et al., 2013). REMBRANDT is intended to explore the link between genomic characterization data and clinical information. It consists of pre-surgical MRI for 130 patients, including 174 studies, 1,483 series, and 110,020 images. Table 1 presents the 30 features extracted from brain tumor MRI, including 21 categorical features and 9 numerical features.

Training Algorithms Methods

The DT classifier is a supervised machine learning technique that makes decisions in multiple stages. The fundamental concept of a decision tree is to break a complicated decision down into a series of simpler decisions, so that the final result resembles the intended result (Hastie et al., 2009).

The DT technique is a widely used data mining methodology for classifying observations based on multiple covariates or for predicting a target variable. A decision tree divides the data into branch-like segments that form an inverted tree consisting of a root node, internal nodes and leaf nodes. Because the algorithm is non-parametric, it can handle large and complex data sets efficiently. When the data set is sufficiently large, the data are separated into training and validation sets. The training set is used to build the decision tree model, whereas the validation set is used to choose the appropriate tree size for the optimal final model (Boser et al., 1992; Song and Lu, 2015).
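The idea of breaking one complicated decision into simpler ones can be illustrated with the single split made at one node of a tree. The sketch below searches one numeric feature for the threshold that minimises the weighted Gini impurity of the two resulting branches. The split criterion is an assumption, since the article does not state which one was used, and the toy values, labels and function names are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of binary (0/1) labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def best_threshold(values, labels):
    """Return (impurity, threshold) for the best split 'value <= threshold'."""
    best = (float("inf"), None)
    # Candidate thresholds: every distinct value except the largest,
    # so that both branches are always non-empty.
    for t in sorted(set(values))[:-1]:
        left = [y for x, y in zip(values, labels) if x <= t]
        right = [y for x, y in zip(values, labels) if x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best[0]:
            best = (score, t)
    return best


# Hypothetical toy feature and binary labels, for illustration only.
values = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
labels = [0, 0, 0, 1, 1, 1]
impurity, threshold = best_threshold(values, labels)
print(impurity, threshold)
```

A full decision tree repeats this search recursively on each branch, which is why limiting the tree size on validation data matters: unlimited splitting eventually fits noise in the training set.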

SVM is a commonly used machine learning methodology for classification problems in data mining, valued for its relative flexibility and simplicity (Hearst et al., 1998). SVMs have been used in a wide variety of biomedical applications. For instance, SVMs can automatically classify microarray gene expression data sets, examining whether an expression profile is derived from peripheral fluid or a tumour sample for the purpose of diagnosis or prognosis. In brain disease research, SVMs are usually applied in multivoxel pattern analysis because of their low risk of overfitting when processing high-dimensional images. Recently, SVMs have been used to predict prognosis and diagnosis in brain disorders research (Orrù et al., 2012).

KNN is an effective and high-performance learning technique for classifying and clustering large-scale data in big data applications (Zhang et al., 2017). The original KNN technique sets a value of K and selects the K nearest training samples; the majority class among them determines the prediction. To select the K nearest samples, KNN calculates the similarity between the query and all training samples (Guo et al., 2003). The algorithm therefore requires substantial memory and time to process large data sets. Nevertheless, KNN is one of the top techniques in data mining owing to its strong performance (Deng et al., 2016).
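A minimal sketch of the KNN procedure described above is given below. It assumes Euclidean distance as the similarity measure and a simple majority vote, neither of which is specified in the article; the toy points, the labels "A" and "B", and the function name `knn_predict` are hypothetical.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # The distance to every training sample is computed, which is why
    # plain KNN becomes costly in time and memory on large data sets.
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Hypothetical two-dimensional toy data, for illustration only.
train_points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
train_labels = ["A", "A", "A", "B", "B", "B"]

print(knn_predict(train_points, train_labels, (2.0, 2.0)))
print(knn_predict(train_points, train_labels, (8.0, 9.0)))
```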

NN has been introduced as a vital tool for classification in recent research. NN is non-linear and self-adaptive. It is flexible in complex data environments and can adjust itself to the data without an explicit specification of the classification function (Cybenko, 1989; Hornik, 1991). Moreover, NN can estimate posterior probabilities, which provides a basis for performing statistical analysis and establishing classification functions (Richard and Lippmann, 1991; Zhang, 2000).

Results and Discussion

The confusion matrix is applied to determine the accuracy, precision, sensitivity and F1 score for the performance of the classifier method. Table 2 shows the confusion matrix for the classifier method.

TABLE 2. Confusion matrix for the classifier method.

The accuracy, precision, sensitivity and F1 score are calculated by equations below:

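$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{1}$$
$$\mathrm{Precision} = \frac{TP}{TP + FP} \tag{2}$$
$$\mathrm{Sensitivity} = \frac{TP}{TP + FN} \tag{3}$$
$$\mathrm{F1\text{-}score} = \frac{2\,TP}{2\,TP + FP + FN} \tag{4}$$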
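Equations 1 to 4 translate directly into code, with TP, TN, FP and FN denoting the counts of true positives, true negatives, false positives and false negatives. The sketch below is illustrative only: the counts are hypothetical and are not taken from Table 2, and the function name is an assumption.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute Eqs. 1-4 from confusion-matrix counts.

    No guard against empty denominators is included; the counts are
    assumed to contain at least one predicted and one actual positive.
    """
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),   # Eq. 1
        "precision": tp / (tp + fp),                   # Eq. 2
        "sensitivity": tp / (tp + fn),                 # Eq. 3
        "f1": 2 * tp / (2 * tp + fp + fn),             # Eq. 4
    }


# Hypothetical counts, not taken from the article's tables.
metrics = classification_metrics(tp=90, tn=5, fp=3, fn=2)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```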

DT Classifier

After the training model has been processed, the DT classifier is most accurate, at 96.2%, with 30-fold cross-validation. Table 3 and Figure 3 present the accuracy, precision, sensitivity and F1-score for each number of cross-validation folds. With 30-fold cross-validation, 96.2% accuracy, 97.3% precision, 98.6% sensitivity and 97.9% F1-score are obtained.

TABLE 3. Performance of DT classifier.

FIGURE 3. Comparison diagram for the performance of DT classifier.

SVM Classifier

After the training model has been computed with the SVM algorithm, Table 4 and Figure 4 indicate that the most accurate models reach 94.9% accuracy, with 5-, 15-, 20- and 30-fold cross-validation. They all obtain 94.9% accuracy, 94.9% precision, 100% sensitivity and 97.4% F1-score.
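The reported F1-scores can be cross-checked against the reported precision and sensitivity, because Eq. 4 is algebraically equivalent to the harmonic mean of the two, F1 = 2PS / (P + S). The short sketch below applies this identity to the rounded DT and SVM figures quoted in the text; the function name is hypothetical, and small deviations are possible because the inputs are rounded to one decimal place.

```python
def f1_from_precision_sensitivity(precision, sensitivity):
    """Harmonic mean of precision and sensitivity, equivalent to Eq. 4."""
    return 2 * precision * sensitivity / (precision + sensitivity)


# Rounded values as quoted in the text for 30-fold cross-validation.
dt_f1 = f1_from_precision_sensitivity(0.973, 0.986)
svm_f1 = f1_from_precision_sensitivity(0.949, 1.000)

print(round(dt_f1 * 100, 1))   # 97.9, matching the reported DT F1-score
print(round(svm_f1 * 100, 1))  # 97.4, matching the reported SVM F1-score
```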

TABLE 4. Performance of SVM classifier.

FIGURE 4. Comparison diagram for the performance of SVM classifier.

KNN Classifier

When the training model is processed by the KNN classifier, Table 5 and Figure 5 show that the most accurate models reach 93.7% accuracy, with 10-, 20-, 25- and 30-fold cross-validation. All of them obtain 93.7% accuracy, 94.8% precision, 98.6% sensitivity and 96.6% F1-score.

TABLE 5. Performance of KNN classifier.

FIGURE 5. Comparison diagram for the performance of KNN classifier.

NN Classifier

Table 6 and Figure 6 are generated from the training model with the NN classifier. They show that the most accurate model reaches 92.4% accuracy with 10-fold cross-validation, together with 94.7% precision, 97.3% sensitivity and 95.9% F1-score.

TABLE 6. Performance of NN classifier.

FIGURE 6. Comparison diagram for the performance of NN classifier.

Testing Model

All the classifiers were trained in the previous section. The DT training model with 30-fold cross-validation, at 96.2% accuracy, is selected because it is the most accurate model among the results. The testing model is then evaluated on the rest of the dataset to verify the model’s performance.

As Table 7 shows, the accuracy of the DT classifier with 30-fold cross-validation in the testing model is 95.9%. Although this is lower than the score in the training model, owing to overfitting, it remains the model with the highest performance.

TABLE 7. Performance of Testing model.

Conclusion

This article proposes a supervised machine learning classification model for brain tumor MRI. The model is developed to achieve higher accuracy, precision, sensitivity and F1 score when classifying features of brain tumor MRI. The optimized classification model is obtained by comparing different supervised machine learning algorithms at different numbers of cross-validation folds and selecting the most accurate one. Testing confirms the performance of the selected model. This classification model can also be applied to other features of brain tumor MRI to obtain accurate results.

Data Availability Statement

The original contributions presented in the study are included in the article/Supplementary Material; further inquiries can be directed to the corresponding author.

Author Contributions

ZY contributed to conception and design of the study, and wrote the first draft of the manuscript. QH, JY, and ML contributed to manuscript revision, read, and project management. All authors approved the submitted version.

Funding

This work was supported by the Foundation of The Second Xiangya Hospital (XYEYY20200812), Central South University.

Conflict of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher’s Note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

References

Boser, B. E., Guyon, I. M., and Vapnik, V. N. (1992). “A Training Algorithm for Optimal Margin Classifiers,” in Proceedings of the fifth annual workshop on Computational learning theory, Pittsburgh, PA, July 27–29, 1992, 144–152. doi:10.1145/130385.130401

Chaplot, S., Patnaik, L. M., and Jagannathan, N. R. (2006). Classification of Magnetic Resonance Brain Images Using Wavelets as Input to Support Vector Machine and Neural Network. Biomed. signal Process. Control 1 (1), 86–92. doi:10.1016/j.bspc.2006.05.002

Clark, K., Vendt, B., Smith, K., Freymann, J., Kirby, J., Koppel, P., et al. (2013). The Cancer Imaging Archive (TCIA): Maintaining and Operating a Public Information Repository. J. Digit Imaging 26 (6), 1045–1057. doi:10.1007/s10278-013-9622-7

Clark, M. C., Hall, L. O., Goldgof, D. B., Clarke, L. P., Velthuizen, R. P., and Silbiger, M. S. (1994). MRI Segmentation Using Fuzzy Clustering Techniques. IEEE Eng. Med. Biol. Mag. 13 (5), 730–742. doi:10.1109/51.334636

Cocosco, C. A., Zijdenbos, A. P., and Evans, A. C. (2003). A Fully Automatic and Robust Brain MRI Tissue Classification Method. Med. Image Anal. 7 (4), 513–527. doi:10.1016/s1361-8415(03)00037-9

Cybenko, G. (1989). Approximation by Superpositions of a Sigmoidal Function. Math. Control. Signal. Syst. 2 (4), 303–314. doi:10.1007/bf02551274

Deng, Z., Zhu, X., Cheng, D., Zong, M., and Zhang, S. (2016). Efficient K NN Classification Algorithm for Big Data. Neurocomputing 195, 143–148. doi:10.1016/j.neucom.2015.08.112

Dou, W., Ruan, S., Chen, Y., Bloyet, D., and Constans, J.-M. (2007). A Framework of Fuzzy Information Fusion for the Segmentation of Brain Tumor Tissues on MR Images. Image Vis. Comput. 25 (2), 164–171. doi:10.1016/j.imavis.2006.01.025

Guo, G., Wang, H., Bell, D., Bi, Y., and Greer, K. (2003). “KNN Model-Based Approach in Classification,” in OTM Confederated International Conferences “On the Move to Meaningful Internet Systems”, Sicily, November 3–7, 2003 (Springer), 986–996. doi:10.1007/978-3-540-39964-3_62

Hastie, T., Tibshirani, R., and Friedman, J. H. (2009). The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Heidelberg: Springer.

Hearst, M. A., Dumais, S. T., Osuna, E., Platt, J., and Scholkopf, B. (1998). Support Vector Machines. IEEE Intell. Syst. Their Appl. 13 (4), 18–28. doi:10.1109/5254.708428

Hornik, K. (1991). Approximation Capabilities of Multilayer Feedforward Networks. Neural networks 4 (2), 251–257. doi:10.1016/0893-6080(91)90009-t

Ibrahim, W. H., Osman, A. A. A., and Mohamed, Y. I. (2013). “MRI Brain Image Classification Using Neural Networks,” in 2013 international conference on computing, electrical and electronic engineering (ICCEEE), Khartoum, August 26–28, 2013 (IEEE), 253–258. doi:10.1109/icceee.2013.6633943

Litjens, G., Kooi, T., Bejnordi, B. E., Setio, A. A. A., Ciompi, F., Ghafoorian, M., et al. (2017). A Survey on Deep Learning in Medical Image Analysis. Med. Image Anal. 42, 60–88. doi:10.1016/j.media.2017.07.005

Logeswari, T., and Karnan, M. (2010). An Improved Implementation of Brain Tumor Detection Using Segmentation Based on Hierarchical Self Organizing Map. Ijcte 2 (4), 591–595. doi:10.7763/ijcte.2010.v2.207

Orrù, G., Pettersson-Yeo, W., Marquand, A. F., Sartori, G., and Mechelli, A. (2012). Using Support Vector Machine to Identify Imaging Biomarkers of Neurological and Psychiatric Disease: a Critical Review. Neurosci. Biobehav Rev. 36 (4), 1140–1152. doi:10.1016/j.neubiorev.2012.01.004

Pan, Y., Huang, W., Lin, Z., Zhu, W., Zhou, J., Wong, J., et al. (2015). “Brain Tumor Grading Based on Neural Networks and Convolutional Neural Networks,” in 2015 37th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), Milano, August 25–29 (IEEE), 699–702. doi:10.1109/embc.2015.7318458

Ravì, D., Wong, C., Deligianni, F., Berthelot, M., Andreu-Perez, J., Lo, B., et al. (2016). Deep Learning for Health Informatics. IEEE J. Biomed. Health Inform. 21 (1), 4–21. doi:10.1109/JBHI.2016.2636665

Richard, M. D., and Lippmann, R. P. (1991). Neural Network Classifiers Estimate Bayesian A Posteriori Probabilities. Neural Comput. 3 (4), 461–483. doi:10.1162/neco.1991.3.4.461

Scapaticci, R., Di Donato, L., Catapano, I., and Crocco, L. (2012). A Feasibility Study on Microwave Imaging for Brain Stroke Monitoring. Pier B 40, 305–324. doi:10.2528/pierb12022006

Scarpace, L., Flanders, A. E., Jain, R., Mikkelsen, T., and Andrews, D. W. (2022). Data from: Rembrandt. The Cancer Imaging Archive.

Song, Y. Y., and Lu, Y. (2015). Decision Tree Methods: Applications for Classification and Prediction. Shanghai Arch. Psychiatry 27 (2), 130–135. doi:10.11919/j.issn.1002-0829.215044

Sudharani, K., Sarma, T. C., and Prasad, K. S. (2016). Advanced Morphological Technique for Automatic Brain Tumor Detection and Evaluation of Statistical Parameters. Proced. Tech. 24, 1374–1387. doi:10.1016/j.protcy.2016.05.153

Vannier, M., Speidel, C., and Rickman, D. (1988). Magnetic Resonance Imaging Multispectral Tissue Classification. Physiology 3 (4), 148–154. doi:10.1152/physiologyonline.1988.3.4.148

Yeh, J., and Fu, J. (2008). A Hierarchical Genetic Algorithm for Segmentation of Multi-Spectral Human-Brain MRI. Expert Syst. Appl. 34 (2), 1285–1295. doi:10.1016/j.eswa.2006.12.012

Zacharaki, E. I., Wang, S., Chawla, S., Soo Yoo, D., Wolf, R., Melhem, E. R., et al. (2009). Classification of Brain Tumor Type and Grade Using MRI Texture and Shape in a Machine Learning Scheme. Magn. Reson. Med. 62 (6), 1609–1618. doi:10.1002/mrm.22147

Zhang, G. P. (2000). Neural Networks for Classification: a Survey. IEEE Trans. Syst. Man. Cybern. C 30 (4), 451–462. doi:10.1109/5326.897072

Zhang, S., Li, X., Zong, M., Zhu, X., and Wang, R. (2017). Efficient kNN Classification with Different Numbers of Nearest Neighbors. IEEE Trans. Neural Netw. Learn. Syst. 29 (5), 1774–1785. doi:10.1109/TNNLS.2017.2673241

Zhang, Y.-D., and Wu, L. (2008). Weights Optimization of Neural Network via Improved BCO Approach. Pier 83, 185–198. doi:10.2528/pier08051403

Zhang, Y., Wu, L., and Wang, S. (2011). Magnetic Resonance Brain Image Classification by an Improved Artificial Bee colony Algorithm. Pier 116, 65–79. doi:10.2528/pier11031709

Keywords: brain tumor, magnetic resonance imaging, machine learning algorithms, classification, automation

Citation: Yu Z, He Q, Yang J and Luo M (2022) A Supervised ML Applied Classification Model for Brain Tumors MRI. Front. Pharmacol. 13:884495. doi: 10.3389/fphar.2022.884495

Received: 26 February 2022; Accepted: 28 March 2022;
Published: 08 April 2022.

Edited by:

Weiguo Li, Harbin Institute of Technology, China

Reviewed by:

Yan Zhang, Hunan Normal University, China
Yunrun Liu, Hong Kong Baptist University, Hong Kong SAR, China

Copyright © 2022 Yu, He, Yang and Luo. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Min Luo, xyluomin@csu.edu.cn
