AUTHOR=Aalam Syed Wajid , Ahanger Abdul Basit , Masoodi Tariq A. , Bhat Ajaz A. , Akil Ammira S. Al-Shabeeb , Khan Meraj Alam , Assad Assif , Macha Muzafar A. , Bhat Muzafar Rasool TITLE=Deep learning-based identification of esophageal cancer subtypes through analysis of high-resolution histopathology images JOURNAL=Frontiers in Molecular Biosciences VOLUME=11 YEAR=2024 URL=https://www.frontiersin.org/journals/molecular-biosciences/articles/10.3389/fmolb.2024.1346242 DOI=10.3389/fmolb.2024.1346242 ISSN=2296-889X ABSTRACT=

Esophageal cancer (EC) remains a significant health challenge globally, with increasing incidence and high mortality rates. Despite advances in treatment, there remains a need for improved diagnostic methods and understanding of disease progression. This study addresses the significant challenges in the automatic classification of EC, particularly in distinguishing its primary subtypes: adenocarcinoma and squamous cell carcinoma, using histopathology images. Traditional histopathological diagnosis, while being the gold standard, is subject to subjectivity and human error and imposes a substantial burden on pathologists. This study proposes a binary class classification system for detecting EC subtypes in response to these challenges. The system leverages deep learning techniques and tissue-level labels for enhanced accuracy. We utilized 59 high-resolution histopathological images from The Cancer Genome Atlas (TCGA) Esophageal Carcinoma dataset (TCGA-ESCA). These images were preprocessed, segmented into patches, and analyzed using a pre-trained ResNet101 model for feature extraction. For classification, we employed five machine learning classifiers: Support Vector Classifier (SVC), Logistic Regression (LR), Decision Tree (DT), AdaBoost (AD), Random Forest (RF), and a Feed-Forward Neural Network (FFNN). The classifiers were evaluated based on their prediction accuracy on the test dataset, yielding results of 0.88 (SVC and LR), 0.64 (DT and AD), 0.82 (RF), and 0.94 (FFNN). Notably, the FFNN classifier achieved the highest Area Under the Curve (AUC) score of 0.92, indicating its superior performance, followed closely by SVC and LR, with a score of 0.87. This suggested approach holds promising potential as a decision-support tool for pathologists, particularly in regions with limited resources and expertise. The timely and precise detection of EC subtypes through this system can substantially enhance the likelihood of successful treatment, ultimately leading to reduced mortality rates in patients with this aggressive cancer.