AUTHOR=Shi Wenjie , Chen Zhilin , Liu Hui , Miao Chen , Feng Ruifa , Wang Guilin , Chen Guoping , Chen Zhitong , Fan Pingming , Pang Weiyi , Li Chen TITLE=COL11A1 as an novel biomarker for breast cancer with machine learning and immunohistochemistry validation JOURNAL=Frontiers in Immunology VOLUME=13 YEAR=2022 URL=https://www.frontiersin.org/journals/immunology/articles/10.3389/fimmu.2022.937125 DOI=10.3389/fimmu.2022.937125 ISSN=1664-3224 ABSTRACT=

Machine learning (ML) algorithms were used to identify a novel biological target for breast cancer and to explore its relationship with the tumor microenvironment (TME) and patient prognosis. The edgeR package was used to identify hub genes associated with overall survival (OS) and prognosis, which were validated using public datasets. Of 149 up-regulated genes identified in tumor tissues, three ML algorithms identified COL11A1 as a hub gene. COL11A1 was highly expressed in breast cancer samples and associated with a poor prognosis, and it correlated positively with the stromal score (r=0.49, p<0.001) and the ESTIMATE score (r=0.29, p<0.001) in the TME. Furthermore, COL11A1 correlated negatively with B cells, CD4 and CD8 cells, but positively with cancer-associated fibroblasts. Forty-three immune-regulation genes associated with COL11A1 were identified, and a five-gene immune regulation signature was built. When assessed alongside clinical factors, this gene signature was an independent risk factor for prognosis (HR=2.591, 95%CI 1.831–3.668, p=7.7e-08). A nomogram combining the gene signature with clinical variables showed better predictive performance (C-index=0.776). The calibration curve of the model showed little deviation from the ideal curve. COL11A1 is a potential therapeutic target in breast cancer and may be involved in tumor immune infiltration; its high expression is strongly associated with poor prognosis.
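
The C-index reported for the nomogram measures how often a model ranks patients correctly by risk. As a minimal sketch of that metric, the Python function below computes Harrell's concordance index. It is not the authors' code: the function name `concordance_index`, the simplification of skipping pairs with tied survival times, and the toy data are all assumptions made for illustration, and none of the numbers come from the study.

```python
from itertools import combinations


def concordance_index(times, events, risks):
    """Harrell's C-index (hypothetical helper, illustration only).

    times  : observed follow-up times
    events : 1 if the event (death) was observed, 0 if censored
    risks  : model risk scores; higher means worse predicted prognosis
    Pairs with identical times are skipped, a simplifying assumption.
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # simplification: ignore pairs with tied times
        # Identify which patient of the pair has the shorter follow-up.
        first, second = (i, j) if times[i] < times[j] else (j, i)
        if not events[first]:
            continue  # shorter time is censored, so the pair is not comparable
        comparable += 1
        if risks[first] > risks[second]:
            concordant += 1.0  # higher risk matched the earlier event
        elif risks[first] == risks[second]:
            concordant += 0.5  # tied risk scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data, invented for this example (not from the study).
toy_times = [2, 4, 6, 8]
toy_events = [1, 1, 0, 1]
toy_risks = [0.9, 0.7, 0.8, 0.1]
toy_c_index = concordance_index(toy_times, toy_events, toy_risks)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so a C-index of 0.776 means that in roughly 78% of comparable patient pairs the patient with the higher predicted risk had the earlier event.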