ORIGINAL RESEARCH article

Front. Med.
Sec. Pulmonary Medicine
Volume 11 - 2024 | doi: 10.3389/fmed.2024.1424750
This article is part of the Research Topic: Advancements in Multimodal Data Analysis for Lung Tumor Diagnosis

Construction of a Risk Screening and Visualization System for Pulmonary Nodule in Physical Examination Population Based on Feature Self-Recognition Machine Learning Model

Provisionally accepted
Kaiwen Hou 1*, Fang Tian 1, Yongchun Lin 1, Liangjiao Wang 1, Fei Fang 2
  • 1 Outpatient Department, Western Theater General Hospital, Chengdu, China
  • 2 Emergency Department, General Hospital of Tibetan Military Command Lhasa, Lhasa, China

The final, formatted version of the article will be published soon.

    Objective: To assess the effectiveness of a feature self-recognition machine learning model in screening for pulmonary nodule risk in a physical examination population and to evaluate the constructed visualization system.

    Methods: We analyzed data from 4861 individuals who underwent chest CT exams during their physical examinations at the Western Theater General Hospital of the People's Liberation Army from January 2023 to November 2023. Among them, 1168 had positive CT reports for pulmonary nodules, while 3693 had negative findings. We developed a machine learning model using the XGBoost algorithm and employed an improved sooty tern optimization algorithm (ISTOA) for feature selection. The significance of the selected features was evaluated through univariate analysis and multivariable logistic stepwise regression analysis. A visualization system was created to estimate the risk of developing pulmonary nodules.

    Results: Multivariable analysis identified older age, smoking or passive smoking, high psychological stress within the past year, occupational exposure (e.g., air pollution at the workplace), presence of chronic lung diseases, and elevated carcinoembryonic antigen levels as significant risk factors for pulmonary nodules. The feature self-recognition machine learning model further highlighted age, smoking or passive smoking, high psychological stress, occupational exposure, chronic lung diseases, family history of lung cancer, decreased albumin levels, and elevated carcinoembryonic antigen as key predictors for early pulmonary nodule risk, demonstrating superior performance.

    Conclusion: The feature self-recognition machine learning model effectively aids in the early prediction and clinical identification of pulmonary nodule risk, facilitating timely intervention and improving patient prognosis.
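
As a rough illustration of the arithmetic behind the abstract, the sketch below checks the reported cohort counts and shows how a logistic regression model turns a set of risk-factor indicators into a probability. It is not the authors' model or their visualization system: the function names, the feature names, and above all the intercept and coefficient values are hypothetical placeholders chosen only for illustration, since the abstract reports no fitted values.

```python
import math

# Cohort counts as reported in the abstract.
N_TOTAL = 4861
N_POSITIVE = 1168   # positive CT reports for pulmonary nodules
N_NEGATIVE = 3693   # negative CT reports


def prevalence(positive, total):
    """Fraction of the cohort with a positive finding."""
    return positive / total


def logistic_risk(intercept, coefficients, features):
    """Map a linear score to a probability with the logistic function.

    `coefficients` and `features` are dicts keyed by feature name;
    every feature passed in must have a coefficient.
    """
    score = intercept + sum(
        coefficients[name] * value for name, value in features.items()
    )
    return 1.0 / (1.0 + math.exp(-score))


if __name__ == "__main__":
    # The two groups should add up to the full cohort.
    assert N_POSITIVE + N_NEGATIVE == N_TOTAL
    print(f"Nodule prevalence: {prevalence(N_POSITIVE, N_TOTAL):.1%}")

    # HYPOTHETICAL intercept and coefficients, NOT taken from the article.
    # They only demonstrate how binary risk-factor indicators combine.
    intercept = -2.0
    coefficients = {
        "smoking_or_passive_smoking": 0.6,
        "chronic_lung_disease": 0.8,
        "occupational_exposure": 0.5,
    }
    person = {
        "smoking_or_passive_smoking": 1,
        "chronic_lung_disease": 0,
        "occupational_exposure": 1,
    }
    print(f"Illustrative risk: {logistic_risk(intercept, coefficients, person):.2f}")
```

With fitted coefficients in place of the placeholders, the same function would reproduce the per-person probability that a regression-based risk calculator displays. A tree-based model such as XGBoost produces its probability differently, so this sketch corresponds only to the logistic regression step named in the Methods.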

    Keywords: machine learning, pulmonary nodules, risk screening, visualization system, algorithm

    Received: 28 Apr 2024; Accepted: 22 Oct 2024.

    Copyright: © 2024 Hou, Tian, Lin, Wang and Fang. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

    * Correspondence: Kaiwen Hou, Outpatient Department, Western Theater General Hospital, Chengdu, China

    Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.