Analyzing gene expression profiles (GEP) through artificial intelligence provides meaningful insight into cancer. This study introduces DeepSHAP Autoencoder Filter for Genes Selection (DSAF-GS), a novel deep learning and explainable artificial intelligence-based approach for feature selection in genomic-scale data. DSAF-GS exploits the autoencoder’s reconstruction capabilities without changing the original feature space, enhancing the interpretation of the results. Explainable artificial intelligence is then used to select the informative genes for chronic lymphocytic leukemia (CLL) prognosis in 217 cases from a GEP database comprising roughly 20,000 genes. The model for prognosis prediction achieved an accuracy of 86.4%, a sensitivity of 85.0%, and a specificity of 87.5%. According to the proposed approach, predictions were strongly influenced by CEACAM19 and PIGP, moderately influenced by MKL1 and GNE, and weakly influenced by other genes. The 10 most influential genes were selected for further analysis. Among them, FADD, FIBP, GNE, IGF1R, MKL1, PIGP, and SLC39A6 were identified in the Reactome pathway database as involved in signal transduction, transcription, protein metabolism, immune system, cell cycle, and apoptosis. Moreover, according to the network model of the 3D protein-protein interaction (PPI) explored using the NetworkAnalyst tool, FADD, FIBP, IGF1R, QTRT1, GNE, SLC39A6, and MKL1 appear coupled into a complex network. Finally, all 10 selected genes showed predictive power for time to first treatment (TTFT) in univariate analyses on a basic prognostic model including IGHV mutational status, del(11q) and del(17p), NOTCH1 mutations, β2-microglobulin, Rai stage, and B-lymphocytosis, factors known to predict TTFT in CLL.
However, only IGF1R [hazard ratio (HR) 1.41, 95% CI 1.08-1.84, P=0.013], COL28A1 (HR 0.32, 95% CI 0.10-0.97, P=0.045), and QTRT1 (HR 7.73, 95% CI 2.48-24.04, P<0.001) were significantly associated with TTFT in multivariable analyses when combined with the prognostic factors of the basic model, ultimately increasing Harrell’s c-index and the explained variation to 78.6% (versus 76.5% for the basic prognostic model) and 52.6% (versus 42.2% for the basic prognostic model), respectively. The goodness of model fit was also enhanced (χ2 = 20.1, P=0.002), indicating improved performance over the basic prognostic model. In conclusion, DSAF-GS identified a group of significant genes for CLL prognosis, suggesting future directions for bio-molecular research.
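For readers unfamiliar with the metric, Harrell's c-index reported above (78.6% versus 76.5%) can be read as the share of comparable patient pairs in which the patient with the shorter observed time also has the higher predicted risk. The following is a minimal pure-Python sketch of that definition, not the authors' implementation. The function name `harrell_c_index`, the handling of tied times (skipped for simplicity), and the toy data are assumptions made for illustration only.

```python
from itertools import combinations


def harrell_c_index(times, events, risk_scores):
    """Fraction of comparable pairs that are concordant.

    A pair is comparable when the subject with the shorter observed time
    had the event (was not censored). Tied risk scores count as 0.5.
    Pairs with tied times are skipped here, which is a simplification.
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # simplification: ignore tied observation times
        # 'a' is the subject with the shorter observed time
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # shorter time is censored, so the pair is not comparable
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data (months to treatment, event flag, model risk score).
toy_times = [5, 10, 15, 20]
toy_events = [1, 1, 0, 1]  # 0 marks a censored observation
toy_risk = [0.9, 0.4, 0.6, 0.1]
toy_c = harrell_c_index(toy_times, toy_events, toy_risk)
print(f"c-index on toy data: {toy_c:.2f}")
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which is why a rise from 76.5% to 78.6% is read as a modest gain in discrimination.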
Background: Hematological malignancies (HMs) represent a heterogeneous group of diseases with diverse etiology, pathogenesis, and prognosis. Accurate registration of HMs by Cancer Registries (CRs) is hampered by the progressive de-hospitalization of patients and the transition from microscopic to molecular diagnosis.
Materials and methods: Dedicated software capable of automatically identifying suspected HM cases by combining several databases was adopted by the Reggio Emilia Province CR (RE-CR). Besides pathological reports, hospital discharge archives, and mortality records, RE-CR retrieved information from general and biomolecular laboratories. Incidence, mortality, and 5-year relative survival (RS) were reported by age, sex, and the four main HM categories.
Results: Overall, 7,578 HM cases were diagnosed from 1996 to 2020 by RE-CR. HMs were more common in males and older patients, except for Hodgkin lymphoma and follicular lymphoma (FL). Incidence increased significantly for FL (annual percent change [APC]=3.0), for myeloproliferative neoplasms (MPN) in the first period (APC=6.0), followed by a significant decrease (APC=-7.4), and for myelodysplastic syndromes in the first period only (APC=16.4). Over the years, 5-year RS increased significantly for Hodgkin, marginal zone, follicular, and diffuse large B-cell lymphomas, MPN, and acute myeloid leukemia. The dedicated software made it possible to recover 80% of cases automatically; the remaining 20% required direct consultation of medical records.
Conclusions: The study emphasizes that HM registration requires information from multiple sources. The digitalization of CRs is necessary to increase their efficiency.
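The annual percent change (APC) figures in the Results can be understood through the usual log-linear definition: fit ln(rate) against calendar year and convert the slope b to a percentage via APC = (exp(b) - 1) × 100. The abstract does not state how the APCs or the period boundaries were estimated, so the sketch below is only an illustration for a single period. The function name `annual_percent_change` and the toy rates are hypothetical.

```python
import math


def annual_percent_change(years, rates):
    """APC from an ordinary least-squares fit of ln(rate) on year.

    Covers one period only; locating change points between periods
    (e.g. a 'first period' and a later one) is out of scope here.
    """
    if len(years) != len(rates) or len(years) < 2:
        raise ValueError("need at least two (year, rate) pairs of equal length")
    log_rates = [math.log(r) for r in rates]  # rates must be positive
    n = len(years)
    mean_year = sum(years) / n
    mean_log = sum(log_rates) / n
    sxx = sum((y - mean_year) ** 2 for y in years)
    sxy = sum((y - mean_year) * (lr - mean_log)
              for y, lr in zip(years, log_rates))
    slope = sxy / sxx
    return (math.exp(slope) - 1.0) * 100.0


# Hypothetical toy series: a rate growing by exactly 3% per year.
toy_years = list(range(2000, 2006))
toy_rates = [10.0 * 1.03 ** k for k in range(6)]
toy_apc = annual_percent_change(toy_years, toy_rates)
print(f"APC on toy data: {toy_apc:.1f}")
```

Under this definition, an APC of 3.0 means the rate grows by about 3% each year within the period, and a negative APC such as -7.4 means a yearly decline.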
Introduction: Diffuse large B-cell lymphoma (DLBCL) is the most common subtype of lymphoma. Clinical biomarkers are still required to identify high-risk DLBCL patients. Therefore, we developed and validated the platelet-to-albumin (PTA) ratio as a prognostic predictor for DLBCL patients.
Methods: A group of 749 patients was randomly divided into a training set (600 patients) and an internal validation set (149 patients). An independent cohort of 110 patients from another hospital was enrolled as an external validation set. Penalized smoothing spline (PS) Cox regression models were used to explore the non-linear relationship between the PTA ratio and both overall survival (OS) and progression-free survival (PFS).
Results: A U-shaped relation between the PTA ratio and PFS was identified in the training set. A PTA ratio below 2.7 or above 8.6 was associated with shorter PFS. The PTA ratio also added prognostic value to the well-established predictors. Moreover, the U-shaped relation between the PTA ratio and PFS was confirmed in both validation sets.
Discussion: A U-shaped association between the PTA ratio and PFS was found in patients with DLBCL. The PTA ratio can be used as a biomarker and may reflect abnormalities in both host nutritional status and systemic inflammation in DLBCL.
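The cohort split and the U-shaped cut-offs described above can be illustrated with a short sketch. It assumes the PTA ratio is simply platelet count divided by albumin in whatever units the study used, which the abstract does not state. The helper names (`pta_ratio`, `in_shorter_pfs_range`, `split_cohort`), the random seed, and the integer patient IDs are hypothetical. The cut-offs 2.7 and 8.6 and the 600/149 split are taken from the text.

```python
import random

LOW_CUTOFF = 2.7   # below this, PFS was shorter (from the Results)
HIGH_CUTOFF = 8.6  # above this, PFS was shorter (from the Results)


def pta_ratio(platelets, albumin):
    """Platelet-to-albumin ratio; units are assumed to match the study's."""
    if albumin <= 0:
        raise ValueError("albumin must be positive")
    return platelets / albumin


def in_shorter_pfs_range(ratio):
    """True when the ratio falls in either tail of the U-shaped relation."""
    return ratio < LOW_CUTOFF or ratio > HIGH_CUTOFF


def split_cohort(patient_ids, n_train, seed=0):
    """Random split into training and internal validation sets."""
    ids = list(patient_ids)
    random.Random(seed).shuffle(ids)  # seeded for reproducibility
    return ids[:n_train], ids[n_train:]


# 749 hypothetical patient IDs split 600 / 149, as in the Methods.
train_ids, valid_ids = split_cohort(range(749), n_train=600)
print(len(train_ids), len(valid_ids))
```

A simple two-sided flag like this loses the shape of the fitted spline curve, so it is best seen as a reading aid for the reported thresholds rather than a substitute for the penalized spline Cox model.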
Maintenance treatment is a pivotal part of the overall management of multiple myeloma (MM), further deepening response and improving survival. However, real-world evidence on maintenance in non-transplant MM patients is limited. Here, we retrospectively analyzed the efficacy and survival of 375 non-transplant MM patients from 11 centers in north China between 2010 and 2021. After a median of seven cycles of front-line regimens, 141, 79, and 155 patients received lenalidomide maintenance (L-MT), bortezomib maintenance (B-MT), and thalidomide maintenance (T-MT), respectively. Patients on L-MT and B-MT had significantly greater proportions of high-risk cytogenetic abnormalities (HRCAs) detected by fluorescence in situ hybridization (FISH), defined as 1q21 gain, 17p deletion, or adverse immunoglobulin heavy chain (IgH) translocations. Although progression-free survival (PFS) and overall survival (OS) were comparable among the three groups, L-MT and B-MT mitigated the negative impact of HRCAs on survival (PFS of patients with HRCAs vs. patients without HRCAs: L-MT, 26.9 vs. 39.2 months, p=0.19; B-MT, 20.0 vs. 29.7 months, p=0.36; OS not reached in all groups). Patients with HRCAs in the T-MT group had inferior clinical outcomes compared to standard-risk patients (PFS, 12.1 vs. 22.8 months, p=0.02, HR=1.8, 95% CI 1.0–3.4; OS, 54.9 months vs. not reached (NR), p<0.001, HR=3.2, 95% CI 1.5–7.0). Achieving complete response (CR) after induction therapy led to superior PFS compared to other degrees of response, regardless of maintenance medication. Furthermore, maintenance duration over 24 months correlated with favorable survival. Given the large gap in transplant eligibility in China, optimizing maintenance therapy is important for non-transplant MM patients.
In this real-world multicenter study, our findings suggest that clinicians prefer to prescribe lenalidomide or bortezomib as maintenance therapy in high-risk settings, and that these agents are superior to thalidomide in non-transplant MM patients. Achievement of CR and maintenance duration over 2 years are positive factors that influence survival.