AUTHOR=Chierici Marco, Bussola Nicole, Marcolini Alessia, Francescatto Margherita, Zandonà Alessandro, Trastulla Lucia, Agostinelli Claudio, Jurman Giuseppe, Furlanello Cesare
TITLE=Integrative Network Fusion: A Multi-Omics Approach in Molecular Profiling
JOURNAL=Frontiers in Oncology
VOLUME=10
YEAR=2020
URL=https://www.frontiersin.org/journals/oncology/articles/10.3389/fonc.2020.01065
DOI=10.3389/fonc.2020.01065
ISSN=2234-943X
ABSTRACT=
Recent technological advances and international efforts, such as The Cancer Genome Atlas (TCGA), have made available several pan-cancer datasets encompassing multiple omics layers with detailed clinical information in large collections of samples. The need has thus arisen for computational methods aimed at improving cancer subtyping and biomarker identification from multi-modal data. Here we apply the Integrative Network Fusion (INF) pipeline, which combines multiple omics layers by exploiting Similarity Network Fusion (SNF) within a machine learning predictive framework. INF includes a feature ranking scheme (rSNF) on SNF-integrated features, used by a classifier over juxtaposed multi-omics features (juXT). In particular, we show instances of INF using Random Forest (RF) and linear Support Vector Machine (LSVM) as the classifier; two baseline RF and LSVM models are also trained on juXT. Finally, a compact RF model, called rSNFi, is derived by training on the intersection of the top-ranked biomarkers from the two approaches, juXT and rSNF. All the classifiers are run in a 10×5-fold cross-validation scheme to ensure reproducibility, following the guidelines for an unbiased Data Analysis Plan by the US FDA-led initiatives MAQC/SEQC. INF is demonstrated on four classification tasks on three multi-modal TCGA oncogenomics datasets. Gene expression, protein expression and copy number variants are used to predict estrogen receptor status (BRCA-ER,