AUTHOR=Halperin Rebecca F., Liang Winnie S., Kulkarni Sidharth, Tassone Erica E., Adkins Jonathan, Enriquez Daniel, Tran Nhan L., Hank Nicole C., Newell James, Kodira Chinnappa, Korn Ronald, Berens Michael E., Kim Seungchan, Byron Sara A. TITLE=Leveraging Spatial Variation in Tumor Purity for Improved Somatic Variant Calling of Archival Tumor Only Samples JOURNAL=Frontiers in Oncology VOLUME=9 YEAR=2019 URL=https://www.frontiersin.org/journals/oncology/articles/10.3389/fonc.2019.00119 DOI=10.3389/fonc.2019.00119 ISSN=2234-943X ABSTRACT=
Archival tumor samples represent a rich resource of annotated specimens for translational genomics research. However, standard variant calling approaches require a matched normal sample from the same individual, which is often not available in the retrospective setting, making it difficult to distinguish between true somatic variants and individual-specific germline variants. Archival sections often contain adjacent normal tissue, but this tissue can include infiltrating tumor cells. As existing comparative somatic variant callers are designed to exclude variants present in the normal sample, a novel approach is required to leverage adjacent normal tissue with infiltrating tumor cells for somatic variant calling. Here we present lumosVar 2.0, a software package designed to jointly analyze multiple samples from the same patient, built upon our previous single-sample, tumor-only variant caller lumosVar 1.0. The approach assumes that the allelic fractions of somatic and germline variants follow different patterns as tumor content and copy number state change. lumosVar 2.0 estimates allele-specific copy number and tumor sample fractions from the data, and uses a model to determine expected allelic fractions for somatic and germline variants and to classify variants accordingly. To evaluate the utility of lumosVar 2.0 for jointly calling somatic variants from tumor and adjacent normal samples, we used a glioblastoma dataset with matched high and low tumor content and germline whole exome sequencing data (to identify true somatic variants) available for each patient. Both sensitivity and positive predictive value were improved when analyzing the high tumor and low tumor samples jointly compared to analyzing the samples individually or