AUTHOR=He Fengyu , Ding Shuangcheng , Wang Hongwei , Qin Feng TITLE=IntAssoPlot: An R Package for Integrated Visualization of Genome-Wide Association Study Results With Gene Structure and Linkage Disequilibrium Matrix JOURNAL=Frontiers in Genetics VOLUME=11 YEAR=2020 URL=https://www.frontiersin.org/journals/genetics/articles/10.3389/fgene.2020.00260 DOI=10.3389/fgene.2020.00260 ISSN=1664-8021 ABSTRACT=

Genome-wide association study (GWAS), exploring the historical and evolutionary recombinations at the population level, is a major method adopted to identify quantitative trait loci (QTL) for complex traits. However, to summarize GWAS results, gene structure, and linkage disequilibrium (LD) in a single view, multiple tools are required. It is tedious to generate these three results and manually put them together; moreover, it may eventually lead to inaccuracies. On the other hand, genotype markers are usually detected by DNA- and/or RNA-Seq. For GWAS analysis based on RNA-Seq, markers from DNA-Seq provide more genetic information when displaying LD. The currently released software package does not have this function for an integrated analysis of LD, using genotypic markers different from that in association analysis. Here, we present an R package, IntAssoPlot, which provides an integrated visual display of GWAS results, along with LD and gene structure information, in a publication-ready format. The main panel of an IntAssoPlot plot has a connecting line linking the genome-wide association P-values on the -log10 scale with the gene structure and LD matrix. Importantly, IntAssoPlot is designed to plot GWAS results with LD calculated from genotypes different from those in GWAS analysis. IntAssoPlot provides a powerful visualization tool to gain an integrated insight into GWAS results. The functions provided by IntAssoPlot increase the efficiency by revealing GWAS results in a publication-ready format. Inspection of the output image can provide important biological information, including the loci that passed the genome-wide significance threshold, genes located at or near the significant loci, and the extent of LD within the selected region.