AUTHOR=Verma Shefali S., de Andrade Mariza, Tromp Gerard, Kuivaniemi Helena, Pugh Elizabeth, Namjou-Khales Bahram, Mukherjee Shubhabrata, Jarvik Gail P., Kottyan Leah C., Burt Amber, Bradford Yuki, Armstrong Gretta D., Derr Kimberly, Crawford Dana C., Haines Jonathan L., Li Rongling, Crosslin David, Ritchie Marylyn D. TITLE=Imputation and quality control steps for combining multiple genome-wide datasets JOURNAL=Frontiers in Genetics VOLUME=5 YEAR=2014 URL=https://www.frontiersin.org/journals/genetics/articles/10.3389/fgene.2014.00370 DOI=10.3389/fgene.2014.00370 ISSN=1664-8021 ABSTRACT=
The electronic MEdical Records and GEnomics (eMERGE) network brings together DNA biobanks linked to electronic health records (EHRs) from multiple institutions. Approximately 51,000 DNA samples from distinct individuals have been genotyped using genome-wide SNP arrays across the nine sites of the network. The eMERGE Coordinating Center and the Genomics Workgroup developed a pipeline to impute and merge genomic data across the different SNP arrays to maximize sample size and power to detect associations with a variety of clinical endpoints. The 1000 Genomes cosmopolitan reference panel was used for imputation. Imputation results were evaluated using the following metrics: accuracy of imputation, allelic