AUTHOR=Qin Shaopu P., Kim Jinhee, Arafat Dalia, Gibson Greg TITLE=Effect of Normalization on Statistical and Biological Interpretation of Gene Expression Profiles JOURNAL=Frontiers in Genetics VOLUME=3 YEAR=2013 URL=https://www.frontiersin.org/journals/genetics/articles/10.3389/fgene.2012.00160 DOI=10.3389/fgene.2012.00160 ISSN=1664-8021 ABSTRACT=
An under-appreciated aspect of the genetic analysis of gene expression is the impact of post-probe-level normalization on biological inference. Here we contrast nine different methods for normalization of an Illumina bead-array gene expression profiling dataset consisting of peripheral blood samples from 189 individual participants in the Center for Health Discovery and Well Being study in Atlanta. We quantify differences in the inference of global variance components and covariance of gene expression, as well as in the detection of variants that affect transcript abundance (eSNPs). The normalization strategies, all relative to raw log2 measures, include simple mean centering, two modes of transcript-level linear adjustment for technical factors, and for differential immune cell counts, variance normalization by interquartile range and by quantile, fitting the first 16 principal components, and supervised normalization using the SNM procedure with adjustment for cell counts. The robustness of genetic associations under Pearson and Spearman rank correlation is also reported for each method, and the normalization strategy is shown to have a far greater impact than the choice of correlation method. We describe similarities among methods, discuss the impact on biological interpretation, and make recommendations regarding appropriate strategies.
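As an illustration of four of the strategy families named above, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a log2 expression matrix with transcripts in rows and samples in columns. The function names, the choice to centre and scale per sample, and the covariate matrix are all assumptions made for this example. Fitting principal components and the SNM procedure are not shown.

```python
import numpy as np


def mean_center(log2_expr):
    """Subtract each sample's (column's) mean from its log2 values."""
    return log2_expr - log2_expr.mean(axis=0, keepdims=True)


def iqr_scale(log2_expr):
    """Center each sample on its median and divide by its interquartile range.

    Assumption: scaling is applied per sample; the abstract does not say so.
    """
    q1, median, q3 = np.percentile(log2_expr, [25, 50, 75], axis=0)
    return (log2_expr - median) / (q3 - q1)


def quantile_normalize(log2_expr):
    """Give every sample the same empirical distribution.

    The reference distribution is the mean of the sorted columns.
    Ties are broken by position rather than averaged.
    """
    order = np.argsort(log2_expr, axis=0)
    ranks = np.argsort(order, axis=0)  # rank of each value within its column
    reference = np.sort(log2_expr, axis=0).mean(axis=1)
    return reference[ranks]


def regress_out(log2_expr, covariates):
    """Transcript-level linear adjustment: keep the residuals after regressing
    each transcript on sample covariates (hypothetical examples: batch
    indicators or immune cell counts), shape (n_samples, n_covariates).

    An intercept is fitted, so each adjusted transcript has mean zero.
    """
    n_samples = covariates.shape[0]
    design = np.column_stack([np.ones(n_samples), covariates])
    coef, *_ = np.linalg.lstsq(design, log2_expr.T, rcond=None)
    return (log2_expr.T - design @ coef).T
```

Each function returns a matrix of the same shape as its input, so the outputs can be passed to the same downstream variance-component or association analysis and compared directly.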