Impact of different genomic relationship matrix construction methods on the accuracy of genomic prediction in different species

Wang, Shiyi; Wei, Yingjia; Liu, Dengying; Zhang, Xiangzhe; Wang, Qishan; Pan, Yuchun; Ma, Peipei

doi:10.3389/fgene.2025.1576248

ORIGINAL RESEARCH article

Front. Genet.

Sec. Statistical Genetics and Methodology

Volume 16 - 2025 | doi: 10.3389/fgene.2025.1576248


Provisionally accepted

Shiyi Wang¹

Yingjia Wei¹

Xiangzhe Zhang¹

¹Shanghai Key Laboratory of Veterinary Biotechnology, School of Agriculture and Biology, Shanghai Jiao Tong University, Shanghai, China
²Department of Animal Science and Technology, College of Animal Science, Zhejiang University, Hangzhou, Zhejiang Province, China

The final, formatted version of the article will be published soon.

Genomic best linear unbiased prediction (GBLUP) is a key method in genomic prediction and relies on the construction of a genomic relationship matrix (G-matrix). Although various methods for G-matrix construction have been proposed, their performance across different species has not been thoroughly compared. This study evaluated the prediction accuracy of GBLUP models using six G-matrix construction methods in pigs, bulls, wheat, and mice. The methods comprised an initial unscaled matrix and five scaled methods that used allele frequencies to center the genotypes. Among the scaled methods, three weighted by the sum of expected variance across loci, computed with allele frequencies set at 0.5 (G05), observed frequencies (GOF), or average minor allele frequencies (GMF). The other two methods centered the matrix with observed allele frequencies and weighted it by the trace of the numerator matrix (GN) or by the reciprocal of each locus's expected variance (GD). The GD matrix significantly improved prediction accuracy for pig traits, whereas most scaled G-matrices had little effect on accuracy for mice, wheat, and bulls, and sometimes underperformed the original unscaled matrix. The study concluded that the optimal G-matrix construction method varies across species, with population structure being a key factor. Once the reference population size and genetic marker density reached a certain scale, the choice of G-matrix had minimal impact on prediction accuracy. These findings highlight the importance of species-specific optimization in genomic prediction and suggest that the influence of G-matrix construction diminishes in large-scale, high-density genomic datasets.
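The five scaled constructions described in the abstract can be illustrated with a minimal NumPy sketch. The abstract gives no formulas, so everything here is an assumption: genotypes are coded 0/1/2, the same frequency vector is used for both centering and scaling in G05, GOF, and GMF, GN is normalized so its mean diagonal equals 1, and GD weights each locus by the reciprocal of 2p(1-p) and averages over loci. The function name `scaled_g_matrices` is hypothetical, and the unscaled matrix is omitted because the abstract does not define it.

```python
import numpy as np

def scaled_g_matrices(M):
    """Return a dict of five scaled G-matrices for genotypes M (n x m, coded 0/1/2).

    Illustrative only: the formulas are assumed from the abstract's wording,
    not taken from the article's methods section.
    """
    M = np.asarray(M, dtype=float)
    p = M.mean(axis=0) / 2.0                 # observed allele frequency per locus
    keep = (p > 0.0) & (p < 1.0)             # drop monomorphic loci (zero expected variance)
    M, p = M[:, keep], p[keep]
    n, m = M.shape

    def center_and_scale(freq):
        # Center with `freq`, then divide by the summed expected variance 2*sum(f(1-f)).
        Z = M - 2.0 * freq
        return Z @ Z.T / (2.0 * np.sum(freq * (1.0 - freq)))

    maf = np.minimum(p, 1.0 - p).mean()      # average minor allele frequency (scalar)
    Z = M - 2.0 * p                          # centered with observed frequencies
    ZZt = Z @ Z.T

    return {
        "G05": center_and_scale(np.full(m, 0.5)),
        "GOF": center_and_scale(p),
        "GMF": center_and_scale(np.full(m, maf)),
        "GN": ZZt / (np.trace(ZZt) / n),                  # mean diagonal becomes 1
        "GD": (Z / (2.0 * p * (1.0 - p))) @ Z.T / m,      # per-locus reciprocal-variance weights
    }
```

Under these assumptions, GOF, GN, and GD share the same centering and differ only in how loci are weighted, which is one way to see why their predictions may converge when marker density is high.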

Keywords: genomic relationship matrix, accuracy of prediction, different species, size of reference population, marker density

Received: 13 Feb 2025; Accepted: 08 Apr 2025.

Copyright: © 2025 Wang, Wei, Liu, Zhang, Wang, Pan and Ma. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence:
Yuchun Pan, Department of Animal Science and Technology, College of Animal Science, Zhejiang University, Hangzhou, 310058, Zhejiang Province, China
Peipei Ma, Shanghai Key Laboratory of Veterinary Biotechnology, School of Agriculture and Biology, Shanghai Jiao Tong University, Shanghai, 200240, China
