AUTHOR=Montesinos-López Abelardo, Crespo-Herrera Leonardo, Dreisigacker Susanna, Gerard Guillermo, Vitale Paolo, Saint Pierre Carolina, Govindan Velu, Tarekegn Zerihun Tadesse, Flores Moisés Chavira, Pérez-Rodríguez Paulino, Ramos-Pulido Sofía, Lillemo Morten, Li Huihui, Montesinos-López Osval A., Crossa Jose

TITLE=Deep learning methods improve genomic prediction of wheat breeding

JOURNAL=Frontiers in Plant Science

VOLUME=15

YEAR=2024

URL=https://www.frontiersin.org/journals/plant-science/articles/10.3389/fpls.2024.1324090

DOI=10.3389/fpls.2024.1324090

ISSN=1664-462X

ABSTRACT=<p>In plant breeding, various machine learning models have been developed and studied to evaluate the genomic prediction (GP) accuracy of unseen phenotypes. Deep learning has shown promise. However, most studies on deep learning in plant breeding have been limited to small datasets, and only a few have explored its application to moderate-sized datasets. In this study, we aimed to address this limitation by using a moderately large dataset. We examined the performance of a deep learning (DL) model and compared it with the widely used and powerful genomic best linear unbiased prediction (GBLUP) model. The goal was to assess the GP accuracy of the DL model under a five-fold cross-validation strategy and when predicting complete environments. The results revealed that the DL model outperformed the GBLUP model in GP accuracy for two of the five included traits under the five-fold cross-validation strategy, with similar results for the other traits. This indicates the superiority of the DL model in predicting these two traits. Furthermore, when predicting complete environments using the leave-one-environment-out (LOEO) approach, the DL model demonstrated competitive performance. The DL model employed in this study extends a previously proposed multi-modal DL model, which had been applied primarily to image data and only with small datasets. By using a moderately large dataset, we were able to evaluate the performance and potential of the DL model in a more information-rich and more challenging plant breeding scenario.</p>