AUTHOR=Montesinos-López Abelardo , Crespo-Herrera Leonardo , Dreisigacker Susanna , Gerard Guillermo , Vitale Paolo , Saint Pierre Carolina , Govindan Velu , Tarekegn Zerihun Tadesse , Flores Moisés Chavira , Pérez-Rodríguez Paulino , Ramos-Pulido Sofía , Lillemo Morten , Li Huihui , Montesinos-López Osval A. , Crossa Jose TITLE=Deep learning methods improve genomic prediction of wheat breeding JOURNAL=Frontiers in Plant Science VOLUME=15 YEAR=2024 URL=https://www.frontiersin.org/journals/plant-science/articles/10.3389/fpls.2024.1324090 DOI=10.3389/fpls.2024.1324090 ISSN=1664-462X ABSTRACT=

In plant breeding, various machine learning models have been developed and studied to evaluate the genomic prediction (GP) accuracy of unseen phenotypes. Deep learning (DL) has shown promise. However, most studies on DL in plant breeding have been limited to small datasets, and only a few have explored its application to moderate-sized datasets. In this study, we aimed to address this limitation by using a moderately large dataset. We examined the performance of a DL model and compared it with the widely used and powerful genomic best linear unbiased prediction (GBLUP) model. The goal was to assess the GP accuracy of the DL model under a five-fold cross-validation strategy and when predicting complete environments. In the five-fold cross-validation strategy, the DL model outperformed the GBLUP model in GP accuracy for two of the five traits studied and gave similar results for the other three, indicating that the DL model is superior for those two traits. Furthermore, when predicting complete environments using the leave-one-environment-out (LOEO) approach, the DL model demonstrated competitive performance. It is worth noting that the DL model employed in this study extends a previously proposed multi-modal DL model, which had been applied primarily to image data and only with small datasets. By using a moderately large dataset, we were able to evaluate the performance and potential of the DL model in a more information-rich and challenging plant breeding scenario.
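
The two validation strategies named in the abstract differ only in how records are partitioned into training and testing sets. The following is a minimal sketch of that difference, not the authors' implementation: the abstract does not state how folds were formed, so the random assignment of records to folds, the function names `five_fold_splits` and `loeo_splits`, and the toy environment labels are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def five_fold_splits(n_records, n_folds=5, seed=0):
    """Yield (train_idx, test_idx) pairs for random k-fold cross-validation.

    Assumption: records are assigned to folds at random; the abstract
    does not specify the assignment scheme.
    """
    indices = list(range(n_records))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into n_folds nearly equal groups.
    folds = [indices[i::n_folds] for i in range(n_folds)]
    for k in range(n_folds):
        test_idx = sorted(folds[k])
        train_idx = sorted(
            i for j, fold in enumerate(folds) if j != k for i in fold
        )
        yield train_idx, test_idx


def loeo_splits(environments):
    """Yield (held_out_env, train_idx, test_idx), one environment at a time.

    Every record from the held-out environment goes to the test set, so
    the model never sees that environment during training.
    """
    by_env = defaultdict(list)
    for idx, env in enumerate(environments):
        by_env[env].append(idx)
    for env in sorted(by_env):
        test_idx = by_env[env]
        train_idx = [
            i for i in range(len(environments)) if environments[i] != env
        ]
        yield env, train_idx, test_idx


# Hypothetical toy data: 10 records observed in 3 made-up environments.
envs = ["E1", "E1", "E1", "E2", "E2", "E2", "E2", "E3", "E3", "E3"]
cv_folds = list(five_fold_splits(len(envs)))
loeo_folds = list(loeo_splits(envs))
```

In the five-fold case every environment typically contributes records to both the training and testing sets, whereas LOEO withholds an entire environment, which is why predicting complete environments is generally the harder task.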