AUTHOR=Montesinos-López Osval A., Crespo-Herrera Leonardo, Pierre Carolina Saint, Cano-Paez Bernabe, Huerta-Prado Gloria Isabel, Mosqueda-González Brandon Alejandro, Ramos-Pulido Sofia, Gerard Guillermo, Alnowibet Khalid, Fritsche-Neto Roberto, Montesinos-López Abelardo, Crossa José
TITLE=Feature engineering of environmental covariates improves plant genomic-enabled prediction
JOURNAL=Frontiers in Plant Science
VOLUME=15
YEAR=2024
URL=https://www.frontiersin.org/journals/plant-science/articles/10.3389/fpls.2024.1349569
DOI=10.3389/fpls.2024.1349569
ISSN=1664-462X
ABSTRACT=

Introduction

Because genomic selection (GS) is a predictive methodology, it must deliver high prediction accuracy to be useful in practice. However, because many factors affect its prediction performance, its practical implementation still needs improvement in many breeding programs. For this reason, many strategies have been explored to improve its prediction performance.

Methods

Incorporating environmental covariates as inputs in genomic prediction models only sometimes increases prediction performance. For this reason, this investigation explores feature engineering of the environmental covariates to enhance the prediction performance of genomic prediction models.

Results and discussion

We found that, across data sets, feature engineering reduced prediction error by 761.625% across predictors relative to including the environmental covariates without feature engineering. These results are very promising regarding the potential of feature engineering to enhance prediction accuracy. However, because a significant gain in prediction accuracy was observed in only some data sets, further research is required to establish a robust feature engineering strategy for incorporating the environmental covariates.