ORIGINAL RESEARCH article

Front. Plant Sci.
Sec. Technical Advances in Plant Science
Volume 15 - 2024 | doi: 10.3389/fpls.2024.1433552

Maize Yield Prediction with Trait-Missing Data via Bipartite Graph Neural Network

Provisionally accepted
Kaiyi Wang 1, Yanyun Han 1, Yuqing Zhang 2, Yong Zhang 2, Shufeng Wang 1, Feng Yang 1, Qingchun Liu 3, Dongfeng Zhang 1, Tiangang Lu 4, Like Zhang 3, Zhongqiang Liu 1*
  • 1 Information Technology Research Center, Beijing Academy of Agriculture and Forestry Sciences, Beijing, China
  • 2 Beijing Key Laboratory of Multimedia and Intelligent Software Technology, Beijing Institute of Artificial Intelligence, Beijing University of Technology, Beijing, China
  • 3 National Agriculture Technology Extension Service Center (NATESC), Beijing, China
  • 4 Beijing Digital Agriculture Promotion Center, Beijing, China

The final, formatted version of the article will be published soon.

    The timely and accurate prediction of maize (Zea mays L.) yield before harvest is critical for food security and agricultural policy development. Many researchers now use machine learning and deep learning to predict maize yield in specific regions with high accuracy. However, existing methods typically have two limitations. First, they ignore the extensive correlations in maize planting data, such as the association between yields at adjacent planting locations and the combined effect of meteorological features and maize traits on yield. Second, their performance may degrade significantly when some data in the planting records are missing or when the samples are unbalanced. This paper therefore proposes an end-to-end model, based on a bipartite graph neural network, for trait data imputation and yield prediction. The maize planting data are first converted to a bipartite graph. A yield prediction model based on a bipartite graph neural network then imputes the missing trait data and predicts maize yield. The model can mine correlations between different samples, between meteorological features and traits, and between different traits. Finally, to address the unbalanced number of samples at each planting location, we propose a loss function based on a gradient balancing mechanism that effectively reduces the impact of data imbalance on the prediction model. Compared with other data imputation and prediction models, our method achieves the best yield prediction results even when missing data are not pre-processed.
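As an illustration of the conversion step described in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes that each planting record becomes a sample node, each meteorological feature or trait becomes a feature node, and each observed value becomes a weighted edge, so that a missing trait is simply an absent edge. The function name `table_to_bipartite`, the column names, and the values are hypothetical.

```python
import math


def table_to_bipartite(rows, feature_names):
    """Convert a sample-by-feature table into bipartite graph edges.

    Sample nodes are indexed 0..n-1 and feature nodes 0..m-1.
    An edge (sample, feature, value) is created only for observed entries.
    Missing entries (None or NaN) produce no edge and are returned
    separately as the (sample, feature) pairs left to impute.
    """
    observed, missing = [], []
    for i, row in enumerate(rows):
        if len(row) != len(feature_names):
            raise ValueError(f"row {i} has {len(row)} values, "
                             f"expected {len(feature_names)}")
        for j, value in enumerate(row):
            is_nan = isinstance(value, float) and math.isnan(value)
            if value is None or is_nan:
                missing.append((i, j))
            else:
                observed.append((i, j, float(value)))
    return observed, missing


# Hypothetical records: columns and values are illustrative only,
# not taken from the paper's dataset.
features = ["mean_temp_c", "rainfall_mm", "plant_height_cm", "ear_length_cm"]
records = [
    [22.5, 410.0, 265.0, 18.2],
    [21.0, 380.0, None, 17.5],          # plant height not recorded
    [23.1, 455.0, 270.0, float("nan")],  # ear length not recorded
]

edges, to_impute = table_to_bipartite(records, features)
print(len(edges), "observed edges")
print("pairs to impute:", to_impute)
```

Under this assumed encoding, imputation amounts to predicting an edge value for each pair in `to_impute`, so no placeholder values need to be filled in beforehand.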

    Keywords: yield prediction, graph neural network, bipartite graph, data imputation, gradient harmonization

    Received: 16 May 2024; Accepted: 17 Sep 2024.

    Copyright: © 2024 Wang, Han, Zhang, Zhang, Wang, Yang, Liu, Zhang, Lu, Zhang and Liu. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

    * Correspondence: Zhongqiang Liu, Information Technology Research Center, Beijing Academy of Agriculture and Forestry Sciences, Beijing, China

    Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.