ORIGINAL RESEARCH article

Front. Plant Sci.
Sec. Plant Breeding
Volume 15 - 2024 | doi: 10.3389/fpls.2024.1429976

Multi-trait modeling and machine learning discover new markers associated with stem traits in alfalfa

Provisionally accepted
Cesar Medina 1, Deborah J. Heuschele 2, Dongyan Zhao 3, Meng Lin 3, Craig T. Beil 3, Moira J. Sheehan 3, Zhanyou Xu 2*
  • 1 Department of Agronomy and Plant Genetics, College of Food, Agricultural and Natural Resource Sciences, University of Minnesota, St. Paul, Minnesota, United States
  • 2 Plant Science Research Unit, Agricultural Research Service (USDA), St. Paul, Minnesota, United States
  • 3 Breeding Insight, Institute of Biotechnology, Cornell University, Ithaca, New York, United States

The final, formatted version of the article will be published soon.

    Alfalfa biomass can be fractionated into leaf and stem components. Leaves comprise a protein-rich and highly digestible portion of biomass for ruminant animals, while stems constitute a high-fiber, less digestible fraction that represents 50 to 70% of the biomass. However, little attention has been paid to stem-related traits, which are key to improving the nutritional value and intake potential of alfalfa. This study aimed to identify molecular markers associated with four morphological traits in a panel of five alfalfa populations generated over two cycles of divergent selection based on 16-h and 96-h in vitro neutral detergent fiber digestibility in stems. Phenotypic traits of stem color, presence of stem pith cells, winter standability, and winter injury were modeled using univariate and multivariate spatial mixed linear models (MLM), and the predicted values were used as response variables in genome-wide association studies (GWAS). The alfalfa panel was genotyped using 3K DArTag SNP markers to evaluate genetic structure and to perform GWAS. Principal component and population structure analyses revealed differentiation between populations selected for high and low digestibility. Thirteen molecular markers were significantly associated with stem traits using either univariate or multivariate MLM. Additionally, support vector machine (SVM) and random forest (RF) algorithms were implemented to determine marker importance scores for stem traits and to validate the GWAS results. The top-ranked markers from SVM and RF aligned with GWAS findings for solid stem pith, winter standability, and winter injury. SVM also identified further markers with high variable importance for solid stem pith and winter injury. Most molecular markers were located in coding regions. These markers can facilitate marker-assisted selection and expedite breeding programs that aim to increase winter hardiness or stem palatability.
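
    The general idea of ranking markers by their association with a trait can be illustrated with a deliberately simplified sketch. It is not the spatial MLM, GWAS, SVM, or RF procedure used in the study: it scores each marker by the squared Pearson correlation between allele dosage (0 to 4 copies, since cultivated alfalfa is autotetraploid) and a simulated trait value, with no correction for population structure or field position. All names (`pearson_r`, `rank_markers`, `simulate`, `marker_7`), the effect size, and the data are hypothetical and chosen only for illustration.

```python
import random


def pearson_r(x, y):
    """Pearson correlation of two equal-length numeric lists (0.0 if either is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def rank_markers(dosages, phenotype):
    """Rank markers by squared correlation (r^2) between dosage and phenotype.

    dosages: dict mapping marker name -> list of allele dosages, one per plant.
    phenotype: list of trait values, one per plant, in the same plant order.
    Returns a list of (marker, r2) tuples sorted from highest to lowest r2.
    """
    scores = {name: pearson_r(d, phenotype) ** 2 for name, d in dosages.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


def simulate(n_plants=200, n_markers=50, causal="marker_7", seed=1):
    """Simulate random tetraploid dosages and a trait driven by one causal marker."""
    rng = random.Random(seed)
    dosages = {
        f"marker_{j}": [rng.randint(0, 4) for _ in range(n_plants)]
        for j in range(n_markers)
    }
    # Assumed effect: each extra allele copy adds 0.8 units, plus Gaussian noise.
    phenotype = [0.8 * d + rng.gauss(0, 1) for d in dosages[causal]]
    return dosages, phenotype


dosages, phenotype = simulate()
ranking = rank_markers(dosages, phenotype)
top_marker, top_r2 = ranking[0]
print(top_marker, round(top_r2, 3))
```

    In this toy setting the simulated causal marker is expected to rank first because its effect is large relative to the noise. Real analyses need the structure-aware models described above, since unmodeled relatedness between populations can produce spurious associations.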

    Keywords: alfalfa, stem traits, GWAS, multivariate modeling, machine learning

    Received: 09 May 2024; Accepted: 30 Jul 2024.

    Copyright: © 2024 Medina, Heuschele, Zhao, Lin, Beil, Sheehan and Xu. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

    * Correspondence: Zhanyou Xu, Plant Science Research Unit, Agricultural Research Service (USDA), St. Paul, 55108, Minnesota, United States

    Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.