ADAM-multi: software to simulate complex breeding programs for animals and plants with different ploidy levels and generalized genotypic effect models to account for multiple alleles

Chu, Thinh Tuan; Jensen, Just

doi:10.3389/fgene.2025.1513615

ORIGINAL RESEARCH article

Front. Genet., 10 February 2025

Sec. Statistical Genetics and Methodology

Volume 16 - 2025 | https://doi.org/10.3389/fgene.2025.1513615

ADAM-multi: software to simulate complex breeding programs for animals and plants with different ploidy levels and generalized genotypic effect models to account for multiple alleles

Thinh Tuan Chu^1,2*

Just Jensen¹

¹Center for Quantitative Genetics and Genomics, Aarhus University, Aarhus, Denmark
²Faculty of Animal Science, Vietnam National University of Agriculture, Hanoi, Vietnam

Stochastic simulation software, ADAM, has been developed for the purpose of breeding optimization in animals and plants, and for validation of statistical models used in genetic evaluations. Just like other common simulation programs, ADAM assumed the bi-allelic state of quantitative trait locus (QTL). While the bi-allelic state of marker loci is due to the common choice of genotyping technology of single nucleotide polymorphism (SNP) chip, the assumption may not hold for the linked QTL. In the version of ADAM-Multi, we employ a novel simulation model capable of simulating additive, dominance, and epistatic genotypic effects for species with different levels of ploidy, providing with a more realistic assumption of multiple allelism for QTL variants. When assuming bi-allelic QTL, our proposed model becomes identical to the model assumption in common simulation programs, and in genetic textbooks. Along with the description of the updated simulation model in ADAM-Multi, this paper shows two small-scale studies that investigate the effects of multi-allelic versus bi-allelic assumptions in simulation and the use of different prediction models in a single-population breeding program for potatoes. We found that genomic models using dense bi-allelic markers could effectively predicted breeding values of individuals in a well-structure population despite the presence of multi-allelic QTL. Additionally, the small-scale study indicated that including non-additive genetic effects in the prediction model for selection did not lead to an improvement in the rate of genetic gains of the breeding program.

1 Introduction

Stochastic simulation is a cost-effective and powerful tool to optimize breeding programs with reduced experimental costs. Such a tool unlocks possibilities for investigating alternative breeding schemes, in order to maximizing genetic gains of the breeding program at a given input of resources. Software package, ADAM (Pedersen et al., 2009; Liu et al., 2019), has been developed for the purpose of breeding optimization in pig, cattle, fish and plants (Bengtsson et al., 2022; Tessema et al., 2020; Zaalberg et al., 2022; Chu et al., 2020). The tool is also very useful in validating statistical genetic prediction models (Romé et al., 2023; Chu et al., 2021) and in studying methods for preserving genetic diversity in breeding programs (Henryon et al., 2015). Over time the software has been further developed and updated with many features including extension to non-additive genetic models (Chu et al., 2024), indirect genetic effects (Chu et al., 2021), categorical traits (Gebreyesus et al., 2020), definition of true inbreeding (Henryon et al., 2019), or extension to plant breeding (Liu et al., 2019).

ADAM (Pedersen et al., 2009) simulates genotypic effects for each allele in a quantitative trait locus (QTL). This model is unique from other software like AlphaSim (Gaynor et al., 2021; Faux et al., 2016), ChromaX (Younis et al., 2023), MoBPS (Pook et al., 2020), SeqBreed (Pérez-Enciso et al., 2020) and XSim (Chen et al., 2022) that use substitution genotypic effects of QTL. However, all of these tools including ADAM assume a bi-allelic state of each segregating QTL. This assumption might come from current, common genetic models (Christensen et al., 2012; VanRaden, 2008; Falconer and Mackay, 1996) that assume substitution effects of one allele to its alternative. While the bi-allelic state of marker loci is due to the common choice of genotyping technology that yield single nucleotide polymorphism (SNP). The assumption of bi-allelic QTL may not true for all QTL. Multiple alleles have been shown in numerous QTL (Biová et al., 2024; Jiang et al., 2020). In addition, multi-allelic models of QTL are more reasonable explanations for different functional genetic effects from unrelated populations (González-Diéguez et al., 2021).

Multi-allelic models have been developed for genomic prediction in diploid species (Álvarez-Castro and Crujeiras, 2019; Álvarez-Castro and Yang, 2011; Yang and Álvarez-Castro, 2008; Da, 2015). Relevant model based on haplotype blocks also have been shown for genomic prediction (Weber et al., 2023). Thérèse Navarro et al. (2022) has developed a package for genome wide association studies (GWAS) of polyploid populations with multi-allelic models, but only the additive genetic effects were included in the model. For the purpose of simulation, however, we are not aware of any studies that have used multi-allelic models with additive, dominance and epistatic genetic effects, or accounting for different levels of ploidy.

In addition to bi-allelic assumptions, earlier version of ADAM (Pedersen et al., 2009; Liu et al., 2019) use was limited to diploid species only. Many economically important species like potato, banana, sugar cane and some fish orders of salmonids and common carps are polyploid. Extension of simulation models to different ploidy levels is necessary for ADAM (Pedersen et al., 2009; Liu et al., 2019) to design complex breeding schemes for these species.

This paper will describe new features of software package, now called ADAM-Multi, for simulating breeding programs for plants and animals. The focus will be on description of genotypic models for simulating traits with genotypic effects of additive, dominance, and epistatic genetics for species with different ploidy levels including extensions to multi-allelic assumptions. The methods implemented are illustrated in two examples that study the effects of different assumptions on number of alleles, ploidy level and different prediction models used in selection.

2 Materials and methods

2.1 Genotypic models for simulation

We aim to simulate genotypic effects that are generalized to multi-allelic QTL with number ( $n_{B}$ ) of alleles, and the genome with ploidy level of $n_{p l o i d y}$ . Assuming a QTL with alleles $B_{1}$ , $B_{2}$ ,…, $B_{n_{B}}$ , ADAM-Multi uses following model to simulate additive genotypic value ( $a$ ) at one locus:

a = \sum_{i_{B}}^{n_{B}} t_{i_{B}}^{a} a_{i_{B}} = t_{1}^{a} a_{1} + t_{2}^{a} a_{2} + \dots + t_{n_{B}}^{a} a_{n_{B}} (1)

where $a$ is the additive genotypic value of a QTL; $a_{i_{B}}$ is genotypic additive effect of allele $B_{i_{B}}$ (or called $i_{B}$ for short) at the QTL; $t_{i_{B}}^{a}$ is the additive covariate for allele $i_{B}$ that is scaled genotype dosage calculated as in the AlphaSimR software (Gaynor et al., 2021):

t_{i_{B}}^{a} = (t_{i_{B}} - \frac{n_{p l o i d y}}{2}) (\frac{2}{n_{p l o i d y}}) (2)

where $t_{i_{B}}$ is a raw genotype dosage, or number of copies of allele $i_{B}$ at the locus, $n_{p l o i d y}$ is the ploidy level of genome. Key notations are defined in Table 1.

Table 1

Table 1. List of key symbols.

Similarly, the genotypic model for dominance value ( $d$ ) of a QTL at the locus level is:

d = \sum_{i_{B}}^{n_{B}} t_{i_{B}}^{d} d_{i_{B}} = t_{1}^{d} d_{1} + t_{2}^{d} d_{2} + \dots + t_{n_{B}}^{d} d_{n_{B}} (3)

where $d_{i_{B}}$ is dominance genotypic effect of allele $i_{B}; t_{i_{B}}^{d}$ is the dominance covariate for allele $i_{B}$ that is scaled genotype dosage calculated as in AlphaSimR (Gaynor et al., 2021):

t_{i_{B}}^{d} = t_{i_{B}} (n_{p l o i d y} - t_{i_{B}}) {(\frac{2}{n_{p l o i d y}})}^{2} (4)

This simulation model assumes digenic dominance for each allele, i.e., each allele has the same dominance effect with all other alleles. Table 2 shows examples of $t$ , $t^{a}$ and $t^{d}$ for diploid and tetraploid genome assuming $n_{B} = 2$ with allele $B_{1}$ and $B_{2}$ .

Table 2

Table 2. Conversion from raw genotype dosages ( $t_{1}$ , $t_{2}$ ) to additive ( $t_{1}^{a}$ , $t_{2}^{a}$ ) and dominance ( $t_{1}^{d}$ , $t_{2}^{d}$ ) covariates when assuming bi-allelic loci.

The simulation model for additive × additive genotypic value ${(a a)}^{k l}$ of the two-locus epistatic interaction between the pair of loci $k$ and $l$ :

{(a a)}^{k l} = \{[\begin{array}{c} t_{1}^{a, l} & t_{2}^{a, l} & \dots & t_{n_{B}}^{a, l} \end{array}] \otimes [\begin{array}{c} t_{1}^{a, k} & t_{2}^{a, k} & \dots & t_{n_{B}}^{a, k} \end{array}]\} \cdot [\begin{array}{c} {(a a)}_{1}^{k l} \\ {(a a)}_{2}^{k l} \\ \dots \\ {(a a)}_{n_{B} \times n_{B}}^{k l} \end{array}] (5)

where $[\begin{array}{c} t_{1}^{a, x} & t_{2}^{a, x} & \dots & t_{n_{B}}^{a, x} \end{array}]$ is a vector of additive covariates for locus $x$ ( $k$ or $l$ ) with $n_{B}$ elements; $\otimes$ denotes the Kronecker product; denotes the symbol for matrix multiplication; $[\begin{array}{c} {(a a)}_{1}^{k l} \\ {(a a)}_{2}^{k l} \\ \dots \\ {(a a)}_{n_{B} \times n_{B}}^{k l} \end{array}]$ is vector of additive × additive genotypic effects that have $n_{B} \times n_{B}$ elements. In Equation 5, the number of alleles in loci $k$ and $l$ are the same, and equal to $n_{B}$ . Simulation models in this paper consider a fixed number $n_{B}$ for all QTL, even if not all alleles in a QTL are segregating. Genotypic models in case of arbitrary number of alleles for QTL that set the effects of non-segregating alleles to zero can be found in Supplementary Appendix 1.

The simulation model for additive-dominance genotypic value ${(a d)}^{k l}$ of the epistatic interaction is the sum of additive × dominance and dominance × additive interaction between the pair of loci $k$ and $l$ :

{(a d)}^{k l} = \{[\begin{array}{c} t_{1}^{d, l} & t_{2}^{d, l} & \dots & t_{n_{B}}^{d, l} \end{array}] \otimes [\begin{array}{c} t_{1}^{a, k} & t_{2}^{a, k} & \dots & t_{n_{B}}^{a, k} \end{array}]\} \cdot [\begin{array}{c} {(a d)}_{1}^{k l} \\ {(a d)}_{2}^{k l} \\ \dots \\ {(a d)}_{n_{B} \times n_{B}}^{k l} \end{array}] + \{[\begin{array}{c} t_{1}^{a, l} & t_{2}^{a, l} & \dots & t_{n_{B}}^{a, l} \end{array}] \otimes [\begin{array}{c} t_{1}^{d, k} & t_{2}^{d, k} & \dots & t_{n_{B}}^{d, k} \end{array}]\} \cdot [\begin{array}{c} {(d a)}_{1}^{k l} \\ {(d a)}_{2}^{k l} \\ \dots \\ {(d a)}_{n_{B} \times n_{B}}^{k l} \end{array}] (6)

where $[\begin{array}{c} t_{1}^{d, x} & t_{2}^{d, x} & \dots & t_{n_{B}}^{d, x} \end{array}]$ is a vector of dominance covariates (Equation 4) for locus $x$ with $n_{B}$ elements; $[\begin{array}{c} {(a d)}_{1}^{k l} \\ {(a d)}_{2}^{k l} \\ \dots \\ {(a d)}_{n_{B} \times n_{B}}^{k l} \end{array}] and [\begin{array}{c} {(d a)}_{1}^{k l} \\ {(d a)}_{2}^{k l} \\ \dots \\ {(d a)}_{n_{B} \times n_{B}}^{k l} \end{array}]$ are vectors of additive × dominance ( ${(a d)}_{x}^{k l}$ ), and dominance × additive ( ${(d a)}_{x}^{k l}$ ) genotypic effects that have $n_{B} \times n_{B}$ elements. The value ${(a d)}_{x}^{k l}$ is different from ${(d a)}_{x}^{k l}$ .

The simulation model for dominance × dominance genotypic value ${(d d)}^{k l}$ of the epistatic interaction between the pair of loci $k$ and $l$ :

{(d d)}^{k l} = \{[\begin{array}{c} t_{1}^{d, l} & t_{2}^{d, l} & \dots & t_{n_{B}}^{d, l} \end{array}] \otimes [\begin{array}{c} t_{1}^{d, k} & t_{2}^{d, k} & \dots & t_{n_{B}}^{d, k} \end{array}]\} \cdot [\begin{array}{c} {(d d)}_{1}^{k l} \\ {(d d)}_{2}^{k l} \\ \dots \\ {(d d)}_{n_{B} \times n_{B}}^{k l} \end{array}] (7)

where $[\begin{array}{c} {(d d)}_{1}^{k l} \\ {(d d)}_{2}^{k l} \\ \dots \\ {(d d)}_{n_{B} \times n_{B}}^{k l} \end{array}]$ is vector of dominance × dominance genotypic effects that have $n_{B} \times n_{B}$ elements.

So far, the simulated genotypic values were presented at the levels of locus and loci pairs. Here, we present the model for simulating genotypic value at the individual level, which is the sum effects of all QTLs and epistatic pair interactions. The model in a matrix form for total genotypic value $g_{i}$ of individual $i$ is:

g_{i} = tr (t_{i}^{a} \cdot a) + tr (t_{i}^{d} \cdot d) + tr [t_{i}^{aa} \cdot (aa)] + tr [t_{i}^{dd} \cdot (dd)] + tr [t_{i}^{ad} \cdot (ad)] + tr [t_{i}^{da} \cdot (da)] (8)

where $a$ is a $n_{B} \times n_{q t l}$ matrix:

a = [\begin{array}{c} a_{1}^{j_{1}} & a_{1}^{j_{2}} & \dots & a_{1}^{n_{q t l}} \\ a_{2}^{j_{1}} & a_{2}^{j_{2}} & \dots & a_{2}^{n_{q t l}} \\ \dots & \dots & \dots & \dots \\ a_{n_{B}}^{j_{1}} & a_{n_{B}}^{j_{2}} & \dots & a_{n_{B}}^{n_{q t l}} \end{array}],

where $n_{q t l}$ is the number of $QTL; a_{i_{B}}^{j_{q t l}}$ is the additive genotypic effect of allele $i_{B}$ at $QTL j_{q t l}; t_{i}^{a} is a n_{q t l} \times n_{B}$ matrix:

t_{i}^{a} = [\begin{array}{c} t_{1, i}^{a, 1} & t_{2, i}^{a, 1} & \dots & t_{n_{B}, i}^{a, 1} \\ t_{1, i}^{a, 2} & t_{2, i}^{a, 2} & \dots & t_{n_{B}, i}^{a, 2} \\ \dots & \dots & \dots & \dots \\ t_{1, i}^{a, n_{q t l}} & t_{2, i}^{a, n_{q t l}} & \dots & t_{n_{B}, i}^{a, n_{q t l}} \end{array}],

where $t_{i_{B}, i}^{a, j_{q t l}}$ is the additive covariate of allele $i_{B}$ at $QTL j_{q t l}$ of individual $i . t_{i_{B}, i}^{a, j_{q t l}}$ can be calculated based on the genotype of individual $i$ at locus $j_{q t l}$ using Equation 2; $tr ()$ is the trace of a matrix. Similarly, $d$ is a $n_{B} \times n_{q t l}$ matrix of dominance genotypic effects; $t_{i}^{d}$ is a $n_{q t l} \times n_{B}$ matrix of dominance covariate of individual $i$ . Matrix $(aa)$ has a dimension of $(n_{B} \times n_{B})$ rows and $n_{e p}$ columns:

(aa) = [\begin{array}{c} {(a a)}_{1, 1}^{k l} & {(a a)}_{2, 1}^{k l} & \dots & {(a a)}_{n_{e p}, 1}^{k l} \\ {(a a)}_{1, 2}^{k l} & {(a a)}_{2, 2}^{k l} & \dots & {(a a)}_{n_{e p}, 2}^{k l} \\ \dots & \dots & \dots & \dots \\ {(a a)}_{1, (n_{B} \times n_{B})}^{k l} & {(a a)}_{2, (n_{B} \times n_{B})}^{k l} & \dots & {(a a)}_{n_{e p}, (n_{B} \times n_{B})}^{k l} \end{array}],

where $n_{e p}$ is number of epistatic interactions between pairs of loci; ${(a a)}_{j_{e p}, (i_{B}^{k}, i_{B}^{l})}^{k l}$ is the epistatic additive × additive effect at the interaction pair $j_{e p}$ between allele $i_{B}^{k}$ of locus $k$ and allele $i_{B}^{l}$ of locus $l . t_{i}^{aa}$ is a matrix with $n_{e p}$ rows and $(n_{B} \times n_{B})$ columns.Row $j_{e p}$ th of $t_{i}^{aa}$ is set up as in Equation 5. For example,

t_{j_{ep}, i}^{aa} = [\begin{array}{c} t_{j_{e p}, 1, i}^{a, l} & t_{j_{e p}, 2, i}^{a, l} & \dots & t_{j_{e p}, n_{B}, i}^{a, l} \end{array}] \otimes [\begin{array}{c} t_{j_{e p}, 1, i}^{a, k} & t_{j_{e p}, 2, i}^{a, k} & \dots & t_{j_{e p}, n_{B}, i}^{a, k} \end{array}],

where $t_{j_{e p}, i_{B}^{x}, i}^{a, x}$ is the additive covariate of allele $i_{B}^{k}$ of QTL $k$ or allele $i_{B}^{l}$ of QTL $l$ at the interaction pair $j_{e p}$ for individual $i$ .

Similarly, $(dd)$ is a $(n_{B} \times n_{B}) \times n_{e p}$ matrix of dominance × dominance genotypic effects; $t_{i}^{dd} is a n_{e p} \times (n_{B} \times n_{B})$ matrix of dominance × dominance covariates of individual $i$ . The genotypic additive-dominance effects consist of two components: additive × dominance $tr [t_{i}^{ad} \cdot (ad)]$ and dominance × additive $tr [t_{i}^{da} \cdot (da)]$ . Similar to $(dd)$ , matrices $(ad)$ and $(da)$ have dimension of $(n_{B} \times n_{B}) \times n_{e p}$ . $t_{i}^{ad}$ and $t_{i}^{da}$ are matrices with $n_{e p}$ rows and $(n_{B} \times n_{B})$ columns. Row $j_{e p}$ th of $t_{i}^{ad}$ is set up as in Equation 6. For example,

t_{j_{ep}, i}^{ad} = [\begin{array}{c} t_{j_{e p}, 1, i}^{d, l} & t_{j_{e p}, 2, i}^{d, l} & \dots & t_{j_{e p}, n_{B}, i}^{d, l} \end{array}] \otimes [\begin{array}{c} t_{j_{e p}, 1, i}^{a, k} & t_{j_{e p}, 2, i}^{a, k} & \dots & t_{j_{e p}, n_{B}, i}^{a, k} \end{array}],

where $t_{j_{e p}, i_{B}^{x}, i}^{d, x}$ is the dominance covariate of allele $i_{B}^{k}$ of QTL $k$ or allele $i_{B}^{l}$ of QTL $l$ at the interaction pair $j_{e p}$ for individual $i$ . Row $j_{e p}$ th of $t_{i}^{da}$ is:

t_{j_{ep}, i}^{da} = [\begin{array}{c} t_{j_{e p}, 1, i}^{a, l} & t_{j_{e p}, 2, i}^{a, l} & \dots & t_{j_{e p}, n_{B}, i}^{a, l} \end{array}] \otimes [\begin{array}{c} t_{j_{e p}, 1, i}^{d, k} & t_{j_{e p}, 2, i}^{d, k} & \dots & t_{j_{e p}, n_{B}, i}^{d, k} \end{array}] .

2.2 Stochastic simulation

Details of simulation steps and theoretical principles of ADAM-Multi can be found in previous version by Pedersen et al. (2009) and Liu et al. (2019). These principles are also similar to those in AphaSim (Gaynor et al., 2021). Simulation of genomic models with ADAM first starts with founder haplotypes of a defined genome structure. To create linkage disequilibrium (LD) between QTL and markers, ADAM-Multi can be used in case of multi-allelism. Other packages such as QMSIM (Sargolzaei and Schenkel, 2009) and AlphaSim (Gaynor et al., 2021) do not support multi-allelic models, but they can be used to generate the genome with a specified degree of LD in case of bi-allelic loci. The genotypic effects of alleles in QTLs are sampled, and then centered and scaled to user-defined parameters using the founders’ QTL haplotypes (Chu et al., 2024). Steps for generating additive effects of alleles $a$ ( $n_{B} \times n_{q t l}$ matrix) in ADAM-Multi are:

- Sampling: Each element of matrix $a^{s}$ is sampled from a user-defined normal distribution, e.g., mean of zero and additive variance $σ_{A}^{2}$ [or N(0, $σ_{A}^{2}$ )]. Matrix $a^{s}$ with the same dimension as $a$ contains starting values of additive effects.

- Centering: Based on matrix $a^{s}$ and genotypes of a founder population, we can calculate population mean at each QTL locus. Additive effects of alleles within each QTL (each row of $a^{s}$ ) are centered to achieve population mean of zero. For example,: ${a_{i_{B}}^{j_{q t l}}}^{*} = a_{i_{B}}^{j_{q t l^{s}}} - {μ^{j_{q t l}}}^{s}, where {μ^{j_{q t l}}}^{s}$ is the population mean at locus $j_{q t l}$ given $a^{s}$ (before being centered) and genotypes of the founder population; ${a_{i_{B}}^{j_{q t l}}}^{*}$ is the prior value after centering. Matrix $a^{*}$ (same dimension as $a$ ) with elements of ${a_{i_{B}}^{j_{q t l}}}^{*}$ is the prior values of additive effects after centering.

- Rescaling: Prior variance $σ_{A^{*}}^{2}$ can also be calculated as we know all individuals’ genotype in the population and functional effects of QTL ( $a^{*}$ ). Calculation of the variance can be done by different methods including variance by locus, by chromosome, or by individual as in Chu et al. (2024). Additive effects $a$ are calculated by rescaling prior effects $a^{*}$ to achieve the user-defined variance input $σ_{A}^{2}$ for the founder population. For example:

a = a^{*} \times \sqrt{\frac{σ_{A}^{2}}{σ_{A^{*}}^{2}}}

The calculated variances of $σ_{A}^{2}$ and $σ_{A^{*}}^{2}$ in the rescaling step are functional, biological or genotypic parameters, which are different from classical, statistical quantitative parameters. The differences between functional and statistical variances are detailed and explained in Álvarez-Castro and Carlborg (2007), Álvarez-Castro and Yang (2011), Chu et al. (2024), and Vitezica et al. (2017). Functional effects of dominance and epistasis are also generated based on user-defined inputs of functional variances. Steps for generating dominance effects of alleles $d$ ( $n_{B} \times n_{q t l}$ matrix) are:

- Sampling: A dominance degree $δ_{i_{B}}^{d, j_{q t l}}$ for allele $i_{B}$ of QTL $j_{q t l}$ is sampled from a user-defined normal distribution $N (μ_{δ}, σ_{δ}^{2})$ , where $μ_{δ}$ is the mean of dominance degree, and $σ_{δ}^{2}$ is the variance of dominance degree. If a user does not provide this distribution, ADAM-Multi default values are $N (0.19, 0.097)$ as in Wellmann and Bennewitz, (2011). Each element of matrix $d^{*}$ (starting (or prior) values of dominance effects) is generated as:

{d_{i_{B}}^{j_{q t l}}}^{*} = δ_{i_{B}}^{d, j_{q t l}} * |a_{i_{B}}^{j_{q t l}}| .

- Rescaling: This is done similarly to rescaling as for simulating additive effects. Based on matrix $d^{*}$ and QTL genotypes of a founder population, prior variance $σ_{D^{*}}^{2}$ can be calculated. Dominance effects $d$ are the prior effects $d^{*}$ that are rescaled to achieve the user-defined variance inputs of $σ_{D}^{2}$ for the founder population.

Steps for sampling additive × additive effects are.

- Sampling: Each element of matrix ${(aa)}^{*}$ [starting (or prior) values of additive × additive effects] is sampled from a user-defined normal distribution N(0, $σ_{A A}^{2}$ ).

- Rescaling: Prior variance $σ_{{A A}^{*}}^{2}$ can be calculated as we know all individuals’ genotype in the population and functional effects of QTL ${(aa)}^{*}$ . Calculation of the variance can be done by different methods including variance by pair loci, or by individual as in Chu et al. (2024). Additive × additive effects $(aa)$ are obtained by rescaling prior effects ${(aa)}^{*}$ to achieve the desired variance inputs of $σ_{A A}^{2}$ for the founder population.

Steps for sampling dominance × dominance effects are similar to those for additive × additive effects. Steps for sampling additive-dominance effects are:

- Sampling: Each element of matrices ${(ad)}^{*}$ and ${(da)}^{*}$ [starting (or prior) values of additive × dominance and dominance × additive effects] is sampled from a user-defined normal distribution N(0, $σ_{A D}^{2}$ ).

- Rescaling: Prior variance $σ_{{A D}^{*}}^{2}$ can be calculated as we know all individuals’ genotype in the population and functional effects of QTL ${(ad)}^{*}$ and ${(da)}^{*}$ . Note that the additive-dominance effects of an individuals’ genotype is the sum of additive × dominance and dominance × additive. Effects $(ad)$ and $(da)$ are obtained by rescaling prior effects ${(ad)}^{*}$ and ${(da)}^{*}$ to achieve the desired variance inputs of $σ_{A D}^{2}$ for the founder population. For example:

(ad) = {(ad)}^{*} \times \sqrt{\frac{σ_{A D}^{2}}{σ_{{A D}^{*}}^{2}}} and (da) = {(da)}^{*} \times \sqrt{\frac{σ_{A D}^{2}}{σ_{{A D}^{*}}^{2}}} .

A centering step is not included for dominance and epistatic effects, but the total genetic value $g_{i}$ is centered to achieve the population mean of zero based on the founder population. For example, the model for simulating individuals’ phenotypes is:

y_{i} = μ + g_{i} + e_{i} (9)

where $g_{i}$ is the genetic values of individual $i$ that is constructed as in (Equation 8); $μ$ is the population mean to re-adjust the mean of the founder population to zero.

In ADAM-Multi, model (Equation 9) can be extended to repeated records, inclusion of non-genetic effects, and multiple traits. The functional genetic effects of additive, dominance, and epistasis are independent of allele frequencies, and in the simulation, they are kept constant across generations. Modeling of genetic recombination during meiosis uses bivalent chromosome pairing (Voorrips and Maliepaard, 2012). A breeding scheme is simulated by combining series of actions: mating, reproduction, phenotyping, genotyping, prediction of breeding values and different selection methods. The use of ADAM-Multi is demonstrated in two examples that study the effects of multi-allelic versus bi-allelic assumptions and the use of different prediction models on accuracy of prediction and genetic gains of breeding programs for potato.

2.3 Example 1

The example uses a simulation model with additive effects only for a single-population breeding scheme. The investigated factors include different multi-allelic assumptions (6 levels) and two levels of ploidy (Table 3). In total, there were 6 × 2 = 12 scenarios simulated in example 1.

Table 3

Table 3. Overview of factors investigated in example 1 and 2.

The simulation model for individuals’ phenotype in example 1 was:

y_{i} = μ + \sum_{i_{B}}^{n_{B}} t_{i_{B}, i}^{a, 1} a_{i_{B}}^{j_{1}} + \sum_{i_{B}}^{n_{B}} t_{i_{B}, i}^{a, 2} a_{i_{B}}^{j_{2}} + \dots + \sum_{i_{B}}^{n_{B}} t_{i_{B}, i}^{a, n_{q t l}} a_{i_{B}}^{n_{q t l}} + e_{i} (10)

Simulation of genetic values in (Equation 10) is an extension of (Equation 1) to the sum effects of $n_{q t l}$ , or equivalent to the additive part of (Equation 8). Notations and symbols are the same as in (Equation 8). The environment term $e_{i}$ was drawn from a normal distribution N(0, $σ_{e}^{2}$ ) with $σ_{e}^{2} = 2$ . This example used a simplified simulation model as compared to Equation 9 because we would like to assess accuracy of predicted breeding values in selection. When non-additive genetics are included in the model, the definition of accuracy of predicted breeding values with multi-allelic assumption is unclear in literature.

Figure 1 shows the simulation pipeline for this example. The genome of founders was simulated to form LD between QTL and markers using a Fisher-Wright inheritance model (Fisher, 1930). The LD genome consisted of 12 chromosomes with genetic distances emulating that of potato (Massa et al., 2018). The total genome length was 888.6 cM. The initial genome that had 50k marker loci and 10k QTL with an equal frequency for each allele. A historical population with effective population size of 200 was simulated for 1,000 generations of random mating, a simulated bottleneck, and an inheritance pattern of standard Mendelian principles and bivalent chromosome pairing (Voorrips and Maliepaard, 2012). In example 1, 12 founder populations were created corresponding to 6 levels of multi-allelic assumptions and 2 levels of ploidy (Table 3). The 100-individual founders for each of the populations was created, where the genome of these founders consisted of 2k QTL and 10k bi-allelic markers. The QTL and markers were drawn randomly from loci that were segregating with a minor allele frequency ≥0.05. For markers, bi-allelic state was assumed in all scenarios. The percentage of QTL having segerating bi-, tri-, or quad-alleles was corresponding to the assumption of the scenario. In case of multiple allellisms, for example, quad-allelic assumption for a QTL, each of the four alleles must have a minor allele frequency of at least 0.05. The LD pattern in case of bi-allelic loci for diploid and tetraploid genomes can be found in Supplementary Appendix 2. The functional additive variance of the founder populations was simulated at $σ_{A}^{2} = 1.0$ .

Figure 1

Figure 1. An overview of simulation pipeline in example 1 and 2.

A simplified breeding scheme was simulated for 13 discrete generations. In a generation, 80 parents were crossed, pseudo-randomly with no self-pollination, to create 80 families. A parent could mate any other parents, but each parent could contribute to only maximum of four crosses. Each family had 10 full-sib offspring, thus in total there were 800 offspring per generation. From generation 1 to 5, the 80 parents were randomly selected from the 800 offspring. In generation 6 to 13, the selection of the offspring to be parents in the following generation was based on predicted breeding values. At generation 6, variance components were estimated when the phenotype data consisted of 4,000 individuals. This estimation of variance components ensures that extra variation due to unknown variance components were taken into account. These variance component estimates were used in the models for prediction of breeding values in the subsequent generations. Selection of 80 new parents from 800 individuals were carried out based on the genetic evaluation after the phenotypes in a generation were obtained.

Model Equation 11 were used in example 1 for variance component estimation and prediction of breeding values is as follows:

y = Xb + Z_{u} u + e (11)

where $y$ is the vector of individual phenotypes; $b$ is a vector of the fixed effects of individuals’ generation; $u$ is the vector of breeding values $u \sim N (0, G_{u} σ_{u}^{2})$ , where the relationship matrix for additive genetics ( $G_{u}$ ) was constructed based on bi-allelic markers using method (VanRaden, 2008) for different levels of ploidy assumed in the scenario. The computation of $G_{u}$ was carried out using AGHmatrix R package (Amadeu et al., 2023). $X$ and $Z_{u}$ are design matrices relating individuals to fixed effects and additive genetic effects, respectively. Vector $e$ is an environmental residual term: $e \sim N (0, I σ_{e}^{2})$ , where $I$ is an identity matrix, $σ_{e}^{2}$ is the environmental residual variance.

Simulation model Equation 10 is QTL-based whereas Equation 11 is marker-based prediction model. Simulation and prediction models are also different in how the covariate of additive effects is calculated. The covariate in Equation 10 is independent of the allele frequency in the population while the covariate as elements of $G_{u}$ in Equation 11 uses the frequency in calculation (Chu et al., 2024). The variances estimated from Equation 11 are statistical parameters whereas the simulated variances in Equation 10 are functional variances (Chu et al., 2024). However, as non-additive genetic effects were not simulated in this example, the functional and statistical variances in example 1 would be identical.

Each founder population (at generation 0) were replicated 5 times, i.e., a total of 50 replicates were simulated. The breeding scheme at generations 1–13 was replicated 10 times per founder population replicate. Variance component estimation was carried out using REML in the DMUAI module of the DMU package (Madsen and Jensen, 2013). The prediction of breeding values was performed with the DMU4 module of the DMU package. Population accuracy of the predicted breeding values were assessed for the individuals in generation 6. The accuracy was the correlation between true $u$ from the simulated values and predicted $\hat{u}$ in (Equation 11). Rate of genetic gain was assessed as the rate of increase in the genetic mean of population from generation 5 to 13, i.e., rate of genetic gain = $\frac{\bar{u_{13}} - \bar{u_{5}}}{8}$ , where $\bar{u_{5}}$ and $\bar{u_{13}}$ are the genetic means of population at generation 5 and 13.

2.4 Example 2

The investigated factors in example 2 included different prediction models for selection, multi-allelic levels and two levels of ploidy (Table 3). This example used four haplotype founder populations from example 1 for scenarios that had multi-allelic assumption of bi-allelic (100%), and bi-allelic (20%) + quad-allelic (80%) QTL at two ploidy levels of diploids and tetraploids. There were three prediction models investigated, thus in total 2 × 2 × 3 = 12 scenarios simulated. The simulation pipeline of this example is the same as in example 1 and Figure 1. However, the simulation model in example 2 included additive, dominance, and epistatic genetic effects, as in (Equation 9). For simulating epistasis, we assumed $n_{e p} = 1000$ , with each QTL present in precisely one pair. The functional variance inputs for the simulation model were set for additive $σ_{A}^{2} = 1.0$ , dominance $σ_{D}^{2} = 0.25$ , additive × additive $σ_{A A}^{2} = 0.25$ , additive-dominance $σ_{A D}^{2} = 0.25$ , dominance × dominance $σ_{D D}^{2} = 0.25$ , and environmental term $σ_{e}^{2} = 2$ . The GBLUP models for predicting breeding values included (Equation 11) and two others as follows:

y = Xb + Z_{u} u + Z_{v} v + e (12)

y = Xb + Z_{u} u + Z_{v} v + Z_{uv} (uu) + Z_{uv} (uv) + Z_{vv} (vv) + e (13)

where $u$ is the vector of breeding (additive) values as described in Equation 11; $v$ is the vector of dominance values $v \sim N (0, G_{v} σ_{v}^{2})$ , where $σ_{v}^{2}$ is dominance variance, and the relationship matrix for dominance ( $G_{v}$ ) is genomic marker-based, which were calculated using AGHmatrix R package (Amadeu et al., 2023). The construction of dominance relationships between individuals used the method by Vitezica et al. (2017) for diploids, and Endelman et al. (2018) for tetraploids. Vector $(uu)$ is additive × additive $(uu) \sim N (0, G_{uu} σ_{u u}^{2})$ , where $σ_{u u}^{2}$ is additive × additive variance, and the relationship matrix $G_{uu} = G_{u} ⊙ G_{u}$ , where $G_{u}$ is genomic-based relationship as in (Equation 11), $⊙$ is the Hadamard product. Similarly, $(uv)$ is the vector of additive-dominance: $(uv) \sim N (0, G_{uv} σ_{u v}^{2})$ , where $σ_{u v}^{2}$ is the variance, and $G_{uv} = G_{u} ⊙ G_{v}$ . Vector $(vv)$ is dominance × dominance: $(vv) \sim N (0, G_{vv} σ_{v v}^{2})$ , where $σ_{v v}^{2}$ is the variance, and $G_{vv} = G_{v} ⊙ G_{v}$ . Other notations and symbols are the same as in (Equation 11). Variances $σ_{u}^{2}$ , $σ_{v}^{2}$ , $σ_{u u}^{2}$ , $σ_{u v}^{2}$ and $σ_{v v}^{2}$ from Equation 13 are statistical parameters, which are different from functional variances $σ_{A}^{2}$ , $σ_{D}^{2}$ , $σ_{A A}^{2}$ , $σ_{A D}^{2}$ and $σ_{D D}^{2}$ in Equation 9 (Chu et al., 2024).

Similar to example 1, the breeding scheme at generations 1–13 was replicated for 10 times for each of the five founder population replicates. Variance component estimation and prediction of breeding values were carried out by DMUAI and DMU4 module, respectively, of the DMU package (Madsen and Jensen, 2013). Rate of genetic gain was calculated similar as in example 1, except that the total genetic value $g_{i}$ was used instead of additive values only.

3 Results

Table 4 shows genotypic values of additive ( $a$ ) using (Equation 1) and dominance ( $d$ ) using (Equation 3) for diploids and tetraploids when assuming $n_{B} = 2$ at a locus with allele $B_{1}$ and $B_{2}$ . With bi-allelic QTL locus of diploids, the genotypic effects of $B_{1} B_{1}$ , $B_{1} B_{2}$ , and $B_{2} B_{2}$ were $- (a_{2} - a_{1})$ , 0 and $+ (a_{2} - a_{1})$ for $a$ , respectively, and 0, $(d_{1} + d_{2})$ and 0 for $d$ , respectively. Table 5 shows epistatic genotypic values for diploids due to interactions between loci $k$ and $l$ when assuming $n_{B} = 2$ at the two loci. The additive-dominance interaction between the pair of loci $k$ and $l$ is the sum of additive × dominance and dominance × additive interaction effects for the two loci.

Table 4

Table 4. Genotypic values of additive ( $a$ ) and dominance ( $d$ ) at one-locus level when assuming bi-allelic QTL.

Table 5

Table 5. Genotypic values of additive × additive $(a a)$ , additive - dominance $(a d)$ , and dominance × dominance $(d d)$ in two-loci epistatic interactions when assuming bi-allelic QTL.

Table 6 shows rate of genetic gains and accuracy of predicted breeding values in a simplified breeding scheme for diploids and tetraploids when different levels of multiple allelism were assumed in example 1. Surprisingly, accuracy of predicted breeding values was not statistically different for altered levels of QTL multi-allelic assumptions, which occurred in the breeding scheme for both diploids and tetraploids. On the contrary, the rate of genetic gains had increasing tendency with increasing levels of multiple allelic QTL in the scheme for tetraploids whereas the genetic gain did not show this tendency in the scheme for diploids. The increasing tendency in genetic gains was most likely due to increased additive variances with higher multiple allelism in tetraploids. The estimated variances for different scenarios in example 1 could be found in Supplementary Table 1, and the true variances at different generations are in Supplementary Table 2.

Table 6

Table 6. Genetic gain and accuracy in simulation model with additive effects only of example 1.

Table 7 shows the rate of genetic gains in a breeding scheme where different prediction models were used for selection in example 2. In this example, while the simulation model includes additive, dominance, and epistatic interactions between pairs of loci, different prediction models were used for selection. The prediction models with and without non-additive effects did not lead to statistical differences in rate of genetic gains. In about two thirds of replicates, the model could not estimate epistatic effects (Supplementary Table 3). Different multiple allelism did not lead to a significant change in rate of genetic gains. The variance components estimated from different prediction models in example 2 can be found in Supplementary Table 4, and the true variances of total genetic values at different generations are in Supplementary Table 2.

Table 7

Table 7. Genetic gain in simulation model with additive, dominance and epistatic effects of example 2 when different prediction models were used.

4 Discussion

Our simulation models allow the presence of multi-allelic loci, which is a more realistic assumption for QTL variants. Our simulation model for additive genetic effects is basically the sum of allelic effects. When bi-allelic QTL is assumed, the simulation models applying Equations 1, 3, and 5–7 are identical to the assumption in common genetic textbook, e.g., Falconer and Mackay (1996) and AlphaSimR package (Gaynor et al., 2021). For example, AlphaSimR defines effects of genotype $B_{1} B_{1}$ , $B_{1} B_{2}$ and $B_{2} B_{2}$ as $- a$ , 0 and $+ a$ for additive, respectively. These effects would be corresponding to values in Table 4 if $a$ was defined: $a = (a_{2} - a_{1})$ . Here, $(a_{2} - a_{1})$ is the substitution effect of allele $B_{2}$ for $B_{1}$ , which is also the definition of $a$ in AlphaSimR and Falconer and Mackay (1996). When bi-allelic QTL is assumed, dominance effect in different ploidy levels is based on a digenic dominance model, which is consistent to Gaynor et al. (2021). Similarly, the epistatic effects in Gaynor et al. (2021) are a special case of our simulation model with bi-allelic assumption.

Interestingly, different levels of multi-allelic assumptions for QTL did not affect accuracy of predicted breeding values based on bi-allelic markers in example 1. This may be due to high density of markers and well-structure population where each clone had many full and half-sibs. Multiple markers can link to a given QTL. Therefore, effects of all alleles at the QTL with multi-allelic state could be estimated using bi-allelic markers. For example, three different bi-allelic marker loci that were closely linked to a QTL could combined to code for up to eight different alleles of the QTL. This could be the reason that regardless of possible multiple alleles in QTL, bi-allelic markers with reasonably high density could predict breeding values in many genomic selection programs (Chu et al., 2019; Samorè and Fontanesi, 2016; Hayes et al., 2013).

In example 1, the differences in rate of genetic gain between different degree of multi-allelisms is primarily due to genetic variances. Although the base population variances are simulated as the same values between two populations, existence of multi-allelism can have higher potential variance, or lower loss of genetic variance under selection. For example, selection led to a removal of a “bad” allele at a QTL in the population. The genetic variance due to QTL would be zero in the bi-allelic case, but might be not in the multi-allelic population. However, maintenance of multi-allelic state might require a bigger effective population size. Otherwise, the multi-allelic state could be lost due to random sampling. This could be the explanation for a higher genetic variance of multi-allelic population in case of tetraploids, but not in diploids in example 1.

While the simulation model including additive, dominance and epistasis was the same for scenarios in example 2, different prediction models (Equations 11–13) were employed for selection. Definition of accuracy of prediction is unclear in literature when different prediction models were used in this case, and particularly when multi-allelic QTL was assumed. Therefore, the rate of genetic gains was used as the main criteria to compare prediction models. Surprisingly, the use of different prediction models did not lead to significant changes in the rate of genetic gains. In other words, the use of correct prediction model for selection, i.e., prediction model and simulation model were more similar, did not improve genetic gains of the breeding scheme. The higher level of multi-allelic assumptions for QTL tended to reduce the genetic gains in example 2, which might be due to lower accuracy of prediction. However, the reduction was not significant.

Nonetheless, examples in this paper are small-scale studies to test ADAM-Multi for multi-allelic features. Many other factors that may affect genetic gain, accuracy of predictions and genetic variances in multi-allelic populations include LD between markers and QTL alleles, population structure, population size, and prediction model. Like other software (Gaynor et al., 2021; Pook et al., 2020; Younis et al., 2023), ADAM-Multi uses functional effects for simulating genotypic values of individuals. Functional effects are independent of allele frequency, thus convenient for studying the consequence of selection in breeding programs (Chu et al., 2024). However, the functional effects and variance parameters cannot be obtained directly by model estimation using real data. Therefore, it is difficult to ensure user-defined parameters for the simulated populations. Just like other software (Gaynor et al., 2021; Pook et al., 2020; Younis et al., 2023), ADAM-Multi is still missing an important feature for a transformation between functional and statistical parameters. More theories are needed to be developed for this transformation, particularly, in case of multi-allelic QTL. Nonetheless, with a more realistic assumption of QTL, ADAM-Multi opens research possibility to study the use of genotyping technology of bi-allelic markers, or the need of new genotyping technology to improve accuracy of selection. Particularly, this assumption of QTL remains very relevant for genomic prediction studies involving multiple breeds and populations. For example, different functional effects of QTL could be assumed in two populations, e.g., González-Diéguez et al. (2021).

This paper presented single-trait models, but the program, ADAM-Multi, can be used for simulating multiple traits with different levels of correlations. For example, the scaling and rescaling procedures in simulation of multiple traits use matrix multiplication, inversion and Cholesky decomposition instead of number multiplication, devision and square root calculations as indicated in this paper. Another note is that the number of alleles $n_{B}$ could be defined individually for each of QTL (Supplementary Appendix 1). However, this paper assumes a defined $n_{B}$ for all QTL even when not all alleles in a QTL are segregating. With this assumption, non-segregating alleles’ effects can affect the mean, but the mean can be altered with adding constant values to $μ$ just like in Equation 9. On the contrary, non-segregating alleles do not affect the functional or statistical variances as the frequencies of these alleles are fixed. In addition, additive effects are centered in our simulation models, but the dominance effects are not. This assumption of dominance leads to a positive effect of heterozygous genotypes, as recommended in Wellmann and Bennewitz (2011).

5 Conclusion

This paper presented a simulation model capable of simulating genotypic effects generalized for multiple allelic models and different ploidy levels. This model accommodates genotypic effects of additive, dominance, and epistasis. When assuming bi-allelic QTL, the generalized model becomes identical to the model assumption in common simulation programs, and in genetic textbooks. This model is integrated in our software ADAM-Multi.

In a small-scale study, we have shown that with a reasonable density of bi-allelic markers and a well-structured population, genomic models can effectively predict breeding values despite the presence of multi-allelic QTL. It was also shown that the inclusion of non-additive genetic effects in the prediction model for selection did not lead to a significant improvement in the rate of genetic gains of a breeding program.

Data availability statement

The original contributions presented in the study are included in the article/Supplementary Material, further inquiries can be directed to the corresponding author.

Author contributions

TC: Conceptualization, Data curation, Formal Analysis, Investigation, Methodology, Software, Validation, Visualization, Writing–original draft, Writing–review and editing. JJ: Funding acquisition, Methodology, Project administration, Resources, Supervision, Validation, Writing–review and editing.

Funding

The author(s) declare that financial support was received for the research, authorship, and/or publication of this article. The research as a part of project RESPOT was funded by Ministry of Food, Agriculture and Fisheries of Denmark under the Green Development and Demonstrations Program (grant no. 34009-20-1643).

Acknowledgments

We would like to thank Merethe Bagge and the breeders from Danespo for discussions on their breeding program for potatoes, and the presence of multi-allelism in potato population.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Generative AI statement

The author(s) declare that no Generative AI was used in the creation of this manuscript.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Supplementary material

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fgene.2025.1513615/full#supplementary-material

References

Álvarez-Castro, J. M., and Carlborg, O. R. (2007). A unified model for functional and statistical epistasis and its application in quantitative trait loci analysis. Genetics. 176, 1151–1167. doi:10.1534/genetics.106.067348

PubMed Abstract | CrossRef Full Text | Google Scholar

Álvarez-Castro, J. M., and Crujeiras, R. M. (2019). Orthogonal decomposition of the genetic variance for epistatic traits under linkage disequilibrium-applications to the analysis of bateson-dobzhansky-müller incompatibilities and sign epistasis. Front. Genet. 10, 54. doi:10.3389/fgene.2019.00054

PubMed Abstract | CrossRef Full Text | Google Scholar

Álvarez-Castro, J. M., and Yang, R.-C. (2011). Multiallelic models of genetic effects and variance decomposition in non-equilibrium populations. Genetica 139, 1119–1134. doi:10.1007/s10709-011-9614-9

PubMed Abstract | CrossRef Full Text | Google Scholar

Amadeu, R. R., Garcia, A. A. F., Munoz, P. R., and Ferrão, L. F. V. (2023). AGHmatrix: genetic relationship matrices in R. Bioinformatics. 39, btad445. doi:10.1093/bioinformatics/btad445

PubMed Abstract | CrossRef Full Text | Google Scholar

Bengtsson, C., Thomasen, J. R., Kargo, M., Bouquet, A., and Slagboom, M. (2022). Emphasis on resilience in dairy cattle breeding: possibilities and consequences. J. Dairy Sci. 105, 7588–7599. doi:10.3168/jds.2021-21049

PubMed Abstract | CrossRef Full Text | Google Scholar

Biová, J., Kaňovská, I., Chan, Y. O., Immadi, M. S., Joshi, T., Bilyeu, K., et al. (2024). Natural and artificial selection of multiple alleles revealed through genomic analyses. Front. Genet. 14, 1320652. doi:10.3389/fgene.2023.1320652

PubMed Abstract | CrossRef Full Text | Google Scholar

Chen, C. J., Garrick, D., Fernando, R., Karaman, E., Stricker, C., Keehan, M., et al. (2022). XSim version 2: simulation of modern breeding programs. G3 (Bethesda). 12, jkac032. doi:10.1093/g3journal/jkac032

PubMed Abstract | CrossRef Full Text | Google Scholar

Christensen, O. F., Madsen, P., Nielsen, B., Ostersen, T., and Su, G. (2012). Single-step methods for genomic evaluation in pigs. Animal Int. J. Animal Biosci. 6, 1565–1571. doi:10.1017/s1751731112000742

PubMed Abstract | CrossRef Full Text | Google Scholar

Chu, T. T., Bastiaansen, J. W. M., Berg, P., Romé, H., Marois, D., Henshall, J., et al. (2019). Use of genomic information to exploit genotype-by-environment interactions for body weight of broiler chicken in bio-secure and production environments. Genet. Sel. Evol. 51, 50. doi:10.1186/s12711-019-0493-3

PubMed Abstract | CrossRef Full Text | Google Scholar

Chu, T. T., Henryon, M., Jensen, J., Ask, B., and Christensen, O. F. (2021). Statistical model and testing designs to increase response to selection with constrained inbreeding in genomic breeding programs for pigs affected by social genetic effects. Genet. Sel. Evol. 53, 1. doi:10.1186/s12711-020-00598-8

PubMed Abstract | CrossRef Full Text | Google Scholar

Chu, T. T., Kristensen, P., and Jensen, J. (2024). Simulation of functional additive and non-additive genetic effects using statistical estimates from quantitative genetic models. Hered. (Edinb). 133, 33–42. doi:10.1038/s41437-024-00690-5

PubMed Abstract | CrossRef Full Text | Google Scholar

Chu, T. T., Sørensen, A. C., Lund, M. S., Meier, K., Nielsen, T., and Su, G. (2020). Phenotypically selective genotyping realizes more genetic gains in a rainbow trout breeding program in the presence of genotype-by-environment interactions. Front. Genet. 866, 866. doi:10.3389/fgene.2020.00866

PubMed Abstract | CrossRef Full Text | Google Scholar

Da, Y. (2015). Multi-allelic haplotype model based on genetic partition for genomic prediction and variance component estimation using SNP markers. BMC Genet. 16, 144. doi:10.1186/s12863-015-0301-1

PubMed Abstract | CrossRef Full Text | Google Scholar

Endelman, J. B., Carley, C. A. S., Bethke, P. C., Coombs, J. J., Clough, M. E., da Silva, W. L., et al. (2018). Genetic variance partitioning and genome-wide prediction with allele dosage information in autotetraploid potato. Genetics. 209, 77–87. doi:10.1534/genetics.118.300685

PubMed Abstract | CrossRef Full Text | Google Scholar

Falconer, D. S., and Mackay, T. F. (1996). Introduction to quantitative genetics. England: Pearson Prentice Hall.

Google Scholar

Faux, A. M., Gorjanc, G., Gaynor, R. C., Battagin, M., Edwards, S. M., Wilson, D. L., et al. (2016). AlphaSim: software for breeding program simulation. Plant Genome 9. doi:10.3835/plantgenome2016.02.0013

PubMed Abstract | CrossRef Full Text | Google Scholar

Fisher, R. A. (1930). The genetical theory of natural selection. Oxford: Clarendon.

CrossRef Full Text | Google Scholar

Gaynor, R. C., Gorjanc, G., and Hickey, J. M. (2021). AlphaSimR: an R package for breeding program simulations. G3 (Bethesda). 11, jkaa017. doi:10.1093/g3journal/jkaa017

PubMed Abstract | CrossRef Full Text | Google Scholar

Gebreyesus, G., Sahana, G., Christian Sørensen, A., Lund, M. S., and Su, G. (2020). Novel approach to incorporate information about recessive lethal genes increases the accuracy of genomic prediction for mortality traits. Hered. (Edinb). 125, 155–166. doi:10.1038/s41437-020-0329-5

PubMed Abstract | CrossRef Full Text | Google Scholar

González-Diéguez, D., Legarra, A., Charcosset, A., Moreau, L., Lehermeier, C., Teyssèdre, S., et al. (2021). Genomic prediction of hybrid crops allows disentangling dominance and epistasis. Genetics 218, iyab026. doi:10.1093/genetics/iyab026

PubMed Abstract | CrossRef Full Text | Google Scholar

Hayes, B. J., Lewin, H. A., and Goddard, M. E. (2013). The future of livestock breeding: genomic selection for efficiency, reduced emissions intensity, and adaptation. Trends Genet. 29, 206–214. doi:10.1016/j.tig.2012.11.009

PubMed Abstract | CrossRef Full Text | Google Scholar

Henryon, M., Liu, H., Berg, P., Su, G., Nielsen, H. M., Gebregiwergis, G. T., et al. (2019). Pedigree relationships to control inbreeding in optimum-contribution selection realise more genetic gain than genomic relationships. Genet. Sel. Evol. 51, 39. doi:10.1186/s12711-019-0475-5

PubMed Abstract | CrossRef Full Text | Google Scholar

Henryon, M., Ostersen, T., Ask, B., Sørensen, A. C., and Berg, P. (2015). Most of the long-term genetic gain from optimum-contribution selection can be realised with restrictions imposed during optimisation. Genet. Sel. Evol. 47, 21. doi:10.1186/s12711-015-0107-7

PubMed Abstract | CrossRef Full Text | Google Scholar

Jiang, Y., Chen, S., Wang, X., Liu, M., Iacono, W. G., Hewitt, J. K., et al. (2020). Association analysis and meta-analysis of multi-allelic variants for large-scale sequence data. Genes 11, 586. doi:10.3390/genes11050586

PubMed Abstract | CrossRef Full Text | Google Scholar

Liu, H., Tessema, B. B., Jensen, J., Cericola, F., Andersen, J. R., and Sørensen, A. C. (2019). ADAM-plant: a software for stochastic simulations of plant breeding from molecular to phenotypic level and from simple selection to complex speed breeding programs. Front. Plant Sci. 9, 1926. doi:10.3389/fpls.2018.01926

PubMed Abstract | CrossRef Full Text | Google Scholar

Madsen, P., and Jensen, J. (2013). DMU: a user’s guide. A package for analysing multivariate mixed models. Available at: https://dmu.ghpc.au.dk/dmu/DMU/(Accessed September 12, 2021).

Google Scholar

Massa, A. N., Manrique-Carpintero, N. C., Coombs, J., Haynes, K. G., Bethke, P. C., Brandt, T. L., et al. (2018). Linkage analysis and QTL mapping in a tetraploid russet mapping population of potato. BMC Genet. 19, 87. doi:10.1186/s12863-018-0672-1

PubMed Abstract | CrossRef Full Text | Google Scholar

Pedersen, L., Sørensen, A., Henryon, M., Ansari-Mahyari, S., and Berg, P. (2009). ADAM: a computer program to simulate selective breeding schemes for animals. Livest. Sci. 121, 343–344. doi:10.1016/j.livsci.2008.06.028

CrossRef Full Text | Google Scholar

Pérez-Enciso, M., Ramírez-Ayala, L. C., and Zingaretti, L. M. (2020). SeqBreed: a python tool to evaluate genomic prediction in complex scenarios. Genet. Sel. Evol. GSE 52, 7. doi:10.1186/s12711-020-0530-2

PubMed Abstract | CrossRef Full Text | Google Scholar

Pook, T., Schlather, M., and Simianer, H. (2020). MoBPS - modular breeding program simulator. G3 Genes. Genomes. Genetics 10, 1915–1918. doi:10.1534/g3.120.401193

PubMed Abstract | CrossRef Full Text | Google Scholar

Romé, H., Chu, T. T., Marois, D., Huang, C.-H., Madsen, P., and Jensen, J. (2023). Estimation and consequences of direct-maternal genetic and environmental covariances in models for genetic evaluation in broilers. Genet. Sel. Evol. 55, 58. doi:10.1186/s12711-023-00829-8

PubMed Abstract | CrossRef Full Text | Google Scholar

Samorè, A. B., and Fontanesi, L. (2016). Genomic selection in pigs: state of the art and perspectives. Ital. J. Anim. Sci. 15, 211–232. doi:10.1080/1828051X.2016.1172034

CrossRef Full Text | Google Scholar

Sargolzaei, M., and Schenkel, F. S. (2009). QMSim: a large-scale genome simulator for livestock. Bioinformatics. 25, 680–681. doi:10.1093/bioinformatics/btp045

PubMed Abstract | CrossRef Full Text | Google Scholar

Tessema, B. B., Liu, H., Sørensen, A. C., Andersen, J. R., and Jensen, J. (2020). Strategies using genomic selection to increase genetic gain in breeding programs for wheat. Front. Genet. 11, 578123. doi:10.3389/fgene.2020.578123

PubMed Abstract | CrossRef Full Text | Google Scholar

Thérèse Navarro, A., Tumino, G., Voorrips, R. E., Arens, P., Smulders, M. J. M., van de Weg, E., et al. (2022). Multiallelic models for QTL mapping in diverse polyploid populations. BMC Bioinforma. 23, 67. doi:10.1186/s12859-022-04607-z

PubMed Abstract | CrossRef Full Text | Google Scholar

VanRaden, P. M. (2008). Efficient methods to compute genomic predictions. J. Dairy Sci. 91, 4414–4423. doi:10.3168/jds.2007-0980

PubMed Abstract | CrossRef Full Text | Google Scholar

Vitezica, Z. G., Legarra, A., Toro, M. A., and Varona, L. (2017). Orthogonal estimates of variances for additive, dominance, and epistatic effects in populations. Dominance, Epistatic Eff. Populations 206, 1297–1307. doi:10.1534/genetics.116.199406

PubMed Abstract | CrossRef Full Text | Google Scholar

Voorrips, R. E., and Maliepaard, C. A. (2012). The simulation of meiosis in diploid and tetraploid organisms using various genetic models. BMC Bioinforma. 13, 248. doi:10.1186/1471-2105-13-248

PubMed Abstract | CrossRef Full Text | Google Scholar

Weber, S. E., Frisch, M., Snowdon, R. J., and Voss-Fels, K. P. (2023). Haplotype blocks for genomic prediction: a comparative evaluation in multiple crop datasets. Front. plant Sci. 14, 1217589. doi:10.3389/fpls.2023.1217589

PubMed Abstract | CrossRef Full Text | Google Scholar

Wellmann, R., and Bennewitz, J. (2011). The contribution of dominance to the understanding of quantitative genetic variation. Genet. Res. 93, 139–154. doi:10.1017/S0016672310000649

PubMed Abstract | CrossRef Full Text | Google Scholar

Yang, R.-C., and Álvarez-Castro, J. M. (2008). Functional and statistical genetic effects with multiple alleles.

Google Scholar

Younis, O. G., Turchetta, M., Ariza Suarez, D., Yates, S., Studer, B., Athanasiadis, I. N., et al. (2023). ChromaX: a fast and scalable breeding program simulator. Bioinformatics 39, btad691. doi:10.1093/bioinformatics/btad691

PubMed Abstract | CrossRef Full Text | Google Scholar

Zaalberg, R. M., Villumsen, T. M., Jensen, J., and Chu, T. T. (2022). Effective selection for lower mortality in organic pigs through selection for total number born and number of dead piglets. Animals. 12, 1796. doi:10.3390/ani12141796

PubMed Abstract | CrossRef Full Text | Google Scholar

Keywords: stochastic breeding program, genotypic model, polyploidy, dominance, epistasis

Citation: Chu TT and Jensen J (2025) ADAM-multi: software to simulate complex breeding programs for animals and plants with different ploidy levels and generalized genotypic effect models to account for multiple alleles. Front. Genet. 16:1513615. doi: 10.3389/fgene.2025.1513615

Received: 18 October 2024; Accepted: 17 January 2025;
Published: 10 February 2025.

Edited by:

Kui Zhang, Michigan Technological University, United States

Reviewed by:

Zitong Li, Commonwealth Scientific and Industrial Research Organisation (CSIRO), Australia
Cheng Gao, University of Florida, United States

Copyright © 2025 Chu and Jensen. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Thinh Tuan Chu, Y2h1LnRoaW5oQHFnZy5hdS5kaw==

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.