Skip to main content
  • Research Paper
  • Open access
  • Published:

Unlocking genome-based prediction and selection in conifers: the key role of within-family prediction accuracy illustrated in maritime pine (Pinus pinaster Ait.)

Abstract

Key message

Based on experimental and simulated data for maritime pine (Pinus pinaster Ait.) in a genomic selection context, our study reveals that the often-highlighted equivalence between genome-based and pedigree-based prediction accuracies of breeding values is associated with a zero accuracy of genome-based prediction within families, which can be attributed to the still insufficient size of the genomic training sets for conifers.

Context

Genomic selection is a promising approach for forest tree breeding. However, its advantage in terms of prediction accuracy over conventional pedigree-based methods is unclear and within-family accuracy is rarely assessed.

Aims

We used a pedigree-based model (ABLUP) with corrected pedigree data as a baseline reference for assessing the prediction accuracy of genome-based model (GBLUP) at the global and within-family levels in maritime pine (Pinus pinaster Ait).

Methods

We considered 39 full-sib families, each comprising 10 to 40 individuals, to constitute an experimental population of 833 individuals. A stochastic simulation model was also developed to explore other scenarios of heritability, training set size, and marker density.

Results

Prediction accuracies with GBLUP and ABLUP were similar, and within-family accuracy with GBLUP was on average zero with large variation between families. Simulations revealed that the number of individuals in the training set was the principal factor limiting GBLUP accuracy in our study and likely in many forest tree breeding programmes. Accurate within-family prediction is possible if 40–65 individuals per full-sib family are included in the genomic training set, from a total of 1600–2000 individuals in the training set.

Conclusions

The increase in the number of individuals per family in the training set lead to a significant advantage of GBLUP over ABLUP in terms of prediction accuracy and more clearly justify the switch to genome-based prediction and selection in forest trees.

1 Introduction

The use of pedigree information in populations with genealogy records has improved breeding programmes for many species. Pedigree information can be used to infer the expected relatedness between each pair of individuals, making it possible to gauge the extent to which phenotypic values of individuals in the studied population have a genetic basis. This pedigree-based model underpinned the development of predictions of the individual additive genetic value, or the breeding value, via BLUP (Best Linear Unbiased Prediction) methodology (Henderson 1975). The breeding value of each individual can be decomposed into parent average and Mendelian sampling terms. The parent average term captures variation between families and it represents the expected breeding value of progeny given its parents. The Mendelian sampling term captures variation within families and it represents deviation of each individual’s breeding value from the parent average due to recombination and segregation of parental genomes. With a pedigree-based model, we need phenotypic values on an individual or its progeny to estimate the Mendelian sampling term of the individual’s breeding value. Hence, pedigree-based prediction of breeding values for non-phenotyped individuals (forward prediction) captures only the parent average term. By using genomic data, we observe outcome of recombination and segregation of parental genomes as well as recent or past mutations, meaning that we can in principle estimate parent average and Mendelian sampling terms of breeding value even for non-phenotyped individuals (VanRaden 2008; Hill and Weir 2011).

The advent of affordable genome-wide DNA marker genotyping platforms has enabled such genome-based predictions to pave the way to genomic selection (GS). Albeit initially proposed by Bernardo (1994) and Nejati-Javaremi et al. (1997), genome-based prediction took off with the work of Meuwissen et al. (2001), which showed how regressing individuals’ phenotypic values onto their genome-wide marker genotypes captured variation between individuals’ breeding values by leveraging linkage-disequilibrium between quantitative trait loci (QTL) affecting traits of interest and the genome-wide markers. Using a training set of individuals that have been phenotyped and genotyped, the model estimates associations between variation in genome-wide markers and variation in phenotypic values. This means that the associations can be used to predict breeding values for non-phenotyped individuals which have genomic information. Such genome-based predictions have revolutionized many breeding programmes, enabling an efficient and early selection of candidates and leading to significant genetic and economic gain per unit of time (Crossa et al. 2017; Hayes et al. 2009a, 2009b; Pryce et al. 2011).

Genome-based prediction is of particular interest in forest trees, as it could decrease the length of breeding cycles, which are long for such species, and cut the cost of phenotyping complex traits, such as drought tolerance and disease resistance (Grattapaglia & Resende 2011, Isik 2014). Driven by the promising results obtained from simulations and first empirical approaches (Grattapaglia et al., 2011; Grattapaglia & Resende 2011; Iwata et al. 2011), increasing numbers of experimental GS studies have been performed in recent years on many forest tree species (see Lebedev et al. 2020 and Beaulieu et al. 2024 for recent reviews). Many studies have highlighted the attractiveness of genome-based prediction by reporting moderate to high prediction accuracies (Durán et al. 2017; Isik et al. 2016; Resende et al. 2012a, 2012b) and improved genetic gain per unit of time due to 20–50% shorter generation interval for GS (Chen et al. 2018; Lenz et al. 2017; Ratcliffe et al. 2015; Resende et al., 2012a, 2012b). However, such a reduction in generation interval is also possible with pedigree-based prediction (if considering forward selection). Thus, GS overcomes pedigree-based selection only if genome-based predictions are more accurate than mid-parental breeding values (i.e. pedigree-based prediction estimated at the seedling age). In fact, several studies in forest tree breeding report that pedigree-based predictions have similar accuracy to genome-based predictions (Beaulieu et al. 2014; Lenz et al. 2020a2020b; Thistlethwaite et al. 2017, 2019; Zapata-Valenzuela et al. 2012, 2013; Zhou et al. 2020). Furthermore, the use of an incomplete or error-containing pedigree tends to distort the comparison with genomic data (El-Dien et al. 2018; Li et al. 2019). Such errors may be common in forest tree breeding programmes and this penalizes pedigree-based evaluation (Doerksen and Herbinger 2010; Munoz et al. 2014). The advantage of GS may therefore stem at least partly from the errors inherent to pedigree-based selection (Lenz et al. 2020a, 2020b). Clarifying the conditions under which genome-based models can deliver real benefits remains a prerequisite for full exploitation of the advantages of GS in forest trees.

The access to within-family variability provided by molecular markers should increase the benefits of GS relative to pedigree-based selection, making it possible to improve the management of diversity. Indeed, accurate within-family prediction would facilitate the exploitation of within-family genetic variability rather than inter-familial variability, preventing the over-representation of certain lineages during selection and subsequent drift (Allier et al., 2019; Jannink 2010; Rauf et al. 2010). However, little attention has been paid to the accuracy of genome-based prediction within families in forest trees, mostly due to the limited number of individuals per half- and full-sib family generally used in progeny trials. This issue has been addressed in only a few studies. Fuentes et al. (2017) and Cros et al. (2019) each studied a single large full-sibling family, making it difficult to extrapolate their results to more general cases. In three other studies (Pégard et al. 2020; Resende et al. 2017; Ukrainetz & Mansfield, 2019), the accuracies of within-family predictions were substantial, but variable.

Maritime pine (Pinus pinaster Ait.) covers 4.2 million hectares in south-western Europe (Abad Viñas et al., 2016). A breeding programme based on a recurrent selection scheme was initiated for this species in France in the 1960s (Durel 1992, GIS 2002). Starting from a base population (600 G0 individuals) selected for growth, environmental adaptation and stem straightness, two breeding cycles were performed using the estimated breeding values from a pedigree-based model (Bouffier et al. 2016). The potential of genome-based prediction in maritime pine breeding has already been highlighted in two previous studies (Isik et al. 2016; Bartholomé et al. 2016), but, as for most forest tree species, it is essential to investigate in greater depth the conditions in which genome-based prediction is clearly superior to pedigree-based one.

The aim of this study was to evaluate the ability of genome-based prediction to capture the Mendelian sampling term in a maritime pine breeding population with empirical and simulation approaches. In both cases, the accuracy of genome-based prediction was estimated at the population as well as at the within-family levels and was compared with that of pedigree-based prediction. The real data were obtained for a population of 39 full-sib families with family sizes ranging from 10 to 40 individuals per family. The simulation, designed to mimic the conditions of the maritime pine breeding programme, added other scenarios not observed in the real population, including variations of heritability, training set size and marker density.

2 Materials and methods

2.1 Exploring accuracy of predictions with real data

2.1.1 Maritime pine trial

A maritime pine trial was established in 2011 in the Landes de Gascogne forest at Le Barp (Lat 44.62, Long − 0.77). A complete block design was used, with 89 full-sib families and 10 checklots, each containing 48 individuals planted in six-tree plots (1250 trees/ha). The full-sib families considered were taken from the third generation of the French maritime pine breeding programme (i.e., the pedigree of the trees was known to grandparent level).

Preliminary simulations were performed to determine the most appropriate proportion of families and offspring per family from the total trial population to maximize GS accuracy at both global and within-family levels (Appendix 1). Based on the simulation, an optimal sample of 40 families was obtained, 30 with 20 individuals and 10 with 40 individuals. The selected families were representative of the genetic diversity present in the trial and the within-family samples were representative of the phenotypic variability of each family. The larger families corresponded to five of the best-related and five of the worst-related families (referred to hereafter as “well-related” and “poorly related” families). The average relatedness to the rest of the population was 0.03 for the well-related families and 0.01 for the poorly related families (calculated from pedigree data). Considering families with large numbers of offspring is key to investigating within-family predictive ability. After genotyping, our study set, called POPR, contained 833 individuals with an effective population size of 25 (Lindgren et al. 1996). Thirty-nine families could be used to assess within-family predictive ability, including nine families of more than 30 individuals each.

2.1.2 Genomic and pedigree information for POPR

Genomic DNA was extracted from young needles collected from each individual of POPR. DNA quantification and quality control were performed by fluorimetry (Qubit 2.0, Life Technologies, Thermo Fisher Scientific, USA) and spectrophotometry (NanoDrop Technologies, Wilmington, DE, USA), respectively. Genotyping was performed by Thermo Fisher Scientific (Thermo Fisher Scientific, Santa Clara, CA, USA) with the 4TREE Axiom 50 K single-nucleotide polymorphisms (SNP) multi-species array (Guilbaud et al. 2020). Samples with a call rate below 97% were excluded from further analysis. In addition to the quality control filters at the SNP level (CallRate ≥ 85%, fld-cutoff ≥ 3.2, het-so-cutoff: ≥ − 0.3) suggested by Thermo Fisher Scientific, we also excluded SNPs with more than 5% Mendelian segregation errors, SNPs with a repeatability below 98% (estimated with 42 duplicated samples), and SNPs with a minor allele frequency (MAF) below 1%. Finally, 833 individuals characterized for 8235 SNPs were available for this study. Missing genotypes were imputed by assigning the average genotype within the full-sib family. It was not possible to apply more sophisticated imputation methods due to the lack of a genetic map. We computed a realized genomic relationship matrix (G) following VanRaden (2008), with the R package AGHmatrix (Amadeu et al. 2016):

$$\begin{array}{c}G=\frac{\left({{M}}-{{P}}\right){\left({{M}}-{{P}}\right)}{^{\prime}}}{2\Sigma {p}_{i}\left(1-{p}_{i}\right)}\end{array}$$
(1)

where \({\boldsymbol{M}}\) and \({\boldsymbol{P}}\) are matrices of dimensions \(n\) (number of individuals) × \(p\) (number of markers). \({\boldsymbol{M}}\) gives the genotype at each locus, coded − 1 for one of the homozygotes, 0 for heterozygotes and 1 for the other homozygote, and the \({\boldsymbol{P}}\) is a matrix of allele frequencies expressed as \(2({p}_{i}-0.5)\), where \({p}_{i}\) is the observed allele frequency at marker \(i\) for all genotyped individuals.

Pedigree information was also available for POPR at the parental (41 seed parents and 40 pollen parents) and grandparental (103 initial progenitors from the base population of the breeding programme) levels. Pedigree errors were detected with the R package pedtools (Vigeland, 2021), by comparing the genotyping data of POPR with the genotyping data available for 78 of 81 parents. Pedigree errors were detected for 35 individuals (4% of individuals), i.e. when more than 1% of mismatches were detected, for all SNPs, between these individuals and their theoretical parents. Non-genotyped parents were assumed to be correct. Of these 35 individuals, 4 individuals were reassigned to existing families, 10 individuals initially assigned to one family were identified as new full-sib family, as well as 6 individuals initially assigned to another family were identified as a second new full-sib family, and 15 individuals were considered of unknown parentage. The total number of families has therefore increased to 42, but within-family predictive ability was only assessed for 39 full-sib families since 3 families did not contain enough individuals (< 10 individuals) for a reliable assessment. More precisely, within-family predictive ability was assessed for 9 large families with a mean of 34 individuals per family (30 to 40) and 30 families with a mean of 17 individuals per family (10 to 20). A complete corrected version of the pedigree was used to calculate an additive relationship matrix A, for a total of 1014 individuals.

2.1.3 Phenotypic data

All individuals in the original trial, including those of POPR, were phenotyped at the age of 8 years for height (HT) and deviation of the stem from verticality (DEV). Direct phenotypic values were then used in the subsequent genomic selection models after a step of pre-adjustment (7.1.3.). Narrow-sense heritability for HT and DEV within POPR were 0.13 and 0.21, respectively, for estimates based on genomic data, and 0.17 and 0.25, respectively, for estimates based on pedigree information.

2.1.4 Genome-based and pedigree-based models

Breeding values were estimated for each trait and for the n individuals using the model:

$$\begin{array}{c}y=1\mu +Zu+e\end{array}$$
(2)

where \({\boldsymbol{y}}\) is the vector of adjusted phenotypes (dimension \(n x 1\)), 1 is the vector of 1 s, \(\mu\) is the population mean associated with a vector 1 of dimension (\(n x 1\)), \({\boldsymbol{Z}}\) is the incidence matrix (dimension \(n x n\)) connecting the phenotypes to the vector of breeding values \({\boldsymbol{u}}\) (dimension \(n x 1\)) and \({\boldsymbol{e}}\) is the vector of residuals (dimension \(n x 1\)). \({\boldsymbol{u}}\) and \({\boldsymbol{e}}\) are assumed to be independent from each other and to follow a normal distribution of the form \({\boldsymbol{u}} \sim N(0,{{\boldsymbol{X}}\sigma }_{u}^{2})\) and \({\boldsymbol{e}} \sim N(0,{{{\boldsymbol{I}}}_{n}\sigma }_{e}^{2})\), where \({\boldsymbol{X}}\) is either the genome-based (realized) additive relationship matrix \({\boldsymbol{G}}\) or pedigree-based (expected) additive relationship matrix \({\boldsymbol{A}}\), \({\sigma }_{u}^{2}\) is the associated variance of breeding values, \({{\boldsymbol{I}}}_{n}\) is the n-dimensional identity matrix and \({\sigma }_{e}^{2}\) is the variance of residual effects. Mixed-model equations were solved to predict the random genetic effects \({\boldsymbol{u}}\):

$$\begin{array}{c}\left[\begin{array}{cc} {1}^{\prime}1& {1}^{\prime}{{Z}}\\ {{{Z}}}^{\prime}1& {{{Z}}}^{\prime}{{Z}}+{{{X}}}^{-1}\alpha \end{array}\right] \left[\widehat{\begin{array}{c}\mu \\ \widehat{{{u}}}\end{array}}\right] = \left[\begin{array}{c}{1}^{\prime}y\\ {{{Z}}}^{\prime}y\end{array}\right]\end{array}$$
(3)

where \({{\boldsymbol{X}}}^{-1}\) is the inverse of \({\boldsymbol{X}}\) and \(\alpha=\sigma_e^2/\sigma_u^2\) (Henderson 1975; Mrode and Pocrnic 2023). These two model versions (GBLUP and ABLUP) respectively produced genome-based (GEBV) and pedigree-based (EBV) estimates / predictions of breeding values. All model fitting was performed with the R.4.2.2 environment (R Core Team 2022) using the R package breedR (Muñoz and Sanchez 2020).

2.1.5 Cross-validation scenarios and assessment of prediction accuracy

Two cross-validation scenarios (CV) were used to assess the prediction accuracies of the GBLUP and ABLUP models (Fig. 1): CV1 was used to assess within-family predictive ability for the 39 families of POPR, whereas CV2 focused on the nine large families (i.e. the families from which 40 individuals were sampled). In the CV1 scenario, the training set (Tset) included 40% of each family and the validation set (Vset) included the remaining 60% of each family. The CV2 scenario was divided into six subscenarios. For the first subscenario, the Tset consisted of POPR minus the large families and the Vset included all the individuals from large families. The number of individuals from large families included in the Tset was progressively increased in the other five subscenarios (5, 10, 15, 20 and then 25 individuals per large family), with a corresponding reduction of the contribution of other families to Tset so as to maintain a constant Tset size (the individuals from non-large families not included in the Tset became part of the Vset, thereby also keeping the size of this set constant).

Fig. 1
figure 1

Cross-validation scenarios CV1 and CV2 performed with ABLUP and GBLUP models

Each cross-validation subscenario was replicated 100 times. For each replicate, within-family predictive abilities were defined as Pearson’s correlation coefficients between adjusted phenotypes (\(y\)) and (G)EBV (\(\widehat{y}\)) for individuals in the Vset, considering each family separately. Note that within-family prediction is only meaningful for the GBLUP model. Namely, the ABLUP model will only predict parent average component of the breeding value for non-phenotyped full-sibs, meaning that the resulting EBV will be the same for all full-sibs giving a within-family predictive ability of 0.

At the global level (i.e. considering all individuals in the Vset simultaneously), we rather estimated prediction accuracy to facilitate comparison with other studies, which frequently report this metric (Legarra et al. 2008):

$$\begin{array}{c}accuracy= \frac{r\left(y,\widehat{u}\right)}{\sqrt{\widehat{{h}_{y}^{2}}}}\end{array}$$
(4)

where \(\widehat{{h}_{y}^{2}}\) is the estimated heritability of the trait. We did not use prediction accuracy as the metric for within-family analyses because dividing by the heritability defined at population level gives rise to values much higher than 1 or lower than − 1. Instead, we focused on the deviation from 0.

2.2 Identifying key parameters for prediction accuracy with simulations

2.2.1 Description of the simulation model

Stochastic simulations based on an allelic model were performed with the R package AlphaSimR (Gaynor et al. 2021). Briefly, from a base population representative of the maritime pine genome (Chagné et al. 2002; Chancerel et al. 2013; Jaramillo-Correa et al., 2020; Milesi et al. 2023), we simulated successive breeding populations based on a single trait (equivalent to HT), taking into account the characteristics of the real French maritime pine breeding programme. Details of the simulations are provided in 8.1., and the code is available at https://github.com/HighlanderLab/vpapin_pine_gs. Simulated phenotypes and genotypes for the final population, POPS (the simulated version of POPR), were used to fit GBLUP and ABLUP models as described for the real data. The analyses were performed for the CV1 scenario (40% of individuals in each family included in the Tset). Heritability of simulated phenotypes was set to 0.13, and the number of markers used was 8235 to mimic the real-life data as closely as possible. Ten independent replications of the entire process described above were performed to ensure that the results were robust.

2.2.2 ABLUP and GBLUP prediction accuracy under different scenarios

Stochastic simulations were used to extend the comparison of prediction accuracy between GBLUP and ABLUP to different scenarios in which trait heritability (h2), training set size (nTset) and the number of markers (nSNP) were varied. For this purpose, POPS was extended by generating 100 individuals for each of the 42 initially sampled families. Accuracy was assessed with a unique cross-validation scenario similar to CV1 (all families contributing equally to Tset) but with a fixed-size Vset of 1200 individuals (evenly distributed between families). The values taken by the three parameters mimic real possibilities for maritime pine breeding:

  • The size of Tset was set to nTset = 400, 600, 1,600 or 2,600 individuals, corresponding to 10, 15, 40 and 65 individuals per family, respectively. Such numbers are usually available in most forest tree breeding programmes. These numbers are also economically viable, because they require only 10, 15, 40 and 65% of the population to be phenotyped.

  • The number of markers was set to nSNP = 8235 or 17,220 or 35,000 SNPs, corresponding to marker densities of 5.7, 12 and 24 markers/cM, respectively. Data for 8235 SNPs are already available in our real maritime pine dataset, but this number could be increased by developing chips with a higher density of markers.

  • The heritability of the trait was set to h2 = 0.13, 0.33 or 0.50. Mean heritability for HT in the maritime pine breeding programme is 0.33, but HT ranges between 0.13 and 0.50, depending on the trial considered.

3 Results

3.1 Global and within-family predictive ability for maritime pine data

3.1.1 Global prediction accuracies of GBLUP and ABLUP

Global prediction accuracies were estimated with the CV1 scenario (Fig. 2). Mean prediction accuracy was similar for the two traits at 0.52 (± 0.08) and 0.55 (± 0.07) for HT and DEV, respectively. Mean prediction accuracy was slightly higher for the GBLUP model than for the ABLUP model, at + 0.05 (± 0.11) for HT and + 0.02 (± 0.10) for DEV, but the differences were significant only for HT. For this CV1 scenario, we also varied the percentage of individuals from each family included in the Tset. The proportion included ranged from 20 to 80%, and accuracy increased with this percentage, from 0.45 (± 0.09) to 0.62 (± 0.18), but in a similar manner for ABLUP and GBLUP. We present only the data for a proportion of 40% (the mode) in Fig. 2.

Fig. 2
figure 2

Global prediction accuracies obtained in the scenario CV1 with ABLUP and GBLUP models, for height (HT) and stem deviation to verticality (DEV)

In the CV2 scenario (Fig. 3), adding more individuals from the large families and removing individuals from the other families increased mean accuracy, with a maximum value of 0.68 (± 0.10) obtained for both traits when 10 to 15 individuals from large families were included in the Tset. Global accuracy declined if the number of individuals from large families was increased any further. In other words, accuracy was highest when all families were equally represented in the Tset. In most subscenarios, the differences between the GBLUP and ABLUP models were not significant.

Fig. 3
figure 3

Global prediction accuracies obtained in the different sub-scenarios CV2 with ABLUP and GBLUP models, for height (HT) and stem deviation to verticality (DEV)

Repetitions of CV scenarios lead to changes in Tset composition, and therefore in the relatedness between Tset and Vset, which could affect prediction accuracies (Scutari et al. 2016). However, CV scenario structures lead only to very slight changes in this relatedness, which were greater between CV scenarios than between repetitions of the same CV scenario (9.1.).

3.1.2 Within-family genomic prediction accuracy

The large number of individuals per family in our design made it possible to assess within-family predictive ability. Figure 4 shows the predictive abilities obtained with the CV1 scenario when 40% of the individuals from each family were included in Tset, for each of the two traits (Fig. 4A and B). Within-family predictive ability was therefore estimated for the 60% of individuals per family included in the Vset (a mean of 20 individuals for each of the 9 large families and 10 individuals for the other 30 families). For each family, the variance of prediction accuracies was very high, indicating that the choice of individuals in the Tset (and therefore in the Vset) had a major impact on prediction accuracy. Mean prediction accuracy differed between families (Fig. 4), ranging from − 0.43 (± 0.20) to + 0.45 (± 0.18) for HT and from − 0.35 (± 0.24) to + 0.46 (± 0.22) for DEV. These mean values were centred around 0, with a very low across-family mean (− 0.01 (± 0.33) for HT and + 0.06 (± 0.33) for DEV). Despite the similar distribution of within-family predictive ability for the two traits, the ranking of families differed significantly between HT and DEV (Kendall correlation of + 0.01).

Fig. 4
figure 4

Genome-based within-family predictive ability obtained in the scenario CV1 for each of the 39 full-sib families, for height (HT) and stem deviation to verticality (DEV)

We further investigated within-family predictive ability using the CV2 scenario (Fig. 5). Prediction accuracy was low for both traits, but slightly higher for families well-related to the Tset than for poorly related families (mean + 0.12 (± 0.08) for HT and + 0.16 (± 0.09) for DEV). Adding more individuals to the Tset had no impact, increased or, more surprisingly, decreased within-family predictive ability especially for families poorly related to the Tset as observed for both traits in family 22. For both scenarios (CV1 and CV2), within-family predictive abilities remained close to 0 on average, despite strong variation between families, but could potentially explain the equivalence in terms of global prediction accuracy between the GBLUP and ABLUP models. In this context, simulations can be more informative in investigating the determinants of between- and within-family predictive ability.

Fig. 5
figure 5

Genome-based within-family predictive ability obtained in the sub-scenarios CV2 for each of the 9 large full-sib families, for height (HT) and stem deviation to verticality (DEV)

3.2 Use of simulations to explore prediction accuracy

3.2.1 A relevant simulation model

The comparison of simulated and real data is an interesting preliminary step in assessments of the relevance of the simulation model. For both models, GBLUP and ABLUP, the simulated and real data yielded very similar prediction accuracies (10.1.1.) in terms of the mean (0.54 (± 0.07) and 0.55 (± 0.08) for real and simulated data, respectively, with the GBLUP model, 0.49 (± 0.07) and 0.47 (± 0.07), respectively with the ABLUP model), and the distribution of values (the coefficient of variation for prediction accuracy being 15% and 14% for real and simulated data, respectively, with the GBLUP model, and 15% and 16%, respectively, with the ABLUP model). With the simulated data, prediction accuracy was slightly better for GBLUP than for ABLUP. The agreement between the simulated and real data was very high when accuracy comparisons were made within families (10.1.1.). The high variance of the within-family predictive abilities in each family reported previously based on real data was also observed with the simulated data, as were the large differences between families in terms of mean within-family predictive ability. Within-family predictive ability ranged from − 0.43 (± 0.20) to + 0.45 (± 0.18) with the real data, and from − 0.31 (± 0.27) to + 0.42 (± 0.28) with the simulated data. The ranking of families based on within-family predictive ability differed between the real and simulated data, as the genotypes were simulated independently of the real data.

3.2.2 Determining relevant conditions for GS implementation

Starting from the initial conditions defined by the real data (h2 = 0.13, nSNP = 8235 and nTset \(\epsilon [167:667]\) depending on the CV1 subscenario considered), new scenarios with different conditions were explored through simulations. Figure 6 shows the overall prediction accuracy for different combinations of h2, nTset and nSNP. The objective was not only to study the behaviour of accuracy as a function of the variation of these key parameters, but also to compare it with the accuracy of predictions based solely on pedigree, the reference in many forest tree improvement programmes.

Fig. 6
figure 6

Global prediction accuracies for height (HT) with simulated data for different combinations of heritabilities (h.2), training set sizes (nTset) and marker densities (nSNP)

Marker density did not appear to be of a critical importance for GS accuracy, as curves associated to genomic models (blue, red and green) overlapped in most situations. The greatest benefits of a higher marker density (12 and 24 markers/cM) in the tested scenarios were observed in situations in which a large Tset was combined with high heritability. A similar combination of parameters also resulted in a lower level of variation in accuracy. Similar results were obtained for the highest two densities, suggesting that the response is saturated when there are more than 17,000 markers.

Regardless of heritability, prediction accuracy appeared to be highly dependent on Tset size. GS prediction accuracy increased steadily for the first few increases in Tset size (nTset = 400, 600, 1600) and then tended to stabilize at about nTset = 2600, reaching a plateau with inflexion points between nTset = 1600 and nTset = 2000. The prediction accuracy of ABLUP models followed a similar pattern, but the advantage of GBLUP models over ABLUP models was greatest for larger Tset sizes and higher heritability. For a trait of average heritability (h2 = 0.33), the advantage of GBLUP models in terms of mean prediction accuracy was only + 0.02 (± 0.08) when nTset = 400, whereas it reached + 0.08 (± 0.05) when nTset = 1600 (+ 0.13 (± 0.03) with 2600). The simulation results clearly indicate that the conditions under which the real data were analysed were not optimal for revealing any advantage of the GBLUP model. According to the simulation data, the GBLUP model would be much more advantageous at higher nTset (nTset \(\ge\) 1600) values, for traits of intermediate or higher heritability and with a larger number of markers.

For two of the situations in which GBLUP performed significantly better than ABLUP — with a minimal difference in one case (+ 0.06 (± 0.05) in mean accuracy) and for the maximum observed difference in the other (+ 0.13 (± 0.03)), we present within-family predictive abilities in Fig. 7. As for all the within-family predictive abilities presented above, the variance was high and mean values differed considerably between families. However, in this case, mean prediction ability values were positive for most families, ranging from − 0.03 (± 0.18) to 0.34 (± 0.13) for the first situation (h2 = 0.33, nTset = 1600, nSNP = 17,220) and from 0.07 (± 0.14) to 0.50 (± 0.12) for the second situation (h2 = 0.5, nTset = 2600, nSNP = 35,000). Taking all the families into account, mean within-family predictive ability was 0.18 (± 0.14) and 0.29 (± 0.17), respectively, for the two situations described, indicating a clear advantage of genome-based approach over the reference value of 0 associated with the pedigree-based model. The advantage of the GBLUP model over the ABLUP model in terms of overall prediction accuracy coincides with non-zero within-family predictive ability. This advantage of the GBLUP models was increasingly evident for higher values of within-family predictive ability.

Fig. 7
figure 7

Genome-based within-family predictive ability for each full-sib family for height (HT), with simulated data and for two different combinations of heritability (h.2), training set size (nTset) and marker density (nSNP)

4 Discussion

Genomic selection is particularly interesting for forest trees in order to reduce the length of selection cycles by predicting the breeding values of the next generation (Bartholomé et al. 2016; Thistlethwaite et al. 2019; Simiqueli et al. 2023; Haristoy et al. 2023). However, the maintenance of predictive ability over generations can be hampered by the processes of genetic recombination, selection and drift, which alter the linkage disequilibrium and relatedness between the training population and the population under selection (Grattapaglia 2022). A strategy of “updating” the genomic selection model is recommended by integrating individuals from each generation into the training set (Iwata et al. 2011; Grattapaglia 2022), which implies implementing progeny tests that slow down the selection process. In addition, new crosses are dependent on the time required for trees to reach sexual maturity—between 3 and 10 years depending on the species.

By concentrating its phenotyping effort on a training population of limited size, genomic selection also makes it possible to increase the selection intensity and to integrate new selection criteria. These criteria can be more complex and better adapted to the challenging environmental conditions (Rivers et al. 2023), targeting for example drought tolerance (Bouvet et al. 2020; Papin et al. 2024) or pathogen resistance (Stocks et al. 2019; Beaulieu et al. 2020; Lenz et al. 2020b; Westbrook et al. 2020). Our study therefore focuses on within-generation and within-family prediction that can be particularly valuable in this context.

Before implementing genome-based prediction and selection in maritime pine and forest trees more generally, we need to clearly define the conditions in which the use of this approach has real advantages over conventional methods, particularly in terms of prediction accuracy (Bartholomé et al. 2016; Beaulieu et al., 2014). We evaluated genome-based prediction accuracies by clear comparison with the corresponding pedigree-based prediction accuracies, and we investigated within-family predictive ability. To justify the genotyping of individuals, genome-based models must provide higher prediction accuracies than pedigree-based models. We found that even with a test design that a priori favoured within-family variability, the advantage of GBLUP over ABLUP models was not clearly significant, and that within-family predictive ability was zero, on average, across families. Stochastic simulations mimicking our selection scheme showed that our real case was close to the tipping point for recommended training set size, beyond which genome-based predictions would start to show its full potential relative to conventional pedigree-based predictions.

4.1 Equivalence of ABLUP and GBLUP prediction accuracies with a maritime pine dataset

4.1.1 Baseline for comparison of the ABLUP and GBLUP models

Bartholomé et al. (2016) reported a higher prediction accuracy for ABLUP than for GBLUP models in maritime pine, with similar conclusions drawn in several other studies on forest trees (Zapata-Valenzuela et al. 2012; Thistlethwaite et al. 2017; Zhou et al. 2020). An insufficiently high marker density has frequently been identified as the cause of a lack of advantage of genome-based predictions. Marker density was higher in this study than in previous studies in maritime pine (5.7 SNPs/cM versus 2.4 and 1.7 SNPs/cM in the articles by Bartholomé et al. 2016 and Isik et al. 2016, respectively). Furthermore, the original design of POPR, characterized by a limited number of full-sib families but relatively large numbers of trees per family, was expected to favour the genome-based approach, better capturing Mendelian sampling terms that drive the within-family variation.

However, GBLUP prediction accuracies obtained in this study were moderate (Fig. 2). Prediction accuracies were equivalent to or slightly lower than the accuracy obtained in previous studies on maritime pine (Isik et al. 2016; Bartholomé et al. 2016) which proposed GS scenarios that were quite different compared to our strategies. Above all, GBLUP models had only a slight advantage over ABLUP models in terms of prediction accuracy for some of the cross-validation subscenarios evaluated (Figs. 2 and 3). The reasons for this slight advantage are difficult to identify and could even be partially due to the way in which prediction accuracy is calculated. Prediction accuracy, defined as the ratio of predictive ability to the square root of heritability of the predicted trait, requires the estimation of variance components, which are subject to additional errors (Legarra et al. 2008). As in other studies, GBLUP estimates of heritability were lower than the estimates obtained with the ABLUP model (Resende et al. 2017; El-Dien et al. 2018; Lenz et al. 2020b). This resulted in higher prediction accuracies for the GBLUP model despite a predictive ability similar to that for the ABLUP model.

We also decided to base our comparison solely on the accuracy, without converting it into a response to selection. Pedigree- and genome-based predictions for non-phenotyped individuals are both usable for early selection without phenotype observations and must be compared in terms of accuracy. We believe that the assumption of a shorter selection cycle for genome-based predictions compared to pedigree-based prediction does not hold and that this assumption introduces a bias, increasing the perceived advantage of GS over traditional approaches (Lenz et al. 2017; Ratcliffe et al. 2015; Resende et al. 2012a, 2012b). Genome-based prediction is clearly advantageous when it enables pedigree recovery and correction (Zapata-Valenzuela et al. 2012), but similar results can routinely be obtained with a very small number of carefully chosen markers across the genome (Vidal et al. 2017). For the use of genome-based prediction to be considered truly advantageous, it must, therefore, provide a higher accuracy than pedigree-based prediction associated with a complete corrected pedigree.

4.1.2 Capture of linkage disequilibrium and relatedness in GBLUP models

Theoretically, the genome-wide markers used for genome-based predictions can provide additional information over and above that obtained from a simple pedigree. They can, for example, capture the effects of nearby QTL in linkage disequilibrium (Habier et al. 2007), and provide more accurate information about the relationship between any two given individuals (Nejati-Javaremi et al. 1997; VanRaden 2008). However, these two advantages of marker use did not emerge clearly in this study. In our case, there are two principal explanations for the equivalence between ABLUP and GBLUP (Figs. 2 and 3), as outlined by other authors faced with similar results (Beaulieu et al. 2014). First, the estimation of genomic effects may not be very accurate, potentially due to the Tset being too small to guarantee a good estimate (Habier et al. 2013). For example, with simulation, Hayes et al. (2009b) showed that the advantage of GBLUP over ABLUP models was apparent from 100 individuals per full-sib family. Second, it is possible that genome-wide markers capture only existing genetic relationships (Legarra et al. 2008), being much less effective at capturing actual linkage disequilibrium (LD) (Habier et al. 2007).

A lack of long-range LD has already been reported for conifers (Eckert et al. 2010; Kujala & Savolainen, 2012), including maritime pine (Plomion et al. 2014; Isik et al. 2016), suggesting that a large number of markers may be required for genome-based predictions (Resende et al. 2017; Thistlethwaite et al. 2019). Grattapaglia et al. (2011) performed simulations for forest trees that suggested that a density of about 10 markers/cM was required to increase the prediction accuracy of GBLUP to such an extent that it surpassed ABLUP when Ne = 100. Most conifer genomes have a much smaller genetic map than physical size [1435 cM (Chancerel et al. 2013) and 24 Gb (Chagné et al. 2002), respectively, for maritime pine], suggesting that there is very little recombination in large parts of the genome. Thus, rather than the number of markers, it may be their distribution across the genome that counts. However, it can be difficult to improve the distribution of markers in the absence of a physical map for the species, as in maritime pine. The densities used in our real case would, therefore, be sufficiently high to capture pedigree-like relatedness, but they are probably still too low and the markers are probably not distributed appropriately to capture QTL information across LD. One alternative not considered here would be the use of alternative statistical approaches based on Bayesian variable selection, such as Bayes-B. Such methods have been shown to capture population LD more effectively and to give a greater weight to causal SNPs (Habier et al. 2007; Thistlethwaite et al. 2017), but their benefits are often case- and trait-specific, and may even disappear, particularly if the training population exceeds a certain size (Karaman et al. 2016).

4.1.3 Genome-based within-family predictive ability

Unlike ABLUP model, GBLUP model capture realized relatedness between and within families through the G matrix. Nevertheless, regardless of the scenario tested here, within-family predictive ability for the real data was, on average, zero when all full-sib families were considered (Fig. 4), indicating that underlying genome-wide marker associations in GBLUP (Strandén and Garrick 2009) were not estimated accurately. For both traits, within-family predictive abilities were not significantly different from zero in most families, but there was considerable variation across the families, including some (15%) for which accuracy values were, surprisingly, significantly negative. For some relatively small families, the number of individuals included in the Vset was probably too small for a robust estimation of correlations, resulting in a very large standard deviation when all CV iterations were taken into account. By contrast, within-family predictive ability for large families was calculated with a mean of 20 individuals (in the Vset), resulting in zero or slightly positive values, but with a lower standard deviation. Genome-based prediction accuracy was, therefore, mostly driven by capturing parent average term rather than capturing the Mendelian sampling term of breeding values. This partly explains the observed equivalence with pedigree-based prediction models, as suggested in other studies with similar results (Thistlethwaite et al., 2019).

Some specific scenarios of cross-validation imposing restrictions on relatedness between Vset and Tset, particularly if Vset did not contain relatives of individuals present in Tset, resulted in particularly low prediction accuracies at the global (Fig. 3) and within-family levels (Fig. 5). This suggests that relatedness was the main source of information in both GBLUP and ABLUP, with little or no additional information captured from LD to maintain prediction quality in the absence of relatedness. Globally, the relatedness between families in our population was low, with parents and grandparents at the founding level of the population producing a mean of 1.1 and 2.4 full-sib families, respectively. This structure may pose challenges that prevent GBLUP from outperforming ABLUP. In general, the use of a diversified training population is always desirable, to produce robust predictions and a validation set well related to the training set. The first condition could have been met by ensuring the equal representation of different grandparents and parents. The second condition is less easy to satisfy, as the siblings in Vset probably had few collaterals other than the remaining siblings already present in Tset.

4.2 Use of simulations to identify conditions in which GBLUP has a better prediction accuracy than ABLUP

The stochastic simulation model yielded results similar to those obtained empirically, suggesting that we can have some confidence in the relevance of the results, in terms of both the trends observed and the absolute values obtained. The effect of increasing Tset size and the number of genome-wide markers on GS prediction accuracy was tested under three contrasting heritability scenarios, each with an explicit comparison with the prediction accuracy of ABLUP. These simulations showed that the size of Tset was the most important determinant of prediction accuracy in our study. The inflexion point of the curves occurred somewhere between 1500 and 2000 individuals, consistent with the findings of deterministic approaches in other forest tree contexts (Grattapaglia et al. 2011; Grattapaglia & Resende 2011). From this Tset size upwards, prediction accuracy begins to be significantly higher for GBLUP than for ABLUP. Increasing the number of individuals per family, as in the simulation scenarios in which Tset increased, made it possible to estimate breeding values with greater accuracy (Habier et al. 2013). Thus, the often-highlighted equivalence between GBLUP and ABLUP in terms of prediction accuracy for forest trees, regardless of the species considered, can be explained by the training sets, which rarely exceed 1000 individuals, being too small (Lebedev et al., 2020). By contrast, increasing the number of markers did not increase significantly the genome-based predictive accuracy, at least for traits with moderate heritability, suggesting that our initial marker density would have been sufficient if it had been coupled with a larger training population. It should be noted that prediction accuracy is also impacted by the effective size of the breeding population, although we did not vary this parameter in our study (Beaulieu et al. 2024). Relatedness between Tset and Vset is one of the factors explaining the success of predictions (Grattapaglia and Resende 2011). Thus, our conclusions could be readily extrapolated to other conifers, which have similar genome sizes and effective breeding population sizes, but it would be more difficult to generalize it to deciduous species, which can have very different genome structures and LD profiles.

Complementary analysis of genome-based prediction accuracy based on deterministic approaches (Daetwyler et al. 2008) with empirical parameters for our population of maritime pine (see 11.) were consistent with our stochastic simulation results.

Within-family prediction accuracy is rarely considered in genomic studies but appears to be key for the superiority of genome-based predictions over pedigree-based predictions to be expressed. Simulations showed that more accurate within-family prediction was associated with a greater accuracy advantage of GBLUP over ABLUP. For our study design, the suggested Tset size for efficient within-family prediction corresponds to between 40 and 65 individuals per full-sib family. This is a very important requirement for the implementation of genome-based prediction and selection, as the use of GBLUP models with zero within-family predictive abilities can have several negative consequences.

In the short term, the effectiveness of selection and the response to selection would be reduced if one of the sources of genetic variation in the population, the within-family variation due to Mendelian sampling, was not exploited. While this source of variation can be measured with genome-wide markers at the genotype level, attaching quantitative value to these genotypes requires sufficiently powered training set of phenotyped and genotyped individuals. Ensuring such sufficiently large training set is important to avoid longer-term consequence related to the fact that selection on underpowered genome-based predictions would be only leveraging variation between families. Such an approach would increase the risk of losing diversity due to the elimination of certain lineages and the co-selection of candidates from the same families. This loss of diversity due to a shift in the weighting between within-family and between-family selection would lead to long-term losses of genetic gain (Jannink 2010) and an accumulation of inbreeding.

This tendency can be counteracted by selection methods based on the optimization of genetic contributions (Meuwissen 1997; Woolliams et al. 2015) — so-called “optimal contribution selection” (OCS) — which allows a trade-off between short-term and longer-term gains through the application of constraints to the balance between parental genetic contributions (Gorjanc et al. 2018). Future studies should assess these optimal strategies which would, presumably, work better if genome-based predictions could discriminate between candidates within families more accurately (Hallander & Waldmann, 2009), with sufficiently large families in the training populations, thereby increasing the efficiency of selection and of the constraints imposed by the OCS.

5 Conclusion

Despite the undeniable potential benefits for forest trees, examples in which genome-based approaches have clearly demonstrated superiority over pedigree-based approaches are lacking. Using an ABLUP model with full and corrected pedigree information as a reference, we evaluated the accuracy of GS in a maritime pine trial with the largest number of individuals per full-sib family to date. Prediction accuracy was found to be similar for the pedigree-based and genome-based models, and within-family genomic accuracy for forward predictions was close to zero. By constructing a relevant simulation model, we were able to demonstrate that the number of individuals per family, and thus the overall size of the training set, is a key parameter for accurately estimating marker associations and for detecting a clear advantage of genome-based approach. This conclusion can be extended to many forestry contexts in which the equivalence between ABLUP and GBLUP prediction accuracies can be explained by suboptimal training set sizes and structures. Increases in training-set size may be readily achievable in forestry due to the large numbers of individuals commonly used in breeding programmes and decreasing genotyping costs. Effective within-family prediction, based on well-scaled genome-based approaches, will be key to maintaining diversity in the long term and ensuring genetic gain in the challenging years ahead.

Data availability

The datasets generated during and/or analysed during the current study are available in the DATA INARE repository: https://doiorg.publicaciones.saludcastillayleon.es/10.57745/QYBKDE

Abbreviations

ABLUP:

Pedigree-based best linear unbiased prediction

BLUP:

Best linear unbiased prediction

CV:

Cross-validation

DEV:

Stem deviation to verticality

DNA:

Deoxyribonucleic acid

EBV:

Pedigree-based estimated breeding value

GBLUP:

Genome-based best linear unbiased prediction

GEBV:

Genome-based estimated breeding values

GS:

Genomic selection

HT:

Height

LD:

Linkage disequilibrium

nTset :

Training set size

nSNP:

Number of SNP

OCS:

Optimum contribution selection

POPr :

Trees sampled for this study

POPs :

Simulated version of POPr

QTL:

Quantitative trait loci

SNP:

Single-nucleotide polymorphism

T set :

Training set

V set :

Validation set

References

Download references

Acknowledgements

The authors would like to thank GIS “Groupe Pin Maritime du Futur” and INRAE—UEFP (https://doiorg.publicaciones.saludcastillayleon.es/10.15454/1.5483264699193726E12) for the installation of the studied sites, the management of the sites, the help to collect data (HT and DEV measurements) and biological material (needles). Part of the experiments (DNA extraction, quantification and manipulation) were also performed at the PGTB (https://doiorg.publicaciones.saludcastillayleon.es/10.15454/1.5572396583599417E12), with the help of Mathilde Flores, Christophe Boury and Céline Lalanne. Authors are also thankful to Christophe Plomion for his help during the conceptualization of this study.

Code availability

The custom code and/or software application generated during the current study are available in the Github.com repository, https://github.com/HighlanderLab/vpapin_pine_gs

Funding

This work was supported by the European Union’s Horizon 2020 Research and Innovation Programme Project under grant agreement n°773383 (B4EST). VP was awarded a doctoral fellowship (N°2020-CK-126) from Ecole Nationale Supérieure Des Sciences Agronomiques de Bordeaux-Aquitaine, 1 cours du Général de Gaulle, CS 40201 33175 Gradignan Cedex.

Author information

Authors and Affiliations

Authors

Contributions

Conceptualization: Laurent Bouffier, Leopoldo Sanchez; Methodology: Victor Papin, Gregor Gorjanc, Ivan Pocrnic, Laurent Bouffier, Leopoldo Sanchez; Formal analysis and investigation: Victor Papin; Writing—original draft preparation: Victor Papin; Writing—review and editing: Gregor Gorjanc, Ivan Pocrnic, Laurent Bouffier, Leopoldo Sanchez; Funding acquisition: Laurent Bouffier, Leopoldo Sanchez; Ressources: Laurent Bouffier, Leopoldo Sanchez; Supervision: Laurent Bouffier, Leopoldo Sanchez. The authors read and approved the final manuscript.

Corresponding author

Correspondence to Laurent Bouffier.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

All authors gave their informed consent to this publication and its content.

Competing interests

The authors declare that they have no competing interests.

Additional information

Handling editor: Ricardo Alia.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix 1

1.1 Definition of sampling within the trial for genome-based analysis

The initial maximum capacity for genotyping in this study was 800 individuals. Thus, preliminary simulations were performed to determine a relevant choice of these 800 individuals among the total population in the trial.

1.1.1 Choice of sampling scenario and families

We compared 3 sampling scenarios to select 800 individuals among the 89 families present in the trial: 40 families with 20 individuals each (A), 32 families with 25 individuals each (B) and 20 families with 40 individuals each (C). The best scenario was selected on the basis of the prediction accuracy, following a three-step procedure that was replicated 10 times independently:

  • Genotypes and phenotypes were simulated for the families present in the trial using the MoBPS software (Pook et al. 2020).

  • For each scenario, the subset of 20, 32 or 40 families was selected with the goal to minimize the average genomic relationships among the subset (with genomic relationships calculated using the simulated genotypes) and with a simple trail-and-error optimization process. The aim was to select the subset that maximizes the genetic diversity.

  • Genome-based models were run using the R-package breedR for each scenario using the simulated data and the accuracy was assessed through a cross-validation routine (the 800 individuals were selected as the Tset and the remaining ~ 2,000 individuals from the trial were in the Vset). For each sampling scenario we assessed both global and within-family predictive ability (see figures below).

Finally, the scenario using 40 families with 20 individuals each, was chosen since it achieved significantly higher genome-based prediction accuracy at the global level (Fig. 8), and similar within-family predictive ability compared to the other 2 scenarios (Fig. 9).

1.1.2 Choice of individuals within each family

Within each of the 40 selected families, we applied the Kennard-Stone algorithm (Kennard and Stone 1969) implemented in the R-package prospectr (Stevens and Ramirez-Lopez 2022) to get the subset of 20 individuals that maximize phenotypic diversity (now based on the real phenotypic values) for the three traits of interest: circumference, height, and deviation to verticality at 8 years.

1.1.3 Addition of 200 individuals

Subsequently, additional 200 individuals could be genotyped and added to the study. As the sampling of the 800 individuals was already performed, we chose to add 20 additional individuals in the 10 out of the 40 selected families. These families (in the main manuscript labeled as the “large families”) were finally made up of 40 individuals each.

These 10 families were chosen to provide a contrast in connectivity within the population sampled. They represent the top 5 and bottom 5 families in terms of connectivity, with their average relatedness to the rest of the families sampled calculated as 0.03 and 0.01, respectively, based on pedigree data. Note that this choice of large families was made a posteriori so that the gap in relatedness with the rest of the population between the two groups of large families may be not very pronounced.

Fig. 8
figure 8

Genome-based prediction accuracy at the global level for the different sampling scenarios. A: 40 families × 20 individuals, B: 32 families × 25 individuals, C: 20 families × 40 individuals. “Complete Vset ” indicates that all individuals in the Vset were considered to assess prediction accuracy, while “incomplete Vset ” indicates that only individuals from families not included in the Tset were considered to assess prediction accuracy

Fig. 9
figure 9

Genome-based within-family predictive ability for the different sampling scenarios. A: 40 families × 20 individuals, B: 32 families × 25 individuals, C: 20 families × 40 individuals. On the left-hand side, assessing within-family predictive ability for all the families not included in the Tset, and on the right-hand side only for families with some individuals in the Tset. In the latter modality, prediction accuracy is not available considering sampling scenario C since all the individuals in these families are included in the Tset 

Appendix 2

2.1 Analysis of direct phenotypic values

Direct phenotypic values for HT and DEV are available for all the trees within sites. First, phenotypes were analysed with the following simple model using R package breedR (Muñoz & Sanchez 2020):

$$\begin{array}{c}y=1\mu +Zu+\varepsilon\end{array}$$
(5)

with \({\boldsymbol{y}}\) the vector of phenotypic observations, \(\mu\) the population mean, \({\boldsymbol{u}}\) the vector of regression coefficients for additive genetic random effects and \({\boldsymbol{Z}}\) the corresponding incidence matrix, \({\boldsymbol{\varepsilon}}\) the vector of residuals. We assumed \({\boldsymbol{u}}\sim N(0,{\boldsymbol{A}}{\sigma }_{u}^{2})\), with \({\boldsymbol{A}}\) the additive relationship matrix (calculated from pedigree) and \({\sigma }_{u}^{2}\) the additive genetic variance. This analysis of phenotypes revealed important spatial autocorrelation within the site, as shown for HT (Figs. 10, 11):

Fig. 10
figure 10

Spatial representation of residuals from model 1

Each individual is defined by x and y coordinates within the site. There is clearly a spatial autocorrelation, as the residuals are not randomly distributed.

Fig. 11
figure 11

Isotropic variogram from model 1

The empirical isotropic variogram is built by aggregating all the pairs of points separated by h, no matter the direction (see breedR manual). Residuals from nearby individuals are more similar than residuals from distant individuals, pointing to the existence of spatial autocorrelation.

We compared three individual-tree models with breedR to estimate the spatial effects affecting phenotypes within the site, as proposed by (Cappa et al. 2019). The models are of the general form:

$$\begin{array}{c}y= 1\mu +Bb+Zu+e \end{array}$$
(6)

with alternative formulations of the spatial random effect \({\boldsymbol{B}}{\boldsymbol{b}}\): block design (block), bidimensional separable first-order autoregressive (AR), and bidimensional spline regression (splines). The goodness-of-fit of the models was assessed by comparing Aikaike information criterion (AIC) and the splines approach was found to be the most appropriate.

Model

Model 1

Model 2 block

Model 2 AR

Model 2 splines

AIC

14308

14111

14105

13668

The form of the model chosen therefore corresponds to Eq. 6, with \({\boldsymbol{B}}\) the matrix containing the two-dimensional B-splines basis evaluated in the corresponding row and column for each tree (Fig. 12), and assuming \({\boldsymbol{b}}\sim N(0,{\boldsymbol{U}}{\sigma }_{s}^{2})\) with \({\boldsymbol{U}}\) a fixed spatial structure and \({\sigma }_{s}^{2}\) the spatial variance parameter. The residuals of this model are now more randomly distributed (Fig. 13).

Fig. 12
figure 12

Spatial component estimated with the model 2 splines

Fig. 13
figure 13

Spatial representation of residuals from model 2 splines

Finally, we obtained adjusted phenotypic values \(y^{\prime}\) as: \({y}^{\prime}=y-b\)

These phenotypic values \(y^{\prime}\) were subsequently used in the models of the manuscript.

Appendix 3

3.1 Simulation of the French maritime pine breeding programme

We simulated a diploid maritime pine genome composed of 12 chromosomes each with a physical length of 2.14e + 09 bp (Chagné et al. 2002), a genetic length of 1.2 M (Chancerel et al. 2013) and a total number of segregating sites of 6910 (10 times the average number of SNP per chromosome available in the real dataset plus 50 theoretical QTL). A simple species-specific demography was used to mimic the selection in the natural environment of the breeding programme base population (popG0): from a wild ancestral population in the Landes forest, we simulated one significant bottleneck reducing effective population size from ~ 50,000 (Milesi et al. 2023) to 100 (actual effective size for popG0) and with a mutation rate of 4e-18 per generation (Jaramillo-Correa et al. 2020). We investigated the coherence in LD and allele frequency distributions for the 600 individuals of popG0, by comparing real genotyping data and simulated SNP array data (see Fig. 14). One phenotypic trait was simulated with reference values equal to those of the “height” trait targeted in the breeding programme (phenotypic average = 6.81 m and genetic variance = 0.12m2).

The 833 individuals of POPR considered in this study were simulated after the two breeding cycles, as closely as possible to the real-life conditions (see Fig. 15). Each individual in popG0 took the identity of a real individual in the pedigree, the one with which it shared the same rank in terms of EBV. The EBV were generated with a correlation of 0.97 with true breeding values (BV) since the accuracy of EBV in real programme for these individuals is very high due to progeny testing. Individuals of popG0 were crossed according to the actual crossing plan to generate popP1 FS families of 130 individuals. Individuals for popG1 were selected based on EBV in each family with an intensity of 1.07. The use of this selection intensity, estimated with real data, made it possible to mimic the multi-character aspect of the actual selection carried out, as well as the diversity constraint actually used. Finally, the families of POPS (simulated version of POPR) were obtained by crossing the individuals of popG1 according to the actual pedigree.

Fig. 14
figure 14

Comparison of LD (A) and allele frequency (B) distributions between real and simulated SNP array data

Fig. 15
figure 15

Breeding cycles simulated to mimic the French maritime pine breeding programme

Appendix 4

4.1 Relatedness between training and validation sets in the different cross-validation scenarios

4.1.1 Mean relatedness between training (T set) and validation (V set) sets in the different cross-validation scenarios

 

CV1

CV2

 

Sub-scenario with n individuals from large families included in the Tset

 

n = 0

n = 5

n = 10

n = 15

n = 20

n = 25

Mean relatedness between Tset and Vset

6.2e-04

1.2e-02

1.4e-02

7.4e-03

2.9e-03

9.1e-04

1.2e-03

Standard deviation

2.7e-05

NA

1.0e-04

1.1e-04

5.9E-05

3.7e-05

3.6e-05

Relatedness was calculated for each repetition of the cross-validation scenario as the average of the genomic relationship coefficients in the G matrix between training and validation individuals. Standard deviation is not available for CV2 (n = 0) as this sub-scenario is repeated only once.

Appendix 5

5.1 Verification of the consistency of results obtained with real and simulated data

Fig. 16
figure 16

Global prediction accuracies obtained in the scenario CV1 for height with ABLUP and GBLUP models based on real or simulated data

Fig. 17
figure 17

Genome-based within-family predictive ability obtained in the scenario CV1 for height (HT) for each of the 39 full-sib families with real or simulated data

Appendix 6

6.1 Complementary analysis of genome-based prediction accuracy using deterministic approaches

Mathematical modeling approaches are useful tools to predict the accuracy of genome-based prediction as a function of various population parameters (Daetwyler et al. 2008; Hayes et al. 2009b; Goddard 2009). Deterministic formulas have already been applied to forest trees (Grattapaglia & Resende, 2011) and are here adapted with input parameters from our maritime pine case study, following Gorjanc et al. (2015). The accuracy (square root of reliability) of GEBV was obtained by: \(\begin{array}{c}{r}_{EBV}= \sqrt{\frac{\theta }{1+\theta }b}\end{array}\)

Where \(\theta =n{T}_{set}b{h}^{2}/{M}_{e}\) with nTset the size of the training set and h2 the heritability of the trait. \({M}_{e}\) and \(b\), respectively the effective number of chromosome segments and the proportion of genetic variance captured by markers, being defined by \({M}_{e}=2{N}_{e}LC/\text{log}({N}_{e}L)\) and \(b=nSNP/(nSNP+{M}_{e})\), with \({N}_{e}=100\) the effective size of our breeding population, \(L=1.2\) the average size of maritime pine chromosomes (in Morgans) and \(C=12\) the number of chromosomes. Based on progeny records, the prediction accuracy of non-phenotypes progeny with ABLUP depends solely on the accuracy of EBV in its parents. In case of no selection among parents, we have \({r}_{EB{V}_{pedigree}}^{2}=\frac{1}{4}\left({r}_{EB{V}_{mother}}^{2}+{r}_{EB{V}_{father}}^{2}\right)\) with \({r}_{EB{V}_{mother}}^{2}={r}_{EB{V}_{father}}^{2}=n/(n+\frac{4-{h}^{2}}{{h}^{2}})\), \(n\) being the number of progenies with phenotypic values per parent.

The trends observed (see Fig. 18) are fully consistent with those obtained with stochastic simulations presented in the previous section (Fig. 6). Overall prediction accuracy is mainly impacted by Tset size and to a much lesser extent by trait heritability. The advantage of GBLUP over ABLUP models in terms of forward prediction accuracy only becomes apparent above 2,000 individuals in the Tset. Although deterministic approaches are very interesting for quickly revealing key parameters for genome-based prediction accuracy, they have been called into question and refined several times (Brard and Ricard 2015; Elsen 2016, 2017). Especially when employing genome-based prediction within full-sibling families, as in this study, accurately forecasting the performance of corresponding accuracy for a specific trait within a particular family remains a challenging endeavour (Schopp et al. 2017).

Fig. 18
figure 18

Genome-based accuracy determined with deterministic formulas

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Papin, V., Gorjanc, G., Pocrnic, I. et al. Unlocking genome-based prediction and selection in conifers: the key role of within-family prediction accuracy illustrated in maritime pine (Pinus pinaster Ait.). Annals of Forest Science 81, 52 (2024). https://doiorg.publicaciones.saludcastillayleon.es/10.1186/s13595-024-01269-0

Download citation

  • Received:

  • Accepted:

  • Published:

  • DOI: https://doiorg.publicaciones.saludcastillayleon.es/10.1186/s13595-024-01269-0

Keywords