- Research Paper
- Open access
LiDAR-estimated height in a young Scots pine (Pinus sylvestris L.) genetic trial supports high-accuracy early selection for height
Annals of Forest Science volume 82, Article number: 12 (2025)
Abstract
Key message
Enhancing the efficiency and precision of breeding programs necessitates the implementation of “high-throughput” phenotyping. By employing various sensors for rapid and frequent measurements, we can gather extensive datasets crucial for conventional breeding efforts. This approach not only holds promise for improving forest production but also for evaluating emerging challenges such as fungal infestations and drought damage. Our research demonstrates the efficiency of utilizing height data derived from LiDAR analysis to identify superior genotypes within the Scots pine breeding program, aimed at enhancing volume production.
Context
Cost-effective ‘high-throughput’ phenotyping methods would be highly valuable in both conventional and advanced molecular tree breeding programs. Light Detection and Ranging (LiDAR) systems installed on unmanned aerial vehicles (UAVs, drones) have highly promising potential for such purposes as they enable rapid acquisition of relevant data.
Aims
To assess their current capacity, we have compared heights from conventional and LiDAR-based measurements in a Scots pine clonal/progeny trial (9 years old) in central Sweden. We have also compared effects of using them to obtain relationships between phenotypic and genetic parameters, and for selection.
Methods
The study was done in a Scots pine genetic field trial that included clones and seedlings. Mean values and estimation of genetic parameters for height were compared between datasets obtained by conventional measurements and by analysis of LiDAR objects obtained by a drone. The potential influence of the measurement method on genetic selection was quantified.
Results
The phenotypic correlations between heights obtained with the two methods were very high (≥ 0.9) and so were both the genetic correlations and estimated heritabilities. Selections of the best clones within tested families using the two sets of measurements matched almost perfectly. A wrong clone with a difference in rank of more than one was selected for just one family (of 47). The findings highlight the great potential of the approach for use in breeding practices, as it will allow the collection of vast amounts of accurate data much cheaper than conventional measurements.
1 Introduction
Phenotyping of genetic field trials is an essential, but costly, task in tree-breeding programs all over the world. Genetic trials are often large, including several thousands of trees that must be measured or assessed (phenotyped) to obtain data for genetic evaluations and selection of superior genotypes, for use in further efforts to improve stock and/or deployment in operational forestry. The most commonly used traits for selection are height and diameter, with consideration of external quality traits, e.g., straightness and branch diameter (Rosvall 2011, Rosvall and Mullin 2013). These traits are well correlated with the most common objective traits, i.e., the increase in volume or biomass production per unit area (Skovsgaard and Vanclay, 2008).
In genetic trials in the northern hemisphere, height is often measured until trees reach heights of 5–6 m, generally 6–10 years after planting. After passing this threshold, height measurements are too time consuming and therefore expensive, although height has proven efficacy for assessing the genetic rankings of older trees (du Toit et al., 2023). Thus, diameter is usually measured instead of height, 12–20 years after planting, depending on stand development. After this, superior genotypes are selected using both height and diameter measurements in efforts to meet breeding objectives which usually include maximization of wood volume production per hectare, or its optimization with respect to quality traits. After final measurement and selection, the material at most sites is not monitored any more as costs of data acquisition are too high and resources are invested elsewhere in the breeding program. However, careful follow-ups and collection of measurements of material in older experiments would be beneficial for breeding programs to calculate genetic correlations between measurements of heights in early stages and later ages. This is because rotation ages are usually much higher than the selection ages. For example, in Sweden they are 60–100 years for Scots pine (Pinus sylvestris L.) and 45–100 years for Norway spruce (Picea abies (L.) Karst.).
Genetic field trials are often designed as so-called fully randomized single tree experiments, in which each plant with its own genetic identity is randomly assigned to a planting spot in the field. It is subsequently essential to carefully monitor each individual plant during measurements. The individual identities are used to create pedigrees of all planted plants, which are core information in genetic evaluations. To facilitate measurements and finding the right tree in trials, the sites are often divided into smaller plots, each hosting 50 to 150 plants with local x/y coordinates indicating rows and planting positions, e.g., 10 rows with 12 planting positions, giving 120 planted plants per plot. The distances between plants depend on local conditions, but ideally the spacing should be regular and consistent in the whole experiment, e.g., 2.0 × 2.0 m.
In the last decade, there have been rapid advances in the development of various types of measuring sensors and unmanned aircraft (drones), with diverse new applications. Combinations of these technologies have started to be commonly used in agricultural and forestry sectors, particularly Airborne Laser Scanning (ALS) with a Light Detection and Ranging (LiDAR) sensor for measurement. This involves determining the distance between a sensor and an object from the time required for a laser pulse to be sent from the sensor, reflected from the object, and returned to the sensor (Debnath, et al. 2023). The most common uses of drones have been for monitoring land use and forest areas (Almeida, et al. 2019, Joshi, et al. 2016, Szostak 2020), detection of species, diseases and damage, e.g., by bark beetle (Näsi, et al. 2018), and obtaining overall stand inventory data, e.g., average height, basal area, and volume data (Hyyppä, et al. 2017, Mora, et al. 2013). The most recent practical applications are in forest management to guide thinnings (Malmberg 2023), assist in regeneration (Fargione, et al. 2021, Mohan, et al. 2021), fire detection (Kinaneva, et al. 2019, Momeni, et al. 2022), and support management decisions (Buchelt, et al. 2024, Gallardo-Salazar, et al. 2020).
However, application of drones in breeding requires much higher resolution than in most current forestry applications. The analysis of material obtained using drones must be scaled down to single trees, which must be recognized and matched with their positions in the trial and genotype information. In recent decades there have been several attempts to use drones for phenotyping material in genetic field trials of Norway spruce (Liziniewicz, et al. 2020, Solvin, et al. 2020), Douglas fir (du Toit, et al. 2022), and eucalyptus (Liao, et al. 2022). It has generally been found that the method has good potential utility in improvement programs, but challenges must be addressed before full-scale implementation (Bian, et al. 2022). Exact matching between local coordinates (x/y) and GPS coordinates of material delivered by sensors is essential for appropriate estimation of breeding values and selection of superior genotypes (du Toit, et al. 2022, Reynolds, et al. 2019). However, this is not straightforward because trees in genetic trials are rarely planted with consistently even spacing, due to variations in local factors such as soil preparation, and presence of stumps, rocks, or other impediments (Liziniewicz et al., 2020). Matching currently requires manual annotation, with analysts merging aerial observations with data compiled in on-ground inventories of field trials.
Accurate GPS-positioning of all planted seedlings directly after planting with assignment of correct genotypes to the planting points is a possible solution that could facilitate measurements in genetic field trials using advanced sensors (Krause, et al. 2019, Liziniewicz, et al. 2018). The aim of this study was to assess the potential of this approach by comparing results obtained from conventional measurements (8 years after planting) and LiDAR-based measurements (9 years after planting). The comparisons covered the relationship between the two variables and the effects of using them to obtain relationships between phenotypic and genetic parameters, including estimated breeding values and narrow-sense heritabilities (h2), and for selection. The study was done in a Scots pine mixed clonal and progeny trial in central Sweden.
2 Materials and methods
2.1 Experiment
The data for the study were collected from a Scots pine genetic field trial named S22S1410349 (S349), containing both cloned plants and seedlings and covering 1.2 ha, established in the spring of 2014 close to the town of Indal in central Sweden (62° 61′ 04″ N, 17° 11′ 24″ E, Fig. 1).
Most of the planted material consisted of clonal seedlings (K) propagated as rooted cuttings from 1417 clones belonging to 42 full-sib Scots pine families, i.e., with known mother and father trees. It also included plants from another four families from seed-orchards with a known mother but unknown father. There were between six and 48 clones per family, with between one and six ramets per clone and 1.8 ramets per clone on average. In total, 2576 clonal plants were planted.
Seedlings that had germinated from seeds of the 43 full-sib families (H) used for clonal plant production were planted (6–24 seedlings per family, 10 on average, and 432 seedlings in total). Finally, 33 seedlings from four seed orchards representative of the area were planted to connect the experiment with other experiments. All seedlings were two years old when planted, containerized, and protected against pine weevil before planting. They were planted at 2.0 × 1.8 m spacing. In total, 3045 plants were planted in the trial.
The trial site was divided into 28 plots and all plants were randomly distributed over the area. Fully stocked plots hosted 9 × 15 plants, but due to constraints of the available area the last edge rows were usually shorter and included fewer plants than the full rows (e.g., six plants). All trees were locally positioned by plot, row, and relative plant position, and georeferenced using a dual GPS/GLONASS RTK TOPCON receiver 3 months after planting. The GNSS positioning accuracy varied from 0.008 to 0.027 m, with a median of 0.011 m.
2.2 Conventional measurements
Height measurements were acquired in the fall of 2014 after 1 year in the field, the fall of 2018 after 5 years, and the fall of 2021 when the stand was 8 years old (H1, H5, and H8, respectively). All trees were measured with a measuring pole with centimeter accuracy. The trees’ vitality, and both crown and stem damage were also registered.
2.3 UAV data acquisition
LiDAR data were acquired during a flight in spring 2023, one growing season after the H8 height measurements. The flights were done with an octocopter drone equipped with a Velodyne VLP-16 sensor, flying at 70 m altitude, with a 5 m/s flight speed, x m strip width, three returns per pulse, and flight detection overlap exceeding 90%. The average point density was 2664 points/m2 when considering all three returns (Fig. 2), and about 2385, 286, and 6 points/m2 for the 1st, 2nd, and 3rd returns, respectively.
2.4 LiDAR data processing and tree height extraction
The LiDAR point cloud was processed using the “lidR” package (Roussel, et al. 2020, Roussel, et al. 2018) of R statistical software (R Core Team 2013). The processing pipeline included the following standard operations: normalization of the elevation data from the point cloud to remove the effect of terrain, then extraction of the normalized LiDAR measurements within a 0.3 m buffer placed around each estimated tree location provided by the GNSS positioning. The LiDAR point with the maximum height within each resulting zone was then assigned as the treetop (x, y, z) position.
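The treetop assignment described above can be sketched as follows. This is a minimal NumPy illustration, not the lidR workflow used in the study; the function name `treetop_per_tree`, the array layout, and the toy coordinates are assumptions made for the example.

```python
import numpy as np

def treetop_per_tree(points, tree_xy, buffer_m=0.3):
    """Return the highest normalized LiDAR point within buffer_m of each tree.

    points  : (n, 3) array of x, y and normalized height z (assumed layout)
    tree_xy : (m, 2) array of GNSS planting positions
    Rows of the result are NaN where no LiDAR point falls inside the buffer.
    """
    points = np.asarray(points, dtype=float)
    tree_xy = np.asarray(tree_xy, dtype=float)
    tops = np.full((len(tree_xy), 3), np.nan)
    for i, (tx, ty) in enumerate(tree_xy):
        # Horizontal distance from the planting position to every LiDAR point
        dist = np.hypot(points[:, 0] - tx, points[:, 1] - ty)
        inside = points[dist <= buffer_m]
        if len(inside):
            # The point with the maximum height becomes the treetop (x, y, z)
            tops[i] = inside[np.argmax(inside[:, 2])]
    return tops

# Toy data (hypothetical): three planting positions, five LiDAR points
pts = np.array([[0.0, 0.0, 2.1], [0.1, 0.1, 3.4], [0.2, -0.1, 1.0],
                [2.0, 1.8, 2.9], [5.0, 5.0, 9.0]])
trees = np.array([[0.0, 0.0], [2.0, 1.8], [10.0, 10.0]])
tops = treetop_per_tree(pts, trees)
```

A position with no returns inside the buffer is left as NaN rather than assigned a height, which keeps missing trees distinguishable from very low estimates.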
For height normalization, the terrain elevation surface was created as a raster with 1 × 1 m spatial resolution containing the minimum LiDAR heights and then resampled to 0.1 m spatial resolution by cubic interpolation. The terrain elevation was then subtracted from the LiDAR heights to obtain the heights of vegetation above terrain. The general vector and raster datasets were read and processed with the “sf” (Pebesma and Bivand, 2023; Pebesma, 2018) and “terra” (Hijmans, 2023) packages of R statistical software.
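A simplified sketch of this normalization step is given below. It assumes a plain NumPy grid and, unlike the procedure described above, omits the cubic resampling to 0.1 m, so each point is normalized against the minimum elevation of its own 1 × 1 m cell. The function name and toy values are hypothetical.

```python
import numpy as np

def normalize_heights(points, cell=1.0):
    """Subtract a minimum-elevation terrain grid from the LiDAR elevations.

    points : (n, 3) array of x, y and absolute elevation z (assumed layout)
    Returns the height of each point above the lowest return in its grid cell.
    """
    pts = np.asarray(points, dtype=float)
    # Index of the grid cell containing each point
    ix = np.floor((pts[:, 0] - pts[:, 0].min()) / cell).astype(int)
    iy = np.floor((pts[:, 1] - pts[:, 1].min()) / cell).astype(int)
    # Terrain surface: minimum elevation observed in each cell
    ground = np.full((ix.max() + 1, iy.max() + 1), np.inf)
    np.minimum.at(ground, (ix, iy), pts[:, 2])
    return pts[:, 2] - ground[ix, iy]

# Toy data (hypothetical): two cells, each with a ground and a canopy return
cloud = np.array([[0.2, 0.2, 100.0], [0.5, 0.5, 103.0],
                  [1.5, 0.2, 101.0], [1.6, 0.3, 104.5]])
heights = normalize_heights(cloud)
```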
During processing, the LiDAR returns from the crowns of large trees not included in the genetic experiment had to be removed. Such measurements occur due to large trees surrounding the experiment site and naturally regenerated trees located within the plots (Fig. 3). They were removed using a heuristic that employed k-means clustering for all trees with a total height (estimated using LiDAR) exceeding 6.5 m. The algorithm assessed the clustering quality, considering at most four clusters, using the gap statistic (Tibshirani, et al. 2001) estimated using 50 bootstrap samples. The gap statistic evaluates the clustering solution against a uniform distribution assuming no clustering, and the solution producing a gap statistic that departs furthest from the reference is considered to have the optimal number of clusters and meaningful partitioning of the data. For the optimal solution, the cluster with the minimum height for the center point was considered to contain the LiDAR returns from the suppressed trees, corresponding in our case to the trees in the genetic experiment. The gap statistic calculations for k-means clustering were performed using the “cluster” package (Maechler, et al. 2023) of the R statistical software.
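The heuristic can be illustrated with the sketch below, written in plain NumPy rather than with the R package used in the study. Several details are assumptions made for the example: clustering is applied to the return heights of a single tree zone, the reference distribution is uniform over the observed height range, and the number of clusters is taken as the one with the largest gap value (Tibshirani et al. also describe a one-standard-error rule, which is not applied here). All function names are hypothetical.

```python
import numpy as np

def kmeans_1d(z, k, n_iter=100):
    """Lloyd's k-means on 1-D heights, started from evenly spaced quantiles."""
    centers = np.quantile(z, (np.arange(k) + 0.5) / k)
    for _ in range(n_iter):
        labels = np.argmin(np.abs(z[:, None] - centers[None, :]), axis=1)
        # An empty cluster keeps its previous center
        updated = np.array([z[labels == j].mean() if np.any(labels == j)
                            else centers[j] for j in range(k)])
        if np.allclose(updated, centers):
            break
        centers = updated
    labels = np.argmin(np.abs(z[:, None] - centers[None, :]), axis=1)
    return labels, centers

def log_within_ss(z, k):
    """Log of the pooled within-cluster sum of squares for k clusters."""
    labels, centers = kmeans_1d(z, k)
    return np.log(max(np.sum((z - centers[labels]) ** 2), 1e-12))

def gap_values(z, k_max=4, n_boot=50, seed=0):
    """Gap statistic for k = 1..k_max against uniform reference samples."""
    rng = np.random.default_rng(seed)
    gaps = []
    for k in range(1, k_max + 1):
        ref = [log_within_ss(rng.uniform(z.min(), z.max(), z.size), k)
               for _ in range(n_boot)]
        gaps.append(np.mean(ref) - log_within_ss(z, k))
    return np.array(gaps)

def lowest_cluster(z, k_max=4, n_boot=50, seed=0):
    """Return the chosen k and the returns in the cluster with the lowest center."""
    z = np.asarray(z, dtype=float)
    best_k = int(np.argmax(gap_values(z, k_max, n_boot, seed))) + 1
    labels, centers = kmeans_1d(z, best_k)
    return best_k, z[labels == np.argmin(centers)]

# Toy zone (hypothetical): returns from a ~3 m trial tree under a ~9 m crown
rng = np.random.default_rng(1)
returns = np.concatenate([rng.normal(3.0, 0.3, 40), rng.normal(9.0, 0.3, 40)])
best_k, kept = lowest_cluster(returns)
```

Under these assumptions, the height of the suppressed trial tree could then be taken as the maximum of `kept`.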
Overview of the study site. The normalized vegetation heights from LiDAR data (buffered 5 m around the experimental plots) range between 0.25 and 11 m. The blue dots represent treetop locations, and red spots correspond to the crowns of large trees which were further processed to retrieve heights of the suppressed trees (those included in the genetic experiment)
2.5 Genetic analysis
Genetic parameters of the experimental material as a whole and sets of the trees were calculated with single-tree and single-trait mixed models. An extended spatial model was used to account for global trends and “extraneous” variation aligned with rows and columns in the experiment (Costa e Silva et al., 2001). The following model in a matrix form was used for the analysis:
\[ y = X\beta + Zu + \varepsilon \]
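Here, y is a vector of the measured trait, \(\beta\) is a vector of fixed effects with its design matrix X, \(u\) is a vector of random effects with its design matrix Z, and \(\varepsilon\) is a vector of residuals. Fixed and random effects solutions are obtained by solving the mixed model equations:
\[ \begin{bmatrix} X'R^{-1}X & X'R^{-1}Z \\ Z'R^{-1}X & Z'R^{-1}Z + G^{-1} \end{bmatrix} \begin{bmatrix} \widehat{\beta} \\ \widehat{u} \end{bmatrix} = \begin{bmatrix} X'R^{-1}y \\ Z'R^{-1}y \end{bmatrix} \]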
Here, R is a variance–covariance matrix of residuals and G is the direct sum of variance–covariance matrices of each of the random effects. Residuals are assumed to be independent.
Including spatial autocorrelation in the mixed model allows the variance–covariance matrix of residuals (R) to decompose the error (\(\varepsilon\)) structure into spatially dependent (ξ) and independent (η) residuals (the nugget effect). The spatially dependent (ξ) residuals are modelled using a covariance structure that assumes separable first-order autoregressive correlation in rows and columns:
\[ R = {\sigma }_{\upxi }^{2}\left[AR1\left({p}_{col}\right)\otimes AR1\left({p}_{row}\right)\right] + {\sigma }_{\eta }^{2}I \]
Here, \({\sigma }_{\upxi }^{2}\) is the spatially dependent residual variance, \({\sigma }_{\eta }^{2}\) is an independent residual variance, I is an identity matrix, \(\otimes\) is a direct product (the Kronecker product) for two matrices \(AR1\left({p}_{col}\right)\), and \(AR1({p}_{row})\), indicating a first-order autoregressive correlation matrix in columns and rows, respectively.
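The separable structure can be illustrated numerically. The sketch below assumes that R is the sum of a spatially dependent part, the variance times the Kronecker product of the column and row AR1 correlation matrices, and an independent part on the diagonal. The grid size, correlation parameters, and variances are arbitrary illustrative values, not estimates from the trial.

```python
import numpy as np

def ar1(n, rho):
    """First-order autoregressive correlation matrix: rho ** |i - j|."""
    idx = np.arange(n)
    return rho ** np.abs(idx[:, None] - idx[None, :])

def spatial_residual_cov(n_col, n_row, rho_col, rho_row, var_xi, var_eta):
    """Residual covariance: var_xi * (AR1(col) kron AR1(row)) + var_eta * I."""
    corr = np.kron(ar1(n_col, rho_col), ar1(n_row, rho_row))
    return var_xi * corr + var_eta * np.eye(n_col * n_row)

# Illustrative 3-column by 2-row grid; positions are ordered row within column
R = spatial_residual_cov(n_col=3, n_row=2, rho_col=0.5, rho_row=0.2,
                         var_xi=2.0, var_eta=1.0)
```

In this toy matrix, neighbours in the same column covary by 2.0 × 0.2, neighbours in the same row by 2.0 × 0.5, and the independent variance contributes only to the diagonal.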
2.6 Variance estimations
The variance parameters were estimated by the Residual Maximum Likelihood (REML) method using ASReml 4.2 (Gilmour, et al. 2015).
The individual-tree narrow-sense heritability was calculated for each of the analyzed traits using a formula proposed by Isik et al. (2017):
\[ {h}_{i}^{2} = \frac{{\widehat{\sigma }}_{A}^{2}}{{\widehat{\sigma }}_{P}^{2}} \]
where \({h}_{i}^{2}\) is narrow-sense heritability, \({\widehat{\sigma }}_{A}^{2}\) is additive genetic variance, \({\widehat{\sigma }}_{P}^{2}\) is total phenotypic variance, \({\sigma }_{\upxi }^{2}\) is the spatially dependent residual variance, and \({\sigma }_{\eta }^{2}\) is an independent residual variance.
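After running the model, traits of interest for each tree were adjusted for spatial correlation. The adjusted values were subsequently used to calculate genetic correlations between traits, according to the following formula:
\[ {r}_{(x,y)} = \frac{{\widehat{cov}}_{(x,y)}}{\sqrt{{\widehat{\sigma }}_{(x)}^{2}\times {\widehat{\sigma }}_{(y)}^{2}}} \]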
Here, \({\widehat{\sigma }}_{(x)}^{2}\) and \({\widehat{\sigma }}_{(y)}^{2}\) are the estimated genetic variances for traits x and y, respectively, or the same trait variances at two ages, and \({\widehat{cov}}_{(x,y)}\) is the estimated phenotypic or genetic covariance between traits x and y or between the same trait at two ages.
2.7 Used data and errors
The genetic parameters were calculated for conventionally measured traits (CONV_H) at ages 1, 5, and 8 years (e.g., CONV_H_1) and for height estimates obtained from the LiDAR point cloud (LID). A first LID dataset (designated LID_H_RAW) was extracted from the LiDAR layer by excluding data pertaining to trees (23 in total) assigned heights exceeding 10 m. Second, third, and fourth datasets were created by excluding data from the first dataset pertaining to trees lower than 1 m (LID_H_1m), 1.5 m (LID_H_1.5m), and 2 m (LID_H_2m), respectively. The data in each of these four datasets were visualized and used for both genetic analysis and genetic selection to compare the conventional measurements and LiDAR data and assess the latter’s utility. Another dataset (designated LID_PRU) was created by removing LiDAR-estimated height data for trees that lacked conventional measurements, i.e., trees that were registered as missing or dead at the time of conventional measurements, and then subjected to the same analyses.
The data were trimmed in these ways to compare the accuracy and value of the LiDAR-based height estimates after stripping out heights assigned to points where trees had died after planting or no trees were planted (where there is usually other vegetation, e.g., shrubs, naturally regenerated trees, or branches of adjacent trees). The rationale for including the LID_PRU dataset was to investigate the effect of assigning heights to missing trees on the selection of superior genotypes and estimation of genetic parameters, i.e., heritability.
Three measures of error were estimated: systematic mean error (ME or bias), total error (RMSE), and random error (SDE).
where \(\hat{{y}_{i}}\) is a predicted value of individual obtained from LiDAR data, \({y}_{i}\) is a measured value, and N is a number of observations.
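As an illustration, the three measures can be computed as below. The definitions are the usual ones and are an assumption here, in particular for SDE, which is taken as the standard deviation of the differences around ME with N in the denominator. With these definitions the squared RMSE equals the sum of the squared ME and the squared SDE. The heights are toy values, not data from the trial.

```python
import numpy as np

def error_measures(predicted, measured):
    """Return (ME, RMSE, SDE) for LiDAR-predicted versus measured heights."""
    diff = np.asarray(predicted, dtype=float) - np.asarray(measured, dtype=float)
    me = diff.mean()                           # systematic error (bias)
    rmse = np.sqrt(np.mean(diff ** 2))         # total error
    sde = np.sqrt(np.mean((diff - me) ** 2))   # random error around the bias
    return me, rmse, sde

# Toy heights in metres (hypothetical)
lidar_h = [3.0, 3.5, 4.1]
pole_h = [2.5, 3.1, 3.4]
me, rmse, sde = error_measures(lidar_h, pole_h)
```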
2.8 Clone selection
Arithmetic mean values of the height variables for each clone in the K group were calculated, and the one with the highest average height value obtained from conventional measurements (CONV) or LiDAR-based measurements was “selected” as the best performing clone. For comparisons of these datasets, it was assumed that rankings (in descending order) based on conventional height estimates were true and used for comparison of the LiDAR-based rankings. Rankings of material in the H group were based on absolute values of trees within families (and the highest tree within family was “selected”). Spearman correlation coefficients were calculated for rankings of the compared heights.
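The selection rule can be sketched as follows, using a toy family in which one clone has a dead ramet. The record layout, field names, and values are hypothetical, and the rank correlation helper assumes there are no ties.

```python
from collections import defaultdict
import numpy as np

def clone_means(records, key):
    """Arithmetic mean height per (family, clone); missing (None) values are skipped."""
    heights = defaultdict(list)
    for r in records:
        if r[key] is not None:
            heights[(r["family"], r["clone"])].append(r[key])
    return {fc: float(np.mean(v)) for fc, v in heights.items()}

def best_clone_per_family(means):
    """Select the clone with the highest mean height in each family."""
    best = {}
    for (family, clone), h in means.items():
        if family not in best or h > best[family][1]:
            best[family] = (clone, h)
    return {family: clone for family, (clone, _) in best.items()}

def spearman(a, b):
    """Spearman rank correlation, assuming no tied values."""
    rank_a = np.argsort(np.argsort(a))
    rank_b = np.argsort(np.argsort(b))
    return float(np.corrcoef(rank_a, rank_b)[0, 1])

# Toy family F1: clone A has one living and one dead ramet, clone B one ramet
records = [
    {"family": "F1", "clone": "A", "conv": 3.0, "lidar": 3.6},
    {"family": "F1", "clone": "A", "conv": None, "lidar": 0.4},  # dead ramet
    {"family": "F1", "clone": "B", "conv": 2.8, "lidar": 3.3},
]
pruned = [r for r in records if r["conv"] is not None]

sel_conv = best_clone_per_family(clone_means(records, "conv"))
sel_raw = best_clone_per_family(clone_means(records, "lidar"))
sel_pruned = best_clone_per_family(clone_means(pruned, "lidar"))
```

In this toy case the raw LiDAR data select clone B because the low height assigned to the dead ramet pulls down the mean of clone A, whereas the conventional and pruned datasets both select clone A.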
The LID_H_RAW and LID_PRU datasets were tested as datasets of such types are most likely to be used operationally if LiDAR methodology is applied in breeding programs. If raw LiDAR data are used, like the LID_H_RAW dataset, clone or seedling heights will be estimated with no correction for wrong assignments of heights, as LiDAR-based heights will be assigned to missing trees. If “pruned” datasets are used, like the LID_PRU dataset, the LiDAR data will be corrected against ground measurements of diameter and missing trees will be removed from the datasets.
3 Results
3.1 Phenotypic correlations
The overall survival rate of planted material in the trial was 90%, and there were 2749 living trees at the time of conventional measurement, 8 years after planting. Of the initially planted trees, 296 had died and were not measured. The average height of the living trees was 2.7 m. Heights were estimated by LiDAR 9 years after planting and averaged 3.4 m, excluding missing trees (Table 1). Across the datasets, LiDAR systematically overestimated height by 14.5% relative to the conventional measurements (Table 1). The RMSE was lowest for the H2.0_m dataset, indicating the highest accuracy. The variation of RMSE and SDE between datasets was small (Table 1).
There was a strong correlation between the conventional and LiDAR-based height measurements. The coefficient of correlation between the raw dataset (LID_H_RAW) and conventional measurements was 0.9. The correlations for the clones and seedlings were also high and did not decrease with trimming of the heights in the LiDAR-based dataset (Table 2). The correlations for clones and seedlings were similar (Table 2). The correlation for the clonal material increased slightly with increasing height (Fig. 4).
Correlations between conventionally measured heights 8 years after planting (x-axes) and heights derived from the LiDAR point cloud (y-axes), obtained 9 years after planting, for (a) individual clonal plants (K, top left panel) and individual seedlings (H, top right panel), and (b) means for the clones (bottom panel). Solid lines are regression lines, and dashed lines are 1:1 lines
Genetic narrow sense heritability was highest (0.27) at the age of 1 year (after planting). At this time the genetic coefficient of variation was also the highest (11.8%). In the following measurements, both heritability and the coefficient of variation decreased almost two-fold (Table 3). At the age of 8 years, the heritability based on conventionally measured heights (H8) was about 0.15 and very similar to the value obtained using the LID_H_PRUN LiDAR-based dataset. The coefficient of variation obtained using the LID_H_PRUN dataset was 1.3% lower than the value obtained using conventional measurements. There was a slight difference between heritability values obtained using the conventional height measurements and untrimmed LiDAR-based dataset (LID_H_RAW, Table 2). Generally, trimming certain heights slightly increased heritability estimates and slightly decreased genetic coefficients of variation, but they did not decrease with increases in the height of pruned trees (Table 3).
The multivariate model including spatially adjusted values of variables with heterogeneous correlation between traits, e.g., H8 and RAW_9, did not converge due to the high phenotypic correlation between conventional height measurements and LiDAR-based estimates. The bivariate analysis of traits showed very low genetic correlation between heights measured 1 year after planting and after 8 or 9 years in the field. The genetic correlations between heights conventionally measured at the ages of 5 and 8 years, and the “raw” LiDAR estimates (LID_H_RAW) were 0.99.
3.2 Selection
Clones selected as the best in their families using conventional measurements and the “raw” LiDAR data almost exactly matched (Fig. 5a). Analyses of both datasets resulted in the same clone being most highly ranked for 37 of 46 clonal families (ca. 80%), and there was only a substantial deviation in rankings for one clone (designated S22K1314174, Fig. 5a). There were two planted ramets of this clone, and one died. The dead individual was assigned no value in the conventional measurement dataset, but a low LiDAR estimate decreased the mean value of the clone. Use of the pruned dataset eliminated this incorrect selection (Fig. 5b). Apart from this mentioned clone, the differences in rankings were at most one rank position and the maximum absolute difference in height for a group selected with conventionally measured and LiDAR estimated heights was 3 cm.
Correlations between the maximum rankings of best clones in the 42 full-sib Scots pine families and four other families obtained from conventional measurements (x-axes) and (a) all LiDAR estimates and (b) LiDAR estimates with missing trees pruned from the dataset. The correlation between the maximum rankings of the best seedlings in each of 43 full-sib Scots pine families obtained from conventional measurements (x-axis) and from (c) all LiDAR estimates and (d) LiDAR estimates with missing trees pruned from the dataset. The numbers in the panels are Pearson correlation coefficients between obtained rankings and the lines are regression lines between ranks. Points have been jittered to avoid overlapping in (a) and (b). The solid lines are 1:1 lines
Selection of the best progenies, i.e., the tallest tree in each family, resulted in perfect matches for 27 of 43 families (63%). The difference in conventionally measured height between the groups of genotypes selected using conventionally measured heights and LiDAR-based height estimates was 10 cm. When the number of genotypes selected per family was increased to two, 69 of the 86 selected using the two datasets were the same and the absolute difference in height between groups was 3 cm. The absolute difference remained the same when the number of selected progenies increased to three per family.
4 Discussion
4.1 Heights from measurements and estimates
The analysis of LiDAR point clouds provided very accurate data, yielding a phenotypic correlation with conventionally measured heights of 0.9, and a slightly higher correlation at the clonal level. Similar phenotypic correlations have been found for material in Norway spruce genetic trials (Liziniewicz et al., 2020; Solvin et al., 2020), mangrove trees (Yin and Wang 2019), and the method in general (Jaakkola, et al. 2017). High phenotypic correlations between measurements and estimates from image point clouds have also been obtained for seedlings (Castilla, et al. 2020) and material in entire plots in realized gain trials of Douglas fir (Grubinger, et al. 2020). The stand in the analyzed trial was nine years old and the average height in the preceding year was 2.7 m, indicating that LiDAR analysis of young stands can provide valuable data for breeding and selection. Thus, LiDAR-based estimates could potentially replace the first height measurements in Swedish breeding programs, which are often acquired at the age of 6–8 years when trees are 2–5 m tall.
In this study, the conventionally measured heights and those obtained from LiDAR analysis were not collected at the same stand age; there was a 1-year difference in their collection times, which is reflected in measurement errors that are slightly greater than in other studies. Visual analysis of both datasets revealed a systematic difference between them, which can be interpreted as an indication of the height growth of individual trees. Measurements of annual height growth provide essential data for elucidating trees’ responses to drought, which is currently one of the biggest challenges in European forestry. The observed inter-annual differences seem to represent the annual height growth of Scots pine at the site of the focal trial well, but a specific analysis of the accuracy of growth increments deduced from conventional and LiDAR-based measurements is needed to assess the latter’s potential utility. It would be relatively easy to acquire the data required for such an analysis from the field trial addressed here because once GPS positions of the trees have been established, new point cloud data can be analyzed relatively quickly. In addition, annual height growth estimates could be validated with an appropriate sample of trees. This methodology could provide a more effective approach for quantifying inter-annual growth than use of a single flight and detection of growth whorls from photogrammetric data, which is not appropriate for Norway spruce, according to Solvin et al. (2020).
Collection of GPS coordinates of planting spots and plot corners simplified the estimation of heights from the LiDAR point cloud. The corners’ coordinates allowed more effective checks of the planting positions and prevented propagation of potential errors across the whole experiment. In the analysis of this trial, the manual annotations of genotypes were limited to checking the planting spots in relation to borders of the plots and correction of some points’ positions. The GPS points applied in this study were collected a long time ago as part of an investigation of the evenness of spacing in genetic field trials. In new trials, if the goal of GPS collection is clearer, the acquisition of GPS planting points could also be improved.
4.2 Estimation of genetic parameters
Acquired height data can be used for estimation of genetic parameters and forward selection of superior Scots pine clones in genetic field trials. The genetic parameters estimated using all the datasets presented here were quite consistent. The trimmed LiDAR-based datasets yielded higher heritability values than the “raw” LiDAR data, with all derived heights. This is consistent with expectations as the “raw” dataset includes small heights at positions of missing trees due to the detection of branches of surrounding trees that had extended into the growing space of missing trees or some other kind of vegetation. A similar (2%) underestimate of heritability for height derived using a drone was found in an analysis of a 12-year-old Norway spruce field trial by Liziniewicz et al. (2020). Pont et al. (2016) also found acceptable differences between heritability estimates for tree size traits of Pinus radiata derived from data acquired with airborne sensors and ground-based methods. In our study, we found high genetic correlations between heights measured conventionally at ages of 5 and 8 years, and heights derived from the LiDAR analysis, clearly indicating that conventional height measurements can be replaced by LiDAR-based estimates.
4.3 Selection and implementation
Use of raw estimates from the LiDAR point clouds with exclusion of anomalies, e.g., data indicating the presence of trees that are three times higher than average in trial, is likely to be the most appropriate way to use them in genetic selection. Results of selection using the “raw” dataset with heights assigned to missing trees and “pruned” datasets excluding trees that were missing according to the conventional ground measurements did not result in any differences except for one family. The exception was for a family in which the wrong clone was selected using the “raw” dataset due to the inclusion of a small height estimate for a missing ramet. The clones selected for almost all the 42 families using the two datasets were the same.
Assignment of low heights to missing trees was the greatest source of error when “raw” LiDAR data were used. However, in the analyzed trial there was just one ramet per clone for most of the clones, which ensured the right selection if the tree was present at the inventory time. If there are more ramets per clone, mortality or substantial damage and associated wrong (low) height estimates will lead to wrong mean or breeding values. This was illustrated in our study by the clone S22K1314174, which was selected from conventional measurements but ranked poorly when using the LiDAR dataset. There were two ramets of this clone, one of which had died. From the LiDAR estimates, an accurate estimate was obtained for the ramet that was present in the field, while a very low estimate was obtained for the dead ramet, leading to calculation of a wrong mean value for the clone. Ideally, for testing Scots pine in Sweden three or four ramets per clone should be planted in a trial. However, ramets are usually produced as rooted cuttings, Scots pine is not easy to propagate in this way, and there is usually high variation in ease of propagation between clones. Thus, as in our case, there is often just one ramet for many clones. So, use of LiDAR-based height assessments could clearly lead to erroneous estimates of mean heights and breeding values of clones in trials without control of plants’ mortality. Similarly, Solvin et al. (2020) concluded from an analysis of a dense Norway spruce genetic trial that photogrammetric data would only be beneficial for forward selection, i.e., selection of the tallest tree per tested family. Our selection procedure was like forward selection in a progeny trial as many tested clones were only represented by one ramet. The potential risks are illustrated by findings of Liziniewicz et al. (2020) that only 50% of sets of Norway spruce clones selected from 32 full-sib families using conventional measurements and photogrammetry-based estimates matched.
When selecting progenies, the tallest genotype of each family was selected, in accordance with the most common practice in operational breeding. The selection was not perfect, but the rankings obtained using conventional measurements and LiDAR-based estimates differed by at most one or two places. In addition, when the selected genotypes differed, the differences in their absolute mean heights were marginal (up to 10 cm).
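A comparison of within-family selections of this kind could be sketched as below. The family and genotype labels, the heights, and the helper names `select_tallest` and `rank_of` are assumptions made for illustration only.

```python
def select_tallest(heights):
    """Return the tallest genotype per family.

    heights maps family -> {genotype: height_cm}.
    """
    return {fam: max(geno, key=geno.get) for fam, geno in heights.items()}


def rank_of(genotype, family_heights):
    """1-based rank of a genotype within its family (1 = tallest)."""
    ordered = sorted(family_heights, key=family_heights.get, reverse=True)
    return ordered.index(genotype) + 1


# Hypothetical heights (cm) from the two measurement methods.
ground = {"F1": {"a": 310, "b": 295, "c": 280},
          "F2": {"d": 300, "e": 305, "f": 270}}
lidar = {"F1": {"a": 305, "b": 292, "c": 283},
         "F2": {"d": 304, "e": 301, "f": 268}}

sel_ground = select_tallest(ground)
sel_lidar = select_tallest(lidar)

# How far down the ground-based ranking the LiDAR-selected genotype sits,
# and how much ground-measured height is lost by selecting it.
rank_shift = {f: rank_of(sel_lidar[f], ground[f]) - 1 for f in ground}
height_loss = {f: ground[f][sel_ground[f]] - ground[f][sel_lidar[f]]
               for f in ground}
```

Here the selections agree for family F1, while for F2 the LiDAR-based choice is one rank lower in the ground-based ranking and 5 cm shorter.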
To increase control and check for damage to the trees, ground measurements at later ages, e.g., diameter measurements and survival checks, would be highly valuable. Moreover, diameter measurements are relatively cheap and highly correlated with objective traits. Stem quality traits and defects cannot be obtained from LiDAR point clouds yet, but the rapid development of sensors and techniques might enable this in the coming decade. From a practical perspective, pre-commercial thinning just before flights might reduce the magnitude of errors. Natural regeneration of pioneer species often occurs in places where planting of conifers has failed. Removal of unwanted trees in a trial will not only facilitate genetic evaluation of the trial but also ease exclusion of erroneous data for missing trees from LiDAR datasets. Alternatively, a height threshold could be chosen to exclude estimates that are probably erroneous from the datasets. We found that this approach increases heritability estimates but does not change the ranks of selected individuals. On the other hand, such an approach might eliminate well-estimated but low height values, leading to overestimation of clonal averages.
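The trade-off of a height threshold can be illustrated with a short example. The threshold of 100 cm, the ramet heights, and the function `clonal_mean` are hypothetical; no specific threshold value is implied for the trial.

```python
from statistics import mean


def clonal_mean(heights_cm, threshold_cm=None):
    """Mean height of a clone's ramets.

    If threshold_cm is given, estimates below it are treated as
    probably erroneous and ignored. Returns None if nothing remains.
    """
    kept = [h for h in heights_cm
            if threshold_cm is None or h >= threshold_cm]
    return mean(kept) if kept else None


# Hypothetical clone: the 90 cm ramet is alive and correctly estimated,
# it is simply a small tree.
ramets = [90, 260, 280]

unfiltered = clonal_mean(ramets)                 # 210
filtered = clonal_mean(ramets, threshold_cm=100)  # 270
```

The threshold removes the genuine low value together with any erroneous ones, so the clonal average rises from 210 cm to 270 cm in this example.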
In forestry, tree heights or average stand heights have traditionally been used to quantify site quality and production capacity, e.g., in site index functions. Height has also been widely used as a selection trait in breeding programs, as it is relatively cheap to measure at a young age or on sample trees, and is highly correlated with breeding objective traits, e.g., volume production per unit area (Eichhorn, 1902; Skovsgaard and Vanclay, 2008). However, the costs of height measurements increase with increasing height, so diameter is often used as a selection trait at older ages. Analyses of National Forest Inventory data have shown that there have been substantial increases in planted trees’ height during the last century (Elfving and Tegnhammar, 1996; Elfving et al., 1996). The average height of 50-year-old Scots pine and Norway spruce trees was 12 m in 1950 and 15 m in the 1990s, and is currently 17–18 m (Fridman and Danell, 2024). There were proportional increases in diameter in the period from the 1950s to 1990s, but not since then. Mensah et al. (2021) also found that top height growth of Norway spruce and Scots pine has increased in the last 30 years in Sweden. Genetic improvements may have contributed to the increases, potentially in conjunction with changes in silvicultural practices, increases in nitrogen deposition, and rises in atmospheric CO2 concentrations. These studies, our study, and numerous others indicate that height is the optimal trait to measure and should be measured consistently rather than diameter in breeding programs. LiDAR sensors can provide accurate data in breeding programs that can improve estimates of breeding values and assessments of genetic gains obtained from the programs. The relatively low costs of data acquisition with LiDAR sensors also allow more frequent measurements, and measurements throughout a whole rotation (if desired), including ages when ground measurements have become difficult.
In this study, we have focused on practical application of currently available equipment and methods to derive height estimates of trees in a Scots pine trial. It should be noted that we have not attempted to develop new analytical methods that might improve estimates, e.g., segmentation of trees, identification of the most appropriate cut-off heights in areas around GPS positions of trees, or optimization of the flight mode.
4.4 Practical considerations
GPS positioning of the planting spots allowed quick and accurate estimation of tree heights from LiDAR point clouds, although the time spent on this step was not recorded. The GPS-measured points for the planted seedlings were matched with x/y positions in numbered plots on a traditional map. This enabled quick analysis of LiDAR data with no need for manual annotation of the planting points on a LiDAR layer in a GIS program, which has been considered a major challenge for application (du Toit et al. 2022). The correct matching of GPS points and genotypes (each of which is a unique individual that must be monitored) is crucial for obtaining reliable data for genetic analysis.
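One simple way to derive a height for each GPS-positioned planting spot is to take the highest ground-normalized LiDAR return within a fixed radius of the spot. The sketch below assumes this rule purely for illustration; the radius of 0.5 m, the use of the maximum, the coordinates, and the function name `height_at_position` are all assumptions and do not describe the processing chain used in the study.

```python
import math


def height_at_position(points, x, y, radius=0.5):
    """Highest normalized return within `radius` metres of (x, y).

    points is an iterable of (x, y, z) tuples, with z already expressed
    as height above ground (m). Returns None if no return falls inside
    the search radius.
    """
    zs = [z for px, py, z in points
          if math.hypot(px - x, py - y) <= radius]
    return max(zs) if zs else None


# Hypothetical normalized point cloud and GPS planting positions.
points = [
    (10.1, 20.0, 2.4),
    (10.2, 20.1, 2.6),
    (12.0, 20.0, 0.3),  # low return where the planted tree is missing
    (14.1, 19.9, 2.1),
]
positions = {
    "plot1_tree1": (10.0, 20.0),
    "plot1_tree2": (12.0, 20.0),
    "plot1_tree3": (16.0, 20.0),
}

heights = {tree: height_at_position(points, *xy)
           for tree, xy in positions.items()}
```

Keying the result by tree identifier keeps each height tied to its genotype, and the low value returned for the second position shows how a missing tree can still receive a small height estimate.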
We tested various available algorithms and methods for automatic annotation without success. The tests were intended to mimic a situation where GPS positions are not available, which is often the case for genetic field trials, many of which were planted before the development of the technologies applied in this study. The main reasons for the failure of automatic annotation were deviations in planting spacing due to natural obstacles (stumps, stones, or other impediments) in the field, and the associated discrepancies between the GPS points obtained at planting time and the regular grid of coordinates laid out on the LiDAR image in initial attempts to mimic them. Pont et al. (2016) concluded that variations in accurate detection frequencies between 90 and 98% did not significantly affect estimated heritabilities and genetic gains for 7-year-old Pinus radiata. However, the cited authors did not investigate the effect on selection. In the future, implementation of AI technology may facilitate this process, but at the current stage the attempts were not successful for genetic trials planted in Sweden. Automatic annotation could be tried at very flat agricultural sites, where it is easier to maintain consistent and even distances between trees, but specific tests are required to confirm this. Similar obstacles have been highlighted in studies of Douglas fir in Canada (du Toit et al. 2022) and Norway spruce in Norway (Liziniewicz et al. 2020; Solvin et al. 2020).
5 Conclusion
This is the most comprehensive study to date confirming that LiDAR sensors installed on drones can provide good background information for estimation of height in young Scots pine genetic trials. The method can probably also be used in analyses of other tree species. Our study included ca. 2500 trees, and heights estimated from the acquired 3D point cloud provided reliable estimates of genetic parameters and acceptably accurate information for selection of superior genotypes. The precision of selection was high, and slight mismatches in selected clones did not significantly affect the height of genotypes selected for future breeding and deployment. The main obstacle to common implementation of the method is still the linking of conventional genetic maps with LiDAR layers, which currently requires tedious registration of the GPS positions of planting spots. However, we consider this step beneficial for the future and recommend GPS positioning of all planted genetic trials. This step might make it possible to develop phenotyping platforms for use in forest tree breeding similar to the phenotyping platforms in agriculture.
Data availability
Data and code may be made available upon reasonable request to the main authors.
References
Almeida DRAd, Stark SC, Chazdon R, Nelson BW, César RG, Meli P et al (2019) The effectiveness of LiDAR remote sensing for monitoring forest cover attributes and landscape restoration. For Ecol Manage 438:34–43. https://doiorg.publicaciones.saludcastillayleon.es/10.1016/j.foreco.2019.02.002
Bian L, Zhang H, Ge Y, Čepl J, Stejskal J, El-Kassaby YA (2022) Closing the gap between phenotyping and genotyping: review of advanced, image-based phenotyping technologies in forestry. Ann for Sci 79(1):1–21. https://doiorg.publicaciones.saludcastillayleon.es/10.1186/s13595-022-01143-x
Buchelt A, Adrowitzer A, Kieseberg P, Gollob C, Nothdurft A, Eresheim S et al (2024) Exploring artificial intelligence for applications of drones in forest ecology and management. For Ecol Manage 551:121530. https://doiorg.publicaciones.saludcastillayleon.es/10.1016/j.foreco.2023.121530
Castilla G, Filiatrault M, McDermid GJ, Gartrell M (2020) Estimating individual conifer seedling height using drone-based image point clouds. Forests 11(9):924. https://doiorg.publicaciones.saludcastillayleon.es/10.3390/f11090924
Costa e Silva J, Dutkowski GW, Gilmour AR (2001) Analysis of early tree height in forest genetic trials is enhanced by including a spatially correlated residual. Can J For Res 31:1887–1893. https://doiorg.publicaciones.saludcastillayleon.es/10.1139/x01-123.
Debnath S, Paul M, Debnath T (2023) Applications of LiDAR in agriculture and future research directions. J Imaging 9:57. https://doiorg.publicaciones.saludcastillayleon.es/10.3390/jimaging9030057
du Toit F, Coops NC, Ratcliffe B, El-Kassaby YA (2022) Generating douglas-fir breeding value estimates using airborne laser scanning derived height and crown metrics. Front Plant Sci 13:893017. https://doiorg.publicaciones.saludcastillayleon.es/10.3389/fpls.2022.893017
du Toit F, Coops NC, Ratcliffe B, El-Kassaby YA, Lucieer A (2023) Modelling internal tree attributes for breeding applications in Douglas-fir progeny trials using RPAS-ALS. Sci Remote Sensing. 7:100072. https://doiorg.publicaciones.saludcastillayleon.es/10.1016/j.srs.2022.100072.
Eichhorn F (1902) Ertragstafeln für die Weiβtanne [Yield tables for the silver fir]. Verlag Julius Springer
Elfving B, Tegnhammar L, Tveite B (1996) Studies on growth trends of forests in Sweden and Norway. In: Growth trends in European forests: studies from 12 countries 61–70. http://dx.doi.org/10.1007/978-3-642-61178-0_6.
Elfving B, Tegnhammar L (1996) Trends of tree growth in Swedish forests 1953–1992: an analysis based on sample trees from the National Forest Inventory. Scand J For Res 11:26–37. https://doiorg.publicaciones.saludcastillayleon.es/10.1080/02827589609382909.
Fargione J, Haase DL, Burney OT, Kildisheva OA, Edge G, Cook-Patton SC et al (2021) Challenges to the Reforestation Pipeline in the United States. Front Forests Glob Change 4:629198. https://doiorg.publicaciones.saludcastillayleon.es/10.3389/ffgc.2021.629198
Fridman J, Danell K (2024) Sweden's Forests Over the Last 100 Years: The Swedish National Forest Inventory 1923-2023. Gidlunds förlag, ISBN: 9789178445486
Gallardo-Salazar JL, Pompa-García M, Aguirre-Salado CA, López-Serrano PM, Meléndez-Soto A (2020) Drones: technology with a promising future in forest management. Revista mexicana de ciencias forestales. 11(61):27–50. https://doiorg.publicaciones.saludcastillayleon.es/10.29298/rmcf.v11i61.794
Gilmour A, Gogel B, Cullis B, Welham S, Thompson R (2015) ASReml user guide release 4.1 structural specification. VSN International Ltd, Hemel Hempstead, HP1 1ES, UK
Grubinger S, Coops NC, Stoehr M, El-Kassaby YA, Lucieer A, Turner D (2020) Modeling realized gains in Douglas-fir (Pseudotsuga menziesii) using laser scanning data from unmanned aircraft systems (UAS). For Ecol Manage 473:118284. https://doiorg.publicaciones.saludcastillayleon.es/10.1016/j.foreco.2020.118284
Hijmans RJ (2023) terra: Spatial Data Analysis. R package version 1.7–39. The R Foundation for Statistical Computing. https://cran.r-project.org/web/packages/terra/index.html.
Hyyppä J, Hyyppä H, Yu X, Kaartinen H, Kukko A, Holopainen M (2017) Forest inventory using small-footprint airborne LiDAR. In: Topographic laser ranging and scanning. CRC Press, pp 335–370
Isik F, Holland J, Maltecca C (2017) Genetic data analysis for plant and animal breeding. Springer. https://doiorg.publicaciones.saludcastillayleon.es/10.1007/978-3-319-55177-7.
Jaakkola A, Hyyppä J, Yu X, Kukko A, Kaartinen H, Liang X et al (2017) Autonomous collection of forest field reference—The outlook and a first step with UAV laser scanning. Remote Sensing 9(8):785. https://doiorg.publicaciones.saludcastillayleon.es/10.3390/rs9080785
Joshi N, Baumann M, Ehammer A, Fensholt R, Grogan K, Hostert P et al (2016) A review of the application of optical and radar remote sensing data fusion to land use mapping and monitoring. Remote Sensing 8(1):70. https://doiorg.publicaciones.saludcastillayleon.es/10.3390/rs8010070
Kinaneva D, Hristov G, Raychev J, Zahariev P (2019) Early forest fire detection using drones and artificial intelligence. In: 42nd International Convention on Information and Communication Technology, Electronics and Microelectronics (MIPRO). M.e.a. Koricic (ed.). IEEE, pp. 1060-1065. https://doiorg.publicaciones.saludcastillayleon.es/10.23919/MIPRO.2019.8756696.
Krause S, Sanders TG, Mund J-P, Greve K (2019) UAV-based photogrammetric tree height measurement for intensive forest monitoring. Remote Sensing 11(7):758. https://doiorg.publicaciones.saludcastillayleon.es/10.3390/rs11070758
Liao L, Cao L, Xie Y, Luo J, Wang G (2022) Phenotypic traits extraction and genetic characteristics assessment of eucalyptus trials based on UAV-borne LiDAR and RGB images. Remote Sensing 14(3):765. https://doiorg.publicaciones.saludcastillayleon.es/10.3390/rs14030765
Liziniewicz M, Berlin M, Karlsson B (2018) Early assessments are reliable indicators for future volume production in Norway spruce (Picea abies L. Karst) genetic field trials. For Ecol Manage 411:75–81. https://doiorg.publicaciones.saludcastillayleon.es/10.1016/j.foreco.2018.01.015
Liziniewicz M, Ene LT, Malm J, Lindberg J, Helmersson A, Karlsson B (2020) Estimation of Genetic Parameters and Selection of Superior Genotypes in a 12-Year-Old Clonal Norway Spruce Field Trial after Phenotypic Assessment Using a UAV. Forests 11(9):992. https://doiorg.publicaciones.saludcastillayleon.es/10.3390/f11090992
Maechler M, Rousseeuw P, Struyf A, Hubert M, Hornik K (2023) Cluster: cluster analysis basics and extensions. R Package Version 2(1):6
Malmberg Å (2023) Using drones to thin the forest. Press Release - Uppsala University, Uppsala. https://www.uu.se/en/news/2023/2023-02-23-using-drones-to-thin-the-forest. Accessed 10 Feb 2025.
Mensah AA, Holmström E, Petersson H, Nyström K, Mason EG, Nilsson U (2021) The millennium shift: Investigating the relationship between environment and growth trends of Norway spruce and Scots pine in northern Europe. For Ecol Manage 481:118727. https://doiorg.publicaciones.saludcastillayleon.es/10.1016/j.foreco.2020.118727
Mohan M, Richardson G, Gopan G, Aghai MM, Bajaj S, Galgamuwa GAP et al (2021) UAV-Supported Forest Regeneration: Current Trends, Challenges and Implications. Remote Sensing 13(13):2596. https://doiorg.publicaciones.saludcastillayleon.es/10.3390/rs13132596
Momeni M, Soleimani H, Shahparvari S, Afshar-Nadjafi B (2022) Coordinated routing system for fire detection by patrolling trucks with drones. Int J Disaster Risk Reduct 73:102859. https://doiorg.publicaciones.saludcastillayleon.es/10.1016/j.ijdrr.2022.102859
Mora B, Wulder MA, Hobart GW, White JC, Bater CW, Gougeon FA et al (2013) Forest inventory stand height estimates from very high spatial resolution satellite imagery calibrated with lidar plots. Int J Remote Sens 34(12):4406–4424. https://doiorg.publicaciones.saludcastillayleon.es/10.1080/01431161.2013.779041
Näsi R, Honkavaara E, Blomqvist M, Lyytikäinen-Saarenmaa P, Hakala T, Viljanen N et al (2018) Remote sensing of bark beetle damage in urban forests at individual tree level using a novel hyperspectral camera from UAV and aircraft. Urban Forest Urban Green 30:72–83. https://doiorg.publicaciones.saludcastillayleon.es/10.1016/j.ufug.2018.01.010
Pebesma EJ (2018) Simple features for R: standardized support for spatial vector data. The R Journal 10:439–446
Pebesma E, Bivand R (2023) Spatial data science: With applications in R. Chapman and Hall/CRC. https://doiorg.publicaciones.saludcastillayleon.es/10.1201/9780429459016.
Pont D, Dungey H, Watt M, Morgenroth J, Stovold T (2016) The use of LiDAR for Phenotyping. In: Forest Genetics for Productivity Conference, Rotorua, New Zealand. https://www.researchgate.net/profile/David-Pont-2/publication/304864832_The_use_of_LiDAR_for_Phenotyping/links/577c866e08ae213761cac0d8/The-use-of-LiDARfor-Phenotyping.pdf.
R Core Team (2013) R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria
Reynolds D, Baret F, Welcker C, Bostrom A, Ball J, Cellini F et al (2019) What is cost-efficient phenotyping? Optimizing costs for different scenarios. Plant Sci 282:14–22. https://doiorg.publicaciones.saludcastillayleon.es/10.1016/j.plantsci.2018.06.015
Rosvall O, Ståhl P, Almqvist C, Anderson B, Berlin M, Ericsson T, Eriksson M, Gregorsson B, Hajek J, Hallander J (2011) Review of the Swedish tree breeding programme. Skogforsk, Uppsala, Sweden
Rosvall O, Mullin TJ (2013) Introduction to breeding strategies and evaluation of alternatives. Best practice for tree breeding in Europe, pp.7-27. Skogforsk, Uppsala, Editors: Tim J. Mullin, Steve Lee. ISBN: 978-91-977649-6-4
Roussel JR, Auty D, De Boissieu F, Meador AS (2018) lidR: Airborne LiDAR data manipulation and visualization for forestry applications. R package version 4.2.0 https://cran.r-project.org/package=lidR
Roussel J-R, Auty D, Coops NC, Tompalski P, Goodbody TR, Meador AS et al (2020) lidR: An R package for analysis of Airborne Laser Scanning (ALS) data. Remote Sens Environ 251:112061. https://doiorg.publicaciones.saludcastillayleon.es/10.1016/j.rse.2020.112061
Skovsgaard JP, Vanclay JK (2008) Forest site productivity: a review of the evolution of dendrometric concepts for even-aged stands. Forestry 81:13–31. https://doiorg.publicaciones.saludcastillayleon.es/10.1093/forestry/cpm041.
Solvin TM, Puliti S, Steffenrem A (2020) Use of UAV photogrammetric data in forest genetic trials: measuring tree height, growth, and phenology in Norway spruce (Picea abies L. Karst.). Scand J Forest Res 35(7):322–333. https://doiorg.publicaciones.saludcastillayleon.es/10.1080/02827581.2020.1806350
Szostak M (2020) Automated land cover change detection and forest succession monitoring using LiDAR Point Clouds and GIS analyses. Geosciences 10(8):321. https://doiorg.publicaciones.saludcastillayleon.es/10.3390/geosciences10080321
Tibshirani R, Walther G, Hastie T (2001) Estimating the number of clusters in a data set via the gap statistic. J R Stat Soc: Series B (Stat Method) 63(2):411–423. https://doiorg.publicaciones.saludcastillayleon.es/10.1111/1467-9868.00293
Yin D, Wang L (2019) Individual mangrove tree measurement using UAV-based LiDAR data: Possibilities and challenges. Remote Sens Environ 223:34–49. https://doiorg.publicaciones.saludcastillayleon.es/10.1016/j.rse.2018.12.034
Acknowledgements
The study was financially supported by the Troedssons Stiftelsen.
Funding
The study was financially supported by Stiftelsen Nils och Dorthi Troëdssons Forskningsfond in the project number 1019/21 “Efficient phenotyping of genetic field trials using automated tree height extraction from multi-temporal UAV imagery.”
Author information
Contributions
ML conceptualized the project, performed data visualization and genetic analysis, and wrote the first draft of the manuscript. CA was responsible for conventional inventories and was involved in scientific writing. AHe was responsible for conceptualization and scientific writing. AH was responsible for delivery of LiDAR data and scientific writing. LTE was responsible for conceptualization, analysis of LiDAR data delivering height estimates from UAV flights, and scientific writing. All authors read and approved the final manuscript.
Ethics declarations
Ethics approval and consent to participate
Not applicable. All authors gave their informed consent to this publication and its content.
Consent for publication
Not applicable.
Competing interests
The authors declare that they have no competing interests.
Additional information
Handling editor: Shuguang Liu.
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Liziniewicz, M., Almqvist, C., Helmersson, A. et al. LiDAR-estimated height in a young Scots pine (Pinus sylvestris L.) genetic trial supports high-accuracy early selection for height. Annals of Forest Science 82, 12 (2025). https://doiorg.publicaciones.saludcastillayleon.es/10.1186/s13595-025-01283-w
DOI: https://doiorg.publicaciones.saludcastillayleon.es/10.1186/s13595-025-01283-w