1
Melchert GF, Ferreira FM, Muniz FR, de Matos JW, Benatti TR, Brum IJB, de Siqueira L, Tambarussi EV. Genomic Prediction in a Self-Fertilized Progenies of Eucalyptus spp. PLANTS (BASEL, SWITZERLAND) 2025; 14:1422. [PMID: 40430990 PMCID: PMC12115009 DOI: 10.3390/plants14101422] [Received: 03/13/2025] [Revised: 04/30/2025] [Accepted: 05/03/2025] [Indexed: 05/29/2025]
Abstract
Genomic selection in Eucalyptus enables the identification of superior genotypes, thereby reducing breeding cycles and increasing selection intensity. However, its efficiency may be compromised by the complex structure of breeding populations, which arises from the use of multiple parents from different species. In this context, partially inbred lines have emerged as a viable alternative to enhance efficiency and generate productive clones. This study aimed to apply genomic selection to a self-fertilized population of different Eucalyptus spp. Our objective was to predict the genomic estimated breeding values (GEBVs) of individuals lacking phenotypic information, with a particular focus on inbred line development. The studied population comprised 662 individuals, of which 600 were phenotyped for diameter at breast height (DBH) at 36 months in a field experiment. The remaining 62 individuals were located in a hybridization orchard and lacked phenotypic data. All individuals, including progeny and parents, were genotyped using 10,132 SNP markers. Genomic prediction was conducted using four frequentist models (GBLUP, GBLUP dominant additive, HBLUP, and ABLUP) and five Bayesian models (BRR, BayesA, BayesB, BayesC, and Bayes LASSO), evaluated by k-fold cross-validation. Among the GS models, GBLUP exhibited the best overall performance, with a predictive ability of 0.48 and an R² of 0.21. Bayes LASSO had the lowest mean squared error (MSE, 3.72); across all models, the MSE ranged from 3.72 to 15.50. However, GBLUP stood out because it predicted individual performance more precisely and performed in a balanced way across the studied parameters. These results highlight the potential of genomic selection for the genetic improvement of Eucalyptus through inbred lines. In addition, our model facilitates the identification of promising individuals and the acceleration of breeding cycles, one of the major challenges in Eucalyptus breeding programs. Consequently, it can reduce breeding program costs, as it eliminates the need to establish experiments over large planted areas while also making the selection of genotypes more reliable.
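The evaluation described in this abstract (GBLUP scored by k-fold cross-validation on predictive ability, R², and MSE) can be sketched roughly as follows. This is a minimal illustration on simulated data, not the authors' pipeline. The function names, the simulated marker matrix, the VanRaden-style relationship matrix, and the fixed heritability `h2` (used in place of a variance-component estimate such as REML) are all assumptions made for the example, and R² is taken here simply as the squared correlation, which may differ from the definition used in the paper.

```python
import numpy as np

def vanraden_g(M):
    """Genomic relationship matrix from an n x m matrix of 0/1/2 allele counts."""
    p = M.mean(axis=0) / 2.0                  # observed allele frequencies
    Z = M - 2.0 * p                           # centre each marker column
    return Z @ Z.T / (2.0 * np.sum(p * (1.0 - p)))

def gblup_predict(G, y, train, test, h2):
    """Predict test individuals from training phenotypes, given an assumed h2."""
    lam = (1.0 - h2) / h2                     # residual-to-genetic variance ratio
    mu = y[train].mean()
    G_tt = G[np.ix_(train, train)]
    alpha = np.linalg.solve(G_tt + lam * np.eye(len(train)), y[train] - mu)
    return mu + G[np.ix_(test, train)] @ alpha

def kfold_metrics(G, y, k=5, h2=0.5, seed=0):
    """Return (predictive ability, squared correlation, MSE) from k-fold CV."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    pred = np.empty(len(y))
    for i, test in enumerate(folds):
        train = np.concatenate([f for j, f in enumerate(folds) if j != i])
        pred[test] = gblup_predict(G, y, train, test, h2)
    r = float(np.corrcoef(y, pred)[0, 1])     # predictive ability
    mse = float(np.mean((y - pred) ** 2))
    return r, r ** 2, mse

# Simulated example (hypothetical sizes, unrelated to the study's data).
rng = np.random.default_rng(1)
n, m = 300, 50
freqs = rng.uniform(0.1, 0.9, size=m)
M = rng.binomial(2, freqs, size=(n, m)).astype(float)
g = (M - M.mean(axis=0)) @ rng.normal(size=m)
g = g / g.std()                               # genetic variance scaled to 1
y = 10.0 + g + rng.normal(size=n)             # residual variance 1, so h2 is about 0.5

G = vanraden_g(M)
r, r2, mse = kfold_metrics(G, y, k=5, h2=0.5)
print(f"predictive ability={r:.2f}  R2={r2:.2f}  MSE={mse:.2f}")
```

Predicting unphenotyped candidates, such as the 62 orchard individuals mentioned in the abstract, would use the same `gblup_predict` call with all phenotyped individuals as `train` and the candidates as `test`.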
Affiliation(s)
- Guilherme Ferreira Melchert
- Department of Forest Science, Soils and Environment, São Paulo State University (UNESP), School of Agricultural Sciences (FCA), Av. Universitária, Botucatu 18610-034, SP, Brazil
- Filipe Manoel Ferreira
- Department of Plant Production, São Paulo State University (UNESP), School of Agricultural Sciences (FCA), Av. Universitária, Botucatu 18610-034, SP, Brazil
- Fabiana Rezende Muniz
- Suzano S.A., Jacareí 12340-010, SP, Brazil
- Jose Wilacildo de Matos
- Suzano S.A., Jacareí 12340-010, SP, Brazil
- Thiago Romanos Benatti
- Suzano S.A., Jacareí 12340-010, SP, Brazil
- Leandro de Siqueira
- Suzano S.A., Jacareí 12340-010, SP, Brazil
- Evandro Vagner Tambarussi
- Department of Plant Production, São Paulo State University (UNESP), School of Agricultural Sciences (FCA), Av. Universitária, Botucatu 18610-034, SP, Brazil
2
Zhang Z, Wang X, Zhang Y, Zhou K, Yu G, Yang W, Li F, Guan X, Zhang X, Yang Z, Xu C, Xu Y. SPDC-HG: An accelerator of genomic hybrid breeding in maize. PLANT BIOTECHNOLOGY JOURNAL 2025; 23:1847-1861. [PMID: 40014659 PMCID: PMC12018846 DOI: 10.1111/pbi.70011] [Received: 11/07/2024] [Revised: 01/22/2025] [Accepted: 02/04/2025] [Indexed: 03/01/2025]
Abstract
Integrating multiple modern breeding techniques in maize has always been challenging. This study aimed to address this issue by applying a flexible sparse partial diallel cross design composed of 945 maize hybrids derived from 266 inbred lines across different heterotic groups. The research integrated genome-wide association studies, genomic selection and genomic evaluation of parental inbred lines to accelerate the breeding process for developing single-cross hybrids. Between 7 and 25 stable single nucleotide polymorphisms (SNPs) were significantly associated with the general combining abilities (GCAs) of nine yield-related traits. Using the MaizeGDB and NCBI databases, 264 candidate genes were screened and functionally annotated based on significant SNPs detected by at least three statistical methods. The marker set developed from these GCA SNPs significantly improved the prediction accuracy of hybrids across all traits. The GCA estimates of the inbred lines involved in the top 100 and bottom 100 hybrids consistently ranked at the top and bottom, thereby confirming the accuracy of the predictions. Furthermore, the top 100 crosses selected using BayesB, GBLUP and LASSO showed a 105.4-108.6% increase in average ear weight compared to the bottom 100 crosses in field validation, demonstrating strong selection gains. Notably, amongst the top 100 hybrids, A017/A037 and A037/A169, each containing six superior genotypes, were registered as Suyu 161 and Tongyu 1701, respectively, by the National Crop Variety Approval Committee in China. These results highlight the effectiveness of genomic selection and provide valuable insights for advancing genomic hybrid breeding in maize.
Affiliation(s)
- Zhenliang Zhang
- Key Laboratory of Plant Functional Genomics of the Ministry of Education/Jiangsu Key Laboratory of Crop Genomics and Molecular Breeding/Jiangsu Co-Innovation Center for Modern Production Technology of Grain Crops, College of Agriculture, Yangzhou University, Yangzhou, Jiangsu, China
- Jiangsu Yanjiang Institute of Agricultural Sciences, Nantong, China
- Xin Wang
- Key Laboratory of Plant Functional Genomics of the Ministry of Education/Jiangsu Key Laboratory of Crop Genomics and Molecular Breeding/Jiangsu Co-Innovation Center for Modern Production Technology of Grain Crops, College of Agriculture, Yangzhou University, Yangzhou, Jiangsu, China
- College of Information Engineering, Yangzhou University, Yangzhou, Jiangsu, China
- Yuxiang Zhang
- Key Laboratory of Plant Functional Genomics of the Ministry of Education/Jiangsu Key Laboratory of Crop Genomics and Molecular Breeding/Jiangsu Co-Innovation Center for Modern Production Technology of Grain Crops, College of Agriculture, Yangzhou University, Yangzhou, Jiangsu, China
- Kai Zhou
- Key Laboratory of Plant Functional Genomics of the Ministry of Education/Jiangsu Key Laboratory of Crop Genomics and Molecular Breeding/Jiangsu Co-Innovation Center for Modern Production Technology of Grain Crops, College of Agriculture, Yangzhou University, Yangzhou, Jiangsu, China
- Guangning Yu
- Key Laboratory of Plant Functional Genomics of the Ministry of Education/Jiangsu Key Laboratory of Crop Genomics and Molecular Breeding/Jiangsu Co-Innovation Center for Modern Production Technology of Grain Crops, College of Agriculture, Yangzhou University, Yangzhou, Jiangsu, China
- Wenyan Yang
- Key Laboratory of Plant Functional Genomics of the Ministry of Education/Jiangsu Key Laboratory of Crop Genomics and Molecular Breeding/Jiangsu Co-Innovation Center for Modern Production Technology of Grain Crops, College of Agriculture, Yangzhou University, Yangzhou, Jiangsu, China
- Furong Li
- Key Laboratory of Plant Functional Genomics of the Ministry of Education/Jiangsu Key Laboratory of Crop Genomics and Molecular Breeding/Jiangsu Co-Innovation Center for Modern Production Technology of Grain Crops, College of Agriculture, Yangzhou University, Yangzhou, Jiangsu, China
- Xiusheng Guan
- Key Laboratory of Plant Functional Genomics of the Ministry of Education/Jiangsu Key Laboratory of Crop Genomics and Molecular Breeding/Jiangsu Co-Innovation Center for Modern Production Technology of Grain Crops, College of Agriculture, Yangzhou University, Yangzhou, Jiangsu, China
- Xuecai Zhang
- International Maize and Wheat Improvement Center (CIMMYT), Texcoco, México
- Zefeng Yang
- Key Laboratory of Plant Functional Genomics of the Ministry of Education/Jiangsu Key Laboratory of Crop Genomics and Molecular Breeding/Jiangsu Co-Innovation Center for Modern Production Technology of Grain Crops, College of Agriculture, Yangzhou University, Yangzhou, Jiangsu, China
- Chenwu Xu
- Key Laboratory of Plant Functional Genomics of the Ministry of Education/Jiangsu Key Laboratory of Crop Genomics and Molecular Breeding/Jiangsu Co-Innovation Center for Modern Production Technology of Grain Crops, College of Agriculture, Yangzhou University, Yangzhou, Jiangsu, China
- Yang Xu
- Key Laboratory of Plant Functional Genomics of the Ministry of Education/Jiangsu Key Laboratory of Crop Genomics and Molecular Breeding/Jiangsu Co-Innovation Center for Modern Production Technology of Grain Crops, College of Agriculture, Yangzhou University, Yangzhou, Jiangsu, China
3
Lell M, Gogna A, Kloesgen V, Avenhaus U, Dörnte J, Eckhoff WM, Eschholz T, Gils M, Kirchhoff M, Koch M, Kollers S, Pfeiffer N, Rapp M, Wimmer V, Wolf M, Reif J, Zhao Y. Breaking down data silos across companies to train genome-wide predictions: A feasibility study in wheat. PLANT BIOTECHNOLOGY JOURNAL 2025. [PMID: 40253615 DOI: 10.1111/pbi.70095] [Received: 08/06/2024] [Revised: 03/07/2025] [Accepted: 04/07/2025] [Indexed: 04/22/2025]
Abstract
Big data, combined with artificial intelligence (AI) techniques, holds the potential to significantly enhance the accuracy of genome-wide predictions. Motivated by the success reported for wheat hybrids, we extended the scope to inbred lines by integrating phenotypic and genotypic data from four commercial wheat breeding programs. Acting as an academic data trustee, we merged these data with historical experimental series from previous public-private partnerships. The integrated data spanned 12 years, 168 environments, and provided a genomic prediction training set of up to ~9500 genotypes for grain yield, plant height and heading date. Despite the heterogeneous phenotypic and genotypic data, we were able to obtain high-quality data by implementing rigorous data curation, including SNP imputation. We utilized the data to compare genomic best linear unbiased predictions with convolutional neural network-based genomic prediction. Our analysis revealed that we could flexibly combine experimental series for genomic prediction, with prediction ability steadily improving as the training set sizes increased, peaking at around 4000 genotypes. As training set sizes were further increased, the gains in prediction ability decreased, approaching a plateau well below the theoretical limit defined by the square root of the heritability. Potential avenues, such as designed training sets or novel non-linear prediction approaches, could overcome this plateau and help to more fully exploit the high-value big data generated by breaking down data silos across companies.
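The "theoretical limit defined by the square root of the heritability" mentioned in this abstract follows from a short standard argument, sketched here. The symbols are assumptions introduced for illustration and do not come from the abstract: $y = g + e$ is the phenotype of a validation genotype, $\hat{g}$ its genomic prediction, and $h^2 = \sigma_g^2 / \sigma_y^2$ the heritability. The sketch further assumes that $g$ and $e$ are uncorrelated and that $\hat{g}$ is uncorrelated with the validation residual $e$.

```latex
% Assumptions (not from the abstract): y = g + e, Cov(g, e) = 0, Cov(e, \hat{g}) = 0.
\begin{align*}
r(y, \hat{g})
  &= \frac{\operatorname{Cov}(g + e,\, \hat{g})}{\sigma_y \, \sigma_{\hat{g}}}
   = \frac{\operatorname{Cov}(g,\, \hat{g})}{\sigma_g \, \sigma_{\hat{g}}} \cdot \frac{\sigma_g}{\sigma_y}
   = r(g, \hat{g}) \, \sqrt{h^2}, \\
r(y, \hat{g}) &\le \sqrt{h^2} \qquad \text{because } r(g, \hat{g}) \le 1 .
\end{align*}
```

Under these assumptions, a trait with $h^2 = 0.64$ could reach a prediction ability of at most 0.8 even with a perfect predictor of $g$.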
Affiliation(s)
- Moritz Lell
- Leibniz Institute for Plant Genetics and Crop Plant Research, Seeland, Germany
- Abhishek Gogna
- Leibniz Institute for Plant Genetics and Crop Plant Research, Seeland, Germany
- Vincent Kloesgen
- Leibniz Institute for Plant Genetics and Crop Plant Research, Seeland, Germany
- Ulrike Avenhaus
- W. von Borries-Eckendorf GmbH & Co. KG, Leopoldshöhe, Germany
- Jost Dörnte
- Deutsche Saatveredelung AG, Lippstadt, Germany
- Mario Gils
- Nordsaat Saatzucht GmbH, Langenstein, Germany
- Matthias Rapp
- W. von Borries-Eckendorf GmbH & Co. KG, Leopoldshöhe, Germany
- Jochen Reif
- Leibniz Institute for Plant Genetics and Crop Plant Research, Seeland, Germany
- Yusheng Zhao
- Leibniz Institute for Plant Genetics and Crop Plant Research, Seeland, Germany
4
Shahi D, Todd J, Gravois K, Hale A, Blanchard B, Kimbeng C, Pontif M, Baisakh N. Exploiting historical agronomic data to develop genomic prediction strategies for early clonal selection in the Louisiana sugarcane variety development program. THE PLANT GENOME 2025; 18:e20545. [PMID: 39740237 DOI: 10.1002/tpg2.20545] [Received: 08/27/2024] [Revised: 11/15/2024] [Accepted: 11/18/2024] [Indexed: 01/02/2025]
Abstract
Genomic selection can enhance the rate of genetic gain of cane and sucrose yield in sugarcane (Saccharum L.), an important industrial crop worldwide. We assessed the predictive ability (PA) for six traits, namely theoretical recoverable sugar (TRS), number of stalks (NS), stalk weight (SW), cane yield (CY), sugar yield (SY), and fiber content (Fiber), using 20,451 single nucleotide polymorphisms (SNPs) with 22 statistical models based on the genomic estimated breeding values of 567 genotypes within and across five stages of the Louisiana sugarcane breeding program. TRS and SW, which had high heritability, showed higher PA than the other traits, while NS had the lowest. Machine learning (ML) methods, such as random forest and support vector machine (SVM), outperformed others in predicting traits with low heritability. ML methods predicted TRS and SY with the highest accuracy in cross-stage predictions, while Bayesian models predicted NS and CY with the highest accuracy. Extended genomic best linear unbiased prediction models accounting for dominance and epistasis effects showed a slight improvement in PA for a few traits. When both NS and TRS, which can be available as early as stage 2, were considered in a multi-trait selection model, the PA for SY in stage 5 could increase up to 0.66 compared to 0.30 with a single-trait model. Marker density assessment suggested 9091 SNPs were sufficient for optimal PA of all traits. The study demonstrated the potential of using historical data to devise genomic prediction strategies for clonal selection early in sugarcane breeding programs.
Affiliation(s)
- Dipendra Shahi
- School of Plant, Environmental and Soil Sciences, Louisiana State University Agricultural Center, Baton Rouge, Louisiana, USA
- James Todd
- Sugarcane Research Unit, USDA-ARS, Houma, Louisiana, USA
- Kenneth Gravois
- Sugar Research Station, Louisiana State University Agricultural Center, St. Gabriel, Louisiana, USA
- Anna Hale
- Sugarcane Research Unit, USDA-ARS, Houma, Louisiana, USA
- Brayden Blanchard
- Sugar Research Station, Louisiana State University Agricultural Center, St. Gabriel, Louisiana, USA
- Collins Kimbeng
- Sugar Research Station, Louisiana State University Agricultural Center, St. Gabriel, Louisiana, USA
- Michael Pontif
- Sugar Research Station, Louisiana State University Agricultural Center, St. Gabriel, Louisiana, USA
- Niranjan Baisakh
- School of Plant, Environmental and Soil Sciences, Louisiana State University Agricultural Center, Baton Rouge, Louisiana, USA
5
Chu TT, Jensen J. ADAM-multi: software to simulate complex breeding programs for animals and plants with different ploidy levels and generalized genotypic effect models to account for multiple alleles. Front Genet 2025; 16:1513615. [PMID: 39995464 PMCID: PMC11847855 DOI: 10.3389/fgene.2025.1513615] [Received: 10/18/2024] [Accepted: 01/17/2025] [Indexed: 02/26/2025]
Abstract
The stochastic simulation software ADAM has been developed for breeding optimization in animals and plants, and for validation of statistical models used in genetic evaluations. Like other common simulation programs, ADAM assumed a bi-allelic state for quantitative trait loci (QTL). While the bi-allelic state of marker loci follows from the common choice of single nucleotide polymorphism (SNP) chips as genotyping technology, the assumption may not hold for the linked QTL. In ADAM-Multi, we employ a novel simulation model capable of simulating additive, dominance, and epistatic genotypic effects for species with different levels of ploidy, providing a more realistic assumption of multiple allelism for QTL variants. When bi-allelic QTL are assumed, our proposed model becomes identical to the model assumed in common simulation programs and in genetics textbooks. Along with the description of the updated simulation model in ADAM-Multi, this paper presents two small-scale studies that investigate the effects of multi-allelic versus bi-allelic assumptions in simulation and the use of different prediction models in a single-population breeding program for potatoes. We found that genomic models using dense bi-allelic markers could effectively predict breeding values of individuals in a well-structured population despite the presence of multi-allelic QTL. Additionally, the small-scale study indicated that including non-additive genetic effects in the prediction model for selection did not improve the rate of genetic gain of the breeding program.
Affiliation(s)
- Thinh Tuan Chu
- Center for Quantitative Genetics and Genomics, Aarhus University, Aarhus, Denmark
- Faculty of Animal Science, Vietnam National University of Agriculture, Hanoi, Vietnam
- Just Jensen
- Center for Quantitative Genetics and Genomics, Aarhus University, Aarhus, Denmark
6
Weber SE, Roscher-Ehrig L, Kox T, Abbadi A, Stahl A, Snowdon RJ. Genomic prediction in Brassica napus: evaluating the benefit of imputed whole-genome sequencing data. Genome 2024; 67:210-222. [PMID: 38708850 DOI: 10.1139/gen-2023-0126] [Indexed: 05/07/2024]
Abstract
Advances in sequencing technology allow whole plant genomes to be sequenced with high quality. Combining genotypic and phenotypic data in genomic prediction helps breeders to select crossing partners in partially phenotyped populations. In plant breeding programs, the cost of sequencing entire breeding populations still exceeds available genotyping budgets. Hence, the method for genotyping is still mainly single nucleotide polymorphism (SNP) arrays; however, arrays are unable to assess the entire genome- and population-wide diversity. A compromise involves genotyping the entire population using an SNP array and a subset of the population with whole-genome sequencing. Both datasets can then be used to impute markers from whole-genome sequencing onto the entire population. Here, we evaluate whether imputation of whole-genome sequencing data enhances genomic predictions, using data from a nested association mapping population of rapeseed (Brassica napus). Employing two cross-validation schemes that mimic scenarios for the prediction of close and distant relatives, we show that imputed marker data do not significantly improve prediction accuracy, likely due to redundancy in relationship estimates and imputation errors. In simulation studies, only small improvements were observed, further corroborating the findings. We conclude that SNP arrays are already equipped with the information that is added by imputation through relationship and linkage disequilibrium.
Affiliation(s)
- Sven E Weber
- Department of Plant Breeding, IFZ Research Centre for Biosystems, Land Use and Nutrition, Justus Liebig University, Giessen, Germany
- Lennard Roscher-Ehrig
- Department of Plant Breeding, IFZ Research Centre for Biosystems, Land Use and Nutrition, Justus Liebig University, Giessen, Germany
- Andreas Stahl
- Julius Kuehn Institute (JKI), Federal Research Centre for Cultivated Plants, Institute for Resistance Research and Stress Tolerance, Quedlinburg, Germany
- Rod J Snowdon
- Department of Plant Breeding, IFZ Research Centre for Biosystems, Land Use and Nutrition, Justus Liebig University, Giessen, Germany
7
Chen C, Bhuiyan SA, Ross E, Powell O, Dinglasan E, Wei X, Atkin F, Deomano E, Hayes B. Genomic prediction for sugarcane diseases including hybrid Bayesian-machine learning approaches. FRONTIERS IN PLANT SCIENCE 2024; 15:1398903. [PMID: 38751840 PMCID: PMC11095127 DOI: 10.3389/fpls.2024.1398903] [Received: 03/11/2024] [Accepted: 04/15/2024] [Indexed: 05/18/2024]
Abstract
Sugarcane smut and Pachymetra root rot are two serious diseases of sugarcane, with susceptible infected crops losing over 30% of yield. A heritable component to both diseases has been demonstrated, suggesting selection could improve disease resistance. Genomic selection could accelerate gains even further, enabling early selection of resistant seedlings for breeding and clonal propagation. In this study we evaluated four types of algorithms for genomic prediction of clonal performance for disease resistance. These algorithms were: genomic best linear unbiased prediction (GBLUP), including extensions to model dominance and epistasis; Bayesian methods, including BayesC and BayesR; and machine learning methods, including random forest, multilayer perceptron (MLP), modified convolutional neural network (CNN), and attention networks designed to capture epistasis across genome-wide markers. Simple hybrid methods, which first used BayesR/GWAS to identify a subset of 1000 markers with moderate to large marginal additive effects and then used attention networks to derive predictions from these effects and their interactions, were also developed and evaluated. The hypothesis behind this approach was that a subset of markers more likely to have an effect would allow interaction effects to be estimated better than an extremely large number of possible interactions would, especially given our limited data set size. To evaluate the methods, we applied both random five-fold cross-validation and a structured PCA-based cross-validation that separated 4702 sugarcane clones (which had disease phenotypes and were genotyped for 26k genome-wide SNP markers) by genomic relationship. The Bayesian methods (BayesR and BayesC) gave the highest accuracy of prediction, followed closely by hybrid methods with attention networks. The hybrid methods with attention networks gave the lowest variation in accuracy of prediction across validation folds (and the lowest MSE), which may be a criterion worth considering in practical breeding programs. This suggests that hybrid methods incorporating the attention mechanism could be useful for genomic prediction of clonal performance, particularly where non-additive effects may be important.
Affiliation(s)
- Chensong Chen
- Center for Animal Science, The Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, Brisbane, QLD, Australia
- Shamsul A. Bhuiyan
- Sugar Research Australia, Woodford, QLD, Australia
- Queensland Micro- and Nanotechnology Centre, Griffith University, Nathan, QLD, Australia
- Elizabeth Ross
- Center for Animal Science, The Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, Brisbane, QLD, Australia
- Owen Powell
- Center for Crop Science, The Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, Brisbane, QLD, Australia
- Eric Dinglasan
- Center for Animal Science, The Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, Brisbane, QLD, Australia
- Xianming Wei
- Sugar Research Australia, Indooroopilly, QLD, Australia
- Emily Deomano
- Sugar Research Australia, Indooroopilly, QLD, Australia
- Ben Hayes
- Center for Animal Science, The Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, Brisbane, QLD, Australia
8
Alemu A, Åstrand J, Montesinos-López OA, Isidro Y Sánchez J, Fernández-Gónzalez J, Tadesse W, Vetukuri RR, Carlsson AS, Ceplitis A, Crossa J, Ortiz R, Chawade A. Genomic selection in plant breeding: Key factors shaping two decades of progress. MOLECULAR PLANT 2024; 17:552-578. [PMID: 38475993 DOI: 10.1016/j.molp.2024.03.007] [Received: 11/03/2023] [Revised: 01/22/2024] [Accepted: 03/08/2024] [Indexed: 03/14/2024]
Abstract
Genomic selection, the application of genomic prediction (GP) models to select candidate individuals, has significantly advanced in the past two decades, effectively accelerating genetic gains in plant breeding. This article provides a holistic overview of key factors that have influenced GP in plant breeding during this period. We delved into the pivotal roles of training population size and genetic diversity, and their relationship with the breeding population, in determining GP accuracy. Special emphasis was placed on optimizing training population size. We explored its benefits and the associated diminishing returns beyond an optimum size, while considering the balance between resource allocation and maximizing prediction accuracy through current optimization algorithms. The density and distribution of single-nucleotide polymorphisms, level of linkage disequilibrium, genetic complexity, trait heritability, statistical machine-learning methods, and non-additive effects are the other vital factors. Using wheat, maize, and potato as examples, we summarize the effect of these factors on the accuracy of GP for various traits. The search for high accuracy in GP (theoretically reaching one when Pearson's correlation is used as the metric) is an active research area, and accuracy remains far from optimal for various traits. We hypothesize that with ultra-high sizes of genotypic and phenotypic datasets, effective training population optimization methods and support from other omics approaches (transcriptomics, metabolomics and proteomics) coupled with deep-learning algorithms could overcome the boundaries of current limitations to achieve the highest possible prediction accuracy, making genomic selection an effective tool in plant breeding.
Affiliation(s)
- Admas Alemu
- Department of Plant Breeding, Swedish University of Agricultural Sciences, Alnarp, Sweden
- Johanna Åstrand
- Department of Plant Breeding, Swedish University of Agricultural Sciences, Alnarp, Sweden; Lantmännen Lantbruk, Svalöv, Sweden
- Julio Isidro Y Sánchez
- Centro de Biotecnología y Genómica de Plantas (CBGP, UPM-INIA), Universidad Politécnica de Madrid (UPM) - Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria (INIA), Campus de Montegancedo-UPM, 28223 Madrid, Spain
- Javier Fernández-Gónzalez
- Centro de Biotecnología y Genómica de Plantas (CBGP, UPM-INIA), Universidad Politécnica de Madrid (UPM) - Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria (INIA), Campus de Montegancedo-UPM, 28223 Madrid, Spain
- Wuletaw Tadesse
- International Center for Agricultural Research in the Dry Areas (ICARDA), Rabat, Morocco
- Ramesh R Vetukuri
- Department of Plant Breeding, Swedish University of Agricultural Sciences, Alnarp, Sweden
- Anders S Carlsson
- Department of Plant Breeding, Swedish University of Agricultural Sciences, Alnarp, Sweden
- José Crossa
- International Maize and Wheat Improvement Center (CIMMYT), Km 45, Carretera México-Veracruz, Texcoco, México 52640, Mexico
- Rodomiro Ortiz
- Department of Plant Breeding, Swedish University of Agricultural Sciences, Alnarp, Sweden
- Aakash Chawade
- Department of Plant Breeding, Swedish University of Agricultural Sciences, Alnarp, Sweden
9
Lin YC, Mayer M, Valle Torres D, Pook T, Hölker AC, Presterl T, Ouzunova M, Schön CC. Genomic prediction within and across maize landrace derived populations using haplotypes. FRONTIERS IN PLANT SCIENCE 2024; 15:1351466. [PMID: 38584949 PMCID: PMC10995330 DOI: 10.3389/fpls.2024.1351466] [Received: 12/06/2023] [Accepted: 03/05/2024] [Indexed: 04/09/2024]
Abstract
Genomic prediction (GP) using haplotypes is considered advantageous compared to GP solely reliant on single nucleotide polymorphisms (SNPs), owing to haplotypes' enhanced ability to capture ancestral information and their higher linkage disequilibrium with quantitative trait loci (QTL). Many empirical studies have supported the advantages of haplotype-based GP over SNP-based approaches. Nevertheless, the performance of haplotype-based GP can vary significantly depending on multiple factors, including the traits being studied, the genetic structure of the population under investigation, and the particular method employed for haplotype construction. In this study, we compared haplotype-based and SNP-based prediction accuracies in four populations derived from European maize landraces. Populations comprised either doubled haploid lines (DH) derived directly from landraces, or gamete capture lines (GC) derived from crosses of the landraces with an inbred line. For two different landraces, both types of populations were generated, genotyped with 600k SNPs and phenotyped as lines per se for five traits. Our study explores three prediction scenarios: (i) within each of the four populations, (ii) across DH and GC populations from the same landrace, and (iii) across landraces using either DH or GC populations. Three haplotype construction methods were evaluated: (1) fixed-window blocks (FixedHB), (2) LD-based blocks (HaploView), and (3) IBD-based blocks (HaploBlocker). In within-population predictions, the FixedHB and HaploView methods performed as well as or slightly better than SNPs for all traits. HaploBlocker improved accuracy for certain traits but exhibited inferior performance for others. In prediction across populations, the HaploBlocker parameter setting that controls the construction of shared haplotypes between populations played a crucial role in obtaining optimal results. When predicting across landraces, accuracies were low for both SNP and haplotype approaches, but for specific traits substantial improvement was observed with HaploBlocker. This study provides recommendations for optimal haplotype construction and identifies relevant parameters for constructing haplotypes in the context of genomic prediction.
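Of the three haplotype construction methods named in this abstract, fixed-window blocks are the simplest to illustrate. The sketch below is a generic illustration of the idea, not the FixedHB implementation evaluated in the paper. The function name, the window size, and the toy phased data are hypothetical, and the input is assumed to be already phased (for example, doubled haploid lines, which carry a single haplotype per line).

```python
import numpy as np

def fixed_window_haplotypes(haps, window):
    """Code haplotype alleles in consecutive fixed-size SNP windows.

    haps   : (n_haplotypes, n_snps) array of phased 0/1 alleles
    window : number of SNPs per block (hypothetical parameter)

    Returns an (n_haplotypes, n_blocks) integer array in which each distinct
    allele pattern within a block is given its own code.
    """
    n_snps = haps.shape[1]
    coded = []
    for start in range(0, n_snps, window):
        block = haps[:, start:start + window]
        # Distinct row patterns in this block; the inverse index is the code.
        _, codes = np.unique(block, axis=0, return_inverse=True)
        coded.append(codes.reshape(-1))
    return np.column_stack(coded)

# Toy data: four haplotypes, four SNPs, two SNPs per block.
haps = np.array([[0, 0, 1, 1],
                 [0, 0, 1, 0],
                 [1, 0, 1, 1],
                 [0, 0, 1, 1]])
codes = fixed_window_haplotypes(haps, window=2)
print(codes)
```

The resulting codes are categorical, so a prediction model would typically expand each block into indicator columns (one per haplotype allele) before building a relationship matrix or fitting marker effects.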
Affiliation(s)
- Yan-Cheng Lin
- Chair of Plant Breeding, TUM School of Life Sciences, Technical University of Munich, Freising, Germany
- Manfred Mayer
- Chair of Plant Breeding, TUM School of Life Sciences, Technical University of Munich, Freising, Germany
- Bayer CropScience Deutschland GmbH, Borken, Germany
- Daniel Valle Torres
- Chair of Plant Breeding, TUM School of Life Sciences, Technical University of Munich, Freising, Germany
- Sugar Beet Breeding, Strube Research GmbH & Co. KG, Söllingen, Germany
- Torsten Pook
- Animal Breeding and Genomics, Wageningen University & Research, Wageningen, Netherlands
- Armin C. Hölker
- Product Development Maize and Oil Crops, KWS SAAT SE & Co. KGaA, Einbeck, Germany
- Thomas Presterl
- Product Development Maize and Oil Crops, KWS SAAT SE & Co. KGaA, Einbeck, Germany
- Milena Ouzunova
- Product Development Maize and Oil Crops, KWS SAAT SE & Co. KGaA, Einbeck, Germany
- Chris-Carolin Schön
- Chair of Plant Breeding, TUM School of Life Sciences, Technical University of Munich, Freising, Germany