1
Alemu A, Åstrand J, Montesinos-López OA, Isidro Y Sánchez J, Fernández-Gónzalez J, Tadesse W, Vetukuri RR, Carlsson AS, Ceplitis A, Crossa J, Ortiz R, Chawade A. Genomic selection in plant breeding: Key factors shaping two decades of progress. MOLECULAR PLANT 2024; 17:552-578. [PMID: 38475993 DOI: 10.1016/j.molp.2024.03.007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/03/2023] [Revised: 01/22/2024] [Accepted: 03/08/2024] [Indexed: 03/14/2024]
Abstract
Genomic selection, the application of genomic prediction (GP) models to select candidate individuals, has significantly advanced in the past two decades, effectively accelerating genetic gains in plant breeding. This article provides a holistic overview of key factors that have influenced GP in plant breeding during this period. We delved into the pivotal roles of training population size and genetic diversity, and their relationship with the breeding population, in determining GP accuracy. Special emphasis was placed on optimizing training population size. We explored its benefits and the associated diminishing returns beyond an optimum size. This was done while considering the balance between resource allocation and maximizing prediction accuracy through current optimization algorithms. The density and distribution of single-nucleotide polymorphisms, level of linkage disequilibrium, genetic complexity, trait heritability, statistical machine-learning methods, and non-additive effects are the other vital factors. Using wheat, maize, and potato as examples, we summarize the effect of these factors on the accuracy of GP for various traits. The search for high accuracy in GP (theoretically reaching one when Pearson's correlation is used as the metric) is an active research area, and accuracy is as yet far from optimal for various traits. We hypothesize that very large genotypic and phenotypic datasets, effective training population optimization methods, and support from other omics approaches (transcriptomics, metabolomics, and proteomics), coupled with deep-learning algorithms, could overcome current limitations to achieve the highest possible prediction accuracy, making genomic selection an effective tool in plant breeding.
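The abstract measures GP accuracy with Pearson's correlation, whose theoretical maximum is one. As a minimal sketch of that metric, the snippet below correlates observed phenotypes with predicted values. The function name `prediction_accuracy` and the numbers are hypothetical illustrations, not data from the article.

```python
import math


def prediction_accuracy(observed, predicted):
    """Pearson correlation between observed and predicted values."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equally long sequences with at least 2 values")
    n = len(observed)
    mean_obs = sum(observed) / n
    mean_pred = sum(predicted) / n
    # Sums of cross-products and squares around the means.
    cov = sum((o - mean_obs) * (p - mean_pred) for o, p in zip(observed, predicted))
    var_obs = sum((o - mean_obs) ** 2 for o in observed)
    var_pred = sum((p - mean_pred) ** 2 for p in predicted)
    if var_obs == 0 or var_pred == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_obs * var_pred)


if __name__ == "__main__":
    # Hypothetical grain yields (t/ha) for five validation lines.
    observed = [6.1, 7.4, 5.8, 6.9, 7.0]
    predicted = [6.3, 7.1, 6.0, 6.5, 7.2]
    print(round(prediction_accuracy(observed, predicted), 3))
```

In practice this correlation is usually computed on lines held out from model training, so that it reflects prediction rather than fit.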
Affiliation(s)
- Admas Alemu
  - Department of Plant Breeding, Swedish University of Agricultural Sciences, Alnarp, Sweden
- Johanna Åstrand
  - Department of Plant Breeding, Swedish University of Agricultural Sciences, Alnarp, Sweden; Lantmännen Lantbruk, Svalöv, Sweden
- Julio Isidro Y Sánchez
  - Centro de Biotecnología y Genómica de Plantas (CBGP, UPM-INIA), Universidad Politécnica de Madrid (UPM) - Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria (INIA), Campus de Montegancedo-UPM, 28223 Madrid, Spain
- Javier Fernández-Gónzalez
  - Centro de Biotecnología y Genómica de Plantas (CBGP, UPM-INIA), Universidad Politécnica de Madrid (UPM) - Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria (INIA), Campus de Montegancedo-UPM, 28223 Madrid, Spain
- Wuletaw Tadesse
  - International Center for Agricultural Research in the Dry Areas (ICARDA), Rabat, Morocco
- Ramesh R Vetukuri
  - Department of Plant Breeding, Swedish University of Agricultural Sciences, Alnarp, Sweden
- Anders S Carlsson
  - Department of Plant Breeding, Swedish University of Agricultural Sciences, Alnarp, Sweden
- José Crossa
  - International Maize and Wheat Improvement Center (CIMMYT), Km 45, Carretera México-Veracruz, Texcoco, México 52640, Mexico
- Rodomiro Ortiz
  - Department of Plant Breeding, Swedish University of Agricultural Sciences, Alnarp, Sweden
- Aakash Chawade
  - Department of Plant Breeding, Swedish University of Agricultural Sciences, Alnarp, Sweden
2
Weber SE, Frisch M, Snowdon RJ, Voss-Fels KP. Haplotype blocks for genomic prediction: a comparative evaluation in multiple crop datasets. FRONTIERS IN PLANT SCIENCE 2023; 14:1217589. [PMID: 37731980 PMCID: PMC10507710 DOI: 10.3389/fpls.2023.1217589] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/05/2023] [Accepted: 08/21/2023] [Indexed: 09/22/2023]
Abstract
In modern plant breeding, genomic selection is becoming the gold standard for selection of superior genotypes. The basis for genomic prediction models is a set of phenotyped lines along with their genotypic profiles. With high marker density and linkage disequilibrium (LD) between markers, genotype data in breeding populations tend to exhibit considerable redundancy. Therefore, interest is growing in the use of haplotype blocks to overcome redundancy by summarizing co-inherited features. Moreover, haplotype blocks can help to capture local epistasis caused by interacting loci. Here, we compared genomic prediction methods that used either single SNPs or haplotype blocks with regard to their prediction accuracy for important traits in crop datasets. We used four published datasets from canola, maize, wheat and soybean. Different approaches to construct haplotype blocks were compared, including blocks based on LD, physical distance, number of adjacent markers and the algorithms implemented in the software "Haploview" and "HaploBlocker". The tested prediction methods included Genomic Best Linear Unbiased Prediction (GBLUP), Extended GBLUP to account for additive by additive epistasis (EGBLUP), Bayesian LASSO and Reproducing Kernel Hilbert Space (RKHS) regression. We found improved prediction accuracy in some traits when using haplotype blocks compared to SNP-based predictions; however, the magnitude of improvement was very trait- and model-specific. Especially in settings with low marker density, haplotype blocks can improve genomic prediction accuracy. In most cases, physically large haplotype blocks yielded a strong decrease in prediction accuracy. Especially when prediction accuracy varies greatly across different prediction models, prediction based on haplotype blocks can improve prediction accuracy of underperforming models. However, there is no "best" method to build haplotype blocks, since prediction accuracy varied considerably across methods and traits. Hence, criteria used to define haplotype blocks should not be viewed as fixed biological parameters, but rather as hyperparameters that need to be adjusted for every dataset.
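One of the block definitions named in the abstract is a fixed number of adjacent markers. The sketch below illustrates that idea only: it cuts phased haplotypes into windows of `block_size` markers and counts, per individual, the copies (0, 1 or 2) of each haplotype allele observed in each window. The function names, the toy data, and the count-based encoding are assumptions made for illustration; they are not taken from the study, Haploview, or HaploBlocker.

```python
def fixed_window_blocks(n_markers, block_size):
    """Return (start, end) index pairs covering the markers in fixed windows."""
    return [(start, min(start + block_size, n_markers))
            for start in range(0, n_markers, block_size)]


def haplotype_allele_counts(phased, block_size):
    """Encode diploid individuals as counts of haplotype alleles per block.

    phased: list of individuals, each a pair of haplotypes given as
            equal-length tuples of 0/1 marker alleles (hypothetical input).
    Returns (labels, matrix): labels[j] = (start, end, allele) describes
    column j; matrix[i][j] is the number of copies carried by individual i.
    """
    n_markers = len(phased[0][0])
    labels, columns = [], []
    for start, end in fixed_window_blocks(n_markers, block_size):
        # Distinct marker combinations seen in this block.
        alleles = sorted({hap[start:end] for ind in phased for hap in ind})
        for allele in alleles:
            labels.append((start, end, allele))
            columns.append([sum(hap[start:end] == allele for hap in ind)
                            for ind in phased])
    # Transpose so that rows are individuals and columns are haplotype alleles.
    matrix = [list(row) for row in zip(*columns)]
    return labels, matrix


if __name__ == "__main__":
    # Two hypothetical individuals, four markers, blocks of two markers.
    phased = [((0, 1, 1, 0), (0, 1, 0, 0)),
              ((1, 0, 1, 0), (0, 1, 1, 0))]
    labels, matrix = haplotype_allele_counts(phased, block_size=2)
    for row in matrix:
        print(row)
```

Treating `block_size` as a tunable argument mirrors the abstract's conclusion that block criteria behave like hyperparameters to be adjusted per dataset.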
Affiliation(s)
- Sven E. Weber
  - Department of Plant Breeding, Justus Liebig University, Giessen, Germany
- Matthias Frisch
  - Department of Biometry and Population Genetics, Justus Liebig University, Giessen, Germany
- Rod J. Snowdon
  - Department of Plant Breeding, Justus Liebig University, Giessen, Germany
- Kai P. Voss-Fels
  - Institute for Grapevine Breeding, Hochschule Geisenheim University, Geisenheim, Germany
3
He S, Liang S, Meng L, Cao L, Ye G. Sparse Phenotyping and Haplotype-Based Models for Genomic Prediction in Rice. RICE (NEW YORK, N.Y.) 2023; 16:27. [PMID: 37284992 DOI: 10.1186/s12284-023-00643-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/31/2022] [Accepted: 05/20/2023] [Indexed: 06/08/2023]
Abstract
Multi-environment genomic selection enables plant breeders to select varieties that are resilient to diverse environments or specifically adapted to particular environments, and it holds great potential for rice breeding. Realizing multi-environment genomic selection requires a robust training set with multi-environment phenotypic data. Given the large potential of sparse phenotyping enhanced by genomic prediction to reduce the cost of multi-environment trials (MET), the establishment of a multi-environment training set could also benefit from it. Optimizing genomic prediction methods is also crucial for enhancing multi-environment genomic selection. Haplotype-based genomic prediction models can capture local epistatic effects, which can be conserved and accumulated across generations much like additive effects, thereby benefiting breeding. However, previous studies often used fixed-length haplotypes composed of a few adjacent molecular markers, disregarding linkage disequilibrium (LD), which plays an essential role in determining haplotype length. In our study, based on three rice populations of different sizes and compositions, we investigated the usefulness and effectiveness of multi-environment training sets with varying phenotyping intensities and of different haplotype-based genomic prediction models based on LD-derived haplotype blocks for two agronomic traits: days to heading (DTH) and plant height (PH). Results showed that phenotyping merely 30% of the records in a multi-environment training set can provide prediction accuracy comparable to that at high phenotyping intensities; that local epistatic effects are very likely present in DTH; that dividing LD-derived haplotype blocks into small segments of two or three single-nucleotide polymorphisms (SNPs) helps to maintain the predictive ability of haplotype-based models in large populations; and that modelling the covariances between environments improves genomic prediction accuracy. Our study provides means to improve the efficiency of multi-environment genomic selection in rice.
Affiliation(s)
- Sang He
  - Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, 518124, China
  - CAAS-IRRI Joint Laboratory for Genomics-Assisted Germplasm Enhancement, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, 518124, China
- Shanshan Liang
  - Tianjin Key Laboratory of Animal and Plant Resistance, College of Life Sciences, Tianjin Normal University, Tianjin, 300387, China
- Lijun Meng
  - Kunpeng Institute of Modern Agriculture at Foshan, Foshan, 528200, China
- Liyong Cao
  - Key Laboratory for Zhejiang Super Rice Research, China National Rice Research Institute, Hangzhou, 310006, China
- Guoyou Ye
  - CAAS-IRRI Joint Laboratory for Genomics-Assisted Germplasm Enhancement, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, 518124, China
  - Rice Breeding Innovations Platform, International Rice Research Institute, Metro Manila, Philippines
4
Dreisigacker S, Pérez-Rodríguez P, Crespo-Herrera L, Bentley AR, Crossa J. Results from rapid-cycle recurrent genomic selection in spring bread wheat. G3 (BETHESDA, MD.) 2023; 13:jkad025. [PMID: 36702618 PMCID: PMC10085763 DOI: 10.1093/g3journal/jkad025] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 11/11/2022] [Revised: 01/18/2023] [Accepted: 01/19/2023] [Indexed: 01/28/2023]
Abstract
Genomic selection (GS) in wheat breeding programs is of great interest for predicting the genotypic values of individuals, where both additive and nonadditive effects determine the final breeding value of lines. While several simulation studies have shown the efficiency of rapid-cycling GS strategies for parental selection or population improvement, their practical implementations are still lacking in wheat and other crops. In this study, we demonstrate the potential of rapid-cycle recurrent GS (RCRGS) to increase genetic gain for grain yield (GY) in wheat. Our results showed a consistent realized genetic gain for GY after 3 cycles of recombination (C1, C2, and C3) of bi-parental F1s, when summarized across 2 years of phenotyping. For both evaluation years combined, genetic gain through RCRGS reached 12.3% from cycle C0 to C3 and realized gain was 0.28 ton ha⁻¹ per cycle with a GY from C0 (6.88 ton ha⁻¹) to C3 (7.73 ton ha⁻¹). RCRGS was also associated with some changes in important agronomic traits that were measured (days to heading, days to maturity, and plant height) but not selected for. To account for these changes, we recommend implementing GS together with multi-trait prediction models.
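The reported gain figures can be roughly cross-checked from the two grain-yield means in the abstract. The sketch below is a simple arithmetic check, assuming the percentage is expressed relative to C0 and that the per-cycle gain is the total difference divided by the three cycles; the authors may have estimated these quantities differently (for example from unrounded means or a regression over cycles), so small deviations are expected.

```python
def relative_gain_percent(start_mean, end_mean):
    """Total gain expressed as a percentage of the starting mean."""
    return 100.0 * (end_mean - start_mean) / start_mean


def gain_per_cycle(start_mean, end_mean, n_cycles):
    """Average gain per cycle, assuming equal steps between cycles."""
    return (end_mean - start_mean) / n_cycles


if __name__ == "__main__":
    gy_c0 = 6.88   # grain yield at C0 (ton/ha), from the abstract
    gy_c3 = 7.73   # grain yield at C3 (ton/ha), from the abstract
    n_cycles = 3   # C0 -> C1 -> C2 -> C3

    print(f"relative gain: {relative_gain_percent(gy_c0, gy_c3):.2f} %")
    print(f"gain per cycle: {gain_per_cycle(gy_c0, gy_c3, n_cycles):.3f} ton/ha")
```

Both values land close to the 12.3% and 0.28 ton ha⁻¹ per cycle stated in the abstract.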
Affiliation(s)
- Susanne Dreisigacker
  - International Maize and Wheat Improvement Center (CIMMYT), Km 45 Carretera México-Veracruz, Texcoco, Edo. de México, CP 56100, México
- Leonardo Crespo-Herrera
  - International Maize and Wheat Improvement Center (CIMMYT), Km 45 Carretera México-Veracruz, Texcoco, Edo. de México, CP 56100, México
- Alison R Bentley
  - International Maize and Wheat Improvement Center (CIMMYT), Km 45 Carretera México-Veracruz, Texcoco, Edo. de México, CP 56100, México
- José Crossa
  - International Maize and Wheat Improvement Center (CIMMYT), Km 45 Carretera México-Veracruz, Texcoco, Edo. de México, CP 56100, México
  - Colegio de Postgraduados, Montecillos, Edo. de México, CP 56264, México
5
Terraillon J, Roeber FK, Flachenecker C, Frisch M. Training set designs for prediction of yield and moisture of maize test cross hybrids with unreplicated trials. FRONTIERS IN PLANT SCIENCE 2023; 14:1080087. [PMID: 36950349 PMCID: PMC10025381 DOI: 10.3389/fpls.2023.1080087] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/25/2022] [Accepted: 02/03/2023] [Indexed: 06/18/2023]
Abstract
Unreplicated field trials and genomic prediction are both used to enhance efficiency in the early selection stages of a hybrid maize breeding program. No results are available on the optimal experimental design when combining both approaches. Our objective was to investigate the effect of the training set design on the accuracy of genomic prediction in unreplicated maize test crosses. We carried out a cross-validation study on the basis of an experimental data set consisting of 1436 hybrids evaluated for yield and moisture, for which genotyping information on 461 SNP markers was available. Training set designs of different sizes, implementing within-environment prediction, within-year prediction, across-year prediction, and combinations of data sources across years and environments, were compared with respect to their prediction accuracy. Across-year prediction did not reach prediction accuracies that are useful for genomic selection. Within-year prediction across environments provided useful correlations between observed and predicted breeding values. The prediction accuracies did not improve when data from previous years were added to the training set. We conclude that using all data available from unreplicated tests of the current breeding cycle provides good accuracy for predicting test crosses, whereas adding data from previous breeding cycles, in which the genotypes are less related to the tested material, has only limited value for increasing the prediction accuracy.
Affiliation(s)
- Jérôme Terraillon
  - Institute of Agronomy and Plant Breeding II, Justus Liebig University, Giessen, Germany
- Matthias Frisch
  - Institute of Agronomy and Plant Breeding II, Justus Liebig University, Giessen, Germany
6
Biswas PS, Ahmed MME, Afrin W, Rahman A, Shalahuddin AKM, Islam R, Akter F, Syed MA, Sarker MRA, Ifterkharuddaula KM, Islam MR. Enhancing genetic gain through the application of genomic selection in developing irrigated rice for the favorable ecosystem in Bangladesh. Front Genet 2023; 14:1083221. [PMID: 36911402 PMCID: PMC9992429 DOI: 10.3389/fgene.2023.1083221] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 10/28/2022] [Accepted: 01/18/2023] [Indexed: 02/24/2023]
Abstract
By increasing the selection differential and decreasing cycle time, the rate of genetic improvement can be accelerated. Creating and capturing higher genetic with higher accuracy within the shortest possible time is the prerequisite for enhancing genetic gain for any trait. Comprehensive multi-location yield testing in early generations, together with the shortest possible line fixation time, can expedite the rapid recycling of parents in the breeding program through recurrent selection. Genomic selection is efficient in capturing individuals with high breeding values by taking the additive genetic effects of all genes into account, with and without extensive field testing, thus reducing breeding cycle time and enhancing genetic gain. At the Bangladesh Rice Research Institute, GS technology together with trait-specific marker-assisted selection (MAS) in early generations of RGA-derived breeding lines showed a prediction accuracy of 0.454-0.701 with a relative efficiency of 0.989-2.623 over four consecutive years of application. This study reports that the application of GS together with trait-specific MAS has expedited yield improvement by 117 kg ha⁻¹ year⁻¹, which is around seven-fold larger than the baseline annual genetic gain, and shortened the breeding cycle by around 1.5 years from the existing 4.5 years.
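The link the abstract draws between selection differential, accuracy, cycle time, and genetic gain is commonly summarized by the breeder's equation. The form below is the standard textbook expression, given here as an aid to reading and not as the authors' own model; the symbols are conventional notation that does not appear in the abstract: i is the selection intensity, r the selection accuracy, σ_A the additive genetic standard deviation, and L the cycle length in years.

```latex
% Expected genetic gain per year (breeder's equation, textbook form)
\Delta G_{\text{year}} = \frac{i \, r \, \sigma_A}{L}

% Effect of cycle length alone, holding i, r and \sigma_A fixed:
% shortening L from 4.5 years to 3 years (the reduction of about
% 1.5 years reported in the abstract) scales the expected gain by
\frac{\Delta G_{L=3}}{\Delta G_{L=4.5}} = \frac{4.5}{3} = 1.5
```

Under these assumptions, cycle shortening alone accounts for a 1.5-fold increase, so the roughly seven-fold increase reported in the abstract would also have to reflect changes in the other terms.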
Affiliation(s)
- Partha S Biswas
  - Plant Breeding Division, Bangladesh Rice Research Institute, Gazipur, Bangladesh
- M M Emam Ahmed
  - Plant Breeding Division, Bangladesh Rice Research Institute, Gazipur, Bangladesh
- Wazifa Afrin
  - Plant Breeding Division, Bangladesh Rice Research Institute, Gazipur, Bangladesh
- Anisar Rahman
  - Plant Breeding Division, Bangladesh Rice Research Institute, Gazipur, Bangladesh
- A K M Shalahuddin
  - Plant Breeding Division, Bangladesh Rice Research Institute, Gazipur, Bangladesh
- Rafiqul Islam
  - Plant Breeding Division, Bangladesh Rice Research Institute, Gazipur, Bangladesh
- Fahamida Akter
  - Plant Breeding Division, Bangladesh Rice Research Institute, Gazipur, Bangladesh
- Md Abu Syed
  - Plant Breeding Division, Bangladesh Rice Research Institute, Gazipur, Bangladesh
- Md Ruhul Amin Sarker
  - Plant Breeding Division, Bangladesh Rice Research Institute, Gazipur, Bangladesh
- K M Ifterkharuddaula
  - Plant Breeding Division, Bangladesh Rice Research Institute, Gazipur, Bangladesh