1
Cheng D, Li J, Guo S, Wang Y, Xu S, Chen S, Liu W. Genomic Prediction for Germplasm Improvement Through Inter-Heterotic-Group Line Crossing in Maize. Int J Mol Sci 2025; 26:2662. [PMID: 40141304 PMCID: PMC11942448 DOI: 10.3390/ijms26062662] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/15/2025] [Revised: 03/02/2025] [Accepted: 03/12/2025] [Indexed: 03/28/2025] Open
Abstract
Germplasm improvement is essential for maize breeding. Currently, intra-heterotic-group crossing is the major method for germplasm improvement, while inter-heterotic-group crossing is also used in breeding but not in a systematic way. In this study, five inbred lines from four heterotic groups were used to develop a connected segregating population through inter-heterotic-group line crossing (CSPIC), which comprised 5 subpopulations with 535 doubled haploid (DH) lines and 15 related test-cross populations including 1568 hybrids. Significant genetic variation was observed in most subpopulations, with several DH populations exhibiting superior phenotypes regarding traits such as plant height (PH), ear height (EH), days to anthesis (DTA), and days to silking (DTS). Notably, 10.8% of hybrids in the population POP5/C229 surpassed the high-yielding hybrid ND678 (CK). To reduce field planting costs and quickly screen for the best inter-heterotic-group DH lines and test-cross hybrids, we assessed the accuracy of genomic selection (GS) for within- and between-population predictions in the DH populations and the test-cross populations. Within the DH or the hybrid population, the prediction accuracy varied across populations and traits, with an average hybrid yield prediction accuracy of 0.41, reaching 0.54 in POP5/Z58. In the cross DH population predictions, the prediction accuracy of the half-sib population exceeded that of the non-sib cross population prediction, with the highest accuracy observed when the non-shared parents were from the same heterotic group, and the average phenotypic prediction accuracies of POP3 predicting POP2 and POP2 predicting POP3 were 0.54 and 0.45, respectively. In the cross hybrid population predictions, the accuracy was highest when both the training and the test sets came from the same DH populations, with an average accuracy of 0.43. 
The proportion of shared polymorphisms with respect to SNPs between the training and the test sets (PSP) exhibited a significant and strong correlation with the prediction accuracy of cross population prediction. This study demonstrates the feasibility of creating new heterotic groups through inter-heterotic-group crossing in germplasm improvement, and some cross population prediction patterns exhibited excellent prediction accuracy.
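The PSP statistic mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formula, so the definition used here (SNPs polymorphic in both sets, divided by SNPs polymorphic in the test set) is an assumption, and the function names and toy genotypes are hypothetical.

```python
import numpy as np

def polymorphic_mask(genotypes: np.ndarray) -> np.ndarray:
    """Return True for each SNP column that shows more than one genotype value."""
    return genotypes.min(axis=0) != genotypes.max(axis=0)

def shared_polymorphism_proportion(train: np.ndarray, test: np.ndarray) -> float:
    """Assumed PSP: SNPs polymorphic in both sets / SNPs polymorphic in the test set."""
    train_poly = polymorphic_mask(train)
    test_poly = polymorphic_mask(test)
    n_test_poly = int(test_poly.sum())
    if n_test_poly == 0:
        return float("nan")  # undefined when the test set has no segregating SNP
    return float((train_poly & test_poly).sum() / n_test_poly)

# Toy DH-line genotypes (rows: lines, columns: SNPs), coded 0/2 because
# doubled haploid lines are homozygous. Values are made up for illustration.
train = np.array([[0, 2, 0, 2],
                  [2, 2, 0, 0],
                  [0, 2, 2, 0]])
test = np.array([[0, 0, 2, 2],
                 [2, 2, 2, 2],
                 [2, 0, 2, 0]])

psp = shared_polymorphism_proportion(train, test)
print(f"PSP = {psp:.3f}")
```

Under this assumed definition, a training set that segregates at few of the SNPs segregating in the test set yields a low PSP, which is consistent with the lower cross-population accuracies the abstract reports for less related populations.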
Affiliation(s)
- Dehe Cheng
- State Key Laboratory of Maize Bio-Breeding, National Maize Improvement Center, College of Agronomy and Biotechnology, China Agricultural University, Beijing 100193, China
- Jinlong Li
- State Key Laboratory of Maize Bio-Breeding, National Maize Improvement Center, College of Agronomy and Biotechnology, China Agricultural University, Beijing 100193, China
- Shuwei Guo
- State Key Laboratory of Maize Bio-Breeding, National Maize Improvement Center, College of Agronomy and Biotechnology, China Agricultural University, Beijing 100193, China
- Yuandong Wang
- Maize Research Institute, Beijing Academy of Agriculture and Forestry Sciences, Beijing 100097, China
- Shizhong Xu
- Department of Botany and Plant Sciences, University of California, Riverside, CA 92521, USA
- Shaojiang Chen
- State Key Laboratory of Maize Bio-Breeding, National Maize Improvement Center, College of Agronomy and Biotechnology, China Agricultural University, Beijing 100193, China
- Wenxin Liu
- State Key Laboratory of Maize Bio-Breeding, National Maize Improvement Center, College of Agronomy and Biotechnology, China Agricultural University, Beijing 100193, China
2
Building a Calibration Set for Genomic Prediction, Characteristics to Be Considered, and Optimization Approaches. Methods Mol Biol 2022; 2467:77-112. [PMID: 35451773 DOI: 10.1007/978-1-0716-2205-6_3] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 10/25/2022]
Abstract
The efficiency of genomic selection strongly depends on the prediction accuracy of the genetic merit of candidates. Numerous papers have shown that the composition of the calibration set is a key contributor to prediction accuracy. A poorly defined calibration set can result in low accuracies, whereas an optimized one can considerably increase accuracy compared to random sampling of the same size. Alternatively, optimizing the calibration set can be a way of decreasing the costs of phenotyping by enabling similar levels of accuracy compared to random sampling but with fewer phenotypic units. We present here the different factors that have to be considered when designing a calibration set, and review the different criteria proposed in the literature. We classified these criteria into two groups: model-free criteria based on relatedness, and criteria derived from the linear mixed model. We introduce criteria targeting specific prediction objectives including the prediction of highly diverse panels, biparental families, or hybrids. We also review different ways of updating the calibration set, and different procedures for optimizing phenotyping experimental designs.
3
Elsen JM. Genomic Prediction of Complex Traits, Principles, Overview of Factors Affecting the Reliability of Genomic Prediction, and Algebra of the Reliability. Methods Mol Biol 2022; 2467:45-76. [PMID: 35451772 DOI: 10.1007/978-1-0716-2205-6_2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
The quality of the predictions of genetic values based on the genotyping of neutral markers (GEBVs) is key information for deciding whether or not to implement genomic selection. This quality depends on the part of the genetic variability captured by the markers and on the precision of the estimate of their effects. Selection index theory provided the framework for evaluating the accuracy of GEBVs once the information had been gathered, with the genomic relationship matrix (GRM) playing a central role. When this accuracy must be known a priori, the theory of quantitative genetics gives clues to calculate the expectation of this GRM. This chapter makes a critical inventory of the methods developed to calculate these accuracies a posteriori and a priori. The most significant factors affecting this accuracy are described (size of the reference population, number of markers, linkage disequilibrium, heritability).
Affiliation(s)
- Jean-Michel Elsen
- GenPhySE, Université de Toulouse, INRAE, ENVT, Castanet Tolosan, France.
4
Picard Druet D, Varenne A, Herry F, Hérault F, Allais S, Burlot T, Le Roy P. Reliability of genomic evaluation for egg quality traits in layers. BMC Genet 2020; 21:17. [PMID: 32046634 PMCID: PMC7014768 DOI: 10.1186/s12863-020-0820-2] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 07/19/2019] [Accepted: 01/31/2020] [Indexed: 11/17/2022] Open
Abstract
Background: Genomic evaluation, based on the use of thousands of genetic markers in addition to pedigree and phenotype information, has become the standard evaluation methodology in dairy cattle breeding programmes over the past several years. Despite the many differences between dairy cattle breeding and poultry breeding, genomic selection seems very promising for the avian sector, and studies are currently being conducted to optimize avian selection schemes. For this optimization, one of the key requirements is to properly predict the accuracy of genomic evaluation in pure line layers. Results: Genomic evaluation, whether performed on males or females, always proved more accurate than genetic evaluation. The gain was higher when phenotypic information was narrowed, and increasing the size of the reference population led to an increase in the accuracy of genomic evaluation. By taking into account the increase of selection intensity and the decrease of the generation interval induced by genomic selection, the expected annual genetic gain would be higher with ancestry-based genomic evaluation of male candidates than with genetic evaluation based on collaterals. This advantage of genomic selection over genetic selection requires more detailed further study for female candidates. Conclusions: In the population studied, the genomic evaluation of egg quality traits of breeding birds at birth seems to be a promising strategy, at least for the selection of males.
Affiliation(s)
- David Picard Druet
- PEGASE, INRAE, Agrocampus Ouest, 16 Le Clos, Saint-Gilles, 35590, France
- Florian Herry
- PEGASE, INRAE, Agrocampus Ouest, 16 Le Clos, Saint-Gilles, 35590, France; NOVOGEN, 5, rue des Compagnons, Plédran, 22960, France
- Frédéric Hérault
- PEGASE, INRAE, Agrocampus Ouest, 16 Le Clos, Saint-Gilles, 35590, France
- Sophie Allais
- PEGASE, INRAE, Agrocampus Ouest, 16 Le Clos, Saint-Gilles, 35590, France
- Pascale Le Roy
- PEGASE, INRAE, Agrocampus Ouest, 16 Le Clos, Saint-Gilles, 35590, France
5
Mangin B, Rincent R, Rabier CE, Moreau L, Goudemand-Dugue E. Training set optimization of genomic prediction by means of EthAcc. PLoS One 2019; 14:e0205629. [PMID: 30779753 PMCID: PMC6380617 DOI: 10.1371/journal.pone.0205629] [Citation(s) in RCA: 20] [Impact Index Per Article: 3.3] [Received: 09/25/2018] [Accepted: 01/03/2019] [Indexed: 12/17/2022] Open
Abstract
Genomic prediction is a useful tool for plant and animal breeding programs and is starting to be used to predict human diseases as well. A shortcoming that slows down the genomic selection deployment is that the accuracy of the prediction is not known a priori. We propose EthAcc (Estimated THeoretical ACCuracy) as a method for estimating the accuracy given a training set that is genotyped and phenotyped. EthAcc is based on a causal quantitative trait loci model estimated by a genome-wide association study. This estimated causal model is crucial; therefore, we compared different methods to find the one yielding the best EthAcc. The multilocus mixed model was found to perform the best. We compared EthAcc to accuracy estimators that can be derived via a mixed marker model. We showed that EthAcc is the only approach to correctly estimate the accuracy. Moreover, in case of a structured population, in accordance with the achieved accuracy, EthAcc showed that the biggest training set is not always better than a smaller and closer training set. We then performed training set optimization with EthAcc and compared it to CDmean. EthAcc outperformed CDmean on real datasets from sugar beet, maize, and wheat. Nonetheless, its performance was mainly due to the use of an optimal but inaccessible set as a start of the optimization algorithm. EthAcc's precision and algorithm issues prevent it from reaching a good training set with a random start. Despite this drawback, we demonstrated that a substantial gain in accuracy can be obtained by performing training set optimization.
Affiliation(s)
- Brigitte Mangin
- LIPM, Université de Toulouse, INRA, CNRS, Castanet-Tolosan, France
- Charles-Elie Rabier
- ISEM, Univ. Montpellier, CNRS, EPHE, IRD, Montpellier, France
- LIRMM, Univ. Montpellier, CNRS, Montpellier, France
- Laurence Moreau
- GQE-Le Moulon, INRA, Univ Paris-Sud, CNRS, AgroParisTech, Université Paris-Saclay, Gif-sur-Yvette, France
6
Rio S, Mary-Huard T, Moreau L, Charcosset A. Genomic selection efficiency and a priori estimation of accuracy in a structured dent maize panel. Theor Appl Genet 2019; 132:81-96. [PMID: 30288553 DOI: 10.1007/s00122-018-3196-1] [Citation(s) in RCA: 30] [Impact Index Per Article: 5.0] [Received: 04/25/2018] [Accepted: 09/22/2018] [Indexed: 06/08/2023]
Abstract
Population structure affects genomic selection efficiency as well as the ability to forecast accuracy using standard GBLUP. Genomic prediction models usually assume that the individuals used for calibration belong to the same population as those to be predicted. Most of the a priori indicators of precision, such as the coefficient of determination (CD), were derived from those same models. But genetic structure is a common feature in plant species, and it may impact genomic selection efficiency and the ability to forecast prediction accuracy. We investigated the impact of genetic structure in a dent maize panel ("Amaizing Dent") using different scenarios including within- or across-group predictions. For a given training set size, the best accuracies were achieved when predicting individuals using a model calibrated on the same genetic group. Nevertheless, a diverse training set representing all the groups had a certain predictive efficiency for all the validation sets, and adding extra-group individuals was almost always beneficial. It underlines the potential of such a generic training set for dent maize genomic selection applications. Alternative prediction models, taking genetic structure explicitly into account, did not improve the prediction accuracy compared to GBLUP. We also investigated the ability of different indicators of precision to forecast accuracy in the within- or across-group scenarios. There was a global encouraging trend of the CD to differentiate scenarios, although there were specific combinations of target populations and traits where the efficiency of this indicator proved to be null. One hypothesis to explain such erratic performances is the impact of genetic structure through group-specific allele diversity at QTLs rather than group-specific allele effects.
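The coefficient of determination (CD) referred to in this abstract can be sketched for a plain GBLUP setting. This is a simplified illustration, not the authors' code: it assumes a VanRaden-style genomic relationship matrix, a known heritability, and no fixed effects (not even an intercept), and all function names and the simulated genotypes are hypothetical.

```python
import numpy as np

def vanraden_grm(markers: np.ndarray) -> np.ndarray:
    """Genomic relationship matrix from an individuals x SNPs matrix coded 0/1/2."""
    p = markers.mean(axis=0) / 2.0            # allele frequencies from the sample
    centered = markers - 2.0 * p              # center each SNP column
    scale = 2.0 * np.sum(p * (1.0 - p))       # expected-variance scaling factor
    return centered @ centered.T / scale

def coefficient_of_determination(grm, train_idx, target_idx, h2):
    """CD of each unphenotyped target individual, ignoring fixed effects."""
    lam = (1.0 - h2) / h2                     # residual-to-genetic variance ratio
    g_tt = grm[np.ix_(train_idx, train_idx)]  # relationships within the training set
    g_pt = grm[np.ix_(target_idx, train_idx)] # targets versus training set
    v = g_tt + lam * np.eye(len(train_idx))
    # diag(G_pt V^-1 G_tp): variance of the predicted values, in units of genetic variance
    explained = np.einsum("ij,ji->i", g_pt, np.linalg.solve(v, g_pt.T))
    return explained / np.diag(grm)[target_idx]

# Simulated genotypes, for illustration only.
rng = np.random.default_rng(0)
markers = rng.integers(0, 3, size=(12, 200)).astype(float)
grm = vanraden_grm(markers)
cd = coefficient_of_determination(
    grm, train_idx=list(range(8)), target_idx=list(range(8, 12)), h2=0.5
)
print(cd.round(3))
```

Because the CD depends only on genotypes and an assumed heritability, it can be computed before any phenotype is collected, which is what makes it usable as an a priori indicator. The abstract's caveat still applies: a model that assumes a single homogeneous population may forecast accuracy poorly across genetic groups.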
Affiliation(s)
- Simon Rio
- GQE - Le Moulon, INRA, Univ. Paris-Sud, CNRS, AgroParisTech, Université Paris-Saclay, 91190, Gif-sur-Yvette, France
- Tristan Mary-Huard
- GQE - Le Moulon, INRA, Univ. Paris-Sud, CNRS, AgroParisTech, Université Paris-Saclay, 91190, Gif-sur-Yvette, France
- MIA, INRA, AgroParisTech, Université Paris-Saclay, 75005, Paris, France
- Laurence Moreau
- GQE - Le Moulon, INRA, Univ. Paris-Sud, CNRS, AgroParisTech, Université Paris-Saclay, 91190, Gif-sur-Yvette, France
- Alain Charcosset
- GQE - Le Moulon, INRA, Univ. Paris-Sud, CNRS, AgroParisTech, Université Paris-Saclay, 91190, Gif-sur-Yvette, France
8
Elsen JM. An analytical framework to derive the expected precision of genomic selection. Genet Sel Evol 2017; 49:95. [PMID: 29281960 PMCID: PMC5745666 DOI: 10.1186/s12711-017-0366-6] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 07/03/2017] [Accepted: 12/01/2017] [Indexed: 11/16/2022] Open
Abstract
Background: Formulae to predict the precision or accuracy of genomic estimated breeding values (GEBV) are important when modelling selection schemes. Simple versions of such formulae have been proposed in the past, based on a number of simplifying hypotheses, including absence of linkage disequilibrium and linkage between loci, a population made up of unrelated individuals, and that all genetic variability of the trait is explained by the genotyped loci. These formulae were based on approximations that were not always clear. The objective of this paper is to offer a unique framework to derive equations that predict the precision of GEBV from the size of the reference population and the heritability of and number of QTL controlling the quantitative trait. Results: The exact formulation of the precision of GEBV involves the expectation of the inverse of a linear function of the genomic matrix, which cannot be calculated from simple algebra but can be approximated using a Taylor polynomial expansion. First order approximations performed better than the initial prediction equations published in the literature. Second order approximations produced almost perfect estimates of precision when compared to results obtained when simulating situations that agreed with the assumptions that were required to derive the precision equations. Using this proposed framework, we present several generalizations, including multi-trait genomic evaluation. Conclusions: Although further improvements are needed to account for the complexity of practical situations, the equations proposed here can be used to derive the precision of GEBV when comparing breeding schemes a priori.
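As a point of reference, one widely used simple formula of the kind this abstract describes is commonly attributed to Daetwyler and colleagues: expected accuracy r = sqrt(N h² / (N h² + M)), where N is the reference population size, h² the heritability, and M the number of independent loci. The sketch below implements that simple form only, not the Taylor-expansion approximations derived in the paper, and it is an assumption that the abstract refers to this particular formula; the function and parameter names are hypothetical.

```python
import math

def expected_accuracy(n_reference: int, heritability: float, n_independent_loci: float) -> float:
    """Simple a priori accuracy of GEBV: sqrt(N*h2 / (N*h2 + M))."""
    signal = n_reference * heritability
    return math.sqrt(signal / (signal + n_independent_loci))

# Illustrative values only: accuracy rises with reference population size.
for n in (500, 1000, 5000):
    print(n, round(expected_accuracy(n, heritability=0.5, n_independent_loci=500), 3))
```

The formula makes the qualitative dependencies named in the abstract explicit: accuracy increases with the reference population size and the heritability, and decreases as more independent loci must be estimated.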
Affiliation(s)
- Jean-Michel Elsen
- GenPhySE (Génétique Physiologie et Systèmes d'Elevage), Université de Toulouse, INRA, ENVT, 31326, Castanet-Tolosan, France.