1
Ribeiro PCO, Howard R, Jarquin D, Oliveira ICM, Chaves S, Carneiro PCS, Souza VF, Schaffert RE, Damasceno CMB, Parrella RAC, Dias KOG, Pastina MM. Prediction of biomass sorghum hybrids using environmental feature-enriched genomic combining ability models in tropical environments. TAG. THEORETICAL AND APPLIED GENETICS. THEORETISCHE UND ANGEWANDTE GENETIK 2025; 138:113. [PMID: 40343517 DOI: 10.1007/s00122-025-04895-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/23/2023] [Accepted: 04/02/2025] [Indexed: 05/11/2025]
Abstract
KEY MESSAGE: Incorporating environmental features improved the predictive ability of genomic prediction models under multi-environment trials in tropical conditions. Gathering environmental and genomic information can benefit the breeding of sorghum hybrids by overcoming complications imposed by the genotype-by-environment interaction (GEI). In this study, we explored the value of combining environmental features (EFs) and genomic data to enhance predictions for biomass sorghum hybrid breeding, addressing GEI complexities. We also investigated if considering specific time windows for EFs improves the prediction. We used a historical dataset from a tropical biomass sorghum breeding program featuring 253 genotypes across 64 trials. Initially, a first-stage analysis was performed to obtain the adjusted means (EBLUEs) and scrutinize the impact of 29 EFs (geographic, climatic, and soil-related EFs) on GEI. Subsequently, in the second-stage analysis, we used data from 221 hybrids that had both parents genotyped to evaluate the predictive ability and assertiveness of 12 models with different effects. The most relevant EFs included soil organic carbon, insolation on a horizontal surface, longitude, temperature at dew point, and nitrogen content. Across three cross-validation scenarios (CV1, CV0, and CV00), the most effective model encompassed main combining ability effects, GEI, and GωI (genotype-by-specific environmental effects interaction), utilizing an environmental kinship matrix (Ω) derived from mean EF values. Only in CV2, a model with a similar structure but utilizing Ω from specific time windows outperformed others. Our findings highlight the potential of integrating environmental and genomic data to refine predictive models for optimizing biomass sorghum hybrid breeding strategies.
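The abstract does not say how the environmental kinship matrix Ω is computed from the mean EF values. A minimal sketch of one common construction follows, assuming each EF is centered and scaled across environments and Ω is the scaled cross-product of the standardized matrix. The function name `environmental_kinship` and the example values are hypothetical, not taken from the paper.

```python
from statistics import mean, pstdev


def environmental_kinship(ef_means):
    """Build an environment-by-environment similarity matrix from mean EF values.

    ef_means: one row per environment, one column per environmental feature.
    Assumption (not from the abstract): Omega = W W' / q, where W is the
    column-standardized feature matrix and q is the number of features.
    """
    n_env, n_ef = len(ef_means), len(ef_means[0])
    scaled_cols = []
    for col in zip(*ef_means):
        mu, sd = mean(col), pstdev(col)
        # A feature that does not vary across environments is zeroed out.
        scaled_cols.append([(v - mu) / sd if sd > 0 else 0.0 for v in col])
    w = list(zip(*scaled_cols))  # back to environments x features
    return [
        [sum(a * b for a, b in zip(w[i], w[j])) / n_ef for j in range(n_env)]
        for i in range(n_env)
    ]


# Hypothetical mean values for 3 trials and 2 features.
ef_means = [[1.2, 25.0], [2.0, 27.5], [3.1, 22.0]]
omega = environmental_kinship(ef_means)
```

With this scaling the average diagonal of Ω is 1 whenever every feature varies, which keeps the associated variance component on a scale comparable to the genomic kernels. A time-window variant would only change which EF summaries enter `ef_means`.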
Affiliation(s)
- Pedro C O Ribeiro
  - Department of General Biology, Federal University of Viçosa, Viçosa, Minas Gerais, Brazil
- Reka Howard
  - Department of Statistics, University of Nebraska - Lincoln (UNL), Lincoln, NE, USA
- Diego Jarquin
  - Department of Agronomy, University of Florida, Gainesville, FL, USA
- Isadora C M Oliveira
  - Embrapa Milho e Sorgo, Brazilian Agricultural Research Corporation (Embrapa), Sete Lagoas, Minas Gerais, Brazil
- Saulo Chaves
  - Department of General Biology, Federal University of Viçosa, Viçosa, Minas Gerais, Brazil
  - Department of Genetics, "Luiz de Queiroz" College of Agriculture, University of São Paulo, Piracicaba, São Paulo, Brazil
- Pedro C S Carneiro
  - Department of General Biology, Federal University of Viçosa, Viçosa, Minas Gerais, Brazil
- Vander F Souza
  - Embrapa Milho e Sorgo, Brazilian Agricultural Research Corporation (Embrapa), Sete Lagoas, Minas Gerais, Brazil
- Robert E Schaffert
  - Embrapa Milho e Sorgo, Brazilian Agricultural Research Corporation (Embrapa), Sete Lagoas, Minas Gerais, Brazil
- Cynthia M B Damasceno
  - Embrapa Milho e Sorgo, Brazilian Agricultural Research Corporation (Embrapa), Sete Lagoas, Minas Gerais, Brazil
- Rafael A C Parrella
  - Embrapa Milho e Sorgo, Brazilian Agricultural Research Corporation (Embrapa), Sete Lagoas, Minas Gerais, Brazil
- Kaio Olimpio G Dias
  - Department of General Biology, Federal University of Viçosa, Viçosa, Minas Gerais, Brazil
  - Institute of Artificial and Computational Intelligence (IDATA), Federal University of Viçosa, Viçosa, Minas Gerais, Brazil
- Maria M Pastina
  - Embrapa Milho e Sorgo, Brazilian Agricultural Research Corporation (Embrapa), Sete Lagoas, Minas Gerais, Brazil
2
Fernandes IK, Vieira CC, Dias KOG, Fernandes SB. Using machine learning to combine genetic and environmental data for maize grain yield predictions across multi-environment trials. TAG. THEORETICAL AND APPLIED GENETICS. THEORETISCHE UND ANGEWANDTE GENETIK 2024; 137:189. [PMID: 39044035 PMCID: PMC11266441 DOI: 10.1007/s00122-024-04687-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/12/2024] [Accepted: 06/29/2024] [Indexed: 07/25/2024]
Abstract
KEY MESSAGE: Incorporating feature-engineered environmental data into machine learning-based genomic prediction models is an efficient approach to indirectly model genotype-by-environment interactions. Complementing phenotypic traits and molecular markers with high-dimensional data such as climate and soil information is becoming a common practice in breeding programs. This study explored new ways to combine non-genetic information in genomic prediction models using machine learning. Using the multi-environment trial data from the Genomes To Fields initiative, different models to predict maize grain yield were adjusted using various inputs: genetic, environmental, or a combination of both, either in an additive (genetic-and-environmental; G+E) or a multiplicative (genotype-by-environment interaction; GEI) manner. When including environmental data, the mean prediction accuracy of machine learning genomic prediction models increased up to 7% over the well-established Factor Analytic Multiplicative Mixed Model among the three cross-validation scenarios evaluated. Moreover, using the G+E model was more advantageous than the GEI model given the superior, or at least comparable, prediction accuracy, the lower usage of computational memory and time, and the flexibility of accounting for interactions by construction. Our results illustrate the flexibility provided by the ML framework, particularly with feature engineering. We show that the feature engineering stage offers a viable option for envirotyping and generates valuable information for machine learning-based genomic prediction models. Furthermore, we verified that the genotype-by-environment interactions may be considered using tree-based approaches without explicitly including interactions in the model. These findings support the growing interest in merging high-dimensional genotypic and environmental data into predictive modeling.
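The contrast between additive (G+E) and multiplicative (GEI) inputs can be illustrated with a small sketch. It assumes the additive input is a plain concatenation of genetic and environmental feature vectors and the multiplicative input is the set of all pairwise products. The abstract does not give the exact construction used in the study, so both helper functions and the example values are hypothetical.

```python
def additive_features(genetic, environmental):
    """G+E input: genetic and environmental features side by side (p + q columns)."""
    return list(genetic) + list(environmental)


def multiplicative_features(genetic, environmental):
    """GEI input: every genetic feature times every environmental feature (p * q columns)."""
    return [g * e for g in genetic for e in environmental]


# Hypothetical observation: 3 marker-derived features and 2 environmental features.
genetic = [0, 1, 2]
environmental = [0.5, -1.0]

x_additive = additive_features(genetic, environmental)
x_multiplicative = multiplicative_features(genetic, environmental)
```

Under this construction the additive input grows as p + q while the multiplicative input grows as p × q, which is in line with the lower memory and time usage the abstract reports for G+E. A tree-based learner given the additive input can still split first on a genetic feature and then on an environmental one, so interactions can arise without explicit product terms.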
Affiliation(s)
- Igor K Fernandes
  - Department of Crop, Soil, and Environmental Sciences, Center for Agricultural Data Analytics, University of Arkansas, Fayetteville, AR, USA
- Caio C Vieira
  - Department of Crop, Soil, and Environmental Sciences, University of Arkansas, Fayetteville, AR, USA
- Kaio O G Dias
  - Department of General Biology, Federal University of Viçosa, Viçosa, Brazil
- Samuel B Fernandes
  - Department of Crop, Soil, and Environmental Sciences, Center for Agricultural Data Analytics, University of Arkansas, Fayetteville, AR, USA
3
Resende RT, Hickey L, Amaral CH, Peixoto LL, Marcatti GE, Xu Y. Satellite-enabled enviromics to enhance crop improvement. MOLECULAR PLANT 2024; 17:848-866. [PMID: 38637991 DOI: 10.1016/j.molp.2024.04.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/15/2023] [Revised: 04/04/2024] [Accepted: 04/11/2024] [Indexed: 04/20/2024]
Abstract
Enviromics refers to the characterization of micro- and macroenvironments based on large-scale environmental datasets. By providing genotypic recommendations with predictive extrapolation at a site-specific level, enviromics could inform plant breeding decisions across varying conditions and anticipate productivity in a changing climate. Enviromics-based integration of statistics, envirotyping (i.e., determining environmental factors), and remote sensing could help unravel the complex interplay of genetics, environment, and management. To support this goal, exhaustive envirotyping to generate precise environmental profiles would significantly improve predictions of genotype performance and genetic gain in crops. Already, informatics management platforms aggregate diverse environmental datasets obtained using optical, thermal, radar, and light detection and ranging (LiDAR) sensors that capture detailed information about vegetation, surface structure, and terrain. This wealth of information, coupled with freely available climate data, fuels innovative enviromics research. While enviromics holds immense potential for breeding, a few obstacles remain, such as the need for (1) integrative methodologies to systematically collect field data to scale and expand observations across the landscape with satellite data; (2) state-of-the-art AI models for data integration, simulation, and prediction; (3) cyberinfrastructure for processing big data across scales and providing seamless interfaces to deliver forecasts to stakeholders; and (4) collaboration and data sharing among farmers, breeders, physiologists, geoinformatics experts, and programmers across research institutions. Overcoming these challenges is essential for leveraging the full potential of big data captured by satellites to transform 21st century agriculture and crop improvement through enviromics.
Affiliation(s)
- Rafael T Resende
  - Universidade Federal de Goiás (UFG), Agronomy Department, Plant Breeding Sector, Goiânia (GO) 74690-900, Brazil; TheCROP, a Precision-Breeding Startup: Enviromics, Phenomics, and Genomics, No Zip-code, Operating Virtually, Goiânia (GO) and Sete Lagoas (MG), Brazil
- Lee Hickey
  - Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, Brisbane, QLD, Australia
- Cibele H Amaral
  - Earth Lab, Cooperative Institute for Research in Environmental Sciences, University of Colorado Boulder, Boulder, CO 80303, USA; Environmental Data Science Innovation & Inclusion Lab, Cooperative Institute for Research in Environmental Sciences, University of Colorado Boulder, Boulder, CO 80303, USA
- Lucas L Peixoto
  - Universidade Federal de Goiás (UFG), Agronomy Department, Plant Breeding Sector, Goiânia (GO) 74690-900, Brazil
- Gustavo E Marcatti
  - TheCROP, a Precision-Breeding Startup: Enviromics, Phenomics, and Genomics, No Zip-code, Operating Virtually, Goiânia (GO) and Sete Lagoas (MG), Brazil; Universidade Federal de São João del-Rei, Forest Engineering Department, Campus Sete Lagoas, Sete Lagoas (MG) 35701-970, Brazil
- Yunbi Xu
  - Institute of Crop Sciences, Chinese Academy of Agricultural Sciences, Beijing 100081, China; Peking University Institute of Advanced Agricultural Sciences, Weifang, Shandong 261325, China; BGI Bioverse, Shenzhen 518083, China
4
Matsushita K, Onogi A, Yonemaru JI. NARO historical phenotype dataset from rice breeding. BREEDING SCIENCE 2024; 74:114-123. [PMID: 39355631 PMCID: PMC11442108 DOI: 10.1270/jsbbs.23040] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/21/2023] [Accepted: 12/10/2023] [Indexed: 10/03/2024]
Abstract
Data from breeding, including phenotypic information, may improve the efficiency of breeding. Historical data from breeding trials accumulated over a long time are also useful. Here, by organizing data accumulated in the National Agriculture and Food Research Organization (NARO) rice breeding program, we developed a historical phenotype dataset, which includes 6052 records obtained for 667 varieties in yield trials in 1991-2018 at six NARO research stations. The best linear unbiased predictions (BLUPs) and principal component analysis (PCA) were used to determine the relationships with various factors, including the year of cultivar release, for 15 traits, including yield. Yield-related traits such as the number of grains per panicle, plant weight, grain yield, and thousand-grain weight increased significantly with time, whereas the number of panicles decreased significantly. Ripening time significantly increased, whereas the lodging degree and protein content of brown rice significantly decreased. These results suggest that panicle-weight-type high-yielding varieties with excellent lodging resistance have been selected. These trends differed slightly among breeding locations, indicating that the main breeding objectives may differ among them. PCA revealed a higher diversity of traits in newer varieties.
Affiliation(s)
- Kei Matsushita
  - Research Center for Agricultural Information Technology (RCAIT), National Agriculture and Food Research Organization (NARO), 3-1-1 Kannondai, Tsukuba, Ibaraki 305-8517, Japan
  - Institute of Crop Science (NICS), NARO, 2-1-2 Kannondai, Tsukuba, Ibaraki 305-8602, Japan
- Akio Onogi
  - Research Center for Agricultural Information Technology (RCAIT), National Agriculture and Food Research Organization (NARO), 3-1-1 Kannondai, Tsukuba, Ibaraki 305-8517, Japan
  - Faculty of Agriculture, Ryukoku University, 1-5 Yokotani, Seta Oe-cho, Otsu, Shiga 520-2194, Japan
- Jun-Ichi Yonemaru
  - Research Center for Agricultural Information Technology (RCAIT), National Agriculture and Food Research Organization (NARO), 3-1-1 Kannondai, Tsukuba, Ibaraki 305-8517, Japan
  - Institute of Crop Science (NICS), NARO, 2-1-2 Kannondai, Tsukuba, Ibaraki 305-8602, Japan
5
Fernández-González J, Haquin B, Combes E, Bernard K, Allard A, Isidro Y Sánchez J. Maximizing efficiency in sunflower breeding through historical data optimization. PLANT METHODS 2024; 20:42. [PMID: 38493115 PMCID: PMC10943787 DOI: 10.1186/s13007-024-01151-0] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 07/21/2023] [Accepted: 01/30/2024] [Indexed: 03/18/2024]
Abstract
Genomic selection (GS) has become an increasingly popular tool in plant breeding programs, propelled by declining genotyping costs, an increase in computational power, and rediscovery of the best linear unbiased prediction methodology over the past two decades. This development has led to an accumulation of extensive historical datasets with genotypic and phenotypic information, triggering the question of how to best utilize these datasets. Here, we investigate whether all available data or a subset should be used to calibrate GS models for across-year predictions in a 7-year dataset of a commercial hybrid sunflower breeding program. We employed a multi-objective optimization approach to determine the ideal years to include in the training set (TRS). Next, for a given combination of TRS years, we further optimized the TRS size and its genetic composition. We developed the Min_GRM size optimization method which consistently found the optimal TRS size, reducing dimensionality by 20% with an approximately 1% loss in predictive ability. Additionally, the Tails_GEGVs algorithm displayed potential, outperforming the use of all data by using just 60% of it for grain yield, a high-complexity, low-heritability trait. Moreover, maximizing the genetic diversity of the TRS resulted in a consistent predictive ability across the entire range of genotypic values in the test set. Interestingly, the Tails_GEGVs algorithm, due to its ability to leverage heterogeneity, enhanced predictive performance for key hybrids with extreme genotypic values. Our study provides new insights into the optimal utilization of historical data in plant breeding programs, resulting in improved GS model predictive ability.
Affiliation(s)
- Javier Fernández-González
  - Centro de Biotecnologia y Genómica de Plantas (CBGP, UPM-INIA)-Instituto Nacional de Investigación y Tecnologia Agraria y Alimentaria (INIA), Universidad Politécnica de Madrid (UPM), Campus de Montegancedo-UPM, Pozuelo de Alarcón, Madrid, 28223, Spain
- Julio Isidro Y Sánchez
  - Centro de Biotecnologia y Genómica de Plantas (CBGP, UPM-INIA)-Instituto Nacional de Investigación y Tecnologia Agraria y Alimentaria (INIA), Universidad Politécnica de Madrid (UPM), Campus de Montegancedo-UPM, Pozuelo de Alarcón, Madrid, 28223, Spain
6
de Verdal H, Baertschi C, Frouin J, Quintero C, Ospina Y, Alvarez MF, Cao TV, Bartholomé J, Grenier C. Optimization of Multi-Generation Multi-location Genomic Prediction Models for Recurrent Genomic Selection in an Upland Rice Population. RICE (NEW YORK, N.Y.) 2023; 16:43. [PMID: 37758969 PMCID: PMC10533757 DOI: 10.1186/s12284-023-00661-0] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 06/29/2023] [Accepted: 09/19/2023] [Indexed: 09/29/2023]
Abstract
Genomic selection is a worthy breeding method to improve genetic gain in recurrent selection breeding schemes. The integration of multi-generation and multi-location information could significantly improve genomic prediction models in the context of shuttle breeding. The Cirad-CIAT upland rice breeding program applies recurrent genomic selection and seeks to optimize the scheme to increase genetic gain while reducing phenotyping efforts. We used a synthetic population (PCT27) of which S0 plants were all genotyped and advanced by selfing and bulk seed harvest to the S0:2, S0:3, and S0:4 generations. The PCT27 was then divided into two sets. The S0:2 and S0:3 progenies for PCT27A and the S0:4 progenies for PCT27B were phenotyped in two locations: Santa Rosa, the target selection location within the upland rice growing area, and Palmira, the surrogate location, far from the upland rice growing area but easier for experimentation. While the calibration used either one of the two sets phenotyped in one or two locations, the validation population was only the PCT27B phenotyped in Santa Rosa. Five scenarios of genomic prediction and 24 models were performed and compared. Training the prediction model with the PCT27B phenotyped in Santa Rosa resulted in predictive abilities ranging from 0.19 for grain zinc concentration to 0.30 for grain yield. Expanding the training set with the inclusion of the PCT27A resulted in greater predictive abilities for all traits but grain yield, with increases from 5% for plant height to 61% for grain zinc concentration. Models with the PCT27B phenotyped in two locations resulted in higher prediction accuracy when the models assumed no genotype-by-environment (G × E) interaction for flowering (0.38) and grain zinc concentration (0.27). For plant height, the model assuming a single G × E variance provided higher accuracy (0.28). The gain in predictive ability for grain yield was the greatest (0.25) when environment-specific variance deviation effect for G × E was considered. While the best scenario was specific to each trait, the results indicated that the gain in predictive ability provided by the multi-location and multi-generation calibration was low. Yet, this approach could lead to increased selection intensity, acceleration of the breeding cycle, and a sizable economic advantage for the program.
Affiliation(s)
- Hugues de Verdal
  - CIRAD, UMR AGAP Institut, 34398, Montpellier, France
  - UMR AGAP Institut, Univ Montpellier, CIRAD, INRAE, Institut Agro, 34398, Montpellier, France
- Cédric Baertschi
  - CIRAD, UMR AGAP Institut, 34398, Montpellier, France
  - UMR AGAP Institut, Univ Montpellier, CIRAD, INRAE, Institut Agro, 34398, Montpellier, France
- Julien Frouin
  - CIRAD, UMR AGAP Institut, 34398, Montpellier, France
  - UMR AGAP Institut, Univ Montpellier, CIRAD, INRAE, Institut Agro, 34398, Montpellier, France
- Constanza Quintero
  - Alliance Bioversity-CIAT, A.A.6713, Km 17 Recta Palmira Cali, Cali, Colombia
- Yolima Ospina
  - Alliance Bioversity-CIAT, A.A.6713, Km 17 Recta Palmira Cali, Cali, Colombia
- Tuong-Vi Cao
  - CIRAD, UMR AGAP Institut, 34398, Montpellier, France
  - UMR AGAP Institut, Univ Montpellier, CIRAD, INRAE, Institut Agro, 34398, Montpellier, France
- Jérôme Bartholomé
  - CIRAD, UMR AGAP Institut, 34398, Montpellier, France
  - UMR AGAP Institut, Univ Montpellier, CIRAD, INRAE, Institut Agro, 34398, Montpellier, France
  - Alliance Bioversity-CIAT, A.A.6713, Km 17 Recta Palmira Cali, Cali, Colombia
- Cécile Grenier
  - CIRAD, UMR AGAP Institut, 34398, Montpellier, France
  - UMR AGAP Institut, Univ Montpellier, CIRAD, INRAE, Institut Agro, 34398, Montpellier, France
  - Alliance Bioversity-CIAT, A.A.6713, Km 17 Recta Palmira Cali, Cali, Colombia
7
Feldmann MJ, Covarrubias-Pazaran G, Piepho HP. Complex traits and candidate genes: estimation of genetic variance components across multiple genetic architectures. G3 (BETHESDA, MD.) 2023; 13:jkad148. [PMID: 37405459 PMCID: PMC10468314 DOI: 10.1093/g3journal/jkad148] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/27/2023] [Revised: 06/09/2023] [Accepted: 06/12/2023] [Indexed: 07/06/2023]
Abstract
Large-effect loci (those statistically significant loci discovered by genome-wide association studies or linkage mapping) associated with key traits segregate amidst a background of minor, often undetectable, genetic effects in wild and domesticated plants and animals. Accurately attributing mean differences and variance explained to the correct components in the linear mixed model analysis is vital for selecting superior progeny and parents in plant and animal breeding, gene therapy, and medical genetics in humans. Marker-assisted prediction and its successor, genomic prediction, have many advantages for selecting superior individuals and understanding disease risk. However, these two approaches are less often integrated to study complex traits with different genetic architectures. This simulation study demonstrates that the average semivariance can be applied to models incorporating Mendelian, oligogenic, and polygenic terms simultaneously and yields accurate estimates of the variance explained for all relevant variables. Our previous research focused on large-effect loci and polygenic variance separately. This work aims to synthesize and expand the average semivariance framework to various genetic architectures and the corresponding mixed models. This framework independently accounts for the effects of large-effect loci and the polygenic genetic background and is universally applicable to genetics studies in humans, plants, animals, and microbes.
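The abstract names the average semivariance without defining it. As a sketch of the general idea, assuming the usual definition as the mean over all pairs of half the variance of a pairwise difference, the quantity can be computed from a variance-covariance matrix in two equivalent ways. The function names and the example matrix below are illustrative and are not the authors' software.

```python
def average_semivariance(v):
    """Closed form: (trace(V) - sum(V) / n) / (n - 1) for an n x n covariance matrix V."""
    n = len(v)
    trace = sum(v[i][i] for i in range(n))
    total = sum(sum(row) for row in v)
    return (trace - total / n) / (n - 1)


def average_semivariance_pairwise(v):
    """Same quantity from its definition: mean of 0.5 * Var(y_i - y_j) over all pairs i < j."""
    n = len(v)
    halves = [
        0.5 * (v[i][i] + v[j][j] - 2.0 * v[i][j])
        for i in range(n)
        for j in range(i + 1, n)
    ]
    return sum(halves) / len(halves)


# Hypothetical covariance matrix for 3 genotypes (compound symmetry).
v = [[2.0, 1.0, 1.0], [1.0, 2.0, 1.0], [1.0, 1.0, 2.0]]
asv = average_semivariance(v)
```

In this example the shared covariance of 1.0 cancels out of every pairwise difference, so only the part of the variance that distinguishes genotypes from each other is counted.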
Affiliation(s)
- Mitchell J Feldmann
  - Department of Plant Sciences, University of California Davis, One Shields Ave, Davis, CA 95616, USA
- Giovanny Covarrubias-Pazaran
  - International Maize and Wheat Improvement Center (CIMMYT), Carretera México-Veracruz, El Batán, 56130 Texcoco, Edo. de México, México
- Hans-Peter Piepho
  - Biostatistics Unit, Institute of Crop Science, University of Hohenheim, Stuttgart 70599, Germany
8
Ma J, Cao Y, Wang Y, Ding Y. Development of the maize 5.5K loci panel for genomic prediction through genotyping by target sequencing. FRONTIERS IN PLANT SCIENCE 2022; 13:972791. [PMID: 36438102 PMCID: PMC9691890 DOI: 10.3389/fpls.2022.972791] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 06/19/2022] [Accepted: 10/24/2022] [Indexed: 06/16/2023]
Abstract
Genotyping platforms are important for genetic research and molecular breeding. In this study, a low-density genotyping platform containing 5.5K SNP markers was successfully developed in maize using genotyping by target sequencing (GBTS) technology with capture-in-solution. Two maize populations (Pop1 and Pop2) were used to validate the GBTS panel for genetic and molecular breeding studies. Pop1 comprised 942 hybrids derived from 250 inbred lines and four testers, and Pop2 contained 540 hybrids which were generated from 123 newly developed inbred lines and eight testers. The genetic analyses showed that the average polymorphic information content and genetic diversity values ranged from 0.27 to 0.38 in both populations using all filtered genotyping data. The mean missing rate was 1.23% across populations. The Structure and UPGMA tree analyses revealed similar genetic divergences (76-89%) in both populations. Genomic prediction analyses showed that the prediction accuracy of reproducing kernel Hilbert space (RKHS) was slightly lower than that of genomic best linear unbiased prediction (GBLUP) and three Bayesian methods for general combining ability of grain yield per plant and three yield-related traits in both populations, whereas RKHS with additive effects showed superior advantages over the other four methods in Pop1. In Pop1, the GBLUP and three Bayesian methods with additive-dominance model improved the prediction accuracies by 4.89-134.52% for the four traits in comparison to the additive model. In Pop2, the inclusion of dominance did not improve the accuracy in most cases. In general, low accuracies (0.33-0.43) were achieved for general combining ability of the four traits in Pop1, whereas moderate-to-high accuracies (0.52-0.65) were observed in Pop2. For hybrid performance prediction, the accuracies were moderate to high (0.51-0.75) for the four traits in both populations using the additive-dominance model. This study suggests a reliable genotyping platform that can be implemented in genomic selection-assisted breeding to accelerate maize new cultivar development and improvement.
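For readers unfamiliar with the two marker summary statistics mentioned above, the sketch below computes gene diversity and polymorphic information content (PIC) for a single biallelic SNP from its allele frequency, using the standard textbook formulas. It is not taken from the paper, and the function names are illustrative.

```python
def gene_diversity(p):
    """Expected heterozygosity of a biallelic marker with allele frequency p."""
    q = 1.0 - p
    return 1.0 - (p * p + q * q)


def polymorphic_information_content(p):
    """PIC of a biallelic marker: gene diversity minus the term 2 * p^2 * q^2."""
    q = 1.0 - p
    return gene_diversity(p) - 2.0 * p * p * q * q


# A marker with balanced alleles gives the largest values a biallelic SNP can reach.
gd_max = gene_diversity(0.5)                    # 0.5
pic_max = polymorphic_information_content(0.5)  # 0.375
```

Panel-level figures are typically averages of these per-marker values over all retained SNPs.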
9
Dias KOG, Dos Santos JPR, Krause MD, Piepho HP, Guimarães LJM, Pastina MM, Garcia AAF. Leveraging probability concepts for cultivar recommendation in multi-environment trials. TAG. THEORETICAL AND APPLIED GENETICS. THEORETISCHE UND ANGEWANDTE GENETIK 2022; 135:1385-1399. [PMID: 35192008 DOI: 10.1007/s00122-022-04041-y] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 04/22/2021] [Accepted: 01/07/2022] [Indexed: 06/14/2023]
Abstract
We propose using probability concepts from Bayesian models to leverage a more informed decision-making process toward cultivar recommendation in multi-environment trials. Statistical models that capture the phenotypic plasticity of a genotype across environments are crucial in plant breeding programs to potentially identify parents, generate offspring, and obtain highly productive genotypes for target environments. In this study, our aim is to leverage concepts of Bayesian models and probability methods of stability analysis to untangle genotype-by-environment interaction (GEI). The proposed method employs the posterior distribution obtained with the No-U-Turn sampler algorithm to get Hamiltonian Monte Carlo estimates of adaptation and stability probabilities. We applied the proposed models in two empirical tropical datasets. Our findings provide a basis to enhance our ability to consider the uncertainty of cultivar recommendation for global or specific adaptation. We further demonstrate that probability methods of stability analysis in a Bayesian framework are a powerful tool for unraveling GEI given a defined intensity of selection that results in a more informed decision-making process toward cultivar recommendation in multi-environment trials.
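One way to read "probability of superior performance given a defined intensity of selection" is as the share of posterior draws in which a genotype ranks within the selected fraction. The sketch below implements that reading only. It assumes posterior draws of genotypic values are already available (for example from a No-U-Turn sampler run), and the function name, data layout, and numbers are hypothetical.

```python
import math


def prob_superior(draws, intensity):
    """Share of posterior draws in which each genotype is among the selected top fraction.

    draws: list of dicts mapping genotype name -> sampled genotypic value.
    intensity: fraction of genotypes selected (e.g. 0.25 keeps the top 25%).
    """
    genotypes = list(draws[0])
    n_selected = max(1, math.ceil(intensity * len(genotypes)))
    counts = {g: 0 for g in genotypes}
    for draw in draws:
        ranked = sorted(genotypes, key=lambda g: draw[g], reverse=True)
        for g in ranked[:n_selected]:
            counts[g] += 1
    return {g: counts[g] / len(draws) for g in genotypes}


# Hypothetical posterior draws for 4 genotypes (higher is better).
draws = [
    {"A": 3.0, "B": 2.0, "C": 1.0, "D": 0.5},
    {"A": 1.0, "B": 3.0, "C": 2.0, "D": 0.5},
    {"A": 3.0, "B": 1.0, "C": 2.0, "D": 0.5},
    {"A": 2.5, "B": 2.0, "C": 1.0, "D": 0.5},
]
probs = prob_superior(draws, intensity=0.25)
```

Because the probabilities are computed draw by draw, they carry the posterior uncertainty of the ranking rather than relying on a single point estimate per genotype. Restricting `draws` to values for one environment would give an environment-specific version of the same summary.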
Affiliation(s)
- Kaio O G Dias
  - Department of Genetics, Luiz de Queiroz College of Agriculture, University of São Paulo, Piracicaba, SP, Brazil
  - Department of General Biology, Federal University of Viçosa, Viçosa, Brazil
- Jhonathan P R Dos Santos
  - Department of Genetics, Luiz de Queiroz College of Agriculture, University of São Paulo, Piracicaba, SP, Brazil
- Antonio A F Garcia
  - Department of Genetics, Luiz de Queiroz College of Agriculture, University of São Paulo, Piracicaba, SP, Brazil
10
de Faria SV, Zuffo LT, Rezende WM, Caixeta DG, Pereira HD, Azevedo CF, DeLima RO. Phenotypic and molecular characterization of a set of tropical maize inbred lines from a public breeding program in Brazil. BMC Genomics 2022; 23:54. [PMID: 35030994 PMCID: PMC8759194 DOI: 10.1186/s12864-021-08127-7] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 01/18/2021] [Accepted: 10/27/2021] [Indexed: 11/28/2022]
Abstract
Background: The characterization of genetic diversity and population differentiation for maize inbred lines from breeding programs is of great value in assisting breeders in maintaining and potentially increasing the rate of genetic gain. In our study, we characterized a set of 187 tropical maize inbred lines from the public breeding program of the Universidade Federal de Viçosa (UFV) in Brazil based on 18 agronomic traits and 3,083 single nucleotide polymorphism (SNP) markers to evaluate whether this set of inbred lines represents a panel of tropical maize inbred lines for association mapping analysis and investigate the population structure and patterns of relationships among the inbred lines from UFV for better exploitation in our maize breeding program. Results: Our results showed that there was large phenotypic and genotypic variation in the set of tropical maize inbred lines from the UFV maize breeding program. We also found high genetic diversity (GD = 0.34) and low pairwise kinship coefficients among the maize inbred lines (only approximately 4.00% of the pairwise relative kinship was above 0.50) in the set of inbred lines. The LD decay distance over all ten chromosomes in the entire set of maize lines with r² = 0.1 was 276,237 kb. Concerning the population structure, our results from the model-based STRUCTURE and principal component analysis methods distinguished the inbred lines into three subpopulations, with high consistency maintained between both results. Additionally, the clustering analysis based on phenotypic and molecular data grouped the inbred lines into 14 and 22 genetic divergence clusters, respectively. Conclusions: Our results indicate that the set of tropical maize inbred lines from UFV maize breeding programs can comprise a panel of tropical maize inbred lines suitable for a genome-wide association study to dissect the variation of complex quantitative traits in maize, mainly in tropical environments. In addition, our results will be very useful for assisting us in the assignment of heterotic groups and the selection of the best parental combinations for new breeding crosses, mapping populations, mapping synthetic populations, guiding crosses that target highly heterotic and yielding hybrids, and predicting untested hybrids in the public breeding program at UFV. Supplementary Information: The online version contains supplementary material available at 10.1186/s12864-021-08127-7.
Affiliation(s)
- Leandro Tonello Zuffo
  - Department of Agronomy, Universidade Federal de Viçosa, Minas Gerais, Viçosa, Brazil
11
Martins Oliveira IC, Bernardeli A, Soler Guilhen JH, Pastina MM. Genomic Prediction of Complex Traits in an Allogamous Annual Crop: The Case of Maize Single-Cross Hybrids. Methods Mol Biol 2022; 2467:543-567. [PMID: 35451790 DOI: 10.1007/978-1-0716-2205-6_20] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
For many plant and animal species, commercial products are hybrids between individuals from different genetic groups. For allogamous plant species such as maize, the breeding objective is to produce single-cross hybrid varieties from two inbred lines, each selected in complementary groups. Efficient hybrid breeding requires methods that (1) quickly generate homozygous and homogeneous parental lines with high combining abilities, (2) efficiently choose the most promising parental lines from the large number available, and (3) predict the performances of sets of non-phenotyped single-cross hybrids, or hybrids phenotyped in a limited number of environments, based on their relationship with another set of hybrids with known performances. The maize breeding community has been developing model-based prediction of hybrid performances since well before the genomic era. This chapter (1) provides a reminder of the maize breeding scheme before the genomic era; (2) describes how genomic data were incorporated in the prediction models involved in different steps of genomic-based single-cross maize hybrid breeding; and (3) reviews factors affecting the accuracy of genomic prediction, approaches for optimizing GP-based single-cross maize hybrid breeding schemes, and ways of ensuring the long-term sustainability of genomic selection.
Affiliation(s)
- Arthur Bernardeli
- Department of Agronomy, Universidade Federal de Viçosa, Viçosa-MG, Brazil
|
12
|
Rice BR, Lipka AE. Diversifying maize genomic selection models. MOLECULAR BREEDING : NEW STRATEGIES IN PLANT IMPROVEMENT 2021; 41:33. [PMID: 37309328 PMCID: PMC10236107 DOI: 10.1007/s11032-021-01221-4] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 10/16/2020] [Accepted: 03/07/2021] [Indexed: 06/14/2023]
Abstract
Genomic selection (GS) is one of the most powerful tools available for maize breeding. Its use of genome-wide marker data to estimate breeding values translates to increased genetic gains with fewer breeding cycles. In this review, we cover the history of GS and highlight particular milestones during its adaptation to maize breeding. We discuss how GS can be applied to developing superior maize inbreds and hybrids. Additionally, we characterize refinements in GS models that could enable the encapsulation of non-additive genetic effects, genotype by environment interactions, and multiple levels of the biological hierarchy, all of which could ultimately result in more accurate predictions of breeding values. Finally, we suggest the stages in a maize breeding program where it would be beneficial to apply GS. Given the sophistication of the high-throughput phenotypic, genotypic, and other -omic level data currently available to the maize community, now is the time to explore the implications of their incorporation into GS models and thus ensure that genetic gains are being achieved as quickly and efficiently as possible.
Affiliation(s)
- Brian R. Rice
- Department of Crop Sciences, University of Illinois, Urbana, IL USA
|
13
|
Labroo MR, Studer AJ, Rutkoski JE. Heterosis and Hybrid Crop Breeding: A Multidisciplinary Review. Front Genet 2021; 12:643761. [PMID: 33719351 PMCID: PMC7943638 DOI: 10.3389/fgene.2021.643761] [Citation(s) in RCA: 87] [Impact Index Per Article: 21.8] [Received: 12/18/2020] [Accepted: 02/08/2021] [Indexed: 11/24/2022] Open
Abstract
Although hybrid crop varieties are among the most popular agricultural innovations, the rationale for hybrid crop breeding is sometimes misunderstood. Hybrid breeding is slower and more resource-intensive than inbred breeding, but it allows systematic improvement of a population by recurrent selection and exploitation of heterosis simultaneously. Inbred parental lines can identically reproduce both themselves and their F1 progeny indefinitely, whereas outbred lines cannot, so uniform outbred lines must be bred indirectly through their inbred parents to harness heterosis. Heterosis is an expected consequence of whole-genome non-additive effects at the population level over evolutionary time. Understanding heterosis from the perspective of molecular genetic mechanisms alone may be elusive, because heterosis is likely an emergent property of populations. Hybrid breeding is a process of recurrent population improvement to maximize hybrid performance. Hybrid breeding is not maximization of heterosis per se, nor testing random combinations of individuals to find an exceptional hybrid, nor using heterosis in place of population improvement. Though there are methods to harness heterosis other than hybrid breeding, such as use of open-pollinated varieties or clonal propagation, they are not currently suitable for all crops or production environments. The use of genomic selection can decrease cycle time and costs in hybrid breeding, particularly by rapidly establishing heterotic pools, reducing testcrossing, and limiting the loss of genetic variance. Open questions in optimal use of genomic selection in hybrid crop breeding programs remain, such as how to choose founders of heterotic pools, the importance of dominance effects in genomic prediction, the necessary frequency of updating the training set with phenotypic information, and how to maintain genetic variance and prevent fixation of deleterious alleles.
Affiliation(s)
- Jessica E. Rutkoski
- Department of Crop Sciences, University of Illinois at Urbana–Champaign, Urbana, IL, United States
|