1
Dalid C, Zheng C, Osorio L, Verma S, Abd‐Elrahman A, Wang X, Whitaker VM. Genetic analysis of predicted vegetative biomass and biomass-related traits from digital phenotyping of strawberry. THE PLANT GENOME 2025; 18:e70018. [PMID: 40164966 PMCID: PMC11958871 DOI: 10.1002/tpg2.70018] [Received: 09/26/2024] [Revised: 12/30/2024] [Accepted: 02/23/2025] [Indexed: 04/02/2025]
Abstract
High-throughput digital phenotyping (DP) has been widely explored in plant breeding to assess large numbers of genotypes with minimal manual labor and reduced cost and time. DP platforms using high-resolution images captured by drones and tractor-based platforms have recently allowed the University of Florida strawberry (Fragaria × ananassa) breeding program to assess vegetative biomass at scale. Biomass has not previously been explored in a strawberry breeding context due to the labor required and the need to destroy the plant. This study aims to understand the genetic basis of predicted vegetative biomass and biomass-related traits and to chart a path for the combined use of DP and genomics in strawberry breeding. Aboveground dry vegetative biomass was estimated by adapting a previously published model using ground-truth data on a subset of breeding germplasm. High-resolution images were collected on clonally replicated trials at different time points during the fruiting season. There was moderate to high heritability (h² = 0.26 to 0.56) for predicted vegetative biomass, and genetic correlations between vegetative biomass and marketable yield were mostly positive (rG = -0.13 to 0.47). Fruit yield traits scaled on a vegetative biomass basis also had moderate to high heritability (h² = 0.25 to 0.64). This suggests that vegetative biomass can be decreased or increased through selection, and that marketable fruit yield can be improved without simultaneously increasing plant size. No consistent marker-trait associations were discovered via genome-wide association studies. On the other hand, predictive abilities from genomic selection ranged from 0.15 to 0.46 across traits and years, suggesting that genomic prediction will be an effective breeding tool for vegetative biomass in strawberry.
Affiliation(s)
- Cheryl Dalid
- Horticultural Sciences Department, IFAS Gulf Coast Research and Education Center, University of Florida, Wimauma, Florida, USA
- Caiwang Zheng
- School of Forest Resources and Conservation Geomatics, IFAS Gulf Coast Research and Education Center, University of Florida, Plant City, Florida, USA
- Luis Osorio
- Horticultural Sciences Department, IFAS Gulf Coast Research and Education Center, University of Florida, Wimauma, Florida, USA
- Amr Abd‐Elrahman
- School of Forest Resources and Conservation Geomatics, IFAS Gulf Coast Research and Education Center, University of Florida, Plant City, Florida, USA
- Xu Wang
- Agricultural and Biological Engineering Department, IFAS Gulf Coast Research and Education Center, University of Florida, Wimauma, Florida, USA
- Vance M. Whitaker
- Horticultural Sciences Department, IFAS Gulf Coast Research and Education Center, University of Florida, Wimauma, Florida, USA
2
Matt JL, Small JM, Kube PD, Allen SK. Quantitative genetic analysis of late spring mortality in triploid Crassostrea virginica. Genet Sel Evol 2025; 57:19. [PMID: 40205344 PMCID: PMC11983945 DOI: 10.1186/s12711-025-00965-3] [Received: 09/15/2023] [Accepted: 03/09/2025] [Indexed: 04/11/2025]
Abstract
BACKGROUND: Triploid oysters, bred by crossing tetraploid and diploid oysters, are common worldwide in commercial oyster aquaculture and make up much of the hatchery-produced Crassostrea virginica farmed in the mid-Atlantic and southeast of the United States. Breeding diploid and tetraploid animals for genetic improvement of triploid progeny is unique to oysters and can proceed via several possible breeding strategies. Triploid oysters, along with their diploid or tetraploid relatives, have yet to be subject to quantitative genetic analyses that could inform a breeding strategy of triploid improvement. The importance of quantitative genetic analyses involving triploid C. virginica has been emphasized by the occurrence of mortality events of near-market sized triploids in late spring. METHODS: Genetic parameters for survival and weight of triploid and tetraploid C. virginica were estimated from twenty paternal half-sib triploid families and thirty-nine full-sib tetraploid families reared at three sites in the Chesapeake Bay (USA). Traits were analyzed using linear mixed models in ASReml-R. Genetic relationship matrices appropriate for pedigrees with triploid and tetraploid animals were produced using the polyAinv package in R. RESULTS: A mortality event in triploids occurred at one site located on the bayside of the Eastern Shore of Virginia. Between early May and early July, three triploid families had survival of less than 0.70, while most had survival greater than 0.90. The heritability for survival during this period in triploids at this affected site was 0.57 ± 0.23. Triploid survival at the affected site was adversely related to triploid survival at the low salinity site (-0.50 ± 0.23) and unrelated to tetraploid survival at the site with similar salinity (0.05 ± 0.39). CONCLUSIONS: Survival during a late spring mortality event in triploids had a substantial additive genetic basis, suggesting selective breeding of tetraploids can reduce triploid mortalities. Genetic correlations revealed evidence of genotype by environment interactions for triploid survival and weak genetic correlations between survival of tetraploids and triploids. A selective breeding strategy with phenotyping of tetraploid and triploid half-sibs is recommended for genetic improvement of triploid oysters.
Affiliation(s)
- Joseph L Matt
- Marine Genomics Lab, Department of Life Sciences, Texas A&M University-Corpus Christi, 6300 Ocean Drive, Corpus Christi, TX, 78412, USA.
- Texas A&M AgriLife Research, 600 John Kimbrough Boulevard, Suite 512, College Station, TX, 77843, USA.
- Jessica Moss Small
- Aquaculture Genetics and Breeding Technology Center, Virginia Institute of Marine Science, William & Mary, P.O. Box 1346, Gloucester Point, VA, 23062, USA
- Peter D Kube
- Center for Aquaculture Technologies, Hobart, Tasmania, Australia
- CSIRO Agriculture & Food, Hobart, Tasmania, Australia
- Standish K Allen
- Aquaculture Genetics and Breeding Technology Center, Virginia Institute of Marine Science, William & Mary, P.O. Box 1346, Gloucester Point, VA, 23062, USA
3
Sipowicz P, Murad Leite Andrade MH, Fernandes Filho CC, Benevenuto J, Muñoz P, Ferrão LFV, Resende MFR, Messina C, Rios EF. Optimization of high-throughput marker systems for genomic prediction in alfalfa family bulks. THE PLANT GENOME 2025; 18:e20526. [PMID: 39635923 PMCID: PMC11726437 DOI: 10.1002/tpg2.20526] [Received: 04/07/2024] [Revised: 09/25/2024] [Accepted: 09/25/2024] [Indexed: 12/07/2024]
Abstract
Alfalfa (Medicago sativa L.) is a perennial forage legume esteemed for its exceptional quality and dry matter yield (DMY); however, alfalfa has historically exhibited low genetic gain for DMY. Advances in genotyping platforms paved the way for a cost-effective application of genomic prediction in alfalfa family bulks. In this context, the optimization of marker density holds potential to reallocate resources within genomic prediction pipelines. This study aimed to (i) test two genotyping platforms for population structure discrimination and predictive ability (PA) of genomic prediction models (G-BLUP) for DMY, and (ii) explore optimal levels of marker density to predict DMY in family bulks. For this, 160 nondormant alfalfa families were phenotyped for DMY across 11 harvests and genotyped via targeted sequencing using Capture-seq with 17K probes and the DArTag 3K panel. Both platforms discriminated population structure similarly and resulted in comparable PA for DMY. For genotyping optimization, different levels of marker density were randomly extracted from each platform. In both cases, a plateau was achieved around 500 markers, yielding PA similar to that of the full set of markers. For phenotyping optimization, models with 500 markers built with data from five harvests resulted in PA similar to that of the full set of 11 harvests and full set of markers. Altogether, genotyping and phenotyping efforts were optimized in terms of number of markers and harvests. Capture-seq and DArTag yielded similar results and have the flexibility to adjust their panels to meet breeders' needs in terms of marker density.
Affiliation(s)
- Pablo Sipowicz
- Plant Breeding Graduate Program, University of Florida, Gainesville, Florida, USA
- Instituto Nacional de Tecnologia Agropecuaria, Manfredi, Argentina
- Juliana Benevenuto
- Horticultural Sciences Department, University of Florida, Gainesville, Florida, USA
- Patricio Muñoz
- Horticultural Sciences Department, University of Florida, Gainesville, Florida, USA
- C. Messina
- Horticultural Sciences Department, University of Florida, Gainesville, Florida, USA
- Esteban F. Rios
- Agronomy Department, University of Florida, Gainesville, Florida, USA
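Predictive ability (PA), as reported in the alfalfa entry above and in several other entries in this list, is commonly computed as the Pearson correlation between predicted genetic values and observed phenotypes in a validation set. The abstract does not state its exact definition, so the following is only a minimal sketch under that assumption; the function name `predictive_ability` and the toy numbers are hypothetical and are not data from the study.

```python
import math


def predictive_ability(predicted, observed):
    """Pearson correlation between predicted values and observed phenotypes."""
    if len(predicted) != len(observed) or len(predicted) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    # Sums of cross-products and squared deviations around the means.
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    if var_p == 0 or var_o == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_p * var_o)


# Hypothetical validation-set values (illustration only).
gebv = [1.2, 0.4, -0.3, 0.9, -1.1]  # predicted genetic values
dmy = [5.1, 4.0, 3.8, 4.9, 3.0]     # observed dry matter yield
pa = predictive_ability(gebv, dmy)
print(f"predictive ability: {pa:.2f}")
```

In a marker-density comparison like the one described, the same function would be applied to the predictions from each marker subset so that the resulting PA values are directly comparable.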
4
Mertten D, McKenzie CM, Baldwin S, Thomson S, Souleyre EJF, Lenhard M, Datson PM. Genomic selection in a kiwiberry breeding programme: integrating intra- and inter-specific crossing. MOLECULAR BREEDING : NEW STRATEGIES IN PLANT IMPROVEMENT 2025; 45:31. [PMID: 40061125 PMCID: PMC11889281 DOI: 10.1007/s11032-025-01550-8] [Received: 08/30/2024] [Accepted: 02/20/2025] [Indexed: 03/21/2025]
Abstract
Inter-specific hybridisation between natural populations within the genus Actinidia is a common phenomenon and has been used in breeding programmes. Hybridisation between species increases the diversity of breeding populations, incorporating new desirable traits into potential cultivars. We explored genomic prediction in Actinidia breeding, focusing on the closely related species Actinidia arguta and Actinidia melanandra. We investigated the potential of genomic selection by analysing four quantitative traits across intra-specific A. arguta crosses and inter-specific crosses between A. arguta and A. melanandra. The continuous distributions of the studied traits in both intra-specific and inter-specific crosses indicated a polygenic background. A linear mixed model approach was used, incorporating the factor of year of season and a marker-based relationship matrix instead of pedigree as a random effect. After evaluation, the best model was applied to assess variance components and heritability for each quantitative trait. Expanding beyond intra-specific crosses, predictive ability was calculated to investigate inter-specific cross effect. Considering predictive ability, this study explored the impacts of sample size and population structure. A reduction in sample size correlated with decreased predictive ability, while the influence of population structure was particularly pronounced in inter-specific crosses. Finally, the prediction accuracy of genomic estimated breeding values, for parental genotypes, revealed an inter-species effect on prediction confidence. Considering the imbalance in genotype numbers between intra- and inter-specific cross populations, this research highlights the difficulty of genomic prediction in hybrid populations. Understanding prediction accuracy in inter-species crossing designs provides valuable insights for optimising genomic selection. 
Supplementary Information: The online version contains supplementary material available at 10.1007/s11032-025-01550-8.
Affiliation(s)
- Daniel Mertten
- The New Zealand Institute for Plant and Food Research Ltd, Auckland, 1142 New Zealand
- Institute for Biochemistry and Biology, University of Potsdam, 14476 Potsdam-Golm, Germany
- Catherine M. McKenzie
- The New Zealand Institute for Plant and Food Research Ltd, Te Puke, 3182 New Zealand
- Samantha Baldwin
- The New Zealand Institute for Plant and Food Research Ltd, Lincoln, 7608 New Zealand
- Susan Thomson
- The New Zealand Institute for Plant and Food Research Ltd, Lincoln, 7608 New Zealand
- Edwige J. F. Souleyre
- The New Zealand Institute for Plant and Food Research Ltd, Auckland, 1142 New Zealand
- Michael Lenhard
- Institute for Biochemistry and Biology, University of Potsdam, 14476 Potsdam-Golm, Germany
5
Shahi D, Todd J, Gravois K, Hale A, Blanchard B, Kimbeng C, Pontif M, Baisakh N. Exploiting historical agronomic data to develop genomic prediction strategies for early clonal selection in the Louisiana sugarcane variety development program. THE PLANT GENOME 2025; 18:e20545. [PMID: 39740237 DOI: 10.1002/tpg2.20545] [Received: 08/27/2024] [Revised: 11/15/2024] [Accepted: 11/18/2024] [Indexed: 01/02/2025]
Abstract
Genomic selection can enhance the rate of genetic gain of cane and sucrose yield in sugarcane (Saccharum L.), an important industrial crop worldwide. We assessed the predictive ability (PA) for six traits, namely theoretical recoverable sugar (TRS), number of stalks (NS), stalk weight (SW), cane yield (CY), sugar yield (SY), and fiber content (Fiber), using 20,451 single nucleotide polymorphisms (SNPs) with 22 statistical models based on the genomic estimated breeding values of 567 genotypes within and across five stages of the Louisiana sugarcane breeding program. TRS and SW, which had high heritability, showed higher PA than the other traits, while NS had the lowest. Machine learning (ML) methods, such as random forest and support vector machine (SVM), outperformed others in predicting traits with low heritability. ML methods predicted TRS and SY with the highest accuracy in cross-stage predictions, while Bayesian models predicted NS and CY with the highest accuracy. Extended genomic best linear unbiased prediction models accounting for dominance and epistasis effects showed a slight improvement in PA for a few traits. When both NS and TRS, which can be available as early as stage 2, were considered in a multi-trait selection model, the PA for SY in stage 5 could increase up to 0.66 compared to 0.30 with a single-trait model. Marker density assessment suggested 9091 SNPs were sufficient for optimal PA of all traits. The study demonstrated the potential of using historical data to devise genomic prediction strategies for clonal selection early in sugarcane breeding programs.
Affiliation(s)
- Dipendra Shahi
- School of Plant, Environmental and Soil Sciences, Louisiana State University Agricultural Center, Baton Rouge, Louisiana, USA
- James Todd
- Sugarcane Research Unit, USDA-ARS, Houma, Louisiana, USA
- Kenneth Gravois
- Sugar Research Station, Louisiana State University Agricultural Center, St. Gabriel, Louisiana, USA
- Anna Hale
- Sugarcane Research Unit, USDA-ARS, Houma, Louisiana, USA
- Brayden Blanchard
- Sugar Research Station, Louisiana State University Agricultural Center, St. Gabriel, Louisiana, USA
- Collins Kimbeng
- Sugar Research Station, Louisiana State University Agricultural Center, St. Gabriel, Louisiana, USA
- Michael Pontif
- Sugar Research Station, Louisiana State University Agricultural Center, St. Gabriel, Louisiana, USA
- Niranjan Baisakh
- School of Plant, Environmental and Soil Sciences, Louisiana State University Agricultural Center, Baton Rouge, Louisiana, USA
6
Ferrão LFV, Azevedo CF, Sims CA, Munoz PR. A consumer-oriented approach to define breeding targets for molecular breeding. THE NEW PHYTOLOGIST 2025; 245:711-721. [PMID: 39530162 DOI: 10.1111/nph.20254] [Received: 03/15/2024] [Accepted: 10/15/2024] [Indexed: 11/16/2024]
Abstract
Flavor is a crucial aspect of the eating experience, reflecting evolving consumer preferences for fruits with enhanced quality. Modern fruit breeding programs prioritize improving quality traits aligned with consumer tastes. However, defining fruit-quality attributes that significantly impact consumer preference is a current challenge faced by the industry and breeders. This study proposes a data-driven approach to statistically model the relationship between fruit-quality parameters and consumers' overall liking. Our primary hypothesis suggests that the interplay between fruit-quality attributes and consumer preferences may reach a critical value, serving as new empirical benchmarks for fruit quality. Using extensive historical datasets accounting for sensory, biochemical, and genomic information described in blueberry, we first demonstrated that multivariate adaptive regression splines (MARS) could be used to identify specific values of fruit-quality traits that significantly affect consumer perception, using nonlinear spline regressions to estimate threshold points. We then harnessed genomic information and carried out genomic selection (GS) for five fruit-quality traits evaluated on the original scale and after being classified via the MARS approach. This study provides a pioneering consumer-centric and data-driven approach to defining fruit-quality standards and supporting molecular breeding that has broad applications to breeding programs from any species.
Affiliation(s)
- Luis Felipe V Ferrão
- Horticultural Sciences Department, Blueberry Breeding and Genomics Lab, University of Florida, Gainesville, FL, 32611, USA
- Camila F Azevedo
- Horticultural Sciences Department, Blueberry Breeding and Genomics Lab, University of Florida, Gainesville, FL, 32611, USA
- Statistics Department, Federal University of Viçosa, Viçosa, MG, 36570-900, Brazil
- Charles A Sims
- Food Science and Human Nutrition Department, University of Florida, Gainesville, FL, 32611, USA
- Patricio R Munoz
- Horticultural Sciences Department, Blueberry Breeding and Genomics Lab, University of Florida, Gainesville, FL, 32611, USA
7
Ferrão LFV, Azevedo C, Benevenuto J, Mengist MF, Luby C, Pottorff M, Casorzo GIP, Mackey T, Lila MA, Giongo L, Bassil N, Perkins-Veazie P, Iorizzo M, Munoz PR. Inference of the genetic basis of fruit texture in highbush blueberries using genome-wide association analyses. HORTICULTURE RESEARCH 2024; 11:uhae233. [PMID: 39431114 PMCID: PMC11489598 DOI: 10.1093/hr/uhae233] [Received: 04/25/2024] [Accepted: 08/04/2024] [Indexed: 10/22/2024]
Abstract
The global production and consumption of blueberry (Vaccinium spp.), a specialty crop known for its abundant bioactive and antioxidant compounds, has more than doubled over the last decade. To hold this momentum, plant breeders have begun to use quantitative genetics and molecular breeding to guide their decisions and select new cultivars that are improved for fruit quality. In this study, we leveraged our inferences on the genetic basis of fruit texture and chemical components by surveying large breeding populations from northern highbush blueberries (NHBs) and southern highbush blueberries (SHBs), the two dominant cultivated blueberries. We evaluated 1065 NHB genotypes planted at Oregon State University and 992 SHB genotypes maintained at the University of Florida for 17 texture-related traits over multiple years. Our contributions are as follows: (i) we drew attention to differences between NHB and SHB materials and showed that both blueberry types can be differentiated using texture traits; (ii) we computed genetic parameters and shed light on the genetic architecture of important texture attributes, indicating that most traits had a complex nature with low to moderate heritability; (iii) using molecular breeding, we emphasized that prediction could be performed across populations; and finally (iv) the genomic association analyses pinpointed some genomic regions harboring potential candidate genes for texture that could be used for further validation studies. Altogether, the methods and approaches used here can guide future breeding efforts focused on maximizing texture improvements in blueberries.
Affiliation(s)
- Luis Felipe V Ferrão
- Blueberry Breeding and Genomics Lab, Horticultural Sciences Department, University of Florida, Gainesville, FL 32611, USA
- Camila Azevedo
- Blueberry Breeding and Genomics Lab, Horticultural Sciences Department, University of Florida, Gainesville, FL 32611, USA
- Statistic Department, Federal University of Vicosa, Vicosa, Brazil
- Juliana Benevenuto
- Blueberry Breeding and Genomics Lab, Horticultural Sciences Department, University of Florida, Gainesville, FL 32611, USA
- Molla Fentie Mengist
- Blueberry Breeding and Genomics Lab, Horticultural Sciences Department, University of Florida, Gainesville, FL 32611, USA
- Claire Luby
- Plants for Human Health Institute, North Carolina State University, Kannapolis, NC USA
- Marti Pottorff
- USDA-ARS, Horticulture Crops Research Unit, Corvallis, OR 97333, USA
- Gonzalo I P Casorzo
- Blueberry Breeding and Genomics Lab, Horticultural Sciences Department, University of Florida, Gainesville, FL 32611, USA
- Ted Mackey
- USDA-ARS, Horticulture Crops Research Unit, Corvallis, OR 97333, USA
- Mary Ann Lila
- Plants for Human Health Institute, North Carolina State University, Kannapolis, NC USA
- Lara Giongo
- Fondazione Edmund Mach - Research and Innovation Centre Italy
- Nahla Bassil
- USDA-ARS, Horticulture Crops Research Unit, Corvallis, OR 97333, USA
- Massimo Iorizzo
- Plants for Human Health Institute, North Carolina State University, Kannapolis, NC USA
- Patricio R Munoz
- Blueberry Breeding and Genomics Lab, Horticultural Sciences Department, University of Florida, Gainesville, FL 32611, USA
8
Casorzo G, Ferrão LF, Adunola P, Tavares Flores E, Azevedo C, Amadeu R, Munoz PR. Understanding the genetic basis of blueberry postharvest traits to define better breeding strategies. G3 (BETHESDA, MD.) 2024; 14:jkae163. [PMID: 39052988 PMCID: PMC11373639 DOI: 10.1093/g3journal/jkae163] [Received: 01/23/2024] [Revised: 01/23/2024] [Accepted: 07/09/2024] [Indexed: 07/27/2024]
Abstract
Blueberry (Vaccinium spp.) is among the most-consumed soft fruits and has been recognized as an important source of health-promoting compounds. Because the fruit is highly perishable and susceptible to rapid spoilage due to softening and decay during postharvest storage, modern breeding programs are looking to maximize the quality and extend the market life of fresh blueberries. However, it is uncertain to what extent postharvest quality traits are genetically controlled in blueberries. This study aimed to investigate the prediction ability and the genetic basis of the main fruit quality traits affected during blueberry postharvest to create breeding strategies for developing cultivars with an extended shelf life. To achieve this goal, we carried out target genotyping in a breeding population of 588 individuals and evaluated several fruit quality traits after 1 day, 1 week, 3 weeks, and 7 weeks of postharvest storage at 1°C. Using longitudinal genome-based methods, we estimated genetic parameters and predicted unobserved phenotypes. Our results showed large diversity, moderate heritability, and consistent predictive accuracies along the postharvest storage for most of the traits. Regarding the fruit quality, firmness showed the largest variation during postharvest storage, with a surprising number of genotypes maintaining or increasing their firmness, even after 7 weeks of cold storage. Our results suggest that we can effectively improve the blueberry postharvest quality through breeding and use genomic prediction to maximize the genetic gains in the long term. We also emphasize the potential of using longitudinal genomic prediction models to predict the fruit quality at extended postharvest periods by integrating known phenotypic data from harvest.
Affiliation(s)
- Gonzalo Casorzo
- Horticultural Sciences Department, University of Florida, Gainesville, FL 32608, USA
- Luis Felipe Ferrão
- Horticultural Sciences Department, University of Florida, Gainesville, FL 32608, USA
- Paul Adunola
- Horticultural Sciences Department, University of Florida, Gainesville, FL 32608, USA
- Camila Azevedo
- Department of Statistics, Federal University of Viçosa, Viçosa 36570, Brazil
- Rodrigo Amadeu
- Horticultural Sciences Department, University of Florida, Gainesville, FL 32608, USA
- Bayer US-Crop Science, Chesterfield, MO 63017, USA
- Patricio R Munoz
- Horticultural Sciences Department, University of Florida, Gainesville, FL 32608, USA
9
Lee AMJ, Foong MYM, Song BK, Chew FT. Genomic selection for crop improvement in fruits and vegetables: a systematic scoping review. MOLECULAR BREEDING : NEW STRATEGIES IN PLANT IMPROVEMENT 2024; 44:60. [PMID: 39267903 PMCID: PMC11391014 DOI: 10.1007/s11032-024-01497-2] [Received: 05/08/2024] [Accepted: 09/01/2024] [Indexed: 09/15/2024]
Abstract
To meet the nutritional needs of an expanding global population, it is crucial to optimize the growing capabilities and breeding values of fruit and vegetable crops. While genomic selection, initially implemented in animal breeding, holds tremendous potential, its utilization in fruit and vegetable crops remains underexplored. In this systematic review, we reviewed 63 articles covering genomic selection and its applications across 25 different types of fruit and vegetable crops over the last decade. The traits examined were directly related to the edible parts of the crops and carried significant economic importance. Comparative analysis with WHO/FAO data identified potential economic drivers underlying the study focus of some crops and highlighted crops with potential for further genomic selection research and application. Factors affecting genomic selection accuracy in fruit and vegetable studies are discussed and suggestions made to assist in their implementation into plant breeding schemes. Genetic gain in fruits and vegetables can be improved by utilizing genomic selection to improve selection intensity, accuracy, and integration of genetic variation. However, the reduction of breeding cycle times may not be beneficial in crops with shorter life cycles such as leafy greens as compared to fruit trees. There is an urgent need to integrate genomic selection methods into ongoing breeding programs and assess the actual genomic estimated breeding values of progeny resulting from these breeding programs against the prediction models.
Supplementary Information: The online version contains supplementary material available at 10.1007/s11032-024-01497-2.
Affiliation(s)
- Adrian Ming Jern Lee
- Department of Biological Sciences, National University of Singapore, 14 Science Drive 4, Singapore, 117543 Republic of Singapore
- NUS Agritech Centre, National University of Singapore, 85 Science Park Dr, #01-03, Singapore, 118258 Republic of Singapore
- Melissa Yuin Mern Foong
- School of Science, Monash University Malaysia, Bandar Sunway, 47500 Subang Jaya, Selangor Darul Ehsan Malaysia
- Beng Kah Song
- School of Science, Monash University Malaysia, Bandar Sunway, 47500 Subang Jaya, Selangor Darul Ehsan Malaysia
- Fook Tim Chew
- Department of Biological Sciences, National University of Singapore, 14 Science Drive 4, Singapore, 117543 Republic of Singapore
- NUS Agritech Centre, National University of Singapore, 85 Science Park Dr, #01-03, Singapore, 118258 Republic of Singapore
10
Inamori M, Kimura T, Mori M, Tarumoto Y, Hattori T, Hayano M, Umeda M, Iwata H. Machine learning for genomic and pedigree prediction in sugarcane. THE PLANT GENOME 2024; 17:e20486. [PMID: 38923818 DOI: 10.1002/tpg2.20486] [Received: 10/10/2023] [Revised: 05/07/2024] [Accepted: 05/08/2024] [Indexed: 06/28/2024]
Abstract
Sugarcane (Saccharum spp.) plays a crucial role in global sugar production; however, the efficiency of breeding programs has been hindered by its heterozygous polyploid genomes. Considering non-additive genetic effects is essential in genome prediction (GP) models of crops with highly heterozygous polyploid genomes. This study incorporates non-additive genetic effects and pedigree information using machine learning methods to track sugarcane breeding lines and enhance the prediction by assessing the degree of association between genotypes. This study measured the stalk biomass and sugar content of 297 clones from 87 families within a breeding population used in the Japanese sugarcane breeding program. Subsequently, we conducted analyses based on the marker genotypes of 33,149 single-nucleotide polymorphisms. To validate the accuracy of GP in the population, we first evaluated the prediction accuracy of the best linear unbiased prediction (BLUP) based on a genomic relationship matrix. Prediction accuracy was assessed using two different cross-validation methods: repeated 10-fold cross-validation and leave-one-family-out cross-validation. The accuracy of GP under the first and second methods ranged from 0.36 to 0.74 and from 0.15 to 0.63, respectively. Next, we compared the prediction accuracy of BLUP and two machine learning methods: random forests and simulation annealing ensemble (SAE), a newly developed machine learning method that explicitly models the interaction between variables. Both pedigree and genomic information were utilized as input in these methods. Through repeated 10-fold cross-validation, we found that the accuracy of the machine learning methods surpassed that of BLUP in most cases. In leave-one-family-out cross-validation, SAE demonstrated the highest accuracy among the methods. These results underscore the effectiveness of GP in Japanese sugarcane breeding and highlight the significant potential of machine learning methods.
Affiliation(s)
- Minoru Inamori
- Laboratory of Biometry and Bioinformatics, Department of Agricultural and Environmental Biology, Graduate School of Agricultural and Life Sciences, The University of Tokyo, Tokyo, Japan
- Tatsuro Kimura
- Toyota Motor Corporation, New Business Planning Division, Agriculture & Biotechnology Business Department, Toyota, Japan
- Masaaki Mori
- Toyota Motor Corporation, Environment Affairs and Engineering Management Division, CN Advanced Engineering Development Center, Tokyo, Japan
- Yusuke Tarumoto
- NARO Kyushu Okinawa Agricultural Research Center, Tanegashima Sugarcane Breeding Site, Nishinoomote, Japan
- Taiichiro Hattori
- NARO Kyushu Okinawa Agricultural Research Center, Tanegashima Sugarcane Breeding Site, Nishinoomote, Japan
- NARO Kyushu Okinawa Agricultural Research Center, Itoman Resident Office, Itoman, Japan
- Michiko Hayano
- NARO Kyushu Okinawa Agricultural Research Center, Tanegashima Sugarcane Breeding Site, Nishinoomote, Japan
- NARO Institute for Agro-Environmental Science, Tsukuba, Japan
- Makoto Umeda
- NARO Kyushu Okinawa Agricultural Research Center, Tanegashima Sugarcane Breeding Site, Nishinoomote, Japan
- Hiroyoshi Iwata
- Laboratory of Biometry and Bioinformatics, Department of Agricultural and Environmental Biology, Graduate School of Agricultural and Life Sciences, The University of Tokyo, Tokyo, Japan
11
|
Adunola P, Ferrão LFV, Benevenuto J, Azevedo CF, Munoz PR. Genomic selection optimization in blueberry: Data-driven methods for marker and training population design. THE PLANT GENOME 2024; 17:e20488. [PMID: 39087863 DOI: 10.1002/tpg2.20488] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/15/2024] [Revised: 04/25/2024] [Accepted: 06/04/2024] [Indexed: 08/02/2024]
Abstract
Genomic prediction is a modern approach that uses genome-wide markers to predict the genetic merit of unphenotyped individuals. With the potential to reduce the breeding cycles and increase the selection accuracy, this tool has been designed to rank genotypes and maximize genetic gains. Despite this importance, its practical implementation in breeding programs requires critical allocation of resources for its application in a predictive framework. In this study, we integrated genetic and data-driven methods to allocate resources for phenotyping and genotyping tailored to genomic prediction. To this end, we used a historical blueberry (Vaccinium corymbosum L.) breeding dataset containing more than 3000 individuals, genotyped using probe-based target sequencing and phenotyped for three fruit quality traits over several years. Our contribution in this study is threefold: (i) for the genotyping resource allocation, the use of genetic data-driven methods to select an optimal set of markers slightly improved prediction results for all the traits; (ii) for the long-term implication, we carried out a simulation study and showed that the data-driven method yields a slightly greater genetic gain over 30 cycles than random marker sampling; and (iii) for the phenotyping resource allocation, we compared different optimization algorithms to select the training population, showing that it can be leveraged to increase predictive performance. Altogether, we provided a data-oriented decision-making approach for breeders by demonstrating that critical breeding decisions associated with resource allocation for genomic prediction can be tackled through a combination of statistics and genetic methods.
Affiliation(s)
- Paul Adunola
- Blueberry Breeding and Genomics Lab, Horticultural Sciences Department, University of Florida, Gainesville, Florida, USA
- Luis Felipe V Ferrão
- Blueberry Breeding and Genomics Lab, Horticultural Sciences Department, University of Florida, Gainesville, Florida, USA
- Juliana Benevenuto
- Blueberry Breeding and Genomics Lab, Horticultural Sciences Department, University of Florida, Gainesville, Florida, USA
- Camila F Azevedo
- Blueberry Breeding and Genomics Lab, Horticultural Sciences Department, University of Florida, Gainesville, Florida, USA
- Statistics Department, Federal University of Viçosa, Viçosa, Minas Gerais, Brazil
- Patricio R Munoz
- Blueberry Breeding and Genomics Lab, Horticultural Sciences Department, University of Florida, Gainesville, Florida, USA
12
Aalborg T, Nielsen KL. To be or not to be tetraploid-the impact of marker ploidy on genomic prediction and GWAS of potato. FRONTIERS IN PLANT SCIENCE 2024; 15:1386837. [PMID: 39139728 PMCID: PMC11319270 DOI: 10.3389/fpls.2024.1386837] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/16/2024] [Accepted: 07/10/2024] [Indexed: 08/15/2024]
Abstract
Cultivated potato, Solanum tuberosum L., is considered an autotetraploid with 12 chromosomes with four homologous phases. However, recent evidence found that, due to frequent large phase deletions in the genome, gene ploidy is not constant across the genome. The elite cultivar "Otava" was found to have an average gene copy number of 3.2 across all loci. Breeding programs for elite potato cultivars rely increasingly on genomic prediction tools for selection breeding and elucidation of quantitative trait loci underpinning trait genetic variance. These are typically based on anonymous single nucleotide polymorphism (SNP) markers, which are usually called from, for example, SNP array or sequencing data using a tetraploid model. In this study, we analyzed the impact of using whole genome markers genotyped as either tetraploid or observed allele frequencies from genotype-by-sequencing data on single-trait additive genomic best linear unbiased prediction (GBLUP) genomic prediction (GP) models and single-marker regression genome-wide association studies of potato to evaluate the implications of capturing varying ploidy on the statistical models employed in genomic breeding. A panel of 762 offspring of a diallel cross of 18 parents of elite breeding material was used for modeling. These were genotyped by sequencing and phenotyped for five key performance traits: chipping quality, length/width ratio, senescence, dry matter content, and yield. We also estimated the read coverage required to confidently discriminate between a heterozygous triploid and tetraploid state from simulated data. It was found that using a tetraploid model neither impaired nor improved genomic predictions compared to using the observed allele frequencies that account for true marker ploidy. In genome-wide association studies (GWAS), very minor variations of both signal amplitude and number of SNPs supporting both minor and major quantitative trait loci (QTLs) were observed between the two data sets. However, all major QTLs were reproducible using both data sets.
Affiliation(s)
- Trine Aalborg
- Department of Chemistry and Bioscience, Aalborg University, Aalborg, Denmark
- Kåre Lehmann Nielsen
- Department of Chemistry and Bioscience, Aalborg University, Aalborg, Denmark
- Research and Development, Kartoffelmelcentralen (KMC) Amba, Brande, Denmark
13
Nazzicari N, Franguelli N, Ferrari B, Pecetti L, Annicchiarico P. The Effect of Genome Parametrization and SNP Marker Subsetting on Genomic Selection in Autotetraploid Alfalfa. Genes (Basel) 2024; 15:449. [PMID: 38674384 PMCID: PMC11050091 DOI: 10.3390/genes15040449] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/01/2024] [Revised: 03/21/2024] [Accepted: 03/27/2024] [Indexed: 04/28/2024]
Abstract
BACKGROUND Alfalfa, the most economically important forage legume worldwide, features modest genetic progress due to long selection cycles and the extent of the non-additive genetic variance associated with its autotetraploid genome. METHODS To improve the efficiency of genomic selection in alfalfa, we explored the effects of genome parametrization (as tetraploid and diploid dosages, plus allele ratios) and SNP marker subsetting (all available SNPs, only genic regions, and only non-genic regions) on genomic regressions, together with various levels of filtering on reading depth and missing rates. We used genotyping by sequencing-generated data and focused on traits of different genetic complexity, i.e., dry biomass yield in moisture-favorable (FE) and drought stress (SE) environments, leaf size, and the onset of flowering, which were assessed in 143 genotyped plants from a genetically broad European reference population and their phenotyped half-sib progenies. RESULTS On average, the allele ratio improved the predictive ability compared with other genome parametrizations (+7.9% vs. tetraploid dosage, +12.6% vs. diploid dosage), while using all the SNPs offered an advantage compared with any specific SNP subsetting (+3.7% vs. genic regions, +7.6% vs. non-genic regions). However, when focusing on specific traits, different combinations of genome parametrization and subsetting achieved better performances. We also released Legpipe2, an SNP calling pipeline tailored for reduced representation (GBS, RAD) in medium-sized genotyping experiments.
Affiliation(s)
- Nelson Nazzicari
- Council for Agricultural Research and Economics (CREA), Research Center for Animal Production and Aquaculture, Viale Piacenza 29, 26900 Lodi, Italy
14
Bilton TP, Sharma SK, Schofield MR, Black MA, Jacobs JME, Bryan GJ, Dodds KG. Construction of relatedness matrices in autopolyploid populations using low-depth high-throughput sequencing data. TAG. THEORETICAL AND APPLIED GENETICS. THEORETISCHE UND ANGEWANDTE GENETIK 2024; 137:64. [PMID: 38430392 PMCID: PMC10908621 DOI: 10.1007/s00122-024-04568-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/01/2023] [Accepted: 01/30/2024] [Indexed: 03/03/2024]
Abstract
KEY MESSAGE An improved estimator of genomic relatedness using low-depth high-throughput sequencing data for autopolyploids is developed. Its outputs strongly correlate with SNP array-based estimates and are available in the package GUSrelate. High-throughput sequencing (HTS) methods have reduced sequencing costs and resources compared to array-based tools, facilitating the investigation of many non-model polyploid species. One important quantity that can be computed from HTS data is the genetic relatedness between all individuals in a population. However, HTS data are often messy, with multiple sources of errors (i.e. sequencing errors or missing parental alleles) which, if not accounted for, can lead to bias in genomic relatedness estimates. We derive a new estimator for constructing a genomic relationship matrix (GRM) from HTS data for autopolyploid species that accounts for errors associated with low sequencing depths, implemented in the R package GUSrelate. Simulations revealed that GUSrelate performed similarly to existing GRM methods at high depth but reduced bias in self-relatedness estimates when the sequencing depth was low. Using a panel consisting of 351 tetraploid potato genotypes, we found that GUSrelate produced GRMs from genotyping-by-sequencing (GBS) data that were highly correlated with a GRM computed from SNP array data, and less biased than existing methods when benchmarking against the array-based GRM estimates. GUSrelate provides researchers with a tool to reliably construct GRMs from low-depth HTS data.
Affiliation(s)
- Timothy P Bilton
- AgResearch, Invermay Agricultural Centre, Mosgiel, New Zealand
- Department of Mathematics and Statistics, University of Otago, Dunedin, New Zealand
- Sanjeev Kumar Sharma
- Cell and Molecular Sciences, The James Hutton Institute, Invergowrie, Dundee, UK
- Matthew R Schofield
- Department of Mathematics and Statistics, University of Otago, Dunedin, New Zealand
- Michael A Black
- Department of Biochemistry, University of Otago, Dunedin, New Zealand
- Glenn J Bryan
- Cell and Molecular Sciences, The James Hutton Institute, Invergowrie, Dundee, UK
- Ken G Dodds
- AgResearch, Invermay Agricultural Centre, Mosgiel, New Zealand
15
Song H, Zhang Q, Hu H. polyGBLUP: a modified genomic best linear unbiased prediction improved the genomic prediction efficiency for autopolyploid species. Brief Bioinform 2024; 25:bbae106. [PMID: 38517695 PMCID: PMC10959164 DOI: 10.1093/bib/bbae106] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/30/2023] [Revised: 12/22/2023] [Accepted: 02/26/2024] [Indexed: 03/24/2024]
Abstract
Given the universality of autopolyploid species in nature, it is crucial to develop genomic selection methods that consider different allele dosages for autopolyploid breeding. However, no method has been developed to deal with autopolyploid data regardless of the ploidy level. In this study, we developed a modified genomic best linear unbiased prediction (GBLUP) model (polyGBLUP) by constructing additive and dominant genomic relationship matrices based on different allele dosages. polyGBLUP could carry out genomic prediction for autopolyploid species regardless of the ploidy level. Through comprehensive simulations and analysis of real data of autotetraploid blueberry and guinea grass and autohexaploid sweet potato, the results showed that polyGBLUP achieved higher prediction accuracy than GBLUP and its superiority was more obvious when the ploidy level of autopolyploids is high. Furthermore, when the dominant effect was added to polyGBLUP (polyGDBLUP), the greater the dominance degree, the more obvious the advantages of polyGDBLUP over the diploid models in terms of prediction accuracy, bias, mean squared error and mean absolute error. For real data, the superiority of polyGBLUP over GBLUP appeared in blueberry and sweet potato populations and a part of the traits in guinea grass population due to the high correlation coefficients between diploid and polyploidy genomic relationship matrices. In addition, polyGDBLUP did not produce higher prediction accuracy than polyGBLUP for most traits of real data as dominant genetic variance was not captured for these traits. Our method is a promising approach for genomic prediction of autopolyploid species.
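To make the idea of a dosage-based additive genomic relationship matrix concrete, here is a minimal sketch. It uses a common VanRaden-style generalization to arbitrary ploidy, G = ZZ' / (ploidy × Σ p(1 − p)), which is an assumption chosen for illustration and not necessarily the exact formula used in polyGBLUP. The function name `additive_grm` and the toy dosage matrix are hypothetical.

```python
def additive_grm(dosages, ploidy):
    """Additive relationship matrix from allele dosages (0..ploidy).

    dosages: list of rows, one per individual, one column per marker.
    Assumed scaling: G = Z Z' / (ploidy * sum_j p_j * (1 - p_j)),
    where Z holds dosages centered on ploidy * p_j.
    """
    n = len(dosages)
    m = len(dosages[0])
    # Allele frequency per marker, estimated from the mean dosage.
    freqs = [sum(row[j] for row in dosages) / (ploidy * n) for j in range(m)]
    scale = ploidy * sum(p * (1.0 - p) for p in freqs)
    if scale == 0:
        raise ValueError("all markers are monomorphic; the matrix is undefined")
    centered = [[row[j] - ploidy * freqs[j] for j in range(m)] for row in dosages]
    return [
        [sum(a * b for a, b in zip(row_i, row_k)) / scale for row_k in centered]
        for row_i in centered
    ]


# Hypothetical tetraploid example: four individuals, three markers.
dosages = [
    [0, 2, 4],
    [1, 2, 3],
    [4, 2, 0],
    [3, 2, 1],
]
G = additive_grm(dosages, ploidy=4)
```

With `ploidy=2` the same function reduces to the familiar diploid form, which is the sense in which such a matrix works "regardless of the ploidy level".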
Affiliation(s)
- Hailiang Song
- Fisheries Science Institute, Beijing Academy of Agriculture and Forestry Sciences & Beijing Key Laboratory of Fisheries Biotechnology, Beijing 100068, China
- Key Laboratory of Sturgeon Genetics and Breeding, Ministry of Agriculture and Rural Affairs, Hangzhou, 311799, China
- Qin Zhang
- Shandong Provincial Key Laboratory of Animal Biotechnology and Disease Control and Prevention, Shandong Agricultural University, Taian 271001, China
- Hongxia Hu
- Fisheries Science Institute, Beijing Academy of Agriculture and Forestry Sciences & Beijing Key Laboratory of Fisheries Biotechnology, Beijing 100068, China
- Key Laboratory of Sturgeon Genetics and Breeding, Ministry of Agriculture and Rural Affairs, Hangzhou, 311799, China
16
Njuguna JN, Clark LV, Lipka AE, Anzoua KG, Bagmet L, Chebukin P, Dwiyanti MS, Dzyubenko E, Dzyubenko N, Ghimire BK, Jin X, Johnson DA, Kjeldsen JB, Nagano H, de Bem Oliveira I, Peng J, Petersen KK, Sabitov A, Seong ES, Yamada T, Yoo JH, Yu CY, Zhao H, Munoz P, Long SP, Sacks EJ. Impact of genotype-calling methodologies on genome-wide association and genomic prediction in polyploids. THE PLANT GENOME 2023; 16:e20401. [PMID: 37903749 DOI: 10.1002/tpg2.20401] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/29/2023] [Revised: 09/17/2023] [Accepted: 09/23/2023] [Indexed: 11/01/2023]
Abstract
Discovery and analysis of genetic variants underlying agriculturally important traits are key to molecular breeding of crops. Reduced representation approaches have provided cost-efficient genotyping using next-generation sequencing. However, accurate genotype calling from next-generation sequencing data is challenging, particularly in polyploid species due to their genome complexity. Recently developed Bayesian statistical methods implemented in available software packages, polyRAD, EBG, and updog, incorporate error rates and population parameters to accurately estimate allelic dosage across any ploidy. We used empirical and simulated data to evaluate the three Bayesian algorithms and demonstrated their impact on the power of genome-wide association study (GWAS) analysis and the accuracy of genomic prediction. We further incorporated uncertainty in allelic dosage estimation by testing continuous genotype calls and comparing their performance to discrete genotypes in GWAS and genomic prediction. We tested the genotype-calling methods using data from two autotetraploid species, Miscanthus sacchariflorus and Vaccinium corymbosum, and performed GWAS and genomic prediction. In the empirical study, the tested Bayesian genotype-calling algorithms differed in their downstream effects on GWAS and genomic prediction, with some showing advantages over others. Through subsequent simulation studies, we observed that at low read depth, polyRAD was advantageous in its effect on GWAS power and limit of false positives. Additionally, we found that continuous genotypes increased the accuracy of genomic prediction, by reducing genotyping error, particularly at low sequencing depth. Our results indicate that by using the Bayesian algorithm implemented in polyRAD and continuous genotypes, we can accurately and cost-efficiently implement GWAS and genomic prediction in polyploid crops.
Affiliation(s)
- Joyce N Njuguna
- Department of Crop Sciences, University of Illinois Urbana-Champaign, Urbana, Illinois, USA
- Lindsay V Clark
- Research Scientific Computing, Seattle Children's Research Institute, Seattle, Washington, USA
- Alexander E Lipka
- Department of Crop Sciences, University of Illinois Urbana-Champaign, Urbana, Illinois, USA
- Kossonou G Anzoua
- Field Science Center for Northern Biosphere, Hokkaido University, Sapporo, Japan
- Larisa Bagmet
- Vavilov All-Russian Institute of Plant Genetic Resources, St. Petersburg, Russian Federation
- Pavel Chebukin
- FSBSI "FSC of Agricultural Biotechnology of the Far East named after A.K. Chaiki", Ussuriysk, Russian Federation
- Maria S Dwiyanti
- Field Science Center for Northern Biosphere, Hokkaido University, Sapporo, Japan
- Elena Dzyubenko
- Vavilov All-Russian Institute of Plant Genetic Resources, St. Petersburg, Russian Federation
- Nicolay Dzyubenko
- Vavilov All-Russian Institute of Plant Genetic Resources, St. Petersburg, Russian Federation
- Bimal Kumar Ghimire
- Department of Crop Science, College of Sanghuh Life Science, Konkuk University, Seoul, South Korea
- Xiaoli Jin
- Agronomy Department, Key Laboratory of Crop Germplasm Research of Zhejiang Province, Zhejiang University, Hangzhou, China
- Douglas A Johnson
- USDA-ARS Forage and Range Research Lab, Utah State University, Logan, Utah, USA
- Hironori Nagano
- Field Science Center for Northern Biosphere, Hokkaido University, Sapporo, Japan
- Junhua Peng
- Spring Valley Agriscience Co. Ltd., Jinan, China
- Andrey Sabitov
- Vavilov All-Russian Institute of Plant Genetic Resources, St. Petersburg, Russian Federation
- Eun Soo Seong
- Division of Bioresource Sciences, Kangwon National University, Chuncheon, South Korea
- Toshihiko Yamada
- Field Science Center for Northern Biosphere, Hokkaido University, Sapporo, Japan
- Ji Hye Yoo
- Bioherb Research Institute, Kangwon National University, Chuncheon, South Korea
- Chang Yeon Yu
- Bioherb Research Institute, Kangwon National University, Chuncheon, South Korea
- Hua Zhao
- Key Laboratory of Horticultural Plant Biology of Ministry of Education, Huazhong Agricultural University, Wuhan, China
- Patricio Munoz
- Horticultural Science Department, University of Florida, Gainesville, Florida, USA
- Stephen P Long
- Department of Crop Sciences, University of Illinois Urbana-Champaign, Urbana, Illinois, USA
- Erik J Sacks
- Department of Crop Sciences, University of Illinois Urbana-Champaign, Urbana, Illinois, USA
17
Mertten D, Baldwin S, Cheng CH, McCallum J, Thomson S, Ashton DT, McKenzie CM, Lenhard M, Datson PM. Implementation of different relationship estimate methodologies in breeding value prediction in kiwiberry (Actinidia arguta). MOLECULAR BREEDING: NEW STRATEGIES IN PLANT IMPROVEMENT 2023; 43:75. [PMID: 37868140 PMCID: PMC10584781 DOI: 10.1007/s11032-023-01419-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/08/2023] [Accepted: 10/02/2023] [Indexed: 10/24/2023]
Abstract
In dioecious crops such as Actinidia arguta (kiwiberries), some of the main challenges when breeding for fruit characteristics are the selection of potential male parents and the long juvenile period. Currently, breeding values of male parents are estimated through progeny tests, which makes the breeding of new kiwiberry cultivars time-consuming and costly. The application of best linear unbiased prediction (BLUP) would allow direct estimation of sex-related traits and speed up kiwiberry breeding. In this study, we used a linear mixed model approach to estimate narrow sense heritability for one vine-related trait and five fruit-related traits for two incomplete factorial crossing designs. We obtained BLUPs for all genotypes, taking into consideration whether the relationship was pedigree-based or marker-based. Owing to the high cost of genome sequencing, it is important to understand the effects of different sources of relationship matrices on estimating breeding values across a breeding population. Because of the increasing implementation of genomic selection in crop breeding, we compared the effects of incorporating different sources of information in building relationship matrices and ploidy levels on the accuracy of BLUPs' heritability and predictive ability. As kiwiberries are autotetraploids, multivalent chromosome formation and occasionally double reduction can occur during meiosis, and this can affect the accuracy of prediction. This study innovates the breeding programme of autotetraploid kiwiberries. We demonstrate that the accuracy of BLUPs of male siblings, without phenotypic observations, strongly improved when a tetraploid marker-based relationship matrix was used rather than parental BLUPs and female siblings with phenotypic observations. Supplementary Information The online version contains supplementary material available at 10.1007/s11032-023-01419-8.
Affiliation(s)
- Daniel Mertten
- The New Zealand Institute for Plant and Food Research Ltd (PFR), Auckland, 1142 New Zealand
- Institute for Biochemistry and Biology, University of Potsdam, 14476 Potsdam-Golm, Germany
- Michael Lenhard
- Institute for Biochemistry and Biology, University of Potsdam, 14476 Potsdam-Golm, Germany
18
Ortiz R, Reslow F, Vetukuri R, García-Gil MR, Pérez-Rodríguez P, Crossa J. Inbreeding Effects on the Performance and Genomic Prediction for Polysomic Tetraploid Potato Offspring Grown at High Nordic Latitudes. Genes (Basel) 2023; 14:1302. [PMID: 37372482 DOI: 10.3390/genes14061302] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 05/27/2023] [Revised: 06/18/2023] [Accepted: 06/19/2023] [Indexed: 06/29/2023]
Abstract
Inbreeding depression (ID) is caused by increased homozygosity in the offspring after selfing. Although the self-compatible, highly heterozygous, tetrasomic polyploid potato (Solanum tuberosum L.) suffers from ID, some argue that the potential genetic gains from using inbred lines in a sexual propagation system of potato are too large to be ignored. The aim of this research was to assess the effects of inbreeding on potato offspring performance under a high latitude and the accuracy of the genomic prediction of breeding values (GEBVs) for further use in selection. Four inbred (S1) and two hybrid (F1) offspring and their parents (S0) were used in the experiment, with a field layout of an augmented design with the four S0 replicated in nine incomplete blocks comprising 100, four-plant plots at Umeå (63°49'30″ N 20°15'50″ E), Sweden. S0 was significantly (p < 0.01) better than both S1 and F1 offspring for tuber weight (total and according to five grading sizes), tuber shape and size uniformity, tuber eye depth and reducing sugars in the tuber flesh, while F1 was significantly (p < 0.01) better than S1 for all tuber weight and uniformity traits. Some F1 hybrid offspring (15-19%) had better total tuber yield than the best-performing parent. The GEBV accuracy ranged from -0.3928 to 0.4436. Overall, tuber shape uniformity had the highest GEBV accuracy, while tuber weight traits exhibited the lowest accuracy. The F1 full sib's GEBV accuracy was higher, on average, than that of S1. Genomic prediction may facilitate eliminating undesired inbred or hybrid offspring for further use in the genetic betterment of potato.
Affiliation(s)
- Rodomiro Ortiz
- Department of Plant Breeding, Swedish University of Agricultural Sciences (SLU), SE 23436 Lomma, Sweden
- Umeå Plant Science Center, SLU Department of Forest Genetics and Plant Physiology, Swedish University of Agricultural Sciences (SLU), SE 90183 Umeå, Sweden
- Fredrik Reslow
- Department of Plant Breeding, Swedish University of Agricultural Sciences (SLU), SE 23436 Lomma, Sweden
- Ramesh Vetukuri
- Department of Plant Breeding, Swedish University of Agricultural Sciences (SLU), SE 23436 Lomma, Sweden
- M Rosario García-Gil
- Umeå Plant Science Center, SLU Department of Forest Genetics and Plant Physiology, Swedish University of Agricultural Sciences (SLU), SE 90183 Umeå, Sweden
- José Crossa
- Colegio de Postgraduados (COLPOS), Montecillos 56230, Edo. de México, Mexico
- International Maize and Wheat Improvement Center (CIMMYT), El Batán, Texcoco 56237, Edo. de México, Mexico
19
Filho CCF, Andrade MHML, Nunes JAR, Jarquin DH, Rios EF. Genomic prediction for complex traits across multiples harvests in alfalfa (Medicago sativa L.) is enhanced by enviromics. THE PLANT GENOME 2023:e20306. [PMID: 36815221 DOI: 10.1002/tpg2.20306] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/12/2022] [Accepted: 12/17/2022] [Indexed: 06/18/2023]
Abstract
Breeding for dry matter yield and persistence in alfalfa (Medicago sativa L.) can take several years as these traits must be evaluated under multiple harvests. Therefore, genotype-by-harvest interaction should be incorporated into genomic prediction models to explore genotypes' adaptability and stability. In this study, we investigated how enviromics could help to predict the genotypic performance under multiharvest alfalfa breeding trials by evaluating 177 families across 11 harvests under four cross-validation scenarios. All scenarios were analyzed using six models in a Bayesian mixed model framework. Our results demonstrate that models accounting for the enviromic information led to an increase in genetic variance and a decrease in the error variance, indicating better biological explanation when the enviromic information was incorporated. Furthermore, models that accounted for enviromic data led to higher predictive ability (PA) when a reduced number of harvests was used in the training data set. The best enviromic models (M2 and M3) outperformed the base model (GBLUP model-M0) for predicting adaptability and persistence across all cross-validation scenarios. Incorporating environmental covariates also provided higher PA for persistence compared with the base model, as predictions increased from 0 to 0.16, 0.20, 0.56, and 0.46 for CV00, CV1, CV0, and CV2. The results also demonstrate that GBLUP without an enviromics term has low power to predict persistence; thus, the adoption of enviromics is a cheap and efficient alternative to increase accuracy and biological meaning.
Affiliation(s)
- José Airton Rodrigues Nunes
- Departamento de Biologia, Instituto de Ciências Naturais, Universidade Federal de Lavras, Lavras, Minas Gerais, Brazil
20
Li Z, Gao E, Zhou J, Han W, Xu X, Gao X. Applications of deep learning in understanding gene regulation. CELL REPORTS METHODS 2023; 3:100384. [PMID: 36814848 PMCID: PMC9939384 DOI: 10.1016/j.crmeth.2022.100384] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Indexed: 05/22/2023]
Abstract
Gene regulation is a central topic in cell biology. Advances in omics technologies and the accumulation of omics data have provided better opportunities for gene regulation studies than ever before. For this reason deep learning, as a data-driven predictive modeling approach, has been successfully applied to this field during the past decade. In this article, we aim to give a brief yet comprehensive overview of representative deep-learning methods for gene regulation. Specifically, we discuss and compare the design principles and datasets used by each method, creating a reference for researchers who wish to replicate or improve existing methods. We also discuss the common problems of existing approaches and prospectively introduce the emerging deep-learning paradigms that will potentially alleviate them. We hope that this article will provide a rich and up-to-date resource and shed light on future research directions in this area.
Affiliation(s)
- Zhongxiao Li
- Computer Science Program, Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
- KAUST Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
| | - Elva Gao
- The KAUST School, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
| | - Juexiao Zhou
- Computer Science Program, Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
- KAUST Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
| | - Wenkai Han
- Computer Science Program, Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
- KAUST Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
| | - Xiaopeng Xu
- Computer Science Program, Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
- KAUST Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
| | - Xin Gao
- Computer Science Program, Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
- KAUST Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
21
Annicchiarico P, Nazzicari N, Bouizgaren A, Hayek T, Laouar M, Cornacchione M, Basigalup D, Monterrubio Martin C, Brummer EC, Pecetti L. Alfalfa genomic selection for different stress-prone growing regions. THE PLANT GENOME 2022; 15:e20264. [PMID: 36222346 DOI: 10.1002/tpg2.20264] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 02/24/2022] [Accepted: 08/25/2022] [Indexed: 06/16/2023]
Abstract
Alfalfa (Medicago sativa L.) selection for stress-prone regions has high priority for sustainable crop-livestock systems. This study assessed the genomic selection (GS) ability to predict alfalfa breeding values for drought-prone agricultural sites of Algeria, Morocco, and Argentina; managed-stress (MS) environments of Italy featuring moderate or intense drought; and one Tunisian site irrigated with moderately saline water. Additional aims were to investigate genotype × environment interaction (GEI) patterns and the effect on GS predictions of three single-nucleotide polymorphism (SNP) calling procedures, 12 statistical models that exclude or incorporate GEI, and allele dosage information. Our study included 127 genotypes from a Mediterranean reference population originated from three geographically contrasting populations, genotyped via genotyping-by-sequencing and phenotyped based on multi-year biomass dry matter yield of their dense-planted half-sib progenies. The GEI was very large, as shown by 27-fold greater additive genetic variance × environment interaction relative to the additive genetic variance and low genetic correlation for progeny yield responses across environments. The predictive ability of GS (using at least 37,969 SNP markers) exceeded 0.20 for moderate MS (representing Italian stress-prone sites) and the sites of Algeria and Argentina while being quite low for the Tunisian site and intense MS. Predictions of GS were complicated by rapid linkage disequilibrium decay. The weighted GBLUP model, GEI incorporation into GS models, and SNP calling based on a mock reference genome exhibited a predictive ability advantage for some environments. Our results support the specific breeding for each target region and suggest a positive role for GS in most regions when considering the challenges associated with phenotypic selection.
Affiliation(s)
- Paolo Annicchiarico
- Consiglio per la Ricerca in Agricoltura e l'Analisi dell'Economia Agraria, Centro di ricerca Zootecnia e Acquacoltura, 29 viale Piacenza, Lodi, 26900, Italy
- Nelson Nazzicari
- Consiglio per la Ricerca in Agricoltura e l'Analisi dell'Economia Agraria, Centro di ricerca Zootecnia e Acquacoltura, 29 viale Piacenza, Lodi, 26900, Italy
- Abdelaziz Bouizgaren
- Institut National de la Recherche Agronomique, Centre Régional de Marrakech, BP 533, Marrakech, 40000, Morocco
- Taoufik Hayek
- Institut des Régions Arides, Route du Jorf, Médenine, 4119, Tunisia
- Meriem Laouar
- Ecole Nationale Supérieure Agronomique, Dép. de Productions Végétales. Laboratoire d'Amélioration Intégrative des Productions Végétales (C2711100), Rue Hassen Badi, El Harrach 16200, Alger, Algérie
- Monica Cornacchione
- Instituto Nacional de Tecnología Agropecuaria, Estación Experimental Santiago del Estero, Jujuy 850, Santiago del Estero, 4200, Argentina
- Daniel Basigalup
- Instituto Nacional de Tecnología Agropecuaria, Estación Experimental Manfredi, Ruta Nacional no. 9 km 636, Manfredi, Córdoba, X5988, Argentina
- Cristina Monterrubio Martin
- Consiglio per la Ricerca in Agricoltura e l'Analisi dell'Economia Agraria, Centro di ricerca Zootecnia e Acquacoltura, 29 viale Piacenza, Lodi, 26900, Italy
- Edward Charles Brummer
- Plant Breeding Center, Dep. of Plant Sciences, Univ. of California, Davis, CA, 95616, USA
- Luciano Pecetti
- Consiglio per la Ricerca in Agricoltura e l'Analisi dell'Economia Agraria, Centro di ricerca Zootecnia e Acquacoltura, 29 viale Piacenza, Lodi, 26900, Italy
22
Mbo Nkoulou LF, Ngalle HB, Cros D, Adje COA, Fassinou NVH, Bell J, Achigan-Dako EG. Perspective for genomic-enabled prediction against black sigatoka disease and drought stress in polyploid species. FRONTIERS IN PLANT SCIENCE 2022; 13:953133. [PMID: 36388523 PMCID: PMC9650417 DOI: 10.3389/fpls.2022.953133] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/25/2022] [Accepted: 09/28/2022] [Indexed: 06/16/2023]
Abstract
Genomic selection (GS) in plant breeding is explored as a promising tool to solve the problems related to the biotic and abiotic threats. Polyploid plants like bananas (Musa spp.) face the problem of drought and black sigatoka disease (BSD) that restrict their production. The conventional plant breeding is experiencing difficulties, particularly phenotyping costs and long generation interval. To overcome these difficulties, GS in plant breeding is explored as an alternative with a great potential for reducing costs and time in selection process. So far, GS does not have the same success in polyploid plants as with diploid plants because of the complexity of their genome. In this review, we present the main constraints to the application of GS in polyploid plants and the prospects for overcoming these constraints. Particular emphasis is placed on breeding for BSD and drought, two major threats to banana production, used in this review as a model of polyploid plant. It emerges that the difficulty in obtaining markers of good quality in polyploids is the first challenge of GS on polyploid plants, because the main tools used were developed for diploid species. In addition to that, there is a big challenge of mastering genetic interactions such as dominance and epistasis effects as well as the genotype by environment interaction, which are very common in polyploid plants. To get around these challenges, we have presented bioinformatics tools, as well as artificial intelligence approaches, including machine learning. Furthermore, a scheme for applying GS to banana for BSD and drought has been proposed. This review is of paramount impact for breeding programs that seek to reduce the selection cycle of polyploids despite the complexity of their genome.
Affiliation(s)
- Luther Fort Mbo Nkoulou
- Genetics, Biotechnology, and Seed Science Unit (GBioS), Department of Plant Sciences, Faculty of Agronomic Sciences, University of Abomey Calavi, Cotonou, Benin
- Unit of Genetics and Plant Breeding (UGAP), Department of Plant Biology, Faculty of Sciences, University of Yaoundé 1, Yaoundé, Cameroon
- Institute of Agricultural Research for Development, Centre de Recherche Agricole de Mbalmayo (CRAM), Mbalmayo, Cameroon
- Hermine Bille Ngalle
- Unit of Genetics and Plant Breeding (UGAP), Department of Plant Biology, Faculty of Sciences, University of Yaoundé 1, Yaoundé, Cameroon
- David Cros
- Centre de Coopération Internationale en Recherche Agronomique pour le Développement (CIRAD), Unité Mixte de Recherche (UMR) Amélioration Génétique et Adaptation des Plantes méditerranéennes et tropicales (AGAP) Institut, Montpellier, France
- Unité Mixte de Recherche (UMR) Amélioration Génétique et Adaptation des Plantes méditerranéennes et tropicales (AGAP) Institut, University of Montpellier, Centre de Coopération Internationale en Recherche Agronomique pour le Développement (CIRAD), Institut National de Recherche pour l’Agriculture, l’Alimentation et l’Environnement (INRAE), Institut Agro, Montpellier, France
- Charlotte O. A. Adje
- Genetics, Biotechnology, and Seed Science Unit (GBioS), Department of Plant Sciences, Faculty of Agronomic Sciences, University of Abomey Calavi, Cotonou, Benin
- Nicodeme V. H. Fassinou
- Genetics, Biotechnology, and Seed Science Unit (GBioS), Department of Plant Sciences, Faculty of Agronomic Sciences, University of Abomey Calavi, Cotonou, Benin
- Joseph Bell
- Unit of Genetics and Plant Breeding (UGAP), Department of Plant Biology, Faculty of Sciences, University of Yaoundé 1, Yaoundé, Cameroon
- Enoch G. Achigan-Dako
- Genetics, Biotechnology, and Seed Science Unit (GBioS), Department of Plant Sciences, Faculty of Agronomic Sciences, University of Abomey Calavi, Cotonou, Benin
23
Murad Leite Andrade MH, Acharya JP, Benevenuto J, de Bem Oliveira I, Lopez Y, Munoz P, Resende MFR, Rios EF. Genomic prediction for canopy height and dry matter yield in alfalfa using family bulks. THE PLANT GENOME 2022; 15:e20235. [PMID: 35818699 DOI: 10.1002/tpg2.20235] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 03/16/2022] [Accepted: 04/30/2022] [Indexed: 06/15/2023]
Abstract
Genomic selection (GS) has proven to be an effective method to increase genetic gain rates and accelerate breeding cycles in many crop species. However, its implementation requires large investments to phenotype the training population and for routine genotyping. Alfalfa (Medicago sativa L.) is one of the major cultivated forage legumes, showing high-quality nutritional value. Alfalfa breeding is usually carried out by phenotypic recurrent selection and is commonly done at the family level. The application of GS in alfalfa could be simplified and less costly by genotyping and phenotyping families in bulks. For this study, an alfalfa reference population composed of 142 full-sib and 35 half-sib families was bulk-genotyped using target enrichment sequencing and phenotyped for dry matter yield (DMY) and canopy height (CH) in Florida, USA. Genotyping of the family bulks with 17,707 targeted probes resulted in 114,945 single-nucleotide polymorphisms. The markers revealed a population structure that matched the mating design, and the linkage disequilibrium slowly decayed in this breeding population. After exploring multiple prediction scenarios, a strategy was proposed including data from multiple harvests and accounting for the G×E in the training population, which led to a higher predictive ability of up to 38 and 24% for DMY and CH, respectively. Although this study focused on the implementation of GS in alfalfa families, the bulk methodology and the prediction schemes used herein could guide future studies in alfalfa and other crops bred in bulks.
Affiliation(s)
- Janam P Acharya
- Agronomy Dep., Univ. of Florida, Gainesville, FL, 32611, USA
- Juliana Benevenuto
- Horticultural Sciences Dep., Univ. of Florida, Gainesville, FL, 32611, USA
- Yolanda Lopez
- Agronomy Dep., Univ. of Florida, Gainesville, FL, 32611, USA
- Patricio Munoz
- Horticultural Sciences Dep., Univ. of Florida, Gainesville, FL, 32611, USA
- Marcio F R Resende
- Horticultural Sciences Dep., Univ. of Florida, Gainesville, FL, 32611, USA
- Esteban F Rios
- Agronomy Dep., Univ. of Florida, Gainesville, FL, 32611, USA
24
González Silos R, Fischer C, Lorenzo Bermejo J. NGS allele counts versus called genotypes for testing genetic association. Comput Struct Biotechnol J 2022; 20:3729-3733. [PMID: 35891781 PMCID: PMC9294184 DOI: 10.1016/j.csbj.2022.07.016] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/13/2022] [Revised: 07/07/2022] [Accepted: 07/07/2022] [Indexed: 11/28/2022]
Abstract
RNA sequence data are commonly summarized as read counts. By contrast, so far there is no alternative to genotype calling for investigating the relationship between genetic variants determined by next-generation sequencing (NGS) and a phenotype of interest. Here we propose and evaluate the direct analysis of allele counts for genetic association tests. Specifically, we assess the potential advantage of the ratio of alternative allele counts to the total number of reads aligned at a specific position of the genome (coverage) over called genotypes. We simulated association studies based on NGS data from HapMap individuals. Genotype quality scores and allele counts were simulated using NGS data from the Personal Genome Project. Real data from the 1000 Genomes Project was also used to compare the two competing approaches. The average proportions of probability values lower or equal to 0.05 amounted to 0.0496 for called genotypes and 0.0485 for the ratio of alternative allele counts to coverage in the null scenario, and to 0.69 for called genotypes and 0.75 for the ratio of alternative allele counts to coverage in the alternative scenario (9% power increase). The advantage in statistical power of the novel approach increased with decreasing coverage, with decreasing genotype quality and with decreasing allele frequency – 124% power increase for variants with a minor allele frequency lower than 0.05. We provide computer code in R to implement the novel approach, which does not preclude the use of complementary data quality filters before or after identification of the most promising association signals.
Author summary
Genetic association tests usually rely on called genotypes. We postulate here that the direct analysis of allele counts from sequence data improves the quality of statistical inference. To evaluate this hypothesis, we investigate simulated and real data using distinct statistical approaches. We demonstrate that association tests based on allele counts rather than called genotypes achieve higher statistical power with controlled type I error rates.
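The contrast drawn in this abstract, between a called genotype and the ratio of alternative allele counts to coverage as the predictor in an association test, can be sketched roughly as follows. This is only an illustration in Python, not the authors' R code: the function names, the hard-call thresholds (0.2 and 0.8), the toy read counts, and the use of a plain least-squares slope as the association measure are all assumptions made for the example.

```python
# Illustrative sketch only: names, thresholds and data are assumptions,
# not the implementation provided with the paper.

def allele_ratio(alt_reads, coverage):
    """Alternative-allele reads divided by total reads at a position."""
    if coverage == 0:
        return None  # no reads aligned: the ratio is undefined
    return alt_reads / coverage


def called_genotype(alt_reads, coverage, low=0.2, high=0.8):
    """Hypothetical hard call (0, 1 or 2 alternative alleles) from the ratio."""
    ratio = allele_ratio(alt_reads, coverage)
    if ratio is None:
        return None
    if ratio < low:
        return 0
    if ratio > high:
        return 2
    return 1


def ols_slope(x, y):
    """Least-squares slope of phenotype y on predictor x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / sxx


# Toy data: (alternative reads, coverage) per individual, plus a phenotype.
reads = [(0, 10), (3, 12), (6, 12), (9, 10)]
phenotype = [1.0, 1.4, 2.1, 2.9]

ratios = [allele_ratio(a, c) for a, c in reads]
calls = [called_genotype(a, c) for a, c in reads]

slope_ratio = ols_slope(ratios, phenotype)  # predictor: allele-count ratio
slope_calls = ols_slope(calls, phenotype)   # predictor: called genotype
```

In this toy example the second and third individuals receive the same called genotype although their allele-count ratios differ, which is the kind of information a hard call discards.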
Affiliation(s)
- Christine Fischer
- Institute of Human Genetics, University of Heidelberg, 69120, Germany
25
Edger PP, Iorizzo M, Bassil NV, Benevenuto J, Ferrão LFV, Giongo L, Hummer K, Lawas LMF, Leisner CP, Li C, Munoz PR, Ashrafi H, Atucha A, Babiker EM, Canales E, Chagné D, DeVetter L, Ehlenfeldt M, Espley RV, Gallardo K, Günther CS, Hardigan M, Hulse-Kemp AM, Jacobs M, Lila MA, Luby C, Main D, Mengist MF, Owens GL, Perkins-Veazie P, Polashock J, Pottorff M, Rowland LJ, Sims CA, Song GQ, Spencer J, Vorsa N, Yocca AE, Zalapa J. There and back again; historical perspective and future directions for Vaccinium breeding and research studies. HORTICULTURE RESEARCH 2022; 9:uhac083. [PMID: 35611183 PMCID: PMC9123236 DOI: 10.1093/hr/uhac083] [Citation(s) in RCA: 20] [Impact Index Per Article: 6.7] [Received: 01/19/2022] [Accepted: 03/22/2022] [Indexed: 06/02/2023]
Abstract
The genus Vaccinium L. (Ericaceae) contains a wide diversity of culturally and economically important berry crop species. Consumer demand and scientific research in blueberry (Vaccinium spp.) and cranberry (Vaccinium macrocarpon) have increased worldwide over the crops' relatively short domestication history (~100 years). Other species, including bilberry (Vaccinium myrtillus), lingonberry (Vaccinium vitis-idaea), and ohelo berry (Vaccinium reticulatum) are largely still harvested from the wild but with crop improvement efforts underway. Here, we present a review article on these Vaccinium berry crops on topics that span taxonomy to genetics and genomics to breeding. We highlight the accomplishments made thus far for each of these crops, along their journey from the wild, and propose research areas and questions that will require investments by the community over the coming decades to guide future crop improvement efforts. New tools and resources are needed to underpin the development of superior cultivars that are not only more resilient to various environmental stresses and higher yielding, but also produce fruit that continue to meet a variety of consumer preferences, including fruit quality and health related traits.
Affiliation(s)
- Patrick P Edger
- Department of Horticulture, Michigan State University, East Lansing, MI, 48824, USA
- MSU AgBioResearch, Michigan State University, East Lansing, MI, 48824, USA
- Massimo Iorizzo
- Plants for Human Health Institute, North Carolina State University, Kannapolis, NC USA
- Department of Horticultural Science, North Carolina State University, Raleigh, NC USA
- Nahla V Bassil
- USDA-ARS, National Clonal Germplasm Repository, Corvallis, OR 97333, USA
- Juliana Benevenuto
- Horticultural Sciences Department, University of Florida, Gainesville, FL 32611, USA
- Luis Felipe V Ferrão
- Horticultural Sciences Department, University of Florida, Gainesville, FL 32611, USA
- Lara Giongo
- Fondazione Edmund Mach - Research and Innovation Centre, Italy
- Kim Hummer
- USDA-ARS, National Clonal Germplasm Repository, Corvallis, OR 97333, USA
- Lovely Mae F Lawas
- Department of Biological Sciences, Auburn University, Auburn, AL 36849, USA
- Courtney P Leisner
- Department of Biological Sciences, Auburn University, Auburn, AL 36849, USA
- Changying Li
- Phenomics and Plant Robotics Center, College of Engineering, University of Georgia, Athens, USA
- Patricio R Munoz
- Horticultural Sciences Department, University of Florida, Gainesville, FL 32611, USA
- Hamid Ashrafi
- Department of Horticultural Science, North Carolina State University, Raleigh, NC USA
- Amaya Atucha
- Department of Horticulture, University of Wisconsin-Madison, Madison, WI, 53706, USA
- Ebrahiem M Babiker
- USDA-ARS Southern Horticultural Laboratory, Poplarville, MS 39470-0287, USA
- Elizabeth Canales
- Department of Agricultural Economics, Mississippi State University, Mississippi State, MS 39762, USA
- David Chagné
- The New Zealand Institute for Plant and Food Research Limited (PFR), Palmerston North, New Zealand
- Lisa DeVetter
- Department of Horticulture, Washington State University Northwestern Washington Research and Extension Center, Mount Vernon, WA, 98221, USA
- Mark Ehlenfeldt
- SEBS, Plant Biology, Rutgers University, New Brunswick NJ 01019 USA
- Richard V Espley
- The New Zealand Institute for Plant and Food Research Limited (PFR), Palmerston North, New Zealand
- Karina Gallardo
- School of Economic Sciences, Washington State University, Puyallup, WA 98371, USA
- Catrin S Günther
- The New Zealand Institute for Plant and Food Research Limited (PFR), Palmerston North, New Zealand
- Michael Hardigan
- USDA-ARS, Horticulture Crops Research Unit, Corvallis, OR 97333, USA
- Amanda M Hulse-Kemp
- USDA-ARS, Genomics and Bioinformatics Research Unit, Raleigh, NC 27695, USA
- Department of Crop and Soil Sciences, North Carolina State University, Raleigh, NC 27695, USA
- MacKenzie Jacobs
- Department of Horticulture, Michigan State University, East Lansing, MI, 48824, USA
- Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI, 48823, USA
- Mary Ann Lila
- Plants for Human Health Institute, North Carolina State University, Kannapolis, NC USA
- Claire Luby
- USDA-ARS, Horticulture Crops Research Unit, Corvallis, OR 97333, USA
- Dorrie Main
- Department of Horticulture, Washington State University, Pullman, WA, 99163, USA
- Molla F Mengist
- Plants for Human Health Institute, North Carolina State University, Kannapolis, NC USA
- Department of Horticultural Science, North Carolina State University, Raleigh, NC USA
- James Polashock
- SEBS, Plant Biology, Rutgers University, New Brunswick NJ 01019 USA
- Marti Pottorff
- Plants for Human Health Institute, North Carolina State University, Kannapolis, NC USA
- Lisa J Rowland
- USDA-ARS, Genetic Improvement of Fruits and Vegetables Laboratory, Beltsville, MD 20705, USA
- Charles A Sims
- Food Science and Human Nutrition Department, University of Florida, Gainesville, FL 32611, USA
- Guo-qing Song
- Plant Biotechnology Resource and Outreach Center, Department of Horticulture, Michigan State University, East Lansing, MI 48824, USA
- Jessica Spencer
- Department of Horticultural Science, North Carolina State University, Raleigh, NC USA
- Nicholi Vorsa
- SEBS, Plant Biology, Rutgers University, New Brunswick NJ 01019 USA
- Alan E Yocca
- Department of Plant Biology, Michigan State University, East Lansing, MI, 48824, USA
- Juan Zalapa
- USDA-ARS, VCRU, Department of Horticulture, University of Wisconsin-Madison, Madison, WI 53706, USA
26
Strategies to Increase Prediction Accuracy in Genomic Selection of Complex Traits in Alfalfa (Medicago sativa L.). Cells 2021; 10:cells10123372. [PMID: 34943880 PMCID: PMC8699225 DOI: 10.3390/cells10123372] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 10/19/2021] [Revised: 11/19/2021] [Accepted: 11/24/2021] [Indexed: 12/27/2022]
Abstract
Agronomic traits such as biomass yield and abiotic stress tolerance are genetically complex and challenging to improve through conventional breeding approaches. Genomic selection (GS) is an alternative approach in which genome-wide markers are used to determine the genomic estimated breeding value (GEBV) of individuals in a population. In alfalfa (Medicago sativa L.), previous results indicated that low to moderate prediction accuracy values (<70%) were obtained in complex traits, such as yield and abiotic stress resistance. There is a need to increase the prediction value in order to employ GS in breeding programs. In this paper we reviewed different statistical models and their applications in polyploid crops, such as alfalfa and potato. Specifically, we used empirical data affiliated with alfalfa yield under salt stress to investigate approaches that use DNA marker importance values derived from machine learning models, and genome-wide association studies (GWAS) of marker-trait association scores based on different GWASpoly models, in weighted GBLUP analyses. This approach increased prediction accuracies from 50% to more than 80% for alfalfa yield under salt stress. Finally, we extended the weighted GBLUP approach to potato and analyzed 13 phenotypic traits and obtained similar results. This is the first report on alfalfa to use variable importance and GWAS-assisted approaches to increase the prediction accuracy of GS, thus helping to select superior alfalfa lines based on their GEBVs.
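As a rough illustration of the weighted GBLUP idea named in this abstract, the sketch below builds a marker-weighted genomic relationship matrix, in which markers with larger weights (for example importance values or association scores) contribute more to the relationship between two individuals. It is a simplified form under stated assumptions, not the procedure used in the paper: the function names are hypothetical, markers are only column-centered, and normalizing by the sum of the weights is one illustrative scaling choice.

```python
# Minimal sketch of a marker-weighted genomic relationship matrix.
# Names, centering and scaling are illustrative assumptions.

def center_columns(markers):
    """Subtract each marker's mean dosage across individuals."""
    n = len(markers)
    n_markers = len(markers[0])
    means = [sum(row[j] for row in markers) / n for j in range(n_markers)]
    return [[row[j] - means[j] for j in range(n_markers)] for row in markers]


def weighted_grm(markers, weights):
    """Return G = Z D Z^T / sum(weights), with D = diag(weights)."""
    z = center_columns(markers)
    total = sum(weights)
    n = len(z)
    return [
        [
            sum(w * a * b for w, a, b in zip(weights, z[i], z[j])) / total
            for j in range(n)
        ]
        for i in range(n)
    ]


# Toy example: two individuals, two markers (allele dosages), and
# hypothetical marker weights such as importance scores.
dosages = [[0, 2], [2, 0]]
marker_weights = [1.0, 3.0]
G = weighted_grm(dosages, marker_weights)
```

With equal weights this reduces to an unweighted relationship matrix under the same centering and scaling, so the weights only change how much each marker counts.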
27
Wilson S, Zheng C, Maliepaard C, Mulder HA, Visser RGF, van der Burgt A, van Eeuwijk F. Understanding the Effectiveness of Genomic Prediction in Tetraploid Potato. FRONTIERS IN PLANT SCIENCE 2021; 12:672417. [PMID: 34434201 PMCID: PMC8381724 DOI: 10.3389/fpls.2021.672417] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 02/25/2021] [Accepted: 07/13/2021] [Indexed: 05/20/2023]
Abstract
Use of genomic prediction (GP) in tetraploid is becoming more common. Therefore, we think it is the right time for a comparison of GP models for tetraploid potato. GP models were compared that contrasted shrinkage with variable selection, parametric vs. non-parametric models and different ways of accounting for non-additive genetic effects. As a complement to GP, association studies were carried out in an attempt to understand the differences in prediction accuracy. We compared our GP models on a data set consisting of 147 cultivars, representing worldwide diversity, with over 39 k GBS markers and measurements on four tuber traits collected in six trials at three locations during 2 years. GP accuracies ranged from 0.32 for tuber count to 0.77 for dry matter content. For all traits, differences between GP models that utilised shrinkage penalties and those that performed variable selection were negligible. This was surprising for dry matter, as only a few additive markers explained over 50% of phenotypic variation. Accuracy for tuber count increased from 0.35 to 0.41, when dominance was included in the model. This result is supported by Genome Wide Association Study (GWAS) that found additive and dominance effects accounted for 37% of phenotypic variation, while significant additive effects alone accounted for 14%. For tuber weight, the Reproducing Kernel Hilbert Space (RKHS) model gave a larger improvement in prediction accuracy than explicitly modelling epistatic effects. This is an indication that capturing the between locus epistatic effects of tuber weight can be done more effectively using the semi-parametric RKHS model. Our results show good opportunities for GP in 4x potato.
Affiliation(s)
- Stefan Wilson
- Biometris, Wageningen University & Research Centre, Wageningen, Netherlands
- Chaozhi Zheng
- Biometris, Wageningen University & Research Centre, Wageningen, Netherlands
- Chris Maliepaard
- Plant Breeding, Wageningen University and Research, Wageningen, Netherlands
- Han A. Mulder
- Wageningen University and Research Animal Breeding and Genomics Centre, Wageningen, Netherlands
- Fred van Eeuwijk
- Biometris, Wageningen University & Research Centre, Wageningen, Netherlands
28
Genome-wide approaches for the identification of markers and genes associated with sugarcane yellow leaf virus resistance. Sci Rep 2021; 11:15730. [PMID: 34344928 PMCID: PMC8333424 DOI: 10.1038/s41598-021-95116-1] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 04/12/2021] [Accepted: 07/19/2021] [Indexed: 11/10/2022]
Abstract
Sugarcane yellow leaf (SCYL), caused by the sugarcane yellow leaf virus (SCYLV) is a major disease affecting sugarcane, a leading sugar and energy crop. Despite damages caused by SCYLV, the genetic base of resistance to this virus remains largely unknown. Several methodologies have arisen to identify molecular markers associated with SCYLV resistance, which are crucial for marker-assisted selection and understanding response mechanisms to this virus. We investigated the genetic base of SCYLV resistance using dominant and codominant markers and genotypes of interest for sugarcane breeding. A sugarcane panel inoculated with SCYLV was analyzed for SCYL symptoms, and viral titer was estimated by RT-qPCR. This panel was genotyped with 662 dominant markers and 70,888 SNPs and indels with allele proportion information. We used polyploid-adapted genome-wide association analyses and machine-learning algorithms coupled with feature selection methods to establish marker-trait associations. While each approach identified unique marker sets associated with phenotypes, convergences were observed between them and demonstrated their complementarity. Lastly, we annotated these markers, identifying genes encoding emblematic participants in virus resistance mechanisms and previously unreported candidates involved in viral responses. Our approach could accelerate sugarcane breeding targeting SCYLV resistance and facilitate studies on biological processes leading to this trait.
29
Ferrão LFV, Amadeu RR, Benevenuto J, de Bem Oliveira I, Munoz PR. Genomic Selection in an Outcrossing Autotetraploid Fruit Crop: Lessons From Blueberry Breeding. FRONTIERS IN PLANT SCIENCE 2021; 12:676326. [PMID: 34194453 PMCID: PMC8236943 DOI: 10.3389/fpls.2021.676326] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 03/05/2021] [Accepted: 05/12/2021] [Indexed: 05/17/2023]
Abstract
Blueberry (Vaccinium corymbosum and hybrids) is a specialty crop with expanding production and consumption worldwide. The blueberry breeding program at the University of Florida (UF) has greatly contributed to expanding production areas by developing low-chilling cultivars better adapted to subtropical and Mediterranean climates of the globe. The breeding program has historically focused on recurrent phenotypic selection. As an autopolyploid, outcrossing, perennial, long juvenile phase crop, blueberry breeding cycles are costly and time consuming, which results in low genetic gains per unit of time. Motivated by applying molecular markers for a more accurate selection in the early stages of breeding, we performed pioneering genomic selection studies and optimization for its implementation in the blueberry breeding program. We have also addressed some complexities of sequence-based genotyping and model parametrization for an autopolyploid crop, providing empirical contributions that can be extended to other polyploid species. We herein revisited some of our previous genomic selection studies and showed for the first time its application in an independent validation set. In this paper, our contribution is three-fold: (i) summarize previous results on the relevance of model parametrizations, such as diploid or polyploid methods, and inclusion of dominance effects; (ii) assess the importance of sequence depth of coverage and genotype dosage calling steps; (iii) demonstrate the real impact of genomic selection on leveraging breeding decisions by using an independent validation set. Altogether, we propose a strategy for using genomic selection in blueberry, with the potential to be applied to other polyploid species of a similar background.
Affiliation(s)
- Luís Felipe V. Ferrão
- Blueberry Breeding and Genomics Lab, Horticultural Sciences Department, University of Florida, Gainesville, FL, United States
- Rodrigo R. Amadeu
- Blueberry Breeding and Genomics Lab, Horticultural Sciences Department, University of Florida, Gainesville, FL, United States
- Juliana Benevenuto
- Blueberry Breeding and Genomics Lab, Horticultural Sciences Department, University of Florida, Gainesville, FL, United States
- Ivone de Bem Oliveira
- Blueberry Breeding and Genomics Lab, Horticultural Sciences Department, University of Florida, Gainesville, FL, United States
- Hortifrut North America, Inc., Estero, FL, United States
- Patricio R. Munoz
- Blueberry Breeding and Genomics Lab, Horticultural Sciences Department, University of Florida, Gainesville, FL, United States
30
Voss-Fels KP, Wei X, Ross EM, Frisch M, Aitken KS, Cooper M, Hayes BJ. Strategies and considerations for implementing genomic selection to improve traits with additive and non-additive genetic architectures in sugarcane breeding. TAG. THEORETICAL AND APPLIED GENETICS. THEORETISCHE UND ANGEWANDTE GENETIK 2021; 134:1493-1511. [PMID: 33587151 DOI: 10.1007/s00122-021-03785-3] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 05/11/2020] [Accepted: 01/27/2021] [Indexed: 05/14/2023]
Abstract
Simulations highlight the potential of genomic selection to substantially increase genetic gain for complex traits in sugarcane. The success rate depends on the trait genetic architecture and the implementation strategy. Genomic selection (GS) has the potential to increase the rate of genetic gain in sugarcane beyond the levels achieved by conventional phenotypic selection (PS). To assess different implementation strategies, we simulated two different GS-based breeding strategies and compared genetic gain and genetic variance over five breeding cycles to standard PS. GS scheme 1 followed similar routines like conventional PS but included three rapid recurrent genomic selection (RRGS) steps. GS scheme 2 also included three RRGS steps but did not include a progeny assessment stage and therefore differed more fundamentally from PS. Under an additive trait model, both simulated GS schemes achieved annual genetic gains of 2.6-2.7% which were 1.9 times higher compared to standard phenotypic selection (1.4%). For a complex non-additive trait model, the expected annual rates of genetic gain were lower for all breeding schemes; however, the rates for the GS schemes (1.5-1.6%) were still greater than PS (1.1%). Investigating cost-benefit ratios with regard to numbers of genotyped clones showed that substantial benefits could be achieved when only 1500 clones were genotyped per 10-year breeding cycle for the additive genetic model. Our results show that under a complex non-additive genetic model, the success rate of GS depends on the implementation strategy, the number of genotyped clones and the stage of the breeding program, likely reflecting how changes in QTL allele frequencies change additive genetic variance and therefore the efficiency of selection. These results are encouraging and motivate further work to facilitate the adoption of GS in sugarcane breeding.
Affiliation(s)
- Kai P Voss-Fels
  - Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, St. Lucia, QLD, 4072, Australia
- Xianming Wei
  - Sugar Research Australia, Mackay, QLD, 4741, Australia
- Elizabeth M Ross
  - Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, St. Lucia, QLD, 4072, Australia
- Matthias Frisch
  - Institute of Agronomy and Plant Breeding II, Justus Liebig University, Giessen, Germany
- Karen S Aitken
  - Agriculture and Food, CSIRO, QBP, St. Lucia, QLD, 4067, Australia
- Mark Cooper
  - Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, St. Lucia, QLD, 4072, Australia
- Ben J Hayes
  - Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, St. Lucia, QLD, 4072, Australia.
|
31
|
Simeão RM, Resende MDV, Alves RS, Pessoa-Filho M, Azevedo ALS, Jones CS, Pereira JF, Machado JC. Genomic Selection in Tropical Forage Grasses: Current Status and Future Applications. FRONTIERS IN PLANT SCIENCE 2021; 12:665195. [PMID: 33995461 PMCID: PMC8120112 DOI: 10.3389/fpls.2021.665195] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 02/07/2021] [Accepted: 04/06/2021] [Indexed: 05/06/2023]
Abstract
The world population is expected to be larger and wealthier over the next few decades and will require more animal products, such as milk and beef. Tropical regions have great potential to meet this growing global demand, where pasturelands play a major role in supporting increased animal production. Better forage is required in consonance with improved sustainability as the planted area should not increase and larger areas cultivated with one or a few forage species should be avoided. Although conventional tropical forage breeding has successfully released well-adapted and high-yielding cultivars over the last few decades, genetic gains from these programs have been low in view of the growing food demand worldwide. To guarantee their future impact on livestock production, breeding programs should leverage genotyping, phenotyping, and envirotyping strategies to increase genetic gains. Genomic selection (GS) and genome-wide association studies play a primary role in this process, with the advantage of increasing genetic gain due to greater selection accuracy, reduced cycle time, and increased number of individuals that can be evaluated. This strategy provides solutions to bottlenecks faced by conventional breeding methods, including long breeding cycles and difficulties in evaluating complex traits. Initial results from implementing GS in tropical forage grasses (TFGs) are promising with notable improvements over phenotypic selection alone. However, the practical impact of GS in TFG breeding programs remains unclear. The development of appropriately sized training populations is essential for the evaluation and validation of selection markers based on estimated breeding values. Large panels of single-nucleotide polymorphism markers in different tropical forage species are required for multiple application targets at a reduced cost. In this context, this review highlights the current challenges, achievements, availability, and development of genomic resources and statistical methods for the implementation of GS in TFGs. Additionally, the prediction accuracies from recent experiments and the potential to harness diversity from genebanks are discussed. Although GS in TFGs is still incipient, the advances in genomic tools and statistical models will speed up its implementation in the foreseeable future. All TFG breeding programs should be prepared for these changes.
Affiliation(s)
- Rodrigo S. Alves
  - Instituto Nacional de Ciência e Tecnologia do Café, Universidade Federal de Viçosa, Viçosa, Brazil
- Chris S. Jones
  - International Livestock Research Institute, Nairobi, Kenya
|
32
|
Maximum likelihood parentage assignment using quantitative genotypes. Heredity (Edinb) 2021; 126:884-895. [PMID: 33692533 DOI: 10.1038/s41437-021-00421-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/11/2020] [Revised: 02/22/2021] [Accepted: 02/23/2021] [Indexed: 11/09/2022]
Abstract
The cost of parentage assignment precludes its application in many selective breeding programmes and molecular ecology studies, and/or limits the circumstances or number of individuals to which it is applied. Pooling samples from more than one individual, and using appropriate genetic markers and algorithms to determine parental contributions to pools, is one means of reducing the cost of parentage assignment. This paper describes and validates a novel maximum likelihood (ML) parentage-assignment method that can be used to accurately assign parentage to pooled samples of multiple individuals (previously published ML methods are applicable to samples of single individuals only) using low-density single nucleotide polymorphism (SNP) 'quantitative' (also referred to as 'continuous') genotype data. It is demonstrated with simulated data that, when applied to pools, this 'quantitative maximum likelihood' method assigns parentage with greater accuracy than established maximum likelihood parentage-assignment approaches, which rely on accurate discrete genotype calls; exclusion methods; and estimating parental contributions to pools by solving the weighted least squares problem. Quantitative maximum likelihood can be applied to pools generated using either a 'pooling-for-individual-parentage-assignment' approach, whereby each individual in a pool is tagged or traceable and from a known and mutually exclusive set of possible parents; or a 'pooling-by-phenotype' approach, whereby individuals of the same, or similar, phenotype/s are pooled. Although computationally intensive when applied to large pools, quantitative maximum likelihood has the potential to substantially reduce the cost of parentage assignment, even if applied to pools comprised of few individuals.
|
33
|
Gonçalves MTV, Morota G, Costa PMDA, Vidigal PMP, Barbosa MHP, Peternelli LA. Near-infrared spectroscopy outperforms genomics for predicting sugarcane feedstock quality traits. PLoS One 2021; 16:e0236853. [PMID: 33661948 PMCID: PMC7932073 DOI: 10.1371/journal.pone.0236853] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 07/13/2020] [Accepted: 01/20/2021] [Indexed: 11/19/2022]
Abstract
The main objectives of this study were to evaluate the prediction performance of genomic and near-infrared spectroscopy (NIR) data and whether the integration of genomic and NIR predictor variables can increase the prediction accuracy of two feedstock quality traits (fiber and sucrose content) in a sugarcane population (Saccharum spp.). The following three modeling strategies were compared: M1 (genome-based prediction), M2 (NIR-based prediction), and M3 (integration of genomics and NIR wavenumbers). Data were collected from a commercial population comprised of three hundred and eighty-five individuals, genotyped for single nucleotide polymorphisms and screened using NIR spectroscopy. We compared partial least squares (PLS) and BayesB regression methods to estimate marker and wavenumber effects. In order to assess model performance, we employed random sub-sampling cross-validation to calculate the mean Pearson correlation coefficient between observed and predicted values. Our results showed that models fitted using BayesB were more predictive than PLS models. We found that NIR (M2) provided the highest prediction accuracy, whereas genomics (M1) presented the lowest predictive ability, regardless of the measured traits and regression methods used. The integration of predictors derived from NIR spectroscopy and genomics into a single model (M3) did not significantly improve the prediction accuracy for the two traits evaluated. These findings suggest that NIR-based prediction can be an effective strategy for predicting the genetic merit of sugarcane clones.
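The evaluation scheme described in this abstract (random sub-sampling cross-validation, scored by the mean Pearson correlation between observed and predicted values) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' code: ridge regression stands in for the PLS and BayesB models named in the abstract, the data are simulated, and the function name `subsampling_cv` and all parameter values are assumptions made for the example.

```python
import numpy as np

def subsampling_cv(X, y, n_splits=20, test_frac=0.2, ridge=1.0, seed=0):
    """Mean Pearson correlation between observed and predicted values
    over repeated random train/test splits (hypothetical helper).
    Ridge regression is used here as a simple stand-in model."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    n_test = max(2, int(round(test_frac * n)))
    cors = []
    for _ in range(n_splits):
        perm = rng.permutation(n)
        test, train = perm[:n_test], perm[n_test:]
        # Center with training-set statistics only, to avoid leakage.
        mu_x, mu_y = X[train].mean(axis=0), y[train].mean()
        Xtr, ytr = X[train] - mu_x, y[train] - mu_y
        # Ridge solution: (X'X + lambda * I)^-1 X'y
        beta = np.linalg.solve(Xtr.T @ Xtr + ridge * np.eye(p), Xtr.T @ ytr)
        pred = (X[test] - mu_x) @ beta + mu_y
        cors.append(np.corrcoef(y[test], pred)[0, 1])
    return float(np.mean(cors))

# Simulated toy data (assumption): 385 individuals, 50 predictors.
rng = np.random.default_rng(1)
X = rng.normal(size=(385, 50))
y = X @ rng.normal(size=50) + rng.normal(size=385)

r = subsampling_cv(X, y)
print(f"mean predictive ability: {r:.3f}")
```

Swapping `X` between marker data, NIR wavenumbers, or their concatenation would mirror the M1, M2, and M3 comparisons in spirit, provided every model is scored on the same splits.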
Affiliation(s)
- Gota Morota
  - Department of Animal and Poultry Sciences, Virginia Polytechnic Institute and State University, Blacksburg, VA, United States of America
|
34
|
Asfaw A, Aderonmu DS, Darkwa K, De Koeyer D, Agre P, Abe A, Olasanmi B, Adebola P, Asiedu R. Genetic parameters, prediction, and selection in a white Guinea yam early-generation breeding population using pedigree information. CROP SCIENCE 2021; 61:1038-1051. [PMID: 33883753 PMCID: PMC8048640 DOI: 10.1002/csc2.20382] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 09/04/2019] [Accepted: 10/12/2020] [Indexed: 06/12/2023]
Abstract
Better understanding of the genetic control of traits in breeding populations is crucial for the selection of superior varieties and parents. This study aimed to assess genetic parameters and breeding values for six essential traits in a white Guinea yam (Dioscorea rotundata Poir.) breeding population. For this, pedigree-based best linear unbiased prediction (P-BLUP) was used. The results revealed significant nonadditive genetic variances and medium to high (.45-.79) broad-sense heritability estimates for the traits studied. The pattern of associations among the genetic values of the traits suggests that selection based on a multiple-trait selection index has potential for identifying superior breeding lines. Parental breeding values predicted using progeny performance identified 13 clones with high genetic potential for simultaneous improvement of the measured traits in the yam breeding program. Subsets of progeny were identified for intermating or further variety testing based on additive genetic and total genetic values. Selection of the top 5% progenies based on the multi-trait index revealed positive genetic gains for fresh tuber yield (t ha⁻¹), tuber yield (kg plant⁻¹), and average tuber weight (kg). However, genetic gain was negative for tuber dry matter content and Yam mosaic virus resistance in comparison with standard varieties. Our results show the relevance of P-BLUP for the selection of superior parental clones and progenies with higher breeding values for interbreeding and higher genotypic value for variety development in yam.
Affiliation(s)
- Asrat Asfaw
  - International Institute of Tropical Agriculture (IITA), Ibadan, Nigeria
- Dotun Samuel Aderonmu
  - International Institute of Tropical Agriculture (IITA), Ibadan, Nigeria
  - International Potato Center (CIP), Abuja, Nigeria
  - Dep. of Agronomy, Univ. of Ibadan, Ibadan, Nigeria
- Kwabena Darkwa
  - International Institute of Tropical Agriculture (IITA), Ibadan, Nigeria
  - Pan African Univ., Institute of Life and Earth Sciences, Univ. of Ibadan, Ibadan, Nigeria
- David De Koeyer
  - International Institute of Tropical Agriculture (IITA), Ibadan, Nigeria
  - Agriculture and Agri‐Food Canada, 850 Lincoln Road, PO Box 20280, Fredericton, NB E3B 4Z7, Canada
- Paterne Agre
  - International Institute of Tropical Agriculture (IITA), Ibadan, Nigeria
- Ayodeji Abe
  - Dep. of Agronomy, Univ. of Ibadan, Ibadan, Nigeria
- Patrick Adebola
  - International Institute of Tropical Agriculture (IITA), Ibadan, Nigeria
- Robert Asiedu
  - International Institute of Tropical Agriculture (IITA), Ibadan, Nigeria
|
35
|
Gerard D. Pairwise linkage disequilibrium estimation for polyploids. Mol Ecol Resour 2021; 21:1230-1242. [PMID: 33559321 DOI: 10.1111/1755-0998.13349] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 08/28/2020] [Revised: 01/18/2021] [Accepted: 02/01/2021] [Indexed: 12/31/2022]
Abstract
Many tasks in statistical genetics involve pairwise estimation of linkage disequilibrium (LD). The study of LD in diploids is mature. However, in polyploids, the field lacks a comprehensive characterization of LD. Polyploids also exhibit greater levels of genotype uncertainty than diploids, yet no methods currently exist to estimate LD in polyploids in the presence of such genotype uncertainty. Furthermore, most LD estimation methods do not quantify the level of uncertainty in their LD estimates. Our study contains three major contributions. (i) We characterize haplotypic and composite measures of LD in polyploids. These composite measures of LD turn out to be functions of common statistical measures of association. (ii) We derive procedures to estimate haplotypic and composite LD in polyploids in the presence of genotype uncertainty. We do this by estimating LD directly from genotype likelihoods, which may be obtained from many genotyping platforms. (iii) We derive standard errors of all LD estimators that we discuss. We validate our methods on both real and simulated data. Our methods are implemented in the R package ldsep, available on the Comprehensive R Archive Network https://cran.r-project.org/package=ldsep.
Affiliation(s)
- David Gerard
  - Department of Mathematics and Statistics, American University, Washington, DC, USA
|
36
|
Nagasaka K, Nishiyama S, Fujikawa M, Yamane H, Shirasawa K, Babiker E, Tao R. Genome-Wide Identification of Loci Associated With Phenology-Related Traits and Their Adaptive Variations in a Highbush Blueberry Collection. FRONTIERS IN PLANT SCIENCE 2021; 12:793679. [PMID: 35126419 PMCID: PMC8814416 DOI: 10.3389/fpls.2021.793679] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/12/2021] [Accepted: 12/07/2021] [Indexed: 05/04/2023]
Abstract
Genetic variation in phenological traits is key to expanding the production areas of crops. Southern highbush blueberry (SHB) is a blueberry cultivar group adapted to warmer climates and has been developed by multiple interspecific hybridizations between elite northern highbush blueberry (NHB) (Vaccinium corymbosum L.) and low-chill Vaccinium species native to the southern United States. In this study, we employed a collection of diverse SHB accessions and performed a genome-wide association study (GWAS) for five phenology-related traits [chilling requirement (CR), flowering date, ripening date, fruit development period, and continuous flowering] using polyploid GWAS models. Phenology-related traits showed higher heritability and larger correlation coefficients between year replications, which resulted in the detection of robust phenotype-genotype association peaks. Notably, a single association peak for the CR was detected on Chromosome 4. Comparison of genotypes at the GWAS peaks between NHB and SHB revealed the putative introgression of low-chill and late-flowering alleles into the highbush genetic pool. Our results provide basic insights into the diversity of phenological traits in blueberry and the genetic establishment of current highbush cultivar groups.
Affiliation(s)
- Kyoka Nagasaka
  - Graduate School of Agriculture, Kyoto University, Kyoto, Japan
- Soichiro Nishiyama
  - Graduate School of Agriculture, Kyoto University, Kyoto, Japan
- Mao Fujikawa
  - Graduate School of Agriculture, Kyoto University, Kyoto, Japan
- Hisayo Yamane
  - Graduate School of Agriculture, Kyoto University, Kyoto, Japan
- Ebrahiem Babiker
  - Thad Cochran Southern Horticultural Laboratory, United States Department of Agriculture, Agricultural Research Service, Poplarville, MS, United States
- Ryutaro Tao
  - Graduate School of Agriculture, Kyoto University, Kyoto, Japan
|
37
|
Abstract
A suitable pairwise relatedness estimation is key to genetic studies. Several methods are proposed to compute relatedness in autopolyploids based on molecular data. However, unlike diploids, autopolyploids still need further studies considering scenarios with many linked molecular markers with known dosage. In this study, we provide guidelines for plant geneticists and breeders to access trustworthy pairwise relatedness estimates. To this end, we simulated populations considering different ploidy levels, meiotic pairing patterns, number of loci and alleles, and inbreeding levels. Analyses were performed to assess the accuracy of distinct methods and to demonstrate the usefulness of molecular markers in practical situations. Overall, our results suggest that at least 100 effective biallelic molecular markers are required to have good pairwise relatedness estimation if methods based on correlation are used. For this number of loci, current methods based on multiallelic markers show lower performance than biallelic ones. Estimating relatedness in cases of inbreeding or close relationships (such as parent-offspring, full-sibs, or half-sibs) is more challenging. Methods to estimate pairwise relatedness based on molecular markers, for different ploidy levels or pedigrees, were implemented in the AGHmatrix R package.
|
38
|
Gemenet DC, Lindqvist-Kreuze H, De Boeck B, da Silva Pereira G, Mollinari M, Zeng ZB, Craig Yencho G, Campos H. Sequencing depth and genotype quality: accuracy and breeding operation considerations for genomic selection applications in autopolyploid crops. TAG. THEORETICAL AND APPLIED GENETICS. THEORETISCHE UND ANGEWANDTE GENETIK 2020; 133:3345-3363. [PMID: 32876753 PMCID: PMC7567692 DOI: 10.1007/s00122-020-03673-2] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 03/01/2020] [Accepted: 08/21/2020] [Indexed: 05/06/2023]
Abstract
KEY MESSAGE Polyploid crop breeders can balance resources between density and sequencing depth, dosage information and fewer highly informative SNPs recommended, non-additive models and QTL advantages on prediction dependent on trait architecture. The autopolyploid nature of potato and sweetpotato ensures a wide range of meiotic configurations and linkage phases, leading to complex gene action and posing problems in genotype data quality and genomic selection analyses. We used a 315-progeny biparental F1 population of hexaploid sweetpotato and a diversity panel of 380 tetraploid potato, genotyped using different platforms, to answer the following questions: (i) do polyploid crop breeders need to invest more for additional sequencing depth? (ii) how many markers are required to make selection decisions? (iii) does considering non-additive genetic effects improve predictive ability (PA)? (iv) does considering dosage or quantitative trait loci (QTL) offer significant improvement to PA? Our results show that only a small number of highly informative single nucleotide polymorphisms (SNPs; ≤ 1000) are adequate for prediction in the type of populations we analyzed. We also show that considering dosage information and models considering only additive effects had the best PA for most traits, while the comparative advantage of considering non-additive genetic effects and including known QTL in the predictive model depended on trait architecture. We conclude that genomic selection can help accelerate the rate of genetic gains in potato and sweetpotato. However, application of genomic selection should be considered as part of optimizing the entire breeding program. Additionally, since the predictions in the current study are based on single populations, the effects of haplotype structure and inheritance on PA should be studied further in actual multi-generation breeding populations.
Affiliation(s)
- Dorcus C Gemenet
  - International Potato Center, ILRI Campus, P.O. Box 25171-00603, Nairobi, Kenya.
  - CGIAR Excellence in Breeding Platform, International Maize and Wheat Improvement Center (CIMMYT), ICRAF Campus, 1041-00621, Nairobi, Kenya.
- Bert De Boeck
  - International Potato Center, Av. La Molina 1895, Lima, Peru
- Zhao-Bang Zeng
  - North Carolina State University, Raleigh, NC, 27695, USA
- G Craig Yencho
  - North Carolina State University, Raleigh, NC, 27695, USA
- Hugo Campos
  - International Potato Center, Av. La Molina 1895, Lima, Peru
|
39
|
Sood S, Lin Z, Caruana B, Slater AT, Daetwyler HD. Making the most of all data: Combining non-genotyped and genotyped potato individuals with HBLUP. THE PLANT GENOME 2020; 13:e20056. [PMID: 33217206 DOI: 10.1002/tpg2.20056] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 04/21/2020] [Revised: 08/03/2020] [Accepted: 08/20/2020] [Indexed: 05/20/2023]
Abstract
Using genomic information to predict phenotypes can improve the accuracy of estimated breeding values and can potentially increase genetic gain over conventional breeding. In this study, we investigated the prediction accuracies achieved by best linear unbiased prediction (BLUP) for nine potato phenotypic traits using three types of relationship matrices: pedigree (ABLUP), genomic (GBLUP), and a hybrid matrix (H) combining pedigree and genomic information (HBLUP). Deep pedigree information was available for >3000 different potato breeding clones evaluated over four years. Genomic relationships were estimated from >180,000 informative SNPs generated using a genotyping-by-sequencing transcriptome (GBS-t) protocol for 168 cultivars, many of which were parents of clones. Two validation scenarios were implemented, namely "Genotyped Cultivars Validation" (a subset of genotyped lines as validation set) and "Non-genotyped 2009 Progenies Validation". Most of the traits showed moderate to high narrow sense heritabilities (range 0.22-0.72). In the Genotyped Cultivars Validation, HBLUP outperformed ABLUP on prediction accuracies for all traits except early blight, and outperformed GBLUP for most of the traits except tuber shape, tuber eye depth and boil after-cooking darkening. This is evidence that the in-depth relationship within the H matrix could potentially result in better prediction accuracy in comparison to using the A or G matrix individually. The prediction accuracies of the Non-genotyped 2009 Progenies Validation were comparable between ABLUP and HBLUP, varying from 0.17-0.70 and 0.18-0.69, respectively. Better prediction accuracy and less bias in prediction using HBLUP are of practical utility to breeders as all breeding material is ranked on the same scale, leading to improved selection decisions. In addition, our approach provides an economical alternative to utilize historic breeding data with current genotyped individuals in implementing genomic selection.
Affiliation(s)
- Salej Sood
  - Agriculture Victoria, AgriBio, Centre for AgriBioscience, 5 Ring Road, Bundoora, VIC 3083, Australia
  - Division of Crop Improvement, ICAR-Central Potato Research Institute, Shimla, Himachal Pradesh, 171001, India
- Zibei Lin
  - Agriculture Victoria, AgriBio, Centre for AgriBioscience, 5 Ring Road, Bundoora, VIC 3083, Australia
- Brittney Caruana
  - Agriculture Victoria, AgriBio, Centre for AgriBioscience, 5 Ring Road, Bundoora, VIC 3083, Australia
  - School of Applied Systems Biology, La Trobe University, Bundoora, VIC 3083, Australia
- Anthony T Slater
  - Agriculture Victoria, AgriBio, Centre for AgriBioscience, 5 Ring Road, Bundoora, VIC 3083, Australia
- Hans D Daetwyler
  - Agriculture Victoria, AgriBio, Centre for AgriBioscience, 5 Ring Road, Bundoora, VIC 3083, Australia
  - School of Applied Systems Biology, La Trobe University, Bundoora, VIC 3083, Australia
|
40
|
Genomic insight into the developmental history of southern highbush blueberry populations. Heredity (Edinb) 2020; 126:194-205. [PMID: 32873965 DOI: 10.1038/s41437-020-00362-0] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 05/04/2020] [Revised: 08/16/2020] [Accepted: 08/18/2020] [Indexed: 11/08/2022]
Abstract
Interspecific hybridization is a common breeding approach for introducing novel traits and genetic diversity to breeding populations. Southern highbush blueberry (SHB) is a blueberry cultivar group that has been intensively bred over the last 60 years. Specifically, it was developed by multiple interspecific crosses between northern highbush blueberry [NHB, Vaccinium corymbosum L. (2n = 4x = 48)] and low-chill Vaccinium species to expand the geographic limits of highbush blueberry production. In this study, we genotyped polyploid blueberries, including 105 SHB, 17 NHB, and 10 rabbiteye blueberry (RE) (Vaccinium virgatum Aiton), from the accessions planted at Poplarville, Mississippi, and accessions distributed in Japan, based on the double-digest restriction site-associated DNA sequencing. The genome-wide SNP data clearly indicated that RE cultivars were genetically distinct from SHB and NHB cultivars, whereas NHB and SHB were genetically indistinguishable. The population structure results appeared to reflect the differences in the allele selection strategies that breeders used for developing germplasm adapted to local climates. The genotype data implied that there are no or very few genomic segments that were commonly introgressed from low-chill Vaccinium species to the SHB genome. Principal component analysis-based outlier detection analysis found a few loci associated with a variable that could partially differentiate NHB and SHB. These SNP loci were detected in Mb-scale haplotype blocks and may be close to the functional genes related to SHB development. Collectively, the data generated in this study suggest a polygenic adaptation of SHB to the southern climate, and may be relevant for future population-scale genome-wide analyses of blueberry.
|
41
|
Ferrão LFV, Johnson TS, Benevenuto J, Edger PP, Colquhoun TA, Munoz PR. Genome-wide association of volatiles reveals candidate loci for blueberry flavor. THE NEW PHYTOLOGIST 2020; 226:1725-1737. [PMID: 31999829 DOI: 10.1111/nph.16459] [Citation(s) in RCA: 41] [Impact Index Per Article: 8.2] [Received: 10/29/2019] [Accepted: 01/21/2020] [Indexed: 05/20/2023]
Abstract
Plants produce a range of volatile organic compounds (VOCs), some of which are perceived by the human olfactory system, contributing to a myriad flavors. Despite the importance of flavor for consumer preference, most plant breeding programs have neglected it, mainly because of the costs of phenotyping and the complexity of disentangling the role of VOCs in human perception. To develop molecular breeding tools aimed at improving fruit flavor, we carried out target genotyping of and VOC extraction from a blueberry population. Metabolite genome-wide association analysis was used to elucidate the genetic architecture, while predictive models were tested to prove that VOCs can be accurately predicted using genomic information. A historical sensory panel was considered to assess how the volatiles influenced consumers. By gathering genomics, metabolomics, and the sensory panel, we demonstrated that VOCs are controlled by a few major genomic regions, some of which harbor biosynthetic enzyme-coding genes; can be accurately predicted using molecular markers; and can enhance or decrease consumers' overall liking. Here we emphasized how the understanding of the genetic basis and the role of VOCs in consumer preference can assist breeders in developing more flavorful cultivars at a more inexpensive and accelerated pace.
Affiliation(s)
- Luís Felipe V Ferrão
  - Blueberry Breeding and Genomics Lab, Horticultural Sciences Department, University of Florida, Gainesville, FL, 32611, USA
- Timothy S Johnson
  - Environmental Horticulture Department, Plant Innovation Center, University of Florida, Gainesville, FL, 32611, USA
- Juliana Benevenuto
  - Blueberry Breeding and Genomics Lab, Horticultural Sciences Department, University of Florida, Gainesville, FL, 32611, USA
- Patrick P Edger
  - Department of Horticulture, University of Michigan, Michigan State University, East Lansing, MI, 48824, USA
- Thomas A Colquhoun
  - Environmental Horticulture Department, Plant Innovation Center, University of Florida, Gainesville, FL, 32611, USA
- Patricio R Munoz
  - Blueberry Breeding and Genomics Lab, Horticultural Sciences Department, University of Florida, Gainesville, FL, 32611, USA
|
42
|
Gerard D, Ferrão LFV. Priors for genotyping polyploids. BIOINFORMATICS (OXFORD, ENGLAND) 2020; 36:1795-1800. [PMID: 32176767 DOI: 10.1101/751784] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 09/04/2019] [Revised: 11/01/2019] [Accepted: 11/12/2019] [Indexed: 05/29/2023]
Abstract
MOTIVATION Empirical Bayes techniques to genotype polyploid organisms usually either (i) assume technical artifacts are known a priori or (ii) estimate technical artifacts simultaneously with the prior genotype distribution. Case (i) is unappealing as it places the onus on the researcher to estimate these artifacts, or to ensure that there are no systematic biases in the data. However, as we demonstrate with a few empirical examples, case (ii) makes choosing the class of prior genotype distributions extremely important. Choosing a class that is either too flexible or too restrictive results in poor genotyping performance. RESULTS We propose two classes of prior genotype distributions that are of intermediate levels of flexibility: the class of proportional normal distributions and the class of unimodal distributions. We provide a complete characterization of and optimization details for the class of unimodal distributions. We demonstrate, using both simulated and real data, that using these classes results in superior genotyping performance. AVAILABILITY AND IMPLEMENTATION Genotyping methods that use these priors are implemented in the updog R package available on the Comprehensive R Archive Network: https://cran.r-project.org/package=updog. All code needed to reproduce the results of this article is available on GitHub: https://github.com/dcgerard/reproduce_prior_sims. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- David Gerard
  - Department of Mathematics and Statistics, American University, Washington, DC 20016, USA
|
43
|
Hardigan MA, Feldmann MJ, Lorant A, Bird KA, Famula R, Acharya C, Cole G, Edger PP, Knapp SJ. Genome Synteny Has Been Conserved Among the Octoploid Progenitors of Cultivated Strawberry Over Millions of Years of Evolution. FRONTIERS IN PLANT SCIENCE 2020; 10:1789. [PMID: 32158449 PMCID: PMC7020885 DOI: 10.3389/fpls.2019.01789] [Citation(s) in RCA: 60] [Impact Index Per Article: 12.0] [Received: 10/17/2019] [Accepted: 12/20/2019] [Indexed: 05/18/2023]
Abstract
Allo-octoploid cultivated strawberry (Fragaria × ananassa) originated through a combination of polyploid and homoploid hybridization, domestication of an interspecific hybrid lineage, and continued admixture of wild species over the last 300 years. While genes appear to flow freely between the octoploid progenitors, the genome structures and diversity of the octoploid species remain poorly understood. The complexity and absence of an octoploid genome frustrated early efforts to study chromosome evolution, resolve subgenomic structure, and develop a single coherent linkage group nomenclature. Here, we show that octoploid Fragaria species harbor millions of subgenome-specific DNA variants. Their diversity was sufficient to distinguish duplicated (homoeologous and paralogous) DNA sequences and develop 50K and 850K SNP genotyping arrays populated with co-dominant, disomic SNP markers distributed throughout the octoploid genome. Whole-genome shotgun genotyping of an interspecific segregating population yielded 1.9M genetically mapped subgenome variants in 5,521 haploblocks spanning 3,394 cM in F. chiloensis subsp. lucida, and 1.6M genetically mapped subgenome variants in 3,179 haploblocks spanning 2,017 cM in F. × ananassa. These studies provide a dense genomic framework of subgenome-specific DNA markers for seamlessly cross-referencing genetic and physical mapping information and unifying existing chromosome nomenclatures. Using comparative genomics, we show that geographically diverse wild octoploids are effectively diploidized, nearly completely collinear, and retain strong macro-synteny with diploid progenitor species. The preservation of genome structure among allo-octoploid taxa is a critical factor in the unique history of garden strawberry, where unimpeded gene flow supported its origin and domestication through repeated cycles of interspecific hybridization.
Affiliation(s)
- Michael A. Hardigan: Department of Plant Sciences, University of California, Davis, Davis, CA, United States
- Mitchell J. Feldmann: Department of Plant Sciences, University of California, Davis, Davis, CA, United States
- Anne Lorant: Department of Plant Sciences, University of California, Davis, Davis, CA, United States
- Kevin A. Bird: Department of Horticulture, Michigan State University, East Lansing, MI, United States
- Randi Famula: Department of Plant Sciences, University of California, Davis, Davis, CA, United States
- Charlotte Acharya: Department of Plant Sciences, University of California, Davis, Davis, CA, United States
- Glenn Cole: Department of Plant Sciences, University of California, Davis, Davis, CA, United States
- Patrick P. Edger: Department of Horticulture, Michigan State University, East Lansing, MI, United States
- Steven J. Knapp: Department of Plant Sciences, University of California, Davis, Davis, CA, United States
|
44
|
Zingaretti LM, Gezan SA, Ferrão LFV, Osorio LF, Monfort A, Muñoz PR, Whitaker VM, Pérez-Enciso M. Exploring Deep Learning for Complex Trait Genomic Prediction in Polyploid Outcrossing Species. FRONTIERS IN PLANT SCIENCE 2020; 11:25. [PMID: 32117371 PMCID: PMC7015897 DOI: 10.3389/fpls.2020.00025] [Citation(s) in RCA: 68] [Impact Index Per Article: 13.6] [Received: 10/22/2019] [Accepted: 01/10/2020] [Indexed: 05/21/2023]
Abstract
Genomic prediction (GP) is the procedure whereby the genetic merits of untested candidates are predicted using genome wide marker information. Although numerous examples of GP exist in plants and animals, applications to polyploid organisms are still scarce, partly due to limited genome resources and the complexity of this system. Deep learning (DL) techniques comprise a heterogeneous collection of machine learning algorithms that have excelled at many prediction tasks. A potential advantage of DL for GP over standard linear model methods is that DL can potentially take into account all genetic interactions, including dominance and epistasis, which are expected to be of special relevance in most polyploids. In this study, we evaluated the predictive accuracy of linear and DL techniques in two important small fruits or berries: strawberry and blueberry. The two datasets contained a total of 1,358 allopolyploid strawberry (2n=8x=112) and 1,802 autopolyploid blueberry (2n=4x=48) individuals, genotyped for 9,908 and 73,045 single nucleotide polymorphism (SNP) markers, respectively, and phenotyped for five agronomic traits each. DL depends on numerous parameters that influence performance and optimizing hyperparameter values can be a critical step. Here we show that interactions between hyperparameter combinations should be expected and that the number of convolutional filters and regularization in the first layers can have an important effect on model performance. In terms of genomic prediction, we did not find an advantage of DL over linear model methods, except when the epistasis component was important. Linear Bayesian models were better than convolutional neural networks for the full additive architecture, whereas the opposite was observed under strong epistasis. However, by using a parameterization capable of taking into account these non-linear effects, Bayesian linear models can match or exceed the predictive accuracy of DL. 
A semiautomatic implementation of the DL pipeline is available at https://github.com/lauzingaretti/deepGP/.
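As a rough illustration of what "genomic prediction with a linear model" means in the abstract above, the sketch below fits additive marker effects by ridge regression on a training set and reports predictive ability as the correlation between predicted and observed phenotypes in held-out individuals. It is neither the deep learning pipeline nor the Bayesian models evaluated in the paper; the function names, the penalty value `lam`, and the simulated data are all assumptions made for illustration.

```python
import numpy as np

def fit_ridge(X, y, lam=1.0):
    """Estimate additive marker effects by ridge regression (illustrative)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc = X - x_mean  # center marker codes
    beta = np.linalg.solve(Xc.T @ Xc + lam * np.eye(X.shape[1]),
                           Xc.T @ (y - y_mean))
    return beta, x_mean, y_mean

def predict(X_new, beta, x_mean, y_mean):
    """Predict phenotypes of new individuals from their marker codes."""
    return y_mean + (X_new - x_mean) @ beta

# Simulated toy data (assumption: purely additive trait, 0/1/2 marker codes).
rng = np.random.default_rng(0)
n, p = 200, 50
X = rng.integers(0, 3, size=(n, p)).astype(float)
true_effects = rng.normal(size=p)
y = X @ true_effects + rng.normal(size=n)

# Train on the first 150 individuals, predict the remaining 50.
train, test = np.arange(150), np.arange(150, n)
beta, x_mean, y_mean = fit_ridge(X[train], y[train], lam=1.0)
y_hat = predict(X[test], beta, x_mean, y_mean)

# Predictive ability: correlation between predicted and observed phenotypes.
predictive_ability = np.corrcoef(y_hat, y[test])[0, 1]
print(round(float(predictive_ability), 2))
```

A single train/test split is used here for brevity; in practice predictive ability is usually averaged over repeated cross-validation folds.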
Affiliation(s)
- Laura M. Zingaretti: Centre for Research in Agricultural Genomics (CRAG) CSIC-IRTA-UAB-UB, Campus UAB, Barcelona, Spain
- Salvador Alejandro Gezan: School of Forest Resources and Conservation, University of Florida, Gainesville, FL, United States
- Luis Felipe V. Ferrão: Blueberry Breeding and Genomics Lab, Horticultural Sciences Department, University of Florida, Gainesville, FL, United States
- Luis F. Osorio: IFAS Gulf Coast Research and Education Center, University of Florida, Wimauma, FL, United States
- Amparo Monfort: Centre for Research in Agricultural Genomics (CRAG) CSIC-IRTA-UAB-UB, Campus UAB, Barcelona, Spain; Institut de Recerca i Tecnologia Agroalimentàries (IRTA), Barcelona, Spain
- Patricio R. Muñoz: Blueberry Breeding and Genomics Lab, Horticultural Sciences Department, University of Florida, Gainesville, FL, United States
- Vance M. Whitaker: IFAS Gulf Coast Research and Education Center, University of Florida, Wimauma, FL, United States
- Miguel Pérez-Enciso: Centre for Research in Agricultural Genomics (CRAG) CSIC-IRTA-UAB-UB, Campus UAB, Barcelona, Spain; ICREA, Passeig de Lluís Companys 23, Barcelona, Spain
|
45
|
Benevenuto J, Ferrão LFV, Amadeu RR, Munoz P. How can a high-quality genome assembly help plant breeders? Gigascience 2020; 8:5513659. [PMID: 31184361 PMCID: PMC6558523 DOI: 10.1093/gigascience/giz068] [Citation(s) in RCA: 38] [Impact Index Per Article: 7.6] [Received: 04/15/2019] [Revised: 05/09/2019] [Accepted: 05/16/2019] [Indexed: 12/04/2022]
Abstract
The decreasing costs of next-generation sequencing and the improvements in de novo sequence assemblers have made it possible to obtain reference genomes for most eukaryotes, including minor crops such as the blueberry (Vaccinium corymbosum). Nevertheless, these genomes are at various levels of completeness and few have been anchored to chromosome scale and/or are haplotype-phased. We highlight the impact of a high-quality genome assembly for plant breeding and genetic research by showing how it affects our understanding of the genetic architecture of important traits and aids marker selection and candidate gene detection. We compared the results of genome-wide association studies and genomic selection that were already published using a blueberry draft genome as reference with the results using the recent released chromosome-scale and haplotype-phased blueberry genome. We believe that the benefits shown herein reinforce the importance of genome assembly projects for other non-model species.
Affiliation(s)
- Juliana Benevenuto: Blueberry Breeding and Genomics Laboratory, Horticultural Sciences Department, University of Florida, Gainesville, 2550 Hull Road, FL, USA
- Luís Felipe V Ferrão: Blueberry Breeding and Genomics Laboratory, Horticultural Sciences Department, University of Florida, Gainesville, 2550 Hull Road, FL, USA
- Rodrigo R Amadeu: Blueberry Breeding and Genomics Laboratory, Horticultural Sciences Department, University of Florida, Gainesville, 2550 Hull Road, FL, USA
- Patricio Munoz: Blueberry Breeding and Genomics Laboratory, Horticultural Sciences Department, University of Florida, Gainesville, 2550 Hull Road, FL, USA
|
46
|
de C Lara LA, Santos MF, Jank L, Chiari L, Vilela MDM, Amadeu RR, Dos Santos JPR, Pereira GDS, Zeng ZB, Garcia AAF. Genomic Selection with Allele Dosage in Panicum maximum Jacq. G3 (BETHESDA, MD.) 2019; 9:2463-2475. [PMID: 31171567 PMCID: PMC6686918 DOI: 10.1534/g3.118.200986] [Citation(s) in RCA: 29] [Impact Index Per Article: 4.8] [Received: 12/19/2018] [Accepted: 05/23/2019] [Indexed: 12/21/2022]
Abstract
Genomic selection is an efficient approach to get shorter breeding cycles in recurrent selection programs and greater genetic gains with selection of superior individuals. Despite advances in genotyping techniques, genetic studies for polyploid species have been limited to a rough approximation of studies in diploid species. The major challenge is to distinguish the different types of heterozygotes present in polyploid populations. In this work, we evaluated different genomic prediction models applied to a recurrent selection population of 530 genotypes of Panicum maximum, an autotetraploid forage grass. We also investigated the effect of the allele dosage in the prediction, i.e., considering tetraploid (GS-TD) or diploid (GS-DD) allele dosage. A longitudinal linear mixed model was fitted for each one of the six phenotypic traits, considering different covariance matrices for genetic and residual effects. A total of 41,424 genotyping-by-sequencing markers were obtained using 96-plex and Pst1 restriction enzyme, and quantitative genotype calling was performed. Six predictive models were generalized to tetraploid species and predictive ability was estimated by a replicated fivefold cross-validation process. GS-TD and GS-DD models were performed considering 1,223 informative markers. Overall, GS-TD data yielded higher predictive abilities than with GS-DD data. However, different predictive models had similar predictive ability performance. In this work, we provide bioinformatic and modeling guidelines to consider tetraploid dosage and observed that genomic selection may lead to additional gains in recurrent selection program of P. maximum.
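To make the dosage distinction in the abstract above concrete, the sketch below contrasts a tetraploid coding, which keeps all five allele-copy classes (0 to 4), with a diploidized coding that collapses the three heterozygote classes into one. The helper name `diploidize`, the 0/1/2 recoding, and the toy genotype matrix are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def diploidize(tetraploid_dosage):
    """Collapse tetraploid dosages (0..4) to diploid-style codes (0, 1, 2).

    Hypothetical helper: 0 -> 0 (nulliplex), 1..3 -> 1 (any heterozygote),
    4 -> 2 (quadruplex).
    """
    d = np.asarray(tetraploid_dosage)
    if d.min() < 0 or d.max() > 4:
        raise ValueError("tetraploid dosage must lie in 0..4")
    return np.where(d == 0, 0, np.where(d == 4, 2, 1))

# Rows are individuals, columns are markers (toy data, not from the study).
gs_td = np.array([[0, 1, 4],
                  [2, 3, 4],
                  [4, 0, 1]])
gs_dd = diploidize(gs_td)
print(gs_dd)
```

In this toy matrix the simplex, duplex, and triplex genotypes all become 1 after diploidization, so a model fitted to the diploidized codes can no longer tell these heterozygote classes apart.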
Affiliation(s)
- Letícia A de C Lara: Luiz de Queiroz College of Agriculture / University of São Paulo (ESALQ/USP), Piracicaba, SP, Brazil
- Liana Jank: Embrapa Beef Cattle, Campo Grande, MS, Brazil
- Rodrigo R Amadeu: Luiz de Queiroz College of Agriculture / University of São Paulo (ESALQ/USP), Piracicaba, SP, Brazil
- Jhonathan P R Dos Santos: Luiz de Queiroz College of Agriculture / University of São Paulo (ESALQ/USP), Piracicaba, SP, Brazil
- Antonio Augusto F Garcia: Luiz de Queiroz College of Agriculture / University of São Paulo (ESALQ/USP), Piracicaba, SP, Brazil
|
47
|
polyRAD: Genotype Calling with Uncertainty from Sequencing Data in Polyploids and Diploids. G3-GENES GENOMES GENETICS 2019; 9:663-673. [PMID: 30655271 PMCID: PMC6404598 DOI: 10.1534/g3.118.200913] [Citation(s) in RCA: 58] [Impact Index Per Article: 9.7] [Indexed: 11/18/2022]
Abstract
Low or uneven read depth is a common limitation of genotyping-by-sequencing (GBS) and restriction site-associated DNA sequencing (RAD-seq), resulting in high missing data rates, heterozygotes miscalled as homozygotes, and uncertainty of allele copy number in heterozygous polyploids. Bayesian genotype calling can mitigate these issues, but previously has only been implemented in software that requires a reference genome or uses priors that may be inappropriate for the population. Here we present several novel Bayesian algorithms that estimate genotype posterior probabilities, all of which are implemented in a new R package, polyRAD. Appropriate priors can be specified for mapping populations, populations in Hardy-Weinberg equilibrium, or structured populations, and in each case can be informed by genotypes at linked markers. The polyRAD software imports read depth from several existing pipelines, and outputs continuous or discrete numerical genotypes suitable for analyses such as genome-wide association and genomic prediction.
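The general idea of Bayesian genotype calling from read depth, as described in the abstract above, can be sketched as prior times likelihood over the possible allele copy numbers. The example below is not polyRAD's algorithm (polyRAD is an R package with several population-specific priors and linkage information); it assumes only a Hardy-Weinberg binomial prior, binomial sampling of reads, and a symmetric sequencing error rate. The function name, its parameters, and the read counts are hypothetical.

```python
from math import comb

def genotype_posteriors(ref_reads, alt_reads, alt_freq, ploidy=4, error=0.001):
    """Posterior probability of each alternative-allele copy number (0..ploidy).

    Assumptions (illustrative only): Hardy-Weinberg binomial prior,
    binomial read sampling, and a symmetric sequencing error rate.
    """
    n = ref_reads + alt_reads
    weights = []
    for k in range(ploidy + 1):
        # Prior: probability of copy number k under Hardy-Weinberg equilibrium.
        prior = comb(ploidy, k) * alt_freq**k * (1 - alt_freq)**(ploidy - k)
        # Probability that a single read shows the alternative allele.
        p_alt = (k / ploidy) * (1 - error) + (1 - k / ploidy) * error
        likelihood = comb(n, alt_reads) * p_alt**alt_reads * (1 - p_alt)**ref_reads
        weights.append(prior * likelihood)
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical tetraploid locus: 6 reference reads, 2 alternative reads.
post = genotype_posteriors(ref_reads=6, alt_reads=2, alt_freq=0.3)

# Discrete call: the most probable copy number.
best = max(range(len(post)), key=post.__getitem__)
# Continuous genotype: the posterior mean copy number.
posterior_mean = sum(k * p for k, p in enumerate(post))
print(best, round(posterior_mean, 2))
```

The discrete call and the posterior mean correspond loosely to the "discrete" and "continuous" numerical genotypes mentioned in the abstract; with low read depth the posterior mean retains the uncertainty that a hard call discards.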
|