1
Gerard D. Double reduction estimation and equilibrium tests in natural autopolyploid populations. Biometrics 2023; 79:2143-2156. [PMID: 35848417 DOI: 10.1111/biom.13722] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 10/18/2021] [Accepted: 07/11/2022] [Indexed: 11/27/2022]
Abstract
Many bioinformatics pipelines include tests for equilibrium. Tests for diploids are well studied and widely available, but extending these approaches to autopolyploids is hampered by the presence of double reduction, the comigration of sister chromatid segments into the same gamete during meiosis. Though a hindrance for equilibrium tests, double reduction rates are quantities of interest in their own right, as they provide insights about the meiotic behavior of autopolyploid organisms. Here, we develop procedures to (i) test for equilibrium while accounting for double reduction, and (ii) estimate the double reduction rate given equilibrium. To do so, we take two approaches: a likelihood approach, and a novel U-statistic minimization approach that we show generalizes the classical equilibrium χ2 test in diploids. For small sample sizes and uncertain genotypes, we further develop a bootstrap procedure based on our U-statistic to test for equilibrium. We validate our methods on both simulated and real data.
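For context, the classical diploid χ2 equilibrium test that this abstract says the U-statistic approach generalizes can be sketched as follows. This is a minimal illustration of the textbook diploid test only, not the paper's autopolyploid procedure; the function name `hwe_chisq` and the genotype counts in the usage note are hypothetical.

```python
import math


def hwe_chisq(n_aa, n_ab, n_bb):
    """Textbook chi-square test of Hardy-Weinberg equilibrium at a
    biallelic diploid locus, given observed genotype counts."""
    n = n_aa + n_ab + n_bb
    # Estimated frequency of allele A from the genotype counts.
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1.0 - p
    observed = (n_aa, n_ab, n_bb)
    # Expected genotype counts under equilibrium: p^2, 2pq, q^2.
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Chi-square with 1 degree of freedom: P(X > stat) = erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value
```

With counts that match the equilibrium proportions exactly, such as `hwe_chisq(25, 50, 25)`, the statistic is 0; a sample with no heterozygotes, such as `hwe_chisq(50, 0, 50)`, gives a large statistic and a very small p-value.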
Affiliation(s)
- David Gerard
- Department of Mathematics and Statistics, American University, Washington, District of Columbia, USA
2
Chizk TM, Clark JR, Johns C, Nelson L, Ashrafi H, Aryal R, Worthington ML. Genome-wide association identifies key loci controlling blackberry postharvest quality. Front Plant Sci 2023; 14:1182790. [PMID: 37351206 PMCID: PMC10282842 DOI: 10.3389/fpls.2023.1182790] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/09/2023] [Accepted: 04/27/2023] [Indexed: 06/24/2023]
Abstract
Introduction: Blackberry (Rubus subgenus Rubus) is a soft-fruited specialty crop that often suffers economic losses due to degradation in the shipping process. During transportation, fresh-market blackberries commonly leak, decay, deform, or become discolored through a disorder known as red drupelet reversion (RDR). Over the past 50 years, breeding programs have achieved better fruit firmness and postharvest quality through traditional selection methods, but the underlying genetic variation is poorly understood. Methods: We conducted a genome-wide association of fruit firmness and RDR measured in 300 tetraploid fresh-market blackberry genotypes from 2019-2021 with 65,995 SNPs concentrated in genic regions of the R. argutus reference genome. Results: Fruit firmness and RDR had entry-mean broad sense heritabilities of 68% and 34%, respectively. Three variants on homologs of polygalacturonase (PG), pectin methylesterase (PME), and glucan endo-1,3-β-glucosidase explained 27% of variance in fruit firmness and were located on chromosomes Ra06, Ra01, and Ra02, respectively. Another PG homolog variant on chromosome Ra02 explained 8% of variance in RDR, but it was in strong linkage disequilibrium with 212 other RDR-associated SNPs across a 23 Mb region. A large cluster of six PME and PME inhibitor homologs was located near the fruit firmness quantitative trait locus (QTL) identified on Ra01. RDR and fruit firmness shared a significant negative correlation (r = -0.28) and overlapping QTL regions on Ra02 in this study. Discussion: Our work demonstrates the complex nature of postharvest quality traits in blackberry, which are likely controlled by many small-effect QTLs. This study is the first large-scale effort to map the genetic control of quantitative traits in blackberry and provides a strong framework for future GWAS. Phenotypic and genotypic datasets may be used to train genomic selection models that target the improvement of postharvest quality.
Affiliation(s)
- T. Mason Chizk
- Department of Horticulture, University of Arkansas, Fayetteville, AR, United States
- John R. Clark
- Department of Horticulture, University of Arkansas, Fayetteville, AR, United States
- Carmen Johns
- Department of Horticulture, University of Arkansas, Fayetteville, AR, United States
- Lacy Nelson
- Department of Horticulture, University of Arkansas, Fayetteville, AR, United States
- Hamid Ashrafi
- Department of Horticultural Science, North Carolina State University, Raleigh, NC, United States
- Rishi Aryal
- Department of Horticultural Science, North Carolina State University, Raleigh, NC, United States
3
Slonecki TJ, Rutter WB, Olukolu BA, Yencho GC, Jackson DM, Wadl PA. Genetic diversity, population structure, and selection of breeder germplasm subsets from the USDA sweetpotato (Ipomoea batatas) collection. Front Plant Sci 2023; 13:1022555. [PMID: 36816486 PMCID: PMC9932972 DOI: 10.3389/fpls.2022.1022555] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/18/2022] [Accepted: 10/28/2022] [Indexed: 06/18/2023]
Abstract
Sweetpotato (Ipomoea batatas) is the sixth most important food crop and plays a critical role in maintaining food security worldwide. Support for sweetpotato improvement research in breeding and genetics programs, and maintenance of sweetpotato germplasm collections is essential for preserving food security for future generations. Germplasm collections seek to preserve phenotypic and genotypic diversity through accession characterization. However, due to its genetic complexity, high heterogeneity, polyploid genome, phenotypic plasticity, and high flower production variability, sweetpotato genetic characterization is challenging. Here, we characterize the genetic diversity and population structure of 604 accessions from the sweetpotato germplasm collection maintained by the United States Department of Agriculture (USDA), Agricultural Research Service (ARS), Plant Genetic Resources Conservation Unit (PGRCU) in Griffin, Georgia, United States. Using the genotyping-by-sequencing platform (GBSpoly) and bioinformatic pipelines (ngsComposer and GBSapp), a total of 102,870 polymorphic SNPs with hexaploid dosage calls were identified from the 604 accessions. Discriminant analysis of principal components (DAPC) and Bayesian clustering identified six unique genetic groupings across seven broad geographic regions. Genetic diversity analyses using the hexaploid data set revealed ample genetic diversity among the analyzed collection in concordance with previous analyses. Following population structure and diversity analyses, breeder germplasm subsets of 24, 48, 96, and 384 accessions were established using K-means clustering with manual selection to maintain phenotypic and genotypic diversity. The genetic characterization of the PGRCU sweetpotato germplasm collection and breeder germplasm subsets developed in this study provide the foundation for future association studies and serve as precursors toward phenotyping studies aimed at linking genotype with phenotype.
Affiliation(s)
- Tyler J. Slonecki
- United States Vegetable Laboratory, Agricultural Research Service, United States Department of Agriculture, Charleston, SC, United States
- William B. Rutter
- United States Vegetable Laboratory, Agricultural Research Service, United States Department of Agriculture, Charleston, SC, United States
- Bode A. Olukolu
- Department of Entomology and Plant Pathology, University of Tennessee, Knoxville, TN, United States
- G. Craig Yencho
- Department of Horticultural Science, North Carolina State University, Raleigh, NC, United States
- D. Michael Jackson
- United States Vegetable Laboratory, Agricultural Research Service, United States Department of Agriculture, Charleston, SC, United States
- Phillip A. Wadl
- United States Vegetable Laboratory, Agricultural Research Service, United States Department of Agriculture, Charleston, SC, United States
4
Wang Z, Chen L, Li Q, Zhang H, Shan Y, Qi L, Wang H, Chen Y. Association between single-nucleotide polymorphism rs145497186 related to NDUFV2 and lumbar disc degeneration: a pilot case–control study. J Orthop Surg Res 2022; 17:473. [PMID: 36309697 PMCID: PMC9618206 DOI: 10.1186/s13018-022-03368-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/05/2022] [Accepted: 10/23/2022] [Indexed: 11/10/2022] Open
Abstract
Objective: The association between the single-nucleotide polymorphisms (SNPs) rs28742109, rs12955018, rs987850, rs8093805, rs12965084, and rs145497186, related to the gene NADH dehydrogenase [ubiquinone] flavoprotein 2 (NDUFV2), and lumbar disc degeneration (LDD) was preliminarily investigated in a small sample.
Methods: A total of 46 patients with LDD and 45 controls were recruited at Qilu Hospital of Shandong University, and each participant provided 5 mL of peripheral venous blood. DNA was extracted from the blood of each participant for genotyping. The frequency of each genotype in the case and control groups was determined, and the risk of LDD associated with each SNP genotype was analysed. Visual analogue scale (VAS) scores for the patients' degree of chronic low back pain were recorded, and the relationship between VAS scores and SNPs was analysed.
Results: After adjusting for the influence of sex, age, height, and weight on LDD, a significant association between SNP rs145497186 related to NDUFV2 and LDD persisted (P = 0.006). In addition, rs145497186 was found to be associated with chronic low back pain in LDD populations.
Conclusion: The NDUFV2 rs145497186 SNP could be associated with susceptibility to LDD and the degree of chronic low back pain. Supplementary Information: The online version contains supplementary material available at 10.1186/s13018-022-03368-y.
5
Gerard D. Comment on three papers about Hardy–Weinberg equilibrium tests in autopolyploids. Front Genet 2022; 13:1027209. [PMID: 36267399 PMCID: PMC9576855 DOI: 10.3389/fgene.2022.1027209] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/24/2022] [Accepted: 09/12/2022] [Indexed: 12/04/2022] Open
6
Francisco FR, Aono AH, da Silva CC, Gonçalves PS, Scaloppi Junior EJ, Le Guen V, Fritsche-Neto R, Souza LM, de Souza AP. Unravelling Rubber Tree Growth by Integrating GWAS and Biological Network-Based Approaches. Front Plant Sci 2021; 12:768589. [PMID: 34992619 PMCID: PMC8724537 DOI: 10.3389/fpls.2021.768589] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 08/31/2021] [Accepted: 11/02/2021] [Indexed: 06/08/2023]
Abstract
Hevea brasiliensis (rubber tree) is a large tree species of the Euphorbiaceae family with inestimable economic importance. Rubber tree breeding programs currently aim to improve growth and production, and the use of early genotype selection technologies can accelerate such processes, mainly with the incorporation of genomic tools, such as marker-assisted selection (MAS). However, few quantitative trait loci (QTLs) have been used successfully in MAS for complex characteristics. Recent research shows the efficiency of genome-wide association studies (GWAS) for locating QTL regions in different populations. In this way, the integration of GWAS, RNA-sequencing (RNA-Seq) methodologies, coexpression networks and enzyme networks can provide a better understanding of the molecular relationships involved in the definition of the phenotypes of interest, supplying research support for the development of appropriate genomic based strategies for breeding. In this context, this work presents the potential of using combined multiomics to decipher the mechanisms of genotype and phenotype associations involved in the growth of rubber trees. Using GWAS from a genotyping-by-sequencing (GBS) Hevea population, we were able to identify molecular markers in QTL regions with a main effect on rubber tree plant growth under constant water stress. The underlying genes were evaluated and incorporated into a gene coexpression network modelled with an assembled RNA-Seq-based transcriptome of the species, where novel gene relationships were estimated and evaluated through in silico methodologies, including an estimated enzymatic network. From all these analyses, we were able to estimate not only the main genes involved in defining the phenotype but also the interactions between a core of genes related to rubber tree growth at the transcriptional and translational levels. 
This work was the first to integrate multiomics analysis into the in-depth investigation of rubber tree plant growth, producing useful data for future genetic studies in the species and enhancing the efficiency of the species improvement programs.
Affiliation(s)
- Felipe Roberto Francisco
- Molecular Biology and Genetic Engineering Center (CBMEG), University of Campinas (UNICAMP), Campinas, Brazil
- Alexandre Hild Aono
- Molecular Biology and Genetic Engineering Center (CBMEG), University of Campinas (UNICAMP), Campinas, Brazil
- Carla Cristina da Silva
- Molecular Biology and Genetic Engineering Center (CBMEG), University of Campinas (UNICAMP), Campinas, Brazil
- Paulo S. Gonçalves
- Center of Rubber Tree and Agroforestry Systems, Agronomic Institute (IAC), Votuporanga, Brazil
- Vincent Le Guen
- Centre de Coopération Internationale en Recherche Agronomique pour le Développement (CIRAD), UMR AGAP, Montpellier, France
- AGAP, Univ Montpellier, CIRAD, INRAE, Institut Agro, Montpellier, France
- Roberto Fritsche-Neto
- Department of Genetics, Luiz de Queiroz College of Agriculture (ESALQ), University of São Paulo (USP), Piracicaba, Brazil
- Livia Moura Souza
- Molecular Biology and Genetic Engineering Center (CBMEG), University of Campinas (UNICAMP), Campinas, Brazil
- São Francisco University (USF), Itatiba, Brazil
- Anete Pereira de Souza
- Molecular Biology and Genetic Engineering Center (CBMEG), University of Campinas (UNICAMP), Campinas, Brazil
- Department of Plant Biology, Biology Institute, University of Campinas (UNICAMP), Campinas, Brazil
7
Yadav S, Ross EM, Aitken KS, Hickey LT, Powell O, Wei X, Voss-Fels KP, Hayes BJ. A linkage disequilibrium-based approach to position unmapped SNPs in crop species. BMC Genomics 2021; 22:773. [PMID: 34715779 PMCID: PMC8555328 DOI: 10.1186/s12864-021-08116-w] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 05/28/2021] [Accepted: 10/19/2021] [Indexed: 11/10/2022] Open
Abstract
Background: High-density SNP arrays are now available for a wide range of crop species. Despite the development of many tools for generating genetic maps, the genome position of many SNPs from these arrays is unknown. Here we propose a linkage disequilibrium (LD)-based algorithm to allocate unassigned SNPs to chromosome regions from sparse genetic maps. This algorithm was tested on sugarcane, wheat, and barley data sets. We calculated the algorithm's efficiency by masking SNPs with known locations, then assigning their position to the map with the algorithm, and finally comparing the assigned and true positions. Results: In the 20-fold cross-validation, the mean proportion of masked mapped SNPs that were placed by the algorithm to a chromosome was 89.53, 94.25, and 97.23% for sugarcane, wheat, and barley, respectively. Of the markers that were placed in the genome, 98.73, 96.45, and 98.53% of the SNPs were positioned on the correct chromosome. The mean correlations between known and new estimated SNP positions were 0.97, 0.98, and 0.97 for sugarcane, wheat, and barley. The LD-based algorithm was used to assign 5920 out of 21,251 unpositioned markers to the current Q208 sugarcane genetic map, representing the highest density genetic map for this species to date. Conclusions: Our LD-based approach can be used to accurately assign unpositioned SNPs to existing genetic maps, improving genome-wide association studies and genomic prediction in crop species with fragmented and incomplete genome assemblies. This approach will facilitate genomic-assisted breeding for many orphan crops that lack genetic and genomic resources.
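The general idea of LD-based placement can be illustrated with a short sketch. The abstract does not specify the algorithm's details, so everything below is an assumption made for illustration: LD is measured as the squared Pearson correlation of genotype dosages, the unmapped SNP inherits the location of the single mapped SNP with the highest r2, and the names `pearson_r2`, `place_snp`, and the `min_r2` threshold are hypothetical.

```python
def pearson_r2(x, y):
    """Squared Pearson correlation between two dosage vectors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        # A monomorphic SNP carries no LD information.
        return 0.0
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return (sxy * sxy) / (sxx * syy)


def place_snp(unmapped, mapped, min_r2=0.5):
    """Return (chromosome, position, r2) of the mapped SNP in strongest LD
    with the unmapped SNP, or None if no r2 reaches min_r2.

    mapped is a list of (chromosome, position, dosages) tuples.
    """
    best = None
    for chrom, pos, dosages in mapped:
        r2 = pearson_r2(unmapped, dosages)
        if r2 >= min_r2 and (best is None or r2 > best[2]):
            best = (chrom, pos, r2)
    return best
```

The masking check described in the abstract could then be imitated by hiding the location of a mapped SNP, calling `place_snp` on its dosages against the remaining mapped SNPs, and comparing the returned chromosome with the true one.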
Affiliation(s)
- Seema Yadav
- Queensland Alliance for Agriculture and Food Innovation, Queensland Bioscience Precinct, 306 Carmody Rd., St. Lucia, Brisbane, Queensland, 4067, Australia
- Elizabeth M Ross
- Queensland Alliance for Agriculture and Food Innovation, Queensland Bioscience Precinct, 306 Carmody Rd., St. Lucia, Brisbane, Queensland, 4067, Australia
- Karen S Aitken
- Agriculture and Food, CSIRO, Queensland Bioscience Precinct, St. Lucia, Brisbane, Queensland, 4067, Australia
- Lee T Hickey
- Queensland Alliance for Agriculture and Food Innovation, Queensland Bioscience Precinct, 306 Carmody Rd., St. Lucia, Brisbane, Queensland, 4067, Australia
- Owen Powell
- Queensland Alliance for Agriculture and Food Innovation, Queensland Bioscience Precinct, 306 Carmody Rd., St. Lucia, Brisbane, Queensland, 4067, Australia
- Xianming Wei
- Sugar Research Australia, Mackay, QLD, 4741, Australia
- Kai P Voss-Fels
- Queensland Alliance for Agriculture and Food Innovation, Queensland Bioscience Precinct, 306 Carmody Rd., St. Lucia, Brisbane, Queensland, 4067, Australia
- Ben J Hayes
- Queensland Alliance for Agriculture and Food Innovation, Queensland Bioscience Precinct, 306 Carmody Rd., St. Lucia, Brisbane, Queensland, 4067, Australia
8
Gerard D. Scalable bias-corrected linkage disequilibrium estimation under genotype uncertainty. Heredity (Edinb) 2021; 127:357-362. [PMID: 34373594 PMCID: PMC8479074 DOI: 10.1038/s41437-021-00462-5] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 04/20/2021] [Revised: 07/19/2021] [Accepted: 07/19/2021] [Indexed: 02/07/2023] Open
Abstract
Linkage disequilibrium (LD) estimates are often calculated genome-wide for use in many tasks, such as SNP pruning and LD decay estimation. However, in the presence of genotype uncertainty, naive approaches to calculating LD have extreme attenuation biases, incorrectly suggesting that SNPs are less dependent than in reality. These biases are particularly strong in polyploid organisms, which often exhibit greater levels of genotype uncertainty than diploids. A principled approach using maximum likelihood estimation with genotype likelihoods can reduce this bias, but is prohibitively slow for genome-wide applications. Here, we present scalable moment-based adjustments to LD estimates based on the marginal posterior distributions of the genotypes. We demonstrate, on both simulated and real data, that these moment-based estimators are as accurate as maximum likelihood estimators, but are almost as fast as naive approaches based only on posterior mean genotypes. This opens up bias-corrected LD estimation to genome-wide applications. In addition, we provide standard errors for these moment-based estimators. All methods discussed in this manuscript are implemented in the ldsep package, available on the Comprehensive R Archive Network (https://cran.r-project.org/package=ldsep).
Affiliation(s)
- David Gerard
- Department of Mathematics and Statistics, American University, Washington, DC, USA
9
Genome-wide approaches for the identification of markers and genes associated with sugarcane yellow leaf virus resistance. Sci Rep 2021; 11:15730. [PMID: 34344928 PMCID: PMC8333424 DOI: 10.1038/s41598-021-95116-1] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 04/12/2021] [Accepted: 07/19/2021] [Indexed: 11/10/2022] Open
Abstract
Sugarcane yellow leaf (SCYL), caused by the sugarcane yellow leaf virus (SCYLV) is a major disease affecting sugarcane, a leading sugar and energy crop. Despite damages caused by SCYLV, the genetic base of resistance to this virus remains largely unknown. Several methodologies have arisen to identify molecular markers associated with SCYLV resistance, which are crucial for marker-assisted selection and understanding response mechanisms to this virus. We investigated the genetic base of SCYLV resistance using dominant and codominant markers and genotypes of interest for sugarcane breeding. A sugarcane panel inoculated with SCYLV was analyzed for SCYL symptoms, and viral titer was estimated by RT-qPCR. This panel was genotyped with 662 dominant markers and 70,888 SNPs and indels with allele proportion information. We used polyploid-adapted genome-wide association analyses and machine-learning algorithms coupled with feature selection methods to establish marker-trait associations. While each approach identified unique marker sets associated with phenotypes, convergences were observed between them and demonstrated their complementarity. Lastly, we annotated these markers, identifying genes encoding emblematic participants in virus resistance mechanisms and previously unreported candidates involved in viral responses. Our approach could accelerate sugarcane breeding targeting SCYLV resistance and facilitate studies on biological processes leading to this trait.
10
Garreta L, Cerón‐Souza I, Palacio MR, Reyes‐Herrera PH. MultiGWAS: An integrative tool for Genome Wide Association Studies in tetraploid organisms. Ecol Evol 2021; 11:7411-7426. [PMID: 34188823 PMCID: PMC8216910 DOI: 10.1002/ece3.7572] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 11/20/2020] [Revised: 03/22/2021] [Accepted: 03/23/2021] [Indexed: 12/27/2022] Open
Abstract
Genome-wide association studies (GWASs) are essential to determine the genetic bases of either ecological or economic phenotypic variation across individuals within populations of model and nonmodel organisms. For this research question, it is common to replicate GWAS with different parameters and models to validate the reproducibility of the results. However, straightforward methodologies that manage both replication and tetraploid data are still missing. To solve this problem, we designed MultiGWAS, a tool that performs GWAS for diploid and tetraploid organisms by executing four software packages in parallel, two designed for polyploid data (GWASpoly and SHEsis) and two designed for diploid data (GAPIT and TASSEL). MultiGWAS has several advantages. It runs either in the command line or in a graphical interface, and it manages different genotype formats, including VCF. Moreover, it allows control for population structure and relatedness, and provides several quality control checks on genotype data. In addition, MultiGWAS can test for additive and dominant gene action models and, through a proprietary scoring function, select the best model to report its associations. Finally, it generates several reports that facilitate identifying false associations from both the significant and the best-ranked association Single Nucleotide Polymorphisms (SNPs) among the four software packages. We tested MultiGWAS with public tetraploid potato data for tuber shape and several simulated data sets under both additive and dominant models. These tests demonstrated that MultiGWAS is better at detecting reliable associations than using each of the four software packages individually. Moreover, the parallel analysis of polyploid and diploid software, which only MultiGWAS offers, demonstrates its utility in understanding the best genetic model behind the SNP association in tetraploid organisms. Therefore, MultiGWAS proved to be an excellent alternative for wrapping GWAS replication in diploid and tetraploid organisms in a single analysis environment.
Affiliation(s)
- Luis Garreta
- Corporación Colombiana de Investigación Agropecuaria (AGROSAVIA), CI Tibaitatá, Bogota, Colombia
- Ivania Cerón‐Souza
- Corporación Colombiana de Investigación Agropecuaria (AGROSAVIA), CI Tibaitatá, Bogota, Colombia
- Paula H. Reyes‐Herrera
- Corporación Colombiana de Investigación Agropecuaria (AGROSAVIA), CI Tibaitatá, Bogota, Colombia