1
Tergemina E, Ansari S, Salt DE, Hancock AM. Multiple independent MGR5 alleles contribute to a clinal pattern in leaf magnesium across the distribution of Arabidopsis thaliana. The New Phytologist 2025; 246:1861-1874. [PMID: 40125608 PMCID: PMC12018779 DOI: 10.1111/nph.70069] [Received: 10/29/2024] [Accepted: 02/25/2025] [Indexed: 03/25/2025]
Abstract
Magnesium (Mg) is a crucial element in plants, particularly for photosynthesis. Mg homeostasis is influenced by environmental and genetic factors, and our understanding of its variation in natural populations remains incomplete. We examine the variation in leaf Mg accumulation across the distribution of Arabidopsis thaliana, and we investigate the environmental and genetic factors associated with Mg levels. Using genome-wide association studies in both the widespread Eurasian population and a local-scale population in Cape Verde, we identify genetic factors associated with variation in leaf Mg. We validate our main results, including effect size estimates, using Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) mutagenesis. Our findings reveal a significant association between leaf Mg and latitude of origin. In Eurasia, we find a signal at the nutrient-response regulator, RAPTOR1A, and across the species range, we find that multiple alleles of the Mg transporter, MAGNESIUM RELEASE 5 (MGR5), underlie variation in leaf Mg and contribute to the observed latitudinal cline. Overall, our results indicate that the spatial distribution of leaf Mg in A. thaliana is affected by climatic and genetic factors, resulting in a latitudinal cline. Further, they show an example of allelic heterogeneity, in which multiple alleles at a single locus contribute to a trait and the formation of a phenotypic cline.
Affiliation(s)
- Emmanuel Tergemina
  - Department of Plant Developmental Biology, Max Planck Institute for Plant Breeding Research, Cologne 50829, Germany
- Shifa Ansari
  - Department of Plant Developmental Biology, Max Planck Institute for Plant Breeding Research, Cologne 50829, Germany
- David E. Salt
  - School of Biosciences, University of Nottingham, Sutton Bonington LE12 5RD, UK
- Angela M. Hancock
  - Department of Plant Developmental Biology, Max Planck Institute for Plant Breeding Research, Cologne 50829, Germany
  - Department of Botany and Plant Pathology, Purdue University, West Lafayette, IN 47907, USA
2
Kaczmarek T, Cubry P, Champion L, Causse S, Couderc M, Orjuela J, Uyoh EA, Oselebe HO, Dachi SN, Adje COA, Sekloka E, Achigan-Dako EG, Ibrahim Bio Yerima AR, Saidou SI, Bakasso Y, Diop BM, Gueye MC, Agyare RY, Adjebeng-Danquah J, Gueye M, Wieringa JJ, Vigouroux Y, Billot C, Barnaud A, Leclerc C. Independent domestication and cultivation histories of two West African indigenous fonio millet crops. Nat Commun 2025; 16:4067. [PMID: 40307323 PMCID: PMC12044004 DOI: 10.1038/s41467-025-59454-2] [Received: 09/28/2024] [Accepted: 04/23/2025] [Indexed: 05/02/2025] Open
Abstract
Crop evolutionary history and domestication processes are key issues for better conservation and effective use of crop genetic diversity. Black and white fonio (Digitaria iburua and D. exilis, respectively) are two small indigenous grain cereals grown in West Africa. The relationship between these two cultivated crops and wild Digitaria species is still unclear. Here, we analyse whole genome sequences of 265 accessions comprising these two cultivated species and their close wild relatives. We show that white and black fonio were the result of two independent domestications without gene flow. We infer a cultivation expansion that began at the outset of the CE era, coinciding with the earliest discovered archaeological fonio remains in Nigeria. Fonio population sizes declined a few centuries ago, probably due to a combination of several factors, including major social and agricultural changes, intensification of the slave trade and the introduction of new, less labour-intensive crops. The key knowledge and genomic resources outlined here will help to promote and conserve these neglected climate-resilient crops and thereby provide an opportunity to tailor agriculture to the changing world.
Affiliation(s)
- Thomas Kaczmarek
  - CIRAD, UMR AGAP Institut, Montpellier, France
  - AGAP Institut, University of Montpellier, CIRAD, INRAE, Institut Agro, Montpellier, France
  - DIADE, University of Montpellier, IRD, CIRAD, Montpellier, France
- Philippe Cubry
  - DIADE, University of Montpellier, IRD, CIRAD, Montpellier, France
- Louis Champion
  - DIADE, University of Montpellier, IRD, CIRAD, Montpellier, France
- Sandrine Causse
  - CIRAD, UMR AGAP Institut, Montpellier, France
  - AGAP Institut, University of Montpellier, CIRAD, INRAE, Institut Agro, Montpellier, France
- Marie Couderc
  - DIADE, University of Montpellier, IRD, CIRAD, Montpellier, France
- Julie Orjuela
  - DIADE, University of Montpellier, IRD, CIRAD, Montpellier, France
- Edak A Uyoh
  - Department of Genetics and Biotechnology, University of Calabar, Calabar, Nigeria
- Happiness O Oselebe
  - Center for Crop Improvement, Nutrition & Climate Change (CCINCC), Ebonyi State University, Abakaliki, Nigeria
- Stephen N Dachi
  - Department of Crop Production, Faculty of Agriculture, University of Jos, Jos, Plateau State, Nigeria
- Charlotte O A Adje
  - Genetics, Biotechnology and Seed Science Unit (GBioS), Laboratory of Plant Production, Physiology and Plant Breeding (PAGEV), School of Plant Sciences, University of Abomey-Calavi, Abomey-Calavi, Cotonou, Republic of Benin
- Enoch G Achigan-Dako
  - Genetics, Biotechnology and Seed Science Unit (GBioS), Laboratory of Plant Production, Physiology and Plant Breeding (PAGEV), School of Plant Sciences, University of Abomey-Calavi, Abomey-Calavi, Cotonou, Republic of Benin
- Abdou R Ibrahim Bio Yerima
  - Genetics, Biotechnology and Seed Science Unit (GBioS), Laboratory of Plant Production, Physiology and Plant Breeding (PAGEV), School of Plant Sciences, University of Abomey-Calavi, Abomey-Calavi, Cotonou, Republic of Benin
  - Department of Rainfed Crop Production (DCP), National Institute of Agronomic Research of Niger (INRAN), Niamey, Niger
- Sani Idi Saidou
  - Department of Plant Production and Biodiversity, Faculty of Agronomic and Ecologic Sciences, University of Diffa, Diffa, Niger
  - Laboratory for the Management and Valorization of Biodiversity in the Sahel (GeVaBioS), Abdou Moumouni University, Niamey, Niger
- Yacoubou Bakasso
  - Laboratory for the Management and Valorization of Biodiversity in the Sahel (GeVaBioS), Abdou Moumouni University, Niamey, Niger
  - Department of Biology, Faculty of Science and Technic, Abdou Moumouni University, Niamey, Niger
- Baye M Diop
  - Institut Sénégalais de Recherches Agricoles (ISRA), Centre d'Etude Régional pour l'Amélioration de l'Adaptation à la Sécheresse (CERAAS), Thiés, Sénégal
- Mame C Gueye
  - Institut Sénégalais de Recherches Agricoles (ISRA), Centre d'Etude Régional pour l'Amélioration de l'Adaptation à la Sécheresse (CERAAS), Thiés, Sénégal
- Richard Y Agyare
  - Council for Scientific and Industrial Research-Savanna Agricultural Research Institute (CSIR-SARI), Nyankpala, Ghana
- Joseph Adjebeng-Danquah
  - Council for Scientific and Industrial Research-Savanna Agricultural Research Institute (CSIR-SARI), Nyankpala, Ghana
- Mathieu Gueye
  - Laboratoire de Botanique, Département de Botanique et Géologie, IFAN Ch. A. Diop/UCAD, IRL 3189 « Environnement, Santé et Société », Université Cheikh Anta Diop, Dakar, Sénégal
- Yves Vigouroux
  - DIADE, University of Montpellier, IRD, CIRAD, Montpellier, France
- Claire Billot
  - CIRAD, UMR AGAP Institut, Montpellier, France
  - AGAP Institut, University of Montpellier, CIRAD, INRAE, Institut Agro, Montpellier, France
- Adeline Barnaud
  - DIADE, University of Montpellier, IRD, CIRAD, Montpellier, France
- Christian Leclerc
  - CIRAD, UMR AGAP Institut, Montpellier, France
  - AGAP Institut, University of Montpellier, CIRAD, INRAE, Institut Agro, Montpellier, France
3
Roberts MD, Davis O, Josephs EB, Williamson RJ. K-mer-based Approaches to Bridging Pangenomics and Population Genetics. Mol Biol Evol 2025; 42:msaf047. [PMID: 40111256 PMCID: PMC11925024 DOI: 10.1093/molbev/msaf047] [Received: 09/18/2024] [Revised: 01/10/2025] [Accepted: 02/04/2025] [Indexed: 03/12/2025] Open
Abstract
Many commonly studied species now have more than one chromosome-scale genome assembly, revealing a large amount of genetic diversity previously missed by approaches that map short reads to a single reference. However, many species still lack multiple reference genomes and correctly aligning references to build pangenomes can be challenging for many species, limiting our ability to study this missing genomic variation in population genetics. Here, we argue that k-mers are a very useful but underutilized tool for bridging the reference-focused paradigms of population genetics with the reference-free paradigms of pangenomics. We review current literature on the uses of k-mers for performing three core components of most population genetics analyses: identifying, measuring, and explaining patterns of genetic variation. We also demonstrate how different k-mer-based measures of genetic variation behave in population genetic simulations according to the choice of k, depth of sequencing coverage, and degree of data compression. Overall, we find that k-mer-based measures of genetic diversity scale consistently with pairwise nucleotide diversity (π) up to values of about π=0.025 (R²=0.97) for neutrally evolving populations. For populations with even more variation, using shorter k-mers will maintain the scalability up to at least π=0.1. Furthermore, in our simulated populations, k-mer dissimilarity values can be reliably approximated from counting bloom filters, highlighting a potential avenue to decreasing the memory burden of k-mer-based genomic dissimilarity analyses. For future studies, there is a great opportunity to further develop methods for identifying selected loci using k-mers.
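As a rough illustration of what a k-mer-based dissimilarity measure can look like, the sketch below computes a Jaccard dissimilarity between the k-mer sets of two sequences. This is an assumption chosen for illustration only: the abstract does not say which measures the authors evaluated, and the function names `kmer_set` and `jaccard_dissimilarity` and the toy sequences are hypothetical.

```python
def kmer_set(seq, k):
    """Return the set of all length-k substrings (k-mers) of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard_dissimilarity(a, b, k):
    """Return 1 - |A & B| / |A | B| over the k-mer sets of a and b.

    Two sequences with no k-mers at all (both shorter than k) are
    treated as identical and given a dissimilarity of 0.0.
    """
    ka, kb = kmer_set(a, k), kmer_set(b, k)
    union = ka | kb
    if not union:
        return 0.0
    return 1.0 - len(ka & kb) / len(union)


# Toy example (hypothetical data): alt differs from ref by one substitution.
ref = "ACGTACGTGA"
alt = "ACGTACCTGA"
dissimilarity = jaccard_dissimilarity(ref, alt, 3)
```

With k = 3 the single substitution changes three k-mers in each sequence, so the two sets share 5 of 9 distinct k-mers and `dissimilarity` is 4/9. In general one substitution can alter up to k k-mers, which is one reason the choice of k matters for how such measures scale with sequence divergence.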
Affiliation(s)
- Miles D Roberts
  - Genetics and Genome Sciences Program, Michigan State University, East Lansing, MI 48824, USA
- Olivia Davis
  - Department of Computer Science and Software Engineering, Rose-Hulman Institute of Technology, Terre Haute, IN 47803, USA
- Emily B Josephs
  - Department of Plant Biology, Michigan State University, East Lansing, MI 48824, USA
  - Ecology, Evolution, and Behavior Program, Michigan State University, East Lansing, MI 48824, USA
  - Plant Resilience Institute, Michigan State University, East Lansing, MI 48824, USA
- Robert J Williamson
  - Department of Computer Science and Software Engineering, Rose-Hulman Institute of Technology, Terre Haute, IN 47803, USA
  - Department of Biology and Biomedical Engineering, Rose-Hulman Institute of Technology, Terre Haute, IN 47803, USA
4
Jenike KM, Campos-Domínguez L, Boddé M, Cerca J, Hodson CN, Schatz MC, Jaron KS. k-mer approaches for biodiversity genomics. Genome Res 2025; 35:219-230. [PMID: 39890468 PMCID: PMC11874746 DOI: 10.1101/gr.279452.124] [Received: 04/11/2024] [Accepted: 01/09/2025] [Indexed: 02/03/2025]
Abstract
The wide array of currently available genomes displays a wonderful diversity in size, composition, and structure and is quickly expanding thanks to several global biodiversity genomics initiatives. However, sequencing of genomes, even with the latest technologies, can still be challenging for both technical (e.g., small physical size, contaminated samples, or access to appropriate sequencing platforms) and biological reasons (e.g., germline-restricted DNA, variable ploidy levels, sex chromosomes, or very large genomes). In recent years, k-mer-based techniques have become popular to overcome some of these challenges. They are based on the simple process of dividing the analyzed sequences (e.g., raw reads or genomes) into a set of subsequences of length k, called k-mers, and then analyzing the frequency or sequences of those k-mers. Analyses based on k-mers allow for a rapid and intuitive assessment of complex sequencing data sets. Here, we provide a comprehensive review of the theoretical properties and practical applications of k-mers in biodiversity genomics with a special focus on genome modeling.
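The process described in the abstract, splitting sequences into k-mers and then looking at their frequencies, can be sketched in a few lines. This is a minimal sketch under stated assumptions, not the tooling the review covers: the function names `count_kmers` and `kmer_spectrum` and the toy reads are hypothetical, and the sketch does not merge a k-mer with its reverse complement (canonical k-mers), which dedicated k-mer counters commonly do.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every length-k substring across an iterable of reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def kmer_spectrum(counts):
    """Map each multiplicity to the number of distinct k-mers seen that often."""
    return Counter(counts.values())


# Toy reads (hypothetical data) that overlap the same short region.
reads = ["ACGTAC", "CGTACG", "GTACGT"]
counts = count_kmers(reads, 4)
spectrum = kmer_spectrum(counts)
```

Here `counts` holds four distinct 4-mers (`GTAC` seen three times, the others twice), so `spectrum` maps multiplicity 2 to three k-mers and multiplicity 3 to one. A histogram of this kind computed over real reads is the usual starting point for k-mer-based genome modeling.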
Affiliation(s)
- Katharine M Jenike
  - Johns Hopkins University, School of Medicine, Baltimore, Maryland 21205, USA
- Lucía Campos-Domínguez
  - Centre for Research in Agricultural Genomics, CRAG (CSIC-IRTA-UAB-UB), Campus UAB, Cerdanyola del Vallès, 08193 Barcelona, Spain
- Marilou Boddé
  - Tree of Life, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, United Kingdom
- José Cerca
  - Center for Ecological and Evolutionary Synthesis, Department of Biosciences, University of Oslo, 0313 Oslo, Norway
- Christina N Hodson
  - University College London, UCL Department of Genetics, Evolution & Environment, London, WC1E 6BT, United Kingdom
- Michael C Schatz
  - Johns Hopkins University, School of Medicine, Baltimore, Maryland 21205, USA
- Kamil S Jaron
  - Tree of Life, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, United Kingdom
5
Teshome A, Habte E, Cheema J, Mekasha A, Lire H, Muktar MS, Quiroz-Chavez J, Domoney C, Jones CS. A population genomics approach to unlock the genetic potential of lablab (Lablab purpureus (L.) Sweet), an underutilized tropical forage crop. BMC Genomics 2024; 25:1241. [PMID: 39719589 PMCID: PMC11668113 DOI: 10.1186/s12864-024-11104-5] [Received: 12/06/2023] [Accepted: 11/28/2024] [Indexed: 12/26/2024] Open
Abstract
BACKGROUND: Lablab is one of the conventionally grown multi-purpose crops that originated in Africa. It is an annual or short-lived perennial forage legume which has versatile uses (as a vegetable and dry seeds, as food or feed, or as green manure) but is yet to receive adequate research attention and hence remains underexploited. To develop new and highly productive lablab varieties, using genomics-assisted selection, the present study aimed to identify quantitative trait loci associated with agronomically important traits in lablab and to assess the stability of these traits across two different agro-ecologies in Ethiopia. Here, we resequenced one hundred and forty-two lablab accessions, utilised whole genome genotyping approaches, and conducted multi-locational phenotyping over two years.
RESULTS: The selected lablab accessions displayed significant agro-morphological variation in eight analysed traits, including plant height, total fresh weight, and total dry weight. Furthermore, the agronomic performance of the accessions was significantly different across locations and years, highlighting substantial genotype-by-environment interactions. The population genetic structure of the lablab accessions, based on ~500,000 informative single nucleotide polymorphisms (SNPs), revealed an independent domestication pattern for two-seeded and four-seeded lablab accessions. Finally, based on multi-environmental trial data, a genome-wide association study (GWAS) identified useful SNPs and k-mers for key traits, such as plant height and total dry weight.
CONCLUSIONS: The publicly available genomic tools and field evaluation data from this study will offer a valuable resource for plant breeders and researchers to inform a new cycle of lablab breeding. With the aid of these tools, the breeding cycle will be significantly reduced and livestock farmers will have access to improved lablab varieties in a shorter time-frame.
Affiliation(s)
- A Teshome
  - Feed and Forage Development, International Livestock Research Institute, Addis Ababa, Ethiopia
- E Habte
  - Feed and Forage Development, International Livestock Research Institute, Addis Ababa, Ethiopia
- J Cheema
  - John Innes Centre, Norwich Research Park, Norwich, NR4 7UH, UK
- A Mekasha
  - Ethiopian Institute of Agricultural Research (EIAR), Melkassa Research Centre, Melkassa, Ethiopia
- H Lire
  - Ethiopian Institute of Agricultural Research (EIAR), Wondogenet Research Centre, Wondogenet, Ethiopia
- M S Muktar
  - Feed and Forage Development, International Livestock Research Institute, Addis Ababa, Ethiopia
- J Quiroz-Chavez
  - John Innes Centre, Norwich Research Park, Norwich, NR4 7UH, UK
- C Domoney
  - John Innes Centre, Norwich Research Park, Norwich, NR4 7UH, UK
- C S Jones
  - Feed and Forage Development, International Livestock Research Institute, Addis Ababa, Ethiopia
6
Kadam A, Shilo S, Naor H, Wainstein A, Brilon Y, Feldman T, Minden M, Kaushansky N, Chapal-Ilani N, Shlush L. Utilizing insights of DNA repair machinery to discover MMEJ deletions and novel mechanisms. Nucleic Acids Res 2024; 52:e106. [PMID: 39607705 DOI: 10.1093/nar/gkae1132] [Received: 09/12/2023] [Revised: 10/17/2024] [Accepted: 10/30/2024] [Indexed: 11/29/2024] Open
Abstract
We developed Del-read, an algorithm targeting medium-sized deletions (6-100 bp) in short reads, which are challenging for current variant callers relying on alignment. Our focus was on microhomology-mediated end joining deletions (MMEJ-dels), prevalent in myeloid malignancies. MMEJ-dels follow a distinct pattern, occurring between two homologies, allowing us to generate a comprehensive list of MMEJ-dels in the exome. Using Del-read, we identified numerous novel germline and somatic MMEJ-dels in BEAT-AML and TCGA-breast datasets. Validation in 672 healthy individuals confirmed their presence. These novel MMEJ-dels were linked to genomic features associated with replication stress, like G-quadruplexes and minisatellites. Additionally, we observed a new category of MMEJ-dels with an imperfect match at the flanking sequences of the homologies, suggesting a mechanism involving mispairing in homology alignment. We demonstrated robustness of the repair system despite CRISPR/Cas9-induced mismatches in the homologies. Further analysis of the canonical ASXL1 deletion revealed a diverse array of these imperfect matches. This suggests a potentially more flexible and error-prone MMEJ repair system than previously understood. Our findings highlight Del-read's potential in uncovering previously undetected deletions and deepen our understanding of repair mechanisms.
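The statement that MMEJ deletions occur between two homologies, which makes it possible to list candidate deletions in advance, can be illustrated with a small sketch. This is not the Del-read algorithm, whose details the abstract does not give. It only enumerates pairs of identical short sequences whose spacing falls in the 6-100 bp range quoted above; the function name `mmej_candidates`, the fixed `homology_len`, and the toy sequence are assumptions, and imperfect homologies are not handled.

```python
def mmej_candidates(seq, homology_len=3, min_del=6, max_del=100):
    """List candidate deletions flanked by two identical microhomologies.

    Returns (start, deletion_length, product) tuples. Deleting
    seq[start:start + deletion_length] removes one copy of the homology
    plus the spacer, leaving a single homology copy in the product.
    """
    candidates = []
    n = len(seq)
    for i in range(n - homology_len + 1):
        left = seq[i:i + homology_len]
        # Only consider second copies whose distance is a permitted deletion size.
        last_j = min(i + max_del, n - homology_len)
        for j in range(i + min_del, last_j + 1):
            if seq[j:j + homology_len] == left:
                candidates.append((i, j - i, seq[:i] + seq[j:]))
    return candidates


# Toy sequence (hypothetical data): two GCT copies separated by TTTTT.
toy = "AAGCTTTTTTGCTCC"
found = mmej_candidates(toy)
```

For the toy sequence the only candidate is an 8 bp deletion starting at index 2, giving the product `AAGCTCC`. The k-mers spanning such predicted junctions could then be searched for directly in reads, which is one way to avoid relying on alignment, though whether Del-read works this way is not stated in the abstract.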
Affiliation(s)
- Aditee Kadam
  - Department of Molecular Cell Biology, Weizmann Institute of Science, Rehovot 761001, Israel
- Shay Shilo
  - Department of Molecular Cell Biology, Weizmann Institute of Science, Rehovot 761001, Israel
- Hadas Naor
  - Department of Molecular Cell Biology, Weizmann Institute of Science, Rehovot 761001, Israel
- Alexander Wainstein
  - Department of Molecular Cell Biology, Weizmann Institute of Science, Rehovot 761001, Israel
- Yardena Brilon
  - Sequentify Ltd., 10 Moti Kind St., 5th Floor, Rehovot 7638519, Israel
- Tzah Feldman
  - Department of Molecular Cell Biology, Weizmann Institute of Science, Rehovot 761001, Israel
- Mark Minden
  - Princess Margaret Cancer Centre, University Health Network (UHN), Department of Medical Oncology & Hematology, Toronto, ON M5G 2C4, Canada
- Nathali Kaushansky
  - Department of Molecular Cell Biology, Weizmann Institute of Science, Rehovot 761001, Israel
- Noa Chapal-Ilani
  - Department of Molecular Cell Biology, Weizmann Institute of Science, Rehovot 761001, Israel
- Liran Shlush
  - Department of Molecular Cell Biology, Weizmann Institute of Science, Rehovot 761001, Israel
  - Molecular Hematology Clinic Maccabi Healthcare Services, Tel Aviv 6812509, Israel
7
Wiersma AT, Hamilton JP, Vaillancourt B, Brose J, Awale HE, Wright EM, Kelly JD, Buell CR. k-mer genome-wide association study for anthracnose and BCMV resistance in a Phaseolus vulgaris Andean Diversity Panel. The Plant Genome 2024; 17:e20523. [PMID: 39397345 PMCID: PMC11628888 DOI: 10.1002/tpg2.20523] [Received: 07/08/2024] [Revised: 09/10/2024] [Accepted: 09/11/2024] [Indexed: 10/15/2024]
Abstract
Access to broad genomic resources and closely linked marker-trait associations for common beans (Phaseolus vulgaris L.) can facilitate development of improved varieties with increased yield, improved market quality traits, and enhanced disease resistance. The emergence of virulent races of anthracnose (caused by Colletotrichum lindemuthianum) and bean common mosaic virus (BCMV) highlights the need for improved methods to identify and incorporate pan-genomic variation in breeding for disease resistance. We sequenced the P. vulgaris Andean Diversity Panel (ADP) and performed a genome-wide association study (GWAS) to identify associations for resistance to BCMV and eight races of anthracnose. Historical single nucleotide polymorphism (SNP)-chip and phenotypic data enabled a three-way comparison between SNP-chip, reference-based whole genome shotgun sequence (WGS)-SNP, and reference-free k-mer (short nucleotide subsequence) GWAS. Across all traits, there was excellent concordance between SNP-chip, WGS-SNP, and k-mer GWAS results, albeit at a much higher marker resolution for the WGS data sets. Significant k-mer haplotype variation revealed selection of the linked I-gene and Co-u traits in North American breeding lines and cultivars. Due to structural variation, only 9.1 to 47.3% of the significantly associated k-mers could be mapped to the reference genome. Thus, to determine the genetic context of cis-associated k-mers, we generated draft whole genome assemblies of four ADP accessions and identified an expanded local repertoire of disease resistance genes associated with resistance to anthracnose and BCMV. With access to variant data in the context of a pan-genome, high resolution mapping of agronomic traits for common bean is now feasible.
Affiliation(s)
- Andrew T. Wiersma
  - Archer Daniels Midland Company, New Plymouth, Idaho, USA
  - Department of Plant, Soil and Microbial Sciences, Michigan State University, East Lansing, Michigan, USA
  - Plant Resilience Institute, Michigan State University, East Lansing, Michigan, USA
- John P. Hamilton
  - Department of Plant Biology, Michigan State University, East Lansing, Michigan, USA
  - Center for Applied Genetic Technologies, University of Georgia, Athens, Georgia, USA
  - Department of Crop and Soil Sciences, University of Georgia, Athens, Georgia, USA
- Brieanne Vaillancourt
  - Department of Plant Biology, Michigan State University, East Lansing, Michigan, USA
  - Center for Applied Genetic Technologies, University of Georgia, Athens, Georgia, USA
- Julia Brose
  - Department of Plant Biology, Michigan State University, East Lansing, Michigan, USA
  - Center for Applied Genetic Technologies, University of Georgia, Athens, Georgia, USA
- Halima E. Awale
  - Department of Plant, Soil and Microbial Sciences, Michigan State University, East Lansing, Michigan, USA
- Evan M. Wright
  - Department of Plant, Soil and Microbial Sciences, Michigan State University, East Lansing, Michigan, USA
- James D. Kelly
  - Department of Plant, Soil and Microbial Sciences, Michigan State University, East Lansing, Michigan, USA
- C. Robin Buell
  - Plant Resilience Institute, Michigan State University, East Lansing, Michigan, USA
  - Department of Plant Biology, Michigan State University, East Lansing, Michigan, USA
  - Center for Applied Genetic Technologies, University of Georgia, Athens, Georgia, USA
  - Department of Crop and Soil Sciences, University of Georgia, Athens, Georgia, USA
  - Institute of Plant Breeding, Genetics & Genomics, University of Georgia, Athens, Georgia, USA
  - The Plant Center, University of Georgia, Athens, Georgia, USA
8
Couto EGO, Morales-Marroquín JA, Alves-Pereira A, Fernandes SB, Colombo CA, de Azevedo-Filho JA, Carvalho CRL, Zucchi MI. Genome-wide association insights into the genomic regions controlling vegetative and oil production traits in Acrocomia aculeata. BMC Plant Biology 2024; 24:1125. [PMID: 39587483 PMCID: PMC11590364 DOI: 10.1186/s12870-024-05805-y] [Received: 04/20/2024] [Accepted: 11/11/2024] [Indexed: 11/27/2024]
Abstract
BACKGROUND: Macauba (Acrocomia aculeata) is a non-domesticated neotropical palm that has been attracting attention for economic use due to its great potential for oil production, comparable to that of the commercially used oil palm (Elaeis guineensis). The discovery of associations between quantitative trait loci and economically important traits represents an advance toward understanding its genetic architecture and can contribute to accelerating macauba domestication. Pursuing this advance, this study fits single-trait and multi-trait GWAS models to identify candidate genes associated with vegetative and oil production traits in macauba. Eighteen phenotypic traits were evaluated from 201 palms within a native population. Genotyping was performed with SNP markers, following the protocol of genotyping-by-sequencing. Given that macauba lacks a reference genome, SNP calling was performed using three different strategies: (i) de novo sequencing, (ii) the Elaeis guineensis Jacq. reference genome, and (iii) the macauba transcriptome sequences. After quality control, we identified a total of 27,410 SNPs in 153 individuals for the de novo genotypic dataset, 10,444 SNPs in 158 individuals using the oil palm genotypic dataset, and 4,329 SNPs in 167 individuals using the transcriptome genotypic dataset. The GWAS analysis was then performed on these three genotypic datasets.
RESULTS: Statistical phenotypic analyses revealed significant differences across all studied traits, with heritability values ranging from 63 to 95%. This indicates that the population contains promising genotypes for selection and the initiation of breeding programs. Genetic correlations between the 18 traits ranged from -0.47 to 0.99. The total number of significant SNPs in the single-trait and multi-trait GWAS was 92 and 6 using the de novo genotypic dataset, 19 and 11 using the oil palm genotypic dataset, and 1 and 2 using the transcriptome genotypic dataset, respectively. Gene annotation identified 12 candidate genes in the single-trait GWAS and four in the multi-trait GWAS, across the 18 phenotypic traits studied, in the three genotypic datasets. Gene mapping of the macauba candidate genes revealed similarities with Elaeis guineensis and Phoenix dactylifera. The candidate genes detected are responsible for metal ion binding and transport, protein transportation, DNA repair, and other cell regulation biological processes.
CONCLUSIONS: We provide new insights into genomic regions that map candidate genes associated with vegetative and oil production traits in macauba. These potential candidate genes require confirmation through targeted functional analyses in the future, and multi-trait associations need to be scrutinized to investigate the presence of pleiotropic or linked genes. Markers linked to traits of interest could serve as valuable resources for the development of marker-assisted selection in macauba for its domestication and pre-breeding.
Affiliation(s)
- Evellyn G O Couto
  - Department of Genetics, "Luiz de Queiroz" College of Agriculture, São Paulo University (ESALQ/USP), Piracicaba, Brazil
- Jonathan A Morales-Marroquín
  - Department of Genetics, "Luiz de Queiroz" College of Agriculture, São Paulo University (ESALQ/USP), Piracicaba, Brazil
- Samuel B Fernandes
  - Department of Crop, Soil, and Environmental Sciences, Center of Agricultural Data Analytics, University of Arkansas, Fayetteville, USA
- Carlos Augusto Colombo
  - Research Center of Plant Genetic Resources, Campinas Agronomic Institute, Campinas, Brazil
- Maria Imaculada Zucchi
  - Department of Genetics, "Luiz de Queiroz" College of Agriculture, São Paulo University (ESALQ/USP), Piracicaba, Brazil
  - Polo Centro Sul, São Paulo Agency for Agribusiness Technology (APTA), Piracicaba, Brazil
9
Ge M, Li C, Zhang Z. SNP-Based and Kmer-Based eQTL Analysis Using Transcriptome Data. Animals (Basel) 2024; 14:2941. [PMID: 39457872 PMCID: PMC11503742 DOI: 10.3390/ani14202941] [Received: 09/18/2024] [Revised: 10/08/2024] [Accepted: 10/09/2024] [Indexed: 10/28/2024] Open
Abstract
Traditional expression quantitative trait locus (eQTL) mapping associates single nucleotide polymorphisms (SNPs) with gene expression, where the SNPs are derived from large-scale whole-genome sequencing (WGS) data or transcriptome data. While WGS provides a high SNP density, it also incurs substantial sequencing costs. In contrast, RNA-seq data, which are more accessible and less expensive, can simultaneously yield gene expression levels and SNPs. Thus, eQTL analysis based on RNA-seq offers significant potential applications. Two primary strategies were employed for eQTL analysis in this study. The first involved analyzing expression levels in relation to variant sites detected between populations from RNA-seq data. The second approach utilized kmers, which are sequences of length k derived from RNA-seq reads, to represent variant sites and associated these kmer genotypes with gene expression. We discovered 87 significant association signals involving eGenes on the basis of the SNP-based eQTL analysis. These genes include DYNLT1, NMNAT1, and MRLC2, which are closely related to neurological functions such as motor coordination and homeostasis, play a role in cellular energy metabolism, and function in regulating calcium-dependent signaling in muscle contraction, respectively. This study compared the results obtained from eQTL mapping using RNA-seq-identified SNPs and gene expression with those derived from kmers. We found that the vast majority (23/30) of the association signals overlapping the two methods could be verified by haplotype block analysis. This comparison elucidates the strengths and limitations of each method, providing insights into their relative efficacy for eQTL identification.
Affiliation(s)
- Zhiyan Zhang
  - National Key Laboratory for Swine Genetic Improvement and Germplasm Innovation Technology, Jiangxi Agricultural University, Nanchang 330045, China; (M.G.); (C.L.)
10
He C, Washburn JD, Schleif N, Hao Y, Kaeppler H, Kaeppler SM, Zhang Z, Yang J, Liu S. Trait association and prediction through integrative k-mer analysis. The Plant Journal: For Cell and Molecular Biology 2024; 120:833-850. [PMID: 39259496 DOI: 10.1111/tpj.17012] [Received: 02/03/2024] [Revised: 08/14/2024] [Accepted: 08/22/2024] [Indexed: 09/13/2024]
Abstract
Genome-wide association study (GWAS) with single nucleotide polymorphisms (SNPs) has been widely used to explore genetic controls of phenotypic traits. Alternatively, GWAS can use counts of substrings of length k from longer sequencing reads, k-mers, as genotyping data. Using maize cob and kernel color traits, we demonstrated that k-mer GWAS can effectively identify associated k-mers. Co-expression analysis of kernel color k-mers and genes directly found k-mers from known causal genes. Analyzing complex traits of kernel oil and leaf angle resulted in k-mers from both known and candidate genes. A gene encoding a MADS transcription factor was functionally validated by showing that ectopic expression of the gene led to less upright leaves. Evolution analysis revealed most k-mers positively correlated with kernel oil were strongly selected against in maize populations, while most k-mers for upright leaf angle were positively selected. In addition, genomic prediction of kernel oil, leaf angle, and flowering time using k-mer data resulted in a similarly high prediction accuracy to the standard SNP-based method. Collectively, we showed k-mer GWAS is a powerful approach for identifying trait-associated genetic elements. Further, our results demonstrated the bridging role of k-mers for data integration and functional gene discovery.
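The idea of using k-mer counts as genotype data can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `count_kmers` and `kmer_table` and the toy reads are hypothetical, and practical tools usually also merge each k-mer with its reverse complement and filter rare k-mers, which is omitted here.

```python
from collections import Counter

def count_kmers(read, k):
    """Count every substring of length k in a single read."""
    return Counter(read[i:i + k] for i in range(len(read) - k + 1))

def kmer_table(samples, k):
    """Build a sample-by-k-mer count table from {sample_name: [reads]}."""
    per_sample = {}
    for name, reads in samples.items():
        total = Counter()
        for read in reads:
            total.update(count_kmers(read, k))
        per_sample[name] = total
    # Columns are the sorted union of all k-mers seen in any sample.
    kmers = sorted(set().union(*per_sample.values()))
    table = {name: [counts[km] for km in kmers]
             for name, counts in per_sample.items()}
    return kmers, table

# Hypothetical toy reads for two samples.
samples = {"line_A": ["ACGTAC"], "line_B": ["ACGTTC"]}
kmers, table = kmer_table(samples, k=3)
print(kmers)
for name, row in table.items():
    print(name, row)
```

Each row of the resulting table can then serve as a genotype vector for association testing, in the same role a SNP matrix plays in a standard GWAS.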
Affiliation(s)
- Cheng He
- Department of Plant Pathology, Kansas State University, Manhattan, Kansas, 66506, USA
- Jacob D Washburn
- Plant Genetics Research Unit, USDA-ARS, Columbia, Missouri, 65211, USA
- Nathaniel Schleif
- Department of Agronomy, University of Wisconsin-Madison, Madison, Wisconsin, 53706, USA
- Yangfan Hao
- Department of Plant Pathology, Kansas State University, Manhattan, Kansas, 66506, USA
- Heidi Kaeppler
- Department of Agronomy, University of Wisconsin-Madison, Madison, Wisconsin, 53706, USA
- Shawn M Kaeppler
- Department of Agronomy, University of Wisconsin-Madison, Madison, Wisconsin, 53706, USA
- Zhiwu Zhang
- Department of Crop and Soil Sciences, Washington State University, Pullman, Washington, 99164, USA
- Jinliang Yang
- Department of Agronomy and Horticulture, University of Nebraska-Lincoln, Lincoln, Nebraska, 68583-0915, USA
- Center for Plant Science Innovation, University of Nebraska-Lincoln, Lincoln, Nebraska, 68583, USA
- Sanzhen Liu
- Department of Plant Pathology, Kansas State University, Manhattan, Kansas, 66506, USA
11
Zhou H, Du W, Ouyang D, Li Y, Gong Y, Yao Z, Zhong M, Zhong X, Ye X. Simple and accurate genomic classification model for distinguishing between human and pig Staphylococcus aureus. Commun Biol 2024; 7:1171. [PMID: 39294434 PMCID: PMC11410946 DOI: 10.1038/s42003-024-06883-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/15/2024] [Accepted: 09/11/2024] [Indexed: 09/20/2024] Open
Abstract
Staphylococcus aureus (S. aureus) can cause various infections in humans and animals, contributing to high morbidity and mortality. To prevent and control cross-species transmission of S. aureus, it is necessary to understand the host-associated genetic variants. We performed a two-stage genome-wide association study (GWAS) including initial screening and further validation to compare genomic differences between human and pig S. aureus, aiming to identify host-associated determinants. Our multiple GWAS analyses found six consensus significant k-mers associated with host species, providing novel genetic evidence for distinguishing human from pig S. aureus. The best k-mer predictor achieved a high classification accuracy of 98.12% on its own and had extremely high resolution similar to the SNPs-based phylogeny, offering a very simple target for predicting the cross-species transmission risk of S. aureus. The final k-mer model revealed that 90% of S. aureus isolates from farm workers were predicted as livestock origin, suggesting a high risk of cross-species transmission. Bayesian inference revealed different cross-species transmission directions, with the human-to-pig transmission for ST5 and the pig-to-human transmission for ST398. Our findings provide a simple and accurate k-mer model for identifying and predicting the cross-species transmission risk of S. aureus.
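How a single k-mer can act as a host classifier, and how its accuracy is scored, can be sketched in a few lines. Everything here is hypothetical and for illustration only: the function name, the toy sequences, the k-mer, and the labels are not taken from the study, and the 98.12% figure above cannot be reproduced from this sketch.

```python
def classification_accuracy(genomes, labels, kmer,
                            label_if_present, label_if_absent):
    """Predict a label from k-mer presence and return the fraction correct."""
    predictions = [label_if_present if kmer in genome else label_if_absent
                   for genome in genomes]
    correct = sum(pred == truth for pred, truth in zip(predictions, labels))
    return correct / len(labels)

# Hypothetical toy isolates: sequence fragments with known host of origin.
genomes = ["AAACGT", "TTTCGA", "ACGTTT", "GGGGGG"]
labels = ["pig", "human", "pig", "pig"]

accuracy = classification_accuracy(genomes, labels, kmer="ACGT",
                                   label_if_present="pig",
                                   label_if_absent="human")
print(f"accuracy of the single k-mer predictor: {accuracy:.2%}")
```

In practice the accuracy would be estimated on isolates held out from the discovery stage, in the spirit of the two-stage screening and validation design described in the abstract.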
Affiliation(s)
- Huiliu Zhou
- School of Public Health, Guangdong Pharmaceutical University, Guangzhou, China
- Wenyin Du
- School of Public Health, Guangdong Pharmaceutical University, Guangzhou, China
- Dejia Ouyang
- School of Public Health, Guangdong Pharmaceutical University, Guangzhou, China
- Yuehe Li
- School of Public Health, Guangdong Pharmaceutical University, Guangzhou, China
- Yajie Gong
- School of Public Health, Guangdong Pharmaceutical University, Guangzhou, China
- Zhenjiang Yao
- School of Public Health, Guangdong Pharmaceutical University, Guangzhou, China
- Minghao Zhong
- Department of Prevention and Health Care, The Sixth People's Hospital of Dongguan, Dongguan, China
- Xinguang Zhong
- Department of Prevention and Health Care, The Sixth People's Hospital of Dongguan, Dongguan, China.
- Xiaohua Ye
- School of Public Health, Guangdong Pharmaceutical University, Guangzhou, China.
12
Roberts M, Josephs EB. Previously unmeasured genetic diversity explains part of Lewontin's paradox in a k-mer-based meta-analysis of 112 plant species. bioRxiv: the preprint server for biology 2024:2024.05.17.594778. [PMID: 38798362 PMCID: PMC11118579 DOI: 10.1101/2024.05.17.594778] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/29/2024]
Abstract
At the molecular level, most evolution is expected to be neutral. A key prediction of this expectation is that the level of genetic diversity in a population should scale with population size. However, as was noted by Richard Lewontin in 1974 and reaffirmed by later studies, the slope of the population size-diversity relationship in nature is much weaker than expected under neutral theory. We hypothesize that one contributor to this paradox is that current methods relying on single nucleotide polymorphisms (SNPs) called from aligning short reads to a reference genome underestimate levels of genetic diversity in many species. To test this idea, we calculated nucleotide diversity (π) and k-mer-based metrics of genetic diversity across 112 plant species, amounting to over 205 terabases of DNA sequencing data from 27,488 individual plants. We then compared how these different metrics correlated with proxies of population size that account for both range size and population density variation across species. We found that our population size proxies scaled anywhere from about 3 to over 20 times faster with k-mer diversity than nucleotide diversity after adjusting for evolutionary history, mating system, life cycle habit, cultivation status, and invasiveness. The relationship between k-mer diversity and population size proxies also remains significant after correcting for genome size, whereas the analogous relationship for nucleotide diversity does not. These results suggest that variation not captured by common SNP-based analyses explains part of Lewontin's paradox in plants.
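For readers unfamiliar with the SNP-side metric, nucleotide diversity (π) is the average number of pairwise differences per site among sampled sequences. The sketch below computes it directly from a small set of aligned, equal-length sequences. It is an illustration of the standard definition only: the function name and toy sequences are hypothetical, and the preprint's k-mer-based metrics are not specified in the abstract and are not reproduced here.

```python
from itertools import combinations

def nucleotide_diversity(seqs):
    """Average pairwise differences per site among aligned sequences."""
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")
    pairs = list(combinations(seqs, 2))
    total_diffs = sum(sum(a != b for a, b in zip(s1, s2))
                      for s1, s2 in pairs)
    return total_diffs / (len(pairs) * length)

# Hypothetical toy alignment: three sequences of four sites each.
alignment = ["ACGT", "ACGA", "ACGT"]
print(f"pi = {nucleotide_diversity(alignment):.4f}")
```

Because this definition depends on an alignment, sequence that cannot be aligned to the reference contributes nothing to π, which is the kind of unmeasured variation the abstract argues k-mer metrics can capture.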
Affiliation(s)
- Miles Roberts
- Genetics and Genome Sciences Program, Michigan State University, East Lansing MI
- Emily B. Josephs
- Department of Plant Biology, Michigan State University, East Lansing, MI
- Ecology, Evolution, and Behavior Program, Michigan State University, East Lansing, MI
- Plant Resilience Institute, Michigan State University, East Lansing, MI
13
Wang H, Chen M, Wei X, Xia R, Pei D, Huang X, Han B. Computational tools for plant genomics and breeding. Science China. Life Sciences 2024; 67:1579-1590. [PMID: 38676814 DOI: 10.1007/s11427-024-2578-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/05/2024] [Accepted: 03/25/2024] [Indexed: 04/29/2024]
Abstract
Plant genomics and crop breeding are at the intersection of biotechnology and information technology. Driven by a combination of high-throughput sequencing, molecular biology and data science, great advances have been made in omics technologies at every step along the central dogma, especially in genome assembling, genome annotation, epigenomic profiling, and transcriptome profiling. These advances further revolutionized three directions of development. One is genetic dissection of complex traits in crops, along with genomic prediction and selection. The second is comparative genomics and evolution, which open up new opportunities to depict the evolutionary constraints of biological sequences for deleterious variant discovery. The third direction is the development of deep learning approaches for the rational design of biological sequences, especially proteins, for synthetic biology. All three directions of development serve as the foundation for a new era of crop breeding where agronomic traits are enhanced by genome design.
Affiliation(s)
- Hai Wang
- State Key Laboratory of Maize Bio-breeding, Frontiers Science Center for Molecular Design Breeding, Joint International Research Laboratory of Crop Molecular Breeding, National Maize Improvement Center, College of Agronomy and Biotechnology, China Agricultural University, Beijing, 100193, China.
- Sanya Institute of China Agricultural University, Sanya, 572025, China.
- Hainan Yazhou Bay Seed Laboratory, Sanya, 572025, China.
- Mengjiao Chen
- State Key Laboratory of Tree Genetics and Breeding, Key Laboratory of Tree Breeding and Cultivation of the State Forestry and Grassland Administration, Research Institute of Forestry, Chinese Academy of Forestry, Beijing, 100091, China
- Xin Wei
- Shanghai Key Laboratory of Plant Molecular Sciences, College of Life Sciences, Shanghai Normal University, Shanghai, 200234, China
- Rui Xia
- College of Horticulture, South China Agricultural University, Guangzhou, 510640, China
- Dong Pei
- State Key Laboratory of Tree Genetics and Breeding, Key Laboratory of Tree Breeding and Cultivation of the State Forestry and Grassland Administration, Research Institute of Forestry, Chinese Academy of Forestry, Beijing, 100091, China
- Xuehui Huang
- Shanghai Key Laboratory of Plant Molecular Sciences, College of Life Sciences, Shanghai Normal University, Shanghai, 200234, China
- Bin Han
- National Center for Gene Research, CAS Center for Excellence in Molecular Plant Sciences, Chinese Academy of Sciences, Shanghai, 200233, China
14
Schreiber M, Jayakodi M, Stein N, Mascher M. Plant pangenomes for crop improvement, biodiversity and evolution. Nat Rev Genet 2024; 25:563-577. [PMID: 38378816 PMCID: PMC7616794 DOI: 10.1038/s41576-024-00691-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 12/14/2023] [Indexed: 02/22/2024]
Abstract
Plant genome sequences catalogue genes and the genetic elements that regulate their expression. Such inventories further research aims as diverse as mapping the molecular basis of trait diversity in domesticated plants or inquiries into the origin of evolutionary innovations in flowering plants millions of years ago. The transformative technological progress of DNA sequencing in the past two decades has enabled researchers to sequence ever more genomes with greater ease. Pangenomes - complete sequences of multiple individuals of a species or higher taxonomic unit - have now entered the geneticists' toolkit. The genomes of crop plants and their wild relatives are being studied with translational applications in breeding in mind. But pangenomes are applicable also in ecological and evolutionary studies, as they help classify and monitor biodiversity across the tree of life, deepen our understanding of how plant species diverged and show how plants adapt to changing environments or new selection pressures exerted by human beings.
Affiliation(s)
- Mona Schreiber
- Department of Biology, University of Marburg, Marburg, Germany
- Murukarthick Jayakodi
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, Seeland, Germany
- Nils Stein
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, Seeland, Germany
- Martin Luther University Halle-Wittenberg, Halle (Saale), Germany
- Martin Mascher
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, Seeland, Germany.
- German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Leipzig, Germany.
15
Cheng S, Feng C, Wingen LU, Cheng H, Riche AB, Jiang M, Leverington-Waite M, Huang Z, Collier S, Orford S, Wang X, Awal R, Barker G, O'Hara T, Lister C, Siluveru A, Quiroz-Chávez J, Ramírez-González RH, Bryant R, Berry S, Bansal U, Bariana HS, Bennett MJ, Bicego B, Bilham L, Brown JKM, Burridge A, Burt C, Buurman M, Castle M, Chartrain L, Chen B, Denbel W, Elkot AF, Fenwick P, Feuerhelm D, Foulkes J, Gaju O, Gauley A, Gaurav K, Hafeez AN, Han R, Horler R, Hou J, Iqbal MS, Kerton M, Kondic-Spica A, Kowalski A, Lage J, Li X, Liu H, Liu S, Lovegrove A, Ma L, Mumford C, Parmar S, Philp C, Playford D, Przewieslik-Allen AM, Sarfraz Z, Schafer D, Shewry PR, Shi Y, Slafer GA, Song B, Song B, Steele D, Steuernagel B, Tailby P, Tyrrell S, Waheed A, Wamalwa MN, Wang X, Wei Y, Winfield M, Wu S, Wu Y, Wulff BBH, Xian W, Xu Y, Xu Y, Yuan Q, Zhang X, Edwards KJ, Dixon L, Nicholson P, Chayut N, Hawkesford MJ, Uauy C, Sanders D, Huang S, Griffiths S. Harnessing landrace diversity empowers wheat breeding. Nature 2024; 632:823-831. [PMID: 38885696 PMCID: PMC11338829 DOI: 10.1038/s41586-024-07682-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/18/2023] [Accepted: 06/06/2024] [Indexed: 06/20/2024]
Abstract
Harnessing genetic diversity in major staple crops through the development of new breeding capabilities is essential to ensure food security. Here we examined the genetic and phenotypic diversity of the A. E. Watkins landrace collection of bread wheat (Triticum aestivum), a major global cereal, by whole-genome re-sequencing of 827 Watkins landraces and 208 modern cultivars and in-depth field evaluation spanning a decade. We found that modern cultivars are derived from two of the seven ancestral groups of wheat and maintain very long-range haplotype integrity. The remaining five groups represent untapped genetic sources, providing access to landrace-specific alleles and haplotypes for breeding. Linkage disequilibrium-based haplotypes and association genetics analyses link Watkins genomes to the thousands of identified high-resolution quantitative trait loci and significant marker-trait associations. Using these structured germplasm, genotyping and informatics resources, we revealed many Watkins-unique beneficial haplotypes that can confer superior traits in modern wheat. Furthermore, we assessed the phenotypic effects of 44,338 Watkins-unique haplotypes, introgressed from 143 prioritized quantitative trait loci in the context of modern cultivars, bridging the gap between landrace diversity and current breeding. This study establishes a framework for systematically utilizing genetic diversity in crop improvement to achieve sustainable food security.
Affiliation(s)
- Shifeng Cheng
- Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China.
- Cong Feng
- Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China
- Hong Cheng
- Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China
- Mei Jiang
- Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China
- Zejian Huang
- Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China
- Xiaoming Wang
- John Innes Centre, Norwich, UK
- State Key Laboratory of Crop Stress Biology for Arid Areas, College of Agronomy, Northwest A&F University, Yangling, China
- Gary Barker
- Functional Genomics, School of Biological Sciences, University of Bristol, Bristol, UK
- Urmil Bansal
- School of Life and Environmental Sciences, Faculty of Science, The University of Sydney Plant Breeding Institute, Cobbitty, New South Wales, Australia
- Harbans S Bariana
- School of Life and Environmental Sciences, Faculty of Science, The University of Sydney Plant Breeding Institute, Cobbitty, New South Wales, Australia
- Western Sydney University, Richmond, New South Wales, Australia
- Malcolm J Bennett
- School of Biosciences, University of Nottingham, Sutton Bonington, UK
- Breno Bicego
- Department of Agricultural and Forest Sciences and Engineering, University of Lleida-AGROTECNIO-CERCA Center, Lleida, Spain
- Amanda Burridge
- Functional Genomics, School of Biological Sciences, University of Bristol, Bristol, UK
- Baizhi Chen
- Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China
- Worku Denbel
- Debre Zeit Agricultural Research Center, Ethiopian Institute of Agricultural Research, Debre Zeit, Ethiopia
- Ahmed F Elkot
- Wheat Research Department, Field Crops Research Institute, Agricultural Research Center, Giza, Egypt
- John Foulkes
- School of Biosciences, University of Nottingham, Sutton Bonington, UK
- Oorbessy Gaju
- School of Biosciences, University of Nottingham, Sutton Bonington, UK
- Adam Gauley
- School of Biology, University of Leeds, Leeds, UK
- Agri-Food and Biosciences Institute, Belfast, UK
- Ruirui Han
- Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China
- Qingdao Agricultural University, Qingdao, China
- Junliang Hou
- Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China
- Muhammad S Iqbal
- Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China
- Ankica Kondic-Spica
- Institute of Field and Vegetable Crops, National Institute of the Republic of Serbia, Novi Sad, Republic of Serbia
- Xiaolong Li
- Key Laboratory of Quality and Safety Control for Subtropical Fruit and Vegetable, Ministry of Agriculture and Rural Affairs, College of Horticulture Science, Zhejiang A&F University, Hangzhou, China
- Hongbing Liu
- Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China
- Shiyan Liu
- Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China
- Lingling Ma
- Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China
- Zareen Sarfraz
- Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China
- Yan Shi
- Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China
- Gustavo A Slafer
- Department of Agricultural and Forest Sciences and Engineering, University of Lleida-AGROTECNIO-CERCA Center, Lleida, Spain
- ICREA, Catalonian Institution for Research and Advanced Studies, Barcelona, Spain
- Baoxing Song
- National Key Laboratory of Wheat Improvement, Peking University Institute of Advanced Agricultural Sciences, Shandong Laboratory of Advanced Agriculture Sciences in Weifang, Weifang, China
- Bo Song
- Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China
- Abdul Waheed
- Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China
- Xingwei Wang
- Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China
- Yanping Wei
- Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China
- Mark Winfield
- Functional Genomics, School of Biological Sciences, University of Bristol, Bristol, UK
- Shishi Wu
- Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China
- Yubing Wu
- Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China
- Huazhong Agricultural University, Wuhan, China
- Brande B H Wulff
- John Innes Centre, Norwich, UK
- Center for Desert Agriculture, Plant Science Program, Biological and Environmental Science and Engineering Division (BESE), King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
- Wenfei Xian
- Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China
- Department of Molecular Biology, Max Planck Institute for Biology Tübingen, Tübingen, Germany
- Yawen Xu
- Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China
- Huazhong Agricultural University, Wuhan, China
- Yunfeng Xu
- Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China
- Quan Yuan
- Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China
- Xin Zhang
- Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China
- Huazhong Agricultural University, Wuhan, China
- Keith J Edwards
- Functional Genomics, School of Biological Sciences, University of Bristol, Bristol, UK
- Laura Dixon
- School of Biology, University of Leeds, Leeds, UK
- Sanwen Huang
- Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China
- State Key Laboratory of Tropical Crop Breeding, Chinese Academy of Tropical Agricultural Sciences, Haikou, China
16
Zhang Z, Liu D, Li B, Wang W, Zhang J, Xin M, Hu Z, Liu J, Du J, Peng H, Hao C, Zhang X, Ni Z, Sun Q, Guo W, Yao Y. A k-mer-based pangenome approach for cataloging seed-storage-protein genes in wheat to facilitate genotype-to-phenotype prediction and improvement of end-use quality. Molecular Plant 2024; 17:1038-1053. [PMID: 38796709 DOI: 10.1016/j.molp.2024.05.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/07/2024] [Revised: 05/06/2024] [Accepted: 05/23/2024] [Indexed: 05/28/2024]
Abstract
Wheat is a staple food for more than 35% of the world's population, with wheat flour used to make hundreds of baked goods. Superior end-use quality is a major breeding target; however, improving it is especially time-consuming and expensive. Furthermore, genes encoding seed-storage proteins (SSPs) form multi-gene families and are repetitive, with gaps commonplace in several genome assemblies. To overcome these barriers and efficiently identify superior wheat SSP alleles, we developed "PanSK" (Pan-SSP k-mer) for genotype-to-phenotype prediction based on an SSP-based pangenome resource. PanSK uses 29-mer sequences that represent each SSP gene at the pangenomic level to reveal untapped diversity across landraces and modern cultivars. Genome-wide association studies with k-mers identified 23 SSP genes associated with end-use quality that represent novel targets for improvement. We evaluated the effect of rye secalin genes on end-use quality and found that removal of ω-secalins from 1BL/1RS wheat translocation lines is associated with enhanced end-use quality. Finally, using machine-learning-based prediction inspired by PanSK, we predicted the quality phenotypes with high accuracy from genotypes alone. This study provides an effective approach for genome design based on SSP genes, enabling the breeding of wheat varieties with superior processing capabilities and improved end-use quality.
Affiliation(s)
- Zhaoheng Zhang
- Frontiers Science Center for Molecular Design Breeding, Key Laboratory of Crop Heterosis and Utilization (MOE), and Beijing Key Laboratory of Crop Genetic Improvement, China Agricultural University, Beijing 100193, China
- Dan Liu
- Frontiers Science Center for Molecular Design Breeding, Key Laboratory of Crop Heterosis and Utilization (MOE), and Beijing Key Laboratory of Crop Genetic Improvement, China Agricultural University, Beijing 100193, China
- Binyong Li
- Frontiers Science Center for Molecular Design Breeding, Key Laboratory of Crop Heterosis and Utilization (MOE), and Beijing Key Laboratory of Crop Genetic Improvement, China Agricultural University, Beijing 100193, China
- Wenxi Wang
- Frontiers Science Center for Molecular Design Breeding, Key Laboratory of Crop Heterosis and Utilization (MOE), and Beijing Key Laboratory of Crop Genetic Improvement, China Agricultural University, Beijing 100193, China
- Jize Zhang
- Frontiers Science Center for Molecular Design Breeding, Key Laboratory of Crop Heterosis and Utilization (MOE), and Beijing Key Laboratory of Crop Genetic Improvement, China Agricultural University, Beijing 100193, China
- Mingming Xin
- Frontiers Science Center for Molecular Design Breeding, Key Laboratory of Crop Heterosis and Utilization (MOE), and Beijing Key Laboratory of Crop Genetic Improvement, China Agricultural University, Beijing 100193, China
- Zhaorong Hu
- Frontiers Science Center for Molecular Design Breeding, Key Laboratory of Crop Heterosis and Utilization (MOE), and Beijing Key Laboratory of Crop Genetic Improvement, China Agricultural University, Beijing 100193, China
- Jie Liu
- Frontiers Science Center for Molecular Design Breeding, Key Laboratory of Crop Heterosis and Utilization (MOE), and Beijing Key Laboratory of Crop Genetic Improvement, China Agricultural University, Beijing 100193, China
- Jinkun Du
- Frontiers Science Center for Molecular Design Breeding, Key Laboratory of Crop Heterosis and Utilization (MOE), and Beijing Key Laboratory of Crop Genetic Improvement, China Agricultural University, Beijing 100193, China
- Huiru Peng
- Frontiers Science Center for Molecular Design Breeding, Key Laboratory of Crop Heterosis and Utilization (MOE), and Beijing Key Laboratory of Crop Genetic Improvement, China Agricultural University, Beijing 100193, China
- Chenyang Hao
- Key Laboratory of Crop Gene Resources and Breeding, Institute of Crop Science, Chinese Academy of Agricultural Sciences, Beijing, China
- Xueyong Zhang
- Key Laboratory of Crop Gene Resources and Breeding, Institute of Crop Science, Chinese Academy of Agricultural Sciences, Beijing, China
- Zhongfu Ni
- Frontiers Science Center for Molecular Design Breeding, Key Laboratory of Crop Heterosis and Utilization (MOE), and Beijing Key Laboratory of Crop Genetic Improvement, China Agricultural University, Beijing 100193, China
- Qixin Sun
- Frontiers Science Center for Molecular Design Breeding, Key Laboratory of Crop Heterosis and Utilization (MOE), and Beijing Key Laboratory of Crop Genetic Improvement, China Agricultural University, Beijing 100193, China
- Weilong Guo
- Frontiers Science Center for Molecular Design Breeding, Key Laboratory of Crop Heterosis and Utilization (MOE), and Beijing Key Laboratory of Crop Genetic Improvement, China Agricultural University, Beijing 100193, China.
- Yingyin Yao
- Frontiers Science Center for Molecular Design Breeding, Key Laboratory of Crop Heterosis and Utilization (MOE), and Beijing Key Laboratory of Crop Genetic Improvement, China Agricultural University, Beijing 100193, China.
17
Shi G, Dai Y, Zhou D, Chen M, Zhang J, Bi Y, Liu S, Wu Q. An alignment- and reference-free strategy using k-mer present pattern for population genomic analyses. Mycology 2024; 16:309-323. [PMID: 40083414 PMCID: PMC11899203 DOI: 10.1080/21501203.2024.2358868] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 12/17/2023] [Accepted: 05/17/2024] [Indexed: 03/16/2025] Open
Abstract
Pangenomes are replacing single reference genomes to capture all variants within a species or clade, but their analysis predominantly leverages graph-based methods that require multiple high-quality genomes and computationally intensive multiple-genome alignments. K-mer decomposition is an alternative to graph-based pangenomes. However, how to directly use k-mers for the population genetic analyses is unknown. Here, we developed a novel strategy that uses the variants of k-mer count in the genome for population analyses. To test the effectivity of this method, we compared it directly to the SNP-based method on the analysis of population structure and genetic diversity of 267 Saccharomyces cerevisiae strains within two simulated datasets and a real sequence dataset. The population structure identified with k-mers recapitulates that obtained using SNPs, indicating the effectiveness of k-mer-based approach, and higher genetic diversity within real dataset supported k-mers contained more genetic variants. Based on k-mer frequency, we found not only SNP but also some insertion/deletion and horizontal gene transfer (HGT) fragments related to the adaptive evolution of S. cerevisiae. Our study creates a framework for the alignment- and reference-free (ARF) method in population genetic analyses, which will be more pronounced in the species with no complete genome or highly diverged species.
Affiliation(s)
- Guohui Shi
- State Key Laboratory of Mycology, Institute of Microbiology, Chinese Academy of Sciences, Beijing, China
- Yi Dai
- State Key Laboratory of Mycology, Institute of Microbiology, Chinese Academy of Sciences, Beijing, China
- College of Life Science, University of the Chinese Academy of Sciences, Beijing, China
- Da Zhou
- School of Mathematical Sciences, Xiamen University, Xiamen, China
- Mengmeng Chen
- State Key Laboratory of Mycology, Institute of Microbiology, Chinese Academy of Sciences, Beijing, China
- College of Life Science, University of the Chinese Academy of Sciences, Beijing, China
- Jiaqi Zhang
- State Key Laboratory of Mycology, Institute of Microbiology, Chinese Academy of Sciences, Beijing, China
- College of Life Science, University of the Chinese Academy of Sciences, Beijing, China
- Yilong Bi
- School of Mathematical Sciences, Xiamen University, Xiamen, China
- Shuai Liu
- College of Life Science, University of the Chinese Academy of Sciences, Beijing, China
- Qi Wu
- State Key Laboratory of Mycology, Institute of Microbiology, Chinese Academy of Sciences, Beijing, China
18
Schmid MW, Moradi A, Leigh DM, Schuman MC, van Moorsel SJ. Covering the bases: Population genomic structure of Lemna minor and the cryptic species L. japonica in Switzerland. Ecol Evol 2024; 14:e11599. [PMID: 38882534 PMCID: PMC11178436 DOI: 10.1002/ece3.11599] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/13/2024] [Revised: 05/27/2024] [Accepted: 06/03/2024] [Indexed: 06/18/2024] Open
Abstract
Duckweeds, including the common duckweed Lemna minor, are increasingly used to test eco-evolutionary theories. Yet, despite its popularity and near-global distribution, the understanding of its population structure (and genetic variation therein) is still limited. It is essential that this is resolved, because of the impact genetic diversity has on experimental responses and scientific understanding. Through whole-genome sequencing, we assessed the genetic diversity and population genomic structure of 23 natural Lemna spp. populations from their natural range in Switzerland. We used two distinct analytical approaches, a reference-free kmer approach and the classical reference-based one. Two genetic clusters were identified across the described species distribution of L. minor, surprisingly corresponding to species-level divisions. The first cluster contained the targeted L. minor individuals and the second contained individuals from a cryptic species: Lemna japonica. Within the L. minor cluster, we identified a well-defined population structure with little intra-population genetic diversity (i.e., within ponds) but high inter-population diversity (i.e., between ponds). In L. japonica, the population structure was significantly weaker and genetic variation between a subset of populations was as low as within populations. This study revealed that L. japonica is more widespread than previously thought. Our findings signify that thorough genotype-to-phenotype analyses are needed in duckweed experimental ecology and evolution.
Affiliation(s)
- Aboubakr Moradi
- Department of Geography, University of Zurich, Zurich, Switzerland
- Department of Chemistry, University of Zurich, Zurich, Switzerland
- Deborah M Leigh
- Swiss Federal Research Institute WSL, Birmensdorf, Switzerland
- Meredith C Schuman
- Department of Geography, University of Zurich, Zurich, Switzerland
- Department of Chemistry, University of Zurich, Zurich, Switzerland
19
Sonsungsan P, Nganga ML, Lieberman MC, Amundson KR, Stewart V, Plaimas K, Comai L, Henry IM. A k-mer-based bulked segregant analysis approach to map seed traits in unphased heterozygous potato genomes. G3 (BETHESDA, MD.) 2024; 14:jkae035. [PMID: 38366577 PMCID: PMC10989861 DOI: 10.1093/g3journal/jkae035] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/06/2023] [Revised: 02/06/2024] [Accepted: 02/09/2024] [Indexed: 02/18/2024]
Abstract
High-throughput sequencing-based methods for bulked segregant analysis (BSA) allow for the rapid identification of genetic markers associated with traits of interest. BSA studies have successfully identified qualitative (binary) and quantitative trait loci (QTLs) using QTL mapping. However, most require population structures that fit the models available and a reference genome. Instead, high-throughput short-read sequencing can be combined with BSA of k-mers (BSA-k-mer) to map traits that appear refractory to standard approaches. This method can be applied to any organism and is particularly useful for species with genomes diverged from the closest sequenced genome. It is also instrumental when dealing with highly heterozygous and potentially polyploid genomes without phased haplotype assemblies and for which a single haplotype can control a trait. Finally, it is flexible in terms of population structure. Here, we apply the BSA-k-mer method for the rapid identification of candidate regions related to seed spot and seed size in diploid potato. Using a mixture of F1 and F2 individuals from a cross between 2 highly heterozygous parents, candidate sequences were identified for each trait using the BSA-k-mer approach. Using parental reads, we were able to determine the parental origin of the loci. Finally, we mapped the identified k-mers to a closely related potato genome to validate the method and determine the genomic loci underlying these sequences. The location identified for the seed spot matches with previously identified loci associated with pigmentation in potato. The loci associated with seed size are novel. Both loci are relevant in future breeding toward true seeds in potato.
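As a rough illustration of contrasting k-mers between two bulks, the Python sketch below counts k-mers in the reads of each bulk and reports those that are well supported in one bulk and absent from the other. This is a simplified assumption of what such a contrast could look like, not the BSA-k-mer pipeline used in the paper: the function names `count_kmers` and `bulk_specific_kmers`, the `min_count` threshold, and the toy reads are all hypothetical.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every k-mer (substring of length k) across a list of reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def bulk_specific_kmers(bulk_a, bulk_b, k, min_count=2):
    """Return k-mers seen at least `min_count` times in bulk A and never in bulk B.

    `min_count` is an illustrative filter against sequencing errors; a real
    analysis would use a statistical test on k-mer frequencies instead.
    """
    counts_a = count_kmers(bulk_a, k)
    counts_b = count_kmers(bulk_b, k)
    # Counter returns 0 for k-mers it has never seen, so no KeyError here.
    return sorted(
        kmer for kmer, n in counts_a.items()
        if n >= min_count and counts_b[kmer] == 0
    )


# Toy reads: the two bulks differ only at the last base.
bulk_with_trait = ["ACGTG", "ACGTG"]
bulk_without_trait = ["ACGTA", "ACGTA"]
candidates = bulk_specific_kmers(bulk_with_trait, bulk_without_trait, k=4)
print(candidates)
```

Candidate k-mers found this way could then be placed on a related genome assembly to locate the underlying region, which is the step the abstract describes as mapping the identified k-mers to a closely related potato genome.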
Affiliation(s)
- Pajaree Sonsungsan
- Program in Bioinformatics and Computational Biology, Graduate School, Chulalongkorn University, Bangkok 10330, Thailand
- Mwaura Livingstone Nganga
- Department of Plant Biology and Genome Center, University of California, Davis, Davis, CA 95616, USA
- Meric C Lieberman
- Department of Plant Biology and Genome Center, University of California, Davis, Davis, CA 95616, USA
- Kirk R Amundson
- Department of Plant Biology and Genome Center, University of California, Davis, Davis, CA 95616, USA
- Victoria Stewart
- Department of Plant Biology and Genome Center, University of California, Davis, Davis, CA 95616, USA
- Kitiporn Plaimas
- Omics Science and Bioinformatics Center, Faculty of Science, Chulalongkorn University, Bangkok 10330, Thailand
- Advanced Virtual and Intelligent Computing (AVIC) Center, Department of Mathematics and Computer Science, Faculty of Science, Chulalongkorn University, Bangkok 10330, Thailand
- Luca Comai
- Department of Plant Biology and Genome Center, University of California, Davis, Davis, CA 95616, USA
- Isabelle M Henry
- Department of Plant Biology and Genome Center, University of California, Davis, Davis, CA 95616, USA
20
Willink B, Tunström K, Nilén S, Chikhi R, Lemane T, Takahashi M, Takahashi Y, Svensson EI, Wheat CW. The genomics and evolution of inter-sexual mimicry and female-limited polymorphisms in damselflies. Nat Ecol Evol 2024; 8:83-97. [PMID: 37932383 PMCID: PMC10781644 DOI: 10.1038/s41559-023-02243-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/14/2023] [Accepted: 10/04/2023] [Indexed: 11/08/2023]
Abstract
Sex-limited morphs can provide profound insights into the evolution and genomic architecture of complex phenotypes. Inter-sexual mimicry is one particular type of sex-limited polymorphism in which a novel morph resembles the opposite sex. While inter-sexual mimics are known in both sexes and a diverse range of animals, their evolutionary origin is poorly understood. Here, we investigated the genomic basis of female-limited morphs and male mimicry in the common bluetail damselfly. Differential gene expression between morphs has been documented in damselflies, but no causal locus has been previously identified. We found that male mimicry originated in an ancestrally sexually dimorphic lineage in association with multiple structural changes, probably driven by transposable element activity. These changes resulted in ~900 kb of novel genomic content that is partly shared by male mimics in a close relative, indicating that male mimicry is a trans-species polymorphism. More recently, a third morph originated following the translocation of part of the male-mimicry sequence into a genomic position ~3.5 mb apart. We provide evidence of balancing selection maintaining male mimicry, in line with previous field population studies. Our results underscore how structural variants affecting a handful of potentially regulatory genes and morph-specific genes can give rise to novel and complex phenotypic polymorphisms.
Affiliation(s)
- Beatriz Willink
- Department of Zoology, Stockholm University, Stockholm, Sweden.
- Department of Biological Sciences, National University of Singapore, Singapore, Singapore.
- Kalle Tunström
- Department of Zoology, Stockholm University, Stockholm, Sweden
- Sofie Nilén
- Department of Biology, Lund University, Lund, Sweden
- Rayan Chikhi
- Sequence Bioinformatics, Institut Pasteur, Université Paris Cité, Paris, France
- Téo Lemane
- University of Rennes, Inria, CNRS, IRISA, Rennes, France
- Michihiko Takahashi
- Graduate School of Life Sciences, Tohoku University, Sendai, Japan
- Graduate School of Agriculture, Kyoto University, Kyoto, Japan
- Yuma Takahashi
- Graduate School of Science, Chiba University, Chiba, Japan
21
Corut AK, Wallace JG. kGWASflow: a modular, flexible, and reproducible Snakemake workflow for k-mers-based GWAS. G3 (BETHESDA, MD.) 2023; 14:jkad246. [PMID: 37976215 PMCID: PMC10755180 DOI: 10.1093/g3journal/jkad246] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/07/2023] [Accepted: 10/15/2023] [Indexed: 11/19/2023]
Abstract
Genome-wide association studies (GWAS) have been widely used to identify genetic variation associated with complex traits. Despite its success and popularity, the traditional GWAS approach comes with a variety of limitations. For this reason, newer methods for GWAS have been developed, including the use of pan-genomes instead of a reference genome and the utilization of markers beyond single-nucleotide polymorphisms, such as structural variations and k-mers. The k-mers-based GWAS approach has especially gained attention from researchers in recent years. However, these new methodologies can be complicated and challenging to implement. Here, we present kGWASflow, a modular, user-friendly, and scalable workflow to perform GWAS using k-mers. We adopted an existing kmersGWAS method into an easier and more accessible workflow using management tools like Snakemake and Conda and eliminated the challenges caused by missing dependencies and version conflicts. kGWASflow increases the reproducibility of the kmersGWAS method by automating each step with Snakemake and using containerization tools like Docker. The workflow encompasses supplemental components such as quality control, read-trimming procedures, and generating summary statistics. kGWASflow also offers post-GWAS analysis options to identify the genomic location and context of trait-associated k-mers. kGWASflow can be applied to any organism and requires minimal programming skills. kGWASflow is freely available on GitHub (https://github.com/akcorut/kGWASflow) and Bioconda (https://anaconda.org/bioconda/kgwasflow).
Affiliation(s)
- Adnan Kivanc Corut
- Institute of Bioinformatics, University of Georgia, Athens, GA 30602, USA
- Jason G Wallace
- Institute of Bioinformatics, University of Georgia, Athens, GA 30602, USA
- Institute of Plant Breeding, Genetics, and Genomics, University of Georgia, Athens, GA 30602, USA
- Department of Crop and Soil Sciences, University of Georgia, Athens, GA 30602, USA
22
Lemay MA, de Ronne M, Bélanger R, Belzile F. k-mer-based GWAS enhances the discovery of causal variants and candidate genes in soybean. THE PLANT GENOME 2023; 16:e20374. [PMID: 37596724 DOI: 10.1002/tpg2.20374] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/04/2023] [Accepted: 07/19/2023] [Indexed: 08/20/2023]
Abstract
Genome-wide association studies (GWAS) are powerful statistical methods that detect associations between genotype and phenotype at genome scale. Despite their power, GWAS frequently fail to pinpoint the causal variant or the gene controlling a given trait in crop species. Assessing genetic variants other than single-nucleotide polymorphisms (SNPs) could alleviate this problem. In this study, we tested the potential of structural variant (SV)- and k-mer-based GWAS in soybean by applying these methods as well as conventional SNP/indel-based GWAS to 13 traits. We assessed the performance of each GWAS approach based on loci for which the causal genes or variants were known from previous genetic studies. We found that k-mer-based GWAS was the most versatile approach and the best at pinpointing causal variants or candidate genes. Moreover, k-mer-based analyses identified promising candidate genes for loci related to pod color, pubescence form, and resistance to Phytophthora sojae. In our dataset, SV-based GWAS did not add value compared to k-mer-based GWAS and may not be worth the time and computational resources invested. Despite promising results, significant challenges remain regarding the downstream analysis of k-mer-based GWAS. Notably, better methods are needed to associate significant k-mers with sequence variation. Our results suggest that coupling k-mer- and SNP/indel-based GWAS is a powerful approach for discovering candidate genes in crop species.
Affiliation(s)
- Marc-André Lemay
- Département de phytologie, Université Laval, Québec, QC, Canada
- Institut de biologie intégrative et des systèmes, Université Laval, Québec, QC, Canada
- Centre de recherche et d'innovation sur les végétaux, Université Laval, Québec, QC, Canada
- Maxime de Ronne
- Département de phytologie, Université Laval, Québec, QC, Canada
- Institut de biologie intégrative et des systèmes, Université Laval, Québec, QC, Canada
- Centre de recherche et d'innovation sur les végétaux, Université Laval, Québec, QC, Canada
- Richard Bélanger
- Département de phytologie, Université Laval, Québec, QC, Canada
- Institut de biologie intégrative et des systèmes, Université Laval, Québec, QC, Canada
- Centre de recherche et d'innovation sur les végétaux, Université Laval, Québec, QC, Canada
- François Belzile
- Département de phytologie, Université Laval, Québec, QC, Canada
- Institut de biologie intégrative et des systèmes, Université Laval, Québec, QC, Canada
- Centre de recherche et d'innovation sur les végétaux, Université Laval, Québec, QC, Canada
23
Dutta A, McDonald BA, Croll D. Combined reference-free and multi-reference based GWAS uncover cryptic variation underlying rapid adaptation in a fungal plant pathogen. PLoS Pathog 2023; 19:e1011801. [PMID: 37972199 PMCID: PMC10688896 DOI: 10.1371/journal.ppat.1011801] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/10/2023] [Revised: 11/30/2023] [Accepted: 11/06/2023] [Indexed: 11/19/2023] Open
Abstract
Microbial pathogens often harbor substantial functional diversity driven by structural genetic variation. Rapid adaptation from such standing variation threatens global food security and human health. Genome-wide association studies (GWAS) provide a powerful approach to identify genetic variants underlying recent pathogen adaptation. However, the reliance on single reference genomes and single nucleotide polymorphisms (SNPs) obscures the true extent of adaptive genetic variation. Here, we show quantitatively how a combination of multiple reference genomes and reference-free approaches captures substantially more relevant genetic variation compared to single reference mapping. We performed reference-genome based association mapping across 19 reference-quality genomes covering the diversity of the species. We contrasted the results with a reference-free (i.e., k-mer) approach using raw whole-genome sequencing data in a panel of 145 strains collected across the global distribution range of the fungal wheat pathogen Zymoseptoria tritici. We mapped the genetic architecture of 49 life history traits including virulence, reproduction and growth in multiple stressful environments. The inclusion of additional reference genome SNP datasets provides a nearly linear increase in additional loci mapped through GWAS. Variants detected through the k-mer approach explained a higher proportion of phenotypic variation than a reference genome-based approach and revealed functionally confirmed loci that classic GWAS approaches failed to map. The power of GWAS in microbial pathogens can be significantly enhanced by comprehensively capturing structural genetic variation. Our approach is generalizable to a large number of species and will uncover novel mechanisms driving rapid adaptation of pathogens.
Affiliation(s)
- Anik Dutta
- Plant Pathology, Institute of Integrative Biology, ETH Zurich, Zurich, Switzerland
- Bruce A. McDonald
- Plant Pathology, Institute of Integrative Biology, ETH Zurich, Zurich, Switzerland
- Daniel Croll
- Laboratory of Evolutionary Genetics, Institute of Biology, University of Neuchâtel, Neuchâtel, Switzerland
24
Li S, Kong L, Xiao X, Li P, Liu A, Li J, Gong J, Gong W, Ge Q, Shang H, Pan J, Chen H, Peng Y, Zhang Y, Lu Q, Shi Y, Yuan Y. Genome-wide artificial introgressions of Gossypium barbadense into G. hirsutum reveal superior loci for simultaneous improvement of cotton fiber quality and yield traits. J Adv Res 2023; 53:1-16. [PMID: 36460274 PMCID: PMC10658236 DOI: 10.1016/j.jare.2022.11.009] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 04/18/2022] [Revised: 10/31/2022] [Accepted: 11/24/2022] [Indexed: 12/02/2022] Open
Abstract
INTRODUCTION: The simultaneous improvement of fiber quality and yield for cotton is strongly limited by the narrow genetic backgrounds of Gossypium hirsutum (Gh) and the negative genetic correlations among traits. An effective way to overcome the bottlenecks is to introgress the favorable alleles of Gossypium barbadense (Gb) for fiber quality into Gh with high yield. OBJECTIVES: This study was to identify superior loci for the improvement of fiber quality and yield. METHODS: Two sets of chromosome segment substitution lines (CSSLs) were generated by crossing Hai1 (Gb, donor-parent) with cultivar CCRI36 (Gh) and CCRI45 (Gh) as genetic backgrounds, and cultivated in 6 and 8 environments, respectively. The kmer genotyping strategy was improved and applied to the population genetic analysis of 743 genomic sequencing data. A progeny segregating population was constructed to validate genetic effects of the candidate loci. RESULTS: A total of 68,912 and 83,352 genome-wide introgressed kmers were identified in the CCRI36 and CCRI45 populations, respectively. Over 90% introgressions were homologous exchanges and about 21% were reverse insertions. In total, 291 major introgressed segments were identified with stable genetic effects, of which 66 (22.98%), 64 (21.99%), 35 (12.03%), 31 (10.65%) and 18 (6.19%) were beneficial for the improvement of fiber length (FL), strength (FS), micronaire, lint-percentage (LP) and boll-weight, respectively. Thirty-nine introgression segments were detected with stable favorable additive effects for simultaneous improvement of 2 or more traits in Gh genetic background, including 6 could increase FL/FS and LP. The pyramiding effects of 3 pleiotropic segments (A07:C45Clu-081, D06:C45Clu-218, D02:C45Clu-193) were further validated in the segregating population. CONCLUSION: The combining of genome-wide introgressions and kmer genotyping strategy showed significant advantages in exploring genetic resources. Through the genome-wide comprehensive mining, a total of 11 clusters (segments) were discovered for the stable simultaneous improvement of FL/FS and LP, which should be paid more attention in the future.
Affiliation(s)
- Shaoqi Li
- State Key Laboratory of Cotton Biology, Institute of Cotton Research, Chinese Academy of Agricultural Sciences, Anyang 455000, China; Crop Information Center, College of Plant Science and Technology, Huazhong Agricultural University, Wuhan 430070, China
- Linglei Kong
- State Key Laboratory of Cotton Biology, Institute of Cotton Research, Chinese Academy of Agricultural Sciences, Anyang 455000, China
- Xianghui Xiao
- State Key Laboratory of Cotton Biology, Institute of Cotton Research, Chinese Academy of Agricultural Sciences, Anyang 455000, China
- Pengtao Li
- School of Biotechnology and Food Engineering, Anyang Institute of Technology, Anyang 455000, China
- Aiying Liu
- State Key Laboratory of Cotton Biology, Institute of Cotton Research, Chinese Academy of Agricultural Sciences, Anyang 455000, China
- Junwen Li
- State Key Laboratory of Cotton Biology, Institute of Cotton Research, Chinese Academy of Agricultural Sciences, Anyang 455000, China
- Juwu Gong
- State Key Laboratory of Cotton Biology, Institute of Cotton Research, Chinese Academy of Agricultural Sciences, Anyang 455000, China
- Wankui Gong
- State Key Laboratory of Cotton Biology, Institute of Cotton Research, Chinese Academy of Agricultural Sciences, Anyang 455000, China
- Qun Ge
- State Key Laboratory of Cotton Biology, Institute of Cotton Research, Chinese Academy of Agricultural Sciences, Anyang 455000, China
- Haihong Shang
- State Key Laboratory of Cotton Biology, Institute of Cotton Research, Chinese Academy of Agricultural Sciences, Anyang 455000, China
- Jingtao Pan
- State Key Laboratory of Cotton Biology, Institute of Cotton Research, Chinese Academy of Agricultural Sciences, Anyang 455000, China
- Hong Chen
- Cotton Research Institute, Xinjiang Academy of Agricultural and Reclamation Science, Shihezi 832000, China
- Yan Peng
- Third Division of the Xinjiang Production and Construction Corps Agricultural Research Institute, Tumushuke 843900, China
- Yuanming Zhang
- Crop Information Center, College of Plant Science and Technology, Huazhong Agricultural University, Wuhan 430070, China
- Quanwei Lu
- School of Biotechnology and Food Engineering, Anyang Institute of Technology, Anyang 455000, China.
- Yuzhen Shi
- State Key Laboratory of Cotton Biology, Institute of Cotton Research, Chinese Academy of Agricultural Sciences, Anyang 455000, China.
- Youlu Yuan
- State Key Laboratory of Cotton Biology, Institute of Cotton Research, Chinese Academy of Agricultural Sciences, Anyang 455000, China; Crop Information Center, College of Plant Science and Technology, Huazhong Agricultural University, Wuhan 430070, China.
25
Aylward AJ, Petrus S, Mamerto A, Hartwick NT, Michael TP. PanKmer: k-mer-based and reference-free pangenome analysis. Bioinformatics 2023; 39:btad621. [PMID: 37846049 PMCID: PMC10603592 DOI: 10.1093/bioinformatics/btad621] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/31/2023] [Revised: 08/29/2023] [Accepted: 10/13/2023] [Indexed: 10/18/2023] Open
Abstract
SUMMARY: Pangenomes are replacing single reference genomes as the definitive representation of DNA sequence within a species or clade. Pangenome analysis predominantly leverages graph-based methods that require computationally intensive multiple genome alignments, do not scale to highly complex eukaryotic genomes, limit their scope to identifying structural variants (SVs), or incur bias by relying on a reference genome. Here, we present PanKmer, a toolkit designed for reference-free analysis of pangenome datasets consisting of dozens to thousands of individual genomes. PanKmer decomposes a set of input genomes into a table of observed k-mers and their presence-absence values in each genome. These are stored in an efficient k-mer index data format that encodes SNPs, INDELs, and SVs. It also includes functions for downstream analysis of the k-mer index, such as calculating sequence similarity statistics between individuals at whole-genome or local scales. For example, k-mers can be "anchored" in any individual genome to quantify sequence variability or conservation at a specific locus. This facilitates workflows with various biological applications, e.g. identifying cases of hybridization between plant species. PanKmer provides researchers with a valuable and convenient means to explore the full scope of genetic variation in a population, without reference bias. AVAILABILITY AND IMPLEMENTATION: PanKmer is implemented as a Python package with components written in Rust, released under a BSD license. The source code is available from the Python Package Index (PyPI) at https://pypi.org/project/pankmer/ as well as Gitlab at https://gitlab.com/salk-tm/pankmer. Full documentation is available at https://salk-tm.gitlab.io/pankmer/.
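The decomposition of genomes into a k-mer presence-absence table can be illustrated with a minimal Python sketch. It does not use the PanKmer package or its index format; the helper names (`canonical`, `kmer_set`, `presence_absence`, `jaccard`), the use of canonical (strand-independent) k-mers, and the toy sequences are assumptions made only for illustration.

```python
def canonical(kmer):
    """Return the lexicographically smaller of a k-mer and its reverse complement."""
    complement = str.maketrans("ACGT", "TGCA")
    reverse_complement = kmer.translate(complement)[::-1]
    return min(kmer, reverse_complement)


def kmer_set(sequence, k):
    """Collect the canonical k-mers of one sequence, skipping windows with non-ACGT bases."""
    sequence = sequence.upper()
    kmers = set()
    for i in range(len(sequence) - k + 1):
        window = sequence[i:i + k]
        if set(window) <= set("ACGT"):
            kmers.add(canonical(window))
    return kmers


def presence_absence(genomes, k):
    """Build a table mapping each observed k-mer to a 0/1 tuple, one entry per genome."""
    sets = {name: kmer_set(seq, k) for name, seq in genomes.items()}
    names = sorted(sets)
    all_kmers = sorted(set().union(*sets.values()))
    table = {
        kmer: tuple(int(kmer in sets[name]) for name in names)
        for kmer in all_kmers
    }
    return names, table


def jaccard(table, i, j):
    """Jaccard similarity between genome columns i and j of the table."""
    shared = sum(1 for row in table.values() if row[i] and row[j])
    either = sum(1 for row in table.values() if row[i] or row[j])
    return shared / either if either else 0.0


# Toy "genomes" that differ by a single base.
genomes = {"g1": "ACGTAC", "g2": "ACGTTC"}
names, table = presence_absence(genomes, k=3)
for kmer, row in table.items():
    print(kmer, row)
print("Jaccard(g1, g2) =", jaccard(table, 0, 1))
```

Even in this toy case a single-base difference changes several k-mers at once, which is why a presence-absence table can capture SNPs, indels, and larger variants without aligning to a reference.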
Affiliation(s)
- Anthony J Aylward
- The Plant Molecular and Cellular Biology Laboratory, The Salk Institute for Biological Studies, La Jolla, CA 92037, United States
- Semar Petrus
- The Plant Molecular and Cellular Biology Laboratory, The Salk Institute for Biological Studies, La Jolla, CA 92037, United States
- Allen Mamerto
- The Plant Molecular and Cellular Biology Laboratory, The Salk Institute for Biological Studies, La Jolla, CA 92037, United States
- Nolan T Hartwick
- The Plant Molecular and Cellular Biology Laboratory, The Salk Institute for Biological Studies, La Jolla, CA 92037, United States
- Todd P Michael
- The Plant Molecular and Cellular Biology Laboratory, The Salk Institute for Biological Studies, La Jolla, CA 92037, United States
26
Li X, Tieman D, Alseekh S, Fernie AR, Klee HJ. Natural variations in the Sl-AKR9 aldo/keto reductase gene impact fruit flavor volatile and sugar contents. THE PLANT JOURNAL : FOR CELL AND MOLECULAR BIOLOGY 2023; 115:1134-1150. [PMID: 37243881 DOI: 10.1111/tpj.16310] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/18/2023] [Revised: 05/01/2023] [Accepted: 05/08/2023] [Indexed: 05/29/2023]
Abstract
The unique flavors of different fruits depend upon complex blends of soluble sugars, organic acids, and volatile organic compounds. 2-Phenylethanol and phenylacetaldehyde are major contributors to flavor in many foods, including tomato. In the tomato fruit, glucose, and fructose are the chemicals that most positively contribute to human flavor preferences. We identified a gene encoding a tomato aldo/keto reductase, Sl-AKR9, that is associated with phenylacetaldehyde and 2-phenylethanol contents in fruits. Two distinct haplotypes were identified; one encodes a chloroplast-targeted protein while the other encodes a transit peptide-less protein that accumulates in the cytoplasm. Sl-AKR9 effectively catalyzes reduction of phenylacetaldehyde to 2-phenylethanol. The enzyme can also metabolize sugar-derived reactive carbonyls, including glyceraldehyde and methylglyoxal. CRISPR-Cas9-induced loss-of-function mutations in Sl-AKR9 significantly increased phenylacetaldehyde and lowered 2-phenylethanol content in ripe fruit. Reduced fruit weight and increased soluble solids, glucose, and fructose contents were observed in the loss-of-function fruits. These results reveal a previously unidentified mechanism affecting two flavor-associated phenylalanine-derived volatile organic compounds, sugar content, and fruit weight. Modern varieties of tomato almost universally contain the haplotype associated with larger fruit, lower sugar content, and lower phenylacetaldehyde and 2-phenylethanol, likely leading to flavor deterioration in modern varieties.
Affiliation(s)
- Xiang Li
- Horticultural Sciences, Genetics Institute, University of Florida, Gainesville, Florida, 32611, USA
- Denise Tieman
- Horticultural Sciences, Genetics Institute, University of Florida, Gainesville, Florida, 32611, USA
- Saleh Alseekh
- Max-Planck-Institute of Molecular Plant Physiology, 14476, Potsdam-Golm, Germany
- Center of Plant Systems Biology and Biotechnology, Plovdiv, 4000, Bulgaria
- Alisdair R Fernie
- Max-Planck-Institute of Molecular Plant Physiology, 14476, Potsdam-Golm, Germany
- Center of Plant Systems Biology and Biotechnology, Plovdiv, 4000, Bulgaria
- Harry J Klee
- Horticultural Sciences, Genetics Institute, University of Florida, Gainesville, Florida, 32611, USA
27
Castanera R, Morales-Díaz N, Gupta S, Purugganan M, Casacuberta JM. Transposons are important contributors to gene expression variability under selection in rice populations. eLife 2023; 12:RP86324. [PMID: 37467142 PMCID: PMC10393045 DOI: 10.7554/elife.86324] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 07/21/2023] Open
Abstract
Transposable elements (TEs) are an important source of genome variability. Here, we analyze their contribution to gene expression variability in rice by performing a TE insertion polymorphism expression quantitative trait locus mapping using expression data from 208 varieties from the Oryza sativa ssp. indica and O. sativa ssp. japonica subspecies. Our data show that TE insertions are associated with changes of expression of many genes known to be targets of rice domestication and breeding. An important fraction of these insertions were already present in the rice wild ancestors, and have been differentially selected in indica and japonica rice populations. Taken together, our results show that small changes of expression in signal transduction genes induced by TE insertions accompany the domestication and adaptation of rice populations.
Affiliation(s)
- Raúl Castanera
- Centre for Research in Agricultural Genomics, CRAG (CSIC-IRTA-UAB-UB), Campus UAB, Cerdanyola del Vallès, Barcelona, Spain
| | - Noemia Morales-Díaz
- Centre for Research in Agricultural Genomics, CRAG (CSIC-IRTA-UAB-UB), Campus UAB, Cerdanyola del Vallès, Barcelona, Spain
| | - Sonal Gupta
- Center for Genomics and Systems Biology, New York University, New York, United States
| | - Michael Purugganan
- Center for Genomics and Systems Biology, New York University, New York, United States
- Center for Genomics and Systems Biology, New York University Abu Dhabi, Saadiyat Island, Abu Dhabi, United Arab Emirates
| | - Josep M Casacuberta
- Centre for Research in Agricultural Genomics, CRAG (CSIC-IRTA-UAB-UB), Campus UAB, Cerdanyola del Vallès, Barcelona, Spain
| |
|
28
|
Karikari B, Lemay MA, Belzile F. k-mer-Based Genome-Wide Association Studies in Plants: Advances, Challenges, and Perspectives. Genes (Basel) 2023; 14:1439. [PMID: 37510343 PMCID: PMC10379394 DOI: 10.3390/genes14071439] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/13/2023] [Revised: 07/04/2023] [Accepted: 07/07/2023] [Indexed: 07/30/2023] Open
Abstract
Genome-wide association studies (GWAS) have allowed the discovery of marker-trait associations in crops over recent decades. However, their power is hampered by a number of limitations, with the key one among them being an overreliance on single-nucleotide polymorphisms (SNPs) as molecular markers. Indeed, SNPs represent only one type of genetic variation and are usually derived from alignment to a single genome assembly that may be poorly representative of the population under study. To overcome this, k-mer-based GWAS approaches have recently been developed. k-mer-based GWAS provide a universal way to assess variation due to SNPs, insertions/deletions, and structural variations without having to specifically detect and genotype these variants. In addition, k-mer-based analyses can be used in species that lack a reference genome. However, the use of k-mers for GWAS presents challenges such as data size and complexity, lack of standard tools, and potential detection of false associations. Nevertheless, efforts are being made to overcome these challenges and a general analysis workflow has started to emerge. We identify the priorities for k-mer-based GWAS in years to come, notably in the development of user-friendly programs for their analysis and approaches for linking significant k-mers to sequence variation.
Affiliation(s)
- Benjamin Karikari
- Département de Phytologie, Université Laval, Quebec City, QC G1V 0A6, Canada
- Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval, Quebec City, QC G1V 0A6, Canada
- Department of Agricultural Biotechnology, Faculty of Agriculture, Food and Consumer Sciences, University for Development Studies, Tamale P.O. Box TL 1882, Ghana
| | - Marc-André Lemay
- Département de Phytologie, Université Laval, Quebec City, QC G1V 0A6, Canada
- Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval, Quebec City, QC G1V 0A6, Canada
| | - François Belzile
- Département de Phytologie, Université Laval, Quebec City, QC G1V 0A6, Canada
- Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval, Quebec City, QC G1V 0A6, Canada
| |
|
29
|
Li G, Jiang D, Wang J, Liao Y, Zhang T, Zhang H, Dai X, Ren H, Chen C, Zheng Y. A High-Continuity Genome Assembly of Chinese Flowering Cabbage (Brassica rapa var. parachinensis) Provides New Insights into Brassica Genome Structure Evolution. Plants (Basel, Switzerland) 2023; 12:2498. [PMID: 37447059 DOI: 10.3390/plants12132498] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/21/2023] [Revised: 06/19/2023] [Accepted: 06/27/2023] [Indexed: 07/15/2023]
Abstract
Chinese flowering cabbage (Brassica rapa var. parachinensis) is a popular and widely cultivated leaf vegetable crop in Asia. Here, we performed a high quality de novo assembly of the 384 Mb genome of 10 chromosomes of a typical cultivar of Chinese flowering cabbage with an integrated approach using PacBio, Illumina, and Hi-C technology. We modeled 47,598 protein-coding genes in this analysis and annotated 52% (205.9/384) of its genome as repetitive sequences including 17% in DNA transposons and 22% in long terminal retrotransposons (LTRs). Phylogenetic analysis reveals the genome of the Chinese flowering cabbage has a closer evolutionary relationship with the AA diploid progenitor of the allotetraploid species, Brassica juncea. Comparative genomic analysis of Brassica species with different subgenome types (A, B and C) reveals that the pericentromeric regions on chromosome 5 and 6 of the AA genome have been significantly expanded compared to the orthologous genomic regions in the BB and CC genomes, largely driven by LTR-retrotransposon amplification. Furthermore, we identified a large number of structural variations (SVs) within the B. rapa lines that could impact coding genes, suggesting the functional significance of SVs on Brassica genome evolution. Overall, our high-quality genome assembly of the Chinese flowering cabbage provides a valuable genetic resource for deciphering the genome evolution of Brassica species and it can potentially serve as the reference genome guiding the molecular breeding practice of B. rapa crops.
Affiliation(s)
- Guangguang Li
- Guangzhou Academy of Agricultural Sciences, Guangzhou 510335, China
| | - Ding Jiang
- Guangzhou Academy of Agricultural Sciences, Guangzhou 510335, China
| | - Juntao Wang
- College of Horticulture, South China Agricultural University, Guangzhou 510642, China
| | - Yi Liao
- College of Horticulture, South China Agricultural University, Guangzhou 510642, China
| | - Ting Zhang
- College of Horticulture, South China Agricultural University, Guangzhou 510642, China
| | - Hua Zhang
- Guangzhou Academy of Agricultural Sciences, Guangzhou 510335, China
| | - Xiuchun Dai
- Guangzhou Academy of Agricultural Sciences, Guangzhou 510335, China
| | - Hailong Ren
- Guangzhou Academy of Agricultural Sciences, Guangzhou 510335, China
| | - Changming Chen
- College of Horticulture, South China Agricultural University, Guangzhou 510642, China
| | - Yansong Zheng
- Guangzhou Academy of Agricultural Sciences, Guangzhou 510335, China
| |
|
30
|
Morales-Cruz A, Aguirre-Liguori J, Massonnet M, Minio A, Zaccheo M, Cochetel N, Walker A, Riaz S, Zhou Y, Cantu D, Gaut BS. Multigenic resistance to Xylella fastidiosa in wild grapes (Vitis sps.) and its implications within a changing climate. Commun Biol 2023; 6:580. [PMID: 37253933 DOI: 10.1038/s42003-023-04938-4] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 11/23/2022] [Accepted: 05/12/2023] [Indexed: 06/01/2023] Open
Abstract
Xylella fastidiosa is a bacterium that infects crops like grapevines, coffee, almonds, citrus and olives. There is little understanding of the genes that contribute to plant resistance, the genomic architecture of resistance, and the potential role of climate in shaping resistance, in part because major crops like grapevines (Vitis vinifera) are not resistant to the bacterium. Here we study a wild grapevine species, V. arizonica, that segregates for resistance. Using genome-wide association, we identify candidate resistance genes. Resistance-associated k-mers are shared with a sister species of V. arizonica but not with more distant species, suggesting that resistance evolved more than once. Finally, resistance is climate dependent, because individuals from low (< 10 °C) temperature locations in the wettest quarter were typically susceptible to infection, likely reflecting a lack of pathogen pressure in colder climates. In fact, climate is as effective a predictor of resistance phenotypes as some genetic markers. We extend our climate observations to additional crops, predicting that increased pathogen pressure is more likely for grapevines and almonds than some other susceptible crops.
Affiliation(s)
- Abraham Morales-Cruz
- U.S. Department of Energy, Joint Genome Institute, Lawrence Berkeley National Lab, Berkeley, CA, 94720, USA
| | - Jonas Aguirre-Liguori
- Dept. of Ecology and Evolutionary Biology, University of California, Irvine, CA, USA
| | - Mélanie Massonnet
- Dept. of Viticulture and Enology, University of California, Davis, CA, USA
| | - Andrea Minio
- Dept. of Viticulture and Enology, University of California, Davis, CA, USA
| | - Mirella Zaccheo
- Dept. of Viticulture and Enology, University of California, Davis, CA, USA
| | - Noe Cochetel
- Dept. of Viticulture and Enology, University of California, Davis, CA, USA
| | - Andrew Walker
- Dept. of Viticulture and Enology, University of California, Davis, CA, USA
| | - Summaira Riaz
- San Joaquin Valley Agricultural Center, United States Dept of Agriculture, Parlier, CA, USA
| | - Yongfeng Zhou
- Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China.
- Agricultural Genomics Institute at Shenzhen, The Chinese Academy of Agricultural Sciences, No. 7 Pengfei Road, Shenzhen, 518120, China.
| | - Dario Cantu
- Dept. of Viticulture and Enology, University of California, Davis, CA, USA.
- Dept. of Viticulture and Enology, One Shields Avenue, University of California Davis, Davis, CA, 95616-5270, USA.
| | - Brandon S Gaut
- Dept. of Ecology and Evolutionary Biology, University of California, Irvine, CA, USA.
- Dept. of Ecology and Evolutionary Biology, 321 Steinhaus Hall UC Irvine, Irvine, CA, 92617-2525, USA.
| |
|
31
|
Chen MM, Shi GH, Dai Y, Fang WX, Wu Q. Identifying genetic variants associated with amphotericin B (AMB) resistance in Aspergillus fumigatus via k-mer-based GWAS. Front Genet 2023; 14:1133593. [PMID: 37229189 PMCID: PMC10203564 DOI: 10.3389/fgene.2023.1133593] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 01/04/2023] [Accepted: 04/10/2023] [Indexed: 05/27/2023] Open
Abstract
Aspergillus fumigatus is one of the most common pathogenic fungi, which results in high morbidity and mortality in immunocompromised patients. Amphotericin B (AMB) is used as the core drug for the treatment of triazole-resistant A. fumigatus. Following the usage of amphotericin B drugs, the number of amphotericin B-resistant A. fumigatus isolates showed an increasing trend over the years, but the mechanism and mutations associated with amphotericin B sensitivity are not fully understood. In this study, we performed a k-mer-based genome-wide association study (GWAS) in 98 A. fumigatus isolates from public databases. Associations identified with k-mers not only recapitulate those with SNPs but also reveal new associations with insertions/deletions (indels). Compared to SNP sites, the indels showed a stronger association with amphotericin B resistance, and a significantly correlated indel is present in the exon region of AFUA_7G05160, encoding a fumarylacetoacetate hydrolase (FAH) family protein. Enrichment analysis revealed that sphingolipid synthesis and transmembrane transport may be related to the resistance of A. fumigatus to amphotericin B. The expansion of variant types detected by the k-mer method increases opportunities to identify and exploit complex genetic variants that drive amphotericin B resistance, and these candidate variants help accelerate the selection of prospective gene markers for amphotericin B resistance screening in A. fumigatus.
Affiliation(s)
- Meng-Meng Chen
- State Key Laboratory of Mycology, Institute of Microbiology, Chinese Academy of Sciences, Beijing, China
- University of Chinese Academy of Sciences, Beijing, China
| | - Guo-Hui Shi
- State Key Laboratory of Mycology, Institute of Microbiology, Chinese Academy of Sciences, Beijing, China
| | - Yi Dai
- State Key Laboratory of Mycology, Institute of Microbiology, Chinese Academy of Sciences, Beijing, China
- University of Chinese Academy of Sciences, Beijing, China
| | - Wen-Xia Fang
- Guangxi Biological Sciences and Biotechnology Center, Guangxi Academy of Sciences, Nanning, Guangxi, China
| | - Qi Wu
- State Key Laboratory of Mycology, Institute of Microbiology, Chinese Academy of Sciences, Beijing, China
- University of Chinese Academy of Sciences, Beijing, China
| |
|
32
|
Zhu F, Wen W, Cheng Y, Alseekh S, Fernie AR. Integrating multiomics data accelerates elucidation of plant primary and secondary metabolic pathways. aBIOTECH 2023; 4:47-56. [PMID: 37220537 PMCID: PMC10199974 DOI: 10.1007/s42994-022-00091-4] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 10/11/2022] [Accepted: 12/24/2022] [Indexed: 05/25/2023]
Abstract
Plants are the most important sources of food for humans, as well as supplying many ingredients that are of great importance for human health. Developing an understanding of the functional components of plant metabolism has attracted considerable attention. The rapid development of liquid chromatography and gas chromatography, coupled with mass spectrometry, has allowed the detection and characterization of many thousands of metabolites of plant origin. Nowadays, elucidating the detailed biosynthesis and degradation pathways of these metabolites represents a major bottleneck in our understanding. Recently, the decreased cost of genome and transcriptome sequencing has made it possible to identify the genes involved in metabolic pathways. Here, we review recent research that integrates metabolomics with other omics methods to comprehensively identify structural and regulatory genes of the primary and secondary metabolic pathways. Finally, we discuss other novel methods that can accelerate the process of identification of metabolic pathways and, ultimately, identify metabolite function(s).
Affiliation(s)
- Feng Zhu
- National R&D Center for Citrus Preservation, Hubei Hongshan Laboratory, National Key Laboratory for Germplasm Innovation and Utilization for Fruit and Vegetable Horticultural Crops, Key Laboratory of Horticultural Plant Biology, Ministry of Education, Huazhong Agricultural University, Wuhan, 430070 China
- Max Planck Institute of Molecular Plant Physiology, Am Mühlenberg 1, Potsdam-Golm, 14476 Germany
| | - Weiwei Wen
- National R&D Center for Citrus Preservation, Hubei Hongshan Laboratory, National Key Laboratory for Germplasm Innovation and Utilization for Fruit and Vegetable Horticultural Crops, Key Laboratory of Horticultural Plant Biology, Ministry of Education, Huazhong Agricultural University, Wuhan, 430070 China
| | - Yunjiang Cheng
- National R&D Center for Citrus Preservation, Hubei Hongshan Laboratory, National Key Laboratory for Germplasm Innovation and Utilization for Fruit and Vegetable Horticultural Crops, Key Laboratory of Horticultural Plant Biology, Ministry of Education, Huazhong Agricultural University, Wuhan, 430070 China
| | - Saleh Alseekh
- Max Planck Institute of Molecular Plant Physiology, Am Mühlenberg 1, Potsdam-Golm, 14476 Germany
- Center of Plant Systems Biology and Biotechnology, Plovdiv, 4000 Bulgaria
| | - Alisdair R. Fernie
- Max Planck Institute of Molecular Plant Physiology, Am Mühlenberg 1, Potsdam-Golm, 14476 Germany
- Center of Plant Systems Biology and Biotechnology, Plovdiv, 4000 Bulgaria
| |
|
33
|
Guo T, Li X. Machine learning for predicting phenotype from genotype and environment. Curr Opin Biotechnol 2023; 79:102853. [PMID: 36463837 DOI: 10.1016/j.copbio.2022.102853] [Citation(s) in RCA: 21] [Impact Index Per Article: 10.5] [Received: 10/03/2022] [Revised: 11/01/2022] [Accepted: 11/07/2022] [Indexed: 12/03/2022]
Abstract
Predicting phenotype with genomic and environmental information is critically needed and challenging. Machine learning methods have emerged as powerful tools to make accurate predictions from large and complex biological data. Here, we review the progress of phenotype prediction models enabled or improved by machine learning methods. We categorized the applications into three scenarios: prediction with genotypic information, with environmental information, and with both. In each scenario, we illustrate the practicality of prediction models, the advantages of machine learning, and the challenges of modeling complex relationships. We discuss the promising potential of leveraging machine learning and genetics theories to develop models that can predict phenotype and also interpret the biological consequences of changes in genotype and environment.
Affiliation(s)
- Tingting Guo
- National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan 430070, China; Hubei Hongshan Laboratory, Wuhan 430070, China.
| | - Xianran Li
- USDA, Agricultural Research Service, Wheat Health, Genetics, and Quality Research Unit, Pullman, WA 99164, USA; Department of Crop and Soil Sciences, Washington State University, Pullman, WA 99164, USA.
| |
|
34
|
Wei H, Li X. Deep mutational scanning: A versatile tool in systematically mapping genotypes to phenotypes. Front Genet 2023; 14:1087267. [PMID: 36713072 PMCID: PMC9878224 DOI: 10.3389/fgene.2023.1087267] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 11/02/2022] [Accepted: 01/02/2023] [Indexed: 01/13/2023] Open
Abstract
Unveiling how genetic variations lead to phenotypic variations is one of the key questions in evolutionary biology, genetics, and biomedical research. Deep mutational scanning (DMS) technology has allowed the mapping of tens of thousands of genetic variations to phenotypic variations efficiently and economically. Since its first systematic introduction about a decade ago, we have witnessed the use of deep mutational scanning in many research areas leading to scientific breakthroughs. Also, the methods in each step of deep mutational scanning have become much more versatile thanks to the oligo-synthesizing technology, high-throughput phenotyping methods and deep sequencing technology. However, each specific possible step of deep mutational scanning has its pros and cons, and some limitations still await further technological development. Here, we discuss recent scientific accomplishments achieved through the deep mutational scanning and describe widely used methods in each step of deep mutational scanning. We also compare these different methods and analyze their advantages and disadvantages, providing insight into how to design a deep mutational scanning study that best suits the aims of the readers' projects.
Affiliation(s)
- Huijin Wei
- Zhejiang University—University of Edinburgh Institute, Zhejiang University, Haining, Zhejiang, China
| | - Xianghua Li
- Zhejiang University—University of Edinburgh Institute, Zhejiang University, Haining, Zhejiang, China
- Deanery of Biomedical Sciences, University of Edinburgh, Edinburgh, United Kingdom
- The Second Affiliated Hospital of Zhejiang University, Hangzhou, Zhejiang, China
- Biomedical and Health Translational Centre of Zhejiang Province, Haining, Zhejiang, China
| |
|
35
|
Shi J, Tian Z, Lai J, Huang X. Plant pan-genomics and its applications. Molecular Plant 2023; 16:168-186. [PMID: 36523157 DOI: 10.1016/j.molp.2022.12.009] [Citation(s) in RCA: 41] [Impact Index Per Article: 20.5] [Received: 09/22/2022] [Revised: 12/07/2022] [Accepted: 12/12/2022] [Indexed: 06/17/2023]
Abstract
Plant genomes are so highly diverse that a substantial proportion of genomic sequences are not shared among individuals. The variable DNA sequences, along with the conserved core sequences, compose the more sophisticated pan-genome that represents the collection of all non-redundant DNA in a species. With rapid progress in genome sequencing technologies, pan-genome research in plants is now accelerating. Here we review recent advances in plant pan-genomics, including major driving forces of structural variations that constitute the variable sequences, methodological innovations for representing the pan-genome, and major successes in constructing plant pan-genomes. We also summarize recent efforts toward decoding the remaining dark matter in telomere-to-telomere or gapless plant genomes. These new genome resources, which have remarkable advantages over numerous previously assembled less-than-perfect genomes, are expected to become new references for genetic studies and plant breeding.
Affiliation(s)
- Junpeng Shi
- State Key Laboratory of Biocontrol, School of Agriculture, Sun Yat-sen University, Shenzhen 518107, China.
| | - Zhixi Tian
- State Key Laboratory of Plant Cell and Chromosome Engineering, Institute of Genetics and Developmental Biology, Innovation Academy for Seed Design, Chinese Academy of Sciences, Beijing 100101, China
| | - Jinsheng Lai
- State Key Laboratory of Plant Physiology and Biochemistry and National Maize Improvement Center, Department of Plant Genetics and Breeding, China Agricultural University, Beijing 100193, China
| | - Xuehui Huang
- Shanghai Key Laboratory of Plant Molecular Sciences, College of Life Sciences, Shanghai Normal University, Shanghai 200234, China.
| |
|
36
|
John M, Grimm D, Korte A. Predicting Gene Regulatory Interactions Using Natural Genetic Variation. Methods Mol Biol 2023; 2698:301-322. [PMID: 37682482 DOI: 10.1007/978-1-0716-3354-0_18] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/09/2023]
Abstract
Genome-wide association studies (GWAS) are a powerful tool to elucidate the genotype-phenotype map. Although GWAS are usually used to assess simple univariate associations between genetic markers and traits of interest, it is also possible to infer the underlying genetic architecture and to predict gene regulatory interactions. In this chapter, we describe the latest methods and tools to perform GWAS by calculating permutation-based significance thresholds. For this purpose, we first provide guidelines on univariate GWAS analyses that are extended in the second part of this chapter to more complex models that enable the inference of gene regulatory networks and how these networks vary.
Affiliation(s)
- Maura John
- Technical University of Munich & Weihenstephan-Triesdorf University of Applied Sciences, Campus Straubing for Biotechnology and Sustainability, Bioinformatics, Straubing, Germany
| | - Dominik Grimm
- Technical University of Munich & Weihenstephan-Triesdorf University of Applied Sciences, Campus Straubing for Biotechnology and Sustainability, Bioinformatics, Straubing, Germany
| | - Arthur Korte
- Center for Computational and Theoretical Biology, University of Würzburg, Würzburg, Germany.
| |
|
37
|
Lemane T, Chikhi R, Peterlongo P. kmdiff, large-scale and user-friendly differential k-mer analyses. Bioinformatics 2022; 38:5443-5445. [PMID: 36315078 PMCID: PMC9750116 DOI: 10.1093/bioinformatics/btac689] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 06/24/2022] [Revised: 09/23/2022] [Accepted: 10/28/2022] [Indexed: 12/25/2022] Open
Abstract
SUMMARY: Genome-wide association studies elucidate links between genotypes and phenotypes. Recent studies point out the interest of conducting such experiments using k-mers as the base signal instead of single-nucleotide polymorphisms. We propose a tool, kmdiff, that performs differential k-mer analyses on large sequencing cohorts in an order of magnitude less time and memory than previously possible. AVAILABILITY AND IMPLEMENTATION: https://github.com/tlemane/kmdiff. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Téo Lemane
- Univ. Rennes, Inria, CNRS, IRISA - UMR 6074, Rennes, F-35000 France
| | - Rayan Chikhi
- Institut Pasteur, Université Paris Cité, Sequence Bioinformatics, Paris, F-75015, France
| | | |
|
38
|
Bellis ES, Lucardi RD, Saltonstall K, Marsico TD. Predicting invasion risk of grasses in novel environments requires improved genomic understanding of adaptive potential. American Journal of Botany 2022; 109:1965-1968. [PMID: 36200340 PMCID: PMC10100010 DOI: 10.1002/ajb2.16079] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/05/2022] [Revised: 09/29/2022] [Accepted: 09/29/2022] [Indexed: 06/16/2023]
Affiliation(s)
- Emily S. Bellis
- Department of Computer Science, Arkansas State University, State University, AR, USA
- Center for No‐Boundary Thinking, Arkansas State University, State University, AR, USA
| | - Rima D. Lucardi
- Southern Research Station, United States Department of Agriculture Forest Service, Athens, GA, USA
| | | | - Travis D. Marsico
- Department of Biological Sciences, Arkansas State University, State University, AR, USA
| |
|
39
|
Sex chromosomes in the tribe Cyprichromini (Teleostei: Cichlidae) of Lake Tanganyika. Sci Rep 2022; 12:17998. [PMID: 36289404 PMCID: PMC9606112 DOI: 10.1038/s41598-022-23017-y] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/16/2022] [Accepted: 10/21/2022] [Indexed: 01/24/2023] Open
Abstract
Sex determining loci have been described on at least 12 of 22 chromosomes in East African cichlid fishes, indicating a high rate of sex chromosome turnover. To better understand the rates and patterns of sex chromosome replacement, we used new methods to characterize the sex chromosomes of the cichlid tribe Cyprichromini from Lake Tanganyika. Our k-mer based methods successfully identified sex-linked polymorphisms without the need for a reference genome. We confirm the three previously reported sex chromosomes in this group. We determined the polarity of the sex chromosome turnover on LG05 in Cyprichromis as ZW to XY. We identified a new ZW locus on LG04 in Paracyprichromis brieni. The LG15 XY locus in Paracyprichromis nigripinnis was not found in other Paracyprichromis species, and the sample of Paracyprichromis sp. "tembwe" is likely to be of hybrid origin. Although highly divergent sex chromosomes are thought to develop in a stepwise manner, we show two cases (LG05-ZW and LG05-XY) in which the region of differentiation encompasses most of the chromosome, but appears to have arisen in a single step. This study expands our understanding of sex chromosome evolution in the Cyprichromini, and indicates an even higher level of sex chromosome turnover than previously thought.
|
40
|
Genomics-informed prebreeding unlocks the diversity in genebanks for wheat improvement. Nat Genet 2022; 54:1544-1552. [DOI: 10.1038/s41588-022-01189-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/08/2021] [Accepted: 08/18/2022] [Indexed: 11/06/2022]
|
41
|
Kale SM, Schulthess AW, Padmarasu S, Boeven PHG, Schacht J, Himmelbach A, Steuernagel B, Wulff BBH, Reif JC, Stein N, Mascher M. A catalogue of resistance gene homologs and a chromosome-scale reference sequence support resistance gene mapping in winter wheat. Plant Biotechnology Journal 2022; 20:1730-1742. [PMID: 35562859 PMCID: PMC9398310 DOI: 10.1111/pbi.13843] [Citation(s) in RCA: 29] [Impact Index Per Article: 9.7] [Received: 01/31/2022] [Revised: 04/06/2022] [Accepted: 04/23/2022] [Indexed: 06/15/2023]
Abstract
A resistance gene atlas is an integral component of the breeder's arsenal in the fight against evolving pathogens. Thanks to high-throughput sequencing, catalogues of resistance genes can be assembled even in crop species with large and polyploid genomes. Here, we report on capture sequencing and assembly of resistance gene homologs in a diversity panel of 907 winter wheat genotypes comprising ex situ genebank accessions and current elite cultivars. In addition, we use accurate long-read sequencing and chromosome conformation capture sequencing to construct a chromosome-scale genome sequence assembly of cv. Attraktion, an elite variety representative of European winter wheat. We illustrate the value of our resource for breeders and geneticists by (i) comparing the resistance gene complements in plant genetic resources and elite varieties and (ii) conducting genome-wide association scans (GWAS) for the fungal diseases yellow rust and leaf rust using reference-based and reference-free GWAS approaches. The gene content under GWAS peaks was scrutinized in the assembly of cv. Attraktion.
Affiliation(s)
- Sandip M. Kale
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, Seeland, Germany
| | - Albert W. Schulthess
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, Seeland, Germany
| | - Sudharsan Padmarasu
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, Seeland, Germany
| | | | | | - Axel Himmelbach
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, Seeland, Germany
| | | | - Brande B. H. Wulff
- John Innes Centre, Norwich Research Park, Norwich, UK
- Center for Desert Agriculture, Biological and Environmental Science and Engineering Division (BESE), King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
| | - Jochen C. Reif
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, Seeland, Germany
| | - Nils Stein
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, Seeland, Germany
- Center for Integrated Breeding Research (CiBreed), Georg‐August‐University Göttingen, Göttingen, Germany
| | - Martin Mascher
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, Seeland, Germany
- German Centre for Integrative Biodiversity Research (iDiv) Halle‐Jena‐Leipzig, Leipzig, Germany
| |
|
42
|
Abstract
Genetic studies of human traits have revolutionized our understanding of the variation between individuals, and yet, the genetics of most traits is still poorly understood. In this review, we highlight the major open problems that need to be solved, and by discussing these challenges provide a primer to the field. We cover general issues such as population structure, epistasis and gene-environment interactions, data-related issues such as ancestry diversity and rare genetic variants, and specific challenges related to heritability estimates, genetic association studies, and polygenic risk scores. We emphasize the interconnectedness of these problems and suggest promising avenues to address them.
Affiliation(s)
- Nadav Brandes
- School of Computer Science and Engineering, The Hebrew University of Jerusalem, Jerusalem, Israel.
| | - Omer Weissbrod
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
| | - Michal Linial
- Department of Biological Chemistry, The Alexander Silberman Institute of Life Sciences, The Hebrew University of Jerusalem, Jerusalem, Israel
|
43
|
Walker K, Kalra D, Lowdon R, Chen G, Molik D, Soto DC, Dabbaghie F, Khleifat AA, Mahmoud M, Paulin LF, Raza MS, Pfeifer SP, Agustinho DP, Aliyev E, Avdeyev P, Barrozo ER, Behera S, Billingsley K, Chong LC, Choubey D, De Coster W, Fu Y, Gener AR, Hefferon T, Henke DM, Höps W, Illarionova A, Jochum MD, Jose M, Kesharwani RK, Kolora SRR, Kubica J, Lakra P, Lattimer D, Liew CS, Lo BW, Lo C, Lötter A, Majidian S, Mendem SK, Mondal R, Ohmiya H, Parvin N, Peralta C, Poon CL, Prabhakaran R, Saitou M, Sammi A, Sanio P, Sapoval N, Syed N, Treangen T, Wang G, Xu T, Yang J, Zhang S, Zhou W, Sedlazeck FJ, Busby B. The third international hackathon for applying insights into large-scale genomic composition to use cases in a wide range of organisms. F1000Res 2022; 11:530. [PMID: 36262335 PMCID: PMC9557141 DOI: 10.12688/f1000research.110194.1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 05/04/2022] [Indexed: 01/25/2023] Open
Abstract
In October 2021, 59 scientists from 14 countries and 13 U.S. states collaborated virtually in the Third Annual Baylor College of Medicine & DNANexus Structural Variation hackathon. The goal of the hackathon was to advance research on structural variants (SVs) by prototyping and iterating on open-source software. This led to nine hackathon projects focused on diverse genomics research interests, including various SV discovery and genotyping methods, SV sequence reconstruction, and clinically relevant structural variation, including SARS-CoV-2 variants. Repositories for the projects that participated in the hackathon are available at https://github.com/collaborativebioinformatics.
Affiliation(s)
- Kimberly Walker
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, 77030, USA
| | - Divya Kalra
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, 77030, USA
| | | | - Guangyi Chen
- Drug Bioinformatics, Helmholtz Institute for Pharmaceutical Research Saarland (HIPS), Saarbrücken, Germany
- Center for Bioinformatics, Saarland University, Saarbrücken, Germany
| | - David Molik
- Tropical Crop and Commodity Protection Research Unit, Pacific Basin Agricultural Research Center, Hilo, HI, 96720, USA
| | - Daniela C. Soto
- Biochemistry & Molecular Medicine, Genome Center, MIND Institute, University of California, Davis, Davis, CA, 95616, USA
| | - Fawaz Dabbaghie
- Drug Bioinformatics, Helmholtz Institute for Pharmaceutical Research Saarland (HIPS), Saarbrücken, Germany
- Institute for Medical Biometry and Bioinformatics, University hospital Düsseldorf, Düsseldorf, Germany
| | - Ahmad Al Khleifat
- Institute of Psychiatry, Psychology & Neuroscience, King's College London, London, UK
| | - Medhat Mahmoud
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, 77030, USA
| | - Luis F Paulin
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, 77030, USA
| | - Muhammad Sohail Raza
- CAS Key Laboratory of Genomic and Precision Medicine, Beijing Institute of Genomics, Beijing, China
| | - Susanne P. Pfeifer
- Center for Evolution and Medicine, Arizona State University, Tempe, AZ, USA
| | - Daniel Paiva Agustinho
- Department of Molecular Microbiology, Washington University in St. Louis School of Medicine, St. Louis, MO, 63110, USA
| | - Elbay Aliyev
- Research Department, Sidra Medicine, Doha, Qatar
| | - Pavel Avdeyev
- Computational Biology Institute, The George Washington University, Washington, DC, 20052, USA
| | - Enrico R. Barrozo
- Department of Obstetrics & Gynecology, Baylor College of Medicine, Houston, TX, 77030, USA
| | - Sairam Behera
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, 77030, USA
| | - Kimberley Billingsley
- Molecular Genetics Section, Laboratory of Neurogenetics, National Institute on Aging, National Institutes of Health, Bethesda, MD, USA
| | - Li Chuin Chong
- Beykoz Institute of Life Sciences and Biotechnology, Bezmialem Vakif University, Beykoz, Istanbul, Turkey
| | - Deepak Choubey
- Department of Technology, Savitribai Phule Pune University, Pune, Maharashtra, India
| | - Wouter De Coster
- Applied and Translational Neurogenomics Group, VIB Center for Molecular Neurology, Antwerp, Belgium
- Applied and Translational Neurogenomics Group, Department of Biomedical Sciences, University of Antwerp, Antwerp, Belgium
| | - Yilei Fu
- Department of Computer Science, Rice University, Houston, TX, USA
| | - Alejandro R. Gener
- Association of Public Health Labs, Centers for Disease Control and Prevention, Downey, CA, USA
| | - Timothy Hefferon
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, 20892, USA
| | - David Morgan Henke
- Department Molecular Virology and Microbiology, Baylor College of Medicine, Houston, TX, 77030, USA
| | - Wolfram Höps
- EMBL Heidelberg, Genome Biology Unit, Heidelberg, Germany
| | | | - Michael D. Jochum
- Department of Obstetrics & Gynecology, Baylor College of Medicine, Houston, TX, 77030, USA
| | - Maria Jose
- Centre for Bioinformatics, Pondicherry University, Pondicherry, India
| | - Rupesh K. Kesharwani
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, 77030, USA
| | | | | | - Priya Lakra
- Department of Zoology, University of Delhi, Delhi, India
| | - Damaris Lattimer
- University of Applied Sciences Upper Austria - FH Hagenberg, Mühlkreis, Austria
| | - Chia-Sin Liew
- Center for Biotechnology, University of Nebraska-Lincoln, Lincoln, Nebraska, 68588, USA
| | - Bai-Wei Lo
- Department of Biology, University of Konstanz, Konstanz, Germany
| | - Chunhsuan Lo
- Human Genetics Laboratory, National Institute of Genetics, Japan, Mishima City, Japan
| | - Anneri Lötter
- Department of Biochemistry, University of Pretoria, Pretoria, South Africa
| | - Sina Majidian
- Department of Computational Biology, University of Lausanne, Lausanne, Switzerland
| | | | - Rajarshi Mondal
- Department of Biotechnology, The University of Burdwan, West Bengal, India
| | - Hiroko Ohmiya
- Genetic Reagent Development Unit, Medical & Biological Laboratories Co., Ltd., Tokyo, Japan
| | - Nasrin Parvin
- Department of Biotechnology, The University of Burdwan, West Bengal, India
| | | | | | | | - Marie Saitou
- Center of Integrative Genetics (CIGENE), Faculty of Biosciences, Norwegian University of Life Sciences, As, Norway
| | - Aditi Sammi
- School of Biochemical Engineering, Indian Institute of Technology (BHU), Varanasi, Uttar Pradesh, India
| | - Philippe Sanio
- University of Applied Sciences Upper Austria - FH Hagenberg, Hagenberg im Mühlkreis, Austria
| | - Nicolae Sapoval
- Department of Computer Science, Rice University, Houston, TX, USA
| | - Najeeb Syed
- Research Department, Sidra Medicine, Doha, Qatar
| | - Todd Treangen
- Department of Computer Science, Rice University, Houston, TX, USA
| | | | - Tiancheng Xu
- Department of Computer Science, Rice University, Houston, TX, USA
| | - Jianzhi Yang
- Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA, USA
| | - Shangzhe Zhang
- School of Biology, University of St Andrews, St Andrews, UK
| | - Weiyu Zhou
- Department of Statistical Science, George Mason University, Fairfax, Virginia, USA
| | - Fritz J Sedlazeck
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, 77030, USA
|
44
|
Genetic diversity analysis and marker-trait associations in Amaranthus species. PLoS One 2022; 17:e0267752. [PMID: 35551526 PMCID: PMC9098028 DOI: 10.1371/journal.pone.0267752] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 07/01/2021] [Accepted: 04/15/2022] [Indexed: 11/26/2022] Open
Abstract
Amaranth (Amaranthus spp.) is a highly nutritious, underutilized vegetable and pseudo-cereal crop. It possesses diverse abiotic stress tolerance traits, is genetically diverse and highly phenotypically plastic, making it an ideal crop to thrive in a rapidly changing climate. Despite considerable genetic diversity, there is a lack of detailed characterization of germplasm or population structures. The present study utilized the DArTSeq platform to determine the genetic relationships and population structure among 188 amaranth accessions from 18 agronomically important vegetable, grain, and weedy species. A total of 74,303 SNP alleles were generated, of which 63,821 were physically mapped to the genome of the grain species A. hypochondriacus. Population structure was inferred in two steps: first for all 188 amaranth accessions comprising 18 species, and second for only the 120 A. tricolor accessions. After SNP filtering, a total of 8,688 SNPs were generated on 181 amaranth accessions of 16 species and 9,789 SNPs were generated on 118 A. tricolor accessions. Both SNP datasets produced three major sub-populations (K = 3) and generated a consistent taxonomic classification of the amaranth sub-genera (Amaranthus Amaranthus, Amaranthus Acnida and Amaranthus albersia), although the accessions were poorly demarcated by geographical origin and morphological traits. A. tricolor accessions were well discriminated from other amaranth species. A genome-wide association study (GWAS) of 10 qualitative traits revealed an association between specific phenotypes and genetic variants within the genome and identified 22 marker trait associations (MTAs) and 100 MTAs (P≤0.01, P≤0.001) on the 16 amaranth species and 118 A. tricolor datasets, respectively. The release of SNP markers from this panel has produced invaluable preliminary genetic information for phenotyping and cultivar improvement in amaranth species.
|
45
|
Canaguier A, Guilbaud R, Denis E, Magdelenat G, Belser C, Istace B, Cruaud C, Wincker P, Le Paslier MC, Faivre-Rampant P, Barbe V. Oxford Nanopore and Bionano Genomics technologies evaluation for plant structural variation detection. BMC Genomics 2022; 23:317. [PMID: 35448948 PMCID: PMC9026655 DOI: 10.1186/s12864-022-08499-4] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 04/16/2021] [Accepted: 03/17/2022] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Structural Variations (SVs) are genomic rearrangements derived from duplication, deletion, insertion, inversion, and translocation events. In the past, SV detection was limited to cytological approaches, then to Next-Generation Sequencing (NGS) short reads and partitioned assemblies. Nowadays, technologies such as DNA long-read sequencing and optical mapping have revolutionized the understanding of SVs in genomes by increasing the power of SV detection. This study aims to investigate the performance of two techniques, 1) long-read sequencing obtained with the MinION device (Oxford Nanopore Technologies) and 2) optical mapping obtained with the Saphyr device (Bionano Genomics), to detect and characterize SVs in the genomes of two ecotypes of Arabidopsis thaliana, Columbia-0 (Col-0) and Landsberg erecta 1 (Ler-1). RESULTS We described the SVs detected from the alignment of the best ONT assembly and DLE-1 optical maps of A. thaliana Ler-1 against the public reference genome Col-0 TAIR10.1. After filtering (SV > 1 kb), 1184 and 591 Ler-1 SVs were retained from ONT and Bionano technologies respectively. A total of 948 Ler-1 ONT SVs (80.1%) corresponded to 563 Bionano SVs (95.3%), leading to 563 common locations. The specific locations were scrutinized to assess improvement in SV detection by either technology. The ONT SVs were mostly detected near TE and gene features, and resistance genes seemed particularly impacted. CONCLUSIONS Structural variations linked to ONT sequencing error were removed and false positives limited, with high quality Bionano SVs being conserved. When compared with the Col-0 TAIR10.1 reference genome, most of the SVs discovered by both technologies were found in the same locations. The ONT assembly sequence leads to more specific SVs than the Bionano one, the latter being more efficient at characterizing large SVs. Even if both technologies are complementary approaches, ONT data appear to be better suited to large-scale population studies, while Bionano performs better in improving assembly and describing the specificity of a genome compared to a reference.
Affiliation(s)
- Aurélie Canaguier
- Université Paris-Saclay, INRAE, Etude du Polymorphisme des Génomes Végétaux EPGV, 91000 Evry-Courcouronnes, France
| | - Romane Guilbaud
- Université Paris-Saclay, INRAE, Etude du Polymorphisme des Génomes Végétaux EPGV, 91000 Evry-Courcouronnes, France
| | - Erwan Denis
- Genoscope, Institut de biologie François-Jacob, Commissariat à l’Energie Atomique CEA, Université Paris-Saclay, Evry, France
| | - Ghislaine Magdelenat
- Genoscope, Institut de biologie François-Jacob, Commissariat à l’Energie Atomique CEA, Université Paris-Saclay, Evry, France
| | - Caroline Belser
- Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, Univ Evry, Université Paris-Saclay, 91057 Evry, France
| | - Benjamin Istace
- Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, Univ Evry, Université Paris-Saclay, 91057 Evry, France
| | - Corinne Cruaud
- Genoscope, Institut de biologie François-Jacob, Commissariat à l’Energie Atomique CEA, Université Paris-Saclay, Evry, France
| | - Patrick Wincker
- Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, Univ Evry, Université Paris-Saclay, 91057 Evry, France
| | - Marie-Christine Le Paslier
- Université Paris-Saclay, INRAE, Etude du Polymorphisme des Génomes Végétaux EPGV, 91000 Evry-Courcouronnes, France
| | - Patricia Faivre-Rampant
- Université Paris-Saclay, INRAE, Etude du Polymorphisme des Génomes Végétaux EPGV, 91000 Evry-Courcouronnes, France
| | - Valérie Barbe
- Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, Univ Evry, Université Paris-Saclay, 91057 Evry, France
|
46
|
Hübner S. Are we there yet? Driving the road to evolutionary graph-pangenomics. CURRENT OPINION IN PLANT BIOLOGY 2022; 66:102195. [PMID: 35217472 DOI: 10.1016/j.pbi.2022.102195] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 07/27/2021] [Revised: 01/20/2022] [Accepted: 01/21/2022] [Indexed: 06/14/2023]
Abstract
With the increase in the number of sequenced genomes, it is now recognized that graph-based pangenomes can provide a comprehensive platform to study diversity in a population or species, from point mutations to large chromosomal rearrangements. By incorporating concepts from graph theory, a graph-pangenome can be studied directly to identify genomic regions and genes that underlie important evolutionary processes and traits. Here, I discuss how basic concepts in graph theory can be implemented to address questions in evolutionary genomics and guide future breeding efforts. Despite its compelling versatility, graph-pangenome assembly is still challenging, especially in species with large complex genomes. As technology is rapidly improving, the graph-pangenome is expected to become a central platform in genomics studies and applications. Thus, the development of tools and methods that exploit the graph structure is urged to pave the route to evolutionary graph-pangenomics.
Affiliation(s)
- Sariel Hübner
- Galilee Research Institute (Migal), Tel-Hai Academic College, Upper Galilee, 12210, Israel.
|
47
|
Onetto CA, Sosnowski MR, Van Den Heuvel S, Borneman AR. Population genomics of the grapevine pathogen Eutypa lata reveals evidence for population expansion and intraspecific differences in secondary metabolite gene clusters. PLoS Genet 2022; 18:e1010153. [PMID: 35363788 PMCID: PMC9007359 DOI: 10.1371/journal.pgen.1010153] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 12/21/2021] [Revised: 04/13/2022] [Accepted: 03/17/2022] [Indexed: 12/02/2022] Open
Abstract
Eutypa dieback of grapevine is an important disease caused by the generalist Ascomycete fungus Eutypa lata. Despite the relevance of this species to the global wine industry, its genomic diversity remains unknown, with only a single publicly available genome assembly. Whole-genome sequencing and comparative genomics were performed on forty Australian E. lata isolates to understand the genome evolution, adaptation, population size and structure of these isolates. Phylogenetic and linkage disequilibrium decay analyses provided evidence of extensive gene flow through sexual recombination between isolates obtained from different geographic locations and hosts. Investigation of the genetic diversity of these isolates suggested rapid population expansion, likely as a consequence of the recent growth of the Australian wine industry. Genomic regions affected by selective sweeps were shown to be enriched for genes associated with secondary metabolite clusters and included genes encoding proteins with a role in nutrient acquisition, degradation of host cell wall and metal and drug resistance, suggesting recent adaptation to both abiotic factors and potentially host genotypes. Genome synteny analysis using long-read genome assemblies showed significant intraspecific genomic plasticity with extensive chromosomal rearrangements impacting the secondary metabolite production potential of this species. Finally, k-mer based GWAS analysis identified a potential locus associated with mycelia recovery in canes of Vitis vinifera that will require further investigations.
Eutypa dieback of grapevine, caused by the Ascomycete fungus Eutypa lata, is responsible for significant economic losses to the wine industry. Despite the worldwide prevalence of this pathogen, its genomic diversity remains unknown, with only a single publicly available genome assembly. This knowledge gap was addressed by performing whole-genome sequencing of 40 E. lata isolates sourced from different hosts and geographical locations around Australia. Investigation of the genetic diversity of this population showed a high degree of gene-flow and sexual recombination as well as demographic expansion. Through the inspection of signatures of selective sweeps, repeat-mediated chromosomal rearrangements, and pan-genomic elements, it was shown that this species has a highly dynamic secondary metabolite production potential that could have important implications for its pathogenicity and lifestyle. In addition, application of a k-mer based GWAS methodology identified a potential locus associated with the growth of this species within canes of Vitis vinifera.
Affiliation(s)
| | - Mark R. Sosnowski
- South Australian Research and Development Institute, Adelaide, Australia
- School of Wine, Food and Agriculture, The University of Adelaide, Adelaide, Australia
| | | | - Anthony R. Borneman
- The Australian Wine Research Institute, Adelaide, Australia
- School of Wine, Food and Agriculture, The University of Adelaide, Adelaide, Australia
|
48
|
Zhu F, Ahchige MW, Brotman Y, Alseekh S, Zsögön A, Fernie AR. Bringing more players into play: Leveraging stress in genome wide association studies. JOURNAL OF PLANT PHYSIOLOGY 2022; 271:153657. [PMID: 35231821 DOI: 10.1016/j.jplph.2022.153657] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 12/20/2021] [Revised: 02/14/2022] [Accepted: 02/21/2022] [Indexed: 06/14/2023]
Abstract
In order to meet the demand of the burgeoning human population as well as to adapt crops to the enhanced abiotic and biotic stress caused by the global climatic change, breeders focus on identifying valuable genes to improve both crop stress tolerance and crop quality. Recently, with the development of next-generation sequencing methods, millions of high quality single-nucleotide polymorphisms (SNPs) have been made available and genome-wide association studies (GWAS) are widely used in crop improvement studies to identify the associations between genetic variants of genomes and relevant crop agronomic traits. Here, we review classic cases of use of GWAS to identify genetic variants associated with valuable traits such as geographic adaptation, crop quality and metabolites. We discuss the power of stress GWAS to identify further associations including those with genes that are not, or only lowly, expressed during optimal growth conditions. Finally, we emphasize recent demonstrations of the efficiency and accuracy of time-resolved dynamic stress GWAS and GWAS based on genomic gene expression and structural variations, which can be applied to resolve more comprehensively the genetic regulation mechanisms of complex traits.
Affiliation(s)
- Feng Zhu
- Max-Planck-Institut für Molekulare Pflanzenphysiologie, Am Mühlenberg 1, 14476, Potsdam-Golm, Germany; National R&D Center for Citrus Preservation, Key Laboratory of Horticultural Plant Biology, Ministry of Education, Huazhong Agricultural University, 430070, Wuhan, China
| | - Micha Wijesingha Ahchige
- Max-Planck-Institut für Molekulare Pflanzenphysiologie, Am Mühlenberg 1, 14476, Potsdam-Golm, Germany
| | - Yariv Brotman
- Max-Planck-Institut für Molekulare Pflanzenphysiologie, Am Mühlenberg 1, 14476, Potsdam-Golm, Germany; Department of Life Sciences, Ben-Gurion University of the Negev, Beersheba, Israel
| | - Saleh Alseekh
- Max-Planck-Institut für Molekulare Pflanzenphysiologie, Am Mühlenberg 1, 14476, Potsdam-Golm, Germany; Center of Plant Systems Biology and Biotechnology, 4000, Plovdiv, Bulgaria
| | - Agustin Zsögön
- Max-Planck-Institut für Molekulare Pflanzenphysiologie, Am Mühlenberg 1, 14476, Potsdam-Golm, Germany; Departamento de Biologia Vegetal, Universidade Federal de Viçosa, CEP 36570-900, Viçosa, MG, Brazil
| | - Alisdair R Fernie
- Max-Planck-Institut für Molekulare Pflanzenphysiologie, Am Mühlenberg 1, 14476, Potsdam-Golm, Germany; Center of Plant Systems Biology and Biotechnology, 4000, Plovdiv, Bulgaria.
|
49
|
Chelliah R, Banan-MwineDaliri E, Khan I, Wei S, Elahi F, Yeon SJ, Selvakumar V, Ofosu FK, Rubab M, Ju HH, Rallabandi HR, Madar IH, Sultan G, Oh DH. A review on the application of bioinformatics tools in food microbiome studies. Brief Bioinform 2022; 23:bbac007. [PMID: 35189636 DOI: 10.1093/bib/bbac007] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/27/2021] [Revised: 12/20/2021] [Accepted: 01/05/2022] [Indexed: 12/12/2022] Open
Abstract
There is currently a transformed interest toward understanding the impact of fermentation on functional food development due to growing consumer interest on modified health benefits of sustainable foods. In this review, we attempt to summarize recent findings regarding the impact of Next-generation sequencing and other bioinformatics methods in the food microbiome and use prediction software to understand the critical role of microbes in producing fermented foods. Traditionally, fermentation methods and starter culture development were considered conventional methods needing optimization to eliminate errors in technique and were influenced by technical knowledge of fermentation. Recent advances in high-output omics innovations permit the implementation of additional logical tactics for developing fermentation methods. Further, the review describes the multiple functions of the predictions based on docking studies and the correlation of genomic and metabolomic analysis to develop trends to understand the potential food microbiome interactions and associated products to become a part of a healthy diet.
Affiliation(s)
- Ramachandran Chelliah
- Department of Food Science and Biotechnology, College of Agriculture and Life Sciences, Kangwon National University, Chuncheon, Gangwon-do 24341, Korea
| | - Eric Banan-MwineDaliri
- Department of Food Science and Biotechnology, College of Agriculture and Life Sciences, Kangwon National University, Chuncheon, Gangwon-do 24341, Korea
| | - Imran Khan
- Department of Food Science and Biotechnology, College of Agriculture and Life Sciences, Kangwon National University, Chuncheon, Gangwon-do 24341, Korea
- Department of Biotechnology, University of Malakand, Khyber Pakhtunkhwa, Pakistan
| | - Shuai Wei
- Department of Food Science and Biotechnology, College of Agriculture and Life Sciences, Kangwon National University, Chuncheon, Gangwon-do 24341, Korea
- Guangdong Provincial Key Laboratory of Aquatic Product Processing and Safety, College of Food Science and Technology, Guangdong Ocean University, Zhanjiang 524088, China
| | - Fazle Elahi
- Department of Food Science and Biotechnology, College of Agriculture and Life Sciences, Kangwon National University, Chuncheon, Gangwon-do 24341, Korea
| | - Su-Jung Yeon
- Department of Food Science and Biotechnology, College of Agriculture and Life Sciences, Kangwon National University, Chuncheon, Gangwon-do 24341, Korea
| | - Vijayalakshmi Selvakumar
- Department of Food Science and Biotechnology, College of Agriculture and Life Sciences, Kangwon National University, Chuncheon, Gangwon-do 24341, Korea
| | - Fred Kwame Ofosu
- Department of Food Science and Biotechnology, College of Agriculture and Life Sciences, Kangwon National University, Chuncheon, Gangwon-do 24341, Korea
| | - Momna Rubab
- Department of Food Science and Biotechnology, College of Agriculture and Life Sciences, Kangwon National University, Chuncheon, Gangwon-do 24341, Korea
| | - Hum Hun Ju
- Department of Biological Environment, College of Agriculture and Life Sciences, Kangwon National University, Chuncheon, Gangwon-do 24341, Korea
| | - Harikrishna Reddy Rallabandi
- Department of Food Science and Biotechnology, College of Agriculture and Life Sciences, Kangwon National University, Chuncheon, Gangwon-do 24341, Korea
| | - Inamul Hasan Madar
- Department of Biochemistry, School of Life Science, Bharathidasan University, Thiruchirappalli, Tamilnadu, India
| | - Ghazala Sultan
- Department of Computer Science, Aligarh Muslim University, Aligarh, Uttar Pradesh, 202002, India
| | - Deog Hwan Oh
- Department of Food Science and Biotechnology, College of Agriculture and Life Sciences, Kangwon National University, Chuncheon, Gangwon-do 24341, Korea
|
50
|
Singh R, Kumar K, Bharadwaj C, Verma PK. Broadening the horizon of crop research: a decade of advancements in plant molecular genetics to divulge phenotype governing genes. PLANTA 2022; 255:46. [PMID: 35076815 DOI: 10.1007/s00425-022-03827-0] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 09/13/2021] [Accepted: 01/08/2022] [Indexed: 06/14/2023]
Abstract
Advancements in sequencing, genotyping, and computational technologies during the last decade (2011-2020) enabled new forward-genetic approaches, which overcome the impediments to precise gene mapping in varied crops. Modern crop improvement programs rely heavily on two major steps: identification of trait-associated QTLs, genes, or markers, and molecular breeding. Thus, it is vital for basic and translational crop research to identify genomic regions that govern the phenotype of interest. Until the advent of next-generation sequencing, the forward-genetic techniques were laborious and time-consuming. Over the last 10 years, advancements in the area of genome assembly, genotyping, large-scale data analysis, and statistical algorithms have led to faster identification of genomic variations regulating complex agronomic traits and pathogen resistance. In this review, we describe the latest developments in genome sequencing and genotyping along with a comprehensive evaluation of the last 10 years of progress in forward-genetic techniques that have shifted the focus of plant research from model plants to diverse crops. We have classified the available molecular genetic methods under bulk-segregant analysis-based (QTL-seq, GradedPool-Seq, QTG-Seq, Exome QTL-seq, and RapMap), target sequence enrichment-based (RenSeq, AgRenSeq, and TACCA), and mutation-based groups (MutMap, NIKS algorithm, MutRenSeq, MutChromSeq), alongside improvements in classical mapping and genome-wide association analyses. Newer methods for outcrossing, heterozygous, and polyploid plant genetics have also been discussed. The use of k-mers has enriched the nature of genetic variants which can be utilized to identify the phenotype-causing genes, independent of reference genomes. We envisage that the recent methods discussed herein will expand the repertoire of useful alleles and help in developing high-yielding and climate-resilient crops.
Affiliation(s)
- Ritu Singh
- Plant Immunity Laboratory, National Institute of Plant Genome Research (NIPGR), Aruna Asaf Ali Marg, New Delhi, 110067, India
| | - Kamal Kumar
- Plant Immunity Laboratory, National Institute of Plant Genome Research (NIPGR), Aruna Asaf Ali Marg, New Delhi, 110067, India
| | - Chellapilla Bharadwaj
- Division of Genetics, ICAR-Indian Agricultural Research Institute (IARI), New Delhi, 110020, India
| | - Praveen Kumar Verma
- Plant Immunity Laboratory, National Institute of Plant Genome Research (NIPGR), Aruna Asaf Ali Marg, New Delhi, 110067, India.
- Plant Immunity Laboratory, School of Life Sciences, Jawaharlal Nehru University, New Delhi, 110067, India.
|