1
He G, Liu C, Wang M. Perspectives and opportunities in forensic human, animal, and plant integrative genomics in the Pangenome era. Forensic Sci Int 2025; 367:112370. [PMID: 39813779 DOI: 10.1016/j.forsciint.2025.112370] [Received: 11/18/2024] [Revised: 12/24/2024] [Accepted: 01/08/2025] [Indexed: 01/18/2025]
Abstract
The Human Pangenome Reference Consortium, the Chinese Pangenome Consortium, and other plant and animal pangenome projects have announced the completion of pilot work aimed at constructing high-quality, haplotype-resolved reference graph genomes representative of global ethno-linguistically different populations or different plant and animal species. These graph-based, gapless pangenome references, which are enriched in terms of genomic diversity, completeness, and contiguity, have the potential to enhance long-read sequencing (LRS)-based genomic research and to improve mappability and variant genotyping on traditional short-read sequencing platforms. We comprehensively discuss the advancements in pangenome-based integrative genomic discoveries across forensic-related species (humans, animals, and plants) and summarize their applications in variant identification and forensic genomics, epigenetics, transcriptomics, and microbiome research. Recent developments in multiplexed array sequencing have introduced a highly efficient and programmable technique to overcome the limitations of short forensic marker lengths in LRS platforms. This technique enables the concatenation of short RNA transcripts and DNA fragments into LRS-optimal molecules for sequencing, assembly, and genotyping. The integration of new pangenome reference coordinates and corresponding computational algorithms will benefit forensic integrative genomics by facilitating new marker identification, accurate genotyping, high-resolution panel development, and the updating of statistical algorithms. This review highlights the necessity of integrating LRS-based platforms, pangenome-based study designs, and graph-based pangenome references in short-read mapping and LRS-based innovations to achieve precision forensic science.
Affiliation(s)
- Guanglin He
  - Institute of Rare Diseases, West China Hospital of Sichuan University, Sichuan University, Chengdu 610000, China; Center for Archaeological Science, Sichuan University, Chengdu 610000, China
- Chao Liu
  - Anti-Drug Technology Center of Guangdong Province, Guangzhou 510230, China
- Mengge Wang
  - Institute of Rare Diseases, West China Hospital of Sichuan University, Sichuan University, Chengdu 610000, China; Center for Archaeological Science, Sichuan University, Chengdu 610000, China; Department of Forensic Medicine, College of Basic Medicine, Chongqing Medical University, Chongqing 400331, China
2
Ruperao P, Rangan P, Shah T, Sharma V, Rathore A, Mayes S, Pandey MK. Developing pangenomes for large and complex plant genomes and their representation formats. J Adv Res 2025:S2090-1232(25)00071-2. [PMID: 39894347 DOI: 10.1016/j.jare.2025.01.052] [Received: 03/25/2024] [Revised: 01/27/2025] [Accepted: 01/27/2025] [Indexed: 02/04/2025] Open
Abstract
BACKGROUND: The development of pangenomes has revolutionized genomic studies by capturing the complete genetic diversity within a species. Pangenome assembly integrates data from multiple individuals to construct a comprehensive genomic landscape, revealing both core and accessory genomic elements. This approach enables the identification of novel genes, structural variations, and gene presence-absence variations, providing insights into species evolution, adaptation, and trait variation. Representing pangenomes requires innovative visualization formats that effectively convey the complex genomic structures and variations.
AIM: This review delves into contemporary methodologies and recent advancements in constructing pangenomes, particularly in plant genomes. It examines the structure of pangenome representation, including format comparison, conversion, visualization techniques, and their implications for enhancing crop improvement strategies.
KEY SCIENTIFIC CONCEPTS OF REVIEW: Earlier comparative studies have illuminated novel gene sequences, copy number variations, and presence-absence variations across diverse crop species. The concept of a pan-genome, which captures multiple genetic variations from a broad spectrum of genotypes, offers a holistic perspective of a species' genetic makeup. However, constructing a pan-genome for plants with larger genomes poses challenges, including managing vast genome sequence data and comprehending the genetic variations within the germplasm. To address these challenges, researchers have explored cost-effective alternatives to encapsulate species diversity in a single assembly known as a pangenome. This involves reducing the volume of genome sequences while focusing on genetic variations. With the growing prominence of the pan-genome concept in plant genomics, several software tools have emerged to facilitate pangenome construction. This review sheds light on developing and utilizing software tools tailored for constructing pan-genomes in plants. It also discusses representation formats suitable for downstream analyses, offering valuable insights into the genetic landscape and evolutionary dynamics of plant species. In summary, this review underscores the significance of pan-genome construction and representation formats in resolving the genetic architecture of plants, particularly those with complex genomes. It provides a comprehensive overview of recent advancements, aiding in exploring and understanding plant genetic diversity.
Affiliation(s)
- Pradeep Ruperao
  - Center of Excellence in Genomics and Systems Biology (CEGSB) and Center for Pre-Breeding Research (CPBR), International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Hyderabad, India
- Parimalan Rangan
  - ICAR-National Bureau of Plant Genetic Resources (NBPGR), New Delhi, India; Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, St Lucia, Australia
- Trushar Shah
  - International Institute of Tropical Agriculture (IITA), Nairobi, Kenya
- Vinay Sharma
  - Center of Excellence in Genomics and Systems Biology (CEGSB) and Center for Pre-Breeding Research (CPBR), International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Hyderabad, India
- Abhishek Rathore
  - International Maize and Wheat Improvement Center (CIMMYT), Nairobi, Kenya
- Sean Mayes
  - Center of Excellence in Genomics and Systems Biology (CEGSB) and Center for Pre-Breeding Research (CPBR), International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Hyderabad, India
- Manish K Pandey
  - Center of Excellence in Genomics and Systems Biology (CEGSB) and Center for Pre-Breeding Research (CPBR), International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Hyderabad, India
3
Nikhil S, Mohideen HS, Sella RN. Unveiling the Genomic Symphony: Identification Cultivar-Specific Genes and Enhanced Insights on Sweet Sorghum Genomes Through Comprehensive superTranscriptomic Analysis. J Mol Evol 2024; 92:720-743. [PMID: 39261311 DOI: 10.1007/s00239-024-10198-5] [Received: 12/19/2023] [Accepted: 08/20/2024] [Indexed: 09/13/2024]
Abstract
Sorghum (Sorghum bicolor (L.) Moench) is a multipurpose crop grown for food, fodder, and bioenergy production. Its cultivated varieties, along with their wild counterparts, contribute to the core genetic pool. Despite the availability of several re-sequenced sorghum genomes, a variable portion of sorghum genomes is not reported during reference genome assembly and annotation. The present analysis used 223 publicly available RNA-seq datasets from seven sweet sorghum cultivars to construct a superTranscriptome. This approach yielded 45,864 Representative Transcript Assemblies (RTAs) that showed Presence/Absence Variation (PAV) across 15 published sorghum genomes. We found that 301 superTranscripts were exclusive to sweet sorghum, including 58 de novo genes encoding core and linker histones, zinc finger domains, glucosyl transferases, cellulose synthase, etc. The superTranscriptome added 2,802 new protein-coding genes to the Sweet Sorghum Reference Genome (SSRG), of which 559 code for different transcription factors (TFs). Our analysis revealed that MULE-like transposases were abundant in the sweet sorghum genome and could play a hidden role in the evolution of sweet sorghum. We observed large deletions in the D locus and terminal deletions in four other NAC-encoding loci in the SSRG compared to its wild progenitor (353), suggesting that non-functional NAC genes contributed to trait development in sweet sorghum. Moreover, superTranscript-based methods for Differential Exon Usage (DEU) and Differential Gene Expression (DGE) analyses were more accurate than those based on the SSRG. This study demonstrates that the superTranscriptome can enhance our understanding of fundamental sorghum mechanisms, improve genome annotations, and potentially even replace the reference genome.
Affiliation(s)
- Shinde Nikhil
  - Membrane Protein Interaction Lab, Department of Genetic Engineering, SRM Institute of Science and Technology, Chengalpattu District, Tamil Nadu, 603203, India
- Habeeb Shaikh Mohideen
  - Entomoinformatics Lab, Department of Genetic Engineering, SRM Institute of Science and Technology, Chengalpattu District, Tamil Nadu, 603203, India
- Raja Natesan Sella
  - Membrane Protein Interaction Lab, Department of Genetic Engineering, SRM Institute of Science and Technology, Chengalpattu District, Tamil Nadu, 603203, India
4
Kaur H, Shannon LM, Samac DA. A stepwise guide for pangenome development in crop plants: an alfalfa (Medicago sativa) case study. BMC Genomics 2024; 25:1022. [PMID: 39482604 PMCID: PMC11526573 DOI: 10.1186/s12864-024-10931-w] [Received: 06/13/2024] [Accepted: 10/21/2024] [Indexed: 11/03/2024] Open
Abstract
BACKGROUND: The concept of pangenomics and the importance of structural variants are gaining recognition within the plant genomics community. Due to advancements in sequencing and computational technology, it has become feasible to sequence the entire genome of numerous individuals of a single species at a reasonable cost. Pangenomes have been constructed for many major diploid crops, including rice, maize, soybean, sorghum, pearl millet, peas, sunflower, grapes, and mustards. However, pangenomes for polyploid species are relatively scarce and are available for only a few crops, including wheat, cotton, rapeseed, and potatoes.
MAIN BODY: In this review, we explore the various methods used in crop pangenome development, discussing the challenges and implications of these techniques based on insights from published pangenome studies. We offer a systematic guide and discuss the tools available for constructing a pangenome and conducting downstream analyses. Alfalfa, a highly heterozygous, cross-pollinated, and autotetraploid forage crop species, is used as an example to discuss the concerns and challenges posed by polyploid crop species. We conducted a comparative analysis using linear and graph-based methods by constructing an alfalfa graph pangenome using three publicly available genome assemblies. To illustrate the intricacies captured by pangenome graphs for a complex crop genome, we used five different gene sequences and aligned them against the three graph-based pangenomes. The comparison of the three graph pangenome methods reveals notable differences in the genomic variation captured by each pipeline.
CONCLUSION: Pangenome resources are proving invaluable by offering insights into core and dispensable genes, novel gene discovery, and genome-wide patterns of variation. Developing user-friendly online portals for linear pangenome visualization has made these resources accessible to the broader scientific and breeding community. However, challenges remain with graph-based pangenomes, including compatibility with other tools, extraction of sequence for regions of interest, and visualization of genetic variation captured in pangenome graphs. These issues necessitate further refinement of tools and pipelines to effectively address the complexities of polyploid, highly heterozygous, and cross-pollinated species.
Affiliation(s)
- Harpreet Kaur
  - Department of Horticultural Science, University of Minnesota, St. Paul, MN, 55108, USA
- Laura M Shannon
  - Department of Horticultural Science, University of Minnesota, St. Paul, MN, 55108, USA
- Deborah A Samac
  - USDA-ARS, Plant Science Research Unit, St. Paul, MN, 55108, USA
5
Gkanogiannis A, Rahman H, Singh RK, Lopez-Lavalle AB. Chromosome-level genome assembly and functional annotation of Citrullus colocynthis: unlocking genetic resources for drought-resilient crop development. Planta 2024; 260:124. [PMID: 39443340 PMCID: PMC11499410 DOI: 10.1007/s00425-024-04551-7] [Received: 04/01/2024] [Accepted: 10/11/2024] [Indexed: 10/25/2024]
Abstract
MAIN CONCLUSION: The chromosome-level genome assembly of Citrullus colocynthis reveals its genetic potential for enhancing drought tolerance, paving the way for innovative crop improvement strategies.
This study presents the first comprehensive genome assembly and annotation of Citrullus colocynthis, a drought-tolerant wild close relative of cultivated watermelon, highlighting its potential for enhancing agricultural resilience to climate change. The study achieved a chromosome-level assembly using advanced sequencing technologies, including PacBio HiFi and Hi-C, revealing a genome size of approximately 366 Mb with low heterozygosity and substantial repetitive content. Our analysis identified 23,327 gene models that could encode stress-response mechanisms for the species' adaptation to arid environments. Comparative genomics with closely related species illuminated the evolutionary dynamics within the Cucurbitaceae family. In addition, resequencing of 27 accessions from the United Arab Emirates (UAE) identified genetic diversity, suggesting a foundation for future breeding programs. This genomic resource opens new avenues for the de novo domestication of C. colocynthis, offering a blueprint for developing crops with enhanced drought tolerance, disease resistance, and nutritional profiles, crucial for sustaining future food security in the face of escalating climate challenges.
Affiliation(s)
- Anestis Gkanogiannis
  - International Center for Biosaline Agriculture, ICBA, P.O. Box 14660, Dubai, United Arab Emirates
- Hifzur Rahman
  - International Center for Biosaline Agriculture, ICBA, P.O. Box 14660, Dubai, United Arab Emirates
- Rakesh Kumar Singh
  - International Center for Biosaline Agriculture, ICBA, P.O. Box 14660, Dubai, United Arab Emirates
6
Tong W, Wang Y, Li F, Zhai F, Su J, Wu D, Yi L, Gao Q, Wu Q, Xia E. Genomic variation of 363 diverse tea accessions unveils the genetic diversity, domestication, and structural variations associated with tea adaptation. Journal of Integrative Plant Biology 2024; 66:2175-2190. [PMID: 38990113 DOI: 10.1111/jipb.13737] [Received: 03/06/2024] [Accepted: 06/14/2024] [Indexed: 07/12/2024]
Abstract
Domestication has shaped the population structure and agronomic traits of tea plants, yet the complexity of tea population structure and the genetic variation that determines these traits remain unclear. Here, we investigated the resequencing data of 363 diverse tea accessions collected extensively from almost all tea distributions and found that the population structure of tea plants was divided into eight subgroups, which were largely consistent with their geographical distributions. The genetic diversity of tea plants in China decreased from southwest to east as latitude increased. Results also indicated that Camellia sinensis var. assamica (CSA) showed selection signatures divergent from those of Camellia sinensis var. sinensis (CSS). The domesticated genes of CSA were mainly involved in leaf development and in flavonoid and alkaloid biosynthesis, while the domesticated genes in CSS mainly participated in amino acid metabolism, aroma compound biosynthesis, and cold stress. Comparative population genomics further identified ~730 Mb of novel sequences, generating 6,058 full-length protein-encoding genes and significantly expanding the gene pool of tea plants. We also discovered 217,376 large-scale structural variations and 56,583 presence-absence variations (PAVs) across diverse tea accessions, some of which were associated with tea quality and stress resistance. Functional experiments demonstrated that two PAV genes (CSS0049975 and CSS0006599) were likely to drive trait diversification in cold tolerance between CSA and CSS tea plants. The overall findings not only revealed the genetic diversity and domestication of tea plants, but also underscored the vital role of structural variations in the diversification of tea plant traits.
Affiliation(s)
- Wei Tong
  - State Key Laboratory of Tea Plant Biology and Utilization, Anhui Agricultural University, Hefei, 230036, China
- Yanli Wang
  - State Key Laboratory of Tea Plant Biology and Utilization, Anhui Agricultural University, Hefei, 230036, China
- Fangdong Li
  - State Key Laboratory of Tea Plant Biology and Utilization, Anhui Agricultural University, Hefei, 230036, China
  - School of Information and Artificial Intelligence, Anhui Agricultural University, Hefei, 230036, China
- Fei Zhai
  - State Key Laboratory of Tea Plant Biology and Utilization, Anhui Agricultural University, Hefei, 230036, China
- Jingjing Su
  - State Key Laboratory of Tea Plant Biology and Utilization, Anhui Agricultural University, Hefei, 230036, China
- Didi Wu
  - State Key Laboratory of Tea Plant Biology and Utilization, Anhui Agricultural University, Hefei, 230036, China
- Lianghui Yi
  - State Key Laboratory of Tea Plant Biology and Utilization, Anhui Agricultural University, Hefei, 230036, China
- Qijuan Gao
  - State Key Laboratory of Tea Plant Biology and Utilization, Anhui Agricultural University, Hefei, 230036, China
  - School of Computer and Artificial Intelligence, Hefei Normal University, Hefei, 230061, China
- Qiong Wu
  - Tea Research Institute, Anhui Academy of Agricultural Sciences, Hefei, 230031, China
- Enhua Xia
  - State Key Laboratory of Tea Plant Biology and Utilization, Anhui Agricultural University, Hefei, 230036, China
7
Matthews CA, Watson-Haigh NS, Burton RA, Sheppard AE. A gentle introduction to pangenomics. Brief Bioinform 2024; 25:bbae588. [PMID: 39552065 PMCID: PMC11570541 DOI: 10.1093/bib/bbae588] [Received: 06/25/2024] [Revised: 09/12/2024] [Accepted: 11/01/2024] [Indexed: 11/19/2024] Open
Abstract
Pangenomes have emerged in response to limitations associated with traditional linear reference genomes. In contrast to a traditional reference that is (usually) assembled from a single individual, pangenomes aim to represent all of the genomic variation found in a group of organisms. The term 'pangenome' is currently used to describe multiple different types of genomic information, and limited language is available to differentiate between them. This is frustrating for researchers working in the field and confusing for researchers new to the field. Here, we provide an introduction to pangenomics relevant to both prokaryotic and eukaryotic organisms and propose a formalization of the language used to describe pangenomes (see the Glossary) to improve the specificity of discussion in the field.
Affiliation(s)
- Chelsea A Matthews
  - School of Agriculture, Food and Wine, Waite Campus, University of Adelaide, Urrbrae, South Australia 5064, Australia
- Nathan S Watson-Haigh
  - Australian Genome Research Facility, Victorian Comprehensive Cancer Centre, Melbourne, Victoria 3000, Australia
  - South Australian Genomics Centre, SAHMRI, North Terrace, Adelaide, South Australia 5000, Australia
  - Alkahest Inc., San Carlos, CA 94070, United States
- Rachel A Burton
  - School of Agriculture, Food and Wine, Waite Campus, University of Adelaide, Urrbrae, South Australia 5064, Australia
- Anna E Sheppard
  - School of Biological Sciences, University of Adelaide, Adelaide, South Australia 5005, Australia
8
Boschiero C, Neupane M, Yang L, Schroeder SG, Tuo W, Ma L, Baldwin RL, Van Tassell CP, Liu GE. A Pilot Detection and Associate Study of Gene Presence-Absence Variation in Holstein Cattle. Animals (Basel) 2024; 14:1921. [PMID: 38998033 PMCID: PMC11240624 DOI: 10.3390/ani14131921] [Received: 05/20/2024] [Revised: 06/18/2024] [Accepted: 06/26/2024] [Indexed: 07/14/2024] Open
Abstract
Presence-absence variations (PAVs) are important structural variations, wherein a genomic segment containing one or more genes is present in some individuals but absent in others. While PAVs have been extensively studied in plants, research in cattle remains limited. This study identified PAVs in 173 Holstein bulls using whole-genome sequencing data and assessed their associations with 46 economically important traits. Out of 28,772 cattle genes (from the longest transcripts), a total of 26,979 (93.77%) core genes were identified (present in all individuals), while variable genes included 928 softcore (present in 95-99% of individuals), 494 shell (present in 5-94%), and 371 cloud genes (present in <5%). Cloud genes were enriched in functions associated with hormonal and antimicrobial activities, while shell genes were enriched in immune functions. PAV-based genome-wide association studies identified associations between gene PAVs and 16 traits, including milk, fat, and protein yields, as well as traits related to health and reproduction. Associations were found on multiple chromosomes, notably on cattle chromosomes 7 and 15, involving olfactory receptor and immune-related genes, respectively. By examining PAVs at the population level, this research provides insights into the genetic structures underlying the complex traits of Holstein cattle.
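The core/softcore/shell/cloud scheme described in this abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' pipeline: the function name `classify_gene`, the toy gene names, and the exact handling of boundary values (for example, a gene present in more than 99% but not all individuals is treated as softcore) are hypothetical choices. The reported category counts are taken from the abstract and checked for internal consistency.

```python
def classify_gene(present_in: int, n_individuals: int) -> str:
    """Assign a PAV category from the number of individuals carrying a gene.

    Thresholds follow the abstract (core = all individuals, softcore = 95-99%,
    shell = 5-94%, cloud = <5%); boundary handling is an assumption.
    """
    if n_individuals <= 0 or not 0 <= present_in <= n_individuals:
        raise ValueError("present_in must lie between 0 and n_individuals")
    if present_in == n_individuals:
        return "core"
    fraction = present_in / n_individuals
    if fraction >= 0.95:
        return "softcore"
    if fraction >= 0.05:
        return "shell"
    return "cloud"


# Hypothetical presence counts for four toy genes across 173 individuals.
toy_counts = {"geneA": 173, "geneB": 170, "geneC": 90, "geneD": 3}
categories = {gene: classify_gene(n, 173) for gene, n in toy_counts.items()}

# Category sizes as reported in the abstract.
reported = {"core": 26979, "softcore": 928, "shell": 494, "cloud": 371}
total_genes = sum(reported.values())
core_percent = round(100 * reported["core"] / total_genes, 2)

print(categories)
print(total_genes, core_percent)
```

The four reported categories sum to the 28,772 genes stated in the abstract, and the core share rounds to the stated 93.77%.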
Affiliation(s)
- Clarissa Boschiero
  - Animal Genomics and Improvement Laboratory, BARC, Agricultural Research Service, USDA, Beltsville, MD 20705, USA
  - Department of Veterinary Medicine, University of Maryland, College Park, MD 20742, USA
- Mahesh Neupane
  - Animal Genomics and Improvement Laboratory, BARC, Agricultural Research Service, USDA, Beltsville, MD 20705, USA
- Liu Yang
  - Animal Genomics and Improvement Laboratory, BARC, Agricultural Research Service, USDA, Beltsville, MD 20705, USA
  - Department of Animal and Avian Sciences, University of Maryland, College Park, MD 20742, USA
- Steven G Schroeder
  - Animal Genomics and Improvement Laboratory, BARC, Agricultural Research Service, USDA, Beltsville, MD 20705, USA
- Wenbin Tuo
  - Animal Parasitic Diseases Laboratory, BARC, Agricultural Research Service, USDA, Beltsville, MD 20705, USA
- Li Ma
  - Department of Animal and Avian Sciences, University of Maryland, College Park, MD 20742, USA
- Ransom L Baldwin
  - Animal Genomics and Improvement Laboratory, BARC, Agricultural Research Service, USDA, Beltsville, MD 20705, USA
- Curtis P Van Tassell
  - Animal Genomics and Improvement Laboratory, BARC, Agricultural Research Service, USDA, Beltsville, MD 20705, USA
- George E Liu
  - Animal Genomics and Improvement Laboratory, BARC, Agricultural Research Service, USDA, Beltsville, MD 20705, USA
9
Lu Y, Liu D, Kong X, Song Y, Jing L. Pangenome characterization and analysis of the NAC gene family reveals genes for Sclerotinia sclerotiorum resistance in sunflower (Helianthus annuus). BMC Genom Data 2024; 25:39. [PMID: 38693490 PMCID: PMC11064331 DOI: 10.1186/s12863-024-01227-9] [Received: 01/07/2024] [Accepted: 04/22/2024] [Indexed: 05/03/2024] Open
Abstract
BACKGROUND: Sunflower (Helianthus annuus) is one of the most important economic crops in oilseed production worldwide. The different cultivars exhibit variability in their resistance genes. The NAC transcription factor (TF) family plays diverse roles in plant development and stress responses. With the completion of the H. annuus genome sequence, the entire complement of genes coding for NACs has been identified. However, the reference genome of a single individual cannot cover all the genetic information of the species.
RESULTS: Considering only a single reference genome to study gene families will miss many meaningful genes. A pangenome-wide survey and characterization of the NAC genes in sunflower species were conducted. In total, 139 HaNAC genes are identified, of which 114 are core and 25 are variable. Phylogenetic analysis of sunflower NAC proteins categorizes these proteins into 16 subgroups. 138 HaNACs are randomly distributed on 17 chromosomes. SNP-based haplotype analysis shows haplotype diversity of the HaNAC genes in wild accessions is richer than in landraces and modern cultivars. Ten HaNAC genes in the basal stalk rot (BSR) resistance quantitative trait loci (QTL) are found. A total of 26 HaNAC genes are differentially expressed in response to Sclerotinia head rot (SHR). A total of 137 HaNAC genes are annotated in Gene Ontology (GO) and are classified into 24 functional groups. GO functional enrichment analysis reveals that HaNAC genes are involved in various functions of the biological process.
CONCLUSIONS: We identified NAC genes in H. annuus (HaNAC) on a pangenome-wide scale and analyzed S. sclerotiorum resistance-related NACs. This study provided a theoretical basis for further genomic improvement targeting resistance-related NAC genes in sunflowers.
Affiliation(s)
- Yan Lu
  - College of Horticulture and Plant Protection, Inner Mongolia Agricultural University, Hohhot, China
- Dongqi Liu
  - College of Horticulture and Plant Protection, Inner Mongolia Agricultural University, Hohhot, China
- Xiangjiu Kong
  - College of Horticulture and Plant Protection, Inner Mongolia Agricultural University, Hohhot, China
- Yang Song
  - College of Horticulture and Plant Protection, Inner Mongolia Agricultural University, Hohhot, China
- Lan Jing
  - College of Horticulture and Plant Protection, Inner Mongolia Agricultural University, Hohhot, China
10
Du H, Diao C, Zhuo Y, Zheng X, Hu Z, Lu S, Jin W, Zhou L, Liu JF. Assembly of novel sequences for Chinese domestic pigs reveals new genes and regulatory variants providing new insights into their diversity. Genomics 2024; 116:110782. [PMID: 38176574 DOI: 10.1016/j.ygeno.2024.110782] [Received: 10/03/2023] [Revised: 12/27/2023] [Accepted: 01/01/2024] [Indexed: 01/06/2024]
Abstract
There is an increasing understanding that a reference genome representing a single individual cannot capture the entire gene repertoire of a species. Here, we conduct a population-scale detection of missing sequences in Chinese domestic pigs using whole-genome sequencing data from 534 individuals. We identify 132.41 Mb of sequences absent from the reference assembly, including eight novel genes. In particular, breeds distributed in Chinese high-altitude regions show significantly different frequencies of new sequences in promoters than other breeds. Furthermore, we dissect the role of non-coding variants and identify a novel sequence inserted in the 3'UTR of the FMO3 gene, which may be associated with the intramuscular fat phenotype. This novel sequence could be a candidate marker for meat quality. Our study provides a comprehensive overview of the missing sequences in Chinese domestic pigs and indicates that this dataset is a valuable resource for understanding the diversity and biology of pigs.
Affiliation(s)
- Heng Du
  - State Key Laboratory of Animal Biotech Breeding; College of Animal Science and Technology, China Agricultural University, Beijing 100193, China
- Chenguang Diao
  - State Key Laboratory of Animal Biotech Breeding; College of Animal Science and Technology, China Agricultural University, Beijing 100193, China
- Yue Zhuo
  - State Key Laboratory of Animal Biotech Breeding; College of Animal Science and Technology, China Agricultural University, Beijing 100193, China
- Xianrui Zheng
  - State Key Laboratory of Animal Biotech Breeding; College of Animal Science and Technology, China Agricultural University, Beijing 100193, China
- Zhengzheng Hu
  - State Key Laboratory of Animal Biotech Breeding; College of Animal Science and Technology, China Agricultural University, Beijing 100193, China
- Shiyu Lu
  - State Key Laboratory of Animal Biotech Breeding; College of Animal Science and Technology, China Agricultural University, Beijing 100193, China
- Wenjiao Jin
  - State Key Laboratory of Animal Biotech Breeding; College of Animal Science and Technology, China Agricultural University, Beijing 100193, China
- Lei Zhou
  - State Key Laboratory of Animal Biotech Breeding; College of Animal Science and Technology, China Agricultural University, Beijing 100193, China
- Jian-Feng Liu
  - State Key Laboratory of Animal Biotech Breeding; College of Animal Science and Technology, China Agricultural University, Beijing 100193, China
11
Mathur S, Singh D, Ranjan R. Recent advances in plant translational genomics for crop improvement. Advances in Protein Chemistry and Structural Biology 2024; 139:335-382. [PMID: 38448140 DOI: 10.1016/bs.apcsb.2023.11.009] [Indexed: 03/08/2024]
Abstract
The growing population, climate change, and limited agricultural resources put enormous pressure on agricultural systems. Crop yields are plateauing, and extreme weather events and urbanization threaten the livelihood of farmers. It is imperative that immediate attention is paid to addressing the increasing food demand, ensuring resilience against emerging threats, and meeting the demand for more nutritious, safer food. Under uncertain conditions, it is essential to expand genetic diversity and discover novel crop varieties or variations to achieve higher and more stable yields. Genomics plays a significant role in developing abundant and nutrient-dense food crops. As an alternative to the traditional breeding approach, translational genomics can improve breeding programs in a more efficient and precise manner by translating genomic concepts into practical tools. Crop breeding based on genomics offers potential solutions to overcome the limitations of conventional breeding methods, including improved crop varieties that provide more nutritional value and are protected from biotic and abiotic stresses. Genetic markers, such as SNPs and ESTs, contribute to the discovery of QTLs controlling agronomic traits and stress tolerance. To meet the growing demand for food, QTLs need to be incorporated into breeding programs using marker-assisted selection/breeding and transgenic technologies. This chapter primarily focuses on recent advances in translational genomics for crop improvement and on various omics techniques, including transcriptomics, metagenomics, pangenomics, and single-cell omics. Numerous genome-editing techniques, including CRISPR-Cas technology, and their applications in crop improvement are also discussed.
Affiliation(s)
- Shivangi Mathur
- Plant Molecular Biology Laboratory, Department of Botany, Faculty of Science, Dayalbagh Educational Institute, Agra, India
| | - Deeksha Singh
- Plant Molecular Biology Laboratory, Department of Botany, Faculty of Science, Dayalbagh Educational Institute, Agra, India
| | - Rajiv Ranjan
- Plant Molecular Biology Laboratory, Department of Botany, Faculty of Science, Dayalbagh Educational Institute, Agra, India.
|
12
|
Qi H, Yu F, Lü S, Damaris RN, Dong G, Yang P. Exploring domestication pattern in lotus: insights from dispensable genome assembly. Front Plant Sci 2023; 14:1294033. [PMID: 38034573 PMCID: PMC10687544 DOI: 10.3389/fpls.2023.1294033] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/14/2023] [Accepted: 11/01/2023] [Indexed: 12/02/2023]
Abstract
Lotus (Nelumbo nucifera Gaertn.), an important aquatic plant in horticulture and ecosystems, has been cultivated for more than 7000 years and domesticated into three different subgroups: flower lotus, rhizome lotus, and seed lotus. To explore the domesticated regions of each subgroup, re-sequencing data of 371 lotus accessions collected from the public database were aligned to the genome of 'China-Antique (CA)'. Unmapped reads were used to build the dispensable genome of each subgroup using a metagenome-like assembly strategy. More than 27 Mb of the dispensable genome in these three subgroups and the wild group was assembled, in which 11,761 genes were annotated. Some of the contigs in the dispensable genome were similar to genomic segments of lotus accessions other than 'CA'. The annotated genes in each subgroup played essential roles in specific developmental processes. Dissection of selective signals in the three cultivated subgroups also showed that subgroup-specific metabolic pathways, such as the enrichment of brassinosteroid metabolism in flower lotus (FL), were associated with the selected genes in each subgroup, and that the contigs of the dispensable genome were located near the domesticated regions of each subgroup. Our data provide a valuable resource for lotus genomic studies, complement the reference genome with additional information, and shed light on the selective signals of the domesticated subgroups.
Affiliation(s)
- Huanhuan Qi
- School of Life Science and Technology, Wuhan Polytechnic University, Wuhan, China
- State Key Laboratory of Biocatalysis and Enzyme Engineering, School of Life Sciences, Hubei University, Wuhan, China
| | - Feng Yu
- State Key Laboratory of Biocatalysis and Enzyme Engineering, School of Life Sciences, Hubei University, Wuhan, China
| | - Shiyou Lü
- State Key Laboratory of Biocatalysis and Enzyme Engineering, School of Life Sciences, Hubei University, Wuhan, China
| | | | - Guoqing Dong
- School of Life Science and Technology, Wuhan Polytechnic University, Wuhan, China
| | - Pingfang Yang
- State Key Laboratory of Biocatalysis and Enzyme Engineering, School of Life Sciences, Hubei University, Wuhan, China
|
13
|
Wang T, Duan S, Xu C, Wang Y, Zhang X, Xu X, Chen L, Han Z, Wu T. Pan-genome analysis of 13 Malus accessions reveals structural and sequence variations associated with fruit traits. Nat Commun 2023; 14:7377. [PMID: 37968318 PMCID: PMC10651928 DOI: 10.1038/s41467-023-43270-7] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 01/19/2023] [Accepted: 11/06/2023] [Indexed: 11/17/2023] Open
Abstract
Structural variations (SVs) and copy number variations (CNVs) contribute to trait variations in fleshy-fruited species. Here, we assemble 10 genomes of genetically diverse Malus accessions, including the ever-green cultivar 'Granny Smith' and the widely cultivated cultivar 'Red Fuji'. Combining with three previously reported genomes, we assemble the pan-genome of Malus species and identify 20,220 CNVs and 317,393 SVs. We also observe CNVs that are positively correlated with expression levels of the genes they are associated with. Furthermore, we show that the noncoding RNA generated from a 209 bp insertion in the intron of mitogen-activated protein kinase homology encoding gene, MMK2, regulates the gene expression and affects fruit coloration. Moreover, we identify overlapping SVs associated with fruit quality and biotic resistance. This pan-genome uncovers possible contributions of CNVs to gene expression and highlights the role of SVs in apple domestication and economically important traits.
Affiliation(s)
- Ting Wang
- College of Horticulture, China Agricultural University, Beijing, China
| | - Shiyao Duan
- Plant Science and Technology College, Beijing University of Agriculture, Beijing, China
| | - Chen Xu
- College of Horticulture, China Agricultural University, Beijing, China
| | - Yi Wang
- College of Horticulture, China Agricultural University, Beijing, China
| | - Xinzhong Zhang
- College of Horticulture, China Agricultural University, Beijing, China
| | - Xuefeng Xu
- College of Horticulture, China Agricultural University, Beijing, China
| | - Liyang Chen
- Smartgenomics Technology Institute, Tianjin, China
| | - Zhenhai Han
- College of Horticulture, China Agricultural University, Beijing, China.
| | - Ting Wu
- College of Horticulture, China Agricultural University, Beijing, China.
|
14
|
Ruperao P, Rangan P, Shah T, Thakur V, Kalia S, Mayes S, Rathore A. The Progression in Developing Genomic Resources for Crop Improvement. Life (Basel) 2023; 13:1668. [PMID: 37629524 PMCID: PMC10455509 DOI: 10.3390/life13081668] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2023] [Revised: 07/21/2023] [Accepted: 07/25/2023] [Indexed: 08/27/2023] Open
Abstract
Sequencing technologies have rapidly evolved over the past two decades, and new technologies are being continually developed and commercialized. The emerging sequencing technologies target generating more data with fewer inputs and at lower costs. This has also translated to an increase in the number and type of corresponding applications in genomics besides enhanced computational capacities (both hardware and software). Alongside the evolving DNA sequencing landscape, bioinformatics research teams have also evolved to accommodate the increasingly demanding techniques used to combine and interpret data, leading to many researchers moving from the lab to the computer. The rich history of DNA sequencing has paved the way for new insights and the development of new analysis methods. Understanding and learning from past technologies can help with the progress of future applications. This review focuses on the evolution of sequencing technologies, their significant enabling role in generating plant genome assemblies and downstream applications, and the parallel development of bioinformatics tools and skills, filling the gap in data analysis techniques.
Affiliation(s)
- Pradeep Ruperao
- Center of Excellence in Genomics and Systems Biology, International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Hyderabad 502324, India
| | - Parimalan Rangan
- ICAR-National Bureau of Plant Genetic Resources, PUSA Campus, New Delhi 110012, India;
| | - Trushar Shah
- International Institute of Tropical Agriculture (IITA), Nairobi 30709-00100, Kenya;
| | - Vivek Thakur
- Department of Systems & Computational Biology, School of Life Sciences, University of Hyderabad, Hyderabad 500046, India;
| | - Sanjay Kalia
- Department of Biotechnology, Ministry of Science and Technology, Government of India, New Delhi 110003, India;
| | - Sean Mayes
- Center of Excellence in Genomics and Systems Biology, International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Hyderabad 502324, India
| | - Abhishek Rathore
- Excellence in Breeding, International Maize and Wheat Improvement Center (CIMMYT), Hyderabad 502324, India
|
15
|
Glick L, Mayrose I. The Effect of Methodological Considerations on the Construction of Gene-Based Plant Pan-genomes. Genome Biol Evol 2023; 15:evad121. [PMID: 37401440 PMCID: PMC10340445 DOI: 10.1093/gbe/evad121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/29/2023] [Revised: 06/21/2023] [Accepted: 06/28/2023] [Indexed: 07/05/2023] Open
Abstract
Pan-genomics is an emerging approach for studying the genetic diversity within plant populations. In contrast to common resequencing studies that compare whole genome sequencing data with a single reference genome, the construction of a pan-genome (PG) involves the direct comparison of multiple genomes to one another, thereby enabling the detection of genomic sequences and genes not present in the reference, as well as the analysis of gene content diversity. Although multiple studies describing PGs of various plant species have been published in recent years, a better understanding regarding the effect of the computational procedures used for PG construction could guide researchers in making more informed methodological decisions. Here, we examine the effect of several key methodological factors on the obtained gene pool and on gene presence-absence detections by constructing and comparing multiple PGs of Arabidopsis thaliana and cultivated soybean, as well as conducting a meta-analysis on published PGs. These factors include the construction method, the sequencing depth, and the extent of input data used for gene annotation. We observe substantial differences between PGs constructed using three common procedures (de novo assembly and annotation, map-to-pan, and iterative assembly) and that results are dependent on the extent of the input data. Specifically, we report low agreement between the gene content inferred using different procedures and input data. Our results should increase the awareness of the community to the consequences of methodological decisions made during the process of PG construction and emphasize the need for further investigation of commonly applied methodologies.
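The comparison described in this abstract, gene presence-absence calls from pan-genomes built by different procedures, can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the function names `classify_genes` and `pav_agreement`, the dictionary layout (gene ID mapped to the set of accessions in which it was called present), and the premise that gene IDs are already matched between the two pan-genomes. It is not the authors' pipeline.

```python
from typing import Dict, List, Set


def classify_genes(pav: Dict[str, Set[str]], accessions: List[str]) -> Dict[str, str]:
    """Label each gene 'core' (present in every accession) or 'dispensable'."""
    all_accessions = set(accessions)
    return {
        gene: "core" if present >= all_accessions else "dispensable"
        for gene, present in pav.items()
    }


def pav_agreement(
    pav_a: Dict[str, Set[str]],
    pav_b: Dict[str, Set[str]],
    accessions: List[str],
) -> float:
    """Fraction of (gene, accession) calls that match between two pan-genomes.

    Only genes found in both pan-genomes are compared (hypothetical choice).
    """
    shared_genes = set(pav_a) & set(pav_b)
    total = len(shared_genes) * len(accessions)
    if total == 0:
        return float("nan")
    matching = sum(
        (acc in pav_a[gene]) == (acc in pav_b[gene])
        for gene in shared_genes
        for acc in accessions
    )
    return matching / total


# Toy data: two pan-genomes of the same three accessions.
accessions = ["A", "B", "C"]
pg_de_novo = {"g1": {"A", "B", "C"}, "g2": {"A"}}
pg_map_to_pan = {"g1": {"A", "B", "C"}, "g2": {"A", "B"}, "g3": {"C"}}

labels = classify_genes(pg_de_novo, accessions)
agreement = pav_agreement(pg_de_novo, pg_map_to_pan, accessions)
```

In this toy case the two pan-genomes disagree on one call (gene `g2` in accession `B`), and `g3` is excluded from the agreement score because it appears in only one of them. A real comparison would first need an orthology or sequence-similarity step to match genes across pan-genomes.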
Affiliation(s)
- Lior Glick
- Department of Life Sciences, School of Plant Sciences and Food Security, Tel-Aviv University, Tel Aviv, Israel
| | - Itay Mayrose
- Department of Life Sciences, School of Plant Sciences and Food Security, Tel-Aviv University, Tel Aviv, Israel
|
16
|
Jaegle B, Pisupati R, Soto-Jiménez LM, Burns R, Rabanal FA, Nordborg M. Extensive sequence duplication in Arabidopsis revealed by pseudo-heterozygosity. Genome Biol 2023; 24:44. [PMID: 36895055 PMCID: PMC9999624 DOI: 10.1186/s13059-023-02875-3] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Received: 11/23/2021] [Accepted: 02/13/2023] [Indexed: 03/11/2023] Open
Abstract
BACKGROUND: It is apparent that genomes harbor much structural variation that is largely undetected for technical reasons. Such variation can cause artifacts when short-read sequencing data are mapped to a reference genome. Spurious SNPs may result from mapping of reads to unrecognized duplicated regions. Calling SNPs using the raw reads of the 1001 Arabidopsis Genomes Project, we identified 3.3 million (44%) heterozygous SNPs. Given that Arabidopsis thaliana (A. thaliana) is highly selfing, and that extensively heterozygous individuals have been removed, we hypothesize that these SNPs reflect cryptic copy number variation. RESULTS: The heterozygosity we observe consists of particular SNPs being heterozygous across individuals in a manner that strongly suggests it reflects shared segregating duplications rather than random tracts of residual heterozygosity due to occasional outcrossing. Focusing on such pseudo-heterozygosity in annotated genes, we use genome-wide association to map the position of the duplicates. We identify 2500 putatively duplicated genes and validate them using de novo genome assemblies from six lines. Specific examples include an annotated gene and a nearby transposon that transpose together. We also demonstrate that cryptic structural variation produces highly inaccurate estimates of DNA methylation polymorphism. CONCLUSIONS: Our study confirms that most heterozygous SNP calls in A. thaliana are artifacts and suggests that great caution is needed when analyzing SNP data from short-read sequencing. The finding that 10% of annotated genes exhibit copy-number variation, and the realization that neither gene- nor transposon-annotation necessarily tells us what is actually mobile in the genome, suggest that future analyses based on independently assembled genomes will be very informative.
Affiliation(s)
- Benjamin Jaegle
- Gregor Mendel Institute, Austrian Academy of Sciences, Vienna Biocenter, Vienna, Austria
| | - Rahul Pisupati
- Gregor Mendel Institute, Austrian Academy of Sciences, Vienna Biocenter, Vienna, Austria
| | | | - Robin Burns
- Gregor Mendel Institute, Austrian Academy of Sciences, Vienna Biocenter, Vienna, Austria
- Department of Plant Sciences, University of Cambridge, Cambridge, UK
| | | | - Magnus Nordborg
- Gregor Mendel Institute, Austrian Academy of Sciences, Vienna Biocenter, Vienna, Austria.
|
17
|
Li Y, Ma B, Hua K, Gong H, He R, Luo R, Bi D, Zhou R, Langford PR, Jin H. PPNet: Identifying Functional Association Networks by Phylogenetic Profiling of Prokaryotic Genomes. Microbiol Spectr 2023; 11:e0387122. [PMID: 36602356 PMCID: PMC9927313 DOI: 10.1128/spectrum.03871-22] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/21/2022] [Accepted: 12/01/2022] [Indexed: 01/06/2023] Open
Abstract
Identification of microbial functional association networks allows interpretation of biological phenomena and a greater understanding of the molecular basis of pathogenicity and also underpins the formulation of control measures. Here, we describe PPNet, a tool that uses genome information and analysis of phylogenetic profiles with binary similarity and distance measures to derive large-scale bacterial gene association networks of a single species. As an exemplar, we have derived a functional association network in the pig pathogen Streptococcus suis using 81 binary similarity and dissimilarity measures, which demonstrates excellent performance based on the area under the receiver operating characteristic (AUROC), the area under the precision-recall (AUPR), and a derived overall scoring method. Selected network associations were validated experimentally by using bacterial two-hybrid experiments. We conclude that PPNet, which is publicly available (https://github.com/liyangjie/PPNet), can be used to construct microbial association networks from easily acquired genome-scale data. IMPORTANCE This study developed PPNet, the first tool that can be used to infer large-scale bacterial functional association networks of a single species. PPNet includes a method for assigning the uniqueness of a bacterial strain using the average nucleotide identity and the average nucleotide coverage. PPNet collected 81 binary similarity and distance measures for phylogenetic profiling and then evaluated and divided them into four groups. PPNet can effectively capture gene networks that are functionally related to phenotype from publicly available prokaryotic genomes, as well as provide valuable results for downstream analysis and experimental testing.
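The general idea of phylogenetic profiling with a binary similarity measure can be sketched in a few lines of Python. This is an illustrative sketch only and does not reproduce PPNet's code or its interface: the Jaccard index is used here as one well-known binary similarity measure, and the function names, the profile layout (gene mapped to a 0/1 vector across strains), and the threshold are all hypothetical.

```python
from itertools import combinations
from typing import Dict, List, Sequence, Tuple


def jaccard(p: Sequence[int], q: Sequence[int]) -> float:
    """Jaccard similarity of two binary presence/absence profiles."""
    both = sum(1 for x, y in zip(p, q) if x and y)
    either = sum(1 for x, y in zip(p, q) if x or y)
    return both / either if either else 0.0


def association_edges(
    profiles: Dict[str, Sequence[int]], threshold: float
) -> List[Tuple[str, str, float]]:
    """Return gene pairs whose profile similarity meets the threshold."""
    edges = []
    for g1, g2 in combinations(sorted(profiles), 2):
        score = jaccard(profiles[g1], profiles[g2])
        if score >= threshold:
            edges.append((g1, g2, score))
    return edges


# Toy profiles: presence (1) or absence (0) of each gene in four strains.
profiles = {
    "geneA": [1, 1, 0, 1],
    "geneB": [1, 1, 0, 0],
    "geneC": [0, 0, 1, 0],
}
edges = association_edges(profiles, threshold=0.5)
```

Here `geneA` and `geneB` co-occur in two of the three strains that carry either gene, so they are linked, while `geneC` never co-occurs with the others and stays unconnected. Genes with similar profiles across strains are candidates for functional association, which is the premise the abstract describes.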
Affiliation(s)
- Yangjie Li
- State Key Laboratory of Agricultural Microbiology, Huazhong Agricultural University, Wuhan, China
- College of Animal Medicine, Huazhong Agricultural University, Wuhan, China
- Hubei Provincial Key Laboratory of Preventive Veterinary Medicine, Huazhong Agricultural University, Wuhan, China
- College of Informatics, Huazhong Agricultural University, Wuhan, China
| | - Bin Ma
- State Key Laboratory of Agricultural Microbiology, Huazhong Agricultural University, Wuhan, China
- College of Animal Medicine, Huazhong Agricultural University, Wuhan, China
- Hubei Provincial Key Laboratory of Preventive Veterinary Medicine, Huazhong Agricultural University, Wuhan, China
| | - Kexin Hua
- State Key Laboratory of Agricultural Microbiology, Huazhong Agricultural University, Wuhan, China
- College of Animal Medicine, Huazhong Agricultural University, Wuhan, China
- Hubei Provincial Key Laboratory of Preventive Veterinary Medicine, Huazhong Agricultural University, Wuhan, China
| | - Huimin Gong
- State Key Laboratory of Agricultural Microbiology, Huazhong Agricultural University, Wuhan, China
- College of Animal Medicine, Huazhong Agricultural University, Wuhan, China
- Hubei Provincial Key Laboratory of Preventive Veterinary Medicine, Huazhong Agricultural University, Wuhan, China
| | - Rongrong He
- State Key Laboratory of Agricultural Microbiology, Huazhong Agricultural University, Wuhan, China
- College of Animal Medicine, Huazhong Agricultural University, Wuhan, China
- Hubei Provincial Key Laboratory of Preventive Veterinary Medicine, Huazhong Agricultural University, Wuhan, China
| | - Rui Luo
- State Key Laboratory of Agricultural Microbiology, Huazhong Agricultural University, Wuhan, China
- College of Animal Medicine, Huazhong Agricultural University, Wuhan, China
- Hubei Provincial Key Laboratory of Preventive Veterinary Medicine, Huazhong Agricultural University, Wuhan, China
| | - Dingren Bi
- State Key Laboratory of Agricultural Microbiology, Huazhong Agricultural University, Wuhan, China
- College of Animal Medicine, Huazhong Agricultural University, Wuhan, China
- Hubei Provincial Key Laboratory of Preventive Veterinary Medicine, Huazhong Agricultural University, Wuhan, China
| | - Rui Zhou
- State Key Laboratory of Agricultural Microbiology, Huazhong Agricultural University, Wuhan, China
- College of Animal Medicine, Huazhong Agricultural University, Wuhan, China
- Hubei Provincial Key Laboratory of Preventive Veterinary Medicine, Huazhong Agricultural University, Wuhan, China
| | - Paul R. Langford
- Section of Paediatric Infectious Disease, Imperial College London, St Mary’s Campus, London, United Kingdom
| | - Hui Jin
- State Key Laboratory of Agricultural Microbiology, Huazhong Agricultural University, Wuhan, China
- College of Animal Medicine, Huazhong Agricultural University, Wuhan, China
- Hubei Provincial Key Laboratory of Preventive Veterinary Medicine, Huazhong Agricultural University, Wuhan, China
|
18
|
Bayer PE, Edwards D. Investigating Pangenome Graphs Using Wheat Panache. Methods Mol Biol 2023; 2703:23-29. [PMID: 37646934 DOI: 10.1007/978-1-0716-3389-2_2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/01/2023]
Abstract
Pangenome graphs are quickly becoming the central data structure representing the diversity of variation we see across related genomes. Pangenome graphs have been published for some species, including plants of agronomic interest. However, visualizing these graphs is not easy, as the graphs are large and the variants within them are complex. Tools are needed to visualize graph data structures. Here, we present a workflow to search and visualize a wheat pangenome graph using Wheat Panache. The approach presented assists researchers interested in wheat genomics.
Affiliation(s)
- Philipp E Bayer
- Centre for Applied Bioinformatics and School of Biological Sciences, The University of Western Australia, Perth, WA, Australia
| | - David Edwards
- Centre for Applied Bioinformatics and School of Biological Sciences, The University of Western Australia, Perth, WA, Australia.
|
19
|
Analyzing integrated network of methylation and gene expression profiles in lung squamous cell carcinoma. Sci Rep 2022; 12:15799. [PMID: 36138066 PMCID: PMC9500023 DOI: 10.1038/s41598-022-20232-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/17/2022] [Accepted: 09/09/2022] [Indexed: 11/24/2022] Open
Abstract
Gene expression, DNA methylation, and their organizational relationships are commonly altered in lung squamous cell carcinoma (LUSC). To elucidate these complex interactions, we reconstructed a differentially expressed gene network and a differentially methylated cytosine (DMC) network by partial information decomposition and an inverse correlation algorithm, respectively. Then, we performed graph union to integrate the networks. Community detection and enrichment analysis of the integrated network revealed close interactions between the cell cycle, keratinization, immune system, and xenobiotic metabolism gene sets in LUSC. DMC analysis showed that hypomethylation targeted the gene sets responsible for cell cycle, keratinization, and NRF2 pathways. On the other hand, hypermethylated genes affected circulatory system development, the immune system, extracellular matrix organization, and cilium organization. By centrality measurement, we identified NCAPG2, PSMG3, and FADD as hub genes that were highly connected to other nodes and might play important roles in LUSC gene dysregulation. We also found that the genes with high betweenness centrality are more likely to affect patients’ survival than those with low betweenness centrality. These results showed that the integrated network analysis enabled us to obtain a global view of the interactions and regulations in LUSC.
|
20
|
Deep polygenic neural network for predicting and identifying yield-associated genes in Indonesian rice accessions. Sci Rep 2022; 12:13823. [PMID: 35970979 PMCID: PMC9378700 DOI: 10.1038/s41598-022-16075-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/17/2021] [Accepted: 07/04/2022] [Indexed: 11/12/2022] Open
Abstract
As the fourth most populous country in the world, Indonesia must increase the annual rice production rate to achieve national food security by 2050. One possible solution comes from the nanoscopic level: a genetic variant called Single Nucleotide Polymorphism (SNP), which can express significant yield-associated genes. The prior benchmark of this study utilized a statistical genetics model where no SNP position information and attention mechanism were involved. Hence, we developed a novel deep polygenic neural network, named the NucleoNet model, to address these obstacles. The NucleoNets were constructed with a combination of prominent components that include positional SNP encoding, the context vector, wide models, Elastic Net, and Shannon's entropy loss. This polygenic modeling obtained a Mean Squared Error (MSE) of up to 2.779 with a Symmetric Mean Absolute Percentage Error (SMAPE) of 47.156%, while revealing 15 new important SNPs. Furthermore, the NucleoNets reduced the MSE score by up to 32.28% compared to the Ordinary Least Squares (OLS) model. Through the ablation study, we learned that the combination of Xavier distribution for weights initialization and Normal distribution for biases initialization yielded a more diverse set of important SNPs across the 12 chromosomes. Our findings confirmed that the NucleoNet model successfully outperformed the OLS model and identified SNPs important to Indonesian rice yields.
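The two error metrics named in this abstract can be written out as a short Python sketch. Several SMAPE variants are in use, and the abstract does not say which one the authors applied, so the form below (absolute error divided by the mean of the absolute true and predicted values, averaged and scaled to percent) is an assumption. The function names and toy values are likewise hypothetical.

```python
from typing import Sequence


def mse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean squared error between observed and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def smape(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Symmetric mean absolute percentage error, in percent (0 to 200).

    Assumed variant: |t - p| / ((|t| + |p|) / 2), with 0/0 treated as 0.
    """
    terms = []
    for t, p in zip(y_true, y_pred):
        denom = (abs(t) + abs(p)) / 2
        terms.append(0.0 if denom == 0 else abs(t - p) / denom)
    return 100 * sum(terms) / len(terms)


def relative_reduction(baseline: float, model: float) -> float:
    """Percentage by which `model` error is lower than `baseline` error."""
    return 100 * (baseline - model) / baseline


# Toy yields (arbitrary units), not data from the study.
y_true = [2.0, 4.0]
y_pred = [2.0, 2.0]
toy_mse = mse(y_true, y_pred)
toy_smape = smape(y_true, y_pred)
toy_reduction = relative_reduction(baseline=4.0, model=3.0)
```

The `relative_reduction` helper shows how a statement such as "reduced the MSE by up to 32.28% compared to OLS" would typically be computed, as the drop in error expressed relative to the baseline model's error.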
|
21
|
Nanni AV, Morse AM, Newman JRB, Choquette NE, Wedow JM, Liu Z, Leakey ADB, Conesa A, Ainsworth EA, McIntyre LM. Variation in leaf transcriptome responses to elevated ozone corresponds with physiological sensitivity to ozone across maize inbred lines. Genetics 2022; 221:iyac080. [PMID: 35579358 PMCID: PMC9339315 DOI: 10.1093/genetics/iyac080] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/30/2022] [Accepted: 04/27/2022] [Indexed: 11/13/2022] Open
Abstract
We examine the impact of sustained elevated ozone concentration on the leaf transcriptome of 5 diverse maize inbred genotypes, which vary in physiological sensitivity to ozone (B73, Mo17, Hp301, C123, and NC338), using long reads to assemble transcripts and short reads to quantify expression of these transcripts. More than 99% of the long reads, 99% of the assembled transcripts, and 97% of the short reads map to both B73 and Mo17 reference genomes. Approximately 95% of the genes with assembled transcripts belong to known B73-Mo17 syntenic loci and 94% of genes with assembled transcripts are present in all temperate lines in the nested association mapping pan-genome. While there is limited evidence for alternative splicing in response to ozone stress, there is a difference in the magnitude of differential expression among the 5 genotypes. The transcriptional response to sustained ozone stress in the ozone-resistant B73 genotype (151 genes) was modest, while more than 3,300 genes were significantly differentially expressed in the more sensitive NC338 genotype. There is the potential for tandem duplication in 30% of genes with assembled transcripts, but there is no obvious association between potential tandem duplication and differential expression. Genes with a common response across the 5 genotypes (83 genes) were associated with photosynthesis, in particular photosystem I. The functional annotation of genes not differentially expressed in B73 but responsive in the other 4 genotypes (789 genes) identifies reactive oxygen species. This suggests that B73 has a different response to long-term ozone exposure than the other 4 genotypes. The relative magnitude of the genotypic response to ozone and the enrichment analyses are consistent regardless of whether short reads are aligned to the long-read assembled transcripts, the B73 reference, or the Mo17 reference. We find that prolonged ozone exposure directly impacts the photosynthetic machinery of the leaf.
Affiliation(s)
- Adalena V Nanni
- Department of Molecular Genetics and Microbiology, University of Florida, Gainesville, FL 32611, USA
- Genetics Institute, University of Florida, Gainesville, FL 32611, USA
| | - Alison M Morse
- Department of Molecular Genetics and Microbiology, University of Florida, Gainesville, FL 32611, USA
- Genetics Institute, University of Florida, Gainesville, FL 32611, USA
| | - Jeremy R B Newman
- Genetics Institute, University of Florida, Gainesville, FL 32611, USA
- Department of Pathology, University of Florida, Gainesville, FL 32611, USA
| | - Nicole E Choquette
- Department of Plant Biology, Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
- Department of Crop Sciences, Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
| | - Jessica M Wedow
- Department of Plant Biology, Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
- Department of Crop Sciences, Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
| | - Zihao Liu
- Department of Molecular Genetics and Microbiology, University of Florida, Gainesville, FL 32611, USA
- Genetics Institute, University of Florida, Gainesville, FL 32611, USA
| | - Andrew D B Leakey
- Department of Plant Biology, Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
- Department of Crop Sciences, Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
| | - Ana Conesa
- Department of Cell and Microbial Sciences, University of Florida, Gainesville, FL 32611, USA
- Institute for Integrative Systems Biology, Spanish National Research Council, 46980 Paterna, Spain
| | - Elizabeth A Ainsworth
- Department of Plant Biology, Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
- Department of Crop Sciences, Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
- USDA ARS Global Change and Photosynthesis Research Unit, Urbana, IL 61801, USA
| | - Lauren M McIntyre
- Department of Molecular Genetics and Microbiology, University of Florida, Gainesville, FL 32611, USA
- Genetics Institute, University of Florida, Gainesville, FL 32611, USA
|
22
|
Petereit J, Bayer PE, Thomas WJW, Tay Fernandez CG, Amas J, Zhang Y, Batley J, Edwards D. Pangenomics and Crop Genome Adaptation in a Changing Climate. Plants (Basel) 2022; 11:1949. [PMID: 35956427 PMCID: PMC9370458 DOI: 10.3390/plants11151949] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 06/30/2022] [Revised: 07/18/2022] [Accepted: 07/19/2022] [Indexed: 12/15/2022]
Abstract
During crop domestication and breeding, wild plant species have been shaped into modern high-yield crops and adapted to the main agro-ecological regions. However, climate change will impact crop productivity in these regions, and agriculture needs to adapt to support future food production. On a global scale, crop wild relatives grow in more diverse environments than crop species, and so may host genes that could support the adaptation of crops to new and variable environments. Through identification of individuals with increased climate resilience we may gain a greater understanding of the genomic basis for this resilience and transfer this to crops. Pangenome analysis can help to identify the genes underlying stress responses in individuals harbouring untapped genomic diversity in crop wild relatives. The information gained from the analysis of these pangenomes can then be applied towards breeding climate resilience into existing crops or to re-domesticating crops, combining environmental adaptation traits with crop productivity.
Affiliation(s)
| | | | | | | | | | | | | | - David Edwards
- School of Biological Sciences, The University of Western Australia, Perth 6009, Australia; (J.P.); (P.E.B.); (W.J.W.T.); (C.G.T.F.); (J.A.); (Y.Z.); (J.B.)
|
23
|
|
24
|
Zanini SF, Bayer PE, Wells R, Snowdon RJ, Batley J, Varshney RK, Nguyen HT, Edwards D, Golicz AA. Pangenomics in crop improvement-from coding structural variations to finding regulatory variants with pangenome graphs. Plant Genome 2022; 15:e20177. [PMID: 34904403 DOI: 10.1002/tpg2.20177] [Citation(s) in RCA: 25] [Impact Index Per Article: 8.3] [Received: 05/14/2021] [Accepted: 10/07/2021] [Indexed: 05/15/2023]
Abstract
Since the first reported crop pangenome in 2014, advances in high-throughput and cost-effective DNA sequencing technologies facilitated multiple such studies including the pangenomes of oilseed rape (Brassica napus L.), soybean [Glycine max (L.) Merr.], rice (Oryza sativa L.), wheat (Triticum aestivum L.), and barley (Hordeum vulgare L.). Compared with single-reference genomes, pangenomes provide a more accurate representation of the genetic variation present in a species. By combining the genomic data of multiple accessions, pangenomes allow for the detection and annotation of complex DNA polymorphisms such as structural variations (SVs), one of the major determinants of genetic diversity within a species. In this review we summarize the current literature on crop pangenomics, focusing on their application to find candidate SVs involved in traits of agronomic interest. We then highlight the potential of pangenomes in the discovery and functional characterization of noncoding regulatory sequences and their variations. We conclude with a summary and outlook on innovative data structures representing the complete content of plant pangenomes including annotations of coding and noncoding elements and outcomes of transcriptomic and epigenomic experiments.
Affiliation(s)
- Silvia F Zanini
  - Dep. of Plant Breeding, IFZ Research Centre for Biosystems, Land Use and Nutrition, Justus Liebig Univ. Giessen, Giessen, 35392, Germany
- Philipp E Bayer
  - School of Biological Sciences and Institute of Agriculture, Univ. of Western Australia, Perth, Western Australia, Australia
- Rachel Wells
  - Dep. of Crop Genetics, John Innes Centre, Norwich Research Park, Norwich, NR4 7UH, UK
- Rod J Snowdon
  - Dep. of Plant Breeding, IFZ Research Centre for Biosystems, Land Use and Nutrition, Justus Liebig Univ. Giessen, Giessen, 35392, Germany
- Jacqueline Batley
  - School of Biological Sciences and Institute of Agriculture, Univ. of Western Australia, Perth, Western Australia, Australia
- Rajeev K Varshney
  - Center of Excellence in Genomics & Systems Biology, International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Patancheru, India
  - State Agricultural Biotechnology Centre, Centre for Crop Food Innovation, Food Futures Institute, Murdoch Univ., Murdoch, WA, Australia
- Henry T Nguyen
  - Division of Plant Sciences, Univ. of Missouri, Columbia, MO, USA
- David Edwards
  - School of Biological Sciences and Institute of Agriculture, Univ. of Western Australia, Perth, Western Australia, Australia
- Agnieszka A Golicz
  - Dep. of Plant Breeding, IFZ Research Centre for Biosystems, Land Use and Nutrition, Justus Liebig Univ. Giessen, Giessen, 35392, Germany
|
25
|
Tay Fernandez CG, Nestor BJ, Danilevicz MF, Marsh JI, Petereit J, Bayer PE, Batley J, Edwards D. Expanding Gene-Editing Potential in Crop Improvement with Pangenomes. Int J Mol Sci 2022; 23:2276. [PMID: 35216392 PMCID: PMC8879065 DOI: 10.3390/ijms23042276] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 01/17/2022] [Revised: 02/14/2022] [Accepted: 02/15/2022] [Indexed: 02/01/2023] Open
Abstract
Pangenomes aim to represent the complete repertoire of the genome diversity present within a species or cohort of species, capturing the genomic structural variance between individuals. This genomic information coupled with phenotypic data can be applied to identify genes and alleles involved with abiotic stress tolerance, disease resistance, and other desirable traits. The characterisation of novel structural variants from pangenomes can support genome editing approaches such as Clustered Regularly Interspaced Short Palindromic Repeats and CRISPR associated protein Cas (CRISPR-Cas), providing functional information on gene sequences and new target sites in variant-specific genes with increased efficiency. This review discusses the application of pangenomes in genome editing and crop improvement, focusing on the potential of pangenomes to accurately identify target genes for CRISPR-Cas editing of plant genomes while avoiding adverse off-target effects. We consider the limitations of applying CRISPR-Cas editing with pangenome references and potential solutions to overcome these limitations.
|
26
|
Graph-based pan-genome reveals structural and sequence variations related to agronomic traits and domestication in cucumber. Nat Commun 2022; 13:682. [PMID: 35115520 PMCID: PMC8813957 DOI: 10.1038/s41467-022-28362-0] [Citation(s) in RCA: 74] [Impact Index Per Article: 24.7] [Received: 09/14/2021] [Accepted: 01/19/2022] [Indexed: 12/21/2022] Open
Abstract
Structural variants (SVs) represent a major source of genetic diversity and are related to numerous agronomic traits and evolutionary events; however, their comprehensive identification and characterization in cucumber (Cucumis sativus L.) have been hindered by the lack of a high-quality pan-genome. Here, we report a graph-based cucumber pan-genome by analyzing twelve chromosome-scale genome assemblies. Genotyping of seven large chromosomal rearrangements based on the pan-genome provides useful information for the use of wild accessions in breeding and genetic studies. A total of ~4.3 million genetic variants including 56,214 SVs are identified leveraging the chromosome-level assemblies. The pan-genome graph integrating both variant information and reference genome sequences aids the identification of SVs associated with agronomic traits, including warty fruits, flowering times and root growth, and enhances the understanding of cucumber trait evolution. The graph-based cucumber pan-genome and the identified genetic variants provide rich resources for future biological research and genomics-assisted breeding. A growing number of studies suggest that a single reference genome is insufficient to capture all variations in the genome. Here, the authors report a graph-based cucumber pan-genome by analyzing 12 chromosome-scale assemblies and reveal variations associated with agronomic traits and domestication.
|
27
|
Functional characterization of powdery mildew resistance gene MlIW172, a new Pm60 allele and its allelic variation in wild emmer wheat. J Genet Genomics 2022; 49:787-795. [DOI: 10.1016/j.jgg.2022.01.010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/27/2021] [Revised: 01/26/2022] [Accepted: 01/29/2022] [Indexed: 11/19/2022]
|
28
|
Pronozin AY, Bragina MK, Salina EA. Crop pangenomes. Vavilovskii Zhurnal Genet Selektsii 2021; 25:57-63. [PMID: 34901703 PMCID: PMC8629360 DOI: 10.18699/vj21.007] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 10/04/2020] [Revised: 12/27/2020] [Accepted: 01/03/2021] [Indexed: 11/19/2022] Open
Abstract
Progress in genome sequencing, assembly and analysis allows for a deeper study of agricultural plants' chromosome structures, gene identification and annotation. The published genomes of agricultural plants proved to be a valuable tool for studying gene functions and for marker-assisted and genomic selection. However, large structural genome changes, including gene copy number variations (CNVs) and gene presence/absence variations (PAVs), prevail in crops. These genomic variations play an important role in the functional set of genes and the gene composition in individuals of the same species and provide the genetic determination of the agronomically important properties of crops. The high degree of genomic variation observed indicates that single reference genomes do not represent the diversity within a species, leading to the pangenome concept. The pangenome represents information about all genes in a taxon: those that are common to all taxon members and those that are variable and are partially or completely specific for particular individuals. Pangenome sequencing and analysis technologies provide a large-scale study of genomic variation and resources for evolutionary research, functional genomics and crop breeding. This review provides an analysis of agricultural plants' pangenome studies. Pangenome structural features, methods and programs for bioinformatic analysis of pangenomic data are described.
Affiliation(s)
- A Yu Pronozin
  - Institute of Cytology and Genetics of Siberian Branch of the Russian Academy of Sciences, Novosibirsk, Russia
- M K Bragina
  - Institute of Cytology and Genetics of Siberian Branch of the Russian Academy of Sciences, Novosibirsk, Russia
  - Kurchatov Genomic Center of the Institute of Cytology and Genetics of Siberian Branch of the Russian Academy of Sciences, Novosibirsk, Russia
- E A Salina
  - Institute of Cytology and Genetics of Siberian Branch of the Russian Academy of Sciences, Novosibirsk, Russia
  - Kurchatov Genomic Center of the Institute of Cytology and Genetics of Siberian Branch of the Russian Academy of Sciences, Novosibirsk, Russia
|
29
|
Bayer PE, Petereit J, Danilevicz MF, Anderson R, Batley J, Edwards D. The application of pangenomics and machine learning in genomic selection in plants. The Plant Genome 2021; 14:e20112. [PMID: 34288550 DOI: 10.1002/tpg2.20112] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Received: 01/14/2021] [Accepted: 05/01/2021] [Indexed: 05/10/2023]
Abstract
Genomic selection approaches have increased the speed of plant breeding, leading to growing crop yields over the last decade. However, climate change is impacting current and future yields, resulting in the need to further accelerate breeding efforts to cope with these changing conditions. Here we present approaches to accelerate plant breeding and incorporate nonadditive effects in genomic selection by applying state-of-the-art machine learning approaches. These approaches are made more powerful by the inclusion of pangenomes, which represent the entire genome content of a species. Understanding the strengths and limitations of machine learning methods, compared with more traditional genomic selection efforts, is paramount to the successful application of these methods in crop breeding. We describe examples of genomic selection and pangenome-based approaches in crop breeding, discuss machine learning-specific challenges, and highlight the potential for the application of machine learning in genomic selection. We believe that careful implementation of machine learning approaches will support crop improvement to help counter the adverse outcomes of climate change on crop production.
Affiliation(s)
- Philipp E Bayer
  - School of Biological Sciences and Institute of Agriculture, University of Western Australia, Perth, WA, Australia
- Jakob Petereit
  - School of Biological Sciences and Institute of Agriculture, University of Western Australia, Perth, WA, Australia
- Monica Furaste Danilevicz
  - School of Biological Sciences and Institute of Agriculture, University of Western Australia, Perth, WA, Australia
- Robyn Anderson
  - School of Biological Sciences and Institute of Agriculture, University of Western Australia, Perth, WA, Australia
- Jacqueline Batley
  - School of Biological Sciences and Institute of Agriculture, University of Western Australia, Perth, WA, Australia
- David Edwards
  - School of Biological Sciences and Institute of Agriculture, University of Western Australia, Perth, WA, Australia
|
30
|
Wang K, Hu H, Tian Y, Li J, Scheben A, Zhang C, Li Y, Wu J, Yang L, Fan X, Sun G, Li D, Zhang Y, Han R, Jiang R, Huang H, Yan F, Wang Y, Li Z, Li G, Liu X, Li W, Edwards D, Kang X. The chicken pan-genome reveals gene content variation and a promoter region deletion in IGF2BP1 affecting body size. Mol Biol Evol 2021; 38:5066-5081. [PMID: 34329477 PMCID: PMC8557422 DOI: 10.1093/molbev/msab231] [Citation(s) in RCA: 89] [Impact Index Per Article: 22.3] [Indexed: 11/14/2022] Open
Abstract
Domestication and breeding have reshaped the genomic architecture of chicken, but the retention and loss of genomic elements during these evolutionary processes remain unclear. We present the first chicken pan-genome constructed using 664 individuals, which identified an additional ∼66.5 Mb of sequence absent from the reference genome (GRCg6a). The constructed pan-genome encoded 20,491 predicted protein-coding genes, with higher expression levels observed in conserved genes than in dispensable genes. Presence/absence variation (PAV) analyses demonstrated that gene PAV in chicken was shaped by selection, genetic drift, and hybridization. PAV-based GWAS identified numerous candidate mutations related to growth, carcass composition, meat quality, or physiological traits. Among them, a deletion in the promoter region of IGF2BP1 affecting chicken body size is reported, which is supported by functional studies and extra samples. This is the first report of the causal variant underlying the repeatedly reported chicken body size QTL on chromosome 27. Therefore, the chicken pan-genome is a useful resource for biological discovery and breeding. It improves our understanding of chicken genome diversity and provides materials to unveil the evolutionary history of chicken domestication.
Affiliation(s)
- Kejun Wang
  - College of Animal Science and Technology, Henan Agricultural University, Zhengzhou 450046, China
  - Henan Key laboratory for innovation and utilization of chicken germplasm resources, Zhengzhou, 450046, China
- Haifei Hu
  - School of Biological Sciences and Institute of Agriculture, University of Western Australia, Crawley, 6009 WA, Australia
- Yadong Tian
  - College of Animal Science and Technology, Henan Agricultural University, Zhengzhou 450046, China
  - Henan Key laboratory for innovation and utilization of chicken germplasm resources, Zhengzhou, 450046, China
- Jingyi Li
  - Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction of Ministry of Education, College of Animal Science and Technology, Huazhong Agricultural University, 430070 Wuhan, Hubei, China
- Armin Scheben
  - Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
- Chenxi Zhang
  - College of Animal Science and Technology, Henan Agricultural University, Zhengzhou 450046, China
  - Henan Key laboratory for innovation and utilization of chicken germplasm resources, Zhengzhou, 450046, China
- Yiyi Li
  - College of Animal Science and Technology, Henan Agricultural University, Zhengzhou 450046, China
  - Henan Key laboratory for innovation and utilization of chicken germplasm resources, Zhengzhou, 450046, China
- Junfeng Wu
  - College of Animal Science and Technology, Henan Agricultural University, Zhengzhou 450046, China
  - Henan Key laboratory for innovation and utilization of chicken germplasm resources, Zhengzhou, 450046, China
- Lan Yang
  - College of Animal Science and Technology, Henan Agricultural University, Zhengzhou 450046, China
  - Henan Key laboratory for innovation and utilization of chicken germplasm resources, Zhengzhou, 450046, China
- Xuewei Fan
  - College of Animal Science and Technology, Henan Agricultural University, Zhengzhou 450046, China
  - Henan Key laboratory for innovation and utilization of chicken germplasm resources, Zhengzhou, 450046, China
- Guirong Sun
  - College of Animal Science and Technology, Henan Agricultural University, Zhengzhou 450046, China
  - Henan Key laboratory for innovation and utilization of chicken germplasm resources, Zhengzhou, 450046, China
- Donghua Li
  - College of Animal Science and Technology, Henan Agricultural University, Zhengzhou 450046, China
  - Henan Key laboratory for innovation and utilization of chicken germplasm resources, Zhengzhou, 450046, China
- Yanhua Zhang
  - College of Animal Science and Technology, Henan Agricultural University, Zhengzhou 450046, China
  - Henan Key laboratory for innovation and utilization of chicken germplasm resources, Zhengzhou, 450046, China
- Ruili Han
  - College of Animal Science and Technology, Henan Agricultural University, Zhengzhou 450046, China
  - Henan Key laboratory for innovation and utilization of chicken germplasm resources, Zhengzhou, 450046, China
- Ruirui Jiang
  - College of Animal Science and Technology, Henan Agricultural University, Zhengzhou 450046, China
  - Henan Key laboratory for innovation and utilization of chicken germplasm resources, Zhengzhou, 450046, China
- Hetian Huang
  - College of Animal Science and Technology, Henan Agricultural University, Zhengzhou 450046, China
  - Henan Key laboratory for innovation and utilization of chicken germplasm resources, Zhengzhou, 450046, China
- Fengbin Yan
  - Henan Key laboratory for innovation and utilization of chicken germplasm resources, Zhengzhou, 450046, China
- Yanbin Wang
  - Henan Key laboratory for innovation and utilization of chicken germplasm resources, Zhengzhou, 450046, China
- Zhuanjian Li
  - College of Animal Science and Technology, Henan Agricultural University, Zhengzhou 450046, China
  - Henan Key laboratory for innovation and utilization of chicken germplasm resources, Zhengzhou, 450046, China
- Guoxi Li
  - College of Animal Science and Technology, Henan Agricultural University, Zhengzhou 450046, China
  - Henan Key laboratory for innovation and utilization of chicken germplasm resources, Zhengzhou, 450046, China
- Xiaojun Liu
  - College of Animal Science and Technology, Henan Agricultural University, Zhengzhou 450046, China
  - Henan Key laboratory for innovation and utilization of chicken germplasm resources, Zhengzhou, 450046, China
- Wenting Li
  - College of Animal Science and Technology, Henan Agricultural University, Zhengzhou 450046, China
  - Henan Key laboratory for innovation and utilization of chicken germplasm resources, Zhengzhou, 450046, China
- David Edwards
  - School of Biological Sciences and Institute of Agriculture, University of Western Australia, Crawley, 6009 WA, Australia
- Xiangtao Kang
  - College of Animal Science and Technology, Henan Agricultural University, Zhengzhou 450046, China
  - Henan Key laboratory for innovation and utilization of chicken germplasm resources, Zhengzhou, 450046, China
|
31
|
Chu JSC, Peng B, Tang K, Yi X, Zhou H, Wang H, Li G, Leng J, Chen N, Feng X. Eight soybean reference genome resources from varying latitudes and agronomic traits. Sci Data 2021; 8:164. [PMID: 34210987 PMCID: PMC8249447 DOI: 10.1038/s41597-021-00947-2] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 11/02/2020] [Accepted: 04/30/2021] [Indexed: 01/18/2023] Open
Abstract
Comparative analysis of multiple reference genomes representing diverse genetic backgrounds is critical for understanding the role of key alleles important in domestication and genetic breeding of important crops such as soybean. To enrich the genetic resources for soybean, we describe the generation, technical assessment, and preliminary genomic variation analysis of eight de novo reference-grade soybean genome assemblies from wild and cultivated accessions. These resources represent soybeans cultured at different latitudes and exhibiting different agronomical traits. Of these eight soybeans, five are from new accessions that have not been sequenced before. We demonstrate the usage of these genomes to identify small and large genomic variations affecting known genes as well as screening for genic PAV regions for identifying candidates for further functional studies.
Affiliation(s)
- Jeffrey Shih-Chieh Chu
  - Northeast Institute of Geography and Agroecology, Chinese Academy of Sciences, Changchun, China
  - Wuhan Frasergen Bioinformatics Inc., East Lake High-Tech Zone, Wuhan, China
- Bo Peng
  - Wuhan Frasergen Bioinformatics Inc., East Lake High-Tech Zone, Wuhan, China
- Kuanqiang Tang
  - Northeast Institute of Geography and Agroecology, Chinese Academy of Sciences, Changchun, China
- Xingxing Yi
  - Wuhan Frasergen Bioinformatics Inc., East Lake High-Tech Zone, Wuhan, China
  - College of Life Science and Technology, Huazhong Agricultural University, Wuhan, China
- Huangkai Zhou
  - Northeast Institute of Geography and Agroecology, Chinese Academy of Sciences, Changchun, China
- Huan Wang
  - Wuhan Frasergen Bioinformatics Inc., East Lake High-Tech Zone, Wuhan, China
- Guang Li
  - Northeast Institute of Geography and Agroecology, Chinese Academy of Sciences, Changchun, China
- Jiantian Leng
  - Northeast Institute of Geography and Agroecology, Chinese Academy of Sciences, Changchun, China
- Nansheng Chen
  - Institute of Oceanology, Chinese Academy of Sciences, Qingdao, China
  - Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, Canada
- Xianzhong Feng
  - Northeast Institute of Geography and Agroecology, Chinese Academy of Sciences, Changchun, China
|
32
|
Lei L, Goltsman E, Goodstein D, Wu GA, Rokhsar DS, Vogel JP. Plant Pan-Genomics Comes of Age. Annual Review of Plant Biology 2021; 72:411-435. [PMID: 33848428 DOI: 10.1146/annurev-arplant-080720-105454] [Citation(s) in RCA: 49] [Impact Index Per Article: 12.3] [Indexed: 05/22/2023]
Abstract
A pan-genome is the nonredundant collection of genes and/or DNA sequences in a species. Numerous studies have shown that plant pan-genomes are typically much larger than the genome of any individual and that a sizable fraction of the genes in any individual are present in only some genomes. The construction and interpretation of plant pan-genomes are challenging due to the large size and repetitive content of plant genomes. Most pan-genomes are largely focused on nontransposable element protein coding genes because they are more easily analyzed and defined than noncoding and repetitive sequences. Nevertheless, noncoding and repetitive DNA play important roles in determining the phenotype and genome evolution. Fortunately, it is now feasible to make multiple high-quality genomes that can be used to construct high-resolution pan-genomes that capture all the variation. However, assembling, displaying, and interacting with such high-resolution pan-genomes will require the development of new tools.
Affiliation(s)
- Li Lei
  - DOE Joint Genome Institute, Berkeley, California 94720, USA
- Eugene Goltsman
  - DOE Joint Genome Institute, Berkeley, California 94720, USA
- David Goodstein
  - DOE Joint Genome Institute, Berkeley, California 94720, USA
- Daniel S Rokhsar
  - DOE Joint Genome Institute, Berkeley, California 94720, USA
  - Department of Molecular and Cell Biology, University of California, Berkeley, California 94720, USA
- John P Vogel
  - DOE Joint Genome Institute, Berkeley, California 94720, USA
  - Department of Plant and Microbial Biology, University of California, Berkeley, California 94720, USA
|
33
|
Barragan AC, Weigel D. Plant NLR diversity: the known unknowns of pan-NLRomes. The Plant Cell 2021; 33:814-831. [PMID: 33793812 PMCID: PMC8226294 DOI: 10.1093/plcell/koaa002] [Citation(s) in RCA: 95] [Impact Index Per Article: 23.8] [Received: 07/13/2020] [Accepted: 10/23/2020] [Indexed: 05/20/2023]
Abstract
Plants and pathogens constantly adapt to each other. As a consequence, many members of the plant immune system, and especially the intracellular nucleotide-binding site leucine-rich repeat receptors, also known as NOD-like receptors (NLRs), are highly diversified, both among family members in the same genome, and between individuals in the same species. While this diversity has long been appreciated, its true extent has remained unknown. With pan-genome and pan-NLRome studies becoming more and more comprehensive, our knowledge of NLR sequence diversity is growing rapidly, and pan-NLRomes provide powerful platforms for assigning function to NLRs. These efforts are an important step toward the goal of comprehensively predicting from sequence alone whether an NLR provides disease resistance, and if so, to which pathogens.
Affiliation(s)
- A Cristina Barragan
  - Department of Molecular Biology, Max Planck Institute for Developmental Biology, 72076 Tübingen, Germany
|
34
|
Li J, Yuan D, Wang P, Wang Q, Sun M, Liu Z, Si H, Xu Z, Ma Y, Zhang B, Pei L, Tu L, Zhu L, Chen LL, Lindsey K, Zhang X, Jin S, Wang M. Cotton pan-genome retrieves the lost sequences and genes during domestication and selection. Genome Biol 2021; 22:119. [PMID: 33892774 PMCID: PMC8063427 DOI: 10.1186/s13059-021-02351-w] [Citation(s) in RCA: 84] [Impact Index Per Article: 21.0] [Received: 09/29/2020] [Accepted: 04/14/2021] [Indexed: 01/09/2023] Open
Abstract
BACKGROUND: Millennia of directional human selection have reshaped the genomic architecture of cultivated cotton relative to wild counterparts, but we have limited understanding of the selective retention and fractionation of genomic components. RESULTS: We construct a comprehensive genomic variome based on 1961 cottons and identify 456 Mb and 357 Mb of sequence with domestication and improvement selection signals and 162 loci, 84 of which are novel, including 47 loci associated with 16 agronomic traits. Using pan-genome analyses, we identify 32,569 and 8851 non-reference genes lost from Gossypium hirsutum and Gossypium barbadense reference genomes respectively, of which 38.2% (39,278) and 14.2% (11,359) of genes exhibit presence/absence variation (PAV). We document the landscape of PAV selection accompanied by asymmetric gene gain and loss and identify 124 PAVs linked to favorable fiber quality and yield loci. CONCLUSIONS: This variation repertoire points to genomic divergence during cotton domestication and improvement, which informs the characterization of favorable gene alleles for improved breeding practice using a pan-genome-based approach.
Affiliation(s)
- Jianying Li
  - National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, China
- Daojun Yuan
  - College of Plant Science and Technology, Huazhong Agricultural University, Wuhan, China
- Pengcheng Wang
  - National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, China
- Qiongqiong Wang
  - National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, China
- Mengling Sun
  - National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, China
- Zhenping Liu
  - National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, China
- Huan Si
  - National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, China
- Zhongping Xu
  - National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, China
- Yizan Ma
  - National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, China
- Boyang Zhang
  - National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, China
- Liuling Pei
  - National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, China
- Lili Tu
  - National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, China
- Longfu Zhu
  - National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, China
- Ling-Ling Chen
  - Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan, China
- Keith Lindsey
  - Department of Biosciences, Durham University, Durham, UK
- Xianlong Zhang
  - National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, China
- Shuangxia Jin
  - National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, China
- Maojun Wang
  - National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, China
|
35
|
Della Coletta R, Qiu Y, Ou S, Hufford MB, Hirsch CN. How the pan-genome is changing crop genomics and improvement. Genome Biol 2021; 22:3. [PMID: 33397434 PMCID: PMC7780660 DOI: 10.1186/s13059-020-02224-8] [Citation(s) in RCA: 123] [Impact Index Per Article: 30.8] [Received: 08/27/2020] [Accepted: 12/07/2020] [Indexed: 01/13/2023] Open
Abstract
Crop genomics has seen dramatic advances in recent years due to improvements in sequencing technology, assembly methods, and computational resources. These advances have led to the development of new tools to facilitate crop improvement. The study of structural variation within species and the characterization of the pan-genome has revealed extensive genome content variation among individuals within a species that is paradigm shifting to crop genomics and improvement. Here, we review advances in crop genomics and how utilization of these tools is shifting in light of pan-genomes that are becoming available for many crop species.
Affiliation(s)
- Rafael Della Coletta
  - Department of Agronomy and Plant Genetics, University of Minnesota, St. Paul, MN 55108 USA
- Yinjie Qiu
  - Department of Agronomy and Plant Genetics, University of Minnesota, St. Paul, MN 55108 USA
- Shujun Ou
  - Department of Ecology, Evolution, and Organismal Biology, Iowa State University, Ames, IA 50011 USA
- Matthew B. Hufford
  - Department of Ecology, Evolution, and Organismal Biology, Iowa State University, Ames, IA 50011 USA
- Candice N. Hirsch
  - Department of Agronomy and Plant Genetics, University of Minnesota, St. Paul, MN 55108 USA
|
37
|
Ruperao P, Thirunavukkarasu N, Gandham P, Selvanayagam S, Govindaraj M, Nebie B, Manyasa E, Gupta R, Das RR, Odeny DA, Gandhi H, Edwards D, Deshpande SP, Rathore A. Sorghum Pan-Genome Explores the Functional Utility for Genomic-Assisted Breeding to Accelerate the Genetic Gain. Frontiers in Plant Science 2021; 12:666342. [PMID: 34140962 PMCID: PMC8204017 DOI: 10.3389/fpls.2021.666342] [Citation(s) in RCA: 34] [Impact Index Per Article: 8.5] [Received: 02/18/2021] [Accepted: 04/28/2021] [Indexed: 05/05/2023]
Abstract
Sorghum (Sorghum bicolor L.) is a staple food crop in the arid and rainfed production ecologies. Sorghum plays a critical role in resilient farming and is projected as a smart crop to overcome the food and nutritional insecurity in the developing world. The development and characterisation of the sorghum pan-genome will provide insight into genome diversity and functionality, supporting sorghum improvement. We built a sorghum pan-genome using reference genomes as well as 354 genetically diverse sorghum accessions belonging to different races. We explored the structural and functional characteristics of the pan-genome and explain its utility in supporting genetic gain. The newly developed pan-genome has a total of 35,719 genes, a core genome of 16,821 genes and an average of 32,795 genes in each cultivar. The variable genes are enriched with environment-responsive genes and classify the sorghum accessions according to their race. We show that 53% of genes display presence-absence variation, and some of these variable genes are predicted to be functionally associated with drought adaptation traits. Using more than two million SNPs from the pan-genome, association analysis identified 398 SNPs significantly associated with important agronomic traits, of which 92 were in genes. Drought gene expression analysis identified 1,788 genes that are functionally linked to different conditions, of which 79 were absent from the reference genome assembly. This study provides comprehensive genomic diversity resources in sorghum which can be used in genome-assisted crop improvement.
Affiliation(s)
- Pradeep Ruperao
- International Crops Research Institute for the Semi-Arid Tropics, Patancheru, India
| | | | - Prasad Gandham
- International Crops Research Institute for the Semi-Arid Tropics, Patancheru, India
| | | | | | - Baloua Nebie
- Sorghum Breeding Program, International Crops Research Institute for the Semi-Arid Tropics, Bamako, Mali
| | - Eric Manyasa
- Sorghum Breeding Program, International Crops Research Institute for the Semi-Arid Tropics, Nairobi, Kenya
| | - Rajeev Gupta
- International Crops Research Institute for the Semi-Arid Tropics, Patancheru, India
| | - Roma Rani Das
- International Crops Research Institute for the Semi-Arid Tropics, Patancheru, India
| | - Damaris A. Odeny
- Sorghum Breeding Program, International Crops Research Institute for the Semi-Arid Tropics, Nairobi, Kenya
| | - Harish Gandhi
- International Crops Research Institute for the Semi-Arid Tropics, Patancheru, India
| | - David Edwards
- School of Biological Sciences and Institute of Agriculture, The University of Western Australia, Perth, WA, Australia
| | - Santosh P. Deshpande
- International Crops Research Institute for the Semi-Arid Tropics, Patancheru, India
- Santosh P. Deshpande
| | - Abhishek Rathore
- International Crops Research Institute for the Semi-Arid Tropics, Patancheru, India
- *Correspondence: Abhishek Rathore
| |
|
38
|
High-Throughput Genotyping Technologies in Plant Taxonomy. Methods Mol Biol 2021; 2222:149-166. [PMID: 33301093 DOI: 10.1007/978-1-0716-0997-2_9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/25/2023]
Abstract
Molecular markers provide researchers with a powerful tool for variation analysis between plant genomes. They are heritable and widely distributed across the genome and for this reason have many applications in plant taxonomy and genotyping. Over the last decade, molecular marker technology has developed rapidly and is now a crucial component for genetic linkage analysis, trait mapping, diversity analysis, and association studies. This chapter focuses on molecular marker discovery, its application, and future perspectives for plant genotyping through pangenome assemblies. Included are descriptions of automated methods for genome and sequence distance estimation, genome contaminant analysis in sequence reads, genome structural variation, and SNP discovery methods.
|
39
|
Kou Y, Liao Y, Toivainen T, Lv Y, Tian X, Emerson JJ, Gaut BS, Zhou Y. Evolutionary Genomics of Structural Variation in Asian Rice (Oryza sativa) Domestication. Mol Biol Evol 2020; 37:3507-3524. [PMID: 32681796 PMCID: PMC7743901 DOI: 10.1093/molbev/msaa185] [Citation(s) in RCA: 59] [Impact Index Per Article: 11.8] [Indexed: 12/21/2022] [Open Access]
Abstract
Structural variants (SVs) are a largely unstudied feature of plant genome evolution, despite the fact that SVs contribute substantially to phenotypes. In this study, we discovered SVs across a population sample of 347 high-coverage, resequenced genomes of Asian rice (Oryza sativa) and its wild ancestor (O. rufipogon). In addition to this short-read data set, we also inferred SVs from whole-genome assemblies and long-read data. Comparisons among data sets revealed different features of genome variability. For example, genome alignment identified a large (∼4.3 Mb) inversion in indica rice varieties relative to japonica varieties, and long-read analyses suggest that ∼9% of genes from the outgroup (O. longistaminata) are hemizygous. We focused, however, on the resequencing sample to investigate the population genomics of SVs. Clustering analyses with SVs recapitulated the rice cultivar groups that were also inferred from SNPs. However, the site-frequency spectrum of each SV type (inversions, duplications, deletions, translocations, and mobile element insertions) was skewed toward lower frequency variants than synonymous SNPs, suggesting that SVs may be predominantly deleterious. Among transposable elements, SINE and mariner insertions were found at especially low frequency. We also used SVs to study domestication by contrasting between rice and O. rufipogon. Cultivated genomes contained ∼25% more derived SVs and mobile element insertions than O. rufipogon, indicating that SVs contribute to the cost of domestication in rice. Peaks of SV divergence were enriched for known domestication genes, but we also detected hundreds of genes gained and lost during domestication, some of which were enriched for traits of agronomic interest.
Affiliation(s)
- Yixuan Kou
- Department of Ecology and Evolutionary Biology, UC Irvine, Irvine, CA
- Laboratory of Subtropical Biodiversity, Jiangxi Agricultural University, Nanchang, China
| | - Yi Liao
- Department of Ecology and Evolutionary Biology, UC Irvine, Irvine, CA
| | - Tuomas Toivainen
- Department of Ecology and Evolutionary Biology, UC Irvine, Irvine, CA
- Department of Agricultural Sciences, University of Helsinki, Helsinki, Finland
| | - Yuanda Lv
- Department of Ecology and Evolutionary Biology, UC Irvine, Irvine, CA
| | - Xinmin Tian
- Department of Biological Sciences, College of Life Science and Technology, Xinjiang University, Urumqi, China
| | - J J Emerson
- Department of Ecology and Evolutionary Biology, UC Irvine, Irvine, CA
| | - Brandon S Gaut
- Department of Ecology and Evolutionary Biology, UC Irvine, Irvine, CA
| | - Yongfeng Zhou
- Department of Ecology and Evolutionary Biology, UC Irvine, Irvine, CA
- Genome Analysis Laboratory of the Ministry of Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China
| |
|
40
|
Yao W, Li Y, Xie W, Wang L. Features of sRNA biogenesis in rice revealed by genetic dissection of sRNA expression level. Comput Struct Biotechnol J 2020; 18:3207-3216. [PMID: 33209208 PMCID: PMC7649420 DOI: 10.1016/j.csbj.2020.10.012] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 06/23/2020] [Revised: 09/24/2020] [Accepted: 10/11/2020] [Indexed: 01/25/2023] [Open Access]
Abstract
We previously conducted a QTL analysis of small RNA (sRNA) abundance in flag leaves of an immortalized rice F2 (IMF2) population by aligning sRNA reads to the reference genome to quantify the expression levels of sRNAs. However, this approach missed about half of the sRNAs as only 50% of all sRNA reads could be uniquely aligned to the reference genome. Here, we quantified the expression levels of sRNAs and sRNA clusters without the use of a reference genome. QTL analysis of the expression levels of sRNAs and sRNA clusters confirmed the feasibility of this approach. sRNAs and sRNA clusters with identified QTLs were then aligned to the high-quality parental genomes of the IMF2 population to resolve the identified QTLs into local vs. distant regulation mode. We were able to detect new QTL hotspots by considering sRNAs aligned to multiple positions of the parental genomes and sRNAs unaligned to the parental genomes. We found that several local-QTL hotspots were caused by sequence variations in long inverted repeats, which probably function as precursors of sRNAs, between the two parental genomes. The expression levels of these sRNAs were significantly associated with the presence/absence of the long inverted repeats in the IMF2 population. Moreover, we found that the variations in whole-genome sRNA species composition among different IMF2s were attributed to sRNA biogenesis genes including OsDCL2b and OsRDR2. Our results highlight that genetic dissection of sRNA expression is a promising approach to disclose new components functioning in sRNA biogenesis and new mechanisms of sRNA biogenesis.
Affiliation(s)
- Wen Yao
- National Key Laboratory of Wheat and Maize Crop Science, College of Life Sciences, Henan Agricultural University, Zhengzhou 450002, China; National Key Laboratory of Crop Genetic Improvement, National Center of Plant Gene Research (Wuhan), Huazhong Agricultural University, Wuhan 430070, China
| | - Yang Li
- National Key Laboratory of Wheat and Maize Crop Science, College of Life Sciences, Henan Agricultural University, Zhengzhou 450002, China
| | - Weibo Xie
- National Key Laboratory of Crop Genetic Improvement, National Center of Plant Gene Research (Wuhan), Huazhong Agricultural University, Wuhan 430070, China
| | - Lei Wang
- National Key Laboratory of Crop Genetic Improvement, National Center of Plant Gene Research (Wuhan), Huazhong Agricultural University, Wuhan 430070, China
| |
|
41
|
Tahir Ul Qamar M, Zhu X, Khan MS, Xing F, Chen LL. Pan-genome: A promising resource for noncoding RNA discovery in plants. Plant Genome 2020; 13:e20046. [PMID: 33217199 DOI: 10.1002/tpg2.20046] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 02/14/2020] [Revised: 06/08/2020] [Accepted: 06/22/2020] [Indexed: 05/05/2023]
Abstract
Plant genomes contain both protein-coding and noncoding sequences including transposable elements (TEs) and noncoding RNAs (ncRNAs). The ncRNAs are recognized as important elements that play fundamental roles in the structural organization and function of plant genomes. Despite various hypotheses, TEs are believed to be a major precursor of ncRNAs. Transposable elements are also prime factors that cause genomic variation among members of a species. Hence, TEs pose a major challenge in the discovery and analysis of ncRNAs. With the increase in the number of sequenced plant genomes, it is now accepted that a single reference genome is insufficient to represent the complete genomic diversity and contents of a species, and exploring the pan-genome of a species is critical. In this review, we summarize the recent progress in the field of plant pan-genomes. We also discuss TEs and their roles in ncRNA biogenesis and present our perspectives on the application of pan-genomes for the discovery of ncRNAs to fully explore and exploit their biological roles in plants.
Affiliation(s)
- Muhammad Tahir Ul Qamar
- State Key Laboratory for Conservation and Utilization of Subtropical Agro-bioresources, College of Life Science and Technology, Guangxi University, Nanning, 530004, P. R. China
- Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan, 430070, P. R. China
| | - Xitong Zhu
- Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan, 430070, P. R. China
| | - Muhammad Sarwar Khan
- Center of Agricultural Biochemistry and Biotechnology, University of Agriculture, Faisalabad, 38000, Pakistan
| | - Feng Xing
- College of Life Science, Xinyang Normal University, Xinyang, 464000, P. R. China
| | - Ling-Ling Chen
- State Key Laboratory for Conservation and Utilization of Subtropical Agro-bioresources, College of Life Science and Technology, Guangxi University, Nanning, 530004, P. R. China
- Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan, 430070, P. R. China
| |
|
42
|
Zhao J, Bayer PE, Ruperao P, Saxena RK, Khan AW, Golicz AA, Nguyen HT, Batley J, Edwards D, Varshney RK. Trait associations in the pangenome of pigeon pea (Cajanus cajan). Plant Biotechnol J 2020; 18:1946-1954. [PMID: 32020732 PMCID: PMC7415775 DOI: 10.1111/pbi.13354] [Citation(s) in RCA: 66] [Impact Index Per Article: 13.2] [Received: 06/13/2019] [Revised: 12/11/2019] [Accepted: 01/14/2020] [Indexed: 05/21/2023]
Abstract
Pigeon pea (Cajanus cajan) is an important orphan crop mainly grown by smallholder farmers in India and Africa. Here, we present the first pigeon pea pangenome based on 89 accessions mainly from India and the Philippines, showing that there is significant genetic diversity in Philippine individuals that is not present in Indian individuals. Annotation of variable genes suggests that they are associated with self-fertilization and response to disease. We identified 225 SNPs associated with nine agronomically important traits over three locations and two different time points, with SNPs associated with genes for transcription factors and kinases. These results will lead the way to an improved pigeon pea breeding programme.
Affiliation(s)
- Junliang Zhao
- Rice Research Institute, Guangdong Academy of Agricultural Sciences, Guangzhou, China
- Guangdong Key Laboratory of New Technology in Rice Breeding, Guangzhou, China
| | - Philipp E. Bayer
- School of Biological Sciences and Institute of Agriculture, The University of Western Australia, Perth, WA, Australia
| | - Pradeep Ruperao
- National Institute of Agricultural Botany, Cambridge, UK
- International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Hyderabad, India
| | - Rachit K. Saxena
- International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Hyderabad, India
| | - Aamir W. Khan
- School of Biological Sciences and Institute of Agriculture, The University of Western Australia, Perth, WA, Australia
- International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Hyderabad, India
| | - Agnieszka A. Golicz
- Plant Molecular Biology and Biotechnology Laboratory, Faculty of Veterinary and Agricultural Sciences, University of Melbourne, Melbourne, VIC, Australia
| | - Henry T. Nguyen
- Division of Plant Sciences, University of Missouri, Columbia, MO, USA
| | - Jacqueline Batley
- School of Biological Sciences and Institute of Agriculture, The University of Western Australia, Perth, WA, Australia
| | - David Edwards
- School of Biological Sciences and Institute of Agriculture, The University of Western Australia, Perth, WA, Australia
| | - Rajeev K. Varshney
- International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Hyderabad, India
| |
|
43
|
Gordon SP, Contreras-Moreira B, Levy JJ, Djamei A, Czedik-Eysenberg A, Tartaglio VS, Session A, Martin J, Cartwright A, Katz A, Singan VR, Goltsman E, Barry K, Dinh-Thi VH, Chalhoub B, Diaz-Perez A, Sancho R, Lusinska J, Wolny E, Nibau C, Doonan JH, Mur LAJ, Plott C, Jenkins J, Hazen SP, Lee SJ, Shu S, Goodstein D, Rokhsar D, Schmutz J, Hasterok R, Catalan P, Vogel JP. Gradual polyploid genome evolution revealed by pan-genomic analysis of Brachypodium hybridum and its diploid progenitors. Nat Commun 2020; 11:3670. [PMID: 32728126 PMCID: PMC7391716 DOI: 10.1038/s41467-020-17302-5] [Citation(s) in RCA: 64] [Impact Index Per Article: 12.8] [Received: 09/13/2019] [Accepted: 06/19/2020] [Indexed: 02/08/2023] [Open Access]
Abstract
Our understanding of polyploid genome evolution is constrained because we cannot know the exact founders of a particular polyploid. To differentiate between founder effects and post polyploidization evolution, we use a pan-genomic approach to study the allotetraploid Brachypodium hybridum and its diploid progenitors. Comparative analysis suggests that most B. hybridum whole gene presence/absence variation is part of the standing variation in its diploid progenitors. Analysis of nuclear single nucleotide variants, plastomes and k-mers associated with retrotransposons reveals two independent origins for B. hybridum, ~1.4 and ~0.14 million years ago. Examination of gene expression in the younger B. hybridum lineage reveals no bias in overall subgenome expression. Our results are consistent with a gradual accumulation of genomic changes after polyploidization and a lack of subgenome expression dominance. Significantly, if we did not use a pan-genomic approach, we would grossly overestimate the number of genomic changes attributable to post polyploidization evolution.
Affiliation(s)
- Sean P Gordon
- DOE Joint Genome Institute, Berkeley, CA, 94720, USA
| | - Bruno Contreras-Moreira
- Estación Experimental de Aula Dei (EEAD-CSIC), Zaragoza, Spain
- Fundación ARAID, Zaragoza, Spain
- Grupo de Bioquímica, Biofísica y Biología Computacional (BIFI, UNIZAR), Unidad Asociada al CSIC, Zaragoza, Spain
| | - Joshua J Levy
- DOE Joint Genome Institute, Berkeley, CA, 94720, USA
- University California, Berkeley, Berkeley, CA, 94720, USA
| | - Armin Djamei
- Gregor Mendel Institute of Molecular Plant Biology GmbH, Vienna, Austria
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben. Stadt Seeland, Seeland, Germany
| | | | - Virginia S Tartaglio
- DOE Joint Genome Institute, Berkeley, CA, 94720, USA
- University California, Berkeley, Berkeley, CA, 94720, USA
| | - Adam Session
- DOE Joint Genome Institute, Berkeley, CA, 94720, USA
| | - Joel Martin
- DOE Joint Genome Institute, Berkeley, CA, 94720, USA
| | | | - Andrew Katz
- DOE Joint Genome Institute, Berkeley, CA, 94720, USA
| | | | | | - Kerrie Barry
- DOE Joint Genome Institute, Berkeley, CA, 94720, USA
| | - Vinh Ha Dinh-Thi
- Organization and evolution of complex genomes (OECG) Institut national de la Recherche agronomique (INRA), Université d'Evry Val d'Essonne (UEVE), Evry, France
| | - Boulos Chalhoub
- Organization and evolution of complex genomes (OECG) Institut national de la Recherche agronomique (INRA), Université d'Evry Val d'Essonne (UEVE), Evry, France
- Institute of Crop Science, Zhejiang University, 866 Yu-Hang-Tang Road, 310058, Hangzhou, China
| | - Antonio Diaz-Perez
- Universidad de Zaragoza-Escuela Politécnica Superior de Huesca, 22071, Huesca, Spain
- Instituto de Genética, Facultad de Agronomía, Universidad Central de Venezuela, 2102, Maracay, Venezuela
| | - Ruben Sancho
- Universidad de Zaragoza-Escuela Politécnica Superior de Huesca, 22071, Huesca, Spain
| | - Joanna Lusinska
- Plant Cytogenetics and Molecular Biology Group, Institute of Biology, Biotechnology and Environmental Protection, Faculty of Natural Sciences, University of Silesia in Katowice, 40-032, Katowice, Poland
| | - Elzbieta Wolny
- Plant Cytogenetics and Molecular Biology Group, Institute of Biology, Biotechnology and Environmental Protection, Faculty of Natural Sciences, University of Silesia in Katowice, 40-032, Katowice, Poland
| | - Candida Nibau
- Institute of Biological, Environmental and Rural Sciences (IBERS), Aberystwyth University, Aberystwyth, Wales, UK
| | - John H Doonan
- Institute of Biological, Environmental and Rural Sciences (IBERS), Aberystwyth University, Aberystwyth, Wales, UK
| | - Luis A J Mur
- Institute of Biological, Environmental and Rural Sciences (IBERS), Aberystwyth University, Aberystwyth, Wales, UK
| | - Chris Plott
- HudsonAlpha Institute for Biotechnology, Huntsville, AL, 35806, USA
| | - Jerry Jenkins
- HudsonAlpha Institute for Biotechnology, Huntsville, AL, 35806, USA
| | - Samuel P Hazen
- Biology Department, University of Massachusetts Amherst, Amherst, MA, 01003, USA
| | - Scott J Lee
- Biology Department, University of Massachusetts Amherst, Amherst, MA, 01003, USA
| | | | | | - Daniel Rokhsar
- DOE Joint Genome Institute, Berkeley, CA, 94720, USA
- University California, Berkeley, Berkeley, CA, 94720, USA
| | - Jeremy Schmutz
- DOE Joint Genome Institute, Berkeley, CA, 94720, USA
- HudsonAlpha Institute for Biotechnology, Huntsville, AL, 35806, USA
| | - Robert Hasterok
- Plant Cytogenetics and Molecular Biology Group, Institute of Biology, Biotechnology and Environmental Protection, Faculty of Natural Sciences, University of Silesia in Katowice, 40-032, Katowice, Poland
| | - Pilar Catalan
- Grupo de Bioquímica, Biofísica y Biología Computacional (BIFI, UNIZAR), Unidad Asociada al CSIC, Zaragoza, Spain.
- Universidad de Zaragoza-Escuela Politécnica Superior de Huesca, 22071, Huesca, Spain.
- Institute of Biology, Tomsk State University, Tomsk, 634050, Russia.
| | - John P Vogel
- DOE Joint Genome Institute, Berkeley, CA, 94720, USA.
- University California, Berkeley, Berkeley, CA, 94720, USA.
| |
|
44
|
Liu Y, Du H, Li P, Shen Y, Peng H, Liu S, Zhou GA, Zhang H, Liu Z, Shi M, Huang X, Li Y, Zhang M, Wang Z, Zhu B, Han B, Liang C, Tian Z. Pan-Genome of Wild and Cultivated Soybeans. Cell 2020; 182:162-176.e13. [PMID: 32553274 DOI: 10.1016/j.cell.2020.05.023] [Citation(s) in RCA: 478] [Impact Index Per Article: 95.6] [Received: 12/22/2019] [Revised: 04/07/2020] [Accepted: 05/12/2020] [Indexed: 12/21/2022]
Abstract
Soybean is one of the most important vegetable oil and protein feed crops. To capture its entire genomic diversity, a complete, high-quality pan-genome needs to be constructed from diverse soybean accessions. In this study, we performed individual de novo genome assemblies for 26 representative soybeans that were selected from 2,898 deeply sequenced accessions. Using these assembled genomes together with three previously reported genomes, we constructed a graph-based genome and performed pan-genome analysis, which identified numerous genetic variations that cannot be detected by direct mapping of short sequence reads onto a single reference genome. The structural variations from the 2,898 accessions that were genotyped based on the graph-based genome and the RNA sequencing (RNA-seq) data from the representative 26 accessions helped to link genetic variations to candidate genes that are responsible for important traits. This pan-genome resource will promote evolutionary and functional genomics studies in soybean.
Affiliation(s)
- Yucheng Liu
- State Key Laboratory of Plant Cell and Chromosome Engineering, Institute of Genetics and Developmental Biology, Innovation Academy for Seed Design, Chinese Academy of Sciences, Beijing 100101, China; College of Advanced Agriculture Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
| | - Huilong Du
- State Key Laboratory of Plant Genomics, Institute of Genetics and Developmental Biology, Innovation Academy for Seed Design, Chinese Academy of Sciences, Beijing 100101, China; College of Advanced Agriculture Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
| | - Pengcheng Li
- Berry Genomics Corporation, Beijing 100015, China
| | - Yanting Shen
- School of Pharmaceutical Sciences, Guangzhou University of Chinese Medicine, Guangzhou 510006, China
| | - Hua Peng
- State Key Laboratory of Plant Genomics, Institute of Genetics and Developmental Biology, Innovation Academy for Seed Design, Chinese Academy of Sciences, Beijing 100101, China; College of Advanced Agriculture Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
| | - Shulin Liu
- State Key Laboratory of Plant Cell and Chromosome Engineering, Institute of Genetics and Developmental Biology, Innovation Academy for Seed Design, Chinese Academy of Sciences, Beijing 100101, China
| | - Guo-An Zhou
- State Key Laboratory of Plant Cell and Chromosome Engineering, Institute of Genetics and Developmental Biology, Innovation Academy for Seed Design, Chinese Academy of Sciences, Beijing 100101, China
| | | | - Zhi Liu
- State Key Laboratory of Plant Cell and Chromosome Engineering, Institute of Genetics and Developmental Biology, Innovation Academy for Seed Design, Chinese Academy of Sciences, Beijing 100101, China; College of Advanced Agriculture Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
| | - Miao Shi
- Berry Genomics Corporation, Beijing 100015, China
| | - Xuehui Huang
- College of Life Sciences, Shanghai Normal University, Shanghai 200234, China
| | - Yan Li
- National Center for Gene Research, CAS Center for Excellence in Molecular Plant Sciences, Institute of Plant Physiology and Ecology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai 200032, China
| | - Min Zhang
- State Key Laboratory of Plant Cell and Chromosome Engineering, Institute of Genetics and Developmental Biology, Innovation Academy for Seed Design, Chinese Academy of Sciences, Beijing 100101, China
| | - Zheng Wang
- State Key Laboratory of Plant Cell and Chromosome Engineering, Institute of Genetics and Developmental Biology, Innovation Academy for Seed Design, Chinese Academy of Sciences, Beijing 100101, China
| | - Baoge Zhu
- State Key Laboratory of Plant Cell and Chromosome Engineering, Institute of Genetics and Developmental Biology, Innovation Academy for Seed Design, Chinese Academy of Sciences, Beijing 100101, China
| | - Bin Han
- National Center for Gene Research, CAS Center for Excellence in Molecular Plant Sciences, Institute of Plant Physiology and Ecology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai 200032, China
| | - Chengzhi Liang
- State Key Laboratory of Plant Genomics, Institute of Genetics and Developmental Biology, Innovation Academy for Seed Design, Chinese Academy of Sciences, Beijing 100101, China; College of Advanced Agriculture Sciences, University of Chinese Academy of Sciences, Beijing 100049, China.
| | - Zhixi Tian
- State Key Laboratory of Plant Cell and Chromosome Engineering, Institute of Genetics and Developmental Biology, Innovation Academy for Seed Design, Chinese Academy of Sciences, Beijing 100101, China; College of Advanced Agriculture Sciences, University of Chinese Academy of Sciences, Beijing 100049, China.
| |
|
45
|
Christian RW, Hewitt SL, Roalson EH, Dhingra A. Genome-Scale Characterization of Predicted Plastid-Targeted Proteomes in Higher Plants. Sci Rep 2020; 10:8281. [PMID: 32427841 PMCID: PMC7237471 DOI: 10.1038/s41598-020-64670-5] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 06/21/2019] [Accepted: 04/20/2020] [Indexed: 12/20/2022] [Open Access]
Abstract
Plastids are morphologically and functionally diverse organelles that are dependent on nuclear-encoded, plastid-targeted proteins for all biochemical and regulatory functions. However, how plastid proteomes vary temporally, spatially, and taxonomically has been historically difficult to analyze at a genome-wide scale using experimental methods. A bioinformatics workflow was developed and evaluated using a combination of fast and user-friendly subcellular prediction programs to maximize performance and accuracy for chloroplast transit peptides, and this technique was demonstrated on the predicted proteomes of 15 sequenced plant genomes. Gene family grouping was then performed in parallel using modified approaches of reciprocal best BLAST hits (RBH) and UCLUST. A total of 628 protein families were found to have conserved plastid targeting across angiosperm species using RBH, and 828 using UCLUST. However, thousands of clusters were also detected where only one species had predicted plastid targeting, most notably in Panicum virgatum, which had 1,458 proteins with species-unique targeting. An average of 45% overlap was found in plastid-targeted protein-coding gene families compared with Arabidopsis, but an additional 20% of proteins matched against the full Arabidopsis proteome, indicating a unique evolution of plastid targeting. Neofunctionalization through subcellular relocalization is known to impart novel biological functions but has not been described before on a genome-wide scale for the plastid proteome. Further work to correlate these predicted novel plastid-targeted proteins to transcript abundance and high-throughput proteomics will uncover unique aspects of plastid biology and shed light on how the plastid proteome has evolved to influence plastid morphology and biochemistry.
Affiliation(s)
- Ryan W Christian
- Department of Horticulture, Washington State University, Pullman, WA, USA
- Molecular Plant Sciences Program, Washington State University, Pullman, WA, USA
| | - Seanna L Hewitt
- Department of Horticulture, Washington State University, Pullman, WA, USA
- Molecular Plant Sciences Program, Washington State University, Pullman, WA, USA
| | - Eric H Roalson
- Molecular Plant Sciences Program, Washington State University, Pullman, WA, USA
- School of Biological Sciences, Washington State University, Pullman, WA, USA
| | - Amit Dhingra
- Department of Horticulture, Washington State University, Pullman, WA, USA.
- Molecular Plant Sciences Program, Washington State University, Pullman, WA, USA.
| |
|
46
|
Dolatabadian A, Bayer PE, Tirnaz S, Hurgobin B, Edwards D, Batley J. Characterization of disease resistance genes in the Brassica napus pangenome reveals significant structural variation. Plant Biotechnol J 2020; 18:969-982. [PMID: 31553100 PMCID: PMC7061875 DOI: 10.1111/pbi.13262] [Citation(s) in RCA: 72] [Impact Index Per Article: 14.4] [Received: 01/08/2019] [Revised: 08/30/2019] [Accepted: 09/13/2019] [Indexed: 05/18/2023]
Abstract
Methods based on single nucleotide polymorphism (SNP), copy number variation (CNV) and presence/absence variation (PAV) discovery provide a valuable resource to study gene structure and evolution. However, as a result of these structural variations, a single reference genome is unable to cover the entire gene content of a species. Therefore, pangenomics analysis is needed to ensure that the genomic diversity within a species is fully represented. Brassica napus is one of the most important oilseed crops in the world and exhibits variability in its resistance genes across different cultivars. Here, we characterized resistance gene distribution across 50 B. napus lines. We identified a total of 1749 resistance gene analogs (RGAs), of which 996 are core and 753 are variable, 368 of which are not present in the reference genome (cv. Darmor-bzh). In addition, a total of 15 318 SNPs were predicted within 1030 of the RGAs. The results showed that core R-genes harbour more SNPs than variable genes. More nucleotide binding site-leucine-rich repeat (NBS-LRR) genes were located in clusters than as singletons, with variable genes more likely to be found in clusters. We identified 106 RGA candidates linked to blackleg resistance quantitative trait locus (QTL). This study provides a better understanding of resistance genes to target for genomics-based improvement and improved disease resistance.
Affiliation(s)
- Aria Dolatabadian
- UWA School of Biological Sciences and the UWA Institute of Agriculture, Faculty of Science, The University of Western Australia, Crawley, WA, Australia
| | - Philipp E. Bayer
- UWA School of Biological Sciences and the UWA Institute of Agriculture, Faculty of Science, The University of Western Australia, Crawley, WA, Australia
| | - Soodeh Tirnaz
- UWA School of Biological Sciences and the UWA Institute of Agriculture, Faculty of Science, The University of Western Australia, Crawley, WA, Australia
| | - Bhavna Hurgobin
- UWA School of Biological Sciences and the UWA Institute of Agriculture, Faculty of Science, The University of Western Australia, Crawley, WA, Australia
| | - David Edwards
- UWA School of Biological Sciences and the UWA Institute of Agriculture, Faculty of Science, The University of Western Australia, Crawley, WA, Australia
| | - Jacqueline Batley
- UWA School of Biological Sciences and the UWA Institute of Agriculture, Faculty of Science, The University of Western Australia, Crawley, WA, Australia
| |
|
47
|
Khan AW, Garg V, Roorkiwal M, Golicz AA, Edwards D, Varshney RK. Super-Pangenome by Integrating the Wild Side of a Species for Accelerated Crop Improvement. TRENDS IN PLANT SCIENCE 2020; 25:148-158. [PMID: 31787539 PMCID: PMC6988109 DOI: 10.1016/j.tplants.2019.10.012] [Citation(s) in RCA: 151] [Impact Index Per Article: 30.2] [Received: 05/17/2019] [Revised: 10/28/2019] [Accepted: 10/29/2019] [Indexed: 05/19/2023]
Abstract
The pangenome provides genomic variations in the cultivated gene pool for a given species. However, as the crop's gene pool comprises many species, especially wild relatives with diverse genetic stock, here we suggest using accessions from all available species of a given genus for the development of a more comprehensive and complete pangenome, which we refer to as a super-pangenome. The super-pangenome provides a complete genomic variation repertoire of a genus and offers unprecedented opportunities for crop improvement. This opinion article focuses on recent developments in crop pangenomics, the need for a super-pangenome that should include wild species, and its application for crop improvement.
Affiliation(s)
- Aamir W Khan
- Center of Excellence in Genomics and Systems Biology, International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Hyderabad, India; School of Biological Sciences, The University of Western Australia (UWA), Crawley, WA, Australia
| | - Vanika Garg
- Center of Excellence in Genomics and Systems Biology, International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Hyderabad, India
| | - Manish Roorkiwal
- Center of Excellence in Genomics and Systems Biology, International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Hyderabad, India
| | - Agnieszka A Golicz
- Plant Molecular Biology and Biotechnology Laboratory, Faculty of Veterinary and Agricultural Sciences, University of Melbourne, Parkville, Melbourne, VIC, Australia
| | - David Edwards
- School of Biological Sciences, The University of Western Australia (UWA), Crawley, WA, Australia
| | - Rajeev K Varshney
- Center of Excellence in Genomics and Systems Biology, International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Hyderabad, India.
| |
|
48
|
Abstract
Cereal improvement is based upon effective utilization of genetic resources. These include germplasm and genomics data and tools. Cereal germplasm is available from major global seed banks. Wild material remains an additional, less well utilized resource. Sourcing of germplasm requires protocols to ensure intellectual property matters are adequately addressed. Advances in genomics technology have made extensive data sets available for the cereals. Reference genome sequences, transcriptome resources, and pangenomes are now available for the major cereal species. The use of genomic data is facilitated by the addition of user-friendly interfaces that allow breeders to access the information they need.
Affiliation(s)
- Robert J Henry
- Queensland Alliance for Agriculture and Food Innovation, University of Queensland, Brisbane, QLD, Australia.
| |
|
49
|
Golicz AA, Bayer PE, Bhalla PL, Batley J, Edwards D. Pangenomics Comes of Age: From Bacteria to Plant and Animal Applications. Trends Genet 2019; 36:132-145. [PMID: 31882191 DOI: 10.1016/j.tig.2019.11.006] [Citation(s) in RCA: 120] [Impact Index Per Article: 20.0] [Received: 09/30/2019] [Revised: 11/09/2019] [Accepted: 11/12/2019] [Indexed: 02/01/2023]
Abstract
The pangenome refers to a collection of genomic sequences found in the entire species or population rather than in a single individual; the sequence can be core, present in all individuals, or accessory (variable or dispensable), found in a subset of individuals only. While pangenomic studies were first undertaken in bacterial species, developments in genome sequencing and assembly approaches have allowed construction of pangenomes for eukaryotic organisms, fungi, plants, and animals, including two large-scale human pangenome projects. Analysis of these pangenomes revealed key differences, most likely stemming from divergent evolutionary histories, but also surprising similarities.
Affiliation(s)
- Agnieszka A Golicz
- Plant Molecular Biology and Biotechnology Laboratory, Faculty of Veterinary and Agricultural Sciences, University of Melbourne, Melbourne, VIC, Australia.
| | - Philipp E Bayer
- School of Biological Sciences and Institute of Agriculture, The University of Western Australia, Crawley, WA, Australia
| | - Prem L Bhalla
- Plant Molecular Biology and Biotechnology Laboratory, Faculty of Veterinary and Agricultural Sciences, University of Melbourne, Melbourne, VIC, Australia
| | - Jacqueline Batley
- School of Biological Sciences and Institute of Agriculture, The University of Western Australia, Crawley, WA, Australia
| | - David Edwards
- School of Biological Sciences and Institute of Agriculture, The University of Western Australia, Crawley, WA, Australia.
| |
|
50
|
Xie M, Chung CYL, Li MW, Wong FL, Wang X, Liu A, Wang Z, Leung AKY, Wong TH, Tong SW, Xiao Z, Fan K, Ng MS, Qi X, Yang L, Deng T, He L, Chen L, Fu A, Ding Q, He J, Chung G, Isobe S, Tanabata T, Valliyodan B, Nguyen HT, Cannon SB, Foyer CH, Chan TF, Lam HM. A reference-grade wild soybean genome. Nat Commun 2019; 10:1216. [PMID: 30872580 PMCID: PMC6418295 DOI: 10.1038/s41467-019-09142-9] [Citation(s) in RCA: 159] [Impact Index Per Article: 26.5] [Received: 01/10/2019] [Accepted: 02/22/2019] [Indexed: 01/01/2023]
Abstract
Efficient crop improvement depends on the application of accurate genetic information contained in diverse germplasm resources. Here we report a reference-grade genome of wild soybean accession W05, with a final assembled genome size of 1013.2 Mb and a contig N50 of 3.3 Mb. The analytical power of the W05 genome is demonstrated by several examples. First, we identify an inversion at the locus determining seed coat color during domestication. Second, a translocation event between chromosomes 11 and 13 of some genotypes is shown to interfere with the assignment of QTLs. Third, we find a region containing copy number variations of the Kunitz trypsin inhibitor (KTI) genes. Such findings illustrate the power of this assembly in the analysis of large structural variations in soybean germplasm collections. The wild soybean genome assembly has wide applications in comparative genomic and evolutionary studies, as well as in crop breeding and improvement programs.
Affiliation(s)
- Min Xie
- Centre for Soybean Research of the State Key Laboratory of Agrobiotechnology and School of Life Sciences, The Chinese University of Hong Kong, Shatin, Hong Kong Special Administrative Region, China
| | - Claire Yik-Lok Chung
- Centre for Soybean Research of the State Key Laboratory of Agrobiotechnology and School of Life Sciences, The Chinese University of Hong Kong, Shatin, Hong Kong Special Administrative Region, China
| | - Man-Wah Li
- Centre for Soybean Research of the State Key Laboratory of Agrobiotechnology and School of Life Sciences, The Chinese University of Hong Kong, Shatin, Hong Kong Special Administrative Region, China
| | - Fuk-Ling Wong
- Centre for Soybean Research of the State Key Laboratory of Agrobiotechnology and School of Life Sciences, The Chinese University of Hong Kong, Shatin, Hong Kong Special Administrative Region, China
| | - Xin Wang
- Centre for Soybean Research of the State Key Laboratory of Agrobiotechnology and School of Life Sciences, The Chinese University of Hong Kong, Shatin, Hong Kong Special Administrative Region, China
| | - Ailin Liu
- Centre for Soybean Research of the State Key Laboratory of Agrobiotechnology and School of Life Sciences, The Chinese University of Hong Kong, Shatin, Hong Kong Special Administrative Region, China
| | - Zhili Wang
- Centre for Soybean Research of the State Key Laboratory of Agrobiotechnology and School of Life Sciences, The Chinese University of Hong Kong, Shatin, Hong Kong Special Administrative Region, China
| | - Alden King-Yung Leung
- Centre for Soybean Research of the State Key Laboratory of Agrobiotechnology and School of Life Sciences, The Chinese University of Hong Kong, Shatin, Hong Kong Special Administrative Region, China
| | - Tin-Hang Wong
- Centre for Soybean Research of the State Key Laboratory of Agrobiotechnology and School of Life Sciences, The Chinese University of Hong Kong, Shatin, Hong Kong Special Administrative Region, China
| | - Suk-Wah Tong
- Centre for Soybean Research of the State Key Laboratory of Agrobiotechnology and School of Life Sciences, The Chinese University of Hong Kong, Shatin, Hong Kong Special Administrative Region, China
| | - Zhixia Xiao
- Centre for Soybean Research of the State Key Laboratory of Agrobiotechnology and School of Life Sciences, The Chinese University of Hong Kong, Shatin, Hong Kong Special Administrative Region, China
| | - Kejing Fan
- Centre for Soybean Research of the State Key Laboratory of Agrobiotechnology and School of Life Sciences, The Chinese University of Hong Kong, Shatin, Hong Kong Special Administrative Region, China
| | - Ming-Sin Ng
- Centre for Soybean Research of the State Key Laboratory of Agrobiotechnology and School of Life Sciences, The Chinese University of Hong Kong, Shatin, Hong Kong Special Administrative Region, China
| | - Xinpeng Qi
- Centre for Soybean Research of the State Key Laboratory of Agrobiotechnology and School of Life Sciences, The Chinese University of Hong Kong, Shatin, Hong Kong Special Administrative Region, China
| | - Linfeng Yang
- BGI Genomics, BGI-Shenzhen, Shenzhen, 518083, Guangdong, China
| | - Tianquan Deng
- BGI Genomics, BGI-Shenzhen, Shenzhen, 518083, Guangdong, China
| | - Lijuan He
- BGI Genomics, BGI-Shenzhen, Shenzhen, 518083, Guangdong, China
| | - Lu Chen
- BGI Genomics, BGI-Shenzhen, Shenzhen, 518083, Guangdong, China
| | - Aisi Fu
- Wuhan Institute of Biotechnology, Wuhan, 430075, Hubei, China
| | - Qiong Ding
- Wuhan Institute of Biotechnology, Wuhan, 430075, Hubei, China
| | - Junxian He
- Centre for Soybean Research of the State Key Laboratory of Agrobiotechnology and School of Life Sciences, The Chinese University of Hong Kong, Shatin, Hong Kong Special Administrative Region, China
| | - Gyuhwa Chung
- Department of Biotechnology, Chonnam National University, Gwangju, 550-749, Jeonnam, South Korea
| | - Sachiko Isobe
- Kazusa DNA Research Institute, Kazusa-Kamatari, Kisarazu, 292-0818, Chiba, Japan
| | - Takanari Tanabata
- Kazusa DNA Research Institute, Kazusa-Kamatari, Kisarazu, 292-0818, Chiba, Japan
| | - Babu Valliyodan
- Division of Plant Sciences and National Center for Soybean Biotechnology, University of Missouri, Columbia, Missouri, 65211, USA
| | - Henry T Nguyen
- Division of Plant Sciences and National Center for Soybean Biotechnology, University of Missouri, Columbia, Missouri, 65211, USA
| | - Steven B Cannon
- Corn Insects and Crop Genetics Research Unit, United States Department of Agriculture - Agricultural Research Service (USDA-ARS), Ames, Iowa, 50011-4014, USA
| | - Christine H Foyer
- Faculty of Biological Sciences, Centre for Plant Sciences, University of Leeds, Leeds, LS2 9JT, Yorkshire, UK
| | - Ting-Fung Chan
- Centre for Soybean Research of the State Key Laboratory of Agrobiotechnology and School of Life Sciences, The Chinese University of Hong Kong, Shatin, Hong Kong Special Administrative Region, China.
| | - Hon-Ming Lam
- Centre for Soybean Research of the State Key Laboratory of Agrobiotechnology and School of Life Sciences, The Chinese University of Hong Kong, Shatin, Hong Kong Special Administrative Region, China.
| |
|