1
Cannon EK, Portwood JL, Hayford RK, Haley OC, Gardiner JM, Andorf CM, Woodhouse MR. Enhanced pan-genomic resources at the maize genetics and genomics database. Genetics 2024; 227:iyae036. [PMID: 38577974 DOI: 10.1093/genetics/iyae036] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/13/2023] [Accepted: 01/13/2024] [Indexed: 04/06/2024] Open
Abstract
Pan-genomes, encompassing the entirety of genetic sequences found in a collection of genomes within a clade, are more useful than single reference genomes for studying species diversity. This is especially true for a species like Zea mays, which has a particularly diverse and complex genome. Presenting pan-genome data, analyses, and visualization is challenging, especially for a diverse species, but more so when pan-genomic data is linked to extensive gene model and gene data, including classical gene information, markers, insertions, expression and proteomic data, and protein structures, as is the case at MaizeGDB. Here, we describe MaizeGDB's expansion to include the genic subset of the Zea pan-genome in a pan-gene data center featuring the maize genomes hosted at MaizeGDB, and the outgroup teosinte Zea genomes from the Pan-Andropogoneae project. The new data center offers a variety of browsing and visualization tools, including sequence alignment visualization, gene trees and other tools, to explore pan-genes in Zea that were calculated by the pipeline Pandagma. Combined, these data will help maize researchers study the complexity and diversity of Zea, and use the comparative functions to validate pan-gene relationships for a selected gene model.
Affiliation(s)
- Ethalinda K Cannon
  - USDA-ARS, Corn Insects and Crop Genetics Research Unit, Ames, IA 50011, USA
- John L Portwood
  - USDA-ARS, Corn Insects and Crop Genetics Research Unit, Ames, IA 50011, USA
- Rita K Hayford
  - USDA-ARS, Corn Insects and Crop Genetics Research Unit, Ames, IA 50011, USA
- Olivia C Haley
  - USDA-ARS, Corn Insects and Crop Genetics Research Unit, Ames, IA 50011, USA
- Jack M Gardiner
  - Division of Animal Sciences, University of Missouri, Columbia, MO 65211, USA
- Carson M Andorf
  - USDA-ARS, Corn Insects and Crop Genetics Research Unit, Ames, IA 50011, USA
  - Department of Computer Science, Iowa State University, Ames, IA 50011, USA
2
Andorf CM, Haley OC, Hayford RK, Portwood JL, Harding S, Sen S, Cannon EK, Gardiner JM, Kim HS, Woodhouse MR. PanEffect: a pan-genome visualization tool for variant effects in maize. Bioinformatics 2024; 40:btae073. [PMID: 38337024 PMCID: PMC10881103 DOI: 10.1093/bioinformatics/btae073] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/16/2023] [Revised: 01/30/2024] [Accepted: 02/06/2024] [Indexed: 02/12/2024] Open
Abstract
SUMMARY Understanding the effects of genetic variants is crucial for accurately predicting traits and functional outcomes. Recent approaches have utilized artificial intelligence and protein language models to score all possible missense variant effects at the proteome level for a single genome, but a reliable tool is needed to explore these effects at the pan-genome level. To address this gap, we introduce a new tool called PanEffect. We implemented PanEffect at MaizeGDB to enable a comprehensive examination of the potential effects of coding variants across 50 maize genomes. The tool allows users to visualize over 550 million possible amino acid substitutions in the B73 maize reference genome and to observe the effects of the 2.3 million natural variations in the maize pan-genome. Each variant effect score, calculated from the Evolutionary Scale Modeling (ESM) protein language model, shows the log-likelihood ratio difference between B73 and all variants in the pan-genome. These scores are shown using heatmaps spanning benign outcomes to potential functional consequences. In addition, PanEffect displays secondary structures and functional domains along with the variant effects, offering additional functional and structural context. Using PanEffect, researchers now have a platform to explore protein variants and identify genetic targets for crop enhancement. AVAILABILITY AND IMPLEMENTATION The PanEffect code is freely available on GitHub (https://github.com/Maize-Genetics-and-Genomics-Database/PanEffect). A maize implementation of PanEffect and underlying datasets are available at MaizeGDB (https://www.maizegdb.org/effect/maize/).
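As a rough illustration of the log-likelihood ratio scoring described in this abstract, the sketch below scores every possible substitution at each position as the log-probability of the variant residue minus the log-probability of the reference residue. The function names and the toy probability table are hypothetical stand-ins, not PanEffect's code or real ESM output, and the published scoring procedure may differ in detail.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def toy_log_probs(reference, p_ref=0.81):
    """Hypothetical stand-in for a protein language model.

    Gives the reference residue probability p_ref at each position and
    splits the remainder evenly over the other 19 amino acids.
    """
    p_other = (1.0 - p_ref) / (len(AMINO_ACIDS) - 1)
    return [
        {aa: math.log(p_ref if aa == ref_aa else p_other) for aa in AMINO_ACIDS}
        for ref_aa in reference
    ]


def llr_scores(reference, log_probs):
    """Per-position substitution scores: log P(variant) - log P(reference).

    A score of 0 means the variant is as likely as the reference residue;
    more negative scores mean the model finds the variant less plausible.
    """
    scores = []
    for ref_aa, row in zip(reference, log_probs):
        scores.append({aa: row[aa] - row[ref_aa] for aa in AMINO_ACIDS})
    return scores


reference = "MKV"  # made-up three-residue peptide
scores = llr_scores(reference, toy_log_probs(reference))
print(round(scores[1]["A"], 3))  # score for the substitution K2A
```

Each row of `scores` corresponds to one column of the kind of heatmap the abstract describes, with one cell per candidate amino acid.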
Affiliation(s)
- Carson M Andorf
  - USDA-ARS, Corn Insects and Crop Genetics Research Unit, Ames, IA 50011, United States
  - Department of Computer Science, Iowa State University, Ames, IA 50011, United States
- Olivia C Haley
  - USDA-ARS, Corn Insects and Crop Genetics Research Unit, Ames, IA 50011, United States
- Rita K Hayford
  - USDA-ARS, Corn Insects and Crop Genetics Research Unit, Ames, IA 50011, United States
- John L Portwood
  - USDA-ARS, Corn Insects and Crop Genetics Research Unit, Ames, IA 50011, United States
- Stephen Harding
  - USDA-ARS, Mycotoxin Prevention and Applied Microbiology Research Unit, National Center for Agricultural Utilization Research, Peoria, IL 61604, United States
- Shatabdi Sen
  - Department of Plant Pathology & Microbiology, Iowa State University, Ames, IA 50011, United States
- Ethalinda K Cannon
  - USDA-ARS, Corn Insects and Crop Genetics Research Unit, Ames, IA 50011, United States
- Jack M Gardiner
  - Division of Animal Sciences, University of Missouri, Columbia, MO 65211, United States
- Hye-Seon Kim
  - USDA-ARS, Mycotoxin Prevention and Applied Microbiology Research Unit, National Center for Agricultural Utilization Research, Peoria, IL 61604, United States
- Margaret R Woodhouse
  - USDA-ARS, Corn Insects and Crop Genetics Research Unit, Ames, IA 50011, United States
3
Sen S, Woodhouse MR, Portwood JL, Andorf CM. Maize Feature Store: A centralized resource to manage and analyze curated maize multi-omics features for machine learning applications. Database (Oxford) 2023; 2023:baad078. [PMID: 37935586 PMCID: PMC10634621 DOI: 10.1093/database/baad078] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/27/2022] [Revised: 09/16/2023] [Accepted: 10/19/2023] [Indexed: 11/09/2023]
Abstract
The big-data analysis of complex data associated with maize genomes accelerates genetic research and improves agronomic traits. As a result, efforts have increased to integrate diverse datasets and extract meaning from these measurements. Machine learning models are a powerful tool for gaining knowledge from large and complex datasets. However, these models must be trained on high-quality features to succeed. Currently, there are no solutions to host maize multi-omics datasets with end-to-end solutions for evaluating and linking features to target gene annotations. Our work presents the Maize Feature Store (MFS), a versatile application that combines features built on complex data to facilitate exploration, modeling and analysis. Feature stores allow researchers to rapidly deploy machine learning applications by managing and providing access to frequently used features. We populated the MFS for the maize reference genome with over 14 000 gene-based features based on published genomic, transcriptomic, epigenomic, variomic and proteomics datasets. Using the MFS, we created an accurate pan-genome classification model with an AUC-ROC score of 0.87. The MFS is publicly available through the maize genetics and genomics database. Database URL https://mfs.maizegdb.org/.
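For readers unfamiliar with the AUC-ROC metric quoted in this abstract, the sketch below computes it directly from its pairwise definition: the fraction of (positive, negative) example pairs in which the positive example receives the higher score, with ties counted as half. The function name, labels, and scores are made up for illustration and have no connection to the MFS data or model.

```python
def auc_roc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: iterable of 0/1 class labels.
    scores: iterable of classifier scores, higher meaning "more positive".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))


# Toy example: 3 of the 4 positive/negative pairs are ordered correctly.
toy_auc = auc_roc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
print(toy_auc)  # 0.75
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation, which gives some context for the reported score of 0.87.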
Affiliation(s)
- Shatabdi Sen
  - Department of Plant Pathology & Microbiology, Iowa State University, 1344 Advanced Teaching & Research Bldg, 2213 Pammel Dr, Ames, IA 50011, USA
- Margaret R Woodhouse
  - USDA-ARS, Corn Insects and Crop Genetics Research Unit, 819 Wallace Road, Ames, IA 50011, USA
- John L Portwood
  - USDA-ARS, Corn Insects and Crop Genetics Research Unit, 819 Wallace Road, Ames, IA 50011, USA
- Carson M Andorf
  - USDA-ARS, Corn Insects and Crop Genetics Research Unit, 819 Wallace Road, Ames, IA 50011, USA
  - Department of Computer Science, Iowa State University, Atanasoff Hall, 2434 Osborn Dr, Ames, IA 50011, USA
4
Woodhouse MR, Portwood JL, Sen S, Hayford RK, Gardiner JM, Cannon EK, Harper LC, Andorf CM. Maize Protein Structure Resources at the Maize Genetics and Genomics Database. Genetics 2023; 224:7031797. [PMID: 36755109 DOI: 10.1093/genetics/iyad016] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/17/2022] [Revised: 01/26/2023] [Accepted: 01/27/2023] [Indexed: 02/10/2023] Open
Abstract
Protein structures play an important role in bioinformatics, such as in predicting gene function or validating gene model annotation. However, determining protein structure was, until now, costly and time-consuming, which resulted in a structural biology bottleneck. With the release of programs such as AlphaFold and ESMFold, this bottleneck has been reduced by several orders of magnitude, permitting protein structural comparisons of entire genomes within reasonable timeframes. MaizeGDB has leveraged this technological breakthrough by offering several new tools to accelerate protein structural comparisons between maize and other plants as well as human and yeast outgroups. MaizeGDB also offers bulk downloads of these comparative protein structure data, along with predicted functional annotation information. In this way, MaizeGDB is poised to assist maize researchers in assessing functional homology, gene model annotation quality, and other information unavailable to maize scientists even a few years ago.
Affiliation(s)
- John L Portwood
  - USDA-ARS, Corn Insects and Crop Genetics Research Unit, Ames, IA 50011, USA
- Shatabdi Sen
  - Department of Plant Pathology & Microbiology, Iowa State University, Ames, IA 50011, USA
- Rita K Hayford
  - USDA-ARS, Corn Insects and Crop Genetics Research Unit, Ames, IA 50011, USA
- Jack M Gardiner
  - Division of Animal Sciences, University of Missouri, Columbia, MO 65211, USA
- Ethalinda K Cannon
  - USDA-ARS, Corn Insects and Crop Genetics Research Unit, Ames, IA 50011, USA
- Lisa C Harper
  - USDA-ARS, Corn Insects and Crop Genetics Research Unit, Ames, IA 50011, USA
- Carson M Andorf
  - USDA-ARS, Corn Insects and Crop Genetics Research Unit, Ames, IA 50011, USA
  - Department of Computer Science, Iowa State University, Ames, IA 50011, USA
5
Mural RV, Sun G, Grzybowski M, Tross MC, Jin H, Smith C, Newton L, Andorf CM, Woodhouse MR, Thompson AM, Sigmon B, Schnable JC. Association mapping across a multitude of traits collected in diverse environments in maize. Gigascience 2022; 11:6673780. [PMID: 35997208 PMCID: PMC9396454 DOI: 10.1093/gigascience/giac080] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/13/2022] [Revised: 05/25/2022] [Indexed: 11/14/2022] Open
Abstract
Classical genetic studies have identified many cases of pleiotropy where mutations in individual genes alter many different phenotypes. Quantitative genetic studies of natural genetic variants frequently examine one or a few traits, limiting their potential to identify pleiotropic effects of natural genetic variants. Widely adopted community association panels have been employed by plant genetics communities to study the genetic basis of naturally occurring phenotypic variation in a wide range of traits. High-density genetic marker data (18M markers) from 2 partially overlapping maize association panels comprising 1,014 unique genotypes grown in field trials across at least 7 US states and scored for 162 distinct trait data sets enabled the identification of 2,154 suggestive marker-trait associations and 697 confident associations in the maize genome using a resampling-based genome-wide association strategy. The precision of individual marker-trait associations was estimated to be 3 genes based on a reference set of genes with known phenotypes. Examples were observed of both genetic loci associated with variation in diverse traits (e.g., above-ground and below-ground traits), as well as individual loci associated with the same or similar traits across diverse environments. Many significant signals are located near genes whose functions were previously entirely unknown or estimated purely via functional data on homologs. This study demonstrates the potential of mining community association panel data using new higher-density genetic marker sets combined with resampling-based genome-wide association tests to develop testable hypotheses about gene functions, identify potential pleiotropic effects of natural genetic variants, and study genotype-by-environment interaction.
Affiliation(s)
- Ravi V Mural
  - Center for Plant Science Innovation, University of Nebraska-Lincoln, Lincoln, NE 68588, USA
  - Department of Agronomy and Horticulture, University of Nebraska-Lincoln, Lincoln, NE 68588, USA
- Guangchao Sun
  - Center for Plant Science Innovation, University of Nebraska-Lincoln, Lincoln, NE 68588, USA
  - Department of Agronomy and Horticulture, University of Nebraska-Lincoln, Lincoln, NE 68588, USA
- Marcin Grzybowski
  - Center for Plant Science Innovation, University of Nebraska-Lincoln, Lincoln, NE 68588, USA
  - Department of Agronomy and Horticulture, University of Nebraska-Lincoln, Lincoln, NE 68588, USA
- Michael C Tross
  - Center for Plant Science Innovation, University of Nebraska-Lincoln, Lincoln, NE 68588, USA
  - Department of Agronomy and Horticulture, University of Nebraska-Lincoln, Lincoln, NE 68588, USA
- Hongyu Jin
  - Center for Plant Science Innovation, University of Nebraska-Lincoln, Lincoln, NE 68588, USA
  - Department of Agronomy and Horticulture, University of Nebraska-Lincoln, Lincoln, NE 68588, USA
- Christine Smith
  - Center for Plant Science Innovation, University of Nebraska-Lincoln, Lincoln, NE 68588, USA
- Linsey Newton
  - Department of Plant Soil and Microbial Sciences, Michigan State University, East Lansing, MI 48824, USA
- Carson M Andorf
  - USDA-ARS, Corn Insects and Crop Genetics Research Unit, Ames, IA 50010, USA
  - Department of Computer Science, Iowa State University, Ames, IA 50011, USA
- Addie M Thompson
  - Department of Plant Soil and Microbial Sciences, Michigan State University, East Lansing, MI 48824, USA
- Brandi Sigmon
  - Department of Plant Pathology, University of Nebraska-Lincoln, Lincoln, NE 68588, USA
- James C Schnable
  - Center for Plant Science Innovation, University of Nebraska-Lincoln, Lincoln, NE 68588, USA
  - Department of Agronomy and Horticulture, University of Nebraska-Lincoln, Lincoln, NE 68588, USA
6
Woodhouse MR, Sen S, Schott D, Portwood JL, Freeling M, Walley JW, Andorf CM, Schnable JC. qTeller: a tool for comparative multi-genomic gene expression analysis. Bioinformatics 2021; 38:236-242. [PMID: 34406385 DOI: 10.1093/bioinformatics/btab604] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 03/18/2021] [Revised: 07/23/2021] [Accepted: 08/17/2021] [Indexed: 02/03/2023] Open
Abstract
MOTIVATION Over the last decade, RNA-Seq whole-genome sequencing has become a widely used method for measuring and understanding transcriptome-level changes in gene expression. Since RNA-Seq is relatively inexpensive, it can be used on multiple genomes to evaluate gene expression across many different conditions, tissues and cell types. Although many tools exist to map and compare RNA-Seq at the genomics level, few web-based tools are dedicated to making data generated for individual genomic analysis accessible and reusable at a gene-level scale for comparative analysis between genes, across different genomes and meta-analyses. RESULTS To address this challenge, we revamped the comparative gene expression tool qTeller to take advantage of the growing number of public RNA-Seq datasets. qTeller allows users to evaluate gene expression data in a defined genomic interval and also perform two-gene comparisons across multiple user-chosen tissues. Though previously unpublished, qTeller has been cited extensively in the scientific literature, demonstrating its importance to researchers. Our new version of qTeller now supports multiple genomes for intergenomic comparisons, and includes capabilities for both mRNA and protein abundance datasets. Other new features include support for additional data formats, modernized interface and back-end database and an optimized framework for adoption by other organisms' databases. AVAILABILITY AND IMPLEMENTATION The source code for qTeller is open-source and available through GitHub (https://github.com/Maize-Genetics-and-Genomics-Database/qTeller). A maize instance of qTeller is available at the Maize Genetics and Genomics database (MaizeGDB) (https://qteller.maizegdb.org/), where we have mapped over 200 unique datasets from GenBank across 27 maize genomes. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Shatabdi Sen
  - Department of Plant Pathology & Microbiology, Iowa State University, Ames, IA 50011, USA
- David Schott
  - Department of Computer Science, Iowa State University, Ames, IA 50011, USA
- John L Portwood
  - USDA-ARS, Corn Insects and Crop Genetics Research Unit, Ames, IA 50011, USA
- Michael Freeling
  - Department of Plant & Microbial Biology, University of California, Berkeley, Berkeley, CA 94720, USA
- Justin W Walley
  - Department of Plant Pathology & Microbiology, Iowa State University, Ames, IA 50011, USA
- Carson M Andorf
  - USDA-ARS, Corn Insects and Crop Genetics Research Unit, Ames, IA 50011, USA
  - Department of Computer Science, Iowa State University, Ames, IA 50011, USA
- James C Schnable
  - Center for Plant Science Innovation & Department of Agronomy and Horticulture, University of Nebraska-Lincoln, Lincoln, NE 68588, USA
7
Woodhouse MR, Cannon EK, Portwood JL, Harper LC, Gardiner JM, Schaeffer ML, Andorf CM. A pan-genomic approach to genome databases using maize as a model system. BMC Plant Biol 2021; 21:385. [PMID: 34416864 PMCID: PMC8377966 DOI: 10.1186/s12870-021-03173-5] [Citation(s) in RCA: 57] [Impact Index Per Article: 19.0] [Received: 04/28/2021] [Accepted: 08/11/2021] [Indexed: 05/21/2023]
Abstract
Research in the past decade has demonstrated that a single reference genome is not representative of a species' diversity. MaizeGDB introduces a pan-genomic approach to hosting genomic data, leveraging the large number of diverse maize genomes and their associated datasets to quickly and efficiently connect genomes, gene models, expression, epigenome, sequence variation, structural variation, transposable elements, and diversity data across genomes so that researchers can easily track the structural and functional differences of a locus and its orthologs across maize. We believe our framework is unique and provides a template for any genomic database poised to host large-scale pan-genomic data.
Affiliation(s)
- Ethalinda K Cannon
  - Corn Insects and Crop Genetics Research Unit, USDA-ARS, Ames, IA, 50011, USA
- John L Portwood
  - Corn Insects and Crop Genetics Research Unit, USDA-ARS, Ames, IA, 50011, USA
- Lisa C Harper
  - Corn Insects and Crop Genetics Research Unit, USDA-ARS, Ames, IA, 50011, USA
- Jack M Gardiner
  - Division of Animal Sciences, University of Missouri, 65211, Columbia, MO, USA
- Mary L Schaeffer
  - Division of Plant Sciences, University of Missouri, 65211, Columbia, MO, USA
- Carson M Andorf
  - Corn Insects and Crop Genetics Research Unit, USDA-ARS, Ames, IA, 50011, USA
  - Department of Computer Science, Iowa State University, Ames, IA, 50011, USA
8
Hufford MB, Seetharam AS, Woodhouse MR, Chougule KM, Ou S, Liu J, Ricci WA, Guo T, Olson A, Qiu Y, Della Coletta R, Tittes S, Hudson AI, Marand AP, Wei S, Lu Z, Wang B, Tello-Ruiz MK, Piri RD, Wang N, Kim DW, Zeng Y, O'Connor CH, Li X, Gilbert AM, Baggs E, Krasileva KV, Portwood JL, Cannon EKS, Andorf CM, Manchanda N, Snodgrass SJ, Hufnagel DE, Jiang Q, Pedersen S, Syring ML, Kudrna DA, Llaca V, Fengler K, Schmitz RJ, Ross-Ibarra J, Yu J, Gent JI, Hirsch CN, Ware D, Dawe RK. De novo assembly, annotation, and comparative analysis of 26 diverse maize genomes. Science 2021; 373:655-662. [PMID: 34353948 PMCID: PMC8733867 DOI: 10.1126/science.abg5289] [Citation(s) in RCA: 195] [Impact Index Per Article: 65.0] [Received: 01/14/2021] [Accepted: 06/24/2021] [Indexed: 12/24/2022]
Abstract
We report de novo genome assemblies, transcriptomes, annotations, and methylomes for the 26 inbreds that serve as the founders for the maize nested association mapping population. The number of pan-genes in these diverse genomes exceeds 103,000, with approximately a third found across all genotypes. The results demonstrate that the ancient tetraploid character of maize continues to degrade by fractionation to the present day. Excellent contiguity over repeat arrays and complete annotation of centromeres revealed additional variation in major cytological landmarks. We show that combining structural variation with single-nucleotide polymorphisms can improve the power of quantitative mapping studies. We also document variation at the level of DNA methylation and demonstrate that unmethylated regions are enriched for cis-regulatory elements that contribute to phenotypic variation.
Affiliation(s)
- Matthew B Hufford
  - Department of Ecology, Evolution, and Organismal Biology, Iowa State University, Ames, IA 50011, USA
- Arun S Seetharam
  - Department of Ecology, Evolution, and Organismal Biology, Iowa State University, Ames, IA 50011, USA
  - Genome Informatics Facility, Iowa State University, Ames, IA 50011, USA
- Margaret R Woodhouse
  - USDA-ARS Corn Insects and Crop Genetics Research Unit, Iowa State University, Ames, IA 50011, USA
- Shujun Ou
  - Department of Ecology, Evolution, and Organismal Biology, Iowa State University, Ames, IA 50011, USA
- Jianing Liu
  - Department of Genetics, University of Georgia, Athens, GA 30602, USA
- William A Ricci
  - Department of Plant Biology, University of Georgia, Athens, GA 30602, USA
- Tingting Guo
  - Department of Agronomy, Iowa State University, Ames, IA 50011, USA
- Andrew Olson
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
- Yinjie Qiu
  - Department of Agronomy and Plant Genetics, University of Minnesota, St. Paul, MN 55108, USA
- Rafael Della Coletta
  - Department of Agronomy and Plant Genetics, University of Minnesota, St. Paul, MN 55108, USA
- Silas Tittes
  - Center for Population Biology, University of California, Davis, CA 95616, USA
  - Department of Evolution and Ecology, University of California, Davis, CA 95616, USA
- Asher I Hudson
  - Center for Population Biology, University of California, Davis, CA 95616, USA
  - Department of Evolution and Ecology, University of California, Davis, CA 95616, USA
- Sharon Wei
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
- Zhenyuan Lu
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
- Bo Wang
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
- Rebecca D Piri
  - Institute of Bioinformatics, University of Georgia, Athens, GA 30602, USA
- Na Wang
  - Department of Plant Biology, University of Georgia, Athens, GA 30602, USA
- Dong Won Kim
  - Department of Plant Biology, University of Georgia, Athens, GA 30602, USA
- Yibing Zeng
  - Department of Genetics, University of Georgia, Athens, GA 30602, USA
- Christine H O'Connor
  - Department of Agronomy and Plant Genetics, University of Minnesota, St. Paul, MN 55108, USA
  - Department of Ecology, Evolution, and Behavior, University of Minnesota, St. Paul, MN 55108, USA
- Xianran Li
  - Department of Agronomy, Iowa State University, Ames, IA 50011, USA
- Amanda M Gilbert
  - Department of Agronomy and Plant Genetics, University of Minnesota, St. Paul, MN 55108, USA
- Erin Baggs
  - Department of Plant and Microbial Biology, University of California, Berkeley, CA 94720, USA
- Ksenia V Krasileva
  - Department of Plant and Microbial Biology, University of California, Berkeley, CA 94720, USA
- John L Portwood
  - USDA-ARS Corn Insects and Crop Genetics Research Unit, Iowa State University, Ames, IA 50011, USA
- Ethalinda K S Cannon
  - USDA-ARS Corn Insects and Crop Genetics Research Unit, Iowa State University, Ames, IA 50011, USA
- Carson M Andorf
  - USDA-ARS Corn Insects and Crop Genetics Research Unit, Iowa State University, Ames, IA 50011, USA
- Nancy Manchanda
  - Department of Ecology, Evolution, and Organismal Biology, Iowa State University, Ames, IA 50011, USA
- Samantha J Snodgrass
  - Department of Ecology, Evolution, and Organismal Biology, Iowa State University, Ames, IA 50011, USA
- David E Hufnagel
  - Department of Ecology, Evolution, and Organismal Biology, Iowa State University, Ames, IA 50011, USA
  - Virus and Prion Research Unit, National Animal Disease Center, USDA-ARS, Ames, IA, 50010, USA
- Qiuhan Jiang
  - Department of Ecology, Evolution, and Organismal Biology, Iowa State University, Ames, IA 50011, USA
- Sarah Pedersen
  - Department of Ecology, Evolution, and Organismal Biology, Iowa State University, Ames, IA 50011, USA
- Michael L Syring
  - Department of Ecology, Evolution, and Organismal Biology, Iowa State University, Ames, IA 50011, USA
- David A Kudrna
  - Arizona Genomics Institute, School of Plant Sciences, University of Arizona, Tucson, AZ 85721, USA
- Robert J Schmitz
  - Department of Genetics, University of Georgia, Athens, GA 30602, USA
- Jeffrey Ross-Ibarra
  - Center for Population Biology, University of California, Davis, CA 95616, USA
  - Department of Evolution and Ecology, University of California, Davis, CA 95616, USA
  - Genome Center, University of California, Davis, CA 95616, USA
- Jianming Yu
  - Department of Agronomy, Iowa State University, Ames, IA 50011, USA
- Jonathan I Gent
  - Department of Plant Biology, University of Georgia, Athens, GA 30602, USA
- Candice N Hirsch
  - Department of Agronomy and Plant Genetics, University of Minnesota, St. Paul, MN 55108, USA
- Doreen Ware
  - USDA-ARS NAA Robert W. Holley Center for Agriculture and Health, Agricultural Research Service, Ithaca, NY 14853, USA
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
- R Kelly Dawe
  - Department of Ecology, Evolution, and Organismal Biology, Iowa State University, Ames, IA 50011, USA
9
Martinez CC, Li S, Woodhouse MR, Sugimoto K, Sinha NR. Spatial transcriptional signatures define margin morphogenesis along the proximal-distal and medio-lateral axes in tomato (Solanum lycopersicum) leaves. Plant Cell 2021; 33:44-65. [PMID: 33710280 PMCID: PMC8136875 DOI: 10.1093/plcell/koaa012] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 05/14/2020] [Accepted: 10/23/2020] [Indexed: 05/26/2023]
Abstract
Leaf morphogenesis involves cell division, expansion, and differentiation in the developing leaf, which take place at different rates and at different positions along the medio-lateral and proximal-distal leaf axes. The gene expression changes that control cell fate along these axes remain elusive due to difficulties in precisely isolating tissues. Here, we combined rigorous early leaf characterization, laser capture microdissection, and transcriptomic sequencing to ask how gene expression patterns regulate early leaf morphogenesis in wild-type tomato (Solanum lycopersicum) and the leaf morphogenesis mutant trifoliate. We observed transcriptional regulation of cell differentiation along the proximal-distal axis and identified molecular signatures delineating the classically defined marginal meristem/blastozone region during early leaf development. We describe the role of endoreduplication during leaf development, when and where leaf cells first achieve photosynthetic competency, and the regulation of auxin transport and signaling along the leaf axes. Knockout mutants of BLADE-ON-PETIOLE2 exhibited ectopic shoot apical meristem formation on leaves, highlighting the role of this gene in regulating margin tissue identity. We mapped gene expression signatures in specific leaf domains and evaluated the role of each domain in conferring indeterminacy and permitting blade outgrowth. Finally, we generated a global gene expression atlas of the early developing compound leaf.
Affiliation(s)
- Ciera C Martinez
  - Department of Molecular and Cellular Biology, University of California at Berkeley, Berkeley, CA 94709
  - Berkeley Institute for Data Science, University of California at Berkeley, Berkeley, CA 94709
  - Department of Plant Biology, University of California at Davis, Davis, CA 95616
- Siyu Li
  - Department of Plant Biology, University of California at Davis, Davis, CA 95616
- Keiko Sugimoto
  - RIKEN Center for Sustainable Resource Science, Tsurumi, Yokohama, 15 230-0045 Japan
- Neelima R Sinha
  - Department of Plant Biology, University of California at Davis, Davis, CA 95616
10
Liu J, Seetharam AS, Chougule K, Ou S, Swentowsky KW, Gent JI, Llaca V, Woodhouse MR, Manchanda N, Presting GG, Kudrna DA, Alabady M, Hirsch CN, Fengler KA, Ware D, Michael TP, Hufford MB, Dawe RK. Gapless assembly of maize chromosomes using long-read technologies. Genome Biol 2020; 21:121. [PMID: 32434565 PMCID: PMC7238635 DOI: 10.1186/s13059-020-02029-9] [Citation(s) in RCA: 69] [Impact Index Per Article: 17.3] [Received: 02/02/2020] [Accepted: 04/23/2020] [Indexed: 12/16/2022] Open
Abstract
Creating gapless telomere-to-telomere assemblies of complex genomes is one of the ultimate challenges in genomics. We use two independent assemblies and an optical map-based merging pipeline to produce a maize genome (B73-Ab10) composed of 63 contigs and a contig N50 of 162 Mb. This genome includes gapless assemblies of chromosome 3 (236 Mb) and chromosome 9 (162 Mb), and 53 Mb of the Ab10 meiotic drive haplotype. The data also reveal the internal structure of seven centromeres and five heterochromatic knobs, showing that the major tandem repeat arrays (CentC, knob180, and TR-1) are discontinuous and frequently interspersed with retroelements.
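The contig N50 quoted in this abstract is a standard assembly contiguity statistic: the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the assembly. A minimal sketch of the calculation follows; the function name and contig lengths are invented for illustration and are not the B73-Ab10 contigs.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    Contigs are visited from longest to shortest; the N50 is the length
    of the contig at which the running total first reaches half of the
    summed assembly length.
    """
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("n50 is undefined for an empty list of contigs")


# Toy assembly (arbitrary units): total 290, half 145.
# 80 + 70 = 150 is the first running total to reach 145, so N50 is 70.
toy_n50 = n50([80, 70, 50, 40, 30, 20])
print(toy_n50)  # 70
```

Because the statistic is weighted toward long contigs, a handful of chromosome-scale contigs can dominate it, even in an assembly that also contains many short contigs.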
Affiliation(s)
- Jianing Liu
- Department of Genetics, University of Georgia, Athens, GA, 30602, USA
| | - Arun S Seetharam
- Genome Informatics Facility, Iowa State University, Ames, IA, 50011, USA
| | - Kapeel Chougule
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 11724, USA
| | - Shujun Ou
- Department of Ecology, Evolution, and Organismal Biology, Iowa State University, Ames, IA, 50011, USA
| | - Kyle W Swentowsky
- Department of Plant Biology, University of Georgia, Athens, GA, 30602, USA
| | - Jonathan I Gent
- Department of Plant Biology, University of Georgia, Athens, GA, 30602, USA
| | - Victor Llaca
- Corteva Agriscience™, 8325 NW 62nd Ave, Johnston, IA, 50131, USA
| | | | - Nancy Manchanda
- Department of Ecology, Evolution, and Organismal Biology, Iowa State University, Ames, IA, 50011, USA
| | - Gernot G Presting
- Molecular Biosciences and Bioengineering, University of Hawaii, Honolulu, HI, 96822, USA
| | - David A Kudrna
- Arizona Genomics Institute, School of Plant Sciences, University of Arizona, Tucson, AZ, 85721, USA
| | - Magdy Alabady
- Department of Plant Biology, University of Georgia, Athens, GA, 30602, USA
- Georgia Genomics and Bioinformatics Core Laboratory, University of Georgia, Athens, GA, 30602, USA
| | - Candice N Hirsch
- Department of Agronomy and Plant Genetics, University of Minnesota, St. Paul, MN, 55108, USA
| | - Kevin A Fengler
- Corteva Agriscience™, 8325 NW 62nd Ave, Johnston, IA, 50131, USA
| | - Doreen Ware
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 11724, USA
- USDA ARS NAA Robert W. Holley Center for Agriculture and Health, Agricultural Research Service, Ithaca, NY, 14853, USA
| | - Todd P Michael
- Informatics Department, J. Craig Venter Institute, La Jolla, CA, USA
| | - Matthew B Hufford
- Department of Ecology, Evolution, and Organismal Biology, Iowa State University, Ames, IA, 50011, USA
| | - R Kelly Dawe
- Department of Genetics, University of Georgia, Athens, GA, 30602, USA.
- Department of Plant Biology, University of Georgia, Athens, GA, 30602, USA.
|
11
|
Liu J, Seetharam AS, Chougule K, Ou S, Swentowsky KW, Gent JI, Llaca V, Woodhouse MR, Manchanda N, Presting GG, Kudrna DA, Alabady M, Hirsch CN, Fengler KA, Ware D, Michael TP, Hufford MB, Dawe RK. Gapless assembly of maize chromosomes using long-read technologies. Genome Biol 2020. [PMID: 32434565 DOI: 10.1101/2020.01.14.906230v1.full] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 04/26/2023] Open
|
12
|
Portwood JL, Woodhouse MR, Cannon EK, Gardiner JM, Harper LC, Schaeffer ML, Walsh JR, Sen TZ, Cho KT, Schott DA, Braun BL, Dietze M, Dunfee B, Elsik CG, Manchanda N, Coe E, Sachs M, Stinard P, Tolbert J, Zimmerman S, Andorf CM. MaizeGDB 2018: the maize multi-genome genetics and genomics database. Nucleic Acids Res 2019; 47:D1146-D1154. [PMID: 30407532 PMCID: PMC6323944 DOI: 10.1093/nar/gky1046] [Citation(s) in RCA: 148] [Impact Index Per Article: 37.0] [Received: 09/13/2018] [Accepted: 10/16/2018] [Indexed: 01/12/2023] Open
Abstract
Since its 2015 update, MaizeGDB, the Maize Genetics and Genomics database, has expanded to support the sequenced genomes of many maize inbred lines in addition to the B73 reference genome assembly. Curation and development efforts have targeted high-quality datasets and tools to support maize trait analysis, germplasm analysis, genetic studies, and breeding. MaizeGDB hosts a wide range of data, with recent support for new data types such as genome metadata, RNA-seq, proteomics, synteny, and large-scale diversity. To improve access to and visualization of these data types, several new tools have been implemented to: access large-scale maize diversity data (SNPversity), download and compare gene expression data (qTeller), visualize pedigree data (Pedigree Viewer), link genes with phenotype images (MaizeDIG), and enable flexible user-specified queries to the MaizeGDB database (MaizeMine). MaizeGDB also continues to be the community hub for maize research, coordinating activities and providing technical support to the maize research community. Here we report the changes MaizeGDB has made within the last three years to keep pace with recent software and research advances, as well as the pan-genomic landscape that cheaper and better sequencing technologies have made possible. MaizeGDB is accessible online at https://www.maizegdb.org.
Affiliation(s)
- John L Portwood
- USDA-ARS Corn Insects and Crop Genetics Research Unit, Ames, IA 50011, USA
| | - Margaret R Woodhouse
- Department of Ecology, Evolution and Organismal Biology, Iowa State University, Ames, IA 50011, USA
| | - Ethalinda K Cannon
- USDA-ARS Corn Insects and Crop Genetics Research Unit, Ames, IA 50011, USA
| | - Jack M Gardiner
- Division of Animal Sciences, University of Missouri, Columbia, MO 65211, USA
| | - Lisa C Harper
- USDA-ARS Corn Insects and Crop Genetics Research Unit, Ames, IA 50011, USA
| | - Mary L Schaeffer
- Division of Plant Sciences, University of Missouri, Columbia, MO 65211, USA
| | - Jesse R Walsh
- USDA-ARS Corn Insects and Crop Genetics Research Unit, Ames, IA 50011, USA
| | - Taner Z Sen
- USDA-ARS Crop Improvement and Genetics Research Unit, Albany, CA 94710, USA
- Department of Genetics, Development, and Cell Biology, Iowa State University, Ames, IA 50011, USA
| | - Kyoung Tak Cho
- Department of Computer Science, Iowa State University, Ames, IA 50011, USA
| | - David A Schott
- Department of Computer Science, Iowa State University, Ames, IA 50011, USA
| | - Bremen L Braun
- USDA-ARS Corn Insects and Crop Genetics Research Unit, Ames, IA 50011, USA
| | - Miranda Dietze
- Department of Genetics, Development, and Cell Biology, Iowa State University, Ames, IA 50011, USA
| | - Brittney Dunfee
- Department of Genetics, Development, and Cell Biology, Iowa State University, Ames, IA 50011, USA
| | - Christine G Elsik
- Division of Animal Sciences, University of Missouri, Columbia, MO 65211, USA
- Division of Plant Sciences, University of Missouri, Columbia, MO 65211, USA
| | - Nancy Manchanda
- Department of Ecology, Evolution and Organismal Biology, Iowa State University, Ames, IA 50011, USA
| | - Ed Coe
- Division of Plant Sciences, University of Missouri, Columbia, MO 65211, USA
| | - Marty Sachs
- USDA/ARS/MWA Soybean/Maize Germplasm, Pathology & Genetics Research Unit, Urbana, IL, 61801, USA
| | - Philip Stinard
- USDA/ARS/MWA Soybean/Maize Germplasm, Pathology & Genetics Research Unit, Urbana, IL, 61801, USA
| | - Josh Tolbert
- USDA/ARS/MWA Soybean/Maize Germplasm, Pathology & Genetics Research Unit, Urbana, IL, 61801, USA
| | - Shane Zimmerman
- USDA/ARS/MWA Soybean/Maize Germplasm, Pathology & Genetics Research Unit, Urbana, IL, 61801, USA
| | - Carson M Andorf
- USDA-ARS Corn Insects and Crop Genetics Research Unit, Ames, IA 50011, USA
|
13
|
Manchanda N, Portwood JL, Woodhouse MR, Seetharam AS, Lawrence-Dill CJ, Andorf CM, Hufford MB. GenomeQC: a quality assessment tool for genome assemblies and gene structure annotations. BMC Genomics 2020; 21:193. [PMID: 32122303 PMCID: PMC7053122 DOI: 10.1186/s12864-020-6568-2] [Citation(s) in RCA: 25] [Impact Index Per Article: 6.3] [Received: 10/31/2019] [Accepted: 02/07/2020] [Indexed: 11/28/2022] Open
Abstract
Background: Genome assemblies are foundational for understanding the biology of a species. They provide a physical framework for mapping additional sequences, thereby enabling characterization of, for example, genomic diversity and differences in gene expression across individuals and tissue types. Quality metrics for genome assemblies gauge both the completeness and contiguity of an assembly and help provide confidence in downstream biological insights. To compare quality across multiple assemblies, a set of common metrics is typically calculated and then compared to one or more gold standard reference genomes. While several tools exist for calculating individual metrics, applications providing comprehensive evaluations of multiple assembly features are, perhaps surprisingly, lacking. Here, we describe a new toolkit that integrates multiple metrics to characterize both assembly and gene annotation quality in a way that enables comparison across multiple assemblies and assembly types. Results: Our application, named GenomeQC, is an easy-to-use and interactive web framework that integrates various quantitative measures to characterize genome assemblies and annotations. GenomeQC provides researchers with a comprehensive summary of these statistics and allows for benchmarking against gold standard reference assemblies. Conclusions: The GenomeQC web application is implemented in R/Shiny version 1.5.9 and Python 3.6 and is freely available at https://genomeqc.maizegdb.org/ under the GPL license. All source code and a containerized version of the GenomeQC pipeline are available in the GitHub repository https://github.com/HuffordLab/GenomeQC.
Affiliation(s)
- Nancy Manchanda
- Department of Ecology, Evolution and Organismal Biology, Iowa State University, Ames, IA, 50011, USA
| | - John L Portwood
- USDA-ARS Corn Insects and Crop Genetics Research Unit, Ames, IA, 50011, USA
| | | | - Arun S Seetharam
- Genome Informatics Facility, Iowa State University, Ames, IA, 50011, USA
| | - Carolyn J Lawrence-Dill
- Department of Genetics, Development and Cell Biology, Iowa State University, Ames, IA, 50011, USA
- Department of Agronomy, Iowa State University, Ames, IA, 50011, USA
| | - Carson M Andorf
- USDA-ARS Corn Insects and Crop Genetics Research Unit, Ames, IA, 50011, USA
| | - Matthew B Hufford
- Department of Ecology, Evolution and Organismal Biology, Iowa State University, Ames, IA, 50011, USA.
|
14
|
Blake VC, Woodhouse MR, Lazo GR, Odell SG, Wight CP, Tinker NA, Wang Y, Gu YQ, Birkett CL, Jannink JL, Matthews DE, Hane DL, Michel SL, Yao E, Sen TZ. GrainGenes: centralized small grain resources and digital platform for geneticists and breeders. Database (Oxford) 2019; 2019:5513438. [PMID: 31210272 DOI: 10.1093/database/baz065] [Citation(s) in RCA: 31] [Impact Index Per Article: 7.8] [Received: 11/07/2018] [Revised: 04/18/2019] [Accepted: 04/22/2019] [Indexed: 11/13/2022]
Abstract
GrainGenes (https://wheat.pw.usda.gov or https://graingenes.org) is an international centralized repository for curated, peer-reviewed datasets useful to researchers working on wheat, barley, rye and oat. GrainGenes manages genomic, genetic, germplasm and phenotypic datasets through a dynamically generated web interface for facilitated data discovery. Since 1992, GrainGenes has served geneticists and breeders in both the public and private sectors on six continents. Recently, several new datasets were curated into the database along with new tools for analysis. The GrainGenes homepage was enhanced by making it more visually intuitive and by adding links to commonly used pages. Several genome assemblies and genomic tracks are displayed through the genome browsers at GrainGenes, including the Triticum aestivum (bread wheat) cv. 'Chinese Spring' IWGSC RefSeq v1.0 genome assembly, the Aegilops tauschii (D genome progenitor) Aet v4.0 genome assembly, the Triticum turgidum ssp. dicoccoides (wild emmer wheat) cv. 'Zavitan' WEWSeq v.1.0 genome assembly, a T. aestivum (bread wheat) pangenome, the Hordeum vulgare (barley) cv. 'Morex' IBSC genome assembly, the Secale cereale (rye) select 'Lo7' assembly, a partial hexaploid Avena sativa (oat) assembly and the Triticum durum cv. 'Svevo' (durum wheat) RefSeq Release 1.0 assembly. New genetic maps and markers were added and can be displayed through CMAP. Quantitative trait loci, genetic maps and genes from the Wheat Gene Catalogue are indexed and linked through the Wheat Information System (WheatIS) portal. Training videos were created to help users query and reach the data they need. GSP (Genome Specific Primers) and PIECE2 (Plant Intron Exon Comparison and Evolution) tools were implemented and are available to use. As more small grains reference sequences become available, GrainGenes will play an increasingly vital role in helping researchers improve crops.
Affiliation(s)
- Victoria C Blake
- Western Regional Research Center, Crop Improvement and Genetics Research Unit, United States Department of Agriculture-Agricultural Research Service, Albany, CA, USA
| | - Margaret R Woodhouse
- Western Regional Research Center, Crop Improvement and Genetics Research Unit, United States Department of Agriculture-Agricultural Research Service, Albany, CA, USA
| | - Gerard R Lazo
- Western Regional Research Center, Crop Improvement and Genetics Research Unit, United States Department of Agriculture-Agricultural Research Service, Albany, CA, USA
| | - Sarah G Odell
- Western Regional Research Center, Crop Improvement and Genetics Research Unit, United States Department of Agriculture-Agricultural Research Service, Albany, CA, USA
- Department of Plant Sciences, University of California, Davis, CA, USA
| | - Charlene P Wight
- Ottawa Research and Development Centre, Agriculture and Agri-Food Canada, Ottawa, ON, Canada
| | - Nicholas A Tinker
- Ottawa Research and Development Centre, Agriculture and Agri-Food Canada, Ottawa, ON, Canada
| | - Yi Wang
- Western Regional Research Center, Crop Improvement and Genetics Research Unit, United States Department of Agriculture-Agricultural Research Service, Albany, CA, USA
| | - Yong Q Gu
- Western Regional Research Center, Crop Improvement and Genetics Research Unit, United States Department of Agriculture-Agricultural Research Service, Albany, CA, USA
| | - Clay L Birkett
- Robert Holley Center, United States Department of Agriculture-Agricultural Research Service, Ithaca, NY, USA
| | - Jean-Luc Jannink
- Robert Holley Center, United States Department of Agriculture-Agricultural Research Service, Ithaca, NY, USA
- Section of Plant Breeding and Genetics, Cornell University, Ithaca, NY, USA
| | - Dave E Matthews
- Robert Holley Center, United States Department of Agriculture-Agricultural Research Service, Ithaca, NY, USA
| | - David L Hane
- Western Regional Research Center, Crop Improvement and Genetics Research Unit, United States Department of Agriculture-Agricultural Research Service, Albany, CA, USA
| | - Steve L Michel
- Western Regional Research Center, Crop Improvement and Genetics Research Unit, United States Department of Agriculture-Agricultural Research Service, Albany, CA, USA
| | - Eric Yao
- Western Regional Research Center, Crop Improvement and Genetics Research Unit, United States Department of Agriculture-Agricultural Research Service, Albany, CA, USA
- Department of Bioengineering, University of California, Berkeley, Berkeley, CA, USA
| | - Taner Z Sen
- Western Regional Research Center, Crop Improvement and Genetics Research Unit, United States Department of Agriculture-Agricultural Research Service, Albany, CA, USA
- Department of Genetics, Development, and Cell Biology, Iowa State University, Ames, IA, USA
|
15
|
Walsh JR, Woodhouse MR, Andorf CM, Sen TZ. Tissue-specific gene expression and protein abundance patterns are associated with fractionation bias in maize. BMC Plant Biol 2020; 20:4. [PMID: 31900107 PMCID: PMC6942271 DOI: 10.1186/s12870-019-2218-8] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 02/09/2019] [Accepted: 12/24/2019] [Indexed: 05/26/2023]
Abstract
BACKGROUND: Maize experienced a whole-genome duplication event approximately 5 to 12 million years ago. Because this event occurred after speciation from sorghum, the pre-duplication subgenomes can be partially reconstructed by mapping syntenic regions to the sorghum chromosomes. During evolution, maize has undergone uneven gene loss between its ancient subgenomes. Fractionation and divergence between these genomes continue today, constantly changing genetic make-up and phenotypes and influencing agronomic traits. RESULTS: Here we regenerate the subgenome reconstructions for the most recent maize reference genome assembly. Based on both expression and abundance data for homeologous gene pairs across multiple tissues, we observed functional divergence of genes across subgenomes. Although the genes in the larger maize subgenome are often expressed more highly than their homeologs in the smaller subgenome, we observed cases where homeolog expression dominance switches in different tissues. We demonstrate for the first time that protein abundances are higher in the larger subgenome, but they also show tissue-specific dominance, a pattern similar to RNA expression dominance. We also find that pollen expression is uniquely decoupled from protein abundance. CONCLUSION: Our study shows that the larger subgenome has a greater range of functional assignments and that there is less overlap between the subgenomes in terms of gene functions than would be suggested by similar patterns of gene expression and protein abundance. Our study also revealed that some reactions are catalyzed uniquely by the larger and smaller subgenomes. The tissue-specific, nonequivalent expression-level dominance pattern observed here implies a change in regulatory control which favors differentiated selective pressure on the retained duplicates leading to eventual change in gene functions.
Affiliation(s)
- Jesse R Walsh
- U.S. Department of Agriculture, Agricultural Research Service, Corn Insects and Crop Genetics Research Unit, Ames, IA, 50011, USA
| | - Margaret R Woodhouse
- Department of Ecology, Evolution, and Organismal Biology, Iowa State University, Ames, IA, 50011, USA
- U.S. Department of Agriculture, Agricultural Research Service, Western Regional Research Center, Crop Improvement and Genetics Research Unit, Albany, CA, 94710, USA
| | - Carson M Andorf
- U.S. Department of Agriculture, Agricultural Research Service, Corn Insects and Crop Genetics Research Unit, Ames, IA, 50011, USA
- Department of Computer Science, Iowa State University, Ames, IA, 50011, USA
| | - Taner Z Sen
- U.S. Department of Agriculture, Agricultural Research Service, Western Regional Research Center, Crop Improvement and Genetics Research Unit, Albany, CA, 94710, USA.
- Department of Genetics, Development, and Cell Biology, Iowa State University, Ames, IA, 50011, USA.
|
16
|
Abstract
The selection of desirable traits in crops during domestication has been well studied. Many crops share a suite of modified phenotypic characteristics collectively known as the domestication syndrome. In this sense, crops have convergently evolved. Previous work has demonstrated that, at least in some instances, convergence for domestication traits has been achieved through parallel molecular means. However, both demography and selection during domestication may have placed limits on evolutionary potential and reduced opportunities for convergent adaptation during post-domestication migration to new environments. Here we review current knowledge regarding trait convergence in the cereal grasses and consider whether the complexity and dynamism of cereal genomes (e.g., transposable elements, polyploidy, genome size) helped these species overcome potential limitations owing to domestication and achieve broad subsequent adaptation, in many cases through parallel means. This article is part of the theme issue 'Convergent evolution in the genomics era: new insights and directions'.
Affiliation(s)
- M R Woodhouse
- Iowa State University, Ecology, Evolution, and Organismal Biology, Ames, IA 50011, USA
| | - M B Hufford
- Iowa State University, Ecology, Evolution, and Organismal Biology, Ames, IA 50011, USA
|
17
|
Cheng F, Sun C, Wu J, Schnable J, Woodhouse MR, Liang J, Cai C, Freeling M, Wang X. Epigenetic regulation of subgenome dominance following whole genome triplication in Brassica rapa. New Phytol 2016; 211:288-99. [PMID: 26871271 DOI: 10.1111/nph.13884] [Citation(s) in RCA: 64] [Impact Index Per Article: 8.0] [Received: 09/14/2015] [Accepted: 12/28/2015] [Indexed: 05/10/2023]
Abstract
Subgenome dominance is an important phenomenon observed in allopolyploids after whole genome duplication, in which one subgenome retains more genes as well as contributes more to the higher expressing gene copy of paralogous genes. To dissect the mechanism of subgenome dominance, we systematically investigated the relationships of gene expression, transposable element (TE) distribution and small RNA targeting, relating to the multicopy paralogous genes generated from whole genome triplication in Brassica rapa. The subgenome dominance was found to be regulated by a relatively stable factor established previously, then inherited by and shared among B. rapa varieties. In addition, we found a biased distribution of TEs between flanking regions of paralogous genes. Furthermore, the 24-nt small RNAs target TEs and are negatively correlated to the dominant expression of individual paralogous gene pairs. The biased distribution of TEs among subgenomes and the targeting of 24-nt small RNAs together produce the dominant expression phenomenon at a subgenome scale. Based on these findings, we propose a bucket hypothesis to illustrate subgenome dominance and hybrid vigor. Our findings and hypothesis are valuable for the evolutionary study of polyploids, and may shed light on studies of hybrid vigor, which is common to most species.
Affiliation(s)
- Feng Cheng
- Institute of Vegetables and Flowers, Chinese Academy of Agricultural Sciences, Beijing, 100081, China
| | - Chao Sun
- Institute of Vegetables and Flowers, Chinese Academy of Agricultural Sciences, Beijing, 100081, China
| | - Jian Wu
- Institute of Vegetables and Flowers, Chinese Academy of Agricultural Sciences, Beijing, 100081, China
| | - James Schnable
- Department of Agronomy and Horticulture, University of Nebraska, Lincoln, NE, 68588, USA
| | - Margaret R Woodhouse
- Department of Plant and Microbial Biology, University of California, Berkeley, CA, 94720, USA
| | - Jianli Liang
- Institute of Vegetables and Flowers, Chinese Academy of Agricultural Sciences, Beijing, 100081, China
| | - Chengcheng Cai
- Institute of Vegetables and Flowers, Chinese Academy of Agricultural Sciences, Beijing, 100081, China
| | - Michael Freeling
- Department of Plant and Microbial Biology, University of California, Berkeley, CA, 94720, USA
| | - Xiaowu Wang
- Institute of Vegetables and Flowers, Chinese Academy of Agricultural Sciences, Beijing, 100081, China
|
18
|
Freeling M, Woodhouse MR, Subramaniam S, Turco G, Lisch D, Schnable JC. Fractionation mutagenesis and similar consequences of mechanisms removing dispensable or less-expressed DNA in plants. Curr Opin Plant Biol 2012; 15:131-9. [PMID: 22341793 DOI: 10.1016/j.pbi.2012.01.015] [Citation(s) in RCA: 123] [Impact Index Per Article: 10.3] [Received: 10/25/2011] [Revised: 12/07/2011] [Accepted: 01/21/2012] [Indexed: 05/06/2023]
Abstract
Unlike in mammals, plants rapidly delete functionless, nonrepetitive DNA from their genomes. Following paleopolyploidies, duplicate genes are deleted by intrachromosomal recombination. This may explain how flowering plants have survived multiple whole genome duplications. Genes are disproportionately lost from one parental subgenome, the subgenome that is less expressed in the polyploid. The origin of this unbalanced expression between genomes remains unknown. The consequences of the tradeoffs between transposon repression and gene expression represent one potential explanation of genome dominance. If so, the same mechanisms may act in heterosis: genome dominance is like inbreeding depression. Regulatory DNA deletion following polyploidy, combined with abundant RNA-seq expression datasets, is being used to generate testable hypotheses regarding the function of specific cis-regulatory sequences.
Affiliation(s)
- Michael Freeling
- Department of Plant and Microbial Biology, University of California, Berkeley, CA 94720, USA.
|
19
|
Woodhouse MR, Tang H, Freeling M. Different gene families in Arabidopsis thaliana transposed in different epochs and at different frequencies throughout the rosids. Plant Cell 2011; 23:4241-53. [PMID: 22180627 PMCID: PMC3269863 DOI: 10.1105/tpc.111.093567] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.2] [Indexed: 05/05/2023]
Abstract
Certain types of gene families, such as those encoding most families of transcription factors, maintain their chromosomal syntenic positions throughout angiosperm evolutionary time. Other nonsyntenic gene families are prone to deletion, tandem duplication, and transposition. Here, we describe the chromosomal positional history of all genes in Arabidopsis thaliana throughout the rosid superorder. We introduce a public database where researchers can look up the positional history of their favorite A. thaliana gene or gene family. Finally, we show that specific gene families transposed at specific points in evolutionary time, particularly after whole-genome duplication events in the Brassicales, and suggest that genes in mobile gene families are under different selection pressure than syntenic genes.
Affiliation(s)
- Margaret R Woodhouse
- Department of Plant and Microbial Biology, University of California, Berkeley, California 94720, USA.
|
20
|
Woodhouse MR, Schnable JC, Pedersen BS, Lyons E, Lisch D, Subramaniam S, Freeling M. Following tetraploidy in maize, a short deletion mechanism removed genes preferentially from one of the two homologs. PLoS Biol 2010; 8:e1000409. [PMID: 20613864 PMCID: PMC2893956 DOI: 10.1371/journal.pbio.1000409] [Citation(s) in RCA: 195] [Impact Index Per Article: 13.9] [Received: 12/01/2009] [Accepted: 05/20/2010] [Indexed: 12/02/2022] Open
Abstract
Following genome duplication and selfish DNA expansion, maize used a heretofore unknown mechanism to shed redundant genes and functionless DNA with bias toward one of the parental genomes.
Previous work in Arabidopsis showed that after an ancient tetraploidy event, genes were preferentially removed from one of the two homeologs, a process known as fractionation. The mechanism of fractionation is unknown. We sought to determine whether such preferential, or biased, fractionation exists in maize and, if so, whether a specific mechanism could be implicated in this process. We studied the process of fractionation using two recently sequenced grass species: sorghum and maize. The maize lineage has experienced a tetraploidy since its divergence from sorghum approximately 12 million years ago, and fragments of many knocked-out genes retain enough sequence similarity to be easily identifiable. Using sorghum exons as the query sequence, we studied the fate of both orthologous genes in maize following the maize tetraploidy. We show that genes are predominantly lost, not relocated, and that single-gene loss by deletion is the rule. Based on comparisons with orthologous sorghum and rice genes, we also infer that the sequences present before the deletion events were flanked by short direct repeats, a signature of intra-chromosomal recombination. Evidence of this deletion mechanism is found 2.3 times more frequently on one of the maize homeologs, consistent with earlier observations of biased fractionation. The over-fractionated homeolog is also a greater than 3-fold better target for transposon removal, but does not have an observably higher synonymous base substitution rate, nor could we find differentially placed methylation domains. We conclude that fractionation is indeed biased in maize and that intra-chromosomal or possibly a similar illegitimate recombination is the primary mechanism by which fractionation occurs. The mechanism of intra-chromosomal recombination explains the observed bias in both gene and transposon loss in the maize lineage. The existence of fractionation bias demonstrates that the frequency of deletion is modulated. Among the evolutionary benefits of this deletion/fractionation mechanism is bulk DNA removal and the generation of novel combinations of regulatory sequences and coding regions.
All genomes can accumulate dispensable DNA in the form of duplications of individual genes or even partial or whole genome duplications. Genomes also can accumulate selfish DNA elements. Duplication events specifically are often followed by extensive gene loss. The maize genome is particularly extreme, having become tetraploid 10 million years ago and played host to massive transposon amplifications. We compared the genome of sorghum (which is homologous to the pre-tetraploid maize genome) with the two identifiable parental genomes retained in maize. The two maize genomes differ greatly: one of the parental genomes has lost 2.3 times more genes than the other, and the selfish DNA regions between genes were even more frequently lost, suggesting maize can distinguish between the parental genomes present in the original tetraploid. We show that genes are actually lost, not simply relocated. Deletions were rarely longer than a single gene, and occurred between repeated DNA sequences, suggesting mis-recombination as a mechanism of gene removal. We hypothesize an epigenetic mechanism of genome distinction to account for the selective loss. To the extent that the rate of base substitutions tracks time, we neither support nor refute claims of maize allotetraploidy. Finally, we explain why it makes sense that purifying selection in mammals does not operate at all like the gene and genome deletion program we describe here.
Affiliation(s)
- Margaret R. Woodhouse
- Department of Plant and Microbial Biology, University of California Berkeley, Berkeley, California, United States of America
- James C. Schnable
- Department of Plant and Microbial Biology, University of California Berkeley, Berkeley, California, United States of America
- Brent S. Pedersen
- Department of Plant and Microbial Biology, University of California Berkeley, Berkeley, California, United States of America
- Eric Lyons
- Department of Plant and Microbial Biology, University of California Berkeley, Berkeley, California, United States of America
- Damon Lisch
- Department of Plant and Microbial Biology, University of California Berkeley, Berkeley, California, United States of America
- Shabarinath Subramaniam
- Department of Plant and Microbial Biology, University of California Berkeley, Berkeley, California, United States of America
- Michael Freeling
- Department of Plant and Microbial Biology, University of California Berkeley, Berkeley, California, United States of America