1
Koo HJ, Pan W. Are trait-associated genes clustered together in a gene network? Genet Epidemiol 2024. [PMID: 38472164 DOI: 10.1002/gepi.22557] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/11/2023] [Revised: 01/25/2024] [Accepted: 02/23/2024] [Indexed: 03/14/2024]
Abstract
Genome-wide association studies (GWAS) have provided an abundance of information about the genetic variants and their loci that are associated with complex traits and diseases. However, due to linkage disequilibrium (LD) and noncoding regions of loci, it remains a challenge to pinpoint the causal genes. Gene network-based approaches, paired with network diffusion methods, have been proposed to prioritize causal genes and to boost statistical power in GWAS based on the assumption that trait-associated genes are clustered in a gene network. Due to the difficulty in mapping trait-associated variants to genes in GWAS, this assumption has never been directly or rigorously tested empirically. On the other hand, whole exome sequencing (WES) data focuses on the protein-coding regions, directly identifying trait-associated genes. In this study, we tested the assumption by leveraging the recently available exome-based association statistics from the UK Biobank WES data along with two types of networks. We found that almost all trait-associated genes were significantly more proximal to each other than randomly selected genes within both networks. These results support the assumption that trait-associated genes are clustered in gene networks, which can be further leveraged to boost the power of GWAS such as by introducing less stringent p value thresholds.
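The comparison described in this abstract (trait-associated genes versus randomly selected genes in a network) can be illustrated with a small permutation test. The abstract does not state which proximity measure or null model the authors used, so the sketch below makes its own assumptions: proximity is the mean pairwise shortest-path distance, random gene sets are drawn uniformly from all nodes, and the 40-node path network and the function names are hypothetical.

```python
import random
from collections import deque


def bfs_distances(adj, source):
    """Unweighted shortest-path distances from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nbr in adj[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return dist


def mean_pairwise_distance(adj, genes):
    """Mean shortest-path distance over all connected pairs in a gene set."""
    genes = list(genes)
    total, pairs = 0, 0
    for i, gene in enumerate(genes):
        dist = bfs_distances(adj, gene)
        for other in genes[i + 1:]:
            if other in dist:  # skip pairs in different components
                total += dist[other]
                pairs += 1
    return total / pairs if pairs else float("inf")


def proximity_p_value(adj, gene_set, n_perm=500, seed=0):
    """Empirical p value: how often a random gene set of the same size
    is at least as tightly clustered as the observed one (assumed null)."""
    rng = random.Random(seed)
    nodes = sorted(adj)
    observed = mean_pairwise_distance(adj, gene_set)
    k = len(gene_set)
    hits = sum(
        mean_pairwise_distance(adj, rng.sample(nodes, k)) <= observed
        for _ in range(n_perm)
    )
    return observed, (hits + 1) / (n_perm + 1)


# Toy network (hypothetical): a path 0-1-2-...-39.
toy_adj = {i: set() for i in range(40)}
for i in range(39):
    toy_adj[i].add(i + 1)
    toy_adj[i + 1].add(i)

# Four adjacent nodes stand in for a tightly clustered trait gene set.
observed, p_value = proximity_p_value(toy_adj, [0, 1, 2, 3])
```

On real data, a degree-matched null (sampling random genes with similar connectivity) is a common refinement, since highly connected genes are closer to everything; whether the study did this is not stated in the abstract.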
Affiliation(s)
- Hyun Jung Koo
  - School of Statistics, University of Minnesota, Minneapolis, Minnesota, USA
  - Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, Minnesota, USA
- Wei Pan
  - Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, Minnesota, USA
2
Rapoport R, Greenberg A, Yakhini Z, Simon I. A Cyclic Permutation Approach to Removing Spatial Dependency between Clustered Gene Ontology Terms. Biology 2024; 13:175. [PMID: 38534445 DOI: 10.3390/biology13030175] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/21/2024] [Revised: 03/04/2024] [Accepted: 03/05/2024] [Indexed: 03/28/2024]
Abstract
Traditional gene set enrichment analysis falters when applied to large genomic domains, where neighboring genes often share functions. This spatial dependency creates misleading enrichments, mistaking mere physical proximity for genuine biological connections. Here we present Spatial Adjusted Gene Ontology (SAGO), a novel cyclic permutation-based approach, to tackle this challenge. SAGO separates enrichments due to spatial proximity from genuine biological links by incorporating the genes' spatial arrangement into the analysis. We applied SAGO to various datasets in which the identified genomic intervals are large, including replication timing domains, large H3K9me3 and H3K27me3 domains, HiC compartments and lamina-associated domains (LADs). Intriguingly, applying SAGO to prostate cancer samples with large copy number alteration (CNA) domains eliminated most of the enriched GO terms, thus helping to accurately identify biologically relevant gene sets linked to oncogenic processes, free from spatial bias.
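The idea of a cyclic permutation null can be sketched generically. The abstract does not describe SAGO's actual procedure, so the following is not that method: it is a plain circular-shift test under the assumption that genes are given in genome order as two 0/1 vectors (membership in the selected domains, and annotation with one term). All names and the toy data are hypothetical.

```python
import random


def cyclic_shift_p_value(selected, annotated, n_perm=1000, seed=0):
    """Empirical enrichment p value under a circular-shift null.

    selected, annotated: equal-length 0/1 lists in genome order.
    Rotating the annotation vector keeps runs of neighboring annotated
    genes intact, so spatial clustering is preserved under the null.
    """
    n = len(selected)
    rng = random.Random(seed)

    def overlap(offset):
        return sum(
            s * annotated[(i + offset) % n] for i, s in enumerate(selected)
        )

    observed = overlap(0)
    hits = sum(overlap(rng.randrange(1, n)) >= observed for _ in range(n_perm))
    return observed, (hits + 1) / (n_perm + 1)


# Toy genome (hypothetical): 100 genes, one selected domain of 20 genes,
# and a single block of 10 annotated neighbors lying inside that domain.
selected = [1 if 10 <= i < 30 else 0 for i in range(100)]
annotated = [1 if 10 <= i < 20 else 0 for i in range(100)]

observed_overlap, p_cyclic = cyclic_shift_p_value(selected, annotated, n_perm=2000)
```

In this toy case all ten annotated genes fall in the selected domain, which a test assuming independent genes would score as strongly enriched. Because the ten genes form one block, 10 of the 99 non-zero shifts reproduce the full overlap, so the circular-shift p value is only about 0.1.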
Affiliation(s)
- Rachel Rapoport
  - Microbiology and Molecular Genetics, Hebrew University of Jerusalem-IMRIC, Jerusalem 9112102, Israel
- Avraham Greenberg
  - Microbiology and Molecular Genetics, Hebrew University of Jerusalem-IMRIC, Jerusalem 9112102, Israel
- Zohar Yakhini
  - Efi Arazi School of Computer Science, Reichman University (IDC Herzliya), Herzliya 4610101, Israel
  - Department of Computer Science, Technion-Israel Institute of Technology, Haifa 3200003, Israel
- Itamar Simon
  - Microbiology and Molecular Genetics, Hebrew University of Jerusalem-IMRIC, Jerusalem 9112102, Israel
3
Kimura A, Go AC, Markow T, Ranz JM. Evidence of Nonrandom Patterns of Functional Chromosome Organization in Danaus plexippus. Genome Biol Evol 2024; 16:evae054. [PMID: 38488057 PMCID: PMC10972686 DOI: 10.1093/gbe/evae054] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 03/13/2024] [Indexed: 05/01/2024] Open
Abstract
Our understanding of the interplay between gene functionality and gene arrangement at different chromosome scales relies on a few Diptera and the honeybee, species with quality reference genome assemblies, accurate gene annotations, and abundant transcriptome data. Using recently generated 'omic resources in the monarch butterfly Danaus plexippus, a species with many more and smaller chromosomes relative to Drosophila species and the honeybee, we examined the organization of genes preferentially expressed at broadly defined developmental stages (larva, pupa, adult males, and adult females) at both fine and whole-chromosome scales. We found that developmental stage-regulated genes do not form more clusters, but do form larger clusters, than expected by chance, a pattern consistent across the gene categories examined. Notably, out of the 30 chromosomes in the monarch genome, 12 of them, plus the fraction of the chromosome Z that corresponds to the ancestral Z in other Lepidoptera, were found enriched for developmental stage-regulated genes. These two levels of nonrandom gene organization are not independent as enriched chromosomes for developmental stage-regulated genes tend to harbor disproportionately large clusters of these genes. Further, although paralogous genes were overrepresented in gene clusters, their presence is not enough to explain two-thirds of the documented cases of whole-chromosome enrichment. The composition of the largest clusters often included paralogs from more than one multigene family as well as unrelated single-copy genes. Our results reveal intriguing patterns at the whole-chromosome scale in D. plexippus while shedding light on the interplay between gene expression and chromosome organization beyond Diptera and Hymenoptera.
Affiliation(s)
- Ashlyn Kimura
  - Department of Ecology and Evolutionary Biology, University of California Irvine, Irvine, CA 92647, USA
- Alwyn C Go
  - Department of Biology, University of Winnipeg, Winnipeg, MB R3B 2E9, Canada
- Therese Markow
  - Unidad de Genómica Avanzada (Langebio), CINVESTAV, Irapuato, GTO 36824, México
  - Section of Cell and Developmental Biology, Division of Biological Sciences, University of California San Diego, La Jolla, CA 92093, USA
- José M Ranz
  - Department of Ecology and Evolutionary Biology, University of California Irvine, Irvine, CA 92647, USA
4
de Boer CG, Taipale J. Hold out the genome: a roadmap to solving the cis-regulatory code. Nature 2024; 625:41-50. [PMID: 38093018 DOI: 10.1038/s41586-023-06661-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/17/2023] [Accepted: 09/20/2023] [Indexed: 01/05/2024]
Abstract
Gene expression is regulated by transcription factors that work together to read cis-regulatory DNA sequences. The 'cis-regulatory code' - how cells interpret DNA sequences to determine when, where and how much genes should be expressed - has proven to be exceedingly complex. Recently, advances in the scale and resolution of functional genomics assays and machine learning have enabled substantial progress towards deciphering this code. However, the cis-regulatory code will probably never be solved if models are trained only on genomic sequences; regions of homology can easily lead to overestimation of predictive performance, and our genome is too short and has insufficient sequence diversity to learn all relevant parameters. Fortunately, randomly synthesized DNA sequences enable testing a far larger sequence space than exists in our genomes, and designed DNA sequences enable targeted queries to maximally improve the models. As the same biochemical principles are used to interpret DNA regardless of its source, models trained on these synthetic data can predict genomic activity, often better than genome-trained models. Here we provide an outlook on the field, and propose a roadmap towards solving the cis-regulatory code by a combination of machine learning and massively parallel assays using synthetic DNA.
Affiliation(s)
- Carl G de Boer
  - School of Biomedical Engineering, University of British Columbia, Vancouver, British Columbia, Canada.
- Jussi Taipale
  - Applied Tumor Genomics Research Program, Faculty of Medicine, University of Helsinki, Helsinki, Finland.
  - Department of Medical Biochemistry and Biophysics, Karolinska Institutet, Stockholm, Sweden.
  - Department of Biochemistry, University of Cambridge, Cambridge, UK.
5
de Vienne D, Coton C, Dillmann C. The genotype-phenotype relationship and evolutionary genetics in the light of the Metabolic Control Analysis. Biosystems 2023; 232:105000. [PMID: 37586656 DOI: 10.1016/j.biosystems.2023.105000] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/10/2023] [Revised: 08/05/2023] [Accepted: 08/11/2023] [Indexed: 08/18/2023]
Abstract
Metabolic control analysis (MCA) has long been used as a systemic model of the genotype-phenotype (GP) relationship. By considering kinetic parameters and enzyme concentrations as reflecting the genotype level and metabolic fluxes or pools as phenotypes related to fitness, MCA has given a biological basis to the relationship between these two levels. The non-linear and concave relationship between enzymes and fluxes can account for common genetic effects that reductionist approaches have been powerless to explain, such as the dominance of active alleles over less active alleles, the various types of epistasis and heterosis, and it reveals the structural links between these genetic effects. The summation property of the flux control coefficients accounts for the L-shaped distribution of Quantitative Trait Locus (QTL) effects, irrespective of other possible causes. Metabolic models of response to selection result in evolutionary scenarios that are markedly different from those derived from the classical infinitesimal model of quantitative genetics. In particular, evolution towards selective neutrality appears to be a consequence of the diminishing return of the flux-enzyme relationship. In this paper, we survey the historical and recent achievements of MCA in genetics, quantitative genetics and evolution, focusing on epistasis and the evolution of flux in relation to enzyme concentrations.
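The summation property mentioned in this abstract can be checked numerically on a toy pathway. The sketch below assumes a simple hyperbolic flux function for a linear chain, J = x / sum(1 / (a_i * e_i)), chosen here only for illustration; the paper's own models are not given in the abstract. Flux control coefficients are estimated by finite differences as the relative change in J per relative change in each enzyme concentration. All names and parameter values are hypothetical.

```python
def flux(enzymes, activities, x=1.0):
    """Toy steady-state flux of a linear pathway (assumed functional form)."""
    return x / sum(1.0 / (a * e) for a, e in zip(activities, enzymes))


def control_coefficients(enzymes, activities, rel_step=1e-6):
    """Finite-difference estimate of C_i = (e_i / J) * dJ/de_i."""
    j0 = flux(enzymes, activities)
    coeffs = []
    for i, e in enumerate(enzymes):
        bumped = list(enzymes)
        bumped[i] = e * (1.0 + rel_step)  # small relative increase in enzyme i
        coeffs.append((flux(bumped, activities) - j0) / (j0 * rel_step))
    return coeffs


# Hypothetical four-enzyme pathway with equal activities.
enzymes = [1.0, 2.0, 4.0, 8.0]
activities = [1.0, 1.0, 1.0, 1.0]

coeffs = control_coefficients(enzymes, activities)
coeff_sum = sum(coeffs)
```

Under this toy model the coefficients sum to 1 and are unequal: the scarcest enzyme carries about half of the control and the most abundant one very little. This is the kind of skew the abstract relates to an L-shaped distribution of effects.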
Affiliation(s)
- D de Vienne
  - Université Paris-Saclay, INRAE, CNRS, AgroParisTech. GQE-Le Moulon, IDEEV, 12, route 128, Gif-sur-Yvette, 91190, France.
- C Coton
  - Université Paris-Saclay, INRAE, CNRS, AgroParisTech. GQE-Le Moulon, IDEEV, 12, route 128, Gif-sur-Yvette, 91190, France.
- C Dillmann
  - Université Paris-Saclay, INRAE, CNRS, AgroParisTech. GQE-Le Moulon, IDEEV, 12, route 128, Gif-sur-Yvette, 91190, France.
6
Tirumalai MR, Sivaraman RV, Kutty LA, Song EL, Fox GE. Ribosomal Protein Cluster Organization in Asgard Archaea. Archaea (Vancouver, B.C.) 2023; 2023:5512414. [PMID: 38314098 PMCID: PMC10833476 DOI: 10.1155/2023/5512414] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/20/2023] [Revised: 08/31/2023] [Accepted: 09/08/2023] [Indexed: 02/06/2024]
Abstract
It has been proposed that the superphylum of Asgard Archaea may represent a historical link between the Archaea and Eukarya. Following the discovery of the Archaea, it was soon appreciated that archaeal ribosomes were more similar to those of Eukarya than to those of Bacteria. Coupled with other eukaryotic-like features, it has been suggested that the Asgard Archaea may be directly linked to eukaryotes. However, the genomes of Bacteria and non-Asgard Archaea generally organize ribosome-related genes into clusters that likely function as operons. In contrast, eukaryotes typically do not employ an operon strategy. To gain further insight into conservation of the ribosomal protein (r-protein) genes, the genome order of conserved r-protein coding genes was identified in 17 Asgard genomes (thirteen complete genomes and four genomes with fewer than 20 contigs) and compared with those found previously in non-Asgard archaeal and bacterial genomes. A universal core of two clusters of 14 and 4 cooccurring r-proteins, respectively, was identified in both the Asgard and non-Asgard Archaea. The equivalent genes in the E. coli version of the cluster are found in the S10 and spc operons. The large cluster of 14 r-protein genes (uS19-uL22-uS3-uL29-uS17 from the S10 operon and uL14-uL24-uL5-uS14-uS8-uL6-uL18-uS5-uL30-uL15 from the spc operon) occurs as a complete set in thirteen Asgard genomes (five Lokiarchaeotes, three Heimdallarchaeotes, one Odinarchaeote, and four Thorarchaeotes). Four less conserved clusters with partial bacterial equivalents were found in the Asgard. These were the L30e (str operon in Bacteria) cluster, the L18e (alpha operon in Bacteria) cluster, the S24e-S27ae-rpoE1 cluster, and the L31e, L12..L1 cluster. Finally, a new cluster referred to as L7ae was identified. In many cases, r-protein gene clusters/operons are less conserved in their organization in the Asgard group than in other Archaea.
If this is generally true for nonribosomal gene clusters, the results may have implications for the history of genome organization. In particular, there may have been an early transition to or from the operon approach to genome organization. Other nonribosomal cellular features may support different relationships. For this reason, it may be important to consider ribosome features separately.
Affiliation(s)
- Madhan R. Tirumalai
  - Department of Biology and Biochemistry, University of Houston, Houston, TX 77204-5001, USA
- George E. Fox
  - Department of Biology and Biochemistry, University of Houston, Houston, TX 77204-5001, USA
7
Turco G, Chang C, Wang RY, Kim G, Stoops EH, Richardson B, Sochat V, Rust J, Oughtred R, Thayer N, Kang F, Livstone MS, Heinicke S, Schroeder M, Dolinski KJ, Botstein D, Baryshnikova A. Global analysis of the yeast knockout phenome. Science Advances 2023; 9:eadg5702. [PMID: 37235661 DOI: 10.1126/sciadv.adg5702] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 01/06/2023] [Accepted: 04/20/2023] [Indexed: 05/28/2023]
Abstract
Genome-wide phenotypic screens in the budding yeast Saccharomyces cerevisiae, enabled by its knockout collection, have produced the largest, richest, and most systematic phenotypic description of any organism. However, integrative analyses of this rich data source have been virtually impossible because of the lack of a central data repository and consistent metadata annotations. Here, we describe the aggregation, harmonization, and analysis of ~14,500 yeast knockout screens, which we call Yeast Phenome. Using this unique dataset, we characterized two unknown genes (YHR045W and YGL117W) and showed that tryptophan starvation is a by-product of many chemical treatments. Furthermore, we uncovered an exponential relationship between phenotypic similarity and intergenic distance, which suggests that gene positions in both yeast and human genomes are optimized for function.
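An exponential relationship of the kind reported here is often summarized by fitting s(d) = a * exp(-d / scale) to similarity versus distance. The abstract gives neither the fitted parameters nor the fitting method, so the sketch below is only an assumption-laden illustration: it uses ordinary least squares on log-transformed similarities, and the data points are synthetic values generated from a known curve, not results from the study.

```python
import math


def fit_exponential_decay(distances, similarities):
    """Fit log(s) = log(a) - d / scale by least squares; return (a, scale).

    Assumes all similarities are strictly positive.
    """
    ys = [math.log(s) for s in similarities]
    n = len(distances)
    mean_x = sum(distances) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in distances)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(distances, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -1.0 / slope


# Synthetic example (not study data): a = 0.3, scale = 10 distance units.
toy_distances = [1.0, 2.0, 5.0, 10.0, 20.0, 50.0]
toy_similarities = [0.3 * math.exp(-d / 10.0) for d in toy_distances]

a_hat, scale_hat = fit_exponential_decay(toy_distances, toy_similarities)
```

With noisy real measurements, similarities near zero or below zero would need different handling (for example a nonlinear fit on the untransformed values), since the log transform is undefined there.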
Affiliation(s)
- Gina Turco
  - Calico Life Sciences LLC, South San Francisco, CA, USA
- Christie Chang
  - Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ, USA
- Griffin Kim
  - Calico Life Sciences LLC, South San Francisco, CA, USA
- Brianna Richardson
  - Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ, USA
- Vanessa Sochat
  - Lawrence Livermore National Laboratory, Livermore, CA, USA
- Jennifer Rust
  - Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ, USA
- Rose Oughtred
  - Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ, USA
- Fan Kang
  - Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ, USA
- Michael S Livstone
  - Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ, USA
- Sven Heinicke
  - Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ, USA
- Mark Schroeder
  - Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ, USA
- Kara J Dolinski
  - Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ, USA
8
Marcet-Houben M, Collado-Cala I, Fuentes-Palacios D, Gómez AD, Molina M, Garisoain-Zafra A, Chorostecki U, Gabaldón T. EvolClustDB: Exploring Eukaryotic Gene Clusters with Evolutionarily Conserved Genomic Neighbourhoods. J Mol Biol 2023:168013. [PMID: 36806474 DOI: 10.1016/j.jmb.2023.168013] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/04/2022] [Revised: 01/24/2023] [Accepted: 02/11/2023] [Indexed: 02/17/2023]
Abstract
Conservation of gene neighbourhood over evolutionary distances is generally indicative of shared regulation or functional association among genes. This concept has been broadly exploited in prokaryotes but its use on eukaryotic genomes has been limited to specific functional classes, such as biosynthetic gene clusters. We here used an evolutionary-based gene cluster discovery algorithm (EvolClust) to pre-compute evolutionarily conserved gene neighbourhoods, which can be searched, browsed and downloaded in EvolClustDB. We inferred ∼35,000 cluster families in 882 different species in genome comparisons of five taxonomically broad clades: Fungi, Plants, Metazoans, Insects and Protists. EvolClustDB allows browsing through the cluster families, as well as searching by protein, species, identifier or sequence. Visualization allows inspecting gene order per species in a phylogenetic context, so that relevant evolutionary events such as gain, loss or transfer, can be inferred. EvolClustDB is freely available, without registration, at http://evolclustdb.org/.
Affiliation(s)
- Marina Marcet-Houben
  - Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Baldiri Reixac, 10, 08028 Barcelona, Spain; Barcelona Supercomputing Centre (BSC-CNS). Plaça Eusebi Güell, 1-3, 08034 Barcelona, Spain
- Ismael Collado-Cala
  - Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Baldiri Reixac, 10, 08028 Barcelona, Spain; Barcelona Supercomputing Centre (BSC-CNS). Plaça Eusebi Güell, 1-3, 08034 Barcelona, Spain
- Diego Fuentes-Palacios
  - Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Baldiri Reixac, 10, 08028 Barcelona, Spain; Barcelona Supercomputing Centre (BSC-CNS). Plaça Eusebi Güell, 1-3, 08034 Barcelona, Spain
- Alicia D Gómez
  - Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Baldiri Reixac, 10, 08028 Barcelona, Spain; Barcelona Supercomputing Centre (BSC-CNS). Plaça Eusebi Güell, 1-3, 08034 Barcelona, Spain
- Manuel Molina
  - Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Baldiri Reixac, 10, 08028 Barcelona, Spain; Barcelona Supercomputing Centre (BSC-CNS). Plaça Eusebi Güell, 1-3, 08034 Barcelona, Spain
- Andrés Garisoain-Zafra
  - Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Baldiri Reixac, 10, 08028 Barcelona, Spain; Barcelona Supercomputing Centre (BSC-CNS). Plaça Eusebi Güell, 1-3, 08034 Barcelona, Spain
- Uciel Chorostecki
  - Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Baldiri Reixac, 10, 08028 Barcelona, Spain; Barcelona Supercomputing Centre (BSC-CNS). Plaça Eusebi Güell, 1-3, 08034 Barcelona, Spain
- Toni Gabaldón
  - Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Baldiri Reixac, 10, 08028 Barcelona, Spain; Barcelona Supercomputing Centre (BSC-CNS). Plaça Eusebi Güell, 1-3, 08034 Barcelona, Spain; Catalan Institution for Research and Advanced Studies (ICREA), Barcelona, Spain; Centro de Investigación Biomédica En Red de Enfermedades Infecciosas (CIBERINFEC), Barcelona, Spain.
9
Coton C, Dillmann C, de Vienne D. Evolution of enzyme levels in metabolic pathways: A theoretical approach. Part 2. J Theor Biol 2023; 558:111354. [PMID: 36427531 DOI: 10.1016/j.jtbi.2022.111354] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/05/2022] [Revised: 09/30/2022] [Accepted: 11/07/2022] [Indexed: 11/24/2022]
Abstract
Metabolism is essential for cell function and adaptation. Because of their central role in metabolism, kinetic parameters and enzyme concentrations are under constant selective pressure to adapt the fluxes of the metabolic networks to the needs of the organism. In line with various studies dealing with enzyme evolution, we recently developed a model of the evolution of enzyme concentrations under selection for increased flux, considered as a proxy for fitness (Coton et al., 2022). With this model, taking into account two realistic cellular constraints, competition for resources and co-regulation, we determined the evolutionary equilibria and range of neutral variations of enzyme concentrations. In this article, we expanded this model by considering that the enzymes in a pathway can belong to different co-regulation groups. We determined the equilibria and showed that the constraints modify the adaptive landscape by limiting the number of independent dimensions. We also showed that any trade-off between enzyme concentrations is sufficient to limit the flux and relax selection for increasing the concentration of other enzymes. Even though this model is based on simplifying assumptions, the complexity of the relationship between enzyme concentrations prevents the formal analysis of the range of neutral variation of enzyme concentrations. However, we could show that selection for maximizing the flux results in selective neutrality for all enzymes regardless of the constraints applied, giving generality to the prediction of Hartl et al. (1985).
Affiliation(s)
- Charlotte Coton
  - Université Paris-Saclay, INRAE, CNRS, AgroParisTech, GQE - Le Moulon, 91190, Gif-sur-Yvette, France.
- Christine Dillmann
  - Université Paris-Saclay, INRAE, CNRS, AgroParisTech, GQE - Le Moulon, 91190, Gif-sur-Yvette, France
- Dominique de Vienne
  - Université Paris-Saclay, INRAE, CNRS, AgroParisTech, GQE - Le Moulon, 91190, Gif-sur-Yvette, France.
10
Zhang J. What Has Genomics Taught An Evolutionary Biologist? Genomics, Proteomics & Bioinformatics 2023; 21:1-12. [PMID: 36720382 PMCID: PMC10373158 DOI: 10.1016/j.gpb.2023.01.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/07/2022] [Revised: 01/06/2023] [Accepted: 01/19/2023] [Indexed: 01/30/2023]
Abstract
Genomics, an interdisciplinary field of biology on the structure, function, and evolution of genomes, has revolutionized many subdisciplines of life sciences, including my field of evolutionary biology, by supplying huge data, bringing high-throughput technologies, and offering a new approach to biology. In this review, I describe what I have learned from genomics and highlight the fundamental knowledge and mechanistic insights gained. I focus on three broad topics that are central to evolutionary biology and beyond (variation, interaction, and selection) and use primarily my own research and study subjects as examples. In the next decade or two, I expect that the most important contributions of genomics to evolutionary biology will be to provide genome sequences of nearly all known species on Earth, facilitate high-throughput phenotyping of natural variants and systematically constructed mutants for mapping genotype-phenotype-fitness landscapes, and assist the determination of causality in evolutionary processes using experimental evolution.
Affiliation(s)
- Jianzhi Zhang
  - Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, MI 48109, USA.
11
Baltoumas FA, Karatzas E, Paez-Espino D, Venetsianou NK, Aplakidou E, Oulas A, Finn RD, Ovchinnikov S, Pafilis E, Kyrpides NC, Pavlopoulos GA. Exploring microbial functional biodiversity at the protein family level - From metagenomic sequence reads to annotated protein clusters. Frontiers in Bioinformatics 2023; 3:1157956. [PMID: 36959975 PMCID: PMC10029925 DOI: 10.3389/fbinf.2023.1157956] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/07/2023] [Accepted: 02/21/2023] [Indexed: 03/06/2023] Open
Abstract
Metagenomics has enabled accessing the genetic repertoire of natural microbial communities. Metagenome shotgun sequencing has become the method of choice for studying and classifying microorganisms from various environments. To this end, several methods have been developed to process and analyze the sequence data from raw reads to end-products such as predicted protein sequences or families. In this article, we provide a thorough review to simplify such processes and discuss the alternative methodologies that can be followed in order to explore biodiversity at the protein family level. We provide details for analysis tools and we comment on their scalability as well as their advantages and disadvantages. Finally, we report the available data repositories and recommend various approaches for protein family annotation related to phylogenetic distribution, structure prediction and metadata enrichment.
Affiliation(s)
- Fotis A. Baltoumas
  - Institute for Fundamental Biomedical Research, BSRC “Alexander Fleming”, Vari, Greece
  - *Correspondence: Fotis A. Baltoumas; Nikos C. Kyrpides; Georgios A. Pavlopoulos
- Evangelos Karatzas
  - Institute for Fundamental Biomedical Research, BSRC “Alexander Fleming”, Vari, Greece
- David Paez-Espino
  - Lawrence Berkeley National Laboratory, DOE Joint Genome Institute, Berkeley, CA, United States
- Nefeli K. Venetsianou
  - Institute for Fundamental Biomedical Research, BSRC “Alexander Fleming”, Vari, Greece
- Eleni Aplakidou
  - Institute for Fundamental Biomedical Research, BSRC “Alexander Fleming”, Vari, Greece
- Anastasis Oulas
  - The Cyprus Institute of Neurology and Genetics, Nicosia, Cyprus
- Robert D. Finn
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Cambridge, United Kingdom
- Sergey Ovchinnikov
  - John Harvard Distinguished Science Fellowship Program, Harvard University, Cambridge, MA, United States
- Evangelos Pafilis
  - Institute of Marine Biology, Biotechnology and Aquaculture (IMBBC), Hellenic Centre for Marine Research (HCMR), Heraklion, Greece
- Nikos C. Kyrpides
  - Lawrence Berkeley National Laboratory, DOE Joint Genome Institute, Berkeley, CA, United States
- Georgios A. Pavlopoulos
  - Institute for Fundamental Biomedical Research, BSRC “Alexander Fleming”, Vari, Greece
  - Center of New Biotechnologies and Precision Medicine, Department of Medicine, School of Health Sciences, National and Kapodistrian University of Athens, Athens, Greece
  - Hellenic Army Academy, Vari, Greece
12
Pazos Obregón F, Silvera D, Soto P, Yankilevich P, Guerberoff G, Cantera R. Gene function prediction in five model eukaryotes exclusively based on gene relative location through machine learning. Sci Rep 2022; 12:11655. [PMID: 35803984 PMCID: PMC9270439 DOI: 10.1038/s41598-022-15329-w] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/23/2021] [Accepted: 06/22/2022] [Indexed: 12/13/2022] Open
Abstract
The function of most genes is unknown. The best results in automated function prediction are obtained with machine learning-based methods that combine multiple data sources, typically sequence derived features, protein structure and interaction data. Even though there is ample evidence showing that a gene's function is not independent of its location, the few available examples of gene function prediction based on gene location rely on sequence identity between genes of different organisms and are thus subject to the limitations of the relationship between sequence and function. Here we predict thousands of gene functions in five model eukaryotes (Saccharomyces cerevisiae, Caenorhabditis elegans, Drosophila melanogaster, Mus musculus and Homo sapiens) using machine learning models exclusively trained with features derived from the location of genes in the genomes to which they belong. Our aim was not to obtain the best performing method for automated function prediction but to explore the extent to which a gene's location can predict its function in eukaryotes. We found that our models outperform BLAST when predicting terms from Biological Process and Cellular Component Ontologies, showing that, at least in some cases, gene location alone can be more useful than sequence to infer gene function.
Affiliation(s)
- Flavio Pazos Obregón
  - Departamento de Biología del Neurodesarrollo, Instituto de Investigaciones Biológicas Clemente Estable, Av. Italia 3318, 11600, Montevideo, Uruguay.
  - Unidad de Bioquímica y Proteómica Analíticas, Instituto Pasteur de Montevideo, Montevideo, Uruguay.
- Diego Silvera
  - Departamento de Biología del Neurodesarrollo, Instituto de Investigaciones Biológicas Clemente Estable, Av. Italia 3318, 11600, Montevideo, Uruguay
- Pablo Soto
  - Departamento de Biología del Neurodesarrollo, Instituto de Investigaciones Biológicas Clemente Estable, Av. Italia 3318, 11600, Montevideo, Uruguay
- Patricio Yankilevich
  - Instituto de Investigación en Biomedicina de Buenos Aires (IBioBA), CONICET-Partner Institute of the Max Planck Society, Buenos Aires, Argentina
- Gustavo Guerberoff
  - Instituto de Matemática y Estadística "Prof. Ing. Rafael Laguardia", Facultad de Ingeniería, UDELAR, Montevideo, Uruguay
- Rafael Cantera
  - Departamento de Biología del Neurodesarrollo, Instituto de Investigaciones Biológicas Clemente Estable, Av. Italia 3318, 11600, Montevideo, Uruguay
13
|
Jhaveri N, van den Berg W, Hwang BJ, Muller HM, Sternberg PW, Gupta BP. Genome annotation of Caenorhabditis briggsae by TEC-RED identifies new exons, paralogs, and conserved and novel operons. G3 GENES|GENOMES|GENETICS 2022; 12:6575897. [PMID: 35485953 PMCID: PMC9258526 DOI: 10.1093/g3journal/jkac101] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 01/30/2022] [Accepted: 04/14/2022] [Indexed: 11/14/2022]
Abstract
The nematode Caenorhabditis briggsae is routinely used in comparative and evolutionary studies involving its well-known cousin Caenorhabditis elegans. The C. briggsae genome sequence has accelerated research by facilitating the generation of new resources, tools, and functional studies of genes. While substantial progress has been made in predicting genes and start sites, experimental evidence is still lacking in many cases. Here, we report an improved annotation of the C. briggsae genome using the trans-spliced exon coupled RNA end determination technique. In addition to identifying the 5′ ends of expressed genes, we have discovered operons and paralogs. In summary, our analysis yielded 10,243 unique 5′ end sequence tags with matches in the C. briggsae genome. Of these, 6,395 were found to represent 4,252 unique genes along with 362 paralogs and 52 previously unknown exons. These genes included 14 that are exclusively trans-spliced in C. briggsae when compared with C. elegans orthologs. A major contribution of this study is the identification of 492 high confidence operons, of which two-thirds are fully supported by tags. In addition, 2 SL1-type operons were discovered. Interestingly, comparisons with C. elegans showed that only 40% of operons are conserved. Of the remaining operons, 73 are novel, including 12 that entirely lack orthologs in C. elegans. Further analysis revealed that 4 of the 12 novel operons are conserved in Caenorhabditis nigoni. Altogether, the work described here has significantly advanced our understanding of the C. briggsae system and serves as a rich resource to aid biological studies involving this species.
Affiliation(s)
- Nikita Jhaveri
- Department of Biology, McMaster University, Hamilton, ON L8S 4K1, Canada
- Byung Joon Hwang
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA 91125, USA
- Hans-Michael Muller
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA 91125, USA
- Paul W Sternberg
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA 91125, USA
- Bhagwati P Gupta
- Department of Biology, McMaster University, Hamilton, ON L8S 4K1, Canada
14
Elhabashy H, Merino F, Alva V, Kohlbacher O, Lupas AN. Exploring protein-protein interactions at the proteome level. Structure 2022; 30:462-475. [DOI: 10.1016/j.str.2022.02.004] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 08/09/2021] [Revised: 10/26/2021] [Accepted: 02/02/2022] [Indexed: 02/08/2023]
15
Coton C, Talbot G, Louarn ML, Dillmann C, Vienne D. Evolution of enzyme levels in metabolic pathways: A theoretical approach. J Theor Biol 2022; 538:111015. [PMID: 35016894 DOI: 10.1016/j.jtbi.2022.111015] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/07/2021] [Revised: 12/03/2021] [Accepted: 01/03/2022] [Indexed: 10/19/2022]
Abstract
The central role of metabolism in cell functioning and adaptation has given rise to countless studies on the evolution of enzyme-coding genes and network topology. However, very few studies have addressed the question of how enzyme concentrations change in response to positive selective pressure on the flux, considered a proxy of fitness. In particular, the way cellular constraints, such as resource limitations and co-regulation, affect the adaptive landscape of a pathway under selection has never been analyzed theoretically. To fill this gap, we developed a model of the evolution of enzyme concentrations that combines metabolic control theory and an adaptive dynamics approach, and integrates possible dependencies between enzyme concentrations. We determined the evolutionary equilibria of enzyme concentrations and their range of neutral variation, and showed that they differ with the properties of the enzymes, the constraints applied to the system and the initial enzyme concentrations. Simulations of long-term evolution confirmed all analytical and numerical predictions, even though we relaxed the simplifying assumptions used in the analytical treatment.
Affiliation(s)
- Charlotte Coton
- Université Paris-Saclay, INRAE, CNRS, AgroParisTech, GQE - Le Moulon, 91190, Gif-sur-Yvette, France
- Grégoire Talbot
- Université Paris-Saclay, INRAE, CNRS, AgroParisTech, GQE - Le Moulon, 91190, Gif-sur-Yvette, France
- Maud Le Louarn
- Université Paris-Saclay, INRAE, CNRS, AgroParisTech, GQE - Le Moulon, 91190, Gif-sur-Yvette, France
- Christine Dillmann
- Université Paris-Saclay, INRAE, CNRS, AgroParisTech, GQE - Le Moulon, 91190, Gif-sur-Yvette, France
- Dominique Vienne
- Université Paris-Saclay, INRAE, CNRS, AgroParisTech, GQE - Le Moulon, 91190, Gif-sur-Yvette, France
16
Zinani OQH, Keseroğlu K, Özbudak EM. Regulatory mechanisms ensuring coordinated expression of functionally related genes. Trends Genet 2022; 38:73-81. [PMID: 34376301 PMCID: PMC8678166 DOI: 10.1016/j.tig.2021.07.008] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 05/25/2021] [Revised: 07/12/2021] [Accepted: 07/14/2021] [Indexed: 01/03/2023]
Abstract
Coordinated spatiotemporal expression of large sets of genes is required for the development and homeostasis of organisms. To achieve this goal, organisms use myriad strategies where they form operons, utilize bidirectional promoters, cluster genes, share enhancers among genes by DNA looping, and form topologically associated domains and transcriptional condensates. Coexpression achieved by these different strategies is hypothesized to have functional importance in minimizing gene expression variability, establishing dosage balance to ensure stoichiometry of protein complexes, and minimizing accumulation of toxic intermediate metabolites. By combining gene-editing tools with computational modeling, recent studies tested the advantages of adjacent genes located in pairs and clusters. We propose that with the advancement of gene editing, single-cell sequencing, and imaging tools, one could readily test the functional importance of different coexpression strategies in a variety of biological processes.
Affiliation(s)
- Oriana Q H Zinani
- Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH 45229, USA; Division of Developmental Biology, Cincinnati Children's Hospital Medical Center, Cincinnati, OH 45229, USA
- Kemal Keseroğlu
- Division of Developmental Biology, Cincinnati Children's Hospital Medical Center, Cincinnati, OH 45229, USA
- Ertuğrul M Özbudak
- Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH 45229, USA; Division of Developmental Biology, Cincinnati Children's Hospital Medical Center, Cincinnati, OH 45229, USA
17
Silveira GO, Coelho HS, Amaral MS, Verjovski-Almeida S. Long non-coding RNAs as possible therapeutic targets in protozoa, and in Schistosoma and other helminths. Parasitol Res 2021; 121:1091-1115. [PMID: 34859292 DOI: 10.1007/s00436-021-07384-5] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 06/02/2021] [Accepted: 11/14/2021] [Indexed: 12/26/2022]
Abstract
Long non-coding RNAs (lncRNAs) emerged in the past 20 years due to massive amounts of scientific data regarding transcriptomic analyses. They have been implicated in a plethora of cellular processes in higher eukaryotes. However, little is known about the possible involvement of lncRNAs in parasitic diseases, with most studies only detecting their presence in parasites of human medical importance. Here, we review the progress on lncRNA studies and their functions in protozoans and helminths. In addition, we show an example of knockdown of one lncRNA in Schistosoma mansoni, SmLINC156349, which led to in vitro parasite adhesion, motility, and pairing impairment, with a 20% decrease in parasite viability and 33% reduction in female oviposition. Other observed phenotypes were a decrease in the proliferation rate of both male and female worms and their gonads, and reduced female lipid and vitelline droplets that are markers for well-developed vitellaria. Impairment of female worms' vitellaria in SmLINC156349-silenced worms led to egg development deficiency. Together, these results demonstrate the great potential of the tools and methods to characterize lncRNAs as potential new therapeutic targets. Further, we discuss the challenges and limitations of current methods for studying lncRNAs in parasites and possible solutions to overcome them, and we highlight the future directions of this exciting field.
Affiliation(s)
- Gilbert O Silveira
- Laboratório de Parasitologia, Instituto Butantan, São Paulo, SP, 05503-900, Brazil; Departamento de Bioquímica, Instituto de Química, Universidade de São Paulo, São Paulo, SP, 05508-900, Brazil
- Helena S Coelho
- Laboratório de Parasitologia, Instituto Butantan, São Paulo, SP, 05503-900, Brazil
- Murilo S Amaral
- Laboratório de Parasitologia, Instituto Butantan, São Paulo, SP, 05503-900, Brazil
- Sergio Verjovski-Almeida
- Laboratório de Parasitologia, Instituto Butantan, São Paulo, SP, 05503-900, Brazil; Departamento de Bioquímica, Instituto de Química, Universidade de São Paulo, São Paulo, SP, 05508-900, Brazil
18
Van Bel M, Silvestri F, Weitz EM, Kreft L, Botzki A, Coppens F, Vandepoele K. PLAZA 5.0: extending the scope and power of comparative and functional genomics in plants. Nucleic Acids Res 2021; 50:D1468-D1474. [PMID: 34747486 PMCID: PMC8728282 DOI: 10.1093/nar/gkab1024] [Citation(s) in RCA: 64] [Impact Index Per Article: 21.3] [Received: 09/16/2021] [Revised: 10/12/2021] [Accepted: 10/13/2021] [Indexed: 11/13/2022] Open
Abstract
PLAZA is a platform for comparative, evolutionary, and functional plant genomics. It makes a broad set of genomes, data types and analysis tools available to researchers through a user-friendly website, an API, and bulk downloads. In this latest release of the PLAZA platform, we are integrating a record number of 134 high-quality plant genomes, split up over two instances: PLAZA Dicots 5.0 and PLAZA Monocots 5.0. This number of genomes corresponds with a massive expansion in the number of available species when compared to PLAZA 4.0, which offered access to 71 species, an 89% overall increase. The PLAZA 5.0 release contains information for 5 882 730 genes, and offers pre-computed gene families and phylogenetic trees for 5 274 684 protein-coding genes. This latest release also comes with a set of new and updated features: a new BED import functionality for the workbench, improved interactive visualizations for functional enrichments and genome-wide mapping of gene sets, and a fully redesigned and extended API. Taken together, this new version offers extended support for plant biologists working on different families within the green plant lineage and provides an efficient and versatile toolbox for plant genomics. All PLAZA releases are accessible from the portal website: https://bioinformatics.psb.ugent.be/plaza/.
Affiliation(s)
- Michiel Van Bel
- Department of Plant Biotechnology and Bioinformatics, Ghent University, Technologiepark 71, 9052 Ghent, Belgium; VIB Center for Plant Systems Biology, Technologiepark 71, 9052 Ghent, Belgium
- Francesca Silvestri
- Department of Plant Biotechnology and Bioinformatics, Ghent University, Technologiepark 71, 9052 Ghent, Belgium; VIB Center for Plant Systems Biology, Technologiepark 71, 9052 Ghent, Belgium
- Eric M Weitz
- Data Sciences Platform, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Lukasz Kreft
- Institute of Biochemistry and Biophysics, Polish Academy of Sciences, Pawińskiego 5A 02-106 Warsaw, Poland
- Frederik Coppens
- Department of Plant Biotechnology and Bioinformatics, Ghent University, Technologiepark 71, 9052 Ghent, Belgium; VIB Center for Plant Systems Biology, Technologiepark 71, 9052 Ghent, Belgium
- Klaas Vandepoele
- Department of Plant Biotechnology and Bioinformatics, Ghent University, Technologiepark 71, 9052 Ghent, Belgium; VIB Center for Plant Systems Biology, Technologiepark 71, 9052 Ghent, Belgium; Bioinformatics Institute Ghent, Ghent University, Technologiepark 71, 9052 Ghent, Belgium
19
Zhang Y, Chen B, Sun Z, Liu Z, Cui Y, Ke H, Wang Z, Wu L, Zhang G, Wang G, Li Z, Yang J, Wu J, Shi R, Liu S, Wang X, Ma Z. A large-scale genomic association analysis identifies a fragment in Dt11 chromosome conferring cotton Verticillium wilt resistance. PLANT BIOTECHNOLOGY JOURNAL 2021; 19:2126-2138. [PMID: 34160879 PMCID: PMC8486238 DOI: 10.1111/pbi.13650] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 12/24/2020] [Revised: 06/01/2021] [Accepted: 06/14/2021] [Indexed: 05/26/2023]
Abstract
Verticillium wilt (VW) is a destructive disease that results in great losses in cotton yield and quality. Identifying genetic variation that enhances crop disease resistance is a primary objective in plant breeding. Here we report a GWAS of cotton VW resistance in a natural-variation population, challenged by strains of differing pathogenicity in different environments, and found 382 SNPs significantly associated with VW resistance. The associated signal repeatedly peaked in chromosome Dt11 (68 798 494-69 212 808), which contains 13 core elite alleles not described previously. The core SNPs can shift the disease reaction type from susceptible to tolerant or resistant in accessions carrying the alternate genotype rather than the reference genotype. Of the genes associated with the Dt11 signal, 25 were differentially expressed upon Verticillium dahliae stress, with 21 genes verified in VW resistance via gene knockdown and/or overexpression experiments. We discovered for the first time that a gene cluster of L-type lectin-domain containing receptor kinase (GhLecRKs-V.9) played an important role in VW resistance. These results proved that the associated Dt11 region was a major genetic locus responsible for VW resistance. The frequency of the core elite alleles (FEA) in modern varieties was significantly higher than in early/middle varieties (12.55% vs 4.29%), indicating that the FEA increased during artificial selection breeding. The current developmental resistant cultivars, JND23 and JND24, had fixed these core elite alleles during breeding without yield penalty. These findings provide genomic variations and promising alleles for improving cotton VW resistance.
Affiliation(s)
- Yan Zhang
- State Key Laboratory of North China Crop Improvement and Regulation, Key Laboratory for Crop Germplasm Resources of Hebei, Hebei Agricultural University, Baoding, China
- Bin Chen
- State Key Laboratory of North China Crop Improvement and Regulation, Key Laboratory for Crop Germplasm Resources of Hebei, Hebei Agricultural University, Baoding, China
- Zhengwen Sun
- State Key Laboratory of North China Crop Improvement and Regulation, Key Laboratory for Crop Germplasm Resources of Hebei, Hebei Agricultural University, Baoding, China
- Zhengwen Liu
- State Key Laboratory of North China Crop Improvement and Regulation, Key Laboratory for Crop Germplasm Resources of Hebei, Hebei Agricultural University, Baoding, China
- Yanru Cui
- State Key Laboratory of North China Crop Improvement and Regulation, Key Laboratory for Crop Germplasm Resources of Hebei, Hebei Agricultural University, Baoding, China
- Huifeng Ke
- State Key Laboratory of North China Crop Improvement and Regulation, Key Laboratory for Crop Germplasm Resources of Hebei, Hebei Agricultural University, Baoding, China
- Zhicheng Wang
- State Key Laboratory of North China Crop Improvement and Regulation, Key Laboratory for Crop Germplasm Resources of Hebei, Hebei Agricultural University, Baoding, China
- Liqiang Wu
- State Key Laboratory of North China Crop Improvement and Regulation, Key Laboratory for Crop Germplasm Resources of Hebei, Hebei Agricultural University, Baoding, China
- Guiyin Zhang
- State Key Laboratory of North China Crop Improvement and Regulation, Key Laboratory for Crop Germplasm Resources of Hebei, Hebei Agricultural University, Baoding, China
- Guoning Wang
- State Key Laboratory of North China Crop Improvement and Regulation, Key Laboratory for Crop Germplasm Resources of Hebei, Hebei Agricultural University, Baoding, China
- Zhikun Li
- State Key Laboratory of North China Crop Improvement and Regulation, Key Laboratory for Crop Germplasm Resources of Hebei, Hebei Agricultural University, Baoding, China
- Jun Yang
- State Key Laboratory of North China Crop Improvement and Regulation, Key Laboratory for Crop Germplasm Resources of Hebei, Hebei Agricultural University, Baoding, China
- Jinhua Wu
- State Key Laboratory of North China Crop Improvement and Regulation, Key Laboratory for Crop Germplasm Resources of Hebei, Hebei Agricultural University, Baoding, China
- Rongkang Shi
- State Key Laboratory of North China Crop Improvement and Regulation, Key Laboratory for Crop Germplasm Resources of Hebei, Hebei Agricultural University, Baoding, China
- Song Liu
- State Key Laboratory of North China Crop Improvement and Regulation, Key Laboratory for Crop Germplasm Resources of Hebei, Hebei Agricultural University, Baoding, China
- Xingfen Wang
- State Key Laboratory of North China Crop Improvement and Regulation, Key Laboratory for Crop Germplasm Resources of Hebei, Hebei Agricultural University, Baoding, China
- Zhiying Ma
- State Key Laboratory of North China Crop Improvement and Regulation, Key Laboratory for Crop Germplasm Resources of Hebei, Hebei Agricultural University, Baoding, China
20
Gilchrist CLM, Booth TJ, van Wersch B, van Grieken L, Medema MH, Chooi YH. cblaster: a remote search tool for rapid identification and visualization of homologous gene clusters. BIOINFORMATICS ADVANCES 2021; 1:vbab016. [PMID: 36700093 PMCID: PMC9710679 DOI: 10.1093/bioadv/vbab016] [Citation(s) in RCA: 74] [Impact Index Per Article: 24.7] [Received: 07/07/2021] [Revised: 07/28/2021] [Accepted: 08/03/2021] [Indexed: 01/28/2023]
Abstract
Motivation: Genes involved in coordinated biological pathways, including metabolism, drug resistance and virulence, are often colocalized as gene clusters. Identifying homologous gene clusters aids in the study of their function and evolution; however, existing tools are limited to searching local sequence databases. Tools for remotely searching public databases are necessary to keep pace with the rapid growth of online genomic data. Results: Here, we present cblaster, a Python-based tool to rapidly detect collocated genes in local and remote databases. cblaster is easy to use, offering both a command line and a user-friendly graphical user interface. It generates outputs that enable intuitive visualizations of large datasets and can be readily incorporated into larger bioinformatic pipelines. cblaster is a significant update to the comparative genomics toolbox. Availability and implementation: cblaster source code and documentation are freely available from GitHub under the MIT license (github.com/gamcil/cblaster). Supplementary information: Supplementary data are available at Bioinformatics Advances online.
Affiliation(s)
- Cameron L M Gilchrist
- School of Molecular Sciences, The University of Western Australia, Crawley, WA 6009, Australia
- Thomas J Booth
- School of Molecular Sciences, The University of Western Australia, Crawley, WA 6009, Australia
- Bram van Wersch
- Bioinformatics Group, Wageningen University, Wageningen 6708PB, The Netherlands
- Liana van Grieken
- Bioinformatics Group, Wageningen University, Wageningen 6708PB, The Netherlands
- Marnix H Medema
- Bioinformatics Group, Wageningen University, Wageningen 6708PB, The Netherlands
- Yit-Heng Chooi
- School of Molecular Sciences, The University of Western Australia, Crawley, WA 6009, Australia
21
Sall S, Thompson W, Santos A, Dwyer DS. Analysis of Major Depression Risk Genes Reveals Evolutionary Conservation, Shared Phenotypes, and Extensive Genetic Interactions. Front Psychiatry 2021; 12:698029. [PMID: 34335334 PMCID: PMC8319724 DOI: 10.3389/fpsyt.2021.698029] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 04/20/2021] [Accepted: 06/21/2021] [Indexed: 12/29/2022] Open
Abstract
Major depressive disorder (MDD) affects around 15% of the population at some stage in their lifetime. It can be gravely disabling and it is associated with increased risk of suicide. Genetics play an important role; however, there are additional environmental contributions to the pathogenesis. A number of possible risk genes that increase liability for developing symptoms of MDD have been identified in genome-wide association studies (GWAS). The goal of this study was to characterize the MDD risk genes with respect to the degree of evolutionary conservation in simpler model organisms such as Caenorhabditis elegans and zebrafish, the phenotypes associated with variation in these genes and the extent of network connectivity. The MDD risk genes showed higher conservation in C. elegans and zebrafish than genome-to-genome comparisons. In addition, there were recurring themes among the phenotypes associated with variation of these risk genes in C. elegans. The phenotype analysis revealed enrichment for essential genes with pleiotropic effects. Moreover, the MDD risk genes participated in more interactions with each other than did randomly-selected genes from similar-sized gene sets. Syntenic blocks of risk genes with common functional activities were also identified. By characterizing evolutionarily-conserved counterparts to the MDD risk genes, we have gained new insights into pathogenetic processes relevant to the emergence of depressive symptoms in man.
Affiliation(s)
- Saveen Sall
- Department of Psychiatry and Behavioral Medicine, Louisiana State University Health Shreveport, Shreveport, LA, United States
- Willie Thompson
- Department of Psychiatry and Behavioral Medicine, Louisiana State University Health Shreveport, Shreveport, LA, United States
- Aurianna Santos
- Department of Psychiatry and Behavioral Medicine, Louisiana State University Health Shreveport, Shreveport, LA, United States
- Donard S. Dwyer
- Department of Psychiatry and Behavioral Medicine, Louisiana State University Health Shreveport, Shreveport, LA, United States
- Department of Pharmacology, Toxicology and Neuroscience, Louisiana State University Health Shreveport, Shreveport, LA, United States
22
Foflonker F, Blaby-Haas CE. Colocality to Cofunctionality: Eukaryotic Gene Neighborhoods as a Resource for Function Discovery. Mol Biol Evol 2021; 38:650-662. [PMID: 32886760 PMCID: PMC7826186 DOI: 10.1093/molbev/msaa221] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Indexed: 12/17/2022] Open
Abstract
Diverging from the classic paradigm of random gene order in eukaryotes, gene proximity can be leveraged to systematically identify functionally related gene neighborhoods in eukaryotes, utilizing techniques pioneered in bacteria. Current methods of identifying gene neighborhoods typically rely on sequence similarity to characterized gene products. However, this approach is not robust for nonmodel organisms like algae, which are evolutionarily distant from well-characterized model organisms. Here, we utilize a comparative genomic approach to identify evolutionarily conserved proximal orthologous gene pairs conserved across at least two taxonomic classes of green algae. A total of 317 gene neighborhoods were identified. In some cases, gene proximity appears to have been conserved since before the streptophyte–chlorophyte split, 1,000 Ma. Using functional inferences derived from reconstructed evolutionary relationships, we identified several novel functional clusters. A putative mycosporine-like amino acid, “sunscreen,” neighborhood contains genes similar to either vertebrate or cyanobacterial pathways, suggesting a novel mosaic biosynthetic pathway in green algae. One of two putative arsenic-detoxification neighborhoods includes an organoarsenical transporter (ArsJ), a glyceraldehyde 3-phosphate dehydrogenase-like gene, homologs of which are involved in arsenic detoxification in bacteria, and a novel algal-specific phosphoglycerate kinase-like gene. Mutants of the ArsJ-like transporter and phosphoglycerate kinase-like genes in Chlamydomonas reinhardtii were found to be sensitive to arsenate, providing experimental support for the role of these identified neighbors in resistance to arsenate. Potential evolutionary origins of neighborhoods are discussed, and updated annotations for formerly poorly annotated genes are presented, highlighting the potential of this strategy for functional annotation.
23
Darbani B. Genome Evolutionary Dynamics Meets Functional Genomics: A Case Story on the Identification of SLC25A44. Int J Mol Sci 2021; 22:ijms22115669. [PMID: 34073512 PMCID: PMC8199184 DOI: 10.3390/ijms22115669] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/10/2021] [Revised: 05/09/2021] [Accepted: 05/23/2021] [Indexed: 12/14/2022] Open
Abstract
Gene clusters are becoming promising tools for gene identification. The study reveals the purposive genomic distribution of genes toward higher inheritance rates of intact metabolic pathways/phenotypes and, thereby, higher fitness. The co-localization of co-expressed, co-interacting, and functionally related genes was found as genome-wide trends in humans, mouse, golden eagle, rice fish, Drosophila, peanut, and Arabidopsis. As anticipated, the analyses verified the co-segregation of co-localized events. A negative correlation was notable between the likelihood of co-localization events and the inter-loci distances. The evolution of genomic blocks was also found convergent and uniform along the chromosomal arms. Calling a genomic block responsible for adjacent metabolic reactions is therefore recommended for identification of candidate genes and interpretation of cellular functions. As a case story, a function in the metabolism of energy and secondary metabolites was proposed for Slc25A44, based on its genomic local information. Slc25A44 was further characterized as an essential housekeeping gene which has been under evolutionary purifying pressure and belongs to the phylogenetic ETC-clade of SLC25s. Pathway enrichment mapped the Slc25A44s to the energy metabolism. The expression of peanut and human Slc25A44s in oocytes and Saccharomyces cerevisiae strains confirmed the transport of common precursors for secondary metabolites and ubiquinone. These results suggest that SLC25A44 is a mitochondrion-ER-nucleus zone transporter with biotechnological applications. Finally, a conserved three-amino acid signature on the cytosolic face of transport cavity was found important for rational engineering of SLC25s.
Affiliation(s)
- Behrooz Darbani
- The Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, 2800 Kgs. Lyngby, Denmark; Tel.: +45-(53)-578055
- Research Center Flakkebjerg, Department of Agroecology, Aarhus University, 4200 Slagelse, Denmark
24
Kulski JK, Suzuki S, Shiina T. SNP-Density Crossover Maps of Polymorphic Transposable Elements and HLA Genes Within MHC Class I Haplotype Blocks and Junction. Front Genet 2021; 11:594318. [PMID: 33537058 PMCID: PMC7848197 DOI: 10.3389/fgene.2020.594318] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 08/13/2020] [Accepted: 11/24/2020] [Indexed: 12/12/2022] Open
Abstract
The genomic region (~4 Mb) of the human major histocompatibility complex (MHC) on chromosome 6p21 is a prime model for the study and understanding of conserved polymorphic sequences (CPSs) and structural diversity of ancestral haplotypes (AHs)/conserved extended haplotypes (CEHs). The aim of this study was to use a set of 95 MHC genomic sequences downloaded from a publicly available BioProject database at NCBI to identify and characterise polymorphic human leukocyte antigen (HLA) class I genes and pseudogenes, MICA and MICB, and retroelement indels as haplotypic lineage markers, and single-nucleotide polymorphism (SNP) crossover loci in DNA sequence alignments of different haplotypes across the Olfactory Receptor (OR) gene region (~1.2 Mb) and the MHC class I region (~1.8 Mb) from the GPX5 to the MICB gene. Our comparative sequence analyses confirmed the identity of 12 haplotypic retroelement markers and revealed that they partitioned the HLA-A/B/C haplotypes into distinct evolutionary lineages. Crossovers between SNP-poor and SNP-rich regions defined the sequence range of haplotype blocks, and many of these crossover junctions occurred within particular transposable elements, lncRNA, OR12D2, MUC21, MUC22, PSORS1A3, HLA-C, HLA-B, and MICA. In a comparison of more than 250 paired sequence alignments, at least 38 SNP-density crossover sites were mapped across various regions from GPX5 to MICB. In a homology comparison of 16 different haplotypes, seven CEH/AH (7.1, 8.1, 18.2, 51.x, 57.1, 62.x, and 62.1) had no detectable SNP-density crossover junctions and were SNP poor across the entire ~2.8 Mb of sequence alignments. Of the analyses between different recombinant haplotypes, more than half of them had SNP crossovers within 10 kb of LTR16B/ERV3-16A3_I, MLT1, Charlie, and/or THE1 sequences and were in close vicinity to structurally polymorphic Alu and SVA insertion sites. 
These studies demonstrate that (1) SNP-density crossovers are associated with putative ancestral recombination sites that are widely spread across the MHC class I genomic region from at least the telomeric OR12D2 gene to the centromeric MICB gene and (2) the genomic sequences of MHC homozygous cell lines are useful for analysing haplotype blocks, ancestral haplotypic landscapes and markers, CPSs, and SNP-density crossover junctions.
Affiliation(s)
- Jerzy K. Kulski
  - Faculty of Health and Medical Sciences, Medical School, The University of Western Australia, Crawley, WA, Australia
  - Division of Basic Medical Science and Molecular Medicine, Department of Molecular Life Science, Tokai University School of Medicine, Isehara, Japan
- Shingo Suzuki
  - Division of Basic Medical Science and Molecular Medicine, Department of Molecular Life Science, Tokai University School of Medicine, Isehara, Japan
- Takashi Shiina
  - Division of Basic Medical Science and Molecular Medicine, Department of Molecular Life Science, Tokai University School of Medicine, Isehara, Japan
25
Merlo MA, Portela-Bens S, Rodríguez ME, García-Angulo A, Cross I, Arias-Pérez A, García E, Rebordinos L. A Comprehensive Integrated Genetic Map of the Complete Karyotype of Solea senegalensis (Kaup 1858). Genes (Basel) 2020; 12:genes12010049. [PMID: 33396249 PMCID: PMC7824234 DOI: 10.3390/genes12010049] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 11/30/2020] [Revised: 12/24/2020] [Accepted: 12/28/2020] [Indexed: 12/23/2022] Open
Abstract
Solea senegalensis aquaculture production has increased greatly in the last decade and, consequently, knowledge of the species' genome is gaining attention. A high-density genome map of the species could therefore offer clues for improving those aspects of its aquaculture that remain unresolved. In the present article, a review combined with newly processed data yielded a high-density BAC-based cytogenetic map of S. senegalensis, alongside an analysis of the sequences of these BAC clones to obtain integrated data. A total of 93 BAC clones were used to localize the chromosome complement of the species and 588 genes were annotated, covering almost 2.5% of the S. senegalensis genome sequence. As a result, important data about its genome organization and evolution were obtained, such as the lower gene density of the large metacentric pair compared with the other metacentric chromosomes, which supports the theory of a sex proto-chromosome pair. In addition, chromosomes with a high number of linked genes that are conserved, even in distant species, were detected. Results of this kind widen the knowledge of this species’ chromosome dynamics and evolution.
26
Zhou H, Simion V, Pierce JB, Haemmig S, Chen AF, Feinberg MW. LncRNA-MAP3K4 regulates vascular inflammation through the p38 MAPK signaling pathway and cis-modulation of MAP3K4. FASEB J 2020; 35:e21133. [PMID: 33184917 DOI: 10.1096/fj.202001654rr] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 07/01/2020] [Revised: 09/23/2020] [Accepted: 10/08/2020] [Indexed: 12/12/2022]
Abstract
Chronic vascular inflammation plays a key role in the pathogenesis of atherosclerosis. Long non-coding RNAs (lncRNAs) have emerged as essential inflammation regulators. We identify a novel lncRNA termed lncRNA-MAP3K4 that is enriched in the vessel wall and regulates vascular inflammation. In the aortic intima, lncRNA-MAP3K4 expression was reduced by 50% during the progression of atherosclerosis (chronic inflammation) and 70% during endotoxemia (acute inflammation). lncRNA-MAP3K4 knockdown reduced the expression of key inflammatory factors (e.g., ICAM-1, E-selectin, MCP-1) in endothelial cells or vascular smooth muscle cells, decreased monocyte adhesion to endothelium, and reduced TNF-α, IL-1β, and COX2 expression in macrophages. Mechanistically, lncRNA-MAP3K4 regulates inflammation through the p38 MAPK signaling pathway. lncRNA-MAP3K4 shares a bidirectional promoter with MAP3K4, an upstream regulator of the MAPK signaling pathway, and regulates its transcription in cis. lncRNA-MAP3K4 and MAP3K4 show coordinated expression in response to inflammation in vivo and in vitro. Similar to lncRNA-MAP3K4, MAP3K4 knockdown reduced the expression of inflammatory factors in several different vascular cells. Furthermore, lncRNA-MAP3K4 and MAP3K4 knockdown showed cooperativity in reducing inflammation in endothelial cells. Collectively, these findings unveil the role of a novel lncRNA in vascular inflammation by cis-regulating MAP3K4 via a p38 MAPK pathway.
Affiliation(s)
- Haoyang Zhou
  - Department of Medicine, Cardiovascular Division, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
  - Department of Cardiology, The Third Xiangya Hospital of Central South University, Changsha, China
- Viorel Simion
  - Department of Medicine, Cardiovascular Division, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Jacob B Pierce
  - Department of Medicine, Cardiovascular Division, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
  - Feinberg School of Medicine, Northwestern University, Chicago, IL, USA
- Stefan Haemmig
  - Department of Medicine, Cardiovascular Division, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Alex F Chen
  - Department of Cardiology, The Third Xiangya Hospital of Central South University, Changsha, China
- Mark W Feinberg
  - Department of Medicine, Cardiovascular Division, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
27
Dorji J, Vander Jagt CJ, Garner JB, Marett LC, Mason BA, Reich CM, Xiang R, Clark EL, Cocks BG, Chamberlain AJ, MacLeod IM, Daetwyler HD. Expression of mitochondrial protein genes encoded by nuclear and mitochondrial genomes correlate with energy metabolism in dairy cattle. BMC Genomics 2020; 21:720. [PMID: 33076826 PMCID: PMC7574280 DOI: 10.1186/s12864-020-07018-7] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 01/28/2020] [Accepted: 08/20/2020] [Indexed: 12/21/2022] Open
Abstract
Background: Mutations in the mitochondrial genome have been implicated in mitochondrial disease, often characterized by impaired cellular energy metabolism. Cellular energy metabolism in mitochondria involves mitochondrial proteins (MP) from both the nuclear (NuMP) and mitochondrial (MtMP) genomes. The expression of MP genes may be tissue specific to meet the varying energy demands across tissues. Currently, the characteristics of MP gene expression in tissues of dairy cattle are not well understood. In this study, we profile the expression of MP genes in 29 adult and six foetal tissues in dairy cattle using RNA sequencing and gene expression analyses, particularly differential gene expression and co-expression network analyses.
Results: MP genes were differentially expressed (DE; over-expressed or under-expressed) across tissues in cattle. All 29 tissues showed DE NuMP genes in varying proportions of over-expression and under-expression. On the other hand, DE of MtMP genes was observed in < 50% of tissues, and notably the MtMP genes within a tissue were either all over-expressed or all under-expressed. A high proportion of NuMP (up to 60%) and MtMP (up to 100%) genes were over-expressed in tissues with expected high metabolic demand (heart, skeletal muscles and tongue), and under-expressed (up to 45% of NuMP, 77% of MtMP genes) in tissues with expected low metabolic rates (leukocytes, thymus, and lymph nodes). In these tissues, all MtMP genes were also invariably expressed in the same direction as the dominant NuMP gene expression. The NuMP and MtMP genes were highly co-expressed across tissues, and co-expression of genes in a cluster was non-random and functionally enriched for the energy generation pathway. The differential gene expression and co-expression patterns were validated in independent cow and sheep datasets.
Conclusions: The results of this study support the concept that there is biological interaction between MP genes from the mitochondrial and nuclear genomes, given their over-expression in tissues with high energy demand and their co-expression across tissues. This highlights the importance of considering MP genes from both genomes in future studies related to mitochondrial functions and traits related to energy metabolism.
Affiliation(s)
- Jigme Dorji
  - School of Applied Systems Biology, La Trobe University, Bundoora, VIC, 3083, Australia
  - Agriculture Victoria, AgriBio, Centre for AgriBioscience, Bundoora, VIC, 3083, Australia
- Christy J Vander Jagt
  - Agriculture Victoria, AgriBio, Centre for AgriBioscience, Bundoora, VIC, 3083, Australia
- Josie B Garner
  - Agriculture Victoria, Ellinbank Dairy Centre, Ellinbank, VIC, 3822, Australia
- Leah C Marett
  - Agriculture Victoria, Ellinbank Dairy Centre, Ellinbank, VIC, 3822, Australia
- Brett A Mason
  - Agriculture Victoria, AgriBio, Centre for AgriBioscience, Bundoora, VIC, 3083, Australia
- Coralie M Reich
  - Agriculture Victoria, AgriBio, Centre for AgriBioscience, Bundoora, VIC, 3083, Australia
- Ruidong Xiang
  - Agriculture Victoria, AgriBio, Centre for AgriBioscience, Bundoora, VIC, 3083, Australia
  - Faculty of Veterinary & Agricultural Science, University of Melbourne, Parkville, VIC, 3052, Australia
- Emily L Clark
  - The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Edinburgh, Scotland, UK
- Benjamin G Cocks
  - School of Applied Systems Biology, La Trobe University, Bundoora, VIC, 3083, Australia
  - Agriculture Victoria, AgriBio, Centre for AgriBioscience, Bundoora, VIC, 3083, Australia
- Amanda J Chamberlain
  - Agriculture Victoria, AgriBio, Centre for AgriBioscience, Bundoora, VIC, 3083, Australia
- Iona M MacLeod
  - Agriculture Victoria, AgriBio, Centre for AgriBioscience, Bundoora, VIC, 3083, Australia
- Hans D Daetwyler
  - School of Applied Systems Biology, La Trobe University, Bundoora, VIC, 3083, Australia
  - Agriculture Victoria, AgriBio, Centre for AgriBioscience, Bundoora, VIC, 3083, Australia
28
Xie X, Spiteller D, Huhn T, Schink B, Müller N. Desulfatiglans anilini Initiates Degradation of Aniline With the Production of Phenylphosphoamidate and 4-Aminobenzoate as Intermediates Through Synthases and Carboxylases From Different Gene Clusters. Front Microbiol 2020; 11:2064. [PMID: 33013754 PMCID: PMC7500099 DOI: 10.3389/fmicb.2020.02064] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/30/2020] [Accepted: 08/05/2020] [Indexed: 01/22/2023] Open
Abstract
The anaerobic degradation of aniline was studied in the sulfate-reducing bacterium Desulfatiglans anilini. Our aim was to identify the genes and their proteins that are required for the initial activation of aniline as well as to characterize intermediates of this reaction. Aniline-induced genes were revealed by comparison of the proteomes of D. anilini grown with different substrates (aniline, 4-aminobenzoate, phenol, and benzoate). Most genes encoding proteins that were highly abundant in aniline- or 4-aminobenzoate-grown D. anilini cells but not in phenol- or benzoate-grown cells were located in the putative gene clusters ani (aniline degradation), hcr (4-hydroxybenzoyl-CoA reductase) and phe (phenol degradation). Of these putative gene clusters, only the phe gene cluster has been studied previously. Based on the differential proteome analysis, four candidate genes coding for kinase subunits and carboxylase subunits were suspected to be responsible for the initial conversion of aniline to 4-aminobenzoate. These genes were cloned and overproduced in E. coli. The recombinant proteins were obtained in inclusion bodies but could be refolded successfully. Two subunits of phenylphosphoamidate synthase and two carboxylase subunits converted aniline to 4-aminobenzoate with phenylphosphoamidate as intermediate under consumption of ATP. Phenylphosphoamidate was converted to 4-aminobenzoate in vitro only when both carboxylase subunits, one from gene cluster ani and the other from gene cluster phe, were combined, with Mn2+, K+, and FMN as co-factors. Thus, aniline is degraded by the anaerobic bacterium D. anilini only by recruiting genes for the enzymatic machinery from different gene clusters. We conclude that D. anilini carboxylates aniline to 4-aminobenzoate via phenylphosphoamidate as an energy-rich intermediate, analogous to the degradation of phenol to 4-hydroxybenzoate via phenylphosphate.
Affiliation(s)
- Xiaoman Xie
  - Department of Biology, Universität Konstanz, Konstanz, Germany
  - Konstanz Research School Chemical Biology, Konstanz, Germany
- Dieter Spiteller
  - Department of Biology, Universität Konstanz, Konstanz, Germany
  - Konstanz Research School Chemical Biology, Konstanz, Germany
- Thomas Huhn
  - Konstanz Research School Chemical Biology, Konstanz, Germany
  - Department of Chemistry, Universität Konstanz, Konstanz, Germany
- Bernhard Schink
  - Department of Biology, Universität Konstanz, Konstanz, Germany
  - Konstanz Research School Chemical Biology, Konstanz, Germany
- Nicolai Müller
  - Department of Biology, Universität Konstanz, Konstanz, Germany
29
Marcet-Houben M, Gabaldón T. EvolClust: automated inference of evolutionary conserved gene clusters in eukaryotes. Bioinformatics 2020; 36:1265-1266. [PMID: 31560365 PMCID: PMC7703780 DOI: 10.1093/bioinformatics/btz706] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 03/13/2019] [Revised: 08/30/2019] [Accepted: 09/25/2019] [Indexed: 11/14/2022] Open
Abstract
MOTIVATION: The evolution and role of gene clusters in eukaryotes are poorly understood. Currently, most studies and computational prediction programs limit their focus to specific types of clusters, such as those involved in secondary metabolism. RESULTS: We present EvolClust, a Python-based tool for the inference of evolutionarily conserved gene clusters from genome comparisons, independently of the function or gene composition of the cluster. EvolClust predicts conserved gene clusters from pairwise genome comparisons and infers families of related clusters from multiple (all versus all) genome comparisons. AVAILABILITY AND IMPLEMENTATION: https://github.com/Gabaldonlab/EvolClust/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Marina Marcet-Houben
  - Centre for Genomic Regulation (CRG), Bioinformatics and Genomics department, The Barcelona Institute of Science and Technology, Barcelona 08003, Spain
  - Health and Experimental Sciences Department, Universitat Pompeu Fabra (UPF), Barcelona 08003, Spain
- Toni Gabaldón
  - Centre for Genomic Regulation (CRG), Bioinformatics and Genomics department, The Barcelona Institute of Science and Technology, Barcelona 08003, Spain
  - Health and Experimental Sciences Department, Universitat Pompeu Fabra (UPF), Barcelona 08003, Spain
  - ICREA, Barcelona 08010, Spain
30
Lee RR, Chae E. Variation Patterns of NLR Clusters in Arabidopsis thaliana Genomes. PLANT COMMUNICATIONS 2020; 1:100089. [PMID: 33367252 PMCID: PMC7747988 DOI: 10.1016/j.xplc.2020.100089] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 01/24/2020] [Revised: 06/15/2020] [Accepted: 06/17/2020] [Indexed: 05/04/2023]
Abstract
The nucleotide-binding domain and leucine-rich repeat (NLR) gene family is highly expanded in the plant lineage with extensive sequence and structure polymorphisms. To survey the landscape of NLR expansion, we mined the published long-read data generated by the resistance gene enrichment sequencing of 64 diverse Arabidopsis thaliana accessions. We found that the hot spots of massive multi-gene NLR cluster expansion did not typically span the whole cluster; instead, they were restricted to a handful of, or only one, dominant radiation(s). All sequences in such a radiation were distinct from other genes in the cluster but not from each other in the clade, making it difficult to assign trustworthy reference-based orthologies when multiple reference genes were present in the radiation. Consequently, NLR genes can be broadly divided into two types: radiating or high-fidelity, where high-fidelity genes are well conserved and well separated from other clades. A similar distinction could be made for NLR clusters, depending on whether cluster size was determined primarily by extensive radiation or the presence of numerous high-fidelity genes. We also identified groups of well-conserved NLR clades that were missing from the Columbia-0 reference genome. This suggests that the classification of NLRs using gene IDs from a single reference accession can rarely capture all major paralogs in a cluster accurately and representatively and that a reference-agnostic perspective is required to properly characterize these additional variations. Finally, we present a quantitative visualization method for differentiating these situations in a given clade of interest.
31
Monitoring the prolonged Tnf stimulation in space and time with topological-functional networks. Comput Struct Biotechnol J 2020; 18:220-229. [PMID: 32021663 PMCID: PMC6994266 DOI: 10.1016/j.csbj.2020.01.001] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/15/2019] [Revised: 12/18/2019] [Accepted: 01/03/2020] [Indexed: 11/21/2022] Open
Abstract
Genes in linear proximity often share regulatory inputs, expression and evolutionary patterns, even in complex eukaryote genomes with extensive intergenic sequences. Gene regulation, on the other hand, is effected through the co-ordinated activation (or suppression) of genes participating in common biological pathways, which are often transcribed from distant loci. Existing approaches for the study of gene expression focus on the functional aspect, taking positional constraints into account only marginally. In this work we propose a novel concept for the study of gene expression, through the combination of topological and functional information into bipartite networks. Starting from genome-wide expression profiles, we define extended chromosomal regions with consistent patterns of differential gene expression and then associate these domains with enriched functional pathways. By analyzing the resulting networks in terms of size, connectivity and modularity, we can draw conclusions on the way genome organization may underlie the gene regulation program. Applying this approach to a detailed RNASeq profiling of sustained Tnf stimulation of mouse synovial fibroblasts allowed us to identify unexpected regulatory changes taking place in the cells after 24 h of stimulation. Bipartite network analysis suggests that the cytokine response set by Tnf progresses through two distinct transitions: an early generalization of the inflammatory response, followed by a late shutdown of immune-related functions and the redistribution of expression to developmental and cell adhesion pathways and distinct chromosomal regions. We show that the incorporation of topological information may provide additional insights into the complex propagation of Tnf activation.
32
Comparative Transcriptome Analysis of Different Dendrobium Species Reveals Active Ingredients-Related Genes and Pathways. Int J Mol Sci 2020; 21:ijms21030861. [PMID: 32013237 PMCID: PMC7037882 DOI: 10.3390/ijms21030861] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 12/24/2019] [Revised: 01/24/2020] [Accepted: 01/27/2020] [Indexed: 02/06/2023] Open
Abstract
Dendrobium, which contains many kinds of active ingredients, is widely used in traditional Chinese medicine. In recent years, many Dendrobium transcriptomes have been sequenced. Hence, weighted gene co-expression network analysis (WGCNA) was used with the gene expression profiles of active ingredients to identify the modules and genes that may be associated with particular species and tissues. Three Dendrobium species and three tissues were sampled for RNA-seq to generate a high-quality, full-length transcriptome database. Based on significant changes in gene expression, we constructed co-expression networks and revealed 19 gene modules. Among them, four modules with properties correlating to the regulation and biosynthesis of active ingredients, and several hub genes, were selected for further functional investigation. This is the first time the WGCNA method has been used to analyze Dendrobium transcriptome data. Further mining of the gene module information will help to clarify the role and significance of key genes, key signaling pathways, and regulatory mechanisms between genes in the occurrence and development of the medicinal components of Dendrobium.
33
Nam KI, Yoon G, Kim YK, Song J. Transcriptome Analysis of Pineal Glands in the Mouse Model of Alzheimer's Disease. Front Mol Neurosci 2020; 12:318. [PMID: 31998073 PMCID: PMC6962250 DOI: 10.3389/fnmol.2019.00318] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 09/18/2019] [Accepted: 12/13/2019] [Indexed: 01/08/2023] Open
Abstract
The pineal gland maintains the circadian rhythm in the body by secreting the hormone melatonin. Alzheimer's disease (AD) is the most common neurodegenerative disease. Pineal gland impairment in AD is widely observed, but no study to date has analyzed the transcriptome in the pineal glands of AD. To establish resources for the study on pineal gland dysfunction in AD, we performed a transcriptome analysis of the pineal glands of AD model mice and compared them to those of wild type mice. We identified the global change of diverse protein-coding RNAs, which are implicated in the alteration in cellular transport, protein transport, protein folding, collagen expression, histone dosage, and the electron transfer system. We also discovered various dysregulated long noncoding RNAs and circular RNAs in the pineal glands of mice with AD. This study showed that the expression of diverse RNAs with important functional implications in AD was changed in the pineal gland of the AD mouse model. The analyzed data reported in this study will be an important resource for future studies to elucidate the altered physiology of the pineal gland in AD.
Affiliation(s)
- Kwang Il Nam
  - Department of Anatomy, Chonnam National University Medical School, Jeollanam-do, South Korea
- Gwangho Yoon
  - Department of Anatomy, Chonnam National University Medical School, Jeollanam-do, South Korea
  - Department of Biochemistry, Chonnam National University Medical School, Jeollanam-do, South Korea
- Young-Kook Kim
  - Department of Biochemistry, Chonnam National University Medical School, Jeollanam-do, South Korea
  - Department of Biomedical Sciences, Center for Creative Biomedical Scientists at Chonnam National University, Jeollanam-do, South Korea
- Juhyun Song
  - Department of Anatomy, Chonnam National University Medical School, Jeollanam-do, South Korea
  - Department of Biomedical Sciences, Center for Creative Biomedical Scientists at Chonnam National University, Jeollanam-do, South Korea
34
Patterns of diverse gene functions in genomic neighborhoods predict gene function and phenotype. Sci Rep 2019; 9:19537. [PMID: 31863070 PMCID: PMC6925100 DOI: 10.1038/s41598-019-55984-0] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 03/26/2019] [Accepted: 12/02/2019] [Indexed: 01/01/2023] Open
Abstract
Genes with similar roles in the cell cluster on chromosomes, thus benefiting from coordinated regulation. This allows gene function to be inferred by transferring annotations from genomic neighbors, following the guilt-by-association principle. We performed a systematic search for co-occurrence of >1000 gene functions in genomic neighborhoods across 1669 prokaryotic, 49 fungal and 80 metazoan genomes, revealing prevalent patterns that cannot be explained by clustering of functionally similar genes. It is a very common occurrence that pairs of dissimilar gene functions – corresponding to semantically distant Gene Ontology terms – are significantly co-located on chromosomes. These neighborhood associations are often as conserved across genomes as the known associations between similar functions, suggesting selective benefits from clustering of certain diverse functions, which may conceivably play complementary roles in the cell. We propose a simple encoding of chromosomal gene order, the neighborhood function profiles (NFP), which draws on diverse gene clustering patterns to predict gene function and phenotype. NFPs yield a 26–46% increase in predictive power over state-of-the-art approaches that propagate function across neighborhoods, thus providing hundreds of novel, high-confidence gene function inferences per genome. Furthermore, we demonstrate that copy number-neutral structural variation that shapes gene function distribution across chromosomes can predict phenotype of individuals from their genome sequence.
35
Qu Y, Bi C, He B, Ye N, Yin T, Xu LA. Genome-wide identification and characterization of the MADS-box gene family in Salix suchowensis. PeerJ 2019; 7:e8019. [PMID: 31720123 PMCID: PMC6842560 DOI: 10.7717/peerj.8019] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 09/07/2018] [Accepted: 10/09/2019] [Indexed: 01/19/2023] Open
Abstract
MADS-box genes encode transcription factors that participate in various plant growth and development processes, particularly floral organogenesis. To date, MADS-box genes have been reported in many species; the completion of the willow genome sequence provides the opportunity to conduct a comprehensive analysis of the willow MADS-box gene family. Here, we identified 60 willow MADS-box genes using bioinformatics-based methods and classified them into 22 M-type (11 Mα, seven Mβ and four Mγ) and 38 MIKC-type (32 MIKCc and six MIKC*) genes based on a phylogenetic analysis. Fifty-six of the 60 SsMADS genes were randomly distributed on 19 putative willow chromosomes. By combining gene structure analysis with evolutionary analysis, we found that the MIKC-type genes were more conserved and played a more important role in willow growth. Further study showed that the MIKC* type was a transition between the M-type and MIKC-type. Additionally, the number of MADS-box genes in gymnosperms was notably lower than that in angiosperms. Finally, the expression profiles of these willow MADS-box genes were analysed in five different tissues (root, stem, leaf, bud and bark) and validated by RT-qPCR experiments. This study is the first genome-wide analysis of the willow MADS-box gene family, and the results establish a basis for further functional studies of willow MADS-box genes and serve as a reference for related studies of other woody plants.
Affiliation(s)
- Yanshu Qu
  - Co-Innovation Center for Sustainable Forestry in Southern China, Nanjing Forestry University, Nanjing, China
- Changwei Bi
  - School of Biological Science and Medical Engineering, Southeast University, Nanjing, China
- Bing He
  - Co-Innovation Center for Sustainable Forestry in Southern China, Nanjing Forestry University, Nanjing, China
- Ning Ye
  - College of Information Science and Technology, Nanjing Forestry University, Nanjing, China
- Tongming Yin
  - Co-Innovation Center for Sustainable Forestry in Southern China, Nanjing Forestry University, Nanjing, China
- Li-An Xu
  - Co-Innovation Center for Sustainable Forestry in Southern China, Nanjing Forestry University, Nanjing, China
36
Dai Z. Gene Repositioning Is Under Constraints After Evolutionary Conserved Gene Neighborhood Separate. Front Genet 2019; 10:1030. [PMID: 31632448 PMCID: PMC6785632 DOI: 10.3389/fgene.2019.01030] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 06/12/2019] [Accepted: 09/25/2019] [Indexed: 11/13/2022] Open
Abstract
Genes are not randomly distributed on eukaryotic chromosomes. Some neighboring genes show order conservation among species, while some neighboring genes separate during evolution even though their neighborhoods are conserved in some species. Here, I investigated whether after-separation gene repositioning is under natural selection for evolutionary conserved gene neighborhoods compared with nonconserved neighborhoods. After separation, genes with conserved neighborhoods show low-expression divergence between the after-separation species and the before-separation species. After genes separate from their conserved gene neighbors, their after-separation gene neighbors tend to show coexpression and coprotein complex with their before-separation gene neighbors. These results indicate evolutionary constraints on the selection of neighboring genes after evolutionary conserved gene neighborhoods separate.
Affiliation(s)
- Zhiming Dai
  - School of Data and Computer Science, Sun Yat-Sen University, Guangzhou, China
  - Guangdong Province Key Laboratory of Big Data Analysis and Processing, Sun Yat-Sen University, Guangzhou, China
37
Xu H, Liu JJ, Liu Z, Li Y, Jin YS, Zhang J. Synchronization of stochastic expressions drives the clustering of functionally related genes. SCIENCE ADVANCES 2019; 5:eaax6525. [PMID: 31633028 PMCID: PMC6785257 DOI: 10.1126/sciadv.aax6525] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 04/10/2019] [Accepted: 09/10/2019] [Indexed: 05/18/2023]
Abstract
Functionally related genes tend to be chromosomally clustered in eukaryotic genomes even after the exclusion of tandem duplicates, but the biological significance of this widespread phenomenon is unclear. We propose that stochastic expression fluctuations of neighboring genes resulting from chromatin dynamics are more or less synchronized such that their expression ratio is more stable than that for unlinked genes. Consequently, chromosomal clustering could be advantageous when the expression ratio of the clustered genes needs to stay constant, for example, because of the accumulation of toxic compounds when this ratio is altered. Evidence from manipulative experiments on the yeast GAL cluster, comprising three chromosomally adjacent genes encoding enzymes catalyzing consecutive reactions in galactose catabolism, unequivocally supports this hypothesis and elucidates how disorder in one biological phenomenon (gene expression noise) could prompt the emergence of order in another (genome organization).
Collapse
Affiliation(s)
- Haiqing Xu
- Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, MI 48109, USA
- Jing-Jing Liu
- Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
- Zhen Liu
- Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, MI 48109, USA
- State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming 650223, China
- Ying Li
- Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, MI 48109, USA
- Yong-Su Jin
- Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
- Department of Food Science and Human Nutrition, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
- Jianzhi Zhang
- Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, MI 48109, USA
- Corresponding author.
38
Marcet-Houben M, Gabaldón T. Evolutionary and functional patterns of shared gene neighbourhood in fungi. Nat Microbiol 2019; 4:2383-2392. [PMID: 31527797 DOI: 10.1038/s41564-019-0552-0] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Received: 12/13/2018] [Accepted: 07/29/2019] [Indexed: 11/09/2022]
Abstract
Gene clusters comprise genomically co-localized and potentially co-regulated genes that tend to be conserved across species. In eukaryotes, multiple examples of metabolic gene clusters are known, particularly among fungi and plants. However, little is known about how gene clustering patterns vary among taxa or with respect to functional roles. Furthermore, mechanisms of the formation, maintenance and evolution of gene clusters remain unknown. We surveyed 341 fungal genomes to discover gene clusters shared by different species, independently of their functions. We inferred 12,120 cluster families, which comprised roughly one third of the gene space and were enriched in genes associated with diverse cellular functions. Additionally, most clusters did not encode transcription factors, suggesting that they are regulated distally. We used phylogenomics to characterize the evolutionary history of these clusters. We found that most clusters originated once and were transmitted vertically, coupled to differential loss. However, convergent evolution (that is, independent appearance of the same cluster) was more prevalent than anticipated. Finally, horizontal gene transfer of entire clusters was somewhat restricted, with the exception of those associated with secondary metabolism. Altogether, our results provide insights on the evolution of gene clustering as well as a broad catalogue of evolutionarily conserved gene clusters whose function remains to be elucidated.
Affiliation(s)
- Marina Marcet-Houben
- Centre for Genomic Regulation, The Barcelona Institute of Science and Technology, Barcelona, Spain.,Universitat Pompeu Fabra, Barcelona, Spain.,Barcelona Supercomputing Centre (BSC-CNS), Institute for Research in Biomedicine (IRB), Barcelona, Spain
- Toni Gabaldón
- Centre for Genomic Regulation, The Barcelona Institute of Science and Technology, Barcelona, Spain.,Universitat Pompeu Fabra, Barcelona, Spain.,ICREA, Barcelona, Spain.,Barcelona Supercomputing Centre (BSC-CNS), Institute for Research in Biomedicine (IRB), Barcelona, Spain
39
van Wersch S, Li X. Stronger When Together: Clustering of Plant NLR Disease resistance Genes. TRENDS IN PLANT SCIENCE 2019; 24:688-699. [PMID: 31266697 DOI: 10.1016/j.tplants.2019.05.005] [Citation(s) in RCA: 55] [Impact Index Per Article: 11.0] [Received: 02/24/2019] [Revised: 05/05/2019] [Accepted: 05/16/2019] [Indexed: 05/14/2023]
Abstract
Gene clustering is rare in eukaryotes. However, nucleotide-binding leucine-rich repeat (NLR)-encoding disease resistance (R) genes show consistent clustering in plant genomes. These arrangements are likely to provide coregulatory benefits, as suggested by growing evidence that the gene products of both paired and larger clusters of NLRs act together in triggering immunity. Head-to-head gene pairs where one of the encoded NLRs includes an integrated decoy domain appear to behave differently than clusters evolved from closely related typical NLRs. These patterns may help to explain the broad resistance that most plants have despite their finite number of R genes. By taking into consideration the relationship between genomic arrangement and function, we can improve our understanding of and ability to predict plant immune detection.
Affiliation(s)
- Solveig van Wersch
- Michael Smith Laboratories, University of British Columbia, Vancouver, BC V6T 1Z4, Canada; Department of Botany, University of British Columbia, Vancouver, BC V6T 1Z4, Canada
- Xin Li
- Michael Smith Laboratories, University of British Columbia, Vancouver, BC V6T 1Z4, Canada; Department of Botany, University of British Columbia, Vancouver, BC V6T 1Z4, Canada.
40
Aiewsakun P, Simmonds P, Katzourakis A. The First Co-Opted Endogenous Foamy Viruses and the Evolutionary History of Reptilian Foamy Viruses. Viruses 2019; 11:v11070641. [PMID: 31336856 PMCID: PMC6669660 DOI: 10.3390/v11070641] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 05/09/2019] [Revised: 07/01/2019] [Accepted: 07/04/2019] [Indexed: 12/17/2022]
Abstract
A recent study reported the discovery of an endogenous reptilian foamy virus (FV), termed ERV-Spuma-Spu, found in the genome of tuatara. Here, we report two novel reptilian foamy viruses also identified as endogenous FVs (EFVs) in the genomes of panther gecko (ERV-Spuma-Ppi) and Schlegel’s Japanese gecko (ERV-Spuma-Gja). Their presence indicates that FVs are capable of infecting reptiles in addition to mammals, amphibians, and fish. Numerous copies of full length ERV-Spuma-Spu elements were found in the tuatara genome littered with in-frame stop codons and transposable elements, suggesting that they are indeed endogenous and are not functional. ERV-Spuma-Ppi and ERV-Spuma-Gja, on the other hand, consist solely of a foamy virus-like env gene. Examination of host flanking sequences revealed that they are orthologous, and despite being more than 96 million years old, their env reading frames are fully coding competent with evidence for strong purifying selection to maintain expression and for them likely being transcriptionally active. These make them the oldest EFVs discovered thus far and the first documented EFVs that may have been co-opted for potential cellular functions. Phylogenetic analyses revealed a complex virus–host co-evolutionary history and cross-species transmission routes of ancient FVs.
Affiliation(s)
- Pakorn Aiewsakun
- Department of Microbiology, Faculty of Science, Mahidol University, Bangkok 10400, Thailand.
- Peter Simmonds
- Nuffield Department of Medicine, University of Oxford, South Parks Road, Oxford OX1 3SY, UK
- Aris Katzourakis
- Department of Zoology, University of Oxford, South Parks Road, Oxford OX1 3SY, UK.
41
Zhang Q, Liu W, Liu C, Lin SY, Guo AY. SEGtool: a specifically expressed gene detection tool and applications in human tissue and single-cell sequencing data. Brief Bioinform 2019; 19:1325-1336. [PMID: 28981576 DOI: 10.1093/bib/bbx074] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Received: 04/12/2017] [Indexed: 12/20/2022]
Abstract
Different tissues and diseases have distinct transcriptional profiles with specifically expressed genes (SEGs). The identification of SEGs is therefore an important issue in studies of gene function, biological development, disease mechanism and biomarker discovery. However, few accurate and easy-to-use tools are available for detecting SEGs in RNA sequencing (RNA-seq) data. Here, we present SEGtool, a tool based on fuzzy c-means, the Jaccard index and a greedy annealing method that detects SEGs automatically and self-adaptively, regardless of the data distribution. Test results showed that SEGtool outperforms existing tools, which were mainly developed for microarray data. By applying SEGtool to the Genotype-Tissue Expression (GTEx) human tissue data set, we detected 3181 SEGs with tissue-related functions. Regulatory networks reveal tissue-specific transcription factors regulating many SEGs, such as ETV2 in testis, HNF4A in liver and NEUROD1 in brain. In a case study of single-cell sequencing (SCS) data from embryo cells, we identified many SEGs in specific stages of human embryogenesis. Notably, SEGtool is suitable for RNA-seq data and even SCS data, with high specificity and accuracy. An implementation of SEGtool as an R package is freely available at http://bioinfo.life.hust.edu.cn/SEGtool/.
Affiliation(s)
- Qiong Zhang
- Huazhong University of Science and Technology, China
- Wei Liu
- Huazhong University of Science and Technology, China
- Chunjie Liu
- Huazhong University of Science and Technology, China
- Sheng-Yan Lin
- Huazhong University of Science and Technology, China
- An-Yuan Guo
- Huazhong University of Science and Technology, China
42
Lee BY, Choi BS, Kim MS, Park JC, Jeong CB, Han J, Lee JS. The genome of the freshwater water flea Daphnia magna: A potential use for freshwater molecular ecotoxicology. AQUATIC TOXICOLOGY (AMSTERDAM, NETHERLANDS) 2019; 210:69-84. [PMID: 30826642 DOI: 10.1016/j.aquatox.2019.02.009] [Citation(s) in RCA: 77] [Impact Index Per Article: 15.4] [Received: 01/11/2019] [Revised: 02/14/2019] [Accepted: 02/14/2019] [Indexed: 06/09/2023]
Abstract
The water flea Daphnia magna is a small planktonic cladoceran. D. magna has been used as a model species for ecotoxicology, as it is sensitive to environmental stressors and environmental changes. Since Daphnia is affected by its culture environment and each population/strain has its own ecological and genetic characteristics, population/strain-based genome information is useful for environmental genomic studies. In this study, we assembled and characterized the genome of D. magna. Using a high-density genetic map of D. magna xinb3, the draft genome was integrated into 10 linkage groups (LGs). The total length of the integrated genome was about 123 Mb with N50 = 10.1 Mb, and the number of scaffolds was 4193 including 10 LGs. A total of 15,721 genes were annotated after manual curation. Orthologous genes were characterized in the genome and compared with other genomes of Daphnia. In addition, we identified defense-related genes such as cytochrome P450 (CYP) genes, glutathione S-transferase (GST) genes, and ATP-binding cassette (ABC) genes from the assembled D. magna genome for its potential use in molecular ecotoxicological studies in the freshwater environment. This genomic resource will help in better understanding the molecular mechanisms of the response to various pollutants.
Affiliation(s)
- Bo-Young Lee
- Department of Biological Science, College of Science, Sungkyunkwan University, Suwon 16419, South Korea
- Min-Sub Kim
- Department of Biological Science, College of Science, Sungkyunkwan University, Suwon 16419, South Korea
- Jun Chul Park
- Department of Biological Science, College of Science, Sungkyunkwan University, Suwon 16419, South Korea
- Chang-Bum Jeong
- Department of Biological Science, College of Science, Sungkyunkwan University, Suwon 16419, South Korea
- Jeonghoon Han
- Department of Biological Science, College of Science, Sungkyunkwan University, Suwon 16419, South Korea
- Jae-Seong Lee
- Department of Biological Science, College of Science, Sungkyunkwan University, Suwon 16419, South Korea.
43
Gupta P, Singh SK. Gene Regulatory Networks: Current Updates and Applications in Plant Biology. ENERGY, ENVIRONMENT, AND SUSTAINABILITY 2019. [DOI: 10.1007/978-981-15-0690-1_18] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 12/12/2022]
44
Leijten W, Koes R, Roobeek I, Frugis G. Translating Flowering Time From Arabidopsis thaliana to Brassicaceae and Asteraceae Crop Species. PLANTS 2018; 7:plants7040111. [PMID: 30558374 PMCID: PMC6313873 DOI: 10.3390/plants7040111] [Citation(s) in RCA: 38] [Impact Index Per Article: 6.3] [Received: 11/01/2018] [Revised: 12/07/2018] [Accepted: 12/13/2018] [Indexed: 12/31/2022]
Abstract
Flowering and seed set are essential for plant species to survive; hence, plants need to adapt to highly variable environments to flower in the most favorable conditions. Endogenous cues such as plant age and hormones coordinate with environmental cues such as temperature and day length to determine the optimal time for the transition from vegetative to reproductive growth. In a breeding context, controlling flowering time would help to speed up the production of new hybrids and produce high yield throughout the year. The flowering time genetic network is extensively studied in the plant model species Arabidopsis thaliana; however, this knowledge is still limited in most crops. This article reviews evidence of conservation and divergence of flowering time regulation in A. thaliana with its related crop species in the Brassicaceae and with more distant vegetable crops within the Asteraceae family. Despite the overall conservation of most flowering time pathways in these families, many genes controlling this trait remain elusive, and the functions of most Arabidopsis homologs in these crops are yet to be determined. However, the knowledge gathered so far in both model and crop species can already be exploited in vegetable crop breeding for flowering time control.
Affiliation(s)
- Willeke Leijten
- ENZA Zaden Research & Development B.V., Haling 1E, 1602 DB Enkhuizen, The Netherlands.
- Ronald Koes
- Swammerdam Institute for Life Sciences (SILS), University of Amsterdam, Science Park 904, 1098 XH Amsterdam, The Netherlands.
- Ilja Roobeek
- ENZA Zaden Research & Development B.V., Haling 1E, 1602 DB Enkhuizen, The Netherlands.
- Giovanna Frugis
- Istituto di Biologia e Biotecnologia Agraria (IBBA), Operative Unit of Rome, Consiglio Nazionale delle Ricerche (CNR), Via Salaria Km. 29,300 - 00015, Monterotondo Scalo, Roma, Italy.
45
Wintergerst L, Selmansberger M, Maihoefer C, Schüttrumpf L, Walch A, Wilke C, Pitea A, Woischke C, Baumeister P, Kirchner T, Belka C, Ganswindt U, Zitzelsberger H, Unger K, Hess J. A prognostic mRNA expression signature of four 16q24.3 genes in radio(chemo)therapy-treated head and neck squamous cell carcinoma (HNSCC). Mol Oncol 2018; 12:2085-2101. [PMID: 30259648 PMCID: PMC6275282 DOI: 10.1002/1878-0261.12388] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Received: 05/08/2018] [Revised: 08/13/2018] [Accepted: 09/12/2018] [Indexed: 01/28/2023]
Abstract
Previously, we have shown that copy number gain of the chromosomal band 16q24.3 is associated with impaired clinical outcome of radiotherapy‐treated head and neck squamous cell carcinoma (HNSCC) patients. We set out to identify a prognostic mRNA signature from genes located on 16q24.3 in radio(chemo)therapy‐treated HNSCC patients of the TCGA (The Cancer Genome Atlas, n = 99) cohort. We applied stepwise forward selection using expression data of 41 16q24.3 genes. The resulting optimal Cox‐proportional hazards regression model included the genes APRT, CENPBD1, CHMP1A, and GALNS. Afterward, the prognostic value of the classifier was confirmed in an independent cohort of HNSCC patients treated by adjuvant radio(chemo)therapy (LMU‐KKG cohort). The signature significantly differentiated high‐ and low‐risk patients with regard to overall survival (HR = 2.01, 95% CI 1.10–3.70; P = 0.02125), recurrence‐free survival (HR = 1.84, 95% CI 1.01–3.34; P = 0.04206), and locoregional recurrence‐free survival (HR = 1.87, 95% CI 1.03–3.40; P = 0.03641). The functional impact of the four signature genes was investigated after reconstruction of a gene association network from transcriptome data of the TCGA HNSCC cohort using a partial correlation approach. Subsequent pathway enrichment analysis of the network neighborhood (first and second) of the signature genes suggests involvement of HNSCC‐associated signaling pathways such as apoptosis, cell cycle, cell adhesion, EGFR, JAK‐STAT, and mTOR. Furthermore, a detailed analysis of the first neighborhood revealed a cluster of co‐expressed genes located on chromosome 16q, substantiating the impact of 16q24.3 alterations in poor clinical outcome of HNSCC. The reported gene expression signature represents a prognostic marker in HNSCC patients following postoperative radio(chemo)therapy.
Affiliation(s)
- Ludmila Wintergerst
- Research Unit Radiation Cytogenetics, Helmholtz Zentrum München, German Research Center for Environmental Health GmbH, Neuherberg, Germany.,Clinical Cooperation Group 'Personalized Radiotherapy in Head and Neck Cancer', Helmholtz Zentrum München, German Research Center for Environmental Health GmbH, Neuherberg, Germany
- Martin Selmansberger
- Research Unit Radiation Cytogenetics, Helmholtz Zentrum München, German Research Center for Environmental Health GmbH, Neuherberg, Germany
- Cornelius Maihoefer
- Clinical Cooperation Group 'Personalized Radiotherapy in Head and Neck Cancer', Helmholtz Zentrum München, German Research Center for Environmental Health GmbH, Neuherberg, Germany.,Department of Radiation Oncology, University Hospital, LMU Munich, Germany
- Lars Schüttrumpf
- Clinical Cooperation Group 'Personalized Radiotherapy in Head and Neck Cancer', Helmholtz Zentrum München, German Research Center for Environmental Health GmbH, Neuherberg, Germany.,Department of Radiation Oncology, University Hospital, LMU Munich, Germany
- Axel Walch
- Research Unit Analytical Pathology, Helmholtz Zentrum München, German Research Center for Environmental Health GmbH, Neuherberg, Germany
- Christina Wilke
- Research Unit Radiation Cytogenetics, Helmholtz Zentrum München, German Research Center for Environmental Health GmbH, Neuherberg, Germany
- Adriana Pitea
- Research Unit Radiation Cytogenetics, Helmholtz Zentrum München, German Research Center for Environmental Health GmbH, Neuherberg, Germany.,Institute of Computational Biology, Helmholtz Zentrum München, German Research Center for Environmental Health GmbH, Neuherberg, Germany
- Philipp Baumeister
- Clinical Cooperation Group 'Personalized Radiotherapy in Head and Neck Cancer', Helmholtz Zentrum München, German Research Center for Environmental Health GmbH, Neuherberg, Germany.,Department of Otorhinolaryngology, Head and Neck Surgery, University Hospital, LMU Munich, Germany
- Thomas Kirchner
- Institute of Pathology, Faculty of Medicine, LMU Munich, Germany
- Claus Belka
- Clinical Cooperation Group 'Personalized Radiotherapy in Head and Neck Cancer', Helmholtz Zentrum München, German Research Center for Environmental Health GmbH, Neuherberg, Germany.,Department of Radiation Oncology, University Hospital, LMU Munich, Germany.,German Cancer Consortium (DKTK), Munich, Germany
- Ute Ganswindt
- Clinical Cooperation Group 'Personalized Radiotherapy in Head and Neck Cancer', Helmholtz Zentrum München, German Research Center for Environmental Health GmbH, Neuherberg, Germany.,Department of Radiation Oncology, University Hospital, LMU Munich, Germany.,Department of Therapeutic Radiology and Oncology, Innsbruck Medical University, Austria
- Horst Zitzelsberger
- Research Unit Radiation Cytogenetics, Helmholtz Zentrum München, German Research Center for Environmental Health GmbH, Neuherberg, Germany.,Clinical Cooperation Group 'Personalized Radiotherapy in Head and Neck Cancer', Helmholtz Zentrum München, German Research Center for Environmental Health GmbH, Neuherberg, Germany.,Department of Radiation Oncology, University Hospital, LMU Munich, Germany
- Kristian Unger
- Research Unit Radiation Cytogenetics, Helmholtz Zentrum München, German Research Center for Environmental Health GmbH, Neuherberg, Germany.,Clinical Cooperation Group 'Personalized Radiotherapy in Head and Neck Cancer', Helmholtz Zentrum München, German Research Center for Environmental Health GmbH, Neuherberg, Germany.,Department of Radiation Oncology, University Hospital, LMU Munich, Germany
- Julia Hess
- Research Unit Radiation Cytogenetics, Helmholtz Zentrum München, German Research Center for Environmental Health GmbH, Neuherberg, Germany.,Clinical Cooperation Group 'Personalized Radiotherapy in Head and Neck Cancer', Helmholtz Zentrum München, German Research Center for Environmental Health GmbH, Neuherberg, Germany.,Department of Radiation Oncology, University Hospital, LMU Munich, Germany
46
Abstract
In bacteria, more than half of the genes in the genome are organized in operons. In contrast, in eukaryotes, functionally related genes are usually dispersed across the genome. There are, however, numerous examples of functional clusters of nonhomologous genes for metabolic pathways in fungi and plants. Despite superficial similarities with operons (physical clustering, coordinate regulation), these clusters have not usually originated by horizontal gene transfer from bacteria, and (unlike operons) the genes are typically transcribed separately rather than as a single polycistronic message. This clustering phenomenon raises intriguing questions about the origins of clustered metabolic pathways in eukaryotes and the significance of clustering for pathway function. Here we review metabolic gene clusters from fungi and plants, highlight commonalities and differences, and consider how these clusters form and are regulated. We also identify opportunities for future research in the areas of large-scale genomics, synthetic biology, and experimental evolution.
Affiliation(s)
- Hans-Wilhelm Nützmann
- Department of Metabolic Biology, John Innes Centre, Norwich NR4 7UH, United Kingdom.,Current affiliation: Milner Centre for Evolution, Department of Biology and Biochemistry, University of Bath, Bath BA2 7AY, United Kingdom
- Claudio Scazzocchio
- Department of Microbiology, Imperial College, London SW7 2AZ, United Kingdom.,Institute for Integrative Biology of the Cell, 91190 Gif-sur-Yvette, France
- Anne Osbourn
- Department of Metabolic Biology, John Innes Centre, Norwich NR4 7UH, United Kingdom
47
Diament A, Tuller T. Modeling three-dimensional genomic organization in evolution and pathogenesis. Semin Cell Dev Biol 2018; 90:78-93. [PMID: 30030143 DOI: 10.1016/j.semcdb.2018.07.008] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 03/28/2018] [Accepted: 07/08/2018] [Indexed: 12/17/2022]
Abstract
The regulation of gene expression is mediated via the complex three-dimensional (3D) conformation of the genetic material and its interactions with various intracellular factors. Various experimental and computational approaches have been developed in recent years for understanding the relation between the 3D conformation of the genome and the phenotypes of cells in normal conditions and in disease. In this review, we will discuss novel approaches for analyzing and modeling the 3D genomic conformation, focusing on deciphering disease-causing mutations that affect gene expression. We conclude that as this is a very challenging mission, an important direction should involve the comparative analysis of various 3D models from various organisms or cells.
Affiliation(s)
- Alon Diament
- Dept. of Biomedical Engineering, Tel Aviv University, Tel Aviv 6997801, Israel
- Tamir Tuller
- Dept. of Biomedical Engineering, Tel Aviv University, Tel Aviv 6997801, Israel; The Sagol School of Neuroscience, Tel-Aviv University, Tel Aviv 6997801, Israel.
48
Mirończuk AM, Biegalska A, Zugaj K, Rzechonek DA, Dobrowolski A. A Role of a Newly Identified Isomerase From Yarrowia lipolytica in Erythritol Catabolism. Front Microbiol 2018; 9:1122. [PMID: 29910781 PMCID: PMC5992420 DOI: 10.3389/fmicb.2018.01122] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 01/19/2018] [Accepted: 05/11/2018] [Indexed: 11/13/2022]
Abstract
Erythritol is a natural sweetener produced by microorganisms as an osmoprotectant. It belongs to the group of polyols and it can be utilized by the oleaginous yeast Yarrowia lipolytica. Despite the recent identification of the transcription factor of erythritol utilization (EUF1), the metabolic pathway of erythritol catabolism remains unknown. In this study we identified a new gene, YALI0F01628g, involved in erythritol assimilation. In silico analysis showed that YALI0F01628g is a putative isomerase and it is localized in the same region as EUF1. qRT-PCR analysis of Y. lipolytica showed a significant increase in YALI0F01628g expression during growth on erythritol and after overexpression of EUF1. Moreover, the deletion strain ΔF01628 showed significantly impaired erythritol assimilation, whereas synthesis of erythritol remained unchanged. The results showed that YALI0F01628g is involved in erythritol assimilation; thus we named the gene EYI1. Moreover, we propose a metabolic pathway for erythritol assimilation in the yeast Y. lipolytica.
Affiliation(s)
- Aleksandra M. Mirończuk
- Department of Biotechnology and Food Microbiology, Wrocław University of Environmental and Life Sciences, Wrocław, Poland
49
Hansen BO, Meyer EH, Ferrari C, Vaid N, Movahedi S, Vandepoele K, Nikoloski Z, Mutwil M. Ensemble gene function prediction database reveals genes important for complex I formation in Arabidopsis thaliana. THE NEW PHYTOLOGIST 2018; 217:1521-1534. [PMID: 29205376 DOI: 10.1111/nph.14921] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Received: 09/08/2017] [Accepted: 10/24/2017] [Indexed: 05/25/2023]
Abstract
Recent advances in gene function prediction rely on ensemble approaches that integrate results from multiple inference methods to produce superior predictions. Yet, these developments remain largely unexplored in plants. We have explored and compared two methods to integrate 10 gene co-function networks for Arabidopsis thaliana and demonstrate how the integration of these networks produces more accurate gene function predictions for a larger fraction of genes with unknown function. These predictions were used to identify genes involved in mitochondrial complex I formation, and for five of them, we confirmed the predictions experimentally. The ensemble predictions are provided as a user-friendly online database, EnsembleNet. The methods presented here demonstrate that ensemble gene function prediction is a powerful method to boost prediction performance, whereas the EnsembleNet database provides a cutting-edge community tool to guide experimentalists.
Affiliation(s)
- Bjoern Oest Hansen
- Max Planck Institute of Molecular Plant Physiology, Am Muehlenberg 1, Potsdam, 14476, Germany
- Institut für Medizinische Informatik, Universitätsmedizin Göttingen, Robert-Koch-Str. 40, Göttingen, 37075, Germany
- Etienne H Meyer
- Max Planck Institute of Molecular Plant Physiology, Am Muehlenberg 1, Potsdam, 14476, Germany
- Camilla Ferrari
- Max Planck Institute of Molecular Plant Physiology, Am Muehlenberg 1, Potsdam, 14476, Germany
- Neha Vaid
- Max Planck Institute of Molecular Plant Physiology, Am Muehlenberg 1, Potsdam, 14476, Germany
- Sara Movahedi
- Department of Plant Biotechnology and Bioinformatics, VIB Center for Plant Systems Biology, Ghent University, Technologiepark 927, Gent, B-9052, Belgium
- Rijk Zwaan Breeding BV, Burgemeester Crezéelaan 40, PO Box 40, De Lier, 2678 ZG, the Netherlands
- Klaas Vandepoele
- Department of Plant Biotechnology and Bioinformatics, VIB Center for Plant Systems Biology, Ghent University, Technologiepark 927, Gent, B-9052, Belgium
- Zoran Nikoloski
- Max Planck Institute of Molecular Plant Physiology, Am Muehlenberg 1, Potsdam, 14476, Germany
- Bioinformatics Group, Institute of Biochemistry and Biology, University of Potsdam, Karl-Liebknecht-Str. 24-25, Potsdam-Golm, 14476, Germany
- Marek Mutwil
- Max Planck Institute of Molecular Plant Physiology, Am Muehlenberg 1, Potsdam, 14476, Germany
- School of Biological Sciences, Nanyang Technological University, 60 Nanyang Drive, Singapore, 637551, Singapore
50
Zhou Q, Han D, Mason AS, Zhou C, Zheng W, Li Y, Wu C, Fu D, Huang Y. Earliness traits in rapeseed (Brassica napus): SNP loci and candidate genes identified by genome-wide association analysis. DNA Res 2017; 25:229-244. [PMID: 29236947 PMCID: PMC6014513 DOI: 10.1093/dnares/dsx052] [Citation(s) in RCA: 34] [Impact Index Per Article: 4.9] [Received: 06/05/2017] [Accepted: 11/14/2017] [Indexed: 11/29/2022]
Abstract
Life cycle timing is critical for yield and productivity of Brassica napus (rapeseed) cultivars grown in different environments. To facilitate breeding for earliness traits in rapeseed, SNP loci and underlying candidate genes associated with the timing of initial flowering, maturity and final flowering, as well as flowering period (FP) were investigated in two environments in a diversity panel comprising 300 B. napus inbred lines. Genome-wide association studies (GWAS) using 201,817 SNP markers previously developed from SLAF-seq (specific locus amplified fragment sequencing) revealed a total of 131 SNPs strongly linked (P < 4.96E-07) to the investigated traits. Of these 131 SNPs, 40 fell into confidence intervals or were physically adjacent to previously published flowering time QTL or SNPs. Phenotypic effect analysis detected 35 elite allelic variants for early maturing, and 90 for long FP. Candidate genes present in the same linkage disequilibrium blocks (r² > 0.6) or in 100 kb regions around significant trait-associated SNPs were screened, revealing 57 B. napus genes (33 SNPs) orthologous to 39 Arabidopsis thaliana flowering time genes. These results support the practical and scientific value of novel large-scale SNP data generation in uncovering the genetic control of agronomic traits in B. napus, and also provide a theoretical basis for molecular marker-assisted selection of earliness breeding in rapeseed.
Affiliation(s)
- Qinghong Zhou
- Key Laboratory of Crop Physiology, Ecology and Genetic Breeding, Ministry of Education, Agronomy College, Jiangxi Agricultural University, Nanchang 330045, China
- Depeng Han
- Key Laboratory of Crop Physiology, Ecology and Genetic Breeding, Ministry of Education, Agronomy College, Jiangxi Agricultural University, Nanchang 330045, China
- Annaliese S Mason
- Plant Breeding Department, IFZ Research Centre for Biosystems, Land Use and Nutrition, Justus Liebig University, Giessen 35392, Germany
- Can Zhou
- Key Laboratory of Crop Physiology, Ecology and Genetic Breeding, Ministry of Education, Agronomy College, Jiangxi Agricultural University, Nanchang 330045, China
- Wei Zheng
- Jiangxi Institute of Red Soil, Jinxian, 331717, China
- Yazhen Li
- Jiangxi Institute of Red Soil, Jinxian, 331717, China
- Caijun Wu
- Key Laboratory of Crop Physiology, Ecology and Genetic Breeding, Ministry of Education, Agronomy College, Jiangxi Agricultural University, Nanchang 330045, China
- Donghui Fu
- Key Laboratory of Crop Physiology, Ecology and Genetic Breeding, Ministry of Education, Agronomy College, Jiangxi Agricultural University, Nanchang 330045, China
- Yingjin Huang
- Key Laboratory of Crop Physiology, Ecology and Genetic Breeding, Ministry of Education, Agronomy College, Jiangxi Agricultural University, Nanchang 330045, China