1
Lakshman AH, Wright ES. EvoWeaver: large-scale prediction of gene functional associations from coevolutionary signals. Nat Commun 2025; 16:3878. [PMID: 40274827 PMCID: PMC12022180 DOI: 10.1038/s41467-025-59175-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/05/2025] [Accepted: 04/09/2025] [Indexed: 04/26/2025] Open
Abstract
The known universe of uncharacterized proteins is expanding far faster than our ability to annotate their functions through laboratory study. Computational annotation approaches rely on similarity to previously studied proteins, thereby ignoring unstudied proteins. Coevolutionary approaches hold promise for injecting new information into our knowledge of the protein universe by linking proteins through 'guilt-by-association'. However, existing coevolutionary algorithms have insufficient accuracy and scalability to connect the entire universe of proteins. We present EvoWeaver, a method that weaves together 12 signals of coevolution to quantify the degree of shared evolution between genes. EvoWeaver accurately identifies proteins involved in protein complexes or separate steps of a biochemical pathway. We show the merits of EvoWeaver by partly reconstructing known biochemical pathways without any prior knowledge other than that available from genomic sequences. Applying EvoWeaver to 1545 gene groups from 8564 genomes reveals missing connections in popular databases and potentially undiscovered links between proteins.
Affiliation(s)
- Aidan H Lakshman
- Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA, USA
- Erik S Wright
- Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA, USA.
- Center for Evolutionary Biology and Medicine, Pittsburgh, PA, USA.
2
Singh G, Pasinato A, Yriarte ALC, Pizarro D, Divakar PK, Schmitt I, Dal Grande F. Are there conserved biosynthetic genes in lichens? Genome-wide assessment of terpene biosynthetic genes suggests ubiquitous distribution of the squalene synthase cluster. BMC Genomics 2024; 25:936. [PMID: 39375591 PMCID: PMC11457338 DOI: 10.1186/s12864-024-10806-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/18/2024] [Accepted: 09/17/2024] [Indexed: 10/09/2024] Open
Abstract
Lichen-forming fungi (LFF) are prolific producers of functionally and structurally diverse secondary metabolites, most of which are taxonomically exclusive and play lineage-specific roles. To date, widely distributed, evolutionarily conserved biosynthetic pathways in LFF are not known. However, this idea stems from polyketide derivatives, since most biochemical research on lichens has concentrated on polyketide synthases (PKSs). Here, we present the first systematic identification and comparison of terpene biosynthetic genes of LFF using all the available Lecanoromycete reference genomes and 22 de novo sequenced ones (111 in total, representing 60 genera and 23 families). We implemented genome mining and gene networking approaches to identify and group the biosynthetic gene clusters (BGCs) into networks of similar BGCs. Our large-scale analysis led to the identification of 724 terpene BGCs with varying degrees of pairwise similarity. Most BGCs in the dataset were unique with no similarity to a previously known fungal or bacterial BGC or among each other. Remarkably, we found two BGCs that were widely distributed in LFF. Interestingly, both conserved BGCs contain the same core gene, i.e., putatively a squalene/phytoene synthase (SQS), involved in sterol biosynthesis. This indicates that early gene duplications, followed by gene losses/gains and gene rearrangement are the major evolutionary factors shaping the composition of these widely distributed SQS BGCs across LFF. We provide an in-depth overview of these BGCs, including the transmembrane, conserved, variable and LFF-specific regions. Our study revealed that lichenized fungi do have a highly conserved BGC, providing the first evidence that a biosynthetic gene may constitute essential genes in lichens.
Affiliation(s)
- Garima Singh
- Department of Biology, University of Padova, Via U. Bassi, 58/B, 35121, Padua, Italy.
- Botanical Garden of Padova, University of Padova, Padua, Italy.
- Anna Pasinato
- Department of Biology, University of Padova, Via U. Bassi, 58/B, 35121, Padua, Italy
- David Pizarro
- Department of Pharmacology, Pharmacognosy and Botany, Faculty of Pharmacy, Complutense University of Madrid (UCM), Madrid, 28040, Spain
- Pradeep K Divakar
- Department of Pharmacology, Pharmacognosy and Botany, Faculty of Pharmacy, Complutense University of Madrid (UCM), Madrid, 28040, Spain
- Imke Schmitt
- Senckenberg Biodiversity and Climate Research Centre (SBiK-F), Frankfurt Am Main, 60325, Germany
- Department of Biosciences, Institute of Ecology Evolution and Diversity, Goethe University Frankfurt, Max-Von-Laue-Str. 13, Frankfurt am Main, 60438, Germany
- LOEWE Center for Translational Biodiversity Genomics (TBG), Frankfurt Am Main, 60325, Germany
- Francesco Dal Grande
- Department of Biology, University of Padova, Via U. Bassi, 58/B, 35121, Padua, Italy
- Botanical Garden of Padova, University of Padova, Padua, Italy
3
Azevedo LG, Sosa E, de Queiroz ATL, Barral A, Wheeler RJ, Nicolás MF, Farias LP, Do Porto DF, Ramos PIP. High-throughput prioritization of target proteins for development of new antileishmanial compounds. Int J Parasitol Drugs Drug Resist 2024; 25:100538. [PMID: 38669848 PMCID: PMC11068527 DOI: 10.1016/j.ijpddr.2024.100538] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/18/2023] [Revised: 03/11/2024] [Accepted: 04/04/2024] [Indexed: 04/28/2024]
Abstract
Leishmaniasis, a vector-borne disease, is caused by infection with Leishmania spp., obligate intracellular protozoan parasites. Presently, human vaccines are unavailable, and the primary treatment relies heavily on systemic drugs, often presenting with suboptimal formulations and substantial toxicity, making new drugs a high priority for low- and middle-income countries (LMICs) burdened by the disease, but a low priority in the agenda of most pharmaceutical companies due to unattractive profit margins. New ways to accelerate the discovery of new drugs, or the repositioning of existing ones, are needed. To address this challenge, our study aimed to identify potential protein targets shared among clinically-relevant Leishmania species. We employed a subtractive proteomics and comparative genomics approach, integrating high-throughput multi-omics data to classify these targets based on different druggability metrics. This effort resulted in the ranking of 6502 ortholog groups of protein targets across 14 pathogenic Leishmania species. Among the top 20 highly ranked groups, metabolic processes known to be attractive drug targets, including the ubiquitination pathway, aminoacyl-tRNA synthetases, and purine synthesis, were rediscovered. Additionally, we unveiled novel promising targets such as the nicotinate phosphoribosyltransferase enzyme and dihydrolipoamide succinyltransferases. These groups exhibited appealing druggability features, including less than 40% sequence identity to the human host proteome, predicted essentiality, structural classification as highly druggable or druggable, and expression levels above the 50th percentile in the amastigote form. The resources presented in this work also represent a comprehensive collection of integrated data regarding trypanosomatid biology.
Affiliation(s)
- Lucas G Azevedo
- Center for Data and Knowledge Integration for Health (CIDACS), Instituto Gonçalo Moniz, Fundação Oswaldo Cruz (Fiocruz Bahia), Salvador, Bahia, Brazil; Post-graduate Program in Biotechnology and Investigative Medicine, Instituto Gonçalo Moniz, Salvador, Bahia, Brazil.
- Ezequiel Sosa
- Universidad de Buenos Aires, Buenos Aires, Argentina.
- Artur T L de Queiroz
- Center for Data and Knowledge Integration for Health (CIDACS), Instituto Gonçalo Moniz, Fundação Oswaldo Cruz (Fiocruz Bahia), Salvador, Bahia, Brazil; Post-graduate Program in Biotechnology and Investigative Medicine, Instituto Gonçalo Moniz, Salvador, Bahia, Brazil.
- Aldina Barral
- Laboratório de Medicina e Saúde Pública de Precisão (MeSP2), Instituto Gonçalo Moniz, Fundação Oswaldo Cruz (Fiocruz Bahia), Salvador, Bahia, Brazil.
- Richard J Wheeler
- Peter Medawar Building for Pathogen Research, Nuffield Department of Medicine, University of Oxford, Oxford, United Kingdom.
- Marisa F Nicolás
- Laboratório Nacional de Computação Científica, Petrópolis, Rio de Janeiro, Brazil.
- Leonardo P Farias
- Post-graduate Program in Biotechnology and Investigative Medicine, Instituto Gonçalo Moniz, Salvador, Bahia, Brazil; Laboratório de Medicina e Saúde Pública de Precisão (MeSP2), Instituto Gonçalo Moniz, Fundação Oswaldo Cruz (Fiocruz Bahia), Salvador, Bahia, Brazil.
- Pablo Ivan P Ramos
- Center for Data and Knowledge Integration for Health (CIDACS), Instituto Gonçalo Moniz, Fundação Oswaldo Cruz (Fiocruz Bahia), Salvador, Bahia, Brazil; Post-graduate Program in Biotechnology and Investigative Medicine, Instituto Gonçalo Moniz, Salvador, Bahia, Brazil.
4
Elhabashy H, Merino F, Alva V, Kohlbacher O, Lupas AN. Exploring protein-protein interactions at the proteome level. Structure 2022; 30:462-475. [DOI: 10.1016/j.str.2022.02.004] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/09/2021] [Revised: 10/26/2021] [Accepted: 02/02/2022] [Indexed: 02/08/2023]
5
Genome Comparisons of the Fission Yeasts Reveal Ancient Collinear Loci Maintained by Natural Selection. J Fungi (Basel) 2021; 7:jof7100864. [PMID: 34682285 PMCID: PMC8537764 DOI: 10.3390/jof7100864] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/10/2021] [Revised: 10/06/2021] [Accepted: 10/12/2021] [Indexed: 11/30/2022] Open
Abstract
Fission yeasts have a unique life history and exhibit distinct evolutionary patterns from other yeasts. In addition, the species show stable genome structures despite the relatively fast evolution of their genomic sequences. To investigate the reason for this, comparative genomic analyses were carried out. Our results provided evidence that the structural and sequence evolution of the fission yeasts were correlated. Moreover, we revealed ancestral locally collinear blocks (aLCBs), which could have been inherited from their last common ancestor. These aLCBs proved to be the most conserved regions of the genomes as the aLCBs contain almost eight genes/blocks on average in the same orientation and order across the species. Gene order of the aLCBs is mainly fission-yeast-specific but supports the idea of filamentous ancestors. Nevertheless, the sequences and gene structures within the aLCBs are as mutable as any sequences in other parts of the genomes. Although genes of certain Gene Ontology (GO) categories tend to cluster at the aLCBs, those GO enrichments are not related to biological functions or high co-expression rates; rather, they are determined by the density of essential genes and Rec12 cleavage sites. These data and our simulations indicated that aLCBs might not only be remnants of ancestral gene order but are also maintained by natural selection.
6
Wright BW, Ruan J, Molloy MP, Jaschke PR. Genome Modularization Reveals Overlapped Gene Topology Is Necessary for Efficient Viral Reproduction. ACS Synth Biol 2020; 9:3079-3090. [PMID: 33044064 DOI: 10.1021/acssynbio.0c00323] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Indexed: 02/08/2023]
Abstract
Sequence overlap between two genes is common across all genomes, with viruses having high proportions of these gene overlaps. Genome modularization and refactoring is the process of disrupting natural gene overlaps to separate coding sequences to enable their individual manipulation. The biological function and fitness effects of gene overlaps are not fully understood, and their effects on gene cluster and genome-level refactoring are unknown. The bacteriophage φX174 genome has ∼26% of nucleotides involved in encoding more than one gene. In this study we use an engineered φX174 phage containing a genome with all gene overlaps removed to show that gene overlap is critical to maintaining optimal viral fecundity. Through detailed phenotypic measurements we reveal that genome modularization in φX174 causes virion replication, stability, and attachment deficiencies. Quantitation of the complete phage proteome across an infection cycle reveals 30% of proteins display abnormal expression patterns. Taken together, we have for the first time comprehensively demonstrated that gene modularization severely perturbs the coordinated functioning of a bacteriophage replication cycle. This work highlights the biological importance of gene overlap in natural genomes and that reducing gene overlap disruption should be an integral part of future genome engineering projects.
Affiliation(s)
- Bradley W. Wright
- Department of Molecular Sciences, Macquarie University, Sydney, NSW 2109, Australia
- Juanfang Ruan
- Electron Microscope Unit, Mark Wainwright Analytical Centre, The University of New South Wales, Sydney, NSW 2052, Australia
- School of Biotechnology and Biomolecular Sciences, The University of New South Wales, Sydney, NSW 2052, Australia
- Mark P. Molloy
- Kolling Institute, Northern Clinical School, The University of Sydney, Sydney, NSW 2006, Australia
- Paul R. Jaschke
- Department of Molecular Sciences, Macquarie University, Sydney, NSW 2109, Australia
7
Marcet-Houben M, Gabaldón T. EvolClust: automated inference of evolutionary conserved gene clusters in eukaryotes. Bioinformatics 2020; 36:1265-1266. [PMID: 31560365 PMCID: PMC7703780 DOI: 10.1093/bioinformatics/btz706] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 03/13/2019] [Revised: 08/30/2019] [Accepted: 09/25/2019] [Indexed: 11/14/2022] Open
Abstract
Motivation: The evolution and role of gene clusters in eukaryotes are poorly understood. Currently, most studies and computational prediction programs limit their focus to specific types of clusters, such as those involved in secondary metabolism. Results: We present EvolClust, a Python-based tool for the inference of evolutionarily conserved gene clusters from genome comparisons, independently of the function or gene composition of the cluster. EvolClust predicts conserved gene clusters from pairwise genome comparisons and infers families of related clusters from multiple (all versus all) genome comparisons. Availability and implementation: https://github.com/Gabaldonlab/EvolClust/. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Marina Marcet-Houben
- Centre for Genomic Regulation (CRG), Bioinformatics and Genomics department, The Barcelona Institute of Science and Technology, Barcelona 08003, Spain; Health and Experimental Sciences Department, Universitat Pompeu Fabra (UPF), Barcelona 08003, Spain
- Toni Gabaldón
- Centre for Genomic Regulation (CRG), Bioinformatics and Genomics department, The Barcelona Institute of Science and Technology, Barcelona 08003, Spain; Health and Experimental Sciences Department, Universitat Pompeu Fabra (UPF), Barcelona 08003, Spain; ICREA, Barcelona 08010, Spain
8
Seligmann H. Syntenies Between Cohosted Mitochondrial, Chloroplast, and Phycodnavirus Genomes: Functional Mimicry and/or Common Ancestry? DNA Cell Biol 2019; 38:1257-1268. [DOI: 10.1089/dna.2019.4858] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Indexed: 11/13/2022] Open
Affiliation(s)
- Hervé Seligmann
- The National Natural History Collections, The Hebrew University of Jerusalem, Jerusalem, Israel
9
Marcet-Houben M, Gabaldón T. Evolutionary and functional patterns of shared gene neighbourhood in fungi. Nat Microbiol 2019; 4:2383-2392. [PMID: 31527797 DOI: 10.1038/s41564-019-0552-0] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Received: 12/13/2018] [Accepted: 07/29/2019] [Indexed: 11/09/2022]
Abstract
Gene clusters comprise genomically co-localized and potentially co-regulated genes that tend to be conserved across species. In eukaryotes, multiple examples of metabolic gene clusters are known, particularly among fungi and plants. However, little is known about how gene clustering patterns vary among taxa or with respect to functional roles. Furthermore, mechanisms of the formation, maintenance and evolution of gene clusters remain unknown. We surveyed 341 fungal genomes to discover gene clusters shared by different species, independently of their functions. We inferred 12,120 cluster families, which comprised roughly one third of the gene space and were enriched in genes associated with diverse cellular functions. Additionally, most clusters did not encode transcription factors, suggesting that they are regulated distally. We used phylogenomics to characterize the evolutionary history of these clusters. We found that most clusters originated once and were transmitted vertically, coupled to differential loss. However, convergent evolution (that is, independent appearance of the same cluster) was more prevalent than anticipated. Finally, horizontal gene transfer of entire clusters was somewhat restricted, with the exception of those associated with secondary metabolism. Altogether, our results provide insights on the evolution of gene clustering as well as a broad catalogue of evolutionarily conserved gene clusters whose function remains to be elucidated.
Affiliation(s)
- Marina Marcet-Houben
- Centre for Genomic Regulation, The Barcelona Institute of Science and Technology, Barcelona, Spain; Universitat Pompeu Fabra, Barcelona, Spain; Barcelona Supercomputing Centre (BSC-CNS), Institute for Research in Biomedicine (IRB), Barcelona, Spain
- Toni Gabaldón
- Centre for Genomic Regulation, The Barcelona Institute of Science and Technology, Barcelona, Spain; Universitat Pompeu Fabra, Barcelona, Spain; ICREA, Barcelona, Spain; Barcelona Supercomputing Centre (BSC-CNS), Institute for Research in Biomedicine (IRB), Barcelona, Spain
10
Hollenbeck CM, Portnoy DS, Gold JR. Evolution of population structure in an estuarine-dependent marine fish. Ecol Evol 2019; 9:3141-3152. [PMID: 30962887 PMCID: PMC6434539 DOI: 10.1002/ece3.4936] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 08/23/2018] [Revised: 12/19/2018] [Accepted: 01/07/2019] [Indexed: 01/06/2023] Open
Abstract
Restriction site-associated DNA (RAD) sequencing was used to characterize neutral and adaptive genetic variation among geographic samples of red drum, Sciaenops ocellatus, an estuarine-dependent fish found in coastal waters along the southeastern coast of the United States (Atlantic) and the northern Gulf of Mexico (Gulf). Analyses of neutral and outlier loci revealed three genetically distinct regional clusters: one in the Atlantic and two in the northern Gulf. Divergence in neutral loci indicated gradual genetic change and followed a linear pattern of isolation by distance. Divergence in outlier loci was at least an order of magnitude greater than divergence in neutral loci, and divergence between the regions in the Gulf was twice that of divergence between other regions. Discordance in patterns of genetic divergence between outlier and neutral loci is consistent with the hypothesis that the former reflects adaptive responses to environmental factors that vary on regional scales, while the latter largely reflects drift processes. Differences in basic habitat, initiated by glacial retreat and perpetuated by contemporary oceanic and atmospheric forces interacting with the geomorphology of the northern Gulf, followed by selection, appear to have led to reduced gene flow among red drum across the northern Gulf, reinforcing differences accrued during isolation and resulting in continued divergence across the genome. This same dynamic also may pertain to other coastal or nearshore fishes (18 species in 14 families) where genetically or morphologically defined sister taxa occur in the three regions.
Affiliation(s)
- Christopher M. Hollenbeck
- Marine Genomics Laboratory, Department of Life Sciences, Texas A&M University - Corpus Christi, Corpus Christi, Texas
- Present address: Scottish Oceans Institute, University of St. Andrews, St. Andrews, Fife, UK
- David S. Portnoy
- Marine Genomics Laboratory, Department of Life Sciences, Texas A&M University - Corpus Christi, Corpus Christi, Texas
- John R. Gold
- Marine Genomics Laboratory, Department of Life Sciences, Texas A&M University - Corpus Christi, Corpus Christi, Texas
11
Phylogenetic, molecular evolution and structural analyses of the WFDC1/prostate stromal protein 20 (ps20). Gene 2019; 686:125-140. [DOI: 10.1016/j.gene.2018.10.046] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 08/13/2018] [Revised: 09/07/2018] [Accepted: 10/19/2018] [Indexed: 12/20/2022]
12
Thanki AS, Soranzo N, Herrero J, Haerty W, Davey RP. Aequatus: an open-source homology browser. Gigascience 2018; 7:5160135. [PMID: 30395211 PMCID: PMC6251984 DOI: 10.1093/gigascience/giy128] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 06/18/2018] [Revised: 09/06/2018] [Accepted: 10/17/2018] [Indexed: 11/18/2022] Open
Abstract
Background: Phylogenetic information inferred from the study of homologous genes helps us to understand the evolution of genes and gene families, including the identification of ancestral gene duplication events as well as regions under positive or purifying selection within lineages. Gene family and orthogroup characterization enables the identification of syntenic blocks, which can then be visualized with various tools. Unfortunately, currently available tools display only an overview of syntenic regions as a whole, limited to the gene level, and none provide further details about structural changes within genes, such as the conservation of ancestral exon boundaries amongst multiple genomes. Findings: We present Aequatus, an open-source web-based tool that provides an in-depth view of gene structure across gene families, with various options to render and filter visualizations. It relies on precalculated alignment and gene feature information typically held in, but not limited to, the Ensembl Compara and Core databases. We also offer Aequatus.js, a reusable JavaScript module that fulfills the visualization aspects of Aequatus, available within the Galaxy web platform as a visualization plug-in, which can be used to visualize gene trees generated by the GeneSeqToFamily workflow.
Affiliation(s)
- Anil S Thanki
- Earlham Institute, Norwich Research Park, Norwich, NR4 7UZ, UK
- Nicola Soranzo
- Earlham Institute, Norwich Research Park, Norwich, NR4 7UZ, UK
- Javier Herrero
- Earlham Institute, Norwich Research Park, Norwich, NR4 7UZ, UK
- Bill Lyons Informatics Centre, UCL Cancer Institute, 72 Huntley St., London, WC1E 6DD, UK
- Wilfried Haerty
- Earlham Institute, Norwich Research Park, Norwich, NR4 7UZ, UK
- Robert P Davey
- Earlham Institute, Norwich Research Park, Norwich, NR4 7UZ, UK
13
Abstract
The billions of proteins inside a eukaryotic cell are organized among dozens of sub-cellular compartments, within which they are further organized into protein complexes. The maintenance of both levels of organization is crucial for normal cellular function. Newly made proteins that fail to be segregated to the correct compartment or assembled into the appropriate complex are defined as orphans. In this review, we discuss the challenges faced by a cell of minimizing orphaned proteins, the quality control systems that recognize orphans, and the consequences of excess orphans for protein homeostasis and disease.
14
Cavalcante MG, Bastos CEMC, Nagamachi CY, Pieczarka JC, Vicari MR, Noronha RCR. Physical mapping of repetitive DNA suggests 2n reduction in Amazon turtles Podocnemis (Testudines: Podocnemididae). PLoS One 2018; 13:e0197536. [PMID: 29813087 PMCID: PMC5973585 DOI: 10.1371/journal.pone.0197536] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Received: 01/29/2018] [Accepted: 05/03/2018] [Indexed: 01/27/2023] Open
Abstract
Cytogenetic studies show that there is great karyotypic diversity in order Testudines (2n = 26-68), and that this may be mainly attributed to the presence/absence of microchromosomes. Members of the Podocnemididae family have the smallest diploid numbers of this order (2n = 26-28), which may be a derived condition of the group. Diverse studies suggest that repetitive-DNA-rich sites generally act as hotspots for double-strand breaks and chromosomal reorganization. In this context, we used fluorescent in situ hybridization (FISH) to map telomeric sequences (TTAGGG)n, 45S rDNA, and the genes encoding histones H1 and H3 in two species of genus Podocnemis. We also observed conservation of the 45S rDNA and H1 histone sequences (probable case of conserved synteny), but multiple conserved and non-conserved clusters of H3 genes, which colocalized with the interstitial telomeric sequences in the Podocnemis genome. Our results suggest that fusions have occurred between macro and microchromosomes or between microchromosomes, leading to the observed reduction in diploid number in the family Podocnemididae.
Affiliation(s)
- Manoella Gemaque Cavalcante
- Centro de Estudos Avançados da Biodiversidade, Laboratório de Citogenética, Instituto de Ciências Biológicas, Universidade Federal do Pará, Belém, Pará, Brasil
- Carlos Eduardo Matos Carvalho Bastos
- Centro de Estudos Avançados da Biodiversidade, Laboratório de Citogenética, Instituto de Ciências Biológicas, Universidade Federal do Pará, Belém, Pará, Brasil
- Cleusa Yoshiko Nagamachi
- Centro de Estudos Avançados da Biodiversidade, Laboratório de Citogenética, Instituto de Ciências Biológicas, Universidade Federal do Pará, Belém, Pará, Brasil
- Julio Cesar Pieczarka
- Centro de Estudos Avançados da Biodiversidade, Laboratório de Citogenética, Instituto de Ciências Biológicas, Universidade Federal do Pará, Belém, Pará, Brasil
- Marcelo Ricardo Vicari
- Departamento de Biologia Estrutural, Molecular e Genética, Universidade Estadual de Ponta Grossa, Ponta Grossa, Paraná, Brasil
- Renata Coelho Rodrigues Noronha
- Centro de Estudos Avançados da Biodiversidade, Laboratório de Citogenética, Instituto de Ciências Biológicas, Universidade Federal do Pará, Belém, Pará, Brasil
15
Specialized plant biochemistry drives gene clustering in fungi. ISME JOURNAL 2018; 12:1694-1705. [PMID: 29463891 DOI: 10.1038/s41396-018-0075-3] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Received: 09/14/2017] [Revised: 01/18/2018] [Accepted: 01/26/2018] [Indexed: 01/31/2023]
Abstract
The fitness and evolution of prokaryotes and eukaryotes are affected by the organization of their genomes. In particular, the physical clustering of genes can coordinate gene expression and can prevent the breakup of co-adapted alleles. Although clustering may thus result from selection for phenotype optimization and persistence, the impact of environmental selection pressures on eukaryotic genome organization has rarely been systematically explored. Here, we investigated the organization of fungal genes involved in the degradation of phenylpropanoids, a class of plant-produced secondary metabolites that mediate many ecological interactions between plants and fungi. Using a novel gene cluster detection method, we identified 1110 gene clusters and many conserved combinations of clusters in a diverse set of fungi. We demonstrate that congruence in genome organization over small spatial scales is often associated with similarities in ecological lifestyle. Additionally, we find that while clusters are often structured as independent modules with little overlap in content, certain gene families merge multiple modules into a common network, suggesting they are important components of phenylpropanoid degradation strategies. Together, our results suggest that phenylpropanoids have repeatedly selected for gene clustering in fungi, and highlight the interplay between genome organization and ecological evolution in this ancient eukaryotic lineage.
16
Moretti AIS, Pavanelli JC, Nolasco P, Leisegang MS, Tanaka LY, Fernandes CG, Wosniak J, Kajihara D, Dias MH, Fernandes DC, Jo H, Tran NV, Ebersberger I, Brandes RP, Bonatto D, Laurindo FRM. Conserved Gene Microsynteny Unveils Functional Interaction Between Protein Disulfide Isomerase and Rho Guanine-Dissociation Inhibitor Families. Sci Rep 2017; 7:17262. [PMID: 29222525 PMCID: PMC5722932 DOI: 10.1038/s41598-017-16947-5] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.9] [Received: 09/15/2016] [Accepted: 11/21/2017] [Indexed: 02/07/2023] Open
Abstract
Protein disulfide isomerases (PDIs) support endoplasmic reticulum redox protein folding and cell-surface thiol-redox control of thrombosis and vascular remodeling. The family prototype PDIA1 regulates NADPH oxidase signaling and cytoskeleton organization; however, the related underlying mechanisms are unclear. Here we show that genes encoding human PDIA1 and its two paralogs PDIA8 and PDIA2 are each flanked by genes encoding Rho guanine-dissociation inhibitors (GDI), known regulators of RhoGTPases/cytoskeleton. Evolutionary histories of these three microsyntenic regions reveal their emergence by two successive duplication events of a primordial gene pair in the last common vertebrate ancestor. The arrangement, however, is substantially older, detectable in echinoderms, nematodes, and cnidarians. Thus, PDI/RhoGDI pairing in the same transcription orientation emerged early in animal evolution and has been largely maintained. PDI/RhoGDI pairs are embedded into conserved genomic regions displaying common cis-regulatory elements. Analysis of gene expression datasets supports evidence for PDI/RhoGDI coexpression in developmental/inflammatory contexts. PDIA1/RhoGDIα were co-induced in endothelial cells upon CRISPR-promoted transcription activation of each pair component, and also in mouse arterial intima during flow-induced remodeling. We provide evidence for physical interaction between both proteins. These data support strong functional links between PDI and RhoGDI families, which likely maintained PDI/RhoGDI microsynteny over more than 800 million years of evolution.
Affiliation(s)
- Ana I S Moretti
- Vascular Biology Laboratory, Heart Institute (Incor), University of São Paulo School of Medicine, São Paulo, Brazil
| | - Jessyca C Pavanelli
- Vascular Biology Laboratory, Heart Institute (Incor), University of São Paulo School of Medicine, São Paulo, Brazil
| | - Patrícia Nolasco
- Vascular Biology Laboratory, Heart Institute (Incor), University of São Paulo School of Medicine, São Paulo, Brazil
| | | | - Leonardo Y Tanaka
- Vascular Biology Laboratory, Heart Institute (Incor), University of São Paulo School of Medicine, São Paulo, Brazil
| | - Carolina G Fernandes
- Vascular Biology Laboratory, Heart Institute (Incor), University of São Paulo School of Medicine, São Paulo, Brazil
| | - João Wosniak
- Vascular Biology Laboratory, Heart Institute (Incor), University of São Paulo School of Medicine, São Paulo, Brazil
| | - Daniela Kajihara
- Vascular Biology Laboratory, Heart Institute (Incor), University of São Paulo School of Medicine, São Paulo, Brazil
| | - Matheus H Dias
- Special Laboratory for Cell Cycle, Center of Toxins, Immune-Response and Cell Signaling - CeTICS-Cepid, Butantan Institute, São Paulo, Brazil
| | - Denise C Fernandes
- Vascular Biology Laboratory, Heart Institute (Incor), University of São Paulo School of Medicine, São Paulo, Brazil
| | - Hanjoong Jo
- The Wallace H. Coulter Department of Biomedical Engineering, Georgia Institute of Technology and Emory University, Atlanta, USA
| | - Ngoc-Vinh Tran
- Applied Bioinformatics Group, Institute of Cell Biology & Neuroscience, Goethe University, Frankfurt, Germany
| | - Ingo Ebersberger
- Applied Bioinformatics Group, Institute of Cell Biology & Neuroscience, Goethe University, Frankfurt, Germany
- Senckenberg Biodiversity and Climate Research Center (BiK-F), Frankfurt, Germany
| | - Ralf P Brandes
- Institut für Kardiovaskuläre Physiologie, Goethe University, Frankfurt, Germany
| | - Diego Bonatto
- Department of Molecular Biology and Biotechnology, Federal University of Rio Grande do Sul, Porto Alegre, Brazil
| | - Francisco R M Laurindo
- Vascular Biology Laboratory, Heart Institute (Incor), University of São Paulo School of Medicine, São Paulo, Brazil.
| |
|
17
|
Kirk IK, Weinhold N, Brunak S, Belling K. The impact of the protein interactome on the syntenic structure of mammalian genomes. PLoS One 2017; 12:e0179112. [PMID: 28910296 PMCID: PMC5598925 DOI: 10.1371/journal.pone.0179112] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 10/26/2016] [Accepted: 05/10/2017] [Indexed: 02/06/2023] Open
Abstract
Conserved synteny denotes evolutionarily preserved gene order across species. It is not well understood to what degree functional relationships between genes are preserved in syntenic blocks. Here we investigate whether protein-coding genes conserved in mammalian syntenic blocks encode gene products that serve the common functional purpose of interacting at protein level, i.e. connectivity. High connectivity among protein-protein interactions (PPIs) was only moderately associated with conserved synteny on a genome-wide scale. However, we observed a smaller subset of 3.6% of all syntenic blocks with high-confidence PPIs that had significantly higher connectivity than expected by chance. Additionally, syntenic blocks with high-confidence PPIs contained significantly more chromatin loops than the remaining blocks, indicating functional preservation among these syntenic blocks. Conserved synteny is typically defined by sequence similarity. In this study, we also examined whether a functional relationship, here PPI connectivity, can identify syntenic blocks independently of orthology. While orthology-based syntenic blocks with high-confidence PPIs and the connectivity-based syntenic blocks largely overlapped, the connectivity-based approach identified additional syntenic blocks that were not found by conventional sequence-based methods alone. Additionally, the connectivity-based approach enabled identification of potential orthologous genes between species. Our analyses demonstrate that subsets of syntenic blocks are associated with highly connected proteins, and that PPI connectivity can be used to detect conserved synteny even if sequence conservation drifts beyond what orthology algorithms normally can identify.
Affiliation(s)
- Isa Kristina Kirk
- Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
| | - Nils Weinhold
- Memorial Sloan Kettering Cancer Center, Computational Biology Program, New York, NY, United States of America
| | - Søren Brunak
- Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
| | - Kirstine Belling
- Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
- * E-mail:
| |
|
18
|
Pavy N, Lamothe M, Pelgas B, Gagnon F, Birol I, Bohlmann J, Mackay J, Isabel N, Bousquet J. A high-resolution reference genetic map positioning 8.8 K genes for the conifer white spruce: structural genomics implications and correspondence with physical distance. Plant J 2017; 90:189-203. [PMID: 28090692 DOI: 10.1111/tpj.13478] [Citation(s) in RCA: 32] [Impact Index Per Article: 4.0] [Received: 10/27/2016] [Revised: 12/23/2016] [Accepted: 01/03/2017] [Indexed: 05/21/2023]
Abstract
Over the last decade, extensive genetic and genomic resources have been developed for the conifer white spruce (Picea glauca, Pinaceae), which has one of the largest plant genomes (20 Gbp). Draft genome sequences of white spruce and other conifers have recently been produced, but dense genetic maps are needed to comprehend genome macrostructure, delineate regions involved in quantitative traits, complement functional genomic investigations, and assist the assembly of fragmented genomic sequences. A greatly expanded P. glauca composite linkage map was generated from a set of 1976 full-sib progeny, with the positioning of 8793 expressed genes. Regions with significantly low or high gene density were identified. Gene family members tended to be mapped on the same chromosomes, with tandemly arrayed genes significantly biased towards specific functional classes. The map was integrated with transcriptome data surveyed across eight tissues. In total, 69 clusters of co-expressed and co-localising genes were identified. A high level of synteny was found with pine genetic maps, which should facilitate the transfer of structural information in the Pinaceae. Although the current white spruce genome sequence remains highly fragmented, dozens of scaffolds encompassing more than one mapped gene were identified. From these, the relationship between genetic and physical distances was examined and the genome-wide recombination rate was found to be much smaller than most estimates reported for angiosperm genomes. This gene linkage map should assist the large-scale assembly of the next-generation white spruce genome sequence and provide a reference resource for the conifer genomics community.
Affiliation(s)
- Nathalie Pavy
- Canada Research Chair in Forest Genomics, Forest Research Centre and Institute for Systems and Integrative Biology, Université Laval, Québec, QC, G1V 0A6, Canada
| | - Manuel Lamothe
- Natural Resources Canada, Canadian Forest Service, Laurentian Forestry Centre, 1055 du P.E.P.S., P.O. Box 10380, Stn. Sainte-Foy, Québec, QC, G1V 4C7, Canada
| | - Betty Pelgas
- Canada Research Chair in Forest Genomics, Forest Research Centre and Institute for Systems and Integrative Biology, Université Laval, Québec, QC, G1V 0A6, Canada
- Natural Resources Canada, Canadian Forest Service, Laurentian Forestry Centre, 1055 du P.E.P.S., P.O. Box 10380, Stn. Sainte-Foy, Québec, QC, G1V 4C7, Canada
| | - France Gagnon
- Canada Research Chair in Forest Genomics, Forest Research Centre and Institute for Systems and Integrative Biology, Université Laval, Québec, QC, G1V 0A6, Canada
| | - Inanç Birol
- Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC, V5Z 4S6, Canada
| | - Joerg Bohlmann
- Michael Smith Laboratories, University of British Columbia, Vancouver, BC, V6T 1Z4, Canada
| | - John Mackay
- Canada Research Chair in Forest Genomics, Forest Research Centre and Institute for Systems and Integrative Biology, Université Laval, Québec, QC, G1V 0A6, Canada
- Department of Plant Sciences, University of Oxford, South Parks Road, Oxford, OX1 3RB, UK
| | - Nathalie Isabel
- Canada Research Chair in Forest Genomics, Forest Research Centre and Institute for Systems and Integrative Biology, Université Laval, Québec, QC, G1V 0A6, Canada
- Natural Resources Canada, Canadian Forest Service, Laurentian Forestry Centre, 1055 du P.E.P.S., P.O. Box 10380, Stn. Sainte-Foy, Québec, QC, G1V 4C7, Canada
| | - Jean Bousquet
- Canada Research Chair in Forest Genomics, Forest Research Centre and Institute for Systems and Integrative Biology, Université Laval, Québec, QC, G1V 0A6, Canada
| |
|
19
|
Sharma AK, Eils R, König R. Copy Number Alterations in Enzyme-Coding and Cancer-Causing Genes Reprogram Tumor Metabolism. Cancer Res 2016; 76:4058-67. [PMID: 27216182 DOI: 10.1158/0008-5472.can-15-2350] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.9] [Received: 09/08/2015] [Accepted: 05/11/2016] [Indexed: 11/16/2022]
Abstract
Somatic copy number alterations frequently occur in the cancer genome affecting not only oncogenic or tumor suppressive genes, but also passenger and potential codriver genes. An intrinsic feature resulting from such genomic perturbations is the deregulation in the metabolism of tumor cells. In this study, we have shown that metabolic and cancer-causing genes are unexpectedly often proximally positioned in the chromosome and share loci with coaltered copy numbers across multiple cancers (19 cancer types from The Cancer Genome Atlas). We have developed an analysis pipeline, Identification of Metabolic Cancer Genes (iMetCG), to infer the functional impact on metabolic remodeling from such coamplifications and codeletions and delineate genes driving cancer metabolism from those that are neutral. Using our identified metabolic genes, we were able to classify tumors based on their tissue and developmental origins. These metabolic genes were similar to known cancer genes in terms of their network connectivity, isoform frequency, and evolutionary features. We further validated these identified metabolic genes by (i) using gene essentiality data from several tumor cell lines, (ii) showing that these identified metabolic genes are strong indicators for patient survival, and (iii) observing a significant overlap between our identified metabolic genes and known cancer-metabolic genes. Our analyses revealed a hitherto unknown generic mechanism for large-scale metabolic reprogramming in cancer cells based on linear gene proximities between cancer-causing and -metabolic genes. We have identified 119 new metabolic cancer genes likely to be involved in rewiring cancer cell metabolism. Cancer Res; 76(14); 4058-67. ©2016 AACR.
Affiliation(s)
- Ashwini Kumar Sharma
- Network Modeling, Leibniz Institute for Natural Products Research and Infection Biology, Hans-Knöll-Institute, Jena, Germany. Division of Theoretical Bioinformatics, German Cancer Research Center (DKFZ), Heidelberg, Germany.
| | - Roland Eils
- Division of Theoretical Bioinformatics, German Cancer Research Center (DKFZ), Heidelberg, Germany. Institute for Pharmacy and Molecular Biotechnology (IPMB) and BioQuant, Heidelberg University, Heidelberg, Germany
| | - Rainer König
- Network Modeling, Leibniz Institute for Natural Products Research and Infection Biology, Hans-Knöll-Institute, Jena, Germany. Division of Theoretical Bioinformatics, German Cancer Research Center (DKFZ), Heidelberg, Germany. Integrated Research and Treatment Center, Center for Sepsis Control and Care (CSCC), Jena University Hospital, Jena, Germany.
| |
|
20
|
Siwiak M, Zielenkiewicz P. Co-regulation of translation in protein complexes. Biol Direct 2015; 10:18. [PMID: 25909184 PMCID: PMC4409705 DOI: 10.1186/s13062-015-0048-7] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 11/05/2014] [Accepted: 03/13/2015] [Indexed: 11/23/2022] Open
Abstract
Background Co-regulation of gene expression has been known for many years, and studied widely both globally and for individual genes. Nevertheless, most analyses concerned transcriptional control, which in the case of physically interacting proteins and protein complex subunits may be of secondary importance. This research is the first quantitative analysis that provides global-scale evidence for translation co-regulation among associated proteins. Results By analyzing the results of our previous quantitative model of translation, we have demonstrated that protein production rates plus several other translational parameters, such as mRNA and protein abundance, or number of produced proteins from a gene, are well concerted between stable complex subunits and party hubs. This may be energetically favorable during synthesis of complex building blocks and ensure their accurate production in time. In contrast, for connections with regulatory particles and date hubs translational co-regulation is less visible, indicating that in these cases maintenance of accurate levels of interacting particles is not necessarily beneficial. Conclusions Similar results obtained for distantly related model organisms, Saccharomyces cerevisiae and Homo sapiens, suggest that the phenomenon of translational co-regulation applies to a variety of living organisms and concerns many complex constituents. This phenomenon was also observed among the set of functionally linked proteins from Escherichia coli operons. This leads to the conclusion that translational regulation of a protein should always be studied with respect to the expression of its primary interacting partners. Reviewers This article was reviewed by Sandor Pongor and Claus Wilke.
Affiliation(s)
- Marlena Siwiak
- Department of Bioinformatics, Institute of Biochemistry and Biophysics, Polish Academy of Sciences, Pawinskiego 5a, Warsaw, 02-106, Poland.
| | - Piotr Zielenkiewicz
- Department of Bioinformatics, Institute of Biochemistry and Biophysics, Polish Academy of Sciences, Pawinskiego 5a, Warsaw, 02-106, Poland. Laboratory of Plant Molecular Biology, Faculty of Biology, Warsaw University, Pawinskiego 5a, Warsaw, 02-106, Poland.
| |
|
21
|
Flores CL, Gancedo C. The gene YALI0E20207g from Yarrowia lipolytica encodes an N-acetylglucosamine kinase implicated in the regulated expression of the genes from the N-acetylglucosamine assimilatory pathway. PLoS One 2015; 10:e0122135. [PMID: 25816199 PMCID: PMC4376941 DOI: 10.1371/journal.pone.0122135] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Received: 12/22/2014] [Accepted: 02/16/2015] [Indexed: 12/31/2022] Open
Abstract
The non-conventional yeast Yarrowia lipolytica possesses an ORF, YALI0E20207g, which encodes a protein with an amino acid sequence similar to hexokinases from different organisms. We have cloned that gene and determined several enzymatic properties of its encoded protein, showing that it is an N-acetylglucosamine (NAGA) kinase. This conclusion was supported by the lack of growth in NAGA of a strain carrying a YALI0E20207g deletion. We named this gene YlNAG5. Expression of YlNAG5, as well as that of the genes encoding the enzymes of the NAGA catabolic pathway (identified by a BLAST search), was induced by this sugar. Deletion of YlNAG5 rendered that expression independent of the presence of NAGA in the medium, and reintroduction of the gene restored the inducibility, indicating that YlNag5 participates in the transcriptional regulation of the NAGA assimilatory pathway genes. Expression of YlNAG5 was increased during sporulation, and homozygous Ylnag5/Ylnag5 diploid strains sporulated very poorly compared with a wild type isogenic control strain, pointing to a participation of the protein in the process. Overexpression of YlNAG5 allowed growth in glucose of an Ylhxk1glk1 double mutant and produced, in a wild type background, aberrant morphologies in different media. Expression of the gene in a Saccharomyces cerevisiae hxk1 hxk2 glk1 triple mutant restored the ability to grow in glucose.
Affiliation(s)
- Carmen-Lisset Flores
- Department of Metabolism and Cell Signalling, Instituto de Investigaciones Biomédicas “Alberto Sols” CSIC-UAM, Madrid, Spain
- * E-mail:
| | - Carlos Gancedo
- Department of Metabolism and Cell Signalling, Instituto de Investigaciones Biomédicas “Alberto Sols” CSIC-UAM, Madrid, Spain
| |
|
22
|
Abstract
When considering the evolution of a gene’s expression profile, we commonly assume that this is unaffected by its genomic neighborhood. This is, however, in contrast to what we know about the lack of autonomy between neighboring genes in gene expression profiles in extant taxa. Indeed, in all eukaryotic genomes genes of similar expression-profile tend to cluster, reflecting chromatin level dynamics. Does it follow that if a gene increases expression in a particular lineage then the genomic neighbors will also increase in their expression or is gene expression evolution autonomous? To address this here we consider evolution of human gene expression since the human-chimp common ancestor, allowing for both variation in estimation of current expression level and error in Bayesian estimation of the ancestral state. We find that in all tissues and both sexes, the change in gene expression of a focal gene on average predicts the change in gene expression of neighbors. The effect is highly pronounced in the immediate vicinity (<100 kb) but extends much further. Sex-specific expression change is also genomically clustered. As genes increasing their expression in humans tend to avoid nuclear lamina domains and be enriched for the gene activator 5-hydroxymethylcytosine, we conclude that, most probably owing to chromatin level control of gene expression, a change in gene expression of one gene likely affects the expression evolution of neighbors, what we term expression piggybacking, an analog of hitchhiking.
Affiliation(s)
- Avazeh T Ghanbarian
- Department of Biology and Biochemistry, University of Bath, Bath, United Kingdom
| | - Laurence D Hurst
- Department of Biology and Biochemistry, University of Bath, Bath, United Kingdom
| |
|
23
|
Abbas MM, Malluhi QM, Balakrishnan P. Assessment of de novo assemblers for draft genomes: a case study with fungal genomes. BMC Genomics 2014; 15 Suppl 9:S10. [PMID: 25521762 PMCID: PMC4290589 DOI: 10.1186/1471-2164-15-s9-s10] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.5] [Indexed: 12/21/2022] Open
Abstract
BACKGROUND Recently, large bio-projects dealing with the release of different genomes have emerged. Most of these projects use next-generation sequencing platforms. As a consequence, many de novo assembly tools have evolved to assemble the reads generated by these platforms. Each tool has its own inherent advantages and disadvantages, which make the selection of an appropriate tool a challenging task. RESULTS We have evaluated the performance of frequently used de novo assemblers, namely ABySS, IDBA-UD, Minia, SOAP, SPAdes, Sparse, and Velvet. These assemblers are assessed based on their output quality during the assembly process conducted over fungal data. We compared the performance of these assemblers by considering both computational and quality metrics. By analyzing these performance metrics, the assemblers are ranked and a procedure for choosing the candidate assembler is illustrated. CONCLUSIONS In this study, we propose an assessment method for the selection of de novo assemblers by considering their computational as well as quality metrics at the draft genome level. We divide the quality metrics into three groups: g1 measures the goodness of the assemblies, g2 measures the problems of the assemblies, and g3 measures the conservation elements in the assemblies. Our results demonstrate that the assemblers ABySS and IDBA-UD exhibit a good performance for the studied data from fungal genomes in terms of running time, memory, and quality. The results suggest that whole genome shotgun sequencing projects should make use of different assemblers by considering their merits.
|
24
|
Wang D, Yu J. Plastid-LCGbase: a collection of evolutionarily conserved plastid-associated gene pairs. Nucleic Acids Res 2014; 43:D990-5. [PMID: 25378306 PMCID: PMC4383908 DOI: 10.1093/nar/gku1070] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 11/14/2022] Open
Abstract
Plastids carry their own genetic material that encodes a variable set of genes that are limited in number but functionally important. Aside from orthology, the lineage-specific order and orientation of these genes are also relevant. Here, we develop a database, Plastid-LCGbase (http://lcgbase.big.ac.cn/plastid-LCGbase/), which focuses on organizational variability of plastid genes and genomes from diverse taxonomic groups. The current Plastid-LCGbase contains information from 470 plastid genomes and exhibits several unique features. First, through a genome-overview page generated from OrganellarGenomeDRAW, it displays general arrangement of all plastid genes (circular or linear). Second, it shows patterns and modes of all paired plastid genes and their physical distances across user-defined lineages, which are facilitated by a step-wise stratification of taxonomic groups. Third, it divides the paired genes into three categories (co-directionally-paired genes or CDPGs, convergently-paired genes or CPGs and divergently-paired genes or DPGs) and three patterns (separation, overlap and inclusion) and provides basic statistics for each species. Fourth, the gene pairing scheme is expandable, where neighboring genes can also be included in species-/lineage-specific comparisons. We hope that Plastid-LCGbase facilitates gene variation (insertion-deletion, translocation and rearrangement) and transcription-level studies of plastid genomes.
Affiliation(s)
- Dapeng Wang
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, P. R. China; Stem Cell Laboratory, UCL Cancer Institute, University College London, London WC1E 6BT, UK
| | - Jun Yu
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, P. R. China
| |
|
25
|
Pla-Martín D, Calpena E, Lupo V, Márquez C, Rivas E, Sivera R, Sevilla T, Palau F, Espinós C. Junctophilin-1 is a modifier gene of GDAP1-related Charcot-Marie-Tooth disease. Hum Mol Genet 2014; 24:213-29. [PMID: 25168384 DOI: 10.1093/hmg/ddu440] [Citation(s) in RCA: 40] [Impact Index Per Article: 3.6] [Indexed: 12/15/2022] Open
Abstract
Mutations in the GDAP1 gene cause different forms of Charcot-Marie-Tooth (CMT) disease, and the primary clinical expression of this disease is markedly variable in the dominant inheritance form (CMT type 2K; CMT2K), in which carriers of the GDAP1 p.R120W mutation can display a wide range of clinical severity. We investigated the JPH1 gene as a genetic modifier of clinical expression variability because junctophilin-1 (JPH1) is a good positional and functional candidate. We demonstrated that the JPH1-GDAP1 cluster forms a paralogon and is conserved in vertebrates. Moreover, both proteins play a role in Ca²⁺ homeostasis, and we demonstrated that JPH1 is able to restore the store-operated Ca²⁺ entry (SOCE) activity in GDAP1-silenced cells. After the mutational screening of JPH1 in a series of 24 CMT2K subjects who harbour the GDAP1 p.R120W mutation, we characterized the JPH1 p.R213P mutation in one patient with a more severe clinical picture. JPH1(p.R213P) cannot rescue the SOCE response in GDAP1-silenced cells. We observed that JPH1 colocalizes with STIM1, which is the activator of SOCE, in endoplasmic reticulum-plasma membrane puncta structures during Ca²⁺ release in a GDAP1-dependent manner. However, when GDAP1(p.R120W) is expressed, JPH1 seems to be retained in mitochondria. We also established that the combination of GDAP1(p.R120W) and JPH1(p.R213P) dramatically reduces SOCE activity, mimicking the effect observed in GDAP1 knock-down cells. In summary, we conclude that JPH1 and GDAP1 share a common pathway and depend on each other; therefore, JPH1 can contribute to the phenotypical consequences of GDAP1 mutations.
Affiliation(s)
- David Pla-Martín
- Program in Rare and Genetic Diseases and IBV/CSIC Associated Unit, Centro de Investigación Príncipe Felipe (CIPF), Valencia 46012, Spain; Centro de Investigación Biomédica en Red de Enfermedades Raras (CIBERER), Valencia 46012, Spain
| | - Eduardo Calpena
- Program in Rare and Genetic Diseases and IBV/CSIC Associated Unit, Centro de Investigación Príncipe Felipe (CIPF), Valencia 46012, Spain; Centro de Investigación Biomédica en Red de Enfermedades Raras (CIBERER), Valencia 46012, Spain
| | - Vincenzo Lupo
- Program in Rare and Genetic Diseases and IBV/CSIC Associated Unit, Centro de Investigación Príncipe Felipe (CIPF), Valencia 46012, Spain; Centro de Investigación Biomédica en Red de Enfermedades Raras (CIBERER), Valencia 46012, Spain
| | | | - Eloy Rivas
- Department of Pathology, Hospital Universitario Virgen del Rocío, Seville 41013, Spain
| | - Rafael Sivera
- Department of Neurology, Hospital Universitari i Politècnic La Fe and Instituto de Investigación Sanitario (IIS)-La Fe, Valencia 46026, Spain; Centro de Investigación Biomédica en Red de Enfermedades Neurodegenerativas (CIBERNED), Valencia 46026, Spain
| | - Teresa Sevilla
- Department of Neurology, Hospital Universitari i Politècnic La Fe and Instituto de Investigación Sanitario (IIS)-La Fe, Valencia 46026, Spain; Centro de Investigación Biomédica en Red de Enfermedades Neurodegenerativas (CIBERNED), Valencia 46026, Spain; Department of Medicine and
| | - Francesc Palau
- Program in Rare and Genetic Diseases and IBV/CSIC Associated Unit, Centro de Investigación Príncipe Felipe (CIPF), Valencia 46012, Spain; Centro de Investigación Biomédica en Red de Enfermedades Raras (CIBERER), Valencia 46012, Spain; University of Castilla-La Mancha School of Medicine, Ciudad Real 13071, Spain
| | - Carmen Espinós
- Program in Rare and Genetic Diseases and IBV/CSIC Associated Unit, Centro de Investigación Príncipe Felipe (CIPF), Valencia 46012, Spain; Centro de Investigación Biomédica en Red de Enfermedades Raras (CIBERER), Valencia 46012, Spain; Department of Genetics, Universitat de València, Valencia 46010, Spain and
| |
|
26
|
Abstract
A decade of genome sequencing has transformed our understanding of how trypanosomatid parasites have evolved and provided fresh impetus to explaining the origins of parasitism in the Kinetoplastida. In this review, I will consider the many ways in which genome sequences have influenced our view of genomic reduction in trypanosomatids; how species-specific genes, and the genomic domains they occupy, have illuminated the innovations in trypanosomatid genomes; and how comparative genomics has exposed the molecular mechanisms responsible for innovation and adaptation to a parasitic lifestyle.
|
27
|
Ma B, Charkowski AO, Glasner JD, Perna NT. Identification of host-microbe interaction factors in the genomes of soft rot-associated pathogens Dickeya dadantii 3937 and Pectobacterium carotovorum WPP14 with supervised machine learning. BMC Genomics 2014; 15:508. [PMID: 24952641 PMCID: PMC4079955 DOI: 10.1186/1471-2164-15-508] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.0] [Received: 12/22/2013] [Accepted: 06/09/2014] [Indexed: 12/14/2022] Open
Abstract
Background A wealth of genome sequences has provided thousands of genes of unknown function, but identification of functions for the large numbers of hypothetical genes in phytopathogens remains a challenge that impacts all research on plant-microbe interactions. Decades of research on the molecular basis of pathogenesis focused on a limited number of factors associated with long-known host-microbe interaction systems, providing limited direction into this challenge. Computational approaches to identify virulence genes often rely on two strategies: searching for sequence similarity to known host-microbe interaction factors from other organisms, and identifying islands of genes that discriminate between pathogens of one type and closely related non-pathogens or pathogens of a different type. The former is limited to known genes, excluding vast collections of genes of unknown function found in every genome. The latter lacks specificity, since many genes in genomic islands have little to do with host-interaction. Result In this study, we developed a supervised machine learning approach that was designed to recognize patterns from large and disparate data types, in order to identify candidate host-microbe interaction factors. The soft rot Enterobacteriaceae strains Dickeya dadantii 3937 and Pectobacterium carotovorum WPP14 were used for development of this tool, because these pathogens are important on multiple high value crops in agriculture worldwide and more genomic and functional data is available for the Enterobacteriaceae than any other microbial family. Our approach achieved greater than 90% precision and a recall rate over 80% in 10-fold cross validation tests. Conclusion Application of the learning scheme to the complete genome of these two organisms generated a list of roughly 200 candidates, many of which were previously not implicated in plant-microbe interaction and many of which are of completely unknown function. 
These lists provide new targets for experimental validation and further characterization, and our approach presents a promising pattern-learning scheme that can be generalized to create a resource to study host-microbe interactions in other bacterial phytopathogens.
Affiliation(s)
- Bing Ma
- Genome Center of Wisconsin, University of Wisconsin-Madison, Madison, WI 53706, USA.
| | | | | | | |
|
28
|
Xie B, Wang D, Duan Y, Yu J, Lei H. Functional networking of human divergently paired genes (DPGs). PLoS One 2013; 8:e78896. [PMID: 24205343 PMCID: PMC3815023 DOI: 10.1371/journal.pone.0078896] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 03/26/2013] [Accepted: 09/17/2013] [Indexed: 11/18/2022] Open
Abstract
Divergently paired genes (DPGs), also known as bidirectional (head-to-head positioned) genes, are conserved across species and lineages, and thus deemed to be exceptional in genomic organization and functional regulation. Despite previous investigations on the features of their conservation and gene organization, the functional relationship among DPGs in a given species and lineage has not been thoroughly clarified. Here we report a comprehensive network-based analysis of human DPGs, and our results indicate that the two members of a DPG tend to participate in different biological processes while enforcing related functions as modules. Compared with randomly paired genes as a control, the DPG pairs have a tendency to be clustered in similar “cellular components” and involved in similar “molecular functions”. The functional network bridged by DPGs consists of three major modules. The largest module includes many house-keeping genes involved in core cellular activities. This module also shows low variation in expression in both CNS (central nervous system) and non-CNS tissues. Based on analyses of disease transcriptome data, we further suggest that this particular module may play crucial roles in HIV infection and its disease mechanism.
Affiliation(s)
- Bin Xie
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China
- University of Chinese Academy of Sciences, Beijing, China
- Dapeng Wang
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China
- Yong Duan
- UC Davis Genome Center and Department of Biomedical Engineering, Davis, California, United States of America
- Jun Yu
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China
- * E-mail: (JY); (HL)
- Hongxing Lei
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China
- UC Davis Genome Center and Department of Biomedical Engineering, Davis, California, United States of America
- * E-mail: (JY); (HL)
|
29
|
Kashkin KN, Chernov IP, Stukacheva EA, Kopantzev EP, Monastyrskaya GS, Uspenskaya NY, Sverdlov ED. Cancer specificity of promoters of the genes involved in cell proliferation control. Acta Naturae 2013; 5:79-83. [PMID: 24303203 PMCID: PMC3848069] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/08/2022] Open
Abstract
Core promoters with adjacent regions of the human genes CDC6, POLD1, CKS1B, MCM2, and PLK1 were cloned into a pGL3 vector in front of the Photinus pyralis gene Luc in order to study the tumor specificity of the promoters. The cloned promoters were compared in their ability to direct luciferase expression in different human cancer cells and in normal fibroblasts. The cancer-specific promoter BIRC5 and the non-specific CMV immediate early gene promoter were used for comparison. All cloned promoters were shown to be substantially more active in cancer cells than in fibroblasts, while the PLK1 promoter was the most cancer-specific and promising one. The specificity of the promoters to cancer cells descended in the series PLK1, CKS1B, POLD1, MCM2, and CDC6. The bidirectional activity of the cloned CKS1B promoter was demonstrated. It apparently directs the expression of the SHC1 gene, which is located in a "head-to-head" position to the CKS1B gene in the human genome. This feature should be taken into account in future use of the CKS1B promoter. The cloned promoters may be used in artificial genetic constructions for cancer gene therapy.
Affiliation(s)
- K. N. Kashkin
- Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, Russian Academy of Sciences, Miklukho-Maklaya St., 16/10, Moscow, Russia, 117997
- I. P. Chernov
- Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, Russian Academy of Sciences, Miklukho-Maklaya St., 16/10, Moscow, Russia, 117997
- E. A. Stukacheva
- Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, Russian Academy of Sciences, Miklukho-Maklaya St., 16/10, Moscow, Russia, 117997
- E. P. Kopantzev
- Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, Russian Academy of Sciences, Miklukho-Maklaya St., 16/10, Moscow, Russia, 117997
- G. S. Monastyrskaya
- Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, Russian Academy of Sciences, Miklukho-Maklaya St., 16/10, Moscow, Russia, 117997
- N. Ya. Uspenskaya
- Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, Russian Academy of Sciences, Miklukho-Maklaya St., 16/10, Moscow, Russia, 117997
- E. D. Sverdlov
- Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, Russian Academy of Sciences, Miklukho-Maklaya St., 16/10, Moscow, Russia, 117997
|
30
|
Abstract
Gene direction, which is important for function, has not been subjected to statistical testing for randomness and for the degree of evolutionary changes. We analyzed 747 sequenced species and 2,061 genomes/chromosomes and detected clear differences in gene direction between kingdoms. All the archaeans, bacteria, and protozoa analyzed have genes characterized mainly by same-direction neighbors (i.e., in head-to-foot or foot-to-head order), with up to 391 genes in tandem in protozoan Leishmania infantum. Fungi and photosynthetic protists have genes characterized by opposite-direction neighbors, except chromosome VII of Ashbya gossypii, a progenitor fungus. The gene direction analysis suggests that the same-direction dominance originated from the last common ancestor of these living organisms, then was strengthened in protozoa, but weakened or lost in fungi, photosynthetic protists and some plants/animals, giving chromosomes/genomes with gene opposite-direction dominance (i.e., towards the random use of both DNA strands).
|
31
|
Irimia M, Tena JJ, Alexis MS, Fernandez-Miñan A, Maeso I, Bogdanovic O, de la Calle-Mustienes E, Roy SW, Gómez-Skarmeta JL, Fraser HB. Extensive conservation of ancient microsynteny across metazoans due to cis-regulatory constraints. Genome Res 2012; 22:2356-67. [PMID: 22722344 PMCID: PMC3514665 DOI: 10.1101/gr.139725.112] [Citation(s) in RCA: 105] [Impact Index Per Article: 8.1] [Indexed: 12/22/2022]
Abstract
The order of genes in eukaryotic genomes has generally been assumed to be neutral, since gene order is largely scrambled over evolutionary time. Only a handful of exceptional examples are known, typically involving deeply conserved clusters of tandemly duplicated genes (e.g., Hox genes and histones). Here we report the first systematic survey of microsynteny conservation across metazoans, utilizing 17 genome sequences. We identified nearly 600 pairs of unrelated genes that have remained tightly physically linked in diverse lineages across over 600 million years of evolution. Integrating sequence conservation, gene expression data, gene function, epigenetic marks, and other genomic features, we provide extensive evidence that many conserved ancient linkages involve (1) the coordinated transcription of neighboring genes, or (2) genomic regulatory blocks (GRBs) in which transcriptional enhancers controlling developmental genes are contained within nearby bystander genes. In addition, we generated ChIP-seq data for key histone modifications in zebrafish embryos, which provided further evidence of putative GRBs in embryonic development. Finally, using chromosome conformation capture (3C) assays and stable transgenic experiments, we demonstrate that enhancers within bystander genes drive the expression of genes such as Otx and Islet, critical regulators of central nervous system development across bilaterians. These results suggest that ancient genomic functional associations are far more common than previously thought—involving ∼12% of the ancestral bilaterian genome—and that cis-regulatory constraints are crucial in determining metazoan genome architecture.
Affiliation(s)
- Manuel Irimia
- Department of Biology, Stanford University, Stanford, California 94305, USA
|
32
|
Costa GGL, Cabrera OG, Tiburcio RA, Medrano FJ, Carazzolle MF, Thomazella DPT, Schuster SC, Carlson JE, Guiltinan MJ, Bailey BA, Mieczkowski P, Pereira GAG, Meinhardt LW. The mitochondrial genome of Moniliophthora roreri, the frosty pod rot pathogen of cacao. Fungal Biol 2012; 116:551-62. [PMID: 22559916 DOI: 10.1016/j.funbio.2012.01.008] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.6] [Received: 08/17/2011] [Revised: 01/15/2012] [Accepted: 01/25/2012] [Indexed: 11/29/2022]
Abstract
In this study, we report the sequence of the mitochondrial (mt) genome of the Basidiomycete fungus Moniliophthora roreri, which is the etiologic agent of frosty pod rot of cacao (Theobroma cacao L.). We also compare it to the mtDNA from the closely-related species Moniliophthora perniciosa, which causes witches' broom disease of cacao. The 94 Kb mtDNA genome of M. roreri has a circular topology and codes for the typical 14 mt genes involved in oxidative phosphorylation. It also codes for both rRNA genes, a ribosomal protein subunit, 13 intronic open reading frames (ORFs), and a full complement of 27 tRNA genes. The conserved genes of M. roreri mtDNA are completely syntenic with homologous genes of the 109 Kb mtDNA of M. perniciosa. As in M. perniciosa, M. roreri mtDNA contains a high number of hypothetical ORFs (28), a remarkable feature that makes Moniliophthoras the largest reservoir of hypothetical ORFs among sequenced fungal mtDNA. Additionally, the mt genome of M. roreri has three free invertron-like linear mt plasmids, one of which is very similar to that previously described as integrated into the main M. perniciosa mtDNA molecule. Moniliophthora roreri mtDNA also has a region of suspected plasmid origin containing 15 hypothetical ORFs distributed in both strands. One of these ORFs is similar to an ORF in the mtDNA gene encoding DNA polymerase in Pleurotus ostreatus. The comparison to M. perniciosa showed that the 15 Kb difference in mtDNA sizes is mainly attributed to a lower abundance of repetitive regions in M. roreri (5.8 Kb vs 20.7 Kb). The most notable differences between M. roreri and M. perniciosa mtDNA are attributed to repeats and regions of plasmid origin. These elements might have contributed to the rapid evolution of mtDNA. Since M. roreri is the second species of the genus Moniliophthora whose mtDNA genome has been sequenced, the data presented here contribute valuable information for understanding the evolution of fungal mt genomes among closely-related species.
Affiliation(s)
- Gustavo G L Costa
- Laboratório de Genômica e Expressão, Departamento de Genética, Evolução e Bioagentes, Instituto de Biologia, Universidade Estadual de Campinas, 13083-970, Campinas, SP, Brazil
|
33
|
Wang D, Zhang Y, Fan Z, Liu G, Yu J. LCGbase: A Comprehensive Database for Lineage-Based Co-regulated Genes. Evol Bioinform Online 2011; 8:39-46. [PMID: 22267903 PMCID: PMC3256993 DOI: 10.4137/ebo.s8540] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Indexed: 11/17/2022] Open
Abstract
Animal genes of different lineages, such as vertebrates and arthropods, are well-organized and blended into dynamic chromosomal structures that represent a primary regulatory mechanism for body development and cellular differentiation. The majority of genes in a genome are actually clustered, and these clusters are evolutionarily stable to different extents and biologically meaningful when evaluated among genomes within and across lineages. Until now, many questions concerning gene organization, such as what is the minimal number of genes in a cluster and what is the driving force leading to gene co-regulation, remain to be addressed. Here, we provide a user-friendly database—LCGbase (a comprehensive database for lineage-based co-regulated genes)—hosting information on evolutionary dynamics of gene clustering and ordering within animal kingdoms in two different lineages: vertebrates and arthropods. The database is constructed on a web-based Linux-Apache-MySQL-PHP framework with an effective interactive user-inquiry service. Compared to other gene annotation databases with similar purposes, our database has three comprehensible advantages. First, our database is inclusive, including all high-quality genome assemblies of vertebrates and representative arthropod species. Second, it is human-centric since we map all gene clusters from other genomes in an order of lineage-ranks (such as primates, mammals, warm-blooded, and reptiles) onto the human genome and start the database from well-defined gene pairs (a minimal cluster where the two adjacent genes are oriented as co-directional, convergent, and divergent pairs) to large gene clusters. Furthermore, users can search for any adjacent genes and their detailed annotations. Third, the database provides flexible parameter definitions, such as the distance of transcription start sites between two adjacent genes, which is extendable to genes flanking the cluster across species. We also provide useful tools for sequence alignment, gene ontology (GO) annotation, promoter identification, gene expression (co-expression), and evolutionary analysis. This database not only provides a way to define lineage-specific and species-specific gene clusters but also facilitates future studies on gene co-regulation, epigenetic control of gene expression (DNA methylation and histone marks), and chromosomal structures in a context of gene clusters and species evolution. LCGbase is freely available at http://lcgbase.big.ac.cn/LCGbase.
Affiliation(s)
- Dapeng Wang
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100029, PR China
|
34
|
Singh Sandhu K, Li G, Sung WK, Ruan Y. Chromatin interaction networks and higher order architectures of eukaryotic genomes. J Cell Biochem 2011; 112:2218-21. [PMID: 21520242 DOI: 10.1002/jcb.23155] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.2] [Indexed: 12/16/2022]
Abstract
The eukaryotic genome is organized, not only linearly but also spatially, into a non-random architecture. Though the linear organization of genes and their epigenetic descriptors are well characterized, the relevance of their spatial organization has only recently begun to unfold. It is increasingly being recognized that physical interactions among distant genomic elements could serve as an important means of eukaryotic genome regulation. With the advent of proximity ligation based techniques coupled with next generation sequencing, it is now possible to explore whole genome chromatin interactions at high resolution. Emerging data on genome-wide chromatin interactions suggest that distantly located genes are not independent entities and instead cross-talk with each other in an extensive manner, supporting the notion of "chromatin interaction networks". Moreover, the data also advance the field to "3-dimensional (3D) chromatin structure and dynamics", which would enable molecular biologists to explore the spatiotemporal regulation of the genome. In this article, we introduce a stepwise topological transformation of the genome from 1-dimension (1D, linear) to 2-dimension (2D, networks) to 3-dimension (3D, architecture) and discuss how such transformations could advance our understanding of genome biology.
|
35
|
Disruption of Yarrowia lipolytica TPS1 gene encoding trehalose-6-P synthase does not affect growth in glucose but impairs growth at high temperature. PLoS One 2011; 6:e23695. [PMID: 21931609 PMCID: PMC3171402 DOI: 10.1371/journal.pone.0023695] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.6] [Received: 06/03/2011] [Accepted: 07/22/2011] [Indexed: 11/18/2022] Open
Abstract
We have cloned the Yarrowia lipolytica TPS1 gene encoding trehalose-6-P synthase by complementation of the lack of growth in glucose of a Saccharomyces cerevisiae tps1 mutant. Disruption of YlTPS1 could only be achieved with a cassette placed in the 3' half of its coding region due to the overlap of its sequence with the promoter of the essential gene YlTFC1. The Yltps1 mutant grew in glucose although the Y. lipolytica hexokinase is extremely sensitive to inhibition by trehalose-6-P. The presence of a glucokinase, insensitive to trehalose-6-P, that constitutes about 80% of the glucose phosphorylating capacity during growth in glucose may account for the growth phenotype. Trehalose content was below 1 nmol/mg dry weight in Y. lipolytica, but it increased in strains expressing YlTPS1 under the control of the YlTEF1 promoter or with a disruption of YALI0D15598 encoding a putative trehalase. mRNA levels of YlTPS1 were low and did not respond to thermal stresses, but those of YlTPS2 (YALI0D14476) and YlTPS3 (YALI0E31086) increased 4 and 6 times, respectively, by heat treatment. Disruption of YlTPS1 drastically slowed growth at 35°C. Homozygous Yltps1 diploids showed a decreased sporulation frequency that was ascribed to the low level of YALI0D20966 mRNA, a homolog of the S. cerevisiae MCK1, which encodes a protein kinase that activates early meiotic gene expression.
|
36
|
Chavali S, Morais DADL, Gough J, Babu MM. Evolution of eukaryotic genome architecture: Insights from the study of a rapidly evolving metazoan, Oikopleura dioica. Bioessays 2011; 33:592-601. [DOI: 10.1002/bies.201100034] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Indexed: 11/10/2022]
|
37
|
Weeks AM, Chang MCY. Constructing de novo biosynthetic pathways for chemical synthesis inside living cells. Biochemistry 2011; 50:5404-18. [PMID: 21591680 DOI: 10.1021/bi200416g] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.4] [Indexed: 01/11/2023]
Abstract
Living organisms have evolved a vast array of catalytic functions that make them ideally suited for the production of medicinally and industrially relevant small-molecule targets. Indeed, native metabolic pathways in microbial hosts have long been exploited and optimized for the scalable production of both fine and commodity chemicals. Our increasing capacity for DNA sequencing and synthesis has revealed the molecular basis for the biosynthesis of a variety of complex and useful metabolites and allows the de novo construction of novel metabolic pathways for the production of new and exotic molecular targets in genetically tractable microbes. However, the development of commercially viable processes for these engineered pathways is currently limited by our ability to quickly identify or engineer enzymes with the correct reaction and substrate selectivity as well as the speed by which metabolic bottlenecks can be determined and corrected. Efforts to understand the relationship among sequence, structure, and function in the basic biochemical sciences can advance these goals for synthetic biology applications while also serving as an experimental platform for elucidating the in vivo specificity and function of enzymes and reconstituting complex biochemical traits for study in a living model organism. Furthermore, the continuing discovery of natural mechanisms for the regulation of metabolic pathways has revealed new principles for the design of high-flux pathways with minimized metabolic burden and has inspired the development of new tools and approaches to engineering synthetic pathways in microbial hosts for chemical production.
Affiliation(s)
- Amy M Weeks
- Department of Chemistry, University of California, Berkeley, California 94720-1460, USA
|
38
|
Abstract
A large number of genomes have been sequenced, allowing a range of comparative studies. Here, we present the eukaryotic Gene Order Browser with information on the order of protein and non-coding RNA (ncRNA) genes of 74 different eukaryotic species. The browser is able to display a gene of interest together with its genomic context in all species where that gene is present. Thereby, questions related to the evolution of gene organization and non-random gene order may be examined. The browser also provides access to data collected on pairs of adjacent genes that are evolutionarily conserved. Availability: eGOB as well as underlying data are freely available at http://egob.biomedicine.gu.se. Supplementary information: Supplementary data are available at Bioinformatics online. Contact: tore.samuelsson@medkem.gu.se.
Affiliation(s)
- Marcela Dávila López
- Department of Medical Biochemistry and Cell Biology, Institute of Biomedicine, Sahlgrenska Academy at University of Gothenburg, SE-405 30 Göteborg, Sweden
|
39
|
Janga SC, Díaz-Mejía JJ, Moreno-Hagelsieb G. Network-based function prediction and interactomics: the case for metabolic enzymes. Metab Eng 2010; 13:1-10. [PMID: 20654726 DOI: 10.1016/j.ymben.2010.07.001] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.3] [Received: 06/02/2010] [Revised: 07/15/2010] [Accepted: 07/16/2010] [Indexed: 12/19/2022]
Abstract
As sequencing technologies increase in power, determining the functions of unknown proteins encoded by the DNA sequences so produced becomes a major challenge. Functional annotation is commonly done on the basis of amino-acid sequence similarity alone. Long after sequence similarity becomes undetectable by pair-wise comparison, profile-based identification of homologs can often succeed due to the conservation of position-specific patterns, important for a protein's three-dimensional folding and function. Nevertheless, prediction of protein function from homology-driven approaches is not without problems. Homologous proteins might evolve different functions and the power of homology detection has already started to reach its maximum. Computational methods for inferring protein function, which exploit the context of a protein in cellular networks, have come to be built on top of homology-based approaches. These network-based functional inference techniques both provide a first-hand hint into a protein's functional role and offer complementary insights to traditional methods for understanding the function of uncharacterized proteins. Most recent network-based approaches aim to integrate diverse kinds of functional interactions to boost both coverage and confidence level. These techniques not only promise to solve the moonlighting aspect of proteins by annotating proteins with multiple functions, but also increase our understanding of the interplay between different functional classes in a cell. In this article we review the state of the art in network-based function prediction and describe some of the underlying difficulties and successes. Given the volume of high-throughput data that is being reported, the time is ripe to employ these network-based approaches, which can be used to unravel the functions of the uncharacterized proteins accumulating in the genomic databases.
Affiliation(s)
- S C Janga
- MRC Laboratory of Molecular Biology, Hills Road, Cambridge CB20QH, United Kingdom.
|