201
Amrine KCH, Swingley WD, Ardell DH. tRNA signatures reveal a polyphyletic origin of SAR11 strains among alphaproteobacteria. PLoS Comput Biol 2014; 10:e1003454. [PMID: 24586126 PMCID: PMC3937112 DOI: 10.1371/journal.pcbi.1003454] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Received: 05/01/2013] [Accepted: 12/10/2013] [Indexed: 12/18/2022] Open
Abstract
Molecular phylogenetics and phylogenomics are subject to noise from horizontal gene transfer (HGT) and bias from convergence in macromolecular compositions. Extensive variation in size, structure and base composition of alphaproteobacterial genomes has complicated their phylogenomics, sparking controversy over the origins and closest relatives of the SAR11 strains. SAR11 are highly abundant, cosmopolitan aquatic Alphaproteobacteria with streamlined, A+T-biased genomes. A dominant view holds that SAR11 are monophyletic and related to both Rickettsiales and the ancestor of mitochondria. Other studies dispute this, finding evidence of a polyphyletic origin of SAR11 with most strains distantly related to Rickettsiales. Although careful evolutionary modeling can reduce bias and noise in phylogenomic inference, entirely different approaches may be useful to extract robust phylogenetic signals from genomes. Here we develop simple phyloclassifiers from bioinformatically derived tRNA Class-Informative Features (CIFs), features predicted to target tRNAs for specific interactions within the tRNA interaction network. Our tRNA CIF-based model robustly and accurately classifies alphaproteobacterial genomes into one of seven undisputed monophyletic orders or families, despite great variability in tRNA gene complement sizes and base compositions. Our model robustly rejects monophyly of SAR11, classifying all but one strain as Rhizobiales with strong statistical support. Yet remarkably, conventional phylogenetic analysis of tRNAs classifies all SAR11 strains identically as Rickettsiales. We attribute this discrepancy to convergence of SAR11 and Rickettsiales tRNA base compositions. Thus, tRNA CIFs appear more robust to compositional convergence than tRNA sequences generally. Our results suggest that tRNA-CIF-based phyloclassification is robust to HGT of components of the tRNA interaction network, such as aminoacyl-tRNA synthetases. 
We explain why tRNAs are especially advantageous for prediction of traits governing macromolecular interactions from genomic data, and why such traits may be advantageous in the search for robust signals to address difficult problems in classification and phylogeny. If gene products work well in the networks of foreign cells, their genes may transfer horizontally between unrelated genomes. What factors dictate the ability to integrate into foreign networks? Different RNAs and proteins must interact specifically in order to function well as a system. For example, tRNA functions are determined by the interactions they have with other macromolecules. We have developed ways to predict, from genomic data alone, how tRNAs distinguish themselves to their specific interaction partners. Here, as proof of concept, we built a robust computational model from these bioinformatic predictions in seven lineages of Alphaproteobacteria. We validated our model by classifying hundreds of diverse alphaproteobacterial taxa and tested it on eight strains of SAR11, a phylogenetically controversial group that is highly abundant in the world's oceans. We found that different strains of SAR11 are more distantly related, both to each other and to mitochondria, than widely believed. We explain conflicting results about SAR11 as an artifact of bias created by the variability in base contents of alphaproteobacterial genomes. While this bias affects tRNAs too, our classifier appears unexpectedly robust to it. More broadly, our results suggest that traits governing macromolecular interactions may be more faithfully vertically inherited than the macromolecules themselves.
Affiliation(s)
- Katherine C. H. Amrine
  - Program in Quantitative and Systems Biology, University of California, Merced, Merced, California, United States of America
- Wesley D. Swingley
  - Program in Quantitative and Systems Biology, University of California, Merced, Merced, California, United States of America
- David H. Ardell
  - Program in Quantitative and Systems Biology, University of California, Merced, Merced, California, United States of America
202
Kumar NS, Dullaghan EM, Finlay BB, Gong H, Reiner NE, Jon Paul Selvam J, Thorson LM, Campbell S, Vitko N, Richardson AR, Zoraghi R, Young RN. Discovery and optimization of a new class of pyruvate kinase inhibitors as potential therapeutics for the treatment of methicillin-resistant Staphylococcus aureus infections. Bioorg Med Chem 2014; 22:1708-25. [PMID: 24508307 DOI: 10.1016/j.bmc.2014.01.020] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.9] [Received: 08/23/2013] [Revised: 01/07/2014] [Accepted: 01/15/2014] [Indexed: 11/19/2022]
Abstract
A novel series of bis-indoles derived from the naturally occurring marine alkaloid 4 was synthesized and evaluated as inhibitors of methicillin-resistant Staphylococcus aureus (MRSA) pyruvate kinase (PK). PK is not only critical for bacterial survival, which makes it a target for the development of novel antibiotics, but is also reported to be one of the most highly connected 'hub proteins' in MRSA; it should therefore be very sensitive to mutations, making it difficult for the bacteria to develop resistance. From the co-crystal structure of cis-3,4-dihydrohamacanthin B (4) bound to S. aureus PK we were able to identify the pharmacophore needed for activity. Consequently, we prepared simple direct-linked bis-indoles such as 10b that have anti-MRSA activity similar to that of compound 4. Structure-activity relationship (SAR) studies were carried out on 10b and led us to discover more potent compounds such as 10c, 10d, 10k and 10m, with enzyme-inhibiting activities in the low nanomolar range, that effectively inhibited bacterial growth in culture with minimum inhibitory concentrations (MIC) for MRSA as low as 0.5 μg/ml. Some potent PK inhibitors, such as 10b, exhibited attenuated antibacterial activity and were found to be substrates for an efflux mechanism in S. aureus. Studies comparing a wild-type S. aureus with a construct (S. aureus LAC Δpyk::Erm(R)) that lacks PK activity confirmed that the bactericidal activity of 10d was PK-dependent.
Affiliation(s)
- Nag S Kumar
  - Department of Chemistry, Simon Fraser University, Burnaby, BC, Canada
- Edie M Dullaghan
  - Centre for Drug Research and Development (CDRD), Vancouver, BC, Canada
- B Brett Finlay
  - Department of Biochemistry and Molecular Biology, University of British Columbia, Vancouver, BC, Canada; Department of Medicine, Division of Infectious Diseases, University of British Columbia, Vancouver, BC, Canada
- Huansheng Gong
  - Department of Medicine, Division of Infectious Diseases, University of British Columbia, Vancouver, BC, Canada
- Neil E Reiner
  - Department of Medicine, Division of Infectious Diseases, University of British Columbia, Vancouver, BC, Canada; Department of Microbiology and Immunology, Michael Smith Laboratories, University of British Columbia, Vancouver, BC, Canada
- J Jon Paul Selvam
  - Department of Chemistry, Simon Fraser University, Burnaby, BC, Canada
- Lisa M Thorson
  - Department of Medicine, Division of Infectious Diseases, University of British Columbia, Vancouver, BC, Canada
- Sara Campbell
  - Centre for Drug Research and Development (CDRD), Vancouver, BC, Canada
- Nicholas Vitko
  - Department of Microbiology and Immunology, School of Medicine, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Anthony R Richardson
  - Department of Microbiology and Immunology, School of Medicine, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Roya Zoraghi
  - Department of Medicine, Division of Infectious Diseases, University of British Columbia, Vancouver, BC, Canada
- Robert N Young
  - Department of Chemistry, Simon Fraser University, Burnaby, BC, Canada
203
Gong P, Zhao M, He C. Slow co-evolution of the MAGO and Y14 protein families is required for the maintenance of their obligate heterodimerization mode. PLoS One 2014; 9:e84842. [PMID: 24416299 PMCID: PMC3885619 DOI: 10.1371/journal.pone.0084842] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.9] [Received: 08/14/2013] [Accepted: 11/19/2013] [Indexed: 11/18/2022] Open
Abstract
The exon junction complex (EJC) plays important roles in RNA metabolism and the development of eukaryotic organisms. MAGO (short form of MAGO NASHI) and Y14 (also Tsunagi or RBM8) are the EJC core components. Their biological roles have been well investigated in various species, but the evolutionary patterns of the two gene families and their protein-protein interactions are poorly known. A genome-wide survey suggested that the MAGO and Y14 gene families originated in eukaryotic organisms and have been maintained at low copy number. We found that the two protein families evolved slowly; however, the MAGO family, under stringent purifying selection, evolved more slowly than the Y14 family, which was under relatively relaxed purifying selection. MAGO and Y14 were obliged to form a heterodimer within a eukaryotic organism, and this obligate mode was plesiomorphic. Lack of binding of MAGO to Y14, acting as a functional barrier, was observed only between distantly related species, suggesting a slow co-evolution of the two protein families. The inter-protein co-evolutionary signal was further quantified using Tol-MirroTree and a co-evolution analysis of protein sequences. About 20% of the 41 significantly correlated mutation groups (involving 97 residues) predicted between the two families were clade-specific. Moreover, around half of the predicted co-evolved groups and nearly all clade-specific residues fell into the minimal interaction domains of the two protein families. The mutagenesis effects of the clade-specific residues supported the conclusion that co-evolution is required for the obligate MAGO-Y14 heterodimerization mode. In turn, obligate heterodimerization within an organism serves as a strong functional constraint on the co-evolution of the MAGO and Y14 families. Such co-evolution maintains the interaction between the proteins over long evolutionary time scales. Our work sheds light on the functional evolution of the EJC genes in eukaryotes and helps in understanding the co-evolutionary processes among protein families.
Affiliation(s)
- Pichang Gong
  - State Key Laboratory of Systematic and Evolutionary Botany, Institute of Botany, Chinese Academy of Sciences, Beijing, China; University of Chinese Academy of Sciences, Beijing, China
- Man Zhao
  - State Key Laboratory of Systematic and Evolutionary Botany, Institute of Botany, Chinese Academy of Sciences, Beijing, China; University of Chinese Academy of Sciences, Beijing, China
- Chaoying He
  - State Key Laboratory of Systematic and Evolutionary Botany, Institute of Botany, Chinese Academy of Sciences, Beijing, China
204
Integrated Genomics Approaches in Evolutionary and Ecological Endocrinology. Advances in Experimental Medicine and Biology 2014; 781:299-319. [DOI: 10.1007/978-94-007-7347-9_15] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Indexed: 12/21/2022]
205
Chakraborty S, Ghosh TC. Evolutionary rate heterogeneity of core and attachment proteins in yeast protein complexes. Genome Biol Evol 2013; 5:1366-75. [PMID: 23814130 PMCID: PMC3730348 DOI: 10.1093/gbe/evt096] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Indexed: 12/24/2022] Open
Abstract
In general, proteins do not work alone; they form macromolecular complexes to play fundamental roles in diverse cellular functions. On the basis of their iterative clustering procedure and frequency of occurrence in the macromolecular complexes, the protein subunits have been categorized as core and attachment. Core protein subunits are the main functional elements, whereas attachment proteins act as modifiers or activators in protein complexes. In this article, using the current data set of yeast protein complexes, we found that core proteins are evolving at a faster rate than attachment proteins in spite of their functional importance. Interestingly, our investigation revealed that attachment proteins are present in a higher number of macromolecular complexes than core proteins. We also observed that the protein complex number (defined as the number of protein complexes to which a protein subunit belongs) has a stronger influence on gene/protein essentiality than multifunctionality. Finally, our results suggest that the observed differences in the rates of protein evolution between core and attachment proteins are due to differences in protein complex number and expression level. Moreover, we conclude that proteins which are present in higher numbers of macromolecular complexes enhance their overall expression level by increasing their transcription rate as well as translation rate, and thus the protein complex number imposes a strong selection pressure on the evolution of the yeast proteome.
206
Schumacher J, Rosenkranz D, Herlyn H. Mating systems and protein-protein interactions determine evolutionary rates of primate sperm proteins. Proc Biol Sci 2013; 281:20132607. [PMID: 24307672 PMCID: PMC3866406 DOI: 10.1098/rspb.2013.2607] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Indexed: 11/25/2022] Open
Abstract
To assess the relative impact of functional constraint and post-mating sexual selection on sequence evolution of reproductive proteins, we examined 169 primate sperm proteins. In order to recognize potential genome-wide trends, we additionally analysed a sample of 318 non-reproductive (brain and postsynaptic) proteins. Based on cDNAs of eight primate species (Anthropoidea), we observed that pre-mating sperm proteins engaged in sperm composition and assembly show a significantly lower incidence of site-specific positive selection and overall lower non-synonymous to synonymous substitution rates (dN/dS) across sites as compared with post-mating sperm proteins involved in capacitation, hyperactivation, acrosome reaction and fertilization. Moreover, database screening revealed overall more intracellular protein interaction partners in pre-mating than in post-mating sperm proteins. Finally, post-mating sperm proteins evolved at significantly higher evolutionary rates than pre-mating sperm and non-reproductive proteins on the branches to multi-male breeding species, while no such increase was observed on the branches to unimale and monogamous species. We conclude that fewer protein–protein interactions of post-mating sperm proteins account for lowered functional constraint, allowing for a stronger impact of post-mating sexual selection, while the opposite holds true for pre-mating sperm proteins. This pattern is particularly strong in multi-male breeding species showing high female promiscuity.
Affiliation(s)
- Julia Schumacher
  - Institute of Anthropology, University of Mainz, Anselm-Franz-von-Bentzel-Weg 7, 55099 Mainz, Germany
207
Warren S, Wan XF, Conant G, Korkin D. Extreme evolutionary conservation of functionally important regions in H1N1 influenza proteome. PLoS One 2013; 8:e81027. [PMID: 24282564 PMCID: PMC3839886 DOI: 10.1371/journal.pone.0081027] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.0] [Received: 06/08/2013] [Accepted: 10/08/2013] [Indexed: 12/31/2022] Open
Abstract
The H1N1 subtype of influenza A virus has caused two of the four documented pandemics and is responsible for seasonal epidemic outbreaks, presenting a continuous threat to public health. The co-circulation of antigenically divergent influenza strains significantly complicates vaccine development and use. Here, by combining evolutionary, structural, functional, and population information about the H1N1 proteome, we seek to answer two questions: (1) do residues on the protein surfaces evolve faster than the protein core residues consistently across all proteins that constitute the influenza proteome? and (2) in spite of the rapid evolution of surface residues in influenza proteins, are there any regions on the protein surface that do not evolve? To answer these questions, we first built phylogenetically-aware models of the patterns of surface and interior substitutions. Employing these models, we found a single coherent pattern of faster evolution on the protein surfaces that characterizes all influenza proteins. The pattern is consistent with the events of inter-species reassortment, the worldwide introduction of the flu vaccine in the early 1980s, as well as the differences caused by the geographic origins of the virus. Next, we developed an automated computational pipeline to comprehensively detect regions of the protein surface residues that were 100% conserved over multiple years and in multiple host species. We identified conserved regions on the surface of 10 influenza proteins spread across all avian, swine, and human strains, with the exception of a small group of isolated strains that affected the conservation of three proteins. Surprisingly, these regions were also unaffected by genetic variation in the pandemic 2009 H1N1 viral population data obtained from deep sequencing experiments. Finally, the conserved regions were intrinsically related to the intra-viral macromolecular interaction interfaces. Our study may provide further insights towards the identification of novel protein targets for influenza antivirals.
Affiliation(s)
- Samantha Warren
  - Department of Computer Science, University of Missouri, Columbia, Missouri, United States of America
- Xiu-Feng Wan
  - Department of Basic Sciences, Mississippi State University, Mississippi State, Mississippi, United States of America
- Gavin Conant
  - Division of Animal Sciences, University of Missouri, Columbia, Missouri, United States of America
  - Informatics Institute, University of Missouri, Columbia, Missouri, United States of America
- Dmitry Korkin
  - Department of Computer Science, University of Missouri, Columbia, Missouri, United States of America
  - Informatics Institute, University of Missouri, Columbia, Missouri, United States of America
  - Bond Life Science Center, University of Missouri, Columbia, Missouri, United States of America
208
Davila-Velderrain J, Servin-Marquez A, Alvarez-Buylla ER. Molecular evolution constraints in the floral organ specification gene regulatory network module across 18 angiosperm genomes. Mol Biol Evol 2013; 31:560-73. [PMID: 24273325 DOI: 10.1093/molbev/mst223] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.1] [Indexed: 12/16/2022] Open
Abstract
The gene regulatory network of floral organ cell fate specification of Arabidopsis thaliana is a robust developmental regulatory module. Although this finding was proposed to explain the overall conservation of floral organ types and organization among angiosperms, it has not been confirmed that the network components are conserved at the molecular level among flowering plants. Using the genomic data that have accumulated, we address the conservation of the genes involved in this network and the forces that have shaped its evolution during the divergence of angiosperms. We recovered the network gene homologs for 18 species of flowering plants spanning nine families. We found that all the genes are highly conserved with no evidence of positive selection. We studied the sequence conservation features of the genes in the context of their known biological function and the strength of the purifying selection acting upon them in relation to their placement within the network. Our results suggest an association between protein length and sequence conservation, evolutionary rates, and functional category. On the other hand, we found no significant correlation between the strength of purifying selection and gene placement. Our results confirm that the studied robust developmental regulatory module has been subjected to strong functional constraints. However, unlike previous studies, our results do not support the notion that network topology plays a major role in constraining evolutionary rates. We speculate that the dynamical functional role of genes within the network, and not just their connectivity, could play an important role in constraining evolution.
209
Jackson EL, Ollikainen N, Covert AW, Kortemme T, Wilke CO. Amino-acid site variability among natural and designed proteins. PeerJ 2013; 1:e211. [PMID: 24255821 PMCID: PMC3828621 DOI: 10.7717/peerj.211] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.3] [Received: 10/03/2013] [Accepted: 10/24/2013] [Indexed: 11/20/2022] Open
Abstract
Computational protein design attempts to create protein sequences that fold stably into pre-specified structures. Here we compare alignments of designed proteins to alignments of natural proteins and assess how closely designed sequences recapitulate patterns of sequence variation found in natural protein sequences. We design proteins using RosettaDesign, and we evaluate both fixed-backbone designs and variable-backbone designs with different amounts of backbone flexibility. We find that proteins designed with a fixed backbone tend to underestimate the amount of site variability observed in natural proteins while proteins designed with an intermediate amount of backbone flexibility result in more realistic site variability. Further, the correlation between solvent exposure and site variability in designed proteins is lower than that in natural proteins. This finding suggests that site variability is too uniform across different solvent exposure states (i.e., buried residues are too variable or exposed residues too conserved). When comparing the amino acid frequencies in the designed proteins with those in natural proteins we find that in the designed proteins hydrophobic residues are underrepresented in the core. From these results we conclude that intermediate backbone flexibility during design results in more accurate protein design and that either scoring functions or backbone sampling methods require further improvement to accurately replicate structural constraints on site variability.
Affiliation(s)
- Eleisha L. Jackson
  - Institute of Cellular and Molecular Biology, Center for Computational Biology and Bioinformatics, and Department of Integrative Biology, The University of Texas at Austin, Austin, TX, USA
- Noah Ollikainen
  - Graduate Program in Bioinformatics, University of California San Francisco, San Francisco, CA, USA
- Arthur W. Covert
  - Institute of Cellular and Molecular Biology, Center for Computational Biology and Bioinformatics, and Department of Integrative Biology, The University of Texas at Austin, Austin, TX, USA
- Tanja Kortemme
  - Graduate Program in Bioinformatics, University of California San Francisco, San Francisco, CA, USA
  - California Institute for Quantitative Biosciences (QB3) and Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA, USA
- Claus O. Wilke
  - Institute of Cellular and Molecular Biology, Center for Computational Biology and Bioinformatics, and Department of Integrative Biology, The University of Texas at Austin, Austin, TX, USA
210
Han M, Qin S, Song X, Li Y, Jin P, Chen L, Ma F. Evolutionary rate patterns of genes involved in the Drosophila Toll and Imd signaling pathway. BMC Evol Biol 2013; 13:245. [PMID: 24209511 PMCID: PMC3826850 DOI: 10.1186/1471-2148-13-245] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.6] [Received: 07/22/2013] [Accepted: 11/06/2013] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: To survive in a hostile environment, insects have evolved an innate immune system to defend against infection. Studies have shown that natural selection may drive the evolution of immune system-related proteins. Yet, how network architecture influences protein sequence evolution remains unclear. Here, we analyzed the molecular evolutionary patterns of genes in the Toll and Imd innate immune signaling pathways across six Drosophila genomes within the context of a functional network.
RESULTS: Based on published literature, we identified 50 genes that are directly involved in the Drosophila Toll and Imd signaling pathways. Of those genes, only two (Sphinx1 and Dnr1) exhibited signals of positive selection. There existed a negative correlation between the strength of purifying selection and gene position within the pathway; the downstream genes were more conserved, indicating that they were subjected to stronger evolutionary constraints. Interestingly, there was also a significantly negative correlation between the rate of protein evolution and the number of regulatory microRNAs, implying that genes regulated by more miRNAs experience stronger functional constraints and therefore evolve more slowly.
CONCLUSION: Taken together, our results suggested that both network architecture and miRNA regulation affect protein sequence evolution. These findings improve our understanding of the evolutionary patterns of genes involved in Drosophila innate immune pathways.
Affiliation(s)
- Fei Ma
  - Laboratory for Comparative Genomics and Bioinformatics & Jiangsu Key Laboratory for Biodiversity and Biotechnology, College of Life Science, Nanjing Normal University, Nanjing 210023, P. R. China.
211
Li S, Choi KP, Wu T, Zhang L. Maximum likelihood inference of the evolutionary history of a PPI network from the duplication history of its proteins. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2013; 10:1412-1421. [PMID: 24407300 DOI: 10.1109/tcbb.2013.14] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 06/03/2023]
Abstract
Evolutionary history of protein-protein interaction (PPI) networks provides valuable insight into molecular mechanisms of network growth. In this paper, we study how to infer the evolutionary history of a PPI network from its protein duplication relationship. We show that for a plausible evolutionary history of a PPI network, its relative quality, measured by the so-called loss number, is independent of the growth parameters of the network and can be computed efficiently. This finding leads us to propose two fast maximum likelihood algorithms to infer the evolutionary history of a PPI network given the duplication history of its proteins. Simulation studies demonstrated that our approach, which takes advantage of protein duplication information, outperforms NetArch, the first maximum likelihood algorithm for PPI network history reconstruction. Using the proposed method, we studied the topological change of the PPI networks of the yeast, fruitfly, and worm.
Affiliation(s)
- Si Li
  - National University of Singapore, Singapore
- Taoyang Wu
  - National University of Singapore, Singapore and University of East Anglia, Norwich
212
Kiran M, Nagarajaram HA. Global versus local hubs in human protein-protein interaction network. J Proteome Res 2013; 12:5436-46. [PMID: 24050456 DOI: 10.1021/pr4002788] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.8] [Indexed: 01/16/2023]
Abstract
In this study, we have constructed tissue-specific protein-protein interaction networks for 70 human tissues and have identified three types of hubs based on their expression breadths: (a) tissue-specific hubs (TSHs) (proteins that are expressed in ≤ 10 tissues and also form hubs in ≤ 10 tissues), (b) tissue-preferred hubs (TPHs) (proteins expressed in ≥ 60 tissues but are highly connected in ≤ 10 tissues), and (c) housekeeping hubs (HKHs) (proteins that are expressed in ≥ 60 tissues and also form hubs in ≥ 60 tissues). Comparative analyses revealed significant differences between TSHs and HKHs and also revealed that TPHs behave more like HKHs. TSHs are lengthier, more disordered, and also quickly evolving proteins as compared with HKHs. Despite having a similar number of binding surfaces and interacting domains, TSHs are associated with a lower degree of centrality as compared with HKHs, suggesting that TSHs are "unsaturated" with regard to their binding capability and are perhaps evolving with regard to their interactions. TSHs are less abundantly expressed as compared with HKHs and are enriched with PEST motifs, indicating their tight regulation. All of these properties of TSHs and HKHs correlate with their distinct functional roles; TSHs are involved in tissue-specific functional roles, viz., secretors, receptors, and signaling proteins, whereas HKHs are involved in core-cellular functions such as transcription, translation, and so on. Our study, therefore, brings forth a clear and distinct classification of hubs simply based on their expression breadth and further assumes significance in the light of the highly debated dichotomy of date and party hubs, which is based on the coexpression pattern of hubs with their partners.
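The three hub categories defined in this abstract amount to a threshold rule on two per-protein counts. A minimal Python sketch of that rule follows, assuming hypothetical inputs `n_expressed` (number of tissues in which the protein is expressed) and `n_hub` (number of tissues in which it forms a hub). The function name, the requirement that `n_hub` be at least 1, and the `None` result for proteins outside all three categories are illustrative assumptions, not details taken from the cited study.

```python
from typing import Optional


def classify_hub(n_expressed: int, n_hub: int,
                 low: int = 10, high: int = 60) -> Optional[str]:
    """Assign a hub category from tissue counts (hypothetical helper).

    The thresholds follow the abstract: <= 10 tissues counts as narrow,
    >= 60 tissues (out of 70) counts as broad.
    """
    # Assumption: a protein that is a hub in no tissue is not a hub at all.
    if n_hub < 1:
        return None
    if n_expressed <= low and n_hub <= low:
        return "TSH"  # tissue-specific hub
    if n_expressed >= high and n_hub <= low:
        return "TPH"  # tissue-preferred hub
    if n_expressed >= high and n_hub >= high:
        return "HKH"  # housekeeping hub
    return None  # intermediate breadth: outside the three categories


if __name__ == "__main__":
    for counts in [(5, 3), (65, 4), (70, 66), (30, 20)]:
        print(counts, classify_hub(*counts))
```

Proteins of intermediate expression breadth (for example, expressed in 30 tissues) fall outside all three categories under this reading, which is why the sketch returns `None` for them.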
Affiliation(s)
- Manjari Kiran
  - Laboratory of Computational Biology, Centre for DNA Fingerprinting and Diagnostics, Bldg. 7, Gruhakalpa, Nampally, Hyderabad 500 001, Andhra Pradesh, India
213
Simple topological features reflect dynamics and modularity in protein interaction networks. PLoS Comput Biol 2013; 9:e1003243. [PMID: 24130468 PMCID: PMC3794914 DOI: 10.1371/journal.pcbi.1003243] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.7] [Received: 06/06/2013] [Accepted: 08/14/2013] [Indexed: 11/30/2022] Open
Abstract
The availability of large-scale protein-protein interaction networks for numerous organisms provides an opportunity to comprehensively analyze whether simple properties of proteins are predictive of the roles they play in the functional organization of the cell. We begin by re-examining an influential but controversial characterization of the dynamic modularity of the S. cerevisiae interactome that incorporated gene expression data into network analysis. We analyse the protein-protein interaction networks of five organisms, S. cerevisiae, H. sapiens, D. melanogaster, A. thaliana, and E. coli, and confirm significant and consistent functional and structural differences between hub proteins that are co-expressed with their interacting partners and those that are not, and support the view that the former tend to be intramodular whereas the latter tend to be intermodular. However, we also demonstrate that in each of these organisms, simple topological measures are significantly correlated with the average co-expression of a hub with its partners, independent of any classification, and therefore also reflect protein intra- and inter- modularity. Further, cross-interactomic analysis demonstrates that these simple topological characteristics of hub proteins tend to be conserved across organisms. Overall, we give evidence that purely topological features of static interaction networks reflect aspects of the dynamics and modularity of interactomes as well as previous measures incorporating expression data, and are a powerful means for understanding the dynamic roles of hubs in interactomes.
A better understanding of protein interaction networks would be a great aid in furthering our knowledge of the molecular biology of the cell. Towards this end, large-scale protein-protein physical interaction data have been determined for organisms across the evolutionary spectrum. However, the resulting networks give a static view of interactomes, and our knowledge about protein interactions is rarely time or context specific. A previous prominent but controversial attempt to characterize the dynamic modularity of the interactome was based on integrating physical interaction data with gene activity measurements from transcript expression data. This analysis distinguished between proteins that are co-expressed with their interacting partners and those that are not, and argued that the former are intramodular and the latter are intermodular. By analyzing the interactomes of five organisms, we largely confirm the biological significance of this characterization through a variety of statistical tests and computational experiments. Surprisingly, however, we find that similar results can be obtained using just network information without additionally integrating expression data, suggesting that purely topological characteristics of interaction networks strongly reflect certain aspects of the dynamics and modularity of interactomes.
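The quantity this abstract calls "the average co-expression of a hub with its partners" can be illustrated with a minimal sketch. It assumes that co-expression is measured as the Pearson correlation between expression profiles, which the abstract does not state; the function names, the edge list, and the expression values are all hypothetical toy data, not taken from the paper.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)

def avg_coexpression(hub, edges, expression):
    """Mean correlation between a hub and each of its interaction partners."""
    # Collect the hub's neighbours from an undirected edge list (self-loops ignored).
    partners = {b if a == hub else a for a, b in edges if hub in (a, b) and a != b}
    scores = [pearson(expression[hub], expression[p]) for p in partners]
    return sum(scores) / len(scores)

# Hypothetical toy network and expression profiles (assumptions for illustration).
edges = [("H", "A"), ("H", "B"), ("A", "B")]
expression = {
    "H": [1.0, 2.0, 3.0, 4.0],
    "A": [2.0, 4.0, 6.0, 8.0],  # rises with H
    "B": [4.0, 3.0, 2.0, 1.0],  # falls as H rises
}
print(avg_coexpression("H", edges, expression))
```

Under the view the abstract describes, a hub with a high average would be read as intramodular and one with a low average as intermodular; any cutoff between the two would be a further assumption.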
|
214
|
Khurana E, Fu Y, Colonna V, Mu XJ, Kang HM, Lappalainen T, Sboner A, Lochovsky L, Chen J, Harmanci A, Das J, Abyzov A, Balasubramanian S, Beal K, Chakravarty D, Challis D, Chen Y, Clarke D, Clarke L, Cunningham F, Evani US, Flicek P, Fragoza R, Garrison E, Gibbs R, Gümüş ZH, Herrero J, Kitabayashi N, Kong Y, Lage K, Liluashvili V, Lipkin SM, MacArthur DG, Marth G, Muzny D, Pers TH, Ritchie GRS, Rosenfeld JA, Sisu C, Wei X, Wilson M, Xue Y, Yu F, 1000 Genomes Project Consortium, Dermitzakis ET, Yu H, Rubin MA, Tyler-Smith C, Gerstein M. Integrative annotation of variants from 1092 humans: application to cancer genomics. Science 2013; 342:1235587. [PMID: 24092746 PMCID: PMC3947637 DOI: 10.1126/science.1235587] [Citation(s) in RCA: 276] [Impact Index Per Article: 23.0] [Indexed: 12/15/2022]
Abstract
Interpreting variants, especially noncoding ones, in the increasing number of personal genomes is challenging. We used patterns of polymorphisms in functionally annotated regions in 1092 humans to identify deleterious variants; then we experimentally validated candidates. We analyzed both coding and noncoding regions, with the former corroborating the latter. We found regions particularly sensitive to mutations ("ultrasensitive") and variants that are disruptive because of mechanistic effects on transcription-factor binding (that is, "motif-breakers"). We also found variants in regions with higher network centrality tend to be deleterious. Insertions and deletions followed a similar pattern to single-nucleotide variants, with some notable exceptions (e.g., certain deletions and enhancers). On the basis of these patterns, we developed a computational tool (FunSeq), whose application to ~90 cancer genomes reveals nearly a hundred candidate noncoding drivers.
Affiliation(s)
- Ekta Khurana
  - Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06520, USA
  - Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06520, USA
- Yao Fu
  - Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06520, USA
- Vincenza Colonna
  - Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Cambridge, CB10 1SA, UK
  - Institute of Genetics and Biophysics, National Research Council (CNR), 80131 Naples, Italy
- Xinmeng Jasmine Mu
  - Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06520, USA
- Hyun Min Kang
  - Center for Statistical Genetics, Biostatistics, University of Michigan, Ann Arbor, MI 48109, USA
- Tuuli Lappalainen
  - Department of Genetic Medicine and Development, University of Geneva Medical School, 1211 Geneva, Switzerland
  - Institute for Genetics and Genomics in Geneva (iGE3), University of Geneva, 1211 Geneva, Switzerland
  - Swiss Institute of Bioinformatics, 1211 Geneva, Switzerland
- Andrea Sboner
  - Institute for Precision Medicine and the Department of Pathology and Laboratory Medicine, Weill Cornell Medical College and New York-Presbyterian Hospital, New York, NY 10065, USA
  - The HRH Prince Alwaleed Bin Talal Bin Abdulaziz Alsaud Institute for Computational Biomedicine, Weill Cornell Medical College, New York, NY 10021, USA
- Lucas Lochovsky
  - Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06520, USA
- Jieming Chen
  - Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06520, USA
  - Integrated Graduate Program in Physical and Engineering Biology, Yale University, New Haven, CT 06520, USA
- Arif Harmanci
  - Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06520, USA
  - Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06520, USA
- Jishnu Das
  - Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, NY 14853, USA
  - Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA
- Alexej Abyzov
  - Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06520, USA
  - Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06520, USA
- Suganthi Balasubramanian
  - Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06520, USA
  - Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06520, USA
- Kathryn Beal
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Dimple Chakravarty
  - Institute for Precision Medicine and the Department of Pathology and Laboratory Medicine, Weill Cornell Medical College and New York-Presbyterian Hospital, New York, NY 10065, USA
- Daniel Challis
  - Baylor College of Medicine, Human Genome Sequencing Center, Houston, TX 77030, USA
- Yuan Chen
  - Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Cambridge, CB10 1SA, UK
- Declan Clarke
  - Department of Chemistry, Yale University, New Haven, CT 06520, USA
- Laura Clarke
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Fiona Cunningham
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Uday S. Evani
  - Baylor College of Medicine, Human Genome Sequencing Center, Houston, TX 77030, USA
- Paul Flicek
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Robert Fragoza
  - Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA
  - Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY 14853, USA
- Erik Garrison
  - Department of Biology, Boston College, Chestnut Hill, MA 02467, USA
- Richard Gibbs
  - Baylor College of Medicine, Human Genome Sequencing Center, Houston, TX 77030, USA
- Zeynep H. Gümüş
  - The HRH Prince Alwaleed Bin Talal Bin Abdulaziz Alsaud Institute for Computational Biomedicine, Weill Cornell Medical College, New York, NY 10021, USA
  - Department of Physiology and Biophysics, Weill Cornell Medical College, New York, NY, 10065, USA
- Javier Herrero
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Naoki Kitabayashi
  - Institute for Precision Medicine and the Department of Pathology and Laboratory Medicine, Weill Cornell Medical College and New York-Presbyterian Hospital, New York, NY 10065, USA
- Yong Kong
  - Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06520, USA
  - Keck Biotechnology Resource Laboratory, Yale University, New Haven, CT 06511, USA
- Kasper Lage
  - Pediatric Surgical Research Laboratories, MassGeneral Hospital for Children, Massachusetts General Hospital, Boston, MA 02114, USA
  - Analytical and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA 02114, USA
  - Harvard Medical School, Boston, MA 02115, USA
  - Center for Biological Sequence Analysis, Department of Systems Biology, Technical University of Denmark, Lyngby, Denmark
  - Center for Protein Research, University of Copenhagen, Copenhagen, Denmark
- Vaja Liluashvili
  - The HRH Prince Alwaleed Bin Talal Bin Abdulaziz Alsaud Institute for Computational Biomedicine, Weill Cornell Medical College, New York, NY 10021, USA
  - Department of Physiology and Biophysics, Weill Cornell Medical College, New York, NY, 10065, USA
- Steven M. Lipkin
  - Department of Medicine, Weill Cornell Medical College, New York, NY 10065, USA
- Daniel G. MacArthur
  - Analytical and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA 02114, USA
  - Program in Medical and Population Genetics, Broad Institute of Harvard and Massachusetts Institute of Technology (MIT), Cambridge, MA 02142, USA
- Gabor Marth
  - Department of Biology, Boston College, Chestnut Hill, MA 02467, USA
- Donna Muzny
  - Baylor College of Medicine, Human Genome Sequencing Center, Houston, TX 77030, USA
- Tune H. Pers
  - Center for Biological Sequence Analysis, Department of Systems Biology, Technical University of Denmark, Lyngby, Denmark
  - Division of Endocrinology and Center for Basic and Translational Obesity Research, Children’s Hospital, Boston, MA 02115, USA
  - Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Graham R. S. Ritchie
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Jeffrey A. Rosenfeld
  - Department of Medicine, Rutgers New Jersey Medical School, Newark, NJ 07101, USA
  - IST/High Performance and Research Computing, Rutgers University Newark, NJ 07101, USA
  - Sackler Institute for Comparative Genomics, American Museum of Natural History, New York, NY 10024, USA
- Cristina Sisu
  - Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06520, USA
  - Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06520, USA
- Xiaomu Wei
  - Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA
  - Department of Medicine, Weill Cornell Medical College, New York, NY 10065, USA
- Michael Wilson
  - Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06520, USA
  - Child Study Center, Yale University, New Haven, CT 06520, USA
- Yali Xue
  - Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Cambridge, CB10 1SA, UK
- Fuli Yu
  - Baylor College of Medicine, Human Genome Sequencing Center, Houston, TX 77030, USA
- Emmanouil T. Dermitzakis
  - Department of Genetic Medicine and Development, University of Geneva Medical School, 1211 Geneva, Switzerland
  - Institute for Genetics and Genomics in Geneva (iGE3), University of Geneva, 1211 Geneva, Switzerland
  - Swiss Institute of Bioinformatics, 1211 Geneva, Switzerland
- Haiyuan Yu
  - Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, NY 14853, USA
  - Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA
- Mark A. Rubin
  - Institute for Precision Medicine and the Department of Pathology and Laboratory Medicine, Weill Cornell Medical College and New York-Presbyterian Hospital, New York, NY 10065, USA
- Chris Tyler-Smith
  - Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Cambridge, CB10 1SA, UK
- Mark Gerstein
  - Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06520, USA
  - Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06520, USA
  - Department of Computer Science, Yale University, New Haven, CT 06520, USA
|
215
|
Battle A, Mostafavi S, Zhu X, Potash JB, Weissman MM, McCormick C, Haudenschild CD, Beckman KB, Shi J, Mei R, Urban AE, Montgomery SB, Levinson DF, Koller D. Characterizing the genetic basis of transcriptome diversity through RNA-sequencing of 922 individuals. Genome Res 2013. [PMID: 24092820 DOI: 10.1101/gr.155192] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 01/20/2023]
Abstract
Understanding the consequences of regulatory variation in the human genome remains a major challenge, with important implications for understanding gene regulation and interpreting the many disease-risk variants that fall outside of protein-coding regions. Here, we provide a direct window into the regulatory consequences of genetic variation by sequencing RNA from 922 genotyped individuals. We present a comprehensive description of the distribution of regulatory variation--by the specific expression phenotypes altered, the properties of affected genes, and the genomic characteristics of regulatory variants. We detect variants influencing expression of over ten thousand genes, and through the enhanced resolution offered by RNA-sequencing, for the first time we identify thousands of variants associated with specific phenotypes including splicing and allelic expression. Evaluating the effects of both long-range intra-chromosomal and trans (cross-chromosomal) regulation, we observe modularity in the regulatory network, with three-dimensional chromosomal configuration playing a particular role in regulatory modules within each chromosome. We also observe a significant depletion of regulatory variants affecting central and critical genes, along with a trend of reduced effect sizes as variant frequency increases, providing evidence that purifying selection and buffering have limited the deleterious impact of regulatory variation on the cell. Further, generalizing beyond observed variants, we have analyzed the genomic properties of variants associated with expression and splicing and developed a Bayesian model to predict regulatory consequences of genetic variants, applicable to the interpretation of individual genomes and disease studies. Together, these results represent a critical step toward characterizing the complete landscape of human regulatory variation.
Affiliation(s)
- Alexis Battle
  - Department of Computer Science, Stanford University, Stanford, California 94305, USA
|
216
|
Characterizing the genetic basis of transcriptome diversity through RNA-sequencing of 922 individuals. Genome Res 2013; 24:14-24. [PMID: 24092820 PMCID: PMC3875855 DOI: 10.1101/gr.155192.113] [Citation(s) in RCA: 397] [Impact Index Per Article: 33.1] [Indexed: 01/17/2023]
Abstract
Understanding the consequences of regulatory variation in the human genome remains a major challenge, with important implications for understanding gene regulation and interpreting the many disease-risk variants that fall outside of protein-coding regions. Here, we provide a direct window into the regulatory consequences of genetic variation by sequencing RNA from 922 genotyped individuals. We present a comprehensive description of the distribution of regulatory variation—by the specific expression phenotypes altered, the properties of affected genes, and the genomic characteristics of regulatory variants. We detect variants influencing expression of over ten thousand genes, and through the enhanced resolution offered by RNA-sequencing, for the first time we identify thousands of variants associated with specific phenotypes including splicing and allelic expression. Evaluating the effects of both long-range intra-chromosomal and trans (cross-chromosomal) regulation, we observe modularity in the regulatory network, with three-dimensional chromosomal configuration playing a particular role in regulatory modules within each chromosome. We also observe a significant depletion of regulatory variants affecting central and critical genes, along with a trend of reduced effect sizes as variant frequency increases, providing evidence that purifying selection and buffering have limited the deleterious impact of regulatory variation on the cell. Further, generalizing beyond observed variants, we have analyzed the genomic properties of variants associated with expression and splicing and developed a Bayesian model to predict regulatory consequences of genetic variants, applicable to the interpretation of individual genomes and disease studies. Together, these results represent a critical step toward characterizing the complete landscape of human regulatory variation.
|
217
|
Liu Y, Li X, Liu Z, Chen L, Ng MK. Construction and analysis of single nucleotide polymorphism-single nucleotide polymorphism interaction networks. IET Syst Biol 2013; 7:170-81. [PMID: 24067417 PMCID: PMC8687305 DOI: 10.1049/iet-syb.2012.0055] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/28/2012] [Revised: 02/17/2013] [Accepted: 03/25/2013] [Indexed: 11/19/2022] Open
Abstract
The study of gene regulatory network and protein-protein interaction network is believed to be fundamental to the understanding of molecular processes and functions in systems biology. In this study, the authors are interested in single nucleotide polymorphism (SNP) level and construct SNP-SNP interaction network to understand genetic characters and pathogenetic mechanisms of complex diseases. The authors employ existing methods to mine, model and evaluate a SNP sub-network from SNP-SNP interactions. In the study, the authors employ the two SNP datasets: Parkinson disease and coronary artery disease to demonstrate the procedure of construction and analysis of SNP-SNP interaction networks. Experimental results are reported to demonstrate the procedure of construction and analysis of such SNP-SNP interaction networks can recover some existing biological results and related disease genes.
Affiliation(s)
- Yang Liu
  - Bioinformatics Program, Boston University, 24 Cummington Street, Boston, MA 02215, USA
- Xutao Li
  - Department of Computer Science, Shenzhen Graduate School, Harbin Institute of Technology, People's Republic of China
- Zhiping Liu
  - Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, People's Republic of China
- Luonan Chen
  - Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, People's Republic of China
- Michael K. Ng
  - Department of Mathematics, Hong Kong Baptist University, Kowloon Tong, Hong Kong
|
218
|
Seo H, Kim W, Lee J, Youn B. Network-based approaches for anticancer therapy (Review). Int J Oncol 2013; 43:1737-44. [PMID: 24085339 DOI: 10.3892/ijo.2013.2114] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.1] [Received: 06/27/2013] [Accepted: 08/23/2013] [Indexed: 12/16/2022] Open
Abstract
Cancer is a complex disease resulting from alterations of multiple signaling networks. Cancer networks have been identified as scale-free networks and may contain a functionally important key player called a hub that is linked to a large number of interactors. Since a hub can serve as a biological marker in a given network, targeting the hub could be an effective strategy for enhancing the efficacy of cancer treatment. Chemotherapies and radiotherapies are generally used to treat tumors not amenable to resection, and target single or multiple molecules associated with hubs. However, these therapies may unexpectedly induce the resistance of cancer cells to drugs and radiation. Cancer cells can overcome therapy-induced damage via the activation of back-up signaling pathways and flexible modulation of affected networks. These activities are considered to be the main reasons for chemoresistance and radioresistance, and subsequent failure of cancer therapies. Much effort is required to identify the key molecules that control the modulation of signaling networks in response to drugs and radiation. Network-based therapy that affects network flexibility, including rewired network structures and hub molecules in these networks, could minimize the occurrence of side-effects and be a promising strategy for enhancing the therapeutic efficacy of cancer treatments. This review is intended to offer an overview of current research efforts including ones focused on cancer-associated complex networks, their modulation in response to cancer therapy, and further strategies targeting networks that may improve cancer treatment efficacy.
Affiliation(s)
- Hyunjeong Seo
  - Department of Biological Sciences, College of Natural Sciences, Pusan National University, Busan 609-735, Republic of Korea
|
219
|
Schumacher J, Ramljak S, Asif AR, Schaffrath M, Zischler H, Herlyn H. Evolutionary conservation of mammalian sperm proteins associates with overall, not tyrosine, phosphorylation in human spermatozoa. J Proteome Res 2013; 12:5370-82. [PMID: 23919900 DOI: 10.1021/pr400228c] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.9] [Indexed: 02/02/2023]
Abstract
We investigated possible associations between sequence evolution of mammalian sperm proteins and their phosphorylation status in humans. As a reference, spermatozoa from three normozoospermic men were analyzed combining two-dimensional gel electrophoresis, immunoblotting, and mass spectrometry. We identified 99 sperm proteins (thereof 42 newly described) and determined the phosphorylation status for most of them. Sequence evolution was studied across six mammalian species using nonsynonymous/synonymous rate ratios (dN/dS) and amino acid distances. Site-specific purifying selection was assessed employing average ratios of evolutionary rates at phosphorylated versus nonphosphorylated amino acids (α). According to our data, mammalian sperm proteins do not show statistically significant sequence conservation difference, no matter if the human ortholog is a phosphoprotein with or without tyrosine (Y) phosphorylation. In contrast, overall phosphorylation of human sperm proteins, i.e., phosphorylation at serine (S), threonine (T), and/or Y residues, associates with above-average conservation of sequences. Complementary investigations suggest that numerous protein-protein interactants constrain sequence evolution of sperm phosphoproteins. Although our findings reject a special relevance of Y phosphorylation for sperm functioning, they still indicate that overall phosphorylation substantially contributes to proper functioning of sperm proteins. Hence, phosphorylated sperm proteins might be considered as prime candidates for diagnosis and treatment of reduced male fertility.
Affiliation(s)
- Julia Schumacher
  - Institute of Anthropology, University Mainz, Anselm-Franz-von-Bentzel-Weg 7, Mainz 55128, Germany
|
220
|
Abstract
The modern synthesis of evolutionary theory and genetics has enabled us to discover underlying molecular mechanisms of organismal evolution. We know that in order to maximize an organism's fitness in a particular environment, individual interactions among components of protein and nucleic acid networks need to be optimized by natural selection, or sometimes through random processes, as the organism responds to changes and/or challenges in the environment. Despite the significant role of molecular networks in determining an organism's adaptation to its environment, we still do not know how such inter- and intra-molecular interactions within networks change over time and contribute to an organism's evolvability while maintaining overall network functions. One way to address this challenge is to identify connections between molecular networks and their host organisms, to manipulate these connections, and then attempt to understand how such perturbations influence molecular dynamics of the network and thus influence evolutionary paths and organismal fitness. In the present review, we discuss how integrating evolutionary history with experimental systems that combine tools drawn from molecular evolution, synthetic biology and biochemistry allow us to identify the underlying mechanisms of organismal evolution, particularly from the perspective of protein interaction networks.
|
221
|
Colombo M, Laayouni H, Invergo BM, Bertranpetit J, Montanucci L. Metabolic flux is a determinant of the evolutionary rates of enzyme-encoding genes. Evolution 2013; 68:605-13. [PMID: 24102646 DOI: 10.1111/evo.12262] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.3] [Received: 02/27/2013] [Accepted: 08/15/2013] [Indexed: 01/25/2023]
Abstract
Relationships between evolutionary rates and gene properties on a genomic, functional, pathway, or system level are being explored to unravel the principles of the evolutionary process. In particular, functional network properties have been analyzed to recognize the constraints they may impose on the evolutionary fate of genes. Here we took as a case study the core metabolic network in human erythrocytes and we analyzed the relationship between the evolutionary rates of its genes and the metabolic flux distribution throughout it. We found that metabolic flux correlates with the ratio of nonsynonymous to synonymous substitution rates. Genes encoding enzymes that carry high fluxes have been more constrained in their evolution, while purifying selection is more relaxed in genes encoding enzymes carrying low metabolic fluxes. These results demonstrate the importance of considering the dynamical functioning of gene networks when assessing the action of selection on system-level properties.
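The reported relationship between metabolic flux and the nonsynonymous-to-synonymous rate ratio (dN/dS) can be illustrated with a rank correlation. This is a minimal sketch that assumes a Spearman correlation is a reasonable way to express such a relationship; the abstract does not say which statistic the authors used, and the flux and dN/dS values below are invented toy numbers.

```python
from math import sqrt

def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across a run of equal values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(vx * vy)

# Hypothetical per-gene values: flux carried by the enzyme and dN/dS of its gene.
flux = [0.1, 0.5, 1.2, 3.0, 7.5]
dn_ds = [0.40, 0.31, 0.22, 0.15, 0.05]
rho = spearman(flux, dn_ds)
print(rho)
```

A negative value of `rho` corresponds to the pattern described in the abstract, where genes whose enzymes carry high fluxes are more constrained and therefore show lower dN/dS.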
Affiliation(s)
- Martino Colombo
  - Institute of Evolutionary Biology (CSIC- Pompeu Fabra University), CEXS-UPF-PRBB, Dr. Aiguader 88, 08003 Barcelona, Catalonia, Spain
|
222
|
Winterbach W, Van Mieghem P, Reinders M, Wang H, de Ridder D. Topology of molecular interaction networks. BMC Systems Biology 2013; 7:90. [PMID: 24041013 PMCID: PMC4231395 DOI: 10.1186/1752-0509-7-90] [Citation(s) in RCA: 75] [Impact Index Per Article: 6.3] [Received: 02/19/2013] [Accepted: 08/01/2013] [Indexed: 12/23/2022]
Abstract
Molecular interactions are often represented as network models which have become the common language of many areas of biology. Graphs serve as convenient mathematical representations of network models and have themselves become objects of study. Their topology has been intensively researched over the last decade after evidence was found that they share underlying design principles with many other types of networks. Initial studies suggested that molecular interaction network topology is related to biological function and evolution. However, further whole-network analyses did not lead to a unified view on what this relation may look like, with conclusions highly dependent on the type of molecular interactions considered and the metrics used to study them. It is unclear whether global network topology drives function, as suggested by some researchers, or whether it is simply a byproduct of evolution or even an artefact of representing complex molecular interaction networks as graphs. Nevertheless, network biology has progressed significantly over the last years. We review the literature, focusing on two major developments. First, realizing that molecular interaction networks can be naturally decomposed into subsystems (such as modules and pathways), topology is increasingly studied locally rather than globally. Second, there is a move from a descriptive approach to a predictive one: rather than correlating biological network topology to generic properties such as robustness, it is used to predict specific functions or phenotypes. Taken together, this change in focus from globally descriptive to locally predictive points to new avenues of research. In particular, multi-scale approaches are developments promising to drive the study of molecular interaction networks further.
Affiliation(s)
- Wynand Winterbach
  - Network Architectures and Services, Department of Intelligent Systems, Faculty of Electrical Engineering, Mathematics and Computer Science, Delft University of Technology, P.O. Box 5031, 2600 GA Delft, The Netherlands
  - Delft Bioinformatics Lab, Department of Intelligent Systems, Faculty of Electrical Engineering, Mathematics and Computer Science, Delft University of Technology, P.O. Box 5031, 2600 GA Delft, The Netherlands
- Piet Van Mieghem
  - Network Architectures and Services, Department of Intelligent Systems, Faculty of Electrical Engineering, Mathematics and Computer Science, Delft University of Technology, P.O. Box 5031, 2600 GA Delft, The Netherlands
- Marcel Reinders
  - Delft Bioinformatics Lab, Department of Intelligent Systems, Faculty of Electrical Engineering, Mathematics and Computer Science, Delft University of Technology, P.O. Box 5031, 2600 GA Delft, The Netherlands
  - Netherlands Bioinformatics Center, 6500 HB Nijmegen, The Netherlands
  - Kluyver Centre for Genomics of Industrial Fermentation, 2600 GA Delft, The Netherlands
- Huijuan Wang
  - Network Architectures and Services, Department of Intelligent Systems, Faculty of Electrical Engineering, Mathematics and Computer Science, Delft University of Technology, P.O. Box 5031, 2600 GA Delft, The Netherlands
- Dick de Ridder
  - Delft Bioinformatics Lab, Department of Intelligent Systems, Faculty of Electrical Engineering, Mathematics and Computer Science, Delft University of Technology, P.O. Box 5031, 2600 GA Delft, The Netherlands
  - Netherlands Bioinformatics Center, 6500 HB Nijmegen, The Netherlands
  - Kluyver Centre for Genomics of Industrial Fermentation, 2600 GA Delft, The Netherlands
|
223
|
Sequence diversity in coding regions of candidate genes in the glycoalkaloid biosynthetic pathway of wild potato species. G3: Genes, Genomes, Genetics 2013; 3:1467-79. [PMID: 23853090 PMCID: PMC3755908 DOI: 10.1534/g3.113.007146] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.6] [Indexed: 11/18/2022]
Abstract
Natural variation in five candidate genes of the steroidal glycoalkaloid (SGA) metabolic pathway and whole-genome single nucleotide polymorphism (SNP) genotyping were studied in six wild [Solanum chacoense (chc 80-1), S. commersonii, S. demissum, S. sparsipilum, S. spegazzinii, S. stoloniferum] and cultivated S. tuberosum Group Phureja (phu DH) potato species with contrasting levels of SGAs. Amplicons were sequenced for five candidate genes: 3-hydroxy-3-methylglutaryl coenzyme A reductase 1 and 2 (HMG1, HMG2) and 2.3-squalene epoxidase (SQE) of primary metabolism, and solanidine galactosyltransferase (SGT1), and glucosyltransferase (SGT2) of secondary metabolism. SNPs (n = 337) producing 354 variations were detected within 3.7 kb of sequenced DNA. More polymorphisms were found in introns than exons and in genes of secondary compared to primary metabolism. Although no significant deviation from neutrality was found, dN/dS ratios < 1 and negative values of Tajima’s D test suggested purifying selection and genetic hitchhiking in the gene fragments. In addition, patterns of dN/dS ratios across the SGA pathway suggested constraint by natural selection. Comparison of nucleotide diversity estimates and dN/dS ratios showed stronger selective constraints for genes of primary rather than secondary metabolism. SNPs (n = 24) with an exclusive genotype for either phu DH (low SGA) or chc 80-1 (high SGA) were identified for HMG2, SQE, SGT1 and SGT2. The SolCAP 8303 Illumina Potato SNP chip genotyping revealed eight informative SNPs on six pseudochromosomes, with homozygous and heterozygous genotypes that discriminated high, intermediate and low levels of SGA accumulation. These results can be used to evaluate SGA accumulation in segregating or association mapping populations.
|
224
|
Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Curr Opin Microbiol 2013; 16:566-72. [PMID: 23938265 DOI: 10.1016/j.mib.2013.07.010] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.0] [Received: 05/31/2013] [Revised: 07/12/2013] [Accepted: 07/16/2013] [Indexed: 01/17/2023]
Abstract
Novel classes of antimicrobials are needed to address the challenge of multidrug-resistant bacteria. Current bacterial drug targets mainly consist of specific proteins or subsets of proteins without regard for either how these targets are integrated in cellular networks or how they may interact with host proteins. However, proteins rarely act in isolation, and the majority of biological processes are dependent on interactions with other proteins. Consequently, protein-protein interaction (PPI) networks offer a realm of unexplored potential for next-generation drug targets. In this review, we argue that the architecture of bacterial or host-pathogen protein interactomes can provide invaluable insights for the identification of novel antibacterial drug targets.
Affiliation(s)
- Roya Zoraghi
- Division of Infectious Diseases, Department of Medicine, University of British Columbia, Vancouver, Canada
|
225
|
Han HW, Ohn JH, Moon J, Kim JH. Yin and Yang of disease genes and death genes between reciprocally scale-free biological networks. Nucleic Acids Res 2013; 41:9209-17. [PMID: 23935122 PMCID: PMC3814386 DOI: 10.1093/nar/gkt683] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.6] [Indexed: 01/05/2023] Open
Abstract
Biological networks often show a scale-free topology with node degree following a power-law distribution. Lethal genes tend to form functional hubs, whereas non-lethal disease genes are located at the periphery. Uni-dimensional analyses, however, are flawed. We created and investigated two distinct scale-free networks: a protein–protein interaction (PPI) network and a perturbation sensitivity network (PSN). The hubs of both networks exhibit a low molecular evolutionary rate (P < 8 × 10^-12, P < 2 × 10^-4) and a high codon adaptation index (P < 2 × 10^-16, P < 2 × 10^-8), indicating that both hubs have been shaped under high evolutionary selective pressure. Moreover, the topologies of PPI and PSN are inversely proportional: hubs of PPI tend to be located at the periphery of PSN and vice versa. PPI hubs are highly enriched with lethal genes but not with disease genes, whereas PSN hubs are highly enriched with disease genes and drug targets but not with lethal genes. PPI hub genes are enriched with essential cellular processes, but PSN hub genes are enriched with environmental interaction processes, having more TATA boxes and transcription factor binding sites. It is concluded that biological systems may balance internal growth signaling and external stress signaling by unifying two scale-free networks that are seemingly opposite to each other but work in concert between death and disease.
Affiliation(s)
- Hyun Wook Han
- Division of Biomedical Informatics, Seoul National University Biomedical Informatics (SNUBI), Seoul National University College of Medicine, Seoul 110799, Korea, College of Medicine, CHA General Hospital, CHA University, Seoul 135081, Korea and Systems Biomedical Informatics Research Center, Seoul National University, Seoul 110799, Korea
|
226
|
Wei W, Zhang T, Lin D, Yang ZJ, Guo FB. Transcriptional abundance is not the single force driving the evolution of bacterial proteins. BMC Evol Biol 2013; 13:162. [PMID: 23914835 PMCID: PMC3734234 DOI: 10.1186/1471-2148-13-162] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 03/19/2013] [Accepted: 08/01/2013] [Indexed: 11/20/2022] Open
Abstract
Background: Despite rapid progress in understanding the mechanisms that shape the evolution of proteins, the relative importance of various factors remains to be elucidated. In this study, we have assessed the effects of 16 different biological features on the evolutionary rates (ERs) of protein-coding sequences in bacterial genomes. Results: Our analysis of 18 bacterial species revealed new correlations between ERs and constraining factors. Previous studies have suggested that transcriptional abundance overwhelmingly constrains the evolution of yeast protein sequences. This transcriptional abundance leads to selection against misfolding or misinteractions. In this study we found that no single factor determines the evolution of bacterial proteins. Not only transcriptional abundance (codon adaptation index and expression level), but also protein-protein associations (PPAs), essentiality (ESS), subcellular localization of cytoplasmic membrane (SLM), transmembrane helices (TMH) and hydropathicity score (HS) independently and significantly affected the ERs of bacterial proteins. In some species, PPA and ESS demonstrate higher correlations with ER than transcriptional abundance. Conclusions: Different forces drive the evolution of protein sequences in yeast and bacteria. In bacteria, the constraints are involved in avoiding a build-up of toxic molecules caused by misfolding/misinteraction (transcriptional abundance), while retaining important functions (ESS, PPA) and maintaining the cell membrane (SLM, TMH and HS). Each of these independently contributes to the variation in protein evolution.
Affiliation(s)
- Wen Wei
- Center of Bioinformatics and Key Laboratory for NeuroInformation of the Ministry of Education, School of Life Science and Technology, University of Electronic Science and Technology of China, 610054 Chengdu, China
|
227
|
Chen YC, Cheng JH, Tsai ZTY, Tsai HK, Chuang TJ. The impact of trans-regulation on the evolutionary rates of metazoan proteins. Nucleic Acids Res 2013; 41:6371-80. [PMID: 23658220 PMCID: PMC3711421 DOI: 10.1093/nar/gkt349] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 02/07/2013] [Revised: 04/10/2013] [Accepted: 04/14/2013] [Indexed: 11/13/2022] Open
Abstract
Transcription factor (TF) and microRNA (miRNA) are two crucial trans-regulatory factors that coordinately control gene expression. Understanding the impacts of these two factors on the rate of protein sequence evolution is of great importance in evolutionary biology. While many biological factors associated with evolutionary rate variations have been studied, evolutionary analysis of simultaneously accounting for TF and miRNA regulations across metazoans is still uninvestigated. Here, we provide a series of statistical analyses to assess the influences of TF and miRNA regulations on evolutionary rates across metazoans (human, mouse and fruit fly). Our results reveal that the negative correlations between trans-regulation and evolutionary rates hold well across metazoans, but the strength of TF regulation as a rate indicator becomes weak when the other confounding factors that may affect evolutionary rates are controlled. We show that miRNA regulation tends to be a more essential indicator of evolutionary rates than TF regulation, and the combination of TF and miRNA regulations has a significant dependent effect on protein evolutionary rates. We also show that trans-regulation (especially miRNA regulation) is much more important in human/mouse than in fruit fly in determining protein evolutionary rates, suggesting a considerable variation in rate determinants between vertebrates and invertebrates.
Affiliation(s)
- Yi-Ching Chen
- Institute of Information Science, Academia Sinica, Taipei 115, Taiwan, Bioinformatics Program, Taiwan International Graduate Program, Academia Sinica, Taipei 115, Taiwan and Genomic Research Center, Academia Sinica, Taipei 115, Taiwan
- Jen-Hao Cheng
- Institute of Information Science, Academia Sinica, Taipei 115, Taiwan, Bioinformatics Program, Taiwan International Graduate Program, Academia Sinica, Taipei 115, Taiwan and Genomic Research Center, Academia Sinica, Taipei 115, Taiwan
- Zing Tsung-Yeh Tsai
- Institute of Information Science, Academia Sinica, Taipei 115, Taiwan, Bioinformatics Program, Taiwan International Graduate Program, Academia Sinica, Taipei 115, Taiwan and Genomic Research Center, Academia Sinica, Taipei 115, Taiwan
- Huai-Kuang Tsai
- Institute of Information Science, Academia Sinica, Taipei 115, Taiwan, Bioinformatics Program, Taiwan International Graduate Program, Academia Sinica, Taipei 115, Taiwan and Genomic Research Center, Academia Sinica, Taipei 115, Taiwan
- Trees-Juen Chuang
- Institute of Information Science, Academia Sinica, Taipei 115, Taiwan, Bioinformatics Program, Taiwan International Graduate Program, Academia Sinica, Taipei 115, Taiwan and Genomic Research Center, Academia Sinica, Taipei 115, Taiwan
|
228
|
Employing directed evolution for the functional analysis of multi-specific proteins. Bioorg Med Chem 2013; 21:3511-6. [DOI: 10.1016/j.bmc.2013.04.052] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 02/04/2013] [Revised: 04/11/2013] [Accepted: 04/18/2013] [Indexed: 01/17/2023]
|
229
|
Das J, Vo TV, Wei X, Mellor JC, Tong V, Degatano AG, Wang X, Wang L, Cordero NA, Kruer-Zerhusen N, Matsuyama A, Pleiss JA, Lipkin SM, Yoshida M, Roth FP, Yu H. Cross-species protein interactome mapping reveals species-specific wiring of stress response pathways. Sci Signal 2013; 6:ra38. [PMID: 23695164 DOI: 10.1126/scisignal.2003350] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.4] [Indexed: 12/13/2022]
Abstract
The fission yeast Schizosaccharomyces pombe has more metazoan-like features than the budding yeast Saccharomyces cerevisiae, yet it has similarly facile genetics. We present a large-scale verified binary protein-protein interactome network, "StressNet," based on high-throughput yeast two-hybrid screens of interacting proteins classified as part of stress response and signal transduction pathways in S. pombe. We performed systematic, cross-species interactome mapping using StressNet and a protein interactome network of orthologous proteins in S. cerevisiae. With cross-species comparative network studies, we detected a previously unidentified component (Snr1) of the S. pombe mitogen-activated protein kinase Sty1 pathway. Coimmunoprecipitation experiments showed that Snr1 interacted with Sty1 and that deletion of snr1 increased the sensitivity of S. pombe cells to stress. Comparison of StressNet with the interactome network of orthologous proteins in S. cerevisiae showed that most of the interactions among these stress response and signaling proteins are not conserved between species but are "rewired"; orthologous proteins have different binding partners in both species. In particular, transient interactions connecting proteins in different functional modules were more likely to be rewired than conserved. By directly testing interactions between proteins in one yeast species and their corresponding binding partners in the other yeast species with yeast two-hybrid assays, we found that about half of the interactions that are traditionally considered "conserved" form modified interaction interfaces that may potentially accommodate novel functions.
Affiliation(s)
- Jishnu Das
- Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, NY 14853, USA; Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA
- Tommy V Vo
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA; Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY 14853, USA
- Xiaomu Wei
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA; Department of Medicine, Weill Cornell College of Medicine, New York, NY 10021, USA
- Joseph C Mellor
- Donnelly Centre, University of Toronto, Toronto, ON M5S-3E1, Canada
- Virginia Tong
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA
- Andrew G Degatano
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA
- Xiujuan Wang
- Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, NY 14853, USA; Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA
- Lihua Wang
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA
- Nicolas A Cordero
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA
- Nathan Kruer-Zerhusen
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA; Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY 14853, USA
- Akihisa Matsuyama
- Chemical Genetics Laboratory, RIKEN Advanced Science Institute, Wako, Saitama 351-0198, Japan; CREST Research Project, JST, Kawaguchi, Saitama 332-0012, Japan
- Jeffrey A Pleiss
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY 14853, USA
- Steven M Lipkin
- Department of Medicine, Weill Cornell College of Medicine, New York, NY 10021, USA
- Minoru Yoshida
- Chemical Genetics Laboratory, RIKEN Advanced Science Institute, Wako, Saitama 351-0198, Japan; CREST Research Project, JST, Kawaguchi, Saitama 332-0012, Japan; Department of Biotechnology, Graduate School of Agriculture and Life Sciences, University of Tokyo, Bunkyo-ku, Tokyo 113-8657, Japan
- Frederick P Roth
- Donnelly Centre, University of Toronto, Toronto, ON M5S-3E1, Canada; Departments of Molecular Genetics and Computer Science, University of Toronto, Toronto, ON M5S-3E1, Canada; Center for Cancer Systems Biology, Dana-Farber Cancer Institute, Boston, MA 02115; Harvard Medical School, Boston, MA 02115; Samuel Lunenfeld Research Institute, Mt. Sinai Hospital, Toronto, ON M5G-1X5, Canada; Genetic Networks Program, Canadian Institute for Advanced Research, Toronto, ON M5G-1Z8, Canada
- Haiyuan Yu
- Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, NY 14853, USA; Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, USA
|
230
|
Alvarez-Ponce D, Fares MA. Evolutionary rate and duplicability in the Arabidopsis thaliana protein-protein interaction network. Genome Biol Evol 2013; 4:1263-74. [PMID: 23160177 PMCID: PMC3542556 DOI: 10.1093/gbe/evs101] [Citation(s) in RCA: 48] [Impact Index Per Article: 4.0] [Indexed: 12/13/2022] Open
Abstract
Genes show a bewildering variation in their patterns of molecular evolution, as a result of the action of different levels and types of selective forces. The factors underlying this variation are, however, still poorly understood. In the last decade, the position of proteins in the protein-protein interaction network has been put forward as a determinant factor of the evolutionary rate and duplicability of their encoding genes. This conclusion, however, has been based on the analysis of the limited number of microbes and animals for which interactome-level data are available (essentially, Escherichia coli, yeast, worm, fly, and humans). Here, we study, for the first time, the relationship between the position of proteins in the high-density interactome of a plant (Arabidopsis thaliana) and the patterns of molecular evolution of their encoding genes. We found that genes whose encoded products act at the center of the network are more evolutionarily constrained than those acting at the network periphery. This trend remains significant when potential confounding factors (gene expression level and breadth, duplicability, function, and length of the encoded products) are controlled for. Even though the correlation between centrality measures and rates of evolution is generally weak, for some functional categories, it is comparable in strength to (or even stronger than) the correlation between evolutionary rates and expression levels or breadths. In addition, genes encoding interacting proteins in the network evolve at relatively similar rates. Finally, Arabidopsis proteins encoded by duplicated genes are more highly connected than those encoded by singleton genes. This observation is in agreement with the patterns observed in humans, but in contrast with those observed in E. coli, yeast, worm, and fly (whose duplicated genes tend to act at the periphery of the network), implying that the relationship between duplicability and centrality inverted at least twice during eukaryote evolution. Taken together, these results indicate that the structure of the A. thaliana network constrains the evolution of its components at multiple levels.
Affiliation(s)
- David Alvarez-Ponce
- Department of Abiotic Stress, Integrative and Systems Biology Laboratory, Instituto de Biología Molecular y Celular de Plantas, Consejo Superior de Investigaciones Científicas (CSIC-UPV), Valencia, Spain.
|
231
|
Javier Zea D, Miguel Monzon A, Fornasari MS, Marino-Buslje C, Parisi G. Protein Conformational Diversity Correlates with Evolutionary Rate. Mol Biol Evol 2013; 30:1500-3. [DOI: 10.1093/molbev/mst065] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.8] [Indexed: 11/12/2022] Open
|
232
|
A kinetic model of the evolution of a protein interaction network. BMC Genomics 2013; 14:172. [PMID: 23497092 PMCID: PMC3751699 DOI: 10.1186/1471-2164-14-172] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 10/22/2012] [Accepted: 03/08/2013] [Indexed: 11/10/2022] Open
Abstract
Background: Known protein interaction networks have very particular properties. Old proteins tend to have more interactions than new ones. One of the best statistical representatives of this property is the node degree distribution (distribution of proteins having a given number of interactions). It has previously been shown that this distribution is very close to the sum of two distinct exponential components. In this paper, we asked: What are the possible mechanisms of evolution for such types of networks? To answer this question, we tested a kinetic model for simplified evolution of a protein interactome. Our proposed model considers the emergence of new genes and interactions and the loss of old ones. We assumed that there are generally two coexisting classes of proteins. Proteins constituting the first class are essential only for ecological adaptations and are easily lost when ecological conditions change. Proteins of the second class are essential for basic life processes and, hence, are always effectively protected against deletion. All proteins can transit between the above classes in both directions. We also assumed that the phenomenon of gene duplication is always related to ecological adaptation and that a new copy of a duplicated gene is not essential. According to this model, all proteins gain new interactions with a rate that preferentially increases with the number of interactions (the rich get richer). Proteins can also gain interactions because of duplication. Proteins lose their interactions both with and without the loss of partner genes. Results: The proposed model reproduces the main properties of protein-protein interaction networks very well. The connectivity of the oldest part of the interaction network is densest, and the node degree distribution follows the sum of two shifted power-law functions, which is a theoretical generalization of the previous finding. The above distribution covers the wide range of values of node degrees very well, much better than a power law or generalized power law supplemented with an exponential cut-off. The presented model also relates the total number of interactome links to the total number of interacting proteins. The theoretical results were for the interactomes of A. thaliana, B. taurus, C. elegans, D. melanogaster, E. coli, H. pylori, H. sapiens, M. musculus, R. norvegicus and S. cerevisiae. Conclusions: Using these approaches, the kinetic parameters could be estimated. Finally, the model revealed the evolutionary kinetics of proteome formation, the phenomenon of protein differentiation and the process of gaining new interactions.
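The ingredients of the kinetic model described above (preferential gain of interactions, two protein classes with switching, and loss of unprotected proteins) can be illustrated with a toy simulation. This is a loose sketch under simplifying assumptions, not the authors' model: gene duplication and interaction loss without gene loss are omitted, and the function name and the parameters `loss_prob` and `switch_prob` are hypothetical.

```python
import random
from collections import Counter


def simulate_interactome(steps, loss_prob=0.2, switch_prob=0.05, seed=0):
    """Toy growth of an interaction network (illustrative assumptions only).

    Each step:
      1. a new non-essential protein attaches to an existing protein chosen
         with probability proportional to (degree + 1): "rich get richer";
      2. every protein may switch class (essential <-> non-essential);
      3. with probability loss_prob one non-essential protein is deleted
         together with its interactions.
    """
    rng = random.Random(seed)
    neighbors = {0: {1}, 1: {0}}       # adjacency sets, seeded with one edge
    essential = {0: True, 1: True}     # class label for each protein
    next_id = 2
    for _ in range(steps):
        # 1. Preferential attachment of a newly emerged protein.
        nodes = sorted(neighbors)
        weights = [len(neighbors[n]) + 1 for n in nodes]
        target = rng.choices(nodes, weights=weights, k=1)[0]
        neighbors[next_id] = {target}
        neighbors[target].add(next_id)
        essential[next_id] = False
        next_id += 1

        # 2. Transitions between the two classes, in both directions.
        for n in sorted(neighbors):
            if rng.random() < switch_prob:
                essential[n] = not essential[n]

        # 3. Loss of an unprotected (non-essential) protein.
        losable = [n for n in sorted(neighbors) if not essential[n]]
        if losable and rng.random() < loss_prob:
            victim = rng.choice(losable)
            for partner in neighbors.pop(victim):
                neighbors[partner].discard(victim)
            del essential[victim]
    return neighbors, essential


neighbors, essential = simulate_interactome(steps=2000)
degree_counts = Counter(len(partners) for partners in neighbors.values())
print(sorted(degree_counts.items())[:10])   # (degree, number of proteins)
```

Tabulating `degree_counts` gives the node degree distribution that the abstract compares against power-law forms. Fitting such forms is outside the scope of this sketch.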
|
233
|
Bertolazzi P, Bock ME, Guerra C. On the functional and structural characterization of hubs in protein–protein interaction networks. Biotechnol Adv 2013; 31:274-86. [DOI: 10.1016/j.biotechadv.2012.12.002] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.2] [Received: 10/12/2012] [Revised: 11/13/2012] [Accepted: 12/01/2012] [Indexed: 01/07/2023]
|
234
|
Song J, Singh M. From hub proteins to hub modules: the relationship between essentiality and centrality in the yeast interactome at different scales of organization. PLoS Comput Biol 2013; 9:e1002910. [PMID: 23436988 PMCID: PMC3578755 DOI: 10.1371/journal.pcbi.1002910] [Citation(s) in RCA: 42] [Impact Index Per Article: 3.5] [Received: 08/27/2012] [Accepted: 12/21/2012] [Indexed: 11/22/2022] Open
Abstract
Numerous studies have suggested that hub proteins in the S. cerevisiae physical interaction network are more likely to be essential than other proteins. The proposed reasons underlying this observed relationship between topology and functioning have been subject to some controversy, with recent work suggesting that it arises due to the participation of hub proteins in essential complexes and processes. However, do these essential modules themselves have distinct network characteristics, and how do their essential proteins differ in their topological properties from their non-essential proteins? We aimed to advance our understanding of protein essentiality by analyzing proteins, complexes and processes within their broader functional context and by considering physical interactions both within and across complexes and biological processes. In agreement with the view that essentiality is a modular property, we found that the number of intracomplex or intraprocess interactions that a protein has is a better indicator of its essentiality than its overall number of interactions. Moreover, we found that within an essential complex, its essential proteins have on average more interactions, especially intracomplex interactions, than its non-essential proteins. Finally, we built a module-level interaction network and found that essential complexes and processes tend to have higher interaction degrees in this network than non-essential complexes and processes; that is, they exhibit a larger amount of functional cross-talk than their non-essential counterparts.
Affiliation(s)
- Jimin Song
- Department of Computer Science and Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, New Jersey, United States of America
- Mona Singh
- Department of Computer Science and Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, New Jersey, United States of America
|
235
|
Liu Z, Guo F, Zhang J, Wang J, Lu L, Li D, He F. Proteome-wide prediction of self-interacting proteins based on multiple properties. Mol Cell Proteomics 2013; 12:1689-700. [PMID: 23422585 DOI: 10.1074/mcp.m112.021790] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.5] [Indexed: 11/06/2022] Open
Abstract
Self-interacting proteins, whose two or more copies can interact with each other, play important roles in cellular functions and the evolution of protein interaction networks (PINs). Knowing whether a protein can self-interact can contribute to and sometimes is crucial for the elucidation of its functions. Previous related research has mainly focused on the structures and functions of specific self-interacting proteins, whereas knowledge on their overall properties is limited. Meanwhile, the two current most common high throughput protein interaction assays have limited ability to detect self-interactions because of biological artifacts and design limitations, whereas the bioinformatic prediction method of self-interacting proteins is lacking. This study aims to systematically study and predict self-interacting proteins from an overall perspective. We find that compared with other proteins the self-interacting proteins in the structural aspect contain more domains; in the evolutionary aspect they tend to be conserved and ancient; in the functional aspect they are significantly enriched with enzyme genes, housekeeping genes, and drug targets, and in the topological aspect tend to occupy important positions in PINs. Furthermore, based on these features, after feature selection, we use logistic regression to integrate six representative features, including Gene Ontology term, domain, paralogous interactor, enzyme, model organism self-interacting protein, and betweenness centrality in the PIN, to develop a proteome-wide prediction model of self-interacting proteins. Using 5-fold cross-validation and an independent test, this model shows good performance. Finally, the prediction model is developed into a user-friendly web service SLIPPER (SeLf-Interacting Protein PrEdictoR). Users may submit a list of proteins, and then SLIPPER will return the probability_scores measuring their possibility to be self-interacting proteins and various related annotation information. 
This work helps us understand the role self-interacting proteins play in cellular functions from an overall perspective, and the constructed prediction model may contribute to the high throughput finding of self-interacting proteins and provide clues for elucidating their functions.
Affiliation(s)
- Zhongyang Liu
- State Key Laboratory of Proteomics, Beijing Proteome Research Center, Beijing Institute of Radiation Medicine, Beijing 100850, China
|
236
|
|
237
|
Choi SS, Hannenhalli S. Three independent determinants of protein evolutionary rate. J Mol Evol 2013; 76:98-111. [PMID: 23400388 DOI: 10.1007/s00239-013-9543-6] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.7] [Received: 11/09/2012] [Accepted: 01/16/2013] [Indexed: 12/15/2022]
Abstract
One of the most widely accepted ideas related to the evolutionary rates of proteins is that functionally important residues or regions evolve slower than other regions, a reasonable outcome of which should be a slower evolutionary rate of the proteins with a higher density of functionally important sites. Oddly, the role of functional importance, mainly measured by essentiality, in determining evolutionary rate has been challenged in recent studies. Several variables other than protein essentiality, such as expression level, gene compactness, and protein-protein interactions, have been suggested to affect protein evolutionary rate. In the present review, we try to refine the concept of functional importance of a gene, and consider three factors (functional importance, expression level, and gene compactness) as independent determinants of the evolutionary rate of a protein, based not only on their known correlation with evolutionary rate but also on a reasonable mechanistic model. We suggest a framework based on these mechanistic models to correctly interpret the correlations between evolutionary rates and the various variables as well as the interrelationships among the variables.
Affiliation(s)
- Sun Shim Choi
- Department of Medical Biotechnology, College of Biomedical Science, and Institute of Bioscience & Biotechnology, Kangwon National University, Chuncheon, South Korea.
|
238
|
Wang H, Zheng H. Correlation of genomic features with dynamic modularity in the yeast interactome: a view from the structural perspective. IEEE Trans Nanobioscience 2013; 11:244-50. [PMID: 22987130 DOI: 10.1109/tnb.2012.2212720] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Indexed: 11/07/2022]
Abstract
The idea of the existence of date and party hubs in protein-protein interaction networks has been debated since it was proposed in 2004. Based on the incorporation of the information extracted from known three-dimensional structures of protein interactions, we revisited the properties associated with date and party hubs previously identified. The correlations of genomic essentiality, gene coexpression, and functional semantic similarity with date and party hubs were examined. The number of interaction interfaces associated with each hub was taken into account. The results suggested that the identification of date and party hubs based on their network connectivity and expression profiles with interaction partners may be incomplete. The number of interaction interfaces could play an important role in examining functional and topological properties associated with each hub protein. The observation is robust to the choice of degree cutoffs for hubs. Furthermore, we found that while singlish-interface hubs seem to correspond mostly to date hubs, there appears to be no significant difference between the proportions of multi-interface proteins categorized as date and as party hubs.
Affiliation(s)
- Haiying Wang
- School of Computing and Mathematics, University of Ulster, Jordanstown, BT37 0QB, Northern Ireland, UK.
|
239
|
Differential requirements for mRNA folding partially explain why highly expressed proteins evolve slowly. Proc Natl Acad Sci U S A 2013; 110:E678-86. [PMID: 23382244 DOI: 10.1073/pnas.1218066110] [Citation(s) in RCA: 92] [Impact Index Per Article: 7.7] [Indexed: 11/18/2022] Open
Abstract
The cause of the tremendous among-protein variation in the rate of sequence evolution is a central subject of molecular evolution. Expression level has been identified as a leading determinant of this variation among genes encoded in the same genome, but the underlying mechanisms are not fully understood. We here propose and demonstrate that a requirement for stronger folding of more abundant mRNAs results in slower evolution of more highly expressed genes and proteins. Specifically, we show that: (i) the higher the expression level of a gene, the greater the selective pressure for its mRNA to fold; (ii) random mutations are more likely to decrease mRNA folding when occurring in highly expressed genes than in lowly expressed genes; and (iii) amino acid substitution rate is negatively correlated with mRNA folding strength, with or without the control of expression level. Furthermore, synonymous (d(S)) and nonsynonymous (d(N)) nucleotide substitution rates are both negatively correlated with mRNA folding strength. However, counterintuitively, d(S) and d(N) are differentially constrained by selection for mRNA folding, resulting in a significant correlation between mRNA folding strength and d(N)/d(S), even when gene expression level is controlled. The direction and magnitude of this correlation is determined primarily by the G+C frequency at third codon positions. Together, these findings explain why highly expressed genes evolve slowly, demonstrate a major role of natural selection at the mRNA level in constraining protein evolution, and reveal a previously unrecognized and unexpected form of nonprotein-level selection that impacts d(N)/d(S).
|
240
|
Willadsen K, Cao MD, Wiles J, Balasubramanian S, Bodén M. Repeat-encoded poly-Q tracts show statistical commonalities across species. BMC Genomics 2013; 14:76. [PMID: 23374135 PMCID: PMC3617014 DOI: 10.1186/1471-2164-14-76] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 09/07/2012] [Accepted: 01/18/2013] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: Among repetitive genomic sequences, the class of tri-nucleotide repeats has received much attention due to its association with human diseases. Tri-nucleotide repeat diseases are caused by excessive sequence length variability; diseases such as Huntington's disease and Fragile X syndrome are tied to an increase in the number of repeat units in a tract. Motivated by the recent discovery of a tri-nucleotide repeat associated genetic defect in Arabidopsis thaliana, this study takes a cross-species approach to investigating these repeat tracts, with the goal of using commonalities between species to identify potential disease-related properties. RESULTS: We find that statistical enrichment in regulatory function associations for coding region repeats, previously observed in human, is consistent across multiple organisms. By distinguishing between homo-amino acid tracts that are encoded by tri-nucleotide repeats and those encoded by varying codons, we show that amino acid repeats, not tri-nucleotide repeats, fully explain these regulatory associations. Using this same separation between repeat- and non-repeat-encoded homo-amino acid tracts, we show that poly-glutamine tracts are disproportionately encoded by tri-nucleotide repeats, and those tracts that are encoded by tri-nucleotide repeats are also significantly longer; these results are consistent across multiple species. CONCLUSION: These findings establish similarities in tri-nucleotide repeats across species at the level of protein functionality and protein sequence. The tendency of tri-nucleotide repeats to encode longer poly-glutamine tracts indicates a link with the poly-glutamine repeat diseases. The cross-species nature of this tendency suggests that unknown repeat diseases are yet to be uncovered in other species. Future discoveries of new non-human repeat associated defects may provide the breadth of information needed to unravel the mechanisms that underpin this class of human disease.
Affiliation(s)
- Kai Willadsen
- School of Chemistry and Molecular Biosciences, The University of Queensland, Brisbane QLD 4072, Australia
|
241
|
Sheppard SK, Didelot X, Jolley KA, Darling AE, Pascoe B, Meric G, Kelly DJ, Cody A, Colles FM, Strachan NJC, Ogden ID, Forbes K, French NP, Carter P, Miller WG, McCarthy ND, Owen R, Litrup E, Egholm M, Affourtit JP, Bentley SD, Parkhill J, Maiden MCJ, Falush D. Progressive genome-wide introgression in agricultural Campylobacter coli. Mol Ecol 2013; 22:1051-64. [PMID: 23279096 PMCID: PMC3749442 DOI: 10.1111/mec.12162] [Citation(s) in RCA: 113] [Impact Index Per Article: 9.4] [Received: 08/01/2012] [Revised: 10/16/2012] [Accepted: 10/21/2012] [Indexed: 01/24/2023]
Abstract
Hybridization between distantly related organisms can facilitate rapid adaptation to novel environments, but is potentially constrained by epistatic fitness interactions among cell components. The zoonotic pathogens Campylobacter coli and C. jejuni differ from each other by around 15% at the nucleotide level, corresponding to an average of nearly 40 amino acids per protein-coding gene. Using whole genome sequencing, we show that a single C. coli lineage, which has successfully colonized an agricultural niche, has been progressively accumulating C. jejuni DNA. Members of this lineage belong to two groups, the ST-828 and ST-1150 clonal complexes. The ST-1150 complex is less frequently isolated and has undergone a substantially greater amount of introgression leading to replacement of up to 23% of the C. coli core genome as well as import of novel DNA. By contrast, the more commonly isolated ST-828 complex bacteria have 10-11% introgressed DNA, and C. jejuni and nonagricultural C. coli lineages each have <2%. Thus, the C. coli that colonize agriculture, and consequently cause most human disease, have hybrid origin, but this cross-species exchange has so far not had a substantial impact on the gene pools of either C. jejuni or nonagricultural C. coli. These findings also indicate remarkable interchangeability of basic cellular machinery after a prolonged period of independent evolution.
Affiliation(s)
- Samuel K Sheppard
- Department of Zoology, The Tinbergen Building, University of Oxford, South Parks Road, Oxford, OX1 3PS, UK.
|
242
|
Wang J, Peng W, Wu FX. Computational approaches to predicting essential proteins: A survey. Proteomics Clin Appl 2013; 7:181-92. [DOI: 10.1002/prca.201200068] [Citation(s) in RCA: 52] [Impact Index Per Article: 4.3] [Received: 07/27/2012] [Revised: 09/12/2012] [Accepted: 11/06/2012] [Indexed: 12/13/2022]
Affiliation(s)
- Jianxin Wang
- School of Information Science and Engineering; Central South University; Changsha; China
- Wei Peng
- School of Information Science and Engineering; Central South University; Changsha; China
- Fang-Xiang Wu
- Department of Mechanical Engineering and Division of Biomedical Engineering; University of Saskatchewan; Saskatoon; SK; Canada
|
243
|
Wang M, Wang Q, Wang Z, Wang Q, Zhang X, Pan Y. The Molecular Evolutionary Patterns of the Insulin/FOXO Signaling Pathway. Evol Bioinform Online 2013; 9:1-16. [PMID: 23362368 PMCID: PMC3547545 DOI: 10.4137/ebo.s10539] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Indexed: 01/25/2023] Open
Abstract
The insulin/insulin-like growth factor-1 (IGF1)/FOXO (IIF) signal transduction pathway plays a core role in the endocrine system. Although the components of this pathway have been well characterized, its evolutionary pattern remains poorly understood. Here, we perform a comprehensive analysis to examine whether differences in signal transduction elements exist, whether the genes are subject to equivalent evolutionary forces, and how natural selection shapes the evolutionary pattern of proteins in an interacting system. Our results demonstrate that most IIF pathway components are present throughout all animal phyla investigated here, and that they are under strong selective constraint. Remarkably, we detect that the components in the middle of the pathway undergo stronger purifying selection, which differs from previous similar reports. We also find that dN/dS may be influenced by quite complicated factors, including codon bias and protein length, among others.
Affiliation(s)
- Minghui Wang
- School of Agriculture and Biology, Department of Animal Sciences, Shanghai Jiao Tong University, Shanghai, PR China. ; Shanghai Key Laboratory of Veterinary Biotechnology, Shanghai, China
|
244
|
Residue mutations and their impact on protein structure and function: detecting beneficial and pathogenic changes. Biochem J 2013; 449:581-94. [DOI: 10.1042/bj20121221] [Citation(s) in RCA: 131] [Impact Index Per Article: 10.9] [Indexed: 12/29/2022]
Abstract
The present review focuses on the evolution of proteins and the impact of amino acid mutations on function from a structural perspective. Proteins evolve under the law of natural selection and undergo alternating periods of conservative evolution and of relatively rapid change. The likelihood of mutations being fixed in the genome depends on various factors, such as the fitness of the phenotype or the position of the residues in the three-dimensional structure. For example, co-evolution of residues located close together in three-dimensional space can occur to preserve global stability. Whereas point mutations can fine-tune the protein function, residue insertions and deletions (‘decorations’ at the structural level) can sometimes modify functional sites and protein interactions more dramatically. We discuss recent developments and tools to identify such episodic mutations, and examine their applications in medical research. Such tools have been tested on simulated data and applied to real data such as viruses or animal sequences. Traditionally, there has been little if any cross-talk between the fields of protein biophysics, protein structure–function and molecular evolution. However, the last several years have seen some exciting developments in combining these approaches to obtain an in-depth understanding of how proteins evolve. For example, a better understanding of how structural constraints affect protein evolution will greatly help us to optimize our models of sequence evolution. The present review explores this new synthesis of perspectives.
|
245
|
Pérez-Bercoff Å, Hudson CM, Conant GC. A conserved mammalian protein interaction network. PLoS One 2013; 8:e52581. [PMID: 23320073 PMCID: PMC3539715 DOI: 10.1371/journal.pone.0052581] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Received: 06/13/2012] [Accepted: 11/20/2012] [Indexed: 11/19/2022] Open
Abstract
Physical interactions between proteins mediate a variety of biological functions, including signal transduction, physical structuring of the cell and regulation. While extensive catalogs of such interactions are known from model organisms, their evolutionary histories are difficult to study given the lack of interaction data from phylogenetic outgroups. Using phylogenomic approaches, we infer an upper bound on the time of origin for a large set of human protein-protein interactions, showing that most such interactions appear relatively ancient, dating no later than the radiation of placental mammals. By analyzing paired alignments of orthologous and putatively interacting protein-coding genes from eight mammals, we find evidence for weak but significant co-evolution, as measured by relative selective constraint, between pairs of genes with interacting proteins. However, we find no strong evidence for shared instances of directional selection within an interacting pair. Finally, we use a network approach to show that the distribution of selective constraint across the protein interaction network is non-random, with a clear tendency for interacting proteins to share similar selective constraints. Collectively, the results suggest that, on the whole, protein interactions in mammals are under selective constraint, presumably due to their functional roles.
Affiliation(s)
- Åsa Pérez-Bercoff
- Smurfit Institute of Genetics, University of Dublin, Trinity College, Dublin, Ireland
- Corey M. Hudson
- Informatics Institute, University of Missouri, Columbia, Missouri, United States of America
- Gavin C. Conant
- Informatics Institute, University of Missouri, Columbia, Missouri, United States of America
- Division of Animal Sciences, University of Missouri, Columbia, Missouri, United States of America
|
246
|
Hsin Liu C, Li KC, Yuan S. Human protein-protein interaction prediction by a novel sequence-based co-evolution method: co-evolutionary divergence. Bioinformatics 2013; 29:92-98. [PMID: 23080115 DOI: 10.1093/bioinformatics/bts620] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.1] [Indexed: 01/03/2025] Open
Abstract
MOTIVATION: Protein-protein interaction (PPI) plays an important role in understanding gene functions, and many computational PPI prediction methods have been proposed in recent years. Despite these extensive efforts, PPI prediction still has much room for improvement. Sequence-based co-evolution methods include the substitution rate method and the mirror tree method, which compare sequence substitution rates and topological similarity of phylogenetic trees, respectively. Although they have been used to predict PPI in species with small genomes such as Escherichia coli, such methods have not been tested on large-scale proteomes such as that of Homo sapiens. RESULT: In this study, we propose a novel sequence-based co-evolution method, co-evolutionary divergence (CD), for human PPI prediction. Built on the basic assumption that protein pairs with similar substitution rates are likely to interact with each other, the CD method converts the evolutionary information from 14 species of vertebrates into likelihood ratios and combines them to infer PPI. We showed that the CD method outperformed the mirror tree method in three independent human PPI datasets by a large margin. With the arrival of genome information from more species generated by next generation sequencing, the performance of the CD method can be further improved. AVAILABILITY: Source code and support are available at http://mib.stat.sinica.edu.tw/LAP/tmp/CD.rar.
Affiliation(s)
- Chia Hsin Liu
- Institute of Statistical Science, Academia Sinica, Nangang, Taipei 115, Taiwan
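The mirror tree idea named in the abstract above (two protein families whose evolutionary distances across the same species rise and fall together are candidate interaction partners) can be sketched as a correlation between two pairwise distance tables. This is only an illustration of that general idea, not the CD method and not the authors' code; the function names, the species list and the toy distances are all assumptions made up for the example.

```python
from itertools import combinations
from math import sqrt


def pair_distances(dist, species):
    """Flatten a pairwise distance table into a list, one value per species pair."""
    return [dist[frozenset(pair)] for pair in combinations(species, 2)]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant distances")
    return cov / sqrt(var_x * var_y)


def mirror_tree_score(dist_a, dist_b, species):
    """Correlate the inter-species distances of protein family A with those of B."""
    return pearson(pair_distances(dist_a, species), pair_distances(dist_b, species))


# Hypothetical toy data: distances are invented, not taken from any study.
species = ["human", "mouse", "chicken", "zebrafish"]
pairs = list(combinations(species, 2))
family_a = {frozenset(p): d for p, d in zip(pairs, [0.1, 0.3, 0.5, 0.3, 0.5, 0.4])}
# Family B evolves twice as fast but in proportion, so the score is 1 up to rounding.
family_b = {key: 2 * value for key, value in family_a.items()}

print(mirror_tree_score(family_a, family_b, species))
```

A high score alone is weak evidence, because all proteins share the underlying species tree; published variants of the approach typically correct for that background signal.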
|
247
|
Wang S, Wei W, Zheng Y, Hou J, Dou Y, Zhang S, Luo X, Cai X. The role of insulin C-peptide in the coevolution analyses of the insulin signaling pathway: a hint for its functions. PLoS One 2012; 7:e52847. [PMID: 23300796 PMCID: PMC3531361 DOI: 10.1371/journal.pone.0052847] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.7] [Received: 07/07/2012] [Accepted: 11/21/2012] [Indexed: 12/16/2022] Open
Abstract
As the linker between the A chain and B chain of proinsulin, C-peptide displays high variability in length and amino acid composition, and has been considered an inert byproduct of insulin synthesis and processing for many years. Recent studies have suggested that C-peptide can act as a bioactive hormone, exerting various biological effects on the pathophysiology and treatment of diabetes. In this study, we analyzed the coevolution of insulin molecules among vertebrates, aiming to explore the evolutionary characteristics of the insulin molecule, especially the C-peptide. We also calculated the correlations of evolutionary rates between the insulin and the insulin receptor (IR) sequences, as well as between the domain-domain pairs of the ligand and receptor, by the mirrortree method. The results revealed distinctive features of C-peptide in insulin intramolecular coevolution and correlated residue substitutions, which partly supported the idea that C-peptide can act as a bioactive hormone, with significant sequence features, as well as a linker assisting the formation of mature insulin during synthesis. Interestingly, the evolution of C-peptide showed the highest correlation with that of the insulin receptor and its ligand binding domain (LBD), implying a potential relationship with the insulin signaling pathway.
Affiliation(s)
- Shuai Wang
- State Key Laboratory of Veterinary Etiological Biology, Key Laboratory of Zoonoses of CAAS, Key Laboratory of Veterinary Parasitology of Gansu Province, Lanzhou Veterinary Research Institute, Chinese Academy of Agricultural Sciences, Lanzhou, Gansu, China
- Wei Wei
- State Key Laboratory of Veterinary Etiological Biology, Key Laboratory of Zoonoses of CAAS, Key Laboratory of Veterinary Parasitology of Gansu Province, Lanzhou Veterinary Research Institute, Chinese Academy of Agricultural Sciences, Lanzhou, Gansu, China
- Yadong Zheng
- State Key Laboratory of Veterinary Etiological Biology, Key Laboratory of Zoonoses of CAAS, Key Laboratory of Veterinary Parasitology of Gansu Province, Lanzhou Veterinary Research Institute, Chinese Academy of Agricultural Sciences, Lanzhou, Gansu, China
- Junling Hou
- State Key Laboratory of Veterinary Etiological Biology, Key Laboratory of Zoonoses of CAAS, Key Laboratory of Veterinary Parasitology of Gansu Province, Lanzhou Veterinary Research Institute, Chinese Academy of Agricultural Sciences, Lanzhou, Gansu, China
- Yongxi Dou
- State Key Laboratory of Veterinary Etiological Biology, Key Laboratory of Zoonoses of CAAS, Key Laboratory of Veterinary Parasitology of Gansu Province, Lanzhou Veterinary Research Institute, Chinese Academy of Agricultural Sciences, Lanzhou, Gansu, China
- Shaohua Zhang
- State Key Laboratory of Veterinary Etiological Biology, Key Laboratory of Zoonoses of CAAS, Key Laboratory of Veterinary Parasitology of Gansu Province, Lanzhou Veterinary Research Institute, Chinese Academy of Agricultural Sciences, Lanzhou, Gansu, China
- Xuenong Luo
- State Key Laboratory of Veterinary Etiological Biology, Key Laboratory of Zoonoses of CAAS, Key Laboratory of Veterinary Parasitology of Gansu Province, Lanzhou Veterinary Research Institute, Chinese Academy of Agricultural Sciences, Lanzhou, Gansu, China
- * E-mail: (XL); (XC)
- Xuepeng Cai
- State Key Laboratory of Veterinary Etiological Biology, Key Laboratory of Zoonoses of CAAS, Key Laboratory of Veterinary Parasitology of Gansu Province, Lanzhou Veterinary Research Institute, Chinese Academy of Agricultural Sciences, Lanzhou, Gansu, China
- * E-mail: (XL); (XC)
|
248
|
Leducq JB, Charron G, Diss G, Gagnon-Arsenault I, Dubé AK, Landry CR. Evidence for the robustness of protein complexes to inter-species hybridization. PLoS Genet 2012; 8:e1003161. [PMID: 23300466 PMCID: PMC3531474 DOI: 10.1371/journal.pgen.1003161] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.9] [Received: 07/07/2012] [Accepted: 10/26/2012] [Indexed: 01/11/2023] Open
Abstract
Despite the tremendous efforts devoted to the identification of genetic incompatibilities underlying hybrid sterility and inviability, little is known about the effect of inter-species hybridization at the protein interactome level. Here, we develop a screening platform for the comparison of protein-protein interactions (PPIs) among closely related species and their hybrids. We examine in vivo the architecture of protein complexes in two yeast species (Saccharomyces cerevisiae and Saccharomyces kudriavzevii) that diverged 5-20 million years ago and in their F1 hybrids. We focus on 24 proteins of two large complexes: the RNA polymerase II and the nuclear pore complex (NPC), which show contrasting patterns of molecular evolution. We found that, with the exception of one PPI in the NPC sub-complex, PPIs were highly conserved between species, regardless of protein divergence. Unexpectedly, we found that the architecture of the complexes in F1 hybrids could not be distinguished from that of the parental species. Our results suggest that the conservation of PPIs in hybrids likely results from the slow evolution taking place on the very few protein residues involved in the interaction or that protein complexes are inherently robust and may accommodate protein divergence up to the level that is observed among closely related species.
Affiliation(s)
- Jean-Baptiste Leducq
- Institut de Biologie Intégrative et des Systèmes, Département de Biologie, PROTEO, Pavillon Charles-Eugène-Marchand, Université Laval, Québec City, Canada
|
249
|
Bogumil D, Dagan T. Cumulative impact of chaperone-mediated folding on genome evolution. Biochemistry 2012; 51:9941-53. [PMID: 23167595 DOI: 10.1021/bi3013643] [Citation(s) in RCA: 41] [Impact Index Per Article: 3.2] [Indexed: 11/30/2022]
Abstract
Molecular chaperones support protein folding and unfolding along with assembly and translocation of protein complexes. Chaperones have been recognized as important mediators between an organismal genotype and phenotype as well as important maintainers of cellular fitness under environmental conditions that induce high mutational loads. Here we review recent studies revealing that the folding assistance supplied by chaperones is evident in genomic sequences, implicating chaperone-mediated folding as an influential factor during protein evolution. Interaction of proteins with chaperones ensures proper folding and function, yet an adaptation to obligatory dependence on such assistance may be irreversible, representing an evolutionary trap. A correlation between the requirement for a chaperone and protein expression level indicates that the evolution of substrate-chaperone interaction is bounded by the required substrate abundance within the cell. Accumulating evidence suggests that the utility of chaperones is governed by a delicate balance between their help in mitigating the risks of protein misfolding and aggregate formation on one hand and the slower rate of protein maturation and the energetic cost of chaperone synthesis on the other.
Affiliation(s)
- David Bogumil
- Institute for Genomic Microbiology, Heinrich-Heine University of Düsseldorf, Düsseldorf, Germany
|
250
|
Rezende AM, Folador EL, Resende DDM, Ruiz JC. Computational prediction of protein-protein interactions in Leishmania predicted proteomes. PLoS One 2012; 7:e51304. [PMID: 23251492 PMCID: PMC3519578 DOI: 10.1371/journal.pone.0051304] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.2] [Received: 06/09/2012] [Accepted: 10/31/2012] [Indexed: 11/18/2022] Open
Abstract
The trypanosomatid parasites Leishmania braziliensis, Leishmania major and Leishmania infantum are important human pathogens. Despite years of study and the availability of genome sequences, no effective vaccine has yet been developed, and chemotherapy is highly toxic. It is therefore clear that only interdisciplinary, integrated studies will succeed in finding new targets for the development of vaccines and drugs. An essential part of this rationale is the study of protein-protein interaction (PPI) networks, which can provide a better understanding of complex protein interactions in biological systems. We thus modeled PPIs for trypanosomatids through computational methods, using sequence comparison against public databases of protein or domain interactions for interaction prediction (interolog mapping), and developed a dedicated combined scoring system to assess the robustness of the predictions. The confidence of the network prediction approach was evaluated using gold-standard positive and negative datasets, and the AUC value obtained was 0.94. As a result, 39,420, 43,531 and 45,235 interactions were predicted for L. braziliensis, L. major and L. infantum, respectively. For each predicted network, the top 20 proteins were ranked by the MCC topological index. In addition, information on immunological potential, degree of protein sequence conservation among orthologs and degree of identity to proteins of potential parasite hosts was integrated. This integration improves the understanding and usefulness of the predicted networks, which can be valuable for selecting new potential biological targets for drug and vaccine development. Network modularity, which is key when one is interested in destabilizing PPIs for drug or vaccine purposes, was analyzed along with multiple alignments of the predicted PPIs, revealing patterns associated with protein turnover. In addition, around 50% of the hypothetical proteins present in the networks received some degree of functional annotation, which represents an important contribution since approximately 60% of the Leishmania predicted proteomes have no predicted function.
Affiliation(s)
- Antonio M. Rezende
- Laboratório de Parasitologia Celular e Molecular, Centro de Pesquisa René Rachou – FIOCRUZ, Belo Horizonte, Minas Gerais, Brazil
- Departamento de Bioquímica e Imunologia, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais, Belo Horizonte, Minas Gerais, Brazil
- * E-mail: (AMR); (JCR)
- Edson L. Folador
- Laboratório de Parasitologia Celular e Molecular, Centro de Pesquisa René Rachou – FIOCRUZ, Belo Horizonte, Minas Gerais, Brazil
- Instituto Oswaldo Cruz, Fundação Oswaldo Cruz, Rio de Janeiro, Rio de Janeiro, Brazil
- Daniela de M. Resende
- Laboratório de Parasitologia Celular e Molecular, Centro de Pesquisa René Rachou – FIOCRUZ, Belo Horizonte, Minas Gerais, Brazil
- Laboratório de Pesquisas Clínicas, Universidade Federal de Ouro Preto, Ouro Preto, Minas Gerais, Brazil
- Jeronimo C. Ruiz
- Laboratório de Parasitologia Celular e Molecular, Centro de Pesquisa René Rachou – FIOCRUZ, Belo Horizonte, Minas Gerais, Brazil
- * E-mail: (AMR); (JCR)
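The AUC quoted in the abstract above measures how well a confidence score separates gold-standard positive pairs from negative pairs. As a minimal sketch of that metric only (it assumes nothing about the authors' scoring system, and the function name and example scores are invented), the AUC equals the fraction of positive/negative pairings in which the positive example receives the higher score, with ties counted as one half:

```python
def auc(positive_scores, negative_scores):
    """Area under the ROC curve via exhaustive positive/negative comparisons."""
    if not positive_scores or not negative_scores:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for pos in positive_scores:
        for neg in negative_scores:
            if pos > neg:
                wins += 1.0   # positive ranked above negative
            elif pos == neg:
                wins += 0.5   # tie counts as half a win
    return wins / (len(positive_scores) * len(negative_scores))


# Hypothetical scores for known interacting and non-interacting protein pairs.
gold_positive = [0.9, 0.8, 0.4]
gold_negative = [0.5, 0.3, 0.1]
print(auc(gold_positive, gold_negative))  # 8 of 9 comparisons favour the positives
```

The double loop is quadratic in the number of examples; for large gold-standard sets a rank-based formulation gives the same value more efficiently.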
|