Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Pazos F, Rausell A, Valencia A. Phylogeny-independent detection of functional residues. Bioinformatics 2006;22:1440-8. [PMID: 16551661 DOI: 10.1093/bioinformatics/btl104] [Citation(s) in RCA: 49] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022] Open

For:	Pazos F, Rausell A, Valencia A. Phylogeny-independent detection of functional residues. Bioinformatics 2006;22:1440-8. [PMID: 16551661 DOI: 10.1093/bioinformatics/btl104] [Citation(s) in RCA: 49] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022] Open

Number

Cited by Other Article(s)

Carbone A, Decelle A, Rosset L, Seoane B. Fast and Functional Structured Data Generators Rooted in Out-of-Equilibrium Physics. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2025;47:1309-1316. [PMID: 39527442 DOI: 10.1109/tpami.2024.3495999] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/16/2024]

Hernández Berthet AS, Aptekmann AA, Tejero J, Sánchez IE, Noguera ME, Roman EA. Associating protein sequence positions with the modulation of quantitative phenotypes. Arch Biochem Biophys 2024;755:109979. [PMID: 38583654 DOI: 10.1016/j.abb.2024.109979] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/13/2023] [Revised: 03/11/2024] [Accepted: 03/27/2024] [Indexed: 04/09/2024]

Abstract

Although protein sequences encode the information for folding and function, understanding their link is not an easy task. Unluckily, the prediction of how specific amino acids contribute to these features is still considerably impaired. Here, we developed a simple algorithm that finds positions in a protein sequence with potential to modulate the studied quantitative phenotypes. From a few hundred protein sequences, we perform multiple sequence alignments, obtain the per-position pairwise differences for both the sequence and the observed phenotypes, and calculate the correlation between these last two quantities. We tested our methodology with four cases: archaeal Adenylate Kinases and the organisms optimal growth temperatures, microbial rhodopsins and their maximal absorption wavelengths, mammalian myoglobins and their muscular concentration, and inhibition of HIV protease clinical isolates by two different molecules. We found from 3 to 10 positions tightly associated with those phenotypes, depending on the studied case. We showed that these correlations appear using individual positions but an improvement is achieved when the most correlated positions are jointly analyzed. Noteworthy, we performed phenotype predictions using a simple linear model that links per-position divergences and differences in the observed phenotypes. Predictions are comparable to the state-of-art methodologies which, in most of the cases, are far more complex. All of the calculations are obtained at a very low information cost since the only input needed is a multiple sequence alignment of protein sequences with their associated quantitative phenotypes. The diversity of the explored systems makes our work a valuable tool to find sequence determinants of biological activity modulation and to predict various functional features for uncharacterized members of a protein family.

Collapse

Affiliation(s)

Ayelén S Hernández Berthet Universidad de Buenos Aires, Facultad de Ciencias Exactas y Naturales, Intendente Güiraldes 2160 - Ciudad Universitaria, 1428EGA, C.A.B.A., Argentina.
Ariel A Aptekmann Universidad de Buenos Aires, Consejo Nacional de Investigaciones Científicas y Técnicas. Instituto de Química Biológica de la Facultad de Ciencias Exactas y Naturales (IQUIBICEN), Facultad de Ciencias Exactas y Naturales, Laboratorio de Fisiología de Proteínas, Buenos Aires, Argentina; Department of Biochemistry and Microbiology, Rutgers University, New Brunswick, NJ, 08873, USA; Institute of Marine and Coastal Sciences, Rutgers University, New Brunswick, NJ, 08901, USA.
Jesús Tejero Heart, Lung, Blood and Vascular Medicine Institute, University of Pittsburgh, Pittsburgh, PA, 15261, USA; Division of Pulmonary, Allergy and Critical Care Medicine, University of Pittsburgh, Pittsburgh, PA, 15261, USA; Department of Bioengineering, Swanson School of Engineering, University of Pittsburgh, Pittsburgh, PA, 15260, USA; Department of Pharmacology and Chemical Biology, University of Pittsburgh, Pittsburgh, PA, 15261, USA.
Ignacio E Sánchez Universidad de Buenos Aires, Consejo Nacional de Investigaciones Científicas y Técnicas. Instituto de Química Biológica de la Facultad de Ciencias Exactas y Naturales (IQUIBICEN), Facultad de Ciencias Exactas y Naturales, Laboratorio de Fisiología de Proteínas, Buenos Aires, Argentina.
Martín E Noguera Consejo Nacional de Investigaciones Científicas y Técnicas, Instituto de Química y Fisicoquímica Biológicas Dr. Alejandro Paladini, Junín 956, 1113AAD, C.A.B.A., Argentina; Departamento de Ciencia y Tecnología, Universidad Nacional de Quilmes, Roque Saenz Peña 352, B1876BXD, Bernal, Argentina.
Ernesto A Roman Universidad de Buenos Aires, Facultad de Ciencias Exactas y Naturales, Intendente Güiraldes 2160 - Ciudad Universitaria, 1428EGA, C.A.B.A., Argentina; Consejo Nacional de Investigaciones Científicas y Técnicas, Instituto de Química y Fisicoquímica Biológicas Dr. Alejandro Paladini, Junín 956, 1113AAD, C.A.B.A., Argentina.

Collapse

Pazos F. Computational prediction of protein functional sites-Applications in biotechnology and biomedicine. ADVANCES IN PROTEIN CHEMISTRY AND STRUCTURAL BIOLOGY 2022;130:39-57. [PMID: 35534114 DOI: 10.1016/bs.apcsb.2021.12.001] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/14/2023]

Pazos F. Prediction of Protein Sites and Physicochemical Properties Related to Functional Specificity. Bioengineering (Basel) 2021;8:bioengineering8120201. [PMID: 34940354 PMCID: PMC8698372 DOI: 10.3390/bioengineering8120201] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/26/2021] [Revised: 11/25/2021] [Accepted: 11/29/2021] [Indexed: 11/16/2022] Open

Karakulak T, Rifaioglu AS, Rodrigues JPGLM, Karaca E. Predicting the Specificity- Determining Positions of Receptor Tyrosine Kinase Axl. Front Mol Biosci 2021;8:658906. [PMID: 34195226 PMCID: PMC8236827 DOI: 10.3389/fmolb.2021.658906] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/26/2021] [Accepted: 04/20/2021] [Indexed: 11/22/2022] Open

Pitarch B, Ranea JAG, Pazos F. Protein residues determining interaction specificity in paralogous families. Bioinformatics 2021;37:1076-1082. [PMID: 33135068 DOI: 10.1093/bioinformatics/btaa934] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/12/2020] [Revised: 10/06/2020] [Accepted: 10/22/2020] [Indexed: 02/06/2023] Open

Buhrman G, Enríquez P, Dillard L, Baer H, Truong V, Grunden AM, Rose RB. Structure, Function, and Thermal Adaptation of the Biotin Carboxylase Domain Dimer from Hydrogenobacter thermophilus 2-Oxoglutarate Carboxylase. Biochemistry 2021;60:324-345. [PMID: 33464881 DOI: 10.1021/acs.biochem.0c00815] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/30/2022]

Abstract

2-Oxoglutarate carboxylase (OGC), a unique member of the biotin-dependent carboxylase family from the order Aquificales, captures dissolved CO₂ via the reductive tricarboxylic acid (rTCA) cycle. Structure and function studies of OGC may facilitate adaptation of the rTCA cycle to increase the level of carbon fixation for biofuel production. Here we compare the biotin carboxylase (BC) domain of Hydrogenobacter thermophilus OGC with the well-studied mesophilic homologues to identify features that may contribute to thermal stability and activity. We report three OGC BC X-ray structures, each bound to bicarbonate, ADP, or ADP-Mg²⁺, and propose that substrate binding at high temperatures is facilitated by interactions that stabilize the flexible subdomain B in a partially closed conformation. Kinetic measurements with varying ATP and biotin concentrations distinguish two temperature-dependent steps, consistent with biotin's rate-limiting role in organizing the active site. Transition state thermodynamic values derived from the Eyring equation indicate a larger positive ΔH^⧧ and a less negative ΔS^⧧ compared to those of a previously reported mesophilic homologue. These thermodynamic values are explained by partially rate limiting product release. Phylogenetic analysis of BC domains suggests that OGC diverged prior to Aquificales evolution. The phylogenetic tree identifies mis-annotations of the Aquificales BC sequences, including the Aquifex aeolicus pyruvate carboxylase structure. Notably, our structural data reveal that the OGC BC dimer comprises a "wet" dimerization interface that is dominated by hydrophilic interactions and structural water molecules common to all BC domains and likely facilitates the conformational changes associated with the catalytic cycle. Mutations in the dimerization domain demonstrate that dimerization contributes to thermal stability.

Collapse

Garrido-Martín D, Pazos F. Effect of the sequence data deluge on the performance of methods for detecting protein functional residues. BMC Bioinformatics 2018;19:67. [PMID: 29482506 PMCID: PMC5827975 DOI: 10.1186/s12859-018-2084-7] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/07/2017] [Accepted: 02/21/2018] [Indexed: 11/10/2022] Open

Kalaivani R, Reema R, Srinivasan N. Recognition of sites of functional specialisation in all known eukaryotic protein kinase families. PLoS Comput Biol 2018;14:e1005975. [PMID: 29438395 PMCID: PMC5826538 DOI: 10.1371/journal.pcbi.1005975] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/05/2017] [Revised: 02/26/2018] [Accepted: 01/13/2018] [Indexed: 11/25/2022] Open

Indrischek H, Prohaska SJ, Gurevich VV, Gurevich EV, Stadler PF. Uncovering missing pieces: duplication and deletion history of arrestins in deuterostomes. BMC Evol Biol 2017;17:163. [PMID: 28683816 PMCID: PMC5501109 DOI: 10.1186/s12862-017-1001-4] [Citation(s) in RCA: 37] [Impact Index Per Article: 4.6] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/14/2016] [Accepted: 06/19/2017] [Indexed: 12/13/2022] Open

Abstract

BACKGROUND

The cytosolic arrestin proteins mediate desensitization of activated G protein-coupled receptors (GPCRs) via competition with G proteins for the active phosphorylated receptors. Arrestins in active, including receptor-bound, conformation are also transducers of signaling. Therefore, this protein family is an attractive therapeutic target. The signaling outcome is believed to be a result of structural and sequence-dependent interactions of arrestins with GPCRs and other protein partners. Here we elucidated the detailed evolution of arrestins in deuterostomes.

RESULTS

Identity and number of arrestin paralogs were determined searching deuterostome genomes and gene expression data. In contrast to standard gene prediction methods, our strategy first detects exons situated on different scaffolds and then solves the problem of assigning them to the correct gene. This increases both the completeness and the accuracy of the annotation in comparison to conventional database search strategies applied by the community. The employed strategy enabled us to map in detail the duplication- and deletion history of arrestin paralogs including tandem duplications, pseudogenizations and the formation of retrogenes. The two rounds of whole genome duplications in the vertebrate stem lineage gave rise to four arrestin paralogs. Surprisingly, visual arrestin ARR3 was lost in the mammalian clades Afrotheria and Xenarthra. Duplications in specific clades, on the other hand, must have given rise to new paralogs that show signatures of diversification in functional elements important for receptor binding and phosphate sensing.

CONCLUSION

The current study traces the functional evolution of deuterostome arrestins in unprecedented detail. Based on a precise re-annotation of the exon-intron structure at nucleotide resolution, we infer the gain and loss of paralogs and patterns of conservation, co-variation and selection.

Collapse

Sloutsky R, Naegle KM. High-Resolution Identification of Specificity Determining Positions in the LacI Protein Family Using Ensembles of Sub-Sampled Alignments. PLoS One 2016;11:e0162579. [PMID: 27681038 PMCID: PMC5040260 DOI: 10.1371/journal.pone.0162579] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/22/2016] [Accepted: 08/08/2016] [Indexed: 01/24/2023] Open

Moll M, Finn PW, Kavraki LE. Structure-guided selection of specificity determining positions in the human Kinome. BMC Genomics 2016;17 Suppl 4:431. [PMID: 27556159 PMCID: PMC5001202 DOI: 10.1186/s12864-016-2790-3] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/25/2022] Open

A Bioinformatics Analysis Reveals a Group of MocR Bacterial Transcriptional Regulators Linked to a Family of Genes Coding for Membrane Proteins. Biochem Res Int 2016;2016:4360285. [PMID: 27446613 PMCID: PMC4944035 DOI: 10.1155/2016/4360285] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/27/2015] [Accepted: 05/26/2016] [Indexed: 01/30/2023] Open

Boari de Lima E, Meira W, de Melo-Minardi RC. Isofunctional Protein Subfamily Detection Using Data Integration and Spectral Clustering. PLoS Comput Biol 2016;12:e1005001. [PMID: 27348631 PMCID: PMC4922564 DOI: 10.1371/journal.pcbi.1005001] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/30/2015] [Accepted: 05/22/2016] [Indexed: 01/14/2023] Open

Abstract

As increasingly more genomes are sequenced, the vast majority of proteins may only be annotated computationally, given experimental investigation is extremely costly. This highlights the need for computational methods to determine protein functions quickly and reliably. We believe dividing a protein family into subtypes which share specific functions uncommon to the whole family reduces the function annotation problem's complexity. Hence, this work's purpose is to detect isofunctional subfamilies inside a family of unknown function, while identifying differentiating residues. Similarity between protein pairs according to various properties is interpreted as functional similarity evidence. Data are integrated using genetic programming and provided to a spectral clustering algorithm, which creates clusters of similar proteins. The proposed framework was applied to well-known protein families and to a family of unknown function, then compared to ASMC. Results showed our fully automated technique obtained better clusters than ASMC for two families, besides equivalent results for other two, including one whose clusters were manually defined. Clusters produced by our framework showed great correspondence with the known subfamilies, besides being more contrasting than those produced by ASMC. Additionally, for the families whose specificity determining positions are known, such residues were among those our technique considered most important to differentiate a given group. When run with the crotonase and enolase SFLD superfamilies, the results showed great agreement with this gold-standard. Best results consistently involved multiple data types, thus confirming our hypothesis that similarities according to different knowledge domains may be used as functional similarity evidence. Our main contributions are the proposed strategy for selecting and integrating data types, along with the ability to work with noisy and incomplete data; domain knowledge usage for detecting subfamilies in a family with different specificities, thus reducing the complexity of the experimental function characterization problem; and the identification of residues responsible for specificity.

Collapse

Chagoyen M, García-Martín JA, Pazos F. Practical analysis of specificity-determining residues in protein families. Brief Bioinform 2015;17:255-61. [DOI: 10.1093/bib/bbv045] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/01/2015] [Accepted: 06/15/2015] [Indexed: 12/17/2022] Open

Chakraborty A, Chakrabarti S. A survey on prediction of specificity-determining sites in proteins. Brief Bioinform 2014;16:71-88. [DOI: 10.1093/bib/bbt092] [Citation(s) in RCA: 46] [Impact Index Per Article: 4.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022] Open

Wilkins AD, Venner E, Marciano DC, Erdin S, Atri B, Lua RC, Lichtarge O. Accounting for epistatic interactions improves the functional analysis of protein structures. Bioinformatics 2013;29:2714-21. [PMID: 24021383 PMCID: PMC3799481 DOI: 10.1093/bioinformatics/btt489] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/12/2022] Open

Abstract

Motivation: The constraints under which sequence, structure and function coevolve are not fully understood. Bringing this mutual relationship to light can reveal the molecular basis of binding, catalysis and allostery, thereby identifying function and rationally guiding protein redesign. Underlying these relationships are the epistatic interactions that occur when the consequences of a mutation to a protein are determined by the genetic background in which it occurs. Based on prior data, we hypothesize that epistatic forces operate most strongly between residues nearby in the structure, resulting in smooth evolutionary importance across the structure.

Methods and Results: We find that when residue scores of evolutionary importance are distributed smoothly between nearby residues, functional site prediction accuracy improves. Accordingly, we designed a novel measure of evolutionary importance that focuses on the interaction between pairs of structurally neighboring residues. This measure that we term pair-interaction Evolutionary Trace yields greater functional site overlap and better structure-based proteome-wide functional predictions.

Conclusions: Our data show that the structural smoothness of evolutionary importance is a fundamental feature of the coevolution of sequence, structure and function. Mutations operate on individual residues, but selective pressure depends in part on the extent to which a mutation perturbs interactions with neighboring residues. In practice, this principle led us to redefine the importance of a residue in terms of the importance of its epistatic interactions with neighbors, yielding better annotation of functional residues, motivating experimental validation of a novel functional site in LexA and refining protein function prediction.

Contact:lichtarge@bcm.edu

Supplementary information:Supplementary data are available at Bioinformatics online.

Collapse

Gu X, Zou Y, Su Z, Huang W, Zhou Z, Arendsee Z, Zeng Y. An update of DIVERGE software for functional divergence analysis of protein family. Mol Biol Evol 2013;30:1713-9. [PMID: 23589455 DOI: 10.1093/molbev/mst069] [Citation(s) in RCA: 146] [Impact Index Per Article: 12.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/19/2022] Open

Suplatov D, Shalaeva D, Kirilin E, Arzhanik V, Švedas V. Bioinformatic analysis of protein families for identification of variable amino acid residues responsible for functional diversity. J Biomol Struct Dyn 2013;32:75-87. [DOI: 10.1080/07391102.2012.750249] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/27/2022]

Residue mutations and their impact on protein structure and function: detecting beneficial and pathogenic changes. Biochem J 2013;449:581-94. [DOI: 10.1042/bj20121221] [Citation(s) in RCA: 131] [Impact Index Per Article: 10.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/29/2022]

Teppa E, Wilkins AD, Nielsen M, Buslje CM. Disentangling evolutionary signals: conservation, specificity determining positions and coevolution. Implication for catalytic residue prediction. BMC Bioinformatics 2012;13:235. [PMID: 22978315 PMCID: PMC3515339 DOI: 10.1186/1471-2105-13-235] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/08/2012] [Accepted: 09/05/2012] [Indexed: 11/11/2022] Open

Abstract

Background

A large panel of methods exists that aim to identify residues with critical impact on protein function based on evolutionary signals, sequence and structure information. However, it is not clear to what extent these different methods overlap, and if any of the methods have higher predictive potential compared to others when it comes to, in particular, the identification of catalytic residues (CR) in proteins. Using a large set of enzymatic protein families and measures based on different evolutionary signals, we sought to break up the different components of the information content within a multiple sequence alignment to investigate their predictive potential and degree of overlap.

Results

Our results demonstrate that the different methods included in the benchmark in general can be divided into three groups with a limited mutual overlap. One group containing real-value Evolutionary Trace (rvET) methods and conservation, another containing mutual information (MI) methods, and the last containing methods designed explicitly for the identification of specificity determining positions (SDPs): integer-value Evolutionary Trace (ivET), SDPfox, and XDET. In terms of prediction of CR, we find using a proximity score integrating structural information (as the sum of the scores of residues located within a given distance of the residue in question) that only the methods from the first two groups displayed a reliable performance. Next, we investigated to what degree proximity scores for conservation, rvET and cumulative MI (cMI) provide complementary information capable of improving the performance for CR identification. We found that integrating conservation with proximity scores for rvET and cMI achieved the highest performance. The proximity conservation score contained no complementary information when integrated with proximity rvET. Moreover, the signal from rvET provided only a limited gain in predictive performance when integrated with mutual information and conservation proximity scores. Combined, these observations demonstrate that the rvET and cMI scores add complementary information to the prediction system.

Conclusions

This work contributes to the understanding of the different signals of evolution and also shows that it is possible to improve the detection of catalytic residues by integrating structural and higher order sequence evolutionary information with sequence conservation.

Collapse

Aiello D, Caffrey DR. Evolution of specific protein-protein interaction sites following gene duplication. J Mol Biol 2012;423:257-72. [PMID: 22789570 DOI: 10.1016/j.jmb.2012.06.039] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/22/2011] [Revised: 05/16/2012] [Accepted: 06/29/2012] [Indexed: 11/15/2022]

González AJ, Liao L, Wu CH. Predicting ligand binding residues and functional sites using multipositional correlations with graph theoretic clustering and kernel CCA. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2012;9:992-1001. [PMID: 22025754 DOI: 10.1109/tcbb.2011.136] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/31/2023]

Sorn S, Sok T, Ly S, Rith S, Tung N, Viari A, Gavotte L, Holl D, Seng H, Asgari N, Richner B, Laurent D, Chea N, Duong V, Toyoda T, Yasuda CY, Kitsutani P, Zhou P, Bing S, Deubel V, Donis R, Frutos R, Buchy P. Dynamic of H5N1 virus in Cambodia and emergence of a novel endemic sub-clade. INFECTION GENETICS AND EVOLUTION 2012;15:87-94. [PMID: 22683363 DOI: 10.1016/j.meegid.2012.05.013] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Received: 01/21/2012] [Revised: 05/13/2012] [Accepted: 05/28/2012] [Indexed: 11/28/2022]

Complex dynamic of dengue virus serotypes 2 and 3 in Cambodia following series of climate disasters. INFECTION GENETICS AND EVOLUTION 2012;15:77-86. [PMID: 22677620 DOI: 10.1016/j.meegid.2012.05.012] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Received: 03/07/2012] [Revised: 05/13/2012] [Accepted: 05/16/2012] [Indexed: 11/23/2022]

Muth T, Garcia-Martin JA, Rausell A, Juan D, Valencia A, Pazos F. JDet: interactive calculation and visualization of function-related conservation patterns in multiple sequence alignments and structures. Bioinformatics 2011;28:584-6. [DOI: 10.1093/bioinformatics/btr688] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022] Open

Schmidt T, Haas J, Gallo Cassarino T, Schwede T. Assessment of ligand-binding residue predictions in CASP9. Proteins 2011;79 Suppl 10:126-36. [PMID: 21987472 DOI: 10.1002/prot.23174] [Citation(s) in RCA: 67] [Impact Index Per Article: 4.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/23/2011] [Revised: 07/29/2011] [Accepted: 08/04/2011] [Indexed: 11/06/2022]

Benítez-Páez A, Cárdenas-Brito S, Gutiérrez AJ. A practical guide for the computational selection of residues to be experimentally characterized in protein families. Brief Bioinform 2011;13:329-36. [PMID: 21930656 DOI: 10.1093/bib/bbr052] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022] Open

Duong V, Simmons C, Gavotte L, Viari A, Ong S, Chantha N, Lennon NJ, Birren BW, Vong S, Farrar JJ, Henn MR, Deubel V, Frutos R, Buchy P. Genetic diversity and lineage dynamic of dengue virus serotype 1 (DENV-1) in Cambodia. INFECTION GENETICS AND EVOLUTION 2011;15:59-68. [PMID: 21757030 DOI: 10.1016/j.meegid.2011.06.019] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Received: 03/20/2011] [Revised: 06/12/2011] [Accepted: 06/27/2011] [Indexed: 11/28/2022]

Wass MN, David A, Sternberg MJE. Challenges for the prediction of macromolecular interactions. Curr Opin Struct Biol 2011;21:382-90. [DOI: 10.1016/j.sbi.2011.03.013] [Citation(s) in RCA: 65] [Impact Index Per Article: 4.6] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/20/2010] [Revised: 03/04/2011] [Accepted: 03/24/2011] [Indexed: 12/14/2022]

Cleveland SB, Davies J, McClure MA. A bioinformatics approach to the structure, function, and evolution of the nucleoprotein of the order mononegavirales. PLoS One 2011;6:e19275. [PMID: 21559282 PMCID: PMC3086907 DOI: 10.1371/journal.pone.0019275] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/03/2010] [Accepted: 04/01/2011] [Indexed: 01/09/2023] Open

Abstract

The goal of this Bioinformatic study is to investigate sequence conservation in relation to evolutionary function/structure of the nucleoprotein of the order Mononegavirales. In the combined analysis of 63 representative nucleoprotein (N) sequences from four viral families (Bornaviridae, Filoviridae, Rhabdoviridae, and Paramyxoviridae) we predict the regions of protein disorder, intra-residue contact and co-evolving residues. Correlations between location and conservation of predicted regions illustrate a strong division between families while high- lighting conservation within individual families. These results suggest the conserved regions among the nucleoproteins, specifically within Rhabdoviridae and Paramyxoviradae, but also generally among all members of the order, reflect an evolutionary advantage in maintaining these sites for the viral nucleoprotein as part of the transcription/replication machinery. Results indicate conservation for disorder in the C-terminus region of the representative proteins that is important for interacting with the phosphoprotein and the large subunit polymerase during transcription and replication. Additionally, the C-terminus region of the protein preceding the disordered region, is predicted to be important for interacting with the encapsidated genome. Portions of the N-terminus are responsible for N∶N stability and interactions identified by the presence or lack of co-evolving intra-protein contact predictions. The validation of these prediction results by current structural information illustrates the benefits of the Disorder, Intra-residue contact and Compensatory mutation Correlator (DisICC) pipeline as a method for quickly characterizing proteins and providing the most likely residues and regions necessary to target for disruption in viruses that have little structural information available.

Collapse

Tungtur S, Parente DJ, Swint-Kruse L. Functionally important positions can comprise the majority of a protein's architecture. Proteins 2011;79:1589-608. [PMID: 21374721 PMCID: PMC3076786 DOI: 10.1002/prot.22985] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/14/2010] [Revised: 12/08/2010] [Accepted: 12/15/2010] [Indexed: 01/13/2023]

Kc DB, Livesay DR. Topology improves phylogenetic motif functional site predictions. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2011;8:226-233. [PMID: 21071810 DOI: 10.1109/tcbb.2009.60] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/30/2023]

de Melo-Minardi RC, Bastard K, Artiguenave F. Identification of subfamily-specific sites based on active sites modeling and clustering. ACTA ACUST UNITED AC 2010;26:3075-82. [PMID: 20980272 DOI: 10.1093/bioinformatics/btq595] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022]

Abstract

MOTIVATION

Current computational approaches to function prediction are mostly based on protein sequence classification and transfer of annotation from known proteins to their closest homologous sequences relying on the orthology concept of function conservation. This approach suffers a major weakness: annotation reliability depends on global sequence similarity to known proteins and is poorly efficient for enzyme superfamilies that catalyze different reactions. Structural biology offers a different strategy to overcome the problem of annotation by adding information about protein 3D structures. This information can be used to identify amino acids located in active sites, focusing on detection of functional polymorphisms residues in an enzyme superfamily. Structural genomics programs are providing more and more novel protein structures at a high-throughput rate. However, there is still a huge gap between the number of sequences and available structures. Computational methods, such as homology modeling provides reliable approaches to bridge this gap and could be a new precise tool to annotate protein functions.

RESULTS

Here, we present Active Sites Modeling and Clustering (ASMC) method, a novel unsupervised method to classify sequences using structural information of protein pockets. ASMC combines homology modeling of family members, structural alignment of modeled active sites and a subsequent hierarchical conceptual classification. Comparison of profiles obtained from computed clusters allows the identification of residues correlated to subfamily function divergence, called specificity determining positions. ASMC method has been validated on a benchmark of 42 Pfam families for which previous resolved holo-structures were available. ASMC was also applied to several families containing known protein structures and comprehensive functional annotations. We will discuss how ASMC improves annotation and understanding of protein families functions by giving some specific illustrative examples on nucleotidyl cyclases, protein kinases and serine proteases.

AVAILABILITY

http://www.genoscope.fr/ASMC/.

Collapse

Brandt BW, Feenstra KA, Heringa J. Multi-Harmony: detecting functional specificity from sequence alignment. Nucleic Acids Res 2010;38:W35-40. [PMID: 20525785 PMCID: PMC2896201 DOI: 10.1093/nar/gkq415] [Citation(s) in RCA: 45] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022] Open

Hung SS, Wasmuth J, Sanford C, Parkinson J. DETECT--a density estimation tool for enzyme classification and its application to Plasmodium falciparum. ACTA ACUST UNITED AC 2010;26:1690-8. [PMID: 20513663 DOI: 10.1093/bioinformatics/btq266] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022]

Izarzugaza JMG, Redfern OC, Orengo CA, Valencia A. Cancer-associated mutations are preferentially distributed in protein kinase functional sites. Proteins 2010;77:892-903. [PMID: 19626714 DOI: 10.1002/prot.22512] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/01/2023]

Protein interactions and ligand binding: from protein subfamilies to functional specificity. Proc Natl Acad Sci U S A 2010;107:1995-2000. [PMID: 20133844 DOI: 10.1073/pnas.0908044107] [Citation(s) in RCA: 113] [Impact Index Per Article: 7.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022] Open

Goldstein P, Zucko J, Vujaklija D, Krisko A, Hranueli D, Long PF, Etchebest C, Basrak B, Cullum J. Clustering of protein domains for functional and evolutionary studies. BMC Bioinformatics 2009;10:335. [PMID: 19832975 PMCID: PMC2770074 DOI: 10.1186/1471-2105-10-335] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/14/2009] [Accepted: 10/15/2009] [Indexed: 11/16/2022] Open

Comparing the functional roles of nonconserved sequence positions in homologous transcription repressors: implications for sequence/function analyses. J Mol Biol 2009;395:785-802. [PMID: 19818797 DOI: 10.1016/j.jmb.2009.10.001] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/03/2009] [Revised: 10/01/2009] [Accepted: 10/02/2009] [Indexed: 11/21/2022]

Abstract

The explosion of protein sequences deduced from genetic code has led to both a problem and a potential resource: Efficient data use requires interpreting the functional impact of sequence change without experimentally characterizing each protein variant. Several groups have hypothesized that interpretation could be aided by analyzing the sequences of naturally occurring homologues. To that end, myriad sequence/function analyses have been developed to predict which conserved, semi-conserved, and nonconserved positions are functionally important. These positions must be discriminated from the nonconserved positions that are functionally silent. However, the assumptions that underlie sequence analyses are based on experimental results that are sparse and usually designed to address different questions. Here, we use three homologues from a test family common to bioinformatics-the LacI/GalR transcription repressors-to test a common assumption: If a position is functionally important for one family member, it has similar importance in all homologues. We generated experimental sequence/function information for each nonconserved position in the 18 amino acids that link the DNA-binding and regulatory domains of three LacI/GalR homologues. We find that the functional importance of each position is preserved among the three linkers, albeit to different degrees. We also find that every linker position contributes to function, which has twofold implications. (1) Since the linker positions range from highly conserved to semi-conserved to nonconserved and contribute to affinity, selectivity, and allosteric response, we assert that sequence/function analyses must identify positions in the LacI/GalR linkers to be qualified as "successful". Many analyses overlook this region since most of the residues do not directly contact ligand. (2) No position in the LacI/GalR linker is functionally silent. This finding is inconsistent with another underlying principle of many analyses: Using sequence sets to discriminate important from non-contributing positions obligates silent positions, which denotes that most homologues tolerate a variety of amino acid substitutions at the position without functional change. Instead, additional combinatorial mutants in the LacI/GalR linkers show that particular substitutions can be silent in a context-dependent manner. Thus, specific permutations of sequence change (rather than change at silent positions) would facilitate neutral drift during evolution. Finally, the combinatorial mutants also reveal functional synergy between semi- and nonconserved positions. Such functional relationships would be missed by analyses that rely primarily upon co-evolution.

Collapse

Frenkel-Morgenstern M, Tworowski D, Klipcan L, Safro M. Intra-protein compensatory mutations analysis highlights the tRNA recognition regions in aminoacyl-tRNA synthetases. J Biomol Struct Dyn 2009;27:115-26. [PMID: 19583438 DOI: 10.1080/07391102.2009.10507302] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/28/2022]

Schmidt T, Haas J, Cassarino TG, Schwede T. Assessment of ligand binding residue predictions in CASP8. Proteins 2009;77 Suppl 9:138-46. [PMID: 19714771 PMCID: PMC3204792 DOI: 10.1002/prot.22557] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/07/2022]

Meinhardt S, Swint-Kruse L. Experimental identification of specificity determinants in the domain linker of a LacI/GalR protein: bioinformatics-based predictions generate true positives and false negatives. Proteins 2008;73:941-57. [PMID: 18536016 PMCID: PMC2585155 DOI: 10.1002/prot.22121] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/07/2022]

Abstract

In protein families, conserved residues often contribute to a common general function, such as DNA-binding. However, unique attributes for each homolog (e.g. recognition of alternative DNA sequences) must arise from variation in other functionally-important positions. The locations of these "specificity determinant" positions are obscured amongst the background of varied residues that do not make significant contributions to either structure or function. To isolate specificity determinants, a number of bioinformatics algorithms have been developed. When applied to the LacI/GalR family of transcription regulators, several specificity determinants are predicted in the 18 amino acids that link the DNA-binding and regulatory domains. However, results from alternative algorithms are only in partial agreement with each other. Here, we experimentally evaluate these predictions using an engineered repressor comprising the LacI DNA-binding domain, the LacI linker, and the GalR regulatory domain (LLhG). "Wild-type" LLhG has altered DNA specificity and weaker lacO(1) repression compared to LacI or a similar LacI:PurR chimera. Next, predictions of linker specificity determinants were tested, using amino acid substitution and in vivo repression assays to assess functional change. In LLhG, all predicted sites are specificity determinants, as well as three sites not predicted by any algorithm. Strategies are suggested for diminishing the number of false negative predictions. Finally, individual substitutions at LLhG specificity determinants exhibited a broad range of functional changes that are not predicted by bioinformatics algorithms. Results suggest that some variants have altered affinity for DNA, some have altered allosteric response, and some appear to have changed specificity for alternative DNA ligands.

Collapse

Dukka BKC, Livesay DR. Improving position-specific predictions of protein functional sites using phylogenetic motifs. ACTA ACUST UNITED AC 2008;24:2308-16. [PMID: 18723520 DOI: 10.1093/bioinformatics/btn454] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022]

Redfern OC, Dessailly B, Orengo CA. Exploring the structure and function paradigm. Curr Opin Struct Biol 2008;18:394-402. [PMID: 18554899 PMCID: PMC2561214 DOI: 10.1016/j.sbi.2008.05.007] [Citation(s) in RCA: 95] [Impact Index Per Article: 5.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/18/2008] [Revised: 04/16/2008] [Accepted: 05/07/2008] [Indexed: 11/29/2022]

Capra JA, Singh M. Characterization and prediction of residues determining protein functional specificity. ACTA ACUST UNITED AC 2008;24:1473-80. [PMID: 18450811 PMCID: PMC2718669 DOI: 10.1093/bioinformatics/btn214] [Citation(s) in RCA: 96] [Impact Index Per Article: 5.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022]

Lee D, Redfern O, Orengo C. Predicting protein function from sequence and structure. Nat Rev Mol Cell Biol 2007;8:995-1005. [PMID: 18037900 DOI: 10.1038/nrm2281] [Citation(s) in RCA: 371] [Impact Index Per Article: 20.6] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/09/2022]

How accurate and statistically robust are catalytic site predictions based on closeness centrality? BMC Bioinformatics 2007;8:153. [PMID: 17498304 PMCID: PMC1876251 DOI: 10.1186/1471-2105-8-153] [Citation(s) in RCA: 52] [Impact Index Per Article: 2.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/11/2006] [Accepted: 05/11/2007] [Indexed: 11/25/2022] Open

Abstract

Background

We examine the accuracy of enzyme catalytic residue predictions from a network representation of protein structure. In this model, amino acid α-carbons specify vertices within a graph and edges connect vertices that are proximal in structure. Closeness centrality, which has shown promise in previous investigations, is used to identify important positions within the network. Closeness centrality, a global measure of network centrality, is calculated as the reciprocal of the average distance between vertex i and all other vertices.

Results

We benchmark the approach against 283 structurally unique proteins within the Catalytic Site Atlas. Our results, which are inline with previous investigations of smaller datasets, indicate closeness centrality predictions are statistically significant. However, unlike previous approaches, we specifically focus on residues with the very best scores. Over the top five closeness centrality scores, we observe an average true to false positive rate ratio of 6.8 to 1. As demonstrated previously, adding a solvent accessibility filter significantly improves predictive power; the average ratio is increased to 15.3 to 1. We also demonstrate (for the first time) that filtering the predictions by residue identity improves the results even more than accessibility filtering. Here, we simply eliminate residues with physiochemical properties unlikely to be compatible with catalytic requirements from consideration. Residue identity filtering improves the average true to false positive rate ratio to 26.3 to 1. Combining the two filters together has little affect on the results. Calculated p-values for the three prediction schemes range from 2.7E-9 to less than 8.8E-134. Finally, the sensitivity of the predictions to structure choice and slight perturbations is examined.

Conclusion

Our results resolutely confirm that closeness centrality is a viable prediction scheme whose predictions are statistically significant. Simple filtering schemes substantially improve the method's predicted power. Moreover, no clear effect on performance is observed when comparing ligated and unligated structures. Similarly, the CC prediction results are robust to slight structural perturbations from molecular dynamics simulation.

Collapse

Wallace IM, Higgins DG. Supervised multivariate analysis of sequence groups to identify specificity determining residues. BMC Bioinformatics 2007;8:135. [PMID: 17451607 PMCID: PMC1878507 DOI: 10.1186/1471-2105-8-135] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/14/2006] [Accepted: 04/23/2007] [Indexed: 11/29/2022] Open