Reference Citation Analysis

Cited by Other Article(s)

Zhang N, Sood D, Guo SC, Chen N, Antoszewski A, Marianchuk T, Dey S, Xiao Y, Hong L, Peng X, Baxa M, Partch C, Wang LP, Sosnick TR, Dinner AR, LiWang A. Temperature-dependent fold-switching mechanism of the circadian clock protein KaiB. Proc Natl Acad Sci U S A 2024;121:e2412327121. [PMID: 39671178 DOI: 10.1073/pnas.2412327121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/01/2024] [Accepted: 10/24/2024] [Indexed: 12/14/2024] Open

Harteveld Z, Van Hall-Beauvais A, Morozova I, Southern J, Goverde C, Georgeon S, Rosset S, Defferrard M, Loukas A, Vandergheynst P, Bronstein MM, Correia BE. Exploring "dark-matter" protein folds using deep learning. Cell Syst 2024;15:898-910.e5. [PMID: 39383860 DOI: 10.1016/j.cels.2024.09.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/17/2023] [Revised: 06/13/2024] [Accepted: 09/16/2024] [Indexed: 10/11/2024]

Draizen EJ, Veretnik S, Mura C, Bourne PE. Deep generative models of protein structure uncover distant relationships across a continuous fold space. Nat Commun 2024;15:8094. [PMID: 39294145 PMCID: PMC11410806 DOI: 10.1038/s41467-024-52020-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/21/2023] [Accepted: 08/23/2024] [Indexed: 09/20/2024] Open

Zhang N, Sood D, Guo SC, Chen N, Antoszewski A, Marianchuk T, Chavan A, Dey S, Xiao Y, Hong L, Peng X, Baxa M, Partch C, Wang LP, Sosnick TR, Dinner AR, LiWang A. Temperature-Dependent Fold-Switching Mechanism of the Circadian Clock Protein KaiB. bioRxiv 2024:2024.05.21.594594. [PMID: 38826295 PMCID: PMC11142059 DOI: 10.1101/2024.05.21.594594] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/04/2024]

Greener JG, Jamali K. Fast protein structure searching using structure graph embeddings. Bioinform Adv 2025;5:vbaf042. [PMID: 40196750 PMCID: PMC11974391 DOI: 10.1093/bioadv/vbaf042] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/07/2025] [Revised: 02/11/2025] [Accepted: 03/03/2025] [Indexed: 04/09/2025]

Schaeffer RD, Zhang J, Medvedev KE, Kinch LN, Cong Q, Grishin NV. ECOD domain classification of 48 whole proteomes from AlphaFold Structure Database using DPAM2. PLoS Comput Biol 2024;20:e1011586. [PMID: 38416793 PMCID: PMC10927120 DOI: 10.1371/journal.pcbi.1011586] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 10/10/2023] [Revised: 03/11/2024] [Accepted: 02/20/2024] [Indexed: 03/01/2024] Open

Bordin N, Lau AM, Orengo C. Large-scale clustering of AlphaFold2 3D models shines light on the structure and function of proteins. Mol Cell 2023;83:3950-3952. [PMID: 37977115 DOI: 10.1016/j.molcel.2023.10.039] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 10/27/2023] [Revised: 10/27/2023] [Accepted: 10/27/2023] [Indexed: 11/19/2023]

Bruley A, Bitard-Feildel T, Callebaut I, Duprat E. A sequence-based foldability score combined with AlphaFold2 predictions to disentangle the protein order/disorder continuum. Proteins 2023;91:466-484. [PMID: 36306150 DOI: 10.1002/prot.26441] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 08/01/2022] [Revised: 10/14/2022] [Accepted: 10/18/2022] [Indexed: 11/11/2022]

Bordin N, Dallago C, Heinzinger M, Kim S, Littmann M, Rauer C, Steinegger M, Rost B, Orengo C. Novel machine learning approaches revolutionize protein knowledge. Trends Biochem Sci 2023;48:345-359. [PMID: 36504138 PMCID: PMC10570143 DOI: 10.1016/j.tibs.2022.11.001] [Citation(s) in RCA: 35] [Impact Index Per Article: 17.5] [Received: 07/14/2022] [Revised: 10/24/2022] [Accepted: 11/17/2022] [Indexed: 12/10/2022]

Chou HH, Hsu CT, Hsu CW, Yao KH, Wang HC, Hsieh SY. Novel Algorithm for Improved Protein Classification Using Graph Similarity. IEEE/ACM Trans Comput Biol Bioinform 2022;19:3135-3143. [PMID: 34748498 DOI: 10.1109/tcbb.2021.3125836] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/13/2023]

Uzoeto HO, Cosmas S, Ajima JN, Arazu AV, Didiugwu CM, Ekpo DE, Ibiang GO, Durojaye OA. Computer-aided molecular modeling and structural analysis of the human centromere protein–HIKM complex. Beni-Suef Univ J Basic Appl Sci 2022. [DOI: 10.1186/s43088-022-00285-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/10/2022] Open

Abstract

Background: Protein–peptide and protein–protein interactions play an essential role in different functional and structural cellular organizational aspects. While Cryo-EM and X-ray crystallography generate the most complete structural characterization, most biological interactions exist in biomolecular complexes that are neither compliant nor responsive to direct experimental analysis. The development of computational docking approaches is therefore necessary. This starts from component protein structures to the prediction of their complexes, preferentially with precision close to complex structures generated by X-ray crystallography.
Results: To guarantee faithful chromosomal segregation, there must be a proper assembling of the kinetochore (a protein complex with multiple subunits) at the centromere during the process of cell division. As an important member of the inner kinetochore, defects in any of the subunits making up the CENP-HIKM complex lead to kinetochore dysfunction and an eventual chromosomal mis-segregation and cell death. Previous studies in an attempt to understand the assembly and mechanism devised by the CENP-HIKM in promoting the functionality of the kinetochore have reconstituted the protein complex from different organisms including fungi and yeast. Here, we present a detailed computational model of the physical interactions that exist between each component of the human CENP-HIKM, while validating each modeled structure using orthologs with existing crystal structures from the protein data bank.
Conclusions: Results from this study substantiate the existing hypothesis that the human CENP-HIK complex shares a similar architecture with its fungal and yeast orthologs, and likewise validate the binding mode of CENP-M to the C-terminus of the human CENP-I based on existing experimental reports.

Torgasheva NA, Diatlova EA, Grin IR, Endutkin AV, Mechetin GV, Vokhtantsev IP, Yudkina AV, Zharkov DO. Noncatalytic Domains in DNA Glycosylases. Int J Mol Sci 2022;23:ijms23137286. [PMID: 35806289 PMCID: PMC9266487 DOI: 10.3390/ijms23137286] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/30/2022] [Revised: 06/28/2022] [Accepted: 06/29/2022] [Indexed: 02/04/2023] Open

Affiliation(s)

Natalia A. Torgasheva: SB RAS Institute of Chemical Biology and Fundamental Medicine, 8 Lavrentieva Avenue, 630090 Novosibirsk, Russia
Evgeniia A. Diatlova: SB RAS Institute of Chemical Biology and Fundamental Medicine, 8 Lavrentieva Avenue, 630090 Novosibirsk, Russia; Department of Natural Sciences, Novosibirsk State University, 2 Pirogova Street, 630090 Novosibirsk, Russia
Inga R. Grin: SB RAS Institute of Chemical Biology and Fundamental Medicine, 8 Lavrentieva Avenue, 630090 Novosibirsk, Russia
Anton V. Endutkin: SB RAS Institute of Chemical Biology and Fundamental Medicine, 8 Lavrentieva Avenue, 630090 Novosibirsk, Russia
Grigory V. Mechetin: SB RAS Institute of Chemical Biology and Fundamental Medicine, 8 Lavrentieva Avenue, 630090 Novosibirsk, Russia
Ivan P. Vokhtantsev: SB RAS Institute of Chemical Biology and Fundamental Medicine, 8 Lavrentieva Avenue, 630090 Novosibirsk, Russia; Department of Natural Sciences, Novosibirsk State University, 2 Pirogova Street, 630090 Novosibirsk, Russia
Anna V. Yudkina: SB RAS Institute of Chemical Biology and Fundamental Medicine, 8 Lavrentieva Avenue, 630090 Novosibirsk, Russia
Dmitry O. Zharkov: SB RAS Institute of Chemical Biology and Fundamental Medicine, 8 Lavrentieva Avenue, 630090 Novosibirsk, Russia; Department of Natural Sciences, Novosibirsk State University, 2 Pirogova Street, 630090 Novosibirsk, Russia


Srivastava J, Balaji PV. Clues to reaction specificity in PLP-dependent fold type I aminotransferases of monosaccharide biosynthesis. Proteins 2022;90:1247-1258. [DOI: 10.1002/prot.26305] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/20/2021] [Accepted: 01/20/2022] [Indexed: 11/10/2022]

Villegas-Morcillo A, Gomez AM, Sanchez V. An analysis of protein language model embeddings for fold prediction. Brief Bioinform 2022;23:6571527. [PMID: 35443054 DOI: 10.1093/bib/bbac142] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 02/01/2022] [Revised: 03/21/2022] [Accepted: 03/28/2022] [Indexed: 11/13/2022] Open

Lin E, Lin CH, Lane HY. De Novo Peptide and Protein Design Using Generative Adversarial Networks: An Update. J Chem Inf Model 2022;62:761-774. [DOI: 10.1021/acs.jcim.1c01361] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/11/2022]

Li DD, Wang JL, Liu Y, Li YZ, Zhang Z. Expanded analyses of the functional correlations within structural classifications of glycoside hydrolases. Comput Struct Biotechnol J 2021;19:5931-5942. [PMID: 34849197 PMCID: PMC8602953 DOI: 10.1016/j.csbj.2021.10.039] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/01/2021] [Revised: 10/30/2021] [Accepted: 10/30/2021] [Indexed: 01/01/2023] Open

Villegas-Morcillo A, Gomez AM, Morales-Cordovilla JA, Sanchez V. Protein Fold Recognition From Sequences Using Convolutional and Recurrent Neural Networks. IEEE/ACM Trans Comput Biol Bioinform 2021;18:2848-2854. [PMID: 32750896 DOI: 10.1109/tcbb.2020.3012732] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 06/11/2023]

LiWang A, Porter LL, Wang LP. Fold-switching proteins. Biopolymers 2021;112:e23478. [PMID: 34694634 DOI: 10.1002/bip.23478] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 01/17/2023]

Villegas-Morcillo A, Sanchez V, Gomez AM. FoldHSphere: deep hyperspherical embeddings for protein fold recognition. BMC Bioinformatics 2021;22:490. [PMID: 34641786 PMCID: PMC8507389 DOI: 10.1186/s12859-021-04419-7] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 05/12/2021] [Accepted: 09/29/2021] [Indexed: 12/01/2022] Open

LiWang PJ, Wang LP, LiWang A. Resurrected Ancestors Reveal Origins of Metamorphism in XCL1. Trends Biochem Sci 2021;46:433-434. [PMID: 33752957 DOI: 10.1016/j.tibs.2021.03.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/01/2021] [Revised: 03/06/2021] [Accepted: 03/09/2021] [Indexed: 10/21/2022]

Wang CK, Craik DJ. Linking molecular evolution to molecular grafting. J Biol Chem 2021;296:100425. [PMID: 33600801 PMCID: PMC8005815 DOI: 10.1016/j.jbc.2021.100425] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 10/27/2020] [Revised: 02/09/2021] [Accepted: 02/13/2021] [Indexed: 12/01/2022] Open

Runthala A. Probabilistic divergence of a template-based modelling methodology from the ideal protocol. J Mol Model 2021;27:25. [PMID: 33411019 DOI: 10.1007/s00894-020-04640-w] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/19/2020] [Accepted: 12/09/2020] [Indexed: 12/27/2022]

Karimi M, Zhu S, Cao Y, Shen Y. De Novo Protein Design for Novel Folds Using Guided Conditional Wasserstein Generative Adversarial Networks. J Chem Inf Model 2020;60:5667-5681. [PMID: 32945673 PMCID: PMC7775287 DOI: 10.1021/acs.jcim.0c00593] [Citation(s) in RCA: 32] [Impact Index Per Article: 6.4] [Indexed: 01/04/2023]

Abstract

Although massive data is quickly accumulating on protein sequence and structure, there is a small and limited number of protein architectural types (or structural folds). This study is addressing the following question: how well could one reveal underlying sequence-structure relationships and design protein sequences for an arbitrary, potentially novel, structural fold? In response to the question, we have developed novel deep generative models, namely, semisupervised gcWGAN (guided, conditional, Wasserstein Generative Adversarial Networks). To overcome training difficulties and improve design qualities, we build our models on conditional Wasserstein GAN (WGAN) that uses Wasserstein distance in the loss function. Our major contributions include (1) constructing a low-dimensional and generalizable representation of the fold space for the conditional input, (2) developing an ultrafast sequence-to-fold predictor (or oracle) and incorporating its feedback into WGAN as a loss to guide model training, and (3) exploiting sequence data with and without paired structures to enable a semisupervised training strategy. Assessed by the oracle over 100 novel folds not in the training set, gcWGAN generates more successful designs and covers 3.5 times more target folds compared to a competing data-driven method (cVAE). Assessed by sequence- and structure-based predictors, gcWGAN designs are physically and biologically sound. Assessed by a structure predictor over representative novel folds, including one not even part of basis folds, gcWGAN designs have comparable or better fold accuracy yet much more sequence diversity and novelty than cVAE. The ultrafast data-driven model is further shown to boost the success of a principle-driven de novo method (RosettaDesign), through generating design seeds and tailoring design space. In conclusion, gcWGAN explores uncharted sequence space to design proteins by learning generalizable principles from current sequence-structure data. 
Data, source codes, and trained models are available at https://github.com/Shen-Lab/gcWGAN.


Leone L, Chino M, Nastri F, Maglio O, Pavone V, Lombardi A. Mimochrome, a metalloporphyrin-based catalytic Swiss knife. Biotechnol Appl Biochem 2020;67:495-515. [DOI: 10.1002/bab.1985] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 06/05/2020] [Accepted: 07/09/2020] [Indexed: 12/20/2022]

Andreani J, Quignot C, Guerois R. Structural prediction of protein interactions and docking using conservation and coevolution. Wiley Interdiscip Rev Comput Mol Sci 2020. [DOI: 10.1002/wcms.1470] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Indexed: 12/12/2022]

Beeby M, Ferreira JL, Tripp P, Albers SV, Mitchell DR. Propulsive nanomachines: the convergent evolution of archaella, flagella and cilia. FEMS Microbiol Rev 2020;44:253-304. [DOI: 10.1093/femsre/fuaa006] [Citation(s) in RCA: 38] [Impact Index Per Article: 7.6] [Received: 05/23/2019] [Accepted: 03/06/2020] [Indexed: 02/06/2023] Open

Reading Targeted DNA Damage in the Active Demethylation Pathway: Role of Accessory Domains of Eukaryotic AP Endonucleases and Thymine-DNA Glycosylases. J Mol Biol 2020:S0022-2836(19)30720-X. [DOI: 10.1016/j.jmb.2019.12.020] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 09/09/2019] [Revised: 11/24/2019] [Accepted: 12/05/2019] [Indexed: 01/07/2023]

Heo L, Feig M. High-accuracy protein structures by combining machine-learning with physics-based refinement. Proteins 2019;88:637-642. [PMID: 31693199 DOI: 10.1002/prot.25847] [Citation(s) in RCA: 38] [Impact Index Per Article: 6.3] [Received: 09/02/2019] [Revised: 10/05/2019] [Accepted: 11/03/2019] [Indexed: 12/16/2022]

Verma R, Pandit SB. Unraveling the structural landscape of intra-chain domain interfaces: Implication in the evolution of domain-domain interactions. PLoS One 2019;14:e0220336. [PMID: 31374091 PMCID: PMC6677297 DOI: 10.1371/journal.pone.0220336] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 04/12/2019] [Accepted: 07/12/2019] [Indexed: 12/22/2022] Open

Abstract

Intra-chain domain interactions are known to play a significant role in the function and stability of multidomain proteins. These interactions are mediated through a physical interaction at domain-domain interfaces (DDIs). With a motivation to understand evolution of interfaces, we have investigated similarities among DDIs. Even though interfaces of protein-protein interactions (PPIs) have been previously studied by structurally aligning interfaces, similar analyses have not yet been performed on DDIs of either multidomain proteins or PPIs. For studying the structural landscape of DDIs, we have used iAlign to structurally align intra-chain domain interfaces of domains. The interface alignment of spatially constrained domains (due to inter-domain linkers) showed that ~88% of these could identify a structural matching interface having similar C-alpha geometry and contact pattern despite that aligned domain pairs are not structurally related. Moreover, the mean interface similarity score (IS-score) is 0.307, which is higher compared to the average random IS-score (0.207) suggesting domain interfaces are not random. The structural space of DDIs is highly connected as ~84% of all possible directed edges among interfaces are found to have at most path length of 8 when 0.26 is IS-score threshold. At this threshold, ~83% of interfaces form the largest strongly connected component. Thus, suggesting that structural space of intra-chain domain interfaces is degenerate and highly connected, as has been found in PPI interfaces. Interestingly, searching for structural neighbors of inter-chain interfaces among intra-chain interfaces showed that ~86% could find a statistically significant match to intra-chain interface with a mean IS-score of 0.311. This implies that domain interfaces are degenerate whether formed within a protein or between proteins. The interface degeneracy is most likely due to limited possible ways of packing secondary structures. 
In principle, interface similarities can be exploited to accurately model domain interfaces in structure prediction of multidomain proteins.
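
The connectivity analysis summarized in this abstract can be illustrated with a small sketch. It is not the authors' code: it simply treats each interface as a node, adds a directed edge wherever a similarity score meets a threshold, and reports the fraction of nodes in the largest strongly connected component. The function name `largest_scc_fraction`, the dictionary input format, and the toy scores are hypothetical; only the 0.26 threshold is taken from the abstract.

```python
from collections import defaultdict


def largest_scc_fraction(scores, threshold):
    """Fraction of nodes in the largest strongly connected component.

    scores: dict mapping (query, target) -> similarity score (hypothetical format).
    A directed edge query -> target is kept when score >= threshold.
    """
    nodes = set()
    fwd = defaultdict(list)  # forward adjacency
    rev = defaultdict(list)  # reversed adjacency
    for (a, b), s in scores.items():
        nodes.update((a, b))
        if a != b and s >= threshold:
            fwd[a].append(b)
            rev[b].append(a)

    # Kosaraju pass 1: iterative DFS on the forward graph, recording finish order.
    visited, order = set(), []
    for start in nodes:
        if start in visited:
            continue
        visited.add(start)
        stack = [(start, iter(fwd[start]))]
        while stack:
            node, neighbours = stack[-1]
            for nxt in neighbours:
                if nxt not in visited:
                    visited.add(nxt)
                    stack.append((nxt, iter(fwd[nxt])))
                    break
            else:
                order.append(node)
                stack.pop()

    # Kosaraju pass 2: sweep the reversed graph in decreasing finish order.
    assigned, best = set(), 0
    for start in reversed(order):
        if start in assigned:
            continue
        assigned.add(start)
        size, todo = 0, [start]
        while todo:
            node = todo.pop()
            size += 1
            for nxt in rev[node]:
                if nxt not in assigned:
                    assigned.add(nxt)
                    todo.append(nxt)
        best = max(best, size)
    return best / len(nodes) if nodes else 0.0


# Toy scores, invented for illustration only (not data from the paper).
toy_scores = {
    ("A", "B"): 0.30, ("B", "C"): 0.28, ("C", "A"): 0.27,
    ("C", "D"): 0.31, ("D", "A"): 0.10,
}
print(largest_scc_fraction(toy_scores, 0.26))
```

In the toy data, interfaces A, B, and C reach one another at the 0.26 threshold while D cannot reach back, so three of the four nodes fall in the largest component.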


Baiesi M, Orlandini E, Seno F, Trovato A. Sequence and structural patterns detected in entangled proteins reveal the importance of co-translational folding. Sci Rep 2019;9:8426. [PMID: 31182755 PMCID: PMC6557820 DOI: 10.1038/s41598-019-44928-3] [Citation(s) in RCA: 31] [Impact Index Per Article: 5.2] [Received: 02/22/2019] [Accepted: 05/23/2019] [Indexed: 11/09/2022] Open

Sequence Pattern for Supersecondary Structure of Sandwich-Like Proteins. Methods Mol Biol 2019. [PMID: 30945226 DOI: 10.1007/978-1-4939-9161-7_16] [Citation(s) in RCA: 0] [Impact Index Per Article: 0]

Catazaro J, Caprez A, Swanson D, Powers R. Functional Evolution of Proteins. Proteins 2019;87:492-501. [PMID: 30714210 DOI: 10.1002/prot.25670] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 05/16/2018] [Revised: 11/02/2018] [Accepted: 01/31/2019] [Indexed: 11/12/2022]

Experimental accuracy in protein structure refinement via molecular dynamics simulations. Proc Natl Acad Sci U S A 2018;115:13276-13281. [PMID: 30530696 DOI: 10.1073/pnas.1811364115] [Citation(s) in RCA: 55] [Impact Index Per Article: 7.9] [Indexed: 12/22/2022] Open

Navigating Among Known Structures in Protein Space. Methods Mol Biol 2018. [PMID: 30298400 DOI: 10.1007/978-1-4939-8736-8_12] [Citation(s) in RCA: 0] [Impact Index Per Article: 0]

Abstract

Present-day protein space is the result of 3.7 billion years of evolution, constrained by the underlying physicochemical qualities of the proteins. It is difficult to differentiate between evolutionary traces and effects of physicochemical constraints. Nonetheless, as a rule of thumb, instances of structural reuse, or focusing on structural similarity, are likely attributable to physicochemical constraints, whereas sequence reuse, or focusing on sequence similarity, may be more indicative of evolutionary relationships. Both types of relationships have been studied and can provide meaningful insights to protein biophysics and evolution, which in turn can lead to better algorithms for protein search, annotation, and maybe even design.
In broad strokes, studies of protein space vary in the entities they represent, the similarity measure comparing these entities, and the representation used. The entities can be, for example, protein chains, domains, supra-domains, or smaller protein sub-parts denoted themes. The measures of similarity between the entities can be based on sequence, structure, function, or any combination of these. The representation can be global, encompassing the whole space, or local, focusing on a particular region surrounding protein(s) of interest. Global representations include lists of grouped proteins, protein networks, and maps. Networks are the abstraction that is derived most directly from the similarity data: each node is the protein entity (e.g., a domain), and edges connect similar domains. Selecting the entities, the similarity measure, and the abstraction are three intertwined decisions: the similarity measures allow us to identify the entities, and the selection of entities influences what is a meaningful similarity measure. Similarly, we seek entities that are related to each other in a way, for which a simple representation describes their relationships succinctly and accurately.
This chapter will cover studies that rely on different entities, similarity measures, and a range of representations to better understand protein structure space. Scholars may use publicly available navigators offering a global representation, and in particular the hierarchical classifications SCOP, CATH, and ECOD, or a local representation, which encompass structural alignment algorithms. Alternatively, scholars can configure their own navigator using existing tools. To demonstrate this DIY (do it yourself) approach for navigating in protein space, we investigate substrate-binding proteins. By presenting sequence similarities among this large and diverse protein family as a network, we can infer that one member (pdb ID 4ntl; of yet unknown function) may bind methionine and suggest a putative binding mechanism.


Hu G, Wang K, Song J, Uversky VN, Kurgan L. Taxonomic Landscape of the Dark Proteomes: Whole-Proteome Scale Interplay Between Structural Darkness, Intrinsic Disorder, and Crystallization Propensity. Proteomics 2018;18:e1800243. [PMID: 30198635 DOI: 10.1002/pmic.201800243] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.7] [Received: 07/04/2018] [Revised: 08/30/2018] [Indexed: 12/14/2022]

Unique function words characterize genomic proteins. Proc Natl Acad Sci U S A 2018;115:6703-6708. [PMID: 29895692 PMCID: PMC6042118 DOI: 10.1073/pnas.1801182115] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Indexed: 12/11/2022] Open

Abstract

The vast, mostly unknown protein universe can be explored by analyzing protein sequences as a string of domains. A broader coverage can be achieved when these domains, the essential blocks in protein evolution, are detected using sequence profiles. Using clustering to collapse redundant profiles into unique function words (UFWs), we find that over the years 2009–2016, the number of UFWs saturates while the number of sequences matched by a combination of two or more UFWs grows exponentially.

Between 2009 and 2016 the number of protein sequences from known species increased 10-fold from 8 million to 85 million. About 80% of these sequences contain at least one region recognized by the conserved domain architecture retrieval tool (CDART) as a sequence motif. Motifs provide clues to biological function but CDART often matches the same region of a protein by two or more profiles. Such synonyms complicate estimates of functional complexity. We do full-linkage clustering of redundant profiles by finding maximum disjoint cliques: Each cluster is replaced by a single representative profile to give what we term a unique function word (UFW). From 2009 to 2016, the number of sequence profiles used by CDART increased by 80%; the number of UFWs increased more slowly by 30%, indicating that the number of UFWs may be saturating. The number of sequences matched by a single UFW (sequences with single domain architectures) increased as slowly as the number of different words, whereas the number of sequences matched by a combination of two or more UFWs in sequences with multiple domain architectures (MDAs) increased at the same rate as the total number of sequences. This combinatorial arrangement of a limited number of UFWs in MDAs accounts for the genomic diversity of protein sequences. Although eukaryotes and prokaryotes use very similar sets of “words” or UFWs (57% shared), the “sentences” (MDAs) are different (1.3% shared).


C L B, S Nair A. Benchmark Dataset for Whole Genome Sequence Compression. IEEE/ACM Trans Comput Biol Bioinform 2017;14:1228-1236. [PMID: 27214907 DOI: 10.1109/tcbb.2016.2568186] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Indexed: 06/05/2023]

Complex evolutionary footprints revealed in an analysis of reused protein segments of diverse lengths. Proc Natl Acad Sci U S A 2017;114:11703-11708. [PMID: 29078314 PMCID: PMC5676897 DOI: 10.1073/pnas.1707642114] [Citation(s) in RCA: 55] [Impact Index Per Article: 6.9] [Indexed: 01/13/2023] Open

Abstract

We question a central paradigm: namely, that the protein domain is the “atomic unit” of evolution. In conflict with the current textbook view, our results unequivocally show that duplication of protein segments happens both above and below the domain level among amino acid segments of diverse lengths. Indeed, we show that significant evolutionary information is lost when the protein is approached as a string of domains. Our finer-grained approach reveals a far more complicated picture, where reused segments often intertwine and overlap with each other. Our results are consistent with a recursive model of evolution, in which segments of various lengths, typically smaller than domains, “hop” between environments. The fit segments remain, leaving traces that can still be detected.

Proteins share similar segments with one another. Such “reused parts”—which have been successfully incorporated into other proteins—are likely to offer an evolutionary advantage over de novo evolved segments, as most of the latter will not even have the capacity to fold. To systematically explore the evolutionary traces of segment “reuse” across proteins, we developed an automated methodology that identifies reused segments from protein alignments. We search for “themes”—segments of at least 35 residues of similar sequence and structure—reused within representative sets of 15,016 domains [Evolutionary Classification of Protein Domains (ECOD) database] or 20,398 chains [Protein Data Bank (PDB)]. We observe that theme reuse is highly prevalent and that reuse is more extensive when the length threshold for identifying a theme is lower. Structural domains, the best characterized form of reuse in proteins, are just one of many complex and intertwined evolutionary traces. Others include long themes shared among a few proteins, which encompass and overlap with shorter themes that recur in numerous proteins. The observed complexity is consistent with evolution by duplication and divergence, and some of the themes might include descendants of ancestral segments. The observed recursive footprints, where the same amino acid can simultaneously participate in several intertwined themes, could be a useful concept for protein design. Data are available at http://trachel-srv.cs.haifa.ac.il/rachel/ppi/themes/.


Ahrens JB, Nunez-Castilla J, Siltberg-Liberles J. Evolution of intrinsic disorder in eukaryotic proteins. Cell Mol Life Sci 2017;74:3163-3174. [PMID: 28597295 PMCID: PMC11107722 DOI: 10.1007/s00018-017-2559-0] [Citation(s) in RCA: 51] [Impact Index Per Article: 6.4] [Received: 05/17/2017] [Accepted: 06/01/2017] [Indexed: 12/23/2022]

Levy Y. Protein Assembly and Building Blocks: Beyond the Limits of the LEGO Brick Metaphor. Biochemistry 2017;56:5040-5048. [PMID: 28809494 DOI: 10.1021/acs.biochem.7b00666] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.1] [Indexed: 12/18/2022]

Ghosh A, Ostrander JS, Zanni MT. Watching Proteins Wiggle: Mapping Structures with Two-Dimensional Infrared Spectroscopy. Chem Rev 2017;117:10726-10759. [PMID: 28060489 PMCID: PMC5500453 DOI: 10.1021/acs.chemrev.6b00582] [Citation(s) in RCA: 192] [Impact Index Per Article: 24.0] [Indexed: 11/30/2022]

Olivares-Quiroz L. Protein folding and unfolding pathways: The role of energy barriers, configurational entropy and internal energy: Comment on "There and back again: Two views on the protein folding puzzle" by Alexei V. Finkelstein et al. Phys Life Rev 2017;21:75-76. [PMID: 28602717 DOI: 10.1016/j.plrev.2017.06.004] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 06/01/2017] [Accepted: 06/02/2017] [Indexed: 12/01/2022]

Repurposing proteins for new bioinorganic functions. Essays Biochem 2017;61:245-258. [DOI: 10.1042/ebc20160068] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Received: 12/01/2016] [Revised: 01/17/2017] [Accepted: 01/23/2017] [Indexed: 02/06/2023]

Exploring the dark foldable proteome by considering hydrophobic amino acids topology. Sci Rep 2017;7:41425. [PMID: 28134276 PMCID: PMC5278394 DOI: 10.1038/srep41425] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Received: 09/12/2016] [Accepted: 12/19/2016] [Indexed: 12/18/2022] Open

K-nearest uphill clustering in the protein structure space. Neurocomputing 2017. [DOI: 10.1016/j.neucom.2016.04.065] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Indexed: 12/23/2022]

Emergence of de novo proteins from 'dark genomic matter' by 'grow slow and moult'. Biochem Soc Trans 2016;43:867-73. [PMID: 26517896 DOI: 10.1042/bst20150089] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.8] [Indexed: 11/17/2022]

Serrano P, Dutta SK, Proudfoot A, Mohanty B, Susac L, Martin B, Geralt M, Jaroszewski L, Godzik A, Elsliger M, Wilson IA, Wüthrich K. NMR in structural genomics to increase structural coverage of the protein universe: Delivered by Prof. Kurt Wüthrich on 7 July 2013 at the 38th FEBS Congress in St. Petersburg, Russia. FEBS J 2016;283:3870-3881. [PMID: 27154589 DOI: 10.1111/febs.13751] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 02/23/2016] [Revised: 04/12/2016] [Accepted: 05/04/2016] [Indexed: 12/12/2022]

Abstract

For more than a decade, the Joint Center for Structural Genomics (JCSG; www.jcsg.org) worked toward increased three-dimensional structure coverage of the protein universe. This coordinated quest was one of the main goals of the four high-throughput (HT) structure determination centers of the Protein Structure Initiative (PSI; www.nigms.nih.gov/Research/specificareas/PSI). To achieve the goals of the PSI, the JCSG made use of the complementarity of structure determination by X-ray crystallography and nuclear magnetic resonance (NMR) spectroscopy to increase and diversify the range of targets entering the HT structure determination pipeline. The overall strategy, for both techniques, was to determine atomic resolution structures for representatives of large protein families, as defined by the Pfam database, which had no structural coverage and could make significant contributions to biological and biomedical research. Furthermore, the experimental structures could be leveraged by homology modeling to further expand the structural coverage of the protein universe and increase biological insights. Here, we describe what could be achieved by this structural genomics approach, using as an illustration the contributions from 20 NMR structure determinations out of a total of 98 JCSG NMR structures, which were selected because they are the first three-dimensional structure representations of the respective Pfam protein families. The information from this small sample is representative of the overall results from crystal and NMR structure determination in the JCSG. There are five new folds, which were classified as domains of unknown function (DUFs); three of the proteins could be functionally annotated based on three-dimensional structure similarity with previously characterized proteins; and 12 proteins showed only limited similarity with previous deposits in the Protein Data Bank (PDB) and were classified as DUFs.


Affiliation(s)

Pedro Serrano: Joint Center for Structural Genomics, The Scripps Research Institute, La Jolla, CA, USA; Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, USA
Samit K Dutta: Joint Center for Structural Genomics, The Scripps Research Institute, La Jolla, CA, USA; Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, USA
Andrew Proudfoot: Joint Center for Structural Genomics, The Scripps Research Institute, La Jolla, CA, USA; Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, USA
Biswaranjan Mohanty: Joint Center for Structural Genomics, The Scripps Research Institute, La Jolla, CA, USA; Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, USA; Skaggs Institute for Chemical Biology, The Scripps Research Institute, La Jolla, CA, USA
Lukas Susac: Joint Center for Structural Genomics, The Scripps Research Institute, La Jolla, CA, USA; Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, USA
Bryan Martin: Joint Center for Structural Genomics, The Scripps Research Institute, La Jolla, CA, USA; Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, USA
Michael Geralt: Joint Center for Structural Genomics, The Scripps Research Institute, La Jolla, CA, USA; Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, USA
Lukasz Jaroszewski: Joint Center for Structural Genomics, The Scripps Research Institute, La Jolla, CA, USA; Program on Bioinformatics and Systems Biology, Sanford Burnham Prebys Medical Discovery Institute, La Jolla, CA, USA
Adam Godzik: Joint Center for Structural Genomics, The Scripps Research Institute, La Jolla, CA, USA; Program on Bioinformatics and Systems Biology, Sanford Burnham Prebys Medical Discovery Institute, La Jolla, CA, USA
Marc Elsliger: Joint Center for Structural Genomics, The Scripps Research Institute, La Jolla, CA, USA; Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, USA
Ian A Wilson: Joint Center for Structural Genomics, The Scripps Research Institute, La Jolla, CA, USA; Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, USA; Skaggs Institute for Chemical Biology, The Scripps Research Institute, La Jolla, CA, USA
Kurt Wüthrich: Joint Center for Structural Genomics, The Scripps Research Institute, La Jolla, CA, USA; Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, USA; Skaggs Institute for Chemical Biology, The Scripps Research Institute, La Jolla, CA, USA


Using natural sequences and modularity to design common and novel protein topologies. Curr Opin Struct Biol 2016;38:26-36. [PMID: 27270240 DOI: 10.1016/j.sbi.2016.05.007] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 03/01/2016] [Revised: 05/13/2016] [Accepted: 05/18/2016] [Indexed: 02/07/2023]

Xu J, Zhang J. Impact of structure space continuity on protein fold classification. Sci Rep 2016;6:23263. [PMID: 27006112 PMCID: PMC4804218 DOI: 10.1038/srep23263] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 11/24/2015] [Accepted: 03/03/2016] [Indexed: 11/09/2022] Open

Fox NK, Brenner SE, Chandonia JM. The value of protein structure classification information-Surveying the scientific literature. Proteins 2015;83:2025-38. [PMID: 26313554 PMCID: PMC4609302 DOI: 10.1002/prot.24915] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.4] [Received: 05/08/2015] [Revised: 08/06/2015] [Accepted: 08/18/2015] [Indexed: 11/08/2022]