Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Rost B. Review: protein secondary structure prediction continues to rise. J Struct Biol 2001;134:204-18. [PMID: 11551180 DOI: 10.1006/jsbi.2001.4336] [Citation(s) in RCA: 314] [Impact Index Per Article: 13.1] [Indexed: 11/22/2022]


Cited by Other Article(s)

Luo Y, Zheng X, Qiu M, Gou Y, Yang Z, Qu X, Chen Z, Lin Y. Deep learning and its applications in nuclear magnetic resonance spectroscopy. Progress in Nuclear Magnetic Resonance Spectroscopy 2025;146-147:101556. [PMID: 40306798 DOI: 10.1016/j.pnmrs.2024.101556] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 06/19/2024] [Revised: 12/26/2024] [Accepted: 12/30/2024] [Indexed: 05/02/2025]

Weissenow K, Rost B. Are protein language models the new universal key? Curr Opin Struct Biol 2025;91:102997. [PMID: 39921962 DOI: 10.1016/j.sbi.2025.102997] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/02/2024] [Revised: 12/20/2024] [Accepted: 01/16/2025] [Indexed: 02/10/2025]

Zhang J, Qian J, Zou Q, Zhou F, Kurgan L. Recent Advances in Computational Prediction of Secondary and Supersecondary Structures from Protein Sequences. Methods Mol Biol 2025;2870:1-19. [PMID: 39543027 DOI: 10.1007/978-1-0716-4213-9_1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/17/2024]

Badaczewska-Dawid AE, Kolinski A. Importance of Secondary Structure Data in Large Scale Protein Modeling Using Low-Resolution SURPASS Method. Methods Mol Biol 2025;2867:55-78. [PMID: 39576575 DOI: 10.1007/978-1-0716-4196-5_4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/24/2024]

Flamholz ZN, Li C, Kelly L. Improving viral annotation with artificial intelligence. mBio 2024;15:e0320623. [PMID: 39230289 PMCID: PMC11481560 DOI: 10.1128/mbio.03206-23] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/05/2024] Open

Sanjeevi M, Mohan A, Ramachandran D, Jeyaraman J, Sekar K. CSSP-2.0: A refined consensus method for accurate protein secondary structure prediction. Comput Biol Chem 2024;112:108158. [PMID: 39053174 DOI: 10.1016/j.compbiolchem.2024.108158] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/23/2023] [Revised: 06/19/2024] [Accepted: 07/18/2024] [Indexed: 07/27/2024]

Heinzinger M, Rost B. Artificial Intelligence Learns Protein Prediction. Cold Spring Harb Perspect Biol 2024;16:a041458. [PMID: 38858069 PMCID: PMC11368192 DOI: 10.1101/cshperspect.a041458] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/12/2024]

Shu P, You G, Li W, Chen Y, Chu Z, Qin D, Wang Y, Zhou H, Zhao L. Cefmetazole sodium as an allosteric effector that regulates the oxygen supply efficiency of adult hemoglobin. J Biomol Struct Dyn 2024;42:7442-7456. [PMID: 37555593 DOI: 10.1080/07391102.2023.2245043] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/15/2023] [Accepted: 07/17/2023] [Indexed: 08/10/2023]

Mikulka J, Sen MK, Košnarová P, Hamouz P, Hamouzová K, Sur VP, Šuk J, Bhattacharya S, Soukup J. Molecular Mechanisms of Resistance against PSII-Inhibiting Herbicides in Amaranthus retroflexus from the Czech Republic. Genes (Basel) 2024;15:904. [PMID: 39062683 PMCID: PMC11275581 DOI: 10.3390/genes15070904] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/24/2024] [Revised: 06/25/2024] [Accepted: 07/09/2024] [Indexed: 07/28/2024] Open

Abstract

Amaranthus retroflexus L. (redroot pigweed) is one of the most problematic weeds in maize, sugar beet, vegetables, and soybean crop fields in Europe. Two pigweed amaranth biotypes (R1 and R2) from the Czech Republic resistant to photosystem II (PSII)-inhibiting herbicides were analyzed in this study. This study aimed to identify the genetic mechanisms that underlie the resistance observed in the biotypes. Additionally, we also intended to establish the use of chlorophyll fluorescence measurement as a rapid and reliable method for confirming herbicide resistance in this weed species. Both biotypes analyzed showed high resistance factors in a dose-response study and were thus confirmed to be resistant to PSII-inhibiting herbicides. A sequence analysis of the D1 protein revealed a well-known Ser-Gly substitution at amino acid position 264 in both biotypes. Molecular docking studies, along with the wild-type and mutant D1 protein's secondary structure analyses, revealed that the S264G mutation did not reduce herbicide affinity but instead indirectly affected the interaction between the target protein and the herbicides. The current study identified the S264G mutation as being responsible for conferring herbicide resistance in the pigweed amaranth biotypes. These findings can provide a strong basis for future studies that might use protein structure and mutation-based approaches to gain further insights into the detailed mechanisms of resistance in this weed species. In many individuals from both biotypes, resistance at a very early stage (BBCH10) of plants was demonstrated several hours after the application of the active ingredients by the chlorophyll fluorescence method. The effective PS II quantum yield parameter can be used as a rapid diagnostic tool for distinguishing between sensitive and resistant plants on an individual level. 
This method can be useful for identifying herbicide-resistant weed biotypes in the field, which can help farmers and weed management practitioners develop more effective weed control tactics.


Affiliation(s)

Jakub Mikulka Department of Agroecology and Crop Production, Faculty of Agrobiology, Food and Natural Resources, Czech University of Life Sciences Prague, Kamýcká 1176, 165 00 Prague, Czech Republic
Madhab Kumar Sen Department of Agroecology and Crop Production, Faculty of Agrobiology, Food and Natural Resources, Czech University of Life Sciences Prague, Kamýcká 1176, 165 00 Prague, Czech Republic
Pavlína Košnarová Department of Agroecology and Crop Production, Faculty of Agrobiology, Food and Natural Resources, Czech University of Life Sciences Prague, Kamýcká 1176, 165 00 Prague, Czech Republic
Pavel Hamouz Department of Agroecology and Crop Production, Faculty of Agrobiology, Food and Natural Resources, Czech University of Life Sciences Prague, Kamýcká 1176, 165 00 Prague, Czech Republic
Kateřina Hamouzová Department of Agroecology and Crop Production, Faculty of Agrobiology, Food and Natural Resources, Czech University of Life Sciences Prague, Kamýcká 1176, 165 00 Prague, Czech Republic
Vishma Pratap Sur Institute of Microbiology, The Czech Academy of Sciences, Centre Algatech, Novohradská 237-Opatovický Mlýn, 379 01 Třebon, Czech Republic
Jaromír Šuk Department of Agroecology and Crop Production, Faculty of Agrobiology, Food and Natural Resources, Czech University of Life Sciences Prague, Kamýcká 1176, 165 00 Prague, Czech Republic
Soham Bhattacharya Department of Agroecology and Crop Production, Faculty of Agrobiology, Food and Natural Resources, Czech University of Life Sciences Prague, Kamýcká 1176, 165 00 Prague, Czech Republic
Josef Soukup Department of Agroecology and Crop Production, Faculty of Agrobiology, Food and Natural Resources, Czech University of Life Sciences Prague, Kamýcká 1176, 165 00 Prague, Czech Republic


Spadaro A, Sharma A, Dehzangi I. Predicting lysine methylation sites using a convolutional neural network. Methods 2024;226:127-132. [PMID: 38604414 DOI: 10.1016/j.ymeth.2024.04.007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/03/2023] [Revised: 12/15/2023] [Accepted: 04/07/2024] [Indexed: 04/13/2024] Open

Broz M, Jukič M, Bren U. Naive Prediction of Protein Backbone Phi and Psi Dihedral Angles Using Deep Learning. Molecules 2023;28:7046. [PMID: 37894526 PMCID: PMC10609058 DOI: 10.3390/molecules28207046] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/01/2023] [Revised: 10/06/2023] [Accepted: 10/09/2023] [Indexed: 10/29/2023] Open

Jin C, Patel A, Peters J, Hodawadekar S, Kalyanaraman R. Quantum Cascade Laser Based Infrared Spectroscopy: A New Paradigm for Protein Secondary Structure Measurement. Pharm Res 2023;40:1507-1517. [PMID: 36329374 DOI: 10.1007/s11095-022-03422-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/03/2022] [Accepted: 10/19/2022] [Indexed: 11/06/2022]

Shea A, Bartz J, Zhang L, Dong X. Predicting mutational function using machine learning. Mutation Research. Reviews in Mutation Research 2023;791:108457. [PMID: 36965820 PMCID: PMC10239318 DOI: 10.1016/j.mrrev.2023.108457] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/23/2022] [Revised: 03/11/2023] [Accepted: 03/20/2023] [Indexed: 03/27/2023]

Mufassirin MMM, Newton MAH, Sattar A. Artificial intelligence for template-free protein structure prediction: a comprehensive review. Artif Intell Rev 2022. [DOI: 10.1007/s10462-022-10350-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/23/2022]

Ismi DP, Pulungan R, Afiahayati. Deep learning for protein secondary structure prediction: Pre and post-AlphaFold. Comput Struct Biotechnol J 2022;20:6271-6286. [PMID: 36420164 PMCID: PMC9678802 DOI: 10.1016/j.csbj.2022.11.012] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 09/06/2022] [Revised: 11/05/2022] [Accepted: 11/05/2022] [Indexed: 11/13/2022] Open

Nacar C. Propensities of Some Amino Acid Pairings in α-Helices Vary with Length. Protein J 2022;41:551-562. [PMID: 36169766 DOI: 10.1007/s10930-022-10076-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 09/15/2022] [Indexed: 11/29/2022]

A multifaceted strategy to improve recombinant expression and structural characterisation of a Trypanosoma invariant surface protein. Sci Rep 2022;12:12706. [PMID: 35882923 PMCID: PMC9325691 DOI: 10.1038/s41598-022-16958-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/23/2022] [Accepted: 07/19/2022] [Indexed: 11/16/2022] Open

Biró B, Zhao B, Kurgan L. Complementarity of the residue-level protein function and structure predictions in human proteins. Comput Struct Biotechnol J 2022;20:2223-2234. [PMID: 35615015 PMCID: PMC9118482 DOI: 10.1016/j.csbj.2022.05.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/21/2022] [Revised: 05/02/2022] [Accepted: 05/02/2022] [Indexed: 11/24/2022] Open

Pritam M, Singh G, Kumar R, Singh SP. Screening of potential antigens from whole proteome and development of multi-epitope vaccine against Rhizopus delemar using immunoinformatics approaches. J Biomol Struct Dyn 2022;41:2118-2145. [PMID: 35067195 DOI: 10.1080/07391102.2022.2028676] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 01/14/2023]

Newton MAH, Mataeimoghadam F, Zaman R, Sattar A. Secondary structure specific simpler prediction models for protein backbone angles. BMC Bioinformatics 2022;23:6. [PMID: 34983370 PMCID: PMC8728911 DOI: 10.1186/s12859-021-04525-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/24/2021] [Accepted: 12/07/2021] [Indexed: 11/10/2022] Open

Abstract

Motivation

Deep learning methods have significantly improved the accuracy of protein backbone angle prediction. Usually, the same deep learning model makes predictions for all residues, regardless of the secondary structure category they belong to. In this paper, we propose to train a separate deep learning model for each category of secondary structure. Machine learning methods strive for generality over the training examples and consequently lose accuracy. In this work, we explicitly exploit classification knowledge to restrict generalisation to a specific class of training examples. This compensates for the loss of generalisation by exploiting specialisation knowledge in an informed way.

Results

The new method named SAP4SS obtains mean absolute error (MAE) values of 15.59, 18.87, 6.03, and 21.71 respectively for four types of backbone angles ϕ, ψ, θ, and τ. Consequently, SAP4SS significantly outperforms existing state-of-the-art methods SAP, OPUS-TASS, and SPOT-1D: the differences in MAE for all four types of angles are from 1.5 to 4.1% compared to the best known results.
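As a rough illustration of the metric reported in these results, the sketch below computes a mean absolute error over angles in degrees. It assumes the common convention of measuring each error along the shorter arc of the circle, so that 175 and -175 differ by 10 degrees rather than 350; whether SAP4SS uses exactly this convention is an assumption, and the function names and sample angles are hypothetical, not taken from the paper.

```python
def angular_abs_error(pred_deg, true_deg):
    """Smallest absolute difference between two angles in degrees (0 to 180)."""
    diff = abs(pred_deg - true_deg) % 360.0
    return min(diff, 360.0 - diff)


def angular_mae(predicted, observed):
    """Mean absolute error over paired angle lists, with wrap-around."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("need two non-empty lists of equal length")
    errors = [angular_abs_error(p, o) for p, o in zip(predicted, observed)]
    return sum(errors) / len(errors)


# Hypothetical phi angles (degrees) for three residues; not data from the paper.
predicted_phi = [-60.0, 175.0, -120.0]
observed_phi = [-65.0, -175.0, -110.0]
print(round(angular_mae(predicted_phi, observed_phi), 2))
```

Without the wrap-around step, the middle pair would contribute 350 degrees instead of 10 and dominate the average, which is why periodic quantities such as dihedral angles are usually scored this way.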

Availability

SAP4SS along with its data is available from https://gitlab.com/mahnewton/sap4ss.


Miao Z, Wang Q, Xiao X, Kamal GM, Song L, Zhang X, Li C, Zhou X, Jiang B, Liu M. CSI-LSTM: a web server to predict protein secondary structure using bidirectional long short term memory and NMR chemical shifts. Journal of Biomolecular NMR 2021;75:393-400. [PMID: 34510297 DOI: 10.1007/s10858-021-00383-9] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 06/07/2021] [Accepted: 09/06/2021] [Indexed: 06/13/2023]

Affiliation(s)

Zhiwei Miao Wuhan National Laboratory for Optoelectronics, Huazhong University of Science and Technology, Key Laboratory of Magnetic Resonance in Biological Systems, State Key Laboratory of Magnetic Resonance and Atomic and Molecular Physics, National Center for Magnetic Resonance in Wuhan, Wuhan Institute of Physics and Mathematics, Innovation Academy for Precision Measurement Science and Technology, Chinese Academy of Sciences, 430071, Wuhan, China
Qianqian Wang Wuhan National Laboratory for Optoelectronics, Huazhong University of Science and Technology, Key Laboratory of Magnetic Resonance in Biological Systems, State Key Laboratory of Magnetic Resonance and Atomic and Molecular Physics, National Center for Magnetic Resonance in Wuhan, Wuhan Institute of Physics and Mathematics, Innovation Academy for Precision Measurement Science and Technology, Chinese Academy of Sciences, 430071, Wuhan, China
Xiongjie Xiao Wuhan National Laboratory for Optoelectronics, Huazhong University of Science and Technology, Key Laboratory of Magnetic Resonance in Biological Systems, State Key Laboratory of Magnetic Resonance and Atomic and Molecular Physics, National Center for Magnetic Resonance in Wuhan, Wuhan Institute of Physics and Mathematics, Innovation Academy for Precision Measurement Science and Technology, Chinese Academy of Sciences, 430071, Wuhan, China
Ghulam Mustafa Kamal Department of Chemistry, Khwaja Fareed University of Engineering & Information Technology, Rahim Yar Khan, Punjab, 64200, Pakistan
Linhong Song Wuhan National Laboratory for Optoelectronics, Huazhong University of Science and Technology, Key Laboratory of Magnetic Resonance in Biological Systems, State Key Laboratory of Magnetic Resonance and Atomic and Molecular Physics, National Center for Magnetic Resonance in Wuhan, Wuhan Institute of Physics and Mathematics, Innovation Academy for Precision Measurement Science and Technology, Chinese Academy of Sciences, 430071, Wuhan, China University of Chinese Academy of Sciences, Beijing, 10049, China
Xu Zhang Wuhan National Laboratory for Optoelectronics, Huazhong University of Science and Technology, Key Laboratory of Magnetic Resonance in Biological Systems, State Key Laboratory of Magnetic Resonance and Atomic and Molecular Physics, National Center for Magnetic Resonance in Wuhan, Wuhan Institute of Physics and Mathematics, Innovation Academy for Precision Measurement Science and Technology, Chinese Academy of Sciences, 430071, Wuhan, China University of Chinese Academy of Sciences, Beijing, 10049, China
Conggang Li Wuhan National Laboratory for Optoelectronics, Huazhong University of Science and Technology, Key Laboratory of Magnetic Resonance in Biological Systems, State Key Laboratory of Magnetic Resonance and Atomic and Molecular Physics, National Center for Magnetic Resonance in Wuhan, Wuhan Institute of Physics and Mathematics, Innovation Academy for Precision Measurement Science and Technology, Chinese Academy of Sciences, 430071, Wuhan, China University of Chinese Academy of Sciences, Beijing, 10049, China
Xin Zhou Wuhan National Laboratory for Optoelectronics, Huazhong University of Science and Technology, Key Laboratory of Magnetic Resonance in Biological Systems, State Key Laboratory of Magnetic Resonance and Atomic and Molecular Physics, National Center for Magnetic Resonance in Wuhan, Wuhan Institute of Physics and Mathematics, Innovation Academy for Precision Measurement Science and Technology, Chinese Academy of Sciences, 430071, Wuhan, China University of Chinese Academy of Sciences, Beijing, 10049, China
Bin Jiang Wuhan National Laboratory for Optoelectronics, Huazhong University of Science and Technology, Key Laboratory of Magnetic Resonance in Biological Systems, State Key Laboratory of Magnetic Resonance and Atomic and Molecular Physics, National Center for Magnetic Resonance in Wuhan, Wuhan Institute of Physics and Mathematics, Innovation Academy for Precision Measurement Science and Technology, Chinese Academy of Sciences, 430071, Wuhan, China. University of Chinese Academy of Sciences, Beijing, 10049, China.
Maili Liu Wuhan National Laboratory for Optoelectronics, Huazhong University of Science and Technology, Key Laboratory of Magnetic Resonance in Biological Systems, State Key Laboratory of Magnetic Resonance and Atomic and Molecular Physics, National Center for Magnetic Resonance in Wuhan, Wuhan Institute of Physics and Mathematics, Innovation Academy for Precision Measurement Science and Technology, Chinese Academy of Sciences, 430071, Wuhan, China. University of Chinese Academy of Sciences, Beijing, 10049, China.


Narayanan A, Dhinojwala A, Joy A. Design principles for creating synthetic underwater adhesives. Chem Soc Rev 2021;50:13321-13345. [PMID: 34751690 DOI: 10.1039/d1cs00316j] [Citation(s) in RCA: 49] [Impact Index Per Article: 12.3] [Indexed: 01/03/2023]

Moffat L, Jones DT. Increasing the accuracy of single sequence prediction methods using a deep semi-supervised learning framework. Bioinformatics 2021;37:3744-3751. [PMID: 34213528 PMCID: PMC8570780 DOI: 10.1093/bioinformatics/btab491] [Citation(s) in RCA: 31] [Impact Index Per Article: 7.8] [Received: 03/30/2021] [Revised: 06/08/2021] [Accepted: 06/30/2021] [Indexed: 11/14/2022] Open

Ho CT, Huang YW, Chen TR, Lo CH, Lo WC. Discovering the Ultimate Limits of Protein Secondary Structure Prediction. Biomolecules 2021;11:1627. [PMID: 34827624 PMCID: PMC8615938 DOI: 10.3390/biom11111627] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 09/21/2021] [Revised: 10/25/2021] [Accepted: 10/28/2021] [Indexed: 12/29/2022] Open

Chen TR, Juan SH, Huang YW, Lin YC, Lo WC. A secondary structure-based position-specific scoring matrix applied to the improvement in protein secondary structure prediction. PLoS One 2021;16:e0255076. [PMID: 34320027 PMCID: PMC8318245 DOI: 10.1371/journal.pone.0255076] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 05/21/2020] [Accepted: 07/11/2021] [Indexed: 11/18/2022] Open

Goodswen SJ, Kennedy PJ, Ellis JT. Predicting Protein Therapeutic Candidates for Bovine Babesiosis Using Secondary Structure Properties and Machine Learning. Front Genet 2021;12:716132. [PMID: 34367264 PMCID: PMC8343536 DOI: 10.3389/fgene.2021.716132] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/28/2021] [Accepted: 06/28/2021] [Indexed: 12/02/2022] Open

Bernhofer M, Dallago C, Karl T, Satagopam V, Heinzinger M, Littmann M, Olenyi T, Qiu J, Schütze K, Yachdav G, Ashkenazy H, Ben-Tal N, Bromberg Y, Goldberg T, Kajan L, O’Donoghue S, Sander C, Schafferhans A, Schlessinger A, Vriend G, Mirdita M, Gawron P, Gu W, Jarosz Y, Trefois C, Steinegger M, Schneider R, Rost B. PredictProtein - Predicting Protein Structure and Function for 29 Years. Nucleic Acids Res 2021;49:W535-W540. [PMID: 33999203 PMCID: PMC8265159 DOI: 10.1093/nar/gkab354] [Citation(s) in RCA: 166] [Impact Index Per Article: 41.5] [Received: 02/23/2021] [Revised: 04/06/2021] [Accepted: 05/10/2021] [Indexed: 12/12/2022] Open

Affiliation(s)

Michael Bernhofer TUM (Technical University of Munich) Department of Informatics, Bioinformatics & Computational Biology - i12, Boltzmannstr 3, 85748 Garching/Munich, Germany TUM Graduate School CeDoSIA, Boltzmannstr 11, 85748 Garching, Germany
Christian Dallago TUM (Technical University of Munich) Department of Informatics, Bioinformatics & Computational Biology - i12, Boltzmannstr 3, 85748 Garching/Munich, Germany TUM Graduate School CeDoSIA, Boltzmannstr 11, 85748 Garching, Germany
Tim Karl TUM (Technical University of Munich) Department of Informatics, Bioinformatics & Computational Biology - i12, Boltzmannstr 3, 85748 Garching/Munich, Germany
Venkata Satagopam Luxembourg Centre For Systems Biomedicine (LCSB), University of Luxembourg, Campus Belval, House of Biomedicine II, 6 avenue du Swing, L-4367 Belvaux, Luxembourg ELIXIR Luxembourg (ELIXIR-LU) Node, University of Luxembourg, Campus Belval, House of Biomedicine II, 6 avenue du Swing, L-4367 Belvaux, Luxembourg
Michael Heinzinger TUM (Technical University of Munich) Department of Informatics, Bioinformatics & Computational Biology - i12, Boltzmannstr 3, 85748 Garching/Munich, Germany TUM Graduate School CeDoSIA, Boltzmannstr 11, 85748 Garching, Germany
Maria Littmann TUM (Technical University of Munich) Department of Informatics, Bioinformatics & Computational Biology - i12, Boltzmannstr 3, 85748 Garching/Munich, Germany TUM Graduate School CeDoSIA, Boltzmannstr 11, 85748 Garching, Germany
Tobias Olenyi TUM (Technical University of Munich) Department of Informatics, Bioinformatics & Computational Biology - i12, Boltzmannstr 3, 85748 Garching/Munich, Germany
Jiajun Qiu TUM (Technical University of Munich) Department of Informatics, Bioinformatics & Computational Biology - i12, Boltzmannstr 3, 85748 Garching/Munich, Germany Department of Otolaryngology Head & Neck Surgery, The Ninth People's Hospital & Ear Institute, School of Medicine & Shanghai Key Laboratory of Translational Medicine on Ear and Nose Diseases, Shanghai Jiao Tong University, Shanghai, China
Konstantin Schütze TUM (Technical University of Munich) Department of Informatics, Bioinformatics & Computational Biology - i12, Boltzmannstr 3, 85748 Garching/Munich, Germany
Guy Yachdav TUM (Technical University of Munich) Department of Informatics, Bioinformatics & Computational Biology - i12, Boltzmannstr 3, 85748 Garching/Munich, Germany
Haim Ashkenazy Department of Molecular Biology, Max Planck Institute for Developmental Biology, Tübingen, Germany The Shmunis School of Biomedicine and Cancer Research, George S. Wise Faculty of Life Sciences, Tel Aviv University, 69978 Tel Aviv, Israel
Nir Ben-Tal Department of Biochemistry & Molecular Biology, George S. Wise Faculty of Life Sciences, Tel Aviv University, 69978 Tel Aviv, Israel
Yana Bromberg Department of Biochemistry and Microbiology, Rutgers University, New Brunswick, NJ 08901, USA
Tatyana Goldberg TUM (Technical University of Munich) Department of Informatics, Bioinformatics & Computational Biology - i12, Boltzmannstr 3, 85748 Garching/Munich, Germany
Laszlo Kajan Roche Polska Sp. z o.o., Domaniewska 39B, 02–672 Warsaw, Poland
Sean O’Donoghue Garvan Institute of Medical Research, Sydney, Australia
Chris Sander Department of Data Sciences, Dana-Farber Cancer Institute, Boston, MA 02215, USA Department of Cell Biology, Harvard Medical School, Boston, MA 02215, USA Broad Institute of MIT and Harvard, Boston, MA 02142, USA
Andrea Schafferhans TUM (Technical University of Munich) Department of Informatics, Bioinformatics & Computational Biology - i12, Boltzmannstr 3, 85748 Garching/Munich, Germany HSWT (Hochschule Weihenstephan Triesdorf | University of Applied Sciences), Department of Bioengineering Sciences, Am Hofgarten 10, 85354 Freising, Germany
Avner Schlessinger Department of Pharmacological Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
Gerrit Vriend BIPS, Poblacion Baco, Mindoro, Philippines
Milot Mirdita Quantitative and Computational Biology, Max Planck Institute for Biophysical Chemistry, Göttingen, Germany
Piotr Gawron Luxembourg Centre For Systems Biomedicine (LCSB), University of Luxembourg, Campus Belval, House of Biomedicine II, 6 avenue du Swing, L-4367 Belvaux, Luxembourg
Wei Gu Luxembourg Centre For Systems Biomedicine (LCSB), University of Luxembourg, Campus Belval, House of Biomedicine II, 6 avenue du Swing, L-4367 Belvaux, Luxembourg ELIXIR Luxembourg (ELIXIR-LU) Node, University of Luxembourg, Campus Belval, House of Biomedicine II, 6 avenue du Swing, L-4367 Belvaux, Luxembourg
Yohan Jarosz Luxembourg Centre For Systems Biomedicine (LCSB), University of Luxembourg, Campus Belval, House of Biomedicine II, 6 avenue du Swing, L-4367 Belvaux, Luxembourg ELIXIR Luxembourg (ELIXIR-LU) Node, University of Luxembourg, Campus Belval, House of Biomedicine II, 6 avenue du Swing, L-4367 Belvaux, Luxembourg
Christophe Trefois Luxembourg Centre For Systems Biomedicine (LCSB), University of Luxembourg, Campus Belval, House of Biomedicine II, 6 avenue du Swing, L-4367 Belvaux, Luxembourg ELIXIR Luxembourg (ELIXIR-LU) Node, University of Luxembourg, Campus Belval, House of Biomedicine II, 6 avenue du Swing, L-4367 Belvaux, Luxembourg
Martin Steinegger School of Biological Sciences, Seoul National University, Seoul, South Korea Artificial Intelligence Institute, Seoul National University, Seoul, South Korea
Reinhard Schneider Luxembourg Centre For Systems Biomedicine (LCSB), University of Luxembourg, Campus Belval, House of Biomedicine II, 6 avenue du Swing, L-4367 Belvaux, Luxembourg ELIXIR Luxembourg (ELIXIR-LU) Node, University of Luxembourg, Campus Belval, House of Biomedicine II, 6 avenue du Swing, L-4367 Belvaux, Luxembourg
Burkhard Rost TUM (Technical University of Munich) Department of Informatics, Bioinformatics & Computational Biology - i12, Boltzmannstr 3, 85748 Garching/Munich, Germany Institute for Advanced Study (TUM-IAS), Lichtenbergstr. 2a, 85748 Garching/Munich, Germany TUM School of Life Sciences Weihenstephan (WZW), Alte Akademie 8, Freising, Germany


Dallago C, Schütze K, Heinzinger M, Olenyi T, Littmann M, Lu AX, Yang KK, Min S, Yoon S, Morton JT, Rost B. Learned Embeddings from Deep Learning to Visualize and Predict Protein Sets. Curr Protoc 2021;1:e113. [PMID: 33961736 DOI: 10.1002/cpz1.113] [Citation(s) in RCA: 50] [Impact Index Per Article: 12.5] [Indexed: 02/03/2023]

Abstract

Models from machine learning (ML) or artificial intelligence (AI) increasingly assist in guiding experimental design and decision making in molecular biology and medicine. Recently, Language Models (LMs) have been adapted from Natural Language Processing (NLP) to encode the implicit language written in protein sequences. Protein LMs show enormous potential in generating descriptive representations (embeddings) for proteins from just their sequences, in a fraction of the time with respect to previous approaches, yet with comparable or improved predictive ability. Researchers have trained a variety of protein LMs that are likely to illuminate different angles of the protein language. By leveraging the bio_embeddings pipeline and modules, simple and reproducible workflows can be laid out to generate protein embeddings and rich visualizations. Embeddings can then be leveraged as input features through machine learning libraries to develop methods predicting particular aspects of protein function and structure. Beyond the workflows included here, embeddings have been leveraged as proxies to traditional homology-based inference and even to align similar protein sequences. A wealth of possibilities remain for researchers to harness through the tools provided in the following protocols. © 2021 The Authors. Current Protocols published by Wiley Periodicals LLC. 
The following protocols are included in this manuscript:
Basic Protocol 1: Generic use of the bio_embeddings pipeline to plot protein sequences and annotations
Basic Protocol 2: Generate embeddings from protein sequences using the bio_embeddings pipeline
Basic Protocol 3: Overlay sequence annotations onto a protein space visualization
Basic Protocol 4: Train a machine learning classifier on protein embeddings
Alternate Protocol 1: Generate 3D instead of 2D visualizations
Alternate Protocol 2: Visualize protein solubility instead of protein subcellular localization
Support Protocol: Join embedding generation and sequence space visualization in a pipeline
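The idea behind training a classifier on protein embeddings can be illustrated with a minimal sketch. It does not use the bio_embeddings API and is not the protocol's own method: it assumes each protein has already been reduced to a fixed-length vector, uses made-up 3-dimensional vectors and labels, and swaps in a simple nearest-centroid classifier for whatever model the protocol trains. The function names `fit_centroids` and `predict` are hypothetical.

```python
from collections import defaultdict
from math import dist


def fit_centroids(embeddings, labels):
    """Average the embedding vectors of each class into one centroid."""
    grouped = defaultdict(list)
    for vector, label in zip(embeddings, labels):
        grouped[label].append(vector)
    return {
        label: [sum(column) / len(vectors) for column in zip(*vectors)]
        for label, vectors in grouped.items()
    }


def predict(centroids, vector):
    """Return the label whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: dist(centroids[label], vector))


# Toy 3-dimensional "embeddings" (hypothetical values, not real model output).
train_x = [[0.9, 0.1, 0.0], [1.0, 0.2, 0.1], [0.0, 0.8, 0.9], [0.1, 1.0, 1.0]]
train_y = ["nucleus", "nucleus", "membrane", "membrane"]

centroids = fit_centroids(train_x, train_y)
print(predict(centroids, [0.8, 0.0, 0.1]))
```

Real protein embeddings typically have hundreds or thousands of dimensions, and a library classifier would normally replace the nearest-centroid rule; the data flow (vectors in, labels out) stays the same.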

Affiliation(s)

Christian Dallago TUM (Technical University of Munich) Department of Informatics, Bioinformatics & Computational Biology, Garching/Munich, Germany; TUM Graduate School, Center of Doctoral Studies in Informatics and its Applications (CeDoSIA), Garching/Munich, Germany
Konstantin Schütze TUM (Technical University of Munich) Department of Informatics, Bioinformatics & Computational Biology, Garching/Munich, Germany
Michael Heinzinger TUM (Technical University of Munich) Department of Informatics, Bioinformatics & Computational Biology, Garching/Munich, Germany; TUM Graduate School, Center of Doctoral Studies in Informatics and its Applications (CeDoSIA), Garching/Munich, Germany
Tobias Olenyi TUM (Technical University of Munich) Department of Informatics, Bioinformatics & Computational Biology, Garching/Munich, Germany
Maria Littmann TUM (Technical University of Munich) Department of Informatics, Bioinformatics & Computational Biology, Garching/Munich, Germany; TUM Graduate School, Center of Doctoral Studies in Informatics and its Applications (CeDoSIA), Garching/Munich, Germany
Amy X Lu Department of Computer Science, University of Toronto, Toronto, Canada & Vector Institute
Kevin K Yang Microsoft Research New England, Cambridge, Massachusetts
Seonwoo Min Department of Electrical and Computer Engineering, Seoul National University, Seoul, South Korea
Sungroh Yoon Department of Electrical and Computer Engineering, Seoul National University, Seoul, South Korea; Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, South Korea
James T Morton Center for Computational Biology, Flatiron Institute, New York, New York
Burkhard Rost TUM (Technical University of Munich) Department of Informatics, Bioinformatics & Computational Biology, Garching/Munich, Germany; Institute for Advanced Study (TUM-IAS), Garching/Munich, Germany; TUM School of Life Sciences Weihenstephan (WZW), Freising, Germany; Columbia University, Department of Biochemistry and Molecular Biophysics, New York, New York; New York Consortium on Membrane Protein Structure (NYCOMPS), New York, New York

Slater O, Miller B, Kontoyianni M. Decoding Protein-protein Interactions: An Overview. Curr Top Med Chem 2021;20:855-882. [PMID: 32101126 DOI: 10.2174/1568026620666200226105312] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 04/13/2019] [Revised: 11/27/2019] [Accepted: 11/27/2019] [Indexed: 12/24/2022]

Zhao B, Katuwawala A, Oldfield CJ, Dunker AK, Faraggi E, Gsponer J, Kloczkowski A, Malhis N, Mirdita M, Obradovic Z, Söding J, Steinegger M, Zhou Y, Kurgan L. DescribePROT: database of amino acid-level protein structure and function predictions. Nucleic Acids Res 2021;49:D298-D308. [PMID: 33119734 PMCID: PMC7778963 DOI: 10.1093/nar/gkaa931] [Citation(s) in RCA: 50] [Impact Index Per Article: 12.5] [Received: 08/14/2020] [Revised: 09/11/2020] [Accepted: 10/05/2020] [Indexed: 12/30/2022] Open

Miller-Vedam LE, Bräuning B, Popova KD, Schirle Oakdale NT, Bonnar JL, Prabu JR, Boydston EA, Sevillano N, Shurtleff MJ, Stroud RM, Craik CS, Schulman BA, Frost A, Weissman JS. Structural and mechanistic basis of the EMC-dependent biogenesis of distinct transmembrane clients. eLife 2020;9:e62611. [PMID: 33236988 PMCID: PMC7785296 DOI: 10.7554/elife.62611] [Citation(s) in RCA: 64] [Impact Index Per Article: 12.8] [Received: 08/30/2020] [Accepted: 11/17/2020] [Indexed: 12/11/2022] Open

Affiliation(s)

Lakshmi E Miller-Vedam Molecular, Cellular, and Computational Biophysics Graduate Program, University of California, San Francisco, San Francisco, United States; Department of Biochemistry and Biophysics, University of California, San Francisco, San Francisco, United States; Department of Biology, Whitehead Institute, MIT, Cambridge, United States; Department of Cellular and Molecular Pharmacology, University of California, San Francisco, San Francisco, United States
Bastian Bräuning Department of Molecular Machines and Signaling, Max Planck Institute of Biochemistry, Martinsried, Germany
Katerina D Popova Department of Biology, Whitehead Institute, MIT, Cambridge, United States; Department of Cellular and Molecular Pharmacology, University of California, San Francisco, San Francisco, United States; Biomedical Sciences Graduate Program, University of California, San Francisco, San Francisco, United States
Nicole T Schirle Oakdale Department of Cellular and Molecular Pharmacology, University of California, San Francisco, San Francisco, United States
Jessica L Bonnar Department of Biology, Whitehead Institute, MIT, Cambridge, United States; Department of Cellular and Molecular Pharmacology, University of California, San Francisco, San Francisco, United States
Jesuraj R Prabu Department of Molecular Machines and Signaling, Max Planck Institute of Biochemistry, Martinsried, Germany
Elizabeth A Boydston Department of Cellular and Molecular Pharmacology, University of California, San Francisco, San Francisco, United States
Natalia Sevillano Department of Pharmaceutical Chemistry, University of California, San Francisco, San Francisco, United States
Matthew J Shurtleff Department of Cellular and Molecular Pharmacology, University of California, San Francisco, San Francisco, United States
Robert M Stroud Department of Biochemistry and Biophysics, University of California, San Francisco, San Francisco, United States
Charles S Craik Department of Pharmaceutical Chemistry, University of California, San Francisco, San Francisco, United States
Brenda A Schulman Department of Molecular Machines and Signaling, Max Planck Institute of Biochemistry, Martinsried, Germany
Adam Frost Department of Biochemistry and Biophysics, University of California, San Francisco, San Francisco, United States
Jonathan S Weissman Department of Biology, Whitehead Institute, MIT, Cambridge, United States; Department of Cellular and Molecular Pharmacology, University of California, San Francisco, San Francisco, United States; Howard Hughes Medical Institute, Chevy Chase, United States

Enhancing protein backbone angle prediction by using simpler models of deep neural networks. Sci Rep 2020;10:19430. [PMID: 33173130 PMCID: PMC7655839 DOI: 10.1038/s41598-020-76317-6] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 05/06/2020] [Accepted: 10/23/2020] [Indexed: 11/09/2022] Open

Skolnick J, Gao M. The role of local versus nonlocal physicochemical restraints in determining protein native structure. Curr Opin Struct Biol 2020;68:1-8. [PMID: 33129066 DOI: 10.1016/j.sbi.2020.10.008] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 09/17/2020] [Revised: 10/03/2020] [Accepted: 10/05/2020] [Indexed: 12/15/2022]

Qiu J, Nechaev D, Rost B. Protein-protein and protein-nucleic acid binding residues important for common and rare sequence variants in human. BMC Bioinformatics 2020;21:452. [PMID: 33050876 PMCID: PMC7557062 DOI: 10.1186/s12859-020-03759-0] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 05/01/2020] [Accepted: 09/16/2020] [Indexed: 11/10/2022] Open

Abstract

BACKGROUND

Any two unrelated people differ by about 20,000 missense mutations (also referred to as SAVs: Single Amino acid Variants or missense SNV). Many SAVs have been predicted to strongly affect molecular protein function. Common SAVs (> 5% of population) were predicted to have, on average, more effect on molecular protein function than rare SAVs (< 1% of population). We hypothesized that the prevalence of effect in common over rare SAVs might partially be caused by common SAVs more often occurring at interfaces of proteins with other proteins, DNA, or RNA, thereby creating subgroup-specific phenotypes. We analyzed SAVs from 60,706 people through the lens of two prediction methods, one (SNAP2) predicting the effects of SAVs on molecular protein function, the other (ProNA2020) predicting residues in DNA-, RNA- and protein-binding interfaces.

RESULTS

Three results stood out. Firstly, SAVs predicted to occur at binding interfaces were predicted to more likely affect molecular function than those predicted as not binding (p value < 2.2 × 10^-16). Secondly, for SAVs predicted to occur at binding interfaces, common SAVs were predicted more strongly with effect on protein function than rare SAVs (p value < 2.2 × 10^-16). Restriction to SAVs with experimental annotations confirmed all results, although the resulting subsets were too small to establish statistical significance for any result. Thirdly, the fraction of SAVs predicted at binding interfaces differed significantly between tissues, e.g. urinary bladder tissue was found abundant in SAVs predicted at protein-binding interfaces, and reproductive tissues (ovary, testis, vagina, seminal vesicle and endometrium) in SAVs predicted at DNA-binding interfaces.

CONCLUSIONS

Overall, the results suggested that residues at protein-, DNA-, and RNA-binding interfaces contributed toward predicting that common SAVs more likely affect molecular function than rare SAVs.

de Brevern AG. Impact of protein dynamics on secondary structure prediction. Biochimie 2020;179:14-22. [PMID: 32946990 DOI: 10.1016/j.biochi.2020.09.006] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 04/21/2020] [Revised: 09/04/2020] [Accepted: 09/10/2020] [Indexed: 02/08/2023]

Guo Z, Hou J, Cheng J. DNSS2: Improved ab initio protein secondary structure prediction using advanced deep learning architectures. Proteins 2020;89:207-217. [PMID: 32893403 DOI: 10.1002/prot.26007] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Received: 03/05/2020] [Revised: 07/07/2020] [Accepted: 09/02/2020] [Indexed: 12/27/2022]

Vermeyen T, Merten C. Solvation and the secondary structure of a proline-containing dipeptide: insights from VCD spectroscopy. Phys Chem Chem Phys 2020;22:15640-15648. [PMID: 32617548 DOI: 10.1039/d0cp02283g] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Indexed: 12/12/2022]

Abstract

In this study we investigate the IR and VCD spectra of the diastereomeric dipeptide Boc-Pro-Phe-(n-propyl) 1 in chloroform-d₁ (CDCl₃) and the strongly hydrogen bonding solvent dimethylsulfoxide-d₆ (DMSO-d₆). From comparison of the experimental spectra, the amide II spectral region is identified as marker signature for the stereochemistry of the dipeptide: the homochiral LL-1 features a (+/-)-pattern in the amide II region of the VCD spectrum, while the amide II signature of the diastereomer LD-1 is inverted. Computational analysis of the IR and VCD spectra of LL-1 reveals that the experimentally observed amide II signature is characteristic for a β_I-turn structure of the peptide. Likewise, the inverted pattern found for LD-1 arises from a β_II-turn structure of the dipeptide. Following a micro-solvation approach, the experimental spectra recorded in DMSO-d₆ are computationally well reproduced by considering only a single solvent molecule in a hydrogen bond with N-H groups. Considering a second solvent molecule, which would lead to a cleavage of intramolecular hydrogen bonds in 1, is found to give a significantly worse match with the experiment. Hence, the detailed computational analysis of the spectra of LL- and LD-1 recorded in DMSO-d₆ confirms that the intramolecular hydrogen bonding pattern, that stabilizes the β-turns and other conformations of LL- and LD-1 in apolar solvents, remains intact. Our findings also show that it is essential to consider solvation explicitly in the analysis of the IR and VCD spectra of dipeptides in strongly hydrogen bonding solvents. As the solute-solvent interactions affect both conformational preferences and spectral signatures, it is also demonstrated that this inclusion of solvent molecules cannot be circumvented by applying fitting procedures to non-solvated structures.

Xu G, Wang Q, Ma J. OPUS-TASS: a protein backbone torsion angles and secondary structure predictor based on ensemble neural networks. Bioinformatics 2020;36:5021-5026. [DOI: 10.1093/bioinformatics/btaa629] [Citation(s) in RCA: 26] [Impact Index Per Article: 5.2] [Received: 05/01/2020] [Revised: 06/25/2020] [Accepted: 07/10/2020] [Indexed: 11/13/2022] Open

Abstract

Motivation

Predictions of protein backbone torsion angles (ϕ and ψ) and secondary structure from sequence are crucial subproblems in protein structure prediction. With the development of deep learning approaches, their accuracies have been significantly improved. To capture the long-range interactions, most studies integrate bidirectional recurrent neural networks into their models. In this study, we introduce and modify a recently proposed architecture named Transformer to capture the interactions between the two residues theoretically with arbitrary distance. Moreover, we take advantage of multitask learning to improve the generalization of neural network by introducing related tasks into the training process. Similar to many previous studies, OPUS-TASS uses an ensemble of models and achieves better results.

Results

OPUS-TASS uses the same training and validation sets as SPOT-1D. We compare the performance of OPUS-TASS and SPOT-1D on TEST2016 (1213 proteins) and TEST2018 (250 proteins) proposed in the SPOT-1D paper, CASP12 (55 proteins), CASP13 (32 proteins) and CASP-FM (56 proteins) proposed in the SAINT paper, and a recently released PDB structure collection from CAMEO (93 proteins) named as CAMEO93. On these six test sets, OPUS-TASS achieves consistent improvements in both backbone torsion angles prediction and secondary structure prediction. On CAMEO93, SPOT-1D achieves the mean absolute errors of 16.89 and 23.02 for ϕ and ψ predictions, respectively, and the accuracies for 3- and 8-state secondary structure predictions are 87.72 and 77.15%, respectively. In comparison, OPUS-TASS achieves 16.56 and 22.56 for ϕ and ψ predictions, and 89.06 and 78.87% for 3- and 8-state secondary structure predictions, respectively. In particular, after using our torsion angles refinement method OPUS-Refine as the post-processing procedure for OPUS-TASS, the mean absolute errors for final ϕ and ψ predictions are further decreased to 16.28 and 21.98, respectively.

Availability and implementation

The training and the inference codes of OPUS-TASS and its data are available at https://github.com/thuxugang/opus_tass.

Supplementary information

Supplementary data are available at Bioinformatics online.
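The mean absolute errors quoted above for ϕ and ψ compare predicted and observed torsion angles. As a rough sketch of how such a figure could be computed, the following assumes angles in degrees and the common convention that the error wraps around at ±180°, so that 175° and -175° differ by 10° rather than 350°. The abstract does not spell out its exact definition, and the function names and sample angles here are hypothetical.

```python
def angular_error(predicted, observed):
    """Smallest absolute difference between two angles in degrees (0 to 180)."""
    diff = abs(predicted - observed) % 360.0
    return min(diff, 360.0 - diff)


def mean_absolute_error(predicted_angles, observed_angles):
    """Average the wrapped per-residue angular errors over a chain."""
    if len(predicted_angles) != len(observed_angles):
        raise ValueError("angle lists must have the same length")
    if not predicted_angles:
        raise ValueError("at least one angle pair is required")
    errors = [angular_error(p, o) for p, o in zip(predicted_angles, observed_angles)]
    return sum(errors) / len(errors)


# Made-up phi angles for three residues (illustrative values only).
phi_pred = [-60.0, -120.0, 175.0]
phi_true = [-65.0, -110.0, -175.0]
print(mean_absolute_error(phi_pred, phi_true))
```

Without the wrap-around step, the third residue would contribute an error of 350° and dominate the average, which is why periodic quantities are usually handled this way.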

Shapovalov M, Dunbrack RL, Vucetic S. Multifaceted analysis of training and testing convolutional neural networks for protein secondary structure prediction. PLoS One 2020;15:e0232528. [PMID: 32374785 PMCID: PMC7202669 DOI: 10.1371/journal.pone.0232528] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 10/24/2019] [Accepted: 04/16/2020] [Indexed: 11/30/2022] Open

Abstract

Protein secondary structure prediction remains a vital topic with broad applications. Due to lack of a widely accepted standard in secondary structure predictor evaluation, a fair comparison of predictors is challenging. A detailed examination of factors that contribute to higher accuracy is also lacking. In this paper, we present: (1) new test sets, Test2018, Test2019, and Test2018-2019, consisting of proteins from structures released in 2018 and 2019 with less than 25% identity to any protein published before 2018; (2) a 4-layer convolutional neural network, SecNet, with an input window of ±14 amino acids which was trained on proteins ≤25% identical to proteins in Test2018 and the commonly used CB513 test set; (3) an additional test set that shares no homologous domains with the training set proteins, according to the Evolutionary Classification of Proteins (ECOD) database; (4) a detailed ablation study where we reverse one algorithmic choice at a time in SecNet and evaluate the effect on the prediction accuracy; (5) new 4- and 5-label prediction alphabets that may be more practical for tertiary structure prediction methods. The 3-label accuracy (helix, sheet, coil) of the leading predictors on both Test2018 and CB513 is 81-82%, while SecNet's accuracy is 84% for both sets. Accuracy on the non-homologous ECOD set is only 0.6 points (83.9%) lower than the results on the Test2018-2019 set (84.5%). The ablation study of features, neural network architecture, and training hyper-parameters suggests the best accuracy results are achieved with good choices for each of them while the neural network architecture is not as critical as long as it is not too simple. Protocols for generating and using unbiased test, validation, and training sets are provided. Our data sets, including input features and assigned labels, and SecNet software including third-party dependencies and databases, are downloadable from dunbrack.fccc.edu/ss and github.com/sh-maxim/ss.
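The 3-label accuracy mentioned above is, in the usual sense, the percentage of residues whose predicted label (helix, sheet, or coil) matches the assigned label. A minimal sketch follows, assuming one-letter labels H, E, and C and equal-length strings; the function name and the example strings are hypothetical and unrelated to the SecNet code base.

```python
def label_accuracy(predicted, observed):
    """Percentage of positions where predicted and observed labels agree."""
    if len(predicted) != len(observed):
        raise ValueError("label strings must have the same length")
    if not observed:
        raise ValueError("at least one residue is required")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(observed)


# Made-up 17-residue example: H = helix, E = sheet, C = coil.
observed = "CCHHHHHHCCEEEEECC"
predicted = "CCHHHHHCCCEEEECCC"
print(round(label_accuracy(predicted, observed), 1))
```

The same function works unchanged for the 4-, 5-, or 8-label alphabets, since it only compares characters position by position.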

Pritam M, Singh G, Swaroop S, Singh AK, Pandey B, Singh SP. A cutting-edge immunoinformatics approach for design of multi-epitope oral vaccine against dreadful human malaria. Int J Biol Macromol 2020;158:159-179. [PMID: 32360460 PMCID: PMC7189201 DOI: 10.1016/j.ijbiomac.2020.04.191] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Received: 01/02/2020] [Revised: 03/28/2020] [Accepted: 04/22/2020] [Indexed: 12/18/2022]

Veevers R, Cawley G, Hayward S. Investigation of sequence features of hinge-bending regions in proteins with domain movements using kernel logistic regression. BMC Bioinformatics 2020;21:137. [PMID: 32272894 PMCID: PMC7147021 DOI: 10.1186/s12859-020-3464-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/19/2019] [Accepted: 03/20/2020] [Indexed: 11/12/2022] Open

Abstract

Background

Hinge-bending movements in proteins comprising two or more domains form a large class of functional movements. Hinge-bending regions demarcate protein domains and collectively control the domain movement. Consequently, the ability to recognise sequence features of hinge-bending regions and to be able to predict them from sequence alone would benefit various areas of protein research. For example, an understanding of how the sequence features of these regions relate to dynamic properties in multi-domain proteins would aid in the rational design of linkers in therapeutic fusion proteins.

Results

The DynDom database of protein domain movements comprises sequences annotated to indicate whether the amino acid residue is located within a hinge-bending region or within an intradomain region. Using statistical methods and Kernel Logistic Regression (KLR) models, this data was used to determine sequence features that favour or disfavour hinge-bending regions. This is a difficult classification problem as the number of negative cases (intradomain residues) is much larger than the number of positive cases (hinge residues). The statistical methods and the KLR models both show that cysteine has the lowest propensity for hinge-bending regions and proline has the highest, even though it is the most rigid amino acid. As hinge-bending regions have been previously shown to occur frequently at the terminal regions of the secondary structures, the propensity for proline at these regions is likely due to its tendency to break secondary structures. The KLR models also indicate that isoleucine may act as a domain-capping residue. We have found that a quadratic KLR model outperforms a linear KLR model and that improvement in performance occurs up to very long window lengths (eighty residues) indicating long-range correlations.
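One simple statistical measure of whether an amino acid favours or disfavours hinge-bending regions is a propensity: its frequency among hinge residues divided by its frequency among all residues. The sketch below uses that common definition as an assumption; the abstract does not state which statistic the authors used, and the sequence, mask, and function name are made up for illustration.

```python
from collections import Counter


def hinge_propensities(sequence, hinge_mask):
    """Propensity per amino acid: P(aa | hinge) / P(aa) over one annotated sequence.

    hinge_mask holds one boolean per residue, True for hinge-bending residues.
    Values above 1 suggest enrichment in hinges, values below 1 depletion.
    """
    if len(sequence) != len(hinge_mask):
        raise ValueError("sequence and mask must have the same length")
    total = Counter(sequence)
    hinge = Counter(aa for aa, in_hinge in zip(sequence, hinge_mask) if in_hinge)
    n_hinge = sum(hinge.values())
    if n_hinge == 0:
        raise ValueError("mask marks no hinge residues")
    n_total = len(sequence)
    return {aa: (hinge[aa] / n_hinge) / (total[aa] / n_total) for aa in total}


# Toy sequence with a three-residue hinge (illustrative, not DynDom data).
sequence = "ACPGPACC"
hinge_mask = [False, False, True, True, True, False, False, False]
propensities = hinge_propensities(sequence, hinge_mask)
print(propensities["P"], propensities["C"])
```

Because hinge residues are far rarer than intradomain residues, propensities estimated from few hinge positions are noisy, which is part of what makes the classification problem described above difficult.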

Conclusion

In contrast to the only other approach that focused solely on interdomain hinge-bending regions, the method provides a modest and statistically significant improvement over a random classifier. An explanation of the KLR results is that in the prediction of hinge-bending regions a long-range correlation is at play between a small number of amino acids that either favour or disfavour hinge-bending regions. The resulting sequence-based prediction tool, HingeSeek, is available to run through a webserver at hingeseek.cmp.uea.ac.uk.

Smolarczyk T, Roterman-Konieczna I, Stapor K. Protein Secondary Structure Prediction: A Review of Progress and Directions. Curr Bioinform 2020. [DOI: 10.2174/1574893614666191017104639] [Citation(s) in RCA: 28] [Impact Index Per Article: 5.6] [Indexed: 11/22/2022]

The Order-Disorder Continuum: Linking Predictions of Protein Structure and Disorder through Molecular Simulation. Sci Rep 2020;10:2068. [PMID: 32034199 PMCID: PMC7005769 DOI: 10.1038/s41598-020-58868-w] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 07/24/2019] [Accepted: 10/16/2019] [Indexed: 12/11/2022] Open

Abstract

Intrinsically disordered proteins (IDPs) and intrinsically disordered regions within proteins (IDRs) serve an increasingly expansive list of biological functions, including regulation of transcription and translation, protein phosphorylation, cellular signal transduction, as well as mechanical roles. The strong link between protein function and disorder motivates a deeper fundamental characterization of IDPs and IDRs for discovering new functions and relevant mechanisms. We review recent advances in experimental techniques that have improved identification of disordered regions in proteins. Yet, experimentally curated disorder information still does not currently scale to the level of experimentally determined structural information in folded protein databases, and disorder predictors rely on several different binary definitions of disorder. To link secondary structure prediction algorithms developed for folded proteins and protein disorder predictors, we conduct molecular dynamics simulations on representative proteins from the Protein Data Bank, comparing secondary structure and disorder predictions with simulation results. We find that structure predictor performance from neural networks can be leveraged for the identification of highly dynamic regions within molecules, linked to disorder. Low accuracy structure predictions suggest a lack of static structure for regions that disorder predictors fail to identify. While disorder databases continue to expand, secondary structure predictors and molecular simulations can improve disorder predictor performance, which aids discovery of novel functions of IDPs and IDRs. These observations provide a platform for the development of new, integrated structural databases and fusion of prediction tools toward protein disorder characterization in health and disease.

Torrisi M, Pollastri G, Le Q. Deep learning methods in protein structure prediction. Comput Struct Biotechnol J 2020;18:1301-1310. [PMID: 32612753 PMCID: PMC7305407 DOI: 10.1016/j.csbj.2019.12.011] [Citation(s) in RCA: 132] [Impact Index Per Article: 26.4] [Received: 10/15/2019] [Revised: 12/19/2019] [Accepted: 12/20/2019] [Indexed: 01/01/2023] Open

Retention Time Prediction and Protein Identification. Methods Mol Biol 2020;2051:115-132. [PMID: 31552626 DOI: 10.1007/978-1-4939-9744-2_4] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 01/01/2023]

Abstract

In bottom-up proteomics, proteins are typically identified by enzymatic digestion into peptides, tandem mass spectrometry and comparison of the tandem mass spectra with those predicted from a sequence database for peptides within measurement uncertainty from the experimentally obtained mass. Although now decreasingly common, isolated proteins or simple protein mixtures can also be identified by measuring only the masses of the peptides resulting from the enzymatic digest, without any further fragmentation. Separation methods such as liquid chromatography and electrophoresis are often used to fractionate complex protein or peptide mixtures prior to analysis by mass spectrometry. Although the primary reason for this is to avoid ion suppression and improve data quality, these separations are based on physical and chemical properties of the peptides or proteins and therefore also provide information about them. Depending on the separation method, this could be protein molecular weight (SDS-PAGE), isoelectric point (IEF), charge at a known pH (ion exchange chromatography), or hydrophobicity (reversed phase chromatography). These separations produce approximate measurements on properties that to some extent can be predicted from amino acid sequences. In the case of molecular weight of proteins without posttranslational modifications this is straightforward: simply add the molecular weights of the amino acid residues in the protein. For IEF, charge and hydrophobicity, the order of the amino acids, and folding state of the peptide or protein also matter, but it is nevertheless possible to predict the behavior of peptides and proteins in these separation methods to a degree which renders such predictions useful. 
This chapter reviews the topic of using data from separation methods for identification and validation in proteomics, with special emphasis on predicting retention times of tryptic peptides in reversed-phase chromatography under acidic conditions, as this is one of the most commonly used separation methods in bottom-up proteomics.
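The molecular weight calculation described above, adding up the residue masses, can be sketched as follows. The sketch assumes an unmodified linear chain of the 20 standard amino acids, uses commonly tabulated average residue masses in daltons (treat the values as approximate), and adds one water for the free termini. The function name is hypothetical.

```python
# Approximate average residue masses in daltons (amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.01528  # one H2O for the free N- and C-termini


def molecular_weight(sequence):
    """Sum residue masses and add one water; no modifications are considered."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    try:
        return sum(RESIDUE_MASS[aa] for aa in sequence.upper()) + WATER_MASS
    except KeyError as error:
        raise ValueError(f"unknown residue: {error.args[0]}") from None


print(round(molecular_weight("GG"), 2))
```

Isoelectric point and retention time are harder to predict because, as the abstract notes, residue order and folding state also matter, so a simple sum like this one does not carry over to those properties.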

Long S, Tian P. Protein secondary structure prediction with context convolutional neural network. RSC Adv 2019;9:38391-38396. [PMID: 35540205 PMCID: PMC9075825 DOI: 10.1039/c9ra05218f] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 07/09/2019] [Accepted: 11/18/2019] [Indexed: 11/21/2022] Open

Zamora-Carreras H, Maestro B, Sanz JM, Jiménez MA. Turncoat Polypeptides: We Adapt to Our Environment. Chembiochem 2019;21:432-441. [PMID: 31456307 DOI: 10.1002/cbic.201900446] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 07/17/2019] [Indexed: 01/25/2023]

Sample Reduction Strategies for Protein Secondary Structure Prediction. APPLIED SCIENCES-BASEL 2019. [DOI: 10.3390/app9204429] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Indexed: 11/16/2022]

Smolarczyk T, Stapor K, Roterman-Konieczna I. Backbone dihedral angles prediction servers for protein early-stage structure prediction. BIO-ALGORITHMS AND MED-SYSTEMS 2019. [DOI: 10.1515/bams-2019-0034] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022]

A Bi-LSTM Based Ensemble Algorithm for Prediction of Protein Secondary Structure. APPLIED SCIENCES-BASEL 2019. [DOI: 10.3390/app9173538] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Indexed: 11/17/2022]