For: Hanson J, Yang Y, Paliwal K, Zhou Y. Improving protein disorder prediction by deep bidirectional long short-term memory recurrent neural networks. Bioinformatics 2017;33:685-692. [PMID: 28011771 DOI: 10.1093/bioinformatics/btw678] [Citation(s) in RCA: 109] [Impact Index Per Article: 13.6] [Received: 06/07/2016] [Accepted: 10/26/2016] [Indexed: 11/12/2022]

Number

Cited by Other Article(s)

101

Singh J, Hanson J, Heffernan R, Paliwal K, Yang Y, Zhou Y. Detecting Proline and Non-Proline Cis Isomers in Protein Structures from Sequences Using Deep Residual Ensemble Learning. J Chem Inf Model 2018;58:2033-2042. [PMID: 30118602 DOI: 10.1021/acs.jcim.8b00442] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Indexed: 12/12/2022]

102

Liu Y, Wang X, Liu B. IDP-CRF: Intrinsically Disordered Protein/Region Identification Based on Conditional Random Fields. Int J Mol Sci 2018;19:E2483. [PMID: 30135358 PMCID: PMC6164615 DOI: 10.3390/ijms19092483] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Received: 07/21/2018] [Revised: 08/14/2018] [Accepted: 08/18/2018] [Indexed: 12/16/2022]

103

Yamada KD, Kinoshita K. De novo profile generation based on sequence context specificity with the long short-term memory network. BMC Bioinformatics 2018;19:272. [PMID: 30021530 PMCID: PMC6052547 DOI: 10.1186/s12859-018-2284-1] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Received: 05/17/2018] [Accepted: 07/11/2018] [Indexed: 11/24/2022]

Abstract

Background

Long short-term memory (LSTM) is one of the most attractive deep learning methods for learning time series or contexts of input data. An increasing number of studies, including biological sequence analyses in bioinformatics, use this architecture. Amino acid sequence profiles are widely used for bioinformatics studies, such as sequence similarity searches, multiple alignments, and evolutionary analyses. Currently, many biological sequences are becoming available, and the rapidly increasing amount of sequence data emphasizes the importance of scalable generators of amino acid sequence profiles.

Results

We employed the LSTM network and developed a novel profile generator to construct profiles without any assumptions, except for input sequence context. Our method could generate better profiles than existing de novo profile generators, including CSBuild and RPS-BLAST, on the basis of profile-sequence similarity search performance with linear calculation costs against input sequence size. In addition, we analyzed the effects of the memory power of LSTM and found that LSTM had high potential power to detect long-range interactions between amino acids, as in the case of beta-strand formation, which has been a difficult problem in protein bioinformatics using sequence information.

Conclusion

We demonstrated the importance of sequence context and the feasibility of LSTM on biological sequence analyses. Our results demonstrated the effectiveness of memories in LSTM and showed that our de novo profile generator, SPBuild, achieved higher performance than that of existing methods for profile prediction of beta-strands, where long-range interactions of amino acids are important and are known to be difficult for the existing window-based prediction methods. Our findings will be useful for the development of other prediction methods related to biological sequences by machine learning methods.

Electronic supplementary material

The online version of this article (10.1186/s12859-018-2284-1) contains supplementary material, which is available to authorized users.

104

Zhao Z, Peng Z, Yang J. Improving Sequence-Based Prediction of Protein–Peptide Binding Residues by Introducing Intrinsic Disorder and a Consensus Method. J Chem Inf Model 2018;58:1459-1468. [DOI: 10.1021/acs.jcim.8b00019] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.6] [Indexed: 01/15/2023]

105

Fa R, Cozzetto D, Wan C, Jones DT. Predicting human protein function with multi-task deep neural networks. PLoS One 2018;13:e0198216. [PMID: 29889900 PMCID: PMC5995439 DOI: 10.1371/journal.pone.0198216] [Citation(s) in RCA: 44] [Impact Index Per Article: 6.3] [Received: 02/01/2018] [Accepted: 05/15/2018] [Indexed: 11/19/2022]

Abstract

Machine learning methods for protein function prediction are urgently needed, especially now that a substantial fraction of known sequences remains unannotated despite the extensive use of functional assignments based on sequence similarity. One major bottleneck supervised learning faces in protein function prediction is the structured, multi-label nature of the problem, because biological roles are represented by lists of terms from hierarchically organised controlled vocabularies such as the Gene Ontology. In this work, we build on recent developments in the area of deep learning and investigate the usefulness of multi-task deep neural networks (MTDNN), which consist of upstream shared layers upon which are stacked in parallel as many independent modules (additional hidden layers with their own output units) as the number of output GO terms (the tasks). MTDNN learns individual tasks partially using shared representations and partially from task-specific characteristics. When no close homologues with experimentally validated functions can be identified, MTDNN gives more accurate predictions than baseline methods based on annotation frequencies in public databases or homology transfers. More importantly, the results show that MTDNN binary classification accuracy is higher than that of alternative machine learning-based methods that do not exploit commonalities and differences among prediction tasks. Interestingly, compared with a single-task predictor, the performance improvement is not linearly correlated with the number of tasks in MTDNN, but medium size models provide more improvement in our case. One advantage of MTDNN is that, given a set of features, it does not require a bootstrap feature selection procedure, as traditional machine learning algorithms do. Overall, the results indicate that the proposed MTDNN algorithm improves the performance of protein function prediction. On the other hand, there is still considerable room for deep learning techniques to further enhance prediction ability.

106

Bidirectional Long Short-Term Memory Network for Vehicle Behavior Recognition. Remote Sensing 2018. [DOI: 10.3390/rs10060887] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Indexed: 11/16/2022]

107

Yang Y, Gao J, Wang J, Heffernan R, Hanson J, Paliwal K, Zhou Y. Sixty-five years of the long march in protein secondary structure prediction: the final stretch? Brief Bioinform 2018;19:482-494. [PMID: 28040746 PMCID: PMC5952956 DOI: 10.1093/bib/bbw129] [Citation(s) in RCA: 89] [Impact Index Per Article: 12.7] [Received: 10/10/2016] [Revised: 11/15/2016] [Indexed: 11/13/2022]

108

Gao J, Yang Y, Zhou Y. Grid-based prediction of torsion angle probabilities of protein backbone and its application to discrimination of protein intrinsic disorder regions and selection of model structures. BMC Bioinformatics 2018;19:29. [PMID: 29390958 PMCID: PMC5796405 DOI: 10.1186/s12859-018-2031-7] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Received: 09/14/2017] [Accepted: 01/17/2018] [Indexed: 12/03/2022]

109

Bernardi A, Kirschner KN, Faller R. Structural analysis of human glycoprotein butyrylcholinesterase using atomistic molecular dynamics: The importance of glycosylation site ASN241. PLoS One 2017;12:e0187994. [PMID: 29190644 PMCID: PMC5708630 DOI: 10.1371/journal.pone.0187994] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Received: 07/19/2017] [Accepted: 10/30/2017] [Indexed: 11/18/2022]

110

Li S, Chen J, Liu B. Protein remote homology detection based on bidirectional long short-term memory. BMC Bioinformatics 2017;18:443. [PMID: 29017445 PMCID: PMC5634958 DOI: 10.1186/s12859-017-1842-2] [Citation(s) in RCA: 41] [Impact Index Per Article: 5.1] [Received: 07/19/2017] [Accepted: 09/21/2017] [Indexed: 01/05/2023]

111

DisBind: A database of classified functional binding sites in disordered and structured regions of intrinsically disordered proteins. BMC Bioinformatics 2017;18:206. [PMID: 28381244 PMCID: PMC5382478 DOI: 10.1186/s12859-017-1620-1] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 08/25/2016] [Accepted: 03/31/2017] [Indexed: 01/01/2023]