201
|
Protein Secondary Structure Prediction Based on Data Partition and Semi-Random Subspace Method. Sci Rep 2018; 8:9856. [PMID: 29959372 PMCID: PMC6026213 DOI: 10.1038/s41598-018-28084-8] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.0] [Received: 03/15/2018] [Accepted: 06/12/2018] [Indexed: 11/20/2022] Open
Abstract
Protein secondary structure prediction is one of the most important and challenging problems in bioinformatics. Machine learning techniques have been applied to the problem and have achieved substantial success in this research area. However, there is still room for improvement toward the theoretical limit. In this paper, we present a novel method for protein secondary structure prediction based on a data partition and semi-random subspace method (PSRSM). Data partitioning is an important strategy for our method. First, the protein training dataset was partitioned into several subsets based on the length of the protein sequence. Then we trained base classifiers on the subspace data generated by the semi-random subspace method, and combined the base classifiers by the majority vote rule into ensemble classifiers on each subset. Multiple classifiers were trained on different subsets. These different classifiers were used to predict the secondary structures of different proteins according to the protein sequence length. Experiments were performed on the 25PDB, CB513, CASP10, CASP11, CASP12, and T100 datasets, with performance of 86.38%, 84.53%, 85.51%, 85.89%, 85.55%, and 85.09% achieved, respectively. Experimental results showed that our method outperforms other state-of-the-art methods.
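The two generic ingredients named in this abstract, routing a protein to a length-based subset and combining base classifiers by majority vote, can be sketched as follows. This is a minimal illustration, not the PSRSM implementation: the length boundaries, the function names, and the three-state label alphabet (H, E, C) are assumptions made for the example, and the semi-random subspace step itself is not reproduced.

```python
from collections import Counter


def length_bucket(seq_len, boundaries=(100, 200, 300)):
    """Return the index of the length-based subset for a sequence.

    The boundaries are hypothetical; the abstract does not state the
    cut-offs used to partition the training data.
    """
    for index, upper in enumerate(boundaries):
        if seq_len <= upper:
            return index
    return len(boundaries)


def majority_vote(predictions):
    """Combine per-residue label strings from several base classifiers.

    Each prediction is a string such as "HHEC" with one label per residue.
    Ties are resolved in favour of the label seen first at that position.
    """
    if not predictions:
        raise ValueError("need at least one prediction")
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("all predictions must have the same length")
    voted = []
    for labels in zip(*predictions):
        voted.append(Counter(labels).most_common(1)[0][0])
    return "".join(voted)


if __name__ == "__main__":
    base_outputs = ["HHEC", "HEEC", "CHEC"]
    print(length_bucket(150), majority_vote(base_outputs))
```

In a full pipeline, one such ensemble would be trained per length bucket, and `length_bucket` would select which ensemble predicts a given query sequence.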
|
202
|
Moon T, Ahn TI, Son JE. Forecasting Root-Zone Electrical Conductivity of Nutrient Solutions in Closed-Loop Soilless Cultures via a Recurrent Neural Network Using Environmental and Cultivation Information. FRONTIERS IN PLANT SCIENCE 2018; 9:859. [PMID: 29977249 PMCID: PMC6021533 DOI: 10.3389/fpls.2018.00859] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 11/06/2017] [Accepted: 06/04/2018] [Indexed: 05/16/2023]
Abstract
In existing closed-loop soilless cultures, nutrient solutions are controlled by the electrical conductivity (EC) of the solution. However, the EC of nutrient solutions is affected by both the growth environment and crop growth, so it is hard to predict. The objective of this study was to predict the EC of root-zone nutrient solutions in closed-loop soilless cultures using a recurrent neural network (RNN). In a test greenhouse with sweet peppers (Capsicum annuum L.), data were measured every 10 s from October 15 to December 31, 2014. Mean values for every hour were analyzed. The validation accuracy (R²) of a single-layer long short-term memory (LSTM) network was 0.92 and the root-mean-square error (RMSE) was 0.07, which were the best results among the different RNNs. The trained LSTM predicted the substrate EC accurately at all ranges. The test accuracy (R²) was 0.72 and the RMSE was 0.08, both indicating lower accuracy than in validation. Deep learning algorithms were more accurate when more data were added for training. The addition of other environmental factors or plant growth data would improve model robustness. A trained LSTM can control the nutrient solutions in closed-loop soilless cultures based on predicted future EC. Therefore, the algorithm can make planned management of nutrient solutions possible, reducing resource waste.
Affiliation(s)
- Jung Eek Son
- Department of Plant Science, Research Institute of Agriculture and Life Sciences, Seoul National University, Seoul, South Korea
|
203
|
Gao Y, Wang S, Deng M, Xu J. RaptorX-Angle: real-value prediction of protein backbone dihedral angles through a hybrid method of clustering and deep learning. BMC Bioinformatics 2018; 19:100. [PMID: 29745828 PMCID: PMC5998898 DOI: 10.1186/s12859-018-2065-x] [Citation(s) in RCA: 36] [Impact Index Per Article: 5.1] [Indexed: 12/16/2022] Open
Abstract
Background: Protein dihedral angles provide a detailed description of protein local conformation. Predicted dihedral angles can be used to narrow down the conformational space of the whole polypeptide chain significantly, thus aiding protein tertiary structure prediction. However, direct angle prediction from sequence alone is challenging. Results: In this article, we present a novel method (named RaptorX-Angle) to predict real-valued angles by combining clustering and deep learning. Tested on a subset of PDB25 and the targets in the latest two Critical Assessment of protein Structure Prediction (CASP) experiments, our method outperforms the existing state-of-the-art method SPIDER2 in terms of Pearson Correlation Coefficient (PCC) and Mean Absolute Error (MAE). Our results also show an approximately linear relationship between the real prediction errors and our estimated bounds. That is, the real prediction error can be well approximated by our estimated bounds. Conclusions: Our study provides an alternative and more accurate prediction of dihedral angles, which may facilitate protein structure prediction and functional study. Electronic supplementary material: The online version of this article (10.1186/s12859-018-2065-x) contains supplementary material, which is available to authorized users.
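The two evaluation metrics named in this abstract can be computed as in the sketch below. It assumes that angles are given in degrees and that the absolute error is wrapped around the 360° period, which is the usual convention for dihedral angles; the abstract does not spell out the exact definitions used in the paper, and the function names are chosen for this example only.

```python
import math


def angular_mae(predicted, actual):
    """Mean absolute error in degrees, respecting the 360-degree period.

    A prediction of 179 against a true value of -179 counts as an error
    of 2 degrees rather than 358.
    """
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for p, a in zip(predicted, actual):
        diff = abs(p - a) % 360.0
        total += min(diff, 360.0 - diff)
    return total / len(predicted)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    print(angular_mae([179.0, 10.0], [-179.0, 20.0]))
    print(pearson([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
```

Note that `pearson` treats the angles as ordinary real numbers, so it is sensitive to values near the ±180° boundary; applying it to sine and cosine components instead is a common alternative.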
Affiliation(s)
- Yujuan Gao
- Center for Quantitative Biology, Peking University, Beijing, China; Toyota Technological Institute at Chicago, 6045 S Kenwood Ave., Chicago, USA
- Sheng Wang
- Toyota Technological Institute at Chicago, 6045 S Kenwood Ave., Chicago, USA
- Minghua Deng
- Center for Quantitative Biology, Peking University, Beijing, China; School of Mathematical Sciences, Beijing, China; Center for Statistical Sciences, Beijing, China.
- Jinbo Xu
- Toyota Technological Institute at Chicago, 6045 S Kenwood Ave., Chicago, USA.
|
204
|
Islam MM, Saha S, Rahman MM, Shatabda S, Farid DM, Dehzangi A. iProtGly-SS: Identifying protein glycation sites using sequence and structure based features. Proteins 2018; 86:777-789. [PMID: 29675975 DOI: 10.1002/prot.25511] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.9] [Received: 09/27/2017] [Revised: 02/27/2018] [Accepted: 04/14/2018] [Indexed: 12/20/2022]
Abstract
Glycation is a chemical reaction in which a sugar molecule bonds with a protein without the help of enzymes. It is often a cause of many diseases, so knowledge about glycation is very important. In this paper, we present iProtGly-SS, a protein lysine glycation site identification method based on features extracted from sequence and secondary structural information. In the experiments, we found the best combination of feature groups: Amino Acid Composition, Secondary Structure Motifs, and Polarity. We used a support vector machine classifier to train our model and selected an optimal set of features using a group-based forward feature selection technique. On standard benchmark datasets, our method is able to significantly outperform existing methods for glycation prediction. A web server for iProtGly-SS is implemented and publicly available at http://brl.uiu.ac.bd/iprotgly-ss/.
Affiliation(s)
- Md Mofijul Islam
- Department of CSE, University of Dhaka, Dhaka, Bangladesh; Department of CSE, United International University, Dhaka, Bangladesh
- Sanjay Saha
- Department of CSE, United International University, Dhaka, Bangladesh
- Swakkhar Shatabda
- Department of CSE, United International University, Dhaka, Bangladesh
- Dewan Md Farid
- Department of CSE, United International University, Dhaka, Bangladesh
- Abdollah Dehzangi
- Department of Computer Science, Morgan State University, Baltimore, MD, 21251, USA
|
205
|
Wang A, An N, Chen G, Liu L, Alterovitz G. Subtype dependent biomarker identification and tumor classification from gene expression profiles. Knowl Based Syst 2018. [DOI: 10.1016/j.knosys.2018.01.025] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Indexed: 10/18/2022]
|
206
|
Fang C, Shang Y, Xu D. MUFOLD-SS: New deep inception-inside-inception networks for protein secondary structure prediction. Proteins 2018; 86:592-598. [PMID: 29492997 DOI: 10.1002/prot.25487] [Citation(s) in RCA: 94] [Impact Index Per Article: 13.4] [Received: 10/24/2017] [Revised: 02/25/2018] [Accepted: 02/27/2018] [Indexed: 11/05/2022]
Abstract
Protein secondary structure prediction can provide important information for protein 3D structure prediction and protein functions. Deep learning offers a new opportunity to significantly improve prediction accuracy. In this article, a new deep neural network architecture, named the Deep inception-inside-inception (Deep3I) network, is proposed for protein secondary structure prediction and implemented as a software tool, MUFOLD-SS. The input to MUFOLD-SS is a carefully designed feature matrix corresponding to the primary amino acid sequence of a protein, which consists of a rich set of information derived from individual amino acids, as well as the context of the protein sequence. Specifically, the feature matrix is a composition of physico-chemical properties of amino acids, the PSI-BLAST profile, and the HHBlits profile. MUFOLD-SS is composed of a sequence of nested inception modules and maps the input matrix to either eight states or three states of secondary structures. The architecture of MUFOLD-SS enables effective processing of local and global interactions between amino acids to make accurate predictions. In extensive experiments on multiple datasets, MUFOLD-SS outperformed the best existing methods and other deep neural networks significantly. MUFOLD-SS can be downloaded from http://dslsrv8.cs.missouri.edu/~cf797/MUFoldSS/download.html.
Affiliation(s)
- Chao Fang
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, Missouri
- Yi Shang
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, Missouri
- Dong Xu
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, Missouri; Christopher S. Bond Life Sciences Center, University of Missouri, Columbia, Missouri
|
207
|
Fang C, Shang Y, Xu D. Prediction of Protein Backbone Torsion Angles Using Deep Residual Inception Neural Networks. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2018; 16:10.1109/TCBB.2018.2814586. [PMID: 29994074 PMCID: PMC6592781 DOI: 10.1109/tcbb.2018.2814586] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.0] [Indexed: 05/20/2023]
Abstract
Prediction of protein backbone torsion angles (Psi and Phi) can provide important information for protein structure prediction and sequence alignment. Existing methods for Psi-Phi angle prediction have significant room for improvement. In this paper, a new deep residual inception network architecture, called DeepRIN, is proposed for the prediction of Psi-Phi angles. The input to DeepRIN is a feature matrix representing a composition of physico-chemical properties of amino acids, a 20-dimensional position-specific substitution matrix (PSSM) generated by PSI-BLAST, a 30-dimensional hidden Markov Model sequence profile generated by HHBlits, and predicted eight-state secondary structure features. DeepRIN is designed based on inception networks and residual networks that have performed well on image classification and text recognition. The architecture of DeepRIN enables effective encoding of local and global interactions between amino acids in a protein sequence to achieve accurate prediction. Extensive experimental results show that DeepRIN outperformed the best existing tools significantly. Compared to the recently released state-of-the-art tool, SPIDER3, DeepRIN reduced the Psi angle prediction error by more than 5 degrees and the Phi angle prediction error by more than 2 degrees on average. The executable tool of DeepRIN is available for download at http://dslsrv8.cs.missouri.edu/~cf797/MUFoldAngle/.
|
208
|
Yamada KD. Derivative-free neural network for optimizing the scoring functions associated with dynamic programming of pairwise-profile alignment. Algorithms Mol Biol 2018; 13:5. [PMID: 29467815 PMCID: PMC5815186 DOI: 10.1186/s13015-018-0123-6] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 08/11/2017] [Accepted: 02/06/2018] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: A profile-comparison method with a position-specific scoring matrix (PSSM) is among the most accurate alignment methods. Currently, cosine similarity and correlation coefficients are used as scoring functions of dynamic programming to calculate similarity between PSSMs. However, it is unclear whether these functions are optimal for profile alignment methods. By definition, these functions cannot capture nonlinear relationships between profiles. Therefore, we attempted to discover a novel scoring function, which was more suitable for the profile-comparison method than existing functions, using neural networks. RESULTS: Although neural networks required derivative-of-cost functions, the problem being addressed in this study lacked them. Therefore, we implemented a novel derivative-free neural network by combining a conventional neural network with an evolutionary strategy optimization method used as a solver. Using this novel neural network system, we optimized the scoring function to align remote sequence pairs. Our results showed that the pairwise-profile aligner using the novel scoring function significantly improved both alignment sensitivity and precision relative to aligners using existing functions. CONCLUSIONS: We developed and implemented a novel derivative-free neural network and aligner (Nepal) for optimizing sequence alignments. Nepal improved alignment quality by adapting to remote sequence alignments and increasing the expressiveness of similarity scores. Additionally, this novel scoring function can be realized using a simple matrix operation and easily incorporated into other aligners. Moreover, our scoring function could potentially improve the performance of homology detection and/or multiple-sequence alignment of remote homologous sequences. The goal of the study was to provide a novel scoring function for the profile alignment method and develop a novel learning system capable of addressing derivative-free problems. Our system is capable of optimizing the performance of other sophisticated methods and solving problems without derivative-of-cost functions, which do not always exist in practical problems. Our results demonstrated the usefulness of this optimization method for derivative-free problems.
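As a point of reference for the baseline this abstract mentions, the sketch below computes cosine similarity between PSSM columns and fills the position-by-position score matrix that a dynamic-programming aligner would consume. It is only an illustration of the conventional scoring function, not of Nepal's learned one; the function names and the toy three-dimensional columns (real PSSM columns have 20 entries) are assumptions for the example.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length score vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for all-zero columns
    return dot / (norm_u * norm_v)


def score_matrix(profile_a, profile_b):
    """Score every column of profile_a against every column of profile_b.

    Each profile is a list of columns, one per sequence position. The
    result has len(profile_a) rows and len(profile_b) columns.
    """
    return [[cosine_similarity(col_a, col_b) for col_b in profile_b]
            for col_a in profile_a]


if __name__ == "__main__":
    a = [[1.0, 0.0, 2.0], [0.0, 3.0, 1.0]]
    b = [[2.0, 0.0, 4.0], [0.0, 1.0, 0.0], [1.0, 1.0, 1.0]]
    for row in score_matrix(a, b):
        print([round(s, 3) for s in row])
```

Because the score depends only on the angle between the two column vectors, it responds linearly to their agreement, which is the limitation the abstract points to when motivating a neural-network scoring function.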
|
209
|
Gao J, Yang Y, Zhou Y. Grid-based prediction of torsion angle probabilities of protein backbone and its application to discrimination of protein intrinsic disorder regions and selection of model structures. BMC Bioinformatics 2018; 19:29. [PMID: 29390958 PMCID: PMC5796405 DOI: 10.1186/s12859-018-2031-7] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Received: 09/14/2017] [Accepted: 01/17/2018] [Indexed: 12/03/2022] Open
Abstract
BACKGROUND: Protein structure can be described by backbone torsion angles: rotational angles about the N-Cα bond (φ) and the Cα-C bond (ψ), or the angle between Cα(i-1), Cα(i), and Cα(i+1) (θ) and the rotational angle about the Cα(i)-Cα(i+1) bond (τ). Thus, their accurate prediction is useful for structure prediction and model refinement. Early methods predicted torsion angles in a few discrete bins, whereas most recent methods have focused on prediction of angles in real, continuous values. Real-value prediction, however, is unable to provide information on the probabilities of predicted angles. RESULTS: Here, we propose to predict angles in fine grids of 5° by using deep learning neural networks. We found that this grid-based technique can yield 2-6% higher accuracy in predicting angles in the same 5° bin than the existing prediction techniques compared. We further demonstrate the usefulness of predicted probabilities at given angle bins in discrimination of intrinsically disordered regions and in selection of protein models. CONCLUSIONS: The proposed method may be useful for characterizing protein structure and disorder. The method is available at http://sparks-lab.org/server/SPIDER2/ as part of the SPIDER2 package.
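The grid idea in this abstract amounts to turning angle regression into classification over 72 bins of 5° each. The sketch below shows one way to map an angle to its bin and to read the most probable angle back from a vector of per-bin probabilities. The bin layout (bin 0 starting at -180°) and the function names are assumptions for illustration; the abstract does not specify how the authors index their grid.

```python
def angle_to_bin(angle_deg, bin_width=5.0):
    """Map an angle in degrees to a bin index, with bin 0 starting at -180."""
    n_bins = int(360 / bin_width)
    wrapped = (angle_deg + 180.0) % 360.0  # now in [0, 360)
    return int(wrapped // bin_width) % n_bins


def bin_center(index, bin_width=5.0):
    """Return the central angle (degrees) of a bin."""
    return -180.0 + (index + 0.5) * bin_width


def most_probable_angle(probs, bin_width=5.0):
    """Pick the bin with the highest probability.

    Returns the bin's central angle together with its probability, which
    is the confidence information a real-valued predictor cannot give.
    """
    best = max(range(len(probs)), key=probs.__getitem__)
    return bin_center(best, bin_width), probs[best]


if __name__ == "__main__":
    print(angle_to_bin(-63.0), bin_center(angle_to_bin(-63.0)))
    probs = [0.0] * 72
    probs[23], probs[24] = 0.7, 0.3
    print(most_probable_angle(probs))
```

With this layout, 180° and -180° fall into the same bin, which keeps the periodic boundary from splitting one conformation across two classes.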
Affiliation(s)
- Jianzhao Gao
- School of Mathematical Sciences and LPMC, Nankai University, Tianjin, 300071 People’s Republic of China
- Yuedong Yang
- School of Data and Computer Science, Sun Yat-sen University, Guangzhou, 510000 People’s Republic of China
- Yaoqi Zhou
- Institute for Glycomics and School of Information and Communication Technology, Griffith University, Parklands Dr, Southport, QLD 4222 Australia
|
210
|
Müller AT, Hiss JA, Schneider G. Recurrent Neural Network Model for Constructive Peptide Design. J Chem Inf Model 2018; 58:472-479. [PMID: 29355319 DOI: 10.1021/acs.jcim.7b00414] [Citation(s) in RCA: 130] [Impact Index Per Article: 18.6] [Indexed: 01/01/2023]
Abstract
We present a generative long short-term memory (LSTM) recurrent neural network (RNN) for combinatorial de novo peptide design. RNN models capture patterns in sequential data and generate new data instances from the learned context. Amino acid sequences represent a suitable input for these machine-learning models. Generative models trained on peptide sequences could therefore facilitate the design of bespoke peptide libraries. We trained RNNs with LSTM units on pattern recognition of helical antimicrobial peptides and used the resulting model for de novo sequence generation. Of these sequences, 82% were predicted to be active antimicrobial peptides compared to 65% of randomly sampled sequences with the same amino acid distribution as the training set. The generated sequences also lie closer to the training data than manually designed amphipathic helices. The results of this study showcase the ability of LSTM RNNs to construct new amino acid sequences within the applicability domain of the model and motivate their prospective application to peptide and protein design without the need for the exhaustive enumeration of sequence libraries.
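The random baseline this abstract compares against, sequences sampled with the same amino acid distribution as the training set, can be sketched as below. This illustrates only the baseline, not the LSTM generator; the fixed peptide length, the seed handling, and the function names are assumptions for the example, and the toy training sequences are made up.

```python
import random
from collections import Counter


def residue_frequencies(sequences):
    """Relative frequency of each residue across all training sequences."""
    counts = Counter("".join(sequences))
    total = sum(counts.values())
    return {residue: n / total for residue, n in counts.items()}


def sample_random_peptides(sequences, n, length, seed=0):
    """Draw n peptides whose residues follow the training-set distribution.

    Residues are sampled independently, so the composition matches the
    training data on average but no positional pattern is preserved.
    """
    freqs = residue_frequencies(sequences)
    residues = sorted(freqs)
    weights = [freqs[r] for r in residues]
    rng = random.Random(seed)
    return ["".join(rng.choices(residues, weights=weights, k=length))
            for _ in range(n)]


if __name__ == "__main__":
    training = ["KLAKLAKKLAKLAK", "GIGKFLHSAKKFGK"]  # hypothetical examples
    for peptide in sample_random_peptides(training, n=3, length=12):
        print(peptide)
```

Such a baseline controls for composition alone, so any gain by a generative model over it can be attributed to learned sequence patterns rather than residue frequencies.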
Affiliation(s)
- Alex T Müller
- Swiss Federal Institute of Technology (ETH), Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 4, CH-8093 Zurich, Switzerland
- Jan A Hiss
- Swiss Federal Institute of Technology (ETH), Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 4, CH-8093 Zurich, Switzerland
- Gisbert Schneider
- Swiss Federal Institute of Technology (ETH), Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 4, CH-8093 Zurich, Switzerland
|
211
|
Li GXH, Vogel C, Choi H. PTMscape: an open source tool to predict generic post-translational modifications and map modification crosstalk in protein domains and biological processes. Mol Omics 2018; 14:197-209. [PMID: 29876573 PMCID: PMC6115748 DOI: 10.1039/c8mo00027a] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Indexed: 12/14/2022]
Abstract
PTMscape predicts PTM sites using descriptors of sequence and physico-chemical microenvironment, and tests enrichment of single or pairs of PTMs in protein domains.
While tandem mass spectrometry can detect post-translational modifications (PTMs) at the proteome scale, reported PTM sites are often incomplete and include false positives. Computational approaches can complement these datasets with additional predictions, but most available tools use prediction models pre-trained for a single PTM type by the developers, and it remains a difficult task to perform large-scale batch prediction for multiple PTMs with flexible user control, including the choice of training data. We developed an R package called PTMscape which predicts PTM sites across the proteome based on a unified and comprehensive set of descriptors of the physico-chemical microenvironment of modified sites, with additional downstream analysis modules to test enrichment of individual or pairs of PTMs in protein domains. PTMscape is flexible in its ability to process any major modification, such as phosphorylation and ubiquitination, while achieving sensitivity and specificity comparable to single-PTM methods and outperforming other multi-PTM tools. Applying this framework, we expanded proteome-wide coverage of five major PTMs affecting different residues by prediction, especially for lysine and arginine modifications. Using a combination of experimentally acquired sites (PSP) and newly predicted sites, we discovered that crosstalk among multiple PTMs occurs more frequently than by random chance in key protein domains such as histone, protein kinase, and RNA recognition motifs, spanning various biological processes such as RNA processing, DNA damage response, signal transduction, and regulation of the cell cycle. These results provide a proteome-scale analysis of crosstalk among major PTMs and can be easily extended to other types of PTM.
Affiliation(s)
- Ginny X H Li
- Saw Swee Hock School of Public Health, National University of Singapore, Singapore.
|
212
|
Li S, Chen J, Liu B. Protein remote homology detection based on bidirectional long short-term memory. BMC Bioinformatics 2017; 18:443. [PMID: 29017445 PMCID: PMC5634958 DOI: 10.1186/s12859-017-1842-2] [Citation(s) in RCA: 41] [Impact Index Per Article: 5.1] [Received: 07/19/2017] [Accepted: 09/21/2017] [Indexed: 01/05/2023] Open
Abstract
BACKGROUND: Protein remote homology detection plays a vital role in studies of protein structures and functions. Almost all of the traditional machine learning methods require fixed-length features to represent the protein sequences. However, it is never an easy task to extract discriminative features with limited knowledge of proteins. On the other hand, deep learning has demonstrated its advantage in automatically learning representations. It is worthwhile to explore the applications of deep learning techniques to protein remote homology detection. RESULTS: In this study, we employ Bidirectional Long Short-Term Memory (BLSTM) to learn effective features from pseudo proteins, and propose a predictor called ProDec-BLSTM, which includes an input layer, a bidirectional LSTM, a time distributed dense layer, and an output layer. This neural network can automatically extract discriminative features by using the bidirectional LSTM and the time distributed dense layer. CONCLUSION: Experimental results on a widely used benchmark dataset show that ProDec-BLSTM outperforms other related methods in terms of both the mean ROC and mean ROC50 scores. This promising result shows that ProDec-BLSTM is a useful tool for protein remote homology detection. Furthermore, the hidden patterns learnt by ProDec-BLSTM can be interpreted and visualized, and therefore, additional useful information can be obtained.
Affiliation(s)
- Shumin Li
- School of Computer Science and Technology, Harbin Institute of Technology Shenzhen Graduate School, HIT Campus Shenzhen University Town, Xili, Shenzhen, 518055, China
- Junjie Chen
- School of Computer Science and Technology, Harbin Institute of Technology Shenzhen Graduate School, HIT Campus Shenzhen University Town, Xili, Shenzhen, 518055, China
- Bin Liu
- School of Computer Science and Technology, Harbin Institute of Technology Shenzhen Graduate School, HIT Campus Shenzhen University Town, Xili, Shenzhen, 518055, China.
|
213
|
Li H, Hou J, Adhikari B, Lyu Q, Cheng J. Deep learning methods for protein torsion angle prediction. BMC Bioinformatics 2017; 18:417. [PMID: 28923002 PMCID: PMC5604354 DOI: 10.1186/s12859-017-1834-2] [Citation(s) in RCA: 34] [Impact Index Per Article: 4.3] [Received: 04/14/2017] [Accepted: 09/11/2017] [Indexed: 12/31/2022] Open
Abstract
Background: Deep learning is one of the most powerful machine learning methods and has achieved state-of-the-art performance in many domains. Since deep learning was introduced to the field of bioinformatics in 2012, it has achieved success in a number of areas such as protein residue-residue contact prediction, secondary structure prediction, and fold recognition. In this work, we developed deep learning methods to improve the prediction of torsion (dihedral) angles of proteins. Results: We design four different deep learning architectures to predict protein torsion angles. The architectures include a deep neural network (DNN), a deep restricted Boltzmann machine (DRBM), a deep recurrent neural network (DRNN), and a deep recurrent restricted Boltzmann machine (DReRBM), since protein torsion angle prediction is a sequence-related problem. In addition to existing protein features, two new features (predicted residue contact number and the error distribution of torsion angles extracted from sequence fragments) are used as input to each of the four deep learning architectures to predict phi and psi angles of the protein backbone. The mean absolute error (MAE) of phi and psi angles predicted by DRNN, DReRBM, DRBM and DNN is about 20–21° and 29–30° on an independent dataset. The MAE of the phi angle is comparable to the existing methods, but the MAE of the psi angle is 29°, 2° lower than the existing methods. On the latest CASP12 targets, our methods also achieved performance better than or comparable to a state-of-the-art method. Conclusions: Our experiment demonstrates that deep learning is a valuable method for predicting protein torsion angles. The deep recurrent network architecture performs slightly better than the deep feed-forward architecture, and the predicted residue contact number and the error distribution of torsion angles extracted from sequence fragments are useful features for improving prediction accuracy.
Electronic supplementary material The online version of this article (10.1186/s12859-017-1834-2) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Haiou Li
- Department of Computer Science and Technology, Soochow University, Suzhou, Jiangsu, 215006, China
- Jie Hou
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO, 65211, USA
- Badri Adhikari
- Department of Mathematics and Computer Science, University of Missouri-St. Louis, 1 University Blvd. 311 Express Scripts Hall, St. Louis, MO, 63121, USA
- Qiang Lyu
- Department of Computer Science and Technology, Soochow University, Suzhou, Jiangsu, 215006, China
- Jianlin Cheng
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO, 65211, USA.
|
214
|
Protein secondary structure prediction: A survey of the state of the art. J Mol Graph Model 2017; 76:379-402. [DOI: 10.1016/j.jmgm.2017.07.015] [Citation(s) in RCA: 50] [Impact Index Per Article: 6.3] [Received: 06/05/2017] [Revised: 07/14/2017] [Accepted: 07/17/2017] [Indexed: 11/21/2022]
|