Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Singh J, Litfin T, Paliwal K, Singh J, Hanumanthappa AK, Zhou Y. SPOT-1D-Single: Improving the Single-Sequence-Based Prediction of Protein Secondary Structure, Backbone Angles, Solvent Accessibility and Half-Sphere Exposures using a Large Training Set and Ensembled Deep Learning. Bioinformatics 2021;37:3464-3472. [PMID: 33983382 DOI: 10.1093/bioinformatics/btab316] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [What about the content of this article? (0)] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/06/2020] [Revised: 04/06/2021] [Accepted: 04/26/2021] [Indexed: 02/01/2023] Open

For:	Singh J, Litfin T, Paliwal K, Singh J, Hanumanthappa AK, Zhou Y. SPOT-1D-Single: Improving the Single-Sequence-Based Prediction of Protein Secondary Structure, Backbone Angles, Solvent Accessibility and Half-Sphere Exposures using a Large Training Set and Ensembled Deep Learning. Bioinformatics 2021;37:3464-3472. [PMID: 33983382 DOI: 10.1093/bioinformatics/btab316] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [What about the content of this article? (0)] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/06/2020] [Revised: 04/06/2021] [Accepted: 04/26/2021] [Indexed: 02/01/2023] Open

Number

Cited by Other Article(s)

Cheng L, Lu W, Xia Y, Lu Y, Shen J, Hui Z, Xu Y, Wu H, Chen J, Fu Q, Lu Y. ProAttUnet: Advancing protein secondary structure prediction with deep learning via U-Net dual-pathway feature fusion and ESM2 pretrained protein language model. Comput Biol Chem 2025;118:108429. [PMID: 40288255 DOI: 10.1016/j.compbiolchem.2025.108429] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/12/2024] [Revised: 02/23/2025] [Accepted: 03/12/2025] [Indexed: 04/29/2025]

Alanazi W, Meng D, Pollastri G. Advancements in one-dimensional protein structure prediction using machine learning and deep learning. Comput Struct Biotechnol J 2025;27:1416-1430. [PMID: 40242292 PMCID: PMC12002955 DOI: 10.1016/j.csbj.2025.04.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/13/2025] [Revised: 04/01/2025] [Accepted: 04/02/2025] [Indexed: 04/18/2025] Open

Wang Z, Yang X, Gao S, Liang Y, Shi X. GraphPhos: Predict Protein-Phosphorylation Sites Based on Graph Neural Networks. Int J Mol Sci 2025;26:941. [PMID: 39940709 PMCID: PMC11818044 DOI: 10.3390/ijms26030941] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/09/2024] [Revised: 01/16/2025] [Accepted: 01/20/2025] [Indexed: 02/16/2025] Open

Zhang J, Qian J, Zou Q, Zhou F, Kurgan L. Recent Advances in Computational Prediction of Secondary and Supersecondary Structures from Protein Sequences. Methods Mol Biol 2025;2870:1-19. [PMID: 39543027 DOI: 10.1007/978-1-0716-4213-9_1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/17/2024]

Wu T, Cheng W, Cheng J. Improving Protein Secondary Structure Prediction by Deep Language Models and Transformer Networks. Methods Mol Biol 2025;2867:43-53. [PMID: 39576574 DOI: 10.1007/978-1-0716-4196-5_3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/24/2024]

Alanazi W, Meng D, Pollastri G. Porter 6: Protein Secondary Structure Prediction by Leveraging Pre-Trained Language Models (PLMs). Int J Mol Sci 2024;26:130. [PMID: 39795988 PMCID: PMC11719765 DOI: 10.3390/ijms26010130] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/29/2024] [Revised: 12/18/2024] [Accepted: 12/25/2024] [Indexed: 01/13/2025] Open

da Rocha W, Liberti L, Mucherino A, Malliavin TE. Influence of Stereochemistry in a Local Approach for Calculating Protein Conformations. J Chem Inf Model 2024;64:8999-9008. [PMID: 39560315 DOI: 10.1021/acs.jcim.4c01232] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/20/2024]

Dong B, Su H, Xu D, Hou C, Liu Z, Niu N, Wang G. ILMCNet: A Deep Neural Network Model That Uses PLM to Process Features and Employs CRF to Predict Protein Secondary Structure. Genes (Basel) 2024;15:1350. [PMID: 39457474 PMCID: PMC11507629 DOI: 10.3390/genes15101350] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/07/2024] [Revised: 10/07/2024] [Accepted: 10/18/2024] [Indexed: 10/28/2024] Open

Abstract

BACKGROUND

Protein secondary structure prediction (PSSP) is a critical task in computational biology, pivotal for understanding protein function and advancing medical diagnostics. Recently, approaches that integrate multiple amino acid sequence features have gained significant attention in PSSP research.

OBJECTIVES

We aim to automatically extract additional features represented by evolutionary information from a large number of sequences while simultaneously incorporating positional information for more comprehensive sequence features. Additionally, we consider the interdependence between secondary structures during the prediction stage.

METHODS

To this end, we propose a deep neural network model, ILMCNet, which utilizes a language model and Conditional Random Field (CRF). Protein language models (PLMs) pre-trained on sequences from multiple large databases can provide sequence features that incorporate evolutionary information. ILMCNet uses positional encoding to ensure that the input features include positional information. To better utilize these features, we propose a hybrid network architecture that employs a Transformer Encoder to enhance features and integrates a feature extraction module combining a Convolutional Neural Network (CNN) with a Bidirectional Long Short-Term Memory Network (BiLSTM). This design enables deep extraction of localized features while capturing global bidirectional information. In the prediction stage, ILMCNet employs CRF to capture the interdependencies between secondary structures.

RESULTS

Experimental results on benchmark datasets such as CB513, TS115, NEW364, CASP11, and CASP12 demonstrate that the prediction performance of our method surpasses that of comparable approaches.

CONCLUSIONS

This study proposes a new approach to PSSP research and is expected to play an important role in other protein-related research fields, such as protein tertiary structure prediction.

Collapse

Zhang B, Zheng M, Zhang Y, Quan L. DCMA: faster protein backbone dihedral angle prediction using a dilated convolutional attention-based neural network. FRONTIERS IN BIOINFORMATICS 2024;4:1477909. [PMID: 39493577 PMCID: PMC11527783 DOI: 10.3389/fbinf.2024.1477909] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/08/2024] [Accepted: 09/30/2024] [Indexed: 11/05/2024] Open

Zhao MX, Ding RF, Chen Q, Meng J, Li F, Fu S, Huang B, Liu Y, Ji ZL, Zhao Y. Nphos: Database and Predictor of Protein N-phosphorylation. GENOMICS, PROTEOMICS & BIOINFORMATICS 2024;22:qzae032. [PMID: 39380205 PMCID: PMC12016571 DOI: 10.1093/gpbjnl/qzae032] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 12/28/2022] [Revised: 03/03/2024] [Accepted: 04/01/2024] [Indexed: 10/10/2024]

Ouyang J, Gao Y, Yang Y. PCP-GC-LM: single-sequence-based protein contact prediction using dual graph convolutional neural network and convolutional neural network. BMC Bioinformatics 2024;25:287. [PMID: 39223474 PMCID: PMC11370006 DOI: 10.1186/s12859-024-05914-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/04/2023] [Accepted: 08/22/2024] [Indexed: 09/04/2024] Open

Abstract

BACKGROUND

Recently, the process of evolution information and the deep learning network has promoted the improvement of protein contact prediction methods. Nevertheless, still remain some bottleneck: (1) One of the bottlenecks is the prediction of orphans and other fewer evolution information proteins. (2) The other bottleneck is the method of predicting single-sequence-based proteins mainly focuses on selecting protein sequence features and tuning the neural network architecture, However, while the deeper neural networks improve prediction accuracy, there is still the problem of increasing the computational burden. Compared with other neural networks in the field of protein prediction, the graph neural network has the following advantages: due to the advantage of revealing the topology structure via graph neural network and being able to take advantage of the hierarchical structure and local connectivity of graph neural networks has certain advantages in capturing the features of different levels of abstraction in protein molecules. When using protein sequence and structure information for joint training, the dependencies between the two kinds of information can be better captured. And it can process protein molecular structures of different lengths and shapes, while traditional neural networks need to convert proteins into fixed-size vectors or matrices for processing.

RESULTS

Here, we propose a single-sequence-based protein contact map predictor PCP-GC-LM, with dual-level graph neural networks and convolution networks. Our method performs better with other single-sequence-based predictors in different independent tests. In addition, to verify the validity of our method against complex protein structures, we will also compare it with other methods in two homodimers protein test sets (DeepHomo test dataset and CASP-CAPRI target dataset). Furthermore, we also perform ablation experiments to demonstrate the necessity of a dual graph network. In all, our framework presents new modules to accurately predict inter-chain contact maps in protein and it's also useful to analyze interactions in other types of protein complexes.

Collapse

Munna MMR, Islam MA, Shanta SS, Monty MA. Structural, functional, molecular docking analysis of a hypothetical protein from Talaromyces marneffei and its molecular dynamic simulation: an in-silico approach. J Biomol Struct Dyn 2024:1-20. [PMID: 38345137 DOI: 10.1080/07391102.2024.2314264] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/31/2023] [Accepted: 01/29/2024] [Indexed: 03/01/2024]

Kim Y, Kwon J. AttSec: protein secondary structure prediction by capturing local patterns from attention map. BMC Bioinformatics 2023;24:183. [PMID: 37142993 PMCID: PMC10161504 DOI: 10.1186/s12859-023-05310-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/01/2023] [Accepted: 04/27/2023] [Indexed: 05/06/2023] Open

Abstract

BACKGROUND

Protein secondary structures that link simple 1D sequences to complex 3D structures can be used as good features for describing the local properties of protein, but also can serve as key features for predicting the complex 3D structures of protein. Thus, it is very important to accurately predict the secondary structure of the protein, which contains a local structural property assigned by the pattern of hydrogen bonds formed between amino acids. In this study, we accurately predict protein secondary structure by capturing the local patterns of protein. For this objective, we present a novel prediction model, AttSec, based on transformer architecture. In particular, AttSec extracts self-attention maps corresponding to pairwise features between amino acid embeddings and passes them through 2D convolution blocks to capture local patterns. In addition, instead of using additional evolutionary information, it uses protein embedding as an input, which is generated by a language model.

RESULTS

For the ProteinNet DSSP8 dataset, our model showed 11.8% better performance on the entire evaluation datasets compared with other no-evolutionary-information-based models. For the NetSurfP-2.0 DSSP8 dataset, it showed 1.2% better performance on average. There was an average performance improvement of 9.0% for the ProteinNet DSSP3 dataset and an average of 0.7% for the NetSurfP-2.0 DSSP3 dataset.

CONCLUSION

We accurately predict protein secondary structure by capturing the local patterns of protein. For this objective, we present a novel prediction model, AttSec, based on transformer architecture. Although there was no dramatic accuracy improvement compared with other models, the improvement on DSSP8 was greater than that on DSSP3. This result implies that using our proposed pairwise feature could have a remarkable effect for several challenging tasks that require finely subdivided classification. Github package URL is https://github.com/youjin-DDAI/AttSec .

Collapse

Chandra A, Tünnermann L, Löfstedt T, Gratz R. Transformer-based deep learning for predicting protein properties in the life sciences. eLife 2023;12:e82819. [PMID: 36651724 PMCID: PMC9848389 DOI: 10.7554/elife.82819] [Citation(s) in RCA: 53] [Impact Index Per Article: 26.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/22/2022] [Accepted: 01/06/2023] [Indexed: 01/19/2023] Open

Mufassirin MMM, Newton MAH, Sattar A. Artificial intelligence for template-free protein structure prediction: a comprehensive review. Artif Intell Rev 2022. [DOI: 10.1007/s10462-022-10350-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/23/2022]

Ismi DP, Pulungan R, Afiahayati. Deep learning for protein secondary structure prediction: Pre and post-AlphaFold. Comput Struct Biotechnol J 2022;20:6271-6286. [PMID: 36420164 PMCID: PMC9678802 DOI: 10.1016/j.csbj.2022.11.012] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/06/2022] [Revised: 11/05/2022] [Accepted: 11/05/2022] [Indexed: 11/13/2022] Open

Jin X, Guo L, Jiang Q, Wu N, Yao S. Prediction of protein secondary structure based on an improved channel attention and multiscale convolution module. Front Bioeng Biotechnol 2022;10:901018. [PMID: 35935483 PMCID: PMC9355137 DOI: 10.3389/fbioe.2022.901018] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/21/2022] [Accepted: 06/28/2022] [Indexed: 11/13/2022] Open

Singh J, Paliwal K, Litfin T, Singh J, Zhou Y. Predicting RNA distance-based contact maps by integrated deep learning on physics-inferred secondary structure and evolutionary-derived mutational coupling. Bioinformatics 2022;38:3900-3910. [PMID: 35751593 PMCID: PMC9364379 DOI: 10.1093/bioinformatics/btac421] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/24/2021] [Revised: 04/30/2022] [Accepted: 06/28/2022] [Indexed: 12/24/2022] Open

Wang R, Jin J, Zou Q, Nakai K, Wei L. Predicting protein-peptide binding residues via interpretable deep learning. Bioinformatics 2022;38:3351-3360. [PMID: 35604077 DOI: 10.1093/bioinformatics/btac352] [Citation(s) in RCA: 33] [Impact Index Per Article: 11.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/15/2021] [Revised: 04/13/2022] [Accepted: 05/18/2022] [Indexed: 11/14/2022] Open

Abstract

Identifying the protein-peptide binding residues is fundamentally important to understand the mechanisms of protein functions and explore drug discovery. Although several computational methods have been developed, they highly rely on third-party tools or information for feature design, easily resulting in low computational efficacy and suffering from low predictive performance. To address the limitations, we propose PepBCL, a novel BERT (Bidirectional Encoder Representation from Transformers)-based Contrastive Learning framework to predict the protein-Peptide binding residues based on protein sequences only. PepBCL is an end-to-end predictive model that is independent of designed features. Specifically, we introduce a well pre-trained protein language model that can automatically extract and learn high-latent representations of protein sequences relevant for protein structure and functions. Further, we design a novel contrastive learning module to optimize the feature representations of binding residues underlying the imbalanced dataset. We demonstrate that our proposed method significantly outperforms the state-of-the-art methods under benchmarking comparison, and achieves more robust performance. Moreover, we found that we further improve the performance via the integration of traditional features and our learnt features. Our results highlight the flexibility and adaptability of deep learning-based protein language model to capture both conserved and non-conserved sequential characteristics of peptide-binding residues. Interestingly, we demonstrate that peptide-binding residues in local sequential regions have more specific sequential patterns as compared with other protein-ligand binding residues, which potentially provides functional difference. Finally, to facilitate the use of our method, we establish an online predictive platform as the implementation of the proposed PepBCL, which is now available at http://server.wei-group.net/PepBCL/.

AVAILABILITY

https://github.com/Ruheng-W/PepBCL.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

Collapse

Yang W, Liu Y, Xiao C. Deep metric learning for accurate protein secondary structure prediction. Knowl Based Syst 2022. [DOI: 10.1016/j.knosys.2022.108356] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/19/2022]

Singh J, Litfin T, Singh J, Paliwal K, Zhou Y. SPOT-Contact-LM: improving single-sequence-based prediction of protein contact map using a transformer language model. Bioinformatics 2022;38:1888-1894. [PMID: 35104320 PMCID: PMC9113311 DOI: 10.1093/bioinformatics/btac053] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/02/2021] [Revised: 11/21/2021] [Accepted: 01/26/2022] [Indexed: 02/03/2023] Open

Singh J, Paliwal K, Singh J, Zhou Y. RNA Backbone Torsion and Pseudotorsion Angle Prediction Using Dilated Convolutional Neural Networks. J Chem Inf Model 2021;61:2610-2622. [PMID: 34037398 DOI: 10.1021/acs.jcim.1c00153] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/16/2022]