Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Maddhuri Venkata Subramaniya SR, Terashi G, Jain A, Kagaya Y, Kihara D. Protein Contact Map Refinement for Improving Structure Prediction Using Generative Adversarial Networks. Bioinformatics 2021;37:3168-3174. [PMID: 33787852 PMCID: PMC8504630 DOI: 10.1093/bioinformatics/btab220] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 09/04/2020] [Revised: 02/28/2021] [Accepted: 03/30/2021] [Indexed: 11/13/2022] Open

Cited by Other Article(s)

Sehsah AI, Mousa A, Farouk G. A hybrid variational autoencoder and WGAN with gradient penalty for tertiary protein structure generation. Sci Rep 2025;15:14191. [PMID: 40268976 PMCID: PMC12019360 DOI: 10.1038/s41598-025-94747-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/03/2024] [Accepted: 03/17/2025] [Indexed: 04/25/2025] Open

Abstract

Elucidating the tertiary structure of proteins is important for understanding their functions and interactions. While deep neural networks have advanced the prediction of a protein's native structure from its amino acid sequence, the focus on a single-structure view limits understanding of the dynamic nature of protein molecules. Acquiring a multi-structure view of protein molecules remains a broader challenge in computational structural biology. Alternative representations, such as distance matrices, offer a compact and effective way to explore and generate realistic tertiary protein structures. This paper presents TP-VWGAN, a hybrid model to improve the realism of generating distance matrix representations of tertiary protein structures. The model integrates the probabilistic representation learning of the Variational Autoencoder (VAE) with the realistic data generation strength of the Wasserstein Generative Adversarial Network with Gradient Penalty (WGAN-GP). The main modification of TP-VWGAN is incorporating residual blocks into its VAE architecture to improve its performance. The experimental results show that TP-VWGAN with and without residual blocks outperforms existing methods in generating realistic protein structures, but incorporating residual blocks enhances its ability to capture key structural features. Comparisons also demonstrate that the more accurately a model learns symmetry features in the generated distance matrices, the better it captures key structural features, as demonstrated through benchmarking against existing methods. This work moves us closer to more advanced deep generative models that can explore a broader range of protein structures and be applied to drug design and protein engineering. The code and data are available at https://github.com/aalaa-sehsah/tp-vwgan.
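
As a reader's aid, the distance matrix representation and the symmetry property mentioned in this abstract can be illustrated with a minimal sketch. It is not taken from the TP-VWGAN code: the function names, the toy coordinates, and the choice of maximum absolute asymmetry as the measure are all assumptions made for illustration.

```python
import math


def distance_matrix(coords):
    """Pairwise Euclidean distances between (x, y, z) points."""
    n = len(coords)
    return [[math.dist(coords[i], coords[j]) for j in range(n)] for i in range(n)]


def symmetry_error(matrix):
    """Largest absolute difference between matrix[i][j] and matrix[j][i]."""
    n = len(matrix)
    return max(abs(matrix[i][j] - matrix[j][i]) for i in range(n) for j in range(n))


# Hypothetical C-alpha coordinates (in angstroms) for a four-residue fragment.
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0), (0.0, 3.8, 0.0)]
dm = distance_matrix(coords)

# A matrix computed from real coordinates is symmetric with a zero diagonal.
print(symmetry_error(dm))

# A generated matrix need not be symmetric; here one entry is perturbed
# by hand to mimic such an imperfection.
noisy = [row[:] for row in dm]
noisy[0][1] += 0.5
print(symmetry_error(noisy))
```

A matrix derived from actual coordinates has zero asymmetry, so a nonzero value on a generated matrix gives one simple indication of how far it departs from a physically valid distance matrix.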

Liu H, Zhuo C, Gao J, Zeng C, Zhao Y. AI-integrated network for RNA complex structure and dynamic prediction. Biophysics Reviews 2024;5:041304. [PMID: 39512332 PMCID: PMC11540444 DOI: 10.1063/5.0237319] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/04/2024] [Accepted: 10/15/2024] [Indexed: 11/15/2024]

Zhao C, Wang S. AttCON: With better MSAs and attention mechanism for accurate protein contact map prediction. Comput Biol Med 2024;169:107822. [PMID: 38091726 DOI: 10.1016/j.compbiomed.2023.107822] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/06/2023] [Revised: 11/19/2023] [Accepted: 12/04/2023] [Indexed: 02/08/2024]

Dou B, Zhu Z, Merkurjev E, Ke L, Chen L, Jiang J, Zhu Y, Liu J, Zhang B, Wei GW. Machine Learning Methods for Small Data Challenges in Molecular Science. Chem Rev 2023;123:8736-8780. [PMID: 37384816 PMCID: PMC10999174 DOI: 10.1021/acs.chemrev.3c00189] [Citation(s) in RCA: 85] [Impact Index Per Article: 42.5] [Indexed: 07/01/2023]

Abstract

Small data are often used in scientific and engineering research due to the presence of various constraints, such as time, cost, ethics, privacy, security, and technical limitations in data acquisition. However, while big data have been the focus for the past decade, small data and their challenges have received little attention, even though they are technically more severe in machine learning (ML) and deep learning (DL) studies. Overall, the small data challenge is often compounded by issues such as data diversity, imputation, noise, imbalance, and high dimensionality. Fortunately, the current big data era is characterized by technological breakthroughs in ML, DL, and artificial intelligence (AI), which enable data-driven scientific discovery, and many advanced ML and DL technologies developed for big data have inadvertently provided solutions for small data problems. As a result, significant progress has been made in ML and DL for small data challenges in the past decade. In this review, we summarize and analyze several emerging potential solutions to small data challenges in molecular science, including chemical and biological sciences. We review both basic machine learning algorithms, such as linear regression, logistic regression (LR), k-nearest neighbor (KNN), support vector machine (SVM), kernel learning (KL), random forest (RF), and gradient boosting trees (GBT), and more advanced techniques, including artificial neural network (ANN), convolutional neural network (CNN), U-Net, graph neural network (GNN), Generative Adversarial Network (GAN), long short-term memory (LSTM), autoencoder, transformer, transfer learning, active learning, graph-based semi-supervised learning, combining deep learning with traditional machine learning, and physical model-based data augmentation. We also briefly discuss the latest advances in these methods. Finally, we conclude the survey with a discussion of promising trends in small data challenges in molecular science.

Improved inter-residue contact prediction via a hybrid generative model and dynamic loss function. Comput Struct Biotechnol J 2022;20:6138-6148. [DOI: 10.1016/j.csbj.2022.11.020] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/04/2022] [Revised: 11/07/2022] [Accepted: 11/07/2022] [Indexed: 11/13/2022] Open

Kagaya Y, Flannery ST, Jain A, Kihara D. ContactPFP: Protein Function Prediction Using Predicted Contact Information. Frontiers in Bioinformatics 2022;2. [PMID: 35875419 PMCID: PMC9302406 DOI: 10.3389/fbinf.2022.896295] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Indexed: 11/13/2022] Open

Abstract

Computational function prediction is one of the most important problems in bioinformatics as elucidating the function of genes is a central task in molecular biology and genomics. Most of the existing function prediction methods use protein sequences as the primary source of input information because the sequence is the most available information for query proteins. There are attempts to consider other attributes of query proteins. Among these attributes, the three-dimensional (3D) structure of proteins is known to be very useful in identifying the evolutionary relationship of proteins, from which functional similarity can be inferred. Here, we report a novel protein function prediction method, ContactPFP, which uses predicted residue-residue contact maps as input structural features of query proteins. Although 3D structure information is known to be useful, it has not been routinely used in function prediction because the 3D structure is not experimentally determined for many proteins. In ContactPFP, we overcome this limitation by using residue-residue contact prediction, which has become increasingly accurate due to rapid development in the protein structure prediction field. ContactPFP takes a query protein sequence as input and uses predicted residue-residue contact as a proxy for the 3D protein structure. To characterize how predicted contacts contribute to function prediction accuracy, we compared the performance of ContactPFP with several well-established sequence-based function prediction methods. The comparative study revealed the advantages and weaknesses of ContactPFP compared to contemporary sequence-based methods. There were many cases where it showed higher prediction accuracy. We examined factors that affected the accuracy of ContactPFP using several illustrative cases that highlight the strength of our method.
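
For readers unfamiliar with the term, a residue-residue contact map is a binary matrix marking which residue pairs lie close together in space. The sketch below only illustrates that representation and is not ContactPFP's prediction method. The 8 angstrom cutoff is a commonly used convention assumed here rather than a value stated in the abstract, and the function name, parameters, and toy distances are hypothetical.

```python
def contact_map(dist, cutoff=8.0, min_separation=1):
    """Binary contact map from a square matrix of inter-residue distances.

    Residues i and j are marked as in contact (1) when they are at least
    `min_separation` positions apart in the sequence and closer than
    `cutoff` angstroms; all other entries are 0.
    """
    n = len(dist)
    return [
        [1 if abs(i - j) >= min_separation and dist[i][j] < cutoff else 0
         for j in range(n)]
        for i in range(n)
    ]


# Hypothetical inter-residue distances (in angstroms) for four residues.
dist = [
    [0.0, 3.8, 6.5, 10.2],
    [3.8, 0.0, 3.8, 7.1],
    [6.5, 3.8, 0.0, 3.8],
    [10.2, 7.1, 3.8, 0.0],
]

for row in contact_map(dist):
    print(row)
```

Raising `min_separation` drops pairs that are close only because they are neighbors in the sequence, which is a common choice when the interest is in medium- and long-range contacts.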

Lee D, Xiong D, Wierbowski S, Li L, Liang S, Yu H. Deep learning methods for 3D structural proteome and interactome modeling. Curr Opin Struct Biol 2022;73:102329. [PMID: 35139457 PMCID: PMC8957610 DOI: 10.1016/j.sbi.2022.102329] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 09/24/2021] [Revised: 12/05/2021] [Accepted: 12/31/2021] [Indexed: 12/19/2022]

Deep generative modeling for protein design. Curr Opin Struct Biol 2021;72:226-236. [PMID: 34963082 DOI: 10.1016/j.sbi.2021.11.008] [Citation(s) in RCA: 45] [Impact Index Per Article: 11.3] [Received: 08/31/2021] [Revised: 11/01/2021] [Accepted: 11/22/2021] [Indexed: 11/21/2022]