1. Nayar G, Altman RB. Heterogeneous network approaches to protein pathway prediction. Comput Struct Biotechnol J 2024; 23:2727-2739. [PMID: 39035835 PMCID: PMC11260399 DOI: 10.1016/j.csbj.2024.06.022] [Received: 03/01/2024] [Revised: 06/17/2024] [Accepted: 06/18/2024] [Indexed: 07/23/2024]
Abstract
Understanding protein-protein interactions (PPIs) and the pathways they comprise is essential for comprehending cellular functions and their links to specific phenotypes. Despite the prevalence of molecular data generated by high-throughput sequencing technologies, a significant gap remains in translating this data into functional information regarding the series of interactions that underlie phenotypic differences. In this review, we present an in-depth analysis of heterogeneous network methodologies for modeling protein pathways, highlighting the critical role of integrating multifaceted biological data. It outlines the process of constructing these networks, from data representation to machine learning-driven predictions and evaluations. The work underscores the potential of heterogeneous networks in capturing the complexity of proteomic interactions, thereby offering enhanced accuracy in pathway prediction. This approach not only deepens our understanding of cellular processes but also opens up new possibilities in disease treatment and drug discovery by leveraging the predictive power of comprehensive proteomic data analysis.
Affiliation(s)
- Gowri Nayar
  - Department of Biomedical Data Science, Stanford University, United States
- Russ B. Altman
  - Department of Biomedical Data Science, Stanford University, United States
  - Department of Genetics, Stanford University, United States
  - Department of Medicine, Stanford University, United States
  - Department of Bioengineering, Stanford University, United States
2. Shen C, Ding P, Wee J, Bi J, Luo J, Xia K. Curvature-enhanced graph convolutional network for biomolecular interaction prediction. Comput Struct Biotechnol J 2024; 23:1016-1025. [PMID: 38425487 PMCID: PMC10904164 DOI: 10.1016/j.csbj.2024.02.006] [Received: 11/22/2023] [Revised: 02/07/2024] [Accepted: 02/07/2024] [Indexed: 03/02/2024]
Abstract
Geometric deep learning has demonstrated great potential in non-Euclidean data analysis. The incorporation of geometric insights into the learning architecture is vital to its success. Here we propose a curvature-enhanced graph convolutional network (CGCN) for biomolecular interaction prediction. Our CGCN employs Ollivier-Ricci curvature (ORC) to characterize the local geometric properties of a network and enhance the learning capability of GCNs. More specifically, ORCs are evaluated based on the local topology of node neighborhoods and then incorporated into the weight function for feature aggregation in the message-passing procedure. Our CGCN model is extensively validated on fourteen real-world biomolecular interaction networks and analyzed in detail using a series of well-designed simulated data. CGCN achieves state-of-the-art results: as far as we know, it outperforms all existing models on thirteen of the fourteen real-world datasets and ranks second on the remaining one. The results from the simulated data show that our CGCN model is superior to traditional GCN models regardless of the positive-to-negative-curvature ratio, network density, and network size (when larger than 500).
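The idea of folding edge curvature into the aggregation weights can be illustrated with a minimal sketch. It assumes the Ollivier-Ricci curvature of every edge has already been computed and is passed in as a dictionary; the softmax weighting, the zero-curvature self-loop, and the function name `curvature_weighted_aggregate` are assumptions of this sketch, not the weight function used by the authors.

```python
import math


def curvature_weighted_aggregate(features, neighbors, curvature):
    """One aggregation step whose neighbor weights come from edge curvature.

    features:  dict node -> list of floats (all lists have the same length)
    neighbors: dict node -> list of adjacent nodes
    curvature: dict frozenset({u, v}) -> precomputed curvature of edge (u, v)
    """
    aggregated = {}
    for node, nbrs in neighbors.items():
        # Self-loop with curvature 0.0 (an assumption of this sketch).
        candidates = [(node, 0.0)]
        candidates += [(n, curvature[frozenset((node, n))]) for n in nbrs]

        # Softmax over curvature values, shifted by the maximum for stability.
        top = max(k for _, k in candidates)
        exps = [math.exp(k - top) for _, k in candidates]
        total = sum(exps)

        dim = len(features[node])
        result = [0.0] * dim
        for (source, _), e in zip(candidates, exps):
            weight = e / total
            for d in range(dim):
                result[d] += weight * features[source][d]
        aggregated[node] = result
    return aggregated
```

Because the weights for each node sum to one, edges with higher curvature pull the aggregated feature toward the corresponding neighbor, while equal curvatures reduce to a plain neighborhood mean.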
Affiliation(s)
- Cong Shen
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha, 410000, China
  - School of Physical and Mathematical Sciences, Nanyang Technological University, 637371, Singapore
- Pingjian Ding
  - Center for Artificial Intelligence in Drug Discovery, School of Medicine, Case Western Reserve University, Cleveland, OH 44106, USA
- Junjie Wee
  - School of Physical and Mathematical Sciences, Nanyang Technological University, 637371, Singapore
- Jialin Bi
  - School of Mathematics, Shandong University, Jinan, 250100, China
- Jiawei Luo
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha, 410000, China
- Kelin Xia
  - School of Physical and Mathematical Sciences, Nanyang Technological University, 637371, Singapore
3. Wang X, Gao X, Fan X, Huai Z, Zhang G, Yao M, Wang T, Huang X, Lai L. WUREN: Whole-modal union representation for epitope prediction. Comput Struct Biotechnol J 2024; 23:2122-2131. [PMID: 38817963 PMCID: PMC11137340 DOI: 10.1016/j.csbj.2024.05.023] [Received: 01/26/2024] [Revised: 05/14/2024] [Accepted: 05/14/2024] [Indexed: 06/01/2024]
Abstract
B-cell epitope identification plays a vital role in the development of vaccines, therapies, and diagnostic tools. Currently, molecular docking tools for B-cell epitope prediction are heavily influenced by empirical parameters and require significant computational resources, which makes it difficult to meet large-scale prediction demands. When predicting epitopes from antigen-antibody complexes, current artificial intelligence algorithms cannot make accurate predictions because their protein feature representations are insufficient, indicating that a novel algorithm for efficient protein information extraction is needed. In this paper, we introduce a multimodal model called WUREN (Whole-modal Union Representation for Epitope predictioN), which effectively combines sequence, graph, and structural features. It achieved AUC-PR scores of 0.213 and 0.193 on the solved structures and AlphaFold-generated structures, respectively, for the independent test proteins selected from the DiscoTope3 benchmark. Our findings indicate that WUREN is an efficient feature extraction model for protein complexes, with generalizable application potential in the development of protein-based drugs. Moreover, the streamlined framework of WUREN could be readily extended to model similar biomolecules, such as nucleic acids, carbohydrates, and lipids.
Affiliation(s)
- Xuezhe Fan
  - XtalPi Innovation Center, Beijing, China
- Zhe Huai
  - XtalPi Innovation Center, Beijing, China
- Lipeng Lai
  - XtalPi Innovation Center, Beijing, China
4. Tang W, van Ooijen PMA, Sival DA, Maurits NM. Automatic two-dimensional & three-dimensional video analysis with deep learning for movement disorders: A systematic review. Artif Intell Med 2024; 156:102952. [PMID: 39180925 DOI: 10.1016/j.artmed.2024.102952] [Received: 03/12/2024] [Revised: 07/19/2024] [Accepted: 08/13/2024] [Indexed: 08/27/2024]
Abstract
The advent of computer vision technology and increased usage of video cameras in clinical settings have facilitated advancements in movement disorder analysis. This review investigated these advancements in terms of providing practical, low-cost solutions for the diagnosis and analysis of movement disorders, such as Parkinson's disease, ataxia, dyskinesia, and Tourette syndrome. Traditional diagnostic methods for movement disorders are typically reliant on the subjective assessment of motor symptoms, which poses inherent challenges. Furthermore, early symptoms are often overlooked, and overlapping symptoms across diseases can complicate early diagnosis. Consequently, deep learning has been used for the objective video-based analysis of movement disorders. This study systematically reviewed the latest advancements in automatic two-dimensional & three-dimensional video analysis using deep learning for movement disorders. We comprehensively analyzed the literature published until September 2023 by searching the Web of Science, PubMed, Scopus, and Embase databases. We identified 68 relevant studies and extracted information on their objectives, datasets, modalities, and methodologies. The study aimed to identify, catalogue, and present the most significant advancements, offering a consolidated knowledge base on the role of video analysis and deep learning in movement disorder analysis. First, the objectives, including specific PD symptom quantification, ataxia assessment, cerebral palsy assessment, gait disorder analysis, tremor assessment, tic detection (in the context of Tourette syndrome), dystonia assessment, and abnormal movement recognition were discussed. Thereafter, the datasets used in the study were examined. Subsequently, video modalities and deep learning methodologies related to the topic were investigated. Finally, the challenges and opportunities in terms of datasets, interpretability, evaluation methods, and home/remote monitoring were discussed.
Affiliation(s)
- Wei Tang
  - Department of Neurology, University Medical Center Groningen, University of Groningen, P.O. Box 30001, 9700 RB Groningen, The Netherlands
  - Data Science Center in Health, University Medical Center Groningen, University of Groningen, P.O. Box 30001, 9700 RB Groningen, The Netherlands
- Peter M A van Ooijen
  - Data Science Center in Health, University Medical Center Groningen, University of Groningen, P.O. Box 30001, 9700 RB Groningen, The Netherlands
- Deborah A Sival
  - Department of Pediatric Neurology, University Medical Center Groningen, University of Groningen, P.O. Box 30001, 9700 RB Groningen, The Netherlands
- Natasha M Maurits
  - Department of Neurology, University Medical Center Groningen, University of Groningen, P.O. Box 30001, 9700 RB Groningen, The Netherlands
5. Kwon H, Du Z, Li Y. AlphaFold 2-based stacking model for protein solubility prediction and its transferability on seed storage proteins. Int J Biol Macromol 2024; 278:134601. [PMID: 39137857 DOI: 10.1016/j.ijbiomac.2024.134601] [Received: 05/07/2024] [Revised: 07/29/2024] [Accepted: 08/07/2024] [Indexed: 08/15/2024]
Abstract
Accurate protein solubility prediction is crucial in screening suitable candidates for food applications. Existing models often rely only on sequences, overlooking important structural details. In this study, a regression model for protein solubility was developed using both the sequences and predicted structures of 2983 E. coli proteins. The sequence- and structure-level properties of the proteins were bioinformatically extracted and fed into a multilayer perceptron (MLP). Moreover, residue-level features and contact maps were utilized to construct a graph convolutional network (GCN). The out-of-fold predictions of the two models were combined and fed into multiple meta-regressors to create a stacking model. The stacking model with a support vector regressor (SVR) achieved R² of 0.502 and 0.468 on the test and external validation datasets, respectively, displaying higher performance than existing regression models. Based on its improved performance compared to its base models, the stacking model effectively captured the strengths of its base models as well as the significance of the different features used. Furthermore, the model's transferability was indirectly validated on a dataset of seed storage proteins using the Osborne definition as well as in a case study using molecular dynamics simulation, showing potential for application beyond microbial proteins to food and agriculture-related ones.
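The out-of-fold step and the R² metric mentioned above can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' pipeline: the interleaved fold assignment, the `fit` callback interface, and the trivial `fit_mean` base model are all hypothetical choices made for the example.

```python
def out_of_fold_predictions(xs, ys, fit, n_folds):
    """Predict every sample with a model that never saw it during training.

    fit(train_xs, train_ys) must return a callable predict(x).
    Folds are assigned by index modulo n_folds (an assumption of this sketch).
    """
    n = len(xs)
    oof = [None] * n
    for fold in range(n_folds):
        held_out = [i for i in range(n) if i % n_folds == fold]
        train = [i for i in range(n) if i % n_folds != fold]
        predict = fit([xs[i] for i in train], [ys[i] for i in train])
        for i in held_out:
            oof[i] = predict(xs[i])
    return oof


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def fit_mean(train_xs, train_ys):
    """Hypothetical base model that always predicts the training mean."""
    mean = sum(train_ys) / len(train_ys)
    return lambda x: mean
```

In a stacking setup, the out-of-fold predictions of each base model become the input columns of the meta-regressor, so the meta-regressor is trained only on predictions made for unseen samples.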
Affiliation(s)
- Hyukjin Kwon
  - Department of Grain Science and Industry, Kansas State University, Manhattan, KS 66506, USA
- Zhenjiao Du
  - Department of Grain Science and Industry, Kansas State University, Manhattan, KS 66506, USA
- Yonghui Li
  - Department of Grain Science and Industry, Kansas State University, Manhattan, KS 66506, USA
6. Li L, Xiang Y, Hao J. Biomedical event causal relation extraction with deep knowledge fusion and Roberta-based data augmentation. Methods 2024; 231:8-14. [PMID: 39241919 DOI: 10.1016/j.ymeth.2024.08.007] [Received: 05/30/2024] [Accepted: 08/27/2024] [Indexed: 09/09/2024]
Abstract
Biomedical event causal relation extraction (BECRE), a subtask of biomedical information extraction, aims to extract event causal relation facts from unstructured biomedical texts and plays an essential role in many downstream tasks. Existing works have two main problems: i) shallow features alone are of limited help in establishing potential relationships between biomedical events; ii) using the traditional oversampling method to solve the data imbalance problem of BECRE tasks ignores the need for data diversity. This paper proposes a novel biomedical event causal relation extraction method that addresses these problems using deep knowledge fusion and Roberta-based data augmentation. To address the first problem, we fuse deep knowledge, including structural event representation and entity relation paths, to establish potential semantic connections between biomedical events. We use a Graph Convolutional Neural network (GCN) and the predicated tensor model to acquire the structural event representation, and entity relation paths are encoded based on external knowledge bases (GTD, CDR, CHR, GDA and UMLS). We introduce a triplet attention mechanism to fuse the structural event representation and entity relation path information. To address the second problem, this paper proposes a Roberta-based data augmentation method: some words of the biomedical text, except biomedical events, are masked proportionally and randomly, and a pre-trained Roberta model then generates data instances for the imbalanced BECRE dataset. Extensive experimental results on Hahn-Powell's and BioCause datasets confirm that the proposed method achieves state-of-the-art performance compared to current advances.
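The masking step of the augmentation can be sketched as follows. This covers only the selection and masking of non-event tokens; filling the masks with a pre-trained language model is left out. The function name, the whitespace tokenization in the usage note, and the `mask_ratio` and `seed` parameters are assumptions of this sketch rather than details given in the abstract.

```python
import random


def mask_non_event_tokens(tokens, event_indices, mask_ratio,
                          seed=0, mask_token="<mask>"):
    """Randomly mask a proportion of tokens while protecting event tokens.

    tokens:        list of token strings
    event_indices: set of positions that belong to biomedical events
    mask_ratio:    fraction of the non-event tokens to mask (0.0 to 1.0)
    """
    rng = random.Random(seed)
    # Only tokens outside the protected event spans are eligible.
    candidates = [i for i in range(len(tokens)) if i not in event_indices]
    n_mask = int(len(candidates) * mask_ratio)
    chosen = set(rng.sample(candidates, n_mask))
    return [mask_token if i in chosen else tok
            for i, tok in enumerate(tokens)]
```

For a sentence split on whitespace with the event trigger positions passed as `event_indices`, the result keeps every event token intact and replaces the selected remaining tokens with the mask token, ready to be filled by a masked language model.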
Affiliation(s)
- Lishuang Li
  - School of Computer Science and Technology, Dalian University of Technology, Dalian, 116024, China
- Yi Xiang
  - School of Computer Science and Technology, Dalian University of Technology, Dalian, 116024, China
- Jing Hao
  - School of Computer Science and Technology, Dalian University of Technology, Dalian, 116024, China
7. Risheh A, Rebel A, Nerenberg PS, Forouzesh N. Calculation of protein-ligand binding entropies using a rule-based molecular fingerprint. Biophys J 2024; 123:2839-2848. [PMID: 38481102 PMCID: PMC11393669 DOI: 10.1016/j.bpj.2024.03.017] [Received: 09/12/2023] [Revised: 12/21/2023] [Accepted: 03/08/2024] [Indexed: 03/28/2024]
Abstract
The use of fast in silico prediction methods for protein-ligand binding free energies holds significant promise for the initial phases of drug development. Numerous traditional physics-based models (e.g., implicit solvent models), however, tend to either neglect or heavily approximate entropic contributions to binding due to their computational complexity. Consequently, such methods often yield imprecise assessments of binding strength. Machine learning models provide accurate predictions and can often outperform physics-based models. They, however, are often prone to overfitting, and the interpretation of their results can be difficult. Physics-guided machine learning models combine the consistency of physics-based models with the accuracy of modern data-driven algorithms. This work integrates physics-based model conformational entropies into a graph convolutional network. We introduce a new neural network architecture (a rule-based graph convolutional network) that generates molecular fingerprints according to predefined rules specifically optimized for binding free energy calculations. Our results on 100 small host-guest systems demonstrate significant improvements in convergence and preventing overfitting. We additionally demonstrate the transferability of our proposed hybrid model by training it on the aforementioned host-guest systems and then testing it on six unrelated protein-ligand systems. Our new model shows little difference in training set accuracy compared to a previous model but an order-of-magnitude improvement in test set accuracy. Finally, we show how the results of our hybrid model can be interpreted in a straightforward fashion.
Affiliation(s)
- Ali Risheh
  - Department of Computer Science, California State University, Los Angeles, California
- Alles Rebel
  - Department of Computer Science, California State University, Los Angeles, California
- Paul S Nerenberg
  - Kravis Department of Integrated Sciences, Claremont McKenna College, Claremont, California
- Negin Forouzesh
  - Department of Computer Science, California State University, Los Angeles, California
8. Maryam, Rehman MU, Hussain I, Tayara H, Chong KT. A graph neural network approach for predicting drug susceptibility in the human microbiome. Comput Biol Med 2024; 179:108729. [PMID: 38955124 DOI: 10.1016/j.compbiomed.2024.108729] [Received: 04/22/2024] [Revised: 06/04/2024] [Accepted: 06/08/2024] [Indexed: 07/04/2024]
Abstract
Recent studies have illuminated the critical role of the human microbiome in maintaining health and influencing the pharmacological responses of drugs. Clinical trials, encompassing approximately 150 drugs, have unveiled interactions with the gastrointestinal microbiome, resulting in the conversion of these drugs into inactive metabolites. It is imperative to explore the field of pharmacomicrobiomics during the early stages of drug discovery, prior to clinical trials. To achieve this, the utilization of machine learning and deep learning models is highly desirable. In this study, we have proposed graph-based neural network models, namely GCN, GAT, and GINCOV models, utilizing the SMILES dataset of drug microbiome. Our primary objective was to classify the susceptibility of drugs to depletion by gut microbiota. Our results indicate that the GINCOV surpassed the other models, achieving impressive performance metrics, with an accuracy of 93% on the test dataset. This proposed Graph Neural Network (GNN) model offers a rapid and efficient method for screening drugs susceptible to gut microbiota depletion and also encourages the improvement of patient-specific dosage responses and formulations.
Affiliation(s)
- Maryam
  - Department of Electronics and Information Engineering, Jeonbuk National University, Jeonju, 54896, South Korea
- Mobeen Ur Rehman
  - Khalifa University Center for Autonomous Robotic Systems (KUCARS), Khalifa University, United Arab Emirates
- Irfan Hussain
  - Khalifa University Center for Autonomous Robotic Systems (KUCARS), Khalifa University, United Arab Emirates
- Hilal Tayara
  - School of International Engineering and Science, Jeonbuk National University, Jeonju, 54896, South Korea
- Kil To Chong
  - Department of Electronics and Information Engineering, Jeonbuk National University, Jeonju, 54896, South Korea
  - Advances Electronics and Information Research Centre, Jeonbuk National University, Jeonju, 54896, South Korea
9. Ye R, Chen X. Electromigration Analysis for Interconnects Using Improved Graph Convolutional Network with Edge Feature Aggregation. Micromachines 2024; 15:1046. [PMID: 39203697 PMCID: PMC11356254 DOI: 10.3390/mi15081046] [Received: 07/23/2024] [Revised: 08/10/2024] [Accepted: 08/16/2024] [Indexed: 09/03/2024]
Abstract
Electromigration (EM) is a critical reliability issue in integrated circuits and is becoming increasingly significant as fabrication technology nodes continue to advance. The analysis of hydrostatic stress, which is paramount in electromigration studies, typically involves solving complex physical equations (partial differential equations, or PDEs, in this case), which is time-consuming, inefficient, and impractical for full-chip EM analysis. In this paper, a novel approach is proposed that treats circuit interconnect trees as graphs within a graph neural network framework. Using finite element solution software, ground-truth hydrostatic stress values were obtained to construct a dataset of interconnect trees with a hydrostatic stress value for each node. An improved Graph Convolutional Network (GCN) augmented with edge feature aggregation and an attention mechanism was then trained on the dataset, yielding a model capable of predicting hydrostatic stress values for nodes in an interconnect tree. The results show that our model achieved a 15% improvement in Root Mean Square Error (RMSE) compared to the original GCN model and greatly improved the solution speed compared to traditional finite element software.
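For reference, the RMSE metric and one way to express an improvement over a baseline can be written as below. Reading the reported 15% as a relative reduction in RMSE is an assumption of this sketch; the abstract does not state how the percentage was computed, and the helper names are hypothetical.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between reference and predicted values."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def relative_improvement(baseline_error, improved_error):
    """Fraction by which the error dropped relative to the baseline."""
    return (baseline_error - improved_error) / baseline_error
```

Under this reading, a baseline RMSE of 2.0 and an improved RMSE of 1.7 (illustrative numbers, not results from the paper) correspond to a relative improvement of 0.15.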
Affiliation(s)
- Ruqing Ye
  - School of Integrated Circuits, Dalian University of Technology, Dalian 116024, China
- Xiaoming Chen
  - School of Integrated Circuits, Dalian University of Technology, Dalian 116024, China
  - School of Optoelectronic Engineering and Instrumentation Science, Dalian University of Technology, Dalian 116024, China
10. Tian L, Rao W, Zhao K, Vo HT. Analyzing world city network by graph convolutional networks. Sci Rep 2024; 14:18933. [PMID: 39147920 PMCID: PMC11327301 DOI: 10.1038/s41598-024-69494-1] [Received: 10/16/2023] [Accepted: 08/05/2024] [Indexed: 08/17/2024]
Abstract
British scholar Peter Taylor constructed the World City Network by analyzing the office networks of multinational companies, enabling a network perspective on world cities. However, this method has long been hindered by data deficiencies and update delays. In this study, we utilized publicly available, real-time updated data on global routes to construct the World City Network, thereby addressing the issues of data insufficiency and delayed updates in the existing model. For the first time, advanced Graph Convolutional Networks were employed to analyze the World City Network, and we introduced GCNRank. Finally, we compared GCNRank with other centrality measures and found that GCNRank provides a more detailed representation of city rankings and effectively avoids local optima.
Affiliation(s)
- Linfang Tian
  - School of Software Engineering, Tongji University, Shanghai, 201804, China
- Weixiong Rao
  - School of Software Engineering, Tongji University, Shanghai, 201804, China
- Kai Zhao
  - J. Mack Robinson College of Business, Georgia State University, Atlanta, 30301, USA
- Huy T Vo
  - The City College of New York, City University of New York and New York University, New York, 10031, USA
11. Wang J. Evaluation and analysis of visual perception using attention-enhanced computation in multimedia affective computing. Front Neurosci 2024; 18:1449527. [PMID: 39170679 PMCID: PMC11335721 DOI: 10.3389/fnins.2024.1449527] [Received: 06/15/2024] [Accepted: 07/11/2024] [Indexed: 08/23/2024]
Abstract
Facial expression recognition (FER) plays a crucial role in affective computing, enhancing human-computer interaction by enabling machines to understand and respond to human emotions. Despite advancements in deep learning, current FER systems often struggle with challenges such as occlusions, head pose variations, and motion blur in natural environments. These challenges highlight the need for more robust FER solutions. To address these issues, we propose the Attention-Enhanced Multi-Layer Transformer (AEMT) model, which integrates a dual-branch Convolutional Neural Network (CNN), an Attentional Selective Fusion (ASF) module, and a Multi-Layer Transformer Encoder (MTE) with transfer learning. The dual-branch CNN captures detailed texture and color information by processing RGB and Local Binary Pattern (LBP) features separately. The ASF module selectively enhances relevant features by applying global and local attention mechanisms to the extracted features. The MTE captures long-range dependencies and models the complex relationships between features, collectively improving feature representation and classification accuracy. Our model was evaluated on the RAF-DB and AffectNet datasets. Experimental results demonstrate that the AEMT model achieved an accuracy of 81.45% on RAF-DB and 71.23% on AffectNet, significantly outperforming existing state-of-the-art methods. These results indicate that our model effectively addresses the challenges of FER in natural environments, providing a more robust and accurate solution. The AEMT model significantly advances the field of FER by improving the robustness and accuracy of emotion recognition in complex real-world scenarios. This work not only enhances the capabilities of affective computing systems but also opens new avenues for future research in improving model efficiency and expanding multimodal data integration.
Affiliation(s)
- Jingyi Wang
  - School of Mass-communication and Advertising, Tongmyong University, Busan, Republic of Korea
12. Ishitani R, Takemoto M, Tomii K. Protein ligand binding site prediction using graph transformer neural network. PLoS One 2024; 19:e0308425. [PMID: 39106255 DOI: 10.1371/journal.pone.0308425] [Received: 04/17/2024] [Accepted: 07/23/2024] [Indexed: 08/09/2024]
Abstract
Ligand binding site prediction is a crucial initial step in structure-based drug discovery. Although several methods have been proposed previously, including those using geometry-based and machine learning techniques, their accuracy is still considered insufficient. In this study, we introduce an approach that leverages a graph transformer neural network to rank the results of a geometry-based pocket detection method. We also created a training dataset larger than the conventionally used sc-PDB and investigated the correlation between dataset size and prediction performance. Our findings indicate that utilizing a graph transformer-based method alongside a larger training dataset could enhance the performance of ligand binding site prediction.
Affiliation(s)
- Ryuichiro Ishitani
  - Division of Computational Drug Discovery and Design, Medical Research Institute, Tokyo Medical and Dental University, Bunkyo-ku, Tokyo, Japan
  - Department of Biological Sciences, Graduate School of Science, The University of Tokyo, Bunkyo-ku, Tokyo, Japan
  - Preferred Networks, Inc., Chiyoda-ku, Tokyo, Japan
- Mizuki Takemoto
  - Division of Computational Drug Discovery and Design, Medical Research Institute, Tokyo Medical and Dental University, Bunkyo-ku, Tokyo, Japan
- Kentaro Tomii
  - Artificial Intelligence Research Center (AIRC), National Institute of Advanced Industrial Science and Technology (AIST), Koto-ku, Tokyo, Japan
13. Zhao L, Tan G, Wu Q, Pu B, Ren H, Li S, Li K. FARN: Fetal Anatomy Reasoning Network for Detection With Global Context Semantic and Local Topology Relationship. IEEE J Biomed Health Inform 2024; 28:4866-4877. [PMID: 38648141 DOI: 10.1109/jbhi.2024.3392531] [Indexed: 04/25/2024]
Abstract
Accurate recognition of fetal anatomical structures is a pivotal task in ultrasound (US) image analysis. Sonographers naturally apply anatomical knowledge and clinical expertise to recognize key anatomical structures in complex US images. However, mainstream object detection approaches usually treat the recognition of each structure separately, overlooking anatomical correlations between different structures in fetal US planes. In this work, we propose a Fetal Anatomy Reasoning Network (FARN) that incorporates two kinds of relationship forms: a global context semantic block summarized with visual similarity and a local topology relationship block depicting structural pair constraints. Specifically, by designing the Adaptive Relation Graph Reasoning (ARGR) module, anatomical structures are treated as nodes, with two kinds of relationships between nodes modeled as edges. The flexibility of the model is enhanced by constructing the adaptive relationship graph in a data-driven way, enabling adaptation to various data samples without the need for predefined additional constraints. The feature representation is further refined by aggregating the outputs of the ARGR module. Comprehensive experimental results demonstrate that FARN achieves promising performance in detecting 37 anatomical structures across key US planes in tertiary obstetric screening. FARN effectively utilizes key relationships to improve detection performance, demonstrates robustness to small-scale, similar, and indistinct structures, and avoids some detection errors that deviate from anatomical norms. Overall, our study serves as a resource for developing efficient and concise approaches to model inter-anatomy relationships.
14. Ling Q, Liu A, Li Y, McKeown MJ, Chen X. fMRI-based spatio-temporal parcellations of the human brain. Curr Opin Neurol 2024; 37:369-380. [PMID: 38804205 DOI: 10.1097/wco.0000000000001280] [Indexed: 05/29/2024]
Abstract
PURPOSE OF REVIEW: Human brain parcellation based on functional magnetic resonance imaging (fMRI) plays an essential role in neuroscience research. By segmenting vast and intricate fMRI data into functionally similar units, researchers can better decipher the brain's structure in both healthy and diseased states. This article reviews current methodologies and ideas in this field, while also outlining the obstacles and directions for future research.
RECENT FINDINGS: Traditional brain parcellation techniques, which often rely on cytoarchitectonic criteria, overlook the functional and temporal information accessible through fMRI. The adoption of machine learning techniques, notably deep learning, offers the potential to harness both spatial and temporal information for more nuanced brain segmentation. However, the search for a one-size-fits-all solution to brain segmentation is impractical, with the choice between group-level or individual-level models and the intended downstream analysis influencing the optimal parcellation strategy. Additionally, evaluating these models is complicated by our incomplete understanding of brain function and the absence of a definitive "ground truth".
SUMMARY: While recent methodological advancements have significantly enhanced our grasp of the brain's spatial and temporal dynamics, challenges persist in advancing fMRI-based spatio-temporal representations. Future efforts will likely focus on refining model evaluation and selection as well as developing methods that offer clear interpretability for clinical usage, thereby facilitating further breakthroughs in our comprehension of the brain.
Affiliation(s)
- Qinrui Ling
- Department of Electronic Engineering and Information Science, University of Science and Technology of China, Hefei, 230027, China
| | - Aiping Liu
- Department of Electronic Engineering and Information Science, University of Science and Technology of China, Hefei, 230027, China
| | - Yu Li
- Institute of Dataspace, Hefei Comprehensive National Science Center, Hefei 230088, China
| | - Martin J McKeown
- Department of Medicine, University of British Columbia, Vancouver, Vancouver V6T2B5, Canada
| | - Xun Chen
- Department of Electronic Engineering and Information Science, University of Science and Technology of China, Hefei, 230027, China
| |
|
15
|
Chen L, Li Q, Nasif KFA, Xie Y, Deng B, Niu S, Pouriyeh S, Dai Z, Chen J, Xie CY. AI-Driven Deep Learning Techniques in Protein Structure Prediction. Int J Mol Sci 2024; 25:8426. [PMID: 39125995 PMCID: PMC11313475 DOI: 10.3390/ijms25158426] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/15/2024] [Revised: 07/29/2024] [Accepted: 07/29/2024] [Indexed: 08/12/2024] Open
Abstract
Protein structure prediction is important for understanding protein function and behavior. This review presents a comprehensive survey of the computational models used in predicting protein structure. It covers the progression from established protein modeling to state-of-the-art artificial intelligence (AI) frameworks. The paper starts with a brief introduction to protein structures, protein modeling, and AI. The section on established protein modeling discusses homology modeling, ab initio modeling, and threading. The next section covers deep learning-based models. It introduces some state-of-the-art AI models, such as AlphaFold (AlphaFold, AlphaFold2, AlphaFold3), RoseTTAFold, ProteinBERT, etc. This section also discusses how AI techniques have been integrated into established frameworks like Swiss-Model, Rosetta, and I-TASSER. The model performance is compared using the rankings of CASP14 (Critical Assessment of Structure Prediction) and CASP15. CASP16 is ongoing, and its results are not included in this review. Continuous Automated Model EvaluatiOn (CAMEO) complements the biennial CASP experiment. Template modeling score (TM-score), global distance test total score (GDT_TS), and Local Distance Difference Test (lDDT) score are also discussed. The paper then acknowledges the ongoing difficulties in predicting protein structure and emphasizes the need for additional research on topics such as dynamic protein behavior, conformational changes, and protein-protein interactions. In the application section, the paper introduces applications in various fields such as drug design, industry, education, and novel protein development. In summary, this paper provides a comprehensive overview of the latest advancements in established protein modeling and deep learning-based models for protein structure prediction. It emphasizes the significant advancements achieved by AI and identifies potential areas for further investigation.
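As a rough illustration of two of the scores named in this abstract, the sketch below computes TM-score and GDT_TS from per-residue distances (in Å) between predicted and reference Cα atoms. It is not taken from the paper. It assumes the two structures are already superposed and the residues already paired, whereas the standard tools search over superpositions and report the maximum. The function names and the 0.5 Å floor on `d0` are assumptions made for this example.

```python
def tm_score(distances, target_length):
    """TM-score for one fixed superposition (no search over alignments)."""
    if target_length <= 0:
        raise ValueError("target_length must be positive")
    # Length-dependent scale d0 = 1.24 * (L - 15)^(1/3) - 1.8.
    # Assumption: use a 0.5 A floor so short chains keep a valid scale.
    if target_length > 15:
        d0 = max(1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8, 0.5)
    else:
        d0 = 0.5
    total = sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances)
    return total / target_length


def gdt_ts(distances, target_length):
    """GDT_TS: mean percentage of residues within 1, 2, 4 and 8 A."""
    if target_length <= 0:
        raise ValueError("target_length must be positive")
    cutoffs = (1.0, 2.0, 4.0, 8.0)
    fractions = [
        sum(1 for d in distances if d <= cutoff) / target_length
        for cutoff in cutoffs
    ]
    return 100.0 * sum(fractions) / len(cutoffs)
```

Both scores are normalised by the length of the reference chain, so residues missing from the prediction lower the score instead of being ignored.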
Affiliation(s)
- Lingtao Chen
- College of Computing and Software Engineering, Kennesaw State University, Marietta, GA 30060, USA; (L.C.); (Q.L.); (K.F.A.N.); (Y.X.); (B.D.); (S.P.)
| | - Qiaomu Li
- College of Computing and Software Engineering, Kennesaw State University, Marietta, GA 30060, USA; (L.C.); (Q.L.); (K.F.A.N.); (Y.X.); (B.D.); (S.P.)
| | - Kazi Fahim Ahmad Nasif
- College of Computing and Software Engineering, Kennesaw State University, Marietta, GA 30060, USA; (L.C.); (Q.L.); (K.F.A.N.); (Y.X.); (B.D.); (S.P.)
| | - Ying Xie
- College of Computing and Software Engineering, Kennesaw State University, Marietta, GA 30060, USA; (L.C.); (Q.L.); (K.F.A.N.); (Y.X.); (B.D.); (S.P.)
| | - Bobin Deng
- College of Computing and Software Engineering, Kennesaw State University, Marietta, GA 30060, USA; (L.C.); (Q.L.); (K.F.A.N.); (Y.X.); (B.D.); (S.P.)
| | - Shuteng Niu
- Department of Computer Science, Bowling Green State University, Bowling Green, OH 43403, USA;
| | - Seyedamin Pouriyeh
- College of Computing and Software Engineering, Kennesaw State University, Marietta, GA 30060, USA; (L.C.); (Q.L.); (K.F.A.N.); (Y.X.); (B.D.); (S.P.)
| | - Zhiyu Dai
- Division of Pulmonary and Critical Care Medicine, John T. Milliken Department of Medicine, Washington University School of Medicine in St. Louis, St. Louis, MO 63110, USA;
| | - Jiawei Chen
- College of Computing, Data Science and Society, University of California, Berkeley, CA 94720, USA;
| | - Chloe Yixin Xie
- College of Computing and Software Engineering, Kennesaw State University, Marietta, GA 30060, USA; (L.C.); (Q.L.); (K.F.A.N.); (Y.X.); (B.D.); (S.P.)
| |
|
16
|
Du W, Wang H, Zhao C, Cui Z, Li J, Zhang W, Yu Y, Peng X. Postoperative facial prediction for mandibular defect based on surface mesh deformation. JOURNAL OF STOMATOLOGY, ORAL AND MAXILLOFACIAL SURGERY 2024:101973. [PMID: 39089509 DOI: 10.1016/j.jormas.2024.101973] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 06/06/2024] [Revised: 07/12/2024] [Accepted: 07/17/2024] [Indexed: 08/04/2024]
Abstract
OBJECTIVES This study aims to introduce a novel predictive model for the post-operative facial contours of patients with mandibular defect, addressing limitations in current methodologies that fail to preserve geometric features and lack interpretability. METHODS Utilizing surface mesh theory and deep learning, our model diverges from traditional point cloud approaches by employing surface triangular mesh grids. We extract latent variables using a Mesh Convolutional Restricted Boltzmann Machines (MCRBM) model to generate a three-dimensional deformation field, aiming to enhance geometric information preservation and interpretability. RESULTS Experimental evaluations of our model demonstrate a prediction accuracy of 91.2 %, which represents a significant improvement over traditional machine learning-based methods. CONCLUSIONS The proposed model offers a promising new tool for pre-operative planning in oral and maxillofacial surgery. It significantly enhances the accuracy of post-operative facial contour predictions for mandibular defect reconstructions, providing substantial advancements over previous approaches.
Affiliation(s)
- Wen Du
- Department of Oral and Maxillofacial Surgery, Peking University School and Hospital of Stomatology, National Center for Stomatology, National Clinical Research Center for Oral Diseases, Beijing Key Laboratory of Digital Stomatology, NHC Key Laboratory of Digital Stomatology, China
| | - Hao Wang
- Department of Oral and Maxillofacial Surgery, Peking University School and Hospital of Stomatology, National Center for Stomatology, National Clinical Research Center for Oral Diseases, Beijing Key Laboratory of Digital Stomatology, NHC Key Laboratory of Digital Stomatology, China
| | - Chenche Zhao
- College of Engineering, Peking University, China
| | - Zhiming Cui
- School of Biomedical Engineering, ShanghaiTech University, Shanghai 201210, China
| | - Jiaqi Li
- Department of Oral and Maxillofacial Surgery, Peking University School and Hospital of Stomatology, National Center for Stomatology, National Clinical Research Center for Oral Diseases, Beijing Key Laboratory of Digital Stomatology, NHC Key Laboratory of Digital Stomatology, China
| | - Wenbo Zhang
- Department of Oral and Maxillofacial Surgery, Peking University School and Hospital of Stomatology, National Center for Stomatology, National Clinical Research Center for Oral Diseases, Beijing Key Laboratory of Digital Stomatology, NHC Key Laboratory of Digital Stomatology, China
| | - Yao Yu
- Department of Oral and Maxillofacial Surgery, Peking University School and Hospital of Stomatology, National Center for Stomatology, National Clinical Research Center for Oral Diseases, Beijing Key Laboratory of Digital Stomatology, NHC Key Laboratory of Digital Stomatology, China
| | - Xin Peng
- Department of Oral and Maxillofacial Surgery, Peking University School and Hospital of Stomatology, National Center for Stomatology, National Clinical Research Center for Oral Diseases, Beijing Key Laboratory of Digital Stomatology, NHC Key Laboratory of Digital Stomatology, China.
| |
|
17
|
Labani M, Beheshti A, O’Brien TA. GENet: A Graph-Based Model Leveraging Histone Marks and Transcription Factors for Enhanced Gene Expression Prediction. Genes (Basel) 2024; 15:938. [PMID: 39062717 PMCID: PMC11275947 DOI: 10.3390/genes15070938] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/18/2024] [Revised: 07/16/2024] [Accepted: 07/17/2024] [Indexed: 07/28/2024] Open
Abstract
Understanding the regulatory mechanisms of gene expression is a crucial objective in genomics. Although the DNA sequence near the transcription start site (TSS) offers valuable insights, recent methods suggest that analyzing only the surrounding DNA may not suffice to accurately predict gene expression levels. We developed GENet (Gene Expression Network from Histone and Transcription Factor Integration), a novel approach that integrates essential regulatory signals from transcription factors and histone modifications into a graph-based model. GENet extends beyond simple DNA sequence analysis by incorporating additional layers of genetic control, which are vital for determining gene expression. Our method markedly enhances the prediction of mRNA levels compared to previous models that depend solely on DNA sequence data. The results underscore the significance of including comprehensive regulatory information in gene expression studies. GENet emerges as a promising tool for researchers, with potential applications extending from fundamental biological research to the development of medical therapies.
Affiliation(s)
- Mahdieh Labani
- School of Computing, Macquarie University, Sydney 2109, Australia; (M.L.); (T.A.O.)
| | - Amin Beheshti
- School of Computing, Macquarie University, Sydney 2109, Australia; (M.L.); (T.A.O.)
| | - Tracey A. O’Brien
- School of Computing, Macquarie University, Sydney 2109, Australia; (M.L.); (T.A.O.)
- Cancer Institute NSW, Sydney 2065, Australia
- School of Clinical Medicine, Medicine & Health, University of New South Wales (UNSW), Sydney 2052, Australia
| |
|
18
|
Wen J, Tang X, Lu J. An imbalanced learning method based on graph tran-smote for fraud detection. Sci Rep 2024; 14:16560. [PMID: 39019984 PMCID: PMC11255288 DOI: 10.1038/s41598-024-67550-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/13/2024] [Accepted: 07/12/2024] [Indexed: 07/19/2024] Open
Abstract
Fraud seriously threatens individual interests and social stability, so fraud detection has attracted much attention in recent years. In scenarios such as social media, fraudsters typically hide among numerous benign users, constituting only a small minority and often forming "small gangs". Due to the scarcity of fraudsters, a conventional graph neural network might overlook or obscure critical fraud information, leading to insufficient representation of fraud characteristics. To address these issues, this study proposes the tran-smote on graphs (GTS) method for fraud detection. Structural features of each type of node are deeply mined using a subgraph neural network extractor, and these features are integrated with attribute features using transformer technology. This enriches each node's information representation, thereby addressing the issue of inadequate feature representation. Additionally, this approach involves setting a feature embedding space to generate new nodes representing minority classes, and an edge generator is used to provide relevant connection information for these new nodes, alleviating the class imbalance problem. The results from experiments on two real datasets demonstrate that the proposed GTS performs better than the current state-of-the-art baselines.
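The step of generating new minority-class nodes in an embedding space can be pictured with plain SMOTE-style interpolation. The sketch below is a generic illustration, not the GTS implementation. The function name `smote_embeddings`, its parameters, and the use of squared Euclidean distance are assumptions made for this example, and the edge generator described in the abstract is not covered.

```python
import random


def smote_embeddings(minority, n_new, k=2, seed=0):
    """Create n_new synthetic embeddings by interpolating between a
    minority-class embedding and one of its k nearest minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority embeddings")
    rng = random.Random(seed)

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest minority neighbours, excluding the base.
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: sq_dist(base, minority[j]),
        )[:k]
        other = minority[rng.choice(neighbours)]
        lam = rng.random()  # interpolation weight in [0, 1)
        synthetic.append([b + lam * (o - b) for b, o in zip(base, other)])
    return synthetic
```

Each synthetic embedding lies on the segment between two real minority embeddings, so the new nodes stay inside the region the minority class already occupies.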
Affiliation(s)
- Jintao Wen
- College of Computer Science and Technology, Guizhou University, Guiyang, 550025, China
| | - Xianghong Tang
- College of Computer Science and Technology, Guizhou University, Guiyang, 550025, China.
- State Key Laboratory of Public Big Data, Guizhou University, Guiyang, 550025, China.
| | - Jianguang Lu
- College of Computer Science and Technology, Guizhou University, Guiyang, 550025, China
- State Key Laboratory of Public Big Data, Guizhou University, Guiyang, 550025, China
| |
|
19
|
Bera A, Bhattacharjee D, Krejcar O. PND-Net: plant nutrition deficiency and disease classification using graph convolutional network. Sci Rep 2024; 14:15537. [PMID: 38969738 PMCID: PMC11226607 DOI: 10.1038/s41598-024-66543-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/25/2024] [Accepted: 07/02/2024] [Indexed: 07/07/2024] Open
Abstract
Crop yield could be enhanced for agricultural growth if various plant nutrition deficiencies and diseases are identified and detected at early stages. Hence, continuous health monitoring of plants is crucial for handling plant stress. Deep learning methods have shown superior performance in the automated detection of plant diseases and nutrition deficiencies from visual symptoms in leaves. This article proposes a new deep learning method for plant nutrition deficiency and disease classification using a graph convolutional network (GCN), added upon a base convolutional neural network (CNN). Sometimes, a global feature descriptor might fail to capture the vital region of a diseased leaf, which causes inaccurate classification of disease. To address this issue, regional feature learning is crucial for a holistic feature aggregation. In this work, region-based feature summarization at multi-scales is explored using spatial pyramidal pooling for discriminative feature representation. Furthermore, a GCN is developed to enable learning of finer details for classifying plant diseases and insufficiency of nutrients. The proposed method, called Plant Nutrition Deficiency and Disease Network (PND-Net), has been evaluated on two public datasets for nutrition deficiency, and two for disease classification using four backbone CNNs. The best classification performances of the proposed PND-Net are as follows: (a) 90.00% Banana and 90.54% Coffee nutrition deficiency; and (b) 96.18% Potato diseases and 84.30% on PlantDoc datasets using Xception backbone. Furthermore, additional experiments have been carried out for generalization, and the proposed method has achieved state-of-the-art performances on two public datasets, namely the Breast Cancer Histopathology Image Classification (BreakHis 40×: 95.50%, and BreakHis 100×: 96.79% accuracy) and Single cells in Pap smear images for cervical cancer classification (SIPaKMeD: 99.18% accuracy). Also, the proposed method has been evaluated using five-fold cross validation and achieved improved performances on these datasets. Clearly, the proposed PND-Net effectively boosts the performances of automated health analysis of various plants in real and intricate field environments, implying PND-Net's aptness for agricultural growth as well as human cancer classification.
Affiliation(s)
- Asish Bera
- Department of Computer Science and Information Systems, BITS Pilani, Pilani Campus, Pilani, Rajasthan, 333031, India.
| | - Debotosh Bhattacharjee
- Department of Computer Science and Engineering, Jadavpur University, Kolkata, West Bengal, 700032, India
- Faculty of Informatics and Management, University of Hradec Kralove, Hradec Kralove, Czech Republic
| | - Ondrej Krejcar
- Faculty of Informatics and Management, University of Hradec Kralove, Hradec Kralove, Czech Republic
- Skoda Auto University, Na Karmeli 1457, 293 01, Mlada Boleslav, Czech Republic
- Malaysia Japan International Institute of Technology (MJIIT), Universiti Teknologi Malaysia, Kuala Lumpur, Malaysia
| |
|
20
|
Calazans MAA, Ferreira FABS, Santos FAN, Madeiro F, Lima JB. Machine Learning and Graph Signal Processing Applied to Healthcare: A Review. Bioengineering (Basel) 2024; 11:671. [PMID: 39061753 PMCID: PMC11273494 DOI: 10.3390/bioengineering11070671] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/27/2024] [Revised: 06/20/2024] [Accepted: 06/26/2024] [Indexed: 07/28/2024] Open
Abstract
Signal processing is a very useful field of study in the interpretation of signals in many everyday applications. In the case of applications with time-varying signals, one possibility is to consider them as graphs, so graph theory arises, which extends classical methods to the non-Euclidean domain. In addition, machine learning techniques have been widely used in pattern recognition activities in a wide variety of tasks, including health sciences. The objective of this work is to identify and analyze the papers in the literature that address the use of machine learning applied to graph signal processing in health sciences. A search was performed in four databases (Science Direct, IEEE Xplore, ACM, and MDPI), using search strings to identify papers that are in the scope of this review. Finally, 45 papers were included in the analysis, the first being published in 2015, which indicates an emerging area. Among the gaps found, we can mention the need for better clinical interpretability of the results obtained in the papers, that is not to restrict the results or conclusions simply to performance metrics. In addition, a possible research direction is the use of new transforms. It is also important to make new public datasets available that can be used to train the models.
Affiliation(s)
| | - Felipe A. B. S. Ferreira
- Unidade Acadêmica do Cabo de Santo Agostinho, Universidade Federal Rural de Pernambuco, Cabo de Santo Agostinho 54518-430, Brazil;
| | - Fernando A. N. Santos
- Institute for Advanced Studies, Universiteit van Amsterdam, 1012 WP Amsterdam, The Netherlands;
| | - Francisco Madeiro
- Escola Politécnica de Pernambuco, Universidade de Pernambuco, Recife 50720-001, Brazil;
| | - Juliano B. Lima
- Centro de Tecnologia e Geociências, Universidade Federal de Pernambuco, Recife 50670-901, Brazil;
| |
|
21
|
He H, Xie J, Huang D, Zhang M, Zhao X, Ying Y, Wang J. DRTerHGAT: A drug repurposing method based on the ternary heterogeneous graph attention network. J Mol Graph Model 2024; 130:108783. [PMID: 38677034 DOI: 10.1016/j.jmgm.2024.108783] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/09/2024] [Revised: 04/21/2024] [Accepted: 04/23/2024] [Indexed: 04/29/2024]
Abstract
Drug repurposing is an effective method to reduce the time and cost of drug development. Computational drug repurposing can quickly screen out the most likely associations from large biological databases to achieve effective drug repurposing. However, building a comprehensive model that integrates drugs, proteins, and diseases for drug repurposing remains challenging. This study proposes a drug repurposing method based on the ternary heterogeneous graph attention network (DRTerHGAT). DRTerHGAT designs a novel protein feature extraction process consisting of a large-scale protein language model and a multi-task autoencoder, so that protein features can be extracted accurately and efficiently from amino acid sequences. The ternary heterogeneous graph of drug-protein-disease comprehensively considers the relationships among the three types of nodes, including three homogeneous and three heterogeneous relationships. Based on the graph and the extracted protein features, the deep features of the drugs and the diseases are extracted by graph convolutional networks (GCN) and heterogeneous graph node attention networks (HGNA). In the experiments, DRTerHGAT is proven superior to existing advanced methods and DRTerHGAT variants. DRTerHGAT's powerful ability for drug repurposing is also demonstrated in Alzheimer's disease.
Affiliation(s)
- Hongjian He
- The School of Computer Engineering and Science, Shanghai University, Shanghai, China
| | - Jiang Xie
- The School of Computer Engineering and Science, Shanghai University, Shanghai, China.
| | - Dingkai Huang
- The School of Computer Engineering and Science, Shanghai University, Shanghai, China
| | - Mengfei Zhang
- The School of Computer Engineering and Science, Shanghai University, Shanghai, China
| | - Xuyu Zhao
- School of Life Sciences, Shanghai University, Shanghai, China
| | - Yiwei Ying
- School of Life Sciences, Shanghai University, Shanghai, China
| | - Jiao Wang
- School of Life Sciences, Shanghai University, Shanghai, China.
| |
|
22
|
Wang Y, Han Y, Luo A, Xu S, Chen J, Liu W. Site selection and prediction of urban emergency shelter based on VGAE-RF model. Sci Rep 2024; 14:14368. [PMID: 38909046 PMCID: PMC11193824 DOI: 10.1038/s41598-024-64031-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/13/2024] [Accepted: 06/04/2024] [Indexed: 06/24/2024] Open
Abstract
As urban development accelerates and natural disasters occur more frequently, the urgency of developing effective emergency shelter planning strategies intensifies. The shelter location selection method under the traditional multi-criteria decision-making framework suffers from issues such as strong subjectivity and insufficient data support. Artificial intelligence offers a robust data-driven approach for site selection; however, many methods neglect the spatial relationships of site selection targets within geographical space. This paper introduces an emergency shelter site selection model that combines a variational graph autoencoder (VGAE) with a random forest (RF), namely VGAE-RF. In the constructed urban spatial topological graph, based on network geographic information, this model captures the latent features of geographic unit coupling and integrates explicit and latent features to forecast the likelihood of emergency shelters in the construction area. This study takes Beijing, China, as the experimental area and evaluates the reliability of different models using a confusion matrix, Receiver Operating Characteristic (ROC) curve, and Imbalance Index of spatial distribution as evaluation indicators. The experimental results indicate that the proposed VGAE-RF model, which considers spatial semantic associations, displays the best reliability.
Affiliation(s)
- Yong Wang
- School of Geomatics, Anhui University of Science and Technology, Huainan, 232001, China
- Research Center of Geospatial Big Data Application, Chinese Academy of Surveying and Mapping, Beijing, 100830, China
| | - Yaoyao Han
- School of Geomatics, Anhui University of Science and Technology, Huainan, 232001, China.
| | - An Luo
- Research Center of Geospatial Big Data Application, Chinese Academy of Surveying and Mapping, Beijing, 100830, China.
| | - Shenghua Xu
- Research Center of Geospatial Big Data Application, Chinese Academy of Surveying and Mapping, Beijing, 100830, China
| | - Jian Chen
- School of Geomatics, Anhui University of Science and Technology, Huainan, 232001, China
| | - Wangwang Liu
- School of Geomatics and Marine Information, Jiangsu Ocean University, Lianyungang, 222002, China
| |
|
23
|
Tigga NP, Garg S, Goyal N, Raj J, Das B. Brain-region specific autism prediction from electroencephalogram signals using graph convolution neural network. Technol Health Care 2024:THC240550. [PMID: 38943414 DOI: 10.3233/thc-240550] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 07/01/2024]
Abstract
BACKGROUND Brain variations are responsible for developmental impairments, including autism spectrum disorder (ASD). EEG signals efficiently detect neurological conditions by revealing crucial information about brain function abnormalities. OBJECTIVE This study aims to utilize EEG data collected from both autistic and typically developing children to investigate the potential of a Graph Convolutional Neural Network (GCNN) in predicting ASD based on neurological abnormalities revealed through EEG signals. METHODS In this study, EEG data were gathered from eight autistic children and eight typically developing children diagnosed using the Childhood Autism Rating Scale at the Central Institute of Psychiatry, Ranchi. EEG recording was done using a HydroCel GSN with 257 channels, and 71 channels with 10-10 international equivalents were utilized. Electrodes were divided into 12 brain regions. A GCNN was introduced for ASD prediction, preceded by autoregressive and spectral feature extraction. RESULTS The anterior-frontal brain region, crucial for cognitive functions like emotion, memory, and social interaction, proved most predictive of ASD, achieving 87.07% accuracy. This underscores the suitability of the GCNN method for EEG-based ASD detection. CONCLUSION The detailed dataset collected enhances understanding of the neurological basis of ASD, benefiting healthcare practitioners involved in ASD diagnosis.
Affiliation(s)
- Neha Prerna Tigga
- Department of Computer Science and Engineering, Birla Institute of Technology, Mesra, Ranchi, India
| | - Shruti Garg
- Department of Computer Science and Engineering, Birla Institute of Technology, Mesra, Ranchi, India
| | - Nishant Goyal
- Department of Psychiatry, Central Institute of Psychiatry, Kanke, Ranchi, India
| | - Justin Raj
- Department of Psychiatry, Central Institute of Psychiatry, Kanke, Ranchi, India
| | - Basudeb Das
- Department of Psychiatry, Central Institute of Psychiatry, Kanke, Ranchi, India
| |
|
24
|
Luong KD, Singh A. Application of Transformers in Cheminformatics. J Chem Inf Model 2024; 64:4392-4409. [PMID: 38815246 PMCID: PMC11167597 DOI: 10.1021/acs.jcim.3c02070] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/28/2023] [Revised: 04/05/2024] [Accepted: 05/06/2024] [Indexed: 06/01/2024]
Abstract
By accelerating time-consuming processes with high efficiency, computing has become an essential part of many modern chemical pipelines. Machine learning is a class of computing methods that can discover patterns within chemical data and utilize this knowledge for a wide variety of downstream tasks, such as property prediction or substance generation. The complex and diverse chemical space requires complex machine learning architectures with great learning power. Recently, learning models based on transformer architectures have revolutionized multiple domains of machine learning, including natural language processing and computer vision. Naturally, there have been ongoing endeavors in adapting these techniques to the chemical domain, resulting in a surge of publications within a short period. The diversity of chemical structures, use cases, and learning models necessitates a comprehensive summarization of existing works. In this paper, we review recent innovations in adapting transformers to solve learning problems in chemistry. Because chemical data is diverse and complex, we structure our discussion based on chemical representations. Specifically, we highlight the strengths and weaknesses of each representation, the current progress of adapting transformer architectures, and future directions.
Affiliation(s)
- Kha-Dinh Luong
- Department of Computer Science, University of California Santa Barbara, Santa Barbara, CA 93106, United States
| | - Ambuj Singh
- Department of Computer Science, University of California Santa Barbara, Santa Barbara, CA 93106, United States
| |
|
25
|
Awais M, Belhaouari SB, Kassoul K. Graphical Insight: Revolutionizing Seizure Detection with EEG Representation. Biomedicines 2024; 12:1283. [PMID: 38927490 PMCID: PMC11201274 DOI: 10.3390/biomedicines12061283] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/30/2024] [Revised: 05/28/2024] [Accepted: 05/31/2024] [Indexed: 06/28/2024] Open
Abstract
Epilepsy is characterized by recurring seizures that result from abnormal electrical activity in the brain. These seizures manifest as various symptoms including muscle contractions and loss of consciousness. The challenging task of detecting epileptic seizures involves classifying electroencephalography (EEG) signals into ictal (seizure) and interictal (non-seizure) classes. This classification is crucial because it distinguishes between the states of seizure and seizure-free periods in patients with epilepsy. Our study presents an innovative approach for detecting seizures and neurological diseases using EEG signals by leveraging graph neural networks. This method effectively addresses EEG data processing challenges. We construct a graph representation of EEG signals by extracting features such as frequency-based, statistical-based, and Daubechies wavelet transform features. This graph representation allows for potential differentiation between seizure and non-seizure signals through visual inspection of the extracted features. To enhance seizure detection accuracy, we employ two models: one combining a graph convolutional network (GCN) with long short-term memory (LSTM) and the other combining a GCN with balanced random forest (BRF). Our experimental results reveal that both models significantly improve seizure detection accuracy, surpassing previous methods. Despite simplifying our approach by reducing channels, our research reveals a consistent performance, showing a significant advancement in neurodegenerative disease detection. Our models accurately identify seizures in EEG signals, underscoring the potential of graph neural networks. The streamlined method not only maintains effectiveness with fewer channels but also offers a visually distinguishable approach for discerning seizure classes. This research opens avenues for EEG analysis, emphasizing the impact of graph representations in advancing our understanding of neurodegenerative diseases.
Affiliation(s)
- Muhammad Awais
- Department of Creative Technologies, Air University, Islamabad 44000, Pakistan;
| | - Samir Brahim Belhaouari
- Division of Information and Computing Technology, College of Science and Engineering, Hamad Bin Khalifa University, Doha 5825, Qatar
| | - Khelil Kassoul
- Geneva School of Business Administration, University of Applied Sciences Western Switzerland, HES-SO, 1227 Geneva, Switzerland
| |
|
26
|
Zhang Y, Li J, Lin S, Zhao J, Xiong Y, Wei DQ. An end-to-end method for predicting compound-protein interactions based on simplified homogeneous graph convolutional network and pre-trained language model. J Cheminform 2024; 16:67. [PMID: 38849874 PMCID: PMC11162000 DOI: 10.1186/s13321-024-00862-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/29/2023] [Accepted: 05/19/2024] [Indexed: 06/09/2024] Open
Abstract
Identification of interactions between chemical compounds and proteins is crucial for various applications, including drug discovery, target identification, network pharmacology, and elucidation of protein functions. Deep neural network-based approaches are becoming increasingly popular in efficiently identifying compound-protein interactions with high-throughput capabilities, narrowing down the scope of candidates for traditional labor-intensive, time-consuming and expensive experimental techniques. In this study, we proposed an end-to-end approach termed SPVec-SGCN-CPI, which utilized a simplified graph convolutional network (SGCN) model with low-dimensional and continuous features generated from our previously developed model SPVec and graph topology information to predict compound-protein interactions. By separating local neighborhood aggregation from nonlinear layer-wise propagation, the SGCN technique effectively aggregates K-order neighbor information while avoiding neighbor explosion and expediting training. The performance of the SPVec-SGCN-CPI method was assessed across three datasets and compared against four machine learning- and deep learning-based methods, as well as six state-of-the-art methods. Experimental results revealed that SPVec-SGCN-CPI outperformed all these competing methods, particularly excelling in unbalanced data scenarios. By propagating node features and topological information to the feature space, SPVec-SGCN-CPI effectively incorporates interactions between compounds and proteins, enabling the fusion of heterogeneity. Furthermore, our method scored all unlabeled data in ChEMBL, confirming the top five ranked compound-protein interactions through molecular docking and existing evidence. These findings suggest that our model can reliably uncover compound-protein interactions within unlabeled compound-protein pairs, carrying substantial implications for drug re-profiling and discovery.
In summary, SPVec-SGCN demonstrates its efficacy in accurately predicting compound-protein interactions, showcasing potential to enhance target identification and streamline drug discovery processes.
Scientific contributions
The methodology presented in this work not only enables the comparatively accurate prediction of compound-protein interactions but also, for the first time, takes into consideration both sample imbalance, which is very common in the real world, and computational efficiency, accelerating the target identification and drug discovery process.
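The separation of neighborhood aggregation from the nonlinear step can be illustrated with the generic simplified graph convolution idea: features are smoothed K times with a fixed normalized adjacency matrix, and only afterwards passed to a classifier. The sketch below is an illustration under stated assumptions, not the authors' SPVec-SGCN-CPI code; the function names, the dense list-of-lists matrices, and the toy path graph are all hypothetical, and the downstream classifier is omitted.

```python
import math


def normalized_adjacency(adj):
    """Return S = D^{-1/2} (A + I) D^{-1/2} for a dense adjacency matrix."""
    n = len(adj)
    # Add self-loops so each node keeps part of its own feature.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    """Plain dense matrix product of two list-of-lists matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def propagate(adj, features, k):
    """Apply S to the features k times, with no nonlinearity in between."""
    s = normalized_adjacency(adj)
    out = [row[:] for row in features]
    for _ in range(k):
        out = matmul(s, out)
    return out


# Toy path graph 0 - 1 - 2 with one scalar feature per node (assumed data).
adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
x = [[1.0], [0.0], [0.0]]
smoothed = propagate(adj, x, k=2)
```

Because the propagation has no trainable parameters, the smoothed features can be computed once before training, which is one plausible reading of why this style of model trains quickly. In the toy graph, node 2 only receives signal from node 0 once K reaches 2.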
Affiliation(s)
- Yufang Zhang
- School of Mathematical Sciences and SJTU-Yale Joint Center for Biostatistics and Data Science, Shanghai Jiao Tong University, Shanghai, 200240, China
- Peng Cheng Laboratory, Shenzhen, 518055, Guangdong, China
- Zhongjing Research and Industrialization, Institute of Chinese Medicine, Zhongguancun Scientific Park, Meixi, Nanyang, 473006, Henan, China
| | - Jiayi Li
- State Key Laboratory of Microbial Metabolism, School of Life Sciences and Biotechnology, and Joint Laboratory of International Cooperation in Metabolic and Developmental Sciences, Ministry of Education, Shanghai JiaoTong University, Shanghai, China
| | - Shenggeng Lin
- State Key Laboratory of Microbial Metabolism, School of Life Sciences and Biotechnology, and Joint Laboratory of International Cooperation in Metabolic and Developmental Sciences, Ministry of Education, Shanghai JiaoTong University, Shanghai, China
| | - Jianwei Zhao
- State Key Laboratory of Microbial Metabolism, School of Life Sciences and Biotechnology, and Joint Laboratory of International Cooperation in Metabolic and Developmental Sciences, Ministry of Education, Shanghai JiaoTong University, Shanghai, China
| | - Yi Xiong
- State Key Laboratory of Microbial Metabolism, School of Life Sciences and Biotechnology, and Joint Laboratory of International Cooperation in Metabolic and Developmental Sciences, Ministry of Education, Shanghai JiaoTong University, Shanghai, China.
- Shanghai Artificial Intelligence Laboratory, Shanghai, 200232, China.
| | - Dong-Qing Wei
- Peng Cheng Laboratory, Shenzhen, 518055, Guangdong, China.
- Zhongjing Research and Industrialization, Institute of Chinese Medicine, Zhongguancun Scientific Park, Meixi, Nanyang, 473006, Henan, China.
- State Key Laboratory of Microbial Metabolism, School of Life Sciences and Biotechnology, and Joint Laboratory of International Cooperation in Metabolic and Developmental Sciences, Ministry of Education, Shanghai JiaoTong University, Shanghai, China.
| |
|
27
|
Rahmani M, Mohajelin F, Khaleghi N, Sheykhivand S, Danishvar S. An Automatic Lie Detection Model Using EEG Signals Based on the Combination of Type 2 Fuzzy Sets and Deep Graph Convolutional Networks. SENSORS (BASEL, SWITZERLAND) 2024; 24:3598. [PMID: 38894389 PMCID: PMC11175191 DOI: 10.3390/s24113598] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/13/2024] [Revised: 05/08/2024] [Accepted: 05/27/2024] [Indexed: 06/21/2024]
Abstract
In recent decades, many governmental and nongovernmental organizations have used lie detection for various purposes, including ensuring the honesty of criminal confessions. Lie detection is commonly performed with a polygraph machine. However, the polygraph has limitations and is not sufficiently reliable. This study introduces a new model for detecting lies using electroencephalogram (EEG) signals. An EEG database of 20 study participants was created to accomplish this goal. This study also used a six-layer graph convolutional network and type 2 fuzzy (TF-2) sets for feature selection/extraction and automatic classification. The classification results show that the proposed deep model effectively distinguishes between truths and lies. Even in a noisy environment (SNR = 0 dB), the classification accuracy remains above 90%. The proposed strategy outperforms current research and algorithms. Its superior performance makes it suitable for a wide range of practical applications.
Affiliation(s)
- Mahsan Rahmani
- Biomedical Engineering Department, Faculty of Electrical and Computer Engineering, University of Tabriz, Tabriz 51666-16471, Iran
| | | | - Nastaran Khaleghi
- Biomedical Engineering Department, Faculty of Electrical and Computer Engineering, University of Tabriz, Tabriz 51666-16471, Iran
| | - Sobhan Sheykhivand
- Department of Biomedical Engineering, University of Bonab, Bonab 55517-61167, Iran
| | - Sebelan Danishvar
- College of Engineering, Design and Physical Sciences, Brunel University London, Uxbridge UB8 3PH, UK
| |
|
28
|
Lou Y, Sun F, Ni J. Optimizing energy storage plant discrete system dynamics analysis with graph convolutional networks. Heliyon 2024; 10:e31119. [PMID: 38778935 PMCID: PMC11109872 DOI: 10.1016/j.heliyon.2024.e31119] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/07/2024] [Revised: 04/17/2024] [Accepted: 05/10/2024] [Indexed: 05/25/2024] Open
Abstract
Addressing the challenges of suboptimal model performance and excessive parameters and operations in the optimization of energy storage power plants utilizing Graph Convolutional Network (GCN), this paper introduces a novel approach - the packet-switched graph convolutional network. Initially, a GCN extreme learning machine is established. Drawing inspiration from this solid foundation, we have innovatively crafted a group exchange graph convolution module. This module leverages group graph convolution techniques to amalgamate unique node feature information, tailored to diverse topology graph matrices based on various groupings. This innovative approach ensures that information flows freely and effectively among distinct groupings. Furthermore, we have designed a cutting-edge timing depth separation convolution module, comprising two innovative components. The first component introduces timing depth separation convolution, revolutionizing the original timing convolution module. The second component, the packet-switching graph convolutional network, revolutionizes the time sequence depth separation convolution process. It achieves this by employing 1 × 1 convolutional layers between different feature fusion packets, enabling seamless information exchange between distinct packets. Experimental results demonstrate the efficacy of the proposed model, with root mean square error (RMSE) and mean absolute error (MAE) for single-step prediction reaching 46.08 and 26.22 at 60 min, respectively. In multi-step testing, the proposed model exhibits a 14.71% reduction in RMSE at the 15-min scale and a 9.29% reduction at the 60-min scale compared to the benchmark model. This performance improvement enhances the operational efficiency and reliability of the energy storage plant, particularly under dynamic changes in the time series.
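For readers unfamiliar with the two error metrics reported here, the following minimal sketch shows how root mean square error (RMSE), mean absolute error (MAE), and a percentage reduction relative to a baseline are commonly computed. The function names and the example numbers in the usage note are assumptions made for illustration; they are not taken from the paper's code or data.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error: square root of the mean squared residual."""
    return math.sqrt(
        sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    )


def mae(y_true, y_pred):
    """Mean absolute error: mean of the absolute residuals."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def relative_reduction(baseline_error, proposed_error):
    """Percentage by which the proposed error falls below the baseline."""
    return 100.0 * (baseline_error - proposed_error) / baseline_error
```

For the same predictions RMSE is never smaller than MAE, because squaring weights large residuals more heavily. As a made-up usage example, a baseline RMSE of 100 against a proposed RMSE of 85.29 gives `relative_reduction(100.0, 85.29)` of about 14.71.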
Affiliation(s)
- Yangbing Lou
- S.M. Wu Manufacturing Research Center, Department of Mechanical Engineering, University of Michigan, Ann Arbor, MI, 48109, United States
| | | | - Jun Ni
- S.M. Wu Manufacturing Research Center, Department of Mechanical Engineering, University of Michigan, Ann Arbor, MI, 48109, United States
| |
|
29
|
Wang L, Hu Y, Xiao K, Zhang C, Shi Q, Chen L. Multi-modal domain adaptation for revealing spatial functional landscape from spatially resolved transcriptomics. Brief Bioinform 2024; 25:bbae257. [PMID: 38819253 PMCID: PMC11141295 DOI: 10.1093/bib/bbae257] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/16/2024] [Revised: 04/13/2024] [Accepted: 05/15/2024] [Indexed: 06/01/2024] Open
Abstract
Spatially resolved transcriptomics (SRT) has emerged as a powerful tool for investigating gene expression in spatial contexts, providing insights into the molecular mechanisms underlying organ development and disease pathology. However, expression sparsity makes it computationally challenging to integrate the other modalities (e.g. histological images and spatial locations) that are simultaneously captured in SRT datasets for spatial clustering and variation analyses. In this study, to meet this challenge, we propose multi-modal domain adaption for spatial transcriptomics (stMDA), a novel multi-modal unsupervised domain adaptation method, which integrates gene expression and other modalities to reveal the spatial functional landscape. Specifically, stMDA first learns the modality-specific representations from spatial multi-modal data using multiple neural network architectures and then aligns the spatial distributions across modal representations to integrate these multi-modal representations, thus facilitating the integration of global and spatially local information and improving the consistency of clustering assignments. Our results demonstrate that stMDA outperforms existing methods in identifying spatial domains across diverse platforms and species. Furthermore, stMDA excels in identifying spatially variable genes with high prognostic potential in cancer tissues. In conclusion, stMDA, as a new tool for multi-modal data integration, provides a powerful and flexible framework for analyzing SRT datasets, thereby advancing our understanding of intricate biological systems.
Affiliation(s)
- Lequn Wang
- Key Laboratory of Systems Biology, Shanghai Institute of Biochemistry and Cell Biology, Center for Excellence in Molecular Cell Science, Chinese Academy of Sciences, No. 320 Yue Yang Road, Xuhui District, Shanghai 200031, China
- University of Chinese Academy of Sciences, No. 80 Zhongguancun East Road, Haidian District, Beijing 100049, China
| | - Yaofeng Hu
- Key Laboratory of Systems Health Science of Zhejiang Province, School of Life Science, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, 1 Xiangshan Lane, Hangzhou 310024, China
| | - Kai Xiao
- Key Laboratory of Systems Biology, Shanghai Institute of Biochemistry and Cell Biology, Center for Excellence in Molecular Cell Science, Chinese Academy of Sciences, No. 320 Yue Yang Road, Xuhui District, Shanghai 200031, China
- University of Chinese Academy of Sciences, No. 80 Zhongguancun East Road, Haidian District, Beijing 100049, China
| | - Chuanchao Zhang
- Key Laboratory of Systems Health Science of Zhejiang Province, School of Life Science, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, 1 Xiangshan Lane, Hangzhou 310024, China
| | - Qianqian Shi
- Hubei Engineering Technology Research Center of Agricultural Big Data, Huazhong Agricultural University, No. 1 Shizishan Street, Hongshan District, Wuhan 430070, Hubei Province, China
- Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, No. 1 Shizishan Street, Hongshan District, Wuhan 430070, Hubei Province, China
| | - Luonan Chen
- Key Laboratory of Systems Biology, Shanghai Institute of Biochemistry and Cell Biology, Center for Excellence in Molecular Cell Science, Chinese Academy of Sciences, No. 320 Yue Yang Road, Xuhui District, Shanghai 200031, China
- University of Chinese Academy of Sciences, No. 80 Zhongguancun East Road, Haidian District, Beijing 100049, China
- Key Laboratory of Systems Health Science of Zhejiang Province, School of Life Science, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, 1 Xiangshan Lane, Hangzhou 310024, China
| |
|
30
|
Zhang W, Zhang P, Sun W, Xu J, Liao L, Cao Y, Han Y. Improving plant miRNA-target prediction with self-supervised k-mer embedding and spectral graph convolutional neural network. PeerJ 2024; 12:e17396. [PMID: 38799058 PMCID: PMC11122044 DOI: 10.7717/peerj.17396] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/08/2024] [Accepted: 04/25/2024] [Indexed: 05/29/2024] Open
Abstract
Deciphering the targets of microRNAs (miRNAs) in plants is crucial for comprehending their function and the variation in phenotype that they cause. Because miRNA regulation is highly cell-specific, recent computational approaches usually utilize expression data to identify the most physiologically relevant targets. Although these methods are effective, they typically require a large sample size and high-depth sequencing to detect potential miRNA-target pairs, thereby limiting their applicability in improving plant breeding. In this study, we propose a novel miRNA-target prediction framework named kmerPMTF (k-mer-based prediction framework for plant miRNA-target). Our framework effectively extracts the latent semantic embeddings of sequences by utilizing k-mer splitting and a deep self-supervised neural network. We construct multiple similarity networks based on k-mer embeddings and employ graph convolutional networks to derive deep representations of miRNAs and targets and calculate the probabilities of potential associations. We evaluated the performance of kmerPMTF on four typical plant datasets: Arabidopsis thaliana, Oryza sativa, Solanum lycopersicum, and Prunus persica. The results demonstrate its ability to achieve AUPRC values of 84.9%, 91.0%, 80.1%, and 82.1% in 5-fold cross-validation, respectively. Compared with several state-of-the-art existing methods, our framework achieves better performance on threshold-independent evaluation metrics. Overall, our study provides an efficient and simplified methodology for identifying plant miRNA-target associations, which will contribute to a deeper comprehension of miRNA regulatory mechanisms in plants.
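The k-mer splitting step mentioned in this abstract can be pictured as sliding a window of length k along a sequence and treating each window as a token. The sketch below is a generic illustration of that idea only; the function names, the stride parameter, and the example sequence are assumptions, and the self-supervised embedding network that kmerPMTF trains on such tokens is not shown.

```python
from collections import Counter


def split_kmers(sequence, k, stride=1):
    """Return the k-mers of a sequence, moving the window by `stride`."""
    if k <= 0 or stride <= 0:
        raise ValueError("k and stride must be positive")
    # A sequence shorter than k yields no k-mers.
    return [sequence[i:i + k]
            for i in range(0, len(sequence) - k + 1, stride)]


def kmer_counts(sequence, k):
    """Count how often each overlapping k-mer occurs in the sequence."""
    return Counter(split_kmers(sequence, k))


# Hypothetical RNA fragment used only to show the tokenization.
tokens = split_kmers("UGACAG", 3)
```

With a stride of 1 a sequence of length n produces n - k + 1 overlapping tokens, so the token list can be fed to any sequence-embedding model in the same way words are.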
Affiliation(s)
- Weihan Zhang
- CAS Key Laboratory of Plant Germplasm Enhancement and Specialty Agriculture, Wuhan Botanical Garden, The Innovative Academy of Seed Design of Chinese Academy of Sciences, Wuhan, Hubei Province, China
- Sino-African Joint Research Center, Chinese Academy of Sciences, Wuhan, Hubei Province, China
| | - Ping Zhang
- College of Informatics, Huazhong Agricultural University, Wuhan, Hubei Province, China
| | - Weicheng Sun
- College of Informatics, Huazhong Agricultural University, Wuhan, Hubei Province, China
| | - Jinsheng Xu
- College of Informatics, Huazhong Agricultural University, Wuhan, Hubei Province, China
| | - Liao Liao
- CAS Key Laboratory of Plant Germplasm Enhancement and Specialty Agriculture, Wuhan Botanical Garden, The Innovative Academy of Seed Design of Chinese Academy of Sciences, Wuhan, Hubei Province, China
- Sino-African Joint Research Center, Chinese Academy of Sciences, Wuhan, Hubei Province, China
| | - Yunpeng Cao
- CAS Key Laboratory of Plant Germplasm Enhancement and Specialty Agriculture, Wuhan Botanical Garden, The Innovative Academy of Seed Design of Chinese Academy of Sciences, Wuhan, Hubei Province, China
- Sino-African Joint Research Center, Chinese Academy of Sciences, Wuhan, Hubei Province, China
| | - Yuepeng Han
- CAS Key Laboratory of Plant Germplasm Enhancement and Specialty Agriculture, Wuhan Botanical Garden, The Innovative Academy of Seed Design of Chinese Academy of Sciences, Wuhan, Hubei Province, China
- Sino-African Joint Research Center, Chinese Academy of Sciences, Wuhan, Hubei Province, China
| |
|
31
|
Chen K, Li G, Li H, Wang Y, Wang W, Liu Q, Wang H. Quantifying uncertainty: Air quality forecasting based on dynamic spatial-temporal denoising diffusion probabilistic model. ENVIRONMENTAL RESEARCH 2024; 249:118438. [PMID: 38350546 DOI: 10.1016/j.envres.2024.118438] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/14/2024] [Revised: 02/04/2024] [Accepted: 02/05/2024] [Indexed: 02/15/2024]
Abstract
Air pollution poses a substantial threat to human health, which has driven the development of a wide range of air quality prediction models. These models span from mechanistic and statistical strategies to machine learning methodologies. The growing field of deep learning has given rise to many advanced models, which have demonstrated strong performance. However, previous investigations have overlooked the importance of quantifying prediction uncertainties and potential future interconnections among air monitoring stations. Moreover, prior research typically utilized static predetermined spatial relationships, neglecting dynamic dependencies. To address these limitations, we propose a model named Dynamic Spatial-Temporal Denoising Diffusion Probabilistic Model (DST-DDPM) for air quality prediction. Our model builds on the denoising diffusion model, which helps us quantify uncertainty. To capture dynamic patterns, we design a dynamic context encoder to generate dynamic adjacency matrices while maintaining static spatial information. Furthermore, we incorporate a spatial-temporal denoising model to concurrently learn both spatial and temporal dependencies. Evaluated on a real-world dataset collected in Beijing, our model outperforms the baseline models in short-term and long-term predictions by 1.36% and 11.62%, respectively. Finally, we conduct a case study to demonstrate our model's capacity to quantify uncertainties.
Affiliation(s)
- Kehua Chen
- Division of Emerging Interdisciplinary Areas (EMIA), Interdisciplinary Programs Office, The Hong Kong University of Science and Technology, Hong Kong, China; Department of Civil and Environmental Engineering, The Hong Kong University of Science and Technology, Hong Kong, China
| | - Guangbo Li
- Division of Emerging Interdisciplinary Areas (EMIA), Interdisciplinary Programs Office, The Hong Kong University of Science and Technology, Hong Kong, China
| | - Hewen Li
- State Key Laboratory of Urban Water Resource and Environment, School of Civil and Environmental Engineering, Harbin Institute of Technology, Shenzhen, 518055, China
| | - Yuqi Wang
- State Key Laboratory of Urban Water Resource and Environment, School of Civil and Environmental Engineering, Harbin Institute of Technology, Shenzhen, 518055, China
| | - Wenzhe Wang
- State Key Laboratory of Urban Water Resource and Environment, School of Civil and Environmental Engineering, Harbin Institute of Technology, Shenzhen, 518055, China
| | - Qingyi Liu
- State Key Laboratory of Urban Water Resource and Environment, School of Civil and Environmental Engineering, Harbin Institute of Technology, Shenzhen, 518055, China
| | - Hongcheng Wang
- State Key Laboratory of Urban Water Resource and Environment, School of Civil and Environmental Engineering, Harbin Institute of Technology, Shenzhen, 518055, China.
| |
|
32
|
Labarga A, Martínez-Gonzalez J, Barajas M. Integrative Multi-Omics Analysis for Etiology Classification and Biomarker Discovery in Stroke: Advancing towards Precision Medicine. BIOLOGY 2024; 13:338. [PMID: 38785820 PMCID: PMC11149453 DOI: 10.3390/biology13050338] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/23/2024] [Revised: 05/02/2024] [Accepted: 05/06/2024] [Indexed: 05/25/2024]
Abstract
Recent advancements in high-throughput omics technologies have opened new avenues for investigating stroke at the molecular level and elucidating the intricate interactions among various molecular components. We present a novel approach for multi-omics data integration on knowledge graphs and have applied it to a stroke etiology classification task of 30 stroke patients through the integrative analysis of DNA methylation and mRNA, miRNA, and circRNA. This approach has demonstrated promising performance as compared to other existing single technology approaches.
Affiliation(s)
- Alberto Labarga
- Health Science Department, Public University of Navarra, 31006 Pamplona, Spain
| | | | - Miguel Barajas
- Health Science Department, Public University of Navarra, 31006 Pamplona, Spain
| |
|
33
|
Yin S, Mi X, Shukla D. Leveraging machine learning models for peptide-protein interaction prediction. RSC Chem Biol 2024; 5:401-417. [PMID: 38725911 PMCID: PMC11078210 DOI: 10.1039/d3cb00208j] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/27/2023] [Accepted: 02/07/2024] [Indexed: 05/12/2024] Open
Abstract
Peptides play a pivotal role in a wide range of biological activities by participating in up to 40% of protein-protein interactions in cellular processes. They also demonstrate remarkable specificity and efficacy, making them promising candidates for drug development. However, predicting peptide-protein complexes by traditional computational approaches, such as docking and molecular dynamics simulations, remains a challenge due to the high computational cost, the flexible nature of peptides, and the limited structural information on peptide-protein complexes. In recent years, the surge of available biological data has given rise to the development of an increasing number of machine learning models for predicting peptide-protein interactions. These models offer efficient solutions to address the challenges associated with traditional computational approaches. Furthermore, they offer enhanced accuracy, robustness, and interpretability in their predictive outcomes. This review presents a comprehensive overview of machine learning and deep learning models that have emerged in recent years for the prediction of peptide-protein interactions.
Affiliation(s)
- Song Yin
- Department of Chemical and Biomolecular Engineering, University of Illinois Urbana-Champaign Urbana 61801 Illinois USA
| | - Xuenan Mi
- Center for Biophysics and Quantitative Biology, University of Illinois Urbana-Champaign Urbana IL 61801 USA
| | - Diwakar Shukla
- Department of Chemical and Biomolecular Engineering, University of Illinois Urbana-Champaign Urbana 61801 Illinois USA
- Center for Biophysics and Quantitative Biology, University of Illinois Urbana-Champaign Urbana IL 61801 USA
- Department of Bioengineering, University of Illinois Urbana-Champaign Urbana IL 61801 USA
| |
|
34
|
Huckvale ED, Moseley HNB. A cautionary tale about properly vetting datasets used in supervised learning predicting metabolic pathway involvement. PLoS One 2024; 19:e0299583. [PMID: 38696410 PMCID: PMC11065254 DOI: 10.1371/journal.pone.0299583] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/10/2023] [Accepted: 02/13/2024] [Indexed: 05/04/2024] Open
Abstract
The mapping of metabolite-specific data to pathways within cellular metabolism is a major data analysis step needed for biochemical interpretation. A variety of machine learning approaches, particularly deep learning approaches, have been used to predict these metabolite-to-pathway mappings, utilizing a training dataset of known metabolite-to-pathway mappings. A few such training datasets have been derived from the Kyoto Encyclopedia of Genes and Genomes (KEGG). However, several prior published machine learning approaches utilized an erroneous KEGG-derived training dataset that used SMILES molecular representation strings (KEGG-SMILES dataset) and contained a sizable proportion (~26%) of duplicate entries. The presence of so many duplicates taints the training and testing sets generated from k-fold cross-validation of the KEGG-SMILES dataset. Therefore, the k-fold cross-validation performance of the resulting machine learning models was grossly inflated by the erroneous presence of these duplicate entries. Here we describe and evaluate the KEGG-SMILES dataset so that others may avoid using it. We also identify the prior publications that utilized this erroneous KEGG-SMILES dataset so their machine learning results can be properly and critically evaluated. In addition, we demonstrate the reduction of model k-fold cross-validation (CV) performance after de-duplicating the KEGG-SMILES dataset. This is a cautionary tale about properly vetting prior published benchmark datasets before using them in machine learning approaches. We hope others will avoid similar mistakes.
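The leakage mechanism described in this abstract can be made concrete with a small sketch: when a dataset contains duplicate entries, a k-fold split can place one copy in the training fold and another in the test fold, so the model is scored on examples it has already seen. The code below is an illustration under stated assumptions, not the authors' analysis; records are assumed to be hashable (SMILES string, pathway label) tuples, duplicates are detected by exact match only, folds are contiguous and unshuffled, and the four toy records are invented.

```python
def deduplicate(records):
    """Drop exact repeats, keeping the first occurrence of each record."""
    seen = set()
    unique = []
    for rec in records:
        if rec not in seen:
            seen.add(rec)
            unique.append(rec)
    return unique


def k_fold_indices(n, k):
    """Yield (train_indices, test_indices) for k contiguous folds."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        stop = start + size
        test = list(range(start, stop))
        train = [i for i in range(n) if i < start or i >= stop]
        yield train, test
        start = stop


def leaked_test_items(records, k):
    """Count test items whose exact record also sits in the training fold."""
    leaked = 0
    for train, test in k_fold_indices(len(records), k):
        train_set = {records[i] for i in train}
        leaked += sum(1 for i in test if records[i] in train_set)
    return leaked


# Invented toy dataset: the first and third entries are duplicates.
toy = [("CCO", "A"), ("CC(=O)O", "B"), ("CCO", "A"), ("c1ccccc1", "C")]
before = leaked_test_items(toy, k=2)
after = leaked_test_items(deduplicate(toy), k=2)
```

Running the leakage count before and after de-duplication is a cheap check to apply to any benchmark prior to cross-validation. Exact string matching is the weakest form of this check, since the same molecule can be written as different SMILES strings; canonicalizing structures first would be a natural extension.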
Affiliation(s)
- Erik D. Huckvale
- Markey Cancer Center, University of Kentucky, Lexington, Kentucky, United States of America
| | - Hunter N. B. Moseley
- Markey Cancer Center, University of Kentucky, Lexington, Kentucky, United States of America
- Superfund Research Center, University of Kentucky, Lexington, Kentucky, United States of America
- Department of Molecular and Cellular Biochemistry, University of Kentucky, Lexington, Kentucky, United States of America
- Institute for Biomedical Informatics, University of Kentucky, Lexington, Kentucky, United States of America
| |
|
35
|
Gao Z, Guo Y, Wang G, Chen X, Cao X, Zhang C, An S, Xu F. Robust deep learning from incomplete annotation for accurate lung nodule detection. Comput Biol Med 2024; 173:108361. [PMID: 38569236 DOI: 10.1016/j.compbiomed.2024.108361] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/23/2023] [Revised: 03/02/2024] [Accepted: 03/20/2024] [Indexed: 04/05/2024]
Abstract
Deep learning plays a significant role in the detection of pulmonary nodules in low-dose computed tomography (LDCT) scans, contributing to the diagnosis and treatment of lung cancer. Nevertheless, its effectiveness often relies on the availability of extensive, meticulously annotated datasets. In this paper, we explore the utilization of an incompletely annotated dataset for pulmonary nodule detection and introduce the FULFIL (Forecasting Uncompleted Labels For Inexpensive Lung nodule detection) algorithm as an innovative approach. By instructing annotators to label only the nodules they are most confident about, without requiring complete coverage, we can substantially reduce annotation costs. However, this approach results in an incompletely annotated dataset, which presents challenges when training deep learning models. Within the FULFIL algorithm, we employ a Graph Convolution Network (GCN) to discover the relationships between annotated and unannotated nodules for self-adaptively completing the annotation. Meanwhile, a teacher-student framework is employed for self-adaptive learning using the completed annotation dataset. Furthermore, we have designed a Dual-Views loss to leverage different data perspectives, aiding the model in acquiring robust features and enhancing generalization. We carried out experiments using the LUng Nodule Analysis (LUNA) dataset, achieving a sensitivity of 0.574 at 0.125 false positives per scan (FPs/scan) with only 10% instance-level annotations for nodules. This result outperformed comparative methods by 7.00%. Experimental comparisons were conducted to evaluate the performance of our model and human experts on the test dataset. The results demonstrate that our model can achieve a level of performance comparable to that of human experts.
The comprehensive experimental results demonstrate that FULFIL can effectively leverage an incomplete pulmonary nodule dataset to develop a robust deep learning model, making it a promising tool for assisting in lung nodule detection.
Affiliation(s)
- Zebin Gao
- School of Information Science and Technology, Fudan University, Shanghai 200438, China
| | - Yuchen Guo
- Beijing National Research Center for Information Science and Technology, Tsinghua University, Beijing 100084, China
| | - Guoxin Wang
- JD Health International Inc, Beijing 100176, China
| | - Xiangru Chen
- Hangzhou Zhuoxi Institute of Brain and Intelligence, Hangzhou 311100, China
| | - Xuyang Cao
- JD Health International Inc, Beijing 100176, China
| | - Chao Zhang
- JD Health International Inc, Beijing 100176, China
| | - Shan An
- JD Health International Inc, Beijing 100176, China
| | - Feng Xu
- School of Software, Tsinghua University, Beijing 100084, China.
| |
Collapse
|
36
|
Andayeshgar B, Abdali-Mohammadi F, Sepahvand M, Almasi A, Salari N. Arrhythmia detection by the graph convolution network and a proposed structure for communication between cardiac leads. BMC Med Res Methodol 2024; 24:96. [PMID: 38678178 PMCID: PMC11055258 DOI: 10.1186/s12874-024-02223-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/07/2023] [Accepted: 04/17/2024] [Indexed: 04/29/2024] Open
Abstract
Heart disease, including arrhythmia, is one of the most common causes of death worldwide. Fields such as artificial intelligence and medical statistics are seeking methods and models for the correct, automatic diagnosis of cardiac arrhythmia, and many studies have aimed to increase the accuracy of automated methods. However, none of the previous articles included the relationship and structure between the heart leads in the model. The structure of ECG data may help improve the accuracy of arrhythmia detection. Therefore, this study introduced a new structure for electrocardiogram (ECG) data and used a Graph Convolution Network (GCN), which can learn from that structure, to improve the accuracy of cardiac arrhythmia diagnosis. The new structure considers the relationship between the heart leads and clusters based on the different ECG poles: the Mutual Information (MI) index was used to evaluate the relationship between the leads, and weights were assigned based on the poles of the leads. The resulting Weighted Mutual Information (WMI) matrices were computed in R. Finally, a 15-layer GCN was fitted with this structure and used to detect and classify arrhythmia. Sensitivity, precision, specificity, accuracy, and the confusion matrix were used to evaluate the proposed network. The accuracy of GCNs built on three different structures (WMI, MI, and Identity) was also compared. Chapman's 12-lead ECG dataset was used in this study. The sensitivity, precision, specificity, and accuracy of the GCN-WMI network with 15 intermediate layers were 98.74%, 99.08%, 99.97%, and 99.82%, respectively. The proposed network was more accurate than the Graph Convolution Network-Mutual Information (GCN-MI), with an accuracy of 99.71%, and GCN-Id, with an accuracy of 92.68%. Using this network, the types of arrhythmia were recognized and classified. The proposed Graph Convolution Network-Weighted Mutual Information (GCN-WMI) was also more accurate than the networks reported in other studies on the same dataset (Chapman). Based on these results, the structure proposed in this study increased the accuracy of cardiac arrhythmia diagnosis and classification on the Chapman dataset. Achieving such accuracy in arrhythmia diagnosis is a great achievement for clinical sciences.
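The lead-to-lead structure described in this abstract can be illustrated with a small sketch. It assumes that MI is estimated from equal-width histogram bins and that the pole-based weights arrive as a ready-made matrix; the abstract specifies neither (the authors worked in R), so `discretize`, `weighted_mi_matrix`, and the `weights` argument are hypothetical names chosen for illustration.

```python
import math
from collections import Counter


def discretize(signal, bins=4):
    """Map each sample to an equal-width bin index in [0, bins - 1]."""
    lo, hi = min(signal), max(signal)
    width = (hi - lo) / bins or 1.0  # constant signals fall into bin 0
    return [min(int((x - lo) / width), bins - 1) for x in signal]


def mutual_information(x, y, bins=4):
    """Histogram estimate of I(X; Y) in nats for two equal-length signals."""
    xs, ys = discretize(x, bins), discretize(y, bins)
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    # sum over observed cells of p(a, b) * log(p(a, b) / (p(a) * p(b)))
    return sum(
        (c / n) * math.log((c * n) / (px[a] * py[b]))
        for (a, b), c in pxy.items()
    )


def weighted_mi_matrix(leads, weights, bins=4):
    """Element-wise product of a weight matrix and the pairwise MI matrix.

    `weights` is a hypothetical stand-in for the pole-based weighting.
    """
    k = len(leads)
    return [
        [weights[i][j] * mutual_information(leads[i], leads[j], bins)
         for j in range(k)]
        for i in range(k)
    ]
```

A matrix built this way could serve as the weighted adjacency of a graph whose nodes are the ECG leads. The histogram estimator is only one of several possible MI estimators.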
Affiliation(s)
- Bahare Andayeshgar
- Department of Biostatistics, School of Health, Kermanshah University of Medical Sciences, Kermanshah, 6715847141, Iran
- Fardin Abdali-Mohammadi
- Department of Computer Engineering and Information Technology, Razi University, Kermanshah, 6714967346, Iran
- Majid Sepahvand
- Department of Computer Engineering and Information Technology, Razi University, Kermanshah, 6714967346, Iran
- Afshin Almasi
- Clinical Research Development Center, Mohammad Kermanshahi, and Farabi Hospitals, Imam Khomeini, Kermanshah University of Medical Sciences, Kermanshah, Iran
- Nader Salari
- Department of Biostatistics, School of Health, Kermanshah University of Medical Sciences, Kermanshah, 6715847141, Iran
- Sleep Disorders Research Center, Kermanshah University of Medical Sciences, Kermanshah, 6715847141, Iran
37
Ma C, Gu Y, Wang Z. TriConvUNeXt: A Pure CNN-Based Lightweight Symmetrical Network for Biomedical Image Segmentation. JOURNAL OF IMAGING INFORMATICS IN MEDICINE 2024:10.1007/s10278-024-01116-8. [PMID: 38653912 DOI: 10.1007/s10278-024-01116-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/16/2023] [Revised: 03/15/2024] [Accepted: 03/25/2024] [Indexed: 04/25/2024]
Abstract
Biomedical image segmentation is essential in clinical practice, offering critical insights for accurate diagnosis and strategic treatment approaches. Self-attention-based networks have achieved competitive performance in both natural language processing and computer vision, but their computational cost has limited their use in practical applications. Recent studies of Convolutional Neural Networks (CNNs) that explore linear functions within modified CNN layers demonstrate that pure CNN-based networks can still achieve competitive results against the Vision Transformer (ViT) in biomedical image segmentation, with fewer parameters. The modified CNN, i.e., the depthwise CNN, however, leaves the features separated along the channel dimension and prevents the extraction of features from the perspective of channel interaction. To further explore the feature learning power of modified CNNs for biomedical image segmentation, we design a lightweight multi-convolutional, multi-scale convolutional network block (MSConvNeXt) for a U-shaped symmetrical network. Specifically, a network block consisting of a depthwise CNN, a deformable CNN, and a dilated CNN is proposed to capture semantic feature information effectively at low computational cost. Furthermore, a channel shuffling operation is proposed to dynamically promote efficient feature fusion among the feature maps. The network block is then deployed within a U-shaped symmetrical encoder-decoder network, named TriConvUNeXt. The proposed network is validated on a public benchmark dataset with a comprehensive evaluation of both computational cost and segmentation performance against 13 baseline methods. Specifically, TriConvUNeXt achieves a Dice coefficient 1% higher than UNet and TransUNet at 81% and 97% lower computational cost, respectively. The implementation of TriConvUNeXt is publicly accessible via https://github.com/ziyangwang007/TriConvUNeXt.
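The channel shuffling step mentioned in this abstract can be sketched as follows. This is the generic group-wise shuffle known from ShuffleNet-style networks, offered as an assumption about what such an operation looks like; the abstract does not say whether TriConvUNeXt uses exactly this variant. The function name `channel_shuffle` and the list-of-channels representation are illustrative only.

```python
def channel_shuffle(channels, groups):
    """Interleave channels across groups.

    Equivalent to reshaping the channel axis to (groups, per_group),
    transposing to (per_group, groups), and flattening again, so that
    each output neighbourhood mixes channels from every group.
    """
    n = len(channels)
    if n % groups != 0:
        raise ValueError("number of channels must be divisible by groups")
    per_group = n // groups
    return [
        channels[g * per_group + i]
        for i in range(per_group)
        for g in range(groups)
    ]
```

With six channels in two groups, the channels of the first group (0, 1, 2) and the second group (3, 4, 5) end up alternating, so a following grouped or depthwise convolution sees information from both groups.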
Affiliation(s)
- Chao Ma
- Mianyang Visual Object Detection and Recognition Engineering Center, Mianyang, China
- Yuan Gu
- School of Medicine, Stanford University, Stanford, USA
- Ziyang Wang
- Department of Computer Science, University of Oxford, Oxford, UK
38
Xiao Y, Hou Y, Zhou H, Diallo G, Fiszman M, Wolfson J, Zhou L, Kilicoglu H, Chen Y, Su C, Xu H, Mantyh WG, Zhang R. Repurposing non-pharmacological interventions for Alzheimer's disease through link prediction on biomedical literature. Sci Rep 2024; 14:8693. [PMID: 38622164 PMCID: PMC11018822 DOI: 10.1038/s41598-024-58604-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/23/2023] [Accepted: 04/01/2024] [Indexed: 04/17/2024] Open
Abstract
Non-pharmaceutical interventions (NPIs) have great potential to improve cognitive function, but the repurposing of NPIs for Alzheimer's disease (AD) has received limited investigation. This is the first study to develop a framework that extracts and represents NPI information from biomedical literature in a knowledge graph (KG) and trains link prediction models to repurpose novel NPIs for AD prevention. We constructed a comprehensive KG, called ADInt, by extracting NPI information from biomedical literature. We used the previously created SuppKG and NPI lexicon to identify NPI entities. Four KG embedding models (TransE, RotatE, DistMult, and ComplEx) and two novel graph convolutional network models (R-GCN and CompGCN) were trained and compared to learn the representation of ADInt. Models were evaluated and compared on two test sets (time slice and clinical trial ground truth), and the best-performing model was used to predict novel NPIs for AD. Discovery patterns were applied to generate mechanistic pathways for high-scoring candidates. ADInt has 162,212 nodes and 1,017,284 edges. R-GCN performed best on the time slice (MR = 5.2054, Hits@10 = 0.8496) and clinical trial ground truth (MR = 3.4996, Hits@10 = 0.9192) test sets. After evaluation by domain experts, 10 novel dietary supplements and 10 complementary and integrative health interventions were proposed from the score table calculated by R-GCN. Among the proposed novel NPIs, we found plausible mechanistic pathways for photodynamic therapy and Choerospondias axillaris to prevent AD, and validated psychotherapy and manual therapy techniques using real-world data analysis. The proposed framework shows potential for discovering new NPIs for AD prevention and understanding their mechanistic pathways.
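The two link prediction metrics quoted in this abstract can be sketched as below, assuming that MR denotes mean rank and that ranks are 1-based with higher model scores meaning better candidates. The helper names are illustrative, and the study's own handling of ties or filtered candidates is not described in the abstract.

```python
def rank_of_true(scores, true_index):
    """1-based rank of the true candidate among all scored candidates.

    Assumes a higher score is better; ties are resolved optimistically.
    """
    true_score = scores[true_index]
    return 1 + sum(1 for s in scores if s > true_score)


def mean_rank(ranks):
    """Average rank of the true entities (lower is better)."""
    return sum(ranks) / len(ranks)


def hits_at_k(ranks, k=10):
    """Fraction of test triples whose true entity ranks in the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)
```

Under these definitions, a Hits@10 of 0.8496 would mean that the true entity appeared among the ten best-scored candidates for roughly 85% of the test triples.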
Affiliation(s)
- Yongkang Xiao
- Institute for Health Informatics, University of Minnesota, Minneapolis, MN, USA
- Yu Hou
- Division of Computational Health Sciences, Department of Surgery, University of Minnesota, Minneapolis, MN, USA
- Huixue Zhou
- Institute for Health Informatics, University of Minnesota, Minneapolis, MN, USA
- Gayo Diallo
- INRIA SISTM, Team AHeaD - INSERM 1219 Bordeaux Population Health, University of Bordeaux, 33000, Bordeaux, France
- Marcelo Fiszman
- NITES - Núcleo de Inovação e Tecnologia Em Saúde, Pontifical Catholic University of Rio de Janeiro, Rio de Janeiro, Brazil
- Semedy Inc, Needham, MA, USA
- Julian Wolfson
- Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, MN, USA
- Li Zhou
- Division of General Internal Medicine and Primary Care, Department of Medicine, Brigham and Women's Hospital, Boston, MA, USA
- Halil Kilicoglu
- School of Information Sciences, University of Illinois Urbana-Champaign, Champaign, IL, USA
- You Chen
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA
- Chang Su
- Department of Population Health Sciences, Weill Cornell Medicine, New York, NY, USA
- Hua Xu
- Section of Biomedical Informatics and Data Science, School of Medicine, Yale University, New Haven, CT, USA
- William G Mantyh
- Department of Neurology, University of Minnesota, Minneapolis, MN, USA
- Rui Zhang
- Division of Computational Health Sciences, Department of Surgery, University of Minnesota, Minneapolis, MN, USA
39
Arab I, Egghe K, Laukens K, Chen K, Barakat K, Bittremieux W. Benchmarking of Small Molecule Feature Representations for hERG, Nav1.5, and Cav1.2 Cardiotoxicity Prediction. J Chem Inf Model 2024; 64:2515-2527. [PMID: 37870574 DOI: 10.1021/acs.jcim.3c01301] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/24/2023]
Abstract
In the field of drug discovery, there is a substantial challenge in seeking out chemical structures that possess desirable pharmacological, toxicological, and pharmacokinetic properties. Complications arise when drugs interfere with the functioning of cardiac ion channels, leading to serious cardiovascular consequences. The discontinuation and removal of numerous approved drugs from the market or at late development stages in the pipeline due to such inhibitory effects further highlight the urgency of addressing this issue. Consequently, the early prediction of potential blockers targeting cardiac ion channels during the drug discovery process is of paramount importance. This study introduces a deep learning framework that computationally determines the cardiotoxicity associated with the voltage-gated potassium channel (hERG), the voltage-gated calcium channel (Cav1.2), and the voltage-gated sodium channel (Nav1.5) for drug candidates. The predictive capabilities of three feature representations (molecular fingerprints, descriptors, and graph-based numerical representations) are rigorously benchmarked. Additionally, a novel training and evaluation data set framework is presented, enabling predictive model training of drug off-target cardiotoxicity using a comprehensive and large curated data set covering these three cardiac ion channels. To facilitate these predictions, a robust and comprehensive small molecule cardiotoxicity prediction tool named CToxPred has been developed. It is made available as open source under the permissive MIT license at https://github.com/issararab/CToxPred.
Affiliation(s)
- Issar Arab
- Department of Computer Science, University of Antwerp, 2020 Antwerp, Belgium
- Biomedical Informatics Network Antwerpen (Biomina), 2020 Antwerp, Belgium
- Kristof Egghe
- Department of Computer Science, University of Antwerp, 2020 Antwerp, Belgium
- Kris Laukens
- Department of Computer Science, University of Antwerp, 2020 Antwerp, Belgium
- Biomedical Informatics Network Antwerpen (Biomina), 2020 Antwerp, Belgium
- Ke Chen
- Chair for Theoretical Chemistry, Catalysis Research Center, Technische Universität München, Lichtenbergstraße 4, D-85747 Garching, Germany
- Khaled Barakat
- Faculty of Pharmacy and Pharmaceutical Sciences, University of Alberta, Edmonton, Alberta 8613, Canada
- Wout Bittremieux
- Department of Computer Science, University of Antwerp, 2020 Antwerp, Belgium
- Biomedical Informatics Network Antwerpen (Biomina), 2020 Antwerp, Belgium
40
Ma H, Li D, Zhao J, Li W, Fu J, Li C. HR-BGCN : Predicting readmission for heart failure from electronic health records. Artif Intell Med 2024; 150:102829. [PMID: 38553167 DOI: 10.1016/j.artmed.2024.102829] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/02/2023] [Revised: 11/19/2023] [Accepted: 03/01/2024] [Indexed: 04/02/2024]
Abstract
Heart failure has become a major public health problem, and failure to accurately predict readmission adds to the disease's high cost and high mortality. A readmission prediction model can assist doctors in making decisions that prevent patients from deteriorating and reduce the cost burden. This paper extracts patient discharge records from the MIMIC-III database and divides the patients into three categories (no readmission, readmission within 30 days, and readmission after 30 days) to predict readmission. We propose the HR-BGCN model for this task. First, we use Adaptive-TMix to improve the prediction metrics for minority categories and reduce the impact of class imbalance. Then, a knowledge-informed graph attention mechanism is proposed. By introducing a document-level explicit graph structure, the encoding ability of graph node features is significantly improved. The paragraph-level representation obtained through graph learning is combined with the contextual token-level representation from BERT, and finally the multi-class classification task is carried out. To verify the model's effectiveness, we also compare it with several typical graph learning classification models, such as the IA-GCN and GAT models. The results show that the average F1 score of the proposed HR-BGCN model for 30-day readmission of heart failure patients is 88.26%, and the average accuracy is 90.47%. The HR-BGCN model performs significantly better than the graph learning classification models at predicting heart failure readmission. It can help doctors predict 30-day readmission and thereby reduce patients' readmission rate.
Affiliation(s)
- Huiting Ma
- College of Computer Science and Technology (College of Data Science), Taiyuan University of Technology, Taiyuan, 030024, China; Key Laboratory of Big Data Fusion Analysis and Application of Shanxi Province, Taiyuan, 030024, China; Intelligent Perception Engineering Technology Center of Shanxi, Taiyuan, 030024, China
- Dengao Li
- College of Computer Science and Technology (College of Data Science), Taiyuan University of Technology, Taiyuan, 030024, China; Key Laboratory of Big Data Fusion Analysis and Application of Shanxi Province, Taiyuan, 030024, China; Intelligent Perception Engineering Technology Center of Shanxi, Taiyuan, 030024, China
- Jumin Zhao
- College of Electronic Information and Optical Engineering, Taiyuan University of Technology, Taiyuan, 030024, China; Key Laboratory of Big Data Fusion Analysis and Application of Shanxi Province, Taiyuan, 030024, China; Intelligent Perception Engineering Technology Center of Shanxi, Taiyuan, 030024, China
- Wenjing Li
- University of California, Santa Barbara majoring in actuarial science, CA, 93106, United States of America
- Jian Fu
- College of Computer Science and Technology (College of Data Science), Taiyuan University of Technology, Taiyuan, 030024, China; Key Laboratory of Big Data Fusion Analysis and Application of Shanxi Province, Taiyuan, 030024, China; Intelligent Perception Engineering Technology Center of Shanxi, Taiyuan, 030024, China
- Chunxia Li
- Department of Cardiology, Shanxi Bethune Hospital, Shanxi Academy of Medical Sciences, Shanxi Medical University; Tongji Shanxi Hospital, Tongji Medical College, Huazhong University of Science and Technology, Taiyuan, 030032, China
41
Yang R, Fu Y, Zhang Q, Zhang L. GCNGAT: Drug-disease association prediction based on graph convolution neural network and graph attention network. Artif Intell Med 2024; 150:102805. [PMID: 38553169 DOI: 10.1016/j.artmed.2024.102805] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/31/2023] [Revised: 01/22/2024] [Accepted: 02/08/2024] [Indexed: 04/02/2024]
Abstract
Predicting drug-disease associations can help discover new therapeutic uses for drugs and provide important association information for new drug research and development. Many existing drug-disease association prediction methods do not distinguish the relevant background information for the same drug when it targets different diseases. Therefore, this paper proposes a drug-disease association prediction model based on a graph convolutional network and a graph attention network (GCNGAT) that repositions marketed drugs while distinguishing this background information. First, to obtain initial drug-disease information, a drug-disease heterogeneous graph is constructed from all known drug-disease associations. Second, based on the heterogeneous graph, the subgraph corresponding to each drug-disease association pair is extracted to distinguish the different background information of the same drug for different diseases. Finally, a model combining a Graph neural network with global Average pooling (GnnAp) is designed to predict potential drug-disease associations by learning drug-disease interaction feature representations. The experimental results show that adding subgraph extraction effectively improves the prediction performance of the model, and that the graph representation learning module can fully extract deep drug-disease features. Under 5-fold cross-validation, the proposed model (GCNGAT) achieves AUC (Area Under the receiver operating characteristic Curve) values of 0.9182 and 0.9417 on the PREDICT dataset and CDataset, respectively. On the PREDICT dataset, GCNGAT outperforms the best-performing existing model (PSGCN), with a 1.58% increase in AUC. It is anticipated that this model can provide an experimental reference for drug repositioning and further promote the drug research and development process.
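The per-pair subgraph extraction step can be illustrated with a minimal sketch. It assumes the subgraph of a drug-disease pair is the graph induced by all nodes within a fixed number of hops of either endpoint; the abstract does not give the exact extraction rule, and the adjacency-dict format and the names `k_hop_nodes` and `pair_subgraph` are hypothetical.

```python
from collections import deque


def k_hop_nodes(adj, start, hops):
    """Breadth-first search: all nodes within `hops` edges of `start`."""
    seen = {start}
    frontier = deque([(start, 0)])
    while frontier:
        node, dist = frontier.popleft()
        if dist == hops:
            continue  # do not expand beyond the hop limit
        for neighbour in adj.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                frontier.append((neighbour, dist + 1))
    return seen


def pair_subgraph(adj, drug, disease, hops=1):
    """Induced subgraph around one (drug, disease) pair.

    Returns the node set and the undirected edges among those nodes.
    """
    nodes = k_hop_nodes(adj, drug, hops) | k_hop_nodes(adj, disease, hops)
    edges = {
        frozenset((u, v))
        for u in nodes
        for v in adj.get(u, ())
        if v in nodes
    }
    return nodes, edges
```

Because each pair gets its own neighbourhood, the same drug paired with two different diseases yields two different subgraphs, which is one way to realise the pair-specific background information the abstract describes.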
Affiliation(s)
- Runtao Yang
- School of Mechanical, Electrical and Information Engineering, Shandong University at Weihai, 264209, China
- Yao Fu
- School of Mechanical, Electrical and Information Engineering, Shandong University at Weihai, 264209, China
- Qian Zhang
- Heze Institute of Science and Technology Information, Heze, 274000, China
- Lina Zhang
- School of Mechanical, Electrical and Information Engineering, Shandong University at Weihai, 264209, China
42
Zhu F, Zhang X, Zhang B, Xu Y, Cui L. Medicine Package Recommendation via Dual-Level Interaction Aware Heterogeneous Graph. IEEE J Biomed Health Inform 2024; 28:2294-2303. [PMID: 38598367 DOI: 10.1109/jbhi.2024.3361552] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/12/2024]
Abstract
Medicine package recommendation aims to assist doctors in clinical decision-making by recommending appropriate packages of medicines for patients. Current methods model this task as a multi-label classification or sequence generation problem, focusing on learning relationships between individual medicines and other medical entities. However, these approaches uniformly overlook the interactions between medicine packages and other medical entities, potentially resulting in a lack of completeness in recommended medicine packages. Furthermore, the medicine commonsense knowledge considered by current methods is notably limited, making it challenging to delve into the decision-making processes of doctors. To solve these problems, we propose DIAGNN, a Dual-level Interaction Aware heterogeneous Graph Neural Network for medicine package recommendation. Specifically, DIAGNN uses a heterogeneous graph to explicitly model interactions of medical entities within electronic health records (EHRs) at two levels: individual medicine and medicine package. A dual-level interaction aware graph convolutional network is utilized to capture semantic information in the medical heterogeneous graph. Additionally, we incorporate medication indications into the medical heterogeneous graph as medicine commonsense knowledge. Extensive experimental results on real-world datasets validate the effectiveness of the proposed method.
43
Lin H, Qiang Z, Tse R, Tang SK, Pau G. A few-shot learning method for tobacco abnormality identification. FRONTIERS IN PLANT SCIENCE 2024; 15:1333236. [PMID: 38681219 PMCID: PMC11055634 DOI: 10.3389/fpls.2024.1333236] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/04/2023] [Accepted: 02/26/2024] [Indexed: 05/01/2024]
Abstract
Tobacco is a valuable crop, but its disease identification is rarely addressed in existing work. In this work, we use few-shot learning (FSL) to identify abnormalities in tobacco. FSL is a solution to the data deficiency that has been an obstacle to using deep learning. However, weak feature representation caused by limited data remains a challenge in FSL: it leads to weak generalization and difficulties in cross-domain settings. We propose a feature representation enhancement network (FREN) that enhances the feature representation through instance embedding and task adaptation. For instance embedding, global max pooling and global average pooling are used together to add more features, and Gaussian-like calibration is used to normalize the feature distribution. For task adaptation, self-attention is adopted for task contextualization. Given the absence of publicly available data on tobacco, we created a tobacco leaf abnormality dataset (TLA), which includes 16 categories, two settings, and 1,430 images in total. In experiments, we first use PlantVillage, the benchmark dataset for plant disease identification, to validate the superiority of FREN. Subsequently, we use the proposed method and TLA to analyze and discuss the abnormality identification of tobacco. For multi-symptom diseases, which always have low accuracy, we propose dividing the samples into subcategories by symptom. For the 10 tomato categories in PlantVillage, the accuracy reaches 66.04% in 5-way, 1-shot tasks. For the two settings of the tobacco leaf abnormality dataset, the accuracies are 45.5% and 56.5%. With the multi-symptom solution, the best accuracy rises to 60.7% in 16-way, 1-shot tasks and reaches 81.8% in 16-way, 10-shot tasks. The results show that our method improves performance greatly by enhancing feature representation, especially for tasks that contain categories with high similarity. The desensitization of data when crossing domains also validates that FREN has a strong generalization ability.
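The instance-embedding idea of using global max pooling and global average pooling together can be sketched as below. The sketch assumes the two pooled vectors are concatenated, which the abstract does not state explicitly; the Gaussian-like calibration step is omitted, and `global_pool_embedding` is an illustrative name.

```python
def global_pool_embedding(feature_map):
    """Concatenate per-channel global max and global average pooling.

    `feature_map` is a list of channels, each a 2D list (height x width).
    The result has twice as many entries as there are channels.
    """
    max_part = [max(max(row) for row in channel) for channel in feature_map]
    avg_part = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]
    return max_part + avg_part
```

Max pooling keeps the strongest local response per channel, while average pooling summarises the whole map, so combining them gives the embedding both kinds of evidence from the same backbone output.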
Affiliation(s)
- Hong Lin
- College of Big Data and Intelligent Engineering, Southwest Forestry University, Kunming, China
- Faculty of Applied Sciences, Macao Polytechnic University, Macao, Macao SAR, China
- Zhenping Qiang
- College of Big Data and Intelligent Engineering, Southwest Forestry University, Kunming, China
- Rita Tse
- Faculty of Applied Sciences, Macao Polytechnic University, Macao, Macao SAR, China
- Su-Kit Tang
- Faculty of Applied Sciences, Macao Polytechnic University, Macao, Macao SAR, China
- Giovanni Pau
- Department of Computer Science and Engineering, University of Bologna, Bologna, Italy
- Samueli Computer Science Department, University of California, Los Angeles, Los Angeles, CA, United States
44
Luo H, Yin W, Wang J, Zhang G, Liang W, Luo J, Yan C. Drug-drug interactions prediction based on deep learning and knowledge graph: A review. iScience 2024; 27:109148. [PMID: 38405609 PMCID: PMC10884936 DOI: 10.1016/j.isci.2024.109148] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/27/2024] Open
Abstract
Drug-drug interactions (DDIs) can produce unpredictable pharmacological effects and lead to adverse events that may cause irreversible damage to the organism. Traditional methods for detecting DDIs through biological or pharmacological analysis are time-consuming and expensive, so there is an urgent need for computational methods that predict DDIs effectively. Deep learning and knowledge graph techniques, which can effectively extract features of entities, have been widely used to develop DDI prediction methods. In this review, we systematically survey DDI prediction research that applies deep learning and knowledge graphs. We first summarize the available biomedical data and public databases related to drugs. Then, we discuss the existing DDI prediction methods that use deep learning and knowledge graph techniques and group them into three main classes: deep learning-based methods, knowledge graph-based methods, and methods that combine deep learning with knowledge graphs. We comprehensively analyze the commonly used drug-related data and various DDI prediction methods, and compare these methods on benchmark datasets. Finally, we briefly discuss the challenges in DDI prediction, including asymmetric DDI prediction and high-order DDI prediction.
Affiliation(s)
- Huimin Luo
- School of Computer and Information Engineering, Henan University, Kaifeng, China
- Henan Key Laboratory of Big Data Analysis and Processing, Henan University, Kaifeng, China
- Weijie Yin
- School of Computer and Information Engineering, Henan University, Kaifeng, China
- Jianlin Wang
- School of Computer and Information Engineering, Henan University, Kaifeng, China
- Academy for Advanced Interdisciplinary Studies, Zhengzhou, China
- Ge Zhang
- School of Computer and Information Engineering, Henan University, Kaifeng, China
- Henan Key Laboratory of Big Data Analysis and Processing, Henan University, Kaifeng, China
- Wenjuan Liang
- School of Computer and Information Engineering, Henan University, Kaifeng, China
- Junwei Luo
- College of Computer Science and Technology, Henan Polytechnic University, Jiaozuo, China
- Chaokun Yan
- School of Computer and Information Engineering, Henan University, Kaifeng, China
- Academy for Advanced Interdisciplinary Studies, Zhengzhou, China
45
Kazi A, Mora J, Fischl B, Dalca AV, Aganj I. Multi-Head Graph Convolutional Network for Structural Connectome Classification. GRAPHS IN BIOMEDICAL IMAGE ANALYSIS, AND OVERLAPPED CELL ON TISSUE DATASET FOR HISTOPATHOLOGY : 5TH MICCAI WORKSHOP, GRAIL 2023 AND 1ST MICCAI CHALLENGE, OCELOT 2023, HELD IN CONJUNCTION WITH MICCAI 2023, VANCOUVER, BC, CANADA, SEPTEMBE... 2024; 14373:27-36. [PMID: 38665679 PMCID: PMC11044650 DOI: 10.1007/978-3-031-55088-1_3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/28/2024]
Abstract
We tackle classification based on brain connectivity derived from diffusion magnetic resonance images. We propose a machine-learning model inspired by graph convolutional networks (GCNs), which takes a brain-connectivity input graph and processes the data separately through a parallel GCN mechanism with multiple heads. The proposed network is a simple design that employs different heads involving graph convolutions focused on edges and nodes, thoroughly capturing representations from the input data. To test the ability of our model to extract complementary and representative features from brain connectivity data, we chose the task of sex classification. This quantifies the degree to which the connectome varies depending on the sex, which is important for improving our understanding of health and disease in both sexes. We show experiments on two publicly available datasets: PREVENT-AD (347 subjects) and OASIS3 (771 subjects). The proposed model demonstrates the highest performance compared to the existing machine-learning algorithms we tested, including classical methods and (graph and non-graph) deep learning. We provide a detailed analysis of each component of our model.
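For readers unfamiliar with the graph convolution that such models build on, the sketch below shows one node-focused propagation step in the widely used symmetric-normalised form, H' = ReLU(D^(-1/2) (A + I) D^(-1/2) H W). It is a generic single layer written with plain lists, not the multi-head architecture of this paper, whose head design the abstract does not detail; all function names are illustrative, and non-negative edge weights are assumed.

```python
import math


def matmul(a, b):
    """Plain-list matrix product of a (n x m) and b (m x p)."""
    return [
        [sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
        for row in a
    ]


def normalize_adjacency(adj):
    """Add self-loops, then scale entry (i, j) by 1 / sqrt(deg_i * deg_j)."""
    n = len(adj)
    a_hat = [
        [adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
        for i in range(n)
    ]
    deg = [sum(row) for row in a_hat]
    return [
        [a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
        for i in range(n)
    ]


def gcn_layer(adj, features, weight):
    """One graph convolution: aggregate neighbours, project, apply ReLU."""
    propagated = matmul(normalize_adjacency(adj), features)
    projected = matmul(propagated, weight)
    return [[max(0.0, v) for v in row] for row in projected]
```

In a connectome setting, `adj` would hold the connectivity strengths between brain regions and `features` one row of descriptors per region. A parallel multi-head design could run several such layers side by side and merge their outputs.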
Affiliation(s)
- Anees Kazi
- Athinoula A. Martinos Center for Biomedical Imaging, Radiology Department, Massachusetts General Hospital, Boston, USA
- Radiology Department, Harvard Medical School, Boston, USA
- Jocelyn Mora
- Athinoula A. Martinos Center for Biomedical Imaging, Radiology Department, Massachusetts General Hospital, Boston, USA
- Bruce Fischl
- Athinoula A. Martinos Center for Biomedical Imaging, Radiology Department, Massachusetts General Hospital, Boston, USA
- Radiology Department, Harvard Medical School, Boston, USA
- Adrian V Dalca
- Athinoula A. Martinos Center for Biomedical Imaging, Radiology Department, Massachusetts General Hospital, Boston, USA
- Radiology Department, Harvard Medical School, Boston, USA
- CSAIL, Massachusetts Institute of Technology, Cambridge, USA
- Iman Aganj
- Athinoula A. Martinos Center for Biomedical Imaging, Radiology Department, Massachusetts General Hospital, Boston, USA
- Radiology Department, Harvard Medical School, Boston, USA
46
Li C. Joint analysis of interaction and psychological characteristics in english teaching based on multimodal integration. BMC Psychol 2024; 12:121. [PMID: 38439095 PMCID: PMC10913431 DOI: 10.1186/s40359-024-01585-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/29/2023] [Accepted: 02/12/2024] [Indexed: 03/06/2024] Open
Abstract
The intersection of psychology and English teaching is profound, as the application of psychological principles not only guides specific English instruction but also elevates the overall quality of teaching. This paper takes a multimodal approach, incorporating image, acoustics, and text information, to construct a joint analysis model for English teaching interaction and psychological characteristics. The novel addition of an attention mechanism in the multimodal fusion process enables the development of an English teaching psychological characteristics recognition model. The initial step involves balancing the proportions of each emotion, followed by achieving multimodal alignment. In the cross-modal stage, the interaction of image, acoustics, and text is facilitated through a cross-modal attention mechanism. The utilization of a multi-attention mechanism not only enhances the network's representation capabilities but also streamlines the complexity of the model. Empirical results demonstrate the model's proficiency in accurately identifying five psychological characteristics. The proposed method achieves a classification accuracy of 90.40% for psychological features, with a commendable accuracy of 78.47% in multimodal classification. Furthermore, the incorporation of the attention mechanism in feature fusion contributes to an improved fusion effect.
Affiliation(s)
- Chao Li
- School of Culture and Education, Shaanxi University of Science & Technology, 710021, Xi'an, Shaanxi, China
47
Cao C, Wang H, Yang JR, Chen Q, Guo YM, Chen JZ. MCPNET: Development of an interpretable deep learning model based on multiple conformations of the compound for predicting developmental toxicity. Comput Biol Med 2024; 171:108037. [PMID: 38377716 DOI: 10.1016/j.compbiomed.2024.108037] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/09/2023] [Revised: 12/21/2023] [Accepted: 01/26/2024] [Indexed: 02/22/2024]
Abstract
The development of deep learning models for predicting toxicological endpoints has shown great promise, but the accuracy and interpretability of these models remain a challenge. The bioactive conformation of a compound plays a critical role in its binding to the target. Identifying the bioactive conformation in deep learning is difficult without a co-crystal structure or highly precise molecular simulations. In this study, we developed a deep learning framework, the Multi-Conformation Point Network (MCPNET), to construct classification and regression models based on electrostatic potential distributions on vdW surfaces around multiple conformations of the compound, using a dataset of compounds with developmental toxicity in zebrafish embryos. MCPNET uses 3D multi-conformational surface point clouds to extract molecular features for model training, which may be critical for capturing the structural diversity of compounds. The models achieved an accuracy of 85% on the classification task and an R² of 0.66 on the regression task, outperforming traditional machine learning models and other deep learning models. A key feature of our model is its interpretability: component visualization identifies the factors contributing to a prediction and helps explain the compound's mechanism of action. Through its attention-based multi-conformation pooling mechanism, MCPNET may predict a conformation quite close to the bioactive conformation of a compound. Our results demonstrate the potential of deep learning based on 3D molecular representations for accurately predicting developmental toxicity. The source code is publicly available at https://github.com/Superlit-CC/MCPNET.
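As a rough illustration of what attention-based pooling over several conformations can look like, the sketch below weights per-conformer feature vectors by a softmax over dot-product scores and sums them into one molecule-level vector. It is not taken from the MCPNET code: the function names, the fixed `query` vector (learned in a real model), and the toy two-dimensional features are all assumptions made for this example.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(conformer_features, query):
    """Pool per-conformer feature vectors into a single vector.

    conformer_features: list of equal-length feature vectors, one per conformer.
    query: hypothetical scoring vector (a trained parameter in practice).
    Returns (pooled_vector, attention_weights).
    """
    # Score each conformer by its dot product with the query vector.
    scores = [sum(f * q for f, q in zip(feat, query)) for feat in conformer_features]
    weights = softmax(scores)
    dim = len(conformer_features[0])
    # Attention-weighted sum of the conformer features.
    pooled = [
        sum(w * feat[i] for w, feat in zip(weights, conformer_features))
        for i in range(dim)
    ]
    return pooled, weights


# Toy data: three conformers with 2-dimensional features (illustrative only).
conformers = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
query = [2.0, 0.0]
pooled, weights = attention_pool(conformers, query)
# Index of the conformer that receives the largest attention weight.
top = max(range(len(weights)), key=weights.__getitem__)
```

In a setup like this, the conformer with the largest weight is the one the model relies on most, which is one way an attention mechanism could point toward a candidate bioactive conformation.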
Affiliation(s)
- Cheng Cao
  - College of Pharmaceutical Sciences, Zhejiang University, 866 Yuhangtang Rd., Hangzhou, Zhejiang, 310058, China; Polytechnic Institute, Zhejiang University, 269 Shixiang Rd, Hangzhou, Zhejiang, 310015, China
- Hao Wang
  - College of Pharmaceutical Sciences, Zhejiang University, 866 Yuhangtang Rd., Hangzhou, Zhejiang, 310058, China
- Jin-Rong Yang
  - College of Pharmaceutical Sciences, Zhejiang University, 866 Yuhangtang Rd., Hangzhou, Zhejiang, 310058, China; Polytechnic Institute, Zhejiang University, 269 Shixiang Rd, Hangzhou, Zhejiang, 310015, China
- Qiang Chen
  - College of Pharmaceutical Sciences, Zhejiang University, 866 Yuhangtang Rd., Hangzhou, Zhejiang, 310058, China
- Ya-Min Guo
  - College of Pharmaceutical Sciences, Zhejiang University, 866 Yuhangtang Rd., Hangzhou, Zhejiang, 310058, China
- Jian-Zhong Chen
  - College of Pharmaceutical Sciences, Zhejiang University, 866 Yuhangtang Rd., Hangzhou, Zhejiang, 310058, China
48
|
Chen J, Zhu L, Wang J. Quantitative structure-property relationship modelling on autoignition temperature: evaluation and comparative analysis. SAR and QSAR in Environmental Research 2024; 35:199-218. [PMID: 38372083 DOI: 10.1080/1062936x.2024.2312527] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/06/2023] [Accepted: 01/25/2024] [Indexed: 02/20/2024]
Abstract
The autoignition temperature (AIT) serves as a crucial indicator for assessing the potential hazards associated with a chemical substance. In order to gain deeper insights into model performance and facilitate the establishment of effective methodological practices for AIT predictions, this study conducts a benchmark investigation on Quantitative Structure-Property Relationship (QSPR) modelling for AIT. As novelties of this work, three significant advancements are implemented in the AIT modelling process, including explicit consideration of data quality, utilization of state-of-the-art feature engineering workflows, and the innovative application of graph-based deep learning techniques, which are employed for the first time in AIT prediction. Specifically, three traditional QSPR models (multi-linear regression, support vector regression, and artificial neural networks) are evaluated, alongside the assessment of a deep-learning model employing message passing neural network architecture supplemented by graph-data augmentation techniques.
Affiliation(s)
- J Chen
  - College of Chemical Engineering, Zhejiang University of Technology, Hangzhou, China
- L Zhu
  - College of Chemical Engineering, Zhejiang University of Technology, Hangzhou, China
- J Wang
  - College of Chemical Engineering, Zhejiang University of Technology, Hangzhou, China
49
|
Yin S, Mi X, Shukla D. Leveraging Machine Learning Models for Peptide-Protein Interaction Prediction. arXiv 2024:arXiv:2310.18249v2. [PMID: 37961736 PMCID: PMC10635286] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2023]
Abstract
Peptides play a pivotal role in a wide range of biological activities by participating in up to 40% of protein-protein interactions in cellular processes. They also demonstrate remarkable specificity and efficacy, making them promising candidates for drug development. However, predicting peptide-protein complexes with traditional computational approaches, such as docking and molecular dynamics simulations, remains a challenge due to the high computational cost, the flexible nature of peptides, and the limited structural information available for peptide-protein complexes. In recent years, the surge of available biological data has given rise to an increasing number of machine learning models for predicting peptide-protein interactions. These models offer efficient solutions to the challenges associated with traditional computational approaches. Furthermore, they offer enhanced accuracy, robustness, and interpretability in their predictive outcomes. This review presents a comprehensive overview of machine learning and deep learning models that have emerged in recent years for the prediction of peptide-protein interactions.
Affiliation(s)
- Song Yin
  - Department of Chemical and Biomolecular Engineering, University of Illinois Urbana-Champaign, Urbana, IL 61801, United States
  - These authors contributed to the work equally
- Xuenan Mi
  - Center for Biophysics and Quantitative Biology, University of Illinois Urbana-Champaign, Urbana, IL 61801, United States
  - These authors contributed to the work equally
- Diwakar Shukla
  - Department of Chemical and Biomolecular Engineering, University of Illinois Urbana-Champaign, Urbana, IL 61801, United States
  - Center for Biophysics and Quantitative Biology, University of Illinois Urbana-Champaign, Urbana, IL 61801, United States
  - Department of Bioengineering, University of Illinois Urbana-Champaign, Urbana, IL 61801, United States
50
|
Ardabili SZ, Bahmani S, Lahijan LZ, Khaleghi N, Sheykhivand S, Danishvar S. A Novel Approach for Automatic Detection of Driver Fatigue Using EEG Signals Based on Graph Convolutional Networks. Sensors (Basel, Switzerland) 2024; 24:364. [PMID: 38257457 PMCID: PMC10819416 DOI: 10.3390/s24020364] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/08/2023] [Revised: 12/27/2023] [Accepted: 01/04/2024] [Indexed: 01/24/2024]
Abstract
Nowadays, the automatic detection of driver fatigue has become one of the important measures to prevent traffic accidents. For this purpose, a lot of research has been conducted in this field in recent years. However, the diagnosis of fatigue in recent research is binary and has no operational capability. This research presents a multi-class driver fatigue detection system based on electroencephalography (EEG) signals using deep learning networks. In the proposed system, a standard driving simulator has been designed, and a database has been collected based on the recording of EEG signals from 20 participants in five different classes of fatigue. In addition to self-report questionnaires, changes in physiological patterns are used to confirm the various stages of weariness in the suggested model. To pre-process and process the signal, a combination of generative adversarial networks (GAN) and graph convolutional networks (GCN) has been used. The proposed deep model includes five convolutional graph layers, one dense layer, and one fully connected layer. The accuracy obtained for the proposed model is 99%, 97%, 96%, and 91%, respectively, for the four different considered practical cases. The proposed model is compared to one developed through recent methods and research and has a promising performance.
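For readers unfamiliar with graph convolutional layers, the sketch below shows one commonly used propagation rule, H' = ReLU(D^(-1/2) (A + I) D^(-1/2) H W), applied to a tiny graph whose nodes stand in for EEG channels. The abstract does not say which GCN variant the authors used, so the rule itself, the three-node graph, the feature values, and the weight matrix are all assumptions for illustration.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def gcn_layer(adj, feats, weight):
    """One graph convolution: ReLU(D^-1/2 (A + I) D^-1/2 · H · W)."""
    n = len(adj)
    # Add self-loops so each node keeps part of its own signal.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    # Symmetric degree normalisation of the adjacency matrix.
    norm = [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]
    # Aggregate neighbour features, then apply the linear map and ReLU.
    out = matmul(matmul(norm, feats), weight)
    return [[max(0.0, v) for v in row] for row in out]


# Toy example: three channels connected in a line (0 - 1 - 2).
adj = [[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
feats = [[1.0], [2.0], [3.0]]   # one hypothetical feature per channel
weight = [[1.0, -1.0]]          # hypothetical 1 x 2 weight matrix
out = gcn_layer(adj, feats, weight)
```

Stacking several such layers lets information from channels that are further apart in the graph reach each node, which is the general idea behind multi-layer graph convolutional models.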
Affiliation(s)
- Sevda Zafarmandi Ardabili
  - Electrical and Computer Engineering Department, Southern Methodist University, Dallas, TX 75205, USA
- Soufia Bahmani
  - Department of Computer Engineering and Information Technology, Amirkabir University of Technology, Tehran 15875-4413, Iran
- Lida Zare Lahijan
  - Biomedical Engineering Department, Faculty of Electrical and Computer Engineering, University of Tabriz, Tabriz 51666-16471, Iran
- Nastaran Khaleghi
  - Biomedical Engineering Department, Faculty of Electrical and Computer Engineering, University of Tabriz, Tabriz 51666-16471, Iran
- Sobhan Sheykhivand
  - Department of Biomedical Engineering, University of Bonab, Bonab 55517-61167, Iran
- Sebelan Danishvar
  - College of Engineering, Design and Physical Sciences, Brunel University London, Uxbridge UB8 3PH, UK