1. Li B, Li X, Tang X, Wang J. Prediction and Evaluation of Coronavirus and Human Protein-Protein Interactions Integrating Five Different Computational Methods. Proteins 2025. [PMID: 40231383 DOI: 10.1002/prot.26826] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/21/2024] [Revised: 03/08/2025] [Accepted: 03/26/2025] [Indexed: 04/16/2025]
Abstract
The high lethality and infectiousness of coronaviruses, particularly SARS-CoV-2, pose a significant threat to human society. Understanding coronaviruses, especially the interactions between these viruses and humans, is crucial for mitigating the coronavirus pandemic. In this study, we conducted a comprehensive comparison and evaluation of five prevalent computational methods: interolog mapping, domain-domain interaction methodology, domain-motif interaction methodology, structure-based approaches, and machine learning techniques. These methods were assessed using unbiased datasets that include C1, C2h, C2v, and C3 test sets. Ultimately, we integrated these five methodologies into a unified model for predicting protein-protein interactions (PPIs) between coronaviruses and human proteins. Our final model demonstrates relatively better performance, particularly with the C2v and C3 test sets, which are frequently used datasets in practical applications. Based on this model, we further established a high-confidence PPI network between coronaviruses and humans, consisting of 18,012 interactions between 3,843 human proteins and 129 coronavirus proteins. The reliability of our predictions was further validated through the current knowledge framework and network analysis. This study is anticipated to enhance mechanistic understanding of the coronavirus-human relationship while facilitating the rediscovery of antiviral drug targets. The source codes and datasets are accessible at https://github.com/covhppilab/CoVHPPI.
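Test sets labelled C1, C2, and C3 are commonly defined by how much a test pair overlaps with the training data: both proteins seen in training, only one seen, or neither seen. The Python sketch below illustrates that partitioning under this assumption. The abstract does not define C2h and C2v, so the labels `C2_human_seen` and `C2_virus_seen`, the function name `pair_class`, and the protein identifiers are hypothetical and are not taken from the paper or its repository.

```python
def pair_class(pair, train_pairs):
    """Classify a (virus, human) test pair by its overlap with training proteins.

    Assumed convention: C1 = both proteins seen in training,
    C2 = exactly one seen, C3 = neither seen.
    """
    train_virus = {virus for virus, _ in train_pairs}
    train_human = {human for _, human in train_pairs}
    virus, human = pair
    virus_seen = virus in train_virus
    human_seen = human in train_human
    if virus_seen and human_seen:
        return "C1"
    if human_seen:
        return "C2_human_seen"   # hypothetical label, only the human protein is known
    if virus_seen:
        return "C2_virus_seen"   # hypothetical label, only the virus protein is known
    return "C3"


# Illustrative identifiers only; these are not real data.
train = [("v1", "h1"), ("v2", "h2")]
for test_pair in [("v1", "h2"), ("v3", "h1"), ("v1", "h3"), ("v3", "h3")]:
    print(test_pair, pair_class(test_pair, train))
```

Separating the classes this way makes it visible when a reported score is driven mainly by pairs whose proteins the model has already seen.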
Affiliation(s)
- Binghua Li
  - Key Laboratory of Smart Farming for Agricultural Animals, Huazhong Agricultural University, Wuhan, China
  - Hubei Key Laboratory of Agricultural Bioinformatics, Huazhong Agricultural University, Wuhan, China
  - College of Informatics, Huazhong Agricultural University, Wuhan, China
- Xiaoyu Li
  - Key Laboratory of Smart Farming for Agricultural Animals, Huazhong Agricultural University, Wuhan, China
  - Hubei Key Laboratory of Agricultural Bioinformatics, Huazhong Agricultural University, Wuhan, China
  - College of Informatics, Huazhong Agricultural University, Wuhan, China
- Xian Tang
  - Key Laboratory of Smart Farming for Agricultural Animals, Huazhong Agricultural University, Wuhan, China
  - Hubei Key Laboratory of Agricultural Bioinformatics, Huazhong Agricultural University, Wuhan, China
  - College of Informatics, Huazhong Agricultural University, Wuhan, China
- Jia Wang
  - Key Laboratory of Smart Farming for Agricultural Animals, Huazhong Agricultural University, Wuhan, China
  - Hubei Key Laboratory of Agricultural Bioinformatics, Huazhong Agricultural University, Wuhan, China
  - College of Informatics, Huazhong Agricultural University, Wuhan, China
2. Zhang J, Zhou F, Liang X, Kurgan L. Accurate Prediction of Protein-Binding Residues in Protein Sequences Using SCRIBER. Methods Mol Biol 2025; 2867:247-260. [PMID: 39576586 DOI: 10.1007/978-1-0716-4196-5_15] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/24/2024]
Abstract
Deciphering molecular-level mechanisms that govern protein-protein interactions (PPIs) relies in part on the accurate prediction of protein-binding partners and protein-binding residues. These predictions can be used to support a wide spectrum of applications that include development of PPI networks and protein docking programs, drug design studies, and investigations of molecular details that underlie certain diseases. Computational methods that predict protein-binding residues offer convenient, inexpensive, and relatively accurate data that can aid these efforts. We introduce and describe a user-friendly webserver for the SCRIBER method that conveniently provides state-of-the-art predictions of protein-binding residues and that minimizes cross-predictions, i.e., incorrect prediction of residues that bind other/non-protein ligands as protein binding. SCRIBER relies on a two-layer architecture that is specifically designed to reduce the cross-predictions. We motivate and explain this predictive architecture. We describe how to use the webserver, interact with its web interface, and collect, read, and understand results generated by SCRIBER. The SCRIBER webserver is available at http://biomine.cs.vcu.edu/servers/SCRIBER/ .
Affiliation(s)
- Jian Zhang
  - School of Computer and Information Technology, Xinyang Normal University, Xinyang, China
- Feng Zhou
  - School of Computer and Information Technology, Xinyang Normal University, Xinyang, China
- Xingchen Liang
  - School of Computer and Information Technology, Xinyang Normal University, Xinyang, China
- Lukasz Kurgan
  - Department of Computer Science, Virginia Commonwealth University, Richmond, VA, USA
3. Zhao B, Basu S, Kurgan L. DescribePROT Database of Residue-Level Protein Structure and Function Annotations. Methods Mol Biol 2025; 2867:169-184. [PMID: 39576581 DOI: 10.1007/978-1-0716-4196-5_10] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/24/2024]
Abstract
DescribePROT is a freely available online database of structural and functional descriptors of proteins at the amino acid level. It provides access to 13 diverse descriptors that include sequence conservation, putative secondary structure, solvent accessibility, intrinsic disorder, and signal peptides, and putative annotations of residues that interact with proteins, peptides and nucleic acids. These data can be used to elucidate protein functions, to support efforts to develop therapeutics, and to develop and evaluate future predictors of protein structure and function. DescribePROT includes 7.8 billion predictions for 1.4 million proteins from 83 complete proteomes of popular model organisms. This information can be downloaded at multiple levels of scope (entire database, specific organisms, and individual proteins) and can be interacted with using a graphical interface that simultaneously displays data on multiple descriptors. We describe the contents of this resource, provide directions on how to use its interface, and offer instructions on how to obtain and interact with the underlying data. Moreover, we briefly discuss plans for a future expansion of this database. DescribePROT is available at http://biomine.cs.vcu.edu/servers/DESCRIBEPROT/ .
Affiliation(s)
- Bi Zhao
  - Genomics program, College of Public Health, University of South Florida, Tampa, FL, USA
- Sushmita Basu
  - Department of Computer Science, Virginia Commonwealth University, Richmond, VA, USA
- Lukasz Kurgan
  - Department of Computer Science, Virginia Commonwealth University, Richmond, VA, USA
4. Khan AA, Wakchoure P, Farooq F, Shiddiky MJA, Jain SK. Host-Pathogen Interaction Databases: Tools for Rapid Understanding of Microbial Pathogenesis. WIREs Mech Dis 2025; 17:e1654. [PMID: 39600198 DOI: 10.1002/wsbm.1654] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/25/2023] [Revised: 09/22/2024] [Accepted: 10/19/2024] [Indexed: 11/29/2024]
Abstract
Our understanding of microbial pathogenesis has been revolutionized since conventional culture-based techniques were replaced by molecular methods. This technological shift is generating vast amounts of host-pathogen interaction (HPI) data. Moreover, computational predictions of biological interactions are also adding to our understanding of HPIs. Recently, several dedicated databases have been developed exclusively for cataloging HPIs. This article covers some of the available HPI databases, their types, and the evolution of this area, along with recent trends in the application of these databases to biological research. According to the current understanding of microbial pathogenesis, HPIs are considered highly dynamic in nature with multiple outcomes, going beyond simple microbe-disease associations. Therefore, careful cataloging of complete information about HPIs can open several avenues for understanding microbial pathogenesis, considering their multifaceted effects on the host system. HPI databases are indispensable tools for understanding microbial pathogenesis, and this article provides comprehensive information about their uses in microbial pathogenesis research.
Affiliation(s)
- Abdul Arif Khan
  - Division of Microbiology, ICMR-National Institute of Translational Virology and AIDS Research, Pune, Maharashtra, India
- Pooja Wakchoure
  - Division of Microbiology, ICMR-National Institute of Translational Virology and AIDS Research, Pune, Maharashtra, India
- Fozia Farooq
  - School of Studies in Microbiology, Vikram University, Ujjain, Madhya Pradesh, India
- Muhammad J A Shiddiky
  - Rural Health Research Institute (RHRI), Charles Sturt University, Orange, New South Wales, Australia
- Sudhir Kumar Jain
  - School of Studies in Microbiology, Vikram University, Ujjain, Madhya Pradesh, India
5. Tahir ul Qamar M, Noor F, Guo YX, Zhu XT, Chen LL. Deep-HPI-pred: An R-Shiny applet for network-based classification and prediction of Host-Pathogen protein-protein interactions. Comput Struct Biotechnol J 2024; 23:316-329. [PMID: 38192372 PMCID: PMC10772389 DOI: 10.1016/j.csbj.2023.12.010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/22/2023] [Revised: 12/11/2023] [Accepted: 12/12/2023] [Indexed: 01/10/2024]
Abstract
Host-pathogen interactions (HPIs) are vital in numerous biological activities and are intrinsically linked to the onset and progression of infectious diseases. HPIs are pivotal in the entire lifecycle of diseases: from the onset of pathogen introduction, navigating through the mechanisms that bypass host cellular defenses, to its subsequent proliferation inside the host. At the heart of these stages lies the synergy of proteins from both the host and the pathogen. By understanding these interlinking protein dynamics, we can gain crucial insights into how diseases progress and pave the way for stronger plant defenses and the swift formulation of countermeasures. In this study, we developed a web-based R/Shiny app, Deep-HPI-pred, that uses a network-driven feature learning method to predict the as yet unmapped interactions between pathogen and host proteins. Using citrus and CLas bacteria training datasets as a case study, we spotlight the effectiveness of Deep-HPI-pred in discerning protein-protein interactions (PPIs) between them. Deep-HPI-pred uses Multilayer Perceptron (MLP) models for HPI prediction, selected through a comprehensive evaluation of topological features and neural network architectures. When subjected to independent validation datasets, the prediction models consistently surpassed a Matthews correlation coefficient (MCC) of 0.80 in host-pathogen interactions. Remarkably, the use of Eigenvector Centrality as the leading topological feature further enhanced this performance. Further, Deep-HPI-pred also offers relevant gene ontology (GO) term information for each pathogen and host protein within the system. This protein annotation data contributes an additional layer to our understanding of the intricate dynamics within host-pathogen interactions. In additional benchmarking studies, the Deep-HPI-pred model has proven its robustness by consistently delivering reliable results across different host-pathogen systems, including plant-pathogen (accuracy of 98.4% and 97.9%), human-virus (accuracy of 94.3%), and animal-bacteria (accuracy of 96.6%) interactomes. These results not only demonstrate the model's versatility but also pave the way for gaining comprehensive insights into the molecular underpinnings of complex host-pathogen interactions. Taken together, the Deep-HPI-pred applet offers a unified web service for both identifying and illustrating interaction networks. The Deep-HPI-pred applet is freely accessible at its homepage: https://cbi.gxu.edu.cn/shiny-apps/Deep-HPI-pred/ and on GitHub: https://github.com/tahirulqamar/Deep-HPI-pred.
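For readers unfamiliar with the Matthews correlation coefficient reported above, the following Python sketch computes it from binary labels using the standard confusion-matrix formula. It is an illustration only: the function name `mcc` and the toy labels are assumptions, and nothing here is taken from the Deep-HPI-pred code.

```python
import math


def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels (1 = interacting, 0 = not)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0.0 when any marginal count is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Toy labels for illustration; not results from the paper.
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 1]
print(round(mcc(y_true, y_pred), 3))
```

Unlike accuracy, MCC uses all four confusion-matrix cells, so it stays informative when interacting and non-interacting pairs are imbalanced.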
Affiliation(s)
- Muhammad Tahir ul Qamar
  - State Key Laboratory for Conservation and Utilization of Subtropical Agro-bioresources, College of Life Science and Technology, Guangxi University, Nanning 530004, China
- Fatima Noor
  - Integrative Omics and Molecular Modeling Laboratory, Department of Bioinformatics and Biotechnology, Government College University Faisalabad (GCUF), Faisalabad 38000, Pakistan
- Yi-Xiong Guo
  - National Key Laboratory of Crop Genetic Improvement, College of Informatics, Huazhong Agricultural University, Wuhan 430070, China
- Xi-Tong Zhu
  - State Key Laboratory for Conservation and Utilization of Subtropical Agro-bioresources, College of Life Science and Technology, Guangxi University, Nanning 530004, China
- Ling-Ling Chen
  - State Key Laboratory for Conservation and Utilization of Subtropical Agro-bioresources, College of Life Science and Technology, Guangxi University, Nanning 530004, China
6. Wang S, Li W, Wang Z, Yang W, Li E, Xia X, Yan F, Chiu S. Emerging and reemerging infectious diseases: global trends and new strategies for their prevention and control. Signal Transduct Target Ther 2024; 9:223. [PMID: 39256346 PMCID: PMC11412324 DOI: 10.1038/s41392-024-01917-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/22/2024] [Revised: 06/13/2024] [Accepted: 07/05/2024] [Indexed: 09/12/2024]
Abstract
To adequately prepare for potential hazards caused by emerging and reemerging infectious diseases, the WHO has issued a list of high-priority pathogens that are likely to cause future outbreaks and for which research and development (R&D) efforts are dedicated, known as paramount R&D blueprints. Within R&D efforts, the goal is to obtain effective prophylactic and therapeutic approaches, which depends on a comprehensive knowledge of the etiology, epidemiology, and pathogenesis of these diseases. In this process, the accessibility of animal models is a priority bottleneck because it plays a key role in bridging the gap between in-depth understanding and control efforts for infectious diseases. Here, we reviewed preclinical animal models for high-priority diseases in terms of their ability to simulate human infections, including natural susceptibility models, artificially engineered models, and surrogate models. In addition, we have thoroughly reviewed the current landscape of vaccines, antibodies, and small molecule drugs, particularly hopeful candidates in the advanced stages of these infectious diseases. More importantly, focusing on global trends and novel technologies, several aspects of the prevention and control of infectious disease were discussed in detail, including but not limited to gaps in currently available animal models and medical responses, better immune correlates of protection established in animal models and humans, further understanding of disease mechanisms, and the role of artificial intelligence in guiding or supplementing the development of animal models, vaccines, and drugs. Overall, this review described pioneering approaches and sophisticated techniques involved in the study of the epidemiology, pathogenesis, prevention, and clinical treatment of WHO high-priority pathogens and proposed potential directions. Technological advances in these aspects would consolidate the line of defense, thus ensuring a timely response to WHO high-priority pathogens.
Affiliation(s)
- Shen Wang
  - Key Laboratory of Jilin Province for Zoonosis Prevention and Control, Changchun Veterinary Research Institute, Chinese Academy of Agricultural Sciences, Changchun, 130000, China
- Wujian Li
  - Key Laboratory of Jilin Province for Zoonosis Prevention and Control, Changchun Veterinary Research Institute, Chinese Academy of Agricultural Sciences, Changchun, 130000, China
  - College of Veterinary Medicine, Jilin University, Changchun, Jilin, China
- Zhenshan Wang
  - Key Laboratory of Jilin Province for Zoonosis Prevention and Control, Changchun Veterinary Research Institute, Chinese Academy of Agricultural Sciences, Changchun, 130000, China
  - College of Veterinary Medicine, Jilin Agricultural University, Changchun, Jilin, China
- Wanying Yang
  - Key Laboratory of Jilin Province for Zoonosis Prevention and Control, Changchun Veterinary Research Institute, Chinese Academy of Agricultural Sciences, Changchun, 130000, China
- Entao Li
  - Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, 230027, Anhui, China
  - Key Laboratory of Anhui Province for Emerging and Reemerging Infectious Diseases, Hefei, 230027, Anhui, China
- Xianzhu Xia
  - Key Laboratory of Jilin Province for Zoonosis Prevention and Control, Changchun Veterinary Research Institute, Chinese Academy of Agricultural Sciences, Changchun, 130000, China
- Feihu Yan
  - Key Laboratory of Jilin Province for Zoonosis Prevention and Control, Changchun Veterinary Research Institute, Chinese Academy of Agricultural Sciences, Changchun, 130000, China
- Sandra Chiu
  - Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, 230027, Anhui, China
  - Key Laboratory of Anhui Province for Emerging and Reemerging Infectious Diseases, Hefei, 230027, Anhui, China
  - Department of Laboratory Medicine, the First Affiliated Hospital of USTC, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui, China
7. Singh S, Sharma P, Pal N, Sarma DK, Tiwari R, Kumar M. Holistic One Health Surveillance Framework: Synergizing Environmental, Animal, and Human Determinants for Enhanced Infectious Disease Management. ACS Infect Dis 2024; 10:808-826. [PMID: 38415654 DOI: 10.1021/acsinfecdis.3c00625] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/29/2024]
Abstract
Recent pandemics, including the COVID-19 outbreak, have brought up growing concerns about transmission of zoonotic diseases from animals to humans. This highlights the requirement for a novel approach to discern and address the escalating health threats. The One Health paradigm has been developed as a responsive strategy to confront forthcoming outbreaks through early warning, highlighting the interconnectedness of humans, animals, and their environment. The system employs several innovative methods such as the use of advanced technology, global collaboration, and data-driven decision-making to come up with an extraordinary solution for improving worldwide disease responses. This Review deliberates environmental, animal, and human factors that influence disease risk, analyzes the challenges and advantages inherent in using the One Health surveillance system, and demonstrates how these can be empowered by Big Data and Artificial Intelligence. The Holistic One Health Surveillance Framework presented herein holds the potential to revolutionize our capacity to monitor, understand, and mitigate the impact of infectious diseases on global populations.
Affiliation(s)
- Samradhi Singh
  - ICMR - National Institute for Research in Environmental Health, Bhopal Bypass Road, Bhouri, Bhopal-462030, Madhya Pradesh, India
- Poonam Sharma
  - ICMR - National Institute for Research in Environmental Health, Bhopal Bypass Road, Bhouri, Bhopal-462030, Madhya Pradesh, India
- Namrata Pal
  - ICMR - National Institute for Research in Environmental Health, Bhopal Bypass Road, Bhouri, Bhopal-462030, Madhya Pradesh, India
- Devojit Kumar Sarma
  - ICMR - National Institute for Research in Environmental Health, Bhopal Bypass Road, Bhouri, Bhopal-462030, Madhya Pradesh, India
- Rajnarayan Tiwari
  - ICMR - National Institute for Research in Environmental Health, Bhopal Bypass Road, Bhouri, Bhopal-462030, Madhya Pradesh, India
- Manoj Kumar
  - ICMR - National Institute for Research in Environmental Health, Bhopal Bypass Road, Bhouri, Bhopal-462030, Madhya Pradesh, India
8. Kang Y, Wang X, Xie C, Zhang H, Xie W. BBLN: A bilateral-branch learning network for unknown protein-protein interaction prediction. Comput Biol Med 2023; 167:107588. [PMID: 37918265 DOI: 10.1016/j.compbiomed.2023.107588] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/18/2023] [Revised: 10/03/2023] [Accepted: 10/17/2023] [Indexed: 11/04/2023]
Abstract
Prediction of unknown protein-protein interactions (PPIs) is in high demand in the field of biological analysis. Because the limited availability of protein data has a severe effect, transferable representations need to be learned from various data. Recent works improve model performance on unknown-PPI prediction by combining protein information and relation information on the PPI graph. However, such methods inevitably suffer from a so-called information monotonicity problem that limits the improvements when encountering large amounts of unknown PPIs. Prediction performance cannot actually be increased without considering the complementary information and relationship information among the various modalities of protein data. To this end, we propose a bilateral-branch learning network to deeply enhance both the complementary and the relationship information based on the amino acid sequence and gene ontology from multi- and cross-modal views. Experimental results on massive real-world datasets show that our method significantly outperforms the previous state of the art on both traditional and novel unknown-PPI prediction.
Affiliation(s)
- Yan Kang
  - National Pilot School of Software, Yunnan University, Kunming, 650091, Yunnan, China
  - Yunnan Key Laboratory of Software Engineering, China
- Xinchao Wang
  - National Pilot School of Software, Yunnan University, Kunming, 650091, Yunnan, China
- Cheng Xie
  - National Pilot School of Software, Yunnan University, Kunming, 650091, Yunnan, China
- Huadong Zhang
  - National Pilot School of Software, Yunnan University, Kunming, 650091, Yunnan, China
- Wentao Xie
  - National Pilot School of Software, Yunnan University, Kunming, 650091, Yunnan, China
9. Menor-Flores M, Vega-Rodríguez MA. Boosting-based ensemble of global network aligners for PPI network alignment. Expert Systems with Applications 2023; 230:120671. [DOI: 10.1016/j.eswa.2023.120671] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/03/2025]
10. Wong F, de la Fuente-Nunez C, Collins JJ. Leveraging artificial intelligence in the fight against infectious diseases. Science 2023; 381:164-170. [PMID: 37440620 PMCID: PMC10663167 DOI: 10.1126/science.adh1114] [Citation(s) in RCA: 96] [Impact Index Per Article: 48.0] [Received: 04/01/2023] [Accepted: 06/05/2023] [Indexed: 07/15/2023]
Abstract
Despite advances in molecular biology, genetics, computation, and medicinal chemistry, infectious disease remains an ominous threat to public health. Addressing the challenges posed by pathogen outbreaks, pandemics, and antimicrobial resistance will require concerted interdisciplinary efforts. In conjunction with systems and synthetic biology, artificial intelligence (AI) is now leading to rapid progress, expanding anti-infective drug discovery, enhancing our understanding of infection biology, and accelerating the development of diagnostics. In this Review, we discuss approaches for detecting, treating, and understanding infectious diseases, underscoring the progress supported by AI in each case. We suggest future applications of AI and how it might be harnessed to help control infectious disease outbreaks and pandemics.
Affiliation(s)
- Felix Wong
  - Infectious Disease and Microbiome Program, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
  - Institute for Medical Engineering & Science and Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Cesar de la Fuente-Nunez
  - Machine Biology Group, Departments of Psychiatry and Microbiology, Institute for Biomedical Informatics, Institute for Translational Medicine and Therapeutics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
  - Departments of Bioengineering and Chemical and Biomolecular Engineering, School of Engineering and Applied Science, University of Pennsylvania, Philadelphia, PA 19104, USA
  - Penn Institute for Computational Science, University of Pennsylvania, Philadelphia, PA, 19104, USA
- James J. Collins
  - Infectious Disease and Microbiome Program, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
  - Institute for Medical Engineering & Science and Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
  - Wyss Institute for Biologically Inspired Engineering, Harvard University, Boston, MA 02115, USA
11. Guo Z, Liu L, Feng M, Su K, Chi R, Li K, Lu Q, Su X, Da L, Cao S, Zhang M, Meng L, Cao D, Wang J, He G, Shi Y. 3D genome assisted protein–protein interaction prediction. Future Generation Computer Systems 2022; 137:87-96. [DOI: 10.1016/j.future.2022.07.005] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 04/02/2025]
12. Zhu L, Wang X, Li F, Song J. PreAcrs: a machine learning framework for identifying anti-CRISPR proteins. BMC Bioinformatics 2022; 23:444. [PMID: 36284264 PMCID: PMC9597991 DOI: 10.1186/s12859-022-04986-3] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 06/26/2022] [Accepted: 10/14/2022] [Indexed: 11/10/2022]
Abstract
BACKGROUND: Anti-CRISPR proteins are potent modulators that inhibit the CRISPR-Cas immunity system and have huge potential in gene editing and gene therapy as a genome-editing tool. Extensive studies have shown that anti-CRISPR proteins are essential for modifying endogenous genes, promoting the RNA-guided binding and cleavage of DNA or RNA substrates. In recent years, identifying and characterizing anti-CRISPR proteins has become a hot and significant research topic in bioinformatics. However, as most anti-CRISPR proteins fall short in sharing similarities to those currently known, traditional screening methods are time-consuming and inefficient. Machine learning methods could fill this gap with powerful predictive capability and provide a new perspective for anti-CRISPR protein identification. RESULTS: Here, we present a novel machine learning ensemble predictor, called PreAcrs, to identify anti-CRISPR proteins from protein sequences directly. Three features and eight different machine learning algorithms were used to train PreAcrs. PreAcrs outperformed other existing methods and significantly improved the prediction accuracy for identifying anti-CRISPR proteins. CONCLUSIONS: In summary, the PreAcrs predictor achieved a competitive performance for predicting new anti-CRISPR proteins in terms of accuracy and robustness. We anticipate PreAcrs will be a valuable tool for researchers to speed up the research process. The source code is available at: https://github.com/Lyn-666/anti_CRISPR.git.
Affiliation(s)
- Lin Zhu
  - Institute for Advanced Study, Shenzhen University, Shenzhen, China
- Xiaoyu Wang
  - Monash Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, VIC 3800 Australia
- Fuyi Li
  - Department of Microbiology and Immunology, The Peter Doherty Institute for Infection and Immunity, The University of Melbourne, Melbourne, VIC Australia
- Jiangning Song
  - Monash Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, VIC 3800 Australia
  - Monash Data Futures Institute, Monash University, Melbourne, VIC 3800 Australia
13. Graph Neural Network for Protein-Protein Interaction Prediction: A Comparative Study. Molecules (Basel, Switzerland) 2022; 27:molecules27186135. [PMID: 36144868 PMCID: PMC9501426 DOI: 10.3390/molecules27186135] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/27/2022] [Revised: 09/14/2022] [Accepted: 09/16/2022] [Indexed: 11/17/2022]
Abstract
Proteins are the fundamental biological macromolecules which underlie practically all biological activities. Protein-protein interactions (PPIs), as they are known, are how proteins interact with other proteins in their environment to perform biological functions. Understanding PPIs reveals how cells behave and operate, such as the antigen recognition and signal transduction in the immune system. In the past decades, many computational methods have been developed to predict PPIs automatically, requiring less time and fewer resources than experimental techniques. In this paper, we present a comparative study of various graph neural networks for protein-protein interaction prediction. Five network models are analyzed and compared, including neural networks (NN), graph convolutional neural networks (GCN), graph attention networks (GAT), hyperbolic neural networks (HNN), and hyperbolic graph convolutions (HGCN). By utilizing the protein sequence information, all of these models can predict the interaction between proteins. Fourteen PPI datasets are extracted and utilized to compare the prediction performance of all these methods. The experimental results show that hyperbolic graph neural networks tend to have a better performance than the other methods on the protein-related datasets.
14. Xiang J, Meng X, Zhao Y, Wu FX, Li M. HyMM: hybrid method for disease-gene prediction by integrating multiscale module structure. Brief Bioinform 2022; 23:6547263. [PMID: 35275996 DOI: 10.1093/bib/bbac072] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/20/2021] [Revised: 01/18/2022] [Accepted: 02/13/2022] [Indexed: 11/12/2022]
Abstract
MOTIVATION: Identifying disease-related genes is an important issue in computational biology. Module structure widely exists in biomolecule networks, and complex diseases are usually thought to be caused by perturbations of local neighborhoods in the networks, which can provide useful insights for the study of disease-related genes. However, the mining and effective utilization of the module structure is still challenging in problems such as disease-gene prediction. RESULTS: We propose a hybrid disease-gene prediction method integrating multiscale module structure (HyMM), which can utilize multiscale information from local to global structure to more effectively predict disease-related genes. HyMM extracts module partitions from local to global scales by multiscale modularity optimization with exponential sampling, and estimates the disease relatedness of genes in partitions by the abundance of disease-related genes within modules. Then, a probabilistic model for integration of gene rankings is designed in order to integrate multiple predictions derived from multiscale module partitions and network propagation, and a parameter estimation strategy based on functional information is proposed to further enhance HyMM's predictive power. Through a series of experiments, we reveal the importance of module partitions at different scales, and verify the stable and good performance of HyMM compared with eight other state-of-the-art methods, as well as its further performance improvement derived from the parameter estimation. CONCLUSIONS: The results confirm that HyMM is an effective framework for integrating multiscale module structure to enhance the ability to predict disease-related genes, which may provide useful insights for the study of the multiscale module structure and its application in problems such as disease-gene prediction.
Affiliation(s)
- Ju Xiang
  - Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha 410083, China; Department of Basic Medical Sciences & Academician Workstation, Changsha Medical University, Changsha, Hunan 410219, China
- Xiangmao Meng
  - School of Computer Science and Engineering, Central South University, Changsha 410083, China
- Yichao Zhao
  - School of Computer Science and Engineering, Central South University, Changsha, Hunan 410083, China
- Fang-Xiang Wu
  - Division of Biomedical Engineering and Department of Mechanical Engineering, University of Saskatchewan, Saskatoon, SK, S7N 5A9, Canada
- Min Li
  - Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha 410083, China
|
15
|
Chai H, Gu Q, Hughes J, Robertson DL. In silico prediction of HIV-1-host molecular interactions and their directionality. PLoS Comput Biol 2022; 18:e1009720. [PMID: 35134057 PMCID: PMC8856524 DOI: 10.1371/journal.pcbi.1009720] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/31/2021] [Revised: 02/18/2022] [Accepted: 12/03/2021] [Indexed: 11/18/2022]
Abstract
Human immunodeficiency virus type 1 (HIV-1) continues to be a major cause of disease and premature death. As with all viruses, HIV-1 exploits a host cell to replicate. Improving our understanding of the molecular interactions between virus and human host proteins is crucial for a mechanistic understanding of virus biology, infection and host antiviral activities. This knowledge will potentially permit the identification of host molecules for targeting by drugs with antiviral properties. Here, we propose a data-driven approach for the analysis and prediction of the HIV-1 interacting proteins (VIPs) with a focus on the directionality of the interaction: host-dependency versus antiviral factors. Using support vector machine learning models and features encompassing genetic, proteomic and network properties, our results reveal some significant differences between the VIPs and non-HIV-1 interacting human proteins (non-VIPs). As assessed by comparison with the HIV-1 infection pathway data in the Reactome database (sensitivity > 90%, threshold = 0.5), we demonstrate these models have good generalization properties. We find that the ‘direction’ of the HIV-1-host molecular interactions is also predictable due to different characteristics of ‘forward’/pro-viral versus ‘backward’/pro-host proteins. Additionally, we infer the previously unknown direction of the interactions between HIV-1 and 1351 human host proteins. A web server for performing predictions is available at http://hivpre.cvr.gla.ac.uk/.
Affiliation(s)
- Haiting Chai
  - MRC-University of Glasgow Centre for Virus Research, Glasgow, United Kingdom
- Quan Gu
  - MRC-University of Glasgow Centre for Virus Research, Glasgow, United Kingdom
- Joseph Hughes
  - MRC-University of Glasgow Centre for Virus Research, Glasgow, United Kingdom
- David L. Robertson
  - MRC-University of Glasgow Centre for Virus Research, Glasgow, United Kingdom
|
16
|
Song B, Luo X, Luo X, Liu Y, Niu Z, Zeng X. Learning spatial structures of proteins improves protein-protein interaction prediction. Brief Bioinform 2022; 23:6501351. [PMID: 35018418 DOI: 10.1093/bib/bbab558] [Citation(s) in RCA: 56] [Impact Index Per Article: 18.7] [Received: 08/27/2021] [Revised: 12/07/2021] [Accepted: 12/07/2021] [Indexed: 01/09/2023]
Abstract
Spatial structures of proteins are closely related to protein functions. Integrating protein structures improves the performance of protein-protein interaction (PPI) prediction. However, the limited quantity of known protein structures restricts the application of structure-based prediction methods. Utilizing the predicted protein structure information is a promising method to improve the performance of sequence-based prediction methods. We propose a novel end-to-end framework, TAGPPI, to predict PPIs using protein sequence alone. TAGPPI extracts multi-dimensional features by employing 1D convolution operation on protein sequences and graph learning method on contact maps constructed from AlphaFold. A contact map contains abundant spatial structure information, which is difficult to obtain from 1D sequence data directly. We further demonstrate that the spatial information learned from contact maps improves the ability of TAGPPI in PPI prediction tasks. We compare the performance of TAGPPI with those of nine state-of-the-art sequence-based methods, and TAGPPI outperforms such methods in all metrics. To the best of our knowledge, this is the first method to use the predicted protein topology structure graph for sequence-based PPI prediction. More importantly, our proposed architecture could be extended to other prediction tasks related to proteins.
Affiliation(s)
- Bosheng Song
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha, 410012, Hunan, China
- Xiaoyan Luo
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha, 410012, Hunan, China; MindRank AI ltd., Hangzhou, 311113, Zhejiang, China
- Xiaoli Luo
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha, 410012, Hunan, China; BioMap, Haidian, 100089, Beijing, China
- Yuansheng Liu
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha, 410012, Hunan, China
- Xiangxiang Zeng
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha, 410012, Hunan, China
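The TAGPPI abstract above relies on residue contact maps as a compact encoding of spatial structure. As a rough illustration of what such a map is, the following sketch builds a binary contact map from C-alpha coordinates and lists the resulting graph edges. It is not the TAGPPI pipeline: the 8.0 angstrom cutoff, the function names `contact_map` and `contact_edges`, and the toy coordinates are all assumptions made for this example.

```python
import math


def contact_map(coords, threshold=8.0):
    """Return a symmetric binary contact map.

    coords: list of (x, y, z) C-alpha coordinates, one per residue.
    threshold: distance cutoff in angstroms (assumed value, not from the paper).
    """
    n = len(coords)
    cmap = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            if math.dist(coords[i], coords[j]) <= threshold:
                cmap[i][j] = cmap[j][i] = 1
    return cmap


def contact_edges(cmap):
    """List undirected edges (i, j) with i < j, i.e. the residue graph."""
    n = len(cmap)
    return [(i, j) for i in range(n) for j in range(i + 1, n) if cmap[i][j]]


# Hypothetical toy chain: four residues spaced 3.8 angstroms apart on a line.
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
cmap = contact_map(coords)
edges = contact_edges(cmap)
```

In this toy chain, residues up to two positions apart fall within the cutoff, while the first and last residues do not. A graph learning model would typically consume the edge list together with per-residue features.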
|
17
|
Li F, Dong S, Leier A, Han M, Guo X, Xu J, Wang X, Pan S, Jia C, Zhang Y, Webb GI, Coin LJM, Li C, Song J. Positive-unlabeled learning in bioinformatics and computational biology: a brief review. Brief Bioinform 2021; 23:6415313. [PMID: 34729589 DOI: 10.1093/bib/bbab461] [Citation(s) in RCA: 33] [Impact Index Per Article: 8.3] [Received: 08/30/2021] [Revised: 09/27/2021] [Accepted: 10/07/2021] [Indexed: 12/14/2022]
Abstract
Conventional supervised binary classification algorithms have been widely applied to address significant research questions using biological and biomedical data. This classification scheme requires two fully labeled classes of data (e.g. positive and negative samples) to train a classification model. However, in many bioinformatics applications, labeling data is laborious, and the negative samples might be potentially mislabeled due to the limited sensitivity of the experimental equipment. The positive unlabeled (PU) learning scheme was therefore proposed to enable the classifier to learn directly from limited positive samples and a large number of unlabeled samples (i.e. a mixture of positive or negative samples). To date, several PU learning algorithms have been developed to address various biological questions, such as sequence identification, functional site characterization and interaction prediction. In this paper, we revisit a collection of 29 state-of-the-art PU learning bioinformatic applications to address various biological questions. Various important aspects are extensively discussed, including PU learning methodology, biological application, classifier design and evaluation strategy. We also comment on the existing issues of PU learning and offer our perspectives for the future development of PU learning applications. We anticipate that our work serves as an instrumental guideline for a better understanding of the PU learning framework in bioinformatics and further developing next-generation PU learning frameworks for critical biological applications.
Affiliation(s)
- Fuyi Li
  - Monash University, Australia
- André Leier
  - Department of Genetics, UAB School of Medicine, USA
- Meiya Han
  - Department of Biochemistry and Molecular Biology, Monash University, Australia
- Jing Xu
  - Computer Science and Technology, Nankai University, China
- Xiaoyu Wang
  - Department of Biochemistry and Molecular Biology and Biomedicine Discovery Institute, Monash University, Australia
- Shirui Pan
  - University of Technology Sydney (UTS), Ultimo, NSW, Australia
- Cangzhi Jia
  - College of Science, Dalian Maritime University, China
- Yang Zhang
  - Northwestern Polytechnical University, China
- Geoffrey I Webb
  - Faculty of Information Technology, Monash University, Australia
- Lachlan J M Coin
  - Department of Clinical Pathology, University of Melbourne, Australia
- Chen Li
  - Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Australia
- Jiangning Song
  - Monash Biomedicine Discovery Institute, Monash University, Melbourne, Australia
|
18
|
Mahapatra S, Sahu SS. Improved prediction of protein-protein interaction using a hybrid of functional-link Siamese neural network and gradient boosting machines. Brief Bioinform 2021; 22:6318175. [PMID: 34245238 DOI: 10.1093/bib/bbab255] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 10/05/2020] [Revised: 11/26/2020] [Accepted: 06/17/2021] [Indexed: 01/17/2023]
Abstract
In this paper, for accurate prediction of protein-protein interaction (PPI), a novel hybrid classifier is developed by combining the functional-link Siamese neural network (FSNN) with the light gradient boosting machine (LGBM) classifier. The hybrid classifier (FSNN-LGBM) uses the fusion of features derived using pseudo amino acid composition and conjoint triad descriptors. The FSNN extracts the high-level abstraction features from the raw features and LGBM performs the PPI prediction task using these abstraction features. On performing 5-fold cross-validation experiments, the proposed hybrid classifier provides average accuracies of 98.70 and 98.38%, respectively, on the intraspecies PPI data sets of Saccharomyces cerevisiae and Helicobacter pylori. Similarly, the average accuracies for the interspecies PPI data sets of the Human-Bacillus and Human-Yersinia data sets are 98.52 and 97.40%, respectively. Compared with the existing methods, the hybrid classifier achieves higher prediction accuracy on the independent test sets and network data sets. The improved prediction performance obtained by the FSNN-LGBM makes it a flexible and effective PPI prediction model.
Affiliation(s)
- Satyajit Mahapatra
  - Department of Electronics and Communication, Birla Institute of Technology Mesra, Ranchi, India
- Sitanshu Sekhar Sahu
  - Department of Electronics and Communication, Birla Institute of Technology Mesra, Ranchi, India
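The FSNN-LGBM abstract above names conjoint triad descriptors as one of its raw feature sets. A minimal sketch of that encoding follows: amino acids are mapped to seven classes, and each window of three consecutive residues is counted as one of 7 x 7 x 7 = 343 class triads. The seven-class grouping is the one commonly used for this descriptor; the function name `conjoint_triad` and the plain frequency normalization are simplifications chosen for illustration, and may differ from what the paper uses.

```python
from itertools import product

# Seven amino-acid classes commonly used for the conjoint triad descriptor.
CLASSES = ["AGV", "ILFP", "YMTS", "HNQW", "RK", "DE", "C"]
AA_TO_CLASS = {aa: idx for idx, group in enumerate(CLASSES) for aa in group}

# All 343 class triads in lexicographic order, mapped to a vector index.
TRIAD_INDEX = {t: i for i, t in enumerate(product(range(7), repeat=3))}


def conjoint_triad(sequence):
    """Return a 343-dimensional triad frequency vector for a protein sequence.

    Windows containing a non-standard residue are skipped (an assumption).
    """
    classes = [AA_TO_CLASS.get(aa) for aa in sequence.upper()]
    counts = [0] * len(TRIAD_INDEX)
    for i in range(len(classes) - 2):
        triad = tuple(classes[i:i + 3])
        if None in triad:
            continue
        counts[TRIAD_INDEX[triad]] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * len(counts)
    return [c / total for c in counts]


# Hypothetical toy sequences.
vec_same_class = conjoint_triad("AGVAGV")  # every residue is in class 0
vec_single = conjoint_triad("ARD")         # one triad: classes (0, 4, 5)
```

For a protein pair, the two 343-dimensional vectors are usually concatenated or fed to the two branches of a Siamese network before classification.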
|
19
|
Chen Z, Zhao P, Li C, Li F, Xiang D, Chen YZ, Akutsu T, Daly RJ, Webb GI, Zhao Q, Kurgan L, Song J. iLearnPlus: a comprehensive and automated machine-learning platform for nucleic acid and protein sequence analysis, prediction and visualization. Nucleic Acids Res 2021; 49:e60. [PMID: 33660783 PMCID: PMC8191785 DOI: 10.1093/nar/gkab122] [Citation(s) in RCA: 156] [Impact Index Per Article: 39.0] [Received: 10/26/2020] [Revised: 02/05/2021] [Accepted: 02/25/2021] [Indexed: 12/14/2022]
Abstract
Sequence-based analysis and prediction are fundamental bioinformatic tasks that facilitate understanding of the sequence(-structure)-function paradigm for DNAs, RNAs and proteins. Rapid accumulation of sequences requires equally pervasive development of new predictive models, which depends on the availability of effective tools that support these efforts. We introduce iLearnPlus, the first machine-learning platform with graphical- and web-based interfaces for the construction of machine-learning pipelines for analysis and predictions using nucleic acid and protein sequences. iLearnPlus provides a comprehensive set of algorithms and automates sequence-based feature extraction and analysis, construction and deployment of models, assessment of predictive performance, statistical analysis, and data visualization; all without programming. iLearnPlus includes a wide range of feature sets which encode information from the input sequences and over twenty machine-learning algorithms that cover several deep-learning approaches, outnumbering the current solutions by a wide margin. Our solution caters to experienced bioinformaticians, given the broad range of options, and biologists with no programming background, given the point-and-click interface and easy-to-follow design process. We showcase iLearnPlus with two case studies concerning prediction of long noncoding RNAs (lncRNAs) from RNA transcripts and prediction of crotonylation sites in protein chains. iLearnPlus is an open-source platform available at https://github.com/Superzchen/iLearnPlus/ with the webserver at http://ilearnplus.erc.monash.edu/.
Affiliation(s)
- Zhen Chen
  - Collaborative Innovation Center of Henan Grain Crops, Henan Agricultural University, Zhengzhou 450046, China
- Pei Zhao
  - State Key Laboratory of Cotton Biology, Institute of Cotton Research of Chinese Academy of Agricultural Sciences (CAAS), Anyang 455000, China
- Chen Li
  - Monash Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, VIC 3800, Australia
- Fuyi Li
  - Monash Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, VIC 3800, Australia; Monash Centre for Data Science, Faculty of Information Technology, Monash University, Melbourne, VIC 3800, Australia; Department of Microbiology and Immunology, The Peter Doherty Institute for Infection and Immunity, The University of Melbourne, Melbourne, Victoria 3000, Australia
- Dongxu Xiang
  - Monash Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, VIC 3800, Australia; Monash Centre for Data Science, Faculty of Information Technology, Monash University, Melbourne, VIC 3800, Australia
- Yong-Zi Chen
  - Laboratory of Tumor Cell Biology, Key Laboratory of Cancer Prevention and Therapy, National Clinical Research Center for Cancer, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin 300060, China
- Tatsuya Akutsu
  - Bioinformatics Center, Institute for Chemical Research, Kyoto University, Kyoto 611-0011, Japan
- Roger J Daly
  - Monash Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, VIC 3800, Australia
- Geoffrey I Webb
  - Monash Centre for Data Science, Faculty of Information Technology, Monash University, Melbourne, VIC 3800, Australia
- Quanzhi Zhao
  - Collaborative Innovation Center of Henan Grain Crops, Henan Agricultural University, Zhengzhou 450046, China; Key Laboratory of Rice Biology in Henan Province, Henan Agricultural University, Zhengzhou 450046, China
- Lukasz Kurgan
  - Department of Computer Science, Virginia Commonwealth University, Richmond, VA, USA
- Jiangning Song
  - Monash Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, VIC 3800, Australia; Monash Centre for Data Science, Faculty of Information Technology, Monash University, Melbourne, VIC 3800, Australia
|
20
|
Xiang J, Zhang J, Zheng R, Li X, Li M. NIDM: network impulsive dynamics on multiplex biological network for disease-gene prediction. Brief Bioinform 2021; 22:6236070. [PMID: 33866352 DOI: 10.1093/bib/bbab080] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 01/12/2021] [Revised: 02/11/2021] [Accepted: 02/21/2021] [Indexed: 12/12/2022]
Abstract
The prediction of genes related to diseases is important to the study of the diseases due to the high cost and time consumption of biological experiments. Network propagation is a popular strategy for disease-gene prediction. However, existing methods focus on the stable solution of dynamics while ignoring the useful information hidden in the dynamical process, and it is still a challenge to make use of multiple types of physical/functional relationships between proteins/genes to effectively predict disease-related genes. Therefore, we proposed a framework of network impulsive dynamics on multiplex biological network (NIDM) to predict disease-related genes, along with four variants of NIDM models and four kinds of impulsive dynamical signatures (IDSs). NIDM identifies disease-related genes by mining the dynamical responses of nodes to impulsive signals exerted at specific nodes. Through a series of experimental evaluations on various types of biological networks, we confirmed the advantage of multiplex networks and the important roles of functional associations in disease-gene prediction, demonstrated superior performance of NIDM compared with four types of network-based algorithms, and then gave recommendations on effective NIDM models and IDSs. To facilitate the prioritization and analysis of (candidate) genes associated with specific diseases, we developed a user-friendly web server, which provides three kinds of filtering patterns for genes, network visualization, enrichment analysis and a wealth of external links (http://bioinformatics.csu.edu.cn/DGP/NID.jsp). NIDM is a protocol for disease-gene prediction integrating different types of biological networks, which may become a very useful computational tool for the study of disease-related genes.
Affiliation(s)
- Ju Xiang
  - School of Computer Science and Engineering, Central South University, Hunan, China
- Jiashuai Zhang
  - School of Computer Science and Engineering, Central South University, Hunan, China
- Ruiqing Zheng
  - School of Computer Science and Engineering, Central South University, China
- Xingyi Li
  - School of Computer Science and Engineering, Central South University, Changsha, China
- Min Li
  - School of Computer Science and Engineering, Central South University, Changsha, China
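The NIDM abstract above describes network propagation as a popular strategy for disease-gene prediction. One common form of it is a random walk with restart from known disease genes. The sketch below shows only that generic baseline, iterated to its steady state; it is not the NIDM method, which the abstract says looks at the dynamical process instead. The function name, the restart probability of 0.5 and the toy network are assumptions made for this example.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.5, tol=1e-9, max_iter=1000):
    """Propagate scores from seed nodes over an undirected network.

    adjacency: dict mapping each node to a list of its neighbours (symmetric).
    seeds: set of known disease-gene nodes where the walker restarts.
    restart: probability of jumping back to a seed at each step (assumed value).
    """
    nodes = sorted(adjacency)
    p0 = {u: (1.0 / len(seeds) if u in seeds else 0.0) for u in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {u: restart * p0[u] for u in nodes}
        for u in nodes:
            neighbours = adjacency[u]
            if not neighbours:
                continue  # mass on isolated nodes is dropped in this sketch
            share = (1.0 - restart) * p[u] / len(neighbours)
            for v in neighbours:
                nxt[v] += share
        delta = sum(abs(nxt[u] - p[u]) for u in nodes)
        p = nxt
        if delta < tol:
            break
    return p


# Hypothetical toy network: a path A - B - C - D with one known disease gene, A.
network = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
scores = random_walk_with_restart(network, seeds={"A"})
ranking = sorted(scores, key=scores.get, reverse=True)
```

Candidate genes are then ranked by their steady-state score, so nodes closer to the seed in the network rank higher in this toy case.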
|
21
|
Wang Y, Zhou M, Zou Q, Xu L. Machine learning for phytopathology: from the molecular scale towards the network scale. Brief Bioinform 2021; 22:6204793. [PMID: 33787847 DOI: 10.1093/bib/bbab037] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 12/15/2020] [Revised: 01/09/2021] [Accepted: 01/26/2021] [Indexed: 01/16/2023]
Abstract
With the increasing volume of high-throughput sequencing data from a variety of omics techniques in the field of plant-pathogen interactions, sorting, retrieving, processing and visualizing biological information have become a great challenge. Within the explosion of data, machine learning offers powerful tools to process these complex omics data by various algorithms, such as Bayesian reasoning, support vector machine and random forest. Here, we introduce the basic frameworks of machine learning in dissecting plant-pathogen interactions and discuss the applications and advances of machine learning in plant-pathogen interactions from molecular to network biology, including the prediction of pathogen effectors, plant disease resistance protein monitoring and the discovery of protein-protein networks. The aim of this review is to provide a summary of advances in plant defense and pathogen infection and to indicate the important developments of machine learning in phytopathology.
Affiliation(s)
- Yansu Wang
  - Postdoctoral Innovation Practice Base, Shenzhen Polytechnic, China
- Quan Zou
  - University of Electronic Science and Technology of China
- Lei Xu
  - Shenzhen Polytechnic, China
|
22
|
Guo Z, Su K, Liu L, Su X, Feng M, Cao S, Zhang M, Chi R, Meng L, He G, Shi Y. Improving Protein-protein Interaction Prediction by Incorporating 3D Genome Information. Lecture Notes in Computer Science 2021:511-520. [DOI: 10.1007/978-3-030-91415-8_43] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/03/2025]
|