1
|
Wang C, Zhang H, Ma H, Wang Y, Cai K, Guo T, Yang Y, Li Z, Zhu Y. Inference of pan-cancer related genes by orthologs matching based on enhanced LSTM model. Front Microbiol 2022; 13:963704. [PMID: 36267181 PMCID: PMC9577021 DOI: 10.3389/fmicb.2022.963704] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/07/2022] [Accepted: 08/16/2022] [Indexed: 11/13/2022] Open
Abstract
Many disease-related genes have been found to be associated with cancer diagnosis, which is useful for understanding the pathophysiology of cancer, generating targeted drugs, and developing new diagnostic and treatment techniques. With the development of the pan-cancer project and the ongoing expansion of sequencing technology, many scientists are focusing on mining common genes across various cancer types from The Cancer Genome Atlas (TCGA). In this study, motivated by the benefits of reverse genetics, we attempted to infer pan-cancer associated genes by homology matching against the microbial model organism Saccharomyces cerevisiae (yeast). First, a background network of protein-protein interactions and a pathogenic gene set involving several cancer types were created for humans and yeast. The homology between human genes and yeast genes was then identified by homology matching, and the corresponding interaction sub-network was obtained, following the principle that homologous genes descended from a common ancestor may have similar expression. Then, using bidirectional long short-term memory (BiLSTM) in combination with adaptive integration of heterogeneous information, we further explored the topological characteristics of the yeast protein interaction network and presented a node representation score to evaluate the ability of nodes in graphs. Finally, the important yeast genes identified by ensemble classifiers were mapped back to human genes by homology; these may be regarded as genes connected to all types of cancer. The performance of the BiLSTM model is assessed through experiments on the database, and enrichment analysis, survival analysis, and other outcomes are used to confirm the biological importance of the prediction results. The complete experimental protocols and programs are available at https://github.com/zhuyuan-cug/AI-BiLSTM/tree/master.
Affiliation(s)
- Chao Wang
  - Department of Surgery, Hepatic Surgery Center, Institute of Hepato-Pancreato-Biliary Surgery, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
- Houwang Zhang
  - Department of Electrical Engineering, City University of Hong Kong, Kowloon, Hong Kong SAR, China
- Haishu Ma
  - School of Automation, China University of Geosciences, Wuhan, China
  - Hubei Key Laboratory of Advanced Control and Intelligent Automation for Complex Systems, Wuhan, China
  - Engineering Research Center of Intelligent Technology for Geo-Exploration, Wuhan, China
- Yawen Wang
  - School of Mathematics and Physics, China University of Geosciences, Wuhan, China
- Ke Cai
  - School of Automation, China University of Geosciences, Wuhan, China
  - Hubei Key Laboratory of Advanced Control and Intelligent Automation for Complex Systems, Wuhan, China
  - Engineering Research Center of Intelligent Technology for Geo-Exploration, Wuhan, China
- Tingrui Guo
  - School of Automation, China University of Geosciences, Wuhan, China
  - Hubei Key Laboratory of Advanced Control and Intelligent Automation for Complex Systems, Wuhan, China
  - Engineering Research Center of Intelligent Technology for Geo-Exploration, Wuhan, China
- Yuanhang Yang
  - School of Mathematics and Physics, China University of Geosciences, Wuhan, China
- Zhen Li
  - School of Mathematics and Physics, China University of Geosciences, Wuhan, China
- Yuan Zhu
  - School of Automation, China University of Geosciences, Wuhan, China
  - Hubei Key Laboratory of Advanced Control and Intelligent Automation for Complex Systems, Wuhan, China
  - Engineering Research Center of Intelligent Technology for Geo-Exploration, Wuhan, China
  - Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence, Shanghai, China
|
2
|
Optimization of Material Supply in Smart Manufacturing Environment: A Metaheuristic Approach for Matrix Production. MACHINES 2021. [DOI: 10.3390/machines9100220] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 01/28/2023]
Abstract
In the context of Industry 4.0, the matrix production developed by KUKA robotics represents a revolutionary solution for flexible manufacturing systems. Because these manufacturing and material handling solutions are adaptable and flexible, the design and control of these processes require new models and methods, especially from a real-time control point of view. In this article, a new real-time optimization algorithm for the in-plant material supply of smart manufacturing is proposed. After a systematic literature review, this paper describes a possible structure of in-plant supply in a matrix production environment. The mathematical model of this matrix production system is defined. The optimization problem of the described model is an integrated routing and scheduling problem, which is NP-hard. The integrated routing and scheduling problem is solved with a hybrid multi-phase black hole and flower pollination-based metaheuristic algorithm. The computational results, focusing on clustering and routing problems, validate the model and evaluate its performance. The case studies show that matrix production is a suitable solution for smart manufacturing.
|
3
|
Zhang Z, Jiang M, Wu D, Zhang W, Yan W, Qu X. A Novel Method for Identifying Essential Proteins Based on Non-negative Matrix Tri-Factorization. Front Genet 2021; 12:709660. [PMID: 34422014 PMCID: PMC8378176 DOI: 10.3389/fgene.2021.709660] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/14/2021] [Accepted: 07/06/2021] [Indexed: 11/29/2022] Open
Abstract
Identifying essential proteins is very important for understanding the basic requirements for sustaining a living organism. In recent years, there has been increasing interest in computational methods that predict essential proteins from protein–protein interaction (PPI) networks or by fusing multiple sources of biological information. However, existing PPI data are known to contain false negatives and false positives. Fusing multiple sources of biological information can reduce the influence of false data in PPI networks, but it inevitably introduces more noise at the same time. In this article, we propose a novel non-negative matrix tri-factorization (NMTF)-based model (NTMEP) to predict essential proteins. First, a weighted PPI network is established using only the topological features of the network, so as to avoid introducing more noise. To reduce the influence of false data in the PPI network on the identification of essential proteins, the NMTF technique, a widely used recommendation algorithm, is applied to reconstruct an optimized PPI network with more potential protein–protein interactions. Then, we use the PageRank algorithm to compute the final ranking score of each protein, with the initial scores calculated from subcellular localization and homology information. Extensive experiments on publicly available datasets indicate that our NTMEP model performs better at predicting essential proteins than state-of-the-art methods. In this investigation, we demonstrate that introducing non-negative matrix tri-factorization can effectively improve the quality of the protein–protein interaction network and thereby reduce the negative impact of noise on prediction. This finding also provides a new perspective for other applications based on protein–protein interaction networks.
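The ranking step described in this abstract can be illustrated with a generic PageRank iteration that starts from, and restarts at, a prior score vector. This is only a minimal sketch under stated assumptions: the function name `weighted_pagerank`, the damping value, and the use of the prior as the restart vector are illustrative choices, not the NTMEP implementation, and the NMTF reconstruction step is not shown.

```python
import numpy as np

def weighted_pagerank(weights, initial_scores, damping=0.85, tol=1e-10, max_iter=1000):
    """Power-iteration PageRank on a weighted, undirected network.

    weights        : (n, n) symmetric, non-negative edge-weight matrix
    initial_scores : length-n non-negative prior (a hypothetical stand-in for
                     scores derived from localization and homology data)
    """
    w = np.asarray(weights, dtype=float)
    prior = np.asarray(initial_scores, dtype=float)
    prior = prior / prior.sum()              # normalize the prior to sum to 1
    col_sums = w.sum(axis=0)
    # Column-normalize the weights; isolated nodes keep an all-zero column.
    transition = np.divide(w, col_sums, out=np.zeros_like(w), where=col_sums > 0)
    scores = prior.copy()
    for _ in range(max_iter):
        updated = damping * (transition @ scores) + (1.0 - damping) * prior
        if np.abs(updated - scores).sum() < tol:
            return updated
        scores = updated
    return scores

# Toy path network 0 - 1 - 2 with a uniform prior.
toy_weights = np.array([[0.0, 1.0, 0.0],
                        [1.0, 0.0, 1.0],
                        [0.0, 1.0, 0.0]])
toy_scores = weighted_pagerank(toy_weights, [1.0, 1.0, 1.0])
```

In the toy network the middle node is connected to both others, so it receives the highest score; a non-uniform prior would shift the ranking toward nodes with stronger prior evidence.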
Affiliation(s)
- Zhihong Zhang
  - College of Computer Engineering and Applied Mathematics, Changsha University, Changsha, China
  - School of Information Technology and Management, Hunan University of Finance and Economics, Changsha, China
- Meiping Jiang
  - Department of Ultrasound, Hunan Provincial Maternal and Child Health Care Hospital, Changsha, China
- Dongjie Wu
  - Department of Banking and Finance, Monash University, Clayton, VIC, Australia
- Wang Zhang
  - Department of Optoelectronic Engineering, Jinan University, Guangzhou, China
- Wei Yan
  - College of Computer Engineering and Applied Mathematics, Changsha University, Changsha, China
- Xilong Qu
  - School of Information Technology and Management, Hunan University of Finance and Economics, Changsha, China
  - Hunan Provincial Key Laboratory of Finance and Economics Big Data Science and Technology, Hunan University of Finance and Economics, Changsha, China
|
4
|
He X, Kuang L, Chen Z, Tan Y, Wang L. Method for Identifying Essential Proteins by Key Features of Proteins in a Novel Protein-Domain Network. Front Genet 2021; 12:708162. [PMID: 34267785 PMCID: PMC8276041 DOI: 10.3389/fgene.2021.708162] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/11/2021] [Accepted: 05/31/2021] [Indexed: 11/21/2022] Open
Abstract
In recent years, because of the low accuracy and high cost of traditional biological experiments, more and more computational models have been proposed to infer potential essential proteins. In this paper, a novel prediction method called KFPM is proposed. First, a novel protein-domain heterogeneous network is established by combining known protein-protein interactions with known associations between proteins and domains. Next, based on key topological characteristics extracted from the newly constructed protein-domain network and functional characteristics extracted from multiple biological information of proteins, a new computational method built on an improved PageRank algorithm is designed to integrate these features and infer potential essential proteins. Finally, to evaluate the performance of KFPM, we compared it with 13 state-of-the-art prediction methods. Experimental results show that, among the top 1, 5, and 10% of candidate proteins predicted by KFPM, the prediction accuracy reaches 96.08, 83.14, and 70.59%, respectively, which significantly outperforms all 13 competing methods. This suggests that KFPM may be a meaningful tool for predicting potential essential proteins in the future.
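The "top 1, 5, and 10%" figures quoted above are most naturally read as the fraction of known essential proteins among the highest-ranked candidates. The sketch below assumes that reading; the function name `top_percent_precision` and the toy data are hypothetical, and the paper's exact evaluation protocol may differ.

```python
def top_percent_precision(scores, essential, percent):
    """Fraction of known essential proteins among the top `percent` of a ranking.

    scores    : dict mapping protein name -> predicted score
    essential : set of protein names known to be essential
    percent   : size of the top slice, as a percentage of all candidates
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    k = max(1, round(len(ranked) * percent / 100))   # keep at least one candidate
    hits = sum(1 for protein in ranked[:k] if protein in essential)
    return hits / k

# Toy ranking of four hypothetical proteins, two of which are essential.
toy_scores = {"a": 0.9, "b": 0.8, "c": 0.1, "d": 0.05}
toy_essential = {"a", "c"}
half = top_percent_precision(toy_scores, toy_essential, 50)      # top 2: a, b
quarter = top_percent_precision(toy_scores, toy_essential, 25)   # top 1: a
```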
Affiliation(s)
- Xin He
  - College of Computer, Xiangtan University, Xiangtan, China
- Linai Kuang
  - College of Computer, Xiangtan University, Xiangtan, China
- Zhiping Chen
  - College of Computer Engineering & Applied Mathematics, Changsha University, Changsha, China
- Yihong Tan
  - College of Computer Engineering & Applied Mathematics, Changsha University, Changsha, China
- Lei Wang
  - College of Computer, Xiangtan University, Xiangtan, China
  - College of Computer Engineering & Applied Mathematics, Changsha University, Changsha, China
|
5
|
Sunny S, Jayaraj PB. FPDock: Protein-protein docking using flower pollination algorithm. Comput Biol Chem 2021; 93:107518. [PMID: 34048986 DOI: 10.1016/j.compbiolchem.2021.107518] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 02/25/2021] [Revised: 05/11/2021] [Accepted: 05/16/2021] [Indexed: 11/25/2022]
Abstract
Proteins play a vital role in biological systems through interaction and complex formation with other biological molecules. Abnormalities in these interaction patterns affect the proteins' structure and have detrimental effects on living organisms. Research on structure prediction is important because the functions of proteins depend on their structures. Protein-protein docking is one of the computational methods devised to understand the interaction between proteins. Because the structure prediction problem is hard, metaheuristic algorithms are a promising approach. In this paper, a variant of the Flower Pollination Algorithm (FPA) is applied to obtain an accurate protein-protein complex structure. The algorithm starts from a randomly generated initial population, which evolves on different isolated islands, each trying to find its local optimum. The abiotic and biotic pollination applied in different generations brings diversity and intensity to the solutions. Each round of pollination applies an energy-based scoring function whose value influences whether a new solution is accepted. Analysis of the final predictions based on the CAPRI quality criteria shows that the proposed method has a success rate of 58% in the top 10 ranks, which is better than other methods such as SwarmDock, pyDock, and ZDOCK. Source code is available at: https://github.com/Sharon1989Sunny/_FPDock_.
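For orientation, the basic flower pollination loop that variants such as this one build on can be sketched as follows. This is a generic textbook-style FPA minimizing a toy function, not the FPDock variant: the island model and the energy-based docking score are not reproduced, and the names `flower_pollination` and `levy_step`, the step scale 0.1, and all parameter defaults are assumptions made for illustration.

```python
import math
import numpy as np

def levy_step(rng, size, beta=1.5):
    # Mantegna's algorithm for drawing Levy-distributed step lengths.
    sigma = (math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
             / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    u = rng.normal(0.0, sigma, size)
    v = rng.normal(0.0, 1.0, size)
    return u / np.abs(v) ** (1 / beta)

def flower_pollination(objective, dim, bounds, n_flowers=20, n_iter=200,
                       switch_prob=0.8, seed=0):
    """Minimize `objective` over a box with a basic flower pollination loop."""
    rng = np.random.default_rng(seed)
    low, high = bounds
    flowers = rng.uniform(low, high, (n_flowers, dim))
    fitness = np.array([objective(f) for f in flowers])
    best_idx = int(np.argmin(fitness))
    best, best_fit = flowers[best_idx].copy(), float(fitness[best_idx])
    for _ in range(n_iter):
        for i in range(n_flowers):
            if rng.random() < switch_prob:
                # Global (biotic) pollination: Levy flight relative to the best flower.
                candidate = flowers[i] + 0.1 * levy_step(rng, dim) * (best - flowers[i])
            else:
                # Local (abiotic) pollination: mix two randomly chosen flowers.
                j, k = rng.choice(n_flowers, size=2, replace=False)
                candidate = flowers[i] + rng.random() * (flowers[j] - flowers[k])
            candidate = np.clip(candidate, low, high)
            cand_fit = float(objective(candidate))
            if cand_fit < fitness[i]:            # greedy acceptance by score
                flowers[i], fitness[i] = candidate, cand_fit
                if cand_fit < best_fit:
                    best, best_fit = candidate.copy(), cand_fit
    return best, best_fit

def sphere(x):
    # Toy objective standing in for a real scoring function.
    return float(np.sum(x ** 2))

best_point, best_value = flower_pollination(sphere, dim=2, bounds=(-5.0, 5.0))
```

In a docking setting the objective would be replaced by the scoring function for a candidate complex, and the acceptance rule is where the score "influences the choice to accept a new solution".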
Affiliation(s)
- Sharon Sunny
  - Department of Computer Science and Engineering, National Institute of Technology Calicut, India
- P B Jayaraj
  - Department of Computer Science and Engineering, National Institute of Technology Calicut, India
|
6
|
Optimization of Feedforward Neural Networks Using an Improved Flower Pollination Algorithm for Short-Term Wind Speed Prediction. ENERGIES 2019. [DOI: 10.3390/en12214126] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Indexed: 11/16/2022]
Abstract
It is well known that the inherent instability of wind speed may jeopardize the safety and operation of wind power generation, consequently affecting the power dispatch efficiency of power systems. Accurate short-term wind speed prediction can therefore provide valuable information for solving the wind power grid connection problem. For this reason, the optimization of feedforward (FF) neural networks using an improved flower pollination algorithm is proposed. First, the empirical mode decomposition method is used to decompose the wind speed sequence into components of different frequencies, which decreases the volatility of the sequence. Second, a back propagation neural network is integrated with the improved flower pollination algorithm to predict the changing trend of each decomposed component. Finally, the predicted values of the components are superimposed to obtain an accurate prediction of wind speed. Compared with major existing neural network models, the performance tests confirm that the average absolute error using the proposed algorithm can be reduced up to 3.67%.
|
7
|
Lei X, Zhang C. Predicting metabolite-disease associations based on KATZ model. BioData Min 2019; 12:19. [PMID: 31673292 PMCID: PMC6815005 DOI: 10.1186/s13040-019-0206-z] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 07/14/2019] [Accepted: 09/12/2019] [Indexed: 11/17/2022] Open
Abstract
BACKGROUND: A growing body of evidence indicates that metabolites can respond to pathological changes. However, identifying disease-related metabolites is a major challenge in biology and medicine. Traditional medical equipment is not only limited in accuracy but also expensive and time-consuming. It is therefore necessary to use computational methods to predict potential associations between metabolites and diseases. RESULTS: In this study, we develop a computational method based on the KATZ algorithm to predict metabolite-disease associations (KATZMDA). First, we extract data on metabolite-disease pairs from the latest version of the HMDB database as the material for prediction. Then we use disease semantic similarity and an improved disease Gaussian Interaction Profile (GIP) kernel similarity to obtain a more reliable disease similarity and enhance the predictive performance of the proposed method. This is also the first application of the KATZ algorithm in metabolomics. CONCLUSIONS: According to three kinds of cross-validation and case studies of three common diseases, KATZMDA is an effective tool for predicting potential associations between metabolites and diseases.
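The Katz measure underlying this kind of method scores a pair of nodes by summing the walks between them, with longer walks damped by a factor beta per step. The sketch below is a generic truncated Katz computation on a heterogeneous network, assuming rows are metabolites and columns are diseases; the function name `katz_association_scores`, the values of `beta` and `max_len`, and the toy matrices are illustrative assumptions, not the settings or data used in KATZMDA.

```python
import numpy as np

def katz_association_scores(assoc, sim_rows, sim_cols, beta=0.01, max_len=3):
    """Truncated Katz scores between row entities and column entities.

    assoc    : (m, d) known association matrix (1 = known pair, 0 = unknown)
    sim_rows : (m, m) similarity among row entities (e.g. metabolites)
    sim_cols : (d, d) similarity among column entities (e.g. diseases)
    """
    assoc = np.asarray(assoc, dtype=float)
    n_rows = assoc.shape[0]
    # Heterogeneous adjacency: similarities on the diagonal blocks,
    # known associations on the off-diagonal blocks.
    full = np.block([[np.asarray(sim_rows, dtype=float), assoc],
                     [assoc.T, np.asarray(sim_cols, dtype=float)]])
    power = np.eye(full.shape[0])
    total = np.zeros_like(full)
    for length in range(1, max_len + 1):
        power = power @ full                 # walks of exactly `length` steps
        total += beta ** length * power      # damp longer walks
    # The upper-right block holds the row-entity to column-entity scores.
    return total[:n_rows, n_rows:]

# Toy example: two metabolites, two diseases, one known association each.
toy_assoc = np.array([[1, 0], [0, 1]])
toy_sim_rows = np.array([[1.0, 0.5], [0.5, 1.0]])
toy_sim_cols = np.eye(2)
toy_katz = katz_association_scores(toy_assoc, toy_sim_rows, toy_sim_cols)
```

In the toy example the unobserved pair (metabolite 0, disease 1) still receives a small positive score, because metabolite 0 is similar to metabolite 1, which is known to be associated with disease 1.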
Affiliation(s)
- Xiujuan Lei
  - School of Computer Science, Shaanxi Normal University, Xi’an, 710119 Shaanxi, China
- Cheng Zhang
  - School of Computer Science, Shaanxi Normal University, Xi’an, 710119 Shaanxi, China
|
8
|
Detecting the stable point of therapeutic effect of chronic myeloid leukemia based on dynamic network biomarkers. BMC Bioinformatics 2019; 20:202. [PMID: 31074387 PMCID: PMC6509869 DOI: 10.1186/s12859-019-2738-0] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 12/01/2022] Open
Abstract
Background: Most research on chronic myeloid leukemia (CML) currently focuses on treatment methods, while relatively little addresses the progression of patients’ condition after drug treatment. Traditional disease biomarkers can only distinguish the normal state from the disease state and cannot recognize the pre-stable state after drug treatment. Results: A therapeutic effect recognition strategy based on dynamic network biomarkers (DNB) is provided for CML patients’ gene expression data. Using the DNB criteria, a DNB of 250 genes is selected and a therapeutic effect index (TEI) is constructed for detecting disease in individuals. The pre-stable state before the disease condition becomes stable is 1 month. Through functional analysis of the DNB, some genes are confirmed as key genes affecting the progression of CML patients’ condition. Conclusions: The results provide a theoretical direction and basis for medical personnel treating CML patients and may help identify new therapeutic targets in the future. The biomarkers of CML can help patients to be treated promptly and minimize drug resistance, treatment failure and relapse, which reduces the mortality of CML significantly. Electronic supplementary material: The online version of this article (10.1186/s12859-019-2738-0) contains supplementary material, which is available to authorized users.
|
9
|
Lei X, Fang M, Guo L, Wu FX. Protein complex detection based on flower pollination mechanism in multi-relation reconstructed dynamic protein networks. BMC Bioinformatics 2019; 20:131. [PMID: 30925866 PMCID: PMC6440282 DOI: 10.1186/s12859-019-2649-0] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Indexed: 11/10/2022] Open
Abstract
Background: Detecting protein complexes in protein-protein interaction (PPI) networks plays a significant part in bioinformatics. It gives us a better understanding of the structures and characteristics of biological systems. Methods: In this study, we present a novel algorithm, named Improved Flower Pollination Algorithm (IFPA), to identify protein complexes in multi-relation reconstructed dynamic PPI networks. Specifically, we first introduce a concept called co-essentiality, which uses protein essentiality to search for essential interactions. Then, we devise the multi-relation reconstructed dynamic PPI networks (MRDPNs) and discover the potential cores of protein complexes in MRDPNs. Finally, the IFPA algorithm, based on the flower pollination mechanism, generates protein complexes by simulating the process of pollen finding the optimal plants to pollinate, that is, attaching the peripheries to the corresponding cores. Results: The experimental results on three different datasets (DIP, MIPS and Krogan) show that our IFPA algorithm is superior to some representative methods in the prediction of protein complexes. Conclusions: Our proposed IFPA algorithm is powerful in protein complex detection by building multi-relation reconstructed dynamic protein networks and using an improved flower pollination algorithm. The experimental results indicate that our IFPA algorithm can obtain better performance than other methods.
Affiliation(s)
- Xiujuan Lei
  - School of Computer Science, Shaanxi Normal University, 710119, Xi'an, China
- Ming Fang
  - School of Computer Science, Shaanxi Normal University, 710119, Xi'an, China
- Ling Guo
  - College of Life Sciences, Shaanxi Normal University, 710119, Xi'an, China
- Fang-Xiang Wu
  - Department of Mechanical Engineering and Division of Biomedical Engineering, University of Saskatchewan, Saskatoon, SK, S7N 5A9, Canada
|
10
|
Feature Selection via Swarm Intelligence for Determining Protein Essentiality. MOLECULES (BASEL, SWITZERLAND) 2018; 23:1569. [PMID: 29958434 PMCID: PMC6100311 DOI: 10.3390/molecules23071569] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 05/25/2018] [Revised: 06/22/2018] [Accepted: 06/25/2018] [Indexed: 01/24/2023]
Abstract
Protein essentiality is fundamental to understanding the function and evolution of genes. Predicting protein essentiality is pivotal in identifying disease genes and potential drug targets. Since experimental methods require large investments of time and funds, it is of great value to predict protein essentiality with high accuracy using computational methods. In this study, we present a novel feature selection method named Elite Search mechanism-based Flower Pollination Algorithm (ESFPA) to determine protein essentiality. Unlike other protein essentiality prediction methods, ESFPA uses an improved swarm intelligence-based algorithm for feature selection and selects optimal features for protein essentiality prediction. The first step is to collect numerous features that are highly predictive of essentiality. The second step is to develop a feature selection strategy based on a swarm intelligence algorithm to obtain the optimal feature subset. Furthermore, an elite search mechanism is adopted to further improve the quality of the feature subset. Subsequently, a hybrid classifier is applied to evaluate the essentiality of each protein. Finally, the experimental results show that our method is competitive with some well-known feature selection methods. The proposed method aims to provide a new perspective for protein essentiality determination.
|