1
Lasantha D, Vidanagamachchi S, Nallaperuma S. CRIECNN: Ensemble convolutional neural network and advanced feature extraction methods for the precise forecasting of circRNA-RBP binding sites. Comput Biol Med 2024; 174:108466. [PMID: 38615462 DOI: 10.1016/j.compbiomed.2024.108466] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/27/2023] [Revised: 03/29/2024] [Accepted: 04/08/2024] [Indexed: 04/16/2024]
Abstract
Circular RNAs (circRNAs) have surfaced as important non-coding RNA molecules in biology. Understanding interactions between circRNAs and RNA-binding proteins (RBPs) is crucial in circRNA research. Existing prediction models suffer from limited availability and accuracy, necessitating advanced approaches. In this study, we propose CRIECNN (Circular RNA-RBP Interaction predictor using an Ensemble Convolutional Neural Network), a novel ensemble deep learning model that enhances circRNA-RBP binding site prediction accuracy. CRIECNN employs advanced feature extraction methods and evaluates four distinct sequence datasets and encoding techniques (BERT, Doc2Vec, KNF, EIIP). The model consists of an ensemble convolutional neural network, a BiLSTM, and a self-attention mechanism for feature refinement. Our results demonstrate that CRIECNN outperforms state-of-the-art methods in accuracy and performance, effectively predicting circRNA-RBP interactions from both full-length sequences and fragments. This novel strategy makes an enormous advancement in the prediction of circRNA-RBP interactions, improving our understanding of circRNAs and their regulatory roles.
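Two of the encodings named in this abstract, EIIP and KNF, are simple enough to illustrate. The sketch below is not the CRIECNN implementation: it assumes KNF means k-nucleotide frequency, uses the commonly cited EIIP values for A, C, G and T, and reuses the T value for U as an assumption for RNA input. The function names `eiip_encode` and `knf` are hypothetical.

```python
from collections import Counter
from itertools import product

# Commonly cited EIIP values for nucleotides.
# Assumption: U (RNA) reuses the value usually quoted for T (DNA).
EIIP = {"A": 0.1260, "C": 0.1340, "G": 0.0806, "T": 0.1335, "U": 0.1335}


def eiip_encode(seq):
    """Map each nucleotide to its EIIP value (raises KeyError on unknown bases)."""
    return [EIIP[base] for base in seq.upper()]


def knf(seq, k=2):
    """Return the frequency of every possible k-mer over A, C, G, U.

    The output always has 4**k entries, ordered lexicographically over "ACGU".
    """
    seq = seq.upper().replace("T", "U")
    kmers = ["".join(p) for p in product("ACGU", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = max(len(seq) - k + 1, 1)  # avoid division by zero on short input
    return [counts[kmer] / total for kmer in kmers]
```

EIIP yields one value per position, so it suits convolutional inputs, whereas the k-mer frequency vector has a fixed length regardless of sequence length.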
Affiliation(s)
- Dilan Lasantha
- Department of Computer Science, University of Ruhuna, Sri Lanka.
- Sam Nallaperuma
- Department of Engineering, University of Cambridge, United Kingdom.
2
Singh J, Khanna NN, Rout RK, Singh N, Laird JR, Singh IM, Kalra MK, Mantella LE, Johri AM, Isenovic ER, Fouda MM, Saba L, Fatemi M, Suri JS. GeneAI 3.0: powerful, novel, generalized hybrid and ensemble deep learning frameworks for miRNA species classification of stationary patterns from nucleotides. Sci Rep 2024; 14:7154. [PMID: 38531923 DOI: 10.1038/s41598-024-56786-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/11/2023] [Accepted: 03/11/2024] [Indexed: 03/28/2024] Open
Abstract
Due to the intricate relationship between small non-coding ribonucleic acid (miRNA) sequences, the classification of miRNA species, namely Human, Gorilla, Rat, and Mouse, is challenging. Previous methods are neither robust nor accurate. In this study, we present AtheroPoint's GeneAI 3.0, a powerful, novel, and generalized method for extracting features from the fixed patterns of purines and pyrimidines in each miRNA sequence in ensemble paradigms in machine learning (EML) and convolutional neural network (CNN)-based deep learning (EDL) frameworks. GeneAI 3.0 utilized five conventional (Entropy, Dissimilarity, Energy, Homogeneity, and Contrast) and three contemporary (Shannon entropy, Hurst exponent, Fractal dimension) features to generate a composite feature set from given miRNA sequences, which was then passed into our ML and DL classification framework. A set of 11 new classifiers was designed, consisting of 5 EML and 6 EDL, for binary/multiclass classification. It was benchmarked against 9 solo ML (SML), 6 solo DL (SDL), and 12 hybrid DL (HDL) models, giving a total of 11 + 27 = 38 models. Four hypotheses were formulated and validated using explainable AI (XAI) as well as reliability/statistical tests. The order of the mean performance using accuracy (ACC)/area-under-the-curve (AUC) of the 24 DL classifiers was: EDL > HDL > SDL. The mean performance of EDL models with CNN layers was superior to that without CNN layers by 0.73%/0.92%. Mean performance of EML models was superior to SML models with improvements of ACC/AUC by 6.24%/6.46%. EDL models performed significantly better than EML models, with a mean increase in ACC/AUC of 7.09%/6.96%. The GeneAI 3.0 tool produced expected XAI feature plots, and the statistical tests showed significant p-values. Ensemble models with composite features are highly effective and generalized models for classifying miRNA sequences.
Affiliation(s)
- Jaskaran Singh
- Department of Computer Science, Graphic Era Deemed to be University, Dehradun, Uttarakhand, India
- Narendra N Khanna
- Department of Cardiology, Indraprastha APOLLO Hospitals, New Delhi, India
- Ranjeet K Rout
- Department of Computer Science and Engineering, NIT Srinagar, Hazratbal, Srinagar, India
- Narpinder Singh
- Department of Food Science, Graphic Era Deemed to be University, Dehradun, Uttarakhand, India
- John R Laird
- Heart and Vascular Institute, Adventist Health St. Helena, St Helena, CA, USA
- Inder M Singh
- Advanced Cardiac and Vascular Institute, Sacramento, CA, USA
- Mannudeep K Kalra
- Department of Radiology, Massachusetts General Hospital, Boston, MA, 02115, USA
- Laura E Mantella
- Department of Biomedical and Molecular Sciences, Queen's University, Kingston, ON, Canada
- Amer M Johri
- Department of Biomedical and Molecular Sciences, Queen's University, Kingston, ON, Canada
- Esma R Isenovic
- Laboratory for Molecular Genetics and Radiobiology, University of Belgrade, Belgrade, Serbia
- Mostafa M Fouda
- Department of Electrical and Computer Engineering, Idaho State University, Pocatello, ID, 83209, USA
- Luca Saba
- Department of Neurology, University of Cagliari, Cagliari, Italy
- Mostafa Fatemi
- Department of Physiology and Biomedical Engineering, Mayo Clinic, Rochester, MN, 55905, USA
- Jasjit S Suri
- Stroke Monitoring and Diagnostic Division, AtheroPoint LLC, Roseville, CA, 95661, USA.
3
Yu L, Zhang Y, Xue L, Liu F, Jing R, Luo J. EnsembleDL-ATG: Identifying autophagy proteins by integrating their sequence and evolutionary information using an ensemble deep learning framework. Comput Struct Biotechnol J 2023; 21:4836-4848. [PMID: 37854634 PMCID: PMC10579870 DOI: 10.1016/j.csbj.2023.09.036] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/09/2023] [Revised: 09/26/2023] [Accepted: 09/27/2023] [Indexed: 10/20/2023] Open
Abstract
Autophagy is a primary mechanism for maintaining cellular homeostasis. The synergistic actions of autophagy-related (ATG) proteins strictly regulate the whole autophagic process. Therefore, accurate identification of ATGs is a first and critical step to reveal the molecular mechanism underlying the regulation of autophagy. Current computational methods can predict ATGs from primary protein sequences, but owing to the limitations of algorithms, significant room for improvement still exists. In this research, we propose EnsembleDL-ATG, an ensemble deep learning framework that aggregates multiple deep learning models to predict ATGs from protein sequence and evolutionary information. We first evaluated the performance of individual networks for various feature descriptors to identify the most promising models. Then, we explored all possible combinations of independent models to select the most effective ensemble architecture. The final framework was built from a combination of four different deep learning models. Experimental results show that our proposed method achieves a prediction accuracy of 94.5% and an MCC of 0.890, which are nearly 4% and 0.08 higher than ATGPred-FL, respectively. Overall, EnsembleDL-ATG is the first ATG machine learning predictor based on ensemble deep learning. The benchmark data and code utilized in this study can be accessed for free at https://github.com/jingry/autoBioSeqpy/tree/2.0/examples/EnsembleDL-ATG.
Affiliation(s)
- Lezheng Yu
- School of Chemistry and Materials Science, Guizhou Education University, Guiyang 550018, Guizhou, China
- Basic Medical College, Southwest Medical University, Luzhou 646000, Sichuan, China
- Yonglin Zhang
- Department of Pharmacy, The Affiliated Hospital of North Sichuan Medical College, Nanchong 637000, Sichuan, China
- Li Xue
- School of Public Health, Southwest Medical University, Luzhou 646000, Sichuan, China
- Fengjuan Liu
- School of Geography and Resources, Guizhou Education University, Guiyang 550018, Guizhou, China
- Runyu Jing
- School of Cyber Science and Engineering, Sichuan University, Chengdu 610065, Sichuan, China
- Jiesi Luo
- Basic Medical College, Southwest Medical University, Luzhou 646000, Sichuan, China
- Sichuan Key Medical Laboratory of New Drug Discovery and Druggability Evaluation, Luzhou Key Laboratory of Activity Screening and Druggability Evaluation for Chinese Materia Medica, Southwest Medical University, Luzhou 646000, Sichuan, China
4
Tebong NK, Simo T, Takougang AN, Ntanguen PH. STL-decomposition ensemble deep learning models for daily reservoir inflow forecast for hydroelectricity production. Heliyon 2023; 9:e16456. [PMID: 37303512 PMCID: PMC10248095 DOI: 10.1016/j.heliyon.2023.e16456] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/02/2023] [Revised: 05/16/2023] [Accepted: 05/17/2023] [Indexed: 06/13/2023] Open
Abstract
Accurate reservoir inflow forecasting is crucial for efficient water management. In this study, different deep learning models, including Dense, Long short-term memory (LSTM), and one-dimensional convolutional neural networks (Conv1D), were used to build ensembles. Seasonal-trend decomposition using loess (STL) was applied to decompose reservoir inflows and precipitation into random, seasonal, and trend components. Seven ensemble models, namely STL-Dense, STL-Conv1D, STL-LSTM, STL-Dense-LSTM-Conv1D, STL-Dense multivariate, STL-LSTM multivariate, and STL-Conv1D multivariate, were proposed and evaluated using decomposed daily inflow and precipitation data from the Lom Pangar reservoir from 2015 to 2020. Evaluation metrics, such as Mean Absolute Error (MAE), Root Mean Square Error (RMSE), Mean Absolute Percentage Error (MAPE), and Nash-Sutcliffe Efficiency (NSE), were applied to assess model performance. Results showed that the STL-Dense multivariate model was the best ensemble among the thirteen models, with an MAE of 14.636 m³/s, RMSE of 20.841 m³/s, MAPE of 6.622%, and NSE of 0.988. These findings stress the importance of considering multiple inputs and models for accurate reservoir inflow forecasting and optimal water management. Not all ensemble models were good for the Lom Pangar inflow forecast, as the Dense, Conv1D, and LSTM models performed better than their proposed STL monovariate ensemble models.
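The four evaluation metrics named in this abstract have standard definitions, sketched below in plain Python. This is an illustration, not the authors' evaluation code. The function names are hypothetical, and MAPE assumes that no observed value is zero.

```python
import math


def mae(obs, pred):
    """Mean Absolute Error, in the units of the data (e.g. m^3/s)."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)


def rmse(obs, pred):
    """Root Mean Square Error, in the units of the data."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def mape(obs, pred):
    """Mean Absolute Percentage Error in percent (observations must be non-zero)."""
    return 100.0 * sum(abs((o - p) / o) for o, p in zip(obs, pred)) / len(obs)


def nse(obs, pred):
    """Nash-Sutcliffe Efficiency: 1 is a perfect fit, 0 matches the mean of obs."""
    mean_obs = sum(obs) / len(obs)
    residual = sum((o - p) ** 2 for o, p in zip(obs, pred))
    variance = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - residual / variance
```

An NSE close to 1, such as the 0.988 reported above, means the squared forecast error is small relative to the variance of the observed inflows.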
Affiliation(s)
- Njogho Kenneth Tebong
- Research Unit Condensed Matter, Electronics and Signal Processing, Department of Physics, Faculty of Sciences, University of Dschang, PO Box 67, Dschang, Cameroon
- Laboratory of Industrial Systems and Environmental Engineering, Fotso Victor University Institute of Technology, University of Dschang, Bandjoun, Cameroon
- Théophile Simo
- Laboratory of Industrial Systems and Environmental Engineering, Fotso Victor University Institute of Technology, University of Dschang, Bandjoun, Cameroon
- Institut Universitaire de Technologie Fotso Victor de Bandjoun, B.P.: 134 Bandjoun, Cameroon
- Armand Nzeukou Takougang
- Laboratory of Industrial Systems and Environmental Engineering, Fotso Victor University Institute of Technology, University of Dschang, Bandjoun, Cameroon
- Patrick Herve Ntanguen
- Research Unit Condensed Matter, Electronics and Signal Processing, Department of Physics, Faculty of Sciences, University of Dschang, PO Box 67, Dschang, Cameroon
- Laboratory of Industrial Systems and Environmental Engineering, Fotso Victor University Institute of Technology, University of Dschang, Bandjoun, Cameroon
5
Bhosale YH, Patnaik KS. PulDi-COVID: Chronic obstructive pulmonary (lung) diseases with COVID-19 classification using ensemble deep convolutional neural network from chest X-ray images to minimize severity and mortality rates. Biomed Signal Process Control 2023; 81:104445. [PMID: 36466567 DOI: 10.1016/j.bspc.2022.104445] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 04/17/2022] [Revised: 10/10/2022] [Accepted: 11/20/2022] [Indexed: 12/05/2022]
Abstract
Background and Objective In the current COVID-19 outbreak, efficient testing of COVID-19 individuals has proven vital to limiting and arresting the disease's accelerated spread globally. Patients with chronic pulmonary diseases have been observed to face greater COVID-19 severity and mortality. This study looks at radiographic examinations exploiting chest X-ray images (CXI), which have become one of the most feasible assessment approaches for pulmonary disorders, including COVID-19. Deep learning (DL) remains an excellent image classification method and framework; research has been conducted to predict pulmonary diseases with COVID-19 instances by developing DL classifiers with nine-class CXI. However, although a few claim to have strong prediction results, because of noisy and small data their recommended DL strategies may suffer from significant deviation and generality failures. Methods Therefore, a unique CNN model (PulDi-COVID) for detecting nine diseases (atelectasis, bacterial-pneumonia, cardiomegaly, covid19, effusion, infiltration, no-finding, pneumothorax, viral-Pneumonia) using CXI has been proposed using the SSE algorithm. Several transfer-learning models (VGG16, ResNet50, VGG19, DenseNet201, MobileNetV2, NASNetMobile, ResNet152V2, DenseNet169) are trained on CXI of chronic lung diseases and COVID-19 instances. Given that the proposed thirteen SSE ensemble models solved DL's constraints by making predictions with different classifiers rather than a single one, we present PulDi-COVID, an ensemble DL model that combines DL with ensemble learning. The PulDi-COVID framework is created by combining various snapshots of DL models through the suggested SSE method to identify chronic lung diseases together with COVID-19 cases from CXI, drawing on the different perceptions that the DL models have of different classes.
Results PulDi-COVID findings were compared to thirteen existing studies for nine-class classification using COVID-19. Test results reveal that PulDi-COVID offers impressive outcomes for chronic diseases with COVID-19 identification, with a 99.70% accuracy, 98.68% precision, 98.67% recall, 98.67% F1 score, a lowest zero-one loss of 12 CXIs, 99.24% AUC-ROC score, and a lowest error rate of 1.33%. Overall test results are superior to the existing Convolutional Neural Network (CNN). To the best of our knowledge, the observed results for nine-class classification are significantly superior to the state-of-the-art approaches employed for COVID-19 detection. Furthermore, the CXI that we used to assess our algorithm is one of the larger datasets for COVID detection with pulmonary diseases. Conclusion The empirical findings of our suggested approach PulDi-COVID show that it outperforms previously developed methods. The suggested SSE method with PulDi-COVID can effectively fulfill the need for speedy detection of COVID-19 alongside different lung diseases, helping physicians to minimize patient severity and mortality.
6
Aybey E, Gümüş Ö. SENSDeep: An Ensemble Deep Learning Method for Protein-Protein Interaction Sites Prediction. Interdiscip Sci 2023; 15:55-87. [PMID: 36346583 DOI: 10.1007/s12539-022-00543-x] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 04/04/2022] [Revised: 10/15/2022] [Accepted: 10/17/2022] [Indexed: 11/09/2022]
Abstract
PURPOSE The determination of which amino acid in a protein interacts with other proteins is important in understanding the functional mechanism of that protein. Although there are experimental methods to detect protein-protein interaction sites (PPISs), these are costly, time-consuming, and require expertise. Therefore, many computational methods have been proposed to accelerate this type of research, but they are generally insufficient to predict PPISs accurately. There is a need for development in this field. METHODS In this study, we introduce a new PPISs prediction method. This method is a sequence-based Stacking ENSemble Deep (SENSDeep) learning method that has an ensemble learning model including the models of RNN, CNN, GRU sequence to sequence (GRUs2s), GRU sequence to sequence with an attention layer (GRUs2satt) and a multilayer perceptron. Two embedded features, secondary structure, and protein sequence information are added to the training data set in addition to twelve existing features to improve the prediction performance of the method. RESULTS SENSDeep trained on the training data set without two extra features obtains a better performance on some of the independent testing data sets than that of the other methods in the literature, especially on scoring metrics of sensitivity, F1, MCC, and AUPRC, having increments up to 63.5%, 19.3%, 18.5%, 11.4%, respectively. It is shown that the added extra features improve the performance of the method by having almost the same performance with less data as the method trained on the data set without these added features. On the other hand, different sizes of the sliding window are tried on the data sets and an optimal sliding window size for SENSDeep is found. Moreover, SENSDeep has also been compared to structure-based methods. Some of these methods have been found to perform better. 
Using SENSDeep obtained by training with both training data sets, PPISs prediction examples of various proteins that are not in these training data sets are also presented. Furthermore, execution times for SENSDeep and its submodels are shown. AVAILABILITY AND IMPLEMENTATION https://github.com/enginaybey/SENSDeep.
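The abstract mentions tuning a sliding window over the sequence. The sketch below shows one common way to build per-residue windows and is not taken from the SENSDeep code. The function name `sliding_windows`, the padding symbol `X` and the default window size are assumptions chosen for illustration.

```python
def sliding_windows(sequence, window=7, pad="X"):
    """Return one window per residue, centred on that residue.

    The sequence is padded on both sides so that terminal residues also get
    a full-length window. `window` must be odd so a centre position exists.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    half = window // 2
    padded = pad * half + sequence + pad * half
    return [padded[i:i + window] for i in range(len(sequence))]
```

Each window would then be encoded and labelled by whether its centre residue is an interaction site, so a larger window gives the model more sequence context at the cost of a larger input.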
Affiliation(s)
- Engin Aybey
- Department of Health Bioinformatics, Ege University, 35100, Bornova, Izmir, Turkey.
- Rectorate, Marmara University, 34722, Kadıköy, Istanbul, Turkey.
- Özgür Gümüş
- Department of Computer Engineering, Ege University, 35100, Bornova, Izmir, Turkey
7
Kim DH, Chai JW, Kang JH, Lee JH, Kim HJ, Seo J, Choi JW. Ensemble deep learning model for predicting anterior cruciate ligament tear from lateral knee radiograph. Skeletal Radiol 2022; 51:2269-2279. [PMID: 35792956 DOI: 10.1007/s00256-022-04081-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/25/2022] [Revised: 05/24/2022] [Accepted: 05/24/2022] [Indexed: 02/02/2023]
Abstract
OBJECTIVE To develop an ensemble deep learning model (DLM) predicting anterior cruciate ligament (ACL) tears from lateral knee radiographs and to evaluate its diagnostic performance. MATERIALS AND METHODS In this study, 1433 lateral knee radiographs (661 with ACL tear confirmed on MRI, 772 normal) from two medical centers were split into training (n = 1146) and test sets (n = 287). Three single DLMs respectively classifying radiographs with ACL tears, abnormal lateral femoral notches, and joint effusion were developed. An ensemble DLM predicting ACL tears was developed by combining the three DLMs via stacking method. The sensitivities, specificities, and area under the receiver operating characteristic curves (AUCs) of the DLMs and three radiologists were compared using McNemar test and Delong test. Subgroup analysis was performed to identify the radiologic features associated with the sensitivity. RESULTS The sensitivity, specificity, and AUC of the ensemble DLM were 86.8% (95% confidence interval [CI], 79.9-92.0%), 89.4% (95% CI, 83.4-93.8%), and 0.927 (95% CI, 0.891-0.954), achieving diagnostic performance comparable with that of a musculoskeletal radiologist (P = 0.193, McNemar test; P = 0.131, Delong test). The AUC of the ensemble DLM was significantly higher than those of non-musculoskeletal radiologists (P = 0.043, P < 0.001). The sensitivity of the DLM was higher than that of the radiologists in the absence of an abnormal lateral femoral notch or joint effusion. CONCLUSION The diagnostic performance of the ensemble DLM in predicting lateral knee radiographs with ACL tears was comparable to that of a musculoskeletal radiologist.
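Sensitivity and specificity, the two headline metrics in this abstract, follow directly from a binary confusion matrix. The sketch below is illustrative only. The function names are hypothetical, and the labels in the usage note are toy values, not the study's data.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, FN, TN, FP for binary labels (1 = tear, 0 = normal)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp, fn, tn, fp


def sensitivity(tp, fn):
    """Fraction of true positives among all actual positives."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Fraction of true negatives among all actual negatives."""
    return tn / (tn + fp)
```

With toy labels `y_true = [1, 1, 1, 1, 0, 0, 0, 0]` and `y_pred = [1, 1, 1, 0, 0, 0, 0, 1]`, both metrics are 3/4.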
Affiliation(s)
- Dong Hyun Kim
- Department of Radiology, SMG-SNU Boramae Medical Center, Seoul National University College of Medicine, Seoul, Republic of Korea
- Jee Won Chai
- Department of Radiology, SMG-SNU Boramae Medical Center, Seoul National University College of Medicine, Seoul, Republic of Korea
- Ji Hee Kang
- Department of Radiology, Konkuk University Medical Center, 120-1 Neungdong-ro, Gwangjin-gu, Seoul, 05030, Republic of Korea.
- Ji Hyun Lee
- Department of Radiology, SMG-SNU Boramae Medical Center, Seoul National University College of Medicine, Seoul, Republic of Korea
- Hyo Jin Kim
- Department of Radiology, SMG-SNU Boramae Medical Center, Seoul National University College of Medicine, Seoul, Republic of Korea
- Jiwoon Seo
- Department of Radiology, SMG-SNU Boramae Medical Center, Seoul National University College of Medicine, Seoul, Republic of Korea
- Jae Won Choi
- Armed Forces Yangju Hospital, Yangju, Republic of Korea
- Department of Radiology, Seoul National University Hospital, Seoul, Republic of Korea
8
Layeghian Javan S, Sepehri MM. A predictive framework in healthcare: Case study on cardiac arrest prediction. Artif Intell Med 2021; 117:102099. [PMID: 34127237 DOI: 10.1016/j.artmed.2021.102099] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/25/2020] [Revised: 04/28/2021] [Accepted: 05/05/2021] [Indexed: 11/24/2022]
Abstract
Data-driven healthcare uses predictive analytics to enhance decision-making and personalized healthcare. Developing prognostic models is one of the applications of predictive analytics in medical environments. Various studies have used machine learning techniques for this purpose. However, there is no specific standard for choosing prediction models for different medical purposes. In this paper, the ISAF framework was proposed for choosing appropriate prediction models with regard to the properties of the classification methods. As one of the case study applications, a prognostic model for predicting cardiac arrests in sepsis patients was developed step by step through the ISAF framework. Finally, a new modified stacking model produced the best results. We predict 85% of cardiac arrest cases one hour before the incidence (sensitivity ≥ 0.85) and 73% of arrest cases 25 h before the occurrence (sensitivity ≥ 0.73). The results indicated that the proposed prognostic model has significantly improved the prediction results compared to the two standard systems of APACHE II and MEWS. Furthermore, compared to previous research, the proposed model has extended the prediction interval and improved the performance criteria.
9
Wang J, Zhao Y, Gong W, Liu Y, Wang M, Huang X, Tan J. EDLMFC: an ensemble deep learning framework with multi-scale features combination for ncRNA-protein interaction prediction. BMC Bioinformatics 2021; 22:133. [PMID: 33740884 PMCID: PMC7980572 DOI: 10.1186/s12859-021-04069-9] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 01/23/2021] [Accepted: 03/05/2021] [Indexed: 11/29/2022] Open
Abstract
Background Non-coding RNA (ncRNA) and protein interactions play essential roles in various physiological and pathological processes. The experimental methods used for predicting ncRNA–protein interactions are time-consuming and labor-intensive. Therefore, there is an increasing demand for computational methods to accurately and efficiently predict ncRNA–protein interactions. Results In this work, we presented an ensemble deep learning-based method, EDLMFC, to predict ncRNA–protein interactions using the combination of multi-scale features, including primary sequence features, secondary structure sequence features, and tertiary structure features. Conjoint k-mer was used to extract protein/ncRNA sequence features, which were integrated with tertiary structure features and then fed into an ensemble deep learning model that combined a convolutional neural network (CNN), to learn dominating biological information, with a bi-directional long short-term memory network (BLSTM), to capture long-range dependencies among the features identified by the CNN. Compared with other state-of-the-art methods under five-fold cross-validation, EDLMFC shows the best performance, with accuracies of 93.8%, 89.7%, and 86.1% on the RPI1807, NPInter v2.0, and RPI488 datasets, respectively. The results of the independent test demonstrated that EDLMFC can effectively predict potential ncRNA–protein interactions from different organisms. Furthermore, EDLMFC is also shown to successfully predict hub ncRNAs and proteins in ncRNA–protein networks of Mus musculus. Conclusions In general, our proposed method EDLMFC improved the accuracy of ncRNA–protein interaction predictions and is expected to provide helpful guidance for research on ncRNA functions. The source code of EDLMFC and the datasets used in this work are available at https://github.com/JingjingWang-87/EDLMFC. Supplementary Information The online version contains supplementary material available at 10.1186/s12859-021-04069-9.
Affiliation(s)
- Jingjing Wang
- Department of Biomedical Engineering, Faculty of Environment and Life, Beijing International Science and Technology Cooperation Base for Intelligent Physiological Measurement and Clinical Transformation, Beijing University of Technology, Beijing, 100124, China
- Yanpeng Zhao
- Department of Biomedical Engineering, Faculty of Environment and Life, Beijing International Science and Technology Cooperation Base for Intelligent Physiological Measurement and Clinical Transformation, Beijing University of Technology, Beijing, 100124, China
- Weikang Gong
- Department of Biomedical Engineering, Faculty of Environment and Life, Beijing International Science and Technology Cooperation Base for Intelligent Physiological Measurement and Clinical Transformation, Beijing University of Technology, Beijing, 100124, China
- Yang Liu
- Department of Biomedical Engineering, Faculty of Environment and Life, Beijing International Science and Technology Cooperation Base for Intelligent Physiological Measurement and Clinical Transformation, Beijing University of Technology, Beijing, 100124, China
- Mei Wang
- Department of Biomedical Engineering, Faculty of Environment and Life, Beijing International Science and Technology Cooperation Base for Intelligent Physiological Measurement and Clinical Transformation, Beijing University of Technology, Beijing, 100124, China
- Xiaoqian Huang
- Department of Biomedical Engineering, Faculty of Environment and Life, Beijing International Science and Technology Cooperation Base for Intelligent Physiological Measurement and Clinical Transformation, Beijing University of Technology, Beijing, 100124, China
- Jianjun Tan
- Department of Biomedical Engineering, Faculty of Environment and Life, Beijing International Science and Technology Cooperation Base for Intelligent Physiological Measurement and Clinical Transformation, Beijing University of Technology, Beijing, 100124, China.
10
El Asnaoui K. Design ensemble deep learning model for pneumonia disease classification. Int J Multimed Inf Retr 2021; 10:55-68. [PMID: 33643764 PMCID: PMC7896551 DOI: 10.1007/s13735-021-00204-7] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 11/11/2020] [Revised: 01/06/2021] [Accepted: 01/19/2021] [Indexed: 05/23/2023]
Abstract
With the recent spread of the SARS-CoV-2 virus, computer-aided diagnosis (CAD) has received more attention. The most important CAD application is to detect and classify pneumonia diseases using X-ray images, especially in a critical period such as the COVID-19 pandemic, COVID-19 being a kind of pneumonia. In this work, we aim to evaluate the performance of single and ensemble learning models for pneumonia disease classification. The ensembles used are mainly based on fine-tuned versions of InceptionResNet_V2, ResNet50 and MobileNet_V2. We collected a new dataset containing 6087 chest X-ray images, on which we conduct comprehensive experiments. As a result, for a single model, we found that InceptionResNet_V2 achieves an F1 score of 93.52%. In addition, the ensemble of 3 models (ResNet50 with MobileNet_V2 with InceptionResNet_V2) is more accurate than the other ensembles constructed (F1 score of 94.84%).
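The abstract does not state how the three networks are combined. One common option is soft voting, which averages the class probabilities of each model. The sketch below assumes that scheme purely for illustration, and the function names `soft_vote` and `predict_classes` are hypothetical.

```python
def soft_vote(prob_sets):
    """Average class probabilities across models.

    prob_sets[m][i][c] is the probability that model m assigns to class c
    for sample i. All models must score the same samples and classes.
    """
    n_models = len(prob_sets)
    n_samples = len(prob_sets[0])
    averaged = []
    for i in range(n_samples):
        n_classes = len(prob_sets[0][i])
        averaged.append(
            [sum(model[i][c] for model in prob_sets) / n_models
             for c in range(n_classes)]
        )
    return averaged


def predict_classes(averaged):
    """Pick the class index with the highest averaged probability per sample."""
    return [max(range(len(row)), key=row.__getitem__) for row in averaged]
```

Averaging lets two confident models outvote a third that disagrees, which is one reason an ensemble can edge out its best single member.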
Affiliation(s)
- Khalid El Asnaoui
- National School of Applied Sciences (ENSA), Department of Computer Sciences, Mohammed First University, BP: 669, 60000 Oujda, Morocco
11
Ma M, Sun C, Mao Z, Chen X. Ensemble deep learning with multi-objective optimization for prognosis of rotating machinery. ISA Trans 2020; 113:S0019-0578(20)30391-8. [PMID: 34756307 DOI: 10.1016/j.isatra.2020.09.017] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 09/25/2019] [Revised: 09/28/2020] [Accepted: 09/30/2020] [Indexed: 06/13/2023]
Abstract
With the emergence of the Internet of Things and smart sensing techniques, enormous amounts of monitoring data have been collected by prognostics and health management (PHM) systems. Predicting the remaining useful life (RUL) of mechanical components from monitoring data has always been a challenging task in many industries, yet determining RUL accurately is identified as one of the most demanded outcomes of PHM systems. In this study, an ensemble deep learning with multi-objective optimization (EDL-MO) method is proposed for RUL prediction. A novel ensemble deep learning algorithm for RUL prediction is designed by combining accuracy and diversity. By introducing diversity, uncorrelated error is produced in each individual iteration, and prediction performance is improved by evolving deep networks. The presented EDL-MO employs evolutionary optimization to optimize the two conflicting objectives, that is, diversity and accuracy. To validate the proposed algorithm, bearing run-to-failure experiments were carried out under constant load. The vibration signals are recorded and used to predict the RUL with the proposed EDL-MO method, as well as with other existing methods for performance comparison. The analysis shows that EDL-MO outperforms the current algorithms in predicting the RUL of rotating machinery.
Affiliation(s)
- Meng Ma
- School of Mechanical Engineering, Xi'an Jiaotong University, Xi'an, 710049, PR China
- School of Mechanical Engineering, University of Massachusetts Lowell, MA, 01854, USA
- Chuang Sun
- School of Mechanical Engineering, Xi'an Jiaotong University, Xi'an, 710049, PR China.
- Zhu Mao
- School of Mechanical Engineering, University of Massachusetts Lowell, MA, 01854, USA
- Xuefeng Chen
- School of Mechanical Engineering, Xi'an Jiaotong University, Xi'an, 710049, PR China
12
Zhao Y, Du X. econvRBP: Improved ensemble convolutional neural networks for RNA binding protein prediction directly from sequence. Methods 2020; 181-182:15-23. [PMID: 31513916 DOI: 10.1016/j.ymeth.2019.09.008] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/21/2019] [Revised: 08/21/2019] [Accepted: 09/05/2019] [Indexed: 10/26/2022] Open
Abstract
RNA binding proteins (RBPs) determine RNA processes from synthesis to decay and play a key role in RNA transport, translation and degradation. Therefore, exploring RBPs' function from the amino acid sequence using computational methods has become one of the momentous topics in genome annotation. However, some challenges remain: (1) Shallow features: although it is self-evident that sequence determines structure, it is difficult to extract the essential features from the raw sequence. (2) Poor understanding: feature-based prediction methods mainly emphasize feature extraction, while the limited in-depth understanding of proteins restricts the application of feature engineering. (3) Feature fusion: multi-feature fusion is often used, but the features are not well integrated. In view of these challenges, we propose a novel ensemble convolutional neural network (econvRBP) to predict RBPs. To capture the local and global features of RNA binding proteins simultaneously, One Hot and Conjoint Triad encoding methods are first used to transform the amino acid sequence into local and global features, respectively. The local and global features are then combined for further high-level feature extraction using convolutional neural networks. Experiments with 10-fold cross-validation show that our method achieves the best performance among all the predictors so far. We correctly predicted 99% of 2875 RBPs and 99% of 6782 non-RBPs with an accuracy of 0.99. In addition, the datasets provided by RBPPred are also used to validate our models, with an accuracy of 0.87. These results indicate that econvRBP is the best-performing method at present and will provide reliable guidance for the detection of RBPs. econvRBP is available at http://47.100.203.218:3389/home.html/.
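The two encodings named in this abstract can be sketched briefly. This is not the econvRBP code. The seven-class amino acid grouping is the one commonly used for the Conjoint Triad descriptor. Dividing by the number of triads is an assumption made for illustration, since other normalisations exist, and the function names are hypothetical.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Seven amino acid classes commonly used for the Conjoint Triad descriptor.
CT_CLASSES = ["AGV", "ILFP", "YMTS", "HNQW", "RK", "DE", "C"]
AA_TO_CLASS = {aa: idx for idx, group in enumerate(CT_CLASSES) for aa in group}


def one_hot(seq):
    """Encode each residue as a 20-dimensional indicator vector (local feature)."""
    return [[1 if aa == ref else 0 for ref in AMINO_ACIDS] for aa in seq.upper()]


def conjoint_triad(seq):
    """Return a 343-dimensional (7**3) triad frequency vector (global feature).

    Residues outside the 20 standard amino acids are skipped.
    """
    classes = [AA_TO_CLASS[aa] for aa in seq.upper() if aa in AA_TO_CLASS]
    counts = [0] * 343
    for i in range(len(classes) - 2):
        a, b, c = classes[i:i + 3]
        counts[a * 49 + b * 7 + c] += 1
    total = max(len(classes) - 2, 1)  # assumption: normalise by triad count
    return [count / total for count in counts]
```

One-hot encoding keeps positional detail and grows with sequence length, while the triad vector has a fixed size, which matches the abstract's description of local and global features.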
Affiliation(s)
- Yuze Zhao
- School of Computer Science and Technology, Anhui University, Hefei, Anhui, China
- Xiuquan Du
- School of Computer Science and Technology, Anhui University, Hefei, Anhui, China.