1
Zhang W, Huang W, Tan J, Guo Q, Wu B. Heterogeneous catalysis mediated by light, electricity and enzyme via machine learning: Paradigms, applications and prospects. Chemosphere 2022; 308:136447. [PMID: 36116627 DOI: 10.1016/j.chemosphere.2022.136447] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/09/2022] [Revised: 09/08/2022] [Accepted: 09/11/2022] [Indexed: 06/15/2023]
Abstract
The energy crisis and environmental pollution have become bottlenecks for sustainable human development. Therefore, there is an urgent need to develop new catalysts for energy production and environmental remediation. Because blind screening is costly and computing resources are limited, traditional experimental methods and theoretical calculations struggle to meet this need. In the past decades, computer science has made great progress, especially in the field of machine learning (ML). As a new research paradigm, ML greatly accelerates theoretical calculation methods such as first-principles calculations and molecular dynamics, and helps establish the physical picture of heterogeneous catalytic processes for energy and the environment. This review first summarizes the general research paradigms of ML in catalyst discovery. Then, the latest progress of ML in light-, electricity- and enzyme-mediated heterogeneous catalysis is reviewed from the perspectives of catalytic performance, operating conditions and reaction mechanism. General guidelines for applying ML to heterogeneous catalysis are proposed. Finally, the existing problems and future development trends of ML in heterogeneous catalysis mediated by light, electricity and enzymes are summarized. We expect that this review will facilitate the interaction between ML and heterogeneous catalysis and illuminate the prospects for the field.
Affiliation(s)
- Wentao Zhang
  - Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen, 518055, People's Republic of China
- Wenguang Huang
  - South China Institute of Environmental Sciences, Ministry of Ecology and Environment of PRC, Guangzhou, 510655, People's Republic of China
- Jie Tan
  - South China Institute of Environmental Sciences, Ministry of Ecology and Environment of PRC, Guangzhou, 510655, People's Republic of China
- Qingwei Guo
  - South China Institute of Environmental Sciences, Ministry of Ecology and Environment of PRC, Guangzhou, 510655, People's Republic of China
- Bingdang Wu
  - School of Environmental Science and Engineering, Suzhou University of Science and Technology, Suzhou, 215009, People's Republic of China; Key Laboratory of Suzhou Sponge City Technology, Suzhou, 215002, People's Republic of China
2
Moradi M, Mohabatkar H, Behbahani M, Dini G. Application of G-quadruplex aptamer conjugated MSNs to deliver ampicillin for suppressing S. aureus biofilm on mice bone. Arab J Chem 2022. [DOI: 10.1016/j.arabjc.2022.104274] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/28/2022] [Open Access]
3
Cong H, Liu H, Cao Y, Chen Y, Liang C. Multiple Protein Subcellular Locations Prediction Based on Deep Convolutional Neural Networks with Self-Attention Mechanism. Interdiscip Sci 2022; 14:421-438. [PMID: 35066812 DOI: 10.1007/s12539-021-00496-7] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 08/28/2021] [Revised: 12/06/2021] [Accepted: 12/13/2021] [Indexed: 12/12/2022]
Abstract
As an important research field in bioinformatics, protein subcellular location prediction is critical for revealing protein functions and provides insightful information for disease diagnosis and drug development. Predicting protein subcellular locations remains a challenging task due to the difficulty of finding representative features and robust classifiers. Many feature fusion methods have been applied to tackle these issues. However, they still suffer from accuracy loss due to feature redundancy. Furthermore, predicting multiple subcellular locations for one protein is more complicated since it is fundamentally a multi-label classification problem. Traditional binary classifiers, and even multi-class classifiers, cannot achieve satisfactory results. This paper proposes a novel method for protein subcellular location prediction with both single and multiple sites based on deep convolutional neural networks. Specifically, we first obtain the integrated features by simultaneously considering the pseudo amino acid composition, amino acid index distribution, and physicochemical properties. We then adopt deep convolutional neural networks to extract high-dimensional features from the fused feature, removing the redundant preliminary features and gaining better representations of the raw sequences. Moreover, we use the self-attention mechanism and a customized loss function to ensure that the model is more inclined to positive data. In addition, we use random k-label sets to reduce the number of prediction labels. Meanwhile, we employ a hybrid strategy of over-sampling and under-sampling to tackle the data imbalance problem. We compare our model with three representative classification alternatives. The experimental results show that our model achieves the best performance in terms of accuracy, demonstrating the efficacy of the proposed model.
Affiliation(s)
- Hanhan Cong
  - School of Information Science and Engineering, Shandong Normal University, Jinan, China
  - Shandong Provincial Key Laboratory for Novel Distributed Computer Software Technology, Jinan, China
- Hong Liu
  - School of Information Science and Engineering, Shandong Normal University, Jinan, China
  - Shandong Provincial Key Laboratory for Novel Distributed Computer Software Technology, Jinan, China
- Yi Cao
  - School of Information Science and Engineering, University of Jinan, Jinan, China
  - Shandong Provincial Key Laboratory of Network Based Intelligent Computing, University of Jinan, Jinan, China
- Yuehui Chen
  - School of Information Science and Engineering, University of Jinan, Jinan, China
  - Shandong Provincial Key Laboratory of Network Based Intelligent Computing, University of Jinan, Jinan, China
- Cheng Liang
  - School of Information Science and Engineering, Shandong Normal University, Jinan, China
4
Moradi M, Golmohammadi R, Najafi A, Moosazadeh Moghaddam M, Fasihi-Ramandi M, Mirnejad R. A contemporary review on the important role of in silico approaches for managing different aspects of COVID-19 crisis. Informatics in Medicine Unlocked 2022; 28:100862. [PMID: 35079621 PMCID: PMC8776350 DOI: 10.1016/j.imu.2022.100862] [Citation(s) in RCA: 30] [Impact Index Per Article: 10.0] [Received: 12/02/2021] [Revised: 01/17/2022] [Accepted: 01/18/2022] [Indexed: 01/05/2023] [Open Access]
Abstract
In the last century, the emergence of in silico tools has improved the quality of healthcare studies by providing high-quality predictions. In the case of COVID-19, these tools have been advantageous for bioinformatics analysis of SARS-CoV-2 structures, studying potential drugs and introducing drug targets, investigating the efficacy of potential natural product components at suppressing COVID-19 infection, designing peptide mimetics and optimizing their structures to provide a better clinical outcome, and repurposing previously known therapeutics. These methods have also helped medical biotechnologists to design various vaccines, such as multi-epitope vaccines using reverse vaccinology and immunoinformatics methods, some of which have shown promising results in in vitro, in vivo and clinical trial studies. Moreover, the emergence of artificial intelligence and machine learning algorithms has helped to classify previously known data and use them to provide precise predictions and plan for the future of the pandemic. In this contemporary review, by collecting related information from the literature in valuable data sources such as PubMed, Scopus, and Web of Science, we provide a brief outlook on the importance of in silico tools in managing different aspects of the COVID-19 pandemic and on how these methods have been helpful to biomedical researchers.
Affiliation(s)
- Mohammad Moradi
  - Molecular Biology Research Center, Systems Biology and Poisonings Institute, Baqiyatallah University of Medical Sciences, Tehran, Iran
  - Department of Biotechnology, Faculty of Biological Science and Technology, University of Isfahan, Isfahan, Iran
- Reza Golmohammadi
  - Baqiyatallah Research Center for Gastroenterology and Liver Diseases (BRCGL), Baqiyatallah University of Medical Sciences, Tehran, Iran
- Ali Najafi
  - Molecular Biology Research Center, Systems Biology and Poisonings Institute, Baqiyatallah University of Medical Sciences, Tehran, Iran
- Mahdi Fasihi-Ramandi
  - Molecular Biology Research Center, Systems Biology and Poisonings Institute, Baqiyatallah University of Medical Sciences, Tehran, Iran
- Reza Mirnejad
  - Molecular Biology Research Center, Systems Biology and Poisonings Institute, Baqiyatallah University of Medical Sciences, Tehran, Iran
5
Kumar R, Dhanda SK. Bird Eye View of Protein Subcellular Localization Prediction. Life (Basel) 2020; 10:E347. [PMID: 33327400 PMCID: PMC7764902 DOI: 10.3390/life10120347] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 11/24/2020] [Revised: 12/11/2020] [Accepted: 12/11/2020] [Indexed: 12/12/2022] [Open Access]
Abstract
Proteins are made up of long chains of amino acids and perform a variety of functions in different organisms. The activity of proteins is determined by the nucleotide sequence of their genes and by their 3D structure. In addition, proteins must be directed to their specific locations or compartments to maintain their structure and perform their functions. The challenge of computational prediction of the subcellular localization of proteins is addressed by various in silico methods. In this review, we survey the progress in this field and offer a bird's-eye view consisting of a comprehensive listing of tools, types of input features explored, machine learning approaches employed, and evaluation metrics applied. We hope the review will be useful for researchers working in the field of protein localization prediction.
Affiliation(s)
- Ravindra Kumar
  - Biometric Research Program, Division of Cancer Treatment and Diagnosis, National Cancer Institute, NIH, 9609 Medical Center Drive, Rockville, MD 20850, USA
- Sandeep Kumar Dhanda
  - Department of Oncology, St. Jude Children’s Research Hospital, Memphis, TN 38105, USA
6
Mohabatkar H, Ebrahimi S, Moradi M. Using Chou’s Five-steps Rule to Classify and Predict Glutathione S-transferases with Different Machine Learning Algorithms and Pseudo Amino Acid Composition. Int J Pept Res Ther 2020. [DOI: 10.1007/s10989-020-10087-7] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Indexed: 01/24/2023]
7
Some illuminating remarks on molecular genetics and genomics as well as drug development. Mol Genet Genomics 2020; 295:261-274. [PMID: 31894399 DOI: 10.1007/s00438-019-01634-z] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 11/21/2019] [Accepted: 12/05/2019] [Indexed: 02/07/2023]
Abstract
Facing the explosive growth of biological sequences unearthed in the post-genomic age, one of the most important but also most difficult problems in computational biology is how to express a biological sequence with a discrete model or a vector while still retaining considerable sequence-order information or its special pattern. To deal with this challenging problem, the ideas of "pseudo amino acid components" and "pseudo K-tuple nucleotide composition" have been proposed. These ideas and their approaches have further stimulated the birth of the "distorted key theory" and the "wenxiang diagram", substantially strengthened the power in treating multi-label systems, and led to the establishment of the famous "5-steps rule". All these logical developments are quite natural and are very useful not only for theoretical scientists but also for experimental scientists conducting genetics/genomics analysis and drug development. This review paper also presents their future perspectives; i.e., their impacts will become even more significant and profound.
8
Shao YT, Liu XX, Lu Z, Chou KC. pLoc_Deep-mHum: Predict Subcellular Localization of Human Proteins by Deep Learning. Natural Science 2020. [DOI: 10.4236/ns.2020.127042] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Indexed: 11/20/2022]
9
Shao Y, Chou KC. pLoc_Deep-mEuk: Predict Subcellular Localization of Eukaryotic Proteins by Deep Learning. Natural Science 2020. [DOI: 10.4236/ns.2020.126034] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Indexed: 11/20/2022]
10
Verma AK, Pal S. Prediction of Skin Disease with Three Different Feature Selection Techniques Using Stacking Ensemble Method. Appl Biochem Biotechnol 2019; 191:637-656. [PMID: 31845194 DOI: 10.1007/s12010-019-03222-8] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 10/13/2019] [Accepted: 12/05/2019] [Indexed: 02/06/2023]
Abstract
Skin disease is among the most common health problems. Due to pollution and depletion of the ozone layer, harmful UV rays from the sun burn the skin and cause various types of skin disease. Nowadays, machine learning and deep learning algorithms are widely used to diagnose various kinds of diseases. In this study, we applied three feature selection techniques (univariate feature selection, feature importance, and correlation matrix with heat map) to find the optimum data subset for erythemato-squamous disease. Four classification techniques, Gaussian Naïve Bayes (NB), decision tree (DT), support vector machine (SVM), and random forest, are used to measure the performance of the model. A stacking ensemble technique is then applied to enhance the prediction performance of the model. The proposed method is then used to measure the performance of the model. We find that the optimal subset for erythemato-squamous disease performs well with the correlation and heat map feature selection technique. The mean value, standard deviation, root mean square error, kappa statistic, area under the receiver operating characteristic curve, and accuracy are calculated to demonstrate the effectiveness of the proposed model. The feature selection techniques combined with the stacking ensemble technique give better results than individual machine learning techniques. The obtained results show that the performance of the proposed model is higher than previous results obtained by researchers.
Affiliation(s)
- Anurag Kumar Verma
  - Research Scholar, MCA Department, VBS Purvanchal University, Jaunpur, India
- Saurabh Pal
  - Department of MCA, VBS Purvanchal University, Jaunpur, UP, India