1
Kong D, Qian J, Gao C, Wang Y, Shi T, Ye C. Machine Learning Empowering Microbial Cell Factory: A Comprehensive Review. Appl Biochem Biotechnol 2025:10.1007/s12010-025-05260-x. [PMID: 40397295 DOI: 10.1007/s12010-025-05260-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 05/02/2025] [Indexed: 05/22/2025]
Abstract
The wide application of machine learning has provided more possibilities for biological manufacturing, and the combination of machine learning and synthetic biology technology has ignited even more brilliant sparks, which has created an unpredictable value for the upgrading of microbial cell factories. The review delves into the synergies between machine learning and synthetic biology to create research worth investigating in biotechnology. We explore relevant databases, toolboxes, and machine learning-derived models. Furthermore, we examine specific applications of this combined approach in chemical production, human health, and environmental remediation. By elucidating these successful integrations, this review aims to provide valuable guidance for future research at the intersection of biomanufacturing and artificial intelligence.
Affiliation(s)
- Dechun Kong
  - School of Food Science and Pharmaceutical Engineering, Nanjing Normal University, Nanjing, 210023, People's Republic of China
- Jinyi Qian
  - Ministry of Education Key Laboratory of NSLSCS, Nanjing Normal University, Nanjing, 210023, People's Republic of China
- Cong Gao
  - School of Biotechnology and Key Laboratory of Industrial Biotechnology of Ministry of Education, Jiangnan University, Wuxi, 214122, People's Republic of China
- Yuetong Wang
  - School of Food Science and Pharmaceutical Engineering, Nanjing Normal University, Nanjing, 210023, People's Republic of China
- Tianqiong Shi
  - School of Food Science and Pharmaceutical Engineering, Nanjing Normal University, Nanjing, 210023, People's Republic of China
  - State Key Laboratory of Microbial Technology, Nanjing Normal University, Nanjing, 210023, People's Republic of China
- Chao Ye
  - School of Food Science and Pharmaceutical Engineering, Nanjing Normal University, Nanjing, 210023, People's Republic of China
  - Ministry of Education Key Laboratory of NSLSCS, Nanjing Normal University, Nanjing, 210023, People's Republic of China
2
Zhao K, Ji Z, Zhang L, Quan N, Li Y, Yu G, Bi X. HPOseq: a deep ensemble model for predicting the protein-phenotype relationships based on protein sequences. BMC Bioinformatics 2025; 26:110. [PMID: 40263997 PMCID: PMC12013097 DOI: 10.1186/s12859-025-06122-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/14/2024] [Accepted: 03/27/2025] [Indexed: 04/24/2025] Open
Abstract
BACKGROUND Understanding the relationships between proteins and specific disease phenotypes contributes to the early detection of diseases and advances the development of personalized medicine. The acquisition of a large amount of proteomics data has facilitated this process. To improve discovery efficiency and reduce the time and financial costs associated with biological experiments, various computational methods have yielded promising results. However, the lack of rich and reliable protein-related information still presents challenges in this process. RESULTS In this paper, we propose an ensemble prediction model, named HPOseq, which predicts human protein-phenotype relationships based only on sequence information. HPOseq establishes two base models to achieve objectives. One directly extracts internal information from amino acid sequences as protein features to predict the associated phenotypes. The other builds a protein-protein network based on sequence similarity, extracting information between proteins for phenotype prediction. Ultimately, an ensemble module is employed to integrate the predictions from both base models, resulting in the final prediction. CONCLUSION The results of 5-fold cross-validation reveal that HPOseq outperforms seven baseline methods for predicting protein-phenotype relationships. Moreover, we conduct case studies from the points of phenotype annotation and protein analysis to verify the practical significance of HPOseq.
Affiliation(s)
- Kai Zhao
  - School of Computer Science and Technology, Xinjiang University, Urumqi, 830011, China
- Zhuocheng Ji
  - School of Computer Science and Technology, Xinjiang University, Urumqi, 830011, China
- Linlin Zhang
  - School of Software, Xinjiang University, Urumqi, 830011, China
- Na Quan
  - School of Computer Science and Technology, Xinjiang University, Urumqi, 830011, China
- Yuheng Li
  - School of Computer Science and Technology, Xinjiang University, Urumqi, 830011, China
- Guanglei Yu
  - College of Medical Engineering and Technology, Xinjiang Medical University, Urumqi, 830011, China
  - School of Computer Science and Engineering, Central South University, Changsha, 410083, China
- Xuehua Bi
  - College of Medical Engineering and Technology, Xinjiang Medical University, Urumqi, 830011, China
  - School of Computer Science and Engineering, Central South University, Changsha, 410083, China
3
Cai Y, Zhang W, Dou Z, Wang C, Yu W, Wang L. PreTKcat: A pre-trained representation learning and machine learning framework for predicting enzyme turnover number. Comput Biol Chem 2025; 115:108327. [PMID: 39765190 DOI: 10.1016/j.compbiolchem.2024.108327] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/11/2024] [Revised: 12/06/2024] [Accepted: 12/24/2024] [Indexed: 02/26/2025]
Abstract
The enzyme turnover number (kcat) is crucial for understanding enzyme kinetics and optimizing biotechnological processes. However, experimentally measured kcat values are limited due to the high cost and labor intensity of wet-lab measurements, necessitating robust computational methods. To address this issue, we propose PreTKcat, a framework that integrates pre-trained representation learning and machine learning to predict kcat values. PreTKcat utilizes the ProtT5 protein language model to encode enzyme sequences and the MolGNet molecular representation learning model to encode substrate molecular graphs. These representations are integrated and passed to an ExtraTrees model, which predicts kcat values. PreTKcat also accounts for the impact of temperature on kcat prediction and can be used to predict enzyme-substrate affinity, i.e., Km values. Comparative assessments with various state-of-the-art models highlight the superior performance of PreTKcat. PreTKcat serves as an effective tool for investigating enzyme kinetics, offering new perspectives for enzyme engineering and its industrial uses.
Affiliation(s)
- Yunxiang Cai
  - College of Artificial Intelligence, Tianjin University of Science and Technology, No. 9, 13th Street, Tianjin Economic-Technological Development Area, Tianjin, 300457, China
- Wenjuan Zhang
  - College of General Education, Tianjin Foreign Studies University, No. 117, Machang Road, Hexi District, Tianjin, 300204, China
- Zhuangzhuang Dou
  - College of Artificial Intelligence, Tianjin University of Science and Technology, No. 9, 13th Street, Tianjin Economic-Technological Development Area, Tianjin, 300457, China
- Chao Wang
  - College of Artificial Intelligence, Tianjin University of Science and Technology, No. 9, 13th Street, Tianjin Economic-Technological Development Area, Tianjin, 300457, China
- Wenping Yu
  - College of Artificial Intelligence, Tianjin University of Science and Technology, No. 9, 13th Street, Tianjin Economic-Technological Development Area, Tianjin, 300457, China
- Lin Wang
  - College of Artificial Intelligence, Tianjin University of Science and Technology, No. 9, 13th Street, Tianjin Economic-Technological Development Area, Tianjin, 300457, China
4
Cao J, Zhou W, Yu Q, Ji J, Zhang J, He S, Zhu Z. MDTL-ACP: Anticancer Peptides Prediction Based on Multi-Domain Transfer Learning. IEEE J Biomed Health Inform 2025; 29:1714-1725. [PMID: 38147420 DOI: 10.1109/jbhi.2023.3347138] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/28/2023]
Abstract
Anticancer peptides (ACPs) have emerged as one of the most promising therapeutic agents for cancer treatment. They are bioactive peptides featuring broad-spectrum activity and low drug-resistance. The discovery of ACPs via traditional biochemical methods is laborious and costly. Accordingly, various computational methods have been developed to facilitate the discovery of ACPs. However, the data resources and knowledge of ACPs are still very scarce, and only a few of them are clinically verified, which limits the competence of computational methods. To address this issue, in this article, we propose an ACP prediction model based on multi-domain transfer learning, namely MDTL-ACP, to discriminate novel ACPs from plentiful inactive peptides. In particular, we collect abundant antimicrobial peptides (AMPs) from four well-studied peptide domains and extract their inherent features as the input of MDTL-ACP. The features learned from multiple source domains of AMPs are then transferred into the target prediction task of ACPs via artificial neural network-based shared-extractor and task-specific classifiers in MDTL-ACP. The knowledge captured in the transferred features enhances the prediction of ACPs in the target domain. Experimental results demonstrate that MDTL-ACP can outperform the traditional and state-of-the-art ACP prediction methods.
5
Liang Y, Ma X, Li J, Zhang S. iACVP-MR: Accurate Identification of Anti-coronavirus Peptide based on Multiple Features Information and Recurrent Neural Network. Curr Med Chem 2025; 32:2055-2067. [PMID: 38549527 DOI: 10.2174/0109298673277663240101111507] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/30/2023] [Revised: 11/26/2023] [Accepted: 11/30/2023] [Indexed: 05/14/2024]
Abstract
BACKGROUND Over the years, viruses have caused human illness and threatened human health. Therefore, it is pressing to develop anti-coronavirus infection drugs with clear function, low cost, and high safety. Anti-coronavirus peptide (ACVP) is a key therapeutic agent against coronavirus. Traditional methods for finding ACVP need a great deal of money and manpower. Hence, it is a significant task to establish intelligent computational tools that enable rapid, efficient, and accurate identification of ACVP. METHODS In this paper, we construct an excellent model named iACVP-MR to identify ACVP based on multiple features and recurrent neural networks. Multiple features are extracted by using reduced amino acid component and dipeptide component, compositions of k-spaced amino acid pairs, a BLOSUM62 encoder according to the N5C5 sequence, as well as a second-order moving average approach based on 16 physicochemical properties. Then, two recurrent neural networks, long-short term memory (LSTM) and bidirectional gated recurrent unit (BiGRU), combined with an attention mechanism are used for feature fusion and classification, respectively. RESULTS The accuracies on the ENNAVIA-C and ENNAVIA-D datasets under 10-fold cross-validation are 99.15% and 98.92%, respectively, and other evaluation indexes have also obtained satisfactory results. The experimental results show that our model is superior to other existing models. CONCLUSION The iACVP-MR model can be viewed as a powerful and intelligent tool for the accurate identification of ACVP. The datasets and source codes for iACVP-MR are freely available at https://github.com/yunyunliang88/iACVP-MR.
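One of the features named in this abstract, the composition of k-spaced amino acid pairs, can be illustrated with a short sketch. This is not the authors' code: the function name `k_spaced_pair_composition`, the 20-letter alphabet constant, and the choice to normalize by the number of valid pairs are assumptions made for illustration only.

```python
from itertools import product

# Assumption: the 20 standard amino acids, in alphabetical one-letter order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def k_spaced_pair_composition(sequence, k):
    """Return a 400-element list with the frequency of each residue pair
    whose two members are separated by exactly k other residues."""
    pairs = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]
    counts = dict.fromkeys(pairs, 0)
    total = 0
    # Positions i and i + k + 1 have exactly k residues between them.
    for i in range(len(sequence) - k - 1):
        pair = sequence[i] + sequence[i + k + 1]
        if pair in counts:  # skip pairs containing non-standard residues
            counts[pair] += 1
            total += 1
    if total == 0:
        return [0.0] * len(pairs)
    return [counts[p] / total for p in pairs]


# Hypothetical usage: concatenate the vectors for k = 0, 1, 2.
features = []
for gap in range(3):
    features.extend(k_spaced_pair_composition("GIGKFLHSAKKFGKAFVGEIMNS", gap))
```

With k = 0 the feature reduces to the ordinary dipeptide composition; larger k values capture longer-range residue correlations at the cost of 400 extra dimensions per gap.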
Affiliation(s)
- Yunyun Liang
  - School of Science, Xi'an Polytechnic University, Xi'an, 710048, P.R. China
- Xinyan Ma
  - School of Science, Xi'an Polytechnic University, Xi'an, 710048, P.R. China
- Jin Li
  - School of Science, Xi'an Polytechnic University, Xi'an, 710048, P.R. China
- Shengli Zhang
  - School of Mathematics and Statistics, Xidian University, Xi'an, 710071, P.R. China
6
Zhang S, Jing Y, Liang Y. EACVP: An ESM-2 LM Framework Combined CNN and CBAM Attention to Predict Anti-coronavirus Peptides. Curr Med Chem 2025; 32:2040-2054. [PMID: 38494930 DOI: 10.2174/0109298673287899240303164403] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/17/2023] [Revised: 01/13/2024] [Accepted: 02/19/2024] [Indexed: 03/19/2024]
Abstract
BACKGROUND The novel coronavirus pneumonia (COVID-19) outbreak in late 2019 killed millions worldwide. Coronaviruses cause diseases such as severe acute respiratory syndrome (SARS-CoV) and SARS-CoV-2. Many peptides in the host defense system have antiviral activity. How to establish a set of efficient models to identify anti-coronavirus peptides is a meaningful study. METHODS Given this, a new prediction model, EACVP, is proposed. This model uses the evolutionary scale language model (ESM-2 LM) to characterize peptide sequence information. The ESM model is a natural language processing model trained by machine learning technology. It is trained on a highly diverse and dense dataset (UR50/D 2021_04), and the pre-trained language model is used to obtain peptide sequence features with 320 dimensions. Compared with traditional feature extraction methods, the information represented by ESM-2 LM is more comprehensive and stable. Then, the features are input into the convolutional neural network (CNN), and the lightweight convolutional block attention module (CBAM) is used to perform attention operations on the CNN in the spatial and channel dimensions. To verify the rationality of the model structure, we performed ablation experiments on the benchmark and independent test datasets. We compared EACVP with existing methods on the independent test dataset. RESULTS Experimental results show that ACC, F1-score, and MCC are 3.95%, 35.65% and 0.0725 higher than the most advanced methods, respectively. At the same time, we tested EACVP on the ENNAVIA-C and ENNAVIA-D data sets, and the results showed that EACVP has good transferability and is a powerful tool for predicting anti-coronavirus peptides. CONCLUSION The results prove that the EACVP model could fully characterize the peptide information and achieve high prediction accuracy. It can be generalized to different data sets. The data and code of the article have been uploaded to https://github.com/JYY625/EACVP.git.
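The three evaluation measures reported in this abstract (ACC, F1-score, and MCC) all follow from a binary confusion matrix. The sketch below uses the standard textbook definitions; the function name `classification_metrics` and the example counts are hypothetical and do not come from the paper.

```python
import math


def classification_metrics(tp, fp, tn, fn):
    """Compute accuracy, F1-score and Matthews correlation coefficient
    from the four cells of a binary confusion matrix."""
    total = tp + fp + tn + fn
    acc = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    # MCC ranges from -1 to 1, which is why the abstract reports its gain
    # as an absolute difference rather than as a percentage.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"acc": acc, "f1": f1, "mcc": mcc}


# Hypothetical counts, chosen only to exercise the function.
metrics = classification_metrics(tp=40, fp=10, tn=45, fn=5)
```

MCC uses all four cells of the confusion matrix, so it is often preferred over accuracy when the positive and negative classes are imbalanced, as is common for peptide datasets.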
Affiliation(s)
- Shengli Zhang
  - School of Mathematics and Statistics, Xidian University, Xi'an, 710071, P.R. China
  - Key Laboratory of Computational Science and Application of Hainan Province, Haikou, 571158, P.R. China
- Yuanyuan Jing
  - School of Mathematics and Statistics, Xidian University, Xi'an, 710071, P.R. China
- Yunyun Liang
  - School of Science, Xi'an Polytechnic University, Xi'an, 710048, P.R. China
7
Wang R, Liang X, Zhao Y, Xue W, Liang G. UniBioPAN: A Novel Universal Classification Architecture for Bioactive Peptides Inspired by Video Action Recognition. J Chem Inf Model 2024; 64:9276-9285. [PMID: 39571078 DOI: 10.1021/acs.jcim.4c01599] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/24/2024]
Abstract
The classification of bioactive peptides is of great importance in protein biology, but there is still a lack of a universal and effective classifier. Inspired by video action recognition, we developed the UniBioPAN architecture to create a universal peptide classifier to solve this problem. The architecture treats the peptide sequence as a video sequence and the molecular image of each amino acid in the peptide sequence as a video frame, enabling feature extraction and classification using convolutional neural networks, bidirectional long short-term memory networks, and fully connected networks. As a novel peptide classification architecture, UniBioPAN significantly outperforms other universal architectures in ACC, AUC, and MCC across 11 data sets, and in F1 score in 9 data sets. UniBioPAN is available in three forms: a Python script, a Jupyter notebook, and a web server (https://gzliang.cqu.edu.cn/software/UniBioPAN.html). In summary, UniBioPAN is a universal, convenient, and high-performance peptide classification architecture. UniBioPAN holds significant importance in the discovery of bioactive peptides and the advancement of peptide classifiers. All the codes and data sets are publicly available at https://github.com/sanwrh/UniBioPAN.
Affiliation(s)
- Ruihong Wang
  - Key Laboratory of Biorheological Science and Technology (Chongqing University), Ministry of Education, Bioengineering College, Chongqing University, Chongqing 400044, China
- Xiao Liang
  - Key Laboratory of Biorheological Science and Technology (Chongqing University), Ministry of Education, Bioengineering College, Chongqing University, Chongqing 400044, China
- Yi Zhao
  - Key Laboratory of Biorheological Science and Technology (Chongqing University), Ministry of Education, Bioengineering College, Chongqing University, Chongqing 400044, China
- Wenjun Xue
  - Key Laboratory of Biorheological Science and Technology (Chongqing University), Ministry of Education, Bioengineering College, Chongqing University, Chongqing 400044, China
- Guizhao Liang
  - Key Laboratory of Biorheological Science and Technology (Chongqing University), Ministry of Education, Bioengineering College, Chongqing University, Chongqing 400044, China
  - Bioengineering College of Chongqing University, No. 174, Shazheng Street, Shapingba District, Chongqing 400030, China
8
Cao J, Zhang J, Yu Q, Ji J, Li J, He S, Zhu Z. TG-CDDPM: text-guided antimicrobial peptides generation based on conditional denoising diffusion probabilistic model. Brief Bioinform 2024; 26:bbae644. [PMID: 39668337 PMCID: PMC11637771 DOI: 10.1093/bib/bbae644] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/05/2024] [Revised: 11/13/2024] [Accepted: 11/27/2024] [Indexed: 12/14/2024] Open
Abstract
Antimicrobial peptides (AMPs) have emerged as a promising substitute for antibiotics thanks to their broader range of activities, lower likelihood of drug resistance, and low toxicity. Traditional biochemical methods for AMP discovery are costly and inefficient. Deep generative models, including the long-short term memory model, variational autoencoder model, and generative adversarial model, have been widely introduced to expedite AMP discovery. However, these models tend to suffer from a lack of diversity in the generated AMPs. The denoising diffusion probabilistic model serves as a good candidate for solving this issue. We proposed a three-stage Text-Guided Conditional Denoising Diffusion Probabilistic Model (TG-CDDPM) to generate novel and homologous AMPs. In the first two stages, contrastive learning and inferring models are crafted to create better conditions for guiding AMP generation, respectively. In the last stage, a pre-trained conditional denoising diffusion probabilistic model is leveraged to enrich the peptide knowledge and fine-tuned to learn feature representation in downstream. TG-CDDPM was compared to the state-of-the-art generative models for AMP generation, and it demonstrated competitive or better performance with the assistance of text description as supervised information. The membrane penetration capabilities of the identified candidate AMPs by TG-CDDPM were also validated through molecular weight dynamics experiments.
Affiliation(s)
- Junhang Cao
  - College of Computer Science and Software Engineering, Shenzhen University, Shenzhen 518060, China
- Jun Zhang
  - National Engineering Laboratory for Big Data System Computing Technology, Shenzhen University, Shenzhen 518060, China
- Qiyuan Yu
  - College of Computer Science and Software Engineering, Shenzhen University, Shenzhen 518060, China
- Junkai Ji
  - National Engineering Laboratory for Big Data System Computing Technology, Shenzhen University, Shenzhen 518060, China
- Jianqiang Li
  - National Engineering Laboratory for Big Data System Computing Technology, Shenzhen University, Shenzhen 518060, China
- Shan He
  - School of Computer Science, University of Birmingham, Birmingham B15 2TT, United Kingdom
- Zexuan Zhu
  - National Engineering Laboratory for Big Data System Computing Technology, Shenzhen University, Shenzhen 518060, China
9
Yu H, Luo X. ThermoFinder: A sequence-based thermophilic proteins prediction framework. Int J Biol Macromol 2024; 270:132469. [PMID: 38761901 DOI: 10.1016/j.ijbiomac.2024.132469] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/17/2024] [Revised: 05/14/2024] [Accepted: 05/15/2024] [Indexed: 05/20/2024]
Abstract
Thermophilic proteins are important for academic research and industrial processes, and various computational methods have been developed to identify and screen them. However, their performance has been limited due to the lack of high-quality labeled data and efficient models for representing proteins. Here, we proposed a novel sequence-based thermophilic protein prediction framework, called ThermoFinder. The results demonstrated that ThermoFinder outperforms previous state-of-the-art tools on two benchmark datasets, and feature ablation experiments confirmed the effectiveness of our approach. Additionally, ThermoFinder exhibited exceptional performance and consistency across two newly constructed datasets, one of which was specifically constructed for the regression-based prediction of temperature optimum values directly from protein sequences. The feature importance analysis, using Shapley additive explanations, further validated the advantages of ThermoFinder. We believe that ThermoFinder will be a valuable and comprehensive framework for predicting thermophilic proteins, and we have made our model open source and available on GitHub at https://github.com/Luo-SynBioLab/ThermoFinder.
Affiliation(s)
- Han Yu
  - Shenzhen Key Laboratory for the Intelligent Microbial Manufacturing of Medicines, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
  - CAS Key Laboratory of Quantitative Engineering Biology, Shenzhen Institute of Synthetic Biology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
  - Center for Synthetic Biochemistry, Shenzhen Institute of Synthetic Biology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
- Xiaozhou Luo
  - Shenzhen Key Laboratory for the Intelligent Microbial Manufacturing of Medicines, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
  - CAS Key Laboratory of Quantitative Engineering Biology, Shenzhen Institute of Synthetic Biology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
  - Center for Synthetic Biochemistry, Shenzhen Institute of Synthetic Biology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
10
Yu H, Deng H, He J, Keasling JD, Luo X. UniKP: a unified framework for the prediction of enzyme kinetic parameters. Nat Commun 2023; 14:8211. [PMID: 38081905 PMCID: PMC10713628 DOI: 10.1038/s41467-023-44113-1] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 05/10/2023] [Accepted: 11/30/2023] [Indexed: 12/18/2023] Open
Abstract
Prediction of enzyme kinetic parameters is essential for designing and optimizing enzymes for various biotechnological and industrial applications, but the limited performance of current prediction tools on diverse tasks hinders their practical applications. Here, we introduce UniKP, a unified framework based on pretrained language models for the prediction of enzyme kinetic parameters, including enzyme turnover number (kcat), Michaelis constant (Km), and catalytic efficiency (kcat / Km), from protein sequences and substrate structures. A two-layer framework derived from UniKP (EF-UniKP) has also been proposed to allow robust kcat prediction in considering environmental factors, including pH and temperature. In addition, four representative re-weighting methods are systematically explored to successfully reduce the prediction error in high-value prediction tasks. We have demonstrated the application of UniKP and EF-UniKP in several enzyme discovery and directed evolution tasks, leading to the identification of new enzymes and enzyme mutants with higher activity. UniKP is a valuable tool for deciphering the mechanisms of enzyme kinetics and enables novel insights into enzyme engineering and their industrial applications.
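For readers less familiar with the three parameters named in this abstract, the sketch below shows how kcat and Km enter the standard Michaelis-Menten rate law and how catalytic efficiency is derived from them. It illustrates the textbook kinetics only, not the UniKP model; the function names and numeric values are hypothetical.

```python
def michaelis_menten_rate(kcat, km, enzyme_conc, substrate_conc):
    """Initial reaction rate v = kcat * [E] * [S] / (Km + [S]).

    Assumed units: kcat in 1/s, Km and concentrations in the same
    molar unit, so the rate comes out in that unit per second.
    """
    return kcat * enzyme_conc * substrate_conc / (km + substrate_conc)


def catalytic_efficiency(kcat, km):
    """Catalytic efficiency kcat / Km, the third quantity in the abstract."""
    return kcat / km


# Hypothetical values: at [S] = Km the rate is half of Vmax = kcat * [E].
rate_at_km = michaelis_menten_rate(
    kcat=100.0, km=0.5, enzyme_conc=1e-6, substrate_conc=0.5
)
efficiency = catalytic_efficiency(kcat=100.0, km=0.5)
```

Because the rate depends on kcat and Km jointly, a predictor that estimates both (or their ratio directly) is more useful for ranking enzyme variants than one that estimates kcat alone.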
Affiliation(s)
- Han Yu
  - Shenzhen Key Laboratory for the Intelligent Microbial Manufacturing of Medicines, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, China
  - University of Chinese Academy of Sciences, Beijing, 100049, China
  - CAS Key Laboratory of Quantitative Engineering Biology, Shenzhen Institute of Synthetic Biology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, China
  - Center for Synthetic Biochemistry, Shenzhen Institute of Synthetic Biology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, China
- Huaxiang Deng
  - Shenzhen Key Laboratory for the Intelligent Microbial Manufacturing of Medicines, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, China
  - CAS Key Laboratory of Quantitative Engineering Biology, Shenzhen Institute of Synthetic Biology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, China
  - Center for Synthetic Biochemistry, Shenzhen Institute of Synthetic Biology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, China
- Jiahui He
  - Shenzhen Key Laboratory for the Intelligent Microbial Manufacturing of Medicines, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, China
  - CAS Key Laboratory of Quantitative Engineering Biology, Shenzhen Institute of Synthetic Biology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, China
  - Center for Synthetic Biochemistry, Shenzhen Institute of Synthetic Biology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, China
- Jay D Keasling
  - Center for Synthetic Biochemistry, Shenzhen Institute of Synthetic Biology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, China
  - Joint BioEnergy Institute, Emeryville, CA, 94608, USA
  - Biological Systems and Engineering Division, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA
  - Department of Chemical and Biomolecular Engineering & Department of Bioengineering, University of California, Berkeley, CA, 94720, USA
  - Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, 2800 Kgs. Lyngby, Denmark
- Xiaozhou Luo
  - Shenzhen Key Laboratory for the Intelligent Microbial Manufacturing of Medicines, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, China
  - University of Chinese Academy of Sciences, Beijing, 100049, China
  - CAS Key Laboratory of Quantitative Engineering Biology, Shenzhen Institute of Synthetic Biology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, China
  - Center for Synthetic Biochemistry, Shenzhen Institute of Synthetic Biology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, China
11
Jia W, Peng J, Zhang Y, Zhu J, Qiang X, Zhang R, Shi L. Exploring novel ANGICon-EIPs through ameliorated peptidomics techniques: Can deep learning strategies as a core breakthrough in peptide structure and function prediction? Food Res Int 2023; 174:113640. [PMID: 37986483 DOI: 10.1016/j.foodres.2023.113640] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 09/05/2023] [Revised: 10/23/2023] [Accepted: 10/24/2023] [Indexed: 11/22/2023]
Abstract
Dairy-derived angiotensin-I-converting enzyme inhibitory peptides (ANGICon-EIPs) have been regarded as a relatively safe supplementary diet-therapy strategy for individuals with hypertension, and short-chain peptides may have more relevant antihypertensive benefits due to their direct intestinal absorption. Our previous explorations have confirmed that endogenous goat milk short-chain peptides are also an essential source of ANGICon-EIPs. Nonetheless, there are limited explorations on endogenous ANGICon-EIPs owing to the limitations of the extraction and enrichment of endogenous peptides, currently. This review outlined ameliorated pre-treatment strategies, data acquisition methods, and tools for the prediction of peptide structure and function, aiming to provide creative ideas for discovering novel ANGICon-EIPs. Currently, deep learning-based peptide structure and function prediction algorithms have achieved significant advancements. The convolutional neural network (CNN) and peptide sequence-based multi-label deep learning approach for determining the multi-functionalities of bioactive peptides (MLBP) can predict multiple peptide functions with absolute true value and accuracy of 0.699 and 0.708, respectively. Utilizing peptide sequence input, torsion angles, and inter-residue distance to train neural networks, APPTEST predicted the average backbone root mean square deviation (RMSD) value of peptide (5-40 aa) structures as low as 1.96 Å. Overall, with the exploration of more neural network architectures, deep learning could be considered a critical research tool to reduce the cost and improve the efficiency of identifying novel endogenous ANGICon-EIPs.
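The backbone RMSD figure quoted in this abstract (1.96 Å) measures how far predicted atom positions lie from reference positions. A minimal sketch of the calculation follows; it assumes the two structures have already been superimposed and have their atoms listed in the same order, and the function name `rmsd` and the coordinates are hypothetical. It is not the APPTEST evaluation code.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equally long lists of
    (x, y, z) coordinates, in the same unit as the input (e.g. angstroms).

    Assumption: the structures are already optimally superimposed, so no
    alignment step (such as the Kabsch algorithm) is performed here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))


# Hypothetical three-atom backbone fragment, predicted vs. reference.
predicted = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
reference = [(0.0, 0.0, 1.0), (1.5, 0.0, 1.0), (3.0, 0.0, 1.0)]
deviation = rmsd(predicted, reference)
```

In practice the structures are superimposed first, because RMSD computed on unaligned coordinates mixes genuine shape differences with arbitrary rigid-body offsets.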
Affiliation(s)
- Wei Jia
  - School of Food and Bioengineering, Shaanxi University of Science and Technology, Xi'an 710021, China
  - Inspection and Testing Center of Fuping County (Shaanxi goat milk product quality supervision and Inspection Center), Weinan 711700, China
  - Shaanxi Research Institute of Agricultural Products Processing Technology, Xi'an 710021, China
- Jian Peng
  - School of Food and Bioengineering, Shaanxi University of Science and Technology, Xi'an 710021, China
- Yan Zhang
  - Inspection and Testing Center of Fuping County (Shaanxi goat milk product quality supervision and Inspection Center), Weinan 711700, China
- Jiying Zhu
  - School of Food and Bioengineering, Shaanxi University of Science and Technology, Xi'an 710021, China
- Xin Qiang
  - Inspection and Testing Center of Fuping County (Shaanxi goat milk product quality supervision and Inspection Center), Weinan 711700, China
- Rong Zhang
  - School of Food and Bioengineering, Shaanxi University of Science and Technology, Xi'an 710021, China
- Lin Shi
  - School of Food and Bioengineering, Shaanxi University of Science and Technology, Xi'an 710021, China