Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Mou Z, Eakes J, Cooper CJ, Foster CM, Standaert RF, Podar M, Doktycz MJ, Parks JM. Machine learning‐based prediction of enzyme substrate scope: Application to bacterial nitrilases. Proteins 2020;89:336-347. [DOI: 10.1002/prot.26019] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 05/06/2020] [Revised: 09/02/2020] [Accepted: 10/17/2020] [Indexed: 01/11/2023]

Cited by Other Article(s)

Han Y, Zhang H, Zeng Z, Liu Z, Lu D, Liu Z. Descriptor-augmented machine learning for enzyme-chemical interaction predictions. Synth Syst Biotechnol 2024;9:259-268. [PMID: 38450325 PMCID: PMC10915406 DOI: 10.1016/j.synbio.2024.02.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/01/2023] [Revised: 02/21/2024] [Accepted: 02/22/2024] [Indexed: 03/08/2024] Open

Abstract

Descriptors play a pivotal role in enzyme design for the greener synthesis of biochemicals, as they can characterize enzymes and chemicals from physicochemical and evolutionary perspectives. This study examined the effects of various descriptors on the performance of a Random Forest model used for predicting enzyme-chemical relationships. We curated activity data for seven specific enzyme families from the literature and developed a pipeline for evaluating machine learning model performance using 10-fold cross-validation. The influence of protein and chemical descriptors was assessed in three scenarios: predicting the activity of unknown relations between known enzymes and known chemicals (new relationship evaluation), predicting the activity of novel enzymes on known chemicals (new enzyme evaluation), and predicting the activity of new chemicals on known enzymes (new chemical evaluation). The results showed that protein descriptors significantly enhanced the classification performance of the model in the new enzyme evaluation in three out of the seven datasets with the greatest number of enzymes, whereas chemical descriptors appeared to have no effect. A variety of sequence-based and structure-based protein descriptors were constructed, among which the ESM-2 descriptor achieved the best results. Using enzyme families as labels showed that the descriptors could cluster proteins well, which could explain the contributions of descriptors to the machine learning model. Conversely, in the new chemical evaluation, chemical descriptors yielded significant improvement in four out of the seven datasets, while protein descriptors appeared to have no effect. We attempted to evaluate the generalization ability of the model by correlating the statistics of the datasets with the performance of the models. The results showed that datasets with higher sequence similarity were more likely to achieve better results in the new enzyme evaluation, and datasets with more enzymes were more likely to benefit from the protein descriptor strategy. This work provides guidance for the development of machine learning models for specific enzyme families.
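The three evaluation scenarios in this abstract differ mainly in how enzyme-chemical pairs are assigned to cross-validation folds. The following is a minimal sketch of that idea, not the authors' pipeline: the function names `pair_folds` and `group_folds`, the `(enzyme, chemical, label)` tuple layout, and the toy data are all assumptions made for illustration.

```python
import random


def pair_folds(pairs, n_folds=10, seed=0):
    """New relationship evaluation (assumed form): hold out random pairs.

    Enzymes and chemicals in the test fold usually also occur in training,
    although this simple shuffle does not strictly guarantee it.
    """
    order = list(range(len(pairs)))
    random.Random(seed).shuffle(order)
    for k in range(n_folds):
        held_out = set(order[k::n_folds])
        train = [p for i, p in enumerate(pairs) if i not in held_out]
        test = [p for i, p in enumerate(pairs) if i in held_out]
        yield train, test


def group_folds(pairs, key_index, n_folds=10, seed=0):
    """Hold out whole groups so test entities are never seen in training.

    key_index=0 groups by enzyme (new enzyme evaluation);
    key_index=1 groups by chemical (new chemical evaluation).
    If there are fewer groups than folds, some test folds are empty.
    """
    groups = sorted({p[key_index] for p in pairs})
    random.Random(seed).shuffle(groups)
    fold_of = {g: i % n_folds for i, g in enumerate(groups)}
    for k in range(n_folds):
        train = [p for p in pairs if fold_of[p[key_index]] != k]
        test = [p for p in pairs if fold_of[p[key_index]] == k]
        yield train, test


# Hypothetical toy data: 20 enzymes x 5 chemicals with arbitrary labels.
pairs = [(f"enz{e}", f"chem{c}", (e + c) % 2) for e in range(20) for c in range(5)]

new_relationship = list(pair_folds(pairs, n_folds=10))
new_enzyme = list(group_folds(pairs, key_index=0, n_folds=10))
new_chemical = list(group_folds(pairs, key_index=1, n_folds=5))
```

Grouping by enzyme or by chemical is what separates the "new enzyme" and "new chemical" settings from a plain random split: a model cannot score well on them by memorizing entities it saw during training.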

Kroll A, Ranjan S, Lercher MJ. A multimodal Transformer Network for protein-small molecule interactions enhances predictions of kinase inhibition and enzyme-substrate relationships. PLoS Comput Biol 2024;20:e1012100. [PMID: 38768223 DOI: 10.1371/journal.pcbi.1012100] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/08/2024] [Accepted: 04/24/2024] [Indexed: 05/22/2024] Open

Atallah C, James K, Ou Z, Skelton J, Markham D, Burridge MS, Finnigan J, Charnock S, Wipat A. A method for the systematic selection of enzyme panel candidates by solving the maximum diversity problem. Biosystems 2024;236:105105. [PMID: 38160995 DOI: 10.1016/j.biosystems.2023.105105] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 01/15/2023] [Revised: 12/05/2023] [Accepted: 12/15/2023] [Indexed: 01/03/2024]

Ao YF, Dörr M, Menke MJ, Born S, Heuson E, Bornscheuer UT. Data-Driven Protein Engineering for Improving Catalytic Activity and Selectivity. Chembiochem 2024;25:e202300754. [PMID: 38029350 DOI: 10.1002/cbic.202300754] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/03/2023] [Revised: 11/28/2023] [Accepted: 11/29/2023] [Indexed: 12/01/2023]

King-Smith E, Faber FA, Reilly U, Sinitskiy AV, Yang Q, Liu B, Hyek D, Lee AA. Predictive Minisci late stage functionalization with transfer learning. Nat Commun 2024;15:426. [PMID: 38225239 PMCID: PMC10789750 DOI: 10.1038/s41467-023-42145-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/27/2023] [Accepted: 10/01/2023] [Indexed: 01/17/2024] Open

Robinson SL. Structure-guided metagenome mining to tap microbial functional diversity. Curr Opin Microbiol 2023;76:102382. [PMID: 37741262 DOI: 10.1016/j.mib.2023.102382] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/17/2023] [Revised: 05/21/2023] [Accepted: 08/22/2023] [Indexed: 09/25/2023]

Gao L, Yu Z, Wang S, Hou Y, Zhang S, Zhou C, Wu X. A new paradigm in lignocellulolytic enzyme cocktail optimization: Free from expert-level prior knowledge and experimental datasets. Bioresour Technol 2023;388:129758. [PMID: 37717701 DOI: 10.1016/j.biortech.2023.129758] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/04/2023] [Revised: 09/05/2023] [Accepted: 09/07/2023] [Indexed: 09/19/2023]

Lehner MT, Katzberger P, Maeder N, Schiebroek CC, Teetz J, Landrum GA, Riniker S. DASH: Dynamic Attention-Based Substructure Hierarchy for Partial Charge Assignment. J Chem Inf Model 2023;63:6014-6028. [PMID: 37738206 PMCID: PMC10565818 DOI: 10.1021/acs.jcim.3c00800] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/25/2023] [Indexed: 09/24/2023]

Zhang Q, Zheng W, Song Z, Zhang Q, Yang L, Wu J, Lin J, Xu G, Yu H. Machine Learning Enables Prediction of Pyrrolysyl-tRNA Synthetase Substrate Specificity. ACS Synth Biol 2023;12:2403-2417. [PMID: 37486975 DOI: 10.1021/acssynbio.3c00225] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/26/2023]

Abstract

Knowledge about the substrate scope for a given enzyme is informative for elucidating biochemical pathways and also for expanding applications of the enzyme. However, no general methods are available to accurately predict the substrate specificity of an enzyme. Pyrrolysyl-tRNA synthetase (PylRS) is a powerful tool for incorporating various noncanonical amino acids (NCAAs) into proteins, which has enabled us to probe, image, rationally engineer, and evolve protein structure and function. However, the incorporation of a new NCAA typically requires the selection of large libraries of PylRS with randomized mutations at active sites, and this process requires multiple rounds of selection for each new substrate. Therefore, a single aminoacyl-tRNA synthetase with broad substrate promiscuity is ideal to facilitate widespread applications of the genetic NCAA incorporation technique. Herein, machine learning models were developed to predict the substrate specificity of PylRS toward novel NCAAs that could be incorporated into proteins by three PylRS mutants. The models were built from a training set of 285 unique enzyme-substrate pairs of three PylRS mutants (IFRS, BtaRS, and MFRS) against 95 NCAAs. The best BaggingTree (BT) model was then used for virtual screening of an NCAA library containing 1474 phenylalanine, tyrosine, tryptophan, and alanine analogues, and 156 NCAAs were predicted to be accepted by at least one of the three PylRS mutants. Then, 27 NCAAs, including 24 positive and 3 negative substrates, were experimentally tested for their activities. Of the 24 positive substrates, 20 showed weak or strong activity and were accepted by at least one PylRS mutant, and 11 of these NCAAs had never before been reported to be incorporated into proteins. The three negative substrates did not show any activity. Experimental results suggested that the BT model provides a three-class classification accuracy of 0.69 and a binary classification accuracy of 0.86. This study expanded the substrate scope of three PylRS variants and provided a framework for developing machine learning models to predict the substrate specificity of other PylRS variants.

Li A, Cui H, Sheng Y, Qiao J, Li X, Huang H. Global plastic upcycling during and after the COVID-19 pandemic: The status and perspective. J Environ Chem Eng 2023;11:110092. [PMID: 37200549 PMCID: PMC10167783 DOI: 10.1016/j.jece.2023.110092] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/07/2023] [Revised: 04/10/2023] [Accepted: 05/08/2023] [Indexed: 05/20/2023]

Kroll A, Ranjan S, Engqvist MKM, Lercher MJ. A general model to predict small molecule substrates of enzymes based on machine and deep learning. Nat Commun 2023;14:2787. [PMID: 37188731 DOI: 10.1038/s41467-023-38347-2] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 06/24/2022] [Accepted: 04/21/2023] [Indexed: 05/17/2023] Open

Vasina M, Kovar D, Damborsky J, Ding Y, Yang T, deMello A, Mazurenko S, Stavrakis S, Prokop Z. In-depth analysis of biocatalysts by microfluidics: An emerging source of data for machine learning. Biotechnol Adv 2023;66:108171. [PMID: 37150331 DOI: 10.1016/j.biotechadv.2023.108171] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/24/2023] [Revised: 05/04/2023] [Accepted: 05/04/2023] [Indexed: 05/09/2023]

Huang A, Lu F, Liu F. Discrimination of psychrophilic enzymes using machine learning algorithms with amino acid composition descriptor. Front Microbiol 2023;14:1130594. [PMID: 36860491 PMCID: PMC9968940 DOI: 10.3389/fmicb.2023.1130594] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/23/2022] [Accepted: 01/23/2023] [Indexed: 02/16/2023] Open

Abstract

Introduction

Psychrophilic enzymes are a class of macromolecules with high catalytic activity at low temperatures. Cold-active enzymes, which possess eco-friendly and cost-effective properties, have great potential for application in the detergent, textile, environmental remediation, pharmaceutical, and food industries. Compared with time-consuming and labor-intensive experiments, computational modeling, especially with machine learning (ML) algorithms, is a high-throughput screening tool for identifying psychrophilic enzymes efficiently.

Methods

In this study, the influence of four ML methods (support vector machines, K-nearest neighbor, random forest, and naïve Bayes) and three descriptors, i.e., amino acid composition (AAC), dipeptide combinations (DPC), and AAC + DPC, on model performance was systematically analyzed.

Results and discussion

Among the four ML methods, the support vector machine model based on the AAC descriptor achieved the best prediction accuracy, 80.6%, under 5-fold cross-validation. The AAC descriptor outperformed the DPC and AAC + DPC descriptors regardless of the ML method used. In addition, a comparison of amino acid frequencies between psychrophilic and non-psychrophilic proteins revealed that higher frequencies of Ala, Gly, Ser, and Thr, and lower frequencies of Glu, Lys, Arg, Ile, Val, and Leu could be related to protein psychrophilicity. Further, ternary models were developed that could classify psychrophilic, mesophilic, and thermophilic proteins effectively. The predictive accuracy of the ternary classification model using the AAC descriptor with the support vector machine algorithm was 75.8%. These findings would enhance our insight into the cold-adaptation mechanisms of psychrophilic proteins and aid in the design of engineered cold-active enzymes. Moreover, the proposed model could be used as a screening tool to identify novel cold-adapted proteins.
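The AAC, DPC, and AAC + DPC descriptors named in this abstract can be illustrated with a short sketch. It assumes the common definitions (AAC as the fraction of each of the 20 standard residues, DPC as the fraction of each of the 400 ordered adjacent residue pairs); the abstract does not spell out the authors' exact definitions, and the function names `aac`, `dpc`, and `aac_dpc` are hypothetical.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, fixed order


def aac(sequence):
    """Amino acid composition: 20 fractions that sum to 1."""
    residues = [r for r in sequence.upper() if r in AMINO_ACIDS]
    if not residues:
        raise ValueError("sequence contains no standard residues")
    return [residues.count(a) / len(residues) for a in AMINO_ACIDS]


def dpc(sequence):
    """Dipeptide frequencies: 400 fractions over adjacent residue pairs.

    Pairs that include a non-standard character are skipped (an assumption).
    """
    seq = sequence.upper()
    pairs = [
        seq[i:i + 2]
        for i in range(len(seq) - 1)
        if seq[i] in AMINO_ACIDS and seq[i + 1] in AMINO_ACIDS
    ]
    if not pairs:
        raise ValueError("sequence contains no standard dipeptides")
    return [pairs.count(a + b) / len(pairs) for a, b in product(AMINO_ACIDS, repeat=2)]


def aac_dpc(sequence):
    """Concatenated AAC + DPC descriptor (420 values)."""
    return aac(sequence) + dpc(sequence)
```

Each function returns a fixed-length vector regardless of sequence length, which is what allows classifiers such as a support vector machine to be trained on proteins of different sizes.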

Jiang Y, Ran X, Yang ZJ. Data-driven enzyme engineering to identify function-enhancing enzymes. Protein Eng Des Sel 2023;36:gzac009. [PMID: 36214500 PMCID: PMC10365845 DOI: 10.1093/protein/gzac009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/31/2022] [Revised: 08/08/2022] [Accepted: 09/28/2022] [Indexed: 01/22/2023] Open

Lim PK, Julca I, Mutwil M. Redesigning plant specialized metabolism with supervised machine learning using publicly available reactome data. Comput Struct Biotechnol J 2023;21:1639-1650. [PMID: 36874159 PMCID: PMC9976193 DOI: 10.1016/j.csbj.2023.01.013] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/27/2022] [Revised: 01/12/2023] [Accepted: 01/12/2023] [Indexed: 01/19/2023] Open

Durairaj J, de Ridder D, van Dijk AD. Beyond sequence: Structure-based machine learning. Comput Struct Biotechnol J 2022;21:630-643. [PMID: 36659927 PMCID: PMC9826903 DOI: 10.1016/j.csbj.2022.12.039] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/26/2022] [Revised: 12/21/2022] [Accepted: 12/21/2022] [Indexed: 12/31/2022] Open

Tian Y, Zhang D, Cai P, Lin H, Ying H, Hu QN, Wu A. Elimination of Fusarium mycotoxin deoxynivalenol (DON) via microbial and enzymatic strategies: Current status and future perspectives. Trends Food Sci Technol 2022. [DOI: 10.1016/j.tifs.2022.04.002] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 12/01/2022]

Kovács SC, Szappanos B, Tengölics R, Notebaart RA, Papp B. Underground metabolism as a rich reservoir for pathway engineering. Bioinformatics 2022;38:3070-3077. [PMID: 35441658 PMCID: PMC9154287 DOI: 10.1093/bioinformatics/btac282] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 01/14/2022] [Revised: 04/12/2022] [Accepted: 04/14/2022] [Indexed: 11/25/2022] Open

Dudley QM, Cai YM, Kallam K, Debreyne H, Carrasco Lopez JA, Patron NJ. Biofoundry-assisted expression and characterization of plant proteins. Synth Biol (Oxf) 2021;6:ysab029. [PMID: 34693026 PMCID: PMC8529701 DOI: 10.1093/synbio/ysab029] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 03/10/2021] [Revised: 08/25/2021] [Accepted: 09/09/2021] [Indexed: 12/29/2022] Open

Abstract

Many goals in synthetic biology, including the elucidation and refactoring of biosynthetic pathways and the engineering of regulatory circuits and networks, require knowledge of protein function. In plants, the prevalence of large gene families means it can be particularly challenging to link specific functions to individual proteins. However, protein characterization has remained a technical bottleneck, often requiring significant effort to optimize expression and purification protocols. To leverage the ability of biofoundries to accelerate design-build-test-learn cycles, we present a workflow for automated DNA assembly and cell-free expression of plant proteins that accelerates optimization and enables rapid screening of enzyme activity. First, we developed a phytobrick-compatible Golden Gate DNA assembly toolbox containing plasmid acceptors for cell-free expression using Escherichia coli or wheat germ lysates as well as a set of N- and C-terminal tag parts for detection, purification and improved expression/folding. We next optimized automated assembly of miniaturized cell-free reactions using an acoustic liquid handling platform and then compared tag configurations to identify those that increase expression. We additionally developed a luciferase-based system for rapid quantification that requires a minimal 11-amino acid tag and demonstrate facile removal of tags following synthesis. Finally, we show that several functional assays can be performed with cell-free protein synthesis reactions without the need for protein purification. Together, the combination of automated assembly of DNA parts and cell-free expression reactions should significantly increase the throughput of experiments to test and understand plant protein function and enable the direct reuse of DNA parts in downstream plant engineering workflows.

Dutta K, Shityakov S, Khalifa I. New Trends in Bioremediation Technologies Toward Environment-Friendly Society: A Mini-Review. Front Bioeng Biotechnol 2021;9:666858. [PMID: 34409018 PMCID: PMC8365754 DOI: 10.3389/fbioe.2021.666858] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 02/11/2021] [Accepted: 05/26/2021] [Indexed: 01/29/2023] Open

Jang WD, Kim GB, Kim Y, Lee SY. Applications of artificial intelligence to enzyme and pathway design for metabolic engineering. Curr Opin Biotechnol 2021;73:101-107. [PMID: 34358728 DOI: 10.1016/j.copbio.2021.07.024] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 07/05/2021] [Revised: 07/16/2021] [Accepted: 07/17/2021] [Indexed: 01/07/2023]

Affiliation(s)

Woo Dae Jang Metabolic and Biomolecular Engineering National Research Laboratory, Department of Chemical and Biomolecular Engineering (BK21 four), Korea Advanced Institute of Science and Technology (KAIST), 291 Daehak-ro, Yuseong-gu, Daejeon 34141, Republic of Korea; Systems Metabolic Engineering and Systems Healthcare Cross-Generation Collaborative Laboratory, KAIST, 291 Daehak-ro, Yuseong-gu, Daejeon 34141, Republic of Korea; KAIST Institute for the BioCentury, KAIST Institute for Artificial Intelligence, BioProcess Engineering Research Center and BioInformatics Research Center, KAIST, 291 Daehak-ro, Yuseong-gu, Daejeon 34141, Republic of Korea
Gi Bae Kim Metabolic and Biomolecular Engineering National Research Laboratory, Department of Chemical and Biomolecular Engineering (BK21 four), Korea Advanced Institute of Science and Technology (KAIST), 291 Daehak-ro, Yuseong-gu, Daejeon 34141, Republic of Korea; Systems Metabolic Engineering and Systems Healthcare Cross-Generation Collaborative Laboratory, KAIST, 291 Daehak-ro, Yuseong-gu, Daejeon 34141, Republic of Korea
Yeji Kim Metabolic and Biomolecular Engineering National Research Laboratory, Department of Chemical and Biomolecular Engineering (BK21 four), Korea Advanced Institute of Science and Technology (KAIST), 291 Daehak-ro, Yuseong-gu, Daejeon 34141, Republic of Korea; Systems Metabolic Engineering and Systems Healthcare Cross-Generation Collaborative Laboratory, KAIST, 291 Daehak-ro, Yuseong-gu, Daejeon 34141, Republic of Korea
Sang Yup Lee Metabolic and Biomolecular Engineering National Research Laboratory, Department of Chemical and Biomolecular Engineering (BK21 four), Korea Advanced Institute of Science and Technology (KAIST), 291 Daehak-ro, Yuseong-gu, Daejeon 34141, Republic of Korea; Systems Metabolic Engineering and Systems Healthcare Cross-Generation Collaborative Laboratory, KAIST, 291 Daehak-ro, Yuseong-gu, Daejeon 34141, Republic of Korea; KAIST Institute for the BioCentury, KAIST Institute for Artificial Intelligence, BioProcess Engineering Research Center and BioInformatics Research Center, KAIST, 291 Daehak-ro, Yuseong-gu, Daejeon 34141, Republic of Korea.

Fenner K, Elsner M, Lueders T, McLachlan MS, Wackett LP, Zimmermann M, Drewes JE. Methodological Advances to Study Contaminant Biotransformation: New Prospects for Understanding and Reducing Environmental Persistence? ACS ES&T Water 2021;1:1541-1554. [PMID: 34278380 PMCID: PMC8276273 DOI: 10.1021/acsestwater.1c00025] [Citation(s) in RCA: 22] [Impact Index Per Article: 7.3] [Received: 01/23/2021] [Revised: 06/11/2021] [Accepted: 06/11/2021] [Indexed: 05/14/2023]