Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Jonsson J, Norberg T, Carlsson L, Gustafsson C, Wold S. Quantitative sequence-activity models (QSAM)--tools for sequence design. Nucleic Acids Res 1993;21:733-9. [PMID: 8441682 PMCID: PMC309176 DOI: 10.1093/nar/21.3.733] [Citation(s) in RCA: 46] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/30/2023] Open

Cited by Other Article(s)

Kang CK, Kim AR. Deep molecular learning of transcriptional control of a synthetic CRE enhancer and its variants. iScience 2024;27:108747. [PMID: 38222110 PMCID: PMC10784702 DOI: 10.1016/j.isci.2023.108747] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/20/2023] [Revised: 08/29/2023] [Accepted: 12/12/2023] [Indexed: 01/16/2024] Open

Ren N, Dai S, Ma S, Yang F. Strategies for activity analysis of single nucleotide polymorphisms associated with human diseases. Clin Genet 2023;103:392-400. [PMID: 36527336 DOI: 10.1111/cge.14282] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/26/2022] [Revised: 12/10/2022] [Accepted: 12/13/2022] [Indexed: 12/23/2022]

Dong X, Zheng W. Cheminformatics Modeling of Gene Silencing for Both Natural and Chemically Modified siRNAs. Molecules 2022;27:6412. [PMID: 36234948 PMCID: PMC9570765 DOI: 10.3390/molecules27196412] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/22/2022] [Revised: 09/23/2022] [Accepted: 09/25/2022] [Indexed: 11/17/2022] Open

Van Brempt M, Peeters AI, Duchi D, De Wannemaeker L, Maertens J, De Paepe B, De Mey M. Biosensor-driven, model-based optimization of the orthogonally expressed naringenin biosynthesis pathway. Microb Cell Fact 2022;21:49. [PMID: 35346204 PMCID: PMC8962593 DOI: 10.1186/s12934-022-01775-8] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/01/2021] [Accepted: 03/15/2022] [Indexed: 12/30/2022] Open

Abstract

Background

The rapidly expanding synthetic biology toolbox allows engineers to develop smarter strategies to tackle the optimization of complex biosynthetic pathways. In such a strategy, multi-gene pathways are subdivided into several modules, each of which is dynamically controlled to fine-tune its expression in response to a changing cellular environment. To fine-tune separate modules without interference between modules or from the host regulatory machinery, a sigma factor (σ) toolbox was developed in previous work for tunable orthogonal gene expression. Here, this toolbox is implemented in E. coli to orthogonally express and fine-tune a pathway for the heterologous biosynthesis of the industrially relevant plant metabolite naringenin. To optimize the production of this pathway, a practical workflow is still imperative to balance all steps of the pathway. This is tackled here by biosensor-driven screening, subsequent genotyping of combinatorially engineered libraries, and finally the training of three different computer models to predict the optimal pathway configuration.

Results

The efficiency and knowledge gained through this workflow are demonstrated here by improving the naringenin production titer by 32% with respect to a random pathway library screen. Our best strain was cultured in a batch bioreactor experiment and was able to produce 286 mg/L naringenin from glycerol in approximately 26 h. This is the highest reported naringenin production titer in E. coli without the supplementation of pathway precursors to the medium or any precursor pathway engineering. In addition, valuable pathway configuration preferences were identified in the statistical learning process, such as specific enzyme variant preferences and significant correlations between promoter strength at specific steps in the pathway and titer.

Conclusions

An efficient strategy, powered by orthogonal expression, was applied to successfully optimize a biosynthetic pathway for microbial production of flavonoids in E. coli up to high, competitive levels. Within this strategy, statistical learning techniques were combined with combinatorial pathway optimization techniques and an in vivo high-throughput screening method to efficiently determine the optimal operon configuration of the pathway. This “pathway architecture designer” workflow can be applied for the fast and efficient development of new microbial cell factories for different types of molecules of interest while also providing additional insights into the underlying pathway characteristics.

Supplementary Information

The online version contains supplementary material available at 10.1186/s12934-022-01775-8.

Zhou P, Liu Q, Wu T, Miao Q, Shang S, Wang H, Chen Z, Wang S, Wang H. Systematic Comparison and Comprehensive Evaluation of 80 Amino Acid Descriptors in Peptide QSAR Modeling. J Chem Inf Model 2021;61:1718-1731. [DOI: 10.1021/acs.jcim.0c01370] [Citation(s) in RCA: 44] [Impact Index Per Article: 11.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/07/2023]

Affiliation(s)

Peng Zhou: Center for Informational Biology, University of Electronic Science and Technology of China (UESTC) at Qingshuihe Campus, Chengdu 611731, China; School of Life Science and Technology, University of Electronic Science and Technology of China (UESTC) at Shahe Campus, Chengdu 610054, China
Qian Liu: Center for Informational Biology, University of Electronic Science and Technology of China (UESTC) at Qingshuihe Campus, Chengdu 611731, China; School of Life Science and Technology, University of Electronic Science and Technology of China (UESTC) at Shahe Campus, Chengdu 610054, China
Ting Wu: School of Life Science and Technology, University of Electronic Science and Technology of China (UESTC) at Shahe Campus, Chengdu 610054, China
Qingqing Miao: Center for Informational Biology, University of Electronic Science and Technology of China (UESTC) at Qingshuihe Campus, Chengdu 611731, China; School of Life Science and Technology, University of Electronic Science and Technology of China (UESTC) at Shahe Campus, Chengdu 610054, China
Shuyong Shang: College of Chemistry and Life Science, Chengdu Normal University, Chengdu 611130, China
Heyi Wang: Center for Informational Biology, University of Electronic Science and Technology of China (UESTC) at Qingshuihe Campus, Chengdu 611731, China; School of Life Science and Technology, University of Electronic Science and Technology of China (UESTC) at Shahe Campus, Chengdu 610054, China
Zheng Chen: Center for Informational Biology, University of Electronic Science and Technology of China (UESTC) at Qingshuihe Campus, Chengdu 611731, China; School of Life Science and Technology, University of Electronic Science and Technology of China (UESTC) at Shahe Campus, Chengdu 610054, China
Shaozhou Wang: School of Life Science and Technology, University of Electronic Science and Technology of China (UESTC) at Shahe Campus, Chengdu 610054, China
Heyan Wang: School of Life Science and Technology, University of Electronic Science and Technology of China (UESTC) at Shahe Campus, Chengdu 610054, China

Gilman J, Zulkower V, Menolascina F. Using a Design of Experiments Approach to Inform the Design of Hybrid Synthetic Yeast Promoters. Methods Mol Biol 2021;2189:1-17. [PMID: 33180289 DOI: 10.1007/978-1-0716-0822-7_1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 06/11/2023]

Van Brempt M, Clauwaert J, Mey F, Stock M, Maertens J, Waegeman W, De Mey M. Predictive design of sigma factor-specific promoters. Nat Commun 2020;11:5822. [PMID: 33199691 PMCID: PMC7670410 DOI: 10.1038/s41467-020-19446-w] [Citation(s) in RCA: 30] [Impact Index Per Article: 6.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/07/2020] [Accepted: 10/13/2020] [Indexed: 02/07/2023] Open

Ferreira A, Lapa R, Vale N. Combination of Gemcitabine with Cell-Penetrating Peptides: A Pharmacokinetic Approach Using In Silico Tools. Biomolecules 2019;9:693. [PMID: 31690028 PMCID: PMC6921036 DOI: 10.3390/biom9110693] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/23/2019] [Revised: 10/07/2019] [Accepted: 11/01/2019] [Indexed: 02/06/2023] Open

Quantitative sequence-activity modeling of ACE peptide originated from milk using ACC-QTMS amino acid indices. Amino Acids 2019;51:1209-1220. [PMID: 31321559 DOI: 10.1007/s00726-019-02761-y] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/12/2019] [Accepted: 07/05/2019] [Indexed: 01/06/2023]

Gilman J, Singleton C, Tennant RK, James P, Howard TP, Lux T, Parker DA, Love J. Rapid, Heuristic Discovery and Design of Promoter Collections in Non-Model Microbes for Industrial Applications. ACS Synth Biol 2019;8:1175-1186. [PMID: 30995831 DOI: 10.1021/acssynbio.9b00061] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/27/2022]

Advancement of Metabolic Engineering Assisted by Synthetic Biology. Catalysts 2018. [DOI: 10.3390/catal8120619] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/11/2023] Open

Peters G, Maertens J, Lammertyn J, De Mey M. Exploring of the feature space of de novo developed post-transcriptional riboregulators. PLoS Comput Biol 2018;14:e1006170. [PMID: 30118473 PMCID: PMC6114898 DOI: 10.1371/journal.pcbi.1006170] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/08/2017] [Revised: 08/29/2018] [Accepted: 04/30/2018] [Indexed: 11/23/2022] Open

Abstract

Metabolic engineering increasingly depends upon RNA technology to rewire metabolism in a customized way to maximize production. To this end, pure riboregulators allow dynamic gene repression without the need for a potentially burdensome coexpressed protein, unlike typical Hfq-binding small RNAs and clustered regularly interspaced short palindromic repeats technology. Despite this clear advantage, no general design principles are available to develop repressing riboregulators de novo, limiting the availability and the reliable development of this type of riboregulator. Here, to overcome this lack of knowledge on the functionality of repressing riboregulators, translation inhibiting RNAs are developed from scratch. These de novo developed riboregulators explore features related to thermodynamic and structural factors previously attributed to translation initiation modulation. In total, 12 structural and thermodynamic features were defined, of which six were retained after removing correlations from an in silico generated riboregulator library. From this translation inhibiting RNA library, 18 riboregulators were selected using an experimental design and subsequently constructed and co-expressed with two target untranslated regions to link the translation inhibiting RNA features to functionality. The pure riboregulators in the design of experiments showed repression down to 6% of the original protein expression levels, which could only be partially explained by an ordinary least squares regression model. To allow reliable forward engineering, a partial least squares regression model was constructed and validated to link the properties of translation inhibiting RNA riboregulators to gene repression. In this model both structural and thermodynamic features were important for efficient gene repression by pure riboregulators. This approach enables a more reliable de novo forward engineering of effective pure riboregulators, which further expands the RNA toolbox for gene expression modulation.
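
The abstract above states that 12 features were reduced to six "after removing correlations" but does not say how. The following is a minimal sketch of one common way to do this, a greedy pairwise Pearson filter. The function names, the 0.9 threshold, and the feature names and values are all illustrative assumptions, not the authors' procedure or data.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists (non-constant inputs assumed)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def drop_correlated(features, threshold=0.9):
    """Greedy filter: keep a feature only if |r| with every already-kept feature is below threshold.

    features: dict mapping feature name -> list of values across the library.
    The threshold of 0.9 is an assumption chosen for illustration.
    """
    kept = []
    for name in features:
        if all(abs(pearson(features[name], features[k])) < threshold for k in kept):
            kept.append(name)
    return kept

# Hypothetical feature table for five in silico riboregulators (made-up numbers).
library = {
    "hybridization_energy": [-12.0, -9.5, -15.2, -7.1, -11.3],
    "duplex_energy_scaled": [-24.0, -19.0, -30.4, -14.2, -22.6],  # 2x the first feature, so redundant
    "hairpin_length": [5, 9, 4, 7, 6],
}
retained = drop_correlated(library)
print(retained)
```

The result depends on the order in which features are visited, so a real analysis would need an explicit rule for which member of a correlated pair to keep.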

To allow reliable forward engineering of microbial cell factories, various metabolic engineering efforts rely on RNA-based technology. As such, programmable riboregulators allow dynamic control over gene expression. However, no clear design principles exist for de novo developed repressing riboregulators, which limits their applicability. Here, various engineering principles are identified and computationally explored. Subsequently, various design criteria are used in an experimental design and explored in an in vivo study. This resulted in a regression model that enables a more reliable computational design of repressing small RNAs.

Portela RMC, von Stosch M, Oliveira R. Hybrid semiparametric systems for quantitative sequence-activity modeling of synthetic biological parts. Synth Biol (Oxf) 2018;3:ysy010. [PMID: 32995518 PMCID: PMC7513808 DOI: 10.1093/synbio/ysy010] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/07/2018] [Revised: 05/21/2018] [Accepted: 06/11/2018] [Indexed: 12/20/2022] Open

Barley MH, Turner NJ, Goodacre R. Improved Descriptors for the Quantitative Structure-Activity Relationship Modeling of Peptides and Proteins. J Chem Inf Model 2018;58:234-243. [PMID: 29338232 DOI: 10.1021/acs.jcim.7b00488] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/28/2022]

Guruge I, Taherzadeh G, Zhan J, Zhou Y, Yang Y. B-factor profile prediction for RNA flexibility using support vector machines. J Comput Chem 2017;39:407-411. [DOI: 10.1002/jcc.25124] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/11/2017] [Accepted: 11/07/2017] [Indexed: 12/12/2022]

Synthetic promoter design for new microbial chassis. Biochem Soc Trans 2017;44:731-7. [PMID: 27284035 PMCID: PMC4900742 DOI: 10.1042/bst20160042] [Citation(s) in RCA: 39] [Impact Index Per Article: 4.9] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/01/2016] [Indexed: 01/31/2023]

Moses T, Mehrshahi P, Smith AG, Goossens A. Synthetic biology approaches for the production of plant metabolites in unicellular organisms. J Exp Bot 2017;68:4057-4074. [PMID: 28449101 DOI: 10.1093/jxb/erx119] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.1] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/07/2023]

Beier R, Labudde D. Numeric promoter description - A comparative view on concepts and general application. J Mol Graph Model 2015;63:65-77. [PMID: 26655334 DOI: 10.1016/j.jmgm.2015.11.011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/01/2015] [Revised: 11/12/2015] [Accepted: 11/17/2015] [Indexed: 11/25/2022]

Abstract

Nucleic acid molecules play a key role in a variety of biological processes. Beyond storage and transfer tasks, this also comprises the triggering of biological processes, regulatory effects and the active influence gained by target binding. Based on the experimental output (in this case promoter sequences), further in silico analyses aid in gaining new insights into these processes and interactions. The numerical description of nucleic acids thereby constitutes a bridge between the concrete biological issues and the analytical methods. Hence, this study compares 26 descriptor sets obtained by applying well-known numerical description concepts to an established dataset of 38 DNA promoter sequences. The suitability of the descriptor sets was evaluated by computing partial least squares regression models and assessing the model accuracy. We conclude that descriptive power depends mainly on positional information rather than on explicitly incorporated physico-chemical information, since a sufficient amount of implicit physico-chemical information is already encoded in the nucleobase classification. The regression models especially benefited from employing the information that is encoded in the sequential and structural neighborhood of the nucleobases. Thus, the analyses of n-grams (short fragments of length n) suggested that they are valuable descriptors for DNA target interactions. A mixed n-gram descriptor set thereby yielded the best description of the promoter sequences. The corresponding regression model was checked and found to be plausible, as it was able to reproduce the characteristic binding motifs of promoter sequences to a reasonable degree. As most functional nucleic acids are based on the principle of molecular recognition, the findings are not restricted to promoter sequences, but can rather be transferred to other kinds of functional nucleic acids. Thus, the concepts presented in this study could provide advantages for future nucleic acid-based technologies, like biosensing, therapeutics and molecular imaging.
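
To make the n-gram idea in the abstract above concrete, here is a small sketch that turns a DNA sequence into a mixed n-gram descriptor (counts of all fragments of several lengths) and, separately, a position-tagged variant. The function names, the choice of n = 1 to 3, and the example hexamer are assumptions for illustration; the study's actual descriptor sets may be defined differently.

```python
from collections import Counter

def ngram_counts(seq, n):
    """Count every overlapping fragment of length n in seq."""
    return Counter(seq[i:i + n] for i in range(len(seq) - n + 1))

def mixed_ngram_descriptor(seq, sizes=(1, 2, 3)):
    """Merge counts for several n into one dict; keys of different lengths cannot collide."""
    descriptor = {}
    for n in sizes:
        descriptor.update(ngram_counts(seq, n))
    return descriptor

def positional_ngrams(seq, n):
    """Position-aware variant: one indicator per (start index, fragment) pair."""
    return {(i, seq[i:i + n]): 1 for i in range(len(seq) - n + 1)}

# Illustrative input only: a TATAAT hexamer, not taken from the study's dataset.
desc = mixed_ngram_descriptor("TATAAT")
pos = positional_ngrams("TATAAT", 2)
print(desc["TA"], desc["T"], desc["AAT"])
```

The count-based descriptor discards where a fragment occurs, while the positional variant keeps it; the abstract's conclusion about positional information suggests the latter kind of encoding matters for promoters.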

Shreif Z, Striegel DA, Periwal V. The jigsaw puzzle of sequence phenotype inference: Piecing together Shannon entropy, importance sampling, and Empirical Bayes. J Theor Biol 2015;380:399-413. [PMID: 26092377 DOI: 10.1016/j.jtbi.2015.06.010] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/04/2015] [Revised: 04/29/2015] [Accepted: 06/05/2015] [Indexed: 11/24/2022]

Quantitative sequence–activity modeling of antimicrobial hexapeptides using a segmented principal component strategy: an approach to describe and predict activities of peptide drugs containing l/d and unnatural residues. Amino Acids 2014;47:125-34. [DOI: 10.1007/s00726-014-1850-8] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/18/2014] [Accepted: 10/03/2014] [Indexed: 12/20/2022]

van den Berg BA, Reinders MJ, van der Laan JM, Roubos JA, de Ridder D. Protein redesign by learning from data. Protein Eng Des Sel 2014;27:281-8. [DOI: 10.1093/protein/gzu031] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/20/2023] Open

Benchmarking of protein descriptor sets in proteochemometric modeling (part 2): modeling performance of 13 amino acid descriptor sets. J Cheminform 2013;5:42. [PMID: 24059743 PMCID: PMC4015169 DOI: 10.1186/1758-2946-5-42] [Citation(s) in RCA: 56] [Impact Index Per Article: 4.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/20/2013] [Accepted: 09/18/2013] [Indexed: 11/10/2022] Open

Abstract

Background

While a large body of work exists on comparing and benchmarking descriptors of molecular structures, a similar comparison of protein descriptor sets is lacking. Hence, in the current work a total of 13 amino acid descriptor sets have been benchmarked with respect to their ability to establish bioactivity models. The descriptor sets included in the study are Z-scales (3 variants), VHSE, T-scales, ST-scales, MS-WHIM, FASGAI, BLOSUM, and a novel protein descriptor set termed ProtFP (4 variants); in addition, we created and benchmarked three pairs of descriptor combinations. Prediction performance was evaluated in seven structure-activity benchmarks which comprise Angiotensin Converting Enzyme (ACE) dipeptidic inhibitor data, and three proteochemometric data sets, namely (1) GPCR ligands modeled against a GPCR panel, (2) enzyme inhibitors (NNRTIs) with associated bioactivities against a set of HIV enzyme mutants, and (3) enzyme inhibitors (PIs) with associated bioactivities on a large set of HIV enzyme mutants.

Results

The amino acid descriptor sets compared here show similar performance (<0.1 log units RMSE difference and <0.1 difference in MCC), while errors for individual proteins were in some cases found to be larger than those resulting from descriptor set differences (>0.3 log units RMSE difference and >0.7 difference in MCC). Combining different descriptor sets generally leads to better modeling performance than utilizing individual sets. The best performers were Z-scales (3) combined with ProtFP (Feature), or Z-Scales (3) combined with an average Z-Scale value for each target, while ProtFP (PCA8), ST-Scales, and ProtFP (Feature) rank last.
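
The two metrics quoted in these results, RMSE and MCC (Matthews correlation coefficient), have standard definitions. The sketch below implements them directly; the function names and the toy inputs are assumptions and are unrelated to the benchmark data.

```python
from math import sqrt

def rmse(y_true, y_pred):
    """Root-mean-square error between observed and predicted values."""
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels (truthy = active)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    tn = sum(1 for t, p in zip(y_true, y_pred) if not t and not p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0.0 when any marginal total is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Toy values for illustration only.
err = rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
score = mcc([1, 1, 0, 0], [1, 0, 0, 0])
print(round(err, 4), round(score, 4))
```

RMSE applies to the regression-style predictions (here, log-unit bioactivities), while MCC summarizes a binary active/inactive classification in a single value between -1 and 1.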

Conclusions

While amino acid descriptor sets capture different aspects of amino acids, their ability to be used for bioactivity modeling is still, on average, surprisingly similar. Still, combining sets describing complementary information leads to a small but consistent improvement in modeling performance (average MCC 0.01 better, average RMSE 0.01 log units lower). Finally, performance differences exist between the targets compared, thereby underlining that choosing an appropriate descriptor set is of fundamental importance for bioactivity modeling, both from the ligand side and from the protein side.

van Westen GJ, Swier RF, Wegner JK, Ijzerman AP, van Vlijmen HW, Bender A. Benchmarking of protein descriptor sets in proteochemometric modeling (part 1): comparative study of 13 amino acid descriptor sets. J Cheminform 2013;5:41. [PMID: 24059694 PMCID: PMC3848949 DOI: 10.1186/1758-2946-5-41] [Citation(s) in RCA: 74] [Impact Index Per Article: 6.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/20/2013] [Accepted: 09/18/2013] [Indexed: 11/10/2022] Open

Building better drugs: developing and regulating engineered therapeutic proteins. Trends Pharmacol Sci 2013;34:534-48. [PMID: 24060103 DOI: 10.1016/j.tips.2013.08.005] [Citation(s) in RCA: 66] [Impact Index Per Article: 5.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/06/2013] [Revised: 08/08/2013] [Accepted: 08/13/2013] [Indexed: 11/22/2022]

Quantitative estimation of activity and quality for collections of functional genetic elements. Nat Methods 2013;10:347-53. [DOI: 10.1038/nmeth.2403] [Citation(s) in RCA: 161] [Impact Index Per Article: 13.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/24/2012] [Accepted: 02/13/2013] [Indexed: 01/20/2023]

Rationally designed families of orthogonal RNA regulators of translation. Nat Chem Biol 2012;8:447-54. [DOI: 10.1038/nchembio.919] [Citation(s) in RCA: 144] [Impact Index Per Article: 11.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/12/2011] [Accepted: 01/27/2012] [Indexed: 12/19/2022]

Gustafsson C, Minshull J, Govindarajan S, Ness J, Villalobos A, Welch M. Engineering genes for predictable protein expression. Protein Expr Purif 2012;83:37-46. [PMID: 22425659 DOI: 10.1016/j.pep.2012.02.013] [Citation(s) in RCA: 118] [Impact Index Per Article: 9.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/28/2011] [Revised: 02/27/2012] [Accepted: 02/28/2012] [Indexed: 10/28/2022]

New autocorrelation QTMS-based descriptors for use in QSAM of peptides. J Iran Chem Soc 2012. [DOI: 10.1007/s13738-012-0070-y] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/26/2022]

Systematic dissection and optimization of inducible enhancers in human cells using a massively parallel reporter assay. Nat Biotechnol 2012;30:271-7. [PMID: 22371084 PMCID: PMC3297981 DOI: 10.1038/nbt.2137] [Citation(s) in RCA: 509] [Impact Index Per Article: 39.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/23/2011] [Accepted: 01/20/2012] [Indexed: 01/22/2023]

QSAR Study on Insect Neuropeptide Potencies Based on a Novel Set of Parameters of Amino Acids by Using OSC-PLS Method. Int J Pept Res Ther 2011. [DOI: 10.1007/s10989-011-9258-8] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/18/2022]

Ebalunode JO, Jagun C, Zheng W. Informatics approach to the rational design of siRNA libraries. Methods Mol Biol 2011;672:341-58. [PMID: 20838976 DOI: 10.1007/978-1-60761-839-3_14] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 05/13/2023]

Ebalunode JO, Zheng W. Cheminformatics Approach to Gene Silencing: Z Descriptors of Nucleotides and SVM Regression Afford Predictive Models for siRNA Potency. Mol Inform 2010;29:871-81. [PMID: 27464351 DOI: 10.1002/minf.201000091] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/16/2010] [Accepted: 11/07/2010] [Indexed: 01/01/2023]

Abstract

Short interfering RNA mediated gene silencing technology has undergone tremendous development over the past decade, and has found broad applications in both basic biomedical research and pharmaceutical development. Critical to the effective use of this technology is the development of reliable algorithms to predict the potency and selectivity of siRNAs under study. Existing algorithms are mostly built upon sequence information of siRNAs and then employ statistical pattern recognition or machine learning techniques to derive rules or models. However, sequence-based features have limited ability to characterize siRNAs, especially chemically modified ones. In this study, we propose a cheminformatics approach to describe siRNAs. Principal component scores (z1, z2, z3, z4) have been derived for each of the 5 nucleotides (A, U, G, C, T) from the descriptor matrix computed by the MOE program. Descriptors of a given siRNA sequence are simply the concatenation of the z values of its composing nucleotides. Thus, for each of the 2431 siRNA sequences in the Huesken dataset, 76 descriptors were generated for the 19-NT representation, and 84 descriptors were generated for the 21-NT representation of siRNAs. Support Vector Machine regression (SVMR) was employed to develop predictive models. In all cases, the models achieved Pearson correlation coefficient r and R about 0.84 and 0.65 for the training sets and test sets, respectively. A minimum of 25% of the whole dataset was needed to obtain predictive models that could accurately predict 75% of the remaining siRNAs. Thus, for the first time, a cheminformatics approach has been developed to successfully model the structure-potency relationship in siRNA-based gene silencing data, which has laid a solid foundation for quantitative modeling of chemically modified siRNAs.
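
The descriptor construction described above is a plain concatenation: four z-scores per nucleotide, so a 19-nt sequence yields 19 x 4 = 76 descriptors and a 21-nt sequence yields 21 x 4 = 84. The sketch below shows that bookkeeping. The z-score values in the table are hypothetical placeholders (the real scores come from the paper's principal component analysis), and the example sequences and function name are likewise assumptions.

```python
# Hypothetical placeholder scores (z1, z2, z3, z4); NOT the published values.
Z_SCORES = {
    "A": (0.10, -0.20, 0.30, 0.00),
    "U": (-0.40, 0.50, -0.10, 0.20),
    "G": (0.60, 0.10, -0.30, -0.20),
    "C": (-0.50, -0.30, 0.20, 0.10),
    "T": (-0.30, 0.40, -0.20, 0.30),
}

def sirna_descriptors(seq):
    """Concatenate the four z-scores of each nucleotide, in sequence order."""
    return [z for nucleotide in seq for z in Z_SCORES[nucleotide]]

# Made-up example sequences: a 19-nt core and the same core with a TT overhang.
core_19 = "AUGGCUACGUAGCUAGCUA"
full_21 = core_19 + "TT"

d19 = sirna_descriptors(core_19)
d21 = sirna_descriptors(full_21)
print(len(d19), len(d21))
```

Because each position contributes its own block of four values, the encoding is position-specific, and a chemically modified nucleotide could in principle be handled by adding one more row to the table.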

Tian F, Zhang C, Fan X, Yang X, Wang X, Liang H. Predicting the Flexibility Profile of Ribosomal RNAs. Mol Inform 2010;29:707-15. [PMID: 27464014 DOI: 10.1002/minf.201000092] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/17/2010] [Accepted: 09/28/2010] [Indexed: 11/06/2022]

Maertens J, Vanrolleghem PA. Modeling with a view to target identification in metabolic engineering: a critical evaluation of the available tools. Biotechnol Prog 2010;26:313-31. [PMID: 20052739 DOI: 10.1002/btpr.349] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/17/2022]

Gaussian process: an alternative approach for QSAM modeling of peptides. Amino Acids 2009;38:199-212. [DOI: 10.1007/s00726-008-0228-1] [Citation(s) in RCA: 49] [Impact Index Per Article: 3.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/01/2008] [Accepted: 12/18/2008] [Indexed: 10/21/2022]

Liang G, Li Z. Scores of generalized base properties for quantitative sequence-activity modelings for E. coli promoters based on support vector machine. J Mol Graph Model 2007;26:269-81. [PMID: 17291800 DOI: 10.1016/j.jmgm.2006.12.004] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/30/2006] [Revised: 11/18/2006] [Accepted: 12/10/2006] [Indexed: 10/23/2022]

Liao J, Warmuth MK, Govindarajan S, Ness JE, Wang RP, Gustafsson C, Minshull J. Engineering proteinase K using machine learning and synthetic genes. BMC Biotechnol 2007;7:16. [PMID: 17386103 PMCID: PMC1847811 DOI: 10.1186/1472-6750-7-16] [Citation(s) in RCA: 78] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/06/2006] [Accepted: 03/26/2007] [Indexed: 11/10/2022] Open

Abstract

Background

Altering a protein's function by changing its sequence allows natural proteins to be converted into useful molecular tools. Current protein engineering methods are limited by a lack of high throughput physical or computational tests that can accurately predict protein activity under conditions relevant to its final application. Here we describe a new synthetic biology approach to protein engineering that avoids these limitations by combining high throughput gene synthesis with machine learning-based design algorithms.

Results

We selected 24 amino acid substitutions to make in proteinase K from alignments of homologous sequences. We then designed and synthesized 59 specific proteinase K variants containing different combinations of the selected substitutions. The 59 variants were tested for their ability to hydrolyze a tetrapeptide substrate after the enzyme was first heated to 68°C for 5 minutes. Sequence and activity data were analyzed using machine learning algorithms. This analysis was used to design a new set of variants predicted to have increased activity over the training set; these variants were then synthesized and tested. By performing two cycles of machine learning analysis and variant design, we obtained 20-fold improved proteinase K variants while testing a total of only 95 variant enzymes.

Conclusion

The number of protein variants that must be tested to obtain significant functional improvements determines the type of tests that can be performed. Protein engineers wishing to modify the property of a protein to shrink tumours or catalyze chemical reactions under industrial conditions have until now been forced to accept high throughput surrogate screens to measure protein properties that they hope will correlate with the functionalities that they intend to modify. By reducing the number of variants that must be tested to fewer than 100, machine learning algorithms make it possible to use more complex and expensive tests so that only protein properties that are directly relevant to the desired application need to be measured. Protein design algorithms that only require the testing of a small number of variants represent a significant step towards a generic, resource-optimized protein engineering process.

A new descriptor of amino acids based on the three-dimensional vector of atomic interaction field. Chin Sci Bull 2006. [DOI: 10.1007/s11434-006-0524-7] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.7] [Indexed: 10/24/2022]

Mao PL, Liu TF, Kueh K, Wu P. Predicting the efficiency of UAG translational stop signal through studies of physicochemical properties of its composite mono- and dinucleotides. Comput Biol Chem 2005;28:245-56. [PMID: 15548451 DOI: 10.1016/j.compbiolchem.2004.05.003] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Received: 03/04/2004] [Revised: 05/27/2004] [Accepted: 05/29/2004] [Indexed: 12/01/2022]

Minshull J, Govindarajan S, Cox T, Ness JE, Gustafsson C. Engineered protein function by selective amino acid diversification. Methods 2005;32:416-27. [PMID: 15003604 DOI: 10.1016/j.ymeth.2003.10.004] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.6] [Accepted: 10/06/2003] [Indexed: 11/16/2022] Open

Gustafsson C, Govindarajan S, Minshull J. Putting engineering back into protein engineering: bioinformatic approaches to catalyst design. Curr Opin Biotechnol 2003;14:366-70. [PMID: 12943844 DOI: 10.1016/s0958-1669(03)00101-0] [Citation(s) in RCA: 19] [Impact Index Per Article: 0.9] [Indexed: 11/26/2022]

Gustafsson C, Govindarajan S, Emig R. Exploration of sequence space for protein engineering. J Mol Recognit 2001;14:308-14. [PMID: 11746951 DOI: 10.1002/jmr.543] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.5] [Indexed: 11/07/2022]

Ponomarenko JV, Furman DP, Frolov AS, Podkolodny NL, Orlova GV, Ponomarenko MP, Kolchanov NA, Sarai A. ACTIVITY: a database on DNA/RNA sites activity adapted to apply sequence-activity relationships from one system to another. Nucleic Acids Res 2001;29:284-7. [PMID: 11125114 PMCID: PMC29829 DOI: 10.1093/nar/29.1.284] [Citation(s) in RCA: 22] [Impact Index Per Article: 0.9] [Indexed: 11/13/2022] Open

Sandberg M, Eriksson L, Jonsson J, Sjöström M, Wold S. New chemical descriptors relevant for the design of biologically active peptides. A multivariate characterization of 87 amino acids. J Med Chem 1998;41:2481-91. [PMID: 9651153 DOI: 10.1021/jm9700575] [Citation(s) in RCA: 461] [Impact Index Per Article: 17.1] [Indexed: 02/08/2023]

Ponomarenko MP, Kolchanova AN, Kolchanov NA. Generating programs for predicting the activity of functional sites. J Comput Biol 1997;4:83-90. [PMID: 9109039 DOI: 10.1089/cmb.1997.4.83] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.3] [Indexed: 02/04/2023] Open

Wieslander A, Rilfors L, Dahlqvist A, Jonsson J, Hellberg S, Rännar S, Sjöström M, Lindblom G. Similar regulatory mechanisms despite differences in membrane lipid composition in Acholeplasma laidlawii strains A-EF22 and B-PG9. A multivariate data analysis. Biochim Biophys Acta 1994;1191:331-42. [PMID: 8172919 DOI: 10.1016/0005-2736(94)90184-8] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.5] [Indexed: 01/29/2023]