Reference Citation Analysis

For: Golinski AW, Mischler KM, Laxminarayan S, Neurock NL, Fossing M, Pichman H, Martiniani S, Hackel BJ. High-throughput developability assays enable library-scale identification of producible protein scaffold variants. Proc Natl Acad Sci U S A 2021;118:e2026658118. [PMID: 34078670 PMCID: PMC8201827 DOI: 10.1073/pnas.2026658118] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Indexed: 02/05/2023] [Open Access]

Cited by Other Article(s)

Blanchard PL, Knick BJ, Whelan SA, Hackel BJ. Hyperstable Synthetic Mini-Proteins as Effective Ligand Scaffolds. ACS Synth Biol 2023;12:3608-3622. [PMID: 38010428 PMCID: PMC10822706 DOI: 10.1021/acssynbio.3c00409] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2023]

McConnell A, Batten SL, Hackel BJ. Determinants of Developability and Evolvability of Synthetic Miniproteins as Ligand Scaffolds. J Mol Biol 2023;435:168339. [PMID: 37923119 PMCID: PMC10872777 DOI: 10.1016/j.jmb.2023.168339] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/21/2023] [Revised: 10/23/2023] [Accepted: 10/28/2023] [Indexed: 11/07/2023]

Abstract

Binding ligands empower molecular therapeutics and diagnostics. Despite an array of protein scaffolds engineered for binding, the biophysical elements that drive developability and evolvability are not fully understood. In particular, engineering novel function while maintaining biophysical integrity within the context of small, single-domain proteins is challenged by integration of the structural framework and the evolved binding site. Miniproteins present a challenge to our limits of protein engineering capability and provide advantages in physiological targeting, modularity for multi-functional constructs, and unique binding modes. Herein, we evaluate the ability of hyperstable synthetic miniproteins, originally designed for foldedness, to function as binding scaffolds. We synthesized 45 combinatorial libraries, with 10^9 variants, systematically varied across two topologies, each with five starting frameworks and four or five diverse, structurally distinct paratopes, to elucidate their impact on evolvability and developability. We evaluated evolvability with yeast display binding selections against four targets. High-throughput assays (stability via yeast display and soluble expression via split-GFP in E. coli) measured developability. The comprehensive, robust dataset demonstrates how protein topology, parental framework, and paratope structure and location all impact scaffold performance. A hyperstable framework and localized diversity are not sufficient for an effective scaffold, but several designs of these elements within synthetic miniproteins designed solely for stability result in scaffold libraries with effective evolvability and developability. Engineered variants were well-folded, thermally stable, and bound target with single-digit nanomolar affinity. Thus, hyperstable synthetic miniproteins can serve as precursors to developable, evolvable mini-scaffolds with unique potential for physiological transport, modularity, and binding modes.

Mardikoraem M, Wang Z, Pascual N, Woldring D. Generative models for protein sequence modeling: recent advances and future directions. Brief Bioinform 2023;24:bbad358. [PMID: 37864295 PMCID: PMC10589401 DOI: 10.1093/bib/bbad358] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/13/2023] [Revised: 09/08/2023] [Accepted: 09/12/2023] [Indexed: 10/22/2023] [Open Access]

Zhang C, Wu X, Song F, Liu S, Yu S, Zhou J. Core-Shell Droplet-Based Microfluidic Screening System for Filamentous Fungi. ACS Sens 2023;8:3468-3477. [PMID: 37603446 DOI: 10.1021/acssensors.3c01018] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/23/2023]

Affiliation(s)

Changtai Zhang: Science Center for Future Foods, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu 214122, China; School of Biotechnology and Key Laboratory of Industrial Biotechnology, Ministry of Education, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu 214122, China
Xiaohui Wu: Science Center for Future Foods, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu 214122, China; School of Biotechnology and Key Laboratory of Industrial Biotechnology, Ministry of Education, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu 214122, China
Fuqiang Song: Science Center for Future Foods, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu 214122, China; School of Biotechnology and Key Laboratory of Industrial Biotechnology, Ministry of Education, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu 214122, China
Song Liu: Science Center for Future Foods, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu 214122, China; School of Biotechnology and Key Laboratory of Industrial Biotechnology, Ministry of Education, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu 214122, China
Shiqin Yu: Science Center for Future Foods, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu 214122, China; School of Biotechnology and Key Laboratory of Industrial Biotechnology, Ministry of Education, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu 214122, China; The Key Laboratory of Carbohydrate Chemistry and Biotechnology, Ministry of Education, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu 214122, China; National Engineering Laboratory for Cereal Fermentation Technology, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu 214122, China
Jingwen Zhou: Science Center for Future Foods, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu 214122, China; School of Biotechnology and Key Laboratory of Industrial Biotechnology, Ministry of Education, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu 214122, China; The Key Laboratory of Carbohydrate Chemistry and Biotechnology, Ministry of Education, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu 214122, China; National Engineering Laboratory for Cereal Fermentation Technology, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu 214122, China

Golinski AW, Schmitz ZD, Nielsen GH, Johnson B, Saha D, Appiah S, Hackel BJ, Martiniani S. Predicting and Interpreting Protein Developability Via Transfer of Convolutional Sequence Representation. ACS Synth Biol 2023;12:2600-2615. [PMID: 37642646 PMCID: PMC10829850 DOI: 10.1021/acssynbio.3c00196] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/31/2023]

Abstract

Engineered proteins have emerged as novel diagnostics, therapeutics, and catalysts. Often, poor protein developability (quantified by expression, solubility, and stability) hinders utility. The ability to predict protein developability from amino acid sequence would reduce the experimental burden when selecting candidates. Recent advances in screening technologies enabled a high-throughput (HT) developability dataset for 10^5 of 10^20 possible variants of protein ligand scaffold Gp2. In this work, we evaluate the ability of neural networks to learn a developability representation from an HT dataset and transfer this knowledge to predict recombinant expression beyond observed sequences. The model convolves learned amino acid properties to predict expression levels 44% closer to the experimental variance compared to a non-embedded control. Analysis of learned amino acid embeddings highlights the uniqueness of cysteine, the importance of hydrophobicity and charge, and the unimportance of aromaticity, when aiming to improve the developability of small proteins. We identify clusters of similar sequences with increased recombinant expression through nonlinear dimensionality reduction and we explore the inferred expression landscape via nested sampling. The analysis enables the first direct visualization of the fitness landscape and highlights the existence of evolutionary bottlenecks in sequence space giving rise to competing subpopulations of sequences with different developability. The work advances applied protein engineering efforts by predicting and interpreting protein scaffold expression from a limited dataset. Furthermore, our statistical mechanical treatment of the problem advances foundational efforts to characterize the structure of the protein fitness landscape and the amino acid characteristics that influence protein developability.

McConnell A, Hackel BJ. Protein engineering via sequence-performance mapping. Cell Syst 2023;14:656-666. [PMID: 37494931 PMCID: PMC10527434 DOI: 10.1016/j.cels.2023.06.009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/27/2023] [Revised: 05/10/2023] [Accepted: 06/21/2023] [Indexed: 07/28/2023]

Mardikoraem M, Woldring D. Protein Fitness Prediction Is Impacted by the Interplay of Language Models, Ensemble Learning, and Sampling Methods. Pharmaceutics 2023;15:1337. [PMID: 37242577 PMCID: PMC10224321 DOI: 10.3390/pharmaceutics15051337] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/24/2023] [Revised: 04/19/2023] [Accepted: 04/21/2023] [Indexed: 05/28/2023] [Open Access]

Abstract

Advances in machine learning (ML) and the availability of protein sequences via high-throughput sequencing techniques have transformed the ability to design novel diagnostic and therapeutic proteins. ML allows protein engineers to capture complex trends hidden within protein sequences that would otherwise be difficult to identify in the context of the immense and rugged protein fitness landscape. Despite this potential, there persists a need for guidance during the training and evaluation of ML methods over sequencing data. Two key challenges for training discriminative models and evaluating their performance include handling severely imbalanced datasets (e.g., few high-fitness proteins among an abundance of non-functional proteins) and selecting appropriate protein sequence representations (numerical encodings). Here, we present a framework for applying ML over assay-labeled datasets to elucidate the capacity of sampling techniques and protein encoding methods to improve binding affinity and thermal stability prediction tasks. For protein sequence representations, we incorporate two widely used methods (One-Hot encoding and physiochemical encoding) and two language-based methods (next-token prediction, UniRep; masked-token prediction, ESM). Elaboration on performance is provided over protein fitness, protein size, and sampling techniques. In addition, an ensemble of protein representation methods is generated to discover the contribution of distinct representations and improve the final prediction score. We then implement multiple criteria decision analysis (MCDA; TOPSIS with entropy weighting), using multiple metrics well-suited for imbalanced data, to ensure statistical rigor in ranking our methods. Within the context of these datasets, the synthetic minority oversampling technique (SMOTE) outperformed undersampling while encoding sequences with One-Hot, UniRep, and ESM representations. Moreover, ensemble learning increased the predictive performance of the affinity-based dataset by 4% compared to the best single-encoding candidate (F1-score = 97%), while ESM alone was rigorous enough in stability prediction (F1-score = 92%).

Lopez-Morales J, Vanella R, Kovacevic G, Santos MS, Nash MA. Titrating Avidity of Yeast-Displayed Proteins Using a Transcriptional Regulator. ACS Synth Biol 2023;12:419-431. [PMID: 36728831 PMCID: PMC9942200 DOI: 10.1021/acssynbio.2c00351] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/03/2023]

Tresnak DT, Hackel BJ. Deep Antimicrobial Activity and Stability Analysis Inform Lysin Sequence-Function Mapping. ACS Synth Biol 2023;12:249-264. [PMID: 36599162 PMCID: PMC10822705 DOI: 10.1021/acssynbio.2c00509] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/05/2023]

Norrild RK, Johansson KE, O’Shea C, Morth JP, Lindorff-Larsen K, Winther JR. Increasing protein stability by inferring substitution effects from high-throughput experiments. Cell Rep Methods 2022;2:100333. [PMID: 36452862 PMCID: PMC9701609 DOI: 10.1016/j.crmeth.2022.100333] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/01/2022] [Revised: 06/22/2022] [Accepted: 10/19/2022] [Indexed: 06/17/2023]

Ahmed S, Manjunath K, Chattopadhyay G, Varadarajan R. Identification of stabilizing point mutations through mutagenesis of destabilized protein libraries. J Biol Chem 2022;298:101785. [PMID: 35247389 PMCID: PMC8971944 DOI: 10.1016/j.jbc.2022.101785] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 10/08/2021] [Revised: 02/18/2022] [Accepted: 02/26/2022] [Indexed: 01/22/2023] [Open Access]

McLure RJ, Radford SE, Brockwell DJ. High-throughput directed evolution: a golden era for protein science. Trends Chem 2022. [DOI: 10.1016/j.trechm.2022.02.004] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 01/06/2023]

Mardikoraem M, Woldring D. Machine Learning-driven Protein Library Design: A Path Toward Smarter Libraries. Methods Mol Biol 2022;2491:87-104. [PMID: 35482186 DOI: 10.1007/978-1-0716-2285-8_5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]

DeJong MP, Ritter SC, Fransen KA, Tresnak DT, Golinski AW, Hackel BJ. A Platform for Deep Sequence-Activity Mapping and Engineering Antimicrobial Peptides. ACS Synth Biol 2021;10:2689-2704. [PMID: 34506711 DOI: 10.1021/acssynbio.1c00314] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 12/29/2022]