1
Zou Y, Zheng P, Chen P, Yu X, Wu D. Multidimensional computational strategies enhance the thermostability of alpha-galactosidase. Int J Biol Macromol 2025; 314:144316. [PMID: 40388995 DOI: 10.1016/j.ijbiomac.2025.144316] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/27/2025] [Revised: 05/12/2025] [Accepted: 05/15/2025] [Indexed: 05/21/2025]
Abstract
α-Galactosidase has significant industrial value in food processing, animal nutrition, and medical applications. Microbial α-galactosidases dominate industrial use because of their high productivity, yet their inherent thermal instability necessitates systematic protein engineering. In this study, we established a dual-strategy protein engineering framework to enhance the thermostability of Aspergillus tubingensis α-galactosidase (AtWU_04653). Strategy I employed integrative computational design tools (ABACUS2/PROSS/DBD2) for mutational library construction and yielded the dominant mutant A169P, which showed a 78.52 % longer thermal half-life at 55 °C (pH 4.0) and a 52.04 % increase in catalytic efficiency (kcat/Km). Strategy II implemented a physics-based computational methodology combining GROMACS molecular dynamics simulations with Rosetta unfolding free energy calculations and SPIRED machine learning predictions, deriving three stabilized variants (E429I, N380L, T64P) with half-lives extended by 57.33 %, 67.17 %, and 41.34 %, respectively. Notably, E429I and T64P also showed 85.25 % and 65.90 % higher catalytic efficiency (kcat/Km), respectively. Both strategies substantially reduced the experimental screening workload while enabling simultaneous optimization of thermostability and activity. This study combines sequence conservation analysis, unfolding free energy calculation, molecular dynamics simulation, and protein prediction models into multidimensional computational strategies for designing mutants, providing technical references for the computational design and functional optimization of enzymes.
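The half-life gains quoted in this abstract (for example 78.52 % for A169P) are relative changes between mutant and wild type. The following is a minimal sketch of how such a figure could be computed, assuming first-order thermal inactivation, A(t) = A0·exp(−k_d·t); the abstract does not state the kinetic model, and the function names and sample measurements are hypothetical.

```python
import math


def inactivation_rate(residual_fraction, minutes):
    """Rate constant k_d (1/min) from one residual-activity measurement.

    Assumes first-order decay: residual_fraction = exp(-k_d * minutes).
    """
    if not 0.0 < residual_fraction < 1.0 or minutes <= 0:
        raise ValueError("need 0 < residual_fraction < 1 and minutes > 0")
    return -math.log(residual_fraction) / minutes


def half_life(k_d):
    """Half-life (min) under first-order decay: t_1/2 = ln(2) / k_d."""
    return math.log(2) / k_d


def percent_gain(wild_type, mutant):
    """Relative improvement of the mutant over the wild type, in percent."""
    return (mutant - wild_type) / wild_type * 100.0


# Hypothetical measurements: fraction of activity left after heating.
t_wt = half_life(inactivation_rate(0.50, 30.0))   # wild type
t_mut = half_life(inactivation_rate(0.50, 60.0))  # illustrative mutant
print(f"half-life gain: {percent_gain(t_wt, t_mut):.2f} %")
```

In practice k_d is usually fitted from several time points rather than a single one; the single-point form is used here only to keep the sketch short.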
Affiliation(s)
- Youfeng Zou
- Key Laboratory of Industrial Biotechnology, Ministry of Education, School of Biotechnology, Jiangnan University, Wuxi 214122, China
- Pu Zheng
- Key Laboratory of Industrial Biotechnology, Ministry of Education, School of Biotechnology, Jiangnan University, Wuxi 214122, China
- Pengcheng Chen
- Key Laboratory of Industrial Biotechnology, Ministry of Education, School of Biotechnology, Jiangnan University, Wuxi 214122, China
- Xiaowei Yu
- Key Laboratory of Industrial Biotechnology, Ministry of Education, School of Biotechnology, Jiangnan University, Wuxi 214122, China
- Dan Wu
- Key Laboratory of Industrial Biotechnology, Ministry of Education, School of Biotechnology, Jiangnan University, Wuxi 214122, China.
2
Kendrick BS, Sampathkumar K, Gabrielson JP, Ren D. Analytical control strategy for biologics. Part I: Foundations. J Pharm Sci 2025; 114:103826. [PMID: 40354897 DOI: 10.1016/j.xphs.2025.103826] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/25/2025] [Revised: 05/06/2025] [Accepted: 05/06/2025] [Indexed: 05/14/2025]
Abstract
Biologic therapeutics encompass different modalities with vastly different molecular profiles. Despite these differences, all products follow a similar approach to Pharmaceutical Development, which includes an integrated control strategy that relies on a clinical target product profile (TPP), a quality target product profile (QTPP), biophysical, biochemical and biological characterization, elucidation of critical quality attributes (CQAs), and development of an analytical control strategy. Technical and regulatory requirements for biologics development are established in numerous regulatory guidance documents issued by ICH, FDA, EMA, and other bodies. While there is substantial published knowledge on specific studies needed for development of a product, there is no specific guidance on establishing a comprehensive analytical control strategy as part of a modern integrated control strategy. This commentary is Part I of a two-part commentary series on analytical control strategy. In this part we present the foundations that are essential for developing an analytical control strategy to enable efficient lifecycle management across different biologic protein-based therapeutic modalities. In Part II, we will present a stage-appropriate roadmap to implementing an analytical control strategy from discovery research through the commercial life of the biologic.
Affiliation(s)
- Krishnan Sampathkumar
- SSK Biosolutions LLC, North Potomac, MD, 20878, USA; Currently at Invetx, Inc., By Dechra, Natick, MA, 01760, USA
- Da Ren
- BioTherapeutics Solutions, Westlake Village, CA, 91361, USA
3
Sun J, Zhu T, Cui Y, Wu B. Structure-based self-supervised learning enables ultrafast protein stability prediction upon mutation. Innovation (N Y) 2025; 6:100750. [PMID: 39872490 PMCID: PMC11763918 DOI: 10.1016/j.xinn.2024.100750] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/04/2024] [Accepted: 12/02/2024] [Indexed: 01/30/2025] Open
Abstract
Predicting free energy changes (ΔΔG) is essential for enhancing our understanding of protein evolution and plays a pivotal role in protein engineering and pharmaceutical development. While traditional methods offer valuable insights, they are often constrained by computational speed and reliance on biased training datasets. These constraints become particularly evident when aiming for accurate ΔΔG predictions across a diverse array of protein sequences. Herein, we introduce Pythia, a self-supervised graph neural network specifically designed for zero-shot ΔΔG predictions. Our comparative benchmarks demonstrate that Pythia outperforms other self-supervised pretraining models and force field-based approaches while also exhibiting competitive performance with fully supervised models. Notably, Pythia shows strong correlations and achieves a remarkable increase in computational speed of up to 10⁵-fold. We further validated Pythia's performance in predicting the thermostabilizing mutations of limonene epoxide hydrolase, leading to higher experimental success rates. This exceptional efficiency has enabled us to explore 26 million high-quality protein structures, marking a significant advancement in our ability to navigate the protein sequence space and enhance our understanding of the relationships between protein genotype and phenotype. In addition, we established a web server at https://pythia.wulab.xyz to allow users to easily perform such predictions.
Affiliation(s)
- Jinyuan Sun
- AIM Center, College of Life Sciences and Technology, Beijing University of Chemical Technology, Institute of Microbiology, Chinese Academy of Sciences, Beijing, China
- University of Chinese Academy of Sciences, Beijing, China
- Tong Zhu
- AIM Center, College of Life Sciences and Technology, Beijing University of Chemical Technology, Institute of Microbiology, Chinese Academy of Sciences, Beijing, China
- University of Chinese Academy of Sciences, Beijing, China
- Yinglu Cui
- AIM Center, College of Life Sciences and Technology, Beijing University of Chemical Technology, Institute of Microbiology, Chinese Academy of Sciences, Beijing, China
- Bian Wu
- AIM Center, College of Life Sciences and Technology, Beijing University of Chemical Technology, Institute of Microbiology, Chinese Academy of Sciences, Beijing, China
4
Li N, Liu X, Bian C, Ren C, Hu Q, Yang Z, Xiao L, Guan T. Biomimetic androgen receptor-based AIE biosensor for detecting bisphenol analogues: An integrating in silico topological analysis, molecular docking, and experimental validation study. Talanta 2025; 281:126827. [PMID: 39245003 DOI: 10.1016/j.talanta.2024.126827] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/21/2024] [Revised: 08/15/2024] [Accepted: 09/05/2024] [Indexed: 09/10/2024]
Abstract
Bisphenol analogues are a typical class of endocrine disrupting chemicals (EDCs) that interfere with the binding of endogenous hormones to the androgen receptor (AR). With the expansion of industrial activities and the intensification of environmental pollution, an increasing array of bisphenol analogues is being released into the environment and food chain. This highlights the urgency of developing sensitive methods for the detection of bisphenol analogues. Here, we propose a biomimetic AR-based biosensor platform for detecting bisphenol analogues (BPF, TBBPA, and TBBPS) by binding with Aggregation-Induced Emission (AIE) probes. Following a comparison of the PROSS and ABACUS methods, biomimetic AR was designed using the ABACUS approach and subsequently expressed in vitro via the E. coli expression system. Through molecular docking and the observation of fluorescence changes upon binding with biomimetic AR, BS-46006 was selected as the AIE probe for the biosensor. The biomimetic AR-based biosensor showed sensitive detection of BPF, TBBPA, and TBBPS within a range of 0-50 mM. To further elucidate the multi-residue recognition mechanism, molecular orbitals, the Electron Localization Function (ELF), and the Localized Orbital Locator (LOL) were systematically calculated in this study. The lowest unoccupied and highest occupied molecular orbitals gave energy gaps for BPF, TBBPA, and TBBPS of 0.12812, 0.19689, and 0.18711 eV, respectively. ELF and LOL heat maps offered a clearer visual representation of the electron delocalization in BPF, TBBPA, and TBBPS. The matrix effect analysis suggested that the responses of bisphenol analogues in soil matrices could be effectively mitigated through sample pretreatment. The analysis of spiked soil samples showed acceptable recoveries ranging from 91 % to 105 %. Additionally, the biomimetic AR-based AIE biosensor, which combines multi-residue detection with Tolerable Daily Intakes, shows great promise for the risk assessment of bisphenol analogues. This research may present a viable approach for the analysis of environmental pollutants.
Affiliation(s)
- Ning Li
- School of Food Science and Engineering, Yangzhou University, Yangzhou, 225127, China
- Xiaoxiao Liu
- School of Food Science and Engineering, Yangzhou University, Yangzhou, 225127, China
- Canfeng Bian
- School of Food Science and Engineering, Yangzhou University, Yangzhou, 225127, China
- Chenxi Ren
- School of Food Science and Engineering, Yangzhou University, Yangzhou, 225127, China
- Qin Hu
- School of Food Science and Engineering, Yangzhou University, Yangzhou, 225127, China
- Zhenquan Yang
- School of Food Science and Engineering, Yangzhou University, Yangzhou, 225127, China
- Lixia Xiao
- School of Food Science and Engineering, Yangzhou University, Yangzhou, 225127, China.
- Tianzhu Guan
- School of Food Science and Engineering, Yangzhou University, Yangzhou, 225127, China.
5
Chen Z, Ji M, Qian J, Zhang Z, Zhang X, Gao H, Wang H, Wang R, Qi Y. ProBID-Net: a deep learning model for protein-protein binding interface design. Chem Sci 2024; 15:19977-19990. [PMID: 39568891 PMCID: PMC11575592 DOI: 10.1039/d4sc02233e] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/04/2024] [Accepted: 10/11/2024] [Indexed: 11/22/2024] Open
Abstract
Protein-protein interactions are pivotal in numerous biological processes. The computational design of these interactions facilitates the creation of novel binding proteins, crucial for advancing biopharmaceutical products. With the evolution of artificial intelligence (AI), protein design tools have swiftly transitioned from scoring-function-based to AI-based models. However, many AI models for protein design are constrained by assuming complete unfamiliarity with the amino acid sequence of the input protein, a feature most suited for de novo design but posing challenges in designing protein-protein interactions when the receptor sequence is known. To bridge this gap in computational protein design, we introduce ProBID-Net. Trained using natural protein-protein complex structures and protein domain-domain interface structures, ProBID-Net can discern features from known target protein structures to design specific binding proteins based on their binding sites. In independent tests, ProBID-Net achieved interface sequence recovery rates of 52.7%, 43.9%, and 37.6%, surpassing or being on par with ProteinMPNN in binding protein design. Validated using AlphaFold-Multimer, the sequences designed by ProBID-Net demonstrated a close correspondence between the design target and the predicted structure. Moreover, the model's output can predict changes in binding affinity upon mutations in protein complexes, even in scenarios where no data on such mutations were provided during training (zero-shot prediction). In summary, the ProBID-Net model is poised to significantly advance the design of protein-protein interactions.
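The interface sequence recovery rates reported above measure how often a designed residue matches the native residue at interface positions. The sketch below illustrates that metric under a simple assumption: both sequences are pre-aligned strings of equal length and the interface is given as a list of 0-based indices. The function name and the toy sequences are hypothetical and are not taken from ProBID-Net.

```python
def sequence_recovery(native, designed, positions=None):
    """Fraction of positions where the designed residue equals the native one.

    positions: optional iterable of 0-based indices (e.g. interface residues);
    when omitted, every position is scored.
    """
    if len(native) != len(designed):
        raise ValueError("sequences must be aligned and of equal length")
    indices = list(range(len(native))) if positions is None else list(positions)
    if not indices:
        raise ValueError("no positions to score")
    matches = sum(1 for i in indices if native[i] == designed[i])
    return matches / len(indices)


# Toy example: two substitutions, both inside the assumed interface.
native_seq = "ACDEFGHIKL"
designed_seq = "ACDQFGHTKL"
interface = [3, 4, 5, 6, 7]  # hypothetical interface indices

print(f"interface recovery: {sequence_recovery(native_seq, designed_seq, interface):.1%}")
print(f"overall recovery:   {sequence_recovery(native_seq, designed_seq):.1%}")
```

Restricting the score to interface positions is what distinguishes this figure from whole-sequence recovery, which is typically higher because core and surface residues away from the interface are easier to recover.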
Affiliation(s)
- Zhihang Chen
- Department of Medicinal Chemistry, School of Pharmacy, Fudan University 826 Zhangheng Road Shanghai 201203 People's Republic of China
- Menglin Ji
- Department of Medicinal Chemistry, School of Pharmacy, Fudan University 826 Zhangheng Road Shanghai 201203 People's Republic of China
- Jie Qian
- Department of Medicinal Chemistry, School of Pharmacy, Fudan University 826 Zhangheng Road Shanghai 201203 People's Republic of China
- Zhe Zhang
- Department of Medicinal Chemistry, School of Pharmacy, Fudan University 826 Zhangheng Road Shanghai 201203 People's Republic of China
- Xiangying Zhang
- Department of Medicinal Chemistry, School of Pharmacy, Fudan University 826 Zhangheng Road Shanghai 201203 People's Republic of China
- Haotian Gao
- Department of Medicinal Chemistry, School of Pharmacy, Fudan University 826 Zhangheng Road Shanghai 201203 People's Republic of China
- Haojie Wang
- Department of Medicinal Chemistry, School of Pharmacy, Fudan University 826 Zhangheng Road Shanghai 201203 People's Republic of China
- Renxiao Wang
- Department of Medicinal Chemistry, School of Pharmacy, Fudan University 826 Zhangheng Road Shanghai 201203 People's Republic of China
- Yifei Qi
- Department of Medicinal Chemistry, School of Pharmacy, Fudan University 826 Zhangheng Road Shanghai 201203 People's Republic of China
6
Yan X, He Q, Geng B, Yang S. Microbial Cell Factories in the Bioeconomy Era: From Discovery to Creation. BioDesign Research 2024; 6:0052. [PMID: 39434802 PMCID: PMC11491672 DOI: 10.34133/bdr.0052] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/30/2024] [Revised: 09/02/2024] [Accepted: 09/18/2024] [Indexed: 10/23/2024] Open
Abstract
Microbial cell factories (MCFs) are extensively used to produce a wide array of bioproducts, such as bioenergy, biochemicals, food, nutrients, and pharmaceuticals, and have been regarded as the "chips" of biomanufacturing that will fuel the emerging bioeconomy era. Biotechnology advances have led to the screening, investigation, and engineering of an increasing number of microorganisms as diverse MCFs, which are the workhorses of biomanufacturing and help develop the bioeconomy. This review briefly summarizes the progress and strategies in the development of robust and efficient MCFs for sustainable and economic biomanufacturing. First, a comprehensive understanding of microbial chassis cells is needed, including accurate genome sequences and corresponding annotations; metabolic and regulatory networks governing substances, energy, physiology, and information; and their similarity and uniqueness compared with those of other microorganisms. Moreover, the development and application of effective and efficient tools is crucial for engineering both model and nonmodel microbial chassis cells into efficient MCFs, including the identification and characterization of biological parts, as well as the design, synthesis, assembly, editing, and regulation of genes, circuits, and pathways. This review also highlights the necessity of integrating automation and artificial intelligence (AI) with biotechnology to facilitate the development of future customized artificial synthetic MCFs and to expedite the industrialization of biomanufacturing and the bioeconomy.
Affiliation(s)
- Binan Geng
- State Key Laboratory of Biocatalysis and Enzyme Engineering, and School of Life Sciences, Hubei University, Wuhan 430062, China
- Shihui Yang
- State Key Laboratory of Biocatalysis and Enzyme Engineering, and School of Life Sciences, Hubei University, Wuhan 430062, China
7
Wang L, Meng J, Yu X, Wang J, Zhang Y, Zhang M, Zhang Y, Wang H, Feng H, Tian Q, Zhang L, Liu H. Construction of highly active and stable recombinant nattokinase by engineered bacteria and computational design. Arch Biochem Biophys 2024; 760:110126. [PMID: 39154817 DOI: 10.1016/j.abb.2024.110126] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/30/2024] [Revised: 08/13/2024] [Accepted: 08/15/2024] [Indexed: 08/20/2024]
Abstract
Nattokinase (NK) is an enzyme that has been recognized as a potential new thrombolytic drug because of its strong thrombolytic activity. However, it is difficult to maintain NK activity in the high-temperature environment of industrial production. In this study, we constructed six NK mutants with potential for higher thermostability using a rational protein engineering strategy integrating free energy-based methods and molecular dynamics (MD) simulation. Wild-type NK and the NK mutants were then expressed in Escherichia coli (E. coli), and their thermostability and thrombolytic activity were tested. The results showed that, compared with wild-type NK, the mutants Y256P, Q206L and E156F all had improved thermostability. The optimal mutant Y256P showed a higher melting temperature (Tm) of 77.4 °C, an increase of 4 °C in maximum heat-resistant temperature and an increase of 51.8 % in activity at 37 °C compared with wild-type NK. We also explored the mechanism of the increased thermostability of these mutants by analysing the MD trajectories at different simulation temperatures.
Affiliation(s)
- Lianxin Wang
- School of Life Science, Liaoning University, Shenyang, 110036, China
- Jinhui Meng
- School of Life Science, Liaoning University, Shenyang, 110036, China
- Xiaomiao Yu
- School of Life Science, Liaoning University, Shenyang, 110036, China
- Jie Wang
- School of Pharmacy, Liaoning University, Shenyang, 110036, China
- Yuying Zhang
- School of Pharmacy, Liaoning University, Shenyang, 110036, China
- Man Zhang
- School of Life Science, Liaoning University, Shenyang, 110036, China
- Yuxi Zhang
- School of Life Science, Liaoning University, Shenyang, 110036, China
- Hengyi Wang
- School of Pharmacy, Liaoning University, Shenyang, 110036, China
- Huawei Feng
- Liaoning Provincial Key Laboratory of Computational Simulation and Information Processing of Biomacromolecules, Shenyang, 110036, China; Engineering Laboratory for Molecular Simulation and Designing of Drug Molecules of Liaoning, Shenyang, 110036, China; School of Pharmacy, Liaoning University, Shenyang, 110036, China
- Qifeng Tian
- School of Life Science, Liaoning University, Shenyang, 110036, China
- Li Zhang
- School of Life Science, Liaoning University, Shenyang, 110036, China; Liaoning Provincial Key Laboratory of Computational Simulation and Information Processing of Biomacromolecules, Shenyang, 110036, China; Engineering Laboratory for Molecular Simulation and Designing of Drug Molecules of Liaoning, Shenyang, 110036, China.
- Hongsheng Liu
- Liaoning Provincial Key Laboratory of Computational Simulation and Information Processing of Biomacromolecules, Shenyang, 110036, China; Engineering Laboratory for Molecular Simulation and Designing of Drug Molecules of Liaoning, Shenyang, 110036, China; School of Pharmacy, Liaoning University, Shenyang, 110036, China.
8
Hu X, Xu Y, Yi J, Wang C, Zhu Z, Yue T, Zhang H, Wang X, Wu F, Xue L, Bai L, Liu H, Chen Q. Using Protein Design and Directed Evolution to Monomerize a Bright Near-Infrared Fluorescent Protein. ACS Synth Biol 2024; 13:1177-1190. [PMID: 38552148 DOI: 10.1021/acssynbio.3c00643] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/20/2024]
Abstract
The small ultrared fluorescent protein (smURFP) is a bright near-infrared (NIR) fluorescent protein (FP) that forms a dimer and binds its fluorescence chromophore, biliverdin, at its dimer interface. To engineer a monomeric NIR FP based on smURFP potentially more suitable for bioimaging, we employed protein design to extend the protein backbone with a new segment of two helices that shield the original dimer interface while covering the biliverdin binding pocket in place of the second chain in the original dimer. We experimentally characterized 13 designs and obtained a monomeric protein with a weak fluorescence. We enhanced the fluorescence of this designed protein through two rounds of directed evolution and obtained designed monomeric smURFP (DMsmURFP), a bright, stable, and monomeric NIR FP with a molecular weight of 19.6 kDa. We determined the crystal structures of DMsmURFP both in the apo state and in complex with biliverdin, which confirmed the designed structure. The use of DMsmURFP in in vivo imaging of mammalian systems was demonstrated. The backbone design-based strategy used here can also be applied to monomerize other naturally multimeric proteins with intersubunit functional sites.
Affiliation(s)
- Xiuhong Hu
- Department of Rheumatology and Immunology, The First Affiliated Hospital of USTC, Center for Advanced Interdisciplinary Science and Biomedicine of IHM, Hefei National Center for Interdisciplinary Sciences at the Microscale, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui 230027, China
- MOE Key Laboratory for Membraneless Organelles and Cellular Dynamics, School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui 230027, China
- Yang Xu
- Department of Rheumatology and Immunology, The First Affiliated Hospital of USTC, Center for Advanced Interdisciplinary Science and Biomedicine of IHM, Hefei National Center for Interdisciplinary Sciences at the Microscale, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui 230027, China
- MOE Key Laboratory for Membraneless Organelles and Cellular Dynamics, School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui 230027, China
- Junxi Yi
- Department of Rheumatology and Immunology, The First Affiliated Hospital of USTC, Center for Advanced Interdisciplinary Science and Biomedicine of IHM, Hefei National Center for Interdisciplinary Sciences at the Microscale, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui 230027, China
- School of Chemistry and Materials Science, University of Science and Technology of China, Hefei, Anhui 230026, China
- Chenchen Wang
- MOE Key Laboratory for Membraneless Organelles and Cellular Dynamics, School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui 230027, China
- Zhongliang Zhu
- MOE Key Laboratory for Membraneless Organelles and Cellular Dynamics, School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui 230027, China
- Ting Yue
- MOE Key Laboratory for Membraneless Organelles and Cellular Dynamics, School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui 230027, China
- Haiyan Zhang
- MOE Key Laboratory for Membraneless Organelles and Cellular Dynamics, School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui 230027, China
- Xinyu Wang
- Department of Rheumatology and Immunology, The First Affiliated Hospital of USTC, Center for Advanced Interdisciplinary Science and Biomedicine of IHM, Hefei National Center for Interdisciplinary Sciences at the Microscale, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui 230027, China
- MOE Key Laboratory for Membraneless Organelles and Cellular Dynamics, School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui 230027, China
- Fan Wu
- Department of Rheumatology and Immunology, The First Affiliated Hospital of USTC, Center for Advanced Interdisciplinary Science and Biomedicine of IHM, Hefei National Center for Interdisciplinary Sciences at the Microscale, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui 230027, China
- MOE Key Laboratory for Membraneless Organelles and Cellular Dynamics, School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui 230027, China
- Lin Xue
- MOE Key Laboratory for Membraneless Organelles and Cellular Dynamics, School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui 230027, China
- Biomedical Sciences and Health Laboratory of Anhui Province, University of Science and Technology of China, Hefei, Anhui 230027, China
- Li Bai
- MOE Key Laboratory for Membraneless Organelles and Cellular Dynamics, School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui 230027, China
- Biomedical Sciences and Health Laboratory of Anhui Province, University of Science and Technology of China, Hefei, Anhui 230027, China
- Haiyan Liu
- MOE Key Laboratory for Membraneless Organelles and Cellular Dynamics, School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui 230027, China
- Biomedical Sciences and Health Laboratory of Anhui Province, University of Science and Technology of China, Hefei, Anhui 230027, China
- School of Data Science, University of Science and Technology of China, Hefei, Anhui 230027, China
- Quan Chen
- Department of Rheumatology and Immunology, The First Affiliated Hospital of USTC, Center for Advanced Interdisciplinary Science and Biomedicine of IHM, Hefei National Center for Interdisciplinary Sciences at the Microscale, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui 230027, China
- MOE Key Laboratory for Membraneless Organelles and Cellular Dynamics, School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui 230027, China
- Biomedical Sciences and Health Laboratory of Anhui Province, University of Science and Technology of China, Hefei, Anhui 230027, China
9
Xu Y, Hu X, Wang C, Liu Y, Chen Q, Liu H. De novo design of cavity-containing proteins with a backbone-centered neural network energy function. Structure 2024; 32:424-432.e4. [PMID: 38325370 DOI: 10.1016/j.str.2024.01.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/16/2023] [Revised: 10/04/2023] [Accepted: 01/11/2024] [Indexed: 02/09/2024]
Abstract
The design of small-molecule-binding proteins requires protein backbones that contain cavities. Previous design efforts were based on naturally occurring cavity-containing backbone architectures. Here, we designed diverse cavity-containing backbones without predefined architectures by introducing tailored restraints into the backbone sampling driven by SCUBA (Side Chain-Unknown Backbone Arrangement), a neural network statistical energy function. For 521 out of 5816 designs, the root-mean-square deviations (RMSDs) of the Cα atoms for the AlphaFold2-predicted structures and our designed structures are within 2.0 Å. We experimentally tested 10 designed proteins and determined the crystal structures of two of them. One closely agrees with the designed model, while the other forms a domain-swapped dimer, where the partial structures are in agreement with the designed structures. Our results indicate that data-driven methods such as SCUBA hold great potential for designing de novo proteins with tailored small-molecule-binding function.
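The 2.0 Å criterion above compares Cα positions between predicted and designed structures. The sketch below shows the RMSD formula and the resulting pass rate (521 of 5816 designs). It assumes the two coordinate sets are already optimally superimposed and listed in the same residue order; a real comparison would first apply a superposition step such as the Kabsch algorithm, which is omitted here. Function names and the toy coordinates are hypothetical.

```python
import math


def ca_rmsd(coords_a, coords_b):
    """RMSD (in Å) between two equal-length lists of (x, y, z) Cα coordinates.

    Assumes the structures are already superimposed; no fitting is done here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared / len(coords_a))


def fraction_within(rmsds, cutoff=2.0):
    """Fraction of designs whose RMSD is at or below the cutoff."""
    return sum(1 for r in rmsds if r <= cutoff) / len(rmsds)


# Toy coordinates (Å) for a two-residue "structure".
designed = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
predicted = [(0.0, 0.0, 0.0), (1.0, 2.0, 0.0)]
print(f"Calpha RMSD: {ca_rmsd(designed, predicted):.3f} A")

# Pass rate implied by the counts quoted in the abstract.
print(f"designs within 2.0 A: {521 / 5816:.2%}")
```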
Affiliation(s)
- Yang Xu
- Department of Rheumatology and Immunology, The First Affiliated Hospital of USTC, Centre for Advanced Interdisciplinary Science and Biomedicine of IHM, Hefei National Center for Interdisciplinary Sciences at the Microscale, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui 230001, China; MOE Key Laboratory for Membraneless Organelles and Cellular Dynamics, Hefei National Laboratory for Physical Sciences at the Microscale, School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui 230027, China
- Xiuhong Hu
- Department of Rheumatology and Immunology, The First Affiliated Hospital of USTC, Centre for Advanced Interdisciplinary Science and Biomedicine of IHM, Hefei National Center for Interdisciplinary Sciences at the Microscale, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui 230001, China; MOE Key Laboratory for Membraneless Organelles and Cellular Dynamics, Hefei National Laboratory for Physical Sciences at the Microscale, School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui 230027, China
- Chenchen Wang
- MOE Key Laboratory for Membraneless Organelles and Cellular Dynamics, Hefei National Laboratory for Physical Sciences at the Microscale, School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui 230027, China
- Yongrui Liu
- MOE Key Laboratory for Membraneless Organelles and Cellular Dynamics, Hefei National Laboratory for Physical Sciences at the Microscale, School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui 230027, China
- Quan Chen
- Department of Rheumatology and Immunology, The First Affiliated Hospital of USTC, Centre for Advanced Interdisciplinary Science and Biomedicine of IHM, Hefei National Center for Interdisciplinary Sciences at the Microscale, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui 230001, China; MOE Key Laboratory for Membraneless Organelles and Cellular Dynamics, Hefei National Laboratory for Physical Sciences at the Microscale, School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui 230027, China; Biomedical Sciences and Health Laboratory of Anhui Province, University of Science and Technology of China, Hefei, Anhui 230027, China.
| | - Haiyan Liu
- MOE Key Laboratory for Membraneless Organelles and Cellular Dynamics, Hefei National Laboratory for Physical Sciences at the Microscale, School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui 230027, China; Biomedical Sciences and Health Laboratory of Anhui Province, University of Science and Technology of China, Hefei, Anhui 230027, China; School of Data Science, University of Science and Technology of China, Hefei, Anhui 230027, China.
|
10
|
Chu AE, Lu T, Huang PS. Sparks of function by de novo protein design. Nat Biotechnol 2024; 42:203-215. [PMID: 38361073 PMCID: PMC11366440 DOI: 10.1038/s41587-024-02133-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 11/05/2023] [Accepted: 01/09/2024] [Indexed: 02/17/2024]
Abstract
Information in proteins flows from sequence to structure to function, with each step causally driven by the preceding one. Protein design is founded on inverting this process: specify a desired function, design a structure executing this function, and find a sequence that folds into this structure. This 'central dogma' underlies nearly all de novo protein-design efforts. Our ability to accomplish these tasks depends on our understanding of protein folding and function and our ability to capture this understanding in computational methods. In recent years, deep learning-derived approaches for efficient and accurate structure modeling and enrichment of successful designs have enabled progression beyond the design of protein structures and towards the design of functional proteins. We examine these advances in the broader context of classical de novo protein design and consider implications for future challenges to come, including fundamental capabilities such as sequence and structure co-design and conformational control considering flexibility, and functional objectives such as antibody and enzyme design.
Affiliation(s)
- Alexander E Chu
- Biophysics Program, Stanford University, Palo Alto, CA, USA
- Department of Bioengineering, Stanford University, Palo Alto, CA, USA
- Google DeepMind, London, UK
| | - Tianyu Lu
- Department of Bioengineering, Stanford University, Palo Alto, CA, USA
| | - Po-Ssu Huang
- Biophysics Program, Stanford University, Palo Alto, CA, USA.
- Department of Bioengineering, Stanford University, Palo Alto, CA, USA.
|
11
|
Liu Y, Liu H. Protein sequence design on given backbones with deep learning. Protein Eng Des Sel 2024; 37:gzad024. [PMID: 38157313 DOI: 10.1093/protein/gzad024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/16/2023] [Revised: 12/08/2023] [Accepted: 12/18/2023] [Indexed: 01/03/2024]
Abstract
Deep learning methods for protein sequence design focus on modeling and sampling the many-dimensional distribution of amino acid sequences conditioned on the backbone structure. To produce physically foldable sequences, inter-residue couplings need to be considered properly. These couplings are treated explicitly in iterative methods or autoregressive methods. Non-autoregressive models treating these couplings implicitly are computationally more efficient, but still await tests by wet experiment. Currently, sequence design methods are evaluated mainly using native sequence recovery rate and native sequence perplexity. These metrics can be complemented by sequence-structure compatibility metrics obtained from energy calculation or structure prediction. However, existing computational metrics have important limitations that may render the generalization of computational test results to performance in real applications unwarranted. Validation of design methods by wet experiments should be encouraged.
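The two evaluation metrics named in this abstract can be illustrated with a minimal sketch. It assumes the common definitions (recovery as the fraction of positions matching the native residue, perplexity as the exponential of the mean negative log-probability assigned to the native residues); the function names and the toy four-letter alphabet are hypothetical and not taken from the paper.

```python
import math

def sequence_recovery(native: str, designed: str) -> float:
    """Fraction of positions where the designed residue equals the native one."""
    if len(native) != len(designed):
        raise ValueError("sequences must have equal length")
    return sum(a == b for a, b in zip(native, designed)) / len(native)

def perplexity(native: str, probs: list) -> float:
    """exp of the mean negative log-probability given to each native residue.

    probs[i] is a dict mapping residue letters to the model's predicted
    probability at position i (a hypothetical model output format).
    """
    mean_nll = -sum(math.log(p[aa]) for aa, p in zip(native, probs)) / len(native)
    return math.exp(mean_nll)

# Toy example: a 4-residue "protein" and an uninformative (uniform) model.
native = "ACDE"
designed = "ACDF"
uniform = [{aa: 0.25 for aa in "ACDE"} for _ in native]

print(sequence_recovery(native, designed))  # 3 of 4 positions match
print(perplexity(native, uniform))          # close to the alphabet size, 4
```

A uniform model gives a perplexity equal to the alphabet size, so lower values indicate that the model concentrates probability on the native residues.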
Affiliation(s)
- Yufeng Liu
- MOE Key Laboratory for Membraneless Organelles and Cellular Dynamics, School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui 230027, China
| | - Haiyan Liu
- MOE Key Laboratory for Membraneless Organelles and Cellular Dynamics, School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui 230027, China
- Biomedical Sciences and Health Laboratory of Anhui Province, University of Science and Technology of China, Hefei, Anhui 230027, China
- School of Biomedical Engineering, Suzhou Institute for Advanced Research, University of Science and Technology of China, Suzhou, Jiangsu 215004, China
|
12
|
Zhang X, Yin H, Ling F, Zhan J, Zhou Y. SPIN-CGNN: Improved fixed backbone protein design with contact map-based graph construction and contact graph neural network. PLoS Comput Biol 2023; 19:e1011330. [PMID: 38060617 PMCID: PMC10729952 DOI: 10.1371/journal.pcbi.1011330] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 07/06/2023] [Revised: 12/19/2023] [Accepted: 11/27/2023] [Indexed: 12/20/2023]
Abstract
Recent advances in deep learning have significantly improved the ability to infer protein sequences directly from protein structures for fixed-backbone design. The methods have evolved from the early use of multi-layer perceptrons to convolutional neural networks, transformers, and graph neural networks (GNN). However, the conventional approach of constructing a K-nearest-neighbors (KNN) graph for GNN has limited the utilization of edge information, which plays a critical role in network performance. Here we introduced SPIN-CGNN based on protein contact maps for nearest neighbors. Together with auxiliary edge updates and selective kernels, we found that SPIN-CGNN provided a comparable performance in refolding ability by AlphaFold2 to the current state-of-the-art techniques but a significant improvement over them in terms of sequence recovery, perplexity, deviation from amino-acid compositions of native sequences, conservation of hydrophobic positions, and low complexity regions, according to the test by unseen structures, "hallucinated" structures and diffusion models. Results suggest that low complexity regions in the sequences designed by deep learning, for generated structures in particular, remain to be improved, when compared to the native sequences.
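The difference between a KNN graph and a contact-map graph can be seen in a minimal sketch. This is not the SPIN-CGNN implementation: the one-dimensional toy coordinates, the choice of k = 1, and the 8 Å distance cutoff are all assumptions made for illustration.

```python
import math

def knn_edges(coords, k):
    """Directed edges from each residue to its k nearest neighbours."""
    edges = set()
    for i, ci in enumerate(coords):
        # Sort all other residues by distance to residue i.
        by_distance = sorted(
            (math.dist(ci, cj), j) for j, cj in enumerate(coords) if j != i
        )
        edges.update((i, j) for _, j in by_distance[:k])
    return edges

def contact_edges(coords, cutoff):
    """Directed edges between every residue pair closer than the cutoff."""
    return {
        (i, j)
        for i, ci in enumerate(coords)
        for j, cj in enumerate(coords)
        if i != j and math.dist(ci, cj) < cutoff
    }

# Hypothetical residue positions in angstroms; the last one is far from the rest.
coords = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0), (9.0, 0.0, 0.0), (30.0, 0.0, 0.0)]

print(sorted(knn_edges(coords, k=1)))
print(sorted(contact_edges(coords, cutoff=8.0)))
```

In this toy case the KNN graph always gives the distant residue an edge, even though its nearest neighbour is 21 Å away, while the contact graph leaves it unconnected and keeps only spatially close pairs.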
Affiliation(s)
- Xing Zhang
- School of Biology and Biological Engineering, South China University of Technology, Guangzhou, People’s Republic of China
- Institute of Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen, People’s Republic of China
| | - Hongmei Yin
- Institute of Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen, People’s Republic of China
| | - Fei Ling
- School of Biology and Biological Engineering, South China University of Technology, Guangzhou, People’s Republic of China
| | - Jian Zhan
- Institute of Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen, People’s Republic of China
| | - Yaoqi Zhou
- Institute of Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen, People’s Republic of China
|
13
|
Wu C, Yu X, Zheng P, Chen P, Wu D. Rational Redesign of Chitosanase to Enhance Thermostability and Catalytic Activity to Produce Chitooligosaccharides with a Relatively High Degree of Polymerization. J Agric Food Chem 2023; 71:15213-15223. [PMID: 37793074 DOI: 10.1021/acs.jafc.3c04542] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/06/2023]
Abstract
Chitooligosaccharides (hdpCOS) with a high degree of polymerization (hdp, DP 4-10) generally have greater biological activities than those of low-DP (ldp, DP 2-3) COS. Chitosanase from Bacillus amyloliquefaciens KCP2 (Csn46) can degrade chitosan to more hdpCOS at high temperature (70 °C), but low thermal stability at this temperature makes it unsuitable for industrial application; the wild-type enzyme can only produce COS (DP 2-4) at lower temperatures. Several thermostable mutants were obtained by modifying chitosanase using a comprehensive strategy based on a computer-aided mutant design. A combination of four beneficial single-point mutations (A129L/T175V/K70T/D34G) to Csn46 was selected to obtain a markedly improved mutant, Mut4, with a half-life at 60 °C extended from 34.31 to 690.80 min, and the specific activity increased from 1671.73 to 3528.77 U/mg. Mut4 produced COS with DPs of 2-4 and 2-7 at 60 and 70 °C, respectively. Therefore, Mut4 has the potential to be applied to the industrial-scale preparation of hdpCOS with high biological activity.
Affiliation(s)
- Changyun Wu
- Key Laboratory of Industrial Biotechnology, Ministry of Education, School of Biotechnology, Jiangnan University, 1800 Lihu Road, Wuxi 214122, China
| | - Xiaowei Yu
- Key Laboratory of Industrial Biotechnology, Ministry of Education, School of Biotechnology, Jiangnan University, 1800 Lihu Road, Wuxi 214122, China
| | - Pu Zheng
- Key Laboratory of Industrial Biotechnology, Ministry of Education, School of Biotechnology, Jiangnan University, 1800 Lihu Road, Wuxi 214122, China
| | - Pengcheng Chen
- Key Laboratory of Industrial Biotechnology, Ministry of Education, School of Biotechnology, Jiangnan University, 1800 Lihu Road, Wuxi 214122, China
| | - Dan Wu
- Key Laboratory of Industrial Biotechnology, Ministry of Education, School of Biotechnology, Jiangnan University, 1800 Lihu Road, Wuxi 214122, China
|
14
|
Zhang L, Liu H. Exploring binding positions and backbone conformations of peptide ligands of proteins with a backbone-centred statistical energy function. J Comput Aided Mol Des 2023; 37:463-478. [PMID: 37498491 DOI: 10.1007/s10822-023-00518-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/18/2023] [Accepted: 07/05/2023] [Indexed: 07/28/2023]
Abstract
When designing peptide ligands based on the structure of a protein receptor, it can be very useful to narrow down the possible binding positions and bound conformations of the ligand without the need to choose its amino acid sequence in advance. Here, we construct and benchmark a tool for this purpose based on a recently reported statistical energy model named SCUBA (Sidechain-Unknown Backbone Arrangement) for designing protein backbones without considering specific amino acid sequences. With this tool, backbone fragments of different local conformation types are generated and optimized with SCUBA-driven stochastic simulations and simulated annealing, and then ranked and clustered to obtain representative backbone fragment poses of strong SCUBA interaction energies with the receptor. We computationally benchmarked the tool on 111 known protein-peptide complex structures. When the bound ligands are in the strand conformation, the method is able to generate backbone fragments of both low SCUBA energies and low root mean square deviations from experimental structures of peptide ligands. When the bound ligands are helices or coils, low-energy backbone fragments with binding poses similar to experimental structures have been generated for approximately 50% of benchmark cases. We have examined a number of predicted ligand-receptor complexes by atomistic molecular dynamics simulations, in which the peptide ligands have been found to stay at the predicted binding sites and to maintain their local conformations. These results suggest that promising backbone structures of peptides bound to protein receptors can be designed by identifying outstanding minima on the SCUBA-modeled backbone energy landscape.
Affiliation(s)
- Lu Zhang
- MOE Key Laboratory for Membraneless Organelles and Cellular Dynamics, Hefei National Laboratory for Physical Sciences at the Microscale, School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, 230027, Anhui, China
| | - Haiyan Liu
- MOE Key Laboratory for Membraneless Organelles and Cellular Dynamics, Hefei National Laboratory for Physical Sciences at the Microscale, School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, 230027, Anhui, China.
- Biomedical Sciences and Health Laboratory of Anhui Province, University of Science and Technology of China, Hefei, 230027, Anhui, China.
- School of Data Science, University of Science and Technology of China, Hefei, 230027, Anhui, China.
|
15
|
Yan J, Li S, Zhang Y, Hao A, Zhao Q. ZetaDesign: an end-to-end deep learning method for protein sequence design and side-chain packing. Brief Bioinform 2023; 24:bbad257. [PMID: 37429578 DOI: 10.1093/bib/bbad257] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 03/28/2023] [Revised: 06/05/2023] [Accepted: 06/21/2023] [Indexed: 07/12/2023]
Abstract
Computational protein design has been demonstrated to be the most powerful tool in the last few years among protein designing and repacking tasks. In practice, these two tasks are strongly related but often treated separately. Besides, state-of-the-art deep-learning-based methods cannot provide interpretability from an energy perspective, affecting the accuracy of the design. Here we propose a new systematic approach, including both a posterior probability part and a joint probability part, to solve the two essential questions once and for all. This approach takes the physicochemical properties of amino acids into consideration and uses the joint probability model to ensure the convergence between structure and amino acid type. Our results demonstrated that this method could generate feasible, high-confidence sequences with low-energy side-chain conformations. The designed sequences can fold into target structures with high confidence and maintain relatively stable biochemical properties. The side-chain conformation has a significantly lower energy landscape without delegating to a rotamer library or performing expensive conformational searches. Overall, we propose an end-to-end method that combines the advantages of both deep learning and energy-based methods. The design results of this model demonstrate high efficiency and precision, as well as a low energy state and good interpretability.
Affiliation(s)
- Junyu Yan
- State Key Laboratory of Virtual Reality Technology and Systems, Beihang University, Beijing, China
| | - Shuai Li
- State Key Laboratory of Virtual Reality Technology and Systems, Beihang University, Beijing, China
| | - Ying Zhang
- The Key Laboratory of Cell Proliferation and Regulation Biology, Ministry of Education, College of Life Sciences, Beijing Normal University, Beijing, China
| | - Aimin Hao
- State Key Laboratory of Virtual Reality Technology and Systems, Beihang University, Beijing, China
| | - Qinping Zhao
- State Key Laboratory of Virtual Reality Technology and Systems, Beihang University, Beijing, China
|
16
|
Huang J, Xie X, Zheng Z, Ye L, Wang P, Xu L, Wu Y, Yan J, Yang M, Yan Y. De Novo Computational Design of a Lipase with Hydrolysis Activity towards Middle-Chained Fatty Acid Esters. Int J Mol Sci 2023; 24:ijms24108581. [PMID: 37239928 DOI: 10.3390/ijms24108581] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 04/20/2023] [Revised: 05/08/2023] [Accepted: 05/09/2023] [Indexed: 05/28/2023]
Abstract
Innovations in biocatalysts provide great prospects for intolerant environments or novel reactions. Due to the limited catalytic capacity and the long-term and labor-intensive characteristics of mining enzymes with the desired functions, de novo enzyme design was developed to obtain industrial application candidates in a rapid and convenient way. Here, based on the catalytic mechanisms and the known structures of proteins, we proposed a computational protein design strategy combining de novo enzyme design and laboratory-directed evolution. Starting with the theozyme constructed using a quantum-mechanical approach, the theoretical enzyme-skeleton combinations were assembled and optimized via the Rosetta "inside-out" protocol. A small number of designed sequences were experimentally screened using SDS-PAGE, mass spectrometry and a qualitative activity assay in which the designed enzyme 1a8uD1 exhibited a measurable hydrolysis activity of 24.25 ± 0.57 U/g towards p-nitrophenyl octanoate. To improve the activity of the designed enzyme, molecular dynamics simulations and the RosettaDesign application were utilized to further optimize the substrate binding mode and amino acid sequence, thus keeping the residues of theozyme intact. The redesigned lipase 1a8uD1-M8 displayed enhanced hydrolysis activity towards p-nitrophenyl octanoate, 3.34 times higher than that of 1a8uD1. Meanwhile, the natural skeleton protein (PDB entry 1a8u) did not display any hydrolysis activity, confirming that the hydrolysis abilities of the designed 1a8uD1 and the redesigned 1a8uD1-M8 were devised from scratch. More importantly, the designed 1a8uD1-M8 was also able to hydrolyze the natural middle-chained substrate (glycerol trioctanoate), for which the activity was 27.67 ± 0.69 U/g. This study indicates that the strategy employed here has great potential to generate novel enzymes exhibiting the desired reactions.
Affiliation(s)
- Jinsha Huang
- Key Laboratory of Molecular Biophysics, Ministry of Education, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, China
| | - Xiaoman Xie
- Key Laboratory of Molecular Biophysics, Ministry of Education, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, China
| | - Zhen Zheng
- Key Laboratory of Molecular Biophysics, Ministry of Education, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, China
| | - Luona Ye
- Key Laboratory of Molecular Biophysics, Ministry of Education, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, China
| | - Pengbo Wang
- Key Laboratory of Molecular Biophysics, Ministry of Education, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, China
| | - Li Xu
- Key Laboratory of Molecular Biophysics, Ministry of Education, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, China
| | - Ying Wu
- Key Laboratory of Molecular Biophysics, Ministry of Education, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, China
| | - Jinyong Yan
- Key Laboratory of Molecular Biophysics, Ministry of Education, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, China
| | - Min Yang
- Key Laboratory of Molecular Biophysics, Ministry of Education, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, China
| | - Yunjun Yan
- Key Laboratory of Molecular Biophysics, Ministry of Education, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, China
| |
Collapse
|
17
|
Liu H, Chen Q. Computational protein design with data-driven approaches: Recent developments and perspectives. WIREs Comput Mol Sci 2022. [DOI: 10.1002/wcms.1646] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022]
Affiliation(s)
- Haiyan Liu
- MOE Key Laboratory for Membraneless Organelles and Cellular Dynamics, School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui, China
- Biomedical Sciences and Health Laboratory of Anhui Province, University of Science and Technology of China, Hefei, Anhui, China
- School of Data Science, University of Science and Technology of China, Hefei, Anhui, China
| | - Quan Chen
- MOE Key Laboratory for Membraneless Organelles and Cellular Dynamics, School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui, China
- Biomedical Sciences and Health Laboratory of Anhui Province, University of Science and Technology of China, Hefei, Anhui, China
|
18
|
Dicks L, Wales DJ. Exploiting Sequence-Dependent Rotamer Information in Global Optimization of Proteins. J Phys Chem B 2022; 126:8381-8390. [PMID: 36257022 PMCID: PMC9623586 DOI: 10.1021/acs.jpcb.2c04647] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 01/11/2023]
Abstract
Rotamers, namely amino acid side chain conformations common to many different peptides, can be compiled into libraries. These rotamer libraries are used in protein modeling, where the limited conformational space occupied by amino acid side chains is exploited. Here, we construct a sequence-dependent rotamer library from simulations of all possible tripeptides, which provides rotameric states dependent on adjacent amino acids. We observe significant sensitivity of rotamer populations to sequence and find that the library is successful in locating side chain conformations present in crystal structures. The library is designed for applications with basin-hopping global optimization, where we use it to propose moves in conformational space. The addition of rotamer moves significantly increases the efficiency of protein structure prediction within this framework, and we determine parameters to optimize efficiency.
Affiliation(s)
- L. Dicks
- Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom
- IBM Research, The Hartree Centre STFC Laboratory, Sci-Tech Daresbury, Warrington WA4 4AD, United Kingdom
| | - D. J. Wales
- Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom
|
19
|
Yuan B, Ru X, Lin Z. Analysis of the sidechain structures of amino acids and peptides and a deduced method for the efficient search of peptide conformations. Comput Theor Chem 2022. [DOI: 10.1016/j.comptc.2022.113815] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/03/2022]
|
20
|
Liu Y, Zhang L, Wang W, Zhu M, Wang C, Li F, Zhang J, Li H, Chen Q, Liu H. Rotamer-free protein sequence design based on deep learning and self-consistency. Nat Comput Sci 2022; 2:451-462. [PMID: 38177863 DOI: 10.1038/s43588-022-00273-6] [Citation(s) in RCA: 22] [Impact Index Per Article: 7.3] [Received: 12/27/2021] [Accepted: 06/07/2022] [Indexed: 01/06/2024]
Abstract
Several previously proposed deep learning methods to design amino acid sequences that autonomously fold into a given protein backbone yielded promising results in computational tests but did not outperform conventional energy function-based methods in wet experiments. Here we present the ABACUS-R method, which uses an encoder-decoder network trained using a multitask learning strategy to predict the sidechain type of a central residue from its three-dimensional local environment, which includes, besides other features, the types but not the conformations of the surrounding sidechains. This eliminates the need to reconstruct and optimize sidechain structures, and drastically simplifies the sequence design process. Thus iteratively applying the encoder-decoder to different central residues is able to produce self-consistent overall sequences for a target backbone. Results of wet experiments, including five structures solved by X-ray crystallography, show that ABACUS-R outperforms state-of-the-art energy function-based methods in success rate and design precision.
Affiliation(s)
- Yufeng Liu
- MOE Key Laboratory for Membraneless Organelles and Cellular Dynamics, School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui, China
| | - Lu Zhang
- MOE Key Laboratory for Membraneless Organelles and Cellular Dynamics, School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui, China
| | - Weilun Wang
- CAS Key Laboratory of GIPAS, School of Information Science and Technology, Department of Electronic Engineering and Information Science, University of Science and Technology of China, Hefei, Anhui, China
| | - Min Zhu
- MOE Key Laboratory for Membraneless Organelles and Cellular Dynamics, School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui, China
| | - Chenchen Wang
- MOE Key Laboratory for Membraneless Organelles and Cellular Dynamics, School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui, China
| | - Fudong Li
- MOE Key Laboratory for Membraneless Organelles and Cellular Dynamics, School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui, China
- Biomedical Sciences and Health Laboratory of Anhui Province, University of Science and Technology of China, Hefei, Anhui, China
| | - Jiahai Zhang
- MOE Key Laboratory for Membraneless Organelles and Cellular Dynamics, School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui, China
- Biomedical Sciences and Health Laboratory of Anhui Province, University of Science and Technology of China, Hefei, Anhui, China
| | - Houqiang Li
- CAS Key Laboratory of GIPAS, School of Information Science and Technology, Department of Electronic Engineering and Information Science, University of Science and Technology of China, Hefei, Anhui, China.
| | - Quan Chen
- MOE Key Laboratory for Membraneless Organelles and Cellular Dynamics, School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui, China.
- Biomedical Sciences and Health Laboratory of Anhui Province, University of Science and Technology of China, Hefei, Anhui, China.
| | - Haiyan Liu
- MOE Key Laboratory for Membraneless Organelles and Cellular Dynamics, School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui, China.
- Biomedical Sciences and Health Laboratory of Anhui Province, University of Science and Technology of China, Hefei, Anhui, China.
- School of Data Science, University of Science and Technology of China, Hefei, Anhui, China.
|
21
|
Sun J, Wu B. Protein design with a machine-learned potential about backbone designability. Trends Biochem Sci 2022; 47:638-640. [DOI: 10.1016/j.tibs.2022.04.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/27/2022] [Revised: 04/07/2022] [Accepted: 04/07/2022] [Indexed: 10/18/2022]
|
22
|
Chen Y, Chen Q, Liu H. DEPACT and PACMatch: A Workflow of Designing De Novo Protein Pockets to Bind Small Molecules. J Chem Inf Model 2022; 62:971-985. [PMID: 35171604 DOI: 10.1021/acs.jcim.1c01398] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 12/15/2022]
Abstract
Engineering of new functional proteins such as enzymes and biosensors involves the design of new protein pockets for the specific binding of small molecules. Here, we report a workflow composed of two new computational methods to execute this task. The DEPACT (Design Pocket as a Cluster based on Templates) method is a data-driven approach to design and evaluate small-molecule-binding pockets as isolated clusters, while the PACMatch method is a computational approach to match pocket residues in a cluster model to positions on given protein scaffolds. Using DEPACT and its scoring function, pocket clusters of natural-pocket-like chemical compositions and protein-ligand interaction strength can be designed. DEPACT can design pocket clusters containing water- or metal-ion-mediated protein-ligand interactions. While being able to efficiently treat relatively large pocket cluster models (e.g., of around 10 pocket residues), PACMatch outperforms previous methods in test cases of recovering the native positions of pocket residues in natural enzyme-substrate complexes.
Affiliation(s)
- Yaoxi Chen
- MOE Key Laboratory for Membraneless Organelles and Cellular Dynamics, Hefei National Laboratory for Physical Sciences at the Microscale, School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui 230027, China
| | - Quan Chen
- MOE Key Laboratory for Membraneless Organelles and Cellular Dynamics, Hefei National Laboratory for Physical Sciences at the Microscale, School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui 230027, China
- Biomedical Sciences and Health Laboratory of Anhui Province, University of Science & Technology of China, Hefei, Anhui 230027, China
| | - Haiyan Liu
- MOE Key Laboratory for Membraneless Organelles and Cellular Dynamics, Hefei National Laboratory for Physical Sciences at the Microscale, School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui 230027, China
- Biomedical Sciences and Health Laboratory of Anhui Province, University of Science & Technology of China, Hefei, Anhui 230027, China
- School of Data Science, University of Science and Technology of China, Hefei, Anhui 230027, China
|
23
|
A backbone-centred energy function of neural networks for protein design. Nature 2022; 602:523-528. [PMID: 35140398 DOI: 10.1038/s41586-021-04383-5] [Citation(s) in RCA: 50] [Impact Index Per Article: 16.7] [Received: 03/05/2021] [Accepted: 12/23/2021] [Indexed: 12/29/2022]
Abstract
A protein backbone structure is designable if a substantial number of amino acid sequences exist that autonomously fold into it. It has been suggested that the designability of backbones is governed mainly by side chain-independent or side chain type-insensitive molecular interactions, indicating an approach for designing new backbones (ready for amino acid selection) based on continuous sampling and optimization of the backbone-centred energy surface. However, a sufficiently comprehensive and precise energy function has yet to be established for this purpose. Here we show that this goal is met by a statistical model named SCUBA (for Side Chain-Unknown Backbone Arrangement) that uses neural network-form energy terms. These terms are learned with a two-step approach that comprises kernel density estimation followed by neural network training and can analytically represent multidimensional, high-order correlations in known protein structures. We report the crystal structures of nine de novo proteins whose backbones were designed to high precision using SCUBA, four of which have novel, non-natural overall architectures. By eschewing use of fragments from existing protein structures, SCUBA-driven structure design facilitates far-reaching exploration of the designable backbone space, thus extending the novelty and diversity of the proteins amenable to de novo design.
24
Liang S, Li Z, Zhan J, Zhou Y. De novo protein design by an energy function based on series expansion in distance and orientation dependence. Bioinformatics 2021; 38:86-93. [PMID: 34406339 DOI: 10.1093/bioinformatics/btab598] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 06/05/2021] [Revised: 08/11/2021] [Accepted: 08/16/2021] [Indexed: 02/03/2023] Open
Abstract
MOTIVATION: Despite many successes, de novo protein design is not yet a solved problem, as its success rate remains low. The low success rate is largely because we do not yet have an accurate energy function for describing the solvent-mediated interaction between amino acid residues in a protein chain. Previous studies showed that an energy function based on series expansions, with its parameters optimized for side-chain and loop conformations, can lead to one of the most accurate methods for side-chain (OSCAR) and loop (LEAP) prediction. Following the same strategy, we developed an energy function based on series expansions with the parameters optimized in four separate stages (recovering single-residue types without and with orientation dependence, selecting loop decoys and maintaining the composition of amino acids). We tested the energy function for de novo design by using Monte Carlo simulated annealing. RESULTS: The method for protein design (OSCAR-Design) is found to be as accurate as OSCAR and LEAP for side-chain and loop prediction, respectively. In de novo design, it can recover 38% to 43% of native residue types depending on the test set, conserve hydrophobic/hydrophilic residues at ∼75%, and yield an overall similarity in amino acid composition of more than 90%. These performance measures are all statistically significantly better than those of the several protein design programs compared. Moreover, the largest hydrophobic patch areas in designed proteins are near or smaller than those in native proteins. Thus, an energy function based on series expansion can be made useful for protein design. AVAILABILITY AND IMPLEMENTATION: The Linux executable version is freely available for academic users at http://zhouyq-lab.szbl.ac.cn/resources/.
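The two recovery figures quoted in this abstract (native residue recovery and hydrophobic/hydrophilic conservation) can be illustrated with a minimal sketch. This is not the OSCAR-Design implementation. The function names and, in particular, the hydrophobic residue set below are assumptions made for illustration; the abstract does not state which classification the authors used.

```python
# Assumed hydrophobic set for illustration only; the abstract does not define one.
HYDROPHOBIC = set("AVILMFWC")


def native_recovery(designed: str, native: str) -> float:
    """Fraction of aligned positions where the designed residue equals the native one."""
    if len(designed) != len(native) or not native:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(1 for d, n in zip(designed, native) if d == n)
    return matches / len(native)


def hydrophobicity_conservation(designed: str, native: str) -> float:
    """Fraction of positions whose hydrophobic/hydrophilic class is preserved."""
    if len(designed) != len(native) or not native:
        raise ValueError("sequences must be non-empty and of equal length")
    same_class = sum(
        1
        for d, n in zip(designed, native)
        if (d in HYDROPHOBIC) == (n in HYDROPHOBIC)
    )
    return same_class / len(native)


if __name__ == "__main__":
    native_seq = "MKVLA"    # toy sequences, not taken from the paper
    designed_seq = "MRVSA"
    print(native_recovery(designed_seq, native_seq))
    print(hydrophobicity_conservation(designed_seq, native_seq))
```

Class conservation is always at least as high as identity recovery, because identical residues necessarily share a class. That is consistent with the abstract reporting roughly 75% for the former and 38% to 43% for the latter.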
Affiliation(s)
- Shide Liang
  - Department of R & D, Bio-Thera Solutions, Guangzhou 510530, China
- Zhixiu Li
  - Institute of Health and Biomedical Innovation, Queensland University of Technology at Translational Research Institute, Woolloongabba, QLD 3001, Australia
- Jian Zhan
  - Institute for Glycomics and School of Information and Communication Technology, Griffith University, Gold Coast Campus, Southport, QLD 4222, Australia
  - Institute for Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen 518055, China
- Yaoqi Zhou
  - Institute for Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen 518055, China
  - Peking University Shenzhen Graduate School, Shenzhen 518055, China
25
Liu R, Wang J, Xiong P, Chen Q, Liu H. De novo sequence redesign of a functional Ras-binding domain globally inverted the surface charge distribution and led to extreme thermostability. Biotechnol Bioeng 2021; 118:2031-2042. [PMID: 33590881 DOI: 10.1002/bit.27716] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 10/23/2020] [Revised: 02/05/2021] [Accepted: 02/14/2021] [Indexed: 11/05/2022]
Abstract
Acquiring extremely thermostable proteins of given functions is challenging for conventional protein engineering. Here we applied ABACUS, a statistical energy function we developed for de novo amino acid sequence design, to globally redesign a Ras-binding domain (RBD), and obtained an extremely thermostable RBD that unfolds reversibly above 110°C. The redesigned RBD was experimentally confirmed to have the expected structure and Ras-binding interface. Directed evolution of the redesigned RBD improved its Ras-binding affinity to the native protein level without excessive loss of thermostability. The designed amino acid substitutions were mostly at the protein surface. For many substitutions, strong epistasis or significantly differentiated effects on thermostability in the native sequence context relative to the redesigned sequence context were observed, suggesting that the globally redesigned sequence is unreachable by combining beneficial mutations of the native sequence. Further analyses revealed that by replacing 38 of a total of 48 non-interfacial surface residues at once, ABACUS redesign was able to globally "invert" the protein's charge distribution pattern in an optimized way. Our study demonstrates that computational protein design provides powerful new tools to solve challenging protein engineering problems.
Affiliation(s)
- Ruicun Liu
  - School of Life Sciences, University of Science and Technology of China, Hefei, Anhui, China
- Jichao Wang
  - School of Life Sciences, University of Science and Technology of China, Hefei, Anhui, China
- Peng Xiong
  - School of Life Sciences, University of Science and Technology of China, Hefei, Anhui, China
- Quan Chen
  - School of Life Sciences, University of Science and Technology of China, Hefei, Anhui, China
  - Hefei National Laboratory for Physical Sciences at the Microscale, University of Science and Technology of China, Hefei, Anhui, China
- Haiyan Liu
  - School of Life Sciences, University of Science and Technology of China, Hefei, Anhui, China
  - Hefei National Laboratory for Physical Sciences at the Microscale, University of Science and Technology of China, Hefei, Anhui, China
  - School of Data Science, University of Science and Technology of China, Hefei, Anhui, China
26
Huang X, Pearce R, Zhang Y. FASPR: an open-source tool for fast and accurate protein side-chain packing. Bioinformatics 2020; 36:3758-3765. [PMID: 32259206 DOI: 10.1093/bioinformatics/btaa234] [Citation(s) in RCA: 58] [Impact Index Per Article: 11.6] [Received: 01/11/2020] [Revised: 03/30/2020] [Accepted: 04/01/2020] [Indexed: 01/04/2023] Open
Abstract
MOTIVATION: Protein structure and function are essentially determined by how the side-chain atoms interact with each other. Thus, accurate protein side-chain packing (PSCP) is a critical step toward protein structure prediction and protein design. Despite the importance of the problem, however, the accuracy and speed of current PSCP programs are still not satisfactory. RESULTS: We present FASPR for fast and accurate PSCP by using an optimized scoring function in combination with a deterministic searching algorithm. The performance of FASPR was compared with four state-of-the-art PSCP methods (CISRR, RASP, SCATD and SCWRL4) on both native and non-native protein backbones. For the assessment on native backbones, FASPR achieved a good performance by correctly predicting 69.1% of all the side-chain dihedral angles using a stringent tolerance criterion of 20°, comparing favorably with SCWRL4, CISRR, RASP and SCATD, which successfully predicted 68.8%, 68.6%, 67.8% and 61.7%, respectively. Additionally, FASPR achieved the highest speed, packing the 379 test protein structures in only 34.3 s, which was significantly faster than the control methods. For the assessment on non-native backbones, FASPR showed an equivalent or better performance on I-TASSER predicted backbones and the backbones perturbed from experimental structures. Detailed analyses showed that the major advantage of FASPR lies in the optimal combination of the dead-end elimination and tree decomposition with a well-optimized scoring function, which makes FASPR of practical use for both protein structure modeling and protein design studies. AVAILABILITY AND IMPLEMENTATION: The web server, source code and datasets are freely available at https://zhanglab.ccmb.med.umich.edu/FASPR and https://github.com/tommyhuangthu/FASPR. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
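The 20° tolerance criterion mentioned in this abstract can be illustrated with a short sketch of how a dihedral-angle accuracy might be scored. This is not FASPR's evaluation code. The function names are hypothetical, and treating an angle that differs by exactly 20° as correct is an assumption. The point worth showing is that dihedral angles are periodic, so 179° and -179° differ by 2°, not 358°.

```python
def angular_difference(a_deg: float, b_deg: float) -> float:
    """Smallest absolute difference between two angles in degrees (range 0 to 180)."""
    diff = abs(a_deg - b_deg) % 360.0
    return min(diff, 360.0 - diff)


def dihedral_accuracy(predicted, native, tolerance_deg: float = 20.0) -> float:
    """Fraction of predicted dihedral angles within the tolerance of the native ones.

    Assumption: a deviation exactly equal to the tolerance counts as correct.
    """
    if len(predicted) != len(native) or not native:
        raise ValueError("angle lists must be non-empty and of equal length")
    correct = sum(
        1
        for p, n in zip(predicted, native)
        if angular_difference(p, n) <= tolerance_deg
    )
    return correct / len(native)


if __name__ == "__main__":
    # Toy chi angles in degrees, not taken from the paper.
    predicted_chi = [60.0, 179.0, -65.0, 100.0]
    native_chi = [65.0, -179.0, -60.0, 150.0]
    print(dihedral_accuracy(predicted_chi, native_chi))
```

The sketch ignores symmetric side chains (for example, ring flips that leave the structure unchanged), which a real benchmark would likely need to handle.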
Affiliation(s)
- Robin Pearce
  - Department of Computational Medicine and Bioinformatics
- Yang Zhang
  - Department of Computational Medicine and Bioinformatics
  - Department of Biological Chemistry, University of Michigan, Ann Arbor, MI 48109, USA
27
Qi Y, Zhang JZH. DenseCPD: Improving the Accuracy of Neural-Network-Based Computational Protein Sequence Design with DenseNet. J Chem Inf Model 2020; 60:1245-1252. [DOI: 10.1021/acs.jcim.0c00043] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Indexed: 01/03/2023]
Affiliation(s)
- Yifei Qi
  - Shanghai Engineering Research Center of Molecular Therapeutics & New Drug Development, School of Chemistry and Molecular Engineering, East China Normal University, Shanghai 200062, China
  - NYU-ECNU Center for Computational Chemistry at NYU Shanghai, Shanghai 200062, China
- John Z. H. Zhang
  - Shanghai Engineering Research Center of Molecular Therapeutics & New Drug Development, School of Chemistry and Molecular Engineering, East China Normal University, Shanghai 200062, China
  - NYU-ECNU Center for Computational Chemistry at NYU Shanghai, Shanghai 200062, China
  - Department of Chemistry, New York University, New York, New York 10003, United States
28
Chen S, Sun Z, Lin L, Liu Z, Liu X, Chong Y, Lu Y, Zhao H, Yang Y. To Improve Protein Sequence Profile Prediction through Image Captioning on Pairwise Residue Distance Map. J Chem Inf Model 2019; 60:391-399. [PMID: 31800243 DOI: 10.1021/acs.jcim.9b00438] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.7] [Indexed: 11/30/2022]
Abstract
Protein sequence profile prediction aims to generate multiple sequences from structural information to advance protein design. Protein sequence profiles can be computationally predicted by energy-based or fragment-based methods. By integrating these methods with neural networks, our previous method, SPIN2, has achieved a sequence recovery rate of 34%. However, SPIN2 employed only one-dimensional (1D) structural properties that are not sufficient to represent three-dimensional (3D) structures. In this study, we represented 3D structures by 2D maps of pairwise residue distances and developed a new method (SPROF) to predict protein sequence profiles based on an image-captioning learning framework. To the best of our knowledge, this is the first method to employ a 2D distance map for predicting protein properties. SPROF achieved 39.8% in sequence recovery of residues on the independent test set, representing a 5.2% improvement over SPIN2. We also found that the sequence recovery increased with the number of neighboring residues in 3D structural space, indicating that our method can effectively learn long-range information from the 2D distance map. Thus, such a network architecture using a 2D distance map is expected to be useful for other 3D structure-based applications, such as binding site prediction, protein function prediction, and protein interaction prediction. The online server and the source code are available at http://biomed.nscc-gz.cn and https://github.com/biomed-AI/SPROF, respectively.
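The 2D map of pairwise residue distances described in this abstract can be sketched as follows. This is not SPROF's preprocessing code. The function name is hypothetical, and using one representative atom per residue (for example, C-alpha) is an assumption, since the abstract does not say which atoms define the distances.

```python
import math


def distance_map(coords):
    """Build a symmetric matrix of pairwise Euclidean distances.

    coords: list of (x, y, z) tuples, one per residue. Which atom represents
    each residue (C-alpha is assumed here) is a choice left to the caller.
    """
    n = len(coords)
    dmap = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(coords[i], coords[j])
            dmap[i][j] = d  # fill both triangles so the map is symmetric
            dmap[j][i] = d
    return dmap


if __name__ == "__main__":
    # Toy coordinates in angstroms, not taken from any real structure.
    residues = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (3.0, 4.0, 12.0)]
    for row in distance_map(residues):
        print(row)
```

The resulting matrix has a zero diagonal and is symmetric, so it can be treated as a single-channel image of size N by N for an N-residue chain, which is what makes an image-style network applicable.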
Affiliation(s)
- Sheng Chen
  - School of Data and Computer Science, Sun Yat-sen University, Guangzhou 510000, China
- Zhe Sun
  - School of Data and Computer Science, Sun Yat-sen University, Guangzhou 510000, China
- Lihua Lin
  - School of Data and Computer Science, Sun Yat-sen University, Guangzhou 510000, China
- Zifeng Liu
  - Third Affiliated Hospital of Sun Yat-sen University, Guangzhou 510000, China
- Xun Liu
  - Third Affiliated Hospital of Sun Yat-sen University, Guangzhou 510000, China
- Yutian Chong
  - Third Affiliated Hospital of Sun Yat-sen University, Guangzhou 510000, China
- Yutong Lu
  - School of Data and Computer Science, Sun Yat-sen University, Guangzhou 510000, China
- Huiying Zhao
  - Sun Yat-sen Memorial Hospital, Sun Yat-sen University, Guangzhou 510000, China
- Yuedong Yang
  - School of Data and Computer Science, Sun Yat-sen University, Guangzhou 510000, China
  - Key Laboratory of Machine Intelligence and Advanced Computing (Sun Yat-sen University) of the Ministry of Education, Guangzhou 510000, China