1. Tawfeeq C, Wang J, Khaniya U, Madej T, Song J, Abrol R, Youkharibache P. IgStrand: A universal residue numbering scheme for the immunoglobulin-fold (Ig-fold) to study Ig-proteomes and Ig-interactomes. PLoS Comput Biol 2025; 21:e1012813. [PMID: 40228037 PMCID: PMC12051499 DOI: 10.1371/journal.pcbi.1012813] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 09/12/2024] [Accepted: 01/20/2025] [Indexed: 04/16/2025]
Abstract
The Immunoglobulin fold (Ig-fold) is found in proteins from all domains of life and represents the most populous fold in the human genome, with current estimates ranging from 2 to 3% of protein coding regions. That proportion is much higher in the surfaceome where Ig and Ig-like domains orchestrate cell-cell recognition, adhesion and signaling. The ability of Ig-domains to reliably fold and self-assemble through highly specific interfaces represents a remarkable property of these domains, making them key elements of molecular interaction systems: the immune system, the nervous system, the vascular system and the muscular system. We define a universal residue numbering scheme, common to all domains sharing the Ig-fold in order to study the wide spectrum of Ig-domain variants constituting the Ig-proteome and Ig-Ig interactomes at the heart of these systems. The "IgStrand numbering scheme" enables the identification of Ig structural proteomes and interactomes in and between any species, and comparative structural, functional, and evolutionary analyses. We review how Ig-domains are classified today as topological and structural variants and highlight the "Ig-fold irreducible structural signature" shared by all of them. The IgStrand numbering scheme lays the foundation for the systematic annotation of structural proteomes by detecting and accurately labeling Ig-, Ig-like and Ig-extended domains in proteins, which are poorly annotated in current databases and opens the door to accurate machine learning. Importantly, it sheds light on the robust Ig protein folding algorithm used by nature to form beta sandwich supersecondary structures. The numbering scheme powers an algorithm implemented in the interactive structural analysis software iCn3D to systematically recognize Ig-domains, annotate them and perform detailed analyses comparing any domain sharing the Ig-fold in sequence, topology and structure, regardless of their diverse topologies or origin. 
The scheme provides a robust fold detection and labeling mechanism that reveals unsuspected structural homologies among protein structures beyond currently identified Ig- and Ig-like domain variants. Indeed, multiple folds classified independently contain a common structural signature, in particular jelly-rolls. Examples of folds that harbor an "Ig-extended" architecture are given. Applications in protein engineering around the Ig-architecture are straightforward based on the universal numbering.
Affiliation(s)
- Caesar Tawfeeq
  - Department of Chemistry and Biochemistry, California State University Northridge, Northridge, California, United States of America
- Jiyao Wang
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, United States of America
- Umesh Khaniya
  - Cancer Data Science Laboratory, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, Maryland, United States of America
- Thomas Madej
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, United States of America
- James Song
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, United States of America
- Ravinder Abrol
  - Department of Chemistry and Biochemistry, California State University Northridge, Northridge, California, United States of America
- Philippe Youkharibache
  - Cancer Data Science Laboratory, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, Maryland, United States of America
2. Guo Y, Waltari E, Lu H, Sheng Z, Wu X. Novel rhesus macaque immunoglobulin germline genes identified by three sequencing approaches. Front Immunol 2024; 15:1506348. [PMID: 39776901 PMCID: PMC11703713 DOI: 10.3389/fimmu.2024.1506348] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/04/2024] [Accepted: 12/05/2024] [Indexed: 01/11/2025]
Abstract
Introduction: Rhesus macaques have long been a focus of research for understanding immune responses to human pathogens due to their close phylogenetic relationship with humans. As rhesus macaque antibody germlines show high degrees of polymorphism, the spectrum of database-covered genes expressed in individual macaques remains to be determined. Methods: Here, four rhesus macaques infected with SHIVSF162P3N became a study of interest because they developed broadly neutralizing antibodies against HIV-1. To identify the immunoglobulin heavy chain V-gene (IGHV) germlines in these macaques, we applied three sequencing approaches (genomic DNA (gDNA) TOPO sequencing, gDNA MiSeq, and messenger RNA (mRNA) MiSeq inference with IgDiscover) and illustrated the detection power of each method. Results: Of the 197 new rhesus IGHV germline sequences identified, 116 (59%) were validated by at least two methods, and 143 (73%) were found in at least two macaques or two sample sources. About 20% of germlines in each macaque are missing from the current database, including a subset frequently expressed. Overall, gDNA MiSeq determined the greatest number of germline sequences, followed by gDNA TOPO sequencing and mRNA MiSeq inference by IgDiscover, with IgDiscover providing direct evidence of allele expression and usage. Discussion: Our interdisciplinary study sheds light on germline sequencing, enhances the rhesus IGHV germline database, and highlights the importance of germline sequencing in rhesus immune repertoire studies.
Affiliation(s)
- Yicheng Guo
  - Aaron Diamond AIDS Research Center, Columbia University Vagelos College of Physicians and Surgeons, New York, NY, United States
  - Division of Infectious Diseases, Department of Medicine, Columbia University Vagelos College of Physicians and Surgeons, New York, NY, United States
- Eric Waltari
  - Aaron Diamond AIDS Research Center, Columbia University Vagelos College of Physicians and Surgeons, New York, NY, United States
- Hong Lu
  - Aaron Diamond AIDS Research Center, Columbia University Vagelos College of Physicians and Surgeons, New York, NY, United States
  - Division of Infectious Diseases, Department of Medicine, Columbia University Vagelos College of Physicians and Surgeons, New York, NY, United States
- Zizhang Sheng
  - Aaron Diamond AIDS Research Center, Columbia University Vagelos College of Physicians and Surgeons, New York, NY, United States
  - Division of Infectious Diseases, Department of Medicine, Columbia University Vagelos College of Physicians and Surgeons, New York, NY, United States
- Xueling Wu
  - Aaron Diamond AIDS Research Center, Columbia University Vagelos College of Physicians and Surgeons, New York, NY, United States
  - Division of Infectious Diseases, Department of Medicine, Columbia University Vagelos College of Physicians and Surgeons, New York, NY, United States
3. Bauer J, Rajagopal N, Gupta P, Gupta P, Nixon AE, Kumar S. How can we discover developable antibody-based biotherapeutics? Front Mol Biosci 2023; 10:1221626. [PMID: 37609373 PMCID: PMC10441133 DOI: 10.3389/fmolb.2023.1221626] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 05/12/2023] [Accepted: 07/10/2023] [Indexed: 08/24/2023]
Abstract
Antibody-based biotherapeutics have emerged as a successful class of pharmaceuticals despite significant challenges and risks to their discovery and development. This review discusses the most frequently encountered hurdles in the research and development (R&D) of antibody-based biotherapeutics and proposes a conceptual framework called biopharmaceutical informatics. Our vision advocates for the syncretic use of computation and experimentation at every stage of biologic drug discovery, considering developability (manufacturability, safety, efficacy, and pharmacology) of potential drug candidates from the earliest stages of the drug discovery phase. The computational advances in recent years allow for more precise formulation of disease concepts, rapid identification, and validation of targets suitable for therapeutic intervention and discovery of potential biotherapeutics that can agonize or antagonize them. Furthermore, computational methods for de novo and epitope-specific antibody design are increasingly being developed, opening novel computationally driven opportunities for biologic drug discovery. Here, we review the opportunities and limitations of emerging computational approaches for optimizing antigens to generate robust immune responses, in silico generation of antibody sequences, discovery of potential antibody binders through virtual screening, assessment of hits, identification of lead drug candidates and their affinity maturation, and optimization for developability. The adoption of biopharmaceutical informatics across all aspects of drug discovery and development cycles should help bring affordable and effective biotherapeutics to patients more quickly.
Affiliation(s)
- Joschka Bauer
  - Early Stage Pharmaceutical Development Biologicals, Boehringer Ingelheim Pharma GmbH & Co. KG, Biberach/Riss, Germany
  - In Silico Team, Boehringer Ingelheim, Hannover, Germany
- Nandhini Rajagopal
  - In Silico Team, Boehringer Ingelheim, Hannover, Germany
  - Biotherapeutics Discovery, Boehringer Ingelheim Pharmaceuticals Inc., Ridgefield, CT, United States
- Priyanka Gupta
  - In Silico Team, Boehringer Ingelheim, Hannover, Germany
  - Biotherapeutics Discovery, Boehringer Ingelheim Pharmaceuticals Inc., Ridgefield, CT, United States
- Pankaj Gupta
  - In Silico Team, Boehringer Ingelheim, Hannover, Germany
  - Biotherapeutics Discovery, Boehringer Ingelheim Pharmaceuticals Inc., Ridgefield, CT, United States
- Andrew E. Nixon
  - Biotherapeutics Discovery, Boehringer Ingelheim Pharmaceuticals Inc., Ridgefield, CT, United States
- Sandeep Kumar
  - In Silico Team, Boehringer Ingelheim, Hannover, Germany
  - Biotherapeutics Discovery, Boehringer Ingelheim Pharmaceuticals Inc., Ridgefield, CT, United States
4. Sheng Z, Bimela JS, Wang M, Li Z, Guo Y, Ho DD. An optimized thermodynamics integration protocol for identifying beneficial mutations in antibody design. Front Immunol 2023; 14:1190416. [PMID: 37275896 PMCID: PMC10235760 DOI: 10.3389/fimmu.2023.1190416] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/20/2023] [Accepted: 04/28/2023] [Indexed: 06/07/2023]
Abstract
Accurate identification of beneficial mutations is central to antibody design. Many knowledge-based (KB) computational approaches have been developed to predict beneficial mutations, but their accuracy leaves room for improvement. Thermodynamic integration (TI) is an alchemical free energy algorithm that offers an alternative technique for identifying beneficial mutations, but its performance has not been evaluated. In this study, we developed an efficient TI protocol with high accuracy for predicting binding free energy changes of antibody mutations. The improved TI method outperforms KB methods at identifying both beneficial and deleterious mutations. We observed that KB methods have higher accuracies in predicting deleterious mutations than beneficial mutations. A pipeline using KB methods to efficiently exclude deleterious mutations and TI to accurately identify beneficial mutations was developed for high-throughput mutation scanning. The pipeline was applied to optimize the binding affinity of a broadly sarbecovirus neutralizing antibody 10-40 against the circulating severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) omicron variant. Three identified beneficial mutations show strong synergy and improve both binding affinity and neutralization potency of antibody 10-40. Molecular dynamics simulation revealed that the three mutations improve the binding affinity of antibody 10-40 through the stabilization of an altered binding mode with increased polar and hydrophobic interactions. Above all, this study presents an accurate and efficient TI-based approach for optimizing antibodies and other biomolecules.
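As a rough illustration of the thermodynamic integration idea mentioned in this abstract, the sketch below integrates the ensemble average of dU/dλ over the alchemical coupling parameter λ (ΔG = ∫₀¹ ⟨∂U/∂λ⟩ dλ) with the trapezoidal rule, then takes the difference between the bound and unbound legs to estimate the change in binding free energy for a mutation. It is not the authors' protocol: the function names, the five λ windows, and the ⟨dU/dλ⟩ values are hypothetical toy inputs, and the sign convention (negative ΔΔG meaning improved binding) is an assumption.

```python
from typing import Sequence


def trapezoid(x: Sequence[float], y: Sequence[float]) -> float:
    """Integrate y over x with the trapezoidal rule."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    return sum(
        0.5 * (y[i] + y[i + 1]) * (x[i + 1] - x[i]) for i in range(len(x) - 1)
    )


def ti_free_energy(lambdas: Sequence[float], dudl_means: Sequence[float]) -> float:
    """Free energy change of one alchemical leg: integral of <dU/dlambda> from 0 to 1."""
    return trapezoid(lambdas, dudl_means)


def binding_ddg(
    lambdas: Sequence[float],
    dudl_complex: Sequence[float],
    dudl_unbound: Sequence[float],
) -> float:
    """Binding free energy change of a mutation: bound leg minus unbound leg."""
    return ti_free_energy(lambdas, dudl_complex) - ti_free_energy(lambdas, dudl_unbound)


# Hypothetical toy data: per-window averages of dU/dlambda (arbitrary energy units).
lambdas = [0.0, 0.25, 0.5, 0.75, 1.0]
dudl_complex = [2.0, 0.5, -1.0, -2.5, -4.0]  # antibody-antigen complex leg
dudl_unbound = [1.0, 0.5, 0.0, -0.5, -1.0]   # unbound antibody leg

ddg = binding_ddg(lambdas, dudl_complex, dudl_unbound)
# Assumed convention: a negative value suggests the mutation improves binding.
is_beneficial = ddg < 0
```

In practice the per-window averages would come from molecular dynamics simulations at each λ value, and the number and spacing of windows affect the quadrature error; the toy values here are linear in λ only so that the trapezoidal rule is exact.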
Affiliation(s)
- Zizhang Sheng
  - Aaron Diamond AIDS Research Center, Columbia University Vagelos College of Physicians and Surgeons, New York, NY, United States
- Jude S. Bimela
  - Zuckerman Mind Brain Behavior Institute, Columbia University, New York, NY, United States
- Maple Wang
  - Aaron Diamond AIDS Research Center, Columbia University Vagelos College of Physicians and Surgeons, New York, NY, United States
- Zhiteng Li
  - Aaron Diamond AIDS Research Center, Columbia University Vagelos College of Physicians and Surgeons, New York, NY, United States
- Yicheng Guo
  - Aaron Diamond AIDS Research Center, Columbia University Vagelos College of Physicians and Surgeons, New York, NY, United States
- David D. Ho
  - Aaron Diamond AIDS Research Center, Columbia University Vagelos College of Physicians and Surgeons, New York, NY, United States