1
Madsen AV, Mejias-Gomez O, Pedersen LE, Preben Morth J, Kristensen P, Jenkins TP, Goletz S. Structural trends in antibody-antigen binding interfaces: a computational analysis of 1833 experimentally determined 3D structures. Comput Struct Biotechnol J 2024; 23:199-211. [PMID: 38161735 PMCID: PMC10755492 DOI: 10.1016/j.csbj.2023.11.056] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/08/2023] [Revised: 11/27/2023] [Accepted: 11/28/2023] [Indexed: 01/03/2024] Open
Abstract
Antibodies are attractive therapeutic candidates due to their ability to bind cognate antigens with high affinity and specificity. Still, the underlying molecular rules governing the antibody-antigen interface remain poorly understood, making in silico antibody design inherently difficult and keeping the discovery and design of novel antibodies a costly and laborious process. This study investigates the characteristics of antibody-antigen binding interfaces through a computational analysis of more than 850,000 atom-atom contacts from the largest reported set of antibody-antigen complexes, comprising 1833 nonredundant, experimentally determined structures. The analysis compares binding characteristics of conventional antibodies and single-domain antibodies (sdAbs) targeting both protein and peptide antigens. We find clear patterns in the number of antibody-antigen contacts and in amino acid frequencies in the paratope. The direct comparison of sdAbs and conventional antibodies helps elucidate the mechanisms employed by sdAbs to compensate for their smaller size and the fact that they harbor only half the number of complementarity-determining regions compared to conventional antibodies. Furthermore, we pinpoint antibody interface hotspot residues that are often found at the binding interface and the amino acid frequencies at these positions. These findings have direct potential applications in antibody engineering and the design of improved antibody libraries.
Affiliation(s)
- Andreas V. Madsen
  - Department of Biotechnology and Biomedicine, Technical University of Denmark, Kgs. Lyngby, Denmark
- Oscar Mejias-Gomez
  - Department of Biotechnology and Biomedicine, Technical University of Denmark, Kgs. Lyngby, Denmark
- Lasse E. Pedersen
  - Department of Biotechnology and Biomedicine, Technical University of Denmark, Kgs. Lyngby, Denmark
- J. Preben Morth
  - Department of Biotechnology and Biomedicine, Technical University of Denmark, Kgs. Lyngby, Denmark
- Peter Kristensen
  - Department of Chemistry and Bioscience, Aalborg University, Aalborg, Denmark
- Timothy P. Jenkins
  - Department of Biotechnology and Biomedicine, Technical University of Denmark, Kgs. Lyngby, Denmark
- Steffen Goletz
  - Department of Biotechnology and Biomedicine, Technical University of Denmark, Kgs. Lyngby, Denmark
2
Wang X, Gao X, Fan X, Huai Z, Zhang G, Yao M, Wang T, Huang X, Lai L. WUREN: Whole-modal union representation for epitope prediction. Comput Struct Biotechnol J 2024; 23:2122-2131. [PMID: 38817963 PMCID: PMC11137340 DOI: 10.1016/j.csbj.2024.05.023] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/26/2024] [Revised: 05/14/2024] [Accepted: 05/14/2024] [Indexed: 06/01/2024] Open
Abstract
B-cell epitope identification plays a vital role in the development of vaccines, therapies, and diagnostic tools. Currently, molecular docking tools for B-cell epitope prediction are heavily influenced by empirical parameters and require significant computational resources, making it challenging to meet large-scale prediction demands. When predicting epitopes from antigen-antibody complexes, current artificial intelligence algorithms cannot make accurate predictions because of insufficient protein feature representations, indicating that a novel algorithm is needed for efficient protein information extraction. In this paper, we introduce a multimodal model called WUREN (Whole-modal Union Representation for Epitope predictioN), which effectively combines sequence, graph, and structural features. It achieved AUC-PR scores of 0.213 and 0.193 on the solved structures and AlphaFold-generated structures, respectively, for the independent test proteins selected from the DiscoTope3 benchmark. Our findings indicate that WUREN is an efficient feature extraction model for protein complexes, with generalizable application potential in the development of protein-based drugs. Moreover, the streamlined framework of WUREN could be readily extended to model similar biomolecules, such as nucleic acids, carbohydrates, and lipids.
Affiliation(s)
- Xuezhe Fan
  - XtalPi Innovation Center, Beijing, China
- Zhe Huai
  - XtalPi Innovation Center, Beijing, China
- Lipeng Lai
  - XtalPi Innovation Center, Beijing, China
3
Mi Y, Marcu SB, Tabirca S, Yallapragada VV. PS-GO parametric protein search engine. Comput Struct Biotechnol J 2024; 23:1499-1509. [PMID: 38633387 PMCID: PMC11021831 DOI: 10.1016/j.csbj.2024.04.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/11/2024] [Revised: 04/01/2024] [Accepted: 04/01/2024] [Indexed: 04/19/2024] Open
Abstract
With the explosive growth of protein-related data, we are confronted with a critical scientific inquiry: How can we effectively retrieve, compare, and profoundly comprehend these protein structures to maximize the utilization of such data resources? PS-GO, a parametric protein search engine, has been specifically designed and developed to maximize the utilization of the rapidly growing volume of protein-related data. This innovative tool addresses the critical need for effective retrieval, comparison, and deep understanding of protein structures. By integrating computational biology, bioinformatics, and data science, PS-GO is capable of managing large-scale data and accurately predicting and comparing protein structures and functions. The engine is built upon the concept of parametric protein design, a computer-aided method that adjusts and optimizes protein structures and sequences to achieve desired biological functions and structural stability. PS-GO utilizes key parameters such as amino acid sequence, side chain angle, and solvent accessibility, which have a significant influence on protein structure and function. Additionally, PS-GO leverages computable parameters, derived computationally, which are crucial for understanding and predicting protein behavior. The development of PS-GO underscores the potential of parametric protein design in a variety of applications, including enhancing enzyme activity, improving antibody affinity, and designing novel functional proteins. This advancement not only provides a robust theoretical foundation for the field of protein engineering and biotechnology but also offers practical guidelines for future progress in this domain.
Affiliation(s)
- Yanlin Mi
  - School of Computer Science and Information Technology, University College Cork, Cork, Ireland
  - SFI Centre for Research Training in Artificial Intelligence, University College Cork, Cork, Ireland
- Stefan-Bogdan Marcu
  - School of Computer Science and Information Technology, University College Cork, Cork, Ireland
- Sabin Tabirca
  - School of Computer Science and Information Technology, University College Cork, Cork, Ireland
  - Faculty of Mathematics and Informatics, Transylvania University of Brasov, Brasov, Romania
- Venkata V.B. Yallapragada
  - Centre for Advanced Photonics and Process Analytics, Munster Technological University, Cork, Ireland
4
Farjami T, Sharma A, Hagen L, Jensen IJ, Falch E. Comparative study on composition and functional properties of brewer's spent grain proteins precipitated by citric acid and hydrochloric acid. Food Chem 2024; 446:138863. [PMID: 38428084 DOI: 10.1016/j.foodchem.2024.138863] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/23/2023] [Revised: 02/21/2024] [Accepted: 02/25/2024] [Indexed: 03/03/2024]
Abstract
Brewer's spent grain (BSG) is an abundant agro-industrial residue and a sustainable low-cost source for extracting proteins. The composition and functionality of BSG protein concentrates are affected by extraction conditions. This study examined the use of citric acid (CA) and HCl to precipitate BSG proteins. The resultant protein concentrates were compared in terms of their composition and functional properties. The BSG protein concentrate precipitated by CA had 10% lower protein content, 5.8% higher carbohydrate, and 5.4% higher lipid content than the sample precipitated by HCl. Hydrophilic/hydrophobic protein and saturated/unsaturated fatty acid ratios increased by 16.9% and 26.5% respectively, in the sample precipitated by CA. The formation of CA-cross-linkages was verified using shotgun proteomics and Fourier transform infrared spectroscopy. Precipitation by CA adversely affected protein solubility and emulsifying properties, while improving foaming properties. This study provides insights into the role of precipitants in modulating the properties of protein concentrates.
Affiliation(s)
- Toktam Farjami
  - Department of Biotechnology and Food Science, Norwegian University of Science and Technology (NTNU), NO-7491 Trondheim, Norway
- Animesh Sharma
  - Department of Clinical and Molecular Medicine, Norwegian University of Science and Technology (NTNU), NO-7491 Trondheim, Norway; Proteomics and Modomics Experimental Core (PROMEC), NTNU and the Central Norway Regional Health Authority, N-7491 Trondheim, Norway
- Lars Hagen
  - Department of Clinical and Molecular Medicine, Norwegian University of Science and Technology (NTNU), NO-7491 Trondheim, Norway; Proteomics and Modomics Experimental Core (PROMEC), NTNU and the Central Norway Regional Health Authority, N-7491 Trondheim, Norway
- Ida-Johanne Jensen
  - Department of Biotechnology and Food Science, Norwegian University of Science and Technology (NTNU), NO-7491 Trondheim, Norway
- Eva Falch
  - Department of Biotechnology and Food Science, Norwegian University of Science and Technology (NTNU), NO-7491 Trondheim, Norway
5
Yuan L, Guo J. PharmaRedefine: A database server for repurposing drugs against pathogenic bacteria. Methods 2024; 227:78-85. [PMID: 38754711 DOI: 10.1016/j.ymeth.2024.05.011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/02/2024] [Revised: 05/03/2024] [Accepted: 05/13/2024] [Indexed: 05/18/2024] Open
Abstract
Pathogenic bacteria represent a formidable threat to human health, necessitating substantial resources for prevention and treatment. With the escalating concern regarding antibiotic resistance, there is a pressing need for innovative approaches to combat these pathogens. Repurposing existing drugs offers a promising solution. Our present work hypothesizes that proteins harboring ligand-binding pockets with similar chemical environments may be able to bind the same drug. To facilitate this drug-repurposing strategy against pathogenic bacteria, we introduce an online server, PharmaRedefine. Leveraging a combination of sequence and structure alignment and protein pocket similarity analysis, this platform enables the prediction of potential targets in representative bacteria for specific FDA-approved drugs. This novel approach holds tremendous potential for drug repositioning to effectively combat infections caused by pathogenic bacteria. PharmaRedefine is freely available at http://guolab.mpu.edu.mo/pharmredefine.
Affiliation(s)
- Longxiao Yuan
  - Centre in Artificial Intelligence Driven Drug Discovery, Faculty of Applied Sciences, Macao Polytechnic University, Macao 999097, China
- Jingjing Guo
  - Centre in Artificial Intelligence Driven Drug Discovery, Faculty of Applied Sciences, Macao Polytechnic University, Macao 999097, China
6
Huang J, Wang Q, Sanchez-Martinez P, El-Kassaby YA, Jia Q, Xie Y, Guan W, Zang R. Phylogenetic conservatism and coordination in traits of Chinese woody endemic flora. iScience 2024; 27:109885. [PMID: 38799551 PMCID: PMC11126960 DOI: 10.1016/j.isci.2024.109885] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/26/2023] [Revised: 03/20/2024] [Accepted: 04/30/2024] [Indexed: 05/29/2024] Open
Abstract
Range-limited endemic species are often labeled as endangered due to their low adaptability to climate change, yet the evolutionary mechanisms influencing their distribution remain unclear. This study explores the relationship between leaf length, maximum height, and seed diameter and their linkage to phylogeny and climate in the macroecology of 1,370 woody endemics. Using a Bayesian analytical method that allows partitioning of phylogenetic and environmental variances and covariances, we revealed moderate to high phylogenetic signals in these traits, indicating evolutionary constraints potentially impacting climate change adaptability. The study uncovered a phylogenetically conserved coordination between height and leaf length, which was shown to be independent of macroecological patterns of temperature and precipitation. These findings emphasize the role of phylogenetic ancestry in shaping the distribution of woody endemics, highlighting the need for prioritized in situ conservation and providing insights for ex situ conservation strategies.
Affiliation(s)
- Jihong Huang
  - Ecology and Nature Conservation Institute, Chinese Academy of Forestry, Key Laboratory of Biodiversity Conservation of National Forestry and Grassland Administration, Key Laboratory of Forest Ecology and Environment of National Forestry and Grassland Administration, Beijing 100091, China
  - Co-Innovation Centre for Sustainable Forestry in Southern China, Nanjing Forestry University, Nanjing, Jiangsu 210037, China
- Qing Wang
  - Ecology and Nature Conservation Institute, Chinese Academy of Forestry, Key Laboratory of Biodiversity Conservation of National Forestry and Grassland Administration, Key Laboratory of Forest Ecology and Environment of National Forestry and Grassland Administration, Beijing 100091, China
  - Co-Innovation Centre for Sustainable Forestry in Southern China, Nanjing Forestry University, Nanjing, Jiangsu 210037, China
  - Ecological Technical Research Institute (Beijing) CO., Ltd., CIECC, Beijing 100037, China
- Pablo Sanchez-Martinez
  - CREAF, Cerdanyola del Vallès, 08193 Barcelona, Spain
  - Universitat Autònoma de Barcelona, Cerdanyola del Vallès, 08193 Barcelona, Spain
  - School of GeoSciences, University of Edinburgh, Edinburgh, UK
- Yousry A. El-Kassaby
  - Department of Forest and Conservation Sciences, Faculty of Forestry, The University of British Columbia, Vancouver V6T 1Z4, Canada
- Qiang Jia
  - Ecological Technical Research Institute (Beijing) CO., Ltd., CIECC, Beijing 100037, China
- Yifei Xie
  - Ganzhou Key Laboratory of Nanling Plant Resources Protection and Utilization, School of Life Sciences, Gannan Normal University, Ganzhou, Jiangxi 341000, China
- Wenbin Guan
  - School of Ecology and Nature Conservation, Beijing Forestry University, Beijing 100083, China
- Runguo Zang
  - Ecology and Nature Conservation Institute, Chinese Academy of Forestry, Key Laboratory of Biodiversity Conservation of National Forestry and Grassland Administration, Key Laboratory of Forest Ecology and Environment of National Forestry and Grassland Administration, Beijing 100091, China
  - Co-Innovation Centre for Sustainable Forestry in Southern China, Nanjing Forestry University, Nanjing, Jiangsu 210037, China
7
Bai G, Zeng X, Zhang L, Wang Y, Ma B. Computational investigation of the inhibitory interaction of IRF3 and SARS-CoV-2 accessory protein ORF3b. Biochem Biophys Res Commun 2024; 712-713:149945. [PMID: 38640732 DOI: 10.1016/j.bbrc.2024.149945] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/30/2024] [Accepted: 04/14/2024] [Indexed: 04/21/2024]
Abstract
ORF3b is one of the SARS-CoV-2 accessory proteins. A previous experimental study suggested that ORF3b prevents IRF3 from translocating to the nucleus. However, the biophysical mechanism of the ORF3b-IRF3 interaction is elusive. Here, we explored the conformational ensemble of ORF3b using all-atom replica exchange molecular dynamics simulation. Disordered ORF3b has mixed α-helix, β-turn and loop conformers. The potential ORF3b-IRF3 binding modes were searched by docking representative ORF3b conformers with IRF3, and 50 ORF3b-IRF3 complex poses were screened using molecular dynamics simulations ranging from 500 to 1000 ns. We found that ORF3b binds IRF3 predominantly on its CBP binding and phosphorylated pLxIS motifs, with the CBP binding site having the highest binding affinity. The ORF3b-IRF3 binding residues are highly conserved in SARS-CoV-2. Our results provide biophysical insights into the ORF3b-IRF3 interaction and explain its interferon antagonism mechanism.
Affiliation(s)
- Ganggang Bai
  - Engineering Research Center of Cell & Therapeutic Antibody (MOE), School of Pharmacy, Shanghai Jiao Tong University, Shanghai, 200240, China
- Xincheng Zeng
  - Engineering Research Center of Cell & Therapeutic Antibody (MOE), School of Pharmacy, Shanghai Jiao Tong University, Shanghai, 200240, China
- Linghao Zhang
  - Engineering Research Center of Cell & Therapeutic Antibody (MOE), School of Pharmacy, Shanghai Jiao Tong University, Shanghai, 200240, China
- Yanjing Wang
  - Engineering Research Center of Cell & Therapeutic Antibody (MOE), School of Pharmacy, Shanghai Jiao Tong University, Shanghai, 200240, China
- Buyong Ma
  - Engineering Research Center of Cell & Therapeutic Antibody (MOE), School of Pharmacy, Shanghai Jiao Tong University, Shanghai, 200240, China
8
Majane AC, Cridland JM, Blair LK, Begun DJ. Evolution and genetics of accessory gland transcriptome divergence between Drosophila melanogaster and D. simulans. Genetics 2024; 227:iyae039. [PMID: 38518250 DOI: 10.1093/genetics/iyae039] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/27/2023] [Revised: 08/27/2023] [Accepted: 02/15/2024] [Indexed: 03/24/2024] Open
Abstract
Studies of allele-specific expression in interspecific hybrids have provided important insights into gene-regulatory divergence and hybrid incompatibilities. Many such investigations in Drosophila have used transcriptome data from complex mixtures of many tissues or from gonads; however, regulatory divergence may vary widely among species, sexes, and tissues. Thus, we lack sufficiently broad sampling to be confident about the general biological principles of regulatory divergence. Here, we seek to fill some of these gaps in the literature by characterizing regulatory evolution and hybrid misexpression in a somatic male sex organ, the accessory gland, in F1 hybrids between Drosophila melanogaster and D. simulans. The accessory gland produces seminal fluid proteins, which play an important role in male and female fertility and may be subject to adaptive divergence due to male-male or male-female interactions. We find that trans differences are relatively more abundant than cis, in contrast to most of the interspecific hybrid literature, though large effect-size trans differences are rare. Seminal fluid protein genes have significantly elevated levels of expression divergence and tend to be regulated through both cis and trans divergence. We find limited misexpression (over- or underexpression relative to both parents) in this organ compared to most other Drosophila studies. As in previous studies, male-biased genes are overrepresented among misexpressed genes and are much more likely to be underexpressed. ATAC-Seq data show that chromatin accessibility is correlated with expression differences among species and hybrid allele-specific expression. This work identifies unique regulatory evolution and hybrid misexpression properties of the accessory gland and suggests the importance of tissue-specific allele-specific expression studies.
Affiliation(s)
- Alex C Majane
  - Department of Evolution and Ecology, University of California, Davis, CA 95616, USA
- Julie M Cridland
  - Department of Evolution and Ecology, University of California, Davis, CA 95616, USA
- Logan K Blair
  - Department of Evolution and Ecology, University of California, Davis, CA 95616, USA
- David J Begun
  - Department of Evolution and Ecology, University of California, Davis, CA 95616, USA
9
Chen Z, Grim CJ, Ramachandran P, Meng J. Advancing metagenome-assembled genome-based pathogen identification: unraveling the power of long-read assembly algorithms in Oxford Nanopore sequencing. Microbiol Spectr 2024; 12:e0011724. [PMID: 38687063 DOI: 10.1128/spectrum.00117-24] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/22/2024] [Accepted: 04/05/2024] [Indexed: 05/02/2024] Open
Abstract
Oxford Nanopore sequencing is one of the high-throughput sequencing technologies that facilitates the reconstruction of metagenome-assembled genomes (MAGs). This study aimed to assess the potential of long-read assembly algorithms in Oxford Nanopore sequencing to enhance the MAG-based identification of bacterial pathogens using both simulated and mock communities. Simulated communities were generated to mimic those on fresh spinach and in surface water. Long reads were produced using R9.4.1+SQK-LSK109 and R10.4 + SQK-LSK112, with 0.5, 1, and 2 million reads. The simulated bacterial communities included multidrug-resistant Salmonella enterica serotypes Heidelberg, Montevideo, and Typhimurium in the fresh spinach community individually or in combination, as well as multidrug-resistant Pseudomonas aeruginosa in the surface water community. Real data sets of the ZymoBIOMICS HMW DNA Standard were also studied. A bioinformatic pipeline (MAGenie, freely available at https://github.com/jackchen129/MAGenie) that combines metagenome assembly, taxonomic classification, and sequence extraction was developed to reconstruct draft MAGs from metagenome assemblies. Five assemblers were evaluated based on a series of genomic analyses. Overall, Flye outperformed the other assemblers, followed by Shasta, Raven, and Unicycler, while Canu performed least effectively. In some instances, the extracted sequences resulted in draft MAGs and provided the locations and structures of antimicrobial resistance genes and mobile genetic elements. Our study showcases the viability of utilizing the extracted sequences for precise phylogenetic inference, as demonstrated by the consistent alignment of phylogenetic topology between the reference genome and the extracted sequences. 
R9.4.1+SQK-LSK109 was more effective in most cases than R10.4+SQK-LSK112, and greater sequencing depths generally led to more accurate results.
IMPORTANCE
By examining diverse bacterial communities, particularly those housing multiple Salmonella enterica serotypes, this study holds significance in uncovering the potential of long-read assembly algorithms to improve metagenome-assembled genome (MAG)-based pathogen identification through Oxford Nanopore sequencing. Our research demonstrates that long-read assembly stands out as a promising avenue for boosting precision in MAG-based pathogen identification, thus advancing the development of more robust surveillance measures. The findings also support ongoing endeavors to fine-tune a bioinformatic pipeline for accurate pathogen identification within complex metagenomic samples.
Affiliation(s)
- Zhao Chen
  - Joint Institute for Food Safety and Applied Nutrition, Center for Food Safety and Security Systems, University of Maryland, College Park, Maryland, USA
- Christopher J Grim
  - Center for Food Safety and Applied Nutrition, United States Food and Drug Administration, College Park, Maryland, USA
- Padmini Ramachandran
  - Center for Food Safety and Applied Nutrition, United States Food and Drug Administration, College Park, Maryland, USA
- Jianghong Meng
  - Joint Institute for Food Safety and Applied Nutrition, Center for Food Safety and Security Systems, University of Maryland, College Park, Maryland, USA
  - Department of Nutrition and Food Science, University of Maryland, College Park, Maryland, USA
10
Yang R, Zhang L, Bu F, Sun F, Cheng B. AI-based prediction of protein-ligand binding affinity and discovery of potential natural product inhibitors against ERK2. BMC Chem 2024; 18:108. [PMID: 38831341 DOI: 10.1186/s13065-024-01219-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/24/2024] [Accepted: 05/29/2024] [Indexed: 06/05/2024] Open
Abstract
Determination of protein-ligand binding affinity (PLA) is a key technological tool in hit discovery and lead optimization, which is critical to the drug development process. PLA can be determined directly by experimental methods, but it is time-consuming and costly. In recent years, deep learning has been widely applied to PLA prediction, the key of which lies in the comprehensive and accurate representation of proteins and ligands. In this study, we proposed a multi-modal deep learning model based on the early fusion strategy, called DeepLIP, to improve PLA prediction by integrating multi-level information, and further used it for virtual screening of extracellular signal-regulated protein kinase 2 (ERK2), an ideal target for cancer treatment. Experimental results from model evaluation showed that DeepLIP achieved superior performance compared to state-of-the-art methods on the widely used benchmark dataset. In addition, by combining previously developed machine learning models and molecular dynamics simulation, we screened three novel hits from a drug-like natural product library. These compounds not only had favorable physicochemical properties, but also bound stably to the target protein. We believe they have the potential to serve as starting molecules for the development of ERK2 inhibitors.
Affiliation(s)
- Ruoqi Yang
  - Affiliated Hospital of Shandong University of Traditional Chinese Medicine, Jinan, 250011, China
  - Shandong University of Traditional Chinese Medicine, Jinan, 250355, China
- Lili Zhang
  - Jinan Central Hospital Affiliated to Shandong First Medical University, Jinan, 250013, China
- Fanyou Bu
  - Qingdao Municipal Hospital Group, Qingdao, 266000, China
- Fuqiang Sun
  - Shandong University of Traditional Chinese Medicine, Jinan, 250355, China
- Bin Cheng
  - Affiliated Hospital of Shandong University of Traditional Chinese Medicine, Jinan, 250011, China
11
Yu W, Zhang S, Zhao S, Chen LG, Cao J, Ye H, Yan J, Zhao Q, Mo B, Wang Y, Jiao Y, Ma Y, Huang X, Qian W, Dai J. Designing a synthetic moss genome using GenoDesigner. NATURE PLANTS 2024:10.1038/s41477-024-01693-0. [PMID: 38831044 DOI: 10.1038/s41477-024-01693-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/12/2023] [Accepted: 04/10/2024] [Indexed: 06/05/2024]
Abstract
The de novo synthesis of genomes has made unprecedented progress and achieved milestones, particularly in bacteria and yeast. However, the process of synthesizing a multicellular plant genome has not progressed at the same pace, due to the complexity of multicellular plant genomes, technical difficulties associated with large genome size and structure, and the intricacies of gene regulation and expression in plants. Here we outline the bottom-up design principles for the de novo synthesis of the Physcomitrium patens (that is, earthmoss) genome. To facilitate international collaboration and accessibility, we have developed and launched a public online design platform called GenoDesigner. This platform offers an intuitive graphical interface enabling users to efficiently manipulate extensive genome sequences, even up to the gigabase level. This tool is poised to greatly expedite the synthesis of the P. patens genome, offering an essential reference and roadmap for the synthesis of plant genomes.
Affiliation(s)
- Wenfei Yu
- Shenzhen Key Laboratory of Synthetic Genomics, Guangdong Provincial Key Laboratory of Synthetic Genomics, Shenzhen Institute of Synthetic Biology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
- University of Chinese Academy of Sciences, Beijing, China
| | - Shuo Zhang
- University of Chinese Academy of Sciences, Beijing, China
- State Key Laboratory of Plant Genomics, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing, China
| | - Shijun Zhao
- Shenzhen Key Laboratory of Synthetic Genomics, Guangdong Provincial Key Laboratory of Synthetic Genomics, Shenzhen Institute of Synthetic Biology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
- University of Chinese Academy of Sciences, Beijing, China
| | - Lian-Ge Chen
- State Key Laboratory of Plant Genomics, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing, China
| | - Jie Cao
- Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Key Laboratory of Synthetic Biology, Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China
| | - Hao Ye
- Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Key Laboratory of Synthetic Biology, Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China
| | - Jianbin Yan
- Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Key Laboratory of Synthetic Biology, Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China
| | - Qiao Zhao
- Shenzhen Key Laboratory of Synthetic Genomics, Guangdong Provincial Key Laboratory of Synthetic Genomics, Shenzhen Institute of Synthetic Biology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
| | - Beixin Mo
- College of Life Sciences and Oceanography, Shenzhen University, Shenzhen, China
| | - Ying Wang
- University of Chinese Academy of Sciences, Beijing, China
| | - Yuling Jiao
- University of Chinese Academy of Sciences, Beijing, China
- State Key Laboratory of Plant Genomics, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing, China
- State Key Laboratory of Protein and Plant Gene Research, School of Life Sciences, Peking University, Beijing, China
- Peking-Tsinghua Center for Life Sciences, Center for Quantitative Biology, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, China
| | - Yixing Ma
- Shenzhen Key Laboratory of Synthetic Genomics, Guangdong Provincial Key Laboratory of Synthetic Genomics, Shenzhen Institute of Synthetic Biology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
- University of Chinese Academy of Sciences, Beijing, China
| | - Xiaoluo Huang
- Shenzhen Key Laboratory of Synthetic Genomics, Guangdong Provincial Key Laboratory of Synthetic Genomics, Shenzhen Institute of Synthetic Biology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China.
- University of Chinese Academy of Sciences, Beijing, China.
| | - Wenfeng Qian
- University of Chinese Academy of Sciences, Beijing, China.
- State Key Laboratory of Plant Genomics, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing, China.
| | - Junbiao Dai
- Shenzhen Key Laboratory of Synthetic Genomics, Guangdong Provincial Key Laboratory of Synthetic Genomics, Shenzhen Institute of Synthetic Biology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China.
- University of Chinese Academy of Sciences, Beijing, China.
- Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Key Laboratory of Synthetic Biology, Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China.
- College of Life Sciences and Oceanography, Shenzhen University, Shenzhen, China.
| |
|
12
|
Young MG, Straub TJ, Worby CJ, Metsky HC, Gnirke A, Bronson RA, van Dijk LR, Desjardins CA, Matranga C, Qu J, Villicana JB, Azimzadeh P, Kau AL, Dodson KW, Schreiber HL, Manson AL, Hultgren SJ, Earl AM. Distinct Escherichia coli transcriptional profiles in the guts of recurrent UTI sufferers revealed by pangenome hybrid selection. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.02.29.582780. [PMID: 38463963 PMCID: PMC10925322 DOI: 10.1101/2024.02.29.582780] [Indexed: 03/12/2024]
Abstract
Low-abundance members of microbial communities are difficult to study in their native habitats. This includes Escherichia coli, a minor but common inhabitant of the gastrointestinal tract and an opportunistic pathogen, including of the urinary tract, where it is the primary pathogen. While multi-omic analyses have detailed critical interactions between uropathogenic Escherichia coli (UPEC) and the bladder that mediate UTI outcome, comparatively little is known about UPEC in its pre-infection reservoir, partly due to its low abundance there (<1% relative abundance). To accurately and sensitively explore the genomes and transcriptomes of diverse E. coli in gastrointestinal communities, we developed E. coli PanSelect, which uses a set of probes designed to specifically recognize and capture E. coli's broad pangenome from sequencing libraries. We demonstrated the ability of E. coli PanSelect to enrich, by orders of magnitude, sequencing data from diverse E. coli using a mock community and a set of human stool samples collected as part of a cohort study investigating drivers of recurrent urinary tract infections (rUTI). Comparisons of genomes and transcriptomes between E. coli residing in the gastrointestinal tracts of women with and without a history of rUTI suggest that rUTI gut E. coli are responding to increased levels of oxygen and nitrate, suggestive of mucosal inflammation, which may have implications for recurrent disease. E. coli PanSelect is well suited for investigations of native in vivo biology of E. coli in other environments where it is at low relative abundance, and the framework described here has broad applicability to other highly diverse, low abundance organisms.
|
13
|
Middendorf L, Eicholt LA. Random, de novo, and conserved proteins: How structure and disorder predictors perform differently. Proteins 2024; 92:757-767. [PMID: 38226524 DOI: 10.1002/prot.26652] [Received: 07/18/2023] [Revised: 10/18/2023] [Accepted: 12/01/2023] [Indexed: 01/17/2024]
Abstract
Understanding the emergence and structural characteristics of de novo and random proteins is crucial for unraveling protein evolution and designing novel enzymes. However, experimental determination of their structures remains challenging. Recent advancements in protein structure prediction, particularly with AlphaFold2 (AF2), have expanded our knowledge of protein structures, but their applicability to de novo and random proteins is unclear. In this study, we investigate the structural predictions and confidence scores of AF2 and protein language model-based predictor ESMFold for de novo and conserved proteins from Drosophila and a dataset of comparable random proteins. We find that the structural predictions for de novo and random proteins differ significantly from conserved proteins. Interestingly, a positive correlation between disorder and confidence scores (pLDDT) is observed for de novo and random proteins, in contrast to the negative correlation observed for conserved proteins. Furthermore, the performance of structure predictors for de novo and random proteins is hampered by the lack of sequence identity. We also observe fluctuating median predicted disorder among different sequence length quartiles for random proteins, suggesting an influence of sequence length on disorder predictions. In conclusion, while structure predictors provide initial insights into the structural composition of de novo and random proteins, their accuracy and applicability to such proteins remain limited. Experimental determination of their structures is necessary for a comprehensive understanding. The positive correlation between disorder and pLDDT could imply a potential for conditional folding and transient binding interactions of de novo and random proteins.
Affiliation(s)
- Lasse Middendorf
- Institute for Evolution and Biodiversity, University of Muenster, Muenster, Germany
| | - Lars A Eicholt
- Institute for Evolution and Biodiversity, University of Muenster, Muenster, Germany
| |
|
14
|
Gumińska N, Hałakuc P, Zakryś B, Milanowski R. Circular extrachromosomal DNA in Euglena gracilis under normal and stress conditions. Protist 2024; 175:126033. [PMID: 38574508 DOI: 10.1016/j.protis.2024.126033] [Received: 11/30/2023] [Revised: 03/10/2024] [Accepted: 03/27/2024] [Indexed: 04/06/2024]
Abstract
Extrachromosomal circular DNA (eccDNA) enhances genomic plasticity, augmenting its coding and regulatory potential. Advances in high-throughput sequencing have enabled the investigation of these structural variants. Although eccDNAs have been investigated in numerous taxa, they remain understudied in euglenids. Therefore, we examined eccDNAs predicted from Illumina sequencing data of Euglena gracilis Z SAG 1224-5/25, grown under optimal photoperiod and exposed to UV irradiation. We identified approximately 1000 unique eccDNA candidates, about 20% of which were shared across conditions. We also observed a significant enrichment of mitochondrially encoded eccDNA in the UV-irradiated sample. Furthermore, we found that the heterogeneity of eccDNA was reduced in UV-exposed samples compared to cells that were grown in optimal conditions. Hence, eccDNA appears to play a role in the response to oxidative stress in Euglena, as it does in other studied organisms. In addition to contributing to the understanding of Euglena genomes, our results contribute to the validation of bioinformatics pipelines on a large, non-model genome.
Affiliation(s)
- Natalia Gumińska
- Institute of Evolutionary Biology, Faculty of Biology, Biological and Chemical Research Centre, University of Warsaw, 101 Żwirki i Wigury Street, 02-089 Warsaw, Poland; Laboratory of RNA Biology, International Institute of Molecular and Cell Biology in Warsaw, 4 Ks. Trojdena Street, 02-109 Warsaw, Poland.
| | - Paweł Hałakuc
- Institute of Evolutionary Biology, Faculty of Biology, Biological and Chemical Research Centre, University of Warsaw, 101 Żwirki i Wigury Street, 02-089 Warsaw, Poland
| | - Bożena Zakryś
- Institute of Evolutionary Biology, Faculty of Biology, Biological and Chemical Research Centre, University of Warsaw, 101 Żwirki i Wigury Street, 02-089 Warsaw, Poland
| | - Rafał Milanowski
- Institute of Evolutionary Biology, Faculty of Biology, Biological and Chemical Research Centre, University of Warsaw, 101 Żwirki i Wigury Street, 02-089 Warsaw, Poland.
| |
|
15
|
Chakraborty S, Sharma G, Karmakar S, Banerjee S. Multi-OMICS approaches in cancer biology: New era in cancer therapy. Biochim Biophys Acta Mol Basis Dis 2024; 1870:167120. [PMID: 38484941 DOI: 10.1016/j.bbadis.2024.167120] [Received: 01/16/2024] [Revised: 03/06/2024] [Accepted: 03/06/2024] [Indexed: 04/01/2024]
Abstract
Innovative multi-omics frameworks integrate diverse datasets from the same patients to enhance our understanding of the molecular and clinical aspects of cancers. Advanced omics and multi-view clustering algorithms present unprecedented opportunities for classifying cancers into subtypes, refining survival predictions and treatment outcomes, and unravelling key pathophysiological processes across various molecular layers. However, with the increasing availability of cost-effective high-throughput technologies (HTT) that generate vast amounts of data, analyzing single layers often falls short of establishing causal relations. Integrating multi-omics data spanning genomes, epigenomes, transcriptomes, proteomes, metabolomes, and microbiomes offers unique prospects to comprehend the underlying biology of complex diseases like cancer. This discussion explores algorithmic frameworks designed to uncover cancer subtypes, disease mechanisms, and methods for identifying pivotal genomic alterations. It also underscores the significance of multi-omics in tumor classifications, diagnostics, and prognostications. Despite its unparalleled advantages, the integration of multi-omics data has been slow to find its way into everyday clinics. A major hurdle is the uneven maturity of different omics approaches and the widening gap between the generation of large datasets and the capacity to process this data. Initiatives promoting the standardization of sample processing and analytical pipelines, as well as multidisciplinary training for experts in data analysis and interpretation, are crucial for translating theoretical findings into practical applications.
Affiliation(s)
- Sohini Chakraborty
- Department of Biotechnology, School of Biosciences and Technology, Vellore Institute of Technology, Vellore 632014, Tamil Nadu, India
| | - Gaurav Sharma
- Department of Biotechnology, School of Biosciences and Technology, Vellore Institute of Technology, Vellore 632014, Tamil Nadu, India
| | - Sricheta Karmakar
- Department of Biotechnology, School of Biosciences and Technology, Vellore Institute of Technology, Vellore 632014, Tamil Nadu, India
| | - Satarupa Banerjee
- Department of Biotechnology, School of Biosciences and Technology, Vellore Institute of Technology, Vellore 632014, Tamil Nadu, India.
| |
|
16
|
Nikitin P, Sidorov S, Liehr T, Klimina K, Al-Rikabi A, Korchagin V, Kolomiets O, Arakelyan M, Spangenberg V. Variants of a major DNA satellite discriminate parental subgenomes in a hybrid parthenogenetic lizard Darevskia unisexualis (Darevsky, 1966). JOURNAL OF EXPERIMENTAL ZOOLOGY. PART B, MOLECULAR AND DEVELOPMENTAL EVOLUTION 2024; 342:368-379. [PMID: 38407543 DOI: 10.1002/jez.b.23244] [Received: 08/31/2023] [Revised: 12/12/2023] [Accepted: 01/30/2024] [Indexed: 02/27/2024]
Abstract
Hybrid parthenogenetic animals are an exceptionally interesting model for studying the mechanisms and evolution of sexual and asexual reproduction. A diploid parthenogenetic lizard Darevskia unisexualis is a result of an ancestral cross between a maternal species Darevskia raddei nairensis and a paternal species Darevskia valentini and presents a unique opportunity for a cytogenetic and computational analysis of a hybrid karyotype. Our previous results demonstrated a significant divergence between the pericentromeric DNA sequences of the parental Darevskia species; however, an in-depth comparative study of their pericentromeres is still lacking. Here, using target sequencing of microdissected pericentromeric regions, we reveal and compare the repertoires of the pericentromeric tandem repeats of the parental Darevskia lizards. We found species-specific sequences of the major pericentromeric tandem repeat CLsat, which allowed computational prediction and experimental validation of fluorescent DNA probes discriminating parental chromosomes within the hybrid karyotype of D. unisexualis. Moreover, we have implemented a generalizable computational method, based on the optimization of the Levenshtein distance between tandem repeat monomers, for finding species-specific fluorescent probes for pericentromere staining. In total, we anticipate that our comparative analysis of Darevskia pericentromeric repeats, the species-specific fluorescent probes that we found and the pipeline that we developed will form a basis for the future detailed cytogenomic studies of a wide range of natural and laboratory hybrids.
Affiliation(s)
- Pavel Nikitin
- Laboratory of Comparative Ethology and Biocommunication, Severtsov Institute of Ecology and Evolution RAS, Moscow, Russia
- Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Moscow, Russia
| | - Sviatoslav Sidorov
- Computational Regulatory Genomics, MRC Laboratory of Medical Sciences, Hammersmith Hospital Campus, London, UK
| | - Thomas Liehr
- Jena University Hospital, Friedrich Schiller University, Institute of Human Genetics, Jena, Germany
| | - Ksenia Klimina
- Lopukhin Federal Research and Clinical Center of Physical-Chemical Medicine of Federal Medical Biological Agency, Moscow, Russia
| | - Ahmed Al-Rikabi
- Jena University Hospital, Friedrich Schiller University, Institute of Human Genetics, Jena, Germany
| | | | - Oxana Kolomiets
- Laboratory of Cytogenetics, Vavilov Institute of General Genetics RAS, Moscow, Russia
| | - Marine Arakelyan
- Department of Zoology, Yerevan State University, Yerevan, Armenia
| | - Victor Spangenberg
- Laboratory of Cytogenetics, Vavilov Institute of General Genetics RAS, Moscow, Russia
| |
|
17
|
Duman ET, Sitte M, Conrads K, Mackay A, Ludewig F, Ströbel P, Ellenrieder V, Hessmann E, Papantonis A, Salinas G. A single-cell strategy for the identification of intronic variants related to mis-splicing in pancreatic cancer. NAR Genom Bioinform 2024; 6:lqae057. [PMID: 38800828 PMCID: PMC11127633 DOI: 10.1093/nargab/lqae057] [Received: 02/01/2024] [Revised: 04/24/2024] [Accepted: 05/23/2024] [Indexed: 05/29/2024] Open
Abstract
Most clinical diagnostic and genomic research setups focus almost exclusively on coding regions and essential splice sites, thereby overlooking other non-coding variants. As a result, intronic variants that can promote mis-splicing events across a range of diseases, including cancer, are yet to be systematically investigated. Such investigations would require both genomic and transcriptomic data, but there currently exist very few datasets that satisfy these requirements. We address this by developing a single-nucleus full-length RNA-sequencing approach that allows for the detection of potentially pathogenic intronic variants. We exemplify the potency of our approach by applying it to pancreatic cancer tumor and tumor-derived specimens and linking intronic variants to splicing dysregulation. We specifically find that prominent intron retention and pseudo-exon activation events are shared by the tumors and affect genes encoding key transcriptional regulators. Our work paves the way for the assessment and exploitation of intronic mutations as powerful prognostic markers and potential therapeutic targets in cancer.
Affiliation(s)
- Emre Taylan Duman
- NGS-Core Unit for Integrative Genomics, Institute of Pathology, University Medical Center, Göttingen, Germany
| | - Maren Sitte
- NGS-Core Unit for Integrative Genomics, Institute of Pathology, University Medical Center, Göttingen, Germany
| | - Karly Conrads
- Clinic of Gastroenterology, Gastrointestinal Oncology and Endocrinology, University Medical Center, Göttingen, Germany
- Clinical Research Unit 5002 (CRU5002), University Medical Center, Göttingen, Germany
- Institute of Medical Bioinformatics, University Medical Center, Göttingen, Germany
| | - Adi Mackay
- Clinical Research Unit 5002 (CRU5002), University Medical Center, Göttingen, Germany
- Institute of Pathology, University Medical Center, Göttingen, Germany
| | - Fabian Ludewig
- NGS-Core Unit for Integrative Genomics, Institute of Pathology, University Medical Center, Göttingen, Germany
| | - Philipp Ströbel
- Clinical Research Unit 5002 (CRU5002), University Medical Center, Göttingen, Germany
- Institute of Pathology, University Medical Center, Göttingen, Germany
| | - Volker Ellenrieder
- Clinic of Gastroenterology, Gastrointestinal Oncology and Endocrinology, University Medical Center, Göttingen, Germany
- Clinical Research Unit 5002 (CRU5002), University Medical Center, Göttingen, Germany
- Comprehensive Cancer Center Lower Saxony (CCC-N), Göttingen, Germany
| | - Elisabeth Hessmann
- Clinic of Gastroenterology, Gastrointestinal Oncology and Endocrinology, University Medical Center, Göttingen, Germany
- Clinical Research Unit 5002 (CRU5002), University Medical Center, Göttingen, Germany
- Comprehensive Cancer Center Lower Saxony (CCC-N), Göttingen, Germany
| | - Argyris Papantonis
- Clinical Research Unit 5002 (CRU5002), University Medical Center, Göttingen, Germany
- Institute of Pathology, University Medical Center, Göttingen, Germany
- Comprehensive Cancer Center Lower Saxony (CCC-N), Göttingen, Germany
| | - Gabriela Salinas
- NGS-Core Unit for Integrative Genomics, Institute of Pathology, University Medical Center, Göttingen, Germany
- Clinical Research Unit 5002 (CRU5002), University Medical Center, Göttingen, Germany
| |
|
18
|
Wachananawat B, Kong BL, Shaw P, Bongcheewin B, Sangvirotjanapat S, Prombutara P, Pornputtapong N, Sukrong S. Characterization and phylogenetic analysis of the complete chloroplast genome of Curcuma comosa and C. latifolia. Heliyon 2024; 10:e31248. [PMID: 38813184 PMCID: PMC11133819 DOI: 10.1016/j.heliyon.2024.e31248] [Received: 11/02/2023] [Revised: 04/23/2024] [Accepted: 05/13/2024] [Indexed: 05/31/2024] Open
Abstract
Members of the Curcuma genus, crops in the Zingiberaceae, are rhizomatous herbs used widely around the world. Two distinct species, C. comosa Roxb. and C. latifolia Roscoe, are referred to by the same vernacular name, "Wan Chak Motluk", in Thai. C. comosa holds economic importance and is extensively used as a Thai traditional medicine due to its phytoestrogenic properties. However, its morphology closely resembles that of C. latifolia, which contains zederone, a compound known for its hepatotoxic effects. The two species are often confused, which may affect the quality, efficacy and safety of the derived herbal materials. Thus, DNA markers were developed for discriminating C. comosa from C. latifolia. This study focused on analyzing core DNA barcode regions, including rbcL, matK, psbA-trnH spacer and ITS2, of the authentic C. comosa and C. latifolia species. No variable nucleotides were observed in the core DNA barcode regions. The complete chloroplast (cp) genome was therefore used to differentiate between the two species. The comparison revealed that the cp genomes of C. comosa and C. latifolia were 162,272 and 162,289 bp, respectively, with a total of 133 identified genes. The phylogenetic analysis revealed that C. comosa and C. latifolia exhibited a very close relationship with other Curcuma species. The cp genomes of C. comosa and C. latifolia were identified for the first time, providing valuable insights for species identification and evolutionary research within the Zingiberaceae family.
Affiliation(s)
- Bussarin Wachananawat
- Center of Excellence in DNA Barcoding of Thai Medicinal Plants, Department of Pharmacognosy and Pharmaceutical Botany, Faculty of Pharmaceutical Sciences, Chulalongkorn University, Bangkok, 10330, Thailand
| | - Bobby Lim‐Ho Kong
- Li Dak Sum Yip Yio Chin R & D Centre for Chinese Medicine and Institute of Chinese Medicine, The Chinese University of Hong Kong, Shatin, Hong Kong, N.T., China
| | - Pang‐Chui Shaw
- Li Dak Sum Yip Yio Chin R & D Centre for Chinese Medicine and Institute of Chinese Medicine, The Chinese University of Hong Kong, Shatin, Hong Kong, N.T., China
| | - Bhanubong Bongcheewin
- Department of Pharmaceutical Botany, Faculty of Pharmacy and Center of Excellence in Herbal Medicine and Natural Products, Faculty of Pharmacy, Mahidol University, Bangkok, 10400, Thailand
- Sireeruckhachati Nature Learning Park, Mahidol University, Nakhon Pathom, 73170, Thailand
| | | | - Pinidphon Prombutara
- Faculty of Science, Omics Science & Bioinformatics Center, Chulalongkorn University, Bangkok, 10330, Thailand
| | - Natapol Pornputtapong
- Department of Biochemistry and Microbiology, Faculty of Pharmaceutical Sciences, Chulalongkorn University, Bangkok, 10330, Thailand
| | - Suchada Sukrong
- Center of Excellence in DNA Barcoding of Thai Medicinal Plants, Department of Pharmacognosy and Pharmaceutical Botany, Faculty of Pharmaceutical Sciences, Chulalongkorn University, Bangkok, 10330, Thailand
| |
|
19
|
Elrashedy A, Nayel M, Salama A, Zaghawa A, Abdelsalam NR, Hasan ME. Phylogenetic Analysis and Comparative Genomics of Brucella abortus and Brucella melitensis Strains in Egypt. J Mol Evol 2024:10.1007/s00239-024-10173-0. [PMID: 38809331 DOI: 10.1007/s00239-024-10173-0] [Received: 01/18/2024] [Accepted: 05/02/2024] [Indexed: 05/30/2024]
Abstract
Brucellosis is a notifiable disease induced by a facultative intracellular Brucella pathogen. In this study, eight Brucella abortus and eighteen Brucella melitensis strains from Egypt were annotated and compared with the RB51 and REV1 vaccines, respectively. The RAST toolkit in the BV-BRC server was used for annotation, revealing genome lengths of 3,250,377 bp and 3,285,803 bp, 3289 and 3323 CDS, 48 and 49 tRNA genes, the same number of rRNA genes (3), 583 and 586 hypothetical proteins, and 2697 and 2726 functional proteins for B. abortus and B. melitensis, respectively. B. abortus strains exhibited a similar number of candidate genes, while B. melitensis strains showed some differences, especially in the SRR19520422 Faiyum strain. B. melitensis strains also showed differences in antimicrobial resistance genes (KatG, FabL, MtrA, MtrB, OxyR, and VanO-type) in the SRR19520319 Faiyum strain and (Erm C and Tet K) in the SRR19520422 Faiyum strain. Additionally, the whole genome phylogeny analysis proved that all B. abortus strains were related to vaccinated animals and that all B. melitensis strains of Menoufia clustered together and were closely related to those of Gharbia, Dameitta, and Kafr Elshiek. The Bowtie2 tool identified 338 (eight B. abortus) and 4271 (eighteen B. melitensis) single nucleotide polymorphisms (SNPs) along the genomes. These variants were annotated according to type and impact. Moreover, thirty candidate genes were predicted and submitted to GenBank (24 in B. abortus and 6 in B. melitensis). This study contributes significant insights into genetic variation, virulence factors, and vaccine-related associations of Brucella pathogens, enhancing our knowledge of brucellosis epidemiology and evolution in Egypt.
Affiliation(s)
- Alyaa Elrashedy
- Department of Animal Medicine and Infectious Diseases (Infectious Diseases), Faculty of Veterinary Medicine, University of Sadat City, Sadat City, Egypt.
| | - Mohamed Nayel
- Department of Animal Medicine and Infectious Diseases (Infectious Diseases), Faculty of Veterinary Medicine, University of Sadat City, Sadat City, Egypt
| | - Akram Salama
- Department of Animal Medicine and Infectious Diseases (Infectious Diseases), Faculty of Veterinary Medicine, University of Sadat City, Sadat City, Egypt
| | - Ahmed Zaghawa
- Department of Animal Medicine and Infectious Diseases (Infectious Diseases), Faculty of Veterinary Medicine, University of Sadat City, Sadat City, Egypt
| | - Nader R Abdelsalam
- Agricultural Botany Department, Faculty of Agriculture (Saba Basha), Alexandria University, Alexandria, 21531, Egypt
| | - Mohamed E Hasan
- Bioinformatics Department, Genetic Engineering and Biotechnology Research Institute, University of Sadat City, Sadat City, Egypt
| |
|
20
|
Lee RJ, Horton CA, Van Treeck B, McIntyre JJR, Collins K. Conserved and divergent DNA recognition specificities and functions of R2 retrotransposon N-terminal domains. Cell Rep 2024; 43:114239. [PMID: 38753487 DOI: 10.1016/j.celrep.2024.114239] [Received: 01/24/2024] [Revised: 04/04/2024] [Accepted: 05/01/2024] [Indexed: 05/18/2024] Open
Abstract
R2 non-long terminal repeat (non-LTR) retrotransposons are among the most extensively distributed mobile genetic elements in multicellular eukaryotes and show promise for applications in transgene supplementation of the human genome. They insert new gene copies into a conserved site in 28S ribosomal DNA with exquisite specificity. R2 clades are defined by the number of zinc fingers (ZFs) at the N terminus of the retrotransposon-encoded protein, postulated to additively confer DNA site specificity. Here, we illuminate general principles of DNA recognition by R2 N-terminal domains across and between clades, with extensive, specific recognition requiring only one or two compact domains. DNA-binding and protection assays demonstrate broadly shared as well as clade-specific DNA interactions. Gene insertion assays in cells identify the N-terminal domains sufficient for target-site insertion and reveal roles in second-strand cleavage or synthesis for clade-specific ZFs. Our results have implications for understanding evolutionary diversification of non-LTR retrotransposon insertion mechanisms and the design of retrotransposon-based gene therapies.
Affiliation(s)
- Rosa Jooyoung Lee
- Department of Molecular and Cell Biology, University of California at Berkeley, Berkeley, CA 94720, USA
| | - Connor A Horton
- Department of Molecular and Cell Biology, University of California at Berkeley, Berkeley, CA 94720, USA
| | - Briana Van Treeck
- Department of Molecular and Cell Biology, University of California at Berkeley, Berkeley, CA 94720, USA
| | - Jeremy J R McIntyre
- Department of Molecular and Cell Biology, University of California at Berkeley, Berkeley, CA 94720, USA
| | - Kathleen Collins
- Department of Molecular and Cell Biology, University of California at Berkeley, Berkeley, CA 94720, USA.
| |
|
21
|
Chander AM, de Melo Teixeira M, Singh NK, Williams MP, Parker CW, Leo P, Stajich JE, Torok T, Tighe S, Mason CE, Venkateswaran K. Genomic and morphological characterization of Knufia obscura isolated from the Mars 2020 spacecraft assembly facility. Sci Rep 2024; 14:12249. [PMID: 38806503 PMCID: PMC11133487 DOI: 10.1038/s41598-024-61115-1] [Received: 09/21/2023] [Accepted: 05/02/2024] [Indexed: 05/30/2024] Open
Abstract
Members of the family Trichomeriaceae, belonging to the Chaetothyriales order and the Ascomycota phylum, are known for their capability to inhabit hostile environments characterized by extreme temperatures, oligotrophic conditions, drought, or presence of toxic compounds. The genus Knufia encompasses many polyextremophilic species. In this report, the genomic and morphological features of strain FJI-L2-BK-P2, which was isolated from the Mars 2020 mission spacecraft assembly facility located at the Jet Propulsion Laboratory in Pasadena, California, are presented. The identification is based on sequence alignment for marker genes, multi-locus sequence analysis, and whole genome sequence phylogeny. The morphological features were studied using a diverse range of microscopic techniques (bright field, phase contrast, differential interference contrast and scanning electron microscopy). The phylogenetic marker genes of strain FJI-L2-BK-P2 exhibited the highest similarity to the type strain of Knufia obscura (CBS 148926T), which was isolated from the gas tank of a car in Italy. To validate the species identity, whole genomes of both strains (FJI-L2-BK-P2 and CBS 148926T) were sequenced and annotated, and strain FJI-L2-BK-P2 was confirmed as K. obscura. The morphological analysis and description of the genomic characteristics of K. obscura FJI-L2-BK-P2 may contribute to refining the taxonomy of Knufia species. Key morphological features resembling microsclerotia and chlamydospore-like propagules are reported in this K. obscura strain. These features are known to be characteristic of black fungi and could potentially facilitate their adaptation to harsh environments.
Affiliation(s)
- Atul Munish Chander
- Biotechnology and Planetary Protection Group, Jet Propulsion Laboratory, California Institute of Technology, M/S 89-2, 4800 Oak Grove Dr., Pasadena, CA, 91109, USA
| | - Marcus de Melo Teixeira
- Pathogen and Microbiome Institute, Northern Arizona University, Flagstaff, AZ, USA
- School of Medicine, University of Brasilia, Brasília, DF, Brazil
| | - Nitin K Singh
- Biotechnology and Planetary Protection Group, Jet Propulsion Laboratory, California Institute of Technology, M/S 89-2, 4800 Oak Grove Dr., Pasadena, CA, 91109, USA
| | - Michael P Williams
- Biotechnology and Planetary Protection Group, Jet Propulsion Laboratory, California Institute of Technology, M/S 89-2, 4800 Oak Grove Dr., Pasadena, CA, 91109, USA
| | - Ceth W Parker
- Biotechnology and Planetary Protection Group, Jet Propulsion Laboratory, California Institute of Technology, M/S 89-2, 4800 Oak Grove Dr., Pasadena, CA, 91109, USA
| | - Patrick Leo
- Biotechnology and Planetary Protection Group, Jet Propulsion Laboratory, California Institute of Technology, M/S 89-2, 4800 Oak Grove Dr., Pasadena, CA, 91109, USA
| | - Jason E Stajich
- Department of Microbiology and Plant Pathology, University of CA-Riverside, Riverside, CA, USA
| | - Tamas Torok
- Ecology Department, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Scott Tighe
- Vermont Integrative Genomics Lab, University of Vermont, Burlington, VT, USA
| | - Christopher E Mason
- WorldQuant Initiative for Quantitative Prediction, Weill Cornell Medicine, 1305 York Avenue, Room Y-13.15, New York, NY, 10021, USA.
| | - Kasthuri Venkateswaran
- Biotechnology and Planetary Protection Group, Jet Propulsion Laboratory, California Institute of Technology, M/S 89-2, 4800 Oak Grove Dr., Pasadena, CA, 91109, USA.
| |
|
22
|
Rosignoli S, Lustrino E, Conci A, Fabrizi A, Rinaldo S, Latella MC, Enzo E, Prosseda G, De Rosa L, De Luca M, Paiardini A. AlPaCas: allele-specific CRISPR gene editing through a protospacer-adjacent-motif (PAM) approach. Nucleic Acids Res 2024:gkae419. [PMID: 38795068 DOI: 10.1093/nar/gkae419] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/07/2024] [Revised: 04/23/2024] [Accepted: 05/07/2024] [Indexed: 05/27/2024] Open
Abstract
Gene therapy of dominantly inherited genetic diseases requires either the selective disruption of the mutant allele or the editing of the specific mutation. The CRISPR-Cas system holds great potential for the genetic correction of single nucleotide variants (SNVs), including dominant mutations. However, distinguishing between single-nucleotide variations in a pathogenic genomic context remains challenging. The presence of a PAM in the disease-causing allele can guide its precise targeting, preserving the functionality of the wild-type allele. The AlPaCas (Aligning Patients to Cas) webserver is an automated pipeline for sequence-based identification and structural analysis of SNV-derived PAMs that satisfy this demand. When provided with a gene/SNV input, AlPaCas can: (i) identify SNV-derived PAMs; (ii) provide a list of available Cas enzymes recognizing the SNV(s); (iii) propose mutational Cas-engineering to enhance the selectivity towards the SNV-derived PAM. With its ability to identify allele-specific genetic variants that can be targeted using already available or engineered Cas enzymes, AlPaCas is at the forefront of advancements in genome editing. AlPaCas is open to all users without a login requirement and is freely available at https://schubert.bio.uniroma1.it/alpacas.
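The idea of an SNV-derived PAM can be illustrated with a minimal sketch. It is not the AlPaCas pipeline: the function names `pam_sites` and `snv_derived_pams` are hypothetical, the sequences are invented, only the forward strand is scanned, and the default motif is the NGG PAM recognized by SpCas9. The sketch reports positions where a PAM is present in the mutant allele but absent from the wild-type allele.

```python
import re


def pam_sites(seq, pam="NGG"):
    """Return the set of 0-based start positions where `pam` matches `seq`.

    'N' in the motif stands for any of A, C, G, T. A lookahead is used so
    that overlapping matches are all reported.
    """
    pattern = "".join("[ACGT]" if base == "N" else base for base in pam)
    return {m.start() for m in re.finditer(f"(?=({pattern}))", seq)}


def snv_derived_pams(wild_type, mutant, pam="NGG"):
    """Positions where the mutant allele has a PAM that the wild type lacks.

    Illustrative assumption: both alleles are given as equal-length strings
    that differ only by substitutions, and only the forward strand is checked.
    """
    if len(wild_type) != len(mutant):
        raise ValueError("alleles must have the same length")
    return sorted(pam_sites(mutant, pam) - pam_sites(wild_type, pam))


# Invented example: an A>G substitution at index 3 creates a TGG motif.
wild_type = "ACTAGATC"
mutant = "ACTGGATC"
print(snv_derived_pams(wild_type, mutant))
```

A fuller treatment would also scan the reverse complement and support the PAM motifs of other Cas enzymes; both are left out here to keep the sketch short.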
Affiliation(s)
- Serena Rosignoli
- Department of Biochemical Sciences "A. Rossi Fanelli", Sapienza University of Rome, Rome 00185, Italy
| | - Elisa Lustrino
- Department of Biochemical Sciences "A. Rossi Fanelli", Sapienza University of Rome, Rome 00185, Italy
| | - Alessio Conci
- Centre for Regenerative Medicine "Stefano Ferrari", Department of Life Sciences, University of Modena and Reggio Emilia, 41125 Modena, Italy
| | - Alessandra Fabrizi
- Centre for Regenerative Medicine "Stefano Ferrari", Department of Life Sciences, University of Modena and Reggio Emilia, 41125 Modena, Italy
| | - Serena Rinaldo
- Department of Biochemical Sciences "A. Rossi Fanelli", Sapienza University of Rome, Rome 00185, Italy
| | | | - Elena Enzo
- Centre for Regenerative Medicine "Stefano Ferrari", Department of Life Sciences, University of Modena and Reggio Emilia, 41125 Modena, Italy
| | - Gianni Prosseda
- Department of Biology and Biotechnology Charles Darwin, Sapienza University of Rome, Rome 00185, Italy
| | - Laura De Rosa
- Centre for Regenerative Medicine "Stefano Ferrari", Department of Life Sciences, University of Modena and Reggio Emilia, 41125 Modena, Italy
| | - Michele De Luca
- Centre for Regenerative Medicine "Stefano Ferrari", Department of Life Sciences, University of Modena and Reggio Emilia, 41125 Modena, Italy
| | - Alessandro Paiardini
- Department of Biochemical Sciences "A. Rossi Fanelli", Sapienza University of Rome, Rome 00185, Italy
| |
|
23
|
Manriquez-Sandoval E, Brewer J, Lule G, Lopez S, Fried SD. FLiPPR: A Processor for Limited Proteolysis (LiP) Mass Spectrometry Data Sets Built on FragPipe. J Proteome Res 2024. [PMID: 38787630 DOI: 10.1021/acs.jproteome.3c00887] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/26/2024]
Abstract
Here, we present FLiPPR, or FragPipe LiP (limited proteolysis) Processor, a tool that facilitates the analysis of data from limited proteolysis mass spectrometry (LiP-MS) experiments following primary search and quantification in FragPipe. LiP-MS has emerged as a method that can provide proteome-wide information on protein structure and has been applied to a range of biological and biophysical questions. Although LiP-MS can be carried out with standard laboratory reagents and mass spectrometers, analyzing the data can be slow and poses unique challenges compared to typical quantitative proteomics workflows. To address this, we leverage FragPipe and then process its output in FLiPPR. FLiPPR formalizes a specific data imputation heuristic that carefully uses missing data in LiP-MS experiments to report on the most significant structural changes. Moreover, FLiPPR introduces a data merging scheme and a protein-centric multiple hypothesis correction scheme, enabling processed LiP-MS data sets to be more robust and less redundant. These improvements strengthen statistical trends when previously published data are reanalyzed with the FragPipe/FLiPPR workflow. We hope that FLiPPR will lower the barrier for more users to adopt LiP-MS, standardize statistical procedures for LiP-MS data analysis, and systematize output to facilitate eventual larger-scale integration of LiP-MS data.
Affiliation(s)
- Edgar Manriquez-Sandoval
- Department of Chemistry, Johns Hopkins University, Baltimore, Maryland 21218, United States
- T. C. Jenkins Department of Biophysics, Johns Hopkins University, Baltimore, Maryland 21218, United States
| | - Joy Brewer
- Department of Chemistry and Biochemistry, Old Dominion University, Norfolk, Virginia 23529, United States
| | - Gabriela Lule
- Department of Chemistry, Johns Hopkins University, Baltimore, Maryland 21218, United States
| | - Samanta Lopez
- Department of Chemistry, Johns Hopkins University, Baltimore, Maryland 21218, United States
| | - Stephen D Fried
- Department of Chemistry, Johns Hopkins University, Baltimore, Maryland 21218, United States
- T. C. Jenkins Department of Biophysics, Johns Hopkins University, Baltimore, Maryland 21218, United States
| |
|
24
|
Romero Romero ML, Poehls J, Kirilenko A, Richter D, Jumel T, Shevchenko A, Toth-Petroczy A. Environment modulates protein heterogeneity through transcriptional and translational stop codon readthrough. Nat Commun 2024; 15:4446. [PMID: 38789441 PMCID: PMC11126739 DOI: 10.1038/s41467-024-48387-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/22/2023] [Accepted: 04/25/2024] [Indexed: 05/26/2024] Open
Abstract
Stop codon readthrough events give rise to longer proteins, which may alter the protein's function, thereby generating short-lasting phenotypic variability from a single gene. In order to systematically assess the frequency and origin of stop codon readthrough events, we designed a library of reporters. We introduced premature stop codons into mScarlet, which enabled high-throughput quantification of protein synthesis termination errors in E. coli using fluorescent microscopy. We found that under stress conditions, stop codon readthrough may occur at rates as high as 80%, depending on the nucleotide context, suggesting that evolution frequently samples stop codon readthrough events. The analysis of selected reporters by mass spectrometry and RNA-seq showed that not only translation but also transcription errors contribute to stop codon readthrough. The RNA polymerase was more likely to misincorporate a nucleotide at premature stop codons. Proteome-wide detection of stop codon readthrough by mass spectrometry revealed that temperature regulated the expression of cryptic sequences generated by stop codon readthrough in E. coli. Overall, our findings suggest that the environment affects the accuracy of protein production, which increases protein heterogeneity when the organisms need to adapt to new conditions.
Affiliation(s)
- Maria Luisa Romero Romero
- Max Planck Institute of Molecular Cell Biology and Genetics, 01307, Dresden, Germany.
- Center for Systems Biology Dresden, 01307, Dresden, Germany.
| | - Jonas Poehls
- Max Planck Institute of Molecular Cell Biology and Genetics, 01307, Dresden, Germany
- Center for Systems Biology Dresden, 01307, Dresden, Germany
| | - Anastasiia Kirilenko
- Max Planck Institute of Molecular Cell Biology and Genetics, 01307, Dresden, Germany
- Center for Systems Biology Dresden, 01307, Dresden, Germany
| | - Doris Richter
- Max Planck Institute of Molecular Cell Biology and Genetics, 01307, Dresden, Germany
- Center for Systems Biology Dresden, 01307, Dresden, Germany
| | - Tobias Jumel
- Max Planck Institute of Molecular Cell Biology and Genetics, 01307, Dresden, Germany
| | - Anna Shevchenko
- Max Planck Institute of Molecular Cell Biology and Genetics, 01307, Dresden, Germany
| | - Agnes Toth-Petroczy
- Max Planck Institute of Molecular Cell Biology and Genetics, 01307, Dresden, Germany.
- Center for Systems Biology Dresden, 01307, Dresden, Germany.
- Cluster of Excellence Physics of Life, TU Dresden, 01062, Dresden, Germany.
| |
|
25
|
Zhou Y, Myung Y, Rodrigues CHM, Ascher DB. DDMut-PPI: predicting effects of mutations on protein-protein interactions using graph-based deep learning. Nucleic Acids Res 2024:gkae412. [PMID: 38783112 DOI: 10.1093/nar/gkae412] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/08/2024] [Revised: 04/30/2024] [Accepted: 05/02/2024] [Indexed: 05/25/2024] Open
Abstract
Protein-protein interactions (PPIs) play a vital role in cellular functions and are essential for therapeutic development and understanding diseases. However, current predictive tools often struggle to balance efficiency and precision in predicting the effects of mutations on these complex interactions. To address this, we present DDMut-PPI, a deep learning model that efficiently and accurately predicts changes in PPI binding free energy upon single and multiple point mutations. Building on the robust Siamese network architecture with graph-based signatures from our prior work, DDMut, the DDMut-PPI model was enhanced with a graph convolutional network operating on the protein interaction interface. We used residue-specific embeddings from the ProtT5 protein language model as node features, and a variety of molecular interactions as edge features. By integrating evolutionary context with spatial information, this framework enables DDMut-PPI to achieve a robust Pearson correlation of up to 0.75 (root mean squared error: 1.33 kcal/mol) in our evaluations, outperforming most existing methods. Importantly, the model demonstrated consistent performance across mutations that increase or decrease binding affinity. DDMut-PPI offers a significant advancement in the field and will serve as a valuable tool for researchers probing the complexities of protein interactions. DDMut-PPI is freely available as a web server and an application programming interface at https://biosig.lab.uq.edu.au/ddmut_ppi.
Affiliation(s)
- Yunzhuo Zhou
- The Australian Centre for Ecogenomics, School of Chemistry and Molecular Biosciences, University of Queensland, St Lucia, Queensland 4072, Australia
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria 3004, Australia
| | - YooChan Myung
- The Australian Centre for Ecogenomics, School of Chemistry and Molecular Biosciences, University of Queensland, St Lucia, Queensland 4072, Australia
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria 3004, Australia
| | - Carlos H M Rodrigues
- The Australian Centre for Ecogenomics, School of Chemistry and Molecular Biosciences, University of Queensland, St Lucia, Queensland 4072, Australia
| | - David B Ascher
- The Australian Centre for Ecogenomics, School of Chemistry and Molecular Biosciences, University of Queensland, St Lucia, Queensland 4072, Australia
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria 3004, Australia
| |
|
26
|
Felbinger N, Ribeiro-Filho HV, Pierce BG. Proscan: a structure-based proline design web server. Nucleic Acids Res 2024:gkae408. [PMID: 38769060 DOI: 10.1093/nar/gkae408] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/13/2024] [Revised: 04/16/2024] [Accepted: 05/01/2024] [Indexed: 05/22/2024] Open
Abstract
The ability to control protein conformations and dynamics through structure-based design has been useful in various scenarios, including engineering of viral antigens for vaccines. One effective design strategy is the substitution of residues with proline, which, due to its unique cyclic side chain, can favor and rigidify key backbone conformations. To provide the community with a means to readily identify and explore proline designs for target proteins of interest, we developed the Proscan web server. Proscan provides assessment of backbone angles, energetic and deep learning-based favorability scores, and other parameters for proline substitutions at each position of an input structure, along with interactive visualization of backbone angles and candidate substitution sites on structures. It identifies known favorable proline substitutions for viral antigens, and was benchmarked against datasets of proline substitution stability effects from deep mutational scanning and thermodynamic measurements. This tool can enable researchers to identify and prioritize designs for prospective vaccine antigen targets, or other designs to favor stability of key protein conformations. Proscan is available at: https://proscan.ibbr.umd.edu.
Affiliation(s)
- Nathaniel Felbinger
- University of Maryland Institute for Bioscience and Biotechnology Research, Rockville, MD 20850, USA
- Department of Cell Biology and Molecular Genetics, University of Maryland, College Park, MD 20742, USA
| | - Helder V Ribeiro-Filho
- University of Maryland Institute for Bioscience and Biotechnology Research, Rockville, MD 20850, USA
- Brazilian Biosciences National Laboratory, Brazilian Center for Research in Energy and Materials, Campinas 13083-100, Brazil
| | - Brian G Pierce
- University of Maryland Institute for Bioscience and Biotechnology Research, Rockville, MD 20850, USA
- Department of Cell Biology and Molecular Genetics, University of Maryland, College Park, MD 20742, USA
| |
|
27
|
McCullum LB, Karagoz A, Dede C, Garcia R, Nosrat F, Hemmati M, Hosseinian S, Schaefer AJ, Fuller CD. Markov models for clinical decision-making in radiation oncology: A systematic review. J Med Imaging Radiat Oncol 2024. [PMID: 38766899 DOI: 10.1111/1754-9485.13656] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/06/2023] [Accepted: 04/03/2024] [Indexed: 05/22/2024]
Abstract
The intrinsic stochasticity of patients' response to treatment is a major consideration for clinical decision-making in radiation therapy. Markov models are powerful tools to capture this stochasticity and render effective treatment decisions. This paper provides an overview of the Markov models for clinical decision analysis in radiation oncology. A comprehensive literature search was conducted within MEDLINE using PubMed, following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. Only studies published from 2000 to 2023 were considered. Selected publications were summarized in two categories: (i) studies that compare two (or more) fixed treatment policies using Monte Carlo simulation and (ii) studies that seek an optimal treatment policy through Markov Decision Processes (MDPs). Relevant to the scope of this study, 61 publications were selected for detailed review. The majority of these publications (n = 56) focused on comparative analysis of two or more fixed treatment policies using Monte Carlo simulation. Classifications based on cancer site, utility measures and the type of sensitivity analysis are presented. Five publications considered MDPs with the aim of computing an optimal treatment policy; a detailed statement of the analysis and results is provided for each work. As an extension of Markov model-based simulation analysis, MDP offers a flexible framework to identify an optimal treatment policy among a possibly large set of treatment policies. However, the applications of MDPs to oncological decision-making have been understudied, and the full capacity of this framework to render complex optimal treatment decisions warrants further consideration.
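The difference between simulating a fixed policy and solving a Markov Decision Process for an optimal one can be made concrete with a small value-iteration sketch. Everything here is an illustrative assumption rather than material from the review: the three health states, the two actions, the transition probabilities, the per-cycle rewards (standing in for a utility such as quality-adjusted life years) and the discount factor are all invented.

```python
# Hypothetical 3-state treatment MDP; all numbers are illustrative only.
STATES = ["stable", "progressed", "dead"]
ACTIONS = ["observe", "treat"]

# P[action][state] maps each next state to its transition probability.
P = {
    "observe": {
        "stable": {"stable": 0.80, "progressed": 0.15, "dead": 0.05},
        "progressed": {"progressed": 0.70, "dead": 0.30},
        "dead": {"dead": 1.0},
    },
    "treat": {
        "stable": {"stable": 0.90, "progressed": 0.07, "dead": 0.03},
        "progressed": {"stable": 0.20, "progressed": 0.60, "dead": 0.20},
        "dead": {"dead": 1.0},
    },
}

# R[action][state] is the per-cycle reward; treatment carries a small penalty.
R = {
    "observe": {"stable": 1.0, "progressed": 0.5, "dead": 0.0},
    "treat": {"stable": 0.9, "progressed": 0.4, "dead": 0.0},
}


def q_value(state, action, values, gamma):
    """Expected discounted return of taking `action` in `state`."""
    future = sum(p * values[nxt] for nxt, p in P[action][state].items())
    return R[action][state] + gamma * future


def value_iteration(gamma=0.95, tol=1e-9, max_iter=10_000):
    """Return (state values, greedy policy) for the toy MDP above."""
    values = {s: 0.0 for s in STATES}
    for _ in range(max_iter):
        updated = {
            s: max(q_value(s, a, values, gamma) for a in ACTIONS)
            for s in STATES
        }
        delta = max(abs(updated[s] - values[s]) for s in STATES)
        values = updated
        if delta < tol:
            break
    policy = {
        s: max(ACTIONS, key=lambda a: q_value(s, a, values, gamma))
        for s in STATES
    }
    return values, policy


values, policy = value_iteration()
print(values)
print(policy)
```

A fixed-policy comparison, as in the first category of studies, would instead hold `policy` constant and estimate its expected reward by simulation; value iteration searches over all state-dependent policies at once.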
Affiliation(s)
- Lucas B McCullum
- Department of Radiation Oncology, The University of Texas MD Anderson Cancer Center, Houston, Texas, USA
| | - Aysenur Karagoz
- Department of Computational Applied Mathematics & Operations Research, Rice University, Houston, Texas, USA
| | - Cem Dede
- Department of Radiation Oncology, The University of Texas MD Anderson Cancer Center, Houston, Texas, USA
| | - Raul Garcia
- Department of Computational Applied Mathematics & Operations Research, Rice University, Houston, Texas, USA
| | - Fatemeh Nosrat
- Department of Computational Applied Mathematics & Operations Research, Rice University, Houston, Texas, USA
| | - Mehdi Hemmati
- School of Industrial and Systems Engineering, The University of Oklahoma, Norman, Oklahoma, USA
| | | | - Andrew J Schaefer
- Department of Computational Applied Mathematics & Operations Research, Rice University, Houston, Texas, USA
| | - Clifton D Fuller
- Department of Radiation Oncology, The University of Texas MD Anderson Cancer Center, Houston, Texas, USA
- Department of Computational Applied Mathematics & Operations Research, Rice University, Houston, Texas, USA
| |
|
28
|
Eralp B, Sefer E. Reference-free inferring of transcriptomic events in cancer cells on single-cell data. BMC Cancer 2024; 24:607. [PMID: 38769480 PMCID: PMC11107047 DOI: 10.1186/s12885-024-12331-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/18/2023] [Accepted: 05/02/2024] [Indexed: 05/22/2024] Open
Abstract
BACKGROUND The identity of cancerous cells is determined by a mixture of factors such as genomic variations, epigenetics, and the regulatory variations involved in transcription. Differences in transcriptome expression, as well as abnormal peptide structures, determine phenotypic differences. Thus, bulk RNA-seq and more recent single-cell RNA-seq (scRNA-seq) data are important for identifying pathogenic differences. Here, we rely on k-mer decomposition of sequences to identify pathogenic variations in detail; because this approach does not need a reference, it outperforms more traditional Next-Generation Sequencing (NGS) analysis techniques that depend on aligning sequences to a reference. RESULTS Via our alignment-free analysis of esophageal cancer and glioblastoma patients, high-frequency variations could be discovered at multiple different locations (repeats, intergenic regions, exons, introns) and in multiple different forms (fusion, polyadenylation, splicing, etc.). Additionally, we have systematically analyzed the importance of events that receive less attention in a classic transcriptome analysis pipeline, considering them as indicators for tumor prognosis, tumor prediction, and tumor neoantigen inference, and examining their connection to the immune microenvironment. CONCLUSIONS Our results suggest that esophageal cancer (ESCA) and glioblastoma processes can be explained via pathogenic microbial RNA, repeated sequences, novel splicing variants, and long intergenic non-coding RNAs (lincRNAs). We expect our reference-free process and analysis to be helpful in differential scRNA-seq analysis of tumor and normal samples, which in turn offers a more comprehensive scheme for major cancer-associated events.
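The reference-free idea behind k-mer decomposition can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' pipeline: the function names `kmer_counts` and `sample_specific_kmers`, the `min_count` threshold and the toy reads are all hypothetical. Reads from two samples are broken into overlapping k-mers, and k-mers that recur in one sample but never appear in the other are reported without aligning anything to a reference.

```python
from collections import Counter


def kmer_counts(reads, k):
    """Count every overlapping substring of length k across all reads."""
    counts = Counter()
    for read in reads:
        # Reads shorter than k contribute nothing (the range is empty).
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def sample_specific_kmers(case_reads, control_reads, k, min_count=2):
    """k-mers seen at least `min_count` times in the case sample and
    never in the control sample. No reference sequence is involved."""
    case = kmer_counts(case_reads, k)
    control = kmer_counts(control_reads, k)
    return {
        kmer: n
        for kmer, n in case.items()
        if n >= min_count and kmer not in control
    }


# Invented toy reads: the case sample carries a variant ending in "GTT".
case_reads = ["ACGTT", "ACGTT"]
control_reads = ["ACGAA"]
print(sample_specific_kmers(case_reads, control_reads, k=3))
```

Real data would need much longer k-mers, handling of reverse complements and sequencing errors, and a statistical test instead of a fixed count threshold.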
Affiliation(s)
- Batuhan Eralp
- Department of Computer Science, Ozyegin University, Istanbul, Turkey
| | - Emre Sefer
- Department of Computer Science, Ozyegin University, Istanbul, Turkey.
| |
|
29
|
Teo QW, Wang Y, Lv H, Mao KJ, Tan TJC, Huan YW, Rivera-Cardona J, Shao EK, Choi D, Dargani ZT, Brooke CB, Wu NC. Deep mutational scanning of influenza A virus NEP reveals pleiotropic mutations in its N-terminal domain. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.05.16.594574. [PMID: 38798526 PMCID: PMC11118461 DOI: 10.1101/2024.05.16.594574] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/29/2024]
Abstract
The influenza A virus nuclear export protein (NEP) is a multifunctional protein that is essential for the viral life cycle and has very high sequence conservation. However, since the open reading frame of NEP largely overlaps with that of another influenza viral protein, non-structural protein 1, it is difficult to infer the functional constraints of NEP based on sequence conservation analysis. In addition, the N-terminus of NEP is structurally disordered, which further complicates the understanding of its function. Here, we systematically measured the replication fitness effects of >1,800 mutations of NEP. Our results show that the N-terminal domain has high mutational tolerance. Additional experiments demonstrate that N-terminal domain mutations pleiotropically affect viral transcription and replication dynamics, host cellular responses, and mammalian adaptation of avian influenza virus. Overall, our study not only advances the functional understanding of NEP, but also provides insights into its evolutionary constraints.
|
30
|
Stephens Z, Kocher JP. Characterization of telomere variant repeats using long reads enables allele-specific telomere length estimation. BMC Bioinformatics 2024; 25:194. [PMID: 38755561 PMCID: PMC11100205 DOI: 10.1186/s12859-024-05807-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/27/2023] [Accepted: 05/09/2024] [Indexed: 05/18/2024] Open
Abstract
Telomeres are regions of repetitive DNA at the ends of linear chromosomes which protect chromosome ends from degradation. Telomere lengths have been extensively studied in the context of aging and disease, though most studies use average telomere lengths which are of limited utility. We present a method for identifying all 92 telomere alleles from long read sequencing data. Individual telomeres are identified using variant repeats proximal to telomere regions, which are unique across alleles. This high-throughput and high-resolution characterization of telomeres could be foundational to future studies investigating the roles of specific telomeres in aging and disease.
|
31
|
Chao KH, Heinz JM, Hoh C, Mao A, Shumate A, Pertea M, Salzberg SL. Combining DNA and protein alignments to improve genome annotation with LiftOn. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.05.16.593026. [PMID: 38798552 PMCID: PMC11118573 DOI: 10.1101/2024.05.16.593026] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/29/2024]
Abstract
As the number and variety of assembled genomes continue to grow, the number of annotated genomes is falling behind, particularly for eukaryotes. DNA-based mapping tools help to address this challenge, but they are only able to transfer annotation between closely related species. Here we introduce LiftOn, a homology-based software tool that integrates DNA and protein alignments to enhance the accuracy of genome-scale annotation and to allow mapping between relatively distant species. LiftOn's protein-centric algorithm considers both types of alignments, chooses optimal open reading frames, resolves overlapping gene loci, and finds additional gene copies where they exist. LiftOn can reliably transfer annotation between genomes representing members of the same species, as we demonstrate on human, mouse, honey bee, rice, and Arabidopsis thaliana. It can further map annotation effectively across species pairs as far apart as mouse and rat or Drosophila melanogaster and D. erecta.
Affiliation(s)
- Kuan-Hao Chao
- Department of Computer Science, Johns Hopkins University, Baltimore, MD 21218, USA
- Center for Computational Biology, Johns Hopkins University, Baltimore, MD 21218, USA
| | - Jakob M. Heinz
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA 02115, USA
| | - Celine Hoh
- Department of Computer Science, Johns Hopkins University, Baltimore, MD 21218, USA
- Center for Computational Biology, Johns Hopkins University, Baltimore, MD 21218, USA
| | - Alan Mao
- Department of Computer Science, Johns Hopkins University, Baltimore, MD 21218, USA
- Center for Computational Biology, Johns Hopkins University, Baltimore, MD 21218, USA
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD 21218, USA
| | - Alaina Shumate
- Center for Computational Biology, Johns Hopkins University, Baltimore, MD 21218, USA
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD 21218, USA
| | - Mihaela Pertea
- Department of Computer Science, Johns Hopkins University, Baltimore, MD 21218, USA
- Center for Computational Biology, Johns Hopkins University, Baltimore, MD 21218, USA
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD 21218, USA
| | - Steven L Salzberg
- Department of Computer Science, Johns Hopkins University, Baltimore, MD 21218, USA
- Center for Computational Biology, Johns Hopkins University, Baltimore, MD 21218, USA
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD 21218, USA
- Department of Biostatistics, Johns Hopkins University, Baltimore, MD 21211, USA
| |
|
32
|
Puszkarska AM, Taddese B, Revell J, Davies G, Field J, Hornigold DC, Buchanan A, Vaughan TJ, Colwell LJ. Machine learning designs new GCGR/GLP-1R dual agonists with enhanced biological potency. Nat Chem 2024:10.1038/s41557-024-01532-x. [PMID: 38755312 DOI: 10.1038/s41557-024-01532-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/26/2021] [Accepted: 04/08/2024] [Indexed: 05/18/2024]
Abstract
Several peptide dual agonists of the human glucagon receptor (GCGR) and the glucagon-like peptide-1 receptor (GLP-1R) are in development for the treatment of type 2 diabetes, obesity and their associated complications. Candidates must have high potency at both receptors, but it is unclear whether the limited experimental data available can be used to train models that accurately predict the activity at both receptors of new peptide variants. Here we use peptide sequence data labelled with in vitro potency at human GCGR and GLP-1R to train several models, including a deep multi-task neural-network model using multiple loss optimization. Model-guided sequence optimization was used to design three groups of peptide variants, with distinct ranges of predicted dual activity. We found that three of the model-designed sequences are potent dual agonists with superior biological activity. With our designs we were able to achieve up to sevenfold potency improvement at both receptors simultaneously compared to the best dual-agonist in the training set.
Affiliation(s)
- Anna M Puszkarska
- Yusuf Hamied Department of Chemistry, University of Cambridge, Cambridge, UK
- Biologics Engineering, Oncology R&D, AstraZeneca, Cambridge, UK
| | - Bruck Taddese
- Discovery Sciences, R&D, AstraZeneca, Cambridge, UK
- Biologics Center (NBC) at the Novartis Institute for BioMedical Research (NIBR), Basel, Switzerland
| | | | - Graeme Davies
- Research and Early Development, Cardiovascular, Renal and Metabolism (CVRM), BioPharmaceuticals R&D, AstraZeneca, Cambridge, UK
| | - Joss Field
- Research and Early Development, Cardiovascular, Renal and Metabolism (CVRM), BioPharmaceuticals R&D, AstraZeneca, Cambridge, UK
| | - David C Hornigold
- Research and Early Development, Cardiovascular, Renal and Metabolism (CVRM), BioPharmaceuticals R&D, AstraZeneca, Cambridge, UK
| | - Andrew Buchanan
- Biologics Engineering, Oncology R&D, AstraZeneca, Cambridge, UK
| | - Tristan J Vaughan
- Biologics Engineering, Oncology R&D, AstraZeneca, Cambridge, UK
- Immunocore Ltd., Abingdon, UK
| | - Lucy J Colwell
- Yusuf Hamied Department of Chemistry, University of Cambridge, Cambridge, UK.
- Google DeepMind, Cambridge, MA, USA.
| |
|
33
|
Cory MB, Li A, Hurley CM, Carman PJ, Pumroy RA, Hostetler ZM, Perez RM, Venkatesh Y, Li X, Gupta K, Petersson EJ, Kohli RM. The LexA-RecA* structure reveals a cryptic lock-and-key mechanism for SOS activation. Nat Struct Mol Biol 2024:10.1038/s41594-024-01317-3. [PMID: 38755298 DOI: 10.1038/s41594-024-01317-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/14/2023] [Accepted: 04/15/2024] [Indexed: 05/18/2024]
Abstract
The bacterial SOS response plays a key role in adaptation to DNA damage, including genomic stress caused by antibiotics. SOS induction begins when activated RecA*, an oligomeric nucleoprotein filament that forms on single-stranded DNA, binds to and stimulates autoproteolysis of the repressor LexA. Here, we present the structure of the complete Escherichia coli SOS signal complex, constituting full-length LexA bound to RecA*. We uncover an extensive interface unexpectedly including the LexA DNA-binding domain, providing a new molecular rationale for ordered SOS gene induction. We further find that the interface involves three RecA subunits, with a single residue in the central engaged subunit acting as a molecular key, inserting into an allosteric binding pocket to induce LexA cleavage. Given the pro-mutagenic nature of SOS activation, our structural and mechanistic insights provide a foundation for developing new therapeutics to slow the evolution of antibiotic resistance.
Affiliation(s)
- Michael B Cory
  - Graduate Group in Biochemistry and Biophysics, University of Pennsylvania, Philadelphia, PA, USA
- Allen Li
  - Department of Chemistry, University of Pennsylvania, Philadelphia, PA, USA
- Christina M Hurley
  - Graduate Group in Biochemistry and Biophysics, University of Pennsylvania, Philadelphia, PA, USA
- Peter J Carman
  - Graduate Group in Biochemistry and Biophysics, University of Pennsylvania, Philadelphia, PA, USA
- Ruth A Pumroy
  - Department of Biochemistry and Biophysics, University of Pennsylvania, Philadelphia, PA, USA
- Ryann M Perez
  - Department of Chemistry, University of Pennsylvania, Philadelphia, PA, USA
- Yarra Venkatesh
  - Department of Chemistry, University of Pennsylvania, Philadelphia, PA, USA
- Xinning Li
  - Department of Chemistry, University of Pennsylvania, Philadelphia, PA, USA
- Kushol Gupta
  - Department of Biochemistry and Biophysics, University of Pennsylvania, Philadelphia, PA, USA
- E James Petersson
  - Department of Chemistry, University of Pennsylvania, Philadelphia, PA, USA.
  - Department of Biochemistry and Biophysics, University of Pennsylvania, Philadelphia, PA, USA.
- Rahul M Kohli
  - Department of Biochemistry and Biophysics, University of Pennsylvania, Philadelphia, PA, USA.
  - Department of Medicine, University of Pennsylvania, Philadelphia, PA, USA.

34
Lee YW, Weissbein U, Blum R, Lee JT. G-quadruplex folding in Xist RNA antagonizes PRC2 activity for stepwise regulation of X chromosome inactivation. Mol Cell 2024; 84:1870-1885.e9. [PMID: 38759625 DOI: 10.1016/j.molcel.2024.04.015] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/31/2023] [Revised: 11/25/2023] [Accepted: 04/19/2024] [Indexed: 05/19/2024]
Abstract
How Polycomb repressive complex 2 (PRC2) is regulated by RNA remains an unsolved problem. Although PRC2 binds G-tracts with the potential to form RNA G-quadruplexes (rG4s), whether rG4s fold extensively in vivo and whether PRC2 binds folded or unfolded rG4 are unknown. Using the X-inactivation model in mouse embryonic stem cells, here we identify multiple folded rG4s in Xist RNA and demonstrate that PRC2 preferentially binds folded rG4s. High-affinity rG4 binding inhibits PRC2's histone methyltransferase activity, and stabilizing rG4 in vivo antagonizes enrichment of trimethylated H3 at lysine 27 (H3K27me3) on the inactive X chromosome. Surprisingly, mutagenizing the rG4 does not affect PRC2 recruitment but promotes its release and catalytic activation on chromatin. H3K27me3 marks are misplaced, however, and gene silencing is compromised. Xist-PRC2 complexes become entrapped in the S1 chromosome compartment, precluding the required translocation into the S2 compartment. Thus, Xist rG4 folding controls PRC2 activity, H3K27me3 enrichment, and the stepwise regulation of chromosome-wide gene silencing.
Affiliation(s)
- Yong Woo Lee
  - Department of Molecular Biology, Massachusetts General Hospital and Department of Genetics, Harvard Medical School, Boston, MA 02114, USA
- Uri Weissbein
  - Department of Molecular Biology, Massachusetts General Hospital and Department of Genetics, Harvard Medical School, Boston, MA 02114, USA
- Roy Blum
  - Department of Molecular Biology, Massachusetts General Hospital and Department of Genetics, Harvard Medical School, Boston, MA 02114, USA
- Jeannie T Lee
  - Department of Molecular Biology, Massachusetts General Hospital and Department of Genetics, Harvard Medical School, Boston, MA 02114, USA.

35
Irby I, Broddrick JT. Microbial adaptation to spaceflight is correlated with bacteriophage-encoded functions. Nat Commun 2024; 15:3474. [PMID: 38750067 PMCID: PMC11096397 DOI: 10.1038/s41467-023-42104-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/25/2023] [Accepted: 09/27/2023] [Indexed: 05/18/2024] Open
Abstract
Evidence from the International Space Station suggests microbial populations are rapidly adapting to the spacecraft environment; however, the mechanism of this adaptation is not understood. Bacteriophages are prolific mediators of bacterial adaptation on Earth. Here we survey 245 genomes sequenced from bacterial strains isolated on the International Space Station for dormant (lysogenic) bacteriophages. Our analysis indicates that phage-associated genes differ significantly between spaceflight strains and their terrestrial counterparts. In addition, we identify 283 complete prophages, those that could initiate bacterial lysis and infect additional hosts, of which 21% are novel. These prophage regions encode functions that correlate with increased persistence in extreme environments such as spaceflight, including antimicrobial resistance and virulence, DNA damage repair, and dormancy. Our results correlate microbial adaptation in spaceflight to bacteriophage-encoded functions that may impact human health in spaceflight.
Affiliation(s)
- Iris Irby
  - Space Biosciences Research Branch, NASA Ames Research Center, Moffett Field, CA, USA
  - School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA, USA
- Jared T Broddrick
  - Space Biosciences Research Branch, NASA Ames Research Center, Moffett Field, CA, USA.

36
Vázquez-González L, Regueira-Iglesias A, Balsa-Castro C, Vila-Blanco N, Tomás I, Carreira MJ. PrimerEvalPy: a tool for in-silico evaluation of primers for targeting the microbiome. BMC Bioinformatics 2024; 25:189. [PMID: 38745271 PMCID: PMC11092261 DOI: 10.1186/s12859-024-05805-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/29/2023] [Accepted: 05/08/2024] [Indexed: 05/16/2024] Open
Abstract
BACKGROUND: The selection of primer pairs in sequencing-based research can greatly influence the results, highlighting the need for a tool capable of analysing their performance in-silico prior to the sequencing process. We therefore propose PrimerEvalPy, a Python-based package designed to test the performance of any primer or primer pair against any sequencing database. The package calculates a coverage metric and returns the amplicon sequences found, along with information such as their average start and end positions. It also allows the analysis of coverage for different taxonomic levels.
RESULTS: As a case study, PrimerEvalPy was used to test the most commonly used primers in the literature against two oral 16S rRNA gene databases containing bacteria and archaea. The results showed that the most commonly used primer pairs in the oral cavity did not match those with the highest coverage. The best performing primer pairs were found for the detection of oral bacteria and archaea.
CONCLUSIONS: This demonstrates the importance of a coverage analysis tool such as PrimerEvalPy to find the best primer pairs for specific niches. The software is available under the MIT licence at https://gitlab.citius.usc.es/lara.vazquez/PrimerEvalPy.
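To make the idea of a primer coverage metric concrete, the following is a minimal Python sketch. It is not PrimerEvalPy's API or implementation: the function names (`primer_regex`, `find_amplicon`, `coverage`), the exact-match rule with IUPAC degenerate bases, and the definition of coverage as the fraction of database sequences yielding an amplicon are all assumptions made for illustration.

```python
import re

# IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
# Complement table that also handles degenerate codes (e.g. R <-> Y).
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(seq):
    """Reverse-complement a (possibly degenerate) DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def primer_regex(primer):
    """Turn a degenerate primer into a compiled regular expression."""
    return re.compile("".join(IUPAC[base] for base in primer.upper()))


def find_amplicon(seq, forward, reverse):
    """Return the amplicon (primers included), or None if the pair does not match."""
    fwd = primer_regex(forward).search(seq)
    if fwd is None:
        return None
    # The reverse primer anneals to the opposite strand, so look for its
    # reverse complement downstream of the forward primer site.
    rev = primer_regex(reverse_complement(reverse)).search(seq, fwd.end())
    if rev is None:
        return None
    return seq[fwd.start():rev.end()]


def coverage(database, forward, reverse):
    """Fraction of database sequences amplified by the pair, plus the amplicons."""
    hits = [find_amplicon(s.upper(), forward, reverse) for s in database.values()]
    amplicons = [a for a in hits if a is not None]
    return len(amplicons) / len(database), amplicons
```

Here `database` is assumed to be a plain dictionary mapping sequence identifiers to sequences; a real tool would additionally need to handle mismatches, both strand orientations, and per-taxon summaries, none of which this sketch attempts.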
Affiliation(s)
- Lara Vázquez-González
  - Centro Singular de Investigación en Tecnoloxías Intelixentes (CiTIUS), Universidade de Santiago de Compostela, Rúa de Jenaro de la Fuente Domínguez, E15782, Santiago de Compostela, Spain.
  - Instituto de Investigación Sanitaria de Santiago de Compostela (IDIS), E15706, Santiago de Compostela, Spain.
- Alba Regueira-Iglesias
  - Oral Sciences Research Group, Special Needs Unit, Department of Surgery and Medical Surgical Specialities, School of Medicine and Dentistry, Universidade de Santiago de Compostela, E15782, Santiago de Compostela, Spain
  - Instituto de Investigación Sanitaria de Santiago de Compostela (IDIS), E15706, Santiago de Compostela, Spain
- Carlos Balsa-Castro
  - Centro Singular de Investigación en Tecnoloxías Intelixentes (CiTIUS), Universidade de Santiago de Compostela, Rúa de Jenaro de la Fuente Domínguez, E15782, Santiago de Compostela, Spain
  - Oral Sciences Research Group, Special Needs Unit, Department of Surgery and Medical Surgical Specialities, School of Medicine and Dentistry, Universidade de Santiago de Compostela, E15782, Santiago de Compostela, Spain
  - Instituto de Investigación Sanitaria de Santiago de Compostela (IDIS), E15706, Santiago de Compostela, Spain
- Nicolás Vila-Blanco
  - Centro Singular de Investigación en Tecnoloxías Intelixentes (CiTIUS), Universidade de Santiago de Compostela, Rúa de Jenaro de la Fuente Domínguez, E15782, Santiago de Compostela, Spain
  - Departamento de Electrónica e Computación, Escola Técnica Superior de Enxeñaría, Universidade de Santiago de Compostela, E15782, Santiago de Compostela, Spain
  - Instituto de Investigación Sanitaria de Santiago de Compostela (IDIS), E15706, Santiago de Compostela, Spain
- Inmaculada Tomás
  - Centro Singular de Investigación en Tecnoloxías Intelixentes (CiTIUS), Universidade de Santiago de Compostela, Rúa de Jenaro de la Fuente Domínguez, E15782, Santiago de Compostela, Spain.
  - Oral Sciences Research Group, Special Needs Unit, Department of Surgery and Medical Surgical Specialities, School of Medicine and Dentistry, Universidade de Santiago de Compostela, E15782, Santiago de Compostela, Spain.
  - Instituto de Investigación Sanitaria de Santiago de Compostela (IDIS), E15706, Santiago de Compostela, Spain.
- María J Carreira
  - Centro Singular de Investigación en Tecnoloxías Intelixentes (CiTIUS), Universidade de Santiago de Compostela, Rúa de Jenaro de la Fuente Domínguez, E15782, Santiago de Compostela, Spain.
  - Departamento de Electrónica e Computación, Escola Técnica Superior de Enxeñaría, Universidade de Santiago de Compostela, E15782, Santiago de Compostela, Spain.
  - Instituto de Investigación Sanitaria de Santiago de Compostela (IDIS), E15706, Santiago de Compostela, Spain.

37
Minkin I, Salzberg SL. Conservation assessment of human splice site annotation based on a 470-genome alignment. bioRxiv 2024:2023.12.01.569581. [PMID: 38076842 PMCID: PMC10705407 DOI: 10.1101/2023.12.01.569581] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/17/2023]
Abstract
Despite many improvements over the years, the annotation of the human genome remains imperfect, and different annotations of the human reference genome sometimes contradict one another. The use of evolutionarily conserved sequences provides a strategy for selecting a high-confidence subset of the annotation that is more likely to be related to biological functions, and the rapidly growing number of genomes from other species increases its power. Using the latest whole genome alignment, we found that splice sites from protein-coding genes in the high-quality MANE annotation are consistently conserved across more than 400 species. We also studied splice sites from the RefSeq, GENCODE, and CHESS databases that are not present in MANE. We trained a logistic regression classifier to distinguish between the conservation exhibited by sites from MANE versus sites chosen randomly from neutrally evolving sequence. We found that splice sites classified by our model as conserved have lower SNP rates and better transcriptomic support. We then computed a subset of transcripts only using either "conserved" splice sites or ones from MANE. This subset is enriched in high-confidence transcripts of the major gene catalogs that appear to be under purifying selection and are more likely to be correct and functionally relevant.
Affiliation(s)
- Ilia Minkin
  - Department of Biomedical Engineering, Center for Computational Biology, Johns Hopkins University, Baltimore, MD 21211, USA
- Steven L Salzberg
  - Department of Biomedical Engineering, Center for Computational Biology, Department of Computer Science, Department of Biostatistics, Johns Hopkins University, Baltimore, MD 21211, USA

38
Lei R, Qing E, Odle A, Yuan M, Gunawardene CD, Tan TJC, So N, Ouyang WO, Wilson IA, Gallagher T, Perlman S, Wu NC, Wong LYR. Functional and antigenic characterization of SARS-CoV-2 spike fusion peptide by deep mutational scanning. Nat Commun 2024; 15:4056. [PMID: 38744813 PMCID: PMC11094058 DOI: 10.1038/s41467-024-48104-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/05/2023] [Accepted: 04/16/2024] [Indexed: 05/16/2024] Open
Abstract
The fusion peptide of SARS-CoV-2 spike protein is functionally important for membrane fusion during virus entry and is part of a broadly neutralizing epitope. However, sequence determinants at the fusion peptide and its adjacent regions for pathogenicity and antigenicity remain elusive. In this study, we perform a series of deep mutational scanning (DMS) experiments on an S2 region spanning the fusion peptide of authentic SARS-CoV-2 in different cell lines and in the presence of broadly neutralizing antibodies. We identify mutations at residue 813 of the spike protein that reduced TMPRSS2-mediated entry with decreased virulence. In addition, we show that an F823Y mutation, present in bat betacoronavirus HKU9 spike protein, confers resistance to broadly neutralizing antibodies. Our findings provide mechanistic insights into SARS-CoV-2 pathogenicity and also highlight a potential challenge in developing broadly protective S2-based coronavirus vaccines.
Affiliation(s)
- Ruipeng Lei
  - Department of Biochemistry, University of Illinois at Urbana-Champaign, Urbana, IL, 61801, USA
- Enya Qing
  - Department of Microbiology and Immunology, Loyola University Chicago, Maywood, IL, 60153, USA
- Abby Odle
  - Department of Microbiology and Immunology, University of Iowa, Iowa City, IA, 52242, USA
- Meng Yuan
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, 92037, USA
- Chaminda D Gunawardene
  - Center for Virus-Host Innate Immunity, Rutgers New Jersey Medical School, Newark, NJ, 07103, USA
  - Department of Microbiology, Biochemistry and Molecular Genetics, Rutgers New Jersey Medical School, Newark, NJ, 07103, USA
- Timothy J C Tan
  - Center for Biophysics and Quantitative Biology, University of Illinois at Urbana-Champaign, Urbana, IL, 61801, USA
- Natalie So
  - Department of Biochemistry, University of Illinois at Urbana-Champaign, Urbana, IL, 61801, USA
  - Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL, 61801, USA
- Wenhao O Ouyang
  - Department of Biochemistry, University of Illinois at Urbana-Champaign, Urbana, IL, 61801, USA
- Ian A Wilson
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, 92037, USA
  - The Skaggs Institute for Chemical Biology, The Scripps Research Institute, La Jolla, CA, 92037, USA
- Tom Gallagher
  - Department of Microbiology and Immunology, Loyola University Chicago, Maywood, IL, 60153, USA.
- Stanley Perlman
  - Department of Microbiology and Immunology, University of Iowa, Iowa City, IA, 52242, USA.
  - Department of Pediatrics, University of Iowa, Iowa City, IA, 52242, USA.
- Nicholas C Wu
  - Department of Biochemistry, University of Illinois at Urbana-Champaign, Urbana, IL, 61801, USA.
  - Center for Biophysics and Quantitative Biology, University of Illinois at Urbana-Champaign, Urbana, IL, 61801, USA.
  - Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL, 61801, USA.
  - Carle Illinois College of Medicine, University of Illinois at Urbana-Champaign, Urbana, IL, 61801, USA.
- Lok-Yin Roy Wong
  - Department of Microbiology and Immunology, University of Iowa, Iowa City, IA, 52242, USA.
  - Center for Virus-Host Innate Immunity, Rutgers New Jersey Medical School, Newark, NJ, 07103, USA.
  - Department of Microbiology, Biochemistry and Molecular Genetics, Rutgers New Jersey Medical School, Newark, NJ, 07103, USA.

39
Cao MY, Zainudin S, Daud KM. Protein features fusion using attributed network embedding for predicting protein-protein interaction. BMC Genomics 2024; 25:466. [PMID: 38741045 DOI: 10.1186/s12864-024-10361-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/10/2024] [Accepted: 04/29/2024] [Indexed: 05/16/2024] Open
Abstract
BACKGROUND: Protein-protein interactions (PPIs) hold significant importance in biology, with precise PPI prediction as a pivotal factor in comprehending cellular processes and facilitating drug design. However, experimental determination of PPIs is laborious, time-consuming, and often constrained by technical limitations.
METHODS: We introduce a new node representation method based on initial information fusion, called FFANE, which amalgamates PPI networks and protein sequence data to enhance the precision of PPI prediction. A Gaussian kernel similarity matrix is first established from protein structural resemblances. Concurrently, protein sequence similarities are gauged using the Levenshtein distance, enabling the capture of diverse protein attributes. These two feature matrices are then merged by weighted fusion into an initial information matrix that combines structural and sequence details. To learn from the fused features, a Stacked Autoencoder (SAE) encodes them into more representative features. Finally, classification models are trained on the learned fusion features to predict PPIs.
RESULTS: In 5-fold cross-validation experiments with an SVM, our proposed method achieved average accuracies of 94.28%, 97.69%, and 84.05% on the Saccharomyces cerevisiae, Homo sapiens, and Helicobacter pylori datasets, respectively.
CONCLUSION: Experimental findings across various authentic datasets validate the efficacy and superiority of this fusion feature representation approach, underscoring its potential value in bioinformatics.
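The feature-fusion step described in this abstract can be sketched in a few lines of Python. This is an illustrative reading only, not the authors' FFANE code: the length-based normalisation of the Levenshtein distance, the kernel width `gamma`, the fusion weight `alpha`, and the use of per-protein numeric rows as kernel input are all assumptions, and the Stacked Autoencoder and classifier stages are left out.

```python
import math


def levenshtein(a, b):
    """Edit distance between two strings (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete a character of a
                            curr[j - 1] + 1,      # insert a character of b
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def sequence_similarity(seqs):
    """Assumed normalisation: 1 - distance / length of the longer sequence."""
    n = len(seqs)
    return [[1.0 - levenshtein(seqs[i], seqs[j]) / max(len(seqs[i]), len(seqs[j]), 1)
             for j in range(n)] for i in range(n)]


def gaussian_kernel(profiles, gamma=1.0):
    """Gaussian kernel exp(-gamma * ||x_i - x_j||^2) over per-protein feature rows."""
    def sqdist(x, y):
        return sum((p - q) ** 2 for p, q in zip(x, y))

    n = len(profiles)
    return [[math.exp(-gamma * sqdist(profiles[i], profiles[j])) for j in range(n)]
            for i in range(n)]


def weighted_fusion(k_struct, k_seq, alpha=0.5):
    """Convex combination of the two similarity matrices, entry by entry."""
    return [[alpha * a + (1.0 - alpha) * b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(k_struct, k_seq)]
```

With `alpha=0.5` both similarity sources contribute equally; in practice the weight would presumably be tuned on held-out data before the fused matrix is passed to an encoder.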
Affiliation(s)
- Mei-Yuan Cao
  - Center for Artificial Intelligence Technology (CAIT), Faculty of Information Science and Technology, Universiti Kebangsaan Malaysia, Bangi, 43600, Selangor, Malaysia.
- Suhaila Zainudin
  - Center for Artificial Intelligence Technology (CAIT), Faculty of Information Science and Technology, Universiti Kebangsaan Malaysia, Bangi, 43600, Selangor, Malaysia
- Kauthar Mohd Daud
  - Center for Artificial Intelligence Technology (CAIT), Faculty of Information Science and Technology, Universiti Kebangsaan Malaysia, Bangi, 43600, Selangor, Malaysia

40
Břinda K, Lima L, Pignotti S, Quinones-Olvera N, Salikhov K, Chikhi R, Kucherov G, Iqbal Z, Baym M. Efficient and Robust Search of Microbial Genomes via Phylogenetic Compression. bioRxiv 2024:2023.04.15.536996. [PMID: 37131636 PMCID: PMC10153118 DOI: 10.1101/2023.04.15.536996] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/04/2023]
Abstract
Comprehensive collections approaching millions of sequenced genomes have become central information sources in the life sciences. However, the rapid growth of these collections has made it effectively impossible to search these data using tools such as BLAST and its successors. Here, we present a technique called phylogenetic compression, which uses evolutionary history to guide compression and efficiently search large collections of microbial genomes using existing algorithms and data structures. We show that, when applied to modern diverse collections approaching millions of genomes, lossless phylogenetic compression improves the compression ratios of assemblies, de Bruijn graphs, and k-mer indexes by one to two orders of magnitude. Additionally, we develop a pipeline for a BLAST-like search over these phylogeny-compressed reference data, and demonstrate it can align genes, plasmids, or entire sequencing experiments against all sequenced bacteria until 2019 on ordinary desktop computers within a few hours. Phylogenetic compression has broad applications in computational biology and may provide a fundamental design principle for future genomics infrastructure.

41
Zou Y, Zhang Z, Zeng Y, Hu H, Hao Y, Huang S, Li B. Common Methods for Phylogenetic Tree Construction and Their Implementation in R. Bioengineering (Basel) 2024; 11:480. [PMID: 38790347 PMCID: PMC11117635 DOI: 10.3390/bioengineering11050480] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/04/2024] [Revised: 05/04/2024] [Accepted: 05/07/2024] [Indexed: 05/26/2024] Open
Abstract
Phylogenetic trees reflect the evolutionary relationships between species or gene families and play a critical role in modern biological research. In this review, we summarize common methods for constructing phylogenetic trees, including distance methods, maximum parsimony, maximum likelihood, Bayesian inference, and tree-integration methods (supermatrix and supertree). We discuss the advantages, shortcomings, and applications of each method and offer relevant code to construct phylogenetic trees from molecular data using packages and algorithms in R. This review aims to provide comprehensive guidance and reference for researchers seeking to construct phylogenetic trees while also promoting further development and innovation in this field. By offering a clear and concise overview of the different methods available, we hope to enable researchers to select the most appropriate approach for their specific research questions and datasets.
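As a small illustration of the distance-method family mentioned in this abstract, the sketch below computes pairwise distances from aligned sequences and clusters them with UPGMA. The review itself supplies R code; this is an independent Python sketch, not taken from the paper, and the helper names (`p_distance`, `jukes_cantor`, `upgma`), the gap handling, and the nested-tuple tree representation are assumptions chosen for brevity.

```python
import math
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences (gapped sites skipped)."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    return sum(x != y for x, y in pairs) / len(pairs)


def jukes_cantor(p):
    """Jukes-Cantor correction for multiple substitutions at the same site."""
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def upgma(names, dist):
    """Average-linkage clustering; returns the tree topology as nested tuples.

    `dist` maps frozenset({name_i, name_j}) to a pairwise distance.
    """
    clusters = {name: 1 for name in names}  # cluster label -> number of leaves
    dist = dict(dist)                       # work on a copy
    while len(clusters) > 1:
        # Join the two closest clusters.
        a, b = min(combinations(clusters, 2), key=lambda pair: dist[frozenset(pair)])
        merged = (a, b)
        size_a, size_b = clusters.pop(a), clusters.pop(b)
        for other in clusters:
            # Size-weighted mean keeps the value an average over all leaf pairs.
            dist[frozenset((merged, other))] = (
                size_a * dist[frozenset((a, other))]
                + size_b * dist[frozenset((b, other))]
            ) / (size_a + size_b)
        clusters[merged] = size_a + size_b
    return next(iter(clusters))
```

UPGMA assumes a roughly constant evolutionary rate across lineages; the sketch returns topology only and does not compute branch lengths.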
Affiliation(s)
- Yue Zou
  - College of Life Sciences, Chongqing Normal University, Chongqing 401331, China
- Zixuan Zhang
  - College of Life Sciences, Chongqing Normal University, Chongqing 401331, China
- Yujie Zeng
  - College of Life Sciences, Chongqing Normal University, Chongqing 401331, China
- Hanyue Hu
  - College of Life Sciences, Chongqing Normal University, Chongqing 401331, China
- Youjin Hao
  - College of Life Sciences, Chongqing Normal University, Chongqing 401331, China
- Sheng Huang
  - Animal Nutrition Institute, Chongqing Academy of Animal Science, Chongqing 402460, China
- Bo Li
  - College of Life Sciences, Chongqing Normal University, Chongqing 401331, China

42
An J, Wang Y, Li W, Liu W, Zeng X, Liu G, Liu X, Li H. Evaluating the capability of soybean peptides as calcium ion carriers: a study through sequence analysis and molecular dynamics simulations. RSC Adv 2024; 14:15542-15553. [PMID: 38741956 PMCID: PMC11089645 DOI: 10.1039/d4ra02916j] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/19/2024] [Accepted: 05/06/2024] [Indexed: 05/16/2024] Open
Abstract
Calcium homeostasis imbalance in the body can lead to a variety of chronic diseases, so efficient supplementation is essential. Peptide calcium chelate, a fourth-generation calcium supplement, offers easy absorption and minimal side effects. Its effectiveness relies on the peptide's calcium-binding capacity. However, research on amino acid sequences in peptides with high calcium-binding capacity (HCBC) is limited, which hampers the efficient identification of such peptides. This study used soybean peptides (SP), separated and purified by gel chromatography, to obtain HCBC peptide (137.45 μg mg⁻¹) and normal peptide (≤95.78 μg mg⁻¹). Mass spectrometry identified the sequences of these peptides, and an analysis of the positional distribution of characteristic amino acids followed. Two HCBC peptides with sequences GGDLVS (271.55 μg mg⁻¹) and YEGVIL (272.54 μg mg⁻¹) were discovered. Molecular dynamics simulations showed that when aspartic acid is located near the middle of the N-terminal region, when glutamic acid is near the end, or when Asp or Glu residues are consecutive, the binding speed, probability, and strength between the peptide and calcium ions are superior to those at other locations. The study's goal was to clarify how the positions of characteristic amino acids in peptides affect calcium binding, aiding in developing peptide calcium chelates as a novel calcium supplement.
Affiliation(s)
- Jiulong An
  - Key Laboratory of Geriatric Nutrition and Health (Beijing Technology and Business University), Ministry of Education Beijing 100048 China
- Yumei Wang
  - Key Laboratory of Geriatric Nutrition and Health (Beijing Technology and Business University), Ministry of Education Beijing 100048 China
- Wenhui Li
  - Key Laboratory of Geriatric Nutrition and Health (Beijing Technology and Business University), Ministry of Education Beijing 100048 China
- Wanlu Liu
  - Key Laboratory of Geriatric Nutrition and Health (Beijing Technology and Business University), Ministry of Education Beijing 100048 China
- Xiangquan Zeng
  - Key Laboratory of Geriatric Nutrition and Health (Beijing Technology and Business University), Ministry of Education Beijing 100048 China
  - Key Laboratory of Green and Low-carbon Processing Technology for Plant-based Food of China National Light Industry Council, Beijing Technology and Business University Beijing 100048 China
- Guoqi Liu
  - Key Laboratory of Green and Low-carbon Processing Technology for Plant-based Food of China National Light Industry Council, Beijing Technology and Business University Beijing 100048 China
- Xinqi Liu
  - Key Laboratory of Geriatric Nutrition and Health (Beijing Technology and Business University), Ministry of Education Beijing 100048 China
- He Li
  - Key Laboratory of Geriatric Nutrition and Health (Beijing Technology and Business University), Ministry of Education Beijing 100048 China
  - Key Laboratory of Green and Low-carbon Processing Technology for Plant-based Food of China National Light Industry Council, Beijing Technology and Business University Beijing 100048 China

43
Bai H, Lewitus E, Li Y, Thomas PV, Zemil M, Merbah M, Peterson CE, Thuraisamy T, Rees PA, Hajduczki A, Dussupt V, Slike B, Mendez-Rivera L, Schmid A, Kavusak E, Rao M, Smith G, Frey J, Sims A, Wieczorek L, Polonis V, Krebs SJ, Ake JA, Vasan S, Bolton DL, Joyce MG, Townsley S, Rolland M. Contemporary HIV-1 consensus Env with AI-assisted redesigned hypervariable loops promote antibody binding. Nat Commun 2024; 15:3924. [PMID: 38724518 PMCID: PMC11082178 DOI: 10.1038/s41467-024-48139-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/26/2023] [Accepted: 04/19/2024] [Indexed: 05/12/2024] Open
Abstract
An effective HIV-1 vaccine must elicit broadly neutralizing antibodies (bnAbs) against highly diverse Envelope glycoproteins (Env). Since Env with the longest hypervariable (HV) loops is more resistant to the cognate bnAbs than Env with shorter HV loops, we redesigned hypervariable loops for updated Env consensus sequences of subtypes B and C and CRF01_AE. Using modeling with AlphaFold2, we reduced the length of V1, V2, and V5 HV loops while maintaining the integrity of the Env structure and glycan shield, and modified the V4 HV loop. Spacers are designed to limit strain-specific targeting. All updated Env are infectious as pseudoviruses. Preliminary structural characterization suggests that the modified HV loops have a limited impact on Env's conformation. Binding assays show improved binding to modified subtype B and CRF01_AE Env but not to subtype C Env. Neutralization assays show increases in sensitivity to bnAbs, although not always consistently across clades. Strikingly, the HV loop modification renders the resistant CRF01_AE Env sensitive to 10-1074 despite the absence of a glycan at N332.
Affiliation(s)
- Hongjun Bai
  - U.S. Military HIV Research Program, Walter Reed Army Institute of Research, Silver Spring, MD, 20910, USA
  - Henry M. Jackson Foundation for the Advancement of Military Medicine, Bethesda, MD, 20817, USA
- Eric Lewitus
  - U.S. Military HIV Research Program, Walter Reed Army Institute of Research, Silver Spring, MD, 20910, USA
  - Henry M. Jackson Foundation for the Advancement of Military Medicine, Bethesda, MD, 20817, USA
- Yifan Li
  - U.S. Military HIV Research Program, Walter Reed Army Institute of Research, Silver Spring, MD, 20910, USA
  - Henry M. Jackson Foundation for the Advancement of Military Medicine, Bethesda, MD, 20817, USA
- Paul V Thomas
  - Henry M. Jackson Foundation for the Advancement of Military Medicine, Bethesda, MD, 20817, USA
  - Emerging Infectious Disease Branch, Walter Reed Army Institute of Research, Silver Spring, MD, 20910, USA
- Michelle Zemil
  - U.S. Military HIV Research Program, Walter Reed Army Institute of Research, Silver Spring, MD, 20910, USA
  - Henry M. Jackson Foundation for the Advancement of Military Medicine, Bethesda, MD, 20817, USA
- Mélanie Merbah
  - U.S. Military HIV Research Program, Walter Reed Army Institute of Research, Silver Spring, MD, 20910, USA
  - Henry M. Jackson Foundation for the Advancement of Military Medicine, Bethesda, MD, 20817, USA
- Caroline E Peterson
  - Henry M. Jackson Foundation for the Advancement of Military Medicine, Bethesda, MD, 20817, USA
  - Emerging Infectious Disease Branch, Walter Reed Army Institute of Research, Silver Spring, MD, 20910, USA
- Thujitha Thuraisamy
  - U.S. Military HIV Research Program, Walter Reed Army Institute of Research, Silver Spring, MD, 20910, USA
  - Henry M. Jackson Foundation for the Advancement of Military Medicine, Bethesda, MD, 20817, USA
- Phyllis A Rees
  - Henry M. Jackson Foundation for the Advancement of Military Medicine, Bethesda, MD, 20817, USA
  - Emerging Infectious Disease Branch, Walter Reed Army Institute of Research, Silver Spring, MD, 20910, USA
- Agnes Hajduczki
  - Henry M. Jackson Foundation for the Advancement of Military Medicine, Bethesda, MD, 20817, USA
  - Emerging Infectious Disease Branch, Walter Reed Army Institute of Research, Silver Spring, MD, 20910, USA
- Vincent Dussupt
  - U.S. Military HIV Research Program, Walter Reed Army Institute of Research, Silver Spring, MD, 20910, USA
  - Henry M. Jackson Foundation for the Advancement of Military Medicine, Bethesda, MD, 20817, USA
- Bonnie Slike
  - U.S. Military HIV Research Program, Walter Reed Army Institute of Research, Silver Spring, MD, 20910, USA
  - Henry M. Jackson Foundation for the Advancement of Military Medicine, Bethesda, MD, 20817, USA
- Letzibeth Mendez-Rivera
  - U.S. Military HIV Research Program, Walter Reed Army Institute of Research, Silver Spring, MD, 20910, USA
  - Henry M. Jackson Foundation for the Advancement of Military Medicine, Bethesda, MD, 20817, USA
- Annika Schmid
  - U.S. Military HIV Research Program, Walter Reed Army Institute of Research, Silver Spring, MD, 20910, USA
  - Henry M. Jackson Foundation for the Advancement of Military Medicine, Bethesda, MD, 20817, USA
- Erin Kavusak
  - U.S. Military HIV Research Program, Walter Reed Army Institute of Research, Silver Spring, MD, 20910, USA
  - Henry M. Jackson Foundation for the Advancement of Military Medicine, Bethesda, MD, 20817, USA
- Mekhala Rao
  - U.S. Military HIV Research Program, Walter Reed Army Institute of Research, Silver Spring, MD, 20910, USA
  - Henry M. Jackson Foundation for the Advancement of Military Medicine, Bethesda, MD, 20817, USA
- Gabriel Smith
  - U.S. Military HIV Research Program, Walter Reed Army Institute of Research, Silver Spring, MD, 20910, USA
  - Henry M. Jackson Foundation for the Advancement of Military Medicine, Bethesda, MD, 20817, USA
- Jessica Frey
  - U.S. Military HIV Research Program, Walter Reed Army Institute of Research, Silver Spring, MD, 20910, USA
  - Henry M. Jackson Foundation for the Advancement of Military Medicine, Bethesda, MD, 20817, USA
- Alicea Sims
  - U.S. Military HIV Research Program, Walter Reed Army Institute of Research, Silver Spring, MD, 20910, USA
  - Henry M. Jackson Foundation for the Advancement of Military Medicine, Bethesda, MD, 20817, USA
- Lindsay Wieczorek
  - U.S. Military HIV Research Program, Walter Reed Army Institute of Research, Silver Spring, MD, 20910, USA
  - Henry M. Jackson Foundation for the Advancement of Military Medicine, Bethesda, MD, 20817, USA
- Victoria Polonis
  - U.S. Military HIV Research Program, Walter Reed Army Institute of Research, Silver Spring, MD, 20910, USA
- Shelly J Krebs
  - U.S. Military HIV Research Program, Walter Reed Army Institute of Research, Silver Spring, MD, 20910, USA
- Julie A Ake
  - U.S. Military HIV Research Program, Walter Reed Army Institute of Research, Silver Spring, MD, 20910, USA
- Sandhya Vasan
  - U.S. Military HIV Research Program, Walter Reed Army Institute of Research, Silver Spring, MD, 20910, USA
  - Henry M. Jackson Foundation for the Advancement of Military Medicine, Bethesda, MD, 20817, USA
| | - Diane L Bolton
- U.S. Military HIV Research Program, Walter Reed Army Institute of Research, Silver Spring, MD, 20910, USA
- Henry M. Jackson Foundation for the Advancement of Military Medicine, Bethesda, MD, 20817, USA
| | - M Gordon Joyce
- Henry M. Jackson Foundation for the Advancement of Military Medicine, Bethesda, MD, 20817, USA
- Emerging Infectious Disease Branch, Walter Reed Army Institute of Research, Silver Spring, MD, 20910, USA
| | - Samantha Townsley
- U.S. Military HIV Research Program, Walter Reed Army Institute of Research, Silver Spring, MD, 20910, USA
- Henry M. Jackson Foundation for the Advancement of Military Medicine, Bethesda, MD, 20817, USA
| | - Morgane Rolland
- U.S. Military HIV Research Program, Walter Reed Army Institute of Research, Silver Spring, MD, 20910, USA.
- Henry M. Jackson Foundation for the Advancement of Military Medicine, Bethesda, MD, 20817, USA.
|
44
|
Takahashi M, Chong HB, Zhang S, Yang TY, Lazarov MJ, Harry S, Maynard M, Hilbert B, White RD, Murrey HE, Tsou CC, Vordermark K, Assaad J, Gohar M, Dürr BR, Richter M, Patel H, Kryukov G, Brooijmans N, Alghali ASO, Rubio K, Villanueva A, Zhang J, Ge M, Makram F, Griesshaber H, Harrison D, Koglin AS, Ojeda S, Karakyriakou B, Healy A, Popoola G, Rachmin I, Khandelwal N, Neil JR, Tien PC, Chen N, Hosp T, van den Ouweland S, Hara T, Bussema L, Dong R, Shi L, Rasmussen MQ, Domingues AC, Lawless A, Fang J, Yoda S, Nguyen LP, Reeves SM, Wakefield FN, Acker A, Clark SE, Dubash T, Kastanos J, Oh E, Fisher DE, Maheswaran S, Haber DA, Boland GM, Sade-Feldman M, Jenkins RW, Hata AN, Bardeesy NM, Suvà ML, Martin BR, Liau BB, Ott CJ, Rivera MN, Lawrence MS, Bar-Peled L. DrugMap: A quantitative pan-cancer analysis of cysteine ligandability. Cell 2024; 187:2536-2556.e30. [PMID: 38653237 PMCID: PMC11143475 DOI: 10.1016/j.cell.2024.03.027] [Received: 10/01/2023] [Revised: 01/15/2024] [Accepted: 03/19/2024] [Indexed: 04/25/2024]
Abstract
Cysteine-focused chemical proteomic platforms have accelerated the clinical development of covalent inhibitors for a wide range of targets in cancer. However, how different oncogenic contexts influence cysteine targeting remains unknown. To address this question, we have developed "DrugMap," an atlas of cysteine ligandability compiled across 416 cancer cell lines. We unexpectedly find that cysteine ligandability varies across cancer cell lines, and we attribute this to differences in cellular redox states, protein conformational changes, and genetic mutations. Leveraging these findings, we identify actionable cysteines in NF-κB1 and SOX10 and develop corresponding covalent ligands that block the activity of these transcription factors. We demonstrate that the NF-κB1 probe blocks DNA binding, whereas the SOX10 ligand increases SOX10-SOX10 interactions and disrupts melanoma transcriptional signaling. Our findings reveal heterogeneity in cysteine ligandability across cancers, pinpoint cell-intrinsic features driving cysteine targeting, and illustrate the use of covalent probes to disrupt oncogenic transcription-factor activity.
Affiliation(s)
- Mariko Takahashi
- Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA.
| | - Harrison B Chong
- Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA
| | - Siwen Zhang
- Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA
| | - Tzu-Yi Yang
- Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA
| | - Matthew J Lazarov
- Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA
| | - Stefan Harry
- Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA; Department of Chemistry and Chemical Biology, Harvard University, Cambridge, MA 02138, USA
| | | | | | | | | | | | - Kira Vordermark
- Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA
| | - Jonathan Assaad
- Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA
| | - Magdy Gohar
- Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA
| | - Benedikt R Dürr
- Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA
| | - Marianne Richter
- Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA
| | - Himani Patel
- Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA
| | | | | | | | - Karla Rubio
- Department of Pathology, Massachusetts General Hospital, Boston, MA 02114, USA
| | - Antonio Villanueva
- Department of Pathology, Massachusetts General Hospital, Boston, MA 02114, USA
| | - Junbing Zhang
- Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA
| | - Maolin Ge
- Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA
| | - Farah Makram
- Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA
| | - Hanna Griesshaber
- Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA
| | - Drew Harrison
- Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA
| | - Ann-Sophie Koglin
- Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA
| | - Samuel Ojeda
- Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA
| | - Barbara Karakyriakou
- Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA
| | - Alexander Healy
- Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA
| | - George Popoola
- Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA
| | - Inbal Rachmin
- Cutaneous Biology Research Center, Massachusetts General Hospital, Boston, MA 02114, USA
| | - Neha Khandelwal
- Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA
| | | | - Pei-Chieh Tien
- Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA
| | - Nicholas Chen
- Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA; Department of Pathology, Harvard Medical School, Boston, MA 02114, USA
| | - Tobias Hosp
- Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA
| | - Sanne van den Ouweland
- Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA
| | - Toshiro Hara
- Department of Pathology, Massachusetts General Hospital, Boston, MA 02114, USA
| | - Lillian Bussema
- Department of Pathology, Massachusetts General Hospital, Boston, MA 02114, USA
| | - Rui Dong
- Department of Pathology, Massachusetts General Hospital, Boston, MA 02114, USA
| | - Lei Shi
- Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA
| | - Martin Q Rasmussen
- Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA
| | - Ana Carolina Domingues
- Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA
| | - Aleigha Lawless
- Department of Surgery, Massachusetts General Hospital, Boston, MA 02114, USA; Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Jacy Fang
- Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA; Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Satoshi Yoda
- Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA
| | - Linh Phuong Nguyen
- Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA
| | - Sarah Marie Reeves
- Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA
| | - Farrah Nicole Wakefield
- Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA
| | - Adam Acker
- Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA
| | - Sarah Elizabeth Clark
- Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA
| | - Taronish Dubash
- Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA
| | - John Kastanos
- Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA
| | - Eugene Oh
- Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA; Department of Medicine, Harvard Medical School, Boston, MA 02114, USA
| | - David E Fisher
- Cutaneous Biology Research Center, Massachusetts General Hospital, Boston, MA 02114, USA
| | - Shyamala Maheswaran
- Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA; Department of Medicine, Harvard Medical School, Boston, MA 02114, USA
| | - Daniel A Haber
- Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA; Department of Medicine, Harvard Medical School, Boston, MA 02114, USA; Howard Hughes Medical Institute, Chevy Chase, MD 20815, USA
| | - Genevieve M Boland
- Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA; Department of Surgery, Massachusetts General Hospital, Boston, MA 02114, USA; Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; Department of Surgery, Harvard Medical School, Boston, MA 02114, USA
| | - Moshe Sade-Feldman
- Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA; Department of Medicine, Harvard Medical School, Boston, MA 02114, USA
| | - Russell W Jenkins
- Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA; Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; Department of Medicine, Harvard Medical School, Boston, MA 02114, USA
| | - Aaron N Hata
- Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA; Department of Medicine, Harvard Medical School, Boston, MA 02114, USA
| | - Nabeel M Bardeesy
- Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA; Department of Medicine, Harvard Medical School, Boston, MA 02114, USA
| | - Mario L Suvà
- Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA; Department of Pathology, Massachusetts General Hospital, Boston, MA 02114, USA; Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; Department of Pathology, Harvard Medical School, Boston, MA 02114, USA
| | | | - Brian B Liau
- Department of Chemistry and Chemical Biology, Harvard University, Cambridge, MA 02138, USA
| | - Christopher J Ott
- Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA; Department of Medicine, Harvard Medical School, Boston, MA 02114, USA
| | - Miguel N Rivera
- Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA; Department of Pathology, Massachusetts General Hospital, Boston, MA 02114, USA; Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; Department of Pathology, Harvard Medical School, Boston, MA 02114, USA
| | - Michael S Lawrence
- Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA; Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; Department of Pathology, Harvard Medical School, Boston, MA 02114, USA.
| | - Liron Bar-Peled
- Krantz Family Center for Cancer Research, Massachusetts General Hospital Cancer Center, Charlestown, MA 02129, USA; Department of Medicine, Harvard Medical School, Boston, MA 02114, USA.
|
45
|
Supakar T, Herring-Nicholas A, Josephs EA. Compartmentalized CRISPR Reactions (CCR) for High-Throughput Screening of Guide RNA Potency and Specificity. bioRxiv 2024:2024.05.07.592954. [PMID: 38766102 PMCID: PMC11100742 DOI: 10.1101/2024.05.07.592954] [Indexed: 05/22/2024]
Abstract
CRISPR ribonucleoproteins (RNPs) use a variable segment in their guide RNA (gRNA) called a spacer to determine the DNA sequence at which the effector protein will exhibit nuclease activity and generate target-specific genetic mutations. However, nuclease activity with different gRNAs can vary considerably, in a spacer sequence-dependent manner that can be difficult to predict. While computational tools are helpful in predicting a CRISPR effector's activity and/or potential for off-target mutagenesis with different gRNAs, individual gRNAs must still be validated in vitro prior to their use. Here, we present compartmentalized CRISPR reactions (CCR) for screening large numbers of spacer/target/off-target combinations simultaneously in vitro for both CRISPR effector activity and specificity, by confining the complete CRISPR reaction of gRNA transcription, RNP formation, and CRISPR target cleavage within individual water-in-oil microemulsions. With CCR, large numbers of candidate gRNAs (output by computational design tools) can be immediately validated in parallel, and we show that CCR can be used to screen hundreds of thousands of extended gRNA (x-gRNA) variants that can completely block cleavage at off-target sequences while maintaining high levels of on-target activity. We expect that CCR can help to streamline the gRNA generation and validation processes for applications in biological and biomedical research.
Affiliation(s)
- Tinku Supakar
- Department of Nanoscience, Joint School of Nanoscience and Nanoengineering, University of North Carolina at Greensboro, Greensboro, NC 27401, USA
| | - Ashley Herring-Nicholas
- Department of Nanoscience, Joint School of Nanoscience and Nanoengineering, University of North Carolina at Greensboro, Greensboro, NC 27401, USA
| | - Eric A. Josephs
- Department of Nanoscience, Joint School of Nanoscience and Nanoengineering, University of North Carolina at Greensboro, Greensboro, NC 27401, USA
|
46
|
Balamurugan C, Steenwyk JL, Goldman GH, Rokas A. The evolution of the gliotoxin biosynthetic gene cluster in Penicillium fungi. G3 (Bethesda) 2024; 14:jkae063. [PMID: 38507596 PMCID: PMC11075534 DOI: 10.1093/g3journal/jkae063] [Received: 12/27/2023] [Revised: 12/27/2023] [Accepted: 03/11/2024] [Indexed: 03/22/2024]
Abstract
Fungi biosynthesize diverse secondary metabolites, small organic bioactive molecules with key roles in fungal ecology. Fungal secondary metabolites are often encoded by physically clustered genes known as biosynthetic gene clusters (BGCs). Fungi in the genus Penicillium produce a cadre of secondary metabolites, some of which are useful (e.g. the antibiotic penicillin and the cholesterol-lowering drug mevastatin) and others harmful (e.g. the mycotoxin patulin and the immunosuppressant gliotoxin) to human affairs. Fungal genomes often also encode resistance genes that confer protection against toxic secondary metabolites. Some Penicillium species, such as Penicillium decumbens, are known to produce gliotoxin, a secondary metabolite with known immunosuppressant activity. To investigate the evolutionary conservation of homologs of the gliotoxin BGC and of genes involved in gliotoxin resistance in Penicillium, we analyzed 35 Penicillium genomes from 23 species. Homologous, less fragmented gliotoxin BGCs were found in 12 genomes; mostly fragmented remnants of the gliotoxin BGC were found in 21 genomes; and the remaining 2 Penicillium genomes lacked the gliotoxin BGC altogether. In contrast, homologs of resistance genes that reside outside the BGC were broadly conserved across Penicillium genomes. Evolutionary rate analysis revealed that BGCs with higher numbers of genes evolve more slowly than BGCs with few genes, suggestive of constraint and potential functional significance or more recent decay. Gene tree-species tree reconciliation analyses suggested that the history of homologs in the gliotoxin BGC across the genus Penicillium likely involved multiple duplications, losses, and horizontal gene transfers. Our analyses suggest that genes encoded in BGCs can have complex evolutionary histories and be retained in genomes long after the loss of secondary metabolite biosynthesis.
Affiliation(s)
- Charu Balamurugan
- Department of Biological Sciences, Vanderbilt University, VU Station B #35-1634, Nashville, TN 37235, USA
- Vanderbilt Evolutionary Studies Initiative, Vanderbilt University, Nashville, TN 37235, USA
| | - Jacob L Steenwyk
- Department of Biological Sciences, Vanderbilt University, VU Station B #35-1634, Nashville, TN 37235, USA
- Vanderbilt Evolutionary Studies Initiative, Vanderbilt University, Nashville, TN 37235, USA
- Howard Hughes Medical Institute and the Department of Molecular and Cell Biology, University of California, Berkeley, CA 94720, USA
| | - Gustavo H Goldman
- Faculdade de Ciencias Farmacêuticas de Ribeirão Preto, Universidade de São Paulo, Ribeirão Preto, São Paulo CEP 14040-903, Brazil
| | - Antonis Rokas
- Department of Biological Sciences, Vanderbilt University, VU Station B #35-1634, Nashville, TN 37235, USA
- Vanderbilt Evolutionary Studies Initiative, Vanderbilt University, Nashville, TN 37235, USA
|
47
|
Li W, Miller D, Liu X, Tosi L, Chkaiban L, Mei H, Hung PH, Parekkadan B, Sherlock G, Levy SF. Arrayed in vivo barcoding for multiplexed sequence verification of plasmid DNA and demultiplexing of pooled libraries. Nucleic Acids Res 2024:gkae332. [PMID: 38709890 DOI: 10.1093/nar/gkae332] [Received: 10/13/2023] [Revised: 02/23/2024] [Accepted: 04/16/2024] [Indexed: 05/08/2024] Open
Abstract
Sequence verification of plasmid DNA is critical for many cloning and molecular biology workflows. To leverage high-throughput sequencing, several methods have been developed that add a unique DNA barcode to individual samples prior to pooling and sequencing. However, these methods require an individual plasmid extraction and/or in vitro barcoding reaction for each sample processed, limiting throughput and adding cost. Here, we develop an arrayed in vivo plasmid barcoding platform that enables pooled plasmid extraction and library preparation for Oxford Nanopore sequencing. This method has a high accuracy and recovery rate, and greatly increases throughput and reduces cost relative to other plasmid barcoding methods or Sanger sequencing. We use in vivo barcoding to sequence verify >45 000 plasmids and show that the method can be used to transform error-containing dispersed plasmid pools into sequence-perfect arrays or well-balanced pools. In vivo barcoding does not require any specialized equipment beyond a low-overhead Oxford Nanopore sequencer, enabling most labs to flexibly process hundreds to thousands of plasmids in parallel.
Affiliation(s)
- Weiyi Li
- SLAC National Accelerator Laboratory, Stanford University, Stanford, CA, USA
| | - Darach Miller
- SLAC National Accelerator Laboratory, Stanford University, Stanford, CA, USA
| | - Xianan Liu
- SLAC National Accelerator Laboratory, Stanford University, Stanford, CA, USA
| | - Lorenzo Tosi
- Department of Biomedical Engineering, Rutgers University, Piscataway, NJ, USA
| | - Lamia Chkaiban
- Department of Biomedical Engineering, Rutgers University, Piscataway, NJ, USA
| | - Han Mei
- SLAC National Accelerator Laboratory, Stanford University, Stanford, CA, USA
| | - Po-Hsiang Hung
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
| | - Biju Parekkadan
- Department of Biomedical Engineering, Rutgers University, Piscataway, NJ, USA
| | - Gavin Sherlock
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
| | - Sasha F Levy
- SLAC National Accelerator Laboratory, Stanford University, Stanford, CA, USA
|
48
|
Luebbert L, Sullivan DK, Carilli M, Hjörleifsson KE, Winnett AV, Chari T, Pachter L. Efficient and accurate detection of viral sequences at single-cell resolution reveals putative novel viruses perturbing host gene expression. bioRxiv 2024:2023.12.11.571168. [PMID: 38168363 PMCID: PMC10760059 DOI: 10.1101/2023.12.11.571168] [Indexed: 01/05/2024]
Abstract
There are an estimated 300,000 mammalian viruses from which infectious diseases in humans may arise. They inhabit human tissues such as the lungs, blood, and brain and often remain undetected. Efficient and accurate detection of viral infection is vital to understanding its impact on human health and to making accurate predictions that limit adverse effects, such as future epidemics. The increasing use of high-throughput sequencing methods in research, agriculture, and healthcare provides an opportunity for the cost-effective surveillance of viral diversity and investigation of virus-disease correlation. However, existing methods for identifying viruses in sequencing data either rely on, and are limited to, reference genomes or cannot retain single-cell resolution through cell barcode tracking. We introduce a method that accurately and rapidly detects viral sequences in bulk and single-cell transcriptomics data based on highly conserved amino acid domains, which enables the detection of RNA viruses covering up to 10^12 virus species. The analysis of viral presence and host gene expression in parallel at single-cell resolution allows for the characterization of host viromes and the identification of viral tropism and host responses. We applied our method to identify putative novel viruses in rhesus macaque PBMC data that display cell type specificity and whose presence correlates with altered host gene expression.
Affiliation(s)
- Laura Luebbert
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, California
| | - Delaney K. Sullivan
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, California
- UCLA-Caltech Medical Scientist Training Program, David Geffen School of Medicine, University of California, Los Angeles, California
| | - Maria Carilli
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, California
| | | | - Alexander Viloria Winnett
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, California
- UCLA-Caltech Medical Scientist Training Program, David Geffen School of Medicine, University of California, Los Angeles, California
| | - Tara Chari
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, California
| | - Lior Pachter
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, California
- Department of Computing and Mathematical Sciences, California Institute of Technology, Pasadena, California
|
49
|
Kabier M, Gambacorta N, Trisciuzzi D, Kumar S, Nicolotti O, Mathew B. MzDOCK: A free ready-to-use GUI-based pipeline for molecular docking simulations. J Comput Chem 2024. [PMID: 38703357 DOI: 10.1002/jcc.27390] [Received: 03/01/2024] [Revised: 04/12/2024] [Accepted: 04/19/2024] [Indexed: 05/06/2024]
Abstract
Molecular docking is by far the most preferred approach in structure-based drug design because of its effectiveness in predicting the pose and score of a given bioactive small molecule in the binding site of its pharmacological target. Herein, we present MzDOCK, a new GUI-based pipeline for the Windows operating system, designed to make molecular docking easier to use and more reproducible, even for inexperienced users. By integrating Python and batch scripts that employ various open-source packages such as Smina (docking engine), OpenBabel (file conversion), and PLIP (analysis), MzDOCK includes many practical options, such as: binding site configuration based on co-crystallized ligands; generation of enantiomers from SMILES input; application of different force fields (MMFF94, MMFF94s, UFF, GAFF, Ghemical) for energy minimization; retention of selectable ions and cofactors; sidechain flexibility of selectable binding site residues; multiple input file formats (SMILES, PDB, SDF, Mol2, Mol); and generation of reports and pictures for interactive visualization. Users can download MzDOCK for free at the following link: https://github.com/Muzatheking12/MzDOCK.
Affiliation(s)
- Muzammil Kabier
- Department of Pharmaceutical Chemistry, Amrita School of Pharmacy, Amrita Vishwa Vidyapeetham, AIMS Health Sciences Campus, Kochi, India
| | - Nicola Gambacorta
- Division of Medical Genetics, IRCCS Foundation-Casa Sollievo della Sofferenza, San Giovanni Rotondo (Foggia), Foggia, Italy
| | - Daniela Trisciuzzi
- Dipartimento di Farmacia - Scienze del Farmaco, Università degli Studi di Bari "Aldo Moro", Bari, Italy
| | - Sunil Kumar
- Department of Pharmaceutical Chemistry, Amrita School of Pharmacy, Amrita Vishwa Vidyapeetham, AIMS Health Sciences Campus, Kochi, India
| | - Orazio Nicolotti
- Dipartimento di Farmacia - Scienze del Farmaco, Università degli Studi di Bari "Aldo Moro", Bari, Italy
| | - Bijo Mathew
- Department of Pharmaceutical Chemistry, Amrita School of Pharmacy, Amrita Vishwa Vidyapeetham, AIMS Health Sciences Campus, Kochi, India
|
50
|
Kalogeropoulos K, Moldt Haack A, Madzharova E, Di Lorenzo A, Hanna R, Schoof EM, Keller UAD. CLIPPER 2.0: Peptide level annotation and data analysis for positional proteomics. Mol Cell Proteomics 2024:100781. [PMID: 38703894 DOI: 10.1016/j.mcpro.2024.100781] [Received: 02/21/2024] [Revised: 04/11/2024] [Accepted: 05/01/2024] [Indexed: 05/06/2024] Open
Abstract
Positional proteomics methodologies have transformed protease research and have brought mass spectrometry (MS)-based degradomics studies to the forefront of protease characterization and system-wide interrogation of protease signaling. Considerable advancements in both the sensitivity and throughput of liquid chromatography (LC)-MS/MS instrumentation enable the generation of enormous positional proteomics datasets of natural protein termini and neo-termini of cleaved protease substrates. However, comparable progress has not been observed in data analysis and post-processing steps, which arguably constitute the largest bottleneck in positional proteomics workflows. Here, we present a computational tool, CLIPPER 2.0, that builds on prior algorithms developed for MS-based protein termini analysis, facilitating peptide level annotation and data analysis. CLIPPER 2.0 can be used with several sample preparation workflows and proteomics search algorithms, and enables fast and automated database information retrieval, statistical and network analysis, as well as visualization of terminomic datasets. We demonstrate the applicability of our tool by analyzing GluC and MMP9 cleavages in HeLa lysates. CLIPPER 2.0 is available at https://github.com/UadKLab/CLIPPER-2.0.
Affiliation(s)
- Konstantinos Kalogeropoulos
- Department of Biotechnology and Biomedicine, Technical University of Denmark, Søltofts Plads B221, Kgs. Lyngby, 2800, Denmark.
| | - Aleksander Moldt Haack
- Department of Biotechnology and Biomedicine, Technical University of Denmark, Søltofts Plads B221, Kgs. Lyngby, 2800, Denmark.
| | - Elizabeta Madzharova
- Department of Biotechnology and Biomedicine, Technical University of Denmark, Søltofts Plads B221, Kgs. Lyngby, 2800, Denmark
| | - Antea Di Lorenzo
- Department of Biotechnology and Biomedicine, Technical University of Denmark, Søltofts Plads B221, Kgs. Lyngby, 2800, Denmark
| | - Rawad Hanna
- Faculty of Biology, Technion-Israel Institute of Technology, Technion City, Haifa 3200003, Israel
| | - Erwin M Schoof
- Department of Biotechnology and Biomedicine, Technical University of Denmark, Søltofts Plads B221, Kgs. Lyngby, 2800, Denmark
| | - Ulrich Auf dem Keller
- Department of Biotechnology and Biomedicine, Technical University of Denmark, Søltofts Plads B221, Kgs. Lyngby, 2800, Denmark
|