1
Rakesh S, Behera K, Krishnan A. Unveiling the structural and functional implications of uncharacterized NSPs and variations in the molecular toolkit across arteriviruses. NAR Genom Bioinform 2025; 7:lqaf035. [PMID: 40213365 PMCID: PMC11983283 DOI: 10.1093/nargab/lqaf035] [Received: 11/16/2024] [Revised: 02/16/2025] [Accepted: 03/18/2025] [Indexed: 04/15/2025] Open
Abstract
Despite considerable scrutiny of mammalian arterivirus genomes, their genomic architecture remains incomplete, with several unannotated non-structural proteins (NSPs) and the enigmatic absence of methyltransferase (MTase) domains. Additionally, the host range of arteriviruses has expanded to include seven newly sequenced genomes from non-mammalian hosts, which remain largely unannotated and await detailed comparisons alongside mammalian isolates. Utilizing comparative genomics approaches and comprehensive sequence-structure analysis, we provide enhanced genomic architecture and annotations for arterivirus genomes. We identified the previously unannotated C-terminal domain of NSP3 as a winged helix-turn-helix domain and classified NSP7 as a new small β-barrel domain, both likely involved in interactions with viral RNA. NSP12 is identified as a derived variant of the N7-MTase-like Rossmann fold domain that retains core structural alignment with N7-MTases in Nidovirales but likely lacks enzymatic functionality due to the erosion of catalytic residues, indicating a unique role specific to mammalian arteriviruses. In contrast, non-mammalian arteriviruses sporadically retain a 2'-O-MTase and an exonuclease (ExoN) domain, which are typically absent in mammalian arteriviruses, highlighting contrasting evolutionary trends and variations in their molecular toolkit. Similar lineage-specific patterns are observed in the diversification of papain-like proteases and structural proteins. Overall, the study extends our knowledge of arterivirus genomic diversity and evolution.
Affiliation(s)
- Siuli Rakesh
- Department of Biological Sciences, Indian Institute of Science Education and Research Berhampur (IISER Berhampur), Berhampur 760010, India
- Kshitij Behera
- Department of Biological Sciences, Indian Institute of Science Education and Research Berhampur (IISER Berhampur), Berhampur 760010, India
- Arunkumar Krishnan
- Department of Biological Sciences, Indian Institute of Science Education and Research Berhampur (IISER Berhampur), Berhampur 760010, India
2
Bandeira PT, Chaves CR, Monteiro Torres PH, de Souza W. Immunolocalization and 3D modeling of three unique proteins belonging to the costa of Tritrichomonas foetus. Parasitol Res 2025; 124:30. [PMID: 40053153 PMCID: PMC11889022 DOI: 10.1007/s00436-025-08466-4] [Received: 09/27/2024] [Accepted: 02/10/2025] [Indexed: 03/10/2025]
Abstract
Despite major advances in cell biology, some cellular structures are still not fully understood. Among these is the costa, a structure of the mastigont system found only in some members of the orders Trichomonadida and Tritrichomonadida, including the pathogens of venereal diseases in humans and cattle, Trichomonas vaginalis (T. vaginalis) and Tritrichomonas foetus (T. foetus), respectively. The costa is a prominent striated fiber and, although part of the cytoskeleton, differs from its classical components, and its molecular composition is still not fully characterized. Using proteomics of the T. foetus costa fraction, we previously identified hypothetical proteins; among these, the protein ARM19800.1 was positively localized to the costa and named costain-1. In this study, two other protein candidates were analyzed. To localize proteins 11810 and 32137 in T. foetus cells, expansion microscopy and immunocytochemistry were used. Immunofluorescence revealed the presence of both proteins throughout the whole costa but with different intensities. Immunocytochemistry using negative staining, LR-White, and Epon embedding provided further detail on the proteins' localization. All techniques confirmed the distinct and distributed localization of both proteins: costain-2 (11810) and costain-3 (32137). In addition, AlphaFold3 was used to generate 3D models of the three identified proteins, showing a predominance of α-helical spans. The identification and further characterization of these unique proteins can help clarify their functional role in the assembled costa and, therefore, the organization and function of this structure in these organisms.
Affiliation(s)
- Paula Terra Bandeira
- Laboratório de Ultraestrutura Celular Hertha Meyer, Centro de Pesquisa Em Medicina de Precisão, Instituto de Biofísica Carlos Chagas Filho, Universidade Federal Do Rio de Janeiro, Rio de Janeiro, Brazil.
- Camila Rodrigues Chaves
- Laboratório de Modelagem E Dinâmica Molecular, Instituto de Biofísica Carlos Chagas Filho, Universidade Federal Do Rio de Janeiro, Rio de Janeiro, Brazil
- Pedro Henrique Monteiro Torres
- Laboratório de Modelagem E Dinâmica Molecular, Instituto de Biofísica Carlos Chagas Filho, Universidade Federal Do Rio de Janeiro, Rio de Janeiro, Brazil
- Wanderley de Souza
- Laboratório de Ultraestrutura Celular Hertha Meyer, Centro de Pesquisa Em Medicina de Precisão, Instituto de Biofísica Carlos Chagas Filho, Universidade Federal Do Rio de Janeiro, Rio de Janeiro, Brazil
- Instituto Nacional de Ciência E Tecnologia Em Biologia Estrutural E Bioimagens, and Centro Nacional de Biologia Estrutural E Bioimagens, Universidade Federal Do Rio de Janeiro, Rio de Janeiro, Brazil
3
Gillani M, Pollastri G. Impact of Alignments on the Accuracy of Protein Subcellular Localization Predictions. Proteins 2025; 93:745-759. [PMID: 39575640 PMCID: PMC11809130 DOI: 10.1002/prot.26767] [Received: 07/02/2024] [Revised: 10/01/2024] [Accepted: 11/01/2024] [Indexed: 02/11/2025]
Abstract
Alignments in bioinformatics refer to the arrangement of sequences to identify regions of similarity that can indicate functional, structural, or evolutionary relationships. They are crucial for bioinformaticians as they enable accurate predictions and analyses in various applications, including protein subcellular localization. The predictive model used in this article is based on a deep convolutional architecture. We tested configurations of Deep N-to-1 convolutional neural networks of various depths and widths to identify better-performing values across a diverse set of eight classes. For the assessment without alignments, sequences are encoded using one-hot encoding, converting each character into a numerical representation, which is straightforward for non-numerical data and useful for machine learning models. For the assessment with alignments, multiple sequence alignments (MSAs) are created using PSI-BLAST, capturing evolutionary information by calculating frequencies of residues and gaps. The average difference in peak performance between models with alignments and without alignments is approximately 15.82%. The average difference in the highest accuracy achieved with alignments compared with without alignments is approximately 15.16%. Thus, extensive experimentation indicates that higher alignment accuracy implies a more reliable model and improved prediction accuracy, which can be trusted to deliver consistent performance across different layers and classes of subcellular localization predictions. This research provides valuable insights into prediction accuracies with and without alignments, offering bioinformaticians an effective tool for better understanding while potentially reducing the need for extensive experimental validations. The source code and datasets are available at http://distilldeep.ucd.ie/SCL8/.
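The two input encodings contrasted in this abstract can be sketched as follows. This is a minimal illustration, not the authors' code: the 20-letter alphabet, the function names `one_hot` and `msa_profile`, and the treatment of unknown symbols are all assumptions, and the alignment rows are taken as already computed (for example by PSI-BLAST) rather than generated here.

```python
# Illustrative sketch only; alphabet, names, and handling of unknown
# symbols are assumptions, not details taken from the paper.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
GAP = "-"


def one_hot(sequence):
    """Encode each residue as a 20-dimensional indicator vector."""
    index = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
    encoded = []
    for residue in sequence:
        vector = [0.0] * len(AMINO_ACIDS)
        if residue in index:  # unknown residues stay all-zero
            vector[index[residue]] = 1.0
        encoded.append(vector)
    return encoded


def msa_profile(aligned_rows):
    """Per-column frequencies of the 20 residues plus the gap symbol."""
    symbols = AMINO_ACIDS + GAP
    index = {s: i for i, s in enumerate(symbols)}
    length = len(aligned_rows[0])
    if any(len(row) != length for row in aligned_rows):
        raise ValueError("all aligned rows must have the same length")
    profile = []
    for col in range(length):
        counts = [0] * len(symbols)
        for row in aligned_rows:
            symbol = row[col]
            if symbol in index:  # symbols outside the alphabet are skipped
                counts[index[symbol]] += 1
        total = sum(counts)
        profile.append([c / total if total else 0.0 for c in counts])
    return profile
```

A one-hot row carries only the identity of a single residue, whereas a profile row summarizes how often each residue or a gap occurs at that position across the aligned homologs, which is the evolutionary information the abstract credits for the accuracy gain.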
Affiliation(s)
- Maryam Gillani
- School of Computer Science, University College Dublin (UCD), Dublin, Ireland
4
Zaychikova M, Malakhova M, Bespiatykh D, Kornienko M, Klimina K, Strokach A, Gorodnichev R, German A, Fursov M, Bagrov D, Vnukova A, Gracheva A, Kazyulina A, Shleeva M, Shitikov E. Vic9 mycobacteriophage: the first subcluster B2 phage isolated in Russia. Front Microbiol 2025; 15:1513081. [PMID: 39877753 PMCID: PMC11772480 DOI: 10.3389/fmicb.2024.1513081] [Received: 10/17/2024] [Accepted: 12/12/2024] [Indexed: 01/31/2025] Open
Abstract
Mycobacteriophages are viruses that specifically infect bacteria of the Mycobacterium genus. A substantial collection of mycobacteriophages has been isolated and characterized, offering valuable insights into their diversity and evolution. This collection also holds significant potential for therapeutic applications, particularly as an alternative to antibiotics in combating drug-resistant bacterial strains. In this study, we report the isolation and characterization of a new mycobacteriophage, Vic9, using Mycobacterium smegmatis mc²155 as the host strain. Vic9 has been classified within the B2 subcluster of the B cluster. Morphological analysis revealed that Vic9 has a structure typical of siphophages from this subcluster and forms characteristic plaques. The phage adsorbs onto host strain cells within 30 min, and according to one-step growth experiments, its latent period lasts about 90 min, followed by a growth period of 150 min, with an average yield of approximately 68 phage particles per infected cell. In host range experiments, Vic9 efficiently lysed the host strain and also exhibited the ability to lyse M. tuberculosis H37Rv, albeit with a low efficiency of plating (EOP ≈ 2 × 10⁻⁵), a typical feature of B2 phages. No lysis was observed in other tested mycobacterial species. The genome of Vic9 comprises 67,543 bp of double-stranded DNA and encodes 89 open reading frames. Our analysis revealed unique features in Vic9, despite its close relationship to other B2 subcluster phages, highlighting its distinct characteristics even among closely related phages. Particularly noteworthy was the discovery of a distinct 435 bp sequence within the gene cluster responsible for queuosine biosynthesis, as well as a recombination event within the structural cassette region (Vic_0033-Vic_0035) among members of the B1, B2, and B3 subclusters. These genetic features are of interest for further research, as they may reveal new mechanisms of phage-bacteria interactions and their potential for developing novel phage therapy methods.
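The two quantities reported here, efficiency of plating and average yield per infected cell, are simple ratios. The sketch below shows the arithmetic under simplified, commonly used definitions; the function names and all titers and counts are hypothetical values chosen only to reproduce the orders of magnitude in the abstract, and are not data from the study.

```python
def efficiency_of_plating(titer_test_host, titer_reference_host):
    """EOP: plaque titer on the test host divided by titer on the reference host."""
    if titer_reference_host <= 0:
        raise ValueError("reference titer must be positive")
    return titer_test_host / titer_reference_host


def burst_size(released_phage, infected_cells):
    """Average yield: phage particles released per infected cell (simplified)."""
    if infected_cells <= 0:
        raise ValueError("infected cell count must be positive")
    return released_phage / infected_cells


# Hypothetical titers (PFU/mL) for the same lysate plated on two hosts.
eop = efficiency_of_plating(titer_test_host=2e5, titer_reference_host=1e10)
print(f"EOP = {eop:.1e}")

# Hypothetical counts from a one-step growth experiment.
yield_per_cell = burst_size(released_phage=6.8e8, infected_cells=1e7)
print(f"yield per infected cell = {yield_per_cell:.0f}")
```

With these made-up inputs the ratios come out at 2 × 10⁻⁵ and 68, matching the scale of the reported values; a laboratory calculation would also correct for unadsorbed phage and replicate variation.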
Affiliation(s)
- Marina Zaychikova
- Lopukhin Federal Research and Clinical Center of Physical-Chemical Medicine of Federal Medical Biological Agency Medicine, Moscow, Russia
- Maja Malakhova
- Lopukhin Federal Research and Clinical Center of Physical-Chemical Medicine of Federal Medical Biological Agency Medicine, Moscow, Russia
- Dmitry Bespiatykh
- Lopukhin Federal Research and Clinical Center of Physical-Chemical Medicine of Federal Medical Biological Agency Medicine, Moscow, Russia
- Maria Kornienko
- Lopukhin Federal Research and Clinical Center of Physical-Chemical Medicine of Federal Medical Biological Agency Medicine, Moscow, Russia
- Ksenia Klimina
- Lopukhin Federal Research and Clinical Center of Physical-Chemical Medicine of Federal Medical Biological Agency Medicine, Moscow, Russia
- Aleksandra Strokach
- Lopukhin Federal Research and Clinical Center of Physical-Chemical Medicine of Federal Medical Biological Agency Medicine, Moscow, Russia
- Roman Gorodnichev
- Lopukhin Federal Research and Clinical Center of Physical-Chemical Medicine of Federal Medical Biological Agency Medicine, Moscow, Russia
- Arina German
- Federal Research Centre 'Fundamentals of Biotechnology' of the Russian Academy of Sciences, Moscow, Russia
- Mikhail Fursov
- State Research Center for Applied Microbiology and Biotechnology, Obolensk, Russia
- Dmitry Bagrov
- Lopukhin Federal Research and Clinical Center of Physical-Chemical Medicine of Federal Medical Biological Agency Medicine, Moscow, Russia
- Faculty of Biology, Lomonosov Moscow State University, Moscow, Russia
- Anna Vnukova
- Faculty of Biology, Lomonosov Moscow State University, Moscow, Russia
- Alexandra Gracheva
- Federal State Budgetary Institution “National Medical Research Center of Phtisiopulmonology and Infectious Diseases” of the Ministry of Health of the Russian Federation, Moscow, Russia
- Anastasia Kazyulina
- Federal State Budgetary Institution “National Medical Research Center of Phtisiopulmonology and Infectious Diseases” of the Ministry of Health of the Russian Federation, Moscow, Russia
- Margarita Shleeva
- Federal Research Centre 'Fundamentals of Biotechnology' of the Russian Academy of Sciences, Moscow, Russia
- Egor Shitikov
- Lopukhin Federal Research and Clinical Center of Physical-Chemical Medicine of Federal Medical Biological Agency Medicine, Moscow, Russia
5
Gao S, Zhang Y, Bush SJ, Wang B, Yang X, Ye K. Centromere Landscapes Resolved from Hundreds of Human Genomes. Genomics Proteomics Bioinformatics 2024; 22:qzae071. [PMID: 39423139 DOI: 10.1093/gpbjnl/qzae071] [Received: 07/18/2024] [Revised: 08/27/2024] [Accepted: 09/20/2024] [Indexed: 10/21/2024]
Abstract
High-fidelity (HiFi) sequencing has facilitated the assembly and analysis of the most repetitive region of the genome, the centromere. Nevertheless, our current understanding of human centromeres is based on a relatively small number of telomere-to-telomere assemblies, which have not yet captured its full diversity. In this study, we investigated the genomic diversity of human centromere higher order repeats (HORs) via both HiFi reads and haplotype-resolved assemblies from hundreds of samples drawn from ongoing pangenome-sequencing projects and reprocessed them via a novel HOR annotation pipeline, HiCAT-human. We used this wealth of data to provide a global survey of the centromeric HOR landscape; in particular, we found that 23 HORs presented significant copy number variability between populations. We detected three centromere genotypes with unbalanced population frequencies on chromosomes 5, 8, and 17. An inter-assembly comparison of HOR loci further revealed that while HOR array structures are diverse, they nevertheless tend to form a number of specific landscapes, each exhibiting different levels of HOR subunit expansion and possibly reflecting a cyclical evolutionary transition from homogeneous to nested structures and back.
Affiliation(s)
- Shenghan Gao
- School of Automation Science and Engineering, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an 710049, China
- School of Computer Science and Technology, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an 710049, China
- MOE Key Lab for Intelligent Networks & Networks Security, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an 710049, China
- Yimeng Zhang
- School of Computer Science and Technology, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an 710049, China
- MOE Key Lab for Intelligent Networks & Networks Security, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an 710049, China
- Stephen J Bush
- School of Automation Science and Engineering, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an 710049, China
- Bo Wang
- School of Automation Science and Engineering, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an 710049, China
- MOE Key Lab for Intelligent Networks & Networks Security, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an 710049, China
- Xiaofei Yang
- School of Computer Science and Technology, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an 710049, China
- MOE Key Lab for Intelligent Networks & Networks Security, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an 710049, China
- Kai Ye
- School of Automation Science and Engineering, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an 710049, China
- MOE Key Lab for Intelligent Networks & Networks Security, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an 710049, China
- Center for Mathematical Medical, The First Affiliated Hospital, Xi'an Jiaotong University, Xi'an 710061, China
- School of Life Science and Technology, Xi'an Jiaotong University, Xi'an 710049, China
- Faculty of Science, Leiden University, Leiden 2311 EZ, The Netherlands
6
Cheng J, Jia Y, Hill C, He T, Wang K, Guo G, Shabala S, Zhou M, Han Y, Li C. Diversity of Gibberellin 2-oxidase genes in the barley genome offers opportunities for genetic improvement. J Adv Res 2024; 66:105-118. [PMID: 38199453 PMCID: PMC11674783 DOI: 10.1016/j.jare.2023.12.021] [Received: 05/27/2023] [Revised: 12/26/2023] [Accepted: 12/29/2023] [Indexed: 01/12/2024] Open
Abstract
INTRODUCTION: Gibberellin (GA) is a vital phytohormone regulating plant growth and development. During the "Green Revolution", modification of GA-related genes created a semi-dwarf phenotype in cereal crops but adversely affected grain weight. Gibberellin 2-oxidases (GA2oxs) in barley act as key catabolic enzymes that deactivate GA, but their functions remain poorly understood.
OBJECTIVES: This study investigates the physiological function of two HvGA2ox genes in barley and identifies novel semi-dwarf alleles with minimal impact on other agronomic traits.
METHODS: Virus-induced gene silencing and CRISPR/Cas9 technology were used to manipulate the expression of HvGA2ox9 and HvGA2ox8a in barley, and RNA-seq was conducted to compare the transcriptomes of wild type and mutants. Field trials in multiple environments were also performed to detect the functional haplotypes.
RESULTS: Ten GA2oxs were distinctly expressed in shoot, tiller, inflorescence, grain, embryo and root. Knockdown of HvGA2ox9 did not affect plant height, while ga2ox8a mutants generated by CRISPR/Cas9 showed increased plant height and significantly altered seed width and weight due to an increased level of bioactive GA4. RNA-seq analysis revealed that the expression of genes involved in starch and sucrose metabolism was significantly decreased in the inflorescence of ga2ox8a mutants. Furthermore, haplotype analysis revealed that one naturally occurring HvGA2ox8a haplotype was associated with decreased plant height, early flowering, and wider and heavier seed.
CONCLUSION: Our results demonstrate the potential of manipulating GA2ox genes to fine-tune GA signalling and biofunctions in desired plant tissues and open a promising avenue for minimising the trade-off effects of Green Revolution semi-dwarfing genes on grain size and weight. This knowledge will promote the development of next-generation barley cultivars with better adaptation to a changing climate.
Affiliation(s)
- Jingye Cheng
- Tasmanian Institute of Agriculture, University of Tasmania, TAS, Australia; Western Crop Genetics Alliance, Food Futures Institute, School of Agriculture, Murdoch University, WA, Australia; Institute of Crop Science, Chinese Academy of Agricultural Sciences, Beijing, China
- Yong Jia
- Western Crop Genetics Alliance, Food Futures Institute, School of Agriculture, Murdoch University, WA, Australia; Agriculture and Food, Department of Primary Industries and Regional Development, South Perth, WA, Australia
- Camilla Hill
- Western Crop Genetics Alliance, Food Futures Institute, School of Agriculture, Murdoch University, WA, Australia
- Tianhua He
- Western Crop Genetics Alliance, Food Futures Institute, School of Agriculture, Murdoch University, WA, Australia
- Ke Wang
- Institute of Crop Science, Chinese Academy of Agricultural Sciences, Beijing, China
- Ganggang Guo
- Institute of Crop Science, Chinese Academy of Agricultural Sciences, Beijing, China
- Sergey Shabala
- Tasmanian Institute of Agriculture, University of Tasmania, TAS, Australia
- Meixue Zhou
- Tasmanian Institute of Agriculture, University of Tasmania, TAS, Australia.
- Yong Han
- Western Crop Genetics Alliance, Food Futures Institute, School of Agriculture, Murdoch University, WA, Australia; Agriculture and Food, Department of Primary Industries and Regional Development, South Perth, WA, Australia.
- Chengdao Li
- Western Crop Genetics Alliance, Food Futures Institute, School of Agriculture, Murdoch University, WA, Australia; Agriculture and Food, Department of Primary Industries and Regional Development, South Perth, WA, Australia.
7
Zhai Y, Zhou T, Wei Y, Zou Q, Wang Y. ReAlign-N: an integrated realignment approach for multiple nucleic acid sequence alignment, combining global and local realignments. NAR Genom Bioinform 2024; 6:lqae170. [PMID: 39703429 PMCID: PMC11655299 DOI: 10.1093/nargab/lqae170] [Received: 07/23/2024] [Revised: 10/05/2024] [Accepted: 11/19/2024] [Indexed: 12/21/2024] Open
Abstract
Ensuring accurate multiple sequence alignment (MSA) is essential for comprehensive biological sequence analysis. However, the complexity of evolutionary relationships often results in variations that generic alignment tools may not adequately address. Realignment is crucial to remedy this issue. Currently, there is a lack of realignment methods tailored for nucleic acid sequences, particularly for lengthy sequences. Thus, there is an urgent need for the development of realignment methods better suited to address these challenges. This study presents ReAlign-N, a realignment method explicitly designed for multiple nucleic acid sequence alignment. ReAlign-N integrates both global and local realignment strategies for improved accuracy. In the global realignment phase, ReAlign-N incorporates K-Band and innovative memory-saving technology into the dynamic programming approach, ensuring high efficiency and minimal memory requirements for large-scale realignment tasks. The local realignment stage employs full matching and entropy scoring methods to identify low-quality regions and conducts realignment through MAFFT. Experimental results demonstrate that ReAlign-N consistently outperforms initial alignments on simulated and real datasets. Furthermore, compared to ReformAlign, the only existing multiple nucleic acid sequence realignment tool, ReAlign-N exhibits shorter running times and occupies less memory space. The source code and test data for ReAlign-N are available on GitHub (https://github.com/malabz/ReAlign-N).
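Two ideas named in this abstract, banded ("K-Band") dynamic programming and entropy scoring of alignment columns, can be sketched generically as below. This is not ReAlign-N's implementation: the pairwise setting, the scoring values, the function names, and the entropy threshold are all assumptions made for illustration, and the tool's own memory-saving scheme and full-matching step are not reproduced.

```python
import math
from collections import Counter


def banded_alignment_score(a, b, k, match=1, mismatch=-1, gap=-1):
    """Global alignment score using only cells with |i - j| <= k.

    Keeps two rows at a time, each stored as a dict of in-band columns.
    """
    n, m = len(a), len(b)
    if abs(n - m) > k:
        raise ValueError("band too narrow to reach the final cell")
    neg_inf = float("-inf")
    prev = {j: j * gap for j in range(0, min(m, k) + 1)}
    for i in range(1, n + 1):
        curr = {}
        for j in range(max(0, i - k), min(m, i + k) + 1):
            if j == 0:
                curr[j] = i * gap  # first column: only gaps so far
                continue
            step = match if a[i - 1] == b[j - 1] else mismatch
            diag = prev.get(j - 1, neg_inf) + step
            up = prev.get(j, neg_inf) + gap      # out-of-band cells count as -inf
            left = curr.get(j - 1, neg_inf) + gap
            curr[j] = max(diag, up, left)
        prev = curr
    return prev[m]


def column_entropy(column):
    """Shannon entropy (bits) of the symbols in one alignment column."""
    counts = Counter(column)
    total = len(column)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def low_quality_columns(aligned_rows, threshold):
    """Indices of columns whose entropy exceeds an (assumed) threshold."""
    length = len(aligned_rows[0])
    flagged = []
    for col in range(length):
        column = [row[col] for row in aligned_rows]
        if column_entropy(column) > threshold:
            flagged.append(col)
    return flagged
```

Restricting the recurrence to a band of width 2k + 1 reduces the work per row from the full sequence length to the band width, which is the general reason banding helps on long sequences; the result equals the unrestricted optimum only when the best alignment stays inside the band.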
Affiliation(s)
- Yixiao Zhai
- Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, No.2006, Xiyuan Avenue, Pidu Zone, Chengdu 610054, China
- Institute of Digital Health, Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, No.1, Chengdian Road, Kecheng Zone, Quzhou 324003, China
- Tong Zhou
- Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, No.2006, Xiyuan Avenue, Pidu Zone, Chengdu 610054, China
- Institute of Digital Health, Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, No.1, Chengdian Road, Kecheng Zone, Quzhou 324003, China
- Yanming Wei
- Institute of Digital Health, Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, No.1, Chengdian Road, Kecheng Zone, Quzhou 324003, China
- School of Computer Science and Technology, Xidian University, No.266, Xifeng Road, Chang'an Zone, Xi’an 710071, China
- Quan Zou
- Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, No.2006, Xiyuan Avenue, Pidu Zone, Chengdu 610054, China
- Institute of Digital Health, Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, No.1, Chengdian Road, Kecheng Zone, Quzhou 324003, China
- Yansu Wang
- Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, No.2006, Xiyuan Avenue, Pidu Zone, Chengdu 610054, China
- Institute of Digital Health, Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, No.1, Chengdian Road, Kecheng Zone, Quzhou 324003, China
8
Cinquin O. Steering veridical large language model analyses by correcting and enriching generated database queries: first steps toward ChatGPT bioinformatics. Brief Bioinform 2025; 26:bbaf045. [PMID: 39910777 PMCID: PMC11798674 DOI: 10.1093/bib/bbaf045] [Received: 10/01/2024] [Revised: 01/09/2025] [Accepted: 01/20/2025] [Indexed: 02/07/2025] Open
Abstract
Large language models (LLMs) leverage factual knowledge from pretraining. Yet this knowledge remains incomplete and sometimes challenging to retrieve, especially in scientific domains not extensively covered in pretraining datasets and where information is still evolving. Here, we focus on genomics and bioinformatics. We confirm and expand upon issues with plain ChatGPT functioning as a bioinformatics assistant. Poor data retrieval and hallucination lead ChatGPT to err, as do incorrect sequence manipulations. To address this, we propose a system basing LLM outputs on up-to-date, authoritative facts and facilitating LLM-guided data analysis. Specifically, we introduce NagGPT, a middleware tool to insert between LLMs and databases, designed to bridge gaps in LLM knowledge and usage of database application programming interfaces. NagGPT proxies LLM-generated database queries, with special handling of incorrect queries. It acts as a gatekeeper between query responses and the LLM prompt, redirecting large responses to files but providing a synthesized snippet and injecting comments to steer the LLM. A companion OpenAI custom GPT, Genomics Fetcher-Analyzer, connects ChatGPT with NagGPT. It steers ChatGPT to generate and run Python code, performing bioinformatics tasks on data dynamically retrieved from a dozen common genomics databases (e.g. NCBI, Ensembl, UniProt, WormBase, and FlyBase). We implement partial mitigations for encountered challenges: detrimental interactions between code generation style and data analysis, confusion between database identifiers, and hallucination of both data and actions taken. Our results identify avenues to augment ChatGPT as a bioinformatics assistant and, more broadly, to improve factual accuracy and instruction following of unmodified LLMs.
Affiliation(s)
- Olivier Cinquin
- Department of Developmental and Cell Biology, Center for Complex Biological Systems, University of California at Irvine, 4203 McGaugh Hall, Irvine, CA 92697, USA
9
Zhou Y, Song L, Li H. Full-resolution HLA and KIR gene annotations for human genome assemblies. Genome Res 2024; 34:1931-1941. [PMID: 38839374 PMCID: PMC11610593 DOI: 10.1101/gr.278985.124] [Received: 01/12/2024] [Accepted: 05/22/2024] [Indexed: 06/07/2024]
Abstract
The human leukocyte antigen (HLA) genes and the killer cell immunoglobulin-like receptor (KIR) genes are critical to immune responses and are associated with many immune-related diseases. Because these genes are located in highly polymorphic regions, they are difficult to study with traditional short-read alignment-based methods. Although modern long-read assemblers can often assemble these genes, using existing tools to annotate HLA and KIR genes in these assemblies remains a nontrivial task. Here, we describe Immuannot, a new computational tool to annotate the gene structures of HLA and KIR genes and to type the allele of each gene. Applying Immuannot to 56 regional and 212 whole-genome assemblies from previous studies, we annotated 9931 HLA and KIR genes and found that almost half of these genes, 4068, have novel sequences compared with the current Immuno Polymorphism Database (IPD). These novel gene sequences are represented by 2664 distinct alleles, some of which contain nonsynonymous variations, resulting in 92 novel protein sequences. We demonstrate the complex haplotype structures at the two loci and report the linkage between HLA/KIR haplotypes and gene alleles. We anticipate that Immuannot will speed up the discovery of new HLA/KIR alleles and enable the association of HLA/KIR haplotype structures with clinical outcomes in the future.
Affiliation(s)
- Ying Zhou
- Department of Data Science, Dana-Farber Cancer Institute, Boston, Massachusetts 02115, USA
- Li Song
- Department of Biomedical Data Science, Dartmouth College, Hanover, New Hampshire 03755, USA
- Heng Li
- Department of Data Science, Dana-Farber Cancer Institute, Boston, Massachusetts 02115, USA;
- Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts 02115, USA
10
Khamespanah E, Asad S, Vanak Z, Mehrshad M. Niche-Aware Metagenomic Screening for Enzyme Methioninase Illuminates Its Contribution to Metabolic Syntrophy. Microb Ecol 2024; 87:141. [PMID: 39546027 PMCID: PMC11568061 DOI: 10.1007/s00248-024-02458-0] [Received: 06/16/2024] [Accepted: 11/01/2024] [Indexed: 11/17/2024]
Abstract
The single-step methioninase-mediated degradation of methionine (a sulfur-containing amino acid) is a reaction at the interface of carbon, nitrogen, sulfur, and methane metabolism in microbes. This enzyme also has therapeutic application due to its role in starving auxotrophic cancer cells. Applying our refined in silico screening pipeline to 33,469 publicly available genome assemblies and 1878 metagenome-assembled genomes/single-cell amplified genomes from brackish waters of the Caspian Sea and the Fennoscandian Shield deep groundwater resulted in the recovery of 1845 methioninases. The majority of recovered methioninases belong to representatives of the phyla Proteobacteria (50%), Firmicutes (29%), and Firmicutes_A (13%). The prevalence of methioninase among anaerobic microbes and in the anoxic deep groundwater, together with the relevance of its products for energy conservation in anaerobic metabolism, highlights such environments as desirable targets for screening novel methioninases and resolving its contribution to microbial metabolism and interactions. Among archaea, the majority of detected methioninases are from representatives of Methanosarcina that are able to use methanethiol, the sulfur-containing product of methionine degradation, as a precursor for methanogenesis. Branching just outside these archaeal methioninases in the phylogenetic tree, we recovered three methioninases belonging to representatives of Patescibacteria reconstructed from deep groundwater metagenomes. We hypothesize that methioninase in Patescibacteria could contribute to their syntrophic interactions, where their methanogenic partners/hosts benefit from the produced 2-oxobutyrate and methanethiol. Our results underscore the significance of accounting for the specific ecological niche when screening for enzyme variants with desired characteristics. Finally, complementing our findings with experimental validation of methioninase activity confirms the potential of our in silico screening in clarifying the peculiar ecological role of methioninase in anoxic environments.
Affiliation(s)
- Erfan Khamespanah
  - Department of Biotechnology, College of Science, University of Tehran, Tehran, Iran
- Sedigheh Asad
  - Department of Biotechnology, College of Science, University of Tehran, Tehran, Iran.
- Zeynab Vanak
  - Department of Biotechnology, College of Science, University of Tehran, Tehran, Iran
- Maliheh Mehrshad
  - Department of Aquatic Sciences and Assessment, Swedish University of Agricultural Sciences, 75007, Uppsala, Sweden.
|
11
|
Horvath M, Schrofel A, Kowalska K, Sabo J, Vlasak J, Nourisanami F, Sobol M, Pinkas D, Knapp K, Koupilova N, Novacek J, Veverka V, Lansky Z, Rozbesky D. Structural basis of MICAL autoinhibition. Nat Commun 2024; 15:9810. [PMID: 39532862 PMCID: PMC11557892 DOI: 10.1038/s41467-024-54131-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/15/2024] [Accepted: 11/01/2024] [Indexed: 11/16/2024] Open
Abstract
MICAL proteins play a crucial role in cellular dynamics by binding and disassembling actin filaments, impacting processes like axon guidance, cytokinesis, and cell morphology. Their cellular activity is tightly controlled, as dysregulation can lead to detrimental effects on cellular morphology. Although previous studies have suggested that MICALs are autoinhibited, and require Rab proteins to become active, the detailed molecular mechanisms remained unclear. Here, we report the cryo-EM structure of human MICAL1 at a nominal resolution of 3.1 Å. Structural analyses, alongside biochemical and functional studies, show that MICAL1 autoinhibition is mediated by an intramolecular interaction between its N-terminal catalytic and C-terminal coiled-coil domains, blocking F-actin interaction. Moreover, we demonstrate that allosteric changes in the coiled-coil domain and the binding of the tripartite assembly of CH-L2α1-LIM domains to the coiled-coil domain are crucial for MICAL activation and autoinhibition. These mechanisms appear to be evolutionarily conserved, suggesting a potential universality across the MICAL family.
Affiliation(s)
- Matej Horvath
  - Department of Cell Biology, Faculty of Science, Charles University, Prague, Czechia
  - Institute of Molecular Genetics of the Czech Academy of Sciences, Prague, Czechia
- Adam Schrofel
  - Department of Cell Biology, Faculty of Science, Charles University, Prague, Czechia
  - Institute of Molecular Genetics of the Czech Academy of Sciences, Prague, Czechia
- Karolina Kowalska
  - Department of Cell Biology, Faculty of Science, Charles University, Prague, Czechia
  - Institute of Molecular Genetics of the Czech Academy of Sciences, Prague, Czechia
- Jan Sabo
  - Institute of Biotechnology of the Czech Academy of Sciences, Prague, Czechia
- Jonas Vlasak
  - Department of Cell Biology, Faculty of Science, Charles University, Prague, Czechia
  - Institute of Molecular Genetics of the Czech Academy of Sciences, Prague, Czechia
- Farahdokht Nourisanami
  - Department of Cell Biology, Faculty of Science, Charles University, Prague, Czechia
  - Institute of Molecular Genetics of the Czech Academy of Sciences, Prague, Czechia
- Margarita Sobol
  - Department of Cell Biology, Faculty of Science, Charles University, Prague, Czechia
  - Institute of Molecular Genetics of the Czech Academy of Sciences, Prague, Czechia
- Daniel Pinkas
  - Central European Institute of Technology, Masaryk University, Brno, Czechia
- Krystof Knapp
  - Department of Cell Biology, Faculty of Science, Charles University, Prague, Czechia
  - Institute of Molecular Genetics of the Czech Academy of Sciences, Prague, Czechia
- Nicola Koupilova
  - Department of Cell Biology, Faculty of Science, Charles University, Prague, Czechia
  - Institute of Molecular Genetics of the Czech Academy of Sciences, Prague, Czechia
- Jiri Novacek
  - Central European Institute of Technology, Masaryk University, Brno, Czechia
- Vaclav Veverka
  - Department of Cell Biology, Faculty of Science, Charles University, Prague, Czechia
  - Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Prague, Czechia
- Zdenek Lansky
  - Institute of Biotechnology of the Czech Academy of Sciences, Prague, Czechia
- Daniel Rozbesky
  - Department of Cell Biology, Faculty of Science, Charles University, Prague, Czechia.
  - Institute of Molecular Genetics of the Czech Academy of Sciences, Prague, Czechia.
|
12
|
Bolognini D, Halgren A, Lou RN, Raveane A, Rocha JL, Guarracino A, Soranzo N, Chin CS, Garrison E, Sudmant PH. Recurrent evolution and selection shape structural diversity at the amylase locus. Nature 2024; 634:617-625. [PMID: 39232174 PMCID: PMC11485256 DOI: 10.1038/s41586-024-07911-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/29/2023] [Accepted: 08/06/2024] [Indexed: 09/06/2024]
Abstract
The adoption of agriculture triggered a rapid shift towards starch-rich diets in human populations. Amylase genes facilitate starch digestion, and increased amylase copy number has been observed in some modern human populations with high-starch intake, although evidence of recent selection is lacking. Here, using 94 long-read haplotype-resolved assemblies and short-read data from approximately 5,600 contemporary and ancient humans, we resolve the diversity and evolutionary history of structural variation at the amylase locus. We find that amylase genes have higher copy numbers in agricultural populations than in fishing, hunting and pastoral populations. We identify 28 distinct amylase structural architectures and demonstrate that nearly identical structures have arisen recurrently on different haplotype backgrounds throughout recent human history. AMY1 and AMY2A genes each underwent multiple duplication/deletion events with mutation rates up to more than 10,000-fold the single-nucleotide polymorphism mutation rate, whereas AMY2B gene duplications share a single origin. Using a pangenome-based approach, we infer structural haplotypes across thousands of humans identifying extensively duplicated haplotypes at higher frequency in modern agricultural populations. Leveraging 533 ancient human genomes, we find that duplication-containing haplotypes (with more gene copies than the ancestral haplotype) have rapidly increased in frequency over the past 12,000 years in West Eurasians, suggestive of positive selection. Together, our study highlights the potential effects of the agricultural revolution on human genomes and the importance of structural variation in human adaptation.
Affiliation(s)
- Alma Halgren
  - Department of Integrative Biology, University of California Berkeley, Berkeley, CA, USA
- Runyang Nicolas Lou
  - Department of Integrative Biology, University of California Berkeley, Berkeley, CA, USA
- Joana L Rocha
  - Department of Integrative Biology, University of California Berkeley, Berkeley, CA, USA
- Andrea Guarracino
  - Department of Genetics, Genomics, and Informatics, University of Tennessee Health Science Center, Memphis, TN, USA
- Nicole Soranzo
  - Human Technopole, Milan, Italy
  - Wellcome Sanger Institute, Hinxton, UK
  - National Institute for Health Research Blood and Transplant Research Unit in Donor Health and Genomics, University of Cambridge, Cambridge, UK
  - Department of Haematology, Cambridge Biomedical Campus, Cambridge, UK
  - British Heart Foundation Centre of Research Excellence, University of Cambridge, Cambridge, UK
- Chen-Shan Chin
  - Foundation for Biological Data Science, Belmont, CA, USA
- Erik Garrison
  - Department of Genetics, Genomics, and Informatics, University of Tennessee Health Science Center, Memphis, TN, USA.
- Peter H Sudmant
  - Department of Integrative Biology, University of California Berkeley, Berkeley, CA, USA.
  - Center for Computational Biology, University of California Berkeley, Berkeley, CA, USA.
|
13
|
Qin Z, Yuan B, Qu G, Sun Z. Rational enzyme design by reducing the number of hotspots and library size. Chem Commun (Camb) 2024; 60:10451-10463. [PMID: 39210728 DOI: 10.1039/d4cc01394h] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/04/2024]
Abstract
Biocatalysts that are eco-friendly, sustainable, and highly specific have great potential for applications in the production of fine chemicals, food, detergents, biofuels, pharmaceuticals, and more. However, due to factors such as low activity, narrow substrate scope, poor thermostability, or incorrect selectivity, most natural enzymes cannot be directly used for large-scale production of the desired products. To overcome these obstacles, protein engineering methods have been developed over decades and have become powerful and versatile tools for adapting enzymes with improved catalytic properties or new functions. The vastness of the protein sequence space makes screening a bottleneck in obtaining advantageous mutated enzymes in traditional directed evolution. In the realm of mathematics, there are two major constraints in the protein sequence space: (1) the number of residue substitutions (M); and (2) the number of codons encoding amino acids as building blocks (N). This feature review highlights protein engineering strategies to reduce screening efforts from two dimensions by reducing the numbers M and N, and also discusses representative seminal studies of rationally engineered natural enzymes to deliver new catalytic functions.
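The effect of the two constraints M and N on screening effort can be illustrated with a short calculation. This is a minimal sketch, not the review's own method: it assumes the library size is N raised to the power M, and it uses the common oversampling estimate T = -V ln(1 - F) for the number of clones needed to cover a fraction F of V variants, which the abstract does not state. The codon counts (32 for an NNK codon, 12 for an NDT codon) and the function names are illustrative choices.

```python
import math


def library_size(n_codons: int, m_sites: int) -> int:
    """Number of distinct codon combinations when m_sites positions
    are randomized simultaneously with n_codons codons each."""
    return n_codons ** m_sites


def clones_for_coverage(n_variants: int, coverage: float = 0.95) -> int:
    """Clones to screen so that roughly `coverage` of all variants are
    sampled at least once (assumed estimate T = -V * ln(1 - F))."""
    return math.ceil(-n_variants * math.log(1.0 - coverage))


if __name__ == "__main__":
    # Compare a 32-codon (NNK) and a 12-codon (NDT) alphabet at 1-4 sites.
    for m_sites in (1, 2, 3, 4):
        for label, n_codons in (("NNK", 32), ("NDT", 12)):
            size = library_size(n_codons, m_sites)
            effort = clones_for_coverage(size)
            print(f"M={m_sites} {label}: {size} variants, ~{effort} clones")
```

Under these assumptions, lowering either the number of randomized sites or the codon alphabet shrinks the library multiplicatively, which is why both dimensions matter for screening effort.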
Affiliation(s)
- Zongmin Qin
  - University of Chinese Academy of Sciences, Beijing 100049, China
  - Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin 300308, China.
  - National Center of Technology Innovation for Synthetic Biology, Tianjin 300308, China
- Bo Yuan
  - University of Chinese Academy of Sciences, Beijing 100049, China
  - Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin 300308, China.
  - National Center of Technology Innovation for Synthetic Biology, Tianjin 300308, China
  - Key Laboratory of Engineering Biology for Low-Carbon Manufacturing, Tianjin 300308, China
- Ge Qu
  - University of Chinese Academy of Sciences, Beijing 100049, China
  - Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin 300308, China.
  - National Center of Technology Innovation for Synthetic Biology, Tianjin 300308, China
  - Key Laboratory of Engineering Biology for Low-Carbon Manufacturing, Tianjin 300308, China
- Zhoutong Sun
  - University of Chinese Academy of Sciences, Beijing 100049, China
  - Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin 300308, China.
  - National Center of Technology Innovation for Synthetic Biology, Tianjin 300308, China
  - Key Laboratory of Engineering Biology for Low-Carbon Manufacturing, Tianjin 300308, China
|
14
|
Santoni D. An entropy-based study on the mutational landscape of SARS-CoV-2 in USA: Comparing different variants and revealing co-mutational behavior of proteins. Gene 2024; 922:148556. [PMID: 38754568 DOI: 10.1016/j.gene.2024.148556] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/21/2024] [Revised: 05/08/2024] [Accepted: 05/09/2024] [Indexed: 05/18/2024]
Abstract
The COVID-19 emergency has pushed the international scientific community to use every resource to combat the spread of the virus, to understand its biology and to predict its possible evolution in terms of new variants. Since the first SARS-CoV-2 nucleotide and amino acid sequences were made available, information theory has been used to study how viral information content was changing over time and then to trace the evolution of its mutational landscape. In this work we analyzed SARS-CoV-2 sequences collected mainly in the USA from March 2020 until December 2022 and computed mutation profiles of viral proteins over time through an entropy-based approach using Shannon entropy and Hellinger distance. This representation allows an at-a-glance view of the mutational landscape of viral proteins over time and can provide new insights into the evolution of the virus from different points of view. Non-structural proteins typically showed flat mutation profiles, characterized by a very low Average mutation Entropy, while accessory and structural proteins showed mostly non-uniform and high mutation profiles, often coupled with the predominance of variants. Interestingly, the NSP2 protein, whose function is still debated, falls in the same branch as NSP14 and NSP10 in the phylogenetic tree of mutations constructed through correlations of mutation profiles, suggesting a co-evolution of those proteins and a possible functional link with each other. To the best of our knowledge this is the first study based on a massive amount of data (n = 107,939,973) that analyzes from an entropy point of view the mutational landscape of SARS-CoV-2 over time and depicts a mutational temporal profile of each protein of the virus.
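The two measures named in this abstract can be sketched for a single alignment position. This is an illustrative sketch only, not the paper's pipeline: it assumes each position is summarized by the residues observed in a sample of sequences, and the function names and toy residue strings are hypothetical.

```python
import math
from collections import Counter


def frequencies(residues):
    """Relative frequency of each residue observed at one position."""
    counts = Counter(residues)
    total = sum(counts.values())
    return {aa: c / total for aa, c in counts.items()}


def shannon_entropy(residues):
    """Shannon entropy in bits: sum over residues of p * log2(1 / p)."""
    return sum(p * math.log2(1.0 / p) for p in frequencies(residues).values())


def hellinger_distance(p, q):
    """Hellinger distance between two discrete distributions (dicts).
    It is 0 for identical distributions and 1 for disjoint ones."""
    keys = set(p) | set(q)
    squared = sum(
        (math.sqrt(p.get(k, 0.0)) - math.sqrt(q.get(k, 0.0))) ** 2 for k in keys
    )
    return math.sqrt(squared / 2.0)


if __name__ == "__main__":
    early = "DDDDDDDD"   # toy sample: a fully conserved position
    later = "DDDDGGGG"   # toy sample: a second residue reaches 50%
    print(shannon_entropy(early), shannon_entropy(later))
    print(hellinger_distance(frequencies(early), frequencies(later)))
```

A conserved position gives an entropy of zero, and the Hellinger distance grows as the residue distribution at that position shifts between two time windows.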
Affiliation(s)
- Daniele Santoni
  - Institute for System Analysis and Computer Science "Antonio Ruberti", National Research Council of Italy, Via dei Taurini 19, Rome 00185, Italy.
|
15
|
Becker F, Stanke M. learnMSA2: deep protein multiple alignments with large language and hidden Markov models. Bioinformatics 2024; 40:ii79-ii86. [PMID: 39230690 PMCID: PMC11373405 DOI: 10.1093/bioinformatics/btae381] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/05/2024] Open
Abstract
MOTIVATION For the alignment of large numbers of protein sequences, tools are predominant that decide to align two residues using only simple prior knowledge, e.g. amino acid substitution matrices, and using only part of the available data. The accuracy of state-of-the-art programs declines with decreasing sequence identity and when increasingly large numbers of sequences are aligned. Recently, transformer-based deep-learning models started to harness the vast amount of protein sequence data, resulting in powerful pretrained language models with the main purpose of generating high-dimensional numerical representations, embeddings, for individual sites that agglomerate evolutionary, structural, and biophysical information. RESULTS We extend the traditional profile hidden Markov model so that it takes as inputs unaligned protein sequences and the corresponding embeddings. We fit the model with gradient descent using our existing differentiable hidden Markov layer. All sequences and their embeddings are jointly aligned to a model of the protein family. We report that our upgraded HMM-based aligner, learnMSA2, combined with the ProtT5-XL protein language model aligns on average almost 6% points more columns correctly than the best amino acid-based competitor and scales well with sequence number. The relative advantage of learnMSA2 over other programs tends to be greater when the sequence identity is lower and when the number of sequences is larger. Our results strengthen the evidence on the rich information contained in protein language models' embeddings and their potential downstream impact on the field of bioinformatics. Availability and implementation: https://github.com/Gaius-Augustus/learnMSA, PyPI and Bioconda, evaluation: https://github.com/felbecker/snakeMSA.
Affiliation(s)
- Felix Becker
  - Institute of Mathematics and Computer Science, University of Greifswald, 17489 Greifswald, Germany
- Mario Stanke
  - Institute of Mathematics and Computer Science, University of Greifswald, 17489 Greifswald, Germany
|
16
|
Camargo AP, Roux S, Schulz F, Babinski M, Xu Y, Hu B, Chain PSG, Nayfach S, Kyrpides NC. Identification of mobile genetic elements with geNomad. Nat Biotechnol 2024; 42:1303-1312. [PMID: 37735266 PMCID: PMC11324519 DOI: 10.1038/s41587-023-01953-y] [Citation(s) in RCA: 207] [Impact Index Per Article: 207.0] [Received: 03/06/2023] [Accepted: 08/17/2023] [Indexed: 09/23/2023]
Abstract
Identifying and characterizing mobile genetic elements in sequencing data is essential for understanding their diversity, ecology, biotechnological applications and impact on public health. Here we introduce geNomad, a classification and annotation framework that combines information from gene content and a deep neural network to identify sequences of plasmids and viruses. geNomad uses a dataset of more than 200,000 marker protein profiles to provide functional gene annotation and taxonomic assignment of viral genomes. Using a conditional random field model, geNomad also detects proviruses integrated into host genomes with high precision. In benchmarks, geNomad achieved high classification performance for diverse plasmids and viruses (Matthews correlation coefficient of 77.8% and 95.3%, respectively), substantially outperforming other tools. Leveraging geNomad's speed and scalability, we processed over 2.7 trillion base pairs of sequencing data, leading to the discovery of millions of viruses and plasmids that are available through the IMG/VR and IMG/PR databases. geNomad is available at https://portal.nersc.gov/genomad.
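For readers unfamiliar with the benchmark metric quoted above, the Matthews correlation coefficient can be computed from a binary confusion matrix as follows. This is a generic sketch of the standard formula, not geNomad code; the function name and the example counts are made up for illustration.

```python
import math


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient for a binary classifier.
    Ranges from -1 (total disagreement) to +1 (perfect prediction);
    returns 0.0 when a marginal total is zero and the ratio is undefined."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator


if __name__ == "__main__":
    # Hypothetical counts: 90 true viral hits, 10 missed,
    # 5 false calls among 900 non-viral sequences.
    score = matthews_corrcoef(tp=90, tn=895, fp=5, fn=10)
    print(f"MCC = {score:.3f}")
```

Because the coefficient uses all four cells of the confusion matrix, it stays informative when the classes are heavily imbalanced, as is typical when few sequences in a dataset are plasmids or viruses.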
Affiliation(s)
- Antonio Pedro Camargo
  - DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, USA.
- Simon Roux
  - DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Frederik Schulz
  - DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Michal Babinski
  - Bioscience Division, Los Alamos National Laboratory, Los Alamos, NM, USA
- Yan Xu
  - Bioscience Division, Los Alamos National Laboratory, Los Alamos, NM, USA
- Bin Hu
  - Bioscience Division, Los Alamos National Laboratory, Los Alamos, NM, USA
- Patrick S G Chain
  - Bioscience Division, Los Alamos National Laboratory, Los Alamos, NM, USA
- Stephen Nayfach
  - DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Nikos C Kyrpides
  - DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, USA.
|
17
|
Birtles D, Guiyab L, Abbas W, Lee J. Positive residues of the SARS-CoV-2 fusion domain are key contributors to the initiation of membrane fusion. J Biol Chem 2024; 300:107564. [PMID: 39002677 PMCID: PMC11357847 DOI: 10.1016/j.jbc.2024.107564] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/09/2024] [Revised: 06/25/2024] [Accepted: 07/09/2024] [Indexed: 07/15/2024] Open
Abstract
SARS-CoV-2 is one of the most infectious viruses ever recorded. Despite a plethora of research over the last several years, the viral life cycle is still not well understood, particularly membrane fusion. This process is initiated by the fusion domain (FD), a highly conserved stretch of amino acids consisting of a fusion peptide (FP) and fusion loop (FL), which in synergy perturb the target cell's lipid membrane to lower the energetic cost necessary for fusion. In this study, through a mutagenesis-based approach, we have investigated the basic residues within the FD (K825, K835, R847, K854) utilizing an in vitro fusion assay and 19F NMR, validated by traditional 13C 15N techniques. Alanine and charge-conserving mutants revealed that every basic residue plays a highly specific role within the mechanism of initiating fusion. Intriguingly, K825A led to increased fusogenicity, which was found to be correlated with the number of amino acids within helix one, further implicating the role of this specific helix within the FD's fusion mechanism. This work has found basic residues to be important within the FD's fusion mechanism and highlights K825A, a specific mutation made within the FD of the SARS-CoV-2 spike protein, as requiring further investigation due to its potential to contribute to a more virulent strain of SARS-CoV-2.
Affiliation(s)
- Daniel Birtles
  - Department of Chemistry and Biochemistry, University of Maryland, College Park, Maryland, USA
- Lijon Guiyab
  - Department of Chemistry and Biochemistry, University of Maryland, College Park, Maryland, USA
- Wafa Abbas
  - Department of Chemistry and Biochemistry, University of Maryland, College Park, Maryland, USA
- Jinwoo Lee
  - Department of Chemistry and Biochemistry, University of Maryland, College Park, Maryland, USA.
|
18
|
Tan Y, Scornet AL, Yap MNF, Zhang D. Machine learning-based classification reveals distinct clusters of non-coding genomic allelic variations associated with Erm-mediated antibiotic resistance. mSystems 2024; 9:e0043024. [PMID: 38953319 PMCID: PMC11264731 DOI: 10.1128/msystems.00430-24] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/25/2024] [Accepted: 06/05/2024] [Indexed: 07/04/2024] Open
Abstract
The erythromycin resistance RNA methyltransferase (erm) confers cross-resistance to all therapeutically important macrolides, lincosamides, and streptogramins (MLS phenotype). The expression of erm is often induced by the macrolide-mediated ribosome stalling in the upstream co-transcribed leader sequence, thereby triggering a conformational switch of the intergenic RNA hairpins to allow the translational initiation of erm. We investigated the evolutionary emergence of the upstream erm regulatory elements and the impact of allelic variation on erm expression and the MLS phenotype. Through systematic profiling of the upstream regulatory sequences across all known erm operons, we observed that specific erm subfamilies, such as ermB and ermC, have independently evolved distinct configurations of small upstream ORFs and palindromic repeats. A population-wide genomic analysis of the upstream ermB regions revealed substantial non-random allelic variation at numerous positions. Utilizing machine learning-based classification coupled with RNA structure modeling, we found that many alleles cooperatively influence the stability of alternative RNA hairpin structures formed by the palindromic repeats, which, in turn, affects the inducibility of ermB expression and MLS phenotypes. Subsequent experimental validation of 11 randomly selected variants demonstrated an impressive 91% accuracy in predicting MLS phenotypes. Furthermore, we uncovered a mixed distribution of MLS-sensitive and MLS-resistant ermB loci within the evolutionary tree, indicating repeated and independent evolution of MLS resistance. Taken together, this study not only elucidates the evolutionary processes driving the emergence and development of MLS resistance but also highlights the potential of using non-coding genomic allele data to predict antibiotic resistance phenotypes. 
IMPORTANCE Antibiotic resistance (AR) poses a global health threat as the efficacy of available antibiotics has rapidly eroded due to the widespread transmission of AR genes. Using Erm-dependent MLS resistance as a model, this study highlights the significance of non-coding genomic allelic variations. Through a comprehensive analysis of upstream regulatory elements within the erm family, we elucidated the evolutionary emergence and development of AR mechanisms. Leveraging population-wide machine learning (ML)-based genomic analysis, we transformed substantial non-random allelic variations into discernible clusters of elements, enabling precise prediction of MLS phenotypes from non-coding regions. These findings offer deeper insight into AR evolution and demonstrate the potential of harnessing non-coding genomic allele data for accurately predicting AR phenotypes.
Affiliation(s)
- Yongjun Tan
  - Department of Biology, College of Arts and Sciences, Saint Louis University, St. Louis, Missouri, USA
- Alexandre Le Scornet
  - Department of Microbiology-Immunology, Northwestern University Feinberg School of Medicine, Chicago, Illinois, USA
- Mee-Ngan Frances Yap
  - Department of Microbiology-Immunology, Northwestern University Feinberg School of Medicine, Chicago, Illinois, USA
- Dapeng Zhang
  - Department of Biology, College of Arts and Sciences, Saint Louis University, St. Louis, Missouri, USA
  - Program of Bioinformatics and Computational Biology, Saint Louis University, St. Louis, Missouri, USA
|
19
|
Madeira F, Madhusoodanan N, Lee J, Eusebi A, Niewielska A, Tivey ARN, Lopez R, Butcher S. The EMBL-EBI Job Dispatcher sequence analysis tools framework in 2024. Nucleic Acids Res 2024; 52:W521-W525. [PMID: 38597606 PMCID: PMC11223882 DOI: 10.1093/nar/gkae241] [Citation(s) in RCA: 321] [Impact Index Per Article: 321.0] [Received: 01/30/2024] [Revised: 02/29/2024] [Accepted: 03/26/2024] [Indexed: 04/11/2024] Open
Abstract
The EMBL-EBI Job Dispatcher sequence analysis tools framework (https://www.ebi.ac.uk/jdispatcher) enables the scientific community to perform a diverse range of sequence analyses using popular bioinformatics applications. Free access to the tools and required sequence datasets is provided through user-friendly web applications, as well as via RESTful and SOAP-based APIs. These are integrated into popular EMBL-EBI resources such as UniProt, InterPro, ENA and Ensembl Genomes. This paper overviews recent improvements to Job Dispatcher, including its brand new website and documentation, enhanced visualisations, improved job management, and a rising trend of user reliance on the service from low- and middle-income regions.
Affiliation(s)
- Fábio Madeira
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Nandana Madhusoodanan
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Joonheung Lee
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Alberto Eusebi
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Ania Niewielska
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Adrian R N Tivey
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Rodrigo Lopez
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Sarah Butcher
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
|
20
|
Ravi J, Anantharaman V, Chen SZ, Brenner EP, Datta P, Aravind L, Gennaro ML. The phage shock protein (PSP) envelope stress response: discovery of novel partners and evolutionary history. mSystems 2024; 9:e0084723. [PMID: 38809013 PMCID: PMC11237479 DOI: 10.1128/msystems.00847-23] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/28/2023] [Accepted: 03/20/2024] [Indexed: 05/30/2024] Open
Abstract
Bacterial phage shock protein (PSP) systems stabilize the bacterial cell membrane and protect against envelope stress. These systems have been associated with virulence, but despite their critical roles, PSP components are not well characterized outside proteobacteria. Using comparative genomics and protein sequence-structure-function analyses, we systematically identified and analyzed PSP homologs, phyletic patterns, domain architectures, and gene neighborhoods. This approach underscored the evolutionary significance of the system, revealing that its core protein PspA (Snf7 in ESCRT outside bacteria) was present in the last universal common ancestor and that this ancestral functionality has since diversified into multiple novel, distinct PSP systems across life. Several novel partners of the PSP system were identified: (i) the Toastrack domain, likely facilitating assembly of sub-membrane stress-sensing and signaling complexes, (ii) the newly defined HTH-associated α-helical signaling domain-PadR-like transcriptional regulator pair system, and (iii) multiple independent associations with ATPase, CesT/Tir-like chaperone, and Band-7 domains in proteins thought to mediate sub-membrane dynamics. Our work also uncovered links between the PSP components and other domains, such as novel variants of SHOCT-like domains, suggesting roles in assembling membrane-associated complexes of proteins with disparate biochemical functions. Results are available at our interactive web app, https://jravilab.org/psp.
IMPORTANCE Phage shock proteins (PSP) are virulence-associated, cell membrane stress-protective systems. They have mostly been characterized in Proteobacteria and Firmicutes. We now show that a minimal PSP system was present in the last universal common ancestor that evolved and diversified into newly identified functional contexts. Recognizing the conservation and evolution of PSP systems across bacterial phyla contributes to our understanding of stress response mechanisms in prokaryotes. Moreover, the newly discovered PSP modularity will likely prompt new studies of lineage-specific cell envelope structures, lifestyles, and adaptation mechanisms. Finally, our results validate the use of domain architecture and genetic context for discovery in comparative genomics.
Collapse
Affiliation(s)
- Janani Ravi
- Department of Biomedical Informatics, University of Colorado Anschutz Medical Campus, Aurora, Colorado, USA
- Public Health Research Institute, Rutgers New Jersey Medical School, Newark, New Jersey, USA
| | - Vivek Anantharaman
- National Center for Biotechnology Information, National Institutes of Health, Bethesda, Maryland, USA
| | - Samuel Zorn Chen
- Computer Science Engineering Undergraduate Program, Michigan State University, East Lansing, Michigan, USA
| | - Evan Pierce Brenner
- Department of Biomedical Informatics, University of Colorado Anschutz Medical Campus, Aurora, Colorado, USA
| | - Pratik Datta
- Public Health Research Institute, Rutgers New Jersey Medical School, Newark, New Jersey, USA
| | - L. Aravind
- National Center for Biotechnology Information, National Institutes of Health, Bethesda, Maryland, USA
| | - Maria Laura Gennaro
- Public Health Research Institute, Rutgers New Jersey Medical School, Newark, New Jersey, USA
| |
|
21
|
Bai G, Zeng X, Zhang L, Wang Y, Ma B. Computational investigation of the inhibitory interaction of IRF3 and SARS-CoV-2 accessory protein ORF3b. Biochem Biophys Res Commun 2024; 712-713:149945. [PMID: 38640732 DOI: 10.1016/j.bbrc.2024.149945] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/30/2024] [Accepted: 04/14/2024] [Indexed: 04/21/2024]
Abstract
ORF3b is one of the SARS-CoV-2 accessory proteins. A previous experimental study suggested that ORF3b prevents IRF3 from translocating to the nucleus. However, the biophysical mechanism of the ORF3b-IRF3 interaction remains elusive. Here, we explored the conformational ensemble of ORF3b using all-atom replica exchange molecular dynamics simulation. Disordered ORF3b has mixed α-helix, β-turn and loop conformers. The potential ORF3b-IRF3 binding modes were searched by docking representative ORF3b conformers with IRF3, and 50 ORF3b-IRF3 complex poses were screened using molecular dynamics simulations ranging from 500 to 1000 ns. We found that ORF3b binds IRF3 predominantly on its CBP binding and phosphorylated pLxIS motifs, with the CBP binding site having the highest binding affinity. The ORF3b-IRF3 binding residues are highly conserved in SARS-CoV-2. Our results provide biophysical insights into the ORF3b-IRF3 interaction and explain its interferon antagonism mechanism.
Affiliation(s)
- Ganggang Bai
- Engineering Research Center of Cell & Therapeutic Antibody (MOE), School of Pharmacy, Shanghai Jiao Tong University, Shanghai, 200240, China
| | - Xincheng Zeng
- Engineering Research Center of Cell & Therapeutic Antibody (MOE), School of Pharmacy, Shanghai Jiao Tong University, Shanghai, 200240, China
| | - Linghao Zhang
- Engineering Research Center of Cell & Therapeutic Antibody (MOE), School of Pharmacy, Shanghai Jiao Tong University, Shanghai, 200240, China
| | - Yanjing Wang
- Engineering Research Center of Cell & Therapeutic Antibody (MOE), School of Pharmacy, Shanghai Jiao Tong University, Shanghai, 200240, China
| | - Buyong Ma
- Engineering Research Center of Cell & Therapeutic Antibody (MOE), School of Pharmacy, Shanghai Jiao Tong University, Shanghai, 200240, China.
| |
|
22
|
Bolognini D, Halgren A, Lou RN, Raveane A, Rocha JL, Guarracino A, Soranzo N, Chin J, Garrison E, Sudmant PH. Global diversity, recurrent evolution, and recent selection on amylase structural haplotypes in humans. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.02.07.579378. [PMID: 38370750 PMCID: PMC10871346 DOI: 10.1101/2024.02.07.579378] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/20/2024]
Abstract
The adoption of agriculture, first documented ~12,000 years ago in the Fertile Crescent, triggered a rapid shift toward starch-rich diets in human populations. Amylase genes facilitate starch digestion and increased salivary amylase copy number has been observed in some modern human populations with high starch intake, though evidence of recent selection is lacking. Here, using 52 long-read diploid assemblies and short read data from ~5,600 contemporary and ancient humans, we resolve the diversity, evolutionary history, and selective impact of structural variation at the amylase locus. We find that amylase genes have higher copy numbers in populations with agricultural subsistence compared to fishing, hunting, and pastoral groups. We identify 28 distinct amylase structural architectures and demonstrate that nearly identical structures have arisen recurrently on different haplotype backgrounds throughout recent human history. AMY1 and AMY2A genes each exhibit multiple duplications/deletions with mutation rates >10,000-fold the SNP mutation rate, whereas AMY2B gene duplications share a single origin. Using a pangenome graph-based approach to infer structural haplotypes across thousands of humans, we identify extensively duplicated haplotypes present at higher frequencies in modern day populations with traditionally agricultural diets. Leveraging 533 ancient human genomes we find that duplication-containing haplotypes (i.e. haplotypes with more amylase gene copies than the ancestral haplotype) have increased in frequency more than seven-fold over the last 12,000 years providing evidence for recent selection in West Eurasians. Together, our study highlights the potential impacts of the agricultural revolution on human genomes and the importance of long-read sequencing in identifying signatures of selection at structurally complex loci.
Affiliation(s)
| | - Alma Halgren
- Department of Integrative Biology, University of California Berkeley, Berkeley, USA
| | - Runyang Nicolas Lou
- Department of Integrative Biology, University of California Berkeley, Berkeley, USA
| | | | - Joana L Rocha
- Department of Integrative Biology, University of California Berkeley, Berkeley, USA
| | - Andrea Guarracino
- Department of Genetics, Genomics, and Informatics, University of Tennessee Health Science Center, Memphis, USA
| | | | - Jason Chin
- Foundation for Biological Data Science, Belmont, USA
| | - Erik Garrison
- Department of Genetics, Genomics, and Informatics, University of Tennessee Health Science Center, Memphis, USA
| | - Peter H Sudmant
- Department of Integrative Biology, University of California Berkeley, Berkeley, USA
- Center for Computational Biology, University of California Berkeley, Berkeley, USA
| |
|
23
|
Li W, Miller D, Liu X, Tosi L, Chkaiban L, Mei H, Hung PH, Parekkadan B, Sherlock G, Levy S. Arrayed in vivo barcoding for multiplexed sequence verification of plasmid DNA and demultiplexing of pooled libraries. Nucleic Acids Res 2024; 52:e47. [PMID: 38709890 PMCID: PMC11162764 DOI: 10.1093/nar/gkae332] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 10/13/2023] [Revised: 02/23/2024] [Accepted: 04/16/2024] [Indexed: 05/08/2024] Open
Abstract
Sequence verification of plasmid DNA is critical for many cloning and molecular biology workflows. To leverage high-throughput sequencing, several methods have been developed that add a unique DNA barcode to individual samples prior to pooling and sequencing. However, these methods require an individual plasmid extraction and/or in vitro barcoding reaction for each sample processed, limiting throughput and adding cost. Here, we develop an arrayed in vivo plasmid barcoding platform that enables pooled plasmid extraction and library preparation for Oxford Nanopore sequencing. This method has a high accuracy and recovery rate, and greatly increases throughput and reduces cost relative to other plasmid barcoding methods or Sanger sequencing. We use in vivo barcoding to sequence verify >45 000 plasmids and show that the method can be used to transform error-containing dispersed plasmid pools into sequence-perfect arrays or well-balanced pools. In vivo barcoding does not require any specialized equipment beyond a low-overhead Oxford Nanopore sequencer, enabling most labs to flexibly process hundreds to thousands of plasmids in parallel.
Affiliation(s)
- Weiyi Li
- SLAC National Accelerator Laboratory, Stanford University, Stanford, CA, USA
| | - Darach Miller
- SLAC National Accelerator Laboratory, Stanford University, Stanford, CA, USA
| | - Xianan Liu
- SLAC National Accelerator Laboratory, Stanford University, Stanford, CA, USA
| | - Lorenzo Tosi
- Department of Biomedical Engineering, Rutgers University, Piscataway, NJ, USA
| | - Lamia Chkaiban
- Department of Biomedical Engineering, Rutgers University, Piscataway, NJ, USA
| | - Han Mei
- SLAC National Accelerator Laboratory, Stanford University, Stanford, CA, USA
| | - Po-Hsiang Hung
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
| | - Biju Parekkadan
- Department of Biomedical Engineering, Rutgers University, Piscataway, NJ, USA
| | - Gavin Sherlock
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
| | - Sasha F Levy
- SLAC National Accelerator Laboratory, Stanford University, Stanford, CA, USA
| |
|
24
|
Zhao Q, Bertolli S, Park YJ, Tan Y, Cutler KJ, Srinivas P, Asfahl KL, Fonesca-García C, Gallagher LA, Li Y, Wang Y, Coleman-Derr D, DiMaio F, Zhang D, Peterson SB, Veesler D, Mougous JD. Streptomyces umbrella toxin particles block hyphal growth of competing species. Nature 2024; 629:165-173. [PMID: 38632398 PMCID: PMC11062931 DOI: 10.1038/s41586-024-07298-z] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 12/05/2023] [Accepted: 03/11/2024] [Indexed: 04/19/2024]
Abstract
Streptomyces are a genus of ubiquitous soil bacteria from which the majority of clinically utilized antibiotics derive [1]. The production of these antibacterial molecules reflects the relentless competition Streptomyces engage in with other bacteria, including other Streptomyces species [1,2]. Here we show that in addition to small-molecule antibiotics, Streptomyces produce and secrete antibacterial protein complexes that feature a large, degenerate repeat-containing polymorphic toxin protein. A cryo-electron microscopy structure of these particles reveals an extended stalk topped by a ringed crown comprising the toxin repeats scaffolding five lectin-tipped spokes, which led us to name them umbrella particles. Streptomyces coelicolor encodes three umbrella particles with distinct toxin and lectin composition. Notably, supernatant containing these toxins specifically and potently inhibits the growth of select Streptomyces species from among a diverse collection of bacteria screened. For one target, Streptomyces griseus, inhibition relies on a single toxin and that intoxication manifests as rapid cessation of vegetative hyphal growth. Our data show that Streptomyces umbrella particles mediate competition among vegetative mycelia of related species, a function distinct from small-molecule antibiotics, which are produced at the onset of reproductive growth and act broadly [3,4]. Sequence analyses suggest that this role of umbrella particles extends beyond Streptomyces, as we identified umbrella loci in nearly 1,000 species across Actinobacteria.
Affiliation(s)
- Qinqin Zhao
- Department of Microbiology, University of Washington, Seattle, WA, USA
| | - Savannah Bertolli
- Department of Microbiology, University of Washington, Seattle, WA, USA
| | - Young-Jun Park
- Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA
- Department of Biochemistry, University of Washington, Seattle, WA, USA
| | - Yongjun Tan
- Department of Biology, St Louis University, St Louis, MO, USA
| | - Kevin J Cutler
- Department of Microbiology, University of Washington, Seattle, WA, USA
- Department of Physics, University of Washington, Seattle, WA, USA
| | - Pooja Srinivas
- Department of Microbiology, University of Washington, Seattle, WA, USA
| | - Kyle L Asfahl
- Department of Microbiology, University of Washington, Seattle, WA, USA
- Microbial Interactions and Microbiome Center, University of Washington, Seattle, WA, USA
| | - Citlali Fonesca-García
- Plant Gene Expression Center, USDA-ARS, Albany, CA, USA
- Department of Plant and Microbial Biology, University of California Berkeley, Berkeley, CA, USA
| | - Larry A Gallagher
- Department of Microbiology, University of Washington, Seattle, WA, USA
| | - Yaqiao Li
- Department of Microbiology, University of Washington, Seattle, WA, USA
| | - Yaxi Wang
- Department of Microbiology, University of Washington, Seattle, WA, USA
| | - Devin Coleman-Derr
- Plant Gene Expression Center, USDA-ARS, Albany, CA, USA
- Department of Plant and Microbial Biology, University of California Berkeley, Berkeley, CA, USA
| | - Frank DiMaio
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Dapeng Zhang
- Department of Biology, St Louis University, St Louis, MO, USA
- Program of Bioinformatic and Computational Biology, St Louis University, St Louis, MO, USA
| | - S Brook Peterson
- Department of Microbiology, University of Washington, Seattle, WA, USA
| | - David Veesler
- Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA
- Department of Biochemistry, University of Washington, Seattle, WA, USA
| | - Joseph D Mougous
- Department of Microbiology, University of Washington, Seattle, WA, USA.
- Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA.
- Microbial Interactions and Microbiome Center, University of Washington, Seattle, WA, USA.
| |
|
25
|
Ng CL, Lim TS, Choong YS. Application of Computational Techniques in Antibody Fc-Fused Molecule Design for Therapeutics. Mol Biotechnol 2024; 66:568-581. [PMID: 37742298 DOI: 10.1007/s12033-023-00885-x] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/17/2022] [Accepted: 08/23/2023] [Indexed: 09/26/2023]
Abstract
Since the advent of hybridoma technology in the year 1975, it took a decade to witness the first approved monoclonal antibody Orthoclone OKT3 (muromonab-CD3) in the year 1986. Since then, continuous strides have been made to engineer antibodies for specific desired effects. The engineering efforts were not confined to only the variable domains of the antibody but also included the fragment crystallizable (Fc) region that influences the immune response and serum half-life. Engineering of the Fc fragment would have a profound effect on the therapeutic dose, antibody-dependent cell-mediated cytotoxicity as well as antibody-dependent cellular phagocytosis. The integration of computational techniques into antibody engineering designs has allowed for the generation of testable hypotheses and guided the rational antibody design framework prior to further experimental evaluations. In this article, we discuss the recent works in the Fc-fused molecule design that involves computational techniques. We also summarize the usefulness of in silico techniques to aid Fc-fused molecule design and analysis for the therapeutics application.
Affiliation(s)
- Chong Lee Ng
- Institute for Research in Molecular Medicine (INFORMM), Universiti Sains Malaysia, Minden, Penang, Malaysia
| | - Theam Soon Lim
- Institute for Research in Molecular Medicine (INFORMM), Universiti Sains Malaysia, Minden, Penang, Malaysia
| | - Yee Siew Choong
- Institute for Research in Molecular Medicine (INFORMM), Universiti Sains Malaysia, Minden, Penang, Malaysia.
| |
|
26
|
Zhai Y, Chao J, Wang Y, Zhang P, Tang F, Zou Q. TPMA: A two pointers meta-alignment tool to ensemble different multiple nucleic acid sequence alignments. PLoS Comput Biol 2024; 20:e1011988. [PMID: 38557416 PMCID: PMC11008887 DOI: 10.1371/journal.pcbi.1011988] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/28/2023] [Revised: 04/11/2024] [Accepted: 03/11/2024] [Indexed: 04/04/2024] Open
Abstract
Accurate multiple sequence alignment (MSA) is imperative for the comprehensive analysis of biological sequences. However, a notable challenge arises as no single MSA tool consistently outperforms its counterparts across diverse datasets. Users often have to try multiple MSA tools to achieve optimal alignment results, which can be time-consuming and memory-intensive. While the overall accuracy of certain MSA results may be lower, there could be local regions with the highest alignment scores, prompting researchers to seek a tool capable of merging these locally optimal results from multiple initial alignments into a globally optimal alignment. In this study, we introduce Two Pointers Meta-Alignment (TPMA), a novel tool designed for the integration of nucleic acid sequence alignments. TPMA employs two pointers to partition the initial alignments into blocks containing identical sequence fragments. It selects the blocks with high sum-of-pairs (SP) scores and concatenates them into an alignment with an overall SP score superior to that of the initial alignments. Through tests on simulated and real datasets, the experimental results consistently demonstrate that TPMA outperforms M-Coffee in terms of aSP, Q, and total column (TC) scores across most datasets. Even in cases where TPMA's scores are comparable to M-Coffee, TPMA exhibits significantly lower running time and memory consumption. Furthermore, we comprehensively assessed all the MSA tools used in the experiments, considering accuracy, time, and memory consumption. We propose accurate and fast combination strategies for small and large datasets, which streamline the user tool selection process and facilitate large-scale dataset integration. The dataset and source code of TPMA are available on GitHub (https://github.com/malabz/TPMA).
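To make the block-selection idea in this abstract concrete, the following is a minimal Python sketch of sum-of-pairs (SP) scoring and of choosing the higher-scoring of two candidate blocks that align the same sequence fragments. It is not TPMA's implementation: the scoring values (`MATCH`, `MISMATCH`, `GAP`), the function names, and the toy blocks are all assumptions made for illustration, and the two-pointer partitioning step itself is not shown.

```python
from itertools import combinations

# Hypothetical scoring scheme (an assumption, not taken from TPMA).
MATCH, MISMATCH, GAP = 1, -1, -2


def pair_score(a: str, b: str) -> int:
    """Score one aligned pair of characters."""
    if a == "-" and b == "-":
        return 0  # gap-gap pairs contribute nothing in this sketch
    if a == "-" or b == "-":
        return GAP
    return MATCH if a == b else MISMATCH


def sp_score(block: list[str]) -> int:
    """Sum-of-pairs score of an alignment block (rows of equal length)."""
    if len({len(row) for row in block}) > 1:
        raise ValueError("all rows must have the same length")
    # For every column, add up the scores of all unordered row pairs.
    return sum(
        pair_score(x, y)
        for column in zip(*block)
        for x, y in combinations(column, 2)
    )


def pick_best_block(candidates: list[list[str]]) -> list[str]:
    """Return the candidate block with the highest SP score."""
    return max(candidates, key=sp_score)


# Two toy blocks from different aligners covering the same three fragments
# (ACGT, ACGGT, ACGT); only the gap placement differs.
block_a = ["ACG-T", "ACGGT", "AC-GT"]
block_b = ["ACG-T", "ACGGT", "ACG-T"]

best = pick_best_block([block_a, block_b])
print(sp_score(block_a), sp_score(block_b), best)
```

Under these assumed scores, the block whose gaps line up in a single column scores higher and would be the one kept for concatenation.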
Affiliation(s)
- Yixiao Zhai
- Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China
- Quzhou People’s Hospital, Quzhou Affiliated Hospital of Wenzhou Medical University, Quzhou, China
- Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, China
| | - Jiannan Chao
- Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China
- Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, China
| | - Yizheng Wang
- Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China
- Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, China
| | - Pinglu Zhang
- Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China
- Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, China
| | - Furong Tang
- Quzhou People’s Hospital, Quzhou Affiliated Hospital of Wenzhou Medical University, Quzhou, China
- Department of Basic Medical Sciences, School of Medicine, Tsinghua University, Beijing, China
| | - Quan Zou
- Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China
- Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, China
| |
|
27
|
Garmaeva S, Sinha T, Gulyaeva A, Kuzub N, Spreckels JE, Andreu-Sánchez S, Gacesa R, Vich Vila A, Brushett S, Kruk M, Dekens J, Sikkema J, Kuipers F, Shkoporov AN, Hill C, Scherjon S, Wijmenga C, Fu J, Kurilshikov A, Zhernakova A. Transmission and dynamics of mother-infant gut viruses during pregnancy and early life. Nat Commun 2024; 15:1945. [PMID: 38431663 PMCID: PMC10908809 DOI: 10.1038/s41467-024-45257-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/15/2023] [Accepted: 01/16/2024] [Indexed: 03/05/2024] Open
Abstract
Early development of the gut ecosystem is crucial for lifelong health. While infant gut bacterial communities have been studied extensively, the infant gut virome remains under-explored. To study the development of the infant gut virome over time and the factors that shape it, we longitudinally assess the composition of gut viruses and their bacterial hosts in 30 women during and after pregnancy and in their 32 infants during their first year of life. Using shotgun metagenomic sequencing applied to dsDNA extracted from Virus-Like Particles (VLPs) and bacteria, we generate 205 VLP metaviromes and 322 total metagenomes. With this data, we show that while the maternal gut virome composition remains stable during late pregnancy and after birth, the infant gut virome is dynamic in the first year of life. Notably, infant gut viromes contain a higher abundance of active temperate phages compared to maternal gut viromes, which decreases over the first year of life. Moreover, we show that the feeding mode and place of delivery influence the gut virome composition of infants. Lastly, we provide evidence of co-transmission of viral and bacterial strains from mothers to infants, demonstrating that infants acquire some of their virome from their mother's gut.
Affiliation(s)
- Sanzhima Garmaeva
- Department of Genetics, University of Groningen, University Medical Center Groningen, Groningen, the Netherlands
| | - Trishla Sinha
- Department of Genetics, University of Groningen, University Medical Center Groningen, Groningen, the Netherlands
| | - Anastasia Gulyaeva
- Department of Genetics, University of Groningen, University Medical Center Groningen, Groningen, the Netherlands
| | - Nataliia Kuzub
- Department of Genetics, University of Groningen, University Medical Center Groningen, Groningen, the Netherlands
| | - Johanne E Spreckels
- Department of Genetics, University of Groningen, University Medical Center Groningen, Groningen, the Netherlands
| | - Sergio Andreu-Sánchez
- Department of Genetics, University of Groningen, University Medical Center Groningen, Groningen, the Netherlands
- Department of Pediatrics, University of Groningen, University Medical Center Groningen, Groningen, the Netherlands
| | - Ranko Gacesa
- Department of Genetics, University of Groningen, University Medical Center Groningen, Groningen, the Netherlands
- Department of Gastroenterology and Hepatology, University of Groningen, University Medical Center Groningen, Groningen, the Netherlands
| | - Arnau Vich Vila
- Department of Genetics, University of Groningen, University Medical Center Groningen, Groningen, the Netherlands
- Department of Gastroenterology and Hepatology, University of Groningen, University Medical Center Groningen, Groningen, the Netherlands
| | - Siobhan Brushett
- Department of Genetics, University of Groningen, University Medical Center Groningen, Groningen, the Netherlands
- Department of Health Sciences, University of Groningen, University Medical Center Groningen, Groningen, the Netherlands
| | - Marloes Kruk
- Department of Genetics, University of Groningen, University Medical Center Groningen, Groningen, the Netherlands
| | - Jackie Dekens
- Department of Genetics, University of Groningen, University Medical Center Groningen, Groningen, the Netherlands
- University Medical Center Groningen, Center for Development and Innovation, Groningen, Netherlands
| | - Jan Sikkema
- University Medical Center Groningen, Center for Development and Innovation, Groningen, Netherlands
| | - Folkert Kuipers
- Department of Pediatrics, University of Groningen, University Medical Center Groningen, Groningen, the Netherlands
- European Research Institute for the Biology of Ageing (ERIBA), University of Groningen, University Medical Center Groningen, Groningen, the Netherlands
| | - Andrey N Shkoporov
- APC Microbiome Ireland, University College Cork, Cork, Ireland
- School of Microbiology, University College Cork, Cork, Ireland
| | - Colin Hill
- APC Microbiome Ireland, University College Cork, Cork, Ireland
- School of Microbiology, University College Cork, Cork, Ireland
| | - Sicco Scherjon
- Department of Obstetrics and Gynecology, University of Groningen, University Medical Center Groningen, Groningen, the Netherlands
| | - Cisca Wijmenga
- Department of Genetics, University of Groningen, University Medical Center Groningen, Groningen, the Netherlands
| | - Jingyuan Fu
- Department of Genetics, University of Groningen, University Medical Center Groningen, Groningen, the Netherlands
- Department of Pediatrics, University of Groningen, University Medical Center Groningen, Groningen, the Netherlands
| | - Alexander Kurilshikov
- Department of Genetics, University of Groningen, University Medical Center Groningen, Groningen, the Netherlands
| | - Alexandra Zhernakova
- Department of Genetics, University of Groningen, University Medical Center Groningen, Groningen, the Netherlands.
| |
|
28
|
Fogarty EC, Schechter MS, Lolans K, Sheahan ML, Veseli I, Moore RM, Kiefl E, Moody T, Rice PA, Yu MK, Mimee M, Chang EB, Ruscheweyh HJ, Sunagawa S, Mclellan SL, Willis AD, Comstock LE, Eren AM. A cryptic plasmid is among the most numerous genetic elements in the human gut. Cell 2024; 187:1206-1222.e16. [PMID: 38428395 PMCID: PMC10973873 DOI: 10.1016/j.cell.2024.01.039] [Citation(s) in RCA: 24] [Impact Index Per Article: 24.0] [Received: 03/29/2023] [Revised: 10/03/2023] [Accepted: 01/25/2024] [Indexed: 03/03/2024]
Abstract
Plasmids are extrachromosomal genetic elements that often encode fitness-enhancing features. However, many bacteria carry "cryptic" plasmids that do not confer clear beneficial functions. We identified one such cryptic plasmid, pBI143, which is ubiquitous across industrialized gut microbiomes and is 14 times as numerous as crAssphage, currently established as the most abundant extrachromosomal genetic element in the human gut. The majority of mutations in pBI143 accumulate in specific positions across thousands of metagenomes, indicating strong purifying selection. pBI143 is monoclonal in most individuals, likely due to the priority effect of the version first acquired, often from one's mother. pBI143 can transfer between Bacteroidales, and although it does not appear to impact bacterial host fitness in vivo, it can transiently acquire additional genetic content. We identified important practical applications of pBI143, including its use in identifying human fecal contamination and its potential as an alternative approach to track human colonic inflammatory states.
Affiliation(s)
- Emily C Fogarty
- Committee on Microbiology, University of Chicago, Chicago, IL 60637, USA; Duchossois Family Institute, University of Chicago, Chicago, IL 60637, USA; Department of Medicine, University of Chicago, Chicago, IL 60637, USA.
| | - Matthew S Schechter
- Committee on Microbiology, University of Chicago, Chicago, IL 60637, USA; Duchossois Family Institute, University of Chicago, Chicago, IL 60637, USA; Department of Medicine, University of Chicago, Chicago, IL 60637, USA
| | - Karen Lolans
- Department of Medicine, University of Chicago, Chicago, IL 60637, USA
| | - Madeline L Sheahan
- Duchossois Family Institute, University of Chicago, Chicago, IL 60637, USA; Department of Microbiology, University of Chicago, Chicago, IL 60637, USA
| | - Iva Veseli
- Department of Medicine, University of Chicago, Chicago, IL 60637, USA; Graduate Program in Biophysical Sciences, University of Chicago, Chicago, IL 60637, USA
| | - Ryan M Moore
- Center for Bioinformatics and Computational Biology, University of Delaware, Newark, DE, USA
| | - Evan Kiefl
- Department of Medicine, University of Chicago, Chicago, IL 60637, USA; Graduate Program in Biophysical Sciences, University of Chicago, Chicago, IL 60637, USA
| | - Thomas Moody
- Department of Systems Biology, Columbia University, New York, NY 10032, USA
| | - Phoebe A Rice
- Committee on Microbiology, University of Chicago, Chicago, IL 60637, USA; Department of Biochemistry, University of Chicago, Chicago, IL 60637, USA
| | - Michael K Yu
- Toyota Technological Institute at Chicago, Chicago, IL 60637, USA
| | - Mark Mimee
- Committee on Microbiology, University of Chicago, Chicago, IL 60637, USA; Department of Microbiology, University of Chicago, Chicago, IL 60637, USA; Pritzker School of Molecular Engineering, The University of Chicago, Chicago, IL 60637, USA
| | - Eugene B Chang
- Department of Medicine, University of Chicago, Chicago, IL 60637, USA
| | - Hans-Joachim Ruscheweyh
- Department of Biology, Institute of Microbiology and Swiss Institute of Bioinformatics, ETH Zurich, Zurich 8093, Switzerland
| | - Shinichi Sunagawa
- Department of Biology, Institute of Microbiology and Swiss Institute of Bioinformatics, ETH Zurich, Zurich 8093, Switzerland
| | - Sandra L Mclellan
- School of Freshwater Sciences, University of Wisconsin-Milwaukee, Milwaukee, WI 53204, USA
| | - Amy D Willis
- Department of Biostatistics, University of Washington, Seattle, WA 98195, USA
| | - Laurie E Comstock
- Committee on Microbiology, University of Chicago, Chicago, IL 60637, USA; Duchossois Family Institute, University of Chicago, Chicago, IL 60637, USA; Department of Microbiology, University of Chicago, Chicago, IL 60637, USA.
| | - A Murat Eren
- Department of Medicine, University of Chicago, Chicago, IL 60637, USA; Marine Biological Laboratory, Woods Hole, MA 02543, USA; Alfred Wegener Institute Helmholtz Centre for Polar and Marine Research, 27570 Bremerhaven, Germany; Institute for Chemistry and Biology of the Marine Environment, University of Oldenburg, 26129 Oldenburg, Germany; Max Planck Institute for Marine Microbiology, 28359 Bremen, Germany; Helmholtz Institute for Functional Marine Biodiversity, 26129 Oldenburg, Germany.
| |
|
29
|
Currie MJ, Davies JS, Scalise M, Gulati A, Wright JD, Newton-Vesty MC, Abeysekera GS, Subramanian R, Wahlgren WY, Friemann R, Allison JR, Mace PD, Griffin MDW, Demeler B, Wakatsuki S, Drew D, Indiveri C, Dobson RCJ, North RA. Structural and biophysical analysis of a Haemophilus influenzae tripartite ATP-independent periplasmic (TRAP) transporter. eLife 2024; 12:RP92307. [PMID: 38349818 PMCID: PMC10942642 DOI: 10.7554/elife.92307] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/15/2024] Open
Abstract
Tripartite ATP-independent periplasmic (TRAP) transporters are secondary-active transporters that receive their substrates via a soluble-binding protein to move bioorganic acids across bacterial or archaeal cell membranes. Recent cryo-electron microscopy (cryo-EM) structures of TRAP transporters provide a broad framework to understand how they work, but the mechanistic details of transport are not yet defined. Here we report the cryo-EM structure of the Haemophilus influenzae N-acetylneuraminate TRAP transporter (HiSiaQM) at 2.99 Å resolution (extending to 2.2 Å at the core), revealing new features. The improved resolution (the previous HiSiaQM structure is 4.7 Å resolution) permits accurate assignment of two Na+ sites and the architecture of the substrate-binding site, consistent with mutagenic and functional data. Moreover, rather than a monomer, the HiSiaQM structure is a homodimer. We observe lipids at the dimer interface, as well as a lipid trapped within the fusion that links the SiaQ and SiaM subunits. We show that the affinity (KD) for the complex between the soluble HiSiaP protein and HiSiaQM is in the micromolar range and that a related SiaP can bind HiSiaQM. This work provides key data that enhances our understanding of the 'elevator-with-an-operator' mechanism of TRAP transporters.
Affiliation(s)
- Michael J Currie
- Biomolecular Interaction Centre, Maurice Wilkins Centre for Biodiscovery, MacDiarmid Institute for Advanced Materials and Nanotechnology, and School of Biological Sciences, University of Canterbury, Christchurch, New Zealand
- James S Davies
- Biomolecular Interaction Centre, Maurice Wilkins Centre for Biodiscovery, MacDiarmid Institute for Advanced Materials and Nanotechnology, and School of Biological Sciences, University of Canterbury, Christchurch, New Zealand
- Department of Biochemistry and Biophysics, Stockholm University, Stockholm, Sweden
- Mariafrancesca Scalise
- Department DiBEST (Biologia, Ecologia, Scienze della Terra) Unit of Biochemistry and Molecular Biotechnology, University of Calabria, Arcavacata di Rende, Italy
- Ashutosh Gulati
- Department of Biochemistry and Biophysics, Stockholm University, Stockholm, Sweden
- Joshua D Wright
- Biomolecular Interaction Centre, Maurice Wilkins Centre for Biodiscovery, MacDiarmid Institute for Advanced Materials and Nanotechnology, and School of Biological Sciences, University of Canterbury, Christchurch, New Zealand
- Michael C Newton-Vesty
- Biomolecular Interaction Centre, Maurice Wilkins Centre for Biodiscovery, MacDiarmid Institute for Advanced Materials and Nanotechnology, and School of Biological Sciences, University of Canterbury, Christchurch, New Zealand
- Gayan S Abeysekera
- Biomolecular Interaction Centre, Maurice Wilkins Centre for Biodiscovery, MacDiarmid Institute for Advanced Materials and Nanotechnology, and School of Biological Sciences, University of Canterbury, Christchurch, New Zealand
- Ramaswamy Subramanian
- Biological Sciences and Biomedical Engineering, Bindley Bioscience Center, Purdue University West Lafayette, West Lafayette, United States
- Weixiao Y Wahlgren
- Department of Chemistry and Molecular Biology, Biochemistry and Structural Biology, University of Gothenburg, Gothenburg, Sweden
- Rosmarie Friemann
- Centre for Antibiotic Resistance Research (CARe) at University of Gothenburg, Gothenburg, Sweden
- Jane R Allison
- Biomolecular Interaction Centre, Digital Life Institute, Maurice Wilkins Centre for Molecular Biodiscovery, and School of Biological Sciences, University of Auckland, Auckland, New Zealand
- Peter D Mace
- Biochemistry Department, School of Biomedical Sciences, University of Otago, Dunedin, New Zealand
- Michael DW Griffin
- ARC Centre for Cryo-electron Microscopy of Membrane Proteins, Bio Molecular Science and Biotechnology Institute, Department of Biochemistry and Pharmacology, University of Melbourne, Melbourne, Australia
- Borries Demeler
- Department of Chemistry and Biochemistry, University of Montana, Missoula, United States
- Department of Chemistry and Biochemistry, University of Lethbridge, Lethbridge, Canada
- Soichi Wakatsuki
- Biological Sciences Division, SLAC National Accelerator Laboratory, Menlo Park, United States
- Department of Structural Biology, Stanford University School of Medicine, Stanford, United States
- David Drew
- Department of Biochemistry and Biophysics, Stockholm University, Stockholm, Sweden
- Cesare Indiveri
- Department DiBEST (Biologia, Ecologia, Scienze della Terra) Unit of Biochemistry and Molecular Biotechnology, University of Calabria, Arcavacata di Rende, Italy
- CNR Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies (IBIOM), Bari, Italy
- Renwick CJ Dobson
- Biomolecular Interaction Centre, Maurice Wilkins Centre for Biodiscovery, MacDiarmid Institute for Advanced Materials and Nanotechnology, and School of Biological Sciences, University of Canterbury, Christchurch, New Zealand
- ARC Centre for Cryo-electron Microscopy of Membrane Proteins, Bio Molecular Science and Biotechnology Institute, Department of Biochemistry and Pharmacology, University of Melbourne, Melbourne, Australia
- Rachel A North
- Department of Biochemistry and Biophysics, Stockholm University, Stockholm, Sweden
- School of Medical Sciences, Faculty of Medicine and Health, University of Sydney, Sydney, Australia
|
30
|
Chopy M, Cavallini-Speisser Q, Chambrier P, Morel P, Just J, Hugouvieux V, Rodrigues Bento S, Zubieta C, Vandenbussche M, Monniaux M. Cell layer-specific expression of the homeotic MADS-box transcription factor PhDEF contributes to modular petal morphogenesis in petunia. The Plant Cell 2024; 36:324-345. [PMID: 37804091 PMCID: PMC10827313 DOI: 10.1093/plcell/koad258] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 03/28/2023] [Revised: 08/31/2023] [Accepted: 09/18/2023] [Indexed: 10/08/2023]
Abstract
Floral homeotic MADS-box transcription factors ensure the correct morphogenesis of floral organs, which are organized in different cell layers deriving from distinct meristematic layers. How cells from these distinct layers acquire their respective identities and coordinate their growth to ensure normal floral organ morphogenesis is unresolved. Here, we studied petunia (Petunia × hybrida) petals that form a limb and tube through congenital fusion. We identified petunia mutants (periclinal chimeras) expressing the B-class MADS-box gene DEFICIENS in the petal epidermis or in the petal mesophyll, called wico and star, respectively. Strikingly, wico flowers form a strongly reduced tube while their limbs are almost normal, while star flowers form a normal tube but greatly reduced and unpigmented limbs, showing that petunia petal morphogenesis is highly modular. These mutants highlight the layer-specific roles of PhDEF during petal development. We explored the link between PhDEF and petal pigmentation, a well-characterized limb epidermal trait. The anthocyanin biosynthesis pathway was strongly downregulated in star petals, including its major regulator ANTHOCYANIN2 (AN2). We established that PhDEF directly binds to the AN2 terminator in vitro and in vivo, suggesting that PhDEF might regulate AN2 expression and therefore petal epidermis pigmentation. Altogether, we show that cell layer-specific homeotic activity in petunia petals differently impacts tube and limb development, revealing the relative importance of the different cell layers in the modular architecture of petunia petals.
Affiliation(s)
- Mathilde Chopy
- Laboratoire de Reproduction et Développement des Plantes, Université de Lyon, ENS de Lyon, UCB Lyon 1, CNRS, INRAE, Lyon 69007, France
- Quentin Cavallini-Speisser
- Laboratoire de Reproduction et Développement des Plantes, Université de Lyon, ENS de Lyon, UCB Lyon 1, CNRS, INRAE, Lyon 69007, France
- Pierre Chambrier
- Laboratoire de Reproduction et Développement des Plantes, Université de Lyon, ENS de Lyon, UCB Lyon 1, CNRS, INRAE, Lyon 69007, France
- Patrice Morel
- Laboratoire de Reproduction et Développement des Plantes, Université de Lyon, ENS de Lyon, UCB Lyon 1, CNRS, INRAE, Lyon 69007, France
- Jérémy Just
- Laboratoire de Reproduction et Développement des Plantes, Université de Lyon, ENS de Lyon, UCB Lyon 1, CNRS, INRAE, Lyon 69007, France
- Véronique Hugouvieux
- Laboratoire de Physiologie Cellulaire et Végétale, Université Grenoble-Alpes, CNRS, CEA, INRAE, IRIG-DBSCI, Grenoble 38000, France
- Suzanne Rodrigues Bento
- Laboratoire de Reproduction et Développement des Plantes, Université de Lyon, ENS de Lyon, UCB Lyon 1, CNRS, INRAE, Lyon 69007, France
- Chloe Zubieta
- Laboratoire de Physiologie Cellulaire et Végétale, Université Grenoble-Alpes, CNRS, CEA, INRAE, IRIG-DBSCI, Grenoble 38000, France
- Michiel Vandenbussche
- Laboratoire de Reproduction et Développement des Plantes, Université de Lyon, ENS de Lyon, UCB Lyon 1, CNRS, INRAE, Lyon 69007, France
- Marie Monniaux
- Laboratoire de Reproduction et Développement des Plantes, Université de Lyon, ENS de Lyon, UCB Lyon 1, CNRS, INRAE, Lyon 69007, France
|
31
|
Xiao B, Rey-Iglesia A, Yuan J, Hu J, Song S, Hou Y, Chen X, Germonpré M, Bao L, Wang S, Taogetongqimuge, Valentinovna LL, Lister AM, Lai X, Sheng G. Relationships of Late Pleistocene giant deer as revealed by Sinomegaceros mitogenomes from East Asia. iScience 2023; 26:108406. [PMID: 38047074 PMCID: PMC10690636 DOI: 10.1016/j.isci.2023.108406] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/26/2023] [Revised: 09/26/2023] [Accepted: 11/03/2023] [Indexed: 12/05/2023] Open
Abstract
The giant deer, widespread in northern Eurasia during the Late Pleistocene, have been classified as western Megaloceros and eastern Sinomegaceros through morphological studies. While Megaloceros's evolutionary history has been unveiled through mitogenomes, Sinomegaceros remains molecularly unexplored. Herein, we generated mitogenomes of giant deer from East Asia. We find that, in contrast to the morphological differences between Megaloceros and Sinomegaceros, they are mixed in the mitochondrial phylogeny, and Siberian specimens suggest a range contact or overlap between these two groups. Meanwhile, one deep divergent clade and another surviving until 20.1 thousand years ago (ka) were detected in northeastern China, the latter implying this area as a potential refugium during the Last Glacial Maximum (LGM). Moreover, stable isotope analyses indicate correlations between climate-introduced vegetation changes and giant deer extinction. Our study demonstrates the genetic relationship between eastern and western giant deer and explores the promoters of their extirpation in northern East Asia.
Affiliation(s)
- Bo Xiao
- State Key Laboratory of Biogeology and Environmental Geology, China University of Geosciences, Wuhan 430078, China
- School of Earth Sciences, China University of Geosciences, Wuhan 430074, China
- Alba Rey-Iglesia
- Globe Institute, University of Copenhagen, Copenhagen, 1350 Copenhagen K, Denmark
- Junxia Yuan
- State Key Laboratory of Biogeology and Environmental Geology, China University of Geosciences, Wuhan 430078, China
- Faculty of Materials Science and Chemistry, China University of Geosciences, Wuhan 430078, China
- Jiaming Hu
- State Key Laboratory of Biogeology and Environmental Geology, China University of Geosciences, Wuhan 430078, China
- School of Earth Sciences, China University of Geosciences, Wuhan 430074, China
- Shiwen Song
- State Key Laboratory of Biogeology and Environmental Geology, China University of Geosciences, Wuhan 430078, China
- School of Environmental Studies, China University of Geosciences, Wuhan 430078, China
- Yamei Hou
- Key Laboratory of Vertebrate Evolution and Human Origins, Institute of Vertebrate Paleontology and Paleoanthropology, Chinese Academy of Sciences, Beijing 100044, China
- Xi Chen
- Department of Cultural Heritage and Museology, Nanjing Normal University, Nanjing 210046, China
- Mietje Germonpré
- Royal Belgian Institute of Natural Sciences, 1000 Brussels, Belgium
- Lei Bao
- Ordos Institute of Cultural Relics and Archaeology, Ordos 017010, China
- Lbova Liudmila Valentinovna
- Graduate School of International Relations, Peter the Great St. Petersburg Polytechnic University, St. Petersburg, Grazhdansky Av., 28, Russia
- Xulong Lai
- State Key Laboratory of Biogeology and Environmental Geology, China University of Geosciences, Wuhan 430078, China
- School of Earth Sciences, China University of Geosciences, Wuhan 430074, China
- Guilian Sheng
- State Key Laboratory of Biogeology and Environmental Geology, China University of Geosciences, Wuhan 430078, China
- School of Environmental Studies, China University of Geosciences, Wuhan 430078, China
|
32
|
Scott CJR, Leadbeater DR, Oates NC, James SR, Newling K, Li Y, McGregor NGS, Bird S, Bruce NC. Whole genome structural predictions reveal hidden diversity in putative oxidative enzymes of the lignocellulose-degrading ascomycete Parascedosporium putredinis NO1. Microbiol Spectr 2023; 11:e0103523. [PMID: 37811978 PMCID: PMC10714830 DOI: 10.1128/spectrum.01035-23] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/31/2023] [Accepted: 08/22/2023] [Indexed: 10/10/2023] Open
Abstract
IMPORTANCE: An annotated reference genome has revealed P. putredinis NO1 as a useful resource for the identification of new lignocellulose-degrading enzymes for biorefining of woody plant biomass. Utilizing a "structure-omics"-based searching strategy, we identified new potentially lignocellulose-active sequences that would have been missed by traditional sequence searching methods. These new identifications, alongside the discovery of novel enzymatic functions from this underexplored lineage with the recent discovery of a new phenol oxidase that cleaves the main structural β-O-4 linkage in lignin from P. putredinis NO1, highlight the underexplored and poorly represented family Microascaceae as a particularly interesting candidate worthy of further exploration toward the valorization of high value biorenewable products.
Affiliation(s)
- Conor J. R. Scott
- Department of Biology, Centre for Novel Agricultural Products, University of York, York, United Kingdom
- Daniel R. Leadbeater
- Department of Biology, Centre for Novel Agricultural Products, University of York, York, United Kingdom
- Nicola C. Oates
- Department of Biology, Centre for Novel Agricultural Products, University of York, York, United Kingdom
- Sally R. James
- Department of Biology, Bioscience Technology Facility, University of York, York, United Kingdom
- Katherine Newling
- Department of Biology, Bioscience Technology Facility, University of York, York, United Kingdom
- Yi Li
- Department of Biology, Bioscience Technology Facility, University of York, York, United Kingdom
- Nicholas G. S. McGregor
- Department of Chemistry, York Structural Biology Laboratory, The University of York, York, United Kingdom
- Susannah Bird
- Department of Biology, Centre for Novel Agricultural Products, University of York, York, United Kingdom
- Neil C. Bruce
- Department of Biology, Centre for Novel Agricultural Products, University of York, York, United Kingdom
|
33
|
Liu Y, Yuan H, Zhang Q, Wang Z, Xiong S, Wen N, Zhang Y. Multiple sequence alignment based on deep reinforcement learning with self-attention and positional encoding. Bioinformatics 2023; 39:btad636. [PMID: 37856335 PMCID: PMC10628385 DOI: 10.1093/bioinformatics/btad636] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/20/2022] [Revised: 07/24/2023] [Accepted: 10/17/2023] [Indexed: 10/21/2023] Open
Abstract
MOTIVATION: Multiple sequence alignment (MSA) is one of the hotspots of current research and is commonly used in sequence analysis scenarios. However, there is no lasting solution for MSA because it is an NP-complete (nondeterministic polynomial-time complete) problem, and the existing methods still have room to improve in accuracy. RESULTS: We propose Deep reinforcement learning with Positional encoding and self-Attention for MSA, based on deep reinforcement learning, to enhance the accuracy of the alignment. Specifically, inspired by the translation technique in natural language processing, we introduce self-attention and positional encoding to improve accuracy and reliability. Firstly, positional encoding encodes the position of the sequence to prevent the loss of nucleotide position information. Secondly, the self-attention model is used to extract the key features of the sequence. The features are then fed into a multi-layer perceptron, which calculates the insertion position of the gap according to the features. In addition, a novel reinforcement learning environment is designed to convert the classic progressive alignment into progressive column alignment, gradually generating each column's sub-alignment. Finally, the sub-alignments are merged into the complete alignment. Extensive experiments based on several datasets validate our method's effectiveness for MSA, outperforming some state-of-the-art methods in terms of the Sum-of-pairs and Column scores. AVAILABILITY AND IMPLEMENTATION: The process is implemented in Python and available as open-source software from https://github.com/ZhangLab312/DPAMSA.
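The Sum-of-pairs score named in the abstract can be illustrated with a minimal sketch. The function name `sum_of_pairs` and the match, mismatch, and gap values are assumptions chosen for illustration; they are not taken from the paper or its repository, which may define the score differently.

```python
from itertools import combinations


def sum_of_pairs(alignment, match=1, mismatch=-1, gap=-1):
    """Sum pairwise scores over every column of an alignment.

    `alignment` is a list of equal-length strings in which '-' marks a gap.
    The scoring values are illustrative assumptions, not the paper's.
    """
    if len({len(row) for row in alignment}) > 1:
        raise ValueError("all aligned sequences must have the same length")

    total = 0
    for column in zip(*alignment):
        # Score every unordered pair of symbols in this column.
        for a, b in combinations(column, 2):
            if a == "-" and b == "-":
                continue  # a gap paired with a gap contributes nothing here
            if a == "-" or b == "-":
                total += gap
            elif a == b:
                total += match
            else:
                total += mismatch
    return total


if __name__ == "__main__":
    print(sum_of_pairs(["AC-T", "ACGT", "A-GT"]))
```

Under this scoring, a column in which all sequences agree contributes the most, so alignments that place identical nucleotides in the same column score higher than alignments that scatter gaps.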
Affiliation(s)
- Yuhang Liu
- School of Computer Science, Chengdu University of Information Technology, Chengdu 610225, China
- Hao Yuan
- School of Computer Science, Chengdu University of Information Technology, Chengdu 610225, China
- Qiang Zhang
- School of Computer Science, Chengdu University of Information Technology, Chengdu 610225, China
- Zixuan Wang
- College of Electronics and Information Engineering, Sichuan University, Chengdu 610065, China
- Shuwen Xiong
- School of Computer Science, Chengdu University of Information Technology, Chengdu 610225, China
- Naifeng Wen
- School of Mechanical and Electrical Engineering, Dalian Minzu University, Dalian 116600, China
- Yongqing Zhang
- School of Computer Science, Chengdu University of Information Technology, Chengdu 610225, China
|
34
|
Van Doren SR, Scott BS, Koppisetti RK. SARS-CoV-2 fusion peptide sculpting of a membrane with insertion of charged and polar groups. Structure 2023; 31:1184-1199.e3. [PMID: 37625399 PMCID: PMC10592393 DOI: 10.1016/j.str.2023.07.015] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 12/14/2022] [Revised: 07/10/2023] [Accepted: 07/31/2023] [Indexed: 08/27/2023]
Abstract
The fusion peptide of SARS-CoV-2 spike is essential for infection. How this charged and hydrophobic domain occupies and affects membranes needs clarification. Its depth in zwitterionic, bilayered micelles at pH 5 (resembling late endosomes) was measured by paramagnetic NMR relaxation enhancements used to bias molecular dynamics simulations. Asp830 inserted deeply, along with Lys825 or Lys835. Protonation of Asp830 appeared to enhance agreement of simulated and NMR-measured depths. While the fusion peptide occupied a leaflet of the DMPC bilayer, the opposite leaflet invaginated with influx of water and choline head groups in around Asp830 and bilayer-inserted polar side chains. NMR-detected hydrogen exchange found corroborating hydration of the backbone of Thr827-Phe833 inserted deeply in bicelles. Pinching of the membrane at the inserted charge and the intramembrane hydration of polar groups agree with theory. Formation of corridors of hydrated, inward-turned head groups was accompanied by flip-flop of head groups. Potential roles of the defects are discussed.
Affiliation(s)
- Steven R Van Doren
- Department of Biochemistry, University of Missouri, Columbia, MO 65211, USA; Institute for Data Science and Informatics, University of Missouri, Columbia, MO 65211, USA
- Benjamin S Scott
- Department of Biochemistry, University of Missouri, Columbia, MO 65211, USA
- Rama K Koppisetti
- Department of Biochemistry, University of Missouri, Columbia, MO 65211, USA
|
35
|
Vafaeie F, Miri Karam Z, Yari A, Safarpour H, Kazemi T, Etesam S, Mohammadpour M, Miri‐Moghaddam E. Clinical and genetic screening in a large Iranian family with Marfan syndrome: A case study. Health Sci Rep 2023; 6:e1647. [PMID: 37877128 PMCID: PMC10591539 DOI: 10.1002/hsr2.1647] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/25/2023] [Revised: 09/13/2023] [Accepted: 10/12/2023] [Indexed: 10/26/2023] Open
Abstract
Background and Aims: Marfan syndrome (MFS) is an autosomal dominant genetic disorder caused by pathogenic variants of the fibrillin-1-encoding FBN1 gene that commonly affects the cardiovascular, skeletal, and ocular systems. This study aimed to evaluate the clinical features and genetic causes of the MFS phenotype in a large Iranian family. Methods: Seventeen affected family members were examined clinically by cardiologists and ophthalmologists. A DNA sample from the proband, a 48-year-old woman with obvious signs of MFS, was subjected to whole-exome sequencing (WES). The candidate variant was validated by bidirectional sequencing of the proband and other available family members. In silico analysis and molecular modeling were conducted to determine the pathogenic effects of the candidate variants. Results: The most frequent cardiac complications were mitral valve prolapse and regurgitation. Ophthalmic examination revealed iridodonesis and ectopia lentis. A heterozygous missense variant (c.2179T>C/p.C727R) in exon 19 of the FBN1 gene was identified and found to cosegregate with affected family members. Its pathogenicity was predicted using several in silico predictive algorithms. Molecular docking analysis indicated that the variant might affect the binding affinity between FBN1 and LTBP1 proteins by impairing disulfide bond formation. Conclusion: Our report expands the spectrum of the Marfan phenotype by providing details of its clinical manifestations and disease-associated molecular changes. It also highlights the value of WES in genetic diagnosis and contributes to genetic counseling in families with MFS.
Affiliation(s)
- Farzane Vafaeie
- Cellular and Molecular Research Center, Birjand University of Medical Sciences, Birjand, Iran
- Zahra Miri Karam
- Physiology Research Center, Institute of Neuropharmacology, Kerman University of Medical Sciences, Kerman, Iran
- Department of Medical Genetics, Afzalipour Faculty of Medicine, Kerman University of Medical Sciences, Kerman, Iran
- Abolfazl Yari
- Cellular and Molecular Research Center, Birjand University of Medical Sciences, Birjand, Iran
- Department of Medical Genetics, Afzalipour Faculty of Medicine, Kerman University of Medical Sciences, Kerman, Iran
- Hossein Safarpour
- Cellular and Molecular Research Center, Birjand University of Medical Sciences, Birjand, Iran
- Tooba Kazemi
- Cardiovascular Disease Research Center, Razi Hospital, Birjand University of Medical Sciences, Birjand, Iran
- Shokoofeh Etesam
- Department of Biological Sciences, Technical and Vocational University (TVU), Tehran, Iran
- Mojtaba Mohammadpour
- Department of Optometry, School of Rehabilitation, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Ebrahim Miri‐Moghaddam
- Cardiovascular Disease Research Center, Razi Hospital, Birjand University of Medical Sciences, Birjand, Iran
|
36
|
Rhie A, Nurk S, Cechova M, Hoyt SJ, Taylor DJ, Altemose N, Hook PW, Koren S, Rautiainen M, Alexandrov IA, Allen J, Asri M, Bzikadze AV, Chen NC, Chin CS, Diekhans M, Flicek P, Formenti G, Fungtammasan A, Garcia Giron C, Garrison E, Gershman A, Gerton JL, Grady PGS, Guarracino A, Haggerty L, Halabian R, Hansen NF, Harris R, Hartley GA, Harvey WT, Haukness M, Heinz J, Hourlier T, Hubley RM, Hunt SE, Hwang S, Jain M, Kesharwani RK, Lewis AP, Li H, Logsdon GA, Lucas JK, Makalowski W, Markovic C, Martin FJ, Mc Cartney AM, McCoy RC, McDaniel J, McNulty BM, Medvedev P, Mikheenko A, Munson KM, Murphy TD, Olsen HE, Olson ND, Paulin LF, Porubsky D, Potapova T, Ryabov F, Salzberg SL, Sauria MEG, Sedlazeck FJ, Shafin K, Shepelev VA, Shumate A, Storer JM, Surapaneni L, Taravella Oill AM, Thibaud-Nissen F, Timp W, Tomaszkiewicz M, Vollger MR, Walenz BP, Watwood AC, Weissensteiner MH, Wenger AM, Wilson MA, Zarate S, Zhu Y, Zook JM, Eichler EE, O'Neill RJ, Schatz MC, Miga KH, Makova KD, Phillippy AM. The complete sequence of a human Y chromosome. Nature 2023; 621:344-354. [PMID: 37612512 PMCID: PMC10752217 DOI: 10.1038/s41586-023-06457-y] [Citation(s) in RCA: 183] [Impact Index Per Article: 91.5] [Received: 12/02/2022] [Accepted: 07/19/2023] [Indexed: 08/25/2023]
Abstract
The human Y chromosome has been notoriously difficult to sequence and assemble because of its complex repeat structure that includes long palindromes, tandem repeats and segmental duplications [1-3]. As a result, more than half of the Y chromosome is missing from the GRCh38 reference sequence and it remains the last human chromosome to be finished [4,5]. Here, the Telomere-to-Telomere (T2T) consortium presents the complete 62,460,029-base-pair sequence of a human Y chromosome from the HG002 genome (T2T-Y) that corrects multiple errors in GRCh38-Y and adds over 30 million base pairs of sequence to the reference, showing the complete ampliconic structures of gene families TSPY, DAZ and RBMY; 41 additional protein-coding genes, mostly from the TSPY family; and an alternating pattern of human satellite 1 and 3 blocks in the heterochromatic Yq12 region. We have combined T2T-Y with a previous assembly of the CHM13 genome [4] and mapped available population variation, clinical variants and functional genomics data to produce a complete and comprehensive reference sequence for all 24 human chromosomes.
Affiliation(s)
- Arang Rhie
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Sergey Nurk
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Oxford Nanopore Technologies Inc., Oxford, UK
- Monika Cechova
- Faculty of Informatics, Masaryk University, Brno, Czech Republic
- Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, CA, USA
- Savannah J Hoyt
- Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT, USA
- Dylan J Taylor
- Department of Biology, Johns Hopkins University, Baltimore, MD, USA
- Nicolas Altemose
- Department of Molecular and Cell Biology, University of California, Berkeley, CA, USA
- Paul W Hook
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA
- Sergey Koren
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Mikko Rautiainen
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Ivan A Alexandrov
- Federal Research Center of Biotechnology of the Russian Academy of Sciences, Moscow, Russia
- Center for Algorithmic Biotechnology, Saint Petersburg State University, St Petersburg, Russia
- Department of Anatomy and Anthropology and Department of Human Molecular Genetics and Biochemistry, Sackler Faculty of Medicine, Tel Aviv University, Tel Aviv-Yafo, Israel
- Jamie Allen
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Mobin Asri
- UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
- Andrey V Bzikadze
- Graduate Program in Bioinformatics and Systems Biology, University of California, San Diego, CA, USA
- Nae-Chyun Chen
- Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
- Chen-Shan Chin
- GeneDX Holdings Corp, Stamford, CT, USA
- Foundation of Biological Data Science, Belmont, CA, USA
- Mark Diekhans
- UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
- Paul Flicek
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Department of Genetics, University of Cambridge, Cambridge, UK
- Carlos Garcia Giron
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Erik Garrison
- Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN, USA
- Ariel Gershman
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA
- Jennifer L Gerton
- Stowers Institute for Medical Research, Kansas City, MO, USA
- University of Kansas Medical Center, Kansas City, MO, USA
- Patrick G S Grady
- Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT, USA
- Andrea Guarracino
- Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN, USA
- Genomics Research Centre, Human Technopole, Milan, Italy
- Leanne Haggerty
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Reza Halabian
- Institute of Bioinformatics, Faculty of Medicine, University of Münster, Münster, Germany
- Nancy F Hansen
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Cancer Genetics and Comparative Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Robert Harris
- Department of Biology, Pennsylvania State University, University Park, PA, USA
- Gabrielle A Hartley
- Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT, USA
- William T Harvey
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Marina Haukness
- UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
- Jakob Heinz
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA
- Thibaut Hourlier
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Sarah E Hunt
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Stephen Hwang
- XDBio Program, Johns Hopkins University, Baltimore, MD, USA
- Miten Jain
- Department of Bioengineering, Department of Physics, Northeastern University, Boston, MA, USA
- Rupesh K Kesharwani
- Human Genome Sequencing Center, Baylor College of Medicine, One Baylor Plaza, Houston, TX, USA
- Alexandra P Lewis
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Heng Li
- Department of Data Sciences, Dana-Farber Cancer Institute, Boston, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Glennis A Logsdon
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Julian K Lucas
- Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, CA, USA
- UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
- Wojciech Makalowski
- Institute of Bioinformatics, Faculty of Medicine, University of Münster, Münster, Germany
- Christopher Markovic
- Genome Technology Access Center at the McDonnell Genome Institute, Washington University, St. Louis, MO, USA
- Fergal J Martin
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Ann M Mc Cartney
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Rajiv C McCoy
- Department of Biology, Johns Hopkins University, Baltimore, MD, USA
- Jennifer McDaniel
- Biosystems and Biomaterials Division, National Institute of Standards and Technology, Gaithersburg, MD, USA
- Brandy M McNulty
- Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, CA, USA
- UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
- Paul Medvedev
- Department of Computer Science and Engineering, Pennsylvania State University, University Park, PA, USA
- Department of Biochemistry and Molecular Biology, Pennsylvania State University, University Park, PA, USA
- Center for Computational Biology and Bioinformatics, Pennsylvania State University, University Park, PA, USA
- Alla Mikheenko
- Center for Algorithmic Biotechnology, Saint Petersburg State University, St Petersburg, Russia
- UCL Queen Square Institute of Neurology, UCL, London, UK
- Katherine M Munson
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Terence D Murphy
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
- Hugh E Olsen
- Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, CA, USA
- UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
- Nathan D Olson
- Biosystems and Biomaterials Division, National Institute of Standards and Technology, Gaithersburg, MD, USA
- Luis F Paulin
- Human Genome Sequencing Center, Baylor College of Medicine, One Baylor Plaza, Houston, TX, USA
- David Porubsky
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Tamara Potapova
- Stowers Institute for Medical Research, Kansas City, MO, USA
- Fedor Ryabov
- Masters Program in National Research University Higher School of Economics, Moscow, Russia
- Steven L Salzberg
- Departments of Biomedical Engineering, Computer Science, and Biostatistics, Johns Hopkins University, Baltimore, MD, USA
- Fritz J Sedlazeck
- Human Genome Sequencing Center, Baylor College of Medicine, One Baylor Plaza, Houston, TX, USA
- Department of Computer Science, Rice University, Houston, TX, USA
| | | | | | - Alaina Shumate
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA
| | | | - Likhitha Surapaneni
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
| | - Angela M Taravella Oill
- Center for Evolution and Medicine, School of Life Sciences, Arizona State University, Tempe, AZ, USA
| | - Françoise Thibaud-Nissen
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
| | - Winston Timp
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA
| | - Marta Tomaszkiewicz
- Department of Biology, Pennsylvania State University, University Park, PA, USA
- Department of Biomedical Engineering, Pennsylvania State University, State College, PA, USA
| | - Mitchell R Vollger
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - Brian P Walenz
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
| | - Allison C Watwood
- Department of Biology, Pennsylvania State University, University Park, PA, USA
| | | | | | - Melissa A Wilson
- Center for Evolution and Medicine, School of Life Sciences, Arizona State University, Tempe, AZ, USA
| | - Samantha Zarate
- Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
| | - Yiming Zhu
- Human Genome Sequencing Center, Baylor College of Medicine, One Baylor Plaza, Houston, TX, USA
| | - Justin M Zook
- Biosystems and Biomaterials Division, National Institute of Standards and Technology, Gaithersburg, MD, USA
| | - Evan E Eichler
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Investigator, Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA
| | - Rachel J O'Neill
- Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT, USA
- Institute for Systems Genomics, University of Connecticut, Storrs, CT, USA
- Department of Genetics and Genome Sciences, UConn Health, Farmington, CT, USA
| | - Michael C Schatz
- Department of Biology, Johns Hopkins University, Baltimore, MD, USA
- Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
| | - Karen H Miga
- Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, CA, USA
- UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
| | - Kateryna D Makova
- Department of Biology, Pennsylvania State University, University Park, PA, USA
| | - Adam M Phillippy
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA.
|
37
|
Yamasaki S, Zwama M, Yoneda T, Hayashi-Nishino M, Nishino K. Drug resistance and physiological roles of RND multidrug efflux pumps in Salmonella enterica, Escherichia coli and Pseudomonas aeruginosa. Microbiology (Reading, England) 2023; 169:001322. [PMID: 37319001 PMCID: PMC10333786 DOI: 10.1099/mic.0.001322] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/23/2022] [Accepted: 03/18/2023] [Indexed: 06/17/2023]
Abstract
Drug efflux pumps transport antimicrobial agents out of bacteria, thereby reducing the intracellular antimicrobial concentration, which is associated with intrinsic and acquired bacterial resistance to these antimicrobials. As genome analysis has advanced, many drug efflux pump genes have been detected in the genomes of bacterial species. In addition to drug resistance, these pumps are involved in various essential physiological functions, such as bacterial adaptation to hostile environments, toxin and metabolite efflux, biofilm formation and quorum sensing. In Gram-negative bacteria, efflux pumps in the resistance–nodulation–division (RND) superfamily play a clinically important role. In this review, we focus on Gram-negative bacteria, including Salmonella enterica, Escherichia coli and Pseudomonas aeruginosa, and discuss the role of RND efflux pumps in drug resistance and physiological functions.
Affiliation(s)
- Seiji Yamasaki
- SANKEN (The Institute of Scientific and Industrial Research), Osaka University, 8-1 Mihogaoka, Ibaraki, Osaka 567-0047, Japan
- Graduate School of Pharmaceutical Sciences, Osaka University, 1-6 Yamadaoka, Suita, Osaka 565-0871, Japan
- Institute for Advanced Co-Creation Studies, Osaka University, 1-1 Yamadaoka, Suita, Osaka 565-0871, Japan
| | - Martijn Zwama
- SANKEN (The Institute of Scientific and Industrial Research), Osaka University, 8-1 Mihogaoka, Ibaraki, Osaka 567-0047, Japan
| | - Tomohiro Yoneda
- SANKEN (The Institute of Scientific and Industrial Research), Osaka University, 8-1 Mihogaoka, Ibaraki, Osaka 567-0047, Japan
- Graduate School of Pharmaceutical Sciences, Osaka University, 1-6 Yamadaoka, Suita, Osaka 565-0871, Japan
| | - Mitsuko Hayashi-Nishino
- SANKEN (The Institute of Scientific and Industrial Research), Osaka University, 8-1 Mihogaoka, Ibaraki, Osaka 567-0047, Japan
- Graduate School of Pharmaceutical Sciences, Osaka University, 1-6 Yamadaoka, Suita, Osaka 565-0871, Japan
| | - Kunihiko Nishino
- SANKEN (The Institute of Scientific and Industrial Research), Osaka University, 8-1 Mihogaoka, Ibaraki, Osaka 567-0047, Japan
- Graduate School of Pharmaceutical Sciences, Osaka University, 1-6 Yamadaoka, Suita, Osaka 565-0871, Japan
- Center for Infectious Disease Education and Research, 2-8 Yamadaoka, Osaka University, Suita, Osaka 565-0871, Japan
|
38
|
Kandwal S, Fayne D. Genetic conservation across SARS-CoV-2 non-structural proteins - Insights into possible targets for treatment of future viral outbreaks. Virology 2023; 581:97-115. [PMID: 36940641 PMCID: PMC9999249 DOI: 10.1016/j.virol.2023.02.011] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 10/03/2022] [Revised: 02/10/2023] [Accepted: 02/20/2023] [Indexed: 03/12/2023]
Abstract
The majority of SARS-CoV-2 therapeutic development work has focussed on targeting the spike protein, viral polymerase and proteases. As the pandemic progressed, many studies reported that these proteins are prone to high levels of mutation and can become drug resistant. Thus, it is necessary to not only target other viral proteins such as the non-structural proteins (NSPs) but to also target the most conserved residues of these proteins. In order to understand the level of conservation among these viruses, in this review, we have focussed on the conservation across RNA viruses, conservation across the coronaviruses and then narrowed our focus to conservation of NSPs across coronaviruses. We have also discussed the various treatment options for SARS-CoV-2 infection. A synergistic melding of bioinformatics, computer-aided drug-design and in vitro/vivo studies can feed into better understanding of the virus and therefore help in the development of small molecule inhibitors against the viral proteins.
Affiliation(s)
- Shubhangi Kandwal
- Molecular Design Group, School of Biochemistry & Immunology, Trinity Biomedical Sciences Institute, Trinity College Dublin, Pearse Street, Dublin, 2, Ireland
| | - Darren Fayne
- Molecular Design Group, School of Biochemistry & Immunology, Trinity Biomedical Sciences Institute, Trinity College Dublin, Pearse Street, Dublin, 2, Ireland.
|
39
|
Burroughs A, Aravind L. New biochemistry in the Rhodanese-phosphatase superfamily: emerging roles in diverse metabolic processes, nucleic acid modifications, and biological conflicts. NAR Genom Bioinform 2023; 5:lqad029. [PMID: 36968430 PMCID: PMC10034599 DOI: 10.1093/nargab/lqad029] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 12/21/2022] [Revised: 02/10/2023] [Accepted: 03/09/2023] [Indexed: 03/25/2023] Open
Abstract
The protein-tyrosine/dual-specificity phosphatases and rhodanese domains constitute a sprawling superfamily of Rossmannoid domains that use a conserved active site with a cysteine to catalyze a range of phosphate-transfer, thiotransfer, selenotransfer and redox activities. While these enzymes have been extensively studied in the context of protein/lipid head group dephosphorylation and various thiotransfer reactions, their overall diversity and catalytic potential remain poorly understood. Using comparative genomics and sequence/structure analysis, we comprehensively investigate and develop a natural classification for this superfamily. As a result, we identified several novel clades, both those which retain the catalytic cysteine and those where a distinct active site has emerged in the same location (e.g. diphthine synthase-like methylases and RNA 2' OH ribosyl phosphate transferases). We also present evidence that the superfamily has a wider range of catalytic capabilities than previously known, including a set of parallel activities operating on various sugar/sugar alcohol groups in the context of NAD+-derivatives and RNA termini, and potential phosphate transfer activities involving sugars and nucleotides. We show that such activities are particularly expanded in the RapZ-C-DUF488-DUF4326 clade, defined here for the first time. Some enzymes from this clade are predicted to catalyze novel DNA-end processing activities as part of nucleic-acid-modifying systems that are likely to function in biological conflicts between viruses and their hosts.
Affiliation(s)
- A Maxwell Burroughs
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
| | - L Aravind
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
|
40
|
Jacques F, Bolivar P, Pietras K, Hammarlund EU. Roadmap to the study of gene and protein phylogeny and evolution-A practical guide. PLoS One 2023; 18:e0279597. [PMID: 36827278 PMCID: PMC9955684 DOI: 10.1371/journal.pone.0279597] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 04/09/2022] [Accepted: 12/12/2022] [Indexed: 02/25/2023] Open
Abstract
Developments in sequencing technologies and the sequencing of an ever-increasing number of genomes have revolutionised studies of biodiversity and organismal evolution. This accumulation of data has been paralleled by the creation of numerous public biological databases through which the scientific community can mine the sequences and annotations of genomes, transcriptomes, and proteomes of multiple species. However, to find the appropriate databases and bioinformatic tools for respective inquiries and aims can be challenging. Here, we present a compilation of DNA and protein databases, as well as bioinformatic tools for phylogenetic reconstruction and a wide range of studies on molecular evolution. We provide a protocol for information extraction from biological databases and simple phylogenetic reconstruction using probabilistic and distance methods, facilitating the study of biodiversity and evolution at the molecular level for the broad scientific community.
Affiliation(s)
- Florian Jacques
- Lund University Cancer Centre, Department of Laboratory Medicine, Lund University, Lund, Sweden
- Lund Stem Cell Center, Department of Laboratory Medicine, Lund University, Lund, Sweden
| | - Paulina Bolivar
- Lund University Cancer Centre, Department of Laboratory Medicine, Lund University, Lund, Sweden
| | - Kristian Pietras
- Lund University Cancer Centre, Department of Laboratory Medicine, Lund University, Lund, Sweden
| | - Emma U. Hammarlund
- Lund University Cancer Centre, Department of Laboratory Medicine, Lund University, Lund, Sweden
- Lund Stem Cell Center, Department of Laboratory Medicine, Lund University, Lund, Sweden
|
41
|
Cvrčková F, Bezvoda R. Gaining Insight into Large Gene Families with the Aid of Bioinformatic Tools. Methods Mol Biol 2023; 2604:173-191. [PMID: 36773233 DOI: 10.1007/978-1-0716-2867-6_13] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/12/2023]
Abstract
Proteins participating in plant cell morphogenesis are often encoded by large gene families, in some cases comprising paralogs with variable (modular) domain organization, as in the case of the formin (FH2 protein) family of actin nucleators that can also have additional functions. Unravelling the phylogeny of such a complex gene family brings a number of specific challenges but may be crucial for predictions of protein function and for experimental design. Here we present an overview of our "cottage industry" semi-manual bioinformatic approach, based mostly, though not exclusively, on freely available software tools, which we used to obtain insight into the evolutionary history of plant FH2 proteins and some other components of the plant cell morphogenesis apparatus.
Affiliation(s)
- Fatima Cvrčková
- Department of Experimental Plant Biology, Faculty of Science, Charles University, CZ, Prague, Czechia.
| | - Radek Bezvoda
- Department of Experimental Plant Biology, Faculty of Science, Charles University, CZ, Prague, Czechia
|
42
|
Bainbridge RE, Rosenbaum JC, Sau P, Carlson AE. Xenopus laevis lack the critical sperm factor PLCζ. bioRxiv 2023:2023.02.02.526858. [PMID: 36778253 PMCID: PMC9915601 DOI: 10.1101/2023.02.02.526858] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/09/2023]
Abstract
Fertilization of eggs from the African clawed frog Xenopus laevis is characterized by an increase in cytosolic calcium, a phenomenon that is also observed in other vertebrates such as mammals and birds. During fertilization in mammals and birds, the transfer of the soluble PLCζ from sperm into the egg is thought to trigger the release of calcium from the endoplasmic reticulum (ER). Injecting sperm extracts into eggs reproduces this effect, reinforcing the hypothesis that a sperm factor is responsible for calcium release and egg activation. Remarkably, this occurs even when sperm extracts from X. laevis are injected into mouse eggs, suggesting that mammals and X. laevis share a sperm factor. However, X. laevis lacks an annotated PLCZ1 gene, which encodes the PLCζ enzyme. In this study, we attempted to determine whether sperm from X. laevis express an unannotated PLCZ1 ortholog. We identified PLCZ1 orthologs in 11 amphibian species, including 5 that had not been previously characterized, but did not find any in either X. laevis or the closely related Xenopus tropicalis. Additionally, we performed RNA sequencing on testes obtained from adult X. laevis males and did not identify potential PLCZ1 orthologs in our dataset or in previously collected ones. These findings suggest that PLCZ1 may have been lost in the Xenopus lineage and raise the question of how fertilization triggers calcium release and egg activation in these species.
Affiliation(s)
| | | | - Paushaly Sau
- Department of Biological Sciences, University of Pittsburgh, Pittsburgh PA 15260
| | - Anne E. Carlson
- Department of Biological Sciences, University of Pittsburgh, Pittsburgh PA 15260
|
43
|
Sanchez-Alonso P, Cobos-Justo E, Avalos-Rangel MA, López-Reyes L, Paniagua-Contreras GL, Vaca-Paniagua F, Anastacio-Marcelino E, López-Ochoa AJ, Pérez Marquez VM, Negrete-Abascal E, Vázquez-Cruz C. A Maverick-like cluster in the genome of a pathogenic, moderately virulent strain of Gallibacterium anatis, ESV200, a transient biofilm producer. Front Microbiol 2023; 14:1084766. [PMID: 36778889 PMCID: PMC9909271 DOI: 10.3389/fmicb.2023.1084766] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2022] [Accepted: 01/06/2023] [Indexed: 01/28/2023] Open
Abstract
Introduction: Gallibacterium anatis causes gallibacteriosis in birds. These bacteria produce biofilms and secrete several fimbrial appendages as tools to cause disease in animals. G. anatis strains contain up to three types of fimbriae. Complete genome sequencing is the strategy currently used to determine variations in the gene content of G. anatis, although today only the completely circularized genome of G. anatis UMN179 is available. Methods: The appearance of growth of various strains of G. anatis in liquid culture medium was studied. Biofilm production and how the amount of biofilm was affected by DNase, Proteinase K, and Pronase E enzymes were analyzed. Fimbrial gene expression was assessed by protein analysis and qRT-PCR. In an avian model, the pathogenesis generated by the strains G. anatis ESV200 and 12656-12 was investigated. Using bioinformatic tools, the complete genome of G. anatis ESV200 was comparatively studied to search for virulence factors that would help explain the pathogenic behavior of this strain. Results and Discussion: The G. anatis ESV200 strain differs from the 12656-12 strain because it produces a biofilm at 20%. The ESV200 strain expresses fimbrial genes and produces biofilm, but with a different structure than that observed for strain 12656-12. The ESV200 and 12656-12 strains are both pathogenic for chickens, although the latter is the more virulent. Here, we show that the complete genome of the ESV200 strain is similar to that of the UMN179 strain. However, these strains have evolved with many structural rearrangements; the most striking chromosomal arrangement is a Maverick-like element present in the ESV200 strain.
Affiliation(s)
- Patricia Sanchez-Alonso
- Centro de Investigaciones en Ciencias Microbiológicas, Instituto de Ciencias, Benemérita Universidad Autónoma de Puebla, Puebla, Mexico (corresponding author)
| | - Elena Cobos-Justo
- Centro de Investigaciones en Ciencias Microbiológicas, Instituto de Ciencias, Benemérita Universidad Autónoma de Puebla, Puebla, Mexico
| | - Miguel Angel Avalos-Rangel
- Centro de Investigaciones en Ciencias Microbiológicas, Instituto de Ciencias, Benemérita Universidad Autónoma de Puebla, Puebla, Mexico
| | - Lucía López-Reyes
- Centro de Investigaciones en Ciencias Microbiológicas, Instituto de Ciencias, Benemérita Universidad Autónoma de Puebla, Puebla, Mexico
| | - Gloria Luz Paniagua-Contreras
- Carrera de Biología, Facultad de Estudios Superiores de Iztacala, UNAM, Los Reyes Iztacala, Estado de México, Mexico
| | - Felipe Vaca-Paniagua
- Carrera de Biología, Facultad de Estudios Superiores de Iztacala, UNAM, Los Reyes Iztacala, Estado de México, Mexico
- Subdirección de Investigación Basica, Instituto Nacional de Cancerología, CDMX, México
| | - Estela Anastacio-Marcelino
- Centro de Investigaciones en Ciencias Microbiológicas, Instituto de Ciencias, Benemérita Universidad Autónoma de Puebla, Puebla, Mexico
| | - Ana Jaqueline López-Ochoa
- Centro de Investigaciones en Ciencias Microbiológicas, Instituto de Ciencias, Benemérita Universidad Autónoma de Puebla, Puebla, Mexico
| | - Victor M. Pérez Marquez
- Diagnóstico y Patobiología Aviar, Biotecnología Veterinaria S.A.-Biovetsa, BIOVETSA, Tehuacán, Mexico
| | - Erasmo Negrete-Abascal
- Carrera de Biología, Facultad de Estudios Superiores de Iztacala, UNAM, Los Reyes Iztacala, Estado de México, Mexico
| | - Candelario Vázquez-Cruz
- Centro de Investigaciones en Ciencias Microbiológicas, Instituto de Ciencias, Benemérita Universidad Autónoma de Puebla, Puebla, Mexico (corresponding author)
|
44
|
Rodriguez OL, Silver CA, Shields K, Smith ML, Watson CT. Targeted long-read sequencing facilitates phased diploid assembly and genotyping of the human T cell receptor alpha, delta, and beta loci. Cell Genomics 2022; 2:100228. [PMID: 36778049 PMCID: PMC9903726 DOI: 10.1016/j.xgen.2022.100228] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 06/02/2022] [Revised: 08/25/2022] [Accepted: 11/05/2022] [Indexed: 12/02/2022]
Abstract
T cell receptors (TCRs) recognize peptide fragments presented by the major histocompatibility complex (MHC) and are critical to T cell-mediated immunity. Recent data have indicated that genetic diversity within TCR-encoding gene regions is underexplored, limiting understanding of the impact of TCR loci polymorphisms on TCR function in disease, even though TCR repertoire signatures (1) are heritable and (2) associate with disease phenotypes. To address this, we developed a targeted long-read sequencing approach to generate highly accurate haplotype resolved assemblies of the TCR beta (TRB) and alpha/delta (TRA/D) loci, facilitating the genotyping of all variant types, including structural variants. We validate our approach using two mother-father-child trios and 5 unrelated donors representing multiple populations. This resulted in improved genotyping accuracy and the discovery of 84 undocumented V, D, J, and C alleles, demonstrating the utility of this framework for improving our understanding of TCR diversity and function in disease.
Affiliation(s)
- Oscar L. Rodriguez
- Department of Biochemistry and Molecular Genetics, University of Louisville School of Medicine, Louisville, KY, USA
| | - Catherine A. Silver
- Department of Biochemistry and Molecular Genetics, University of Louisville School of Medicine, Louisville, KY, USA
| | - Kaitlyn Shields
- Department of Biochemistry and Molecular Genetics, University of Louisville School of Medicine, Louisville, KY, USA
| | - Melissa L. Smith
- Department of Biochemistry and Molecular Genetics, University of Louisville School of Medicine, Louisville, KY, USA
| | - Corey T. Watson
- Department of Biochemistry and Molecular Genetics, University of Louisville School of Medicine, Louisville, KY, USA (corresponding author)
|
45
|
Maleki E, Akbari Rokn Abadi S, Koohi S. HELIOS: High-speed sequence alignment in optics. PLoS Comput Biol 2022; 18:e1010665. [PMID: 36409684 PMCID: PMC9678324 DOI: 10.1371/journal.pcbi.1010665] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/01/2022] [Accepted: 10/18/2022] [Indexed: 11/22/2022] Open
Abstract
In response to the imperfections of current sequence alignment methods, which originate from the inherent serialism of their underlying electrical systems, a few optical approaches for biological data comparison have been proposed recently. However, these approaches perform poorly because of their inefficient coding schemes. This paper therefore presents a novel all-optical high-throughput method for aligning DNA, RNA, and protein sequences, named HELIOS. The HELIOS method employs highly sophisticated operations to locate character matches, single or multiple mutations, and single or multiple indels within various biological sequences. The HELIOS optical architecture, in turn, exploits high-speed processing and operational parallelism in optics by adopting the wavelength and polarization of optical beams. For evaluation, the functionality and accuracy of the HELIOS method are verified through behavioral and optical simulation studies, while its complexity and performance are estimated through analytical computation. The accuracy evaluations indicate that the HELIOS method achieves a precise pairwise alignment of two sequences, highly similar to those of Smith-Waterman, Needleman-Wunsch, BLAST, MUSCLE, ClustalW, ClustalΩ, T-Coffee, Kalign, and MAFFT. According to our performance evaluations, the HELIOS optical architecture outperforms all alternative electrical and optical algorithms in terms of processing time and memory requirement, relying on its highly sophisticated method and optical architecture. Moreover, the employed compact coding scheme greatly increases the number of input characters and hence offers reduced time and space complexities compared to the electrical and optical alternatives. This makes the HELIOS method and optical architecture highly applicable for biomedical applications.
Affiliation(s)
- Ehsan Maleki
- Department of Computer Engineering, Sharif University of Technology, Tehran, Iran
| | | | - Somayyeh Koohi
- Department of Computer Engineering, Sharif University of Technology, Tehran, Iran
|
46
|
Somee MR, Amoozegar MA, Dastgheib SMM, Shavandi M, Maman LG, Bertilsson S, Mehrshad M. Genome-resolved analyses show an extensive diversification in key aerobic hydrocarbon-degrading enzymes across bacteria and archaea. BMC Genomics 2022; 23:690. [PMID: 36203131 PMCID: PMC9535955 DOI: 10.1186/s12864-022-08906-w] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 07/20/2022] [Accepted: 09/26/2022] [Indexed: 12/04/2022] Open
Abstract
Background: Hydrocarbons (HCs) are organic compounds composed solely of carbon and hydrogen that are mainly accumulated in oil reservoirs. As the introduction of all classes of hydrocarbons including crude oil and oil products into the environment has increased significantly, oil pollution has become a global ecological problem. However, our perception of pathways for biotic degradation of major HCs and key enzymes in these bioconversion processes has mainly been based on cultured microbes and is biased by uneven taxonomic representation. Here we used Annotree to provide a gene-centric view of the aerobic degradation ability of aliphatic and aromatic HCs in 23,446 genomes from 123 bacterial and 14 archaeal phyla. Results: Apart from the widespread genetic potential for HC degradation in Proteobacteria, Actinobacteriota, Bacteroidota, and Firmicutes, genomes from an additional 18 bacterial and 3 archaeal phyla also hosted key HC degrading enzymes. Among these, such degradation potential has not been previously reported for representatives in the phyla UBA8248, Tectomicrobia, SAR324, and Eremiobacterota. Genomes containing whole pathways for complete degradation of HCs were only detected in Proteobacteria and Actinobacteriota. Except for several members of Crenarchaeota, Halobacterota, and Nanoarchaeota that have tmoA, ladA, and alkB/M key genes, respectively, representatives of archaeal genomes made a small contribution to HC degradation. None of the screened archaeal genomes coded for complete HC degradation pathways studied here; however, they contribute significantly to peripheral routes of HC degradation with bacteria. Conclusion: Phylogeny reconstruction showed that the reservoir of key aerobic hydrocarbon-degrading enzymes in Bacteria and Archaea undergoes extensive diversification via gene duplication and horizontal gene transfer. This diversification could potentially enable microbes to rapidly adapt to novel and manufactured HCs that reach the environment.
Supplementary Information: The online version contains supplementary material available at 10.1186/s12864-022-08906-w.
Affiliation(s)
- Maryam Rezaei Somee
- Extremophile Laboratory, Department of Microbiology, School of Biology, College of Science, University of Tehran, Tehran, Iran
| | - Mohammad Ali Amoozegar
- Extremophile Laboratory, Department of Microbiology, School of Biology, College of Science, University of Tehran, Tehran, Iran
| | | | - Mahmoud Shavandi
- Biotechnology Research Group, Research Institute of Petroleum Industry, Tehran, Iran
| | - Leila Ghanbari Maman
- Laboratory of Complex Biological Systems and Bioinformatics (CBB), Institute of Biochemistry and Biophysics, University of Tehran, Tehran, Iran
| | - Stefan Bertilsson
- Department of Aquatic Sciences and Assessment, Swedish University of Agricultural Sciences (SLU), Box 7050, 75007, Uppsala, Sweden
| | - Maliheh Mehrshad
- Department of Aquatic Sciences and Assessment, Swedish University of Agricultural Sciences (SLU), Box 7050, 75007, Uppsala, Sweden.
|
47
|
Chen PYT, Adak S, Chekan JR, Liscombe DK, Miyanaga A, Bernhardt P, Diethelm S, Fielding EN, George JH, Miles ZD, Murray LAM, Steele TS, Winter JM, Noel JP, Moore BS. Structural Basis of Stereospecific Vanadium-Dependent Haloperoxidase Family Enzymes in Napyradiomycin Biosynthesis. Biochemistry 2022; 61:1844-1852. [PMID: 35985031 PMCID: PMC10978243 DOI: 10.1021/acs.biochem.2c00338] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 11/28/2022]
Abstract
Vanadium-dependent haloperoxidases (VHPOs) from Streptomyces bacteria differ from their counterparts in fungi, macroalgae, and other bacteria by catalyzing organohalogenating reactions with strict regiochemical and stereochemical control. While this group of enzymes collectively uses hydrogen peroxide to oxidize halides for incorporation into electron-rich organic molecules, the mechanism for the controlled transfer of highly reactive chloronium ions in the biosynthesis of napyradiomycin and merochlorin antibiotics sets the Streptomyces vanadium-dependent chloroperoxidases apart. Here we report high-resolution crystal structures of two homologous VHPO family members associated with napyradiomycin biosynthesis, NapH1 and NapH3, that catalyze distinctive chemical reactions in the construction of meroterpenoid natural products. The structures, combined with site-directed mutagenesis and intact protein mass spectrometry studies, afforded a mechanistic model for the asymmetric alkene and arene chlorination reactions catalyzed by NapH1 and the isomerase activity catalyzed by NapH3. A key lysine residue in NapH1 situated between the coordinated vanadate and the putative substrate binding pocket was shown to be essential for catalysis. This observation suggested the involvement of the ε-NH2, possibly through formation of a transient chloramine, as the chlorinating species much as proposed in structurally distinct flavin-dependent halogenases. Unexpectedly, NapH3 is modified post-translationally by phosphorylation of an active site His (τ-pHis) consistent with its repurposed halogenation-independent, α-hydroxyketone isomerase activity. These structural studies deepen our understanding of the mechanistic underpinnings of VHPO enzymes and their evolution as enantioselective biocatalysts.
Affiliation(s)
- Percival Yang-Ting Chen
  - Center for Marine Biotechnology and Biomedicine, Scripps Institution of Oceanography, University of California, San Diego, La Jolla, California 92093, United States
- Sanjoy Adak
  - Center for Marine Biotechnology and Biomedicine, Scripps Institution of Oceanography, University of California, San Diego, La Jolla, California 92093, United States
- Jonathan R Chekan
  - Center for Marine Biotechnology and Biomedicine, Scripps Institution of Oceanography, University of California, San Diego, La Jolla, California 92093, United States
- David K Liscombe
  - Jack H. Skirball Center for Chemical Biology and Proteomics, The Salk Institute for Biological Studies, La Jolla, California 92037, United States
- Akimasa Miyanaga
  - Center for Marine Biotechnology and Biomedicine, Scripps Institution of Oceanography, University of California, San Diego, La Jolla, California 92093, United States
- Peter Bernhardt
  - Center for Marine Biotechnology and Biomedicine, Scripps Institution of Oceanography, University of California, San Diego, La Jolla, California 92093, United States
- Stefan Diethelm
  - Center for Marine Biotechnology and Biomedicine, Scripps Institution of Oceanography, University of California, San Diego, La Jolla, California 92093, United States
- Elisha N Fielding
  - Center for Marine Biotechnology and Biomedicine, Scripps Institution of Oceanography, University of California, San Diego, La Jolla, California 92093, United States
- Jonathan H George
  - Department of Chemistry, University of Adelaide, Adelaide, South Australia 5005, Australia
- Zachary D Miles
  - Center for Marine Biotechnology and Biomedicine, Scripps Institution of Oceanography, University of California, San Diego, La Jolla, California 92093, United States
- Lauren A M Murray
  - Department of Chemistry, University of Adelaide, Adelaide, South Australia 5005, Australia
- Taylor S Steele
  - Center for Marine Biotechnology and Biomedicine, Scripps Institution of Oceanography, University of California, San Diego, La Jolla, California 92093, United States
- Jaclyn M Winter
  - Center for Marine Biotechnology and Biomedicine, Scripps Institution of Oceanography, University of California, San Diego, La Jolla, California 92093, United States
- Joseph P Noel
  - Jack H. Skirball Center for Chemical Biology and Proteomics, The Salk Institute for Biological Studies, La Jolla, California 92037, United States
- Bradley S Moore
  - Center for Marine Biotechnology and Biomedicine, Scripps Institution of Oceanography, University of California, San Diego, La Jolla, California 92093, United States
  - Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, California 92093, United States
48
Blaine HC, Burke JT, Ravi J, Stallings CL. DciA Helicase Operators Exhibit Diversity across Bacterial Phyla. J Bacteriol 2022; 204:e0016322. [PMID: 35880876 PMCID: PMC9380583 DOI: 10.1128/jb.00163-22] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/04/2022] [Accepted: 06/21/2022] [Indexed: 01/28/2023] Open
Abstract
A fundamental requirement for life is the replication of an organism's DNA. Studies in Escherichia coli and Bacillus subtilis have set the paradigm for DNA replication in bacteria. During replication initiation in E. coli and B. subtilis, the replicative helicase is loaded onto the DNA at the origin of replication by an ATPase helicase loader. However, most bacteria do not encode homologs to the helicase loaders in E. coli and B. subtilis. Recent work has identified the DciA protein as a predicted helicase operator that may perform a function analogous to the helicase loaders in E. coli and B. subtilis. DciA proteins, which are defined by the presence of a DUF721 domain (termed the DciA domain herein), are conserved in most bacteria but have only been studied in mycobacteria and gammaproteobacteria (Pseudomonas aeruginosa and Vibrio cholerae). Sequences outside the DciA domain in Mycobacterium tuberculosis DciA are essential for protein function but are not conserved in the P. aeruginosa and V. cholerae homologs, raising questions regarding the conservation and evolution of DciA proteins across bacterial phyla. To comprehensively define the DciA protein family, we took a computational evolutionary approach and analyzed the domain architectures and sequence properties of DciA domain-containing proteins across the tree of life. These analyses identified lineage-specific domain architectures among DciA homologs, as well as broadly conserved sequence-structural motifs. The diversity of DciA proteins represents the evolution of helicase operation in bacterial DNA replication and highlights the need for phylum-specific analyses of this fundamental biological process.
IMPORTANCE Despite the fundamental importance of DNA replication for life, this process remains understudied in bacteria outside Escherichia coli and Bacillus subtilis. In particular, most bacteria do not encode the helicase-loading proteins that are essential in E. coli and B. subtilis for DNA replication. Instead, most bacteria encode a DciA homolog that likely constitutes the predominant mechanism of helicase operation in bacteria. However, it is still unknown how DciA structure and function compare across diverse phyla that encode DciA proteins. In this study, we performed computational evolutionary analyses to uncover tremendous diversity among DciA homologs. These studies provide a significant advance in our understanding of an essential component of the bacterial DNA replication machinery.
Affiliation(s)
- Helen C. Blaine
  - Department of Molecular Microbiology, Washington University School of Medicine, Saint Louis, Missouri, USA
  - Center for Women’s Infectious Disease Research, Washington University School of Medicine, Saint Louis, Missouri, USA
- Joseph T. Burke
  - Department of Pathobiology and Diagnostic Investigation, Michigan State University, East Lansing, Michigan, USA
  - Department of Microbiology and Molecular Genetics, Michigan State University, East Lansing, Michigan, USA
  - Genomics and Molecular Genetics Undergraduate Program, Michigan State University, East Lansing, Michigan, USA
- Janani Ravi
  - Department of Pathobiology and Diagnostic Investigation, Michigan State University, East Lansing, Michigan, USA
  - Department of Microbiology and Molecular Genetics, Michigan State University, East Lansing, Michigan, USA
- Christina L. Stallings
  - Department of Molecular Microbiology, Washington University School of Medicine, Saint Louis, Missouri, USA
  - Center for Women’s Infectious Disease Research, Washington University School of Medicine, Saint Louis, Missouri, USA
49
Hsueh BY, Severin GB, Elg CA, Waldron EJ, Kant A, Wessel AJ, Dover JA, Rhoades CR, Ridenhour BJ, Parent KN, Neiditch MB, Ravi J, Top EM, Waters CM. Phage defence by deaminase-mediated depletion of deoxynucleotides in bacteria. Nat Microbiol 2022; 7:1210-1220. [PMID: 35817890 PMCID: PMC9830645 DOI: 10.1038/s41564-022-01162-4] [Citation(s) in RCA: 61] [Impact Index Per Article: 20.3] [Received: 12/20/2021] [Accepted: 05/24/2022] [Indexed: 02/03/2023]
Abstract
Vibrio cholerae biotype El Tor is perpetuating the longest cholera pandemic in recorded history. The genomic islands VSP-1 and VSP-2 distinguish El Tor from previous pandemic V. cholerae strains. Using a co-occurrence analysis of VSP genes in >200,000 bacterial genomes, we built gene networks to infer biological functions encoded in these islands. This revealed that dncV, a component of the cyclic-oligonucleotide-based anti-phage signalling system (CBASS), co-occurs with an uncharacterized gene, vc0175, that we rename avcD for anti-viral cytidine deaminase. We show that AvcD is a deoxycytidylate deaminase and that its activity is post-translationally inhibited by a non-coding RNA named AvcI. AvcID and bacterial homologues protect bacterial populations against phage invasion by depleting free deoxycytidine nucleotides during infection, thereby decreasing phage replication. Homologues of avcD exist in all three domains of life, and bacterial AvcID defends against phage infection by combining traits of two eukaryotic innate viral immunity proteins, APOBEC and SAMHD1.
Affiliation(s)
- Brian Y Hsueh
  - Department of Microbiology and Molecular Genetics, Michigan State University, East Lansing, MI, USA
- Geoffrey B Severin
  - Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI, USA
  - Department of Microbiology and Immunology, University of Michigan, Ann Arbor, MI, USA
- Clinton A Elg
  - Department of Biological Sciences, Institute for Interdisciplinary Data Sciences, Bioinformatics and Computational Biology Program, University of Idaho, Moscow, ID, USA
- Evan J Waldron
  - Department of Microbiology, Biochemistry, and Molecular Genetics, New Jersey Medical School, Rutgers University, Newark, NJ, USA
  - Department of Pathology and Cell Biology, Columbia University, New York, NY, USA
- Abhiruchi Kant
  - Department of Microbiology, Biochemistry, and Molecular Genetics, New Jersey Medical School, Rutgers University, Newark, NJ, USA
- Alex J Wessel
  - Department of Microbiology and Molecular Genetics, Michigan State University, East Lansing, MI, USA
- John A Dover
  - Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI, USA
- Christopher R Rhoades
  - Department of Microbiology and Molecular Genetics, Michigan State University, East Lansing, MI, USA
- Benjamin J Ridenhour
  - Department of Mathematics and Statistical Sciences, University of Idaho, Moscow, ID, USA
- Kristin N Parent
  - Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI, USA
- Matthew B Neiditch
  - Department of Microbiology, Biochemistry, and Molecular Genetics, New Jersey Medical School, Rutgers University, Newark, NJ, USA
- Janani Ravi
  - Department of Microbiology and Molecular Genetics, Michigan State University, East Lansing, MI, USA
  - Department of Pathobiology and Diagnostic Investigation, Michigan State University, East Lansing, MI, USA
- Eva M Top
  - Department of Biological Sciences, Institute for Interdisciplinary Data Sciences, Bioinformatics and Computational Biology Program, University of Idaho, Moscow, ID, USA
- Christopher M Waters
  - Department of Microbiology and Molecular Genetics, Michigan State University, East Lansing, MI, USA
50
Chao J, Tang F, Xu L. Developments in Algorithms for Sequence Alignment: A Review. Biomolecules 2022; 12:biom12040546. [PMID: 35454135 PMCID: PMC9024764 DOI: 10.3390/biom12040546] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 02/11/2022] [Revised: 03/29/2022] [Accepted: 03/31/2022] [Indexed: 01/27/2023] Open
Abstract
The continuous development of sequencing technologies has enabled researchers to obtain large amounts of biological sequence data, and this has resulted in increasing demands for software that can perform sequence alignment fast and accurately. A number of algorithms and tools for sequence alignment have been designed to meet the various needs of biologists. Here, the ideas that prevail in the research of sequence alignment and some quality estimation methods for multiple sequence alignment tools are summarized.
Affiliation(s)
- Jiannan Chao
  - Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu 610054, China
- Furong Tang
  - Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou 324003, China
  - School of Electronic and Communication Engineering, Shenzhen Polytechnic, Shenzhen 518055, China
- Lei Xu
  - School of Electronic and Communication Engineering, Shenzhen Polytechnic, Shenzhen 518055, China
  - Correspondence: