1
Liu G, Chen X, Luan Y, Li D. VirusPredictor: XGBoost-based software to predict virus-related sequences in human data. Bioinformatics 2024; 40:btae192. [PMID: 38597887 PMCID: PMC11052659 DOI: 10.1093/bioinformatics/btae192] [Received: 10/12/2023] [Revised: 02/29/2024] [Accepted: 04/08/2024] [Indexed: 04/11/2024]
Abstract
MOTIVATION: Discovering disease causative pathogens, particularly viruses without reference genomes, poses a technical challenge as they are often unidentifiable through sequence alignment. Machine learning prediction of patient high-throughput sequences unmappable to human and pathogen genomes may reveal sequences originating from uncharacterized viruses. Currently, there is a lack of software specifically designed for accurately predicting such viral sequences in human data.
RESULTS: We developed a fast XGBoost method and software VirusPredictor leveraging an in-house viral genome database. Our two-step XGBoost models first classify each query sequence into one of three groups: infectious virus, endogenous retrovirus (ERV) or non-ERV human. The prediction accuracies increased as the sequences became longer, i.e. 0.76, 0.93, and 0.98 for 150-350 (Illumina short reads), 850-950 (Sanger sequencing data), and 2000-5000 bp sequences, respectively. Then, sequences predicted to be from infectious viruses are further classified into one of six virus taxonomic subgroups, and the accuracies increased from 0.92 to >0.98 when query sequences increased from 150-350 to >850 bp. The results suggest that Illumina short reads should be de novo assembled into contigs (e.g. ∼1000 bp or longer) before prediction whenever possible. We applied VirusPredictor to multiple real genomic and metagenomic datasets and obtained high accuracies. VirusPredictor, a user-friendly open-source Python software, is useful for predicting the origins of patients' unmappable sequences. This study is the first to classify ERVs in infectious viral sequence prediction. This is also the first study combining virus sub-group predictions.
AVAILABILITY AND IMPLEMENTATION: www.dllab.org/software/VirusPredictor.html.
Affiliation(s)
- Guangchen Liu
- Department of Microbiology and Molecular Genetics, University of Vermont, Burlington, Vermont 05405, United States
- School of Mathematics, Shandong University, Jinan, Shandong 250100, China
- School of Mathematics and Statistics, Ludong University, Yantai, Shandong 264025, China
- Xun Chen
- Department of Microbiology and Molecular Genetics, University of Vermont, Burlington, Vermont 05405, United States
- Yihui Luan
- School of Mathematics, Shandong University, Jinan, Shandong 250100, China
- Dawei Li
- Department of Microbiology and Molecular Genetics, University of Vermont, Burlington, Vermont 05405, United States
- Department of Immunology and Molecular Microbiology, Texas Tech University Health Sciences Center, Lubbock, Texas 79430, United States
- ICanCME Research Network, Sainte-Justine University Hospital Research Center, Montreal, Quebec H3T 1C5, Canada
2
Warren CJ, Barbachano-Guerrero A, Bauer VL, Stabell AC, Dirasantha O, Yang Q, Sawyer SL. Adaptation of CD4 in gorillas and chimpanzees conveyed resistance to simian immunodeficiency viruses. bioRxiv 2024:2023.11.13.566830. [PMID: 38014262 PMCID: PMC10680607 DOI: 10.1101/2023.11.13.566830] [Indexed: 11/29/2023]
Abstract
Simian immunodeficiency viruses (SIVs) comprise a large group of primate lentiviruses that endemically infect African monkeys. HIV-1 spilled over to humans from this viral reservoir, but the spillover did not occur directly from monkeys to humans. Instead, a key event was the introduction of SIVs into great apes, which then set the stage for infection of humans. Here, we investigate the role of the lentiviral entry receptor, CD4, in this key and fateful event in the history of SIV/HIV emergence. First, we reconstructed and tested ancient forms of CD4 at two important nodes in ape speciation, both prior to the infection of chimpanzees and gorillas with these viruses. These ancestral CD4s fully supported entry of diverse SIV isolates related to the viruses that made this initial jump to apes. In stark contrast, modern chimpanzee and gorilla CD4 orthologs are more resistant to these viruses. To investigate how this resistance in CD4 was gained, we acquired CD4 gene sequences from 32 gorilla individuals of two species, and identified alleles that encode 8 unique CD4 protein variants. Functional testing of these identified variant-specific differences in susceptibility to virus entry. By engineering single point mutations from resistant gorilla CD4 variants into the permissive human CD4 receptor, we demonstrate that acquired substitutions in gorilla CD4 did convey resistance to virus entry. We provide a population genetic analysis to support the theory that selection is acting in favor of more and more resistant CD4 alleles in ape species harboring SIV endemically (gorillas and chimpanzees), but not in other ape species that lack SIV infections (bonobos and orangutans). Taken together, our results show that SIV has placed intense selective pressure on ape CD4, acting to propagate SIV-resistant alleles in chimpanzee and gorilla populations.
Affiliation(s)
- Cody J. Warren
- BioFrontiers Institute, Department of Molecular, Cellular, and Developmental Biology, University of Colorado, Boulder, Colorado, USA
- Arturo Barbachano-Guerrero
- BioFrontiers Institute, Department of Molecular, Cellular, and Developmental Biology, University of Colorado, Boulder, Colorado, USA
- Vanessa L. Bauer
- BioFrontiers Institute, Department of Molecular, Cellular, and Developmental Biology, University of Colorado, Boulder, Colorado, USA
- Alex C. Stabell
- BioFrontiers Institute, Department of Molecular, Cellular, and Developmental Biology, University of Colorado, Boulder, Colorado, USA
- Obaiah Dirasantha
- BioFrontiers Institute, Department of Molecular, Cellular, and Developmental Biology, University of Colorado, Boulder, Colorado, USA
- Qing Yang
- BioFrontiers Institute, Department of Molecular, Cellular, and Developmental Biology, University of Colorado, Boulder, Colorado, USA
- Sara L. Sawyer
- BioFrontiers Institute, Department of Molecular, Cellular, and Developmental Biology, University of Colorado, Boulder, Colorado, USA
3
Cassiano GC, Martinelli A, Mottin M, Neves BJ, Andrade CH, Ferreira PE, Cravo P. Whole genome sequencing identifies novel mutations in malaria parasites resistant to artesunate (ATN) and to ATN + mefloquine combination. Front Cell Infect Microbiol 2024; 14:1353057. [PMID: 38495651 PMCID: PMC10940360 DOI: 10.3389/fcimb.2024.1353057] [Received: 12/09/2023] [Accepted: 02/14/2024] [Indexed: 03/19/2024]
Abstract
Introduction: The global evolution of resistance to Artemisinin-based Combination Therapies (ACTs) by malaria parasites will severely undermine our ability to control this devastating disease.
Methods: Here, we have used whole genome sequencing to characterize the genetic variation in the experimentally evolved Plasmodium chabaudi parasite clone AS-ATNMF1, which is resistant to artesunate + mefloquine.
Results and discussion: Five novel single nucleotide polymorphisms (SNPs) were identified, one of which was a previously undescribed E738K mutation in a 26S proteasome subunit that was selected for under artesunate pressure (in AS-ATN) and retained in AS-ATNMF1. The wild type and mutated three-dimensional (3D) structure models and molecular dynamics simulations of the P. falciparum 26S proteasome subunit Rpn2 suggested that the E738K mutation could change the toroidal proteasome/cyclosome domain organization and change the recognition of ubiquitinated proteins. The mutation in the 26S proteasome subunit may therefore contribute to altering oxidation-dependent ubiquitination of the MDR-1 and/or K13 proteins and/or other targets, resulting in changes in protein turnover. In light of the alarming increase in resistance to artemisinin derivatives and ACT partner drugs in natural parasite populations, our results shed new light on the biology of resistance and provide information on novel molecular markers of resistance that may be tested (and potentially validated) in the field.
Affiliation(s)
- Gustavo Capatti Cassiano
- Global Health and Tropical Medicine (GHTM), Associate Laboratory in Translation and Innovation Towards Global Health (LA-REAL), Instituto de Higiene e Medicina Tropical (IHMT), Universidade NOVA de Lisboa (UNL), Lisbon, Portugal
- Melina Mottin
- Laboratory for Molecular Modeling and Drug Design (LabMol), Faculty of Pharmacy, Universidade Federal de Goiás, Goiânia, Brazil
- Bruno Junior Neves
- Laboratory of Cheminformatics (LabChem), Faculty of Pharmacy, Universidade Federal de Goiás, Goiânia, Brazil
- Carolina Horta Andrade
- Laboratory for Molecular Modeling and Drug Design (LabMol), Faculty of Pharmacy, Universidade Federal de Goiás, Goiânia, Brazil
- Center for the Research and Advancement in Fragments and Molecular Targets (CRAFT), School of Pharmaceutical Sciences at Ribeirão Preto, University of São Paulo, Ribeirão Preto, Brazil
- Pedro Eduardo Ferreira
- Life and Health Sciences Research Institute (ICVS), School of Medicine, University of Minho, Braga, Portugal
- Pedro Cravo
- Global Health and Tropical Medicine (GHTM), Associate Laboratory in Translation and Innovation Towards Global Health (LA-REAL), Instituto de Higiene e Medicina Tropical (IHMT), Universidade NOVA de Lisboa (UNL), Lisbon, Portugal
4
Ramadan WS, Saber-Ayad MM, Saleh E, Abdu-Allah HH, El-Shorbagi ANA, Menon V, Tarazi H, Semreen MH, Soares NC, Hafezi S, Venkatakhalam T, Ahmed S, Kanie O, Hamoudi R, El-Awady R. Design, synthesis and mechanistic anticancer activity of new acetylated 5-aminosalicylate-thiazolinone hybrid derivatives. iScience 2024; 27:108659. [PMID: 38235331 PMCID: PMC10792193 DOI: 10.1016/j.isci.2023.108659] [Received: 09/18/2023] [Revised: 10/29/2023] [Accepted: 12/04/2023] [Indexed: 01/19/2024]
Abstract
The development of hybrid compounds has been widely considered a promising strategy to circumvent the difficulties that emerge in cancer treatment. The well-established strategy of adding acetyl groups to certain drugs has been demonstrated to enhance their therapeutic efficacy. Based on our previous work, an approach of accommodating two chemical entities into a single structure was implemented to synthesize new acetylated hybrids (HH32 and HH33) from 5-aminosalicylic acid and 4-thiazolinone derivatives. These acetylated hybrids showed potential anticancer activities and a distinct metabolomic profile with antiproliferative properties. In silico molecular docking predicted strong binding of HH32 and HH33 to cell cycle regulators, and transcriptomic analysis revealed DNA repair and the cell cycle as the main targets of HH33. These findings were validated using in vitro models. In conclusion, the pleiotropic biological effects of HH32 and HH33 on cancer cells demonstrate a new avenue for developing more potent cancer therapies.
Affiliation(s)
- Wafaa S. Ramadan
- Research Institute for Medical and Health Sciences, University of Sharjah, Sharjah 27272, United Arab Emirates
- Maha M. Saber-Ayad
- Research Institute for Medical and Health Sciences, University of Sharjah, Sharjah 27272, United Arab Emirates
- College of Medicine, University of Sharjah, Sharjah 27272, United Arab Emirates
- Ekram Saleh
- Medical Biochemistry and Molecular Biology Unit, Cancer Biology Department, National Cancer Institute, Cairo University, Cairo 12613, Egypt
- Abdel-nasser A. El-Shorbagi
- College of Pharmacy, University of Sharjah, Sharjah 27272, United Arab Emirates
- Faculty of Pharmacy, Assiut University, Assiut 16122, Egypt
- Varsha Menon
- Research Institute for Medical and Health Sciences, University of Sharjah, Sharjah 27272, United Arab Emirates
- Hamadeh Tarazi
- College of Pharmacy, University of Sharjah, Sharjah 27272, United Arab Emirates
- Mohammad H. Semreen
- College of Pharmacy, University of Sharjah, Sharjah 27272, United Arab Emirates
- Nelson C. Soares
- Research Institute for Medical and Health Sciences, University of Sharjah, Sharjah 27272, United Arab Emirates
- College of Pharmacy, University of Sharjah, Sharjah 27272, United Arab Emirates
- Shirin Hafezi
- Research Institute for Medical and Health Sciences, University of Sharjah, Sharjah 27272, United Arab Emirates
- Thenmozhi Venkatakhalam
- Research Institute for Medical and Health Sciences, University of Sharjah, Sharjah 27272, United Arab Emirates
- Samrein Ahmed
- Research Institute for Medical and Health Sciences, University of Sharjah, Sharjah 27272, United Arab Emirates
- Department of Biosciences and Chemistry, College of Health, Wellbeing and Life sciences, University of Sheffield Hallam, Sheffield S1 1WB, United Kingdom
- Osamu Kanie
- Department of Applied Biochemistry, Tokai University, 4-1-1 Kitakaname, Hiratsuka, Kanagawa 259-1292, Japan
- Rifat Hamoudi
- Research Institute for Medical and Health Sciences, University of Sharjah, Sharjah 27272, United Arab Emirates
- College of Medicine, University of Sharjah, Sharjah 27272, United Arab Emirates
- Division of Surgery and Interventional Science, Faculty of Medical Science, University College London, London, United Kingdom
- Raafat El-Awady
- Research Institute for Medical and Health Sciences, University of Sharjah, Sharjah 27272, United Arab Emirates
- College of Pharmacy, University of Sharjah, Sharjah 27272, United Arab Emirates
5
Zheng H, Marçais G, Kingsford C. Creating and Using Minimizer Sketches in Computational Genomics. J Comput Biol 2023; 30:1251-1276. [PMID: 37646787 PMCID: PMC11082048 DOI: 10.1089/cmb.2023.0094] [Indexed: 09/01/2023]
Abstract
Processing large data sets has become an essential part of computational genomics. Greatly increased availability of sequence data from multiple sources has fueled breakthroughs in genomics and related fields but has led to computational challenges processing large sequencing experiments. The minimizer sketch is a popular method for sequence sketching that underlies core steps in computational genomics such as read mapping, sequence assembling, k-mer counting, and more. In most applications, minimizer sketches are constructed using one of few classical approaches. More recently, efforts have been put into building minimizer sketches with desirable properties compared with the classical constructions. In this survey, we review the history of the minimizer sketch, the theories developed around the concept, and the plethora of applications taking advantage of such sketches. We aim to provide the readers a comprehensive picture of the research landscape involving minimizer sketches, in anticipation of better fusion of theory and application in the future.
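As a rough illustration of the idea behind a minimizer sketch, the snippet below selects, from every window of `w` consecutive k-mers, the smallest k-mer under a fixed ordering. It is only a minimal sketch under stated assumptions: the function name `minimizer_sketch`, the lexicographic ordering, and the leftmost tie-breaking rule are choices made here for illustration and are not taken from the survey; practical tools often order k-mers by a hash value instead.

```python
def minimizer_sketch(seq: str, k: int, w: int) -> list[tuple[int, str]]:
    """Return (position, k-mer) pairs picked by a simple (w, k)-minimizer scheme.

    Assumption: k-mers are compared lexicographically and ties go to the
    leftmost occurrence. Both choices are illustrative only.
    """
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    selected: list[tuple[int, str]] = []
    # Slide a window over w consecutive k-mers and keep the smallest one.
    for start in range(len(kmers) - w + 1):
        pos = min(range(start, start + w), key=lambda i: (kmers[i], i))
        # Adjacent windows frequently share a minimizer; record it once.
        if not selected or selected[-1][0] != pos:
            selected.append((pos, kmers[pos]))
    return selected


if __name__ == "__main__":
    for pos, kmer in minimizer_sketch("ACGTACGT", k=3, w=3):
        print(pos, kmer)
```

Because neighbouring windows usually agree on their minimizer, the sketch is much smaller than the full list of k-mers, while every window is still guaranteed to contain at least one selected position.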
Affiliation(s)
- Hongyu Zheng
- Computer Science Department, Princeton University, Princeton, New Jersey, USA
- Guillaume Marçais
- Computational Biology Department, Carnegie Mellon University, Pittsburgh, Pennsylvania, USA
- Carl Kingsford
- Computational Biology Department, Carnegie Mellon University, Pittsburgh, Pennsylvania, USA
6
Roadmap to the study of gene and protein phylogeny and evolution-A practical guide. PLoS One 2023; 18:e0279597. [PMID: 36827278 PMCID: PMC9955684 DOI: 10.1371/journal.pone.0279597] [Received: 04/09/2022] [Accepted: 12/12/2022] [Indexed: 02/25/2023]
Abstract
Developments in sequencing technologies and the sequencing of an ever-increasing number of genomes have revolutionised studies of biodiversity and organismal evolution. This accumulation of data has been paralleled by the creation of numerous public biological databases through which the scientific community can mine the sequences and annotations of genomes, transcriptomes, and proteomes of multiple species. However, to find the appropriate databases and bioinformatic tools for respective inquiries and aims can be challenging. Here, we present a compilation of DNA and protein databases, as well as bioinformatic tools for phylogenetic reconstruction and a wide range of studies on molecular evolution. We provide a protocol for information extraction from biological databases and simple phylogenetic reconstruction using probabilistic and distance methods, facilitating the study of biodiversity and evolution at the molecular level for the broad scientific community.
7
Firtina C, Park J, Alser M, Kim JS, Cali D, Shahroodi T, Ghiasi N, Singh G, Kanellopoulos K, Alkan C, Mutlu O. BLEND: a fast, memory-efficient and accurate mechanism to find fuzzy seed matches in genome analysis. NAR Genom Bioinform 2023; 5:lqad004. [PMID: 36685727 PMCID: PMC9853099 DOI: 10.1093/nargab/lqad004] [Received: 07/29/2022] [Revised: 12/16/2022] [Accepted: 01/10/2023] [Indexed: 01/22/2023]
Abstract
Generating the hash values of short subsequences, called seeds, enables quickly identifying similarities between genomic sequences by matching seeds with a single lookup of their hash values. However, these hash values can be used only for finding exact-matching seeds as the conventional hashing methods assign distinct hash values for different seeds, including highly similar seeds. Finding only exact-matching seeds causes either (i) increasing the use of the costly sequence alignment or (ii) limited sensitivity. We introduce BLEND, the first efficient and accurate mechanism that can identify both exact-matching and highly similar seeds with a single lookup of their hash values, called fuzzy seed matches. BLEND (i) utilizes a technique called SimHash, that can generate the same hash value for similar sets, and (ii) provides the proper mechanisms for using seeds as sets with the SimHash technique to find fuzzy seed matches efficiently. We show the benefits of BLEND when used in read overlapping and read mapping. For read overlapping, BLEND is faster by 2.4×-83.9× (on average 19.3×), has a lower memory footprint by 0.9×-14.1× (on average 3.8×), and finds higher quality overlaps leading to accurate de novo assemblies than the state-of-the-art tool, minimap2. For read mapping, BLEND is faster by 0.8×-4.1× (on average 1.7×) than minimap2. Source code is available at https://github.com/CMU-SAFARI/BLEND.
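The general SimHash technique mentioned in this abstract can be sketched as follows. This is not BLEND's implementation: the helper names (`item_hash`, `simhash`, `kmer_set`, `hamming`), the choice of BLAKE2b as the per-item hash, the 32-bit width, and the way a seed is turned into a set of k-mers are all assumptions made for illustration. Each item of a set votes +1 or -1 on every bit position, and the sign of each tally gives one bit of the final hash, so sets that share most items tend to produce identical or nearby hash values.

```python
import hashlib


def item_hash(item: str, bits: int = 32) -> int:
    """Deterministic per-item hash (BLAKE2b is an illustrative choice)."""
    digest = hashlib.blake2b(item.encode(), digest_size=bits // 8).digest()
    return int.from_bytes(digest, "big")


def simhash(items, bits: int = 32) -> int:
    """SimHash of a collection of strings; bits must be a multiple of 8."""
    counters = [0] * bits
    for item in items:
        h = item_hash(item, bits)
        for b in range(bits):
            # Each item votes +1 if its bit b is set, otherwise -1.
            counters[b] += 1 if (h >> b) & 1 else -1
    value = 0
    for b, count in enumerate(counters):
        if count > 0:  # majority of items had this bit set
            value |= 1 << b
    return value


def kmer_set(seed: str, k: int) -> set[str]:
    """Treat a seed as the set of its overlapping k-mers (hypothetical helper)."""
    return {seed[i:i + k] for i in range(len(seed) - k + 1)}


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two hash values."""
    return bin(a ^ b).count("1")


if __name__ == "__main__":
    s1 = simhash(kmer_set("ACGTACGTTGCAACGT", 4))
    s2 = simhash(kmer_set("ACGTACGTTGCAACGA", 4))  # one-base difference
    print(hamming(s1, s2))
```

A single lookup keyed on the SimHash value can then retrieve seeds whose item sets are similar enough to collapse to the same value, which is the sense in which such a hash supports fuzzy rather than exact matching.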
Affiliation(s)
- Can Firtina
- To whom correspondence should be addressed. Tel: +41 44 632 64 29;
- Jisung Park
- ETH Zurich, Zurich 8092, Switzerland; POSTECH, Pohang 37673, Republic of Korea
- Can Alkan
- Bilkent University, Ankara 06800, Turkey
- Onur Mutlu
- Correspondence may also be addressed to Onur Mutlu. Tel: +41 44 632 64 29;
8
Transcriptomic Changes Associated with ERBB2 Overexpression in Colorectal Cancer Implicate a Potential Role of the Wnt Signaling Pathway in Tumorigenesis. Cancers (Basel) 2022; 15:cancers15010130. [PMID: 36612126 PMCID: PMC9817785 DOI: 10.3390/cancers15010130] [Received: 11/23/2022] [Revised: 12/16/2022] [Accepted: 12/20/2022] [Indexed: 12/28/2022]
Abstract
Colorectal cancer (CRC) remains the third most common cause of cancer mortality worldwide. Precision medicine using OMICs guided by transcriptomic profiling has improved disease diagnosis and prognosis by identifying many CRC targets. One such target that has been actively pursued is an erbb2 receptor tyrosine kinase 2 (ERBB2) (Human Epidermal Growth Factor Receptor 2 (HER2)), which is overexpressed in around 3-5% of patients with CRC worldwide. Despite targeted therapies against HER2 showing significant improvement in disease outcomes in multiple clinical trials, to date, no HER2-based treatment has been clinically approved for CRC. In this study we performed whole transcriptome ribonucleic acid (RNA) sequencing on 11 HER2+ and 3 HER2- CRC patients with advanced stages II, III and IV of the disease. In addition, transcriptomic profiling was carried out on CRC cell lines (HCT116 and HT29) and normal colon cell lines (CCD841 and CCD33), ectopically overexpressing ERBB2. Our analysis revealed transcriptomic changes involving many genes in both CRC cell lines overexpressing ERBB2 and in HER2+ patients, compared to normal colon cell lines and HER2- patients, respectively. Gene Set Enrichment Analysis indicated a role for HER2 in regulating CRC pathogenesis, with Wnt/β-catenin signaling being mediated via a HER2-dependent regulatory pathway impacting expression of the homeobox gene NK2 homeobox 5 (NKX2-5). Results from this study thus identified putative targets that are co-expressed with HER2 in CRC warranting further investigation into their role in CRC pathogenesis.
9
Dunowska M, Perrott M, Biggs P. Identification of a novel polyomavirus from a marsupial host. Virus Evol 2022; 8:veac096. [PMID: 36381233 PMCID: PMC9662318 DOI: 10.1093/ve/veac096] [Received: 05/26/2022] [Revised: 09/09/2022] [Accepted: 10/05/2022] [Indexed: 08/26/2023]
Abstract
We report the identification and analysis of a full sequence of a novel polyomavirus from a brushtail possum (Trichosurus vulpecula) termed possum polyomavirus (PPyV). The sequence was obtained from the next-generation sequencing assembly during an investigation into the aetiological agent for a neurological disease of possums termed wobbly possum disease (WPD), but the virus was not aetiologically involved in WPD. The PPyV genome was 5,224 nt long with the organisation typical for polyomaviruses, including early (large and small T antigens) and late (Viral Protein 1 (VP1), VP2, and VP3) coding regions separated by the non-coding control region of 465 nt. PPyV clustered with betapolyomaviruses in the WUKI clade but showed less than 60 per cent identity to any of the members of this clade. We propose that PPyV is classified within a new species in the genus Betapolyomavirus. These data add to our limited knowledge of marsupial viruses and their evolution.
Affiliation(s)
- Magdalena Dunowska
- School of Veterinary Science, Massey University, Palmerston North 4410, New Zealand
- Matthew Perrott
- School of Veterinary Science, Massey University, Palmerston North 4410, New Zealand
- Patrick Biggs
- School of Veterinary Science, Massey University, Palmerston North 4410, New Zealand
- School of Natural Sciences, Massey University, Palmerston North 4410, New Zealand
10
Identification of novel differentially expressed genes in type 1 diabetes mellitus complications using transcriptomic profiling of UAE patients: a multicenter study. Sci Rep 2022; 12:16316. [PMID: 36175575 PMCID: PMC9523055 DOI: 10.1038/s41598-022-18997-w] [Received: 11/09/2021] [Accepted: 08/23/2022] [Indexed: 12/01/2022]
Abstract
Type 1 diabetes mellitus (T1DM) is a chronic metabolic disorder that mainly affects children and young adults. It is associated with debilitating, lifelong complications. Therefore, understanding the factors that lead to the onset and development of these complications is crucial. To our knowledge, this is the first study that attempts to identify the common differentially expressed genes (DEGs) in T1DM complications using whole transcriptomic profiling in United Arab Emirates (UAE) patients. The present multicenter study was conducted in different hospitals in the UAE, including University Hospital Sharjah, Dubai Hospital and Rashid Hospital. A total of fifty-eight Emirati participants aged above 18 years and with a BMI < 25 kg/m² were recruited, and forty-five of these participants had a confirmed diagnosis of T1DM. Five groups of complications associated with T1DM were identified: hyperlipidemia, neuropathy, ketoacidosis, hypothyroidism and polycystic ovary syndrome (PCOS). A comprehensive whole transcriptomic analysis using NGS was conducted. The outcomes of the study revealed the common DEGs between T1DM without complications and T1DM with different complications. Seven common candidate DEGs (SPINK9, TRDN, PVRL4, MYO3A, PDLIM1, KIAA1614 and GRP) were upregulated in T1DM complications, with a significant increase in expression of SPINK9 (fold change: 5.28, 3.79, 5.20, 3.79, 5.20) and MYO3A (fold change: 4.14, 6.11, 2.60, 4.33, 4.49) in hyperlipidemia, neuropathy, ketoacidosis, hypothyroidism and PCOS, respectively. In addition, functional pathways of ion transport, mineral absorption and cytosolic calcium concentration were involved in regulation of candidate upregulated genes related to neuropathy, ketoacidosis and PCOS, respectively. The findings of this study represent a novel reference warranting further studies to shed light on the causative genetic factors that are involved in the onset and development of T1DM complications.
11
Talaat IM, Yakout NM, Soliman AS, Venkatachalam T, Vinod A, Eldohaji L, Nair V, Hareedy A, Kandil A, Abdel-Rahman WM, Hamoudi R, Saber-Ayad M. Evaluation of Galanin Expression in Colorectal Cancer: An Immunohistochemical and Transcriptomic Study. Front Oncol 2022; 12:877147. [PMID: 35707368 PMCID: PMC9190230 DOI: 10.3389/fonc.2022.877147] [Received: 03/14/2022] [Accepted: 04/27/2022] [Indexed: 01/02/2023]
Abstract
Colorectal cancer (CRC) represents around 10% of all cancers, with an increasing incidence in the younger age group. The gut is considered a unique organ with its distinctive neuronal supply. The neuropeptide human galanin is widely distributed in the colon and expressed in many cancers, including CRC. The current study aimed to explore the role of galanin at different stages of CRC. Eighty-one CRC cases (TNM stages I-IV) were recruited, and formalin-fixed paraffin-embedded samples were analyzed for the expression of galanin and galanin receptor 1 (GALR1) by immunohistochemistry (IHC). Galanin intensity was significantly lower in stage IV (n = 6) in comparison to other stages (p = 0.037 using the Mann-Whitney U test). Whole transcriptomics analysis using NGS was performed for selected samples based on the galanin expression by IHC [early (n = 5) with high galanin expression and late (n = 6) with low galanin expression]. Five differentially regulated pathways (using Absolute GSEA) were identified as drivers for tumor progression and associated with higher galanin expression, namely, cell cycle, cell division, autophagy, transcriptional regulation of TP53, and immune system process. The top shared genes among the upregulated pathways are AURKA, BIRC5, CCNA1, CCNA2, CDC25C, CDK2, CDK6, EREG, LIG3, PIN1, TGFB1, and TPX2. The results were validated using real-time PCR carried out on four cell lines [two primary (HCT116 and HT29) and two metastatic (LoVo and SK-Co-1)]. The current study shows galanin as a potential negative biomarker. Galanin downregulation is correlated with advanced CRC staging and linked to cell cycle and division, autophagy, transcriptional regulation of TP53 and immune system response.
Affiliation(s)
- Iman M. Talaat
- Department of Clinical Sciences, College of Medicine, University of Sharjah, Sharjah, United Arab Emirates
- Sharjah Institute for Medical Research, University of Sharjah, Sharjah, United Arab Emirates
- Pathology Department, Faculty of Medicine, Alexandria University, Alexandria, Egypt
- Nada M. Yakout
- Pathology Department, Faculty of Medicine, Alexandria University, Alexandria, Egypt
- Thenmozhi Venkatachalam
- Sharjah Institute for Medical Research, University of Sharjah, Sharjah, United Arab Emirates
- Department of Physiology and Immunology, College of Medicine and Health Science, Khalifa University, Abu Dhabi, United Arab Emirates
- Arya Vinod
- Sharjah Institute for Medical Research, University of Sharjah, Sharjah, United Arab Emirates
- Leen Eldohaji
- Sharjah Institute for Medical Research, University of Sharjah, Sharjah, United Arab Emirates
- Vidhya Nair
- Sharjah Institute for Medical Research, University of Sharjah, Sharjah, United Arab Emirates
- Amal Hareedy
- Pathology Department, Faculty of Medicine, Cairo University, Cairo, Egypt
- Alaa Kandil
- Clinical Oncology and Nuclear Medicine Department, Faculty of Medicine, Alexandria University, Cairo, Egypt
- Wael M. Abdel-Rahman
- Sharjah Institute for Medical Research, University of Sharjah, Sharjah, United Arab Emirates
- Department of Medical Laboratory Sciences, College of Health Sciences, University of Sharjah, Sharjah, United Arab Emirates
- Rifat Hamoudi
- Department of Clinical Sciences, College of Medicine, University of Sharjah, Sharjah, United Arab Emirates
- Sharjah Institute for Medical Research, University of Sharjah, Sharjah, United Arab Emirates
- Division of Surgery and Interventional Science, University College London, London, United Kingdom
- Maha Saber-Ayad
- Department of Clinical Sciences, College of Medicine, University of Sharjah, Sharjah, United Arab Emirates
- Sharjah Institute for Medical Research, University of Sharjah, Sharjah, United Arab Emirates
- Pharmacology Department, Faculty of Medicine, Cairo University, Cairo, Egypt
- *Correspondence: Maha Saber-Ayad,
12
Perchat N, Dubois C, Mor-Gautier R, Duquesne S, Lechaplais C, Roche D, Fouteau S, Darii E, Perret A. Characterization of a novel β-alanine biosynthetic pathway consisting of promiscuous metabolic enzymes. J Biol Chem 2022; 298:102067. [PMID: 35623386 PMCID: PMC9213253 DOI: 10.1016/j.jbc.2022.102067] [Received: 12/16/2021] [Revised: 05/19/2022] [Accepted: 05/22/2022] [Indexed: 10/28/2022]
Abstract
Bacteria adapt to utilize the nutrients available in their environment through a sophisticated metabolic system composed of highly specialized enzymes. Although these enzymes can metabolize molecules other than those for which they evolved, their efficiency toward promiscuous substrates is considered too low to be of physiological relevance. Herein, we investigated the possibility that these promiscuous enzymes are actually efficient enough at metabolizing secondary substrates to modify the phenotype of the cell. For example, in the bacterium Acinetobacter baylyi ADP1 (ADP1), panD (coding for l-aspartate decarboxylase) encodes the only protein known to catalyze the synthesis of β-alanine, an obligate intermediate in CoA synthesis. However, we show that the ADP1 ΔpanD mutant could also form this molecule through an unknown metabolic pathway arising from promiscuous enzymes and grow as efficiently as the wildtype strain. Using metabolomic analyses, we identified 1,3-diaminopropane and 3-aminopropanal as intermediates in this novel pathway. We also conducted activity screening and enzyme kinetics to elucidate candidate enzymes involved in this pathway, including 2,4-diaminobutyrate aminotransferase (Dat) and 2,4-diaminobutyrate decarboxylase (Ddc) and validated this pathway in vivo by analyzing the phenotype of mutant bacterial strains. Finally, we experimentally demonstrate that this novel metabolic route is not restricted to ADP1. We propose that the occurrence of conserved genes in hundreds of genomes across many phyla suggests that this previously undescribed pathway is widespread in prokaryotes.
Affiliation(s)
- Nadia Perchat
  - Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, Univ Evry, Université Paris-Saclay, Evry, France
- Christelle Dubois
  - Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, Univ Evry, Université Paris-Saclay, Evry, France
- Rémi Mor-Gautier
  - Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, Univ Evry, Université Paris-Saclay, Evry, France
- Sophie Duquesne
  - Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, Univ Evry, Université Paris-Saclay, Evry, France
- Christophe Lechaplais
  - Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, Univ Evry, Université Paris-Saclay, Evry, France
- David Roche
  - Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, Univ Evry, Université Paris-Saclay, Evry, France
- Stéphanie Fouteau
  - Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, Univ Evry, Université Paris-Saclay, Evry, France
- Ekaterina Darii
  - Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, Univ Evry, Université Paris-Saclay, Evry, France
- Alain Perret
  - Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, Univ Evry, Université Paris-Saclay, Evry, France
|
13
|
Wei ZG, Fan XG, Zhang H, Zhang XD, Liu F, Qian Y, Zhang SW. kngMap: Sensitive and Fast Mapping Algorithm for Noisy Long Reads Based on the K-Mer Neighborhood Graph. Front Genet 2022; 13:890651. [PMID: 35601495 PMCID: PMC9117619 DOI: 10.3389/fgene.2022.890651] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/06/2022] [Accepted: 04/07/2022] [Indexed: 11/13/2022] Open
Abstract
With the rapid development of single-molecule sequencing (SMS) technologies such as PacBio single-molecule real-time and Oxford Nanopore sequencing, output read lengths are continuously increasing, which has great potential for cutting-edge genomic applications. Mapping these reads to a reference genome is often the most fundamental and computing-intensive step for downstream analysis. However, these long reads contain more sequencing errors and more frequently span the breakpoints of structural variants (SVs) than shorter reads, leaving many reads unaligned or only partially aligned by most state-of-the-art mappers. As a result, these methods usually focus on producing local mapping results for the query read rather than obtaining the whole end-to-end alignment. We introduce kngMap, a novel k-mer neighborhood graph-based mapper that is specifically designed to align long noisy SMS reads to a reference sequence. By benchmarking kngMap against ten other popular SMS mapping tools (e.g., BLASR, BWA-MEM, and minimap2) in exhaustive experiments on both simulated and real-life SMS datasets, we demonstrated that kngMap has higher sensitivity and can align more reads and bases to the reference genome; meanwhile, kngMap can produce consecutive alignments for the whole read and span different categories of SVs in the reads. kngMap is implemented in C++ and supports multi-threading; the source code of kngMap can be downloaded for free for academic usage at https://github.com/zhang134/kngMap.
Affiliation(s)
- Ze-Gang Wei
  - Institute of Physics and Optoelectronics Technology, Baoji University of Arts and Sciences, Baoji, China
- Xing-Guo Fan
  - Institute of Physics and Optoelectronics Technology, Baoji University of Arts and Sciences, Baoji, China
- Hao Zhang
  - Institute of Physics and Optoelectronics Technology, Baoji University of Arts and Sciences, Baoji, China
- Xiao-Dan Zhang
  - Institute of Physics and Optoelectronics Technology, Baoji University of Arts and Sciences, Baoji, China
- Fei Liu
  - Institute of Physics and Optoelectronics Technology, Baoji University of Arts and Sciences, Baoji, China
- Yu Qian
  - Institute of Physics and Optoelectronics Technology, Baoji University of Arts and Sciences, Baoji, China
  - *Correspondence: Yu Qian; Shao-Wu Zhang
- Shao-Wu Zhang
  - Key Laboratory of Information Fusion Technology of Ministry of Education, School of Automation, Northwestern Polytechnical University, Xi’an, China
  - *Correspondence: Yu Qian; Shao-Wu Zhang
|
14
|
Delmas VA, Perchat N, Monet O, Fouré M, Darii E, Roche D, Dubois I, Pateau E, Perret A, Döring V, Bouzon M. Genetic and biocatalytic basis of formate dependent growth of Escherichia coli strains evolved in continuous culture. Metab Eng 2022; 72:200-214. [PMID: 35341982 DOI: 10.1016/j.ymben.2022.03.010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/20/2021] [Revised: 02/22/2022] [Accepted: 03/14/2022] [Indexed: 10/18/2022]
Abstract
The reductive glycine pathway was described as the most energetically favorable synthetic route of aerobic formate assimilation. Here we report the successful implementation of formatotrophy in Escherichia coli by means of a stepwise adaptive evolution strategy. Medium swap and turbidostat regimes of continuous culture were applied to force the channeling of carbon flux through the synthetic pathway to pyruvate establishing growth on formate and CO2 as sole carbon sources. Labeling with 13C-formate proved the assimilation of the C1 substrate via the pathway metabolites. Genetic analysis of intermediate isolates revealed a mutational path followed throughout the adaptation process. Mutations were detected affecting the copy number (gene ftfL) or the coding sequence (genes folD and lpd) of genes which specify enzymes implicated in the three steps forming glycine from formate and CO2, the central metabolite of the synthetic pathway. The mutation R196S present in methylene-tetrahydrofolate dehydrogenase/cyclohydrolase (FolD) abolishes the inhibition of cyclohydrolase activity by the substrate formyl-tetrahydrofolate. The mutation R273H in lipoamide dehydrogenase (Lpd) alters substrate affinities as well as kinetics at physiological substrate concentrations likely favoring a reactional shift towards lipoamide reduction. In addition, genetic reconstructions proved the necessity of all three mutations for formate assimilation by the adapted cells. The largely unpredictable nature of these changes demonstrates the usefulness of the evolutionary approach enabling the selection of adaptive mutations crucial for pathway engineering of biotechnological model organisms.
Affiliation(s)
- Valérie A Delmas
  - Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, Univ Evry, Université Paris-Saclay, 91057, Evry-Courcouronnes, France
- Nadia Perchat
  - Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, Univ Evry, Université Paris-Saclay, 91057, Evry-Courcouronnes, France
- Oriane Monet
  - Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, Univ Evry, Université Paris-Saclay, 91057, Evry-Courcouronnes, France
- Marion Fouré
  - Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, Univ Evry, Université Paris-Saclay, 91057, Evry-Courcouronnes, France
- Ekatarina Darii
  - Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, Univ Evry, Université Paris-Saclay, 91057, Evry-Courcouronnes, France
- David Roche
  - Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, Univ Evry, Université Paris-Saclay, 91057, Evry-Courcouronnes, France
- Ivan Dubois
  - Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, Univ Evry, Université Paris-Saclay, 91057, Evry-Courcouronnes, France
- Emilie Pateau
  - Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, Univ Evry, Université Paris-Saclay, 91057, Evry-Courcouronnes, France
- Alain Perret
  - Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, Univ Evry, Université Paris-Saclay, 91057, Evry-Courcouronnes, France
- Volker Döring
  - Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, Univ Evry, Université Paris-Saclay, 91057, Evry-Courcouronnes, France
- Madeleine Bouzon
  - Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, Univ Evry, Université Paris-Saclay, 91057, Evry-Courcouronnes, France
|
15
|
van der Putten BCL, Huijsmans NAH, Mende DR, Schultsz C. Benchmarking the topological accuracy of bacterial phylogenomic workflows using in silico evolution. Microb Genom 2022; 8. [PMID: 35290758 PMCID: PMC9176278 DOI: 10.1099/mgen.0.000799] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022] Open
Abstract
Phylogenetic analyses are widely used in microbiological research, for example to trace the progression of bacterial outbreaks based on whole-genome sequencing data. In practice, multiple analysis steps such as de novo assembly, alignment and phylogenetic inference are combined to form phylogenetic workflows. Comprehensive benchmarking of the accuracy of complete phylogenetic workflows is lacking. To benchmark different phylogenetic workflows, we simulated bacterial evolution under a wide range of evolutionary models, varying the relative rates of substitution, insertion, deletion, gene duplication, gene loss and lateral gene transfer events. The generated datasets corresponded to a genetic diversity usually observed within bacterial species (≥95% average nucleotide identity). We replicated each simulation three times to assess replicability. In total, we benchmarked 19 distinct phylogenetic workflows using 8 different simulated datasets. We found that recently developed k-mer alignment methods such as kSNP and ska achieve accuracy similar to that of reference mapping. The high accuracy of k-mer alignment methods can be explained by the large fractions of genomes these methods can align, relative to other approaches. We also found that the choice of de novo assembly algorithm influences the accuracy of phylogenetic reconstruction, with workflows employing SPAdes or skesa outperforming those employing Velvet. Finally, we found that the results of phylogenetic benchmarking are highly variable between replicates. We conclude that for phylogenomic reconstruction, k-mer alignment methods are relevant alternatives to reference mapping at the species level, especially in the absence of suitable reference genomes. We show de novo genome assembly accuracy to be an underappreciated parameter required for accurate phylogenomic reconstruction.
Affiliation(s)
- Boas C L van der Putten
  - Department of Medical Microbiology, Amsterdam UMC, University of Amsterdam, Amsterdam, The Netherlands
  - Department of Global Health, Amsterdam Institute for Global Health and Development, Amsterdam UMC, University of Amsterdam, Amsterdam, The Netherlands
- Niek A H Huijsmans
  - Department of Medical Microbiology, Amsterdam UMC, University of Amsterdam, Amsterdam, The Netherlands
- Daniel R Mende
  - Department of Medical Microbiology, Amsterdam UMC, University of Amsterdam, Amsterdam, The Netherlands
- Constance Schultsz
  - Department of Medical Microbiology, Amsterdam UMC, University of Amsterdam, Amsterdam, The Netherlands
  - Department of Global Health, Amsterdam Institute for Global Health and Development, Amsterdam UMC, University of Amsterdam, Amsterdam, The Netherlands
|
16
|
Hussen BM, Abdullah ST, Salihi A, Sabir DK, Sidiq KR, Rasul MF, Hidayat HJ, Ghafouri-Fard S, Taheri M, Jamali E. The emerging roles of NGS in clinical oncology and personalized medicine. Pathol Res Pract 2022; 230:153760. [PMID: 35033746 DOI: 10.1016/j.prp.2022.153760] [Citation(s) in RCA: 19] [Impact Index Per Article: 9.5] [Received: 11/24/2021] [Revised: 12/29/2021] [Accepted: 01/06/2022] [Indexed: 02/07/2023]
Abstract
Next-generation sequencing (NGS) has become increasingly popular in genomics studies over the last decade, as new sequencing technology has been created and improved. Recently, NGS started to be used in clinical oncology to improve cancer therapy through diverse modalities ranging from finding novel and rare cancer mutations and discovering cancer mutation carriers to reaching specific therapeutic approaches known as personalized medicine (PM). PM has the potential to minimize medical expenses by shifting the current traditional medical approach of treating cancer and other diseases to an individualized preventive and predictive approach. Currently, NGS can speed up the early diagnosis of diseases and discover pharmacogenetic markers that help in personalizing therapies. Despite the tremendous growth in our understanding of genetics, NGS holds the added advantage of providing a more comprehensive picture of the cancer landscape and uncovering cancer development pathways. In this review, we provide a complete overview of potential NGS applications in scientific and clinical oncology, with a particular emphasis on pharmacogenomics in the direction of precision medicine treatment options.
Affiliation(s)
- Bashdar Mahmud Hussen
  - Department of Pharmacognosy, College of Pharmacy, Hawler Medical University, Kurdistan Region, Erbil, Iraq
  - Center of Research and Strategic Studies, Lebanese French University, Kurdistan Region, Erbil, Iraq
- Sara Tharwat Abdullah
  - Department of Pharmacology and Toxicology, College of Pharmacy, Hawler Medical University, Erbil, Iraq
- Abbas Salihi
  - Center of Research and Strategic Studies, Lebanese French University, Kurdistan Region, Erbil, Iraq
  - Department of Biology, College of Science, Salahaddin University, Kurdistan Region, Erbil, Iraq
- Dana Khdr Sabir
  - Department of Medical Laboratory Sciences, Charmo University, Kurdistan Region, Iraq
- Karzan R Sidiq
  - Department of Biology, College of Education, University of Sulaimani, Sulaimani 334, Kurdistan, Iraq
- Mohammed Fatih Rasul
  - Department of Medical Analysis, Faculty of Applied Science, Tishk International University, Kurdistan Region, Erbil, Iraq
- Hazha Jamal Hidayat
  - Department of Biology, College of Education, Salahaddin University, Kurdistan Region, Erbil, Iraq
- Soudeh Ghafouri-Fard
  - Department of Medical Genetics, School of Medicine, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Mohammad Taheri
  - Institute of Human Genetics, Jena University Hospital, Jena, Germany
  - Urology and Nephrology Research Center, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Elena Jamali
  - Skull Base Research Center, Loghman Hakim Hospital, Shahid Beheshti University of Medical Sciences, Tehran, Iran
|
17
|
Abu‐Hashem M, Gutub A. Efficient computation of Hash Hirschberg protein alignment utilizing hyper threading multi‐core sharing technology. CAAI Transactions on Intelligence Technology 2021. [DOI: 10.1049/cit2.12070] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Indexed: 11/20/2022] Open
Affiliation(s)
- Muhannad Abu‐Hashem
  - Department of Geomatics, Faculty of Architecture and Planning, King Abdulaziz University, Jeddah, Saudi Arabia
- Adnan Gutub
  - Department of Computer Engineering, College of Computer & Information Systems, Umm Al‐Qura University, Makkah, Saudi Arabia
|
18
|
Hammoudeh SM, Hammoudeh AM, Venkatachalam T, Rawat S, Jayakumar MN, Rahmani M, Hamoudi R. Enriched transcriptome analysis of laser capture microdissected populations of single cells to investigate intracellular heterogeneity in immunostained FFPE sections. Comput Struct Biotechnol J 2021; 19:5198-5209. [PMID: 34745451 PMCID: PMC8531757 DOI: 10.1016/j.csbj.2021.09.010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/01/2021] [Revised: 08/21/2021] [Accepted: 09/09/2021] [Indexed: 11/29/2022] Open
Abstract
To investigate intracellular heterogeneity, cell capture of particular cell populations followed by transcriptome analysis has been highly effective in freshly isolated tissues. However, this approach has been quite challenging in immunostained formalin-fixed paraffin-embedded (FFPE) sections. This study aimed to combine the standard pathology techniques, immunostaining and laser capture microdissection, with whole RNA-sequencing and bioinformatics analysis to characterize FFPE breast cancer cell populations with heterogeneous expression of progesterone receptor (PR). Immunocytochemical analysis revealed that 60% of the MCF-7 cell admixture highly expresses PR. Immunocytochemistry-based targeted RNA-seq (ICC-RNAseq) and in silico functional analysis revealed that the PR-high cell population is associated with upregulation in transcripts implicated in immunomodulatory and inflammatory pathways (e.g. NF-κB and interferon signaling). In contrast, the PR-low cell population is associated with upregulation of genes involved in metabolism and mitochondrial processes as well as EGFR and MAPK signaling. These findings were cross-validated and confirmed in FACS-sorted PR-high and PR-low MCF-7 cells and in MDA-MB-231 cells ectopically overexpressing PR. Significantly, ICC-RNAseq could be extended to analyze samples captured at specific spatio-temporal states to investigate gene expression profiles using diverse biomarkers. This would also facilitate our understanding of cell population-specific molecular events driving cancer and potentially other diseases.
Affiliation(s)
- Sarah M Hammoudeh
  - College of Medicine, University of Sharjah, Sharjah 27272, United Arab Emirates
  - Sharjah Institute for Medical Research, University of Sharjah, Sharjah 27272, United Arab Emirates
- Arabella M Hammoudeh
  - College of Medicine, University of Sharjah, Sharjah 27272, United Arab Emirates
  - General Surgery Department, Tawam Hospital, SEHA, Al-Ain 15258, United Arab Emirates
- Thenmozhi Venkatachalam
  - Sharjah Institute for Medical Research, University of Sharjah, Sharjah 27272, United Arab Emirates
- Surendra Rawat
  - College of Medicine, University of Sharjah, Sharjah 27272, United Arab Emirates
- Manju N Jayakumar
  - Sharjah Institute for Medical Research, University of Sharjah, Sharjah 27272, United Arab Emirates
- Mohamed Rahmani
  - Sharjah Institute for Medical Research, University of Sharjah, Sharjah 27272, United Arab Emirates
  - Department of Molecular Biology and Genetics, College of Medicine and Health Sciences, Khalifa University, Abu Dhabi, United Arab Emirates
- Rifat Hamoudi
  - College of Medicine, University of Sharjah, Sharjah 27272, United Arab Emirates
  - Sharjah Institute for Medical Research, University of Sharjah, Sharjah 27272, United Arab Emirates
  - Division of Surgery and Interventional Science, University College London, London, United Kingdom
|
19
|
Alser M, Rotman J, Deshpande D, Taraszka K, Shi H, Baykal PI, Yang HT, Xue V, Knyazev S, Singer BD, Balliu B, Koslicki D, Skums P, Zelikovsky A, Alkan C, Mutlu O, Mangul S. Technology dictates algorithms: recent developments in read alignment. Genome Biol 2021; 22:249. [PMID: 34446078 PMCID: PMC8390189 DOI: 10.1186/s13059-021-02443-7] [Citation(s) in RCA: 29] [Impact Index Per Article: 9.7] [Received: 10/15/2020] [Accepted: 07/28/2021] [Indexed: 01/08/2023] Open
Abstract
Aligning sequencing reads onto a reference is an essential step of the majority of genomic analysis pipelines. Computational algorithms for read alignment have evolved in accordance with technological advances, leading to today's diverse array of alignment methods. We provide a systematic survey of algorithmic foundations and methodologies across 107 alignment methods, for both short and long reads. We provide a rigorous experimental evaluation of 11 read aligners to demonstrate the effect of these underlying algorithms on speed and efficiency of read alignment. We discuss how general alignment algorithms have been tailored to the specific needs of various domains in biology.
Affiliation(s)
- Mohammed Alser
  - Computer Science Department, ETH Zürich, 8092, Zürich, Switzerland
  - Computer Engineering Department, Bilkent University, 06800 Bilkent, Ankara, Turkey
  - Information Technology and Electrical Engineering Department, ETH Zürich, Zürich, 8092, Switzerland
- Jeremy Rotman
  - Department of Computer Science, University of California Los Angeles, Los Angeles, CA, 90095, USA
- Dhrithi Deshpande
  - Department of Clinical Pharmacy, School of Pharmacy, University of Southern California, Los Angeles, CA, 90089, USA
- Kodi Taraszka
  - Department of Computer Science, University of California Los Angeles, Los Angeles, CA, 90095, USA
- Huwenbo Shi
  - Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, 02115, USA
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA
- Pelin Icer Baykal
  - Department of Computer Science, Georgia State University, Atlanta, GA, 30302, USA
- Harry Taegyun Yang
  - Department of Computer Science, University of California Los Angeles, Los Angeles, CA, 90095, USA
  - Bioinformatics Interdepartmental Ph.D. Program, University of California Los Angeles, Los Angeles, CA, 90095, USA
- Victor Xue
  - Department of Computer Science, University of California Los Angeles, Los Angeles, CA, 90095, USA
- Sergey Knyazev
  - Department of Computer Science, Georgia State University, Atlanta, GA, 30302, USA
- Benjamin D Singer
  - Division of Pulmonary and Critical Care Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL, 60611, USA
  - Department of Biochemistry & Molecular Genetics, Northwestern University Feinberg School of Medicine, Chicago, USA
  - Simpson Querrey Institute for Epigenetics, Northwestern University Feinberg School of Medicine, Chicago, IL, 60611, USA
- Brunilda Balliu
  - Department of Computational Medicine, University of California Los Angeles, Los Angeles, CA, 90095, USA
- David Koslicki
  - Computer Science and Engineering, Pennsylvania State University, University Park, PA, 16801, USA
  - Biology Department, Pennsylvania State University, University Park, PA, 16801, USA
  - The Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA, 16801, USA
- Pavel Skums
  - Department of Computer Science, Georgia State University, Atlanta, GA, 30302, USA
- Alex Zelikovsky
  - Department of Computer Science, Georgia State University, Atlanta, GA, 30302, USA
  - The Laboratory of Bioinformatics, I.M. Sechenov First Moscow State Medical University, Moscow, 119991, Russia
- Can Alkan
  - Computer Engineering Department, Bilkent University, 06800 Bilkent, Ankara, Turkey
  - Bilkent-Hacettepe Health Sciences and Technologies Program, Ankara, Turkey
- Onur Mutlu
  - Computer Science Department, ETH Zürich, 8092, Zürich, Switzerland
  - Computer Engineering Department, Bilkent University, 06800 Bilkent, Ankara, Turkey
  - Information Technology and Electrical Engineering Department, ETH Zürich, Zürich, 8092, Switzerland
- Serghei Mangul
  - Department of Clinical Pharmacy, School of Pharmacy, University of Southern California, Los Angeles, CA, 90089, USA
|
20
|
Singh R, Kusalik A, Dillon JAR. Bioinformatics tools used for whole-genome sequencing analysis of Neisseria gonorrhoeae: a literature review. Brief Funct Genomics 2021; 21:78-89. [PMID: 34170311 DOI: 10.1093/bfgp/elab028] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/15/2020] [Revised: 05/21/2021] [Accepted: 05/24/2021] [Indexed: 01/02/2023] Open
Abstract
Whole-genome sequencing (WGS) data are well established for the investigation of gonococcal transmission, antimicrobial resistance prediction, population structure determination and population dynamics. A variety of bioinformatics tools, repositories, services and platforms have been applied to manage and analyze Neisseria gonorrhoeae WGS datasets. This review provides an overview of the various bioinformatics approaches and resources used in 105 published studies (as of 30 April 2021). The challenges in the analysis of N. gonorrhoeae WGS datasets, as well as future bioinformatics requirements, are also discussed.
Affiliation(s)
- Reema Singh
  - Department of Biochemistry, Microbiology and Immunology
- Anthony Kusalik
  - Department of Computer Science at the University of Saskatchewan
- Jo-Anne R Dillon
  - Department of Biochemistry Microbiology and Immunology, College of Medicine, c/o Vaccine and Infectious Disease Organization, University of Saskatchewan, 120 Veterinary Road, Saskatoon, Saskatchewan S7N5E3, Canada
|
21
|
Abstract
For DNA sequence analysis, we face challenging tasks such as the identification of structural variants, sequencing of repetitive regions, and phasing of alleles. These tasks suffer from the short length of sequencing reads, where each read may cover fewer than two single nucleotide polymorphisms (SNPs) or fewer than two occurrences of a repeated region. It is believed that long reads can help to solve these tasks. In this study, we have designed new algorithms for mapping long reads to reference genomes. We have also designed efficient and effective heuristic algorithms for local alignments of long reads against the corresponding segments of the reference genome. To design the new mapping algorithm, we formulate the problem as the longest common subsequence with distance constraints. The local alignment heuristic algorithm is based on the idea of recursive alignment of k-mers, where the size of k differs in each round. We have implemented all the algorithms in C++ and produced a software package named mapAlign. Experiments on real data sets showed that the newly proposed approach can generate better alignments in terms of both identity and alignment scores for both Nanopore and single molecule real time sequencing (SMRT) data sets. For human individuals of both Nanopore and SMRT data sets, the new method can successfully match/align 91.53% and 85.36% of letters from reads to identical letters on reference genomes, respectively. In comparison, the best known method can only align 88.44% and 79.08% of the letters of reads for Nanopore and SMRT data sets, respectively. Our method is also faster than the best known method.
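The "longest common subsequence with distance constraints" formulation mentioned in this abstract can be illustrated with a small sketch. It is not the mapAlign implementation (which is in C++ and not described in detail here); the function names `kmer_anchors` and `longest_constrained_chain`, the `max_drift` parameter, and the specific form of the distance constraint are all assumptions made for illustration. Shared k-mers between a read and a reference are treated as anchors, and the longest chain of anchors is kept in which both positions increase and the read gap and reference gap between consecutive anchors differ by at most `max_drift`.

```python
from collections import defaultdict


def kmer_anchors(read, ref, k):
    """Return sorted (read_pos, ref_pos) pairs where the same k-mer occurs in both."""
    index = defaultdict(list)
    for j in range(len(ref) - k + 1):
        index[ref[j:j + k]].append(j)
    anchors = []
    for i in range(len(read) - k + 1):
        for j in index.get(read[i:i + k], ()):
            anchors.append((i, j))
    return sorted(anchors)


def longest_constrained_chain(anchors, max_drift):
    """Longest chain of anchors with increasing read and reference positions.

    Hypothetical distance constraint: for consecutive anchors, the gap on the
    reference and the gap on the read may differ by at most max_drift.
    Simple O(n^2) dynamic programme over anchors sorted by read position.
    """
    n = len(anchors)
    if n == 0:
        return []
    best = [1] * n    # best[b]: length of the longest chain ending at anchor b
    prev = [-1] * n   # back-pointer used to recover the chain
    for b in range(n):
        qb, rb = anchors[b]
        for a in range(b):
            qa, ra = anchors[a]
            compatible = (
                qa < qb
                and ra < rb
                and abs((rb - ra) - (qb - qa)) <= max_drift
            )
            if compatible and best[a] + 1 > best[b]:
                best[b] = best[a] + 1
                prev[b] = a
    end = max(range(n), key=best.__getitem__)
    chain = []
    while end != -1:
        chain.append(anchors[end])
        end = prev[end]
    return chain[::-1]
```

In this sketch a spurious anchor such as a repeat hit far away on the reference is dropped because it violates the drift bound, while collinear anchors are kept. A production mapper would typically replace the quadratic loop with a faster chaining structure.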
Affiliation(s)
- Wen Yang
  - Department of Computer Science, City University of Hong Kong, Kowloon, Hong Kong
- Lusheng Wang
  - Department of Computer Science, City University of Hong Kong, Kowloon, Hong Kong
  - City University of Hong Kong Shenzhen Research Institution, Shenzhen, China
|
22
|
Development and Evaluation of a Point-of-Care Test in a Low-Resource Setting with High Rates of Chlamydia trachomatis Urogenital Infections in Fiji. J Clin Microbiol 2021; 59:e0018221. [PMID: 33910964 PMCID: PMC8218753 DOI: 10.1128/jcm.00182-21] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 12/11/2022] Open
Abstract
Rapid and precise detection of Chlamydia trachomatis, the leading global cause of sexually transmitted infections (STI), at the point of care (POC) is required for treatment decisions to prevent transmission and sequelae, including pelvic inflammatory disease, ectopic pregnancy, tubal factor infertility, and preterm birth. We developed a rapid POC test (POCT), termed LH-POCT, which uses loop-mediated amplification (LAMP) of nucleic acids. We performed a head-to-head comparison with the Cepheid Xpert CT/NG assay using clinician-collected, deidentified paired vaginal samples from a parent study that consecutively enrolled symptomatic and asymptomatic females over 18 years of age from the Ministry of Health and Medical Services Health Centers in Fiji. Samples were processed by the Xpert CT/NG assay and LH-POCT, blinded to the comparator. Discrepant samples were resolved by quantitative PCR. Deidentified clinical data and tests for Trichomonas vaginalis, Candida, and bacterial vaginosis (BV) were provided. There were a total of 353 samples from 327 females. C. trachomatis positivity was 16.7% (59/353), while the prevalence was 16.82% (55/327) after discrepant resolution. Seven discrepant samples resolved to four false negatives, two false positives, and one true positive for the LH-POCT. The sensitivity of the LH-POCT was 93.65% (95% confidence interval [CI], 84.53% to 98.24%), and specificity was 99.31% (95% CI, 97.53% to 99.92%). Discrepant samples clustered among women with vaginal discharge and/or BV. The prototype LH-POCT workflow has excellent performance, meeting many World Health Organization ASSURED criteria for POC tests, including a sample-to-result time of 35 min. Our LH-POCT holds promise for improving clinical practice to prevent and control C. trachomatis STIs in diverse health care settings globally.
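As a worked check of the sensitivity and specificity figures in this abstract, the sketch below recomputes them from a 2x2 table. The cell counts (59 true positives, 4 false negatives, 288 true negatives, 2 false positives) are not stated in the abstract; they are back-calculated here as an assumption, being the counts that are consistent with the reported four false negatives, two false positives, 353 samples, and the reported percentages. The abstract also does not say which confidence interval method was used; the exact (Clopper-Pearson) interval below is an assumed choice, computed by bisection with the standard library only.

```python
from math import comb


def binomial_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _solve_decreasing(f, iters=60):
    """Root of a function that decreases from positive to negative on [0, 1]."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided (1 - alpha) confidence interval for a proportion k/n."""
    if k == 0:
        lower = 0.0
    else:
        # Lower bound: p at which P(X >= k) equals alpha / 2.
        lower = _solve_decreasing(
            lambda p: alpha / 2 - (1 - binomial_cdf(k - 1, n, p))
        )
    if k == n:
        upper = 1.0
    else:
        # Upper bound: p at which P(X <= k) equals alpha / 2.
        upper = _solve_decreasing(lambda p: binomial_cdf(k, n, p) - alpha / 2)
    return lower, upper


# Assumed 2x2 counts, back-calculated from the abstract (not reported directly).
tp, fn, tn, fp = 59, 4, 288, 2

sensitivity = tp / (tp + fn)   # true positives / all reference positives
specificity = tn / (tn + fp)   # true negatives / all reference negatives
sens_ci = clopper_pearson(tp, tp + fn)
spec_ci = clopper_pearson(tn, tn + fp)

print(f"sensitivity = {100 * sensitivity:.2f}%")
print(f"specificity = {100 * specificity:.2f}%")
print(f"sensitivity 95% CI = {100 * sens_ci[0]:.2f}% to {100 * sens_ci[1]:.2f}%")
print(f"specificity 95% CI = {100 * spec_ci[0]:.2f}% to {100 * spec_ci[1]:.2f}%")
```

Under these assumed counts the point estimates round to the 93.65% and 99.31% given in the abstract; the intervals can be compared against the reported ranges to judge whether an exact method was likely used.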
|
23
|
Predicting the Most Deleterious Missense Nonsynonymous Single-Nucleotide Polymorphisms of Hennekam Syndrome-Causing CCBE1 Gene, In Silico Analysis. ScientificWorldJournal 2021; 2021:6642626. [PMID: 34234628 PMCID: PMC8211529 DOI: 10.1155/2021/6642626] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 10/22/2020] [Accepted: 05/27/2021] [Indexed: 01/02/2023] Open
Abstract
Hennekam lymphangiectasia-lymphedema syndrome (HS) has been linked to single-nucleotide polymorphisms in the CCBE1 (collagen and calcium-binding EGF domains 1) gene. Several bioinformatics methods were used to find the most dangerous nonsynonymous single-nucleotide polymorphisms (nsSNPs) that could affect CCBE1 structure and function. Using state-of-the-art in silico tools, this study examined the most pathogenic nsSNPs that disrupt the CCBE1 protein and extracellular matrix remodeling and migration. Our results indicate that seven nsSNPs, rs115982879, rs149792489, rs374941368, rs121908254, rs149531418, rs121908251, and rs372499913, are deleterious in the CCBE1 gene; four of these (G330E, C102S, C174R, and G107D) are highly deleterious, and two of them (G330E and G107D) have not previously been reported in the context of Hennekam syndrome. Twelve missense SNPs, rs199902030, rs267605221, rs37517418, rs80008675, rs116596858, rs116675104, rs121908252, rs147974432, rs147681552, rs192224843, rs139059968, and rs148498685, are found to revert into stop codons. Structural homology-based methods and sequence homology-based tools revealed that 8.8% of the nsSNPs are pathogenic. SIFT, PolyPhen2, M-CAP, CADD, FATHMM-MKL, DANN, PANTHER, Mutation Taster, LRT, and SNAP2 had a significant score for identifying deleterious nsSNPs. The importance of rs374941368 and rs200149541 in the prediction of post-translational changes was highlighted because they affect a possible phosphorylation site. Gene-gene interactions revealed CCBE1's association with other genes, showing its role in a number of pathways and coexpressions. The top 16 deleterious nsSNPs found in this research should be investigated further when studying diseases caused by the CCBE1 gene, specifically HS. The FT web server predicted amino acid residues involved in the ligand-binding site of the CCBE1 protein, and two of the substitutions (R167W and T153N) were found to be involved. These highly deleterious nsSNPs can be used as marker pathogenic variants in the mutational diagnosis of HS, and this research also offers potential insights that will aid in the development of precision medicines. CCBE1 proteins from Hennekam syndrome patients should be tested in animal models for this purpose.
|
24
|
Bennedbæk M, Zhukova A, Tang MHE, Bennet J, Munderi P, Ruxrungtham K, Gisslen M, Worobey M, Lundgren JD, Marvig RL. Phylogenetic analysis of HIV-1 shows frequent cross-country transmission and local population expansions. Virus Evol 2021; 7:veab055. [PMID: 34532059 PMCID: PMC8438898 DOI: 10.1093/ve/veab055] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/10/2020] [Revised: 05/26/2021] [Accepted: 06/09/2021] [Indexed: 12/03/2022] Open
Abstract
Understanding of pandemics depends on the characterization of pathogen collections from well-defined and demographically diverse cohorts. Since its emergence in Congo almost a century ago, Human Immunodeficiency Virus Type 1 (HIV-1) has geographically spread and genetically diversified into distinct viral subtypes. Phylogenetic analysis can be used to reconstruct the ancestry of the virus to better understand the origin and distribution of subtypes. We sequenced two 3.6-kb amplicons of HIV-1 genomes from 3,197 participants in a clinical trial with consistent and uniform sampling at sites across 35 countries and analyzed our data with another 2,632 genomes that comprehensively reflect the HIV-1 genetic diversity. We used maximum likelihood phylogenetic analysis coupled with geographical information to infer the state of ancestors. The majority of our sequenced genomes (n = 2,501) were either pure subtypes (A-D, F, and G) or CRF01_AE. The diversity and distribution of subtypes across geographical regions differed; USA showed the most homogenous subtype population, whereas African samples were most diverse. We delineated transmission of the four most prevalent subtypes in our dataset (A, B, C, and CRF01_AE), and our results suggest both continuous and frequent transmission of HIV-1 over country borders, as well as single transmission events being the seed of endemic population expansions. Overall, we show that coupling of genetic and geographical information of HIV-1 can be used to understand the origin and spread of pandemic pathogens.
Affiliation(s)
| | - Anna Zhukova
- Unité Bioinformatique Evolutive, Hub Bioinformatique et Biostatistique, USR3756 (C3BI//DBC), Institut Pasteur and CNRS, 25-28 Rue du Dr Roux, 75015 Paris, France
| | | | | | - Paula Munderi
- MRC Uganda Research Unit on AIDS, UVRI P.O.Box 49, Plot 51-59 Nakiwogo Road, Entebbe-Uganda
| | - Kiat Ruxrungtham
- HIV-NAT, Thai Red Cross AIDS Research Center, and School of Global Health, Faculty Medicine, Chulalongkorn University, Chamchuri 5 Bld. 6th Fl., Phayathai Rd., Wangmai, Pathumwan Bangkok 10330, Thailand
| | - Magnus Gisslen
- Department of Infectious Diseases, Institute of Biomedicine, Sahlgrenska Academy, University of Gothenburg, Universitetsplatsen 1, 405 30 Gothenburg, Sweden,Department of Infectious Diseases, Region Västra Götaland, Sahlgrenska University Hospital, Universitetsplatsen 1, 405 30 Gothenburg, Sweden
| | - Michael Worobey
- Department of Ecology and Evolutionary Biology, University of Arizona, Biological Sciences West, Rm. 324 Tucson, AZ 85721, USA
| | | | | |
|
25
|
Mapping genetic variability in mature miRNAs and miRNA binding sites in prostate cancer. J Hum Genet 2021; 66:1127-1137. [PMID: 34099864 DOI: 10.1038/s10038-021-00934-w] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 12/09/2020] [Revised: 04/16/2021] [Accepted: 04/20/2021] [Indexed: 01/23/2023]
Abstract
MicroRNAs (miRNAs) regulate diverse cancer hallmarks through sequence-specific regulation of gene expression, so genetic variability in their seed sequences or target sites could be responsible for cancer initiation or progression. While several efforts have been made to predict the locations of single nucleotide variants (SNVs) at miRNA target sites and associate them with cancer risk and susceptibility, there have been few direct assessments of SNVs in both mature miRNAs and their target sites to assess their impact on miRNA function in cancers. Using genome-wide target capture of miRNAs and miRNA-binding sites followed by deep sequencing in prostate cancer cell lines, here we identified prostate cancer-specific SNVs in mature miRNAs and their target binding sites. SNV rs9860655 in the mature sequence of miR-570 was not present in benign prostate hyperplasia (BPH) tissue or cell lines but was detectable in clinical prostate cancer tissue samples and adjacent normal tissue. SLC45A3 (prostein), a putative oncogene target of miR-1178, was highly upregulated in PC3 cells harboring an miR-1178 seed sequence SNV. Finally, systematic assessment of losses and gains of miRNA targets through 3'UTR SNVs revealed SNV-associated changes in target oncogene and tumor suppressor gene expression that might be associated with prostate carcinogenesis. Further work is required to systematically assess the functional effects of miRNA SNVs.
|
26
|
Berger B, Waterman MS, Yu YW. Levenshtein Distance, Sequence Comparison and Biological Database Search. IEEE TRANSACTIONS ON INFORMATION THEORY 2021; 67:3287-3294. [PMID: 34257466 PMCID: PMC8274556 DOI: 10.1109/tit.2020.2996543] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Indexed: 06/12/2023]
Abstract
Levenshtein edit distance has played a central role, both past and present, in sequence alignment in particular and biological database similarity search in general. We start our review with a history of dynamic programming algorithms for computing Levenshtein distance and sequence alignments. We then describe how those algorithms led to heuristics employed in the most widely used software in bioinformatics, BLAST, a program to search DNA and protein databases for evolutionarily relevant similarities. More recently, the advent of modern genomic sequencing and the volume of data it generates has resulted in a return to the problem of local alignment. We conclude with how the mathematical formulation of Levenshtein distance as a metric made possible additional optimizations to similarity search in biological contexts. These modern optimizations are built around the low metric entropy and fractional dimensionality of biological databases, enabling orders of magnitude acceleration of biological similarity search.
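For readers unfamiliar with the dynamic programming approach mentioned in this abstract, the following is a minimal sketch of the classic two-row recurrence for Levenshtein distance with unit costs for insertion, deletion, and substitution. It is a textbook formulation written for illustration, not code taken from the review, and the function name is hypothetical.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b with unit-cost insert/delete/substitute."""
    # prev[j] holds the distance between the first i-1 characters of a
    # and the first j characters of b; start with the empty prefix of a.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, start=1):
            substitution_cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,                      # delete ca from a
                curr[j - 1] + 1,                  # insert cb into a
                prev[j - 1] + substitution_cost,  # match or substitute
            ))
        prev = curr
    return prev[-1]

print(levenshtein("kitten", "sitting"))  # 3
```

Keeping only two rows gives O(len(a) × len(b)) time and O(len(b)) memory; recovering the alignment itself would require retaining the full table or a traceback structure.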
Affiliation(s)
- Bonnie Berger
- Department of Mathematics and Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, MA 02139 USA, and also with the Department of Computer Science and AI Lab, Massachusetts Institute of Technology, Cambridge, MA 02139 USA
| | - Michael S Waterman
- Quantitative and Computational Biology Section, Department of Biological Sciences, University of Southern California, Los Angeles, CA 90089 USA
| | - Yun William Yu
- Department of Mathematics, University of Toronto, Toronto, ON M5S 2E4, Canada, and also with the Department of Computer and Mathematical Sciences, University of Toronto at Scarborough, Toronto, ON M1C 1A4, Canada
| |
|
27
|
Pacharne S, Dovey OM, Cooper JL, Gu M, Friedrich MJ, Rajan SS, Barenboim M, Collord G, Vijayabaskar MS, Ponstingl H, De Braekeleer E, Bautista R, Mazan M, Rad R, Tzelepis K, Wright P, Gozdecka M, Vassiliou GS. SETBP1 overexpression acts in the place of class-defining mutations to drive FLT3-ITD-mutant AML. Blood Adv 2021; 5:2412-2425. [PMID: 33956058 PMCID: PMC8114559 DOI: 10.1182/bloodadvances.2020003443] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 09/18/2020] [Accepted: 01/25/2021] [Indexed: 12/23/2022] Open
Abstract
Advances in cancer genomics have revealed genomic classes of acute myeloid leukemia (AML) characterized by class-defining mutations, such as chimeric fusion genes or mutations in genes such as NPM1, MLL, and CEBPA. These class-defining mutations frequently synergize with internal tandem duplications in FLT3 (FLT3-ITDs) to drive leukemogenesis. However, ∼20% of FLT3-ITD-positive AMLs bear no class-defining mutations, and mechanisms of leukemic transformation in these cases are unknown. To identify pathways that drive FLT3-ITD mutant AML in the absence of class-defining mutations, we performed an insertional mutagenesis (IM) screen in Flt3-ITD mice, using Sleeping Beauty transposons. All mice developed acute leukemia (predominantly AML) after a median of 73 days. Analysis of transposon insertions in 38 samples from Flt3-ITD/IM leukemic mice identified recurrent integrations at 22 loci, including Setbp1 (20/38), Ets1 (11/38), Ash1l (8/38), Notch1 (8/38), Erg (7/38), and Runx1 (5/38). Insertions at Setbp1 led exclusively to AML and activated a transcriptional program similar, but not identical, to those of NPM1-mutant and MLL-rearranged AMLs. Guide RNA targeting of Setbp1 was highly detrimental to Flt3ITD/+/Setbp1IM+, but not to Flt3ITD/+/Npm1cA/+, AMLs. Also, analysis of RNA-sequencing data from hundreds of human AMLs revealed that SETBP1 expression is significantly higher in FLT3-ITD AMLs lacking class-defining mutations. These findings suggest that SETBP1 overexpression collaborates with FLT3-ITD to drive a subtype of human AML. To identify genetic vulnerabilities of these AMLs, we performed genome-wide CRISPR-Cas9 screening in Flt3ITD/+/Setbp1IM+ AMLs and identified potential therapeutic targets, including Kdm1a, Brd3, Ezh2, and Hmgcr. Our study gives new insights into epigenetic pathways that can drive AMLs lacking class-defining mutations and proposes therapeutic approaches against such cases.
Affiliation(s)
- Suruchi Pacharne
- Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge, United Kingdom
- Wellcome-Medical Research Center (MRC) Cambridge Stem Cell Institute, University of Cambridge, Cambridge, United Kingdom
| | - Oliver M Dovey
- Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge, United Kingdom
| | - Jonathan L Cooper
- Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge, United Kingdom
| | - Muxin Gu
- Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge, United Kingdom
- Wellcome-Medical Research Center (MRC) Cambridge Stem Cell Institute, University of Cambridge, Cambridge, United Kingdom
| | - Mathias J Friedrich
- Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge, United Kingdom
- Department of Medicine II, Klinikum Rechts der Isar, Technische Universität München, Munich, Germany
| | - Sandeep S Rajan
- Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge, United Kingdom
- United Kingdom Dementia Research Institute, University of Cambridge, Cambridge, United Kingdom
| | - Maxim Barenboim
- Department of Pediatrics and Children's Cancer Research Center, Klinikum Rechts der Isar, Technical University of Munich, School of Medicine, Munich, Germany
| | - Grace Collord
- Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge, United Kingdom
- Wellcome-Medical Research Center (MRC) Cambridge Stem Cell Institute, University of Cambridge, Cambridge, United Kingdom
| | - M S Vijayabaskar
- Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge, United Kingdom
- Wellcome-Medical Research Center (MRC) Cambridge Stem Cell Institute, University of Cambridge, Cambridge, United Kingdom
| | - Hannes Ponstingl
- Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge, United Kingdom
| | - Etienne De Braekeleer
- Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge, United Kingdom
- Wellcome-Medical Research Center (MRC) Cambridge Stem Cell Institute, University of Cambridge, Cambridge, United Kingdom
| | - Ruben Bautista
- Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge, United Kingdom
| | - Milena Mazan
- Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge, United Kingdom
- Research and Development Department, Selvita S.A., Krakow, Poland
| | - Roland Rad
- Department of Medicine II, Klinikum Rechts der Isar, Technische Universität München, Munich, Germany
- German Cancer Consortium (DKTK), German Cancer Research Center (DKFZ), Heidelberg, Germany; and
| | - Konstantinos Tzelepis
- Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge, United Kingdom
- Gurdon Institute
- Department of Pathology, and
| | | | - Malgorzata Gozdecka
- Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge, United Kingdom
- Wellcome-Medical Research Center (MRC) Cambridge Stem Cell Institute, University of Cambridge, Cambridge, United Kingdom
| | - George S Vassiliou
- Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge, United Kingdom
- Wellcome-Medical Research Center (MRC) Cambridge Stem Cell Institute, University of Cambridge, Cambridge, United Kingdom
- Department of Haematology, Cambridge University Hospitals National Health Service (NHS) Trust, Cambridge, United Kingdom
| |
|
28
|
Savari H, Shafiey H, Savadi A, Saadati N, Naghibzadeh M. Statistics and Patterns of Occurrence of Simple Tandem Repeats in SARS-CoV-1 and SARS-CoV-2 Genomic Data. Data Brief 2021; 36:107057. [PMID: 33898662 PMCID: PMC8057928 DOI: 10.1016/j.dib.2021.107057] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/14/2021] [Revised: 03/15/2021] [Accepted: 04/09/2021] [Indexed: 11/25/2022] Open
Abstract
The data presented in this article are related to the research article entitled "Developing an ultra-efficient microsatellite discoverer to find structural differences between SARS-CoV-1 and Covid-19" [Naghibzadeh et al. 2020]. Simple tandem repeats (microsatellites, STRs) are extracted and investigated across all viral families from four main viral realms. An ultra-efficient and reliable software tool, recently developed by the authors and published in the above-mentioned article, is used for extracting STRs. The analysis is done for k-mer tandem repeats where k varies from one to seven. In particular, the frequency of trimer STRs is shown to be low in RNA viruses compared with DNA viruses. Special attention is paid to seven zoonotic viruses from the family Coronaviridae, which caused several severe human crises during the last two decades, including MERS, SARS 2003 and Covid-19.
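To make the notion of k-mer tandem repeats concrete, here is a naive scan for simple tandem repeats with motif lengths from one to seven. This is an illustrative sketch only and is not the authors' ultra-efficient tool: the function name, the `min_copies` threshold of three, and the rule of skipping motifs that are themselves repeats of a shorter unit are all assumptions made for the example.

```python
def find_strs(seq: str, max_k: int = 7, min_copies: int = 3):
    """Return (start, motif, copies) for perfect tandem repeats in seq.

    Naive scan; min_copies and the primitive-motif rule are assumptions.
    """
    results = []
    n = len(seq)
    for k in range(1, max_k + 1):
        i = 0
        while i + k <= n:
            motif = seq[i:i + k]
            # Skip motifs built from a shorter repeating unit (e.g. "TT"),
            # so a homopolymer run is reported once, at k = 1.
            if motif in (motif + motif)[1:-1]:
                i += 1
                continue
            copies = 1
            while seq[i + copies * k:i + (copies + 1) * k] == motif:
                copies += 1
            if copies >= min_copies:
                results.append((i, motif, copies))
                i += copies * k  # jump past the whole repeat
            else:
                i += 1
    return results

# Toy sequence: a (CA)x4 dimer repeat followed by a (T)x5 homopolymer run.
print(find_strs("GGCACACACATTTTTG"))
```

Results are grouped by motif length, so the monomer run is listed before the dimer repeat. A production tool would also need to handle ambiguous bases and imperfect repeats, which this sketch ignores.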
Affiliation(s)
- Hossein Savari
- Knowledge Engineering Research Group, Computer Engineering Dept., Ferdowsi University of Mashhad, Mashhad, Iran
| | - Hassan Shafiey
- High Performance Computing Lab., Computer Engineering Dept., Ferdowsi University of Mashhad, Mashhad, Iran
| | - Abdorreza Savadi
- High Performance Computing Lab., Computer Engineering Dept., Ferdowsi University of Mashhad, Mashhad, Iran
| | - Nayyereh Saadati
- Ghaem Hospital, Department of Internal Medicine, Mashhad University of Medical Sciences, Mashhad, Iran
| | - Mahmoud Naghibzadeh
- Knowledge Engineering Research Group, Computer Engineering Dept., Ferdowsi University of Mashhad, Mashhad, Iran.,High Performance Computing Lab., Computer Engineering Dept., Ferdowsi University of Mashhad, Mashhad, Iran
| |
|
29
|
A domestic cat whole exome sequencing resource for trait discovery. Sci Rep 2021; 11:7159. [PMID: 33785770 PMCID: PMC8009874 DOI: 10.1038/s41598-021-86200-7] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 09/25/2020] [Accepted: 02/17/2021] [Indexed: 12/13/2022] Open
Abstract
Over 94 million domestic cats are susceptible to cancers and other common and rare diseases. Whole exome sequencing (WES) is a proven strategy to study these disease-causing variants. Presented is a 35.7 Mb exome capture design based on the annotated Felis_catus_9.0 genome assembly, covering 201,683 regions of the cat genome. Whole exome sequencing was conducted on 41 cats with known and unknown genetic diseases and traits, of which ten cats had matching whole genome sequence (WGS) data available, used to validate WES performance. At 80× mean exome depth of coverage, 96.4% of on-target base coverage had a sequencing depth > 20-fold, while over 98% of single nucleotide variants (SNVs) identified by WGS were also identified by WES. Platform-specific SNVs were restricted to sex chromosomes and a small number of olfactory receptor genes. Within the 41 cats, we identified 31 previously known causal variants and discovered new gene candidate variants, including novel missense variants for polycystic kidney disease and atrichia in the Peterbald cat. These results show the utility of WES to identify novel gene candidate alleles for diseases and traits for the first time in a feline model.
|
30
|
Silva DMZDA, Ruiz-Ruano FJ, Utsunomia R, Martín-Peciña M, Castro JP, Freire PP, Carvalho RF, Hashimoto DT, Suh A, Oliveira C, Porto-Foresti F, Artoni RF, Foresti F, Camacho JPM. Long-term persistence of supernumerary B chromosomes in multiple species of Astyanax fish. BMC Biol 2021; 19:52. [PMID: 33740955 PMCID: PMC7976721 DOI: 10.1186/s12915-021-00991-9] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 09/07/2020] [Accepted: 02/24/2021] [Indexed: 11/25/2022] Open
Abstract
BACKGROUND Eukaryote genomes frequently harbor supernumerary B chromosomes in addition to the "standard" A chromosome set. B chromosomes are thought to arise as byproducts of genome rearrangements and have mostly been considered intraspecific oddities. However, their evolutionary transcendence beyond species level has remained untested. RESULTS Here we reveal that the large metacentric B chromosomes reported in several fish species of the genus Astyanax arose in a common ancestor at least 4 million years ago. We generated transcriptomes of A. scabripinnis and A. paranae 0B and 1B individuals and used these assemblies as a reference for mapping all gDNA and RNA libraries to quantify coverage differences between B-lacking and B-carrying genomes. We show that the B chromosomes of A. scabripinnis and A. paranae share 19 protein-coding genes, of which 14 and 11 were also present in the B chromosomes of A. bockmanni and A. fasciatus, respectively. Our search for B-specific single-nucleotide polymorphisms (SNPs) identified the presence of B-derived transcripts in B-carrying ovaries, 80% of which belonged to nobox, a gene involved in oogenesis regulation. Importantly, the B chromosome nobox paralog is expressed > 30× more than the A chromosome paralog. This indicates that the normal regulation of this gene is altered in B-carrying females, which could potentially facilitate B inheritance at higher rates than Mendelian law prediction. CONCLUSIONS Taken together, our results demonstrate the long-term survival of B chromosomes despite their lack of regular pairing and segregation during meiosis and that they can endure episodes of population divergence leading to species formation.
Affiliation(s)
- Duílio Mazzoni Zerbinato de Andrade Silva
- Departamento de Biologia Estrutural e Funcional, Instituto de Biociências de Botucatu, Universidade Estadual Paulista, UNESP, Distrito de Rubião Junior, Botucatu, SP, 18618-970, Brazil
| | - Francisco J Ruiz-Ruano
- Department of Organismal Biology - Systematic Biology, Evolutionary Biology Centre, Uppsala University, SE-752 36, Uppsala, Sweden.
- Departamento de Genética, Universidad de Granada, 18071, Granada, Spain.
- School of Biological Sciences, University of East Anglia, Norwich Research Park, Norwich, NR4 7TU, UK.
| | - Ricardo Utsunomia
- Departamento de Genética, Instituto de Ciências Biológicas e da Saúde, ICBS, Universidade Federal Rural do Rio de Janeiro, Seropédica, RJ, 23897-000, Brazil
- Departamento de Ciências Biológicas, Faculdade de Ciências, Universidade Estadual Paulista, UNESP, Campus de Bauru, Bauru, SP, 17033-360, Brazil
| | | | - Jonathan Pena Castro
- Departamento de Genética e Evolução, Universidade Federal de São Carlos, UFSCAR, São Carlos, SP, 13565-905, Brazil
- Departamento de Biologia Estrutural, Molecular e Genética, Universidade Estadual de Ponta Grossa, UEPG, Ponta Grossa, PR, 84030-900, Brazil
| | - Paula Paccielli Freire
- Departamento de Biologia Estrutural e Funcional, Instituto de Biociências de Botucatu, Universidade Estadual Paulista, UNESP, Distrito de Rubião Junior, Botucatu, SP, 18618-970, Brazil
- Departamento de Imunologia, Instituto de Ciências Biomédicas, Universidade de São Paulo, USP, São Paulo, SP, 05508-900, Brazil
| | - Robson Francisco Carvalho
- Departamento de Biologia Estrutural e Funcional, Instituto de Biociências de Botucatu, Universidade Estadual Paulista, UNESP, Distrito de Rubião Junior, Botucatu, SP, 18618-970, Brazil
| | - Diogo T Hashimoto
- Centro de Aquicultura, Universidade Estadual Paulista, UNESP, Campus Jaboticabal, Jaboticabal, SP, 14884-900, Brazil
| | - Alexander Suh
- Department of Organismal Biology - Systematic Biology, Evolutionary Biology Centre, Uppsala University, SE-752 36, Uppsala, Sweden
- School of Biological Sciences, University of East Anglia, Norwich Research Park, Norwich, NR4 7TU, UK
| | - Claudio Oliveira
- Departamento de Biologia Estrutural e Funcional, Instituto de Biociências de Botucatu, Universidade Estadual Paulista, UNESP, Distrito de Rubião Junior, Botucatu, SP, 18618-970, Brazil
| | - Fábio Porto-Foresti
- Departamento de Ciências Biológicas, Faculdade de Ciências, Universidade Estadual Paulista, UNESP, Campus de Bauru, Bauru, SP, 17033-360, Brazil
| | - Roberto Ferreira Artoni
- Departamento de Genética e Evolução, Universidade Federal de São Carlos, UFSCAR, São Carlos, SP, 13565-905, Brazil
- Departamento de Biologia Estrutural, Molecular e Genética, Universidade Estadual de Ponta Grossa, UEPG, Ponta Grossa, PR, 84030-900, Brazil
| | - Fausto Foresti
- Departamento de Biologia Estrutural e Funcional, Instituto de Biociências de Botucatu, Universidade Estadual Paulista, UNESP, Distrito de Rubião Junior, Botucatu, SP, 18618-970, Brazil
| | | |
|
31
|
Sagor GHM, Simm S, Kim DW, Niitsu M, Kusano T, Berberich T. Effect of thermospermine on expression profiling of different gene using massive analysis of cDNA ends (MACE) and vascular maintenance in Arabidopsis. PHYSIOLOGY AND MOLECULAR BIOLOGY OF PLANTS : AN INTERNATIONAL JOURNAL OF FUNCTIONAL PLANT BIOLOGY 2021; 27:577-586. [PMID: 33854285 PMCID: PMC7981342 DOI: 10.1007/s12298-021-00967-7] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 08/17/2020] [Revised: 02/27/2021] [Accepted: 03/03/2021] [Indexed: 05/24/2023]
Abstract
The Arabidopsis thaliana polyamine oxidase 5 gene (AtPAO5) functions as a thermospermine (T-Spm) oxidase. Aerial growth of its knock-out mutant (Atpao5-2) was significantly repressed by low dose(s) of T-Spm but not by other polyamines. To determine the underlying mechanism, massive analysis of 3'-cDNA ends was performed. Low-dose T-Spm treatment modulated the expression of 1,398 genes more than twofold in WT, compared with 3,186 genes in Atpao5-2. Cell wall, lipid and secondary metabolisms were dramatically affected in low-dose T-Spm-treated Atpao5-2, in comparison with other pathways such as TCA cycle metabolism, amino acid metabolism and photosynthesis. The cell wall pectin metabolism, cell wall proteins and degradation process were highly modulated. Intriguingly, Fe-deficiency responsive genes and drought stress-induced genes were also up-regulated, suggesting the importance of thermospermine flux in regulation of the gene network. Histological observation showed that the vascular system of the joint part between stem and leaves was structurally dissociated, indicating its involvement in vascular maintenance. An endogenous increase in T-Spm and a reduction in H2O2 content were found in the mutant grown in T-Spm-containing media. The results indicate that T-Spm homeostasis, maintained by a fine-tuned balance of its synthesis and catabolism, is important for maintaining the gene regulation network and the vascular system in plants.
Affiliation(s)
- G. H. M. Sagor
- Plant Molecular Genetics Laboratory, Department of Genetics & Plant Breeding, Bangladesh Agricultural University, Mymensingh, 2202 Bangladesh
| | - Stefan Simm
- Department of Biosciences, Molecular Cell Biology of Plants, Goethe University, Frankfurt am Main, Germany
| | - Dong Wook Kim
- Graduate School of Life Sciences, Tohoku University, 2-1-1 Katahira, Aoba, Sendai, Miyagi 980-8577 Japan
| | - Masaru Niitsu
- Faculty of Pharmaceutical Sciences, Josai University, Sakado, Saitama 370-0290 Japan
| | - Tomonobu Kusano
- Graduate School of Life Sciences, Tohoku University, 2-1-1 Katahira, Aoba, Sendai, Miyagi 980-8577 Japan
| | - Thomas Berberich
- Senckenberg Biodiversity and Climate Research Center, Georg-Voigt-Str. 14-16, 60325 Frankfurt am Main, Germany
| |
|
32
|
|
33
|
Variant Calling in Next Generation Sequencing Data. SYSTEMS MEDICINE 2021. [DOI: 10.1016/b978-0-12-801238-3.11285-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/20/2022] Open
|
34
|
Pereira RJ, Ruiz‐Ruano FJ, Thomas CJ, Pérez‐Ruiz M, Jiménez‐Bartolomé M, Liu S, Torre J, Bella JL. Mind the numt: Finding informative mitochondrial markers in a giant grasshopper genome. J ZOOL SYST EVOL RES 2020. [DOI: 10.1111/jzs.12446] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 12/20/2022]
Affiliation(s)
- Ricardo J. Pereira
- Division of Evolutionary Biology Faculty of Biology II Ludwig‐ Maximilians‐Universität München Planegg‐Martinsried Germany
| | - Francisco J. Ruiz‐Ruano
- Department of Genetics University of Granada Granada Spain
- Department of Ecology and Genetics – Evolutionary Biology Evolutionary Biology Centre (EBC) Uppsala University Uppsala Sweden
- Department of Organismal Biology – Systematic Biology Evolutionary Biology Centre (EBC) Uppsala University Uppsala Sweden
| | - Callum J.E. Thomas
- Division of Evolutionary Biology Faculty of Biology II Ludwig‐ Maximilians‐Universität München Planegg‐Martinsried Germany
| | - Mar Pérez‐Ruiz
- Departamento de Biología (Genética) Facultad de Ciencias Universidad Autónoma de Madrid Madrid Spain
| | - Miguel Jiménez‐Bartolomé
- Departamento de Biología (Genética) Facultad de Ciencias Universidad Autónoma de Madrid Madrid Spain
| | - Shanlin Liu
- Department of Entomology College of Plant Protection China Agricultural University Beijing China
| | - Joaquina Torre
- Departamento de Biología (Genética) Facultad de Ciencias Universidad Autónoma de Madrid Madrid Spain
- Centro de Investigación en Biodiversidad y Cambio Global (CIBC‐UAM) Universidad Autónoma de Madrid Madrid Spain
| | - José L. Bella
- Departamento de Biología (Genética) Facultad de Ciencias Universidad Autónoma de Madrid Madrid Spain
- Centro de Investigación en Biodiversidad y Cambio Global (CIBC‐UAM) Universidad Autónoma de Madrid Madrid Spain
| |
|
35
|
Hammoudeh SM, Venkatachalam T, Ansari AW, Bendardaf R, Hamid Q, Rahmani M, Hamoudi R. Systems Immunology Analysis Reveals an Immunomodulatory Effect of Snail-p53 Binding on Neutrophil- and T Cell-Mediated Immunity in KRAS Mutant Non-Small Cell Lung Cancer. Front Immunol 2020; 11:569671. [PMID: 33381110 PMCID: PMC7768232 DOI: 10.3389/fimmu.2020.569671] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 06/04/2020] [Accepted: 11/11/2020] [Indexed: 11/13/2022] Open
Abstract
Immunomodulation and chronic inflammation are important mechanisms utilized by cancer cells to evade the immune defense and promote tumor progression. Therefore, various efforts were focused on the development of approaches to reprogram the immune response to increase the immune detection of cancer cells and enhance patient response to various types of therapy. A number of regulatory proteins were investigated and proposed as potential targets for immunomodulatory therapeutic approaches including p53 and Snail. In this study, we investigated the immunomodulatory effect of disrupting Snail-p53 binding induced by the oncogenic KRAS to suppress p53 signaling. We analyzed the transcriptomic profile mediated by Snail-p53 binding inhibitor GN25 in non-small cell lung cancer cells (A549) using Next generation whole RNA-sequencing. Notably, we observed a significant enrichment in transcripts involved in immune response pathways especially those contributing to neutrophil (IL8) and T-cell mediated immunity (BCL6, and CD81). Moreover, transcripts associated with NF-κB signaling were also enriched which may play an important role in the immunomodulatory effect of Snail-p53 binding. Further analysis revealed that the immune expression signature of GN25 overlaps with the signature of other therapeutic compounds known to exhibit immunomodulatory effects validating the immunomodulatory potential of targeting Snail-p53 binding. The effects of GN25 on the immune response pathways suggest that targeting Snail-p53 binding might be a potentially effective therapeutic strategy.
Affiliation(s)
- Sarah Musa Hammoudeh
- Clinical Sciences Department, College of Medicine, University of Sharjah, Sharjah, United Arab Emirates; Sharjah Institute for Medical Research, University of Sharjah, Sharjah, United Arab Emirates
| | - Thenmozhi Venkatachalam
- Sharjah Institute for Medical Research, University of Sharjah, Sharjah, United Arab Emirates
| | - Abdul Wahid Ansari
- Sharjah Institute for Medical Research, University of Sharjah, Sharjah, United Arab Emirates
| | - Riyad Bendardaf
- Clinical Sciences Department, College of Medicine, University of Sharjah, Sharjah, United Arab Emirates; Oncology Unit, University Hospital Sharjah, Sharjah, United Arab Emirates
| | - Qutayba Hamid
- Clinical Sciences Department, College of Medicine, University of Sharjah, Sharjah, United Arab Emirates; Meakins-Christie Laboratories, McGill University, Montreal, QC, Canada
| | - Mohamed Rahmani
- Clinical Sciences Department, College of Medicine, University of Sharjah, Sharjah, United Arab Emirates; Sharjah Institute for Medical Research, University of Sharjah, Sharjah, United Arab Emirates
| | - Rifat Hamoudi
- Clinical Sciences Department, College of Medicine, University of Sharjah, Sharjah, United Arab Emirates; Sharjah Institute for Medical Research, University of Sharjah, Sharjah, United Arab Emirates; Division of Surgery and Interventional Science, University College London, London, United Kingdom
| |
|
36
|
Banes GL, Fountain ED, Karklus A, Huang HM, Jang-Liaw NH, Burgess DL, Wendt J, Moehlenkamp C, Mayhew GF. Genomic targets for high-resolution inference of kinship, ancestry and disease susceptibility in orang-utans (genus: Pongo). BMC Genomics 2020; 21:873. [PMID: 33287706 PMCID: PMC7720378 DOI: 10.1186/s12864-020-07278-3] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/10/2020] [Accepted: 11/24/2020] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Orang-utans comprise three critically endangered species endemic to the islands of Borneo and Sumatra. Though whole-genome sequencing has recently accelerated our understanding of their evolutionary history, the costs of implementing routine genome screening and diagnostics remain prohibitive. Capitalizing on a tri-fold locus discovery approach, combining data from published whole-genome sequences, novel whole-exome sequencing, and microarray-derived genotype data, we aimed to develop a highly informative gene-focused panel of targets that can be used to address a broad range of research questions. RESULTS We identified and present genomic co-ordinates for 175,186 SNPs and 2315 Y-chromosomal targets, plus 185 genes either known or presumed to be pathogenic in cardiovascular (N = 109) or respiratory (N = 43) diseases in humans - the primary and secondary causes of captive orang-utan mortality - or a majority of other human diseases (N = 33). As proof of concept, we designed and synthesized 'SeqCap' hybrid capture probes for these targets, demonstrating cost-effective target enrichment and reduced-representation sequencing. CONCLUSIONS Our targets are of broad utility in studies of orang-utan ancestry, admixture and disease susceptibility and aetiology, and thus are of value in addressing questions key to the survival of these species. To facilitate comparative analyses, these targets could now be standardized for future orang-utan population genomic studies. The targets are broadly compatible with commercial target enrichment platforms and can be utilized as published here to synthesize applicable probes.
Affiliation(s)
- Graham L Banes
- Wisconsin National Primate Research Center, University of Wisconsin-Madison, 1220 Capitol Court, Madison, WI, 53715, USA.
| | - Emily D Fountain
- Wisconsin National Primate Research Center, University of Wisconsin-Madison, 1220 Capitol Court, Madison, WI, 53715, USA
| | - Alyssa Karklus
- School of Veterinary Medicine, University of Wisconsin-Madison, 2015 Linden Drive, Madison, WI, 53706, USA
| | - Hao-Ming Huang
- Conservation Genetics Laboratory, Conservation and Research Center, Taipei Zoo, No. 30, Section 2, Xinguang Road, Wenshan District, Taipei City, Taiwan, 11656
| | - Nian-Hong Jang-Liaw
- Conservation Genetics Laboratory, Conservation and Research Center, Taipei Zoo, No. 30, Section 2, Xinguang Road, Wenshan District, Taipei City, Taiwan, 11656
| | - Daniel L Burgess
- Roche Sequencing Solutions, 500 S Rosa Road, Madison, WI, 53719, USA; Polymer Forge, Inc., 504 S Rosa Rd Ste 200, Madison, WI, 53719, USA
| | - Jennifer Wendt
- Roche Sequencing Solutions, 500 S Rosa Road, Madison, WI, 53719, USA; Promega Corporation, 2800 Woods Hollow Rd, Fitchburg, WI, 53711, USA
| | - Cynthia Moehlenkamp
- Roche Sequencing Solutions, 500 S Rosa Road, Madison, WI, 53719, USA; Exact Sciences, 441 Charmany Dr, Madison, WI, 53719, USA
| | - George F Mayhew
- Roche Sequencing Solutions, 500 S Rosa Road, Madison, WI, 53719, USA
| |
|
37
|
Dang X, Yang Y, Zhang Y, Chen X, Fan Z, Liu Q, Ji J, Li D, Li Y, Fang B, Wu Z, Liu E, Hu X, Zhu S, She D, Wang H, Li Y, Chen S, Wu Y, Hong D. OsSYL2AA, an allele identified by gene-based association, increases style length in rice (Oryza sativa L.). THE PLANT JOURNAL: FOR CELL AND MOLECULAR BIOLOGY 2020; 104:1491-1503. [PMID: 33031564 PMCID: PMC7821000 DOI: 10.1111/tpj.15013] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 07/30/2020] [Revised: 09/10/2020] [Accepted: 09/18/2020] [Indexed: 06/01/2023]
Abstract
Stigma characteristics are important factors affecting the seed yield of hybrid rice per unit area. Natural variation of stigma characteristics has been reported in rice, but the genetic basis for this variation is largely unknown. We performed a genome-wide association study on three stigma characteristics in six environments using 1.3 million single-nucleotide polymorphisms (SNPs) characterized in 353 diverse accessions of Oryza sativa. An abundance of phenotypic variation was present in the three stigma characteristics of these collections. We identified four significant SNPs associated with stigma length, 20 SNPs with style length (SYL), and 17 SNPs with the sum of stigma and style length, which were detected repeatedly in more than four environments. Of these SNPs, 28 were novel. We identified two causal gene loci for SYL, OsSYL3 and OsSYL2; OsSYL3 was co-localized with the grain size gene GS3. The SYL of accessions carrying allele OsSYL3AA was significantly longer than that of those carrying allele OsSYL3CC. We also demonstrated that the outcrossing rate of female parents carrying allele OsSYL2AA increased by 5.71% compared with that of the isogenic line carrying allele OsSYL2CC in an F1 hybrid seed production field. The allele frequencies of OsSYL3AA and OsSYL2AA decreased gradually with an increase in latitude in the Northern Hemisphere. Our results should facilitate the improvement in stigma characteristics of parents of hybrid rice.
Affiliation(s)
- Xiaojing Dang
- State Key Laboratory of Crop Genetics and Germplasm Enhancement, Nanjing Agricultural University, Nanjing 210095, China
| | - Yang Yang
- State Key Laboratory of Crop Genetics and Germplasm Enhancement, Nanjing Agricultural University, Nanjing 210095, China
| | - Yuanqing Zhang
- State Key Laboratory of Crop Genetics and Germplasm Enhancement, Nanjing Agricultural University, Nanjing 210095, China
| | - Xiangong Chen
- State Key Laboratory of Crop Genetics and Germplasm Enhancement, Nanjing Agricultural University, Nanjing 210095, China
| | - Zhilan Fan
- Rice Research Institute, Guangdong Academy of Agricultural Sciences, Guangzhou 510640, China
| | - Qiangming Liu
- Special Crop Research Institute, Chongqing Academy of Agricultural Sciences, Chongqing 402160, China
| | - Jie Ji
- State Key Laboratory of Crop Genetics and Germplasm Enhancement, Nanjing Agricultural University, Nanjing 210095, China
| | - Dalu Li
- School of Agriculture and Biology, Shanghai Jiao Tong University, Shanghai 200240, China
| | - Yanhui Li
- State Key Laboratory of Crop Genetics and Germplasm Enhancement, Nanjing Agricultural University, Nanjing 210095, China
| | - Bingjie Fang
- State Key Laboratory of Crop Genetics and Germplasm Enhancement, Nanjing Agricultural University, Nanjing 210095, China
| | - Zexu Wu
- Key Laboratory of Crop Germplasm of Zhejiang Province, Institute of Crop Science, Zhejiang University, Hangzhou 310058, China
| | - Erbao Liu
- State Key Laboratory of Crop Genetics and Germplasm Enhancement, Nanjing Agricultural University, Nanjing 210095, China
| | - Xiaoxiao Hu
- State Key Laboratory of Crop Genetics and Germplasm Enhancement, Nanjing Agricultural University, Nanjing 210095, China
| | - Shangshang Zhu
- State Key Laboratory of Crop Genetics and Germplasm Enhancement, Nanjing Agricultural University, Nanjing 210095, China
| | - Dong She
- State Key Laboratory of Crop Genetics and Germplasm Enhancement, Nanjing Agricultural University, Nanjing 210095, China
| | - Hui Wang
- State Key Laboratory of Crop Genetics and Germplasm Enhancement, Nanjing Agricultural University, Nanjing 210095, China
| | - Yulong Li
- State Key Laboratory of Crop Genetics and Germplasm Enhancement, Nanjing Agricultural University, Nanjing 210095, China
| | - Siqi Chen
- State Key Laboratory of Crop Genetics and Germplasm Enhancement, Nanjing Agricultural University, Nanjing 210095, China
| | - Yufeng Wu
- State Key Laboratory of Crop Genetics and Germplasm Enhancement, Nanjing Agricultural University, Nanjing 210095, China
| | - Delin Hong
- State Key Laboratory of Crop Genetics and Germplasm Enhancement, Nanjing Agricultural University, Nanjing 210095, China
| |
|
38
|
Kumar PS, Dabdoub SM, Ganesan SM. Probing periodontal microbial dark matter using metataxonomics and metagenomics. Periodontol 2000 2020; 85:12-27. [PMID: 33226714 DOI: 10.1111/prd.12349] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Indexed: 12/12/2022]
Abstract
Our view of the periodontal microbial community has been shaped by a century or more of cultivation-based and microscopic investigations. While these studies firmly established the infection-mediated etiology of periodontal diseases, it was apparent from the very early days that periodontal microbiology suffered from what Staley and Konopka described as the "great plate count anomaly", in that these culturable bacteria were only a minor part of what was visible under the microscope. For nearly a century, much effort has been devoted to finding the right tools to investigate this uncultivated majority, also known as "microbial dark matter". The discovery that DNA was an effective tool to "see" microbial dark matter was a significant breakthrough in environmental microbiology, and oral microbiologists were among the earliest to capitalize on these advances. By identifying the order in which nucleotides are arranged in a stretch of DNA (DNA sequencing) and creating a repository of these sequences, sequence databases were created. Computational tools that used probability-driven analysis of these sequences enabled the discovery of new and unsuspected species and ascribed novel functions to these species. This review will trace the development of DNA sequencing as a quantitative, open-ended, comprehensive approach to characterize microbial communities in their native environments, and explore how this technology has shifted traditional dogmas on how the oral microbiome promotes health and its role in disease causation and perpetuation.
Affiliation(s)
- Purnima S Kumar
- Department of Periodontology, College of Dentistry, The Ohio State University, Columbus, Ohio, USA
| | - Shareef M Dabdoub
- Department of Periodontology, College of Dentistry, The Ohio State University, Columbus, Ohio, USA
| | - Sukirth M Ganesan
- Department of Periodontics, College of Dentistry and Dental Clinics, The University of Iowa, Iowa City, Iowa, USA
| |
|
39
|
Freel KC, Fouteau S, Roche D, Farasin J, Huber A, Koechler S, Peres M, Chiboub O, Varet H, Proux C, Deschamps J, Briandet R, Torchet R, Cruveiller S, Lièvremont D, Coppée JY, Barbe V, Arsène-Ploetze F. Effect of arsenite and growth in biofilm conditions on the evolution of Thiomonas sp. CB2. Microb Genom 2020; 6:mgen000447. [PMID: 33034553 PMCID: PMC7660254 DOI: 10.1099/mgen.0.000447] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/15/2020] [Accepted: 09/14/2020] [Indexed: 11/30/2022] Open
Abstract
Thiomonas bacteria are ubiquitous at acid mine drainage sites and play key roles in the remediation of water at these locations by oxidizing arsenite to arsenate, favouring the sorption of arsenic by iron oxides and their coprecipitation. Understanding the adaptive capacities of these bacteria is crucial to revealing how they persist and remain active in such extreme conditions. Interestingly, it was previously observed that after exposure to arsenite, when grown in a biofilm, some strains of Thiomonas bacteria develop variants that are more resistant to arsenic. Here, we identified the mechanisms involved in the emergence of such variants in biofilms. We found that the percentage of variants generated increased in the presence of high concentrations of arsenite (5.33 mM), especially in the detached cells after growth under biofilm-forming conditions. Analysis of gene expression in the parent strain CB2 revealed that genes involved in DNA repair were upregulated in the conditions where variants were observed. Finally, we assessed the phenotypes and genomes of the subsequent variants generated to evaluate the number of mutations compared to the parent strain. We determined that multiple point mutations accumulated after exposure to arsenite when cells were grown under biofilm conditions. Some of these mutations were found in what is referred to as ICE19, a genomic island (GI) carrying arsenic-resistance genes, also harbouring characteristics of an integrative and conjugative element (ICE). The mutations likely favoured the excision and duplication of this GI. This research aids in understanding how Thiomonas bacteria adapt to highly toxic environments, and, more generally, provides a window to bacterial genome evolution in extreme environments.
Affiliation(s)
- Kelle C. Freel
- Laboratoire Génétique Moléculaire, Génomique et Microbiologie, UMR7156, Institut de Botanique, CNRS – Université de Strasbourg, Strasbourg, France
- Present address: Hawaiʻi Institute of Marine Biology, University of Hawaiʻi at Mānoa, Kāneʻohe, HI, USA
| | - Stephanie Fouteau
- Génomique Métabolique, Genoscope, Institut de Biologie François Jacob, Commissariat à l'Energie Atomique (CEA), CNRS, Université Evry, Université Paris-Saclay, Evry, France
| | - David Roche
- Génomique Métabolique, Genoscope, Institut de Biologie François Jacob, Commissariat à l'Energie Atomique (CEA), CNRS, Université Evry, Université Paris-Saclay, Evry, France
| | - Julien Farasin
- Laboratoire Génétique Moléculaire, Génomique et Microbiologie, UMR7156, Institut de Botanique, CNRS – Université de Strasbourg, Strasbourg, France
| | - Aline Huber
- Laboratoire Génétique Moléculaire, Génomique et Microbiologie, UMR7156, Institut de Botanique, CNRS – Université de Strasbourg, Strasbourg, France
| | - Sandrine Koechler
- Laboratoire Génétique Moléculaire, Génomique et Microbiologie, UMR7156, Institut de Botanique, CNRS – Université de Strasbourg, Strasbourg, France
- Present address: Institut de Biologie Moléculaire des Plantes, CNRS, Université de Strasbourg, Strasbourg, France
| | - Martina Peres
- Laboratoire Génétique Moléculaire, Génomique et Microbiologie, UMR7156, Institut de Botanique, CNRS – Université de Strasbourg, Strasbourg, France
| | - Olfa Chiboub
- Laboratoire Génétique Moléculaire, Génomique et Microbiologie, UMR7156, Institut de Botanique, CNRS – Université de Strasbourg, Strasbourg, France
| | - Hugo Varet
- Plateforme Transcriptome et Epigenome, BioMics, Centre de Ressources et Recherches Technologiques, Institut Pasteur, Paris, France
- Hub Bioinformatique et Biostatistique, Centre de Bioinformatique, Biostatistique et Biologie Intégrative (C3BI, USR 3756, IP CNRS), Institut Pasteur, Paris, France
| | - Caroline Proux
- Plateforme Transcriptome et Epigenome, BioMics, Centre de Ressources et Recherches Technologiques, Institut Pasteur, Paris, France
| | - Julien Deschamps
- Micalis Institute, INRA, AgroParisTech, Université Paris-Saclay, Jouy-en-Josas, France
| | - Romain Briandet
- Micalis Institute, INRA, AgroParisTech, Université Paris-Saclay, Jouy-en-Josas, France
| | - Rachel Torchet
- Génomique Métabolique, Genoscope, Institut de Biologie François Jacob, Commissariat à l'Energie Atomique (CEA), CNRS, Université Evry, Université Paris-Saclay, Evry, France
| | - Stephane Cruveiller
- Génomique Métabolique, Genoscope, Institut de Biologie François Jacob, Commissariat à l'Energie Atomique (CEA), CNRS, Université Evry, Université Paris-Saclay, Evry, France
| | - Didier Lièvremont
- Laboratoire Génétique Moléculaire, Génomique et Microbiologie, UMR7156, Institut de Botanique, CNRS – Université de Strasbourg, Strasbourg, France
| | - Jean-Yves Coppée
- Plateforme Transcriptome et Epigenome, BioMics, Centre de Ressources et Recherches Technologiques, Institut Pasteur, Paris, France
| | - Valérie Barbe
- Génomique Métabolique, Genoscope, Institut de Biologie François Jacob, Commissariat à l'Energie Atomique (CEA), CNRS, Université Evry, Université Paris-Saclay, Evry, France
| | - Florence Arsène-Ploetze
- Laboratoire Génétique Moléculaire, Génomique et Microbiologie, UMR7156, Institut de Botanique, CNRS – Université de Strasbourg, Strasbourg, France
- Present address: Institut de Biologie Moléculaire des Plantes, CNRS, Université de Strasbourg, Strasbourg, France
| |
|
40
|
Global transcriptome analysis of subterranean pod and seed in peanut (Arachis hypogaea L.) unravels the complexity of fruit development under dark condition. Sci Rep 2020; 10:13050. [PMID: 32747681 PMCID: PMC7398922 DOI: 10.1038/s41598-020-69943-7] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 02/18/2020] [Accepted: 07/13/2020] [Indexed: 12/14/2022] Open
Abstract
Peanut pods develop underground, which is the most salient characteristic in peanut. However, its developmental transcriptome remains largely unknown. In the present study, we sequenced over one billion transcripts to explore the developmental transcriptome of peanut pod using Illumina sequencing. Moreover, we identified and quantified the abundances of 165,689 transcripts in seed and shell tissues along with a pod developmental gradient. The dynamic changes of differentially expressed transcripts (DETs) were described in seed and shell. Additionally, we found that photosynthetic genes were not only pronouncedly enriched in aerial pod, but also played roles in developing pod under dark condition. Genes functioning in photomorphogenesis showed distinct expression profiles along subterranean pod development. Clustering analysis unraveled a dynamic transcriptome, in which transcripts for DNA synthesis and cell division during pod expansion were transitioning to transcripts for cell expansion and storage activity during seed filling. Collectively, our study formed a transcriptional baseline for peanut fruit development under dark condition.
|
41
|
Development and characterization of 15 novel polymorphic microsatellite loci for two important bot flies (Diptera, Oestridae) by next-generation sequencing. Parasitol Res 2020; 119:2829-2835. [PMID: 32705375 DOI: 10.1007/s00436-020-06824-y] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/13/2020] [Accepted: 07/19/2020] [Indexed: 12/12/2022]
Abstract
Cephenemyia stimulator and Oestrus ovis are two important parasitic bot fly (Oestridae) species causing myiasis, with a potential negative impact on the welfare of the host. Using a next-generation sequencing approach and bioinformatics tools, a large panel of candidate microsatellite loci was obtained in both species. Primer pairs were designed for 15 selected microsatellite loci in C. stimulator and another 15 loci in O. ovis for PCR amplification. Loci amplification and analysis were performed in four populations of each species. The results demonstrated that all selected loci were polymorphic, with the number of alleles ranging from 2 to 6 per locus in C. stimulator and 3 to 13 per locus in O. ovis. This is the first description of these microsatellite loci for C. stimulator and O. ovis. These two sets of microsatellite markers could be further used for biogeographic and population genetics studies.
|
42
|
Abstract
Mapping of reads to reference sequences is an essential step in a wide range of biological studies. The large size of datasets generated with next-generation sequencing technologies motivates the development of fast mapping software. Here, I describe URMAP, a new read mapping algorithm. URMAP is an order of magnitude faster than BWA with comparable accuracy on several validation tests. On a Genome in a Bottle (GIAB) variant calling test with 30× coverage 2×150 reads, URMAP achieves high accuracy (precision 0.998, sensitivity 0.982 and F-measure 0.990) with the strelka2 caller. However, GIAB reference variants are shown to be biased against repetitive regions which are difficult to map and may therefore pose an unrealistically easy challenge to read mappers and variant callers.
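The three accuracy figures quoted in this abstract can be cross-checked against one another. The sketch below assumes that "F-measure" here means the balanced F1 score, that is, the harmonic mean of precision and sensitivity; the abstract does not state the weighting, and the function name `f_measure` is illustrative only.

```python
def f_measure(precision: float, sensitivity: float) -> float:
    """Harmonic mean of precision and sensitivity (the balanced F1 score).

    Assumption: the abstract's "F-measure" is the unweighted F1 score.
    """
    if precision + sensitivity == 0:
        return 0.0  # avoid division by zero when both inputs are zero
    return 2 * precision * sensitivity / (precision + sensitivity)


# Precision and sensitivity as reported in the abstract.
print(f"{f_measure(0.998, 0.982):.3f}")  # prints 0.990
```

Under that assumption the computed value rounds to the reported 0.990, so the three figures are mutually consistent.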
Affiliation(s)
- Robert Edgar
- Unaffiliated, Corte Madera, CA, United States of America
| |
|
43
|
Talevi V, Wen J, Lalla RV, Brennan MT, Mougeot FB, Mougeot JLC. Identification of single nucleotide pleomorphisms associated with periodontal disease in head and neck cancer irradiation patients by exome sequencing. Oral Surg Oral Med Oral Pathol Oral Radiol 2020; 130:32-42.e4. [PMID: 32451231 DOI: 10.1016/j.oooo.2020.02.013] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/06/2019] [Revised: 02/05/2020] [Accepted: 02/16/2020] [Indexed: 11/25/2022]
Abstract
OBJECTIVE Periodontal disease (PD) is a common oral complication in patients with head and neck cancer (HNC) undergoing radiation therapy (RT). Our objective was to identify candidate single nucleotide polymorphisms (SNPs) associated with PD in radiation-treated patients with HNC. STUDY DESIGN DNA was extracted from the saliva of patients with HNC (n = 69) before RT. Clinical attachment loss (CAL) increment greater than 0.2 mm over 24 months after RT was used to define PD progression. After exome sequencing, SNPs associated with post-RT PD progression were identified by using logistic regression and homozygosity analyses. The web tools STRING, the Database for Annotation, Visualization and Integrated Discovery (DAVID), GeneCodis, and Ensembl Variant Effect Predictor were used for functional analysis. RESULTS Of the 48 patients with HNC with post-RT PD progression, 24 had no tooth with 5 mm or greater pocket depth before RT, whereas of the 21 patients with HNC without progression, 11 had PD initially. A total of 330 SNPs (249 genes) with over-represented homozygous genotype (98.5% variant allele) were found to be associated with post-RT PD. Sixty of these corresponded to PD-related pathways, including previously identified genes. In patients with HNC with post-RT PD progression, SNPs were found in genes (n = 10) in contrast to those without progression (n = 7). CONCLUSIONS The SNPs of collagen genes were identified, potentially defining susceptibility to PD in patients with HNC, and this could be further investigated to characterize PD drug targets.
Affiliation(s)
- Valentina Talevi
- Department of Oral Medicine, Carolinas Medical Center, Atrium Health, Charlotte, NC, USA; College of Computing and Informatics, Department of Bioinformatics and Genomics, UNC-Charlotte, Charlotte, NC, USA
| | - Jia Wen
- College of Computing and Informatics, Department of Bioinformatics and Genomics, UNC-Charlotte, Charlotte, NC, USA
| | - Rajesh V Lalla
- Section of Oral Medicine, University of Connecticut Health, Farmington, CT, USA
| | - Michael T Brennan
- Department of Oral Medicine, Carolinas Medical Center, Atrium Health, Charlotte, NC, USA
| | - Farah B Mougeot
- Department of Oral Medicine, Carolinas Medical Center, Atrium Health, Charlotte, NC, USA
| | - Jean-Luc C Mougeot
- Department of Oral Medicine, Carolinas Medical Center, Atrium Health, Charlotte, NC, USA; College of Computing and Informatics, Department of Bioinformatics and Genomics, UNC-Charlotte, Charlotte, NC, USA.
| |
|
44
|
Developing an ultra-efficient microsatellite discoverer to find structural differences between SARS-CoV-1 and Covid-19. INFORMATICS IN MEDICINE UNLOCKED 2020; 19:100356. [PMID: 32501423 PMCID: PMC7241407 DOI: 10.1016/j.imu.2020.100356] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 04/20/2020] [Revised: 05/20/2020] [Accepted: 05/20/2020] [Indexed: 12/24/2022] Open
Abstract
MOTIVATION Recently, the outbreak of Coronavirus-Covid-19 has forced the World Health Organization to declare a pandemic status. A genome sequence is the core of this virus which interferes with the normal activities of its counterparts within humans. Analysis of its genome may provide clues toward the proper treatment of patients and the design of new drugs and vaccines. Microsatellites are composed of short genome subsequences which are successively repeated many times in the same direction. They are highly variable in terms of their building blocks, number of repeats, and their locations in the genome sequences. This mutability property has been the source of many diseases. Usually the host genome is analyzed to diagnose possible diseases in the victim. In this research, the focus is on the attacker's genome for discovery of its malicious properties. RESULTS The focus of this research is the microsatellites of both SARS and Covid-19. An accurate and highly efficient computer method for identifying all microsatellites in the genome sequences was developed and implemented, and it is used to find all microsatellites in the Coronavirus-Covid-19 and SARS2003. Microsatellite discovery is based on an efficient indexing technique called K-Mer Hash Indexing. The method is called Fast Microsatellite Discovery (FMSD) and it is used for both SARS and Covid-19. A table composed of all microsatellites is reported. There are many differences between SARS and Covid-19, but there is an outstanding difference which requires further investigation. AVAILABILITY FMSD is freely available at https://gitlab.com/FUM_HPCLab/fmsd_project, implemented in C on Linux-Ubuntu system. Software related contact: hossein_savari@mail.um.ac.ir.
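To make the definition of a microsatellite in this abstract concrete (a short unit repeated successively in the same direction), here is a minimal Python sketch that scans a sequence for perfect tandem repeats with a regular expression. It is not the FMSD algorithm and does not use K-Mer Hash Indexing, whose details the abstract does not give; the function name `find_microsatellites` and the thresholds `max_unit` and `min_repeats` are hypothetical choices for illustration.

```python
import re
from typing import List, Tuple


def find_microsatellites(
    seq: str, max_unit: int = 6, min_repeats: int = 3
) -> List[Tuple[int, str, int]]:
    """Return (start, unit, repeat_count) for each perfect tandem repeat.

    Illustrative regex scan only; thresholds are assumptions, not FMSD's.
    """
    seq = seq.upper()
    # Group 1 is the repeat unit (the lazy quantifier tries shorter units first);
    # the backreference then requires at least min_repeats - 1 further copies.
    pattern = re.compile(r"([ACGT]{1,%d}?)\1{%d,}" % (max_unit, min_repeats - 1))
    hits = []
    for match in pattern.finditer(seq):
        unit = match.group(1)
        hits.append((match.start(), unit, len(match.group(0)) // len(unit)))
    return hits


print(find_microsatellites("GGATATATATCC"))  # [(2, 'AT', 4)]
```

A backtracking regex like this is adequate for virus-sized genomes, but it reports only perfect, non-overlapping repeats. An indexed approach such as the one the abstract names would be the choice where speed on large inputs matters.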
|
46
|
Dong C, Zhang L, Chen Z, Xia C, Gu Y, Wang J, Li D, Xie Z, Zhang Q, Zhang X, Gui L, Liu X, Kong X. Combining a New Exome Capture Panel With an Effective varBScore Algorithm Accelerates BSA-Based Gene Cloning in Wheat. FRONTIERS IN PLANT SCIENCE 2020; 11:1249. [PMID: 32903549 PMCID: PMC7438552 DOI: 10.3389/fpls.2020.01249] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 05/15/2020] [Accepted: 07/29/2020] [Indexed: 05/07/2023]
Abstract
The discovery of functional genes underlying agronomic traits is of great importance for wheat improvement. Here we designed a new wheat exome capture probe panel based on IWGSC RefSeq v1.0 genome sequence information and developed an effective algorithm, varBScore, that can sufficiently reduce the background noise in gene mapping and identification. An effective method, termed bulked segregant exome capture sequencing (BSE-Seq), for identifying causal mutations or candidate genes was established by combining the use of a newly designed wheat exome capture panel, sequencing of bulked segregant pools from segregating populations, and the robust algorithm varBScore. We evaluated the effectiveness of varBScore on SNP calling using the published dataset for mapping and cloning the yellow rust resistance gene Yr7 in wheat. Furthermore, using BSE-Seq, we rapidly identified a wheat yellow leaf mutant gene, ygl1, in an ethyl methanesulfonate (EMS) mutant population and found that a single G-to-A mutation at position 921 in the wild-type YGL1 gene encoding magnesium-chelatase subunit chlI caused the leaf yellowing phenotype. We further showed that mutation of YGL1 through CRISPR/Cas9 gene editing led to a yellow phenotype on the leaves of transgenic wheat, indicating that ygl1 is the correct causal gene responsible for the mutant phenotype. In summary, our approach is highly efficient for discovering causal mutations and gene cloning in wheat.
Affiliation(s)
- Chunhao Dong
- Key Laboratory for Crop Gene Resources and Germplasm Enhancement, MOA, National Key Facility for Crop Gene Resources and Genetic Improvement, Institute of Crop Sciences, Chinese Academy of Agricultural Sciences, Beijing, China
| | - Lichao Zhang
- Key Laboratory for Crop Gene Resources and Germplasm Enhancement, MOA, National Key Facility for Crop Gene Resources and Genetic Improvement, Institute of Crop Sciences, Chinese Academy of Agricultural Sciences, Beijing, China
- *Correspondence: Lichao Zhang; Xu Liu; Xiuying Kong
| | - Zhongxu Chen
- Department of Life Science, Chengdu Tcuni Technology, Chengdu, China
- State Key Laboratory of Crop Gene Exploration and Utilization in Southwest China, Sichuan Agricultural University, Chengdu, China
| | - Chuan Xia
- Key Laboratory for Crop Gene Resources and Germplasm Enhancement, MOA, National Key Facility for Crop Gene Resources and Genetic Improvement, Institute of Crop Sciences, Chinese Academy of Agricultural Sciences, Beijing, China
| | - Yongqiang Gu
- Western Regional Research, United States Department of Agriculture-Agricultural Research Service, Albany, CA, United States
| | - Jirui Wang
- State Key Laboratory of Crop Gene Exploration and Utilization in Southwest China, Sichuan Agricultural University, Chengdu, China
- Triticeae Research Institute, Sichuan Agricultural University, Chengdu, China
| | - Danping Li
- Key Laboratory for Crop Gene Resources and Germplasm Enhancement, MOA, National Key Facility for Crop Gene Resources and Genetic Improvement, Institute of Crop Sciences, Chinese Academy of Agricultural Sciences, Beijing, China
| | - Zhencheng Xie
- Key Laboratory for Crop Gene Resources and Germplasm Enhancement, MOA, National Key Facility for Crop Gene Resources and Genetic Improvement, Institute of Crop Sciences, Chinese Academy of Agricultural Sciences, Beijing, China
| | - Qiang Zhang
- Key Laboratory for Crop Gene Resources and Germplasm Enhancement, MOA, National Key Facility for Crop Gene Resources and Genetic Improvement, Institute of Crop Sciences, Chinese Academy of Agricultural Sciences, Beijing, China
| | - Xueying Zhang
- Key Laboratory for Crop Gene Resources and Germplasm Enhancement, MOA, National Key Facility for Crop Gene Resources and Genetic Improvement, Institute of Crop Sciences, Chinese Academy of Agricultural Sciences, Beijing, China
| | - Lixuan Gui
- Department of Life Science, Chengdu Tcuni Technology, Chengdu, China
- State Key Laboratory of Crop Gene Exploration and Utilization in Southwest China, Sichuan Agricultural University, Chengdu, China
| | - Xu Liu
- Key Laboratory for Crop Gene Resources and Germplasm Enhancement, MOA, National Key Facility for Crop Gene Resources and Genetic Improvement, Institute of Crop Sciences, Chinese Academy of Agricultural Sciences, Beijing, China
- *Correspondence: Lichao Zhang, ; Xu Liu, ; Xiuying Kong,
| | - Xiuying Kong
- Key Laboratory for Crop Gene Resources and Germplasm Enhancement, MOA, National Key Facility for Crop Gene Resources and Genetic Improvement, Institute of Crop Sciences, Chinese Academy of Agricultural Sciences, Beijing, China
- *Correspondence: Lichao Zhang, ; Xu Liu, ; Xiuying Kong,
|
47
|
Illikoud N, Gohier R, Werner D, Barrachina C, Roche D, Jaffrès E, Zagorec M. Transcriptome and Volatilome Analysis During Growth of Brochothrix thermosphacta in Food: Role of Food Substrate and Strain Specificity for the Expression of Spoilage Functions. Front Microbiol 2019; 10:2527. [PMID: 31781057 PMCID: PMC6856214 DOI: 10.3389/fmicb.2019.02527] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 08/09/2019] [Accepted: 10/21/2019] [Indexed: 11/13/2022] Open
Abstract
Brochothrix thermosphacta is one of the main food spoilers, responsible for meat and seafood spoilage through the production of malodorous volatile organic compounds. The molecules produced by this bacterium depend on the substrate (meat or seafood) and on storage conditions such as the gas mixtures used in packaging. The spoilage potential also appears to be strain dependent, as production of diacetyl and acetoin, two molecules responsible for seafood spoilage, varies between strains. This suggests the involvement of different metabolic functions depending on both food substrate and strain capacities. In this study, we selected two strains with different abilities to produce diacetyl and acetoin and compared their behavior after growth in beef or cooked peeled shrimp juices. We determined the genes upregulated by both strains depending on the growth substrate and those that were specifically upregulated in only one strain. The genes upregulated by both strains in meat or in shrimp juice revealed the importance of the substrate for inducing specific metabolic pathways. The examination of genes that were specifically upregulated in only one of the two strains revealed strain features associated with specific substrates and also strain-specific regulation of metabolic pathways putatively leading to different levels of spoilage molecule production. This shows that the spoilage potential of B. thermosphacta depends on the nutrients provided by the food substrate and on the metabolic potential of each strain.
Affiliation(s)
- Célia Barrachina
- MGX, CNRS, INSERM, University of Montpellier, Montpellier, France
- David Roche
- Génomique Métabolique, Génoscope, Institut François Jacob, CEA, CNRS, Université d'Evry, Université Paris-Saclay, Evry, France
|
48
|
Kinsella CM, Ruiz-Ruano FJ, Dion-Côté AM, Charles AJ, Gossmann TI, Cabrero J, Kappei D, Hemmings N, Simons MJP, Camacho JPM, Forstmeier W, Suh A. Programmed DNA elimination of germline development genes in songbirds. Nat Commun 2019; 10:5468. [PMID: 31784533 PMCID: PMC6884545 DOI: 10.1038/s41467-019-13427-4] [Citation(s) in RCA: 44] [Impact Index Per Article: 8.8] [Received: 04/11/2019] [Accepted: 11/08/2019] [Indexed: 02/08/2023] Open
Abstract
In some eukaryotes, germline and somatic genomes differ dramatically in their composition. Here we characterise a major germline–soma dissimilarity caused by a germline-restricted chromosome (GRC) in songbirds. We show that the zebra finch GRC contains >115 genes paralogous to single-copy genes on 18 autosomes and the Z chromosome, and is enriched in genes involved in female gonad development. Many genes are likely functional, evidenced by expression in testes and ovaries at the RNA and protein level. Using comparative genomics, we show that genes have been added to the GRC over millions of years of evolution, with embryonic development genes bicc1 and trim71 dating to the ancestor of songbirds and dozens of other genes added very recently. The somatic elimination of this evolutionarily dynamic chromosome in songbirds implies a unique mechanism to minimise genetic conflict between germline and soma, relevant to antagonistic pleiotropy, an evolutionary process underlying ageing and sexual traits. Songbirds have extensive germline–soma genome differences due to developmental elimination of a germline-specific chromosome (GRC). Here, the authors show that the GRC contains dozens of expressed developmental genes, some of which have been on the GRC since the ancestor of all songbirds.
Affiliation(s)
- Cormac M Kinsella
- Department of Ecology and Genetics - Evolutionary Biology, Evolutionary Biology Centre (EBC), Science for Life Laboratory, Uppsala University, SE-752 36, Uppsala, Sweden
- Laboratory of Experimental Virology, Department of Medical Microbiology, Amsterdam UMC, University of Amsterdam, 1105 AZ, Amsterdam, The Netherlands
- Francisco J Ruiz-Ruano
- Department of Ecology and Genetics - Evolutionary Biology, Evolutionary Biology Centre (EBC), Science for Life Laboratory, Uppsala University, SE-752 36, Uppsala, Sweden
- Department of Genetics, University of Granada, E-18071, Granada, Spain
- Department of Organismal Biology - Systematic Biology, Evolutionary Biology Centre (EBC), Science for Life Laboratory, Uppsala University, SE-752 36, Uppsala, Sweden
- Anne-Marie Dion-Côté
- Department of Ecology and Genetics - Evolutionary Biology, Evolutionary Biology Centre (EBC), Science for Life Laboratory, Uppsala University, SE-752 36, Uppsala, Sweden
- Department of Molecular Biology & Genetics, Cornell University, Ithaca, NY, 14853, USA
- Département de Biologie, Université de Moncton, Moncton, NB, E1A 3E9, Canada
- Alexander J Charles
- Department of Animal and Plant Sciences, University of Sheffield, S10 2TN, Sheffield, UK
- Toni I Gossmann
- Department of Animal and Plant Sciences, University of Sheffield, S10 2TN, Sheffield, UK
- Department of Animal Behaviour, Bielefeld University, D-33501, Bielefeld, Germany
- Josefa Cabrero
- Department of Genetics, University of Granada, E-18071, Granada, Spain
- Dennis Kappei
- Cancer Science Institute of Singapore, National University of Singapore, 117599, Singapore, Singapore
- Department of Biochemistry, Yong Loo Lin School of Medicine, National University of Singapore, 117596, Singapore, Singapore
- Nicola Hemmings
- Department of Animal and Plant Sciences, University of Sheffield, S10 2TN, Sheffield, UK
- Mirre J P Simons
- Department of Animal and Plant Sciences, University of Sheffield, S10 2TN, Sheffield, UK
- Alexander Suh
- Department of Ecology and Genetics - Evolutionary Biology, Evolutionary Biology Centre (EBC), Science for Life Laboratory, Uppsala University, SE-752 36, Uppsala, Sweden
- Department of Organismal Biology - Systematic Biology, Evolutionary Biology Centre (EBC), Science for Life Laboratory, Uppsala University, SE-752 36, Uppsala, Sweden
|
49
|
Liang R, Xie J, Zhang C, Zhang M, Huang H, Huo H, Cao X, Niu B. Identifying Cancer Targets Based on Machine Learning Methods via Chou's 5-steps Rule and General Pseudo Components. Curr Top Med Chem 2019; 19:2301-2317. [PMID: 31622219 DOI: 10.2174/1568026619666191016155543] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 07/14/2019] [Revised: 07/19/2019] [Accepted: 08/26/2019] [Indexed: 01/09/2023]
Abstract
In recent years, the successful implementation of the Human Genome Project has made people realize that genetic, environmental and lifestyle factors should be considered together when studying cancer, given the complexity and varied forms of the disease. The increasing availability and growth rate of 'big data' derived from various omics open a new window for the study and therapy of cancer. In this paper, we introduce the application of machine learning methods to cancer big data, including artificial neural networks, support vector machines, ensemble learning and naïve Bayes classifiers.
Affiliation(s)
- Ruirui Liang
- School of Life Sciences, Shanghai University, Shanghai, 200444, China
- Jiayang Xie
- School of Life Sciences, Shanghai University, Shanghai, 200444, China
- Chi Zhang
- Foshan Huaxia Eye Hospital, Huaxia Eye Hospital Group, Foshan 528000, China
- Mengying Zhang
- School of Life Sciences, Shanghai University, Shanghai, 200444, China
- Hai Huang
- School of Life Sciences, Shanghai University, Shanghai, 200444, China
- Haizhong Huo
- Department of General Surgery, Shanghai Ninth People's Hospital Affiliated to Shanghai Jiao Tong University School of Medicine, Shanghai 200011, China
- Xin Cao
- Zhongshan Hospital, Institute of Clinical Science, Shanghai Medical College, Fudan University, Shanghai 200032, China
- Bing Niu
- School of Life Sciences, Shanghai University, Shanghai, 200444, China
|
50
|
New prognostic markers revealed by RNA-Seq transcriptome analysis after MYC silencing in a metastatic gastric cancer cell line. Oncotarget 2019; 10:5768-5779. [PMID: 31645899 PMCID: PMC6791377 DOI: 10.18632/oncotarget.27208] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 05/13/2019] [Accepted: 08/27/2019] [Indexed: 02/06/2023] Open
Abstract
MYC overexpression is considered a driver event in gastric cancer (GC) and is frequently correlated with poor prognosis and metastasis. In this study, we evaluated the prognostic value of genes upregulated by MYC in patients with GC. Metastatic GC cells (AGP01), characterized by MYC amplification, were transfected with siRNAs targeting MYC. RNA-seq was performed in silenced and non-silenced AGP01 cells. Among the differentially expressed genes, CIAPIN1, MTA2, and UXT were validated using qRT-PCR, western blot, and immunohistochemistry in gastric tissues of 213 patients with GC, and their expression was correlated with clinicopathological and survival data. High mRNA and protein levels of CIAPIN1, MTA2, and UXT were strongly associated with advanced GC stages (P < 0.0001). However, only CIAPIN1 and UXT gene expression predicted distant metastases in patients with early-stage GC (P < 0.0001), with high sensitivity (> 92%) and specificity (> 90%). The overall survival rate of patients with overexpressed CIAPIN1 or UXT was significantly lower (P < 0.0001). In conclusion, CIAPIN1 and UXT may serve as potential molecular markers for GC prognosis.
|