1
Agrawal S, Buyan A, Severin J, Koido M, Alam T, Abugessaisa I, Chang HY, Dostie J, Itoh M, Kere J, Kondo N, Li Y, Makeev VJ, Mendez M, Okazaki Y, Ramilowski JA, Sigorskikh AI, Strug LJ, Yagi K, Yasuzawa K, Yip CW, Hon CC, Hoffman MM, Terao C, Kulakovskiy IV, Kasukawa T, Shin JW, Carninci P, de Hoon MJL. Annotation of nuclear lncRNAs based on chromatin interactions. PLoS One 2024; 19:e0295971. [PMID: 38709794 PMCID: PMC11073715 DOI: 10.1371/journal.pone.0295971] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/19/2023] [Accepted: 12/02/2023] [Indexed: 05/08/2024] Open
Abstract
The human genome is pervasively transcribed and produces a wide variety of long non-coding RNAs (lncRNAs), constituting the majority of transcripts across human cell types. Some specific nuclear lncRNAs have been shown to be important regulatory components acting locally. As RNA-chromatin interaction and Hi-C chromatin conformation data showed that chromatin interactions of nuclear lncRNAs are determined by the local chromatin 3D conformation, we used Hi-C data to identify potential target genes of lncRNAs. RNA-protein interaction data suggested that nuclear lncRNAs act as scaffolds to recruit regulatory proteins to target promoters and enhancers. Nuclear lncRNAs may therefore play a role in directing regulatory factors to locations spatially close to the lncRNA gene. We provide the analysis results through an interactive visualization web portal at https://fantom.gsc.riken.jp/zenbu/reports/#F6_3D_lncRNA.
Affiliation(s)
- Saumya Agrawal
  - RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Andrey Buyan
  - Autosome.org, Russia
  - FANTOM Consortium, Dolgoprudny, Russia
- Jessica Severin
  - RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Masaru Koido
  - RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
  - Institute of Medical Science, The University of Tokyo, Tokyo, Japan
- Tanvir Alam
  - College of Science and Engineering, Hamad Bin Khalifa University, Doha, Qatar
- Howard Y. Chang
  - Center for Personal Dynamic Regulome, Stanford University, Stanford, California, United States of America
- Josée Dostie
  - Department of Biochemistry, Rosalind and Morris Goodman Cancer Research Center, McGill University, Montréal, Québec, Canada
- Masayoshi Itoh
  - RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
  - RIKEN Preventive Medicine and Diagnosis Innovation Program, Wako, Japan
- Juha Kere
  - Department of Biosciences and Nutrition, Karolinska Institutet, Huddinge, Sweden
  - Stem Cells and Metabolism Research Program, University of Helsinki and Folkhälsan Research Center, Helsinki, Finland
- Naoto Kondo
  - RIKEN Center for Life Science Technologies, Yokohama, Japan
- Yunjing Li
  - Division of Biostatistics, Dalla Lana School of Public Health, University of Toronto, Toronto, Ontario, Canada
- Mickaël Mendez
  - Department of Computer Science, University of Toronto, Toronto, Ontario, Canada
- Yasushi Okazaki
  - RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Jordan A. Ramilowski
  - RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
  - Advanced Medical Research Center, Yokohama City University, Yokohama, Japan
- Lisa J. Strug
  - Division of Biostatistics, Dalla Lana School of Public Health, University of Toronto, Toronto, Ontario, Canada
  - Department of Computer Science, University of Toronto, Toronto, Ontario, Canada
  - Department of Statistical Sciences, University of Toronto, Ontario, Canada
  - The Centre for Applied Genomics and Program in Genetics and Genome Biology, The Hospital for Sick Children, Toronto, Ontario, Canada
- Ken Yagi
  - RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Kayoko Yasuzawa
  - RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Chi Wai Yip
  - RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Chung Chau Hon
  - RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Michael M. Hoffman
  - Department of Computer Science, University of Toronto, Toronto, Ontario, Canada
  - Princess Margaret Cancer Centre, Toronto, Ontario, Canada
  - Department of Medical Biophysics, University of Toronto, Toronto, Ontario, Canada
  - Vector Institute, Toronto, Ontario, Canada
- Chikashi Terao
  - RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Takeya Kasukawa
  - RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Jay W. Shin
  - RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
  - Genome Institute of Singapore (GIS), Agency for Science, Technology and Research (A*STAR), Singapore, Republic of Singapore
- Piero Carninci
  - RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
  - Human Technopole, Milan, Italy
2
Jain S, Bakolitsa C, Brenner SE, Radivojac P, Moult J, Repo S, Hoskins RA, Andreoletti G, Barsky D, Chellapan A, Chu H, Dabbiru N, Kollipara NK, Ly M, Neumann AJ, Pal LR, Odell E, Pandey G, Peters-Petrulewicz RC, Srinivasan R, Yee SF, Yeleswarapu SJ, Zuhl M, Adebali O, Patra A, Beer MA, Hosur R, Peng J, Bernard BM, Berry M, Dong S, Boyle AP, Adhikari A, Chen J, Hu Z, Wang R, Wang Y, Miller M, Wang Y, Bromberg Y, Turina P, Capriotti E, Han JJ, Ozturk K, Carter H, Babbi G, Bovo S, Di Lena P, Martelli PL, Savojardo C, Casadio R, Cline MS, De Baets G, Bonache S, Díez O, Gutiérrez-Enríquez S, Fernández A, Montalban G, Ootes L, Özkan S, Padilla N, Riera C, De la Cruz X, Diekhans M, Huwe PJ, Wei Q, Xu Q, Dunbrack RL, Gotea V, Elnitski L, Margolin G, Fariselli P, Kulakovskiy IV, Makeev VJ, Penzar DD, Vorontsov IE, Favorov AV, Forman JR, Hasenahuer M, Fornasari MS, Parisi G, Avsec Z, Çelik MH, Nguyen TYD, Gagneur J, Shi FY, Edwards MD, Guo Y, Tian K, Zeng H, Gifford DK, Göke J, Zaucha J, Gough J, Ritchie GRS, Frankish A, Mudge JM, Harrow J, Young EL, Yu Y, Huff CD, Murakami K, Nagai Y, Imanishi T, Mungall CJ, Jacobsen JOB, Kim D, Jeong CS, Jones DT, Li MJ, Guthrie VB, Bhattacharya R, Chen YC, Douville C, Fan J, Kim D, Masica D, Niknafs N, Sengupta S, Tokheim C, Turner TN, Yeo HTG, Karchin R, Shin S, Welch R, Keles S, Li Y, Kellis M, Corbi-Verge C, Strokach AV, Kim PM, Klein TE, Mohan R, Sinnott-Armstrong NA, Wainberg M, Kundaje A, Gonzaludo N, Mak ACY, Chhibber A, Lam HYK, Dahary D, Fishilevich S, Lancet D, Lee I, Bachman B, Katsonis P, Lua RC, Wilson SJ, Lichtarge O, Bhat RR, Sundaram L, Viswanath V, Bellazzi R, Nicora G, Rizzo E, Limongelli I, Mezlini AM, Chang R, Kim S, Lai C, O’Connor R, Topper S, van den Akker J, Zhou AY, Zimmer AD, Mishne G, Bergquist TR, Breese MR, Guerrero RF, Jiang Y, Kiga N, Li B, Mort M, Pagel KA, Pejaver V, Stamboulian MH, Thusberg J, Mooney SD, Teerakulkittipong N, Cao C, Kundu K, Yin Y, Yu CH, Kleyman M, Lin CF, Stackpole M, Mount SM, Eraslan 
G, Mueller NS, Naito T, Rao AR, Azaria JR, Brodie A, Ofran Y, Garg A, Pal D, Hawkins-Hooker A, Kenlay H, Reid J, Mucaki EJ, Rogan PK, Schwarz JM, Searls DB, Lee GR, Seok C, Krämer A, Shah S, Huang CV, Kirsch JF, Shatsky M, Cao Y, Chen H, Karimi M, Moronfoye O, Sun Y, Shen Y, Shigeta R, Ford CT, Nodzak C, Uppal A, Shi X, Joseph T, Kotte S, Rana S, Rao A, Saipradeep VG, Sivadasan N, Sunderam U, Stanke M, Su A, Adzhubey I, Jordan DM, Sunyaev S, Rousseau F, Schymkowitz J, Van Durme J, Tavtigian SV, Carraro M, Giollo M, Tosatto SCE, Adato O, Carmel L, Cohen NE, Fenesh T, Holtzer T, Juven-Gershon T, Unger R, Niroula A, Olatubosun A, Väliaho J, Yang Y, Vihinen M, Wahl ME, Chang B, Chong KC, Hu I, Sun R, Wu WKK, Xia X, Zee BC, Wang MH, Wang M, Wu C, Lu Y, Chen K, Yang Y, Yates CM, Kreimer A, Yan Z, Yosef N, Zhao H, Wei Z, Yao Z, Zhou F, Folkman L, Zhou Y, Daneshjou R, Altman RB, Inoue F, Ahituv N, Arkin AP, Lovisa F, Bonvini P, Bowdin S, Gianni S, Mantuano E, Minicozzi V, Novak L, Pasquo A, Pastore A, Petrosino M, Puglisi R, Toto A, Veneziano L, Chiaraluce R, Ball MP, Bobe JR, Church GM, Consalvi V, Cooper DN, Buckley BA, Sheridan MB, Cutting GR, Scaini MC, Cygan KJ, Fredericks AM, Glidden DT, Neil C, Rhine CL, Fairbrother WG, Alontaga AY, Fenton AW, Matreyek KA, Starita LM, Fowler DM, Löscher BS, Franke A, Adamson SI, Graveley BR, Gray JW, Malloy MJ, Kane JP, Kousi M, Katsanis N, Schubach M, Kircher M, Mak ACY, Tang PLF, Kwok PY, Lathrop RH, Clark WT, Yu GK, LeBowitz JH, Benedicenti F, Bettella E, Bigoni S, Cesca F, Mammi I, Marino-Buslje C, Milani D, Peron A, Polli R, Sartori S, Stanzial F, Toldo I, Turolla L, Aspromonte MC, Bellini M, Leonardi E, Liu X, Marshall C, McCombie WR, Elefanti L, Menin C, Meyn MS, Murgia A, Nadeau KCY, Neuhausen SL, Nussbaum RL, Pirooznia M, Potash JB, Dimster-Denk DF, Rine JD, Sanford JR, Snyder M, Cote AG, Sun S, Verby MW, Weile J, Roth FP, Tewhey R, Sabeti PC, Campagna J, Refaat MM, Wojciak J, Grubb S, Schmitt N, Shendure J, Spurdle AB, 
Stavropoulos DJ, Walton NA, Zandi PP, Ziv E, Burke W, Chen F, Carr LR, Martinez S, Paik J, Harris-Wai J, Yarborough M, Fullerton SM, Koenig BA, McInnes G, Shigaki D, Chandonia JM, Furutsuki M, Kasak L, Yu C, Chen R, Friedberg I, Getz GA, Cong Q, Kinch LN, Zhang J, Grishin NV, Voskanian A, Kann MG, Tran E, Ioannidis NM, Hunter JM, Udani R, Cai B, Morgan AA, Sokolov A, Stuart JM, Minervini G, Monzon AM, Batzoglou S, Butte AJ, Greenblatt MS, Hart RK, Hernandez R, Hubbard TJP, Kahn S, O’Donnell-Luria A, Ng PC, Shon J, Veltman J, Zook JM. CAGI, the Critical Assessment of Genome Interpretation, establishes progress and prospects for computational genetic variant interpretation methods. Genome Biol 2024; 25:53. [PMID: 38389099 PMCID: PMC10882881 DOI: 10.1186/s13059-023-03113-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/21/2023] [Accepted: 11/17/2023] [Indexed: 02/24/2024] Open
Abstract
BACKGROUND: The Critical Assessment of Genome Interpretation (CAGI) aims to advance the state-of-the-art for computational prediction of genetic variant impact, particularly where relevant to disease. The five complete editions of the CAGI community experiment comprised 50 challenges, in which participants made blind predictions of phenotypes from genetic data, and these were evaluated by independent assessors. RESULTS: Performance was particularly strong for clinical pathogenic variants, including some difficult-to-diagnose cases, and extends to interpretation of cancer-related variants. Missense variant interpretation methods were able to estimate biochemical effects with increasing accuracy. Assessment of methods for regulatory variants and complex trait disease risk was less definitive and indicates performance potentially suitable for auxiliary use in the clinic. CONCLUSIONS: Results show that while current methods are imperfect, they have major utility for research and clinical applications. Emerging methods and increasingly large, robust datasets for training and assessment promise further progress ahead.
3
Subkhankulova T, Camargo Sosa K, Uroshlev LA, Nikaido M, Shriever N, Kasianov AS, Yang X, Rodrigues FSLM, Carney TJ, Bavister G, Schwetlick H, Dawes JHP, Rocco A, Makeev VJ, Kelsh RN. Zebrafish pigment cells develop directly from persistent highly multipotent progenitors. Nat Commun 2023; 14:1258. [PMID: 36878908 PMCID: PMC9988989 DOI: 10.1038/s41467-023-36876-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/23/2021] [Accepted: 02/17/2023] [Indexed: 03/08/2023] Open
Abstract
Neural crest cells are highly multipotent stem cells, but it remains unclear how their fate restriction to specific fates occurs. The direct fate restriction model hypothesises that migrating cells maintain full multipotency, whilst progressive fate restriction envisages fully multipotent cells transitioning to partially-restricted intermediates before committing to individual fates. Using zebrafish pigment cell development as a model, we show applying NanoString hybridization single cell transcriptional profiling and RNAscope in situ hybridization that neural crest cells retain broad multipotency throughout migration and even in post-migratory cells in vivo, with no evidence for partially-restricted intermediates. We find that leukocyte tyrosine kinase early expression marks a multipotent stage, with signalling driving iridophore differentiation through repression of fate-specific transcription factors for other fates. We reconcile the direct and progressive fate restriction models by proposing that pigment cell development occurs directly, but dynamically, from a highly multipotent state, consistent with our recently-proposed Cyclical Fate Restriction model.
Affiliation(s)
- Karen Camargo Sosa
  - Department of Life Sciences, University of Bath, Claverton Down, Bath, BA2 7AY, UK
- Leonid A Uroshlev
  - Vavilov Institute of General Genetics, Russian Academy of Sciences, Ul. Gubkina 3, Moscow, 119991, Russia
- Masataka Nikaido
  - Department of Life Sciences, University of Bath, Claverton Down, Bath, BA2 7AY, UK
  - Graduate School of Science, University of Hyogo, Ako-gun, Hyogo Pref., 678-1297, Japan
- Noah Shriever
  - Department of Life Sciences, University of Bath, Claverton Down, Bath, BA2 7AY, UK
- Artem S Kasianov
  - Vavilov Institute of General Genetics, Russian Academy of Sciences, Ul. Gubkina 3, Moscow, 119991, Russia
  - Department of Medical and Biological Physics, Moscow Institute of Physics and Technology, 9 Institutskiy per., Dolgoprudny, Moscow Region, 141701, Russia
  - A.A. Kharkevich Institute for Information Transmission Problems (IITP), Russian Academy of Sciences, Bolshoy Karetny per. 19, build.1, Moscow, 127051, Russia
- Xueyan Yang
  - Department of Life Sciences, University of Bath, Claverton Down, Bath, BA2 7AY, UK
  - The MOE Key Laboratory of Contemporary Anthropology, School of Life Sciences, Fudan University, Shanghai, 200438, PR China
- Thomas J Carney
  - Department of Life Sciences, University of Bath, Claverton Down, Bath, BA2 7AY, UK
  - Lee Kong Chian School of Medicine, Experimental Medicine Building, Yunnan Garden Campus, Nanyang Technological University, 59 Nanyang Drive, Yunnan Garden, 636921, Singapore
- Gemma Bavister
  - Department of Life Sciences, University of Bath, Claverton Down, Bath, BA2 7AY, UK
- Hartmut Schwetlick
  - Department of Mathematical Sciences, University of Bath, Claverton Down, Bath, BA2 7AY, UK
- Jonathan H P Dawes
  - Department of Mathematical Sciences, University of Bath, Claverton Down, Bath, BA2 7AY, UK
- Andrea Rocco
  - Department of Microbial Sciences, FHMS, University of Surrey, GU2 7XH, Guildford, UK
  - Department of Physics, FEPS, University of Surrey, GU2 7XH, Guildford, UK
- Vsevolod J Makeev
  - Vavilov Institute of General Genetics, Russian Academy of Sciences, Ul. Gubkina 3, Moscow, 119991, Russia
  - Department of Medical and Biological Physics, Moscow Institute of Physics and Technology, 9 Institutskiy per., Dolgoprudny, Moscow Region, 141701, Russia
  - Laboratory 'Regulatory Genomics', Institute of Fundamental Medicine and Biology, Kazan Federal University, 18 Kremlyovskaya street, Kazan, 420008, Russia
- Robert N Kelsh
  - Department of Life Sciences, University of Bath, Claverton Down, Bath, BA2 7AY, UK
4
Abstract
The position weight matrix, also called the position-specific scoring matrix, is the commonly accepted model to quantify the specificity of transcription factor binding to DNA. Position weight matrices are used in thousands of projects and software tools in regulatory genomics, including computational prediction of the regulatory impact of single-nucleotide variants. Yet, recently Yan et al. reported that "the position weight matrices of most transcription factors lack sufficient predictive power" if applied to the analysis of regulatory variants studied with a newly developed experimental method, SNP-SELEX. Here, we re-analyze the rich experimental dataset obtained by Yan et al. and show that appropriately selected position weight matrices in fact can adequately quantify transcription factor binding to alternative alleles.
Affiliation(s)
- Alexandr Boytsov
  - Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow, 119991, Russian Federation
  - Moscow Institute of Physics and Technology, Dolgoprudny, 141700, Russian Federation
- Sergey Abramov
  - Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow, 119991, Russian Federation
  - Moscow Institute of Physics and Technology, Dolgoprudny, 141700, Russian Federation
- Vsevolod J. Makeev
  - Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow, 119991, Russian Federation
  - Moscow Institute of Physics and Technology, Dolgoprudny, 141700, Russian Federation
- Ivan V. Kulakovskiy
  - Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow, 119991, Russian Federation
  - Institute of Protein Research, Russian Academy of Sciences, Pushchino, 142290, Russian Federation
5
Kuiper M, Bonello J, Fernández-Breis JT, Bucher P, Futschik ME, Gaudet P, Kulakovskiy IV, Licata L, Logie C, Lovering RC, Makeev VJ, Orchard S, Panni S, Perfetto L, Sant D, Schulz S, Zerbino DR, Lægreid A. The Gene Regulation Knowledge Commons: The action area of GREEKC. Biochim Biophys Acta Gene Regul Mech 2021; 1865:194768. [PMID: 34757206 DOI: 10.1016/j.bbagrm.2021.194768] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/20/2021] [Revised: 10/18/2021] [Accepted: 10/20/2021] [Indexed: 02/08/2023]
Abstract
The COST Action Gene Regulation Ensemble Effort for the Knowledge Commons (GREEKC, CA15205, www.greekc.org) organized nine workshops in a four-year period, starting September 2016. The workshops brought together a wide range of experts from all over the world working on various parts of the knowledge cycle that is central to understanding gene regulatory mechanisms. The discussions between ontologists, curators, text miners, biologists, bioinformaticians, philosophers and computational scientists spawned a host of activities aimed to update and standardise existing knowledge management workflows, encourage new experimental approaches and thoroughly involve end-users in the process to design the Gene Regulation Knowledge Commons (GRKC). The GREEKC consortium describes its main achievements, contextualised in a state-of-the-art of current tools and resources that today represent the GRKC.
Affiliation(s)
- Martin Kuiper
  - Systems Biology Group, Department of Biology, Norwegian University of Science and Technology, Trondheim, Norway
- Joseph Bonello
  - Faculty of Information & Communication Technology, University of Malta, Msida, Malta
- Philipp Bucher
  - Swiss Institute of Bioinformatics, Quartier Sorge, Bâtiment Amphipôle, 1015 Lausanne, Switzerland
- Matthias E Futschik
  - Systems Biology and Bioinformatics Laboratory (SysBioLab), Centre of Marine Sciences (CCMAR), University of Algarve, 8005-139 Faro, Portugal
- Pascale Gaudet
  - SIB Swiss Institute of Bioinformatics, 1 Rue Michel-Servet, 1204 Geneva, Switzerland
- Ivan V Kulakovskiy
  - Institute of Protein Research, Russian Academy of Sciences, Institutskaya 4, 142290 Pushchino, Russia
- Luana Licata
  - Department of Biology, University of Rome Tor Vergata, Rome, Italy
- Colin Logie
  - Department of Molecular Biology, Faculty of Science, Radboud University, PO Box 9101, Nijmegen 6500HG, the Netherlands
- Ruth C Lovering
  - Functional Gene Annotation, Pre-clinical and Fundamental Science, Institute of Cardiovascular Science, University College London, 5 University Street, London WC1E 6JF, UK
- Vsevolod J Makeev
  - Vavilov Institute of General Genetics, Russian Academy of Sciences, Gubkina 3, 119991 Moscow, Russia
- Sandra Orchard
  - European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
- Simona Panni
  - Department DIBEST, University of Calabria, Rende, Italy
- Livia Perfetto
  - Fondazione Human Technopole, Department of Biology, Via Cristina Belgioioso, 171, 20157 Milan, Italy
- David Sant
  - Department of Biomedical Informatics, University of Utah, 421 Wakara Way #140, Salt Lake City, UT 84108, United States
- Stefan Schulz
  - Institute of Medical Informatics, Statistics and Documentation, Medical University of Graz, Auenbruggerpl. 2, Graz, Austria
- Daniel R Zerbino
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Astrid Lægreid
  - Department of Clinical and Molecular Medicine, Norwegian University of Science and Technology, 7491 Trondheim, Norway
6
Abramov S, Boytsov A, Bykova D, Penzar DD, Yevshin I, Kolmykov SK, Fridman MV, Favorov AV, Vorontsov IE, Baulin E, Kolpakov F, Makeev VJ, Kulakovskiy IV. Landscape of allele-specific transcription factor binding in the human genome. Nat Commun 2021; 12:2751. [PMID: 33980847 PMCID: PMC8115691 DOI: 10.1038/s41467-021-23007-0] [Citation(s) in RCA: 35] [Impact Index Per Article: 11.7] [Received: 10/15/2020] [Accepted: 04/12/2021] [Indexed: 12/28/2022] Open
Abstract
Sequence variants in gene regulatory regions alter gene expression and contribute to phenotypes of individual cells and the whole organism, including disease susceptibility and progression. Single-nucleotide variants in enhancers or promoters may affect gene transcription by altering transcription factor binding sites. Differential transcription factor binding in heterozygous genomic loci provides a natural source of information on such regulatory variants. We present a novel approach to call the allele-specific transcription factor binding events at single-nucleotide variants in ChIP-Seq data, taking into account the joint contribution of aneuploidy and local copy number variation, that is estimated directly from variant calls. We have conducted a meta-analysis of more than 7 thousand ChIP-Seq experiments and assembled the database of allele-specific binding events listing more than half a million entries at nearly 270 thousand single-nucleotide polymorphisms for several hundred human transcription factors and cell types. These polymorphisms are enriched for associations with phenotypes of medical relevance and often overlap eQTLs, making candidates for causality by linking variants with molecular mechanisms. Specifically, there is a special class of switching sites, where different transcription factors preferably bind alternative alleles, thus revealing allele-specific rewiring of molecular circuitry. Here the authors present a meta-analysis empowered by a new statistical method covering thousands of ChIP-Seq experiments resulting in the identification of more than 500 thousand allele-specific binding (ASB) events in the human genome.
Affiliation(s)
- Sergey Abramov
  - Institute of Protein Research, Russian Academy of Sciences, Pushchino, Russia
  - Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow, Russia
  - Moscow Institute of Physics and Technology, Dolgoprudny, Russia
- Alexandr Boytsov
  - Institute of Protein Research, Russian Academy of Sciences, Pushchino, Russia
  - Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow, Russia
  - Moscow Institute of Physics and Technology, Dolgoprudny, Russia
- Daria Bykova
  - Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Moscow, Russia
- Dmitry D Penzar
  - Institute of Protein Research, Russian Academy of Sciences, Pushchino, Russia
  - Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow, Russia
  - Moscow Institute of Physics and Technology, Dolgoprudny, Russia
  - Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Moscow, Russia
- Ivan Yevshin
  - Federal Research Center for Information and Computational Technologies, Novosibirsk, Russia
  - Sirius University of Science and Technology, Sochi, Russia
  - BIOSOFT.RU LLC, Novosibirsk, Russia
- Semyon K Kolmykov
  - Federal Research Center for Information and Computational Technologies, Novosibirsk, Russia
  - Sirius University of Science and Technology, Sochi, Russia
  - BIOSOFT.RU LLC, Novosibirsk, Russia
- Marina V Fridman
  - Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow, Russia
- Alexander V Favorov
  - Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow, Russia
  - Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Ilya E Vorontsov
  - Institute of Protein Research, Russian Academy of Sciences, Pushchino, Russia
  - Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow, Russia
- Eugene Baulin
  - Moscow Institute of Physics and Technology, Dolgoprudny, Russia
  - Institute of Mathematical Problems of Biology RAS-The Branch of Keldysh Institute of Applied Mathematics of Russian Academy of Sciences, Pushchino, Russia
- Fedor Kolpakov
  - Federal Research Center for Information and Computational Technologies, Novosibirsk, Russia
  - Sirius University of Science and Technology, Sochi, Russia
  - BIOSOFT.RU LLC, Novosibirsk, Russia
- Vsevolod J Makeev
  - Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow, Russia
  - Moscow Institute of Physics and Technology, Dolgoprudny, Russia
  - State Research Institute of Genetics and Selection of Industrial Microorganisms of the National Research Center Kurchatov Institute, Moscow, Russia
  - Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Moscow, Russia
- Ivan V Kulakovskiy
  - Institute of Protein Research, Russian Academy of Sciences, Pushchino, Russia
  - Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow, Russia
  - Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Moscow, Russia
7
Kolmykov S, Yevshin I, Kulyashov M, Sharipov R, Kondrakhin Y, Makeev VJ, Kulakovskiy IV, Kel A, Kolpakov F. GTRD: an integrated view of transcription regulation. Nucleic Acids Res 2021; 49:D104-D111. [PMID: 33231677 PMCID: PMC7778956 DOI: 10.1093/nar/gkaa1057] [Citation(s) in RCA: 101] [Impact Index Per Article: 33.7] [Received: 09/15/2020] [Revised: 10/18/2020] [Accepted: 11/03/2020] [Indexed: 12/24/2022] Open
Abstract
The Gene Transcription Regulation Database (GTRD; http://gtrd.biouml.org/) contains uniformly annotated and processed NGS data related to gene transcription regulation: ChIP-seq, ChIP-exo, DNase-seq, MNase-seq, ATAC-seq and RNA-seq. With the latest release, the database has reached a new level of data integration. All cell types (cell lines and tissues) presented in the GTRD were arranged into a dictionary and linked with different ontologies (BRENDA, Cell Ontology, Uberon, Cellosaurus and Experimental Factor Ontology) and with related experiments in specialized databases on transcription regulation (FANTOM5, ENCODE and GTEx). The updated version of the GTRD provides an integrated view of transcription regulation through a dedicated web interface with advanced browsing and search capabilities, an integrated genome browser, and table reports by cell types, transcription factors, and genes of interest.
Affiliation(s)
- Semyon Kolmykov
  - BIOSOFT.RU, LLC, Novosibirsk 630090, Russian Federation
  - Federal Research Center for Information and Computational Technologies, Novosibirsk 630090, Russian Federation
  - Federal Research Center Institute of Cytology and Genetics SB RAS, Novosibirsk 630090, Russian Federation
- Ivan Yevshin
  - BIOSOFT.RU, LLC, Novosibirsk 630090, Russian Federation
  - Federal Research Center for Information and Computational Technologies, Novosibirsk 630090, Russian Federation
- Mikhail Kulyashov
  - BIOSOFT.RU, LLC, Novosibirsk 630090, Russian Federation
  - Federal Research Center for Information and Computational Technologies, Novosibirsk 630090, Russian Federation
  - Novosibirsk State University, Novosibirsk 630090, Russian Federation
- Ruslan Sharipov
  - BIOSOFT.RU, LLC, Novosibirsk 630090, Russian Federation
  - Federal Research Center for Information and Computational Technologies, Novosibirsk 630090, Russian Federation
  - Novosibirsk State University, Novosibirsk 630090, Russian Federation
- Yury Kondrakhin
  - BIOSOFT.RU, LLC, Novosibirsk 630090, Russian Federation
  - Federal Research Center for Information and Computational Technologies, Novosibirsk 630090, Russian Federation
- Vsevolod J Makeev
  - Vavilov Institute of General Genetics RAS, Moscow 119991, Russian Federation
  - Moscow Institute of Physics and Technology (State University), Dolgoprudny 141700, Russian Federation
  - NRC «Kurchatov Institute» - GOSNIIGENETIKA, Kurchatov Genomic Center, Moscow 123182, Russian Federation
  - Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Moscow 119991, Russian Federation
- Ivan V Kulakovskiy
  - Vavilov Institute of General Genetics RAS, Moscow 119991, Russian Federation
  - Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Moscow 119991, Russian Federation
  - Institute of Protein Research, Russian Academy of Sciences, Pushchino 142290, Russian Federation
- Alexander Kel
  - BIOSOFT.RU, LLC, Novosibirsk 630090, Russian Federation
  - geneXplain GmbH, 38302 Wolfenbüttel, Germany
  - Institute of Chemical Biology and Fundamental Medicine SB RAS, Novosibirsk 630090, Russian Federation
- Fedor Kolpakov
  - BIOSOFT.RU, LLC, Novosibirsk 630090, Russian Federation
  - Federal Research Center for Information and Computational Technologies, Novosibirsk 630090, Russian Federation
8
Sethi S, Vorontsov IE, Kulakovskiy IV, Greenaway S, Williams J, Makeev VJ, Brown SDM, Simon MM, Mallon AM. A holistic view of mouse enhancer architectures reveals analogous pleiotropic effects and correlation with human disease. BMC Genomics 2020; 21:754. [PMID: 33138777 PMCID: PMC7607678 DOI: 10.1186/s12864-020-07109-5] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/12/2019] [Accepted: 09/29/2020] [Indexed: 01/18/2023] Open
Abstract
BACKGROUND: Efforts to elucidate the function of enhancers in vivo are underway but their vast numbers alongside differing enhancer architectures make it difficult to determine their impact on gene activity. By systematically annotating multiple mouse tissues with super- and typical-enhancers, we have explored their relationship with gene function and phenotype. RESULTS: Though super-enhancers drive high total- and tissue-specific expression of their associated genes, we find that typical-enhancers also contribute heavily to the tissue-specific expression landscape on account of their large numbers in the genome. Unexpectedly, we demonstrate that both enhancer types are preferentially associated with relevant 'tissue-type' phenotypes and exhibit no difference in phenotype effect size or pleiotropy. Modelling regulatory data alongside molecular data, we built a predictive model to infer gene-phenotype associations and use this model to predict potentially novel disease-associated genes. CONCLUSION: Overall our findings reveal that differing enhancer architectures have a similar impact on mammalian phenotypes whilst harbouring differing cellular and expression effects. Together, our results systematically characterise enhancers with predicted phenotypic traits endorsing the role for both types of enhancers in human disease and disorders.
Affiliation(s)
- Siddharth Sethi
- Mammalian Genetics Unit, MRC Harwell Institute, Oxfordshire, OX11 0RD, UK
| | - Ilya E Vorontsov
- Vavilov Institute of General Genetics, Russian Academy of Sciences, Gubkina 3, Moscow, 119991, Russia
- Institute of Protein Research, Russian Academy of Sciences, Institutskaya 4, Pushchino, Moscow Region, 142290, Russia
| | - Ivan V Kulakovskiy
- Vavilov Institute of General Genetics, Russian Academy of Sciences, Gubkina 3, Moscow, 119991, Russia
- Institute of Protein Research, Russian Academy of Sciences, Institutskaya 4, Pushchino, Moscow Region, 142290, Russia
- Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Vavilova 32, Moscow, 119991, Russia
| | - Simon Greenaway
- Mammalian Genetics Unit, MRC Harwell Institute, Oxfordshire, OX11 0RD, UK
| | - John Williams
- Mammalian Genetics Unit, MRC Harwell Institute, Oxfordshire, OX11 0RD, UK
- Institute of Translational Medicine, University Hospitals Birmingham NHS Foundation Trust, Birmingham, B15 2TH, UK
- Institute of Cancer and Genomic Sciences, University of Birmingham, Birmingham, B15 2TT, UK
| | - Vsevolod J Makeev
- Vavilov Institute of General Genetics, Russian Academy of Sciences, Gubkina 3, Moscow, 119991, Russia
- Institute of Protein Research, Russian Academy of Sciences, Institutskaya 4, Pushchino, Moscow Region, 142290, Russia
- Moscow Institute of Physics and Technology, 9 Institutskiy per., Dolgoprudny, Moscow Region, 141700, Russia
| | - Steve D M Brown
- Mammalian Genetics Unit, MRC Harwell Institute, Oxfordshire, OX11 0RD, UK
| | - Michelle M Simon
- Mammalian Genetics Unit, MRC Harwell Institute, Oxfordshire, OX11 0RD, UK
| | - Ann-Marie Mallon
- Mammalian Genetics Unit, MRC Harwell Institute, Oxfordshire, OX11 0RD, UK
| |
|
9
|
Ramilowski JA, Yip CW, Agrawal S, Chang JC, Ciani Y, Kulakovskiy IV, Mendez M, Ooi JLC, Ouyang JF, Parkinson N, Petri A, Roos L, Severin J, Yasuzawa K, Abugessaisa I, Akalin A, Antonov IV, Arner E, Bonetti A, Bono H, Borsari B, Brombacher F, Cameron CJ, Cannistraci CV, Cardenas R, Cardon M, Chang H, Dostie J, Ducoli L, Favorov A, Fort A, Garrido D, Gil N, Gimenez J, Guler R, Handoko L, Harshbarger J, Hasegawa A, Hasegawa Y, Hashimoto K, Hayatsu N, Heutink P, Hirose T, Imada EL, Itoh M, Kaczkowski B, Kanhere A, Kawabata E, Kawaji H, Kawashima T, Kelly ST, Kojima M, Kondo N, Koseki H, Kouno T, Kratz A, Kurowska-Stolarska M, Kwon ATJ, Leek J, Lennartsson A, Lizio M, López-Redondo F, Luginbühl J, Maeda S, Makeev VJ, Marchionni L, Medvedeva YA, Minoda A, Müller F, Muñoz-Aguirre M, Murata M, Nishiyori H, Nitta KR, Noguchi S, Noro Y, Nurtdinov R, Okazaki Y, Orlando V, Paquette D, Parr CJ, Rackham OJ, Rizzu P, Martinez DFS, Sandelin A, Sanjana P, Semple CA, Shibayama Y, Sivaraman DM, Suzuki T, Szumowski SC, Tagami M, Taylor MS, Terao C, Thodberg M, Thongjuea S, Tripathi V, Ulitsky I, Verardo R, Vorontsov IE, Yamamoto C, Young RS, Baillie JK, Forrest AR, Guigó R, Hoffman MM, Hon CC, Kasukawa T, Kauppinen S, Kere J, Lenhard B, Schneider C, Suzuki H, Yagi K, de Hoon MJ, Shin JW, Carninci P. Corrigendum: Functional annotation of human long noncoding RNAs via molecular phenotyping. Genome Res 2020. [DOI: 10.1101/gr.270330.120] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/24/2022]
|
10
|
Ramilowski JA, Yip CW, Agrawal S, Chang JC, Ciani Y, Kulakovskiy IV, Mendez M, Ooi JLC, Ouyang JF, Parkinson N, Petri A, Roos L, Severin J, Yasuzawa K, Abugessaisa I, Akalin A, Antonov IV, Arner E, Bonetti A, Bono H, Borsari B, Brombacher F, Cameron CJF, Cannistraci CV, Cardenas R, Cardon M, Chang H, Dostie J, Ducoli L, Favorov A, Fort A, Garrido D, Gil N, Gimenez J, Guler R, Handoko L, Harshbarger J, Hasegawa A, Hasegawa Y, Hashimoto K, Hayatsu N, Heutink P, Hirose T, Imada EL, Itoh M, Kaczkowski B, Kanhere A, Kawabata E, Kawaji H, Kawashima T, Kelly ST, Kojima M, Kondo N, Koseki H, Kouno T, Kratz A, Kurowska-Stolarska M, Kwon ATJ, Leek J, Lennartsson A, Lizio M, López-Redondo F, Luginbühl J, Maeda S, Makeev VJ, Marchionni L, Medvedeva YA, Minoda A, Müller F, Muñoz-Aguirre M, Murata M, Nishiyori H, Nitta KR, Noguchi S, Noro Y, Nurtdinov R, Okazaki Y, Orlando V, Paquette D, Parr CJC, Rackham OJL, Rizzu P, Sánchez Martinez DF, Sandelin A, Sanjana P, Semple CAM, Shibayama Y, Sivaraman DM, Suzuki T, Szumowski SC, Tagami M, Taylor MS, Terao C, Thodberg M, Thongjuea S, Tripathi V, Ulitsky I, Verardo R, Vorontsov IE, Yamamoto C, Young RS, Baillie JK, Forrest ARR, Guigó R, Hoffman MM, Hon CC, Kasukawa T, Kauppinen S, Kere J, Lenhard B, Schneider C, Suzuki H, Yagi K, de Hoon MJL, Shin JW, Carninci P. Functional annotation of human long noncoding RNAs via molecular phenotyping. Genome Res 2020; 30:1060-1072. [PMID: 32718982 PMCID: PMC7397864 DOI: 10.1101/gr.254219.119] [Citation(s) in RCA: 71] [Impact Index Per Article: 17.8] [Received: 07/12/2019] [Accepted: 06/24/2020] [Indexed: 12/12/2022]
Abstract
Long noncoding RNAs (lncRNAs) constitute the majority of transcripts in the mammalian genomes, and yet, their functions remain largely unknown. As part of the FANTOM6 project, we systematically knocked down the expression of 285 lncRNAs in human dermal fibroblasts and quantified cellular growth, morphological changes, and transcriptomic responses using Capped Analysis of Gene Expression (CAGE). Antisense oligonucleotides targeting the same lncRNAs exhibited global concordance, and the molecular phenotype, measured by CAGE, recapitulated the observed cellular phenotypes while providing additional insights on the affected genes and pathways. Here, we disseminate the largest-to-date lncRNA knockdown data set with molecular phenotyping (over 1000 CAGE deep-sequencing libraries) for further exploration and highlight functional roles for ZNF213-AS1 and lnc-KHDC3L-2.
Affiliation(s)
- Jordan A Ramilowski
- RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa 230-0045, Japan.,RIKEN Center for Life Science Technologies, Yokohama, Kanagawa 230-0045, Japan
| | - Chi Wai Yip
- RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa 230-0045, Japan.,RIKEN Center for Life Science Technologies, Yokohama, Kanagawa 230-0045, Japan
| | - Saumya Agrawal
- RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa 230-0045, Japan.,RIKEN Center for Life Science Technologies, Yokohama, Kanagawa 230-0045, Japan
| | - Jen-Chien Chang
- RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa 230-0045, Japan.,RIKEN Center for Life Science Technologies, Yokohama, Kanagawa 230-0045, Japan
| | - Yari Ciani
- Laboratorio Nazionale Consorzio Interuniversitario Biotecnologie (CIB), Trieste 34127, Italy
| | - Ivan V Kulakovskiy
- Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Moscow 119991, Russia.,Institute of Protein Research, Russian Academy of Sciences, Pushchino 142290, Russia
| | - Mickaël Mendez
- Department of Computer Science, University of Toronto, Toronto, Ontario M5S 1A1, Canada
| | | | - John F Ouyang
- Program in Cardiovascular and Metabolic Disorders, Duke-National University of Singapore Medical School, Singapore 169857, Singapore
| | - Nick Parkinson
- Roslin Institute, University of Edinburgh, Edinburgh EH25 9RG, United Kingdom
| | - Andreas Petri
- Center for RNA Medicine, Department of Clinical Medicine, Aalborg University, Copenhagen 9220, Denmark
| | - Leonie Roos
- Institute of Clinical Sciences, Faculty of Medicine, Imperial College London, London W12 0NN, United Kingdom.,Computational Regulatory Genomics, MRC London Institute of Medical Sciences, London W12 0NN, United Kingdom
| | - Jessica Severin
- RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa 230-0045, Japan.,RIKEN Center for Life Science Technologies, Yokohama, Kanagawa 230-0045, Japan
| | - Kayoko Yasuzawa
- RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa 230-0045, Japan.,RIKEN Center for Life Science Technologies, Yokohama, Kanagawa 230-0045, Japan
| | - Imad Abugessaisa
- RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa 230-0045, Japan.,RIKEN Center for Life Science Technologies, Yokohama, Kanagawa 230-0045, Japan
| | - Altuna Akalin
- Berlin Institute for Medical Systems Biology, Max Delbrück Center for Molecular Medicine in the Helmholtz Association, Berlin 13125, Germany
| | - Ivan V Antonov
- Institute of Bioengineering, Research Center of Biotechnology, Russian Academy of Sciences, Moscow 117312, Russia
| | - Erik Arner
- RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa 230-0045, Japan.,RIKEN Center for Life Science Technologies, Yokohama, Kanagawa 230-0045, Japan
| | - Alessandro Bonetti
- RIKEN Center for Life Science Technologies, Yokohama, Kanagawa 230-0045, Japan
| | - Hidemasa Bono
- Graduate School of Integrated Sciences for Life, Hiroshima University, Higashi-Hiroshima City 739-0046, Japan
| | - Beatrice Borsari
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Catalonia 08003, Spain
| | - Frank Brombacher
- International Centre for Genetic Engineering and Biotechnology (ICGEB), University of Cape Town, Cape Town 7925, South Africa.,Institute of Infectious Diseases and Molecular Medicine (IDM), Department of Pathology, Division of Immunology and South African Medical Research Council (SAMRC) Immunology of Infectious Diseases, Faculty of Health Sciences, University of Cape Town, Cape Town 7925, South Africa
| | - Christopher JF Cameron
- School of Computer Science, McGill University, Montréal, Québec H3G 1Y6, Canada.,Department of Biochemistry, Rosalind and Morris Goodman Cancer Research Center, McGill University, Montréal, Québec H3G 1Y6, Canada.,Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, Connecticut 06510, USA
| | - Carlo Vittorio Cannistraci
- Biomedical Cybernetics Group, Biotechnology Center (BIOTEC), Center for Molecular and Cellular Bioengineering (CMCB), Center for Systems Biology Dresden (CSBD), Cluster of Excellence Physics of Life (PoL), Department of Physics, Technische Universität Dresden, Dresden 01062, Germany.,Center for Complex Network Intelligence (CCNI) at the Tsinghua Laboratory of Brain and Intelligence (THBI), Department of Bioengineering, Tsinghua University, Beijing 100084, China
| | - Ryan Cardenas
- Institute of Cancer and Genomic Sciences, College of Medical and Dental Sciences, University of Birmingham, Birmingham B15 2TT, United Kingdom
| | - Melissa Cardon
- RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa 230-0045, Japan
| | - Howard Chang
- Center for Personal Dynamic Regulome, Stanford University, Stanford, California 94305, USA
| | - Josée Dostie
- Department of Biochemistry, Rosalind and Morris Goodman Cancer Research Center, McGill University, Montréal, Québec H3G 1Y6, Canada
| | - Luca Ducoli
- Institute of Pharmaceutical Sciences, Swiss Federal Institute of Technology, Zurich 8093, Switzerland
| | - Alexander Favorov
- Department of Computational Systems Biology, Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow 119991, Russia.,Department of Oncology, Johns Hopkins University, Baltimore, Maryland 21287, USA
| | - Alexandre Fort
- RIKEN Center for Life Science Technologies, Yokohama, Kanagawa 230-0045, Japan
| | - Diego Garrido
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Catalonia 08003, Spain
| | - Noa Gil
- Department of Biological Regulation, Weizmann Institute of Science, Rehovot 76100, Israel
| | - Juliette Gimenez
- Epigenetics and Genome Reprogramming Laboratory, IRCCS Fondazione Santa Lucia, Rome 00179, Italy
| | - Reto Guler
- International Centre for Genetic Engineering and Biotechnology (ICGEB), University of Cape Town, Cape Town 7925, South Africa.,Institute of Infectious Diseases and Molecular Medicine (IDM), Department of Pathology, Division of Immunology and South African Medical Research Council (SAMRC) Immunology of Infectious Diseases, Faculty of Health Sciences, University of Cape Town, Cape Town 7925, South Africa
| | - Lusy Handoko
- RIKEN Center for Life Science Technologies, Yokohama, Kanagawa 230-0045, Japan
| | - Jayson Harshbarger
- RIKEN Center for Life Science Technologies, Yokohama, Kanagawa 230-0045, Japan
| | - Akira Hasegawa
- RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa 230-0045, Japan.,RIKEN Center for Life Science Technologies, Yokohama, Kanagawa 230-0045, Japan
| | - Yuki Hasegawa
- RIKEN Center for Life Science Technologies, Yokohama, Kanagawa 230-0045, Japan
| | - Kosuke Hashimoto
- RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa 230-0045, Japan.,RIKEN Center for Life Science Technologies, Yokohama, Kanagawa 230-0045, Japan
| | - Norihito Hayatsu
- RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa 230-0045, Japan
| | - Peter Heutink
- Genome Biology of Neurodegenerative Diseases, German Center for Neurodegenerative Diseases (DZNE), Tübingen 72076, Germany
| | - Tetsuro Hirose
- Graduate School of Frontier Biosciences, Osaka University, Suita 565-0871, Japan
| | - Eddie L Imada
- Department of Oncology, Johns Hopkins University, Baltimore, Maryland 21287, USA
| | - Masayoshi Itoh
- RIKEN Center for Life Science Technologies, Yokohama, Kanagawa 230-0045, Japan.,RIKEN Preventive Medicine and Diagnosis Innovation Program (PMI), Saitama 351-0198, Japan
| | - Bogumil Kaczkowski
- RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa 230-0045, Japan.,RIKEN Center for Life Science Technologies, Yokohama, Kanagawa 230-0045, Japan
| | - Aditi Kanhere
- Institute of Cancer and Genomic Sciences, College of Medical and Dental Sciences, University of Birmingham, Birmingham B15 2TT, United Kingdom
| | - Emily Kawabata
- RIKEN Center for Life Science Technologies, Yokohama, Kanagawa 230-0045, Japan
| | - Hideya Kawaji
- RIKEN Preventive Medicine and Diagnosis Innovation Program (PMI), Saitama 351-0198, Japan
| | - Tsugumi Kawashima
- RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa 230-0045, Japan.,RIKEN Center for Life Science Technologies, Yokohama, Kanagawa 230-0045, Japan
| | - S Thomas Kelly
- RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa 230-0045, Japan
| | - Miki Kojima
- RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa 230-0045, Japan.,RIKEN Center for Life Science Technologies, Yokohama, Kanagawa 230-0045, Japan
| | - Naoto Kondo
- RIKEN Center for Life Science Technologies, Yokohama, Kanagawa 230-0045, Japan
| | - Haruhiko Koseki
- RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa 230-0045, Japan
| | - Tsukasa Kouno
- RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa 230-0045, Japan.,RIKEN Center for Life Science Technologies, Yokohama, Kanagawa 230-0045, Japan
| | - Anton Kratz
- RIKEN Center for Life Science Technologies, Yokohama, Kanagawa 230-0045, Japan
| | - Mariola Kurowska-Stolarska
- Institute of Infection, Immunity, and Inflammation, University of Glasgow, Glasgow, Scotland G12 8QQ, United Kingdom
| | - Andrew Tae Jun Kwon
- RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa 230-0045, Japan.,RIKEN Center for Life Science Technologies, Yokohama, Kanagawa 230-0045, Japan
| | - Jeffrey Leek
- Department of Oncology, Johns Hopkins University, Baltimore, Maryland 21287, USA
| | - Andreas Lennartsson
- Department of Biosciences and Nutrition, Karolinska Institutet, Huddinge 14157, Sweden
| | - Marina Lizio
- RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa 230-0045, Japan.,RIKEN Center for Life Science Technologies, Yokohama, Kanagawa 230-0045, Japan
| | - Fernando López-Redondo
- RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa 230-0045, Japan.,RIKEN Center for Life Science Technologies, Yokohama, Kanagawa 230-0045, Japan
| | - Joachim Luginbühl
- RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa 230-0045, Japan.,RIKEN Center for Life Science Technologies, Yokohama, Kanagawa 230-0045, Japan
| | - Shiori Maeda
- RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa 230-0045, Japan
| | - Vsevolod J Makeev
- Department of Computational Systems Biology, Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow 119991, Russia.,Moscow Institute of Physics and Technology, Dolgoprudny 141701, Russia
| | - Luigi Marchionni
- Department of Oncology, Johns Hopkins University, Baltimore, Maryland 21287, USA
| | - Yulia A Medvedeva
- Institute of Bioengineering, Research Center of Biotechnology, Russian Academy of Sciences, Moscow 117312, Russia.,Moscow Institute of Physics and Technology, Dolgoprudny 141701, Russia
| | - Aki Minoda
- RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa 230-0045, Japan.,RIKEN Center for Life Science Technologies, Yokohama, Kanagawa 230-0045, Japan
| | - Ferenc Müller
- Institute of Cancer and Genomic Sciences, College of Medical and Dental Sciences, University of Birmingham, Birmingham B15 2TT, United Kingdom
| | - Manuel Muñoz-Aguirre
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Catalonia 08003, Spain
| | - Mitsuyoshi Murata
- RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa 230-0045, Japan.,RIKEN Center for Life Science Technologies, Yokohama, Kanagawa 230-0045, Japan
| | - Hiromi Nishiyori
- RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa 230-0045, Japan.,RIKEN Center for Life Science Technologies, Yokohama, Kanagawa 230-0045, Japan
| | - Kazuhiro R Nitta
- RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa 230-0045, Japan.,RIKEN Center for Life Science Technologies, Yokohama, Kanagawa 230-0045, Japan
| | - Shuhei Noguchi
- RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa 230-0045, Japan.,RIKEN Center for Life Science Technologies, Yokohama, Kanagawa 230-0045, Japan
| | - Yukihiko Noro
- RIKEN Center for Life Science Technologies, Yokohama, Kanagawa 230-0045, Japan
| | - Ramil Nurtdinov
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Catalonia 08003, Spain
| | - Yasushi Okazaki
- RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa 230-0045, Japan.,RIKEN Center for Life Science Technologies, Yokohama, Kanagawa 230-0045, Japan
| | - Valerio Orlando
- Biological and Environmental Sciences and Engineering Division, King Abdullah University of Science and Technology, Thuwal 23955-6900, Kingdom of Saudi Arabia
| | - Denis Paquette
- Department of Biochemistry, Rosalind and Morris Goodman Cancer Research Center, McGill University, Montréal, Québec H3G 1Y6, Canada
| | - Callum J C Parr
- RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa 230-0045, Japan
| | - Owen J L Rackham
- Program in Cardiovascular and Metabolic Disorders, Duke-National University of Singapore Medical School, Singapore 169857, Singapore
| | - Patrizia Rizzu
- Genome Biology of Neurodegenerative Diseases, German Center for Neurodegenerative Diseases (DZNE), Tübingen 72076, Germany
| | | | - Albin Sandelin
- Department of Biology and BRIC, University of Copenhagen, Denmark, Copenhagen N DK2200, Denmark
| | - Pillay Sanjana
- Institute of Cancer and Genomic Sciences, College of Medical and Dental Sciences, University of Birmingham, Birmingham B15 2TT, United Kingdom
| | - Colin A M Semple
- MRC Human Genetics Unit, University of Edinburgh, Edinburgh EH4 2XU, United Kingdom
| | - Youtaro Shibayama
- RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa 230-0045, Japan.,RIKEN Center for Life Science Technologies, Yokohama, Kanagawa 230-0045, Japan
| | - Divya M Sivaraman
- RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa 230-0045, Japan.,RIKEN Center for Life Science Technologies, Yokohama, Kanagawa 230-0045, Japan
| | - Takahiro Suzuki
- RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa 230-0045, Japan.,RIKEN Center for Life Science Technologies, Yokohama, Kanagawa 230-0045, Japan
| | | | - Michihira Tagami
- RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa 230-0045, Japan.,RIKEN Center for Life Science Technologies, Yokohama, Kanagawa 230-0045, Japan
| | - Martin S Taylor
- MRC Human Genetics Unit, University of Edinburgh, Edinburgh EH4 2XU, United Kingdom
| | - Chikashi Terao
- RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa 230-0045, Japan
| | - Malte Thodberg
- Department of Biology and BRIC, University of Copenhagen, Denmark, Copenhagen N DK2200, Denmark
| | - Supat Thongjuea
- RIKEN Center for Life Science Technologies, Yokohama, Kanagawa 230-0045, Japan
| | - Vidisha Tripathi
- National Centre for Cell Science, Pune, Maharashtra 411007, India
| | - Igor Ulitsky
- Department of Biological Regulation, Weizmann Institute of Science, Rehovot 76100, Israel
| | - Roberto Verardo
- Laboratorio Nazionale Consorzio Interuniversitario Biotecnologie (CIB), Trieste 34127, Italy
| | - Ilya E Vorontsov
- Department of Computational Systems Biology, Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow 119991, Russia
| | - Chinatsu Yamamoto
- RIKEN Center for Life Science Technologies, Yokohama, Kanagawa 230-0045, Japan
| | - Robert S Young
- Centre for Global Health Research, Usher Institute, University of Edinburgh, Edinburgh EH8 9AG, United Kingdom
| | - J Kenneth Baillie
- Roslin Institute, University of Edinburgh, Edinburgh EH25 9RG, United Kingdom
| | - Alistair R R Forrest
- RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa 230-0045, Japan.,RIKEN Center for Life Science Technologies, Yokohama, Kanagawa 230-0045, Japan.,Harry Perkins Institute of Medical Research, QEII Medical Centre and Centre for Medical Research, The University of Western Australia, Nedlands, Perth, Western Australia 6009, Australia
| | - Roderic Guigó
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Catalonia 08003, Spain.,Universitat Pompeu Fabra (UPF), Barcelona, Catalonia 08002, Spain
| | | | - Chung Chau Hon
- RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa 230-0045, Japan.,RIKEN Center for Life Science Technologies, Yokohama, Kanagawa 230-0045, Japan
| | - Takeya Kasukawa
- RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa 230-0045, Japan.,RIKEN Center for Life Science Technologies, Yokohama, Kanagawa 230-0045, Japan
| | - Sakari Kauppinen
- Center for RNA Medicine, Department of Clinical Medicine, Aalborg University, Copenhagen 9220, Denmark
| | - Juha Kere
- Department of Biosciences and Nutrition, Karolinska Institutet, Huddinge 14157, Sweden.,Stem Cells and Metabolism Research Program, University of Helsinki and Folkhälsan Research Center, 00290 Helsinki, Finland
| | - Boris Lenhard
- Institute of Clinical Sciences, Faculty of Medicine, Imperial College London, London W12 0NN, United Kingdom.,Computational Regulatory Genomics, MRC London Institute of Medical Sciences, London W12 0NN, United Kingdom.,Sars International Centre for Marine Molecular Biology, University of Bergen, Bergen N-5008, Norway
| | - Claudio Schneider
- Laboratorio Nazionale Consorzio Interuniversitario Biotecnologie (CIB), Trieste 34127, Italy.,Department of Medicine and Consorzio Interuniversitario Biotecnologie p.zle Kolbe 1 University of Udine, Udine 33100, Italy
| | - Harukazu Suzuki
- RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa 230-0045, Japan.,RIKEN Center for Life Science Technologies, Yokohama, Kanagawa 230-0045, Japan
| | - Ken Yagi
- RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa 230-0045, Japan.,RIKEN Center for Life Science Technologies, Yokohama, Kanagawa 230-0045, Japan
| | - Michiel J L de Hoon
- RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa 230-0045, Japan.,RIKEN Center for Life Science Technologies, Yokohama, Kanagawa 230-0045, Japan
| | - Jay W Shin
- RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa 230-0045, Japan.,RIKEN Center for Life Science Technologies, Yokohama, Kanagawa 230-0045, Japan
| | - Piero Carninci
- RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa 230-0045, Japan.,RIKEN Center for Life Science Technologies, Yokohama, Kanagawa 230-0045, Japan
| |
|
11
|
Penzar DD, Zinkevich AO, Vorontsov IE, Sitnik VV, Favorov AV, Makeev VJ, Kulakovskiy IV. What Do Neighbors Tell About You: The Local Context of Cis-Regulatory Modules Complicates Prediction of Regulatory Variants. Front Genet 2019; 10:1078. [PMID: 31737053 PMCID: PMC6834773 DOI: 10.3389/fgene.2019.01078] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 07/15/2019] [Accepted: 10/09/2019] [Indexed: 02/05/2023] Open
Abstract
Many problems of modern genetics and functional genomics require the assessment of functional effects of sequence variants, including gene expression changes. Machine learning is considered to be a promising approach for solving this task, but its practical applications remain a challenge due to the insufficient volume and diversity of training data. A promising source of valuable data is a saturation mutagenesis massively parallel reporter assay, which quantitatively measures changes in transcription activity caused by sequence variants. Here, we explore the computational predictions of the effects of individual single-nucleotide variants on gene transcription measured in the massively parallel reporter assays, based on the data from the recent "Regulation Saturation" Critical Assessment of Genome Interpretation challenge. We show that the estimated prediction quality strongly depends on the structure of the training and validation data. Particularly, training on the sequence segments located next to the validation data results in the "information leakage" caused by the local context. This information leakage allows reproducing the prediction quality of the best CAGI challenge submissions with a fairly simple machine learning approach, and even obtaining notably better-than-random predictions using irrelevant genomic regions. Validation scenarios preventing such information leakage dramatically reduce the measured prediction quality. The performance at independent regulatory regions entirely excluded from the training set appears to be much lower than needed for practical applications, and even the performance estimation will become reliable only in the future with richer data from multiple reporters. The source code and data are available at https://bitbucket.org/autosomeru_cagi2018/cagi2018_regsat and https://genomeinterpretation.org/content/expression-variants.
Affiliation(s)
- Dmitry D. Penzar
- Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow, Russia
- Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Moscow, Russia
- Department of Medical and Biological Physics, Moscow Institute of Physics and Technology (State University), Dolgoprudny, Russia
| | - Arsenii O. Zinkevich
- Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow, Russia
- Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Moscow, Russia
| | - Ilya E. Vorontsov
- Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow, Russia
| | - Vasily V. Sitnik
- Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow, Russia
| | - Alexander V. Favorov
- Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow, Russia
- Department of Oncology, Sidney Kimmel Comprehensive Cancer Center, The Johns Hopkins University School of Medicine, Baltimore, MD, United States
| | - Vsevolod J. Makeev
- Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow, Russia
- Department of Medical and Biological Physics, Moscow Institute of Physics and Technology (State University), Dolgoprudny, Russia
- Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Moscow, Russia
| | - Ivan V. Kulakovskiy
- Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow, Russia
- Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Moscow, Russia
- Institute of Mathematical Problems of Biology RAS - the Branch of Keldysh Institute of Applied Mathematics of Russian Academy of Sciences, Pushchino, Russia
| |
|
12
|
Orekhov AN, Oishi Y, Nikiforov NG, Zhelankin AV, Dubrovsky L, Sobenin IA, Kel A, Stelmashenko D, Makeev VJ, Foxx K, Jin X, Kruth HS, Bukrinsky M. Modified LDL Particles Activate Inflammatory Pathways in Monocyte-derived Macrophages: Transcriptome Analysis. Curr Pharm Des 2019; 24:3143-3151. [PMID: 30205792 PMCID: PMC6302360 DOI: 10.2174/1381612824666180911120039] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Received: 07/27/2018] [Revised: 08/28/2018] [Accepted: 09/04/2018] [Indexed: 12/27/2022]
Abstract
Background: A hallmark of atherosclerosis is its complex pathogenesis, which is dependent on altered cholesterol metabolism and inflammation. Both arms of pathogenesis involve myeloid cells. Monocytes migrating into the arterial walls interact with modified low-density lipoprotein (LDL) particles, accumulate cholesterol and convert into foam cells, which promote plaque formation and also contribute to inflammation by producing pro-inflammatory cytokines. A number of studies characterized transcriptomics of macrophages following interaction with modified LDL, and revealed alteration of the expression of genes responsible for inflammatory response and cholesterol metabolism. However, it is still unclear how these two processes are related to each other to contribute to atherosclerotic lesion formation. Methods: We attempted to identify the main master regulator genes in macrophages treated with atherogenic modified LDL using a bioinformatics approach. Results: We found that most of the identified genes were involved in inflammation, and none of them was implicated in cholesterol metabolism. Among the key identified genes were interleukin (IL)-7, IL-7 receptor, IL-15 and CXCL8. Conclusion: Our results indicate that activation of the inflammatory pathway is the primary response of the immune cells to modified LDL, while the lipid metabolism genes may be a secondary response triggered by inflammatory signalling.
Affiliation(s)
- Alexander N Orekhov
- Laboratory of Angiopathology, Institute of General Pathology and Pathophysiology, 125315 Moscow, Russian Federation.,Institute for Atherosclerosis Research, Skolkovo Innovative Center, 121609 Moscow, Russian Federation
| | - Yumiko Oishi
- Department of Cellular and Molecular Medicine, Medical Research Institute, Tokyo Medical and Dental University, Tokyo 1138510, Japan
| | - Nikita G Nikiforov
- Laboratory of Angiopathology, Institute of General Pathology and Pathophysiology, 125315 Moscow, Russian Federation.,Laboratory of Medical Genetics, Institute of Experimental Cardiology, National Medical Research Center of Cardiology, 121552 Moscow, Russian Federation
| | - Andrey V Zhelankin
- Laboratory of postgenomic research, Federal Research and Clinical Center of Physical-Chemical Medicine, 119435 Moscow, Russian Federation
| | - Larisa Dubrovsky
- GW School of Medicine and Health Sciences, George Washington University, Washington, DC 20037, United States
| | - Igor A Sobenin
- Laboratory of Medical Genetics, Institute of Experimental Cardiology, National Medical Research Center of Cardiology, 121552 Moscow, Russian Federation
| | - Alexander Kel
- Biosoft.ru Ltd, 630001 Novosibirsk, Russian Federation.,GeneXplain, GmbH, Wolfenbüttel 38304, Germany.,Institute of Chemical Biology and Fundamental Medicine, 630001 Novosibirsk, Russian Federation
| | - Daria Stelmashenko
- Biosoft.ru Ltd, 630001 Novosibirsk, Russian Federation.,GeneXplain, GmbH, Wolfenbüttel 38304, Germany.,Institute of Chemical Biology and Fundamental Medicine, 630001 Novosibirsk, Russian Federation
| | - Vsevolod J Makeev
- Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow 119991, Russian Federation
| | - Kathy Foxx
- Kalen Biomedical, LLC, Montgomery Village, MD 20886, United States
| | - Xueting Jin
- Experimental Atherosclerosis Section, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, MD 20892, United States
| | - Howard S Kruth
- Experimental Atherosclerosis Section, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, MD 20892, United States
| | - Michael Bukrinsky
- GW School of Medicine and Health Sciences, George Washington University, Washington, DC 20037, United States
| |
|
13
|
Sobenin IA, Zhelankin AV, Khasanova ZB, Sinyov VV, Medvedeva LV, Sagaidak MO, Makeev VJ, Kolmychkova KI, Smirnova AS, Sukhorukov VN, Postnov AY, Grechko AV, Orekhov AN. Heteroplasmic Variants of Mitochondrial DNA in Atherosclerotic Lesions of Human Aortic Intima. Biomolecules 2019; 9:biom9090455. [PMID: 31500189 PMCID: PMC6770808 DOI: 10.3390/biom9090455] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 07/12/2019] [Revised: 09/02/2019] [Accepted: 09/03/2019] [Indexed: 12/16/2022] Open
Abstract
Mitochondrial dysfunction and oxidative stress are likely involved in atherogenesis. Since mitochondrial genome variation can alter the functional activity of cells, it is necessary to assess whether atherosclerotic lesions carry mitochondrial DNA (mtDNA) heteroplasmic mutations known to be associated with different pathological processes and ageing. In this study, mtDNA heteroplasmy and copy number (mtCN) were evaluated in autopsy-derived samples of aortic intima differing by the type of atherosclerotic lesion. Next generation sequencing was used to detect mtDNA heteroplasmic variants, and mtCN was measured by qPCR. It was shown that mtDNA heteroplasmic mutations are characteristic of particular areas of intimal tissue: 55 heteroplasmic variants were found in 83 intimal samples, and the mean minor allele frequency was 0.09, with a mean heteroplasmy level of 12%. The mtCN variance measured in adjacent areas of intima was high, but atherosclerotic lesions and unaffected intima did not differ significantly in mtCN values. Based on the ratio of minor and major nucleotide mtDNA variants, we conclude that the number of heteroplasmic mtDNA variants increases in line with the extent of the atherosclerotic morphologic phenotype.
Affiliation(s)
- Igor A Sobenin
- Institute of Experimental Cardiology, National Medical Research Center of Cardiology, 121552 Moscow, Russia.
- Institute of General Pathology and Pathophysiology, 125315 Moscow, Russia.
- Research Institute of Threpsology and Healthy Longevity, Plekhanov Russian University of Economics, 115093 Moscow, Russia.
| | - Andrey V Zhelankin
- Federal Research and Clinical Center of Physical-Chemical Medicine, 119435 Moscow, Russia.
| | - Zukhra B Khasanova
- Institute of Experimental Cardiology, National Medical Research Center of Cardiology, 121552 Moscow, Russia.
| | - Vasily V Sinyov
- Institute of Experimental Cardiology, National Medical Research Center of Cardiology, 121552 Moscow, Russia.
- Institute of General Pathology and Pathophysiology, 125315 Moscow, Russia.
| | - Lyudmila V Medvedeva
- Federal Research Center of Transplantology and Artificial Organs, 123182 Moscow, Russia.
| | - Maria O Sagaidak
- Vavilov Institute of General Genetics, 117971 Moscow, Russia.
- Moscow Institute of Physics and Technology, Dolgoprudny, 141701 Moscow Region, Russia.
| | - Vsevolod J Makeev
- Vavilov Institute of General Genetics, 117971 Moscow, Russia.
- Moscow Institute of Physics and Technology, Dolgoprudny, 141701 Moscow Region, Russia.
- Engelhardt Institute of Molecular Biology, 119991 Moscow, Russia.
| | - Kira I Kolmychkova
- Institute for Atherosclerosis Research, Skolkovo Innovation Center, 143026 Moscow, Russia.
| | - Anna S Smirnova
- Institute for Atherosclerosis Research, Skolkovo Innovation Center, 143026 Moscow, Russia.
| | - Vasily N Sukhorukov
- Institute of Experimental Cardiology, National Medical Research Center of Cardiology, 121552 Moscow, Russia.
- Research Institute of Human Morphology, 117418 Moscow, Russia.
| | - Anton Y Postnov
- Institute of Experimental Cardiology, National Medical Research Center of Cardiology, 121552 Moscow, Russia.
- Research Institute of Human Morphology, 117418 Moscow, Russia.
| | - Andrey V Grechko
- Federal Scientific Clinical Center for Resuscitation and Rehabilitation, 141534 Moscow Region, Russia.
| | - Alexander N Orekhov
- Institute for Atherosclerosis Research, Skolkovo Innovation Center, 143026 Moscow, Russia.
- Research Institute of Human Morphology, 117418 Moscow, Russia.
| |
|
14
|
Kulakovskiy IV, Vorontsov IE, Yevshin IS, Sharipov RN, Fedorova AD, Rumynskiy EI, Medvedeva YA, Magana-Mora A, Bajic VB, Papatsenko DA, Kolpakov FA, Makeev VJ. HOCOMOCO: towards a complete collection of transcription factor binding models for human and mouse via large-scale ChIP-Seq analysis. Nucleic Acids Res 2019; 46:D252-D259. [PMID: 29140464 PMCID: PMC5753240 DOI: 10.1093/nar/gkx1106] [Citation(s) in RCA: 446] [Impact Index Per Article: 89.2] [Received: 09/14/2017] [Accepted: 10/31/2017] [Indexed: 12/15/2022] Open
Abstract
We present a major update of the HOCOMOCO collection that consists of patterns describing DNA binding specificities for human and mouse transcription factors. In this release, we profited from a nearly doubled volume of published in vivo experiments on transcription factor (TF) binding to expand the repertoire of binding models, replace low-quality models previously based on in vitro data only and cover more than a hundred TFs with previously unknown binding specificities. This was achieved by systematic motif discovery from more than five thousand ChIP-Seq experiments uniformly processed within the BioUML framework with several ChIP-Seq peak calling tools and aggregated in the GTRD database. HOCOMOCO v11 contains binding models for 453 mouse and 680 human transcription factors and includes 1302 mononucleotide and 576 dinucleotide position weight matrices, which describe primary binding preferences of each transcription factor and reliable alternative binding specificities. An interactive interface and bulk downloads are available on the web: http://hocomoco.autosome.ru and http://www.cbrc.kaust.edu.sa/hocomoco11. In this release, we complement HOCOMOCO by MoLoTool (Motif Location Toolbox, http://molotool.autosome.ru) that applies HOCOMOCO models for visualization of binding sites in short DNA sequences.
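As an illustration of how a mononucleotide position weight matrix of the kind described above can be applied to a short DNA sequence, here is a minimal Python sketch. It assumes a log-odds matrix stored as one dictionary per motif position; the matrix values, the names `TOY_PWM`, `score_window` and `best_hit`, and the forward-strand-only scan are hypothetical simplifications, and none of this is taken from HOCOMOCO or MoLoTool.

```python
# Toy 4-position log-odds PWM. The numbers are invented for illustration
# and are NOT taken from HOCOMOCO.
TOY_PWM = [
    {"A": 1.2, "C": -0.8, "G": -0.5, "T": -1.0},
    {"A": -1.0, "C": 1.5, "G": -0.9, "T": -0.7},
    {"A": -0.6, "C": -1.1, "G": 1.4, "T": -0.4},
    {"A": -0.9, "C": -0.3, "G": -1.2, "T": 1.3},
]


def score_window(pwm, window):
    """Sum the per-position scores of one window (same length as the PWM)."""
    return sum(column[base] for column, base in zip(pwm, window))


def best_hit(pwm, sequence):
    """Return (position, score) of the best forward-strand window, or None.

    Assumes the sequence contains only A, C, G and T; ambiguous bases such
    as N would raise a KeyError in this sketch.
    """
    width = len(pwm)
    sequence = sequence.upper()
    hits = [
        (i, score_window(pwm, sequence[i:i + width]))
        for i in range(len(sequence) - width + 1)
    ]
    if not hits:
        return None  # sequence shorter than the motif
    return max(hits, key=lambda hit: hit[1])
```

A real scan would typically also score the reverse complement and compare scores against a motif-specific threshold; both are left out here to keep the sketch short.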
Affiliation(s)
- Ivan V Kulakovskiy
- Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, 119991, GSP-1, Vavilova 32, Moscow, Russia.,Vavilov Institute of General Genetics, Russian Academy of Sciences, 119991, GSP-1, Gubkina 3, Moscow, Russia.,Center for Data-Intensive Biomedicine and Biotechnology, Skolkovo Institute of Science and Technology, 143026 Moscow, Russia
| | - Ilya E Vorontsov
- Vavilov Institute of General Genetics, Russian Academy of Sciences, 119991, GSP-1, Gubkina 3, Moscow, Russia
| | - Ivan S Yevshin
- BIOSOFT.RU Ltd, 630058, Russkaya 41/1, Novosibirsk, Russia
| | - Ruslan N Sharipov
- BIOSOFT.RU Ltd, 630058, Russkaya 41/1, Novosibirsk, Russia.,Institute of Computational Technologies, Siberian Branch of the Russian Academy of Sciences, 630090, Akad. Rzhanova 6, Novosibirsk, Russia.,Novosibirsk State University, 630090, Pirogova 2, Novosibirsk, Russia
| | - Alla D Fedorova
- Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, 119234, Leninskiye Gory 1-73, Moscow, Russia
| | - Eugene I Rumynskiy
- Vavilov Institute of General Genetics, Russian Academy of Sciences, 119991, GSP-1, Gubkina 3, Moscow, Russia.,Moscow Institute of Physics and Technology (State University), 141700, 9 Institutskiy per, Dolgoprudny, Russia
| | - Yulia A Medvedeva
- Vavilov Institute of General Genetics, Russian Academy of Sciences, 119991, GSP-1, Gubkina 3, Moscow, Russia.,Moscow Institute of Physics and Technology (State University), 141700, 9 Institutskiy per, Dolgoprudny, Russia.,Institute of Bioengineering, Research Center of Biotechnology of the Russian Academy of Sciences, 119071, 2 Leninsky Ave. 33, Moscow, Russia
| | - Arturo Magana-Mora
- National Institute of Advanced Industrial Science and Technology (AIST), Com. Bio Big-Data Open Innovation Lab. (CBBD-OIL), AIST Tokyo Waterfront Main Bldg. #323, 2-3-26 Aomi, Tokyo 135-0064, Japan.,King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Thuwal 23955-6900, Saudi Arabia
| | - Vladimir B Bajic
- King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Thuwal 23955-6900, Saudi Arabia
| | - Dmitry A Papatsenko
- Center for Data-Intensive Biomedicine and Biotechnology, Skolkovo Institute of Science and Technology, 143026 Moscow, Russia
| | - Fedor A Kolpakov
- BIOSOFT.RU Ltd, 630058, Russkaya 41/1, Novosibirsk, Russia.,Institute of Computational Technologies, Siberian Branch of the Russian Academy of Sciences, 630090, Akad. Rzhanova 6, Novosibirsk, Russia
| | - Vsevolod J Makeev
- Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, 119991, GSP-1, Vavilova 32, Moscow, Russia.,Vavilov Institute of General Genetics, Russian Academy of Sciences, 119991, GSP-1, Gubkina 3, Moscow, Russia.,Moscow Institute of Physics and Technology (State University), 141700, 9 Institutskiy per, Dolgoprudny, Russia
| |
|
15
|
Klimina KM, Kasianov AS, Poluektova EU, Emelyanov KV, Voroshilova VN, Zakharevich NV, Kudryavtseva AV, Makeev VJ, Danilenko VN. Employing toxin-antitoxin genome markers for identification of Bifidobacterium and Lactobacillus strains in human metagenomes. PeerJ 2019; 7:e6554. [PMID: 30863681 PMCID: PMC6404652 DOI: 10.7717/peerj.6554] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 09/13/2018] [Accepted: 02/02/2019] [Indexed: 01/22/2023] Open
Abstract
Recent research has indicated that, in addition to a unique genotype, each individual may also have a unique microbiota composition. Differences in microbiota composition may emerge from both its species and strain constituents. It is important to know the precise composition, especially for the gut microbiota (GM), since it can contribute to health assessment, personalized treatment, and disease prevention for individuals and groups (cohorts). The existing methods for determining species and strain composition in microbiota are not always precise and are usually not easy to use. Probiotic bacteria of the genera Bifidobacterium and Lactobacillus make up an essential component of human GM. Previously we have shown that in certain Bifidobacterium and Lactobacillus species the RelBE and MazEF superfamilies of toxin-antitoxin (TA) systems may be used as functional biomarkers to differentiate these groups of bacteria at the species and strain levels. We have composed a database of TA genes of these superfamilies specific for all lactobacilli and bifidobacteria species with a complete genome sequence and confirmed that in all Lactobacillus and Bifidobacterium species TA gene composition is species and strain specific. To analyze the composition of species and strains of two bacterial genera, Bifidobacterium and Lactobacillus, in human GM we developed TAGMA (toxin antitoxin genes for metagenomes analyses) software based on polymorphism in TA genes. TAGMA was tested on gut metagenomic samples. The results of our analysis have shown that TAGMA can be used to characterize species and strains of Lactobacillus and Bifidobacterium in metagenomes.
Affiliation(s)
- Ksenia M Klimina
- Vavilov Institute of General Genetics Russian Academy of Sciences, Moscow, Russia.,Federal Research and Clinical Center of Physical-Chemical Medicine of Federal Medical Biological Agency, Moscow, Russia
| | - Artem S Kasianov
- Vavilov Institute of General Genetics Russian Academy of Sciences, Moscow, Russia.,Moscow Institute of Physics and Technology, Dolgoprudny, Russia
| | - Elena U Poluektova
- Vavilov Institute of General Genetics Russian Academy of Sciences, Moscow, Russia
| | | | | | | | - Anna V Kudryavtseva
- Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Moscow, Russia
| | - Vsevolod J Makeev
- Vavilov Institute of General Genetics Russian Academy of Sciences, Moscow, Russia.,Moscow Institute of Physics and Technology, Dolgoprudny, Russia.,Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Moscow, Russia
| | - Valery N Danilenko
- Vavilov Institute of General Genetics Russian Academy of Sciences, Moscow, Russia.,Moscow Institute of Physics and Technology, Dolgoprudny, Russia
| |
|
16
|
Odintsova TI, Slezina MP, Istomina EA, Korostyleva TV, Kasianov AS, Kovtun AS, Makeev VJ, Shcherbakova LA, Kudryavtsev AM. Defensin-like peptides in wheat analyzed by whole-transcriptome sequencing: a focus on structural diversity and role in induced resistance. PeerJ 2019; 7:e6125. [PMID: 30643692 PMCID: PMC6329339 DOI: 10.7717/peerj.6125] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 07/17/2018] [Accepted: 11/18/2018] [Indexed: 01/15/2023] Open
Abstract
Antimicrobial peptides (AMPs) are the main components of the plant innate immune system. Defensins represent the most important AMP family involved in defense and non-defense functions. In this work, global RNA sequencing and de novo transcriptome assembly were performed to explore the diversity of defensin-like (DEFL) genes in the wheat Triticum kiharae and to study their role in induced resistance (IR) mediated by the elicitor metabolites of a non-pathogenic strain FS-94 of Fusarium sambucinum. Using a combination of two pipelines for DEFL mining in transcriptome data sets, as many as 143 DEFL genes were identified in T. kiharae, the vast majority of which represent novel genes. According to the number of cysteine residues and the cysteine motif, wheat DEFLs were classified into ten groups. Classical defensins with a characteristic 8-Cys motif, assigned to group 1 DEFLs, represent the most abundant group, comprising 52 family members. DEFLs with a characteristic 4-Cys motif CX{3,5}CX{8,17}CX{4,6}C, named group 4 DEFLs and previously found only in legumes, were discovered in wheat. Within DEFL groups, subgroups of similar sequences that originated by duplication events were isolated. Variation among DEFLs within subgroups is due to amino acid substitutions and insertions/deletions of amino acid sequences. To identify IR-related DEFL genes, transcriptional changes in DEFL gene expression during elicitor-mediated IR were monitored. Transcriptional diversity of DEFL genes in wheat seedlings in response to the fungus Fusarium oxysporum, FS-94 elicitors, and the combination of both (elicitors + fungus) was demonstrated, with specific sets of up- and down-regulated DEFL genes. DEFL expression profiling allowed us to gain insight into the mode of action of the elicitors from F. sambucinum. We discovered that the elicitors up-regulated a set of 24 DEFL genes. After challenge inoculation with F. oxysporum, another set of 22 DEFLs showed enhanced expression in IR-displaying seedlings. These DEFLs, in concert with other defense molecules, are suggested to determine enhanced resistance of elicitor-pretreated wheat seedlings. In addition to providing a better understanding of the mode of action of the elicitors from FS-94 in controlling diseases, up-regulated IR-specific DEFL genes represent novel candidates for genetic transformation of plants and development of pathogen-resistant crops.
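The 4-Cys motif CX{3,5}CX{8,17}CX{4,6}C quoted in the abstract maps naturally onto a regular expression. The following Python sketch shows one possible translation; it assumes that X stands for any single residue written as an uppercase letter, and the names `GROUP4_MOTIF` and `find_group4_motif` as well as the example peptide are hypothetical, not part of the authors' pipelines.

```python
import re

# CX{3,5}CX{8,17}CX{4,6}C, reading X as "any single residue" (an assumption;
# a stricter reading could exclude cysteine by using [^C] instead of [A-Z]).
GROUP4_MOTIF = re.compile(r"C[A-Z]{3,5}C[A-Z]{8,17}C[A-Z]{4,6}C")


def find_group4_motif(peptide):
    """Return (start, end, matched_text) for the first motif hit, or None."""
    match = GROUP4_MOTIF.search(peptide.upper())
    if match is None:
        return None
    return match.start(), match.end(), match.group()


# Made-up peptide with cysteine spacings of 4, 10 and 5 residues.
example = "M" + "C" + "AAAA" + "C" + "G" * 10 + "C" + "SSSSS" + "C" + "K"
hit = find_group4_motif(example)
```

Because `search` stops at the first match, a peptide carrying several overlapping motif instances would need `finditer` or a lookahead-based pattern to enumerate them all.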
Affiliation(s)
- Tatyana I Odintsova
- Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow, Russia
| | - Marina P Slezina
- Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow, Russia
| | - Ekaterina A Istomina
- Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow, Russia
| | | | - Artem S Kasianov
- Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow, Russia
| | - Alexey S Kovtun
- Moscow Institute of Physics and Technology, Dolgoprudny, Moscow Region, Russia
| | - Vsevolod J Makeev
- Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow, Russia
| | - Larisa A Shcherbakova
- All-Russian Research Institute of Phytopathology, B. Vyazyomy, Moscow Region, Russia
| | | |
|
17
|
Vorontsov IE, Fedorova AD, Yevshin IS, Sharipov RN, Kolpakov FA, Makeev VJ, Kulakovskiy IV. Genome-wide map of human and mouse transcription factor binding sites aggregated from ChIP-Seq data. BMC Res Notes 2018; 11:756. [PMID: 30352610 PMCID: PMC6199713 DOI: 10.1186/s13104-018-3856-x] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Received: 09/21/2018] [Accepted: 10/16/2018] [Indexed: 11/25/2022] Open
Abstract
Objectives: Mammalian genomics studies, especially those focusing on transcriptional regulation, require information on genomic locations of regulatory regions, particularly, transcription factor (TF) binding sites. There are plenty of published ChIP-Seq data on in vivo binding of transcription factors in different cell types and conditions. However, handling of thousands of separate data sets is often impractical and it is desirable to have a single global map of genomic regions potentially bound by a particular TF in any of studied cell types and conditions. Data description: Here we report human and mouse cistromes, the maps of genomic regions that are routinely identified as TF binding sites, organized by TF. We provide cistromes for 349 mouse and 599 human TFs. Given a TF, its cistrome regions are supported by evidence from several ChIP-Seq experiments or several computational tools, and, as an optional filter, contain occurrences of sequence motifs recognized by the TF. Using the cistrome, we provide an annotation of TF binding sites in the vicinity of human and mouse transcription start sites. This information is useful for selecting potential gene targets of transcription factors and detecting co-regulated genes in differential gene expression data.
Affiliation(s)
- Ilya E Vorontsov
- Vavilov Institute of General Genetics, Russian Academy of Sciences, GSP-1, Gubkina 3, Moscow, Russia, 119991
| | - Alla D Fedorova
- Vavilov Institute of General Genetics, Russian Academy of Sciences, GSP-1, Gubkina 3, Moscow, Russia, 119991
| | - Ivan S Yevshin
- BIOSOFT.RU Ltd, Russkaya 41/1, Novosibirsk, Russia, 630058
| | - Ruslan N Sharipov
- BIOSOFT.RU Ltd, Russkaya 41/1, Novosibirsk, Russia, 630058.,Institute of Computational Technologies, Siberian Branch of the Russian Academy of Sciences, Akad. Rzhanova 6, Novosibirsk, Russia, 630090.,Novosibirsk State University, Pirogova 2, Novosibirsk, Russia, 630090
| | - Fedor A Kolpakov
- BIOSOFT.RU Ltd, Russkaya 41/1, Novosibirsk, Russia, 630058.,Institute of Computational Technologies, Siberian Branch of the Russian Academy of Sciences, Akad. Rzhanova 6, Novosibirsk, Russia, 630090
| | - Vsevolod J Makeev
- Vavilov Institute of General Genetics, Russian Academy of Sciences, GSP-1, Gubkina 3, Moscow, Russia, 119991.,Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, GSP-1, Vavilova 32, Moscow, Russia, 119991.,Moscow Institute of Physics and Technology (State University), 9 Institutskiy per, Dolgoprudny, Russia, 141700
| | - Ivan V Kulakovskiy
- Vavilov Institute of General Genetics, Russian Academy of Sciences, GSP-1, Gubkina 3, Moscow, Russia, 119991.,Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, GSP-1, Vavilova 32, Moscow, Russia, 119991.,Institute of Mathematical Problems of Biology RAS-the Branch of Keldysh Institute of Applied Mathematics of Russian Academy of Sciences, Vitkevicha 1, Pushchino, Russia, 142290.
| |
|
18
|
Orekhov AN, Pushkarsky T, Oishi Y, Nikiforov NG, Zhelankin AV, Dubrovsky L, Makeev VJ, Foxx K, Jin X, Kruth HS, Sobenin IA, Sukhorukov VN, Zakiev ER, Kontush A, Le Goff W, Bukrinsky M. HDL activates expression of genes stimulating cholesterol efflux in human monocyte-derived macrophages. Exp Mol Pathol 2018; 105:202-207. [PMID: 30118702 DOI: 10.1016/j.yexmp.2018.08.003] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 04/19/2018] [Revised: 08/09/2018] [Accepted: 08/13/2018] [Indexed: 12/24/2022]
Abstract
High density lipoproteins (HDL) are key components of the reverse cholesterol transport pathway. HDL removes excessive cholesterol from peripheral cells, including macrophages, providing protection from cholesterol accumulation and conversion into foam cells, which is a key event in the pathogenesis of atherosclerosis. The mechanism of cellular cholesterol efflux stimulation by HDL involves interaction with the ABCA1 lipid transporter and ensuing transfer of cholesterol to HDL particles. In this study, we looked for additional proteins contributing to HDL-dependent cholesterol efflux. Using RNAseq, we analyzed mRNAs induced by HDL in human monocyte-derived macrophages and identified three genes, fatty acid desaturase 1 (FADS1), insulin induced gene 1 (INSIG1), and the low-density lipoprotein receptor (LDLR), expression of which was significantly upregulated by HDL. We individually knocked down these genes in THP-1 cells using gene silencing by siRNA, and measured cellular cholesterol efflux to HDL. Knockdown of FADS1 did not significantly change cholesterol efflux (p = 0.70), but knockdown of INSIG1 and LDLR resulted in a highly significant reduction of the efflux to HDL (67% and 75% of control, respectively, p < 0.001). Importantly, the suppression of cholesterol efflux was independent of known effects of these genes on cellular cholesterol content, as cells were loaded with cholesterol using acetylated LDL. These results indicate that HDL particles stimulate expression of genes that enhance cellular cholesterol transfer to HDL.
Affiliation(s)
- Alexander N Orekhov
- Institute of General Pathology and Pathophysiology, Moscow, Russia; Institute for Atherosclerosis Research, Skolkovo Innovative Center, Moscow, Russia
| | - Tatiana Pushkarsky
- The George Washington University School of Medicine and Health Sciences, Washington, DC, USA
| | - Yumiko Oishi
- Department of Cellular and Molecular Medicine, Medical Research Institute, Tokyo Medical and Dental University, Tokyo, Japan
| | - Nikita G Nikiforov
- Institute of General Pathology and Pathophysiology, Moscow, Russia; Laboratory of Medical Genetics, Institute of Experimental Cardiology, National Medical Research Center of Cardiology, Moscow, Russia; Institute of Gene Biology, Russian Academy of Sciences, Moscow, Russia
| | - Andrey V Zhelankin
- Laboratory of postgenomic research, Federal Research and Clinical Center of Physical-Chemical Medicine, Moscow, Russia
| | - Larisa Dubrovsky
- The George Washington University School of Medicine and Health Sciences, Washington, DC, USA
| | - Vsevolod J Makeev
- Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow, Russia; Scientific Center "Kurchatov Institute", Research Institute for Genetics and Selection of Industrial Microorganisms, Moscow, Russia; Moscow Institute of Physics and Technology (State University), Dolgoprudny, Moscow, Region, Russia; Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Moscow, Russia
| | - Kathy Foxx
- Kalen Biomedical LLC, Montgomery Village, MD, USA
| | - Xueting Jin
- Experimental Atherosclerosis Section, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, MD, USA
| | - Howard S Kruth
- Experimental Atherosclerosis Section, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, MD, USA
| | - Igor A Sobenin
- Institute of General Pathology and Pathophysiology, Moscow, Russia; Laboratory of Medical Genetics, Institute of Experimental Cardiology, National Medical Research Center of Cardiology, Moscow, Russia
| | - Vasily N Sukhorukov
- Institute of General Pathology and Pathophysiology, Moscow, Russia; Sorbonne Université, Inserm, Institute of Cardiometabolism and Nutrition (ICAN), UMR_S1166, Hôpital de la Pitié, Paris, France
| | - Emile R Zakiev
- Institute of General Pathology and Pathophysiology, Moscow, Russia; Sorbonne Université, Inserm, Institute of Cardiometabolism and Nutrition (ICAN), UMR_S1166, Hôpital de la Pitié, Paris, France
| | - Anatol Kontush
- Sorbonne Université, Inserm, Institute of Cardiometabolism and Nutrition (ICAN), UMR_S1166, Hôpital de la Pitié, Paris, France
| | - Wilfried Le Goff
- Sorbonne Université, Inserm, Institute of Cardiometabolism and Nutrition (ICAN), UMR_S1166, Hôpital de la Pitié, Paris, France
| | - Michael Bukrinsky
- The George Washington University School of Medicine and Health Sciences, Washington, DC, USA.
| |
|
19
|
Shtratnikova VY, Belalov I, Kasianov AS, Schelkunov MI, Logacheva Maria DA, Novikov AD, Shatalov AA, Gerasimova TV, Yanenko AS, Makeev VJ. The complete genome of the oil emulsifying strain Thalassolituus oleivorans K-188 from the Barents Sea. Mar Genomics 2018; 37:18-20. [PMID: 33250120 DOI: 10.1016/j.margen.2017.08.005] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 07/31/2017] [Accepted: 08/17/2017] [Indexed: 11/16/2022]
Abstract
The gammaproteobacterium Thalassolituus oleivorans plays an important role in oil degradation in sea water by emulsifying crude oil and alkanes at low temperatures in the polar sea environment. Here we report the complete genome sequence of strain K-188 (VKPM B-9394), isolated in the Barents Sea, and compare it with other known Thalassolituus oleivorans strains. The Thalassolituus strains differ in the number of orthologs of genes for alkane degradation, transport proteins, sugar utilization, endonucleases, signaling proteins and transcriptional regulators, and in the presence of a CRISPR/Cas locus. In addition, only the genome of K-188 contains the 3-hydroxyalkanoate synthetase.
Affiliation(s)
- Victoria Yu Shtratnikova
- A.N. Belozersky Institute of Physico-Chemical Biology, M.V. Lomonosov Moscow State University, Leninskye gory, h. 1, b. 40, Moscow 119991, Russian Federation.
| | - Ilya Belalov
- Winogradsky Institute of Microbiology, Research Center of Biotechnology of the Russian Academy of Sciences, Leninsky Ave., h. 33, b. 2, Moscow 119071, Russian Federation.
| | - Artem S Kasianov
- Vavilov Institute of General Genetics, Gubkina str., h. 3, Moscow 119991, Russian Federation.
| | - Mikhail I Schelkunov
- A.N. Belozersky Institute of Physico-Chemical Biology, M.V. Lomonosov Moscow State University, Leninskye gory, h. 1, b. 40, Moscow 119991, Russian Federation; Institute for Information Transmission Problems, Russian Academy of Sciences, Bolshoy Karetny per., h. 19, b. 1, Moscow 127051, Russian Federation.
| | - D A Logacheva Maria
- A.N. Belozersky Institute of Physico-Chemical Biology, M.V. Lomonosov Moscow State University, Leninskye gory, h. 1, b. 40, Moscow 119991, Russian Federation.
| | - Andrey D Novikov
- State Institute for Genetics and Selection of Industrial Microorganisms, 1-st Dorozhniy pr., h. 1, Moscow 117545, Russian Federation.
| | - Alexey A Shatalov
- State Institute for Genetics and Selection of Industrial Microorganisms, 1-st Dorozhniy pr., h. 1, Moscow 117545, Russian Federation.
| | - Tatyana V Gerasimova
- State Institute for Genetics and Selection of Industrial Microorganisms, 1-st Dorozhniy pr., h. 1, Moscow 117545, Russian Federation
| | - Alexander S Yanenko
- State Institute for Genetics and Selection of Industrial Microorganisms, 1-st Dorozhniy pr., h. 1, Moscow 117545, Russian Federation.
| | - Vsevolod J Makeev
- Vavilov Institute of General Genetics, Gubkina str., h. 3, Moscow 119991, Russian Federation; State Institute for Genetics and Selection of Industrial Microorganisms, 1-st Dorozhniy pr., h. 1, Moscow 117545, Russian Federation; Moscow Institute of Physics and Technology, 9 Institutskiy per., Dolgoprudny, Moscow Region 141700, Russian Federation.
| |
|
20
|
Afanasyeva MA, Putlyaeva LV, Demin DE, Kulakovskiy IV, Vorontsov IE, Fridman MV, Makeev VJ, Kuprash DV, Schwartz AM. The single nucleotide variant rs12722489 determines differential estrogen receptor binding and enhancer properties of an IL2RA intronic region. PLoS One 2017; 12:e0172681. [PMID: 28234966 PMCID: PMC5325477 DOI: 10.1371/journal.pone.0172681] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 09/06/2016] [Accepted: 02/08/2017] [Indexed: 12/11/2022] Open
Abstract
We studied the functional effect on transcriptional regulation of the rs12722489 single nucleotide polymorphism, located in the first intron of the human IL2RA gene. This polymorphism is associated with multiple autoimmune conditions (rheumatoid arthritis, multiple sclerosis, Crohn's disease, and ulcerative colitis). In silico analysis suggested a significant difference in the affinity of the estrogen receptor (ER) binding site between the alternative allelic variants, with stronger predicted affinity for the risk (G) allele. Electrophoretic mobility shift assay showed that purified human ERα bound only the G variant of a 32-bp genomic sequence containing rs12722489. Chromatin immunoprecipitation demonstrated that endogenous human ERα interacted with the rs12722489 genomic region in vivo, and a DNA pull-down assay confirmed differential allelic binding of amplified 189-bp genomic fragments containing rs12722489 with endogenous human ERα. In a luciferase reporter assay, a kilobase-long genomic segment containing the G but not the A allele of rs12722489 demonstrated enhancer properties in the MT-2 cell line, an HTLV-1 transformed human cell line with a regulatory T cell phenotype.
Collapse
Affiliation(s)
- Marina A. Afanasyeva
- Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Moscow, Russia
| | - Lidia V. Putlyaeva
- Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Moscow, Russia
| | - Denis E. Demin
- Moscow Institute of Physics and Technology, Moscow, Russia
| | - Ivan V. Kulakovskiy
- Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Moscow, Russia
- Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow, Russia
- Skolkovo Institute of Science and Technology, Skolkovo Innovation Center, Moscow, Russia
| | - Ilya E. Vorontsov
- Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow, Russia
| | - Marina V. Fridman
- Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow, Russia
| | - Vsevolod J. Makeev
- Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Moscow, Russia
- Moscow Institute of Physics and Technology, Moscow, Russia
- Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow, Russia
| | - Dmitry V. Kuprash
- Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Moscow, Russia
- Moscow Institute of Physics and Technology, Moscow, Russia
- Faculty of Biology, Lomonosov Moscow State University, Moscow, Russia
| | - Anton M. Schwartz
- Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Moscow, Russia
| |
|
21
|
Nikiforov NG, Elizova NV, Bukrinsky M, Dubrovsky L, Makeev VJ, Wakabayashi Y, Liu P, Foxx KK, Kruth HS, Jin X, Zakiev ER, Orekhov AN. Use of Primary Macrophages for Searching Novel Immunocorrectors. Curr Pharm Des 2017; 23:915-920. [PMID: 28124601 DOI: 10.2174/1381612823666170125110128] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 08/12/2016] [Accepted: 01/11/2017] [Indexed: 11/22/2022]
Abstract
In this mini-review, the role of macrophage phenotypes in atherogenesis is considered. Recent studies on the distribution of M1 and M2 macrophages in different types of atherosclerotic lesions indicate that macrophages exhibit a high degree of phenotypic plasticity in response to various conditions in the microenvironment. The effect of the accumulation of cholesterol, a key event in atherogenesis, on the macrophage phenotype is also discussed. The article presents the results of transcriptome analysis of cholesterol-loaded macrophages, revealing the immune response genes whose expression changed the most. The interaction of macrophages with modified LDL leads to higher expression levels of the pro-inflammatory marker TNF-α and the anti-inflammatory marker CCL18. The phenotypic profile of macrophage activation could be a good target for testing novel anti-atherogenic immunocorrectors. A number of anti-atherogenic drugs were tested as potential immunocorrectors using a primary macrophage-based model.
Affiliation(s)
- Nikita G Nikiforov
- Laboratory of Angiopathology, Institute of General Pathology and Pathophysiology, 125315 Moscow, Russian Federation
| | - Natalia V Elizova
- Laboratory of Angiopathology, Institute of General Pathology and Pathophysiology, 125315 Moscow, Russian Federation
| | - Michael Bukrinsky
- GW School of Medicine and Health Sciences, George Washington University, 20037 Washington, DC, United States
| | - Larisa Dubrovsky
- GW School of Medicine and Health Sciences, George Washington University, 20037 Washington, DC, United States
| | - Vsevolod J Makeev
- Vavilov Institute of General Genetics, Russian Academy of Sciences, 119333 Moscow, Russian Federation
| | - Yoshiyuki Wakabayashi
- DNA Sequencing and Genomics Core, National Heart, Lung, and Blood Institute, National Institutes of Health, 20892 Bethesda, MD, United States
| | - Poching Liu
- DNA Sequencing and Genomics Core, National Heart, Lung, and Blood Institute, National Institutes of Health, 20892 Bethesda, MD, United States
| | - Kathy K Foxx
- Kalen Biomedical, LLC, 20886 Montgomery Village, MD, United States
| | - Howard S Kruth
- Experimental Atherosclerosis Section, Center for Molecular, National Heart, Lung, and Blood Institute, National Institutes of Health, 20892 Bethesda, MD, United States
| | - Xueting Jin
- Experimental Atherosclerosis Section, Center for Molecular, National Heart, Lung, and Blood Institute, National Institutes of Health, 20892 Bethesda, MD, United States
| | - Emile R Zakiev
- INSERM UMR_S 1166, Faculte de Medecine Pitie-Salpetriere, University of Pierre and Marie Curie - Paris 6, 75013 Paris, France
| | - Alexander N Orekhov
- Department of Biophysics, Biological Faculty, Moscow State University, Moscow 119991, Russian Federation
| |
|
22
|
Papatsenko D, Darr H, Kulakovskiy IV, Waghray A, Makeev VJ, MacArthur BD, Lemischka IR. Single-Cell Analyses of ESCs Reveal Alternative Pluripotent Cell States and Molecular Mechanisms that Control Self-Renewal. Stem Cell Reports 2016; 5:207-20. [PMID: 26267829 PMCID: PMC4618835 DOI: 10.1016/j.stemcr.2015.07.004] [Citation(s) in RCA: 28] [Impact Index Per Article: 3.5] [Received: 03/11/2015] [Revised: 07/14/2015] [Accepted: 07/14/2015] [Indexed: 12/22/2022] Open
Abstract
Analyses of gene expression in single mouse embryonic stem cells (mESCs) cultured in serum and LIF revealed the presence of two distinct cell subpopulations with individual gene expression signatures. Comparisons with published data revealed that cells in the first subpopulation are phenotypically similar to cells isolated from the inner cell mass (ICM). In contrast, cells in the second subpopulation appear to be more mature. Pluripotency Gene Regulatory Network (PGRN) reconstruction based on single-cell data and published data suggested antagonistic roles for Oct4 and Nanog in the maintenance of pluripotency states. Integrated analyses of published genomic binding (ChIP) data strongly supported this observation. Certain target genes alternatively regulated by OCT4 and NANOG, such as Sall4 and Zscan10, feed back into the top hierarchical regulator Oct4. Analyses of such incoherent feedforward loops with feedback (iFFL-FB) suggest a dynamic model for the maintenance of mESC pluripotency and self-renewal. Mouse embryonic stem cells grown on serum and LIF contain two subpopulations of cells. Oct4 and Nanog alternatively regulate a class of pluripotency genes. We demonstrate stabilization of Oct4 concentration and pluripotency via feedback control. The “state exchange” model explains self-renewal.
Affiliation(s)
- Dmitri Papatsenko
- Department of Regenerative and Developmental Biology, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY 10029, USA; Black Family Stem Cell Institute, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY 10029, USA.
| | - Henia Darr
- Department of Regenerative and Developmental Biology, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY 10029, USA; Black Family Stem Cell Institute, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY 10029, USA
| | - Ivan V Kulakovskiy
- Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Vavilova Strasse 32, Moscow 119991, Russia; Department of Computational Systems Biology, Vavilov Institute of General Genetics, Russian Academy of Sciences, Gubkina Strasse 3, Moscow 119991, Russia
| | - Avinash Waghray
- Department of Regenerative and Developmental Biology, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY 10029, USA; Black Family Stem Cell Institute, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY 10029, USA
| | - Vsevolod J Makeev
- Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Vavilova Strasse 32, Moscow 119991, Russia; Department of Computational Systems Biology, Vavilov Institute of General Genetics, Russian Academy of Sciences, Gubkina Strasse 3, Moscow 119991, Russia
| | - Ben D MacArthur
- Centre for Human Development, Stem Cells, and Regeneration, Institute of Developmental Sciences, University of Southampton, Southampton SO17 1BJ, UK
| | - Ihor R Lemischka
- Department of Regenerative and Developmental Biology, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY 10029, USA; Black Family Stem Cell Institute, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY 10029, USA; Department of Pharmacology and System Therapeutics, Icahn School of Medicine at Mount Sinai, Systems Biology Center New York, One Gustave L. Levy Place, New York, NY 10029, USA.
| |
|
23
|
Abstract
BACKGROUND: CpG dinucleotides are extensively underrepresented in mammalian genomes. It is widely accepted that genome-wide CpG depletion is predominantly caused by an elevated CpG > TpG mutation rate due to frequent cytosine methylation in the CpG context. Meanwhile, the CpG content in genomic regions called CpG islands (CGIs) is noticeably higher. This observation is usually explained by lower CpG > TpG substitution rates within CGIs due to reduced cytosine methylation levels. RESULTS: By combining genome-wide data on substitutions and methylation levels in several human cell types, we have shown that cytosine methylation in human sperm cells was strongly and consistently associated with increased CpG > TpG substitution rates. In contrast, this correlation was not observed for embryonic stem cells or fibroblasts. Surprisingly, the decreased sperm CpG methylation level was insufficient to explain the reduced CpG > TpG substitution rates in CGIs. CONCLUSIONS: While cytosine methylation in human sperm cells is strongly associated with increased CpG > TpG substitution rates, substitution rates are significantly reduced within CGIs even after sperm CpG methylation levels and local GC content are controlled for. Our findings are consistent with strong negative selection preserving methylated CpGs within CGIs, including intergenic ones.
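The CpG underrepresentation described in this abstract is commonly quantified with the observed/expected CpG ratio. The following is a minimal sketch of that standard ratio, not the authors' analysis pipeline; the function name `cpg_obs_exp` is an illustrative assumption.

```python
def cpg_obs_exp(seq: str) -> float:
    """Observed/expected CpG ratio: (#CpG * length) / (#C * #G).

    Values well below 1 indicate CpG depletion; CpG islands are
    typically closer to 1. Returns 0.0 when C or G is absent.
    """
    seq = seq.upper()
    c_count = seq.count("C")
    g_count = seq.count("G")
    # "CG" cannot overlap itself, so str.count gives the exact number.
    cpg_count = seq.count("CG")
    if c_count == 0 or g_count == 0:
        return 0.0
    return cpg_count * len(seq) / (c_count * g_count)


if __name__ == "__main__":
    for name, seq in [("CpG-rich", "CGCGCGCG"), ("CpG-free", "CCAATGGT")]:
        print(name, cpg_obs_exp(seq))
```

The ratio only describes dinucleotide content; relating it to substitution rates or methylation levels, as the study does, requires alignment and methylation data that this sketch does not model.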
Affiliation(s)
- Alexander Y Panchin
- Institute for Information Transmission Problems, Russian Academy of Sciences, Moscow, 127994, Russia
| | - Vsevolod J Makeev
- Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow, GSP-1, 119991, Russia; Research Institute for Genetics and Selection of Industrial Microorganisms, Moscow, 117545, Russia; Moscow Institute of Physics and Technology, Moscow Region, 141700, Russia
| | - Yulia A Medvedeva
- Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow, GSP-1, 119991, Russia; Center for Bioengineering, Research Center of Biotechnology RAS, Russian Academy of Sciences, Moscow, 117312, Russia
| |
|
24
|
Kulakovskiy IV, Vorontsov IE, Yevshin IS, Soboleva AV, Kasianov AS, Ashoor H, Ba-Alawi W, Bajic VB, Medvedeva YA, Kolpakov FA, Makeev VJ. HOCOMOCO: expansion and enhancement of the collection of transcription factor binding sites models. Nucleic Acids Res 2016; 44:D116-25. [PMID: 26586801 PMCID: PMC4702883 DOI: 10.1093/nar/gkv1249] [Citation(s) in RCA: 145] [Impact Index Per Article: 18.1] [Received: 09/14/2015] [Revised: 10/29/2015] [Accepted: 10/30/2015] [Indexed: 02/06/2023] Open
Abstract
Models of transcription factor (TF) binding sites provide a basis for a wide spectrum of studies in regulatory genomics, from reconstruction of regulatory networks to functional annotation of transcripts and sequence variants. While TFs may recognize different sequence patterns in different conditions, it is pragmatic to have a single generic model for each particular TF as a baseline for practical applications. Here we present the expanded and enhanced version of HOCOMOCO (http://hocomoco.autosome.ru and http://www.cbrc.kaust.edu.sa/hocomoco10), the collection of models of DNA patterns, recognized by transcription factors. HOCOMOCO now provides position weight matrix (PWM) models for binding sites of 601 human TFs and, in addition, PWMs for 396 mouse TFs. Furthermore, we introduce the largest up to date collection of dinucleotide PWM models for 86 (52) human (mouse) TFs. The update is based on the analysis of massive ChIP-Seq and HT-SELEX datasets, with the validation of the resulting models on in vivo data. To facilitate a practical application, all HOCOMOCO models are linked to gene and protein databases (Entrez Gene, HGNC, UniProt) and accompanied by precomputed score thresholds. Finally, we provide command-line tools for PWM and diPWM threshold estimation and motif finding in nucleotide sequences.
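To illustrate how a PWM combined with a precomputed score threshold is applied to a sequence, here is a minimal sketch of a sliding-window scan. The matrix `TOY_PWM`, the threshold value and the function name `scan_sequence` are made-up assumptions for illustration; they are not a HOCOMOCO model and not the interface of the HOCOMOCO command-line tools.

```python
from typing import Dict, List, Tuple

# Hypothetical 4-column log-odds PWM that prefers the word "ACGT".
TOY_PWM: List[Dict[str, float]] = [
    {"A": 1.0, "C": -1.0, "G": -1.0, "T": -1.0},
    {"A": -1.0, "C": 1.0, "G": -1.0, "T": -1.0},
    {"A": -1.0, "C": -1.0, "G": 1.0, "T": -1.0},
    {"A": -1.0, "C": -1.0, "G": -1.0, "T": 1.0},
]


def scan_sequence(
    seq: str, pwm: List[Dict[str, float]], threshold: float
) -> List[Tuple[int, str, float]]:
    """Return (start, word, score) for every window scoring >= threshold.

    Only the forward strand is scanned. Windows containing letters other
    than A/C/G/T get a score of -inf and are therefore never reported.
    """
    seq = seq.upper()
    width = len(pwm)
    hits = []
    for start in range(len(seq) - width + 1):
        word = seq[start:start + width]
        score = sum(col.get(base, float("-inf")) for col, base in zip(pwm, word))
        if score >= threshold:
            hits.append((start, word, score))
    return hits


if __name__ == "__main__":
    print(scan_sequence("TTACGTAA", TOY_PWM, threshold=2.0))
```

A practical scan would also score the reverse complement and would take the threshold from the P-value level chosen for the specific model.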
Affiliation(s)
- Ivan V Kulakovskiy
- Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, 119991, GSP-1, Vavilova 32, Moscow, Russia; Vavilov Institute of General Genetics, Russian Academy of Sciences, 119991, GSP-1, Gubkina 3, Moscow, Russia
| | - Ilya E Vorontsov
- Vavilov Institute of General Genetics, Russian Academy of Sciences, 119991, GSP-1, Gubkina 3, Moscow, Russia
| | - Ivan S Yevshin
- Design Technological Institute of Digital Techniques, Siberian Branch of the Russian Academy of Sciences, 630090, Academician Rzhanov 6, Novosibirsk, Russia; Institute of Systems Biology Ltd, 630112, office 901, Krasina 54, Novosibirsk, Russia
| | - Anastasiia V Soboleva
- Moscow Institute of Physics and Technology, 141700, Institutskiy per. 9, Dolgoprudny, Moscow Region, Russia
| | - Artem S Kasianov
- Vavilov Institute of General Genetics, Russian Academy of Sciences, 119991, GSP-1, Gubkina 3, Moscow, Russia
| | - Haitham Ashoor
- King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Thuwal 23955-6900, Saudi Arabia
| | - Wail Ba-Alawi
- King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Thuwal 23955-6900, Saudi Arabia
| | - Vladimir B Bajic
- King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Thuwal 23955-6900, Saudi Arabia
| | - Yulia A Medvedeva
- Vavilov Institute of General Genetics, Russian Academy of Sciences, 119991, GSP-1, Gubkina 3, Moscow, Russia; Center for Bioengineering, Russian Academy of Sciences, 117312, 60-letiya Oktyabrya 7/2, Moscow, Russia
| | - Fedor A Kolpakov
- Design Technological Institute of Digital Techniques, Siberian Branch of the Russian Academy of Sciences, 630090, Academician Rzhanov 6, Novosibirsk, Russia; Institute of Systems Biology Ltd, 630112, office 901, Krasina 54, Novosibirsk, Russia
| | - Vsevolod J Makeev
- Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, 119991, GSP-1, Vavilova 32, Moscow, Russia; Vavilov Institute of General Genetics, Russian Academy of Sciences, 119991, GSP-1, Gubkina 3, Moscow, Russia; Moscow Institute of Physics and Technology, 141700, Institutskiy per. 9, Dolgoprudny, Moscow Region, Russia
| |
|
25
|
Zakharevich NV, Averina OV, Klimina KM, Kudryavtseva AV, Kasianov AS, Makeev VJ, Danilenko VN. Complete Genome Sequence of Bifidobacterium longum GT15: Identification and Characterization of Unique and Global Regulatory Genes. Microb Ecol 2015; 70:819-834. [PMID: 25894918 DOI: 10.1007/s00248-015-0603-x] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Received: 05/21/2014] [Accepted: 03/23/2015] [Indexed: 06/04/2023]
Abstract
In this study, we report the first completely annotated genome sequence of Bifidobacterium longum subsp. longum strain GT15, a strain of Russian origin. Comparative genomic analysis of this genome with other available completely annotated genome sequences of B. longum strains isolated in other countries revealed a high degree of conservation and synteny across the entire genomes. However, open reading frames for 35 genes were detected only in the B. longum GT15 genome and were absent from the genomes of other B. longum strains (not of Russian origin). These so-called unique genes (UGs) represent a total length of 39,066 bp, with G + C content ranging from 37 to 65 %. Interestingly, certain of these genes were detected in other B. longum strains of Russian origin. In our analysis, we examined genes for global regulatory systems: proteins of type II toxin-antitoxin (TA) systems, serine/threonine protein kinases (STPKs) of eukaryotic type, and genes of the WhiB-like family proteins. In addition, we performed in silico analysis of the most significant probiotic genes and considered genes involved in epigenetic regulation and genes responsible for producing various neuromediators. This genome sequence may elucidate the biology of this probiotic strain as a promising candidate for practical (pharmaceutical) applications.
Affiliation(s)
| | - Olga V Averina
- Vavilov Institute of General Genetics, Gubkina str. 3, 119991, Moscow, Russia
| | - Ksenia M Klimina
- Vavilov Institute of General Genetics, Gubkina str. 3, 119991, Moscow, Russia
| | - Anna V Kudryavtseva
- Engelhardt Institute of Molecular Biology, Vavilova str. 32, 119991, Moscow, Russia
| | - Artem S Kasianov
- Vavilov Institute of General Genetics, Gubkina str. 3, 119991, Moscow, Russia
| | - Vsevolod J Makeev
- Vavilov Institute of General Genetics, Gubkina str. 3, 119991, Moscow, Russia
- Engelhardt Institute of Molecular Biology, Vavilova str. 32, 119991, Moscow, Russia
| | - Valery N Danilenko
- Vavilov Institute of General Genetics, Gubkina str. 3, 119991, Moscow, Russia
| |
|
26
|
Nikiforov NG, Makeev VJ, Elizova NV, Orekhov AN. [Ability of human monocytes to activate in atherosclerosis]. Patol Fiziol Eksp Ter 2015:100-105. [PMID: 26852604] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/05/2023]
Abstract
Monocytes were isolated from the blood of patients belonging to three groups: those with normal intima-media thickness (IMT) of the carotid arteries, patients with increased IMT, and patients with atherosclerotic plaques. The degree of activation of the macrophages was determined by the concentration of the cytokine TNF-α and the chemokine CCL18 in the culture medium. Comparison of the average concentrations of TNF-α and CCL18 across patients in all groups revealed dramatic individual differences. These individual differences were found within each group and in the pool. An inverse relationship between intracellular cholesterol levels and the ability of monocytes to activate was found. To clarify the cause of this relationship, monocytes were cultured with atherogenic modified low-density lipoprotein, causing accumulation of cholesterol in the cultured cells. The accumulation of intracellular cholesterol had no effect on either the secretion of cytokines or the expression of their genes. Therefore, individual differences in the activation capacity of monocytes are not determined by the accumulation of intracellular cholesterol caused by atherogenic lipoproteins. The data obtained can be explained by the individual characteristics of the immune response in different patients.
|
27
|
Orekhov AN, Nikiforov NG, Elizova NV, Ivanova EA, Makeev VJ. Phenomenon of individual difference in human monocyte activation. Exp Mol Pathol 2015; 99:151-4. [PMID: 26107006 DOI: 10.1016/j.yexmp.2015.06.011] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Received: 05/22/2015] [Accepted: 06/18/2015] [Indexed: 01/25/2023]
Abstract
Macrophages play an important role in the pathogenesis of atherosclerosis, including the early pre-clinical stages of the disease development. We have explored the possibility that the disease onset could be associated with altered monocyte/macrophage response to activating pro- and anti-inflammatory stimuli. We evaluated the susceptibility of circulating monocytes from healthy individuals and patients with asymptomatic carotid atherosclerosis to M1 and M2 activation. The obtained data indicated the existence of a remarkable individual difference in susceptibility to activation among monocytes isolated from the blood of different subjects, regardless of the presence or absence of atherosclerosis. The identified differences in susceptibility to activation between monocytes may explain the individual peculiarities of the immune response in different subjects.
Affiliation(s)
- Alexander N Orekhov
- Laboratory of Angiopathology, Institute of General Pathology and Pathophysiology, Russian Academy of Medical Sciences, Moscow 125315, Russia; Department of Biophysics, Faculty of Biology, Moscow State University, Moscow 119991, Russia; Institute for Atherosclerosis Research, Skolkovo Innovative Center, Moscow 143025, Russia
| | - Nikita G Nikiforov
- Laboratory of Angiopathology, Institute of General Pathology and Pathophysiology, Russian Academy of Medical Sciences, Moscow 125315, Russia; Institute for Atherosclerosis Research, Skolkovo Innovative Center, Moscow 143025, Russia
| | - Natalia V Elizova
- Laboratory of Angiopathology, Institute of General Pathology and Pathophysiology, Russian Academy of Medical Sciences, Moscow 125315, Russia; Institute for Atherosclerosis Research, Skolkovo Innovative Center, Moscow 143025, Russia
| | - Ekaterina A Ivanova
- Katholieke Universiteit Leuven, Department of Growth and Regeneration, Campus Gasthuisberg O&N1 Herestraat 49-BUS 817, 3000 Leuven, Belgium.
| | - Vsevolod J Makeev
- Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow 119991, Russia
| |
|
28
|
Vorontsov IE, Kulakovskiy IV, Makeev VJ. Jaccard index based similarity measure to compare transcription factor binding site models. Algorithms Mol Biol 2013; 8:23. [PMID: 24074225 PMCID: PMC3851813 DOI: 10.1186/1748-7188-8-23] [Citation(s) in RCA: 45] [Impact Index Per Article: 4.1] [Received: 05/25/2012] [Accepted: 09/18/2013] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: The positional weight matrix (PWM) remains the most popular model for quantification of transcription factor (TF) binding. A PWM supplied with a score threshold defines a set of putative transcription factor binding sites (TFBS), thus providing a TFBS model. TF-binding DNA fragments obtained by different experimental methods usually give similar but not identical PWMs. This is also common for different TFs from the same structural family. Thus it is often necessary to measure the similarity between PWMs. The popular tools compare PWMs directly using matrix elements. Yet, for log-odds PWMs, negative elements do not contribute to the scores of highly scoring TFBS and thus may be different without affecting the sets of the best recognized binding sites. Moreover, the two TFBS sets recognized by a given pair of PWMs can be more or less different depending on the score thresholds. RESULTS: We propose a practical approach for comparing two TFBS models, each consisting of a PWM and the respective scoring threshold. The proposed measure is a variant of the Jaccard index between two TFBS sets. The measure defines a metric space for TFBS models of all finite lengths. The algorithm can compare TFBS models constructed using substantially different approaches, like PWMs with raw positional counts and log-odds. We present an efficient software implementation: MACRO-APE (MAtrix CompaRisOn by Approximate P-value Estimation). CONCLUSIONS: MACRO-APE can be effectively used to compute the Jaccard index based similarity for two TFBS models. A two-pass scanning algorithm is presented to scan a given collection of PWMs for PWMs similar to a given query. AVAILABILITY AND IMPLEMENTATION: MACRO-APE is implemented in Ruby 1.9; software including source code and a manual is freely available at http://autosome.ru/macroape/ and in supplementary materials.
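The core idea, the Jaccard index between the word sets recognized by two (PWM, threshold) models, can be sketched by brute force for short motifs. This is an illustrative simplification and not the MACRO-APE algorithm: it assumes both PWMs have the same width, treats all words as equally likely, and enumerates every word instead of using approximate P-value estimation. The function names and the toy matrices are assumptions.

```python
from itertools import product
from typing import Dict, List, Set, Tuple

Pwm = List[Dict[str, float]]
Model = Tuple[Pwm, float]  # a TFBS model: (PWM, score threshold)


def recognized_words(pwm: Pwm, threshold: float) -> Set[str]:
    """All words of len(pwm) letters whose PWM score reaches the threshold."""
    words = set()
    for letters in product("ACGT", repeat=len(pwm)):
        score = sum(col[base] for col, base in zip(pwm, letters))
        if score >= threshold:
            words.add("".join(letters))
    return words


def jaccard_similarity(model_a: Model, model_b: Model) -> float:
    """|A intersect B| / |A union B| for the two recognized word sets."""
    words_a = recognized_words(*model_a)
    words_b = recognized_words(*model_b)
    union = words_a | words_b
    if not union:
        return 0.0  # convention chosen here for two empty sets
    return len(words_a & words_b) / len(union)


# Toy two-column matrices (made up for illustration).
PWM_A: Pwm = [{"A": 1, "C": 0, "G": 0, "T": 0}, {"A": 0, "C": 1, "G": 0, "T": 0}]
PWM_B: Pwm = [{"A": 1, "C": 0, "G": 1, "T": 0}, {"A": 0, "C": 1, "G": 0, "T": 0}]

if __name__ == "__main__":
    print(jaccard_similarity((PWM_A, 2), (PWM_B, 2)))
    print(jaccard_similarity((PWM_A, 1), (PWM_A, 2)))
```

The second call shows the threshold dependence noted in the abstract: the same matrix at two thresholds recognizes different word sets, so the similarity drops below 1. Exhaustive enumeration grows as 4^width, so it is practical only for short toy motifs.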
|
29
|
Makeev VJ. Predictive biology using systems and integrative analysis and methods. J Biomol Struct Dyn 2013; 31:1-3. [DOI: 10.1080/07391102.2012.691340] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/27/2022]
|
30
|
Kulakovskiy IV, Medvedeva YA, Schaefer U, Kasianov AS, Vorontsov IE, Bajic VB, Makeev VJ. HOCOMOCO: a comprehensive collection of human transcription factor binding sites models. Nucleic Acids Res 2012; 41:D195-202. [PMID: 23175603 PMCID: PMC3531053 DOI: 10.1093/nar/gks1089] [Citation(s) in RCA: 155] [Impact Index Per Article: 12.9] [Indexed: 12/26/2022] Open
Abstract
Transcription factor (TF) binding site (TFBS) models are crucial for computational reconstruction of transcription regulatory networks. In existing repositories, a TF often has several models (also called binding profiles or motifs), obtained from different experimental data. Having a single TFBS model for a TF is more pragmatic for practical applications. We show that integration of TFBS data from various types of experiments into a single model typically results in improved model quality, probably due to partial correction of source-specific technique bias. We present the Homo sapiens comprehensive model collection (HOCOMOCO, http://autosome.ru/HOCOMOCO/, http://cbrc.kaust.edu.sa/hocomoco/) containing carefully hand-curated TFBS models constructed by integration of binding sequences obtained by both low- and high-throughput methods. To construct position weight matrices to represent these TFBS models, we used ChIPMunk software in four computational modes, including a newly developed periodic positional prior mode associated with DNA helix pitch. We selected only one TFBS model per TF, unless there was clear experimental evidence for two rather distinct TFBS models. We assigned a quality rating to each model. HOCOMOCO contains 426 systematically curated TFBS models for 401 human TFs, where 172 models are based on more than one data source.
Affiliation(s)
- Ivan V Kulakovskiy
- Laboratory of Bioinformatics and Systems Biology, Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Vavilov Street 32, Moscow 119991, GSP-1, Russia.
| |
|
31
|
Permina EA, Medvedeva YA, Baeck PM, Hegde SR, Mande SC, Makeev VJ. Identification of self-consistent modulons from bacterial microarray expression data with the help of structured regulon gene sets. J Biomol Struct Dyn 2012; 31:115-24. [PMID: 22803819 DOI: 10.1080/07391102.2012.691368] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 10/28/2022]
Abstract
Identification of bacterial modulons from series of gene expression measurements on microarrays is a principal problem, especially relevant for inadequately studied but practically important species. Usage of a priori information on regulatory interactions helps to evaluate parameters for regulatory subnetwork inference. We suggest a procedure for modulon construction where a seed regulon is iteratively updated with genes having expression patterns similar to those for regulon member genes. A set of genes essential for a regulon is used to control modulon updating. Essential genes for a regulon were selected as a subset of regulon genes highly related by different measures to each other. Using Escherichia coli as a model, we studied how modulon identification depends on the data, including the microarray experiments set, the adopted relevance measure and the regulon itself. We have found that results of modulon identification are highly dependent on all parameters studied and thus the resulting modulon varies substantially depending on the identification procedure. Yet, modulons that were identified correctly displayed higher stability during iterations, which allows developing a procedure for reliable modulon identification in the case of less studied species where the known regulatory interactions are sparse.
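The iterative update described above, where a seed regulon is extended with genes whose expression patterns resemble those of current members, can be sketched as follows. This is a simplified illustration under stated assumptions, not the published procedure: similarity is taken as Pearson correlation with the mean profile of the current members, the cutoff of 0.9 is arbitrary, and the control step based on essential regulon genes is omitted. All names are hypothetical.

```python
from math import sqrt
from typing import Dict, Iterable, List, Set


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation; returns 0.0 if either profile is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def mean_profile(expr: Dict[str, List[float]], genes: Iterable[str]) -> List[float]:
    """Per-experiment mean expression over the given genes."""
    profiles = [expr[g] for g in genes]
    return [sum(values) / len(values) for values in zip(*profiles)]


def grow_modulon(
    expr: Dict[str, List[float]],
    seed: Iterable[str],
    cutoff: float = 0.9,
    max_iter: int = 10,
) -> Set[str]:
    """Add genes correlated with the current mean profile until nothing changes."""
    modulon = set(seed)
    for _ in range(max_iter):
        centroid = mean_profile(expr, sorted(modulon))
        added = {
            gene
            for gene, profile in expr.items()
            if gene not in modulon and pearson(profile, centroid) >= cutoff
        }
        if not added:
            break
        modulon |= added
    return modulon


if __name__ == "__main__":
    toy_expr = {
        "geneA": [1.0, 2.0, 3.0, 4.0],
        "geneB": [2.0, 4.0, 6.0, 8.0],
        "geneC": [1.1, 2.0, 3.2, 3.9],
        "geneD": [4.0, 3.0, 2.0, 1.0],
        "geneE": [1.0, 1.0, 1.0, 1.0],
    }
    print(sorted(grow_modulon(toy_expr, seed=["geneA", "geneB"])))
```

Because the abstract reports that results depend strongly on the relevance measure and the data set, the choice of correlation and cutoff here should be read as one arbitrary option among many.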
Affiliation(s)
- Elizaveta A Permina
- Vavilov Institute of General Genetics, 3 Gubkina str. , Moscow, GSP-1, 119991, Russia.
| |
|
32
|
Favorov A, Mularoni L, Cope LM, Medvedeva Y, Mironov AA, Makeev VJ, Wheelan SJ. Exploring massive, genome scale datasets with the GenometriCorr package. PLoS Comput Biol 2012; 8:e1002529. [PMID: 22693437 PMCID: PMC3364938 DOI: 10.1371/journal.pcbi.1002529] [Citation(s) in RCA: 123] [Impact Index Per Article: 10.3] [Received: 08/24/2011] [Accepted: 04/08/2012] [Indexed: 02/06/2023] Open
Abstract
We have created a statistically grounded tool for determining the correlation of genomewide data with other datasets or known biological features, intended to guide biological exploration of high-dimensional datasets, rather than providing immediate answers. The software enables several biologically motivated approaches to these data and here we describe the rationale and implementation for each approach. Our models and statistics are implemented in an R package that efficiently calculates the spatial correlation between two sets of genomic intervals (data and/or annotated features), for use as a metric of functional interaction. The software handles any type of pointwise or interval data and instead of running analyses with predefined metrics, it computes the significance and direction of several types of spatial association; this is intended to suggest potentially relevant relationships between the datasets. Availability and implementation: The package, GenometriCorr, can be freely downloaded at http://genometricorr.sourceforge.net/. Installation guidelines and examples are available from the sourceforge repository. The package is pending submission to Bioconductor.
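As a rough illustration of what a spatial association measure between two sets of genomic intervals looks like, the sketch below computes a base-pair Jaccard index (covered bases shared by both sets divided by bases covered by either set). It is written in Python for illustration only and is not the GenometriCorr R API; the function names are assumptions, intervals are half-open `(start, end)` pairs on a single chromosome, and no significance test is performed.

```python
from typing import List, Sequence, Tuple

Interval = Tuple[int, int]  # half-open [start, end) on one chromosome


def merge_intervals(intervals: Sequence[Interval]) -> List[List[int]]:
    """Sort intervals and merge any that overlap or touch."""
    merged: List[List[int]] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged


def covered_length(merged: Sequence[Sequence[int]]) -> int:
    """Total number of bases covered by already merged intervals."""
    return sum(end - start for start, end in merged)


def shared_length(a: Sequence[Interval], b: Sequence[Interval]) -> int:
    """Number of bases covered by both interval sets (two-pointer sweep)."""
    merged_a, merged_b = merge_intervals(a), merge_intervals(b)
    i = j = total = 0
    while i < len(merged_a) and j < len(merged_b):
        low = max(merged_a[i][0], merged_b[j][0])
        high = min(merged_a[i][1], merged_b[j][1])
        if low < high:
            total += high - low
        # Advance whichever interval ends first.
        if merged_a[i][1] < merged_b[j][1]:
            i += 1
        else:
            j += 1
    return total


def interval_jaccard(a: Sequence[Interval], b: Sequence[Interval]) -> float:
    """Shared covered bases divided by bases covered by either set."""
    shared = shared_length(a, b)
    union = covered_length(merge_intervals(a)) + covered_length(merge_intervals(b)) - shared
    return shared / union if union else 0.0


if __name__ == "__main__":
    query = [(0, 10), (20, 30)]
    reference = [(5, 25)]
    print(interval_jaccard(query, reference))
```

A single overlap statistic like this gives neither significance nor direction of association, which is why the package described above reports several complementary measures with significance estimates.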
Affiliation(s)
- Alexander Favorov
- Department of Oncology, Division of Biostatistics and Bioinformatics, Johns Hopkins University School of Medicine, Baltimore, Maryland, United States of America
- Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow, Russia
- Research Institute of Genetics and Selection of Industrial Microorganisms, Moscow, Russia
| | - Loris Mularoni
- Department of Oncology, Division of Biostatistics and Bioinformatics, Johns Hopkins University School of Medicine, Baltimore, Maryland, United States of America
| | - Leslie M. Cope
- Department of Oncology, Division of Biostatistics and Bioinformatics, Johns Hopkins University School of Medicine, Baltimore, Maryland, United States of America
| | - Yulia Medvedeva
- Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow, Russia
- Research Institute of Genetics and Selection of Industrial Microorganisms, Moscow, Russia
| | - Andrey A. Mironov
- Department of Bioengineering and Bioinformatics, Moscow State University, Moscow, Russia
- Institute for Information Transmission Problems, Russian Academy of Sciences, Moscow, Russia
| | - Vsevolod J. Makeev
- Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow, Russia
- Research Institute of Genetics and Selection of Industrial Microorganisms, Moscow, Russia
| | - Sarah J. Wheelan
- Department of Oncology, Division of Biostatistics and Bioinformatics, Johns Hopkins University School of Medicine, Baltimore, Maryland, United States of America
- * E-mail: (AF); (SJW)
| |
33
Nikulova AA, Favorov AV, Sutormin RA, Makeev VJ, Mironov AA. CORECLUST: identification of the conserved CRM grammar together with prediction of gene regulation. Nucleic Acids Res 2012; 40:e93. [PMID: 22422836 PMCID: PMC3384346 DOI: 10.1093/nar/gks235] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Indexed: 11/16/2022] Open
Abstract
Identification of transcriptional regulatory regions and tracing their internal organization are important for understanding the eukaryotic cell machinery. Cis-regulatory modules (CRMs) of higher eukaryotes are believed to possess a regulatory ‘grammar’, or preferred arrangement of binding sites, that is crucial for proper regulation and thus tends to be evolutionarily conserved. Here, we present a method CORECLUST (COnservative REgulatory CLUster STructure) that predicts CRMs based on a set of positional weight matrices. Given regulatory regions of orthologous and/or co-regulated genes, CORECLUST constructs a CRM model by revealing the conserved rules that describe the relative location of binding sites. The constructed model may be consequently used for the genome-wide prediction of similar CRMs, and thus detection of co-regulated genes, and for the investigation of the regulatory grammar of the system. Compared with related methods, CORECLUST shows better performance at identification of CRMs conferring muscle-specific gene expression in vertebrates and early-developmental CRMs in Drosophila.
Affiliation(s)
- Anna A Nikulova
  - Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, 1-73 Leninskie Gory, Moscow 119991, Russia.
34
Hara Y, Kadotani N, Izui H, Katashkina JI, Kuvaeva TM, Andreeva IG, Golubeva LI, Malko DB, Makeev VJ, Mashko SV, Kozlov YI. The complete genome sequence of Pantoea ananatis AJ13355, an organism with great biotechnological potential. Appl Microbiol Biotechnol 2011; 93:331-41. [PMID: 22159605 PMCID: PMC3251776 DOI: 10.1007/s00253-011-3713-5] [Citation(s) in RCA: 44] [Impact Index Per Article: 3.4] [Received: 09/20/2011] [Revised: 10/23/2011] [Accepted: 11/05/2011] [Indexed: 11/28/2022]
Abstract
Pantoea ananatis AJ13355 is a newly identified member of the Enterobacteriaceae family with promising biotechnological applications. This bacterium is able to grow at an acidic pH and is resistant to saturating concentrations of L-glutamic acid, making this organism a suitable host for the production of L-glutamate. In the current study, the complete genomic sequence of P. ananatis AJ13355 was determined. The genome was found to consist of a single circular chromosome consisting of 4,555,536 bp [DDBJ: AP012032] and a circular plasmid, pEA320, of 321,744 bp [DDBJ: AP012033]. After automated annotation, 4,071 protein-coding sequences were identified in the P. ananatis AJ13355 genome. For 4,025 of these genes, functions were assigned based on homologies to known proteins. A high level of nucleotide sequence identity (99%) was revealed between the genome of P. ananatis AJ13355 and the previously published genome of P. ananatis LMG 20103. Short colinear regions, which are identical to DNA sequences in the Escherichia coli MG1655 chromosome, were found to be widely dispersed along the P. ananatis AJ13355 genome. Conjugal gene transfer from E. coli to P. ananatis, mediated by homologous recombination between short identical sequences, was also experimentally demonstrated. The determination of the genome sequence has paved the way for the directed metabolic engineering of P. ananatis to produce biotechnologically relevant compounds.
Affiliation(s)
- Yoshihiko Hara
  - Fermentation and Biotechnology Laboratories, Ajinomoto Co., Inc., Kawasaki-ku, Kawasaki, Japan
35
Kulakovskiy IV, Belostotsky AA, Kasianov AS, Esipova NG, Medvedeva YA, Eliseeva IA, Makeev VJ. A deeper look into transcription regulatory code by preferred pair distance templates for transcription factor binding sites. Bioinformatics 2011; 27:2621-4. [PMID: 21852305 DOI: 10.1093/bioinformatics/btr453] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Indexed: 11/13/2022]
Abstract
MOTIVATION Modern experimental methods provide substantial information on protein-DNA recognition. Studying arrangements of transcription factor binding sites (TFBSs) of interacting transcription factors (TFs) advances understanding of the transcription regulatory code. RESULTS We constructed binding motifs for TFs forming a complex with HIF-1α at the erythropoietin 3′-enhancer. Corresponding TFBSs were predicted in the segments around transcription start sites (TSSs) of all human genes. Using the genome-wide set of regulatory regions, we observed several strongly preferred distances between the hypoxia-responsive element (HRE) and binding sites of a particular cofactor protein. The set of preferred distances was termed a preferred pair distance template (PPDT). The PPDT depended strongly on the TF and on the orientation of its binding sites relative to the HRE. The PPDT evaluated from the genome-wide set of regulatory sequences was used to detect significant PPDT-consistent binding site pairs in regulatory regions of hypoxia-responsive genes. We believe PPDT can help to reveal the layout of eukaryotic regulatory segments. CONTACT ivan.kulakovskiy@gmail.com SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- I V Kulakovskiy
  - Laboratory of Bioinformatics and System Biology, Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Moscow 119991, Russia.
36
Logacheva MD, Kasianov AS, Vinogradov DV, Samigullin TH, Gelfand MS, Makeev VJ, Penin AA. De novo sequencing and characterization of floral transcriptome in two species of buckwheat (Fagopyrum). BMC Genomics 2011; 12:30. [PMID: 21232141 PMCID: PMC3027159 DOI: 10.1186/1471-2164-12-30] [Citation(s) in RCA: 124] [Impact Index Per Article: 9.5] [Received: 10/18/2010] [Accepted: 01/13/2011] [Indexed: 11/20/2022] Open
Abstract
BACKGROUND Transcriptome sequencing data have become an integral component of modern genetics, genomics and evolutionary biology. However, despite advances in the technologies of DNA sequencing, such data are lacking for many groups of living organisms, in particular, many plant taxa. We present here the results of transcriptome sequencing for two closely related plant species. These species, Fagopyrum esculentum and F. tataricum, belong to the order Caryophyllales, a large group of flowering plants with uncertain evolutionary relationships. F. esculentum (common buckwheat) is also an important food crop. Despite these practical and evolutionary considerations, Fagopyrum species have not been the subject of large-scale sequencing projects. RESULTS Normalized cDNA corresponding to genes expressed in flowers and inflorescences of F. esculentum and F. tataricum was sequenced using the 454 pyrosequencing technology. This resulted in 267 thousand (F. esculentum) and 229 thousand (F. tataricum) reads with an average length of 341-349 nucleotides. De novo assembly of the reads produced about 25 thousand contigs for each species, with 7.5-8.2× coverage. Comparative analysis of the two transcriptomes demonstrated their overall similarity but also revealed genes that are presumably differentially expressed. Among them are retrotransposon genes and genes involved in sugar biosynthesis and metabolism. Thirteen single-copy genes were used for phylogenetic analysis; the resulting trees are largely consistent with those inferred from multigenic plastid datasets. The sister relationship of the Caryophyllales and asterids has now gained high support from nuclear gene sequences. CONCLUSIONS 454 transcriptome sequencing and de novo assembly were performed for two congeneric flowering plant species, F. esculentum and F. tataricum. As a result, a large set of cDNA sequences that represent orthologs of known plant genes as well as potential new genes was generated.
Affiliation(s)
- Maria D Logacheva
  - Department of Evolutionary Biochemistry, A.N. Belozersky Institute of Physico-Chemical Biology, M.V. Lomonosov Moscow State University, Moscow, Russia
  - Evolutionary Genomics Laboratory, Faculty of Bioengineering and Bioinformatics, M.V. Lomonosov Moscow State University, Moscow, Russia
  - A.A. Kharkevich Institute for Information Transmission Problems, Russian Academy of Science, Moscow, Russia
- Artem S Kasianov
  - V.A. Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Moscow, Russia
- Dmitriy V Vinogradov
  - A.A. Kharkevich Institute for Information Transmission Problems, Russian Academy of Science, Moscow, Russia
- Tagir H Samigullin
  - Department of Evolutionary Biochemistry, A.N. Belozersky Institute of Physico-Chemical Biology, M.V. Lomonosov Moscow State University, Moscow, Russia
- Mikhail S Gelfand
  - A.A. Kharkevich Institute for Information Transmission Problems, Russian Academy of Science, Moscow, Russia
  - Faculty of Bioengineering and Bioinformatics, M.V. Lomonosov Moscow State University, Moscow, Russia
- Vsevolod J Makeev
  - V.A. Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Moscow, Russia
  - N.I. Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow, Russia
  - State Scientific Institute of Genetics and Selection of Industrial Microorganisms, GosNIIgenetika, Moscow, Russia
- Aleksey A Penin
  - Evolutionary Genomics Laboratory, Faculty of Bioengineering and Bioinformatics, M.V. Lomonosov Moscow State University, Moscow, Russia
  - A.A. Kharkevich Institute for Information Transmission Problems, Russian Academy of Science, Moscow, Russia
  - Department of Genetics, Biological faculty, M.V. Lomonosov Moscow State University, Moscow, Russia
37
Abstract
SUMMARY ChIP-Seq data are a new challenge for motif discovery. Such data typically consist of thousands of DNA segments with base-specific coverage values. We present a new version of our DNA motif discovery software ChIPMunk adapted for ChIP-Seq data. ChIPMunk is an iterative algorithm that combines greedy optimization with bootstrapping and uses coverage profiles as motif positional preferences. ChIPMunk does not require truncation of long DNA segments and it is practical for processing up to tens of thousands of data sequences. Comparison with traditional (MEME) or ChIP-Seq-oriented (HMS) motif discovery tools shows that ChIPMunk identifies the correct motifs with the same or better quality but works dramatically faster. AVAILABILITY AND IMPLEMENTATION ChIPMunk is freely available within the ru_genetika Java package: http://line.imb.ac.ru/ChIPMunk. A web-based version is also available. CONTACT ivan.kulakovskiy@gmail.com SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- I V Kulakovskiy
  - Research Institute for Genetics and Selection of Industrial Microorganisms, Moscow 117545, Russia.
38
Medvedeva YA, Fridman MV, Oparina NJ, Malko DB, Ermakova EO, Kulakovskiy IV, Heinzel A, Makeev VJ. Intergenic, gene terminal, and intragenic CpG islands in the human genome. BMC Genomics 2010; 11:48. [PMID: 20085634 PMCID: PMC2817693 DOI: 10.1186/1471-2164-11-48] [Citation(s) in RCA: 55] [Impact Index Per Article: 3.9] [Received: 02/26/2009] [Accepted: 01/19/2010] [Indexed: 11/10/2022] Open
Abstract
Background: Recently, it has been discovered that the human genome contains many transcription start sites for non-coding RNA. Regulatory regions related to transcription of these non-coding RNAs are poorly studied. Some of these regulatory regions may be associated with CpG islands located far from transcription start sites of any protein coding gene. The human genome contains many such CpG islands; however, until now their properties were not systematically studied. Results: We studied CpG islands located in different regions of the human genome using methods of bioinformatics and comparative genomics. We have observed that CpG islands have a preference to overlap with exons, including exons located far from the transcription start site, but usually extend well into introns. The synonymous substitution rate of CpG-containing codons becomes substantially reduced in regions where CpG islands overlap with protein-coding exons, even if they are located far downstream from the transcription start site. CAGE tag analysis displayed frequent transcription start sites in all CpG islands, including those found far from transcription start sites of protein coding genes. Computational prediction and analysis of published ChIP-chip data revealed that CpG islands contain an increased number of sites recognized by Sp1 protein. CpG islands containing more CAGE tags usually also contain more Sp1 binding sites. This is especially relevant for CpG islands located in 3' gene regions. Various examples of transcription, confirmed by mRNAs or ESTs, but with no evidence of protein coding genes, were found in CAGE-enriched CpG islands located far from the transcription start site of any known protein coding gene. Conclusions: CpG islands located far from transcription start sites of protein coding genes have transcription initiation activity and display Sp1 binding properties. In exons overlapping these islands, the synonymous substitution rate of CpG-containing codons is decreased. This suggests that these CpG islands are involved in transcription initiation, possibly of some non-coding RNAs.
Affiliation(s)
- Yulia A Medvedeva
  - Research Institute for Genetics and Selection of Industrial Microorganisms, Genetika, 1st Dorozhny proezd, 1, Moscow, 117545, Russia.
39
Abstract
MOTIVATION Footprint data is an important source of information on transcription factor recognition motifs. However, a footprinting fragment can contain no sequences similar to known protein recognition sites. Inspection of genome fragments nearby can help to identify missing site positions. RESULTS Genome fragments containing footprints were supplied to a pipeline that constructed a position weight matrix (PWM) for different motif lengths and selected the optimal PWM. Fragments were aligned with the SeSiMCMC sampler and a new heuristic algorithm, Bigfoot. Footprints with missing hits were found for approximately 50% of factors. Adding only 2 bp on both sides of a footprinting fragment recovered most hits. We automatically constructed motifs for 41 Drosophila factors. New motifs can recognize footprints with a greater sensitivity at the same false positive rate than existing models. Also we discuss possible overfitting of constructed motifs. AVAILABILITY Software and the collection of regulatory motifs are freely available at http://line.imb.ac.ru/DMMPMM.
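The PWM construction step mentioned in this abstract can be illustrated with a minimal sketch. It assumes the sites are already aligned and of equal length, and it uses a uniform background and a pseudocount of 1, both chosen here for illustration. The function names (`build_pwm`, `score`, `best_hit`) and the toy sequences are hypothetical; this is not the SeSiMCMC or Bigfoot pipeline described by the authors.

```python
import math
from collections import Counter

ALPHABET = "ACGT"


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds PWM (one dict per column) from pre-aligned sites.

    Assumptions for this sketch: uniform background, additive pseudocount.
    """
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("all sites must have the same length")
    total = len(sites) + pseudocount * len(ALPHABET)
    pwm = []
    for pos in range(length):
        counts = Counter(s[pos] for s in sites)
        column = {}
        for base in ALPHABET:
            freq = (counts[base] + pseudocount) / total
            column[base] = math.log2(freq / background)
        pwm.append(column)
    return pwm


def score(pwm, seq):
    """Sum the per-column log-odds weights for a sequence of PWM width."""
    return sum(column[base] for column, base in zip(pwm, seq))


def best_hit(pwm, fragment):
    """Return (score, offset) of the best-scoring window in a fragment."""
    width = len(pwm)
    hits = [
        (score(pwm, fragment[i:i + width]), i)
        for i in range(len(fragment) - width + 1)
    ]
    return max(hits)


if __name__ == "__main__":
    toy_sites = ["TGACTCA", "TGAGTCA", "TGACTCA", "TTACTCA"]  # invented data
    pwm = build_pwm(toy_sites)
    # Scan a fragment extended by 2 bp on each side, as in the abstract's idea
    print(best_hit(pwm, "CCTGACTCAGG"))
```

Scanning a fragment that has been extended by a few flanking bases, as in the demo, mirrors the observation that padding a footprint can recover a hit that the bare footprint misses.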
Affiliation(s)
- Ivan V Kulakovskiy
  - Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Moscow, Russia.
40
Britanova LV, Makeev VJ, Kuprash DV. In vitro selection of optimal RelB/p52 DNA-binding motifs. Biochem Biophys Res Commun 2008; 365:583-8. [DOI: 10.1016/j.bbrc.2007.10.200] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.6] [Received: 10/26/2007] [Accepted: 10/29/2007] [Indexed: 11/15/2022]
41
Enikeeva FN, Kotelnikova EA, Gelfand MS, Makeev VJ. A model of evolution with constant selective pressure for regulatory DNA sites. BMC Evol Biol 2007; 7:125. [PMID: 17662135 PMCID: PMC1978210 DOI: 10.1186/1471-2148-7-125] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/18/2007] [Accepted: 07/27/2007] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Molecular evolution is usually described assuming a neutral or weakly non-neutral substitution model. Recently, new data have become available on evolution of sequence regions under a selective pressure, e.g. transcription factor binding sites. To reconstruct the evolutionary history of such sequences, one needs evolutionary models that take into account a substantial constant selective pressure. RESULTS We present a simple evolutionary model with a single preferred (consensus) nucleotide and the neutral substitution model adopted for all other nucleotides. This evolutionary model has a rate matrix in which all substitutions that do not involve the consensus nucleotide occur with the same rate. The model has two time scales for achieving a stationary distribution; in the general case only one of the two rate parameters can be evaluated from the stationary distribution. In the middle-time zone, a counterintuitive behavior was observed for some parameter values, with a probability of conservation for a non-consensus nucleotide greater than that for the consensus nucleotide. Such an effect can be observed only in the case of weak preference for the consensus nucleotide, when the probability to observe the consensus nucleotide in the stationary distribution is less than 1/2. If the substitution rate is represented as a product of mutation and fixation, only the fixation can be calculated from the stationary distribution. The exhibited conservation of non-consensus nucleotides does not take place if the elements of mutation matrix are identical, and can be related to the reduced mutation rate between the non-consensus nucleotides. This bias can have no effect on the stationary distribution of nucleotide frequencies calculated over the ensemble of multiple alignments, e.g. transcription factor binding sites upstream of different sets of co-regulated orthologous genes. 
CONCLUSION The derived model can be used as a null model when analyzing the evolution of orthologous transcription factor binding sites. In particular, our findings show that a nucleotide preferred at some position of a multiple alignment of binding sites for some transcription factor in the same genome is not necessarily the most conserved nucleotide in an alignment of orthologous sites from different species. However, this effect can take place only in the case of a mutation matrix whose elements are not identical.
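A rate matrix of the kind this abstract describes can be sketched as follows. The parametrization is an assumption made for illustration, not the authors' notation: `to_consensus` is the rate from any non-consensus nucleotide to the consensus one, `from_consensus` is the reverse rate, and `neutral` is the common rate between non-consensus nucleotides. The stationary distribution is found by simple power iteration.

```python
def rate_matrix(to_consensus, from_consensus, neutral, consensus=0, n=4):
    """Build an n x n substitution rate matrix with one preferred state.

    Hypothetical parametrization: all substitutions that do not involve
    the consensus state share the same rate `neutral`.
    """
    q = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            if j == consensus:
                q[i][j] = to_consensus
            elif i == consensus:
                q[i][j] = from_consensus
            else:
                q[i][j] = neutral
        q[i][i] = -sum(q[i])  # rows of a rate matrix sum to zero
    return q


def stationary(q, steps=2000):
    """Approximate the stationary distribution by power iteration.

    P = I + Q / scale is a stochastic matrix with the same stationary
    vector as Q when scale exceeds the largest exit rate.
    """
    n = len(q)
    scale = 2.0 * max(-q[i][i] for i in range(n))
    p = [
        [(1.0 if i == j else 0.0) + q[i][j] / scale for j in range(n)]
        for i in range(n)
    ]
    pi = [1.0 / n] * n
    for _ in range(steps):
        pi = [sum(pi[i] * p[i][j] for i in range(n)) for j in range(n)]
    return pi


if __name__ == "__main__":
    for neutral in (0.5, 5.0):
        print(neutral, stationary(rate_matrix(2.0, 1.0, neutral)))
```

Under this parametrization the stationary probability of the consensus nucleotide works out to `to_consensus / (to_consensus + 3 * from_consensus)`, which does not involve `neutral`. That is consistent with the abstract's remark that not all rate parameters can be recovered from the stationary distribution alone.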
Affiliation(s)
- Farida N Enikeeva
  - Institute for Information Transmission Problems (the Kharkevich Institute) of RAS, Bolshoi Karetny pereulok, 19, GSP-4, Moscow, 127994, Russia
- Ekaterina A Kotelnikova
  - State Research Institute of Genetics and Selection of Industrial Microorganisms, 1st Dorozhnyj proezd, 1, Moscow, 113535, Russia
  - Ariadne Genomics Inc. 9700 Great Seneca Highway, Suite 113, Rockville, MD 20850, USA
- Mikhail S Gelfand
  - Institute for Information Transmission Problems (the Kharkevich Institute) of RAS, Bolshoi Karetny pereulok, 19, GSP-4, Moscow, 127994, Russia
  - Faculty of Bioengineering and Bioinformatics, Moscow State University, Vorobyevy Gory 1-73, Moscow, 119992, Russia
- Vsevolod J Makeev
  - State Research Institute of Genetics and Selection of Industrial Microorganisms, 1st Dorozhnyj proezd, 1, Moscow, 113535, Russia
  - Engelgardt Institute of Molecular Biology of RAS, Vavilova 32, Moscow, 119991, Russia
42
Rakhmanov SV, Makeev VJ. Atomic hydration potentials using a Monte Carlo Reference State (MCRS) for protein solvation modeling. BMC Struct Biol 2007; 7:19. [PMID: 17397537 PMCID: PMC1852318 DOI: 10.1186/1472-6807-7-19] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Received: 09/15/2006] [Accepted: 03/30/2007] [Indexed: 11/10/2022]
Abstract
Background: Accurate description of protein interaction with aqueous solvent is crucial for modeling of protein folding, protein-protein interaction, and drug design. Efforts to build a working description of solvation, both by continuous models and by molecular dynamics, yield controversial results. Specifically constructed knowledge-based potentials appear to be promising for accounting for the solvation at the molecular level, yet have not been used for this purpose. Results: We developed original knowledge-based potentials to study protein hydration at the level of atom contacts. The potentials were obtained using a new Monte Carlo reference state (MCRS), which simulates the expected probability density of atom-atom contacts via exhaustive sampling of structure space with random probes. Using the MCRS allowed us to calculate the expected atom contact densities with high resolution over a broad distance range including very short distances. Knowledge-based potentials for hydration of protein atoms of different types were obtained based on frequencies of their contacts at different distances with protein-bound water molecules, in a non-redundant training database of 1776 proteins with known 3D structures. Protein hydration sites were predicted in a test set of 12 proteins with experimentally determined water locations. The MCRS greatly improves prediction of water locations over existing methods. In addition, the contribution of the energy of macromolecular solvation to the total folding free energy was estimated, and tested in fold recognition experiments. The correct folds were preferred over all the misfolded decoys for the majority of proteins from the improved Rosetta decoy set based on the structure hydration energy alone. Conclusion: MCRS atomic hydration potentials provide a detailed distance-dependent description of hydropathies of individual protein atoms. This allows placement of water molecules on the surface of proteins and in protein interfaces with much higher precision. The potentials provide a means to estimate the total solvation energy for a protein structure, in many cases achieving a successful fold recognition. Possible applications of atomic hydration potentials to structure verification, protein folding and stability, and protein-protein interactions are discussed.
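As a rough illustration of how a knowledge-based potential can be derived from observed and expected contact densities, the sketch below applies the common inverse-Boltzmann form, -kT ln(observed / expected), per distance bin. The abstract does not state the exact functional form, so that form is an assumption here, and the expected densities are toy numbers rather than output of the Monte Carlo reference state. The function name `contact_potential` is hypothetical.

```python
import math


def contact_potential(observed, expected, kt=1.0, floor=1e-9):
    """Per-bin potential -kT * ln(observed / expected).

    `observed` and `expected` are contact densities per distance bin.
    `floor` avoids log(0) for bins with no observed contacts.
    Assumption: inverse-Boltzmann form, which may differ from the
    authors' actual definition.
    """
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    potential = []
    for obs, exp in zip(observed, expected):
        if exp <= 0:
            raise ValueError("expected densities must be positive")
        potential.append(-kt * math.log(max(obs, floor) / exp))
    return potential


if __name__ == "__main__":
    # Toy densities for three distance bins (invented for illustration)
    observed = [2.0, 1.0, 0.5]
    expected = [1.0, 1.0, 1.0]
    print(contact_potential(observed, expected))
```

Bins where contacts are seen more often than the reference state predicts receive a negative (favorable) value, and under-represented bins receive a positive one.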
Affiliation(s)
- Sergei V Rakhmanov
  - Institute of Genetics and Selection of Industrial Microorganisms, State Research Centre GosNIIgenetika, 1Dorozhny proezd, 1, Moscow, Russia
- Vsevolod J Makeev
  - Institute of Genetics and Selection of Industrial Microorganisms, State Research Centre GosNIIgenetika, 1Dorozhny proezd, 1, Moscow, Russia
  - Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Vavilova str. 32, Moscow, Russia
43
Malko DB, Makeev VJ, Mironov AA, Gelfand MS. Evolution of exon-intron structure and alternative splicing in fruit flies and malarial mosquito genomes. Genome Res 2006; 16:505-9. [PMID: 16520458 PMCID: PMC1457027 DOI: 10.1101/gr.4236606] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.2] [Indexed: 01/16/2023]
Abstract
Comparative analysis of alternative splicing of orthologous genes from fruit flies (Drosophila melanogaster and Drosophila pseudoobscura) and mosquito (Anopheles gambiae) demonstrated that both in the fruit fly genes and in fruit fly-mosquito comparisons, constitutive exons and splicing sites are more conserved than alternative ones. While >97% of constitutive D. melanogaster exons are conserved in D. pseudoobscura, only approximately 80% of alternative exons are conserved. Similarly, 77% of constitutive fruit fly exons are conserved in the mosquito genes, compared with <50% of alternative exons. Internal alternatives are more conserved than terminal ones. Retained introns are the least conserved, alternative acceptor sites are slightly more conserved than donor sites, and mutually exclusive exons are almost as conserved as constitutive exons. Cassette and mutually exclusive exons experience almost no intron insertions. We also observed cases of interconversion of various elementary alternatives, e.g., transformation of cassette exons into alternative sites. These results agree with the observations made earlier in human-mouse comparisons and demonstrate that the phenomenon of relatively low conservation of alternatively spliced regions may be universal, as it has been observed in different taxonomic groups (mammals and insects) and at various evolutionary distances.
Affiliation(s)
- Dmitry B. Malko
  - State Scientific Center GosNIIgenetika, Moscow 117545, Russia
- Andrey A. Mironov
  - State Scientific Center GosNIIgenetika, Moscow 117545, Russia
  - Department of Bioengineering and Bioinformatics, Moscow State University, Moscow 119992, Russia
- Mikhail S. Gelfand
  - State Scientific Center GosNIIgenetika, Moscow 117545, Russia
  - Department of Bioengineering and Bioinformatics, Moscow State University, Moscow 119992, Russia
  - Institute for Information Transmission Problems RAS, Moscow 127994, Russia
  - Corresponding author. E-mail ; fax +7-095-2090579
44
Tompa M, Li N, Bailey TL, Church GM, De Moor B, Eskin E, Favorov AV, Frith MC, Fu Y, Kent WJ, Makeev VJ, Mironov AA, Noble WS, Pavesi G, Pesole G, Régnier M, Simonis N, Sinha S, Thijs G, van Helden J, Vandenbogaert M, Weng Z, Workman C, Ye C, Zhu Z. Assessing computational tools for the discovery of transcription factor binding sites. Nat Biotechnol 2005; 23:137-44. [PMID: 15637633 DOI: 10.1038/nbt1053] [Citation(s) in RCA: 691] [Impact Index Per Article: 36.4] [Indexed: 11/09/2022]
Abstract
The prediction of regulatory elements is a problem where computational methods offer great hope. Over the past few years, numerous tools have become available for this task. The purpose of the current assessment is twofold: to provide some guidance to users regarding the accuracy of currently available tools in various settings, and to provide a benchmark of data sets for assessing future tools.
Affiliation(s)
- Martin Tompa
  - Department of Computer Science and Engineering, Box 352350, University of Washington, Seattle, Washington 98195-2350, USA.
45
Favorov AV, Gelfand MS, Gerasimova AV, Ravcheev DA, Mironov AA, Makeev VJ. A Gibbs sampler for identification of symmetrically structured, spaced DNA motifs with improved estimation of the signal length. Bioinformatics 2005; 21:2240-5. [PMID: 15728117 DOI: 10.1093/bioinformatics/bti336] [Citation(s) in RCA: 75] [Impact Index Per Article: 3.9] [Indexed: 11/12/2022] Open
Abstract
MOTIVATION Transcription regulatory protein factors often bind DNA as homo-dimers or hetero-dimers. Thus they recognize structured DNA motifs that are inverted or direct repeats or spaced motif pairs. However, these motifs are often difficult to identify owing to their high divergence. The motif structure included explicitly into the motif recognition algorithm improves recognition efficiency for highly divergent motifs as well as estimation of motif geometric parameters. RESULT We present a modification of the Gibbs sampling motif extraction algorithm, SeSiMCMC (Sequence Similarities by Markov Chain Monte Carlo), which finds structured motifs of these types, as well as non-structured motifs, in a set of unaligned DNA sequences. It employs improved estimators of motif and spacer lengths. The probability that a sequence does not contain any motif is accounted for in a rigorous Bayesian manner. We have applied the algorithm to a set of upstream regions of genes from two Escherichia coli regulons involved in respiration. We have demonstrated that accounting for a symmetric motif structure allows the algorithm to identify weak motifs more accurately. In the examples studied, ArcA binding sites were demonstrated to have the structure of a direct spaced repeat, whereas NarP binding sites exhibited the palindromic structure. AVAILABILITY The WWW interface of the program, its FreeBSD (4.0) and Windows 32 console executables are available at http://bioinform.genetika.ru/SeSiMCMC
Affiliation(s)
- A V Favorov
  - Laboratory for Bioinformatics, State Scientific Centre GosNIIGenetika, 1st Dorozhny pr. 1, Moscow, 117545, Russia.
46
Kotelnikova EA, Makeev VJ, Gelfand MS. Evolution of transcription factor DNA binding sites. Gene 2005; 347:255-63. [PMID: 15725380 DOI: 10.1016/j.gene.2004.12.013] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.8] [Received: 09/14/2004] [Revised: 11/12/2004] [Accepted: 12/02/2004] [Indexed: 11/17/2022]
Abstract
In bioinformatics, binding of transcription regulatory factors to their cognate binding sites is usually described by a sequence-specific binding energy, which is estimated from a training sample of sites. This model implies that all binding sites with binding energy above some threshold are functional and that site sequence variations should be considered neutral as long as they do not reduce this energy below the threshold. To quantify this energy, the binding profile (positional weight matrix, PWM) model or a consensus-based model is usually applied. Here we show that in many cases the available data are not sufficient to construct a relevant PWM, and a modified consensus-based model could describe binding properties more effectively. Further, using data on binding sites of several transcription factors, we demonstrate that some non-consensus nucleotides in "orthologous sites" (that is, binding sites of the same factor upstream of orthologous genes), which have been believed to be irrelevant or even to hinder regulation, are evolutionarily very stable and specific for the regulated gene. For each pair of genomes considered, the number of substitutions between non-consensus nucleotides is far less than the expected number of neutral substitutions. Moreover, in several positions of binding sites regulating different genes, there are non-consensus nucleotides conserved in distant genomes. This means that there exists a selection pressure that results in the stability of non-consensus nucleotides.
Affiliation(s)
- Ekaterina A Kotelnikova
- State Research Institute of Genetics and Selection of Industrial Microorganisms, 1st Dorozhnyj proezd 1, Moscow 113535, Russia.
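The threshold model summarized in the abstract above (score each candidate site with a PWM and call it functional if the score exceeds a cutoff) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the pseudocount of 1, the uniform background of 0.25, and the toy training sites are all assumptions made for the example.

```python
import math

def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds PWM from aligned, equal-length training sites.

    pseudocount and background are illustrative assumptions.
    """
    pwm = []
    for pos in range(len(sites[0])):
        counts = {base: pseudocount for base in "ACGT"}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        # Log2 ratio of observed frequency to the background frequency.
        pwm.append({b: math.log2(counts[b] / total / background) for b in "ACGT"})
    return pwm

def score(pwm, seq):
    """Sum the per-position weights; a stand-in for binding energy."""
    return sum(column[base] for column, base in zip(pwm, seq))

def scan(pwm, sequence, threshold):
    """Return (start, score) for every window scoring at or above threshold."""
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        s = score(pwm, sequence[start:start + width])
        if s >= threshold:
            hits.append((start, s))
    return hits

# Hypothetical training sample: one site carries a non-consensus T at position 5.
training_sites = ["TTGACA", "TTGACA", "TTGATA", "TTGACA"]
pwm = build_pwm(training_sites)
hits = scan(pwm, "CCTTGACACC", threshold=7.0)
```

With only four training sites, the weight of the non-consensus nucleotide depends heavily on the pseudocount, which illustrates the abstract's point that small samples may not support a reliable PWM.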
|
47
|
Makeev VJ, Lifanov AP, Nazina AG, Papatsenko DA. Distance preferences in the arrangement of binding motifs and hierarchical levels in organization of transcription regulatory information. Nucleic Acids Res 2004; 31:6016-26. [PMID: 14530449 PMCID: PMC219477 DOI: 10.1093/nar/gkg799] [Citation(s) in RCA: 68] [Impact Index Per Article: 3.4] [Indexed: 11/14/2022] Open
Abstract
We explored distance preferences in the arrangement of binding motifs for five transcription factors (Bicoid, Krüppel, Hunchback, Knirps and Caudal) in a large set of Drosophila cis-regulatory modules (CRMs). Analysis of non-overlapping binding motifs revealed the presence of periodic signals specific to particular combinations of binding motifs. The most striking periodic signals (10 bp for Bicoid and 11 bp for Hunchback) suggest preferential positioning of some binding site combinations on the same side of the DNA helix. We also analyzed distance preferences in arrangements of highly correlated overlapping binding motifs, such as Bicoid and Krüppel. Based on the distance analysis, we extracted preferential binding site arrangements and proposed models for potential composite elements (CEs) and antagonistic motif pairs involved in the function of developmental CRMs. Our results suggest that there are distinct hierarchical levels in the organization of transcription regulatory information. We discuss the role of the hierarchy in understanding transcriptional regulation and in detection of transcription regulatory regions in genomes.
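The kind of distance analysis described in the abstract above can be illustrated with a short sketch: collect distances between motif occurrences, then measure how strongly they concentrate at multiples of a candidate period. This is an assumption-laden toy, not the authors' procedure; the function names, the `max_dist` cutoff, the phase-sum statistic, and the example positions are all hypothetical.

```python
import cmath
from collections import Counter

def distance_histogram(pos_a, pos_b, max_dist=100):
    """Count distances between every occurrence in pos_a and every one in pos_b.

    Zero distances are skipped; passing the same list twice counts each
    pair in both directions.
    """
    hist = Counter()
    for a in pos_a:
        for b in pos_b:
            d = abs(b - a)
            if 0 < d <= max_dist:
                hist[d] += 1
    return hist

def periodicity_strength(hist, period):
    """Length of the mean phase vector: 1.0 if all distances are multiples
    of the period, smaller values if the phases are spread out."""
    total = sum(hist.values())
    if total == 0:
        return 0.0
    vec = sum(n * cmath.exp(2j * cmath.pi * d / period) for d, n in hist.items())
    return abs(vec) / total

# Hypothetical motif start positions spaced exactly 10 bp apart.
positions = [0, 10, 20, 30, 40]
hist = distance_histogram(positions, positions)
strength_10 = periodicity_strength(hist, 10)
strength_7 = periodicity_strength(hist, 7)
```

On real data the statistic would need a background comparison (for example, shuffled motif positions) before any period could be called significant.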
|
48
|
Abstract
Cis-regulatory modules (CRMs) are transcription regulatory DNA segments (approximately 1 Kb range) that control the expression of developmental genes in higher eukaryotes. We analyzed clustering of known binding motifs for transcription factors (TFs) in over 60 known CRMs from 20 Drosophila developmental genes, and we present evidence that each type of recognition motif forms significant clusters within the regulatory regions regulated by the corresponding TF. We demonstrate how a search with a single binding motif can be applied to explore gene regulatory networks and to discover coregulated genes in the genome. We also discuss the potential of the clustering method in interpreting the differential response of genes to various levels of transcriptional regulators.
|
49
|
Kalinina OV, Makeev VJ, Sutormin RA, Gelfand MS, Rakhmaninova AB. The channel in transporters is formed by residues that are rare in transmembrane helices. In Silico Biol 2003; 3:197-204. [PMID: 14524337] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/27/2023]
Abstract
Transmembrane transport is an essential component of cell life. Many genes encoding known or putative transport proteins are found in bacterial genomes. In most cases their substrate specificity has not been determined experimentally and is only approximately predicted by comparative genomic analysis. Even less is known about the 3D structure of transporters. Nevertheless, the published experimental data demonstrate that channel-forming residues determine the substrate specificity of secondary transporters, and analysis of these residues would provide a better understanding of the transport mechanism. We developed a simple computational method for identification of channel-forming residues in transporter sequences. It is based on the analysis of amino acid frequencies in bacterial secondary transporters. We applied this method to a variety of transmembrane proteins with resolved 3D structure. The predictions are in sufficiently good agreement with the real protein structure.
|
50
|
Papatsenko DA, Makeev VJ, Lifanov AP, Régnier M, Nazina AG, Desplan C. Extraction of functional binding sites from unique regulatory regions: the Drosophila early developmental enhancers. Genome Res 2002; 12:470-81. [PMID: 11875036 PMCID: PMC155290 DOI: 10.1101/gr.212502] [Citation(s) in RCA: 60] [Impact Index Per Article: 2.7] [Indexed: 11/25/2022]
Abstract
The early developmental enhancers of Drosophila melanogaster comprise one of the most sophisticated regulatory systems in higher eukaryotes. An elaborate code in their DNA sequence translates both maternal and early embryonic regulatory signals into spatial distribution of transcription factors. One of the most striking features of this code is the redundancy of binding sites for these transcription factors (BSTF). Using this redundancy, we explored the possibility of predicting functional binding sites in a single enhancer region without any prior consensus/matrix description or evolutionary sequence comparisons. We developed a conceptually simple algorithm, Scanseq, that employs an original statistical evaluation for identifying the most redundant motifs and locates the position of potential BSTF in a given regulatory region. To estimate the biological relevance of our predictions, we built thorough literature-based annotations for the best-known Drosophila developmental enhancers and we generated detailed distribution maps for the most robust binding sites. The high statistical correlation between the location of BSTF in these experiment-based maps and the location predicted in silico by Scanseq confirmed the relevance of our approach. We also discuss the definition of true binding sites and the possible biological principles that govern patterning of regulatory regions and the distribution of transcriptional signals.
|