1
Abdennur N, Abraham S, Fudenberg G, Flyamer IM, Galitsyna AA, Goloborodko A, Imakaev M, Oksuz BA, Venev SV, Xiao Y. Cooltools: Enabling high-resolution Hi-C analysis in Python. PLoS Comput Biol 2024; 20:e1012067. [PMID: 38709825 DOI: 10.1371/journal.pcbi.1012067] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/11/2023] [Accepted: 04/10/2024] [Indexed: 05/08/2024] Open
Abstract
Chromosome conformation capture (3C) technologies reveal the incredible complexity of genome organization. Maps of increasing size, depth, and resolution are now used to probe genome architecture across cell states, types, and organisms. Larger datasets add challenges at each step of computational analysis, from storage and memory constraints to researchers' time; however, analysis tools that meet these increased resource demands have not kept pace. Furthermore, existing tools offer limited support for customizing analysis for specific use cases or new biology. Here we introduce cooltools (https://github.com/open2c/cooltools), a suite of computational tools that enables flexible, scalable, and reproducible analysis of high-resolution contact frequency data. Cooltools leverages the widely adopted cooler format, which handles storage and access for high-resolution datasets. Cooltools provides a paired command line interface (CLI) and Python application programming interface (API), which respectively facilitate workflows on high-performance computing clusters and in interactive analysis environments. In short, cooltools enables the effective use of the latest and largest genome folding datasets.
Affiliation(s)
- Nezar Abdennur
  - Program in Bioinformatics and Integrative Biology, University of Massachusetts Chan Medical School, Worcester, Massachusetts, United States of America
  - Department of Systems Biology, University of Massachusetts Chan Medical School, Worcester, Massachusetts, United States of America
- Sameer Abraham
  - Institute for Medical Engineering and Sciences, Massachusetts Institute of Technology (MIT), Cambridge, Massachusetts, United States of America
- Geoffrey Fudenberg
  - Department of Computational and Quantitative Biology, University of Southern California, Los Angeles, California, United States of America
- Ilya M Flyamer
  - Friedrich Miescher Institute for Biomedical Research, Basel, Switzerland
- Aleksandra A Galitsyna
  - Institute for Medical Engineering and Sciences, Massachusetts Institute of Technology (MIT), Cambridge, Massachusetts, United States of America
- Anton Goloborodko
  - Institute of Molecular Biotechnology of the Austrian Academy of Sciences (IMBA), Vienna BioCenter (VBC), Vienna, Austria
- Maxim Imakaev
  - Institute for Medical Engineering and Sciences, Massachusetts Institute of Technology (MIT), Cambridge, Massachusetts, United States of America
- Betul A Oksuz
  - Department of Systems Biology, University of Massachusetts Chan Medical School, Worcester, Massachusetts, United States of America
- Sergey V Venev
  - Department of Systems Biology, University of Massachusetts Chan Medical School, Worcester, Massachusetts, United States of America
- Yao Xiao
  - Department of Computational and Quantitative Biology, University of Southern California, Los Angeles, California, United States of America
2
Abdennur N, Fudenberg G, Flyamer IM, Galitsyna AA, Goloborodko A, Imakaev M, Venev S. Bioframe: operations on genomic intervals in Pandas dataframes. Bioinformatics 2024; 40:btae088. [PMID: 38402507 PMCID: PMC10903647 DOI: 10.1093/bioinformatics/btae088] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/21/2023] [Revised: 01/05/2024] [Accepted: 02/22/2024] [Indexed: 02/26/2024] Open
Abstract
MOTIVATION: Genomic intervals are one of the most prevalent data structures in computational genome biology, and are used to represent features ranging from genes, to DNA binding sites, to disease variants. Operations on genomic intervals provide a language for asking questions about relationships between features. While there are excellent interval arithmetic tools for the command line, they are not smoothly integrated into Python, one of the most popular general-purpose computational and visualization environments. RESULTS: Bioframe is a library to enable flexible and performant operations on genomic interval dataframes in Python. Bioframe extends the Python data science stack to use cases for computational genome biology by building directly on top of two of the most commonly used Python libraries, NumPy and Pandas. The bioframe API enables flexible name and column orders, and decouples operations from data formats to avoid unnecessary conversions, a common scourge for bioinformaticians. Bioframe achieves these goals while maintaining high performance and a rich set of features. AVAILABILITY AND IMPLEMENTATION: Bioframe is open-source under the MIT license, cross-platform, and can be installed from the Python Package Index. The source code is maintained by Open2C on GitHub at https://github.com/open2c/bioframe.
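To make the idea of an interval operation concrete, the following is a minimal plain-Python sketch of an overlap query between two sets of features. It is not bioframe's API: the `Interval` type, the `overlap` function, and the example coordinates are hypothetical, and intervals are assumed to be half-open. A dataframe library would typically use vectorized operations rather than this brute-force loop.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    """A half-open genomic interval [start, end) on one chromosome (hypothetical type)."""
    chrom: str
    start: int
    end: int


def overlap(left, right):
    """Return every (a, b) pair, a from left and b from right, sharing at least one base.

    Brute-force O(len(left) * len(right)) illustration only.
    """
    pairs = []
    for a in left:
        for b in right:
            # Half-open intervals overlap when each starts before the other ends.
            if a.chrom == b.chrom and a.start < b.end and b.start < a.end:
                pairs.append((a, b))
    return pairs


# Made-up example features, not real annotations.
genes = [Interval("chr1", 100, 500), Interval("chr2", 50, 80)]
sites = [
    Interval("chr1", 450, 460),  # inside the chr1 gene
    Interval("chr1", 500, 510),  # touches the gene end but does not overlap
    Interval("chr2", 10, 60),    # partially overlaps the chr2 gene
]

hits = overlap(genes, sites)
for gene, site in hits:
    print(gene, "overlaps", site)
```

Under the half-open convention assumed here, an interval that starts exactly where another ends is not counted as overlapping.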
Affiliation(s)
- Nezar Abdennur
  - Department of Genomics and Computational Biology, UMass Chan Medical School, Worcester, MA 01605, United States
  - Department of Systems Biology, UMass Chan Medical School, Worcester, MA 01605, United States
- Geoffrey Fudenberg
  - Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA 90089, United States
- Ilya M Flyamer
  - Friedrich Miescher Institute for Biomedical Research, 4058 Basel, Switzerland
- Aleksandra A Galitsyna
  - Institute for Medical Engineering and Science, Massachusetts Institute of Technology, Cambridge, MA 02139, United States
- Anton Goloborodko
  - Institute of Molecular Biotechnology of the Austrian Academy of Sciences (IMBA), Vienna BioCenter (VBC), 1030 Vienna, Austria
- Maxim Imakaev
  - Institute for Medical Engineering and Science, Massachusetts Institute of Technology, Cambridge, MA 02139, United States
- Sergey Venev
  - Department of Systems Biology, UMass Chan Medical School, Worcester, MA 01605, United States
3
Gavrilov AA, Evko GS, Galitsyna AA, Ulianov SV, Kochetkova TV, Merkel AY, Tyakht AV, Razin SV. RNA-DNA interactomes of three prokaryotes uncovered by proximity ligation. Commun Biol 2023; 6:473. [PMID: 37120653 PMCID: PMC10148824 DOI: 10.1038/s42003-023-04853-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/10/2022] [Accepted: 04/19/2023] [Indexed: 05/01/2023] Open
Abstract
Proximity ligation approaches, which are widely used to study the spatial organization of the genome, also make it possible to reveal patterns of RNA-DNA interactions. Here, we use RedC, an RNA-DNA proximity ligation approach, to assess the distribution of major RNA types along the genomes of E. coli, B. subtilis, and the thermophilic archaeon T. adornatum. We find that (i) messenger RNAs preferentially interact with their cognate genes and the genes located downstream in the same operon, which is consistent with polycistronic transcription; (ii) ribosomal RNAs preferentially interact with active protein-coding genes in both bacteria and archaea, indicating co-transcriptional translation; and (iii) 6S noncoding RNA, a negative regulator of bacterial transcription, is depleted from active genes in E. coli and B. subtilis. We conclude that the RedC data provide a rich resource for studying both transcription dynamics and the function of noncoding RNAs in microbial organisms.
Affiliation(s)
- Alexey A Gavrilov
  - Institute of Gene Biology, Russian Academy of Sciences, 119334, Moscow, Russia
- Grigory S Evko
  - Institute of Gene Biology, Russian Academy of Sciences, 119334, Moscow, Russia
- Sergey V Ulianov
  - Institute of Gene Biology, Russian Academy of Sciences, 119334, Moscow, Russia
  - Faculty of Biology, Lomonosov Moscow State University, 119991, Moscow, Russia
- Tatiana V Kochetkova
  - Winogradsky Institute of Microbiology, Federal Research Center of Biotechnology, Russian Academy of Sciences, 117312, Moscow, Russia
- Alexander Y Merkel
  - Winogradsky Institute of Microbiology, Federal Research Center of Biotechnology, Russian Academy of Sciences, 117312, Moscow, Russia
- Alexander V Tyakht
  - Institute of Gene Biology, Russian Academy of Sciences, 119334, Moscow, Russia
- Sergey V Razin
  - Institute of Gene Biology, Russian Academy of Sciences, 119334, Moscow, Russia
  - Faculty of Biology, Lomonosov Moscow State University, 119991, Moscow, Russia
4
Kobets VA, Ulianov SV, Galitsyna AA, Doronin SA, Mikhaleva EA, Gelfand MS, Shevelyov YY, Razin SV, Khrameeva EE. HiConfidence: a novel approach uncovering the biological signal in Hi-C data affected by technical biases. Brief Bioinform 2023; 24:7033301. [PMID: 36759336 PMCID: PMC10025441 DOI: 10.1093/bib/bbad044] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/12/2022] [Revised: 01/04/2023] [Accepted: 01/20/2023] [Indexed: 02/11/2023] Open
Abstract
Chromatin interaction assays, particularly Hi-C, enable detailed studies of genome architecture in multiple organisms and model systems, resulting in a deeper understanding of gene expression regulation mechanisms mediated by epigenetics. However, the analysis and interpretation of Hi-C data remain challenging due to technical biases, limiting direct comparisons of datasets obtained in different experiments and laboratories. As a result, removing biases from Hi-C-generated chromatin contact matrices is a critical data analysis step. Our novel approach, HiConfidence, eliminates biases from the Hi-C data by weighting chromatin contacts according to their consistency between replicates so that low-quality replicates do not substantially influence the result. The algorithm is effective for the analysis of global changes in chromatin structures such as compartments and topologically associating domains. We apply the HiConfidence approach to several Hi-C datasets with significant technical biases that could not be analyzed effectively using existing methods, and obtain meaningful biological conclusions. In particular, HiConfidence aids in the study of how changes in histone acetylation pattern affect chromatin organization in Drosophila melanogaster S2 cells. The method is freely available at GitHub: https://github.com/victorykobets/HiConfidence.
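The abstract does not state the weighting formula, so the following is only a toy sketch of the general idea of down-weighting contacts that disagree between replicates. The `confidence` function (one minus the relative difference of two counts), the `weighted_mean_contact` helper, and the numbers are all assumptions made for illustration. They are not taken from HiConfidence.

```python
def confidence(a, b):
    """Hypothetical consistency score in [0, 1] for two non-negative replicate counts.

    1.0 when the replicates agree exactly, 0.0 when only one of them has signal.
    This formula is an assumption for illustration, not the HiConfidence weighting.
    """
    total = a + b
    if total == 0:
        return 0.0  # no signal in either replicate, so no confidence
    return 1.0 - abs(a - b) / total


def weighted_mean_contact(rep1, rep2):
    """Confidence-weighted mean of the per-pixel replicate averages."""
    num = 0.0
    den = 0.0
    for a, b in zip(rep1, rep2):
        w = confidence(a, b)
        num += w * (a + b) / 2
        den += w
    return num / den if den > 0 else 0.0


# Toy contact counts for the same four pixels in two replicates (made-up numbers).
rep1 = [10, 20, 30, 100]
rep2 = [10, 20, 30, 0]  # the last pixel disagrees strongly between replicates

plain = sum((a + b) / 2 for a, b in zip(rep1, rep2)) / len(rep1)
weighted = weighted_mean_contact(rep1, rep2)
print(plain, weighted)
```

In this toy case the inconsistent pixel receives zero weight, so the weighted summary is driven only by the three pixels on which the replicates agree.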
Affiliation(s)
- Victoria A Kobets
  - Skolkovo Institute of Science and Technology, Moscow, 121205, Russia
- Sergey V Ulianov
  - Institute of Gene Biology, Russian Academy of Sciences, Moscow, 119334, Russia
  - Faculty of Biology, M.V. Lomonosov Moscow State University, Moscow, 119992, Russia
- Aleksandra A Galitsyna
  - Skolkovo Institute of Science and Technology, Moscow, 121205, Russia
  - Institute of Gene Biology, Russian Academy of Sciences, Moscow, 119334, Russia
  - A.A. Kharkevich Institute for Information Transmission Problems, Russian Academy of Sciences, Moscow, 127051, Russia
- Semen A Doronin
  - Institute of Molecular Genetics of National Research Centre "Kurchatov Institute", Moscow, 123182, Russia
- Elena A Mikhaleva
  - Institute of Molecular Genetics of National Research Centre "Kurchatov Institute", Moscow, 123182, Russia
- Mikhail S Gelfand
  - Skolkovo Institute of Science and Technology, Moscow, 121205, Russia
  - A.A. Kharkevich Institute for Information Transmission Problems, Russian Academy of Sciences, Moscow, 127051, Russia
- Yuri Y Shevelyov
  - Institute of Molecular Genetics of National Research Centre "Kurchatov Institute", Moscow, 123182, Russia
- Sergey V Razin
  - Institute of Gene Biology, Russian Academy of Sciences, Moscow, 119334, Russia
  - Faculty of Biology, M.V. Lomonosov Moscow State University, Moscow, 119992, Russia
5
Abdennur N, Fudenberg G, Flyamer IM, Galitsyna AA, Goloborodko A, Imakaev M, Venev SV. Pairtools: from sequencing data to chromosome contacts. bioRxiv 2023:2023.02.13.528389. [PMID: 36824968 PMCID: PMC9949071 DOI: 10.1101/2023.02.13.528389] [Citation(s) in RCA: 10] [Impact Index Per Article: 10.0] [Indexed: 02/17/2023]
Abstract
The field of 3D genome organization produces large amounts of sequencing data from Hi-C and a rapidly-expanding set of other chromosome conformation protocols (3C+). Massive and heterogeneous 3C+ data require high-performance and flexible processing of sequenced reads into contact pairs. To meet these challenges, we present pairtools - a flexible suite of tools for contact extraction from sequencing data. Pairtools provides modular command-line interface (CLI) tools that can be flexibly chained into data processing pipelines. Pairtools provides both crucial core tools as well as auxiliary tools for building feature-rich 3C+ pipelines, including contact pair manipulation, filtration, and quality control. Benchmarking pairtools against popular 3C+ data pipelines shows advantages of pairtools for high-performance and flexible 3C+ analysis. Finally, pairtools provides protocol-specific tools for multi-way contacts, haplotype-resolved contacts, and single-cell Hi-C. The combination of CLI tools and tight integration with Python data analysis libraries makes pairtools a versatile foundation for a broad range of 3C+ pipelines.
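The chaining of modular steps described above can be illustrated with Python generators standing in for command-line tools. This is a sketch under stated assumptions, not the pairtools CLI or API: the functions `parse_pairs`, `filter_by_mapq`, and `deduplicate`, the five-column record layout, and the input lines are all hypothetical simplifications.

```python
def parse_pairs(lines):
    """Yield (chrom1, pos1, chrom2, pos2, mapq) from tab-separated text.

    Lines starting with '#' are treated as headers and skipped.
    The five-column layout is a simplification assumed for this sketch.
    """
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        c1, p1, c2, p2, mapq = line.rstrip("\n").split("\t")
        yield c1, int(p1), c2, int(p2), int(mapq)


def filter_by_mapq(pairs, min_mapq):
    """Keep only pairs whose mapping quality is at least min_mapq."""
    for pair in pairs:
        if pair[4] >= min_mapq:
            yield pair


def deduplicate(pairs):
    """Drop exact positional duplicates, keeping the first occurrence."""
    seen = set()
    for pair in pairs:
        key = pair[:4]
        if key not in seen:
            seen.add(key)
            yield pair


# Made-up input lines, not real sequencing output.
raw = [
    "# hypothetical header line",
    "chr1\t100\tchr1\t5000\t60",
    "chr1\t100\tchr1\t5000\t60",  # exact duplicate of the previous pair
    "chr1\t200\tchr2\t300\t5",    # low mapping quality
    "chr2\t10\tchr2\t900\t30",
]

# Each stage consumes the previous one lazily, like tools connected by pipes.
contacts = list(deduplicate(filter_by_mapq(parse_pairs(raw), min_mapq=30)))
print(contacts)
```

Because every stage is a generator, records stream through one at a time, so stages can be reordered or swapped without changing the others.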
Affiliation(s)
- Open2C
  - https://open2c.github.io/
- Nezar Abdennur
  - Program in Bioinformatics and Integrative Biology, University of Massachusetts Chan Medical School, Worcester, MA, 01605, USA
  - Department of Systems Biology, University of Massachusetts Chan Medical School, Worcester, MA, 01605, USA
- Geoffrey Fudenberg
  - Department of Computational and Quantitative Biology, University of Southern California, Los Angeles, CA, USA
- Ilya M. Flyamer
  - Friedrich Miescher Institute for Biomedical Research, Maulbeerstrasse 66, CH-4058 Basel, Switzerland
- Aleksandra A. Galitsyna
  - Institute for Medical Engineering and Sciences, Massachusetts Institute of Technology (MIT), Cambridge, MA, 02139, USA
  - Institute of Molecular Biotechnology of the Austrian Academy of Sciences (IMBA), Vienna BioCenter (VBC), Dr. Bohr-Gasse 3, 1030 Vienna, Austria
- Anton Goloborodko
  - Institute of Molecular Biotechnology of the Austrian Academy of Sciences (IMBA), Vienna BioCenter (VBC), Dr. Bohr-Gasse 3, 1030 Vienna, Austria
- Maxim Imakaev
  - Institute for Medical Engineering and Sciences, Massachusetts Institute of Technology (MIT), Cambridge, MA, 02139, USA
- Sergey V. Venev
  - Department of Systems Biology, University of Massachusetts Chan Medical School, Worcester, MA, 01605, USA
6
Abstract
Over the past decade, genome-wide assays for chromatin interactions in single cells have enabled the study of individual nuclei at unprecedented resolution and throughput. Current chromosome conformation capture techniques survey contacts for up to tens of thousands of individual cells, improving our understanding of genome function in 3D. However, these methods recover a small fraction of all contacts in single cells, requiring specialised processing of sparse interactome data. In this review, we highlight recent advances in methods for the interpretation of single-cell genomic contacts. After discussing the strengths and limitations of these methods, we outline frontiers for future development in this rapidly moving field.
Affiliation(s)
- Aleksandra A Galitsyna
  - Skolkovo Institute of Science and Technology, Skolkovo, Russia
  - Institute for Information Transmission Problems, RAS, Moscow, Russia
  - Institute of Gene Biology, RAS, Moscow, Russia
- Mikhail S Gelfand
  - Skolkovo Institute of Science and Technology, Skolkovo, Russia
  - Institute for Information Transmission Problems, RAS, Moscow, Russia
7
Rozenwald MB, Galitsyna AA, Sapunov GV, Khrameeva EE, Gelfand MS. A machine learning framework for the prediction of chromatin folding in Drosophila using epigenetic features. PeerJ Comput Sci 2020; 6:e307. [PMID: 33816958 PMCID: PMC7924456 DOI: 10.7717/peerj-cs.307] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 08/28/2020] [Accepted: 09/30/2020] [Indexed: 05/03/2023]
Abstract
Technological advances have led to the creation of large epigenetic datasets, including information about DNA binding proteins and DNA spatial structure. Hi-C experiments have revealed that chromosomes are subdivided into sets of self-interacting domains called Topologically Associating Domains (TADs). TADs are involved in the regulation of gene expression activity, but the mechanisms of their formation are not yet fully understood. Here, we focus on machine learning methods to characterize DNA folding patterns in Drosophila based on chromatin marks across three cell lines. We present linear regression models with four types of regularization, gradient boosting, and recurrent neural networks (RNN) as tools to study chromatin folding characteristics associated with TADs given epigenetic chromatin immunoprecipitation data. The bidirectional long short-term memory RNN architecture produced the best prediction scores and identified biologically relevant features. The distributions of the protein Chriz (Chromator) and the histone modification H3K4me3 were selected as the most informative features for the prediction of TAD characteristics. This approach may be adapted to any similar biological dataset of chromatin features across various cell lines and species. The code for the implemented pipeline, Hi-ChIP-ML, is publicly available: https://github.com/MichalRozenwald/Hi-ChIP-ML.
Affiliation(s)
- Michal B. Rozenwald
  - Faculty of Computer Science, National Research University Higher School of Economics, Moscow, Russia
- Grigory V. Sapunov
  - Faculty of Computer Science, National Research University Higher School of Economics, Moscow, Russia
  - Intento, Inc., Berkeley, CA, USA
- Mikhail S. Gelfand
  - Skolkovo Institute of Science and Technology, Moscow, Russia
  - A.A. Kharkevich Institute for Information Transmission Problems, RAS, Moscow, Russia
8
Gavrilov AA, Zharikova AA, Galitsyna AA, Luzhin A, Rubanova NM, Golov AK, Petrova NV, Logacheva M, Kantidze OL, Ulianov SV, Magnitov MD, Mironov AA, Razin SV. Studying RNA-DNA interactome by Red-C identifies noncoding RNAs associated with various chromatin types and reveals transcription dynamics. Nucleic Acids Res 2020; 48:6699-6714. [PMID: 32479626 PMCID: PMC7337940 DOI: 10.1093/nar/gkaa457] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Received: 02/10/2020] [Revised: 05/13/2020] [Accepted: 05/18/2020] [Indexed: 12/15/2022] Open
Abstract
Non-coding RNAs (ncRNAs) participate in various biological processes, including regulating transcription and sustaining genome 3D organization. Here, we present a method termed Red-C that exploits proximity ligation to identify contacts with the genome for all RNA molecules present in the nucleus. Using Red-C, we uncovered the RNA-DNA interactome of human K562 cells and identified hundreds of ncRNAs enriched in active or repressed chromatin, including previously undescribed RNAs. Analysis of the RNA-DNA interactome also allowed us to trace the kinetics of messenger RNA production. Our data support the model of co-transcriptional intron splicing, but not the hypothesis of the circularization of actively transcribed genes.
Affiliation(s)
- Alexey A Gavrilov
  - Institute of Gene Biology, Russian Academy of Sciences, Moscow, Russia
  - Center for Precision Genome Editing and Genetic Technologies for Biomedicine, Institute of Gene Biology, Russian Academy of Sciences, Moscow, Russia
- Anastasiya A Zharikova
  - Institute of Gene Biology, Russian Academy of Sciences, Moscow, Russia
  - Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Moscow, Russia
  - Institute for Information Transmission Problems, Russian Academy of Sciences, Moscow, Russia
  - National Medical Research Center for Preventive Medicine, Ministry of Healthcare of the Russian Federation, Moscow, Russia
- Aleksandra A Galitsyna
  - Institute of Gene Biology, Russian Academy of Sciences, Moscow, Russia
  - Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Moscow, Russia
  - Institute for Information Transmission Problems, Russian Academy of Sciences, Moscow, Russia
  - Skolkovo Institute of Science and Technology, Skolkovo, Russia
- Artem V Luzhin
  - Institute of Gene Biology, Russian Academy of Sciences, Moscow, Russia
  - Center for Precision Genome Editing and Genetic Technologies for Biomedicine, Institute of Gene Biology, Russian Academy of Sciences, Moscow, Russia
- Arkadiy K Golov
  - Institute of Gene Biology, Russian Academy of Sciences, Moscow, Russia
  - Mental Health Research Center, Moscow, Russia
- Omar L Kantidze
  - Institute of Gene Biology, Russian Academy of Sciences, Moscow, Russia
- Sergey V Ulianov
  - Institute of Gene Biology, Russian Academy of Sciences, Moscow, Russia
  - Faculty of Biology, Lomonosov Moscow State University, Moscow, Russia
- Mikhail D Magnitov
  - Institute of Gene Biology, Russian Academy of Sciences, Moscow, Russia
  - Center for Precision Genome Editing and Genetic Technologies for Biomedicine, Institute of Gene Biology, Russian Academy of Sciences, Moscow, Russia
- Andrey A Mironov
  - Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Moscow, Russia
  - Institute for Information Transmission Problems, Russian Academy of Sciences, Moscow, Russia
  - Faculty of Computer Science, Higher School of Economics, Moscow, Russia
- Sergey V Razin
  - To whom correspondence should be addressed. Tel: +7 499 135 3092; Fax: +7 499 135 4105;
9
Ulianov SV, Galitsyna AA, Flyamer IM, Golov AK, Khrameeva EE, Imakaev MV, Abdennur NA, Gelfand MS, Gavrilov AA, Razin SV. Activation of the alpha-globin gene expression correlates with dramatic upregulation of nearby non-globin genes and changes in local and large-scale chromatin spatial structure. Epigenetics Chromatin 2017; 10:35. [PMID: 28693562 PMCID: PMC5504709 DOI: 10.1186/s13072-017-0142-4] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Received: 12/29/2016] [Accepted: 07/03/2017] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: In homeotherms, the alpha-globin gene clusters are located within permanently open genome regions enriched in housekeeping genes. Terminal erythroid differentiation results in dramatic upregulation of alpha-globin genes, making their expression comparable to the rRNA transcriptional output. Little is known about the influence of the erythroid-specific alpha-globin gene transcription outburst on adjacent, widely expressed genes and large-scale chromatin organization. Here, we have analyzed the total transcription output, the overall chromatin contact profile, and CTCF binding within the 2.7 Mb segment of chicken chromosome 14 harboring the alpha-globin gene cluster in cultured lymphoid cells and cultured erythroid cells before and after induction of terminal erythroid differentiation. RESULTS: We found that, similarly to mammalian genomes, the chicken genome is organized in TADs and compartments. Full activation of the alpha-globin gene transcription in differentiated erythroid cells is correlated with upregulation of several adjacent housekeeping genes and the emergence of abundant intergenic transcription. An extended chromosome region encompassing the alpha-globin cluster becomes significantly decompacted in differentiated erythroid cells, and depleted in CTCF binding and CTCF-anchored chromatin loops, while the sub-TAD harboring the alpha-globin gene cluster and the upstream major regulatory element (MRE) becomes highly enriched with chromatin interactions as compared to lymphoid and proliferating erythroid cells. The alpha-globin gene domain and the neighboring loci reside within the A-like chromatin compartment in both lymphoid and erythroid cells and become further segregated from the upstream gene desert upon terminal erythroid differentiation.
CONCLUSIONS: Our findings demonstrate that the effects of tissue-specific transcription activation are not restricted to the host genomic locus but affect the overall chromatin structure and transcriptional output of the encompassing topologically associating domain.
Affiliation(s)
- Sergey V Ulianov
  - Institute of Gene Biology of the Russian Academy of Sciences, Moscow, Russia 119334
  - Faculty of Biology, M.V. Lomonosov Moscow State University, Moscow, Russia 119992
- Aleksandra A Galitsyna
  - Institute of Gene Biology of the Russian Academy of Sciences, Moscow, Russia 119334
  - Faculty of Bioengineering and Bioinformatics, M.V. Lomonosov Moscow State University, Moscow, Russia 119992
  - Institute for Information Transmission Problems (the Kharkevich Institute) of the Russian Academy of Sciences, Moscow, Russia 127051
- Ilya M Flyamer
  - Institute of Gene Biology of the Russian Academy of Sciences, Moscow, Russia 119334
  - Faculty of Biology, M.V. Lomonosov Moscow State University, Moscow, Russia 119992
  - MRC Human Genetics Unit, Institute of Genetics and Molecular Medicine, University of Edinburgh, Edinburgh, UK
- Arkadiy K Golov
  - Institute of Gene Biology of the Russian Academy of Sciences, Moscow, Russia 119334
- Ekaterina E Khrameeva
  - Skolkovo Institute of Science and Technology, Skolkovo, Russia 143026
  - Institute for Information Transmission Problems (the Kharkevich Institute) of the Russian Academy of Sciences, Moscow, Russia 127051
- Maxim V Imakaev
  - Department of Physics, Massachusetts Institute of Technology, Cambridge, MA 02139 USA
- Nezar A Abdennur
  - Computational and Systems Biology Graduate Program, Massachusetts Institute of Technology, Cambridge, MA USA
- Mikhail S Gelfand
  - Faculty of Bioengineering and Bioinformatics, M.V. Lomonosov Moscow State University, Moscow, Russia 119992
  - Skolkovo Institute of Science and Technology, Skolkovo, Russia 143026
  - Institute for Information Transmission Problems (the Kharkevich Institute) of the Russian Academy of Sciences, Moscow, Russia 127051
  - Faculty of Computer Science, Higher School of Economics, Moscow, Russia 125319
- Alexey A Gavrilov
  - Institute of Gene Biology of the Russian Academy of Sciences, Moscow, Russia 119334
- Sergey V Razin
  - Institute of Gene Biology of the Russian Academy of Sciences, Moscow, Russia 119334
  - Faculty of Biology, M.V. Lomonosov Moscow State University, Moscow, Russia 119992
10
Kovina AP, Petrova NV, Gushchanskaya ES, Dolgushin KV, Gerasimov ES, Galitsyna AA, Penin AA, Flyamer IM, Ioudinkova ES, Gavrilov AA, Vassetzky YS, Ulianov SV, Iarovaia OV, Razin SV. Evolution of the Genome 3D Organization: Comparison of Fused and Segregated Globin Gene Clusters. Mol Biol Evol 2017; 34:1492-1504. [DOI: 10.1093/molbev/msx100] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Indexed: 02/07/2023] Open
11
Popov YV, Galitsyna AA, Alexeevski AV, Karyagina AS, Spirin SA. StructAlign, a Program for Alignment of Structures of DNA-Protein Complexes. Biochemistry (Mosc) 2016; 80:1465-8. [PMID: 26615437 DOI: 10.1134/s0006297915110073] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/22/2022]
Abstract
Comparative analysis of structures of complexes of homologous proteins with DNA is important in the analysis of DNA-protein recognition. Alignment is a necessary stage of the analysis. An alignment is a matching of amino acid residues and nucleotides of one complex to residues and nucleotides of the other. Currently, there are no programs available for aligning structures of DNA-protein complexes. We present the program StructAlign, which should fill this gap. The program inputs a pair of complexes of DNA double helix with proteins and outputs an alignment of DNA chains corresponding to the best spatial fit of the protein chains.
Affiliation(s)
- Ya V Popov
  - Lomonosov Moscow State University, Faculty of Bioengineering and Bioinformatics, Moscow, 119991, Russia