1
|
Kadlof M, Banecki K, Chiliński M, Plewczynski D. Chromatin Image-driven modelling. Methods 2024; 226:S1046-2023(24)00090-2. [PMID: 38636797 DOI: 10.1016/j.ymeth.2024.04.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/07/2023] [Revised: 03/13/2024] [Accepted: 04/05/2024] [Indexed: 04/20/2024] Open
Abstract
The challenge of modeling the spatial conformation of chromatin remains an open problem. While multiple data-driven approaches have been proposed, each has limitations. This work introduces two image-driven modeling methods based on the Molecular Dynamics Flexible Fitting (MDFF) approach: the force method and the correlational method. Both methods have already been used successfully in protein modeling. We propose a novel way to employ them for building chromatin models directly from 3D images. This approach is termed image-driven modeling. Additionally, we introduce the initial structure generator, a tool designed to generate optimal starting structures for the proposed algorithms. The methods are versatile and can be applied to various data types, with minor modifications to accommodate new generation imaging techniques.
Collapse
Affiliation(s)
- Michał Kadlof
- Faculty of Mathematics and Information Science, Warsaw University of Technology, Warsaw, Poland.
| | - Krzysztof Banecki
- Faculty of Mathematics and Information Science, Warsaw University of Technology, Warsaw, Poland; Centre of New Technologies, University of Warsaw, Warsaw, Poland
| | - Mateusz Chiliński
- Faculty of Mathematics and Information Science, Warsaw University of Technology, Warsaw, Poland; Centre of New Technologies, University of Warsaw, Warsaw, Poland; Section for Computational and RNA Biology, Department of Biology, University of Copenhagen, Copenhagen, Denmark
| | - Dariusz Plewczynski
- Faculty of Mathematics and Information Science, Warsaw University of Technology, Warsaw, Poland; Centre of New Technologies, University of Warsaw, Warsaw, Poland
| |
Collapse
|
2
|
Wlasnowolski M, Grabowski P, Roszczyk D, Kaczmarski K, Plewczynski D. cudaMMC: GPU-enhanced multiscale Monte Carlo chromatin 3D modelling. Bioinformatics 2023; 39:btad588. [PMID: 37774005 PMCID: PMC10568367 DOI: 10.1093/bioinformatics/btad588] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/06/2023] [Revised: 08/14/2023] [Accepted: 09/28/2023] [Indexed: 10/01/2023] Open
Abstract
MOTIVATION Investigating the 3D structure of chromatin provides new insights into transcriptional regulation. With the evolution of 3C next-generation sequencing methods like ChiA-PET and Hi-C, the surge in data volume has highlighted the need for more efficient chromatin spatial modelling algorithms. This study introduces the cudaMMC method, based on the Simulated Annealing Monte Carlo approach and enhanced by GPU-accelerated computing, to efficiently generate ensembles of chromatin 3D structures. RESULTS The cudaMMC calculations demonstrate significantly faster performance with better stability compared to our previous method on the same workstation. cudaMMC also substantially reduces the computation time required for generating ensembles of large chromatin models, making it an invaluable tool for studying chromatin spatial conformation. AVAILABILITY AND IMPLEMENTATION Open-source software and manual and sample data are freely available on https://github.com/SFGLab/cudaMMC.
Collapse
Affiliation(s)
- Michal Wlasnowolski
- Laboratory of Bioinformatics and Computational Genomics, Faculty of Mathematics and Information Science, Warsaw University of Technology, Warsaw 00-662, Poland
- Laboratory of Functional and Structural Genomics, Centre of New Technologies, University of Warsaw, Warsaw 02-097, Poland
| | - Pawel Grabowski
- Department of Information Processing Systems, Faculty of Mathematics and Information Science, Warsaw University of Technology, Warsaw 00-662, Poland
| | - Damian Roszczyk
- Department of Information Processing Systems, Faculty of Mathematics and Information Science, Warsaw University of Technology, Warsaw 00-662, Poland
| | - Krzysztof Kaczmarski
- Department of Information Processing Systems, Faculty of Mathematics and Information Science, Warsaw University of Technology, Warsaw 00-662, Poland
| | - Dariusz Plewczynski
- Laboratory of Bioinformatics and Computational Genomics, Faculty of Mathematics and Information Science, Warsaw University of Technology, Warsaw 00-662, Poland
- Laboratory of Functional and Structural Genomics, Centre of New Technologies, University of Warsaw, Warsaw 02-097, Poland
| |
Collapse
|
3
|
Li N, Meng G, Yang C, Li H, Liu L, Wu Y, Liu B. Changes in epigenetic information during the occurrence and development of gastric cancer. Int J Biochem Cell Biol 2022; 153:106315. [DOI: 10.1016/j.biocel.2022.106315] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/19/2022] [Revised: 09/22/2022] [Accepted: 10/18/2022] [Indexed: 11/24/2022]
|
4
|
Das P, Shen T, McCord RP. Characterizing the variation in chromosome structure ensembles in the context of the nuclear microenvironment. PLoS Comput Biol 2022; 18:e1010392. [PMID: 35969616 PMCID: PMC9410561 DOI: 10.1371/journal.pcbi.1010392] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/18/2022] [Revised: 08/25/2022] [Accepted: 07/15/2022] [Indexed: 11/23/2022] Open
Abstract
Inside the nucleus, chromosomes are subjected to direct physical interaction between different components, active forces, and thermal noise, leading to the formation of an ensemble of three-dimensional structures. However, it is still not well understood to what extent and how the structural ensemble varies from one chromosome region or cell-type to another. We designed a statistical analysis technique and applied it to single-cell chromosome imaging data to reveal the heterogeneity of individual chromosome structures. By analyzing the resulting structural landscape, we find that the largest dynamic variation is the overall radius of gyration of the chromatin region, followed by domain reorganization within the region. By comparing different human cell-lines and experimental perturbation data using this statistical analysis technique and a network-based similarity quantification approach, we identify both cell-type and condition-specific features of the structural landscapes. We identify a relationship between epigenetic state and the properties of chromosome structure fluctuation and validate this relationship through polymer simulations. Overall, our study suggests that the types of variation in a chromosome structure ensemble are cell-type as well as region-specific and can be attributed to constraints placed on the structure by factors such as variation in epigenetic state. Recent work has revealed principles of how chromosomes are folded and structured inside the human nucleus. It is now even possible to microscopically trace the path of chromosomes in 3D in individual cells. With this data, we can start to examine how much variation exists in chromosome structure and what biological factors may restrict or enhance this variation. Are chromosomes stuck in just a few possible positions or do they move around more freely, sampling many configurations? Here, we use a mathematical approach to compare chromosome structure variation in different cell types, at different locations along the genome, and when key structural proteins are removed. Through these comparisons and dynamic simulations of chromosome behavior, we identify factors that may constrain or promote variation in chromosome structure.
Collapse
Affiliation(s)
- Priyojit Das
- UT-ORNL Graduate School of Genome Science and Technology, University of Tennessee, Knoxville, Tennessee, United States of America
| | - Tongye Shen
- UT-ORNL Graduate School of Genome Science and Technology, University of Tennessee, Knoxville, Tennessee, United States of America
- Biochemistry & Cellular and Molecular Biology, University of Tennessee, Knoxville, Tennessee, United States of America
| | - Rachel Patton McCord
- UT-ORNL Graduate School of Genome Science and Technology, University of Tennessee, Knoxville, Tennessee, United States of America
- Biochemistry & Cellular and Molecular Biology, University of Tennessee, Knoxville, Tennessee, United States of America
- * E-mail:
| |
Collapse
|
5
|
Madsen-Østerbye J, Bellanger A, Galigniana NM, Collas P. Biology and Model Predictions of the Dynamics and Heterogeneity of Chromatin-Nuclear Lamina Interactions. Front Cell Dev Biol 2022; 10:913458. [PMID: 35693945 PMCID: PMC9178083 DOI: 10.3389/fcell.2022.913458] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/06/2022] [Accepted: 05/12/2022] [Indexed: 11/13/2022] Open
Abstract
Associations of chromatin with the nuclear lamina, at the nuclear periphery, help shape the genome in 3 dimensions. The genomic landscape of lamina-associated domains (LADs) is well characterized, but much remains unknown on the physical and mechanistic properties of chromatin conformation at the nuclear lamina. Computational models of chromatin folding at, and interactions with, a surface representing the nuclear lamina are emerging in attempts to characterize these properties and predict chromatin behavior at the lamina in health and disease. Here, we highlight the heterogeneous nature of the nuclear lamina and LADs, outline the main 3-dimensional chromatin structural modeling methods, review applications of modeling chromatin-lamina interactions and discuss biological insights inferred from these models in normal and disease states. Lastly, we address perspectives on future developments in modeling chromatin interactions with the nuclear lamina.
Collapse
Affiliation(s)
- Julia Madsen-Østerbye
- Department of Molecular Medicine, Institute of Basic Medical Sciences, Faculty of Medicine, University of Oslo, Oslo, Norway
| | - Aurélie Bellanger
- Department of Molecular Medicine, Institute of Basic Medical Sciences, Faculty of Medicine, University of Oslo, Oslo, Norway
| | - Natalia M. Galigniana
- Department of Molecular Medicine, Institute of Basic Medical Sciences, Faculty of Medicine, University of Oslo, Oslo, Norway
- Department of Immunology and Transfusion Medicine, Oslo University Hospital, Oslo, Norway
| | - Philippe Collas
- Department of Molecular Medicine, Institute of Basic Medical Sciences, Faculty of Medicine, University of Oslo, Oslo, Norway
- Department of Immunology and Transfusion Medicine, Oslo University Hospital, Oslo, Norway
- *Correspondence: Philippe Collas,
| |
Collapse
|
6
|
3DGenBench: a web-server to benchmark computational models for 3D Genomics. Nucleic Acids Res 2022; 50:W4-W12. [PMID: 35639501 PMCID: PMC9252746 DOI: 10.1093/nar/gkac396] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/31/2022] [Revised: 04/26/2022] [Accepted: 05/23/2022] [Indexed: 11/13/2022] Open
Abstract
Modeling 3D genome organisation has been booming in the last years thanks to the availability of experimental datasets of genomic contacts. However, the field is currently missing the standardisation of methods and metrics to compare predictions and experiments. We present 3DGenBench, a web server available at https://inc-cost.eu/benchmarking/, that allows benchmarking computational models of 3D Genomics. The benchmark is performed using a manually curated dataset of 39 capture Hi-C profiles in wild type and genome-edited mouse cells, and five genome-wide Hi-C profiles in human, mouse, and Drosophila cells. 3DGenBench performs two kinds of analysis, each supplied with a specific scoring module that compares predictions of a computational method to experimental data using several metrics. With 3DGenBench, the user obtains model performance scores, allowing an unbiased comparison with other models. 3DGenBench aims to become a reference web server to test new 3D genomics models and is conceived as an evolving platform where new types of analysis will be implemented in the future.
Collapse
|
7
|
Quan C, Ping J, Lu H, Zhou G, Lu Y. 3DSNP 2.0: update and expansion of the noncoding genomic variant annotation database. Nucleic Acids Res 2022; 50:D950-D955. [PMID: 34723317 PMCID: PMC8728236 DOI: 10.1093/nar/gkab1008] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/15/2021] [Revised: 10/11/2021] [Accepted: 10/12/2021] [Indexed: 12/30/2022] Open
Abstract
The rapid development of single-molecule long-read sequencing (LRS) and single-cell assay for transposase accessible chromatin sequencing (scATAC-seq) technologies presents both challenges and opportunities for the annotation of noncoding variants. Here, we updated 3DSNP, a comprehensive database for human noncoding variant annotation, to expand its applications to structural variation (SV) and to implement variant annotation down to single-cell resolution. The updates of 3DSNP include (i) annotation of 108 317 SVs from a full spectrum of functions, especially their potential effects on three-dimensional chromatin structures, (ii) evaluation of the accessible chromatin peaks flanking the variants across 126 cell types/subtypes in 15 human fetal tissues and 54 cell types/subtypes in 25 human adult tissues by integrating scATAC-seq data and (iii) expansion of Hi-C data to 49 human cell types. In summary, this version is a significant and comprehensive improvement over the previous version. The 3DSNP v2.0 database is freely available at https://omic.tech/3dsnpv2/.
Collapse
Affiliation(s)
- Cheng Quan
- Beijing Institute of Radiation Medicine, State Key Laboratory of Proteomics, Beijing 100850, China
| | - Jie Ping
- Beijing Institute of Radiation Medicine, State Key Laboratory of Proteomics, Beijing 100850, China
| | - Hao Lu
- Beijing Institute of Radiation Medicine, State Key Laboratory of Proteomics, Beijing 100850, China
| | - Gangqiao Zhou
- Beijing Institute of Radiation Medicine, State Key Laboratory of Proteomics, Beijing 100850, China
| | - Yiming Lu
- Beijing Institute of Radiation Medicine, State Key Laboratory of Proteomics, Beijing 100850, China
| |
Collapse
|
8
|
Orozco G, Schoenfelder S, Walker N, Eyre S, Fraser P. 3D genome organization links non-coding disease-associated variants to genes. Front Cell Dev Biol 2022; 10:995388. [PMID: 36340032 PMCID: PMC9631826 DOI: 10.3389/fcell.2022.995388] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/15/2022] [Accepted: 09/27/2022] [Indexed: 11/13/2022] Open
Abstract
Genome sequencing has revealed over 300 million genetic variations in human populations. Over 90% of variants are single nucleotide polymorphisms (SNPs), the remainder include short deletions or insertions, and small numbers of structural variants. Hundreds of thousands of these variants have been associated with specific phenotypic traits and diseases through genome wide association studies which link significant differences in variant frequencies with specific phenotypes among large groups of individuals. Only 5% of disease-associated SNPs are located in gene coding sequences, with the potential to disrupt gene expression or alter of the function of encoded proteins. The remaining 95% of disease-associated SNPs are located in non-coding DNA sequences which make up 98% of the genome. The role of non-coding, disease-associated SNPs, many of which are located at considerable distances from any gene, was at first a mystery until the discovery that gene promoters regularly interact with distal regulatory elements to control gene expression. Disease-associated SNPs are enriched at the millions of gene regulatory elements that are dispersed throughout the non-coding sequences of the genome, suggesting they function as gene regulation variants. Assigning specific regulatory elements to the genes they control is not straightforward since they can be millions of base pairs apart. In this review we describe how understanding 3D genome organization can identify specific interactions between gene promoters and distal regulatory elements and how 3D genomics can link disease-associated SNPs to their target genes. Understanding which gene or genes contribute to a specific disease is the first step in designing rational therapeutic interventions.
Collapse
Affiliation(s)
- Gisela Orozco
- Centre for Genetics and Genomics Versus Arthritis, Division of Musculoskeletal and Dermatological Sciences, School of Biological Sciences, Faculty of Biology, Medicine and Health, The University of Manchester, Manchester Academic Health Science Centre, Manchester, United Kingdom.,NIHR Manchester Biomedical Research Centre, Manchester University Foundation Trust, Manchester, United Kingdom
| | - Stefan Schoenfelder
- Enhanc3D Genomics Ltd., Cambridge, United Kingdom.,Epigenetics Programme, The Babraham Institute, Babraham Research Campus, CB22 3AT Cambridge, Cambridge, United Kingdom
| | | | - Stephan Eyre
- Centre for Genetics and Genomics Versus Arthritis, Division of Musculoskeletal and Dermatological Sciences, School of Biological Sciences, Faculty of Biology, Medicine and Health, The University of Manchester, Manchester Academic Health Science Centre, Manchester, United Kingdom.,NIHR Manchester Biomedical Research Centre, Manchester University Foundation Trust, Manchester, United Kingdom
| | - Peter Fraser
- Enhanc3D Genomics Ltd., Cambridge, United Kingdom.,Department of Biological Science, Florida State University, Tallahassee, FL, United States
| |
Collapse
|
9
|
Chiliński M, Sengupta K, Plewczynski D. From DNA human sequence to the chromatin higher order organisation and its biological meaning: Using biomolecular interaction networks to understand the influence of structural variation on spatial genome organisation and its functional effect. Semin Cell Dev Biol 2021; 121:171-185. [PMID: 34429265 DOI: 10.1016/j.semcdb.2021.08.007] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/19/2021] [Revised: 08/06/2021] [Accepted: 08/12/2021] [Indexed: 12/30/2022]
Abstract
The three-dimensional structure of the human genome has been proven to have a significant functional impact on gene expression. The high-order spatial chromatin is organised first by looping mediated by multiple protein factors, and then it is further formed into larger structures of topologically associated domains (TADs) or chromatin contact domains (CCDs), followed by A/B compartments and finally the chromosomal territories (CTs). The genetic variation observed in human population influences the multi-scale structures, posing a question regarding the functional impact of structural variants reflected by the variability of the genes expression patterns. The current methods of evaluating the functional effect include eQTLs analysis which uses statistical testing of influence of variants on spatially close genes. Rarely, non-coding DNA sequence changes are evaluated by their impact on the biomolecular interaction network (BIN) reflecting the cellular interactome that can be analysed by the classical graph-theoretic algorithms. Therefore, in the second part of the review, we introduce the concept of BIN, i.e. a meta-network model of the complete molecular interactome developed by integrating various biological networks. The BIN meta-network model includes DNA-protein binding by the plethora of protein factors as well as chromatin interactions, therefore allowing connection of genomics with the downstream biomolecular processes present in a cell. As an illustration, we scrutinise the chromatin interactions mediated by the CTCF protein detected in a ChIA-PET experiment in the human lymphoblastoid cell line GM12878. In the corresponding BIN meta-network the DNA spatial proximity is represented as a graph model, combined with the Proteins-Interaction Network (PIN) of human proteome using the Gene Association Network (GAN). Furthermore, we enriched the BIN with the signalling and metabolic pathways and Gene Ontology (GO) terms to assert its functional context. Finally, we mapped the Single Nucleotide Polymorphisms (SNPs) from the GWAS studies and identified the chromatin mutational hot-spots associated with a significant enrichment of SNPs related to autoimmune diseases. Afterwards, we mapped Structural Variants (SVs) from healthy individuals of 1000 Genomes Project and identified an interesting example of the missing protein complex associated with protein Q6GYQ0 due to a deletion on chromosome 14. Such an analysis using the meta-network BIN model is therefore helpful in evaluating the influence of genetic variation on spatial organisation of the genome and its functional effect in a cell.
Collapse
Affiliation(s)
- Mateusz Chiliński
- Laboratory of Bioinformatics and Computational Genomics, Faculty of Mathematics and Information Science, Warsaw University of Technology, Koszykowa 75, 00-662 Warsaw, Poland; Laboratory of Functional and Structural Genomics, Centre of New Technologies, University of Warsaw, Banacha 2c, 02-097 Warsaw, Poland
| | - Kaustav Sengupta
- Laboratory of Functional and Structural Genomics, Centre of New Technologies, University of Warsaw, Banacha 2c, 02-097 Warsaw, Poland
| | - Dariusz Plewczynski
- Laboratory of Bioinformatics and Computational Genomics, Faculty of Mathematics and Information Science, Warsaw University of Technology, Koszykowa 75, 00-662 Warsaw, Poland; Laboratory of Functional and Structural Genomics, Centre of New Technologies, University of Warsaw, Banacha 2c, 02-097 Warsaw, Poland.
| |
Collapse
|
10
|
Zha M, Wang N, Zhang C, Wang Z. Inferring Single-Cell 3D Chromosomal Structures Based on the Lennard-Jones Potential. Int J Mol Sci 2021; 22:ijms22115914. [PMID: 34072879 PMCID: PMC8199262 DOI: 10.3390/ijms22115914] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Grants] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/22/2021] [Revised: 05/23/2021] [Accepted: 05/28/2021] [Indexed: 11/16/2022] Open
Abstract
Reconstructing three-dimensional (3D) chromosomal structures based on single-cell Hi-C data is a challenging scientific problem due to the extreme sparseness of the single-cell Hi-C data. In this research, we used the Lennard-Jones potential to reconstruct both 500 kb and high-resolution 50 kb chromosomal structures based on single-cell Hi-C data. A chromosome was represented by a string of 500 kb or 50 kb DNA beads and put into a 3D cubic lattice for simulations. A 2D Gaussian function was used to impute the sparse single-cell Hi-C contact matrices. We designed a novel loss function based on the Lennard-Jones potential, in which the ε value, i.e., the well depth, was used to indicate how stable the binding of every pair of beads is. For the bead pairs that have single-cell Hi-C contacts and their neighboring bead pairs, the loss function assigns them stronger binding stability. The Metropolis-Hastings algorithm was used to try different locations for the DNA beads, and simulated annealing was used to optimize the loss function. We proved the correctness and validness of the reconstructed 3D structures by evaluating the models according to multiple criteria and comparing the models with 3D-FISH data.
Collapse
Affiliation(s)
- Mengsheng Zha
- School of Computing Sciences and Computer Engineering, University of Southern Mississippi, 118 College Dr, Hattiesburg, MS 39406, USA; (M.Z.); (C.Z.)
| | - Nan Wang
- Department of Computer Science, New Jersey City University, 2039 Kennedy Blvd, Jersey City, NJ 07305, USA;
| | - Chaoyang Zhang
- School of Computing Sciences and Computer Engineering, University of Southern Mississippi, 118 College Dr, Hattiesburg, MS 39406, USA; (M.Z.); (C.Z.)
| | - Zheng Wang
- Department of Computer Science, University of Miami, 1364 Memorial Drive, Coral Gables, FL 33124, USA
- Correspondence:
| |
Collapse
|
11
|
Quan C, Li Y, Liu X, Wang Y, Ping J, Lu Y, Zhou G. Characterization of structural variation in Tibetans reveals new evidence of high-altitude adaptation and introgression. Genome Biol 2021; 22:159. [PMID: 34034800 PMCID: PMC8146648 DOI: 10.1186/s13059-021-02382-3] [Citation(s) in RCA: 27] [Impact Index Per Article: 9.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/23/2020] [Accepted: 05/14/2021] [Indexed: 01/09/2023] Open
Abstract
BACKGROUND Structural variation (SV) acts as an essential mutational force shaping the evolution and function of the human genome. However, few studies have examined the role of SVs in high-altitude adaptation and little is known of adaptive introgressed SVs in Tibetans so far. RESULTS Here, we generate a comprehensive catalog of SVs in a Chinese Tibetan (n = 15) and Han (n = 10) population using nanopore sequencing technology. Among a total of 38,216 unique SVs in the catalog, 27% are sequence-resolved for the first time. We systematically assess the distribution of these SVs across repeat sequences and functional genomic regions. Through genotyping in additional 276 genomes, we identify 69 Tibetan-Han stratified SVs and 80 candidate adaptive genes. We also discover a few adaptive introgressed SV candidates and provide evidence for a deletion of 335 base pairs at 1p36.32. CONCLUSIONS Overall, our results highlight the important role of SVs in the evolutionary processes of Tibetans' adaptation to the Qinghai-Tibet Plateau and provide a valuable resource for future high-altitude adaptation studies.
Collapse
Affiliation(s)
- Cheng Quan
- Department of Genetics & Integrative Omics, State Key Laboratory of Proteomics, National Center for Protein Sciences, Beijing Institute of Radiation Medicine, 27 Taiping Road, Beijing, 100850 People’s Republic of China
| | - Yuanfeng Li
- Department of Genetics & Integrative Omics, State Key Laboratory of Proteomics, National Center for Protein Sciences, Beijing Institute of Radiation Medicine, 27 Taiping Road, Beijing, 100850 People’s Republic of China
| | - Xinyi Liu
- Department of Genetics & Integrative Omics, State Key Laboratory of Proteomics, National Center for Protein Sciences, Beijing Institute of Radiation Medicine, 27 Taiping Road, Beijing, 100850 People’s Republic of China
| | - Yahui Wang
- Department of Genetics & Integrative Omics, State Key Laboratory of Proteomics, National Center for Protein Sciences, Beijing Institute of Radiation Medicine, 27 Taiping Road, Beijing, 100850 People’s Republic of China
| | - Jie Ping
- Department of Genetics & Integrative Omics, State Key Laboratory of Proteomics, National Center for Protein Sciences, Beijing Institute of Radiation Medicine, 27 Taiping Road, Beijing, 100850 People’s Republic of China
| | - Yiming Lu
- Department of Genetics & Integrative Omics, State Key Laboratory of Proteomics, National Center for Protein Sciences, Beijing Institute of Radiation Medicine, 27 Taiping Road, Beijing, 100850 People’s Republic of China
- Hebei University, Baoding, Hebei Province 071002 People’s Republic of China
| | - Gangqiao Zhou
- Department of Genetics & Integrative Omics, State Key Laboratory of Proteomics, National Center for Protein Sciences, Beijing Institute of Radiation Medicine, 27 Taiping Road, Beijing, 100850 People’s Republic of China
- Hebei University, Baoding, Hebei Province 071002 People’s Republic of China
- Collaborative Innovation Center for Personalized Cancer Medicine, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu Province 211166 People’s Republic of China
- Medical College of Guizhou University, Guiyang, Guizhou Province 550025 People’s Republic of China
| |
Collapse
|
12
|
Gong H, Yang Y, Zhang S, Li M, Zhang X. Application of Hi-C and other omics data analysis in human cancer and cell differentiation research. Comput Struct Biotechnol J 2021; 19:2070-2083. [PMID: 33995903 PMCID: PMC8086027 DOI: 10.1016/j.csbj.2021.04.016] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/10/2020] [Revised: 04/04/2021] [Accepted: 04/04/2021] [Indexed: 02/07/2023] Open
Abstract
With the development of 3C (chromosome conformation capture) and its derivative technology Hi-C (High-throughput chromosome conformation capture) research, the study of the spatial structure of the genomic sequence in the nucleus helps researchers understand the functions of biological processes such as gene transcription, replication, repair, and regulation. In this paper, we first introduce the research background and purpose of Hi-C data visualization analysis. After that, we discuss the Hi-C data analysis methods from genome 3D structure, A/B compartment, TADs (topologically associated domain), and loop detection. We also discuss how to apply genome visualization technologies to the identification of chromosome feature structures. We continue with a review of correlation analysis differences among multi-omics data, and how to apply Hi-C and other omics data analysis into cancer and cell differentiation research. Finally, we summarize the various problems in joint analyses based on Hi-C and other multi-omics data. We believe this review can help researchers better understand the progress and applications of 3D genome technology.
Collapse
Affiliation(s)
- Haiyan Gong
- Department of Computer Science and Technology, University of Science and Technology Beijing, Beijing 100083, China
- Beijing Advanced Innovation Center for Materials Genome Engineering, University of Science and Technology Beijing, Beijing 100083, China
- Beijing Key Laboratory of Knowledge Engineering for Materials Science, Beijing 100083, China
- Shunde Graduate School of University of Science and Technology Beijing, Foshan 528000, China
| | - Yi Yang
- Department of Computer Science and Technology, University of Science and Technology Beijing, Beijing 100083, China
| | - Sichen Zhang
- Department of Computer Science and Technology, University of Science and Technology Beijing, Beijing 100083, China
| | - Minghong Li
- Department of Computer Science and Technology, University of Science and Technology Beijing, Beijing 100083, China
| | - Xiaotong Zhang
- Department of Computer Science and Technology, University of Science and Technology Beijing, Beijing 100083, China
- Beijing Advanced Innovation Center for Materials Genome Engineering, University of Science and Technology Beijing, Beijing 100083, China
- Beijing Key Laboratory of Knowledge Engineering for Materials Science, Beijing 100083, China
- Shunde Graduate School of University of Science and Technology Beijing, Foshan 528000, China
| |
Collapse
|
13
|
Abstract
Genome architecture plays a pivotal role in gene regulation. The use of high-throughput methods for chromatin profiling and 3-D interaction mapping provide rich experimental data sets describing genome organization and dynamics. These data challenge development of new models and algorithms connecting genome architecture with epigenetic marks. In this review, we describe how chromatin architecture could be reconstructed from epigenetic data using biophysical or statistical approaches. We discuss the applicability and limitations of these methods for understanding the mechanisms of chromatin organization. We also highlight the emergence of new predictive approaches for scoring effects of structural variations in human cells.
Collapse
Affiliation(s)
- Polina Belokopytova
- Natural Sciences Department, Novosibirsk State University, Novosibirsk, Russia
- Institute of Cytology and Genetics Siberian Branch of Russian Academy of Sciences (SB RAS), Novosibirsk, Russia
| | - Veniamin Fishman
- Natural Sciences Department, Novosibirsk State University, Novosibirsk, Russia
- Institute of Cytology and Genetics Siberian Branch of Russian Academy of Sciences (SB RAS), Novosibirsk, Russia
| |
Collapse
|