1. Pisarenco VA, Vizueta J, Rozas J. GALEON: a comprehensive bioinformatic tool to analyse and visualize gene clusters in complete genomes. Bioinformatics 2024; 40:btae439. [PMID: 38976642 PMCID: PMC11236287 DOI: 10.1093/bioinformatics/btae439] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/16/2024] [Revised: 06/12/2024] [Accepted: 07/05/2024] [Indexed: 07/10/2024] Open
Abstract
Motivation: Gene clusters, defined as a set of genes encoding functionally related proteins, are abundant in eukaryotic genomes. Despite the increasing availability of chromosome-level genomes, the comprehensive analysis of gene family evolution remains largely unexplored, particularly for large and highly dynamic gene families or those including very recent family members. These challenges stem from limitations in genome assembly contiguity, particularly in repetitive regions such as large gene clusters. Recent advancements in sequencing technology, such as long reads and chromatin contact mapping, hold promise in addressing these challenges. Results: To facilitate the identification, analysis, and visualization of physically clustered gene family members within chromosome-level genomes, we introduce GALEON, a user-friendly bioinformatic tool. GALEON identifies gene clusters by studying the spatial distribution of pairwise physical distances among gene family members along with the genome-wide gene density. The pipeline also enables the simultaneous analysis and comparison of two gene families and allows the exploration of the relationship between physical and evolutionary distances. This tool offers a novel approach for studying the origin and evolution of gene families. Availability and implementation: GALEON is freely available from https://www.ub.edu/softevol/galeon and https://github.com/molevol-ub/galeon.
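As an illustration of the distance-based idea described in this abstract, the following minimal Python sketch groups family members on one chromosome whenever consecutive genes lie within a distance threshold. It is not GALEON's implementation: the function names, the `(gene_id, start, end)` tuple layout and the `max_gap_bp` threshold are assumptions made for this example, and the genome-wide gene density component mentioned in the abstract is omitted.

```python
from typing import List, Tuple

# Hypothetical record layout for this sketch: (gene_id, start_bp, end_bp),
# all genes assumed to lie on the same chromosome or scaffold.
Gene = Tuple[str, int, int]


def physical_distance(a: Gene, b: Gene) -> int:
    """Distance in bp between two gene bodies (0 if they overlap)."""
    (_, s1, e1), (_, s2, e2) = a, b
    return max(0, max(s1, s2) - min(e1, e2))


def find_clusters(genes: List[Gene], max_gap_bp: int, min_size: int = 2) -> List[List[str]]:
    """Group genes whose neighbours (in start order) are at most max_gap_bp apart.

    Groups with fewer than min_size members are treated as singletons and dropped.
    """
    ordered = sorted(genes, key=lambda g: g[1])
    clusters: List[List[str]] = []
    current: List[Gene] = []
    for gene in ordered:
        # Close the running group when the gap to the previous gene is too large.
        if current and physical_distance(current[-1], gene) > max_gap_bp:
            if len(current) >= min_size:
                clusters.append([g[0] for g in current])
            current = []
        current.append(gene)
    if len(current) >= min_size:
        clusters.append([g[0] for g in current])
    return clusters


if __name__ == "__main__":
    family = [
        ("g1", 1_000, 2_000),
        ("g2", 5_000, 6_000),
        ("g3", 500_000, 501_000),
        ("g4", 503_000, 504_000),
        ("g5", 900_000, 901_000),
    ]
    print(find_clusters(family, max_gap_bp=10_000))
```

With the toy coordinates above and a 10 kb threshold, `g1`/`g2` and `g3`/`g4` form two groups and `g5` is left as a singleton. A real analysis would also need to handle multiple scaffolds and a principled choice of threshold.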
Affiliation(s)
- Vadim A Pisarenco
  - Departament de Genètica, Microbiologia i Estadística, Universitat de Barcelona, Barcelona 08028, Spain
  - Institut de Recerca de la Biodiversitat (IRBio), Universitat de Barcelona, Barcelona 08028, Spain
- Joel Vizueta
  - Villum Centre for Biodiversity Genomics, Section for Ecology and Evolution, Department of Biology, University of Copenhagen, Copenhagen 2100, Denmark
- Julio Rozas
  - Departament de Genètica, Microbiologia i Estadística, Universitat de Barcelona, Barcelona 08028, Spain
  - Institut de Recerca de la Biodiversitat (IRBio), Universitat de Barcelona, Barcelona 08028, Spain
2. Wu S, Zhou H, Chen D, Lu Y, Li Y, Qiao J. Multi-omic analysis tools for microbial metabolites prediction. Brief Bioinform 2024; 25:bbae264. [PMID: 38859767 PMCID: PMC11165163 DOI: 10.1093/bib/bbae264] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/03/2024] [Revised: 05/08/2024] [Indexed: 06/12/2024] Open
Abstract
Resolving the metabolic dark matter of microorganisms has long been a challenging problem in the discovery of active molecules. Diverse omics tools have been developed to guide the discovery and characterization of microbial metabolites, making it gradually possible to predict the overall metabolites of individual strains. Combining multi-omic analysis tools effectively compensates for the shortcomings of current studies that focus only on a single omics layer or a broad class of metabolites. In this review, we systematically update, categorize and sort the analysis tools for microbial metabolite prediction from the last five years, and argue for combining multiple omics to understand the metabolic nature of microbes. First, we provide a general survey of updated prediction databases, webservers and software based on genomics, transcriptomics, proteomics and metabolomics, respectively. Then, we discuss why integrating multi-omics data is essential for predicting the metabolites of different microbial strains and communities, and stress the value of combining other techniques, such as systems biology methods and data-driven algorithms. Finally, we identify key challenges and trends in developing multi-omic analysis tools for more comprehensive prediction of the diverse microbial metabolites that contribute to human health and disease treatment.
Affiliation(s)
- Shengbo Wu
  - School of Chemical Engineering and Technology, Tianjin University, Tianjin 300072, China
  - Zhejiang Institute of Tianjin University, Shaoxing, Shaoxing 312300, China
- Haonan Zhou
  - School of Chemical Engineering and Technology, Tianjin University, Tianjin 300072, China
- Danlei Chen
  - School of Chemical Engineering and Technology, Tianjin University, Tianjin 300072, China
  - Zhejiang Institute of Tianjin University, Shaoxing, Shaoxing 312300, China
- Yutong Lu
  - Zhejiang Institute of Tianjin University, Shaoxing, Shaoxing 312300, China
- Yanni Li
  - School of Chemical Engineering and Technology, Tianjin University, Tianjin 300072, China
  - Key Laboratory of Systems Bioengineering, Ministry of Education (Tianjin University), Tianjin 300072, China
- Jianjun Qiao
  - School of Chemical Engineering and Technology, Tianjin University, Tianjin 300072, China
  - Zhejiang Institute of Tianjin University, Shaoxing, Shaoxing 312300, China
  - Key Laboratory of Systems Bioengineering, Ministry of Education (Tianjin University), Tianjin 300072, China
  - Frontiers Science Center for Synthetic Biology (Ministry of Education), Tianjin University, Tianjin 300072, China
3. Malzl D, Peycheva M, Rahjouei A, Gnan S, Klein KN, Nazarova M, Schoeberl UE, Gilbert DM, Buonomo SCB, Di Virgilio M, Neumann T, Pavri R. RIF1 regulates early replication timing in murine B cells. Nat Commun 2023; 14:8049. [PMID: 38081811 PMCID: PMC10713614 DOI: 10.1038/s41467-023-43778-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/27/2023] [Accepted: 11/20/2023] [Indexed: 12/18/2023] Open
Abstract
The mammalian DNA replication timing (RT) program is crucial for the proper functioning and integrity of the genome. The best-known mechanism for controlling RT is the suppression of late origins of replication in heterochromatin by RIF1. Here, we report that in antigen-activated, hypermutating murine B lymphocytes, RIF1 binds predominantly to early-replicating active chromatin and promotes early replication, but plays a minor role in regulating replication origin activity, gene expression and genome organization in B cells. Furthermore, we find that RIF1 functions in a complementary and non-epistatic manner with minichromosome maintenance (MCM) proteins to establish early RT signatures genome-wide and, specifically, to ensure the early replication of highly transcribed genes. These findings reveal additional layers of regulation within the B cell RT program, driven by the coordinated activity of RIF1 and MCM proteins.
Affiliation(s)
- Daniel Malzl
  - Research Institute of Molecular Pathology (IMP), Vienna Biocenter, 1030, Vienna, Austria
  - CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences, 1090, Lazarettgasse 14, Vienna, Austria
- Mihaela Peycheva
  - Research Institute of Molecular Pathology (IMP), Vienna Biocenter, 1030, Vienna, Austria
  - CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences, 1090, Lazarettgasse 14, Vienna, Austria
- Ali Rahjouei
  - Max-Delbruck Center for Molecular Medicine in the Helmholtz Association (MDC), 13125, Berlin, Germany
- Stefano Gnan
  - School of Biological Sciences, Institute of Cell Biology, University of Edinburgh, Edinburgh, EH9 3FF, UK
- Kyle N Klein
  - San Diego Biomedical Research Institute, San Diego, CA, 92121, USA
- Mariia Nazarova
  - Research Institute of Molecular Pathology (IMP), Vienna Biocenter, 1030, Vienna, Austria
- Ursula E Schoeberl
  - Research Institute of Molecular Pathology (IMP), Vienna Biocenter, 1030, Vienna, Austria
- David M Gilbert
  - San Diego Biomedical Research Institute, San Diego, CA, 92121, USA
- Sara C B Buonomo
  - School of Biological Sciences, Institute of Cell Biology, University of Edinburgh, Edinburgh, EH9 3FF, UK
- Michela Di Virgilio
  - Max-Delbruck Center for Molecular Medicine in the Helmholtz Association (MDC), 13125, Berlin, Germany
- Tobias Neumann
  - Research Institute of Molecular Pathology (IMP), Vienna Biocenter, 1030, Vienna, Austria
  - Quantro Therapeutics, Vienna Biocenter, 1030, Vienna, Austria
- Rushad Pavri
  - Research Institute of Molecular Pathology (IMP), Vienna Biocenter, 1030, Vienna, Austria
4. Ansaloni F, Gustincich S, Sanges R. In silico characterisation of minor wave genes and LINE-1s transcriptional dynamics at murine zygotic genome activation. Front Cell Dev Biol 2023; 11:1124266. [PMID: 37389353 PMCID: PMC10300423 DOI: 10.3389/fcell.2023.1124266] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/14/2022] [Accepted: 06/05/2023] [Indexed: 07/01/2023] Open
Abstract
Introduction: In mouse, the zygotic genome activation (ZGA) is coordinated by MERVL elements, a class of LTR retrotransposons. In addition to MERVL, another class of retrotransposons, LINE-1 elements, recently came under the spotlight as key regulators of murine ZGA. In particular, LINE-1 transcripts seem to be required to switch off the transcriptional program started by MERVL sequences, suggesting an antagonistic interplay between LINE-1 and MERVL pathways. Methods: To better investigate the activities of LINE-1 and MERVL elements at ZGA, we integrated publicly available transcriptomics (RNA-seq), chromatin accessibility (ATAC-seq) and Pol-II binding (Stacc-seq) datasets and characterised the transcriptional and epigenetic dynamics of such elements during murine ZGA. Results: We identified two likely distinct transcriptional activities characterising the murine zygotic genome at ZGA onset. On the one hand, our results confirmed that ZGA minor wave genes are preferentially transcribed from MERVL-rich and gene-dense genomic compartments, such as gene clusters. On the other hand, we identified a set of evolutionarily young and likely transcriptionally autonomous LINE-1s located in intergenic and gene-poor regions showing, at the same stage, features such as open chromatin and RNA Pol II binding, suggesting that they are, at least, poised for transcription. Discussion: These results suggest that, across evolution, transcription of two different classes of transposable elements, MERVLs and LINE-1s, has likely been confined to genic and intergenic regions, respectively, in order to maintain and regulate two successive transcriptional programs at ZGA.
Affiliation(s)
- Federico Ansaloni
  - Area of Neuroscience, Scuola Internazionale Superiore di Studi Avanzati (SISSA), Trieste, Italy
  - Central RNA Laboratory, Istituto Italiano di Tecnologia (IIT), Genova, Italy
- Stefano Gustincich
  - Central RNA Laboratory, Istituto Italiano di Tecnologia (IIT), Genova, Italy
- Remo Sanges
  - Area of Neuroscience, Scuola Internazionale Superiore di Studi Avanzati (SISSA), Trieste, Italy
  - Central RNA Laboratory, Istituto Italiano di Tecnologia (IIT), Genova, Italy
5. Hadzhiev Y, Wheatley L, Cooper L, Ansaloni F, Whalley C, Chen Z, Finaurini S, Gustincich S, Sanges R, Burgess S, Beggs A, Müller F. The miR-430 locus with extreme promoter density forms a transcription body during the minor wave of zygotic genome activation. Dev Cell 2023; 58:155-170.e8. [PMID: 36693321 PMCID: PMC9904021 DOI: 10.1016/j.devcel.2022.12.007] [Citation(s) in RCA: 24] [Impact Index Per Article: 12.0] [Received: 08/11/2021] [Revised: 08/10/2022] [Accepted: 12/16/2022] [Indexed: 01/24/2023]
Abstract
In anamniote embryos, the major wave of zygotic genome activation starts during the mid-blastula transition. However, some genes escape global genome repression, are activated substantially earlier, and contribute to the minor wave of genome activation. The mechanisms underlying the minor wave of genome activation are poorly understood. We explored the genomic organization and cis-regulatory mechanisms of a transcription body, in which the minor wave of genome activation is first detected in zebrafish. We identified the miR-430 cluster as having excessive copy number and the highest density of Pol-II-transcribed promoters in the genome, and this is required for forming the transcription body. However, this transcription body is not essential for, nor does it encompass, minor wave transcription globally. Instead, distinct minor-wave-specific promoter architecture suggests that promoter-autonomous mechanisms regulate the minor wave of genome activation. The minor-wave-specific features also suggest distinct transcription initiation mechanisms between the minor and major waves of genome activation.
Affiliation(s)
- Yavor Hadzhiev
  - Institute of Cancer and Genomics Sciences, Birmingham Centre for Genome Biology, College of Medical and Dental Sciences, University of Birmingham, Birmingham B15 2TT, UK
- Lucy Wheatley
  - Institute of Cancer and Genomics Sciences, Birmingham Centre for Genome Biology, College of Medical and Dental Sciences, University of Birmingham, Birmingham B15 2TT, UK
- Ledean Cooper
  - Institute of Cancer and Genomics Sciences, Birmingham Centre for Genome Biology, College of Medical and Dental Sciences, University of Birmingham, Birmingham B15 2TT, UK
- Federico Ansaloni
  - Institute of Cancer and Genomics Sciences, Birmingham Centre for Genome Biology, College of Medical and Dental Sciences, University of Birmingham, Birmingham B15 2TT, UK
  - Area of Neuroscience, Scuola Internazionale Superiore di Studi Avanzati (SISSA), 34136 Trieste, Italy
  - Central RNA Laboratory, Istituto Italiano di Tecnologia (IIT), 16163 Genoa, Italy
- Celina Whalley
  - Institute of Cancer and Genomics Sciences, Birmingham Centre for Genome Biology, College of Medical and Dental Sciences, University of Birmingham, Birmingham B15 2TT, UK
- Zhelin Chen
  - South China Sea Institute of Oceanology, Chinese Academy of Sciences, Guangzhou 510301, China
  - Translational and Functional Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892-2152, USA
- Sara Finaurini
  - Area of Neuroscience, Scuola Internazionale Superiore di Studi Avanzati (SISSA), 34136 Trieste, Italy
- Stefano Gustincich
  - Central RNA Laboratory, Istituto Italiano di Tecnologia (IIT), 16163 Genoa, Italy
- Remo Sanges
  - Area of Neuroscience, Scuola Internazionale Superiore di Studi Avanzati (SISSA), 34136 Trieste, Italy
  - Central RNA Laboratory, Istituto Italiano di Tecnologia (IIT), 16163 Genoa, Italy
- Shawn Burgess
  - Translational and Functional Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892-2152, USA
- Andrew Beggs
  - Institute of Cancer and Genomics Sciences, Birmingham Centre for Genome Biology, College of Medical and Dental Sciences, University of Birmingham, Birmingham B15 2TT, UK
- Ferenc Müller
  - Institute of Cancer and Genomics Sciences, Birmingham Centre for Genome Biology, College of Medical and Dental Sciences, University of Birmingham, Birmingham B15 2TT, UK
6. Peycheva M, Neumann T, Malzl D, Nazarova M, Schoeberl UE, Pavri R. DNA replication timing directly regulates the frequency of oncogenic chromosomal translocations. Science 2022; 377:eabj5502. [DOI: 10.1126/science.abj5502] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/02/2022]
Abstract
Chromosomal translocations result from the joining of DNA double-strand breaks (DSBs) and frequently cause cancer. However, the steps linking DSB formation to DSB ligation remain undeciphered. We report that DNA replication timing (RT) directly regulates lymphomagenic Myc translocations during antibody maturation in B cells downstream of DSBs and independently of DSB frequency. Depletion of minichromosome maintenance complexes alters replication origin activity, decreases translocations, and deregulates global RT. Ablating a single origin at Myc causes an early-to-late RT switch, loss of translocations, and reduced proximity with the immunoglobulin heavy chain (Igh) gene, its major translocation partner. These phenotypes were reversed by restoring early RT. Disruption of early RT also reduced tumorigenic translocations in human leukemic cells. Thus, RT constitutes a general mechanism in translocation biogenesis linking DSB formation to DSB ligation.
Affiliation(s)
- Mihaela Peycheva
  - Research Institute of Molecular Pathology (IMP), Vienna Biocenter, 1030 Vienna, Austria
- Tobias Neumann
  - Research Institute of Molecular Pathology (IMP), Vienna Biocenter, 1030 Vienna, Austria
  - Vienna BioCenter PhD Program, Doctoral School of the University of Vienna and Medical University of Vienna, Vienna Biocenter, 1030 Vienna, Austria
- Daniel Malzl
  - Research Institute of Molecular Pathology (IMP), Vienna Biocenter, 1030 Vienna, Austria
- Mariia Nazarova
  - Research Institute of Molecular Pathology (IMP), Vienna Biocenter, 1030 Vienna, Austria
- Ursula E. Schoeberl
  - Research Institute of Molecular Pathology (IMP), Vienna Biocenter, 1030 Vienna, Austria
- Rushad Pavri
  - Research Institute of Molecular Pathology (IMP), Vienna Biocenter, 1030 Vienna, Austria
7. Guilbaud G, Murat P, Wilkes HS, Lerner LK, Sale JE, Krude T. Determination of human DNA replication origin position and efficiency reveals principles of initiation zone organisation. Nucleic Acids Res 2022; 50:7436-7450. [PMID: 35801867 PMCID: PMC9303276 DOI: 10.1093/nar/gkac555] [Citation(s) in RCA: 29] [Impact Index Per Article: 9.7] [Received: 11/30/2021] [Revised: 06/14/2022] [Accepted: 06/20/2022] [Indexed: 12/16/2022] Open
Abstract
Replication of the human genome initiates within broad zones of ∼150 kb. The extent to which firing of individual DNA replication origins within initiation zones is spatially stochastic or localised at defined sites remains a matter of debate. A thorough characterisation of the dynamic activation of origins within initiation zones is hampered by the lack of a high-resolution map of both their position and efficiency. To address this shortcoming, we describe a modification of initiation site sequencing (ini-seq), based on density substitution. Newly replicated DNA is rendered 'heavy-light' (HL) by incorporation of BrdUTP while unreplicated DNA remains 'light-light' (LL). Replicated HL-DNA is separated from unreplicated LL-DNA by equilibrium density gradient centrifugation, then both fractions are subjected to massive parallel sequencing. This allows precise mapping of 23,905 replication origins simultaneously with an assignment of a replication initiation efficiency score to each. We show that origin firing within early initiation zones is not randomly distributed. Rather, origins are arranged hierarchically with a set of very highly efficient origins marking zone boundaries. We propose that these origins explain much of the early firing activity arising within initiation zones, helping to unify the concept of replication initiation zones with the identification of discrete replication origin sites.
Affiliation(s)
- Guillaume Guilbaud
  - Division of Protein and Nucleic Acid Chemistry, MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge, CB2 0QH, UK
- Pierre Murat
  - Division of Protein and Nucleic Acid Chemistry, MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge, CB2 0QH, UK
- Helen S Wilkes
  - Department of Zoology, University of Cambridge, Downing Street, Cambridge, CB2 3EJ, UK
- Leticia Koch Lerner
  - Division of Protein and Nucleic Acid Chemistry, MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge, CB2 0QH, UK
- Julian E Sale
  - Division of Protein and Nucleic Acid Chemistry, MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge, CB2 0QH, UK
- Torsten Krude
  - Department of Zoology, University of Cambridge, Downing Street, Cambridge, CB2 3EJ, UK
8. Song M, Zhong H. Efficient weighted univariate clustering maps outstanding dysregulated genomic zones in human cancers. Bioinformatics 2021; 36:5027-5036. [PMID: 32619008 PMCID: PMC7755420 DOI: 10.1093/bioinformatics/btaa613] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 09/01/2019] [Revised: 05/24/2020] [Accepted: 06/26/2020] [Indexed: 12/14/2022] Open
Abstract
Motivation: Chromosomal patterning of gene expression in cancer can arise from aneuploidy, genome disorganization or abnormal DNA methylation. To map such patterns, we introduce a weighted univariate clustering algorithm to guarantee linear runtime, optimality and reproducibility. Results: We present the chromosome clustering method, establish its optimality and runtime and evaluate its performance. It uses dynamic programming enhanced with an algorithm to reduce search-space in-place to decrease runtime overhead. Using the method, we delineated outstanding genomic zones in 17 human cancer types. We identified strong continuity in dysregulation polarity (dominance by either up- or downregulated genes in a zone) along chromosomes in all cancer types. Significantly polarized dysregulation zones specific to cancer types are found, offering potential diagnostic biomarkers. Unreported previously, a total of 109 loci with conserved dysregulation polarity across cancer types give insights into pan-cancer mechanisms. Efficient chromosomal clustering opens a window to characterize molecular patterns in cancer genome and beyond. Availability and implementation: Weighted univariate clustering algorithms are implemented within the R package ‘Ckmeans.1d.dp’ (4.0.0 or above), freely available at https://cran.r-project.org/package=Ckmeans.1d.dp. Supplementary information: Supplementary data are available at Bioinformatics online.
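To make the dynamic-programming idea in this abstract concrete, here is a minimal Python sketch of optimal weighted k-means on one-dimensional data. It is a plain O(k·n²) textbook recurrence, not the linear-runtime algorithm with in-place search-space reduction that the authors describe, and it is unrelated to the code of the Ckmeans.1d.dp package; the function name `weighted_kmeans_1d` and its signature are assumptions made for this example.

```python
from typing import List, Tuple


def weighted_kmeans_1d(xs: List[float], ws: List[float], k: int) -> Tuple[float, List[int]]:
    """Optimal weighted k-means for 1-D points via dynamic programming.

    Returns (total weighted within-cluster sum of squares, one label per input point).
    Labels run from 0 to k-1 in increasing order of position.
    """
    n = len(xs)
    if len(ws) != n or not (1 <= k <= n):
        raise ValueError("need len(ws) == len(xs) and 1 <= k <= len(xs)")

    # In one dimension, optimal clusters are contiguous runs of the sorted points.
    order = sorted(range(n), key=lambda i: xs[i])
    x = [xs[i] for i in order]
    w = [ws[i] for i in order]

    # Prefix sums of w, w*x and w*x^2 give any segment cost in O(1).
    W = [0.0] * (n + 1)
    S = [0.0] * (n + 1)
    Q = [0.0] * (n + 1)
    for i in range(n):
        W[i + 1] = W[i] + w[i]
        S[i + 1] = S[i] + w[i] * x[i]
        Q[i + 1] = Q[i] + w[i] * x[i] * x[i]

    def cost(i: int, j: int) -> float:
        """Weighted sum of squared deviations for sorted points i..j-1."""
        tw = W[j] - W[i]
        if tw <= 0:
            return 0.0
        s = S[j] - S[i]
        return max(0.0, (Q[j] - Q[i]) - s * s / tw)

    inf = float("inf")
    # D[m][j]: best cost of splitting the first j sorted points into m clusters.
    D = [[inf] * (n + 1) for _ in range(k + 1)]
    B = [[0] * (n + 1) for _ in range(k + 1)]  # start index of the last cluster
    D[0][0] = 0.0
    for m in range(1, k + 1):
        for j in range(m, n + 1):
            for i in range(m - 1, j):
                c = D[m - 1][i] + cost(i, j)
                if c < D[m][j]:
                    D[m][j] = c
                    B[m][j] = i

    # Backtrack the cluster boundaries, then map labels to the input order.
    labels_sorted = [0] * n
    j = n
    for m in range(k, 0, -1):
        i = B[m][j]
        for t in range(i, j):
            labels_sorted[t] = m - 1
        j = i
    labels = [0] * n
    for pos, idx in enumerate(order):
        labels[idx] = labels_sorted[pos]
    return D[k][n], labels
```

In a chromosomal setting, `xs` could be genomic positions and `ws` per-gene weights such as expression change magnitudes, although that mapping is an illustrative assumption rather than the authors' exact formulation.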
Affiliation(s)
- Mingzhou Song
  - Department of Computer Science; Molecular Biology Graduate Program, New Mexico State University, Las Cruces, NM 88003, USA
9. Cheng YH, Liu CFJ, Yu YH, Jhou YT, Fujishima M, Tsai IJ, Leu JY. Genome plasticity in Paramecium bursaria revealed by population genomics. BMC Biol 2020; 18:180. [PMID: 33250052 PMCID: PMC7702705 DOI: 10.1186/s12915-020-00912-2] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Received: 10/22/2019] [Accepted: 10/29/2020] [Indexed: 11/25/2022] Open
Abstract
Background: Ciliates are an ancient and diverse eukaryotic group found in various environments. A unique feature of ciliates is their nuclear dimorphism, by which two types of nuclei, the diploid germline micronucleus (MIC) and polyploid somatic macronucleus (MAC), are present in the same cytoplasm and serve different functions. During each sexual cycle, ciliates develop a new macronucleus in which newly fused genomes are extensively rearranged to generate functional minichromosomes. Interestingly, each ciliate species seems to have its own way of processing genomes, providing a diversity of resources for studying genome plasticity and its regulation. Here, we sequenced and analyzed the macronuclear genome of different strains of Paramecium bursaria, a highly divergent species of the genus Paramecium which can stably establish endosymbioses with green algae. Results: We assembled a high-quality macronuclear genome of P. bursaria and further refined genome annotation by comparing population genomic data. We identified several species-specific expansions in protein families and gene lineages that are potentially associated with endosymbiosis. Moreover, we observed an intensive chromosome breakage pattern that occurred during or shortly after sexual reproduction and contributed to highly variable gene dosage throughout the genome. However, patterns of copy number variation were highly correlated among genetically divergent strains, suggesting that copy number is adjusted by some regulatory mechanisms or natural selection. Further analysis showed that genes with low copy number variation among populations tended to function in basic cellular pathways, whereas highly variable genes were enriched in environmental response pathways. Conclusions: We report programmed DNA rearrangements in the P. bursaria macronuclear genome that allow cells to adjust gene copy number globally according to individual gene functions. Our results suggest that large-scale gene copy number variation may represent an ancient mechanism for cells to adapt to different environments. Supplementary information: The online version contains supplementary material available at 10.1186/s12915-020-00912-2.
Affiliation(s)
- Yu-Hsuan Cheng
  - Genome and Systems Biology Degree Program, Academia Sinica and National Taiwan University, Taipei, 106, Taiwan
  - Institute of Molecular Biology, Academia Sinica, 128 Sec. 2, Academia Road, Nankang, Taipei, 115, Taiwan
- Chien-Fu Jeff Liu
  - Institute of Molecular Biology, Academia Sinica, 128 Sec. 2, Academia Road, Nankang, Taipei, 115, Taiwan
- Yen-Hsin Yu
  - Institute of Molecular Biology, Academia Sinica, 128 Sec. 2, Academia Road, Nankang, Taipei, 115, Taiwan
- Yu-Ting Jhou
  - Institute of Molecular Biology, Academia Sinica, 128 Sec. 2, Academia Road, Nankang, Taipei, 115, Taiwan
- Masahiro Fujishima
  - Graduate School of Sciences and Technology for Innovation, Yamaguchi University, Yamaguchi, 753-8512, Japan
- Isheng Jason Tsai
  - Genome and Systems Biology Degree Program, Academia Sinica and National Taiwan University, Taipei, 106, Taiwan
  - Biodiversity Research Center, Academia Sinica, Taipei, 115, Taiwan
- Jun-Yi Leu
  - Genome and Systems Biology Degree Program, Academia Sinica and National Taiwan University, Taipei, 106, Taiwan
  - Institute of Molecular Biology, Academia Sinica, 128 Sec. 2, Academia Road, Nankang, Taipei, 115, Taiwan