1
Joshy D, Santpere G, Yi SV. Accelerated cell-type-specific regulatory evolution of the human brain. Proc Natl Acad Sci U S A 2024; 121:e2411918121. [PMID: 39680759 PMCID: PMC11670112 DOI: 10.1073/pnas.2411918121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/13/2024] [Accepted: 10/30/2024] [Indexed: 12/18/2024] Open
Abstract
The molecular basis of human brain evolution is a key piece in understanding the evolution of human-specific cognitive and behavioral traits. Comparative studies have suggested that human brain evolution was accompanied by accelerated changes of gene expression (referred to as "regulatory evolution"), especially those leading to an increase of gene products involved in energy production and metabolism. However, the signals of accelerated regulatory evolution were not always consistent across studies. One confounding factor is the diversity of distinctive cell types in the human brain. Here, we leveraged single-cell human and nonhuman primate transcriptomic data to investigate regulatory evolution at cell-type resolution. We relied on six well-established major cell types: excitatory and inhibitory neurons, astrocytes, microglia, oligodendrocytes, and oligodendrocyte precursor cells. We found pervasive signatures of accelerated regulatory evolution in the human brains compared to the chimpanzee brains in the major six cell types, as well as across multiple neuronal subtypes. Moreover, regulatory evolution is highly cell type specific rather than shared between cell types and strongly associated with cellular-level epigenomic features. Evolutionarily differentially expressed genes (DEGs) exhibit greater cell-type specificity than other genes, suggesting their role in the functional specialization of individual cell types in the human brain. As we continue to unfold the cellular complexity of the brain, the actual scope of DEGs in the human brain appears to be much broader than previously estimated. Our study supports the acceleration of cell-type-specific functional programs as an important feature of human brain evolution.
Affiliation(s)
- Dennis Joshy
  - Department of Mechanical Engineering, University of California, Santa Barbara, CA 93106
  - Neuroscience Research Institute, University of California, Santa Barbara, CA 93106
- Gabriel Santpere
  - Hospital del Mar Research Institute, Parc de Recerca Biomèdica de Barcelona, Barcelona 08003, Catalonia, Spain
- Soojin V. Yi
  - Neuroscience Research Institute, University of California, Santa Barbara, CA 93106
  - Department of Ecology, Evolution, Marine Biology, University of California, Santa Barbara, CA 93106
  - Department of Molecular, Cellular, and Developmental Biology, University of California, Santa Barbara, CA 93106
2
Qi C, Wei Q, Ye Y, Liu J, Li G, Liang JW, Huang H, Wu G. Fixation of Expression Divergences by Natural Selection in Arabidopsis Coding Genes. Int J Mol Sci 2024; 25:13710. [PMID: 39769472 PMCID: PMC11678068 DOI: 10.3390/ijms252413710] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/02/2024] [Revised: 12/19/2024] [Accepted: 12/20/2024] [Indexed: 01/11/2025] Open
Abstract
Functional divergences of coding genes can be caused by divergences in their coding sequences and expression. However, whether and how expression divergences and coding sequence divergences coevolve is not clear. Gene expression divergences in differentiated cells and tissues recapitulate developmental models within a species, while gene expression divergences between analogous cells and tissues resemble traditional phylogenies in different species, suggesting that gene expression divergences are molecular traits that can be used for evolutionary studies. Using transcriptomes and evolutionary proxies to study gene expression divergences among differentiated cells and tissues in Arabidopsis, expression divergences of coding genes are shown to be strongly anti-correlated with phylostrata (gene ages), indicators of selective constraint Ka/Ks (nonsynonymous replacement rate/synonymous substitution rate) and indicators of positive selection (frequency of loci with Ka/Ks > 1), but only weakly or not correlated with indicators of neutral selection (Ks). Our results thus suggest that expression divergences largely coevolve with coding sequence divergences, suggesting that expression divergences of coding genes are selectively fixed by natural selection but not neutral selection, which provides a molecular framework for trait diversification, functional adaptation and speciation. Our findings therefore support that positive selection rather than negative or neutral selection is a major driver for the origin and evolution of Arabidopsis genes, supporting the Darwinian theory at molecular levels.
Affiliation(s)
- Cheng Qi
  - College of Life Science, Shaanxi Normal University, Xi’an 710119, China; (C.Q.); (Y.Y.); (J.L.); (G.L.)
- Qiang Wei
  - College of Life Science, Shaanxi Normal University, Xi’an 710119, China; (C.Q.); (Y.Y.); (J.L.); (G.L.)
- Yuting Ye
  - College of Life Science, Shaanxi Normal University, Xi’an 710119, China; (C.Q.); (Y.Y.); (J.L.); (G.L.)
- Jing Liu
  - College of Life Science, Shaanxi Normal University, Xi’an 710119, China; (C.Q.); (Y.Y.); (J.L.); (G.L.)
- Guishuang Li
  - College of Life Science, Shaanxi Normal University, Xi’an 710119, China; (C.Q.); (Y.Y.); (J.L.); (G.L.)
- Jane W. Liang
  - Department of Statistics, University of California, Berkeley, CA 94720, USA; (J.W.L.); (H.H.)
- Haiyan Huang
  - Department of Statistics, University of California, Berkeley, CA 94720, USA; (J.W.L.); (H.H.)
- Guang Wu
  - College of Life Science, Shaanxi Normal University, Xi’an 710119, China; (C.Q.); (Y.Y.); (J.L.); (G.L.)
3
Aqil A, Li Y, Wang Z, Islam S, Russell M, Kallak TK, Saitou M, Gokcumen O, Masuda N. Switch-like Gene Expression Modulates Disease Susceptibility. RESEARCH SQUARE 2024:rs.3.rs-4974188. [PMID: 39315271 PMCID: PMC11419265 DOI: 10.21203/rs.3.rs-4974188/v1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/25/2024]
Abstract
A fundamental challenge in biomedicine is understanding the mechanisms predisposing individuals to disease. While previous research has suggested that switch-like gene expression is crucial in driving biological variation and disease susceptibility, a systematic analysis across multiple tissues is still lacking. By analyzing transcriptomes from 943 individuals across 27 tissues, we identified 1,013 switch-like genes. We found that only 31 (3.1%) of these genes exhibit switch-like behavior across all tissues. These universally switch-like genes appear to be genetically driven, with large exonic genomic structural variants explaining five (~18%) of them. The remaining switch-like genes exhibit tissue-specific expression patterns. Notably, tissue-specific switch-like genes tend to be switched on or off in unison within individuals, likely under the influence of tissue-specific master regulators, including hormonal signals. Among our most significant findings, we identified hundreds of concordantly switched-off genes in the stomach and vagina that are linked to gastric cancer (41-fold, p < 10⁻⁴) and vaginal atrophy (44-fold, p < 10⁻⁴), respectively. Experimental analysis of vaginal tissues revealed that low systemic levels of estrogen lead to a significant reduction in both the epithelial thickness and the expression of the switch-like gene ALOX12. We propose a model wherein the switching off of driver genes in basal and parabasal epithelium suppresses cell proliferation therein, leading to epithelial thinning and, therefore, vaginal atrophy. Our findings underscore the significant biomedical implications of switch-like gene expression and lay the groundwork for potential diagnostic and therapeutic applications.
Affiliation(s)
- Alber Aqil
  - Department of Biological Sciences, State University of New York at Buffalo, Buffalo, NY, USA
- Yanyan Li
  - Department of Mathematics, State University of New York at Buffalo, Buffalo, NY, USA
- Zhiliang Wang
  - Department of Mathematics, State University of New York at Buffalo, Buffalo, NY, USA
- Saiful Islam
  - Institute for Artificial Intelligence and Data Science, State University of New York at Buffalo, Buffalo, NY, USA
- Madison Russell
  - Department of Mathematics, State University of New York at Buffalo, Buffalo, NY, USA
- Marie Saitou
  - Faculty of Biosciences, Norwegian University of Life Sciences, Aas, Norway
- Omer Gokcumen
  - Department of Biological Sciences, State University of New York at Buffalo, Buffalo, NY, USA
- Naoki Masuda
  - Department of Mathematics, State University of New York at Buffalo, Buffalo, NY, USA
  - Institute for Artificial Intelligence and Data Science, State University of New York at Buffalo, Buffalo, NY, USA
4
Zhang W, Zhang L, Feng Y, Lin D, Yang Z, Zhang Z, Ma Y. Genome-wide profiling of DNA methylome and transcriptome reveals epigenetic regulation of Urechis unicinctus response to sulfide stress. THE SCIENCE OF THE TOTAL ENVIRONMENT 2024; 927:172238. [PMID: 38582121 DOI: 10.1016/j.scitotenv.2024.172238] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/29/2024] [Revised: 03/28/2024] [Accepted: 04/03/2024] [Indexed: 04/08/2024]
Abstract
Sulfide is a well-known environmental pollutant that can have detrimental effects on most organisms. However, few metazoans living in sulfide-rich environments have developed mechanisms to tolerate and adapt to sulfide stress. Epigenetic mechanisms, including DNA methylation, have been shown to play a vital role in environmental stress adaptation. Nevertheless, the precise function of DNA methylation in biological sulfide adaptation remains unclear. Urechis unicinctus, a benthic organism inhabiting sulfide-rich intertidal environments, is an ideal model organism for studying adaptation to sulfide environments. In this study, we conducted a comprehensive analysis of the DNA methylome and transcriptome of U. unicinctus after exposure to 50 μM sulfide. The results revealed dynamic changes in the DNA methylation (5-methylcytosine) landscape in response to sulfide stress, with U. unicinctus exhibiting elevated DNA methylation levels following stress exposure. Integrating differentially expressed genes (DEGs) and differentially methylated regions (DMRs), we identified a crucial role of gene body methylation in predicting gene expression. Furthermore, using a DNA methyltransferase inhibitor, we validated the involvement of DNA methylation in the sulfide stress response and the gene regulatory network influenced by DNA methylation. The results indicated that by modulating DNA methylation levels during sulfide stress, the expression of glutathione S-transferase, glutamyl aminopeptidase, and cytochrome c oxidase could be up-regulated, thereby facilitating the metabolism and detoxification of exogenous sulfides. Moreover, DNA methylation was found to regulate and enhance the oxidative phosphorylation pathway, including NADH dehydrogenase, isocitrate dehydrogenase, and ATP synthase. Additionally, DNA methylation influenced the regulation of Cytochrome P450 and macrophage migration inhibitory factor, both of which are closely associated with oxidative stress and stress resistance. Our findings not only emphasize the role of DNA methylation in sulfide adaptation but also provide novel insights into the potential mechanisms through which marine organisms adapt to environmental changes.
Affiliation(s)
- Wenqing Zhang
  - Key Laboratory of Tropical Aquatic Germplasm of Hainan Province, Sanya Ocean Institute, Ocean University of China, Sanya 572000, China
- Long Zhang
  - Key Laboratory of Tropical Aquatic Germplasm of Hainan Province, Sanya Ocean Institute, Ocean University of China, Sanya 572000, China
- Yuxin Feng
  - Key Laboratory of Tropical Aquatic Germplasm of Hainan Province, Sanya Ocean Institute, Ocean University of China, Sanya 572000, China
- Dawei Lin
  - Key Laboratory of Tropical Aquatic Germplasm of Hainan Province, Sanya Ocean Institute, Ocean University of China, Sanya 572000, China
- Zhi Yang
  - Key Laboratory of Tropical Aquatic Germplasm of Hainan Province, Sanya Ocean Institute, Ocean University of China, Sanya 572000, China
- Zhifeng Zhang
  - Key Laboratory of Tropical Aquatic Germplasm of Hainan Province, Sanya Ocean Institute, Ocean University of China, Sanya 572000, China; Ministry of Education Key Laboratory of Marine Genetics and Breeding, College of Marine Life Sciences, Ocean University of China, Qingdao 266003, China.
- Yubin Ma
  - Ministry of Education Key Laboratory of Marine Genetics and Breeding, College of Marine Life Sciences, Ocean University of China, Qingdao 266003, China.
5
Khandia R, Pandey MK, Zaki MEA, Al-Hussain SA, Baklanov I, Gurjar P. Application of codon usage and context analysis in genes up- or down-regulated in neurodegeneration and cancer to combat comorbidities. Front Mol Neurosci 2023; 16:1200523. [PMID: 37383425 PMCID: PMC10293642 DOI: 10.3389/fnmol.2023.1200523] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 04/05/2023] [Accepted: 05/23/2023] [Indexed: 06/30/2023] Open
Abstract
Introduction: Neurodegeneration and cancer present in comorbidities with inverse effects due to the expression of genes and pathways acting in opposition. Identifying and studying the genes simultaneously up or downregulated during morbidities helps curb both ailments together. Methods: This study examines four genes. Three of these, Amyloid Beta Precursor Protein (APP), Cyclin D1 (CCND1), and Cyclin E2 (CCNE2), are upregulated, and one, protein phosphatase 2 phosphatase activator (PTPA), is simultaneously downregulated in both disorders. We investigated molecular patterns, codon usage, codon usage bias, nucleotide bias in the third codon position, preferred codons, preferred codon pairs, rare codons, and codon context. Results: Parity analysis revealed that T is preferred over A, and G is preferred over C in the third codon position, suggesting composition plays no role in nucleotide bias in both the upregulated and downregulated gene sets and that mutational forces are stronger in upregulated gene sets than in downregulated ones. Transcript length influenced the overall %A composition and codon bias, and the codon AGG exerted the strongest influence on codon usage in both the upregulated and downregulated gene sets. Codons ending in G/C were preferred for 16 amino acids, and glutamic acid-, aspartic acid-, leucine-, valine-, and phenylalanine-initiated codon pairs were preferred in all genes. Codons CTA (Leu), GTA (Val), CAA (Gln), and CGT (Arg) were underrepresented in all examined genes. Discussion: Using advanced gene editing tools such as CRISPR/Cas or any other gene augmentation technique, these recoded genes may be introduced into the human body to optimize gene expression levels to augment neurodegeneration and cancer therapeutic regimens simultaneously.
Affiliation(s)
- Rekha Khandia
  - Department of Biochemistry and Genetics, Barkatullah University, Bhopal, Madhya Pradesh, India
- Megha Katare Pandey
  - Translational Medicine Center, All India Institute of Medical Sciences, Bhopal, India
- Magdi E. A. Zaki
  - Department of Chemistry, Imam Mohammad Ibn Saud Islamic University (IMSIU), Riyadh, Saudi Arabia
- Sami A. Al-Hussain
  - Department of Chemistry, Imam Mohammad Ibn Saud Islamic University (IMSIU), Riyadh, Saudi Arabia
- Igor Baklanov
  - Department of Philosophy, North Caucasus Federal University, Stavropol, Russia
- Pankaj Gurjar
  - Department of Science and Engineering, Novel Global Community Educational Foundation, Hebersham, NSW, Australia
6
Wu X, Bhatia N, Grozinger CM, Yi SV. Comparative studies of genomic and epigenetic factors influencing transcriptional variation in two insect species. G3 GENES|GENOMES|GENETICS 2022; 12:6693626. [PMID: 36137211 PMCID: PMC9635643 DOI: 10.1093/g3journal/jkac230] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/19/2022] [Accepted: 08/05/2022] [Indexed: 11/16/2022]
Abstract
Different genes show different levels of expression variability. For example, highly expressed genes tend to exhibit less expression variability. Genes whose promoters have TATA box and initiator motifs tend to have increased expression variability. On the other hand, DNA methylation of transcriptional units, or gene body DNA methylation, is associated with reduced gene expression variability in many species. Interestingly, some insect lineages, most notably Diptera including the canonical model insect Drosophila melanogaster, have lost DNA methylation. Therefore, it is of interest to determine whether genomic features similarly influence gene expression variability in lineages with and without DNA methylation. We analyzed recently generated large-scale data sets in D. melanogaster and honey bee (Apis mellifera) to investigate these questions. Our analysis shows that increased gene expression levels are consistently associated with reduced expression variability in both species, while the presence of a TATA box is consistently associated with increased gene expression variability. In contrast, initiator motifs and gene lengths have weak effects limited to some data sets. Importantly, we show that a sequence characteristic indicative of gene body DNA methylation is strongly and negatively associated with gene expression variability in honey bees, while it shows no such association in D. melanogaster. These results suggest the evolutionary loss of DNA methylation in some insect lineages has reshaped the molecular mechanisms concerning the regulation of gene expression variability.
Affiliation(s)
- Neharika Bhatia
  - School of Biological Sciences, Institute for Bioengineering and Bioscience, Georgia Institute of Technology, Atlanta, GA 30332, USA
- Christina M Grozinger
  - Department of Entomology, Center for Pollinator Research, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16801, USA
- Soojin V Yi
  - School of Biological Sciences, Institute for Bioengineering and Bioscience, Georgia Institute of Technology, Atlanta, GA 30332, USA
  - Department of Ecology, Evolution and Marine Biology, University of California Santa Barbara, Santa Barbara, CA 93106, USA
7
Kolobkov DS, Sviridova DA, Abilev SK, Kuzovlev AN, Salnikova LE. Genes and Diseases: Insights from Transcriptomics Studies. Genes (Basel) 2022; 13:genes13071168. [PMID: 35885950 PMCID: PMC9317567 DOI: 10.3390/genes13071168] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 05/06/2022] [Revised: 06/13/2022] [Accepted: 06/23/2022] [Indexed: 01/25/2023] Open
Abstract
Results of expression studies can be useful to clarify the genotype-phenotype relationship. However, according to data from recent literature, there is a large group of genes that are revealed as differentially expressed (DE) in many studies, regardless of the biological context. Additional analyses could shed more light on the relationships between genes, their differential expression, and diseases. We generated a set of 9972 disease genes from five gene-phenotype databases (OMIM, ORPHANET, DDG2P, DisGeNet and MalaCards) and a report of the International Union of Immunological Societies. To study transcriptomics of disease and non-disease genes in healthy tissues, we obtained data from the Human Protein Atlas (HPA) website. We analyzed the dependency between expression in healthy tissues and gene occurrence in Gene Expression Omnibus series using tools within the Enrichr libraries. The results of expression studies were annotated with Gene Ontology (GO) and Human Phenotype Ontology (HPO) terms. Using transcriptomics analysis of healthy tissues, we validated the previous findings of higher expression levels of disease genes in pathologically linked tissues compared to other tissues. Preferentially DE genes were generally highly expressed in one or multiple tissues and were enriched for disease genes. According to the results of GO enrichment analyses, both down- and up-regulated DE genes most often took part in immune response, translation and tissue-specific processes. A connection between DE-related pathology and the diversity of HPO terms was found. Investigating a link between expression and phenotype contributes to understanding the mode of development and progression of human diseases.
Affiliation(s)
- Dmitry S. Kolobkov
  - The Laboratory of Ecological Genetics, Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow 119991, Russia; (D.S.K.); (D.A.S.); (S.K.A.)
- Darya A. Sviridova
  - The Laboratory of Ecological Genetics, Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow 119991, Russia; (D.S.K.); (D.A.S.); (S.K.A.)
- Serikbai K. Abilev
  - The Laboratory of Ecological Genetics, Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow 119991, Russia; (D.S.K.); (D.A.S.); (S.K.A.)
- Artem N. Kuzovlev
  - The Laboratory of Clinical Pathophysiology of Critical Conditions, Federal Research and Clinical Center of Intensive Care Medicine and Rehabilitology, Moscow 107031, Russia
- Lyubov E. Salnikova
  - The Laboratory of Ecological Genetics, Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow 119991, Russia; (D.S.K.); (D.A.S.); (S.K.A.)
  - The Laboratory of Clinical Pathophysiology of Critical Conditions, Federal Research and Clinical Center of Intensive Care Medicine and Rehabilitology, Moscow 107031, Russia
  - The Laboratory of Molecular Immunology, Rogachev National Research Center of Pediatric Hematology, Oncology and Immunology, Moscow 117997, Russia
8
Mohamed AR, Naval-Sanchez M, Menzies M, Evans B, King H, Reverter A, Kijas JW. Leveraging transcriptome and epigenome landscapes to infer regulatory networks during the onset of sexual maturation. BMC Genomics 2022; 23:413. [PMID: 35650521 PMCID: PMC9158274 DOI: 10.1186/s12864-022-08514-8] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 11/28/2021] [Accepted: 03/29/2022] [Indexed: 12/03/2022] Open
Abstract
Background: Despite sexual development being ubiquitous to vertebrates, the molecular mechanisms underpinning this fundamental transition remain largely undocumented in many organisms. We designed a time course experiment that successfully sampled the period when Atlantic salmon commence their trajectory towards sexual maturation. Results: Through deep RNA sequencing, we discovered key genes and pathways associated with maturation in the pituitary-ovarian axis. Analyzing DNA methylomes revealed a bias towards hypermethylation in ovary that implicated maturation-related genes. Co-analysis of DNA methylome and gene expression changes revealed chromatin remodeling genes and key transcription factors were both significantly hypermethylated and upregulated in the ovary during the onset of maturation. We also observed changes in chromatin state landscapes that were strongly correlated with fundamental remodeling of gene expression in liver. Finally, a multiomic integrated analysis revealed regulatory networks and identified hub genes including TRIM25 gene (encoding the estrogen-responsive finger protein) as a putative key regulator in the pituitary that underwent a 60-fold change in connectivity during the transition to maturation. Conclusion: The study successfully documented transcriptome and epigenome changes that involved key genes and pathways acting in the pituitary-ovarian axis. Using a Systems Biology approach, we identified hub genes and their associated networks deemed crucial for onset of maturation. The results provide a comprehensive view of the spatiotemporal changes involved in a complex trait and open the door to future efforts aiming to manipulate puberty in an economically important aquaculture species. Supplementary Information: The online version contains supplementary material available at 10.1186/s12864-022-08514-8.
9
Wang F, Tekle YI. Variation of natural selection in the Amoebozoa reveals heterogeneity across the phylogeny and adaptive evolution in diverse lineages. Front Ecol Evol 2022; 10:851816. [PMID: 36874909 PMCID: PMC9980437 DOI: 10.3389/fevo.2022.851816] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/13/2022] Open
Abstract
The evolution and diversity of the supergroup Amoebozoa is complex and poorly understood. The supergroup encompasses predominantly amoeboid lineages characterized by extreme diversity in phenotype, behavior and genetics. The study of natural selection, a driving force of diversification, within and among species of Amoebozoa will play a crucial role in understanding the evolution of the supergroup. In this study, we searched for traces of natural selection based on a set of highly conserved protein-coding genes in a phylogenetic framework from a broad sampling of amoebozoans. Using these genes, we estimated substitution rates and inferred patterns of selective pressure in lineages and sites with various models. We also examined the effect of selective pressure on codon usage bias and potential correlations with observed biological traits and habitat. Results showed large heterogeneity of selection across lineages of Amoebozoa, indicating potential species-specific optimization of adaptation to their diverse ecological environment. Overall, lineages in Tubulinea had undergone stronger purifying selection with higher average substitution rates compared to Discosea and Evosea. Evidence of adaptive evolution was observed in some representative lineages and in a gene (Rpl7a) within Evosea, suggesting potential innovation and beneficial mutations in these lineages. Our results revealed that members of the fast-evolving lineages, Entamoeba and Cutosea, all underwent strong purifying selection but had distinct patterns of codon usage bias. For the first time, this study revealed an overall pattern of natural selection across the phylogeny of Amoebozoa and provided significant implications on their distinctive evolutionary processes.
Affiliation(s)
- Fang Wang
  - Department of Biology, Spelman College, Atlanta, GA, United States
- Yonas I Tekle
  - Department of Biology, Spelman College, Atlanta, GA, United States
10
Huminiecki Ł. Virtual Gene Concept and a Corresponding Pragmatic Research Program in Genetical Data Science. ENTROPY (BASEL, SWITZERLAND) 2021; 24:17. [PMID: 35052043 PMCID: PMC8774939 DOI: 10.3390/e24010017] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/30/2021] [Revised: 12/02/2021] [Accepted: 12/14/2021] [Indexed: 06/14/2023]
Abstract
Mendel proposed an experimentally verifiable paradigm of particle-based heredity that has been influential for over 150 years. The historical arguments have been reflected in the near past as Mendel's concept has been diversified by new types of omics data. As an effect of the accumulation of omics data, a virtual gene concept forms, giving rise to genetical data science. The concept integrates genetical, functional, and molecular features of the Mendelian paradigm. I argue that the virtual gene concept should be deployed pragmatically. Indeed, the concept has already inspired a practical research program related to systems genetics. The program includes questions about functionality of structural and categorical gene variants, about regulation of gene expression, and about roles of epigenetic modifications. The methodology of the program includes bioinformatics, machine learning, and deep learning. Education, funding, careers, standards, benchmarks, and tools to monitor research progress should be provided to support the research program.
Affiliation(s)
- Łukasz Huminiecki
  - Evolutionary, Computational, and Statistical Genetics, Department of Molecular Biology, Institute of Genetics and Animal Biotechnology, Polish Academy of Sciences, Postępu 36A, Jastrzębiec, 05-552 Warsaw, Poland
11
Singh D, Yi SV. Enhancer pleiotropy, gene expression, and the architecture of human enhancer-gene interactions. Mol Biol Evol 2021; 38:3898-3909. [PMID: 33749795 PMCID: PMC8383896 DOI: 10.1093/molbev/msab085] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 10/31/2020] [Revised: 02/10/2021] [Accepted: 03/18/2021] [Indexed: 12/30/2022] Open
Abstract
Enhancers are often studied as noncoding regulatory elements that modulate the precise spatiotemporal expression of genes in a highly tissue-specific manner. This paradigm has been challenged by recent evidence of individual enhancers acting in multiple tissues or developmental contexts. However, the frequency of these enhancers with high degrees of “pleiotropy” out of all putative enhancers is not well understood. Consequently, it is unclear how the variation of enhancer pleiotropy corresponds to the variation in expression breadth of target genes. Here, we use multi-tissue chromatin maps from diverse human tissues to investigate the enhancer–gene interaction architecture while accounting for 1) the distribution of enhancer pleiotropy, 2) the variations of regulatory links from enhancers to target genes, and 3) the expression breadth of target genes. We show that most enhancers are tissue-specific and that highly pleiotropic enhancers account for <1% of all putative regulatory sequences in the human genome. Notably, several genomic features are indicative of increasing enhancer pleiotropy, including longer sequence length, greater number of links to genes, increasing abundance and diversity of encoded transcription factor motifs, and stronger evolutionary conservation. Intriguingly, the number of enhancers per gene remains remarkably consistent for all genes (∼14). However, enhancer pleiotropy does not directly translate to the expression breadth of target genes. We further present a series of Gaussian Mixture Models to represent this organization architecture. Consequently, we demonstrate that a modest trend of more pleiotropic enhancers targeting more broadly expressed genes can generate the observed diversity of expression breadths in the human genome.
Affiliation(s)
- Devika Singh
  - School of Biological Sciences, Georgia Institute of Technology, Atlanta, Georgia, USA
- Soojin V Yi
  - School of Biological Sciences, Georgia Institute of Technology, Atlanta, Georgia, USA
| |
Collapse
|
12
|
Evans P, Cox NJ, Gamazon ER. The regulatory genome constrains protein sequence evolution: implications for the search for disease-associated genes. PeerJ 2020; 8:e9554. [PMID: 32765967 PMCID: PMC7380284 DOI: 10.7717/peerj.9554] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/16/2020] [Accepted: 06/24/2020] [Indexed: 11/20/2022]
Abstract
The development of explanatory models of protein sequence evolution has broad implications for our understanding of cellular biology, population history, and disease etiology. Here we analyze the GTEx transcriptome resource to quantify the effect of the transcriptome on protein sequence evolution in a multi-tissue framework. We find substantial variation among the central nervous system tissues in the effect of expression variance on evolutionary rate, with highly variable genes in the cortex showing significantly greater purifying selection than highly variable genes in subcortical regions (Mann-Whitney U p = 1.4 × 10⁻⁴). The remaining tissues cluster in observed expression correlation with evolutionary rate, enabling evolutionary analysis of genes in diverse physiological systems, including digestive, reproductive, and immune systems. Importantly, the tissue in which a gene attains its maximum expression variance significantly varies (p = 5.55 × 10⁻²⁸⁴) with evolutionary rate, suggesting a tissue-anchored model of protein sequence evolution. Using a large-scale reference resource, we show that the tissue-anchored model provides a transcriptome-based approach to predicting the primary affected tissue of developmental disorders. Using gradient boosted regression trees to model evolutionary rate under a range of model parameters, selected features explain up to 62% of the variation in evolutionary rate and provide additional support for the tissue model. Finally, we investigate several methodological implications, including the importance of evolutionary-rate-aware gene expression imputation models using genetic data for improved search for disease-associated genes in transcriptome-wide association studies. Collectively, this study presents a comprehensive transcriptome-based analysis of a range of factors that may constrain molecular evolution and proposes a novel framework for the study of gene function and disease mechanism.
Affiliation(s)
- Patrick Evans
  - Division of Genetic Medicine, Vanderbilt University Medical Center, Nashville, TN, United States of America
- Nancy J Cox
  - Division of Genetic Medicine, Vanderbilt University Medical Center, Nashville, TN, United States of America
- Eric R Gamazon
  - Division of Genetic Medicine, Vanderbilt University Medical Center, Nashville, TN, United States of America
  - Clare Hall, University of Cambridge, Cambridge, United Kingdom
  - MRC Epidemiology Unit, University of Cambridge, Cambridge, United Kingdom
  - Data Science Institute, Vanderbilt University, Nashville, TN, United States of America
|
13
|
Shi T, Rahmani RS, Gugger PF, Wang M, Li H, Zhang Y, Li Z, Wang Q, Van de Peer Y, Marchal K, Chen J. Distinct Expression and Methylation Patterns for Genes with Different Fates following a Single Whole-Genome Duplication in Flowering Plants. Mol Biol Evol 2020; 37:2394-2413. [PMID: 32343808 PMCID: PMC7403625 DOI: 10.1093/molbev/msaa105] [Citation(s) in RCA: 72] [Impact Index Per Article: 14.4] [Indexed: 12/12/2022]
Abstract
For most sequenced flowering plants, multiple whole-genome duplications (WGDs) are found. Duplicated genes following WGD often have different fates that can quickly disappear again, be retained for long(er) periods, or subsequently undergo small-scale duplications. However, how different expression, epigenetic regulation, and functional constraints are associated with these different gene fates following a WGD still requires further investigation due to successive WGDs in angiosperms complicating the gene trajectories. In this study, we investigate lotus (Nelumbo nucifera), an angiosperm with a single WGD during the K-pg boundary. Based on improved intraspecific-synteny identification by a chromosome-level assembly, transcriptome, and bisulfite sequencing, we explore not only the fundamental distinctions in genomic features, expression, and methylation patterns of genes with different fates after a WGD but also the factors that shape post-WGD expression divergence and expression bias between duplicates. We found that after a WGD genes that returned to single copies show the highest levels and breadth of expression, gene body methylation, and intron numbers, whereas the long-retained duplicates exhibit the highest degrees of protein-protein interactions and protein lengths and the lowest methylation in gene flanking regions. For those long-retained duplicate pairs, the degree of expression divergence correlates with their sequence divergence, degree in protein-protein interactions, and expression level, whereas their biases in expression level reflecting subgenome dominance are associated with the bias of subgenome fractionation. Overall, our study on the paleopolyploid nature of lotus highlights the impact of different functional constraints on gene fate and duplicate divergence following a single WGD in plant.
Affiliation(s)
- Tao Shi
  - CAS Key Laboratory of Aquatic Botany and Watershed Ecology, Wuhan Botanical Garden, Chinese Academy of Sciences, Wuhan, China
  - Center of Conservation Biology, Core Botanical Gardens, Chinese Academy of Sciences, Wuhan, China
- Razgar Seyed Rahmani
  - Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium
- Paul F Gugger
  - Appalachian Laboratory, University of Maryland Center for Environmental Science, Frostburg, MD
- Muhua Wang
  - School of Marine Sciences, Sun Yat-sen University, Guangzhou, China
- Hui Li
  - CAS Key Laboratory of Aquatic Botany and Watershed Ecology, Wuhan Botanical Garden, Chinese Academy of Sciences, Wuhan, China
  - Center of Conservation Biology, Core Botanical Gardens, Chinese Academy of Sciences, Wuhan, China
  - University of Chinese Academy of Sciences, Beijing, China
- Yue Zhang
  - CAS Key Laboratory of Aquatic Botany and Watershed Ecology, Wuhan Botanical Garden, Chinese Academy of Sciences, Wuhan, China
  - Center of Conservation Biology, Core Botanical Gardens, Chinese Academy of Sciences, Wuhan, China
  - University of Chinese Academy of Sciences, Beijing, China
- Zhizhong Li
  - CAS Key Laboratory of Aquatic Botany and Watershed Ecology, Wuhan Botanical Garden, Chinese Academy of Sciences, Wuhan, China
  - Center of Conservation Biology, Core Botanical Gardens, Chinese Academy of Sciences, Wuhan, China
  - University of Chinese Academy of Sciences, Beijing, China
- Qingfeng Wang
  - CAS Key Laboratory of Aquatic Botany and Watershed Ecology, Wuhan Botanical Garden, Chinese Academy of Sciences, Wuhan, China
  - Center of Conservation Biology, Core Botanical Gardens, Chinese Academy of Sciences, Wuhan, China
  - Sino-African Joint Research Center, Chinese Academy of Sciences, Wuhan, China
- Yves Van de Peer
  - Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium
  - Centre for Plant Systems Biology, VIB, Ghent, Belgium
  - Department of Biochemistry, Genetics and Microbiology, University of Pretoria, Pretoria, South Africa
  - College of Horticulture, Nanjing Agricultural University, Nanjing, China
- Kathleen Marchal
  - Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium
  - Department of Information Technology, IDLab, IMEC, Ghent University, Ghent, Belgium
- Jinming Chen
  - CAS Key Laboratory of Aquatic Botany and Watershed Ecology, Wuhan Botanical Garden, Chinese Academy of Sciences, Wuhan, China
  - Center of Conservation Biology, Core Botanical Gardens, Chinese Academy of Sciences, Wuhan, China
|
14
|
Yin H, Li M, Xia L, He C, Zhang Z. Computational determination of gene age and characterization of evolutionary dynamics in human. Brief Bioinform 2019; 20:2141-2149. [PMID: 30184145 DOI: 10.1093/bib/bby074] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 05/14/2018] [Revised: 08/01/2018] [Accepted: 08/02/2018] [Indexed: 12/23/2022]
Abstract
Genes originate at different evolutionary time scales and possess different ages, accordingly presenting diverse functional characteristics and reflecting distinct adaptive evolutionary innovations. In the past decades, progress has been made in gene age identification by a variety of methods that are principally based on comparative genomics. Here we summarize methods for computational determination of gene age and evaluate the effectiveness of different computational methods for age identification. Our results show that improved age determination can be achieved by combining homolog clustering with phylogeny inference, which enables more accurate age identification in human genes. Accordingly, we characterize evolutionary dynamics of human genes based on an extremely long evolutionary time scale spanning ~4,000 million years from archaea/bacteria to human, revealing that young genes are clustered on certain chromosomes and that Mendelian disease genes (including monogenic disease and polygenic disease genes) and cancer genes exhibit divergent evolutionary origins. Taken together, deciphering genes' ages as well as their evolutionary dynamics is of fundamental significance in unveiling the underlying mechanisms during evolution and better understanding how young or new genes become indispensable integrants coupled with novel phenotypes and biological diversity.
Affiliation(s)
- Hongyan Yin
  - Hainan Key Laboratory for Sustainable Utilization of Tropical Bioresources, Institute of Tropical Agriculture and Forestry, Hainan University, China
- Mengwei Li
  - CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China
- Lin Xia
  - CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China
- Chaozu He
  - Hainan Key Laboratory for Sustainable Utilization of Tropical Bioresources, Institute of Tropical Agriculture and Forestry, Hainan University, China
- Zhang Zhang
  - BIG Data Center & CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China
|
15
|
Das S, Bansal M. Variation of gene expression in plants is influenced by gene architecture and structural properties of promoters. PLoS One 2019; 14:e0212678. [PMID: 30908494 PMCID: PMC6433290 DOI: 10.1371/journal.pone.0212678] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.8] [Received: 08/26/2018] [Accepted: 02/07/2019] [Indexed: 12/03/2022]
Abstract
In higher eukaryotes, gene architecture and structural properties of promoters have emerged as significant factors influencing variation in number of transcripts (expression level) and specificity of gene expression in a tissue (expression breadth), which eventually shape the phenotype. In this study, transcriptome data of different tissue types at various developmental stages of A. thaliana, O. sativa, S. bicolor and Z. mays have been used to understand the relationship between properties of gene components and its expression. Our findings indicate that in plants, among all gene architecture and structural properties of promoters, compactness of genes in terms of intron content is significantly linked to gene expression level and breadth, whereas in human an exactly opposite scenario is seen. In plants, for the first time we have carried out a quantitative estimation of effect of a particular trait on expression level and breadth, by using multiple regression analysis and it confirms that intron content of primary transcript (as %) is a powerful determinant of expression breadth. Similarly, further regression analysis revealed that among structural properties of the promoters, stability is negatively linked to expression breadth, while DNase1 sensitivity strongly governs gene expression breadth in monocots and gene expression level in dicots. In addition, promoter regions of tissue specific genes are found to be enriched with TATA box and Y-patch motifs. Finally, multi copy orthologous genes in plants are found to be longer, highly regulated and tissue specific.
Affiliation(s)
- Sanjukta Das
  - Molecular Biophysics Unit, Indian Institute of Science, Bangalore, Karnataka, India
- Manju Bansal
  - Molecular Biophysics Unit, Indian Institute of Science, Bangalore, Karnataka, India
|
16
|
Huminiecki L. Modelling of the breadth of expression from promoter architectures identifies pro-housekeeping transcription factors. PLoS One 2018; 13:e0198961. [PMID: 29928029 PMCID: PMC6013173 DOI: 10.1371/journal.pone.0198961] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 12/18/2017] [Accepted: 05/28/2018] [Indexed: 12/22/2022]
Abstract
Understanding how regulatory elements control mammalian gene expression is a challenge of the post-genomic era. We previously reported that the size of the proximal promoter architecture predicted the breadth of expression (fraction of tissues in which a gene is expressed). Herein, the contributions of individual transcription factors (TFs) were quantified. Several technologies of statistical modelling were utilized and compared: tree models, generalized linear models (GLMs, without and with regularization), Bayesian GLMs and random forest. Both linear and non-linear modelling strategies were explored. Encouragingly, different models led to similar statistical conclusions and biological interpretations. The majority of ENCODE TFs correlated positively with housekeeping expression, a minority correlated negatively. Thus, housekeeping expression can be understood as a cumulative effect of many types of TF binding sites. This is accompanied by the exclusion of fewer types of binding sites for TFs which are repressors, or support cell lineage commitment or temporarily inducible or spatially-restricted expression.
Affiliation(s)
- Lukasz Huminiecki
  - Instytut Genetyki i Hodowli Zwierząt Polskiej Akademii Nauk, Jastrzębiec, Magdalenka, Poland
|
17
|
Wang M, Uebbing S, Ellegren H. Bayesian Inference of Allele-Specific Gene Expression Indicates Abundant Cis-Regulatory Variation in Natural Flycatcher Populations. Genome Biol Evol 2017; 9:1266-1279. [PMID: 28453623 PMCID: PMC5434935 DOI: 10.1093/gbe/evx080] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.5] [Accepted: 04/25/2017] [Indexed: 12/13/2022]
Abstract
Polymorphism in cis-regulatory sequences can lead to different levels of expression for the two alleles of a gene, providing a starting point for the evolution of gene expression. Little is known about the genome-wide abundance of genetic variation in gene regulation in natural populations but analysis of allele-specific expression (ASE) provides a means for investigating such variation. We performed RNA-seq of multiple tissues from population samples of two closely related flycatcher species and developed a Bayesian algorithm that maximizes data usage by borrowing information from the whole data set and combines several SNPs per transcript to detect ASE. Of 2,576 transcripts analyzed in collared flycatcher, ASE was detected in 185 (7.2%) and a similar frequency was seen in the pied flycatcher. Transcripts with statistically significant ASE commonly showed the major allele in >90% of the reads, reflecting that power was highest when expression was heavily biased toward one of the alleles. This would suggest that the observed frequencies of ASE likely are underestimates. The proportion of ASE transcripts varied among tissues, being lowest in testis and highest in muscle. Individuals often showed ASE of particular transcripts in more than one tissue (73.4%), consistent with a genetic basis for regulation of gene expression. The results suggest that genetic variation in regulatory sequences commonly affects gene expression in natural populations and that it provides a seedbed for phenotypic evolution via divergence in gene expression.
Affiliation(s)
- Mi Wang
  - Department of Evolutionary Biology, Evolutionary Biology Centre, Uppsala University, Sweden
- Severin Uebbing
  - Department of Evolutionary Biology, Evolutionary Biology Centre, Uppsala University, Sweden
- Hans Ellegren
  - Department of Evolutionary Biology, Evolutionary Biology Centre, Uppsala University, Sweden
|
18
|
Mendizabal I, Zeng J, Keller TE, Yi SV. Body-hypomethylated human genes harbor extensive intragenic transcriptional activity and are prone to cancer-associated dysregulation. Nucleic Acids Res 2017; 45:4390-4400. [PMID: 28115635 PMCID: PMC5416765 DOI: 10.1093/nar/gkx020] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.9] [Received: 12/13/2016] [Accepted: 01/05/2017] [Indexed: 01/08/2023]
Abstract
Genomic DNA methylation maps (methylomes) encode genetic and environmental effects as stable chemical modifications of DNA. Variations in DNA methylation, especially in regulatory regions such as promoters and enhancers, are known to affect numerous downstream processes. In contrast, most transcription units (gene bodies) in the human genome are thought to be heavily methylated. However, epigenetic reprogramming in cancer often involves gene body hypomethylation with consequences on gene expression. In this study, we focus on the relatively unexplored phenomenon that some gene bodies are devoid of DNA methylation under normal conditions. Utilizing nucleotide-resolution methylomes of diverse samples, we show that nearly 2000 human genes are commonly hypomethylated. Remarkably, these genes occupy highly specialized genomic, epigenomic, evolutionary and functional niches in our genomes. For example, hypomethylated genes tend to be short yet encode significantly more transcripts than expected based upon their lengths, include many genes involved in nucleosome and chromatin formation, and are extensively and significantly enriched for histone-tail modifications and transcription factor binding with particular relevance for cis-regulation. Furthermore, they are significantly more prone to cancer-associated hypomethylation and mutation. Consequently, gene body hypomethylation represents an additional layer of epigenetic regulatory complexity, with implications on cancer-associated epigenetic reprogramming.
Affiliation(s)
- Isabel Mendizabal
  - School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA 30332, USA
  - Department of Genetics, Physical Anthropology and Animal Physiology, University of the Basque Country UPV/EHU, Barrio Sarriena s/n, 48940 Leioa, Spain
- Jia Zeng
  - School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA 30332, USA
- Thomas E Keller
  - School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA 30332, USA
- Soojin V Yi
  - School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA 30332, USA
|
19
|
Radhakrishnan S, Literman R, Mizoguchi B, Valenzuela N. MeDIP-seq and nCpG analyses illuminate sexually dimorphic methylation of gonadal development genes with high historic methylation in turtle hatchlings with temperature-dependent sex determination. Epigenetics Chromatin 2017; 10:28. [PMID: 28533820 PMCID: PMC5438563 DOI: 10.1186/s13072-017-0136-2] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.9] [Received: 02/08/2017] [Accepted: 05/12/2017] [Indexed: 12/15/2022]
Abstract
Background: DNA methylation alters gene expression but not DNA sequence and mediates some cases of phenotypic plasticity. Temperature-dependent sex determination (TSD) epitomizes phenotypic plasticity where environmental temperature drives embryonic sexual fate, as occurs commonly in turtles. Importantly, the temperature-specific transcription of two genes underlying gonadal differentiation is known to be induced by differential methylation in TSD fish, turtle and alligator. Yet, how extensive is the link between DNA methylation and TSD remains unclear. Here we test for broad differences in genome-wide DNA methylation between male and female hatchling gonads of the TSD painted turtle Chrysemys picta using methyl DNA immunoprecipitation sequencing, to identify differentially methylated candidates for future study. We also examine the genome-wide nCpG distribution (which affects DNA methylation) in painted turtles and test for historic methylation in genes regulating vertebrate gonadogenesis. Results: Turtle global methylation was consistent with other vertebrates (57% of the genome, 78% of all CpG dinucleotides). Numerous genes predicted to regulate turtle gonadogenesis exhibited sex-specific methylation and were proximal to methylated repeats. nCpG distribution predicted actual turtle DNA methylation and was bimodal in gene promoters (as other vertebrates) and introns (unlike other vertebrates). Differentially methylated genes, including regulators of sexual development, had lower nCpG content indicative of higher historic methylation. Conclusions: Ours is the first evidence suggesting that sexually dimorphic DNA methylation is pervasive in turtle gonads (perhaps mediated by repeat methylation) and that it targets numerous regulators of gonadal development, consistent with the hypothesis that it may regulate thermosensitive transcription in TSD vertebrates. However, further research during embryogenesis will help test this hypothesis and the alternative that instead, most differential methylation observed in hatchlings is the by-product of sexual differentiation and not its cause.
Affiliation(s)
- Srihari Radhakrishnan
  - Bioinformatics and Computational Biology Program, Iowa State University, Ames, IA 50011 USA
  - Department of Ecology, Evolution and Organismal Biology, Iowa State University, 251 Bessey Hall, Ames, IA 50011 USA
- Robert Literman
  - Ecology and Evolutionary Biology Program, Iowa State University, Ames, IA 50011 USA
  - Department of Ecology, Evolution and Organismal Biology, Iowa State University, 251 Bessey Hall, Ames, IA 50011 USA
- Beatriz Mizoguchi
  - Interdepartmental Genetics and Genomics Program, Iowa State University, Ames, IA 50011 USA
  - Department of Ecology, Evolution and Organismal Biology, Iowa State University, 251 Bessey Hall, Ames, IA 50011 USA
- Nicole Valenzuela
  - Department of Ecology, Evolution and Organismal Biology, Iowa State University, 251 Bessey Hall, Ames, IA 50011 USA
|
20
|
Abstract
As genes originate at different evolutionary times, they harbor distinctive genomic signatures of evolutionary ages. Although previous studies have investigated different gene age-related signatures, what signatures dominantly associate with gene age remains unresolved. Here we address this question via a combined approach of comprehensive assignment of gene ages, gene family identification, and multivariate analyses. We first provide a comprehensive and improved gene age assignment by combining homolog clustering with phylogeny inference and categorize human genes into 26 age classes spanning the whole tree of life. We then explore the dominant age-related signatures based on a collection of 10 potential signatures (including gene composition, gene length, selection pressure, expression level, connectivity in protein–protein interaction network and DNA methylation). Our results show that GC content and connectivity in protein–protein interaction network (PPIN) associate dominantly with gene age. Furthermore, we investigate the heterogeneity of dominant signatures in duplicates and singletons. We find that GC content is a consistent primary factor of gene age in duplicates and singletons, whereas PPIN is more strongly associated with gene age in singletons than in duplicates. Taken together, GC content and PPIN are two dominant signatures in close association with gene age, exhibiting heterogeneity in duplicates and singletons and presumably reflecting complex differential interplays between natural selection and mutation.
Affiliation(s)
- Hongyan Yin
  - CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics (BIG), Chinese Academy of Sciences, Beijing, China
  - BIG Data Center, Beijing Institute of Genomics (BIG), Chinese Academy of Sciences, Beijing, China
  - University of Chinese Academy of Sciences, Beijing, China
- Guangyu Wang
  - CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics (BIG), Chinese Academy of Sciences, Beijing, China
  - BIG Data Center, Beijing Institute of Genomics (BIG), Chinese Academy of Sciences, Beijing, China
  - University of Chinese Academy of Sciences, Beijing, China
- Lina Ma
  - CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics (BIG), Chinese Academy of Sciences, Beijing, China
  - BIG Data Center, Beijing Institute of Genomics (BIG), Chinese Academy of Sciences, Beijing, China
- Soojin V Yi
  - School of Biology, Georgia Institute of Technology, Atlanta
- Zhang Zhang
  - CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics (BIG), Chinese Academy of Sciences, Beijing, China
  - BIG Data Center, Beijing Institute of Genomics (BIG), Chinese Academy of Sciences, Beijing, China
  - University of Chinese Academy of Sciences, Beijing, China
|
21
|
França GS, Vibranovski MD, Galante PAF. Host gene constraints and genomic context impact the expression and evolution of human microRNAs. Nat Commun 2016; 7:11438. [PMID: 27109497 PMCID: PMC4848552 DOI: 10.1038/ncomms11438] [Citation(s) in RCA: 37] [Impact Index Per Article: 4.1] [Received: 12/11/2015] [Accepted: 03/25/2016] [Indexed: 12/16/2022]
Abstract
Increasing evidence has shown that recent miRNAs tend to emerge within coding genes. Here we conjecture that human miRNA evolution is tightly influenced by the genomic context, especially by host genes. Our findings show a preferential emergence of intragenic miRNAs within old genes. We found that miRNAs within old host genes are significantly more broadly expressed than those within young ones. Young miRNAs within old genes are more broadly expressed than their intergenic counterparts, suggesting that young miRNAs have an initial advantage by residing in old genes, and benefit from their hosts' expression control and from the exposure to diverse cellular contexts and target genes. Our results demonstrate that host genes may provide stronger expression constraints to intragenic miRNAs in the long run. We also report associated functional implications, highlighting the genomic context and host genes as driving factors for the expression and evolution of human miRNAs.
Recent miRNAs tend to emerge within coding genes. Here, by analysing miRNA expression data from six species and comparing genomes from 13 species, the authors report that host genes may provide stronger expression constraints to intragenic miRNAs in the long run.
Affiliation(s)
- Gustavo S França
  - Centro de Oncologia Molecular, Hospital Sírio-Libanês, Rua Daher Cutait 69, 01308-060 São Paulo, Brazil
  - Departamento de Bioquímica, Instituto de Química, Universidade de São Paulo, Av. Prof. Lineu Prestes 748, 05508-000 São Paulo, Brazil
- Maria D Vibranovski
  - Departamento de Genética e Biologia Evolutiva, Universidade de São Paulo, Rua do Matao 277, 05508-090 São Paulo, Brazil
- Pedro A F Galante
  - Centro de Oncologia Molecular, Hospital Sírio-Libanês, Rua Daher Cutait 69, 01308-060 São Paulo, Brazil
|
22
|
Keller TE, Han P, Yi SV. Evolutionary Transition of Promoter and Gene Body DNA Methylation across Invertebrate-Vertebrate Boundary. Mol Biol Evol 2015; 33:1019-28. [PMID: 26715626 PMCID: PMC4776710 DOI: 10.1093/molbev/msv345] [Citation(s) in RCA: 84] [Impact Index Per Article: 8.4] [Indexed: 12/12/2022]
Abstract
Genomes of invertebrates and vertebrates exhibit highly divergent patterns of DNA methylation. Invertebrate genomes tend to be sparsely methylated, and DNA methylation is mostly targeted to a subset of transcription units (gene bodies). In a drastic contrast, vertebrate genomes are generally globally and heavily methylated, punctuated by the limited local hypo-methylation of putative regulatory regions such as promoters. These genomic differences also translate into functional differences in DNA methylation and gene regulation. Although promoter DNA methylation is an important regulatory component of vertebrate gene expression, its role in invertebrate gene regulation has been little explored. Instead, gene body DNA methylation is associated with expression of invertebrate genes. However, the evolutionary steps leading to the differentiation of invertebrate and vertebrate genomic DNA methylation remain unresolved. Here we analyzed experimentally determined DNA methylation maps of several species across the invertebrate–vertebrate boundary, to elucidate how vertebrate gene methylation has evolved. We show that, in contrast to the prevailing idea, a substantial number of promoters in an invertebrate basal chordate Ciona intestinalis are methylated. Moreover, gene expression data indicate significant, epigenomic context-dependent associations between promoter methylation and expression in C. intestinalis. However, there is no evidence that promoter methylation in invertebrate chordate has been evolutionarily maintained across the invertebrate–vertebrate boundary. Rather, body-methylated invertebrate genes preferentially obtain hypo-methylated promoters among vertebrates. Conversely, promoter methylation is preferentially found in lineage- and tissue-specific vertebrate genes. These results provide important insights into the evolutionary origin of epigenetic regulation of vertebrate gene expression.
Affiliation(s)
- Soojin V Yi
  - School of Biology, Georgia Institute of Technology
|
23
|
Mendizabal I, Yi SV. Whole-genome bisulfite sequencing maps from multiple human tissues reveal novel CpG islands associated with tissue-specific regulation. Hum Mol Genet 2015; 25:69-82. [PMID: 26512062 PMCID: PMC4690492 DOI: 10.1093/hmg/ddv449] [Citation(s) in RCA: 40] [Impact Index Per Article: 4.0] [Received: 05/21/2015] [Accepted: 10/21/2015] [Indexed: 01/25/2023]
Abstract
CpG islands (CGIs) are one of the most widely studied regulatory features of the human genome, with critical roles in development and disease. Despite such significance and the original epigenetic definition, currently used CGI sets are typically predicted from DNA sequence characteristics. Although CGIs are deeply implicated in practical analyses of DNA methylation, recent studies have shown that such computational annotations suffer from inaccuracies. Here we used whole-genome bisulfite sequencing from 10 diverse human tissues to identify a comprehensive, experimentally obtained, single-base resolution CGI catalog. In addition to the unparalleled annotation precision, our method is free from potential bias due to arbitrary sequence features or probe affinity differences. In addition to clarifying substantial false positives in the widely used University of California Santa Cruz (UCSC) annotations, our study identifies numerous novel epigenetic loci. In particular, we reveal significant impact of transposable elements on the epigenetic regulatory landscape of the human genome and demonstrate ubiquitous presence of transcription initiation at CGIs, including alternative promoters in gene bodies and non-coding RNAs in intergenic regions. Moreover, coordinated DNA methylation and chromatin modifications mark tissue-specific enhancers at novel CGIs. Enrichment of specific transcription factor binding from ChIP-seq supports mechanistic roles of CGIs on the regulation of tissue-specific transcription. The new CGI catalog provides a comprehensive and integrated list of genomic hotspots of epigenetic regulation.
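For readers unfamiliar with the sequence-based CGI predictions that the abstract contrasts with experimentally derived annotations, the following is a minimal sketch of that kind of criterion. It is not the method of the cited study. The thresholds (length of at least 200 bp, GC fraction of at least 0.5, observed/expected CpG ratio of at least 0.6) are commonly cited defaults and are assumptions here, as are the function names `cpg_stats` and `looks_like_cgi`.

```python
def cpg_stats(seq: str) -> tuple[float, float]:
    """Return (GC fraction, observed/expected CpG ratio) for a DNA sequence."""
    s = seq.upper()
    n = len(s)
    if n == 0:
        return 0.0, 0.0
    c = s.count("C")
    g = s.count("G")
    cpg = s.count("CG")  # "CG" cannot overlap itself, so count() is exact
    gc_fraction = (c + g) / n
    # Expected CpG count if C and G were placed independently.
    expected = c * g / n
    obs_exp = cpg / expected if expected > 0 else 0.0
    return gc_fraction, obs_exp


def looks_like_cgi(seq: str, min_len: int = 200,
                   min_gc: float = 0.5, min_obs_exp: float = 0.6) -> bool:
    """Apply illustrative sequence-only CGI thresholds (assumed defaults)."""
    gc_fraction, obs_exp = cpg_stats(seq)
    return len(seq) >= min_len and gc_fraction >= min_gc and obs_exp >= min_obs_exp
```

A criterion of this form depends only on base composition, which is why the abstract describes such annotations as predictions rather than measurements of methylation state.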
Affiliation(s)
- Isabel Mendizabal
- School of Biology, Georgia Institute of Technology, Atlanta, GA 30332, USA and Department of Genetics, Physical Anthropology and Animal Physiology, University of the Basque Country UPV/EHU, Barrio Sarriena s/n, 48940 Leioa, Spain
| | - Soojin V Yi
- School of Biology, Georgia Institute of Technology, Atlanta, GA 30332, USA
|
24
|
Gossmann TI, Santure AW, Sheldon BC, Slate J, Zeng K. Highly variable recombinational landscape modulates efficacy of natural selection in birds. Genome Biol Evol 2015; 6:2061-75. [PMID: 25062920 PMCID: PMC4231635 DOI: 10.1093/gbe/evu157] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.0] [Indexed: 12/25/2022] Open
Abstract
Determining the rate of protein evolution and identifying the causes of its variation across the genome are powerful ways to understand forces that are important for genome evolution. By using a multitissue transcriptome data set from great tit (Parus major), we analyzed patterns of molecular evolution between two passerine birds, great tit and zebra finch (Taeniopygia guttata), using the chicken genome (Gallus gallus) as an outgroup. We investigated whether a special feature of avian genomes, the highly variable recombinational landscape, modulates the efficacy of natural selection through the effects of Hill-Robertson interference, which predicts that selection should be more effective in removing deleterious mutations and incorporating beneficial mutations in high-recombination regions than in low-recombination regions. In agreement with these predictions, genes located in low-recombination regions tend to have a high proportion of neutrally evolving sites and relaxed selective constraint on sites subject to purifying selection, whereas genes that show strong support for past episodes of positive selection appear disproportionally in high-recombination regions. There is also evidence that genes located in high-recombination regions tend to have higher gene expression specificity than those located in low-recombination regions. Furthermore, more compact genes (i.e., those with fewer/shorter introns or shorter proteins) evolve faster than less compact ones. In sum, our results demonstrate that transcriptome sequencing is a powerful method to answer fundamental questions about genome evolution in nonmodel organisms.
Affiliation(s)
- Toni I Gossmann
- Department of Animal and Plant Sciences, University of Sheffield, United Kingdom
| | - Anna W Santure
- Department of Animal and Plant Sciences, University of Sheffield, United Kingdom; School of Biological Sciences, University of Auckland, New Zealand
| | - Ben C Sheldon
- Edward Grey Institute, Department of Zoology, University of Oxford, United Kingdom
| | - Jon Slate
- Department of Animal and Plant Sciences, University of Sheffield, United Kingdom
| | - Kai Zeng
- Department of Animal and Plant Sciences, University of Sheffield, United Kingdom
|
25
|
Galbraith DA, Yang X, Niño EL, Yi S, Grozinger C. Parallel epigenomic and transcriptomic responses to viral infection in honey bees (Apis mellifera). PLoS Pathog 2015; 11:e1004713. [PMID: 25811620 PMCID: PMC4374888 DOI: 10.1371/journal.ppat.1004713] [Citation(s) in RCA: 97] [Impact Index Per Article: 9.7] [Received: 10/06/2014] [Accepted: 01/28/2015] [Indexed: 01/07/2023] Open
Abstract
Populations of honey bees are declining throughout the world, with US beekeepers losing 30% of their colonies each winter. Though multiple factors are driving these colony losses, it is increasingly clear that viruses play a major role. However, information about the molecular mechanisms mediating antiviral immunity in honey bees is surprisingly limited. Here, we examined the transcriptional and epigenetic (DNA methylation) responses to viral infection in honey bee workers. One-day old worker honey bees were fed solutions containing Israeli Acute Paralysis Virus (IAPV), a virus which causes muscle paralysis and death and has previously been associated with colony loss. Uninfected control and infected, symptomatic bees were collected within 20-24 hours after infection. Worker fat bodies, the primary tissue involved in metabolism, detoxification and immune responses, were collected for analysis. We performed transcriptome- and bisulfite-sequencing of the worker fat bodies to identify genome-wide gene expression and DNA methylation patterns associated with viral infection. There were 753 differentially expressed genes (FDR<0.05) in infected versus control bees, including several genes involved in epigenetic and antiviral pathways. DNA methylation status of 156 genes (FDR<0.1) changed significantly as a result of the infection, including those involved in antiviral responses in humans. There was no significant overlap between the significantly differentially expressed and significantly differentially methylated genes, and indeed, the genomic characteristics of these sets of genes were quite distinct. Our results indicate that honey bees have two distinct molecular pathways, mediated by transcription and methylation, that modulate protein levels and/or function in response to viral infections.
Affiliation(s)
- David A. Galbraith
- Department of Entomology, Center for Pollinator Research, Pennsylvania State University, University Park, Pennsylvania, United States of America
| | - Xingyu Yang
- School of Biology, Georgia Institute of Technology, Atlanta, Georgia, United States of America
| | - Elina Lastro Niño
- Department of Entomology, Center for Pollinator Research, Pennsylvania State University, University Park, Pennsylvania, United States of America
| | - Soojin Yi
- School of Biology, Georgia Institute of Technology, Atlanta, Georgia, United States of America
| | - Christina Grozinger
- Department of Entomology, Center for Pollinator Research, Pennsylvania State University, University Park, Pennsylvania, United States of America
|
26
|
Helanterä H, Uller T. Neutral and adaptive explanations for an association between caste-biased gene expression and rate of sequence evolution. Front Genet 2014; 5:297. [PMID: 25221570 PMCID: PMC4148897 DOI: 10.3389/fgene.2014.00297] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.9] [Received: 05/16/2014] [Accepted: 08/08/2014] [Indexed: 12/30/2022] Open
Abstract
The castes of social insects provide outstanding opportunities to address the causes and consequences of evolution of discrete phenotypes, i.e., polymorphisms. Here we focus on recently described patterns of a positive association between the degree of caste-specific gene expression and the rate of sequence evolution. We outline how neutral and adaptive evolution can cause genes that are morph-biased in their expression profiles to exhibit historical signatures of faster or slower sequence evolution compared to unbiased genes. We conclude that evaluation of different hypotheses will benefit from (i) reconstruction of the phylogenetic origin of biased expression and changes in rates of sequence evolution, and (ii) replicated data on gene expression variation within versus between morphs. Although the data are limited at present, we suggest that the observed phylogenetic and intra-population variation in gene expression lends support to the hypothesis that the association between caste-biased expression and rate of sequence evolution largely is a result of neutral processes.
Affiliation(s)
- Heikki Helanterä
- Department of Biosciences, Centre of Excellence in Biological Interactions, University of Helsinki, Helsinki, Finland
| | - Tobias Uller
- Department of Zoology, Edward Grey Institute, University of Oxford, Oxford, UK
- Department of Biology, University of Lund, Sölvegatan, Lund, Sweden
|
27
|
Hurst LD, Sachenkova O, Daub C, Forrest ARR, the FANTOM consortium, Huminiecki L. A simple metric of promoter architecture robustly predicts expression breadth of human genes suggesting that most transcription factors are positive regulators. Genome Biol 2014; 15:413. [PMID: 25079787 PMCID: PMC4310617 DOI: 10.1186/s13059-014-0413-3] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.8] [Received: 12/10/2013] [Accepted: 07/15/2014] [Indexed: 12/29/2022] Open
Abstract
BACKGROUND: Conventional wisdom holds that, owing to the dominance of features such as chromatin level control, the expression of a gene cannot be readily predicted from knowledge of promoter architecture. This is reflected, for example, in a weak or absent correlation between promoter divergence and expression divergence between paralogs. However, an inability to predict may reflect an inability to accurately measure or employment of the wrong parameters. Here we address this issue through integration of two exceptional resources: ENCODE data on transcription factor binding and the FANTOM5 high-resolution expression atlas. RESULTS: Consistent with the notion that in eukaryotes most transcription factors are activating, the number of transcription factors binding a promoter is a strong predictor of expression breadth. In addition, evolutionarily young duplicates have fewer transcription factor binders and narrower expression. Nonetheless, we find several binders and cooperative sets that are disproportionately associated with broad expression, indicating that models more complex than simple correlations should hold more predictive power. Indeed, a machine learning approach improves fit to the data compared with a simple correlation. Machine learning could at best moderately predict tissue of expression of tissue specific genes. CONCLUSIONS: We find robust evidence that some expression parameters and paralog expression divergence are strongly predictable with knowledge of transcription factor binding repertoire. While some cooperative complexes can be identified, consistent with the notion that most eukaryotic transcription factors are activating, a simple predictor, the number of binding transcription factors found on a promoter, is a robust predictor of expression breadth.
Affiliation(s)
- Laurence D Hurst
- Department of Biology and Biochemistry, University of Bath, Bath, BA2 7AY UK
| | - Oxana Sachenkova
- Department of Biochemistry and Biophysics, Stockholm University, Stockholm, Sweden
- Science for Life Laboratory, SciLifeLab, Stockholm, Sweden
| | - Carsten Daub
- Science for Life Laboratory, SciLifeLab, Stockholm, Sweden
| | - Alistair RR Forrest
- RIKEN Omics Science Center, Yokohama, Japan
- Division of Genomic Technologies, RIKEN Center for Life Science Technologies, Yokohama, Kanagawa Japan
| | - the FANTOM consortium
- Department of Biology and Biochemistry, University of Bath, Bath, BA2 7AY UK
- Department of Biochemistry and Biophysics, Stockholm University, Stockholm, Sweden
- Science for Life Laboratory, SciLifeLab, Stockholm, Sweden
- RIKEN Omics Science Center, Yokohama, Japan
- Department of Cell and Molecular Biology, Karolinska Institutet, Stockholm, Sweden
- BILS bioinformatics infrastructure for life sciences, Stockholm, Sweden
- Department of Immunology Genetics and Pathology, Uppsala University, Uppsala, Sweden
- Division of Genomic Technologies, RIKEN Center for Life Science Technologies, Yokohama, Kanagawa Japan
| | - Lukasz Huminiecki
- Department of Biochemistry and Biophysics, Stockholm University, Stockholm, Sweden
- Science for Life Laboratory, SciLifeLab, Stockholm, Sweden
- Department of Cell and Molecular Biology, Karolinska Institutet, Stockholm, Sweden
- BILS bioinformatics infrastructure for life sciences, Stockholm, Sweden
- Department of Immunology Genetics and Pathology, Uppsala University, Uppsala, Sweden
|
28
|
Singh ND, Koerich LB, Carvalho AB, Clark AG. Positive and purifying selection on the Drosophila Y chromosome. Mol Biol Evol 2014; 31:2612-23. [PMID: 24974375 DOI: 10.1093/molbev/msu203] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.2] [Indexed: 02/06/2023] Open
Abstract
Y chromosomes, with their reduced effective population size, lack of recombination, and male-limited transmission, present a unique collection of constraints for the operation of natural selection. Male-limited transmission may greatly increase the efficacy of selection for male-beneficial mutations, but the reduced effective size also inflates the role of random genetic drift. Together, these defining features of the Y chromosome are expected to influence rates and patterns of molecular evolution on the Y as compared with X-linked or autosomal loci. Here, we use sequence data from 11 genes in 9 Drosophila species to gain insight into the efficacy of natural selection on the Drosophila Y relative to the rest of the genome. Drosophila is an ideal system for assessing the consequences of Y-linkage for molecular evolution in part because the gene content of Drosophila Y chromosomes is highly dynamic, with orthologous genes being Y-linked in some species whereas autosomal in others. Our results confirm the expectation that the efficacy of natural selection at weakly selected sites is reduced on the Y chromosome. In contrast, purifying selection on the Y chromosome for strongly deleterious mutations does not appear to be compromised. Finally, we find evidence of recurrent positive selection for 4 of the 11 genes studied here. Our results thus highlight the variable nature of the mode and impact of natural selection on the Drosophila Y chromosome.
Affiliation(s)
- Nadia D Singh
- Department of Biological Sciences, North Carolina State University
| | - Leonardo B Koerich
- Departamento de Genética, Universidade Federal do Rio de Janeiro, Rio de Janeiro, Brazil
| | | | - Andrew G Clark
- Department of Molecular Biology and Genetics, Cornell University
|
29
|
Chuang TJ, Chiang TW. Impacts of pretranscriptional DNA methylation, transcriptional transcription factor, and posttranscriptional microRNA regulations on protein evolutionary rate. Genome Biol Evol 2014; 6:1530-1541. [PMID: 24923326 PMCID: PMC4080426 DOI: 10.1093/gbe/evu124] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.0] [Accepted: 06/05/2014] [Indexed: 12/24/2022] Open
Abstract
Gene expression is largely regulated by DNA methylation, transcription factor (TF), and microRNA (miRNA) before, during, and after transcription, respectively. Although the evolutionary effects of TF/miRNA regulations have been widely studied, evolutionary analysis of simultaneously accounting for DNA methylation, TF, and miRNA regulations and whether promoter methylation and gene body (coding regions) methylation have different effects on the rate of gene evolution remain uninvestigated. Here, we compared human-macaque and human-mouse protein evolutionary rates against experimentally determined single base-resolution DNA methylation data, revealing that promoter methylation level is positively correlated with protein evolutionary rates but negatively correlated with TF/miRNA regulations, whereas the opposite was observed for gene body methylation level. Our results showed that the relative importance of these regulatory factors in determining the rate of mammalian protein evolution is as follows: Promoter methylation ≈ miRNA regulation > gene body methylation > TF regulation, and further indicated that promoter methylation and miRNA regulation have a significant dependent effect on protein evolutionary rates. Although the mechanisms underlying cooperation between DNA methylation and TFs/miRNAs in gene regulation remain unclear, our study helps to not only illuminate the impact of these regulatory factors on mammalian protein evolution but also their intricate interaction within gene regulatory networks.
Affiliation(s)
- Trees-Juen Chuang
- Division of Physical & Computational Genomics, Genomics Research Center, Academia Sinica, Taipei, Taiwan
| | - Tai-Wei Chiang
- Division of Physical & Computational Genomics, Genomics Research Center, Academia Sinica, Taipei, Taiwan
|
30
|
Chuang TJ, Chen FC. DNA methylation is associated with an increased level of conservation at nondegenerate nucleotides in mammals. Mol Biol Evol 2014; 31:387-396. [PMID: 24157417 PMCID: PMC3907051 DOI: 10.1093/molbev/mst208] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.1] [Indexed: 12/27/2022] Open
Abstract
DNA methylation at CpG dinucleotides can significantly increase the rate of cytosine-to-thymine mutations and the level of sequence divergence. Although the correlations between DNA methylation and genomic sequence evolution have been widely studied, an unaddressed yet fundamental question is how DNA methylation is associated with the conservation of individual nucleotides in different sequence contexts. Here, we demonstrate that in mammalian exons, the correlations between DNA methylation and the conservation of individual nucleotides are dependent on the type of exonic sequence (coding or untranslated), the degeneracy of coding nucleotides, background selection pressure, and the relative position (first or nonfirst exon in the transcript) where the nucleotides are located. For untranslated and nonzero-fold degenerate nucleotides, methylated sites are less conserved than unmethylated sites regardless of background selection pressure and the relative position of the exon. For zero-fold degenerate (or nondegenerate) nucleotides, however, the reverse trend is observed in nonfirst coding exons and first coding exons that are under stringent background selection pressure. Furthermore, cytosine-to-thymine mutations at methylated zero-fold degenerate nucleotides are predicted to be more detrimental than those that occur at unmethylated nucleotides. As zero-fold and nonzero-fold degenerate nucleotides are very close to each other, our results suggest that the "functional resolution" of DNA methylation may be finer than previously recognized. In addition, the positive correlation between CpG methylation and the level of conservation at zero-fold degenerate nucleotides implies that CpG methylation may serve as an "indicator" of functional importance of these nucleotides.
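The distinction between zero-fold (nondegenerate) and nonzero-fold degenerate nucleotides used in this abstract can be illustrated with a short sketch. It assumes the standard nuclear genetic code and is not taken from the cited study. The names `synonymous_alternatives` and `is_zero_fold` are hypothetical. A codon position is treated as zero-fold degenerate when every possible substitution at that position changes the encoded amino acid.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def synonymous_alternatives(codon: str, pos: int) -> int:
    """Count substitutions at codon position pos (0-2) that keep the amino acid."""
    codon = codon.upper().replace("U", "T")
    original = CODON_TABLE[codon]
    count = 0
    for base in BASES:
        if base == codon[pos]:
            continue
        mutant = codon[:pos] + base + codon[pos + 1:]
        if CODON_TABLE[mutant] == original:
            count += 1
    return count


def is_zero_fold(codon: str, pos: int) -> bool:
    """True when every substitution at this position is nonsynonymous."""
    return synonymous_alternatives(codon, pos) == 0
```

For example, all three positions of ATG (methionine) are zero-fold degenerate, whereas the third position of GGA (glycine) tolerates all three substitutions and is four-fold degenerate.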
Affiliation(s)
- Trees-Juen Chuang
- Physical and Computational Genomics Division, Genomics Research Center, Academia Sinica, Taipei, Taiwan
| | - Feng-Chi Chen
- Division of Biostatistics and Bioinformatics, Institute of Population Health Sciences, National Health Research Institutes, Miaoli County, Taiwan
- Department of Life Science, National Chiao-Tung University, Hsinchu, Taiwan
- Department of Dentistry, China Medical University, Taichung, Taiwan
|
31
|
Bellizzi D, D'Aquila P, Scafone T, Giordano M, Riso V, Riccio A, Passarino G. The control region of mitochondrial DNA shows an unusual CpG and non-CpG methylation pattern. DNA Res 2013; 20:537-47. [PMID: 23804556 PMCID: PMC3859322 DOI: 10.1093/dnares/dst029] [Citation(s) in RCA: 202] [Impact Index Per Article: 16.8] [Received: 12/10/2012] [Accepted: 05/30/2013] [Indexed: 11/23/2022] Open
Abstract
DNA methylation is a common epigenetic modification of the mammalian genome. Conflicting data regarding the possible presence of methylated cytosines within mitochondrial DNA (mtDNA) have been reported. To clarify this point, we analysed the methylation status of mtDNA control region (D-loop) on human and murine DNA samples from blood and cultured cells by bisulphite sequencing and methylated/hydroxymethylated DNA immunoprecipitation assays. We found methylated and hydroxymethylated cytosines in the L-strand of all samples analysed. MtDNA methylation particularly occurs within non-C-phosphate-G (non-CpG) nucleotides, mainly in the promoter region of the heavy strand and in conserved sequence blocks, suggesting its involvement in regulating mtDNA replication and/or transcription. We observed DNA methyltransferases within the mitochondria, but the inactivation of Dnmt1, Dnmt3a, and Dnmt3b in mouse embryonic stem (ES) cells results in a reduction of the CpG methylation, while the non-CpG methylation shows to be not affected. This suggests that D-loop epigenetic modification is only partially established by these enzymes. Our data show that DNA methylation occurs in the mtDNA control region of mammals, not only at symmetrical CpG dinucleotides, typical of nuclear genome, but in a peculiar non-CpG pattern previously reported for plants and fungi. The molecular mechanisms responsible for this pattern remain an open question.
Affiliation(s)
- Dina Bellizzi
- Department of Cell Biology, University of Calabria, Rende 87036, Italy
| | - Patrizia D'Aquila
- Department of Cell Biology, University of Calabria, Rende 87036, Italy
| | - Teresa Scafone
- Department of Cell Biology, University of Calabria, Rende 87036, Italy
| | - Marco Giordano
- Department of Cell Biology, University of Calabria, Rende 87036, Italy
| | - Vincenzo Riso
- Institute of Genetics and Biophysics—Adriano Buzzati Traverso, Napoli 80131, Italy
| | - Andrea Riccio
- Institute of Genetics and Biophysics—Adriano Buzzati Traverso, Napoli 80131, Italy
|
32
|
Whole-genome sequences of DA and F344 rats with different susceptibilities to arthritis, autoimmunity, inflammation and cancer. Genetics 2013; 194:1017-28. [PMID: 23695301 PMCID: PMC3730908 DOI: 10.1534/genetics.113.153049] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.2] [Indexed: 11/18/2022] Open
Abstract
DA (D-blood group of Palm and Agouti, also known as Dark Agouti) and F344 (Fischer) are two inbred rat strains with differences in several phenotypes, including susceptibility to autoimmune disease models and inflammatory responses. While these strains have been extensively studied, little information is available about the DA and F344 genomes, as only the Brown Norway (BN) and spontaneously hypertensive rat strains have been sequenced to date. Here we report the sequencing of the DA and F344 genomes using next-generation Illumina paired-end read technology and the first de novo assembly of a rat genome. DA and F344 were sequenced with an average depth of 32-fold, covered 98.9% of the BN reference genome, and included 97.97% of known rat ESTs. New sequences could be assigned to 59 million positions with previously unknown data in the BN reference genome. Differences between DA, F344, and BN included 19 million positions in novel scaffolds, 4.09 million single nucleotide polymorphisms (SNPs) (including 1.37 million new SNPs), 458,224 short insertions and deletions, and 58,174 structural variants. Genetic differences between DA, F344, and BN, including high-impact SNPs and short insertions and deletions affecting >2500 genes, are likely to account for most of the phenotypic variation between these strains. The new DA and F344 genome sequencing data should facilitate gene discovery efforts in rat models of human disease.
|
33
|
Huh I, Zeng J, Park T, Yi SV. DNA methylation and transcriptional noise. Epigenetics Chromatin 2013; 6:9. [PMID: 23618007 PMCID: PMC3641963 DOI: 10.1186/1756-8935-6-9] [Citation(s) in RCA: 91] [Impact Index Per Article: 7.6] [Received: 06/01/2012] [Accepted: 04/05/2013] [Indexed: 01/18/2023] Open
Abstract
BACKGROUND: DNA methylation is one of the most phylogenetically widespread epigenetic modifications of genomic DNA. In particular, DNA methylation of transcription units ('gene bodies') is highly conserved across diverse taxa. However, the functional role of gene body methylation is not yet fully understood. A long-standing hypothesis posits that gene body methylation reduces transcriptional noise associated with spurious transcription of genes. Despite the plausibility of this hypothesis, an explicit test of this hypothesis has not been performed until now. RESULTS: Using nucleotide-resolution data on genomic DNA methylation and abundant microarray data, here we investigate the relationship between DNA methylation and transcriptional noise. Transcriptional noise measured from microarrays scales down with expression abundance, confirming findings from single-cell studies. We show that gene body methylation is significantly negatively associated with transcriptional noise when examined in the context of other biological factors. CONCLUSIONS: This finding supports the hypothesis that gene body methylation suppresses transcriptional noise. Heavy methylation of vertebrate genomes may have evolved as a global regulatory mechanism to control for transcriptional noise. In contrast, promoter methylation exhibits positive correlations with the level of transcriptional noise. We hypothesize that methylated promoters tend to undergo more frequent transcriptional bursts than those that avoid DNA methylation.
Affiliation(s)
- Iksoo Huh
- School of Biology, Institute of Bioengineering and Biosciences, Georgia Institute of Technology, 310 Ferst Drive, Atlanta, GA, 30332, USA.
|
34
|
Ogino S, Lochhead P, Chan AT, Nishihara R, Cho E, Wolpin BM, Meyerhardt JA, Meissner A, Schernhammer ES, Fuchs CS, Giovannucci E. Molecular pathological epidemiology of epigenetics: emerging integrative science to analyze environment, host, and disease. Mod Pathol 2013; 26:465-84. [PMID: 23307060 PMCID: PMC3637979 DOI: 10.1038/modpathol.2012.214] [Citation(s) in RCA: 166] [Impact Index Per Article: 13.8] [Indexed: 02/07/2023]
Abstract
Epigenetics acts as an interface between environmental/exogenous factors, cellular responses, and pathological processes. Aberrant epigenetic signatures are a hallmark of complex multifactorial diseases (including neoplasms and malignancies such as leukemias, lymphomas, sarcomas, and breast, lung, prostate, liver, and colorectal cancers). Epigenetic signatures (DNA methylation, mRNA and microRNA expression, etc) may serve as biomarkers for risk stratification, early detection, and disease classification, as well as targets for therapy and chemoprevention. In particular, DNA methylation assays are widely applied to formalin-fixed, paraffin-embedded archival tissue specimens as clinical pathology tests. To better understand the interplay between etiological factors, cellular molecular characteristics, and disease evolution, the field of 'molecular pathological epidemiology (MPE)' has emerged as an interdisciplinary integration of 'molecular pathology' and 'epidemiology'. In contrast to traditional epidemiological research including genome-wide association studies (GWAS), MPE is founded on the unique disease principle, that is, each disease process results from unique profiles of exposomes, epigenomes, transcriptomes, proteomes, metabolomes, microbiomes, and interactomes in relation to the macroenvironment and tissue microenvironment. MPE may represent a logical evolution of GWAS, termed 'GWAS-MPE approach'. Although epigenome-wide association study attracts increasing attention, currently, it has a fundamental problem in that each cell within one individual has a unique, time-varying epigenome. Having a similar conceptual framework to systems biology, the holistic MPE approach enables us to link potential etiological factors to specific molecular pathology, and gain novel pathogenic insights on causality. 
The widespread application of epigenome (eg, methylome) analyses will enhance our understanding of disease heterogeneity, epigenotypes (CpG island methylator phenotype, LINE-1 (long interspersed nucleotide element-1; also called long interspersed nuclear element-1; long interspersed element-1; L1) hypomethylation, etc), and host-disease interactions. In this article, we illustrate increasing contribution of modern pathology to broader public health sciences, which attests pivotal roles of pathologists in the new integrated MPE science towards our ultimate goal of personalized medicine and prevention.
Affiliation(s)
- Shuji Ogino
- Department of Pathology, Brigham and Women's Hospital, and Harvard Medical School, Boston, MA 02215, USA.
|
35
|
Panda A, Begum T, Ghosh TC. Insights into the evolutionary features of human neurodegenerative diseases. PLoS One 2012; 7:e48336. [PMID: 23118989 PMCID: PMC3484049 DOI: 10.1371/journal.pone.0048336] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.7] [Received: 06/04/2012] [Accepted: 09/24/2012] [Indexed: 02/06/2023] Open
Abstract
Comparative analyses between human disease and non-disease genes are of great interest in understanding human disease gene evolution. However, the progression of neurodegenerative diseases (NDD) involving amyloid formation in specific brain regions is still unknown. Therefore, in this study, we mainly focused our analysis on the evolutionary features of human NDD genes with respect to non-disease genes. Here, we observed that human NDD genes are evolutionarily conserved relative to non-disease genes. To elucidate the conserved nature of NDD genes, we incorporated the evolutionary attributes like gene expression level, number of regulatory miRNAs, protein connectivity, intrinsic disorder content and relative aggregation propensity in our analysis. Our studies demonstrate that NDD genes have higher gene expression levels in favor of their lower evolutionary rates. Additionally, we observed that NDD genes have higher number of different regulatory miRNAs target sites and also have higher interaction partners than the non-disease genes. Moreover, miRNA targeted genes are known to have higher disorder content. In contrast, our analysis exclusively established that NDD genes have lower disorder content. In favor of our analysis, we found that NDD gene encoded proteins are enriched with multi interface hubs (party hubs) with lower disorder contents. Since, proteins with higher disorder content need to adapt special structure to reduce their aggregation propensity, NDD proteins found to have elevated relative aggregation propensity (RAP) in support of their lower disorder content. Finally, our categorical regression analysis confirmed the underlined relative dominance of protein connectivity, 3'UTR length, RAP, nature of hubs (singlish/multi interface) and disorder content for such evolutionary rates variation between human NDD genes and non-disease genes.
Affiliation(s)
- Arup Panda
- Bioinformatics Centre, Bose Institute, Kolkata, India
- Tina Begum
- Bioinformatics Centre, Bose Institute, Kolkata, India
|
36
|
Chuang TJ, Chen FC, Chen YZ. Position-dependent correlations between DNA methylation and the evolutionary rates of mammalian coding exons. Proc Natl Acad Sci U S A 2012; 109:15841-15846. [PMID: 23019368 PMCID: PMC3465446 DOI: 10.1073/pnas.1208214109] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.7] [Indexed: 12/11/2022]
Abstract
DNA cytosine methylation is a central epigenetic marker that is usually mutagenic and may increase the level of sequence divergence. However, methylated genes have been reported to evolve more slowly than unmethylated genes. Hence, there is a controversy on whether DNA methylation is correlated with increased or decreased protein evolutionary rates. We hypothesize that this controversy has resulted from the differential correlations between DNA methylation and the evolutionary rates of coding exons in different genic positions. To test this hypothesis, we compare human-mouse and human-macaque exonic evolutionary rates against experimentally determined single-base resolution DNA methylation data derived from multiple human cell types. We show that DNA methylation is significantly related to within-gene variations in evolutionary rates. First, DNA methylation level is more strongly correlated with C-to-T mutations at CpG dinucleotides in the first coding exons than in the internal and last exons, although it is positively correlated with the synonymous substitution rate in all exon positions. Second, for the first exons, DNA methylation level is negatively correlated with exonic expression level, but positively correlated with both nonsynonymous substitution rate and the sample specificity of DNA methylation level. For the internal and last exons, however, we observe the opposite correlations. Our results imply that DNA methylation level is differentially correlated with the biological (and evolutionary) features of coding exons in different genic positions. The first exons appear more prone to the mutagenic effects, whereas the other exons are more influenced by the regulatory effects of DNA methylation.
Affiliation(s)
- Trees-Juen Chuang
- Physical and Computational Genomics Division, Genomics Research Center, Academia Sinica, Taipei 115, Taiwan
- Feng-Chi Chen
- Division of Biostatistics and Bioinformatics, Institute of Population Health Sciences, National Health Research Institutes, Miaoli County 350, Taiwan
- Department of Life Science, National Chiao Tung University, Hsinchu 300, Taiwan
- Department of Dentistry, China Medical University, Taichung 404, Taiwan
- Yen-Zho Chen
- Physical and Computational Genomics Division, Genomics Research Center, Academia Sinica, Taipei 115, Taiwan
|