1
Crossley ER, Fedorova L, Mulyar O, Freeman R, Khuder S, Fedorov A. Computational identification of ultra-conserved elements in the human genome: a hypothesis on homologous DNA pairing. NAR Genom Bioinform 2024; 6:lqae074. [PMID: 38962254 PMCID: PMC11217675 DOI: 10.1093/nargab/lqae074] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/28/2024] [Revised: 05/29/2024] [Accepted: 06/19/2024] [Indexed: 07/05/2024] [Open access]
Abstract
Thousands of prolonged sequences of human ultra-conserved non-coding elements (UCNEs) share only one common feature: peculiarities in the unique composition of their dinucleotides. Here we investigate whether the numerous weak signals emanating from these dinucleotide arrangements can be used for computational identification of UCNEs within the human genome. For this purpose, we analyzed 4272 UCNE sequences, encompassing 1 393 448 nucleotides, alongside equally sized control samples of randomly selected human genomic sequences. Our research identified nine different features of dinucleotide arrangements that enable differentiation of UCNEs from the rest of the genome. We employed these nine features, implementing three Machine Learning techniques - Support Vector Machine, Random Forest, and Artificial Neural Networks - to classify UCNEs, achieving an accuracy rate of 82-84%, with specific conditions allowing for over 90% accuracy. Notably, the strongest feature for UCNE identification was the frequency ratio between GpC dinucleotides and the sum of GpG and CpC dinucleotides. Additionally, we investigated the entire pool of 31 046 SNPs located within UCNEs for their representation in the ClinVar database, which catalogs human SNPs with known phenotypic effects. The presence of UCNE-associated SNPs in ClinVar aligns with the expectation of a random distribution, emphasizing the enigmatic nature of UCNE phenotypic manifestation.
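The strongest feature named in this abstract, the frequency ratio of GpC dinucleotides to the sum of GpG and CpC dinucleotides, can be illustrated with a minimal Python sketch. The function names `dinucleotide_counts` and `gpc_ratio` are hypothetical, and the use of overlapping windows on a single strand is an assumption; the abstract does not specify how the authors counted dinucleotides.

```python
from collections import Counter
from typing import Optional


def dinucleotide_counts(seq: str) -> Counter:
    """Count overlapping dinucleotides on the given strand (an assumption)."""
    seq = seq.upper()
    return Counter(seq[i:i + 2] for i in range(len(seq) - 1))


def gpc_ratio(seq: str) -> Optional[float]:
    """Return count(GC) / (count(GG) + count(CC)), or None if undefined."""
    counts = dinucleotide_counts(seq)
    denominator = counts["GG"] + counts["CC"]
    if denominator == 0:
        return None  # ratio is undefined when no GG or CC is present
    return counts["GC"] / denominator


if __name__ == "__main__":
    # Toy sequence for illustration only, not real UCNE data.
    print(gpc_ratio("GCGCGGCC"))
```

Because both numerator and denominator come from the same sequence, the ratio of counts equals the ratio of frequencies, so no length normalization is needed in this sketch.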
Affiliation(s)
- Emily R Crossley
- Program of Bioinformatics and Proteomics/Genomics, University of Toledo, Toledo, OH 43606, USA
- Sadik Khuder
- Program of Bioinformatics and Proteomics/Genomics, University of Toledo, Toledo, OH 43606, USA
- Department of Medicine, University of Toledo, Toledo, OH 43606, USA
- Alexei Fedorov
- Program of Bioinformatics and Proteomics/Genomics, University of Toledo, Toledo, OH 43606, USA
- CRI Genetics LLC, Santa Monica, CA 90404, USA
- Department of Medicine, University of Toledo, Toledo, OH 43606, USA
2
de Jong MJ, van Oosterhout C, Hoelzel AR, Janke A. Moderating the neutralist-selectionist debate: exactly which propositions are we debating, and which arguments are valid? Biol Rev Camb Philos Soc 2024; 99:23-55. [PMID: 37621151 DOI: 10.1111/brv.13010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/15/2022] [Revised: 08/04/2023] [Accepted: 08/07/2023] [Indexed: 08/26/2023]
Abstract
Half a century after its foundation, the neutral theory of molecular evolution continues to attract controversy. The debate has been hampered by the coexistence of different interpretations of the core proposition of the neutral theory, the 'neutral mutation-random drift' hypothesis. In this review, we trace the origins of these ambiguities and suggest potential solutions. We highlight the difference between the original, the revised and the nearly neutral hypothesis, and re-emphasise that none of them equates to the null hypothesis of strict neutrality. We distinguish the neutral hypothesis of protein evolution, the main focus of the ongoing debate, from the neutral hypotheses of genomic and functional DNA evolution, which for many species are generally accepted. We advocate a further distinction between a narrow and an extended neutral hypothesis (of which the latter posits that random non-conservative amino acid substitutions can cause non-ecological phenotypic divergence), and we discuss the implications for evolutionary biology beyond the domain of molecular evolution. We furthermore point out that the debate has widened from its initial focus on point mutations, and also concerns the fitness effects of large-scale mutations, which can alter the dosage of genes and regulatory sequences. We evaluate the validity of neutralist and selectionist arguments and find that the tested predictions, apart from being sensitive to violation of underlying assumptions, are often derived from the null hypothesis of strict neutrality, or equally consistent with the opposing selectionist hypothesis, except when assuming molecular panselectionism. Our review aims to facilitate a constructive neutralist-selectionist debate, and thereby to contribute to answering a key question of evolutionary biology: what proportions of amino acid and nucleotide substitutions and polymorphisms are adaptive?
Affiliation(s)
- Menno J de Jong
- Senckenberg Biodiversity and Climate Research Institute (SBiK-F), Georg-Voigt-Strasse 14-16, Frankfurt am Main, 60325, Germany
- Cock van Oosterhout
- Centre for Ecology, Evolution and Conservation, University of East Anglia, Norwich Research Park, Norwich, NR4 7TJ, UK
- A Rus Hoelzel
- Department of Biosciences, Durham University, South Road, Durham, DH1 3LE, UK
- Axel Janke
- Senckenberg Biodiversity and Climate Research Institute (SBiK-F), Georg-Voigt-Strasse 14-16, Frankfurt am Main, 60325, Germany
- Institute for Ecology, Evolution and Diversity, Goethe University, Max-von-Laue-Strasse 9, Frankfurt am Main, 60438, Germany
- LOEWE-Centre for Translational Biodiversity Genomics (TBG), Senckenberg Nature Research Society, Georg-Voigt-Straße 14-16, Frankfurt am Main, 60325, Germany
3
Liu A, Wang N, Xie G, Li Y, Yan X, Li X, Zhu Z, Li Z, Yang J, Meng F, Dou M, Chen W, Ma N, Jiang Y, Gao Y, Wang Y. GC-biased gene conversion drives accelerated evolution of ultraconserved elements in mammalian and avian genomes. Genome Res 2023; 33:1673-1689. [PMID: 37884342 PMCID: PMC10691551 DOI: 10.1101/gr.277784.123] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/08/2023] [Accepted: 08/23/2023] [Indexed: 10/28/2023]
Abstract
Ultraconserved elements (UCEs) are the most conserved regions among the genomes of evolutionarily distant species and are thought to perform critical biological functions. However, some UCEs rapidly evolved in specific lineages, and whether they contributed to adaptive evolution is still controversial. Here, using an increased number of sequenced genomes with high taxonomic coverage, we identified 2191 mammalian UCEs and 5938 avian UCEs from 95 mammal and 94 bird genomes, respectively. Our results show that these UCEs are functionally constrained and that their adjacent genes are prone to widespread expression with low expression diversity across tissues. Functional enrichment of mammalian and avian UCEs shows different trends, indicating that UCEs may contribute to adaptive evolution of taxa. Focusing on lineage-specific accelerated evolution, we discover that the proportion of fast-evolving UCEs in nine mammalian and 10 avian test lineages ranges from 0.19% to 13.2%. Notably, up to 62.1% of fast-evolving UCEs in test lineages are much more likely to result from GC-biased gene conversion (gBGC). A single cervid-specific gBGC region embracing the uc.359 allele significantly alters the expression of Nova1 and other neural-related genes in the rat brain. Combined with the altered regulatory activity of ancient gBGC-induced fast-evolving UCEs in eutherians, our results provide evidence that synergy between gBGC and selection shaped lineage-specific substitution patterns, even in the most constrained regulatory elements. In summary, our results show that gBGC played an important role in facilitating lineage-specific accelerated evolution of UCEs, and further support the idea that a combination of multiple evolutionary forces shapes adaptive evolution.
Affiliation(s)
- Anguo Liu
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi 712100, China
- Key Laboratory of Livestock Biology, Northwest A&F University, Yangling, Shaanxi 712100, China
- Nini Wang
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi 712100, China
- Faculty of Mathematics and Natural Sciences, University of Cologne, and Cologne Excellence Cluster for Cellular Stress Responses in Aging-Associated Diseases (CECAD), University Hospital Cologne, Cologne 50931, Germany
- Guoxiang Xie
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi 712100, China
- Key Laboratory of Livestock Biology, Northwest A&F University, Yangling, Shaanxi 712100, China
- Yang Li
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi 712100, China
- Key Laboratory of Livestock Biology, Northwest A&F University, Yangling, Shaanxi 712100, China
- Xixi Yan
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi 712100, China
- Key Laboratory of Livestock Biology, Northwest A&F University, Yangling, Shaanxi 712100, China
- Xinmei Li
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi 712100, China
- Key Laboratory of Livestock Biology, Northwest A&F University, Yangling, Shaanxi 712100, China
- Zhenliang Zhu
- Key Laboratory of Livestock Biology, Northwest A&F University, Yangling, Shaanxi 712100, China
- College of Veterinary Medicine, Northwest A&F University, Yangling, Shaanxi 712100, China
- Key Laboratory of Animal Biotechnology, Ministry of Agriculture, Northwest A&F University, Yangling, Shaanxi 712100, China
- Zhuohui Li
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi 712100, China
- Key Laboratory of Livestock Biology, Northwest A&F University, Yangling, Shaanxi 712100, China
- Jing Yang
- Key Laboratory of Livestock Biology, Northwest A&F University, Yangling, Shaanxi 712100, China
- College of Veterinary Medicine, Northwest A&F University, Yangling, Shaanxi 712100, China
- Key Laboratory of Animal Biotechnology, Ministry of Agriculture, Northwest A&F University, Yangling, Shaanxi 712100, China
- Fanxin Meng
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi 712100, China
- Mingle Dou
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi 712100, China
- Key Laboratory of Livestock Biology, Northwest A&F University, Yangling, Shaanxi 712100, China
- Weihuang Chen
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi 712100, China
- Nange Ma
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi 712100, China
- Key Laboratory of Livestock Biology, Northwest A&F University, Yangling, Shaanxi 712100, China
- Yu Jiang
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi 712100, China
- Key Laboratory of Livestock Biology, Northwest A&F University, Yangling, Shaanxi 712100, China
- Center for Functional Genomics, Institute of Future Agriculture, Northwest A&F University, Yangling, Shaanxi 712100, China
- Yuanpeng Gao
- Key Laboratory of Livestock Biology, Northwest A&F University, Yangling, Shaanxi 712100, China
- College of Veterinary Medicine, Northwest A&F University, Yangling, Shaanxi 712100, China
- Key Laboratory of Animal Biotechnology, Ministry of Agriculture, Northwest A&F University, Yangling, Shaanxi 712100, China
- Yu Wang
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi 712100, China
- Key Laboratory of Livestock Biology, Northwest A&F University, Yangling, Shaanxi 712100, China
4
Thomas GWC, Hughes JJ, Kumon T, Berv JS, Nordgren CE, Lampson M, Levine M, Searle JB, Good JM. The genomic landscape, causes, and consequences of extensive phylogenomic discordance in Old World mice and rats. bioRxiv 2023:2023.08.28.555178. [PMID: 37693498 PMCID: PMC10491188 DOI: 10.1101/2023.08.28.555178] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/12/2023]
Abstract
A species tree is a central concept in evolutionary biology whereby a single branching phylogeny reflects relationships among species. However, the phylogenies of different genomic regions often differ from the species tree. Although tree discordance is often widespread in phylogenomic studies, we still lack a clear understanding of how variation in phylogenetic patterns is shaped by genome biology or the extent to which discordance may compromise comparative studies. We characterized patterns of phylogenomic discordance across the murine rodents (Old World mice and rats), a large and ecologically diverse group that gave rise to the mouse and rat model systems. Combining new linked-read genome assemblies for seven murine species with eleven published rodent genomes, we first used ultra-conserved elements (UCEs) to infer a robust species tree. We then used whole genomes to examine finer-scale patterns of discordance and found that proximate chromosomal regions yielded similar phylogenies. However, there was no relationship between tree similarity and local recombination rates in house mice, suggesting that genetic linkage influences phylogenetic patterns over deeper timescales. This signal may be independent of contemporary recombination landscapes. We also detected a strong influence of linked selection whereby purifying selection at UCEs led to less discordance, while genes experiencing positive selection showed more discordant and variable phylogenetic signals. Finally, we show that assuming a single species tree can result in high error rates when testing for positive selection under different models. Collectively, our results highlight the complex relationship between phylogenetic inference and genome biology and underscore how failure to account for this complexity can mislead comparative genomic studies.
Affiliation(s)
- Gregg W. C. Thomas
- Division of Biological Sciences, University of Montana, Missoula, MT, 59801
- Informatics Group, Harvard University, Cambridge, MA, 02138
- Jonathan J. Hughes
- Department of Ecology and Evolutionary Biology, Cornell University, Ithaca, NY, 14853
- Department of Evolution, Ecology, and Organismal Biology, University of California Riverside, Riverside, CA, 92521
- Tomohiro Kumon
- Department of Biology, University of Pennsylvania, Philadelphia, PA, 19104
- Jacob S. Berv
- Department of Ecology and Evolutionary Biology, Cornell University, Ithaca, NY, 14853
- Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, MI, 48109
- C. Erik Nordgren
- Department of Biology, University of Pennsylvania, Philadelphia, PA, 19104
- Michael Lampson
- Department of Biology, University of Pennsylvania, Philadelphia, PA, 19104
- Mia Levine
- Department of Biology, University of Pennsylvania, Philadelphia, PA, 19104
- Jeremy B. Searle
- Department of Ecology and Evolutionary Biology, Cornell University, Ithaca, NY, 14853
- Jeffrey M. Good
- Division of Biological Sciences, University of Montana, Missoula, MT, 59801
5
Alda F, Ludt WB, Elías DJ, McMahan CD, Chakrabarty P. Comparing Ultraconserved Elements and Exons for Phylogenomic Analyses of Middle American Cichlids: When Data Agree to Disagree. Genome Biol Evol 2021; 13:evab161. [PMID: 34272856 PMCID: PMC8369075 DOI: 10.1093/gbe/evab161] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Accepted: 07/05/2021] [Indexed: 12/20/2022] [Open access]
Abstract
Choosing among types of genomic markers to be used in a phylogenomic study can have a major influence on the cost, design, and results of a study. Yet few attempts have been made to compare categories of next-generation sequence markers, limiting our ability to compare the suitability of these different genomic fragment types. Here, we explore properties of different genomic markers to find if they vary in the accuracy of component phylogenetic trees and to clarify the causes of conflict obtained from different data sets or inference methods. As a test case, we explore the causes of discordance between phylogenetic hypotheses obtained using a novel data set of ultraconserved elements (UCEs) and a recently published exon data set of the cichlid tribe Heroini. Resolving relationships among heroine cichlids has historically been difficult, and the processes of colonization and diversification in Middle America and the Greater Antilles are not yet well understood. Despite differences in informativeness and levels of gene tree discordance between UCEs and exons, the resulting phylogenomic hypotheses generally agree on most relationships. The independent data sets disagreed in areas with low phylogenetic signal that were overwhelmed by incomplete lineage sorting and nonphylogenetic signals. For UCEs, high levels of incomplete lineage sorting were found to be the major cause of gene tree discordance, whereas, for exons, nonphylogenetic signal is most likely caused by a reduced number of highly informative loci. This paucity of informative loci in exons might be due to heterogeneous substitution rates that are problematic to model (i.e., computationally restrictive), resulting in systematic errors that UCEs (being less informative individually but more uniform) are less prone to. These results generally demonstrate the robustness of phylogenomic methods to accommodate genomic markers with different biological and phylogenetic properties. However, we identify common and unique pitfalls of different categories of genomic fragments when inferring enigmatic phylogenetic relationships.
Affiliation(s)
- Fernando Alda
- Department of Biology, Geology and Environmental Science, University of Tennessee at Chattanooga, Tennessee, USA
- William B Ludt
- Department of Ichthyology, Natural History Museum of Los Angeles County, Los Angeles, California, USA
- Diego J Elías
- Museum of Natural Science, Department of Biological Sciences, Louisiana State University, Baton Rouge, Louisiana, USA
- Prosanta Chakrabarty
- Museum of Natural Science, Department of Biological Sciences, Louisiana State University, Baton Rouge, Louisiana, USA
6
Ding Z, Yan Y, Guo YL, Wang C. Esophageal carcinoma cell-excreted exosomal uc.189 promotes lymphatic metastasis. Aging (Albany NY) 2021; 13:13846-13858. [PMID: 34024769 PMCID: PMC8202844 DOI: 10.18632/aging.202979] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 01/27/2020] [Accepted: 02/16/2021] [Indexed: 04/12/2023]
Abstract
Most cancers are old age-related diseases. Patients with lymphatic metastasis have an extremely poor prognosis in esophageal cancers (ECs). Previous studies showed that ultraconserved RNAs are involved in tumorigenesis and that ultraconserved RNA 189 (uc.189) serves as an oncogene in cervical cancer, but the effect of exosomal uc.189 in esophageal squamous cell carcinoma (ESCC) remains undefined. This study revealed that uc.189 is closely correlated with lymph node (LN) metastasis and the number of lymphatic vessels in ESCC. ESCC-secreted exosomal uc.189 is transferred into human lymphatic endothelial cells (HLECs) to promote their proliferation, migration and tube formation and thereby facilitate lymph node metastasis. Mechanistically, uc.189 regulated EPHA2 expression by directly binding to its 3'UTR, as shown by a dual-luciferase reporter assay. Over-expression and knockdown of EPHA2 could respectively rescue and simulate the effects induced by exosomal uc.189. In particular, the uc.189-EPHA2 axis activates the P38MAPK/VEGF-C pathway in HLECs. Finally, ESCC-secreted exosomal uc.189 promotes HLEC sprouting in vitro, migration, and lymphangiogenesis. Thus, these findings suggest that exosomal uc.189 targets EPHA2 in HLECs to promote lymphangiogenesis, and may represent a novel marker for the diagnosis and treatment of ESCC patients in early stages.
Affiliation(s)
- Zhiyan Ding
- Department of Pathology, The Affiliated Hospital of Yangzhou University, Yangzhou University, Yangzhou 225009, PR China
- Yun Yan
- Department of Pathology, The Affiliated Hospital of Yangzhou University, Yangzhou University, Yangzhou 225009, PR China
- Institute of Translational Medicine, Medical College, Yangzhou University, Yangzhou 225001, PR China
- Yu Lian Guo
- Institute of Translational Medicine, Medical College, Yangzhou University, Yangzhou 225001, PR China
- Chenghai Wang
- Institute of Translational Medicine, Medical College, Yangzhou University, Yangzhou 225001, PR China
- Jiangsu Key Laboratory of Integrated Traditional Chinese and Western Medicine for Prevention and Treatment of Senile Diseases, Yangzhou University, Yangzhou 225001, PR China
7
Snetkova V, Ypsilanti AR, Akiyama JA, Mannion BJ, Plajzer-Frick I, Novak CS, Harrington AN, Pham QT, Kato M, Zhu Y, Godoy J, Meky E, Hunter RD, Shi M, Kvon EZ, Afzal V, Tran S, Rubenstein JLR, Visel A, Pennacchio LA, Dickel DE. Ultraconserved enhancer function does not require perfect sequence conservation. Nat Genet 2021; 53:521-528. [PMID: 33782603 PMCID: PMC8038972 DOI: 10.1038/s41588-021-00812-3] [Citation(s) in RCA: 48] [Impact Index Per Article: 12.0] [Received: 05/14/2020] [Accepted: 02/04/2021] [Indexed: 01/09/2023]
Abstract
Ultraconserved enhancer sequences show perfect conservation between human and rodent genomes, suggesting that their functions are highly sensitive to mutation. However, current models of enhancer function do not sufficiently explain this extreme evolutionary constraint. We subjected 23 ultraconserved enhancers to different levels of mutagenesis, collectively introducing 1,547 mutations, and examined their activities in transgenic mouse reporter assays. Overall, we find that the regulatory properties of ultraconserved enhancers are robust to mutation. Upon mutagenesis, nearly all (19/23, 83%) still functioned as enhancers at one developmental stage, as did most of those tested again later in development (5/9, 56%). Replacement of endogenous enhancers with mutated alleles in mice corroborated results of transgenic assays, including the functional resilience of ultraconserved enhancers to mutation. Our findings show that the currently known activities of ultraconserved enhancers do not necessarily require the perfect conservation observed in evolution and suggest that additional regulatory or other functions contribute to their sequence constraint.
Affiliation(s)
- Valentina Snetkova
- Environmental Genomics & System Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Athena R Ypsilanti
- Department of Psychiatry, Neuroscience Program, UCSF Weill Institute for Neurosciences, and the Nina Ireland Laboratory of Developmental Neurobiology, University of California, San Francisco, San Francisco, CA, USA
- Jennifer A Akiyama
- Environmental Genomics & System Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Brandon J Mannion
- Environmental Genomics & System Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Comparative Biochemistry Program, University of California, Berkeley, Berkeley, CA, USA
- Ingrid Plajzer-Frick
- Environmental Genomics & System Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Catherine S Novak
- Environmental Genomics & System Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Anne N Harrington
- Environmental Genomics & System Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Quan T Pham
- Environmental Genomics & System Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Momoe Kato
- Environmental Genomics & System Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Yiwen Zhu
- Environmental Genomics & System Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Janeth Godoy
- Environmental Genomics & System Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Eman Meky
- Environmental Genomics & System Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Riana D Hunter
- Environmental Genomics & System Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Marie Shi
- Environmental Genomics & System Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Evgeny Z Kvon
- Environmental Genomics & System Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Department of Developmental & Cell Biology, Department of Ecology & Evolutionary Biology, University of California, Irvine, Irvine, CA, USA
- Veena Afzal
- Environmental Genomics & System Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Stella Tran
- Environmental Genomics & System Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- John L R Rubenstein
- Department of Psychiatry, Neuroscience Program, UCSF Weill Institute for Neurosciences, and the Nina Ireland Laboratory of Developmental Neurobiology, University of California, San Francisco, San Francisco, CA, USA
- Axel Visel
- Environmental Genomics & System Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- US Department of Energy Joint Genome Institute, Berkeley, CA, USA
- School of Natural Sciences, University of California, Merced, Merced, CA, USA
- Len A Pennacchio
- Environmental Genomics & System Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Comparative Biochemistry Program, University of California, Berkeley, Berkeley, CA, USA
- US Department of Energy Joint Genome Institute, Berkeley, CA, USA
- Diane E Dickel
- Environmental Genomics & System Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
9
Habic A, Mattick JS, Calin GA, Krese R, Konc J, Kunej T. Genetic Variations of Ultraconserved Elements in the Human Genome. OMICS 2020; 23:549-559. [PMID: 31689173 DOI: 10.1089/omi.2019.0156] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Indexed: 12/24/2022]
Abstract
Ultraconserved elements (UCEs) are among the most popular DNA markers for phylogenomic analysis. A total of 2189 UCEs of at least 200 bp in length have been identified that are identical in at least three of five placental mammalian genomes (human, dog, cow, mouse, and rat). Most of these regions have not yet been functionally annotated, and their associations with diseases remain largely unknown. This is an important knowledge gap in human genomics with regard to UCE roles in physiologically critical functions, and by extension, their relevance for shared susceptibilities to common complex diseases across several mammalian organisms in the event of their polymorphic variations. In the present study, we remapped the genomic locations of these UCEs to the latest human genome assembly, and examined them for documented polymorphisms in sequenced human genomes. We identified 29,983 polymorphisms within the analyzed UCEs, but found that the vast majority exhibit very low minor allele frequencies. Notably, only 112 of the identified polymorphisms are associated with a phenotype in the Ensembl genome browser. Through literature analyses, we confirmed associations of 37 of these 112 polymorphisms, located within 23 UCEs, with 25 diseases and phenotypic traits, including muscular dystrophies, eye diseases, and cancers (e.g., familial adenomatous polyposis). Most reports of UCE polymorphism-disease associations appeared not to be cognizant that their candidate polymorphisms were actually within UCEs. The present study offers strategic directions and identifies knowledge gaps for future computational and experimental work so as to better understand the thus far intriguing and puzzling role(s) of UCEs in mammalian genomes.
Affiliation(s)
- Anamarija Habic
- Department of Animal Science, Biotechnical Faculty, University of Ljubljana, Domzale, Slovenia
- John S Mattick
- School of Biotechnology and Biomolecular Science, University of New South Wales, Sydney, Australia
- Green Templeton College, University of Oxford, Oxford, United Kingdom
- George Adrian Calin
- Department of Experimental Therapeutics, The University of Texas M.D. Anderson Cancer Center, Houston, Texas
- The Center for RNA Interference and Noncoding RNAs, The University of Texas M.D. Anderson Cancer Center, Houston, Texas
- Rok Krese
- Department of Animal Science, Biotechnical Faculty, University of Ljubljana, Domzale, Slovenia
- Janez Konc
- National Institute of Chemistry, Ljubljana, Slovenia
- Tanja Kunej
- Department of Animal Science, Biotechnical Faculty, University of Ljubljana, Domzale, Slovenia
10
Woerner AE, Veeramah KR, Watkins JC, Hammer MF. The Role of Phylogenetically Conserved Elements in Shaping Patterns of Human Genomic Diversity. Mol Biol Evol 2020; 35:2284-2295. [PMID: 30113695 DOI: 10.1093/molbev/msy145] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 01/19/2023] [Open access]
Abstract
Evolutionary genetic studies have shown a positive correlation between levels of nucleotide diversity and either rates of recombination or genetic distance to genes. Both positive-directional and purifying selection have been offered as the source of these correlations via genetic hitchhiking and background selection, respectively. Phylogenetically conserved elements (CEs) are short (∼100 bp), widely distributed (comprising ∼5% of the genome) sequences that are often found far from genes. While the function of many CEs is unknown, CEs are also associated with reduced diversity at linked sites. Using high coverage (>80×) whole genome data from two human populations, the Yoruba and the CEU, we perform fine scale evaluations of diversity, rates of recombination, and linkage to genes. We find that the local rate of recombination has a stronger effect on levels of diversity than linkage to genes, and that these effects of recombination persist even in regions far from genes. Our whole genome modeling demonstrates that, rather than recombination or GC-biased gene conversion, selection on sites within or linked to CEs better explains the observed genomic diversity patterns. A major implication is that very few sites in the human genome are predicted to be free of the effects of selection. These sites, which we refer to as the human "neutralome," comprise only 1.2% of the autosomes and 5.1% of the X chromosome. Demographic analysis of the neutralome reveals larger population sizes and lower rates of growth for ancestral human populations than inferred by previous analyses.
Affiliation(s)
- August E Woerner
  - ARL Division of Biotechnology, University of Arizona, Tucson, AZ
  - Center for Human Identification, University of North Texas Health Science Center, Fort Worth, TX
- Krishna R Veeramah
  - Department of Ecology and Evolution, Stony Brook University, Stony Brook, NY
- Michael F Hammer
  - ARL Division of Biotechnology, University of Arizona, Tucson, AZ
|
11
|
Pereira Zambalde E, Mathias C, Rodrigues AC, Souza Fonseca Ribeiro EM, Fiori Gradia D, Calin GA, Carvalho de Oliveira J. Highlighting transcribed ultraconserved regions in human diseases. Wiley Interdiscip Rev RNA 2019; 11:e1567. [DOI: 10.1002/wrna.1567] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 05/27/2019] [Revised: 08/02/2019] [Accepted: 08/13/2019] [Indexed: 12/18/2022]
Affiliation(s)
- Carolina Mathias
  - Department of Genetics, Universidade Federal do Paraná, Curitiba, Brazil
- George A. Calin
  - Department of Experimental Therapeutics, MD Anderson Cancer Center, University of Texas, Houston, Texas
|
12
|
Zhou J, Wang C, Gong W, Wu Y, Xue H, Jiang Z, Shi M. uc.454 Inhibited Growth by Targeting Heat Shock Protein Family A Member 12B in Non-Small-Cell Lung Cancer. Mol Ther Nucleic Acids 2018; 12:174-183. [PMID: 30195756 PMCID: PMC6023848 DOI: 10.1016/j.omtn.2018.05.004] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Received: 01/23/2018] [Revised: 04/30/2018] [Accepted: 05/07/2018] [Indexed: 01/18/2023]
Abstract
Transcribed ultraconserved regions (T-UCRs), classified as long non-coding RNAs (lncRNAs), are transcripts longer than 200 nt with no protein-coding capacity. Previous studies showed that T-UCRs serve as novel oncogenes or tumor suppressors and are involved in tumorigenesis and cancer progression. Nevertheless, the clinicopathologic significance and regulatory mechanism of T-UCRs in lung cancer (LC) remain largely unknown. We found that uc.454 was downregulated in both non-small-cell LC (NSCLC) tissues and LC cell lines, and that downregulated uc.454 is associated with tumor size and more advanced tumor stage. Transfection with uc.454 markedly induced apoptosis and inhibited cell proliferation in SPC-A-1 and NCI-H2170 LC cell lines. These results suggest that uc.454 plays a suppressive role in LC. Heat shock protein family A member 12B (HSPA12B) protein was negatively regulated by uc.454 at the posttranscriptional level, as shown by dual-luciferase reporter assay, and affected the expression of Bcl-2 family members, which finally induced LC apoptosis. The uc.454/HSPA12B axis furthers our understanding of the molecular mechanisms involved in tumor apoptosis and may potentially serve as a therapeutic target for lung carcinoma.
Affiliation(s)
- Jun Zhou
  - Department of Respiratory Medicine, The 2nd Affiliated Hospital of Soochow University, 1055 Sanxiang Road, Suzhou, Jiangsu 215004, China
  - Department of Respiratory Medicine, The Affiliated Hospital of Yangzhou University, Yangzhou University, 368 Hanjiang Middle Road, Yangzhou 225009, China
- Chenghai Wang
  - Department of Pathology, The Affiliated Hospital of Yangzhou University, Yangzhou University, 368 Hanjiang Middle Road, Yangzhou 225009, China
- Weijuan Gong
  - Department of Molecular Immunology, The Affiliated Hospital of Yangzhou University, Yangzhou University, 368 Hanjiang Middle Road, Yangzhou 225009, China
- Yandan Wu
  - Department of Pathology, The Affiliated Hospital of Yangzhou University, Yangzhou University, 368 Hanjiang Middle Road, Yangzhou 225009, China
- Huimin Xue
  - Department of Pathology, The Affiliated Hospital of Yangzhou University, Yangzhou University, 368 Hanjiang Middle Road, Yangzhou 225009, China
- Zewei Jiang
  - Department of Pathology, The Affiliated Hospital of Yangzhou University, Yangzhou University, 368 Hanjiang Middle Road, Yangzhou 225009, China
- Minhua Shi
  - Department of Respiratory Medicine, The 2nd Affiliated Hospital of Soochow University, 1055 Sanxiang Road, Suzhou, Jiangsu 215004, China
|
13
|
Colwell M, Drown M, Showel K, Drown C, Palowski A, Faulk C. Evolutionary conservation of DNA methylation in CpG sites within ultraconserved noncoding elements. Epigenetics 2018; 13:49-60. [PMID: 29372669 PMCID: PMC5836973 DOI: 10.1080/15592294.2017.1411447] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Received: 10/03/2017] [Revised: 11/14/2017] [Accepted: 11/27/2017] [Indexed: 01/14/2023] Open
Abstract
Ultraconserved noncoding elements (UCNEs) constitute less than 1 Mb of vertebrate genomes and are impervious to accumulating mutations. About 4000 UCNEs exist in vertebrate genomes, each at least 200 nucleotides in length, sharing greater than 95% sequence identity between human and chicken. Despite extreme sequence conservation over 400 million years of vertebrate evolution, we show both ordered interspecies and within-species interindividual variation in DNA methylation in these regions. Here, we surveyed UCNEs with high CpG density in 56 species finding half to be intermediately methylated and the remaining near 0% or 100%. Intermediately methylated UCNEs displayed a greater range of methylation between mouse tissues. In a human population, most UCNEs showed greater variation than the LINE1 transposon, a frequently used epigenetic biomarker. Global methylation was found to be inversely correlated to hydroxymethylation across 60 vertebrates. Within UCNEs, DNA methylation is flexible, conserved between related species, and relaxed from the underlying sequence selection pressure, while remaining heritable through speciation.
Affiliation(s)
- Mathia Colwell
  - Department of Animal Sciences, University of Minnesota, College of Food, Agricultural, and Natural Resource Sciences, Saint Paul, MN, USA
- Melissa Drown
  - Department of Animal Sciences, University of Minnesota, College of Food, Agricultural, and Natural Resource Sciences, Saint Paul, MN, USA
- Kelly Showel
  - Department of Animal Sciences, University of Minnesota, College of Food, Agricultural, and Natural Resource Sciences, Saint Paul, MN, USA
- Chelsea Drown
  - Department of Animal Sciences, University of Minnesota, College of Food, Agricultural, and Natural Resource Sciences, Saint Paul, MN, USA
- Amanda Palowski
  - Department of Animal Sciences, University of Minnesota, College of Food, Agricultural, and Natural Resource Sciences, Saint Paul, MN, USA
- Christopher Faulk
  - Department of Animal Sciences, University of Minnesota, College of Food, Agricultural, and Natural Resource Sciences, Saint Paul, MN, USA
|
14
|
Yang ZK, Gao F. The systematic analysis of ultraconserved genomic regions in the budding yeast. Bioinformatics 2018; 34:361-366. [PMID: 29028909 DOI: 10.1093/bioinformatics/btx619] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 03/30/2017] [Accepted: 09/26/2017] [Indexed: 11/12/2022] Open
Abstract
Motivation: In the evolution of species, a special kind of sequence, termed ultraconserved sequences (UCSs), has been inherited without any change, which strongly suggests that these sequences are crucial for the species to survive or adapt to the environment. However, UCSs are still regarded as mysterious genetic sequences. Here, we present a systematic study of ultraconserved genomic regions in the budding yeast based on the publicly available genome sequences, in order to reveal their relationship with the adaptability or fitness advantages of the budding yeast. Results: Our results indicate that, in addition to some fundamental biological functions, the UCSs play an important role in the adaptation of Saccharomyces cerevisiae to the acidic environment, which is supported by previous observations. In addition, we find that the highly unchanged genes are enriched in some other pathways, such as the nutrient-sensitive signaling pathway. To facilitate the investigation of unique UCSs, the UCSC Genome Browser was used to visualize the chromosomal position and related annotations of UCSs in the S. cerevisiae genome. Availability and implementation: For more details on UCSs, please refer to the Supplementary information online; the custom code is available on request. Contact: fgao@tju.edu.cn. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Zhi-Kai Yang
  - Department of Physics, Tianjin University, Tianjin 300072, China
  - SinoGenoMax Co., Ltd./Chinese National Human Genome Center, Beijing 100176, China
- Feng Gao
  - Department of Physics, Tianjin University, Tianjin 300072, China
  - Key Laboratory of Systems Bioengineering (Ministry of Education)
  - SynBio Research Platform, Collaborative Innovation Center of Chemical Science and Engineering (Tianjin), Tianjin University, Tianjin 300072, China
|
15
|
Esselstyn JA, Oliveros CH, Swanson MT, Faircloth BC. Investigating Difficult Nodes in the Placental Mammal Tree with Expanded Taxon Sampling and Thousands of Ultraconserved Elements. Genome Biol Evol 2017; 9:2308-2321. [PMID: 28934378 PMCID: PMC5604124 DOI: 10.1093/gbe/evx168] [Citation(s) in RCA: 79] [Impact Index Per Article: 9.9] [Accepted: 08/25/2017] [Indexed: 12/21/2022] Open
Abstract
The phylogeny of eutherian mammals contains some of the most recalcitrant nodes in the tetrapod tree of life. We combined comprehensive taxon and character sampling to explore three of the most debated interordinal relationships among placental mammals. We performed in silico extraction of ultraconserved element loci from 72 published genomes and in vitro enrichment and sequencing of ultraconserved elements from 28 additional mammals, resulting in alignments of 3,787 loci. We analyzed these data using concatenated and multispecies coalescent phylogenetic approaches, topological tests, and exploration of support among individual loci to identify the root of Eutheria and the sister groups of tree shrews (Scandentia) and horses (Perissodactyla). Individual loci provided weak, but often consistent support for topological hypotheses. Although many gene trees lacked accepted species-tree relationships, summary coalescent topologies were largely consistent with inferences from concatenation. At the root of Eutheria, we identified consistent support for a sister relationship between Xenarthra and Afrotheria (i.e., Atlantogenata). At the other nodes of interest, support was less consistent. We suggest Scandentia is the sister of Primatomorpha (Euarchonta), but we failed to reject a sister relationship between Scandentia and Glires. Similarly, we suggest Perissodactyla is sister to Cetartiodactyla (Euungulata), but a sister relationship between Perissodactyla and Chiroptera remains plausible.
Affiliation(s)
- Jacob A. Esselstyn
  - Museum of Natural Science and Department of Biological Sciences, Louisiana State University, Baton Rouge
- Carl H. Oliveros
  - Museum of Natural Science and Department of Biological Sciences, Louisiana State University, Baton Rouge
- Mark T. Swanson
  - Museum of Natural Science and Department of Biological Sciences, Louisiana State University, Baton Rouge
- Brant C. Faircloth
  - Museum of Natural Science and Department of Biological Sciences, Louisiana State University, Baton Rouge
|
16
|
Esselstyn JA, Oliveros CH, Swanson MT, Faircloth BC. Investigating Difficult Nodes in the Placental Mammal Tree with Expanded Taxon Sampling and Thousands of Ultraconserved Elements. Genome Biol Evol 2017. [PMID: 28934378 DOI: 10.1093/gbe/evx168] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Indexed: 12/16/2022] Open
|
17
|
Kuperberg M, Lev D, Blumkin L, Zerem A, Ginsberg M, Linder I, Carmi N, Kivity S, Lerman-Sagie T, Leshinsky-Silver E. Utility of Whole Exome Sequencing for Genetic Diagnosis of Previously Undiagnosed Pediatric Neurology Patients. J Child Neurol 2016; 31:1534-1539. [PMID: 27572814 DOI: 10.1177/0883073816664836] [Citation(s) in RCA: 48] [Impact Index Per Article: 5.3] [Received: 05/15/2016] [Revised: 07/12/2016] [Accepted: 07/18/2016] [Indexed: 12/18/2022]
Abstract
Whole exome sequencing enables scanning a large number of genes for relatively low costs. The authors investigate its use for previously undiagnosed pediatric neurological patients. This retrospective cohort study performed whole exome sequencing on 57 patients of "Magen" neurogenetic clinics, with unknown diagnoses despite previous workup. The authors report on clinical features, causative genes, and treatment modifications and provide an analysis of whole exome sequencing utility per primary clinical feature. A causative gene was identified in 49.1% of patients, of which 17 had an autosomal dominant mutation, 9 autosomal recessive, and 2 X-linked. The highest rate of positive diagnosis was found for patients with developmental delay, ataxia, or suspected neuromuscular disease. Whole exome sequencing warranted a definitive change of treatment for 5 patients. Genetic databases were updated accordingly. In conclusion, whole exome sequencing is useful in obtaining a high detection rate for previously undiagnosed disorders. Use of this technique could affect diagnosis, treatment, and prognostics for both patients and relatives.
Affiliation(s)
- Maya Kuperberg
  - Metabolic-Neurogenetic Service, Wolfson Medical Center, Holon, Israel
- Dorit Lev
  - Institute of Medical Genetics, Wolfson Medical Center, Holon, Israel
- Lubov Blumkin
  - Metabolic-Neurogenetic Service, Wolfson Medical Center, Holon, Israel
- Ayelet Zerem
  - Department of Pediatric Neurology, Wolfson Medical Center, Holon, Israel
- Mira Ginsberg
  - Department of Pediatric Neurology, Wolfson Medical Center, Holon, Israel
- Ilan Linder
  - Department of Pediatric Neurology, Wolfson Medical Center, Holon, Israel
- Nirit Carmi
  - Department of Pediatric Neurology, Wolfson Medical Center, Holon, Israel
- Sarah Kivity
  - Department of Pediatric Neurology, Wolfson Medical Center, Holon, Israel
|
18
|
Warnefors M, Hartmann B, Thomsen S, Alonso CR. Combinatorial Gene Regulatory Functions Underlie Ultraconserved Elements in Drosophila. Mol Biol Evol 2016; 33:2294-306. [PMID: 27247329 PMCID: PMC4989106 DOI: 10.1093/molbev/msw101] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.9] [Indexed: 12/15/2022] Open
Abstract
Ultraconserved elements (UCEs) are discrete genomic elements conserved across large evolutionary distances. Although UCEs have been linked to multiple facets of mammalian gene regulation, their extreme evolutionary conservation remains largely unexplained. Here, we apply a computational approach to investigate this question in Drosophila, exploring the molecular functions of more than 1,500 UCEs shared across the genomes of 12 Drosophila species. Our data indicate that Drosophila UCEs are hubs for gene regulatory functions and suggest that UCE sequence invariance originates from their combinatorial roles in gene control. We also note that the gene regulatory roles of intronic and intergenic UCEs (iUCEs) are distinct from those found in exonic UCEs (eUCEs). In iUCEs, transcription factor (TF) and epigenetic factor binding data strongly support iUCE roles in transcriptional and epigenetic regulation. In contrast, analyses of eUCEs indicate that they are two orders of magnitude more likely than expected to simultaneously include protein-coding sequence, TF-binding sites, splice sites, and RNA editing sites but have reduced roles in transcriptional or epigenetic regulation. Furthermore, we use a Drosophila cell culture system and transgenic Drosophila embryos to validate the notion of UCE combinatorial regulatory roles using an eUCE within the Hox gene Ultrabithorax and show that its protein-coding region also contains alternative splicing regulatory information. Taken together, our experiments indicate that UCEs emerge as a result of combinatorial gene regulatory roles and highlight common features in mammalian and insect UCEs, implying that similar processes might underlie ultraconservation in diverse animal taxa.
Affiliation(s)
- Maria Warnefors
  - Sussex Neuroscience, School of Life Sciences, University of Sussex, Brighton, United Kingdom
  - Center for Integrative Genomics, University of Lausanne, Lausanne, Switzerland
  - Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Britta Hartmann
  - Institute of Human Genetics, Freiburg, Germany
  - BIOSS Centre for Biological Signaling Studies, University Medical Center Freiburg, Freiburg, Germany
- Stefan Thomsen
  - Sussex Neuroscience, School of Life Sciences, University of Sussex, Brighton, United Kingdom
- Claudio R Alonso
  - Sussex Neuroscience, School of Life Sciences, University of Sussex, Brighton, United Kingdom
|
19
|
Abstract
Congenital heart defects (CHDs) are structural abnormalities of the heart and great vessels that are present from birth. The presence or absence of extracardiac anomalies has historically been used to identify patients with possible monogenic, chromosomal, or teratogenic CHD causes. These distinctions remain clinically relevant, but it is increasingly clear that nonsyndromic CHDs can also be genetic. This article discusses key morphologic, molecular, and signaling mechanisms relevant to heart development, summarizes overall progress in molecular genetic analyses of CHDs, and provides current recommendations for clinical application of genetic testing.
Affiliation(s)
- Jason R Cowan
  - Division of Developmental Biology, Cincinnati Children's Hospital Medical Center, 3333 Burnet Avenue, Cincinnati, OH 45229, USA
  - Department of Pediatrics and Medical and Molecular Genetics, Herman B Wells Center for Pediatric Research, Indiana University School of Medicine, 1044 West Walnut Street, Indianapolis, IN 46202, USA
- Stephanie M Ware
  - Department of Pediatrics and Medical and Molecular Genetics, Herman B Wells Center for Pediatric Research, Indiana University School of Medicine, 1044 West Walnut Street, Indianapolis, IN 46202, USA
|
20
|
Silla T, Kepp K, Tai ES, Goh L, Davila S, Ivkovic TC, Calin GA, Voorhoeve PM. Allele frequencies of variants in ultra conserved elements identify selective pressure on transcription factor binding. PLoS One 2014; 9:e110692. [PMID: 25369454 PMCID: PMC4219694 DOI: 10.1371/journal.pone.0110692] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 06/29/2014] [Accepted: 09/16/2014] [Indexed: 12/30/2022] Open
Abstract
Ultra-conserved genes or elements (UCGs/UCEs) in the human genome are extreme examples of conservation. We characterized natural variations in 2884 UCEs and UCGs in two distinct populations, Singaporean Chinese (n = 280) and Italian (n = 501), by using a pooled-sample, targeted-capture sequencing approach. We identify, with high confidence, an abundance of rare SNVs (MAF<0.5%) in these regions, of which 75% are not present in dbSNP137. UCE association studies for complex human traits can use this information to model expected background variation and thus the power necessary for association studies. By combining our data with 1000 Genomes Project data, we show in three independent datasets that prevalent UCE variants (MAF>5%) are more often found in relatively less-conserved nucleotides within UCEs, compared to rare variants. Moreover, prevalent variants are less likely to overlap transcription factor binding sites. Using SNPfold we found no significant influence of RNA secondary structure on UCE conservation. Altogether, these results suggest UCEs are not under selective pressure as a stretch of DNA but are under differential evolutionary pressure on the single nucleotide level.
Affiliation(s)
- Toomas Silla
  - Cancer and Stem Cell Biology Program, Duke-NUS Graduate Medical School, Singapore, Singapore
- Katrin Kepp
  - Human Genetics, Genome Institute of Singapore, Singapore, Singapore
- E. Shyong Tai
  - Department of Medicine, National University of Singapore, Singapore, Singapore
  - Cardiovascular & Metabolic Disorders Program, Duke-NUS Graduate Medical School, Singapore, Singapore
- Liang Goh
  - Cancer and Stem Cell Biology Program, Duke-NUS Graduate Medical School, Singapore, Singapore
- Sonia Davila
  - Human Genetics, Genome Institute of Singapore, Singapore, Singapore
- Tina Catela Ivkovic
  - Experimental Therapeutics & Cancer Genetics, MD Anderson Cancer Center, Texas State University, Houston, Texas, United States of America
  - Division of Molecular Medicine, Ruder Boskovic Institute, Zagreb, Croatia
- George A. Calin
  - Experimental Therapeutics & Cancer Genetics, MD Anderson Cancer Center, Texas State University, Houston, Texas, United States of America
- P. Mathijs Voorhoeve
  - Cancer and Stem Cell Biology Program, Duke-NUS Graduate Medical School, Singapore, Singapore
|
21
|
McCole RB, Fonseka CY, Koren A, Wu CT. Abnormal dosage of ultraconserved elements is highly disfavored in healthy cells but not cancer cells. PLoS Genet 2014; 10:e1004646. [PMID: 25340765 PMCID: PMC4207606 DOI: 10.1371/journal.pgen.1004646] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.5] [Received: 02/10/2014] [Accepted: 08/04/2014] [Indexed: 12/17/2022] Open
Abstract
Ultraconserved elements (UCEs) are strongly depleted from segmental duplications and copy number variations (CNVs) in the human genome, suggesting that deletion or duplication of a UCE can be deleterious to the mammalian cell. Here we address the process by which CNVs become depleted of UCEs. We begin by showing that depletion for UCEs characterizes the most recent large-scale human CNV datasets and then find that even newly formed de novo CNVs, which have passed through meiosis at most once, are significantly depleted for UCEs. In striking contrast, CNVs arising specifically in cancer cells are, as a rule, not depleted for UCEs and can even become significantly enriched. This observation raises the possibility that CNVs that arise somatically and are relatively newly formed are less likely to have established a CNV profile that is depleted for UCEs. Alternatively, lack of depletion for UCEs from cancer CNVs may reflect the diseased state. In support of this latter explanation, somatic CNVs that are not associated with disease are depleted for UCEs. Finally, we show that it is possible to observe the CNVs of induced pluripotent stem (iPS) cells become depleted of UCEs over time, suggesting that depletion may be established through selection against UCE-disrupting CNVs without the requirement for meiotic divisions.
Affiliation(s)
- Ruth B. McCole
  - Department of Genetics, Harvard Medical School, Boston, Massachusetts, United States of America
- Chamith Y. Fonseka
  - Department of Genetics, Harvard Medical School, Boston, Massachusetts, United States of America
  - Biological and Biomedical Sciences PhD program, Harvard Medical School, Boston, Massachusetts, United States of America
- Amnon Koren
  - Department of Genetics, Harvard Medical School, Boston, Massachusetts, United States of America
  - Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, Massachusetts, United States of America
- C.-ting Wu
  - Department of Genetics, Harvard Medical School, Boston, Massachusetts, United States of America
|
22
|
Liu JZ, Anderson CA. Genetic studies of Crohn's disease: past, present and future. Best Pract Res Clin Gastroenterol 2014; 28:373-86. [PMID: 24913378 PMCID: PMC4075408 DOI: 10.1016/j.bpg.2014.04.009] [Citation(s) in RCA: 75] [Impact Index Per Article: 6.8] [Received: 02/24/2014] [Revised: 04/14/2014] [Accepted: 04/24/2014] [Indexed: 01/31/2023]
Abstract
The exact aetiology of Crohn's disease is unknown, though it is clear from early epidemiological studies that a combination of genetic and environmental risk factors contributes to an individual's disease susceptibility. Here, we review the history of gene-mapping studies of Crohn's disease, from the linkage-based studies that first implicated the NOD2 locus, through to modern-day genome-wide association studies that have discovered over 140 loci associated with Crohn's disease and yielded novel insights into the biological pathways underlying pathogenesis. We describe on-going and future gene-mapping studies that utilise next generation sequencing technology to pinpoint causal variants and identify rare genetic variation underlying Crohn's disease risk. We comment on the utility of genetic markers for predicting an individual's disease risk and discuss their potential for identifying novel drug targets and influencing disease management. Finally, we describe how these studies have shaped and continue to shape our understanding of the genetic architecture of Crohn's disease.
Affiliation(s)
- Jimmy Z Liu
  - The Wellcome Trust Sanger Institute, Hinxton CB10 1SA, UK
|
23
|
Searching for missing heritability: designing rare variant association studies. Proc Natl Acad Sci U S A 2014; 111:E455-64. [PMID: 24443550 DOI: 10.1073/pnas.1322563111] [Citation(s) in RCA: 440] [Impact Index Per Article: 40.0] [Indexed: 12/13/2022] Open
Abstract
Genetic studies have revealed thousands of loci predisposing to hundreds of human diseases and traits, revealing important biological pathways and defining novel therapeutic hypotheses. However, the genes discovered to date typically explain less than half of the apparent heritability. Because efforts have largely focused on common genetic variants, one hypothesis is that much of the missing heritability is due to rare genetic variants. Studies of common variants are typically referred to as genomewide association studies, whereas studies of rare variants are often simply called sequencing studies. Because they are actually closely related, we use the terms common variant association study (CVAS) and rare variant association study (RVAS). In this paper, we outline the similarities and differences between RVAS and CVAS and describe a conceptual framework for the design of RVAS. We apply the framework to address key questions about the sample sizes needed to detect association, the relative merits of testing disruptive alleles vs. missense alleles, frequency thresholds for filtering alleles, the value of predictors of the functional impact of missense alleles, the potential utility of isolated populations, the value of gene-set analysis, and the utility of de novo mutations. The optimal design depends critically on the selection coefficient against deleterious alleles and thus varies across genes. The analysis shows that common variant and rare variant studies require similarly large sample collections. In particular, a well-powered RVAS should involve discovery sets with at least 25,000 cases, together with a substantial replication set.
|
24
|
Li J, Xuan Z, Liu C. Long non-coding RNAs and complex human diseases. Int J Mol Sci 2013; 14:18790-808. [PMID: 24036441 PMCID: PMC3794807 DOI: 10.3390/ijms140918790] [Citation(s) in RCA: 153] [Impact Index Per Article: 12.8] [Received: 06/20/2013] [Revised: 08/28/2013] [Accepted: 09/03/2013] [Indexed: 02/07/2023] Open
Abstract
Long non-coding RNAs (lncRNAs) are a heterogeneous class of RNAs that are generally defined as non-protein-coding transcripts longer than 200 nucleotides. Recently, an increasing number of studies have shown that lncRNAs can be involved in various critical biological processes, such as chromatin remodeling, gene transcription, and protein transport and trafficking. Moreover, lncRNAs are dysregulated in a number of complex human diseases, including coronary artery diseases, autoimmune diseases, neurological disorders, and various cancers, which indicates their important roles in these diseases. Here, we reviewed the current understanding of lncRNAs, including their definition and subclassification, regulatory functions, and potential roles in different types of complex human diseases.
Affiliation(s)
- Jing Li
- Bioinformatics Research Group, Key Laboratory of Intelligent Information Processing, Advanced Computer Research Center, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
- Zhenyu Xuan
- Department of Molecular and Cell Biology, Center for Systems Biology, University of Texas at Dallas, 800 W Campbell Road, Richardson, TX 75080, USA
- Changning Liu
- Bioinformatics Research Group, Key Laboratory of Intelligent Information Processing, Advanced Computer Research Center, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
25
Haerty W, Ponting CP. Mutations within lncRNAs are effectively selected against in fruitfly but not in human. Genome Biol 2013; 14:R49. [PMID: 23710818 PMCID: PMC4053968 DOI: 10.1186/gb-2013-14-5-r49] [Citation(s) in RCA: 45] [Impact Index Per Article: 3.8] [Received: 02/25/2013] [Accepted: 05/27/2013] [Indexed: 02/07/2023] Open
Abstract
Background Previous studies in Drosophila and mammals have revealed levels of long non-coding RNA (lncRNA) sequence conservation that are intermediate between neutrally evolving and protein-coding sequence. These analyses compared conservation between species that diverged up to 75 million years ago. However, analysis of sequence polymorphisms within a species' population can provide an understanding of essentially contemporaneous selective constraints that are acting on lncRNAs and can quantify the deleterious effect of mutations occurring within these loci. Results We took advantage of polymorphisms derived from the genome sequences of 163 Drosophila melanogaster strains and 174 human individuals to calculate the distribution of fitness effects of single nucleotide polymorphisms occurring within intergenic lncRNAs and compared this to distributions for SNPs present within putatively neutral or protein-coding sequences. Our observations show that in D. melanogaster there is a significant excess of rare frequency variants within intergenic lncRNAs relative to neutrally evolving sequences, whereas selection on human intergenic lncRNAs appears to be effectively neutral. Approximately 30% of mutations within these fruitfly lncRNAs are estimated as being weakly deleterious. Conclusions These contrasting results can be attributed to the large difference in effective population sizes between the two species. Our results suggest that while the sequences of lncRNAs will be well conserved across insect species, such loci in mammals will accumulate greater proportions of deleterious changes through genetic drift.
26
Vernot B, Stergachis AB, Maurano MT, Vierstra J, Neph S, Thurman RE, Stamatoyannopoulos JA, Akey JM. Personal and population genomics of human regulatory variation. Genome Res 2012; 22:1689-97. [PMID: 22955981 PMCID: PMC3431486 DOI: 10.1101/gr.134890.111] [Citation(s) in RCA: 91] [Impact Index Per Article: 7.6] [Indexed: 11/24/2022]
Abstract
The characteristics of, and evolutionary forces acting on, regulatory variation in humans remain elusive because of the difficulty in defining functionally important noncoding DNA. Here, we combine genome-scale maps of regulatory DNA marked by DNase I hypersensitive sites (DHSs) from 138 cell and tissue types with whole-genome sequences of 53 geographically diverse individuals in order to better delimit the patterns of regulatory variation in humans. We estimate that individuals likely harbor many more functionally important variants in regulatory DNA compared with protein-coding regions, although they are likely to have, on average, smaller effect sizes. Moreover, we demonstrate that there is significant heterogeneity in the level of functional constraint in regulatory DNA among different cell types. We also find marked variability in functional constraint among transcription factor motifs in regulatory DNA, with sequence motifs for major developmental regulators, such as HOX proteins, exhibiting levels of constraint comparable to protein-coding regions. Finally, we perform a genome-wide scan of recent positive selection and identify hundreds of novel substrates of adaptive regulatory evolution that are enriched for biologically interesting pathways such as melanogenesis and adipocytokine signaling. These data and results provide new insights into patterns of regulatory variation in individuals and populations and demonstrate that a large proportion of functionally important variation lies beyond the exome.
Affiliation(s)
- Benjamin Vernot
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
27
Kritsas K, Wuest SE, Hupalo D, Kern AD, Wicker T, Grossniklaus U. Computational analysis and characterization of UCE-like elements (ULEs) in plant genomes. Genome Res 2012; 22:2455-66. [PMID: 22987666 PMCID: PMC3514675 DOI: 10.1101/gr.129346.111] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.1] [Indexed: 02/07/2023]
Abstract
Ultraconserved elements (UCEs), stretches of DNA that are identical between distantly related species, are enigmatic genomic features whose function is not well understood. First identified and characterized in mammals, UCEs have been proposed to play important roles in gene regulation, RNA processing, and maintaining genome integrity. However, because all of these functions can tolerate some sequence variation, their ultraconserved and ultraselected nature is not explained. We investigated whether there are highly conserved DNA elements without genic function in distantly related plant genomes. We compared the genomes of Arabidopsis thaliana and Vitis vinifera; species that diverged ∼115 million years ago (Mya). We identified 36 highly conserved elements with at least 85% similarity that are longer than 55 bp. Interestingly, these elements exhibit properties similar to mammalian UCEs, such that we named them UCE-like elements (ULEs). ULEs are located in intergenic or intronic regions and are depleted from segmental duplications. Like UCEs, ULEs are under strong purifying selection, suggesting a functional role for these elements. As their mammalian counterparts, ULEs show a sharp drop of A+T content at their borders and are enriched close to genes encoding transcription factors and genes involved in development, the latter showing preferential expression in undifferentiated tissues. By comparing the genomes of Brachypodium distachyon and Oryza sativa, species that diverged ∼50 Mya, we identified a different set of ULEs with similar properties in monocots. The identification of ULEs in plant genomes offers new opportunities to study their possible roles in genome function, integrity, and regulation.
Affiliation(s)
- Konstantinos Kritsas
- Institute of Plant Biology & Zürich-Basel Plant Science Center, University Zürich, CH-8008 Zürich, Switzerland
28
Ultraconserved elements in the human genome: association and transmission analyses of highly constrained single-nucleotide polymorphisms. Genetics 2012; 192:253-66. [PMID: 22714408 DOI: 10.1534/genetics.112.141945] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.2] [Indexed: 12/20/2022] Open
Abstract
Ultraconserved elements in the human genome likely harbor important biological functions as they are dosage sensitive and are able to direct tissue-specific expression. Because they are under purifying selection, variants in these elements may have a lower frequency in the population but a higher likelihood of association with complex traits. We tested a set of highly constrained SNPs (hcSNPs) distributed genome-wide among ultraconserved and nearly ultraconserved elements for association with seven traits related to reproductive (age at natural menopause, number of children, age at first child, and age at last child) and overall [longevity, body mass index (BMI), and height] fitness. Using up to 24,047 European-American samples from the National Heart, Lung, and Blood Institute Candidate Gene Association Resource (CARe), we observed an excess of associations with BMI and height. In an independent replication panel the most strongly associated SNPs showed an 8.4-fold enrichment of associations at the nominal level, including three variants in previously identified loci and one in a locus (DENND1A) previously shown to be associated with polycystic ovary syndrome. Finally, using 1430 family trios, we showed that the transmissions from heterozygous parents to offspring of the derived alleles of rare (frequency ≤ 0.5%) hcSNPs are not biased, particularly after adjusting for the rates of genotype missingness and error in the data. The lack of transmission bias ruled out an immediately and strongly deleterious effect due to the rare derived alleles, consistent with the observation that mice homozygous for the deletion of ultraconserved elements showed no overt phenotype. Our study also illustrated the importance of carefully modeling potential technical confounders when analyzing genotype data of rare variants.
29
Sana J, Hankeova S, Svoboda M, Kiss I, Vyzula R, Slaby O. Expression levels of transcribed ultraconserved regions uc.73 and uc.388 are altered in colorectal cancer. Oncology 2012; 82:114-8. [PMID: 22328099 DOI: 10.1159/000336479] [Citation(s) in RCA: 59] [Impact Index Per Article: 4.5] [Received: 11/16/2011] [Accepted: 01/04/2012] [Indexed: 12/18/2022]
Abstract
OBJECTIVES The development of colorectal cancer (CRC) is characterized by multiple genetic alterations. Transcribed ultraconserved regions (T-UCRs) are a subset of 481 sequences longer than 200 bp, which are absolutely conserved between orthologous regions of human, rat and mouse genomes, and are actively transcribed. It has recently been proven in cancer systems that differentially expressed T-UCRs could alter the functional characteristics of malignant cells. Genome-wide profiling revealed that T-UCRs have distinct signatures in human leukemia and carcinoma. METHODS In our study, we examined the expression levels of uc.43, uc.73, uc.134, uc.230, uc.339, uc.388 and uc.399 in 54 samples of primary colorectal carcinomas and 15 samples of non-tumoral adjacent tissues by real-time PCR. T-UCR expression levels were also correlated with commonly used clinicopathological features of CRC. RESULTS Expression levels of uc.73 (p = 0.0139) and uc.388 (p = 0.0325) were significantly decreased in CRC tissue, and uc.73 indicated a positive correlation with overall survival (p = 0.0315). The lower expression of uc.388 was associated with the distal location of CRC (p = 0.0183), but no correlation of any evaluated T-UCR with clinical stage, grade and tumor diameter was observed. CONCLUSION Our preliminary results suggest that uc.73 and uc.388 could be potential diagnostic and prognostic biomarkers in CRC patients.
Affiliation(s)
- J Sana
- Department of Comprehensive Cancer Care, Masaryk Memorial Cancer Institute, Brno, Czech Republic
30
Okada T, Miyashita M, Fukuhara J, Sugitani M, Ueno T, Samson-Bouma ME, Aggerbeck LP. Anderson's disease/chylomicron retention disease in a Japanese patient with uniparental disomy 7 and a normal SAR1B gene protein coding sequence. Orphanet J Rare Dis 2011; 6:78. [PMID: 22104167 PMCID: PMC3284428 DOI: 10.1186/1750-1172-6-78] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.4] [Received: 07/28/2011] [Accepted: 11/21/2011] [Indexed: 12/03/2022] Open
Abstract
Background Anderson's Disease (AD)/Chylomicron Retention Disease (CMRD) is a rare hereditary hypocholesterolemic disorder characterized by a malabsorption syndrome with steatorrhea, failure to thrive and the absence of chylomicrons and apolipoprotein B48 post-prandially. All patients studied to date exhibit a mutation in the SAR1B gene, which codes for an essential component of the vesicular coat protein complex II (COPII) necessary for endoplasmic reticulum to Golgi transport. We describe here a patient with AD/CMRD, a normal SAR1B gene protein coding sequence and maternal uniparental disomy of chromosome 7 (matUPD7). Methods and Results The patient, one of two siblings of a Japanese family, had diarrhea and steatorrhea beginning at five months of age. There was a white duodenal mucosa upon endoscopy. Light and electron microscopy showed that the intestinal villi were normal but that they had lipid laden enterocytes containing accumulations of lipid droplets in the cytoplasm and lipoprotein-size particles in membrane bound structures. Although there were decreased amounts in plasma of total- and low-density lipoprotein cholesterol, apolipoproteins AI and B and vitamin E levels, the triglycerides were normal, typical of AD/CMRD. The presence of low density lipoproteins and apolipoprotein B in the plasma, although in decreased amounts, ruled out abetalipoproteinemia. The parents were asymptomatic with normal plasma cholesterol levels suggesting a recessive disorder and ruling out familial hypobetalipoproteinemia. Sequencing of genomic DNA showed that the 8 exons of the SAR1B gene were normal. Whole genome SNP analysis and karyotyping revealed matUPD7 with a normal karyotype. In contrast to other cases of AD/CMRD which have shown catch-up growth following vitamin supplementation and a fat restricted diet, our patient exhibits continued growth delay and other aspects of the matUPD7 and Silver-Russell Syndrome phenotypes. 
Conclusions This patient with AD/CMRD has a normal SAR1B gene protein coding sequence which suggests that factors other than the SAR1B protein may be crucial for chylomicron secretion. Further, this patient exhibits matUPD7 with regions of homozygosity which might be useful for elucidating the molecular basis of the defect(s) in this individual. The results provide novel insights into the relation between phenotype and genotype in these diseases and for the mechanisms of secretion in the intestine.
Affiliation(s)
- Tomoo Okada
- Department of Pediatrics, Nihon University School of Medicine, 30-1 Oyaguchi Kamicho Itabashi-ku, Tokyo 173-8630, Japan.
31
Abstract
We tested whether functionally important sites in bacterial, yeast, and animal promoters are more conserved than their neighbors. We found that substitutions are predominantly seen in less important sites and that those that occurred tended to have less impact on gene expression than possible alternatives. These results suggest that purifying selection operates on promoter sequences.
32
Levenstien MA, Klein RJ. Predicting functionally important SNP classes based on negative selection. BMC Bioinformatics 2011; 12:26. [PMID: 21247465 PMCID: PMC3033802 DOI: 10.1186/1471-2105-12-26] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Received: 05/11/2010] [Accepted: 01/19/2011] [Indexed: 01/20/2023] Open
Abstract
Background With the advent of cost-effective genotyping technologies, genome-wide association studies allow researchers to examine hundreds of thousands of single nucleotide polymorphisms (SNPs) for association with human disease. Recently, many researchers applying this strategy have detected strong associations to disease with SNP markers that are either not in linkage disequilibrium with any nonsynonymous SNP or large distances from any annotated gene. In such cases, no well-established standard practice for effective SNP selection for follow-up studies exists. We aim to identify and prioritize groups of SNPs that are more likely to affect phenotypes in order to facilitate efficient SNP selection for follow-up studies. Results Based on the annotations available in the Ensembl database, we categorized SNPs in the human genome into classes related to regulatory attributes, such as epigenetic modifications and transcription factor binding sites, in addition to classes related to gene structure and cross-species conservation. Using the distribution of derived allele frequencies (DAF) within each class, we assessed the strength of natural selection for each class relative to the genome as a whole. We applied this DAF analysis to Perlegen resequenced SNPs genome-wide. Regulatory elements annotated by Ensembl such as specific histone methylation sites as well as classes defined by cross-species conservation showed negative selection in comparison to the genome as a whole. Conclusions These results highlight which annotated classes are under purifying selection, have putative functional importance, and contain SNPs that are strong candidates for follow-up studies after genome-wide association. Such SNP annotation may also be useful in interpreting results of whole-genome sequencing studies.
Affiliation(s)
- Mark A Levenstien
- Program in Cancer Biology and Genetics, Memorial Sloan-Kettering Cancer Center, New York, NY 10065, USA
33
Cooper DN, Chen JM, Ball EV, Howells K, Mort M, Phillips AD, Chuzhanova N, Krawczak M, Kehrer-Sawatzki H, Stenson PD. Genes, mutations, and human inherited disease at the dawn of the age of personalized genomics. Hum Mutat 2010; 31:631-55. [PMID: 20506564 DOI: 10.1002/humu.21260] [Citation(s) in RCA: 117] [Impact Index Per Article: 7.8] [Indexed: 12/12/2022]
Abstract
The number of reported germline mutations in human nuclear genes, either underlying or associated with inherited disease, has now exceeded 100,000 in more than 3,700 different genes. The availability of these data has both revolutionized the study of the morbid anatomy of the human genome and facilitated "personalized genomics." With approximately 300 new "inherited disease genes" (and approximately 10,000 new mutations) being identified annually, it is pertinent to ask how many "inherited disease genes" there are in the human genome, how many mutations reside within them, and where such lesions are likely to be located. To address these questions, it is necessary not only to reconsider how we define human genes but also to explore notions of gene "essentiality" and "dispensability." Answers to these questions are now emerging from recent novel insights into genome structure and function and through complete genome sequence information derived from multiple individual human genomes. However, a change in focus toward screening functional genomic elements as opposed to genes sensu stricto will be required if we are to capitalize fully on recent technical and conceptual advances and identify new types of disease-associated mutation within noncoding regions remote from the genes whose function they disrupt.
Affiliation(s)
- David N Cooper
- Institute of Medical Genetics, School of Medicine, Cardiff University, Heath Park, Cardiff CF14 4XN, United Kingdom.
34
Abstract
This Timeline article looks back at 40 years of research into the inherited genetic basis of cancer and the insights these studies have yielded. Early epidemiological research provided evidence for the 'two-hit' model of cancer predisposition. During the 1980s and 1990s linkage and positional cloning analyses led to the identification of high-penetrance cancer susceptibility genes. The past decade has seen a shift from models of predisposition based on single-gene causative mutations to multigenic models. These models suggest that a high proportion of cancers may arise in a genetically susceptible minority as a consequence of the combined effects of common low-penetrance alleles and rare disease-causing variants that confer moderate cancer risks.
Affiliation(s)
- Olivia Fletcher
- Olivia Fletcher is at the Breakthrough Breast Cancer Research Centre, Institute of Cancer Research, London SW3 6JB, UK
35
Licastro D, Gennarino VA, Petrera F, Sanges R, Banfi S, Stupka E. Promiscuity of enhancer, coding and non-coding transcription functions in ultraconserved elements. BMC Genomics 2010; 11:151. [PMID: 20202189 PMCID: PMC2847969 DOI: 10.1186/1471-2164-11-151] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.9] [Received: 10/16/2009] [Accepted: 03/04/2010] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Ultraconserved elements (UCEs) are highly constrained elements of mammalian genomes, whose functional role has not been completely elucidated yet. Previous studies have shown that some of them act as enhancers in mouse, while some others are expressed in both normal and cancer-derived human tissues. Only one UCE element so far was shown to present these two functions concomitantly, as had been observed in other isolated instances of single, non ultraconserved enhancer elements. RESULTS We used a custom microarray to assess the levels of UCE transcription during mouse development and integrated these data with published microarray and next-generation sequencing datasets as well as with newly produced PCR validation experiments. We show that a large fraction of non-exonic UCEs is transcribed across all developmental stages examined from only one DNA strand. Although the nature of these transcripts remains a mystery, our meta-analysis of RNA-Seq datasets indicates that they are unlikely to be short RNAs and that some of them might encode nuclear transcripts. In the majority of cases this function overlaps with the already established enhancer function of these elements during mouse development. Utilizing several next-generation sequencing datasets, we were further able to show that the level of expression observed in non-exonic UCEs is significantly higher than in random regions of the genome and that this is also seen in other regions which act as enhancers. CONCLUSION Our data show that the concurrent presence of enhancer and transcript function in non-exonic UCE elements is more widespread than previously shown. Moreover, through our own experiments as well as the use of next-generation sequencing datasets, we were able to show that the RNAs encoded by non-exonic UCEs are likely to be long RNAs transcribed from only one DNA strand.
Affiliation(s)
- Danilo Licastro
- CBM scrl - Genomics, Area Science Park, Basovizza, Trieste, Italy
- Vincenzo A Gennarino
- Telethon Institute of Genetics and Medicine (TIGEM), via Pietro Castellino 111, 80131, Napoli, Italy
- Remo Sanges
- CBM scrl - Genomics, Area Science Park, Basovizza, Trieste, Italy
- Sandro Banfi
- Telethon Institute of Genetics and Medicine (TIGEM), via Pietro Castellino 111, 80131, Napoli, Italy
- Elia Stupka
- UCL Cancer Institute, University College London, London, WC1E 6BT, UK
- Centre for Gastroenterology, Institute of Cell and Molecular Science, Queen Mary University of London, London, E1 2AT, UK
36
Goode DL, Cooper GM, Schmutz J, Dickson M, Gonzales E, Tsai M, Karra K, Davydov E, Batzoglou S, Myers RM, Sidow A. Evolutionary constraint facilitates interpretation of genetic variation in resequenced human genomes. Genome Res 2010; 20:301-10. [PMID: 20067941 PMCID: PMC2840986 DOI: 10.1101/gr.102210.109] [Citation(s) in RCA: 66] [Impact Index Per Article: 4.4] [Received: 10/20/2009] [Accepted: 01/08/2010] [Indexed: 01/22/2023]
Abstract
Here, we demonstrate how comparative sequence analysis facilitates genome-wide base-pair-level interpretation of individual genetic variation and address two questions of importance for human personal genomics: first, whether an individual's functional variation comes mostly from noncoding or coding polymorphisms; and, second, whether population-specific or globally-present polymorphisms contribute more to functional variation in any given individual. Neither has been definitively answered by analyses of existing variation data because of a focus on coding polymorphisms, ascertainment biases in favor of common variation, and a lack of base-pair-level resolution for identifying functional variants. We resequenced 575 amplicons within 432 individuals at genomic sites enriched for evolutionary constraint and also analyzed variation within three published human genomes. We find that single-site measures of evolutionary constraint derived from mammalian multiple sequence alignments are strongly predictive of reductions in modern-day genetic diversity across a range of annotation categories and across the allele frequency spectrum from rare (<1%) to high frequency (>10% minor allele frequency). Furthermore, we show that putatively functional variation in an individual genome is dominated by polymorphisms that do not change protein sequence and that originate from our shared ancestral population and commonly segregate in human populations. These observations show that common, noncoding alleles contribute substantially to human phenotypes and that constraint-based analyses will be of value to identify phenotypically relevant variants in individual genomes.
Affiliation(s)
- David L Goode
- Department of Genetics, Stanford University School of Medicine, Stanford, California 94305, USA
37
Loots GG, Ovcharenko I. Human variation in short regions predisposed to deep evolutionary conservation. Mol Biol Evol 2010; 27:1279-88. [PMID: 20093432 PMCID: PMC2872621 DOI: 10.1093/molbev/msq011] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Indexed: 01/24/2023] Open
Abstract
The landscape of the human genome consists of millions of short islands of conservation that are 100% conserved across multiple vertebrate genomes (termed “bricks”), the majority of which are located in noncoding regions. Several hundred thousand bricks are deeply conserved reaching the genomes of amphibians and fish. Deep phylogenetic conservation of noncoding DNA has been reported to be strongly associated with the presence of gene regulatory elements, introducing bricks as a proxy to the functional noncoding landscape of the human genome. Here, we report a significant overrepresentation of bricks in the promoters of transcription factors and developmental genes, where the high level of phylogenetic conservation correlates with an increase in brick overrepresentation. We also found that the presence of a brick dictates a predisposition to evolutionary constraint, with only 0.7% of the amniota brick central nucleotides being diverged within the primate lineage—an 11-fold reduction in the divergence rate compared with random expectation. Human single-nucleotide polymorphism (SNP) data explains only 3% of primate-specific variation in amniota bricks, thus arguing for a widespread fixation of brick mutations within the primate lineage and prior to human radiation. This variation, in turn, might have been utilized as a driving force for primate- and hominoid-specific adaptation. We also discovered a pronounced deviation from the evolutionary predisposition in the human lineage, with over 20-fold increase in the substitution rate at brick SNP sites over expected values. In addition, contrary to typical brick mutations, brick variation commonly encountered in the human population displays limited, if any, signatures of negative selection as measured by the minor allele frequency and population differentiation (F-statistical measure) measures. 
These observations argue for the plasticity of gene regulatory mechanisms in vertebrates—with evidence of strong purifying selection acting on the gene regulatory landscape of the human genome, where widespread advantageous mutations in putative regulatory elements are likely utilized in functional diversification and adaptation of species.
Affiliation(s)
- Gabriela G Loots
- Biology and Biotechnology Division, Physical and Life Sciences Directorate, Lawrence Livermore National Laboratory, Livermore, CA, USA
38
Ng SB, Buckingham KJ, Lee C, Bigham AW, Tabor HK, Dent KM, Huff CD, Shannon PT, Jabs EW, Nickerson DA, Shendure J, Bamshad MJ. Exome sequencing identifies the cause of a mendelian disorder. Nat Genet 2009; 42:30-5. [PMID: 19915526 PMCID: PMC2847889 DOI: 10.1038/ng.499] [Citation(s) in RCA: 1393] [Impact Index Per Article: 87.1] [Received: 10/02/2009] [Accepted: 11/09/2009] [Indexed: 12/15/2022]
Abstract
We demonstrate the first successful application of exome sequencing to discover the gene for a rare, Mendelian disorder of unknown cause, Miller syndrome (OMIM %263750). For four affected individuals in three independent kindreds, we captured and sequenced coding regions to a mean coverage of 40X, and sufficient depth to call variants at ~97% of each targeted exome. Filtering against public SNP databases and a small number of HapMap exomes for genes with two novel variants in each of the four cases identified a single candidate gene, DHODH, which encodes a key enzyme in the pyrimidine de novo biosynthesis pathway. Sanger sequencing confirmed the presence of DHODH mutations in three additional families with Miller syndrome. Exome sequencing of a small number of unrelated, affected individuals is a powerful, efficient strategy for identifying the genes underlying rare Mendelian disorders and will likely transform the genetic analysis of monogenic traits.
Affiliation(s)
- Sarah B Ng
- Department of Genome Sciences, University of Washington, Seattle, Washington, USA
39
TIAN J, ZHAO ZH, CHEN HP. [Conserved non-coding elements in human genome]. YI CHUAN = HEREDITAS 2009; 31:1067-1076. [PMID: 19933086 DOI: 10.3724/sp.j.1005.2009.01067] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/28/2023]
Abstract
Comparative genomics has revealed that about 5% of the human genome is under purifying selection, 3.5% of which consists of conserved non-coding elements (CNEs), whereas the coding regions comprise only a small part. In humans, CNEs are functionally important and may be associated with the establishment and maintenance of chromatin architecture, transcription regulation, and pre-mRNA processing. They are also related to the ontogeny of mammals and to human diseases. This review outlines the identification, functional significance, evolutionary origin, and effects on human genetic defects of CNEs.
Affiliation(s)
- Jing TIAN
- Institute of Biotechnology, Academy of Military Medical Science, Beijing 100071, China.
|
40
|
Zhang J, Yuan Z, Zhou T. Geometric characteristics of dynamic correlations for combinatorial regulation in gene expression noise. PHYSICAL REVIEW. E, STATISTICAL, NONLINEAR, AND SOFT MATTER PHYSICS 2009; 80:021905. [PMID: 19792149 DOI: 10.1103/physreve.80.021905] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 04/16/2009] [Revised: 05/28/2009] [Indexed: 05/28/2023]
Abstract
Knowing which mode of combinatorial regulation (typically AND or OR logic) a gene employs is important for determining its function in regulatory networks. Here, we introduce a dynamic cross-correlation function between the output of a gene and its upstream regulator concentrations to detect signatures of combinatorial regulation in gene expression noise. We find that, near its peak close to zero correlation time, this correlation function is always upward convex in the case of AND logic and always downward convex in the case of OR logic, regardless of the source of noise (intrinsic, extrinsic, or both). In turn, this fact implies a means for inferring regulatory synergies from available experimental data. The extensions and applications are discussed.
Affiliation(s)
- Jiajun Zhang
- School of Mathematical and Computational Sciences, Sun Yat-Sen University, Guangzhou 510275, People's Republic of China
|
41
|
Power of deep, all-exon resequencing for discovery of human trait genes. Proc Natl Acad Sci U S A 2009; 106:3871-6. [PMID: 19202052 DOI: 10.1073/pnas.0812824106] [Citation(s) in RCA: 131] [Impact Index Per Article: 8.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022] Open
Abstract
The ability to sequence cost-effectively all of the coding regions of a given individual genome is rapidly approaching, with the potential for whole-genome resequencing not far behind. Initiatives are currently underway to phenotype hundreds of thousands of individuals for major human traits. Here, we determine the power for de novo discovery of genes related to human traits by resequencing all human exons in a clinical population. We analyze the potential of the gene discovery strategy that combines multiple rare variants from the same gene and treats genes, rather than individual alleles, as the units for the association test. By using computer simulations based on deep resequencing data for the European population, we show that genes meaningfully affecting a human trait can be identified in an unbiased fashion, although large sample sizes would be required to achieve substantial power.
|
42
|
Chen CTL, Gottlieb DI, Cohen BA. Ultraconserved elements in the Olig2 promoter. PLoS One 2008; 3:e3946. [PMID: 19079603 PMCID: PMC2596485 DOI: 10.1371/journal.pone.0003946] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/24/2008] [Accepted: 11/13/2008] [Indexed: 11/19/2022] Open
Abstract
BACKGROUND: Oligodendrocytes are specialized cells of the nervous system that produce the myelin sheaths surrounding the axons of neurons. Myelinating the axons increases the speed of nerve conduction and demyelination contributes to the pathology of neurodegenerative diseases such as multiple sclerosis. Oligodendrocyte differentiation is specified early in development by the expression of the basic-helix-loop-helix transcription factor Olig2 in the ventral region of the neural tube. Understanding how Olig2 expression is controlled is therefore essential for elucidating the mechanisms governing oligodendrocyte differentiation. A method is needed to identify potential regulatory sequences in the long stretches of adjacent non-coding DNA that flank Olig2. METHODOLOGY/PRINCIPAL FINDINGS: We identified ten potential regulatory regions upstream of Olig2 based on a combination of bioinformatics metrics that included evolutionary conservation across multiple vertebrate genomes, the presence of potential transcription factor binding sites and the existence of ultraconserved elements. One of our computational predictions includes a region previously identified as the Olig2 basal promoter, suggesting that our criteria captured characteristics of known regulatory regions. In this study, we tested one candidate regulatory region for its ability to modulate the Olig2 basal promoter and found that it represses expression in undifferentiated embryonic stem cells. CONCLUSIONS/SIGNIFICANCE: The regulatory region we identified modifies the expression regulated by the Olig2 basal promoter in a manner consistent with our current understanding of Olig2 expression during oligodendrocyte differentiation. Our results support a model in which constitutive activation of Olig2 by its basal promoter is repressed in undifferentiated cells by upstream repressive elements until that repression is relieved during differentiation.
We conclude that the potential regulatory elements presented in this study provide a good starting point for unraveling the cis-regulatory logic that governs Olig2 expression. Future studies of the functionality of the potential regulatory elements we present will help reveal the interactions that govern Olig2 expression during development.
Affiliation(s)
- Christina T. L. Chen
- Department of Genetics, Washington University in St. Louis School of Medicine, St. Louis, Missouri, United States of America
- David I. Gottlieb
- Department of Anatomy and Neurobiology, Washington University in St. Louis School of Medicine, St. Louis, Missouri, United States of America
- Barak A. Cohen
- Department of Genetics, Washington University in St. Louis School of Medicine, St. Louis, Missouri, United States of America
|
43
|
Identification and characterization of new long conserved noncoding sequences in vertebrates. Mamm Genome 2008; 19:703-12. [PMID: 19015917 DOI: 10.1007/s00335-008-9152-7] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/04/2008] [Accepted: 10/10/2008] [Indexed: 02/07/2023]
Abstract
Comparative sequence analyses have identified highly conserved genomic DNA sequences, including noncoding sequences, between humans and other species. By performing whole-genome comparisons of human and mouse, we have identified 611 conserved noncoding sequences longer than 500 bp, with more than 95% identity between the species. These long conserved noncoding sequences (LCNS) include 473 new sequences that do not overlap with previously reported ultraconserved elements (UCE), which are defined as aligned sequences longer than 200 bp with 100% identity in human, mouse, and rat. The LCNS were distributed throughout the genome except for the Y chromosome and often occurred in clusters within regions with a low density of coding genes. Many of the LCNS were also highly conserved in other mammals, chickens, frogs, and fish; however, we were unable to find orthologous sequences in the genomes of invertebrate species. In order to examine whether these conserved sequences are functionally important or merely mutational cold spots, we directly measured the frequencies of ENU-induced germline mutations in the LCNS of the mouse. By screening about 40.7 Mb, we found 35 mutations, including mutations at nucleotides that were conserved between human and fish. The mutation frequencies were equivalent to those found in other genomic regions, including coding sequences and introns, suggesting that the LCNS are not mutational cold spots at all. Taken together, these results suggest that mutations occur with equal frequency in LCNS but are eliminated by natural selection during the course of evolution.
|
44
|
Schmidt S, Gerasimova A, Kondrashov FA, Adzuhbei IA, Kondrashov AS, Sunyaev S. Hypermutable non-synonymous sites are under stronger negative selection. PLoS Genet 2008; 4:e1000281. [PMID: 19043566 PMCID: PMC2583910 DOI: 10.1371/journal.pgen.1000281] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/08/2007] [Accepted: 10/27/2008] [Indexed: 12/04/2022] Open
Abstract
Mutation rate varies greatly between nucleotide sites of the human genome and depends both on the global genomic location and the local sequence context of a site. In particular, CpG context elevates the mutation rate by an order of magnitude. Mutations also vary widely in their effect on the molecular function, phenotype, and fitness. Independence of the probability of occurrence of a new mutation from its effect has been a fundamental premise in genetics. However, highly mutable contexts may be preserved by negative selection at important sites but destroyed by mutation at sites under no selection. Thus, there may be a positive correlation between the rate of mutations at a nucleotide site and the magnitude of their effect on fitness. We studied the impact of CpG context on the rate of human-chimpanzee divergence and on intrahuman nucleotide diversity at non-synonymous coding sites. We compared nucleotides that occupy identical positions within codons of identical amino acids and only differ by being within versus outside CpG context. Nucleotides within CpG context are under stronger negative selection, as revealed by their rate of evolution and nucleotide diversity, which are lower in proportion to the mutation rate. In particular, the probability of fixation of a non-synonymous transition at a CpG site is two times lower than at a non-CpG site. Thus, sites with different mutation rates are not necessarily selectively equivalent. This suggests that the mutation rate may complement sequence conservation as a characteristic predictive of functional importance of nucleotide sites.
Affiliation(s)
- Steffen Schmidt
- Division of Genetics, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, United States of America
- Department of Biochemistry, Max Planck Institute for Developmental Biology, Tübingen, Germany
- Anna Gerasimova
- Life Sciences Institute, University of Michigan, Ann Arbor, Michigan, United States of America
- Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, Michigan, United States of America
- Fyodor A. Kondrashov
- Section on Ecology, Behavior, and Evolution, Division of Biological Sciences, University of California San Diego, La Jolla, California, United States of America
- Ivan A. Adzuhbei
- Division of Genetics, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, United States of America
- Alexey S. Kondrashov
- Life Sciences Institute, University of Michigan, Ann Arbor, Michigan, United States of America
- Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, Michigan, United States of America
- Shamil Sunyaev
- Division of Genetics, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, United States of America
|
45
|
Abstract
Ultraconserved elements (UCEs) are sequences that are identical between reference genomes of distantly related species. As they are under negative selection and enriched near or in specific classes of genes, one explanation for their ultraconservation may be their involvement in important functions. Indeed, many UCEs can drive tissue-specific gene expression. We have demonstrated that nonexonic UCEs are depleted among segmental duplications (SDs) and copy number variants (CNVs) and proposed that their ultraconservation may reflect a mechanism of copy counting via comparison. Here, we report that nonexonic UCEs are also depleted among 10 of 11 recent genomewide data sets of human CNVs, including 3 obtained with strategies permitting greater precision in determining the extents of CNVs. We further present observations suggesting that nonexonic UCEs per se may contribute to this depletion and that their apparent dosage sensitivity was in effect when they became fixed in the last common ancestor of mammals, birds, and reptiles, consistent with dosage sensitivity contributing to ultraconservation. Finally, in searching for the mechanism(s) underlying the function of nonexonic UCEs, we have found that they are enriched in TAATTA, which is also the recognition sequence for the homeodomain DNA-binding module, and bounded by a change in A + T frequency.
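Motif enrichment of the kind reported here rests on counting motif occurrences per scanned position. The Python sketch below shows one simple way to do that counting; the function names and the toy sequence are hypothetical, and the abstract does not state how the authors measured enrichment, so this is an illustration rather than their method.

```python
def count_motif(seq, motif="TAATTA"):
    """Count occurrences of `motif` in `seq`, allowing overlapping matches."""
    seq = seq.upper()
    motif = motif.upper()
    count = 0
    start = seq.find(motif)
    while start != -1:
        count += 1
        # Advance by one base so that overlapping matches are also found.
        start = seq.find(motif, start + 1)
    return count


def motif_density(seq, motif="TAATTA"):
    """Occurrences per possible start position; 0.0 if seq is shorter than motif."""
    windows = len(seq) - len(motif) + 1
    return count_motif(seq, motif) / windows if windows > 0 else 0.0


# TAATTA is its own reverse complement, so scanning one strand is enough.
toy_sequence = "GGTAATTAATTACC"  # hypothetical sequence, not real UCE data
print(count_motif(toy_sequence), motif_density(toy_sequence))
```

Comparing `motif_density` between a set of elements and matched control sequences would give a crude enrichment ratio; a real analysis would also need to control for base composition.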
|
46
|
Abstract
Background: To date, the reconstruction of gene regulatory networks from gene expression data has primarily relied on the correlation between the expression of transcription regulators and that of target genes. Results: We developed a network reconstruction method based on quantities that are closely related to the biophysical properties of TF-TF interaction, TF-DNA binding and transcriptional activation and repression. The Network-Identifier method utilized a thermodynamic model for gene regulation to infer regulatory relationships from multiple time course gene expression datasets. Applied to five datasets of differentiating embryonic stem cells, Network-Identifier identified a gene regulatory network among 87 transcription regulator genes. This network suggests that Oct4, Sox2 and Klf4 indirectly repress lineage specific differentiation genes by activating transcriptional repressors of Ctbp2, Rest and Mtf2.
|
47
|
Abstract
The distribution and evolution of ultraconserved elements (UCEs, DNA stretches that are perfectly identical in primates and rodents) were examined in genomes of 3 primate species (human, chimpanzee, and rhesus macaque). It was found that the number of UCEs has decreased throughout primate evolution. At least 26% of ancestral UCEs have diverged in hominoids, whereas an additional 17% have accumulated one or more single nucleotide polymorphisms in the human genome. Sequence polymorphism analyses indicate that mutation fixation within a UCE can trigger a relaxation in the selective constraint on that element. Homogeneous mutation accumulations in UCEs served as a template by which purifying selection acted more effectively on protein-coding UCEs. Gene ontology annotation suggests that UCE sequence variation, primarily occurring in noncoding regions, might be linked to the reprogramming of the expression pattern of transcription factors and developmentally important genes. Many of these genes are expressed in the central nervous system. Finally, UCE sequence variability within human populations has been identified, including population-specific nonsynonymous changes in protein-coding regions.
Affiliation(s)
- Ivan Ovcharenko
- Computational Biology Branch, National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA.
|
48
|
Xie D, Cai J, Chia NY, Ng HH, Zhong S. Cross-species de novo identification of cis-regulatory modules with GibbsModule: application to gene regulation in embryonic stem cells. Genome Res 2008; 18:1325-35. [PMID: 18490265 DOI: 10.1101/gr.072769.107] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/26/2022]
Abstract
We introduce the GibbsModule algorithm for de novo detection of cis-regulatory motifs and modules in eukaryote genomes. GibbsModule models the coexpressed genes within one species as sharing a core cis-regulatory motif and each homologous gene group as sharing a homologous cis-regulatory module (CRM), characterized by a similar composition of motifs. Without using a predetermined alignment result, GibbsModule iteratively updates the core motif shared by coexpressed genes and traces the homologous CRMs that contain the core motif. GibbsModule achieved substantial improvements in both precision and recall as compared with peer algorithms on a number of synthetic and real data sets. Applying GibbsModule to analyze the binding regions of the Krüppel-like factor (KLF) transcription factor in embryonic stem cells (ESCs), we discovered a motif that differs from a previously published KLF motif identified by a SELEX experiment, but the new motif is consistent with mutagenesis analysis. The SOX2 motif was found to be a collaborating motif to the KLF motif in ESCs. We used quantitative chromatin immunoprecipitation (ChIP) analysis to test whether GibbsModule could distinguish functional and nonfunctional binding sites. All seven tested binding sites in GibbsModule-predicted CRMs had higher ChIP signals as compared with the other seven tested binding sites located outside of predicted CRMs. GibbsModule is available at (http://biocomp.bioen.uiuc.edu/GibbsModule).
Affiliation(s)
- Dan Xie
- Department of Bioengineering, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
|
49
|
Yang R, Frank B, Hemminki K, Bartram CR, Wappenschmidt B, Sutter C, Kiechle M, Bugert P, Schmutzler RK, Arnold N, Weber BHF, Niederacher D, Meindl A, Burwinkel B. SNPs in ultraconserved elements and familial breast cancer risk. Carcinogenesis 2008; 29:351-5. [PMID: 18174240 DOI: 10.1093/carcin/bgm290] [Citation(s) in RCA: 44] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022] Open
Abstract
Ultraconserved elements (UCEs) are segments of >200 bp length showing absolute sequence identity between orthologous regions of human, rat and mouse genomes. The selection factors acting on these UCEs are still unknown. Recent studies have shown that UCEs function as long-range enhancers of flanking genes or are involved in splicing when overlapping with exons. The depletion of UCEs among copy number variation as well as the significant under-representation of single-nucleotide polymorphisms (SNPs) within UCEs have also revealed their evolutionary and functional importance, indicating their potential impact on disease, such as cancer. In the present study, we investigated the influence of six SNPs within UCEs on familial breast cancer risk. Two out of six SNPs showed an association with familial breast cancer risk. Whereas rs9572903 showed only a borderline significant association, the frequency of the rare [G] allele of rs2056116 was higher in cases than in controls, indicating an increased familial breast cancer risk ([G] versus [A]: odds ratio (OR) = 1.18, 95% confidence interval (CI) 1.06-1.30, P = 0.0020; [GG] versus [AA]: OR = 1.41, 95% CI 1.15-1.74, P = 0.0011). Interestingly, compared with the older age group, the ORs were increased in women younger than 50 years of age ([G] versus [A]: OR = 1.27, 95% CI 1.11-1.45, P = 0.0005; [GG] versus [AA]: OR = 1.60, 95% CI 1.22-2.10, P = 0.0007), pointing to an age- or hormone-related effect. This is the first study indicating that SNPs in UCEs might be associated with cancer risk.
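For readers unfamiliar with how an odds ratio and its 95% confidence interval are derived from allele counts, the Python sketch below uses the standard Wald interval on the log-odds scale. The abstract does not report the underlying counts or the exact statistical method, so both the counts and the choice of interval are assumptions made purely for illustration; the function name `odds_ratio_ci` is hypothetical.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval from a 2x2 table.

    a: risk-allele count in cases      b: reference-allele count in cases
    c: risk-allele count in controls   d: reference-allele count in controls
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical allele counts, NOT taken from the study.
estimate, lower, upper = odds_ratio_ci(a=900, b=1100, c=800, d=1200)
print(f"OR = {estimate:.2f}, 95% CI {lower:.2f}-{upper:.2f}")
```

An interval that excludes 1, as in the values quoted in the abstract, corresponds to a nominally significant association at the 5% level.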
Affiliation(s)
- Rongxi Yang
- Helmholtz-University Group Molecular Epidemiology, German Cancer Research Center (DKFZ), Im Neuenheimer Feld 581, 69120 Heidelberg, Germany
|
50
|
Kikuta H, Fredman D, Rinkwitz S, Lenhard B, Becker TS. Retroviral enhancer detection insertions in zebrafish combined with comparative genomics reveal genomic regulatory blocks - a fundamental feature of vertebrate genomes. Genome Biol 2007; 8 Suppl 1:S4. [PMID: 18047696 PMCID: PMC2106839 DOI: 10.1186/gb-2007-8-s1-s4] [Citation(s) in RCA: 39] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/30/2022] Open
Abstract
A large-scale enhancer detection screen was performed in the zebrafish using a retroviral vector carrying a basal promoter and a fluorescent protein reporter cassette. Analysis of insertional hotspots uncovered areas around developmental regulatory genes in which an insertion results in the same global expression pattern, irrespective of exact position. These areas coincide with vertebrate chromosomal segments containing identical gene order, a phenomenon known as conserved synteny and thought to be a vestige of evolution. Genomic comparative studies have found large numbers of highly conserved noncoding elements (HCNEs) spanning these and other loci. HCNEs are thought to act as transcriptional enhancers based on the finding that many of those that have been tested direct tissue-specific expression in transient or transgenic assays. Although gene order in hox and other gene clusters has long been known to be conserved because of shared regulatory sequences or overlapping transcriptional units, the chromosomal areas found through insertional hotspots contain only one or a few developmental regulatory genes as well as phylogenetically unrelated genes. We have termed these regions genomic regulatory blocks (GRBs), and show that they underlie the phenomenon of conserved synteny through all sequenced vertebrate genomes. After teleost whole genome duplication, a subset of GRBs was retained in two copies that underwent degenerative changes compared with tetrapod loci, which exist as a single copy and can therefore be viewed as representing the ancestral form. We discuss these findings in light of evolution of vertebrate chromosomal architecture and the identification of human disease mutations.
Affiliation(s)
- Hiroshi Kikuta
- Sars Centre for Marine Molecular Biology, University of Bergen, Thormoehlensgate, 5008 Bergen, Norway
|