1
Caragine CM, Le VT, Mustafa M, Diaz BJ, Morris JA, Müller S, Mendez-Mancilla A, Geller E, Liscovitch-Brauer N, Sanjana NE. Comprehensive dissection of cis-regulatory elements in a 2.8 Mb topologically associated domain in six human cancers. Nat Commun 2025; 16:1611. [PMID: 39948336 PMCID: PMC11825950 DOI: 10.1038/s41467-025-56568-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/14/2023] [Accepted: 01/22/2025] [Indexed: 02/16/2025] Open
Abstract
Cis-regulatory elements (CREs), such as enhancers and promoters, are fundamental regulators of gene expression and, across different cell types, the MYC locus utilizes a diverse regulatory architecture driven by multiple CREs. To better understand differences in CRE function, we perform pooled CRISPR inhibition (CRISPRi) screens to comprehensively probe the 2.8 Mb topologically-associated domain containing MYC in 6 human cancer cell lines with nucleotide resolution. We map 32 CREs where inhibition leads to changes in cell growth, including 8 that overlap previously identified enhancers. Targeting specific CREs decreases MYC expression by as much as 60%, and cell growth by as much as 50%. Using 3-D enhancer contact mapping, we find that these CREs almost always contact MYC but less than 10% of total MYC contacts impact growth when silenced, highlighting the utility of our approach to identify phenotypically-relevant CREs. We also detect an enrichment of lineage-specific transcription factors (TFs) at MYC CREs and, for some of these TFs, find a strong, tumor-specific correlation between TF and MYC expression not found in normal tissue. Taken together, these CREs represent systematically identified, functional regulatory regions and demonstrate how the same region of the human genome can give rise to complex, tissue-specific gene regulation.
Affiliation(s)
- Christina M Caragine
- New York Genome Center, New York, NY, USA
- Department of Biology, New York University, New York, NY, USA
- Department of Neuroscience and Physiology, New York University School of Medicine, New York, NY, USA
- Perlmutter Cancer Center, New York University School of Medicine, New York, NY, USA
- Victoria T Le
- New York Genome Center, New York, NY, USA
- Department of Biology, New York University, New York, NY, USA
- Department of Neuroscience and Physiology, New York University School of Medicine, New York, NY, USA
- Perlmutter Cancer Center, New York University School of Medicine, New York, NY, USA
- Meer Mustafa
- New York Genome Center, New York, NY, USA
- Department of Biology, New York University, New York, NY, USA
- Department of Neuroscience and Physiology, New York University School of Medicine, New York, NY, USA
- Perlmutter Cancer Center, New York University School of Medicine, New York, NY, USA
- Bianca Jay Diaz
- New York Genome Center, New York, NY, USA
- Department of Biology, New York University, New York, NY, USA
- Department of Neuroscience and Physiology, New York University School of Medicine, New York, NY, USA
- Perlmutter Cancer Center, New York University School of Medicine, New York, NY, USA
- John A Morris
- New York Genome Center, New York, NY, USA
- Department of Biology, New York University, New York, NY, USA
- Department of Neuroscience and Physiology, New York University School of Medicine, New York, NY, USA
- Perlmutter Cancer Center, New York University School of Medicine, New York, NY, USA
- Simon Müller
- New York Genome Center, New York, NY, USA
- Department of Biology, New York University, New York, NY, USA
- Department of Neuroscience and Physiology, New York University School of Medicine, New York, NY, USA
- Perlmutter Cancer Center, New York University School of Medicine, New York, NY, USA
- Alejandro Mendez-Mancilla
- New York Genome Center, New York, NY, USA
- Department of Biology, New York University, New York, NY, USA
- Department of Neuroscience and Physiology, New York University School of Medicine, New York, NY, USA
- Perlmutter Cancer Center, New York University School of Medicine, New York, NY, USA
- Evan Geller
- New York Genome Center, New York, NY, USA
- Department of Biology, New York University, New York, NY, USA
- Department of Neuroscience and Physiology, New York University School of Medicine, New York, NY, USA
- Perlmutter Cancer Center, New York University School of Medicine, New York, NY, USA
- Noa Liscovitch-Brauer
- New York Genome Center, New York, NY, USA
- Department of Biology, New York University, New York, NY, USA
- Department of Neuroscience and Physiology, New York University School of Medicine, New York, NY, USA
- Perlmutter Cancer Center, New York University School of Medicine, New York, NY, USA
- Neville E Sanjana
- New York Genome Center, New York, NY, USA.
- Department of Biology, New York University, New York, NY, USA.
- Department of Neuroscience and Physiology, New York University School of Medicine, New York, NY, USA.
- Perlmutter Cancer Center, New York University School of Medicine, New York, NY, USA.
2
Luthra I, Jensen C, Chen XE, Salaudeen AL, Rafi AM, de Boer CG. Regulatory activity is the default DNA state in eukaryotes. Nat Struct Mol Biol 2024; 31:559-567. [PMID: 38448573 DOI: 10.1038/s41594-024-01235-4] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 11/29/2023] [Accepted: 01/29/2024] [Indexed: 03/08/2024]
Abstract
Genomes encode for genes and non-coding DNA, both capable of transcriptional activity. However, unlike canonical genes, many transcripts from non-coding DNA have limited evidence of conservation or function. Here, to determine how much biological noise is expected from non-genic sequences, we quantify the regulatory activity of evolutionarily naive DNA using RNA-seq in yeast and computational predictions in humans. In yeast, more than 99% of naive DNA bases were transcribed. Unlike the evolved transcriptome, naive transcripts frequently overlapped with opposite sense transcripts, suggesting selection favored coherent gene structures in the yeast genome. In humans, regulation-associated chromatin activity is predicted to be common in naive dinucleotide-content-matched randomized DNA. Here, naive and evolved DNA have similar co-occurrence and cell-type specificity of chromatin marks, challenging these as indicators of selection. However, in both yeast and humans, extreme high activities were rare in naive DNA, suggesting they result from selection. Overall, basal regulatory activity seems to be the default, which selection can hone to evolve a function or, if detrimental, repress.
Affiliation(s)
- Ishika Luthra
- School of Biomedical Engineering, University of British Columbia, Vancouver, British Columbia, Canada
- Cassandra Jensen
- School of Biomedical Engineering, University of British Columbia, Vancouver, British Columbia, Canada
- Xinyi E Chen
- School of Biomedical Engineering, University of British Columbia, Vancouver, British Columbia, Canada
- Asfar Lathif Salaudeen
- School of Biomedical Engineering, University of British Columbia, Vancouver, British Columbia, Canada
- Abdul Muntakim Rafi
- School of Biomedical Engineering, University of British Columbia, Vancouver, British Columbia, Canada
- Carl G de Boer
- School of Biomedical Engineering, University of British Columbia, Vancouver, British Columbia, Canada.
3
de Jong MJ, van Oosterhout C, Hoelzel AR, Janke A. Moderating the neutralist-selectionist debate: exactly which propositions are we debating, and which arguments are valid? Biol Rev Camb Philos Soc 2024; 99:23-55. [PMID: 37621151 DOI: 10.1111/brv.13010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/15/2022] [Revised: 08/04/2023] [Accepted: 08/07/2023] [Indexed: 08/26/2023]
Abstract
Half a century after its foundation, the neutral theory of molecular evolution continues to attract controversy. The debate has been hampered by the coexistence of different interpretations of the core proposition of the neutral theory, the 'neutral mutation-random drift' hypothesis. In this review, we trace the origins of these ambiguities and suggest potential solutions. We highlight the difference between the original, the revised and the nearly neutral hypothesis, and re-emphasise that none of them equates to the null hypothesis of strict neutrality. We distinguish the neutral hypothesis of protein evolution, the main focus of the ongoing debate, from the neutral hypotheses of genomic and functional DNA evolution, which for many species are generally accepted. We advocate a further distinction between a narrow and an extended neutral hypothesis (of which the latter posits that random non-conservative amino acid substitutions can cause non-ecological phenotypic divergence), and we discuss the implications for evolutionary biology beyond the domain of molecular evolution. We furthermore point out that the debate has widened from its initial focus on point mutations, and also concerns the fitness effects of large-scale mutations, which can alter the dosage of genes and regulatory sequences. We evaluate the validity of neutralist and selectionist arguments and find that the tested predictions, apart from being sensitive to violation of underlying assumptions, are often derived from the null hypothesis of strict neutrality, or equally consistent with the opposing selectionist hypothesis, except when assuming molecular panselectionism. Our review aims to facilitate a constructive neutralist-selectionist debate, and thereby to contribute to answering a key question of evolutionary biology: what proportions of amino acid and nucleotide substitutions and polymorphisms are adaptive?
Affiliation(s)
- Menno J de Jong
- Senckenberg Biodiversity and Climate Research Institute (SBiK-F), Georg-Voigt-Strasse 14-16, Frankfurt am Main, 60325, Germany
- Cock van Oosterhout
- Centre for Ecology, Evolution and Conservation, University of East Anglia, Norwich Research Park, Norwich, NR4 7TJ, UK
- A Rus Hoelzel
- Department of Biosciences, Durham University, South Road, Durham, DH1 3LE, UK
- Axel Janke
- Senckenberg Biodiversity and Climate Research Institute (SBiK-F), Georg-Voigt-Strasse 14-16, Frankfurt am Main, 60325, Germany
- Institute for Ecology, Evolution and Diversity, Goethe University, Max-von-Laue-Strasse 9, Frankfurt am Main, 60438, Germany
- LOEWE-Centre for Translational Biodiversity Genomics (TBG), Senckenberg Nature Research Society, Georg-Voigt-Straße 14-16, Frankfurt am Main, 60325, Germany
4
Patrick MT, Sreeskandarajan S, Shefler A, Wasikowski R, Sarkar MK, Chen J, Qin T, Billi AC, Kahlenberg JM, Prens E, Hovnanian A, Weidinger S, Elder JT, Kuo CC, Gudjonsson JE, Tsoi LC. Large-scale functional inference for skin-expressing lncRNAs using expression and sequence information. JCI Insight 2023; 8:e172956. [PMID: 38131377 PMCID: PMC10807743 DOI: 10.1172/jci.insight.172956] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/08/2023] [Accepted: 11/08/2023] [Indexed: 12/23/2023] Open
Abstract
Long noncoding RNAs (lncRNAs) regulate the expression of protein-coding genes and have been shown to play important roles in inflammatory skin diseases. However, we still have limited understanding of the functional impact of lncRNAs in skin, partly due to their tissue specificity and lower expression levels compared with protein-coding genes. We compiled a comprehensive list of 18,517 lncRNAs from different sources and studied their expression profiles in 834 RNA-Seq samples from multiple inflammatory skin conditions and cytokine-stimulated keratinocytes. Applying a balanced random forest to predict involvement in biological functions, we achieved a median AUROC of 0.79 in 10-fold cross-validation, identifying significant DNA binding domains (DBDs) for 39 lncRNAs. G18244, a skin-expressing lncRNA predicted for IL-4/IL-13 signaling in keratinocytes, was highly correlated in expression with F13A1, a protein-coding gene involved in macrophage regulation, and we further identified a significant DBD in F13A1 for G18244. Reflecting clinical implications, AC090198.1 (predicted for IL-17 pathway) and AC005332.6 (predicted for IFN-γ pathway) had significant negative correlation with the SCORAD metric for atopic dermatitis. We also utilized single-cell RNA and spatial sequencing data to validate cell type specificity. Our research demonstrates lncRNAs have important immunological roles and can help prioritize their impact on inflammatory skin diseases.
Affiliation(s)
- Matthew T. Patrick
- Department of Dermatology, Michigan Medicine, University of Michigan, Ann Arbor, Michigan, USA
- Sutharzan Sreeskandarajan
- Department of Dermatology, Michigan Medicine, University of Michigan, Ann Arbor, Michigan, USA
- Center for Autoimmune Genomics and Etiology (CAGE), Cincinnati Children’s Hospital Medical Center, Cincinnati, Ohio, USA
- Alanna Shefler
- Department of Dermatology, Michigan Medicine, University of Michigan, Ann Arbor, Michigan, USA
- Rachael Wasikowski
- Department of Dermatology, Michigan Medicine, University of Michigan, Ann Arbor, Michigan, USA
- Mrinal K. Sarkar
- Department of Dermatology, Michigan Medicine, University of Michigan, Ann Arbor, Michigan, USA
- Jiahan Chen
- Department of Dermatology, Michigan Medicine, University of Michigan, Ann Arbor, Michigan, USA
- College of Sciences, North Carolina State University, Raleigh, North Carolina, USA
- Tingting Qin
- Department of Computational Medicine & Bioinformatics and
- Allison C. Billi
- Department of Dermatology, Michigan Medicine, University of Michigan, Ann Arbor, Michigan, USA
- J. Michelle Kahlenberg
- Department of Dermatology, Michigan Medicine, University of Michigan, Ann Arbor, Michigan, USA
- Division of Rheumatology, Department of Internal Medicine, University of Michigan, Ann Arbor, Michigan, USA
- Errol Prens
- Department of Dermatology, Erasmus University Medical Center, Rotterdam, Netherlands
- Alain Hovnanian
- Laboratory of Genetic Skin Diseases, Imagine Institute, Paris, France
- Stephan Weidinger
- Department of Dermatology and Allergy, University Medical Center Schleswig-Holstein, Kiel, Germany
- James T. Elder
- Department of Dermatology, Michigan Medicine, University of Michigan, Ann Arbor, Michigan, USA
- Ann Arbor Veterans Affairs Hospital, Ann Arbor, Michigan, USA
- Chao-Chung Kuo
- Institute for Computational Genomics, Joint Research Center for Computational Biomedicine, RWTH Aachen University, Aachen, Germany
- Johann E. Gudjonsson
- Department of Dermatology, Michigan Medicine, University of Michigan, Ann Arbor, Michigan, USA
- Lam C. Tsoi
- Department of Dermatology, Michigan Medicine, University of Michigan, Ann Arbor, Michigan, USA
- Department of Computational Medicine & Bioinformatics and
- Department of Biostatistics, University of Michigan, Ann Arbor, Michigan, USA
5
Reinar WB, Tørresen OK, Nederbragt AJ, Matschiner M, Jentoft S, Jakobsen KS. Teleost genomic repeat landscapes in light of diversification rates and ecology. Mob DNA 2023; 14:14. [PMID: 37789366 PMCID: PMC10546739 DOI: 10.1186/s13100-023-00302-9] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 04/28/2023] [Accepted: 09/20/2023] [Indexed: 10/05/2023] Open
Abstract
Repetitive DNA make up a considerable fraction of most eukaryotic genomes. In fish, transposable element (TE) activity has coincided with rapid species diversification. Here, we annotated the repetitive content in 100 genome assemblies, covering the major branches of the diverse lineage of teleost fish. We investigated if TE content correlates with family level net diversification rates and found support for a weak negative correlation. Further, we demonstrated that TE proportion correlates with genome size, but not to the proportion of short tandem repeats (STRs), which implies independent evolutionary paths. Marine and freshwater fish had large differences in STR content, with the most extreme propagation detected in the genomes of codfish species and Atlantic herring. Such a high density of STRs is likely to increase the mutational load, which we propose could be counterbalanced by high fecundity as seen in codfishes and herring.
Affiliation(s)
- Ole K Tørresen
- Department of Biosciences, University of Oslo, Oslo, Norway
- Alexander J Nederbragt
- Department of Biosciences, University of Oslo, Oslo, Norway
- Department of Informatics, University of Oslo, Oslo, Norway
- Michael Matschiner
- Department of Biosciences, University of Oslo, Oslo, Norway
- University of Oslo, Natural History Museum, Oslo, Norway
- Sissel Jentoft
- Department of Biosciences, University of Oslo, Oslo, Norway
6
Singh RS. A Law of Redundancy Compounds the Problem of Cancer and Precision Medicine. J Mol Evol 2023; 91:711-720. [PMID: 37665357 PMCID: PMC10597872 DOI: 10.1007/s00239-023-10131-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/08/2023] [Accepted: 08/17/2023] [Indexed: 09/05/2023]
Abstract
Genetics and molecular biology research have progressed for over a century; however, no laws of biology resembling those of physics have been identified, despite the expectations of some physicists. It may be that it is not the properties of matter alone but evolved properties of matter in combination with atomic physics and chemistry that gave rise to the origin and complexity of life. It is proposed that any law of biology must also be a product of evolution that co-evolved with the origin and progression of life. It was suggested that molecular complexity and redundancy exponentially increase over time and have the following relationship: DNA sequence complexity (Cd) < molecular complexity (Cm) < phenotypic complexity (Cp). This study presents a law of redundancy, which together with the law of complexity, is proposed as an evolutionary law of biology. Molecular complexity and redundancy are inseparable aspects of biochemical pathways, and molecular redundancy provides the first line of defense against environmental challenges, including those of deleterious mutations. Redundancy can create problems for precision medicine because in addition to the issues arising from the involvement of multiple genes, redundancy arising from alternate pathways between genotypes and phenotypes can complicate gene detection for complex diseases and mental disorders. This study uses cancer as an example to show how cellular complexity, molecular redundancy, and hidden variation affect the ability of cancer cells to evolve and evade detection and elimination. Characterization of alternate biochemical pathways or "escape routes" can provide a step in the fight against cancer.
Affiliation(s)
- Rama S Singh
- Professor Emeritus, Department of Biology and Origins Institute, McMaster University, 1280 Main Street W., Hamilton, ON, L8S 4K1, Canada.
7
Roy A, Sakthikumar S, Kozyrev SV, Nordin J, Pensch R, Mäkeläinen S, Pettersson M, Karlsson EK, Lindblad-Toh K, Forsberg-Nilsson K. Using evolutionary constraint to define novel candidate driver genes in medulloblastoma. Proc Natl Acad Sci U S A 2023; 120:e2300984120. [PMID: 37549291 PMCID: PMC10438395 DOI: 10.1073/pnas.2300984120] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/17/2023] [Accepted: 07/07/2023] [Indexed: 08/09/2023] Open
Abstract
Current knowledge of cancer genomics remains biased against noncoding mutations. To systematically search for regulatory noncoding mutations, we assessed mutations in conserved positions in the genome under the assumption that these are more likely to be functional than mutations in positions with low conservation. To this end, we use whole-genome sequencing data from the International Cancer Genome Consortium and combined it with evolutionary constraint inferred from 240 mammals, to identify genes enriched in noncoding constraint mutations (NCCMs), mutations likely to be regulatory in nature. We compare medulloblastoma (MB), which is malignant, to pilocytic astrocytoma (PA), a primarily benign tumor, and find highly different NCCM frequencies between the two, in agreement with the fact that malignant cancers tend to have more mutations. In PA, a high NCCM frequency only affects the BRAF locus, which is the most commonly mutated gene in PA. In contrast, in MB, >500 genes have high levels of NCCMs. Intriguingly, several loci with NCCMs in MB are associated with different ages of onset, such as the HOXB cluster in young MB patients. In adult patients, NCCMs occurred in, e.g., the WASF-2/AHDC1/FGR locus. One of these NCCMs led to increased expression of the SRC kinase FGR and augmented responsiveness of MB cells to dasatinib, a SRC kinase inhibitor. Our analysis thus points to different molecular pathways in different patient groups. These newly identified putative candidate driver mutations may aid in patient stratification in MB and could be valuable for future selection of personalized treatment options.
Affiliation(s)
- Ananya Roy
- Department of Immunology, Genetics and Pathology, Science for Life Laboratory, Uppsala University, 751 85 Uppsala, Sweden
- Sharadha Sakthikumar
- Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, 751 23 Uppsala, Sweden
- Broad Institute, Cambridge, MA 02142
- Sergey V. Kozyrev
- Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, 751 23 Uppsala, Sweden
- Jessika Nordin
- Department of Immunology, Genetics and Pathology, Science for Life Laboratory, Uppsala University, 751 85 Uppsala, Sweden
- Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, 751 23 Uppsala, Sweden
- Raphaela Pensch
- Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, 751 23 Uppsala, Sweden
- Suvi Mäkeläinen
- Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, 751 23 Uppsala, Sweden
- Mats Pettersson
- Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, 751 23 Uppsala, Sweden
- Elinor K. Karlsson
- Broad Institute, Cambridge, MA 02142
- Program in Molecular Medicine, UMass Chan Medical School, Worcester, MA 01605
- Program in Bioinformatics and Integrative Biology, UMass Chan Medical School, Worcester, MA 01605
- Kerstin Lindblad-Toh
- Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, 751 23 Uppsala, Sweden
- Broad Institute, Cambridge, MA 02142
- Karin Forsberg-Nilsson
- Department of Immunology, Genetics and Pathology, Science for Life Laboratory, Uppsala University, 751 85 Uppsala, Sweden
- Division of Cancer and Stem Cells, University of Nottingham Biodiscovery Institute, Nottingham NG7 2RD, United Kingdom
8
Christmas MJ, Kaplow IM, Genereux DP, Dong MX, Hughes GM, Li X, Sullivan PF, Hindle AG, Andrews G, Armstrong JC, Bianchi M, Breit AM, Diekhans M, Fanter C, Foley NM, Goodman DB, Goodman L, Keough KC, Kirilenko B, Kowalczyk A, Lawless C, Lind AL, Meadows JRS, Moreira LR, Redlich RW, Ryan L, Swofford R, Valenzuela A, Wagner F, Wallerman O, Brown AR, Damas J, Fan K, Gatesy J, Grimshaw J, Johnson J, Kozyrev SV, Lawler AJ, Marinescu VD, Morrill KM, Osmanski A, Paulat NS, Phan BN, Reilly SK, Schäffer DE, Steiner C, Supple MA, Wilder AP, Wirthlin ME, Xue JR, Birren BW, Gazal S, Hubley RM, Koepfli KP, Marques-Bonet T, Meyer WK, Nweeia M, Sabeti PC, Shapiro B, Smit AFA, Springer MS, Teeling EC, Weng Z, Hiller M, Levesque DL, Lewin HA, Murphy WJ, Navarro A, Paten B, Pollard KS, Ray DA, Ruf I, Ryder OA, Pfenning AR, Lindblad-Toh K, Karlsson EK. Evolutionary constraint and innovation across hundreds of placental mammals. Science 2023; 380:eabn3943. [PMID: 37104599 PMCID: PMC10250106 DOI: 10.1126/science.abn3943] [Citation(s) in RCA: 104] [Impact Index Per Article: 52.0] [Received: 11/23/2021] [Accepted: 12/16/2022] [Indexed: 04/29/2023]
Abstract
Zoonomia is the largest comparative genomics resource for mammals produced to date. By aligning genomes for 240 species, we identify bases that, when mutated, are likely to affect fitness and alter disease risk. At least 332 million bases (~10.7%) in the human genome are unusually conserved across species (evolutionarily constrained) relative to neutrally evolving repeats, and 4552 ultraconserved elements are nearly perfectly conserved. Of 101 million significantly constrained single bases, 80% are outside protein-coding exons and half have no functional annotations in the Encyclopedia of DNA Elements (ENCODE) resource. Changes in genes and regulatory elements are associated with exceptional mammalian traits, such as hibernation, that could inform therapeutic development. Earth's vast and imperiled biodiversity offers distinctive power for identifying genetic variants that affect genome function and organismal phenotypes.
Affiliation(s)
- Matthew J. Christmas
- Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, 751 32 Uppsala, Sweden
- Irene M. Kaplow
- Department of Computational Biology, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Michael X. Dong
- Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, 751 32 Uppsala, Sweden
- Graham M. Hughes
- School of Biology and Environmental Science, University College Dublin, Belfield, Dublin 4, Ireland
- Xue Li
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
- Morningside Graduate School of Biomedical Sciences, UMass Chan Medical School, Worcester, MA 01605, USA
- Program in Bioinformatics and Integrative Biology, UMass Chan Medical School, Worcester, MA 01605, USA
- Patrick F. Sullivan
- Department of Genetics, University of North Carolina Medical School, Chapel Hill, NC 27599, USA
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
- Allyson G. Hindle
- School of Life Sciences, University of Nevada Las Vegas, Las Vegas, NV 89154, USA
- Gregory Andrews
- Program in Bioinformatics and Integrative Biology, UMass Chan Medical School, Worcester, MA 01605, USA
- Joel C. Armstrong
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
- Matteo Bianchi
- Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, 751 32 Uppsala, Sweden
- Ana M. Breit
- School of Biology and Ecology, University of Maine, Orono, ME 04469, USA
- Mark Diekhans
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
- Cornelia Fanter
- School of Life Sciences, University of Nevada Las Vegas, Las Vegas, NV 89154, USA
- Nicole M. Foley
- Veterinary Integrative Biosciences, Texas A&M University, College Station, TX 77843, USA
- Daniel B. Goodman
- Department of Microbiology and Immunology, University of California San Francisco, San Francisco, CA 94143, USA
- Kathleen C. Keough
- Fauna Bio, Inc., Emeryville, CA 94608, USA
- Department of Epidemiology and Biostatistics, University of California San Francisco, San Francisco, CA 94158, USA
- Gladstone Institutes, San Francisco, CA 94158, USA
- Bogdan Kirilenko
- Faculty of Biosciences, Goethe-University, 60438 Frankfurt, Germany
- LOEWE Centre for Translational Biodiversity Genomics, 60325 Frankfurt, Germany
- Senckenberg Research Institute, 60325 Frankfurt, Germany
- Amanda Kowalczyk
- Department of Computational Biology, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Colleen Lawless
- School of Biology and Environmental Science, University College Dublin, Belfield, Dublin 4, Ireland
- Abigail L. Lind
- Department of Epidemiology and Biostatistics, University of California San Francisco, San Francisco, CA 94158, USA
- Gladstone Institutes, San Francisco, CA 94158, USA
- Jennifer R. S. Meadows
- Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, 751 32 Uppsala, Sweden
- Lucas R. Moreira
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
- Program in Bioinformatics and Integrative Biology, UMass Chan Medical School, Worcester, MA 01605, USA
- Ruby W. Redlich
- Department of Biological Sciences, Mellon College of Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Louise Ryan
- School of Biology and Environmental Science, University College Dublin, Belfield, Dublin 4, Ireland
- Ross Swofford
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
- Alejandro Valenzuela
- Department of Experimental and Health Sciences, Institute of Evolutionary Biology (UPF-CSIC), Universitat Pompeu Fabra, 08003 Barcelona, Spain
- Franziska Wagner
- Museum of Zoology, Senckenberg Natural History Collections Dresden, 01109 Dresden, Germany
- Ola Wallerman
- Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, 751 32 Uppsala, Sweden
- Ashley R. Brown
- Department of Computational Biology, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Joana Damas
- The Genome Center, University of California Davis, Davis, CA 95616, USA
- Kaili Fan
- Program in Bioinformatics and Integrative Biology, UMass Chan Medical School, Worcester, MA 01605, USA
- John Gatesy
- Division of Vertebrate Zoology, American Museum of Natural History, New York, NY 10024, USA
- Jenna Grimshaw
- Department of Biological Sciences, Texas Tech University, Lubbock, TX 79409, USA
- Jeremy Johnson
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
- Sergey V. Kozyrev
- Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, 751 32 Uppsala, Sweden
- Alyssa J. Lawler
- Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
- Department of Biological Sciences, Mellon College of Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Voichita D. Marinescu
- Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, 751 32 Uppsala, Sweden
| | - Kathleen M. Morrill
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
- Morningside Graduate School of Biomedical Sciences, UMass Chan Medical School, Worcester, MA 01605, USA
- Program in Bioinformatics and Integrative Biology, UMass Chan Medical School, Worcester, MA 01605, USA
| | - Austin Osmanski
- Medical Scientist Training Program, University of Pittsburgh School of Medicine, Pittsburgh, PA 15261, USA
| | - Nicole S. Paulat
- Department of Biological Sciences, Texas Tech University, Lubbock, TX 79409, USA
| | - BaDoi N. Phan
- Department of Computational Biology, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Medical Scientist Training Program, University of Pittsburgh School of Medicine, Pittsburgh, PA 15261, USA
| | - Steven K. Reilly
- Department of Genetics, Yale School of Medicine, New Haven, CT 06510, USA
| | - Daniel E. Schäffer
- Department of Computational Biology, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
| | - Cynthia Steiner
- Conservation Genetics, San Diego Zoo Wildlife Alliance, Escondido, CA 92027, USA
| | - Megan A. Supple
- Department of Ecology and Evolutionary Biology, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Aryn P. Wilder
- Conservation Genetics, San Diego Zoo Wildlife Alliance, Escondido, CA 92027, USA
| | - Morgan E. Wirthlin
- Department of Computational Biology, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Allen Institute for Brain Science, Seattle, WA 98109, USA
| | - James R. Xue
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA 02138, USA
| | | | - Bruce W. Birren
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
| | - Steven Gazal
- Keck School of Medicine, University of Southern California, Los Angeles, CA 90033, USA
| | | | - Klaus-Peter Koepfli
- Center for Species Survival, Smithsonian’s National Zoo and Conservation Biology Institute, Washington, DC 20008, USA
- Computer Technologies Laboratory, ITMO University, St. Petersburg 197101, Russia
- Smithsonian-Mason School of Conservation, George Mason University, Front Royal, VA 22630, USA
| | - Tomas Marques-Bonet
- Catalan Institution of Research and Advanced Studies (ICREA), 08010 Barcelona, Spain
- CNAG-CRG, Centre for Genomic Regulation, Barcelona Institute of Science and Technology (BIST), 08036 Barcelona, Spain
- Department of Medicine and Life Sciences, Institute of Evolutionary Biology (UPF-CSIC), Universitat Pompeu Fabra, 08003 Barcelona, Spain
- Institut Català de Paleontologia Miquel Crusafont, Universitat Autònoma de Barcelona, 08193 Cerdanyola del Vallès, Barcelona, Spain
| | - Wynn K. Meyer
- Department of Biological Sciences, Lehigh University, Bethlehem, PA 18015, USA
| | - Martin Nweeia
- Department of Comprehensive Care, School of Dental Medicine, Case Western Reserve University, Cleveland, OH 44106, USA
- Department of Vertebrate Zoology, Canadian Museum of Nature, Ottawa, Ontario K2P 2R1, Canada
- Department of Vertebrate Zoology, Smithsonian Institution, Washington, DC 20002, USA
- Narwhal Genome Initiative, Department of Restorative Dentistry and Biomaterials Sciences, Harvard School of Dental Medicine, Boston, MA 02115, USA
| | - Pardis C. Sabeti
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA 02138, USA
- Howard Hughes Medical Institute, Harvard University, Cambridge, MA 02138, USA
| | - Beth Shapiro
- Department of Ecology and Evolutionary Biology, University of California Santa Cruz, Santa Cruz, CA 95064, USA
- Howard Hughes Medical Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | | | - Mark S. Springer
- Department of Evolution, Ecology and Organismal Biology, University of California Riverside, Riverside, CA 92521, USA
| | - Emma C. Teeling
- School of Biology and Environmental Science, University College Dublin, Belfield, Dublin 4, Ireland
| | - Zhiping Weng
- Program in Bioinformatics and Integrative Biology, UMass Chan Medical School, Worcester, MA 01605, USA
| | - Michael Hiller
- Faculty of Biosciences, Goethe-University, 60438 Frankfurt, Germany
- LOEWE Centre for Translational Biodiversity Genomics, 60325 Frankfurt, Germany
- Senckenberg Research Institute, 60325 Frankfurt, Germany
| | | | - Harris A. Lewin
- The Genome Center, University of California Davis, Davis, CA 95616, USA
- Department of Evolution and Ecology, University of California Davis, Davis, CA 95616, USA
- John Muir Institute for the Environment, University of California Davis, Davis, CA 95616, USA
| | - William J. Murphy
- Veterinary Integrative Biosciences, Texas A&M University, College Station, TX 77843, USA
| | - Arcadi Navarro
- Catalan Institution of Research and Advanced Studies (ICREA), 08010 Barcelona, Spain
- Department of Medicine and Life Sciences, Institute of Evolutionary Biology (UPF-CSIC), Universitat Pompeu Fabra, 08003 Barcelona, Spain
- BarcelonaBeta Brain Research Center, Pasqual Maragall Foundation, 08005 Barcelona, Spain
- CRG, Centre for Genomic Regulation, Barcelona Institute of Science and Technology (BIST), 08003 Barcelona, Spain
| | - Benedict Paten
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Katherine S. Pollard
- Department of Epidemiology and Biostatistics, University of California San Francisco, San Francisco, CA 94158, USA
- Gladstone Institutes, San Francisco, CA 94158, USA
- Chan Zuckerberg Biohub, San Francisco, CA 94158, USA
| | - David A. Ray
- Department of Biological Sciences, Texas Tech University, Lubbock, TX 79409, USA
| | - Irina Ruf
- Division of Messel Research and Mammalogy, Senckenberg Research Institute and Natural History Museum Frankfurt, 60325 Frankfurt am Main, Germany
| | - Oliver A. Ryder
- Conservation Genetics, San Diego Zoo Wildlife Alliance, Escondido, CA 92027, USA
- Department of Evolution, Behavior and Ecology, School of Biological Sciences, University of California San Diego, La Jolla, CA 92039, USA
| | - Andreas R. Pfenning
- Department of Computational Biology, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA 15213, USA
| | - Kerstin Lindblad-Toh
- Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, 751 32 Uppsala, Sweden
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
| | - Elinor K. Karlsson
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
- Program in Bioinformatics and Integrative Biology, UMass Chan Medical School, Worcester, MA 01605, USA
- Program in Molecular Medicine, UMass Chan Medical School, Worcester, MA 01605, USA
9
Bartlett J. Random with Respect to Fitness or External Selection? An Important but Often Overlooked Distinction. Acta Biotheor 2023; 71:12. [PMID: 36933070 DOI: 10.1007/s10441-023-09464-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/16/2022] [Accepted: 03/03/2023] [Indexed: 03/19/2023]
Abstract
Mutations are often described as being "random with respect to fitness." Here we show that the experiments used to establish randomness with respect to fitness are only capable of showing that mutations are random with respect to current external selection. Current debates about whether or not mutations are directed may be at least partially resolved by making use of this distinction. Additionally, this distinction has important mathematical, experimental, and inferential implications.
10
Barbo M, Ravnik-Glavač M. Extracellular Vesicles as Potential Biomarkers in Amyotrophic Lateral Sclerosis. Genes (Basel) 2023; 14:genes14020325. [PMID: 36833252 PMCID: PMC9956314 DOI: 10.3390/genes14020325] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Received: 12/29/2022] [Revised: 01/19/2023] [Accepted: 01/20/2023] [Indexed: 01/28/2023]
Abstract
Amyotrophic lateral sclerosis (ALS) is described as a fatal and rapidly progressive neurodegenerative disorder caused by the degeneration of upper motor neurons in the primary motor cortex and lower motor neurons of the brainstem and spinal cord. Due to ALS's slowly progressive characteristic, which is often accompanied by other neurological comorbidities, its diagnosis remains challenging. Perturbations in vesicle-mediated transport and autophagy as well as cell-autonomous disease initiation in glutamatergic neurons have been revealed in ALS. The use of extracellular vesicles (EVs) may be key in accessing pathologically relevant tissues for ALS, as EVs can cross the blood-brain barrier and be isolated from the blood. The number and content of EVs may provide indications of the disease pathogenesis, its stage, and prognosis. In this review, we collected a recent study aiming at the identification of EVs as a biomarker of ALS with respect to the size, quantity, and content of EVs in the biological fluids of patients compared to controls.
11
Wooding SP, Ramirez VA. Global population genetics and diversity in the TAS2R bitter taste receptor family. Front Genet 2022; 13:952299. [PMID: 36303543 PMCID: PMC9592824 DOI: 10.3389/fgene.2022.952299] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 05/24/2022] [Accepted: 08/25/2022] [Indexed: 12/03/2022]
Abstract
Bitter taste receptors (TAS2Rs) are noted for their role in perception, and mounting evidence suggests that they mediate responses to compounds entering airways, gut, and other tissues. The importance of these roles suggests that TAS2Rs have been under pressure from natural selection. To determine the extent of variation in TAS2Rs on a global scale and its implications for human evolution and behavior, we analyzed patterns of diversity in the complete 25 gene repertoire of human TAS2Rs in ∼2,500 subjects representing worldwide populations. Across the TAS2R family as a whole, we observed 721 single nucleotide polymorphisms (SNPs) including 494 nonsynonymous SNPs along with 40 indels and gained and lost start and stop codons. In addition, computational predictions identified 169 variants particularly likely to affect receptor function, making them candidate sources of phenotypic variation. Diversity levels ranged widely among loci, with the number of segregating sites ranging from 17 to 41 with a mean of 32 among genes and per nucleotide heterozygosity (π) ranging from 0.02% to 0.36% with a mean of 0.12%. FST ranged from 0.01 to 0.26 with a mean of 0.13, pointing to modest differentiation among populations. Comparisons of observed π and FST values with their genome wide distributions revealed that most fell between the 5th and 95th percentiles and were thus consistent with expectations. Further, tests for natural selection using Tajima’s D statistic revealed only two loci departing from expectations given D’s genome wide distribution. These patterns are consistent with an overall relaxation of selective pressure on TAS2Rs in the course of recent human evolution.
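To make the per-nucleotide heterozygosity (π) reported above concrete, the following is a minimal sketch of how π can be computed as the average number of pairwise differences per site. The function name `nucleotide_diversity` and the toy alignment are illustrative assumptions; they are not taken from the study, which analyzed about 2,500 subjects with its own pipeline.

```python
from itertools import combinations


def nucleotide_diversity(sequences):
    """Return pi: mean pairwise differences per site for aligned sequences."""
    if len(sequences) < 2:
        raise ValueError("need at least two sequences")
    length = len(sequences[0])
    if length == 0 or any(len(s) != length for s in sequences):
        raise ValueError("sequences must be aligned, non-empty, and equal in length")
    pairs = list(combinations(sequences, 2))
    # Count mismatching positions for every unordered pair of sequences.
    total_diffs = sum(
        sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs
    )
    return total_diffs / (len(pairs) * length)


# Hypothetical toy alignment: four haplotypes, ten sites.
toy_alignment = [
    "ACGTACGTAC",
    "ACGTACGTAC",
    "ACGTTCGTAC",
    "ACGTACGTAA",
]
print(nucleotide_diversity(toy_alignment))
```

This sketch ignores missing data and gaps, which a real analysis would need to handle.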
Affiliation(s)
- Stephen P. Wooding
- Department of Anthropology, University of California, Merced, Merced, CA, United States
- *Correspondence: Stephen P. Wooding,
| | - Vicente A. Ramirez
- Department of Public Health, University of California, Merced, Merced, CA, United States
12
Abstract
Selection accumulates information in the genome-it guides stochastically evolving populations toward states (genotype frequencies) that would be unlikely under neutrality. This can be quantified as the Kullback-Leibler (KL) divergence between the actual distribution of genotype frequencies and the corresponding neutral distribution. First, we show that this population-level information sets an upper bound on the information at the level of genotype and phenotype, limiting how precisely they can be specified by selection. Next, we study how the accumulation and maintenance of information is limited by the cost of selection, measured as the genetic load or the relative fitness variance, both of which we connect to the control-theoretic KL cost of control. The information accumulation rate is upper bounded by the population size times the cost of selection. This bound is very general, and applies across models (Wright-Fisher, Moran, diffusion) and to arbitrary forms of selection, mutation, and recombination. Finally, the cost of maintaining information depends on how it is encoded: Specifying a single allele out of two is expensive, but one bit encoded among many weakly specified loci (as in a polygenic trait) is cheap.
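As a small illustration of the quantity named in the abstract above, the sketch below computes the Kullback-Leibler divergence, in bits, between two discrete distributions. The function name `kl_divergence_bits` and the two toy distributions are assumptions made for illustration only; the paper works with distributions of genotype frequencies under population-genetic models, which this sketch does not attempt to reproduce.

```python
import math


def kl_divergence_bits(p, q):
    """Return D_KL(p || q) in bits for two discrete distributions."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same number of states")
    total = 0.0
    for p_i, q_i in zip(p, q):
        if p_i == 0.0:
            continue  # terms with p_i = 0 contribute nothing
        if q_i == 0.0:
            return math.inf  # p puts mass where q has none
        total += p_i * math.log2(p_i / q_i)
    return total


# Hypothetical toy values: four population states under neutrality vs. selection.
neutral = [0.25, 0.25, 0.25, 0.25]
selected = [0.70, 0.10, 0.10, 0.10]
print(kl_divergence_bits(selected, neutral))
```

The divergence is zero when the two distributions coincide and grows as selection concentrates probability on states that are unlikely under neutrality.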
13
Palazzo AF, Kejiou NS. Non-Darwinian Molecular Biology. Front Genet 2022; 13:831068. [PMID: 35251134 PMCID: PMC8888898 DOI: 10.3389/fgene.2022.831068] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 12/07/2021] [Accepted: 01/24/2022] [Indexed: 12/14/2022]
Abstract
With the discovery of the double helical structure of DNA, a shift occurred in how biologists investigated questions surrounding cellular processes, such as protein synthesis. Instead of viewing biological activity through the lens of chemical reactions, this new field used biological information to gain a new profound view of how biological systems work. Molecular biologists asked new types of questions that would have been inconceivable to the older generation of researchers, such as how cellular machineries convert inherited biological information into functional molecules like proteins. This new focus on biological information also gave molecular biologists a way to link their findings to concepts developed by genetics and the modern synthesis. However, by the late 1960s this all changed. Elevated rates of mutation, unsustainable genetic loads, and high levels of variation in populations, challenged Darwinian evolution, a central tenet of the modern synthesis, where adaptation was the main driver of evolutionary change. Building on these findings, Motoo Kimura advanced the neutral theory of molecular evolution, which advocates that selection in multicellular eukaryotes is weak and that most genomic changes are neutral and due to random drift. This was further elaborated by Jack King and Thomas Jukes, in their paper “Non-Darwinian Evolution”, where they pointed out that the observed changes seen in proteins and the types of polymorphisms observed in populations only become understandable when we take into account biochemistry and Kimura’s new theory. Fifty years later, most molecular biologists remain unaware of these fundamental advances. Their adaptionist viewpoint fails to explain data collected from new powerful technologies which can detect exceedingly rare biochemical events. For example, high throughput sequencing routinely detects RNA transcripts being produced from almost the entire genome yet are present less than one copy per thousand cells and appear to lack any function. Molecular biologists must now reincorporate ideas from classical biochemistry and absorb modern concepts from molecular evolution, to craft a new lens through which they can evaluate the functionality of transcriptional units, and make sense of our messy, intricate, and complicated genome.
14
Quiver MH, Lachance J. Adaptive eQTLs reveal the evolutionary impacts of pleiotropy and tissue-specificity while contributing to health and disease. HGG ADVANCES 2022; 3:100083. [PMID: 35047867 PMCID: PMC8756519 DOI: 10.1016/j.xhgg.2021.100083] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 08/16/2021] [Accepted: 12/21/2021] [Indexed: 11/24/2022]
Abstract
Large numbers of expression quantitative trait loci (eQTLs) have recently been identified in humans, and many of these regulatory variants have large allele frequency differences between populations. Here, we conducted genome-wide scans of selection to identify adaptive eQTLs (i.e., eQTLs with large population branch statistics). We then tested if tissue pleiotropy affects whether eQTLs are more or less likely to be adaptive and identified tissues that have been key targets of positive selection during the last 100,000 years. Top adaptive eQTL outliers include rs1043809, rs66899053, and rs2814778 (a SNP that is associated with malaria resistance). We found that effect sizes of eQTLs were negatively correlated with population branch statistics and that adaptive eQTLs affect two-thirds as many tissues as do non-adaptive eQTLs. Because the tissue breadth of an eQTL can be viewed as a measure of pleiotropy, these results imply that pleiotropy inhibits adaptation. The proportion of eQTLs that are adaptive varies by tissue, and we found that eQTLs that regulate expression in testis, thyroid, blood, or sun-exposed skin are enriched for signatures of positive selection. By contrast, eQTLs that regulate expression in the cerebrum or female-specific tissues have a relative lack of adaptive outliers. Scans of selections also reveal that many adaptive eQTLs are closely linked to disease-associated loci. Taken together, our results indicate that eQTLs have played an important role in recent human evolution.
Affiliation(s)
- Melanie H Quiver
- School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA 30332, USA
| | - Joseph Lachance
- School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA 30332, USA
15
Dhasmana S, Dhasmana A, Narula AS, Jaggi M, Yallapu MM, Chauhan SC. The panoramic view of amyotrophic lateral sclerosis: A fatal intricate neurological disorder. Life Sci 2022; 288:120156. [PMID: 34801512 DOI: 10.1016/j.lfs.2021.120156] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 08/19/2021] [Revised: 11/10/2021] [Accepted: 11/11/2021] [Indexed: 02/07/2023]
Abstract
Amyotrophic lateral sclerosis (ALS) is a progressive and fatal neurological disease affecting both upper and lower motor neurons. In the United States alone, there are 16,000-20,000 established cases of ALS. The early disease diagnosis is challenging due to many overlapping pathophysiologies with other neurological diseases. The etiology of ALS is unknown; however, it is divided into two categories: familial ALS (fALS) which occurs due to gene mutations & contributes to 5-10% of ALS, and sporadic ALS (sALS) which is due to environmental factors & contributes to 90-95% of ALS. There is still no curative treatment for ALS: palliative care and symptomatic treatment are therefore essential components in the management of these patients. In this review, we provide a panoramic view of ALS, which includes epidemiology, risk factors, pathophysiologies, biomarkers, diagnosis, therapeutics (natural, synthetic, gene-based, pharmacological, stem cell, extracellular vesicles, and physical therapy), controversies (in the clinical trials of ALS), the scope of nanomedicine in ALS, and future perspectives.
Affiliation(s)
- Swati Dhasmana
- Department of Immunology and Microbiology, School of Medicine, University of Texas Rio Grande Valley, McAllen, TX 78504, USA; South Texas Center of Excellence in Cancer Research, School of Medicine, University of Texas Rio Grande Valley, McAllen, TX 78504, USA
| | - Anupam Dhasmana
- Department of Immunology and Microbiology, School of Medicine, University of Texas Rio Grande Valley, McAllen, TX 78504, USA; South Texas Center of Excellence in Cancer Research, School of Medicine, University of Texas Rio Grande Valley, McAllen, TX 78504, USA
| | - Acharan S Narula
- Narula Research LLC, 107 Boulder Bluff, Chapel Hill, NC 27516, USA
| | - Meena Jaggi
- Department of Immunology and Microbiology, School of Medicine, University of Texas Rio Grande Valley, McAllen, TX 78504, USA; South Texas Center of Excellence in Cancer Research, School of Medicine, University of Texas Rio Grande Valley, McAllen, TX 78504, USA
| | - Murali M Yallapu
- Department of Immunology and Microbiology, School of Medicine, University of Texas Rio Grande Valley, McAllen, TX 78504, USA; South Texas Center of Excellence in Cancer Research, School of Medicine, University of Texas Rio Grande Valley, McAllen, TX 78504, USA
| | - Subhash C Chauhan
- Department of Immunology and Microbiology, School of Medicine, University of Texas Rio Grande Valley, McAllen, TX 78504, USA; South Texas Center of Excellence in Cancer Research, School of Medicine, University of Texas Rio Grande Valley, McAllen, TX 78504, USA.
16
Akhlaghpour H. An RNA-Based Theory of Natural Universal Computation. J Theor Biol 2021; 537:110984. [PMID: 34979104 DOI: 10.1016/j.jtbi.2021.110984] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 11/05/2020] [Revised: 09/30/2021] [Accepted: 12/07/2021] [Indexed: 12/15/2022]
Abstract
Life is confronted with computation problems in a variety of domains including animal behavior, single-cell behavior, and embryonic development. Yet we currently do not know of a naturally existing biological system that is capable of universal computation, i.e., Turing-equivalent in scope. Generic finite-dimensional dynamical systems (which encompass most models of neural networks, intracellular signaling cascades, and gene regulatory networks) fall short of universal computation, but are assumed to be capable of explaining cognition and development. I present a class of models that bridge two concepts from distant fields: combinatory logic (or, equivalently, lambda calculus) and RNA molecular biology. A set of basic RNA editing rules can make it possible to compute any computable function with identical algorithmic complexity to that of Turing machines. The models do not assume extraordinarily complex molecular machinery or any processes that radically differ from what we already know to occur in cells. Distinct independent enzymes can mediate each of the rules and RNA molecules solve the problem of parenthesis matching through their secondary structure. In the most plausible of these models all of the editing rules can be implemented with merely cleavage and ligation operations at fixed positions relative to predefined motifs. This demonstrates that universal computation is well within the reach of molecular biology. It is therefore reasonable to assume that life has evolved - or possibly began with - a universal computer that yet remains to be discovered. The variety of seemingly unrelated computational problems across many scales can potentially be solved using the same RNA-based computation system. Experimental validation of this theory may immensely impact our understanding of memory, cognition, development, disease, evolution, and the early stages of life.
Affiliation(s)
- Hessameddin Akhlaghpour
- Laboratory of Integrative Brain Function, The Rockefeller University, New York, NY, 10065, USA
17
Pagni S, Mills JD, Frankish A, Mudge JM, Sisodiya SM. Non-coding regulatory elements: Potential roles in disease and the case of epilepsy. Neuropathol Appl Neurobiol 2021; 48:e12775. [PMID: 34820881 DOI: 10.1111/nan.12775] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 06/21/2021] [Revised: 10/04/2021] [Accepted: 11/16/2021] [Indexed: 12/27/2022]
Abstract
Non-coding DNA (ncDNA) refers to the portion of the genome that does not code for proteins and accounts for the greatest physical proportion of the human genome. ncDNA includes sequences that are transcribed into RNA molecules, such as ribosomal RNAs (rRNAs), microRNAs (miRNAs), long non-coding RNAs (lncRNAs) and un-transcribed sequences that have regulatory functions, including gene promoters and enhancers. Variation in non-coding regions of the genome has an established role in human disease, with growing evidence from many areas, including several cancers, Parkinson's disease and autism. Here, we review the features and functions of the regulatory elements that are present in the non-coding genome and the role that these regions have in human disease. We then review the existing research in epilepsy and emphasise the potential value of further exploring non-coding regulatory elements in epilepsy. In addition, we outline the most widely used techniques for recognising regulatory elements throughout the genome, current methodologies for investigating variation and the main challenges associated with research in the field of non-coding DNA.
Affiliation(s)
- Susanna Pagni
- Department of Clinical and Experimental Epilepsy, UCL Queen Square Institute of Neurology, London, UK; Chalfont Centre for Epilepsy, Chalfont St Peter, UK
| | - James D Mills
- Department of Clinical and Experimental Epilepsy, UCL Queen Square Institute of Neurology, London, UK; Chalfont Centre for Epilepsy, Chalfont St Peter, UK; Amsterdam UMC, Department of (Neuro)Pathology, Amsterdam Neuroscience, University of Amsterdam, Amsterdam, Netherlands
| | - Adam Frankish
- European Molecular Biology Laboratory, European Bioinformatics Institute, Cambridge, UK
| | - Jonathan M Mudge
- European Molecular Biology Laboratory, European Bioinformatics Institute, Cambridge, UK
| | - Sanjay M Sisodiya
- Department of Clinical and Experimental Epilepsy, UCL Queen Square Institute of Neurology, London, UK; Chalfont Centre for Epilepsy, Chalfont St Peter, UK
18
Annotation depth confounds direct comparison of gene expression across species. BMC Bioinformatics 2021; 22:499. [PMID: 34654362 PMCID: PMC8518172 DOI: 10.1186/s12859-021-04414-y] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 05/02/2021] [Accepted: 09/30/2021] [Indexed: 11/10/2022]
Abstract
BACKGROUND Comparisons of the molecular framework among organisms can be done on both structural and functional levels. One of the most common top-down approaches for functional comparisons is RNA sequencing. This estimation of organismal transcriptional responses is of interest for understanding evolution of molecular activity, which is used for answering a diversity of questions ranging from basic biology to pre-clinical species selection and translation. However, direct comparison between species is often hindered by evolutionary divergence in structure of molecular framework, as well as large difference in the depth of our understanding of the genetic background between humans and other species. Here, we focus on the latter. We attempt to understand how differences in transcriptome annotation affect direct gene abundance comparisons between species. RESULTS We examine and suggest some straightforward approaches for direct comparison given the current available tools and using a sample dataset from human, cynomolgus monkey, dog, rat and mouse with a common quantitation and normalization approach. In addition, we examine how variation in genome annotation depth and quality across species may affect these direct comparisons. CONCLUSIONS Our findings suggest that further efforts for better genome annotation or computational normalization tools may be of strong interest.
19
Riba A, Fumagalli MR, Caselle M, Osella M. A Model-Driven Quantitative Analysis of Retrotransposon Distributions in the Human Genome. Genome Biol Evol 2021; 12:2045-2059. [PMID: 32986810 PMCID: PMC7750997 DOI: 10.1093/gbe/evaa201] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Accepted: 09/19/2020] [Indexed: 12/21/2022]
Abstract
Retrotransposons, DNA sequences capable of creating copies of themselves, compose about half of the human genome and played a central role in the evolution of mammals. Their current position in the host genome is the result of the retrotranscription process and of the following host genome evolution. We apply a model from statistical physics to show that the genomic distribution of the two most populated classes of retrotransposons in human deviates from random placement, and that this deviation increases with time. The time dependence suggests a major role of the host genome dynamics in shaping the current retrotransposon distributions. Focusing on a neutral scenario, we show that a simple model based on random placement followed by genome expansion and sequence duplications can reproduce the empirical retrotransposon distributions, even though more complex and possibly selective mechanisms can have contributed. Besides the inherent interest in understanding the origin of current retrotransposon distributions, this work sets a general analytical framework to analyze quantitatively the effects of genome evolutionary dynamics on the distribution of genomic elements.
Affiliation(s)
| | - Maria Rita Fumagalli
- Institute of Biophysics - CNR, National Research Council, Genova, Italy; Department of Environmental Science and Policy, Center for Complexity and Biosystems, University of Milan, Milano, Italy
| | - Michele Caselle
- Department of Physics and INFN, University of Torino, Torino, Italy
| | - Matteo Osella
- Department of Physics and INFN, University of Torino, Torino, Italy
20
Fadason T, Farrow S, Gokuladhas S, Golovina E, Nyaga D, O'Sullivan JM, Schierding W. Assigning function to SNPs: Considerations when interpreting genetic variation. Semin Cell Dev Biol 2021; 121:135-142. [PMID: 34446357 DOI: 10.1016/j.semcdb.2021.08.008] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 08/05/2021] [Accepted: 08/12/2021] [Indexed: 12/26/2022]
Abstract
Assigning function to single nucleotide polymorphisms (SNPs) to understand the mechanisms that link genetic and phenotypic variation and disease is an area of intensive research that is necessary to contribute to the continuing development of precision medicine. However, despite the apparent simplicity that is captured in the name SNP - 'single nucleotide' changes are not easy to functionally characterize. This complexity arises from multiple features of the genome including the fact that function is development and environment specific. As such, we are often fooled by our terminology and underlying assumptions that there is a single function for a SNP. Here we discuss some of what is known about SNPs, their functions and how we can go about characterizing them.
Affiliation(s)
- Tayaza Fadason
- Liggins Institute, The University of Auckland, Auckland, New Zealand; The Maurice Wilkins Centre, The University of Auckland, Auckland, New Zealand
- Sophie Farrow
- Liggins Institute, The University of Auckland, Auckland, New Zealand; The Maurice Wilkins Centre, The University of Auckland, Auckland, New Zealand
- Evgeniia Golovina
- Liggins Institute, The University of Auckland, Auckland, New Zealand
- Denis Nyaga
- Liggins Institute, The University of Auckland, Auckland, New Zealand
- Justin M O'Sullivan
- Liggins Institute, The University of Auckland, Auckland, New Zealand; The Maurice Wilkins Centre, The University of Auckland, Auckland, New Zealand; Garvan Institute of Medical Research, Sydney, New South Wales, Australia; MRC Lifecourse Epidemiology Unit, University of Southampton, United Kingdom.
- William Schierding
- Liggins Institute, The University of Auckland, Auckland, New Zealand; The Maurice Wilkins Centre, The University of Auckland, Auckland, New Zealand
21
Rachakonda S, Hoheisel JD, Kumar R. Occurrence, functionality and abundance of the TERT promoter mutations. Int J Cancer 2021; 149:1852-1862. [PMID: 34313327 DOI: 10.1002/ijc.33750] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 03/10/2021] [Revised: 06/14/2021] [Accepted: 07/16/2021] [Indexed: 12/18/2022]
Abstract
Telomere shortening at chromosomal ends due to the constraints of the DNA replication process acts as a tumor suppressor by restricting the replicative potential in primary cells. Cancers evade that limitation primarily through the reactivation of telomerase via different mechanisms. Mutations within the promoter of the telomerase reverse transcriptase (TERT) gene represent a definite mechanism for the ribonucleic enzyme regeneration predominantly in cancers that arise from tissues with low rates of self-renewal. The promoter mutations cause a moderate increase in TERT transcription and consequent telomerase upregulation to levels sufficient to delay replicative senescence but not to prevent bulk telomere shortening and genomic instability. Since their discovery, a staggering number of studies have resolved the discrete aspects, effects and clinical relevance of the TERT promoter mutations. The promoter mutations link transcription of TERT with oncogenic pathways, associate with markers of poor outcome and define patients with reduced survival in several cancers. In this review, we discuss the occurrence and impact of the promoter mutations and highlight the mechanism of TERT activation. We further deliberate on the foundational question of the abundance of the TERT promoter mutations and a general dearth of functional mutations within noncoding sequences, as evident from pan-cancer analysis of whole genomes. We posit that the favorable genomic constellation within the TERT promoter may be less than a common occurrence in other noncoding functional elements. Besides, the evolutionary constraints limit the functional fraction within the human genome, hence the lack of abundant mutations outside the coding sequences.
Affiliation(s)
- Jörg D Hoheisel
- Division of Functional Genome Analysis, German Cancer Research Center (DKFZ), Heidelberg, Germany
- Rajiv Kumar
- Division of Functional Genome Analysis, German Cancer Research Center (DKFZ), Heidelberg, Germany; Department of Molecular Biology of Cancer, Institute of Experimental Medicine of the Czech Academy of Sciences, Prague, Czech Republic
22
Galeota-Sprung B, Sniegowski P, Ewens W. Mutational Load and the Functional Fraction of the Human Genome. Genome Biol Evol 2021; 12:273-281. [PMID: 32108234 PMCID: PMC7151545 DOI: 10.1093/gbe/evaa040] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Accepted: 02/25/2020] [Indexed: 01/30/2023] Open
Abstract
The fraction of the human genome that is functional is a question of both evolutionary and practical importance. Studies of sequence divergence have suggested that the functional fraction of the human genome is likely to be no more than ∼15%. In contrast, the ENCODE project, a systematic effort to map regions of transcription, transcription factor association, chromatin structure, and histone modification, assigned function to 80% of the human genome. In this article, we examine whether and how an analysis based on mutational load might set a limit on the functional fraction. In order to do so, we characterize the distribution of fitness of a large, finite, diploid population at mutation-selection equilibrium. In particular, if mean fitness is ∼1, the fitness of the fittest individual likely to occur cannot be unreasonably high. We find that at equilibrium, the distribution of log fitness has variance nus, where u is the per-base deleterious mutation rate, n is the number of functional sites (and hence incorporates the functional fraction f), and s is the selection coefficient of deleterious mutations. In a large (N = 10^9) reproducing population, the fitness of the fittest individual likely to exist is ∼e^(5√(nus)). These results apply to both additive and recessive fitness schemes. Our approach is different from previous work that compared mean fitness at mutation-selection equilibrium with the fitness of an individual who has no deleterious mutations; we show that such an individual is exceedingly unlikely to exist. We find that the functional fraction is not very likely to be limited substantially by mutational load, and that any such limit, if it exists, depends strongly on the selection coefficients of new deleterious mutations.
Affiliation(s)
- Warren Ewens
- Department of Biology, University of Pennsylvania
23
Ni P, Su Z. Accurate prediction of cis-regulatory modules reveals a prevalent regulatory genome of humans. NAR Genom Bioinform 2021; 3:lqab052. [PMID: 34159315 PMCID: PMC8210889 DOI: 10.1093/nargab/lqab052] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 02/26/2021] [Revised: 05/01/2021] [Accepted: 06/14/2021] [Indexed: 02/07/2023] Open
Abstract
Cis-regulatory modules (CRMs) formed by clusters of transcription factor (TF) binding sites (TFBSs) are as important as coding sequences in specifying phenotypes of humans. It is essential to categorize all CRMs and constituent TFBSs in the genome. In contrast to most existing methods that predict CRMs in specific cell types using epigenetic marks, we predict a largely cell type agnostic but more comprehensive map of CRMs and constituent TFBSs in the genome by integrating all available TF ChIP-seq datasets. Our method is able to partition the 77.47% of genome regions covered by the 6092 available datasets into a CRM candidate (CRMC) set (56.84%) and a non-CRMC set (43.16%). Intriguingly, the predicted CRMCs are under strong evolutionary constraints, while the non-CRMCs are largely selectively neutral, strongly suggesting that the CRMCs are likely cis-regulatory, while the non-CRMCs are not. Our predicted CRMs are under stronger evolutionary constraints than three state-of-the-art predictions (GeneHancer, EnhancerAtlas and ENCODE phase 3) and substantially outperform them for recalling VISTA enhancers and non-coding ClinVar variants. We estimated that the human genome might encode about 1.47M CRMs and 68M TFBSs, comprising about 55% and 22% of the genome, respectively; we predicted 80% of each. Therefore, the cis-regulatory genome appears to be more prevalent than originally thought.
Affiliation(s)
- Pengyu Ni
- Department of Bioinformatics and Genomics, the University of North Carolina at Charlotte, 9201 University City Boulevard, Charlotte, NC 28223, USA
- Zhengchang Su
- Department of Bioinformatics and Genomics, the University of North Carolina at Charlotte, 9201 University City Boulevard, Charlotte, NC 28223, USA
24
Bernardi G. The "Genomic Code": DNA Pervasively Moulds Chromatin Structures Leaving no Room for "Junk". Life (Basel) 2021; 11:342. [PMID: 33924668 PMCID: PMC8070607 DOI: 10.3390/life11040342] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 03/01/2021] [Revised: 04/06/2021] [Accepted: 04/07/2021] [Indexed: 02/07/2023] Open
Abstract
The chromatin of the human genome was analyzed at three DNA size levels. At the first, compartment level, two "gene spaces" were found many years ago: A GC-rich, gene-rich "genome core" and a GC-poor, gene-poor "genome desert", the former corresponding to open chromatin centrally located in the interphase nucleus, the latter to closed chromatin located peripherally. This bimodality was later confirmed and extended by the discoveries (1) of LADs, the Lamina-Associated Domains, and InterLADs; (2) of two "spatial compartments", A and B, identified on the basis of chromatin interactions; and (3) of "forests and prairies" characterized by high and low CpG islands densities. Chromatin compartments were shown to be associated with the compositionally different, flat and single- or multi-peak DNA structures of the two, GC-poor and GC-rich, "super-families" of isochores. At the second, sub-compartment, level, chromatin corresponds to flat isochores and to isochore loops (due to compositional DNA gradients) that are susceptible to extrusion. Finally, at the short-sequence level, two sets of sequences, GC-poor and GC-rich, define two different nucleosome spacings, a short one and a long one. In conclusion, chromatin structures are moulded according to a "genomic code" by DNA sequences that pervade the genome and leave no room for "junk".
Affiliation(s)
- Giorgio Bernardi
- Science Department, Roma Tre University, Viale Marconi 446, 00146 Rome, Italy; Tel.: +39-33-540-5892
- Stazione Zoologica Anton Dohrn, Villa Comunale, 80121 Naples, Italy
25
Silberstein M, Nesbit N, Cai J, Lee PH. Pathway analysis for genome-wide genetic variation data: Analytic principles, latest developments, and new opportunities. J Genet Genomics 2021; 48:173-183. [PMID: 33896739 PMCID: PMC8286309 DOI: 10.1016/j.jgg.2021.01.007] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 11/04/2020] [Revised: 01/24/2021] [Accepted: 01/25/2021] [Indexed: 12/23/2022]
Abstract
Pathway analysis, also known as gene-set enrichment analysis, is a multilocus analytic strategy that integrates a priori, biological knowledge into the statistical analysis of high-throughput genetics data. Originally developed for the studies of gene expression data, it has become a powerful analytic procedure for in-depth mining of genome-wide genetic variation data. Astonishing discoveries were made in the past years, uncovering genes and biological mechanisms underlying common and complex disorders. However, as massive amounts of diverse functional genomics data accrue, there is a pressing need for newer generations of pathway analysis methods that can utilize multiple layers of high-throughput genomics data. In this review, we provide an intellectual foundation of this powerful analytic strategy, as well as an update of the state-of-the-art in recent method developments. The goal of this review is threefold: (1) introduce the motivation and basic steps of pathway analysis for genome-wide genetic variation data; (2) review the merits and the shortcomings of classic and newly emerging integrative pathway analysis tools; and (3) discuss remaining challenges and future directions for further method developments.
Affiliation(s)
- Micah Silberstein
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA
- Nicholas Nesbit
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA
- Jacquelyn Cai
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA
- Phil H Lee
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA; Department of Psychiatry, Harvard Medical School, Boston, MA 02115, USA; Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA.
26
Cooper-Knock J, Zhang S, Kenna KP, Moll T, Franklin JP, Allen S, Nezhad HG, Iacoangeli A, Yacovzada NY, Eitan C, Hornstein E, Elhaik E, Celadova P, Bose D, Farhan S, Fishilevich S, Lancet D, Morrison KE, Shaw CE, Al-Chalabi A, Veldink JH, Kirby J, Snyder MP, Shaw PJ. Rare Variant Burden Analysis within Enhancers Identifies CAV1 as an ALS Risk Gene. Cell Rep 2020; 33:108456. [PMID: 33264630 PMCID: PMC7710676 DOI: 10.1016/j.celrep.2020.108456] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Received: 05/13/2020] [Revised: 09/15/2020] [Accepted: 11/09/2020] [Indexed: 02/01/2023] Open
Abstract
Amyotrophic lateral sclerosis (ALS) is an incurable neurodegenerative disease. CAV1 and CAV2 organize membrane lipid rafts (MLRs) important for cell signaling and neuronal survival, and overexpression of CAV1 ameliorates ALS phenotypes in vivo. Genome-wide association studies localize a large proportion of ALS risk variants within the non-coding genome, but further characterization has been limited by lack of appropriate tools. By designing and applying a pipeline to identify pathogenic genetic variation within enhancer elements responsible for regulating gene expression, we identify disease-associated variation within CAV1/CAV2 enhancers, which replicate in an independent cohort. Discovered enhancer mutations reduce CAV1/CAV2 expression and disrupt MLRs in patient-derived cells, and CRISPR-Cas9 perturbation proximate to a patient mutation is sufficient to reduce CAV1/CAV2 expression in neurons. Additional enrichment of ALS-associated mutations within CAV1 exons positions CAV1 as an ALS risk gene. We propose CAV1/CAV2 overexpression as a personalized medicine target for ALS.
Affiliation(s)
- Johnathan Cooper-Knock
- Sheffield Institute for Translational Neuroscience (SITraN), University of Sheffield, Sheffield, UK.
- Sai Zhang
- Stanford Center for Genomics and Personalized Medicine, Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
- Kevin P Kenna
- Department of Neurology, Brain Center Rudolf Magnus, University Medical Center Utrecht, Utrecht, the Netherlands
- Tobias Moll
- Sheffield Institute for Translational Neuroscience (SITraN), University of Sheffield, Sheffield, UK
- John P Franklin
- Sheffield Institute for Translational Neuroscience (SITraN), University of Sheffield, Sheffield, UK
- Samantha Allen
- Sheffield Institute for Translational Neuroscience (SITraN), University of Sheffield, Sheffield, UK
- Helia Ghahremani Nezhad
- Sheffield Institute for Translational Neuroscience (SITraN), University of Sheffield, Sheffield, UK
- Alfredo Iacoangeli
- Department of Basic and Clinical Neuroscience, Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, UK
- Nancy Y Yacovzada
- Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, Israel
- Chen Eitan
- Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, Israel
- Eran Hornstein
- Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, Israel
- Eran Elhaik
- Department of Biology, Lund University, Lund, Sweden
- Petra Celadova
- Sheffield Institute for Nucleic Acids, University of Sheffield, Sheffield, UK
- Daniel Bose
- Sheffield Institute for Nucleic Acids, University of Sheffield, Sheffield, UK
- Sali Farhan
- Analytic and Translational Genetics Unit, Department of Medicine, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Simon Fishilevich
- Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, Israel
- Doron Lancet
- Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, Israel
- Christopher E Shaw
- Department of Basic and Clinical Neuroscience, Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, UK
- Ammar Al-Chalabi
- Department of Basic and Clinical Neuroscience, Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, UK
- Jan H Veldink
- Department of Neurology, Brain Center Rudolf Magnus, University Medical Center Utrecht, Utrecht, the Netherlands
- Janine Kirby
- Sheffield Institute for Translational Neuroscience (SITraN), University of Sheffield, Sheffield, UK
- Michael P Snyder
- Stanford Center for Genomics and Personalized Medicine, Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
- Pamela J Shaw
- Sheffield Institute for Translational Neuroscience (SITraN), University of Sheffield, Sheffield, UK.
27
Guerra-Almeida D, Nunes-da-Fonseca R. Small Open Reading Frames: How Important Are They for Molecular Evolution? Front Genet 2020; 11:574737. [PMID: 33193682 PMCID: PMC7606980 DOI: 10.3389/fgene.2020.574737] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 06/21/2020] [Accepted: 08/25/2020] [Indexed: 11/13/2022] Open
Affiliation(s)
- Diego Guerra-Almeida
- Institute of Biodiversity and Sustainability, Federal University of Rio de Janeiro, Rio de Janeiro, Brazil
- Rodrigo Nunes-da-Fonseca
- Institute of Biodiversity and Sustainability, Federal University of Rio de Janeiro, Rio de Janeiro, Brazil; National Institute of Science and Technology in Molecular Entomology, Rio de Janeiro, Brazil
28
Kinzina ED, Podolskiy DI, Dmitriev SE, Gladyshev VN. Patterns of Aging Biomarkers, Mortality, and Damaging Mutations Illuminate the Beginning of Aging and Causes of Early-Life Mortality. Cell Rep 2020; 29:4276-4284.e3. [PMID: 31875539 DOI: 10.1016/j.celrep.2019.11.091] [Citation(s) in RCA: 35] [Impact Index Per Article: 7.0] [Received: 06/29/2018] [Revised: 11/04/2019] [Accepted: 11/20/2019] [Indexed: 12/12/2022] Open
Abstract
An increase in the probability of death has been a defining feature of aging, yet human perinatal mortality starts high and decreases with age. Previous evolutionary models suggested that organismal aging begins after the onset of reproduction. However, we find that mortality and incidence of diseases associated with aging follow a U-shaped curve with the minimum before puberty, whereas quantitative biomarkers of aging, including somatic mutations and DNA methylation, do not, revealing that aging starts early but is masked by early-life mortality. Moreover, our genetic analyses point to the contribution of damaging mutations to early mortality. We propose that mortality patterns are governed, in part, by negative selection against damaging mutations in early life, manifesting after the corresponding genes are first expressed. Deconvolution of mortality patterns suggests that deleterious changes rather than mortality are the defining characteristic of aging and that aging begins in very early life.
Affiliation(s)
- Elvira D Kinzina
- Belozersky Institute of Physico-Chemical Biology, Moscow State University, Moscow 119992, Russia; Computational and Systems Biology Program, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Dmitriy I Podolskiy
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA 02115, USA
- Sergey E Dmitriev
- Belozersky Institute of Physico-Chemical Biology, Moscow State University, Moscow 119992, Russia
- Vadim N Gladyshev
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA 02115, USA.
29
Risely A. Applying the core microbiome to understand host-microbe systems. J Anim Ecol 2020; 89:1549-1558. [PMID: 32248522 DOI: 10.1111/1365-2656.13229] [Citation(s) in RCA: 177] [Impact Index Per Article: 35.4] [Received: 11/28/2019] [Accepted: 03/13/2020] [Indexed: 12/16/2022]
Abstract
The term host-associated core microbiome was originally coined to refer to common groups of microbes or genes that were likely to be particularly important for host biological function. However, the term has evolved to encompass variable definitions across studies, often identifying key microbes with respect to their spatial distribution, temporal stability or ecological influence, as well as their contribution to host function and fitness. A major barrier to reaching a consensus over how to define the core microbiome and its relevance to biological, ecological and evolutionary theory is a lack of precise terminology and associated definitions, as well as the persistent association of the core microbiome with host function. Common, temporal and ecological core microbiomes can together generate insights into ecological processes that act independently of host function, while functional and host-adapted cores distinguish between facultative and near-obligate symbionts that differ in their effects on host fitness. This commentary summarizes five broad definitions of the core microbiome that have been applied across the literature, highlighting their strengths and limitations for advancing our understanding of host-microbe systems, noting where they are likely to overlap, and discussing their potential relevance to host function and fitness. No one definition of the core microbiome is likely to capture the range of key microbes across a host population. Applied together, they have the potential to reveal different layers of microbial organization from which we can begin to understand the ecological and evolutionary processes that govern host-microbe interactions.
Affiliation(s)
- Alice Risely
- Institute for Evolutionary Ecology and Conservation Genomics, University of Ulm, Ulm, Germany
30
Wallace C. Eliciting priors and relaxing the single causal variant assumption in colocalisation analyses. PLoS Genet 2020; 16:e1008720. [PMID: 32310995 PMCID: PMC7192519 DOI: 10.1371/journal.pgen.1008720] [Citation(s) in RCA: 218] [Impact Index Per Article: 43.6] [Received: 12/19/2019] [Revised: 04/30/2020] [Accepted: 03/17/2020] [Indexed: 01/03/2023] Open
Abstract
Horizontal integration of summary statistics from different GWAS traits can be used to evaluate evidence for their shared genetic causality. One popular method to do this is a Bayesian method, coloc, which is attractive in requiring only GWAS summary statistics and no linkage disequilibrium estimates and is now being used routinely to perform thousands of comparisons between traits. Here we show that while most users do not adjust default software values, misspecification of prior parameters can substantially alter posterior inference. We suggest data driven methods to derive sensible prior values, and demonstrate how sensitivity analysis can be used to assess robustness of posterior inference. The flexibility of coloc comes at the expense of an unrealistic assumption of a single causal variant per trait. This assumption can be relaxed by stepwise conditioning, but this requires external software and an LD matrix aligned to study alleles. We have now implemented conditioning within coloc, and propose a new alternative method, masking, that does not require LD and approximates conditioning when causal variants are independent. Importantly, masking can be used in combination with conditioning where allelically aligned LD estimates are available for only a single trait. We have implemented these developments in a new version of coloc which we hope will enable more informed choice of priors and overcome the restriction of the single causal variant assumptions in coloc analysis.
Affiliation(s)
- Chris Wallace
- Cambridge Institute for Therapeutic Immunology & Infectious Disease, and MRC Biostatistics Unit, University of Cambridge, Cambridge, United Kingdom
31
Bernardi G. The Genomic Code: A Pervasive Encoding/Molding of Chromatin Structures and a Solution of the "Non-Coding DNA" Mystery. Bioessays 2019; 41:e1900106. [PMID: 31701567 DOI: 10.1002/bies.201900106] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 07/01/2019] [Revised: 08/07/2019] [Indexed: 12/15/2022]
Abstract
Recent investigations have revealed 1) that the isochores of the human genome group into two super-families characterized by two different long-range 3D structures, and 2) that these structures, essentially based on the distribution and topology of short sequences, mold primary chromatin domains (and define nucleosome binding). More specifically, GC-poor, gene-poor isochores are low-heterogeneity sequences with oligo-A spikes that mold the lamina-associated domains (LADs), whereas GC-rich, gene-rich isochores are characterized by single or multiple GC peaks that mold the topologically associating domains (TADs). The formation of these "primary TADs" may be followed by extrusion under the action of cohesin and CTCF. Finally, the genomic code, which is responsible for the pervasive encoding and molding of primary chromatin domains (LADs and primary TADs, namely the "gene spaces"/"spatial compartments") resolves the longstanding problems of "non-coding DNA," "junk DNA," and "selfish DNA" leading to a new vision of the genome as shaped by DNA sequences.
Affiliation(s)
- Giorgio Bernardi
- Science Department, Roma Tre University, Viale Marconi 446, 00146, Rome, Italy
- Stazione Zoologica Anton Dohrn, Villa Comunale, 80121, Naples, Italy
32
Zhou B, Yang Y, Zhan J, Dou X, Wang J, Zhou Y. Predicting functional long non-coding RNAs validated by low throughput experiments. RNA Biol 2019; 16:1555-1564. [PMID: 31345106 PMCID: PMC6779387 DOI: 10.1080/15476286.2019.1644590] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 11/08/2018] [Revised: 06/17/2019] [Accepted: 07/10/2019] [Indexed: 01/05/2023] Open
Abstract
High-throughput techniques have uncovered hundreds of thousands of long non-coding RNAs (lncRNAs). Among them, only a tiny fraction have functions experimentally validated by low-throughput methods (EVlncRNAs). What fraction of lncRNAs from high-throughput experiments (HTlncRNAs) is truly functional is an active subject of debate. Here, we developed the first method to distinguish EVlncRNAs from HTlncRNAs and mRNAs by using Support Vector Machines and found that EVlncRNAs can be well separated from HTlncRNAs and mRNAs, with a Matthews correlation coefficient of 0.6, a sensitivity of 64%, and a precision of 81% on the independent human test set. The most useful features for classification are related to sequence conservation at the RNA level (for separating from HTlncRNAs) and the protein level (for separating from mRNAs). The method is found to be robust, as the human-RNA-trained model is applicable to independent mouse RNAs with similar accuracy and, to a lesser extent, to plant RNAs. The method can recover newly discovered EVlncRNAs with high sensitivity. Its application to 2000 randomly selected human HTlncRNAs indicates that the majority of HTlncRNAs are probably non-functional but a large portion (nearly 30%) are likely functional. In other words, there is an ample number of lncRNAs whose specific biological roles are yet to be discovered. The method developed here is expected to speed up discovery and reduce its cost by prioritizing potentially functional lncRNAs prior to experimental validation. EVlncRNA-pred is available as a web server at http://biophy.dzu.edu.cn/lncrnapred/index.html. All datasets used in this study can be obtained from the same website.
Affiliation(s)
- Bailing Zhou
- Shandong Provincial Key Laboratory of Biophysics, Institute of Biophysics, Dezhou University, Dezhou, China
- College of Physics and Electronic Information, Dezhou University, Dezhou, China
- Yuedong Yang
- Shandong Provincial Key Laboratory of Biophysics, Institute of Biophysics, Dezhou University, Dezhou, China
- School of Data and Computer Science, Sun Yat-sen University, Guangzhou, China
- Institute for Glycomics and School of Information and Communication Technology, Griffith University, Gold Coast, QLD, Australia
- Jian Zhan
- Institute for Glycomics and School of Information and Communication Technology, Griffith University, Gold Coast, QLD, Australia
- Xianghua Dou
- Shandong Provincial Key Laboratory of Biophysics, Institute of Biophysics, Dezhou University, Dezhou, China
- College of Physics and Electronic Information, Dezhou University, Dezhou, China
- Jihua Wang
- Shandong Provincial Key Laboratory of Biophysics, Institute of Biophysics, Dezhou University, Dezhou, China
- College of Physics and Electronic Information, Dezhou University, Dezhou, China
- Yaoqi Zhou
- Shandong Provincial Key Laboratory of Biophysics, Institute of Biophysics, Dezhou University, Dezhou, China
- Institute for Glycomics and School of Information and Communication Technology, Griffith University, Gold Coast, QLD, Australia
33
Massey SE, Mishra B. Origin of biomolecular games: deception and molecular evolution. J R Soc Interface 2019; 15:rsif.2018.0429. [PMID: 30185543 DOI: 10.1098/rsif.2018.0429] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Received: 06/11/2018] [Accepted: 08/09/2018] [Indexed: 12/13/2022] Open
Abstract
Biological macromolecules encode information: some of it to endow the molecule with structural flexibility, some of it to enable molecular actions as a catalyst or a substrate, but a residual part can be used to communicate with other macromolecules. Thus, macromolecules do not need to possess information only to survive in an environment, but also to strategically interact with others by sending signals to a receiving macromolecule that can properly interpret the signal and act suitably. These sender-receiver signalling games are sustained by the information asymmetry that exists among the macromolecules. In both biochemistry and molecular evolution, the important role of information asymmetry remains largely unaddressed. Here, we provide a new unifying perspective on the impact of information symmetry between macromolecules on molecular evolutionary processes, while focusing on molecular deception. Biomolecular games arise from the ability of biological macromolecules to exert precise recognition, and their role as units of selection, meaning that they are subject to competition and cooperation with other macromolecules. Thus, signalling game theory can be used to better understand fundamental features of living systems such as molecular recognition, molecular mimicry, selfish elements and 'junk' DNA. We show how deceptive behaviour at the molecular level indicates a conflict of interest, and so provides evidence of genetic conflict. This model proposes that molecular deception is diagnostic of selfish behaviour, helping to explain the evasive behaviour of transposable elements in 'junk' DNA, for example. Additionally, in this broad review, a range of major evolutionary transitions are shown to be associated with the establishment of signalling conventions, many of which are susceptible to molecular deception. These perspectives allow us to assign rudimentary behaviour to macromolecules, and show how participation in signalling games differentiates biochemistry from abiotic chemistry.
Affiliation(s)
- Steven E Massey
- Department of Biology, University of Puerto Rico, San Juan, PR, USA
| | - Bud Mishra
- Courant Institute, New York University, NY, USA
|
34
|
Jabbari K, Chakraborty M, Wiehe T. DNA sequence-dependent chromatin architecture and nuclear hubs formation. Sci Rep 2019; 9:14646. [PMID: 31601866 PMCID: PMC6787200 DOI: 10.1038/s41598-019-51036-9] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Received: 06/20/2019] [Accepted: 09/18/2019] [Indexed: 02/08/2023] Open
Abstract
In this study, by exploring chromatin conformation capture data, we show that DNA sequence composition contributes to the nuclear segregation of Topologically Associated Domains (TADs). GC-peaks and valleys of TADs strongly influence interchromosomal interactions and chromatin 3D structure. To gain insight into the compositional and functional constraints associated with chromatin interactions and TAD formation, we analysed intra-TAD and intra-loop GC variations. This led to the identification of clear GC-gradients, along which the density of genes, super-enhancers, transcriptional activity, and CTCF binding-site occupancy co-vary non-randomly. Further, the analysis of DNA base composition of nucleolar aggregates and nuclear speckles showed strong sequence-dependent effects. We conjecture that dynamic DNA binding affinity and flexibility underlie the emergence of chromatin condensates; their growth is likely promoted in mechanically soft (GC-rich) regions of the lowest chromatin and nucleosome densities. As a practical perspective, the strong linear association between sequence composition and interchromosomal contacts can help define consensus chromatin interactions, which in turn may be used to study alternative states of chromatin architecture.
Affiliation(s)
- Kamel Jabbari
- Institute for Genetics, Biocenter Cologne, University of Cologne, Zülpicher Straße 47a, 50674, Köln, Germany.
| | - Maharshi Chakraborty
- Institute for Genetics, Biocenter Cologne, University of Cologne, Zülpicher Straße 47a, 50674, Köln, Germany
| | - Thomas Wiehe
- Institute for Genetics, Biocenter Cologne, University of Cologne, Zülpicher Straße 47a, 50674, Köln, Germany
|
35
|
Estimating dispensable content in the human interactome. Nat Commun 2019; 10:3205. [PMID: 31324802 PMCID: PMC6642175 DOI: 10.1038/s41467-019-11180-2] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 12/03/2018] [Accepted: 06/21/2019] [Indexed: 11/21/2022] Open
Abstract
Protein-protein interaction (PPI) networks (interactome networks) have successfully advanced our knowledge of molecular function, disease and evolution. While much progress has been made in quantifying errors and biases in experimental PPI datasets, it remains unknown what fraction of the error-free PPIs in the cell are completely dispensable, i.e., effectively neutral upon disruption. Here, we estimate dispensable content in the human interactome by calculating the fractions of PPIs disrupted by neutral and non-neutral mutations. Starting with the human reference interactome determined by experiments, we construct a human structural interactome by building homology-based three-dimensional structural models for PPIs. Next, we map common mutations from healthy individuals as well as Mendelian disease-causing mutations onto the human structural interactome, and perform structure-based calculations of how these mutations perturb the interactome. Using our predicted as well as experimentally-determined interactome perturbation patterns by common and disease mutations, we estimate that <~20% of the human interactome is completely dispensable. The fraction of protein-protein interactions (PPIs) that can be disrupted without fitness effect is unknown. Here, the authors model how disease-causing mutations and common mutations carried by healthy people perturb the interactome, and estimate that <20% of human PPIs are completely dispensable.
|
36
|
Abstract
Understanding the complexity and regular function of the human brain is an unresolved challenge that hampers the identification of disease-contributing components and mechanisms of psychiatric disorders. It is accepted that the majority of psychiatric disorders result from a complex interaction of environmental and heritable factors, and efforts to determine, for example, genetic variants contributing to the pathophysiology of these diseases are becoming increasingly successful. We also continue to discover new molecules with unknown functions that might play a role in brain physiology. One such class of polymeric molecules is noncoding RNAs; though discovered years ago, they have only recently started to receive careful attention. Furthermore, recent technological advances in the field of molecular genetics and high-throughput sequencing have facilitated the discovery of a broad spectrum of RNAs that show no obvious coding potential but may provide additional layers of complexity and regulation to the molecular mechanisms underlying psychiatric disorders. Their exquisite enrichment and expression profiles in the brain may point to important functions of these RNAs in health and disease. This review will therefore aim to provide insight into the expression of noncoding RNAs in the brain, their function, and potential role in psychiatric disorders.
|
37
|
Abstract
Among the multitude of papers published yearly in scientific journals, precious few may be worth looking back on half a century later to appreciate the significance of discoveries that would later become common knowledge and get a chance to shape a field or several adjacent fields. Here, Kimura's fundamental concept of neutral mutation-random drift, which was published 50 years ago, is re-examined in light of its pervasive influence on comparative genomics and, more specifically, on the contribution of transposable elements to eukaryotic genome evolution.
Affiliation(s)
- Irina R Arkhipova
- Josephine Bay Paul Center for Comparative Molecular Biology and Evolution, Marine Biological Laboratory, Woods Hole, MA
|
38
|
Hoeppner MP, Denisenko E, Gardner PP, Schmeier S, Poole AM. An Evaluation of Function of Multicopy Noncoding RNAs in Mammals Using ENCODE/FANTOM Data and Comparative Genomics. Mol Biol Evol 2019; 35:1451-1462. [PMID: 29617896 PMCID: PMC5967550 DOI: 10.1093/molbev/msy046] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Indexed: 01/02/2023] Open
Abstract
Mammalian diversification has coincided with a rapid proliferation of various types of noncoding RNAs, including members of both snRNAs and snoRNAs. The significance of this expansion, however, remains obscure. While some ncRNA copy-number expansions have been linked to functionally tractable effects, such events may equally likely be neutral, perhaps as a result of random retrotransposition. Hindering progress in our understanding of such observations is the difficulty in establishing function for the diverse features that have been identified in our own genome. Projects such as ENCODE and FANTOM have revealed a hidden world of genomic expression patterns, as well as a host of other potential indicators of biological function. However, such projects have been criticized, particularly by practitioners in the field of molecular evolution, where many suspect these data provide limited insight into biological function. The molecular evolution community has largely taken a skeptical view; thus, it is important to establish tests of function. We use a range of data, including data drawn from ENCODE and FANTOM, to examine the case for function for the recent copy number expansion in mammals of six evolutionarily ancient RNA families involved in splicing and rRNA maturation. We use several criteria to assess evidence for function: conservation of sequence and structure, genomic synteny, evidence for transposition, and evidence for species-specific expression. Applying these criteria, we find that only a minority of loci show strong evidence for function and that, for the majority, we cannot reject the null hypothesis of no function.
Affiliation(s)
- Marc P Hoeppner
- Institute of Clinical Molecular Biology, Christian-Albrechts-University of Kiel, Kiel, Germany
| | - Elena Denisenko
- Institute of Natural and Mathematical Sciences, Massey University, Auckland, New Zealand
| | - Paul P Gardner
- Biomolecular Interaction Centre, School of Biological Sciences, University of Canterbury, Christchurch, New Zealand
| | - Sebastian Schmeier
- Institute of Natural and Mathematical Sciences, Massey University, Auckland, New Zealand
| | - Anthony M Poole
- Bioinformatics Institute, School of Biological Sciences, University of Auckland, Auckland, New Zealand
|
39
|
Lloyd JP, Tsai ZTY, Sowers RP, Panchy NL, Shiu SH. A Model-Based Approach for Identifying Functional Intergenic Transcribed Regions and Noncoding RNAs. Mol Biol Evol 2019; 35:1422-1436. [PMID: 29554332 DOI: 10.1093/molbev/msy035] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Indexed: 12/16/2022] Open
Abstract
With advances in transcript profiling, the presence of transcriptional activities in intergenic regions has been well established. However, whether intergenic expression reflects transcriptional noise or activity of novel genes remains unclear. We identified intergenic transcribed regions (ITRs) in 15 diverse flowering plant species and found that the amount of intergenic expression correlates with genome size, a pattern that could be expected if intergenic expression is largely nonfunctional. To further assess the functionality of ITRs, we first built machine learning models using Arabidopsis thaliana as a model that accurately distinguish functional sequences (benchmark protein-coding and RNA genes) and likely nonfunctional ones (pseudogenes and unexpressed intergenic regions) by integrating 93 biochemical, evolutionary, and sequence-structure features. Next, by applying the models genome-wide, we found that 4,427 ITRs (38%) and 796 annotated ncRNAs (44%) had features significantly similar to benchmark protein-coding or RNA genes and thus were likely parts of functional genes. Approximately 60% of ITRs and ncRNAs were more similar to nonfunctional sequences and were likely transcriptional noise. The predictive framework established here provides not only a comprehensive look at how functional, genic sequences are distinct from likely nonfunctional ones, but also a new way to differentiate novel genes from genomic regions with noisy transcriptional activities.
Affiliation(s)
- John P Lloyd
- Department of Plant Biology, Michigan State University, East Lansing, MI
| | - Zing Tsung-Yeh Tsai
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI
| | - Rosalie P Sowers
- Department of Biochemistry and Molecular Biology, Pennsylvania State University, University Park, PA
| | | | - Shin-Han Shiu
- Department of Plant Biology, Michigan State University, East Lansing, MI; Genetics Program, Michigan State University, East Lansing, MI; Ecology, Evolutionary Biology, and Behavior Program, Michigan State University, East Lansing, MI
|
40
|
Bohlin J, Pettersson JHO. Evolution of Genomic Base Composition: From Single Cell Microbes to Multicellular Animals. Comput Struct Biotechnol J 2019; 17:362-370. [PMID: 30949307 PMCID: PMC6429543 DOI: 10.1016/j.csbj.2019.03.001] [Citation(s) in RCA: 20] [Impact Index Per Article: 3.3] [Received: 09/28/2018] [Revised: 02/28/2019] [Accepted: 03/01/2019] [Indexed: 01/07/2023] Open
Abstract
Whole genome sequencing (WGS) of thousands of microbial genomes has provided considerable insight into evolutionary mechanisms in the microbial world. While substantially fewer eukaryotic genomes are available for analysis, the number is rapidly increasing. This mini-review broadly summarizes the evolutionary dynamics of base composition in the different domains of life from the perspective of prokaryotes. Common and different evolutionary mechanisms influencing genomic base composition in eukaryotes and prokaryotes are discussed. The data currently available suggest that, while there are similarities, there are also striking differences in how genomic base composition has evolved within prokaryotes and eukaryotes. For instance, homologous recombination appears to increase GC content locally in eukaryotes due to a non-selective process termed GC-biased gene conversion (gBGC). For prokaryotes, on the other hand, increases in genomic GC content seem to be driven by the environment and selection. We find that similar phenomena observed for some organisms in each respective domain may be caused by very different mechanisms: while gBGC and recombination rates appear to explain the negative correlation between GC3 (GC content based on the third codon nucleotides) and genome size in some eukaryotes, uptake of AT-rich DNA sequences is the main reason for a similar negative correlation observed in prokaryotes. We provide further examples indicating that base composition in prokaryotes and eukaryotes has evolved under very different constraints.
Affiliation(s)
- Jon Bohlin
- Norwegian Institute of Public Health, Division of Infection Control and Environmental Health, Department of Infectious Disease Epidemiology and Modelling, Lovisenberggata 8, 0456 Oslo, Norway; Centre for Fertility and Health, Norwegian Institute of Public Health, PO-Box 222 Skøyen, N-0213 Oslo, Norway; Norwegian University of Life Sciences, Faculty of Veterinary Sciences, Production Animal Clinical Sciences, Ullevålsveien 72, 0454 Oslo, Norway
| | - John H-O Pettersson
- Marie Bashir Institute for Infectious Diseases and Biosecurity, Charles Perkins Centre, School of Life and Environmental Sciences and Sydney Medical School, the University of Sydney, New South Wales 2006, Australia; Zoonosis Science Center, Department of Medical Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden; Public Health Agency of Sweden, Nobels väg 18, SE-171 82 Solna, Sweden
|
41
|
Salas A. The natural selection that shapes our genomes. Forensic Sci Int Genet 2018; 39:57-60. [PMID: 30578983 DOI: 10.1016/j.fsigen.2018.12.003] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 11/30/2018] [Accepted: 12/13/2018] [Indexed: 12/14/2022]
Abstract
Most of the variation in the human genome (∼95%) is constrained, directly or indirectly, by purifying selection and GC-biased gene conversion, according to a recent article by Pouyet et al. (2018). The use of 'non-neutral' variation to infer human demographies can lead to undesirable biases; for example, in estimation of the time of the most recent common ancestor. Further examination of 'neutral' variation in entire human genomes from The 1000 Genomes Project reveals that ∼99% of this variation lacks exonic function, but ∼35% of it falls in introns. In addition, estimates of biogeographical ancestry using 'non-neutral' SNPs differ very marginally from inferences obtained from 'neutral' variation. Additional investigations should be carried out before establishing the roadmap for future human population and forensic genetic studies.
Affiliation(s)
- Antonio Salas
- Unidade de Xenética, Instituto de Ciencias Forenses (INCIFOR), Facultade de Medicina, Universidade de Santiago de Compostela, and GenPoB Research Group, Instituto de Investigaciones Sanitarias (IDIS), Hospital Clínico Universitario de Santiago (SERGAS), Galicia, Spain.
|
42
|
Slijepcevic P. Genome dynamics over evolutionary time: “C-value enigma” in light of chromosome structure. Mutation Research-Genetic Toxicology and Environmental Mutagenesis 2018; 836:22-27. [DOI: 10.1016/j.mrgentox.2018.05.005] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 02/15/2018] [Revised: 03/28/2018] [Accepted: 05/03/2018] [Indexed: 12/15/2022]
|
43
|
Abstract
Just 5% of the human genome is subject to neutral evolution, but this process remains central to understanding the history of human migration across the Earth.
Affiliation(s)
- Kelley Harris
- Department of Genome Sciences, University of Washington, Seattle, United States
|
44
|
Pouyet F, Aeschbacher S, Thiéry A, Excoffier L. Background selection and biased gene conversion affect more than 95% of the human genome and bias demographic inferences. eLife 2018; 7:e36317. [PMID: 30125248 PMCID: PMC6177262 DOI: 10.7554/elife.36317] [Citation(s) in RCA: 84] [Impact Index Per Article: 12.0] [Received: 03/01/2018] [Accepted: 08/17/2018] [Indexed: 12/15/2022] Open
Abstract
Disentangling the effect on genomic diversity of natural selection from that of demography is notoriously difficult, but necessary to properly reconstruct the history of species. Here, we use high-quality human genomic data to show that purifying selection at linked sites (i.e. background selection, BGS) and GC-biased gene conversion (gBGC) together affect as much as 95% of the variants of our genome. We find that the magnitude and relative importance of BGS and gBGC are largely determined by variation in recombination rate and base composition. Importantly, synonymous sites and non-transcribed regions are also affected, albeit to different degrees. Their use for demographic inference can lead to strong biases. However, by conditioning on genomic regions with recombination rates above 1.5 cM/Mb and mutation types (C↔G, A↔T), we identify a set of SNPs that is mostly unaffected by BGS or gBGC, and that avoids these biases in the reconstruction of human history.
Affiliation(s)
- Fanny Pouyet
- Computational and Molecular Population Genetics, Institute of Ecology and Evolution, University of Bern, Bern, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
| | - Simon Aeschbacher
- Computational and Molecular Population Genetics, Institute of Ecology and Evolution, University of Bern, Bern, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Department of Evolutionary Biology and Environmental Studies, University of Zurich, Zurich, Switzerland
| | - Alexandre Thiéry
- Computational and Molecular Population Genetics, Institute of Ecology and Evolution, University of Bern, Bern, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
| | - Laurent Excoffier
- Computational and Molecular Population Genetics, Institute of Ecology and Evolution, University of Bern, Bern, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
|
45
|
Klassen JL. Defining microbiome function. Nat Microbiol 2018; 3:864-869. [PMID: 30046174 DOI: 10.1038/s41564-018-0189-4] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.1] [Received: 02/16/2018] [Accepted: 06/05/2018] [Indexed: 02/07/2023]
Abstract
Why does a microorganism associate with a host? What function does it perform? Such questions are difficult to unequivocally address and remain hotly debated. This is partially because scientists often use different philosophical definitions of 'function' ambiguously and interchangeably, as exemplified by the controversy surrounding the Encyclopedia of DNA Elements (ENCODE) project. Here, I argue that research studying host-associated microbial communities and their genomes (that is, microbiomes) faces similar pitfalls and that unclear or misapplied conceptions of function underpin many controversies in this field. In particular, experiments that support phenomenological models of function can inappropriately be used to support functional models that instead require specific measurements of evolutionary selection. Microbiome research also requires uniquely clear definitions of 'who the function is for', in contrast to most single-organism systems where this is implicit. I illustrate how obscuring either of these issues can lead to substantial confusion and misinterpretation of microbiome function, using the varied conceptions of the holobiont as a current and cogent example. Using clear functional definitions and appropriate types of evidence is essential to effectively communicate microbiome research and foster host health.
Affiliation(s)
- Jonathan L Klassen
- Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT, USA.
|
46
|
Sverdlov ED. Unsolvable Problems of Biology: It Is Impossible to Create Two Identical Organisms, to Defeat Cancer, or to Map Organisms onto Their Genomes. Biochemistry (Moscow) 2018; 83:370-380. [PMID: 29626924 DOI: 10.1134/s0006297918040089] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Indexed: 12/29/2022]
Abstract
The review is devoted to unsolvable problems of biology. 1) Problems unsolvable due to stochastic mutations occurring during DNA replication that make it impossible to create two identical organisms or even two identical complex cells (Sverdlov, E. D. (2009) Biochemistry (Moscow), 74, 939-944) and to "defeat" cancer. 2) Problems unsolvable due to multiple interactions in complex systems leading to the appearance of unpredictable emergent properties that prevent establishment of unambiguous relationships between the genetic architecture and phenotypic manifestation of the genome and make it impossible to predict with certainty the responses of the organism, its parts, or pathological processes to external factors. 3) Problems unsolvable because of the uncertainty principle and observer effect in biology, due to which it is impossible to obtain adequate information about cells in their tissue microenvironment by isolating and analyzing individual cells. In particular, we cannot draw conclusions on the properties of stem cells in their niches based on the properties of stem cell cultures. A strategy is proposed for constructing the pattern that most closely approximates the relationship of genotypes with their phenotypes by designing networks of intermediate phenotypes (endophenotypes).
Affiliation(s)
- E D Sverdlov
- Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, Russian Academy of Sciences, Moscow, 117997, Russia.
|
47
|
Same-Sex Twin Pair Phenotypic Correlations are Consistent with Human Y Chromosome Promoting Phenotypic Heterogeneity. Evol Biol 2018. [DOI: 10.1007/s11692-018-9454-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/16/2022]
|
48
|
Buckley RM, Kortschak RD, Adelson DL. Divergent genome evolution caused by regional variation in DNA gain and loss between human and mouse. PLoS Comput Biol 2018; 14:e1006091. [PMID: 29677183 PMCID: PMC5931693 DOI: 10.1371/journal.pcbi.1006091] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 09/06/2017] [Revised: 05/02/2018] [Accepted: 03/15/2018] [Indexed: 12/31/2022] Open
Abstract
The forces driving the accumulation and removal of non-coding DNA and ultimately the evolution of genome size in complex organisms are intimately linked to genome structure and organisation. Our analysis provides a novel method for capturing the regional variation of lineage-specific DNA gain and loss events in their respective genomic contexts. To further understand this connection we used comparative genomics to identify genome-wide individual DNA gain and loss events in the human and mouse genomes. Focusing on the distribution of DNA gains and losses, relationships to important structural features and potential impact on biological processes, we found that in autosomes, DNA gains and losses both followed separate lineage-specific accumulation patterns. However, in both species chromosome X was particularly enriched for DNA gain, consistent with its high L1 retrotransposon content required for X inactivation. We found that DNA loss was associated with gene-rich open chromatin regions and DNA gain events with gene-poor closed chromatin regions. Additionally, we found that DNA loss events tended to be smaller than DNA gain events suggesting that they were able to accumulate in gene-rich open chromatin regions due to their reduced capacity to interrupt gene regulatory architecture. GO term enrichment showed that mouse loss hotspots were strongly enriched for terms related to developmental processes. However, these genes were also located in regions with a high density of conserved elements, suggesting that despite high levels of DNA loss, gene regulatory architecture remained conserved. This is consistent with a model in which DNA gain and loss results in turnover or "churning" in regulatory element dense regions of open chromatin, where interruption of regulatory elements is selected against.
Affiliation(s)
- Reuben M. Buckley
- Department of Genetics and Evolution, The University of Adelaide, North Tce, Adelaide, Australia
| | - R. Daniel Kortschak
- Department of Genetics and Evolution, The University of Adelaide, North Tce, Adelaide, Australia
| | - David L. Adelson
- Department of Genetics and Evolution, The University of Adelaide, North Tce, Adelaide, Australia
|
49
|
Lee PH, Lee C, Li X, Wee B, Dwivedi T, Daly M. Principles and methods of in-silico prioritization of non-coding regulatory variants. Hum Genet 2018; 137:15-30. [PMID: 29288389 PMCID: PMC5892192 DOI: 10.1007/s00439-017-1861-0] [Citation(s) in RCA: 29] [Impact Index Per Article: 4.1] [Received: 10/24/2017] [Accepted: 12/14/2017] [Indexed: 12/13/2022]
Abstract
More than a decade of genome-wide association studies has made great strides toward the detection of genes and genetic mechanisms underlying complex traits. However, the majority of associated loci reside in non-coding regions that are, in general, functionally uncharacterized. Now, the availability of large-scale tissue- and cell type-specific transcriptome and epigenome data enables us to elucidate how non-coding genetic variants can affect gene expression and are associated with phenotypic changes. Here, we provide an overview of this emerging field in human genomics, summarizing available data resources and state-of-the-art analytic methods to facilitate in-silico prioritization of non-coding regulatory mutations. We also highlight the limitations of current approaches and discuss the direction of much-needed future research.
Affiliation(s)
- Phil H Lee
- Center for Genomic Medicine, Massachusetts General Hospital and Harvard Medical School, Simches Research Building, 185 Cambridge St, Boston, MA, 02114, USA.
- Quantitative Genomics Program, Harvard T.H. Chan School of Public Health, Boston, MA, USA.
| | - Christian Lee
- Center for Genomic Medicine, Massachusetts General Hospital and Harvard Medical School, Simches Research Building, 185 Cambridge St, Boston, MA, 02114, USA
- Department of Life Sciences, Harvard University, Cambridge, MA, USA
| | - Xihao Li
- Quantitative Genomics Program, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
| | - Brian Wee
- Center for Genomic Medicine, Massachusetts General Hospital and Harvard Medical School, Simches Research Building, 185 Cambridge St, Boston, MA, 02114, USA
| | - Tushar Dwivedi
- Center for Genomic Medicine, Massachusetts General Hospital and Harvard Medical School, Simches Research Building, 185 Cambridge St, Boston, MA, 02114, USA
- John A. Paulson School of Engineering and Applied Sciences, Harvard University, Cambridge, MA, USA
| | - Mark Daly
- Center for Genomic Medicine, Massachusetts General Hospital and Harvard Medical School, Simches Research Building, 185 Cambridge St, Boston, MA, 02114, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA
|
50
|
Abstract
The idea that much of our genome is irrelevant to fitness (that is, it is not the product of positive natural selection at the organismal level) remains viable. Claims to the contrary, and specifically that the notion of "junk DNA" should be abandoned, are based on conflating meanings of the word "function". Recent estimates suggest that perhaps 90% of our DNA, though biochemically active, does not contribute to fitness in any sequence-dependent way, and possibly in no way at all. Comparisons to vertebrates with much larger and smaller genomes (the lungfish and the pufferfish) strongly align with such a conclusion, as they have done for the last half-century.
Affiliation(s)
- W Ford Doolittle
- Department of Biochemistry and Molecular Biology, Dalhousie University, Halifax, Nova Scotia, Canada.
| | - Tyler D P Brunet
- Department of Biochemistry and Molecular Biology, Dalhousie University, Halifax, Nova Scotia, Canada
- Department of History and Philosophy of Science, University of Cambridge, Cambridge, UK
|