1
Soni V, Versoza CJ, Pfeifer SP, Jensen JD. Estimating the distribution of fitness effects in aye-ayes (Daubentonia madagascariensis), accounting for population history as well as mutation and recombination rate heterogeneity. bioRxiv: The Preprint Server for Biology 2025:2025.01.02.631144. [PMID: 39803457 PMCID: PMC11722344 DOI: 10.1101/2025.01.02.631144] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 03/20/2025]
Abstract
The distribution of fitness effects (DFE) characterizes the range of selection coefficients from which new mutations are sampled, and thus holds a fundamentally important role in evolutionary genomics. To date, DFE inference in primates has been largely restricted to haplorrhines, with limited data availability leaving the other suborder of primates, strepsirrhines, largely under-explored. To advance our understanding of the population genetics of this important taxonomic group, we here map exonic divergence in aye-ayes (Daubentonia madagascariensis), the only extant member of the Daubentoniidae family of the Strepsirrhini suborder. We further infer the DFE in this highly endangered species, utilizing a recently published high-quality annotated reference genome, a well-supported model of demographic history, as well as both direct and indirect estimates of underlying mutation and recombination rates. The inferred distribution is generally characterized by a greater proportion of deleterious mutations relative to humans, providing evidence of a larger long-term effective population size. In addition, however, both immune-related and sensory-related genes were found to be amongst the most rapidly evolving in the aye-aye genome.
2
Costa CE, Watowich MM, Goldman EA, Sterner KN, Negron-Del Valle JE, Phillips D, Platt ML, Montague MJ, Brent LJN, Higham JP, Snyder-Mackler N, Lea AJ. Genetic Architecture of Immune Cell DNA Methylation in the Rhesus Macaque. Mol Ecol 2024:e17576. [PMID: 39582237 DOI: 10.1111/mec.17576] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/04/2023] [Revised: 06/23/2024] [Accepted: 10/18/2024] [Indexed: 11/26/2024]
Abstract
Genetic variation that impacts gene regulation, rather than protein function, can have strong effects on trait variation both within and between species. Epigenetic mechanisms, such as DNA methylation, are often an important intermediate link between genotype and phenotype, yet genetic effects on DNA methylation remain understudied in natural populations. To address this gap, we used reduced representation bisulfite sequencing to measure DNA methylation levels at 555,856 CpGs in peripheral whole blood of 573 samples collected from free-ranging rhesus macaques (Macaca mulatta) living on the island of Cayo Santiago, Puerto Rico. We used allele-specific methods to map cis-methylation quantitative trait loci (meQTL) and tested for effects of 243,389 single nucleotide polymorphisms (SNPs) on local DNA methylation levels. Of 776,092 tested SNP-CpG pairs, we identified 516,213 meQTL, with 69.12% of CpGs having at least one meQTL (FDR < 5%). On average, meQTL explained 21.2% of nearby methylation variance, significantly more than age or sex. meQTL were enriched in genomic compartments where methylation is likely to impact gene expression, for example, promoters, enhancers and binding sites for methylation-sensitive transcription factors. In support, using mRNA-seq data from 172 samples, we confirmed 332 meQTL as whole blood cis-expression QTL (eQTL) in the population, and found meQTL-eQTL genes were enriched for immune response functions, like antigen presentation and inflammation. Overall, our study takes an important step towards understanding the genetic architecture of DNA methylation in natural populations, and more generally points to the biological mechanisms driving phenotypic variation in our close relatives.
Affiliation(s)
- Christina E Costa
  - Department of Anthropology, New York University, New York, New York, USA
  - New York Consortium in Evolutionary Primatology, New York, New York, USA
- Marina M Watowich
  - Department of Biological Sciences, Vanderbilt University, Nashville, Tennessee, USA
- Kirstin N Sterner
  - Department of Anthropology, University of Oregon, Eugene, Oregon, USA
- Josue E Negron-Del Valle
  - School of Life Sciences, Arizona State University, Tempe, Arizona, USA
  - Center for Evolution and Medicine, Arizona State University, Tempe, Arizona, USA
- Daniel Phillips
  - School of Life Sciences, Arizona State University, Tempe, Arizona, USA
  - Center for Evolution and Medicine, Arizona State University, Tempe, Arizona, USA
- Michael L Platt
  - Department of Neuroscience, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Michael J Montague
  - Department of Neuroscience, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- James P Higham
  - Department of Anthropology, New York University, New York, New York, USA
  - New York Consortium in Evolutionary Primatology, New York, New York, USA
- Noah Snyder-Mackler
  - School of Life Sciences, Arizona State University, Tempe, Arizona, USA
  - Center for Evolution and Medicine, Arizona State University, Tempe, Arizona, USA
  - School of Human Evolution and Social Change, Arizona State University, Tempe, Arizona, USA
  - Neurodegenerative Disease Research Center, Arizona State University, Tempe, Arizona, USA
- Amanda J Lea
  - Department of Biological Sciences, Vanderbilt University, Nashville, Tennessee, USA
3
Longtin A, Watowich MM, Sadoughi B, Petersen RM, Brosnan SF, Buetow K, Cai Q, Gurven MD, Highland HM, Huang YT, Kaplan H, Kraft TS, Lim YAL, Long J, Melin AD, Roberson J, Ng KS, Stieglitz J, Trumble BC, Venkataraman VV, Wallace IJ, Wu J, Snyder-Mackler N, Jones A, Bick AG, Lea AJ. Cost-effective solutions for high-throughput enzymatic DNA methylation sequencing. bioRxiv: The Preprint Server for Biology 2024:2024.09.09.612068. [PMID: 39314398 PMCID: PMC11419010 DOI: 10.1101/2024.09.09.612068] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/25/2024]
Abstract
Characterizing DNA methylation patterns is important for addressing key questions in evolutionary biology, geroscience, and medical genomics. While costs are decreasing, whole-genome DNA methylation profiling remains prohibitively expensive for most population-scale studies, creating a need for cost-effective, reduced representation approaches (i.e., assays that rely on microarrays, enzyme digests, or sequence capture to target a subset of the genome). Most common whole-genome and reduced representation techniques rely on bisulfite conversion, which can damage DNA, resulting in DNA loss and sequencing biases. Enzymatic methyl sequencing (EM-seq) was recently proposed to overcome these issues, but thorough benchmarking of EM-seq combined with cost-effective, reduced representation strategies has not yet been performed. To do so, we optimized the Targeted Methylation Sequencing (TMS) protocol, which profiles ∼4 million CpG sites, for miniaturization, flexibility, and multispecies use at a cost of ∼$80. First, we tested modifications to increase throughput and reduce cost, including increasing multiplexing, decreasing DNA input, and using enzymatic rather than mechanical fragmentation to prepare DNA. Second, we compared our optimized TMS protocol to commonly used techniques, specifically the Infinium MethylationEPIC BeadChip (n=55 paired samples) and whole-genome bisulfite sequencing (n=6 paired samples). In both cases, we found strong agreement between technologies (R² = 0.97 and 0.99, respectively). Third, we tested the optimized TMS protocol in three non-human primate species (rhesus macaques, geladas, and capuchins). We captured a high percentage (mean=77.1%) of targeted CpG sites and produced methylation level estimates that agreed with those generated from reduced representation bisulfite sequencing (R² = 0.98). Finally, we applied our protocol to profile age-associated DNA methylation variation in two subsistence-level populations (the Tsimane of lowland Bolivia and the Orang Asli of Peninsular Malaysia) and found age-methylation patterns that were strikingly similar to those reported in high-income cohorts, despite known differences in age-health relationships between lifestyle contexts. Altogether, our optimized TMS protocol will enable cost-effective, population-scale studies of genome-wide DNA methylation levels across human and non-human primate species.
4
Barr KA, Rhodes KL, Gilad Y. The relationship between regulatory changes in cis and trans and the evolution of gene expression in humans and chimpanzees. Genome Biol 2023; 24:207. [PMID: 37697401 PMCID: PMC10496171 DOI: 10.1186/s13059-023-03019-3] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 11/22/2022] [Accepted: 07/21/2023] [Indexed: 09/13/2023] Open
Abstract
BACKGROUND: Comparative gene expression studies in apes are fundamentally limited by the challenges associated with sampling across different tissues. Here, we used single-cell RNA sequencing of embryoid bodies to collect transcriptomic data from over 70 cell types in three humans and three chimpanzees. RESULTS: We find hundreds of genes whose regulation is conserved across cell types, as well as genes whose regulation likely evolves under directional selection in one or a handful of cell types. Using embryoid bodies from a human-chimpanzee fused cell line, we also infer the proportion of inter-species regulatory differences due to changes in cis and trans elements between the species. Using the cis/trans inference and an analysis of transcription factor binding sites, we identify dozens of transcription factors whose inter-species differences in expression are affecting expression differences between humans and chimpanzees in hundreds of target genes. CONCLUSIONS: Here, we present the most comprehensive dataset of comparative gene expression from humans and chimpanzees to date, including a catalog of regulatory mechanisms associated with inter-species differences.
Affiliation(s)
- Kenneth A Barr
  - Department of Medicine, University of Chicago, Chicago, IL, 60637, USA
- Yoav Gilad
  - Department of Medicine, University of Chicago, Chicago, IL, 60637, USA
  - Department of Human Genetics, University of Chicago, Chicago, IL, 60637, USA
5
Gallego Romero I, Lea AJ. Leveraging massively parallel reporter assays for evolutionary questions. Genome Biol 2023; 24:26. [PMID: 36788564 PMCID: PMC9926830 DOI: 10.1186/s13059-023-02856-6] [Citation(s) in RCA: 23] [Impact Index Per Article: 11.5] [Received: 05/08/2022] [Accepted: 01/17/2023] [Indexed: 02/16/2023] Open
Abstract
A long-standing goal of evolutionary biology is to decode how gene regulation contributes to organismal diversity. Doing so is challenging because it is hard to predict function from non-coding sequence and to perform molecular research with non-model taxa. Massively parallel reporter assays (MPRAs) enable the testing of thousands to millions of sequences for regulatory activity simultaneously. Here, we discuss the execution, advantages, and limitations of MPRAs, with a focus on evolutionary questions. We propose solutions for extending MPRAs to rare taxa and those with limited genomic resources, and we underscore MPRA's broad potential for driving genome-scale, functional studies across organisms.
Affiliation(s)
- Irene Gallego Romero
  - Melbourne Integrative Genomics, University of Melbourne, Royal Parade, Parkville, Victoria, 3010, Australia
  - School of BioSciences, The University of Melbourne, Royal Parade, Parkville, 3010, Australia
  - The Centre for Stem Cell Systems, Faculty of Medicine, Dentistry and Health Sciences, The University of Melbourne, 30 Royal Parade, Parkville, Victoria, 3010, Australia
  - Center for Genomics, Evolution and Medicine, Institute of Genomics, University of Tartu, Riia 23b, 51010, Tartu, Estonia
- Amanda J. Lea
  - Department of Biological Sciences, Vanderbilt University, Nashville, TN 37240 USA
  - Vanderbilt Genetics Institute, Vanderbilt University, Nashville, TN 37240 USA
  - Evolutionary Studies Initiative, Vanderbilt University, Nashville, TN 37240 USA
  - Child and Brain Development Program, Canadian Institute for Advanced Study, Toronto, Canada
6
Current advances in primate genomics: novel approaches for understanding evolution and disease. Nat Rev Genet 2023; 24:314-331. [PMID: 36599936 DOI: 10.1038/s41576-022-00554-w] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Accepted: 11/07/2022] [Indexed: 01/05/2023]
Abstract
Primate genomics holds the key to understanding fundamental aspects of human evolution and disease. However, genetic diversity and functional genomics data sets are currently available for only a few of the more than 500 extant primate species. Concerted efforts are under way to characterize primate genomes, genetic polymorphism and divergence, and functional landscapes across the primate phylogeny. The resulting data sets will enable the connection of genotypes to phenotypes and provide new insight into aspects of the genetics of primate traits, including human diseases. In this Review, we describe the existing genome assemblies as well as genetic variation and functional genomic data sets. We highlight some of the challenges with sample acquisition. Finally, we explore how technological advances in single-cell functional genomics and induced pluripotent stem cell-derived organoids will facilitate our understanding of the molecular foundations of primate biology.
7
Dubois‐Mignon T, Monget P. Gene essentiality and variability: What is the link? A within‐ and between‐species perspective. Bioessays 2022; 44:e2200132. [DOI: 10.1002/bies.202200132] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/04/2022] [Revised: 08/17/2022] [Accepted: 08/30/2022] [Indexed: 11/07/2022]
Affiliation(s)
- Tania Dubois‐Mignon
  - Institut de Biologie de l’École Normale Supérieure, Université PSL, 46 rue d'Ulm, Paris 75005, France
- Philippe Monget
  - Physiologie de la Reproduction et des Comportements, Centre Val de Loire – UMR INRAE, CNRS, IFCE, Université de Tours, Nouzilly, France
8
Pozzi L, Penna A. Rocks and clocks revised: New promises and challenges in dating the primate tree of life. Evol Anthropol 2022; 31:138-153. [PMID: 35102633 DOI: 10.1002/evan.21940] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/02/2020] [Revised: 10/04/2021] [Accepted: 01/12/2022] [Indexed: 01/14/2023]
Abstract
In recent years, multiple technological and methodological advances have increased our ability to estimate phylogenies, leading to more accurate dating of the primate tree of life. Here we provide an overview of the limitations and potentials of some of these advancements and discuss how dated phylogenies provide the crucial temporal scale required to understand primate evolution. First, we review new methods, such as the total-evidence dating approach, that promise a better integration between the fossil record and molecular data. We then explore how the ever-increasing availability of genomic-level data for more primate species can impact our ability to accurately estimate timetrees. Finally, we discuss more recent applications of mutation rates to date divergence times. We highlight example studies that have applied these approaches to estimate divergence dates within primates. Our goal is to provide a critical overview of these new developments and explore the promises and challenges of their application in evolutionary anthropology.
Affiliation(s)
- Luca Pozzi
  - Department of Anthropology, The University of Texas at San Antonio, San Antonio, Texas, USA
- Anna Penna
  - Department of Anthropology, The University of Texas at San Antonio, San Antonio, Texas, USA
9
Jovanovic VM, Sarfert M, Reyna-Blanco CS, Indrischek H, Valdivia DI, Shelest E, Nowick K. Positive Selection in Gene Regulatory Factors Suggests Adaptive Pleiotropic Changes During Human Evolution. Front Genet 2021; 12:662239. [PMID: 34079582 PMCID: PMC8166252 DOI: 10.3389/fgene.2021.662239] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 01/31/2021] [Accepted: 04/19/2021] [Indexed: 01/09/2023] Open
Abstract
Gene regulatory factors (GRFs), such as transcription factors, co-factors and histone-modifying enzymes, play many important roles in modifying gene expression in biological processes. They have also been proposed to underlie speciation and adaptation. To investigate potential contributions of GRFs to primate evolution, we analyzed GRF genes in 27 publicly available primate genomes. Genes coding for zinc finger (ZNF) proteins, especially ZNFs with a Krüppel-associated box (KRAB) domain were the most abundant TFs in all genomes. Gene numbers per TF family differed between all species. To detect signs of positive selection in GRF genes we investigated more than 3,000 human GRFs with their more than 70,000 orthologs in 26 non-human primates. We implemented two independent tests for positive selection, the branch-site-model of the PAML suite and aBSREL of the HyPhy suite, focusing on the human and great ape branch. Our workflow included rigorous procedures to reduce the number of false positives: excluding distantly similar orthologs, manual corrections of alignments, and considering only genes and sites detected by both tests for positive selection. Furthermore, we verified the candidate sites for selection by investigating their variation within human and non-human great ape population data. In order to approximately assign a date to positively selected sites in the human lineage, we analyzed archaic human genomes. Our work revealed with high confidence five GRFs that have been positively selected on the human lineage and one GRF that has been positively selected on the great ape lineage. These GRFs are scattered on different chromosomes and have been previously linked to diverse functions. For some of them a role in speciation and/or adaptation can be proposed based on the expression pattern or association with human diseases, but it seems that they all contributed independently to human evolution. 
Four of the positively selected GRFs are KRAB-ZNF proteins, which induce changes in target gene co-expression and/or act through an arms race with transposable elements. Since each positively selected GRF contains several sites with evidence for positive selection, we suggest that these GRFs contributed pleiotropically to phenotypic adaptations in humans.
Affiliation(s)
- Vladimir M Jovanovic
  - Human Biology and Primate Evolution, Freie Universität Berlin, Berlin, Germany
  - Bioinformatics Solution Center, Freie Universität Berlin, Berlin, Germany
- Melanie Sarfert
  - Human Biology and Primate Evolution, Freie Universität Berlin, Berlin, Germany
- Carlos S Reyna-Blanco
  - Department of Biology, University of Fribourg, Fribourg, Switzerland
  - Swiss Institute of Bioinformatics, Fribourg, Switzerland
- Henrike Indrischek
  - Max Planck Institute of Molecular Cell Biology and Genetics, Dresden, Germany
  - Max Planck Institute for the Physics of Complex Systems, Dresden, Germany
  - Center for Systems Biology Dresden, Dresden, Germany
- Dulce I Valdivia
  - Evolutionary Genomics Laboratory and Genome Topology and Regulation Laboratory, Genetic Engineering Department, Center for Research and Advanced Studies of the National Polytechnic Institute (CINVESTAV-Irapuato), Irapuato, Mexico
- Ekaterina Shelest
  - Centre for Enzyme Innovation, University of Portsmouth, Portsmouth, United Kingdom
- Katja Nowick
  - Human Biology and Primate Evolution, Freie Universität Berlin, Berlin, Germany
10
Hernandez M, Shenk MK, Perry GH. Factors influencing taxonomic unevenness in scientific research: a mixed-methods case study of non-human primate genomic sequence data generation. Royal Society Open Science 2020; 7:201206. [PMID: 33047065 PMCID: PMC7540799 DOI: 10.1098/rsos.201206] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 07/07/2020] [Accepted: 09/07/2020] [Indexed: 05/06/2023]
Abstract
Scholars have noted major disparities in the extent of scientific research conducted among taxonomic groups. Such trends may cascade if future scientists gravitate towards study species with more data and resources already available. As new technologies emerge, do research studies employing these technologies continue these disparities? Here, using non-human primates as a case study, we identified disparities in massively parallel genomic sequencing data and conducted interviews with scientists who produced these data to learn their motivations when selecting study species. We tested whether variables including publication history and conservation status were significantly correlated with publicly available sequence data in the NCBI Sequence Read Archive (SRA). Of the 179.6 terabases (Tb) of sequence data in SRA for 519 non-human primate species, 135 Tb (approx. 75%) were from only five species: rhesus macaques, olive baboons, green monkeys, chimpanzees and crab-eating macaques. The strongest predictors of the amount of genomic data were the total number of non-medical publications (linear regression; r² = 0.37; p = 6.15 × 10⁻¹²) and number of medical publications (r² = 0.27; p = 9.27 × 10⁻⁹). In a generalized linear model, the number of non-medical publications (p = 0.00064) and closer phylogenetic distance to humans (p = 0.024) were the most predictive of the amount of genomic sequence data. We interviewed 33 authors of genomic data-producing publications and analysed their responses using grounded theory. Consistent with our quantitative results, authors mentioned their choice of species was motivated by sample accessibility, prior published work and relevance to human medicine. 
Our mixed-methods approach helped identify and contextualize some of the driving factors behind species-uneven patterns of scientific research, which can now be considered by funding agencies, scientific societies and research teams aiming to align their broader goals with future data generation efforts.
Affiliation(s)
- Margarita Hernandez
  - Department of Anthropology, Pennsylvania State University, University Park, PA 16802, USA
  - Authors for correspondence: Margarita Hernandez e-mail:
- Mary K. Shenk
  - Department of Anthropology, Pennsylvania State University, University Park, PA 16802, USA
- George H. Perry
  - Department of Anthropology, Pennsylvania State University, University Park, PA 16802, USA
  - Department of Biology, Pennsylvania State University, University Park, PA 16802, USA
  - Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802, USA
  - Authors for correspondence: George H. Perry e-mail: