1
Khodursky S, Zheng EB, Svetec N, Durkin SM, Benjamin S, Gadau A, Wu X, Zhao L. The evolution and mutational robustness of chromatin accessibility in Drosophila. Genome Biol 2023; 24:232. [PMID: 37845780 PMCID: PMC10578003 DOI: 10.1186/s13059-023-03079-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/14/2022] [Accepted: 09/29/2023] [Indexed: 10/18/2023]
Abstract
BACKGROUND: The evolution of genomic regulatory regions plays a critical role in shaping the diversity of life. While this process is primarily sequence-dependent, the enormous complexity of biological systems complicates the understanding of the factors underlying regulation and its evolution. Here, we apply deep neural networks as a tool to investigate the sequence determinants underlying chromatin accessibility in different species and tissues of Drosophila.
RESULTS: We train hybrid convolution-attention neural networks to accurately predict ATAC-seq peaks using only local DNA sequences as input. We show that our models generalize well across substantially evolutionarily diverged species of insects, implying that the sequence determinants of accessibility are highly conserved. Using our model to examine species-specific gains in accessibility, we find evidence suggesting that these regions may be ancestrally poised for evolution. Using in silico mutagenesis, we show that accessibility can be accurately predicted from short subsequences in each example. However, in silico knock-out of these sequences does not qualitatively impair classification, implying that accessibility is mutationally robust. Subsequently, we show that accessibility is predicted to be robust to large-scale random mutation even in the absence of selection. Conversely, simulations under strong selection demonstrate that accessibility can be extremely malleable despite its robustness. Finally, we identify motifs predictive of accessibility, recovering both novel and previously known motifs.
CONCLUSIONS: These results demonstrate the conservation of the sequence determinants of accessibility and the general robustness of chromatin accessibility, as well as the power of deep neural networks to explore fundamental questions in regulatory genomics and evolution.
Affiliation(s)
- Samuel Khodursky
  - Laboratory of Evolutionary Genetics and Genomics, The Rockefeller University, New York, NY, 10065, USA
- Eric B Zheng
  - Laboratory of Evolutionary Genetics and Genomics, The Rockefeller University, New York, NY, 10065, USA
- Nicolas Svetec
  - Laboratory of Evolutionary Genetics and Genomics, The Rockefeller University, New York, NY, 10065, USA
- Sylvia M Durkin
  - Laboratory of Evolutionary Genetics and Genomics, The Rockefeller University, New York, NY, 10065, USA
  - Present Address: Department of Integrative Biology and Museum of Vertebrate Zoology, University of California, Berkeley, Berkeley, CA, USA
- Sigi Benjamin
  - Laboratory of Evolutionary Genetics and Genomics, The Rockefeller University, New York, NY, 10065, USA
- Alice Gadau
  - Laboratory of Evolutionary Genetics and Genomics, The Rockefeller University, New York, NY, 10065, USA
- Xia Wu
  - Laboratory of Evolutionary Genetics and Genomics, The Rockefeller University, New York, NY, 10065, USA
- Li Zhao
  - Laboratory of Evolutionary Genetics and Genomics, The Rockefeller University, New York, NY, 10065, USA
2
Khodursky S, Zheng EB, Svetec N, Durkin SM, Benjamin S, Gadau A, Wu X, Zhao L. The evolution and mutational robustness of chromatin accessibility in Drosophila. bioRxiv 2023:2023.06.26.546587. [PMID: 37425760 PMCID: PMC10327059 DOI: 10.1101/2023.06.26.546587] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/11/2023]
Abstract
The evolution of regulatory regions in the genome plays a critical role in shaping the diversity of life. While this process is primarily sequence-dependent, the enormous complexity of biological systems has made it difficult to understand the factors underlying regulation and its evolution. Here, we apply deep neural networks as a tool to investigate the sequence determinants underlying chromatin accessibility in different tissues of Drosophila. We train hybrid convolution-attention neural networks to accurately predict ATAC-seq peaks using only local DNA sequences as input. We show that a model trained in one species has nearly identical performance when tested in another species, implying that the sequence determinants of accessibility are highly conserved. Indeed, model performance remains excellent even in distantly-related species. By using our model to examine species-specific gains in chromatin accessibility, we find that their orthologous inaccessible regions in other species have surprisingly similar model outputs, suggesting that these regions may be ancestrally poised for evolution. We then use in silico saturation mutagenesis to reveal evidence of selective constraint acting specifically on inaccessible chromatin regions. We further show that chromatin accessibility can be accurately predicted from short subsequences in each example. However, in silico knock-out of these sequences does not qualitatively impair classification, implying that chromatin accessibility is mutationally robust. Subsequently, we demonstrate that chromatin accessibility is predicted to be robust to large-scale random mutation even in the absence of selection. We also perform in silico evolution experiments under the regime of strong selection and weak mutation (SSWM) and show that chromatin accessibility can be extremely malleable despite its mutational robustness. However, selection acting in different directions in a tissue-specific manner can substantially slow adaptation. 
Finally, we identify motifs predictive of chromatin accessibility and recover motifs corresponding to known chromatin accessibility activators and repressors. These results demonstrate the conservation of the sequence determinants of accessibility and the general robustness of chromatin accessibility, as well as the power of deep neural networks as tools to answer fundamental questions in regulatory genomics and evolution.
Affiliation(s)
- Samuel Khodursky
  - Laboratory of Evolutionary Genetics and Genomics, The Rockefeller University, New York, NY 10065, USA
  - These authors contributed equally
- Eric B Zheng
  - Laboratory of Evolutionary Genetics and Genomics, The Rockefeller University, New York, NY 10065, USA
  - These authors contributed equally
- Nicolas Svetec
  - Laboratory of Evolutionary Genetics and Genomics, The Rockefeller University, New York, NY 10065, USA
- Sylvia M Durkin
  - Laboratory of Evolutionary Genetics and Genomics, The Rockefeller University, New York, NY 10065, USA
  - Current Address: Department of Integrative Biology and Museum of Vertebrate Zoology, University of California, Berkeley, Berkeley, CA, USA
- Sigi Benjamin
  - Laboratory of Evolutionary Genetics and Genomics, The Rockefeller University, New York, NY 10065, USA
- Alice Gadau
  - Laboratory of Evolutionary Genetics and Genomics, The Rockefeller University, New York, NY 10065, USA
- Xia Wu
  - Laboratory of Evolutionary Genetics and Genomics, The Rockefeller University, New York, NY 10065, USA
- Li Zhao
  - Laboratory of Evolutionary Genetics and Genomics, The Rockefeller University, New York, NY 10065, USA
3
Hague MTJ, Mavengere H, Matute DR, Cooper BS. Environmental and Genetic Contributions to Imperfect wMel-Like Wolbachia Transmission and Frequency Variation. Genetics 2020; 215:1117-1132. [PMID: 32546497 PMCID: PMC7404227 DOI: 10.1534/genetics.120.303330] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Received: 05/08/2020] [Accepted: 06/13/2020] [Indexed: 12/11/2022]
Abstract
Maternally transmitted Wolbachia bacteria infect about half of all insect species. They usually show imperfect maternal transmission and often produce cytoplasmic incompatibility (CI). Irrespective of CI, Wolbachia frequencies tend to increase when rare only if they benefit host fitness. Several Wolbachia, including wMel that infects Drosophila melanogaster, cause weak or no CI and persist at intermediate frequencies. On the island of São Tomé off West Africa, the frequencies of wMel-like Wolbachia infecting Drosophila yakuba (wYak) and Drosophila santomea (wSan) fluctuate, and the contributions of imperfect maternal transmission, fitness effects, and CI to these fluctuations are unknown. We demonstrate spatial variation in wYak frequency and transmission on São Tomé. Concurrent field estimates of imperfect maternal transmission do not predict spatial variation in wYak frequencies, which are highest at high altitudes where maternal transmission is the most imperfect. Genomic and genetic analyses provide little support for D. yakuba effects on wYak transmission. Instead, rearing at cool temperatures reduces wYak titer and increases imperfect transmission to levels observed on São Tomé. Using mathematical models of Wolbachia frequency dynamics and equilibria, we infer that temporally variable imperfect transmission or spatially variable effects on host fitness and reproduction are required to explain wYak frequencies. In contrast, spatially stable wSan frequencies are plausibly explained by imperfect transmission, modest fitness effects, and weak CI. Our results provide insight into causes of wMel-like frequency variation in divergent hosts. Understanding this variation is crucial to explain Wolbachia spread and to improve wMel biocontrol of human disease in transinfected mosquito systems.
Affiliation(s)
- Michael T J Hague
  - Division of Biological Sciences, University of Montana, Missoula, Montana 59812
- Heidi Mavengere
  - Department of Biology, University of North Carolina, Chapel Hill, North Carolina 27599
- Daniel R Matute
  - Department of Biology, University of North Carolina, Chapel Hill, North Carolina 27599
- Brandon S Cooper
  - Division of Biological Sciences, University of Montana, Missoula, Montana 59812
4
Yassin A, Debat V, Bastide H, Gidaszewski N, David JR, Pool JE. Recurrent specialization on a toxic fruit in an island Drosophila population. Proc Natl Acad Sci U S A 2016; 113:4771-6. [PMID: 27044093 DOI: 10.1073/pnas.1522559113] [Citation(s) in RCA: 75] [Impact Index Per Article: 9.4] [Indexed: 11/18/2022]
Abstract
Recurrent specialization on similar host plants offers a unique opportunity to unravel the evolutionary and genetic mechanisms underlying dietary shifts. Recent studies have focused on ecological races belonging to the same species, but it is hard in many cases to untangle the role of adaptive introgression versus distinct mutations in facilitating recurrent evolution. We discovered on the island of Mayotte a population of the generalist fly Drosophila yakuba that is strictly associated with noni (Morinda citrifolia). This case strongly resembles Drosophila sechellia, a genetically isolated insular relative of D. yakuba whose intensely studied specialization on toxic noni fruits has always been considered a unique event in insect evolution. Experiments revealed that unlike mainland D. yakuba strains, Mayotte flies showed strong olfactory attraction and significant toxin tolerance to noni. Island females strongly discriminated against mainland males, suggesting that dietary adaptation has been accompanied by partial reproductive isolation. Population genomic analysis indicated a recent colonization (∼29 kya), at a time when year-round noni fruits may have presented a predictable resource on the small island, with ongoing migration after colonization. This relatively recent time scale allowed us to search for putatively adaptive loci based on genetic variation. Strong signals of genetic differentiation were found for several detoxification genes, including a major toxin tolerance locus in D. sechellia. Our results suggest that recurrent evolution on a toxic resource can involve similar historical events and common genetic bases, and they establish an important genetic system for the study of early stages of ecological specialization and speciation.
5
Denis B, Rouzic AL, Wicker-Thomas C. Hydrocarbon Patterns and Mating Behaviour in Populations of Drosophila yakuba. Insects 2015; 6:897-911. [PMID: 26516919 PMCID: PMC4693177 DOI: 10.3390/insects6040897] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 05/18/2015] [Revised: 09/16/2015] [Accepted: 10/12/2015] [Indexed: 11/16/2022]
Abstract
Drosophila yakuba is widespread in Africa. Here we compare the cuticular hydrocarbon (CHC) profiles and mating behavior of mainland (Kounden, Cameroon) and island (Mayotte, Sao-Tome, Bioko) populations. The strains each had different CHC profiles: Bioko and Kounden were the most similar, while Mayotte and Sao-Tome contained significant amounts of 7-heptacosene. The CHC profile of the Sao-Tome population differed the most, with half the 7-tricosene of the other populations and more 7-heptacosene and 7-nonacosene. We also studied the characteristics of the mating behavior of the four strains: copulation duration was similar but latency times were higher in Mayotte and Sao-Tome populations. We found partial reproductive isolation between populations, especially in male-choice experiments with Sao-Tome females.
Affiliation(s)
- Béatrice Denis
  - Laboratoire Évolution, Génomes, Comportement et Écologie, CNRS, IRD, Université Paris-Sud, Université Paris-Saclay, Gif-sur-Yvette F-91198, France
- Arnaud Le Rouzic
  - Laboratoire Évolution, Génomes, Comportement et Écologie, CNRS, IRD, Université Paris-Sud, Université Paris-Saclay, Gif-sur-Yvette F-91198, France
- Claude Wicker-Thomas
  - Laboratoire Évolution, Génomes, Comportement et Écologie, CNRS, IRD, Université Paris-Sud, Université Paris-Saclay, Gif-sur-Yvette F-91198, France
6
Rogers RL, Cridland JM, Shao L, Hu TT, Andolfatto P, Thornton KR. Landscape of standing variation for tandem duplications in Drosophila yakuba and Drosophila simulans. Mol Biol Evol 2014; 31:1750-66. [PMID: 24710518 PMCID: PMC4069613 DOI: 10.1093/molbev/msu124] [Citation(s) in RCA: 79] [Impact Index Per Article: 7.9] [Indexed: 12/13/2022]
Abstract
We have used whole genome paired-end Illumina sequence data to identify tandem duplications in 20 isofemale lines of Drosophila yakuba and 20 isofemale lines of D. simulans and performed genome wide validation with PacBio long molecule sequencing. We identify 1,415 tandem duplications that are segregating in D. yakuba as well as 975 duplications in D. simulans, indicating greater variation in D. yakuba. Additionally, we observe high rates of secondary deletions at duplicated sites, with 8% of duplicated sites in D. simulans and 17% of sites in D. yakuba modified with deletions. These secondary deletions are consistent with the action of the large loop mismatch repair system acting to remove polymorphic tandem duplication, resulting in rapid dynamics of gain and loss in duplicated alleles and a richer substrate of genetic novelty than has been previously reported. Most duplications are present in only single strains, suggesting that deleterious impacts are common. Drosophila simulans shows larger numbers of whole gene duplications in comparison to larger proportions of gene fragments in D. yakuba. Drosophila simulans displays an excess of high-frequency variants on the X chromosome, consistent with adaptive evolution through duplications on the D. simulans X or demographic forces driving duplicates to high frequency. We identify 78 chimeric genes in D. yakuba and 38 chimeric genes in D. simulans, as well as 143 cases of recruited noncoding sequence in D. yakuba and 96 in D. simulans, in agreement with rates of chimeric gene origination in D. melanogaster. Together, these results suggest that tandem duplications often result in complex variation beyond whole gene duplications that offers a rich substrate of standing variation that is likely to contribute both to detrimental phenotypes and disease, as well as to adaptive evolutionary change.
Affiliation(s)
- Rebekah L Rogers
  - Department of Ecology and Evolutionary Biology, University of California, Irvine
- Julie M Cridland
  - Department of Ecology and Evolutionary Biology, University of California, Irvine; Department of Ecology and Evolutionary Biology, University of California, Davis
- Ling Shao
  - Department of Ecology and Evolutionary Biology, University of California, Irvine
- Tina T Hu
  - Department of Ecology and Evolutionary Biology and the Lewis Sigler Institute for Integrative Genomics, Princeton University
- Peter Andolfatto
  - Department of Ecology and Evolutionary Biology and the Lewis Sigler Institute for Integrative Genomics, Princeton University
- Kevin R Thornton
  - Department of Ecology and Evolutionary Biology, University of California, Irvine
7
Abstract
Phenotypic differences between males and females of sexually dimorphic species are caused in large part by differences in gene expression between the sexes, most of which occurs in the gonads. To accurately identify genes differentially expressed between males and females in Drosophila, we sequenced the testis and ovary transcriptomes of D. yakuba, D. pseudoobscura, and D. ananassae and used them to identify sex-biased genes in the latter two species. We highlight the increased sensitivity and improved power of sex-biased gene detection methods when using our testis/ovary data versus male and female whole body transcriptome data. We thus provide a resource specifically designed to accurately identify and characterize sex-biased genes across Drosophila. This dataset is available through NCBI GEO accession GSE52058.
Affiliation(s)
- Nicholas W VanKuren
  - Committee on Genetics, Genomics, and Systems Biology, The University of Chicago, Chicago, IL 60637, USA; Department of Ecology and Evolution, The University of Chicago, Chicago, IL 60637, USA
- Maria D Vibranovski
  - Department of Ecology and Evolution, The University of Chicago, Chicago, IL 60637, USA; Departamento de Genética e Biologia Evolutiva, Instituto de Biociências, Universidade de São Paulo, São Paulo, Brazil 05508
8
Hodar C, Zuñiga A, Pulgar R, Travisany D, Chacon C, Pino M, Maass A, Cambiazo V. Comparative gene expression analysis of Dtg, a novel target gene of Dpp signaling pathway in the early Drosophila melanogaster embryo. Gene 2013; 535:210-7. [PMID: 24321690 DOI: 10.1016/j.gene.2013.11.032] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 07/19/2013] [Revised: 10/30/2013] [Accepted: 11/14/2013] [Indexed: 10/25/2022]
Abstract
In the early Drosophila melanogaster embryo, Dpp, a secreted molecule that belongs to the TGF-β superfamily of growth factors, activates a set of downstream genes to subdivide the dorsal region into amnioserosa and dorsal epidermis. Here, we examined the expression pattern and transcriptional regulation of Dtg, a new target gene of the Dpp signaling pathway that is required for proper amnioserosa differentiation. We showed that the expression of Dtg was controlled by Dpp and characterized a 524-bp enhancer that mediated expression in the dorsal midline as well as in the differentiated amnioserosa in transgenic reporter embryos. This enhancer contained a highly conserved 48-bp region in which bioinformatic predictions and in vitro assays identified three Mad binding motifs. Mutational analysis revealed that these three motifs were necessary for proper expression of a reporter gene in transgenic embryos, suggesting that short and highly conserved genomic sequences may be indicative of functional regulatory regions in D. melanogaster genes. Dtg orthologs were not detected in basal lineages of Dipterans, which unlike D. melanogaster develop two extra-embryonic membranes, amnion and serosa. Nevertheless, Dtg orthologs were identified in the transcriptome of Musca domestica, in which dorsal ectoderm patterning leads to the formation of a single extra-embryonic membrane. These results suggest that Dtg was recruited as a new component of the network that controls dorsal ectoderm patterning in the lineage leading to higher Cyclorrhaphan flies, such as D. melanogaster and M. domestica.
Affiliation(s)
- Christian Hodar
  - Laboratorio de Bioinformática y Expresión Génica, INTA-Universidad de Chile, El Líbano 5524, Santiago, Chile; Fondap Center for Genome Regulation (CGR), Universidad de Chile, Santiago, Chile
- Alejandro Zuñiga
  - Laboratorio de Bioinformática y Expresión Génica, INTA-Universidad de Chile, El Líbano 5524, Santiago, Chile; Fondap Center for Genome Regulation (CGR), Universidad de Chile, Santiago, Chile
- Rodrigo Pulgar
  - Laboratorio de Bioinformática y Expresión Génica, INTA-Universidad de Chile, El Líbano 5524, Santiago, Chile; Fondap Center for Genome Regulation (CGR), Universidad de Chile, Santiago, Chile
- Dante Travisany
  - Laboratorio de Bioinformática y Matemática del Genoma, Center for Mathematical Modeling, FCFM-Universidad de Chile, Santiago, Chile; Fondap Center for Genome Regulation (CGR), Universidad de Chile, Santiago, Chile
- Carlos Chacon
  - Laboratorio de Bioinformática y Expresión Génica, INTA-Universidad de Chile, El Líbano 5524, Santiago, Chile
- Michael Pino
  - Laboratorio de Bioinformática y Expresión Génica, INTA-Universidad de Chile, El Líbano 5524, Santiago, Chile
- Alejandro Maass
  - Laboratorio de Bioinformática y Matemática del Genoma, Center for Mathematical Modeling, FCFM-Universidad de Chile, Santiago, Chile; Fondap Center for Genome Regulation (CGR), Universidad de Chile, Santiago, Chile; Department of Mathematical Engineering, FCFM-Universidad de Chile, Santiago, Chile
- Verónica Cambiazo
  - Laboratorio de Bioinformática y Expresión Génica, INTA-Universidad de Chile, El Líbano 5524, Santiago, Chile; Fondap Center for Genome Regulation (CGR), Universidad de Chile, Santiago, Chile