1
Wirthlin ME, Schmid TA, Elie JE, Zhang X, Kowalczyk A, Redlich R, Shvareva VA, Rakuljic A, Ji MB, Bhat NS, Kaplow IM, Schäffer DE, Lawler AJ, Wang AZ, Phan BN, Annaldasula S, Brown AR, Lu T, Lim BK, Azim E, Clark NL, Meyer WK, Pond SLK, Chikina M, Yartsev MM, Pfenning AR, Andrews G, Armstrong JC, Bianchi M, Birren BW, Bredemeyer KR, Breit AM, Christmas MJ, Clawson H, Damas J, Di Palma F, Diekhans M, Dong MX, Eizirik E, Fan K, Fanter C, Foley NM, Forsberg-Nilsson K, Garcia CJ, Gatesy J, Gazal S, Genereux DP, Goodman L, Grimshaw J, Halsey MK, Harris AJ, Hickey G, Hiller M, Hindle AG, Hubley RM, Hughes GM, Johnson J, Juan D, Kaplow IM, Karlsson EK, Keough KC, Kirilenko B, Koepfli KP, Korstian JM, Kowalczyk A, Kozyrev SV, Lawler AJ, Lawless C, Lehmann T, Levesque DL, Lewin HA, Li X, Lind A, Lindblad-Toh K, Mackay-Smith A, Marinescu VD, Marques-Bonet T, Mason VC, Meadows JRS, Meyer WK, Moore JE, Moreira LR, Moreno-Santillan DD, Morrill KM, Muntané G, Murphy WJ, Navarro A, Nweeia M, Ortmann S, Osmanski A, Paten B, Paulat NS, Pfenning AR, Phan BN, Pollard KS, Pratt HE, Ray DA, Reilly SK, Rosen JR, Ruf I, Ryan L, Ryder OA, Sabeti PC, Schäffer DE, Serres A, Shapiro B, Smit AFA, Springer M, Srinivasan C, Steiner C, Storer JM, Sullivan KAM, Sullivan PF, Sundström E, Supple MA, Swofford R, Talbot JE, Teeling E, Turner-Maier J, Valenzuela A, Wagner F, Wallerman O, Wang C, Wang J, Weng Z, Wilder AP, Wirthlin ME, Xue JR, Zhang X. Vocal learning-associated convergent evolution in mammalian proteins and regulatory elements. Science 2024; 383:eabn3263. [PMID: 38422184 DOI: 10.1126/science.abn3263] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/18/2021] [Accepted: 02/20/2024] [Indexed: 03/02/2024]
Abstract
Vocal production learning ("vocal learning") is a convergently evolved trait in vertebrates. To identify brain genomic elements associated with mammalian vocal learning, we integrated genomic, anatomical, and neurophysiological data from the Egyptian fruit bat (Rousettus aegyptiacus) with analyses of the genomes of 215 placental mammals. First, we identified a set of proteins evolving more slowly in vocal learners. Then, we discovered a vocal motor cortical region in the Egyptian fruit bat, an emergent vocal learner, and leveraged that knowledge to identify active cis-regulatory elements in the motor cortex of vocal learners. Machine learning methods applied to motor cortex open chromatin revealed 50 enhancers robustly associated with vocal learning whose activity tended to be lower in vocal learners. Our research implicates convergent losses of motor cortex regulatory elements in mammalian vocal learning evolution.
Affiliation(s)
- Morgan E Wirthlin
  - Department of Computational Biology, Carnegie Mellon University, Pittsburgh, PA 15213, USA
  - Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Tobias A Schmid
  - Helen Wills Neuroscience Institute, University of California, Berkeley, Berkeley, CA 94708, USA
- Julie E Elie
  - Helen Wills Neuroscience Institute, University of California, Berkeley, Berkeley, CA 94708, USA
  - Department of Bioengineering, University of California, Berkeley, Berkeley, CA 94708, USA
- Xiaomeng Zhang
  - Department of Computational Biology, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Amanda Kowalczyk
  - Department of Computational Biology, Carnegie Mellon University, Pittsburgh, PA 15213, USA
  - Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Ruby Redlich
  - Department of Computational Biology, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Varvara A Shvareva
  - Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA 94708, USA
- Ashley Rakuljic
  - Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA 94708, USA
- Maria B Ji
  - Department of Psychology, University of California, Berkeley, Berkeley, CA 94708, USA
- Ninad S Bhat
  - Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA 94708, USA
- Irene M Kaplow
  - Department of Computational Biology, Carnegie Mellon University, Pittsburgh, PA 15213, USA
  - Department of Computational and Systems Biology, University of Pittsburgh School of Medicine, Pittsburgh, PA 15213, USA
- Daniel E Schäffer
  - Department of Computational Biology, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Alyssa J Lawler
  - Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA 15213, USA
  - Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Andrew Z Wang
  - Department of Computational Biology, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- BaDoi N Phan
  - Department of Computational Biology, Carnegie Mellon University, Pittsburgh, PA 15213, USA
  - Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Siddharth Annaldasula
  - Department of Computational Biology, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Ashley R Brown
  - Department of Computational Biology, Carnegie Mellon University, Pittsburgh, PA 15213, USA
  - Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Tianyu Lu
  - Department of Computational Biology, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Byung Kook Lim
  - Neurobiology section, Division of Biological Science, University of California, San Diego, La Jolla, CA 92093, USA
- Eiman Azim
  - Molecular Neurobiology Laboratory, Salk Institute for Biological Studies, La Jolla, CA 92037, USA
- Nathan L Clark
  - Department of Biological Sciences, University of Pittsburgh, Pittsburgh, PA 15213, USA
- Wynn K Meyer
  - Department of Biological Sciences, Lehigh University, Bethlehem, PA 18015, USA
- Maria Chikina
  - Department of Computational and Systems Biology, University of Pittsburgh School of Medicine, Pittsburgh, PA 15213, USA
- Michael M Yartsev
  - Helen Wills Neuroscience Institute, University of California, Berkeley, Berkeley, CA 94708, USA
  - Department of Bioengineering, University of California, Berkeley, Berkeley, CA 94708, USA
- Andreas R Pfenning
  - Department of Computational Biology, Carnegie Mellon University, Pittsburgh, PA 15213, USA
  - Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA 15213, USA
2
Gilbertson EN, Brand CM, McArthur E, Rinker DC, Kuang S, Pollard KS, Capra JA. Machine learning reveals the diversity of human 3D chromatin contact patterns. bioRxiv 2023:2023.12.22.573104. [PMID: 38187606 PMCID: PMC10769343 DOI: 10.1101/2023.12.22.573104] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/09/2024]
Abstract
Understanding variation in chromatin contact patterns across human populations is critical for interpreting non-coding variants and their ultimate effects on gene expression and phenotypes. However, experimental determination of chromatin contacts at population scale is prohibitively expensive. To overcome this challenge, we develop and validate a machine learning method to quantify the diversity of 3D chromatin contacts at 2 kilobase resolution from genome sequence alone. We then apply this approach to thousands of diverse modern humans and the inferred human-archaic hominin ancestral genome. While patterns of 3D contact divergence genome-wide are qualitatively similar to patterns of sequence divergence, we find that 3D divergence in local 1-megabase genomic windows does not follow sequence divergence. In particular, we identify 392 windows with significantly greater 3D divergence than expected from sequence. Moreover, 26% of genomic windows have rare 3D contact variation observed in a small number of individuals. Using in silico mutagenesis, we find that most sequence changes do not result in changes to 3D chromatin contacts. However, in windows with substantial 3D divergence, just one or a few variants can lead to divergent 3D chromatin contacts without the individuals carrying those variants having high sequence divergence. In summary, inferring 3D chromatin contact maps across human populations reveals diverse contact patterns. We anticipate that these genetically diverse maps of 3D chromatin contact will provide a reference for future work on the function and evolution of 3D chromatin contact variation across human populations.
Affiliation(s)
- Erin N Gilbertson
  - Biomedical Informatics Graduate Program, University of California San Francisco, San Francisco, CA
  - Bakar Computational Health Sciences Institute, University of California, San Francisco, CA
- Colin M Brand
  - Bakar Computational Health Sciences Institute, University of California, San Francisco, CA
  - Department of Epidemiology and Biostatistics, University of California, San Francisco, CA
- Evonne McArthur
  - Vanderbilt Genetics Institute, Vanderbilt University, Nashville, TN
  - Department of Medicine, University of Washington, Seattle, WA
- David C Rinker
  - Department of Biological Sciences, Vanderbilt University, Nashville, TN
- Shuzhen Kuang
  - Gladstone Institute of Data Science and Biotechnology, San Francisco, CA
- Katherine S Pollard
  - Biomedical Informatics Graduate Program, University of California San Francisco, San Francisco, CA
  - Bakar Computational Health Sciences Institute, University of California, San Francisco, CA
  - Department of Epidemiology and Biostatistics, University of California, San Francisco, CA
  - Gladstone Institute of Data Science and Biotechnology, San Francisco, CA
  - Chan Zuckerberg Biohub SF, San Francisco, CA
- John A Capra
  - Biomedical Informatics Graduate Program, University of California San Francisco, San Francisco, CA
  - Bakar Computational Health Sciences Institute, University of California, San Francisco, CA
  - Department of Epidemiology and Biostatistics, University of California, San Francisco, CA
3
Gunsalus LM, Keiser MJ, Pollard KS. ChromaFactor: deconvolution of single-molecule chromatin organization with non-negative matrix factorization. bioRxiv 2023:2023.11.22.568268. [PMID: 38045231 PMCID: PMC10690235 DOI: 10.1101/2023.11.22.568268] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/05/2023]
Abstract
The investigation of chromatin organization in single cells holds great promise for identifying causal relationships between genome structure and function. However, analysis of single-molecule data is hampered by extreme yet inherent heterogeneity, making it challenging to determine the contributions of individual chromatin fibers to bulk trends. To address this challenge, we propose ChromaFactor, a novel computational approach based on non-negative matrix factorization that deconvolves single-molecule chromatin organization datasets into their most salient primary components. ChromaFactor provides the ability to identify trends accounting for the maximum variance in the dataset while simultaneously describing the contribution of individual molecules to each component. Applying our approach to two single-molecule imaging datasets across different genomic scales, we find that these primary components demonstrate significant correlation with key functional phenotypes, including active transcription, enhancer-promoter distance, and genomic compartment. ChromaFactor offers a robust tool for understanding the complex interplay between chromatin structure and function on individual DNA molecules, pinpointing which subpopulations drive functional changes and fostering new insights into cellular heterogeneity and its implications for bulk genomic phenomena.
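As a rough illustration of the factorization idea named in this abstract, the sketch below runs plain non-negative matrix factorization with the standard multiplicative updates on a toy matrix whose rows stand in for flattened single-molecule measurements. It is not the ChromaFactor implementation: the function names, the toy data, the rank, and the iteration count are all assumptions made for the example.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def transpose(m):
    """Return the transpose of a list-of-rows matrix."""
    return [list(col) for col in zip(*m)]

def nmf(v, rank, iters=500, seed=0, eps=1e-9):
    """Factor non-negative V (n x m) into W (n x rank) and H (rank x m)
    with multiplicative updates that reduce squared reconstruction error."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iters):
        # Update H: H <- H * (W^T V) / (W^T W H)
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(rank)]
        # Update W: W <- W * (V H^T) / (W H H^T)
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n)]
    return w, h

def frobenius_error(v, w, h):
    """Frobenius norm of V - W H."""
    approx = matmul(w, h)
    return sum((a - b) ** 2
               for ra, rb in zip(v, approx)
               for a, b in zip(ra, rb)) ** 0.5

# Toy data (hypothetical): each row is one "molecule", each column one feature.
pattern_a = [1.0, 1.0, 0.0, 0.0]
pattern_b = [0.0, 0.0, 1.0, 1.0]
molecules = [pattern_a, pattern_a, pattern_b, pattern_b, [0.5, 0.5, 0.5, 0.5]]

w, h = nmf(molecules, rank=2)
print("reconstruction error:", frobenius_error(molecules, w, h))
for row in w:
    print([round(x, 2) for x in row])  # per-molecule weight on each component
```

In this framing, the rows of `h` play the role of shared components and each row of `w` gives one molecule's contribution to them, which is the kind of per-molecule readout the abstract describes.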
4
Lind AL, McDonald NA, Gerrick ER, Bhatt AS, Pollard KS. Hybrid assemblies of microbiome Blastocystis protists reveal evolutionary diversification reflecting host ecology. bioRxiv 2023:2023.11.20.567959. [PMID: 38045412 PMCID: PMC10690189 DOI: 10.1101/2023.11.20.567959] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/05/2023]
Abstract
The most prevalent microbial eukaryote in the human gut is Blastocystis, an obligate commensal protist also common in many other vertebrates. Blastocystis is descended from free-living stramenopile ancestors; how it has adapted to thrive within humans and a wide range of hosts is unclear. Here, we cultivated six Blastocystis strains spanning the diversity of the genus and generated highly contiguous, annotated genomes with long-read DNA-seq, Hi-C, and RNA-seq. Comparative genomics between these strains and two closely related stramenopiles with different lifestyles, the lizard gut symbiont Proteromonas lacertae and the free-living marine flagellate Cafeteria burkhardae, reveals the evolutionary history of the Blastocystis genus. We find substantial gene content variability between Blastocystis strains. Blastocystis isolated from an herbivorous tortoise has many plant carbohydrate metabolizing enzymes, some horizontally acquired from bacteria, likely reflecting fermentation within the host gut. In contrast, human-isolated Blastocystis have gained many heat shock proteins, and we find numerous subtype-specific expansions of host-interfacing genes, including cell adhesion and cell surface glycan genes. In addition, we observe that human-isolated Blastocystis have substantial changes in gene structure, including shortened introns and intergenic regions, as well as genes lacking canonical termination codons. Finally, our data indicate that the common ancestor of Blastocystis lost nearly all ancestral genes for heterokont flagella morphology, including cilia proteins, microtubule motor proteins, and ion channel proteins. Together, these findings underscore the huge functional variability within the Blastocystis genus and provide candidate genes for the adaptations these lineages have undergone to thrive in the gut microbiomes of diverse vertebrates.
5
Gjoni K, Pollard KS. SuPreMo: a computational tool for streamlining in silico perturbation using sequence-based predictive models. bioRxiv 2023:2023.11.03.565556. [PMID: 37961123 PMCID: PMC10635135 DOI: 10.1101/2023.11.03.565556] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2023]
Abstract
Computationally editing genome sequences is a common bioinformatics task, but current approaches have limitations, such as incompatibility with structural variants, challenges in identifying responsible sequence perturbations, and the need for VCF file inputs and phased data. To address these bottlenecks, we present Sequence Mutator for Predictive Models (SuPreMo), a scalable and comprehensive tool for performing in silico mutagenesis. We then demonstrate how pairs of reference and perturbed sequences can be used with machine learning models to prioritize pathogenic variants or discover new functional sequences.
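The general pattern of in silico mutagenesis mentioned here (build reference and perturbed sequence pairs, score both with a model, compare) can be sketched as follows. This does not use SuPreMo or its interface; `apply_variant`, `saturation_mutagenesis`, and `toy_score` are hypothetical names, and the motif-counting scorer is only a stand-in for a real sequence-based predictive model.

```python
def apply_variant(ref_seq, pos, ref, alt):
    """Return ref_seq with `ref` at 0-based `pos` replaced by `alt`.
    Handles substitutions, insertions and deletions alike."""
    if ref_seq[pos:pos + len(ref)] != ref:
        raise ValueError("reference allele does not match the sequence")
    return ref_seq[:pos] + alt + ref_seq[pos + len(ref):]

def saturation_mutagenesis(ref_seq):
    """Yield (pos, ref_base, alt_base, perturbed_seq) for every
    single-base substitution of ref_seq."""
    for pos, base in enumerate(ref_seq):
        for alt in "ACGT":
            if alt != base:
                yield pos, base, alt, apply_variant(ref_seq, pos, base, alt)

def toy_score(seq, motif="CCCTC"):
    """Stand-in for a predictive model: number of motif occurrences."""
    return seq.count(motif)

reference = "AACCCTCGG"
ref_score = toy_score(reference)

# Effect of each perturbation = score(perturbed) - score(reference).
effects = [(pos, ref, alt, toy_score(seq) - ref_score)
           for pos, ref, alt, seq in saturation_mutagenesis(reference)]

# Variants with a non-zero effect are the candidates worth following up.
disruptive = [e for e in effects if e[3] != 0]
for pos, ref, alt, delta in disruptive:
    print(f"{pos}:{ref}>{alt} changes the score by {delta}")

# The same helper also covers a small deletion.
print(apply_variant(reference, 2, "CCC", "C"))
```

With a real model, `toy_score` would be replaced by the model's prediction for each sequence, and the difference could be any distance between the two outputs.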
Affiliation(s)
- Ketrin Gjoni
  - Gladstone Institute of Data Science and Biotechnology, San Francisco, CA 94158, USA
  - Department of Epidemiology & Biostatistics, University of California, San Francisco, CA 94158, USA
- Katherine S Pollard
  - Gladstone Institute of Data Science and Biotechnology, San Francisco, CA 94158, USA
  - Department of Epidemiology & Biostatistics, University of California, San Francisco, CA 94158, USA
  - Chan Zuckerberg Biohub, San Francisco, CA 94158, USA
6
Pasquini L, Pereira FL, Seddighi S, Zeng Y, Wei Y, Illán-Gala I, Vatsavayai SC, Friedberg A, Lee AJ, Brown JA, Spina S, Grinberg LT, Sirkis DW, Bonham LW, Yokoyama JS, Boxer AL, Kramer JH, Rosen HJ, Humphrey J, Gitler AD, Miller BL, Pollard KS, Ward ME, Seeley WW. FTLD targets brain regions expressing recently evolved genes. medRxiv 2023:2023.10.27.23297687. [PMID: 37961381 PMCID: PMC10635220 DOI: 10.1101/2023.10.27.23297687] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2023]
Abstract
In frontotemporal lobar degeneration (FTLD), pathological protein aggregation is associated with a decline in human-specialized social-emotional and language functions. Most disease protein aggregates contain either TDP-43 (FTLD-TDP) or tau (FTLD-tau). Here, we explored whether FTLD targets brain regions that express genes containing human accelerated regions (HARs), conserved sequences that have undergone positive selection during recent human evolution. To this end, we used structural neuroimaging from patients with FTLD and normative human regional transcriptomic data to identify genes expressed in FTLD-targeted brain regions. We then integrated primate comparative genomic data to test our hypothesis that FTLD targets brain regions expressing recently evolved genes. In addition, we asked whether genes expressed in FTLD-targeted brain regions are enriched for genes that undergo cryptic splicing when TDP-43 function is impaired. We found that FTLD-TDP and FTLD-tau subtypes target brain regions that express overlapping and distinct genes, including many linked to neuromodulatory functions. Genes whose normative brain regional expression pattern correlated with FTLD cortical atrophy were strongly associated with HARs. Atrophy-correlated genes in FTLD-TDP showed greater overlap with TDP-43 cryptic splicing genes compared with atrophy-correlated genes in FTLD-tau. Cryptic splicing genes were enriched for HAR genes, and vice versa, but this effect was due to the confounding influence of gene length. Analyses performed at the individual-patient level revealed that the expression of HAR genes and cryptically spliced genes within putative regions of disease onset differed across FTLD-TDP subtypes. Overall, our findings suggest that FTLD targets brain regions that have undergone recent evolutionary specialization and provide intriguing potential leads regarding the transcriptomic basis for selective vulnerability in distinct FTLD molecular-anatomical subtypes.
Affiliation(s)
- Lorenzo Pasquini
  - Department of Neurology, Memory and Aging Center, University of California, San Francisco, CA, USA
  - Department of Neurology, Neuroscape, University of California, San Francisco, CA, USA
- Felipe L Pereira
  - Department of Neurology, Memory and Aging Center, University of California, San Francisco, CA, USA
- Sahba Seddighi
  - National Institute of Neurological Disorders and Stroke, Bethesda, MD, USA
- Yi Zeng
  - Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Yongbin Wei
  - School of Artificial Intelligence, Beijing University of Posts and Telecommunications, Beijing, China
- Ignacio Illán-Gala
  - Department of Neurology, Memory and Aging Center, University of California, San Francisco, CA, USA
  - Global Brain Health Institute, University of California, San Francisco, San Francisco, CA, USA and Trinity College Dublin, Dublin, Ireland
  - Department of Neurology, Hospital de la Santa Creu i Sant Pau, Biomedical Research Institute, Universitat Autònoma de Barcelona, Barcelona, Catalunya, Spain
- Sarat C Vatsavayai
  - Department of Neurology, Memory and Aging Center, University of California, San Francisco, CA, USA
- Adit Friedberg
  - Department of Neurology, Memory and Aging Center, University of California, San Francisco, CA, USA
  - Global Brain Health Institute, University of California, San Francisco, San Francisco, CA, USA and Trinity College Dublin, Dublin, Ireland
- Alex J Lee
  - Department of Neurology, Memory and Aging Center, University of California, San Francisco, CA, USA
- Jesse A Brown
  - Department of Neurology, Memory and Aging Center, University of California, San Francisco, CA, USA
- Salvatore Spina
  - Department of Neurology, Memory and Aging Center, University of California, San Francisco, CA, USA
- Lea T Grinberg
  - Department of Neurology, Memory and Aging Center, University of California, San Francisco, CA, USA
  - Department of Pathology, University of California, San Francisco, CA, USA
- Daniel W Sirkis
  - Department of Neurology, Memory and Aging Center, University of California, San Francisco, CA, USA
- Luke W Bonham
  - Department of Neurology, Memory and Aging Center, University of California, San Francisco, CA, USA
  - Department of Radiology, University of California, San Francisco, CA, USA
- Jennifer S Yokoyama
  - Department of Neurology, Memory and Aging Center, University of California, San Francisco, CA, USA
  - Department of Radiology, University of California, San Francisco, CA, USA
- Adam L Boxer
  - Department of Neurology, Memory and Aging Center, University of California, San Francisco, CA, USA
- Joel H Kramer
  - Department of Neurology, Memory and Aging Center, University of California, San Francisco, CA, USA
- Howard J Rosen
  - Department of Neurology, Memory and Aging Center, University of California, San Francisco, CA, USA
- Jack Humphrey
  - Nash Family Department of Neuroscience and Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Aaron D Gitler
  - Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Bruce L Miller
  - Department of Neurology, Memory and Aging Center, University of California, San Francisco, CA, USA
- Katherine S Pollard
  - Gladstone Institute of Data Science and Biotechnology, San Francisco, CA, USA
  - Institute for Human Genetics, University of California San Francisco, San Francisco, CA, USA
  - Department of Epidemiology & Biostatistics and Bakar Institute for Computational Health Sciences, University of California San Francisco, San Francisco, CA, USA
  - Chan Zuckerberg Biohub, San Francisco, CA, USA
- Michael E Ward
  - National Institute of Neurological Disorders and Stroke, Bethesda, MD, USA
- William W Seeley
  - Department of Neurology, Memory and Aging Center, University of California, San Francisco, CA, USA
  - Department of Pathology, University of California, San Francisco, CA, USA
7
Brand CM, Kuang S, Gilbertson EN, McArthur E, Pollard KS, Webster TH, Capra JA. Sequence-based machine learning reveals 3D genome differences between bonobos and chimpanzees. bioRxiv 2023:2023.10.26.564272. [PMID: 37961120 PMCID: PMC10634871 DOI: 10.1101/2023.10.26.564272] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 11/15/2023]
Abstract
Phenotypic divergence between closely related species, including bonobos and chimpanzees (genus Pan), is largely driven by variation in gene regulation. The 3D structure of the genome mediates gene expression; however, genome folding differences in Pan are not well understood. Here, we apply machine learning to predict genome-wide 3D genome contact maps from DNA sequence for 56 bonobos and chimpanzees, encompassing all five extant lineages. We use a pairwise approach to estimate 3D divergence between individuals from the resulting contact maps in 4,420 1 Mb genomic windows. While most pairs were similar, ∼17% were predicted to be substantially divergent in genome folding. The most dissimilar maps were largely driven by single individuals with rare variants that produce unique 3D genome folding in a region. We also identified 89 genomic windows where bonobo and chimpanzee contact maps substantially diverged, including several windows harboring genes associated with traits implicated in Pan phenotypic divergence. We used in silico mutagenesis to identify 51 3D-modifying variants in these bonobo-chimpanzee divergent windows, finding that 34 (66.67%) induce genome folding changes via CTCF binding motif disruption. Our results reveal 3D genome variation at the population level and identify genomic regions where changes in 3D folding may contribute to phenotypic differences in our closest living relatives.
Affiliation(s)
- Colin M. Brand
  - Bakar Computational Health Sciences Institute, University of California, San Francisco, CA
  - Department of Epidemiology and Biostatistics, University of California, San Francisco, CA
- Shuzhen Kuang
  - Gladstone Institute of Data Science and Biotechnology, San Francisco, CA
- Erin N. Gilbertson
  - Bakar Computational Health Sciences Institute, University of California, San Francisco, CA
  - Biomedical Informatics Graduate Program, University of California San Francisco, San Francisco, CA
- Evonne McArthur
  - Vanderbilt Genetics Institute, Vanderbilt University, Nashville, TN
- Katherine S. Pollard
  - Bakar Computational Health Sciences Institute, University of California, San Francisco, CA
  - Department of Epidemiology and Biostatistics, University of California, San Francisco, CA
  - Gladstone Institute of Data Science and Biotechnology, San Francisco, CA
  - Biomedical Informatics Graduate Program, University of California San Francisco, San Francisco, CA
  - Chan Zuckerberg Biohub, San Francisco, CA
- John A. Capra
  - Bakar Computational Health Sciences Institute, University of California, San Francisco, CA
  - Department of Epidemiology and Biostatistics, University of California, San Francisco, CA
  - Biomedical Informatics Graduate Program, University of California San Francisco, San Francisco, CA
8
Kuang S, Pollard KS. Exploring the Roles of RNAs in Chromatin Architecture Using Deep Learning. bioRxiv 2023:2023.10.22.563498. [PMID: 37961712 PMCID: PMC10634726 DOI: 10.1101/2023.10.22.563498] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2023]
Abstract
Recent studies have highlighted the impact of both transcription and transcripts on 3D genome organization, particularly its dynamics. Here, we propose a deep learning framework, called AkitaR, that leverages both genome sequences and genome-wide RNA-DNA interactions to investigate the roles of chromatin-associated RNAs (caRNAs) on genome folding in HFFc6 cells. In order to disentangle the cis- and trans-regulatory roles of caRNAs, we compared models with nascent transcripts, trans-located caRNAs, open chromatin data, or DNA sequence alone. Both nascent transcripts and trans-located caRNAs improved the models' predictions, especially at cell-type-specific genomic regions. Analyses of feature importance scores revealed the contribution of caRNAs at TAD boundaries, chromatin loops and nuclear sub-structures such as nuclear speckles and nucleoli to the models' predictions. Furthermore, we identified non-coding RNAs (ncRNAs) known to regulate chromatin structures, such as MALAT1 and NEAT1, as well as several novel RNAs, RNY5, RPPH1, POLG-DT and THBS1-IT, that might modulate chromatin architecture through trans-interactions in HFFc6. Our modeling also suggests that transcripts from Alus and other repetitive elements may facilitate chromatin interactions through trans R-loop formation. Our findings provide new insights and generate testable hypotheses about the roles of caRNAs in shaping chromatin organization.
Affiliation(s)
- Shuzhen Kuang
  - Gladstone Institute of Data Science and Biotechnology, San Francisco, CA
- Katherine S. Pollard
  - Gladstone Institute of Data Science and Biotechnology, San Francisco, CA
  - Department of Epidemiology & Biostatistics, University of California, San Francisco, CA
  - Chan Zuckerberg Biohub, San Francisco, CA, USA
9
Gunsalus LM, Keiser MJ, Pollard KS. In silico discovery of repetitive elements as key sequence determinants of 3D genome folding. Cell Genom 2023; 3:100410. [PMID: 37868032 PMCID: PMC10589630 DOI: 10.1016/j.xgen.2023.100410] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 10/12/2022] [Revised: 11/08/2022] [Accepted: 08/31/2023] [Indexed: 10/24/2023]
Abstract
Natural and experimental genetic variants can modify DNA loops and insulating boundaries to tune transcription, but it is unknown how sequence perturbations affect chromatin organization genome wide. We developed a deep-learning strategy to quantify the effect of any insertion, deletion, or substitution on chromatin contacts and systematically scored millions of synthetic variants. While most genetic manipulations have little impact, regions with CTCF motifs and active transcription are highly sensitive, as expected. Our unbiased screen and subsequent targeted experiments also point to noncoding RNA genes and several families of repetitive elements as CTCF-motif-free DNA sequences with particularly large effects on nearby chromatin interactions, sometimes exceeding the effects of CTCF sites and explaining interactions that lack CTCF. We anticipate that our disruption tracks may be of broad interest and utility as a measure of 3D genome sensitivity, and our computational strategies may serve as a template for biological inquiry with deep learning.
Affiliation(s)
- Laura M. Gunsalus
  - Gladstone Institutes, San Francisco, CA, USA
  - Institute for Neurodegenerative Diseases, University of California, San Francisco, San Francisco, CA, USA
- Michael J. Keiser
  - Institute for Neurodegenerative Diseases, University of California, San Francisco, San Francisco, CA, USA
  - Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, CA, USA
  - Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA, USA
  - Kavli Institute for Fundamental Neuroscience, University of California, San Francisco, San Francisco, CA, USA
  - Department of Pharmaceutical Chemistry, University of California, San Francisco, San Francisco, CA, USA
- Katherine S. Pollard
  - Gladstone Institutes, San Francisco, CA, USA
  - Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, CA, USA
  - Chan Zuckerberg Biohub, San Francisco, CA, USA
  - Department of Epidemiology & Biostatistics, University of California, San Francisco, San Francisco, CA, USA
10
Shi ZJ, Nayfach S, Pollard KS. Maast: genotyping thousands of microbial strains efficiently. Genome Biol 2023; 24:186. [PMID: 37563669 PMCID: PMC10416524 DOI: 10.1186/s13059-023-03030-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/07/2022] [Accepted: 07/31/2023] [Indexed: 08/12/2023]
Abstract
Existing single nucleotide polymorphism (SNP) genotyping algorithms do not scale for species with thousands of sequenced strains, nor do they account for conspecific redundancy. Here we present a bioinformatics tool, Maast, which empowers population genetic meta-analysis of microbes at an unrivaled scale. Maast implements a novel algorithm to heuristically identify a minimal set of diverse conspecific genomes, then constructs a reliable SNP panel for each species, and enables rapid and accurate genotyping using a hybrid of whole-genome alignment and k-mer exact matching. We demonstrate Maast's utility by genotyping thousands of Helicobacter pylori strains and tracking SARS-CoV-2 diversification.
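The k-mer exact-matching idea mentioned in the abstract can be illustrated with a minimal sketch. This is not Maast's implementation: the function names (`build_panel_index`, `genotype`), the panel layout, and the `min_count` threshold are all assumptions made for illustration, and the sketch ignores reverse-complement reads and k-mers shared between loci.

```python
from collections import Counter


def kmers(seq, k):
    """Yield every substring of length k in seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def build_panel_index(snps, k):
    """Map each k-mer that overlaps a SNP to (snp_id, allele).

    snps: dict of snp_id -> (left_flank, ref_allele, alt_allele, right_flank).
    Hypothetical layout; flanks are assumed to hold at least k - 1 bases, k >= 2.
    """
    index = {}
    for snp_id, (left, ref, alt, right) in snps.items():
        for allele in (ref, alt):
            # Window of length 2k - 1 centred on the SNP: every k-mer in it
            # covers the variable position.
            window = left[-(k - 1):] + allele + right[:k - 1]
            for kmer in kmers(window, k):
                index[kmer] = (snp_id, allele)
    return index


def genotype(reads, index, k, min_count=2):
    """Call the best-supported allele per SNP from exact k-mer hits."""
    counts = Counter()
    for read in reads:
        for kmer in kmers(read, k):
            hit = index.get(kmer)
            if hit is not None:
                counts[hit] += 1
    best = {}
    for (snp_id, allele), n in counts.items():
        if n >= min_count and n > best.get(snp_id, (None, 0))[1]:
            best[snp_id] = (allele, n)
    return {snp_id: allele for snp_id, (allele, _) in best.items()}


# Toy panel with one SNP (A/G) and two reads carrying the G allele.
panel = {"snp1": ("ACGTAC", "A", "G", "TTGCAA")}
index = build_panel_index(panel, k=5)
calls = genotype(["CGTACGTTGCA", "TACGTTGCAA"], index, k=5)
print(calls)
```

Exact matching avoids aligning each read, which is why lookups in a precomputed k-mer table can be fast; the trade-off is that any sequencing error inside a k-mer causes it to be missed.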
Affiliation(s)
- Zhou Jason Shi
  - Chan Zuckerberg Biohub, San Francisco, CA, USA
  - Gladstone Institutes of Data Science and Biotechnology, San Francisco, CA, USA
- Stephen Nayfach
  - Joint Genome Institute, Department of Energy, Walnut Creek, CA, USA
  - Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Katherine S Pollard
  - Chan Zuckerberg Biohub, San Francisco, CA, USA
  - Gladstone Institutes of Data Science and Biotechnology, San Francisco, CA, USA
  - Department of Epidemiology and Biostatistics, University of California San Francisco, San Francisco, CA, USA
11
Dekker J, Alber F, Aufmkolk S, Beliveau BJ, Bruneau BG, Belmont AS, Bintu L, Boettiger A, Calandrelli R, Disteche CM, Gilbert DM, Gregor T, Hansen AS, Huang B, Huangfu D, Kalhor R, Leslie CS, Li W, Li Y, Ma J, Noble WS, Park PJ, Phillips-Cremins JE, Pollard KS, Rafelski SM, Ren B, Ruan Y, Shav-Tal Y, Shen Y, Shendure J, Shu X, Strambio-De-Castillia C, Vertii A, Zhang H, Zhong S. Spatial and temporal organization of the genome: Current state and future aims of the 4D nucleome project. Mol Cell 2023; 83:2624-2640. [PMID: 37419111 PMCID: PMC10528254 DOI: 10.1016/j.molcel.2023.06.018] [Citation(s) in RCA: 10] [Impact Index Per Article: 10.0] [Received: 04/07/2023] [Revised: 06/10/2023] [Accepted: 06/12/2023] [Indexed: 07/09/2023]
Abstract
The four-dimensional nucleome (4DN) consortium studies the architecture of the genome and the nucleus in space and time. We summarize progress by the consortium and highlight the development of technologies for (1) mapping genome folding and identifying roles of nuclear components and bodies, proteins, and RNA, (2) characterizing nuclear organization with time or single-cell resolution, and (3) imaging of nuclear organization. With these tools, the consortium has provided over 2,000 public datasets. Integrative computational models based on these data are starting to reveal connections between genome structure and function. We then present a forward-looking perspective and outline current aims to (1) delineate dynamics of nuclear architecture at different timescales, from minutes to weeks as cells differentiate, in populations and in single cells, (2) characterize cis-determinants and trans-modulators of genome organization, (3) test functional consequences of changes in cis- and trans-regulators, and (4) develop predictive models of genome structure and function.
Affiliation(s)
- Job Dekker
  - University of Massachusetts Chan Medical School, Boston, MA, USA; Howard Hughes Medical Institute, Chevy Chase, MD, USA
- Frank Alber
  - University of California, Los Angeles, Los Angeles, CA, USA
- Benoit G Bruneau
  - Gladstone Institutes, San Francisco, CA, USA; University of California, San Francisco, San Francisco, CA, USA
- Bo Huang
  - University of California, San Francisco, San Francisco, CA, USA
- Danwei Huangfu
  - Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Reza Kalhor
  - Johns Hopkins University, Baltimore, MD, USA
- Wenbo Li
  - University of Texas Health Science Center at Houston, Houston, TX, USA
- Yun Li
  - University of North Carolina, Gillings School of Global Public Health, Chapel Hill, NC, USA
- Jian Ma
  - Carnegie Mellon University, Pittsburgh, PA, USA
- Katherine S Pollard
  - Gladstone Institutes, San Francisco, CA, USA; University of California, San Francisco, San Francisco, CA, USA; Chan Zuckerberg Biohub, San Francisco, CA, USA
- Bing Ren
  - University of California, San Diego, La Jolla, CA, USA
- Yijun Ruan
  - Zhejiang University, Hangzhou, Zhejiang, China
- Yin Shen
  - University of California, San Francisco, San Francisco, CA, USA
- Xiaokun Shu
  - University of California, San Francisco, San Francisco, CA, USA
- Sheng Zhong
  - University of California, San Diego, La Jolla, CA, USA
12
Jin X, Yu FB, Yan J, Weakley AM, Dubinkina V, Meng X, Pollard KS. Culturing of a complex gut microbial community in mucin-hydrogel carriers reveals strain- and gene-associated spatial organization. Nat Commun 2023; 14:3510. [PMID: 37316519 PMCID: PMC10267222 DOI: 10.1038/s41467-023-39121-0] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 01/27/2023] [Accepted: 05/26/2023] [Indexed: 06/16/2023] Open
Abstract
Microbial community function depends on both taxonomic composition and spatial organization. While composition of the human gut microbiome has been deeply characterized, less is known about the organization of microbes between regions such as lumen and mucosa and the microbial genes regulating this organization. Using a defined 117 strain community for which we generate high-quality genome assemblies, we model mucosa/lumen organization with in vitro cultures incorporating mucin hydrogel carriers as surfaces for bacterial attachment. Metagenomic tracking of carrier cultures reveals increased diversity and strain-specific spatial organization, with distinct strains enriched on carriers versus liquid supernatant, mirroring mucosa/lumen enrichment in vivo. A comprehensive search for microbial genes associated with this spatial organization identifies candidates with known adhesion-related functions, as well as novel links. These findings demonstrate that carrier cultures of defined communities effectively recapitulate fundamental aspects of gut spatial organization, enabling identification of key microbial strains and genes.
Affiliation(s)
- Xiaofan Jin
  - Gladstone Institutes, San Francisco, CA, USA
- Jia Yan
  - Chan-Zuckerberg Biohub, San Francisco, CA, USA
- Xiandong Meng
  - Sarafan ChEM-H Institute, Stanford University, Stanford, CA, USA
- Katherine S Pollard
  - Gladstone Institutes, San Francisco, CA, USA
  - Chan-Zuckerberg Biohub, San Francisco, CA, USA
  - University of California San Francisco, San Francisco, CA, USA
13
Yu M, Harper AR, Aguirre M, Pittman M, Tcheandjieu C, Amgalan D, Grace C, Goel A, Farrall M, Xiao K, Engreitz J, Pollard KS, Watkins H, Priest JR. Genetic Determinants of the Interventricular Septum Are Linked to Ventricular Septal Defects and Hypertrophic Cardiomyopathy. Circ Genom Precis Med 2023; 16:207-215. [PMID: 37017090 PMCID: PMC10293084 DOI: 10.1161/circgen.122.003708] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/08/2022] [Accepted: 01/06/2023] [Indexed: 04/06/2023]
Abstract
BACKGROUND: A large proportion of genetic risk remains unexplained for structural heart disease involving the interventricular septum (IVS), including hypertrophic cardiomyopathy and ventricular septal defects. This study sought to develop a reproducible proxy of IVS structure from standard medical imaging, discover novel genetic determinants of IVS structure, and relate these loci to diseases of the IVS, hypertrophic cardiomyopathy, and ventricular septal defect.
METHODS: We estimated the cross-sectional area of the IVS from the 4-chamber view of cardiac magnetic resonance imaging in 32,219 individuals from the UK Biobank, which was used as the basis of genome-wide association studies and Mendelian randomization.
RESULTS: Measures of IVS cross-sectional area at diastole were a strong proxy for the 3-dimensional volume of the IVS (Pearson r=0.814, P=0.004), and correlated with anthropometric measures, blood pressure, and diagnostic codes related to cardiovascular physiology. Seven loci with clear genomic consequence and relevance to cardiovascular biology were uncovered by genome-wide association studies, most notably a single nucleotide polymorphism in an intron of CDKN1A (rs2376620; β, 7.7 mm² [95% CI, 5.8-11.0]; P=6.0×10⁻¹⁰), and a common inversion incorporating KANSL1 predicted to disrupt local chromatin structure (β, 8.4 mm² [95% CI, 6.3-10.9]; P=4.2×10⁻¹⁴). Mendelian randomization suggested that inheritance of larger IVS cross-sectional area at diastole was strongly associated with hypertrophic cardiomyopathy risk (pIVW=4.6×10⁻¹⁰), while inheritance of smaller IVS cross-sectional area at diastole was associated with risk for ventricular septal defect (pIVW=0.007).
CONCLUSIONS: Automated estimates of cross-sectional area of the IVS support discovery of novel loci related to cardiac development and Mendelian disease. Inheritance of genetic liability for either small or large IVS appears to confer risk for ventricular septal defect or hypertrophic cardiomyopathy, respectively. These data suggest that a proportion of risk for structural and congenital heart disease can be localized to the common genetic determinants of size and shape of cardiovascular anatomy.
Affiliation(s)
- Mengyao Yu
  - Dept of Pediatrics, Division of Pediatric Cardiology, Division of Cardiovascular Medicine, Stanford Univ School of Medicine
  - Stanford Cardiovascular Institute, Stanford Univ, Stanford, CA
- Andrew R. Harper
  - Radcliffe Dept of Medicine, Univ of Oxford, Division of Cardiovascular Medicine, John Radcliffe Hospital
  - Wellcome Centre for Human Genetics, Roosevelt Drive, Oxford
  - Centre for Genomics Research, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca, Cambridge, United Kingdom
- Matthew Aguirre
  - Dept of Pediatrics, Division of Pediatric Cardiology, Division of Cardiovascular Medicine, Stanford Univ School of Medicine
  - Dept of Biomedical Data Science, Stanford Medical School, Stanford
- Maureen Pittman
  - Univ of California, San Francisco, San Francisco
  - Gladstone Institute of Data Science & Biotechnology, San Francisco
- Catherine Tcheandjieu
  - Dept of Pediatrics, Division of Pediatric Cardiology, Division of Cardiovascular Medicine, Stanford Univ School of Medicine
  - Stanford Cardiovascular Institute, Stanford Univ, Stanford, CA
  - Dept of Medicine, Division of Cardiovascular Medicine, Stanford Univ School of Medicine
- Dulguun Amgalan
  - Dept of Genetics, Stanford Univ, Stanford, CA
  - Basic Sciences and Engineering Initiative, Betty Irene Moore Children’s Heart Center, Lucile Packard Children’s Hospital, Stanford, CA
- Christopher Grace
  - Radcliffe Dept of Medicine, Univ of Oxford, Division of Cardiovascular Medicine, John Radcliffe Hospital
  - Wellcome Centre for Human Genetics, Roosevelt Drive, Oxford
- Anuj Goel
  - Radcliffe Dept of Medicine, Univ of Oxford, Division of Cardiovascular Medicine, John Radcliffe Hospital
  - Wellcome Centre for Human Genetics, Roosevelt Drive, Oxford
- Martin Farrall
  - Radcliffe Dept of Medicine, Univ of Oxford, Division of Cardiovascular Medicine, John Radcliffe Hospital
  - Wellcome Centre for Human Genetics, Roosevelt Drive, Oxford
- Ke Xiao
  - College of Information & Computer Sciences at Univ of Massachusetts Amherst, Amherst, MA
- Jesse Engreitz
  - Dept of Genetics, Stanford Univ, Stanford, CA
  - Basic Sciences and Engineering Initiative, Betty Irene Moore Children’s Heart Center, Lucile Packard Children’s Hospital, Stanford, CA
- Katherine S. Pollard
  - Univ of California, San Francisco, San Francisco
  - Gladstone Institute of Data Science & Biotechnology, San Francisco
  - Chan-Zuckerberg Biohub
- Hugh Watkins
  - Radcliffe Dept of Medicine, Univ of Oxford, Division of Cardiovascular Medicine, John Radcliffe Hospital
  - Wellcome Centre for Human Genetics, Roosevelt Drive, Oxford
- James R. Priest
  - Dept of Pediatrics, Division of Pediatric Cardiology, Division of Cardiovascular Medicine, Stanford Univ School of Medicine
  - Stanford Cardiovascular Institute, Stanford Univ, Stanford, CA
  - Chan-Zuckerberg Biohub
  - Current affiliation: Tenaya Therapeutics, South San Francisco, CA
14
Gunsalus LM, McArthur E, Gjoni K, Kuang S, Pittman M, Capra JA, Pollard KS. Comparing chromatin contact maps at scale: methods and insights. Res Sq 2023:rs.3.rs-2842981. [PMID: 37292728 PMCID: PMC10246266 DOI: 10.21203/rs.3.rs-2842981/v1] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 06/10/2023]
Abstract
Comparing chromatin contact maps is an essential step in quantifying how three-dimensional (3D) genome organization shapes development, evolution, and disease. However, no gold standard exists for comparing contact maps, and even simple methods often disagree. In this study, we propose novel comparison methods and evaluate them alongside existing approaches using genome-wide Hi-C data and 22,500 in silico predicted contact maps. We also quantify the robustness of methods to common sources of biological and technical variation, such as boundary size and noise. We find that simple difference-based methods such as mean squared error are suitable for initial screening, but biologically informed methods are necessary to identify why maps diverge and propose specific functional hypotheses. We provide a reference guide, codebase, and benchmark for rapidly comparing chromatin contact maps at scale to enable biological insights into the 3D organization of the genome.
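As a rough illustration of the "simple difference-based" screening the abstract mentions, the sketch below computes the mean squared error between two equally sized contact matrices and ranks candidate maps by their divergence from a reference. The function names (`mean_squared_error`, `rank_by_divergence`) and the toy matrices are assumptions for illustration; this is not the authors' codebase.

```python
def mean_squared_error(map_a, map_b):
    """Mean squared difference between two contact maps given as nested lists."""
    if len(map_a) != len(map_b) or any(
        len(row_a) != len(row_b) for row_a, row_b in zip(map_a, map_b)
    ):
        raise ValueError("contact maps must have the same shape")
    total = 0.0
    n = 0
    for row_a, row_b in zip(map_a, map_b):
        for a, b in zip(row_a, row_b):
            total += (a - b) ** 2
            n += 1
    if n == 0:
        raise ValueError("contact maps must not be empty")
    return total / n


def rank_by_divergence(reference, candidates):
    """Return candidate names sorted from most to least divergent by MSE."""
    scores = {
        name: mean_squared_error(reference, cmap)
        for name, cmap in candidates.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Toy 3x3 symmetric maps: the variant changes one pair of long-range contacts.
wild_type = [[1.0, 0.5, 0.1], [0.5, 1.0, 0.4], [0.1, 0.4, 1.0]]
variant = [[1.0, 0.5, 0.3], [0.5, 1.0, 0.4], [0.3, 0.4, 1.0]]

print(mean_squared_error(wild_type, variant))
print(rank_by_divergence(wild_type, {"variant_a": variant, "unchanged": wild_type}))
```

A single scalar like this says how much two maps differ but not where or why, which is consistent with the abstract's point that biologically informed methods are needed after the initial screen.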
Affiliation(s)
- Laura M. Gunsalus
  - Gladstone Institute of Data Science and Biotechnology, San Francisco, CA
  - Department of Epidemiology & Biostatistics, University of California, San Francisco, CA
- Evonne McArthur
  - Department of Epidemiology & Biostatistics, University of California, San Francisco, CA
  - Bakar Computational Health Sciences Institute, University of California, San Francisco, CA
  - Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, TN
- Ketrin Gjoni
  - Gladstone Institute of Data Science and Biotechnology, San Francisco, CA
  - Department of Epidemiology & Biostatistics, University of California, San Francisco, CA
- Shuzhen Kuang
  - Gladstone Institute of Data Science and Biotechnology, San Francisco, CA
  - Department of Epidemiology & Biostatistics, University of California, San Francisco, CA
- Maureen Pittman
  - Gladstone Institute of Data Science and Biotechnology, San Francisco, CA
  - Department of Epidemiology & Biostatistics, University of California, San Francisco, CA
- John A. Capra
  - Department of Epidemiology & Biostatistics, University of California, San Francisco, CA
  - Bakar Computational Health Sciences Institute, University of California, San Francisco, CA
- Katherine S. Pollard
  - Gladstone Institute of Data Science and Biotechnology, San Francisco, CA
  - Department of Epidemiology & Biostatistics, University of California, San Francisco, CA
  - Bakar Computational Health Sciences Institute, University of California, San Francisco, CA
  - Chan Zuckerberg Biohub, San Francisco, CA, USA
15
Christmas MJ, Kaplow IM, Genereux DP, Dong MX, Hughes GM, Li X, Sullivan PF, Hindle AG, Andrews G, Armstrong JC, Bianchi M, Breit AM, Diekhans M, Fanter C, Foley NM, Goodman DB, Goodman L, Keough KC, Kirilenko B, Kowalczyk A, Lawless C, Lind AL, Meadows JRS, Moreira LR, Redlich RW, Ryan L, Swofford R, Valenzuela A, Wagner F, Wallerman O, Brown AR, Damas J, Fan K, Gatesy J, Grimshaw J, Johnson J, Kozyrev SV, Lawler AJ, Marinescu VD, Morrill KM, Osmanski A, Paulat NS, Phan BN, Reilly SK, Schäffer DE, Steiner C, Supple MA, Wilder AP, Wirthlin ME, Xue JR, Birren BW, Gazal S, Hubley RM, Koepfli KP, Marques-Bonet T, Meyer WK, Nweeia M, Sabeti PC, Shapiro B, Smit AFA, Springer MS, Teeling EC, Weng Z, Hiller M, Levesque DL, Lewin HA, Murphy WJ, Navarro A, Paten B, Pollard KS, Ray DA, Ruf I, Ryder OA, Pfenning AR, Lindblad-Toh K, Karlsson EK, Andrews G, Armstrong JC, Bianchi M, Birren BW, Bredemeyer KR, Breit AM, Christmas MJ, Clawson H, Damas J, Di Palma F, Diekhans M, Dong MX, Eizirik E, Fan K, Fanter C, Foley NM, Forsberg-Nilsson K, Garcia CJ, Gatesy J, Gazal S, Genereux DP, Goodman L, Grimshaw J, Halsey MK, Harris AJ, Hickey G, Hiller M, Hindle AG, Hubley RM, Hughes GM, Johnson J, Juan D, Kaplow IM, Karlsson EK, Keough KC, Kirilenko B, Koepfli KP, Korstian JM, Kowalczyk A, Kozyrev SV, Lawler AJ, Lawless C, Lehmann T, Levesque DL, Lewin HA, Li X, Lind A, Lindblad-Toh K, Mackay-Smith A, Marinescu VD, Marques-Bonet T, Mason VC, Meadows JRS, Meyer WK, Moore JE, Moreira LR, Moreno-Santillan DD, Morrill KM, Muntané G, Murphy WJ, Navarro A, Nweeia M, Ortmann S, Osmanski A, Paten B, Paulat NS, Pfenning AR, Phan BN, Pollard KS, Pratt HE, Ray DA, Reilly SK, Rosen JR, Ruf I, Ryan L, Ryder OA, Sabeti PC, Schäffer DE, Serres A, Shapiro B, Smit AFA, Springer M, Srinivasan C, Steiner C, Storer JM, Sullivan KAM, Sullivan PF, Sundström E, Supple MA, Swofford R, Talbot JE, Teeling E, Turner-Maier J, Valenzuela A, Wagner F, Wallerman O, Wang C, Wang J, Weng Z, Wilder AP, Wirthlin ME, Xue JR, Zhang X. Evolutionary constraint and innovation across hundreds of placental mammals. Science 2023. [PMID: 37104599 DOI: 10.1126/science.abn3943] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 05/15/2023]
Abstract
Zoonomia is the largest comparative genomics resource for mammals produced to date. By aligning genomes for 240 species, we identify bases that, when mutated, are likely to affect fitness and alter disease risk. At least 332 million bases (~10.7%) in the human genome are unusually conserved across species (evolutionarily constrained) relative to neutrally evolving repeats, and 4552 ultraconserved elements are nearly perfectly conserved. Of 101 million significantly constrained single bases, 80% are outside protein-coding exons and half have no functional annotations in the Encyclopedia of DNA Elements (ENCODE) resource. Changes in genes and regulatory elements are associated with exceptional mammalian traits, such as hibernation, that could inform therapeutic development. Earth's vast and imperiled biodiversity offers distinctive power for identifying genetic variants that affect genome function and organismal phenotypes.
Affiliation(s)
- Matthew J Christmas
- Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, 751 32 Uppsala, Sweden
| | - Irene M Kaplow
- Department of Computational Biology, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA 15213, USA
| | | | - Michael X Dong
- Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, 751 32 Uppsala, Sweden
| | - Graham M Hughes
- School of Biology and Environmental Science, University College Dublin, Belfield, Dublin 4, Ireland
| | - Xue Li
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
- Morningside Graduate School of Biomedical Sciences, UMass Chan Medical School, Worcester, MA 01605, USA
- Program in Bioinformatics and Integrative Biology, UMass Chan Medical School, Worcester, MA 01605, USA
| | - Patrick F Sullivan
- Department of Genetics, University of North Carolina Medical School, Chapel Hill, NC 27599, USA
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
| | - Allyson G Hindle
- School of Life Sciences, University of Nevada Las Vegas, Las Vegas, NV 89154, USA
| | - Gregory Andrews
- Program in Bioinformatics and Integrative Biology, UMass Chan Medical School, Worcester, MA 01605, USA
| | - Joel C Armstrong
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Matteo Bianchi
- Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, 751 32 Uppsala, Sweden
| | - Ana M Breit
- School of Biology and Ecology, University of Maine, Orono, ME 04469, USA
| | - Mark Diekhans
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Cornelia Fanter
- School of Life Sciences, University of Nevada Las Vegas, Las Vegas, NV 89154, USA
| | - Nicole M Foley
- Veterinary Integrative Biosciences, Texas A&M University, College Station, TX 77843, USA
| | - Daniel B Goodman
- Department of Microbiology and Immunology, University of California San Francisco, San Francisco, CA 94143, USA
| | | | - Kathleen C Keough
- Fauna Bio, Inc., Emeryville, CA 94608, USA
- Department of Epidemiology and Biostatistics, University of California San Francisco, San Francisco, CA 94158, USA
- Gladstone Institutes, San Francisco, CA 94158, USA
| | - Bogdan Kirilenko
- Faculty of Biosciences, Goethe-University, 60438 Frankfurt, Germany
- LOEWE Centre for Translational Biodiversity Genomics, 60325 Frankfurt, Germany
- Senckenberg Research Institute, 60325 Frankfurt, Germany
| | - Amanda Kowalczyk
- Department of Computational Biology, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA 15213, USA
| | - Colleen Lawless
- School of Biology and Environmental Science, University College Dublin, Belfield, Dublin 4, Ireland
| | - Abigail L Lind
- Department of Epidemiology and Biostatistics, University of California San Francisco, San Francisco, CA 94158, USA
- Gladstone Institutes, San Francisco, CA 94158, USA
| | - Jennifer R S Meadows
- Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, 751 32 Uppsala, Sweden
| | - Lucas R Moreira
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
- Program in Bioinformatics and Integrative Biology, UMass Chan Medical School, Worcester, MA 01605, USA
| | - Ruby W Redlich
- Department of Biological Sciences, Mellon College of Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
| | - Louise Ryan
- School of Biology and Environmental Science, University College Dublin, Belfield, Dublin 4, Ireland
| | - Ross Swofford
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
| | - Alejandro Valenzuela
- Department of Experimental and Health Sciences, Institute of Evolutionary Biology (UPF-CSIC), Universitat Pompeu Fabra, 08003 Barcelona, Spain
| | - Franziska Wagner
- Museum of Zoology, Senckenberg Natural History Collections Dresden, 01109 Dresden, Germany
| | - Ola Wallerman
- Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, 751 32 Uppsala, Sweden
| | - Ashley R Brown
- Department of Computational Biology, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA 15213, USA
| | - Joana Damas
- The Genome Center, University of California Davis, Davis, CA 95616, USA
| | - Kaili Fan
- Program in Bioinformatics and Integrative Biology, UMass Chan Medical School, Worcester, MA 01605, USA
| | - John Gatesy
- Division of Vertebrate Zoology, American Museum of Natural History, New York, NY 10024, USA
| | - Jenna Grimshaw
- Department of Biological Sciences, Texas Tech University, Lubbock, TX 79409, USA
| | - Jeremy Johnson
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
| | - Sergey V Kozyrev
- Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, 751 32 Uppsala, Sweden
| | - Alyssa J Lawler
- Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
- Department of Biological Sciences, Mellon College of Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
| | - Voichita D Marinescu
- Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, 751 32 Uppsala, Sweden
| | - Kathleen M Morrill
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
- Morningside Graduate School of Biomedical Sciences, UMass Chan Medical School, Worcester, MA 01605, USA
- Program in Bioinformatics and Integrative Biology, UMass Chan Medical School, Worcester, MA 01605, USA
| | - Austin Osmanski
- Medical Scientist Training Program, University of Pittsburgh School of Medicine, Pittsburgh, PA 15261, USA
| | - Nicole S Paulat
- Department of Biological Sciences, Texas Tech University, Lubbock, TX 79409, USA
| | - BaDoi N Phan
- Department of Computational Biology, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Medical Scientist Training Program, University of Pittsburgh School of Medicine, Pittsburgh, PA 15261, USA
| | - Steven K Reilly
- Department of Genetics, Yale School of Medicine, New Haven, CT 06510, USA
| | - Daniel E Schäffer
- Department of Computational Biology, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
| | - Cynthia Steiner
- Conservation Genetics, San Diego Zoo Wildlife Alliance, Escondido, CA 92027, USA
| | - Megan A Supple
- Department of Ecology and Evolutionary Biology, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Aryn P Wilder
- Conservation Genetics, San Diego Zoo Wildlife Alliance, Escondido, CA 92027, USA
| | - Morgan E Wirthlin
- Department of Computational Biology, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Allen Institute for Brain Science, Seattle, WA 98109, USA
| | - James R Xue
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA 02138, USA
| | - Bruce W Birren
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
| | - Steven Gazal
- Keck School of Medicine, University of Southern California, Los Angeles, CA 90033, USA
| | | | - Klaus-Peter Koepfli
- Center for Species Survival, Smithsonian's National Zoo and Conservation Biology Institute, Washington, DC 20008, USA
- Computer Technologies Laboratory, ITMO University, St. Petersburg 197101, Russia
- Smithsonian-Mason School of Conservation, George Mason University, Front Royal, VA 22630, USA
| | - Tomas Marques-Bonet
- Catalan Institution of Research and Advanced Studies (ICREA), 08010 Barcelona, Spain
- CNAG-CRG, Centre for Genomic Regulation, Barcelona Institute of Science and Technology (BIST), 08036 Barcelona, Spain
- Department of Medicine and Life Sciences, Institute of Evolutionary Biology (UPF-CSIC), Universitat Pompeu Fabra, 08003 Barcelona, Spain
- Institut Català de Paleontologia Miquel Crusafont, Universitat Autònoma de Barcelona, 08193 Cerdanyola del Vallès, Barcelona, Spain
| | - Wynn K Meyer
- Department of Biological Sciences, Lehigh University, Bethlehem, PA 18015, USA
| | - Martin Nweeia
- Department of Comprehensive Care, School of Dental Medicine, Case Western Reserve University, Cleveland, OH 44106, USA
- Department of Vertebrate Zoology, Canadian Museum of Nature, Ottawa, Ontario K2P 2R1, Canada
- Department of Vertebrate Zoology, Smithsonian Institution, Washington, DC 20002, USA
- Narwhal Genome Initiative, Department of Restorative Dentistry and Biomaterials Sciences, Harvard School of Dental Medicine, Boston, MA 02115, USA
| | - Pardis C Sabeti
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA 02138, USA
- Howard Hughes Medical Institute, Harvard University, Cambridge, MA 02138, USA
| | - Beth Shapiro
- Department of Ecology and Evolutionary Biology, University of California Santa Cruz, Santa Cruz, CA 95064, USA
- Howard Hughes Medical Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | | | - Mark S Springer
- Department of Evolution, Ecology and Organismal Biology, University of California Riverside, Riverside, CA 92521, USA
| | - Emma C Teeling
- School of Biology and Environmental Science, University College Dublin, Belfield, Dublin 4, Ireland
| | - Zhiping Weng
- Program in Bioinformatics and Integrative Biology, UMass Chan Medical School, Worcester, MA 01605, USA
| | - Michael Hiller
- Faculty of Biosciences, Goethe-University, 60438 Frankfurt, Germany
- LOEWE Centre for Translational Biodiversity Genomics, 60325 Frankfurt, Germany
- Senckenberg Research Institute, 60325 Frankfurt, Germany
| | | | - Harris A Lewin
- The Genome Center, University of California Davis, Davis, CA 95616, USA
- Department of Evolution and Ecology, University of California Davis, Davis, CA 95616, USA
- John Muir Institute for the Environment, University of California Davis, Davis, CA 95616, USA
| | - William J Murphy
- Veterinary Integrative Biosciences, Texas A&M University, College Station, TX 77843, USA
| | - Arcadi Navarro
- Catalan Institution of Research and Advanced Studies (ICREA), 08010 Barcelona, Spain
- Department of Medicine and Life Sciences, Institute of Evolutionary Biology (UPF-CSIC), Universitat Pompeu Fabra, 08003 Barcelona, Spain
- BarcelonaBeta Brain Research Center, Pasqual Maragall Foundation, 08005 Barcelona, Spain
- CRG, Centre for Genomic Regulation, Barcelona Institute of Science and Technology (BIST), 08003 Barcelona, Spain
| | - Benedict Paten
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Katherine S Pollard
- Department of Epidemiology and Biostatistics, University of California San Francisco, San Francisco, CA 94158, USA
- Gladstone Institutes, San Francisco, CA 94158, USA
- Chan Zuckerberg Biohub, San Francisco, CA 94158, USA
| | - David A Ray
- Department of Biological Sciences, Texas Tech University, Lubbock, TX 79409, USA
| | - Irina Ruf
- Division of Messel Research and Mammalogy, Senckenberg Research Institute and Natural History Museum Frankfurt, 60325 Frankfurt am Main, Germany
| | - Oliver A Ryder
- Conservation Genetics, San Diego Zoo Wildlife Alliance, Escondido, CA 92027, USA
- Department of Evolution, Behavior and Ecology, School of Biological Sciences, University of California San Diego, La Jolla, CA 92039, USA
| | - Andreas R Pfenning
- Department of Computational Biology, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA 15213, USA
| | - Kerstin Lindblad-Toh
- Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, 751 32 Uppsala, Sweden
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
| | - Elinor K Karlsson
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
- Program in Bioinformatics and Integrative Biology, UMass Chan Medical School, Worcester, MA 01605, USA
- Program in Molecular Medicine, UMass Chan Medical School, Worcester, MA 01605, USA
|
16
|
Kaplow IM, Lawler AJ, Schäffer DE, Srinivasan C, Sestili HH, Wirthlin ME, Phan BN, Prasad K, Brown AR, Zhang X, Foley K, Genereux DP, Karlsson EK, Lindblad-Toh K, Meyer WK, Pfenning AR, Andrews G, Armstrong JC, Bianchi M, Birren BW, Bredemeyer KR, Breit AM, Christmas MJ, Clawson H, Damas J, Di Palma F, Diekhans M, Dong MX, Eizirik E, Fan K, Fanter C, Foley NM, Forsberg-Nilsson K, Garcia CJ, Gatesy J, Gazal S, Genereux DP, Goodman L, Grimshaw J, Halsey MK, Harris AJ, Hickey G, Hiller M, Hindle AG, Hubley RM, Hughes GM, Johnson J, Juan D, Kaplow IM, Karlsson EK, Keough KC, Kirilenko B, Koepfli KP, Korstian JM, Kowalczyk A, Kozyrev SV, Lawler AJ, Lawless C, Lehmann T, Levesque DL, Lewin HA, Li X, Lind A, Lindblad-Toh K, Mackay-Smith A, Marinescu VD, Marques-Bonet T, Mason VC, Meadows JRS, Meyer WK, Moore JE, Moreira LR, Moreno-Santillan DD, Morrill KM, Muntané G, Murphy WJ, Navarro A, Nweeia M, Ortmann S, Osmanski A, Paten B, Paulat NS, Pfenning AR, Phan BN, Pollard KS, Pratt HE, Ray DA, Reilly SK, Rosen JR, Ruf I, Ryan L, Ryder OA, Sabeti PC, Schäffer DE, Serres A, Shapiro B, Smit AFA, Springer M, Srinivasan C, Steiner C, Storer JM, Sullivan KAM, Sullivan PF, Sundström E, Supple MA, Swofford R, Talbot JE, Teeling E, Turner-Maier J, Valenzuela A, Wagner F, Wallerman O, Wang C, Wang J, Weng Z, Wilder AP, Wirthlin ME, Xue JR, Zhang X. Relating enhancer genetic variation across mammals to complex phenotypes using machine learning. Science 2023; 380:eabm7993. [PMID: 37104615 DOI: 10.1126/science.abm7993] [Citation(s) in RCA: 11] [Impact Index Per Article: 11.0] [Indexed: 04/29/2023]
Abstract
Protein-coding differences between species often fail to explain phenotypic diversity, suggesting the involvement of genomic elements that regulate gene expression such as enhancers. Identifying associations between enhancers and phenotypes is challenging because enhancer activity can be tissue-dependent and functionally conserved despite low sequence conservation. We developed the Tissue-Aware Conservation Inference Toolkit (TACIT) to associate candidate enhancers with species' phenotypes using predictions from machine learning models trained on specific tissues. Applying TACIT to associate motor cortex and parvalbumin-positive interneuron enhancers with neurological phenotypes revealed dozens of enhancer-phenotype associations, including brain size-associated enhancers that interact with genes implicated in microcephaly or macrocephaly. TACIT provides a foundation for identifying enhancers associated with the evolution of any convergently evolved phenotype in any large group of species with aligned genomes.
Affiliation(s)
- Irene M Kaplow
- Department of Computational Biology, Carnegie Mellon University, Pittsburgh, PA, USA
- Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA, USA
| | - Alyssa J Lawler
- Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA, USA
- Department of Biology, Carnegie Mellon University, Pittsburgh, PA, USA
| | - Daniel E Schäffer
- Department of Computational Biology, Carnegie Mellon University, Pittsburgh, PA, USA
| | - Chaitanya Srinivasan
- Department of Computational Biology, Carnegie Mellon University, Pittsburgh, PA, USA
| | - Heather H Sestili
- Department of Computational Biology, Carnegie Mellon University, Pittsburgh, PA, USA
| | - Morgan E Wirthlin
- Department of Computational Biology, Carnegie Mellon University, Pittsburgh, PA, USA
- Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA, USA
| | - BaDoi N Phan
- Department of Computational Biology, Carnegie Mellon University, Pittsburgh, PA, USA
- Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA, USA
- Medical Scientist Training Program, University of Pittsburgh School of Medicine, Pittsburgh, PA, USA
| | - Kavya Prasad
- Department of Computational Biology, Carnegie Mellon University, Pittsburgh, PA, USA
| | - Ashley R Brown
- Department of Computational Biology, Carnegie Mellon University, Pittsburgh, PA, USA
| | - Xiaomeng Zhang
- Department of Computational Biology, Carnegie Mellon University, Pittsburgh, PA, USA
| | - Kathleen Foley
- Department of Biological Sciences, Lehigh University, Bethlehem, PA, USA
| | - Diane P Genereux
- Broad Institute, Cambridge, MA, USA
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Chan Medical School, Worcester, MA, USA
| | - Elinor K Karlsson
- Broad Institute, Cambridge, MA, USA
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Chan Medical School, Worcester, MA, USA
| | - Kerstin Lindblad-Toh
- Broad Institute, Cambridge, MA, USA
- Science for Life Laboratory, Department of Medical Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden
| | - Wynn K Meyer
- Department of Biological Sciences, Lehigh University, Bethlehem, PA, USA
| | - Andreas R Pfenning
- Department of Computational Biology, Carnegie Mellon University, Pittsburgh, PA, USA
- Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA, USA
- Department of Biology, Carnegie Mellon University, Pittsburgh, PA, USA
|
17
|
Christmas MJ, Kaplow IM, Genereux DP, Dong MX, Hughes GM, Li X, Sullivan PF, Hindle AG, Andrews G, Armstrong JC, Bianchi M, Breit AM, Diekhans M, Fanter C, Foley NM, Goodman DB, Goodman L, Keough KC, Kirilenko B, Kowalczyk A, Lawless C, Lind AL, Meadows JRS, Moreira LR, Redlich RW, Ryan L, Swofford R, Valenzuela A, Wagner F, Wallerman O, Brown AR, Damas J, Fan K, Gatesy J, Grimshaw J, Johnson J, Kozyrev SV, Lawler AJ, Marinescu VD, Morrill KM, Osmanski A, Paulat NS, Phan BN, Reilly SK, Schäffer DE, Steiner C, Supple MA, Wilder AP, Wirthlin ME, Xue JR, Birren BW, Gazal S, Hubley RM, Koepfli KP, Marques-Bonet T, Meyer WK, Nweeia M, Sabeti PC, Shapiro B, Smit AFA, Springer MS, Teeling EC, Weng Z, Hiller M, Levesque DL, Lewin HA, Murphy WJ, Navarro A, Paten B, Pollard KS, Ray DA, Ruf I, Ryder OA, Pfenning AR, Lindblad-Toh K, Karlsson EK. Evolutionary constraint and innovation across hundreds of placental mammals. Science 2023; 380:eabn3943. [PMID: 37104599 PMCID: PMC10250106 DOI: 10.1126/science.abn3943] [Citation(s) in RCA: 34] [Impact Index Per Article: 34.0] [Received: 11/23/2021] [Accepted: 12/16/2022] [Indexed: 04/29/2023]
Abstract
Zoonomia is the largest comparative genomics resource for mammals produced to date. By aligning genomes for 240 species, we identify bases that, when mutated, are likely to affect fitness and alter disease risk. At least 332 million bases (~10.7%) in the human genome are unusually conserved across species (evolutionarily constrained) relative to neutrally evolving repeats, and 4552 ultraconserved elements are nearly perfectly conserved. Of 101 million significantly constrained single bases, 80% are outside protein-coding exons and half have no functional annotations in the Encyclopedia of DNA Elements (ENCODE) resource. Changes in genes and regulatory elements are associated with exceptional mammalian traits, such as hibernation, that could inform therapeutic development. Earth's vast and imperiled biodiversity offers distinctive power for identifying genetic variants that affect genome function and organismal phenotypes.
Affiliation(s)
- Matthew J. Christmas
- Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, 751 32 Uppsala, Sweden
| | - Irene M. Kaplow
- Department of Computational Biology, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA 15213, USA
| | | | - Michael X. Dong
- Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, 751 32 Uppsala, Sweden
| | - Graham M. Hughes
- School of Biology and Environmental Science, University College Dublin, Belfield, Dublin 4, Ireland
| | - Xue Li
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
- Morningside Graduate School of Biomedical Sciences, UMass Chan Medical School, Worcester, MA 01605, USA
- Program in Bioinformatics and Integrative Biology, UMass Chan Medical School, Worcester, MA 01605, USA
| | - Patrick F. Sullivan
- Department of Genetics, University of North Carolina Medical School, Chapel Hill, NC 27599, USA
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
| | - Allyson G. Hindle
- School of Life Sciences, University of Nevada Las Vegas, Las Vegas, NV 89154, USA
| | - Gregory Andrews
- Program in Bioinformatics and Integrative Biology, UMass Chan Medical School, Worcester, MA 01605, USA
| | - Joel C. Armstrong
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Matteo Bianchi
- Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, 751 32 Uppsala, Sweden
| | - Ana M. Breit
- School of Biology and Ecology, University of Maine, Orono, ME 04469, USA
| | - Mark Diekhans
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Cornelia Fanter
- School of Life Sciences, University of Nevada Las Vegas, Las Vegas, NV 89154, USA
| | - Nicole M. Foley
- Veterinary Integrative Biosciences, Texas A&M University, College Station, TX 77843, USA
| | - Daniel B. Goodman
- Department of Microbiology and Immunology, University of California San Francisco, San Francisco, CA 94143, USA
| | | | - Kathleen C. Keough
- Fauna Bio, Inc., Emeryville, CA 94608, USA
- Department of Epidemiology and Biostatistics, University of California San Francisco, San Francisco, CA 94158, USA
- Gladstone Institutes, San Francisco, CA 94158, USA
| | - Bogdan Kirilenko
- Faculty of Biosciences, Goethe-University, 60438 Frankfurt, Germany
- LOEWE Centre for Translational Biodiversity Genomics, 60325 Frankfurt, Germany
- Senckenberg Research Institute, 60325 Frankfurt, Germany
| | - Amanda Kowalczyk
- Department of Computational Biology, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA 15213, USA
| | - Colleen Lawless
- School of Biology and Environmental Science, University College Dublin, Belfield, Dublin 4, Ireland
| | - Abigail L. Lind
- Department of Epidemiology and Biostatistics, University of California San Francisco, San Francisco, CA 94158, USA
- Gladstone Institutes, San Francisco, CA 94158, USA
| | - Jennifer R. S. Meadows
- Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, 751 32 Uppsala, Sweden
| | - Lucas R. Moreira
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
- Program in Bioinformatics and Integrative Biology, UMass Chan Medical School, Worcester, MA 01605, USA
| | - Ruby W. Redlich
- Department of Biological Sciences, Mellon College of Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
| | - Louise Ryan
- School of Biology and Environmental Science, University College Dublin, Belfield, Dublin 4, Ireland
| | - Ross Swofford
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
| | - Alejandro Valenzuela
- Department of Experimental and Health Sciences, Institute of Evolutionary Biology (UPF-CSIC), Universitat Pompeu Fabra, 08003 Barcelona, Spain
| | - Franziska Wagner
- Museum of Zoology, Senckenberg Natural History Collections Dresden, 01109 Dresden, Germany
| | - Ola Wallerman
- Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, 751 32 Uppsala, Sweden
| | - Ashley R. Brown
- Department of Computational Biology, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA 15213, USA
| | - Joana Damas
- The Genome Center, University of California Davis, Davis, CA 95616, USA
| | - Kaili Fan
- Program in Bioinformatics and Integrative Biology, UMass Chan Medical School, Worcester, MA 01605, USA
| | - John Gatesy
- Division of Vertebrate Zoology, American Museum of Natural History, New York, NY 10024, USA
| | - Jenna Grimshaw
- Department of Biological Sciences, Texas Tech University, Lubbock, TX 79409, USA
| | - Jeremy Johnson
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
| | - Sergey V. Kozyrev
- Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, 751 32 Uppsala, Sweden
| | - Alyssa J. Lawler
- Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
- Department of Biological Sciences, Mellon College of Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
| | - Voichita D. Marinescu
- Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, 751 32 Uppsala, Sweden
| | - Kathleen M. Morrill
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
- Morningside Graduate School of Biomedical Sciences, UMass Chan Medical School, Worcester, MA 01605, USA
- Program in Bioinformatics and Integrative Biology, UMass Chan Medical School, Worcester, MA 01605, USA
| | - Austin Osmanski
- Medical Scientist Training Program, University of Pittsburgh School of Medicine, Pittsburgh, PA 15261, USA
| | - Nicole S. Paulat
- Department of Biological Sciences, Texas Tech University, Lubbock, TX 79409, USA
| | - BaDoi N. Phan
- Department of Computational Biology, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Medical Scientist Training Program, University of Pittsburgh School of Medicine, Pittsburgh, PA 15261, USA
| | - Steven K. Reilly
- Department of Genetics, Yale School of Medicine, New Haven, CT 06510, USA
| | - Daniel E. Schäffer
- Department of Computational Biology, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
| | - Cynthia Steiner
- Conservation Genetics, San Diego Zoo Wildlife Alliance, Escondido, CA 92027, USA
| | - Megan A. Supple
- Department of Ecology and Evolutionary Biology, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Aryn P. Wilder
- Conservation Genetics, San Diego Zoo Wildlife Alliance, Escondido, CA 92027, USA
| | - Morgan E. Wirthlin
- Department of Computational Biology, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Allen Institute for Brain Science, Seattle, WA 98109, USA
| | - James R. Xue
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA 02138, USA
| | | | - Bruce W. Birren
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
| | - Steven Gazal
- Keck School of Medicine, University of Southern California, Los Angeles, CA 90033, USA
| | | | - Klaus-Peter Koepfli
- Center for Species Survival, Smithsonian’s National Zoo and Conservation Biology Institute, Washington, DC 20008, USA
- Computer Technologies Laboratory, ITMO University, St. Petersburg 197101, Russia
- Smithsonian-Mason School of Conservation, George Mason University, Front Royal, VA 22630, USA
| | - Tomas Marques-Bonet
- Catalan Institution of Research and Advanced Studies (ICREA), 08010 Barcelona, Spain
- CNAG-CRG, Centre for Genomic Regulation, Barcelona Institute of Science and Technology (BIST), 08036 Barcelona, Spain
- Department of Medicine and Life Sciences, Institute of Evolutionary Biology (UPF-CSIC), Universitat Pompeu Fabra, 08003 Barcelona, Spain
- Institut Català de Paleontologia Miquel Crusafont, Universitat Autònoma de Barcelona, 08193 Cerdanyola del Vallès, Barcelona, Spain
| | - Wynn K. Meyer
- Department of Biological Sciences, Lehigh University, Bethlehem, PA 18015, USA
| | - Martin Nweeia
- Department of Comprehensive Care, School of Dental Medicine, Case Western Reserve University, Cleveland, OH 44106, USA
- Department of Vertebrate Zoology, Canadian Museum of Nature, Ottawa, Ontario K2P 2R1, Canada
- Department of Vertebrate Zoology, Smithsonian Institution, Washington, DC 20002, USA
- Narwhal Genome Initiative, Department of Restorative Dentistry and Biomaterials Sciences, Harvard School of Dental Medicine, Boston, MA 02115, USA
| | - Pardis C. Sabeti
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA 02138, USA
- Howard Hughes Medical Institute, Harvard University, Cambridge, MA 02138, USA
| | - Beth Shapiro
- Department of Ecology and Evolutionary Biology, University of California Santa Cruz, Santa Cruz, CA 95064, USA
- Howard Hughes Medical Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | | | - Mark S. Springer
- Department of Evolution, Ecology and Organismal Biology, University of California Riverside, Riverside, CA 92521, USA
| | - Emma C. Teeling
- School of Biology and Environmental Science, University College Dublin, Belfield, Dublin 4, Ireland
| | - Zhiping Weng
- Program in Bioinformatics and Integrative Biology, UMass Chan Medical School, Worcester, MA 01605, USA
| | - Michael Hiller
- Faculty of Biosciences, Goethe-University, 60438 Frankfurt, Germany
- LOEWE Centre for Translational Biodiversity Genomics, 60325 Frankfurt, Germany
- Senckenberg Research Institute, 60325 Frankfurt, Germany
| | | | - Harris A. Lewin
- The Genome Center, University of California Davis, Davis, CA 95616, USA
- Department of Evolution and Ecology, University of California Davis, Davis, CA 95616, USA
- John Muir Institute for the Environment, University of California Davis, Davis, CA 95616, USA
| | - William J. Murphy
- Veterinary Integrative Biosciences, Texas A&M University, College Station, TX 77843, USA
| | - Arcadi Navarro
- Catalan Institution of Research and Advanced Studies (ICREA), 08010 Barcelona, Spain
- Department of Medicine and Life Sciences, Institute of Evolutionary Biology (UPF-CSIC), Universitat Pompeu Fabra, 08003 Barcelona, Spain
- BarcelonaBeta Brain Research Center, Pasqual Maragall Foundation, 08005 Barcelona, Spain
- CRG, Centre for Genomic Regulation, Barcelona Institute of Science and Technology (BIST), 08003 Barcelona, Spain
| | - Benedict Paten
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Katherine S. Pollard
- Department of Epidemiology and Biostatistics, University of California San Francisco, San Francisco, CA 94158, USA
- Gladstone Institutes, San Francisco, CA 94158, USA
- Chan Zuckerberg Biohub, San Francisco, CA 94158, USA
| | - David A. Ray
- Department of Biological Sciences, Texas Tech University, Lubbock, TX 79409, USA
| | - Irina Ruf
- Division of Messel Research and Mammalogy, Senckenberg Research Institute and Natural History Museum Frankfurt, 60325 Frankfurt am Main, Germany
| | - Oliver A. Ryder
- Conservation Genetics, San Diego Zoo Wildlife Alliance, Escondido, CA 92027, USA
- Department of Evolution, Behavior and Ecology, School of Biological Sciences, University of California San Diego, La Jolla, CA 92039, USA
| | - Andreas R. Pfenning
- Department of Computational Biology, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA 15213, USA
| | - Kerstin Lindblad-Toh
- Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, 751 32 Uppsala, Sweden
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
| | - Elinor K. Karlsson
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
- Program in Bioinformatics and Integrative Biology, UMass Chan Medical School, Worcester, MA 01605, USA
- Program in Molecular Medicine, UMass Chan Medical School, Worcester, MA 01605, USA
|
18
|
Kirilenko BM, Munegowda C, Osipova E, Jebb D, Sharma V, Blumer M, Morales AE, Ahmed AW, Kontopoulos DG, Hilgers L, Lindblad-Toh K, Karlsson EK, Hiller M, Andrews G, Armstrong JC, Bianchi M, Birren BW, Bredemeyer KR, Breit AM, Christmas MJ, Clawson H, Damas J, Di Palma F, Diekhans M, Dong MX, Eizirik E, Fan K, Fanter C, Foley NM, Forsberg-Nilsson K, Garcia CJ, Gatesy J, Gazal S, Genereux DP, Goodman L, Grimshaw J, Halsey MK, Harris AJ, Hickey G, Hiller M, Hindle AG, Hubley RM, Hughes GM, Johnson J, Juan D, Kaplow IM, Karlsson EK, Keough KC, Kirilenko B, Koepfli KP, Korstian JM, Kowalczyk A, Kozyrev SV, Lawler AJ, Lawless C, Lehmann T, Levesque DL, Lewin HA, Li X, Lind A, Lindblad-Toh K, Mackay-Smith A, Marinescu VD, Marques-Bonet T, Mason VC, Meadows JRS, Meyer WK, Moore JE, Moreira LR, Moreno-Santillan DD, Morrill KM, Muntané G, Murphy WJ, Navarro A, Nweeia M, Ortmann S, Osmanski A, Paten B, Paulat NS, Pfenning AR, Phan BN, Pollard KS, Pratt HE, Ray DA, Reilly SK, Rosen JR, Ruf I, Ryan L, Ryder OA, Sabeti PC, Schäffer DE, Serres A, Shapiro B, Smit AFA, Springer M, Srinivasan C, Steiner C, Storer JM, Sullivan KAM, Sullivan PF, Sundström E, Supple MA, Swofford R, Talbot JE, Teeling E, Turner-Maier J, Valenzuela A, Wagner F, Wallerman O, Wang C, Wang J, Weng Z, Wilder AP, Wirthlin ME, Xue JR, Zhang X. Integrating gene annotation with orthology inference at scale. Science 2023; 380:eabn3107. [PMID: 37104600 DOI: 10.1126/science.abn3107] [Citation(s) in RCA: 24] [Impact Index Per Article: 24.0] [Indexed: 04/29/2023]
Abstract
Annotating coding genes and inferring orthologs are two classical challenges in genomics and evolutionary biology that have traditionally been approached separately, limiting scalability. We present TOGA (Tool to infer Orthologs from Genome Alignments), a method that integrates structural gene annotation and orthology inference. TOGA implements a different paradigm to infer orthologous loci, improves ortholog detection and annotation of conserved genes compared with state-of-the-art methods, and handles even highly fragmented assemblies. TOGA scales to hundreds of genomes, which we demonstrate by applying it to 488 placental mammal and 501 bird assemblies, creating the largest comparative gene resources so far. Additionally, TOGA detects gene losses, enables selection screens, and automatically provides a superior measure of mammalian genome quality. TOGA is a powerful and scalable method to annotate and compare genes in the genomic era.
Affiliation(s)
- Bogdan M Kirilenko
- Max Planck Institute of Molecular Cell Biology and Genetics, 01307 Dresden, Germany
- Max Planck Institute for the Physics of Complex Systems, 01187 Dresden, Germany
- Center for Systems Biology Dresden, 01307 Dresden, Germany
- LOEWE Centre for Translational Biodiversity Genomics, 60325 Frankfurt, Germany
- Senckenberg Research Institute, 60325 Frankfurt, Germany
- Goethe University Frankfurt, Faculty of Biosciences, 60438 Frankfurt, Germany
| | - Chetan Munegowda
- Max Planck Institute of Molecular Cell Biology and Genetics, 01307 Dresden, Germany
- Max Planck Institute for the Physics of Complex Systems, 01187 Dresden, Germany
- Center for Systems Biology Dresden, 01307 Dresden, Germany
- LOEWE Centre for Translational Biodiversity Genomics, 60325 Frankfurt, Germany
- Senckenberg Research Institute, 60325 Frankfurt, Germany
- Goethe University Frankfurt, Faculty of Biosciences, 60438 Frankfurt, Germany
| | - Ekaterina Osipova
- Max Planck Institute of Molecular Cell Biology and Genetics, 01307 Dresden, Germany
- Max Planck Institute for the Physics of Complex Systems, 01187 Dresden, Germany
- Center for Systems Biology Dresden, 01307 Dresden, Germany
- LOEWE Centre for Translational Biodiversity Genomics, 60325 Frankfurt, Germany
- Senckenberg Research Institute, 60325 Frankfurt, Germany
- Goethe University Frankfurt, Faculty of Biosciences, 60438 Frankfurt, Germany
| | - David Jebb
- Max Planck Institute of Molecular Cell Biology and Genetics, 01307 Dresden, Germany
- Max Planck Institute for the Physics of Complex Systems, 01187 Dresden, Germany
- Center for Systems Biology Dresden, 01307 Dresden, Germany
| | - Virag Sharma
- Max Planck Institute of Molecular Cell Biology and Genetics, 01307 Dresden, Germany
- Max Planck Institute for the Physics of Complex Systems, 01187 Dresden, Germany
- Center for Systems Biology Dresden, 01307 Dresden, Germany
| | - Moritz Blumer
- Max Planck Institute of Molecular Cell Biology and Genetics, 01307 Dresden, Germany
- Max Planck Institute for the Physics of Complex Systems, 01187 Dresden, Germany
- Center for Systems Biology Dresden, 01307 Dresden, Germany
| | - Ariadna E Morales
- LOEWE Centre for Translational Biodiversity Genomics, 60325 Frankfurt, Germany
- Senckenberg Research Institute, 60325 Frankfurt, Germany
- Goethe University Frankfurt, Faculty of Biosciences, 60438 Frankfurt, Germany
| | - Alexis-Walid Ahmed
- LOEWE Centre for Translational Biodiversity Genomics, 60325 Frankfurt, Germany
- Senckenberg Research Institute, 60325 Frankfurt, Germany
- Goethe University Frankfurt, Faculty of Biosciences, 60438 Frankfurt, Germany
| | - Dimitrios-Georgios Kontopoulos
- LOEWE Centre for Translational Biodiversity Genomics, 60325 Frankfurt, Germany
- Senckenberg Research Institute, 60325 Frankfurt, Germany
- Goethe University Frankfurt, Faculty of Biosciences, 60438 Frankfurt, Germany
| | - Leon Hilgers
- LOEWE Centre for Translational Biodiversity Genomics, 60325 Frankfurt, Germany
- Senckenberg Research Institute, 60325 Frankfurt, Germany
- Goethe University Frankfurt, Faculty of Biosciences, 60438 Frankfurt, Germany
| | - Kerstin Lindblad-Toh
- Science for Life Laboratory, Department of Medical Biochemistry and Microbiology, Uppsala University, 751 32 Uppsala, Sweden
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
| | - Elinor K Karlsson
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
- Program in Bioinformatics and Integrative Biology, UMass Chan Medical School, Worcester, MA 01605, USA
- Program in Molecular Medicine, UMass Chan Medical School, Worcester, MA 01605, USA
| | - Michael Hiller
- Max Planck Institute of Molecular Cell Biology and Genetics, 01307 Dresden, Germany
- Max Planck Institute for the Physics of Complex Systems, 01187 Dresden, Germany
- Center for Systems Biology Dresden, 01307 Dresden, Germany
- LOEWE Centre for Translational Biodiversity Genomics, 60325 Frankfurt, Germany
- Senckenberg Research Institute, 60325 Frankfurt, Germany
- Goethe University Frankfurt, Faculty of Biosciences, 60438 Frankfurt, Germany
|
19
|
Sullivan PF, Meadows JRS, Gazal S, Phan BN, Li X, Genereux DP, Dong MX, Bianchi M, Andrews G, Sakthikumar S, Nordin J, Roy A, Christmas MJ, Marinescu VD, Wang C, Wallerman O, Xue J, Yao S, Sun Q, Szatkiewicz J, Wen J, Huckins LM, Lawler A, Keough KC, Zheng Z, Zeng J, Wray NR, Li Y, Johnson J, Chen J, Paten B, Reilly SK, Hughes GM, Weng Z, Pollard KS, Pfenning AR, Forsberg-Nilsson K, Karlsson EK, Lindblad-Toh K. Leveraging base-pair mammalian constraint to understand genetic variation and human disease. Science 2023; 380:eabn2937. [PMID: 37104612 PMCID: PMC10259825 DOI: 10.1126/science.abn2937] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 11/23/2021] [Accepted: 02/09/2023] [Indexed: 04/29/2023]
Abstract
Thousands of genomic regions have been associated with heritable human diseases, but attempts to elucidate biological mechanisms are impeded by an inability to discern which genomic positions are functionally important. Evolutionary constraint is a powerful predictor of function, agnostic to cell type or disease mechanism. Single-base phyloP scores from 240 mammals identified 3.3% of the human genome as significantly constrained and likely functional. We compared phyloP scores to genome annotation, association studies, copy-number variation, clinical genetics findings, and cancer data. Constrained positions are enriched for variants that explain common disease heritability more than other functional annotations. Our results improve variant annotation but also highlight that the regulatory landscape of the human genome still needs to be further explored and linked to disease.
Affiliation(s)
- Patrick F. Sullivan
- Department of Genetics, University of North Carolina Medical School, Chapel Hill, NC 27599, USA
- Department of Medical Epidemiology and Biostatistics, Karolinska Institute, 17177 Stockholm, Sweden
| | - Jennifer R. S. Meadows
- Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, 75132 Uppsala, Sweden
| | - Steven Gazal
- Department of Population and Public Health Sciences, Keck School of Medicine, University of Southern California, Los Angeles, CA 90033, USA
- Center for Genetic Epidemiology, Keck School of Medicine, University of Southern California, Los Angeles, CA 90033, USA
| | - BaDoi N. Phan
- Department of Computational Biology, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
| | - Xue Li
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, MA 01605, USA
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
| | - Diane P. Genereux
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, MA 01605, USA
| | - Michael X. Dong
- Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, 75132 Uppsala, Sweden
| | - Matteo Bianchi
- Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, 75132 Uppsala, Sweden
| | - Gregory Andrews
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, MA 01605, USA
| | - Sharadha Sakthikumar
- Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, 75132 Uppsala, Sweden
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
| | - Jessika Nordin
- Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, 75132 Uppsala, Sweden
| | - Ananya Roy
- Department of Immunology, Genetics and Pathology, Science for Life Laboratory, Uppsala University, 75185 Uppsala, Sweden
| | - Matthew J. Christmas
- Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, 75132 Uppsala, Sweden
| | - Voichita D. Marinescu
- Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, 75132 Uppsala, Sweden
| | - Chao Wang
- Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, 75132 Uppsala, Sweden
| | - Ola Wallerman
- Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, 75132 Uppsala, Sweden
| | - James Xue
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
- Center for System Biology, Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA 02138, USA
| | - Shuyang Yao
- Department of Medical Epidemiology and Biostatistics, Karolinska Institute, 17177 Stockholm, Sweden
| | - Quan Sun
- Department of Genetics, University of North Carolina Medical School, Chapel Hill, NC 27599, USA
| | - Jin Szatkiewicz
- Department of Genetics, University of North Carolina Medical School, Chapel Hill, NC 27599, USA
| | - Jia Wen
- Department of Genetics, University of North Carolina Medical School, Chapel Hill, NC 27599, USA
| | - Laura M. Huckins
- Department of Genetic and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
| | - Alyssa Lawler
- Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, PA 15213, USA
| | - Kathleen C. Keough
- Gladstone Institutes, San Francisco, CA 94158, USA
- Department of Epidemiology and Biostatistics, University of California, San Francisco, CA 94158, USA
| | - Zhili Zheng
- Institute for Molecular Bioscience, University of Queensland, Brisbane, QLD 4072, Australia
| | - Jian Zeng
- Institute for Molecular Bioscience, University of Queensland, Brisbane, QLD 4072, Australia
| | - Naomi R. Wray
- Institute for Molecular Bioscience, University of Queensland, Brisbane, QLD 4072, Australia
| | - Yun Li
- Department of Genetics, University of North Carolina Medical School, Chapel Hill, NC 27599, USA
| | - Jessica Johnson
- Department of Genetic and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
| | - Jiawen Chen
- Department of Biostatistics, University of North Carolina Medical School, Chapel Hill, NC 27599, USA
| | | | - Benedict Paten
- UC Santa Cruz Genomics Institute, Santa Cruz, CA 95064, USA
| | - Steven K. Reilly
- Department of Genetics, Yale School of Medicine, New Haven, CT 06510, USA
| | - Graham M. Hughes
- School of Biology and Environmental Science, University College Dublin, Belfield, Dublin 4, Ireland
| | - Zhiping Weng
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, MA 01605, USA
| | - Katherine S. Pollard
- Gladstone Institutes, San Francisco, CA 94158, USA
- Department of Epidemiology and Biostatistics, University of California, San Francisco, CA 94158, USA
- Chan Zuckerberg Biohub, San Francisco, CA 94158, USA
| | - Andreas R. Pfenning
- Department of Computational Biology, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA 15213, USA
| | - Karin Forsberg-Nilsson
- Department of Immunology, Genetics and Pathology, Science for Life Laboratory, Uppsala University, 75185 Uppsala, Sweden
- Biodiscovery Institute, University of Nottingham, Nottingham NG7 2RD, UK
| | - Elinor K. Karlsson
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, MA 01605, USA
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
- Program in Molecular Medicine, UMass Chan Medical School, Worcester, MA 01605, USA
| | - Kerstin Lindblad-Toh
- Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, 75132 Uppsala, Sweden
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
|
20
|
Keough KC, Whalen S, Inoue F, Przytycki PF, Fair T, Deng C, Steyert M, Ryu H, Lindblad-Toh K, Karlsson E, Nowakowski T, Ahituv N, Pollen A, Pollard KS. Three-dimensional genome rewiring in loci with human accelerated regions. Science 2023; 380:eabm1696. [PMID: 37104607 PMCID: PMC10999243 DOI: 10.1126/science.abm1696] [Citation(s) in RCA: 12] [Impact Index Per Article: 12.0] [Received: 08/31/2021] [Accepted: 03/01/2023] [Indexed: 04/29/2023]
Abstract
Human accelerated regions (HARs) are conserved genomic loci that evolved at an accelerated rate in the human lineage and may underlie human-specific traits. We generated HARs and chimpanzee accelerated regions with an automated pipeline and an alignment of 241 mammalian genomes. Combining deep learning with chromatin capture experiments in human and chimpanzee neural progenitor cells, we discovered a significant enrichment of HARs in topologically associating domains containing human-specific genomic variants that change three-dimensional (3D) genome organization. Differential gene expression between humans and chimpanzees at these loci suggests rewiring of regulatory interactions between HARs and neurodevelopmental genes. Thus, comparative genomics together with models of 3D genome folding revealed enhancer hijacking as an explanation for the rapid evolution of HARs.
Affiliation(s)
- Kathleen C Keough
- Gladstone Institute of Data Science and Biotechnology, San Francisco, CA, USA
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA, USA
- Institute for Human Genetics, University of California San Francisco, San Francisco, CA, USA
| | - Sean Whalen
- Gladstone Institute of Data Science and Biotechnology, San Francisco, CA, USA
| | - Fumitaka Inoue
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA, USA
- Institute for Human Genetics, University of California San Francisco, San Francisco, CA, USA
| | - Pawel F Przytycki
- Gladstone Institute of Data Science and Biotechnology, San Francisco, CA, USA
| | - Tyler Fair
- Department of Neurological Surgery, University of California San Francisco, San Francisco, CA, USA
- Department of Anatomy, University of California San Francisco, San Francisco, CA, USA
| | - Chengyu Deng
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA, USA
- Institute for Human Genetics, University of California San Francisco, San Francisco, CA, USA
| | - Marilyn Steyert
- Department of Neurological Surgery, University of California San Francisco, San Francisco, CA, USA
- Department of Anatomy, University of California San Francisco, San Francisco, CA, USA
- Department of Psychiatry and Behavioral Sciences, University of California San Francisco, San Francisco, CA, USA
- Eli and Edythe Broad Center for Regeneration Medicine and Stem Cell Research, University of California San Francisco, San Francisco, CA, USA
| | - Hane Ryu
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA, USA
- Institute for Human Genetics, University of California San Francisco, San Francisco, CA, USA
| | - Kerstin Lindblad-Toh
- Science for Life Laboratory, Department of Medical Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Elinor Karlsson
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Program in Bioinformatics and Integrative Biology, UMass Chan Medical School, Worcester, MA, USA
- Program in Molecular Medicine, UMass Chan Medical School, Worcester, MA, USA
| | - Tomasz Nowakowski
- Department of Neurological Surgery, University of California San Francisco, San Francisco, CA, USA
- Department of Anatomy, University of California San Francisco, San Francisco, CA, USA
- Department of Psychiatry and Behavioral Sciences, University of California San Francisco, San Francisco, CA, USA
- Eli and Edythe Broad Center for Regeneration Medicine and Stem Cell Research, University of California San Francisco, San Francisco, CA, USA
| | - Nadav Ahituv
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA, USA
- Institute for Human Genetics, University of California San Francisco, San Francisco, CA, USA
| | - Alex Pollen
- Eli and Edythe Broad Center for Regeneration Medicine and Stem Cell Research, University of California San Francisco, San Francisco, CA, USA
- Department of Neurology, University of California San Francisco, San Francisco, CA, USA
| | - Katherine S Pollard
- Gladstone Institute of Data Science and Biotechnology, San Francisco, CA, USA
- Institute for Human Genetics, University of California San Francisco, San Francisco, CA, USA
- Department of Epidemiology & Biostatistics and Bakar Institute for Computational Health Sciences, University of California San Francisco, San Francisco, CA, USA
- Chan Zuckerberg Biohub, San Francisco, CA, USA
|
21
|
Wilder AP, Supple MA, Subramanian A, Mudide A, Swofford R, Serres-Armero A, Steiner C, Koepfli KP, Genereux DP, Karlsson EK, Lindblad-Toh K, Marques-Bonet T, Munoz Fuentes V, Foley K, Meyer WK, Ryder OA, Shapiro B, Andrews G, Armstrong JC, Bianchi M, Birren BW, Bredemeyer KR, Breit AM, Christmas MJ, Clawson H, Damas J, Di Palma F, Diekhans M, Dong MX, Eizirik E, Fan K, Fanter C, Foley NM, Forsberg-Nilsson K, Garcia CJ, Gatesy J, Gazal S, Genereux DP, Goodman L, Grimshaw J, Halsey MK, Harris AJ, Hickey G, Hiller M, Hindle AG, Hubley RM, Hughes GM, Johnson J, Juan D, Kaplow IM, Karlsson EK, Keough KC, Kirilenko B, Koepfli KP, Korstian JM, Kowalczyk A, Kozyrev SV, Lawler AJ, Lawless C, Lehmann T, Levesque DL, Lewin HA, Li X, Lind A, Lindblad-Toh K, Mackay-Smith A, Marinescu VD, Marques-Bonet T, Mason VC, Meadows JRS, Meyer WK, Moore JE, Moreira LR, Moreno-Santillan DD, Morrill KM, Muntané G, Murphy WJ, Navarro A, Nweeia M, Ortmann S, Osmanski A, Paten B, Paulat NS, Pfenning AR, Phan BN, Pollard KS, Pratt HE, Ray DA, Reilly SK, Rosen JR, Ruf I, Ryan L, Ryder OA, Sabeti PC, Schäffer DE, Serres A, Shapiro B, Smit AFA, Springer M, Srinivasan C, Steiner C, Storer JM, Sullivan KAM, Sullivan PF, Sundström E, Supple MA, Swofford R, Talbot JE, Teeling E, Turner-Maier J, Valenzuela A, Wagner F, Wallerman O, Wang C, Wang J, Weng Z, Wilder AP, Wirthlin ME, Xue JR, Zhang X. The contribution of historical processes to contemporary extinction risk in placental mammals. Science 2023; 380:eabn5856. [PMID: 37104572 PMCID: PMC10184782 DOI: 10.1126/science.abn5856] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Indexed: 04/29/2023]
Abstract
Species persistence can be influenced by the amount, type, and distribution of diversity across the genome, suggesting a potential relationship between historical demography and resilience. In this study, we surveyed genetic variation across single genomes of 240 mammals that compose the Zoonomia alignment to evaluate how historical effective population size (Ne) affects heterozygosity and deleterious genetic load and how these factors may contribute to extinction risk. We find that species with smaller historical Ne carry a proportionally larger burden of deleterious alleles owing to long-term accumulation and fixation of genetic load and have a higher risk of extinction. This suggests that historical demography can inform contemporary resilience. Models that included genomic data were predictive of species' conservation status, suggesting that, in the absence of adequate census or ecological data, genomic information may provide an initial risk assessment.
Affiliation(s)
- Aryn P Wilder
- Conservation Genetics, San Diego Zoo Wildlife Alliance, Escondido, CA 92027, USA
| | - Megan A Supple
- Department of Ecology and Evolutionary Biology, University of California, Santa Cruz, CA 95064, USA
- Howard Hughes Medical Institute, University of California, Santa Cruz, CA 95064, USA
| | | | | | - Ross Swofford
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
| | - Aitor Serres-Armero
- Institute of Evolutionary Biology, Department of Experimental and Health Sciences, Universitat Pompeu Fabra, Barcelona 08003, Spain
| | - Cynthia Steiner
- Conservation Genetics, San Diego Zoo Wildlife Alliance, Escondido, CA 92027, USA
| | - Klaus-Peter Koepfli
- Smithsonian-Mason School of Conservation, George Mason University, Front Royal, VA 22630, USA
- Center for Species Survival, Smithsonian Conservation Biology Institute, National Zoological Park, Washington, DC 30008, USA
- Computer Technologies Laboratory, ITMO University, St. Petersburg 197101, Russia
| | | | - Elinor K Karlsson
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, MA 01605, USA
| | - Kerstin Lindblad-Toh
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
- Science for Life Laboratory, Department of Medical Biochemistry and Microbiology, Uppsala University, Uppsala 751 32, Sweden
| | - Tomas Marques-Bonet
- Institute of Evolutionary Biology, Department of Experimental and Health Sciences, Universitat Pompeu Fabra, Barcelona 08003, Spain
- Catalan Institution of Research and Advanced Studies, Barcelona 08010, Spain
- Centre for Genomic Regulation, Barcelona Institute of Science and Technology, Barcelona 08028, Spain
- Institut Català de Paleontologia Miquel Crusafont, Universitat Autònoma de Barcelona, Barcelona 08193, Spain
| | - Violeta Munoz Fuentes
- European Molecular Biology Laboratory-European Bioinformatics Institute, Wellcome Genome Campus, Hinxton CB10 1SD, UK
| | - Kathleen Foley
- College of Law, University of Iowa, Iowa City, IA 52242, USA
- Department of Biological Sciences, Lehigh University, Bethlehem, PA 18015, USA
| | - Wynn K Meyer
- Department of Biological Sciences, Lehigh University, Bethlehem, PA 18015, USA
| | - Oliver A Ryder
- Conservation Genetics, San Diego Zoo Wildlife Alliance, Escondido, CA 92027, USA
- Department of Evolution, Behavior and Ecology, Division of Biology, University of California, San Diego, La Jolla, CA 92039, USA
| | - Beth Shapiro
- Department of Ecology and Evolutionary Biology, University of California, Santa Cruz, CA 95064, USA
- Howard Hughes Medical Institute, University of California, Santa Cruz, CA 95064, USA
|
22
|
Andrews G, Fan K, Pratt HE, Phalke N, Karlsson EK, Lindblad-Toh K, Gazal S, Moore JE, Weng Z, Andrews G, Armstrong JC, Bianchi M, Birren BW, Bredemeyer KR, Breit AM, Christmas MJ, Clawson H, Damas J, Di Palma F, Diekhans M, Dong MX, Eizirik E, Fan K, Fanter C, Foley NM, Forsberg-Nilsson K, Garcia CJ, Gatesy J, Gazal S, Genereux DP, Goodman L, Grimshaw J, Halsey MK, Harris AJ, Hickey G, Hiller M, Hindle AG, Hubley RM, Hughes GM, Johnson J, Juan D, Kaplow IM, Karlsson EK, Keough KC, Kirilenko B, Koepfli KP, Korstian JM, Kowalczyk A, Kozyrev SV, Lawler AJ, Lawless C, Lehmann T, Levesque DL, Lewin HA, Li X, Lind A, Lindblad-Toh K, Mackay-Smith A, Marinescu VD, Marques-Bonet T, Mason VC, Meadows JRS, Meyer WK, Moore JE, Moreira LR, Moreno-Santillan DD, Morrill KM, Muntané G, Murphy WJ, Navarro A, Nweeia M, Ortmann S, Osmanski A, Paten B, Paulat NS, Pfenning AR, Phan BN, Pollard KS, Pratt HE, Ray DA, Reilly SK, Rosen JR, Ruf I, Ryan L, Ryder OA, Sabeti PC, Schäffer DE, Serres A, Shapiro B, Smit AFA, Springer M, Srinivasan C, Steiner C, Storer JM, Sullivan KAM, Sullivan PF, Sundström E, Supple MA, Swofford R, Talbot JE, Teeling E, Turner-Maier J, Valenzuela A, Wagner F, Wallerman O, Wang C, Wang J, Weng Z, Wilder AP, Wirthlin ME, Xue JR, Zhang X. Mammalian evolution of human cis-regulatory elements and transcription factor binding sites. Science 2023; 380:eabn7930. [PMID: 37104580 DOI: 10.1126/science.abn7930] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Indexed: 04/29/2023]
Abstract
Understanding the regulatory landscape of the human genome is a long-standing objective of modern biology. Using the reference-free alignment across 241 mammalian genomes produced by the Zoonomia Consortium, we charted evolutionary trajectories for 0.92 million human candidate cis-regulatory elements (cCREs) and 15.6 million human transcription factor binding sites (TFBSs). We identified 439,461 cCREs and 2,024,062 TFBSs under evolutionary constraint. Genes near constrained elements perform fundamental cellular processes, whereas genes near primate-specific elements are involved in environmental interaction, including odor perception and immune response. About 20% of TFBSs are transposable element-derived and exhibit intricate patterns of gains and losses during primate evolution whereas sequence variants associated with complex traits are enriched in constrained TFBSs. Our annotations illuminate the regulatory functions of the human genome.
Affiliation(s)
- Gregory Andrews
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Chan Medical School, Worcester, MA, USA
| | - Kaili Fan
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Chan Medical School, Worcester, MA, USA
| | - Henry E Pratt
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Chan Medical School, Worcester, MA, USA
| | - Nishigandha Phalke
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Chan Medical School, Worcester, MA, USA
| | - Elinor K Karlsson
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Chan Medical School, Worcester, MA, USA
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
- Program in Molecular Medicine, UMass Chan Medical School, Worcester, MA 01605, USA
| | - Kerstin Lindblad-Toh
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
- Science for Life Laboratory, Department of Medical Biochemistry and Microbiology, Uppsala University, 75132 Uppsala, Sweden
| | - Steven Gazal
- Department of Population and Public Health Sciences, Keck School of Medicine, University of Southern California, Los Angeles, CA 90033, USA
- Center for Genetic Epidemiology, Keck School of Medicine, University of Southern California, Los Angeles, CA 90033, USA
| | - Jill E Moore
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Chan Medical School, Worcester, MA, USA
| | - Zhiping Weng
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Chan Medical School, Worcester, MA, USA
|
23
|
Foley NM, Mason VC, Harris AJ, Bredemeyer KR, Damas J, Lewin HA, Eizirik E, Gatesy J, Karlsson EK, Lindblad-Toh K, Springer MS, Murphy WJ, Andrews G, Armstrong JC, Bianchi M, Birren BW, Bredemeyer KR, Breit AM, Christmas MJ, Clawson H, Damas J, Di Palma F, Diekhans M, Dong MX, Eizirik E, Fan K, Fanter C, Foley NM, Forsberg-Nilsson K, Garcia CJ, Gatesy J, Gazal S, Genereux DP, Goodman L, Grimshaw J, Halsey MK, Harris AJ, Hickey G, Hiller M, Hindle AG, Hubley RM, Hughes GM, Johnson J, Juan D, Kaplow IM, Karlsson EK, Keough KC, Kirilenko B, Koepfli KP, Korstian JM, Kowalczyk A, Kozyrev SV, Lawler AJ, Lawless C, Lehmann T, Levesque DL, Lewin HA, Li X, Lind A, Lindblad-Toh K, Mackay-Smith A, Marinescu VD, Marques-Bonet T, Mason VC, Meadows JRS, Meyer WK, Moore JE, Moreira LR, Moreno-Santillan DD, Morrill KM, Muntané G, Murphy WJ, Navarro A, Nweeia M, Ortmann S, Osmanski A, Paten B, Paulat NS, Pfenning AR, Phan BN, Pollard KS, Pratt HE, Ray DA, Reilly SK, Rosen JR, Ruf I, Ryan L, Ryder OA, Sabeti PC, Schäffer DE, Serres A, Shapiro B, Smit AFA, Springer M, Srinivasan C, Steiner C, Storer JM, Sullivan KAM, Sullivan PF, Sundström E, Supple MA, Swofford R, Talbot JE, Teeling E, Turner-Maier J, Valenzuela A, Wagner F, Wallerman O, Wang C, Wang J, Weng Z, Wilder AP, Wirthlin ME, Xue JR, Zhang X. A genomic timescale for placental mammal evolution. Science 2023; 380:eabl8189. [PMID: 37104581 DOI: 10.1126/science.abl8189] [Citation(s) in RCA: 21] [Impact Index Per Article: 21.0] [Indexed: 04/29/2023]
Abstract
The precise pattern and timing of speciation events that gave rise to all living placental mammals remain controversial. We provide a comprehensive phylogenetic analysis of genetic variation across an alignment of 241 placental mammal genome assemblies, addressing prior concerns regarding limited genomic sampling across species. We compared neutral genome-wide phylogenomic signals using concatenation and coalescent-based approaches, interrogated phylogenetic variation across chromosomes, and analyzed extensive catalogs of structural variants. Interordinal relationships exhibit relatively low rates of phylogenomic conflict across diverse datasets and analytical methods. Conversely, X-chromosome versus autosome conflicts characterize multiple independent clades that radiated during the Cenozoic. Genomic time trees reveal an accumulation of cladogenic events before and immediately after the Cretaceous-Paleogene (K-Pg) boundary, implying important roles for Cretaceous continental vicariance and the K-Pg extinction in the placental radiation.
Affiliation(s)
- Nicole M Foley
- Veterinary Integrative Biosciences, Texas A&M University, College Station, TX, USA
| | - Victor C Mason
- Institute of Cell Biology, University of Bern, Bern, Switzerland
| | - Andrew J Harris
- Veterinary Integrative Biosciences, Texas A&M University, College Station, TX, USA
- Interdisciplinary Program in Genetics and Genomics, Texas A&M University, College Station, TX, USA
| | - Kevin R Bredemeyer
- Veterinary Integrative Biosciences, Texas A&M University, College Station, TX, USA
- Interdisciplinary Program in Genetics and Genomics, Texas A&M University, College Station, TX, USA
| | - Joana Damas
- The Genome Center, University of California, Davis, CA, USA
| | - Harris A Lewin
- The Genome Center, University of California, Davis, CA, USA
- Department of Evolution and Ecology, University of California, Davis, CA, USA
| | - Eduardo Eizirik
- School of Health and Life Sciences, Pontifical Catholic University of Rio Grande do Sul, Porto Alegre, Brazil
| | - John Gatesy
- Division of Vertebrate Zoology, American Museum of Natural History, New York, NY, USA
| | - Elinor K Karlsson
- Program in Bioinformatics and Integrative Biology, UMass Chan Medical School, Worcester, MA 01605, USA
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
- Program in Molecular Medicine, University of Massachusetts Chan Medical School, Worcester, MA 01605, USA
| | - Kerstin Lindblad-Toh
- Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
- Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, 751 32 Uppsala, Sweden
| | - Mark S Springer
- Department of Evolution, Ecology, and Organismal Biology, University of California, Riverside, CA, USA
| | - William J Murphy
- Veterinary Integrative Biosciences, Texas A&M University, College Station, TX, USA
- Interdisciplinary Program in Genetics and Genomics, Texas A&M University, College Station, TX, USA
|
24
|
Xue JR, Mackay-Smith A, Mouri K, Garcia MF, Dong MX, Akers JF, Noble M, Li X, Lindblad-Toh K, Karlsson EK, Noonan JP, Capellini TD, Brennand KJ, Tewhey R, Sabeti PC, Reilly SK, Andrews G, Armstrong JC, Bianchi M, Birren BW, Bredemeyer KR, Breit AM, Christmas MJ, Clawson H, Damas J, Di Palma F, Diekhans M, Dong MX, Eizirik E, Fan K, Fanter C, Foley NM, Forsberg-Nilsson K, Garcia CJ, Gatesy J, Gazal S, Genereux DP, Goodman L, Grimshaw J, Halsey MK, Harris AJ, Hickey G, Hiller M, Hindle AG, Hubley RM, Hughes GM, Johnson J, Juan D, Kaplow IM, Karlsson EK, Keough KC, Kirilenko B, Koepfli KP, Korstian JM, Kowalczyk A, Kozyrev SV, Lawler AJ, Lawless C, Lehmann T, Levesque DL, Lewin HA, Li X, Lind A, Lindblad-Toh K, Mackay-Smith A, Marinescu VD, Marques-Bonet T, Mason VC, Meadows JRS, Meyer WK, Moore JE, Moreira LR, Moreno-Santillan DD, Morrill KM, Muntané G, Murphy WJ, Navarro A, Nweeia M, Ortmann S, Osmanski A, Paten B, Paulat NS, Pfenning AR, Phan BN, Pollard KS, Pratt HE, Ray DA, Reilly SK, Rosen JR, Ruf I, Ryan L, Ryder OA, Sabeti PC, Schäffer DE, Serres A, Shapiro B, Smit AFA, Springer M, Srinivasan C, Steiner C, Storer JM, Sullivan KAM, Sullivan PF, Sundström E, Supple MA, Swofford R, Talbot JE, Teeling E, Turner-Maier J, Valenzuela A, Wagner F, Wallerman O, Wang C, Wang J, Weng Z, Wilder AP, Wirthlin ME, Xue JR, Zhang X. The functional and evolutionary impacts of human-specific deletions in conserved elements. Science 2023; 380:eabn2253. [PMID: 37104592 DOI: 10.1126/science.abn2253] [Citation(s) in RCA: 10] [Impact Index Per Article: 10.0] [Indexed: 04/29/2023]
Abstract
Conserved genomic sequences disrupted in humans may underlie uniquely human phenotypic traits. We identified and characterized 10,032 human-specific conserved deletions (hCONDELs). These short (average 2.56 base pairs) deletions are enriched for human brain functions across genetic, epigenomic, and transcriptomic datasets. Using massively parallel reporter assays in six cell types, we discovered 800 hCONDELs conferring significant differences in regulatory activity, half of which enhance rather than disrupt regulatory function. We highlight several hCONDELs with putative human-specific effects on brain development, including HDAC5, CPEB4, and PPP2CA. Reverting an hCONDEL to the ancestral sequence alters the expression of LOXL2 and developmental genes involved in myelination and synaptic function. Our data provide a rich resource to investigate the evolutionary mechanisms driving new traits in humans and other species.
Affiliation(s)
- James R Xue
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Center for System Biology, Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA, USA
| | - Ava Mackay-Smith
- Department of Genetics, Yale School of Medicine, New Haven, CT, USA
| | | | | | - Michael X Dong
- Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, Uppsala, Sweden
| | - Jared F Akers
- Department of Genetics, Yale School of Medicine, New Haven, CT, USA
| | - Mark Noble
- Department of Genetics, Yale School of Medicine, New Haven, CT, USA
| | - Xue Li
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Program in Bioinformatics and Integrative Biology, UMass Chan Medical School, Worcester, MA, USA
- Program in Molecular Medicine, UMass Chan Medical School, Worcester, MA, USA
| | - Kerstin Lindblad-Toh
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, Uppsala, Sweden
| | - Elinor K Karlsson
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Program in Bioinformatics and Integrative Biology, UMass Chan Medical School, Worcester, MA, USA
- Program in Molecular Medicine, UMass Chan Medical School, Worcester, MA, USA
| | - James P Noonan
- Department of Genetics, Yale School of Medicine, New Haven, CT, USA
- Department of Neuroscience, Yale School of Medicine, New Haven, CT, USA
- Department of Ecology and Evolutionary Biology, Yale University, New Haven, CT, USA
| | - Terence D Capellini
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Human Evolutionary Biology, Harvard University, Cambridge, MA, USA
| | - Kristen J Brennand
- Department of Genetics, Yale School of Medicine, New Haven, CT, USA
- Department of Psychiatry, Yale University, New Haven, CT, USA
| | - Ryan Tewhey
- The Jackson Laboratory, Bar Harbor, ME, USA
- Graduate School of Biomedical Sciences and Engineering, University of Maine, Orono, ME, USA
- Graduate School of Biomedical Sciences Tufts University School of Medicine, Boston, MA, USA
| | - Pardis C Sabeti
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Center for System Biology, Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA, USA
- Howard Hughes Medical Institute, Chevy Chase, MD, USA
- Department of Immunology and Infectious Disease, Harvard T.H. Chan School of Public Health, Boston, MA, USA
| | - Steven K Reilly
- Department of Genetics, Yale School of Medicine, New Haven, CT, USA
|
25
|
Yu M, Aguirre M, Jia M, Gjoni K, Cordova-Palomera A, Munger C, Amgalan D, Rosa Ma X, Pereira A, Tcheandjieu C, Seidman C, Seidman J, Tristani-Firouzi M, Chung W, Goldmuntz E, Srivastava D, Loos RJF, Chami N, Cordell H, Dreßen M, Mueller-Myhsok B, Lahm H, Krane M, Pollard KS, Engreitz JM, Gagliano Taliun SA, Gelb BD, Priest JR. Oligogenic Architecture of Rare Noncoding Variants Distinguishes 4 Congenital Heart Disease Phenotypes. Circ Genom Precis Med 2023:e003968. [PMID: 37026454 DOI: 10.1161/circgen.122.003968] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/10/2023]
Abstract
BACKGROUND Congenital heart disease (CHD) is highly heritable, but the power to identify inherited risk has been limited to analyses of common variants in small cohorts. METHODS We performed reimputation of 4 CHD cohorts (n=55 342) to the TOPMed reference panel (freeze 5), permitting meta-analysis of 14 784 017 variants including 6 035 962 rare variants of high imputation quality as validated by whole genome sequencing. RESULTS Meta-analysis identified 16 novel loci, including 12 rare variants, which displayed moderate or large effect sizes (median odds ratio, 3.02) for 4 separate CHD categories. Analyses of chromatin structure link 13 of the genome-wide significant loci to key genes in cardiac development; rs373447426 (minor allele frequency, 0.003 [odds ratio, 3.37 for Conotruncal heart disease]; P=1.49×10⁻⁸) is predicted to disrupt chromatin structure for 2 nearby genes BDH1 and DLG1 involved in Conotruncal development. A lead variant rs189203952 (minor allele frequency, 0.01 [odds ratio, 2.4 for left ventricular outflow tract obstruction]; P=1.46×10⁻⁸) is predicted to disrupt the binding sites of 4 transcription factors known to participate in cardiac development in the promoter of SPAG9. A tissue-specific model of chromatin conformation suggests that common variant rs78256848 (minor allele frequency, 0.11 [odds ratio, 1.4 for Conotruncal heart disease]; P=2.6×10⁻⁸) physically interacts with NCAM1 (P_FDR=1.86×10⁻²⁷), a neural adhesion molecule acting in cardiac development. Importantly, while each individual malformation displayed substantial heritability (observed h² ranging from 0.26 for complex malformations to 0.37 for left ventricular outflow tract obstructive disease) the risk for different CHD malformations appeared to be separate, without genetic correlation measured by linkage disequilibrium score regression or regional colocalization.
CONCLUSIONS We describe a set of rare noncoding variants conferring significant risk for individual heart malformations which are linked to genes governing cardiac development. These results illustrate that the oligogenic basis of CHD and significant heritability may be linked to rare variants outside protein-coding regions conferring substantial risk for individual categories of cardiac malformation.
Affiliation(s)
- Mengyao Yu
- Department of Pediatrics, Stanford University School of Medicine. (M.Y., M.A., A.C.-P., C.T., J.R.P.)
| | - Matthew Aguirre
- Department of Pediatrics, Stanford University School of Medicine. (M.Y., M.A., A.C.-P., C.T., J.R.P.)
- Department of Biomedical Data Science, Stanford University, CA (M.A.)
| | - Meiwen Jia
- Department of Translational Research in Psychiatry, Max Planck Institute of Psychiatry Munich, Germany (M.J., B.M.-M.)
| | - Ketrin Gjoni
- Gladstone Institutes; University of California San Francisco (K.G., C.T., D.S., K.S.P.)
| | - Aldo Cordova-Palomera
- Department of Pediatrics, Stanford University School of Medicine. (M.Y., M.A., A.C.-P., C.T., J.R.P.)
| | - Chad Munger
- Department of Genetics, Stanford University School of Medicine. (C.M., D.A., X.R.M., J.M.E.)
| | - Dulguun Amgalan
- Department of Genetics, Stanford University School of Medicine. (C.M., D.A., X.R.M., J.M.E.)
| | - X Rosa Ma
- Department of Genetics, Stanford University School of Medicine. (C.M., D.A., X.R.M., J.M.E.)
| | - Alexandre Pereira
- Department of Genetics, Harvard University, Cambridge, MA (A.P., C.S., J.S.)
| | - Catherine Tcheandjieu
- Department of Pediatrics, Stanford University School of Medicine. (M.Y., M.A., A.C.-P., C.T., J.R.P.)
- Gladstone Institutes; University of California San Francisco (K.G., C.T., D.S., K.S.P.)
| | - Christine Seidman
- Department of Genetics, Harvard University, Cambridge, MA (A.P., C.S., J.S.)
| | - Jonathan Seidman
- Department of Genetics, Harvard University, Cambridge, MA (A.P., C.S., J.S.)
| | | | - Wendy Chung
- Department of Pediatrics, Columbia University, NY (W.C.)
| | | | - Deepak Srivastava
- Gladstone Institutes; University of California San Francisco (K.G., C.T., D.S., K.S.P.)
| | - Ruth J F Loos
- Icahn School of Medicine at Mount Sinai, NY (R.J.F.L., N.C.)
| | - Nathalie Chami
- Icahn School of Medicine at Mount Sinai, NY (R.J.F.L., N.C.)
| | - Heather Cordell
- Population Health Sciences Institute, Faculty of Medical Sciences, Newcastle University, International Centre for Life, Central Parkway, Newcastle upon Tyne, United Kingdom (H.C.)
| | - Martina Dreßen
- Department of Cardiovascular Surgery, Division of Experimental Surgery, Institute Insure (Institute for Translational Cardiac Surgery), German Heart Center Munich & Technical University of Munich, School of Medicine & Health, Germany (M.D., H.L., M.K.)
| | - Bertram Mueller-Myhsok
- Department of Translational Research in Psychiatry, Max Planck Institute of Psychiatry Munich, Germany (M.J., B.M.-M.)
| | - Harald Lahm
- Department of Cardiovascular Surgery, Division of Experimental Surgery, Institute Insure (Institute for Translational Cardiac Surgery), German Heart Center Munich & Technical University of Munich, School of Medicine & Health, Germany (M.D., H.L., M.K.)
| | - Markus Krane
- Department of Cardiovascular Surgery, Division of Experimental Surgery, Institute Insure (Institute for Translational Cardiac Surgery), German Heart Center Munich & Technical University of Munich, School of Medicine & Health, Germany (M.D., H.L., M.K.)
- Department of Cardiac Surgery, Yale School of Medicine, New Haven, CT (M.K.)
| | - Katherine S Pollard
- Gladstone Institutes; University of California San Francisco (K.G., C.T., D.S., K.S.P.)
- Chan Zuckerberg Biohub, San Francisco (K.S.P.)
| | - Jesse M Engreitz
- Department of Genetics, Stanford University School of Medicine. (C.M., D.A., X.R.M., J.M.E.)
- Basic Sciences and Engineering (BASE) Initiative, Betty Irene Moore Children's Heart Center, Lucile Packard Children's Hospital, Stanford, CA (J.M.E.)
| | - Sarah A Gagliano Taliun
- Department of Medicine & Department of Neurosciences, Faculty of Medicine, Université de Montréal (S.A.G.T.)
- Montreal Heart Institute, Montreal, Quebec, Canada (S.A.G.T.)
| | - Bruce D Gelb
- The Mindich Child Health & Development Institute at the Hess Center for Science & Medicine at Mount Sinai, NY (B.D.G.)
| | - James R Priest
- Department of Pediatrics, Stanford University School of Medicine. (M.Y., M.A., A.C.-P., C.T., J.R.P.)
|
26
|
Gunsalus LM, McArthur E, Gjoni K, Kuang S, Pittman M, Capra JA, Pollard KS. Comparing chromatin contact maps at scale: methods and insights. bioRxiv 2023:2023.04.04.535480. [PMID: 37066196 PMCID: PMC10104037 DOI: 10.1101/2023.04.04.535480] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Indexed: 04/18/2023]
Abstract
Comparing chromatin contact maps is an essential step in quantifying how three-dimensional (3D) genome organization shapes development, evolution, and disease. However, no gold standard exists for comparing contact maps, and even simple methods often disagree. In this study, we propose novel comparison methods and evaluate them alongside existing approaches using genome-wide Hi-C data and 22,500 in silico predicted contact maps. We also quantify the robustness of methods to common sources of biological and technical variation, such as boundary size and noise. We find that simple difference-based methods such as mean squared error are suitable for initial screening, but biologically informed methods are necessary to identify why maps diverge and propose specific functional hypotheses. We provide a reference guide, codebase, and benchmark for rapidly comparing chromatin contact maps at scale to enable biological insights into the 3D organization of the genome.
Affiliation(s)
- Laura M. Gunsalus
- Gladstone Institute of Data Science and Biotechnology, San Francisco, CA
- Department of Epidemiology & Biostatistics, University of California, San Francisco, CA
| | - Evonne McArthur
- Department of Epidemiology & Biostatistics, University of California, San Francisco, CA
- Bakar Computational Health Sciences Institute, University of California, San Francisco, CA
- Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, TN
| | - Ketrin Gjoni
- Gladstone Institute of Data Science and Biotechnology, San Francisco, CA
- Department of Epidemiology & Biostatistics, University of California, San Francisco, CA
| | - Shuzhen Kuang
- Gladstone Institute of Data Science and Biotechnology, San Francisco, CA
- Department of Epidemiology & Biostatistics, University of California, San Francisco, CA
| | - Maureen Pittman
- Gladstone Institute of Data Science and Biotechnology, San Francisco, CA
- Department of Epidemiology & Biostatistics, University of California, San Francisco, CA
| | - John A. Capra
- Department of Epidemiology & Biostatistics, University of California, San Francisco, CA
- Bakar Computational Health Sciences Institute, University of California, San Francisco, CA
| | - Katherine S. Pollard
- Gladstone Institute of Data Science and Biotechnology, San Francisco, CA
- Department of Epidemiology & Biostatistics, University of California, San Francisco, CA
- Bakar Computational Health Sciences Institute, University of California, San Francisco, CA
- Chan Zuckerberg Biohub, San Francisco, CA, USA
|
27
|
Shi ZJ, Nayfach S, Pollard KS. Identifying species-specific k-mers for fast and accurate metagenotyping with Maast and GT-Pro. STAR Protoc 2023; 4:101964. [PMID: 36856771 PMCID: PMC10037184 DOI: 10.1016/j.xpro.2022.101964] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/13/2022] [Revised: 11/19/2022] [Accepted: 12/08/2022] [Indexed: 01/22/2023] Open
Abstract
Genotyping single-nucleotide polymorphisms (SNPs) in microbiomes enables strain-level quantification. In this protocol, we describe a computational pipeline that performs fast and accurate SNP genotyping using metagenomic data. We first demonstrate how to use Maast to catalog SNPs from microbial genomes. Then we use GT-Pro to extract unique SNP-covering k-mers, optimize a data structure for storing these k-mers, and finally perform metagenotyping. For proof of concept, the protocol leverages public whole-genome sequences to metagenotype a synthetic community. For complete details on the use and execution of this protocol, please refer to Shi et al. (2022a) and Shi et al. (2022b).
Affiliation(s)
- Zhou Jason Shi
- Chan Zuckerberg Biohub, San Francisco, CA, USA; Gladstone Institutes, Data Science and Biotechnology, San Francisco, CA, USA
| | - Stephen Nayfach
- Department of Energy, Joint Genome Institute, Walnut Creek, CA, USA; Lawrence Berkeley National Laboratory, Environmental Genomics and Systems Biology Division, Berkeley, CA, USA
| | - Katherine S Pollard
- Chan Zuckerberg Biohub, San Francisco, CA, USA; Gladstone Institutes, Data Science and Biotechnology, San Francisco, CA, USA; University of California San Francisco, Department of Epidemiology and Biostatistics, San Francisco, CA, USA.
|
28
|
Whalen S, Inoue F, Ryu H, Fair T, Markenscoff-Papadimitriou E, Keough K, Kircher M, Martin B, Alvarado B, Elor O, Laboy Cintron D, Williams A, Hassan Samee MA, Thomas S, Krencik R, Ullian EM, Kriegstein A, Rubenstein JL, Shendure J, Pollen AA, Ahituv N, Pollard KS. Machine learning dissection of human accelerated regions in primate neurodevelopment. Neuron 2023; 111:857-873.e8. [PMID: 36640767 PMCID: PMC10023452 DOI: 10.1016/j.neuron.2022.12.026] [Citation(s) in RCA: 21] [Impact Index Per Article: 21.0] [Received: 06/21/2022] [Revised: 09/29/2022] [Accepted: 12/18/2022] [Indexed: 01/15/2023]
Abstract
Using machine learning (ML), we interrogated the function of all human-chimpanzee variants in 2,645 human accelerated regions (HARs), finding 43% of HARs have variants with large opposing effects on chromatin state and 14% on neurodevelopmental enhancer activity. This pattern, consistent with compensatory evolution, was confirmed using massively parallel reporter assays in chimpanzee and human neural progenitor cells. The species-specific enhancer activity of HARs was accurately predicted from the presence and absence of transcription factor footprints in each species. Despite these striking cis effects, activity of a given HAR sequence was nearly identical in human and chimpanzee cells. This suggests that HARs did not evolve to compensate for changes in the trans environment but instead altered their ability to bind factors present in both species. Thus, ML prioritized variants with functional effects on human neurodevelopment and revealed an unexpected reason why HARs may have evolved so rapidly.
Affiliation(s)
- Sean Whalen
- Gladstone Institutes, San Francisco, CA 94158, USA
| | - Fumitaka Inoue
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA, USA; Institute for Human Genetics, University of California, San Francisco, San Francisco, CA, USA
| | - Hane Ryu
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA, USA; Institute for Human Genetics, University of California, San Francisco, San Francisco, CA, USA; Pharmaceutical Sciences and Pharmacogenomics Graduate Program, University of California, San Francisco, San Francisco, CA, USA
| | - Tyler Fair
- Eli and Edythe Broad Center of Regeneration Medicine and Stem Cell Research, University of California, San Francisco, San Francisco, CA 94143, USA; Department of Neurology, University of California, San Francisco, San Francisco, CA 94158, USA
| | | | - Kathleen Keough
- Gladstone Institutes, San Francisco, CA 94158, USA; Pharmaceutical Sciences and Pharmacogenomics Graduate Program, University of California, San Francisco, San Francisco, CA, USA
| | - Martin Kircher
- Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA; Berlin Institute of Health at Charité - Universitätsmedizin Berlin, 10117 Berlin, Germany; Institute of Human Genetics, University Medical Center Schleswig-Holstein, University of Lübeck, 23562 Lübeck, Germany
| | - Beth Martin
- Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA
| | - Beatriz Alvarado
- Eli and Edythe Broad Center of Regeneration Medicine and Stem Cell Research, University of California, San Francisco, San Francisco, CA 94143, USA
| | - Orry Elor
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA, USA; Institute for Human Genetics, University of California, San Francisco, San Francisco, CA, USA
| | - Dianne Laboy Cintron
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA, USA; Institute for Human Genetics, University of California, San Francisco, San Francisco, CA, USA
| | | | | | - Sean Thomas
- Gladstone Institutes, San Francisco, CA 94158, USA
| | - Robert Krencik
- Department of Neurosurgery, Center for Neuroregeneration, Houston Methodist Research Institute, Houston, TX, USA
| | - Erik M Ullian
- Departments of Ophthalmology and Physiology, University of California, San Francisco, San Francisco, CA, USA; Kavli Institute for Fundamental Neuroscience, University of California, San Francisco, San Francisco, CA, USA
| | - Arnold Kriegstein
- Eli and Edythe Broad Center of Regeneration Medicine and Stem Cell Research, University of California, San Francisco, San Francisco, CA 94143, USA; Department of Neurology, University of California, San Francisco, San Francisco, CA 94158, USA
| | - John L Rubenstein
- Department of Psychiatry, University of California, San Francisco, San Francisco, CA, USA
| | - Jay Shendure
- Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA; Howard Hughes Medical Institute, Seattle, WA 98195, USA; Brotman Baty Institute for Precision Medicine, Seattle, WA 98195, USA
| | - Alex A Pollen
- Institute for Human Genetics, University of California, San Francisco, San Francisco, CA, USA; Eli and Edythe Broad Center of Regeneration Medicine and Stem Cell Research, University of California, San Francisco, San Francisco, CA 94143, USA; Department of Neurology, University of California, San Francisco, San Francisco, CA 94158, USA
| | - Nadav Ahituv
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA, USA; Institute for Human Genetics, University of California, San Francisco, San Francisco, CA, USA.
| | - Katherine S Pollard
- Gladstone Institutes, San Francisco, CA 94158, USA; Institute for Human Genetics, University of California, San Francisco, San Francisco, CA, USA; Department of Epidemiology and Biostatistics and Institute for Computational Health Sciences, University of California, San Francisco, San Francisco, CA, USA; Chan-Zuckerberg Biohub, San Francisco, CA, USA.
|
29
|
Sullivan PF, Meadows JRS, Gazal S, Phan BN, Li X, Genereux DP, Dong MX, Bianchi M, Andrews G, Sakthikumar S, Nordin J, Roy A, Christmas MJ, Marinescu VD, Wallerman O, Xue JR, Li Y, Yao S, Sun Q, Szatkiewicz J, Wen J, Huckins LM, Lawler AJ, Keough KC, Zheng Z, Zeng J, Wray NR, Johnson J, Chen J, Paten B, Reilly SK, Hughes GM, Weng Z, Pollard KS, Pfenning AR, Forsberg-Nilsson K, Karlsson EK, Lindblad-Toh K. Leveraging Base Pair Mammalian Constraint to Understand Genetic Variation and Human Disease. bioRxiv 2023:2023.03.10.531987. [PMID: 36945512 PMCID: PMC10028973 DOI: 10.1101/2023.03.10.531987] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Indexed: 03/13/2023]
Abstract
Although thousands of genomic regions have been associated with heritable human diseases, attempts to elucidate biological mechanisms are impeded by a general inability to discern which genomic positions are functionally important. Evolutionary constraint is a powerful predictor of function that is agnostic to cell type or disease mechanism. Here, single base phyloP scores from the whole genome alignment of 240 placental mammals identified 3.5% of the human genome as significantly constrained, and likely functional. We compared these scores to large-scale genome annotation, genome-wide association studies (GWAS), copy number variation, clinical genetics findings, and cancer data sets. Evolutionarily constrained positions are enriched for variants explaining common disease heritability (more than any other functional annotation). Our results improve variant annotation but also highlight that the regulatory landscape of the human genome still needs to be further explored and linked to disease.
Affiliation(s)
- Patrick F. Sullivan
- Department of Genetics, University of North Carolina Medical School; Chapel Hill, NC 27599, USA
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet; Stockholm, Sweden
| | - Jennifer R. S. Meadows
- Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University; Uppsala, 751 32, Sweden
| | - Steven Gazal
- Keck School of Medicine, University of Southern California; Los Angeles, CA 90033, USA
| | - BaDoi N. Phan
- Department of Computational Biology, School of Computer Science, Carnegie Mellon University; Pittsburgh, PA 15213, USA
- Medical Scientist Training Program, University of Pittsburgh School of Medicine; Pittsburgh, PA 15261, USA
- Neuroscience Institute, Carnegie Mellon University; Pittsburgh, PA 15213, USA
| | - Xue Li
- Broad Institute of MIT and Harvard; Cambridge, MA 02139, USA
- Morningside Graduate School of Biomedical Sciences, UMass Chan Medical School; Worcester, MA 01605, USA
- Program in Bioinformatics and Integrative Biology, UMass Chan Medical School; Worcester, MA 01605, USA
| | | | - Michael X. Dong
- Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University; Uppsala, 751 32, Sweden
| | - Matteo Bianchi
- Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University; Uppsala, 751 32, Sweden
| | - Gregory Andrews
- Program in Bioinformatics and Integrative Biology, UMass Chan Medical School; Worcester, MA 01605, USA
| | - Sharadha Sakthikumar
- Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University; Uppsala, 751 32, Sweden
- Broad Institute of MIT and Harvard; Cambridge, MA 02139, USA
| | - Jessika Nordin
- Department of Immunology, Genetics and Pathology, Science for Life Laboratory, Uppsala University; Uppsala, 751 85, Sweden
| | - Ananya Roy
- Department of Immunology, Genetics and Pathology, Science for Life Laboratory, Uppsala University; Uppsala, 751 85, Sweden
| | - Matthew J. Christmas
- Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University; Uppsala, 751 32, Sweden
| | - Voichita D. Marinescu
- Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University; Uppsala, 751 32, Sweden
| | - Ola Wallerman
- Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University; Uppsala, 751 32, Sweden
| | - James R. Xue
- Broad Institute of MIT and Harvard; Cambridge, MA 02139, USA
- Department of Organismic and Evolutionary Biology, Harvard University; Cambridge, MA 02138, USA
| | - Yun Li
- Department of Genetics, University of North Carolina Medical School; Chapel Hill, NC 27599, USA
| | - Shuyang Yao
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet; Stockholm, Sweden
| | - Quan Sun
- Department of Biostatistics, University of North Carolina at Chapel Hill; Chapel Hill, NC, USA
| | - Jin Szatkiewicz
- Department of Genetics, University of North Carolina Medical School; Chapel Hill, NC 27599, USA
| | - Jia Wen
- Department of Genetics, University of North Carolina Medical School; Chapel Hill, NC 27599, USA
| | - Laura M. Huckins
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai; New York, NY 10029, USA
| | - Alyssa J. Lawler
- Neuroscience Institute, Carnegie Mellon University; Pittsburgh, PA 15213, USA
- Broad Institute of MIT and Harvard; Cambridge, MA 02139, USA
- Department of Biological Sciences, Mellon College of Science, Carnegie Mellon University; Pittsburgh, PA 15213, USA
| | - Kathleen C. Keough
- Department of Epidemiology & Biostatistics, University of California San Francisco; San Francisco, CA 94158, USA
- Fauna Bio Incorporated; Emeryville, CA 94608, USA
- Gladstone Institutes; San Francisco, CA 94158, USA
| | - Zhili Zheng
- Institute for Molecular Bioscience, University of Queensland; Brisbane, Queensland, Australia
| | - Jian Zeng
- Institute for Molecular Bioscience, University of Queensland; Brisbane, Queensland, Australia
| | - Naomi R. Wray
- Institute for Molecular Bioscience, University of Queensland; Brisbane, Queensland, Australia
- Queensland Brain Institute, University of Queensland; Brisbane, Queensland, Australia
| | - Jessica Johnson
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai; New York, NY 10029, USA
| | - Jiawen Chen
- Department of Biostatistics, University of North Carolina at Chapel Hill; Chapel Hill, NC, USA
| | | | - Benedict Paten
- Genomics Institute, University of California Santa Cruz; Santa Cruz, CA 95064, USA
| | - Steven K. Reilly
- Department of Genetics, Yale School of Medicine; New Haven, CT 06510, USA
| | - Graham M. Hughes
- School of Biology and Environmental Science, University College Dublin; Belfield, Dublin 4, Ireland
| | - Zhiping Weng
- Program in Bioinformatics and Integrative Biology, UMass Chan Medical School; Worcester, MA 01605, USA
| | - Katherine S. Pollard
- Department of Epidemiology & Biostatistics, University of California San Francisco; San Francisco, CA 94158, USA
- Gladstone Institutes; San Francisco, CA 94158, USA
- Chan Zuckerberg Biohub; San Francisco, CA 94158, USA
| | - Andreas R. Pfenning
- Department of Computational Biology, School of Computer Science, Carnegie Mellon University; Pittsburgh, PA 15213, USA
- Neuroscience Institute, Carnegie Mellon University; Pittsburgh, PA 15213, USA
| | - Karin Forsberg-Nilsson
- Department of Immunology, Genetics and Pathology, Science for Life Laboratory, Uppsala University; Uppsala, 751 85, Sweden
- Biodiscovery Institute, University of Nottingham; Nottingham, UK
| | - Elinor K. Karlsson
- Broad Institute of MIT and Harvard; Cambridge, MA 02139, USA
- Program in Bioinformatics and Integrative Biology, UMass Chan Medical School; Worcester, MA 01605, USA
- Program in Molecular Medicine, UMass Chan Medical School; Worcester, MA 01605, USA
| | - Kerstin Lindblad-Toh
- Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University; Uppsala, 751 32, Sweden
- Broad Institute of MIT and Harvard; Cambridge, MA 02139, USA
|
30
|
Wen C, Margolis M, Dai R, Zhang P, Przytycki PF, Vo DD, Bhattacharya A, Matoba N, Jiao C, Kim M, Tsai E, Hoh C, Aygün N, Walker RL, Chatzinakos C, Clarke D, Pratt H, Consortium P, Peters MA, Gerstein M, Daskalakis NP, Weng Z, Jaffe AE, Kleinman JE, Hyde TM, Weinberger DR, Bray NJ, Sestan N, Geschwind DH, Roeder K, Gusev A, Pasaniuc B, Stein JL, Love MI, Pollard KS, Liu C, Gandal MJ. Cross-ancestry, cell-type-informed atlas of gene, isoform, and splicing regulation in the developing human brain. medRxiv 2023:2023.03.03.23286706. [PMID: 36945630 PMCID: PMC10029021 DOI: 10.1101/2023.03.03.23286706] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 03/23/2023]
Abstract
Genomic regulatory elements active in the developing human brain are notably enriched in genetic risk for neuropsychiatric disorders, including autism spectrum disorder (ASD), schizophrenia, and bipolar disorder. However, prioritizing the specific risk genes and candidate molecular mechanisms underlying these genetic enrichments has been hindered by the lack of a single unified large-scale gene regulatory atlas of human brain development. Here, we uniformly process and systematically characterize gene, isoform, and splicing quantitative trait loci (xQTLs) in 672 fetal brain samples from unique subjects across multiple ancestral populations. We identify 15,752 genes harboring a significant xQTL and map 3,739 eQTLs to a specific cellular context. We observe a striking drop in gene expression and splicing heritability as the human brain develops. Isoform-level regulation, particularly in the second trimester, mediates the greatest proportion of heritability across multiple psychiatric GWAS, compared with eQTLs. Via colocalization and TWAS, we prioritize biological mechanisms for ~60% of GWAS loci across five neuropsychiatric disorders, nearly two-fold that observed in the adult brain. Finally, we build a comprehensive set of developmentally regulated gene and isoform co-expression networks capturing unique genetic enrichments across disorders. Together, this work provides a comprehensive view of genetic regulation across human brain development as well as the stage- and cell type-informed mechanistic underpinnings of neuropsychiatric disorders.
Affiliation(s)
- Cindy Wen
- Interdepartmental Program in Bioinformatics, University of California, Los Angeles; Los Angeles, CA, 90095, USA
- Department of Psychiatry, David Geffen School of Medicine, University of California, Los Angeles; Los Angeles, CA, 90095, USA
- Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles; Los Angeles, CA, 90095, USA
| | - Michael Margolis
- Department of Psychiatry, David Geffen School of Medicine, University of California, Los Angeles; Los Angeles, CA, 90095, USA
- Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles; Los Angeles, CA, 90095, USA
| | - Rujia Dai
- Department of Psychiatry, SUNY Upstate Medical University; Syracuse, NY, 13210, USA
| | - Pan Zhang
- Department of Psychiatry, David Geffen School of Medicine, University of California, Los Angeles; Los Angeles, CA, 90095, USA
- Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles; Los Angeles, CA, 90095, USA
| | - Pawel F Przytycki
- Gladstone Institute of Data Science and Biotechnology; San Francisco, CA, 94158, USA
| | - Daniel D Vo
- Department of Psychiatry, David Geffen School of Medicine, University of California, Los Angeles; Los Angeles, CA, 90095, USA
- Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles; Los Angeles, CA, 90095, USA
- Department of Psychiatry, Perelman School of Medicine, University of Pennsylvania; Philadelphia, PA, 19104, USA
- Lifespan Brain Institute, The Children's Hospital of Philadelphia; Philadelphia, PA, 19104, USA
| | - Arjun Bhattacharya
- Department of Pathology and Laboratory Medicine, David Geffen School of Medicine, University of California, Los Angeles; Los Angeles, CA, 90095, USA
- Institute for Quantitative and Computational Biosciences, David Geffen School of Medicine, University of California, Los Angeles; Los Angeles, CA, 90095, USA
| | - Nana Matoba
- Department of Genetics, University of North Carolina at Chapel Hill; Chapel Hill, NC, 27599, USA
- UNC Neuroscience Center, University of North Carolina at Chapel Hill; Chapel Hill, NC, 27599, USA
| | - Chuan Jiao
- Department of Psychiatry, SUNY Upstate Medical University; Syracuse, NY, 13210, USA
| | - Minsoo Kim
- Department of Psychiatry, David Geffen School of Medicine, University of California, Los Angeles; Los Angeles, CA, 90095, USA
- Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles; Los Angeles, CA, 90095, USA
| | - Ellen Tsai
- Department of Psychiatry, David Geffen School of Medicine, University of California, Los Angeles; Los Angeles, CA, 90095, USA
- Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles; Los Angeles, CA, 90095, USA
| | - Celine Hoh
- Department of Psychiatry, David Geffen School of Medicine, University of California, Los Angeles; Los Angeles, CA, 90095, USA
- Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles; Los Angeles, CA, 90095, USA
| | - Nil Aygün
- Department of Genetics, University of North Carolina at Chapel Hill; Chapel Hill, NC, 27599, USA
- UNC Neuroscience Center, University of North Carolina at Chapel Hill; Chapel Hill, NC, 27599, USA
| | - Rebecca L Walker
- Interdepartmental Program in Bioinformatics, University of California, Los Angeles; Los Angeles, CA, 90095, USA
- Department of Psychiatry, David Geffen School of Medicine, University of California, Los Angeles; Los Angeles, CA, 90095, USA
- Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles; Los Angeles, CA, 90095, USA
| | - Christos Chatzinakos
- Department of Psychiatry, Harvard Medical School; Boston, MA, 02215, USA
- McLean Hospital; Belmont, MA, 02478, USA
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard; Cambridge, MA, 02142, USA
| | - Declan Clarke
- Department of Molecular Biophysics and Biochemistry, Yale University; New Haven, CT, 06520, USA
| | - Henry Pratt
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School; Worcester, MA, 01605, USA
| | - PsychENCODE Consortium
- Interdepartmental Program in Bioinformatics, University of California, Los Angeles; Los Angeles, CA, 90095, USA
- Department of Psychiatry, David Geffen School of Medicine, University of California, Los Angeles; Los Angeles, CA, 90095, USA
- Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles; Los Angeles, CA, 90095, USA
- Department of Psychiatry, SUNY Upstate Medical University; Syracuse, NY, 13210, USA
- Gladstone Institute of Data Science and Biotechnology; San Francisco, CA, 94158, USA
- Department of Psychiatry, Perelman School of Medicine, University of Pennsylvania; Philadelphia, PA, 19104, USA
- Lifespan Brain Institute, The Children's Hospital of Philadelphia; Philadelphia, PA, 19104, USA
- Department of Pathology and Laboratory Medicine, David Geffen School of Medicine, University of California, Los Angeles; Los Angeles, CA, 90095, USA
- Institute for Quantitative and Computational Biosciences, David Geffen School of Medicine, University of California, Los Angeles; Los Angeles, CA, 90095, USA
- Department of Genetics, University of North Carolina at Chapel Hill; Chapel Hill, NC, 27599, USA
- UNC Neuroscience Center, University of North Carolina at Chapel Hill; Chapel Hill, NC, 27599, USA
- Department of Psychiatry, Harvard Medical School; Boston, MA, 02215, USA
- McLean Hospital; Belmont, MA, 02478, USA
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard; Cambridge, MA, 02142, USA
- Department of Molecular Biophysics and Biochemistry, Yale University; New Haven, CT, 06520, USA
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School; Worcester, MA, 01605, USA
- CNS Data Coordination Group, Sage Bionetworks; Seattle, WA, 98109, USA
- Program in Computational Biology and Bioinformatics, Yale University; New Haven, CT, 06520, USA
- Department of Computer Science, Yale University; New Haven, CT, 06520, USA
- Department of Statistics and Data Science, Yale University; New Haven, CT, 06520, USA
- Lieber Institute for Brain Development; Baltimore, MD, 21205, USA
- Department of Psychiatry & Behavioral Sciences, Johns Hopkins University School of Medicine; Baltimore, MD, 21205, USA
- Department of Neuroscience, Johns Hopkins University School of Medicine; Baltimore, MD, 21205, USA
- Department of Genetic Medicine, Johns Hopkins University School of Medicine; Baltimore, MD, 21205, USA
- Department of Mental Health, Johns Hopkins Bloomberg School of Public Health; Baltimore, MD, 21205, USA
- Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health; Baltimore, MD, 21205, USA
- Neumora Therapeutics; Watertown, MA, 02472, USA
- Department of Neurology, Johns Hopkins University School of Medicine; Baltimore, MD, 21205, USA
- MRC Centre for Neuropsychiatric Genetics & Genomics, Division of Psychological Medicine & Clinical Neurosciences, Cardiff University School of Medicine; Cardiff, CF24 4HQ, UK
- Department of Comparative Medicine, Yale University School of Medicine; New Haven, CT, 06520, USA
- Department of Neuroscience, Yale University School of Medicine; New Haven, CT, 06520, USA
- Program in Neurogenetics, Department of Neurology, David Geffen School of Medicine, University of California, Los Angeles; Los Angeles, CA, 90095, USA
- Institute for Precision Health, University of California, Los Angeles; Los Angeles, CA, 90095, USA
- Department of Statistics & Data Science, Carnegie Mellon University; Pittsburgh, PA, 15213, USA
- Computational Biology Department, Carnegie Mellon University; Pittsburgh, PA, 15213, USA
- Department of Medical Oncology, Division of Population Sciences, Dana-Farber Cancer Institute; Boston, MA, 02215, USA
- Broad Institute of MIT and Harvard; Cambridge, MA, 02142, USA
- Harvard Medical School; Boston, MA, 02215, USA
- Division of Genetics, Brigham and Women's Hospital; Boston, MA, 02215, USA
- Department of Computational Medicine, David Geffen School of Medicine, University of California, Los Angeles; Los Angeles, CA, 90095, USA
- Department of Biostatistics, University of North Carolina at Chapel Hill; Chapel Hill, NC, 27599, USA
- Department of Epidemiology & Biostatistics, University of California, San Francisco; San Francisco, CA, 94158, USA
- Chan Zuckerberg Biohub; San Francisco, CA, 94158, USA
- Center for Medical Genetics & Hunan Key Laboratory of Medical Genetics, School of Life Sciences, Central South University; Changsha, Hunan, 410008, China
| | - Mette A Peters
- CNS Data Coordination Group, Sage Bionetworks; Seattle, WA, 98109, USA
| | - Mark Gerstein
- Department of Molecular Biophysics and Biochemistry, Yale University; New Haven, CT, 06520, USA
- Program in Computational Biology and Bioinformatics, Yale University; New Haven, CT, 06520, USA
- Department of Computer Science, Yale University; New Haven, CT, 06520, USA
- Department of Statistics and Data Science, Yale University; New Haven, CT, 06520, USA
| | - Nikolaos P Daskalakis
- Department of Psychiatry, Harvard Medical School; Boston, MA, 02215, USA
- McLean Hospital; Belmont, MA, 02478, USA
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard; Cambridge, MA, 02142, USA
| | - Zhiping Weng
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School; Worcester, MA, 01605, USA
| | - Andrew E Jaffe
- Lieber Institute for Brain Development; Baltimore, MD, 21205, USA
- Department of Psychiatry & Behavioral Sciences, Johns Hopkins University School of Medicine; Baltimore, MD, 21205, USA
- Department of Neuroscience, Johns Hopkins University School of Medicine; Baltimore, MD, 21205, USA
- Department of Genetic Medicine, Johns Hopkins University School of Medicine; Baltimore, MD, 21205, USA
- Department of Mental Health, Johns Hopkins Bloomberg School of Public Health; Baltimore, MD, 21205, USA
- Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health; Baltimore, MD, 21205, USA
- Neumora Therapeutics; Watertown, MA, 02472, USA
| | - Joel E Kleinman
- Lieber Institute for Brain Development; Baltimore, MD, 21205, USA
- Department of Psychiatry & Behavioral Sciences, Johns Hopkins University School of Medicine; Baltimore, MD, 21205, USA
| | - Thomas M Hyde
- Lieber Institute for Brain Development; Baltimore, MD, 21205, USA
- Department of Psychiatry & Behavioral Sciences, Johns Hopkins University School of Medicine; Baltimore, MD, 21205, USA
- Department of Neurology, Johns Hopkins University School of Medicine; Baltimore, MD, 21205, USA
| | - Daniel R Weinberger
- Lieber Institute for Brain Development; Baltimore, MD, 21205, USA
- Department of Psychiatry & Behavioral Sciences, Johns Hopkins University School of Medicine; Baltimore, MD, 21205, USA
- Department of Neuroscience, Johns Hopkins University School of Medicine; Baltimore, MD, 21205, USA
- Department of Genetic Medicine, Johns Hopkins University School of Medicine; Baltimore, MD, 21205, USA
- Department of Neurology, Johns Hopkins University School of Medicine; Baltimore, MD, 21205, USA
| | - Nicholas J Bray
- MRC Centre for Neuropsychiatric Genetics & Genomics, Division of Psychological Medicine & Clinical Neurosciences, Cardiff University School of Medicine; Cardiff, CF24 4HQ, UK
| | - Nenad Sestan
- Department of Comparative Medicine, Yale University School of Medicine; New Haven, CT, 06520, USA
- Department of Neuroscience, Yale University School of Medicine; New Haven, CT, 06520, USA
| | - Daniel H Geschwind
- Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles; Los Angeles, CA, 90095, USA
- Program in Neurogenetics, Department of Neurology, David Geffen School of Medicine, University of California, Los Angeles; Los Angeles, CA, 90095, USA
- Institute for Precision Health, University of California, Los Angeles; Los Angeles, CA, 90095, USA
| | - Kathryn Roeder
- Department of Statistics & Data Science, Carnegie Mellon University; Pittsburgh, PA, 15213, USA
- Computational Biology Department, Carnegie Mellon University; Pittsburgh, PA, 15213, USA
| | - Alexander Gusev
- Department of Medical Oncology, Division of Population Sciences, Dana-Farber Cancer Institute; Boston, MA, 02215, USA
- Broad Institute of MIT and Harvard; Cambridge, MA, 02142, USA
- Harvard Medical School; Boston, MA, 02215, USA
- Division of Genetics, Brigham and Women's Hospital; Boston, MA, 02215, USA
| | - Bogdan Pasaniuc
- Interdepartmental Program in Bioinformatics, University of California, Los Angeles; Los Angeles, CA, 90095, USA
- Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles; Los Angeles, CA, 90095, USA
- Department of Pathology and Laboratory Medicine, David Geffen School of Medicine, University of California, Los Angeles; Los Angeles, CA, 90095, USA
- Institute for Precision Health, University of California, Los Angeles; Los Angeles, CA, 90095, USA
- Department of Computational Medicine, David Geffen School of Medicine, University of California, Los Angeles; Los Angeles, CA, 90095, USA
| | - Jason L Stein
- Department of Genetics, University of North Carolina at Chapel Hill; Chapel Hill, NC, 27599, USA
- UNC Neuroscience Center, University of North Carolina at Chapel Hill; Chapel Hill, NC, 27599, USA
| | - Michael I Love
- Department of Genetics, University of North Carolina at Chapel Hill; Chapel Hill, NC, 27599, USA
- Department of Biostatistics, University of North Carolina at Chapel Hill; Chapel Hill, NC, 27599, USA
| | - Katherine S Pollard
- Gladstone Institute of Data Science and Biotechnology; San Francisco, CA, 94158, USA
- Department of Epidemiology & Biostatistics, University of California, San Francisco; San Francisco, CA, 94158, USA
- Chan Zuckerberg Biohub; San Francisco, CA, 94158, USA
| | - Chunyu Liu
- Department of Psychiatry, SUNY Upstate Medical University; Syracuse, NY, 13210, USA
- Center for Medical Genetics & Hunan Key Laboratory of Medical Genetics, School of Life Sciences, Central South University; Changsha, Hunan, 410008, China
| | - Michael J Gandal
- Interdepartmental Program in Bioinformatics, University of California, Los Angeles; Los Angeles, CA, 90095, USA
- Department of Psychiatry, David Geffen School of Medicine, University of California, Los Angeles; Los Angeles, CA, 90095, USA
- Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles; Los Angeles, CA, 90095, USA
- Department of Psychiatry, Perelman School of Medicine, University of Pennsylvania; Philadelphia, PA, 19104, USA
- Lifespan Brain Institute, The Children's Hospital of Philadelphia; Philadelphia, PA, 19104, USA
|
31
|
Deng C, Whalen S, Steyert M, Ziffra R, Przytycki PF, Inoue F, Pereira DA, Capauto D, Norton S, Vaccarino FM, Pollen A, Nowakowski TJ, Ahituv N, Pollard KS. Massively parallel characterization of psychiatric disorder-associated and cell-type-specific regulatory elements in the developing human cortex. bioRxiv 2023:2023.02.15.528663. [PMID: 36824845 PMCID: PMC9949039 DOI: 10.1101/2023.02.15.528663] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Indexed: 02/17/2023]
Abstract
Nucleotide changes in gene regulatory elements are important determinants of neuronal development and disease. Using massively parallel reporter assays in primary human cells from mid-gestation cortex and cerebral organoids, we interrogated the cis-regulatory activity of 102,767 sequences, including differentially accessible cell-type specific regions in the developing cortex and single-nucleotide variants associated with psychiatric disorders. In primary cells, we identified 46,802 active enhancer sequences and 164 disorder-associated variants that significantly alter enhancer activity. Activity was comparable in organoids and primary cells, suggesting that organoids provide an adequate model for the developing cortex. Using deep learning, we decoded the sequence basis and upstream regulators of enhancer activity. This work establishes a comprehensive catalog of functional gene regulatory elements and variants in human neuronal development.
Affiliation(s)
- Chengyu Deng
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco; San Francisco, CA, USA
- Institute for Human Genetics, University of California, San Francisco; San Francisco, CA, USA
| | - Sean Whalen
- Gladstone Institutes; San Francisco, CA, USA
| | - Marilyn Steyert
- Department of Anatomy, University of California, San Francisco; San Francisco, CA, USA
- Department of Psychiatry, University of California, San Francisco; San Francisco, CA, USA
- Department of Neurological Surgery, University of California, San Francisco; San Francisco, CA, USA
| | - Ryan Ziffra
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco; San Francisco, CA, USA
| | | | - Fumitaka Inoue
- Institute for the Advanced Study of Human Biology (WPI-ASHBi), Kyoto University; Kyoto, Japan
| | - Daniela A. Pereira
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco; San Francisco, CA, USA
- Institute for Human Genetics, University of California, San Francisco; San Francisco, CA, USA
- Graduate Program of Genetics, Institute of Biological Sciences, Federal University of Minas Gerais; Belo Horizonte, Minas Gerais, Brazil
| | | | - Scott Norton
- Child Study Center, Yale University; New Haven, CT, USA
| | - Flora M. Vaccarino
- Child Study Center, Yale University; New Haven, CT, USA
- Department of Neuroscience, Yale University; New Haven, CT, USA
| | - Alex Pollen
- Department of Neurology, University of California, San Francisco; San Francisco, CA, USA
- Eli and Edythe Broad Center for Regeneration Medicine and Stem Cell Research, University of California, San Francisco; San Francisco, CA, USA
| | - Tomasz J. Nowakowski
- Department of Anatomy, University of California, San Francisco; San Francisco, CA, USA
- Department of Psychiatry, University of California, San Francisco; San Francisco, CA, USA
- Department of Neurological Surgery, University of California, San Francisco; San Francisco, CA, USA
- Eli and Edythe Broad Center for Regeneration Medicine and Stem Cell Research, University of California, San Francisco; San Francisco, CA, USA
- Chan Zuckerberg Biohub, San Francisco; San Francisco, CA, USA
| | - Nadav Ahituv
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco; San Francisco, CA, USA
- Institute for Human Genetics, University of California, San Francisco; San Francisco, CA, USA
| | - Katherine S. Pollard
- Institute for Human Genetics, University of California, San Francisco; San Francisco, CA, USA
- Gladstone Institutes; San Francisco, CA, USA
- Chan Zuckerberg Biohub, San Francisco; San Francisco, CA, USA
- Department of Epidemiology and Biostatistics, University of California, San Francisco; San Francisco, CA, USA
|
32
|
Zhao C, Shi ZJ, Pollard KS. Pitfalls of genotyping microbial communities with rapidly growing genome collections. Cell Syst 2023; 14:160-176.e3. [PMID: 36657438 PMCID: PMC9957970 DOI: 10.1016/j.cels.2022.12.007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/30/2022] [Revised: 10/15/2022] [Accepted: 12/19/2022] [Indexed: 01/20/2023]
Abstract
Detecting genetic variants in metagenomic data is a priority for understanding the evolution, ecology, and functional characteristics of microbial communities. Many tools that perform this metagenotyping rely on aligning reads of unknown origin to a database of sequences from many species before calling variants. In this synthesis, we investigate how databases of increasingly diverse and closely related species have pushed the limits of current alignment algorithms, thereby degrading the performance of metagenotyping tools. We identify multi-mapping reads as a prevalent source of errors and illustrate a trade-off between retaining correct alignments versus limiting incorrect alignments, many of which map reads to the wrong species. Then we evaluate several actionable mitigation strategies and review emerging methods showing promise to further improve metagenotyping in response to the rapid growth in genome collections. Our results have implications beyond metagenotyping to the many tools in microbial genomics that depend upon accurate read mapping.
Affiliation(s)
- Chunyu Zhao
- Chan Zuckerberg Biohub, San Francisco, CA, USA; Gladstone Institute of Data Science and Biotechnology, San Francisco, CA, USA
| | - Zhou Jason Shi
- Chan Zuckerberg Biohub, San Francisco, CA, USA; Gladstone Institute of Data Science and Biotechnology, San Francisco, CA, USA
| | - Katherine S Pollard
- Chan Zuckerberg Biohub, San Francisco, CA, USA; Gladstone Institute of Data Science and Biotechnology, San Francisco, CA, USA; Department of Epidemiology & Biostatistics, University of California, San Francisco, San Francisco, CA, USA.
|
33
|
Shah PP, Keough KC, Gjoni K, Santini GT, Abdill RJ, Wickramasinghe NM, Dundes CE, Karnay A, Chen A, Salomon REA, Walsh PJ, Nguyen SC, Whalen S, Joyce EF, Loh KM, Dubois N, Pollard KS, Jain R. An atlas of lamina-associated chromatin across twelve human cell types reveals an intermediate chromatin subtype. Genome Biol 2023; 24:16. [PMID: 36691074 PMCID: PMC9869549 DOI: 10.1186/s13059-023-02849-5] [Citation(s) in RCA: 14] [Impact Index Per Article: 14.0] [Received: 11/05/2021] [Accepted: 01/05/2023] [Indexed: 01/25/2023] [Open Access]
Abstract
BACKGROUND: Association of chromatin with lamin proteins at the nuclear periphery has emerged as a potential mechanism to coordinate cell type-specific gene expression and maintain cellular identity via gene silencing. Unlike many histone modifications and chromatin-associated proteins, lamina-associated domains (LADs) are mapped genome-wide in relatively few genetically normal human cell types, which limits our understanding of the role peripheral chromatin plays in development and disease.
RESULTS: To address this gap, we map LAMIN B1 occupancy across twelve human cell types encompassing pluripotent stem cells, intermediate progenitors, and differentiated cells from all three germ layers. Integrative analyses of this atlas with gene expression and repressive histone modification maps reveal that lamina-associated chromatin in all twelve cell types is organized into at least two subtypes defined by differences in LAMIN B1 occupancy, gene expression, chromatin accessibility, transposable elements, replication timing, and radial positioning. Imaging of fluorescently labeled DNA in single cells validates these subtypes and shows radial positioning of LADs with higher LAMIN B1 occupancy and heterochromatic histone modifications primarily embedded within the lamina. In contrast, the second subtype of lamina-associated chromatin is relatively gene dense, accessible, dynamic across development, and positioned adjacent to the lamina. Most genes gain or lose LAMIN B1 occupancy consistent with cell types along developmental trajectories; however, we also identify examples where the enhancer, but not the gene body and promoter, changes LAD state.
CONCLUSIONS: Altogether, this atlas represents the largest resource to date for peripheral chromatin organization studies and reveals an intermediate chromatin subtype.
Affiliation(s)
- Parisha P. Shah
- Departments of Medicine and Cell and Developmental Biology, Penn CVI, Penn Epigenetics Institute, Perelman School of Medicine, University of Pennsylvania, Smilow TRC, 3400 Civic Center Blvd, Philadelphia, PA 19104 USA
| | - Kathleen C. Keough
- University of California, San Francisco, CA 94117 USA; Gladstone Institute of Data Science and Biotechnology, 1650 Owens Street, San Francisco, CA 94158 USA
| | - Ketrin Gjoni
- University of California, San Francisco, CA 94117 USA; Gladstone Institute of Data Science and Biotechnology, 1650 Owens Street, San Francisco, CA 94158 USA
| | - Garrett T. Santini
- Departments of Medicine and Cell and Developmental Biology, Penn CVI, Penn Epigenetics Institute, Perelman School of Medicine, University of Pennsylvania, Smilow TRC, 3400 Civic Center Blvd, Philadelphia, PA 19104 USA
| | - Richard J. Abdill
- Departments of Medicine and Cell and Developmental Biology, Penn CVI, Penn Epigenetics Institute, Perelman School of Medicine, University of Pennsylvania, Smilow TRC, 3400 Civic Center Blvd, Philadelphia, PA 19104 USA
| | - Nadeera M. Wickramasinghe
- Department of Cell, Developmental and Regenerative Biology, Icahn School of Medicine at Mount Sinai, New York, NY 10029 USA
| | - Carolyn E. Dundes
- Department of Developmental Biology and Institute for Stem Cell Biology and Regenerative Medicine, Stanford University School of Medicine, Stanford, CA 94305 USA
| | - Ashley Karnay
- Departments of Medicine and Cell and Developmental Biology, Penn CVI, Penn Epigenetics Institute, Perelman School of Medicine, University of Pennsylvania, Smilow TRC, 3400 Civic Center Blvd, Philadelphia, PA 19104 USA
| | - Angela Chen
- Department of Developmental Biology and Institute for Stem Cell Biology and Regenerative Medicine, Stanford University School of Medicine, Stanford, CA 94305 USA
| | - Rachel E. A. Salomon
- Department of Developmental Biology and Institute for Stem Cell Biology and Regenerative Medicine, Stanford University School of Medicine, Stanford, CA 94305 USA
| | - Patrick J. Walsh
- Department of Genetics, Penn Epigenetics Institute, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104 USA
| | - Son C. Nguyen
- Department of Genetics, Penn Epigenetics Institute, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104 USA
| | - Sean Whalen
- Gladstone Institute of Data Science and Biotechnology, 1650 Owens Street, San Francisco, CA 94158 USA
| | - Eric F. Joyce
- Department of Genetics, Penn Epigenetics Institute, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104 USA
| | - Kyle M. Loh
- Department of Developmental Biology and Institute for Stem Cell Biology and Regenerative Medicine, Stanford University School of Medicine, Stanford, CA 94305 USA
| | - Nicole Dubois
- Department of Cell, Developmental and Regenerative Biology, Icahn School of Medicine at Mount Sinai, New York, NY 10029 USA
| | - Katherine S. Pollard
- University of California, San Francisco, CA 94117 USA; Gladstone Institute of Data Science and Biotechnology, 1650 Owens Street, San Francisco, CA 94158 USA; Chan Zuckerberg Biohub, San Francisco, CA 94158 USA
| | - Rajan Jain
- Departments of Medicine and Cell and Developmental Biology, Penn CVI, Penn Epigenetics Institute, Perelman School of Medicine, University of Pennsylvania, Smilow TRC, 3400 Civic Center Blvd, Philadelphia, PA 19104 USA
|
34
|
Alexanian M, Padmanabhan A, Nishino T, Travers JG, Ye L, Lee CY, Sadagopan N, Huang Y, Pelonero A, Auclair K, Zhu A, Teran BG, Flanigan W, Kim CKS, Lumbao-Conradson K, Costa M, Jain R, Charo I, Haldar SM, Pollard KS, Vagnozzi RJ, McKinsey TA, Przytycki PF, Srivastava D. Chromatin Remodeling Drives Immune-Fibroblast Crosstalk in Heart Failure Pathogenesis. bioRxiv 2023:2023.01.06.522937. [PMID: 36711864 PMCID: PMC9881961 DOI: 10.1101/2023.01.06.522937] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/09/2023]
Abstract
Chronic inflammation and tissue fibrosis are common stress responses that worsen organ function, yet the molecular mechanisms governing their crosstalk are poorly understood. In diseased organs, stress-induced changes in gene expression fuel maladaptive cell state transitions and pathological interaction between diverse cellular compartments. Although chronic fibroblast activation worsens dysfunction of lung, liver, kidney, and heart, and exacerbates many cancers, the stress-sensing mechanisms initiating the transcriptional activation of fibroblasts are not well understood. Here, we show that conditional deletion of the transcription co-activator Brd4 in Cx3cr1-positive myeloid cells ameliorates heart failure and is associated with a dramatic reduction in fibroblast activation. Analysis of single-cell chromatin accessibility and BRD4 occupancy in vivo in Cx3cr1-positive cells identified a large enhancer proximal to Interleukin-1 beta (Il1b), and a series of CRISPR deletions revealed the precise stress-dependent regulatory element that controlled expression of Il1b in disease. Secreted IL1B functioned non-cell autonomously to activate a p65/RELA-dependent enhancer near the transcription factor MEOX1, resulting in a profibrotic response in human cardiac fibroblasts. In vivo, antibody-mediated IL1B neutralization prevented stress-induced expression of MEOX1, inhibited fibroblast activation, and improved cardiac function in heart failure. The elucidation of BRD4-dependent crosstalk between a specific immune cell subset and fibroblasts through IL1B provides new therapeutic strategies for heart disease and other disorders of chronic inflammation and maladaptive tissue remodeling.
Affiliation(s)
- Michael Alexanian
- Gladstone Institutes; San Francisco, CA, USA
- Roddenberry Center for Stem Cell Biology and Medicine at Gladstone Institutes; San Francisco, CA, USA
- Department of Pediatrics, University of California, San Francisco; San Francisco, CA, USA
| | - Arun Padmanabhan
- Gladstone Institutes; San Francisco, CA, USA
- Roddenberry Center for Stem Cell Biology and Medicine at Gladstone Institutes; San Francisco, CA, USA
- Department of Medicine, Division of Cardiology, University of California, San Francisco; San Francisco, CA, USA
- Chan Zuckerberg Biohub; San Francisco, CA, USA
| | - Tomohiro Nishino
- Gladstone Institutes; San Francisco, CA, USA
- Roddenberry Center for Stem Cell Biology and Medicine at Gladstone Institutes; San Francisco, CA, USA
| | - Joshua G. Travers
- Department of Medicine, Division of Cardiology and Consortium for Fibrosis Research & Translation, University of Colorado Anschutz Medical Campus; Aurora, CO, USA
| | - Lin Ye
- Gladstone Institutes; San Francisco, CA, USA
- Roddenberry Center for Stem Cell Biology and Medicine at Gladstone Institutes; San Francisco, CA, USA
| | - Clara Youngna Lee
- Gladstone Institutes; San Francisco, CA, USA
- Roddenberry Center for Stem Cell Biology and Medicine at Gladstone Institutes; San Francisco, CA, USA
- Department of Medicine, Division of Cardiology, University of California, San Francisco; San Francisco, CA, USA
| | - Nandhini Sadagopan
- Gladstone Institutes; San Francisco, CA, USA
- Roddenberry Center for Stem Cell Biology and Medicine at Gladstone Institutes; San Francisco, CA, USA
- Department of Medicine, Division of Cardiology, University of California, San Francisco; San Francisco, CA, USA
| | - Yu Huang
- Gladstone Institutes; San Francisco, CA, USA
- Roddenberry Center for Stem Cell Biology and Medicine at Gladstone Institutes; San Francisco, CA, USA
| | - Angelo Pelonero
- Gladstone Institutes; San Francisco, CA, USA
- Roddenberry Center for Stem Cell Biology and Medicine at Gladstone Institutes; San Francisco, CA, USA
| | - Kirsten Auclair
- Gladstone Institutes; San Francisco, CA, USA
- Roddenberry Center for Stem Cell Biology and Medicine at Gladstone Institutes; San Francisco, CA, USA
| | - Ada Zhu
- Gladstone Institutes; San Francisco, CA, USA
- Roddenberry Center for Stem Cell Biology and Medicine at Gladstone Institutes; San Francisco, CA, USA
| | - Barbara Gonzalez Teran
- Gladstone Institutes; San Francisco, CA, USA
- Roddenberry Center for Stem Cell Biology and Medicine at Gladstone Institutes; San Francisco, CA, USA
| | - Will Flanigan
- Gladstone Institutes; San Francisco, CA, USA
- UC Berkeley-UCSF Joint Program in Bioengineering; Berkeley, CA, USA
| | - Charis Kee-Seon Kim
- Gladstone Institutes; San Francisco, CA, USA
- Roddenberry Center for Stem Cell Biology and Medicine at Gladstone Institutes; San Francisco, CA, USA
| | - Koya Lumbao-Conradson
- Department of Medicine, Division of Cardiology and Consortium for Fibrosis Research & Translation, University of Colorado Anschutz Medical Campus; Aurora, CO, USA
| | - Mauro Costa
- Gladstone Institutes; San Francisco, CA, USA
- Roddenberry Center for Stem Cell Biology and Medicine at Gladstone Institutes; San Francisco, CA, USA
| | - Rajan Jain
- Cardiovascular Institute, Epigenetics Institute, and Department of Medicine, Perelman School of Medicine, University of Pennsylvania; Philadelphia, PA, USA
| | | | - Saptarsi M. Haldar
- Gladstone Institutes; San Francisco, CA, USA
- Department of Medicine, Division of Cardiology, University of California, San Francisco; San Francisco, CA, USA
- Amgen Research, Cardiometabolic Disorders; South San Francisco, CA, USA
| | - Katherine S. Pollard
- Gladstone Institutes; San Francisco, CA, USA
- Chan Zuckerberg Biohub; San Francisco, CA, USA
- Institute for Computational Health Sciences, University of California, San Francisco; San Francisco, CA, USA
- Department of Epidemiology & Biostatistics, University of California, San Francisco; San Francisco, CA, USA
- Institute for Human Genetics, University of California, San Francisco; San Francisco, CA, USA
| | - Ronald J. Vagnozzi
- Department of Medicine, Division of Cardiology and Consortium for Fibrosis Research & Translation, University of Colorado Anschutz Medical Campus; Aurora, CO, USA
| | - Timothy A. McKinsey
- Department of Medicine, Division of Cardiology and Consortium for Fibrosis Research & Translation, University of Colorado Anschutz Medical Campus; Aurora, CO, USA
| | - Pawel F. Przytycki
- Gladstone Institutes; San Francisco, CA, USA
- Faculty of Computing & Data Sciences, Boston University; Boston, MA, USA
| | - Deepak Srivastava
- Gladstone Institutes; San Francisco, CA, USA
- Roddenberry Center for Stem Cell Biology and Medicine at Gladstone Institutes; San Francisco, CA, USA
- Department of Pediatrics, University of California, San Francisco; San Francisco, CA, USA
- Department of Biochemistry and Biophysics, University of California, San Francisco; San Francisco, CA, USA
|
35
|
Zhao C, Dimitrov B, Goldman M, Nayfach S, Pollard KS. MIDAS2: Metagenomic Intra-species Diversity Analysis System. Bioinformatics 2023; 39:btac713. [PMID: 36321886 PMCID: PMC9805558 DOI: 10.1093/bioinformatics/btac713] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 06/17/2022] [Revised: 10/07/2022] [Accepted: 10/28/2022] [Indexed: 11/07/2022] Open
Abstract
Summary: The Metagenomic Intra-Species Diversity Analysis System (MIDAS) is a scalable metagenomic pipeline that identifies single nucleotide variants (SNVs) and gene copy number variants in microbial populations. Here, we present MIDAS2, which addresses the computational challenges presented by increasingly large reference genome databases, while adding functionality for building custom databases and leveraging paired-end reads to improve SNV accuracy. This fast and scalable reengineering of the MIDAS pipeline enables thousands of metagenomic samples to be efficiently genotyped.
Availability and implementation: The source code is available at https://github.com/czbiohub/MIDAS2.
Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Chunyu Zhao
- Data Science, Chan Zuckerberg Biohub, San Francisco, CA 94158, USA
- Gladstone Institute of Data Science and Biotechnology, Gladstone Institutes, San Francisco, CA 94158, USA
| | | | - Miriam Goldman
- Gladstone Institute of Data Science and Biotechnology, Gladstone Institutes, San Francisco, CA 94158, USA
- Biomedical Informatics Graduate Program, University of California San Francisco, San Francisco, CA 94158, USA
| | - Stephen Nayfach
- Department of Energy, Joint Genome Institute, Berkeley, CA 94720, USA
- Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Katherine S Pollard
- Data Science, Chan Zuckerberg Biohub, San Francisco, CA 94158, USA
- Gladstone Institute of Data Science and Biotechnology, Gladstone Institutes, San Francisco, CA 94158, USA
- Department of Epidemiology and Biostatistics, University of California, San Francisco, CA 94158, USA
|
36
|
Zhao C, Goldman M, Smith BJ, Pollard KS. Genotyping Microbial Communities with MIDAS2: From Metagenomic Reads to Allele Tables. Curr Protoc 2022; 2:e604. [PMID: 36469554 PMCID: PMC9907011 DOI: 10.1002/cpz1.604] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/12/2022]
Abstract
The Metagenomic Intra-Species Diversity Analysis System 2 (MIDAS2) is a scalable pipeline that identifies single nucleotide variants and gene copy number variants in metagenomes using comprehensive reference databases built from public microbial genome collections (metagenotyping). MIDAS2 is the first metagenotyping tool with functionality to control metagenomic read mapping filters and to customize the reference database to the microbial community, features that improve the precision and recall of detected variants. In this article we present four basic protocols for the most common use cases of MIDAS2, along with supporting protocols for installation and use. In addition, we provide in-depth guidance on adjusting command line parameters, editing the reference database, optimizing hardware utilization, and understanding the metagenotyping results. All the steps of metagenotyping, from raw sequencing reads to population genetic analysis, are demonstrated with example data in two downloadable sequencing libraries of single-end metagenomic reads representing a mixture of multiple bacterial species. This set of protocols empowers users to accurately genotype hundreds of species in thousands of samples, providing rich genetic data for studying the evolution and strain-level ecology of microbial communities. © 2022 The Authors. Current Protocols published by Wiley Periodicals LLC.
- Basic Protocol 1: Species prescreening
- Basic Protocol 2: Download MIDAS reference database
- Basic Protocol 3: Population single nucleotide variant calling
- Basic Protocol 4: Pan-genome copy number variant calling
- Support Protocol 1: Installing MIDAS2
- Support Protocol 2: Command line inputs
- Support Protocol 3: Metagenotyping with a custom collection of genomes
- Support Protocol 4: Metagenotyping with advanced parameters
Affiliation(s)
- Chunyu Zhao
- Data Science, Chan Zuckerberg Biohub, San Francisco, California
- Data Science and Biotechnology, Gladstone Institutes, San Francisco, California
- These authors contributed equally to this work
| | - Miriam Goldman
- Data Science and Biotechnology, Gladstone Institutes, San Francisco, California
- Biomedical Informatics, University of California San Francisco, San Francisco, California
- These authors contributed equally to this work
| | - Byron J. Smith
- Data Science and Biotechnology, Gladstone Institutes, San Francisco, California
- Epidemiology and Biostatistics, University of California San Francisco, San Francisco, California
| | - Katherine S. Pollard
- Data Science, Chan Zuckerberg Biohub, San Francisco, California
- Data Science and Biotechnology, Gladstone Institutes, San Francisco, California
- Epidemiology and Biostatistics, University of California San Francisco, San Francisco, California
|
37
|
Abstract
Human accelerated regions (HARs) are the fastest-evolving sequences in the human genome. When HARs were discovered in 2006, their function was mysterious due to scant annotation of the noncoding genome. Diverse technologies, from transgenic animals to machine learning, have consistently shown that HARs function as gene regulatory enhancers with significant enrichment in neurodevelopment. It is now possible to quantitatively measure the enhancer activity of thousands of HARs in parallel and model how each nucleotide contributes to gene expression. These strategies have revealed that many human HAR sequences function differently than their chimpanzee orthologs, though individual nucleotide changes in the same HAR may have opposite effects, consistent with compensatory substitutions. To fully evaluate the role of HARs in human evolution, it will be necessary to experimentally and computationally dissect them across more cell types and developmental stages.
Affiliation(s)
- Sean Whalen
- Gladstone Institute of Data Science and Biotechnology, San Francisco, California, USA
| | - Katherine S Pollard
- Gladstone Institute of Data Science and Biotechnology, San Francisco, California, USA
- Department of Epidemiology and Biostatistics, University of California, San Francisco, California, USA
- Chan Zuckerberg Biohub, San Francisco, California, USA
|
38
|
Spanogiannopoulos P, Kyaw TS, Guthrie BGH, Bradley PH, Lee JV, Melamed J, Malig YNA, Lam KN, Gempis D, Sandy M, Kidder W, Van Blarigan EL, Atreya CE, Venook A, Gerona RR, Goga A, Pollard KS, Turnbaugh PJ. Host and gut bacteria share metabolic pathways for anti-cancer drug metabolism. Nat Microbiol 2022; 7:1605-1620. [PMID: 36138165 PMCID: PMC9530025 DOI: 10.1038/s41564-022-01226-5] [Citation(s) in RCA: 24] [Impact Index Per Article: 12.0] [Received: 05/29/2022] [Accepted: 08/03/2022] [Indexed: 12/15/2022]
Abstract
Pharmaceuticals have extensive reciprocal interactions with the microbiome, but whether bacterial drug sensitivity and metabolism is driven by pathways conserved in host cells remains unclear. Here we show that anti-cancer fluoropyrimidine drugs inhibit the growth of gut bacterial strains from 6 phyla. In both Escherichia coli and mammalian cells, fluoropyrimidines disrupt pyrimidine metabolism. Proteobacteria and Firmicutes metabolized 5-fluorouracil to its inactive metabolite dihydrofluorouracil, mimicking the major host mechanism for drug clearance. The preTA operon was necessary and sufficient for 5-fluorouracil inactivation by E. coli, exhibited high catalytic efficiency for the reductive reaction, decreased the bioavailability and efficacy of oral fluoropyrimidine treatment in mice and was prevalent in the gut microbiomes of colorectal cancer patients. The conservation of both the targets and enzymes for metabolism of therapeutics across domains highlights the need to distinguish the relative contributions of human and microbial cells to drug efficacy and side-effect profiles.
Affiliation(s)
- Peter Spanogiannopoulos
- Department of Microbiology and Immunology, University of California San Francisco, San Francisco, CA, USA
| | - Than S Kyaw
- Department of Microbiology and Immunology, University of California San Francisco, San Francisco, CA, USA
| | - Ben G H Guthrie
- Department of Microbiology and Immunology, University of California San Francisco, San Francisco, CA, USA
| | - Patrick H Bradley
- Gladstone Institutes, San Francisco, CA, USA
- Department of Microbiology, The Ohio State University, Columbus, OH, USA
| | - Joyce V Lee
- Department of Cell and Tissue Biology, University of California San Francisco, San Francisco, CA, USA
| | - Jonathan Melamed
- Clinical Toxicology and Environmental Biomonitoring Laboratory, University of California San Francisco, San Francisco, CA, USA
| | - Ysabella Noelle Amora Malig
- Clinical Toxicology and Environmental Biomonitoring Laboratory, University of California San Francisco, San Francisco, CA, USA
| | - Kathy N Lam
- Department of Microbiology and Immunology, University of California San Francisco, San Francisco, CA, USA
| | - Daryll Gempis
- Department of Microbiology and Immunology, University of California San Francisco, San Francisco, CA, USA
| | - Moriah Sandy
- Department of Medicine, University of California San Francisco, San Francisco, CA, USA
| | - Wesley Kidder
- Department of Medicine, University of California San Francisco, San Francisco, CA, USA
- UCSF Helen Diller Family Comprehensive Cancer Center, San Francisco, CA, USA
| | - Erin L Van Blarigan
- UCSF Helen Diller Family Comprehensive Cancer Center, San Francisco, CA, USA
- Department of Epidemiology and Biostatistics, University of California San Francisco, San Francisco, CA, USA
- Department of Urology, University of California San Francisco, San Francisco, CA, USA
| | - Chloe E Atreya
- Department of Medicine, University of California San Francisco, San Francisco, CA, USA
- UCSF Helen Diller Family Comprehensive Cancer Center, San Francisco, CA, USA
| | - Alan Venook
- Department of Medicine, University of California San Francisco, San Francisco, CA, USA
- UCSF Helen Diller Family Comprehensive Cancer Center, San Francisco, CA, USA
| | - Roy R Gerona
- Clinical Toxicology and Environmental Biomonitoring Laboratory, University of California San Francisco, San Francisco, CA, USA
| | - Andrei Goga
- Department of Cell and Tissue Biology, University of California San Francisco, San Francisco, CA, USA
- UCSF Helen Diller Family Comprehensive Cancer Center, San Francisco, CA, USA
| | - Katherine S Pollard
- Gladstone Institutes, San Francisco, CA, USA
- Department of Epidemiology and Biostatistics, University of California San Francisco, San Francisco, CA, USA
- Institute for Human Genetics, University of California San Francisco, San Francisco, CA, USA
- Bakar Computational Health Sciences Institute, University of California San Francisco, San Francisco, CA, USA
- Chan Zuckerberg Biohub, San Francisco, CA, USA
| | - Peter J Turnbaugh
- Department of Microbiology and Immunology, University of California San Francisco, San Francisco, CA, USA
- Chan Zuckerberg Biohub, San Francisco, CA, USA
|
39
|
Zhu L, Choudhary K, Gonzalez-Teran B, Ang YS, Thomas R, Stone NR, Liu L, Zhou P, Zhu C, Ruan H, Huang Y, Jin S, Pelonero A, Koback F, Padmanabhan A, Sadagopan N, Hsu A, Costa MW, Gifford CA, van Bemmel J, Hüttenhain R, Vedantham V, Conklin BR, Black BL, Bruneau BG, Steinmetz L, Krogan NJ, Pollard KS, Srivastava D. Transcription Factor GATA4 Regulates Cell Type-Specific Splicing Through Direct Interaction With RNA in Human Induced Pluripotent Stem Cell-Derived Cardiac Progenitors. Circulation 2022; 146:770-787. [PMID: 35938400 PMCID: PMC9452483 DOI: 10.1161/circulationaha.121.057620] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Indexed: 01/23/2023]
Abstract
Background: GATA4 (GATA-binding protein 4), a zinc finger-containing, DNA-binding transcription factor, is essential for normal cardiac development and homeostasis in mice and humans, and mutations in this gene have been reported in human heart defects. Defects in alternative splicing are associated with many heart diseases, yet relatively little is known about how cell type- or cell state-specific alternative splicing is achieved in the heart. Here, we show that GATA4 regulates cell type-specific splicing through direct interaction with RNA and the spliceosome in human induced pluripotent stem cell-derived cardiac progenitors.
Methods: We leveraged a combination of unbiased approaches including affinity purification of GATA4 and mass spectrometry, enhanced cross-linking with immunoprecipitation, electrophoretic mobility shift assays, in vitro splicing assays, and unbiased transcriptomic analysis to uncover GATA4's novel function as a splicing regulator in human induced pluripotent stem cell-derived cardiac progenitors.
Results: We found that GATA4 interacts with many members of the spliceosome complex in human induced pluripotent stem cell-derived cardiac progenitors. Enhanced cross-linking with immunoprecipitation demonstrated that GATA4 also directly binds to a large number of mRNAs through defined RNA motifs in a sequence-specific manner. In vitro splicing assays indicated that GATA4 regulates alternative splicing through direct RNA binding, resulting in functionally distinct protein products. Correspondingly, knockdown of GATA4 in human induced pluripotent stem cell-derived cardiac progenitors resulted in differential alternative splicing of genes involved in cytoskeleton organization and calcium ion import, with functional consequences associated with the protein isoforms.
Conclusions: This study shows that in addition to its well described transcriptional function, GATA4 interacts with members of the spliceosome complex and regulates cell type-specific alternative splicing via sequence-specific interactions with RNA. Several genes that have splicing regulated by GATA4 have functional consequences and many are associated with dilated cardiomyopathy, suggesting a novel role for GATA4 in achieving the necessary cardiac proteome in normal and stress-responsive conditions.
Affiliation(s)
- Lili Zhu
- Gladstone Institutes, San Francisco, CA, USA
- Roddenberry Center for Stem Cell Biology and Medicine at Gladstone, San Francisco, CA, USA
| | | | - Barbara Gonzalez-Teran
- Gladstone Institutes, San Francisco, CA, USA
- Roddenberry Center for Stem Cell Biology and Medicine at Gladstone, San Francisco, CA, USA
| | - Yen-Sin Ang
- Gladstone Institutes, San Francisco, CA, USA
- Roddenberry Center for Stem Cell Biology and Medicine at Gladstone, San Francisco, CA, USA
| | | | - Nicole R. Stone
- Gladstone Institutes, San Francisco, CA, USA
- Roddenberry Center for Stem Cell Biology and Medicine at Gladstone, San Francisco, CA, USA
| | - Lei Liu
- Gladstone Institutes, San Francisco, CA, USA
- Roddenberry Center for Stem Cell Biology and Medicine at Gladstone, San Francisco, CA, USA
| | - Ping Zhou
- Gladstone Institutes, San Francisco, CA, USA
- Roddenberry Center for Stem Cell Biology and Medicine at Gladstone, San Francisco, CA, USA
| | - Chenchen Zhu
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Stanford Genome Technology Center, Palo Alto, CA, USA
| | - Hongmei Ruan
- Department of Medicine, University of California, San Francisco, CA, USA
- Cardiovascular Research Institute, University of California, San Francisco, CA, USA
| | - Yu Huang
- Gladstone Institutes, San Francisco, CA, USA
- Roddenberry Center for Stem Cell Biology and Medicine at Gladstone, San Francisco, CA, USA
| | - Shibo Jin
- Division of Cellular and Developmental Biology, Molecular and Cell Biology Department, University of California at Berkeley, Berkeley, CA, USA
| | - Angelo Pelonero
- Gladstone Institutes, San Francisco, CA, USA
- Roddenberry Center for Stem Cell Biology and Medicine at Gladstone, San Francisco, CA, USA
| | - Frances Koback
- Gladstone Institutes, San Francisco, CA, USA
- Roddenberry Center for Stem Cell Biology and Medicine at Gladstone, San Francisco, CA, USA
| | - Arun Padmanabhan
- Gladstone Institutes, San Francisco, CA, USA
- Roddenberry Center for Stem Cell Biology and Medicine at Gladstone, San Francisco, CA, USA
| | - Nandhini Sadagopan
- Gladstone Institutes, San Francisco, CA, USA
- Roddenberry Center for Stem Cell Biology and Medicine at Gladstone, San Francisco, CA, USA
| | - Austin Hsu
- Gladstone Institutes, San Francisco, CA, USA
| | - Mauro W. Costa
- Gladstone Institutes, San Francisco, CA, USA
- Roddenberry Center for Stem Cell Biology and Medicine at Gladstone, San Francisco, CA, USA
| | - Casey A. Gifford
- Gladstone Institutes, San Francisco, CA, USA
- Roddenberry Center for Stem Cell Biology and Medicine at Gladstone, San Francisco, CA, USA
| | - Joke van Bemmel
- Gladstone Institutes, San Francisco, CA, USA
- Roddenberry Center for Stem Cell Biology and Medicine at Gladstone, San Francisco, CA, USA
| | - Ruth Hüttenhain
- Gladstone Institutes, San Francisco, CA, USA
- Department of Cellular and Molecular Pharmacology, University of California, San Francisco, CA, USA
- Quantitative Biosciences Institute (QBI), University of California, San Francisco, CA, USA
| | - Vasanth Vedantham
- Department of Medicine, University of California, San Francisco, CA, USA
- Cardiovascular Research Institute, University of California, San Francisco, CA, USA
| | - Bruce R. Conklin
- Gladstone Institutes, San Francisco, CA, USA
- Roddenberry Center for Stem Cell Biology and Medicine at Gladstone, San Francisco, CA, USA
- Department of Medicine, University of California, San Francisco, CA, USA
- Department of Cellular and Molecular Pharmacology, University of California, San Francisco, CA, USA
| | - Brian L. Black
- Cardiovascular Research Institute, University of California, San Francisco, CA, USA
| | - Benoit G. Bruneau
- Gladstone Institutes, San Francisco, CA, USA
- Roddenberry Center for Stem Cell Biology and Medicine at Gladstone, San Francisco, CA, USA
- Cardiovascular Research Institute, University of California, San Francisco, CA, USA
- Department of Pediatrics, University of California, San Francisco, CA, USA
| | - Lars Steinmetz
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Stanford Genome Technology Center, Palo Alto, CA, USA
- European Molecular Biology Laboratory (EMBL), Genome Biology Unit, Heidelberg, Germany
| | - Nevan J. Krogan
- Gladstone Institutes, San Francisco, CA, USA
- Department of Cellular and Molecular Pharmacology, University of California, San Francisco, CA, USA
- Quantitative Biosciences Institute (QBI), University of California, San Francisco, CA, USA
| | - Katherine S. Pollard
- Gladstone Institutes, San Francisco, CA, USA
- Chan Zuckerberg Biohub, San Francisco, CA, USA
- Department of Epidemiology & Biostatistics, Institute for Computational Health Sciences, and Institute for Human Genetics, University of California, San Francisco, CA, USA
| | - Deepak Srivastava
- Gladstone Institutes, San Francisco, CA, USA
- Roddenberry Center for Stem Cell Biology and Medicine at Gladstone, San Francisco, CA, USA
- Department of Pediatrics, University of California, San Francisco, CA, USA
- Department of Biochemistry and Biophysics, University of California, San Francisco, CA, USA
|
40
|
Wagle M, Zarei M, Lovett-Barron M, Poston KT, Xu J, Ramey V, Pollard KS, Prober DA, Schulkin J, Deisseroth K, Guo S. Brain-wide perception of the emotional valence of light is regulated by distinct hypothalamic neurons. Mol Psychiatry 2022; 27:3777-3793. [PMID: 35484242 PMCID: PMC9613822 DOI: 10.1038/s41380-022-01567-x] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 11/08/2021] [Revised: 02/25/2022] [Accepted: 04/06/2022] [Indexed: 02/08/2023]
Abstract
Salient sensory stimuli are perceived by the brain, which guides both the timing and outcome of behaviors in a context-dependent manner. Light is such a stimulus, which is used in treating mood disorders often associated with a dysregulated hypothalamic-pituitary-adrenal stress axis. Relationships between the emotional valence of light and the hypothalamus, and how they interact to exert brain-wide impacts remain unclear. Employing larval zebrafish with analogous hypothalamic systems to mammals, we show in free-swimming animals that hypothalamic corticotropin releasing factor (CRFHy) neurons promote dark avoidance, and such role is not shared by other hypothalamic peptidergic neurons. Single-neuron projection analyses uncover processes extended by individual CRFHy neurons to multiple targets including sensorimotor and decision-making areas. In vivo calcium imaging uncovers a complex and heterogeneous response of individual CRFHy neurons to the light or dark stimulus, with a reduced overall sum of CRF neuronal activity in the presence of light. Brain-wide calcium imaging under alternating light/dark stimuli further identifies distinct and distributed photic response neuronal types. CRFHy neuronal ablation increases an overall representation of light in the brain and broadly enhances the functional connectivity associated with an exploratory brain state. These findings delineate brain-wide photic perception, uncover a previously unknown role of CRFHy neurons in regulating the perception and emotional valence of light, and suggest that light therapy may alleviate mood disorders through reducing an overall sum of CRF neuronal activity.
Affiliation(s)
- Mahendra Wagle
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, CA, 94143-2811, USA
| | - Mahdi Zarei
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, CA, 94143-2811, USA
| | - Matthew Lovett-Barron
- Department of Bioengineering, Howard Hughes Medical Institute, Stanford University, Stanford, CA, USA
- Neurobiology Section, Division of Biological Sciences, University of California, San Diego, La Jolla, CA, USA
| | - Kristina Tyler Poston
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, CA, 94143-2811, USA
| | - Jin Xu
- Tianqiao and Chrissy Chen Institute for Neuroscience, Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, 91125, USA
| | - Vince Ramey
- Biophysics Graduate Group, University of California, Berkeley, CA, USA
- Invitae Inc., San Francisco, CA, USA
| | - Katherine S Pollard
- Gladstone Institute of Data Science & Biotechnology, San Francisco, CA, USA
- Department of Epidemiology & Biostatistics, University of California, San Francisco, CA, USA
- Chan Zuckerberg Biohub, San Francisco, CA, USA
| | - David A Prober
- Tianqiao and Chrissy Chen Institute for Neuroscience, Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, 91125, USA
| | - Jay Schulkin
- Department of Obstetrics & Gynecology, School of Medicine, University of Washington, Seattle, WA, USA
| | - Karl Deisseroth
- Department of Bioengineering, Howard Hughes Medical Institute, Stanford University, Stanford, CA, USA
| | - Su Guo
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, CA, 94143-2811, USA
- Programs in Human Genetics and Biological Sciences, Kavli Institute of Fundamental Neuroscience, The Eli and Edythe Broad Center of Regeneration Medicine and Stem Cell Research, Bakar Aging Research Institute, University of California, San Francisco, CA, 94143-2811, USA
|
41
|
Blair AP, Hu RK, Farah EN, Chi NC, Pollard KS, Przytycki PF, Kathiriya IS, Bruneau BG. Cell Layers: uncovering clustering structure in unsupervised single-cell transcriptomic analysis. Bioinform Adv 2022; 2:vbac051. [PMID: 35967929 PMCID: PMC9362878 DOI: 10.1093/bioadv/vbac051] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/21/2022] [Revised: 05/23/2022] [Accepted: 08/01/2022] [Indexed: 11/19/2022]
Abstract
Motivation: Unsupervised clustering of single-cell transcriptomics is a powerful method for identifying cell populations. Static visualization techniques for single-cell clustering only display results for a single resolution parameter. Analysts will often evaluate more than one resolution parameter but then only report one.
Results: We developed Cell Layers, an interactive Sankey tool for the quantitative investigation of gene expression, co-expression, biological processes and cluster integrity across clustering resolutions. Cell Layers enhances the interpretability of single-cell clustering by linking molecular data and cluster evaluation metrics, providing novel insight into cell populations.
Availability and implementation: https://github.com/apblair/CellLayers.
Affiliation(s)
- Andrew P Blair
- Biological and Medical Informatics Graduate Program, University of California, San Francisco, CA 94143, USA
- Gladstone Institutes, San Francisco, CA 94158, USA
| | - Robert K Hu
- Division of Cardiology, Department of Medicine, University of California, San Diego, CA 92093, USA
| | - Elie N Farah
- Division of Cardiology, Department of Medicine, University of California, San Diego, CA 92093, USA
- Biomedical Sciences Graduate Program, University of California, San Diego, CA 92093, USA
| | - Neil C Chi
- Division of Cardiology, Department of Medicine, University of California, San Diego, CA 92093, USA
- Institute for Genomic Medicine, University of California, San Diego, CA 92093, USA
| | - Katherine S Pollard
- Gladstone Institutes, San Francisco, CA 94158, USA
- Chan-Zuckerberg Biohub, San Francisco, CA 94143, USA
- Bakar Computational Health Sciences Institute, University of California, San Francisco, CA 94143, USA
- Institute for Human Genetics, University of California, San Francisco, CA 94143, USA
- Department of Epidemiology and Biostatistics, University of California, San Francisco, CA 94143, USA
- Quantitative Biology Institute, University of California, San Francisco, CA 94143, USA
| | | | - Irfan S Kathiriya
- Gladstone Institutes, San Francisco, CA 94158, USA
- Department of Anesthesia and Perioperative Care, University of California, San Francisco, CA 94143, USA
| | - Benoit G Bruneau
- Gladstone Institutes, San Francisco, CA 94158, USA
- Roddenberry Center for Stem Cell Biology and Medicine, Gladstone Institutes, San Francisco, CA 94158, USA
- Cardiovascular Research Institute, University of California, San Francisco, CA 94143, USA
- Department of Pediatrics, University of California, San Francisco, CA 94143, USA
|
42
|
Smith BJ, Li X, Shi ZJ, Abate A, Pollard KS. Scalable Microbial Strain Inference in Metagenomic Data Using StrainFacts. Front Bioinform 2022; 2:867386. [PMID: 36304283 PMCID: PMC9580935 DOI: 10.3389/fbinf.2022.867386] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/01/2022] [Accepted: 04/14/2022] [Indexed: 11/25/2022] Open
Abstract
While genome databases are nearing a complete catalog of species commonly inhabiting the human gut, their representation of intraspecific diversity is lacking for all but the most abundant and frequently studied taxa. Statistical deconvolution of allele frequencies from shotgun metagenomic data into strain genotypes and relative abundances is a promising approach, but existing methods are limited by computational scalability. Here we introduce StrainFacts, a method for strain deconvolution that enables inference across tens of thousands of metagenomes. We harness a “fuzzy” genotype approximation that makes the underlying graphical model fully differentiable, unlike existing methods. This allows parameter estimates to be optimized with gradient-based methods, speeding up model fitting by two orders of magnitude. A GPU implementation provides additional scalability. Extensive simulations show that StrainFacts can perform strain inference on thousands of metagenomes and has comparable accuracy to more computationally intensive tools. We further validate our strain inferences using single-cell genomic sequencing from a human stool sample. Applying StrainFacts to a collection of more than 10,000 publicly available human stool metagenomes, we quantify patterns of strain diversity, biogeography, and linkage-disequilibrium that agree with and expand on what is known based on existing reference genomes. StrainFacts paves the way for large-scale biogeography and population genetic studies of microbiomes using metagenomic data.
Affiliation(s)
- Byron J. Smith
- The Gladstone Institute of Data Science and Biotechnology, San Francisco, CA, United States
- Department of Epidemiology and Biostatistics, University of California, San Francisco, San Francisco, CA, United States
| | - Xiangpeng Li
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA, United States
| | - Zhou Jason Shi
- The Gladstone Institute of Data Science and Biotechnology, San Francisco, CA, United States
- Chan-Zuckerberg Biohub, San Francisco, CA, United States
| | - Adam Abate
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA, United States
- Chan-Zuckerberg Biohub, San Francisco, CA, United States
| | - Katherine S. Pollard
- The Gladstone Institute of Data Science and Biotechnology, San Francisco, CA, United States
- Department of Epidemiology and Biostatistics, University of California, San Francisco, San Francisco, CA, United States
- Chan-Zuckerberg Biohub, San Francisco, CA, United States
- *Correspondence: Katherine S. Pollard
|
43
|
Lyalina S, Stepanauskas R, Wu F, Sanjabi S, Pollard KS. Single cell genome sequencing of laboratory mouse microbiota improves taxonomic and functional resolution of this model microbial community. PLoS One 2022; 17:e0261795. [PMID: 35417481 PMCID: PMC9007364 DOI: 10.1371/journal.pone.0261795] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 11/28/2021] [Accepted: 03/01/2022] [Indexed: 12/16/2022] Open
Abstract
Laboratory mice are widely studied as models of mammalian biology, including the microbiota. However, much of the taxonomic and functional diversity of the mouse gut microbiome is missed in current metagenomic studies, because genome databases have not achieved a balanced representation of the diverse members of this ecosystem. Towards solving this problem, we used flow cytometry and low-coverage sequencing to capture the genomes of 764 single cells from the stool of three laboratory mice. From these, we generated 298 high-coverage microbial genome assemblies, which we annotated for open reading frames and phylogenetic placement. These genomes increase the gene catalog and phylogenetic breadth of the mouse microbiota, adding 135 novel species with the greatest increase in diversity to the Muribaculaceae and Bacteroidaceae families. This new diversity also improves the read mapping rate, taxonomic classifier performance, and gene detection rate of mouse stool metagenomes. The novel microbial functions revealed through our single-cell genomes highlight previously invisible pathways that may be important for life in the murine gastrointestinal tract.
Affiliation(s)
- Svetlana Lyalina
- Gladstone Institutes, San Francisco, CA, United States of America
- Ramunas Stepanauskas
- Bigelow Laboratory for Ocean Sciences, East Boothbay, ME, United States of America
- Frank Wu
- Gladstone Institutes, San Francisco, CA, United States of America
- Shomyseh Sanjabi
- Gladstone Institutes, San Francisco, CA, United States of America
- Department of Microbiology & Immunology, University of California, San Francisco, San Francisco, CA, United States of America
- Katherine S. Pollard
- Gladstone Institutes, San Francisco, CA, United States of America
- Department of Epidemiology & Biostatistics, Institute for Human Genetics, and Institute for Computational Health Sciences, University of California, San Francisco, San Francisco, CA, United States of America
- Chan-Zuckerberg Biohub, San Francisco, CA, United States of America
- * E-mail:
|
44
|
Smith BJ, Piceno Y, Zydek M, Zhang B, Syriani LA, Terdiman JP, Kassam Z, Ma A, Lynch SV, Pollard KS, El-Nachef N. Strain-resolved analysis in a randomized trial of antibiotic pretreatment and maintenance dose delivery mode with fecal microbiota transplant for ulcerative colitis. Sci Rep 2022; 12:5517. [PMID: 35365713 PMCID: PMC8976058 DOI: 10.1038/s41598-022-09307-5] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 09/02/2021] [Accepted: 03/16/2022] [Indexed: 01/04/2023] Open
Abstract
Fecal microbiota transplant is a promising therapy for ulcerative colitis. Parameters maximizing effectiveness and tolerability are not yet clear, and it is not known how important the transmission of donor microbes to patients is. Here (ClinicalTrials.gov: NCT03006809) we have tested the effects of antibiotic pretreatment and compared two modes of maintenance dose delivery, capsules versus enema, in a randomized, pilot, open-label, 2 × 2 factorial design with 22 patients analyzed with mild to moderate UC. Clinically, the treatment was well-tolerated with a favorable safety profile. Of patients who received antibiotic pretreatment, 6 of 11 experienced remission after 6 weeks of treatment, versus 2 of 11 non-pretreated patients (log odds ratio: 1.69, 95% confidence interval: −0.25 to 3.62). No significant differences were found between maintenance dosing via capsules versus enema. In exploratory analyses, microbiome turnover at both the species and strain levels was extensive and significantly more pronounced in the pretreated patients. Associations were also revealed between taxonomic turnover and changes in the composition of primary and secondary bile acids. Together these findings suggest that antibiotic pretreatment contributes to microbiome engraftment and possibly clinical effectiveness, and validate longitudinal strain tracking as a powerful way to monitor the dynamics and impact of microbiota transfer.
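The log odds ratio and interval quoted in the abstract can be checked from the remission counts alone. The sketch below assumes a standard Wald interval on the log odds ratio with z = 1.96; the abstract does not state which method the authors used, and the function name is illustrative only.

```python
import math

def log_odds_ratio_wald(a, b, c, d, z=1.96):
    """Log odds ratio and Wald interval for a 2x2 table.

    a, b: events and non-events in group 1
    c, d: events and non-events in group 2
    """
    log_or = math.log((a * d) / (b * c))
    # Standard error of the log odds ratio (assumed Wald approximation).
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return log_or, log_or - z * se, log_or + z * se

# Counts from the abstract: 6 of 11 pretreated patients in remission,
# versus 2 of 11 non-pretreated patients.
estimate, lower, upper = log_odds_ratio_wald(6, 5, 2, 9)
print(f"{estimate:.2f} ({lower:.2f} to {upper:.2f})")
```

Under this assumption the result agrees with the reported values to two decimal places. Because the interval spans zero, the pilot data are compatible with no difference in remission between the two arms.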
Affiliation(s)
- Byron J Smith
- Gladstone Institute of Data Science and Biotechnology, San Francisco, CA, USA; Department of Epidemiology and Biostatistics, University of California, San Francisco, CA, USA
- Martin Zydek
- Division of Gastroenterology, University of California, San Francisco, CA, USA
- Bing Zhang
- Division of Gastroenterology, University of California, San Francisco, CA, USA; Division of Gastrointestinal and Liver Diseases, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
- Lara Aboud Syriani
- College of Osteopathic Medicine of the Pacific, Western University of Health Sciences, Pomona, CA, USA
- Jonathan P Terdiman
- Division of Gastroenterology, University of California, San Francisco, CA, USA
- Averil Ma
- Department of Medicine, University of California, San Francisco, CA, USA
- Susan V Lynch
- Division of Gastroenterology, University of California, San Francisco, CA, USA; Benioff Center for Microbiome Medicine, University of California, San Francisco, CA, USA
- Katherine S Pollard
- Gladstone Institute of Data Science and Biotechnology, San Francisco, CA, USA; Department of Epidemiology and Biostatistics, University of California, San Francisco, CA, USA; Chan Zuckerberg Biohub, San Francisco, CA, USA.
- Najwa El-Nachef
- Division of Gastroenterology, University of California, San Francisco, CA, USA.
|
45
|
Shi ZJ, Dimitrov B, Zhao C, Nayfach S, Pollard KS. Fast and accurate metagenotyping of the human gut microbiome with GT-Pro. Nat Biotechnol 2022; 40:507-516. [PMID: 34949778 DOI: 10.1038/s41587-021-01102-3] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 06/29/2020] [Accepted: 09/20/2021] [Indexed: 02/07/2023]
Abstract
Single nucleotide polymorphisms (SNPs) in metagenomics are used to quantify population structure, track strains and identify genetic determinants of microbial phenotypes. However, existing alignment-based approaches for metagenomic SNP detection require high-performance computing and enough read coverage to distinguish SNPs from sequencing errors. To address these issues, we developed the GenoTyper for Prokaryotes (GT-Pro), a suite of methods to catalog SNPs from genomes and use unique k-mers to rapidly genotype these SNPs from metagenomes. Compared to methods that use read alignment, GT-Pro is more accurate and two orders of magnitude faster. Using high-quality genomes, we constructed a catalog of 104 million SNPs in 909 human gut species and used unique k-mers targeting this catalog to characterize the global population structure of gut microbes from 7,459 samples. GT-Pro enables fast and memory-efficient metagenotyping of millions of SNPs on a personal computer.
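The general idea of genotyping known SNPs by exact k-mer matching, rather than read alignment, can be illustrated with a toy sketch. This is not GT-Pro's implementation: the data layout, function names, and the simple rule for dropping non-unique k-mers are all assumptions made for illustration.

```python
from collections import Counter

def build_kmer_index(snps, k):
    """Map each allele-specific k-mer to (snp_id, allele).

    snps: dict of snp_id -> (left_flank, alleles, right_flank), a
    hypothetical catalog format chosen for this sketch.
    """
    index = {}
    ambiguous = set()
    for snp_id, (left, alleles, right) in snps.items():
        for allele in alleles:
            context = left + allele + right
            for start in range(len(context) - k + 1):
                # Keep only k-mers that cover the SNP position.
                if not (start <= len(left) < start + k):
                    continue
                kmer = context[start:start + k]
                if kmer in index and index[kmer] != (snp_id, allele):
                    ambiguous.add(kmer)
                index[kmer] = (snp_id, allele)
    # Discard k-mers that point at more than one SNP allele.
    for kmer in ambiguous:
        del index[kmer]
    return index

def genotype_reads(reads, index, k):
    """Count exact k-mer hits per (snp_id, allele) across all reads."""
    counts = Counter()
    for read in reads:
        for start in range(len(read) - k + 1):
            hit = index.get(read[start:start + k])
            if hit is not None:
                counts[hit] += 1
    return counts

# Toy data: one biallelic SNP (A/G) with 4-base flanks, and three short reads.
toy_snps = {"snp1": ("ACGT", "AG", "TTCA")}
toy_index = build_kmer_index(toy_snps, k=5)
toy_counts = genotype_reads(["CGTATTC", "ACGTGTT", "GGGGGGG"], toy_index, k=5)
print(dict(toy_counts))
```

Because each lookup is a hash-table hit on a fixed-length string, the cost per read depends on read length rather than on reference size, which is one plausible reason a k-mer approach can be much faster than alignment.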
Affiliation(s)
- Zhou Jason Shi
- Data Science, Chan Zuckerberg Biohub, San Francisco, CA, USA; Data Science and Biotechnology, Gladstone Institutes, San Francisco, CA, USA
- Chunyu Zhao
- Data Science, Chan Zuckerberg Biohub, San Francisco, CA, USA
- Stephen Nayfach
- Department of Energy, Joint Genome Institute, Walnut Creek, CA, USA; Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA.
- Katherine S Pollard
- Data Science, Chan Zuckerberg Biohub, San Francisco, CA, USA; Data Science and Biotechnology, Gladstone Institutes, San Francisco, CA, USA; Epidemiology and Biostatistics, University of California, San Francisco, CA, USA.
|
46
|
Przytycki PF, Pollard KS. CellWalkR: An R Package for integrating and visualizing single-cell and bulk data to resolve regulatory elements. Bioinformatics 2022; 38:2621-2623. [PMID: 35274675 PMCID: PMC9048661 DOI: 10.1093/bioinformatics/btac150] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/07/2021] [Revised: 01/11/2022] [Accepted: 03/08/2022] [Indexed: 11/30/2022] Open
Abstract
Summary: CellWalkR is an R package that integrates single-cell open chromatin data with cell type labels and bulk epigenetic data to identify cell type-specific regulatory regions. A Graphics Processing Unit (GPU) implementation and downsampling strategies enable thousands of cells to be processed in seconds. CellWalkR’s user-friendly interface provides interactive analysis and visualization of cell labels and regulatory region mappings. Availability and implementation: CellWalkR is freely available as an R package under a GNU GPL-2.0 License and can be accessed from https://github.com/PFPrzytycki/CellWalkR with an accompanying vignette. Supplementary information: Supplementary data are available at Bioinformatics online.
|
47
|
Gonzalez-Teran B, Pittman M, Felix F, Thomas R, Richmond-Buccola D, Hüttenhain R, Choudhary K, Moroni E, Costa MW, Huang Y, Padmanabhan A, Alexanian M, Lee CY, Maven BEJ, Samse-Knapp K, Morton SU, McGregor M, Gifford CA, Seidman JG, Seidman CE, Gelb BD, Colombo G, Conklin BR, Black BL, Bruneau BG, Krogan NJ, Pollard KS, Srivastava D. Transcription factor protein interactomes reveal genetic determinants in heart disease. Cell 2022; 185:794-814.e30. [PMID: 35182466 PMCID: PMC8923057 DOI: 10.1016/j.cell.2022.01.021] [Citation(s) in RCA: 26] [Impact Index Per Article: 13.0] [Received: 11/25/2020] [Revised: 08/20/2021] [Accepted: 01/25/2022] [Indexed: 02/08/2023]
Abstract
Congenital heart disease (CHD) is present in 1% of live births, yet identification of causal mutations remains challenging. We hypothesized that genetic determinants for CHDs may lie in the protein interactomes of transcription factors whose mutations cause CHDs. Defining the interactomes of two transcription factors haplo-insufficient in CHD, GATA4 and TBX5, within human cardiac progenitors, and integrating the results with nearly 9,000 exomes from proband-parent trios revealed an enrichment of de novo missense variants associated with CHD within the interactomes. Scoring variants of interactome members based on residue, gene, and proband features identified likely CHD-causing genes, including the epigenetic reader GLYR1. GLYR1 and GATA4 widely co-occupied and co-activated cardiac developmental genes, and the identified GLYR1 missense variant disrupted interaction with GATA4, impairing in vitro and in vivo function in mice. This integrative proteomic and genetic approach provides a framework for prioritizing and interrogating genetic variants in heart disease.
Affiliation(s)
- Barbara Gonzalez-Teran
- Gladstone Institutes, San Francisco, CA, USA; Roddenberry Center for Stem Cell Biology and Medicine at Gladstone, San Francisco, CA, USA
- Maureen Pittman
- Gladstone Institutes, San Francisco, CA, USA; Department of Epidemiology & Biostatistics, Institute for Computational Health Sciences, and Institute for Human Genetics, University of California San Francisco, San Francisco, CA, USA
- Franco Felix
- Gladstone Institutes, San Francisco, CA, USA; Roddenberry Center for Stem Cell Biology and Medicine at Gladstone, San Francisco, CA, USA
- Desmond Richmond-Buccola
- Gladstone Institutes, San Francisco, CA, USA; Roddenberry Center for Stem Cell Biology and Medicine at Gladstone, San Francisco, CA, USA
- Ruth Hüttenhain
- Gladstone Institutes, San Francisco, CA, USA; Department of Cellular and Molecular Pharmacology, University of California San Francisco, San Francisco, CA, USA; Quantitative Biosciences Institute (QBI), University of California San Francisco, San Francisco, CA, USA
- Mauro W Costa
- Gladstone Institutes, San Francisco, CA, USA; Roddenberry Center for Stem Cell Biology and Medicine at Gladstone, San Francisco, CA, USA
- Yu Huang
- Gladstone Institutes, San Francisco, CA, USA; Roddenberry Center for Stem Cell Biology and Medicine at Gladstone, San Francisco, CA, USA
- Arun Padmanabhan
- Gladstone Institutes, San Francisco, CA, USA; Roddenberry Center for Stem Cell Biology and Medicine at Gladstone, San Francisco, CA, USA; Division of Cardiology, Department of Medicine, University of California, San Francisco, CA, USA
- Michael Alexanian
- Gladstone Institutes, San Francisco, CA, USA; Roddenberry Center for Stem Cell Biology and Medicine at Gladstone, San Francisco, CA, USA
- Clara Youngna Lee
- Gladstone Institutes, San Francisco, CA, USA; Roddenberry Center for Stem Cell Biology and Medicine at Gladstone, San Francisco, CA, USA
- Bonnie E J Maven
- Gladstone Institutes, San Francisco, CA, USA; Roddenberry Center for Stem Cell Biology and Medicine at Gladstone, San Francisco, CA, USA; Developmental and Stem Cell Biology Graduate Program, University of California San Francisco, San Francisco, CA, USA
- Kaitlen Samse-Knapp
- Gladstone Institutes, San Francisco, CA, USA; Roddenberry Center for Stem Cell Biology and Medicine at Gladstone, San Francisco, CA, USA
- Sarah U Morton
- Division of Newborn Medicine, Department of Medicine, Boston Children's Hospital, Boston, MA, USA; Department of Pediatrics, Harvard Medical School, Boston, MA, USA
- Michael McGregor
- Gladstone Institutes, San Francisco, CA, USA; Department of Cellular and Molecular Pharmacology, University of California San Francisco, San Francisco, CA, USA; Quantitative Biosciences Institute (QBI), University of California San Francisco, San Francisco, CA, USA
- Casey A Gifford
- Gladstone Institutes, San Francisco, CA, USA; Roddenberry Center for Stem Cell Biology and Medicine at Gladstone, San Francisco, CA, USA
- J G Seidman
- Department of Genetics, Harvard Medical School, Boston, MA, USA
- Christine E Seidman
- Department of Genetics, Harvard Medical School, Boston, MA, USA; Howard Hughes Medical Institute, Harvard Medical School, Boston, MA, USA; Cardiovascular Division, Brigham and Women's Hospital, Boston, MA, USA
- Bruce D Gelb
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA; Mindich Child Health and Development Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA; Department of Pediatrics, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Bruce R Conklin
- Gladstone Institutes, San Francisco, CA, USA; Roddenberry Center for Stem Cell Biology and Medicine at Gladstone, San Francisco, CA, USA
- Brian L Black
- Cardiovascular Research Institute, University of California San Francisco, San Francisco, CA, USA
- Benoit G Bruneau
- Gladstone Institutes, San Francisco, CA, USA; Roddenberry Center for Stem Cell Biology and Medicine at Gladstone, San Francisco, CA, USA; Cardiovascular Research Institute, University of California San Francisco, San Francisco, CA, USA; Division of Cardiology, Department of Pediatrics, UCSF School of Medicine, San Francisco, CA, USA
- Nevan J Krogan
- Gladstone Institutes, San Francisco, CA, USA; Department of Cellular and Molecular Pharmacology, University of California San Francisco, San Francisco, CA, USA; Quantitative Biosciences Institute (QBI), University of California San Francisco, San Francisco, CA, USA
- Katherine S Pollard
- Gladstone Institutes, San Francisco, CA, USA; Chan Zuckerberg Biohub, San Francisco, CA, USA; Department of Epidemiology & Biostatistics, Institute for Computational Health Sciences, and Institute for Human Genetics, University of California San Francisco, San Francisco, CA, USA.
- Deepak Srivastava
- Gladstone Institutes, San Francisco, CA, USA; Roddenberry Center for Stem Cell Biology and Medicine at Gladstone, San Francisco, CA, USA; Division of Cardiology, Department of Pediatrics, UCSF School of Medicine, San Francisco, CA, USA; Department of Biochemistry and Biophysics, University of California San Francisco, San Francisco, CA, USA.
|
48
|
Markenscoff-Papadimitriou E, Binyameen F, Whalen S, Price J, Lim K, Ypsilanti AR, Catta-Preta R, Pai ELL, Mu X, Xu D, Pollard KS, Nord AS, State MW, Rubenstein JL. Autism risk gene POGZ promotes chromatin accessibility and expression of clustered synaptic genes. Cell Rep 2021; 37:110089. [PMID: 34879283 PMCID: PMC9512081 DOI: 10.1016/j.celrep.2021.110089] [Citation(s) in RCA: 31] [Impact Index Per Article: 10.3] [Received: 01/29/2021] [Revised: 10/11/2021] [Accepted: 11/11/2021] [Indexed: 12/31/2022] Open
Abstract
Deleterious genetic variants in POGZ, which encodes the chromatin regulator Pogo Transposable Element with ZNF Domain protein, are strongly associated with autism spectrum disorder (ASD). Although it is a high-confidence ASD risk gene, the neurodevelopmental functions of POGZ remain unclear. Here we reveal the genomic binding of POGZ in the developing forebrain at euchromatic loci and gene regulatory elements (REs). We profile chromatin accessibility and gene expression in Pogz-/- mice and show that POGZ promotes the active chromatin state and transcription of clustered synaptic genes. We further demonstrate that POGZ forms a nuclear complex and co-occupies loci with ADNP, another high-confidence ASD risk gene, and provide evidence that POGZ regulates other neurodevelopmental disorder risk genes as well. Our results reveal a neurodevelopmental function of an ASD risk gene and identify molecular targets that may elucidate its function in ASD.
Affiliation(s)
- Eirene Markenscoff-Papadimitriou
- Department of Psychiatry, Langley Porter Psychiatric Institute, UCSF Weill Institute for Neurosciences, University of California, San Francisco, CA, USA.
- Fadya Binyameen
- Department of Psychiatry, Langley Porter Psychiatric Institute, UCSF Weill Institute for Neurosciences, University of California, San Francisco, CA, USA
- Sean Whalen
- Gladstone Institutes, San Francisco, CA, USA
- James Price
- Department of Psychiatry, Langley Porter Psychiatric Institute, UCSF Weill Institute for Neurosciences, University of California, San Francisco, CA, USA
- Kenneth Lim
- Department of Psychiatry, Langley Porter Psychiatric Institute, UCSF Weill Institute for Neurosciences, University of California, San Francisco, CA, USA
- Athena R Ypsilanti
- Department of Psychiatry, Langley Porter Psychiatric Institute, UCSF Weill Institute for Neurosciences, University of California, San Francisco, CA, USA
- Rinaldo Catta-Preta
- Departments of Neurobiology, Physiology, and Behavior and Psychiatry and Behavioral Sciences, University of California, Davis, Davis, CA, USA
- Emily Ling-Lin Pai
- Department of Psychiatry, Langley Porter Psychiatric Institute, UCSF Weill Institute for Neurosciences, University of California, San Francisco, CA, USA
- Katherine S Pollard
- Gladstone Institutes, San Francisco, CA, USA; Chan-Zuckerberg Biohub, San Francisco, CA, USA; Institute for Computational Health Sciences, University of California, San Francisco, CA, USA; Institute for Human Genetics, University of California, San Francisco, CA, USA; Department of Epidemiology and Biostatistics, University of California, San Francisco, CA, USA; Quantitative Biology Institute, University of California, San Francisco, CA, USA
- Alex S Nord
- Departments of Neurobiology, Physiology, and Behavior and Psychiatry and Behavioral Sciences, University of California, Davis, Davis, CA, USA
- Matthew W State
- Department of Psychiatry, Langley Porter Psychiatric Institute, UCSF Weill Institute for Neurosciences, University of California, San Francisco, CA, USA
- John L Rubenstein
- Department of Psychiatry, Langley Porter Psychiatric Institute, UCSF Weill Institute for Neurosciences, University of California, San Francisco, CA, USA.
|
49
|
Ziffra RS, Kim CN, Ross JM, Wilfert A, Turner TN, Haeussler M, Casella AM, Przytycki PF, Keough KC, Shin D, Bogdanoff D, Kreimer A, Pollard KS, Ament SA, Eichler EE, Ahituv N, Nowakowski TJ. Single-cell epigenomics reveals mechanisms of human cortical development. Nature 2021; 598:205-213. [PMID: 34616060 PMCID: PMC8494642 DOI: 10.1038/s41586-021-03209-8] [Citation(s) in RCA: 109] [Impact Index Per Article: 36.3] [Received: 12/31/2019] [Accepted: 01/07/2021] [Indexed: 12/12/2022]
Abstract
During mammalian development, differences in chromatin state coincide with cellular differentiation and reflect changes in the gene regulatory landscape [1]. In the developing brain, cell fate specification and topographic identity are important for defining cell identity [2] and confer selective vulnerabilities to neurodevelopmental disorders [3]. Here, to identify cell-type-specific chromatin accessibility patterns in the developing human brain, we used a single-cell assay for transposase accessibility by sequencing (scATAC-seq) in primary tissue samples from the human forebrain. We applied unbiased analyses to identify genomic loci that undergo extensive cell-type- and brain-region-specific changes in accessibility during neurogenesis, and an integrative analysis to predict cell-type-specific candidate regulatory elements. We found that cerebral organoids recapitulate most putative cell-type-specific enhancer accessibility patterns but lack many cell-type-specific open chromatin regions that are found in vivo. Systematic comparison of chromatin accessibility across brain regions revealed unexpected diversity among neural progenitor cells in the cerebral cortex and implicated retinoic acid signalling in the specification of neuronal lineage identity in the prefrontal cortex. Together, our results reveal the important contribution of chromatin state to the emerging patterns of cell type diversity and cell fate specification and provide a blueprint for evaluating the fidelity and robustness of cerebral organoids as a model for cortical development.
Affiliation(s)
- Ryan S Ziffra
- Department of Anatomy, University of California, San Francisco, San Francisco, CA, USA
- Department of Psychiatry, University of California, San Francisco, San Francisco, CA, USA
- Eli and Edythe Broad Center for Regeneration Medicine and Stem Cell Research, University of California, San Francisco, San Francisco, CA, USA
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA, USA
- Institute for Human Genetics, University of California, San Francisco, San Francisco, CA, USA
- Chang N Kim
- Department of Anatomy, University of California, San Francisco, San Francisco, CA, USA
- Department of Psychiatry, University of California, San Francisco, San Francisco, CA, USA
- Eli and Edythe Broad Center for Regeneration Medicine and Stem Cell Research, University of California, San Francisco, San Francisco, CA, USA
- Jayden M Ross
- Department of Anatomy, University of California, San Francisco, San Francisco, CA, USA
- Department of Psychiatry, University of California, San Francisco, San Francisco, CA, USA
- Eli and Edythe Broad Center for Regeneration Medicine and Stem Cell Research, University of California, San Francisco, San Francisco, CA, USA
- Amy Wilfert
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Tychele N Turner
- Department of Genetics, Washington University School of Medicine, St. Louis, MO, USA
- Alex M Casella
- Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, USA
- Medical Scientist Training Program, University of Maryland School of Medicine, Baltimore, MD, USA
- Kathleen C Keough
- Institute for Computational Health Sciences, University of California, San Francisco, San Francisco, CA, USA
- University of California, San Francisco, San Francisco, CA, USA
- David Shin
- Department of Anatomy, University of California, San Francisco, San Francisco, CA, USA
- Department of Psychiatry, University of California, San Francisco, San Francisco, CA, USA
- Eli and Edythe Broad Center for Regeneration Medicine and Stem Cell Research, University of California, San Francisco, San Francisco, CA, USA
- Derek Bogdanoff
- Department of Anatomy, University of California, San Francisco, San Francisco, CA, USA
- Department of Psychiatry, University of California, San Francisco, San Francisco, CA, USA
- Eli and Edythe Broad Center for Regeneration Medicine and Stem Cell Research, University of California, San Francisco, San Francisco, CA, USA
- Anat Kreimer
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA, USA
- Institute for Human Genetics, University of California, San Francisco, San Francisco, CA, USA
- Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, Berkeley, CA, USA
- Center for Computational Biology, University of California, Berkeley, Berkeley, CA, USA
- Katherine S Pollard
- Gladstone Institutes, San Francisco, CA, USA
- Institute for Computational Health Sciences, University of California, San Francisco, San Francisco, CA, USA
- Department of Epidemiology and Biostatistics, University of California, San Francisco, San Francisco, CA, USA
- Quantitative Biology Institute, University of California, San Francisco, San Francisco, CA, USA
- Chan Zuckerberg Biohub, San Francisco, San Francisco, CA, USA
- Seth A Ament
- Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, USA
- Department of Psychiatry, University of Maryland School of Medicine, Baltimore, MD, USA
- Evan E Eichler
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA
- Nadav Ahituv
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA, USA
- Institute for Human Genetics, University of California, San Francisco, San Francisco, CA, USA
- Tomasz J Nowakowski
- Department of Anatomy, University of California, San Francisco, San Francisco, CA, USA.
- Department of Psychiatry, University of California, San Francisco, San Francisco, CA, USA.
- Eli and Edythe Broad Center for Regeneration Medicine and Stem Cell Research, University of California, San Francisco, San Francisco, CA, USA.
- Chan Zuckerberg Biohub, San Francisco, San Francisco, CA, USA.
|
50
|
Liu TY, Knott GJ, Smock DCJ, Desmarais JJ, Son S, Bhuiya A, Jakhanwal S, Prywes N, Agrawal S, Díaz de León Derby M, Switz NA, Armstrong M, Harris AR, Charles EJ, Thornton BW, Fozouni P, Shu J, Stephens SI, Kumar GR, Zhao C, Mok A, Iavarone AT, Escajeda AM, McIntosh R, Kim S, Dugan EJ, Pollard KS, Tan MX, Ott M, Fletcher DA, Lareau LF, Hsu PD, Savage DF, Doudna JA. Accelerated RNA detection using tandem CRISPR nucleases. Nat Chem Biol 2021; 17:982-988. [PMID: 34354262 PMCID: PMC10184463 DOI: 10.1038/s41589-021-00842-2] [Citation(s) in RCA: 93] [Impact Index Per Article: 31.0] [Received: 05/11/2021] [Accepted: 06/23/2021] [Indexed: 12/14/2022]
Abstract
Direct, amplification-free detection of RNA has the potential to transform molecular diagnostics by enabling simple on-site analysis of human or environmental samples. CRISPR-Cas nucleases offer programmable RNA-guided RNA recognition that triggers cleavage and release of a fluorescent reporter molecule, but long reaction times hamper their detection sensitivity and speed. Here, we show that unrelated CRISPR nucleases can be deployed in tandem to provide both direct RNA sensing and rapid signal generation, thus enabling robust detection of ~30 molecules per µl of RNA in 20 min. Combining RNA-guided Cas13 and Csm6 with a chemically stabilized activator creates a one-step assay that can detect severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) RNA extracted from respiratory swab samples with quantitative reverse transcriptase PCR (qRT-PCR)-derived cycle threshold (Ct) values up to 33, using a compact detector. This Fast Integrated Nuclease Detection In Tandem (FIND-IT) approach enables sensitive, direct RNA detection in a format that is amenable to point-of-care infection diagnosis as well as to a wide range of other diagnostic or research applications.
Affiliation(s)
- Tina Y Liu
- Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA, USA
- Innovative Genomics Institute, University of California, Berkeley, Berkeley, CA, USA
- Gavin J Knott
- Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA, USA
- Innovative Genomics Institute, University of California, Berkeley, Berkeley, CA, USA
- Monash Biomedicine Discovery Institute, Department of Biochemistry & Molecular Biology, Monash University, Victoria, Australia
- Dylan C J Smock
- Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA, USA
- Innovative Genomics Institute, University of California, Berkeley, Berkeley, CA, USA
- John J Desmarais
- Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA, USA
- Sungmin Son
- Department of Bioengineering, University of California, Berkeley, Berkeley, CA, USA
- Abdul Bhuiya
- Department of Bioengineering, University of California, Berkeley, Berkeley, CA, USA
- UC Berkeley, UC San Francisco Graduate Program in Bioengineering, University of California, Berkeley, Berkeley, CA, USA
- Shrutee Jakhanwal
- Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA, USA
- Innovative Genomics Institute, University of California, Berkeley, Berkeley, CA, USA
- Noam Prywes
- Innovative Genomics Institute, University of California, Berkeley, Berkeley, CA, USA
- Shreeya Agrawal
- Innovative Genomics Institute, University of California, Berkeley, Berkeley, CA, USA
- Department of Bioengineering, University of California, Berkeley, Berkeley, CA, USA
- María Díaz de León Derby
- Department of Bioengineering, University of California, Berkeley, Berkeley, CA, USA
- UC Berkeley, UC San Francisco Graduate Program in Bioengineering, University of California, Berkeley, Berkeley, CA, USA
- Neil A Switz
- Department of Physics and Astronomy, San José State University, San José, CA, USA
- Maxim Armstrong
- Molecular Biophysics and Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Andrew R Harris
- Department of Bioengineering, University of California, Berkeley, Berkeley, CA, USA
- Emeric J Charles
- Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA, USA
| | - Brittney W Thornton
- Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA, USA
| | - Parinaz Fozouni
- Gladstone Institute of Virology, Gladstone Institutes, San Francisco, CA, USA
- Medical Scientist Training Program, University of California, San Francisco, San Francisco, CA, USA
- Biomedical Sciences Graduate Program, University of California, San Francisco, San Francisco, CA, USA
- Department of Medicine, University of California, San Francisco, San Francisco, CA, USA
| | - Jeffrey Shu
- Gladstone Institute of Virology, Gladstone Institutes, San Francisco, CA, USA
- Department of Medicine, University of California, San Francisco, San Francisco, CA, USA
| | - Stephanie I Stephens
- Gladstone Institute of Virology, Gladstone Institutes, San Francisco, CA, USA
- Department of Medicine, University of California, San Francisco, San Francisco, CA, USA
| | - G Renuka Kumar
- Gladstone Institute of Virology, Gladstone Institutes, San Francisco, CA, USA
- Department of Medicine, University of California, San Francisco, San Francisco, CA, USA
| | - Chunyu Zhao
- Gladstone Institute of Virology, Gladstone Institutes, San Francisco, CA, USA
- Chan-Zuckerberg Biohub, San Francisco, CA, USA
| | - Amanda Mok
- Center for Computational Biology, University of California, Berkeley, Berkeley, CA, USA
| | - Anthony T Iavarone
- QB3/Chemistry Mass Spectrometry Facility, University of California, Berkeley, Berkeley, CA, USA
| | | | | | - Shineui Kim
- Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA, USA
- Innovative Genomics Institute, University of California, Berkeley, Berkeley, CA, USA
| | - Eli J Dugan
- Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA, USA
- Innovative Genomics Institute, University of California, Berkeley, Berkeley, CA, USA
| | - Katherine S Pollard
- Gladstone Institute of Virology, Gladstone Institutes, San Francisco, CA, USA
- Chan-Zuckerberg Biohub, San Francisco, CA, USA
- Department of Epidemiology & Biostatistics, University of California, San Francisco, San Francisco, CA, USA
| | | | - Melanie Ott
- Gladstone Institute of Virology, Gladstone Institutes, San Francisco, CA, USA
- Department of Medicine, University of California, San Francisco, San Francisco, CA, USA
| | - Daniel A Fletcher
- Department of Bioengineering, University of California, Berkeley, Berkeley, CA, USA
- UC Berkeley, UC San Francisco Graduate Program in Bioengineering, University of California, Berkeley, Berkeley, CA, USA
- Chan-Zuckerberg Biohub, San Francisco, CA, USA
- California Institute for Quantitative Biosciences (QB3), University of California, Berkeley, Berkeley, CA, USA
| | - Liana F Lareau
- Innovative Genomics Institute, University of California, Berkeley, Berkeley, CA, USA
- Department of Bioengineering, University of California, Berkeley, Berkeley, CA, USA
| | - Patrick D Hsu
- Innovative Genomics Institute, University of California, Berkeley, Berkeley, CA, USA
- Department of Bioengineering, University of California, Berkeley, Berkeley, CA, USA
- Center for Computational Biology, University of California, Berkeley, Berkeley, CA, USA
- California Institute for Quantitative Biosciences (QB3), University of California, Berkeley, Berkeley, CA, USA
- Berkeley Stem Cell Center, University of California, Berkeley, Berkeley, CA, USA
| | - David F Savage
- Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA, USA
- Innovative Genomics Institute, University of California, Berkeley, Berkeley, CA, USA
| | - Jennifer A Doudna
- Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA, USA
- Innovative Genomics Institute, University of California, Berkeley, Berkeley, CA, USA
- Molecular Biophysics and Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- California Institute for Quantitative Biosciences (QB3), University of California, Berkeley, Berkeley, CA, USA
- Howard Hughes Medical Institute, University of California, Berkeley, Berkeley, CA, USA
- Department of Chemistry, University of California, Berkeley, Berkeley, CA, USA
- Gladstone Institute of Data Science and Biotechnology, Gladstone Institutes, San Francisco, CA, USA
| |
|