Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

Total Articles

23
(from Reference Citation Analysis)

Article PDFs (9)

Cited by > 0 (18)

Searched Name

Pouya Kheradpour

Number	Citation Analysis
1	Early immune factors associated with the development of post-acute sequelae of SARS-CoV-2 infection in hospitalized and non-hospitalized individuals. Front Immunol 2024;15:1348041. [PMID: 38318183 PMCID: PMC10838987 DOI: 10.3389/fimmu.2024.1348041] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/01/2023] [Accepted: 01/02/2024] [Indexed: 02/07/2024] Open Abstract Background: Infection by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) can lead to post-acute sequelae of SARS-CoV-2 (PASC) that can persist for weeks to years following initial viral infection. Clinical manifestations of PASC are heterogeneous and often involve multiple organs. While many hypotheses have been made on the mechanisms of PASC and its associated symptoms, the acute biological drivers of PASC are still unknown. Methods: We enrolled 494 patients with COVID-19 at their initial presentation to a hospital or clinic and followed them longitudinally to determine their development of PASC. From 341 patients, we conducted multi-omic profiling on peripheral blood samples collected shortly after study enrollment to investigate early immune signatures associated with the development of PASC. Results: During the first week of COVID-19, we observed a large number of differences in the immune profile of individuals who were hospitalized for COVID-19 compared to those individuals with COVID-19 who were not hospitalized. Differences between individuals who did or did not later develop PASC were, in comparison, more limited, but included significant differences in autoantibodies and in epigenetic and transcriptional signatures in double-negative 1 B cells, in particular.
Conclusions: We found that early immune indicators of incident PASC were nuanced, with significant molecular signals manifesting predominantly in double-negative B cells, compared with the robust differences associated with hospitalization during acute COVID-19. The emerging acute differences in B cell phenotypes, especially in double-negative 1 B cells, in PASC patients highlight a potentially important role of these cells in the development of PASC. Key Words: COVID-19; PASC; autoantibody; double-negative B cells; long COVID. MESH Headings: Humans; COVID-19; SARS-CoV-2; Post-Acute COVID-19 Syndrome; Immunologic Factors; Autoantibodies; Disease Progression. Grants: UL1 TR002384 NCATS NIH HHS; Verily Life Sciences.
2	Multi-omic profiling reveals early immunological indicators for identifying COVID-19 Progressors. Clin Immunol 2023;256:109808. [PMID: 37852344 DOI: 10.1016/j.clim.2023.109808] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/26/2023] [Revised: 09/25/2023] [Accepted: 10/11/2023] [Indexed: 10/20/2023] Abstract We sought to better understand the immune response during the immediate post-diagnosis phase of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) by identifying molecular associations with longitudinal disease outcomes. Multi-omic analyses identified differences in immune cell composition, cytokine levels, and cell subset-specific transcriptomic and epigenomic signatures between individuals on a more serious disease trajectory (Progressors) as compared to those on a milder course (Non-progressors). Higher levels of multiple cytokines were observed in Progressors, with IL-6 showing the largest difference. Blood monocyte cell subsets were also skewed, showing a comparative decrease in non-classical CD14-CD16+ and intermediate CD14+CD16+ monocytes. In lymphocytes, the CD8+ T effector memory cells displayed a gene expression signature consistent with stronger T cell activation in Progressors. These early stage observations could serve as the basis for the development of prognostic biomarkers of disease risk and interventional strategies to improve the management of severe COVID-19. BACKGROUND: Much of the literature on immune response post-SARS-CoV-2 infection has been in the acute and post-acute phases of infection. TRANSLATIONAL SIGNIFICANCE: We found differences at early time points of infection in approximately 160 participants. We compared multi-omic signatures in immune cells between individuals progressing to needing more significant medical intervention and non-progressors.
We observed widespread evidence of a state of increased inflammation associated with progression, supported by a range of epigenomic, transcriptomic, and proteomic signatures. The signatures we identified support other findings at later time points and serve as the basis for prognostic biomarker development or to inform interventional strategies. Key Words: COVID19; Early infection; Multi-omic analysis; SARS-CoV-2; Systems immunology. MESH Headings: Humans; COVID-19; Multiomics; Proteomics; SARS-CoV-2; Cytokines.
3	Multi-omic Profiling Reveals Early Immunological Indicators for Identifying COVID-19 Progressors. bioRxiv 2023:2023.05.25.542297. [PMID: 37292797 PMCID: PMC10246026 DOI: 10.1101/2023.05.25.542297] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/10/2023] Abstract The pandemic caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has led to a rapid response by the scientific community to further understand and combat its associated pathologic etiology. A focal point has been on the immune responses mounted during the acute and post-acute phases of infection, but the immediate post-diagnosis phase remains relatively understudied. We sought to better understand the immediate post-diagnosis phase by collecting blood from study participants soon after a positive test and identifying molecular associations with longitudinal disease outcomes. Multi-omic analyses identified differences in immune cell composition, cytokine levels, and cell subset-specific transcriptomic and epigenomic signatures between individuals on a more serious disease trajectory (Progressors) as compared to those on a milder course (Non-progressors). Higher levels of multiple cytokines were observed in Progressors, with IL-6 showing the largest difference. Blood monocyte cell subsets were also skewed, showing a comparative decrease in non-classical CD14-CD16+ and intermediate CD14+CD16+ monocytes. Additionally, in the lymphocyte compartment, CD8+ T effector memory cells displayed a gene expression signature consistent with stronger T cell activation in Progressors. Importantly, the identification of these cellular and molecular immune changes occurred at the early stages of COVID-19 disease.
These observations could serve as the basis for the development of prognostic biomarkers of disease risk and interventional strategies to improve the management of severe COVID-19. Grants: OT2 HL158287 NHLBI NIH HHS; UH3 HL140144 NHLBI NIH HHS; R25 HL126140 NHLBI NIH HHS; C06 OD028307 NIH HHS; R21 HD109777 NICHD NIH HHS; OT2 HL161847 NHLBI NIH HHS; UG3 HL140144 NHLBI NIH HHS; R56 HL138377 NHLBI NIH HHS; R33 HL151254 NHLBI NIH HHS.
4	Dissection of multiple sclerosis genetics identifies B and CD4+ T cells as driver cell subsets. Genome Biol 2022;23:127. [PMID: 35672799 PMCID: PMC9175345 DOI: 10.1186/s13059-022-02694-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/23/2021] [Accepted: 05/16/2022] [Indexed: 11/10/2022] Open Abstract Background: Multiple sclerosis (MS) is an autoimmune condition of the central nervous system with a well-characterized genetic background. Prior analyses of MS genetics have identified broad enrichments across peripheral immune cells, yet the driver immune subsets are unclear. Results: We utilize chromatin accessibility data across hematopoietic cells to identify cell type-specific enrichments of MS genetic signals. We find that CD4 T and B cells are independently enriched for MS genetics and further refine the driver subsets to Th17 and memory B cells, respectively. We replicate our findings in data from untreated and treated MS patients and find that immunomodulatory treatments suppress chromatin accessibility at driver cell types. Integration of statistical fine-mapping and chromatin interactions nominates numerous putative causal genes, illustrating complex interplay between shared and cell-specific genes. Conclusions: Overall, our study finds that open chromatin regions in CD4 T cells and B cells independently drive MS genetic signals. Our study highlights how careful integration of genetics and epigenetics can provide fine-scale insights into causal cell types and nominate new genes and pathways for disease. Supplementary Information: The online version contains supplementary material available at 10.1186/s13059-022-02694-y.
5	OP0100 MOLECULAR PROFILING OF PERIPHERAL IMMUNE CELL SUBSETS IN PATIENTS WITH RHEUMATOID ARTHRITIS. Ann Rheum Dis 2020. [DOI: 10.1136/annrheumdis-2020-eular.3967] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/03/2022] Abstract Background: Rheumatoid arthritis (RA) is a chronic systemic autoimmune disease that affects 1% of the world’s population. Several key biological functions are dysregulated in RA, manifesting clinically as pain, fatigue, and synovitis, with articular destruction, organ-based comorbidities, and functional decline. Defining immune dysregulation in the peripheral blood of patients (pts) with RA will help inform future work to assess the extent to which immune homeostasis can be therapeutically achieved for these pts. Objectives: To identify baseline molecular characteristics of the peripheral immune system, at the level of individual immune cell subsets, in pts with RA recruited to clinical trials of the oral, selective Janus kinase 1 (JAK1) inhibitor, filgotinib. Methods: Peripheral blood mononuclear cells (PBMC) were collected from 324 pts with moderate to severely active RA, who had an inadequate response to methotrexate ([MTX], FINCH-1; NCT02889796; n=109) or who were MTX naïve (FINCH-3; NCT02886728; n=215). PBMC were also collected from 50 demographically matched healthy volunteers (HV). The Immune Profiler platform was used to sort PBMC into 24 immune cell subsets, then quantify their gene expression and chromatin accessibility using RNA-seq and the assay for transposase-accessible chromatin with high-throughput sequencing (ATAC-seq), respectively. Differentially expressed genes (DEGs) and differentially accessible regions (DARs) were identified among immune cell subsets from pts with RA versus HV.
Gene set signature scores of Molecular Signatures Database hallmark pathways were calculated using single sample gene set enrichment analysis (ssGSEA) to examine differences in pathway activity between groups. Results: A total of 14,500 sequencing datasets were generated from the pt and HV immune cell subsets. Among these, over 26,000 DEGs and 220,000 DARs were identified in RA versus HV (false discovery rate <0.05) across the 24 immune cell subsets. DEGs were identified in all immune cell subsets tested and were most pronounced in natural killer (NK) subsets; most DARs were detected in myeloid and NK subsets. ssGSEA revealed differential pathway signaling in RA versus HV across multiple functions at the immune cell subset level. Myeloid subsets from pts with RA often showed elevated pathway activities versus HV whereas B, T and NK subsets showed a general decrease. In particular, monocyte populations from pts with RA versus HV had elevated pathway activities involved in inflammatory response and interleukin-6/Janus kinase/signal transducer and activator of transcription 3 signaling. The B, T and NK subsets showed a general decrease in tumor necrosis factor-α signaling; conversely, monocyte subsets showed an increase. Prior MTX exposure did not have a notable impact on the detected molecular profile. Conclusion: Differences in gene expression, hallmark pathway activity, and chromatin accessibility were identified in RA versus HV at the immune cell subset level. Significant contributions to differences in chromatin accessibility identified in the myeloid and NK cell populations suggest that there are more active regulatory sequences in these cell types that are associated with RA. Further investigations based on these findings may increase understanding of the immune regulatory paradigm in the context of RA. Acknowledgments: This study was funded by Gilead Sciences, Inc.
Editorial support was provided by Fishawack Communications Inc and funded by Gilead Sciences, Inc. Disclosure of Interests: Peter C. Taylor Grant/research support from: Celgene, Eli Lilly and Company, Galapagos, and Gilead, Consultant of: AbbVie, Biogen, Eli Lilly and Company, Fresenius, Galapagos, Gilead, GlaxoSmithKline, Janssen, Nordic Pharma, Pfizer Roche, and UCB; Jinfeng Liu Shareholder of: Gilead Sciences Inc., Roche, Employee of: Gilead Sciences Inc.; Luting Zhuo Employee of: Gilead Sciences Inc.; Yuan Tian Employee of: Gilead Sciences Inc.; Thomas Snyder Employee of: Verily Life Sciences; Charlie Kim Employee of: Verily Life Sciences; Pouya Kheradpour Employee of: Verily Life Sciences; Kat Drake Employee of: Verily Life Sciences; Sam Kim Shareholder of: Gilead Sciences Inc., Employee of: Gilead Sciences Inc.; Rachael E. Hawtin Shareholder of: Gilead Sciences Inc., Employee of: Gilead Sciences Inc.
6	Evidence of reduced recombination rate in human regulatory domains. Genome Biol 2017;18:193. [PMID: 29058599 PMCID: PMC5651596 DOI: 10.1186/s13059-017-1308-x] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.9] [Received: 04/04/2017] [Accepted: 08/25/2017] [Indexed: 11/10/2022] Open Abstract BACKGROUND: Recombination rate is non-uniformly distributed across the human genome. The variation of recombination rate at both fine and large scales cannot be fully explained by DNA sequences alone. Epigenetic factors, particularly DNA methylation, have recently been proposed to influence the variation in recombination rate. RESULTS: We study the relationship between recombination rate and gene regulatory domains, defined by a gene and its linked control elements. We define these links using expression quantitative trait loci (eQTLs), methylation quantitative trait loci (meQTLs), chromatin conformation from publicly available datasets (Hi-C and ChIA-PET), and correlated activity links that we infer across cell types. Each link type shows a "recombination rate valley" of significantly reduced recombination rate compared to matched control regions. This recombination rate valley is most pronounced for gene regulatory domains of early embryonic development genes, housekeeping genes, and constitutive regulatory elements, which are known to show increased evolutionary constraint across species. Recombination rate valleys show increased DNA methylation, reduced double-stranded break initiation, and increased repair efficiency, specifically in the lineage leading to the germ line. Moreover, by using only the overlap of functional links and DNA methylation in germ cells, we are able to predict the recombination rate with high accuracy.
CONCLUSIONS: Our results suggest the existence of a recombination rate valley at regulatory domains and provide a potential molecular mechanism to interpret the interplay between genetic and epigenetic variations. Key Words: DNA methylation; Recombination rate; Regulatory domain.
7	Diverse patterns of genomic targeting by transcriptional regulators in Drosophila melanogaster. Genome Res 2015;24:1224-35. [PMID: 24985916 PMCID: PMC4079976 DOI: 10.1101/gr.168807.113] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.9] [Indexed: 12/20/2022] Abstract Annotation of regulatory elements and identification of the transcription-related factors (TRFs) targeting these elements are key steps in understanding how cells interpret their genetic blueprint and their environment during development, and how that process goes awry in the case of disease. One goal of the modENCODE (model organism ENCyclopedia of DNA Elements) Project is to survey a diverse sampling of TRFs, both DNA-binding and non-DNA-binding factors, to provide a framework for the subsequent study of the mechanisms by which transcriptional regulators target the genome. Here we provide an updated map of the Drosophila melanogaster regulatory genome based on the location of 84 TRFs at various stages of development. This regulatory map reveals a variety of genomic targeting patterns, including factors with strong preferences toward proximal promoter binding, factors that target intergenic and intronic DNA, and factors with distinct chromatin state preferences. The data also highlight the stringency of the Polycomb regulatory network, and show association of the Trithorax-like (Trl) protein with hotspots of DNA binding throughout development. Furthermore, the data identify more than 5800 instances in which TRFs target DNA regions with demonstrated enhancer activity. Regions of high TRF co-occupancy are more likely to be associated with open enhancers used across cell types, while lower TRF occupancy regions are associated with complex enhancers that are also regulated at the epigenetic level.
Together these data serve as a resource for the research community in the continued effort to dissect transcriptional regulatory mechanisms directing Drosophila development.
8	Integrative analysis of 111 reference human epigenomes. Nature 2015;518:317-30. [PMID: 25693563 PMCID: PMC4530010 DOI: 10.1038/nature14248] [Citation(s) in RCA: 4014] [Impact Index Per Article: 446.0] [Received: 04/10/2014] [Accepted: 01/21/2015] [Indexed: 02/06/2023] Abstract The reference human genome sequence set the stage for studies of genetic variation and its association with human disease, but epigenomic studies lack a similar reference. To address this need, the NIH Roadmap Epigenomics Consortium generated the largest collection so far of human epigenomes for primary cells and tissues. Here we describe the integrative analysis of 111 reference human epigenomes generated as part of the programme, profiled for histone modification patterns, DNA accessibility, DNA methylation and RNA expression. We establish global maps of regulatory elements, define regulatory modules of coordinated activity, and their likely activators and repressors. We show that disease- and trait-associated genetic variants are enriched in tissue-specific epigenomic marks, revealing biologically relevant cell types for diverse human traits, and providing a resource for interpreting the molecular basis of human disease. Our results demonstrate the central role of epigenomic information for understanding gene regulation, cellular differentiation and human disease.
MESH Headings: Base Sequence; Cell Lineage/genetics; Cells, Cultured; Chromatin/chemistry; Chromatin/genetics; Chromatin/metabolism; Chromosomes, Human/chemistry; Chromosomes, Human/genetics; Chromosomes, Human/metabolism; DNA/chemistry; DNA/genetics; DNA/metabolism; DNA Methylation; Datasets as Topic; Enhancer Elements, Genetic/genetics; Epigenesis, Genetic/genetics; Epigenomics; Genetic Variation/genetics; Genome, Human/genetics; Genome-Wide Association Study; Histones/metabolism; Humans; Organ Specificity/genetics; RNA/genetics; Reference Values. Grants: U01ES017166 NIEHS NIH HHS; R01 ES024984 NIEHS NIH HHS; U01 DA025956 NIDA NIH HHS; Howard Hughes Medical Institute; U01ES017154 NIEHS NIH HHS; P01 DA008227 NIDA NIH HHS; P30 AG010161 NIA NIH HHS; K99 HL119617 NHLBI NIH HHS; R01 ES024992 NIEHS NIH HHS; U01 ES017156 NIEHS NIH HHS; U01DA025956 NIDA NIH HHS; RF1 AG036042 NIA NIH HHS; U01 AG046152 NIA NIH HHS; RF1 AG015819 NIA NIH HHS; F32 HL110473 NHLBI NIH HHS; R00 HL119617 NHLBI NIH HHS; U01 ES017154 NIEHS NIH HHS; R01 HG004037 NHGRI NIH HHS; R01 HG007175 NHGRI NIH HHS; U01ES017155 NIEHS NIH HHS; F32HL110473 NHLBI NIH HHS; R01 AG017917 NIA NIH HHS; R01HG004037 NHGRI NIH HHS; R01AG17917 NIA NIH HHS; T32 ES007032 NIEHS NIH HHS; ES017166 NIEHS NIH HHS; R01HG004037-S1 NHGRI NIH HHS; R01 NS078839 NINDS NIH HHS; U01AG46152 NIA NIH HHS; R01AG15819 NIA NIH HHS; R24 HD000836 NICHD NIH HHS; P50 MH096890 NIMH NIH HHS; R01NS078839 NINDS NIH HHS; RC1 HG005334 NHGRI NIH HHS; P30AG10161 NIA NIH HHS; U54 HG007990 NHGRI NIH HHS; T32 GM081739 NIGMS NIH HHS; T32 GM007266 NIGMS NIH HHS; R25 DA027995 NIDA NIH HHS; 5R24HD000836 NICHD NIH HHS; U01 ES017166 NIEHS NIH HHS; U01ES017156 NIEHS NIH HHS; U01 ES017155 NIEHS NIH HHS; R01 HG007354 NHGRI NIH HHS; U54 DK106829 NIDDK NIH HHS; R01 AG015819 NIA NIH HHS; T32 GM007198 NIGMS NIH HHS; RC1HG005334 NHGRI NIH HHS; K99HL119617 NHLBI NIH HHS; R01 HD092419 NICHD NIH HHS.
9	Systematic discovery and characterization of regulatory motifs in ENCODE TF binding experiments. Nucleic Acids Res 2013;42:2976-87. [PMID: 24335146 PMCID: PMC3950668 DOI: 10.1093/nar/gkt1249] [Citation(s) in RCA: 304] [Impact Index Per Article: 27.6] [Indexed: 01/17/2023] Open Abstract Recent advances in technology have led to a dramatic increase in the number of available transcription factor ChIP-seq and ChIP-chip data sets. Understanding the motif content of these data sets is an important step in understanding the underlying mechanisms of regulation. Here we provide a systematic motif analysis for 427 human ChIP-seq data sets using motifs curated from the literature and also discovered de novo using five established motif discovery tools. We use a systematic pipeline for calculating motif enrichment in each data set, providing a principled way for choosing between motif variants found in the literature and for flagging potentially problematic data sets. Our analysis confirms the known specificity of 41 of the 56 analyzed factor groups and reveals motifs of potential cofactors. We also use cell type-specific binding to find factors active in specific conditions. The resource we provide is accessible both for browsing a small number of factors and for performing large-scale systematic analyses. We provide motif matrices, instances and enrichments in each of the ENCODE data sets. The motifs discovered here have been used in parallel studies to validate the specificity of antibodies, understand cooperativity between data sets and measure the variation of motif binding across individuals and species.
10	Systematic dissection of regulatory motifs in 2000 predicted human enhancers using a massively parallel reporter assay. Genome Res 2013;23:800-11. [PMID: 23512712 PMCID: PMC3638136 DOI: 10.1101/gr.144899.112] [Citation(s) in RCA: 228] [Impact Index Per Article: 20.7] [Received: 06/26/2012] [Accepted: 03/14/2013] [Indexed: 01/06/2023] Abstract Genome-wide chromatin annotations have permitted the mapping of putative regulatory elements across multiple human cell types. However, their experimental dissection by directed regulatory motif disruption has remained unfeasible at the genome scale. Here, we use a massively parallel reporter assay (MPRA) to measure the transcriptional levels induced by 145-bp DNA segments centered on evolutionarily conserved regulatory motif instances within enhancer chromatin states. We select five predicted activators (HNF1, HNF4, FOXA, GATA, NFE2L2) and two predicted repressors (GFI1, ZFP161) and measure reporter expression in erythroleukemia (K562) and liver carcinoma (HepG2) cell lines. We test 2104 wild-type sequences and 3314 engineered enhancer variants containing targeted motif disruptions, each using 10 barcode tags and two replicates. The resulting data strongly confirm the enhancer activity and cell-type specificity of enhancer chromatin states, the ability of 145-bp segments to recapitulate both, the necessary role of regulatory motifs in enhancer function, and the complementary roles of activator and repressor motifs.
We find statistically robust evidence that (1) disrupting the predicted activator motifs abolishes enhancer function, while silent or motif-improving changes maintain enhancer activity; (2) evolutionary conservation, nucleosome exclusion, binding of other factors, and strength of the motif match are predictive of enhancer activity; (3) scrambling repressor motifs leads to aberrant reporter expression in cell lines where the enhancers are usually inactive. Our results suggest a general strategy for deciphering cis-regulatory elements by systematic large-scale manipulation and provide quantitative enhancer activity measurements across thousands of constructs that can be mined to develop predictive models of gene expression. MESH Headings: Base Sequence; Binding Sites; Cells/classification; Cells/metabolism; Chromatin/genetics; Chromosome Mapping; Conserved Sequence; Enhancer Elements, Genetic; Gene Expression Regulation; Genes, Reporter; Genome, Human; Hep G2 Cells; Humans; Nucleotide Motifs/genetics; Promoter Regions, Genetic; Transcription, Genetic. Grants: R01 HG004037 NHGRI NIH HHS; R01 HG006785 NHGRI NIH HHS; HG004037 NHGRI NIH HHS; HG004037-S1 NHGRI NIH HHS.
11	ChIP-seq guidelines and practices of the ENCODE and modENCODE consortia. Genome Res 2013;22:1813-31. [PMID: 22955991 PMCID: PMC3431496 DOI: 10.1101/gr.136184.111] [Citation(s) in RCA: 1290] [Impact Index Per Article: 117.3] [Indexed: 01/15/2023] Abstract Chromatin immunoprecipitation (ChIP) followed by high-throughput DNA sequencing (ChIP-seq) has become a valuable and widely used approach for mapping the genomic location of transcription-factor binding and histone modifications in living cells. Despite its widespread use, there are considerable differences in how these experiments are conducted, how the results are scored and evaluated for quality, and how the data and metadata are archived for public use. These practices affect the quality and utility of any global ChIP experiment. Through our experience in performing ChIP-seq experiments, the ENCODE and modENCODE consortia have developed a set of working standards and guidelines for ChIP experiments that are updated routinely. The current guidelines address antibody validation, experimental replication, sequencing depth, data and metadata reporting, and data quality assessment. We discuss how ChIP quality, assessed in these ways, affects different uses of ChIP-seq data. All data sets used in the analysis have been deposited for public viewing and downloading at the ENCODE (http://encodeproject.org/ENCODE/) and modENCODE (http://www.modencode.org/) portals.
12	Analysis of variation at transcription factor binding sites in Drosophila and humans. Genome Biol 2012;13:R49. [PMID: 22950968 PMCID: PMC3491393 DOI: 10.1186/gb-2012-13-9-r49] [Citation(s) in RCA: 77] [Impact Index Per Article: 6.4] [Received: 03/28/2012] [Revised: 05/23/2012] [Accepted: 06/08/2012] [Indexed: 12/31/2022] Open Abstract BACKGROUND: Advances in sequencing technology have boosted population genomics and made it possible to map the positions of transcription factor binding sites (TFBSs) with high precision. Here we investigate TFBS variability by combining transcription factor binding maps generated by ENCODE, modENCODE, our previously published data and other sources with genomic variation data for human individuals and Drosophila isogenic lines. RESULTS: We introduce a metric of TFBS variability that takes into account changes in motif match associated with mutation and makes it possible to investigate TFBS functional constraints instance-by-instance as well as in sets that share common biological properties. We also take advantage of the emerging per-individual transcription factor binding data to show evidence that TFBS mutations, particularly at evolutionarily conserved sites, can be efficiently buffered to ensure coherent levels of transcription factor binding. CONCLUSIONS: Our analyses provide insights into the relationship between individual and interspecies variation and show evidence for the functional buffering of TFBS mutations in both humans and flies. In a broad perspective, these results demonstrate the potential of combining functional genomics and population genetics approaches for understanding gene regulation.
13	Locating protein-coding sequences under selection for additional, overlapping functions in 29 mammalian genomes. Genome Res 2011;21:1916-28. [PMID: 21994248 DOI: 10.1101/gr.108753.110] [Citation(s) in RCA: 73] [Impact Index Per Article: 5.6] [Indexed: 02/05/2023] Abstract The degeneracy of the genetic code allows protein-coding DNA and RNA sequences to simultaneously encode additional, overlapping functional elements. A sequence in which both protein-coding and additional overlapping functions have evolved under purifying selection should show increased evolutionary conservation compared to typical protein-coding genes, especially at synonymous sites. In this study, we use genome alignments of 29 placental mammals to systematically locate short regions within human ORFs that show conspicuously low estimated rates of synonymous substitution across these species. The 29-species alignment provides statistical power to locate more than 10,000 such regions with resolution down to nine-codon windows, which are found within more than a quarter of all human protein-coding genes and contain ∼2% of their synonymous sites. We collect numerous lines of evidence that the observed synonymous constraint in these regions reflects selection on overlapping functional elements including splicing regulatory elements, dual-coding genes, RNA secondary structures, microRNA target sites, and developmental enhancers. Our results show that overlapping functional elements are common in mammalian genes, despite the vast genomic landscape.
14	An epigenetic signature for monoallelic olfactory receptor expression. Cell 2011;145:555-70. [PMID: 21529909 PMCID: PMC3094500 DOI: 10.1016/j.cell.2011.03.040] [Citation(s) in RCA: 206] [Impact Index Per Article: 15.8] [Received: 09/28/2010] [Revised: 03/10/2011] [Accepted: 03/17/2011] [Indexed: 12/29/2022] Abstract: Constitutive heterochromatin is traditionally viewed as the static form of heterochromatin that silences pericentromeric and telomeric repeats in a cell cycle- and differentiation-independent manner. Here, we show that, in the mouse olfactory epithelium, olfactory receptor (OR) genes are marked in a highly dynamic fashion with the molecular hallmarks of constitutive heterochromatin, H3K9me3 and H4K20me3. The cell type and developmentally dependent deposition of these marks along the OR clusters are, most likely, reversed during the process of OR choice to allow for monogenic and monoallelic OR expression. In contrast to the current view of OR choice, our data suggest that OR silencing takes place before OR expression, indicating that it is not the product of an OR-elicited feedback signal. Our findings suggest that chromatin-mediated silencing lays a molecular foundation upon which singular and stochastic selection for gene expression can be applied.
MeSH Headings: Animals; Chromatin Assembly and Disassembly; Chromatin Immunoprecipitation; Gene Expression; Gene Silencing; Heterochromatin; Histone Code; Mice; Mice, Inbred C57BL; Mice, Transgenic; Olfactory Mucosa/metabolism; Oligonucleotide Array Sequence Analysis; Receptors, Odorant/genetics
Grants: P30 GM138441 NIGMS NIH HHS; DP2 OD006667-01 NIH HHS; R01 HG004037 NHGRI NIH HHS; R03 DC010273 NIDCD NIH HHS; R03 DC010273-01 NIDCD NIH HHS; 1DP2OD006667 NIH HHS; R01 DA030320 NIDA NIH HHS; P41 RR019664 NCRR NIH HHS; DP2 OD006667 NIH HHS
15	Identification of functional elements and regulatory circuits by Drosophila modENCODE. Science 2010;330:1787-97. [PMID: 21177974 PMCID: PMC3192495 DOI: 10.1126/science.1198374] [Citation(s) in RCA: 899] [Impact Index Per Article: 64.2] [Indexed: 12/15/2022] Abstract: To gain insight into how genomic information is translated into cellular and developmental programs, the Drosophila model organism Encyclopedia of DNA Elements (modENCODE) project is comprehensively mapping transcripts, histone modifications, chromosomal proteins, transcription factors, replication proteins and intermediates, and nucleosome properties across a developmental time course and in multiple cell lines. We have generated more than 700 data sets and discovered protein-coding, noncoding, RNA regulatory, replication, and chromatin elements, more than tripling the annotated portion of the Drosophila genome. Correlated activity patterns of these elements reveal a functional regulatory network, which predicts putative new functions for genes, reveals stage- and tissue-specific regulators, and enables gene-expression prediction. Our results provide a foundation for directed experimental and computational studies in Drosophila and related species and also a model for systematic data integration toward comprehensive genomic and functional annotation.
MeSH Headings: Animals; Binding Sites; Chromatin/genetics; Chromatin/metabolism; Computational Biology/methods; Drosophila Proteins/genetics; Drosophila Proteins/metabolism; Drosophila melanogaster/genetics; Drosophila melanogaster/growth & development; Drosophila melanogaster/metabolism; Epigenesis, Genetic; Gene Expression Regulation; Gene Regulatory Networks; Genes, Insect; Genome, Insect; Genomics/methods; Histones/metabolism; Molecular Sequence Annotation; Nucleosomes/genetics; Nucleosomes/metabolism; Promoter Regions, Genetic; RNA, Small Untranslated/genetics; RNA, Small Untranslated/metabolism; Transcription Factors/metabolism; Transcription, Genetic
Grants: R01 HG004037 NHGRI NIH HHS; U01HG004261 NHGRI NIH HHS; Howard Hughes Medical Institute; R01HG004037 NHGRI NIH HHS; U01HG004279 NHGRI NIH HHS; U41HG004269 NHGRI NIH HHS; U01 HG004279 NHGRI NIH HHS; U01HG004264 NHGRI NIH HHS; R01 GM081871 NIGMS NIH HHS; U01HG004274 NHGRI NIH HHS; RC2HG005639 NHGRI NIH HHS; U01HG004271 NHGRI NIH HHS; U01 HG004271 NHGRI NIH HHS; U01HG004258 NHGRI NIH HHS; ZIA DK015600-14 Intramural NIH HHS; U01 HG004258 NHGRI NIH HHS
16	The Tasmanian devil transcriptome reveals Schwann cell origins of a clonally transmissible cancer. Science 2010;327:84-7. [PMID: 20044575 DOI: 10.1126/science.1180616] [Citation(s) in RCA: 191] [Impact Index Per Article: 13.6] [Indexed: 01/20/2023] Abstract: The Tasmanian devil, a marsupial carnivore, is endangered because of the emergence of a transmissible cancer known as devil facial tumor disease (DFTD). This fatal cancer is clonally derived and is an allograft transmitted between devils by biting. We performed a large-scale genetic analysis of DFTD with microsatellite genotyping, a mitochondrial genome analysis, and deep sequencing of the DFTD transcriptome and microRNAs. These studies confirm that DFTD is a monophyletic clonally transmissible tumor and suggest that the disease is of Schwann cell origin. On the basis of these results, we have generated a diagnostic marker for DFTD and identify a suite of genes relevant to DFTD pathology and transmission. We provide a genomic data set for the Tasmanian devil that is applicable to cancer diagnosis, disease evolution, and conservation biology.
17	A comprehensive map of insulator elements for the Drosophila genome. PLoS Genet 2010;6:e1000814. [PMID: 20084099 PMCID: PMC2797089 DOI: 10.1371/journal.pgen.1000814] [Citation(s) in RCA: 257] [Impact Index Per Article: 18.4] [Received: 07/16/2009] [Accepted: 12/14/2009] [Indexed: 01/31/2023] Abstract: Insulators are DNA sequences that control the interactions among genomic regulatory elements and act as chromatin boundaries. A thorough understanding of their location and function is necessary to address the complexities of metazoan gene regulation. We studied by ChIP-chip the genome-wide binding sites of 6 insulator-associated proteins (dCTCF, CP190, BEAF-32, Su(Hw), Mod(mdg4), and GAF) to obtain the first comprehensive map of insulator elements in Drosophila embryos. We identify over 14,000 putative insulators, including all classically defined insulators. We find two major classes of insulators defined by dCTCF/CP190/BEAF-32 and Su(Hw), respectively. Distributional analyses of insulators revealed that particular sub-classes of insulator elements are excluded between cis-regulatory elements and their target promoters; divide differentially expressed, alternative, and divergent promoters; act as chromatin boundaries; are associated with chromosomal breakpoints among species; and are embedded within active chromatin domains. Together, these results provide a map demarcating the boundaries of gene regulatory units and a framework for understanding insulator function during the development and evolution of Drosophila. The spatiotemporal specificity of gene expression is controlled by interactions among regulatory proteins, cis-regulatory elements, chromatin modifications, and genes. These interactions can occur over large distances, and the mechanisms by which they are controlled are poorly understood.
Insulators are DNA sequences that can both block the interaction between regulatory elements and genes, as well as block the spread of regions of modified chromatin. To date, relatively few insulators have been identified in developing Drosophila embryos. We here present the genome wide identification of over 14,000 binding sites for 6 insulator-associated proteins. We demonstrate the existence of two broad classes of insulators. Insulators of both classes are enriched at the boundaries of a particular chromatin modification. However, only insulators bound by BEAF-32, CP190, and dCTCF are enriched in regions of open chromatin or demarcate gene boundaries, with a particular enrichment between differentially expressed promoters. Furthermore, insulators of this class are enriched at points of chromosomal rearrangement among the 12 species of sequenced Drosophila, suggesting that insulator defined regulatory boundaries are evolutionarily conserved.
MeSH Headings: Animals; Chromosome Mapping; Drosophila/genetics; Drosophila/metabolism; Drosophila Proteins/genetics; Drosophila Proteins/metabolism; Genome, Insect; Insulator Elements; Protein Binding
Grants: R01 HG004037 NHGRI NIH HHS
18	Genome analysis of the platypus reveals unique signatures of evolution. Nature 2008;453:175-83. [PMID: 18464734 PMCID: PMC2803040 DOI: 10.1038/nature06936] [Citation(s) in RCA: 475] [Impact Index Per Article: 29.7] [Received: 09/14/2007] [Accepted: 03/25/2008] [Indexed: 12/18/2022] Abstract: We present a draft genome sequence of the platypus, Ornithorhynchus anatinus. This monotreme exhibits a fascinating combination of reptilian and mammalian characters. For example, platypuses have a coat of fur adapted to an aquatic lifestyle; platypus females lactate, yet lay eggs; and males are equipped with venom similar to that of reptiles. Analysis of the first monotreme genome aligned these features with genetic innovations. We find that reptile and platypus venom proteins have been co-opted independently from the same gene families; milk protein genes are conserved despite platypuses laying eggs; and immune gene family expansions are directly related to platypus biology. Expansions of protein, non-protein-coding RNA and microRNA families, as well as repeat elements, are identified. Sequencing of this genome now provides a valuable resource for deep mammalian comparative analyses, as well as for monotreme biology and conservation.
MeSH Headings: Animals; Base Composition; Dentition; Evolution, Molecular; Female; Genome/genetics; Genomic Imprinting/genetics; Humans; Immunity/genetics; Male; Mammals/genetics; MicroRNAs/genetics; Milk Proteins/genetics; Phylogeny; Platypus/genetics; Platypus/immunology; Platypus/physiology; Receptors, Odorant/genetics; Repetitive Sequences, Nucleic Acid/genetics; Reptiles/genetics; Sequence Analysis, DNA; Spermatozoa/metabolism; Venoms/genetics; Zona Pellucida/metabolism
Grants: R01 HG004037 NHGRI NIH HHS; MC_U137761446 Medical Research Council; R01 HG002939 NHGRI NIH HHS; R01 GM059290 NIGMS NIH HHS; R01HG02385 NHGRI NIH HHS; R01 HG002385 NHGRI NIH HHS; R01 HG004037-02 NHGRI NIH HHS; P01 CA013106-37 NCI NIH HHS; R01 GM59290 NIGMS NIH HHS; P01 CA013106 NCI NIH HHS; 062023 Wellcome Trust; R01 HG002238 NHGRI NIH HHS; HG002238 NHGRI NIH HHS
19	Conservation of small RNA pathways in platypus. Genome Res 2008;18:995-1004. [PMID: 18463306 DOI: 10.1101/gr.073056.107] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.3] [Indexed: 12/26/2022] Abstract: Small RNA pathways play evolutionarily conserved roles in gene regulation and defense from parasitic nucleic acids. The character and expression patterns of small RNAs show conservation throughout animal lineages, but specific animal clades also show variations on these recurring themes, including species-specific small RNAs. The monotremes, with only platypus and four species of echidna as extant members, represent the basal branch of the mammalian lineage. Here, we examine the small RNA pathways of monotremes by deep sequencing of six platypus and echidna tissues. We find that highly conserved microRNA species display their signature tissue-specific expression patterns. In addition, we find a large rapidly evolving cluster of microRNAs on platypus chromosome X1, which is unique to monotremes. Platypus and echidna testes contain a robust Piwi-interacting (piRNA) system, which appears to be participating in ongoing transposon defense.
20	A single Hox locus in Drosophila produces functional microRNAs from opposite DNA strands. Genes Dev 2008;22:8-13. [PMID: 18172160 DOI: 10.1101/gad.1613108] [Citation(s) in RCA: 195] [Impact Index Per Article: 12.2] [Indexed: 01/24/2023] Abstract: MicroRNAs (miRNAs) are approximately 22-nucleotide RNAs that are processed from characteristic precursor hairpins and pair to sites in messages of protein-coding genes to direct post-transcriptional repression. Here, we report that the miRNA iab-4 locus in the Drosophila Hox cluster is transcribed convergently from both DNA strands, giving rise to two distinct functional miRNAs. Both sense and antisense miRNA products target neighboring Hox genes via highly conserved sites, leading to homeotic transformations when ectopically expressed. We also report sense/antisense miRNAs in mouse and find antisense transcripts close to many miRNAs in both flies and mammals, suggesting that additional sense/antisense pairs exist.
21	Reliable prediction of regulator targets using 12 Drosophila genomes. Genome Res 2007;17:1919-31. [PMID: 17989251 PMCID: PMC2099599 DOI: 10.1101/gr.7090407] [Citation(s) in RCA: 139] [Impact Index Per Article: 8.2] [Received: 08/29/2007] [Accepted: 10/10/2007] [Indexed: 12/24/2022] Abstract: Gene expression is regulated pre- and post-transcriptionally via cis-regulatory DNA and RNA motifs. Identification of individual functional instances of such motifs in genome sequences is a major goal for inferring regulatory networks yet has been hampered due to the motifs' short lengths that lead to many chance matches and poor signal-to-noise ratios. In this paper, we develop a general methodology for the comparative identification of functional motif instances across many related species, using a phylogenetic framework that accounts for the evolutionary relationships between species, allows for motif movements, and is robust against missing data due to artifacts in sequencing, assembly, or alignment. We also provide a robust statistical framework for evaluating motif confidence, which enables us to translate evolutionary conservation into a confidence measure for each motif instance, correcting for varying motif length, composition, and background conservation of the target regions. We predict targets of fly transcription factors and miRNAs in alignments of 12 recently sequenced Drosophila species. When compared to extensive genome-wide experimental data, predicted targets are of high quality, matching and surpassing ChIP-chip microarrays and recovering miRNA targets with high sensitivity. The resulting regulatory network suggests significant redundancy between pre- and post-transcriptional regulation of gene expression.
MeSH Headings: Animals; Base Sequence; Conserved Sequence/physiology; Down-Regulation/genetics; Drosophila/genetics; Gene Expression Regulation; Genes, Insect/physiology; Genes, Regulator/genetics; Genes, Regulator/physiology; Genome, Insect; Molecular Sequence Data; Sequence Alignment; Species Specificity
Grants: R01 HG004037 NHGRI NIH HHS; R01 HG004037-01A1 NHGRI NIH HHS
22	Systematic discovery and characterization of fly microRNAs using 12 Drosophila genomes. Genome Res 2007;17:1865-79. [PMID: 17989255 DOI: 10.1101/gr.6593807] [Citation(s) in RCA: 173] [Impact Index Per Article: 10.2] [Indexed: 12/19/2022] Abstract: MicroRNAs (miRNAs) are short regulatory RNAs that inhibit target genes by complementary binding in 3' untranslated regions (3' UTRs). They are one of the most abundant classes of regulators, targeting a large fraction of all genes, making their comprehensive study a requirement for understanding regulation and development. Here we use 12 Drosophila genomes to define structural and evolutionary signatures of miRNA hairpins, which we use for their de novo discovery. We predict >41 novel miRNA genes, which encompass many unique families, and 28 of which are validated experimentally. We also define signals for the precise start position of mature miRNAs, which suggest corrections of previously known miRNAs, often leading to drastic changes in their predicted target spectrum. We show that miRNA discovery power scales with the number and divergence of species compared, suggesting that such approaches can be successful in human as dozens of mammalian genomes become available. Interestingly, for some miRNAs sense and anti-sense hairpins score highly and mature miRNAs from both strands can indeed be found in vivo. Similarly, miRNAs with weak 5' end predictions show increased in vivo processing of multiple alternate 5' ends and have fewer predicted targets. Lastly, we show that several miRNA star sequences score highly and are likely functional. For mir-10 in particular, both arms show abundant processing, and both show highly conserved target sites in Hox genes, suggesting a possible cooperation of the two arms, and their role as a master Hox regulator.
23	Fold-specific substitution matrices for protein classification. Bioinformatics 2004;20:847-53. [PMID: 14764567 DOI: 10.1093/bioinformatics/btg492] [Citation(s) in RCA: 18] [Impact Index Per Article: 0.9] [Indexed: 11/13/2022] Abstract: MOTIVATION: Methods that focus on secondary structures, such as Position Specific Scoring Matrices and Hidden Markov Models, have proved useful for assigning proteins to families. However, for assigning proteins to an attribute class within a family these methods may introduce more free parameters than are needed. There are fewer members and there is less variability among sequences within a family. We describe a method for organizing proteins in a family that exhibits up to an order of magnitude reduction in the number of parameters. The basis is the log odds ratio commonly used to measure similarity. We adapt this to characterize the sequence dissimilarities that give rise to attribute differentiation. This leads to the definition of Class Attribute Substitution Matrices (CLASSUM), a dual of the BLOSUM. RESULTS: The method was applied to classify sequences hierarchically in the lambda and kappa subgroups of the immunoglobulin superfamily. Positions conferring class were identified based on the degree of amino acid variability at a position. The CLASSUM computed for these positions classified better than 90% of test data correctly compared with 35-50% for BLOSUM-62. The expected value for a random matrix is 14%. The results suggest that family-specific data-derived substitution matrices can improve the resolution of automated methods that use generic substitution matrices for searching for and classifying proteins.
MeSH Headings: Algorithms; Amino Acid Sequence; Immunoglobulin Light Chains/chemistry; Immunoglobulin Light Chains/classification; Immunoglobulin Variable Region/chemistry; Immunoglobulin Variable Region/classification; Molecular Sequence Data; Protein Folding; Proteins/chemistry; Proteins/classification; Reproducibility of Results; Sensitivity and Specificity; Sequence Alignment/methods; Sequence Analysis, Protein/methods; Sequence Homology, Amino Acid