76
|
Jjingo D, Wang J, Conley AB, Lunyak VV, Jordan IK. Compound cis-regulatory elements with both boundary and enhancer sequences in the human genome. Bioinformatics 2013; 29:3109-12. [PMID: 24085569 DOI: 10.1093/bioinformatics/btt542] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/14/2022]
Abstract
MOTIVATION It has been suggested that presumably distinct classes of genomic regulatory elements may actually share common sets of features and mechanisms. However, there has been no genome-wide assessment of the prevalence of this phenomenon. RESULTS To evaluate this possibility, we performed a bioinformatic screen for the existence of compound regulatory elements in the human genome. We identified numerous such colocated boundary and enhancer elements from human CD4(+) T cells. We report evidence that such compound regulatory elements possess unique chromatin features and facilitate cell type-specific functions related to inflammation and immune response in CD4(+) T cells.
|
77
|
Abstract
This review highlights recent discoveries that have shaped the emerging viewpoints in the field of epigenetic influences in the central nervous system (CNS), focusing on the following questions: (i) How is the CNS shaped during development when precursor cells transition into morphologically and molecularly distinct cell types, and is this event driven by epigenetic alterations? (ii) How do epigenetic pathways control CNS function? (iii) What happens to "epigenetic memory" during aging processes, and do these alterations cause CNS dysfunction? (iv) Can one restore normal CNS function by manipulating the epigenome using pharmacologic agents, and will this ameliorate aging-related neurodegeneration? These and other still unanswered questions remain critical to understanding the impact of multifaceted epigenetic machinery on the age-related dysfunction of the CNS.
|
78
|
Lopez MF, Tollervey J, Krastins B, Garces A, Sarracino D, Prakash A, Vogelsang M, Geesman G, Valderrama A, Jordan IK, Lunyak VV. Depletion of nuclear histone H2A variants is associated with chronic DNA damage signaling upon drug-evoked senescence of human somatic cells. Aging (Albany NY) 2013; 4:823-42. [PMID: 23235539 PMCID: PMC3560435 DOI: 10.18632/aging.100507] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.9] [Indexed: 12/21/2022]
Abstract
Cellular senescence is associated with global chromatin changes, altered gene expression, and activation of chronic DNA damage signaling. These events ultimately lead to morphological and physiological transformations in primary cells. In this study, we show that chronic DNA damage signals caused by genotoxic stress impact the expression of histone H2A family members and lead to their depletion in the nuclei of senescent human fibroblasts. Our data reinforce the hypothesis that progressive chromatin destabilization may lead to the loss of epigenetic information and impaired cellular function associated with chronic DNA damage upon drug-evoked senescence. We propose that changes in histone biosynthesis and chromatin assembly may directly contribute to cellular aging. In addition, we outline a method that allows for quantitative and unbiased measurement of these changes.
|
79
|
Sebastian A, Rishishwar L, Wang J, Bernard KF, Conley AB, McCarty NA, Jordan IK. Origin and evolution of the cystic fibrosis transmembrane regulator protein R domain. Gene 2013; 523:137-46. [PMID: 23578801 DOI: 10.1016/j.gene.2013.02.050] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.8] [Received: 01/23/2013] [Accepted: 02/18/2013] [Indexed: 10/27/2022]
Abstract
The Cystic Fibrosis Transmembrane Conductance Regulator protein (CFTR) is a member of the ABC transporter superfamily. CFTR is distinguished from all other members of this superfamily by its status as an ion channel as well as the presence of its unique regulatory (R) domain. We investigated the origin and subsequent evolution of the R domain along the CFTR evolutionary lineage. The R domain protein coding sequence originated via the loss of a splice donor site at the 3' end of exon 14, leading to the subsequent read-through and capture of formerly intronic sequence as novel coding sequence. Inclusion of the remaining part of the R domain coding sequence in the CFTR transcript involved a lineage-specific gain of exonic sequence with no homology to protein coding sequences outside of CFTR and loss of two exons conserved among ABC family members. These events occurred at the base of the Gnathostome evolutionary lineage ~550-650 million years ago. The apparent origination of the R domain de novo from previously non-coding sequence is consistent with its lack of sequence similarity to other domains as well as its intrinsically disordered structure, which has important implications for its function. In particular, this lack of structure may provide for a dynamic and inducible regulatory activity based on transient physical interactions with more structured domains of the protein. Since its acquisition along the CFTR evolutionary lineage, the R domain has evolved more rapidly than any other CFTR domain; however, there is no evidence for positive (adaptive) selection in the evolution of the domain. The R domain does show a distinct pattern of relative evolutionary rates compared to other CFTR domains, which sheds additional light on the connection between its function and evolution. 
The regulatory function of the R domain is dependent upon a fairly small number of sites that are subject to phosphorylation, and these sites were fixed very early in R domain evolution and have remained largely invariant since that time. In contrast, the rest of the R domain has been free to drift in sequence space leading to a more star-like phylogeny than seen for the other CFTR domains. The case of the R domain suggests that domain acquisition via the de novo creation of coding sequence, and the novel functional utility that such an event would seemingly entail, can be one route by which neo-functionalization is favored to occur.
|
80
|
Wang J, Lunyak VV, Jordan IK. BroadPeak: a novel algorithm for identifying broad peaks in diffuse ChIP-seq datasets. Bioinformatics 2013; 29:492-3. [PMID: 23300134 DOI: 10.1093/bioinformatics/bts722] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.7] [Indexed: 11/14/2022]
Abstract
SUMMARY Although some histone modification chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq) signals show abrupt peaks across narrow and specific genomic locations, others have diffuse distributions along chromosomes, and their large contiguous enrichment landscapes are better modeled as broad peaks. Here, we present BroadPeak, an algorithm for the identification of such broad peaks from diffuse ChIP-seq datasets. We show that BroadPeak is a linear time algorithm that requires only two parameters, and we validate its performance on real and simulated histone modification ChIP-seq datasets. BroadPeak calls peaks that are highly coincident with both the underlying ChIP-seq tag count distributions and relevant biological features, such as the gene bodies of actively transcribed genes, and it shows superior overall recall and precision of known broad peaks from simulated datasets. AVAILABILITY The source code and documentation are available at http://jordan.biology.gatech.edu/page/software/broadpeak/.
|
81
|
Cui G, Wang J, Kuang C, Prince CZ, Jordan IK, McCarty NA. The Structural and Functional Importance of Type II Divergent Amino Acids in the Cystic Fibrosis Transmembrane Conductance Regulator. Biophys J 2013. [DOI: 10.1016/j.bpj.2012.11.3458] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/27/2022]
|
82
|
Conley AB, Jordan IK. Cell type-specific termination of transcription by transposable element sequences. Mob DNA 2012; 3:15. [PMID: 23020800 PMCID: PMC3517506 DOI: 10.1186/1759-8753-3-15] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.5] [Received: 05/25/2012] [Accepted: 08/08/2012] [Indexed: 11/17/2022]
Abstract
Background Transposable elements (TEs) encode sequences necessary for their own transposition, including signals required for the termination of transcription. TE sequences within the introns of human genes show an antisense orientation bias, which has been proposed to reflect selection against TE sequences in the sense orientation owing to their ability to terminate the transcription of host gene transcripts. While there is evidence in support of this model for some elements, the extent to which TE sequences actually terminate transcription of human genes across the genome remains an open question. Results Using high-throughput sequencing data, we have characterized over 9,000 distinct TE-derived sequences that provide transcription termination sites for 5,747 human genes across eight different cell types. Rarefaction curve analysis suggests that there may be twice as many TE-derived termination sites (TE-TTS) genome-wide among all human cell types. The local chromatin environment for these TE-TTS is similar to that seen for 3′ UTR canonical TTS and distinct from the chromatin environment of other intragenic TE sequences. However, those TE-TTS located within the introns of human genes were found to be far more cell type-specific than the canonical TTS. TE-TTS were much more likely to be found in the sense orientation than other intragenic TE sequences of the same TE family, and TE-TTS in the sense orientation terminate transcription more efficiently than those found in the antisense orientation. Alu sequences were found to provide a large number of relatively weak TTS, whereas LTR elements provided a smaller number of much stronger TTS. Conclusions TE sequences provide numerous termination sites to human genes, and TE-derived TTS are particularly cell type-specific. Thus, TE sequences provide a powerful mechanism for the diversification of transcriptional profiles between cell types and among evolutionary lineages, since most TE-TTS are evolutionarily young.
The extent of transcription termination by TEs seen here, along with the preference for sense-oriented TE insertions to provide TTS, is consistent with the observed antisense orientation bias of human TEs.
|
83
|
Wang J, Lunyak VV, Jordan IK. Chromatin signature discovery via histone modification profile alignments. Nucleic Acids Res 2012; 40:10642-56. [PMID: 22989711 PMCID: PMC3505981 DOI: 10.1093/nar/gks848] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.2] [Indexed: 12/21/2022]
Abstract
We report on the development of an unsupervised algorithm for the genome-wide discovery and analysis of chromatin signatures. Our Chromatin-profile Alignment followed by Tree-clustering algorithm (ChAT) employs dynamic programming of combinatorial histone modification profiles to identify locally similar chromatin sub-regions and provides complementary utility with respect to existing methods. We applied ChAT to genomic maps of 39 histone modifications in human CD4+ T cells to identify both known and novel chromatin signatures. ChAT was able to detect chromatin signatures previously associated with transcription start sites and enhancers as well as novel signatures associated with a variety of regulatory elements. Promoter-associated signatures discovered with ChAT indicate that complex chromatin signatures, made up of numerous co-located histone modifications, facilitate cell-type specific gene expression. The discovery of novel L1 retrotransposon-associated bivalent chromatin signatures suggests that these elements influence the mono-allelic expression of human genes by shaping the chromatin environment of imprinted genomic regions. Analysis of long gene-associated chromatin signatures points to a role for the H4K20me1 and H3K79me3 histone modifications in transcriptional pause release. The novel chromatin signatures and functional associations uncovered by ChAT underscore the ability of the algorithm to yield novel insight into chromatin-based regulatory mechanisms.
|
84
|
Rishishwar L, Varghese N, Tyagi E, Harvey SC, Jordan IK, McCarty NA. Relating the disease mutation spectrum to the evolution of the cystic fibrosis transmembrane conductance regulator (CFTR). PLoS One 2012; 7:e42336. [PMID: 22879944 PMCID: PMC3413703 DOI: 10.1371/journal.pone.0042336] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.9] [Received: 04/26/2012] [Accepted: 07/03/2012] [Indexed: 11/18/2022]
Abstract
Cystic fibrosis (CF) is the most common genetic disease among Caucasians, and accordingly the cystic fibrosis transmembrane conductance regulator (CFTR) protein has perhaps the best characterized disease mutation spectrum with more than 1,500 causative mutations having been identified. In this study, we took advantage of that wealth of mutational information in an effort to relate site-specific evolutionary parameters with the propensity and severity of CFTR disease-causing mutations. To do this, we devised a scoring scheme for known CFTR disease-causing mutations based on the Grantham amino acid chemical difference matrix. CFTR site-specific evolutionary constraint values were then computed for seven different evolutionary metrics across a range of increasing evolutionary depths. The CFTR mutational scores and the various site-specific evolutionary constraint values were compared in order to evaluate which evolutionary measures best reflect the disease-causing mutation spectrum. Site-specific evolutionary constraint values from the widely used comparative method PolyPhen2 show the best correlation with the CFTR mutation score spectrum, whereas more straightforward conservation based measures (ConSurf and ScoreCons) show the greatest ability to predict individual CFTR disease-causing mutations. While far greater than could be expected by chance alone, the fraction of the variability in mutation scores explained by the PolyPhen2 metric (3.6%), along with the best set of paired sensitivity (58%) and specificity (60%) values for the prediction of disease-causing residues, were marginal. These data indicate that evolutionary constraint levels are informative but far from determinant with respect to disease-causing mutations in CFTR. 
Nevertheless, this work shows that, when combined with additional lines of evidence, information on site-specific evolutionary conservation can and should be used to guide site-directed mutagenesis experiments by more narrowly defining the set of target residues, resulting in a potential savings of both time and money.
|
85
|
Kostka JE, Green SJ, Rishishwar L, Prakash O, Katz LS, Mariño-Ramírez L, Jordan IK, Munk C, Ivanova N, Mikhailova N, Watson DB, Brown SD, Palumbo AV, Brooks SC. Genome sequences for six Rhodanobacter strains, isolated from soils and the terrestrial subsurface, with variable denitrification capabilities. J Bacteriol 2012; 194:4461-2. [PMID: 22843592 PMCID: PMC3416251 DOI: 10.1128/jb.00871-12] [Citation(s) in RCA: 42] [Impact Index Per Article: 3.5] [Received: 05/17/2012] [Accepted: 06/04/2012] [Indexed: 11/20/2022]
Abstract
We report the first genome sequences for six strains of Rhodanobacter species isolated from a variety of soil and subsurface environments. Three of these strains are capable of complete denitrification and three others are not. However, all six strains contain most of the genes required for the respiration of nitrate to gaseous nitrogen. The nondenitrifying members of the genus lack only the gene for nitrate reduction, the first step in the full denitrification pathway. The data suggest that the environmental role of bacteria from the genus Rhodanobacter should be reevaluated.
|
86
|
Conley AB, Jordan IK. Epigenetic regulation of human cis-natural antisense transcripts. Nucleic Acids Res 2012; 40:1438-45. [PMID: 22371288 PMCID: PMC3287164 DOI: 10.1093/nar/gkr1010] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.3] [Indexed: 12/11/2022]
Abstract
Mammalian genomes encode numerous cis-natural antisense transcripts (cis-NATs). The extent to which these cis-NATs are actively regulated and ultimately functionally relevant, as opposed to transcriptional noise, remains a matter of debate. To address this issue, we analyzed the chromatin environment and RNA Pol II binding properties of human cis-NAT promoters genome-wide. Cap analysis of gene expression data were used to identify thousands of cis-NAT promoters, and profiles of nine histone modifications and RNA Pol II binding for these promoters in ENCODE cell types were analyzed using chromatin immunoprecipitation followed by sequencing (ChIP-seq) data. Active cis-NAT promoters are enriched with activating histone modifications and occupied by RNA Pol II, whereas weak cis-NAT promoters are depleted for both activating modifications and RNA Pol II. The enrichment levels of activating histone modifications and RNA Pol II binding show peaks centered around cis-NAT transcriptional start sites, and the levels of activating histone modifications at cis-NAT promoters are positively correlated with cis-NAT expression levels. Cis-NAT promoters also show highly tissue-specific patterns of expression. These results suggest that human cis-NATs are actively transcribed by the RNA Pol II and that their expression is epigenetically regulated, prerequisites for a functional potential for many of these non-coding RNAs.
|
87
|
Jjingo D, Conley AB, Yi SV, Lunyak VV, Jordan IK. On the presence and role of human gene-body DNA methylation. Oncotarget 2012; 3:462-74. [PMID: 22577155 PMCID: PMC3380580 DOI: 10.18632/oncotarget.497] [Citation(s) in RCA: 347] [Impact Index Per Article: 28.9] [Received: 03/29/2012] [Accepted: 04/28/2012] [Indexed: 01/23/2023]
Abstract
DNA methylation of promoter sequences is a repressive epigenetic mark that down-regulates gene expression. However, DNA methylation is more prevalent within gene-bodies than seen for promoters, and gene-body methylation has been observed to be positively correlated with gene expression levels. This paradox remains unexplained, and accordingly the role of DNA methylation in gene-bodies is poorly understood. We addressed the presence and role of human gene-body DNA methylation using a meta-analysis of human genome-wide methylation, expression and chromatin data sets. Methylation is associated with transcribed regions, as genic sequences have higher levels of methylation than intergenic or promoter sequences. We also find that the relationship between gene-body DNA methylation and expression levels is non-monotonic and bell-shaped. Mid-level expressed genes have the highest levels of gene-body methylation, whereas the most lowly and highly expressed sets of genes both have low levels of methylation. While gene-body methylation can be seen to efficiently repress the initiation of intragenic transcription, the vast majority of methylated sites within genes are not associated with intragenic promoters. In fact, highly expressed genes initiate the most intragenic transcription, which is inconsistent with the previously held notion that gene-body methylation serves to repress spurious intragenic transcription to allow for efficient transcriptional elongation. These observations lead us to propose a model to explain the presence of human gene-body methylation. This model holds that the repression of intragenic transcription by gene-body methylation is largely epiphenomenal, and suggests that gene-body methylation levels are predominantly shaped via the accessibility of the DNA to methylating enzyme complexes.
|
88
|
Piriyapongsa J, Jordan IK, Conley AB, Ronan T, Smalheiser NR. Transcription factor binding sites are highly enriched within microRNA precursor sequences. Biol Direct 2011; 6:61. [PMID: 22136256 PMCID: PMC3240832 DOI: 10.1186/1745-6150-6-61] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.3] [Received: 08/04/2011] [Accepted: 12/02/2011] [Indexed: 02/05/2023]
Abstract
Background Transcription factors are thought to regulate the transcription of microRNA genes in a manner similar to that of protein-coding genes; that is, by binding to conventional transcription factor binding site DNA sequences located in or near promoter regions that lie upstream of the microRNA genes. However, in the course of analyzing the genomics of human microRNA genes, we noticed that annotated transcription factor binding sites commonly lie within 70- to 110-nt long microRNA small hairpin precursor sequences. Results We report that about 45% of all human small hairpin microRNA (pre-miR) sequences contain at least one predicted transcription factor binding site motif that is conserved across human, mouse and rat, and this rises to over 75% if one excludes primate-specific pre-miRs. The association is robust and has extremely strong statistical significance; it affects both intergenic and intronic pre-miRs and both isolated and clustered microRNA genes. We also confirmed and extended this finding using a separate analysis that examined all human pre-miR sequences regardless of conservation across species. Conclusions The transcription factor binding sites localized within small hairpin microRNA precursor sequences may possibly regulate their transcription. Transcription factors may also possibly bind directly to nascent primary microRNA gene transcripts or small hairpin microRNA precursors and regulate their processing. Reviewers This article was reviewed by Guillaume Bourque (nominated by Jerzy Jurka), Dmitri Pervouchine (nominated by Mikhail Gelfand), and Yuriy Gusev.
|
89
|
Huda A, Tyagi E, Mariño-Ramírez L, Bowen NJ, Jjingo D, Jordan IK. Prediction of transposable element derived enhancers using chromatin modification profiles. PLoS One 2011; 6:e27513. [PMID: 22087331 PMCID: PMC3210180 DOI: 10.1371/journal.pone.0027513] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.7] [Received: 06/28/2011] [Accepted: 10/18/2011] [Indexed: 11/19/2022]
Abstract
Experimentally characterized enhancer regions have previously been shown to display specific patterns of enrichment for several different histone modifications. We modelled these enhancer chromatin profiles in the human genome and used them to guide the search for novel enhancers derived from transposable element (TE) sequences. To do this, a computational approach was taken to analyze the genome-wide histone modification landscape characterized by the ENCODE project in two human hematopoietic cell types, GM12878 and K562. We predicted the locations of 2,107 and 1,448 TE-derived enhancers in the GM12878 and K562 cell lines respectively. A vast majority of these putative enhancers are unique to each cell line; only 3.5% of the TE-derived enhancers are shared between the two. We evaluated the functional effect of TE-derived enhancers by associating them with the cell-type specific expression of nearby genes, and found that the number of TE-derived enhancers is strongly positively correlated with the expression of nearby genes in each cell line. Furthermore, genes that are differentially expressed between the two cell lines also possess a divergent number of TE-derived enhancers in their vicinity. As such, genes that are up-regulated in the GM12878 cell line and down-regulated in K562 have significantly more TE-derived enhancers in their vicinity in the GM12878 cell line and vice versa. These data indicate that human TE-derived sequences are likely to be involved in regulating cell-type specific gene expression on a broad scale and suggest that the enhancer activity of TE-derived sequences is mediated by epigenetic regulatory mechanisms.
|
90
|
Wang J, Lunyak VV, Jordan IK. Genome-wide prediction and analysis of human chromatin boundary elements. Nucleic Acids Res 2011; 40:511-29. [PMID: 21930510 PMCID: PMC3258141 DOI: 10.1093/nar/gkr750] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.5] [Indexed: 12/15/2022]
Abstract
Boundary elements partition eukaryotic chromatin into active and repressive domains, and can also block regulatory interactions between domains. Boundary elements act via diverse mechanisms, making accurate feature-based computational predictions difficult. Therefore, we developed an unbiased algorithm that predicts the locations of human boundary elements based on the genomic distributions of chromatin and transcriptional states, as opposed to any intrinsic characteristics that they may possess. Application of our algorithm to ChIP-seq data for histone modifications and RNA Pol II-binding data in human CD4(+) T cells resulted in the prediction of 2542 putative chromatin boundary elements genome-wide. Predicted boundary elements display two distinct features: first, position-specific open chromatin and histone acetylation that is coincident with the recruitment of sequence-specific DNA-binding factors such as CTCF, EVI1 and YY1, and second, a directional and gradual increase in histone lysine methylation across predicted boundaries coincident with a gain of expression of non-coding RNAs, including examples of boundaries encoded by tRNA and other non-coding RNA genes. Accordingly, a number of the predicted human boundaries may function via the synergistic action of sequence-specific recruitment of transcription factors leading to non-coding RNA transcriptional interference and the blocking of facultative heterochromatin propagation by transcription-associated chromatin remodeling complexes.
|
91
|
Katz LS, Humphrey JC, Conley AB, Nelakuditi V, Kislyuk AO, Agrawal S, Jayaraman P, Harcourt BH, Olsen-Rasmussen MA, Frace M, Sharma NV, Mayer LW, Jordan IK. Neisseria Base: a comparative genomics database for Neisseria meningitidis. Database: The Journal of Biological Databases and Curation 2011; 2011:bar035. [PMID: 21930505 PMCID: PMC3263597 DOI: 10.1093/database/bar035] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Indexed: 02/02/2023]
Abstract
Neisseria meningitidis is an important pathogen, causing life-threatening diseases including meningitis, septicemia and in some cases pneumonia. Genomic studies hold great promise for N. meningitidis research, but substantial database resources are needed to deal with the wealth of information that comes with completely sequenced and annotated genomes. To address this need, we developed Neisseria Base (NBase), a comparative genomics database and genome browser that houses and displays publicly available N. meningitidis genomes. In addition to existing N. meningitidis genome sequences, we sequenced and annotated 19 new genomes using 454 pyrosequencing and the CG-Pipeline genome analysis tool. In total, NBase hosts 27 complete N. meningitidis genome sequences along with their associated annotations. The NBase platform is designed to be scalable, via the underlying database schema and modular code architecture, such that it can readily incorporate new genomes and their associated annotations. The front page of NBase provides user access to these genomes through searching, browsing and downloading. The NBase search utility includes BLAST-based sequence similarity searches along with a variety of semantic search options. All genomes can be browsed using a modified version of the GBrowse platform, and a plethora of information on each gene can be viewed using a customized details page. NBase also has a whole-genome comparison tool that yields single-nucleotide polymorphism differences between two user-defined groups of genomes. Using the virulent ST-11 lineage as an example, we demonstrate how this comparative genomics utility can be used to identify novel genomic markers for molecular profiling of N. meningitidis. Database URL: http://nbase.biology.gatech.edu
|
92
|
Wang J, Geesman GJ, Hostikka SL, Atallah M, Blackwell B, Lee E, Cook PJ, Pasaniuc B, Shariat G, Halperin E, Dobke M, Rosenfeld MG, Jordan IK, Lunyak VV. Inhibition of activated pericentromeric SINE/Alu repeat transcription in senescent human adult stem cells reinstates self-renewal. Cell Cycle 2011; 10:3016-30. [PMID: 21862875 PMCID: PMC3218602 DOI: 10.4161/cc.10.17.17543] [Citation(s) in RCA: 76] [Impact Index Per Article: 5.8] [Received: 07/28/2011] [Accepted: 07/28/2011] [Indexed: 01/01/2023]
Abstract
Cellular aging is linked to deficiencies in efficient repair of DNA double strand breaks and authentic genome maintenance at the chromatin level. Aging poses a significant threat to adult stem cell function by triggering persistent DNA damage and ultimately cellular senescence. Senescence is often considered to be an irreversible process. Moreover, critical genomic regions engaged in persistent DNA damage accumulation are unknown. Here we report that 65% of naturally occurring repairable DNA damage in self-renewing adult stem cells occurs within transposable elements. Upregulation of Alu retrotransposon transcription upon ex vivo aging causes nuclear cytotoxicity associated with the formation of persistent DNA damage foci and loss of efficient DNA repair in pericentric chromatin. This occurs due to a failure to recruit condensin I and cohesin complexes. Our results demonstrate that the cytotoxicity of induced Alu repeats is functionally relevant for human adult stem cell aging. Stable suppression of Alu transcription can reverse the senescent phenotype, reinstating the cells' self-renewing properties and increasing their plasticity by altering so-called "master" pluripotency regulators.
|
93
|
Momin AA, Park H, Portz BJ, Haynes CA, Shaner RL, Kelly SL, Jordan IK, Merrill JAH. A method for visualization of "omic" datasets for sphingolipid metabolism to predict potentially interesting differences. J Lipid Res 2011; 52:1073-1083. [PMID: 21415121 PMCID: PMC3090229 DOI: 10.1194/jlr.m010454] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.7] [Indexed: 01/17/2023]
Abstract
Sphingolipids are structurally diverse and their metabolic pathways highly complex, which makes it difficult to follow all of the subspecies in a biological system, even using “lipidomic” approaches. This report describes a method to use transcriptomic data to visualize and predict potential differences in sphingolipid composition, and it illustrates its use with published data for cancer cell lines and tumors. In addition, several novel sphingolipids that were predicted to differ between MDA-MB-231 and MCF7 cells based on published microarray data for these breast cancer cell lines were confirmed by mass spectrometry. For the data that we were able to find for these comparisons, there was a significant match between the gene expression data and sphingolipid composition (P < 0.001 by Fisher's exact test). Upon considering the large number of gene expression datasets produced in recent years, this simple integration of two types of “omic” technologies (“transcriptomics” to direct “sphingolipidomics”) might facilitate the discovery of useful relationships between sphingolipid metabolism and disease, such as the identification of new biomarkers.
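The significance figure quoted above comes from Fisher's exact test. As a rough illustration of how such a test works on a 2x2 table (gene expression predicts a difference or not, versus mass spectrometry observes a difference or not), the following is a minimal sketch. The function name `fisher_exact_greater` and the example counts are assumptions made for illustration; they are not taken from the paper.

```python
from math import comb

def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Returns P(X >= a) under the hypergeometric null, where X is the
    top-left cell and all row and column totals are held fixed.
    """
    row1 = a + b            # size of the first row
    col1 = a + c            # size of the first column
    n = a + b + c + d       # grand total
    total = comb(n, col1)   # number of ways to fill the first column
    upper = min(row1, col1) # largest value the top-left cell can take
    tail = sum(
        comb(row1, k) * comb(n - row1, col1 - k)
        for k in range(a, upper + 1)
    )
    return tail / total

# Hypothetical counts (invented for illustration, not from the paper):
# rows = predicted to differ / not predicted, columns = observed / not observed.
p_value = fisher_exact_greater(9, 1, 2, 8)
```

A small p-value here would indicate that predicted and observed differences coincide more often than expected by chance, which is the sense of "significant match" in the abstract.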
|
94
|
Jjingo D, Huda A, Gundapuneni M, Mariño-Ramírez L, Jordan IK. Effect of the transposable element environment of human genes on gene length and expression. Genome Biol Evol 2011; 3:259-71. [PMID: 21362639 PMCID: PMC3070429 DOI: 10.1093/gbe/evr015] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.0] [Indexed: 11/17/2022] Open
Abstract
Independent lines of investigation have documented effects of both transposable elements (TEs) and gene length (GL) on gene expression. However, TE gene fractions are highly correlated with GL, suggesting that they cannot be considered independently. We evaluated the TE environment of human genes and GL jointly in an attempt to tease apart their relative effects. TE gene fractions and GL were compared with the overall level of gene expression and the breadth of expression across tissues. GL is strongly correlated with overall expression level but weakly correlated with the breadth of expression, confirming the selection hypothesis that attributes the compactness of highly expressed genes to selection for economy of transcription. However, TE gene fractions overall, and for the L1 family in particular, show stronger anticorrelations with expression level than GL, indicating that GL may not be the most important target of selection for transcriptional economy. These results suggest a specific mechanism, removal of TEs, by which highly expressed genes are selectively tuned for efficiency. MIR elements are the only family of TEs with gene fractions that show a positive correlation with tissue-specific expression, suggesting that they may provide regulatory sequences that help to control human gene expression. Consistent with this notion, MIR fractions are relatively enriched close to transcription start sites and associated with coexpression in specific sets of related tissues. Our results confirm the overall relevance of the TE environment to gene expression and point to distinct mechanisms by which different TE families may contribute to gene regulation.
|
95
|
Huda A, Bowen NJ, Conley AB, Jordan IK. Epigenetic regulation of transposable element derived human gene promoters. Gene 2011; 475:39-48. [PMID: 21215797 DOI: 10.1016/j.gene.2010.12.010] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.7] [Received: 12/20/2010] [Accepted: 12/22/2010] [Indexed: 02/08/2023]
Abstract
It was previously thought that epigenetic histone modifications of mammalian transposable elements (TEs) serve primarily to defend the genome against deleterious effects associated with their activity. However, we recently showed that, genome-wide, human TEs can also be epigenetically modified in a manner consistent with their ability to regulate host genes. Here, we explore the ability of TE sequences to epigenetically regulate individual human genes by focusing on the histone modifications of promoter sequences derived from TEs. We found 1520 human genes that initiate transcription from within TE-derived promoter sequences. We evaluated the distributions of eight histone modifications across these TE-promoters, within and between the GM12878 and K562 cell lines, and related their modification status with the cell-type specific expression patterns of the genes that they regulate. TE-derived promoters are significantly enriched for active histone modifications, and depleted for repressive modifications, relative to the genomic background. Active histone modifications of TE-promoters peak at transcription start sites and are positively correlated with increasing expression within cell lines. Furthermore, differential modification of TE-derived promoters between cell lines is significantly correlated with differential gene expression. LTR-retrotransposon derived promoters in particular play a prominent role in mediating cell-type specific gene regulation, and a number of these LTR-promoter genes are implicated in lineage-specific cellular functions. The regulation of human genes mediated by histone modifications targeted to TE-derived promoters is consistent with the ability of TEs to contribute to the epigenomic landscape in a way that provides functional utility to the host genome.
|
96
|
Wang J, Huda A, Lunyak VV, Jordan IK. A Gibbs sampling strategy applied to the mapping of ambiguous short-sequence tags. Bioinformatics 2010; 26:2501-8. [PMID: 20871106 PMCID: PMC2951085 DOI: 10.1093/bioinformatics/btq460] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.4] [Indexed: 11/17/2022]
Abstract
Motivation: Chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq) is widely used in biological research. ChIP-seq experiments yield many ambiguous tags that can be mapped with equal probability to multiple genomic sites. Such ambiguous tags are typically eliminated from consideration, resulting in a potential loss of important biological information. Results: We have developed a Gibbs sampling-based algorithm for the genomic mapping of ambiguous sequence tags. Our algorithm relies on the local genomic tag context to guide the mapping of ambiguous tags. The Gibbs sampling procedure we use simultaneously maps ambiguous tags and updates the probabilities used to infer correct tag map positions. We show that our algorithm is able to correctly map more ambiguous tags than existing mapping methods. Our approach is also able to uncover mapped genomic sites from highly repetitive sequences that cannot be detected based on unique tags alone, including transposable elements, segmental duplications and peri-centromeric regions. This mapping approach should prove to be useful for increasing biological knowledge on the too often neglected repetitive genomic regions. Availability: http://esbg.gatech.edu/jordan/software/map Contact: king.jordan@biology.gatech.edu Supplementary Information: Supplementary data are available at Bioinformatics online.
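The general idea described above, using local tag context to place ambiguous tags while repeatedly updating the placement probabilities, can be illustrated with a toy Gibbs sampler. This is a simplified sketch and not the authors' implementation: the function name `gibbs_map_tags`, the fixed window size, the pseudocount and the majority-vote summary are all assumptions made for illustration.

```python
import random
from collections import Counter

def gibbs_map_tags(unique_counts, ambiguous_tags, window=100,
                   n_iter=200, pseudo=1.0, seed=0):
    """Assign each ambiguous tag to one of its candidate positions.

    unique_counts:  dict mapping position -> number of uniquely mapped tags
    ambiguous_tags: list of lists, each holding one tag's candidate positions
    Returns the most frequently sampled position for every ambiguous tag.
    """
    rng = random.Random(seed)
    # Start from a random placement of every ambiguous tag.
    assignment = [rng.choice(sites) for sites in ambiguous_tags]
    counts = Counter(unique_counts)
    for pos in assignment:
        counts[pos] += 1

    def local_support(pos):
        # Tags currently placed within `window` bp of pos, plus a pseudocount
        # so that every candidate keeps a non-zero sampling weight.
        return pseudo + sum(c for p, c in counts.items() if abs(p - pos) <= window)

    tallies = [Counter() for _ in ambiguous_tags]
    for _ in range(n_iter):
        for i, sites in enumerate(ambiguous_tags):
            counts[assignment[i]] -= 1  # take tag i out of the current state
            weights = [local_support(s) for s in sites]
            # Resample tag i in proportion to the local tag density.
            assignment[i] = rng.choices(sites, weights=weights, k=1)[0]
            counts[assignment[i]] += 1  # put it back at the sampled site
            tallies[i][assignment[i]] += 1
    return [t.most_common(1)[0][0] for t in tallies]

# Toy input: unique tags cluster near position 1000; each ambiguous tag could
# map either near that cluster (1005) or to an empty region (5005).
example_unique = {1000: 50, 1010: 40}
example_tags = [[1005, 5005] for _ in range(5)]
example_mapping = gibbs_map_tags(example_unique, example_tags)
```

Because each tag is resampled conditional on the current placement of all other tags, ambiguous tags tend to accumulate at sites already supported by nearby tags. A fuller treatment would discard burn-in iterations and use a proper likelihood model, which this sketch omits.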
|
97
|
Pray JL, Jordan IK. The Deaf Community and Culture at a Crossroads: Issues and Challenges. J Soc Work Disabil Rehabil 2010; 9:168-93. [DOI: 10.1080/1536710x.2010.493486] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Indexed: 10/19/2022]
|
98
|
Kislyuk AO, Katz LS, Agrawal S, Hagen MS, Conley AB, Jayaraman P, Nelakuditi V, Humphrey JC, Sammons SA, Govil D, Mair RD, Tatti KM, Tondella ML, Harcourt BH, Mayer LW, Jordan IK. A computational genomics pipeline for prokaryotic sequencing projects. Bioinformatics 2010; 26:1819-26. [PMID: 20519285 PMCID: PMC2905547 DOI: 10.1093/bioinformatics/btq284] [Citation(s) in RCA: 59] [Impact Index Per Article: 4.2] [Indexed: 11/12/2022]
Abstract
MOTIVATION New sequencing technologies have accelerated research on prokaryotic genomes and have made genome sequencing operations outside major genome sequencing centers routine. However, no off-the-shelf solution exists for the combined assembly, gene prediction, genome annotation and data presentation necessary to interpret sequencing data. The resulting requirement to invest significant resources into custom informatics support for genome sequencing projects remains a major impediment to the accessibility of high-throughput sequence data. RESULTS We present a self-contained, automated high-throughput open source genome sequencing and computational genomics pipeline suitable for prokaryotic sequencing projects. The pipeline has been used at the Georgia Institute of Technology and the Centers for Disease Control and Prevention for the analysis of Neisseria meningitidis and Bordetella bronchiseptica genomes. The pipeline is capable of enhanced or manually assisted reference-based assembly using multiple assemblers and modes; gene predictor combining; and functional annotation of genes and gene products. Because every component of the pipeline is executed on a local machine with no need to access resources over the Internet, the pipeline is suitable for projects of a sensitive nature. Annotation of virulence-related features makes the pipeline particularly useful for projects working with pathogenic prokaryotes. AVAILABILITY AND IMPLEMENTATION The pipeline is licensed under the open-source GNU General Public License and available at the Georgia Tech Neisseria Base (http://nbase.biology.gatech.edu/). The pipeline is implemented with a combination of Perl, Bourne Shell and MySQL and is compatible with Linux and other Unix systems.
|
99
|
Huda A, Mariño-Ramírez L, Jordan IK. Epigenetic histone modifications of human transposable elements: genome defense versus exaptation. Mob DNA 2010; 1:2. [PMID: 20226072 PMCID: PMC2836006 DOI: 10.1186/1759-8753-1-2] [Citation(s) in RCA: 52] [Impact Index Per Article: 3.7] [Received: 06/19/2009] [Accepted: 01/25/2010] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Transposition is disruptive in nature and, thus, it is imperative for host genomes to evolve mechanisms that suppress the activity of transposable elements (TEs). At the same time, transposition also provides diverse sequences that can be exapted by host genomes as functional elements. These notions form the basis of two competing hypotheses pertaining to the role of epigenetic modifications of TEs in eukaryotic genomes: the genome defense hypothesis and the exaptation hypothesis. To date, all available evidence points to the genome defense hypothesis as the best explanation for the biological role of TE epigenetic modifications. RESULTS We evaluated several predictions generated by the genome defense hypothesis versus the exaptation hypothesis using recently characterized epigenetic histone modification data for the human genome. To this end, we mapped chromatin immunoprecipitation sequence tags from 38 histone modifications, characterized in CD4+ T cells, to the human genome and calculated their enrichment and depletion in all families of human TEs. We found that several of these families are significantly enriched or depleted for various histone modifications, both active and repressive. The enrichment of human TE families with active histone modifications is consistent with the exaptation hypothesis and stands in contrast to previous analyses that have found mammalian TEs to be exclusively repressively modified. Comparisons between TE families revealed that older families carry more histone modifications than younger ones, another observation consistent with the exaptation hypothesis. However, data from within family analyses on the relative ages of epigenetically modified elements are consistent with both the genome defense and exaptation hypotheses. 
Finally, TEs located proximal to genes carry more histone modifications than the ones that are distal to genes, as may be expected if epigenetically modified TEs help to regulate the expression of nearby host genes. CONCLUSIONS With a few exceptions, most of our findings support the exaptation hypothesis for the role of TE epigenetic modifications when vetted against the genome defense hypothesis. The recruitment of epigenetic modifications may represent an additional mechanism by which TEs can contribute to the regulatory functions of their host genomes.
|
100
|
Huda A, Jordan IK. Epigenetic Regulation of Mammalian Genomes by Transposable Elements. Ann N Y Acad Sci 2009; 1178:276-84. [DOI: 10.1111/j.1749-6632.2009.05007.x] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.1] [Indexed: 01/22/2023]
|