1
Banirazi Motlagh N, Mohammadpour Esfahani B, Ashrafi B, Zare-Mirakabad F. The assessment of histone acetylation marks in the vicinity of transcription factor binding sites in human CD4+ T cells using information theory methods. Comput Biol Chem 2020; 86:107232. [PMID: 32142982 DOI: 10.1016/j.compbiolchem.2020.107232] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 03/10/2018] [Revised: 01/29/2019] [Accepted: 02/08/2020] [Indexed: 11/24/2022]
Abstract
The genetic information encoded in structural genes is decoded by an intracellular process called gene expression. This process is regulated by epigenetic mechanisms such as histone acetylation. Histone acetylation, which occurs in nucleosomes, exposes genomic DNA to transcription factors. The correlation between histone acetylation and gene expression has therefore been assessed as a fundamental issue in many previous studies. In this study, we investigate which histone acetylation marks are informative and which are redundant in the vicinity of SP1 transcription factor binding sites in human CD4+ T cells. To achieve this, we use information theory methods. Subsequently, we apply a multilayer perceptron neural network to show that the histone acetylation marks selected by the information theory methods are sufficiently informative. Finally, we use the neural network to predict binding sites of 17 other transcription factors on chromosomes 1 and 2. The results suggest that the information conveyed by the selected histone acetylation marks is equivalent to that of all 18 marks associated with SP1 transcription factor binding sites on chromosome 1. Furthermore, almost 91.75% of SP1 binding sites on chromosome 2 are predicted by the selected histone acetylation marks, while all 18 marks predict 90.56% correctly. Moreover, the selected histone acetylation marks are efficient at predicting the binding sites of the 17 other transcription factors.
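The abstract does not say which information-theoretic quantities were used. As a minimal sketch, assuming mutual information between a binarized mark signal and a binding label is one such quantity, the idea of an "informative" versus an uninformative mark can be illustrated as follows. The function name, the binarization, and the toy data are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information I(X;Y) in bits between two discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))  # counts of each (x, y) pair
    px = Counter(xs)              # marginal counts of x
    py = Counter(ys)              # marginal counts of y
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        # p(x,y) / (p(x) p(y)) written with raw counts
        mi += p_xy * math.log2(p_xy * n * n / (px[x] * py[y]))
    return mi


# Hypothetical toy data: 1 = region is a binding site / mark is present.
binding = [1, 1, 1, 1, 0, 0, 0, 0]
mark_a = [1, 1, 1, 1, 0, 0, 0, 0]  # tracks binding exactly
mark_b = [1, 0, 1, 0, 1, 0, 1, 0]  # statistically independent of binding

mi_a = mutual_information(mark_a, binding)
mi_b = mutual_information(mark_b, binding)
print(f"mark_a: {mi_a:.3f} bits, mark_b: {mi_b:.3f} bits")
```

In a selection scheme of this kind, a mark with high mutual information with the binding label would be kept, and a mark that adds little beyond the marks already chosen would be treated as redundant.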
Affiliation(s)
- Nafiseh Banirazi Motlagh
- Department of Mathematics and Computer Science, Amirkabir University of Technology, Tehran, Iran
- Behnoosh Ashrafi
- Department of Mathematics and Computer Science, Amirkabir University of Technology, Tehran, Iran
- Fatemeh Zare-Mirakabad
- Department of Mathematics and Computer Science, Amirkabir University of Technology, Tehran, Iran.
2
Genome-wide integration on transcription factors, histone acetylation and gene expression reveals genes co-regulated by histone modification patterns. PLoS One 2011; 6:e22281. [PMID: 21829453 PMCID: PMC3146477 DOI: 10.1371/journal.pone.0022281] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.3] [Received: 08/25/2010] [Accepted: 06/22/2011] [Indexed: 12/26/2022] Open
Abstract
N-terminal tails of the H2A, H2B, H3 and H4 histone families are subject to posttranslational modifications that take part in transcriptional regulation mechanisms, such as transcription factor binding and gene expression. Regulation mechanisms under the control of histone modification are important but remain largely unclear, despite emerging datasets for comprehensive analysis of histone modification. In this paper, we focus on what we call genetic harmonious units (GHUs), which are co-occurring patterns among transcription factor binding, gene expression and histone modification. We present the first genome-wide approach that captures GHUs by combining ChIP-chip with microarray datasets from Saccharomyces cerevisiae. Our approach employs noise-robust soft clustering to select patterns that share the same preferences in transcription factor binding, histone modification and gene expression, all of which are currently thought to be closely correlated. The detected patterns are the well-studied acetylation of lysine 16 of H4 under glucose depletion and the co-acetylation of five lysine residues of H3 with H4 Lys12 and H2A Lys7, responsible for ribosome biogenesis. Furthermore, using a microarray dataset on ESA1 and its bypass suppressor mutants, our method suggested that recognition of acetylated H4 Lys16 is crucial to the histone acetyltransferase ESA1, whose essential role is still controversial. These results demonstrate that our approach can provide clearer principles behind gene regulation mechanisms under histone modification and can detect further GHUs when applied to other microarray and ChIP-chip datasets. The source code of our method, which was implemented in MATLAB (http://www.mathworks.com/), is available from the supporting page for this paper: http://www.bic.kyoto-u.ac.jp/pathway/natsume/hm_detector.htm.
3
Wang J. Computational study of associations between histone modification and protein-DNA binding in yeast genome by integrating diverse information. BMC Genomics 2011; 12:172. [PMID: 21457549 PMCID: PMC3082246 DOI: 10.1186/1471-2164-12-172] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 10/08/2010] [Accepted: 04/01/2011] [Indexed: 01/06/2023] Open
Abstract
Background In parallel with the rapid development of high-throughput technologies, in vivo (vitro) experiments for genome-wide identification of protein-DNA interactions have been developed. Nevertheless, a few questions remain in the field, such as how to distinguish true protein-DNA binding (functional binding) from non-specific protein-DNA binding (non-functional binding). Previous research tackled the problem by integrated analysis of multiple available sources. However, few systematic studies have examined the possible relationships between histone modification and protein-DNA binding. Here this issue was investigated using publicly available histone modification data in yeast. Results Two separate histone modification datasets were studied, at both the open reading frame (ORF) and the promoter region of binding targets for 37 yeast transcription factors. Both results revealed a distinct histone modification pattern between the functional protein-DNA binding sites and the non-functional ones for almost half of all TFs tested. This difference is much stronger at the ORF than at the promoter region. In addition, a protein-histone modification interaction pathway can only be inferred from the functional protein binding targets. Conclusions Overall, the results suggest that histone modification information can be used to distinguish functional protein-DNA binding from non-functional binding, and that the regulation of various proteins is controlled by the modification of different histone lysines such as the protein-specific histone modification levels.
Affiliation(s)
- Junbai Wang
- Department of Pathology, The Norwegian Radium Hospital, Oslo University Hospital, Oslo, Norway.
4
Gianoulis TA, Agarwal A, Snyder M, Gerstein MB. The CRIT framework for identifying cross patterns in systems biology and application to chemogenomics. Genome Biol 2011; 12:R32. [PMID: 21453526 PMCID: PMC3129682 DOI: 10.1186/gb-2011-12-3-r32] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Received: 08/04/2010] [Revised: 01/31/2011] [Accepted: 03/31/2011] [Indexed: 12/03/2022] Open
Abstract
Biological data is often tabular but finding statistically valid connections between entities in a sequence of tables can be problematic - for example, connecting particular entities in a drug property table to gene properties in a second table, using a third table associating genes with drugs. Here we present an approach (CRIT) to find connections such as these and show how it can be applied in a variety of genomic contexts including chemogenomics data.
Affiliation(s)
- Tara A Gianoulis
- Department of Genetics, 77 Ave. of Louis Pasteur, Harvard Medical School, Boston, MA 02115, USA
5
Mira NP, Becker JD, Sá-Correia I. Genomic expression program involving the Haa1p-regulon in Saccharomyces cerevisiae response to acetic acid. OMICS-A JOURNAL OF INTEGRATIVE BIOLOGY 2011; 14:587-601. [PMID: 20955010 DOI: 10.1089/omi.2010.0048] [Citation(s) in RCA: 125] [Impact Index Per Article: 8.9] [Indexed: 11/12/2022]
Abstract
The alterations occurring in yeast genomic expression during early response to acetic acid and the involvement of the transcription factor Haa1p in this transcriptional reprogramming are described in this study. Haa1p was found to regulate, directly or indirectly, the transcription of approximately 80% of the acetic acid-activated genes, suggesting that Haa1p is the main player in the control of the yeast response to this weak acid. The genes identified in this work as being activated in response to acetic acid in a Haa1p-dependent manner include protein kinases, multidrug resistance transporters, proteins involved in lipid metabolism and in nucleic acid processing, and proteins of unknown function. Among these genes, the expression of SAP30 and HRK1 provided the strongest protective effect against acetic acid. SAP30 encodes a subunit of a histone deacetylase complex and HRK1 encodes a protein kinase belonging to a family of protein kinases dedicated to regulating the activity of plasma membrane transporters. Deletion of the HRK1 gene was found to increase the accumulation of labeled acetic acid in acid-stressed yeast cells, suggesting that the role of both HAA1 and HRK1 in providing protection against acetic acid is, at least partially, related to their involvement in the reduction of intracellular acetate concentration.
Affiliation(s)
- Nuno P Mira
- Institute for Biotechnology and Bioengineering, Centre for Biological and Chemical Engineering, Instituto Superior Técnico, Technical University of Lisbon, Lisboa, Portugal
6
Liew AWC, Law NF, Yan H. Missing value imputation for gene expression data: computational techniques to recover missing data from available information. Brief Bioinform 2010; 12:498-513. [PMID: 21156727 DOI: 10.1093/bib/bbq080] [Citation(s) in RCA: 97] [Impact Index Per Article: 6.5] [Indexed: 12/19/2022] Open
Affiliation(s)
- Alan Wee-Chung Liew
- School of Information and Communication Technology, Gold Coast Campus, Griffith University, QLD4222, Australia.
7
Wang J, Dai X, Xiang Q, Deng Y, Feng J, Dai Z, He C. Identifying the combinatorial effects of histone modifications by association rule mining in yeast. Evol Bioinform Online 2010; 6:113-31. [PMID: 21037963 PMCID: PMC2964047 DOI: 10.4137/ebo.s5602] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 11/05/2022] Open
Abstract
Eukaryotic genomes are packaged into chromatin by histone proteins whose chemical modification can profoundly influence gene expression. Histone modifications often act in combinations, which exert different effects on gene expression. Although a number of experimental techniques and data analysis methods have been developed to study histone modifications, it is still very difficult to identify the relationships among histone modifications on a genome-wide scale. We proposed a method to identify the combinatorial effects of histone modifications by association rule mining. The method first identified Functional Modification Transactions (FMTs) and then employed an association rule mining algorithm and statistical methods to identify histone modification patterns. We applied the proposed methodology to Pokholok et al.'s data with eight sets of histone modifications and Kurdistani et al.'s data with eleven histone acetylation sites. Our method succeeds in revealing two different global views of histone modification landscapes on the two datasets and in identifying a number of modification patterns, some of which are supported by previous studies. We concentrate on combinatorial effects of histone modifications that significantly affect gene expression. Our method succeeds in identifying known interactions among histone modifications and in uncovering many previously unknown patterns. After in-depth analysis of the possible mechanisms by which histone modification patterns can alter transcriptional states, we infer three possible modification pattern reading mechanisms ('redundant', 'trivial', 'dominative'). Our results demonstrate several histone modification patterns that show significant correspondence between yeast and human cells.
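Association rule mining scores a rule such as "modification A implies modification B" by its support and confidence over a set of transactions. The sketch below illustrates these two standard measures on toy data. It is an assumption-laden illustration, not the authors' FMT pipeline: the helper names, the encoding of each gene as a set of modification labels, and the four example transactions are all hypothetical.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(items <= t for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent | antecedent) for the rule antecedent -> consequent."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


# Hypothetical toy transactions: one set of observed modifications per gene.
transactions = [
    {"H3K9ac", "H3K14ac"},
    {"H3K9ac", "H3K14ac", "H4K12ac"},
    {"H3K9ac"},
    {"H4K12ac"},
]

rule_support = support(transactions, {"H3K9ac", "H3K14ac"})
rule_confidence = confidence(transactions, {"H3K9ac"}, {"H3K14ac"})
print(f"support={rule_support:.2f}, confidence={rule_confidence:.2f}")
```

Rules whose support and confidence exceed chosen thresholds would be the candidate modification patterns. The thresholds and any further statistical filtering are left out here because the abstract does not specify them.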
Affiliation(s)
- Jiang Wang
- Department of Electronics and Communications Engineering, School of Information Science and Technology, Sun Yat-Sen University, 135 West Xin'gang Road, Guangzhou, P.R. China
8
Analysis of membrane proteins in metagenomics: networks of correlated environmental features and protein families. Genome Res 2010; 20:960-71. [PMID: 20430783 DOI: 10.1101/gr.102814.109] [Citation(s) in RCA: 37] [Impact Index Per Article: 2.5] [Indexed: 11/25/2022]
Abstract
Recent metagenomics studies have begun to sample the genomic diversity among disparate habitats and relate this variation to features of the environment. Membrane proteins are an intuitive, but thus far overlooked, choice in this type of analysis as they directly interact with the environment, receiving signals from the outside and transporting nutrients. Using global ocean sampling (GOS) data, we found approximately 900,000 membrane proteins in large-scale metagenomic sequence, approximately a fifth of which are completely novel, suggesting a large space of hitherto unexplored protein diversity. Using GPS coordinates for the GOS sites, we extracted additional environmental features via interpolation from the World Ocean Database, the National Center for Ecological Analysis and Synthesis, and empirical models of dust occurrence. This allowed us to study membrane protein variation in terms of natural features, such as phosphate and nitrate concentrations, and also in terms of human impacts, such as pollution and climate change. We show that there is widespread variation in membrane protein content across marine sites, which is correlated with changes in both oceanographic variables and human factors. Furthermore, using these data, we developed an approach, protein families and environment features network (PEN), to quantify and visualize the correlations. PEN identifies small groups of covarying environmental features and membrane protein families, which we call "bimodules." Using this approach, we find that the affinity of phosphate transporters is related to the concentration of phosphate and that the occurrence of iron transporters is connected to the amount of shipping, pollution, and iron-containing dust.
9
She X, Rohl CA, Castle JC, Kulkarni AV, Johnson JM, Chen R. Definition, conservation and epigenetics of housekeeping and tissue-enriched genes. BMC Genomics 2009; 10:269. [PMID: 19534766 PMCID: PMC2706266 DOI: 10.1186/1471-2164-10-269] [Citation(s) in RCA: 114] [Impact Index Per Article: 7.1] [Received: 11/08/2008] [Accepted: 06/17/2009] [Indexed: 12/16/2022] Open
Abstract
BACKGROUND Housekeeping genes (HKG) are constitutively expressed in all tissues while tissue-enriched genes (TEG) are expressed at a much higher level in a single tissue type than in others. HKGs serve as valuable experimental controls in gene and protein expression experiments, while TEGs tend to represent distinct physiological processes and are frequently candidates for biomarkers or drug targets. The genomic features of these two groups of genes expressed in opposing patterns may shed light on the mechanisms by which cells maintain basic and tissue-specific functions. RESULTS Here, we generate gene expression profiles of 42 normal human tissues on custom high-density microarrays to systematically identify 1,522 HKGs and 975 TEGs and compile a small subset of 20 housekeeping genes which are highly expressed in all tissues with lower variance than many commonly used HKGs. Cross-species comparison shows that both the functions and expression patterns of HKGs are conserved. TEGs are enriched with respect to both segmental duplication and copy number variation, while no such enrichment is observed for HKGs, suggesting that the high expression of HKGs is not due to high copy numbers. Analysis of genomic and epigenetic features of HKGs and TEGs reveals that the high expression of HKGs across different tissues is associated with decreased nucleosome occupancy at the transcription start site, as indicated by enhanced DNase hypersensitivity. Additionally, we systematically and quantitatively demonstrated the enrichment of CpG islands at HKG transcription start sites (TSS) and their depletion at TEG TSS. Histone methylation patterns differ significantly between HKGs and TEGs, suggesting that methylation contributes to the differential expression patterns as well. CONCLUSION We have compiled a set of high quality HKGs that should provide higher and more consistent expression when used as references in laboratory experiments than currently used HKGs. The comparison of genomic features between HKGs and TEGs shows that HKGs are more conserved than TEGs in terms of functions, expression pattern and polymorphisms. In addition, our results identify chromatin structure and epigenetic features of HKGs and TEGs that are likely to play an important role in regulating their strikingly different expression patterns.
Affiliation(s)
- Xinwei She
- Rosetta Inpharmatics LLC, Seattle, WA 98109, USA.
10
Xiang Q, Dai X, Deng Y, He C, Wang J, Feng J, Dai Z. Missing value imputation for microarray gene expression data using histone acetylation information. BMC Bioinformatics 2008; 9:252. [PMID: 18510747 PMCID: PMC2432074 DOI: 10.1186/1471-2105-9-252] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.1] [Received: 08/02/2007] [Accepted: 05/29/2008] [Indexed: 01/21/2023] Open
Abstract
BACKGROUND Accurately estimating missing values in microarray data is an important pre-processing step, because complete datasets are required in numerous expression profile analyses in bioinformatics. Although several methods have been suggested, their performance is not satisfactory for datasets with high missing percentages. RESULTS The paper explores the feasibility of performing missing value imputation with the help of gene regulatory mechanisms. An imputation framework called the histone acetylation information aided imputation method (HAIimpute method) is presented. It incorporates histone acetylation information into the conventional KNN (k-nearest neighbor) and LLS (local least squares) imputation algorithms for final prediction of the missing values. The experimental results indicated that the use of acetylation information can provide significant improvements in microarray imputation accuracy. The HAIimpute methods consistently improve on widely used methods such as KNN and LLS in terms of normalized root mean squared error (NRMSE). Meanwhile, the genes imputed by the HAIimpute methods are more correlated with the original complete genes in terms of Pearson correlation coefficients. Furthermore, the proposed methods also outperform GOimpute, one of the existing related methods that use functional similarity as the external information. CONCLUSION We demonstrated that the use of histone acetylation information could greatly improve imputation performance, especially at high missing percentages. This idea can be generalized to various imputation methods to improve their performance. Moreover, as more knowledge accumulates on gene regulatory mechanisms in addition to histone acetylation, the performance of our approach can be further improved and verified.
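For orientation, the sketch below shows the plain KNN baseline and an NRMSE score of the kind the abstract mentions. It is not HAIimpute and uses no acetylation information. The function names, the distance measure, the toy matrix, and the choice to normalize RMSE by the standard deviation of the true values (one common convention, which the abstract does not confirm) are all assumptions made for illustration.

```python
import math


def knn_impute(matrix, row, col, k=2):
    """Estimate the missing matrix[row][col] as the mean of that column over
    the k rows closest to `row` on the columns observed in both rows."""
    target = matrix[row]
    candidates = []
    for i, other in enumerate(matrix):
        if i == row or other[col] is None:
            continue
        shared = [j for j in range(len(target))
                  if j != col and target[j] is not None and other[j] is not None]
        if not shared:
            continue
        # Root mean squared difference over the shared columns
        dist = math.sqrt(sum((target[j] - other[j]) ** 2 for j in shared) / len(shared))
        candidates.append((dist, other[col]))
    if not candidates:
        raise ValueError("no usable neighbour rows")
    candidates.sort(key=lambda pair: pair[0])
    nearest = candidates[:k]
    return sum(value for _, value in nearest) / len(nearest)


def nrmse(imputed, truth):
    """RMSE divided by the population standard deviation of the true values."""
    n = len(truth)
    mean = sum(truth) / n
    std = math.sqrt(sum((t - mean) ** 2 for t in truth) / n)
    rmse = math.sqrt(sum((a - b) ** 2 for a, b in zip(imputed, truth)) / n)
    return rmse / std


# Hypothetical toy expression matrix (genes x conditions); None marks a missing value.
expr = [
    [1.0, 2.0, 3.0],
    [1.0, 2.0, None],
    [1.1, 2.1, 3.2],
    [5.0, 6.0, 9.0],
]
estimate = knn_impute(expr, row=1, col=2, k=2)
score = nrmse([2.0, 2.0], [1.0, 3.0])
print(estimate, score)
```

A method in the spirit of the abstract would change how neighbours are chosen or weighted by bringing in an external data source. That step is deliberately omitted because the abstract gives no detail on it.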
Affiliation(s)
- Qian Xiang
- Department of Electronics & Communications Engineering, School of Information Science and Technology, Sun Yat-Sen University, 135 West Xin'gang Road, Guangzhou, PR China
- Xianhua Dai
- Department of Electronics & Communications Engineering, School of Information Science and Technology, Sun Yat-Sen University, 135 West Xin'gang Road, Guangzhou, PR China
- Yangyang Deng
- Department of Electronics & Communications Engineering, School of Information Science and Technology, Sun Yat-Sen University, 135 West Xin'gang Road, Guangzhou, PR China
- Caisheng He
- Department of Electronics & Communications Engineering, School of Information Science and Technology, Sun Yat-Sen University, 135 West Xin'gang Road, Guangzhou, PR China
- Jiang Wang
- Department of Electronics & Communications Engineering, School of Information Science and Technology, Sun Yat-Sen University, 135 West Xin'gang Road, Guangzhou, PR China
- Jihua Feng
- Department of Electronics & Communications Engineering, School of Information Science and Technology, Sun Yat-Sen University, 135 West Xin'gang Road, Guangzhou, PR China
- Zhiming Dai
- Department of Electronics & Communications Engineering, School of Information Science and Technology, Sun Yat-Sen University, 135 West Xin'gang Road, Guangzhou, PR China
11
Pham H, Ferrari R, Cokus SJ, Kurdistani SK, Pellegrini M. Modeling the regulatory network of histone acetylation in Saccharomyces cerevisiae. Mol Syst Biol 2007; 3:153. [PMID: 18091724 PMCID: PMC2174627 DOI: 10.1038/msb4100194] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.0] [Received: 08/20/2007] [Accepted: 10/17/2007] [Indexed: 01/03/2023] Open
Abstract
Acetylation of histones plays an important role in regulating transcription. Histone acetylation is mediated partly by the recruitment of specific histone acetyltransferases (HATs) and deacetylases (HDACs) to genomic loci by transcription factors, resulting in modulation of gene expression. Although several specific interactions between transcription factors and HATs and HDACs have been elaborated in Saccharomyces cerevisiae, the full regulatory network remains uncharacterized. We have utilized a linear regression of optimized sigmoidal functions to correlate transcription factor binding patterns to the acetylation profiles of 11 lysines in the four core histones measured at all S. cerevisiae promoters. The resulting associations are combined with large-scale protein–protein interaction data sets to generate a comprehensive model that relates recruitment of specific HDACs and HATs to transcription factors and their target genes and the resulting effects on individual lysines. This model provides a broad and detailed view of the regulatory network, describing which transcription factors are most significant in regulating acetylation of specific lysines at defined promoters. We validate the model, both computationally and experimentally, to demonstrate that it yields accurate predictions of these regulatory mechanisms.
Affiliation(s)
- Hung Pham
- Department of Molecular, Cell and Developmental Biology, University of California, Los Angeles, CA 90095, USA
12
Abstract
Post-translational histone modifications and histone variants generate complexity in chromatin to enable the many functions of the chromosome. Recent studies have mapped histone modifications across the Saccharomyces cerevisiae genome. These experiments describe how combinations of modified and unmodified states relate to each other and particularly to chromosomal landmarks that include heterochromatin, subtelomeric chromatin, centromeres, origins of replication, promoters and coding regions. Such patterns might be important for the regulation of heterochromatin-mediated silencing, chromosome segregation, DNA replication and gene expression.
Affiliation(s)
- Catherine B Millar
- Department of Biological Chemistry, Geffen School of Medicine and the Molecular Biology Institute, University of California, Los Angeles, California 90095, USA.
13
John Wiley & Sons, Ltd. Current awareness on yeast. Yeast 2006. [DOI: 10.1002/yea.1320] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/07/2022] Open