Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Mahony S, Hendrix D, Golden A, Smith TJ, Rokhsar DS. Transcription factor binding site identification using the self-organizing map. Bioinformatics 2005;21:1807-14. [PMID: 15647296 DOI: 10.1093/bioinformatics/bti256] [Citation(s) in RCA: 55] [Impact Index Per Article: 2.8] [Indexed: 11/13/2022]

Cited by Other Article(s)

Theepalakshmi P, Reddy US. Freezing firefly algorithm for efficient planted (ℓ, d) motif search. Med Biol Eng Comput 2022;60:511-530. [PMID: 35020123 DOI: 10.1007/s11517-021-02468-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/28/2020] [Accepted: 11/06/2021] [Indexed: 10/19/2022]

Kunz T, Rieber L, Mahony S. Assessing relationships between chromatin interactions and regulatory genomic activities using the self-organizing map. Methods 2020;189:12-21. [PMID: 32652235 DOI: 10.1016/j.ymeth.2020.07.002] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 02/16/2020] [Revised: 06/09/2020] [Accepted: 07/03/2020] [Indexed: 11/24/2022]

Abstract

Few existing methods enable the visualization of relationships between regulatory genomic activities and genome organization as captured by Hi-C experimental data. Genome-wide Hi-C datasets are often displayed using "heatmap" matrices, but it is difficult to intuit from these heatmaps which biochemical activities are compartmentalized together. High-dimensional Hi-C data vectors can alternatively be projected onto three-dimensional space using dimensionality reduction techniques. The resulting three-dimensional structures can serve as scaffolds for projecting other forms of genomic information, thereby enabling the exploration of relationships between genome organization and various genome annotations. However, while three-dimensional models are contextually appropriate for chromatin interaction data, some analyses and visualizations may be more intuitively and conveniently performed in two-dimensional space. We present a novel approach to the visualization and analysis of chromatin organization based on the Self-Organizing Map (SOM). The SOM algorithm provides a two-dimensional manifold which adapts to represent the high dimensional chromatin interaction space. The resulting data structure can then be used to assess relationships between regulatory genomic activities and chromatin interactions. For example, given a set of genomic coordinates corresponding to a given biochemical activity, the degree to which this activity is segregated or compartmentalized in chromatin interaction space can be intuitively visualized on the 2D SOM grid and quantified using Lorenz curve analysis. We demonstrate our approach for exploratory analysis of genome compartmentalization in a high-resolution Hi-C dataset from the human GM12878 cell line. Our SOM-based approach provides an intuitive visualization of the large-scale structure of Hi-C data and serves as a platform for integrative analyses of the relationships between various genomic activities and genome organization.
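
The Lorenz curve analysis mentioned in this abstract can be illustrated with a minimal sketch. It assumes that an activity has already been counted per SOM node; the function names and toy counts are hypothetical, and the Gini coefficient is used here only as one common summary of a Lorenz curve, not necessarily the statistic the authors report.

```python
def lorenz_curve(counts):
    """Return Lorenz curve points (x, y) for non-negative per-node counts."""
    values = sorted(counts)
    total = sum(values)
    if total <= 0:
        raise ValueError("counts must sum to a positive value")
    n = len(values)
    points = [(0.0, 0.0)]
    running = 0.0
    for i, v in enumerate(values, start=1):
        running += v
        # x: fraction of nodes considered, y: fraction of the activity they hold
        points.append((i / n, running / total))
    return points


def gini(counts):
    """Gini coefficient: 1 minus twice the area under the Lorenz curve."""
    pts = lorenz_curve(counts)
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0  # trapezoid rule
    return 1.0 - 2.0 * area


# Hypothetical counts of an activity's genomic sites on a 4-node map.
evenly_spread = [5, 5, 5, 5]
compartmentalized = [0, 0, 0, 20]
print(gini(evenly_spread), gini(compartmentalized))
```

A value near 0 indicates an activity spread evenly across nodes; values closer to 1 indicate an activity concentrated in a few nodes.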

Lee NK, Li X, Wang D. A comprehensive survey on genetic algorithms for DNA motif prediction. Inf Sci (N Y) 2018. [DOI: 10.1016/j.ins.2018.07.004] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Indexed: 12/25/2022]

Liu S, Zibetti C, Wan J, Wang G, Blackshaw S, Qian J. Assessing the model transferability for prediction of transcription factor binding sites based on chromatin accessibility. BMC Bioinformatics 2017;18:355. [PMID: 28750606 PMCID: PMC5530957 DOI: 10.1186/s12859-017-1769-7] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.9] [Received: 05/03/2017] [Accepted: 07/19/2017] [Indexed: 12/04/2022]

Abstract

Background

Computational prediction of transcription factor (TF) binding sites in different cell types is challenging. Recent technology development allows us to determine the genome-wide chromatin accessibility in various cellular and developmental contexts. The chromatin accessibility profiles provide useful information in prediction of TF binding events in various physiological conditions. Furthermore, ChIP-Seq analysis was used to determine genome-wide binding sites for a range of different TFs in multiple cell types. Integration of these two types of genomic information can improve the prediction of TF binding events.

Results

We assessed to what extent a model built upon other TFs and/or other cell types could be used to predict the binding sites of TFs of interest. A random forest model was built using a set of cell type-independent features such as specific sequences recognized by the TFs and evolutionary conservation, as well as cell type-specific features derived from chromatin accessibility data. Our analysis suggested that the models learned from other TFs and/or cell lines performed almost as well as the model learned from the target TF in the cell type of interest. Interestingly, models based on multiple TFs performed better than single-TF models. Finally, we proposed a universal model, BPAC, which was generated using ChIP-Seq data from multiple TFs in various cell types.

Conclusion

Integrating chromatin accessibility information with sequence information improves prediction of TF binding. The prediction of TF binding is transferable across TFs and/or cell lines, suggesting that there is a set of universal “rules”. A computational tool was developed to predict TF binding sites based on the universal “rules”.

Fiannaca A, Rosa ML, Paglia LL, Rizzo R, Urso A. MiRNATIP: a SOM-based miRNA-target interactions predictor. BMC Bioinformatics 2016;17:321. [PMID: 28185545 PMCID: PMC5046196 DOI: 10.1186/s12859-016-1171-x] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Indexed: 11/17/2022]

Abstract

Background

MicroRNAs (miRNAs) are small non-coding RNA sequences with regulatory functions at the post-transcriptional level in several biological processes, such as cell disease progression and metastasis. MiRNAs interact with target messenger RNA (mRNA) genes by base pairing. Experimental identification of miRNA targets is one of the major challenges in cancer biology because miRNAs can act as tumour suppressors or oncogenes by targeting different types of targets. The use of machine learning methods for the prediction of the target genes is considered a valid support to investigate miRNA functions and to guide related wet-lab experiments. In this paper we propose the miRNA Target Interaction Predictor (miRNATIP) algorithm, a Self-Organizing Map (SOM) based method for miRNA target prediction. The SOM is trained with the seed region of the miRNA sequences, and the mRNA sequences are then projected onto the SOM lattice in order to find putative interactions with miRNAs. These interactions are then filtered by considering the remaining part of the miRNA sequences and estimating the free energy necessary for duplex stability.

Results

We tested the proposed method by predicting the miRNA target interactions of both the Homo sapiens and the Caenorhabditis elegans species; then, taking into account validated target (positive) and non-target (negative) interactions, we compared our results with other target predictors, namely miRanda, PITA, PicTar, mirSOM, TargetScan and DIANA-microT, in terms of the most used statistical measures. We demonstrate that our method produces the greatest number of predictions with respect to the other ones, exhibiting good results for both species and reaching, for example, the highest sensitivity of 31 % and 30.5 % for Homo sapiens and C. elegans, respectively. All the predicted interactions are freely available at the following URL: http://tblab.pa.icar.cnr.it/public/miRNATIP/.

Conclusions

Results state miRNATIP outperforms or is comparable to the other six state-of-the-art methods, in terms of validated target and non-target interactions, respectively.
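
The train-then-project workflow described in this abstract can be sketched with a minimal self-organizing map. This is a generic SOM under stated assumptions, not miRNATIP's implementation: the one-hot encoding, grid size, decay schedules, function names and toy seed strings are all hypothetical choices made for illustration, and the free-energy filtering step is not shown.

```python
import math
import random


def one_hot(seq, alphabet="ACGU"):
    """Encode an RNA string as a flat one-hot vector (4 values per base)."""
    vec = []
    for ch in seq:
        vec.extend(1.0 if ch == b else 0.0 for b in alphabet)
    return vec


def best_matching_unit(weights, vec):
    """Return the (row, col) of the node closest to vec (squared Euclidean)."""
    best, best_d = None, math.inf
    for pos, w in weights.items():
        d = sum((a - b) ** 2 for a, b in zip(w, vec))
        if d < best_d:
            best, best_d = pos, d
    return best


def train_som(data, rows=3, cols=3, epochs=50, lr0=0.5, sigma0=1.5, seed=0):
    """Train a small rectangular SOM with linearly decaying rate and radius."""
    rng = random.Random(seed)
    dim = len(data[0])
    weights = {(r, c): [rng.random() for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)
        sigma = max(sigma0 * (1.0 - frac), 0.3)
        for vec in data:
            br, bc = best_matching_unit(weights, vec)
            for (r, c), w in weights.items():
                # Gaussian neighbourhood around the winning node on the grid
                dist2 = (r - br) ** 2 + (c - bc) ** 2
                h = math.exp(-dist2 / (2.0 * sigma ** 2))
                for i in range(dim):
                    w[i] += lr * h * (vec[i] - w[i])
    return weights


# Toy 7-nt strings standing in for seed regions (illustrative only).
seeds = ["GAGGUAG", "GAGGUAG", "UAAGGCA"]
data = [one_hot(s) for s in seeds]
som = train_som(data)

# Projection: a new sequence window is assigned to its closest node.
query_node = best_matching_unit(som, one_hot("GAGGUAC"))
print(query_node)
```

Sequences that land on the same or neighbouring nodes are candidates for closer inspection; any downstream scoring would need to be added separately.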

Tapan S, Wang D. A Further Study on Mining DNA Motifs Using Fuzzy Self-Organizing Maps. IEEE Trans Neural Netw Learn Syst 2016;27:113-124. [PMID: 26068877 DOI: 10.1109/tnnls.2015.2435155] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 06/04/2023]

Harigua-Souiai E, Cortes-Ciriano I, Desdouits N, Malliavin TE, Guizani I, Nilges M, Blondel A, Bouvier G. Identification of binding sites and favorable ligand binding moieties by virtual screening and self-organizing map analysis. BMC Bioinformatics 2015;16:93. [PMID: 25888251 PMCID: PMC4381396 DOI: 10.1186/s12859-015-0518-z] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.6] [Received: 09/01/2014] [Accepted: 02/24/2015] [Indexed: 11/24/2022]

Beadell AV, Haag ES. Evolutionary Dynamics of GLD-1-mRNA complexes in Caenorhabditis nematodes. Genome Biol Evol 2014;7:314-35. [PMID: 25502909 PMCID: PMC4316625 DOI: 10.1093/gbe/evu272] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Accepted: 12/04/2014] [Indexed: 12/17/2022]

Tran NTL, Huang CH. A survey of motif finding Web tools for detecting binding site motifs in ChIP-Seq data. Biol Direct 2014;9:4. [PMID: 24555784 PMCID: PMC4022013 DOI: 10.1186/1745-6150-9-4] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.3] [Received: 11/07/2013] [Revised: 01/08/2014] [Accepted: 02/11/2014] [Indexed: 12/24/2022]

Wang D, Tapan S. A robust elicitation algorithm for discovering DNA motifs using fuzzy self-organizing maps. IEEE Trans Neural Netw Learn Syst 2013;24:1677-1688. [PMID: 24808603 DOI: 10.1109/tnnls.2013.2275733] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 06/03/2023]

Zamani N, Russell P, Lantz H, Hoeppner MP, Meadows JR, Vijay N, Mauceli E, di Palma F, Lindblad-Toh K, Jern P, Grabherr MG. Unsupervised genome-wide recognition of local relationship patterns. BMC Genomics 2013;14:347. [PMID: 23706020 PMCID: PMC3669000 DOI: 10.1186/1471-2164-14-347] [Citation(s) in RCA: 55] [Impact Index Per Article: 4.6] [Received: 10/16/2012] [Accepted: 05/08/2013] [Indexed: 12/05/2022]

Abstract

Background

Phenomena such as incomplete lineage sorting, horizontal gene transfer, gene duplication and subsequent sub- and neo-functionalisation can result in distinct local phylogenetic relationships that are discordant with species phylogeny. In order to assess the possible biological roles for these subdivisions, they must first be identified and characterised, preferably on a large scale and in an automated fashion.

Results

We developed Saguaro, a combination of a Hidden Markov Model (HMM) and a Self Organising Map (SOM), to characterise local phylogenetic relationships among aligned sequences using cacti, matrices of pair-wise distance measures. While the HMM determines the genomic boundaries from aligned sequences, the SOM hypothesises new cacti in an unsupervised and iterative fashion based on the regions that were modelled least well by existing cacti. After testing the software on simulated data, we demonstrate the utility of Saguaro by testing two different data sets: (i) 181 Dengue virus strains, and (ii) 5 primate genomes. Saguaro identifies regions under lineage-specific constraint for the first set, and genomic segments that we attribute to incomplete lineage sorting in the second dataset. Intriguingly for the primate data, Saguaro also classified an additional ~3% of the genome as most incompatible with the expected species phylogeny. A substantial fraction of these regions was found to overlap genes associated with both the innate and adaptive immune systems.

Conclusions

Saguaro detects distinct cacti describing local phylogenetic relationships without requiring any a priori hypotheses. We have successfully demonstrated Saguaro’s utility with two contrasting data sets, one containing many members with short sequences (Dengue viral strains: n = 181, genome size = 10,700 nt), and the other with few members but complex genomes (related primate species: n = 5, genome size = 3 Gb), suggesting that the software is applicable to a wide variety of experimental populations. Saguaro is written in C++, runs on the Linux operating system, and can be downloaded from http://saguarogw.sourceforge.net/.

Wang D, Tapan S. MISCORE: a new scoring function for characterizing DNA regulatory motifs in promoter sequences. BMC Syst Biol 2012;6 Suppl 2:S4. [PMID: 23282090 PMCID: PMC3521183 DOI: 10.1186/1752-0509-6-s2-s4] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.7] [Indexed: 12/31/2022]

Abstract

Background

Computational approaches for finding DNA regulatory motifs in promoter sequences are useful to biologists in terms of reducing the experimental costs and speeding up the discovery process of de novo binding sites. It is important for rule-based or clustering-based motif searching schemes to effectively and efficiently evaluate the similarity between a k-mer (a k-length subsequence) and a motif model, without assuming the independence of nucleotides in motif models or without employing computationally expensive Markov chain models to estimate the background probabilities of k-mers. Also, it is interesting and beneficial to use a priori knowledge in developing advanced searching tools.

Results

This paper presents a new scoring function, termed MISCORE, for functional motif characterization and evaluation. MISCORE is free from: (i) any assumption on model dependency; and (ii) the use of a Markov chain model for background modeling. It integrates the compositional complexity of motif instances into the function. Performance evaluations with comparison to the well-known Maximum a Posteriori (MAP) score and Information Content (IC) have shown that MISCORE has promising capabilities to separate and recognize functional DNA motifs and their instances from non-functional ones.

Conclusions

MISCORE is a fast computational tool for candidate motif characterization, evaluation and selection. It allows a priori known motif models to be embedded for computing motif-to-motif similarity, an advantage over the IC and MAP scores. In addition, MISCORE can automatically filter out some repetitive k-mers from a motif model because the function incorporates compositional complexity. Consequently, its motif signal modeling power and computational efficiency make it well suited to the development of computational motif discovery tools.
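
For context, the Information Content (IC) baseline that this abstract compares against can be computed as in the minimal sketch below. It shows the standard relative-entropy IC of a position frequency model built from aligned k-mers under a uniform background; it is not MISCORE itself, and the function names and toy k-mers are hypothetical.

```python
import math


def position_frequencies(kmers, alphabet="ACGT"):
    """Per-position base frequencies for equal-length aligned k-mers."""
    k = len(kmers[0])
    if any(len(s) != k for s in kmers):
        raise ValueError("all k-mers must have the same length")
    freqs = []
    for i in range(k):
        column = [s[i] for s in kmers]
        freqs.append({b: column.count(b) / len(column) for b in alphabet})
    return freqs


def information_content(freqs, background=0.25):
    """Sum over positions of p * log2(p / q), in bits (uniform background q)."""
    total = 0.0
    for col in freqs:
        for p in col.values():
            if p > 0:  # terms with p == 0 contribute nothing
                total += p * math.log2(p / background)
    return total


# Toy motif instances (illustrative only).
conserved = ["ACGT", "ACGT"]
uninformative = ["A", "C", "G", "T"]
print(information_content(position_frequencies(conserved)))
print(information_content(position_frequencies(uninformative)))
```

A fully conserved DNA position contributes 2 bits and a uniformly distributed position contributes 0, so higher totals indicate a more specific motif model.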

Chien TY, Lin CK, Lin CW, Weng YZ, Chen CY, Chang DTH. DBD2BS: connecting a DNA-binding protein with its binding sites. Nucleic Acids Res 2012;40:W173-9. [PMID: 22693214 PMCID: PMC3394304 DOI: 10.1093/nar/gks564] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 03/04/2012] [Revised: 05/07/2012] [Accepted: 05/19/2012] [Indexed: 11/25/2022]

Abstract

By binding to short and highly conserved DNA sequences in genomes, DNA-binding proteins initiate, enhance or repress biological processes. Accurately identifying such binding sites, often represented by position weight matrices (PWMs), is an important step in understanding the control mechanisms of cells. When given coordinates of a DNA-binding domain (DBD) bound with DNA, a potential function can be used to estimate the change of binding affinity after base substitutions, where the changes can be summarized as a PWM. This technique provides an effective alternative when the chromatin immunoprecipitation data are unavailable for PWM inference. To facilitate the procedure of predicting PWMs based on protein-DNA complexes or even structures of the unbound state, the web server, DBD2BS, is presented in this study. The DBD2BS uses an atom-level knowledge-based potential function to predict PWMs characterizing the sequences to which the query DBD structure can bind. For unbound queries, a list of 1066 DBD-DNA complexes (including 1813 protein chains) is compiled for use as templates for synthesizing bound structures. The DBD2BS provides users with an easy-to-use interface for visualizing the PWMs predicted based on different templates and the spatial relationships of the query protein, the DBDs and the DNAs. The DBD2BS is the first attempt to predict PWMs of DBDs from unbound structures rather than from bound ones. This approach increases the number of existing protein structures that can be exploited when analyzing protein-DNA interactions. In a recent study, the authors showed that the kernel adopted by the DBD2BS can generate PWMs consistent with those obtained from the experimental data. The use of DBD2BS to predict PWMs can be incorporated with sequence-based methods to discover binding sites in genome-wide studies. Available at: http://dbd2bs.csie.ntu.edu.tw/, http://dbd2bs.csbb.ntu.edu.tw/, and http://dbd2bs.ee.ncku.edu.tw.

Tan M, Yu D, Jin Y, Dou L, Li B, Wang Y, Yue J, Liang L. An information transmission model for transcription factor binding at regulatory DNA sites. Theor Biol Med Model 2012;9:19. [PMID: 22672438 PMCID: PMC3442977 DOI: 10.1186/1742-4682-9-19] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 04/08/2012] [Accepted: 05/17/2012] [Indexed: 11/10/2022]

Zambelli F, Pesole G, Pavesi G. Motif discovery and transcription factor binding sites before and after the next-generation sequencing era. Brief Bioinform 2012;14:225-37. [PMID: 22517426 PMCID: PMC3603212 DOI: 10.1093/bib/bbs016] [Citation(s) in RCA: 73] [Impact Index Per Article: 5.6] [Indexed: 02/07/2023]

Chen CY, Chien TY, Lin CK, Lin CW, Weng YZ, Chang DTH. Predicting target DNA sequences of DNA-binding proteins based on unbound structures. PLoS One 2012;7:e30446. [PMID: 22312425 PMCID: PMC3270014 DOI: 10.1371/journal.pone.0030446] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.5] [Received: 05/23/2011] [Accepted: 12/16/2011] [Indexed: 12/17/2022]

Abstract

DNA-binding proteins such as transcription factors use DNA-binding domains (DBDs) to bind to specific sequences in the genome to initiate many important biological functions. Accurate prediction of such target sequences, often represented by position weight matrices (PWMs), is an important step to understand many biological processes. Recent studies have shown that knowledge-based potential functions can be applied on protein-DNA co-crystallized structures to generate PWMs that are considerably consistent with experimental data. However, this success has not been extended to DNA-binding proteins lacking co-crystallized structures. This study aims at investigating the possibility of predicting the DNA sequences bound by DNA-binding proteins from the proteins' unbound structures (structures of the unbound state). Given an unbound query protein and a template complex, the proposed method first employs structure alignment to generate synthetic protein-DNA complexes for the query protein. Once a complex is available, an atomic-level knowledge-based potential function is employed to predict PWMs characterizing the sequences to which the query protein can bind. The evaluation of the proposed method is based on seven DNA-binding proteins, which have structures of both DNA-bound and unbound forms for prediction as well as annotated PWMs for validation. Since this work is the first attempt to predict target sequences of DNA-binding proteins from their unbound structures, three types of structural variations that presumably influence the prediction accuracy were examined and discussed. Based on the analyses conducted in this study, the conformational change of proteins upon binding DNA was shown to be the key factor. 
This study sheds light on the challenge of predicting the target DNA sequences of a protein lacking co-crystallized structures, which encourages more efforts on the structure alignment-based approaches in addition to docking- and homology modeling-based approaches for generating synthetic complexes.

Technau M, Knispel M, Roth S. Molecular mechanisms of EGF signaling-dependent regulation of pipe, a gene crucial for dorsoventral axis formation in Drosophila. Dev Genes Evol 2011;222:1-17. [PMID: 22198544 PMCID: PMC3291829 DOI: 10.1007/s00427-011-0384-2] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Received: 11/08/2011] [Accepted: 11/29/2011] [Indexed: 01/28/2023]

Thomas BJ, Rubio ED, Krumm N, Broin PO, Bomsztyk K, Welcsh P, Greally JM, Golden AA, Krumm A. Allele-specific transcriptional elongation regulates monoallelic expression of the IGF2BP1 gene. Epigenetics Chromatin 2011;4:14. [PMID: 21812971 PMCID: PMC3174113 DOI: 10.1186/1756-8935-4-14] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Received: 02/25/2011] [Accepted: 08/03/2011] [Indexed: 11/13/2022]

Abstract

Background

Random monoallelic expression contributes to phenotypic variation of cells and organisms. However, the epigenetic mechanisms by which individual alleles are randomly selected for expression are not known. Taking cues from chromatin signatures at imprinted gene loci such as the insulin-like growth factor 2 gene (IGF2), we evaluated the contribution of CTCF, a zinc finger protein required for parent-of-origin-specific expression of the IGF2 gene, as well as a role for allele-specific association with DNA methylation, histone modification and RNA polymerase II.

Results

Using array-based chromatin immunoprecipitation, we identified 293 genomic loci that are associated with both CTCF and histone H3 trimethylated at lysine 9 (H3K9me3). A comparison of their genomic positions with those of previously published monoallelically expressed genes revealed no significant overlap between allele-specifically expressed genes and colocalized CTCF/H3K9me3. To analyze the contributions of CTCF and H3K9me3 to gene regulation in more detail, we focused on the monoallelically expressed IGF2BP1 gene. In vitro binding assays using the CTCF target motif at the IGF2BP1 gene, as well as allele-specific analysis of cytosine methylation and CTCF binding, revealed that CTCF does not regulate mono- or biallelic IGF2BP1 expression. Surprisingly, we found that RNA polymerase II is detected on both the maternal and paternal alleles in B lymphoblasts that express IGF2BP1 primarily from one allele. Thus, allele-specific control of RNA polymerase II elongation regulates the allelic bias of IGF2BP1 gene expression.

Conclusions

Colocalization of CTCF and H3K9me3 does not represent a reliable chromatin signature indicative of monoallelic expression. Moreover, association of individual alleles with both active (H3K4me3) and silent (H3K27me3) chromatin modifications (allelic bivalent chromatin) or with RNA polymerase II also fails to identify monoallelically expressed gene loci. The selection of individual alleles for expression occurs in part during transcription elongation.

Heikkinen L, Kolehmainen M, Wong G. Prediction of microRNA targets in Caenorhabditis elegans using a self-organizing map. Bioinformatics 2011;27:1247-54. [PMID: 21422073 DOI: 10.1093/bioinformatics/btr144] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.1] [Indexed: 01/22/2023]

Lee NK, Wang D. SOMEA: self-organizing map based extraction algorithm for DNA motif identification with heterogeneous model. BMC Bioinformatics 2011;12 Suppl 1:S16. [PMID: 21342545 PMCID: PMC3044270 DOI: 10.1186/1471-2105-12-s1-s16] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.1] [Indexed: 01/18/2023]

Mahony S, Mazzoni EO, McCuine S, Young RA, Wichterle H, Gifford DK. Ligand-dependent dynamics of retinoic acid receptor binding during early neurogenesis. Genome Biol 2011;12:R2. [PMID: 21232103 PMCID: PMC3091300 DOI: 10.1186/gb-2011-12-1-r2] [Citation(s) in RCA: 102] [Impact Index Per Article: 7.3] [Received: 10/13/2010] [Revised: 12/10/2010] [Accepted: 01/13/2011] [Indexed: 01/31/2023]

Kuo D, Tan K, Zinman G, Ravasi T, Bar-Joseph Z, Ideker T. Evolutionary divergence in the fungal response to fluconazole revealed by soft clustering. Genome Biol 2010;11:R77. [PMID: 20653936 PMCID: PMC2926788 DOI: 10.1186/gb-2010-11-7-r77] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.0] [Received: 04/22/2010] [Revised: 07/09/2010] [Accepted: 07/23/2010] [Indexed: 11/25/2022]

Rhee JK, Joung JG, Chang JH, Fei Z, Zhang BT. Identification of cell cycle-related regulatory motifs using a kernel canonical correlation analysis. BMC Genomics 2009;10 Suppl 3:S29. [PMID: 19958493 PMCID: PMC2788382 DOI: 10.1186/1471-2164-10-s3-s29] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Indexed: 11/30/2022]

An integrated genome screen identifies the Wnt signaling pathway as a major target of WT1. Proc Natl Acad Sci U S A 2009;106:11154-9. [PMID: 19549856 DOI: 10.1073/pnas.0901591106] [Citation(s) in RCA: 61] [Impact Index Per Article: 3.8] [Indexed: 11/18/2022]

Murtola T, Bunker A, Vattulainen I, Deserno M, Karttunen M. Multiscale modeling of emergent materials: biological and soft matter. Phys Chem Chem Phys 2009;11:1869-92. [PMID: 19279999 DOI: 10.1039/b818051b] [Citation(s) in RCA: 188] [Impact Index Per Article: 11.8] [Indexed: 11/21/2022]

Lu Y, Mahony S, Benos PV, Rosenfeld R, Simon I, Breeden LL, Bar-Joseph Z. Combined analysis reveals a core set of cycling genes. Genome Biol 2008;8:R146. [PMID: 17650318 PMCID: PMC2323241 DOI: 10.1186/gb-2007-8-7-r146] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.1] [Received: 03/30/2007] [Revised: 06/19/2007] [Accepted: 07/24/2007] [Indexed: 01/28/2023]

Abstract

The simultaneous analysis of expression data from multiple species reveals a core set of conserved cycling genes that is much larger than previously thought.

Background

Global transcript levels throughout the cell cycle have been characterized using microarrays in several species. Early analysis of these experiments focused on individual species. More recently, a number of studies have concluded that a surprisingly small number of genes conserved in two or more species are periodically transcribed in these species. Combining and comparing data from multiple species is challenging because of noise in expression data, the different synchronization and scoring methods used, and the need to determine an accurate set of homologs.

Results

To solve these problems, we developed and applied a new algorithm to analyze expression data from multiple species simultaneously. Unlike previous studies, we find that more than 20% of cycling genes in budding yeast have cycling homologs in fission yeast and 5% to 7% of cycling genes in each of four species have cycling homologs in all other species. These conserved cycling genes display much stronger cell cycle characteristics in several complementary high throughput datasets.

Essentiality analysis for yeast and human genes confirms these findings. Motif analysis indicates conservation in the corresponding regulatory mechanisms. Gene Ontology analysis and analysis of the genes in the conserved sets shed light on the evolution of specific subfunctions within the cell cycle.

Conclusion

Our results indicate that the conservation in cyclic expression patterns is much greater than was previously thought. These genes are highly enriched for most cell cycle categories, and a large percentage of them are essential, supporting our claim that cross-species analysis can identify the core set of cycling genes.

Wei W, Yu XD. Comparative analysis of regulatory motif discovery tools for transcription factor binding sites. Genomics Proteomics Bioinformatics 2007;5:131-42. [PMID: 17893078 PMCID: PMC5054109 DOI: 10.1016/s1672-0229(07)60023-0] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.9] [Indexed: 01/19/2023]

Murtola T, Kupiainen M, Falck E, Vattulainen I. Conformational analysis of lipid molecules by self-organizing maps. J Chem Phys 2007;126:054707. [PMID: 17302498 DOI: 10.1063/1.2429066] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.9] [Indexed: 11/14/2022]

Abnizova I, Subhankulova T, Gilks WR. Recent computational approaches to understand gene regulation: mining gene regulation in silico. Curr Genomics 2007;8:79-91. [PMID: 18660846 PMCID: PMC2435357 DOI: 10.2174/138920207780368150] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Received: 11/30/2006] [Revised: 12/13/2006] [Accepted: 12/15/2006] [Indexed: 01/03/2023]

MacIsaac KD, Fraenkel E. Practical strategies for discovering regulatory DNA sequence motifs. PLoS Comput Biol 2006;2:e36. [PMID: 16683017 PMCID: PMC1447654 DOI: 10.1371/journal.pcbi.0020036] [Citation(s) in RCA: 97] [Impact Index Per Article: 5.1] [Indexed: 11/18/2022]

Mahony S, Benos PV, Smith TJ, Golden A. Self-organizing neural networks to support the discovery of DNA-binding motifs. Neural Netw 2006;19:950-62. [PMID: 16839740 DOI: 10.1016/j.neunet.2006.05.023] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Indexed: 10/24/2022]

Sandve GK, Drabløs F. A survey of motif discovery methods in an integrated framework. Biol Direct 2006;1:11. [PMID: 16600018 PMCID: PMC1479319 DOI: 10.1186/1745-6150-1-11] [Citation(s) in RCA: 71] [Impact Index Per Article: 3.7] [Received: 03/29/2006] [Accepted: 04/06/2006] [Indexed: 11/10/2022]