Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Phang TL, Neville MC, Rudolph M, Hunter L. Trajectory clustering: a non-parametric method for grouping gene expression time courses, with applications to mammary development. Pac Symp Biocomput 2003:351-62. [PMID: 12603041 PMCID: PMC2527819 DOI: 10.1142/9789812776303_0033] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.8] [Indexed: 11/18/2022]

Cited by Other Article(s)

Williams JR, Yang R, Clifford JL, Watson D, Campbell R, Getnet D, Kumar R, Hammamieh R, Jett M. Functional Heatmap: an automated and interactive pattern recognition tool to integrate time with multi-omics assays. BMC Bioinformatics 2019;20:81. [PMID: 30770734 PMCID: PMC6377781 DOI: 10.1186/s12859-019-2657-0] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 10/01/2018] [Accepted: 01/28/2019] [Indexed: 11/10/2022]

Affiliation(s)

Joshua R Williams: Advanced Biomedical Computational Science, Frederick National Laboratory for Cancer Research sponsored by the National Cancer Institute, Frederick, MD, 21702-5010, USA; Integrative Systems Biology Program, US Army Center for Environmental Health Research, Fort Detrick, Frederick, MD, 21702-5010, USA
Ruoting Yang: Advanced Biomedical Computational Science, Frederick National Laboratory for Cancer Research sponsored by the National Cancer Institute, Frederick, MD, 21702-5010, USA; Integrative Systems Biology Program, US Army Center for Environmental Health Research, Fort Detrick, Frederick, MD, 21702-5010, USA
John L Clifford: Integrative Systems Biology Program, US Army Center for Environmental Health Research, Fort Detrick, Frederick, MD, 21702-5010, USA
Daniel Watson: Advanced Biomedical Computational Science, Frederick National Laboratory for Cancer Research sponsored by the National Cancer Institute, Frederick, MD, 21702-5010, USA
Ross Campbell: Advanced Biomedical Computational Science, Frederick National Laboratory for Cancer Research sponsored by the National Cancer Institute, Frederick, MD, 21702-5010, USA; Integrative Systems Biology Program, US Army Center for Environmental Health Research, Fort Detrick, Frederick, MD, 21702-5010, USA
Derese Getnet: Integrative Systems Biology Program, US Army Center for Environmental Health Research, Fort Detrick, Frederick, MD, 21702-5010, USA
Raina Kumar: Advanced Biomedical Computational Science, Frederick National Laboratory for Cancer Research sponsored by the National Cancer Institute, Frederick, MD, 21702-5010, USA; Integrative Systems Biology Program, US Army Center for Environmental Health Research, Fort Detrick, Frederick, MD, 21702-5010, USA
Rasha Hammamieh: Integrative Systems Biology Program, US Army Center for Environmental Health Research, Fort Detrick, Frederick, MD, 21702-5010, USA
Marti Jett: Integrative Systems Biology Program, US Army Center for Environmental Health Research, Fort Detrick, Frederick, MD, 21702-5010, USA

Fidaner IB, Cankorur-Cetinkaya A, Dikicioglu D, Kirdar B, Cemgil AT, Oliver SG. CLUSTERnGO: a user-defined modelling platform for two-stage clustering of time-series data. Bioinformatics 2016;32:388-97. [PMID: 26411869 PMCID: PMC4734040 DOI: 10.1093/bioinformatics/btv532] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 03/03/2015] [Accepted: 09/03/2015] [Indexed: 11/13/2022]

Abstract

Motivation: Simple bioinformatic tools are frequently used to analyse time-series datasets regardless of their ability to deal with transient phenomena, limiting the meaningful information that may be extracted from them. This situation requires the development and exploitation of tailor-made, easy-to-use and flexible tools designed specifically for the analysis of time-series datasets.

Results: We present a novel statistical application called CLUSTERnGO, which uses a model-based clustering algorithm that fulfils this need. This algorithm involves two components of operation. Component 1 constructs a Bayesian non-parametric model (Infinite Mixture of Piecewise Linear Sequences) and Component 2 applies a novel clustering methodology (Two-Stage Clustering). The software can also assign biological meaning to the identified clusters using an appropriate ontology. It applies multiple hypothesis testing to report the significance of these enrichments. The algorithm has a four-phase pipeline. The application can be executed using either command-line tools or a user-friendly Graphical User Interface. The latter has been developed to address the needs of both specialist and non-specialist users. We use three diverse test cases to demonstrate the flexibility of the proposed strategy. In all cases, CLUSTERnGO not only outperformed existing algorithms in assigning unique GO term enrichments to the identified clusters, but also revealed novel insights regarding the biological systems examined, which were not uncovered in the original publications.

Availability and implementation: The C++ and QT source codes, the GUI applications for Windows, OS X and Linux operating systems and user manual are freely available for download under the GNU GPL v3 license at http://www.cmpe.boun.edu.tr/content/CnG.

Contact: sgo24@cam.ac.uk

Supplementary information: Supplementary data are available at Bioinformatics online.

Scheff JD, Almon RR, DuBois DC, Jusko WJ, Androulakis IP. A new symbolic representation for the identification of informative genes in replicated microarray experiments. OMICS-A JOURNAL OF INTEGRATIVE BIOLOGY 2010;14:239-48. [PMID: 20455749 DOI: 10.1089/omi.2010.0005] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 11/12/2022]

Importance of replication in analyzing time-series gene expression data: corticosteroid dynamics and circadian patterns in rat liver. BMC Bioinformatics 2010;11:279. [PMID: 20500897 PMCID: PMC2889936 DOI: 10.1186/1471-2105-11-279] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Received: 10/08/2009] [Accepted: 05/26/2010] [Indexed: 11/14/2022]

Abstract

Background

Microarray technology is a powerful and widely accepted experimental technique in molecular biology that allows studying genome wide transcriptional responses. However, experimental data usually contain potential sources of uncertainty and thus many experiments are now designed with repeated measurements to better assess such inherent variability. Many computational methods have been proposed to account for the variability in replicates. As yet, there is no model to output expression profiles accounting for replicate information so that a variety of computational models that take the expression profiles as the input data can explore this information without any modification.

Results

We propose a methodology which integrates replicate variability into expression profiles, to generate so-called 'true' expression profiles. The study addresses two issues: (i) develop a statistical model that can estimate 'true' expression profiles which are more robust than the average profile, and (ii) extend our previous micro-clustering which was designed specifically for clustering time-series expression data. The model utilizes a previously proposed error model and the concept of 'relative difference'. The clustering effectiveness is demonstrated through synthetic data where several methods are compared. We subsequently analyze in vivo rat data to elucidate circadian transcriptional dynamics as well as liver-specific corticosteroid induced changes in gene expression.

Conclusions

We have proposed a model which integrates the error information from repeated measurements into the expression profiles. Through numerous synthetic and real time-series data, we demonstrated the ability of the approach to improve the clustering performance and assist in the identification and selection of informative expression motifs.

Feng W, Leach SM, Tipney H, Phang T, Geraci M, Spritz RA, Hunter LE, Williams T. Spatial and temporal analysis of gene expression during growth and fusion of the mouse facial prominences. PLoS One 2009;4:e8066. [PMID: 20016822 PMCID: PMC2789411 DOI: 10.1371/journal.pone.0008066] [Citation(s) in RCA: 50] [Impact Index Per Article: 3.3] [Received: 08/19/2009] [Accepted: 10/25/2009] [Indexed: 11/19/2022]

Abstract

Orofacial malformations resulting from genetic and/or environmental causes are frequent human birth defects yet their etiology is often unclear because of insufficient information concerning the molecular, cellular and morphogenetic processes responsible for normal facial development. We have, therefore, derived a comprehensive expression dataset for mouse orofacial development, interrogating three distinct regions – the mandibular, maxillary and frontonasal prominences. To capture the dynamic changes in the transcriptome during face formation, we sampled five time points between E10.5–E12.5, spanning the developmental period from establishment of the prominences to their fusion to form the mature facial platform. Seven independent biological replicates were used for each sample ensuring robustness and quality of the dataset. Here, we provide a general overview of the dataset, characterizing aspects of gene expression changes at both the spatial and temporal level. Considerable coordinate regulation occurs across the three prominences during this period of facial growth and morphogenesis, with a switch from expression of genes involved in cell proliferation to those associated with differentiation. An accompanying shift in the expression of polycomb and trithorax genes presumably maintains appropriate patterns of gene expression in precursor or differentiated cells, respectively. Superimposed on the many coordinated changes are prominence-specific differences in the expression of genes encoding transcription factors, extracellular matrix components, and signaling molecules. Thus, the elaboration of each prominence will be driven by particular combinations of transcription factors coupled with specific cell:cell and cell:matrix interactions. The dataset also reveals several prominence-specific genes not previously associated with orofacial development, a subset of which we externally validate. Several of these latter genes are components of bidirectional transcription units that likely share cis-acting sequences with well-characterized genes. Overall, our studies provide a valuable resource for probing orofacial development and a robust dataset for bioinformatic analysis of spatial and temporal gene expression changes during embryogenesis.

Hennetin J, Pehkonen P, Bellis M. Construction and use of gene expression covariation matrix. BMC Bioinformatics 2009;10:214. [PMID: 19594909 PMCID: PMC2720390 DOI: 10.1186/1471-2105-10-214] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Received: 10/21/2008] [Accepted: 07/13/2009] [Indexed: 11/10/2022]

Abstract

BACKGROUND

One essential step in the massive analysis of transcriptomic profiles is the calculation of the correlation coefficient, a value used to select pairs of genes with similar or inverse transcriptional profiles across a large fraction of the biological conditions examined. Until now, the choice between the two available methods for calculating the coefficient has been dictated mainly by technological considerations. Specifically, in analyses based on double-channel techniques, researchers have been required to use covariation correlation, i.e. the correlation between gene expression changes measured between several pairs of biological conditions, expressed for example as fold-change. In contrast, in analyses of single-channel techniques scientists have been restricted to the use of coexpression correlation, i.e. correlation between gene expression levels. To our knowledge, nobody has ever examined the possible benefits of using covariation instead of coexpression in massive analyses of single channel microarray results.

RESULTS

We describe here how single-channel techniques can be treated like double-channel techniques and used to generate both gene expression changes and covariation measures. We also present a new method that allows the calculation of both positive and negative correlation coefficients between genes. First, we perform systematic comparisons between two given biological conditions and classify, for each comparison, genes as increased (I), decreased (D), or not changed (N). As a result, the original series of n gene expression level measures assigned to each gene is replaced by an ordered string of n(n-1)/2 symbols, e.g. IDDNNIDID....DNNNNNNID, with the length of the string corresponding to the number of comparisons. In a second step, positive and negative covariation matrices (CVM) are constructed by calculating statistically significant positive or negative correlation scores for any pair of genes by comparing their strings of symbols.
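The symbol-string step described above can be sketched roughly as follows. This is only an illustration, not the authors' implementation: the function name `symbol_string`, the fixed `threshold` on the difference in expression level, and the ordering of the pairwise comparisons are all assumptions made for the example, whereas the abstract only states that each comparison classifies a gene as increased (I), decreased (D), or not changed (N).

```python
from itertools import combinations


def symbol_string(levels, threshold=1.0):
    """Encode one gene's expression levels as a string of I/D/N symbols.

    `levels` holds n expression measures (one per biological condition).
    Every unordered pair of conditions (i < j) yields one symbol, so the
    result has n(n-1)/2 characters. The fixed `threshold` is a hypothetical
    stand-in for whatever per-comparison test decides that a change is real.
    """
    symbols = []
    for i, j in combinations(range(len(levels)), 2):
        diff = levels[j] - levels[i]
        if diff > threshold:
            symbols.append("I")  # increased from condition i to condition j
        elif diff < -threshold:
            symbols.append("D")  # decreased from condition i to condition j
        else:
            symbols.append("N")  # no change beyond the threshold
    return "".join(symbols)


if __name__ == "__main__":
    # Four conditions give 4 * 3 / 2 = 6 pairwise comparisons.
    print(symbol_string([1.0, 3.0, 2.5, 0.5], threshold=1.0))
```

Two genes encoded this way can then be compared position by position, which is the input the abstract describes for building the positive and negative covariation matrices.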

CONCLUSION

This new method, applied to four different large data sets, has allowed us to construct distinct covariation matrices with similar properties. We have also developed a technique to translate these covariation networks into graphical 3D representations and found that the local assignation of the probe sets was conserved across the four chip set models used which encompass three different species (humans, mice, and rats). The application of adapted clustering methods succeeded in delineating six conserved functional regions that we characterized using Gene Ontology information.

Liu T, Lin N, Shi N, Zhang B. Information criterion-based clustering with order-restricted candidate profiles in short time-course microarray experiments. BMC Bioinformatics 2009;10:146. [PMID: 19445669 PMCID: PMC2696449 DOI: 10.1186/1471-2105-10-146] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.7] [Received: 02/04/2009] [Accepted: 05/15/2009] [Indexed: 11/25/2022]

Abstract

Background

Time-course microarray experiments produce vector gene expression profiles across a series of time points. Clustering genes based on these profiles is important in discovering functionally related and co-regulated genes. Earlier clustering algorithms do not take advantage of the ordering in a time-course study, explicit use of which should allow more sensitive detection of genes that display a consistent pattern over time. Peddada et al. [1] proposed a clustering algorithm that can incorporate the temporal ordering using order-restricted statistical inference. This algorithm is, however, very time-consuming and hence inapplicable to most microarray experiments that contain a large number of genes. Its computational burden also makes it difficult to assess the clustering reliability, which is a very important measure when clustering noisy microarray data.

Results

We propose a computationally efficient information criterion-based clustering algorithm, called ORICC, that also takes account of the ordering in time-course microarray experiments by embedding the order-restricted inference into a model selection framework. Genes are assigned to the profile they best match, as determined by a newly proposed information criterion for order-restricted inference. In addition, we developed a bootstrap procedure to assess ORICC's clustering reliability for every gene. Simulation studies show that the ORICC method is robust, always gives better clustering accuracy than Peddada's method, and reduces computation time by a factor of hundreds. Under some scenarios, its accuracy is also better than that of other existing clustering methods for short time-course microarray data, such as STEM [2] and the method of Wang et al. [3]. It is also computationally much faster than the method of Wang et al. [3].

Conclusion

Our ORICC algorithm, which takes advantage of the temporal ordering in time-course microarray experiments, provides good clustering accuracy and is also much faster than Peddada's method. Moreover, the clustering reliability for each gene can be assessed, which is not possible with Peddada's method. In a real data example, the ORICC algorithm identifies new and interesting genes that previous analyses failed to reveal.

Clustering of gene expression data based on shape similarity. EURASIP JOURNAL ON BIOINFORMATICS & SYSTEMS BIOLOGY 2009:195712. [PMID: 19404484 PMCID: PMC3171421 DOI: 10.1155/2009/195712] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.9] [Received: 08/04/2008] [Revised: 01/08/2009] [Accepted: 01/27/2009] [Indexed: 11/18/2022]

Liu LYD, Chen CY, Chen MJM, Tsai MS, Lee CHS, Phang TL, Chang LY, Kuo WH, Hwa HL, Lien HC, Jung SM, Lin YS, Chang KJ, Hsieh FJ. Statistical identification of gene association by CID in application of constructing ER regulatory network. BMC Bioinformatics 2009;10:85. [PMID: 19292896 PMCID: PMC2679734 DOI: 10.1186/1471-2105-10-85] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Received: 09/01/2008] [Accepted: 03/17/2009] [Indexed: 02/01/2023]

Abstract

Background

A variety of high-throughput techniques are now available for constructing comprehensive gene regulatory networks in systems biology. In this study, we report a new statistical approach for facilitating in silico inference of regulatory network structure. The new measure of association, coefficient of intrinsic dependence (CID), is model-free and can be applied to both continuous and categorical distributions. When given two variables X and Y, CID answers whether Y is dependent on X by examining the conditional distribution of Y given X. In this paper, we apply CID to analyze the regulatory relationships between transcription factors (TFs) (X) and their downstream genes (Y) based on clinical data. More specifically, we use estrogen receptor α (ERα) as the variable X, and the analyses are based on 48 clinical breast cancer gene expression arrays (48A).

Results

The analytical utility of CID was evaluated in comparison with four commonly used statistical methods: Galton-Pearson's correlation coefficient (GPCC), Student's t-test (STT), coefficient of determination (CoD), and mutual information (MI). Compared to GPCC, CoD, and MI, CID is better able to discover regulatory associations where the distribution of the mRNA expression levels of X and Y does not fit linear models. On the other hand, when CID is used to measure the association of a continuous variable (Y) against a discrete variable (X), it performs similarly to STT and appears to outperform CoD and MI. In addition, this study established a two-layer transcriptional regulatory network to exemplify the usage of CID, in combination with GPCC, in deciphering gene networks based on gene expression profiles from patient arrays.

Conclusion

CID is shown to provide useful information for identifying associations between genes and transcription factors of interest in patient arrays. When coupled with the relationships detected by GPCC, the associations predicted by CID are applicable to the construction of transcriptional regulatory networks. This study shows how information from different data sources and learning algorithms can be integrated to investigate whether relevant regulatory mechanisms identified in cell models can also be partially re-identified in clinical samples of breast cancers.

Availability

The implementation of CID in R code can be freely downloaded from .

Difference-based clustering of short time-course microarray data with replicates. BMC Bioinformatics 2007;8:253. [PMID: 17629922 PMCID: PMC1952071 DOI: 10.1186/1471-2105-8-253] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.7] [Received: 04/03/2007] [Accepted: 07/14/2007] [Indexed: 11/10/2022]

Clarkson RWE, Wayland MT, Lee J, Freeman T, Watson CJ. Gene expression profiling of mammary gland development reveals putative roles for death receptors and immune mediators in post-lactational regression. Breast Cancer Res 2003;6:R92-109. [PMID: 14979921 PMCID: PMC400653 DOI: 10.1186/bcr754] [Citation(s) in RCA: 257] [Impact Index Per Article: 12.2] [Received: 09/22/2003] [Revised: 11/15/2003] [Accepted: 11/21/2003] [Indexed: 01/22/2023]

Abstract

INTRODUCTION

In order to gain a better understanding of the molecular processes that underlie apoptosis and tissue regression in mammary gland, we undertook a large-scale analysis of transcriptional changes during the mouse mammary pregnancy cycle, with emphasis on the transition from lactation to involution.

METHOD

Affymetrix microarrays, representing 8618 genes, were used to compare mammary tissue from 12 time points (one virgin, three gestation, three lactation and five involution stages). Six animals were used for each time point. Common patterns of gene expression across all time points were identified and related to biological function.

RESULTS

The majority of significantly induced genes in involution were also differentially regulated at earlier stages in the pregnancy cycle. This included a marked increase in inflammatory mediators during involution and at parturition, which correlated with leukaemia inhibitory factor-Stat3 (signal transducer and activator of transcription-3) signalling. Before involution, expected increases in cell proliferation, biosynthesis and metabolism-related genes were observed. During involution, the first 24 hours after weaning were characterized by a transient increase in expression of components of the death receptor pathways of apoptosis, inflammatory cytokines and acute phase response genes. After 24 hours, regulators of intrinsic apoptosis were induced in conjunction with markers of phagocyte activity, matrix proteases, suppressors of neutrophils and soluble components of specific and innate immunity.

CONCLUSION

We provide a resource of mouse mammary gene expression data for download or online analysis. Here we highlight the sequential induction of distinct apoptosis pathways in involution and the stimulation of immunomodulatory signals, which probably suppress the potentially damaging effects of a cellular inflammatory response while maintaining an appropriate antimicrobial and phagocytic environment.

Rudolph MC, McManaman JL, Hunter L, Phang T, Neville MC. Functional development of the mammary gland: use of expression profiling and trajectory clustering to reveal changes in gene expression during pregnancy, lactation, and involution. J Mammary Gland Biol Neoplasia 2003;8:287-307. [PMID: 14973374 DOI: 10.1023/b:jomg.0000010030.73983.57] [Citation(s) in RCA: 165] [Impact Index Per Article: 7.9] [Indexed: 11/12/2022]

Clarkson RWE, Watson CJ. Microarray analysis of the involution switch. J Mammary Gland Biol Neoplasia 2003;8:309-19. [PMID: 14973375 DOI: 10.1023/b:jomg.0000010031.53310.92] [Citation(s) in RCA: 52] [Impact Index Per Article: 2.5] [Indexed: 11/12/2022]