Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Hafemeister C, Costa IG, Schönhuth A, Schliep A. Classifying short gene expression time-courses with Bayesian estimation of piecewise constant functions. Bioinformatics 2011;27:946-52. [PMID: 21266444 DOI: 10.1093/bioinformatics/btr037] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.9] [Indexed: 11/12/2022]

Cited by Other Article(s)

Straube J, Huang BE, Cao KAL. DynOmics to identify delays and co-expression patterns across time course experiments. Sci Rep 2017;7:40131. [PMID: 28065937 PMCID: PMC5220332 DOI: 10.1038/srep40131] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Received: 07/19/2016] [Accepted: 12/02/2016] [Indexed: 12/16/2022]

Bulashevska S, Priest C, Speicher D, Zimmermann J, Westermann F, Cremers AB. SwitchFinder - a novel method and query facility for discovering dynamic gene expression patterns. BMC Bioinformatics 2016;17:532. [PMID: 27978814 PMCID: PMC5160026 DOI: 10.1186/s12859-016-1391-0] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 06/11/2016] [Accepted: 11/29/2016] [Indexed: 12/20/2022]

Abstract

Background

Biological systems and processes are highly dynamic. To gain insight into their functioning, time-resolved measurements are necessary. Time-resolved gene expression data capture the temporal behaviour of genes genome-wide under various biological conditions: in response to stimuli, during the cell cycle, differentiation or developmental programs. Dissecting dynamic gene expression patterns from these data may shed light on the functioning of the gene regulatory system. The present approach facilitates this discovery. The fundamental idea behind it is the following: there are change-points (switches) in gene behaviour that separate intervals of increasing and decreasing activity, and these intervals may have different durations. Elucidating the switch-points is important for identifying biologically meaningful features and patterns of gene dynamics.

Results

We developed a statistical method, called SwitchFinder, for the analysis of time-series data, in particular gene expression data, based on a change-point model. Fitting the model to the gene expression time-courses indicates switch-points between increasing and decreasing activity of each gene. Two variants of the model, one based on a linear function and one on a generalized logistic function, were used to capture the data between the switch-points. Model inference was carried out in a Bayesian framework using Gibbs sampling, a Markov chain Monte Carlo (MCMC) technique. Furthermore, we introduced features of the switch-points (growth, decay, spike and cleft) that reflect important dynamic aspects. With these, the gene expression profiles are represented qualitatively, as sets of dynamic features at their onset times. We developed a Web application of the approach that allows users to query the gene expression time-courses and to deduce groups of genes with common dynamic patterns.

SwitchFinder was applied to our original data: gene expression time-series measured in a neuroblastoma cell line upon treatment with all-trans retinoic acid (ATRA). The analysis revealed eight patterns of gene expression response to ATRA, indicating the induction of the BMP, WNT, Notch, FGF and NTRK-receptor signaling pathways involved in cell differentiation, as well as the repression of cell-cycle related genes.
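The change-point idea in this abstract can be illustrated with a much simpler stand-in. The sketch below is not SwitchFinder: it replaces the Bayesian model and Gibbs sampling with a brute-force least-squares search for a single switch-point between two linear segments. The function names `fit_line_sse` and `find_switch_point` and the `min_points` parameter are hypothetical, chosen only for this illustration.

```python
import numpy as np


def fit_line_sse(t, y):
    """Fit a least-squares line to (t, y); return (slope, sum of squared errors)."""
    design = np.column_stack([t, np.ones_like(t)])
    coef, _, _, _ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    return float(coef[0]), float(resid @ resid)


def find_switch_point(t, y, min_points=2):
    """Search for one switch-point by exhaustive least squares.

    Assumption: exactly one switch and linear behaviour on both sides.
    Returns (k, slope_left, slope_right), where the left segment is
    t[:k] and the right segment is t[k:].
    """
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    if len(t) != len(y) or len(t) < 2 * min_points:
        raise ValueError("need matching arrays with at least 2 * min_points samples")
    best = None
    for k in range(min_points, len(t) - min_points + 1):
        slope_left, sse_left = fit_line_sse(t[:k], y[:k])
        slope_right, sse_right = fit_line_sse(t[k:], y[k:])
        total = sse_left + sse_right
        if best is None or total < best[0]:
            best = (total, k, slope_left, slope_right)
    return best[1], best[2], best[3]


if __name__ == "__main__":
    # Toy profile: activity rises, then falls.
    k, left, right = find_switch_point(range(8), [0, 1, 2, 3, 4, 3, 2, 1])
    print("split index:", k, "left slope:", left, "right slope:", right)
```

A positive left slope followed by a negative right slope marks a switch from increasing to decreasing activity. A full treatment would also need several switch-points per gene and uncertainty estimates, which is what the Bayesian inference in the abstract provides.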

Conclusions

SwitchFinder is a novel approach to the analysis of biological time-series data that supports inference and interactive exploration of inherent dynamic patterns, thereby facilitating the biological discovery process. SwitchFinder is freely available at https://newbioinformatics.eu/switchfinder.

Electronic supplementary material

The online version of this article (doi:10.1186/s12859-016-1391-0) contains supplementary material, which is available to authorized users.

Natural Cubic Spline Regression Modeling Followed by Dynamic Network Reconstruction for the Identification of Radiation-Sensitivity Gene Association Networks from Time-Course Transcriptome Data. PLoS One 2016;11:e0160791. [PMID: 27505168 PMCID: PMC4978405 DOI: 10.1371/journal.pone.0160791] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.9] [Received: 04/26/2016] [Accepted: 06/14/2016] [Indexed: 11/23/2022]

Abstract

Gene expression time-course experiments make it possible to study the dynamics of transcriptomic changes in cells exposed to different stimuli. However, most approaches for the reconstruction of gene association networks (GANs) do not propose prior-selection approaches tailored to time-course transcriptome data. Here, we present a workflow for the identification of GANs from time-course data using prior selection of genes differentially expressed over time, identified by natural cubic spline regression modeling (NCSRM). The workflow comprises three major steps: 1) the identification of differentially expressed genes from time-course expression data by employing NCSRM, 2) the use of regularized dynamic partial correlation as implemented in GeneNet to infer GANs from differentially expressed genes and 3) the identification and functional characterization of the key nodes in the reconstructed networks. The approach was applied to a time-resolved transcriptome data set of radiation-perturbed cell culture models of non-tumor cells with normal and increased radiation sensitivity. NCSRM detected significantly more genes than another commonly used method for time-course transcriptome analysis (BETR). While most genes detected with BETR were also detected with NCSRM, the false-detection rate of NCSRM was low (3%). The GANs reconstructed from genes detected with NCSRM showed a better overlap with the interactome network Reactome than GANs derived from BETR-detected genes. After exposure to 1 Gy, the normal sensitive cells showed only a sparse response compared to cells with increased sensitivity, which exhibited a strong response mainly of genes related to the senescence pathway. After exposure to 10 Gy, the response of the normal sensitive cells was mainly associated with senescence and that of cells with increased sensitivity with apoptosis. We discuss these results in a clinical context and underline the impact of senescence-associated pathways in the acute radiation response of normal cells. The workflow of this novel approach is implemented in the open-source Bioconductor R-package splineTimeR.

Blomstedt P, Dutta R, Seth S, Brazma A, Kaski S. Modelling-based experiment retrieval: a case study with gene expression clustering. Bioinformatics 2016;32:1388-94. [PMID: 26740526 DOI: 10.1093/bioinformatics/btv762] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Received: 05/15/2015] [Accepted: 12/28/2015] [Indexed: 12/18/2022]

Abstract

MOTIVATION

Public and private repositories of experimental data are growing to sizes that require dedicated methods for finding relevant data. To improve on the state of the art of keyword searches from annotations, methods for content-based retrieval have been proposed. In the context of gene expression experiments, most methods retrieve gene expression profiles, requiring each experiment to be expressed as a single profile, typically of case versus control. A more general, recently suggested alternative is to retrieve experiments whose models are good for modelling the query dataset. However, for very noisy and high-dimensional query data, this retrieval criterion turns out to be very noisy as well.

RESULTS

We propose doing retrieval using a denoised model of the query dataset, instead of the original noisy dataset itself. To this end, we introduce a general probabilistic framework, where each experiment is modelled separately and the retrieval is done by finding related models. For retrieval of gene expression experiments, we use a probabilistic model called product partition model, which induces a clustering of genes that show similar expression patterns across a number of samples. The suggested metric for retrieval using clusterings is the normalized information distance. Empirical results finally suggest that inference for the full probabilistic model can be approximated with good performance using computationally faster heuristic clustering approaches (e.g. k-means). The method is highly scalable and straightforward to apply to construct a general-purpose gene expression experiment retrieval method.

AVAILABILITY AND IMPLEMENTATION

The method can be implemented using standard clustering algorithms and normalized information distance, available in many statistical software packages.
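As a rough illustration of comparing two clusterings by normalized information distance, the sketch below uses one common definition for clusterings, NID = 1 - I(U;V) / max(H(U), H(V)), where I is mutual information and H is entropy. Whether the paper uses exactly this normalization is an assumption, and the function names are hypothetical. Cluster labels could come from any standard algorithm such as k-means.

```python
from collections import Counter
from math import log


def entropy(labels):
    """Shannon entropy (in nats) of a clustering given as a list of labels."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def mutual_information(a, b):
    """Mutual information (in nats) between two clusterings of the same items."""
    n = len(a)
    count_a = Counter(a)
    count_b = Counter(b)
    joint = Counter(zip(a, b))
    return sum(
        (c / n) * log(c * n / (count_a[x] * count_b[y]))
        for (x, y), c in joint.items()
    )


def normalized_information_distance(a, b):
    """NID = 1 - I(a;b) / max(H(a), H(b)); assumed definition, see lead-in."""
    if len(a) != len(b) or not a:
        raise ValueError("clusterings must be non-empty and cover the same items")
    h_max = max(entropy(a), entropy(b))
    if h_max == 0:
        # Both clusterings put every item in a single cluster: treat as identical.
        return 0.0
    return 1.0 - mutual_information(a, b) / h_max


if __name__ == "__main__":
    genes_in_experiment_1 = [0, 0, 1, 1]
    genes_in_experiment_2 = [1, 1, 0, 0]  # same partition, labels swapped
    print(normalized_information_distance(genes_in_experiment_1, genes_in_experiment_2))
```

The distance depends only on how items are grouped, not on the label values, so relabelled but identical partitions are at distance zero and statistically independent partitions are at distance one.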

CONTACT

paul.blomstedt@aalto.fi or samuel.kaski@aalto.fi

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

Identifying genes relevant to specific biological conditions in time course microarray experiments. PLoS One 2013;8:e76561. [PMID: 24146889 PMCID: PMC3795718 DOI: 10.1371/journal.pone.0076561] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 06/14/2013] [Accepted: 08/28/2013] [Indexed: 11/19/2022]

Abstract

Microarrays have been useful in understanding various biological processes by allowing the simultaneous study of the expression of thousands of genes. However, the analysis of microarray data is a challenging task. One of the key problems in microarray analysis is the classification of unknown expression profiles. Specifically, the often large number of non-informative genes on the microarray adversely affects the performance and efficiency of classification algorithms. Furthermore, the skewed ratio of sample to variable poses a risk of overfitting. Thus, in this context, feature selection methods become crucial to select relevant genes and, hence, improve classification accuracy. In this study, we investigated feature selection methods based on gene expression profiles and protein interactions. We found that in our setup, the addition of protein interaction information did not contribute to any significant improvement of the classification results. Furthermore, we developed a novel feature selection method that relies exclusively on observed gene expression changes in microarray experiments, which we call "relative Signal-to-Noise ratio" (rSNR). More precisely, the rSNR ranks genes based on their specificity to an experimental condition, by comparing intrinsic variation, i.e. variation in gene expression within an experimental condition, with extrinsic variation, i.e. variation in gene expression across experimental conditions. Genes with low variation within an experimental condition of interest and high variation across experimental conditions are ranked higher, and help in improving classification accuracy. We compared different feature selection methods on two time-series microarray datasets and one static microarray dataset. We found that the rSNR performed generally better than the other methods.
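The ranking idea described in this abstract (low variation within the condition of interest, high variation across conditions) can be sketched as follows. The abstract does not give the rSNR formula, so the ratio below is an assumed stand-in and not the published definition; the names `variation_ratio` and `rank_genes` and the `eps` parameter are hypothetical.

```python
from statistics import mean, pstdev


def variation_ratio(values_by_condition, condition, eps=1e-9):
    """Across-condition spread divided by within-condition spread.

    Assumed stand-in score, NOT the published rSNR formula:
    - within: standard deviation of replicates in the condition of interest
    - across: standard deviation of the per-condition means
    """
    within = pstdev(values_by_condition[condition])
    across = pstdev([mean(v) for v in values_by_condition.values()])
    return across / (within + eps)  # eps avoids division by zero


def rank_genes(expression, condition):
    """Return gene names sorted from most to least condition-specific."""
    scores = {
        gene: variation_ratio(values, condition)
        for gene, values in expression.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


if __name__ == "__main__":
    # Toy data: gene -> condition -> replicate expression values.
    expression = {
        "geneA": {"c1": [5.0, 5.1, 4.9], "c2": [1.0, 1.2, 0.8]},
        "geneB": {"c1": [3.0, 5.0, 1.0], "c2": [3.1, 2.9, 3.0]},
    }
    print(rank_genes(expression, "c1"))
```

In the toy data, geneA is stable within condition c1 and differs strongly between conditions, so it ranks above geneB, which is noisy within c1 and has nearly equal means across conditions.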

Wang K, Ng SK, McLachlan GJ. Clustering of time-course gene expression profiles using normal mixture models with autoregressive random effects. BMC Bioinformatics 2012;13:300. [PMID: 23151154 PMCID: PMC3574839 DOI: 10.1186/1471-2105-13-300] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.4] [Received: 01/30/2012] [Accepted: 11/07/2012] [Indexed: 11/26/2022]

Abstract

Background

Time-course gene expression data such as yeast cell cycle data may be periodically expressed. To cluster such data, currently used Fourier series approximations of periodic gene expressions have been found inadequate to model the complexity of the time-course data, partly because they ignore the dependence between the expression measurements over time and the correlation among gene expression profiles. We further investigate the advantages and limitations of available models in the literature and propose a new mixture model with autoregressive random effects of the first order for the clustering of time-course gene-expression profiles. Some simulations and real examples are given to demonstrate the usefulness of the proposed models.

Results

We illustrate the applicability of our new model using synthetic and real time-course datasets. We show that our model outperforms existing models to provide more reliable and robust clustering of time-course data. Our model provides superior results when genetic profiles are correlated. It also gives comparable results when the correlation between the gene profiles is weak. In the applications to real time-course data, relevant clusters of coregulated genes are obtained, which are supported by gene-function annotation databases.

Conclusions

Our new model under our extension of the EMMIX-WIRE procedure is more reliable and robust for clustering time-course data because it adopts a random effects model that allows for the correlation among observations at different time points. It postulates gene-specific random effects with an autocorrelation variance structure that models coregulation within the clusters. The developed R package is flexible in its specification of the random effects through user-input parameters that enable improved modelling and consequent clustering of time-course data.
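To make the first-order autoregressive correlation structure concrete, the sketch below builds the covariance matrix of a stationary AR(1) process over equally spaced time points, Cov(e_s, e_t) = sigma^2 * rho^|s-t| / (1 - rho^2). This is the textbook form; how the paper parameterizes its random effects is not stated in the abstract, so the function name `ar1_covariance` and its parameters are assumptions for illustration only.

```python
import numpy as np


def ar1_covariance(n_timepoints, rho, innovation_var=1.0):
    """Covariance matrix of a stationary AR(1) process.

    Entry (s, t) equals innovation_var * rho**|s - t| / (1 - rho**2).
    Assumes equally spaced time points and |rho| < 1.
    """
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    idx = np.arange(n_timepoints)
    lags = np.abs(idx[:, None] - idx[None, :])  # |s - t| for every pair
    return innovation_var / (1.0 - rho**2) * rho**lags


if __name__ == "__main__":
    # Correlation decays geometrically with the gap between time points.
    print(ar1_covariance(4, rho=0.5))
```

With rho close to zero the matrix is nearly diagonal (independent time points); as rho approaches one, neighbouring measurements become strongly correlated, which is the dependence that a Fourier-only model ignores.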

Bar-Joseph Z, Gitter A, Simon I. Studying and modelling dynamic biological processes using time-series gene expression data. Nat Rev Genet 2012;13:552-64. [PMID: 22805708 DOI: 10.1038/nrg3244] [Citation(s) in RCA: 318] [Impact Index Per Article: 24.5] [Indexed: 12/19/2022]

Redestig H, Costa IG. Detection and interpretation of metabolite-transcript coresponses using combined profiling data. Bioinformatics 2011;27:i357-65. [PMID: 21685093 PMCID: PMC3117345 DOI: 10.1093/bioinformatics/btr231] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.1] [Indexed: 01/29/2023]