1
Williams JR, Yang R, Clifford JL, Watson D, Campbell R, Getnet D, Kumar R, Hammamieh R, Jett M. Functional Heatmap: an automated and interactive pattern recognition tool to integrate time with multi-omics assays. BMC Bioinformatics 2019; 20:81. [PMID: 30770734 PMCID: PMC6377781 DOI: 10.1186/s12859-019-2657-0] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 10/01/2018] [Accepted: 01/28/2019] [Indexed: 11/10/2022]
Abstract
BACKGROUND Life science research is moving quickly towards large-scale experimental designs that comprise multiple tissues, time points, and samples. Omic time-series experiments offer answers to three big questions: what collective patterns do most analytes follow, which analytes follow an identical pattern or synchronize across multiple cohorts, and how do biological functions evolve over time. Existing tools fall short of robustly answering and visualizing all three questions in a unified interface. RESULTS Functional Heatmap offers time-series data visualization through a Master Panel page and a Combined page to answer each of the three time-series questions. It dissects complex multi-omics time-series readouts into patterned clusters with associated biological functions. It allows users to identify a cascade of functional changes over a time variable. Conversely, Functional Heatmap can compare how a pattern with specific biology responds to multiple experimental conditions. All analyses are interactive, searchable, and exportable in the form of a heatmap, line chart, or text, and the results are easy to share, maintain, and reproduce on the web platform. CONCLUSIONS Functional Heatmap is an automated and interactive tool that enables pattern recognition in time-series multi-omics assays. It significantly reduces the manual labour of pattern discovery and comparison by translating statistical models into visual cues. The new pattern recognition feature will help researchers identify hidden trends driven by functional changes across multiple tissues or conditions in time-series omic assays.
Affiliation(s)
- Joshua R Williams
- Advanced Biomedical Computational Science, Frederick National Laboratory for Cancer Research sponsored by the National Cancer Institute, Frederick, MD, 21702-5010, USA; Integrative Systems Biology Program, US Army Center for Environmental Health Research, Fort Detrick, Frederick, MD, 21702-5010, USA
- Ruoting Yang
- Advanced Biomedical Computational Science, Frederick National Laboratory for Cancer Research sponsored by the National Cancer Institute, Frederick, MD, 21702-5010, USA; Integrative Systems Biology Program, US Army Center for Environmental Health Research, Fort Detrick, Frederick, MD, 21702-5010, USA
- John L Clifford
- Integrative Systems Biology Program, US Army Center for Environmental Health Research, Fort Detrick, Frederick, MD, 21702-5010, USA
- Daniel Watson
- Advanced Biomedical Computational Science, Frederick National Laboratory for Cancer Research sponsored by the National Cancer Institute, Frederick, MD, 21702-5010, USA
- Ross Campbell
- Advanced Biomedical Computational Science, Frederick National Laboratory for Cancer Research sponsored by the National Cancer Institute, Frederick, MD, 21702-5010, USA; Integrative Systems Biology Program, US Army Center for Environmental Health Research, Fort Detrick, Frederick, MD, 21702-5010, USA
- Derese Getnet
- Integrative Systems Biology Program, US Army Center for Environmental Health Research, Fort Detrick, Frederick, MD, 21702-5010, USA
- Raina Kumar
- Advanced Biomedical Computational Science, Frederick National Laboratory for Cancer Research sponsored by the National Cancer Institute, Frederick, MD, 21702-5010, USA; Integrative Systems Biology Program, US Army Center for Environmental Health Research, Fort Detrick, Frederick, MD, 21702-5010, USA
- Rasha Hammamieh
- Integrative Systems Biology Program, US Army Center for Environmental Health Research, Fort Detrick, Frederick, MD, 21702-5010, USA
- Marti Jett
- Integrative Systems Biology Program, US Army Center for Environmental Health Research, Fort Detrick, Frederick, MD, 21702-5010, USA
2
Fidaner IB, Cankorur-Cetinkaya A, Dikicioglu D, Kirdar B, Cemgil AT, Oliver SG. CLUSTERnGO: a user-defined modelling platform for two-stage clustering of time-series data. Bioinformatics 2016; 32:388-97. [PMID: 26411869 PMCID: PMC4734040 DOI: 10.1093/bioinformatics/btv532] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 03/03/2015] [Accepted: 09/03/2015] [Indexed: 11/13/2022]
Abstract
Motivation: Simple bioinformatic tools are frequently used to analyse time-series datasets regardless of their ability to deal with transient phenomena, limiting the meaningful information that may be extracted from them. This situation requires the development and exploitation of tailor-made, easy-to-use and flexible tools designed specifically for the analysis of time-series datasets. Results: We present a novel statistical application called CLUSTERnGO, which uses a model-based clustering algorithm that fulfils this need. This algorithm involves two components of operation. Component 1 constructs a Bayesian non-parametric model (Infinite Mixture of Piecewise Linear Sequences) and Component 2 applies a novel clustering methodology (Two-Stage Clustering). The software can also assign biological meaning to the identified clusters using an appropriate ontology. It applies multiple hypothesis testing to report the significance of these enrichments. The algorithm has a four-phase pipeline. The application can be executed using either command-line tools or a user-friendly Graphical User Interface. The latter has been developed to address the needs of both specialist and non-specialist users. We use three diverse test cases to demonstrate the flexibility of the proposed strategy. In all cases, CLUSTERnGO not only outperformed existing algorithms in assigning unique GO term enrichments to the identified clusters, but also revealed novel insights regarding the biological systems examined, which were not uncovered in the original publications. Availability and implementation: The C++ and QT source codes, the GUI applications for Windows, OS X and Linux operating systems and user manual are freely available for download under the GNU GPL v3 license at http://www.cmpe.boun.edu.tr/content/CnG. Contact: sgo24@cam.ac.uk Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Ayca Cankorur-Cetinkaya
- Department of Chemical Engineering, Bogazici University, Istanbul, Turkey and Cambridge Systems Biology Centre and Department of Biochemistry, University of Cambridge, Cambridge, UK
- Duygu Dikicioglu
- Department of Chemical Engineering, Bogazici University, Istanbul, Turkey and Cambridge Systems Biology Centre and Department of Biochemistry, University of Cambridge, Cambridge, UK
- Betul Kirdar
- Department of Chemical Engineering, Bogazici University, Istanbul, Turkey and
- Stephen G Oliver
- Cambridge Systems Biology Centre and Department of Biochemistry, University of Cambridge, Cambridge, UK
3
Scheff JD, Almon RR, DuBois DC, Jusko WJ, Androulakis IP. A new symbolic representation for the identification of informative genes in replicated microarray experiments. OMICS-A JOURNAL OF INTEGRATIVE BIOLOGY 2010; 14:239-48. [PMID: 20455749 DOI: 10.1089/omi.2010.0005] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 11/12/2022]
Abstract
Microarray experiments generate massive amounts of data, necessitating innovative algorithms to distinguish biologically relevant information from noise. Because the variability of gene expression data is an important factor in determining which genes are differentially expressed, analysis techniques that take into account repeated measurements are critically important. Additionally, the selection of informative genes is typically done by searching for the individual genes that vary the most across conditions. Yet because genes tend to act in groups rather than individually, it may be possible to glean more information from the data by searching specifically for concerted behavior in a set of genes. Applying a symbolic transformation to the gene expression data allows the detection of overrepresented patterns in the data, in contrast to looking only for genes that exhibit maximal differential expression. These challenges are approached by introducing an algorithm based on a new symbolic representation that searches for concerted gene expression patterns; furthermore, the symbolic representation takes into account the variance in multiple replicates and can be applied to long time series data. The proposed algorithm's ability to discover biologically relevant signals in gene expression data is exhibited by applying it to three datasets that measure gene expression in the rat liver.
Affiliation(s)
- Jeremy D Scheff
- Biomedical Engineering Department, Rutgers University, Piscataway, New Jersey 08854, USA
4
Importance of replication in analyzing time-series gene expression data: corticosteroid dynamics and circadian patterns in rat liver. BMC Bioinformatics 2010; 11:279. [PMID: 20500897 PMCID: PMC2889936 DOI: 10.1186/1471-2105-11-279] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Received: 10/08/2009] [Accepted: 05/26/2010] [Indexed: 11/14/2022]
Abstract
Background Microarray technology is a powerful and widely accepted experimental technique in molecular biology that allows studying genome wide transcriptional responses. However, experimental data usually contain potential sources of uncertainty and thus many experiments are now designed with repeated measurements to better assess such inherent variability. Many computational methods have been proposed to account for the variability in replicates. As yet, there is no model to output expression profiles accounting for replicate information so that a variety of computational models that take the expression profiles as the input data can explore this information without any modification. Results We propose a methodology which integrates replicate variability into expression profiles, to generate so-called 'true' expression profiles. The study addresses two issues: (i) develop a statistical model that can estimate 'true' expression profiles which are more robust than the average profile, and (ii) extend our previous micro-clustering which was designed specifically for clustering time-series expression data. The model utilizes a previously proposed error model and the concept of 'relative difference'. The clustering effectiveness is demonstrated through synthetic data where several methods are compared. We subsequently analyze in vivo rat data to elucidate circadian transcriptional dynamics as well as liver-specific corticosteroid induced changes in gene expression. Conclusions We have proposed a model which integrates the error information from repeated measurements into the expression profiles. Through numerous synthetic and real time-series data, we demonstrated the ability of the approach to improve the clustering performance and assist in the identification and selection of informative expression motifs.
5
Feng W, Leach SM, Tipney H, Phang T, Geraci M, Spritz RA, Hunter LE, Williams T. Spatial and temporal analysis of gene expression during growth and fusion of the mouse facial prominences. PLoS One 2009; 4:e8066. [PMID: 20016822 PMCID: PMC2789411 DOI: 10.1371/journal.pone.0008066] [Citation(s) in RCA: 50] [Impact Index Per Article: 3.3] [Received: 08/19/2009] [Accepted: 10/25/2009] [Indexed: 11/19/2022]
Abstract
Orofacial malformations resulting from genetic and/or environmental causes are frequent human birth defects yet their etiology is often unclear because of insufficient information concerning the molecular, cellular and morphogenetic processes responsible for normal facial development. We have, therefore, derived a comprehensive expression dataset for mouse orofacial development, interrogating three distinct regions – the mandibular, maxillary and frontonasal prominences. To capture the dynamic changes in the transcriptome during face formation, we sampled five time points between E10.5–E12.5, spanning the developmental period from establishment of the prominences to their fusion to form the mature facial platform. Seven independent biological replicates were used for each sample ensuring robustness and quality of the dataset. Here, we provide a general overview of the dataset, characterizing aspects of gene expression changes at both the spatial and temporal level. Considerable coordinate regulation occurs across the three prominences during this period of facial growth and morphogenesis, with a switch from expression of genes involved in cell proliferation to those associated with differentiation. An accompanying shift in the expression of polycomb and trithorax genes presumably maintains appropriate patterns of gene expression in precursor or differentiated cells, respectively. Superimposed on the many coordinated changes are prominence-specific differences in the expression of genes encoding transcription factors, extracellular matrix components, and signaling molecules. Thus, the elaboration of each prominence will be driven by particular combinations of transcription factors coupled with specific cell:cell and cell:matrix interactions. The dataset also reveals several prominence-specific genes not previously associated with orofacial development, a subset of which we externally validate. 
Several of these latter genes are components of bidirectional transcription units that likely share cis-acting sequences with well-characterized genes. Overall, our studies provide a valuable resource for probing orofacial development and a robust dataset for bioinformatic analysis of spatial and temporal gene expression changes during embryogenesis.
Affiliation(s)
- Weiguo Feng
- Department of Craniofacial Biology, University of Colorado Denver, Aurora, Colorado, United States of America
- Sonia M. Leach
- Department of Pharmacology, University of Colorado Denver, Aurora, Colorado, United States of America
- Hannah Tipney
- Department of Pharmacology, University of Colorado Denver, Aurora, Colorado, United States of America
- Tzulip Phang
- Department of Pharmacology, University of Colorado Denver, Aurora, Colorado, United States of America
- Mark Geraci
- Department of Medicine, University of Colorado Denver, Aurora, Colorado, United States of America
- Richard A. Spritz
- Human Medical Genetics Program, University of Colorado Denver, Aurora, Colorado, United States of America
- Lawrence E. Hunter
- Department of Pharmacology, University of Colorado Denver, Aurora, Colorado, United States of America
- Trevor Williams
- Department of Craniofacial Biology, University of Colorado Denver, Aurora, Colorado, United States of America
- Department of Cell and Developmental Biology, University of Colorado Denver, Aurora, Colorado, United States of America
6
Hennetin J, Pehkonen P, Bellis M. Construction and use of gene expression covariation matrix. BMC Bioinformatics 2009; 10:214. [PMID: 19594909 PMCID: PMC2720390 DOI: 10.1186/1471-2105-10-214] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Received: 10/21/2008] [Accepted: 07/13/2009] [Indexed: 11/10/2022]
Abstract
BACKGROUND One essential step in the massive analysis of transcriptomic profiles is the calculation of the correlation coefficient, a value used to select pairs of genes with similar or inverse transcriptional profiles across a large fraction of the biological conditions examined. Until now, the choice between the two available methods for calculating the coefficient has been dictated mainly by technological considerations. Specifically, in analyses based on double-channel techniques, researchers have been required to use covariation correlation, i.e. the correlation between gene expression changes measured between several pairs of biological conditions, expressed for example as fold-change. In contrast, in analyses of single-channel techniques scientists have been restricted to the use of coexpression correlation, i.e. correlation between gene expression levels. To our knowledge, nobody has ever examined the possible benefits of using covariation instead of coexpression in massive analyses of single channel microarray results. RESULTS We describe here how single-channel techniques can be treated like double-channel techniques and used to generate both gene expression changes and covariation measures. We also present a new method that allows the calculation of both positive and negative correlation coefficients between genes. First, we perform systematic comparisons between two given biological conditions and classify, for each comparison, genes as increased (I), decreased (D), or not changed (N). As a result, the original series of n gene expression level measures assigned to each gene is replaced by an ordered string of n(n-1)/2 symbols, e.g. IDDNNIDID....DNNNNNNID, with the length of the string corresponding to the number of comparisons. In a second step, positive and negative covariation matrices (CVM) are constructed by calculating statistically significant positive or negative correlation scores for any pair of genes by comparing their strings of symbols. 
CONCLUSION This new method, applied to four different large data sets, has allowed us to construct distinct covariation matrices with similar properties. We have also developed a technique to translate these covariation networks into graphical 3D representations and found that the local assignation of the probe sets was conserved across the four chip set models used which encompass three different species (humans, mice, and rats). The application of adapted clustering methods succeeded in delineating six conserved functional regions that we characterized using Gene Ontology information.
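The symbolic encoding described in this abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the fixed `threshold` stands in for the statistical test the authors use to call a change, and the helper names `encode_comparisons` and `agreement_counts` are hypothetical, not taken from the paper.

```python
from itertools import combinations


def encode_comparisons(levels, threshold=1.0):
    """Encode n expression levels as a string of n(n-1)/2 symbols.

    Each ordered pair of conditions (i < j) yields I (increased),
    D (decreased), or N (not changed). The fixed threshold is an
    assumption made for this sketch, not the paper's criterion.
    """
    symbols = []
    for i, j in combinations(range(len(levels)), 2):
        change = levels[j] - levels[i]
        if change > threshold:
            symbols.append("I")
        elif change < -threshold:
            symbols.append("D")
        else:
            symbols.append("N")
    return "".join(symbols)


def agreement_counts(s1, s2):
    """Count comparisons where two genes change in the same or opposite direction."""
    same = sum(1 for a, b in zip(s1, s2) if a == b and a != "N")
    opposite = sum(1 for a, b in zip(s1, s2) if {a, b} == {"I", "D"})
    return same, opposite


# Two made-up genes measured under four conditions (4 * 3 / 2 = 6 comparisons).
gene_a = [1.0, 3.0, 2.5, 6.0]
gene_b = [5.0, 2.0, 2.2, 0.5]
string_a = encode_comparisons(gene_a)  # "IIINII"
string_b = encode_comparisons(gene_b)  # "DDDNDD"
same, opposite = agreement_counts(string_a, string_b)  # (0, 5)
```

A high `opposite` count relative to the string length hints at negative covariation between the two genes; the paper scores such agreement statistically rather than by raw counts.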
Affiliation(s)
- Jérôme Hennetin
- Centre de Recherches en Biochimie Macromoléculaire, CNRS, Montpellier, France.
7
Liu T, Lin N, Shi N, Zhang B. Information criterion-based clustering with order-restricted candidate profiles in short time-course microarray experiments. BMC Bioinformatics 2009; 10:146. [PMID: 19445669 PMCID: PMC2696449 DOI: 10.1186/1471-2105-10-146] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.7] [Received: 02/04/2009] [Accepted: 05/15/2009] [Indexed: 11/25/2022]
Abstract
Background Time-course microarray experiments produce vector gene expression profiles across a series of time points. Clustering genes based on these profiles is important in discovering functionally related and co-regulated genes. Early clustering algorithms do not take advantage of the ordering in a time-course study, explicit use of which should allow more sensitive detection of genes that display a consistent pattern over time. Peddada et al. [1] proposed a clustering algorithm that can incorporate the temporal ordering using order-restricted statistical inference. This algorithm is, however, very time-consuming and hence inapplicable to most microarray experiments that contain a large number of genes. Its computational burden also makes it difficult to assess the clustering reliability, which is a very important measure when clustering noisy microarray data. Results We propose a computationally efficient information criterion-based clustering algorithm, called ORICC, that also takes account of the ordering in time-course microarray experiments by embedding the order-restricted inference into a model selection framework. Genes are assigned to the profile they best match, as determined by a newly proposed information criterion for order-restricted inference. In addition, we developed a bootstrap procedure to assess ORICC's clustering reliability for every gene. Simulation studies show that the ORICC method is robust, always gives better clustering accuracy than Peddada's method and reduces computation time by a factor of hundreds. Under some scenarios, its accuracy is also better than some other existing clustering methods for short time-course microarray data, such as STEM [2] and Wang et al. [3]. It is also computationally much faster than Wang et al. [3]. 
Conclusion Our ORICC algorithm, which takes advantage of the temporal ordering in time-course microarray experiments, provides good clustering accuracy and is also much faster than Peddada's method. Moreover, the clustering reliability for each gene can also be assessed, which is unavailable in Peddada's method. In a real data example, the ORICC algorithm identifies new and interesting genes that previous analyses failed to reveal.
Affiliation(s)
- Tianqing Liu
- Key Laboratory for Applied Statistics of MOE and School of Mathematics and Statistics, Northeast Normal University, Changchun, PR China.
8
Clustering of gene expression data based on shape similarity. EURASIP JOURNAL ON BIOINFORMATICS & SYSTEMS BIOLOGY 2009:195712. [PMID: 19404484 PMCID: PMC3171421 DOI: 10.1155/2009/195712] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.9] [Received: 08/04/2008] [Revised: 01/08/2009] [Accepted: 01/27/2009] [Indexed: 11/18/2022]
Abstract
A method for gene clustering from expression profiles using shape information is presented. Conventional clustering approaches such as K-means assume that genes with similar functions have similar expression levels and hence allocate genes with similar expression levels into the same cluster. However, genes with similar function often exhibit similarity in signal shape even though their expression magnitudes can be far apart. Therefore, this investigation studies clustering according to signal shape similarity. This shape information is captured in the form of normalized and time-scaled forward first differences, which are then subjected to a variational Bayes clustering plus a non-Bayesian (Silhouette) cluster statistic. The statistic shows an improved ability to identify the correct number of clusters and assign the members of each cluster. Based on initial results for both generated test data and Escherichia coli microarray expression data and initial validation of the Escherichia coli results, it is shown that the method has promise in being able to better cluster time-series microarray data according to shape similarity.
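As a rough illustration of the shape feature mentioned in this abstract, the sketch below computes forward first differences, divides each by its time interval, and normalizes the result. Normalizing by the Euclidean norm is an assumption made here; the abstract does not state which normalization the authors use, and the function name `shape_features` is hypothetical.

```python
import math


def shape_features(values, times):
    """Return normalized, time-scaled forward first differences.

    Each difference values[k+1] - values[k] is divided by the time step,
    and the resulting vector is scaled to unit Euclidean length
    (an assumed normalization) so that only the shape remains.
    """
    diffs = [
        (values[k + 1] - values[k]) / (times[k + 1] - times[k])
        for k in range(len(values) - 1)
    ]
    norm = math.sqrt(sum(d * d for d in diffs))
    if norm == 0.0:
        return diffs  # a flat profile has no shape to normalize
    return [d / norm for d in diffs]


# Two made-up profiles with the same shape but a tenfold difference in magnitude.
times = [0.0, 1.0, 2.0, 4.0]
low = shape_features([1.0, 2.0, 4.0, 4.0], times)
high = shape_features([10.0, 20.0, 40.0, 40.0], times)
```

Here `low` and `high` coincide up to floating-point error, which is the property that lets a downstream clustering step group genes by shape rather than by expression level.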
9
Liu LYD, Chen CY, Chen MJM, Tsai MS, Lee CHS, Phang TL, Chang LY, Kuo WH, Hwa HL, Lien HC, Jung SM, Lin YS, Chang KJ, Hsieh FJ. Statistical identification of gene association by CID in application of constructing ER regulatory network. BMC Bioinformatics 2009; 10:85. [PMID: 19292896 PMCID: PMC2679734 DOI: 10.1186/1471-2105-10-85] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Received: 09/01/2008] [Accepted: 03/17/2009] [Indexed: 02/01/2023]
Abstract
Background A variety of high-throughput techniques are now available for constructing comprehensive gene regulatory networks in systems biology. In this study, we report a new statistical approach for facilitating in silico inference of regulatory network structure. The new measure of association, coefficient of intrinsic dependence (CID), is model-free and can be applied to both continuous and categorical distributions. When given two variables X and Y, CID answers whether Y is dependent on X by examining the conditional distribution of Y given X. In this paper, we apply CID to analyze the regulatory relationships between transcription factors (TFs) (X) and their downstream genes (Y) based on clinical data. More specifically, we use estrogen receptor α (ERα) as the variable X, and the analyses are based on 48 clinical breast cancer gene expression arrays (48A). Results The analytical utility of CID was evaluated in comparison with four commonly used statistical methods, Galton-Pearson's correlation coefficient (GPCC), Student's t-test (STT), coefficient of determination (CoD), and mutual information (MI). When compared with GPCC, CoD, and MI, CID is better able to discover regulatory associations where the distribution of the mRNA expression levels of X and Y does not fit linear models. On the other hand, when CID is used to measure the association of a continuous variable (Y) against a discrete variable (X), it performs similarly to STT and appears to outperform CoD and MI. In addition, this study established a two-layer transcriptional regulatory network to exemplify the usage of CID, in combination with GPCC, in deciphering gene networks based on gene expression profiles from patient arrays. Conclusion CID is shown to provide useful information for identifying associations between genes and transcription factors of interest in patient arrays. 
When coupled with the relationships detected by GPCC, the associations predicted by CID are applicable to the construction of transcriptional regulatory networks. This study shows how information from different data sources and learning algorithms can be integrated to investigate whether relevant regulatory mechanisms identified in cell models can also be partially re-identified in clinical samples of breast cancers. Availability The implementation of CID in R code can be freely downloaded from .
Affiliation(s)
- Li-Yu D Liu
- Department of Agronomy, Biometry Division, National Taiwan University, Taipei, Taiwan.
10
Difference-based clustering of short time-course microarray data with replicates. BMC Bioinformatics 2007; 8:253. [PMID: 17629922 PMCID: PMC1952071 DOI: 10.1186/1471-2105-8-253] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.7] [Received: 04/03/2007] [Accepted: 07/14/2007] [Indexed: 11/10/2022]
Abstract
BACKGROUND There are some limitations associated with conventional clustering methods for short time-course gene expression data. The current algorithms require prior domain knowledge and do not incorporate information from replicates. Moreover, the results are not always easy to interpret biologically. RESULTS We propose a novel algorithm for identifying a subset of genes sharing a significant temporal expression pattern when replicates are used. Our algorithm requires no prior knowledge, instead relying on an observed statistic which is based on the first and second order differences between adjacent time-points. Here, a pattern is predefined as the sequence of symbols indicating direction and the rate of change between time-points, and each gene is assigned to a cluster whose members share a similar pattern. We compared the performance of our algorithm with that of the K-means, Self-Organizing Map and Short Time-series Expression Miner methods. CONCLUSIONS Assessments using simulated and real data show that our method outperformed the aforementioned algorithms. Our approach is an appropriate solution for clustering short time-course microarray data with replicates.
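The idea of labelling each gene by the signs of its first and second order differences can be sketched as follows. This is a simplified illustration, not the authors' algorithm: replicates are collapsed to their mean and a fixed tolerance `tol` replaces the observed statistic described in the abstract, and all names (`sign_symbol`, `difference_pattern`, the example genes) are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def sign_symbol(x, tol):
    """Map a difference to '+', '-', or '0' using a fixed tolerance (assumed)."""
    if x > tol:
        return "+"
    if x < -tol:
        return "-"
    return "0"


def difference_pattern(replicates, tol=0.5):
    """Return (direction, rate-of-change) symbol strings for one gene.

    `replicates` holds one list of replicate measurements per time point.
    First differences give the direction between adjacent time points;
    second differences indicate whether the change speeds up or slows down.
    """
    means = [mean(r) for r in replicates]
    first = [b - a for a, b in zip(means, means[1:])]
    second = [b - a for a, b in zip(first, first[1:])]
    direction = "".join(sign_symbol(d, tol) for d in first)
    curvature = "".join(sign_symbol(d, tol) for d in second)
    return direction, curvature


# Made-up data: three genes, four time points, two replicates each.
genes = {
    "g1": [[1.0, 1.2], [2.0, 2.2], [4.1, 3.9], [4.0, 4.2]],
    "g2": [[5.0, 5.2], [6.1, 6.1], [8.0, 8.2], [8.1, 8.0]],
    "g3": [[4.0, 4.0], [3.0, 3.0], [1.0, 1.0], [1.0, 1.0]],
}

# Genes sharing an identical pattern fall into the same cluster.
clusters = defaultdict(list)
for name, reps in genes.items():
    clusters[difference_pattern(reps)].append(name)
```

In this toy data `g1` and `g2` share the pattern `("++0", "+-")` despite different baselines, while `g3` forms its own cluster. A faithful implementation would use the replicate variance when deciding each symbol instead of a fixed tolerance.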
11
Clarkson RWE, Wayland MT, Lee J, Freeman T, Watson CJ. Gene expression profiling of mammary gland development reveals putative roles for death receptors and immune mediators in post-lactational regression. Breast Cancer Res 2003; 6:R92-109. [PMID: 14979921 PMCID: PMC400653 DOI: 10.1186/bcr754] [Citation(s) in RCA: 257] [Impact Index Per Article: 12.2] [Received: 09/22/2003] [Revised: 11/15/2003] [Accepted: 11/21/2003] [Indexed: 01/22/2023]
Abstract
INTRODUCTION In order to gain a better understanding of the molecular processes that underlie apoptosis and tissue regression in the mammary gland, we undertook a large-scale analysis of transcriptional changes during the mouse mammary pregnancy cycle, with emphasis on the transition from lactation to involution. METHOD Affymetrix microarrays, representing 8618 genes, were used to compare mammary tissue from 12 time points (one virgin, three gestation, three lactation and five involution stages). Six animals were used for each time point. Common patterns of gene expression across all time points were identified and related to biological function. RESULTS The majority of significantly induced genes in involution were also differentially regulated at earlier stages in the pregnancy cycle. This included a marked increase in inflammatory mediators during involution and at parturition, which correlated with leukaemia inhibitory factor-Stat3 (signal transducer and activator of transcription 3) signalling. Before involution, expected increases in cell proliferation, biosynthesis and metabolism-related genes were observed. During involution, the first 24 hours after weaning were characterized by a transient increase in expression of components of the death receptor pathways of apoptosis, inflammatory cytokines and acute phase response genes. After 24 hours, regulators of intrinsic apoptosis were induced in conjunction with markers of phagocyte activity, matrix proteases, suppressors of neutrophils and soluble components of specific and innate immunity. CONCLUSION We provide a resource of mouse mammary gene expression data for download or online analysis. Here we highlight the sequential induction of distinct apoptosis pathways in involution and the stimulation of immunomodulatory signals, which probably suppress the potentially damaging effects of a cellular inflammatory response while maintaining an appropriate antimicrobial and phagocytic environment.
12
Rudolph MC, McManaman JL, Hunter L, Phang T, Neville MC. Functional development of the mammary gland: use of expression profiling and trajectory clustering to reveal changes in gene expression during pregnancy, lactation, and involution. J Mammary Gland Biol Neoplasia 2003; 8:287-307. [PMID: 14973374 DOI: 10.1023/b:jomg.0000010030.73983.57] [Citation(s) in RCA: 165] [Impact Index Per Article: 7.9] [Indexed: 11/12/2022]
Abstract
To characterize the molecular mechanisms by which progesterone withdrawal initiates milk secretion, we examined global gene expression during pregnancy and lactation in mice, focusing on the period around parturition. Trajectory clustering was used to profile the expression of 1358 genes that changed significantly between pregnancy day 12 and lactation day 9. Predominantly downward trajectories included stromal and proteasomal genes and genes for the enzymes of fatty acid degradation. Milk protein gene expression increased throughout pregnancy, whereas the expression of genes for lipid synthesis increased sharply at the onset of lactation. Examination of regulatory genes with profiles similar or complementary to those of lipid synthesis genes led to a model in which progesterone stimulates synthesis of TGF-beta, Wnt 5b, and IGFBP-5 during pregnancy. These factors are suggested to repress secretion by interfering with PRL and IGF-1 signaling. With progesterone withdrawal, PRL and IGF-1 signaling are activated, in turn activating Akt/PKB and the SREBPs, leading to increased lipid synthesis.
Affiliation(s)
- Michael C Rudolph
- Department of Physiology and Biophysics, University of Colorado Health Sciences Center, Denver, Colorado 80262, USA
13
Abstract
Mammary epithelial cells (MEC) undergo a series of developmental decisions during a pregnancy cycle. The switches from proliferation to differentiation to secretion and then to cell death are precisely controlled. In order to identify critical changes associated with the transition from a secretory phenotype during lactation to dedifferentiation and cell death, we have undertaken a microarray analysis of mouse mammary gland development. We have focused on the involution switch and on the transcription profiles of genes that are targets of transcription factors known to influence involution and apoptosis. Our results show that both Stat3 and NF-kB target genes are induced by the involution switch while Stat5 target genes are distinct from Stat3 induced genes. Furthermore, a substantial number of genes that were specifically upregulated at the start of involution are regulators of inflammation and the acute phase response. These results provide a novel insight into the involution process and demonstrate the value of microarray analysis in defining molecular events associated with critical developmental transitions in the mammary gland.