1
Wang S, Yan R, Wang B, Meng P, Tan W, Guo X. The Functional Analysis of Selenium-Related Genes and Magnesium-Related Genes in the Gene Expression Profile Microarray in the Peripheral Blood Mononuclear Cells of Keshan Disease. Biol Trace Elem Res 2019; 192:3-9. [PMID: 31165343 DOI: 10.1007/s12011-019-01750-2] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 01/14/2019] [Accepted: 05/13/2019] [Indexed: 02/08/2023]
Abstract
Keshan disease (KD) is an endemic cardiomyopathy with high mortality. Selenium (Se) deficiency is closely related to KD, while magnesium (Mg) plays many critical roles in cardiovascular function. The molecular mechanism of KD pathogenesis is still unclear. To date, we have found no studies investigating the association between Se- or Mg-related genes and KD. In this study, oligonucleotide microarray analysis was used to identify genes differentially expressed in peripheral blood mononuclear cells between KD patients and normal controls. Next, the Human Metabolome Database (HMDB) was used to screen Se- and Mg-related genes. Functional classification, gene pathways, and the interaction network of Se- and Mg-related genes in KD peripheral blood mononuclear cells were defined with FunRich (a functional enrichment analysis tool). Among 83 differentially expressed genes, five Se-related (DIO2, GPX1, GPX2, GPX4, and GPX7) and five Mg-related (ACSL6, EYA4, IDH2, PPM1A, and STK11) genes were recognized from HMDB. Two significant biological processes (energy pathways and metabolism), one molecular function (peroxidase activity), one biological pathway (glutathione redox reactions I), and one gene interaction network were identified from the Se-related and Mg-related genes. The Se-related gene DIO2 and the Mg-related genes STK11 and IDH2 may have key roles in the myocardial dysfunction of KD. However, no interaction between a Se-related gene and an Mg-related gene was found. The interactions between RPS6KB1, PTEN, ATM, HSP90AA1, SNRK, PRKAA2, SMARCA4, HSPA1A, and STK11 may play important roles in the abnormal cardiac function of KD.
Affiliation(s)
- Sen Wang
- School of Public Health, Health Science Center of Xi'an Jiaotong University, No. 76 Yanta West Road, Xi'an, 710061, Shaanxi, China
- Key Laboratory of Trace Elements and Endemic Diseases, National Health Commission, No. 76 Yanta West Road, Xi'an, 710061, Shaanxi, China
- Rui Yan
- Department of Cardiology, Beijing Luhe Hospital of Capital Medical University, Beijing, China
- Bin Wang
- Institute for Hygiene of Ordnance Industry, Xi'an, Shaanxi, China
- Peiling Meng
- School of Public Health, Health Science Center of Xi'an Jiaotong University, No. 76 Yanta West Road, Xi'an, 710061, Shaanxi, China
- Key Laboratory of Trace Elements and Endemic Diseases, National Health Commission, No. 76 Yanta West Road, Xi'an, 710061, Shaanxi, China
- Wuhong Tan
- School of Public Health, Health Science Center of Xi'an Jiaotong University, No. 76 Yanta West Road, Xi'an, 710061, Shaanxi, China
- Key Laboratory of Trace Elements and Endemic Diseases, National Health Commission, No. 76 Yanta West Road, Xi'an, 710061, Shaanxi, China
- Xiong Guo
- School of Public Health, Health Science Center of Xi'an Jiaotong University, No. 76 Yanta West Road, Xi'an, 710061, Shaanxi, China
- Key Laboratory of Trace Elements and Endemic Diseases, National Health Commission, No. 76 Yanta West Road, Xi'an, 710061, Shaanxi, China
2
Tulpan D, Ghiggi A, Montemanni R. Computational Sequence Design Techniques for DNA Microarray Technologies. Bioinformatics 2013. [DOI: 10.4018/978-1-4666-3604-0.ch048] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022] Open
Abstract
In systems biology and biomedical research, microarray technology is a method of choice that enables the complete quantitative and qualitative ascertainment of gene expression patterns for whole genomes. The selection of high quality oligonucleotide sequences that behave consistently across multiple experiments is a key step in the design, fabrication and experimental performance of DNA microarrays. The aim of this chapter is to outline recent algorithmic developments in microarray probe design, evaluate existing probe sequences used in commercial arrays, and suggest methodologies that have the potential to improve on existing design techniques.
Affiliation(s)
- Dan Tulpan
- National Research Council of Canada, Canada
- Roberto Montemanni
- Istituto Dalle Molle di Studi sull’Intelligenza Artificiale, Switzerland
3
Systems Analysis of Arrestin Pathway Functions. Progress in Molecular Biology and Translational Science 2013; 118:431-67. [DOI: 10.1016/b978-0-12-394440-5.00017-6] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.1] [Indexed: 12/13/2022]
4
Adriaens ME, Jaillard M, Eijssen LMT, Mayer CD, Evelo CTA. An evaluation of two-channel ChIP-on-chip and DNA methylation microarray normalization strategies. BMC Genomics 2012; 13:42. [PMID: 22276688 PMCID: PMC3293711 DOI: 10.1186/1471-2164-13-42] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.4] [Received: 08/12/2011] [Accepted: 01/25/2012] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND The combination of chromatin immunoprecipitation with two-channel microarray technology enables genome-wide mapping of binding sites of DNA-interacting proteins (ChIP-on-chip) or sites with methylated CpG di-nucleotides (DNA methylation microarray). These powerful tools are the gateway to understanding gene transcription regulation. Since the goals of such studies, the sample preparation procedures, the microarray content and study design are all different from transcriptomics microarrays, the data pre-processing strategies traditionally applied to transcriptomics microarrays may not be appropriate. Particularly, the main challenge of the normalization of "regulation microarrays" is (i) to make the data of individual microarrays quantitatively comparable and (ii) to keep the signals of the enriched probes, representing DNA sequences from the precipitate, as distinguishable as possible from the signals of the un-enriched probes, representing DNA sequences largely absent from the precipitate. RESULTS We compare several widely used normalization approaches (VSN, LOWESS, quantile, T-quantile, Tukey's biweight scaling, Peng's method) applied to a selection of regulation microarray datasets, ranging from DNA methylation to transcription factor binding and histone modification studies. Through comparison of the data distributions of control probes and gene promoter probes before and after normalization, and assessment of the power to identify known enriched genomic regions after normalization, we demonstrate that there are clear differences in performance between normalization procedures. CONCLUSION T-quantile normalization applied separately on the channels and Tukey's biweight scaling outperform other methods in terms of the conservation of enriched and un-enriched signal separation, as well as in identification of genomic regions known to be enriched. T-quantile normalization is preferable as it additionally improves comparability between microarrays. 
In contrast, popular normalization approaches like quantile, LOWESS, Peng's method and VSN normalization alter the data distributions of regulation microarrays to such an extent that using these approaches will impact the reliability of the downstream analysis substantially.
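To make the concern about distribution-altering methods concrete, here is a minimal sketch of plain quantile normalization (not the T-quantile variant the abstract favours, whose details are not given here). The function name and the toy intensities are illustrative assumptions, and ties are broken arbitrarily by sort order.

```python
def quantile_normalize(arrays):
    """Force every array to share one reference distribution.

    arrays: list of equal-length lists, one list of intensities per microarray.
    Returns new lists; the inputs are left unchanged.
    """
    n = len(arrays[0])
    if any(len(a) != n for a in arrays):
        raise ValueError("all arrays must have the same number of probes")

    # Reference distribution: mean of the r-th smallest value across arrays.
    sorted_arrays = [sorted(a) for a in arrays]
    reference = [sum(s[r] for s in sorted_arrays) / len(arrays) for r in range(n)]

    normalized = []
    for a in arrays:
        # Probe indices ordered by intensity (ties resolved by position).
        order = sorted(range(n), key=lambda i: a[i])
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = reference[rank]
        normalized.append(out)
    return normalized


result = quantile_normalize([[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]])
print(result)  # [[5.5, 1.5, 3.5], [3.5, 1.5, 5.5]]
```

After this step both arrays contain exactly the same set of values, which illustrates how a genuine difference in the proportion of enriched probes between arrays could be flattened.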
Affiliation(s)
- Michiel E Adriaens
- Department of Bioinformatics-BiGCaT, Maastricht University, Maastricht, The Netherlands.
5
Yao C, Li H, Shen X, He Z, He L, Guo Z. Reproducibility and concordance of differential DNA methylation and gene expression in cancer. PLoS One 2012; 7:e29686. [PMID: 22235325 PMCID: PMC3250460 DOI: 10.1371/journal.pone.0029686] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.3] [Received: 08/22/2011] [Accepted: 12/01/2011] [Indexed: 12/11/2022] Open
Abstract
Background Hundreds of genes with differential DNA methylation of promoters have been identified for various cancers. However, the reproducibility of differential DNA methylation discoveries for cancer and the relationship between DNA methylation and aberrant gene expression have not been systematically analysed. Methodology/Principal Findings Using array data for seven types of cancers, we first evaluated the effects of experimental batches on differential DNA methylation detection. Second, we compared the directions of DNA methylation changes detected from different datasets for the same cancer. Third, we evaluated the concordance between methylation and gene expression changes. Finally, we compared DNA methylation changes in different cancers. For a given cancer, the directions of methylation and expression changes detected from different datasets, excluding potential batch effects, were highly consistent. In different cancers, DNA hypermethylation was highly inversely correlated with the down-regulation of gene expression, whereas hypomethylation was only weakly correlated with the up-regulation of genes. Finally, we found that genes commonly hypomethylated in different cancers primarily performed functions associated with chronic inflammation, such as ‘keratinization’, ‘chemotaxis’ and ‘immune response’. Conclusions Batch effects could greatly affect the discovery of DNA methylation biomarkers. For a particular cancer, both differential DNA methylation and gene expression can be reproducibly detected from different studies with no batch effects. While DNA hypermethylation is significantly linked to gene down-regulation, hypomethylation is only weakly correlated with gene up-regulation and is likely to be linked to chronic inflammation.
Affiliation(s)
- Chen Yao
- Bioinformatics Centre and Key Laboratory for NeuroInformation of the Education Ministry of China, School of Life Science, University of Electronic Science and Technology of China, Chengdu, China
- Hongdong Li
- Bioinformatics Centre and Key Laboratory for NeuroInformation of the Education Ministry of China, School of Life Science, University of Electronic Science and Technology of China, Chengdu, China
- Xiaopei Shen
- Bioinformatics Centre and Key Laboratory for NeuroInformation of the Education Ministry of China, School of Life Science, University of Electronic Science and Technology of China, Chengdu, China
- Zheng He
- Bioinformatics Centre and Key Laboratory for NeuroInformation of the Education Ministry of China, School of Life Science, University of Electronic Science and Technology of China, Chengdu, China
- Lang He
- Bioinformatics Centre and Key Laboratory for NeuroInformation of the Education Ministry of China, School of Life Science, University of Electronic Science and Technology of China, Chengdu, China
- Zheng Guo
- Bioinformatics Centre and Key Laboratory for NeuroInformation of the Education Ministry of China, School of Life Science, University of Electronic Science and Technology of China, Chengdu, China
- Colleges of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
6
Maudsley S, Chadwick W, Wang L, Zhou Y, Martin B, Park SS. Bioinformatic approaches to metabolic pathways analysis. Methods Mol Biol 2011; 756:99-130. [PMID: 21870222 DOI: 10.1007/978-1-61779-160-4_5] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.2] [Indexed: 12/14/2022]
Abstract
The growth and development in the last decade of accurate and reliable mass data collection techniques has greatly enhanced our comprehension of cell signaling networks and pathways. At the same time however, these technological advances have also increased the difficulty of satisfactorily analyzing and interpreting these ever-expanding datasets. At the present time, multiple diverse scientific communities including molecular biological, genetic, proteomic, bioinformatic, and cell biological, are converging upon a common endpoint, that is, the measurement, interpretation, and potential prediction of signal transduction cascade activity from mass datasets. Our ever increasing appreciation of the complexity of cellular or receptor signaling output and the structural coordination of intracellular signaling cascades has to some extent necessitated the generation of a new branch of informatics that more closely associates functional signaling effects to biological actions and even whole-animal phenotypes. The ability to untangle and hopefully generate theoretical models of signal transduction information flow from transmembrane receptor systems to physiological and pharmacological actions may be one of the greatest advances in cell signaling science. In this overview, we shall attempt to assist the navigation into this new field of cell signaling and highlight several methodologies and technologies to appreciate this exciting new age of signal transduction.
Affiliation(s)
- Stuart Maudsley
- Receptor Pharmacology Unit, National Institute on Aging, National Institutes of Health, Baltimore, MD, USA.
7
Tulpan D, Ghiggi A, Montemanni R. Computational Sequence Design Techniques for DNA Microarray Technologies. Systemic Approaches in Bioinformatics and Computational Systems Biology 2011. [DOI: 10.4018/978-1-61350-435-2.ch003] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 01/29/2023]
Abstract
In systems biology and biomedical research, microarray technology is a method of choice that enables the complete quantitative and qualitative ascertainment of gene expression patterns for whole genomes. The selection of high quality oligonucleotide sequences that behave consistently across multiple experiments is a key step in the design, fabrication and experimental performance of DNA microarrays. The aim of this chapter is to outline recent algorithmic developments in microarray probe design, evaluate existing probe sequences used in commercial arrays, and suggest methodologies that have the potential to improve on existing design techniques.
Affiliation(s)
- Dan Tulpan
- National Research Council of Canada, Canada
- Roberto Montemanni
- Istituto Dalle Molle di Studi sull’Intelligenza Artificiale (IDSIA), Switzerland
8
Pathway analysis in microarray data: a comparison of two different pathway analysis devices in the same data set. Shock 2010; 35:245-51. [PMID: 20926982 DOI: 10.1097/shk.0b013e3181fc904d] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Indexed: 02/03/2023]
Abstract
Oligonucleotide microarray technology has developed into a very powerful and widely favored biotechnique. However, judging the potential biological meaning of such extensive amounts of data is a clear challenge. Various software applications, commercial or free, are available for pathway analysis of microarray data. The aim of the present study was to test whether pathway analyses of the same data set using different commercially available tools lead to roughly comparable or massively diverging results and, if they diverge, to give potential explanations. Two commercially available pathway analysis programs (GeneGo and Pathway Studio 6) were selected. The programs were compared with respect to their analysis tools, underlying databases, database construction, and network-building algorithms. The same data set was uploaded into both programs. Pathway analysis was performed according to the following three criteria: the five top networks, the five top diseases, and the five top canonical networks associated with the uploaded gene list. The programs differ in how they extract information from the literature, in database construction, and in network-building algorithms. The "top networks" that the programs suggest as "most important" differ substantially from each other and share only one gene. Concerning the most represented diseases in the data set, there are certain overlaps but no uniform results across the applications. Pathway analysis of microarray data using ready-made software tools offers valuable options for investigating the biological relevance and function of a focus gene set. However, there is no standard for constructing such programs, which leads to substantial differences when the same data set is investigated with different tools. The intention of this work is to raise awareness of the potential and also the pitfalls of pathway analysis using automated software tools.
9
Ooi CH, Chetty M, Teng SW. Degree of differential prioritization: prediction for multiclass molecular classification. IEEE Engineering in Medicine and Biology Magazine: The Quarterly Magazine of the Engineering in Medicine & Biology Society 2009; 28:45-51. [PMID: 19622424 DOI: 10.1109/memb.2009.932917] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/28/2023]
Affiliation(s)
- Chia Huey Ooi
- Duke-NUS Graduate Medical School Singapore, 2 Jalan Bukit Merah, Singapore.
10
Normalization method for transcriptional studies of heterogeneous samples--simultaneous array normalization and identification of equivalent expression. Stat Appl Genet Mol Biol 2009; 8:Article 10. [PMID: 19222377 DOI: 10.2202/1544-6115.1339] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Indexed: 11/18/2022]
Abstract
Normalization is an important step in the analysis of microarray data of transcription profiles, as systematic non-biological variations often arise from the multiple steps involved in any transcription profiling experiment. Existing methods for data normalization often assume that differential expression is rare or symmetric, but this assumption does not always hold. Alternatively, non-differentially expressed genes may be used for array normalization. However, it is unknown at the outset which genes are non-differentially expressed. In this paper we propose a hierarchical mixture model framework to simultaneously identify non-differentially expressed genes and normalize arrays using these genes. The Fisher information matrix corresponding to array effects is derived, which provides useful intuition for guiding the choice of array normalization method. The operating characteristics of the proposed method are evaluated using simulated data. The simulations, conducted under a wide range of parametric configurations, suggest that the proposed method provides a useful alternative for array normalization. For example, the proposed method has better sensitivity than median normalization under modest prevalence of differentially expressed genes and when the magnitudes of over-expression and under-expression are not the same. Further, the proposed method has properties similar to median normalization when the prevalence of differentially expressed genes is very small. An empirical illustration of the proposed method is provided using a liposarcoma study from MSKCC to identify genes differentially expressed between normal fat tissue and liposarcoma tissue samples.
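For orientation, the median normalization that this abstract uses as its baseline can be sketched as follows. This is only an illustration of the baseline, not of the hierarchical mixture model the authors propose; the function name and the toy log-intensities are assumptions.

```python
from statistics import median


def median_normalize(log_arrays):
    """Centre each array of log-intensities on its own median.

    Assumes the median gene is not differentially expressed, which is the
    assumption the abstract says does not always hold.
    """
    normalized = []
    for values in log_arrays:
        center = median(values)
        normalized.append([v - center for v in values])
    return normalized


print(median_normalize([[1.0, 2.0, 3.0], [4.0, 5.0, 9.0]]))
# [[-1.0, 0.0, 1.0], [-1.0, 0.0, 4.0]]
```

If many genes are over-expressed on one array, its median shifts upward and the subtraction pulls unchanged genes below zero, which is the situation where a baseline like this would be expected to lose sensitivity.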
11
Martin-Requena V, Muñoz-Merida A, Claros MG, Trelles O. PreP+07: improvements of a user friendly tool to preprocess and analyse microarray data. BMC Bioinformatics 2009; 10:16. [PMID: 19134227 PMCID: PMC2657788 DOI: 10.1186/1471-2105-10-16] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Received: 08/29/2008] [Accepted: 01/12/2009] [Indexed: 11/21/2022] Open
Abstract
Background Microarray gene expression analysis is a widely used technology, but the final interpretation of its results usually requires the participation of a specialist, because it demands a background in statistics that most users lack or have only a vague notion of. Moreover, programming skills may also be essential to analyse these data. An interactive, easy-to-use application therefore seems necessary to help researchers extract full information from their data and analyse them in a simple, powerful and reliable way. Results PreP+07 is a standalone Windows XP application that presents a friendly interface for spot filtration, inter- and intra-slide normalization, duplicate resolution, dye-swapping, error removal and statistical analyses. Additionally, it contains two unique implementations of procedures (double scan and Supervised Lowess) and a complete set of graphical representations (MA plot, RG plot, QQ plot, PP plot, PN plot), and it can deal with many data formats, such as tabulated text, GenePix GPR and ArrayPRO. PreP+07 performance has been compared with the equivalent functions in Bioconductor using a tomato chip with 13056 spots. The numbers of differentially expressed genes, based on p-values from PreP+07 and from the Bioconductor Limma package, were statistically identical when the data set was only normalized; however, slight variability was observed when the data were both normalized and scaled. Conclusion The PreP+07 implementation provides a high degree of freedom in selecting and organizing a small set of widely used data processing protocols, and can handle many data formats. Its reliability has been demonstrated, so a laboratory researcher can carry out statistical pre-processing of his/her microarray results and obtain a list of differentially expressed genes using PreP+07 without any programming skills. All of this supports scientists who have been using previous PreP releases since the first version in 2003.
12
Fundel K, Küffner R, Aigner T, Zimmer R. Normalization and gene p-value estimation: issues in microarray data processing. Bioinform Biol Insights 2008; 2:291-305. [PMID: 19812783 PMCID: PMC2735944 DOI: 10.4137/bbi.s441] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.9] [Indexed: 12/14/2022] Open
Abstract
Introduction Numerous methods exist for basic processing, e.g. normalization, of microarray gene expression data. These methods have an important effect on the final analysis outcome. Therefore, it is crucial to select methods appropriate for a given dataset in order to assure the validity and reliability of expression data analysis. Furthermore, biological interpretation requires expression values for genes, which are often represented by several spots or probe sets on a microarray. How best to integrate spot/probe set values into gene values has so far been a somewhat neglected problem. Results We present a case study comparing different between-array normalization methods with respect to the identification of differentially expressed genes. Our results show that it is feasible and necessary to use prior knowledge on gene expression measurements to select an adequate normalization method for the given data. Furthermore, we provide evidence that combining spot/probe set p-values into gene p-values for detecting differentially expressed genes has advantages compared to combining expression values for spots/probe sets into gene expression values. The comparison of different methods suggests using Stouffer’s method for this purpose. The study has been conducted on gene expression experiments investigating human joint cartilage samples of osteoarthritis-related groups: a cDNA microarray (83 samples, four groups) and an Affymetrix (26 samples, two groups) data set. Conclusion The apparently straightforward steps of gene expression data analysis, e.g. between-array normalization and detection of differentially regulated genes, can be accomplished by numerous different methods. We analyzed multiple methods and their possible effects, and thereby demonstrate the importance of the individual decisions taken during data processing. We give guidelines for evaluating normalization outcomes. An overview of these effects via appropriate measures and plots, compared to prior knowledge, is essential for the biological interpretation of gene expression measurements.
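As a rough illustration of combining spot/probe set p-values into one gene p-value, the unweighted form of Stouffer's method converts each p-value to a z-score, sums the z-scores, and rescales by the square root of their count. The sketch below assumes one-sided p-values strictly between 0 and 1; the function name is hypothetical, and whether the study used weights or another variant is not stated in the abstract.

```python
from math import sqrt
from statistics import NormalDist


def stouffer_combine(p_values):
    """Combine one-sided p-values (each strictly in (0, 1)) into one p-value."""
    if not p_values:
        raise ValueError("at least one p-value is required")
    std_normal = NormalDist()  # mean 0, standard deviation 1
    # A small p-value maps to a large positive z-score.
    z_scores = [std_normal.inv_cdf(1.0 - p) for p in p_values]
    combined_z = sum(z_scores) / sqrt(len(z_scores))
    return 1.0 - std_normal.cdf(combined_z)


# Three probe sets for a hypothetical gene, each only moderately significant.
print(stouffer_combine([0.04, 0.06, 0.05]))
```

Several moderately small p-values reinforce each other, so the combined value is smaller than any single input, whereas averaging expression values first could hide such consistent but weak evidence.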
Affiliation(s)
- Katrin Fundel
- Institut für Informatik, Ludwig-Maximilians-Universität München, Amalienstrasse 17, 80333 München, Germany
13
Xiong H, Zhang D, Martyniuk CJ, Trudeau VL, Xia X. Using generalized procrustes analysis (GPA) for normalization of cDNA microarray data. BMC Bioinformatics 2008; 9:25. [PMID: 18199333 PMCID: PMC2275243 DOI: 10.1186/1471-2105-9-25] [Citation(s) in RCA: 32] [Impact Index Per Article: 1.9] [Received: 09/12/2007] [Accepted: 01/16/2008] [Indexed: 01/16/2023] Open
Abstract
BACKGROUND Normalization is essential in dual-labelled microarray data analysis to remove non-biological variations and systematic biases. Many normalization methods have been used to remove such biases within slides (Global, Lowess) and across slides (Scale, Quantile and VSN). However, all these popular approaches have critical assumptions about data distribution, which is often not valid in practice. RESULTS In this study, we propose a novel assumption-free normalization method based on the Generalized Procrustes Analysis (GPA) algorithm. Using experimental and simulated normal microarray data and boutique array data, we systemically evaluate the ability of the GPA method in normalization compared with six other popular normalization methods including Global, Lowess, Scale, Quantile, VSN, and one boutique array-specific housekeeping gene method. The assessment of these methods is based on three different empirical criteria: across-slide variability, the Kolmogorov-Smirnov (K-S) statistic and the mean square error (MSE). Compared with other methods, the GPA method performs effectively and consistently better in reducing across-slide variability and removing systematic bias. CONCLUSION The GPA method is an effective normalization approach for microarray data analysis. In particular, it is free from the statistical and biological assumptions inherent in other normalization methods that are often difficult to validate. Therefore, the GPA method has a major advantage in that it can be applied to diverse types of array sets, especially to the boutique array where the majority of genes may be differentially expressed.
Affiliation(s)
- Huiling Xiong
- Centre for Advanced Research in Environmental Genomics, Department of Biology, University of Ottawa, Ottawa, Ontario, K1N 6N5, Canada.
14
Meade KG, Gormley E, Doyle MB, Fitzsimons T, O'Farrelly C, Costello E, Keane J, Zhao Y, MacHugh DE. Innate gene repression associated with Mycobacterium bovis infection in cattle: toward a gene signature of disease. BMC Genomics 2007; 8:400. [PMID: 17974019 PMCID: PMC2213678 DOI: 10.1186/1471-2164-8-400] [Citation(s) in RCA: 57] [Impact Index Per Article: 3.2] [Received: 05/11/2007] [Accepted: 10/31/2007] [Indexed: 01/04/2023] Open
Abstract
Background Bovine tuberculosis is an enduring disease of cattle that has significant repercussions for human health. The advent of high-throughput functional genomics technologies has facilitated large-scale analyses of the immune response to this disease that may ultimately lead to novel diagnostics and therapeutic targets. Analysis of mRNA abundance in peripheral blood mononuclear cells (PBMC) from six Mycobacterium bovis infected cattle and six non-infected controls was performed. A targeted immunospecific bovine cDNA microarray with duplicated spot features representing 1,391 genes was used to test the hypothesis that a distinct gene expression profile may exist in M. bovis infected animals in vivo. Results In total, 378 gene features were differentially expressed at the P ≤ 0.05 level in bovine tuberculosis (BTB)-infected and control animals, of which 244 were expressed at lower levels (65%) in the infected group. Lower relative expression of key innate immune genes, including the Toll-like receptor 2 (TLR2) and TLR4 genes, lack of differential expression of indicator adaptive immune gene transcripts (IFNG, IL2, IL4), and lower BOLA major histocompatibility complex – class I (BOLA) and class II (BOLA-DRA) gene expression was consistent with innate immune gene repression in the BTB-infected animals. Supervised hierarchical cluster analysis and class prediction validation identified a panel of 15 genes predictive of disease status and selected gene transcripts were validated (n = 8 per group) by real time quantitative reverse transcription PCR. Conclusion These results suggest that large-scale expression profiling can identify gene signatures of disease in peripheral blood that can be used to classify animals on the basis of in vivo infection, in the absence of exogenous antigenic stimulation.
Affiliation(s)
- Kieran G Meade
- Education and Research Centre, St. Vincent's University Hospital, Dublin 4, Ireland.
15
Weeraratna AT, Taub DD. Microarray data analysis: an overview of design, methodology, and analysis. Methods Mol Biol 2007; 377:1-16. [PMID: 17634607 DOI: 10.1007/978-1-59745-390-5_1] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.4] [Indexed: 12/17/2022]
Abstract
Microarray analysis results in the gathering of massive amounts of information concerning gene expression profiles of different cells and experimental conditions. Analyzing these data can often be a quagmire, with endless discussion as to what the appropriate statistical analyses for any given experiment might be. As a result many different methods of data analysis have evolved, the basics of which are outlined in this chapter.
Affiliation(s)
- Ashani T Weeraratna
- Laboratory of Immunology, National Institutes of Health, National Institute on Aging, Gerontology Research Center, Baltimore, MD, USA
16
Kim SY, Lee JW, Bae JS. Effect of data normalization on fuzzy clustering of DNA microarray data. BMC Bioinformatics 2006; 7:134. [PMID: 16533412 PMCID: PMC1431564 DOI: 10.1186/1471-2105-7-134] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.1] [Received: 08/23/2005] [Accepted: 03/14/2006] [Indexed: 11/10/2022] Open
Abstract
Background: Microarray technology has made it possible to simultaneously measure the expression levels of large numbers of genes in a short time. Gene expression data is information rich; however, extensive data mining is required to identify the patterns that characterize the underlying mechanisms of action. Clustering is an important tool for finding groups of genes with similar expression patterns in microarray data analysis. However, hard clustering methods, which assign each gene exactly to one cluster, are poorly suited to the analysis of microarray datasets because in such datasets the clusters of genes frequently overlap. Results: In this study we applied the fuzzy partitional clustering method known as Fuzzy C-Means (FCM) to overcome the limitations of hard clustering. To identify the effect of data normalization, we used three normalization methods, the two common scale and location transformations and Lowess normalization methods, to normalize three microarray datasets and three simulated datasets. First we determined the optimal parameters for FCM clustering. We found that the optimal fuzzification parameter in the FCM analysis of a microarray dataset depended on the normalization method applied to the dataset during preprocessing. We additionally evaluated the effect of normalization of noisy datasets on the results obtained when hard clustering or FCM clustering was applied to those datasets. The effects of normalization were evaluated using both simulated datasets and microarray datasets. A comparative analysis showed that the clustering results depended on the normalization method used and the noisiness of the data. In particular, the selection of the fuzzification parameter value for the FCM method was sensitive to the normalization method used for datasets with large variations across samples. Conclusion: Lowess normalization is more robust for clustering of genes from general microarray data than the two common scale and location adjustment methods when samples have varying expression patterns or are noisy. In particular, the FCM method slightly outperformed the hard clustering methods when the expression patterns of genes overlapped and was advantageous in finding co-regulated genes. Thus, the FCM approach offers a convenient method for finding subsets of genes that are strongly associated to a given cluster.
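Illustrative note: the fuzzy c-means procedure named in this abstract can be sketched as below. This is a minimal, generic textbook version and not the authors' implementation; the function name `fuzzy_c_means`, its defaults, and the toy data are assumptions made for illustration. The parameter `m` plays the role of the fuzzification parameter discussed above, and each point receives a graded membership in every cluster instead of a single hard assignment.

```python
import math
import random


def fuzzy_c_means(points, n_clusters, m=2.0, n_iter=100, seed=0):
    """Generic fuzzy c-means sketch; returns (centers, memberships)."""
    rng = random.Random(seed)
    # Random initial memberships; each row is normalized to sum to 1.
    memberships = []
    for _ in points:
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        memberships.append([v / total for v in row])

    dim = len(points[0])
    exponent = 2.0 / (m - 1.0)
    centers = []
    for _ in range(n_iter):
        # Each center is the mean of all points weighted by membership ** m.
        centers = []
        for i in range(n_clusters):
            weights = [u[i] ** m for u in memberships]
            total = sum(weights)
            centers.append([
                sum(w * p[d] for w, p in zip(weights, points)) / total
                for d in range(dim)
            ])
        # Memberships are recomputed from relative distances to the centers.
        for k, p in enumerate(points):
            dists = [max(math.dist(p, c), 1e-12) for c in centers]
            memberships[k] = [
                1.0 / sum((dists[i] / dj) ** exponent for dj in dists)
                for i in range(n_clusters)
            ]
    return centers, memberships


# Toy data (hypothetical): two well-separated groups of 2-D points.
toy = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers, memberships = fuzzy_c_means(toy, n_clusters=2)
```

Values of `m` close to 1 push memberships toward hard assignments, while larger values flatten them, which is one reason the choice of `m` can interact with how the data were normalized.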
Affiliation(s)
- Seo Young Kim
- Research Institute for Basic Science, Chonnam National University, Gwangju, 500-757, Korea
- Jae Won Lee
- Department of Statistics, Korea University, Seoul, Korea
- Jong Sung Bae
- Department of Statistics, Chonnam National University, Gwangju, 500-757, Korea
|
17
|
Engelen K, Naudts B, De Moor B, Marchal K. A calibration method for estimating absolute expression levels from microarray data. Bioinformatics 2006; 22:1251-8. [PMID: 16522672 DOI: 10.1093/bioinformatics/btl068] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.3] [Indexed: 11/13/2022]
Abstract
Motivation: We describe an approach to normalize spotted microarray data, based on a physically motivated calibration model. This model consists of two major components, describing the hybridization of target transcripts to their corresponding probes on the one hand, and the measurement of fluorescence from the hybridized, labeled target on the other hand. The model parameters and error distributions are estimated from external control spikes. Results: Using a publicly available dataset, we show that our procedure is capable of adequately removing the typical non-linearities of the data, without making any assumptions on the distribution of differences in gene expression from one biological sample to the next. Since our model links target concentration to measured intensity, we show how absolute expression values of target transcripts in the hybridization solution can be estimated up to a certain degree.
Affiliation(s)
- Kristof Engelen
- BIOI@SCD, Department of Electrical Engineering K.U.Leuven, Kasteelpark Arenberg 10, B-3001 Leuven, Belgium
|
18
|
Galbraith DW, Birnbaum K. Global studies of cell type-specific gene expression in plants. Annu Rev Plant Biol 2006; 57:451-75. [PMID: 16669770 DOI: 10.1146/annurev.arplant.57.032905.105302] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.8] [Indexed: 05/09/2023]
Abstract
Technological advances in expression profiling and in the ability to collect minute quantities of tissues have come together to allow a growing number of global transcriptional studies at the cell level in plants. Microarray technology, with a choice of cDNA or oligo-based slides, is now well established, with commercial full-genome platforms for rice and Arabidopsis and extensive expressed sequence tag (EST)-based designs for many other species. Microdissection and cell sorting are two established methodologies that have been used in conjunction with microarrays to provide an early glimpse of the transcriptional landscape at the level of individual cell types. The results indicate that much of the transcriptome is compartmentalized. A minor but consistent percentage of transcripts appear to be unique to specific cell types. Functional analyses of cell-specific patterns of gene expression are providing important clues to cell-specific functions. The spatial dissection of the transcriptome has also yielded insights into the localized mediators of hormone inputs and promises to provide detail on cell-specific effects of microRNAs.
Affiliation(s)
- David W Galbraith
- Department of Plant Sciences and Bio5 Institute, University of Arizona, Tucson, Arizona 85721, USA.
|
19
|
Riva A, Carpentier AS, Torrésani B, Hénaut A. Comments on selected fundamental aspects of microarray analysis. Comput Biol Chem 2005; 29:319-36. [PMID: 16219488 DOI: 10.1016/j.compbiolchem.2005.08.006] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.2] [Received: 05/26/2005] [Revised: 08/18/2005] [Accepted: 08/18/2005] [Indexed: 11/17/2022]
Abstract
Microarrays are becoming a ubiquitous tool of research in life sciences. However, the working principles of microarray-based methodologies are often misunderstood or apparently ignored by the researchers who actually perform and interpret experiments. This in turn seems to lead to a common over-expectation regarding the explanatory and/or knowledge-generating power of microarray analyses. In this note we intend to explain basic principles of five (5) major groups of analytical techniques used in studies of microarray data and their interpretation: the principal component analysis (PCA), the independent component analysis (ICA), the t-test, the analysis of variance (ANOVA), and self organizing maps (SOM). We discuss answers to selected practical questions related to the analysis of microarray data. We also take a closer look at the experimental setup and the rules, which have to be observed in order to exploit microarrays efficiently. Finally, we discuss in detail the scope and limitations of microarray-based methods. We emphasize the fact that no amount of statistical analysis can compensate for (or replace) a well thought through experimental setup. We conclude that microarrays are indeed useful tools in life sciences but by no means should they be expected to generate complete answers to complex biological questions. We argue that even well posed questions, formulated within a microarray-specific terminology, cannot be completely answered with the use of microarray analyses alone.
Affiliation(s)
- Alessandra Riva
- Laboratoire Génome et Informatique UMR 8116 Tour Evry2, 523 Place des Terrasses, 91034 Evry Cedex, France.
|