Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Cope LM, Irizarry RA, Jaffee HA, Wu Z, Speed TP. A benchmark for Affymetrix GeneChip expression measures. Bioinformatics 2004;20:323-31. [PMID: 14960458 DOI: 10.1093/bioinformatics/btg410] [Citation(s) in RCA: 214] [Impact Index Per Article: 10.7] [Indexed: 11/15/2022]

Cited by Other Article(s)

Madill-Thomsen K, Halloran P. Precision diagnostics in transplanted organs using microarray-assessed gene expression: concepts and technical methods of the Molecular Microscope® Diagnostic System (MMDx). Clin Sci (Lond) 2024;138:663-685. [PMID: 38819301 PMCID: PMC11147747 DOI: 10.1042/cs20220530] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/22/2024] [Revised: 04/26/2024] [Accepted: 05/02/2024] [Indexed: 06/01/2024]

Abstract

There is a major unmet need for improved accuracy and precision in the assessment of transplant rejection and tissue injury. Diagnoses relying on histologic and visual assessments demonstrate significant variation between expert observers (as represented by low kappa values) and have limited ability to assess many biological processes that produce little histologic changes, for example, acute injury. Consensus rules and guidelines for histologic diagnosis are useful but may have errors. Risks of over- or under-treatment can be serious: many therapies for transplant rejection or primary diseases are expensive and carry risk for significant adverse effects. Improved diagnostic methods could alleviate healthcare costs by reducing treatment errors, increase treatment efficacy, and serve as useful endpoints for clinical trials of new agents that can improve outcomes. Molecular diagnostic assessments using microarrays combined with machine learning algorithms for interpretation have shown promise for increasing diagnostic precision via probabilistic assessments, recalibrating standard of care diagnostic methods, clarifying ambiguous cases, and identifying potentially missed cases of rejection. This review describes the development and application of the Molecular Microscope® Diagnostic System (MMDx), and discusses the history and reasoning behind many common methods, statistical practices, and computational decisions employed to ensure that MMDx scores are as accurate and precise as possible. MMDx provides insights on disease processes and highly reproducible results from a comparatively small amount of tissue and constitutes a general approach that is useful in many areas of medicine, including kidney, heart, lung, and liver transplants, with the possibility of extrapolating lessons for understanding native organ disease states.

Gene Expression Signature Associated with Clinical Outcome in ALK-Positive Anaplastic Large Cell Lymphoma. Cancers (Basel) 2021;13:5523. [PMID: 34771686 PMCID: PMC8582782 DOI: 10.3390/cancers13215523] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/27/2021] [Revised: 10/15/2021] [Accepted: 10/28/2021] [Indexed: 11/17/2022]

Abstract

Simple Summary

Anaplastic large cell lymphomas associated with ALK translocation have a good outcome after CHOP treatment; however, the 2-year relapse rate remains at 30%. Microarray gene-expression profiling, high throughput RT-qPCR, and RNA sequencing of 48 ALK-positive anaplastic large cell lymphoma (ALK⁺ ALCL) samples obtained at diagnosis enable the identification of genes associated with clinical outcome. More particularly, our molecular signatures indicate that the FN1 gene, a matrix key regulator, might also be involved in the prognosis and the therapeutic response in anaplastic lymphomas.

Abstract

Anaplastic large cell lymphomas associated with ALK translocation have a good outcome after CHOP treatment; however, the 2-year relapse rate remains at 30%. Microarray gene-expression profiling of 48 samples obtained at diagnosis was used to identify 47 genes that were differentially expressed between patients with early relapse/progression and no relapse. In the relapsing group, the most significant overrepresented genes were related to the regulation of the immune response and T-cell activation while those in the non-relapsing group were involved in the extracellular matrix. Fluidigm technology gave concordant results for 29 genes, of which FN1, FAM179A, and SLC40A1 had the strongest predictive power after logistic regression and two classification algorithms. In parallel with 39 samples, we used a Kallisto/Sleuth pipeline to analyze RNA sequencing data and identified 20 genes common to the 28 genes validated by Fluidigm technology—notably, the FAM179A and FN1 genes. Interestingly, FN1 also belongs to the gene signature predicting longer survival in diffuse large B-cell lymphomas treated with CHOP. Thus, our molecular signatures indicate that the FN1 gene, a matrix key regulator, might also be involved in the prognosis and the therapeutic response in anaplastic lymphomas.

Lu M. An embedded method for gene identification problems involving unwanted data heterogeneity. Hum Genomics 2019;13:45. [PMID: 31639059 PMCID: PMC6805328 DOI: 10.1186/s40246-019-0228-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/10/2022]

Weber LM, Saelens W, Cannoodt R, Soneson C, Hapfelmeier A, Gardner PP, Boulesteix AL, Saeys Y, Robinson MD. Essential guidelines for computational method benchmarking. Genome Biol 2019;20:125. [PMID: 31221194 PMCID: PMC6584985 DOI: 10.1186/s13059-019-1738-8] [Citation(s) in RCA: 77] [Impact Index Per Article: 15.4] [Indexed: 12/14/2022]

Tian L, Dong X, Freytag S, Lê Cao KA, Su S, JalalAbadi A, Amann-Zalcenstein D, Weber TS, Seidi A, Jabbari JS, Naik SH, Ritchie ME. Benchmarking single cell RNA-sequencing analysis pipelines using mixture control experiments. Nat Methods 2019;16:479-487. [DOI: 10.1038/s41592-019-0425-8] [Citation(s) in RCA: 183] [Impact Index Per Article: 36.6] [Received: 10/03/2018] [Accepted: 04/18/2019] [Indexed: 11/09/2022]

Ahmed W, Malik MFA, Saeed M, Haq F. Copy number profiling of Oncotype DX genes reveals association with survival of breast cancer patients. Mol Biol Rep 2018;45:2185-2192. [PMID: 30225582 DOI: 10.1007/s11033-018-4379-1] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 08/07/2018] [Accepted: 09/10/2018] [Indexed: 12/17/2022]

Transcriptomic analysis of the heat stress response for a commercial baker's yeast Saccharomyces cerevisiae. Genes Genomics 2018;40:137-150. [PMID: 29892925 DOI: 10.1007/s13258-017-0616-6] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 05/31/2017] [Accepted: 10/01/2017] [Indexed: 10/18/2022]

Holik AZ, Law CW, Liu R, Wang Z, Wang W, Ahn J, Asselin-Labat ML, Smyth GK, Ritchie ME. RNA-seq mixology: designing realistic control experiments to compare protocols and analysis methods. Nucleic Acids Res 2017;45:e30. [PMID: 27899618 PMCID: PMC5389713 DOI: 10.1093/nar/gkw1063] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.7] [Received: 07/18/2016] [Accepted: 10/24/2016] [Indexed: 11/25/2022]

Affiliation(s)

Aliaksei Z Holik ACRF Stem Cells and Cancer Division, The Walter and Eliza Hall Institute of Medical Research, 1G Royal Parade, Parkville, Victoria 3052, Australia.,Department of Medical Biology, The University of Melbourne, Parkville, Victoria 3010, Australia
Charity W Law Department of Medical Biology, The University of Melbourne, Parkville, Victoria 3010, Australia.,Molecular Medicine Division, The Walter and Eliza Hall Institute of Medical Research, 1G Royal Parade, Parkville, Victoria 3052, Australia
Ruijie Liu Molecular Medicine Division, The Walter and Eliza Hall Institute of Medical Research, 1G Royal Parade, Parkville, Victoria 3052, Australia
Zeya Wang Statistics Department, George R. Brown School of Engineering, Rice University, 6100 Main Street, Duncan Hall 2124, Houston, TX 77005, USA.,Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, 1515 Holcombe Boulevard, Houston, TX 77030, USA
Wenyi Wang Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, 1515 Holcombe Boulevard, Houston, TX 77030, USA
Jaeil Ahn Department of Biostatistics, Bioinformatics and Biomathematics, Georgetown University School of Medicine, 4000 Reservoir Road NW, Washington, DC 20057, USA
Marie-Liesse Asselin-Labat ACRF Stem Cells and Cancer Division, The Walter and Eliza Hall Institute of Medical Research, 1G Royal Parade, Parkville, Victoria 3052, Australia.,Department of Medical Biology, The University of Melbourne, Parkville, Victoria 3010, Australia
Gordon K Smyth Bioinformatics Division, The Walter and Eliza Hall Institute of Medical Research, 1G Royal Parade, Parkville, Victoria 3052, Australia.,School of Mathematics and Statistics, The University of Melbourne, Parkville, Victoria 3010, Australia
Matthew E Ritchie Department of Medical Biology, The University of Melbourne, Parkville, Victoria 3010, Australia.,Molecular Medicine Division, The Walter and Eliza Hall Institute of Medical Research, 1G Royal Parade, Parkville, Victoria 3052, Australia.,School of Mathematics and Statistics, The University of Melbourne, Parkville, Victoria 3010, Australia

Strbenac D, Zhong L, Raftery MJ, Wang P, Wilson SR, Armstrong NJ, Yang JYH. Quantitative Performance Evaluator for Proteomics (QPEP): Web-based Application for Reproducible Evaluation of Proteomics Preprocessing Methods. J Proteome Res 2017;16:2359-2369. [DOI: 10.1021/acs.jproteome.6b00882] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Indexed: 01/15/2023]

Pan JC, Huang Y, Hwang JG. Estimation of selected parameters. Comput Stat Data Anal 2017. [DOI: 10.1016/j.csda.2016.11.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/20/2022]

Teng M, Love MI, Davis CA, Djebali S, Dobin A, Graveley BR, Li S, Mason CE, Olson S, Pervouchine D, Sloan CA, Wei X, Zhan L, Irizarry RA. A benchmark for RNA-seq quantification pipelines. Genome Biol 2016;17:74. [PMID: 27107712 PMCID: PMC4842274 DOI: 10.1186/s13059-016-0940-1] [Citation(s) in RCA: 119] [Impact Index Per Article: 14.9] [Received: 08/06/2015] [Accepted: 04/08/2016] [Indexed: 02/07/2023]

Affiliation(s)

Mingxiang Teng Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, 450 Brookline Avenue, Boston, MA, 02215, USA.,Department of Biostatistics, Harvard TH Chan School of Public Health, 677 Huntington Avenue, Boston, MA, 02115, USA.,School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China
Michael I Love Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, 450 Brookline Avenue, Boston, MA, 02215, USA.,Department of Biostatistics, Harvard TH Chan School of Public Health, 677 Huntington Avenue, Boston, MA, 02115, USA
Carrie A Davis Functional Genomics Group, Cold Spring Harbor Laboratory, 1 Bungtown Road, Cold Spring Harbor, NY, 11724, USA
Sarah Djebali Bioinformatics and Genomics Programme, Centre for Genomic Regulation (CRG) and UPF, Doctor Aiguader, 88, Barcelona, 08003, Spain
Alexander Dobin Functional Genomics Group, Cold Spring Harbor Laboratory, 1 Bungtown Road, Cold Spring Harbor, NY, 11724, USA
Brenton R Graveley Department of Genetics and Genome Sciences, Institute for System Genomics, UConn Health Center, Farmington, CT, 06030, USA
Sheng Li Department of Physiology and Biophysics, Weill Cornell Medical College, New York, New York, USA
Christopher E Mason Department of Physiology and Biophysics, Weill Cornell Medical College, New York, New York, USA
Sara Olson Department of Genetics and Genome Sciences, Institute for System Genomics, UConn Health Center, Farmington, CT, 06030, USA
Dmitri Pervouchine Bioinformatics and Genomics Programme, Centre for Genomic Regulation (CRG) and UPF, Doctor Aiguader, 88, Barcelona, 08003, Spain
Cricket A Sloan Department of Genetics, Stanford University, 300 Pasteur Drive, MC-5477, Stanford, CA, 94305, USA
Xintao Wei Department of Genetics and Genome Sciences, Institute for System Genomics, UConn Health Center, Farmington, CT, 06030, USA
Lijun Zhan Department of Genetics and Genome Sciences, Institute for System Genomics, UConn Health Center, Farmington, CT, 06030, USA
Rafael A Irizarry Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, 450 Brookline Avenue, Boston, MA, 02215, USA.,Department of Biostatistics, Harvard TH Chan School of Public Health, 677 Huntington Avenue, Boston, MA, 02115, USA

Depiereux S, De Meulder B, Bareke E, Berger F, Le Gac F, Depiereux E, Kestemont P. Adaptation of a Bioinformatics Microarray Analysis Workflow for a Toxicogenomic Study in Rainbow Trout. PLoS One 2015;10:e0128598. [PMID: 26186543 PMCID: PMC4506078 DOI: 10.1371/journal.pone.0128598] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 09/30/2014] [Accepted: 04/28/2015] [Indexed: 12/26/2022]

Abstract

Sex steroids play a key role in triggering sex differentiation in fish, the use of exogenous hormone treatment leading to partial or complete sex reversal. This phenomenon has attracted attention since the discovery that even low environmental doses of exogenous steroids can adversely affect gonad morphology (ovotestis development) and induce reproductive failure. Modern genomic-based technologies have enhanced opportunities to find out mechanisms of actions (MOA) and identify biomarkers related to the toxic action of a compound. However, high throughput data interpretation relies on statistical analysis, species genomic resources, and bioinformatics tools. The goals of this study are to improve the knowledge of feminisation in fish, by the analysis of molecular responses in the gonads of rainbow trout fry after chronic exposure to several doses (0.01, 0.1, 1 and 10 μg/L) of ethynylestradiol (EE2) and to offer target genes as potential biomarkers of ovotestis development. We successfully adapted a bioinformatics microarray analysis workflow elaborated on human data to a toxicogenomic study using rainbow trout, a fish species lacking accurate functional annotation and genomic resources. The workflow allowed to obtain lists of genes supposed to be enriched in true positive differentially expressed genes (DEGs), which were subjected to over-representation analysis methods (ORA). Several pathways and ontologies, mostly related to cell division and metabolism, sexual reproduction and steroid production, were found significantly enriched in our analyses. Moreover, two sets of potential ovotestis biomarkers were selected using several criteria. The first group displayed specific potential biomarkers belonging to pathways/ontologies highlighted in the experiment. Among them, the early ovarian differentiation gene foxl2a was overexpressed. The second group, which was highly sensitive but not specific, included the DEGs presenting the highest fold change and lowest p-value of the statistical workflow output. The methodology can be generalized to other (non-model) species and various types of microarray platforms.

Boareto M, Caticha N. t-Test at the Probe Level: An Alternative Method to Identify Statistically Significant Genes for Microarray Data. MICROARRAYS 2014;3:340-51. [PMID: 27600352 PMCID: PMC4979051 DOI: 10.3390/microarrays3040340] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 10/27/2014] [Revised: 11/21/2014] [Accepted: 12/09/2014] [Indexed: 11/16/2022]

Richard AC, Lyons PA, Peters JE, Biasci D, Flint SM, Lee JC, McKinney EF, Siegel RM, Smith KGC. Comparison of gene expression microarray data with count-based RNA measurements informs microarray interpretation. BMC Genomics 2014;15:649. [PMID: 25091430 PMCID: PMC4143561 DOI: 10.1186/1471-2164-15-649] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.6] [Received: 04/04/2014] [Accepted: 07/17/2014] [Indexed: 01/02/2023]

Abstract

Background

Although numerous investigations have compared gene expression microarray platforms, preprocessing methods and batch correction algorithms using constructed spike-in or dilution datasets, there remains a paucity of studies examining the properties of microarray data using diverse biological samples. Most microarray experiments seek to identify subtle differences between samples with variable background noise, a scenario poorly represented by constructed datasets. Thus, microarray users lack important information regarding the complexities introduced in real-world experimental settings. The recent development of a multiplexed, digital technology for nucleic acid measurement enables counting of individual RNA molecules without amplification and, for the first time, permits such a study.

Results

Using a set of human leukocyte subset RNA samples, we compared previously acquired microarray expression values with RNA molecule counts determined by the nCounter Analysis System (NanoString Technologies) in selected genes. We found that gene measurements across samples correlated well between the two platforms, particularly for high-variance genes, while genes deemed unexpressed by the nCounter generally had both low expression and low variance on the microarray. Confirming previous findings from spike-in and dilution datasets, this “gold-standard” comparison demonstrated signal compression that varied dramatically by expression level and, to a lesser extent, by dataset. Most importantly, examination of three different cell types revealed that noise levels differed across tissues.

Conclusions

Microarray measurements generally correlate with relative RNA molecule counts within optimal ranges but suffer from expression-dependent accuracy bias and precision that varies across datasets. We urge microarray users to consider expression-level effects in signal interpretation and to evaluate noise properties in each dataset independently.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-649) contains supplementary material, which is available to authorized users.

De Meulder B, Berger F, Bareke E, Depiereux S, Michiels C, Depiereux E. Meta-analysis and gene set analysis of archived microarrays suggest implication of the spliceosome in metastatic and hypoxic phenotypes. PLoS One 2014;9:e86699. [PMID: 24497970 PMCID: PMC3908947 DOI: 10.1371/journal.pone.0086699] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 09/19/2013] [Accepted: 12/10/2013] [Indexed: 12/17/2022]

Khamiakova T, Shkedy Z, Amaratunga D, Talloen W, Göhlmann H, Bijnens L, Kasim A. Quality control of Platinum Spike dataset by probe-level mixed models. Math Biosci 2014;248:1-10. [DOI: 10.1016/j.mbs.2013.11.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/07/2013] [Revised: 11/20/2013] [Accepted: 11/21/2013] [Indexed: 10/25/2022]

Hossain A, Willan AR, Beyene J. An Improved Method on Wilcoxon Rank Sum Test for Gene Selection from Microarray Experiments. COMMUN STAT-SIMUL C 2013. [DOI: 10.1080/03610918.2012.667479] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 10/27/2022]

Welsh EA, Eschrich SA, Berglund AE, Fenstermacher DA. Iterative rank-order normalization of gene expression microarray data. BMC Bioinformatics 2013;14:153. [PMID: 23647742 PMCID: PMC3651355 DOI: 10.1186/1471-2105-14-153] [Citation(s) in RCA: 96] [Impact Index Per Article: 8.7] [Received: 11/28/2012] [Accepted: 04/29/2013] [Indexed: 11/25/2022]

Abstract

Background

Many gene expression normalization algorithms exist for Affymetrix GeneChip microarrays. The most popular of these is RMA, primarily due to the precision and low noise produced during the process. A significant strength of this and similar approaches is the use of the entire set of arrays during both normalization and model-based estimation of signal. However, this leads to differing estimates of expression based on the starting set of arrays, and estimates can change when a single, additional chip is added to the set. Additionally, outlier chips can impact the signals of other arrays, and can themselves be skewed by the majority of the population.

Results

We developed an approach, termed IRON, which uses the best-performing techniques from each of several popular processing methods while retaining the ability to incrementally renormalize data without altering previously normalized expression. This combination of approaches results in a method that performs comparably to existing approaches on artificial benchmark datasets (i.e. spike-in) and demonstrates promising improvements in segregating true signals within biologically complex experiments.

Conclusions

By combining approaches from existing normalization techniques, the IRON method offers several advantages. First, IRON normalization occurs pair-wise, thereby avoiding the need for all chips to be normalized together, which can be important for large data analyses. Secondly, the technique does not require similarity in signal distribution across chips for normalization, which can be important for maintaining biologically relevant differences in a heterogeneous background. Lastly, IRON introduces fewer post-processing artifacts, particularly in data whose behavior violates common assumptions. Thus, the IRON method provides a practical solution to common needs of expression analysis. A software implementation of IRON is available at [http://gene.moffitt.org/libaffy/].

Hossain A, Willan AR, Beyene J. A flexible nonparametric approach to find candidate genes associated with disease in microarray experiments. J Bioinform Comput Biol 2013;11:1250021. [PMID: 23600812 DOI: 10.1142/s0219720012500217] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 02/05/2023]

Lahti L, Torrente A, Elo LL, Brazma A, Rung J. A fully scalable online pre-processing algorithm for short oligonucleotide microarray atlases. Nucleic Acids Res 2013;41:e110. [PMID: 23563154 PMCID: PMC3664815 DOI: 10.1093/nar/gkt229] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.2] [Indexed: 01/26/2023]

Correction of spatial bias in oligonucleotide array data. Adv Bioinformatics 2013;2013:167915. [PMID: 23573083 PMCID: PMC3610395 DOI: 10.1155/2013/167915] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 07/06/2012] [Accepted: 02/02/2013] [Indexed: 01/17/2023]

Jang GW, Lee KT, Park JE, Kim H, Kim TH, Choi BH, Kim MJ, Lim D. Gene Expression Profiling in Hepatic Tissue of two Pig Breeds. JOURNAL OF ANIMAL SCIENCE AND TECHNOLOGY 2012. [DOI: 10.5187/jast.2012.54.6.383] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/20/2022]

Faust D, Vondráček J, Krčmář P, Šmerdová L, Procházková J, Hrubá E, Hulinková P, Kaina B, Dietrich C, Machala M. AhR-mediated changes in global gene expression in rat liver progenitor cells. Arch Toxicol 2012. [DOI: 10.1007/s00204-012-0979-z] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.3] [Indexed: 01/12/2023]

Nair PS, Vihinen M. VariBench: A Benchmark Database for Variations. Hum Mutat 2012;34:42-9. [DOI: 10.1002/humu.22204] [Citation(s) in RCA: 106] [Impact Index Per Article: 8.8] [Received: 01/20/2012] [Accepted: 07/31/2012] [Indexed: 12/21/2022]

Ghorbel MT, Mokhtari A, Sheikh M, Angelini GD, Caputo M. Controlled reoxygenation cardiopulmonary bypass is associated with reduced transcriptomic changes in cyanotic tetralogy of Fallot patients undergoing surgery. Physiol Genomics 2012;44:1098-106. [PMID: 22991208 DOI: 10.1152/physiolgenomics.00072.2012] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.3] [Indexed: 11/22/2022]

Abstract

In cyanotic patients undergoing repair of heart defects, high level of oxygen during cardiopulmonary bypass (CPB) leads to greater susceptibility to myocardial ischemia and reoxygenation injury. This study investigates the effects of controlled reoxygenation CPB on gene expression changes in cyanotic hearts of patients undergoing surgical correction of tetralogy of Fallot (TOF). We randomized 49 cyanotic TOF patients undergoing corrective cardiac surgery to receive either controlled reoxygenation or hyperoxic/standard CPB. Ventricular myocardium biopsies were obtained immediately after starting and before discontinuing CPB. Microarray analyses were performed on samples, and array results validated with real-time PCR. Gene expression profiles before and after hyperoxic/standard CPB revealed 35 differentially expressed genes with three upregulated and 32 downregulated. Upregulated genes included two E3 Ubiquitin ligases. The products of downregulated genes included intracellular signaling kinases, metabolic process proteins, and transport factors. In contrast, gene expression profiles before and after controlled reoxygenation CPB revealed only 11 differentially expressed genes with 10 upregulated including extracellular matrix proteins, transport factors, and one downregulated. The comparison of gene expression following hyperoxic/standard vs. controlled reoxygenation CPB revealed 59 differentially expressed genes, with six upregulated and 53 downregulated. Upregulated genes included PDE1A, MOSC1, and CRIP3. Downregulated genes functionally clustered into four major classes: extracellular matrix/cell adhesion, transcription, transport, and cellular metabolic process. This study provides direct evidence that hyperoxic CPB decreases the adaptation and remodeling capacity in cyanotic patients undergoing TOF repair. This simple CPB strategy of controlled reoxygenation reduced the number of genes whose expression was altered following hyperoxic/standard CPB.

McCall MN, Almudevar A. Affymetrix GeneChip microarray preprocessing for multivariate analyses. Brief Bioinform 2012;13:536-46. [PMID: 22210854 PMCID: PMC3431718 DOI: 10.1093/bib/bbr072] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.3] [Received: 09/15/2011] [Revised: 11/20/2011] [Indexed: 11/15/2022]

Assessing numerical dependence in gene expression summaries with the jackknife expression difference. PLoS One 2012;7:e39570. [PMID: 22876276 PMCID: PMC3411624 DOI: 10.1371/journal.pone.0039570] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 10/11/2011] [Accepted: 05/27/2012] [Indexed: 11/19/2022]

Zhao Z, Gene Hwang JT. Empirical Bayes false coverage rate controlling confidence intervals. J R Stat Soc Series B Stat Methodol 2012. [DOI: 10.1111/j.1467-9868.2012.01033.x] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.9] [Indexed: 12/31/2022]

Vihinen M. How to evaluate performance of prediction methods? Measures and their interpretation in variation effect analysis. BMC Genomics 2012;13 Suppl 4:S2. [PMID: 22759650 PMCID: PMC3303716 DOI: 10.1186/1471-2164-13-s4-s2] [Citation(s) in RCA: 175] [Impact Index Per Article: 14.6] [Indexed: 11/26/2022]

Abstract

Background

Prediction methods are increasingly used in biosciences to forecast diverse features and characteristics. Binary two-state classifiers are the most common applications. They are usually based on machine learning approaches. For the end user it is often problematic to evaluate the true performance and applicability of computational tools as some knowledge about computer science and statistics would be needed.

Results

Instructions are given on how to interpret and compare method evaluation results. Systematic analysis of method performance requires established benchmark datasets, which contain cases with known outcomes, and suitable evaluation measures. The criteria for benchmark datasets are discussed along with their implementation in VariBench, a benchmark database for variations. There is no single measure that alone could describe all the aspects of method performance. Predictions of genetic variation effects on DNA, RNA and protein level are important as information about variants can be produced much faster than their disease relevance can be experimentally verified. Therefore, numerous prediction tools have been developed; however, systematic analyses of their performance and comparison have just started to emerge.

Conclusions

The end users of prediction tools should be able to understand how evaluation is done and how to interpret the results. Six main performance evaluation measures are introduced. These include sensitivity, specificity, positive predictive value, negative predictive value, accuracy and Matthews correlation coefficient. Together with receiver operating characteristics (ROC) analysis they provide a good picture of the performance of methods and allow their objective and quantitative comparison. A checklist of items to look at is provided. Comparisons of methods for missense variant tolerance, protein stability changes due to amino acid substitutions, and effects of variations on mRNA splicing are presented.
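The six measures named above can all be derived from the four cells of a binary confusion matrix. The following is a minimal sketch using the standard textbook definitions; the function name `binary_performance` and the example counts are illustrative assumptions and do not come from the cited article.

```python
import math


def binary_performance(tp, tn, fp, fn):
    """Compute six common measures from confusion-matrix counts.

    tp, tn, fp, fn: true positives, true negatives, false positives,
    false negatives. The name and interface are hypothetical.
    """
    def ratio(num, den):
        # Return NaN instead of raising when a denominator is zero.
        return num / den if den else float("nan")

    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "sensitivity": ratio(tp, tp + fn),          # true positive rate
        "specificity": ratio(tn, tn + fp),          # true negative rate
        "ppv": ratio(tp, tp + fp),                  # positive predictive value
        "npv": ratio(tn, tn + fn),                  # negative predictive value
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "mcc": ratio(tp * tn - fp * fn, mcc_den),   # Matthews correlation
    }


# Illustrative counts only (not data from any cited study).
measures = binary_performance(tp=40, tn=45, fp=5, fn=10)
```

Returning NaN for an undefined ratio is one possible design choice; it keeps the function usable on degenerate test sets where, for example, no positives are predicted.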

An integrated framework to model cellular phenotype as a component of biochemical networks. Adv Bioinformatics 2011;2011:608295. [PMID: 22190923 PMCID: PMC3235418 DOI: 10.1155/2011/608295] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/23/2011] [Accepted: 08/26/2011] [Indexed: 11/25/2022] Open

Lim D, Lee KT, Park JE, Kim H, Kim TH, Choi BH, Kim MJ, Park HS, Jang GW. Analysis of gene expression profiles from subcutaneous adipose tissue of two pig breeds. Genes Genomics 2011. [DOI: 10.1007/s13258-011-0083-4] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/07/2023]

Berger F, Carlon E. From hybridization theory to microarray data analysis: performance evaluation. BMC Bioinformatics 2011;12:464. [PMID: 22136743 PMCID: PMC3267830 DOI: 10.1186/1471-2105-12-464] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/11/2011] [Accepted: 12/02/2011] [Indexed: 02/05/2023] Open

Abstract

Background

Several preprocessing methods are available for the analysis of Affymetrix GeneChip arrays. The most popular algorithms analyze the measured fluorescence intensities with statistical methods. Here we focus on a novel algorithm, AffyILM, available from Bioconductor, which relies on inputs from hybridization thermodynamics and uses an extended Langmuir isotherm model to compute transcript concentrations. These concentrations are then employed in the statistical analysis. We compared the performance of AffyILM and other traditional methods both in the old and in the newest generation of GeneChips.

Results

Tissue mixture and Latin Square datasets (provided by Affymetrix) were used to assess the performance of the differential expression analysis depending on the preprocessing strategy. A correlation analysis conducted on the tissue mixture data reveals that the median-polish algorithm best summarizes the AffyILM concentrations computed at the probe level. Those correlation results are equivalent to the best correlations observed using popular preprocessing methods relying on intensity values. The performance of each tested preprocessing algorithm was quantified using the Latin Square HG-U133A dataset, by comparing the differential analysis results with the list of spiked genes. The figures of merit generated illustrate that the performance associated with AffyILM(medianpolish), inferred from the present statistical analysis, is comparable to that of the best performing strategies previously reported.

Conclusions

Converting probe intensities to estimates of target concentrations prior to the statistical analysis, AffyILM(medianpolish) is one of the best performing strategies currently available. Using hybridization theory, probe-level estimates of target concentrations should be identically distributed. In the future, a probe-level multivariate analysis of the concentrations should be compared to the univariate analysis of probe-set summarized expression data.
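The median-polish summarization mentioned in this abstract can be illustrated with a generic sketch of Tukey's median polish applied to a probes-by-arrays matrix of log-scale values. This is not the AffyILM or RMA implementation; the function name `median_polish`, its parameters, and the toy matrix are assumptions made for illustration.

```python
import numpy as np


def median_polish(x, max_iter=10, tol=1e-6):
    """Fit x[i, j] ~ overall + row[i] + col[j] by alternating median sweeps.

    x: 2-D array, rows = probes, columns = arrays (assumed log scale).
    Returns (overall, row_effects, col_effects, residuals).
    """
    resid = np.asarray(x, dtype=float).copy()
    overall = 0.0
    row = np.zeros(resid.shape[0])
    col = np.zeros(resid.shape[1])
    for _ in range(max_iter):
        # Sweep row medians out of the residuals into the row effects.
        r_delta = np.median(resid, axis=1)
        resid -= r_delta[:, None]
        row += r_delta
        # Re-centre the column effects on the overall term.
        shift = np.median(col)
        col -= shift
        overall += shift
        # Sweep column medians out of the residuals into the column effects.
        c_delta = np.median(resid, axis=0)
        resid -= c_delta[None, :]
        col += c_delta
        # Re-centre the row effects on the overall term.
        shift = np.median(row)
        row -= shift
        overall += shift
        # Stop once a full sweep no longer changes anything meaningfully.
        if np.abs(r_delta).max() + np.abs(c_delta).max() < tol:
            break
    return overall, row, col, resid


# Toy data: additive probe and array effects (illustrative values only).
probe_effects = np.array([-1.0, 0.0, 1.0])
array_levels = np.array([8.0, 9.0, 10.0])
toy = probe_effects[:, None] + array_levels[None, :]
overall, row, col, resid = median_polish(toy)
summary_per_array = overall + col  # one expression summary per array
```

In this sketch the per-array summary is taken as the overall term plus the column effect, which is the usual way a median-polish fit is turned into one value per array for a probe set.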

Owzar K, Barry WT, Jung SH. Statistical considerations for analysis of microarray experiments. Clin Transl Sci 2011;4:466-77. [PMID: 22212230 DOI: 10.1111/j.1752-8062.2011.00309.x] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/30/2022] Open

Moffitt RA, Yin-Goen Q, Stokes TH, Parry RM, Torrance JH, Phan JH, Young AN, Wang MD. caCORRECT2: Improving the accuracy and reliability of microarray data in the presence of artifacts. BMC Bioinformatics 2011;12:383. [PMID: 21957981 PMCID: PMC3230913 DOI: 10.1186/1471-2105-12-383] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/01/2011] [Accepted: 09/29/2011] [Indexed: 11/10/2022] Open

Abstract

BACKGROUND

In previous work, we reported the development of caCORRECT, a novel microarray quality control system built to identify and correct spatial artifacts commonly found on Affymetrix arrays. We have made recent improvements to caCORRECT, including the development of a model-based data-replacement strategy and integration with typical microarray workflows via caCORRECT's web portal and caBIG grid services. In this report, we demonstrate that caCORRECT improves the reproducibility and reliability of experimental results across several common Affymetrix microarray platforms. caCORRECT represents an advance over state-of-the-art quality control methods such as Harshlighting, and acts to improve gene expression calculation techniques such as PLIER, RMA and MAS5.0, because it incorporates spatial information into outlier detection as well as outlier information into probe normalization. The ability of caCORRECT to recover accurate gene expressions from low quality probe intensity data is assessed using a combination of real and synthetic artifacts with PCR follow-up confirmation and the affycomp spike-in data. The caCORRECT tool can be accessed at the website: http://cacorrect.bme.gatech.edu.

RESULTS

We demonstrate that (1) caCORRECT's artifact-aware normalization avoids the undesirable global data warping that happens when any damaged chips are processed without caCORRECT; (2) when used upstream of RMA, PLIER, or MAS5.0, the data imputation of caCORRECT generally improves the accuracy of microarray gene expression in the presence of artifacts more than using Harshlighting or not using any quality control; (3) biomarkers selected from artifactual microarray data which have undergone the quality control procedures of caCORRECT are more likely to be reliable, as shown by both spike-in and PCR validation experiments. Finally, we present a case study of the use of caCORRECT to reliably identify biomarkers for renal cell carcinoma, yielding two diagnostic biomarkers with potential clinical utility, PRKAB1 and NNMT.

CONCLUSIONS

caCORRECT is shown to improve the accuracy of gene expression, and the reproducibility of experimental results in clinical application. This study suggests that caCORRECT will be useful to clean up possible artifacts in new as well as archived microarray data.

Kadota K, Shimizu K. Evaluating methods for ranking differentially expressed genes applied to microArray quality control data. BMC Bioinformatics 2011;12:227. [PMID: 21639945 PMCID: PMC3128035 DOI: 10.1186/1471-2105-12-227] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/09/2010] [Accepted: 06/06/2011] [Indexed: 11/12/2022] Open

Lytkin NI, McVoy L, Weitkamp JH, Aliferis CF, Statnikov A. Expanding the understanding of biases in development of clinical-grade molecular signatures: a case study in acute respiratory viral infections. PLoS One 2011;6:e20662. [PMID: 21673802 PMCID: PMC3105991 DOI: 10.1371/journal.pone.0020662] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/03/2010] [Accepted: 05/06/2011] [Indexed: 01/21/2023] Open

Abstract

BACKGROUND

The promise of modern personalized medicine is to use molecular and clinical information to better diagnose, manage, and treat disease, on an individual patient basis. These functions are predominantly enabled by molecular signatures, which are computational models for predicting phenotypes and other responses of interest from high-throughput assay data. Data-analytics is a central component of molecular signature development and can jeopardize the entire process if conducted incorrectly. While exploratory data analysis may tolerate suboptimal protocols, clinical-grade molecular signatures are subject to vastly stricter requirements. Closing the gap between standards for exploratory versus clinically successful molecular signatures entails a thorough understanding of possible biases in the data analysis phase and developing strategies to avoid them.

METHODOLOGY AND PRINCIPAL FINDINGS

Using a recently introduced data-analytic protocol as a case study, we provide an in-depth examination of the poorly studied biases of the data-analytic protocols related to signature multiplicity, biomarker redundancy, data preprocessing, and validation of signature reproducibility. The methodology and results presented in this work are aimed at expanding the understanding of these data-analytic biases that affect development of clinically robust molecular signatures.

CONCLUSIONS AND SIGNIFICANCE

Several recommendations follow from the current study. First, all molecular signatures of a phenotype should be extracted to the extent possible, in order to provide comprehensive and accurate grounds for understanding disease pathogenesis. Second, redundant genes should generally be removed from final signatures to facilitate reproducibility and decrease manufacturing costs. Third, data preprocessing procedures should be designed so as not to bias biomarker selection. Finally, molecular signatures developed and applied on different phenotypes and populations of patients should be treated with great caution.

Shakya K, Ruskin HJ, Kerr G, Crane M, Becker J. Comparison of microarray preprocessing methods. Adv Exp Med Biol 2011;680:139-47. [PMID: 20865495 DOI: 10.1007/978-1-4419-5913-3_16] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 03/07/2023]

Taub MA, Corrada Bravo H, Irizarry RA. Overcoming bias and systematic errors in next generation sequencing data. Genome Med 2010;2:87. [PMID: 21144010 PMCID: PMC3025429 DOI: 10.1186/gm208] [Citation(s) in RCA: 74] [Impact Index Per Article: 5.3] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/12/2022] Open

Dai M, Thompson RC, Maher C, Contreras-Galindo R, Kaplan MH, Markovitz DM, Omenn G, Meng F. NGSQC: cross-platform quality analysis pipeline for deep sequencing data. BMC Genomics 2010;11 Suppl 4:S7. [PMID: 21143816 PMCID: PMC3005923 DOI: 10.1186/1471-2164-11-s4-s7] [Citation(s) in RCA: 84] [Impact Index Per Article: 6.0] [Reference Citation Analysis] [Abstract] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/24/2022] Open

Kohl M, Deigner HP. Preprocessing of gene expression data by optimally robust estimators. BMC Bioinformatics 2010;11:583. [PMID: 21118506 PMCID: PMC3744637 DOI: 10.1186/1471-2105-11-583] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/01/2010] [Accepted: 11/30/2010] [Indexed: 11/17/2022] Open

Abstract

Background

The preprocessing of gene expression data obtained from several platforms routinely includes the aggregation of multiple raw signal intensities to one expression value. Examples are the computation of a single expression measure based on the perfect match (PM) and mismatch (MM) probes for the Affymetrix technology, the summarization of bead level values to bead summary values for the Illumina technology or the aggregation of replicated measurements in the case of other technologies including real-time quantitative polymerase chain reaction (RT-qPCR) platforms. The summarization of technical replicates is also performed in other "-omics" disciplines like proteomics or metabolomics.

Preprocessing methods like MAS 5.0, Illumina's default summarization method, RMA, or VSN show that the use of robust estimators is widely accepted in gene expression analysis. However, the selection of robust methods seems to be mainly driven by their high breakdown point and not by efficiency.

Results

We describe how optimally robust radius-minimax (rmx) estimators, i.e. estimators that minimize an asymptotic maximum risk on shrinking neighborhoods about an ideal model, can be used for the aggregation of multiple raw signal intensities to one expression value for Affymetrix and Illumina data. With regard to the Affymetrix data, we have implemented an algorithm which is a variant of MAS 5.0.

Using datasets from the literature and Monte-Carlo simulations we provide some reasoning for assuming approximate log-normal distributions of the raw signal intensities by means of the Kolmogorov distance, at least for the discussed datasets, and compare the results of our preprocessing algorithms with the results of Affymetrix's MAS 5.0 and Illumina's default method.

The numerical results indicate that when using rmx estimators an accuracy improvement of about 10-20% is obtained compared to Affymetrix's MAS 5.0 and about 1-5% compared to Illumina's default method. The improvement is also visible in the analysis of technical replicates where the reproducibility of the values (in terms of Pearson and Spearman correlation) is increased for all Affymetrix and almost all Illumina examples considered. Our algorithms are implemented in the R package named RobLoxBioC which is publicly available via CRAN, The Comprehensive R Archive Network (http://cran.r-project.org/web/packages/RobLoxBioC/).

Conclusions

Optimally robust rmx estimators have a high breakdown point and are computationally feasible. They can lead to a considerable gain in efficiency for well-established bioinformatics procedures and thus, can increase the reproducibility and power of subsequent statistical analysis.

Küppers M, Ittrich C, Faust D, Dietrich C. The transcriptional programme of contact-inhibition. J Cell Biochem 2010;110:1234-43. [PMID: 20564218 DOI: 10.1002/jcb.22638] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/10/2022]

Giorgi FM, Bolger AM, Lohse M, Usadel B. Algorithm-driven artifacts in median polish summarization of microarray data. BMC Bioinformatics 2010;11:553. [PMID: 21070630 PMCID: PMC2998528 DOI: 10.1186/1471-2105-11-553] [Citation(s) in RCA: 44] [Impact Index Per Article: 3.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/05/2010] [Accepted: 11/11/2010] [Indexed: 12/27/2022] Open

Exploration, visualization, and preprocessing of high-dimensional data. Methods Mol Biol 2010;620:267-84. [PMID: 20652508 DOI: 10.1007/978-1-60761-580-4_8] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/04/2023]

Aniba MR, Poch O, Thompson JD. Issues in bioinformatics benchmarking: the case study of multiple sequence alignment. Nucleic Acids Res 2010;38:7353-63. [PMID: 20639539 PMCID: PMC2995051 DOI: 10.1093/nar/gkq625] [Citation(s) in RCA: 45] [Impact Index Per Article: 3.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/27/2010] [Revised: 06/10/2010] [Accepted: 06/29/2010] [Indexed: 11/13/2022] Open

Skrzypczak M, Goryca K, Rubel T, Paziewska A, Mikula M, Jarosz D, Pachlewski J, Oledzki J, Ostrowski J. Modeling oncogenic signaling in colon tumors by multidirectional analyses of microarray data directed for maximization of analytical reliability. PLoS One 2010;5. [PMID: 20957034 PMCID: PMC2948500 DOI: 10.1371/journal.pone.0013091] [Citation(s) in RCA: 270] [Impact Index Per Article: 19.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/17/2010] [Accepted: 09/08/2010] [Indexed: 12/16/2022] Open

Abstract

Background

Clinical progression of colorectal cancers (CRC) may occur in parallel with distinctive signaling alterations. We designed multidirectional analyses integrating microarray-based data with biostatistics and bioinformatics to elucidate the signaling and metabolic alterations underlying CRC development in the adenoma-carcinoma sequence.

Methodology/Principal Findings

Studies were performed on normal mucosa, adenoma, and carcinoma samples obtained during surgery or colonoscopy. Collections of cryostat sections prepared from the tissue samples were evaluated by a pathologist to control the relative cell type content. The measurements were done using Affymetrix GeneChip HG-U133plus2, and probe set data was generated using two normalization algorithms: MAS5.0 and GCRMA with least-variant set (LVS). The data was evaluated using pair-wise comparisons and data decomposition into singular value decomposition (SVD) modes. The method selected for the functional analysis used the Kolmogorov-Smirnov test. Expressional profiles obtained in 105 samples of whole tissue sections were used to establish oncogenic signaling alterations in progression of CRC, while those representing 40 microdissected specimens were used to select differences in KEGG pathways between epithelium and mucosa. Based on a consensus of the results obtained by two normalization algorithms, and two probe set sorting criteria, we identified 14 and 17 KEGG signaling and metabolic pathways that are significantly altered between normal and tumor samples and between benign and malignant tumors, respectively. Several of them were also selected from the raw microarray data of 2 recently published studies (GSE4183 and GSE8671).

Conclusion/Significance

Although the proposed strategy is computationally complex and labor-intensive, it may reduce the number of false results.

Eronen VP, Lindén RO, Lindroos A, Kanerva M, Aittokallio T. Genome-wide scoring of positive and negative epistasis through decomposition of quantitative genetic interaction fitness matrices. PLoS One 2010;5:e11611. [PMID: 20657656 PMCID: PMC2904709 DOI: 10.1371/journal.pone.0011611] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/26/2010] [Accepted: 06/22/2010] [Indexed: 01/07/2023] Open

Abstract

Recent technological developments in genetic screening approaches have offered the means to start exploring quantitative genotype-phenotype relationships on a large scale. What remains unclear is the extent to which the quantitative genetic interaction datasets can distinguish the broad spectrum of interaction classes, as compared to existing information on mutation pairs associated with both positive and negative interactions, and whether the scoring of varying degrees of such epistatic effects could be improved by computational means. To address these questions, we introduce here a computational approach for improving the quantitative discrimination power encoded in the genetic interaction screening data. Our matrix approximation model decomposes the original double-mutant fitness matrix into separate components, representing variability across the array and query mutants, which can be utilized for estimating and correcting the single-mutant fitness effects, respectively. When applied to three large-scale quantitative interaction datasets in yeast, we could improve the accuracy of scoring various interaction classes beyond that obtained with the original fitness data, especially in synthetic genetic array (SGA) and in genetic interaction mapping (GIM) datasets. In addition to the known pairs of interactions used in the evaluation of the computational approach, a number of novel interaction pairs were also predicted, along with underlying biological mechanisms, which remained undetected by the original datasets. It was shown that the optimal choice of the scoring function depends heavily on the screening approach and on the interaction class under analysis. Moreover, a simple preprocessing of the fitness matrix could further enhance the discrimination power of the epistatic miniarray profiling (E-MAP) dataset. These systematic evaluation results provide in-depth information on the optimal analysis of future large-scale screening experiments. In general, the modeling framework, enabling accurate identification and classification of genetic interactions, provides a solid basis for completing and mining the genetic interaction networks in yeast and other organisms.

Schmid R, Baum P, Ittrich C, Fundel-Clemens K, Huber W, Brors B, Eils R, Weith A, Mennerich D, Quast K. Comparison of normalization methods for Illumina BeadChip HumanHT-12 v3. BMC Genomics 2010;11:349. [PMID: 20525181 PMCID: PMC3091625 DOI: 10.1186/1471-2164-11-349] [Citation(s) in RCA: 62] [Impact Index Per Article: 4.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/09/2010] [Accepted: 06/02/2010] [Indexed: 11/26/2022] Open

Kim C, Choi J, Park H, Park Y, Park J, Park T, Cho K, Yang Y, Yoon S. Global analysis of microarray data reveals intrinsic properties in gene expression and tissue selectivity. Bioinformatics 2010;26:1723-30. [PMID: 20511364 DOI: 10.1093/bioinformatics/btq279] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022]

Reimers M. Making informed choices about microarray data analysis. PLoS Comput Biol 2010;6:e1000786. [PMID: 20523743 PMCID: PMC2877726 DOI: 10.1371/journal.pcbi.1000786] [Citation(s) in RCA: 43] [Impact Index Per Article: 3.1] [Reference Citation Analysis] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/02/2022] Open

Zhu CQ, Pintilie M, John T, Strumpf D, Shepherd FA, Der SD, Jurisica I, Tsao MS. Understanding prognostic gene expression signatures in lung cancer. Clin Lung Cancer 2010;10:331-40. [PMID: 19808191 DOI: 10.3816/clc.2009.n.045] [Citation(s) in RCA: 49] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/20/2022]