1
Shahjaman M, Rahman MR, Islam T, Auwul MR, Moni MA, Mollah MNH. rMisbeta: A robust missing value imputation approach in transcriptomics and metabolomics data. Comput Biol Med 2021; 138:104911. [PMID: 34634637 DOI: 10.1016/j.compbiomed.2021.104911] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 03/20/2021] [Revised: 09/25/2021] [Accepted: 09/25/2021] [Indexed: 12/14/2022]
Abstract
Transcriptomics and metabolomics data often contain missing values or outliers due to limitations of the data acquisition techniques. Most statistical methods require complete datasets for downstream analysis. A number of methods have been developed for missing value imputation using the classical mean and variance based on maximum likelihood estimators, which are not robust against outliers. Consequently, the performance of these methods deteriorates in the presence of outliers. Hence, precise imputation of missing values and handling of outliers are both important and must be addressed together. Therefore, in this paper, we developed a robust iterative approach using robust estimators based on the minimum beta divergence method, which simultaneously imputes missing values and outliers. We investigate the performance of the proposed method in comparison with six frequently used missing value imputation methods, namely Zero, KNN, robust SVD, EM, random forest (RF) and the weighted least square approach (WLSA), through feature selection using both simulated and real datasets. Ten performance indices were used to identify the optimal method: Frobenius norm (FOBN), accuracy (ACC), sensitivity (SN), specificity (SP), positive predictive value (PPV), negative predictive value (NPV), detection rate (DR), misclassification error rate (MER), the area under the ROC curve (AUC) and computational runtime. Evaluation based on both simulated and real data suggests the superiority of the proposed method over the other traditional methods across various rates of outliers and missing values. The suggested approach also performs almost as well as the other methods in the absence of outliers. The proposed method is accurate, simple, and consumes less computational time than the other methods. Therefore, our recommendation is to apply the proposed procedure for large-scale transcriptomics and metabolomics data analysis. 
The computational tool has been implemented in an R package, which is publicly available from https://CRAN.R-project.org/package=rMisbeta.
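Several of the performance indices named in this abstract have standard textbook definitions. The following is a minimal Python sketch of those definitions, not the rMisbeta implementation: the `Confusion` class, the function names, and the example counts are all hypothetical, and DR and AUC are left out because the abstract does not say how they were computed.

```python
import math
from dataclasses import dataclass


@dataclass
class Confusion:
    """Counts from a feature-selection result (hypothetical container)."""
    tp: int  # truly differential features that were selected
    fp: int  # non-differential features that were selected
    tn: int  # non-differential features that were not selected
    fn: int  # truly differential features that were missed


def _ratio(num: float, den: float) -> float:
    # Return NaN instead of raising when a denominator is zero.
    return num / den if den else float("nan")


def indices(c: Confusion) -> dict:
    """Textbook definitions of ACC, SN, SP, PPV, NPV and MER."""
    total = c.tp + c.fp + c.tn + c.fn
    return {
        "ACC": _ratio(c.tp + c.tn, total),
        "SN": _ratio(c.tp, c.tp + c.fn),
        "SP": _ratio(c.tn, c.tn + c.fp),
        "PPV": _ratio(c.tp, c.tp + c.fp),
        "NPV": _ratio(c.tn, c.tn + c.fn),
        "MER": _ratio(c.fp + c.fn, total),
    }


def frobenius_norm(true_matrix, imputed_matrix) -> float:
    """Frobenius norm of the difference between two equally sized matrices."""
    return math.sqrt(
        sum(
            (t - i) ** 2
            for t_row, i_row in zip(true_matrix, imputed_matrix)
            for t, i in zip(t_row, i_row)
        )
    )


if __name__ == "__main__":
    print(indices(Confusion(tp=40, fp=10, tn=45, fn=5)))
    print(frobenius_norm([[1, 2], [3, 4]], [[1, 2], [3, 6]]))
```

In a simulation setting, the Frobenius norm would compare the complete data matrix with its imputed version, while the classification indices would compare selected features against the known truth.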
Affiliation(s)
- Md Shahjaman
  - Department of Statistics, Begum Rokeya University, Rangpur, 5400, Bangladesh.
- Md Rezanur Rahman
  - Queensland Brain Institute, The University of Queensland, Brisbane, QLD 4072, Australia
- Tania Islam
  - Queensland Brain Institute, The University of Queensland, Brisbane, QLD 4072, Australia
- Md Rabiul Auwul
  - School of Economics and Statistics, Guangzhou University, Guangzhou 510006, China
- Mohammad Ali Moni
  - School of Health and Rehabilitation Sciences, Faculty of Health and Behavioural Sciences, The University of Queensland St Lucia, Australia
- Md Nurul Haque Mollah
  - Laboratory of Bioinformatics, Department of Statistics, University of Rajshahi, Rajshahi 6205, Bangladesh.
2
Zhang D, Dai W, Hu H, Chen W, Liu Y, Guan Z, Zhang S, Xu H. Controlling the immobilization process of an optically enhanced protein microarray for highly reproducible immunoassay. NANOSCALE 2021; 13:4269-4277. [PMID: 33595014 DOI: 10.1039/d0nr08407g] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/12/2023]
Abstract
By virtue of its high throughput multiplex detection capability, superior read-out sensitivity, and tiny analyte consumption, an optically enhanced protein microarray assay has been developed as a promising diagnostic tool for various applications, ranging from the field of pharmacology to diagnostics. However, so far, the development of an optically enhanced protein microarray (OEPM) toward widespread commercial availability is mainly hampered by insufficient detection reproducibility. Here, we develop an OEPM platform with an order of magnitude optical enhancement induced by the interference effect. High assay reproducibility of the OEPM is achieved by optimizing the protein immobilization schemes, linking to the surface energy of the substrate, surfactant-tuned wetting ability, and the washing and drying dynamics. As a result, smearing-free and uniform spot arrays with a coefficient of variation less than 7% can be achieved. Furthermore, we demonstrate the assay performance of the OEPM by detecting five biomarkers, showing an order of magnitude higher sensitivity, many-fold higher throughput, and 10 times less analyte consumption than those of the commercial enzyme-linked immunosorbent assay kits. Our results provide new insight for improving the reproducibility of OEPMs toward practical and commercial diagnostic assays.
Affiliation(s)
- Daxiao Zhang
  - School of Physics and Technology, Center for Nanoscience and Nanotechnology, Key Laboratory of Artificial Micro- and Nano-Structures of Ministry of Education, Wuhan University, Wuhan 430072, China.
- Wei Dai
  - School of Physics and Technology, Center for Nanoscience and Nanotechnology, Key Laboratory of Artificial Micro- and Nano-Structures of Ministry of Education, Wuhan University, Wuhan 430072, China.
- Huatian Hu
  - The Institute for Advanced Studies, Wuhan University, Wuhan 430072, China
- Wen Chen
  - School of Physics and Technology, Center for Nanoscience and Nanotechnology, Key Laboratory of Artificial Micro- and Nano-Structures of Ministry of Education, Wuhan University, Wuhan 430072, China.
- Yang Liu
  - School of Physics and Technology, Center for Nanoscience and Nanotechnology, Key Laboratory of Artificial Micro- and Nano-Structures of Ministry of Education, Wuhan University, Wuhan 430072, China.
- Zhiqiang Guan
  - School of Physics and Technology, Center for Nanoscience and Nanotechnology, Key Laboratory of Artificial Micro- and Nano-Structures of Ministry of Education, Wuhan University, Wuhan 430072, China.
- Shunping Zhang
  - School of Physics and Technology, Center for Nanoscience and Nanotechnology, Key Laboratory of Artificial Micro- and Nano-Structures of Ministry of Education, Wuhan University, Wuhan 430072, China.
- Hongxing Xu
  - School of Physics and Technology, Center for Nanoscience and Nanotechnology, Key Laboratory of Artificial Micro- and Nano-Structures of Ministry of Education, Wuhan University, Wuhan 430072, China.
  - The Institute for Advanced Studies, Wuhan University, Wuhan 430072, China
3
Fluorescence Interference Contrast-enabled structures improve the microarrays performance. Biosens Bioelectron 2019; 123:251-259. [PMID: 30224286 DOI: 10.1016/j.bios.2018.09.009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/20/2018] [Revised: 08/18/2018] [Accepted: 09/01/2018] [Indexed: 11/22/2022]
Abstract
Continuous improvements of the fluorescence-based sensitivity and specificity, required for high throughput screening, diagnostics, and molecular biology studies, are usually addressed by better readout systems, or better reporting elements. However, while Fluorescence Interference Contrast (FLIC), which modulates the fluorescence by materials-based parameters, has been used for decades to measure biomolecular interactions at nanometer-precision, e.g., for the study of molecular motors and membrane processes, it has been seldom used for high throughput or diagnostic microdevices. Moreover, the amplification of both the fluorescence signal, modulated by vertically-nano-calibrated structures, and the signal/background, modulated by laterally-micro-calibrated structures, has not been explored. To address this synergy, structures comprising optically transparent silicon oxide, tens of micrometers-wide and with thicknesses in the low hundreds of nanometers, which are able to promote the formation of standing waves if patterned on a reflective material, have been designed, fabricated and tested, for the use in DNA- and protein arrays. The light emitted by a fluorophore placed on top of the structures and reflected by a bottom mirror surface, e.g., silicon, platinum, is physically constrained to a region defined lithographically, both vertically and laterally, i.e., micro-pillars and -wells, resulting in an accurate identification and quantification of fluorescence. The signal/noise ratio on micro-/nano-structured substrates is comparable to that measured on planar substrates, but the physical confinement of the microarray spots results in a considerable increase of the intra-feature uniformity.
4
Irwin RD, Boorman GA, Cunningham ML, Heinloth AN, Malarkey DE, Paules RS. Application of Toxicogenomics to Toxicology: Basic Concepts in the Analysis of Microarray Data. Toxicol Pathol 2016; 32 Suppl 1:72-83. [PMID: 15209406 DOI: 10.1080/01926230490424752] [Citation(s) in RCA: 40] [Impact Index Per Article: 5.0] [Indexed: 12/17/2022]
Abstract
Toxicology and the practice of pathology are rapidly evolving in the postgenomic era. Observable treatment related changes have been the hallmark of toxicology studies. Toxicogenomics is a powerful new tool that may show gene and protein changes earlier and at treatment levels below the limits of detection of traditional measures of toxicity. It may also aid in the understanding of toxic mechanisms. It is important to remember that it is only a tool and will provide meaningful results only when properly applied. As is often the case with new experimental tools, the initial utilization is driven more by the technology than application to problem solving. Toxicogenomics is interdisciplinary in nature including at a minimum, pathology, toxicology, and genomics. Most studies will require the input from the disciplines of toxicology, pathology, molecular biology, bioinformatics, biochemistry, and others depending on the types of questions being asked.
Affiliation(s)
- Richard D Irwin
  - Environmental Toxicology Program, National Institute of Environmental Health Sciences, Research Triangle Park, North Carolina 27709, USA.
5
The Classification of Sini Decoction Pattern in Traditional Chinese Medicine by Gene Expression Profiling. EVIDENCE-BASED COMPLEMENTARY AND ALTERNATIVE MEDICINE 2016; 2016:8239817. [PMID: 27200105 PMCID: PMC4855028 DOI: 10.1155/2016/8239817] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 11/18/2015] [Revised: 02/21/2016] [Accepted: 03/10/2016] [Indexed: 12/02/2022]
Abstract
We investigated the syndromes of the Sini decoction pattern (SDP), a common ZHENG in traditional Chinese medicine (TCM). The syndromes of SDP were correlated with various severe Yang deficiency related symptoms. To obtain a common profile for SDP, we distributed questionnaires to 300 senior clinical TCM practitioners. According to the survey, we concluded 2 sets of symptoms for SDP: (1) pulse feels deep or faint and (2) reversal cold of the extremities. Twenty-four individuals from Taipei City Hospital, Linsen Chinese Medicine Branch, Taiwan, were recruited. We extracted the total mRNA of peripheral blood mononuclear cells from the 24 individuals for microarray experiments. Twelve individuals (including 6 SDP patients and 6 non-SDP individuals) were used as the training set to identify biomarkers for distinguishing the SDP and non-SDP groups. The remaining 12 individuals were used as the test set. The test results indicated that the gene expression profiles of the identified biomarkers could effectively distinguish the 2 groups by adopting a hierarchical clustering algorithm. Our results suggest the feasibility of using the identified biomarkers in facilitating the diagnosis of TCM ZHENGs. Furthermore, the gene expression profiles of biomarker genes could provide a molecular explanation corresponding to the ZHENG of TCM.
6
Marcinek P, Geithe C, Krautwurst D. Chemosensory G Protein-Coupled Receptors (GPCR) in Blood Leukocytes. TOPICS IN MEDICINAL CHEMISTRY 2016. [DOI: 10.1007/7355_2016_101] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 02/08/2023]
7
RNA-seq reveals the critical role of OtpR in regulating Brucella melitensis metabolism and virulence under acidic stress. Sci Rep 2015; 5:10864. [PMID: 26242322 PMCID: PMC4542472 DOI: 10.1038/srep10864] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.8] [Received: 11/20/2014] [Accepted: 04/29/2015] [Indexed: 02/07/2023] [Open Access]
Abstract
The response regulator OtpR is critical for the growth, morphology and virulence of Brucella melitensis. Compared to its wild type strain 16 M, the B. melitensis 16 MΔotpR mutant has decreased tolerance to acid stress. To analyze the genes regulated by OtpR under acid stress, we performed RNA-seq whole transcriptome analysis of 16 MΔotpR and 16 M. In total, 501 differentially expressed genes were identified, including 390 down-regulated and 111 up-regulated genes. Among these genes, 209 were associated with bacterial metabolism, including 54 genes involved in carbohydrate metabolism, 13 genes associated with nitrogen metabolism, and seven genes associated with iron metabolism. The 16 MΔotpR mutant also showed decreased capacity to utilize different carbon sources and to tolerate iron limitation in culture experiments. Notably, OtpR regulated many Brucella virulence factors essential for B. melitensis intracellular survival. For instance, the virB operon encoding the type IV secretion system was significantly down-regulated, and 36 known transcriptional regulators (e.g., vjbR and blxR) were differentially expressed in 16 MΔotpR. Selected RNA-seq results were experimentally confirmed by RT-PCR and RT-qPCR. Overall, these results deciphered differential phenomena associated with virulence, environmental stresses and cell morphology in 16 MΔotpR and 16 M, which provided important information for understanding the detailed OtpR-regulated interaction networks and Brucella pathogenesis.
8
Li CY, Chiang CS, Cheng WC, Wang SC, Cheng HT, Chen CR, Shu WY, Tsai ML, Hseu RS, Chang CW, Huang CY, Fang SH, Hsu IC. Gene expression profiling of dendritic cells in different physiological stages under Cordyceps sinensis treatment. PLoS One 2012; 7:e40824. [PMID: 22829888 PMCID: PMC3400664 DOI: 10.1371/journal.pone.0040824] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 03/22/2012] [Accepted: 06/13/2012] [Indexed: 11/19/2022] [Open Access]
Abstract
Cordyceps sinensis (CS) has been commonly used as herbal medicine and a health supplement in China for over two thousand years. Although previous studies have demonstrated that CS has benefits in immunoregulation and anti-inflammation, the precise mechanism by which CS affects immunomodulation is still unclear. In this study, we exploited duplicate sets of loop-design microarray experiments to examine two different batches of CS and analyze the effects of CS on dendritic cells (DCs), in different physiology stages: naïve stage and inflammatory stage. Immature DCs were treated with CS, lipopolysaccharide (LPS), or LPS plus CS (LPS/CS) for two days, and the gene expression profiles were examined using cDNA microarrays. The results of two loop-design microarray experiments showed good intersection rates. The expression level of common genes found in both loop-design microarray experiments was consistent, and the correlation coefficients (Rs), were higher than 0.96. Through intersection analysis of microarray results, we identified 295 intersecting significantly differentially expressed (SDE) genes of the three different treatments (CS, LPS, and LPS/CS), which participated mainly in the adjustment of immune response and the regulation of cell proliferation and death. Genes regulated uniquely by CS treatment were significantly involved in the regulation of focal adhesion pathway, ECM-receptor interaction pathway, and hematopoietic cell lineage pathway. Unique LPS regulated genes were significantly involved in the regulation of Toll-like receptor signaling pathway, systemic lupus erythematosus pathway, and complement and coagulation cascades pathway. Unique LPS/CS regulated genes were significantly involved in the regulation of oxidative phosphorylation pathway. These results could provide useful information in further study of the pharmacological mechanisms of CS. 
This study also demonstrates that with a rigorous experimental design, the biological effects of a complex compound can be reliably studied by a complex system like cDNA microarray.
Affiliation(s)
- Chia-Yang Li
  - Department of Biomedical Engineering and Environmental Sciences, National Tsing Hua University, Hsinchu, Taiwan
  - Department of Urology, University of Texas Southwestern Medical Center, Dallas, Texas, United States of America
- Chi-Shiun Chiang
  - Department of Biomedical Engineering and Environmental Sciences, National Tsing Hua University, Hsinchu, Taiwan
- Wei-Chung Cheng
  - Department of Biomedical Engineering and Environmental Sciences, National Tsing Hua University, Hsinchu, Taiwan
  - Division of Pediatric Neurosurgery, Neurological Institute, Taipei Veterans General Hospital, Taipei, Taiwan
- Shu-Chi Wang
  - Department of Biomedical Engineering and Environmental Sciences, National Tsing Hua University, Hsinchu, Taiwan
- Hung-Tsu Cheng
  - Institute of Nanoengineering and Microsystem, National Tsing Hua University, Hsinchu, Taiwan
- Chaang-Ray Chen
  - Department of Biomedical Engineering and Environmental Sciences, National Tsing Hua University, Hsinchu, Taiwan
- Wun-Yi Shu
  - Institute of Statistics, National Tsing Hua University, Hsinchu, Taiwan
- Min-Lung Tsai
  - Department of Biomedical Engineering and Environmental Sciences, National Tsing Hua University, Hsinchu, Taiwan
- Ruey-Shyang Hseu
  - Department of Biochemical Science and Technology, National Taiwan University, Taipei, Taiwan
- Cheng-Wei Chang
  - Department of Biomedical Engineering and Environmental Sciences, National Tsing Hua University, Hsinchu, Taiwan
- Chao-Ying Huang
  - Department of Biomedical Engineering and Environmental Sciences, National Tsing Hua University, Hsinchu, Taiwan
- Shih-Hua Fang
  - Institute of Athletics, National Taiwan Sport University, Taichung, Taiwan
- Ian C. Hsu
  - Department of Biomedical Engineering and Environmental Sciences, National Tsing Hua University, Hsinchu, Taiwan
9
Marchal K, Engelen K, De Brabanter J, Aerts S, De Moor B, Ayoubi T, Van Hummelen P. Comparison of different methodologies to identify differentially expressed genes in two-sample cDNA microarrays. J Biol Syst 2012. [DOI: 10.1142/s0218339002000731] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Indexed: 11/18/2022]
Abstract
This review compares different methods to identify differentially expressed genes in two-sample cDNA arrays. A two-sample experiment is a commonly used design to compare relative mRNA abundance between two different samples. This simple design is customarily used by biologists as a first screening before relying on more complex designs. Statistical techniques are quite well developed for such simple designs. For the identification of differentially expressed genes, four methods were described and compared: a fold test, a t-test (Long et al., 2001), SAM (Tusher et al., 2001) and an ANOVA-based bootstrap method (Kerr and Churchill, 2001). Mutual comparison of these methods clearly illustrates each method's advantages and pitfalls. Our analyses showed that the most reliable predictions are made by the combined use of different methods, each of which is based on a different statistic. The ANOVA-based bootstrap method used in this study performed rather poorly in identifying differentially expressed genes.
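As an illustration of the two simplest statistics this abstract mentions, the Python sketch below computes a log2 fold change and a plain Welch t statistic for one gene. It is not the regularized t-test of Long et al. or any method from the review itself; the function names, the twofold cutoff, and the intensity values are assumptions made for the example.

```python
import math
from statistics import mean, variance


def log2_values(intensities):
    """Log2-transform positive intensities."""
    return [math.log2(x) for x in intensities]


def log2_fold_change(sample_a, sample_b) -> float:
    """Difference of mean log2 intensities (sample_a minus sample_b)."""
    return mean(log2_values(sample_a)) - mean(log2_values(sample_b))


def passes_fold_test(sample_a, sample_b, cutoff: float = 1.0) -> bool:
    """Fold test: flag the gene when |log2 fold change| reaches the cutoff.

    cutoff=1.0 corresponds to a twofold change (an assumed threshold).
    """
    return abs(log2_fold_change(sample_a, sample_b)) >= cutoff


def welch_t(sample_a, sample_b) -> float:
    """Welch t statistic on log2 intensities (needs >= 2 replicates each)."""
    la, lb = log2_values(sample_a), log2_values(sample_b)
    se = math.sqrt(variance(la) / len(la) + variance(lb) / len(lb))
    return (mean(la) - mean(lb)) / se


if __name__ == "__main__":
    treated = [4.0, 8.0, 16.0]   # hypothetical replicate intensities
    control = [1.0, 2.0, 4.0]
    print(log2_fold_change(treated, control))
    print(passes_fold_test(treated, control))
    print(welch_t(treated, control))
```

The fold test ignores replicate variability while the t statistic scales the same difference by its standard error, which is one reason the two can rank genes differently.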
Affiliation(s)
- Kathleen Marchal
  - Department of Electrical Engineering, ESAT-SCD, K. U. Leuven, Kasteelpark Arenberg 10, 3001 Leuven-Heverlee, Belgium
- Kristof Engelen
  - Department of Electrical Engineering, ESAT-SCD, K. U. Leuven, Kasteelpark Arenberg 10, 3001 Leuven-Heverlee, Belgium
- Jos De Brabanter
  - Department of Electrical Engineering, ESAT-SCD, K. U. Leuven, Kasteelpark Arenberg 10, 3001 Leuven-Heverlee, Belgium
- Stein Aerts
  - Department of Electrical Engineering, ESAT-SCD, K. U. Leuven, Kasteelpark Arenberg 10, 3001 Leuven-Heverlee, Belgium
- Bart De Moor
  - Department of Electrical Engineering, ESAT-SCD, K. U. Leuven, Kasteelpark Arenberg 10, 3001 Leuven-Heverlee, Belgium
- Torik Ayoubi
  - Flanders Interuniversity Institute of Biotechnology (VIB), 9050 Ghent, Belgium
- Paul Van Hummelen
  - Microarray Facility, Flanders Interuniversity Institute of Biotechnology (VIB), 3000 Leuven, Belgium
10
Aittokallio T, Kurki M, Nevalainen O, Nikula T, West A, Lahesmaa R. Computational Strategies for Analyzing Data in Gene Expression Microarray Experiments. J Bioinform Comput Biol 2012; 1:541-86. [PMID: 15290769 DOI: 10.1142/s0219720003000319] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.0] [Received: 02/13/2003] [Revised: 07/02/2003] [Indexed: 11/18/2022]
Abstract
Microarray analysis has become a widely used method for generating gene expression data on a genomic scale. Microarrays have been enthusiastically applied in many fields of biological research, even though several open questions remain about the analysis of such data. A wide range of approaches are available for computational analysis, but no general consensus exists as to a standard protocol for microarray data analysis. Consequently, the choice of data analysis technique is a crucial element depending both on the data and on the goals of the experiment. Therefore, a basic understanding of bioinformatics is required for optimal experimental design and meaningful interpretation of the results. This review summarizes some of the common themes in DNA microarray data analysis, including data normalization and detection of differential expression. Algorithms are demonstrated by analyzing cDNA microarray data from an experiment monitoring gene expression in T helper cells. Several computational biology strategies, along with their relative merits, are overviewed, and potential areas for additional research are discussed. The goal of the review is to provide a computational framework for applying and evaluating such bioinformatics strategies. Solid knowledge of microarray informatics contributes to the implementation of more efficient computational protocols for the given data obtained through microarray experiments.
Affiliation(s)
- Tero Aittokallio
  - Department of Computational Biology, University of Tokyo, 5-1-5 Kashiwanoha, Kashiwa-Shi, Chiba 277-8562, Japan.
11
Hidalgo MMR, Ruiz-Medina MD. Local wavelet-vaguelette-based functional classification of gene expression data. Biom J 2012; 54:75-93. [PMID: 22213074 DOI: 10.1002/bimj.201000135] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 07/02/2010] [Revised: 03/11/2011] [Accepted: 09/08/2011] [Indexed: 11/08/2022]
Abstract
This paper focuses on the problem of functional statistical classification of gene expression curves. A local-wavelet-vaguelette-based functional logistic regression approach is presented. This approach is especially suitable for the classification of non-stationary singular (non-differentiable) curves. The performance of the proposed methodology is illustrated by implementing it for the classification of yeast cell-cycle temporal gene expression profiles. A simulation study is also carried out for comparison with other functional classification methodologies.
Affiliation(s)
- Margarita M Rincón Hidalgo
  - Department of Statistics and Operational Research, Universidad de Granada, Campus Fuente Nueva s/n, E-18071, Granada, Spain
12
Le Meur N, Gentleman R. Analyzing biological data using R: methods for graphs and networks. Methods Mol Biol 2012; 804:343-73. [PMID: 22144163 DOI: 10.1007/978-1-61779-361-5_19] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.3] [Indexed: 01/15/2023]
Abstract
R is a powerful language and widely used software tool for the analysis and visualization of data. Its core capabilities can be extended through many different add-on packages. Among the many packages are some which offer a broad range of facilities for analyzing statistical properties of graphs. This chapter provides a practical tutorial covering the use of R methods for graphs and networks to examine biological data and analyze their topological and statistical properties.
Affiliation(s)
- Nolwenn Le Meur
  - IRISA, Equipe Symbiose, Université de Rennes I, Rennes, France.
13
Sifakis EG, Prentza A, Koutsouris D, Chatziioannou AA. Evaluating the effect of various background correction methods regarding noise reduction, in two-channel microarray data. Comput Biol Med 2011; 42:19-29. [PMID: 22074762 DOI: 10.1016/j.compbiomed.2011.10.003] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 06/26/2010] [Revised: 04/26/2011] [Accepted: 10/13/2011] [Indexed: 10/15/2022]
Abstract
In this work, two novel background correction (BC) methods, along with several commonly used ones, are evaluated regarding noise reduction in eleven two-channel self-versus-self (SVS) hybridizations. The evaluation of each BC method is investigated under the use of four statistical criteria combined into a single measure, the polygon area measure. Overall, our proposed BC approaches perform very well in terms of the proposed measure for most of the cases and provide an improved effect regarding technical noise reduction.
Affiliation(s)
- E G Sifakis
  - Biomedical Engineering Laboratory, School of Electrical and Computer Engineering, National Technical University of Athens, Greece
14
Zhu B, Peng RH, Xiong AS, Fu XY, Zhao W, Tian YS, Jin XF, Xue Y, Xu J, Han HJ, Chen C, Gao JJ, Yao QH. Analysis of gene expression profile of Arabidopsis genes under trichloroethylene stresses with the use of a full-length cDNA microarray. Mol Biol Rep 2011; 39:3799-806. [DOI: 10.1007/s11033-011-1157-8] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 05/06/2011] [Accepted: 06/30/2011] [Indexed: 11/28/2022]
15
Benso A, Di Carlo S, Politano G. A cDNA microarray gene expression data classifier for clinical diagnostics based on graph theory. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2011; 8:577-591. [PMID: 20855919 DOI: 10.1109/tcbb.2010.90] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 05/29/2023]
Abstract
Despite great advances in discovering cancer molecular profiles, the proper application of microarray technology to routine clinical diagnostics is still a challenge. Current practices in the classification of microarray data show two main limitations: the reliability of the training data sets used to build the classifiers, and the classifiers' performance, especially when the sample to be classified does not belong to any of the available classes. In this case, state-of-the-art algorithms usually produce a high rate of false positives that, in real diagnostic applications, is unacceptable. To address this problem, this paper presents a new cDNA microarray data classification algorithm that is based on graph theory and is able to overcome most of the limitations of known classification methodologies. The classifier works by analyzing gene expression data organized in an innovative data structure based on graphs, where vertices correspond to genes and edges to gene expression relationships. To demonstrate the novelty of the proposed approach, the authors present an experimental performance comparison between the proposed classifier and several state-of-the-art classification algorithms.
Affiliation(s)
- Alfredo Benso
  - Control and Computer Engineering Department, Politecnico di Torino, Corso Duca degli Abruzzi 24, I-10129, Torino, Italy.
16
Matsuyama T, Ishikawa T, Mogushi K, Yoshida T, Iida S, Uetake H, Mizushima H, Tanaka H, Sugihara K. MUC12 mRNA expression is an independent marker of prognosis in stage II and stage III colorectal cancer. Int J Cancer 2010; 127:2292-9. [PMID: 20162577 DOI: 10.1002/ijc.25256] [Citation(s) in RCA: 74] [Impact Index Per Article: 5.3] [Indexed: 11/10/2022]
Abstract
Distant metastasis is the major cause of death in colorectal cancer (CRC) patients. To identify genes influencing the prognosis of patients with CRC, we compared gene expression in primary tumors with and without distant metastasis using an oligonucleotide microarray. We also examined the expression of the candidate gene in 100 CRC patients by quantitative real-time reverse transcription PCR and studied the relationship between its expression and the prognosis of patients with CRC. As a result, we identified MUC12 as a candidate gene involved in metastasis processes by microarray analysis. Quantitative real-time reverse transcription PCR showed that MUC12 expression was significantly lower in cancer tissues than in adjacent normal tissues (p < 0.001). In Stages II and III CRC, patients with low expression showed worse disease-free survival (p = 0.020). Multivariate analysis disclosed that MUC12 expression status was an independent prognostic factor in Stages II and III CRC (relative risk, 8.236; 95% confidence interval, 1.702-39.849; p = 0.009). Our study revealed the prognostic value of MUC12 expression in CRC patients. Moreover, our results suggest MUC12 expression is a possible candidate gene for assessing postoperative adjuvant therapy for CRC patients.
Affiliation(s)
- Takatoshi Matsuyama
  - Department of Surgical Oncology, Graduate School, Tokyo Medical and Dental University, Tokyo, Japan.
17
Anderson T, Wulfkuhle J, Liotta L, Winslow RL, Petricoin E. Improved reproducibility of reverse-phase protein microarrays using array microenvironment normalization. Proteomics 2010; 9:5562-6. [PMID: 19834915 DOI: 10.1002/pmic.200900505] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.9] [Indexed: 11/09/2022]
Abstract
We introduce a novel experimental methodology for the reverse-phase protein microarray platform that reduces the typical measurement CV by as much as 70%. The methodology, referred to as array microenvironment normalization, increases the statistical power of the platform. In the experiment, it enabled the detection of a 1.1-fold shift in prostate-specific antigen concentration using approximately six technical replicates rather than the 37 replicates previously required. The improved reproducibility and statistical power should facilitate clinical implementation of the platform.
Affiliation(s)
- Troy Anderson
- Institute for Computational Medicine, Center for Cardiovascular Bioinformatics and Modeling, Johns Hopkins University, Baltimore, MD 21218, USA.
|
18
|
Microarray data quality control improves the detection of differentially expressed genes. Genomics 2010; 95:138-42. [PMID: 20079422 DOI: 10.1016/j.ygeno.2010.01.003] [Citation(s) in RCA: 54] [Impact Index Per Article: 3.9] [Received: 11/12/2009] [Accepted: 01/08/2010] [Indexed: 10/20/2022]
Abstract
Microarrays have become a routine tool for biomedical research. Data quality assessment is an essential part of the analysis, but it is still not easy to perform objectively or in an automated manner, and as a result it is often neglected. Here, we compared two strategies of array-level quality control using five publicly available microarray experiments: outlier removal and array weights. We also compared them against no outlier removal and random array removal. We find that removing outlier arrays can improve the signal-to-noise ratio and thus strengthen the power of detecting differentially expressed genes. Using array weights is similarly effective, but its applicability is more limited. The quality metrics presented here are implemented in the Bioconductor package arrayQualityMetrics.
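The outlier-removal strategy described in this abstract can be illustrated with a minimal Python sketch. This is not the arrayQualityMetrics implementation: the function name `flag_outlier_arrays`, the mean absolute difference as the between-array distance, and the 1.5 × IQR cutoff are all assumptions chosen for illustration.

```python
import statistics


def flag_outlier_arrays(arrays):
    """Return indices of arrays whose summed distance to all others is unusually large.

    `arrays` is a list of equal-length lists of (log-scale) intensities,
    one list per array. The distance and the cutoff are illustrative assumptions.
    """
    n = len(arrays)

    def mean_abs_diff(a, b):
        # Mean absolute difference between two arrays, probe by probe.
        return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

    # Score each array by its total distance to every other array.
    scores = [
        sum(mean_abs_diff(arrays[i], arrays[j]) for j in range(n) if j != i)
        for i in range(n)
    ]

    # Boxplot-style rule: flag scores above Q3 + 1.5 * IQR.
    q1, _, q3 = statistics.quantiles(scores, n=4)
    upper = q3 + 1.5 * (q3 - q1)
    return [i for i, score in enumerate(scores) if score > upper]


# Hypothetical data: seven similar arrays and one shifted array (index 7).
example = [
    [1.0, 2.0, 3.0], [1.1, 2.0, 3.0], [1.0, 2.1, 3.0], [1.0, 2.0, 3.1],
    [0.9, 2.0, 3.0], [1.0, 1.9, 3.0], [1.0, 2.0, 2.9], [4.0, 5.0, 6.0],
]
flagged = flag_outlier_arrays(example)
```

Flagged arrays would then be dropped before testing for differential expression; the array-weights alternative keeps every array and down-weights the poor ones instead.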
|
19
|
Celton M, Malpertuy A, Lelandais G, de Brevern AG. Comparative analysis of missing value imputation methods to improve clustering and interpretation of microarray experiments. BMC Genomics 2010; 11:15. [PMID: 20056002 PMCID: PMC2827407 DOI: 10.1186/1471-2164-11-15] [Citation(s) in RCA: 66] [Impact Index Per Article: 4.7] [Received: 09/02/2009] [Accepted: 01/07/2010] [Indexed: 11/17/2022] Open
Abstract
Background: Microarray technologies produce large amounts of data. In a previous study, we showed the value of the k-Nearest Neighbour approach for restoring missing gene expression values, and its positive impact on gene clustering by a hierarchical algorithm. Since then, numerous replacement methods have been proposed to impute missing values (MVs) for microarray data. In this study, we evaluated twelve different usable methods and their influence on the quality of gene clustering. We used several datasets, from both kinetic and non-kinetic experiments in yeast and human. Results: We underline the excellent efficiency of the approaches proposed and implemented by Bo and co-workers, especially the one based on expectation maximization (EM_array). These improvements were also observed for the imputation of extreme values, the values most difficult to predict. We showed that imputed MVs still have important effects on the stability of gene clusters. The improvement in clustering obtained with hierarchical clustering remains limited and is not sufficient to completely restore the correct gene associations. However, a common tendency can be found between the quality of the imputation method and gene cluster stability. Although comparing clustering algorithms is a complex task, we observed that the k-means approach is more efficient at conserving gene associations. Conclusions: More than 6,000,000 independent simulations assessed the quality of 12 imputation methods on five very different biological datasets. Important improvements have thus been made since our last study. The EM_array approach is one efficient method for restoring missing gene expression values, with a lower estimation error level. Nonetheless, the presence of MVs, even at a low rate, is a major factor in gene cluster instability. Our study highlights the need for a systematic assessment of imputation methods and hence for dedicated benchmarks. A noticeable point is the specific influence of some biological datasets.
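For readers unfamiliar with the k-Nearest Neighbour imputation mentioned in this abstract, a minimal pure-Python sketch follows. It is not the implementation evaluated by the authors: the function name `knn_impute`, the root-mean-square distance over shared observed columns, and the unweighted neighbour average are illustrative assumptions.

```python
import math


def knn_impute(matrix, k=2):
    """Return a copy of `matrix` with None entries replaced by a kNN estimate.

    `matrix` is a list of gene rows; each row lists expression values per
    sample, with None marking a missing value. Illustrative sketch only.
    """

    def distance(a, b):
        # Root-mean-square difference over columns observed in both rows.
        shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not shared:
            return math.inf
        return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

    imputed = [row[:] for row in matrix]
    for i, row in enumerate(matrix):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Candidate neighbours: other genes with an observed value in column j.
            candidates = sorted(
                (distance(row, other), other[j])
                for m, other in enumerate(matrix)
                if m != i and other[j] is not None
            )
            nearest = [v for d, v in candidates[:k] if d < math.inf]
            if nearest:
                # Unweighted mean of the k nearest genes' values in column j.
                imputed[i][j] = sum(nearest) / len(nearest)
    return imputed


# Hypothetical expression matrix: gene 0 is missing its third sample.
genes = [
    [1.0, 2.0, None],
    [1.0, 2.0, 3.0],
    [1.1, 2.1, 3.2],
    [9.0, 9.0, 9.0],
]
filled = knn_impute(genes, k=2)
```

With `k=2`, the missing value is estimated from the two most similar genes (rows 1 and 2), and the dissimilar row 3 is ignored.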
Affiliation(s)
- Magalie Celton
- INSERM UMR-S 726, Equipe de Bioinformatique Génomique et Moléculaire, DSIMB, Université Paris Diderot-Paris 7, 2 place Jussieu, Paris, France
|
20
|
Schützenmeister A, Piepho HP. Background correction of two-colour cDNA microarray data using spatial smoothing methods. Theor Appl Genet 2010; 120:475-490. [PMID: 19916001 DOI: 10.1007/s00122-009-1210-3] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 08/13/2009] [Accepted: 10/22/2009] [Indexed: 05/28/2023]
Abstract
The analysis of two-colour cDNA microarray data usually involves subtracting background values from foreground values prior to normalization and further analysis. This approach has the advantage of reducing bias and the disadvantage of inflating the variance of low-abundance spots. Whenever background subtraction is considered, it implicitly assumes locally constant background values. In practice, this assumption is often not met, which casts doubt on the usefulness of simple background subtraction. In order to improve background correction, we propose local background smoothing within the pre-processing pipeline of cDNA microarray data prior to background correction. For this purpose, we employ a geostatistical framework with ordinary kriging using both isotropic and anisotropic models of spatial correlation and 2-D locally weighted regression. We show that application of local background smoothing prior to background correction is beneficial in comparison to using raw background estimates. This is done using data from a self-versus-self experiment in Arabidopsis where subsets of differentially expressed genes were simulated. Using locally smoothed background values in conjunction with existing background correction methods increases the power, increases the accuracy and decreases the number of false positive results.
Affiliation(s)
- André Schützenmeister
- Bioinformatics Unit, Institute for Crop Production and Grassland Research, University of Hohenheim, Fruwirthstrasse 23, 70599 Stuttgart, Germany.
|
21
|
Dozmorov I, Lefkovits I. Internal standard-based analysis of microarray data. Part 1: analysis of differential gene expressions. Nucleic Acids Res 2009; 37:6323-39. [PMID: 19720734 PMCID: PMC2770671 DOI: 10.1093/nar/gkp706] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.2] [Indexed: 11/13/2022] Open
Abstract
Genome-scale microarray experiments for comparative analysis of gene expressions produce massive amounts of information. Traditional statistical approaches fail to achieve the required accuracy in sensitivity and specificity of the analysis. Since the problem can be resolved neither by increasing the number of replicates nor by manipulating thresholds, one needs a novel approach to the analysis. This article describes methods to improve the power of microarray analyses by defining internal standards to characterize features of the biological system being studied and the technological processes underlying the microarray experiments. Applying these methods, internal standards are identified and then the obtained parameters are used to define (i) genes that are distinct in their expression from background; (ii) genes that are differentially expressed; and finally (iii) genes that have similar dynamical behavior.
Affiliation(s)
- Igor Dozmorov
- Oklahoma Medical Research Foundation, Oklahoma City, OK 73104, USA.
|
22
|
Zhang Y, Wei Z, Li YY, Chen Y, Shen W, Lu C. Transcription level of messenger RNA per gene copy determined with dual-spike-in strategy. Anal Biochem 2009; 394:202-8. [PMID: 19646945 DOI: 10.1016/j.ab.2009.07.043] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.9] [Received: 03/31/2009] [Revised: 07/24/2009] [Accepted: 07/27/2009] [Indexed: 10/20/2022]
Abstract
To quantify the transcription level of a gene, we have conceived a novel concept, transcription level of messenger RNA (mRNA) per gene copy, which was determined with a dual-spike-in strategy. In this strategy, an exogenous DNA was added as the spike reference for target DNA in addition to the exogenous RNA as the reference for target RNA. After the mRNA-to-DNA ratio of a target gene was estimated by real-time polymerase chain reaction (PCR), it was first normalized with the mRNA-to-DNA ratio of the exogenous reference. The normalized ratio was multiplied by the ratio of exogenous RNA to exogenous DNA to obtain the transcription level of mRNA per gene copy. This quantified transcription value allows one to compare the expression of a target gene in different tissues or the expression in a specified tissue under different conditions.
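The arithmetic of the dual-spike-in calculation described in this abstract can be written out as a short Python sketch. The function and parameter names are hypothetical, and the inputs are assumed to be quantities already estimated by real-time PCR (measured values) or known from the spiking step (added amounts); the abstract does not specify units or data handling.

```python
def mrna_per_gene_copy(
    target_rna,
    target_dna,
    spike_rna_measured,
    spike_dna_measured,
    spike_rna_added,
    spike_dna_added,
):
    """Transcription level of mRNA per gene copy, following the abstract's steps.

    All names are illustrative assumptions; values are positive quantities.
    """
    # Step 1: measured mRNA-to-DNA ratio of the target gene.
    target_ratio = target_rna / target_dna
    # Step 2: normalize by the measured RNA-to-DNA ratio of the exogenous reference.
    normalized_ratio = target_ratio / (spike_rna_measured / spike_dna_measured)
    # Step 3: multiply by the known ratio of exogenous RNA to exogenous DNA added.
    return normalized_ratio * (spike_rna_added / spike_dna_added)


# Hypothetical numbers: target ratio 4, reference ratio 2, equal spike amounts.
level = mrna_per_gene_copy(200.0, 50.0, 40.0, 20.0, 1000.0, 1000.0)
```

Because the measured reference ratio appears in the denominator, losses that affect target and spike alike would cancel out, which is presumably what makes the value comparable across tissues or conditions.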
Affiliation(s)
- Yi Zhang
- State Key Laboratory of Molecular Biology, Institute of Biochemistry and Cell Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai 200031, People's Republic of China
|
23
|
Daskalakis A, Glotsos D, Kostopoulos S, Cavouras D, Nikiforidis G. A comparative study of individual and ensemble majority vote cDNA microarray image segmentation schemes, originating from a spot-adjustable based restoration framework. Comput Methods Programs Biomed 2009; 95:72-88. [PMID: 19278747 DOI: 10.1016/j.cmpb.2009.01.007] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 06/19/2008] [Revised: 09/23/2008] [Accepted: 01/12/2009] [Indexed: 05/27/2023]
Abstract
The aim of this study was to comparatively evaluate the performances of various segmentation algorithms, in conjunction with a noise reduction step, for gene expression levels intensity extraction in cDNA microarray images. Different segmentation algorithms, based on histogram and unsupervised classification methods, which have never been previously employed in microarray image analysis, were employed either individually or in ensemble majority vote structures for separating spot-images from background pixels. The performances of segmentation algorithms or ensemble structures were evaluated by assessing the validity and reproducibility of gene expression levels extraction in simulated and real cDNA microarray images. By processing high quality simulated images, the highest segmentation accuracy was achieved by an ensemble structure (Histogram Concavity, Gaussian Kernelized Fuzzy-C-Means, Seeded Region Growing). Optimum performance in terms of processing time and segmentation precision for low quality simulated and replicated real cDNA microarray images was attained by the Histogram Concavity algorithm.
Affiliation(s)
- Antonis Daskalakis
- Department of Medical Physics, Medical Image Processing and Analysis Laboratory, School of Medicine, University of Patras, Rio, Patras, Greece.
|
24
|
Prediction of qualitative outcome of oligonucleotide microarray hybridization by measurement of RNA integrity using the 2100 Bioanalyzer capillary electrophoresis system. Ann Hematol 2009; 88:1177-83. [PMID: 19424697 DOI: 10.1007/s00277-009-0751-5] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.4] [Received: 04/06/2009] [Accepted: 04/29/2009] [Indexed: 10/20/2022]
Abstract
RNA quality is critical to achieve valid results in microarray experiments and to save resources. The RNA integrity number (RIN) can be measured with minimal sample consumption by microfluidics-based capillary electrophoresis. To determine whether RIN can predict the qualitative outcome of microarray hybridization, we measured RIN in total RNA samples from 484 different experiments by the 2100 Bioanalyzer system and correlated it with the percentage of present calls (%pc) of downstream oligonucleotide microarrays. The correlation coefficient for RIN and %pc in all 408 samples for which the bioanalyzer algorithm was able to produce an RIN was 0.475 (p < 0.05), ranging from 0.039 to 0.673 for different tissue- and assay-type subgroups. Multivariate analysis found RIN to be the best predictor of microarray quality as assessed by %pc, outperforming the 28S to 18S ratio. For %pc thresholds of 25% and 35%, we determined optimal cut points for RIN at 7.15 and 8.05, respectively. Using the suggested cut points, RIN can support the final decision whether a certain RNA sample is appropriate for successful microarray hybridization.
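As a rough illustration of how the reported cut points might be applied in a sample-screening step, consider the Python sketch below. The function name, the use of an inclusive comparison (`>=`), and the handling of samples without an RIN are assumptions; the abstract reports only the cut-point values.

```python
# Cut points reported in the abstract: %pc threshold -> optimal RIN cut point.
RIN_CUT_POINTS = {25: 7.15, 35: 8.05}


def rna_sample_acceptable(rin, pc_threshold=25):
    """Return True/False for a sample's RIN, or None when no RIN is available.

    Illustrative only: whether the cut point itself counts as acceptable
    is an assumption, not something stated in the abstract.
    """
    if rin is None:
        # The Bioanalyzer algorithm could not produce an RIN for this sample.
        return None
    return rin >= RIN_CUT_POINTS[pc_threshold]


# Hypothetical sample with RIN 7.5: passes the 25% criterion, not the 35% one.
passes_25 = rna_sample_acceptable(7.5, pc_threshold=25)
passes_35 = rna_sample_acceptable(7.5, pc_threshold=35)
```

Given the moderate correlation reported (0.475), such a rule would support the decision rather than replace other quality checks.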
|
25
|
Abstract
Issues implicit in a multicenter microarray study are protocol standardization and monitoring center adherence to established protocols. This study explored the effects of submitting center and sample preservation method on the quality of isolated RNA. In addition, the effects of sample preservation method and laboratory on microarray quality were also examined. Herein we evaluated the contribution of specific technical factors [center, laboratory, and preservation method (frozen/RNAlater)] on quality of isolated RNA, cRNA synthesis products, and reproducibility of gene expression microarray data for independent biologic samples collected in a multicenter microarray study. The Kruskal-Wallis test was used to test for differences owing to submitting center on isolated RNA quality. Mixed effects analysis of variance was used in assessing the impact of laboratory and preservation method on gene expression values for the 12 samples hybridized at 2 independent laboratories (24 GeneChips). One center was found to be in violation of the tissue handling protocol. No significant effect was noted owing to preservation method, which ensured that our tissue handling protocols are working properly. There was a significant laboratory effect with respect to cRNA yield, though this effect did not impact sample quality. We conclude that use of consistent protocols for sample collection, RNA extraction, cDNA/cRNA synthesis, labeling, hybridization, platform, image acquisition, normalization, and expression summaries can yield consistent expression values. Moreover, evaluation of sample quality at various steps in the data acquisition process is an important component of a multicenter study to ensure all participating centers adhere to established protocols.
|
26
|
Sun Y, Fan W, McCann MP, Golovlev V. Rapid and quantitative quality control of microarrays using cationic nanoparticles. Biotechnol Bioeng 2009; 102:960-4. [DOI: 10.1002/bit.22117] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/08/2022]
|
27
|
Huerta EB, Duval B, Hao JK. Fuzzy logic for elimination of redundant information of microarray data. Genomics Proteomics Bioinformatics 2009; 6:61-73. [PMID: 18973862 PMCID: PMC5054105 DOI: 10.1016/s1672-0229(08)60021-2] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.3] [Indexed: 12/17/2022]
Abstract
Gene subset selection is essential for classification and analysis of microarray data. However, gene selection is known to be a very difficult task since gene expression data not only have high dimensionality, but also contain redundant information and noise. To cope with these difficulties, this paper introduces a fuzzy logic based pre-processing approach composed of two main steps. First, we use fuzzy inference rules to transform the gene expression levels of a given dataset into fuzzy values. Then we apply a similarity relation to these fuzzy values to define fuzzy equivalence groups, each group containing strongly similar genes. Dimension reduction is achieved by considering, for each group of similar genes, a single representative based on mutual information. To assess the usefulness of this approach, extensive experiments were carried out on three well-known public datasets with a combined classification model using three statistical filters and three classifiers.
|
28
|
Kauffmann A, Gentleman R, Huber W. arrayQualityMetrics--a bioconductor package for quality assessment of microarray data. Bioinformatics 2008; 25:415-6. [PMID: 19106121 PMCID: PMC2639074 DOI: 10.1093/bioinformatics/btn647] [Citation(s) in RCA: 651] [Impact Index Per Article: 40.7] [Indexed: 11/14/2022]
Abstract
SUMMARY The assessment of data quality is a major concern in microarray analysis. arrayQualityMetrics is a Bioconductor package that provides a report with diagnostic plots for one or two colour microarray data. The quality metrics assess reproducibility, identify apparent outlier arrays and compute measures of signal-to-noise ratio. The tool handles most current microarray technologies and is amenable to use in automated analysis pipelines or for automatic report generation, as well as for use by individuals. The diagnosis of quality remains, in principle, a context-dependent judgement, but our tool provides powerful, automated, objective and comprehensive instruments on which to base a decision. AVAILABILITY arrayQualityMetrics is a free and open source package, under the LGPL license, available from the Bioconductor project at www.bioconductor.org. A user's guide and examples are provided with the package. Some examples of HTML reports generated by arrayQualityMetrics can be found at http://www.microarray-quality.org
Affiliation(s)
- Audrey Kauffmann
- EMBL European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK.
|
29
|
Soong TT, Wrzeszczynski KO, Rost B. Physical protein-protein interactions predicted from microarrays. Bioinformatics 2008; 24:2608-14. [PMID: 18829707 PMCID: PMC2579715 DOI: 10.1093/bioinformatics/btn498] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.3] [Indexed: 11/14/2022]
Abstract
Motivation: Microarray expression data reveal functionally associated proteins. However, most proteins that are associated are not actually in direct physical contact. Predicting physical interactions directly from microarrays is both a challenging and important task that we addressed by developing a novel machine learning method optimized for this task. Results: We validated our support vector machine-based method on several independent datasets. At the same levels of accuracy, our method recovered more experimentally observed physical interactions than a conventional correlation-based approach. Pairs predicted by our method to very likely interact were close in the overall network of interaction, suggesting our method as an aid for functional annotation. We applied the method to predict interactions in yeast (Saccharomyces cerevisiae). A Gene Ontology function annotation analysis and literature search revealed several probable and novel predictions worthy of future experimental validation. We therefore hope our new method will improve the annotation of interactions as one component of multi-source integrated systems. Contact: ts2186@columbia.edu Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Ta-Tsen Soong
- Columbia University Center for Computational Biology and Bioinformatics, Columbia University, New York, NY, USA.
|
30
|
Gondro C, Kinghorn BP. Optimization of cDNA microarray experimental designs using an evolutionary algorithm. IEEE/ACM Trans Comput Biol Bioinform 2008; 5:630-638. [PMID: 18989048 DOI: 10.1109/tcbb.2007.70222] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 05/27/2023]
Abstract
The cDNA microarray is an important tool for generating large datasets of gene expression measurements. An efficient design is critical to ensure that the experiment will be able to address relevant biological questions. Microarray experimental design can be treated as a multicriterion optimization problem. For this class of problems evolutionary algorithms (EAs) are well suited, as they can search the solution space and evolve a design that optimizes the parameters of interest based on their relative value to the researcher under a given set of constraints. This paper introduces the use of EAs for optimization of experimental designs of spotted microarrays using a weighted objective function. The EA and the various criteria relevant to design optimization are discussed. Evolved designs are compared with designs obtained through exhaustive search, with results suggesting that the EA can find just as efficient optimal or near-optimal designs within a tractable timeframe.
Affiliation(s)
- Cedric Gondro
- Institute for Genetics and Bioinformatics, University of New England, Armidale, NSW-2351, Australia
|
31
|
Weber DG, Sahm K, Polen T, Wendisch VF, Antranikian G. Oligonucleotide microarrays for the detection and identification of viable beer spoilage bacteria. J Appl Microbiol 2008; 105:951-62. [PMID: 18785882 DOI: 10.1111/j.1365-2672.2008.03799.x] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.8] [Indexed: 11/29/2022]
Abstract
AIMS The design and evaluation of an oligonucleotide microarray in order to detect and identify viable bacterial species that play a significant role in beer spoilage. These belong to the species of the genera Lactobacillus, Megasphaera, Pediococcus and Pectinatus. METHODS AND RESULTS Oligonucleotide probes specific to beer spoilage bacteria were designed. In order to detect viable bacteria, the probes were designed to target the intergenic spacer regions (ISR) between 16S and 23S rRNA. Prior to hybridization the ISR were amplified by combining reverse transcriptase and polymerase chain reactions using a designed consensus primer. The developed oligonucleotide microarrays allow the detection of viable beer spoilage bacteria. CONCLUSIONS This method allows the detection and discrimination of single bacterial species in a sample containing a complex microbial community. Furthermore, microarrays using oligonucleotide probes targeting the ISR allow the distinction between viable bacteria with the potential to grow and non-growing bacteria. SIGNIFICANCE AND IMPACT OF THE STUDY The results demonstrate the feasibility of oligonucleotide microarrays as a contamination control in the food industry for the detection and identification of spoilage micro-organisms within a mixed population.
Affiliation(s)
- D G Weber
- Institute of Technical Microbiology, Hamburg University of Technology, Hamburg, Germany
|
32
|
The "reverse capture" autoantibody microarray: an innovative approach to profiling the autoantibody response to tissue-derived native antigens. Methods Mol Biol 2008; 441:175-92. [PMID: 18370319 DOI: 10.1007/978-1-60327-047-2_12] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Indexed: 03/06/2023]
Abstract
Recently, we reported the development and use of a "reverse capture" antibody microarray for the purpose of investigating antigen-autoantibody profiling. This platform was developed to allow researchers to characterize and compare the autoantibody profiles of normal and diseased patients. Our "reverse capture" protocol is based on the dual-antibody sandwich immunoassay of enzyme-linked immunosorbent assay (ELISA), and we have previously reported its use to detect autoimmunity to epitopes found on native antigens derived from tumor cell lines. In this protocol, we used ovarian cancer as a model system to adapt the "reverse capture" procedure for use with native antigens derived from frozen tissue samples. The use of this platform in studies of autoimmunity is valuable because it allows for the detection of autoantibody reactivity with epitopes found on the post-translational modifications (PTMs) of native antigens, a feature not present with other protein array platforms. In the first step in the "reverse capture" process, tissue-derived native antigens are immobilized onto the 500 monoclonal antibodies that are spotted in duplicate on the array surface. Using the captured antigens as "baits," we then incubate the array with labeled IgG from test and control samples, and perform a two-slide dye-swap to account for any dye effects. Here, we present a detailed description of the "reverse capture" autoantibody microarray for use with tissue-derived native antigens.
|
33
|
Li M, Reilly C. Assessing the quality of hybridized RNA in Affymetrix GeneChips using linear regression. J Biomol Tech 2008; 19:122-128. [PMID: 19137095 PMCID: PMC2361161] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/27/2023]
Abstract
The quality of data from microarray analysis is highly dependent on the quality of RNA. Because of the lability of RNA, steps involved in tissue sampling, RNA purification, and RNA storage are known to potentially lead to the degradation of RNAs; therefore, assessment of RNA quality and integrity is essential. Existing methods for estimating the quality of RNA hybridized to a GeneChip either suffer from subjectivity or are inefficient in performance. To overcome these drawbacks, we propose a linear regression method for assessing RNA quality for a hybridized GeneChip. In particular, our approach uses the probe intensities from the .cel files that the Affymetrix software associates with each microarray. The effectiveness of the proposed method and its improvements over existing methods are illustrated by applying it to 19 previously published human Affymetrix microarray data sets for which external verification of RNA quality is available.
Affiliation(s)
- Meijuan Li
- Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, MN 55455-0378, USA.
|
34
|
Abstract
The ferric uptake regulator (Fur) is a predominant bacterial regulator controlling the iron assimilation functions in response to iron availability. Our previous microarray analysis on Yersinia pestis defined the iron-Fur modulon. In the present work, we reannotated the iron assimilation genes in Y. pestis, and the resulting genes in complementation with those disclosed by microarray constituted a total of 34 genome loci (putative operons) that represent the potential iron-responsive targets of Fur. The subsequent real-time reverse transcription-PCR (RT-PCR) in conjunction with the primer extension analysis showed that 32 of them were regulated by Fur in response to iron starvation. A previously predicted Fur box sequence was then used to search against the promoter regions of the 34 operons; the homologue of the above box could be predicted in each promoter tested. The subsequent electrophoretic mobility shift assay (EMSA) demonstrated that a purified His(6) tag-fused Fur protein was able to bind in vitro to each of these promoter regions. Therefore, Fur is a global regulator, both an activator and a repressor, and directly controls not only almost all of the iron assimilation functions but also a variety of genes involved in various non-iron functions for governing a complex regulatory cascade in Y. pestis. In addition, real-time RT-PCR, primer extension, EMSA, and DNase I footprinting assay were used to elucidate the Fur regulation of the ybt locus encoding a virulence-required iron uptake system. By combining the published data on the YbtA regulation of ybt, we constructed a concise Fur/YbtA regulatory network with a map of the Fur-promoter DNA interactions within the ybt locus. The data presented here give us an overview of the iron-responsive Fur regulon in Y. pestis.
|
35
|
Quantitative reverse transcriptase real-time polymerase chain reaction (qRT-PCR) in translational oncology: lung cancer perspective. Lung Cancer 2008; 59:147-54. [PMID: 18177977 DOI: 10.1016/j.lungcan.2007.11.008] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.5] [Received: 07/26/2007] [Revised: 10/29/2007] [Accepted: 11/16/2007] [Indexed: 11/20/2022]
Abstract
Quantitative reverse transcriptase polymerase chain reaction (qRT-PCR) is rapidly becoming a basic method in lung cancer research. Analysis of transcriptional activity of tumor cells or detection of tumor markers by this technique has the potential to change lung cancer diagnosis and treatment. Quantitative RT-PCR is characterized by unparalleled sensitivity and specificity, with very reliable reproducibility. Its prime advantage for gene expression analysis is its broad dynamic range of 10^7-fold. Moreover, it is cost-effective, feasible in everyday laboratory routine and efficient in terms of biological material consumption. Still, there are a number of methodological aspects that need to be carefully considered before it can sensibly be implemented into clinical practice. Three major technical issues are specifically highlighted in this review: the choice of chemistries, gene expression data normalization and statistical processing of the results. Further, clinical applications of qRT-PCR are thoroughly discussed: detection and staging of lung cancer, and construction and validation of prognostic and predictive gene expression signatures.
|
36
|
Tamaoki M. Isolation of O3-response genes from Arabidopsis thaliana using cDNA macroarray. Methods Mol Biol 2008; 410:29-42. [PMID: 18642593 DOI: 10.1007/978-1-59745-548-0_3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/26/2023]
Abstract
Nylon membrane-based cDNA macroarrays are a widely available alternative to cDNA microarrays for the collection of large-scale gene expression data. cDNA macroarrays are used in many areas of molecular biology research for applications ranging from gene discovery to gene expression profiling. Although the density of DNA spots on a cDNA macroarray is lower than that on a cDNA microarray, the macroarray can be used to detect expression of a large number of genes because it uses radiolabeled cDNA as a probe. Thus, cDNA macroarray technology can be applied to obtain the gene expression profile in organs that show wide variety in mRNA expression, such as meristems in plant species and brain tissue. To carry out hybridization experiments with a cDNA macroarray, I describe here how to prepare macroarray filters on a small or large scale, as well as how to analyze macroarray experiments and determine the statistical significance of the gene expression data obtained.
Collapse
Affiliation(s)
- Masanori Tamaoki
- Biodiversity Conservation Research Project, National Institute for Environmental Studies, Tsukuba, Ibaraki, Japan
|
37
|
Panguluri SK, Li B, Hormann RE, Palli SR. Effect of ecdysone receptor gene switch ligands on endogenous gene expression in 293 cells. FEBS J 2007; 274:5669-89. [PMID: 17922837 DOI: 10.1111/j.1742-4658.2007.06089.x] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Indexed: 11/26/2022]
Abstract
Regulated gene expression may substantially enhance gene therapy. Correlated with structural differences between insect ecdysteroids and mammalian steroids, the ecdysteroids appear to have a benign pharmacology without adversely interfering with mammalian signaling systems. Consequently, the ecdysone receptor-based gene switches are attractive for application in medicine. In the present study, the effect of inducers of ecdysone receptor switches on the expression of endogenous genes in HEK 293 cells was determined. Four ligand chemotypes, represented by a tetrahydroquinoline (RG-120499), one amidoketone (RG-121150), two ecdysteroids [20-hydroxyecdysone (20E) and ponasterone A (Pon A)], and four diacylhydrazines (RG-102240, RG-102277, RG-102398 and RG-100864), were tested in HEK 293 cells. The cells were exposed to ligands at concentrations of 1 µM (RG-120499) or 10 µM (all others) for 72 h and the total RNA was isolated and analyzed using microarrays. Microarray data showed that the tetrahydroquinoline ligand, RG-120499, caused cell death at concentrations ≥ 10 µM. At 1 µM, this ligand caused changes in the expression of genes such as TNF, MAF, Rab and Reprimo. At 10 µM, the amidoketone, RG-121150, induced changes in the expression of genes such as v-jun, FBJ and EGR, but was otherwise noninterfering. Of the two steroids tested, 20E did not affect gene expression, but Pon A caused some changes in the expression of endogenous genes. At lower concentrations pharmacologically relevant for gene therapy, intrinsic gene expression effects of ecdysteroids and amidoketones may actually be insignificant. A fortiori, even at 10 µM, the four diacylhydrazine ligands did not cause significant changes in expression of endogenous genes in 293 cells and therefore should have minimum pleiotropic effects when used as ligands for the ecdysone receptor gene switch.
Affiliation(s)
- Siva K Panguluri
- Department of Entomology, College of Agriculture, University of Kentucky, Lexington, KY 40546, USA
|
38
|
Fu Q, Bent E, Borneman J, Chrobak M, Young NE. Algorithmic approaches to selecting control clones in DNA array hybridization experiments. J Bioinform Comput Biol 2007; 5:937-61. [PMID: 17787064 DOI: 10.1142/s0219720007002977] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/19/2007] [Revised: 05/16/2007] [Accepted: 05/19/2007] [Indexed: 11/18/2022]
Abstract
We study the problem of selecting control clones in DNA array hybridization experiments. The problem arises in the OFRG method for analyzing microbial communities. The OFRG method performs classification of rRNA gene clones using binary fingerprints created from a series of hybridization experiments, where each experiment consists of hybridizing a collection of arrayed clones with a single oligonucleotide probe. This experiment produces analog signals, one for each clone, which then need to be classified, that is, converted into binary values 1 and 0 that represent hybridization and non-hybridization events. In addition to the sample rRNA gene clones, the array contains a number of control clones needed to calibrate the classification procedure of the hybridization signals. These control clones must be selected with care to optimize the classification process. We formulate this as a combinatorial optimization problem called Balanced Covering. We prove that the problem is NP-hard, and we show some results on hardness of approximation. We propose approximation algorithms based on randomized rounding, and we show that, with high probability, our algorithms approximate well the optimum solution. The experimental results confirm that the algorithms find high quality control clones. The algorithms have been implemented and are publicly available as part of the software package called CloneTools.
Affiliation(s)
- Qi Fu
- Department of Computer Science, University of California, Riverside, CA 92521, USA.
|
39
|
Daskalakis A, Cavouras D, Bougioukos P, Kostopoulos S, Glotsos D, Kalatzis I, Kagadis GC, Argyropoulos C, Nikiforidis G. Improving gene quantification by adjustable spot-image restoration. Bioinformatics 2007; 23:2265-72. [PMID: 17599935 DOI: 10.1093/bioinformatics/btm337] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.6] [Indexed: 11/14/2022] [Open Access]
Abstract
MOTIVATION: One of the major factors that complicate the task of microarray image analysis is that microarray images are distorted by various types of noise. In this study a robust framework is proposed, designed to take into account the effect of noise in microarray images in order to assist the demanding task of microarray image analysis. The proposed framework incorporates in the microarray image processing pipeline a novel combination of spot-adjustable image analysis and processing techniques and consists of the following stages: (1) gridding for facilitating spot identification, (2) clustering (unsupervised discrimination between spot and background pixels) applied to the spot image for automatic local noise assessment, (3) modeling of the local image restoration process for spot image conditioning (adjustable Wiener restoration using an empirically determined degradation function), (4) automatic spot segmentation employing seeded region growing, (5) intensity extraction and (6) assessment of the reproducibility (real data) and the validity (simulated data) of the extracted gene expression levels. RESULTS: Both simulated and real microarray images were employed in order to assess the performance of the proposed framework against well-established methods implemented in publicly available software packages (Scanalyze and SPOT). Regarding simulated images, the novel combination of techniques introduced in the proposed framework rendered the detection of spot areas and the extraction of spot intensities more accurate. Furthermore, on real images the proposed framework proved to have better stability across replicates. Results indicate that the proposed framework improves spot segmentation and, consequently, quantification of gene expression levels. AVAILABILITY: All algorithms were implemented in the Matlab (The Mathworks, Inc., Natick, MA, USA) environment. The codes that implement microarray gridding, adaptive spot restoration and segmentation/intensity extraction are available upon request. Supplementary results and the simulated microarray images used in this study are available for download from: ftp://users:bioinformatics@mipa.med.upatras.gr. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Antonis Daskalakis
- Medical Image Processing and Analysis Group, Laboratory of Medical Physics, School of Medicine, University of Patras, 265 04 Rio, Greece.
|
40
|
Karakach TK, Wentzell PD. Methods for estimating and mitigating errors in spotted, dual-color DNA microarrays. OMICS 2007; 11:186-99. [PMID: 17594237 DOI: 10.1089/omi.2007.0008] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 11/12/2022]
Abstract
The conceptual simplicity of DNA microarray technology often belies the complex nature of the measurement errors inherent in the methodology. As the technology has developed, the importance of understanding the sources of uncertainty in the measurements and developing ways to control their influence on the conclusions drawn has become apparent. In this review, strategies for modeling measurement errors and minimizing their effect on the outcome of experiments using a variety of techniques are discussed in the context of spotted, dual-color microarrays. First, methods designed to reduce the influence of random variability through data filtering, replication, and experimental design are introduced. This is followed by a review of data analysis methods that partition the variance into random effects and one or more systematic effects, specifically two-sample significance testing and analysis of variance (ANOVA) methods. Finally, the current state of measurement error models for spotted microarrays and their role in variance stabilizing transformations are discussed.
Affiliation(s)
- Tobias K Karakach
- Trace Analysis Research Centre, Department of Chemistry, Dalhousie University, Halifax, Nova Scotia, Canada
|
41
|
Wang JY, Lin SR, Wu DC, Lu CY, Yu FJ, Hsieh JS, Cheng TL, Koay LB, Uen YH. Multiple molecular markers as predictors of colorectal cancer in patients with normal perioperative serum carcinoembryonic antigen levels. Clin Cancer Res 2007; 13:2406-13. [PMID: 17406027 DOI: 10.1158/1078-0432.ccr-06-2054] [Citation(s) in RCA: 63] [Impact Index Per Article: 3.7] [Indexed: 01/17/2023]
Abstract
PURPOSE: In this study, a high-sensitivity colorimetric membrane array method was used to detect circulating tumor cells (CTC) in the peripheral blood of colorectal cancer (CRC) patients with normal perioperative serum carcinoembryonic antigen (CEA) levels. This membrane array method was evaluated as a potential diagnostic and postoperative surveillance tool. STUDY DESIGN: Membrane arrays consisting of a panel of mRNA markers that include human telomerase reverse transcriptase, cytokeratin-19, cytokeratin-20, and CEA mRNA were used to detect CTCs in the peripheral blood of 157 postoperative CRC patients with normal perioperative serum CEA levels and in 80 healthy individuals. Digoxigenin-labeled cDNAs were amplified by reverse transcription-PCR from the peripheral blood samples and then hybridized to the membrane array. The sensitivity, specificity, and accuracy of membrane arrays for the detection of CTCs were then calculated. RESULTS: Using the four markers in combination, expression of any three markers or all four markers in this panel was significantly correlated with the clinicopathologic characteristics, including depth of tumor invasion, lymph node metastasis, tumor-node-metastasis stage, and postoperative relapse (all P < 0.05). The interval between the detection of all four positive molecular markers and subsequent elevated CEA ranged from 3 to 8 months (median 6 months). The expression of all four mRNA markers was an independent predictor for postoperative relapse. CRC patients expressing all four mRNA markers showed a significantly poorer survival rate than those with fewer than four positive markers. CONCLUSIONS: The constructed membrane array method was helpful in the early prediction of postoperative relapse in CRC patients with normal perioperative serum CEA levels.
Affiliation(s)
- Jaw-Yuan Wang
- Department of Surgery, Kaohsiung Medical University Hospital, Taiwan
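The abstract above reports that sensitivity, specificity, and accuracy were calculated for the membrane array. As a minimal sketch of how those three indices are usually derived from confusion-matrix counts, the following may help; the function name `diagnostic_metrics` and the counts are hypothetical and are not taken from the study.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts.

    tp/fn: diseased subjects called positive/negative by the assay
    tn/fp: healthy subjects called negative/positive by the assay
    """
    sensitivity = tp / (tp + fn)                 # true-positive rate
    specificity = tn / (tn + fp)                 # true-negative rate
    accuracy = (tp + tn) / (tp + fn + tn + fp)   # overall fraction correct
    return sensitivity, specificity, accuracy


# Hypothetical counts for illustration only (NOT the study's data)
sens, spec, acc = diagnostic_metrics(tp=90, fn=10, tn=45, fp=5)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} accuracy={acc:.2f}")
```

Accuracy depends on the mix of diseased and healthy subjects in the cohort, so it is usually read alongside sensitivity and specificity rather than on its own.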
|
42
|
Tsai CA, Hsueh HM, Chen JJ. A generalized additive model for microarray gene expression data analysis. J Biopharm Stat 2007; 14:553-73. [PMID: 15468752 DOI: 10.1081/bip-200025648] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Indexed: 11/03/2022]
Abstract
Microarray technology allows the measurement of expression levels of a large number of genes simultaneously. There are inherent biases in microarray data generated from an experiment. Various statistical methods have been proposed for data normalization and data analysis. This paper proposes a generalized additive model for the analysis of gene expression data. This model consists of two sub-models: a non-linear model and a linear model. We propose a two-step normalization algorithm to fit the two sub-models sequentially. The first step involves a non-parametric regression using lowess fits to adjust for non-linear systematic biases. The second step uses a linear ANOVA model to estimate the remaining effects including the interaction effect of genes and treatments, the effect of interest in a study. The proposed model is a generalization of the ANOVA model for microarray data analysis. We show correspondences between the lowess fit and the ANOVA model methods. The normalization procedure does not assume the majority of genes do not change their expression levels, and neither does it assume two channel intensities from the same spot are independent. The procedure can be applied to either one channel or two channel data from the experiments with multiple treatments or multiple nuisance factors. Two toxicogenomic experiment data sets and a simulated data set are used to contrast the proposed method with the commonly known lowess fit and ANOVA methods.
Affiliation(s)
- Chen-An Tsai
- Division of Biometry and Risk Assessment, National Center for Toxicological Research, Food and Drug Administration, Jefferson, Arkansas 72079, USA
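The two-step idea in the abstract above (a non-parametric fit to remove intensity-dependent bias, followed by a linear ANOVA-type model for gene-by-treatment effects) can be sketched roughly as follows. This is not the authors' implementation: `loess_fit` is a plain local linear regression with tricube weights and no robustness iterations, `interaction_effects` assumes a balanced design summarized by gene-by-treatment cell means, and all names and numbers are hypothetical.

```python
def loess_fit(x, y, frac=0.5):
    """Step 1 sketch: local linear regression with tricube weights."""
    n = len(x)
    k = max(2, min(n, int(round(frac * n))))   # neighbours per local fit
    fitted = []
    for x0 in x:
        dists = sorted(abs(xi - x0) for xi in x)
        h = dists[k - 1] or 1e-12              # bandwidth: k-th nearest distance
        w = [(1.0 - min(abs(xi - x0) / h, 1.0) ** 3) ** 3 for xi in x]
        sw = sum(w)
        mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
        my = sum(wi * yi for wi, yi in zip(w, y)) / sw
        sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
        sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
        slope = sxy / sxx if sxx > 0 else 0.0
        fitted.append(my + slope * (x0 - mx))
    return fitted


def interaction_effects(cell_means):
    """Step 2 sketch: gene-by-treatment interaction estimates (balanced design)."""
    n_g, n_t = len(cell_means), len(cell_means[0])
    grand = sum(map(sum, cell_means)) / (n_g * n_t)
    row = [sum(r) / n_t for r in cell_means]                          # gene means
    col = [sum(r[t] for r in cell_means) / n_g for t in range(n_t)]   # treatment means
    return [[cell_means[g][t] - row[g] - col[t] + grand for t in range(n_t)]
            for g in range(n_g)]


# Hypothetical data: log-ratios M carrying a purely intensity-dependent bias in A
A = [float(a) for a in range(6, 16)]
M = [0.1 * a - 1.0 for a in A]
normalized = [m - f for m, f in zip(M, loess_fit(A, M))]   # bias removed

# Hypothetical gene x treatment cell means after normalization
effects = interaction_effects([[0.0, 0.0], [0.0, 2.0]])
print(effects)
```

In practice a robust lowess routine from a statistics library would normally replace the hand-written smoother; the pure-Python version is used here only to keep the sketch self-contained.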
|
43
|
Linder R, Richards T, Wagner M. Microarray data classified by artificial neural networks. Methods Mol Biol 2007; 382:345-72. [PMID: 18220242 DOI: 10.1007/978-1-59745-304-2_22] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.8] [Indexed: 12/19/2022]
Abstract
Systems biology has enjoyed explosive growth in both the number of people participating in this area of research and the number of publications on the topic. The field of systems biology encompasses the in silico analysis of high-throughput data as provided by DNA or protein microarrays. Along with the increasing availability of microarray data, attention is focused on methods of analyzing the expression rates. One important type of analysis is the classification task, for example, distinguishing different types of cell functions or tumors. Recently, interest has been awakened toward artificial neural networks (ANN), which have many appealing characteristics such as an exceptional degree of accuracy. Nonlinear relationships or independence from certain assumptions regarding the data distribution are also considered. The current work reviews advantages as well as disadvantages of neural networks in the context of microarray analysis. Comparisons are drawn to alternative methods. Selected solutions are discussed, and finally algorithms for the effective combination of multiple ANNs are presented. The development of approaches to use ANN-processed microarray data applicable to run cell and tissue simulations may be slated for future investigation.
Affiliation(s)
- Roland Linder
- Institute of Medical Informatics, University of Lübeck, Germany
|
44
|
Ehrenreich A. DNA microarray technology for the microbiologist: an overview. Appl Microbiol Biotechnol 2006; 73:255-73. [PMID: 17043830 DOI: 10.1007/s00253-006-0584-2] [Citation(s) in RCA: 54] [Impact Index Per Article: 3.0] [Received: 07/11/2006] [Revised: 07/11/2006] [Accepted: 07/11/2006] [Indexed: 10/24/2022]
Abstract
DNA microarrays have found widespread use as a flexible tool to investigate bacterial metabolism. Their main advantage is the comprehensive data they produce on the transcriptional response of the whole genome to an environmental or genetic stimulus. This allows the microbiologist to monitor metabolism and to define stimulons and regulons. Other fields of application are the identification of microorganisms or the comparison of genomes. The importance of this technology increases with the number of sequenced genomes and the falling prices for equipment and oligonucleotides. Knowledge of DNA microarrays is of rising relevance for many areas in microbiological research. Much literature has been published on various specific aspects of this technique, which can be daunting to the casual user and beginner. This article offers a comprehensive outline of microarray technology for transcription analysis in microbiology. It briefly discusses the types of DNA microarrays available, the printing of custom arrays, common labeling strategies for targets, hybridization, scanning, normalization, and clustering of expression data.
Affiliation(s)
- Armin Ehrenreich
- Institute of Microbiology and Genetics, Georg August University, 37077 Göttingen, Germany.
|
45
|
Copois V, Bibeau F, Bascoul-Mollevi C, Salvetat N, Chalbos P, Bareil C, Candeil L, Fraslon C, Conseiller E, Granci V, Mazière P, Kramar A, Ychou M, Pau B, Martineau P, Molina F, Del Rio M. Impact of RNA degradation on gene expression profiles: assessment of different methods to reliably determine RNA quality. J Biotechnol 2006; 127:549-59. [PMID: 16945445 DOI: 10.1016/j.jbiotec.2006.07.032] [Citation(s) in RCA: 130] [Impact Index Per Article: 7.2] [Received: 01/30/2006] [Revised: 07/21/2006] [Accepted: 07/27/2006] [Indexed: 11/24/2022]
Abstract
DNA microarray technology enables investigators to measure the expression of several thousand mRNA species simultaneously in a biological specimen. However, the reliability of the microarray technology to detect transcriptional differences representative of the original samples is affected by the quality of the extracted RNA. Thus, it is of critical importance to standardize sample-handling protocols and to perform a quality assessment of RNA preparations. In this report, 59 human tissue samples were used to evaluate the relationships between RNA quality and gene expression. From Affymetrix GeneChip array data analysis of these samples, we compared the performance of the 28S/18S ratio, two computer methods (RIN and degradometer) and our in-house RNA quality scale (RQS) in assessing RNA quality. The optimal RNA reliability threshold was determined for each method using statistical discrimination measures. We showed that RQS, RIN and degradometer have a similar capacity to detect reliable RNA samples whereas the 28S/18S ratio leads to a misleading categorization. Furthermore, we developed a new approach, based on clustering analyses of full chip expression, to control RNA quality after hybridization experiments. The combination of these methods, allowing monitoring of RNA quality prior to and after the hybridization experiments, ensured reliable and reproducible microarray data.
|
46
|
Abstract
The study of gene expression profiling of cells and tissue has become a major tool for discovery in medicine. Microarray experiments allow description of genome-wide expression changes in health and disease. The results of such experiments are expected to change the methods employed in the diagnosis and prognosis of disease in obstetrics and gynecology. Moreover, an unbiased and systematic study of gene expression profiling should allow the establishment of a new taxonomy of disease for obstetric and gynecologic syndromes. Thus, a new era is emerging in which reproductive processes and disorders could be characterized using molecular tools and fingerprinting. The design, analysis, and interpretation of microarray experiments require specialized knowledge that is not part of the standard curriculum of our discipline. This article describes the types of studies that can be conducted with microarray experiments (class comparison, class prediction, class discovery). We discuss key issues pertaining to experimental design, data preprocessing, and gene selection methods. Common types of data representation are illustrated. Potential pitfalls in the interpretation of microarray experiments, as well as the strengths and limitations of this technology, are highlighted. This article is intended to assist clinicians in appraising the quality of the scientific evidence now reported in the obstetric and gynecologic literature.
Affiliation(s)
- Adi L. Tarca
- Perinatology Research Branch, National Institute of Child Health and Human Development, National Institutes of Health, Department of Health and Human Services, Bethesda, MD, and Detroit, MI
- Department of Computer Science, Wayne State University
- Roberto Romero
- Perinatology Research Branch, National Institute of Child Health and Human Development, National Institutes of Health, Department of Health and Human Services, Bethesda, MD, and Detroit, MI
- Center for Molecular Medicine and Genetics, Wayne State University
- Sorin Draghici
- Department of Computer Science, Wayne State University
- Karmanos Cancer Institute, Detroit, MI
|
47
|
Ritchie ME, Diyagama D, Neilson J, van Laar R, Dobrovic A, Holloway A, Smyth GK. Empirical array quality weights in the analysis of microarray data. BMC Bioinformatics 2006; 7:261. [PMID: 16712727 PMCID: PMC1564422 DOI: 10.1186/1471-2105-7-261] [Citation(s) in RCA: 228] [Impact Index Per Article: 12.7] [Received: 09/28/2005] [Accepted: 05/19/2006] [Indexed: 11/20/2022] [Open Access]
Abstract
BACKGROUND: Assessment of array quality is an essential step in the analysis of data from microarray experiments. Once detected, less reliable arrays are typically excluded or "filtered" from further analysis to avoid misleading results. RESULTS: In this article, a graduated approach to array quality is considered based on empirical reproducibility of the gene expression measures from replicate arrays. Weights are assigned to each microarray by fitting a heteroscedastic linear model with shared array variance terms. A novel gene-by-gene update algorithm is used to efficiently estimate the array variances. The inverse variances are used as weights in the linear model analysis to identify differentially expressed genes. The method successfully assigns lower weights to less reproducible arrays from different experiments. Down-weighting the observations from suspect arrays increases the power to detect differential expression. In smaller experiments, this approach outperforms the usual method of filtering the data. The method is available in the limma software package, which is implemented in the R software environment. CONCLUSION: This method complements existing normalisation and spot quality procedures, and allows poorer quality arrays, which would otherwise be discarded, to be included in an analysis. It is applicable to microarray data from experiments with some level of replication.
Affiliation(s)
- Matthew E Ritchie
- Bioinformatics Division, The Walter and Eliza Hall Institute of Medical Research, 1G Royal Parade, Parkville, Victoria 3050, Australia
- Dileepa Diyagama
- Ian Potter Foundation Centre for Cancer Genomics and Predictive Medicine, The Peter MacCallum Cancer Centre, St Andrews Place, East Melbourne, Victoria 3002, Australia
- Jody Neilson
- Molecular Pathology Research, Department of Pathology, The Peter MacCallum Cancer Centre, St Andrews Place, East Melbourne, Victoria 3002, Australia
- Ryan van Laar
- Ian Potter Foundation Centre for Cancer Genomics and Predictive Medicine, The Peter MacCallum Cancer Centre, St Andrews Place, East Melbourne, Victoria 3002, Australia
- Alexander Dobrovic
- Molecular Pathology Research, Department of Pathology, The Peter MacCallum Cancer Centre, St Andrews Place, East Melbourne, Victoria 3002, Australia
- Andrew Holloway
- Ian Potter Foundation Centre for Cancer Genomics and Predictive Medicine, The Peter MacCallum Cancer Centre, St Andrews Place, East Melbourne, Victoria 3002, Australia
- Gordon K Smyth
- Bioinformatics Division, The Walter and Eliza Hall Institute of Medical Research, 1G Royal Parade, Parkville, Victoria 3050, Australia
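The abstract above describes using inverse array variances as weights so that less reproducible arrays are down-weighted rather than discarded. The sketch below illustrates only that general idea under strong simplifications: it estimates one variance per array from residuals about gene-wise means across replicate arrays, and it does not implement the heteroscedastic linear model or the gene-by-gene update algorithm of the paper, nor does it call the limma package. The function names and data are hypothetical.

```python
def array_weights(expr, floor=1e-8):
    """expr[g][a]: log-expression of gene g on replicate array a.

    Returns one weight per array, proportional to the inverse of a crude
    per-array variance estimate and scaled so the weights average 1.
    """
    n_genes, n_arrays = len(expr), len(expr[0])
    gene_means = [sum(row) / n_arrays for row in expr]
    inv_var = []
    for a in range(n_arrays):
        ss = sum((expr[g][a] - gene_means[g]) ** 2 for g in range(n_genes))
        inv_var.append(1.0 / max(ss / n_genes, floor))   # floor avoids division by zero
    mean_inv = sum(inv_var) / n_arrays
    return [w / mean_inv for w in inv_var]


def weighted_mean(values, weights):
    """Weighted average of one gene's values across arrays."""
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


# Hypothetical replicate data: the fourth array (index 3) is noticeably off
expr = [
    [5.0, 5.1, 4.9, 7.0],
    [8.0, 7.9, 8.1, 6.0],
    [3.0, 3.1, 2.9, 5.0],
]
weights = array_weights(expr)
gene0_weighted = weighted_mean(expr[0], weights)
gene0_plain = sum(expr[0]) / len(expr[0])
print(weights, gene0_weighted, gene0_plain)
```

With these made-up numbers the discordant array receives the smallest weight, so the weighted estimate for the first gene sits closer to the three concordant arrays than the plain mean does.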
|
48
|
Rachman H, Lee JS, Angermann J, Kowall J, Kaufmann SHE. Reliable amplification method for bacterial RNA. J Biotechnol 2006; 126:61-8. [PMID: 16603269 DOI: 10.1016/j.jbiotec.2006.02.020] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.8] [Received: 10/25/2005] [Revised: 01/19/2006] [Accepted: 02/17/2006] [Indexed: 11/22/2022]
Abstract
DNA microarray technology has been increasingly applied for studies of clinical samples. Frequently, RNA probes from clinical samples are available in limited amounts. We describe a reliable amplification method for bacterial RNA. We verified this method on mycobacterial RNA applying mycobacterial genome-directed primers (mtGDPs). Glass slide-based oligoarrays were employed to assess the quality of the amplification method. We observed a relatively small bias in amplified RNA pool when compared to the unamplified one. Up to 1000-fold linear RNA amplification in a single amplification round was obtained. To our knowledge, this study describes the first amplification method for mycobacterial RNA.
Affiliation(s)
- Helmy Rachman
- Max Planck Institute for Infection Biology, Department of Immunology, Schumannstrasse 21-22, 10117 Berlin, Germany
|
49
|
Redkar RJ, Schultz NA, Scheumann V, Burzio LA, Haines DE, Metwalli E, Becker O, Conzone SD. Signal and sensitivity enhancement through optical interference coating for DNA and protein microarray applications. J Biomol Tech 2006; 17:122-30. [PMID: 16741239 PMCID: PMC2291774] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/09/2023]
Abstract
Optical interference (OI) coated slides with unique optical properties were utilized in microarray analyses, demonstrating their enhanced detection sensitivity over traditional microarray substrates. The OI coating is comprised of a proprietary multilayered, dielectric, thin-film interference coating located beneath the functional coating (aminosilane or epoxysilane). It is designed to enhance the fluorescence in the Cy3 and Cy5 channels by increasing the light absorption of the dyes by about 6-fold and by redirecting emitted fluorescence into the detector during scanning, resulting in a theoretical limit of about 12-fold signal amplification. Two-color DNA microarray experiments conducted on the OI slides showed over 8-fold signal amplification, conservation of gene expression ratios, and increased signal-to-noise ratio when compared to control slides, indicating enhanced detection sensitivity. Protein microarray assays also exhibited over 8-fold signal amplification at three different target concentrations, demonstrating the versatility of the OI slides for different microarray applications. Further, the DNA and protein assays performed on the OI slides exhibited excellent detection sensitivity even at the low target amounts essential for diagnostic applications. The OI slides are compatible with commonly used protocols, printers, scanners and other microarray equipment. Therefore, the OI slides offer an attractive alternative to traditional microarray substrates where enhanced detection sensitivity is desired.
Affiliation(s)
- Rajendra J Redkar
- SCHOTT Nexterion, SCHOTT North America Inc., 400 York Avenue, Duryea, PA 18642-2036, USA.
|
50
|
He W, Bull SB, Gokgoz N, Andrulis I, Wunder J. Application of reliability coefficients in cDNA microarray data analysis. Stat Med 2006; 25:1051-66. [PMID: 16345046 DOI: 10.1002/sim.2254] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Indexed: 11/07/2022]
Abstract
Gene expression microarray technology has been widely used in areas such as human cancer research to identify molecular characteristics of sample specimens. The microarray study, however, is a very complicated procedure which involves numerous sources of variability that may be either systematic or random. Systematic variation is often eliminated by applying normalization procedures, but at present there are no standard criteria available to evaluate the performance of a particular normalization approach. In this paper, we propose a reliability-type coefficient as a criterion to assess the effectiveness of normalization procedures in eliminating systematic variation. Simulation studies show that this criterion performs reasonably well in a range of settings. The proposed method is illustrated using a subset of an ongoing microarray study of soft-tissue sarcoma.
Affiliation(s)
- Wenqing He
- Prosserman Center for Health Research, Samuel Lunenfeld Research Institute, Mount Sinai Hospital, Toronto, Ontario, Canada M5G 1X5.
|