1
Baciu C, Thompson KJ, Mougeot JL, Brooks BR, Weller JW. The LO-BaFL method and ALS microarray expression analysis. BMC Bioinformatics 2012; 13:244. [PMID: 23006766 PMCID: PMC3526454 DOI: 10.1186/1471-2105-13-244] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Received: 03/06/2012] [Accepted: 09/05/2012] [Indexed: 12/12/2022] Open
Abstract
BACKGROUND Sporadic Amyotrophic Lateral Sclerosis (sALS) is a devastating, complex disease of unknown etiology. We studied this disease with microarray technology to capture as much biological complexity as possible. The Affymetrix-focused BaFL pipeline takes into account problems with probes that arise from physical and biological properties, so we adapted it to handle the long-oligonucleotide probes on our arrays (hence LO-BaFL). The revised method was tested against a validated array experiment and then used in a meta-analysis of peripheral white blood cells from healthy control samples in two experiments. We predicted differentially expressed (DE) genes in our sALS data, combining the results obtained using the TM4 suite of tools with those from the LO-BaFL method. Those predictions were tested using qRT-PCR assays.
RESULTS LO-BaFL filtering and DE testing accurately predicted previously validated DE genes in a published experiment on coronary artery disease (CAD). Filtering healthy control data from the sALS and CAD studies with LO-BaFL resulted in highly correlated expression levels across many genes. After bioinformatics analysis, twelve genes from the sALS DE gene list were selected for independent testing using qRT-PCR assays. High-quality RNA from six healthy Control and six sALS samples yielded the predicted differential expression for 7 genes: TARDBP, SKIV2L2, C12orf35, DYNLT1, ACTG1, B2M, and ILKAP. Four of the seven have been previously described in sALS studies, while ACTG1, B2M and ILKAP appear in the context of this disease for the first time. Supplementary material can be accessed at: http://webpages.uncc.edu/~cbaciu/LO-BaFL/supplementary_data.html.
CONCLUSION LO-BaFL predicts DE results that are broadly similar to those of other methods. The small healthy control cohort in the sALS study is a reasonable foundation for predicting DE genes. Modifying the BaFL pipeline allowed us to remove noise and systematic errors, improving the power of this study, which had a small sample size. Each bioinformatics approach revealed DE genes not predicted by the other; subsequent PCR assays confirmed seven of twelve candidates, a relatively high success rate.
Affiliation(s)
- Cristina Baciu
- Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, Charlotte, NC, 28223, USA
- Kevin J Thompson
- Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, Charlotte, NC, 28223, USA
- Jean-Luc Mougeot
- ALS Biomarker Laboratory, Carolinas Neuromuscular/ALS-MDA Center, Department of Neurology, Carolinas Medical Center, Charlotte, NC, 28207, USA
- University of North Carolina School of Medicine, Charlotte Campus, Charlotte, NC, 28203, USA
- Benjamin R Brooks
- ALS Biomarker Laboratory, Carolinas Neuromuscular/ALS-MDA Center, Department of Neurology, Carolinas Medical Center, Charlotte, NC, 28207, USA
- University of North Carolina School of Medicine, Charlotte Campus, Charlotte, NC, 28203, USA
- Jennifer W Weller
- Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, Charlotte, NC, 28223, USA
2
Auslander M, Neumann PM, Tom M. The effect of tert-butyl hydroperoxide on hepatic transcriptome expression patterns in the striped sea bream (Lithognathus mormyrus; Teleostei). Free Radic Res 2010; 44:991-1003. [PMID: 20553222 DOI: 10.3109/10715762.2010.492831] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 11/13/2022]
Abstract
The study was aimed at examining the effects of tert-butyl hydroperoxide (tBHP) on hepatic transcriptome expression patterns of the teleost fish Lithognathus mormyrus. tBHP is an organic hydro-peroxide, widely used as a model pro-oxidant. It generates the reactive oxygen species (ROS) tert-butoxyl and tert-butylperoxyl. Complementary DNAs of tBHP-treated vs control fish were applied onto a previously produced cDNA microarray of approximately 1500 unique sequences. The effects of the tBHP application were demonstrated by leukocyte infiltration into the liver and by differential expression of various genes, some already known to be involved in ROS-related responses. Indicator genes of putative ROS effects were: aldehyde dehydrogenase 3A2, Heme oxygenase and the hemopexin-like protein. Putative indicators of transendothelial leukocyte migration and function were: p22phox, Rac1 and CD63-like genes. Interestingly, 7-dehydrocholesterol reductase was significantly down-regulated in response to all treatments. Several non-annotated genes revealed uniform directions of differential expression in response to all treatments.
3
Roh SW, Abell GCJ, Kim KH, Nam YD, Bae JW. Comparing microarrays and next-generation sequencing technologies for microbial ecology research. Trends Biotechnol 2010; 28:291-9. [PMID: 20381183 DOI: 10.1016/j.tibtech.2010.03.001] [Citation(s) in RCA: 124] [Impact Index Per Article: 8.9] [Received: 12/02/2009] [Revised: 02/18/2010] [Accepted: 03/08/2010] [Indexed: 12/12/2022]
Abstract
Recent advances in molecular biology have resulted in the application of DNA microarrays and next-generation sequencing (NGS) technologies to the field of microbial ecology. This review aims to examine the strengths and weaknesses of each of the methodologies, including depth and ease of analysis, throughput and cost-effectiveness. It also intends to highlight the optimal application of each of the individual technologies toward the study of a particular environment and identify potential synergies between the two main technologies, whereby both sample number and coverage can be maximized. We suggest that the efficient use of microarray and NGS technologies will allow researchers to advance the field of microbial ecology, and importantly, improve our understanding of the role of microorganisms in their various environments.
Affiliation(s)
- Seong Woon Roh
- Department of Life and Nanopharmaceutical Sciences, Kyung Hee University, HoeGi-Dong 1, DongDaeMun-Gu, Republic of Korea
4
Thompson KJ, Deshmukh H, Solka JL, Weller JW. A white-box approach to microarray probe response characterization: the BaFL pipeline. BMC Bioinformatics 2009; 10:449. [PMID: 20040098 PMCID: PMC2804686 DOI: 10.1186/1471-2105-10-449] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 06/26/2009] [Accepted: 12/29/2009] [Indexed: 12/30/2022] Open
Abstract
BACKGROUND Microarrays depend on appropriate probe design to deliver the promise of accurate genome-wide measurement. Probe design, ideally, produces a unique probe-target match with homogeneous duplex stability over the complete set of probes. Much of microarray pre-processing is concerned with adjusting for non-ideal probes that do not report target concentration accurately. Cross-hybridizing probes (non-unique), probe composition and structure, as well as platform effects such as instrument limitations, have been shown to affect the interpretation of signal. Data cleansing pipelines seldom filter specifically for these constraints, relying instead on general statistical tests to remove the most variable probes from the samples in a study. This adjusts probes contributing to ProbeSet (gene) values in a study-specific manner. We refer to the complete set of factors as biologically applied filter levels (BaFL) and have assembled an analysis pipeline for managing them consistently. The pipeline and associated experiments reported here examine the outcome of comprehensively excluding probes affected by known factors on inter-experiment target behavior consistency.
RESULTS We present here a 'white box' probe filtering and intensity transformation protocol that incorporates currently understood factors affecting probe and target interactions; the method has been tested on data from the Affymetrix human GeneChip HG-U95Av2, using two independent datasets from studies of a complex lung adenocarcinoma phenotype. The protocol incorporates probe-specific effects from SNPs, cross-hybridization and low heteroduplex affinity, as well as effects from scanner sensitivity, sample batches, and includes simple statistical tests for identifying unresolved biological factors leading to sample variability. Subsequent to filtering for these factors, the consistency and reliability of the remaining measurements is shown to be markedly improved.
CONCLUSIONS The data cleansing protocol yields reproducible estimates of a given probe or ProbeSet's (gene's) relative expression that translates across datasets, allowing for credible cross-experiment comparisons. We provide supporting evidence for the validity of removing several large classes of probes, and for our approaches for removing outlying samples. The resulting expression profiles demonstrate consistency across the two independent datasets. Finally, we demonstrate that, given an appropriate sampling pool, the method enhances the t-test's statistical power to discriminate significantly different means over sample classes.
Affiliation(s)
- Kevin J Thompson
- Computer Science Dept, University of North Carolina at Charlotte, Charlotte, NC 28223, USA.
5
Blair S, Williams L, Bishop J, Chagovetz A. Microarray temperature optimization using hybridization kinetics. Methods Mol Biol 2009; 529:171-196. [PMID: 19381979 DOI: 10.1007/978-1-59745-538-1_12] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 05/27/2023]
Abstract
In any microarray hybridization experiment, there are contributions at each probe spot due to the match and numerous mismatch target species (i.e., cross-hybridizations). One goal of temperature optimization is to minimize the contribution of mismatch species; however, achieving this goal may come at the expense of obtaining equilibrium reaction conditions. We employ two-component thermodynamic and kinetic models to study the trade-offs involved in temperature optimization. These models show that the maximum selectivity is achieved at equilibrium, but that the mismatch species controls the time to equilibrium via the competitive displacement mechanism. Also, selectivity is improved at lower temperatures. However, the time to equilibrium is also extended, so that greater selectivity cannot be achieved in practice. We also employ a two-color real-time microarray reader to experimentally demonstrate these effects by independently monitoring the match and mismatch species during multiplex hybridization. The only universal criterion that can be employed is to optimize temperature based upon attaining equilibrium reaction conditions. This temperature varies from one probe to another, but can be determined empirically using standard microarray experimentation methods.
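The trade-off between selectivity and time to equilibrium can be illustrated with a minimal two-component competitive-binding sketch. This is not the authors' model: it assumes a generic Langmuir-type form in which match and mismatch targets compete for the same probe sites, and every rate constant and concentration below (`k_on`, `k_off_match`, `k_off_mismatch`, `c_match`, `c_mismatch`) is a hypothetical value chosen only for illustration.

```python
def simulate(t_end, dt=0.01, k_on=1.0, c_match=1.0, c_mismatch=1.0,
             k_off_match=0.01, k_off_mismatch=0.1):
    """Forward-Euler integration of two species competing for one probe spot.

    Returns the fraction of probe sites occupied by the match and by the
    mismatch species at time t_end (arbitrary, assumed units).
    """
    theta_m = theta_x = 0.0
    for _ in range(int(round(t_end / dt))):
        free = 1.0 - theta_m - theta_x  # unoccupied probe fraction
        d_m = k_on * c_match * free - k_off_match * theta_m
        d_x = k_on * c_mismatch * free - k_off_mismatch * theta_x
        theta_m += dt * d_m
        theta_x += dt * d_x
    return theta_m, theta_x


def selectivity(t_end, **params):
    """Ratio of match to mismatch occupancy at time t_end."""
    theta_m, theta_x = simulate(t_end, **params)
    return theta_m / theta_x


# At equilibrium the ratio tends to (k_on*c_match/k_off_match) /
# (k_on*c_mismatch/k_off_mismatch), which is 10 for the defaults above.
early = selectivity(1.0)    # shortly after hybridization starts
late = selectivity(200.0)   # long after the slow relaxation has decayed
print(f"selectivity early: {early:.2f}, late: {late:.2f}")
```

In this toy setting both species bind at the same rate at first, so early selectivity is close to 1; the match species gains its advantage only as the more weakly bound mismatch dissociates and is displaced, which is why the mismatch off-rate sets the time needed to approach the equilibrium ratio.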
Affiliation(s)
- Steve Blair
- University of Utah, Salt Lake City, Utah, USA
6
Potter DP, Yan P, Huang THM, Lin S. Probe signal correction for differential methylation hybridization experiments. BMC Bioinformatics 2008; 9:453. [PMID: 18947421 PMCID: PMC2603337 DOI: 10.1186/1471-2105-9-453] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Received: 06/02/2008] [Accepted: 10/23/2008] [Indexed: 12/31/2022] Open
Abstract
BACKGROUND Non-biological signal (or noise) has been the bane of microarray analysis. Hybridization effects related to probe-sequence composition and DNA dye-probe interactions have been observed in differential methylation hybridization (DMH) microarray experiments as well as other effects inherent to the DMH protocol.
RESULTS We suggest two models to correct for non-biologically relevant probe signal with an overarching focus on probe-sequence composition. The estimated effects are evaluated and the strengths of the models are considered in the context of DMH analyses.
CONCLUSION The majority of estimated parameters were statistically significant in all considered models. Model selection for signal correction is based on interpretation of the estimated values and their biological significance.
Affiliation(s)
- Dustin P Potter
- Human Cancer Genetics Program, OSU Comprehensive Cancer Center, The Ohio State University, Columbus, OH, USA.
7
Auslander M, Yudkovski Y, Chalifa-Caspi V, Herut B, Ophir R, Reinhardt R, Neumann PM, Tom M. Pollution-affected fish hepatic transcriptome and its expression patterns on exposure to cadmium. Marine Biotechnology (New York, N.Y.) 2008; 10:250-261. [PMID: 18213484 PMCID: PMC2921062 DOI: 10.1007/s10126-007-9060-y] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Received: 08/17/2007] [Revised: 09/16/2007] [Accepted: 09/28/2007] [Indexed: 05/25/2023]
Abstract
Individuals of the fish Lithognathus mormyrus were exposed to a series of pollutants including: benzo[a]pyrene, pp-DDE, Aroclor 1254, perfluorooctanoic acid, tributyl-tin chloride, lindane, estradiol, 4-nonylphenol, methyl mercury chloride, and cadmium chloride. Five mixtures of the pollutants were injected. Each mixture included one to three compounds. A microarray was constructed using 4608 L. mormyrus hepatic cDNAs cloned from the pollutant-exposed fish. Most clones (4456) were sequenced and assembled into 1494 annotated unique clones. The constructed microarray was used to identify changes in hepatic gene expression profile on exposure to cadmium administered to the fish by feeding or injections. Thirty-one unique clones showed altered expression levels on exposure to cadmium. Prominently differentially expressed genes included elastase 4, carboxypeptidase B, trypsinogen, perforin, complement C31, cytochrome P450 2K5, ceruloplasmin, carboxyl ester lipase, and metallothionein. Twelve sequences have no available annotation. Most genes (23) were downregulated and hypothesized to be affected by general toxicity due to the intensive cadmium exposure regime. The concept of an operational multigene cDNA microarray, aimed at routine and fast biomonitoring of multiple environmental threats, is outlined and the cadmium exposure experiment has been used to demonstrate functional and methodological aspects of the biomonitoring tool. The components of the outlined system include: (1) spotted array, composed of both pollution-affected and constitutively expressed genes, the latter are used for normalization; (2) standard, repeatable labeling procedure of a reference transcript population; and (3) biomarker indices derived from the profile of expression ratio across the pollution-affected genes, between the field-sampled transcript populations and the reference.
Affiliation(s)
- M. Auslander
- Israel Oceanographic and Limnological Research, Haifa, 31080 Israel
- The Technion-Israel Institute of Technology, Faculty of Civil and Environmental Engineering, Technion City, Haifa 32000 Israel
- Y. Yudkovski
- Israel Oceanographic and Limnological Research, Haifa, 31080 Israel
- V. Chalifa-Caspi
- National Institute for Biotechnology in the Negev, Ben-Gurion University of the Negev, Beer Sheva, 84105 Israel
- B. Herut
- Israel Oceanographic and Limnological Research, Haifa, 31080 Israel
- R. Ophir
- Weizmann Institute of Science, 71600 Rehovot, Israel
- R. Reinhardt
- Max Planck Institute-Molecular Genetics, 14195 Berlin-Dahlem, Germany
- P. M. Neumann
- The Technion-Israel Institute of Technology, Faculty of Civil and Environmental Engineering, Technion City, Haifa 32000 Israel
- M. Tom
- Israel Oceanographic and Limnological Research, Haifa, 31080 Israel
8
Yudkovski Y, Shechter A, Chalifa-Caspi V, Auslander M, Ophir R, Dauphin-Villemant C, Waterman M, Sagi A, Tom M. Hepatopancreatic multi-transcript expression patterns in the crayfish Cherax quadricarinatus during the moult cycle. Insect Molecular Biology 2007; 16:661-674. [PMID: 18092996 DOI: 10.1111/j.1365-2583.2007.00762.x] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.5] [Indexed: 05/25/2023]
Abstract
Alterations of hepatopancreatic multi-transcript expression patterns, related to induced moult cycle, were identified in male Cherax quadricarinatus through cDNA microarray hybridizations of hepatopancreatic transcript populations. Moult was induced by X-organ sinus gland extirpation or by repeated injections of 20-hydroxyecdysone. Manipulated males were sacrificed at premoult or early postmoult, and a reference population was sacrificed at intermoult. Differentially expressed genes among the four combinations of two induction methods and two moult stages were identified. Biologically interesting clusters revealing concurrently changing transcript expressions across treatments were selected, characterized by a general shift of expression throughout premoult and early postmoult vs. intermoult, or by different premoult vs. postmoult expressions. A number of genes were differentially expressed in 20-hydroxyecdysone-injected crayfish vs. X-organ sinus gland extirpated males.
Affiliation(s)
- Y Yudkovski
- Israel Oceanographic and Limnological Research, Haifa, Israel
9
Casneuf T, Van de Peer Y, Huber W. In situ analysis of cross-hybridisation on microarrays and the inference of expression correlation. BMC Bioinformatics 2007; 8:461. [PMID: 18039370 PMCID: PMC2213692 DOI: 10.1186/1471-2105-8-461] [Citation(s) in RCA: 57] [Impact Index Per Article: 3.4] [Received: 09/05/2007] [Accepted: 11/26/2007] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Microarray co-expression signatures are an important tool for studying gene function and relations between genes. In addition to genuine biological co-expression, correlated signals can result from technical deficiencies like hybridization of reporters with off-target transcripts. An approach that is able to distinguish these factors permits the detection of more biologically relevant co-expression signatures.
RESULTS We demonstrate a positive relation between off-target reporter alignment strength and expression correlation in data from oligonucleotide genechips. Furthermore, we describe a method that allows the identification, from their expression data, of individual probe sets affected by off-target hybridization.
CONCLUSION The effects of off-target hybridization on expression correlation coefficients can be substantial, and can be alleviated by more accurate mapping between microarray reporters and the target transcriptome. We recommend attention to the mapping for any microarray analysis of gene expression patterns.
Affiliation(s)
- Tineke Casneuf
- Department of Plant Systems Biology, VIB, B-9052 Ghent, Belgium.
10
Abstract
Quantitative analysis of DNA microarray data is complicated by uncertainties inherent to the experimental setup. Using computer simulations and real-time experimental results, we have previously demonstrated effects of multiplex reactions on a single sensing zone of an array, which may be a leading factor in erroneous interpretation of experimental data. We suggest here that a simplified three-component kinetic model may present a sufficient approximation to describe the general case of DNA sensing in a complex sample milieu. We show that, by analyzing the real-time hybridization kinetics of a nontarget species, we can perform quantitative analysis of unlabeled targets of interest within a broad dynamic range of concentrations.
11
Cohen R, Chalifa-Caspi V, Williams TD, Auslander M, George SG, Chipman JK, Tom M. Estimating the efficiency of fish cross-species cDNA microarray hybridization. Marine Biotechnology (New York, N.Y.) 2007; 9:491-9. [PMID: 17514486 DOI: 10.1007/s10126-007-9010-8] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.1] [Received: 03/06/2007] [Revised: 03/06/2007] [Accepted: 03/09/2007] [Indexed: 04/12/2023]
Abstract
Using an available cross-species cDNA microarray is advantageous for examining multigene expression patterns in non-model organisms, saving the need for construction of species-specific arrays. The aim of the present study was to estimate relative efficiency of cross-species hybridizations across bony fishes, using bioinformatics tools. The methodology may serve also as a model for similar evaluations in other taxa. The theoretical evaluation was done by substituting comparative whole-transcriptome sequence similarity information into the thermodynamic hybridization equation. Complementary DNA sequence assemblages of nine fish species belonging to common families or suborders and distributed across the bony fish taxonomic branch were selected for transcriptome-wise comparisons. Actual cross-species hybridizations among fish of different taxonomic distances were used to validate and eventually to calibrate the theoretically computed relative efficiencies.
Affiliation(s)
- Raphael Cohen
- National Institute for Biotechnology in Negev, Ben Gurion University of Negev, Beer-Sheva 84105, Israel
12
Chandler DP, Jarrell AE, Roden ER, Golova J, Chernov B, Schipma MJ, Peacock AD, Long PE. Suspension array analysis of 16S rRNA from Fe- and SO₄²⁻-reducing bacteria in uranium-contaminated sediments undergoing bioremediation. Appl Environ Microbiol 2006; 72:4672-87. [PMID: 16820459 PMCID: PMC1489301 DOI: 10.1128/aem.02858-05] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.0] [Indexed: 11/20/2022] Open
Abstract
A 16S rRNA-targeted tunable bead array was developed and used in a retrospective analysis of metal- and sulfate-reducing bacteria in contaminated subsurface sediments undergoing in situ U(VI) bioremediation. Total RNA was extracted from subsurface sediments and interrogated directly, without a PCR step. Bead array validation studies with total RNA derived from 24 isolates indicated that the behavior and response of the 16S rRNA-targeted oligonucleotide probes could not be predicted based on the primary nucleic acid sequence. Likewise, signal intensity (absolute or normalized) could not be used to assess the abundance of one organism (or rRNA) relative to the abundance of another organism (or rRNA). Nevertheless, the microbial community structure and dynamics through time and space and as measured by the rRNA-targeted bead array were consistent with previous data acquired at the site, where indigenous sulfate- and iron-reducing bacteria and near neighbors of Desulfotomaculum were the organisms that were most responsive to a change in injected acetate concentrations. Bead array data were best interpreted by analyzing the relative changes in the probe responses for spatially and temporally related samples and by considering only the response of one probe to itself in relation to a background (reference) environmental sample. By limiting the interpretation of the data in this manner and placing it in the context of supporting geochemical and microbiological analyses, we concluded that ecologically relevant and meaningful information can be derived from direct microarray analysis of rRNA in uncharacterized environmental samples, even with the current analytical uncertainty surrounding the behavior of individual probes on tunable bead arrays.
Affiliation(s)
- Darrell P Chandler
- Argonne National Laboratory, 9700 South Cass Avenue, Building 202, A-249, Argonne, IL 60439, USA.
13
Chen YA, Chou CC, Lu X, Slate EH, Peck K, Xu W, Voit EO, Almeida JS. A multivariate prediction model for microarray cross-hybridization. BMC Bioinformatics 2006; 7:101. [PMID: 16509965 PMCID: PMC1409802 DOI: 10.1186/1471-2105-7-101] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.6] [Received: 09/12/2005] [Accepted: 03/01/2006] [Indexed: 11/17/2022] Open
Abstract
Background Expression microarray analysis is one of the most popular molecular diagnostic techniques in the post-genomic era. However, this technique faces the fundamental problem of potential cross-hybridization. This is a pervasive problem for both oligonucleotide and cDNA microarrays; it is considered particularly problematic for the latter. No comprehensive multivariate predictive modeling has been performed to understand how multiple variables contribute to (cross-) hybridization.
Results We propose a systematic search strategy using multiple multivariate models [multiple linear regressions, regression trees, and artificial neural network analyses (ANNs)] to select an effective set of predictors for hybridization. We validate this approach on a set of DNA microarrays with cytochrome p450 family genes. The performance of our multiple multivariate models is compared with that of a recently proposed third-order polynomial regression method that uses percent identity as the sole predictor. All multivariate models agree that the 'most contiguous base pairs between probe and target sequences,' rather than percent identity, is the best univariate predictor. The predictive power is improved by inclusion of additional nonlinear effects, in particular target GC content, when regression trees or ANNs are used.
Conclusion A systematic multivariate approach is provided to assess the importance of multiple sequence features for hybridization and of relationships among these features. This approach can easily be applied to larger datasets. This will allow future developments of generalized hybridization models that will be able to correct for false-positive cross-hybridization signals in expression experiments.
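The distinction between the two univariate predictors named in this abstract can be made concrete with a small sketch. It assumes a gapless alignment in which probe and target are given as equal-length strings in the same sense (complementarity already resolved); the function names and the example sequences are hypothetical and are not taken from the paper.

```python
def percent_identity(probe: str, target: str) -> float:
    """Percentage of aligned positions at which probe and target agree."""
    if not probe or len(probe) != len(target):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(p == t for p, t in zip(probe, target))
    return 100.0 * matches / len(probe)


def longest_contiguous_match(probe: str, target: str) -> int:
    """Length of the longest uninterrupted run of matching positions."""
    if len(probe) != len(target):
        raise ValueError("sequences must be of equal length")
    best = run = 0
    for p, t in zip(probe, target):
        run = run + 1 if p == t else 0  # a mismatch resets the current run
        best = max(best, run)
    return best


PROBE = "ACGTACGTAC"
TARGET_A = "ACCTACGAAC"  # two mismatches spread through the sequence
TARGET_B = "TGGTACGTAC"  # two mismatches clustered at one end

for name, target in (("A", TARGET_A), ("B", TARGET_B)):
    print(name, percent_identity(PROBE, target),
          longest_contiguous_match(PROBE, target))
```

Both hypothetical targets are 80% identical to the probe, yet their longest matching runs are 4 and 8 bases respectively, which shows how the contiguous-match feature can separate cases that percent identity cannot.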
Affiliation(s)
- Yian A Chen
- Department of Biostatistics, Bioinformatics, and Epidemiology, Medical University of South Carolina, Charleston, SC, USA
- Cheng-Chung Chou
- Center for Genomic Medicine, National Taiwan University, Taipei, Taiwan
- Xinghua Lu
- Department of Biostatistics, Bioinformatics, and Epidemiology, Medical University of South Carolina, Charleston, SC, USA
- Elizabeth H Slate
- Department of Biostatistics, Bioinformatics, and Epidemiology, Medical University of South Carolina, Charleston, SC, USA
- Konan Peck
- Institute of Biomedical Sciences, Academia Sinica, Taipei, Taiwan
- Wenying Xu
- Key Laboratory of Molecular and Developmental Biology, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing, P. R China
- Eberhard O Voit
- Department of Biomedical Engineering, Georgia Tech, Atlanta, GA, USA
- Jonas S Almeida
- Department of Biostatistics and Applied Mathematics, University of Texas MD Anderson Cancer Center, Houston, TX, USA
14
Newkirk HL, Knoll JHM, Rogan PK. Distortion of quantitative genomic and expression hybridization by Cot-1 DNA: mitigation of this effect. Nucleic Acids Res 2005; 33:e191. [PMID: 16356923 PMCID: PMC1316118 DOI: 10.1093/nar/gni190] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.5] [Indexed: 11/30/2022] Open
Abstract
Cross-hybridization of repetitive sequences in genomic and expression arrays is reported to be suppressed with repeat-blocking nucleic acids (Cot-1 DNA). Contrary to expectation, we demonstrated that Cot-1 also enhanced non-specific hybridization between probes and genomic targets. When added to target DNA, Cot-1 enhanced hybridization (2.2- to 3-fold) to genomic probes containing conserved repetitive elements. In addition to repetitive sequences, Cot-1 was found to be enriched for linked single copy (sc) sequences. Adventitious association between these sequences and probes distort quantitative measurements of the probes hybridized to desired genomic targets. Quantitative microarray hybridization studies using Cot-1 DNA are also susceptible to these effects, especially for probes that map to genomic regions containing conserved repetitive sequences. Hybridization measurements with such probes are less reproducible in the presence of Cot-1 than for probes derived from sc regions or regions containing divergent repeat elements, a finding with significant ramifications for genomic and expression microarray studies. We mitigated the requirement for Cot-1 either by hybridizing with computationally defined sc probes lacking repeats or by substituting synthetic repetitive elements complementary to sequences in genomic probes.
Affiliation(s)
- Heather L Newkirk
- Laboratories of Genomic Disorders, Children's Mercy Hospital, University of Missouri-Kansas City School of Medicine, KS, USA
15
Lezar S, Myburg AA, Berger DK, Wingfield MJ, Wingfield BD. Development and assessment of microarray-based DNA fingerprinting in Eucalyptus grandis. TAG. Theoretical and Applied Genetics. Theoretische und angewandte Genetik 2004; 109:1329-36. [PMID: 15290050 DOI: 10.1007/s00122-004-1759-9] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.7] [Received: 08/28/2003] [Accepted: 06/09/2004] [Indexed: 05/02/2023]
Abstract
Development of improved Eucalyptus genotypes involves the routine identification of breeding stock and superior clones. Currently, microsatellites and random amplified polymorphic DNA markers are the most widely used DNA-based techniques for fingerprinting of these trees. While these techniques have provided rapid and powerful fingerprinting assays, they are constrained by their reliance on gel or capillary electrophoresis, and therefore, relatively low throughput of fragment analysis. In contrast, recently developed microarray technology holds the promise of parallel analysis of thousands of markers in plant genomes. The aim of this study was to develop a DNA fingerprinting chip for Eucalyptus grandis and to investigate its usefulness for fingerprinting of eucalypt trees. A prototype chip was prepared using a partial genomic library from total genomic DNA of 23 E. grandis trees, of which 22 were full siblings. A total of 384 cloned genomic fragments were individually amplified and arrayed onto glass slides. DNA fingerprints were obtained for 17 individuals by hybridizing labeled genome representations of the individual trees to the 384-element chip. Polymorphic DNA fragments were identified by evaluating the binary distribution of their background-corrected signal intensities across full-sib individuals. Among 384 DNA fragments on the chip, 104 (27%) were found to be polymorphic. Hybridization of these polymorphic fragments was highly repeatable (R² > 0.91) within the E. grandis individuals, and they allowed us to identify all 17 full-sib individuals. Our results suggest that DNA microarrays can be used to effectively fingerprint large numbers of closely related Eucalyptus trees.
Affiliation(s)
- Sabine Lezar
- Department of Genetics, University of Pretoria, Pretoria, 0020, South Africa
|
16
|
Wren JD, Yao M, Langer M, Conway T. Simulated Annealing of Microarray Data Reduces Noise and Enables Cross-Experimental Comparisons. DNA Cell Biol 2004; 23:695-700. [PMID: 15585127 DOI: 10.1089/dna.2004.23.695] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 11/12/2022] Open
Abstract
Microarrays are a powerful tool for assessing the genome-wide induction of a transcriptional response to internal or external stimuli, but are not considered quantitatively rigorous (i.e., the signal intensity of hybridized probe is normally used to quantify relative transcript abundance). Thus, it is difficult, if not impossible, to accurately compare separate microarray experiments without a reference standard. However, even among replicated microarray experiments, each gene varies significantly in the amount of signal detected, suggesting no single gene would be appropriate as a standard. We propose and test a method to "align" experimental transcription profiles to a set of reference experiments using simulated annealing (SA), essentially using the relative positions of all genes as a reference standard. SA attempts to find a globally optimal adjustment factor for the relative expression level of each experimental gene expression signal, given a previously observed range of gene expression measurements. By defining a relative dynamic range of gene expression under control conditions for all genes, we can more accurately compare transcription profiles between separate experiments and, potentially, between species--enabling comparative transcriptomics. Testing SA on a published dataset, we find that it significantly reduces interexperimental variation, suggesting it holds promise to accomplish this goal.
Affiliation(s)
- Jonathan D Wren
- Advanced Center for Genome Technology, The University of Oklahoma, Norman, Oklahoma 73019, USA.
|
17
|
Flikka K, Yadetie F, Laegreid A, Jonassen I. XHM: a system for detection of potential cross hybridizations in DNA microarrays. BMC Bioinformatics 2004; 5:117. [PMID: 15333145 PMCID: PMC517492 DOI: 10.1186/1471-2105-5-117] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.0] [Received: 05/14/2004] [Accepted: 08/27/2004] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Microarrays have emerged as the preferred platform for high throughput gene expression analysis. Cross-hybridization among genes with high sequence similarities can be a source of error reducing the reliability of DNA microarray results. RESULTS We have developed a tool called XHM (cross hybridization on microarrays) for assessment of the reliability of hybridization signals by detecting potential cross-hybridizations on DNA microarrays. This is done by comparing the sequences of the probes against an extensive database representing the transcriptome of the organism in question. XHM is available online at http://www.bioinfo.no/tools/xhm/. CONCLUSIONS Using XHM with its user-adjustable parameters will enable scientists to check their lists of differentially expressed genes from microarray experiments for potential cross-hybridizations. This provides information that may be useful in the validation of the microarray results.
Affiliation(s)
- Kristian Flikka
- Computational Biology Unit, Bergen Center for Computational Science, UNIFOB/UiB, Thormoehlensgt.55, N-5008 Bergen, Norway
- Fekadu Yadetie
- Department of Cancer Research and Molecular Medicine, Norwegian University of Science and Technology, NO-7489 Trondheim, Norway
- Sars International Centre for Marine Molecular Biology, Bergen High Technology Centre, Thormoehlensgt. 55, N-5008 Bergen, Norway
- Astrid Laegreid
- Department of Cancer Research and Molecular Medicine, Norwegian University of Science and Technology, NO-7489 Trondheim, Norway
- Inge Jonassen
- Computational Biology Unit, Bergen Center for Computational Science, UNIFOB/UiB, Thormoehlensgt.55, N-5008 Bergen, Norway
- Department of Informatics, University of Bergen, PB. 7800, N-5020 Bergen, Norway
|