1
Cheng YH. A Novel Teaching-Learning-Based Optimization for Improved Mutagenic Primer Design in Mismatch PCR-RFLP SNP Genotyping. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2016; 13:86-98. [PMID: 26886734 DOI: 10.1109/tcbb.2015.2430354] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Indexed: 06/05/2023]
Abstract
Many single nucleotide polymorphisms (SNPs) for complex genetic diseases are genotyped by polymerase chain reaction-restriction fragment length polymorphism (PCR-RFLP) in small-scale basic research studies. Designing a feasible PCR-RFLP primer pair and finding available restriction enzymes that recognize the target SNP are essential steps for PCR experiments. However, many SNPs cannot be genotyped by PCR-RFLP directly, which makes SNP genotyping impractical. A genetic algorithm (GA) has been proposed for designing mutagenic primers and finding available restriction enzymes, but it yields unrefined mutagenic primers. To improve mutagenic primer design, we propose TLBOMPD (TLBO-based Mutagenic Primer Design), a novel computational intelligence-based method that uses the notion of "teaching and learning" to search for more feasible mutagenic primers and provide the latest available restriction enzymes. The original Wallace's formula for calculating melting temperature is retained, and more accurate formulas for GC-based melting temperature and thermodynamic melting temperature are introduced into the proposed method. The mutagenic matrix is also retained to increase the efficiency of judging whether a hypothetical mutagenic primer involves available restriction enzymes that recognize the target SNP. Furthermore, the core of SNP-RFLPing version 2 is used to enhance the mining of restriction enzymes based on the latest REBASE. Twenty-five SNPs with mismatch PCR-RFLP, screened from 288 SNPs in the human SLC6A4 gene, are used to appraise TLBOMPD, and the computational results are compared with those of GAMPD. In the future, the use of the mutagenic primers in the wet lab needs to be validated carefully to increase the reliability of the method. TLBOMPD is implemented in Java and is freely available at http://tlbompd.googlecode.com/.
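As an illustration of the Wallace rule mentioned in this abstract, the melting temperature of a short oligonucleotide is commonly estimated as 2 °C per A/T base plus 4 °C per G/C base. The sketch below is a minimal Python example under that assumption; the function name `wallace_tm` is hypothetical and the code is not taken from TLBOMPD, which is implemented in Java.

```python
def wallace_tm(primer: str) -> int:
    """Estimate melting temperature (degrees C) with the Wallace rule.

    Tm = 2 * (A + T) + 4 * (G + C)
    Illustrative helper only; generally regarded as a rough estimate
    for short primers.
    """
    seq = primer.upper()
    # Reject empty input and anything other than unambiguous DNA bases.
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("primer must be a non-empty string of A, C, G, T")
    at_count = seq.count("A") + seq.count("T")
    gc_count = seq.count("G") + seq.count("C")
    return 2 * at_count + 4 * gc_count


if __name__ == "__main__":
    # Four A/T bases and four G/C bases: 2*4 + 4*4
    print(wallace_tm("ATGCATGC"))
```

The GC-based and thermodynamic formulas that the abstract describes as more accurate are not reproduced here, because the abstract does not state them.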
2
Tulpan D, Ghiggi A, Montemanni R. Computational Sequence Design Techniques for DNA Microarray Technologies. Bioinformatics 2013. [DOI: 10.4018/978-1-4666-3604-0.ch048] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022]
Abstract
In systems biology and biomedical research, microarray technology is a method of choice that enables the complete quantitative and qualitative ascertainment of gene expression patterns for whole genomes. The selection of high quality oligonucleotide sequences that behave consistently across multiple experiments is a key step in the design, fabrication and experimental performance of DNA microarrays. The aim of this chapter is to outline recent algorithmic developments in microarray probe design, evaluate existing probe sequences used in commercial arrays, and suggest methodologies that have the potential to improve on existing design techniques.
Affiliation(s)
- Dan Tulpan
- National Research Council of Canada, Canada
- Roberto Montemanni
- Istituto Dalle Molle di Studi sull’Intelligenza Artificiale, Switzerland
3
UniPrimer: A Web-Based Primer Design Tool for Comparative Analyses of Primate Genomes. Comp Funct Genomics 2012; 2012:520732. [PMID: 22693428 PMCID: PMC3368176 DOI: 10.1155/2012/520732] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 01/19/2012] [Accepted: 03/15/2012] [Indexed: 01/31/2023]
Abstract
Whole genome sequences of various primates have been released due to advanced DNA-sequencing technology. A combination of computational data mining and the polymerase chain reaction (PCR) assay to validate the data is an excellent method for conducting comparative genomics. Thus, designing primers for PCR is an essential procedure for a comparative analysis of primate genomes. Here, we developed and introduced UniPrimer for use in those studies. UniPrimer is a web-based tool that designs PCR and DNA-sequencing primers. It compares the sequences from six different primates (human, chimpanzee, gorilla, orangutan, gibbon, and rhesus macaque) and designs primers in the regions conserved across species. UniPrimer is linked to the RepeatMasker, Primer3Plus, and OligoCalc software to produce primers with high accuracy, and to UCSC In-Silico PCR to confirm whether the designed primers work. To test the performance of UniPrimer, we designed primers on sample sequences using UniPrimer and manually designed primers for the same sequences. The comparison of the two processes showed that UniPrimer was more effective than manual work in terms of saving time and reducing errors.
4
Tulpan D, Ghiggi A, Montemanni R. Computational Sequence Design Techniques for DNA Microarray Technologies. Systemic Approaches in Bioinformatics and Computational Systems Biology 2011. [DOI: 10.4018/978-1-61350-435-2.ch003] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 01/29/2023]
Abstract
In systems biology and biomedical research, microarray technology is a method of choice that enables the complete quantitative and qualitative ascertainment of gene expression patterns for whole genomes. The selection of high quality oligonucleotide sequences that behave consistently across multiple experiments is a key step in the design, fabrication and experimental performance of DNA microarrays. The aim of this chapter is to outline recent algorithmic developments in microarray probe design, evaluate existing probe sequences used in commercial arrays, and suggest methodologies that have the potential to improve on existing design techniques.
Affiliation(s)
- Dan Tulpan
- National Research Council of Canada, Canada
- Roberto Montemanni
- Istituto Dalle Molle di Studi sull’Intelligenza Artificiale (IDSIA), Switzerland
5
Lemoine S, Combes F, Le Crom S. An evaluation of custom microarray applications: the oligonucleotide design challenge. Nucleic Acids Res 2009; 37:1726-39. [PMID: 19208645 PMCID: PMC2665234 DOI: 10.1093/nar/gkp053] [Citation(s) in RCA: 37] [Impact Index Per Article: 2.5] [Indexed: 11/14/2022]
Abstract
The increase in feature resolution and the availability of multipack formats from microarray providers has opened the way to various custom genomic applications. However, oligonucleotide design and selection remains a bottleneck of the microarray workflow. Several tools are available to perform this work, and choosing the best one is not an easy task, nor are the choices obvious. Here we review the oligonucleotide design field to help users make their choice. We have first performed a comparative evaluation of the available solutions based on a set of criteria including: ease of installation, user-friendly access, the number of parameters and settings available. In a second step, we chose to submit two real cases to a selection of programs. Finally, we used a set of tests for the in silico benchmark of the oligo sets obtained from each type of software. We show that the design software must be selected according to the goal of the scientist, depending on factors such as the organism used, the number of probes required and their localization on the target sequence. The present work provides keys to the choice of the most relevant software, according to the various parameters we tested.
Affiliation(s)
- Sophie Lemoine
- INSERM, CNRS, IFR36, Plate-forme Transcriptome, Paris, France
6
You FM, Huo N, Gu YQ, Luo MC, Ma Y, Hane D, Lazo GR, Dvorak J, Anderson OD. BatchPrimer3: a high throughput web application for PCR and sequencing primer design. BMC Bioinformatics 2008; 9:253. [PMID: 18510760 PMCID: PMC2438325 DOI: 10.1186/1471-2105-9-253] [Citation(s) in RCA: 484] [Impact Index Per Article: 30.3] [Received: 10/26/2007] [Accepted: 05/29/2008] [Indexed: 01/25/2023]
Abstract
Background Microsatellite (simple sequence repeat – SSR) and single nucleotide polymorphism (SNP) markers are two types of important genetic markers useful in genetic mapping and genotyping. Often, large-scale genomic research projects require high-throughput computer-assisted primer design. Numerous such web-based or stand-alone programs for PCR primer design are available but vary in quality and functionality. In particular, most programs lack batch primer design capability. Such a high-throughput software tool for designing SSR flanking primers and SNP genotyping primers is increasingly demanded. Results A new web primer design program, BatchPrimer3, is developed based on Primer3. BatchPrimer3 adopted the Primer3 core program as a major primer design engine to choose the best primer pairs. A new score-based primer picking module is incorporated into BatchPrimer3 and used to pick position-restricted primers. BatchPrimer3 v1.0 implements several types of primer designs including generic primers, SSR primers together with SSR detection, and SNP genotyping primers (including single-base extension primers, allele-specific primers, and tetra-primers for tetra-primer ARMS PCR), as well as DNA sequencing primers. DNA sequences in FASTA format can be batch read into the program. The basic information of input sequences, as a reference of parameter setting of primer design, can be obtained by pre-analysis of sequences. The input sequences can be pre-processed and masked to exclude and/or include specific regions, or set targets for different primer design purposes as in Primer3Web and primer3Plus. A tab-delimited or Excel-formatted primer output also greatly facilitates the subsequent primer-ordering process. Thousands of primers, including wheat conserved intron-flanking primers, wheat genome-specific SNP genotyping primers, and Brachypodium SSR flanking primers in several genome projects have been designed using the program and validated in several laboratories.
Conclusion BatchPrimer3 is a comprehensive web primer design program to develop different types of primers in a high-throughput manner. Additional methods of primer design can be easily integrated into future versions of BatchPrimer3. The program with source code and thousands of PCR and sequencing primers designed for wheat and Brachypodium are accessible at http://wheat.pw.usda.gov/demos/BatchPrimer3/.
Affiliation(s)
- Frank M You
- Department of Plant Sciences, University of California, CA 95616, USA.
7
You FM, Huo N, Gu YQ, Luo MC, Ma Y, Hane D, Lazo GR, Dvorak J, Anderson OD. BatchPrimer3: a high throughput web application for PCR and sequencing primer design. BMC Bioinformatics 2008. [PMID: 18510760 DOI: 10.1186/1471-2105-9-253] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 12/17/2022]
Abstract
BACKGROUND Microsatellite (simple sequence repeat - SSR) and single nucleotide polymorphism (SNP) markers are two types of important genetic markers useful in genetic mapping and genotyping. Often, large-scale genomic research projects require high-throughput computer-assisted primer design. Numerous such web-based or stand-alone programs for PCR primer design are available but vary in quality and functionality. In particular, most programs lack batch primer design capability. Such a high-throughput software tool for designing SSR flanking primers and SNP genotyping primers is increasingly demanded. RESULTS A new web primer design program, BatchPrimer3, is developed based on Primer3. BatchPrimer3 adopted the Primer3 core program as a major primer design engine to choose the best primer pairs. A new score-based primer picking module is incorporated into BatchPrimer3 and used to pick position-restricted primers. BatchPrimer3 v1.0 implements several types of primer designs including generic primers, SSR primers together with SSR detection, and SNP genotyping primers (including single-base extension primers, allele-specific primers, and tetra-primers for tetra-primer ARMS PCR), as well as DNA sequencing primers. DNA sequences in FASTA format can be batch read into the program. The basic information of input sequences, as a reference of parameter setting of primer design, can be obtained by pre-analysis of sequences. The input sequences can be pre-processed and masked to exclude and/or include specific regions, or set targets for different primer design purposes as in Primer3Web and primer3Plus. A tab-delimited or Excel-formatted primer output also greatly facilitates the subsequent primer-ordering process. Thousands of primers, including wheat conserved intron-flanking primers, wheat genome-specific SNP genotyping primers, and Brachypodium SSR flanking primers in several genome projects have been designed using the program and validated in several laboratories.
CONCLUSION BatchPrimer3 is a comprehensive web primer design program to develop different types of primers in a high-throughput manner. Additional methods of primer design can be easily integrated into future versions of BatchPrimer3. The program with source code and thousands of PCR and sequencing primers designed for wheat and Brachypodium are accessible at http://wheat.pw.usda.gov/demos/BatchPrimer3/.
Affiliation(s)
- Frank M You
- Department of Plant Sciences, University of California, CA 95616, USA.
8
DNA microarray analysis of central carbohydrate metabolism: glycolytic/gluconeogenic carbon switch in the hyperthermophilic crenarchaeum Thermoproteus tenax. J Bacteriol 2008; 190:2231-8. [PMID: 18178743 DOI: 10.1128/jb.01524-07] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.1] [Indexed: 11/20/2022]
Abstract
In order to unravel the role of regulation on transcript level in central carbohydrate metabolism (CCM) of Thermoproteus tenax, a focused DNA microarray was constructed by using 85 open reading frames involved in CCM. A transcriptional analysis comparing heterotrophic growth on glucose versus autotrophic growth on CO2-H2 was performed.
9
10
Lange C, Zaigler A, Hammelmann M, Twellmeyer J, Raddatz G, Schuster SC, Oesterhelt D, Soppa J. Genome-wide analysis of growth phase-dependent translational and transcriptional regulation in halophilic archaea. BMC Genomics 2007; 8:415. [PMID: 17997854 PMCID: PMC3225822 DOI: 10.1186/1471-2164-8-415] [Citation(s) in RCA: 37] [Impact Index Per Article: 2.2] [Received: 08/10/2007] [Accepted: 11/12/2007] [Indexed: 12/18/2022]
Abstract
BACKGROUND Differential expression of genes can be regulated on many different levels. Most global studies of gene regulation concentrate on transcript level regulation, and very few global analyses of differential translational efficiencies exist. The studies have revealed that in Saccharomyces cerevisiae, Arabidopsis thaliana, and human cell lines translational regulation plays a significant role. Additional species have not been investigated yet. Particularly, until now no global study of translational control with any prokaryotic species was available. RESULTS A global analysis of translational control was performed with two haloarchaeal model species, Halobacterium salinarum and Haloferax volcanii. To identify differentially regulated genes, exponentially growing and stationary phase cells were compared. More than 20% of H. salinarum transcripts are translated with non-average efficiencies. By far the largest group is comprised of genes that are translated with above-average efficiency specifically in exponential phase, including genes for many ribosomal proteins, RNA polymerase subunits, enzymes, and chemotaxis proteins. Translation of 1% of all genes is specifically repressed in either of the two growth phases. For comparison, DNA microarrays were also used to identify differential transcriptional regulation in H. salinarum, and 17% of all genes were found to have non-average transcript levels in exponential versus stationary phase. In H. volcanii, 12% of all genes are translated with non-average efficiencies. The overlap with H. salinarum is negligible. In contrast to H. salinarum, 4.6% of genes have non-average translational efficiency in both growth phases, and thus they might be regulated by other stimuli than growth phase. CONCLUSION For the first time in any prokaryotic species it was shown that a significant fraction of genes is under differential translational control. Groups of genes with different regulatory patterns were discovered. 
However, neither the fractions nor the identity of regulated genes are conserved between H. salinarum and H. volcanii, indicating that prokaryotes as well as eukaryotes use differential translational control for the regulation of gene expression, but that the identity of regulated genes is not conserved. For 70 H. salinarum genes potentiation of regulation was observed, but for the majority of regulated genes either transcriptional or translational regulation is employed.
Affiliation(s)
- Christian Lange
- Institute for Molecular Biosciences, Johann Wolfgang Goethe University, Max-von-Laue-Strasse 9, 60438 Frankfurt a.M., Germany.
11
Twellmeyer J, Wende A, Wolfertz J, Pfeiffer F, Panhuysen M, Zaigler A, Soppa J, Welzl G, Oesterhelt D. Microarray analysis in the archaeon Halobacterium salinarum strain R1. PLoS One 2007; 2:e1064. [PMID: 17957248 PMCID: PMC2020435 DOI: 10.1371/journal.pone.0001064] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.4] [Received: 06/04/2007] [Accepted: 09/28/2007] [Indexed: 11/24/2022]
Abstract
Background Phototrophy of the extremely halophilic archaeon Halobacterium salinarum was explored for decades. The research was mainly focused on the expression of bacteriorhodopsin and its functional properties. In contrast, less is known about genome wide transcriptional changes and their impact on the physiological adaptation to phototrophy. The tool of choice to record transcriptional profiles is the DNA microarray technique. However, the technique is still rarely used for transcriptome analysis in archaea. Methodology/Principal Findings We developed a whole-genome DNA microarray based on our sequence data of the Hbt. salinarum strain R1 genome. The potential of our tool is exemplified by the comparison of cells growing under aerobic and phototrophic conditions, respectively. We processed the raw fluorescence data by several stringent filtering steps and a subsequent MAANOVA analysis. The study revealed a lot of transcriptional differences between the two cell states. We found that the transcriptional changes were relatively weak, though significant. Finally, the DNA microarray data were independently verified by a real-time PCR analysis. Conclusion/Significance This is the first DNA microarray analysis of Hbt. salinarum cells that were actually grown under phototrophic conditions. By comparing the transcriptomics data with current knowledge we could show that our DNA microarray tool is well applicable for transcriptome analysis in the extremely halophilic archaeon Hbt. salinarum. The reliability of our tool is based on both the high-quality array of DNA probes and the stringent data handling including MAANOVA analysis. Among the regulated genes more than 50% had unknown functions. This underlines the fact that haloarchaeal phototrophy is still far away from being completely understood. Hence, the data recorded in this study will be subject to future systems biology analysis.
Affiliation(s)
- Jens Twellmeyer
- Max-Planck-Institute of Biochemistry, Membrane Biochemistry, Martinsried, Germany
- Andy Wende
- Max-Planck-Institute of Biochemistry, Membrane Biochemistry, Martinsried, Germany
- Jan Wolfertz
- Max-Planck-Institute of Biochemistry, Membrane Biochemistry, Martinsried, Germany
- Friedhelm Pfeiffer
- Max-Planck-Institute of Biochemistry, Membrane Biochemistry, Martinsried, Germany
- Markus Panhuysen
- Max-Planck-Institute of Psychiatry, Molecular Neurogenetics, Munich, Germany
- Alexander Zaigler
- Institute of Molecular Biosciences, University of Frankfurt, Frankfurt am Main, Germany
- Jörg Soppa
- Institute of Molecular Biosciences, University of Frankfurt, Frankfurt am Main, Germany
- Gerhard Welzl
- Institute of Biomathematics and Biometry, Forschungszentrum für Umwelt und Gesundheit (GSF)-National Research Centre for Environment and Health, Neuherberg, Germany
- Dieter Oesterhelt
- Max-Planck-Institute of Biochemistry, Membrane Biochemistry, Martinsried, Germany
- * To whom correspondence should be addressed. E-mail:
12
Feng S, Tillier ERM. A fast and flexible approach to oligonucleotide probe design for genomes and gene families. Bioinformatics 2007; 23:1195-202. [PMID: 17392329 DOI: 10.1093/bioinformatics/btm114] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.6] [Indexed: 11/14/2022]
Abstract
MOTIVATION With hundreds of completely sequenced microbial genomes available, and advancements in DNA microarray technology, the detection of genes in microbial communities consisting of hundreds of thousands of sequences may be possible. The existing strategies developed for DNA probe design, geared toward identifying specific sequences, are not suitable due to the lack of coverage, flexibility and efficiency necessary for applications in metagenomics. METHODS ProDesign is a tool developed for the selection of oligonucleotide probes to detect members of gene families present in environmental samples. Gene family-specific probe sequences are generated based on specific and shared words, which are found with the spaced seed hashing algorithm. To detect more sequences, those sharing some common words are re-clustered into new families, then probes specific for the new families are generated. RESULTS The program is very flexible in that it can be used for designing probes for detecting many gene families simultaneously and specifically in one or more genomes. Neither the length nor the melting temperature of the probes needs to be predefined. We have found that ProDesign provides more flexibility, coverage and speed than other software programs used in the selection of probes for genomic and gene family arrays. AVAILABILITY ProDesign is licensed free of charge to academic users. ProDesign and Supplementary Material can be obtained by contacting the authors. A web server for ProDesign is available at http://www.uhnresearch.ca/labs/tillier/ProDesign/ProDesign.html. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
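The idea of finding words shared by a gene family through spaced seeds can be sketched roughly as follows. This is a hypothetical Python illustration, not ProDesign's algorithm: the mask format (a string of `1` for sampled positions and `0` for ignored ones) and the function names `spaced_words` and `shared_words` are assumptions made for this example.

```python
from typing import Iterable, Set


def spaced_words(seq: str, mask: str) -> Set[str]:
    """Collect the spaced-seed words of a sequence.

    At every window of len(mask) bases, only positions where the mask
    has '1' are kept; positions with '0' are ignored (don't-care).
    """
    if not mask or set(mask) - {"0", "1"}:
        raise ValueError("mask must be a non-empty string of 0 and 1")
    keep = [i for i, bit in enumerate(mask) if bit == "1"]
    seq = seq.upper()
    return {
        "".join(seq[start + i] for i in keep)
        for start in range(len(seq) - len(mask) + 1)
    }


def shared_words(family: Iterable[str], mask: str) -> Set[str]:
    """Words present in every sequence of a (hypothetical) gene family."""
    word_sets = [spaced_words(s, mask) for s in family]
    return set.intersection(*word_sets) if word_sets else set()


if __name__ == "__main__":
    # The two sequences differ at one base but still share a word
    # because the middle position of the mask is ignored.
    print(shared_words(["ACGT", "ACTT"], "101"))
```

Storing the words in hash-based sets keeps membership tests cheap, which is the general motivation for hashing seeds; the clustering and probe-selection steps described in the abstract are not shown.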
Affiliation(s)
- Shengzhong Feng
- Institute of Computing Technology, Chinese Academy of Sciences, China
13
Taboada EN, Luebbert CC, Nash JHE. Studying bacterial genome dynamics using microarray-based comparative genomic hybridization. Methods Mol Biol 2007; 396:223-53. [PMID: 18025696 DOI: 10.1007/978-1-59745-515-2_15] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Indexed: 12/20/2022]
Abstract
Genome sequencing has revealed the remarkable amount of genetic diversity that can be encountered in bacterial genomes. In particular, the comparison of genome sequences from closely related strains has uncovered significant differences in gene content, hinting at the dynamic nature of bacterial genomes. The study of these genome dynamics is crucial to leveraging genomic information because the genome sequence of a single bacterial strain may not accurately represent the genome of the species. The dynamic nature of bacterial genome content has required us to apply the concepts of comparative genomics (CG) at the species level. Although direct genome sequence comparisons are an ideal method of performing CG, one current constraint is the limited availability of multiple genome sequences from a given bacterial species. DNA microarray-based comparative genomic hybridization (MCGH), which can be used to determine the presence or absence of thousands of genes in a single hybridization experiment, provides a powerful alternative for determining genome content and has been successfully used to investigate the genome dynamics of a wide number of bacterial species. Although MCGH-based studies have already provided a new vista on bacterial genome diversity, original methods for MCGH have been limited by the absence of novel gene sequences included in the microarray. New applications of the MCGH platform not only promise to accelerate the pace of novel gene discovery but will also help provide an integrated microarray-based approach to the study of bacterial CG.
14
Abstract
The profiling of mRNA expression based on DNA arrays has become a powerful tool to study genome-wide transcription of genes in a number of organisms. GST-PRIME is a software package created to facilitate large-scale primer design for the amplification of probes to be immobilized on arrays for transcriptome analyses, even though it can be also applied in low-throughput approaches. GST-PRIME allows highly efficient, direct amplification of gene-sequence tags (GSTs) from genomic DNA (gDNA), starting from annotated genome or transcript sequences. GST-PRIME provides a customer-friendly platform for automatic primer design, and despite the relative simplicity of the algorithm, experimental tests in the model plant species Arabidopsis thaliana confirmed the reliability of the software. This chapter describes the algorithm used for primer design, its input and output files, and the installation of the standalone package and its use.
15
Mehlmann M, Dawson ED, Townsend MB, Smagala JA, Moore CL, Smith CB, Cox NJ, Kuchta RD, Rowlen KL. Robust sequence selection method used to develop the FluChip diagnostic microarray for influenza virus. J Clin Microbiol 2006; 44:2857-62. [PMID: 16891503 PMCID: PMC1594657 DOI: 10.1128/jcm.00135-06] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.7] [Indexed: 11/20/2022]
Abstract
DNA microarrays have proven to be powerful tools for gene expression analyses and are becoming increasingly attractive for diagnostic applications, e.g., for virus identification and subtyping. The selection of appropriate sequences for use on a microarray poses a challenge, particularly for highly mutable organisms such as influenza viruses, human immunodeficiency viruses, and hepatitis C viruses. The goal of this work was to develop an efficient method for mining large databases in order to identify regions of conservation in the influenza virus genome. From these regions of conservation, capture and label sequences capable of discriminating between different viral types and subtypes were selected. The salient features of the method were the use of phylogenetic trees for data reduction and the selection of a relatively small number of capture and label sequences capable of identifying a broad spectrum of influenza viruses. A detailed experimental evaluation of the selected sequences is described in a companion paper. The software is freely available under the General Public License at http://www.colorado.edu/chemistry/RGHP/software/.
Affiliation(s)
- Martin Mehlmann
- Department of Chemistry and Biochemistry, University of Colorado, UCB215, Boulder, CO 80303, USA
16
Gressmann H, Linz B, Ghai R, Pleissner KP, Schlapbach R, Yamaoka Y, Kraft C, Suerbaum S, Meyer TF, Achtman M. Gain and loss of multiple genes during the evolution of Helicobacter pylori. PLoS Genet 2006; 1:e43. [PMID: 16217547 PMCID: PMC1245399 DOI: 10.1371/journal.pgen.0010043] [Citation(s) in RCA: 172] [Impact Index Per Article: 9.6] [Received: 05/18/2005] [Accepted: 08/26/2005] [Indexed: 12/16/2022]
Abstract
Sequence diversity and gene content distinguish most isolates of Helicobacter pylori. Even greater sequence differences differentiate distinct populations of H. pylori from different continents, but it was not clear whether these populations also differ in gene content. To address this question, we tested 56 globally representative strains of H. pylori and four strains of Helicobacter acinonychis with whole genome microarrays. Of the weighted average of 1,531 genes present in the two sequenced genomes, 25% are absent in at least one strain of H. pylori and 21% were absent or variable in H. acinonychis. We extrapolate that the core genome present in all isolates of H. pylori contains 1,111 genes. Variable genes tend to be small and possess unusual GC content; many of them have probably been imported by horizontal gene transfer. Phylogenetic trees based on the microarray data differ from those based on sequences of seven genes from the core genome. These discrepancies are due to homoplasies resulting from independent gene loss by deletion or recombination in multiple strains, which distort phylogenetic patterns. The patterns of these discrepancies versus population structure allow a reconstruction of the timing of the acquisition of variable genes within this species. Variable genes that are located within the cag pathogenicity island were apparently first acquired en bloc after speciation. In contrast, most other variable genes are of unknown function or encode restriction/modification enzymes, transposases, or outer membrane proteins. These seem to have been acquired prior to speciation of H. pylori and were subsequently lost by convergent evolution within individual strains. Thus, the use of microarrays can reveal patterns of gene gain or loss when examined within a phylogenetic context that is based on sequences of core genes.
Affiliation(s)
- Helga Gressmann
- Department of Molecular Biology, Max-Planck-Institut für Infektionsbiologie, Berlin, Germany
- Bodo Linz
- Department of Molecular Biology, Max-Planck-Institut für Infektionsbiologie, Berlin, Germany
- Rohit Ghai
- Institut für Medizinische Mikrobiologie, Justus-Liebig-Universität, Giessen, Germany
- Klaus-Peter Pleissner
- Core Facility Bioinformatics, Max-Planck-Institut für Infektionsbiologie, Berlin, Germany
- Ralph Schlapbach
- Functional Genomics Center Zurich, ETH Zurich/University of Zurich, Zurich, Switzerland
- Yoshio Yamaoka
- Department of Medicine, M.E. DeBakey Veterans Affairs Medical Center and Baylor College of Medicine, Houston, Texas, United States of America
- Christian Kraft
- Medizinische Hochschule Hannover, Institut für Medizinische Mikrobiologie und Krankenhaushygiene, Hannover, Germany
- Sebastian Suerbaum
- Medizinische Hochschule Hannover, Institut für Medizinische Mikrobiologie und Krankenhaushygiene, Hannover, Germany
- Thomas F Meyer
- Department of Molecular Biology, Max-Planck-Institut für Infektionsbiologie, Berlin, Germany
- Mark Achtman
- Department of Molecular Biology, Max-Planck-Institut für Infektionsbiologie, Berlin, Germany
- * To whom correspondence should be addressed. E-mail:
| |
Collapse
|
17
|
Mann TP, Humbert R, Stamatoyannopolous JA, Noble WS. Automated validation of polymerase chain reaction amplicon melting curves. J Bioinform Comput Biol 2006; 4:299-315. [PMID: 16819785 DOI: 10.1142/s0219720006001989] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/20/2005] [Revised: 12/01/2005] [Accepted: 01/31/2006] [Indexed: 11/18/2022]
Abstract
The polymerase chain reaction (PCR) is a fundamental tool of molecular biology. Quantitative PCR is the gold-standard methodology for determination of DNA copy numbers, quantitating transcription, and numerous other applications. A major barrier to large-scale application of PCR for quantitative genomic analyses is the current requirement for manual validation of individual PCRs to ensure generation of a single product. This typically requires visual inspection either of gel electrophoreses or temperature dissociation ("melting") curves of individual PCRs — a time-consuming and costly process. Here we describe a robust computational solution to this fundamental problem. Using a training set of 10 080 reactions comprising multiple quantitative PCRs from each of 1728 unique human genomic amplicons, we developed a support vector machine classifier capable of discriminating single-product PCRs with better than 99% accuracy. This approach has broad utility, and eliminates a major bottleneck to widespread application of PCR for high-throughput genomic applications.
Affiliation(s)
- Tobias P Mann
- Department of Genome Sciences, University of Washington, Seattle, WA, USA.
|
18
|
Nebozhyn M, Loboda A, Kari L, Rook AH, Vonderheid EC, Lessin S, Berger C, Edelson R, Nichols C, Yousef M, Gudipati L, Shang M, Showe MK, Showe LC. Quantitative PCR on 5 genes reliably identifies CTCL patients with 5% to 99% circulating tumor cells with 90% accuracy. Blood 2006; 107:3189-96. [PMID: 16403914 PMCID: PMC1464056 DOI: 10.1182/blood-2005-07-2813] [Citation(s) in RCA: 77] [Impact Index Per Article: 4.3] [Indexed: 12/29/2022] Open
Abstract
We previously identified a small number of genes using cDNA arrays that accurately diagnosed patients with Sézary Syndrome (SS), the erythrodermic and leukemic form of cutaneous T-cell lymphoma (CTCL). We now report the development of a quantitative real-time polymerase chain reaction (qRT-PCR) assay that uses expression values for just 5 of those genes: STAT4, GATA-3, PLS3, CD1D, and TRAIL. qRT-PCR data from peripheral blood mononuclear cells (PBMCs) accurately classified 88% of 17 patients with high blood tumor burden and 100% of 12 healthy controls in the training set using Fisher linear discriminant analysis (FLDA). The same 5 genes were then assayed on 56 new samples from 49 SS patients with blood tumor burdens of 5% to 99% and 69 samples from 65 new healthy controls. The average accuracy over 1000 resamplings was 90% using FLDA and 88% using support vector machine (SVM). We also tested the classifier on 14 samples from patients with CTCL with no detectable peripheral involvement and 3 patients with atopic dermatitis with severe erythroderma. The accuracy was 100% in identifying these samples as non-SS patients. These results are the first to demonstrate that gene expression profiling by quantitative PCR on a selected number of critical genes can be employed to molecularly diagnose SS.
Affiliation(s)
- Michael Nebozhyn
- The Wistar Institute, 3601 Spruce St, Philadelphia, PA 19104, USA
|
19
|
Lindroos HL, Mira A, Repsilber D, Vinnere O, Näslund K, Dehio M, Dehio C, Andersson SGE. Characterization of the genome composition of Bartonella koehlerae by microarray comparative genomic hybridization profiling. J Bacteriol 2005; 187:6155-65. [PMID: 16109957 PMCID: PMC1196136 DOI: 10.1128/jb.187.17.6155-6165.2005] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.2] [Indexed: 11/20/2022] Open
Abstract
Bartonella henselae is present in a wide range of wild and domestic feline hosts and causes cat-scratch disease and bacillary angiomatosis in humans. We have estimated here the gene content of Bartonella koehlerae, a novel species isolated from cats that was recently identified as an agent of human endocarditis. The investigation was accomplished by comparative genomic hybridization (CGH) to a microarray constructed from the sequenced 1.93-Mb genome of B. henselae. Control hybridizations of labeled DNA from the human pathogen Bartonella quintana with a reduced genome of 1.58 Mb were performed to evaluate the accuracy of the array for genes with known levels of sequence divergence. Genome size estimates of B. koehlerae by pulsed-field gel electrophoresis matched that calculated by the CGH, indicating a genome of 1.7 to 1.8 Mb with few unique genes. As in B. quintana, sequences in the prophage and the genomic islands were reported absent in B. koehlerae. In addition, sequence variability was recorded in the chromosome II-like region, where B. koehlerae showed an intermediate retention pattern of both coding and noncoding sequences. Although most of the genes missing in B. koehlerae are also absent from B. quintana, its phylogenetic placement near B. henselae suggests independent deletion events, indicating that host specificity is not solely attributed to genes in the genomic islands. Rather, the results underscore the instability of the genomic islands even within bacterial populations adapted to the same host-vector system, as in the case of B. henselae and B. koehlerae.
Affiliation(s)
- Hillevi L Lindroos
- Department of Molecular Evolution, Evolutionary Biology Center, Norbyvägen 18C, 752 36 Uppsala, Sweden
|
20
|
Abstract
Functional genomics methods are used to investigate the huge amount of information contained in genomes. Numerous experimental methods rely on the use of oligo- or polynucleotides. Nucleotide strand hybridization forms the underlying principle for these methods. For all these techniques, the probes should be unique for analyzed genes. In addition to being unique for the studied genes, the probes should fulfill a large number of criteria to be usable and valid. The criteria include, for example, avoidance of self-annealing, a suitable melting temperature, and a suitable nucleotide composition. We developed a method for searching unique and valid oligonucleotides or probes for genes so that there is not even a similar (approximate) occurrence in any other location of the whole genome. By using probe size 25, we analyzed 17 complete genomes representing a wide range of both prokaryotic and eukaryotic organisms. More than 92% of all the genes in the investigated genomes contained valid oligonucleotides. Extensive statistical tests were performed to characterize the properties of unique and valid oligonucleotides. Unique and valid oligonucleotides were relatively evenly distributed in genes except for the beginning and end, which were somewhat overrepresented. The flanking regions in eukaryotes were clearly underrepresented among suitable oligonucleotides. In addition to distributions within genes, the effects on codon and amino acid usage were also studied.
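The probe search summarized in this abstract can be illustrated with a small sketch. The Python example below is not the authors' method: it tests only exact matches (the abstract describes approximate matching), uses GC content as the sole composition criterion, and treats a match to the opposite strand as disqualifying. The function names, the default GC window, and the toy sequences are assumptions made purely for illustration.

```python
# Illustrative sketch only: exact-match uniqueness on a toy genome,
# not the approximate whole-genome search described in the abstract.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def gc_fraction(seq):
    """Fraction of G and C bases in the sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def occurs_exactly_once(genome, probe):
    """True if the probe has exactly one (possibly overlapping) exact match."""
    first = genome.find(probe)
    return first != -1 and genome.find(probe, first + 1) == -1


def valid_unique_probes(gene, genome, size=25, gc_range=(0.4, 0.6)):
    """List (offset, probe) pairs from `gene` that pass the toy criteria.

    The GC window is a hypothetical default, not a value from the abstract.
    """
    low, high = gc_range
    hits = []
    for start in range(len(gene) - size + 1):
        probe = gene[start:start + size]
        if not low <= gc_fraction(probe) <= high:
            continue  # composition outside the assumed window
        if reverse_complement(probe) in genome:
            continue  # matches the opposite strand (covers self-complementary probes)
        if occurs_exactly_once(genome, probe):
            hits.append((start, probe))
    return hits


# Toy data; a real run would use probe size 25 on complete genomes.
genome = "ACGTACGTTTGACCA"
gene = "TTGACC"
print(valid_unique_probes(gene, genome, size=4))
```

A production tool would replace the exact `find` calls with an approximate-matching index and would add melting-temperature and self-annealing checks, which this sketch leaves out.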
Affiliation(s)
- Mauno Vihinen
- Institute of Medical Technology, FI-33014 University of Tampere, Finland
- Research Unit, Tampere University Hospital, FI-33520 Tampere, Finland
- To whom correspondence should be addressed. Tel: +358 3 35517735; Fax: +358 3 35517710;
|
21
|
Rouchka EC, Khalyfa A, Cooper NGF. MPrime: efficient large scale multiple primer and oligonucleotide design for customized gene microarrays. BMC Bioinformatics 2005; 6:175. [PMID: 16014168 PMCID: PMC1187872 DOI: 10.1186/1471-2105-6-175] [Citation(s) in RCA: 17] [Impact Index Per Article: 0.9] [Received: 02/02/2005] [Accepted: 07/13/2005] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Enhancements in sequencing technology have recently yielded assemblies of large genomes including rat, mouse, human, fruit fly, and zebrafish. The availability of large-scale genomic and genic sequence data coupled with advances in microarray technology have made it possible to study the expression of large numbers of sequence products under several different conditions in days where traditional molecular biology techniques might have taken months, or even years. Therefore, to efficiently study a number of gene products associated with a disease, pathway, or other biological process, it is necessary to be able to design primer pairs or oligonucleotides en masse rather than using a time consuming and laborious gene-by-gene method. RESULTS We have developed an integrated system, MPrime, in order to efficiently calculate primer pairs or specific oligonucleotides for multiple genic regions based on a keyword, gene name, accession number, or sequence fasta format within the rat, mouse, human, fruit fly, and zebrafish genomes. A set of products created for mouse housekeeping genes from MPrime-designed primer pairs has been validated using both PCR-amplification and DNA sequencing. CONCLUSION These results indicate MPrime accurately incorporates standard PCR primer design characteristics to produce high scoring primer pairs for genes of interest. In addition, sequence similarity for a set of oligonucleotides constructed for the same set of genes indicates high specificity in oligo design.
Affiliation(s)
- Eric C Rouchka
- Department of Computer Engineering and Computer Science, Speed School of Engineering, University of Louisville, Louisville, Kentucky, USA
- Bioinformatics Research Group, University of Louisville, Louisville, Kentucky, USA
- Abdelnaby Khalyfa
- Department of Anatomical Sciences and Neurobiology, University of Louisville School of Medicine, Louisville, Kentucky, USA
- Bioinformatics Research Group, University of Louisville, Louisville, Kentucky, USA
- Nigel GF Cooper
- Department of Anatomical Sciences and Neurobiology, University of Louisville School of Medicine, Louisville, Kentucky, USA
- Bioinformatics Research Group, University of Louisville, Louisville, Kentucky, USA
|
22
|
Marks H, Vorst O, van Houwelingen AMML, van Hulten MCW, Vlak JM. Gene-expression profiling of White spot syndrome virus in vivo. J Gen Virol 2005; 86:2081-2100. [PMID: 15958687 DOI: 10.1099/vir.0.80895-0] [Citation(s) in RCA: 52] [Impact Index Per Article: 2.7] [Indexed: 11/18/2022] Open
Abstract
White spot syndrome virus, type species of the genus Whispovirus in the family Nimaviridae, is a large, double-stranded DNA (dsDNA) virus that infects crustaceans. The genome of the completely sequenced isolate WSSV-TH encodes 184 putative open reading frames (ORFs), the functions of which are largely unknown. To study the transcription of these ORFs, a DNA microarray was constructed, containing probes corresponding to nearly all putative WSSV-TH ORFs. Transcripts of 79 % of these ORFs could be detected in the gills of WSSV-infected shrimp (Penaeus monodon). Clustering of the transcription profiles of the individual genes during infection showed two major classes of genes: the first class reached maximal expression at 20 h post-infection (p.i.) (putative early) and the other class at 2 days p.i. (putative late). Nearly all major and minor structural virion-protein genes clustered in the latter group. These data provide evidence that, similar to other large, dsDNA viruses, the WSSV genes at large are expressed in a coordinated and cascaded fashion. Furthermore, the transcriptomes of the WSSV isolates WSSV-TH and TH-96-II, which have differential virulence, were compared at 2 days p.i. The TH-96-II genome encodes 10 ORFs that are not present in WSSV-TH, of which at least seven were expressed in P. monodon as well as in crayfish (Astacus leptodactylus), suggesting a functional but not essential role for these genes during infection. Expression levels of most other ORFs shared by both isolates were similar. Evaluation of transcription profiles by using a genome-wide approach provides a better understanding of WSSV transcription regulation and a new tool to study WSSV gene function.
Affiliation(s)
- Hendrik Marks
- Laboratory of Virology, Wageningen University, Binnenhaven 11, 6709 PD Wageningen, The Netherlands
- Oscar Vorst
- Plant Research International, Postbus 16, 6700 AA Wageningen, The Netherlands
- Mariëlle C W van Hulten
- Laboratory of Virology, Wageningen University, Binnenhaven 11, 6709 PD Wageningen, The Netherlands
- Just M Vlak
- Laboratory of Virology, Wageningen University, Binnenhaven 11, 6709 PD Wageningen, The Netherlands
|
23
|
Ng KW, Lawson J, Garner HR. PathoGene: a pathogen coding sequence discovery and analysis resource. Biotechniques 2005; 37:218, 220-2. [PMID: 15335212 DOI: 10.2144/04372st01] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 11/23/2022] Open
Abstract
PathoGene is a web-based resource that streamlines the process of predicting genes in microorganisms and designs PCR primers for amplification to facilitate sequence analysis and experimentation. PathoGene currently supports primer design for every complete microbial, viral, and fungal genome as cataloged in GenBank by the National Center for Biotechnology Information (NCBI; http://www.ncbi.nlm.nih.gov/). The resulting primers can then be subjected to a stand-alone Basic Local Alignment Search Tool (BLAST) system called PathoBLAST in which the predicted PCR product and/or primers can be compared against the genome of interest or a similar genome to find related genes or estimate primer quality.
Affiliation(s)
- Kar-wai Ng
- The University of Texas Southwestern Medical Center, Dallas, TX 75390-8591, USA
|
24
|
Mann TP, Humbert R, Stamatoyannopolous JA, Noble WS. Automated validation of polymerase chain reactions using amplicon melting curves. PROCEEDINGS. IEEE COMPUTATIONAL SYSTEMS BIOINFORMATICS CONFERENCE 2005:377-85. [PMID: 16447995 DOI: 10.1109/csb.2005.17] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 05/06/2023]
Abstract
PCR, the polymerase chain reaction, is a fundamental tool of molecular biology. Quantitative PCR is the gold-standard methodology for determination of DNA copy numbers, quantitating transcription, and numerous other applications. A major barrier to large-scale application of PCR for quantitative genomic analyses is the current requirement for manual validation of individual PCR reactions to ensure generation of a single product. This typically requires visual inspection either of gel electrophoreses or temperature dissociation ("melting") curves of individual PCR reactions - a time-consuming and costly process. Here we describe a robust computational solution to this fundamental problem. Using a training set of 10,080 reactions comprising multiple quantitative PCR reactions from each of 1,728 unique human genomic amplicons, we developed a support vector machine classifier capable of discriminating single-product PCR reactions with better than 99% accuracy. This approach has broad utility, and eliminates a major bottleneck to widespread application of PCR for high-throughput genomic applications.
Affiliation(s)
- Tobias P Mann
- Department of Genome Science, University of Washington, Seattle, WA, USA
|
25
|
Chen YA, Mckillen DJ, Wu S, Jenny MJ, Chapman R, Gross PS, Warr GW, Almeida JS. Optimal cDNA microarray design using expressed sequence tags for organisms with limited genomic information. BMC Bioinformatics 2004; 5:191. [PMID: 15585062 PMCID: PMC539232 DOI: 10.1186/1471-2105-5-191] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.8] [Received: 08/21/2004] [Accepted: 12/07/2004] [Indexed: 12/04/2022] Open
Abstract
Background Expression microarrays are increasingly used to characterize environmental responses and host-parasite interactions for many different organisms. Probe selection for cDNA microarrays using expressed sequence tags (ESTs) is challenging due to high sequence redundancy and potential cross-hybridization between paralogous genes. In organisms with limited genomic information, like marine organisms, this challenge is even greater due to annotation uncertainty. No general tool is available for cDNA microarray probe selection for these organisms. Therefore, the goal of the design procedure described here is to select a subset of ESTs that will minimize sequence redundancy and characterize potential cross-hybridization while providing functionally representative probes. Results Sequence similarity between ESTs, quantified by the E-value of pair-wise alignment, was used as a surrogate for expected hybridization between corresponding sequences. Using this value as a measure of dissimilarity, sequence redundancy reduction was performed by hierarchical cluster analyses. The choice of how many microarray probes to retain was made based on an index developed for this research: a sequence diversity index (SDI) within a sequence diversity plot (SDP). This index tracked the decreasing within-cluster sequence diversity as the number of clusters increased. For a given stage in the agglomeration procedure, the EST having the highest similarity to all the other sequences within each cluster, the centroid EST, was selected as a microarray probe. A small dataset of ESTs from Atlantic white shrimp (Litopenaeus setiferus) was used to test this algorithm so that the detailed results could be examined. The functional representative level of the selected probes was quantified using Gene Ontology (GO) annotations. Conclusions For organisms with limited genomic information, combining hierarchical clustering methods to analyze ESTs can yield an optimal cDNA microarray design. 
If biomarker discovery is the goal of the microarray experiments, the average linkage method is more effective, while single linkage is more suitable if identification of physiological mechanisms is more of interest. This general design procedure is not limited to designing single-species cDNA microarrays for marine organisms, and it can equally be applied to multiple-species microarrays of any organisms with limited genomic information.
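The centroid-selection step described in this abstract (choosing, within each cluster, the EST most similar to all the others) can be sketched as follows. This is a minimal Python illustration, not the authors' implementation: the cluster assignments are taken as given, and the `similarity` table holds hypothetical scores where higher means more similar, whereas the abstract derives dissimilarity from pairwise alignment E-values. All names and values are assumptions.

```python
# Illustrative sketch only: centroid EST per cluster from a made-up
# similarity table (higher score = more similar).
def centroid_per_cluster(clusters, similarity):
    """For each cluster, return the member with the highest summed
    similarity to the other members of that cluster."""
    centroids = []
    for members in clusters:
        best = max(
            members,
            key=lambda m: sum(similarity[m][o] for o in members if o != m),
        )
        centroids.append(best)
    return centroids


# Hypothetical symmetric similarity scores between four ESTs.
similarity = {
    "est1": {"est2": 0.9, "est3": 0.8},
    "est2": {"est1": 0.9, "est3": 0.5},
    "est3": {"est1": 0.8, "est2": 0.5},
    "est4": {},
}
# Hypothetical clusters, e.g. from cutting a hierarchical clustering tree.
clusters = [["est1", "est2", "est3"], ["est4"]]

print(centroid_per_cluster(clusters, similarity))
```

A singleton cluster returns its only member, since the sum over an empty set of neighbours is zero. Ties are resolved in favour of the first member listed, a choice made here only for simplicity.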
Affiliation(s)
- Yian A Chen
- Department of Biostatistics, Bioinformatics, and Epidemiology, Medical University of South Carolina, Charleston, SC, USA
- David J Mckillen
- Department of Biochemistry and Molecular Biology, Medical University of South Carolina, Charleston, SC, USA
- Shuyuan Wu
- Department of Biostatistics, Bioinformatics, and Epidemiology, Medical University of South Carolina, Charleston, SC, USA
- Matthew J Jenny
- Department of Biochemistry and Molecular Biology, Medical University of South Carolina, Charleston, SC, USA
- Marine Biomedicine and Environmental Science Center, Medical University of South Carolina, Charleston, SC, USA
- Robert Chapman
- Marine Biomedicine and Environmental Science Center, Medical University of South Carolina, Charleston, SC, USA
- South Carolina Department of Natural Resources, Marine Resources Research Institute, Charleston, SC, USA
- Paul S Gross
- Department of Biochemistry and Molecular Biology, Medical University of South Carolina, Charleston, SC, USA
- Marine Biomedicine and Environmental Science Center, Medical University of South Carolina, Charleston, SC, USA
- Gregory W Warr
- Department of Biochemistry and Molecular Biology, Medical University of South Carolina, Charleston, SC, USA
- Marine Biomedicine and Environmental Science Center, Medical University of South Carolina, Charleston, SC, USA
- Jonas S Almeida
- Department of Biostatistics, Bioinformatics, and Epidemiology, Medical University of South Carolina, Charleston, SC, USA
|
26
|
Abstract
MOTIVATION Selecting oligonucleotide probes for use in microarray design, and other applications requiring signature sequences, involves identifying sequences which will bind strongly to their intended target, while binding only weakly (or preferably, not at all) to non-target sequences which may be present in the hybridization reaction. While many tools to assist in selection of such sequences exist, all the ones we examined lack important oligo design and software features. RESULTS YODA is an application for assisting biological researchers in selecting signature sequences. It incorporates a custom sequence similarity search to find potential cross-hybridizing non-target sequences. For this task, most oligo design tools rely on BLAST, which is ill suited for it due to an unacceptable risk of false negatives. YODA supports multiple probe design goals including single-genome, multiple-genome, pathogen-host and species/strain-identification. A graphical interface is provided as well as a command-line interface, both of which support many user-controlled parameters. YODA is easy to install and use and runs on Windows, Mac OS X and Linux platforms. AVAILABILITY Freely available (LGPL) along with source code and additional documentation at http://pathport.vbi.vt.edu/YODA CONTACT: enordber@vbi.vt.edu.
Affiliation(s)
- Eric K Nordberg
- Virginia Bioinformatics Institute, Virginia Polytechnic Institute and State University, Blacksburg, VA 24061, USA.
|
27
|
Rimour S, Hill D, Militon C, Peyret P. GoArrays: highly dynamic and efficient microarray probe design. Bioinformatics 2004; 21:1094-103. [PMID: 15531611 DOI: 10.1093/bioinformatics/bti112] [Citation(s) in RCA: 57] [Impact Index Per Article: 2.9] [Indexed: 11/14/2022] Open
Abstract
MOTIVATION The use of oligonucleotide microarray technology requires very detailed attention to the design of specific probes spotted on the solid phase. These problems are far from being commonplace since they refer to complex physicochemical constraints. Whereas there are more and more publicly available programs for microarray oligonucleotide design, most of them use the same algorithm or criteria to design oligos, with only little variation. RESULTS We show that classical approaches used in oligo design software may be inefficient under certain experimental conditions, especially when dealing with complex target mixtures. Indeed, our biological model is a human obligate parasite, the microsporidia Encephalitozoon cuniculi. Targets that are extracted from biological samples are composed of a mixture of pathogen transcripts and host cell transcripts. We propose a new approach to design oligonucleotides which combines good specificity with a potentially high sensitivity. This approach is original from both the biological and the algorithmic points of view. We also present an experimental validation of this new strategy by comparing results obtained with standard oligos and with our composite oligos. A specific E. cuniculi microarray will overcome the difficulty of discriminating the parasite mRNAs from the host cell mRNAs, demonstrating the power of the microarray approach to elucidate the lifestyle of an intracellular pathogen using mixed mRNAs.
Affiliation(s)
- Sébastien Rimour
- LIMOS UMR CNRS 6158, Blaise Pascal University, Clermont-Ferrand II BP 10125, 63177 Aubiere Cedex, France.
|
28
|
Majtán T, Bukovská G, Timko J. DNA microarrays - techniques and applications in microbial systems. Folia Microbiol (Praha) 2004; 49:635-64. [PMID: 15881400 DOI: 10.1007/bf02931546] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.7] [Indexed: 10/21/2022]
Abstract
Genome projects produce a huge amount of sequence information. As a result, the focus of genomics research is turning toward deduction of functional information about newly discovered genes. Thus structural genomics paves the way for a new discipline called functional genomics by providing the information required for microarray manufacture. Microarray technology is the result of automation and miniaturization in the detection of differential gene expression. By using this technology one can make a parallel analysis of RNA abundance and DNA homology for thousands of genes in a single experiment. Over the past several years, this unique technology has been used to explore hundreds of transcriptional patterns and genome differences for a variety of microbial species. Applications of microarrays extend beyond the boundaries of basic biology into diagnostics, environmental monitoring, pharmacology, toxicology and biotechnology. We describe the comprehensive nature of DNA microarray technology, with emphasis on the fabrication of DNA microarrays and the application of this technology in biological environments, with primary emphasis on microbial systems.
Affiliation(s)
- T Majtán
- Institute of Molecular Biology, Centre of Excellence for Molecular Medicine of the Slovak Academy of Sciences, 845 51 Bratislava, Slovakia.
|
29
|
Weckx S, De Rijk P, Van Broeckhoven C, Del-Favero J. SNPbox: a modular software package for large-scale primer design. Bioinformatics 2004; 21:385-7. [PMID: 15347573 DOI: 10.1093/bioinformatics/bti006] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.4] [Indexed: 11/14/2022] Open
Abstract
We developed a modular software package, SNPbox, that automates and standardizes the generation of PCR primers and is used in the strategy for constructing single nucleotide polymorphism (SNP) maps. In this strategy, the focus of primer design can be either on the validation of annotated public SNPs or on the SNP discovery in exon regions or extended genomic regions, both by resequencing. SNPbox relies on Primer3 for the primer design and combines this program with other publicly available software tools such as BLAST, Spidey and RepeatMasker, and newly developed algorithms. Primer conditions were chosen such that PCR amplifications are uniform for each PCR amplicon, facilitating the use of high-throughput genetic platforms. SNPbox can also be used for the design of primer sets for mutation analysis, STR marker genotyping and microarray oligo design. Of the 2500 primer sets designed by SNPbox, 95% successfully amplified genomic DNA under uniform PCR conditions. AVAILABILITY The software is available from the authors upon request. SUPPLEMENTARY INFORMATION SNPbox_supplement.
Affiliation(s)
- Stefan Weckx
- Department of Molecular Genetics, Bioinformatics Unit, Flanders Interuniversity Institute for Biotechnology, University of Antwerp, B-2610 Antwerpen, Belgium
|
30
|
Niehus E, Gressmann H, Ye F, Schlapbach R, Dehio M, Dehio C, Stack A, Meyer TF, Suerbaum S, Josenhans C. Genome-wide analysis of transcriptional hierarchy and feedback regulation in the flagellar system of Helicobacter pylori. Mol Microbiol 2004; 52:947-61. [PMID: 15130117 DOI: 10.1111/j.1365-2958.2004.04006.x] [Citation(s) in RCA: 145] [Impact Index Per Article: 7.3] [Indexed: 01/21/2023]
Abstract
The flagellar system of Helicobacter pylori, which comprises more than 40 mostly unclustered genes, is essential for colonization of the human stomach mucosa. In order to elucidate the complex transcriptional circuitry of flagellar biosynthesis in H. pylori and its link to other cell functions, mutants in regulatory genes governing flagellar biosynthesis (rpoN, flgR, flhA, flhF, HP0244) and whole-genome microarray technology were used in this study. The regulon controlled by RpoN, its activator FlgR (FleR) and the cognate histidine kinase HP0244 (FleS) was characterized on a genome-wide scale for the first time. Seven novel genes (HP1076, HP1233, HP1154/1155, HP0366/367, HP0869) were identified as belonging to RpoN-associated flagellar regulons. The hydrogenase accessory gene HP0869 was the only annotated non-flagellar gene in the RpoN regulon. Flagellar basal body components FlhA and FlhF were characterized as functional equivalents to master regulators in H. pylori, as their absence led to a general reduction of transcripts in the RpoN (class 2) and FliA (class 3) regulons, and of 24 genes newly attributed to intermediate regulons, under the control of two or more promoters. FlhA- and FlhF-dependent regulons comprised flagellar and non-flagellar genes. Transcriptome analysis revealed that negative feedback regulation of the FliA regulon was dependent on the antisigma factor FlgM. FlgM was also involved in FlhA- but not FlhF-dependent feedback control of the RpoN regulon. In contrast to other bacteria, chemotaxis and flagellar motor genes were not controlled by FliA or RpoN. A true master regulator of flagellar biosynthesis is absent in H. pylori, consistent with the essential role of flagellar motility and chemotaxis for this organism.
Affiliation(s)
- Eike Niehus
- Institute of Hygiene and Microbiology, University of Wuerzburg, Josef-Schneider-Strasse 2, D-97080 Wuerzburg, Germany
|
31
|
Haas SA, Hild M, Wright APH, Hain T, Talibi D, Vingron M. Genome-scale design of PCR primers and long oligomers for DNA microarrays. Nucleic Acids Res 2003; 31:5576-81. [PMID: 14500820 PMCID: PMC206452 DOI: 10.1093/nar/gkg752] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.2] [Indexed: 11/13/2022] Open
Abstract
In recent years, the demand for custom-made cDNA chips/arrays as well as whole genome chips has increased rapidly. The efficient selection of gene-specific primers/oligomers is of the utmost importance for the successful production of such chips. We developed GenomePRIDE, highly flexible and scalable software for designing primers/oligomers for large-scale projects. The program is able to generate either long oligomers (40-70 bases), or PCR primers for the amplification of gene-specific DNA fragments of user-defined length. Additionally, primers can be designed in-frame in order to facilitate large-scale cloning into expression vectors. Furthermore, GenomePRIDE can be adapted to specific applications such as the generation of genomic amplicon arrays or the design of fragments specific for alternative splice isoforms. We tested the performance of GenomePRIDE on the entire genomes of Listeria monocytogenes (1584 gene-specific PCRs, 48 long oligomers) as well as of eukaryotes such as Schizosaccharomyces pombe (5006 gene-specific PCRs), and Drosophila melanogaster (21 306 gene-specific PCRs). With its computing speed of 1000 primer pairs per hour and a PCR amplification success of 99%, GenomePRIDE represents an extremely cost- and time-effective program.
Affiliation(s)
- Stefan A Haas
- Department of Computational Molecular Biology, Max-Planck-Institute for Molecular Genetics, Ihnestrasse 73, D-14195 Berlin, Germany.
|
32
|
Kazmierczak MJ, Mithoe SC, Boor KJ, Wiedmann M. Listeria monocytogenes sigma B regulates stress response and virulence functions. J Bacteriol 2003; 185:5722-34. [PMID: 13129943 PMCID: PMC193959 DOI: 10.1128/jb.185.19.5722-5734.2003] [Citation(s) in RCA: 256] [Impact Index Per Article: 12.2] [Indexed: 11/20/2022] Open
Abstract
While the stress-responsive alternative sigma factor sigma(B) has been identified in different species of Bacillus, Listeria, and Staphylococcus, the sigma(B) regulon has been extensively characterized only in B. subtilis. We combined biocomputing and microarray-based strategies to identify sigma(B)-dependent genes in the facultative intracellular pathogen Listeria monocytogenes. Hidden Markov model (HMM)-based searches identified 170 candidate sigma(B)-dependent promoter sequences in the strain EGD-e genome sequence. These data were used to develop a specialized, 208-gene microarray, which included 166 genes downstream of HMM-predicted sigma(B)-dependent promoters as well as selected virulence and stress response genes. RNA for the microarray experiments was isolated from both wild-type and Delta sigB null mutant L. monocytogenes cells grown to stationary phase or exposed to osmotic stress (0.5 M KCl). Microarray analyses identified a total of 55 genes with statistically significant sigma(B)-dependent expression under the conditions used in these experiments, with at least 1.5-fold-higher expression in the wild type over the sigB mutant under either stress condition (51 genes showed at least 2.0-fold-higher expression in the wild type). Of the 55 genes exhibiting sigma(B)-dependent expression, 54 were preceded by a sequence resembling the sigma(B) promoter consensus sequence. Rapid amplification of cDNA ends-PCR was used to confirm the sigma(B)-dependent nature of a subset of eight selected promoter regions. Notably, the sigma(B)-dependent L. monocytogenes genes identified through this HMM/microarray strategy included both stress response genes (e.g., gadB, ctc, and the glutathione reductase gene lmo1433) and virulence genes (e.g., inlA, inlB, and bsh). Our data demonstrate that, in addition to regulating expression of genes important for survival under environmental stress conditions, sigma(B) also contributes to regulation of virulence gene expression in L. monocytogenes. These findings strongly suggest that sigma(B) contributes to L. monocytogenes gene expression during infection.
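The fold-change criterion in this abstract (at least 1.5-fold-higher expression in the wild type than in the sigB mutant) can be illustrated with a minimal Python sketch. The function name, the gene labels and the expression values are hypothetical and are not taken from the paper; the sketch applies only the ratio threshold and does not model the statistical significance test the authors also required.

```python
from typing import Dict, List


def fold_change_hits(
    wild_type: Dict[str, float],
    mutant: Dict[str, float],
    min_fold: float = 1.5,
) -> List[str]:
    """Return genes whose wild-type/mutant expression ratio is >= min_fold.

    Hypothetical helper: only the ratio threshold is applied; no
    significance testing is performed.
    """
    hits = []
    for gene, wt_value in wild_type.items():
        mut_value = mutant.get(gene)
        # Skip genes without a usable mutant measurement (avoids division by zero).
        if mut_value is None or mut_value <= 0:
            continue
        if wt_value / mut_value >= min_fold:
            hits.append(gene)
    return sorted(hits)


# Made-up expression values, for illustration only.
example_wt = {"geneA": 300.0, "geneB": 120.0, "geneC": 50.0}
example_mut = {"geneA": 100.0, "geneB": 100.0, "geneC": 0.0}
print(fold_change_hits(example_wt, example_mut))  # prints ['geneA']
```

Raising min_fold to 2.0 would correspond to the stricter cut-off that the abstract reports for 51 of the 55 genes.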
|
33
|
Talla E, Tekaia F, Brino L, Dujon B. A novel design of whole-genome microarray probes for Saccharomyces cerevisiae which minimizes cross-hybridization. BMC Genomics 2003; 4:38. [PMID: 14499002 PMCID: PMC239980 DOI: 10.1186/1471-2164-4-38] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.2] [Received: 07/09/2003] [Accepted: 09/22/2003] [Indexed: 12/19/2022] Open
Abstract
Background: Numerous DNA microarray hybridization experiments have been performed in yeast in recent years using either synthetic oligonucleotides or PCR-amplified coding sequences as probes. The design and quality of the microarray probes are of critical importance for hybridization experiments as well as subsequent analysis of the data.
Results: We present here a novel design of Saccharomyces cerevisiae microarrays based on a refined annotation of the genome and with the aim of reducing cross-hybridization between related sequences. An effort was made to design probes of similar lengths, preferably located in the 3'-end of reading frames. The sequence of each gene was compared against the entire yeast genome and optimal sub-segments giving no predicted cross-hybridization were selected. A total of 5660 novel probes (more than 97% of the yeast genes) were designed. For the remaining 143 genes, cross-hybridization was unavoidable. Using a set of 18 deletant strains, we have experimentally validated our cross-hybridization procedure. Sensitivity, reproducibility and dynamic range of these new microarrays have been measured. Based on this experience, we have written a novel program to design long oligonucleotides for microarray hybridizations of complete genome sequences.
Conclusions: A validated procedure to predict cross-hybridization in microarray probe design was defined in this work. Subsequently, a novel Saccharomyces cerevisiae microarray (which minimizes cross-hybridization) was designed and constructed. Arrays are available at Eurogentec S.A. Finally, we propose a novel design program, OliD, which allows automatic oligonucleotide design for microarrays. The OliD program is available from the authors.
Affiliation(s)
- Emmanuel Talla
- Institut Pasteur, Unité de Génétique Moléculaire des Levures (URA 2171 CNRS, UFR 927 Université PM Curie), 25 rue du Docteur Roux, F-75724 Paris cedex 15, France
- Fredj Tekaia
- Institut Pasteur, Unité de Génétique Moléculaire des Levures (URA 2171 CNRS, UFR 927 Université PM Curie), 25 rue du Docteur Roux, F-75724 Paris cedex 15, France
- Laurent Brino
- Eurogentec s.a., Parc Scientifique du Sart Tilman, B-4102 Seraing, Belgium
- Bernard Dujon
- Institut Pasteur, Unité de Génétique Moléculaire des Levures (URA 2171 CNRS, UFR 927 Université PM Curie), 25 rue du Docteur Roux, F-75724 Paris cedex 15, France
|
34
|
Chen SH, Lin CY, Cho CS, Lo CZ, Hsiung CA. Primer Design Assistant (PDA): A web-based primer design tool. Nucleic Acids Res 2003; 31:3751-4. [PMID: 12824410 PMCID: PMC168967 DOI: 10.1093/nar/gkg560] [Citation(s) in RCA: 45] [Impact Index Per Article: 2.1] [Indexed: 11/13/2022] Open
Abstract
Primer Design Assistant (PDA) is a web-based primer design service that uses thermodynamic theory to evaluate the fitness of primers. It runs on a Linux-Apache-MySQL-PHP stack on a PC equipped with dual CPUs (Intel Pentium III 1.4 GHz) and 512 Mb of RAM. The PDA user interface is kept succinct through built-in parameter settings. Advanced options for 5' GC content, 3' GC content, dimer check and hairpin check are available. The covered-region option constrains the PCR product to cover a user-defined segment. PDA accepts a single query sequence or multiple sequences in FASTA format. It produces optimal and homogeneous primer pairs that meet the needs of experimental designs involving large-scale PCR amplification. To limit system load, the size of a submitted sequence is limited to 10 kb and the number of sequences in a query is limited to 20. The authors may be contacted regarding other requirements for primer design. The web application can be found at http://dbb.nhri.org.tw/primer/.
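The two submission limits stated in this abstract (at most 10 kb per sequence and at most 20 sequences per query) can be checked before a FASTA query is submitted. The following Python sketch is an illustration only: the function names are hypothetical, it is not part of PDA, and it assumes that 10 kb means 10,000 bases and that the query is plain FASTA text.

```python
from typing import List, Tuple

MAX_SEQUENCE_LENGTH = 10_000   # assumed reading of the 10 kb per-sequence limit
MAX_SEQUENCES_PER_QUERY = 20   # per-query limit stated in the abstract


def parse_fasta(text: str) -> List[Tuple[str, str]]:
    """Parse FASTA text into (header, sequence) pairs. Hypothetical helper."""
    records: List[Tuple[str, List[str]]] = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        if line.startswith(">"):
            records.append((line[1:].strip(), []))
        elif records:
            # Sequence lines before the first header are ignored.
            records[-1][1].append(line)
    return [(header, "".join(parts)) for header, parts in records]


def check_query(text: str) -> List[str]:
    """Return a list of limit violations; an empty list means the query fits."""
    records = parse_fasta(text)
    problems = []
    if len(records) > MAX_SEQUENCES_PER_QUERY:
        problems.append(
            f"{len(records)} sequences exceed the limit of {MAX_SEQUENCES_PER_QUERY}"
        )
    for header, sequence in records:
        if len(sequence) > MAX_SEQUENCE_LENGTH:
            problems.append(
                f"{header}: {len(sequence)} bases exceed {MAX_SEQUENCE_LENGTH}"
            )
    return problems


print(check_query(">example\nACGTACGT\n"))  # prints []
```

A non-empty result lists each violated limit, so an oversized query can be split into smaller batches before submission.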
Affiliation(s)
- S H Chen
- Division of Biostatistics and Bioinformatics, National Health Research Institutes, 128, Yen-Chiu-Yuan Rd Sec. 2, Taipei 115, Taiwan
|
35
|
Tinsley CR, Perrin A, Borezée E, Nassif X. Neisseria microarrays. Methods Enzymol 2003; 358:188-207. [PMID: 12474388 DOI: 10.1016/s0076-6879(02)58090-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/28/2023]
Affiliation(s)
- Colin R Tinsley
- INSERM U570, Faculté de Médecine Necker-Enfants Malades, 75730 Paris, France
|
36
|
Boa Z, Ma WL, Hu ZY, Rong S, Shi YB, Zheng WL. A method for evaluation of the quality of DNA microarray spots. JOURNAL OF BIOCHEMISTRY AND MOLECULAR BIOLOGY 2002; 35:532-5. [PMID: 12359098 DOI: 10.5483/bmbrep.2002.35.5.532] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.2] [Indexed: 11/20/2022]
Abstract
The aim was to establish a method for evaluating the quality of printed microarrays and the immobilization of DNA fragments. The target gene fragments, which were made with the restriction display PCR (RD-PCR) technique, were printed on a superamine-modified glass slide and then immobilized by UV cross-linking and heat. This chip was hybridized with universal primers labeled with cy3-dUTP, as well as with cDNA labeled with cy3-dCTP, following the conventional protocol. Most of the target gene fragments on the chip showed positive signals, but the negative control showed no signal, and vice versa. We established a method that enables an effective evaluation of the quality of the microarrays.
Affiliation(s)
- Zhang Boa
- Department of Biochemistry, First Military Medical University, Guangzhou 510515, PR China
|
37
|
Srinivasan J, Sinz W, Lanz C, Brand A, Nandakumar R, Raddatz G, Witte H, Keller H, Kipping I, Pires-daSilva A, Jesse T, Millare J, de Both M, Schuster SC, Sommer RJ. A Bacterial Artificial Chromosome-Based Genetic Linkage Map of the Nematode Pristionchus pacificus. Genetics 2002; 162:129-34. [PMID: 12242228 PMCID: PMC1462235 DOI: 10.1093/genetics/162.1.129] [Citation(s) in RCA: 49] [Impact Index Per Article: 2.2] [Indexed: 11/14/2022] Open
Abstract
To understand the evolution of developmental processes, nonmodel organisms in the nematodes, insects, and vertebrates are compared with established model systems. Often, these comparisons suffer from the inability to apply sophisticated technologies to these nonmodel species. In the nematode Pristionchus pacificus, cellular and genetic analyses are used to compare vulva development to that of Caenorhabditis elegans. However, substantial changes in gene function between P. pacificus and C. elegans limit the use of candidate gene approaches in studying P. pacificus mutations. To facilitate map-based cloning of mutations in P. pacificus, we constructed a BAC-based genetic linkage map. A BAC library of 13,440 clones was generated and completely end sequenced. By comparing BAC end and EST sequences between the “wild-type” strain P. pacificus var. California and the polymorphic strain P. pacificus var. Washington, 133 single-stranded conformational polymorphisms were identified. These markers were tested on a meiotic mapping panel of 46 randomly picked F2 animals after a cross of the two strains, providing the first genetic linkage map of P. pacificus. A mapping strategy using two selected markers per chromosome was devised and the efficiency of this approach was illustrated by the mapping of the Ppa-unc-1/Twitchin gene.
Affiliation(s)
- Jagan Srinivasan
- Abteilung für Evolutionsbiologie, Max-Planck Institut für Entwicklungsbiologie, 72076 Tübingen, Germany
|
38
|
Abstract
DNA microarray technology allows a parallel analysis of RNA abundance and DNA homology for thousands of genes in a single experiment. Over the past few years, this powerful technology has been used to explore transcriptional profiles and genome differences for a variety of microorganisms, greatly facilitating our understanding of microbial metabolism. With the increasing availability of complete microbial genomes, DNA microarrays are becoming a common tool in many areas of microbial research, including microbial physiology, pathogenesis, epidemiology, ecology, phylogeny, pathway engineering and fermentation optimization.
Affiliation(s)
- R W Ye
- E328/148B, DuPont Experimental Station, DuPont Central Research and Development, Route 141 and Henry Clay Road, Wilmington, DE 19880, USA.
|
39
|
Varotto C, Richly E, Salamini F, Leister D. GST-PRIME: a genome-wide primer design software for the generation of gene sequence tags. Nucleic Acids Res 2001; 29:4373-7. [PMID: 11691924 PMCID: PMC60177 DOI: 10.1093/nar/29.21.4373] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.7] [Indexed: 11/14/2022] Open
Abstract
The availability of sequenced genomes has generated a need for experimental approaches that allow the simultaneous analysis of large, or even complete, sets of genes. To facilitate such analyses, we have developed GST-PRIME, a software package for retrieving and assembling gene sequences, even from complex genomes, using the NCBI public database, and then designing sets of primer pairs for use in gene amplification. Primers were designed by the program for the direct amplification of gene sequence tags (GSTs) from either genomic DNA or cDNA. Test runs of GST-PRIME on 2000 randomly selected Arabidopsis and Drosophila genes demonstrate that 93 and 88% of resulting GSTs, respectively, fulfilled imposed length criteria. GST-PRIME primer pairs were tested on a set of 1900 Arabidopsis genes coding for chloroplast-targeted proteins: 95% of the primer pairs used in PCRs with genomic DNA generated the correct amplicons. GST-PRIME can thus be reliably used for large-scale or specific amplification of intron-containing genes of multicellular eukaryotes.
Affiliation(s)
- C Varotto
- Zentrum zur Identifikation von Genfunktionen durch Insertionsmutagenese bei Arabidopsis thaliana (ZIGIA), Max-Planck-Institut für Züchtungsforschung, Carl-von-Linné Weg 10, 50829 Köln, Germany
|
40
|
Current Awareness on Comparative and Functional Genomics. Comp Funct Genomics 2001. [PMCID: PMC2448396 DOI: 10.1002/cfg.59] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022] Open
|