1
Baross Á, Delaney AD, Li HI, Nayar T, Flibotte S, Qian H, Chan SY, Asano J, Ally A, Cao M, Birch P, Brown-John M, Fernandes N, Go A, Kennedy G, Langlois S, Eydoux P, Friedman JM, Marra MA. Assessment of algorithms for high throughput detection of genomic copy number variation in oligonucleotide microarray data. BMC Bioinformatics 2007; 8:368. [PMID: 17910767 PMCID: PMC2148068 DOI: 10.1186/1471-2105-8-368] [Citation(s) in RCA: 45] [Impact Index Per Article: 2.6] [Received: 12/08/2006] [Accepted: 10/02/2007] [Indexed: 01/22/2023]
Abstract
Background: Genomic deletions and duplications are important in the pathogenesis of diseases, such as cancer and mental retardation, and have recently been shown to occur frequently in unaffected individuals as polymorphisms. Affymetrix GeneChip whole genome sampling analysis (WGSA) combined with 100 K single nucleotide polymorphism (SNP) genotyping arrays is one of several microarray-based approaches that are now being used to detect such structural genomic changes. The popularity of this technology and its associated open source data format have resulted in the development of an increasing number of software packages for the analysis of copy number changes using these SNP arrays.
Results: We evaluated four publicly available software packages for high throughput copy number analysis using synthetic and empirical 100 K SNP array data sets, the latter obtained from 107 mental retardation (MR) patients and their unaffected parents and siblings. We evaluated the software with regards to overall suitability for high-throughput 100 K SNP array data analysis, as well as effectiveness of normalization, scaling with various reference sets and feature extraction, as well as true and false positive rates of genomic copy number variant (CNV) detection.
Conclusion: We observed considerable variation among the numbers and types of candidate CNVs detected by different analysis approaches, and found that multiple programs were needed to find all real aberrations in our test set. The frequency of false positive deletions was substantial, but could be greatly reduced by using the SNP genotype information to confirm loss of heterozygosity.
Affiliation(s)
- Ágnes Baross
  - Genome Sciences Centre, BC Cancer Agency, British Columbia Cancer Agency, Suite 100, 570 West 7th Avenue, Vancouver, BC, V5Z 4S6, Canada
  - Genome British Columbia, 500-555 West 8th Avenue, Vancouver, BC, V5Z 1C6, Canada
- Allen D Delaney
  - Genome Sciences Centre, BC Cancer Agency, British Columbia Cancer Agency, Suite 100, 570 West 7th Avenue, Vancouver, BC, V5Z 4S6, Canada
- H Irene Li
  - Genome Sciences Centre, BC Cancer Agency, British Columbia Cancer Agency, Suite 100, 570 West 7th Avenue, Vancouver, BC, V5Z 4S6, Canada
- Tarun Nayar
  - Genome Sciences Centre, BC Cancer Agency, British Columbia Cancer Agency, Suite 100, 570 West 7th Avenue, Vancouver, BC, V5Z 4S6, Canada
- Stephane Flibotte
  - Genome Sciences Centre, BC Cancer Agency, British Columbia Cancer Agency, Suite 100, 570 West 7th Avenue, Vancouver, BC, V5Z 4S6, Canada
- Hong Qian
  - Genome Sciences Centre, BC Cancer Agency, British Columbia Cancer Agency, Suite 100, 570 West 7th Avenue, Vancouver, BC, V5Z 4S6, Canada
- Susanna Y Chan
  - Genome Sciences Centre, BC Cancer Agency, British Columbia Cancer Agency, Suite 100, 570 West 7th Avenue, Vancouver, BC, V5Z 4S6, Canada
- Jennifer Asano
  - Genome Sciences Centre, BC Cancer Agency, British Columbia Cancer Agency, Suite 100, 570 West 7th Avenue, Vancouver, BC, V5Z 4S6, Canada
- Adrian Ally
  - Genome Sciences Centre, BC Cancer Agency, British Columbia Cancer Agency, Suite 100, 570 West 7th Avenue, Vancouver, BC, V5Z 4S6, Canada
- Manqiu Cao
  - Affymetrix Inc., 3420 Central Expressway, Santa Clara, CA 95051, USA
- Patricia Birch
  - Dept. of Medical Genetics, University of British Columbia, Children's & Women's Hospital, Box 153, 4500 Oak Street, Vancouver, BC, V6H 3N1, Canada
- Mabel Brown-John
  - Genome Sciences Centre, BC Cancer Agency, British Columbia Cancer Agency, Suite 100, 570 West 7th Avenue, Vancouver, BC, V5Z 4S6, Canada
- Nicole Fernandes
  - Dept. of Medical Genetics, University of British Columbia, Children's & Women's Hospital, Box 153, 4500 Oak Street, Vancouver, BC, V6H 3N1, Canada
- Anne Go
  - Genome Sciences Centre, BC Cancer Agency, British Columbia Cancer Agency, Suite 100, 570 West 7th Avenue, Vancouver, BC, V5Z 4S6, Canada
- Giulia Kennedy
  - Affymetrix Inc., 3420 Central Expressway, Santa Clara, CA 95051, USA
- Sylvie Langlois
  - Dept. of Medical Genetics, University of British Columbia, Children's & Women's Hospital, Box 153, 4500 Oak Street, Vancouver, BC, V6H 3N1, Canada
- Patrice Eydoux
  - Dept. of Pathology and Laboratory Medicine, BC Children's Hospital, 4480 Oak Street, Vancouver, BC, V6H 3N1, Canada
- JM Friedman
  - Dept. of Medical Genetics, University of British Columbia, Children's & Women's Hospital, Box 153, 4500 Oak Street, Vancouver, BC, V6H 3N1, Canada
- Marco A Marra
  - Genome Sciences Centre, BC Cancer Agency, British Columbia Cancer Agency, Suite 100, 570 West 7th Avenue, Vancouver, BC, V5Z 4S6, Canada
  - Dept. of Medical Genetics, University of British Columbia, Children's & Women's Hospital, Box 153, 4500 Oak Street, Vancouver, BC, V6H 3N1, Canada
2
Friedman JM, Baross A, Delaney AD, Ally A, Arbour L, Armstrong L, Asano J, Bailey DK, Barber S, Birch P, Brown-John M, Cao M, Chan S, Charest DL, Farnoud N, Fernandes N, Flibotte S, Go A, Gibson WT, Holt RA, Jones SJM, Kennedy GC, Krzywinski M, Langlois S, Li HI, McGillivray BC, Nayar T, Pugh TJ, Rajcan-Separovic E, Schein JE, Schnerch A, Siddiqui A, Van Allen MI, Wilson G, Yong SL, Zahir F, Eydoux P, Marra MA. Oligonucleotide microarray analysis of genomic imbalance in children with mental retardation. Am J Hum Genet 2006; 79:500-13. [PMID: 16909388 PMCID: PMC1559542 DOI: 10.1086/507471] [Citation(s) in RCA: 225] [Impact Index Per Article: 12.5] [Received: 04/24/2006] [Accepted: 07/06/2006] [Indexed: 11/03/2022]
Abstract
The cause of mental retardation in one-third to one-half of all affected individuals is unknown. Microscopically detectable chromosomal abnormalities are the most frequently recognized cause, but gain or loss of chromosomal segments that are too small to be seen by conventional cytogenetic analysis has been found to be another important cause. Array-based methods offer a practical means of performing a high-resolution survey of the entire genome for submicroscopic copy-number variants. We studied 100 children with idiopathic mental retardation and normal results of standard chromosomal analysis, by use of whole-genome sampling analysis with Affymetrix GeneChip Human Mapping 100K arrays. We found de novo deletions as small as 178 kb in eight cases, de novo duplications as small as 1.1 Mb in two cases, and unsuspected mosaic trisomy 9 in another case. This technology can detect at least twice as many potentially pathogenic de novo copy-number variants as conventional cytogenetic analysis can in people with mental retardation.
Affiliation(s)
- J M Friedman
- Department of Medical Genetics, University of British Columbia, Vancouver, British Columbia, Canada.
3
Morin RD, Chang E, Petrescu A, Liao N, Griffith M, Kirkpatrick R, Butterfield YS, Young AC, Stott J, Barber S, Babakaiff R, Dickson MC, Matsuo C, Wong D, Yang GS, Smailus DE, Wetherby KD, Kwong PN, Grimwood J, Brinkley CP, Brown-John M, Reddix-Dugue ND, Mayo M, Schmutz J, Beland J, Park M, Gibson S, Olson T, Bouffard GG, Tsai M, Featherstone R, Chand S, Siddiqui AS, Jang W, Lee E, Klein SL, Blakesley RW, Zeeberg BR, Narasimhan S, Weinstein JN, Pennacchio CP, Myers RM, Green ED, Wagner L, Gerhard DS, Marra MA, Jones SJ, Holt RA. Sequencing and analysis of 10,967 full-length cDNA clones from Xenopus laevis and Xenopus tropicalis reveals post-tetraploidization transcriptome remodeling. Genome Res 2006; 16:796-803. [PMID: 16672307 PMCID: PMC1479861 DOI: 10.1101/gr.4871006] [Citation(s) in RCA: 60] [Impact Index Per Article: 3.3] [Indexed: 01/29/2023]
Abstract
Sequencing of full-insert clones from full-length cDNA libraries from both Xenopus laevis and Xenopus tropicalis has been ongoing as part of the Xenopus Gene Collection Initiative. Here we present 10,967 full ORF verified cDNA clones (8049 from X. laevis and 2918 from X. tropicalis) as a community resource. Because the genome of X. laevis, but not X. tropicalis, has undergone allotetraploidization, comparison of coding sequences from these two clawed (pipid) frogs provides a unique angle for exploring the molecular evolution of duplicate genes. Within our clone set, we have identified 445 gene trios, each comprised of an allotetraploidization-derived X. laevis gene pair and their shared X. tropicalis ortholog. Pairwise dN/dS comparisons within trios show strong evidence for purifying selection acting on all three members. However, dN/dS ratios between X. laevis gene pairs are elevated relative to their X. tropicalis ortholog. This difference is highly significant and indicates an overall relaxation of selective pressures on duplicated gene pairs. We have found that the paralogs that have been lost since the tetraploidization event are enriched for several molecular functions, but have found no such enrichment in the extant paralogs. Approximately 14% of the paralogous pairs analyzed here also show differential expression indicative of subfunctionalization.
Affiliation(s)
- Ryan D. Morin
  - British Columbia Genome Sciences Centre, BCCA, Vancouver, BC V5Z 1L3 Canada
- Elbert Chang
  - British Columbia Genome Sciences Centre, BCCA, Vancouver, BC V5Z 1L3 Canada
- Anca Petrescu
  - British Columbia Genome Sciences Centre, BCCA, Vancouver, BC V5Z 1L3 Canada
- Nancy Liao
  - British Columbia Genome Sciences Centre, BCCA, Vancouver, BC V5Z 1L3 Canada
- Malachi Griffith
  - British Columbia Genome Sciences Centre, BCCA, Vancouver, BC V5Z 1L3 Canada
- Robert Kirkpatrick
  - British Columbia Genome Sciences Centre, BCCA, Vancouver, BC V5Z 1L3 Canada
- Alice C. Young
  - NIH Intramural Sequencing Center, National Human Genome Research Institute
- Jeffrey Stott
  - British Columbia Genome Sciences Centre, BCCA, Vancouver, BC V5Z 1L3 Canada
- Sarah Barber
  - British Columbia Genome Sciences Centre, BCCA, Vancouver, BC V5Z 1L3 Canada
- Ryan Babakaiff
  - British Columbia Genome Sciences Centre, BCCA, Vancouver, BC V5Z 1L3 Canada
- Mark C. Dickson
  - Stanford Human Genome Center and Department of Genetics, Stanford University School of Medicine, Stanford, California 94305, USA
- Corey Matsuo
  - British Columbia Genome Sciences Centre, BCCA, Vancouver, BC V5Z 1L3 Canada
- David Wong
  - British Columbia Genome Sciences Centre, BCCA, Vancouver, BC V5Z 1L3 Canada
- George S. Yang
  - British Columbia Genome Sciences Centre, BCCA, Vancouver, BC V5Z 1L3 Canada
- Duane E. Smailus
  - British Columbia Genome Sciences Centre, BCCA, Vancouver, BC V5Z 1L3 Canada
- Keith D. Wetherby
  - NIH Intramural Sequencing Center, National Human Genome Research Institute
- Peggy N. Kwong
  - NIH Intramural Sequencing Center, National Human Genome Research Institute
- Jane Grimwood
  - Stanford Human Genome Center and Department of Genetics, Stanford University School of Medicine, Stanford, California 94305, USA
- Mabel Brown-John
  - British Columbia Genome Sciences Centre, BCCA, Vancouver, BC V5Z 1L3 Canada
- Michael Mayo
  - British Columbia Genome Sciences Centre, BCCA, Vancouver, BC V5Z 1L3 Canada
- Jeremy Schmutz
  - Stanford Human Genome Center and Department of Genetics, Stanford University School of Medicine, Stanford, California 94305, USA
- Jaclyn Beland
  - British Columbia Genome Sciences Centre, BCCA, Vancouver, BC V5Z 1L3 Canada
- Morgan Park
  - NIH Intramural Sequencing Center, National Human Genome Research Institute
- Susan Gibson
  - British Columbia Genome Sciences Centre, BCCA, Vancouver, BC V5Z 1L3 Canada
- Teika Olson
  - British Columbia Genome Sciences Centre, BCCA, Vancouver, BC V5Z 1L3 Canada
- Gerard G. Bouffard
  - NIH Intramural Sequencing Center, National Human Genome Research Institute
- Miranda Tsai
  - British Columbia Genome Sciences Centre, BCCA, Vancouver, BC V5Z 1L3 Canada
- Ruth Featherstone
  - British Columbia Genome Sciences Centre, BCCA, Vancouver, BC V5Z 1L3 Canada
- Steve Chand
  - British Columbia Genome Sciences Centre, BCCA, Vancouver, BC V5Z 1L3 Canada
- Asim S. Siddiqui
  - British Columbia Genome Sciences Centre, BCCA, Vancouver, BC V5Z 1L3 Canada
- Wonhee Jang
  - National Center for Biotechnology Information, National Library of Medicine, Bethesda, Maryland 20894, USA
- Ed Lee
  - National Center for Biotechnology Information, National Library of Medicine, Bethesda, Maryland 20894, USA
- Steven L. Klein
  - National Institute of Child Health and Human Development, National Institutes of Health, Bethesda, Maryland 20892, USA
- Barry R. Zeeberg
  - Genomics and Bioinformatics Group, Laboratory of Molecular Pharmacology
- John N. Weinstein
  - Genomics and Bioinformatics Group, Laboratory of Molecular Pharmacology
- Christa Prange Pennacchio
  - The I.M.A.G.E Consortium, Biology and Biotechnology Research Program, Lawrence Livermore National Laboratory, Livermore, California 94550, USA
- Richard M. Myers
  - Stanford Human Genome Center and Department of Genetics, Stanford University School of Medicine, Stanford, California 94305, USA
- Eric D. Green
  - NIH Intramural Sequencing Center, National Human Genome Research Institute
- Lukas Wagner
  - National Center for Biotechnology Information, National Library of Medicine, Bethesda, Maryland 20894, USA
- Marco A. Marra
  - British Columbia Genome Sciences Centre, BCCA, Vancouver, BC V5Z 1L3 Canada
- Steven J.M. Jones
  - British Columbia Genome Sciences Centre, BCCA, Vancouver, BC V5Z 1L3 Canada
- Robert A. Holt
  - British Columbia Genome Sciences Centre, BCCA, Vancouver, BC V5Z 1L3 Canada
  - Corresponding author. E-mail ; fax (604) 877-6085
4
Siddiqui AS, Khattra J, Delaney AD, Zhao Y, Astell C, Asano J, Babakaiff R, Barber S, Beland J, Bohacec S, Brown-John M, Chand S, Charest D, Charters AM, Cullum R, Dhalla N, Featherstone R, Gerhard DS, Hoffman B, Holt RA, Hou J, Kuo BYL, Lee LLC, Lee S, Leung D, Ma K, Matsuo C, Mayo M, McDonald H, Prabhu AL, Pandoh P, Riggins GJ, de Algara TR, Rupert JL, Smailus D, Stott J, Tsai M, Varhol R, Vrljicak P, Wong D, Wu MK, Xie YY, Yang G, Zhang I, Hirst M, Jones SJM, Helgason CD, Simpson EM, Hoodless PA, Marra MA. A mouse atlas of gene expression: large-scale digital gene-expression profiles from precisely defined developing C57BL/6J mouse tissues and cells. Proc Natl Acad Sci U S A 2005; 102:18485-90. [PMID: 16352711 PMCID: PMC1311911 DOI: 10.1073/pnas.0509455102] [Citation(s) in RCA: 100] [Impact Index Per Article: 5.3] [Indexed: 11/18/2022]
Abstract
We analyzed 8.55 million LongSAGE tags generated from 72 libraries. Each LongSAGE library was prepared from a different mouse tissue. Analysis of the data revealed extensive overlap with existing gene data sets and evidence for the existence of approximately 24,000 previously undescribed genomic loci. The visual cortex, pancreas, mammary gland, preimplantation embryo, and placenta contain the largest number of differentially expressed transcripts, 25% of which are previously undescribed loci.
Affiliation(s)
- Asim S Siddiqui
- Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer Research Centre, British Columbia Cancer Agency, Vancouver, BC, Canada V5Z 4S6
5
Krzywinski M, Bosdet I, Smailus D, Chiu R, Mathewson C, Wye N, Barber S, Brown-John M, Chan S, Chand S, Cloutier A, Girn N, Lee D, Masson A, Mayo M, Olson T, Pandoh P, Prabhu AL, Schoenmakers E, Tsai M, Albertson D, Lam W, Choy CO, Osoegawa K, Zhao S, de Jong PJ, Schein J, Jones S, Marra MA. A set of BAC clones spanning the human genome. Nucleic Acids Res 2004; 32:3651-60. [PMID: 15247347 PMCID: PMC484185 DOI: 10.1093/nar/gkh700] [Citation(s) in RCA: 105] [Impact Index Per Article: 5.3] [Received: 05/21/2004] [Revised: 06/22/2004] [Accepted: 06/22/2004] [Indexed: 11/15/2022]
Abstract
Using the human bacterial artificial chromosome (BAC) fingerprint-based physical map, genome sequence assembly and BAC end sequences, we have generated a fingerprint-validated set of 32 855 BAC clones spanning the human genome. The clone set provides coverage for at least 98% of the human fingerprint map, 99% of the current assembled sequence and has an effective resolving power of 79 kb. We have made the clone set publicly available, anticipating that it will generally facilitate FISH or array-CGH-based identification and characterization of chromosomal alterations relevant to disease.
Affiliation(s)
- Martin Krzywinski
- BC Cancer Agency Genome Sciences Center and BC Cancer Agency, Vancouver, BC V5Z 4E6, Canada
6
Krzywinski M, Wallis J, Gösele C, Bosdet I, Chiu R, Graves T, Hummel O, Layman D, Mathewson C, Wye N, Zhu B, Albracht D, Asano J, Barber S, Brown-John M, Chan S, Chand S, Cloutier A, Davito J, Fjell C, Gaige T, Ganten D, Girn N, Guggenheimer K, Himmelbauer H, Kreitler T, Leach S, Lee D, Lehrach H, Mayo M, Mead K, Olson T, Pandoh P, Prabhu AL, Shin H, Tänzer S, Thompson J, Tsai M, Walker J, Yang G, Sekhon M, Hillier L, Zimdahl H, Marziali A, Osoegawa K, Zhao S, Siddiqui A, de Jong PJ, Warren W, Mardis E, McPherson JD, Wilson R, Hübner N, Jones S, Marra M, Schein J. Integrated and sequence-ordered BAC- and YAC-based physical maps for the rat genome. Genome Res 2004; 14:766-79. [PMID: 15060021 PMCID: PMC383324 DOI: 10.1101/gr.2336604] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.0] [Received: 01/05/2004] [Accepted: 02/16/2004] [Indexed: 01/08/2023]
Abstract
As part of the effort to sequence the genome of Rattus norvegicus, we constructed a physical map comprised of fingerprinted bacterial artificial chromosome (BAC) clones from the CHORI-230 BAC library. These BAC clones provide approximately 13-fold redundant coverage of the genome and have been assembled into 376 fingerprint contigs. A yeast artificial chromosome (YAC) map was also constructed and aligned with the BAC map via fingerprinted BAC and P1 artificial chromosome clones (PACs) sharing interspersed repetitive sequence markers with the YAC-based physical map. We have annotated 95% of the fingerprint map clones in contigs with coordinates on the version 3.1 rat genome sequence assembly, using BAC-end sequences and in silico mapping methods. These coordinates have allowed anchoring 358 of the 376 fingerprint map contigs onto the sequence assembly. Of these, 324 contigs are anchored to rat genome sequences localized to chromosomes, and 34 contigs are anchored to unlocalized portions of the rat sequence assembly. The remaining 18 contigs, containing 54 clones, still require placement. The fingerprint map is a high-resolution integrative data resource that provides genome-ordered associations among BAC, YAC, and PAC clones and the assembled sequence of the rat genome.
Affiliation(s)
- Martin Krzywinski
- Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, Canada V5Z 4E6