1
|
Genome-Wide Detection and Analysis of Copy Number Variation in Anhui Indigenous and Western Commercial Pig Breeds Using Porcine 80K SNP BeadChip. Genes (Basel) 2023; 14:genes14030654. [PMID: 36980927 PMCID: PMC10047991 DOI: 10.3390/genes14030654] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/19/2023] [Revised: 02/27/2023] [Accepted: 03/01/2023] [Indexed: 03/08/2023] Open
Abstract
Copy number variation (CNV) is an important class of genetic variation that is widespread in the porcine genome, but little is known about the characteristics of CNVs in foreign and indigenous pig breeds. We performed a genome-wide comparison of CNVs between Anhui indigenous pig (AHIP) and Western commercial pig (WECP) breeds based on data from the Porcine 80K SNP BeadChip. After analysis using the PennCNV software, we detected 3863 and 7546 CNVs in the AHIP and WECP populations, respectively. We obtained 225 (loss: 178, gain: 47) and 379 (loss: 293, gain: 86) copy number variation regions (CNVRs) randomly distributed across the autosomes of the AHIP and WECP populations, accounting for 10.90% and 22.57% of the porcine autosomal genome, respectively. Functional enrichment analysis of genes in the CNVRs identified genes related to immunity (FOXJ1, FOXK2, MBL2, TNFRSF4, SIRT1, NCF1) and meat quality (DGAT1, NT5E) in the WECP population; these genes were subject to loss events in the WECP population. This study provides important information on CNV differences between foreign and indigenous pig breeds and offers a reference for future improvement of these breeds and their production performance.
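The step from individual CNV calls to CNVRs, and from CNVRs to a percentage of the genome, can be illustrated with a minimal Python sketch. It assumes that CNVRs are formed by merging overlapping CNV calls on the same chromosome, which is common practice but is not stated in the abstract. The function names, the 1-based inclusive coordinates, and the "mixed" label are hypothetical choices made for illustration only.

```python
from collections import defaultdict


def merge_cnvs_into_cnvrs(cnvs):
    """Merge overlapping CNV calls on the same chromosome into CNVRs.

    cnvs: iterable of (chrom, start, end, kind), kind being "loss" or "gain",
    with 1-based inclusive coordinates (an assumption of this sketch).
    """
    by_chrom = defaultdict(list)
    for chrom, start, end, kind in cnvs:
        by_chrom[chrom].append((start, end, kind))

    cnvrs = []
    for chrom in sorted(by_chrom):
        region = None  # [start, end, set of kinds] for the region being built
        for start, end, kind in sorted(by_chrom[chrom]):
            if region is not None and start <= region[1]:
                # The call overlaps the open region, so extend that region.
                region[1] = max(region[1], end)
                region[2].add(kind)
            else:
                if region is not None:
                    cnvrs.append(_close(chrom, region))
                region = [start, end, {kind}]
        if region is not None:
            cnvrs.append(_close(chrom, region))
    return cnvrs


def _close(chrom, region):
    start, end, kinds = region
    # "mixed" is a hypothetical label for regions holding both gains and losses.
    label = next(iter(kinds)) if len(kinds) == 1 else "mixed"
    return (chrom, start, end, label)


def percent_covered(cnvrs, genome_length):
    """Percentage of a genome of the given length covered by the CNVRs."""
    covered = sum(end - start + 1 for _, start, end, _ in cnvrs)
    return 100.0 * covered / genome_length


# Toy calls, not data from the study.
calls = [("1", 100, 200, "loss"), ("1", 150, 300, "loss"), ("1", 500, 600, "gain")]
regions = merge_cnvs_into_cnvrs(calls)
```

Merging after a single sort keeps the work per chromosome at O(n log n), which matters little for array data but scales to larger call sets.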
|
2
|
Genetic Influences on Fetal Alcohol Spectrum Disorder. Genes (Basel) 2023; 14:genes14010195. [PMID: 36672936 PMCID: PMC9859092 DOI: 10.3390/genes14010195] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/09/2022] [Revised: 01/06/2023] [Accepted: 01/10/2023] [Indexed: 01/14/2023] Open
Abstract
Fetal alcohol spectrum disorder (FASD) encompasses the range of deleterious outcomes of prenatal alcohol exposure (PAE) in the affected offspring, including developmental delay, intellectual disability, attention deficits, and conduct disorders. Several factors contribute to the risk for and severity of FASD, including the timing, dose, and duration of PAE and maternal factors such as age and nutrition. Although poorly understood, genetic factors also contribute to the expression of FASD, with studies in both humans and animal models revealing genetic influences on susceptibility. In this article, we review the literature related to the genetics of FASD in humans, including twin studies, candidate gene studies in different populations, and genetic testing identifying copy number variants. Overall, these studies suggest different genetic factors, both in the mother and in the offspring, influence the phenotypic outcomes of PAE. While further work is needed, understanding how genetic factors influence FASD will provide insight into the mechanisms contributing to alcohol teratogenicity and FASD risk and ultimately may lead to means for early detection and intervention.
|
3
|
Išerić H, Alkan C, Hach F, Numanagić I. Fast characterization of segmental duplication structure in multiple genome assemblies. Algorithms Mol Biol 2022; 17:4. [PMID: 35303886 PMCID: PMC8932185 DOI: 10.1186/s13015-022-00210-2] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 11/16/2021] [Accepted: 02/08/2022] [Indexed: 11/29/2022] Open
Abstract
Motivation: The increasing availability of high-quality genome assemblies raised interest in the characterization of genomic architecture. Major architectural elements, such as common repeats and segmental duplications (SDs), increase genome plasticity that stimulates further evolution by changing the genomic structure and inventing new genes. Optimal computation of SDs within a genome requires quadratic-time local alignment algorithms that are impractical due to the size of most genomes. Additionally, to perform evolutionary analysis, one needs to characterize SDs in multiple genomes and find relations between those SDs and unique (non-duplicated) segments in other genomes. A naïve approach consisting of multiple sequence alignment would make the optimal solution to this problem even more impractical. Thus there is a need for fast and accurate algorithms to characterize SD structure in multiple genome assemblies to better understand the evolutionary forces that shaped the genomes of today. Results: Here we introduce a new approach, BISER, to quickly detect SDs in multiple genomes and identify elementary SDs and core duplicons that drive the formation of such SDs. BISER improves earlier tools by (i) scaling the detection of SDs with low homology to multiple genomes while introducing further 7–33× speed-ups over the existing tools, and by (ii) characterizing elementary SDs and detecting core duplicons to help trace the evolutionary history of duplications to as far as 300 million years. Availability and implementation: BISER is implemented in Seq programming language and is publicly available at https://github.com/0xTCG/biser.
|
4
|
Tham CY, Tirado-Magallanes R, Goh Y, Fullwood MJ, Koh BTH, Wang W, Ng CH, Chng WJ, Thiery A, Tenen DG, Benoukraf T. NanoVar: accurate characterization of patients' genomic structural variants using low-depth nanopore sequencing. Genome Biol 2020; 21:56. [PMID: 32127024 PMCID: PMC7055087 DOI: 10.1186/s13059-020-01968-7] [Citation(s) in RCA: 52] [Impact Index Per Article: 13.0] [Received: 07/03/2019] [Accepted: 02/21/2020] [Indexed: 12/19/2022] Open
Abstract
The recent advent of third-generation sequencing technologies brings promise for better characterization of genomic structural variants by virtue of having longer reads. However, long-read applications are still constrained by their high sequencing error rates and low sequencing throughput. Here, we present NanoVar, an optimized structural variant caller utilizing low-depth (8X) whole-genome sequencing data generated by Oxford Nanopore Technologies. NanoVar exhibits higher structural variant calling accuracy when benchmarked against current tools using low-depth simulated datasets. In patient samples, we successfully validate structural variants characterized by NanoVar and uncover normal alternative sequences or alleles which are present in healthy individuals.
Affiliation(s)
- Cheng Yong Tham
  - Cancer Science Institute of Singapore, National University of Singapore, Centre for Translational Medicine, 14 Medical Drive, #12-01, Singapore, 117599, Singapore
- Roberto Tirado-Magallanes
  - Cancer Science Institute of Singapore, National University of Singapore, Centre for Translational Medicine, 14 Medical Drive, #12-01, Singapore, 117599, Singapore
- Yufen Goh
  - Cancer Science Institute of Singapore, National University of Singapore, Centre for Translational Medicine, 14 Medical Drive, #12-01, Singapore, 117599, Singapore
- Melissa J Fullwood
  - Cancer Science Institute of Singapore, National University of Singapore, Centre for Translational Medicine, 14 Medical Drive, #12-01, Singapore, 117599, Singapore
  - School of Biological Sciences, Nanyang Technological University, Singapore, 637551, Singapore
- Bryan T H Koh
  - Department of Orthopedic Surgery, National University Health Systems, Singapore, 119228, Singapore
- Wilson Wang
  - Department of Orthopedic Surgery, National University Health Systems, Singapore, 119228, Singapore
  - Department of Orthopaedic Surgery, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, 119228, Singapore
- Chin Hin Ng
  - Department of Hematology-Oncology, National University Cancer Institute of Singapore, National University Health System, Singapore, 119228, Singapore
- Wee Joo Chng
  - Cancer Science Institute of Singapore, National University of Singapore, Centre for Translational Medicine, 14 Medical Drive, #12-01, Singapore, 117599, Singapore
  - Department of Hematology-Oncology, National University Cancer Institute of Singapore, National University Health System, Singapore, 119228, Singapore
  - Department of Medicine, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, 119228, Singapore
- Alexandre Thiery
  - Department of Statistics and Applied Probability, National University of Singapore, Singapore, 117546, Singapore
- Daniel G Tenen
  - Cancer Science Institute of Singapore, National University of Singapore, Centre for Translational Medicine, 14 Medical Drive, #12-01, Singapore, 117599, Singapore
  - Harvard Stem Cell Institute, Harvard Medical School, Boston, MA, 02115, USA
- Touati Benoukraf
  - Cancer Science Institute of Singapore, National University of Singapore, Centre for Translational Medicine, 14 Medical Drive, #12-01, Singapore, 117599, Singapore
  - Discipline of Genetics, Faculty of Medicine, Memorial University of Newfoundland, St. John's, NL, A1B 3V6, Canada
|
5
|
Pharmacogenes (PGx-genes): Current understanding and future directions. Gene 2019; 718:144050. [DOI: 10.1016/j.gene.2019.144050] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 03/12/2019] [Revised: 08/13/2019] [Accepted: 08/14/2019] [Indexed: 12/14/2022]
|
6
|
Kachouie NN, Deebani W, Christiani DC. Identifying Similarities and Disparities Between DNA Copy Number Changes in Cancer and Matched Blood Samples. Cancer Invest 2019; 37:535-545. [PMID: 31584296 DOI: 10.1080/07357907.2019.1667368] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 01/16/2023]
Abstract
Background: Non-small cell lung cancer (NSCLC) is the leading cause of cancer-related mortality for men and women in the United States. Despite curative resection in early-stage disease, patient survival is not optimal and the recurrence rate is high. Consequently, early detection and staging are essential to increase patient survival. Methods: Copy number (CN) changes in cancer populations have been broadly investigated to identify CN gains and deletions associated with cancer. In contrast, in this research, we quantify the similarities and disparities between cancer and paired peripheral blood samples using the maximal information coefficient (MIC). We then detect the locations with substantially high MICs and the locations with very low MICs in each chromosome. These locations can potentially help with early diagnosis, treatment, and prevention of cancer by identifying the similarities and disparities between cancer and healthy tissues. Results: The lung cancer data used in this project contain CN pairs for cancer and blood (non-involved) samples for 63 subjects. MIC was obtained to quantify the relation (linear or nonlinear) between cancer-blood pair samples for the 63 subjects at each location for each chromosome. MIC values above a high threshold and MIC values below a low threshold were located. Among them, the top five (with the lowest MICs and with the highest MICs) were identified for each chromosome. For these identified locations, a high MIC score indicates high similarity between blood (non-involved) and cancer samples, while a low MIC score shows a lack of similarity between the two samples. Conclusions: The results showed that a few chromosomes have a large number of MICs exceeding a high threshold. These locations can potentially be used to identify early indicators of NSCLC. In contrast, a second group of chromosomes has several locations with small MICs, which are potential candidates to develop biomarkers for discriminating cancer from the matched blood sample. Moreover, there is a third group of chromosomes with a large number of MICs exceeding a high threshold and a large set of MICs below a low threshold. These locations can help with both finding early indicators of cancer and developing biomarkers for discriminating cancer from non-involved tissue.
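The selection step described in the Results (threshold the MIC values, then keep the five lowest and five highest per chromosome) can be sketched in Python as follows. The sketch assumes the MIC values have already been computed. The function name, the input layout, and the threshold values 0.2 and 0.8 are hypothetical, since the abstract does not give the thresholds used.

```python
from collections import defaultdict


def extreme_mic_locations(mic_by_location, low_threshold=0.2, high_threshold=0.8, k=5):
    """Pick up to k locations per chromosome at each extreme of the MIC range.

    mic_by_location: dict mapping (chrom, position) to a precomputed MIC value.
    The thresholds are placeholders, not values taken from the study.
    """
    per_chrom = defaultdict(list)
    for (chrom, position), mic in mic_by_location.items():
        per_chrom[chrom].append((mic, position))

    selected = {}
    for chrom, scored in per_chrom.items():
        below = sorted(item for item in scored if item[0] < low_threshold)
        above = sorted((item for item in scored if item[0] > high_threshold), reverse=True)
        selected[chrom] = {
            # Low MIC: cancer and blood differ, a candidate discriminating location.
            "lowest": [position for _, position in below[:k]],
            # High MIC: cancer and blood are similar at this location.
            "highest": [position for _, position in above[:k]],
        }
    return selected
```

Filtering before ranking means a chromosome with no values beyond a threshold yields an empty list for that extreme, rather than reporting its five least unremarkable locations.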
Affiliation(s)
- Nezamoddin N Kachouie
  - Department of Mathematical Sciences, Florida Institute of Technology, Melbourne, Florida, USA
- Wejdan Deebani
  - Department of Mathematical Sciences, Florida Institute of Technology, Melbourne, Florida, USA
- David C Christiani
  - Department of Environmental Health, Harvard School of Public Health, Boston, Massachusetts, USA
  - Department of Epidemiology, Harvard School of Public Health, Boston, Massachusetts, USA
|
7
|
Keel BN, Nonneman DJ, Lindholm-Perry AK, Oliver WT, Rohrer GA. A Survey of Copy Number Variation in the Porcine Genome Detected From Whole-Genome Sequence. Front Genet 2019; 10:737. [PMID: 31475038 PMCID: PMC6707380 DOI: 10.3389/fgene.2019.00737] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 02/19/2019] [Accepted: 07/12/2019] [Indexed: 12/11/2022] Open
Abstract
Copy number variations (CNVs) are gains and losses of large regions of genomic sequence between individuals of a species. Although CNVs have been associated with various phenotypic traits in humans and other species, the extent to which CNVs impact phenotypic variation remains unclear. In swine, as well as many other species, relatively little is understood about the frequency of CNV in the genome, sizes, locations, and other chromosomal properties. In this work, we identified and characterized CNV by utilizing whole-genome sequence from 240 members of an intensely phenotyped experimental swine herd at the U.S. Meat Animal Research Center (USMARC). These animals included all 24 of the purebred founding boars (12 Duroc and 12 Landrace), 48 of the founding Yorkshire-Landrace composite sows, 109 composite animals from generations 4 through 9, 29 composite animals from generation 15, and 30 purebred industry boars (15 Landrace and 15 Yorkshire) used as sires in generations 10 through 15. Using a combination of split reads, paired-end mapping, and read depth approaches, we identified a total of 3,538 copy number variable regions (CNVRs), including 1,820 novel CNVRs not reported in previous studies. The CNVRs covered 0.94% of the porcine genome and overlapped 1,401 genes. Gene ontology analysis identified that CNV-overlapped genes were enriched for functions related to organism development. Additionally, CNVRs overlapped with many known quantitative trait loci (QTL). In particular, analysis of QTL previously identified in the USMARC herd showed that CNVRs were most overlapped with reproductive traits, such as age of puberty and ovulation rate, and CNVRs were significantly enriched for reproductive QTL.
Affiliation(s)
- Brittney N Keel
  - USDA, ARS, U.S. Meat Animal Research Center, Clay Center, NE, United States
- Dan J Nonneman
  - USDA, ARS, U.S. Meat Animal Research Center, Clay Center, NE, United States
- William T Oliver
  - USDA, ARS, U.S. Meat Animal Research Center, Clay Center, NE, United States
- Gary A Rohrer
  - USDA, ARS, U.S. Meat Animal Research Center, Clay Center, NE, United States
|
8
|
Sahlin K, Tomaszkiewicz M, Makova KD, Medvedev P. Deciphering highly similar multigene family transcripts from Iso-Seq data with IsoCon. Nat Commun 2018; 9:4601. [PMID: 30389934 PMCID: PMC6214943 DOI: 10.1038/s41467-018-06910-x] [Citation(s) in RCA: 33] [Impact Index Per Article: 5.5] [Received: 01/22/2018] [Accepted: 09/29/2018] [Indexed: 12/30/2022] Open
Abstract
A significant portion of genes in vertebrate genomes belongs to multigene families, with each family containing several gene copies whose presence/absence, as well as isoform structure, can be highly variable across individuals. Existing de novo techniques for assaying the sequences of such highly-similar gene families fall short of reconstructing end-to-end transcripts with nucleotide-level precision or assigning alternatively spliced transcripts to their respective gene copies. We present IsoCon, a high-precision method using long PacBio Iso-Seq reads to tackle this challenge. We apply IsoCon to nine Y chromosome ampliconic gene families and show that it outperforms existing methods on both experimental and simulated data. IsoCon has allowed us to detect an unprecedented number of novel isoforms and has opened the door for unraveling the structure of many multigene families and gaining a deeper understanding of genome evolution and human diseases.
Affiliation(s)
- Kristoffer Sahlin
  - Department of Computer Science and Engineering, Pennsylvania State University, University Park, PA, 16802, USA
- Marta Tomaszkiewicz
  - Department of Biology, Pennsylvania State University, University Park, PA, 16802, USA
- Kateryna D Makova
  - Department of Biology, Pennsylvania State University, University Park, PA, 16802, USA
  - Center for Medical Genomics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA, 16802, USA
  - Center for Computational Biology and Bioinformatics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA, 16802, USA
- Paul Medvedev
  - Department of Computer Science and Engineering, Pennsylvania State University, University Park, PA, 16802, USA
  - Center for Medical Genomics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA, 16802, USA
  - Center for Computational Biology and Bioinformatics, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA, 16802, USA
  - Department of Biochemistry and Molecular Biology, Pennsylvania State University, University Park, PA, 16802, USA
|
9
|
Sapkota Y, Narasimhan A, Kumaran M, Sehrawat BS, Damaraju S. A Genome-Wide Association Study to Identify Potential Germline Copy Number Variants for Sporadic Breast Cancer Susceptibility. Cytogenet Genome Res 2016; 149:156-164. [PMID: 27668787 DOI: 10.1159/000448558] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Accepted: 05/23/2016] [Indexed: 11/19/2022] Open
Abstract
Breast cancer (BC) predisposition in populations arises from both genetic and nongenetic risk factors. Structural variations such as copy number variations (CNVs) are heritable determinants for disease susceptibility. The primary objectives of this study are (1) to identify CNVs associated with sporadic BC using a genome-wide association study (GWAS) design; (2) to utilize 2 distinct CNV calling algorithms to identify concordant CNVs as a strategy to reduce false positive associations in the hypothesis-generating GWAS discovery phase, and (3) to identify potential candidate CNVs for follow-up replication studies. We used Affymetrix SNP Array 6.0 data profiled on Caucasian subjects (422 cases/348 controls) to call CNVs using algorithms implemented in Nexus Copy Number and Partek Genomics Suite software. Nexus algorithm identified CNVs associated with BC (731 autosomal CNVs with >5% frequency in the total sample and Q < 0.05). Thirteen CNVs were identified when Partek algorithm-called CNVs were overlapped with Nexus-identified CNVs; these CNVs showed concordances for frequency, effect size, and direction. Coding genes present within BC-associated CNVs were known to play a role in disease etiology and prognosis. Long noncoding RNAs identified within CNVs showed tissue-specific expression, indicating potential functional relevance of the findings. The identified candidate CNVs warrant independent replication.
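The idea of keeping only CNVs that two calling algorithms agree on can be sketched in Python. This is an illustration only: the abstract says the Partek and Nexus calls were overlapped but gives no overlap criterion, so the 50% reciprocal overlap rule, the requirement of matching gain/loss direction, and all names below are assumptions of the sketch.

```python
def reciprocal_overlap(a_start, a_end, b_start, b_end):
    """Smaller of the two overlap fractions for 1-based inclusive intervals."""
    shared = min(a_end, b_end) - max(a_start, b_start) + 1
    if shared <= 0:
        return 0.0
    return min(shared / (a_end - a_start + 1), shared / (b_end - b_start + 1))


def concordant_calls(calls_a, calls_b, min_overlap=0.5):
    """Pair each call from caller A with the first agreeing call from caller B.

    Calls are (chrom, start, end, kind) tuples, kind being "gain" or "loss".
    min_overlap is a hypothetical threshold, not one reported in the study.
    """
    pairs = []
    for a in calls_a:
        for b in calls_b:
            same_place = a[0] == b[0] and a[3] == b[3]
            if same_place and reciprocal_overlap(a[1], a[2], b[1], b[2]) >= min_overlap:
                pairs.append((a, b))
                break  # one supporting call from B is enough
    return pairs
```

Using the smaller of the two overlap fractions stops a very large call from one algorithm from counting as support for a much smaller call from the other.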
Affiliation(s)
- Yadav Sapkota
  - Department of Epidemiology and Cancer Control, St. Jude Children's Research Hospital, Memphis, Tenn., USA
|
10
|
Zhang J, Zuo T, Wang D, Peterson T. Transposition-mediated DNA re-replication in maize. eLife 2014; 3:e03724. [PMID: 25406063 PMCID: PMC4270019 DOI: 10.7554/elife.03724] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.1] [Received: 06/18/2014] [Accepted: 11/17/2014] [Indexed: 02/03/2023] Open
Abstract
Every DNA segment in a eukaryotic genome normally replicates once and only once per cell cycle to maintain genome stability. We show here that this restriction can be bypassed through alternative transposition, a transposition reaction that utilizes the termini of two separate, nearby transposable elements (TEs). Our results suggest that alternative transposition during S phase can induce re-replication of the TEs and their flanking sequences. The DNA re-replication can spontaneously abort to generate double-strand breaks, which can be repaired to generate Composite Insertions composed of transposon termini flanking segmental duplications of various lengths. These results show how alternative transposition coupled with DNA replication and repair can significantly alter genome structure and may have contributed to rapid genome evolution in maize and possibly other eukaryotes.
Affiliation(s)
- Jianbo Zhang
  - Department of Genetics, Development and Cell Biology, Iowa State University, Ames, United States
  - Department of Agronomy, Iowa State University, Ames, United States
- Tao Zuo
  - Department of Genetics, Development and Cell Biology, Iowa State University, Ames, United States
  - Department of Agronomy, Iowa State University, Ames, United States
- Dafang Wang
  - Department of Genetics, Development and Cell Biology, Iowa State University, Ames, United States
  - Department of Agronomy, Iowa State University, Ames, United States
- Thomas Peterson
  - Department of Genetics, Development and Cell Biology, Iowa State University, Ames, United States
  - Department of Agronomy, Iowa State University, Ames, United States
|
11
|
Carvalho CMB, Zuccherato LW, Williams CL, Neill NJ, Murdock DR, Bainbridge M, Jhangiani SN, Muzny DM, Gibbs RA, Ip W, Guillerman RP, Lupski JR, Bertuch AA. Structural variation and missense mutation in SBDS associated with Shwachman-Diamond syndrome. BMC Med Genet 2014; 15:64. [PMID: 24898207 PMCID: PMC4057820 DOI: 10.1186/1471-2350-15-64] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Received: 03/13/2014] [Accepted: 05/29/2014] [Indexed: 12/18/2022]
Abstract
Background: Shwachman–Diamond syndrome (SDS) is an autosomal recessive ribosomopathy caused mainly by compound heterozygous mutations in SBDS. Structural variation (SV) involving the SBDS locus has rarely been reported in association with the disease. We aimed to determine whether an SV contributed to the pathogenesis of a case lacking biallelic SBDS point mutations. Case presentation: Whole exome sequencing was performed in a patient with SDS lacking biallelic SBDS point mutations. Array comparative genomic hybridization and Southern blotting were used to seek SVs across the SBDS locus. Locus-specific polymerase chain reaction (PCR) encompassing flanking intronic sequence was also performed to investigate mutation within the locus. RNA expression and Western blotting were performed to analyze allele and protein expression. We found the child harbored a single missense mutation in SBDS (c.98A>C; p.K33T), inherited from the mother, and an SV in the SBDS locus, inherited from the father. The missense allele and SV segregated in accordance with Mendelian expectations for autosomal recessive SDS. Complementary DNA analysis, Western blotting, and locus-specific PCR support the contention that the SV perturbed SBDS protein expression in the father and child. Conclusion: Our findings implicate genomic rearrangements in the pathogenesis of some cases of SDS and support testing patients who lack biallelic SBDS point mutations for SV within the SBDS locus.
Affiliation(s)
- Alison A Bertuch
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
|
12
|
Askree SH, Chin ELH, Bean LH, Coffee B, Tanner A, Hegde M. Detection limit of intragenic deletions with targeted array comparative genomic hybridization. BMC Genet 2013; 14:116. [PMID: 24304607 PMCID: PMC4235222 DOI: 10.1186/1471-2156-14-116] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.3] [Received: 07/05/2013] [Accepted: 11/12/2013] [Indexed: 11/24/2022] Open
Abstract
Background: Pathogenic mutations range from single nucleotide changes to deletions or duplications that encompass a single exon to several genes. The use of gene-centric high-density array comparative genomic hybridization (aCGH) has revolutionized the detection of intragenic copy number variations. We implemented an exon-centric design of high-resolution aCGH to detect single- and multi-exon deletions and duplications in a large set of genes using the OGT 60 K and 180 K arrays. Here we describe the molecular characterization and breakpoint mapping of deletions at the smaller end of the detectable range in several genes using aCGH. Results: The method, initially implemented to detect single to multiple exon deletions, was able to detect deletions much smaller than anticipated. The selected deletions we describe vary in size, ranging from over 2 kb to as small as 12 base pairs. The smallest of these deletions are only detectable after careful manual review during data analysis. Suspected deletions smaller than the detection size for which the method was optimized were rigorously followed up and confirmed with PCR-based investigations to uncover the true detection size limit of intragenic deletions with this technology. False-positive deletion calls often turned out to be single nucleotide changes or an insertion causing lower hybridization of probes, demonstrating the sensitivity of aCGH. Conclusions: With an optimized aCGH design and a careful review process, aCGH can uncover intragenic deletions as small as a dozen bases. These data provide insight that will help optimize probe coverage in array design and illustrate the true assay sensitivity. Mapping of the breakpoints confirms smaller deletions and contributes to the understanding of the mechanism behind these events. Our knowledge of the mutation spectra of several genes can be expected to change as previously unrecognized intragenic deletions are uncovered.
Affiliation(s)
- Madhuri Hegde
  - Emory Genetics Laboratory, Department of Human Genetics, Emory University, 2165 N Decatur Road, Decatur, GA 30033, USA
|
13
|
Single exon-resolution targeted chromosomal microarray analysis of known and candidate intellectual disability genes. Eur J Hum Genet 2013; 22:792-800. [PMID: 24253858 DOI: 10.1038/ejhg.2013.248] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.0] [Received: 03/20/2013] [Revised: 09/05/2013] [Accepted: 09/27/2013] [Indexed: 02/07/2023] Open
Abstract
Intellectual disability affects about 3% of individuals globally, with ∼50% idiopathic. We designed an exonic-resolution array targeting all known submicroscopic chromosomal intellectual disability syndrome loci, causative genes for intellectual disability, and potential candidate genes, all genes encoding glutamate receptors and epigenetic regulators. Using this platform, we performed chromosomal microarray analysis on 165 intellectual disability trios (affected child and both normal parents). We identified and independently validated 36 de novo copy-number changes in 32 trios. In all, 67% of the validated events were intragenic, involving only exon 1 (which includes the promoter sequence according to our design), exon 1 and adjacent exons, or one or more exons excluding exon 1. Seventeen of the 36 copy-number variants involve genes known to cause intellectual disability. Eleven of these, including seven intragenic variants, are clearly pathogenic (involving STXBP1, SHANK3 (3 patients), IL1RAPL1, UBE2A, NRXN1, MEF2C, CHD7, 15q24 and 9p24 microdeletion), two are likely pathogenic (PI4KA, DCX), two are unlikely to be pathogenic (GRIK2, FREM2), and two are unclear (ARID1B, 15q22 microdeletion). Twelve individuals with genomic imbalances identified by our array were tested with a clinical microarray, and six had a normal result. We identified de novo copy-number variants within genes not previously implicated in intellectual disability and uncovered pathogenic variation of known intellectual disability genes below the detection limit of standard clinical diagnostic chromosomal microarray analysis.
|
14
|
Yang R, Chen B, Pfütze K, Buch S, Steinke V, Holinski-Feder E, Stöcker S, von Schönfels W, Becker T, Schackert HK, Royer-Pokora B, Kloor M, Schmiegel WH, Büttner R, Engel C, Lascorz Puertolas J, Försti A, Kunkel N, Bugert P, Schreiber S, Krawczak M, Schafmayer C, Propping P, Hampe J, Hemminki K, Burwinkel B. Genome-wide analysis associates familial colorectal cancer with increases in copy number variations and a rare structural variation at 12p12.3. Carcinogenesis 2013; 35:315-23. [PMID: 24127187 DOI: 10.1093/carcin/bgt344] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.7] [Indexed: 02/07/2023] Open
Abstract
Colorectal cancer (CRC) is one of the most common cancers worldwide. However, a large number of the genetic risk factors involved in CRC are not yet understood. Copy number variations (CNVs) might partly contribute to the 'missing heritability' of CRC. An increased overall burden of CNV has been identified in several complex diseases, whereas the association between the overall CNV burden and CRC risk is largely unknown. We performed a genome-wide investigation of CNVs on genomic DNA from 384 familial CRC cases and 1285 healthy controls using the Affymetrix 6.0 array. An increase in overall CNV burden was observed in familial CRC patients compared with healthy controls, especially for CNVs larger than 50 kb (case/control ratio = 1.66, P = 0.025). In addition, we discovered for the first time a novel structural variation at 12p12.3 and determined the breakpoints by strategic PCR and sequencing. This 12p12.3 structural variation was found in four of 2862 CRC cases but not in 6243 healthy controls (P = 0.0098). The RERGL gene (RERG/RAS-like), the only gene influenced by the 12p12.3 structural variation, shares most of the conserved regions with its close family member, the RERG tumor suppressor gene (RAS-like, estrogen-regulated, growth inhibitor), and might be a novel CRC-related gene. In conclusion, this is the first study to reveal the contribution of the overall burden of CNVs to familial CRC risk and to identify a novel rare structural variation at 12p12.3, containing the RERGL gene, as associated with CRC.
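The carrier counts for the 12p12.3 variant (four of 2862 cases, none of 6243 controls) lend themselves to a short worked check in Python. The abstract does not name the statistical test, so the sketch assumes a one-sided Fisher-style hypergeometric calculation: the probability that all four carriers fall among the cases if carrier status were unrelated to disease. The function name is hypothetical.

```python
from math import comb


def prob_all_carriers_are_cases(carriers, n_cases, n_controls):
    """Hypergeometric probability that every carrier is a case.

    With zero carriers among controls this is the one-sided Fisher exact
    P value for the 2x2 table; the choice of test is an assumption here.
    """
    total = n_cases + n_controls
    return comb(n_cases, carriers) / comb(total, carriers)


p = prob_all_carriers_are_cases(4, 2862, 6243)
print(f"one-sided P = {p:.4f}")
```

Under this assumption the result falls just below 0.01, the same order as the reported P = 0.0098; exact agreement is not expected, because the test the authors used may differ.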
Affiliation(s)
- Rongxi Yang
- Molecular Epidemiology, German Cancer Research Center (DKFZ), 69120 Heidelberg, Germany
|
15
|
Campbell CD, Eichler EE. Properties and rates of germline mutations in humans. Trends Genet 2013; 29:575-84. [PMID: 23684843 PMCID: PMC3785239 DOI: 10.1016/j.tig.2013.04.005] [Citation(s) in RCA: 188] [Impact Index Per Article: 17.1] [Received: 01/28/2013] [Revised: 04/05/2013] [Accepted: 04/18/2013] [Indexed: 11/25/2022]
Abstract
All genetic variation arises via new mutations; therefore, determining the rate and biases for different classes of mutation is essential for understanding the genetics of human disease and evolution. Decades of mutation rate analyses have focused on a relatively small number of loci because of technical limitations. However, advances in sequencing technology have allowed for empirical assessments of genome-wide rates of mutation. Recent studies have shown that 76% of new mutations originate in the paternal lineage and provide unequivocal evidence for an increase in mutation with paternal age. Although most analyses have focused on single nucleotide variants (SNVs), studies have begun to provide insight into the mutation rate for other classes of variation, including copy number variants (CNVs), microsatellites, and mobile element insertions (MEIs). Here, we review the genome-wide analyses for the mutation rate of several types of variants and suggest areas for future research.
Affiliation(s)
- Evan E. Eichler
- Department of Genome Sciences, University of Washington, Seattle, WA 98195
- Howard Hughes Medical Institute, Seattle, WA 98195
|
16
|
Sodhi SS, Jeong DK, Sharma N, Lee JH, Kim JH, Kim SH, Kim SW, Oh SJ. Marker Assisted Selection-Applications and Evaluation for Commercial Poultry Breeding. Korean Journal of Poultry Science 2013; 40:223. [DOI: 10.5536/kjps.2013.40.3.223] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 11/03/2022]
|
17
|
Crooijmans RPMA, Fife MS, Fitzgerald TW, Strickland S, Cheng HH, Kaiser P, Redon R, Groenen MAM. Large scale variation in DNA copy number in chicken breeds. BMC Genomics 2013; 14:398. [PMID: 23763846 PMCID: PMC3751642 DOI: 10.1186/1471-2164-14-398] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.5] [Received: 12/07/2012] [Accepted: 06/04/2013] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Detecting genetic variation is a critical step in elucidating the molecular mechanisms underlying phenotypic diversity. Until recently, such detection has mostly focused on single nucleotide polymorphisms (SNPs) because of the ease in screening complete genomes. Another type of variant, copy number variation (CNV), is emerging as a significant contributor to phenotypic variation in many species. Here we describe a genome-wide CNV study using array comparative genomic hybridization (aCGH) in a wide variety of chicken breeds. RESULTS We identified 3,154 CNVs, grouped into 1,556 CNV regions (CNVRs). Thirty percent of the CNVs were detected in at least 2 individuals. The average size of the CNVs detected was 46.3 kb, with the largest CNV, located on GGAZ, being 4.3 Mb. Approximately 75% of the CNVs are copy number losses relative to the Red Jungle Fowl reference genome. The genome coverage of CNVRs in this study is 60 Mb, which represents almost 5.4% of the chicken genome. In particular, large gene families such as the keratin gene family and the MHC show extensive CNV. CONCLUSIONS A relatively large group of the CNVs are line-specific, several of which were previously shown to be related to the causative mutation for a number of phenotypic variants. The chance that inter-specific CNVs fall into CNVRs detected in chicken is related to the evolutionary distance between the species. Our results provide a valuable resource for the study of genetic and phenotypic variation in this phenotypically diverse species.
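The abstract reports CNVs grouped into CNVRs and a total genome coverage, but does not state the grouping rule. A common convention is to merge any calls that overlap on the same chromosome; the sketch below assumes that rule, half-open coordinates, and made-up example calls.

```python
def merge_cnvs(cnvs):
    """Merge overlapping (chrom, start, end) calls into CNV regions.

    Assumption: two calls belong to the same region when they share a
    chromosome and their half-open intervals overlap or touch.
    """
    regions = []
    for chrom, start, end in sorted(cnvs):
        last = regions[-1] if regions else None
        if last and last[0] == chrom and start <= last[2]:
            # Extend the current region to cover this call.
            last[2] = max(last[2], end)
        else:
            regions.append([chrom, start, end])
    return [tuple(region) for region in regions]

# Hypothetical calls, not data from the study.
calls = [("chr1", 100, 500), ("chr1", 400, 900),
         ("chr1", 2000, 2500), ("chr2", 100, 300)]
cnvrs = merge_cnvs(calls)
covered_bp = sum(end - start for _, start, end in cnvrs)
print(cnvrs, covered_bp)
```

Dividing `covered_bp` by the assembly length gives the coverage fraction, which is how a figure such as "60 Mb, almost 5.4% of the genome" would be derived.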
Affiliation(s)
- Richard P M A Crooijmans
- Animal Breeding and Genomics Centre, Wageningen University, P.O. Box 338, Wageningen 6700 AH, The Netherlands.
|
18
|
Xin H, Lee D, Hormozdiari F, Yedkar S, Mutlu O, Alkan C. Accelerating read mapping with FastHASH. BMC Genomics 2013; 14 Suppl 1:S13. [PMID: 23369189 PMCID: PMC3549798 DOI: 10.1186/1471-2164-14-s1-s13] [Citation(s) in RCA: 37] [Impact Index Per Article: 3.4] [Indexed: 11/10/2022] Open
Abstract
With the introduction of next-generation sequencing (NGS) technologies, we are facing an exponential increase in the amount of genomic sequence data. The success of all medical and genetic applications of next-generation sequencing critically depends on the existence of computational techniques that can process and analyze the enormous amount of sequence data quickly and accurately. Unfortunately, the current read mapping algorithms have difficulties in coping with the massive amounts of data generated by NGS. We propose a new algorithm, FastHASH, which drastically improves the performance of the seed-and-extend type hash table based read mapping algorithms, while maintaining the high sensitivity and comprehensiveness of such methods. FastHASH is a generic algorithm compatible with all seed-and-extend class read mapping algorithms. It introduces two main techniques, namely Adjacency Filtering and Cheap K-mer Selection. We implemented FastHASH and merged it into the codebase of the popular read mapping program, mrFAST. Depending on the edit distance cutoffs, we observed up to 19-fold speedup while still maintaining 100% sensitivity and high comprehensiveness.
Affiliation(s)
- Hongyi Xin
- Depts. of Computer Science and Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh, PA 15213, USA
|
19
|
Abstract
The science of genetics is undergoing a paradigm shift. Recent discoveries, including the activity of retrotransposons, the extent of copy number variations, somatic and chromosomal mosaicism, and the nature of the epigenome as a regulator of DNA expressivity, are challenging a series of dogmas concerning the nature of the genome and the relationship between genotype and phenotype. According to three widely held dogmas, DNA is the unchanging template of heredity, is identical in all the cells and tissues of the body, and is the sole agent of inheritance. Rather than being an unchanging template, DNA appears subject to a good deal of environmentally induced change. Instead of identical DNA in all the cells of the body, somatic mosaicism appears to be the normal human condition. And DNA can no longer be considered the sole agent of inheritance. We now know that the epigenome, which regulates gene expressivity, can be inherited via the germline. These developments are particularly significant for behavior genetics for at least three reasons: First, epigenetic regulation, DNA variability, and somatic mosaicism appear to be particularly prevalent in the human brain and probably are involved in much of human behavior; second, they have important implications for the validity of heritability and gene association studies, the methodologies that largely define the discipline of behavior genetics; and third, they appear to play a critical role in development during the perinatal period and, in particular, in enabling phenotypic plasticity in offspring. I examine one of the central claims to emerge from the use of heritability studies in the behavioral sciences, the principle of minimal shared maternal effects, in light of the growing awareness that the maternal perinatal environment is a critical venue for the exercise of adaptive phenotypic plasticity. This consideration has important implications for both developmental and evolutionary biology.
|
20
|
Association analysis of LCE3C-LCE3B deletion in Tunisian psoriatic population. Arch Dermatol Res 2012; 304:733-8. [PMID: 22926764 DOI: 10.1007/s00403-012-1279-4] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 06/08/2012] [Revised: 07/18/2012] [Accepted: 07/23/2012] [Indexed: 10/28/2022]
Abstract
An association between a common deletion comprising the late cornified envelope LCE3B and LCE3C genes (LCE3C_LCE3B-del) and psoriasis has been reported in Caucasian and Asian populations. To investigate whether this deletion plays a role in the genetics of psoriasis in the Tunisian population, we determined the LCE3C_LCE3B-del genotype in 180 psoriasis patients and 208 healthy controls from different regions of Tunisia. The LCE3B and LCE3C gene variant was genotyped in the patients by PCR amplification, and the data were analyzed with the SPSS software package. The frequency of the LCE3C_LCE3B-del was similar between patients and healthy controls. Subanalyses by family history revealed that the frequency of LCE3C_LCE3B-del was significantly higher in patients with a positive family history than in control individuals, and also higher in patients with a positive family history than in those without. However, no significant difference was observed between psoriatic patients with no family history and controls. We also evaluated the relationship between LCE3C_LCE3B-del and PSORS1. No significant epistatic effect between the two loci was observed in the Tunisian population. Our findings indicate that the LCE3C_LCE3B-del might play a role in familial psoriasis in the Tunisian population.
|
21
|
Beckner ME, Sampath R, Flowers AB, Katira K, D'Souza D, Patil S, Patel RB, Nordberg ML, Nanda A. Low-level amplification of oncogenes correlates inversely with age for patients with nontypical meningiomas. World Neurosurg 2011; 79:313-9.e1-10. [PMID: 22120298 DOI: 10.1016/j.wneu.2011.08.023] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 01/27/2011] [Revised: 06/20/2011] [Accepted: 08/05/2011] [Indexed: 11/17/2022]
Abstract
BACKGROUND This study sought to identify genes in nontypical meningiomas with gains in copy number (CN) that correlate with earlier age of onset, an indicator of aggressiveness. METHODS Among 94 adult patients, 91 had 105 meningiomas that were histologically confirmed. World Health Organization grades I (typical), II (atypical), and III (anaplastic) were assigned to tumors in 76, 14, and 1 patient, respectively. Brain invasion indicated that two World Health Organization grade I meningiomas were biologically atypical. DNA from 15 invasive/atypical/anaplastic meningiomas and commercial normal DNA were analyzed with multiplex ligation dependent probe amplification. The CN ratios (fold differences from normal) for 78 genes were determined. The CN ratio was defined as [tumor CN]/[normal CN] for each gene to normalize results. RESULTS Characteristic gene losses (CN ratio < 0.75) occurred in >50% of the invasive/atypical/anaplastic meningiomas at 22q11, 1p34.2, and 1p22.1 loci. Gains (CN ratio ≥ 2.0) occurred in each tumor for 2 or more of 19 genes. Each of the 19 genes' CN ratio was ≥ 2.0 in multiple tumors, and their collective sums (up to 49.1) correlated inversely with age (r = -0.72), minus an outlier. In patients ≤ 55 versus >55 years, 5 genes (BIRC2, BRAF, MET, NRAS, and PIK3CA) individually exhibited significantly higher CN ratios (P < 0.05) or a trend for them (P < 0.09), with corrections for multiple comparisons, and their sums correlated inversely with age (r = -0.74). CONCLUSIONS Low levels of amplification for selected oncogenes in invasive/atypical/anaplastic meningiomas were higher in younger adults, with the CN gains potentially underlying biological aggressiveness associated with early tumor development.
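The CN ratio definition and the two cutoffs given above translate directly into a small classifier. This is a minimal sketch: the thresholds come from the abstract, while the function names, the "neutral" label for in-between values, and the example copy numbers are assumptions for illustration.

```python
LOSS_BELOW = 0.75    # abstract: loss when CN ratio < 0.75
GAIN_AT_LEAST = 2.0  # abstract: gain when CN ratio >= 2.0

def cn_ratio(tumor_cn, normal_cn):
    """CN ratio as defined in the abstract: [tumor CN] / [normal CN]."""
    return tumor_cn / normal_cn

def classify(ratio):
    """Label a ratio; 'neutral' is an assumed name for the middle band."""
    if ratio < LOSS_BELOW:
        return "loss"
    if ratio >= GAIN_AT_LEAST:
        return "gain"
    return "neutral"

# Hypothetical per-gene copy numbers (tumor, normal), not study data.
examples = {"geneA": (1.0, 2.0), "geneB": (4.0, 2.0), "geneC": (2.2, 2.0)}
labels = {gene: classify(cn_ratio(t, n)) for gene, (t, n) in examples.items()}
print(labels)
```

Summing the ratios of genes labelled "gain" per tumor would give the kind of collective sum that the abstract correlates with patient age.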
Affiliation(s)
- Marie E Beckner
- Department of Neurology, Louisiana State University Health Sciences Center, Shreveport, Louisiana, USA.
|
22
|
Buizer-Voskamp JE, Muntjewerff JW, Strengman E, Sabatti C, Stefansson H, Vorstman JAS, Ophoff RA. Genome-wide analysis shows increased frequency of copy number variation deletions in Dutch schizophrenia patients. Biol Psychiatry 2011; 70:655-62. [PMID: 21489405 PMCID: PMC3137747 DOI: 10.1016/j.biopsych.2011.02.015] [Citation(s) in RCA: 53] [Impact Index Per Article: 4.1] [Received: 07/09/2010] [Revised: 02/09/2011] [Accepted: 02/11/2011] [Indexed: 01/30/2023]
Abstract
BACKGROUND Since 2008, multiple studies have reported on copy number variations (CNVs) in schizophrenia. However, many regions are unique events with minimal overlap between studies. This makes it difficult to gain a comprehensive overview of all CNVs involved in the etiology of schizophrenia. We performed a systematic CNV study on the basis of a homogeneous genome-wide dataset aiming at all CNVs ≥ 50 kilobase pairs. We complemented this analysis with a review of cytogenetic and chromosomal abnormalities for schizophrenia reported in the literature with the purpose of combining classical genetic findings and our current understanding of genomic variation. METHODS We investigated 834 Dutch schizophrenia patients and 672 Dutch control subjects. The CNVs were included if they were detected by QuantiSNP (http://www.well.ox.ac.uk/QuantiSNP/) as well as PennCNV (http://www.neurogenome.org/cnv/penncnv/) and contained known protein-coding genes. The integrated identification of CNV regions and cytogenetic loci indicates regions of interest (cytogenetic regions of interest [CROIs]). RESULTS In total, 2437 CNVs were identified with an average number of 2.1 CNVs/subject for both cases and control subjects. We observed significantly more deletions but not duplications in schizophrenia cases versus control subjects. The CNVs identified coincide with loci previously reported in the literature, confirming well-established schizophrenia CROIs 1q42 and 22q11.2 as well as indicating a potentially novel CROI on chromosome 5q35.1. CONCLUSIONS Chromosomal deletions are more prevalent in schizophrenia patients than in healthy subjects and therefore constitute a risk factor for the disorder. The combination of our CNV data with previously reported cytogenetic abnormalities in schizophrenia provides an overview of potentially interesting regions for positional candidate genes.
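The inclusion rule in METHODS (a CNV counts only if both callers report it) can be illustrated with a simple consensus filter. The abstract does not give the matching criterion, so this sketch assumes that any overlap between same-type calls on the same chromosome is enough; the tuple layout and example calls are hypothetical and do not reflect the output format of either tool.

```python
def same_event(a, b):
    """True if two (chrom, start, end, kind) calls overlap and agree in type.

    Assumption: any positive overlap of half-open intervals is a match.
    """
    return a[0] == b[0] and a[3] == b[3] and a[1] < b[2] and b[1] < a[2]

def consensus(calls_a, calls_b):
    """Keep calls from the first caller that the second caller supports."""
    return [a for a in calls_a if any(same_event(a, b) for b in calls_b)]

# Hypothetical calls from two callers for one subject.
caller_one = [("chr1", 100, 500, "del"), ("chr1", 1000, 1500, "dup"),
              ("chr5", 200, 800, "del")]
caller_two = [("chr1", 450, 700, "del"), ("chr1", 1000, 1500, "del"),
              ("chr5", 100, 300, "del")]
kept = consensus(caller_one, caller_two)
print(kept)
```

A stricter variant would require a minimum reciprocal overlap, and the size (≥ 50 kb) and gene-content filters described in the abstract would be applied as further steps.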
Affiliation(s)
- Jacobine E Buizer-Voskamp
- Rudolf Magnus Institute of Neuroscience, Department of Psychiatry, University Medical Center Utrecht, Heidelberglaan 100, 3584 CX Utrecht, The Netherlands, Department of Medical Genetics, University Medical Center Utrecht, Universiteitsweg 100, 3584 CG Utrecht, The Netherlands
- Jan-Willem Muntjewerff
- Department of Psychiatry, University Medical Centre St Radboud, PO Box 9101, 6500 HB Nijmegen, The Netherlands
- Eric Strengman
- Department of Medical Genetics, University Medical Center Utrecht, Universiteitsweg 100, 3584 CG Utrecht, The Netherlands
- Chiara Sabatti
- Department of Health Research and Policy, Stanford University School of Medicine, HRP Redwood Building, Stanford, CA 94305-5405, USA
- Hreinn Stefansson
- CNS Division, deCODE genetics, Sturlugata 8, IS-101 Reykjavik, Iceland
- Jacob AS Vorstman
- Rudolf Magnus Institute of Neuroscience, Department of Psychiatry, University Medical Center Utrecht, Heidelberglaan 100, 3584 CX Utrecht, The Netherlands
- Roel A Ophoff
- Rudolf Magnus Institute of Neuroscience, Department of Psychiatry, University Medical Center Utrecht, Heidelberglaan 100, 3584 CX Utrecht, The Netherlands, Department of Medical Genetics, University Medical Center Utrecht, Universiteitsweg 100, 3584 CG Utrecht, The Netherlands, Center for Neurobehavioral Genetics, University of California, 695 Charles E Young Drive South, Los Angeles, CA 90095, USA
|
23
|
Nicholas TJ, Baker C, Eichler EE, Akey JM. A high-resolution integrated map of copy number polymorphisms within and between breeds of the modern domesticated dog. BMC Genomics 2011; 12:414. [PMID: 21846351 PMCID: PMC3166287 DOI: 10.1186/1471-2164-12-414] [Citation(s) in RCA: 67] [Impact Index Per Article: 5.2] [Received: 01/10/2011] [Accepted: 08/16/2011] [Indexed: 01/22/2023] Open
Abstract
Background Structural variation contributes to the rich genetic and phenotypic diversity of the modern domestic dog, Canis lupus familiaris, although compared to other organisms, catalogs of canine copy number variants (CNVs) are poorly defined. To this end, we developed a customized high-density tiling array across the canine genome and used it to discover CNVs in nine genetically diverse dogs and a gray wolf. Results In total, we identified 403 CNVs that overlap 401 genes, which are enriched for defense/immunity, oxidoreductase, protease, receptor, signaling molecule and transporter genes. Furthermore, we performed detailed comparisons between CNVs located within versus outside of segmental duplications (SDs) and find that CNVs in SDs are enriched for gene content and complexity. Finally, we compiled all known dog CNV regions and genotyped them with a custom aCGH chip in 61 dogs from 12 diverse breeds. These data allowed us to perform the first population genetics analysis of canine structural variation and identify CNVs that potentially contribute to breed specific traits. Conclusions Our comprehensive analysis of canine CNVs will be an important resource in genetically dissecting canine phenotypic and behavioral variation.
Affiliation(s)
- Thomas J Nicholas
- Department of Genome Sciences, University of Washington, 1705 NE Pacific, Seattle, WA 98195, USA
|
24
|
Chen W, Hayward C, Wright AF, Hicks AA, Vitart V, Knott S, Wild SH, Pramstaller PP, Wilson JF, Rudan I, Porteous DJ. Copy number variation across European populations. PLoS One 2011; 6:e23087. [PMID: 21829696 PMCID: PMC3150386 DOI: 10.1371/journal.pone.0023087] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.8] [Received: 02/10/2011] [Accepted: 07/12/2011] [Indexed: 12/13/2022] Open
Abstract
Genome analysis provides a powerful approach to test for evidence of genetic variation within and between geographical regions and local populations. Copy number variants, which comprise insertions, deletions and duplications of genomic sequence, provide one such convenient and informative source. Here, we investigate copy number variants from genome-wide scans of single nucleotide polymorphisms in three European population isolates: the island of Vis in Croatia, the islands of Orkney in Scotland and South Tyrol in Italy. We show that whereas the overall copy number variant frequencies are similar between populations, their distribution is highly specific to the population of origin, a finding which is supported by evidence for increased kinship correlation for specific copy number variants within populations.
Affiliation(s)
- Wanting Chen
- Medical Genetics Section, Centre for Molecular Medicine, Institute of Genetics & Molecular Medicine, University of Edinburgh, Western General Hospital, Crewe Road South, Edinburgh, United Kingdom
- Caroline Hayward
- MRC Human Genetics Unit, Institute of Genetics and Molecular Medicine, Western General Hospital, Edinburgh, United Kingdom
- Alan F. Wright
- MRC Human Genetics Unit, Institute of Genetics and Molecular Medicine, Western General Hospital, Edinburgh, United Kingdom
- Andrew A. Hicks
- Institute of Genetic Medicine, European Academy Bozen/Bolzano (EURAC), Bolzano/Bozen, Italy - Affiliated Institute of the University of Lübeck, Lübeck, Germany
- Veronique Vitart
- MRC Human Genetics Unit, Institute of Genetics and Molecular Medicine, Western General Hospital, Edinburgh, United Kingdom
- Sara Knott
- Institute of Evolutionary Biology, University of Edinburgh, Ashworth Laboratories, King's Buildings, Edinburgh, United Kingdom
- Sarah H. Wild
- Centre for Population Health Sciences, The University of Edinburgh Medical School, Edinburgh, United Kingdom
- Peter P. Pramstaller
- Institute of Genetic Medicine, European Academy Bozen/Bolzano (EURAC), Bolzano/Bozen, Italy - Affiliated Institute of the University of Lübeck, Lübeck, Germany
- Department of Neurology, General Central Hospital, Bolzano, Italy
- Department of Neurology, University of Lübeck, Lübeck, Germany
- James F. Wilson
- Centre for Population Health Sciences, The University of Edinburgh Medical School, Edinburgh, United Kingdom
- Igor Rudan
- Centre for Population Health Sciences, The University of Edinburgh Medical School, Edinburgh, United Kingdom
- Croatian Centre for Global Health, University of Split Medical School, Split, Croatia
- David J. Porteous
- Medical Genetics Section, Centre for Molecular Medicine, Institute of Genetics & Molecular Medicine, University of Edinburgh, Western General Hospital, Crewe Road South, Edinburgh, United Kingdom
|
25
|
Johansson ACV, Feuk L. Characterization of copy number-stable regions in the human genome. Hum Mutat 2011; 32:947-55. [PMID: 21542059 DOI: 10.1002/humu.21524] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.4] [Received: 02/10/2011] [Accepted: 04/20/2011] [Indexed: 01/25/2023]
Abstract
In the past few years the number of copy number variants (CNVs) identified in the human genome has increased significantly, but our understanding of the functional impact of CNVs is still limited. Clinically significant variations cannot easily be distinguished from benign, complicating interpretation of patient data. Multiple studies have focused on analysis of regions that vary in copy number in specific disorders. Here we use the opposite strategy and focus our analysis on regions that never seem to vary in the general population, hypothesizing that these are copy number stable because variations within them are deleterious. Our results show that copy number stable regions are characterized by correlation with a number of genomic features, allowing us to define a list of genomic regions that are dosage sensitive in humans. We find that these dosage-sensitive regions show significant overlap with de novo CNVs identified in patients with intellectual disability or autism. There is also a significant association between copy number stable regions and rare inherited variants in autism patients, but not in controls. Based on this predictive power, we propose that copy number stable regions can be used to complement maps of known CNVs to facilitate interpretation of patient data.
Affiliation(s)
- Anna C V Johansson
- Department of Immunology, Genetics and Pathology, Rudbeck Laboratory, Uppsala University, Uppsala, Sweden
|
26
|
Campbell C, Sampas N, Tsalenko A, Sudmant P, Kidd J, Malig M, Vu T, Vives L, Tsang P, Bruhn L, Eichler E. Population-genetic properties of differentiated human copy-number polymorphisms. Am J Hum Genet 2011; 88:317-32. [PMID: 21397061 DOI: 10.1016/j.ajhg.2011.02.004] [Citation(s) in RCA: 64] [Impact Index Per Article: 4.9] [Received: 12/14/2010] [Revised: 02/07/2011] [Accepted: 02/15/2011] [Indexed: 01/19/2023] Open
Abstract
Copy-number variants (CNVs) can reach appreciable frequencies in the human population, and recent discoveries have shown that several of these copy-number polymorphisms (CNPs) are associated with human diseases, including lupus, psoriasis, Crohn disease, and obesity. Despite new advances, significant biases remain in terms of CNP discovery and genotyping. We developed a method based on single-channel intensity data and benchmarked against copy numbers determined from sequencing read depth to successfully obtain CNP genotypes for 1495 CNPs from 487 human DNA samples of diverse ethnic backgrounds. This microarray contained CNPs in segmental duplication-rich regions and insertions of sequences not represented in the reference genome assembly or on standard SNP microarray platforms. We observe that CNPs in segmental duplications are more likely to be population differentiated than CNPs in unique regions (p = 0.015) and that biallelic CNPs show greater stratification when compared to frequency-matched SNPs (p = 0.0026). Although biallelic CNPs show a strong correlation of copy number with flanking SNP genotypes, the majority of multicopy CNPs do not (40% with r > 0.8). We selected a subset of CNPs for further characterization in 1876 additional samples from 62 populations; this revealed striking population-differentiated structural variants in genes of clinical significance such as OCLN, a tight junction protein involved in hepatitis C viral entry. Our microarray design allows these variants to be rapidly tested for disease association and our results suggest that CNPs (especially those that cannot be imputed from SNP genotypes) might have contributed disproportionately to human diversity and selection.
|
27
|
Seroussi E, Glick G, Shirak A, Yakobson E, Weller JI, Ezra E, Zeron Y. Analysis of copy loss and gain variations in Holstein cattle autosomes using BeadChip SNPs. BMC Genomics 2010; 11:673. [PMID: 21114805 PMCID: PMC3091787 DOI: 10.1186/1471-2164-11-673] [Citation(s) in RCA: 67] [Impact Index Per Article: 4.8] [Received: 05/17/2010] [Accepted: 11/29/2010] [Indexed: 12/18/2022] Open
Abstract
Background Copy number variation (CNV) has been recently identified in human and other mammalian genomes, and there is a growing awareness of CNV's potential as a major source for heritable variation in complex traits. Genomic selection is a newly developed tool based on the estimation of breeding values for quantitative traits through the use of genome-wide genotyping of SNPs. Over 30,000 Holstein bulls have been genotyped with the Illumina BovineSNP50 BeadChip, which includes 54,001 SNPs (~SNP/50,000 bp), some of which fall within CNV regions. Results We used the BeadChip data obtained for 912 Israeli bulls to investigate the effects of CNV on SNP calls. For each of the SNPs, we estimated the frequencies of occurrence of loss of heterozygosity (LOH) and of gain, based either on deviation from the expected Hardy-Weinberg equilibrium (HWE) or on signal intensity (SI) using the PennCNV "detect" option. Correlations between LOH/CNV frequencies predicted by the two methods were low (up to r = 0.08). Nevertheless, 418 locations displayed significantly high frequencies by both methods. Efficiency of designating large genomic clusters of olfactory receptors as CNVs was 29%. Frequency values for copy loss were distinguishable in non-autosomal regions, indicating misplacement of a region in the current BTA7 map. Analysis of BTA18 placed major quantitative trait loci affecting net merit in the US Holstein population in regions rich in segmental duplications and CNVs. Enrichment of transporters in CNV loci suggested their potential effect on milk-production traits. Conclusions Expansion of HWE and PennCNV analyses allowed estimating LOH/CNV frequencies, and combining the two methods yielded more sensitive detection of inherited CNVs and better estimation of their possible effects on cattle genetics. Although this approach was more effective than methodologies previously applied in cattle, it has severe limitations. 
Thus the number of CNVs reported here for the Holstein breed may represent as little as one-tenth of inherited common structural variation.
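One of the two signals used above is deviation of SNP genotype counts from Hardy-Weinberg expectations. The abstract does not give the authors' estimator, so the sketch below only shows the textbook chi-square statistic for a biallelic SNP; the function name and the genotype counts are hypothetical.

```python
def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square statistic for deviation from Hardy-Weinberg proportions.

    Assumes both alleles are observed, so no expected count is zero.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical counts: the second SNP shows a heterozygote deficit,
# the pattern a segregating deletion could produce.
in_equilibrium = hwe_chi_square(25, 50, 25)
het_deficit = hwe_chi_square(40, 20, 40)
print(in_equilibrium, het_deficit)
```

A large statistic flags a SNP as a candidate only; as the abstract notes, agreement between this signal and intensity-based calls was low, so neither is sufficient alone.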
Affiliation(s)
- Eyal Seroussi
- Institute of Animal Sciences, ARO, The Volcani Center, Bet Dagan 50250, Israel.
|
28
|
Boone PM, Bacino CA, Shaw CA, Eng PA, Hixson PM, Pursley AN, Kang SHL, Yang Y, Wiszniewska J, Nowakowska BA, del Gaudio D, Xia Z, Simpson-Patel G, Immken LL, Gibson JB, Tsai ACH, Bowers JA, Reimschisel TE, Schaaf CP, Potocki L, Scaglia F, Gambin T, Sykulski M, Bartnik M, Derwinska K, Wisniowiecka-Kowalnik B, Lalani SR, Probst FJ, Bi W, Beaudet AL, Patel A, Lupski JR, Cheung SW, Stankiewicz P. Detection of clinically relevant exonic copy-number changes by array CGH. Hum Mutat 2010; 31:1326-42. [PMID: 20848651 DOI: 10.1002/humu.21360] [Citation(s) in RCA: 201] [Impact Index Per Article: 14.4] [Received: 06/15/2010] [Accepted: 09/02/2010] [Indexed: 12/22/2022]
Abstract
Array comparative genomic hybridization (aCGH) is a powerful tool for the molecular elucidation and diagnosis of disorders resulting from genomic copy-number variation (CNV). However, intragenic deletions or duplications--those including genomic intervals of a size smaller than a gene--have remained beyond the detection limit of most clinical aCGH analyses. Increasing array probe number improves genomic resolution, although higher cost may limit implementation, and enhanced detection of benign CNV can confound clinical interpretation. We designed an array with exonic coverage of selected disease and candidate genes and used it clinically to identify losses or gains throughout the genome involving at least one exon and as small as several hundred base pairs in size. In some patients, the detected copy-number change occurs within a gene known to be causative of the observed clinical phenotype, demonstrating the ability of this array to detect clinically relevant CNVs with subkilobase resolution. In summary, we demonstrate the utility of a custom-designed, exon-targeted oligonucleotide array to detect intragenic copy-number changes in patients with various clinical phenotypes.
Affiliation(s)
- Philip M Boone
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas 77030, USA
|
29
|
Neill NJ, Torchia BS, Bejjani BA, Shaffer LG, Ballif BC. Comparative analysis of copy number detection by whole-genome BAC and oligonucleotide array CGH. Mol Cytogenet 2010; 3:11. [PMID: 20587050 PMCID: PMC2909945 DOI: 10.1186/1755-8166-3-11] [Citation(s) in RCA: 46] [Impact Index Per Article: 3.3] [Received: 03/16/2010] [Accepted: 06/29/2010] [Indexed: 12/11/2022] Open
Abstract
Background Microarray-based comparative genomic hybridization (aCGH) is a powerful diagnostic tool for the detection of DNA copy number gains and losses associated with chromosome abnormalities, many of which are below the resolution of conventional chromosome analysis. It has been presumed that whole-genome oligonucleotide (oligo) arrays identify more clinically significant copy-number abnormalities than whole-genome bacterial artificial chromosome (BAC) arrays, yet this has not been systematically studied in a clinical diagnostic setting. Results To determine the difference in detection rate between similarly designed BAC and oligo arrays, we developed whole-genome BAC and oligonucleotide microarrays and validated them in a side-by-side comparison of 466 consecutive clinical specimens submitted to our laboratory for aCGH. Of the 466 cases studied, 67 (14.3%) had a copy-number imbalance of potential clinical significance detectable by the whole-genome BAC array, and 73 (15.6%) had a copy-number imbalance of potential clinical significance detectable by the whole-genome oligo array. However, because both platforms identified copy number variants of unclear clinical significance, we designed a systematic method for the interpretation of copy number alterations and tested an additional 3,443 cases by BAC array and 3,096 cases by oligo array. Of those cases tested on the BAC array, 17.6% were found to have a copy-number abnormality of potential clinical significance, whereas the detection rate increased to 22.5% for the cases tested by oligo array. In addition, we validated the oligo array for detection of mosaicism and found that it could routinely detect mosaicism at levels of 30% and greater. Conclusions Although BAC arrays have faster turnaround times, the increased detection rate of oligo arrays makes them attractive for clinical cytogenetic testing.
|
30
|
Hehir-Kwa JY, Wieskamp N, Webber C, Pfundt R, Brunner HG, Gilissen C, de Vries BBA, Ponting CP, Veltman JA. Accurate distinction of pathogenic from benign CNVs in mental retardation. PLoS Comput Biol 2010; 6:e1000752. [PMID: 20421931 PMCID: PMC2858682 DOI: 10.1371/journal.pcbi.1000752] [Citation(s) in RCA: 46] [Impact Index Per Article: 3.3] [Received: 11/03/2009] [Accepted: 03/19/2010] [Indexed: 11/18/2022] Open
Abstract
Copy number variants (CNVs) have recently been recognized as a common form of genomic variation in humans. Hundreds of CNVs can be detected in any individual genome using genomic microarrays or whole genome sequencing technology, but their phenotypic consequences are still poorly understood. Rare CNVs have been reported as a frequent cause of neurological disorders such as mental retardation (MR), schizophrenia and autism, prompting widespread implementation of CNV screening in diagnostics. In previous studies we have shown that, in contrast to benign CNVs, MR-associated CNVs are significantly enriched in genes whose mouse orthologues, when disrupted, result in a nervous system phenotype. In this study we developed and validated a novel computational method for differentiating between benign and MR-associated CNVs using structural and functional genomic features to annotate each CNV. In total 13 genomic features were included in the final version of a Naïve Bayesian Tree classifier, with LINE density and mouse knock-out phenotypes contributing most to the classifier's accuracy. After demonstrating that our method (called GECCO) perfectly classifies CNVs causing known MR-associated syndromes, we show that it achieves high accuracy (94%) and negative predictive value (99%) on a blinded test set of more than 1,200 CNVs from a large cohort of individuals with MR. These results indicate that this classification method will be of value for objectively prioritizing CNVs in clinical research and diagnostics.
Rare copy number variants (CNVs) are a frequent cause of neurological disorders such as mental retardation (MR). However, CNVs are also commonly identified in healthy individuals. It is therefore crucial for both diagnostic and research applications to be able to distinguish between disease-causing CNVs and “benign” CNVs occurring as normal genomic variation. Separating these two types can take advantage of significant differences in their genomic contents. For example, benign CNVs are enriched in repetitive sequences. By contrast, CNVs associated with MR tend to have high densities of functional elements, including genes whose mouse orthologues, when knocked-out, lead to specific nervous system abnormalities. We have developed a novel objective approach that is effective in distinguishing MR-associated CNVs from benign CNVs based on the presence of 13 genomic attributes. This method is able to achieve high accuracies in a cohort of CNVs known to cause MR and in a cohort of individuals with unexplained MR. The development of this technique promises to substantially improve the methodology for determining the pathogenicity of CNVs.
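The two headline metrics quoted in this abstract, accuracy and negative predictive value (NPV), can be made concrete with a short sketch. It is not the GECCO classifier and uses no data from the study: the class name `ConfusionCounts`, the helper functions, and the toy counts are all assumptions introduced only to show how the two quantities are conventionally computed from a confusion matrix.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts from comparing predicted CNV labels with known labels.

    "Positive" here means "MR-associated", "negative" means "benign".
    """
    tp: int  # MR-associated CNVs predicted MR-associated
    fp: int  # benign CNVs predicted MR-associated
    tn: int  # benign CNVs predicted benign
    fn: int  # MR-associated CNVs predicted benign


def accuracy(c: ConfusionCounts) -> float:
    """Fraction of all CNVs that were labelled correctly."""
    return (c.tp + c.tn) / (c.tp + c.fp + c.tn + c.fn)


def negative_predictive_value(c: ConfusionCounts) -> float:
    """Fraction of CNVs predicted benign that really are benign."""
    return c.tn / (c.tn + c.fn)


# Toy counts chosen for illustration only (hypothetical, not study data).
toy = ConfusionCounts(tp=8, fp=2, tn=85, fn=5)
print(f"accuracy = {accuracy(toy):.3f}")
print(f"NPV      = {negative_predictive_value(toy):.3f}")
```

A high NPV matters for prioritization because it means a CNV called benign can be set aside with little risk of discarding a causal variant, even when the overall accuracy is lower.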
Affiliation(s)
- Jayne Y. Hehir-Kwa
  - Radboud University Nijmegen Medical Centre, Department of Human Genetics, Nijmegen, The Netherlands
- Nienke Wieskamp
  - Radboud University Nijmegen Medical Centre, Department of Human Genetics, Nijmegen, The Netherlands
- Caleb Webber
  - MRC Functional Genomics Unit, University of Oxford, Department of Physiology, Anatomy and Genetics, Oxford, United Kingdom
- Rolph Pfundt
  - Radboud University Nijmegen Medical Centre, Department of Human Genetics, Nijmegen, The Netherlands
- Han G. Brunner
  - Radboud University Nijmegen Medical Centre, Department of Human Genetics, Nijmegen, The Netherlands
- Christian Gilissen
  - Radboud University Nijmegen Medical Centre, Department of Human Genetics, Nijmegen, The Netherlands
- Bert B. A. de Vries
  - Radboud University Nijmegen Medical Centre, Department of Human Genetics, Nijmegen, The Netherlands
- Chris P. Ponting
  - MRC Functional Genomics Unit, University of Oxford, Department of Physiology, Anatomy and Genetics, Oxford, United Kingdom
- Joris A. Veltman
  - Radboud University Nijmegen Medical Centre, Department of Human Genetics, Nijmegen, The Netherlands
|
31
|
Evolution in health and medicine Sackler colloquium: Genomic disorders: a window into human gene and genome evolution. Proc Natl Acad Sci U S A 2010; 107 Suppl 1:1765-71. [PMID: 20080665 DOI: 10.1073/pnas.0906222107] [Citation(s) in RCA: 52] [Impact Index Per Article: 3.7] [Indexed: 12/13/2022] Open
Abstract
Gene duplications alter the genetic constitution of organisms and can be a driving force of molecular evolution in humans and the great apes. In this context, the study of genomic disorders has uncovered the essential role played by the genomic architecture, especially low copy repeats (LCRs) or segmental duplications (SDs). In fact, regardless of the mechanism, LCRs can mediate or stimulate rearrangements, inciting genomic instability and generating dynamic and unstable regions prone to rapid molecular evolution. In humans, copy-number variation (CNV) has been implicated in common traits such as neuropathy, hypertension, color blindness, infertility, and behavioral traits including autism and schizophrenia, as well as disease susceptibility to HIV, lupus nephritis, and psoriasis among many other clinical phenotypes. The same mechanisms implicated in the origin of genomic disorders may also play a role in the emergence of segmental duplications and the evolution of new genes by means of genomic and gene duplication and triplication, exon shuffling, exon accretion, and fusion/fission events.
|
32
|
The evolution of human segmental duplications and the core duplicon hypothesis. Cold Spring Harbor Symposia on Quantitative Biology 2009; 74:355-62. [PMID: 19717539 DOI: 10.1101/sqb.2009.74.011] [Citation(s) in RCA: 44] [Impact Index Per Article: 2.9] [Indexed: 01/17/2023]
Abstract
Duplicated sequences are important sources of genetic instability and contribute to the evolution of new gene function within species. Hominids have a preponderance of intrachromosomal duplications organized in an interspersed fashion, as opposed to tandem duplications, which are common in other mammalian genomes such as mouse, dog, and cow. Multiple lines of evidence, including sequence divergence, comparative primate genomes, and fluorescence in situ hybridization (FISH) analyses, point to an excess of segmental duplications in the common ancestor of humans and African great apes. We find that much of the interspersed human duplication architecture within chromosomes is focused around common sequence elements referred to as "core duplicons." These cores correspond to the expansion of gene families, some of which show signatures of positive selection and lack orthologs present in other mammalian species. This genomic architecture predisposes apes and humans not only to extensive genetic diversity, but also to large-scale structural diversity mediated by nonallelic homologous recombination. In humans, many de novo large-scale genomic changes mediated by these duplications are associated with neuropsychiatric and neurodevelopmental disease. We propose that the disadvantage of a high rate of new mutations is offset by the selective advantage of newly minted genes within the cores.
|
33
|
Takezaki N, Nei M. Genomic drift and evolution of microsatellite DNAs in human populations. Mol Biol Evol 2009; 26:1835-40. [PMID: 19406937 DOI: 10.1093/molbev/msp091] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Indexed: 11/12/2022] Open
Abstract
In recent years, copy number variation (CNV) of DNA segments has become a hot topic in the study of genetic variation, and a large number of CNVs have been uncovered in human populations. The CNVs involving the smallest units of DNA segments are microsatellite DNAs, and the evolutionary change of microsatellite DNAs is believed to occur mostly by the increase or decrease of one repeat unit at a time in a more or less neutral fashion. If we note that eukaryotic genomes contain millions of microsatellite loci, this pattern of nucleotide change is expected to generate random changes of genome size, that is, genomic drift, and will provide a neutral model of CNV evolution. We therefore investigated the amount of variation of the total number of repeats (TNR) per individual across 145 microsatellite loci in three human populations, Africans, Europeans, and Asians. It was shown that the TNR follows the normal distribution in all three populations and that the extent of variation of TNR is more than 50% greater in Africans than in Europeans and Asians, as expected from the hypothesis of African origin of modern humans. If we consider all microsatellite loci in the human genome and compute the variation of the total number of nucleotides involved (TNN), it is possible to study the contribution of microsatellite loci to the genome size variation. This study has shown that the genome sizes of human individuals are affected considerably by genomic drift of microsatellite DNA alone. This pattern of evolution is similar to that of olfactory receptor (OR) genes previously studied in human populations and supports the idea that the number of OR genes has evolved in a more or less neutral fashion. However, this conclusion does not necessarily apply to the genomewide CNVs of various DNA segments, and it appears that long variant DNA fragments are deleterious and under purifying selection.
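The total number of repeats (TNR) statistic described in this abstract can be sketched in a few lines. The sketch assumes that each individual is represented simply as a list of repeat counts over the typed loci and that the spread of TNR is summarized by the sample variance; the function names and the toy genotypes are hypothetical, and none of the numbers come from the study.

```python
from statistics import mean, variance


def total_number_of_repeats(repeat_counts):
    """Sum one individual's repeat counts over all typed loci (TNR)."""
    return sum(repeat_counts)


def tnr_summary(population):
    """Mean and sample variance of TNR across the individuals of a population."""
    tnrs = [total_number_of_repeats(individual) for individual in population]
    return {"mean": mean(tnrs), "variance": variance(tnrs)}


# Toy genotypes (hypothetical): each inner list holds one individual's
# repeat counts at three loci. Real data would cover many more loci.
toy_populations = {
    "population_A": [[10, 12, 8], [11, 9, 14], [7, 13, 10], [12, 12, 13]],
    "population_B": [[10, 11, 9], [10, 12, 9], [9, 11, 10], [11, 11, 10]],
}

for name, individuals in toy_populations.items():
    summary = tnr_summary(individuals)
    print(f"{name}: mean TNR = {summary['mean']:.2f}, "
          f"variance = {summary['variance']:.2f}")
```

Comparing the per-population variances is the kind of contrast the abstract reports when it says the extent of TNR variation is greater in one population than in the others.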
|