1. Lindstedt C, Bagley R, Calhim S, Jones M, Linnen C. The impact of life stage and pigment source on the evolution of novel warning signal traits. Evolution 2022; 76:554-572. [PMID: 35103303 PMCID: PMC9304160 DOI: 10.1111/evo.14443] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 04/03/2021] [Accepted: 12/06/2021] [Indexed: 11/28/2022]
Abstract
Our understanding of how novel warning color traits evolve in natural populations is largely based on studies of reproductive stages and organisms with endogenously produced pigmentation. In these systems, genetic drift is often required for novel alleles to overcome strong purifying selection stemming from frequency‐dependent predation and positive assortative mating. Here, we integrate data from field surveys, predation experiments, population genomics, and phenotypic correlations to explain the origin and maintenance of geographic variation in a diet‐based larval pigmentation trait in the redheaded pine sawfly (Neodiprion lecontei), a pine‐feeding hymenopteran. Although our experiments confirm that N. lecontei larvae are indeed aposematic—and therefore likely to experience frequency‐dependent predation—our genomic data do not support a historical demographic scenario that would have facilitated the spread of an initially deleterious allele via drift. Additionally, significantly elevated differentiation at a known color locus suggests that geographic variation in larval color is currently maintained by selection. Together, these data suggest that the novel white morph likely spread via selection. However, white body color does not enhance aposematic displays, nor is it correlated with enhanced chemical defense or immune function. Instead, the derived white‐bodied morph is disproportionately abundant on a pine species with a reduced carotenoid content relative to other pine hosts, suggesting that bottom‐up selection via host plants may have driven divergence among populations. Overall, our results suggest that life stage and pigment source can have a substantial impact on the evolution of novel warning signals, highlighting the need to investigate diverse aposematic taxa to develop a comprehensive understanding of color variation in nature.
Affiliation(s)
- Carita Lindstedt
  - Department of Biological and Environmental Sciences, University of Jyväskylä, Finland
- Robin Bagley
  - Department of Biology, University of Kentucky, Lexington, Kentucky, 40506, USA
  - Department of Evolution, Ecology, and Organismal Biology, The Ohio State University at Lima, Lima, OH, 45804, USA
- Sara Calhim
  - Department of Biological and Environmental Sciences, University of Jyväskylä, Finland
- Mackenzie Jones
  - Department of Biology, University of Kentucky, Lexington, Kentucky, 40506, USA
- Catherine Linnen
  - Department of Biology, University of Kentucky, Lexington, Kentucky, 40506, USA
2. Shin W, Mun S, Kim J, Lee W, Park DG, Choi S, Lee TY, Cha S, Han K. Novel Discovery of LINE-1 in a Korean Individual by a Target Enrichment Method. Mol Cells 2019; 42:87-95. [PMID: 30699287 PMCID: PMC6354063 DOI: 10.14348/molcells.2018.0351] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 08/17/2018] [Revised: 10/10/2018] [Accepted: 10/26/2018] [Indexed: 11/27/2022] Open
Abstract
Long interspersed element-1 (LINE-1 or L1) is an autonomous retrotransposon, which is capable of inserting into a new region of genome. Previous studies have reported that these elements lead to genomic variations and altered functions by affecting gene expression and genetic networks. Mounting evidence strongly indicates that genetic diseases or various cancers can occur as a result of retrotransposition events that involve L1s. Therefore, the development of methodologies to study the structural variations and interpersonal insertion polymorphisms by L1 element-associated changes in an individual genome is invaluable. In this study, we applied a systematic approach to identify human-specific L1s (i.e., L1Hs) through the bioinformatics analysis of high-throughput next-generation sequencing data. We identified 525 candidates that could be inferred to carry non-reference L1Hs in a Korean individual genome (KPGP9). Among them, we randomly selected 40 candidates and validated that approximately 92.5% of non-reference L1Hs were inserted into a KPGP9 genome. In addition, unlike conventional methods, our relatively simple and expedited approach was highly reproducible in confirming the L1 insertions. Taken together, our findings strongly support that the identification of non-reference L1Hs by our novel target enrichment method demonstrates its future application to genomic variation studies on the risk of cancer and genetic disorders.
Affiliation(s)
- Wonseok Shin
  - Department of Nanobiomedical Science & BK21 PLUS NBM Global Research Center for Regenerative Medicine, Dankook University, Cheonan 31116, Korea
- Seyoung Mun
  - Department of Nanobiomedical Science & BK21 PLUS NBM Global Research Center for Regenerative Medicine, Dankook University, Cheonan 31116, Korea
- Junse Kim
  - Department of Nanobiomedical Science & BK21 PLUS NBM Global Research Center for Regenerative Medicine, Dankook University, Cheonan 31116, Korea
- Wooseok Lee
  - Department of Nanobiomedical Science & BK21 PLUS NBM Global Research Center for Regenerative Medicine, Dankook University, Cheonan 31116, Korea
- Dong-Guk Park
  - Department of Surgery, Dankook University College of Medicine, Cheonan 31116, Korea
- Seungkyu Choi
  - Department of Pathology, Dankook University College of Medicine, Cheonan 31116, Korea
- Tae Yoon Lee
  - Department of Technology Education and Department of Biomedical Engineering, Chungnam National University, Daejeon 34134, Korea
- Seunghee Cha
  - Department of Oral and Maxillofacial Diagnostic Sciences, University of Florida College of Dentistry, Gainesville, FL 32610, USA
- Kyudong Han
  - Department of Nanobiomedical Science & BK21 PLUS NBM Global Research Center for Regenerative Medicine, Dankook University, Cheonan 31116, Korea
3. Castéra L, Harter V, Muller E, Krieger S, Goardon N, Ricou A, Rousselin A, Paimparay G, Legros A, Bruet O, Quesnelle C, Domin F, San C, Brault B, Fouillet R, Abadie C, Béra O, Berthet P, Frébourg T, Vaur D; French Exome Project Consortium. Landscape of pathogenic variations in a panel of 34 genes and cancer risk estimation from 5131 HBOC families. Genet Med 2018; 20:1677-86. [PMID: 29988077 DOI: 10.1038/s41436-018-0005-9] [Citation(s) in RCA: 47] [Impact Index Per Article: 7.8] [Received: 09/06/2017] [Accepted: 03/20/2018] [Indexed: 12/25/2022] Open
Abstract
PURPOSE: Integration of gene panels in the diagnosis of hereditary breast and ovarian cancer (HBOC) requires a careful evaluation of the risk associated with pathogenic or likely pathogenic variants (PVs) detected in each gene. Here we analyzed 34 genes in 5131 suspected HBOC index cases by next-generation sequencing. METHODS: Using the Exome Aggregation Consortium data sets plus 571 individuals from the French Exome Project, we simulated the probability that an individual from the Exome Aggregation Consortium carries a PV and compared it to the estimated frequency within the HBOC population. RESULTS: Odds ratio conferred by PVs within BRCA1, BRCA2, PALB2, RAD51C, RAD51D, ATM, BRIP1, CHEK2, and MSH6 were estimated at 13.22 [10.01-17.22], 8.61 [6.78-10.82], 8.22 [4.91-13.05], 4.54 [2.55-7.48], 5.23 [1.46-13.17], 3.20 [2.14-4.53], 2.49 [1.42-3.97], 1.67 [1.18-2.27], and 2.50 [1.12-4.67], respectively. PVs within RAD51C, RAD51D, and BRIP1 were associated with ovarian cancer family history (OR = 11.36 [5.78-19.59], 12.44 [2.94-33.30] and 3.82 [1.66-7.11]). PALB2 PVs were associated with bilateral breast cancer (OR = 16.17 [5.48-34.10]) and BARD1 PVs with triple-negative breast cancer (OR = 11.27 [3.37-25.01]). Burden tests performed in both patients and the French Exome Project population confirmed the association of PVs of BRCA1, BRCA2, PALB2, and RAD51C with HBOC. CONCLUSION: Our results validate the integration of PALB2, RAD51C, and RAD51D in the diagnosis of HBOC and suggest that the other genes are involved in an oligogenic determinism.
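For readers unfamiliar with the bracketed figures, the sketch below shows how an odds ratio and an approximate 95% interval can be computed from a simple 2x2 carrier table. It is a generic textbook (Wald) calculation, not the simulation-based procedure the abstract describes, and the function name and all counts are hypothetical values chosen only for illustration.

```python
import math


def odds_ratio_wald_ci(case_carriers, case_noncarriers,
                       control_carriers, control_noncarriers, z=1.96):
    """Return (odds ratio, lower bound, upper bound) for a 2x2 table.

    Uses the Wald interval on log(OR). This is a generic illustration,
    not the simulation approach used in the cited study.
    """
    a, b, c, d = (case_carriers, case_noncarriers,
                  control_carriers, control_noncarriers)
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for the Wald interval")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) is the root of the summed reciprocal counts.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts: 60 carriers among 5131 cases, 10 among 7010 controls.
or_, lo, hi = odds_ratio_wald_ci(60, 5071, 10, 7000)
print(f"OR = {or_:.2f} [{lo:.2f}-{hi:.2f}]")
```

An interval whose lower bound stays above 1 corresponds to the kind of association the abstract reports for genes such as BRCA1 and PALB2.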
4. Muniz FL, Campos Z, Hernández Rangel SM, Martínez JG, Souza BC, De Thoisy B, Botero-Arias R, Hrbek T, Farias IP. Delimitation of evolutionary units in Cuvier’s dwarf caiman, Paleosuchus palpebrosus (Cuvier, 1807): insights from conservation of a broadly distributed species. Conserv Genet 2018; 19:599-610. [DOI: 10.1007/s10592-017-1035-6] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.3] [Indexed: 11/26/2022]
5. Chanroj V, Rattanawong R, Phumichai T, Tangphatsornruang S, Ukoskit K. Genome-wide association mapping of latex yield and girth in Amazonian accessions of Hevea brasiliensis grown in a suboptimal climate zone. Genomics 2017; 109:475-484. [PMID: 28751185 DOI: 10.1016/j.ygeno.2017.07.005] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Received: 05/22/2017] [Revised: 07/07/2017] [Accepted: 07/21/2017] [Indexed: 12/29/2022]
Abstract
Latex yield and growth are the key complex traits in commercial rubber production. The present study is the first to report genome-wide association mapping of latex yield and girth, for 170 Amazonian accessions grown in a suboptimal area characterized by limited rainfall and a lengthy dry season. Targeted sequence enrichment to capture gene transcripts generated 14,155 high quality filtered single nucleotide polymorphisms (SNPs) of which 94.3% resided in coding regions. The rapid decay of linkage disequilibrium over physical and genetic distance found in the accessions was comparable to those previously reported for several outcrossing species. A mixed linear model detected three significant SNPs in three candidate genes involved in plant adaptation to drought stress, individually explaining 12.7-15.7% of the phenotypic variance. The SNPs identified in the study will help to extend understanding, and to support genetic improvement of rubber trees grown in drought-affected regions.
Affiliation(s)
- Vipavee Chanroj
  - Department of Biotechnology, Faculty of Science and Technology, Thammasat University, Rangsit Campus, Klong Luang, Pathumtani 12121, Thailand
- Ratchanee Rattanawong
  - Nong Khai Rubber Research Center, Rubber Research Institute of Thailand, Rattanawapi District, Nong Khai, 43120, Thailand
- Sithichoke Tangphatsornruang
  - National Center for Genetic Engineering and Biotechnology, 113 Phaholyothin Rd., Klong 1, Klong Luang, Pathumthani 12120, Thailand
- Kittipat Ukoskit
  - Department of Biotechnology, Faculty of Science and Technology, Thammasat University, Rangsit Campus, Klong Luang, Pathumtani 12121, Thailand
6. Ahmadloo S, Nakaoka H, Hayano T, Hosomichi K, You H, Utsuno E, Sangai T, Nishimura M, Matsushita K, Hata A, Nomura F, Inoue I. Rapid and cost-effective high-throughput sequencing for identification of germline mutations of BRCA1 and BRCA2. J Hum Genet 2017; 62:561-7. [DOI: 10.1038/jhg.2017.5] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Received: 09/01/2016] [Revised: 10/26/2016] [Accepted: 12/05/2016] [Indexed: 12/30/2022]
7. Schott RK, Panesar B, Card DC, Preston M, Castoe TA, Chang BS. Targeted Capture of Complete Coding Regions across Divergent Species. Genome Biol Evol 2017; 9:398-414. [PMID: 28137744 PMCID: PMC5381602 DOI: 10.1093/gbe/evx005] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Accepted: 01/18/2017] [Indexed: 02/06/2023] Open
Abstract
Despite continued advances in sequencing technologies, there is a need for methods that can efficiently sequence large numbers of genes from diverse species. One approach to accomplish this is targeted capture (hybrid enrichment). While these methods are well established for genome resequencing projects, cross-species capture strategies are still being developed and generally focus on the capture of conserved regions, rather than complete coding regions from specific genes of interest. The resulting data is thus useful for phylogenetic studies, but the wealth of comparative data that could be used for evolutionary and functional studies is lost. Here, we design and implement a targeted capture method that enables recovery of complete coding regions across broad taxonomic scales. Capture probes were designed from multiple reference species and extensively tiled in order to facilitate cross-species capture. Using novel bioinformatics pipelines we were able to recover nearly all of the targeted genes with high completeness from species that were up to 200 myr divergent. Increased probe diversity and tiling for a subset of genes had a large positive effect on both recovery and completeness. The resulting data produced an accurate species tree, but importantly this same data can also be applied to studies of molecular evolution and function that will allow researchers to ask larger questions in broader phylogenetic contexts. Our method demonstrates the utility of cross-species approaches for the capture of full length coding sequences, and will substantially improve the ability for researchers to conduct large-scale comparative studies of molecular evolution and function.
Affiliation(s)
- Ryan K. Schott
  - Department of Ecology and Evolutionary Biology, University of Toronto, Ontario, Canada
- Bhawandeep Panesar
  - Department of Cell and Systems Biology, University of Toronto, Ontario, Canada
- Daren C. Card
  - Department of Biology, University of Texas at Arlington, Arlington, TX
- Matthew Preston
  - Department of Cell and Systems Biology, University of Toronto, Ontario, Canada
- Todd A. Castoe
  - Department of Biology, University of Texas at Arlington, Arlington, TX
- Belinda S.W. Chang
  - Department of Ecology and Evolutionary Biology, University of Toronto, Ontario, Canada
  - Department of Cell and Systems Biology, University of Toronto, Ontario, Canada
  - Centre for the Analysis of Genomes and Function, University of Toronto, Canada
8. Bagley RK, Sousa VC, Niemiller ML, Linnen CR. History, geography and host use shape genomewide patterns of genetic variation in the redheaded pine sawfly (Neodiprion lecontei). Mol Ecol 2017; 26:1022-1044. [DOI: 10.1111/mec.13972] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.7] [Received: 07/13/2015] [Revised: 11/10/2016] [Accepted: 12/01/2016] [Indexed: 01/03/2023]
Affiliation(s)
- Robin K. Bagley
  - Department of Biology, University of Kentucky, Lexington, KY 40506, USA
- Vitor C. Sousa
  - cE3c - Centre for Ecology, Evolution and Environmental Changes, Faculdade de Ciências, Universidade de Lisboa, 1749-016 Lisboa, Portugal
- Matthew L. Niemiller
  - Illinois Natural History Survey, Prairie Research Institute, University of Illinois Urbana-Champaign, Champaign, IL 61820, USA
9. Fraïsse C, Belkhir K, Welch JJ, Bierne N. Local interspecies introgression is the main cause of extreme levels of intraspecific differentiation in mussels. Mol Ecol 2015; 25:269-86. [DOI: 10.1111/mec.13299] [Citation(s) in RCA: 78] [Impact Index Per Article: 8.7] [Received: 03/31/2015] [Revised: 06/19/2015] [Accepted: 06/19/2015] [Indexed: 12/15/2022]
Affiliation(s)
- Christelle Fraïsse
  - Institut des Sciences de l'Evolution (UMR 5554); CNRS - Université Montpellier; Place Eugène Bataillon 34095 Montpellier France
  - Station Marine; Université Montpellier; 2 rue des Chantiers 34200 Sète France
  - Department of Genetics; University of Cambridge; Downing Street CB2 3EH Cambridge UK
- Khalid Belkhir
  - Institut des Sciences de l'Evolution (UMR 5554); CNRS - Université Montpellier; Place Eugène Bataillon 34095 Montpellier France
- John J. Welch
  - Department of Genetics; University of Cambridge; Downing Street CB2 3EH Cambridge UK
- Nicolas Bierne
  - Institut des Sciences de l'Evolution (UMR 5554); CNRS - Université Montpellier; Place Eugène Bataillon 34095 Montpellier France
  - Station Marine; Université Montpellier; 2 rue des Chantiers 34200 Sète France
10. Dasgupta MG, Dharanishanthi V, Agarwal I, Krutovsky KV. Development of genetic markers in Eucalyptus species by target enrichment and exome sequencing. PLoS One 2015; 10:e0116528. [PMID: 25602379 PMCID: PMC4300219 DOI: 10.1371/journal.pone.0116528] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.8] [Received: 07/31/2014] [Accepted: 12/08/2014] [Indexed: 02/02/2023] Open
Abstract
The advent of next-generation sequencing has facilitated large-scale discovery, validation and assessment of genetic markers for high density genotyping. The present study was undertaken to identify markers in genes supposedly related to wood property traits in three Eucalyptus species. Ninety four genes involved in xylogenesis were selected for hybridization probe based nuclear genomic DNA target enrichment and exome sequencing. Genomic DNA was isolated from the leaf tissues and used for on-array probe hybridization followed by Illumina sequencing. The raw sequence reads were trimmed and high-quality reads were mapped to the E. grandis reference sequence and the presence of single nucleotide variants (SNVs) and insertions/deletions (InDels) were identified across the three species. The average read coverage was 216X and a total of 2294 SNVs and 479 InDels were discovered in E. camaldulensis, 2383 SNVs and 518 InDels in E. tereticornis, and 1228 SNVs and 409 InDels in E. grandis. Additionally, SNV calling and InDel detection were conducted in pair-wise comparisons of E. tereticornis vs. E. grandis, E. camaldulensis vs. E. tereticornis and E. camaldulensis vs. E. grandis. This study presents an efficient and high throughput method on development of genetic markers for family-based QTL and association analysis in Eucalyptus.
Affiliation(s)
- Modhumita Ghosh Dasgupta
  - Division of Plant Biotechnology, Institute of Forest Genetics and Tree Breeding, P.B. No. 1061, R.S. Puram, Coimbatore–641002, India
- Veeramuthu Dharanishanthi
  - Division of Plant Biotechnology, Institute of Forest Genetics and Tree Breeding, P.B. No. 1061, R.S. Puram, Coimbatore–641002, India
- Ishangi Agarwal
  - Genotypic Technology Private Limited, #2/13, Balaji Complex, Poojari Layout, 80, Feet Road, R. M. V. 2nd Stage, Bangalore-560094, India
- Konstantin V. Krutovsky
  - Department of Forest Genetics and Forest Tree Breeding, Büsgen Institute, Georg August University of Göttingen, Büsgenweg 2, D-37077 Göttingen, Germany
  - Department of Ecosystem Science and Management, Texas A&M University, 2138 TAMU, College Station, TX 77843-2138, United States of America
  - N.I. Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow 119333, Russia
  - Genome Research and Education Center, Siberian Federal University, 50a/2 Akademgorodok, Krasnoyarsk 660036, Russia
11. Scheible M, Loreille O, Just R, Irwin J. Short tandem repeat typing on the 454 platform: Strategies and considerations for targeted sequencing of common forensic markers. Forensic Sci Int Genet 2014; 12:107-19. [DOI: 10.1016/j.fsigen.2014.04.010] [Citation(s) in RCA: 63] [Impact Index Per Article: 6.3] [Received: 12/06/2013] [Revised: 03/12/2014] [Accepted: 04/22/2014] [Indexed: 01/05/2023]
12. Kenny EM, Cormican P, Furlong S, Heron E, Kenny G, Fahey C, Kelleher E, Ennis S, Tropea D, Anney R, Corvin AP, Donohoe G, Gallagher L, Gill M, Morris DW. Excess of rare novel loss-of-function variants in synaptic genes in schizophrenia and autism spectrum disorders. Mol Psychiatry 2014; 19:872-9. [PMID: 24126926 DOI: 10.1038/mp.2013.127] [Citation(s) in RCA: 144] [Impact Index Per Article: 14.4] [Received: 11/13/2012] [Revised: 08/02/2013] [Accepted: 08/08/2013] [Indexed: 02/03/2023]
Abstract
Schizophrenia (SZ) and autism spectrum disorders (ASDs) are complex neurodevelopmental disorders that may share an underlying pathology suggested by shared genetic risk variants. We sequenced the exonic regions of 215 genes in 147 ASD cases, 273 SZ cases and 287 controls, to identify rare risk mutations. Genes were primarily selected for their function in the synapse and were categorized as: (1) Neurexin and Neuroligin Interacting Proteins, (2) Post-synaptic Glutamate Receptor Complexes, (3) Neural Cell Adhesion Molecules, (4) DISC1 and Interactors and (5) Functional and Positional Candidates. Thirty-one novel loss-of-function (LoF) variants that are predicted to severely disrupt protein-coding sequence were detected among 2861 rare variants. We found an excess of LoF variants in the combined cases compared with controls (P=0.02). This effect was stronger when analysis was limited to singleton LoF variants (P=0.0007) and the excess was present in both SZ (P=0.002) and ASD (P=0.001). As an individual gene category, Neurexin and Neuroligin Interacting Proteins carried an excess of LoF variants in cases compared with controls (P=0.05). A de novo nonsense variant in GRIN2B was identified in an ASD case adding to the growing evidence that this is an important risk gene for the disorder. These data support synapse formation and maintenance as key molecular mechanisms for SZ and ASD.
13. Henry IM, Nagalakshmi U, Lieberman MC, Ngo KJ, Krasileva KV, Vasquez-Gross H, Akhunova A, Akhunov E, Dubcovsky J, Tai TH, Comai L. Efficient Genome-Wide Detection and Cataloging of EMS-Induced Mutations Using Exome Capture and Next-Generation Sequencing. Plant Cell 2014; 26:1382-1397. [PMID: 24728647 PMCID: PMC4036560 DOI: 10.1105/tpc.113.121590] [Citation(s) in RCA: 94] [Impact Index Per Article: 9.4] [Received: 12/09/2013] [Revised: 02/17/2014] [Accepted: 03/19/2014] [Indexed: 05/18/2023]
Abstract
Chemical mutagenesis efficiently generates phenotypic variation in otherwise homogeneous genetic backgrounds, enabling functional analysis of genes. Advances in mutation detection have brought the utility of induced mutant populations on par with those produced by insertional mutagenesis, but systematic cataloguing of mutations would further increase their utility. We examined the suitability of multiplexed global exome capture and sequencing coupled with custom-developed bioinformatics tools to identify mutations in well-characterized mutant populations of rice (Oryza sativa) and wheat (Triticum aestivum). In rice, we identified ∼18,000 induced mutations from 72 independent M2 individuals. Functional evaluation indicated the recovery of potentially deleterious mutations for >2600 genes. We further observed that specific sequence and cytosine methylation patterns surrounding the targeted guanine residues strongly affect their probability to be alkylated by ethyl methanesulfonate. Application of these methods to six independent M2 lines of tetraploid wheat demonstrated that our bioinformatics pipeline is applicable to polyploids. In conclusion, we provide a method for developing large-scale induced mutation resources with relatively small investments that is applicable to resource-poor organisms. Furthermore, our results demonstrate that large libraries of sequenced mutations can be readily generated, providing enhanced opportunities to study gene function and assess the effect of sequence and chromatin context on mutations.
Affiliation(s)
- Isabelle M Henry
  - Plant Biology Department and Genome Center, University of California, Davis, California 95616
- Ugrappa Nagalakshmi
  - Plant Biology Department and Genome Center, University of California, Davis, California 95616
- Meric C Lieberman
  - Plant Biology Department and Genome Center, University of California, Davis, California 95616
- Kathie J Ngo
  - Plant Biology Department and Genome Center, University of California, Davis, California 95616
- Ksenia V Krasileva
  - Department of Plant Sciences, University of California, Davis, California 95616
- Hans Vasquez-Gross
  - Department of Plant Sciences, University of California, Davis, California 95616
- Alina Akhunova
  - Department of Plant Pathology, Kansas State University, Manhattan, Kansas 66502
  - Integrated Genomics Facility, Kansas State University, Manhattan, Kansas 66502
- Eduard Akhunov
  - Department of Plant Pathology, Kansas State University, Manhattan, Kansas 66502
- Jorge Dubcovsky
  - Department of Plant Sciences, University of California, Davis, California 95616
  - Howard Hughes Medical Institute, Chevy Chase, Maryland 20815
- Thomas H Tai
  - Crops Pathology and Genetics Research Unit, U.S. Department of Agriculture, Agricultural Research Service, Davis, California, 95616
- Luca Comai
  - Plant Biology Department and Genome Center, University of California, Davis, California 95616
14. Castéra L, Krieger S, Rousselin A, Legros A, Baumann JJ, Bruet O, Brault B, Fouillet R, Goardon N, Letac O. Next-generation sequencing for the diagnosis of hereditary breast and ovarian cancer using genomic capture targeting multiple candidate genes. Eur J Hum Genet 2014; 22:1305-1313. [PMID: 24549055 DOI: 10.1038/ejhg.2014.16] [Citation(s) in RCA: 184] [Impact Index Per Article: 18.4] [Received: 10/07/2013] [Revised: 01/15/2014] [Accepted: 01/16/2014] [Indexed: 02/08/2023] Open
Abstract
To optimize the molecular diagnosis of hereditary breast and ovarian cancer (HBOC), we developed a next-generation sequencing (NGS)-based screening based on the capture of a panel of genes involved, or suspected to be involved in HBOC, on pooling of indexed DNA and on paired-end sequencing in an Illumina GAIIx platform, followed by confirmation by Sanger sequencing or MLPA/QMPSF. The bioinformatic pipeline included CASAVA, NextGENe, CNVseq and Alamut-HT. We validated this procedure by the analysis of 59 patients' DNAs harbouring SNVs, indels or large genomic rearrangements of BRCA1 or BRCA2. We also conducted a blind study in 168 patients comparing NGS versus Sanger sequencing or MLPA analyses of BRCA1 and BRCA2. All mutations detected by conventional procedures were detected by NGS. We then screened, using three different versions of the capture set, a large series of 708 consecutive patients. We detected in these patients 69 germline deleterious alterations within BRCA1 and BRCA2, and 4 TP53 mutations in 468 patients also tested for this gene. We also found 36 variations inducing either a premature codon stop or a splicing defect among other genes: 5/708 in CHEK2, 3/708 in RAD51C, 1/708 in RAD50, 7/708 in PALB2, 3/708 in MRE11A, 5/708 in ATM, 3/708 in NBS1, 1/708 in CDH1, 3/468 in MSH2, 2/468 in PMS2, 1/708 in BARD1, 1/468 in PMS1 and 1/468 in MLH3. These results demonstrate the efficiency of NGS in performing molecular diagnosis of HBOC. Detection of mutations within other genes than BRCA1 and BRCA2 highlights the genetic heterogeneity of HBOC.
15. Kenna KP, McLaughlin RL, Byrne S, Elamin M, Heverin M, Kenny EM, Cormican P, Morris DW, Donaghy CG, Bradley DG, Hardiman O. Delineating the genetic heterogeneity of ALS using targeted high-throughput sequencing. J Med Genet 2013; 50:776-83. [PMID: 23881933 PMCID: PMC3812897 DOI: 10.1136/jmedgenet-2013-101795] [Citation(s) in RCA: 135] [Impact Index Per Article: 12.3] [Indexed: 12/11/2022]
Abstract
BACKGROUND: Over 100 genes have been implicated in the aetiology of amyotrophic lateral sclerosis (ALS). A detailed understanding of their independent and cumulative contributions to disease burden may help guide various clinical and research efforts. METHODS: Using targeted high-throughput sequencing, we characterised the variation of 10 Mendelian and 23 low penetrance/tentative ALS genes within a population-based cohort of 444 Irish ALS cases (50 fALS, 394 sALS) and 311 age-matched and geographically matched controls. RESULTS: Known or potential high-penetrance ALS variants were identified within 17.1% of patients (38% of fALS, 14.5% of sALS). 12.8% carried variants of Mendelian disease genes (C9orf72 8.78%; SETX 2.48%; ALS2 1.58%; FUS 0.45%; TARDBP 0.45%; OPTN 0.23%; VCP 0.23%; ANG, SOD1, VAPB 0%), 4.7% carried variants of low penetrance/tentative ALS genes and 9.7% (30% of fALS, 7.1% of sALS) carried previously described ALS variants (C9orf72 8.78%; FUS 0.45%; TARDBP 0.45%). 1.6% of patients carried multiple known/potential disease variants, including all identified carriers of an established ALS variant (p<0.01); TARDBP:c.859G>A(p.[G287S]) (n=2/2 sALS). Comparison of our results with those from studies of other European populations revealed significant differences in the spectrum of disease variation (p=1.7×10⁻⁴). CONCLUSIONS: Up to 17% of Irish ALS cases may carry high-penetrance variants within the investigated genes. However, the precise nature of genetic susceptibility differs significantly from that reported within other European populations. Certain variants may not cause disease in isolation and concomitant analysis of disease genes may prove highly important.
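The overall carrier percentages in this abstract can be cross-checked against the subgroup figures it reports. The short sketch below does that arithmetic. Converting each subgroup percentage to a whole number of carriers by rounding is an assumption made here, since the abstract gives percentages only, and the helper name `pooled_percent` is hypothetical.

```python
def pooled_percent(groups):
    """Combine (group_size, percent_carriers) pairs into an overall percentage.

    Carrier counts are reconstructed by rounding, which is an assumption:
    the abstract reports percentages, not raw counts.
    """
    carriers = sum(round(size * pct / 100) for size, pct in groups)
    total = sum(size for size, _ in groups)
    return carriers, total, 100 * carriers / total


# Cohort sizes from the abstract: 50 fALS and 394 sALS cases.
high_penetrance = pooled_percent([(50, 38.0), (394, 14.5)])
previously_described = pooled_percent([(50, 30.0), (394, 7.1)])

for label, (carriers, total, pct) in [
    ("high-penetrance variants", high_penetrance),
    ("previously described variants", previously_described),
]:
    print(f"{label}: {carriers}/{total} = {pct:.1f}%")
```

Under that rounding assumption the pooled values come out at 17.1% and 9.7%, matching the overall figures quoted in the RESULTS.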
Affiliation(s)
- Kevin P Kenna
  - Smurfit Institute of Genetics, Trinity College, Dublin, Ireland
16. Uitdewilligen JGAML, Wolters AMA, D’hoop BB, Borm TJA, Visser RGF, van Eck HJ. A next-generation sequencing method for genotyping-by-sequencing of highly heterozygous autotetraploid potato. PLoS One 2013; 8:e62355. [PMID: 23667470 PMCID: PMC3648547 DOI: 10.1371/journal.pone.0062355] [Citation(s) in RCA: 238] [Impact Index Per Article: 21.6] [Received: 11/27/2012] [Accepted: 03/20/2013] [Indexed: 11/23/2022] Open
Abstract
Assessment of genomic DNA sequence variation and genotype calling in autotetraploids implies the ability to distinguish among five possible alternative allele copy number states. This study demonstrates the accuracy of genotyping-by-sequencing (GBS) of a large collection of autotetraploid potato cultivars using next-generation sequencing. It is still costly to reach sufficient read depths on a genome-wide scale, across the cultivated gene pool. Therefore, we enriched cultivar-specific DNA sequencing libraries using an in-solution hybridisation method (SureSelect). This complexity reduction allowed us to confine our study to 807 target genes distributed across the genomes of 83 tetraploid cultivars and one reference (DM 1–3 511). Indexed sequencing libraries were paired-end sequenced in 7 pools of 12 samples using Illumina HiSeq2000. After filtering and processing the raw sequence data, 12.4 Gigabases of high-quality sequence data was obtained, which mapped to 2.1 Mb of the potato reference genome, with a median average read depth of 63× per cultivar. We detected 129,156 sequence variants and genotyped the allele copy number of each variant for every cultivar. In this cultivar panel a variant density of 1 SNP/24 bp in exons and 1 SNP/15 bp in introns was obtained. The average minor allele frequency (MAF) of a variant was 0.14. Potato germplasm displayed a large number of relatively rare variants and/or haplotypes, with 61% of the variants having a MAF below 0.05. A very high average nucleotide diversity (π = 0.0107) was observed. Nucleotide diversity varied among potato chromosomes. Several genes under selection were identified. Genotyping-by-sequencing results, with allele copy number estimates, were validated with a KASP genotyping assay. This validation showed that read depths of ∼60–80× can be used as a lower boundary for reliable assessment of allele copy number of sequence variants in autotetraploids. Genotypic data were associated with traits, and alleles strongly influencing maturity and flesh colour were identified.
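The five allele copy number states mentioned in this abstract can be illustrated with a small sketch. This is not the authors' genotyping pipeline: it assumes a simple binomial read-sampling model, and the function names and the 1% sequencing error rate are hypothetical choices made only for illustration.

```python
from math import comb


def dosage_likelihoods(alt_reads, depth, error=0.01):
    """Binomial likelihood of the observed alt-read count under each of the
    five tetraploid dosage states (0 to 4 copies of the alternative allele).

    The error rate is an assumed value, not one taken from the abstract.
    """
    likelihoods = []
    for copies in range(5):
        fraction = copies / 4
        # Expected alt-read fraction, kept away from exactly 0 and 1
        # by allowing for miscalled bases.
        p = fraction * (1 - error) + (1 - fraction) * error
        likelihoods.append(
            comb(depth, alt_reads) * p**alt_reads * (1 - p) ** (depth - alt_reads)
        )
    return likelihoods


def call_dosage(alt_reads, depth, error=0.01):
    """Return the dosage state (0 to 4) with the highest likelihood."""
    likelihoods = dosage_likelihoods(alt_reads, depth, error)
    return max(range(5), key=likelihoods.__getitem__)
```

Under this model, neighbouring dosage states differ by only a quarter of the reads, which is one way to see why the abstract treats read depth as the limiting factor for reliable copy number calls.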
Affiliation(s)
- Jan G. A. M. L. Uitdewilligen
- Laboratory of Plant Breeding, Wageningen University, Wageningen, The Netherlands
- The Graduate School for Experimental Plant Sciences, Wageningen, The Netherlands
- Anne-Marie A. Wolters
- Laboratory of Plant Breeding, Wageningen University, Wageningen, The Netherlands
- The Graduate School for Experimental Plant Sciences, Wageningen, The Netherlands
- Bjorn B. D’hoop
- Laboratory of Plant Breeding, Wageningen University, Wageningen, The Netherlands
- Theo J. A. Borm
- Laboratory of Plant Breeding, Wageningen University, Wageningen, The Netherlands
- The Graduate School for Experimental Plant Sciences, Wageningen, The Netherlands
- Richard G. F. Visser
- Laboratory of Plant Breeding, Wageningen University, Wageningen, The Netherlands
- The Graduate School for Experimental Plant Sciences, Wageningen, The Netherlands
- Centre for BioSystems Genomics, Wageningen, The Netherlands
- Herman J. van Eck
- Laboratory of Plant Breeding, Wageningen University, Wageningen, The Netherlands
- The Graduate School for Experimental Plant Sciences, Wageningen, The Netherlands
- Centre for BioSystems Genomics, Wageningen, The Netherlands
- * E-mail:
17
McCormack JE, Hird SM, Zellmer AJ, Carstens BC, Brumfield RT. Applications of next-generation sequencing to phylogeography and phylogenetics. Mol Phylogenet Evol 2013; 66:526-38. [DOI: 10.1016/j.ympev.2011.12.007] [Citation(s) in RCA: 445] [Impact Index Per Article: 40.5] [Received: 09/13/2011] [Revised: 12/02/2011] [Accepted: 12/05/2011] [Indexed: 01/09/2023]
18
Ramos E, Levinson BT, Chasnoff S, Hughes A, Young AL, Thornton K, Li A, Vallania FL, Province M, Druley TE. Population-based rare variant detection via pooled exome or custom hybridization capture with or without individual indexing. BMC Genomics 2012; 13:683. [PMID: 23216810 DOI: 10.1186/1471-2164-13-683] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.6] [Received: 05/29/2012] [Accepted: 11/23/2012] [Indexed: 11/15/2022]
Abstract
Background: Rare genetic variation in the human population is a major source of pathophysiological variability and has been implicated in a host of complex phenotypes and diseases. Finding disease-related genes harboring disparate functional rare variants requires sequencing of many individuals across many genomic regions and comparing against unaffected cohorts. However, despite persistent declines in sequencing costs, population-based rare variant detection across large genomic target regions remains cost prohibitive for most investigators. In addition, DNA samples are often precious and hybridization methods typically require large amounts of input DNA. Pooled sample DNA sequencing is a cost and time-efficient strategy for surveying populations of individuals for rare variants. We set out to 1) create a scalable, multiplexing method for custom capture with or without individual DNA indexing that was amenable to low amounts of input DNA and 2) expand the functionality of the SPLINTER algorithm for calling substitutions, insertions and deletions across either candidate genes or the entire exome by integrating the variant calling algorithm with the dynamic programming aligner, Novoalign. Results: We report methodology for pooled hybridization capture with pre-enrichment, indexed multiplexing of up to 48 individuals or non-indexed pooled sequencing of up to 92 individuals with as little as 70 ng of DNA per person. Modified solid phase reversible immobilization bead purification strategies enable no sample transfers from sonication in 96-well plates through adapter ligation, resulting in 50% less library preparation reagent consumption. Custom Y-shaped adapters containing novel 7 base pair index sequences with a Hamming distance of ≥2 were directly ligated onto fragmented source DNA, eliminating the need for PCR to incorporate indexes; this was followed by a custom blocking strategy using a single oligonucleotide regardless of index sequence. These results were obtained aligning raw reads against the entire genome using Novoalign followed by variant calling of non-indexed pools using SPLINTER or SAMtools for indexed samples. With these pipelines, we find sensitivity and specificity of 99.4% and 99.7% for pooled exome sequencing. Sensitivity, and to a lesser degree specificity, proved to be a function of coverage. For rare variants (≤2% minor allele frequency), we achieved sensitivity and specificity of ≥94.9% and ≥99.99% for custom capture of 2.5 Mb in multiplexed libraries of 22–48 individuals with only ≥5-fold coverage/chromosome, but these parameters improved to ≥98.7 and 100% with 20-fold coverage/chromosome. Conclusions: This highly scalable methodology enables accurate rare variant detection, with or without individual DNA sample indexing, while reducing the amount of required source DNA and total costs through less hybridization reagent consumption, multi-sample sonication in a standard PCR plate, multiplexed pre-enrichment pooling with a single hybridization and lesser sequencing coverage required to obtain high sensitivity.
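The index design constraint in this abstract (7 base pair indexes with a Hamming distance of at least 2) can be checked with a few lines of code. The sketch below is illustrative only: the three index sequences are made up for the example and are not the indexes used in the study, and the function names are hypothetical.

```python
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def min_pairwise_distance(indexes):
    """Smallest Hamming distance over all pairs in an index set."""
    return min(hamming(a, b) for a, b in combinations(indexes, 2))


# Hypothetical 7 bp indexes, invented for this example.
example_indexes = ["ACGTACG", "ACGTTGG", "TCGAACG"]
meets_threshold = min_pairwise_distance(example_indexes) >= 2
```

With a minimum distance of 2, a single substitution error in an index read cannot turn one valid index into another, so such an error can be detected, although not corrected.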
19
Lord J, Turton J, Medway C, Shi H, Brown K, Lowe J, Mann D, Pickering-Brown S, Kalsheker N, Passmore P, Morgan K. Next generation sequencing of CLU, PICALM and CR1: pitfalls and potential solutions. Int J Mol Epidemiol Genet 2012; 3:262-75. [PMID: 23205178 PMCID: PMC3508540] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/01/2012] [Accepted: 10/24/2012] [Indexed: 11/05/2023]
Abstract
CLU, PICALM and CR1 were identified as genetic risk factors for late onset Alzheimer's disease (AD) in two large genome wide association studies (GWAS) published in 2009, but the variants that convey this alteration in disease risk, and how the genes relate to AD pathology, are yet to be discovered. A next generation sequencing (NGS) project was conducted targeting CLU, CR1 and PICALM, in 96 AD samples (8 pools of 12), in an attempt to discover rare variants within these AD associated genes. Inclusion of repetitive regions in the design of the SureSelect capture led to significant issues in alignment of the data, leading to poor specificity and a lower than expected depth of coverage. A strong positive correlation (0.964, p<0.001) was seen between NGS and 1000 Genomes Project frequency estimates. Of the ~170 "novel" variants detected in the genes, seven SNPs, all of which were present in multiple sample pools, were selected for validation by Sanger sequencing. Two SNPs were successfully validated by this method, and shown to be genuine variants, while five failed validation. These spurious SNP calls occurred as a result of the presence of small indels and mononucleotide repeats, indicating such features should be regarded with caution, and validation via an independent method is important for NGS variant calls.
Affiliation(s)
- Jenny Lord
- Human Genetics, School of Molecular Medical Sciences, Queens Medical Centre, University of Nottingham, Nottingham, UK
20
Shearer AE, Hildebrand MS, Ravi H, Joshi S, Guiffre AC, Novak B, Happe S, LeProust EM, Smith RJH. Pre-capture multiplexing improves efficiency and cost-effectiveness of targeted genomic enrichment. BMC Genomics 2012; 13:618. [PMID: 23148716 PMCID: PMC3534602 DOI: 10.1186/1471-2164-13-618] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.3] [Received: 07/06/2012] [Accepted: 11/09/2012] [Indexed: 01/19/2023]
Abstract
Background: Targeted genomic enrichment (TGE) is a widely used method for isolating and enriching specific genomic regions prior to massively parallel sequencing. To make effective use of sequencer output, barcoding and sample pooling (multiplexing) after TGE and prior to sequencing (post-capture multiplexing) has become routine. While previous reports have indicated that multiplexing prior to capture (pre-capture multiplexing) is feasible, no thorough examination of the effect of this method has been completed on a large number of samples. Here we compare standard post-capture TGE to two levels of pre-capture multiplexing: 12 or 16 samples per pool. We evaluated these methods using standard TGE metrics and determined the ability to identify several classes of genetic mutations in three sets of 96 samples, including 48 controls. Our overall goal was to maximize cost reduction and minimize experimental time while maintaining a high percentage of reads on target and a high depth of coverage at thresholds required for variant detection. Results: We adapted the standard post-capture TGE method for pre-capture TGE with several protocol modifications, including redesign of blocking oligonucleotides and optimization of enzymatic and amplification steps. Pre-capture multiplexing reduced costs for TGE by at least 38% and significantly reduced hands-on time during the TGE protocol. We found that pre-capture multiplexing reduced capture efficiency by 23 or 31% for pre-capture pools of 12 and 16, respectively. However, efficiency losses at this step can be compensated by reducing the number of simultaneously sequenced samples. Pre-capture multiplexing and post-capture TGE performed similarly with respect to variant detection of positive control mutations. In addition, we detected no instances of sample switching due to aberrant barcode identification. Conclusions: Pre-capture multiplexing improves efficiency of TGE experiments with respect to hands-on time and reagent use compared to standard post-capture TGE. A decrease in capture efficiency is observed when using pre-capture multiplexing; however, it does not negatively impact variant detection and can be accommodated by the experimental design.
Affiliation(s)
- A Eliot Shearer
- Department of Otolaryngology - Head & Neck Surgery, University of Iowa Carver College of Medicine, Iowa City, IA 52242, USA
21
ElSharawy A, Warner J, Olson J, Forster M, Schilhabel MB, Link DR, Rose-John S, Schreiber S, Rosenstiel P, Brayer J, Franke A. Accurate variant detection across non-amplified and whole genome amplified DNA using targeted next generation sequencing. BMC Genomics 2012; 13:500. [PMID: 22994565 PMCID: PMC3534403 DOI: 10.1186/1471-2164-13-500] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.8] [Received: 01/30/2012] [Accepted: 09/13/2012] [Indexed: 01/31/2023]
Abstract
Background: Many hypothesis-driven genetic studies require the ability to comprehensively and efficiently target specific regions of the genome to detect sequence variations. Often, sample availability is limited, requiring the use of whole genome amplification (WGA). We evaluated a high-throughput microdroplet-based PCR approach in combination with next generation sequencing (NGS) to target 384 discrete exons from 373 genes involved in cancer. In our evaluation, we compared the performance of six non-amplified gDNA samples from two HapMap family trios. Three of these samples were also preamplified by WGA and evaluated. We tested sample pooling or multiplexing strategies at different stages of the tested targeted NGS (T-NGS) workflow. Results: The results demonstrated comparable sequence performance between non-amplified and preamplified samples and between different indexing strategies [sequence specificity of 66.0% ± 3.4%, uniformity (coverage at 0.2× of the mean) of 85.6% ± 0.6%]. The average genotype concordance maintained across all the samples was 99.5% ± 0.4%, regardless of sample type or pooling strategy. We did not detect any errors in the Mendelian patterns of inheritance of genotypes between the parents and offspring within each trio. We also demonstrated the ability to detect minor allele frequencies within the pooled samples that conform to predicted models. Conclusion: Our described PCR-based sample multiplex approach and the ability to use WGA material for NGS may enable researchers to perform deep resequencing studies and explore variants at very low frequencies and cost.
Affiliation(s)
- Abdou ElSharawy
- Institute of Clinical Molecular Biology, Christian-Albrechts-University, Kiel, Germany
22
Peterson BK, Weber JN, Kay EH, Fisher HS, Hoekstra HE. Double digest RADseq: an inexpensive method for de novo SNP discovery and genotyping in model and non-model species. PLoS One 2012; 7:e37135. [PMID: 22675423 PMCID: PMC3365034 DOI: 10.1371/journal.pone.0037135] [Citation(s) in RCA: 1888] [Impact Index Per Article: 157.3] [Received: 02/01/2012] [Accepted: 04/13/2012] [Indexed: 12/14/2022]
Abstract
The ability to efficiently and accurately determine genotypes is a keystone technology in modern genetics, crucial to studies ranging from clinical diagnostics, to genotype-phenotype association, to reconstruction of ancestry and the detection of selection. To date, high capacity, low cost genotyping has been largely achieved via “SNP chip” microarray-based platforms which require substantial prior knowledge of both genome sequence and variability, and once designed are suitable only for those targeted variable nucleotide sites. This method introduces substantial ascertainment bias and inherently precludes detection of rare or population-specific variants, a major source of information for both population history and genotype-phenotype association. Recent developments in reduced-representation genome sequencing experiments on massively parallel sequencers (commonly referred to as RAD-tag or RADseq) have brought direct sequencing to the problem of population genotyping, but increased cost and procedural and analytical complexity have limited their widespread adoption. Here, we describe a complete laboratory protocol, including a custom combinatorial indexing method, and accompanying software tools to facilitate genotyping across large numbers (hundreds or more) of individuals for a range of markers (hundreds to hundreds of thousands). Our method requires no prior genomic knowledge and achieves per-site and per-individual costs below that of current SNP chip technology, while requiring similar hands-on time investment, comparable amounts of input DNA, and downstream analysis times on the order of hours. Finally, we provide empirical results from the application of this method to both genotyping in a laboratory cross and in wild populations. Because of its flexibility, this modified RADseq approach promises to be applicable to a diversity of biological questions in a wide range of organisms.
Affiliation(s)
- Brant K Peterson
- Department of Organismic & Evolutionary Biology, Museum of Comparative Zoology, Harvard University, Cambridge, Massachusetts, United States of America.
23
Rossetti S, Hopp K, Sikkink RA, Sundsbak JL, Lee YK, Kubly V, Eckloff BW, Ward CJ, Winearls CG, Torres VE, Harris PC. Identification of gene mutations in autosomal dominant polycystic kidney disease through targeted resequencing. J Am Soc Nephrol 2012; 23:915-33. [PMID: 22383692 DOI: 10.1681/asn.2011101032] [Citation(s) in RCA: 127] [Impact Index Per Article: 10.6] [Indexed: 11/03/2022]
Abstract
Mutations in two large multi-exon genes, PKD1 and PKD2, cause autosomal dominant polycystic kidney disease (ADPKD). The duplication of PKD1 exons 1-32 as six pseudogenes on chromosome 16, the high level of allelic heterogeneity, and the cost of Sanger sequencing complicate mutation analysis, which can aid diagnostics of ADPKD. We developed and validated a strategy to analyze both the PKD1 and PKD2 genes using next-generation sequencing by pooling long-range PCR amplicons and multiplexing bar-coded libraries. We used this approach to characterize a cohort of 230 patients with ADPKD. This process detected definitely and likely pathogenic variants in 115 (63%) of 183 patients with typical ADPKD. In addition, we identified atypical mutations, a gene conversion, and one missed mutation resulting from allele dropout, and we characterized the pattern of deep intronic variation for both genes. In summary, this strategy involving next-generation sequencing is a model for future genetic characterization of large ADPKD populations.
Affiliation(s)
- Sandro Rossetti
- Division of Nephrology and Hypertension, Mayo Clinic, Rochester, MN 55905, USA.
24
Faircloth BC, McCormack JE, Crawford NG, Harvey MG, Brumfield RT, Glenn TC. Ultraconserved elements anchor thousands of genetic markers spanning multiple evolutionary timescales. Syst Biol 2012; 61:717-26. [PMID: 22232343 DOI: 10.1093/sysbio/sys004] [Citation(s) in RCA: 667] [Impact Index Per Article: 55.6] [Indexed: 01/17/2023]
Abstract
Although massively parallel sequencing has facilitated large-scale DNA sequencing, comparisons among distantly related species rely upon small portions of the genome that are easily aligned. Methods are needed to efficiently obtain comparable DNA fragments prior to massively parallel sequencing, particularly for biologists working with non-model organisms. We introduce a new class of molecular marker, anchored by ultraconserved genomic elements (UCEs), that universally enable target enrichment and sequencing of thousands of orthologous loci across species separated by hundreds of millions of years of evolution. Our analyses here focus on use of UCE markers in Amniota because UCEs and phylogenetic relationships are well-known in some amniotes. We perform an in silico experiment to demonstrate that sequence flanking 2030 UCEs contains information sufficient to enable unambiguous recovery of the established primate phylogeny. We extend this experiment by performing an in vitro enrichment of 2386 UCE-anchored loci from nine, non-model avian species. We then use alignments of 854 of these loci to unambiguously recover the established evolutionary relationships within and among three ancient bird lineages. Because many organismal lineages have UCEs, this type of genetic marker and the analytical framework we outline can be applied across the tree of life, potentially reshaping our understanding of phylogeny at many taxonomic levels.
Affiliation(s)
- Brant C Faircloth
- Department of Ecology and Evolutionary Biology, University of California, Los Angeles, CA 90095, USA.
25
Marroni F, Pinosio S, Morgante M. The quest for rare variants: pooled multiplexed next generation sequencing in plants. Front Plant Sci 2012; 3:133. [PMID: 22754557 PMCID: PMC3384946 DOI: 10.3389/fpls.2012.00133] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.3] [Received: 03/07/2012] [Accepted: 06/04/2012] [Indexed: 05/08/2023]
Abstract
Next generation sequencing (NGS) instruments produce an unprecedented amount of sequence data at contained costs. This gives researchers the possibility of designing studies with adequate power to identify rare variants at a fraction of the economic and labor resources required by individual Sanger sequencing. As of today, few research groups working in plant sciences have exploited this potentiality, showing that pooled NGS provides results in excellent agreement with those obtained by individual Sanger sequencing. The aim of this review is to convey to the reader the general ideas underlying the use of pooled NGS for the identification of rare variants. To facilitate a thorough understanding of the possibilities of the method, we will explain in detail the possible experimental and analytical approaches and discuss their advantages and disadvantages. We will show that information on allele frequency obtained by pooled NGS can be used to accurately compute basic population genetics indexes such as allele frequency, nucleotide diversity, and Tajima's D. Finally, we will discuss applications and future perspectives of the multiplexed NGS approach.
Affiliation(s)
- Fabio Marroni
- Istituto di Genomica Applicata, Udine, Italy
- *Correspondence: Fabio Marroni, Istituto di Genomica Applicata, Via J. Linussio 51, 33100 Udine, Italy. e-mail:
- Sara Pinosio
- Istituto di Genomica Applicata, Udine, Italy
- CNR, Istituto di Genetica Vegetale, Sezione di Firenze, Firenze, Italy
- Michele Morgante
- Istituto di Genomica Applicata, Udine, Italy
- Dipartimento di Scienze Agrarie e Ambientali, Università di Udine, Udine, Italy
26
McCormack JE, Faircloth BC, Crawford NG, Gowaty PA, Brumfield RT, Glenn TC. Ultraconserved elements are novel phylogenomic markers that resolve placental mammal phylogeny when combined with species-tree analysis. Genome Res 2011; 22:746-54. [PMID: 22207614 DOI: 10.1101/gr.125864.111] [Citation(s) in RCA: 260] [Impact Index Per Article: 20.0] [Indexed: 01/30/2023]
Abstract
Phylogenomics offers the potential to fully resolve the Tree of Life, but increasing genomic coverage also reveals conflicting evolutionary histories among genes, demanding new analytical strategies for elucidating a single history of life. Here, we outline a phylogenomic approach using a novel class of phylogenetic markers derived from ultraconserved elements and flanking DNA. Using species-tree analysis that accounts for discord among hundreds of independent loci, we show that this class of marker is useful for recovering deep-level phylogeny in placental mammals. In broad outline, our phylogeny agrees with recent phylogenomic studies of mammals, including several formerly controversial relationships. Our results also inform two outstanding questions in placental mammal phylogeny involving rapid speciation, where species-tree methods are particularly needed. Contrary to most phylogenomic studies, our study supports a first-diverging placental mammal lineage that includes elephants and tenrecs (Afrotheria). The level of conflict among gene histories is consistent with this basal divergence occurring in or near a phylogenetic "anomaly zone" where a failure to account for coalescent stochasticity will mislead phylogenetic inference. Addressing a long-standing phylogenetic mystery, we find some support from a high genomic coverage data set for a traditional placement of bats (Chiroptera) sister to a clade containing Perissodactyla, Cetartiodactyla, and Carnivora, and not nested within the latter clade, as has been suggested recently, although other results were conflicting. One of the most remarkable findings of our study is that ultraconserved elements and their flanking DNA are a rich source of phylogenetic information with strong potential for application across Amniotes.
Affiliation(s)
- John E McCormack
- Museum of Natural Science, Louisiana State University, Baton Rouge, Louisiana 70803, USA.
27
Wang HY, Jain A. Novel sequencing-based strategies for high-throughput discovery of genetic mutations underlying inherited antibody deficiency disorders. Curr Allergy Asthma Rep 2011; 11:352-60. [PMID: 21792638 DOI: 10.1007/s11882-011-0211-x] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/27/2022]
Abstract
Human inherited antibody deficiency disorders are generally caused by mutations in genes involved in the pathways regulating B-cell class switch recombination; DNA damage repair; and B-cell development, differentiation, and survival. Sequencing a large set of candidate genes involved in these pathways appears to be a highly efficient way to identify novel mutations. Herein we review several high-throughput sequencing approaches as well as recent improvements in target gene enrichment technologies. Systematic improvement of enrichment and sequencing methods, along with refinement of the experimental process, is necessary to develop a cost-effective high-throughput resequencing assay for a large cohort of patient samples. The Hyper-IgM/CVID chip is one example of a resequencing platform that may be used to identify known or novel mutations in patients with various types of inherited antibody deficiency.
Affiliation(s)
- Hong-Ying Wang
- Laboratory of Host Defenses, National Institute of Allergy and Infectious Diseases, National Institutes of Health, CRC, 5W-3840, 10 Center Drive, Bethesda, MD 20892, USA.
28
Mertes F, Elsharawy A, Sauer S, van Helvoort JMLM, van der Zaag PJ, Franke A, Nilsson M, Lehrach H, Brookes AJ. Targeted enrichment of genomic DNA regions for next-generation sequencing. Brief Funct Genomics 2011; 10:374-86. [PMID: 22121152 PMCID: PMC3245553 DOI: 10.1093/bfgp/elr033] [Citation(s) in RCA: 163] [Impact Index Per Article: 12.5] [Indexed: 01/03/2023]
Abstract
In this review, we discuss the latest targeted enrichment methods and aspects of their utilization along with second-generation sequencing for complex genome analysis. In doing so, we provide an overview of issues involved in detecting genetic variation, for which targeted enrichment has become a powerful tool. We explain how targeted enrichment for next-generation sequencing has made great progress in terms of methodology, ease of use and applicability, but emphasize the remaining challenges such as the lack of even coverage across targeted regions. Costs are also considered versus the alternative of whole-genome sequencing which is becoming ever more affordable. We conclude that targeted enrichment is likely to be the most economical option for many years to come in a range of settings.
Affiliation(s)
- Florian Mertes
- Max Planck Institute for Molecular Genetics, Berlin, Germany.
29
Berglund EC, Kiialainen A, Syvänen AC. Next-generation sequencing technologies and applications for human genetic history and forensics. Investig Genet 2011; 2:23. [PMID: 22115430 PMCID: PMC3267688 DOI: 10.1186/2041-2223-2-23] [Citation(s) in RCA: 83] [Impact Index Per Article: 6.4] [Received: 06/08/2011] [Accepted: 11/24/2011] [Indexed: 12/24/2022]
Abstract
Rapid advances in the development of sequencing technologies in recent years have enabled an increasing number of applications in biology and medicine. Here, we review key technical aspects of the preparation of DNA templates for sequencing, the biochemical reaction principles and assay formats underlying next-generation sequencing systems, methods for imaging and base calling, quality control, and bioinformatic approaches for sequence alignment, variant calling and assembly. We also discuss some of the most important advances that the new sequencing technologies have brought to the fields of human population genetics, human genetic history and forensic genetics.
Affiliation(s)
- Eva C Berglund
- Department of Medical Sciences, Molecular Medicine and Science for Life Laboratory, Uppsala University, 751 85 Uppsala, Sweden
- Anna Kiialainen
- Department of Medical Sciences, Molecular Medicine and Science for Life Laboratory, Uppsala University, 751 85 Uppsala, Sweden
- Ann-Christine Syvänen
- Department of Medical Sciences, Molecular Medicine and Science for Life Laboratory, Uppsala University, 751 85 Uppsala, Sweden
30
Abstract
Most characteristics in living organisms show continuous variation, which suggests that they are controlled by multiple genes. Quantitative trait loci (QTL) analysis can identify the genes underlying continuous traits by establishing associations between genetic markers and observed phenotypic variation in a segregating population. The new high-throughput sequencing (HTS) technologies greatly facilitate QTL analysis by providing genetic markers at genome-wide resolution in any species without previous knowledge of its genome. In addition, HTS serves to quantify molecular phenotypes, which helps to identify the loci responsible for QTLs and to understand the mechanisms underlying diversity. The constant improvements in price, experimental protocols, computational pipelines, and statistical frameworks are making the use of HTS feasible for any research group interested in quantitative genetics. In this review I discuss the application of HTS for molecular marker discovery, population genotyping, and expression profiling in QTL analysis.
Affiliation(s)
- José M. Jiménez-Gómez
- Department of Plant Breeding and Genetics, Max Planck Institute for Plant Breeding Research, Köln, Germany
31
|
Nakaoka H, Cui T, Tajima A, Oka A, Mitsunaga S, Kashiwase K, Homma Y, Sato S, Suzuki Y, Inoko H, Inoue I. A systems genetics approach provides a bridge from discovered genetic variants to biological pathways in rheumatoid arthritis. PLoS One 2011; 6:e25389. [PMID: 21980439 PMCID: PMC3182219 DOI: 10.1371/journal.pone.0025389] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.2] [Received: 07/11/2011] [Accepted: 09/02/2011] [Indexed: 11/18/2022] Open
Abstract
Genome-wide association studies (GWAS) have yielded novel genetic loci underlying common diseases. We propose a systems genetics approach to utilize these discoveries for better understanding of the genetic architecture of rheumatoid arthritis (RA). Current evidence of genetic associations with RA was sought through PubMed and the NHGRI GWAS catalog. The associations of 15 single nucleotide polymorphisms (SNPs) and HLA-DRB1 alleles were confirmed in 1,287 cases and 1,500 controls of Japanese subjects. Among these, HLA-DRB1 alleles and eight SNPs showed significant associations, and all but one of the variants had the same direction of effect as identified in the previous studies, indicating that the genetic risk factors underlying RA are shared across populations. By receiver operating characteristic curve analysis, the area under the curve (AUC) for the genetic risk score based on the selected variants was 68.4%. For seropositive RA patients only, the AUC improved to 70.9%, indicating good but suboptimal predictive ability. A simulation study shows that more than 200 additional loci with effect sizes similar to those of recent GWAS findings, or 20 rare variants with intermediate effects, are needed to achieve AUC = 80.0%. We applied the random walk with restart (RWR) algorithm to prioritize genes for future mapping studies. The performance of the algorithm was confirmed by leave-one-out cross-validation. The RWR algorithm ranked ZAP70 first; a mutation in this gene causes RA-like autoimmune arthritis in mice. By applying the hierarchical clustering method to a subnetwork comprising RA-associated genes and top-ranked genes by the RWR, we found three functional modules relevant to RA etiology: "leukocyte activation and differentiation", "pattern-recognition receptor signaling pathway", and "chemokines and their receptors". These results suggest that the systems genetics approach is useful to find directions of future mapping strategies to illuminate biological pathways.
Affiliation(s)
- Hirofumi Nakaoka
- Division of Human Genetics, Department of Integrated Genetics, National Institute of Genetics, Mishima, Shizuoka, Japan
- Tailin Cui
- Division of Molecular Life Science, School of Medicine, Tokai University, Isehara, Kanagawa, Japan
- Atsushi Tajima
- Division of Molecular Life Science, School of Medicine, Tokai University, Isehara, Kanagawa, Japan
- Department of Human Genetics and Public Health, Institute of Health Biosciences, The University of Tokushima Graduate School, Tokushima, Tokushima, Japan
- Akira Oka
- Division of Molecular Life Science, School of Medicine, Tokai University, Isehara, Kanagawa, Japan
- Shigeki Mitsunaga
- Division of Molecular Life Science, School of Medicine, Tokai University, Isehara, Kanagawa, Japan
- Koichi Kashiwase
- Department of Laboratory, Japanese Red Cross Tokyo Blood Center, Koto-ku, Tokyo, Japan
- Yasuhiko Homma
- Department of Clinical Health Science, Tokai University School of Medicine, Isehara, Kanagawa, Japan
- Shinji Sato
- Department of Internal Medicine, Division of Rheumatology, Tokai University School of Medicine, Isehara, Kanagawa, Japan
- Yasuo Suzuki
- Department of Internal Medicine, Division of Rheumatology, Tokai University School of Medicine, Isehara, Kanagawa, Japan
- Hidetoshi Inoko
- Division of Molecular Life Science, School of Medicine, Tokai University, Isehara, Kanagawa, Japan
- Ituro Inoue
- Division of Human Genetics, Department of Integrated Genetics, National Institute of Genetics, Mishima, Shizuoka, Japan
- Division of Molecular Life Science, School of Medicine, Tokai University, Isehara, Kanagawa, Japan